Hang Yu1, Zichuan Xie2, Lizhi Xie3, Zhiheng Liu1, Lina Zhang4, Siyao Du4, Xiangjie Yin1, Chenyang Li1, Wenhong Jiang4, Yuru Guo1, and Zhongqi Kang4
1School of Aerospace Science and Technology, Xidian University, Xi'an, China, 2Guangzhou Institute of Technology, Xidian University, Guangzhou, China, 3GE Healthcare, Beijing, China, 4Department of Radiology, The First Hospital of China Medical University, Shenyang, China
Synopsis
Keywords: Breast, Machine Learning/Artificial Intelligence, Deep Learning
Multimodal MRI data are often utilized for breast cancer analysis, yet they remain difficult and inefficient for segmentation algorithms to exploit. In this paper, we propose MP-Unet, a network based on the U-net convolutional neural network, which can effectively ensemble multiple modalities of input data and produce accurate segmentation results. In MP-Unet, we reuse the modal data of better quality for training. The resulting MP-Unet models are further integrated with a Bagging-style majority vote to improve the segmentation accuracy of lesions. Experiments suggest that the proposed method delivers a substantial performance improvement.
Background and Purpose
Breast cancer is one of the most common malignant tumors in women, and accurate breast lesion segmentation is essential for the diagnosis and prognosis of patients with breast cancer. Radiologists manually outline lesions in breast MRI to analyze a patient's condition; however, this task is not only time-consuming and labor-intensive but also demands a high level of professional skill. In recent years, deep convolutional neural networks have come to dominate many computer vision fields. For image segmentation, convolutional neural networks have become an effective solution, discarding the fully connected layers used in traditional image classification networks. The image is read by the deep segmentation network and, after layers of convolutional operations, the final output is produced at the last layer by a softmax or sigmoid activation function, achieving pixel-level classification. The difference between the output segmentation map and the ground truth (GT) defines a loss function, and the model is trained iteratively to reduce the loss value and optimize the network weights; the optimal model is the one that reaches peak performance on the test set. However, multi-modal image processing remains a great challenge in medical image segmentation: the obstacle lies in the difficulty of fusing image information from multiple modalities for effective utilization, which can easily lead to deep learning models with low robustness or to wasted modal data.
Methods
In this paper, we propose a network framework based on U-net [1] with multiple backbone encoders, named MP-Unet, which takes multiple modalities of raw data as simultaneous inputs. The proposed architecture is motivated by previous work [2] and is depicted in Figure 1. It expands the original U-net architecture by replicating the encoder three times to accommodate three different input modalities. Before each downsampling, the feature maps obtained in the three contracting paths are copied and concatenated, and then concatenated with the feature maps of the same resolution in the expanding path, so as to recover the image information lost in the encoders due to downsampling.
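As a concrete illustration, below is a minimal PyTorch sketch of this three-encoder design. The channel widths, network depth, per-modality input channels, and exact fusion points are illustrative assumptions on our part; the abstract specifies only the overall topology.

```python
# A minimal sketch of the three-encoder MP-Unet idea; widths and depth are
# assumptions, not the authors' published configuration.
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    """Two 3x3 convolutions with ReLU, as in the original U-net."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )

class MPUnetSketch(nn.Module):
    """Three contracting paths (one per modality); their feature maps are
    concatenated and fused into a single expanding path."""
    def __init__(self, n_classes=2, width=16, depth=3):
        super().__init__()
        chs = [width * 2 ** i for i in range(depth)]   # e.g. [16, 32, 64]
        self.depth = depth
        # One independent encoder per input modality (single-channel inputs).
        self.encoders = nn.ModuleList([
            nn.ModuleList([conv_block(1 if i == 0 else chs[i - 1], chs[i])
                           for i in range(depth)])
            for _ in range(3)
        ])
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(3 * chs[-1], 2 * chs[-1])
        # Decoder: upsample, then fuse with the concatenated encoder features
        # of the same resolution (the "copy and concatenate" step).
        self.ups = nn.ModuleList()
        self.dec = nn.ModuleList()
        prev = 2 * chs[-1]
        for i in reversed(range(depth)):
            self.ups.append(nn.ConvTranspose2d(prev, chs[i], 2, stride=2))
            self.dec.append(conv_block(chs[i] + 3 * chs[i], chs[i]))
            prev = chs[i]
        self.head = nn.Conv2d(prev, n_classes, 1)

    def forward(self, x1, x2, x3):
        feats = []                        # per-level fused skip features
        xs = [x1, x2, x3]
        for lvl in range(self.depth):
            xs = [self.encoders[p][lvl](xs[p]) for p in range(3)]
            feats.append(torch.cat(xs, dim=1))    # fuse the three paths
            xs = [self.pool(t) for t in xs]
        y = self.bottleneck(torch.cat(xs, dim=1))
        for i, lvl in enumerate(reversed(range(self.depth))):
            y = self.ups[i](y)
            y = self.dec[i](torch.cat([y, feats[lvl]], dim=1))
        return self.head(y)               # logits; softmax applied in the loss
```

For example, `MPUnetSketch()(torch.randn(1, 1, 64, 64), torch.randn(1, 1, 64, 64), torch.randn(1, 1, 64, 64))` returns a `(1, 2, 64, 64)` logit map; spatial sizes must be divisible by `2 ** depth`.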
The final feature map of the network is passed through a softmax function, and the loss between the predicted segmentation map and the ground truth is computed using Dice loss and cross-entropy loss.
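A sketch of this combined objective is given below; the abstract does not state how the two terms are weighted, so the equal weighting here is an assumption.

```python
# Combined Dice + cross-entropy loss; the term weights are assumptions.
import torch
import torch.nn.functional as F

def dice_ce_loss(logits, target, eps=1e-6, ce_weight=1.0, dice_weight=1.0):
    """logits: (N, C, H, W) raw network output; target: (N, H, W) int64 labels."""
    ce = F.cross_entropy(logits, target)
    probs = F.softmax(logits, dim=1)                       # pixel-level softmax
    one_hot = F.one_hot(target, probs.shape[1]).permute(0, 3, 1, 2).float()
    inter = (probs * one_hot).sum(dim=(0, 2, 3))
    denom = probs.sum(dim=(0, 2, 3)) + one_hot.sum(dim=(0, 2, 3))
    dice = (2 * inter + eps) / (denom + eps)               # per-class Dice
    return ce_weight * ce + dice_weight * (1.0 - dice.mean())
```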
We reused the modal data of higher quality so that more MP-Unet models could be obtained. Inspired by the Bagging algorithm, we integrate the MP-Unets trained on the different modality datasets through a majority voting mechanism to further improve the segmentation accuracy of breast lesions. Since the segmentation network essentially performs pixel-level classification of an image, whenever the models disagree on the class of a pixel in the segmentation result, the majority vote is taken as the final result. The overall pipeline of the proposed algorithm is illustrated in Figure 2.
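A minimal sketch of this pixel-wise majority vote follows; the function name and the tie-break behavior are our own choices (with three voters and binary labels, no tie can occur).

```python
# Pixel-wise majority vote over the three trained MP-Unets.
import torch

@torch.no_grad()
def ensemble_predict(models, inputs_per_model):
    """models: list of 3 MP-Unets; inputs_per_model: list of (x1, x2, x3)
    modality triplets, one per model. Returns the voted label map."""
    votes = []
    for model, (x1, x2, x3) in zip(models, inputs_per_model):
        model.eval()
        votes.append(model(x1, x2, x3).argmax(dim=1))   # (N, H, W) labels
    stacked = torch.stack(votes, dim=0)                 # (3, N, H, W)
    # For each pixel, keep the label predicted by the majority of models.
    return stacked.mode(dim=0).values
```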
The breast MRI data we processed comprise nine types: T1-weighted (T1W), T2-weighted (T2W), T1 fluid-attenuated inversion recovery (T1WFlair), T2 fluid-attenuated inversion recovery (T2WFlair), short-tau inversion recovery (STIR), proton-density-weighted (PDW), proton density mapping (PDMapping), T1 mapping (T1Mapping), and T2 mapping (T2Mapping). The nine modalities are shown in Figure 3. T1Mapping, T2Mapping, and STIR images exhibit severe salt-and-pepper noise when generated, so we remove this noise by selecting a threshold value from the grayscale histogram. The nine modalities are then normalized.
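The following sketch shows one way to implement this preprocessing; the percentile-based threshold is an assumption, since the abstract says only that a threshold is selected from the grayscale histogram.

```python
# Threshold-based denoising plus min-max normalization for one modality.
# The 99.5th-percentile cut-off is an illustrative assumption.
import numpy as np

def denoise_and_normalize(img, percentile=99.5):
    """img: 2-D or 3-D array of raw intensities for one modality."""
    threshold = np.percentile(img, percentile)   # read off the histogram tail
    img = np.clip(img, 0, threshold)             # suppress impulse (salt) noise
    lo, hi = img.min(), img.max()
    return (img - lo) / (hi - lo + 1e-8)         # min-max normalization to [0, 1]
```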
Modal data of better quality should be used as much as possible when generating the multi-path models: this is a basic prerequisite for obtaining highly accurate models, and such data may contain more pathological information than the remaining modalities. We therefore designed three training datasets (T1W-T1F-T2F, T1W-T1F-T1M, and T1W-T1F-T2W): the first contains T1W, T1WFlair, and T2WFlair; the second contains T1W, T1WFlair, and T1Mapping; and the third contains T1W, T1WFlair, and T2W. A multi-path model is trained for each dataset, as sketched below.
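A hedged sketch of this per-triplet training loop, reusing `MPUnetSketch` and `dice_ce_loss` from above: the `make_loader` callable, optimizer, learning rate, and epoch count are placeholders, not the authors' configuration.

```python
# One MP-Unet is trained per modality triplet; the resulting models feed the
# majority-vote ensemble sketched earlier.
import torch

TRIPLETS = [
    ("T1W", "T1WFlair", "T2WFlair"),   # dataset 1: T1W-T1F-T2F
    ("T1W", "T1WFlair", "T1Mapping"),  # dataset 2: T1W-T1F-T1M
    ("T1W", "T1WFlair", "T2W"),        # dataset 3: T1W-T1F-T2W
]

def train_multipath(make_loader, epochs=50, lr=1e-4, device="cuda"):
    """make_loader(triplet) is an assumed callable yielding (x1, x2, x3, target)
    batches for the given modality triplet."""
    models = []
    for triplet in TRIPLETS:
        model = MPUnetSketch().to(device)
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        for _ in range(epochs):
            for x1, x2, x3, target in make_loader(triplet):
                opt.zero_grad()
                loss = dice_ce_loss(model(x1.to(device), x2.to(device),
                                          x3.to(device)), target.to(device))
                loss.backward()
                opt.step()
        models.append(model)
    return models   # pass to ensemble_predict for the majority vote
```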
Finally, the three trained MP-Unets are integrated in combination with the Bagging algorithm.
Results
Experiments demonstrate that the ensemble MP-Unet yields a substantial improvement in breast lesion segmentation across all evaluated metrics. Compared with the U-net model that obtained the highest accuracy on the single-modality T1W dataset, the proposed method improves the Dice coefficient, Intersection over Union (IoU), and Precision (Pre) by 1.85, 1.99, and 2.15 percentage points, respectively.
Acknowledgements
All data for this study were provided by the Department of Radiology, The First Hospital of China Medical University.
References
[1] Ronneberger O, Fischer P, Brox T. U-net: Convolutional networks for biomedical image segmentation. International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, Cham, 2015: 234-241.
[2] Nie D, Wang L, Gao Y, et al. Fully convolutional networks for multi-modality isointense infant brain image segmentation. 2016 IEEE 13th International Symposium on Biomedical Imaging (ISBI). IEEE, 2016: 1342-1345.