Hang Yu1, Zichuan Xie2, Lizhi Xie3, Zhiheng Liu1, Lina Zhang4, Siyao Du4, Xiangjie Yin1, Chenyang Li1, Wenhong Jiang4, Yuru Guo1, and Zhongqi Kang4
1School of Aerospace Science and Technology, Xidian University, Xi'an, China, 2Guangzhou Institute of Technology, Xidian University, Guangzhou, China, 3GE Healthcare, Beijing, China, 4Department of Radiology, The First Hospital of China Medical University, Shenyang, China
Synopsis
Keywords: Breast, Machine Learning/Artificial Intelligence, Deep Learning
Multimodal MRI data are often utilized for breast cancer analysis, yet they remain difficult and inefficient for segmentation algorithms to exploit. In this paper, we propose MP-Unet, a network based on the U-net convolutional neural network, which can effectively ensemble multiple modalities of input data and produce accurate segmentation results. In MP-Unet, we reuse the modal data of better quality for training. The resulting MP-Unet models are further integrated with a Bagging-style majority vote to improve the segmentation accuracy of lesions. Experiments suggest that the proposed method delivers a substantial performance improvement.
Background and Purpose
Breast cancer is one of the most common malignant tumors in women, and accurate breast lesion segmentation is essential for the diagnosis and prognosis of patients with breast cancer. Radiologists manually outline lesions in breast MRI to analyze a patient's condition; however, this task is not only time-consuming and labor-intensive but also demands a high level of professional skill. In recent years, deep convolutional neural networks have come to dominate many computer vision fields. For image segmentation, convolutional neural networks have become an effective solution, discarding the fully connected layers used in traditional image classification networks. The image is read by the deep segmentation network and, after layers of convolutional operations, the final output is produced at the last layer by a softmax or sigmoid activation function, achieving pixel-level classification. The difference between the output segmentation map and the ground truth (GT) defines a loss function, and the model is trained iteratively to reduce the loss value and optimize the network weights; the optimal model is the one that reaches peak performance on the test set. However, multi-modal image processing remains a great challenge in medical image segmentation: the obstacle lies in the difficulty of fusing image information from multiple modalities for effective utilization, which can easily lead to deep learning models with low robustness or to wasted modal data.
Methods
In this paper, we propose a network framework based on U-net [1] with multiple backbone encoders, named MP-Unet, which takes multiple modalities of raw data as simultaneous inputs. The proposed architecture is motivated by previous work [2] and is depicted in Figure 1. It expands the original U-net architecture by replicating the encoder three times to accommodate three different input modalities. Before each downsampling, the feature maps obtained in the three contracting paths are copied and concatenated, and then concatenated with the feature maps of the same resolution in the expanding path, so as to recover the image information lost in the encoders due to downsampling.
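As a concrete illustration, below is a minimal PyTorch sketch of this three-encoder design. The channel widths, network depth, per-modality input channels, and exact fusion points are illustrative assumptions on our part; the abstract specifies only the overall topology.

```python
# A minimal sketch of the three-encoder MP-Unet idea; widths and depth are
# assumptions, not the authors' published configuration.
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    """Two 3x3 convolutions with ReLU, as in the original U-net."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )

class MPUnetSketch(nn.Module):
    """Three contracting paths (one per modality); their feature maps are
    concatenated and fused into a single expanding path."""
    def __init__(self, n_classes=2, width=16, depth=3):
        super().__init__()
        chs = [width * 2 ** i for i in range(depth)]   # e.g. [16, 32, 64]
        self.depth = depth
        # One independent encoder per input modality (single-channel inputs).
        self.encoders = nn.ModuleList([
            nn.ModuleList([conv_block(1 if i == 0 else chs[i - 1], chs[i])
                           for i in range(depth)])
            for _ in range(3)
        ])
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(3 * chs[-1], 2 * chs[-1])
        # Decoder: upsample, then fuse with the concatenated encoder features
        # of the same resolution (the "copy and concatenate" step).
        self.ups = nn.ModuleList()
        self.dec = nn.ModuleList()
        prev = 2 * chs[-1]
        for i in reversed(range(depth)):
            self.ups.append(nn.ConvTranspose2d(prev, chs[i], 2, stride=2))
            self.dec.append(conv_block(chs[i] + 3 * chs[i], chs[i]))
            prev = chs[i]
        self.head = nn.Conv2d(prev, n_classes, 1)

    def forward(self, x1, x2, x3):
        feats = []                        # per-level fused skip features
        xs = [x1, x2, x3]
        for lvl in range(self.depth):
            xs = [self.encoders[p][lvl](xs[p]) for p in range(3)]
            feats.append(torch.cat(xs, dim=1))    # fuse the three paths
            xs = [self.pool(t) for t in xs]
        y = self.bottleneck(torch.cat(xs, dim=1))
        for i, lvl in enumerate(reversed(range(self.depth))):
            y = self.ups[i](y)
            y = self.dec[i](torch.cat([y, feats[lvl]], dim=1))
        return self.head(y)               # logits; softmax applied in the loss
```

For example, `MPUnetSketch()(torch.randn(1, 1, 64, 64), torch.randn(1, 1, 64, 64), torch.randn(1, 1, 64, 64))` returns a `(1, 2, 64, 64)` logit map; spatial sizes must be divisible by `2 ** depth`.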
The final feature map of the network is passed through a softmax function, and the loss between the predicted segmentation map and the ground truth is computed using Dice loss and cross-entropy loss.
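A sketch of this combined objective is given below; the abstract does not state how the two terms are weighted, so the equal weighting here is an assumption.

```python
# Combined Dice + cross-entropy loss; the term weights are assumptions.
import torch
import torch.nn.functional as F

def dice_ce_loss(logits, target, eps=1e-6, ce_weight=1.0, dice_weight=1.0):
    """logits: (N, C, H, W) raw network output; target: (N, H, W) int64 labels."""
    ce = F.cross_entropy(logits, target)
    probs = F.softmax(logits, dim=1)                       # pixel-level softmax
    one_hot = F.one_hot(target, probs.shape[1]).permute(0, 3, 1, 2).float()
    inter = (probs * one_hot).sum(dim=(0, 2, 3))
    denom = probs.sum(dim=(0, 2, 3)) + one_hot.sum(dim=(0, 2, 3))
    dice = (2 * inter + eps) / (denom + eps)               # per-class Dice
    return ce_weight * ce + dice_weight * (1.0 - dice.mean())
```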
We reused the modal data of higher quality so that more MP-Unet models could be obtained. Inspired by the Bagging algorithm, we integrate the MP-Unets trained on the different modality datasets through a majority voting mechanism to further improve the segmentation accuracy of breast lesions. Since the segmentation network essentially performs pixel-level classification of an image, whenever the models disagree on the class of a pixel in the segmentation result, the majority vote is taken as the final result. The overall pipeline of the proposed algorithm is illustrated in Figure 2.
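A minimal sketch of this pixel-wise majority vote follows; the function name and the tie-break behavior are our own choices (with three voters and binary labels, no tie can occur).

```python
# Pixel-wise majority vote over the three trained MP-Unets.
import torch

@torch.no_grad()
def ensemble_predict(models, inputs_per_model):
    """models: list of 3 MP-Unets; inputs_per_model: list of (x1, x2, x3)
    modality triplets, one per model. Returns the voted label map."""
    votes = []
    for model, (x1, x2, x3) in zip(models, inputs_per_model):
        model.eval()
        votes.append(model(x1, x2, x3).argmax(dim=1))   # (N, H, W) labels
    stacked = torch.stack(votes, dim=0)                 # (3, N, H, W)
    # For each pixel, keep the label predicted by the majority of models.
    return stacked.mode(dim=0).values
```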
The breast MRI data we processed comprise nine types: T1-weighted (T1W), T2-weighted (T2W), T1 fluid-attenuated inversion recovery (T1WFlair), T2 fluid-attenuated inversion recovery (T2WFlair), short-tau inversion recovery (STIR), proton-density-weighted (PDW), proton density mapping (PDMapping), T1 mapping (T1Mapping), and T2 mapping (T2Mapping). The nine modalities are shown in Figure 3. T1Mapping, T2Mapping, and STIR images exhibit severe salt-and-pepper noise when generated, so we remove this noise by selecting a threshold value from the grayscale histogram. The nine modalities are then normalized.
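The following sketch shows one way to implement this preprocessing; the percentile-based threshold is an assumption, since the abstract says only that a threshold is selected from the grayscale histogram.

```python
# Threshold-based denoising plus min-max normalization for one modality.
# The 99.5th-percentile cut-off is an illustrative assumption.
import numpy as np

def denoise_and_normalize(img, percentile=99.5):
    """img: 2-D or 3-D array of raw intensities for one modality."""
    threshold = np.percentile(img, percentile)   # read off the histogram tail
    img = np.clip(img, 0, threshold)             # suppress impulse (salt) noise
    lo, hi = img.min(), img.max()
    return (img - lo) / (hi - lo + 1e-8)         # min-max normalization to [0, 1]
```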
Modal data of better quality should be used as much as possible when generating the multi-path models: this is a basic prerequisite for obtaining highly accurate models, and such data may contain more pathological information than the remaining modalities. We therefore designed three training datasets (T1W-T1F-T2F, T1W-T1F-T1M, and T1W-T1F-T2W): the first contains T1W, T1WFlair, and T2WFlair; the second contains T1W, T1WFlair, and T1Mapping; and the third contains T1W, T1WFlair, and T2W. A multi-path model is trained for each dataset, as sketched below.
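A hedged sketch of this per-triplet training loop, reusing `MPUnetSketch` and `dice_ce_loss` from above: the `make_loader` callable, optimizer, learning rate, and epoch count are placeholders, not the authors' configuration.

```python
# One MP-Unet is trained per modality triplet; the resulting models feed the
# majority-vote ensemble sketched earlier.
import torch

TRIPLETS = [
    ("T1W", "T1WFlair", "T2WFlair"),   # dataset 1: T1W-T1F-T2F
    ("T1W", "T1WFlair", "T1Mapping"),  # dataset 2: T1W-T1F-T1M
    ("T1W", "T1WFlair", "T2W"),        # dataset 3: T1W-T1F-T2W
]

def train_multipath(make_loader, epochs=50, lr=1e-4, device="cuda"):
    """make_loader(triplet) is an assumed callable yielding (x1, x2, x3, target)
    batches for the given modality triplet."""
    models = []
    for triplet in TRIPLETS:
        model = MPUnetSketch().to(device)
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        for _ in range(epochs):
            for x1, x2, x3, target in make_loader(triplet):
                opt.zero_grad()
                loss = dice_ce_loss(model(x1.to(device), x2.to(device),
                                          x3.to(device)), target.to(device))
                loss.backward()
                opt.step()
        models.append(model)
    return models   # pass to ensemble_predict for the majority vote
```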
Finally, the three trained MP-Unets are integrated in combination with the Bagging algorithm.
Results
Experiments demonstrate that the ensemble MP-Unet yields a substantial improvement in breast lesion segmentation across all evaluated metrics. Compared with the U-net model that obtained the highest accuracy on the single-modality T1W dataset, the proposed method improves the Dice coefficient, Intersection over Union (IoU), and Precision (Pre) by 1.85, 1.99, and 2.15 percentage points, respectively.
Acknowledgements
All data for this study were provided by the Department of Radiology, The First Hospital of China Medical University.
References
[1] Ronneberger O, Fischer P, Brox T. U-net: Convolutional networks for biomedical image segmentation. International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, Cham, 2015: 234-241.
[2] Nie D, Wang L, Gao Y, et al. Fully convolutional networks for multi-modality isointense infant brain image segmentation. 2016 IEEE 13th International Symposium on Biomedical Imaging (ISBI). IEEE, 2016: 1342-1345.