4109

A novel unsupervised domain adaptation method for deep learning-based prostate MR image segmentation

Cheng Li¹, Hui Sun¹, Taohui Xiao¹, Xin Liu¹, Hairong Zheng¹, and Shanshan Wang¹
¹Paul C. Lauterbur Research Center for Biomedical Imaging, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China

Synopsis

Automatic prostate MR image segmentation is needed to help doctors achieve fast and accurate disease diagnosis and treatment planning. Deep learning (DL) has shown promising results for this task. However, DL models often struggle in practice when there are large discrepancies between the training (source domain) and test (target domain) data. Here we propose a novel unsupervised domain adaptation method that addresses this issue without using any target domain labels. Our method trains two models in parallel to filter and correct the pseudo labels generated for the target domain training data, and thus achieves substantially improved segmentation results on the test data.

Introduction

Prostate cancer is one of the major threats to men's health worldwide [1]. Prostate segmentation is crucial for predicting the pathological stage of the disease and for treatment planning [2]. Magnetic resonance imaging (MRI) is an important imaging modality for the evaluation of the prostate because of its superior imaging contrast and resolution. Manual delineation of the prostate in MR images is time-consuming and error-prone. Automatic segmentation models are therefore needed to help doctors make fast and accurate imaging-based diagnoses and related decisions.
Deep learning (DL) has shown promising performance for prostate segmentation [3,4]. However, it is widely recognized that in real-world applications, DL models often encounter problems due to discrepancies between the source domain training data and the target domain test data, caused by different image acquisition protocols or different machines. Model optimization with combined source and target domain data is an effective solution [5]. Nevertheless, it may not always be feasible to obtain labeled target domain images. Unsupervised domain adaptation, which aims to transfer models from the source domain to the target domain without using target domain labels, is therefore desirable. To this end, we propose a novel unsupervised domain adaptation method for deep learning-based prostate MR image segmentation. With our method, DL models trained with labeled source domain images and unlabeled target domain images achieve promising segmentation results on target domain test data.

Methods

The overall framework of our proposed method is shown in Figure 1. Three major steps are involved. First, a network (Net0) is trained with the available labeled source domain training data. Second, pseudo labels are generated for the target domain training data with Net0; combining the labeled source domain training data with the pseudo-labeled target domain training data yields an enlarged training set. Third, cross-domain cross-network optimization is performed with the enlarged training set. Specifically, two networks are optimized in parallel to conduct cross-network local noisy label filtering and global noisy label correction. Local label filtering is performed in each iteration: a defined percentage (half of the batch) of the samples suspected of having highly noisy labels (those with large segmentation loss) are filtered out, and for these samples a consistency loss is calculated between the predictions of the inputs and the averaged predictions of their augmented versions. Global label correction is performed in each epoch over the whole training set: highly noisy labels (the 25% of target domain training samples with the smallest Dice scores between the network predictions and the pseudo labels) are replaced by the network predictions. Data augmentation is utilized to achieve data distillation [6]. Cross-network sample filtering is adopted to implicitly embed the idea of network distillation [7] and to prevent error accumulation and propagation within a single network [8]. With these elements, the networks are encouraged to exploit more image content in addition to generating label-guided image features. For the networks, classical encoder-decoder architectures are used [9]. A combination of Dice loss and cross-entropy loss is used as the segmentation loss, and the mean square error is used as the consistency loss. Two public datasets are employed for our experiments, NCI-ISBI 2013 [10] and PROMISE12 [2].
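The filtering and correction steps described above can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the implementation used in this work: all function names are hypothetical, the Dice and cross-entropy terms are assumed to be summed with equal weights, the cross-network exchange is assumed to follow a co-teaching-style rule [8] in which each network is supervised on the samples that its peer ranks as low loss, and the choice of which network's predictions replace the pseudo labels is left to the caller because the text does not specify it.

```python
import numpy as np


def dice_score(pred, label, eps=1e-6):
    """Dice overlap between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    label = np.asarray(label, dtype=bool)
    intersection = np.logical_and(pred, label).sum()
    return (2.0 * intersection + eps) / (pred.sum() + label.sum() + eps)


def segmentation_loss(prob, label, eps=1e-6):
    """Soft Dice loss plus binary cross-entropy (equal weighting is an assumption)."""
    prob = np.clip(np.asarray(prob, dtype=float), 1e-7, 1.0 - 1e-7)
    label = np.asarray(label, dtype=float)
    soft_dice = (2.0 * (prob * label).sum() + eps) / (prob.sum() + label.sum() + eps)
    cross_entropy = -np.mean(label * np.log(prob) + (1.0 - label) * np.log(1.0 - prob))
    return (1.0 - soft_dice) + cross_entropy


def consistency_loss(prob, augmented_probs):
    """Mean square error between a prediction and the mean of augmented predictions.

    The augmented predictions are assumed to be mapped back to the original geometry.
    """
    mean_augmented = np.mean(np.asarray(augmented_probs, dtype=float), axis=0)
    return float(np.mean((np.asarray(prob, dtype=float) - mean_augmented) ** 2))


def split_by_loss(per_sample_loss, keep_fraction=0.5):
    """Local filtering: return (kept small-loss indices, filtered large-loss indices)."""
    order = np.argsort(np.asarray(per_sample_loss, dtype=float))
    n_keep = int(round(keep_fraction * len(order)))
    return order[:n_keep], order[n_keep:]


def cross_network_split(loss_a, loss_b, keep_fraction=0.5):
    """Assumed co-teaching-style exchange: each network uses its peer's ranking."""
    keep_by_a, drop_by_a = split_by_loss(loss_a, keep_fraction)
    keep_by_b, drop_by_b = split_by_loss(loss_b, keep_fraction)
    return {"net_a": (keep_by_b, drop_by_b), "net_b": (keep_by_a, drop_by_a)}


def correct_pseudo_labels(pseudo_labels, predictions, fraction=0.25):
    """Global correction: replace the pseudo labels with the lowest Dice agreement."""
    scores = np.array([dice_score(p, l) for p, l in zip(predictions, pseudo_labels)])
    n_fix = int(round(fraction * len(scores)))
    worst = np.argsort(scores)[:n_fix]
    corrected = [np.array(l, copy=True) for l in pseudo_labels]
    for i in worst:
        corrected[i] = np.array(predictions[i], copy=True)
    return corrected, worst
```

In a training loop following this sketch, the kept indices of each batch would receive the segmentation loss, the filtered indices would receive only the consistency loss, and `correct_pseudo_labels` would be called once per epoch on the target domain training samples.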
Three data domains are derived from the two datasets: two from NCI-ISBI 2013 and one from PROMISE12. Domain 1 and Domain 2 contain 30 training patients and 10 test patients, respectively. Domain 3 has 37 patients, with 10 patients randomly selected as the test cases. Domain 1 data are acquired with 1.5T MRI systems, Domain 2 with 3.0T MRI systems, and Domain 3 with different machines. Figure 2 shows example images from the different domains; large variations in appearance exist. Two evaluation metrics are reported: Dice score (Dice similarity coefficient, DSC) and average symmetric surface distance (ASSD). Higher DSC and lower ASSD values indicate more accurate segmentation results. Differences between models are evaluated by paired t-tests with a significance threshold of p < 0.05.
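The paired comparison mentioned above can be illustrated with a minimal sketch that computes the paired t statistic from per-case scores of two models evaluated on the same test cases. The function name and the example DSC values are hypothetical and made up purely for illustration; they are not results from this study. Only the Python standard library is used, so the sketch stops at the t statistic; converting it to a p-value requires the Student t distribution with n - 1 degrees of freedom (for example, `scipy.stats.ttest_rel` performs both steps).

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees of freedom) for paired per-case scores of two models."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equally long score lists with at least 2 cases")
    # Per-case differences between the two models on the same test cases.
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    t = statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical per-case DSC values, made up for illustration only.
dsc_model_a = [0.81, 0.78, 0.84, 0.80, 0.79]
dsc_model_b = [0.52, 0.40, 0.61, 0.47, 0.44]
t_value, dof = paired_t_statistic(dsc_model_a, dsc_model_b)
print(f"t = {t_value:.2f} with {dof} degrees of freedom")
```

The test is paired because both models are scored on the same patients, so the per-case differences, rather than the two groups of scores, carry the comparison.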

Results and Discussion

The proposed method is compared with two baselines: models trained on the source domain and directly tested on the target domain, and models trained with the combined labeled source domain training data and pseudo-labeled target domain training data using the conventional optimization method. Figure 3 plots the DSC and ASSD metrics. Overall, the models trained and tested in the same domain achieve the best results. Our proposed cross-domain cross-network optimization method outperforms the two comparison methods for unsupervised domain adaptation of prostate segmentation DL models. In particular, when transferring models from Domain 1 to Domain 2, our method raises the average DSC by more than 30 percentage points (from 45.8% to 80.0%). Figure 4 shows the segmentation results of an example case. Our proposed method produces much better segmentation maps than direct model testing on the target domain.

Conclusion

In this study, an unsupervised domain adaptation method is proposed that can substantially enhance the cross-domain prostate segmentation performance of DL models without utilizing target domain labels. The method addresses a typical issue in clinical applications of DL models: their performance depends heavily on the training data, while MR images with different properties are commonly acquired in practice because of different operators, imaging parameters, or machines. Therefore, our method has high potential for real clinical applications.

Acknowledgements

This research was partly supported by the Scientific and Technical Innovation 2030 - "New Generation Artificial Intelligence" Project (2020AAA0104100, 2020AAA0104105), the National Natural Science Foundation of China (61871371, 81830056), the Key-Area Research and Development Program of Guangdong Province (2018B010109009), the Basic Research Program of Shenzhen (JCYJ20180507182400762), and the Youth Innovation Promotion Association Program of Chinese Academy of Sciences (2019351).

References

[1] Siegel, R. L., Miller, K. D. & Jemal, A. Cancer statistics, 2020. CA. Cancer J. Clin. 70, 7–30 (2020).

[2] Litjens, G. et al. Evaluation of prostate segmentation algorithms for MRI: The PROMISE12 challenge. Med. Image Anal. 18, 359–373 (2014).

[3] Milletari, F., Navab, N. & Ahmadi, S.-A. V-Net: Fully convolutional neural networks for volumetric medical image segmentation. in International Conference on 3D Vision (3DV) 565–571 (2016).

[4] Jia, H. et al. 3D APA-Net: 3D adversarial pyramid anisotropic convolutional network for prostate segmentation in MR images. IEEE Trans. Med. Imaging 39, 447–457 (2020).

[5] Liu, Q., Dou, Q., Yu, L. & Heng, P. A. MS-Net: Multi-site network for improving prostate segmentation with heterogeneous MRI data. IEEE Trans. Med. Imaging 39, 2713–2724 (2020).

[6] Radosavovic, I. et al. Data distillation: towards omni-supervised learning. in IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 4119–4128 (IEEE, 2018).

[7] Lakshminarayanan, B., Pritzel, A. & Blundell, C. Simple and scalable predictive uncertainty estimation using deep ensembles. in Conference on Neural Information Processing Systems (NeurIPS) (2017).

[8] Han, B. et al. Co-teaching: Robust training of deep neural networks with extremely noisy labels. in Conference on Neural Information Processing Systems (NeurIPS) (2018).

[9] Falk, T. et al. U-Net: deep learning for cell counting, detection, and morphometry. Nat. Methods 16, 67–70 (2019).

[10] NCI-ISBI 2013 challenge: automated segmentation of prostate structures. (2013).

Figures

Figure 1. The proposed framework. Source domain labeled training data are employed first to train a network that generates pseudo labels for the target domain unlabeled training data. Then, the combined source domain labeled training data and target domain pseudo-labeled training data are utilized to achieve our proposed cross-domain cross-network optimization.

Figure 2. Example prostate MR images acquired with different imaging parameters. From left to right, the first image is from Domain 1, the second from Domain 2, and the third to the fifth from Domain 3.

Figure 3. Segmentation results. ‘D’ refers to Domain. The first number after D refers to the source domain and the second refers to the target domain. ‘N’, ‘C’, and ‘O’ indicate models trained with only the source domain training data, models trained with the combined data using the conventional optimization method, and models trained with the combined data using the proposed method, respectively. The red lines indicate the median values and the green triangles indicate the average values. * indicates a significant difference between the corresponding experiments, evaluated by paired t-tests.

Figure 4. Example segmentation maps when transferring models from Domain 1 to Domain 2. From left to right, the three columns correspond to the transverse plane, the sagittal plane, and the coronal plane. From top to bottom, the four rows show the ground-truth manual segmentation, the outputs of models trained directly on the Domain 2 training set, the outputs of models trained on the Domain 1 labeled training set, and the outputs of models trained on the Domain 1 labeled training set and the Domain 2 pseudo-labeled training set with the proposed cross-domain cross-network optimization method.

Proc. Intl. Soc. Mag. Reson. Med. 29 (2021)
