Yu-Chun Lin1, Yenpo Lin1, Yen-Ling Huang1, Chih-Yi Ho1, and Gigin Lin1,2
1Department of Medical Imaging and Intervention, Chang Gung Memorial Hospital at Linkou, Taoyuan, Taiwan, 2Clinical Metabolomics Core Laboratory, Chang Gung Memorial Hospital at Linkou, Taoyuan, Taiwan
Synopsis
This study retrospectively analyzed diffusion-weighted MRI in 320 patients with malignant uterine tumors (UT). A pretrained model was established on a cervical cancer dataset. Transfer learning (TL) experiments were performed by varying the fine-tuned layers and the proportion of training data. When up to 50% of the training data were used, the TL models outperformed the models trained from scratch. When the full dataset was used, the aggregated model exhibited the best overall performance, whereas the UT-only model performed best on the UT dataset. TL of tumor segmentation on diffusion-weighted MRI is feasible for all uterine malignancies with a limited number of cases.
Purpose
Deep convolutional neural networks have the potential to automate labor-intensive tumor segmentation on magnetic resonance (MR) imaging. However, building a model requires a large amount of training data, which is challenging for broad clinical application. The purpose of this study was to investigate how transfer learning (TL) can generalize accurate tumor segmentation from cervical cancer (CX) to all malignant uterine tumors (UT).
Methods
This study retrospectively analyzed diffusion-weighted MRI in 320 patients with UT. The sagittal diffusion-weighted (DW) images and the corresponding ADC maps of each slice were used as input sources for training and testing. Regions of interest (ROIs) of tumor contours were delineated by the consensus of two gynecologic radiologists, and the labeled ROIs served as the ground truth for model training. The pretrained model was established using DeepLab V3+ [2] on the CX dataset (n = 144).
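To make the setup concrete, the sketch below builds a DeepLab V3+ segmentation network that takes a two-channel input (DW image and ADC map) and predicts a binary tumor mask. This is an illustrative assumption rather than the authors' code: the segmentation_models_pytorch package, the ResNet-50 backbone, the Dice loss, and the learning rate are choices not specified in the abstract.

```python
# Illustrative sketch only: DeepLab V3+ with a 2-channel input (DW image + ADC map)
# and a 1-channel binary tumor mask output. Backbone, loss, and learning rate are assumptions.
import torch
import segmentation_models_pytorch as smp

model = smp.DeepLabV3Plus(
    encoder_name="resnet50",  # backbone choice is an assumption
    in_channels=2,            # channel 0: sagittal DW image, channel 1: ADC map
    classes=1,                # single foreground class: tumor
)
loss_fn = smp.losses.DiceLoss(mode="binary")  # overlap-based loss, consistent with DSC evaluation
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(dwi_adc, mask):
    """One optimization step; dwi_adc is (B, 2, H, W), mask is (B, 1, H, W)."""
    optimizer.zero_grad()
    loss = loss_fn(model(dwi_adc), mask)
    loss.backward()
    optimizer.step()
    return loss.item()
```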
For the TL experiment, we evaluated three combinations of model training and prediction: (a) UT-only model: training from scratch on the UT dataset without TL from the CX model; (b) TL model: fine-tuning selected layers of the pretrained CX model on the UT dataset; and (c) CX-only prediction: applying the pretrained CX model directly to the UT dataset. To investigate the effect of freezing versus fine-tuning layers on TL performance, we examined three cutoff levels (L1, L2, and L3) in the TL model; layers before the cutoff layer were frozen, whereas those after it were fine-tuned on the target-domain data (Figure 1).
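A minimal sketch of this freeze/fine-tune scheme, assuming a PyTorch-style model, is shown below. The actual cutoff layers corresponding to levels L1-L3 are defined in Figure 1, so the cutoff name used here is hypothetical.

```python
# Illustrative sketch only (continues the model defined above): freeze every parameter
# before a cutoff layer and fine-tune from the cutoff onward. The cutoff name is hypothetical;
# the actual L1-L3 cutoffs are those defined in Figure 1.
import torch

def prepare_transfer_model(pretrained_model, cutoff_name):
    """Freeze parameters before the cutoff module; keep the rest trainable."""
    past_cutoff = False
    for name, param in pretrained_model.named_parameters():
        if name.startswith(cutoff_name):
            past_cutoff = True
        param.requires_grad = past_cutoff  # frozen before the cutoff, trainable from it onward
    return pretrained_model

# Example: an early cutoff (analogous to level L1) leaves most of the network trainable.
tl_model = prepare_transfer_model(model, cutoff_name="encoder.layer2")
optimizer = torch.optim.Adam(
    (p for p in tl_model.parameters() if p.requires_grad), lr=1e-5  # learning rate is an assumption
)
```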
To assess the influence of data size on training performance, we randomly sampled 2%, 5%, 10%, 20%, 50%, and 100% of patients from the training dataset. These patient subsets were used to train the UT-only and TL models, and an independent dataset (n = 64) was used to test the performance of each group. An aggregated model was additionally trained on the combined CX and UT datasets. Model performance was evaluated using the Dice similarity coefficient (DSC) on independent test sets of UT (n = 64) and CX (n = 25) cases.
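The patient-level subsampling and the DSC metric (DSC = 2|A ∩ B| / (|A| + |B|) for predicted mask A and ground-truth mask B) could be implemented as sketched below; the function names and random seed are illustrative assumptions, not details from the study.

```python
# Illustrative sketch only: draw a random fraction of patients for training and
# compute the Dice similarity coefficient between binary masks.
import numpy as np

def subsample_patients(patient_ids, fraction, seed=0):
    """Randomly select a fraction (e.g., 0.02, 0.05, 0.10, 0.20, 0.50, 1.0) of patients."""
    rng = np.random.default_rng(seed)
    n = max(1, int(round(fraction * len(patient_ids))))
    return list(rng.choice(patient_ids, size=n, replace=False))

def dice_similarity(pred_mask, true_mask, eps=1e-7):
    """DSC = 2|A ∩ B| / (|A| + |B|) for binary numpy arrays."""
    pred = pred_mask.astype(bool)
    true = true_mask.astype(bool)
    intersection = np.logical_and(pred, true).sum()
    return (2.0 * intersection + eps) / (pred.sum() + true.sum() + eps)
```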
Results
Figure 2 shows the performance of TL for the various training combinations and sample sizes from the UT dataset. When the pretrained CX model was applied directly to the UT dataset, the DSC was lowest at 0.30 (95% confidence interval [CI], 0.25–0.34). The DSC increased as the size of the UT training data increased. The TL models fine-tuned at the L1 level exhibited higher DSCs than those fine-tuned at the L2 and L3 levels (P < 0.05 for all data size subgroups). When the data size was small (<128 patients, i.e., <50%), the TL model fine-tuned at the L1 level outperformed the UT-only model (P < 0.001 for sample sizes of 5, 13, 26, and 51). However, as more target-domain data were used, the performance of the UT-only model improved. With a data size of 128, the DSC did not differ significantly between the UT-only model and the L1-level TL model (DSC = 0.70 and 0.69, respectively; P = 0.12). With the full training size of 256 patients, the UT-only model exhibited the highest DSC of 0.79 (95% CI, 0.77–0.81) among all training combinations, regardless of which TL model was used (DSC = 0.71, 0.67, and 0.63 for the L1, L2, and L3 levels, respectively; P < 0.001).
Figure 3 shows a patient with endometrial cancer in whom tumor contours were generated using the various training models and sample sizes. The pretrained CX model without fine-tuning on UT data captured only a small part of the tumor (DSC = 0.18). Accuracy increased as more UT data were used for fine-tuning. The TL model outperformed the UT-only model when the fine-tuning data size was <128 patients, whereas the UT-only model achieved the highest DSC of 0.92 when all patient data were used (n = 256).
Conclusions
TL of tumor segmentation using diffusion-weighted MRI from cervical cancer to all uterine malignancies is feasible. The CX dataset can serve as the source of a pretrained model that can be adapted to the target domain of uterine malignancy. The TL approach is advantageous when the data size of the target domain is limited; however, if large amounts of data are available and can be annotated, training from scratch on the target dataset appears to be the better option for a specific disease. The aggregated model provides a potential universal solution for segmenting all uterine malignancies.
Acknowledgements
This study was funded by the Ministry of Science and Technology, Taiwan (grants MOST 109-2628-B-182A-007 and MOST 109-2314-B-182A-045) and the Chang Gung Medical Foundation (grants CLRPG3K0021, CMRPG3J1631, CMRPG3I0141, and SMRPG3K0051). Chang Gung IRB 201702080A0C601, 201702204B0, and 201902082B0C601.
References
1. …esion Segmentation, in Medical Image Computing and Computer Assisted Intervention − MICCAI 2017. 2017. p. 516-524.
2. Chen, L.-C., et al. Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation. arXiv e-prints, 2018.