3526

Differentiation of Breast Cancer Molecular Subtypes on DCE-MRI by Using Convolutional Neural Network with Transfer Learning

Yang Zhang¹, Yezhi Lin¹,², Siwa Chan³, Jeon-Hor Chen¹,⁴, Jiejie Zhou², Daniel Chow¹, Peter Chang¹, Meihao Wang², and Min-Ying Su¹
¹Department of Radiological Science, University of California, Irvine, CA, United States, ²Department of Radiology, First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China, ³Department of Medical Imaging, Taichung Tzu-Chi Hospital, Taichung, Taiwan, ⁴Department of Radiology, E-Da Hospital and I-Shou University, Kaohsiung, Taiwan

Synopsis

A total of 244 patients were analyzed: 99 in Training, 83 in Testing-1 and 62 in Testing-2. Patients were classified into 3 molecular subtypes: TN, HER2+ and HR+/HER2-. Two deep learning models, a conventional CNN and a Convolutional Long Short Term Memory (CLSTM) network, were implemented. The mean accuracy in the Training dataset, evaluated using 10-fold cross-validation, was higher using CLSTM (0.91) than CNN (0.79). When the developed model was applied directly to the testing datasets, the accuracy was very low, 0.4-0.5. When transfer learning was applied to re-tune the model using one testing dataset, the accuracy in the other testing dataset improved greatly, from 0.4-0.5 to 0.8-0.9.

Introduction

Determining the molecular subtype of newly diagnosed breast cancer is very important for choosing the most appropriate treatment strategy. While the markers can be evaluated from tissue obtained at biopsy or surgery, this evaluation is subject to tissue sampling bias. For patients electing to receive neoadjuvant chemotherapy, the tumor can shrink substantially or even regress completely to achieve pathological complete response, which makes it very difficult to perform thorough molecular subtyping for choosing additional therapies after surgery. For patients with hormonal receptor positive cancer, long-term hormonal therapy is needed, which is known to be very effective in reducing the risk of recurrence and metastasis. Breast MR images contain rich information that may be used to differentiate molecular subtypes, and images acquired at the time of diagnosis, before any treatment, allow a thorough assessment of the entire tumor. The goal of this study is to apply deep learning using two different convolutional neural networks to differentiate three molecular subtypes of breast cancer: triple negative (TN), HER2 positive (HER2+), and hormonal receptor positive & HER2 negative (HR+/HER2-). One training dataset and two testing datasets were used. In addition to direct testing, we also applied transfer learning to investigate how it can be used to improve the accuracy.

Methods

The Training Dataset was acquired at one hospital on a Siemens 1.5T system, with a total of 99 patients (65 HR+/HER2-, 24 HER2+, 10 TN). The independent testing was done using cases collected from a different hospital on a GE 3T system. Testing Dataset-1 was collected from Jan 2017 to May 2018, with a total of 83 patients (54 HR+/HER2-, 19 HER2+, 10 TN); Testing Dataset-2 included newer cases collected from June to Dec 2018, with a total of 62 patients (37 HR+/HER2-, 15 HER2+, 10 TN). Tumors were segmented on T1w contrast-enhanced maps using the fuzzy-C-means (FCM) clustering algorithm [1]. In a recent study we reported that including a small amount of peri-tumor tissue in the analysis can achieve a higher diagnostic accuracy than using the tumor ROI alone [2], so in this study we adopted the same method. The ROIs obtained on all slices of one lesion were projected together, and the smallest bounding box covering them was used as the input to deep learning [3]. Only the DCE images were analyzed, which included one set of pre-contrast and 4 sets of post-contrast images. Since a time series of DCE-MRI was acquired, a recurrent neural network could be implemented to consider the change of signal intensity over time [4]. Figure 1 shows the conventional convolutional neural network [5-8], which uses all 5 sets of images together as inputs. Figure 2 shows the convolutional long short term memory (CLSTM) network [9], in which the DCE images were put into the network one by one. To avoid overfitting, the dataset was augmented by random affine transformation. In the training dataset, the evaluation was done using 10-fold cross-validation. The developed model was directly applied to the testing datasets to evaluate the accuracy. Then, the first model developed from training was used as the basis for transfer learning in Testing Dataset-1 to re-tune the model. This second model, developed from transfer learning in Testing Dataset-1, was applied to Testing Dataset-2 for evaluation. The process was then repeated using Dataset-2 for transfer learning and Dataset-1 for testing.
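The bounding-box step described above (projecting the ROIs from all slices together and taking the smallest box that covers them) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each slice's ROI is available as a binary mask stored as nested lists, and the function names project_masks, smallest_bounding_box and crop are hypothetical. The peri-tumor margin and the square box centered at the tumor centroid mentioned in the figure captions are not modeled here.

```python
from typing import List, Tuple

Mask = List[List[int]]  # one slice: rows of 0/1 values (assumed representation)


def project_masks(masks: List[Mask]) -> Mask:
    """Union of the per-slice ROI masks: a pixel is 1 if any slice marks it."""
    rows, cols = len(masks[0]), len(masks[0][0])
    return [
        [int(any(m[r][c] for m in masks)) for c in range(cols)]
        for r in range(rows)
    ]


def smallest_bounding_box(mask: Mask) -> Tuple[int, int, int, int]:
    """Return (row_min, row_max, col_min, col_max), all inclusive."""
    hits = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not hits:
        raise ValueError("mask contains no ROI pixels")
    row_ids = [r for r, _ in hits]
    col_ids = [c for _, c in hits]
    return min(row_ids), max(row_ids), min(col_ids), max(col_ids)


def crop(image: List[List[float]], box: Tuple[int, int, int, int]) -> List[List[float]]:
    """Cut the bounding box out of one 2-D image."""
    r0, r1, c0, c1 = box
    return [row[c0:c1 + 1] for row in image[r0:r1 + 1]]


# Toy example: two 3x4 slices whose ROIs only partly overlap.
slice_masks = [
    [[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0]],
    [[0, 0, 0, 0], [0, 0, 1, 0], [0, 0, 1, 0]],
]
projection = project_masks(slice_masks)
box = smallest_bounding_box(projection)
patch = crop([[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]], box)
```

Because the box is computed on the projection, the same crop can be applied to every slice and every DCE time frame of the lesion, which keeps the network inputs spatially aligned.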

Results

Figures 3 and 4 show two case examples, with the smallest bounding box used in deep learning. All results are summarized in Table 1. The accuracy obtained in the training process was evaluated using 10-fold cross-validation, and the range and the mean values were reported. When using the conventional CNN, the mean accuracy was 0.79 in the Training dataset, but when the model was applied to Testing-1 and Testing-2, the accuracy was very low, only 0.52 and 0.47. When transfer learning was applied, the mean accuracy in Testing-1 reached 0.91, and the new model improved the accuracy in Testing-2 from 0.47 to 0.78. When using the recurrent network with CLSTM, the mean accuracy was 0.91 in the Training dataset, higher than the 0.79 obtained using CNN. When the model was applied to Testing-1 and Testing-2, the accuracy was also very low, only 0.44 and 0.39. When transfer learning was applied, the mean accuracy reached 0.83 in Testing-1, and the accuracy in Testing-2 improved from 0.39 to 0.74. The results were similar when using Testing Dataset-2 for transfer learning, which also greatly improved the accuracy in Dataset-1.
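As a rough illustration of how a 10-fold cross-validation over the 99 training patients can be organized and summarized as a range and a mean, the following sketch uses only the Python standard library. The shuffling, the fixed seed, and the absence of stratification by subtype are assumptions, since the abstract does not state how the folds were formed; k_fold_indices and summarize are hypothetical helper names, and no accuracy values from the study are reproduced.

```python
import random
from statistics import mean
from typing import Dict, List, Sequence


def k_fold_indices(n_samples: int, k: int = 10, seed: int = 0) -> List[List[int]]:
    """Shuffle the sample indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary, assumed choice
    return [indices[i::k] for i in range(k)]


def summarize(fold_accuracies: Sequence[float]) -> Dict[str, float]:
    """Range (min, max) and mean over the per-fold accuracies."""
    return {
        "min": min(fold_accuracies),
        "max": max(fold_accuracies),
        "mean": mean(fold_accuracies),
    }


# 99 patients split into 10 folds: nine folds of 10 patients and one of 9.
folds = k_fold_indices(99, k=10)
fold_sizes = [len(fold) for fold in folds]

# In each round one fold is held out for validation and the rest is used
# for training; the per-round accuracies would then be passed to summarize().
held_out = folds[0]
training = [i for fold in folds[1:] for i in fold]
```

With only 10 TN cases among the 99 patients, a stratified split would keep about one TN case in every fold; whether that was done in the study is not stated.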

Discussion

In standard practice for breast cancer, the hormonal receptor and HER2 receptor must be evaluated so that the patient can receive targeted therapies that are known to be effective, including hormonal therapy using tamoxifen and aromatase inhibitors, and HER2-targeting therapy using trastuzumab and pertuzumab. Therefore, an accurate diagnosis of the subtype is very important. Imaging may provide a complementary approach, especially for patients receiving neoadjuvant chemotherapy when the residual tissue is not sufficient for analysis. The training and testing datasets in this study were acquired using different systems (Siemens 1.5T vs. GE 3T) with different protocols (non-fat-sat vs. fat-sat), which might explain the very low accuracy when the model developed from Training was directly applied to the Testing datasets. We further showed that transfer learning was an efficient method to re-tune the model for a different dataset. After transfer learning using one testing dataset, the accuracy was greatly improved in the second testing dataset. Lastly, we compared the results obtained using CNN and CLSTM, and showed that the recurrent network was a better architecture to analyze the DCE-MRI images acquired in a time series, which has been demonstrated by several other studies as well. In conclusion, our results show that deep learning provides an efficient method to extract subtle information from images to improve prediction of breast cancer molecular subtypes.

Acknowledgements

This work was supported in part by NIH R01 CA127929, R21 CA208938.

References

[1] Nie K, Chen JH, Yu HJ, Chu Y, Nalcioglu O, Su MY. "Quantitative analysis of lesion morphology and texture features for diagnostic prediction in breast MRI." Academic radiology 15.12 (2008): 1513-1525.

[2] Zhou J, Zhang Y, Chang KT, et al. Diagnosis of Benign and Malignant Breast Lesions on DCE-MRI by Using Radiomics and Deep Learning With Consideration of Peritumor Tissue. J Magn Reson Imaging. 2019 Nov 1. doi: 10.1002/jmri.26981. [Epub ahead of print]

[3] Shi L, Zhang Y, Nie K, et al. Machine learning for prediction of chemoradiation therapy response in rectal cancer using pre-treatment and mid-radiation multi-parametric MRI. Magn Reson Imaging 2019;61:33–40.

[4] Lang N, Zhang Y, Zhang E, et al. Differentiation of spinal metastases originated from lung and other cancers using radiomics and deep learning based on DCE-MRI. Magn Reson Imaging. 2019 Feb 28. pii: S0730-725X(18)30672-6. doi: 10.1016/j.mri.2019.02.013. [Epub ahead of print]

[5] LeCun Y, Bengio Y. Convolutional networks for images, speech, and time series. The handbook of brain theory and neural networks. 1995;3361(10):1995.

[6] Kingma D, Ba J. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980. 2014 Dec 22.

[7] Nair V, Hinton GE. Rectified linear units improve restricted boltzmann machines. In Proceedings of the 27th international conference on machine learning (ICML-10) 2010 (pp. 807-814).

[8] Srivastava N, Hinton GE, Krizhevsky A, Sutskever I, Salakhutdinov R. Dropout: a simple way to prevent neural networks from overfitting. Journal of machine learning research. 2014 Jan 1;15(1):1929-58.

[9] Shi X, Chen Z, Wang H, Yeung DY, Wong WK, Woo WC. Convolutional LSTM network: A machine learning approach for precipitation nowcasting. In Advances in neural information processing systems 2015 (pp. 802-810).

Figures

Figure 1: Diagram of VGG convolutional neural network (CNN) [5]. The architecture uses 7 serial convolutional 3 x 3 filters followed by the ReLU nonlinear activation function [7]. Dropout at 50% is applied to all convolutional and fully-connected layers after the second layer to avoid over-fitting [8]. Feature maps are down-sampled to 25% of the previous layer by convolutions with a stride length of two. The number of the input channels is 5, using the 5 sets of pre- and post-contrast images acquired in DCE-MRI together without time information. The number of activation channels in deeper layers is progressively increased from 8 to 16 to 32 to 64. Softmax is used as the final activation function of the last fully connected layer.
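The statement that a stride of two down-samples the feature maps to 25% of the previous layer follows from halving both spatial dimensions. The sketch below works through that arithmetic and the size of a 3 x 3 convolutional layer. It is only an illustration: the 32 x 32 input patch, the "same" padding, the bias terms, and the assumption that the first layer maps the 5 input channels directly to 8 activation channels are not taken from the abstract, and the helper names are hypothetical.

```python
import math


def conv_output_size(size: int, stride: int) -> int:
    """Spatial size after a 'same'-padded convolution (assumed padding mode)."""
    return math.ceil(size / stride)


def conv_param_count(in_channels: int, out_channels: int, kernel: int = 3) -> int:
    """Weights plus one bias per output channel for a kernel x kernel filter bank."""
    return kernel * kernel * in_channels * out_channels + out_channels


# Hypothetical 32 x 32 input patch passed through one stride-2 convolution.
side_in = 32
side_out = conv_output_size(side_in, stride=2)          # 16
area_ratio = (side_out * side_out) / (side_in * side_in)  # 0.25, i.e. 25%

# Assumed first layer: 5 DCE input channels to 8 activation channels.
first_layer_params = conv_param_count(5, 8)             # 3*3*5*8 + 8 = 368
```

For odd input sizes the ratio is only approximately 25%, because the output side is rounded up.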

Figure 2: Diagram of Convolutional Long Short Term Memory (CLSTM) network [9]. The architecture uses 7 serial convolutional LSTM layers via 3x3 filters followed by the ReLU nonlinear activation function [7]. The configuration of the dropout and down-sampling are the same as in Fig.1. The number of the input channels is one. Five sets of pre-contrast and post-contrast DCE images are used as inputs, by adding them one by one into the CLSTM network. The number of activation channels in deeper layers is progressively increased from 4 to 8 to 16 to 32. The last dense layer is obtained by flattening the convolutional output feature maps from all states. Softmax is used as the final activation function of the last fully connected layer.
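The difference between the two input layouts in Figures 1 and 2 can be made concrete with a small sketch: the CNN receives the 5 DCE frames stacked as channels of a single image, whereas the CLSTM receives them as an ordered sequence of single-channel images. The nested-list representation, the axis order, and the names as_channels, as_sequence and shape are assumptions made for illustration, not the authors' code.

```python
from typing import List

Frame = List[List[float]]  # one 2-D DCE image, rows x cols (assumed representation)


def as_channels(frames: List[Frame]) -> List[List[List[float]]]:
    """CNN layout (rows, cols, n_frames): frames become channels of one image."""
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[[f[r][c] for f in frames] for c in range(cols)] for r in range(rows)]


def as_sequence(frames: List[Frame]) -> List[List[List[List[float]]]]:
    """CLSTM layout (n_frames, rows, cols, 1): one single-channel image per step."""
    return [[[[value] for value in row] for row in f] for f in frames]


def shape(nested) -> tuple:
    """Shape of a regular nested list, outermost dimension first."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


# Toy data: 5 frames (1 pre-contrast, 4 post-contrast) of 2 x 3 pixels,
# where every pixel of frame t simply holds the value t.
frames = [[[float(t)] * 3 for _ in range(2)] for t in range(5)]
cnn_input = as_channels(frames)     # shape (2, 3, 5)
clstm_input = as_sequence(frames)   # shape (5, 2, 3, 1)
```

In the channel layout the temporal order is just a channel index that the first convolution mixes together, while in the sequence layout the recurrent layers see the frames in acquisition order.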

Figure 3: A case example from a 53-year-old woman with triple negative breast cancer in the right breast. (a) Pre-contrast image; (b) Post-contrast image; (c) The zoom-in image of the lesion with outlined tumor boundary obtained by using the fuzzy-C-means (FCM) clustering-based algorithm. The square box is centered at the centroid of the tumor. The entire bounding box is used in machine learning. (d-h) Color-coded DCE maps at 5 time frames, one pre-contrast and 4 post-contrast, normalized using the same signal intensity scales.

Figure 4: A case example from a 48-year-old woman with hormonal receptor positive and HER2 negative (HR+/HER2-) breast cancer in the right breast. (a) Pre-contrast image; (b) Post-contrast image; (c) The zoom-in image of the lesion with outlined tumor boundary obtained by using the fuzzy-C-means (FCM) clustering-based algorithm. The square box is centered at the centroid of the tumor. The entire bounding box is used in machine learning. (d-h) Color-coded DCE maps at 5 time frames, one pre-contrast and 4 post-contrast, normalized using the same signal intensity scales.

Table 1: Summary of the differentiation accuracy among three different molecular subtypes. The accuracy obtained in the training process is evaluated using 10-fold cross-validation, and the range and the mean values are reported. For direct testing using the developed model, the overall accuracy is reported. The results obtained using CNN and CLSTM are marked using different colors.

Proc. Intl. Soc. Mag. Reson. Med. 28 (2020)
