Miao Liu1, Qi Wang1, Gaofeng Shi1, Li Yang1, and Qinglei Shi2
1The Fourth Hospital of Hebei Medical University, Shijiazhuang, China, 2MR Scientific Marketing, Siemens Healthineers Ltd., Beijing, China
Synopsis
In this study, an auto encoder (AE) model based on clinical features and radiomics features from ADC maps was established by optimizing the data enhancement, data normalization, dimension reduction and feature screening schemes. The model demonstrated high value in predicting the prognosis of locally advanced cervical cancer (LACC) treated with concurrent chemo-radiotherapy.
Introduction:
In concurrent chemo-radiotherapy of locally advanced cervical cancer (LACC), a non-invasive method that can accurately predict the therapeutic effect is extremely important for maximizing the benefit of treatment and prolonging survival. In recent years, machine learning has demonstrated high potential in the differential diagnosis and therapeutic evaluation of LACC, but the poor generalization ability and stability of the models have limited its application. In this study, clinical features were combined with radiomics features from apparent diffusion coefficient (ADC) maps, and the data enhancement, data normalization, dimension reduction and feature screening schemes of an auto encoder (AE) model were optimized, in order to explore the value of the model in predicting the prognosis of LACC treated with concurrent chemo-radiotherapy.
Materials and Methods:
From September 2016 to January 2020, 135 patients (age range: 32-77 years, mean age: 53±9 years) were included in this retrospective study. Pathological staging was as follows: IB3, 21 cases; IIA2, 37 cases; IIB, 2 cases; IIIA, 5 cases; IIIB, 6 cases; IIIC, 59 cases; IVA, 5 cases. All examinations were performed on the same 3T scanner (MAGNETOM Skyra, Siemens Healthcare, Erlangen, Germany) before and after therapy. According to the 5-year follow-up results, the patients were divided into a recurrent metastasis group and a non-recurrent metastasis group. The whole tumor volume was delineated on high b-value images (b= s/mm2) using ITK-SNAP (http://www.itksnap.org/); where necessary, contrast-enhanced T1-weighted or T2-weighted images were referenced to improve accuracy. Radiomics features were extracted using the open-source tool PyRadiomics (https://pyradiomics.readthedocs.io/). Based on previous research results, an auto encoder (AE) with a sigmoid layer was used as the classifier. To explore the potential of this classifier, the data enhancement, data normalization, dimension reduction and feature screening schemes used in model building were optimized, and the effect of the number of features on the prediction performance of the model was also explored. The performance of the model was evaluated using receiver operating characteristic (ROC) curve analysis, and the area under the ROC curve (AUC) was calculated for quantification. The accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were also calculated. All of the above processes were implemented with FeAture Explorer (FAE, v0.2.5, https://github.com/salan668/FAE) on Python (3.6.8, https://www.python.org/).
Results:
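As a reading aid for the metrics reported in this section: accuracy, sensitivity, specificity, PPV and NPV all follow from the four cells of a binary confusion matrix. The sketch below is not the FAE implementation; the function name `binary_metrics` is hypothetical, and so are the counts. They were chosen only because they reproduce the reported test-set values; the abstract itself does not report a confusion matrix or state which group was treated as the positive class.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity, specificity, PPV and NPV from confusion-matrix counts."""

    def ratio(numerator, denominator):
        # Guard against empty denominators (e.g. no predicted positives).
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),   # true positive rate
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "ppv": ratio(tp, tp + fp),           # positive predictive value
        "npv": ratio(tn, tn + fn),           # negative predictive value
    }


# Hypothetical counts (an assumption, not data from the study).
metrics = binary_metrics(tp=24, fp=0, tn=9, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these assumed counts the sketch yields an accuracy of 0.943, a sensitivity of 0.923, a specificity of 1.000, a PPV of 1.000 and an NPV of 0.818, which shows that the reported values are mutually consistent for a test set of 35 cases.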
After optimization, the following schemes were adopted: the synthetic minority over-sampling technique (SMOTE) for data enhancement, normalization to unit with zero center for data normalization, Pearson correlation coefficients (PCC) for dimension reduction, and Relief for feature selection (Figures 3-5). The model showed the highest prediction performance when the number of features was 10. On the validation data set, the AUC and accuracy reached 0.944 (0.913-0.969) and 0.862, respectively. With this setting, on the testing data set, the AUC and accuracy of the model reached 0.992 (0.960-1.000) and 0.943, the sensitivity and specificity were 0.923 and 1.000, and the NPV and PPV were 0.818 and 1.000, respectively (Figure 2 and Table 2). Information for a typical patient is shown in Figure 1.
Discussion
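The normalization and PCC-based dimension reduction steps named in Results can be illustrated with a minimal sketch. It is not the FAE code: the function names, the toy feature matrix and the 0.99 correlation threshold are assumptions, and "unit with zero center" is interpreted here as subtracting each feature's mean and dividing by the L2 norm of the centered feature, which may differ from FAE's exact definition. SMOTE, Relief and the AE classifier are not reproduced.

```python
import math


def center_to_unit(rows):
    """Center each feature column at zero and scale it to unit L2 norm."""
    normalized_cols = []
    for col in zip(*rows):
        mean = sum(col) / len(col)
        centered = [v - mean for v in col]
        norm = math.sqrt(sum(v * v for v in centered))
        # A constant column has zero norm; map it to zeros.
        normalized_cols.append([v / norm if norm else 0.0 for v in centered])
    return [list(r) for r in zip(*normalized_cols)]


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b) if var_a and var_b else 0.0


def pcc_filter(rows, threshold=0.99):
    """Keep a feature only if its |PCC| with every already-kept feature is below threshold."""
    cols = list(zip(*rows))
    kept = []
    for j, col in enumerate(cols):
        if all(abs(pearson(col, cols[k])) < threshold for k in kept):
            kept.append(j)
    return kept


# Toy matrix (4 cases x 3 features); feature 1 is exactly twice feature 0.
features = [
    [1.0, 2.0, 5.0],
    [2.0, 4.0, 3.0],
    [3.0, 6.0, 4.0],
    [4.0, 8.0, 1.0],
]
normalized = center_to_unit(features)
kept = pcc_filter(normalized)
print("kept feature indices:", kept)
```

In this toy example the redundant second feature is dropped and the first and third are kept. The greedy order (earlier features win) is a design choice of the sketch, not something the abstract specifies.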
In clinical settings, collecting data sets usually requires a long observation period, especially for diseases that affect a relatively small proportion of the population. Learning from a small data set is challenging because of overfitting; it is an important issue in clinical applications and has been studied by other researchers for classification [1]. The purpose of this study was to evaluate an effective classification model, the AE, for small data sets. To deal with the lack of training data, data enhancement schemes were analyzed and compared, and clinical features were also adopted to improve the generalization ability and stability of the model. Through optimization, the AE model proposed in this study achieved a high diagnostic value. Considering the importance of predicting the prognosis of concurrent chemo-radiotherapy for LACC in order to maximize the benefit of treatment and prolong survival [2,3], the AE model established in this study may play an important role in concurrent chemo-radiotherapy of LACC in the future.
Conclusion:
Through optimization, the established AE model based on clinical features and radiomics features from ADC maps demonstrated high value in predicting the prognosis of LACC treated with concurrent chemo-radiotherapy.
Acknowledgements
No acknowledgement found.
References
[1] Dougherty, Edward R., Lori A. Dalton, and Francis J. Alexander. "Small data is the problem." 2015 49th Asilomar Conference on Signals, Systems and Computers. IEEE, 2015.
[2] Cohen PA, Jhingran A, Oaknin A, Denny L. Cervical cancer. Lancet. 2019 Jan 12;393(10167):169-182.
[3] Katsumata N, Yoshikawa H, Kobayashi H, et al. Phase III randomised controlled trial of neoadjuvant chemotherapy plus radical surgery vs radical surgery alone for stages IB2, IIA2, and IIB cervical cancer: a Japan Clinical Oncology Group trial (JCOG 0102). Br J Cancer. 2013 May 28;108(10):1957-63.