2381

Detailed MRI Report Findings Play Important Role in Establishing Predictive Machine Learning Models For Recurrence in Nasopharyngeal Carcinoma
Weijing Zhang1, Chunyan Cui1, Huali Ma1, Li Tian1, Annan Dong1, Zhiqiang Tian2, Xinlei Deng3, Xucheng Zhang3, Nian Lu1, Haojiang Li1, and Lizhi Liu1

1Sun Yat-sen University Cancer Center, Guangzhou, China, 2Xi’an Jiaotong University, Xi’an, China, 3Sun Yat-sen University, Guangzhou, China

Synopsis

This study compared different machine-learning approaches, developed the best predictive model for recurrence, and explored interactions between different types of data in non-metastatic nasopharyngeal carcinoma (NPC). The Auto Machine Learning (AutoML) classifier combined with the minimum redundancy maximum relevance (mRMR) feature selection method achieved the best predictive accuracy for recurrence in NPC. The model incorporating all data types (T/N stage data, clinical data, and detailed MRI report findings) showed the best performance. Detailed MRI report findings have potential as useful biomarkers for predicting NPC recurrence, which may help develop more individualized multidisciplinary treatment and follow-up strategies.

INTRODUCTION


Tumor recurrence remains a major cause of treatment failure and contributes to the high mortality rate in nasopharyngeal carcinoma (NPC)1,2. However, there is still no accurate biomarker or model to predict NPC recurrence. Machine learning models are increasingly used in oncology and contribute to tumor prediction3. Our study therefore aimed to compare different machine-learning approaches, develop the best predictive model for recurrence, and explore interactions between different types of data in non-metastatic NPC.

METHODS

We retrospectively enrolled 792 consecutive non-metastatic NPC patients treated with intensity-modulated radiotherapy between 2010 and 2013. Seven machine-learning classifiers and four feature selection methods were applied and compared to establish recurrence models based on T/N stage data, clinical data, detailed MRI report findings, or all of them combined. The models were evaluated and compared by mean area under the curve (AUC), test error, sensitivity, and specificity, while the selected features were evaluated by Cox regression and Kaplan-Meier analysis.
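The evaluation metrics named above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration. AUC is computed here as the fraction of (recurrence, no-recurrence) patient pairs in which the recurrent patient received the higher predicted score.

```python
def auc_score(labels, scores):
    """AUC as the probability that a random positive case outranks a
    random negative case; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold=0.5):
    """Sensitivity and specificity when scores >= threshold are called positive.
    The threshold value is an illustrative assumption."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 1 = recurrence, 0 = no recurrence.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.4, 0.35, 0.8, 0.7, 0.1]
toy_auc = auc_score(toy_labels, toy_scores)
toy_sens, toy_spec = sensitivity_specificity(toy_labels, toy_scores)
```

A "mean AUC" would then be the average of `auc_score` over repeated train/test splits; the splitting scheme is not stated in the abstract, so it is left out of the sketch.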

RESULTS

The Auto Machine Learning (AutoML) classifier combined with the minimum redundancy maximum relevance (mRMR) method achieved the best predictive accuracy for recurrence (Fig.2). The model incorporating all databases performed better than the models based only on T/N stage data, clinical data, or detailed MRI report findings (Fig.2). The model based only on detailed MRI report findings predicted recurrence well, with an AUC (0.729) similar to that of the all-databases model (AUC, 0.730) (Fig.2). Nine independent predictors (invasion of sphenoid sinus, cervical lymph node metastasis, invasion of fossae lateral pharyngeal, invasion of jugular foramen area, bilateral-retropharyngeal lymph node metastasis, bilateral-cervical III/IV/Va region lymph node metastasis in the cluster, invasion of tensor veli palatini muscle, cervical nodal necrosis in ipsilateral III region, invasion of partes ossea tubae pharyngotympaniae) were selected from the best model (Fig.4,5).
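For readers unfamiliar with mRMR, the greedy idea behind it can be sketched as follows. This is an illustrative simplification, not the implementation used in the study: it assumes absolute Pearson correlation as the measure of both relevance (feature versus outcome) and redundancy (feature versus already selected features), and the function names and toy data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def mrmr_select(features, target, k):
    """Greedily pick k feature names, maximizing relevance minus mean redundancy.

    features: dict mapping feature name -> list of values (one per patient).
    target:   list of outcome values (one per patient).
    """
    relevance = {name: abs(pearson(col, target)) for name, col in features.items()}
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:
        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(
                abs(pearson(features[name], features[s])) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: "b" is a noisy copy of "a"; "c" is uncorrelated with "a".
toy_features = {
    "a": [1, 1, 0, 0, 1, 1, 0, 0],
    "b": [1, 1, 0, 0, 1, 1, 0, 1],
    "c": [1, 0, 1, 0, 1, 0, 1, 0],
}
toy_target = [3, 2, 1, 0, 3, 2, 1, 0]  # equals 2*a + c
toy_selection = mrmr_select(toy_features, toy_target, 2)
```

In this toy case "b" is individually more relevant than "c", but it is passed over in the second step because it largely duplicates "a". That trade-off is the point of the method.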

DISCUSSION

The best predictive model can effectively identify NPC patients at high risk of recurrence, so that close follow-up can be arranged for them and recurrence can be detected in time for surgery or other appropriate individualized strategies, which may improve clinical outcomes4. Our study included detailed MRI report findings, which previous studies did not take into account. Surprisingly, the model based only on detailed MRI report findings predicted recurrence with an AUC similar to that of the model based on the total data. This indicates that detailed MRI report findings contribute substantially to the prediction of NPC recurrence, even before clinicopathological factors or TN stage data are included.

CONCLUSION

Among the ML methods compared, the AutoML classifier plus the mRMR method offered the best predictive value for recurrence in NPC. The model incorporating all data (T/N stage data, clinical data, and detailed MRI report findings) showed the best performance. Detailed MRI report findings have potential as useful biomarkers for predicting recurrence, which may aid the selection of individualized treatment strategies and thereby improve survival.

Acknowledgements

This work was supported by grants from the National Natural Science Foundation of China (No. 81572652) and the Health & Medical Collaborative Innovation Project of Guangzhou City, China (grant 201604020003).

References

1. Feng X, Lin J, Xing S, et al. Higher IGFBP-1 to IGF-1 serum ratio predicts unfavourable survival in patients with nasopharyngeal carcinoma. BMC Cancer. 2017;17:90.

2. Comoretto M, Balestreri L, Borsatti E, et al. Detection and restaging of residual and/or recurrent nasopharyngeal carcinoma after chemotherapy and radiation therapy: comparison of MR imaging and FDG PET/CT. Radiology. 2008;249:203-211.

3. Cruz JA, Wishart DS. Applications of Machine Learning in Cancer Prediction and Prognosis. Cancer Inform. 2007;2:59-77.

4. Tang LQ, Chen DP, Guo L, et al. Concurrent chemoradiotherapy with nedaplatin versus cisplatin in stage II-IVB nasopharyngeal carcinoma: an open-label, non-inferiority, randomised phase 3 trial. Lancet Oncol. 2018;19:461-4

Figures

Figure 1. Workflow for model building and feature selection preprocessing. Patient data were collected for analysis. Univariate Cox regression analysis was applied to screen the preliminary data (P<0.1). Four feature selection methods were used: minimum redundancy maximum relevance (mRMR), Boruta, Lasso, and Stepwise. Seven machine-learning classifiers were investigated to build the models: Glm Ridge, Glm Lasso, Glm Grid, Random Forest (RF), GBM Grid, Deep Learning (DL), and AutoML. Features selected from a subset of 547 variables were validated by stepwise multivariate analysis. Finally, conclusions were drawn from the calculated results.

Figure 2. Heatmap depicting the mean AUC of models constructed by feature selection and classification methods, based on the total data, detailed MRI report findings, and clinical data in NPC. (a)-(c) The models built by the mRMR method plus the AutoML classifier, based on the total data, performed best. (d) Using the mRMR+AutoML algorithms, the mean AUC of models based on the total data was better than that of models based on the detailed MRI report findings, clinical data, or TN data. Significant differences were found between the total and clinical data, between the total and TN data, and between the detailed MRI report findings and clinical data (P<0.05).

Figure 3. Validation of the total data model by C-index analysis. The total data model showed a better C-index than the model based only on the TN stage data.

Figure 4. Nomogram constructed by Cox regression analysis with seven features selected from it. (a) Draw a line straight upward to the points axis to determine how many points toward the probability of RFS the patient receives for his or her Rad-score. Draw a line straight down to find the patient’s probability of RFS. (b) Calibration curves for the nomogram show the calibration of each model in terms of the agreement between the estimated and the observed 1-, 3-, and 5-year RFS.

Figure 5. Univariate and multivariate analysis of features selected by the model of AutoML classifier plus the mRMR method based on the total data.
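The C-index used for validation in Figure 3 can be illustrated with a short sketch of Harrell's concordance index. This is a generic textbook formulation under stated assumptions, not the authors' code: the function name, the toy follow-up data, and the convention that a higher risk score means earlier expected recurrence are all hypothetical. Pairs with tied follow-up times are ignored in this simplified version.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored data.

    times:  follow-up time per patient.
    events: 1 if recurrence was observed, 0 if censored.
    risks:  predicted risk score (higher = earlier expected recurrence).

    A pair (i, j) is comparable when patient i had an observed event
    strictly before patient j's follow-up time.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: months of follow-up, event flags, model risk scores.
toy_times = [5, 10, 15, 20]
toy_events = [1, 0, 1, 0]
good_risks = [0.9, 0.3, 0.6, 0.2]   # ranks patients in order of recurrence
bad_risks = [0.2, 0.6, 0.3, 0.9]    # ranks them in the opposite order
c_good = concordance_index(toy_times, toy_events, good_risks)
c_bad = concordance_index(toy_times, toy_events, bad_risks)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a model with a higher C-index orders patients by recurrence risk more consistently.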

Proc. Intl. Soc. Mag. Reson. Med. 27 (2019)