Keywords: Machine Learning/Artificial Intelligence, Modelling
Prior deep learning approaches to characterizing treatment effect and tumor recurrence have not been optimized for spatial classification within a single lesion, which could improve surgical planning and treatment. We trained our models on 10 mm patches of pre-surgical anatomical and physiological images surrounding the locations of histopathologically confirmed tissue samples. By including physiological images, pretraining on unlabeled data in an autoencoding task, and training with an alternative cross-validation approach that enabled many networks to be ensembled, we achieved an ensembled test AUROC of 0.814 and generated spatial maps of tumor probability and model uncertainty. Performance decreased when any of these components was removed.
[1] Abbasi AW et al. “Incidence of Tumour Progression and Pseudoprogression in High-Grade Gliomas: A Systematic Review and Meta-Analysis.” Clinical Neuroradiology, U.S. National Library of Medicine, Sept. 2018.
[2] Delgado-López PD, Riñones-Mena E, Corrales-García EM. “Treatment-Related Changes in Glioblastoma: A Review on the Controversies in Response Assessment Criteria and the Concepts of True Progression, Pseudoprogression, Pseudoresponse and Radionecrosis.” Clinical & Translational Oncology: Official Publication of the Federation of Spanish Oncology Societies and of the National Cancer Institute of Mexico, U.S. National Library of Medicine, Aug. 2018.
[3] Lee, Joonsang, et al. “Discriminating Pseudoprogression and True Progression in Diffuse Infiltrating Glioma Using Multi-Parametric MRI Data Through Deep Learning.” Nature News, Nature Publishing Group, 23 Nov. 2020.
[4] Akbari H, Rathore S, Bakas S, Nasrallah MP, Shukla G, Mamourian E, Rozycki M, Bagley SJ, Rudie JD, Flanders AE, Dicker AP, Desai AS, O'Rourke DM, Brem S, Lustig R, Mohan S, Wolf RL, Bilello M, Martinez-Lage M, Davatzikos C. “Histopathology-Validated Machine Learning Radiographic Biomarker for Noninvasive Discrimination between True Progression and Pseudo-Progression in Glioblastoma.” Cancer, U.S. National Library of Medicine, June 2020.
[5] Gao, Yang, et al. “Deep Learning Methodology for Differentiating Glioma Recurrence from Radiation Necrosis Using Multimodal Magnetic Resonance Imaging: Algorithm Development and Validation.” JMIR Medical Informatics, JMIR Publications, 17 Nov. 2020.
Figure 1: DL model architecture and sample reconstructions obtained during validation of the autoencoder pretraining. The qualitative appearance of the reconstructions suggests a reasonable initialization for the encoder of the classifier network.
Figure 2: Experimental design. The initial autoencoder and classifier architecture was tuned before further tuning and training with Repeated Constrained k-fold Cross Validation (CR-CV). DL models without reconstruction pretraining or without physiological imaging contrasts were then compared in CR-CV, 4-fold, and 5-fold cross-validation. Models were tested both independently and with ensemble averaging on a holdout test set comprising 1/4 of patients. ADASYN (for ML models) and augmentation oversampling (for DL models) were used to oversample TxE to the same proportion as rGBM in each fold.
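The ensemble-averaging step described in the caption above can be illustrated with a minimal sketch. It assumes each trained network outputs one probability per test sample and that the ensemble prediction is the plain mean across networks; the function names `ensemble_average` and `auroc` are hypothetical and are not taken from the authors' code. AUROC is computed here with the rank-based Mann-Whitney formulation so that the example depends only on NumPy.

```python
import numpy as np


def ensemble_average(member_probs):
    """Average per-sample probabilities across ensemble members.

    member_probs: array-like of shape (n_members, n_samples).
    Assumption: a plain unweighted mean; the source does not specify weighting.
    """
    return np.mean(np.asarray(member_probs, dtype=float), axis=0)


def auroc(labels, scores):
    """AUROC via the Mann-Whitney U statistic (tied scores get average ranks)."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)

    # Rank all scores from 1..n in ascending order.
    order = np.argsort(scores, kind="mergesort")
    ranks = np.empty(len(scores), dtype=float)
    ranks[order] = np.arange(1, len(scores) + 1)

    # Replace the ranks of tied scores with their average rank.
    for value in np.unique(scores):
        tie = scores == value
        ranks[tie] = ranks[tie].mean()

    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    u_stat = ranks[labels].sum() - n_pos * (n_pos + 1) / 2
    return u_stat / (n_pos * n_neg)


if __name__ == "__main__":
    # Toy data only: three hypothetical networks scoring four test samples.
    members = [
        [0.2, 0.6, 0.3, 0.9],
        [0.1, 0.4, 0.5, 0.8],
        [0.3, 0.5, 0.4, 0.7],
    ]
    labels = [0, 0, 1, 1]
    print("ensembled AUROC:", auroc(labels, ensemble_average(members)))
```

Scoring each member separately with `auroc` and then averaging the scores would correspond to the "average testing" condition, whereas scoring the averaged predictions corresponds to "ensembled testing".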
Figure 3: Receiver Operating Characteristic (ROC) curves comparing test performance with and without physiological imaging contrasts (A), with and without reconstruction pretraining (B), and CR-CV performance in validation (C) and testing (D). Validation ROCs are generated by averaging the predictions made with CR-CV and by combining the predictions made with k-fold CV across folds.
Figure 4: Area Under the Receiver Operating Characteristic curve (AUROC) for all models across CR-CV, 4-fold CV, 5-fold CV, average testing, and ensembled testing. Performance is shown for the pretrained CNN (+TL), the non-pretrained CNN (-TL), the CNN using anatomical images only, and the baseline ML model (soft voting ensemble of Random Forest, Gaussian Process, and Support Vector Classifiers).
Figure 5: Sample test set spatial maps. P(TxE) is the model output, where blue represents areas predicted to be TxE and red represents areas predicted to be rGBM. Uncertainty maps are shown, with highly certain areas in purple and highlighted with arrows. Images are cropped to surround the T2 lesion. Yellow boxes surrounding the pathology-confirmed tissue samples are 10 mm, scaled to the cropped image. Tumor maps are thresholded at 0.448, which yielded a sensitivity of 0.85, a specificity of 0.83, and an accuracy of 0.83 in CR-CV. (A) Test case with the sample confirmed as TxE, and (B) test case with the sample confirmed as rGBM.
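As a rough illustration of how the probability and uncertainty maps and the thresholded metrics in the caption above could be derived from an ensemble, the sketch below makes several assumptions that the source does not confirm: the probability map is taken as the per-pixel mean across ensemble members, uncertainty as the per-pixel standard deviation, and a sample is called positive when its probability is at or above the threshold. The helper names `ensemble_maps` and `threshold_metrics` are hypothetical, and which class counts as "positive" is left to the caller.

```python
import numpy as np


def ensemble_maps(member_maps):
    """Return (mean map, std map) over ensemble members.

    member_maps: array-like of shape (n_members, height, width).
    Assumption: uncertainty is the population standard deviation across
    members; the source does not state how its uncertainty maps are computed.
    """
    maps = np.asarray(member_maps, dtype=float)
    return maps.mean(axis=0), maps.std(axis=0)


def threshold_metrics(labels, probs, threshold=0.448):
    """Sensitivity, specificity, and accuracy at a fixed probability threshold.

    labels: 1 for the positive class, 0 otherwise (class choice is the caller's).
    Assumption: predictions with probability >= threshold are called positive.
    """
    labels = np.asarray(labels, dtype=bool)
    predicted = np.asarray(probs, dtype=float) >= threshold

    true_pos = np.sum(predicted & labels)
    true_neg = np.sum(~predicted & ~labels)

    sensitivity = true_pos / labels.sum()
    specificity = true_neg / (~labels).sum()
    accuracy = (true_pos + true_neg) / labels.size
    return float(sensitivity), float(specificity), float(accuracy)


if __name__ == "__main__":
    # Toy data only: two hypothetical members predicting a 1x2 pixel map.
    mean_map, std_map = ensemble_maps([[[0.2, 0.8]], [[0.4, 0.8]]])
    print("mean:", mean_map, "std:", std_map)
    print(threshold_metrics([1, 1, 0, 0, 0], [0.9, 0.3, 0.5, 0.2, 0.1]))
```

Under these assumptions, pixels where the members agree have a standard deviation near zero and would appear as the highly certain regions of the map.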