Longitudinal assessment of glioma burden is important for evaluating treatment response and tumor progression. Delineation of tumor regions is typically performed manually but is time-consuming and subject to inter-rater and intra-rater variability. There has therefore been interest in developing automated approaches to calculate 1) glioma volume and 2) the product of maximum diameters of contrast-enhancing tumor, the key measure used in the Response Assessment in Neuro-Oncology (RANO) criteria. We present a fully automated pipeline for brain extraction, tumor segmentation, and RANO measurement (AutoRANO). We demonstrate the utility of this pipeline on 713 MRI scans from 54 post-operative glioblastoma patients, showing its capacity for automated tumor burden measurement.
Introduction
Gliomas are infiltrative neoplasms of the central nervous system that affect patients of all ages, with variable growth rates and prognoses.1,2 Serial assessment of tumor burden has been shown to be important for the prediction of survival outcomes and evaluation of treatment effectiveness in gliomas.3,4 Current clinical guidelines (Response Assessment in Neuro-Oncology, RANO) are based on calculating the product of maximum diameters of contrast-enhancing tumor as the primary measure of treatment response.5 Manual delineation of tumor boundaries can be difficult if the tumor is diffuse or demonstrates poor or heterogeneous contrast enhancement. Furthermore, manual segmentation is labor-intensive and subject to inter-rater variability, resulting in low reproducibility.6,7 As such, there has been great interest in developing automated approaches for 1) calculation of tumor volume and 2) calculation of the product of maximum diameters (the primary metric used in the RANO criteria). With the advent of more powerful graphics processing units, deep learning has become the method of choice for automatic segmentation in medical images.8,9 In this study, we present a fully automated pipeline for brain extraction and tumor segmentation that can be used to reliably extract FLAIR tumor volumes, contrast-enhancing tumor volumes, and RANO measurements from post-operative glioblastoma patient data from two clinical trials.

Methods
Following IRB approval, imaging data were acquired from two clinical trials that enrolled patients with newly diagnosed GBM. Our final post-operative patient cohort consisted of 713 MRI scans from 54 patients. Ground-truth segmentations for whole brain regions, FLAIR hyperintensities, and T1 contrast enhancement were obtained from an expert neuro-radiologist or neuro-oncologist. Additionally, RANO measurements were acquired from two expert neuro-oncologists. We utilized the 3D U-Net architecture, a neural network designed for fast and precise volumetric segmentation, for both brain extraction and tumor segmentation (Fig. 1).10,11 The code for pre-processing and the U-Net architecture is publicly available: https://github.com/QTIM-Lab/DeepNeuro.12 We further developed an automated RANO (AutoRANO) algorithm to derive RANO measurements from the automatic contrast-enhancing tumor segmentations.13 Patients were randomly divided into training and testing sets in a 4:1 ratio. We compared the baseline visits and the last patient visits by subtracting the RANO measures (delta RANO).

Results
For the testing set, the mean Dice coefficient between our algorithm and expert manual FLAIR tumor segmentation was 0.701 (95% CI 0.67-0.731) (Fig. 2). The mean Dice coefficient between our algorithm and expert manual contrast-enhancing tumor segmentation was 0.696 (95% CI 0.66-0.728). When comparing the agreement of calculated FLAIR volumes between automatic and manual segmentation, the Spearman rank correlation coefficient was 0.948 for the testing set. For contrast-enhancing tumor volumes, the Spearman rank correlation coefficient was 0.933 for the testing set (Fig. 3).
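For reference, the Dice coefficient reported above quantifies voxel-wise overlap between two binary segmentations. A minimal NumPy sketch (illustrative only; not the evaluation code used in this study):

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice overlap between two binary masks: 2|A intersect B| / (|A| + |B|)."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(pred, truth).sum() / denom

# Toy example: two overlapping 1D masks share 2 of 3 foreground voxels each
a = np.array([1, 1, 1, 0, 0])
b = np.array([0, 1, 1, 1, 0])
print(dice_coefficient(a, b))  # 2*2/(3+3) ~ 0.667
```

The same formula applies unchanged to 3D segmentation volumes, since the computation is purely element-wise.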
We assessed the reproducibility of manual and automatic measurements by comparing measurements from the two baseline visits (acquired prior to treatment initiation) for each patient. Comparing baseline visits 1 and 2 for RANO measurements, the intraclass correlation coefficient (ICC) was 0.962 for Rater 1, 0.992 for Rater 2, and 0.977 for AutoRANO (Figs. 4-5).
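The ICC used here measures absolute agreement between repeated measurements on the same subjects. As a hedged sketch, a two-way random-effects single-measurement ICC(2,1) can be computed from an ANOVA decomposition as below (the study does not specify which ICC variant was used, so this is one common choice, not necessarily the study's):

```python
import numpy as np

def icc2_1(ratings: np.ndarray) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.
    `ratings` is an (n subjects x k raters/visits) matrix."""
    n, k = ratings.shape
    grand = ratings.mean()
    row_means = ratings.mean(axis=1)  # per-subject means
    col_means = ratings.mean(axis=0)  # per-rater/visit means
    # Mean squares from the two-way ANOVA decomposition
    msr = k * np.sum((row_means - grand) ** 2) / (n - 1)  # between subjects
    msc = n * np.sum((col_means - grand) ** 2) / (k - 1)  # between raters
    sse = np.sum((ratings - row_means[:, None] - col_means[None, :] + grand) ** 2)
    mse = sse / ((n - 1) * (k - 1))                        # residual
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Hypothetical example: two baseline visits with nearly identical measures
# per patient should yield an ICC close to 1.
visits = np.array([[400.0, 402.0], [910.0, 905.0], [150.0, 149.0], [620.0, 630.0]])
print(round(icc2_1(visits), 3))
```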
To evaluate the ability of RANO measures to capture changes in tumor burden during treatment, we compared delta RANO across raters. The ICC for delta RANO was 0.877 between Rater 1 and Rater 2, 0.850 between AutoRANO and Rater 1, and 0.878 between AutoRANO and Rater 2.
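The RANO measure compared above is the product of the longest in-plane diameter of the contrast-enhancing tumor and the longest roughly perpendicular diameter, and delta RANO is the final-visit measure minus the baseline measure. A brute-force geometric sketch on a 2D binary mask (illustrative only; the function names are hypothetical and this is not the AutoRANO implementation, which follows the modified criteria of reference 13):

```python
import numpy as np

def rano_product(mask: np.ndarray, spacing=(1.0, 1.0), tol_deg=5.0) -> float:
    """Bidirectional product: the longest diameter between foreground pixels,
    times the longest diameter within tol_deg of perpendicular to it.
    Brute force over all pixel pairs; fine for small masks, slow for large ones."""
    ys, xs = np.nonzero(mask)
    pts = np.column_stack([ys * spacing[0], xs * spacing[1]])
    diffs = pts[:, None, :] - pts[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    i, j = np.unravel_index(np.argmax(dists), dists.shape)
    d1 = dists[i, j]
    axis = diffs[i, j] / d1  # unit vector along the longest diameter
    # |cos(angle to axis)| <= sin(tol) keeps chords within tol_deg of perpendicular
    cosang = np.abs(diffs @ axis) / np.where(dists > 0, dists, np.inf)
    perp = dists[cosang <= np.sin(np.deg2rad(tol_deg))]
    d2 = perp.max() if perp.size else 0.0
    return d1 * d2

def delta_rano(baseline: float, final: float) -> float:
    """Change in the RANO product from baseline to the final visit."""
    return final - baseline

# Example: a disk of radius 10 pixels has both diameters ~ 20, product ~ 400
yy, xx = np.mgrid[0:21, 0:21]
disk = (yy - 10) ** 2 + (xx - 10) ** 2 <= 100
print(rano_product(disk))
```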
Discussion
In this study, we demonstrate the utility of a fully automated, deep learning-based pipeline for calculation of tumor volumes as part of a larger effort to apply deep learning techniques to the field of neuro-oncology. To our knowledge, this is the first application of deep learning to post-operative glioblastoma segmentation; previous studies have focused on pre-operative glioblastoma segmentation.14 In comparing manual and automatic segmentation methods, we observed high agreement between manual and automatic volumes. The AutoRANO algorithm also demonstrated high reproducibility. We further observed high agreement between AutoRANO and expert raters, as reflected by the high ICC for delta RANO. This demonstrates the utility of automated methods for assessing changes in tumor burden.

Conclusion
We present an open-source, fully automatic pipeline for brain extraction, tumor segmentation, and RANO measurement applied to a large, multi-institutional pre-operative glioma patient cohort and a post-operative glioblastoma patient cohort. This tool may be helpful in clinical trials as well as in clinical practice by expediting measurement of tumor burden for the evaluation of treatment response. Furthermore, it serves as an important proof of concept for automated tools in the clinic and may be applicable to other tumor pathologies.

References

1. De Robles, P. et al. The worldwide incidence and prevalence of primary brain tumors: A systematic review and meta-analysis. Neuro-Oncology 17, 776–783 (2015).
2. Thakkar, J. P. et al. Epidemiologic and molecular prognostic review of glioblastoma. Cancer Epidemiology Biomarkers and Prevention 23, 1985–1996 (2014).
3. Brasil Caseiras, G. et al. Low-grade gliomas: six-month tumor growth predicts patient outcome better than admission tumor volume, relative cerebral blood volume, and apparent diffusion coefficient. Radiology 253, 505–512 (2009).
4. Iliadis, G. et al. Volumetric and MGMT parameters in glioblastoma patients: Survival analysis. BMC Cancer 12, 3 (2012).
5. Wen, P. Y. et al. Updated response assessment criteria for high-grade gliomas: response assessment in neuro-oncology working group. J. Clin. Oncol. 28, 1963–72 (2010).
6. Deeley, M. A. et al. Comparison of manual and automatic segmentation methods for brain structures in the presence of space-occupying lesions: a multi-expert study. Phys. Med. Biol. 56, 4557–4577 (2011).
7. Huang, R. Y. et al. The Impact of T2/FLAIR Evaluation per RANO Criteria on Response Assessment of Recurrent Glioblastoma Patients Treated with Bevacizumab. Clin. Cancer Res. 22, 575–581 (2016).
8. Kamnitsas, K. et al. Efficient multi-scale 3D CNN with fully connected CRF for accurate brain lesion segmentation. Med. Image Anal. 36, 61–78 (2017).
9. Havaei, M. et al. Brain tumor segmentation with Deep Neural Networks. Med. Image Anal. 35, 18–31 (2017).
10. Beers, A. et al. Sequential neural networks for biologically-informed glioma segmentation. in Medical Imaging 2018: Image Processing (eds. Angelini, E. D. & Landman, B. A.) 10574, 108 (SPIE, 2018).
11. Çiçek, Ö., Abdulkadir, A., Lienkamp, S. S., Brox, T. & Ronneberger, O. 3D U-net: Learning dense volumetric segmentation from sparse annotation. in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 9901 LNCS, 424–432 (2016).
12. Beers, A. et al. DeepNeuro: an open-source deep learning toolbox for neuroimaging. (2018).
13. Ellingson, B. M., Wen, P. Y. & Cloughesy, T. F. Modified Criteria for Radiographic Response Assessment in Glioblastoma Clinical Trials. Neurotherapeutics 14, 307–320 (2017).
14. Menze, B. H. et al. The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS). IEEE Trans. Med. Imaging 34, 1993–2024 (2015).