The aim of this study was to implement a deep-learning approach for automatic therapy response assessment in patients with high-grade glioma (HGG), based on the Response Assessment in Neuro-Oncology (RANO) criteria. A total of 135 conventional MRI scans from 67 patients were included. A neural network with a U-Net architecture was trained to identify and subsegment lesion components. The Dice similarity coefficient between the segmentation results and the ground truth was 0.88±0.06. Consistent therapy response assessment was obtained in the majority of cases (44/51). These results demonstrate the potential applicability of the proposed method for automatic therapy response assessment in patients with HGG.
Introduction
Volumetric measurements of brain tumors are important for accurate diagnosis and follow-up. However, in the clinical setting, manual segmentation of high-grade gliomas (HGG) is rarely practical, as it is challenging, time consuming and user dependent. Currently, the most widely used criteria for therapy response assessment in patients with HGG are the Response Assessment in Neuro-Oncology (RANO) criteria(1). These criteria define complete response, partial response, stable disease and progressive disease based on changes in the area of enhancing and non-enhancing tumor components(2), and rely mainly on visual assessment by a radiologist. In recent years, deep learning (DL) methods for brain tumor segmentation have gained popularity(3,4); however, the applicability and sensitivity of this approach in comparison with the RANO criteria have not yet been evaluated. The aim of this study was to implement a DL approach for automatic longitudinal assessment of brain tumor response based on the RANO criteria.
Methods
Patients and MRI Protocol: A total of 135 MRI scans obtained from 67 patients with HGG (45 males, age 52±15 years) were included. Patients were scanned longitudinally, approximately every 2 months, as part of their routine clinical assessment. Scans were performed on 3.0 T GE and Siemens scanners and included FLAIR and T1-weighted images acquired before and after contrast agent injection (T1W, T1W+C).
Image Analysis: Manual segmentation of the lesion area was performed using AnalyzeDirect software, dividing the lesion into three components (labels): enhancing, non-enhancing and necrosis. The manual segmentations were used as ground truth for model training and final evaluation. In addition, longitudinal radiological assessment based on the RANO criteria(1) was performed at each time point. Further analysis was performed using MATLAB 2018a and the fast.ai platform (PyTorch environment) and included:
Dataset Preparation: The input data for segmentation included three channels: FLAIR, T1W and T1W+C images. Preprocessing included brain extraction, realignment of all MRI contrasts to the same space, and resizing of all images to 256×256 pixels, resulting in a total of 652 slices. To increase the size of the dataset, data augmentation was performed, including rotation, flipping and random lighting.
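The channel stacking and augmentation described above can be sketched roughly as follows. This is a minimal NumPy illustration and not the pipeline used in the study, which relied on the fast.ai platform. The function names, the restriction of rotations to multiples of 90 degrees, and the brightness range of 0.9 to 1.1 are all assumptions made for the example.

```python
import numpy as np

def stack_channels(flair, t1w, t1w_c):
    """Stack three co-registered 2D slices into one (3, H, W) input array."""
    if not (flair.shape == t1w.shape == t1w_c.shape):
        raise ValueError("all contrasts must share the same shape")
    return np.stack([flair, t1w, t1w_c], axis=0).astype(np.float32)

def augment(image, mask, rng):
    """Apply one random augmentation to an image and its label mask.

    Geometric changes are applied identically to image and mask so the
    labels stay aligned; the lighting change touches the image only.
    """
    if rng.random() < 0.5:                       # random left-right flip
        image, mask = image[..., ::-1], mask[..., ::-1]
    k = int(rng.integers(0, 4))                  # rotate by k * 90 degrees (assumed)
    image = np.rot90(image, k, axes=(-2, -1))
    mask = np.rot90(mask, k, axes=(-2, -1))
    gain = rng.uniform(0.9, 1.1)                 # mild brightness change (assumed range)
    return image * gain, mask

# Usage with synthetic data standing in for one 256x256 slice.
rng = np.random.default_rng(0)
flair, t1w, t1w_c = (rng.random((256, 256)) for _ in range(3))
labels = np.zeros((256, 256), dtype=np.uint8)
labels[100:140, 90:150] = 1                      # hypothetical lesion label
x = stack_channels(flair, t1w, t1w_c)
x_aug, y_aug = augment(x, labels, rng)
```

Keeping the geometric transform shared between image and mask is the essential point; the specific transforms and their ranges would need to match whatever the training framework actually applies.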
Network architecture: A U-Net based deep convolutional network was used, with training and validation sets of 588 and 64 slices, respectively, and a batch size of 5. The network was trained using the adaptive moment estimation (Adam)(5) stochastic gradient-based optimizer, with a learning rate of 0.03 and a maximum of 8 epochs.
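To make the training budget implied by these numbers concrete, the following sketch splits the 652 slices into the reported 588/64 sets and counts the resulting optimizer updates. The text does not state how the split was drawn, so the random per-slice split, the seed, and the choice to keep the final incomplete batch are assumptions; `split_indices` is a hypothetical helper.

```python
import math
import random

N_SLICES, N_VAL = 652, 64        # slice counts reported in the text
BATCH_SIZE, MAX_EPOCHS = 5, 8    # training settings reported in the text

def split_indices(n_total, n_val, seed=0):
    """Shuffle slice indices and hold out n_val of them for validation.

    A random per-slice split is an assumption, not the stated method.
    """
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)
    return indices[n_val:], indices[:n_val]

train_idx, val_idx = split_indices(N_SLICES, N_VAL)

# Assumes the last, smaller batch of each epoch is kept rather than dropped.
batches_per_epoch = math.ceil(len(train_idx) / BATCH_SIZE)
max_updates = batches_per_epoch * MAX_EPOCHS
```

Under these assumptions each epoch has 118 batches, for at most 944 updates in total. With longitudinal data, splitting by patient rather than by slice would keep slices from the same patient out of both sets at once.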
Evaluation of the segmentation results: The automatically generated lesion components were compared with the manually segmented components in four ways: (1) visual inspection; (2) similarity between the components based on the Dice similarity coefficient (DSC); (3) mean accuracy, sensitivity and specificity; and (4) consistency with the RANO criteria(1), defining complete or partial response, stable or progressive disease based on the change in lesion volume between two sequential scans.
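For reference, the overlap measures named above can be computed from two binary masks as in the sketch below. It uses the standard definitions, DSC = 2TP / (2TP + FP + FN), sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). The function name and the handling of empty masks are assumptions for this example, not details taken from the study.

```python
import numpy as np

def overlap_metrics(pred, truth):
    """Return (DSC, sensitivity, specificity) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.count_nonzero(pred & truth)      # lesion in both masks
    fp = np.count_nonzero(pred & ~truth)     # predicted lesion, truly background
    fn = np.count_nonzero(~pred & truth)     # missed lesion
    tn = np.count_nonzero(~pred & ~truth)    # background in both masks
    # Two empty masks are treated as perfect agreement (assumed convention).
    dsc = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return dsc, sensitivity, specificity

# Toy 4x4 example: the prediction covers the true lesion plus two extra pixels.
truth = np.zeros((4, 4), dtype=bool)
truth[1:3, 1:3] = True
pred = np.zeros((4, 4), dtype=bool)
pred[1:3, 1:4] = True
dsc, sens, spec = overlap_metrics(pred, truth)
```

For a multi-label segmentation, the same function can be applied per component by passing `pred == label` and `truth == label` for each label in turn.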
Results
Visual inspection: In the majority of cases, the proposed method accurately identified and classified the enhancing, non-enhancing and necrotic lesion areas based on conventional MRI. Representative segmentation results obtained in four patients are presented in Figure 1.
Segmentation results: Representative results are given in Figure 2. The mean results for the entire lesion area were: DSC = 0.88±0.06, sensitivity = 0.87±0.007 and specificity = 0.997±0.001. Detailed results for each component are given in Table 1.
Longitudinal assessment: Figure 3 shows longitudinal segmentation results (dashed line) compared with ground truth (solid line) obtained in a 62-year-old patient with astrocytoma. Based on the RANO criteria, consistent patient assessment in terms of complete or partial response, stable or progressive disease was obtained in 44/51 cases. Inconsistency was observed mainly in cases of small lesions.
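A simplified version of this consistency check is sketched below. The 50% decrease and 25% increase thresholds are those that RANO defines for the product of perpendicular diameters of enhancing lesions; applying them directly to the change in lesion volume is an assumption here, since the thresholds used in the study are not stated. The sketch also ignores the clinical, steroid and non-enhancing (FLAIR) conditions that are part of the full criteria. Function names are hypothetical.

```python
def classify_response(prev_volume, curr_volume, pr_drop=0.50, pd_rise=0.25):
    """Assign a response category from two sequential lesion volumes.

    CR = complete response, PR = partial response,
    SD = stable disease, PD = progressive disease.
    Thresholds follow the RANO 2D cut-offs, applied to volume (assumed).
    """
    if prev_volume <= 0:
        # No lesion at the earlier scan: any lesion now counts as progression.
        return "PD" if curr_volume > 0 else "CR"
    if curr_volume <= 0:
        return "CR"
    change = (curr_volume - prev_volume) / prev_volume
    if change <= -pr_drop:
        return "PR"
    if change >= pd_rise:
        return "PD"
    return "SD"

def agreement(auto_labels, manual_labels):
    """Count scan pairs where the automatic and manual categories match."""
    matches = sum(a == m for a, m in zip(auto_labels, manual_labels))
    return matches, len(manual_labels)

# Hypothetical volumes (cm^3) at two time points, from each segmentation.
auto = classify_response(10.0, 4.0)      # 60% decrease
manual = classify_response(10.0, 6.0)    # 40% decrease
matched, total = agreement([auto], [manual])
```

With fixed percentage thresholds, a small absolute segmentation error in a small lesion produces a large relative change, which may help explain why inconsistencies concentrate in small lesions.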
Discussion and Conclusion
Here we propose a fully automated DL-based method for longitudinal assessment of patients with HGG, based on the currently used radiological criteria (RANO). The promising results obtained in this study demonstrate the potential applicability of the proposed method as a tool to assist radiologists in routine clinical practice. Our ongoing efforts to enlarge the database and to optimize the DL model training are expected to substantially improve the classification results.