0505

Utilizing deep learning for automatic longitudinal assessment of brain tumor response based on RANO criteria

Idan Bressler¹,², Dafna Ben Bashat¹,³,⁴, Orna Aizensein³,⁵, Felix Bokestein³,⁶, Deborah T Blumenthal³,⁶, and Moran Artzi¹,³,⁴

¹Sagol Brain Institute, Tel Aviv Sourasky Medical Center, Tel Aviv, Israel, ²The Iby and Aladar Fleischman Faculty of Engineering, Tel Aviv University, Tel Aviv, Israel, ³Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel, ⁴Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel, ⁵Division of Radiology, Tel Aviv Sourasky Medical Center, Tel Aviv, Israel, ⁶Neuro-Oncology Service, Tel Aviv Sourasky Medical Center, Tel Aviv, Israel

Synopsis

The aim of this study was to implement a deep-learning approach for automatic therapy response assessment in patients with high-grade glioma (HGG), based on the response assessment in neuro-oncology (RANO) criteria. A total of 135 conventional MRI scans from 67 patients were included. A neural network with a U-Net architecture was trained to identify and subsegment lesion components. The Dice similarity coefficient between the segmentation results and the ground truth was 0.88±0.06. Consistency in therapy response assessment was obtained in 44 of 51 cases. These results demonstrate the potential applicability of the proposed method for automatic therapy response assessment in patients with HGG.

Introduction

Volumetric measurements of brain tumors are important for accurate diagnosis and follow-up. However, in the clinical setting, manual segmentation of high-grade gliomas (HGG) is usually impractical, as it is highly challenging, time-consuming and user-dependent. Currently, the most widely used criteria for therapy response assessment in patients with HGG are the response assessment in neuro-oncology (RANO) criteria⁽¹⁾. These criteria define complete response, partial response, stable disease and progressive disease based on changes in the area of enhancing and non-enhancing tumor components⁽²⁾, and rely mainly on visual assessment by a radiologist. In recent years, deep learning (DL) methods for brain tumor segmentation have gained popularity⁽³,⁴⁾. However, the applicability and sensitivity of this approach in comparison with RANO criteria have not yet been evaluated. The aim of this study was to implement a DL approach for automatic longitudinal assessment of brain tumor response based on RANO criteria.

Methods

Patients and MRI Protocol: A total of 135 MRI scans obtained from 67 patients with HGG (45 males, age 52±15 years) were included. Patients were scanned longitudinally, every ~2 months, as part of their routine clinical assessment. Scans were performed on 3.0 T GE and Siemens scanners and included FLAIR and T₁-weighted images acquired before and after contrast agent injection (T₁W, T₁W+C).

Image Analysis: Manual segmentation of the lesion area was performed using AnalyzeDirect software, dividing the lesion area into three components (labels): enhancing, non-enhancing and necrosis. The manual segmentations were used as ground truth for model training and final evaluation. In addition, longitudinal radiological assessment based on RANO criteria⁽¹⁾ was performed at each time point. Further analysis was performed using Matlab 2018a and the fast.ai platform (PyTorch environment) and included:

Dataset Preparation: The input data for segmentation included three channels: FLAIR, T₁W, and T₁W+C images. Preprocessing included brain extraction, realignment of all MRI contrasts to the same space, and resizing of all images to 256×256 pixels, resulting in a total of 652 slices. To increase the size of the dataset, data augmentation was performed and included rotation, flipping and random lighting.
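The augmentation step can be illustrated with a minimal, dependency-free sketch. It is not the authors' pipeline (the abstract states only that rotation, flipping and random lighting were used, within fast.ai). The function names, the 90-degree rotation, the 0.5 probabilities and the 0.9 to 1.1 gain range are all assumptions chosen for illustration. The point it shows is that geometric transforms must be applied identically to every image channel and to the label mask, whereas lighting changes apply to the image channels only.

```python
import random

def flip_horizontal(img):
    """Mirror a 2D slice (list of rows) left-to-right."""
    return [row[::-1] for row in img]

def rotate90(img):
    """Rotate a 2D slice by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def adjust_lighting(img, gain):
    """Scale all intensities by a gain factor (image channels only)."""
    return [[value * gain for value in row] for row in img]

def augment(channels, mask, rng):
    """Return an augmented copy of (channels, mask).

    channels: dict mapping a contrast name (e.g. "FLAIR") to a 2D slice.
    mask:     2D slice of integer labels.
    rng:      random.Random instance, passed in for reproducibility.
    Probabilities and gain range are illustrative assumptions.
    """
    if rng.random() < 0.5:
        # Geometric transform: same operation on all channels and the mask.
        channels = {k: flip_horizontal(v) for k, v in channels.items()}
        mask = flip_horizontal(mask)
    if rng.random() < 0.5:
        channels = {k: rotate90(v) for k, v in channels.items()}
        mask = rotate90(mask)
    # Intensity transform: labels are left untouched.
    gain = rng.uniform(0.9, 1.1)
    channels = {k: adjust_lighting(v, gain) for k, v in channels.items()}
    return channels, mask
```

In practice the slices would be arrays or tensors and rotations would typically use small random angles with interpolation; nested lists are used here only to keep the sketch self-contained.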

Network architecture: A U-Net-based deep convolutional network was used, with training and validation sets of 588 and 64 slices, respectively, and a batch size of 5. The network was trained with the adaptive moment estimation (Adam)⁽⁵⁾ stochastic gradient-based optimizer, with a learning rate of 0.03 and a maximum of 8 epochs.
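For readers unfamiliar with the optimizer, a single Adam update can be sketched as follows. The learning rate of 0.03 is taken from the text. The values of beta1, beta2 and eps are the defaults suggested in the Adam paper⁽⁵⁾ and are an assumption here, since the abstract does not report them. The function name and list-based interface are hypothetical.

```python
import math

def adam_step(params, grads, m, v, t, lr=0.03, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a list of scalar parameters.

    params, grads: current parameter values and their gradients.
    m, v:          running first and second moment estimates.
    t:             1-based step counter, used for bias correction.
    beta1, beta2, eps are assumed defaults, not values from the abstract.
    Returns the updated (params, m, v) as new lists.
    """
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        m_i = beta1 * m_i + (1 - beta1) * g        # first moment (mean of gradients)
        v_i = beta2 * v_i + (1 - beta2) * g * g    # second moment (uncentered variance)
        m_hat = m_i / (1 - beta1 ** t)             # bias-corrected first moment
        v_hat = v_i / (1 - beta2 ** t)             # bias-corrected second moment
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so each parameter moves by roughly the learning rate regardless of gradient magnitude.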

Evaluation of the segmentation results: The automatically generated lesion components were compared with the manually segmented components in four ways: (1) visual inspection, (2) similarity between the components' contours based on the Dice similarity coefficient (DSC), (3) mean accuracy, sensitivity and specificity, and (4) consistency with RANO criteria⁽¹⁾, defining complete or partial response, stable or progressive disease based on the change in lesion volume between two sequential scans.
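The overlap metrics in (2) and (3) follow from the voxel-wise confusion counts between a predicted and a ground-truth mask. The sketch below assumes binary masks flattened to sequences of 0/1 values; the function name and the handling of empty denominators are illustrative choices, not part of the original analysis.

```python
def binary_metrics(pred, truth):
    """Compute DSC, sensitivity, specificity and accuracy for binary masks.

    pred, truth: equal-length sequences of 0/1 voxel labels.
    Ratios with a zero denominator are returned as None (an assumed
    convention for this sketch).
    """
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    tn = sum(1 for p, t in zip(pred, truth) if not p and not t)

    def ratio(num, den):
        return num / den if den else None

    return {
        "dsc": ratio(2 * tp, 2 * tp + fp + fn),   # 2|A∩B| / (|A| + |B|)
        "sensitivity": ratio(tp, tp + fn),        # true positive rate
        "specificity": ratio(tn, tn + fp),        # true negative rate
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
    }
```

For a multi-label segmentation (enhancing, non-enhancing, necrosis), the same function can be applied per component by binarizing each label in turn. Because lesions occupy a small fraction of each slice, specificity is dominated by background voxels and tends to be close to 1, which is why DSC is usually the more informative figure.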

Results

Visual inspection: In the majority of cases, the proposed method accurately identified and classified the enhancing, non-enhancing and necrotic lesion areas based on conventional MRI. Representative segmentation results obtained in four patients are presented in Figure 1.

Segmentation results: Representative results are given in Figure 2. The mean results for the entire lesion area were: DSC = 0.88±0.06, sensitivity = 0.87±0.007 and specificity = 0.997±0.001. Detailed results for each component are given in Table 1.

Longitudinal assessment: Figure 3 shows longitudinal segmentation results (dashed line) compared with the ground truth (solid line) obtained in a 62-year-old patient with astrocytoma. Based on RANO criteria, consistency in patient assessment in terms of complete or partial response, stable or progressive disease was obtained in 44/51 cases. Inconsistency was observed mainly in cases of small lesions.
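A simplified version of the response categorization between two sequential scans might look as follows. The abstract does not state the thresholds that were applied to lesion volume. The cut-offs used here (a decrease of at least 50% for partial response, an increase of at least 25% for progression) are those that RANO defines for the product of perpendicular diameters of enhancing lesions, and applying them to volumes is an assumption. The full criteria also consider T2/FLAIR changes, new lesions, corticosteroid use and clinical status, none of which are modeled. All function names are hypothetical.

```python
def classify_response(baseline_volume, followup_volume,
                      pr_decrease=0.50, pd_increase=0.25):
    """Assign a response category from two sequential lesion volumes.

    Thresholds are assumed (see lead-in); volumes may be in any
    consistent unit, e.g. voxels or cubic centimeters.
    """
    if baseline_volume < 0 or followup_volume < 0:
        raise ValueError("volumes must be non-negative")
    if followup_volume == 0:
        # Lesion no longer detected; nothing to compare if it was absent before.
        return "complete response" if baseline_volume > 0 else "stable disease"
    if baseline_volume == 0:
        return "progressive disease"  # lesion appears where none was measured
    change = (followup_volume - baseline_volume) / baseline_volume
    if change <= -pr_decrease:
        return "partial response"
    if change >= pd_increase:
        return "progressive disease"
    return "stable disease"

def count_agreement(auto_labels, manual_labels):
    """Return (number of matching categories, total number of comparisons)."""
    if len(auto_labels) != len(manual_labels):
        raise ValueError("label lists must have the same length")
    matches = sum(a == m for a, m in zip(auto_labels, manual_labels))
    return matches, len(auto_labels)
```

Because the category depends on relative change, a small absolute segmentation error on a small baseline lesion can shift the result across a threshold, which is one plausible reading of why inconsistencies were concentrated in small lesions.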

Discussion and Conclusion

Here we propose a fully automated DL-based method for longitudinal assessment of patients with HGG, based on the currently used radiological criteria (RANO). The promising results obtained in this study demonstrate the potential applicability of the proposed method as a tool to assist radiologists in routine clinical practice. Our current efforts to enlarge our database and optimize the performance of our DL model are expected to further improve the classification results.

Acknowledgements

No acknowledgement found.

References

1. Wen PY, Macdonald DR, Reardon DA, et al. Updated response assessment criteria for high-grade gliomas: response assessment in neuro-oncology working group. Journal of Clinical Oncology 2010;28(11):1963-1972.
2. Rubin DL, Willrett D, O'Connor MJ, Hage C, Kurtz C, Moreira DA. Automated tracking of quantitative assessments of tumor burden in clinical trials. Translational Oncology 2014;7(1):23-35.
3. Mazurowski MA, Buda M, Saha A, Bashir MR. Deep learning in radiology: an overview of the concepts and a survey of the state of the art. arXiv preprint arXiv:1802.08717 2018.
4. Pereira S, Pinto A, Alves V, Silva CA. Brain tumor segmentation using convolutional neural networks in MRI images. IEEE Transactions on Medical Imaging 2016;35(5):1240-1251.
5. Kingma DP, Ba J. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 2014.

Figures

Figure 1: Segmentation results obtained from four patients with glioblastoma. (a) Post-contrast T₁W images (T₁W+C), (b) FLAIR images, and (c) segmentation results: red=enhancing component, blue=non-enhancing component, green=necrotic component.

Figure 2: Examples of segmentation results obtained in three patients with glioblastoma, demonstrating the high similarity and DSC values obtained between the segmentation results (magenta) and the ground truth (light blue).

Figure 3: Longitudinal segmentation results obtained in a 62-year-old male scanned every ~2 months (total of 4 scans), showing the high similarity between the automatic segmentation results (dashed line) and the ground truth (solid line) for all lesion components (red=enhancing, blue=non-enhancing and green=necrotic).

Table 1: Segmentation results obtained for the entire lesion area and for the different lesion components. DSC=Dice similarity coefficient.

Proc. Intl. Soc. Mag. Reson. Med. 27 (2019)
