S59

Assessing the reproducibility of semi-automated segmentation methods on post-operative T1 post contrast and FLAIR MRI of GBM.
Olga V Fadeeva Da Costa1,2, Shah Islam2,3, Mark M Boubnovski3, Eric Aboagye3, and Adam D Waldman4,5
1Department of Surgery & Cancer, Cancer Imaging Centre, Hammersmith Campus, Imperial College, London, United Kingdom, 2Imaging Department, MRI Unit, Hammersmith Hospital, Imperial College Healthcare NHS Trust, London, United Kingdom, 3Department of Surgery & Cancer, Cancer Imaging Centre, Hammersmith Campus, Imperial College, London, United Kingdom, 4Department of Brain Sciences, Hammersmith Campus, Imperial College, London, United Kingdom, 5Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, United Kingdom

Synopsis

Glioblastoma (GBM) accounts for ~70% of adult primary brain tumours. Interrogation of post-operative disease residuum and the peritumoral microenvironment is necessary for imaging biomarker development. This requires accurate delineation and segmentation of these regions of interest. A total of 35 post-operative T1 post-contrast and T2 FLAIR MRI examinations underwent segmentation by two independent readers. Inter-reader reproducibility was poor for segmentations of disease residuum on post-contrast T1 images, and higher for segmentations of abnormal regions on FLAIR. Strong intra-reader reproducibility suggests systematic error effects in identifying regions of interest.

INTRODUCTION

Glioblastoma (GBM) accounts for ~70% of adult primary brain tumours [1]. Despite optimal current treatment, median survival remains only 12-15 months. Prognosis is also highly variable, as therapeutic response and outcome are largely determined by individual tumour biology [2,3,4]. Magnetic resonance imaging (MRI) is widely used to assess disease status following maximal surgical resection and chemoradiotherapy. Accurate segmentation of the post-operative residuum and the peritumoral microenvironment is important for assessing therapeutic response, for developing novel imaging biomarkers, and for driving more computationally advanced methods for disease detection and prognostication, e.g. radiogenomics. In clinical applications, radiologists may perform manual segmentation, which is subjective, time consuming and difficult to reproduce. Semi-automated segmentation methods can improve efficiency [5], although there is a lack of evidence on the reproducibility of these segmentations, which can vary with the choice of seed points, thresholding values and iteration termination conditions. The purpose of our study is therefore to assess the reproducibility of manually adjusted semi-automated segmentation on post-operative T1 post-contrast and FLAIR sequences in patients with GBM.

METHODS

A total of 35 post-surgical MRI examinations from patients with histologically confirmed GBM were analysed. All imaging was performed using a 3T Verio clinical system (Siemens, Erlangen, Germany). The harmonized MRI protocol included a volumetric T2-weighted FLAIR sequence (TR 5000 ms, TE 329 ms) and a volumetric post-contrast T1-weighted sequence (TR 1900 ms, TE 2.5 ms). Regions of interest for segmentation were defined as solid enhancing tissue on the T1-weighted post-contrast images, and ‘abnormal’ increased signal, excluding the cystic core, on T2-weighted FLAIR images. All segmentations were performed in freely available software, ITK-SNAP [6], using a semi-automated thresholding approach followed by manual editing by two independent readers. The first reader was a radiologist with 10 years of clinical experience and the second was a senior MRI radiographer with 15 years’ experience. Inter-reader reproducibility of segmentations was assessed by calculating the Sørensen-Dice score. A subset of 15 cases was re-segmented to test intra-reader reproducibility. To test the reproducibility of the segmentation volumes, intraclass correlation coefficients (ICC) were calculated.
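As an illustration of the overlap metric named above, the following is a minimal Python sketch of the Sørensen-Dice score, 2|A∩B| / (|A| + |B|), for two binary masks. The function name `dice_score`, the flat 0/1 list representation and the convention chosen for two empty masks are assumptions made for this example; the abstract does not state how the score was computed in practice.

```python
def dice_score(mask_a, mask_b):
    """Sørensen-Dice overlap between two binary masks.

    Each mask is a flat sequence of 0/1 values, one entry per voxel
    (hypothetical layout chosen for this sketch).
    """
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must contain the same number of voxels")

    size_a = sum(1 for v in mask_a if v)                      # |A|
    size_b = sum(1 for v in mask_b if v)                      # |B|
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)  # |A ∩ B|

    if size_a + size_b == 0:
        # Assumed convention: two empty masks are treated as perfect agreement.
        return 1.0
    return 2.0 * overlap / (size_a + size_b)
```

For instance, the masks [1, 1, 0, 0] and [1, 0, 1, 0] have equal volume but share only one voxel, so they score 0.5 under this definition.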

RESULTS

Table 1 summarises the inter- and intra-reader reproducibility of segmentations and their respective volumes. The inter-reader Dice score for segmentations performed on post-contrast T1 images showed poor reproducibility (0.69). The reproducibility for segmentations on FLAIR images was significantly higher, with a Dice score of 0.79. Intra-reader reproducibility was consistently high, with Dice scores ranging from 0.89 to 0.95 across both readers. Similarly, the volumes of the segmentations were consistent both within and between readers, with ICCs ranging from 0.89 to 0.99.
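The abstract does not state which form of the ICC was used. As one possibility, the sketch below computes the two-way random-effects, absolute-agreement, single-measure form, often written ICC(2,1), from a table of volumes with one row per case and one column per reader. The function name `icc_2_1` and the data layout are assumptions for illustration only.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    `ratings` is a list of rows, one per case, each holding one value per
    reader (hypothetical layout for this sketch).
    """
    n = len(ratings)       # number of cases
    k = len(ratings[0])    # number of readers
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need at least 2 cases and 2 readers, all rows full")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for a two-way layout without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)               # between-case mean square
    ms_cols = ss_cols / (k - 1)               # between-reader mean square
    ms_error = ss_error / ((n - 1) * (k - 1))  # residual mean square

    denominator = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    return (ms_rows - ms_error) / denominator
```

Because this form penalises a constant offset between readers, it would drop if one reader systematically drew larger volumes than the other, which a consistency-type ICC would not detect.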

DISCUSSION

The primary purpose of this study was to quantify the inter- and intra-reader reproducibility of segmentation of post-operative MRI in patients with GBM. Accurate delineation and analysis of disease residuum and the peritumoral environment is important for clinical trial endpoints and imaging biomarker development. Furthermore, computational techniques that require labelled data to learn hierarchical structures within data rely on ‘ground truth’ segmentations from clinical subject matter experts. Irreproducibility of these segmentations invalidates these ‘ground truths’. The inter-reader reproducibility was lower than expected on post-contrast T1 images, although it improved on the T2 FLAIR segmentations. The low Dice coefficients are likely to result from differences in experience between the two readers in interpreting regions of interest across the two sequences. GBM is notoriously heterogeneous on MRI, and although only solid enhancing components were included in the segmentation, this is often open to interpretation between readers where the imaging appearances do not fall clearly into either category (Figure 1). The strong intra-reader reproducibility also suggests there was a systematic error effect in choosing appropriate regions of interest for segmentation on T1 post-contrast images.
Semi-automated region-based techniques used in this study obtain the segmentation result by iteratively adding voxels adjacent to the specified seed points. A limitation is that the seed points need to be specified manually, which again introduces human bias. It is also clear that semi-automated techniques using thresholding work better on tissue interfaces with high contrast-to-noise ratios, which is the likely reason why the T2 FLAIR segmentations were more reproducible (Figure 2). Although the Dice scores demonstrated variability across MRI sequences, the volumes of the segmentations were more similar across both readers and showed significant intraclass correlation (0.89-0.99), which suggests that the Dice score, which is based on voxel-by-voxel comparison, may overestimate the difference between segmentations.
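The dependence on seed placement described above can be illustrated with a generic threshold-based region-growing sketch in Python. This is not ITK-SNAP's implementation; the function name `region_grow`, the 2D list-of-lists image, the 4-connected neighbourhood and the fixed intensity window are all assumptions chosen to keep the example short.

```python
from collections import deque


def region_grow(image, seed, lower, upper):
    """Grow a region from `seed` over 4-connected pixels within [lower, upper].

    `image` is a list of rows of intensities and `seed` is a (row, col) tuple.
    Returns the set of (row, col) positions in the grown region.
    """
    rows, cols = len(image), len(image[0])
    r0, c0 = seed
    if not (lower <= image[r0][c0] <= upper):
        return set()  # seed itself falls outside the intensity window

    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        # Visit the four direct neighbours of the current pixel.
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region and lower <= image[nr][nc] <= upper:
                region.add((nr, nc))
                queue.append((nr, nc))
    return region


# Toy image with two separate bright areas (made-up intensities).
toy_image = [
    [10, 10, 10, 10, 10],
    [10, 90, 95, 10, 10],
    [10, 92, 10, 10, 88],
    [10, 10, 10, 10, 91],
]
left_region = region_grow(toy_image, (1, 1), 80, 100)
right_region = region_grow(toy_image, (2, 4), 80, 100)
```

With the same intensity window, the two seeds yield disjoint regions of different size, which is the kind of operator-dependent outcome the paragraph above refers to.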

CONCLUSION

We found that manually edited semi-automated segmentations in the post-surgical setting are not reproducible to the level at which they can be considered a reliable ‘ground truth’. Future work will include comparisons with fully automated techniques.

Acknowledgements

The study was funded by Imperial Health Charity, The Brain Tumour Charity and Brain Tumour Research United Kingdom. SI was funded by an unrestricted educational grant from Bayer.

References

[1] Wen PY et al. Malignant gliomas in adults. N Engl J Med. 2008, 359:492. http://www.ncbi.nlm.nih.gov/pubmed/18669428

[2] Lassman AB et al. Incorporating molecular tools into clinical trials and treatment for gliomas? Curr Opin Neurol. 2007, 20:708–711. http://www.ncbi.nlm.nih.gov/pubmed/17992094

[3] Louis DN. Molecular pathology of malignant gliomas. Annu. Rev. Pathol. Mech. Dis. 2006, 1:97–117.

[4] Scheithauer BW et al. The 2007 WHO classification of tumors of the nervous system: controversies in surgical neuropathology. Brain Pathology. 2008, 18:307–316. http://www.brainlife.org/reprint/2008/Scheithauer_BW080602.pdf

[5] Wu Y, Zhao Z, Wu W, Lin Y, Wang M. Automatic glioma segmentation based on adaptive superpixel. BMC Medical Imaging. 2019, 19(1):1–14. https://doi.org/10.1186/s12880-019-0369-6

[6] Yushkevich PA, Piven J, Hazlett HC, Smith RG, Ho S, Gee JC, Gerig G. User-guided 3D active contour segmentation of anatomical structures: significantly improved efficiency and reliability. Neuroimage. 2006, 31(3):1116–1128. doi:10.1016/j.neuroimage.2006.01.015

Figures

Table 1: Summary of reproducibility of segmentation voxels and their respective volumes

Figure 1 – (a+d) Original post-contrast T1W images. (b+e) Reader 1 segmentations. (c+f) Reader 2 segmentations.

Figure 2 – (a+d) Original T2 FLAIR images. (b+e) Reader 1 segmentations. (c+f) Reader 2 segmentations.

Proc. Intl. Soc. Mag. Reson. Med. 29 (2021)