Extraction of radiomic features has to be repeatable to be clinically useful. We investigated the repeatability of radiomic feature extraction on a unique dataset consisting of double baseline MRI scans in 48 patients diagnosed with glioblastoma. Size and shape features, which are mostly governed by tumor segmentation, showed on average higher repeatability than intensity and texture-based features, which are more dependent on image acquisition and preprocessing. More research on the influence of image acquisition and preprocessing on the repeatability and reliability of radiomic features has to be undertaken to make radiomics a safe image-analysis tool.
Patients: We evaluated imaging from 48 patients from two clinical trials at our institutions (NCT00756106, NCT00662506). The patients were newly diagnosed with glioblastoma and underwent two baseline scans 3 to 5 days apart (mean 3.7 days) prior to the start of treatment (double baseline). No tumor showed significant progression between the two baseline scans as measured by change in contrast-enhancing tumor volume (T1W post-contrast) or T2-weighted fluid-attenuated inversion recovery (T2W-FLAIR) hyperintensity.3
Image Acquisition and Feature Extraction: Scans were acquired using identical imaging protocols on a 3.0-T MRI System (TimTrio; Siemens Medical Solutions, Malvern, PA). Reproducible slice positioning was ensured using AutoAlign. For further analysis, we used T2W-FLAIR and contrast-enhanced T1W (T1W post-contrast) sequences (5 mm slice thickness, 1 mm interslice gap, 0.43 mm in-plane resolution for both sequences). Radiomic features were extracted using two independent, open-source Python packages: pyradiomics4 and qtim_tools5. Images were skull-stripped, and each package's default image normalization was applied as part of the feature extraction process. Feature calculation was based on manual segmentations by expert raters of the whole tumor on T2W-FLAIR and of the enhancing tumor on T1W post-contrast sequences.
Analysis: Features were grouped into size, shape, intensity, and texture features.6 The intraclass correlation coefficient (ICC) between the feature values of the first and second baseline visits was determined for each feature using R (package: irr; two-way model, type: consistency, unit: single, confidence level: 0.95).
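As an illustration of the ICC settings listed above (two-way model, consistency, single unit), the following minimal Python sketch computes the point estimate for one feature from the standard two-way ANOVA mean squares, ICC = (MSR - MSE) / (MSR + (k - 1) MSE). It is not the R code used in this study; the function name `icc_consistency_single` and the array layout (one row per patient, one column per baseline visit) are assumptions made for this example, and the confidence interval is not computed.

```python
import numpy as np


def icc_consistency_single(values):
    """Two-way, consistency, single-unit ICC (point estimate only).

    `values` is an (n_subjects, k_visits) array holding one radiomic
    feature. The layout and the function name are illustrative choices.
    """
    x = np.asarray(values, dtype=float)
    n, k = x.shape
    grand_mean = x.mean()

    # Sums of squares for subjects (rows), visits (columns), and residual.
    ss_rows = k * np.sum((x.mean(axis=1) - grand_mean) ** 2)
    ss_cols = n * np.sum((x.mean(axis=0) - grand_mean) ** 2)
    ss_total = np.sum((x - grand_mean) ** 2)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Consistency ICC: the visit (column) effect is excluded, so a constant
    # offset between the two visits does not lower the coefficient.
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
```

Because the consistency type ignores systematic shifts between visits, a feature whose second-visit value equals the first-visit value plus a constant still yields an ICC of 1; the agreement type would penalize such a shift.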
In this study, we analyzed the repeatability of radiomic feature extraction for two open-source radiomics packages. Feature extraction was performed on a unique dataset of double baseline scans of a patient cohort newly diagnosed with glioblastoma. The physicians performing manual segmentation confirmed that no significant change in tumor volume or shape could be detected between the two scans (average time interval 3.7 days, no treatment). To limit variation, both scans of each patient were acquired on the same MRI machine with secured alignment, and manual segmentation of the tumor was performed by the same physician.
Features in the size and shape groups, which are less influenced by image acquisition and preprocessing (such as normalization), showed good test-retest reliability, supporting the physicians’ findings. However, intensity and texture-based features showed great variability in their ICC. This result indicates that the measurement of some of the features in these groups cannot be reliably repeated.
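One simple way to quantify how widely ICC values spread within each feature group is to summarize them per group. The sketch below assumes two hypothetical inputs, a mapping from feature name to ICC and a mapping from feature name to group label; the function name `summarize_icc_by_group` and both mappings are illustrative and are not part of pyradiomics or qtim_tools.

```python
from statistics import median


def summarize_icc_by_group(icc_by_feature, group_of):
    """Return min, median, max, and range of ICC values per feature group.

    icc_by_feature: dict mapping feature name -> ICC (hypothetical input)
    group_of:       dict mapping feature name -> group label, e.g. "size",
                    "shape", "intensity", or "texture" (hypothetical input)
    """
    grouped = {}
    for feature, icc in icc_by_feature.items():
        grouped.setdefault(group_of[feature], []).append(icc)

    summary = {}
    for group, iccs in grouped.items():
        summary[group] = {
            "min": min(iccs),
            "median": median(iccs),
            "max": max(iccs),
            "range": max(iccs) - min(iccs),
        }
    return summary
```

A large range within a group points to individual features that should be checked for repeatability before use, rather than accepting or rejecting the group as a whole.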
This publication was supported by the Martinos Scholars fund to Katharina Hoebel. Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the Martinos Scholars fund.
This project was supported by a training grant from the NIH Blueprint for Neuroscience Research (T90DA022759/R90DA023427) and the National Institute of Biomedical Imaging and Bioengineering (NIBIB) of the National Institutes of Health under award number 5T32EB1680 to K. Chang and J. Patel. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
This study was supported by National Institutes of Health grants U01 CA154601, U24 CA180927, and U24 CA180918 to J. Kalpathy-Cramer.
We would like to acknowledge the GPU computing resources provided by the MGH and BWH Center for Clinical Data Science.
This research was carried out in whole or in part at the Athinoula A. Martinos Center for Biomedical Imaging at the Massachusetts General Hospital, using resources provided by the Center for Functional Neuroimaging Technologies, P41EB015896, a P41 Biotechnology Resource Grant supported by the National Institute of Biomedical Imaging and Bioengineering (NIBIB), National Institutes of Health.
1 Lambin, P., Rios-Velazquez, E., Leijenaar, R., Carvalho, S., Van Stiphout, R. G. P. M., Granton, P., … Aerts, H. J. W. L. (2012). Radiomics: Extracting more information from medical images using advanced feature analysis. European Journal of Cancer, 48(4), 441–446. http://doi.org/10.1016/j.ejca.2011.11.036
2 Traverso, A., Wee, L., Dekker, A., & Gillies, R. (2018). Repeatability and Reproducibility of Radiomic Features: A Systematic Review. International Journal of Radiation Oncology*Biology*Physics, 102(4), 1143–1158. http://doi.org/10.1016/J.IJROBP.2018.05.053
3 Batchelor, T. T., Gerstner, E. R., Emblem, K. E., Duda, D. G., Kalpathy-Cramer, J., Snuderl, M., … Jain, R. K. (2013). Improved tumor oxygenation and survival in glioblastoma patients who show increased blood perfusion after cediranib and chemoradiation. Proceedings of the National Academy of Sciences, 110(47), 19059–19064. http://doi.org/10.1073/pnas.1318022110
4 Van Griethuysen, J. J. M., Fedorov, A., Parmar, C., Hosny, A., Aucoin, N., Narayan, V., … Aerts, H. J. W. L. (2017). Computational radiomics system to decode the radiographic phenotype. Cancer Research, 77(21), e104–e107. http://doi.org/10.1158/0008-5472.CAN-17-0339
5 qtim_tools; https://github.com/QTIM-Lab/qtim_tools
6 Kalpathy-Cramer, J., Mamomov, A., Zhao, B., Lu, L., Cherezov, D., Napel, S., … Goldgof, D. (2016). Radiomics of Lung Nodules: A Multi-Institutional Study of Robustness and Agreement of Quantitative Imaging Features. Tomography, 2(4), 430–437. http://doi.org/10.18383/j.tom.2016.00235