Jaime Barranco1,2,3, Hamza Kebiri1,2,3, Óscar Esteban2, Raphael Sznitman4, Sönke Langner5,6, Oliver Stachs7, Adrian Konstantin Luyken7, Philipp Stachs8, Benedetta Franceschiello2,3,9,10,11, and Meritxell Bach Cuadra3,11
1Center for Biomedical Imaging (CIBM), Lausanne, Switzerland, 2Lausanne University Hospital (CHUV), Lausanne, Switzerland, 3University of Lausanne (UNIL), Lausanne, Switzerland, 4ARTORG Center for Biomedical Engineering Research, University of Bern, Bern, Switzerland, 5Institute for Diagnostic and Interventional Radiology, Pediatric and Neuroradiology, Rostock University Medical Center, Rostock, Germany, 6Department of Diagnostic Radiology and Neuroradiology, University of Greifswald, Greifswald, Germany, 7Department of Ophthalmology, Rostock University Medical Center, Rostock, Germany, 8Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany, 9HES-SO Valais-Wallis, Sion, Switzerland, 10The Sense Innovation and Research Center, Sion and Lausanne, Switzerland, 11These authors provided equal last-authorship contribution, Lausanne, Switzerland
Synopsis
Keywords: Analysis/Processing, Segmentation, Quality Assessment and Control, Eye, MREye, Ophthalmology, Ocular
Motivation: Reliable large-scale MREye segmentation.
Goal(s): Quality control of eye MRI and deep learning segmentation validation.
Approach: We automatically extract Image Quality Metrics (IQMs) and use them as features to train a model in a supervised framework with expert rating annotations as target. Multi-class 3D MREye segmentation is done for the first time using the deep-learning-based approach nnUNet.
Results: None of the models achieved the levels of sensitivity and specificity required for our MREye application. nnUNet yielded promising outcomes for MREye segmentation and was robust across a range of MRI quality levels.
Impact: MREye does not escape the evidence that insufficient data quality threatens the reliability of analysis outcomes. We pioneer manual and automated quality control on MREye and benchmark deep learning eye segmentation.
Introduction
MRI of the human eye (MREye) is gaining interest due to its comprehensive 3D anatomical view1. Our previous work2,3,4 introduced A-eye, an automated atlas-based segmentation technique for eye structures in T1w MRI data, aiming for large-scale, reliable segmentation.
Because low-quality data can introduce biases in the results and lead to erroneous conclusions, it is critical to establish a quality assessment and control (QA/QC) protocol. However, setting exclusion criteria is challenging due to variability across applications and researchers. Automated QC protocols, such as MRIQC5, help in the early identification of subpar images by automatically computing several image quality metrics (IQMs), and generate standardized visual reports for manual assessment of different quality-related aspects.
While these techniques have been extensively studied and implemented for the adult brain6, their application to the eyes remains unexplored, leaving a gap in understanding effective QA/QC protocols for MREye.
Our work contributes a deep learning segmentation model compared with manual annotations and previous techniques, and a QA/QC protocol within the A-Eye pipeline. While advocating for both pre- and post-segmentation QA/QC checkpoints, as depicted in Figure 1, this work focuses on the QC of pre-segmented images.
We believe having a tailored MREye QC would significantly enhance the integrity, reliability, and clinical applicability of the segmented large-scale data, rendering it an essential component of MREye analysis workflows.
Methods
Data
1. SHIP (Study of Health in Pomerania, Germany) dataset7,8: 1245 T1w images from a 1.5T Magnetom Avanto, with manual annotations on 68 subjects of the lens, globe, optic nerve, intraconal and extraconal fats, and rectus muscles.
2. MRIQC datasets5: ABIDE, comprising 1102 T1w scans from 17 sites (19 scanners); and DS030, with 265 images from 2 sites, selected for their heterogeneity.
MREye segmentation method
A deep learning-based approach (nnUNet9) was trained on the manually annotated SHIP dataset for 3D MREye segmentation: 31 subjects were used for training (4 of them reserved for validation) and 37 were held out for evaluation.
Manual quality control
Subjective eye-quality assessment was performed on 183 SHIP subjects using adapted MRIQC reports, which included a changed field of view for the thumbnails and eye-oriented aspects such as eyes open/closed (see Figure 2). Two expert raters assessed 83 subjects with scores from 0 (exclude) to 4 (excellent), which were then averaged and normalized to obtain a binary score (exclude/include). Another rater rated an additional 100 subjects directly with binary scoring.
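The averaging and binarizing of the two raters' scores can be sketched as follows. This is only an illustration: the 0.5 cut-off on the normalized score and the function name `binarize_ratings` are assumptions, since the abstract does not state the threshold that was used.

```python
from statistics import mean

MAX_SCORE = 4  # ratings run from 0 (exclude) to 4 (excellent)


def binarize_ratings(scores_by_rater, threshold=0.5):
    """Average per-subject scores across raters, rescale to [0, 1], and threshold.

    `threshold` is a hypothetical cut-off; the source does not specify one.
    Returns one "include"/"exclude" label per subject.
    """
    labels = []
    for subject_scores in zip(*scores_by_rater):
        normalized = mean(subject_scores) / MAX_SCORE
        labels.append("include" if normalized >= threshold else "exclude")
    return labels


# Toy scores for four subjects (made-up values, not study data).
rater_a = [4, 1, 2, 0]
rater_b = [3, 2, 3, 1]
labels = binarize_ratings([rater_a, rater_b])
```

With a different threshold, borderline subjects (for example, an average score of 2) would switch class, so the cut-off should be fixed before comparing against automated decisions.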
Automated quality control
Five eye QC strategies were explored for automatic exclusion/inclusion classification:
- Baseline. Retrained MRIQC classifier using ABIDE for training and DS030 for testing, with updated scikit-learn and NumPy Python libraries.
- Adapted model. Retrained baseline model using binary-rated SHIP subset (N=183).
- Non-brain model. Implemented previous methods, filtering out brain-related IQMs (counting on 10 out of 68).
- Custom non-brain model. Trained custom classifier using the SHIP subset (N=183), 80-20% as train-test split, omitting brain-based IQMs.
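The two ingredients shared by the non-brain strategies above, dropping brain-related IQMs and an 80-20% train-test split, can be sketched in plain Python. The `BRAIN_MARKERS` rule, the toy feature list, and the class balance are assumptions made for illustration; the abstract does not list which IQMs were filtered, nor does it say whether the split was stratified.

```python
import random

# Hypothetical markers of brain-tissue-dependent IQMs (an assumption,
# not the authors' actual selection).
BRAIN_MARKERS = ("_gm", "_wm", "_csf", "wm2max", "cnr", "cjv")


def drop_brain_iqms(feature_names):
    """Keep only IQM names that contain none of the assumed brain markers."""
    return [n for n in feature_names if not any(m in n for m in BRAIN_MARKERS)]


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), preserving the class ratio in both parts."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx += idx[:n_test]
        train_idx += idx[n_test:]
    return sorted(train_idx), sorted(test_idx)


features = ["snr_gm", "efc", "fber", "fwhm_avg", "summary_bg_mean", "icvs_wm", "cjv"]
kept = drop_brain_iqms(features)

# Toy labels for 183 subjects: 1 = include, 0 = exclude (made-up class balance).
labels = [1] * 150 + [0] * 33
train_idx, test_idx = stratified_split(labels)
```

Stratifying keeps the minority "exclude" class represented in the test set, which matters when sensitivity and specificity are the metrics of interest.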
Results
nnUNet for MREye segmentation tasks yielded promising outcomes. Compared to manual annotations for 37 subjects, it achieved a median DSC of 0.82 across 9 structures, outperforming the atlas-based method's 0.68 (see Figure 3), with slightly greater challenges encountered in more variable areas such as the extraconal fat. nnUNet also performed well on low-quality images (see Figure 4). Note that only subjects whose quality was rated as "include" were manually segmented; hence the DSC is always computed within that subset of subjects.
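For reference, the Dice similarity coefficient (DSC) between a predicted and a reference segmentation is 2|A ∩ B| / (|A| + |B|), computed per structure. The sketch below uses toy flattened label maps with illustrative label ids; how the reported median was aggregated across subjects and structures is not stated in the abstract, so the final line is only one possible choice.

```python
from statistics import median


def dice(pred, ref, label):
    """Dice similarity coefficient for one label over flattened label maps."""
    pred_mask = [p == label for p in pred]
    ref_mask = [r == label for r in ref]
    intersection = sum(p and r for p, r in zip(pred_mask, ref_mask))
    total = sum(pred_mask) + sum(ref_mask)
    # Convention (an assumption): a label absent from both maps scores 1.0.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy label maps: 0 = background, 1 = lens, 2 = globe (ids are illustrative).
reference = [0, 1, 1, 2, 2, 2, 0, 0]
predicted = [0, 1, 2, 2, 2, 2, 0, 0]

scores = {label: dice(predicted, reference, label) for label in (1, 2)}
median_dsc = median(scores.values())
```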
A significant gap remains between automated and manual QC decisions. After reviewing results on 426 subjects, the non-brain baseline model achieved the highest overlap, with only 30% agreement with human raters in image exclusions. The automatic models did detect overall poor quality. However, in a substantial number of those exclusions, the human raters deemed the local quality of the eyes sufficient for the application. This is illustrated by the examples in Figure 5.
Discussion
Deep learning segmentation of eye structures surpassed atlas-based methods in 37 subjects. Segmentation performance wasn’t linked to manual quality assessment. Automated global brain- or background-based quality control didn’t meet the needs of our eye segmentation application. Our findings emphasize the need for QA/QC protocols tailored to MREye, including both eye-specific and non-tissue metrics.
Conclusion
To accurately evaluate eye quality in MRI, it's imperative to develop novel IQMs specifically tailored to eye tissues. Additionally, incorporating non-brain related IQMs and extending scrutiny to the periorbital region is crucial.
Acknowledgements
This work was supported by the Gelbert Foundation and the Swiss National Science Foundation (project 205321-182602). We acknowledge the CIBM Center for Biomedical Imaging, a Swiss research center of excellence founded and supported by CHUV, UNIL, EPFL, UNIGE, HUG and the Leenaards and Jeantet Foundations.
References
- T. Niendorf, J.-W. M. Beenakker, S. Langner, K. Erb-Eigner, M. Bach Cuadra, E. Beller, J. M. Millward, T. M. Niendorf, O. Stachs, Ophthalmic magnetic resonance imaging: where are we (heading to)?, Current Eye Research (2021) 1–20.
- Barranco J., Kebiri H., Esteban O., Sznitman R., Stachs O., Stachs P., Langner S., Franceschiello B., Bach Cuadra M., A-Eye: Towards a large-scale MRI-based model of the complete eye, ISMRM abstract (2022).
- Barranco J., Kebiri H., Esteban O., Sznitman R., Stachs O., Stachs P., Langner S., Franceschiello B., Bach Cuadra M., A-Eye: Towards large-scale MRI automated segmentation of the eye, ARVO Imaging Conference (2023).
- Barranco J., Kebiri H., Esteban O., Sznitman R., Stachs O., Stachs P., Langner S., Franceschiello B., Bach Cuadra M., A-Eye: Large-scale MRI automatic biomarkers extraction, ARVO Conference (2023).
- Esteban O, Birman D, Schaer M, Koyejo OO, Poldrack RA, Gorgolewski KJ; MRIQC: Advancing the Automatic Prediction of Image Quality in MRI from Unseen Sites; PLOS ONE 12(9):e0184661; doi:10.1371/journal.pone.0184661. Documentation: https://mriqc.readthedocs.io/en/latest/about.html. Github: https://github.com/nipreps/mriqc. mriqc-learn Github: https://github.com/nipreps/mriqc-learn
- Provins, Céline, et al. ‘Quality Control in Functional MRI Studies with MRIQC and fMRIPrep’. Frontiers in Neuroimaging, vol. 1, 2023. Frontiers, https://www.frontiersin.org/articles/10.3389/fnimg.2022.1073734.
- P. Schmidt, R. Kempin, S. Langner, A. Beule, S. Kindler, T. Koppe, H. Völzke, T. Ittermann, C. Jürgens, F. Tost, Association of anthropometric markers with globe position: A population-based MRI study, PLoS ONE 14 (2019) e0211817.
- Völzke, H., Alte, D., Schmidt, C. O., Radke, D., Lorbeer, R., Friedrich, N., et al. (2011). Cohort Profile: The Study of Health in Pomerania. Int. J. Epidemiol. 40, 294–307. doi: 10.1093/ije/dyp394.
- Isensee F., Jaeger P. F., Kohl S. A. A., Petersen J., and Maier-Hein K. H., nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation, 2020. Github: https://github.com/MIC-DKFZ/nnUNet