1395

Automatic segmentation of spinal cord multiple sclerosis lesions across multiple sites, contrasts and vendors

Pierre-Louis Benveniste^1,2, Jan Valošek^1,2,3,4, Michelle Chen¹, Nathan Molinier^1,2, Lisa Eunyoung Lee^5,6, Alexandre Prat^7,8, Zachary Vavasour⁹, Roger Tam⁹, Anthony Traboulsee¹⁰, Shannon Kolind¹⁰, Jiwon Oh^5,6, and Julien Cohen-Adad^1,2,11,12
¹NeuroPoly Lab, Institute of Biomedical Engineering, Polytechnique Montreal, Montréal, QC, Canada, ²Mila - Quebec AI Institute, Montréal, QC, Canada, ³Department of Neurosurgery, Faculty of Medicine and Dentistry, Palacký University Olomouc, Olomouc, Czech Republic, ⁴Department of Neurology, Faculty of Medicine and Dentistry, Palacký University Olomouc, Olomouc, Czech Republic, ⁵Department of Medicine (Neurology), University of Toronto, Toronto, ON, Canada, ⁶BARLO Multiple Sclerosis Centre & Keenan Research Centre, St. Michael's Hospital, Toronto, ON, Canada, ⁷Department of neuroscience, Université de Montréal, Montréal, QC, Canada, ⁸Neuroimmunology research laboratory, University of Montreal Hospital Research Centre (CRCHUM), Montréal, QC, Canada, ⁹School of Biomedical Engineering, University of British Columbia, Vancouver, BC, Canada, ¹⁰Departments of Medicine (Neurology), Physics, Radiology, University of British Columbia, Vancouver, BC, Canada, ¹¹Functional Neuroimaging Unit, CRIUGM, Université de Montréal, Montréal, QC, Canada, ¹²Centre de Recherche du CHU Sainte-Justine, Université de Montréal, Montréal, QC, Canada

Synopsis

Keywords: Diagnosis/Prediction, Multiple Sclerosis, Deep Learning, Segmentation, Spinal Cord

Motivation: Longitudinal analysis of spinal cord multiple sclerosis (MS) lesions is clinically relevant for the early diagnosis and monitoring of MS progression.

Goal(s): Develop a deep learning tool for the automatic segmentation of MS spinal cord lesions on PSIR and STIR images from multiple sites.

Approach: An nnUNet model was trained and tested on the baseline data, then applied to follow-up scans to create lesion distribution maps.

Results: We demonstrated the utility of the model for mapping the spatio-temporal distribution of MS lesions across MS phenotypes. The model is packaged as open-source software.

Impact: Automatic segmentation of spinal cord lesions in large cohorts helps to identify signatures of MS phenotypes for ultimately improving prognosis and optimizing treatment for people with MS.

Introduction

Clinical monitoring of spinal cord (SC) multiple sclerosis (MS) lesions is relevant for the early diagnosis and evaluation of MS progression¹. While methods exist to automatically segment MS lesions in the brain, only a few have tackled lesion segmentation in the SC². Moreover, existing SC MS lesion segmentation algorithms work well only for the MRI contrasts used during training and generalize poorly to other contrasts. In this work, we propose a deep learning (DL) model for the automatic segmentation of both the SC and lesions in phase sensitive inversion recovery (PSIR) and short tau inversion recovery (STIR) images. The algorithm was tested on longitudinal multi-site, multi-contrast and multi-vendor data. As a proof-of-concept application, we used the model to automatically generate spatial distribution maps of MS lesions^3,4.

Methods

3T MRI data from 5 sites were collected as part of the ongoing Canadian Prospective Cohort Study to Understand Progression in MS (CanProCo) project⁵. Sagittal PSIR 0.7×0.7×3 mm³ (4 sites, 333 participants) and sagittal STIR 0.7×0.7×3 mm³ (1 site, 92 participants) images of the cervical SC from the baseline session (M0) were used for model training. To study spatio-temporal MS lesion distribution, participants with both M0 and 12-month follow-up (M12) sessions were used, resulting in 158 relapsing-remitting MS (RRMS), 45 primary progressive MS (PPMS), and 45 radiologically isolated syndrome (RIS) participants.
MS lesions were manually segmented by a single rater, and intervertebral discs were manually identified on M0 images. For each subject, the SC was automatically segmented on M0 images using the contrast-agnostic DL model⁶ on STIR and inverted PSIR (multiplied by -1). Segmentations were manually corrected when necessary (~5% of the images). For M12 data, intervertebral discs were obtained using the Hourglass model⁷ fine-tuned on M0 data and manually corrected when necessary (~20% of the images).
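The PSIR inversion described above is a voxel-wise multiplication by -1, which the Methods suggest is what makes lesions appear hyper-intense in both contrasts. A minimal sketch in Python, assuming the image is already loaded as a NumPy array; the function name `invert_psir` and the toy intensities are hypothetical and not taken from the authors' pipeline:

```python
import numpy as np

def invert_psir(image: np.ndarray) -> np.ndarray:
    """Return the intensity-inverted image (voxel values multiplied by -1)."""
    # Cast to float first so that unsigned integer images do not wrap around.
    return -1.0 * image.astype(np.float32)

# Toy 2x2 "PSIR" slice: one darker voxel (value 20) among brighter ones.
psir = np.array([[100, 20], [100, 100]], dtype=np.uint16)
inverted = invert_psir(psir)
# After inversion, the formerly darkest voxel holds the highest value.
```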
The self-configuring nnUNet v2 framework⁸ was used to train two DL models (3D and 2D) on the STIR and inverted PSIR images from M0 to simultaneously segment hyper-intense lesions and the SC (269/67/89 images for training/validation/testing). The performance of both models on the test set (~20% of M0 images for each site) was compared with sct_deepseg_lesion². The models were then applied to unseen M12 data.
Lesion and SC masks were brought to the PAM50 SC template⁹ to create lesion frequency maps⁴ for individual phenotypes and across sessions.
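One common way to build such a lesion frequency map, once all masks share the template grid, is to compute at each voxel the fraction of participants who have a lesion there. The sketch below assumes binary masks already registered to a common space and loaded as NumPy arrays; the function name `lesion_frequency_map` and the toy data are hypothetical, and the exact normalization used by the authors is not stated in the abstract:

```python
import numpy as np

def lesion_frequency_map(lesion_masks, cord_mask):
    """Voxel-wise fraction of participants with a lesion, restricted to the cord."""
    # Stack binarized masks along a new "participant" axis.
    stacked = np.stack([np.asarray(m) > 0 for m in lesion_masks]).astype(np.float32)
    # The mean over participants is the lesion frequency at each voxel.
    frequency = stacked.mean(axis=0)
    # Zero out voxels outside the spinal cord mask.
    return np.where(np.asarray(cord_mask) > 0, frequency, 0.0)

# Toy example: three participants on a 2x2 grid (assumed to be in template space).
masks = [
    np.array([[1, 0], [0, 0]]),
    np.array([[1, 1], [0, 0]]),
    np.array([[0, 0], [0, 1]]),
]
cord = np.array([[1, 1], [1, 0]])
freq = lesion_frequency_map(masks, cord)
```

Computing one such map per phenotype and per session gives maps that can be compared side by side.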

Results

Figure 1 shows that the 3D and 2D nnUNet models performed similarly, and both outperformed sct_deepseg_lesion on lesion-wise and voxel-wise metrics (see Figure 2 for qualitative examples).
Figure 3 depicts the number of lesions across phenotypes and sessions. We measured an average of 4.71 manually segmented lesions at M0 and 3.55 lesions predicted by the 2D model at M12. PPMS participants showed a higher number of lesions relative to RRMS and RIS participants across all sessions.
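The voxel-level and lesion-level metrics reported in Figure 1 can be illustrated with a short Python sketch. It is an assumption-laden illustration rather than the authors' evaluation code: lesions are taken to be connected components found by `scipy.ndimage.label` with its default connectivity, a lesion counts as matched if it overlaps the other mask by at least one voxel, and the function names are hypothetical.

```python
import numpy as np
from scipy import ndimage

def dice_score(pred, truth):
    """Voxel-wise Dice overlap between two binary masks."""
    pred, truth = np.asarray(pred) > 0, np.asarray(truth) > 0
    denom = int(pred.sum()) + int(truth.sum())
    if denom == 0:
        return 1.0  # both masks empty: treated here as perfect agreement
    return 2.0 * int(np.logical_and(pred, truth).sum()) / denom

def lesion_wise_scores(pred, truth):
    """Lesion-wise PPV, sensitivity and F1 (assumed criterion: any overlap)."""
    pred, truth = np.asarray(pred) > 0, np.asarray(truth) > 0
    pred_labels, n_pred = ndimage.label(pred)
    truth_labels, n_truth = ndimage.label(truth)
    # Predicted lesions that touch at least one ground-truth voxel.
    tp_pred = sum(bool(truth[pred_labels == i].any()) for i in range(1, n_pred + 1))
    # Ground-truth lesions that touch at least one predicted voxel.
    detected = sum(bool(pred[truth_labels == i].any()) for i in range(1, n_truth + 1))
    ppv = tp_pred / n_pred if n_pred else 0.0
    sens = detected / n_truth if n_truth else 0.0
    f1 = 2 * ppv * sens / (ppv + sens) if (ppv + sens) else 0.0
    return ppv, sens, f1

# Toy 1D example: two true lesions, two predicted lesions, one of which overlaps.
truth = np.array([1, 1, 0, 0, 1, 0, 0, 0])
pred = np.array([0, 1, 1, 0, 0, 0, 1, 0])
dice = dice_score(pred, truth)
ppv, sens, f1 = lesion_wise_scores(pred, truth)
```

The number of components returned by `ndimage.label` also gives a per-image lesion count under the same connected-component assumption.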
Figure 4 shows lesion frequency maps across phenotypes and sessions. In both the M0 (created from ground truth segmentations) and the M12 (created from predicted segmentations) maps, lesions were predominantly located at C2-C3 and C5 vertebral levels. PPMS participants demonstrated higher lesion count relative to RRMS and RIS participants.

Discussion

The median Dice scores were 0.55 and 0.53 for the 2D and 3D models, respectively, which is comparable to state-of-the-art performance for SC MS lesion segmentation². The developed models outperformed sct_deepseg_lesion, although sct_deepseg_lesion was trained on different contrasts. Contrary to a previous longitudinal study showing an increase in lesion count in RRMS¹⁰, our model predicted fewer lesions at M12 than were manually segmented at M0. This discrepancy is likely caused by the relatively low lesion detection sensitivity of the model (median sensitivity of 0.5), which can be explained by: (i) poorly defined lesions due to the highly anisotropic resolution, (ii) the aggregation of two different MRI contrasts for training a single model, and (iii) intra-rater variability in the generation of ground truth lesion masks¹¹. Similarly to previous studies^3,4, we found that lesions were more frequently located at the C2-C3 and C5 vertebral levels, with a higher lesion count in PPMS relative to RRMS and RIS. Further work is needed to validate the proposed models against M12 manual segmentations.

Conclusion

This work presents an automatic method for the segmentation of MS lesions from PSIR/STIR images. The method generalizes across phenotypes, sites, and sessions and provides results in agreement with previous studies. Subsequent development will further validate the generalizability of the model across additional MRI contrasts.

Acknowledgements

This research was supported by the Multiple Sclerosis Society of Canada, Biogen Idec, Brain Canada, and Roche. We acknowledge all study participants as well as CanProCo collaborators. Thanks to Nick Guenther and Mathieu Guay-Paquet for helping with dataset management. Funded by the Canada Research Chair in Quantitative Magnetic Resonance Imaging [CRC-2020-00179], the Canadian Institutes of Health Research [PJT-190258], the Canada Foundation for Innovation [32454, 34824], the Fonds de Recherche du Québec - Santé [322736, 324636], the Natural Sciences and Engineering Research Council of Canada [RGPIN-2019-07244], the Canada First Research Excellence Fund (IVADO and TransMedTech), the Courtois NeuroMod project, the Quebec BioImaging Network [5886, 35450], INSPIRED (Spinal Research, UK; Wings for Life, Austria; Craig H. Neilsen Foundation, USA), Mila - Tech Transfer Funding Program. JV has received funding from the European Union's Horizon Europe research and innovation programme under the Marie Skłodowska-Curie grant agreement No 101107932.

References

1. Cortese, R. & Ciccarelli, O. Clinical monitoring of multiple sclerosis should routinely include spinal cord imaging - Yes. Mult. Scler. 24, 1536–1537 (2018).

2. Gros, C. et al. Automatic segmentation of the spinal cord and intramedullary multiple sclerosis lesions with convolutional neural networks. Neuroimage 184, 901–915 (2019).

3. Kerbrat, A. et al. Multiple sclerosis lesions in motor tracts from brain to cervical cord: spatial distribution and correlation with disability. Brain 143, 2089–2105 (2020).

4. Eden, D. et al. Spatial distribution of multiple sclerosis lesions in the cervical spinal cord. Brain 142, 633–646 (2019).

5. Oh, J. et al. The Canadian prospective cohort study to understand progression in multiple sclerosis (CanProCo): rationale, aims, and study design. BMC Neurol. 21, 1–19 (2021).

6. Bédard, S. et al. Towards contrast-agnostic soft segmentation of the spinal cord. arXiv [eess.IV] (2023).

7. Azad, R., Rouhier, L. & Cohen-Adad, J. Stacked Hourglass Network with a Multi-level Attention Mechanism: Where to Look for Intervertebral Disc Labeling. in Machine Learning in Medical Imaging 406–415 (Springer International Publishing, 2021).

8. Isensee, F., Jaeger, P. F., Kohl, S. A. A., Petersen, J. & Maier-Hein, K. H. nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nat. Methods 18, 203–211 (2021).

9. De Leener, B. et al. PAM50: Unbiased multimodal template of the brainstem and spinal cord aligned with the ICBM152 space. Neuroimage 165, 170–179 (2018).

10. Zecca, C. et al. Relevance of asymptomatic spinal MRI lesions in patients with multiple sclerosis. Mult. Scler. 22, 782–791 (2016).

11. Walsh, R. et al. Expert Variability and Deep Learning Performance in Spinal Cord Lesion Segmentation for Multiple Sclerosis Patients. in 2023 IEEE 36th International Symposium on Computer-Based Medical Systems (CBMS) 463–470 (2023).

Figures

Figure 1: Comparison of lesion segmentation models across sites. For all metrics, higher values indicate better performance. PPVL (positive predictive value for lesions), SensL (lesion detection sensitivity), and F1 score (computed from PPVL and SensL) capture lesion-wise information, while the Dice score captures voxel-wise information. Both the 2D and 3D nnUNet models performed better than sct_deepseg_lesion.

Figure 2: Qualitative examples of lesion segmentation. Lesion segmentation results from sct_deepseg_lesion, the 3D nnUNet and the 2D nnUNet are overlaid on the sagittal and axial views for STIR and PSIR contrasts. The left panel shows a lesion at level C3-C4; the right panel shows a lesion at level C2-C3.

Figure 3: Distribution of lesion count per phenotype and time point. Lesion count is obtained from manual segmentation (for M0) and from the 2D nnUNet model (for M12). PPMS participants show a higher number of lesions relative to RRMS and RIS participants across all sessions. RIS = radiologically isolated syndrome, RRMS = relapsing-remitting MS, PPMS = primary progressive MS. Asterisks mark statistically significant differences according to Wilcoxon signed-rank tests (*p-value < 0.05).

Figure 4: Frequency maps of lesions in the cervical spinal cord across phenotypes. (A) Baseline (M0) map constructed from manually segmented lesions. (B) Map at follow-up (M12) built from the automatic lesion and spinal cord segmentation. The axial view shows an average of the lesion frequency across each vertebral level. The sagittal views show an average of the lesion frequency across sagittal slices. The grey matter contour is overlaid on the axial view. RIS = radiologically isolated syndrome, RRMS = relapsing-remitting MS, PPMS = primary progressive MS.

Proc. Intl. Soc. Mag. Reson. Med. 32 (2024)

DOI: https://doi.org/10.58530/2024/1395