Joshua R Astley1,2, Alberto M Biancardi1, Paul JC Hughes1, Laurie J Smith1, Helen Marshall1, Grace T Mussell1, James Eaden1, Nicholas D Weatherley1, Guilhem J Collier1, Jim M Wild1, and Bilal A Tahir1,2
1POLARIS, Department of Infection, Immunity & Cardiovascular Disease, University of Sheffield, Sheffield, United Kingdom, 2Department of Oncology and Metabolism, University of Sheffield, Sheffield, United Kingdom
Synopsis
Deep learning (DL)-based segmentation was
conducted on a total of 431 3He and 129Xe 3D ventilation images
using several training paradigms.
Combined 3He and 129Xe training showed a significant
improvement over all other DL methods. In the majority of DL models, no
significant difference was observed between 3He and 129Xe
testing data. Results suggest that 3He and 129Xe images
share important features that allow combined 3He and 129Xe
DL models to provide superior segmentations to single-gas models. In
addition, DL generated segmentations faster than state-of-the-art model-based
solutions and without requiring a proton MRI.
Introduction
Hyperpolarized gas MRI enables visualization of
regional lung ventilation with high spatial and temporal resolution1.
Quantitative biomarkers derived from this modality, including the ventilated
defect percentage, provide insights into pulmonary pathologies that are currently
not possible with alternative techniques2. To facilitate the
computation of such biomarkers, segmentation of ventilated lung is required. Current
approaches such as multichannel spatial fuzzy c-means (SFCM) thresholding3
are semi-automatic, require a corresponding proton image aligned with the
ventilation image and require significant time to manually edit segmentations.
Recent research in deep learning (DL) has shown promising results for numerous image
segmentation problems4. Here, we evaluate several DL methods for the
automatic segmentation of hyperpolarized gas MRI. We also investigate the
effect of the choice of noble gas (3He or 129Xe) on DL performance.
Methods
Imaging data:
All subjects underwent MRI at 1.5T. Flexible
quadrature radiofrequency coils were employed for transmission and reception of
MR signals at the Larmor frequencies of 3He and 129Xe. The data
comprised 431 3D hyperpolarized gas images, acquired with either 3He (n=173)
or 129Xe (n=258), from healthy subjects and patients with pulmonary
pathologies (see Figure 1 for details).
DL segmentation:
DL-based segmentation was performed using
NiftyNet5. A VNet architecture was used with a PReLU activation
function4,6. Three sets of experiments were performed to train a convolutional
neural network: (1) the model was trained on either 129Xe or 3He
images; (2) transfer learning was applied to the pre-trained models in (1) to
fine-tune the network for the opposite gas images7; (3) the model
was trained on the combined 3He and 129Xe data. 10% of
the training data was used for internal validation. Each trained model was evaluated
on a combined testing dataset of 3He and 129Xe images
(n=33). Whilst same-patient longitudinal ventilation image data was employed
during training, no such patient data was included in the testing phase,
representing an independent validation cohort. The experiments are shown in
Figure 1.
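The partitioning described above can be illustrated with a minimal sketch. It follows one reading of the text: testing patients are held out entirely, and 10% of the remaining scans are reserved for internal validation. The record fields `patient` and `gas`, the function names and the random seed are hypothetical and are not taken from the NiftyNet configuration used in this work.

```python
import random


def partition_scans(scans, test_patients, val_fraction=0.1, seed=0):
    """Split scans into train/val/test so that no testing patient is trained on.

    `scans` is a list of dicts with hypothetical keys 'patient' and 'gas'.
    """
    test = [s for s in scans if s["patient"] in test_patients]
    remaining = [s for s in scans if s["patient"] not in test_patients]

    # Shuffle reproducibly, then reserve a fraction for internal validation.
    rng = random.Random(seed)
    rng.shuffle(remaining)
    n_val = round(val_fraction * len(remaining))

    return {"train": remaining[n_val:], "val": remaining[:n_val], "test": test}


def select_gas(scans, gas):
    """Keep only the scans acquired with the given gas ('3He' or '129Xe')."""
    return [s for s in scans if s["gas"] == gas]
```

Under these assumptions, experiment (1) would train on `select_gas(parts["train"], "129Xe")` or `select_gas(parts["train"], "3He")`, experiment (2) would fine-tune the resulting model on the opposite gas, and experiment (3) would train on all of `parts["train"]`.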
Data analysis:
To evaluate
segmentation accuracy, Dice Similarity Coefficients (DSCs) were computed
between the DL-based ventilation masks and those generated by expert observers.
For a random subset of 13 of the testing images, DSC values were compared with
those of multichannel SFCM3. Paired t-tests were employed to assess differences between
methods. To investigate the
effect of noble gas on DL segmentation, the testing set was further split into 3He
and 129Xe and analysed by Mann-Whitney tests.
Results
Figure 2 shows
example segmentations from all DL methods for a range of diseases and healthy
participants. Transfer learning exhibited improved DSCs only when the
pre-trained 3He model was fine-tuned with 129Xe data, compared
to training on 129Xe only (p<0.0001). Combined training on 129Xe
and 3He yielded statistically significant improvements over all
other DL methods (p<0.05). A full breakdown of results is shown in Figure 3.
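For reference, the DSC between two binary masks A and B is 2|A∩B| / (|A| + |B|). A minimal NumPy sketch of this metric follows; the function name and the convention of returning 1.0 for two empty masks are assumptions made for illustration, not details reported in this work.

```python
import numpy as np


def dice_similarity(mask_a, mask_b):
    """Dice Similarity Coefficient between two binary masks of equal shape."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)

    total = a.sum() + b.sum()
    if total == 0:
        # Assumed convention: two empty masks are treated as perfect agreement.
        return 1.0

    overlap = np.logical_and(a, b).sum()
    return 2.0 * overlap / total
```

The same function applies unchanged to 3D volumes, since the sums run over all voxels regardless of array dimensionality.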
A further
comparison was conducted between multichannel SFCM3 and the combined
training on 129Xe and 3He DL model on a subset of 13 testing
images. No significant difference was observed between methods (p=0.842) (see
Figure 4).
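The two kinds of statistical comparison described in the Methods can be sketched with SciPy: a paired t-test for two methods scored on the same testing images, and a Mann-Whitney test for the unpaired 3He and 129Xe groups. The function names are illustrative, and the choice of a two-sided alternative is an assumption, as the sidedness of the tests is not stated.

```python
from scipy import stats


def compare_methods_paired(dsc_method_a, dsc_method_b):
    """Paired t-test p-value for two methods evaluated on the same images.

    Both sequences must list DSC values in the same image order.
    """
    result = stats.ttest_rel(dsc_method_a, dsc_method_b)
    return result.pvalue


def compare_gases_unpaired(dsc_he, dsc_xe):
    """Two-sided Mann-Whitney p-value for DSCs from 3He versus 129Xe images."""
    result = stats.mannwhitneyu(dsc_he, dsc_xe, alternative="two-sided")
    return result.pvalue
```

The paired test is appropriate only when each image is segmented by both methods; the 3He and 129Xe testing images are different scans, so an unpaired test is used for that comparison.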
Figure 5 exhibits
the differences between 3He and 129Xe testing images for
each DL method. The majority of DL methods demonstrated no significant differences
between 3He and 129Xe; a significant difference in
testing performance was observed for two of the methods (training on 129Xe alone, and
training on 3He followed by transfer learning on 129Xe).
Discussion
The highest performing DL method
evaluated incorporated both 3He and 129Xe training data;
the larger and more variable training set likely reduces overfitting
and hence increases the generalisability of the model. Regarding the
effect of the gas, we found significant differences in DSCs between 3He and 129Xe testing images for two models, indicating
that whilst both gases provide clinically comparable ventilation distributions8,
DL segmentation requires both 3He
and 129Xe images during training to generate a robust, generalizable
model that is agnostic to gas. One limitation is that the 3He
and 129Xe datasets differed in both the number of scans and the
patients scanned, which may have contributed to differences in segmentation performance.
Analysis
of multichannel SFCM and the highest performing DL method demonstrated no
significant differences between the methods. Multichannel SFCM3 requires
a corresponding, aligned proton image to generate a ventilation mask; this is
not the case in the DL model as only the ventilation image is required. The DL
method has a substantially shorter run time (approximately 7 seconds per 3D
image on a GPU) than multichannel SFCM3 (5 minutes). By visual
inspection (see Figure 4), it also erroneously segmented less of the trachea and bronchi,
which should reduce the time taken for manual editing.
Conclusion
In this work, DL segmentation methods were capable of
segmenting hyperpolarized gas MRI from both 3He and 129Xe with accuracy not significantly different from that of current model-based
segmentation methods. DL methods do not require a registered proton image and
are expected to dramatically reduce the time taken to generate segmentations
and manually edit ventilated masks. It was shown that combined learning on 3He
and 129Xe yields significant improvements in DSC over all other DL methods
investigated.
Acknowledgements
This
work was supported by Yorkshire Cancer Research, Weston Park Cancer Charity,
National Institute for Health Research, the Medical Research Council and GlaxoSmithKline
(PJCH:BIDS3000032592).
References
1. Fain S, Korosec F, Holmes J, et al. Functional lung imaging using hyperpolarized gas MRI. J. Magn. Reson. Imaging. 2007;25:910-923.
2. Woodhouse N, Wild J, Paley M, et al. Combined helium-3/proton magnetic resonance imaging measurement of ventilated lung volumes in smokers compared to never-smokers. J. Magn. Reson. Imaging. 2005;21:365-369.
3. Biancardi AM, Acunzo L, Marshall H, et al. A paired approach to the segmentation of proton and hyperpolarized gas MR images of the lungs. ISMRM 2018.
4. Bakator M, Radosav D. Deep Learning and Medical Diagnosis: A Review of Literature. Multimodal Technologies Interact. 2018;2:47.
5. Gibson E, Li W, Sudre C, et al. NiftyNet: a deep-learning platform for medical imaging. Computer Methods and Programs in Biomedicine. 2018;158:113-122.
6. Tustison N, Avants B, Lin Z, et al. Convolutional Neural Networks with Template-Based Data Augmentation for Functional Lung Image Quantification. Academic Radiology. 2019;26(3):412-423.
7. Zha W, Fain S, Schiebler M, et al. Deep convolutional neural networks with multiplane consensus labeling for lung function quantification using UTE proton MRI. J. Magn. Reson. Imaging. 2019;50:1169-1181.
8. Stewart N, Chan H, Hughes P, et al. Comparison of 3He and 129Xe MRI for evaluation of lung microstructure and ventilation at 1.5T. J. Magn. Reson. Imaging. 2018;48:632-642.