3720

SSIMPLE: Scan-SpecIfic parameter MaPping from contrast weighted images with self-supervised LEarning

Fatih Dogangun¹, Yohan Jun²,³, and Berkin Bilgic²,³
¹Electrical and Electronics Engineering, Bogazici University, Istanbul, Turkey, ²Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, MA, United States, ³Harvard Medical School, Boston, MA, United States

Synopsis

Keywords: Quantitative Imaging, self-supervised learning, parameter mapping

Motivation: There is rich and complementary information in clinical images, which may lend itself to the estimation of relaxometry parameters.

Goal(s): To develop a self-supervised network that can estimate T₁, T₂, and PD maps from contrast-weighted images with high fidelity.

Approach: We developed a scan-specific self-supervised model (SSIMPLE) that harnesses the Bloch equations and estimates parameter maps from multi-contrast images without the need for a training dataset or additional constraints.

Results: High-fidelity T₁, T₂, and PD maps with minor biases of 4.5%, 11.76%, and 15.45%, respectively, were obtained using the proposed self-supervised network.

Impact: Using the developed scan-specific self-supervised neural network, SSIMPLE, high-fidelity parameter maps can be estimated from clinically routine contrast-weighted images without the need for an external training dataset or additional constraints.

Introduction

Current quantitative MRI is limited by the need for specialized sequences, long acquisition times, and, for supervised deep learning, ground-truth quantitative maps. Self-supervised learning methods have been developed for parameter mapping from contrast-weighted images. Nevertheless, their training loss functions require additional constraints whose regularization weights need to be determined manually¹. We propose a scan-specific self-supervised learning method, SSIMPLE, which needs neither external training data nor additional constraints, to obtain T₁, T₂, and PD maps from clinically routine weighted images.
Data/code: https://anonymous.4open.science/r/SSIMPLE-0D71/README.md

Neural Network Design

A 2D U-net with 5 depths and 64 kernels at the first depth was used. Convolutional layers were followed by batch normalization, activation (ReLU), and dropout with 0.05 rate. The network receives contrast-weighted images as four-channel input. It yields a three-channel output, including T₁, T₂, and PD maps. Contrast-weighted images were synthesized using the estimated parameter maps via Bloch equations:
$$\begin{aligned}
T1w &= PD\,\frac{\sin(\alpha)\left(1-e^{-TR/T_{1}}\right)}{1-\cos(\alpha)\,e^{-TR/T_{1}}}\,e^{-TE/T_{2}} \\
T2w &= PD\left(1-e^{-TR/T_{1}}\right)e^{-TE/T_{2}} \\
PDw &= PD\left(1-e^{-TR/T_{1}}\right)e^{-TE/T_{2}} \\
MPRAGE &= PD\left(1-\frac{2e^{-TE/T_{1}}}{1+e^{-TR/T_{1}}}\right) \\
FLAIR &= PD\left(\frac{1-2e^{-TI/T_{1}}+e^{-TR/T_{1}}}{1+e^{-TR/T_{1}}\cos(\alpha)}\right)e^{-TE/T_{2}}
\end{aligned}$$
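As a rough illustration of these signal models, the NumPy sketch below evaluates the T₁-weighted, spin-echo (T₂-weighted and PD-weighted), and FLAIR expressions for a single voxel. The function names, the tissue values, and the choice of 75° as the T₁-weighted flip angle are assumptions made for this example; it is not the authors' implementation.

```python
import numpy as np

def t1w_signal(pd, t1, t2, tr, te, alpha_deg):
    """T1-weighted signal as written in the equations above (times in ms)."""
    a = np.deg2rad(alpha_deg)
    e1 = np.exp(-tr / t1)
    return pd * np.sin(a) * (1 - e1) / (1 - np.cos(a) * e1) * np.exp(-te / t2)

def spin_echo_signal(pd, t1, t2, tr, te):
    """Shared form of the T2w and PDw expressions; only TR and TE differ."""
    return pd * (1 - np.exp(-tr / t1)) * np.exp(-te / t2)

def flair_signal(pd, t1, t2, tr, te, ti, alpha_deg):
    """FLAIR signal as written in the equations above (times in ms)."""
    a = np.deg2rad(alpha_deg)
    e_tr = np.exp(-tr / t1)
    recovery = (1 - 2 * np.exp(-ti / t1) + e_tr) / (1 + e_tr * np.cos(a))
    return pd * recovery * np.exp(-te / t2)

# Hypothetical tissue values (ms), chosen only for illustration.
wm = dict(pd=0.7, t1=800.0, t2=70.0)
csf = dict(pd=1.0, t1=4000.0, t2=2000.0)

# Sequence timings taken from the in vivo protocol in this abstract.
wm_t1w = t1w_signal(**wm, tr=500.0, te=13.0, alpha_deg=75.0)
wm_t2w = spin_echo_signal(**wm, tr=4000.0, te=85.0)
wm_pdw = spin_echo_signal(**wm, tr=4000.0, te=12.0)
wm_flair = flair_signal(**wm, tr=8000.0, te=85.0, ti=2500.0, alpha_deg=180.0)
csf_flair = flair_signal(**csf, tr=8000.0, te=85.0, ti=2500.0, alpha_deg=180.0)
```

With these assumed tissue values, the FLAIR expression gives a lower signal for the CSF-like voxel than for the white-matter-like voxel, which is the expected fluid-attenuation behavior.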
As detailed in Figure 1, the loss function calculates the mean squared error (MSE) between the input and the synthesized contrast-weighted images in a self-supervised manner. Since the input images are potentially scaled differently, a trainable parameter was defined for each input to scale it appropriately.
Firstly, the model was trained with fixed scale parameters, followed by joint training with scale variables. The training took 2.4 hours for the synthetic and 1.7 hours for in vivo data.
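The role of the per-contrast scale parameters can be sketched in NumPy as follows. The sketch assumes that each scale multiplies its input image before the MSE is computed, which is one reading of the description above. It also replaces gradient-based training of the scales with a closed-form least-squares fit, purely to show that a mismatch in global scaling can be absorbed by one scalar per contrast. All names and the toy data are hypothetical.

```python
import numpy as np

def self_supervised_loss(inputs, synthesized, scales):
    """MSE between scaled input images and synthesized images.

    inputs, synthesized: arrays of shape (n_contrasts, H, W)
    scales: array of shape (n_contrasts,), one scalar per contrast
    """
    residual = scales[:, None, None] * inputs - synthesized
    return np.mean(residual ** 2)

def least_squares_scales(inputs, synthesized):
    """Per-contrast scale minimizing ||s * input - synthesized||^2.

    Illustrative stand-in for learning the scales by backpropagation.
    """
    num = np.sum(inputs * synthesized, axis=(1, 2))
    den = np.sum(inputs ** 2, axis=(1, 2))
    return num / den

# Toy data: four contrasts whose inputs are arbitrarily rescaled
# versions of the synthesized images.
rng = np.random.default_rng(0)
synthesized = rng.uniform(0.1, 1.0, size=(4, 8, 8))
true_scales = np.array([0.5, 2.0, 1.5, 3.0])
inputs = synthesized / true_scales[:, None, None]

loss_fixed = self_supervised_loss(inputs, synthesized, np.ones(4))
fitted_scales = least_squares_scales(inputs, synthesized)
loss_fitted = self_supervised_loss(inputs, synthesized, fitted_scales)
```

In this toy setting the loss with all scales fixed to one is dominated by the scaling mismatch, while the fitted scales recover `true_scales` and drive the loss to numerical zero.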

Data Acquisition

Synthetic data: Reference T₁, T₂, and PD maps are derived from an in vivo 3D-QALAS² dataset (FOV=240x240x202mm³, Size=208x208x176, TE=2.29ms, TR=4.5ms, Bandwidth=330Hz/px, scan time=8:24min). T₁-weighted, T₂-weighted, PD-weighted, and MPRAGE images were synthesized using Bloch equations. Each volume was normalized to have different scales.
In vivo data: T₁-w, T₂-w, PD-w, and FLAIR images were acquired with FOV=220x181mm², Matrix=192x158, Bandwidth=130Hz/px, and the following parameters:

T₁-weighted: TE=13ms, TR=500ms, Flip Angles=75°, 160°, scan time=2:44min;
T₂-weighted: TE=85ms, TR=4000ms, Flip Angle=180°, scan time=1:10min;
PD-weighted: TE=12ms, TR=4000ms, Flip Angle=180°, scan time=1:54min;
FLAIR: TE=85ms, TR=8000ms, TI=2500ms, Flip Angle=180°, scan time=4:02min.

Four inversion times were used to estimate a reference T₁ map³, and the reference T₂ map was obtained by using VARPRO⁴ with four TEs from a single-echo spin-echo sequence.
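The idea behind a variable-projection (VARPRO) fit for a mono-exponential T₂ decay can be sketched for a single voxel as below. For each candidate T₂ on a grid, the optimal amplitude has a closed form, so only T₂ is searched. The echo times, grid, and function name are hypothetical (the four TEs are not listed in this abstract), and this is a generic sketch rather than the exact procedure used for the reference map.

```python
import numpy as np

def varpro_t2_fit(signal, tes, t2_grid):
    """Grid-search variable projection for S(TE) = A * exp(-TE / T2).

    signal: (n_te,) measured magnitudes
    tes: (n_te,) echo times in ms
    t2_grid: (n_grid,) candidate T2 values in ms
    Returns the best T2 candidate and its closed-form amplitude.
    """
    # One decay curve per candidate T2: shape (n_grid, n_te).
    basis = np.exp(-tes[None, :] / t2_grid[:, None])
    # Linear least-squares amplitude for each candidate.
    amplitude = basis @ signal / np.sum(basis ** 2, axis=1)
    # Residual energy after projecting out the amplitude.
    residual = np.sum((signal[None, :] - amplitude[:, None] * basis) ** 2, axis=1)
    best = np.argmin(residual)
    return t2_grid[best], amplitude[best]

# Hypothetical echo times (ms) and a noise-free test voxel.
tes = np.array([20.0, 40.0, 80.0, 160.0])
t2_grid = np.arange(10.0, 300.0, 1.0)
signal = 1.2 * np.exp(-tes / 75.0)

t2_hat, amp_hat = varpro_t2_fit(signal, tes, t2_grid)
```

Because the amplitude is eliminated analytically, the search is one-dimensional, which keeps a voxel-wise fit cheap even with a fine T₂ grid.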

Results

Figure 2: the proposed self-supervised SSIMPLE produced high-fidelity parameter maps for the synthetic data. The RMSE values are 25.89%, 6.88%, and 10.26% for the T₁, T₂, and PD maps, excluding CSF, respectively.
Figure 3: the proposed self-supervised SSIMPLE estimated high-quality quantitative maps with minor biases of 4.5%, 11.76%, and 15.45% for the T₁, T₂, and PD maps, respectively. Bland-Altman plots were generated using measurements from white matter and gray matter.
Figure 4: synthesized contrast-weighted images using the estimated parameter maps via SSIMPLE have higher fidelity compared to synthesized contrast-weighted images using Bloch equations directly on the reference maps estimated from multi-inversion and multi-echo acquisitions.
Figure 5: additional exemplar slices of estimated quantitative maps via SSIMPLE are shown for in vivo data.
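The error metrics reported above can be sketched as follows. The abstract does not state how the percentage RMSE is normalized, so this example assumes normalization by the root-sum-of-squares of the reference inside a CSF-excluding mask. The Bland-Altman summary uses the conventional mean difference and 1.96 standard-deviation limits of agreement. The toy values and the T₁ threshold used to build the mask are hypothetical.

```python
import numpy as np

def percent_rmse(estimate, reference, mask):
    """Percentage RMSE inside a mask, normalized by the reference energy.

    The normalization is an assumption; the abstract does not specify it.
    """
    diff = estimate[mask] - reference[mask]
    return 100.0 * np.sqrt(np.sum(diff ** 2) / np.sum(reference[mask] ** 2))

def bland_altman(estimate, reference):
    """Mean difference (bias) and 95% limits of agreement."""
    diff = estimate - reference
    bias = diff.mean()
    sd = diff.std(ddof=1)
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)

# Toy T1 values (ms): three tissue voxels and one CSF-like voxel.
reference = np.array([800.0, 1000.0, 1200.0, 4000.0])
estimate = np.array([840.0, 1050.0, 1260.0, 3000.0])
mask = reference < 3000.0  # hypothetical CSF exclusion by T1 threshold

rmse_pct = percent_rmse(estimate, reference, mask)
bias, (loa_low, loa_high) = bland_altman(estimate[mask], reference[mask])
```

In this toy case every masked voxel is overestimated by 5%, so `rmse_pct` is 5 and the Bland-Altman bias is 50 ms; the poorly estimated CSF-like voxel does not affect either number because it is masked out.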

Discussion and Conclusion

High-fidelity T₁, T₂, and PD maps were obtained using the proposed self-supervised learning method, SSIMPLE. The need for additional constraints¹ (a total variation loss on the M₀ map, a loss constraining the mean value of M₀, and losses constraining T₁ and T₂ values in the CSF) and for parameters to weight these constraints is obviated by using trainable scale parameters, which are estimated jointly with the parameter maps during training.
As demonstrated in Figure 4, high-fidelity contrast-weighted images cannot be directly synthesized with Bloch equations and reference parameter maps. This is likely because the maps and Bloch equations are insufficient to account for additional MR physics, such as magnetization transfer effects, which are known to contribute to the contrast of TSE images (e.g., FLAIR) significantly. In contrast, our self-supervised loss function produces highly similar synthetic images by explicitly seeking parameter maps consistent with the input contrast-weighted images. However, this results in minor biases: 4.5% in T₁, 11.76% in T₂, and 15.45% in PD maps.
In conclusion, we proposed the self-supervised method SSIMPLE, which requires neither external reference data nor additional constraints, enabling high-fidelity parameter mapping from standard clinical images.

Acknowledgements

This work was supported by research grants NIH R01 EB028797, P41 EB030006, U01 EB026996, R03 EB031175, R01 EB032378, and UG3 EB034875, and by NVIDIA Corporation, which provided computing support.

References

1. Qiu S, Christodoulou AG, Sati P, Xie Y, Li D. Physics-guided self-supervised learning for retrospective T1 and T2 mapping from conventional weighted brain MRI. In: ISMRM 2023; p. 2168.

2. Kvernby S, Warntjes MJB, Haraldsson H, Carlhäll C-J, Engvall J, Ebbers T. Simultaneous three-dimensional myocardial T1 and T2 mapping in one breath hold with 3D-QALAS. J. Cardiovasc. Magn. Reson. 2014;16:102.

3. Barral JK, Gudmundson E, Stikov N, Etezadi-Amoli M, Stoica P, Nishimura DG. A robust methodology for in vivo T1 mapping. Magn. Reson. Med. 2010;64:1057–1067.

4. Trzasko JD, Mostardi PM, Riederer SJ, Manduca A. Estimating T1 from multichannel variable flip angle SPGR sequences. Magn. Reson. Med. 2013;69:1787–1794.

Figures

Figure 1: Overall schematic of the proposed scan-specific self-supervised learning method (SSIMPLE) for parameter mapping from contrast-weighted images. The CNN takes contrast-weighted images as a four-channel input, and it estimates T₁, T₂, and PD maps as a three-channel output. Contrast-weighted images were synthesized using the parameter maps estimated by the network via Bloch Equations. The scale variables were trained jointly with the network. The loss calculates the difference between the input and the synthesized contrast-weighted images.

Figure 2: (a) T₁, (b) T₂, and (c) PD maps estimated by the proposed scan-specific self-supervised learning method (SSIMPLE), the reference parameter maps for the synthetic data, and (d) input contrast-weighted images. Root mean square error (RMSE) was computed between the estimated and the reference parameter maps, excluding the CSF region. The difference images show the difference between the estimated and the reference parameter maps.

Figure 3: (a) T₁, (b) T₂, and (c) PD maps estimated by the proposed scan-specific self-supervised learning method (SSIMPLE), the reference parameter maps, and corresponding Bland-Altman plots for the in vivo data. Measurements were taken excluding the CSF region for Bland-Altman plots.

Figure 4: (a) T₁-weighted, (b) T₂-weighted, (c) PD-weighted, and (d) FLAIR images synthesized using both the estimated parameter maps via the proposed scan-specific self-supervised learning method (SSIMPLE) and the reference parameter maps, shown with the reference contrast-weighted images. Root mean square error (RMSE) was calculated between the synthesized and the reference contrast-weighted images.

Figure 5: Additional exemplar slices of estimated (a) T₁, (b) T₂, and (c) PD maps via the proposed scan-specific self-supervised learning method (SSIMPLE) for in vivo data.

Proc. Intl. Soc. Mag. Reson. Med. 32 (2024)


DOI: https://doi.org/10.58530/2024/3720