1259

Self-Supervised SUper-Resolution ASL Enhancement based on Conditional Diffusion Models (SURED)

Yunzhi Xu¹, Liangchen Shi¹, Jiaxin Zheng¹, Jiaxin Li¹, Yu Zeng¹, Weiying Dai², David Alsop³, and Li Zhao¹
¹College of Biomedical Engineering & Instrument Science, Zhejiang University, Hangzhou, China, ²Department of Computer Science, State University of New York, Binghamton, NY, United States, ³Radiology, Beth Israel Deaconess Medical Center and Harvard Medical School, Boston, MA, United States

Synopsis

Keywords: Arterial spin labelling, Super-resolution, Conditional diffusion model

Motivation: Arterial spin labeling (ASL) MRI is a non-invasive technique used for measuring perfusion. However, the resolution of ASL is limited by its low SNR.

Goal(s): To propose an ASL super-resolution method based on a self-supervised training strategy and a conditional diffusion model.

Approach: Synthetic high-resolution ASL images were generated from paired T1w images and low-resolution ASL images. A modified conditional diffusion model was trained to simultaneously achieve resolution enhancement and denoising. The proposed model was tested on simulated and volunteer images.

Results: The proposed network enhanced image details, improved SNR, and preserved the original contrast of conventional low-resolution ASL images.

Impact: The proposed method enhanced ASL images without requiring high-resolution ASL for training. It enables super-resolution ASL images from 4-minute scans to approach those acquired in 17 min.

INTRODUCTION

ASL is a non-invasive perfusion imaging technique¹ that plays a crucial role in clinical practice and research. However, ASL suffers from low SNR, resulting in a low resolution of 3-4 mm and a long scan time². In addition to advancements in ultra-high field ASL³, various image enhancement methods have been proposed to improve ASL⁴. The SLIce Dithered Enhanced Resolution (SLIDER) technique has been applied to super-resolution (SR) reconstruction of ASL². Structural MR images have also been utilized by leveraging image self-similarity⁵,⁶. Deep learning methods have been adapted to improve ASL images as well. For instance, supervised DLASL denoising⁷ and two-stage multi-loss SR networks⁸ have shown promise in improving ASL. However, these approaches require high-quality ASL images for model training, which presents a significant challenge due to motion artifacts and long scan times. As an alternative approach, deepASL⁹ used T1w images and fixed CBF values to generate training data, whose distribution differs from that of real data. In this study, we proposed a SUper-Resolution Enhancement method based on a conditional Diffusion model (SURED). A self-supervised strategy was proposed in which synthetic high-resolution ASL was generated from low-resolution ASL and T1w images. A modified 3D conditional diffusion model was trained to enhance ASL. Our preliminary results demonstrated superior performance of the proposed model on simulated data and volunteer scans compared with conventional methods.

METHODS

Dataset: The external dataset was selected from the ADNI3 database (adni.loni.usc.edu, RRID:SCR_003007) and included 170 subjects with T1w (1 mm isotropic) and ASL (4 mm isotropic) images. The internal dataset was acquired from three subjects scanned on a 3T GE Architect scanner and included high-resolution ASL (2.5x2.5x4 mm³, 17 min), low-resolution ASL (3.85x3.85x4 mm³, 4:31 min), and T1w images (1 mm isotropic).

Self-supervised data generator: The generator fused intensity information from low-resolution ASL with structural details from T1w images, Fig. 1a. First, ADNI’s T1w images were segmented into gray and white matter masks. The white matter mask was scaled by a factor of 0.3. The Synthetic High-Resolution ASL ($$$S_{H}$$$) was generated by multiplying the masks with the low-resolution ASL ($$$R_{L}$$$) of the same subject. The $$$S_{H}$$$ image was smoothed slightly using a Gaussian kernel (sigma=1). Second, the Synthetic Low-Resolution ASL ($$$S_{L}$$$) was generated by truncating the k-space of $$$S_{H}$$$ with a Gaussian window (FWHM = 1/4 of k-space). Finally, realistic low-resolution noise derived from the background of the acquired ASL was upsampled and added to $$$S_{L}$$$.
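The generator steps above can be sketched in NumPy as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function names (`kspace_gaussian_filter`, `synthesize_pair`) are hypothetical, the masks and the low-resolution ASL are assumed to be already resampled to a common grid, sigma=1 is assumed to be in voxels, and the background-derived noise is replaced by white Gaussian noise of a user-chosen standard deviation.

```python
import numpy as np


def kspace_gaussian_filter(volume, sigma_k):
    """Multiply the centered k-space of `volume` by a separable Gaussian.

    sigma_k is given in cycles per voxel, so the full k-space width is 1.
    """
    kspace = np.fft.fftshift(np.fft.fftn(volume))
    for axis, n in enumerate(volume.shape):
        k = np.fft.fftshift(np.fft.fftfreq(n))  # frequencies in [-0.5, 0.5)
        profile = np.exp(-0.5 * (k / sigma_k) ** 2)
        shape = [1] * volume.ndim
        shape[axis] = n
        kspace = kspace * profile.reshape(shape)
    return np.real(np.fft.ifftn(np.fft.ifftshift(kspace)))


def synthesize_pair(gm_mask, wm_mask, asl_low, noise_std, rng, wm_scale=0.3):
    """Return a (S_H, S_L) training pair from tissue masks and low-res ASL."""
    # S_H: tissue masks (white matter scaled by 0.3) times the low-res ASL.
    s_h = (gm_mask + wm_scale * wm_mask) * asl_low
    # Slight image-space smoothing with sigma_x = 1 voxel (assumed unit),
    # applied as a k-space Gaussian with sigma_k = 1 / (2 * pi * sigma_x).
    s_h = kspace_gaussian_filter(s_h, 1.0 / (2.0 * np.pi * 1.0))
    # S_L: Gaussian k-space window with FWHM = 1/4 of the k-space width.
    fwhm_to_sigma = 1.0 / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    s_l = kspace_gaussian_filter(s_h, 0.25 * fwhm_to_sigma)
    # Placeholder for the background-derived noise described in the text:
    # white Gaussian noise with an assumed standard deviation.
    s_l = s_l + noise_std * rng.standard_normal(s_l.shape)
    return s_h, s_l
```

Because the Gaussian window equals 1 at the k-space center, the mean signal of $$$S_{L}$$$ matches that of $$$S_{H}$$$ when no noise is added; only the high-frequency detail is attenuated.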

3D conditional diffusion model: The model comprised a diffusion process and a neural network $$$f_{θ}$$$, Fig. 1b. The $$$f_{θ}$$$ takes the acquired ASL $$$x_{e}$$$ and the noise-corrupted target $$$\widetilde{y}$$$ as inputs and aims to restore the noise-free target $$$y_{0}$$$ ¹⁰. The $$$f_{θ}$$$ is implemented as a 3D U-Net, with inputs $$$x_{e}$$$ and $$$y_{t}$$$ concatenated along the channel dimension, where $$$y_{t}$$$ is the noise-corrupted target $$$\widetilde{y}$$$ at diffusion step t. During model training, $$$x_{e}$$$ is $$$S_{L}$$$, $$$y_{0}$$$ is $$$S_{H}$$$, and $$$y_{t}$$$ results from the forward diffusion process of $$$y_{0}$$$ (T=1000). During inference, $$$x_{e}$$$ ($$$R_{L}$$$) and Gaussian noise generated from $$$R_{L}$$$ were the inputs, and the enhanced high-resolution ASL was the output. The model was trained on synthesized data and tested on acquired low-resolution ASL.
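The forward diffusion of $$$y_{0}$$$ and the channel-wise concatenation of the network inputs can be illustrated with a short NumPy sketch. The noise schedule is not specified in this abstract, so a linear beta schedule (as commonly used in denoising diffusion models) is assumed here purely for illustration; the function names are hypothetical, and the 3D U-Net itself is omitted.

```python
import numpy as np


def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Assumed linear noise schedule with T = num_steps diffusion steps."""
    return np.linspace(beta_start, beta_end, num_steps)


def cumulative_alpha(betas):
    """Cumulative product of (1 - beta_t), often written alpha_bar_t."""
    return np.cumprod(1.0 - betas)


def forward_diffuse(y0, t, alpha_bar, rng):
    """Sample y_t from y_0 in closed form; also return the noise that was added."""
    eps = rng.standard_normal(y0.shape)
    y_t = np.sqrt(alpha_bar[t]) * y0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return y_t, eps


def network_input(x_e, y_t):
    """Stack the conditioning image x_e and the noisy target y_t as channels."""
    return np.stack([x_e, y_t], axis=0)  # shape: (2, depth, height, width)
```

During training, `x_e` would be $$$S_{L}$$$ and `y0` would be $$$S_{H}$$$, with `t` drawn at random for each sample. At the last step the signal term is almost fully suppressed, so $$$y_{t}$$$ is close to pure Gaussian noise.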

The proposed method was compared with BM3D, a state-of-the-art denoising technique, and a non-local (NL) patch-based SR method⁶ using SSIM, PSNR, and Mutual Information (MI). For the synthetic data, 150 subjects were used for training and 20 for testing, where the ground truth was the synthesized $$$S_{H}$$$. The model was then tested on the acquired real ASL from the internal scans.
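As a rough illustration of two of the metrics named above, PSNR and mutual information can be computed in NumPy as sketched below. The exact definitions used in this work (data range for PSNR, number of histogram bins for MI, any masking) are not stated, so the choices here are assumptions and the function names are hypothetical. SSIM is omitted for brevity; one commonly used implementation is `structural_similarity` in scikit-image.

```python
import numpy as np


def psnr(reference, estimate, data_range=None):
    """Peak signal-to-noise ratio in dB, relative to the reference image."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    if data_range is None:
        # Assumed convention: dynamic range of the reference image.
        data_range = reference.max() - reference.min()
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(data_range ** 2 / mse)


def mutual_information(a, b, bins=32):
    """Histogram-based mutual information (in nats) between two images."""
    joint, _, _ = np.histogram2d(np.ravel(a), np.ravel(b), bins=bins)
    p_ab = joint / joint.sum()
    p_a = p_ab.sum(axis=1, keepdims=True)  # marginal of a, shape (bins, 1)
    p_b = p_ab.sum(axis=0, keepdims=True)  # marginal of b, shape (1, bins)
    nonzero = p_ab > 0
    return float(np.sum(p_ab[nonzero] * np.log(p_ab[nonzero] / (p_a @ p_b)[nonzero])))
```

Both functions take the ground truth (the synthesized $$$S_{H}$$$ or the acquired high-resolution ASL) as the first argument and the enhanced image as the second.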

RESULTS

On the synthetic test data (20 cases), the proposed method outperformed both BM3D and NL patch in SSIM by 12%, PSNR by 33%, and MI by 30%, Figure 2. Compared to the BM3D and NL patch methods, the proposed method showed a high degree of similarity to the ground truth on the simulation data, Fig. 3a, and provided detail enhancement and denoising on low-resolution ASL images from ADNI, Fig. 3b. Importantly, on the internal data it showed a high degree of similarity to the acquired high-resolution ASL while preserving the original contrast (zoom-in images), Fig. 3d. The proposed method consistently outperformed the BM3D and NL patch methods in SSIM, PSNR, and MI when evaluated against the acquired high-resolution ASL, Table 1.

DISCUSSION AND CONCLUSION

In this study, we proposed an ASL super-resolution enhancement method utilizing a self-supervised strategy and a conditional diffusion model. The proposed model eliminated the requirement for high-resolution ASL images during training and provided advantages including enhanced resolution, reduced noise, and preserved contrast. The proposed method demonstrated superior performance on simulated and volunteer data compared with traditional denoising and SR techniques. Furthermore, the performance of the proposed model could be improved by fine-tuning the model with real high-resolution ASL data.

Acknowledgements

This work is supported by the National Key R&D Program of China (2022ZD0118004), the Alzheimer's Association (AARF-18-566347), Zhejiang Provincial Natural Science Foundation of China (LGJ22H180004, 202006140, and 2022C03057), and the MOE Frontier Science Center for Brain Science & Brain-Machine Integration, Zhejiang University.

References

1. J. A. Detre, J. S. Leigh, D. S. Williams, and A. P. Koretsky, “Perfusion imaging,” Magnetic Resonance in Medicine, vol. 23, no. 1, pp. 37–45, 1992, doi: 10.1002/mrm.1910230106.

2. Q. Shou, X. Shao, and D. J. J. Wang, “Super-Resolution Arterial Spin Labeling Using Slice-Dithered Enhanced Resolution and Simultaneous Multi-Slice Acquisition,” Frontiers in Neuroscience, vol. 15, 2021, doi: 10.3389/fnins.2021.737525.

3. X. Golay and E. T. Petersen, “Arterial Spin Labeling: Benefits and Pitfalls of High Magnetic Field,” Neuroimaging Clinics, vol. 16, no. 2, pp. 259–268, May 2006, doi: 10.1016/j.nic.2006.02.003.

4. L. Zhao et al., “Using Anatomic Magnetic Resonance Image Information to Enhance Visualization and Interpretation of Functional Images: A Comparison of Methods Applied to Clinical Arterial Spin Labeling Images,” IEEE Transactions on Medical Imaging, vol. 36, no. 2, pp. 487–496, Feb. 2017, doi: 10.1109/TMI.2016.2615567.

5. J. V. Manjón, P. Coupé, A. Buades, D. L. Collins, and M. Robles, “MRI Superresolution Using Self-Similarity and Image Priors,” International Journal of Biomedical Imaging, vol. 2010, p. e425891, Dec. 2010, doi: 10.1155/2010/425891.

6. C. Meurée, P. Maurel, J.-C. Ferré, and C. Barillot, “Patch-based super-resolution of arterial spin labeling magnetic resonance images,” NeuroImage, vol. 189, pp. 85–94, Apr. 2019, doi: 10.1016/j.neuroimage.2019.01.004.

7. D. Xie et al., “Denoising arterial spin labeling perfusion MRI with deep machine learning,” Magnetic Resonance Imaging, vol. 68, pp. 95–105, May 2020, doi: 10.1016/j.mri.2020.01.005.

8. Z. Li et al., “A Two-Stage Multi-loss Super-Resolution Network for Arterial Spin Labeling Magnetic Resonance Imaging,” in Medical Image Computing and Computer Assisted Intervention – MICCAI 2019, D. Shen, T. Liu, T. M. Peters, L. H. Staib, C. Essert, S. Zhou, P.-T. Yap, and A. Khan, Eds., in Lecture Notes in Computer Science. Cham: Springer International Publishing, 2019, pp. 12–20. doi: 10.1007/978-3-030-32248-9_2.

9. S. A, U. A. C., A. John, K. C., and B. Thomas, “Deep-ASL enhancement technique in arterial spin labeling MRI – A novel approach for the error reduction of partial volume correction technique with linear regression algorithm,” Journal of Computational Science, vol. 58, p. 101546, Feb. 2022, doi: 10.1016/j.jocs.2021.101546.

10. C. Saharia, J. Ho, W. Chan, T. Salimans, D. J. Fleet, and M. Norouzi, “Image Super-Resolution via Iterative Refinement,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 4, pp. 4713–4726, Apr. 2023, doi: 10.1109/TPAMI.2022.3204461.

Figures

Figure 1. The framework of the proposed method, including the synthetic ASL generator (a) and the conditional diffusion model (the training phase (b) and the inference phase (c)).

Figure 2. The performance of the proposed method on synthetic data.

Table 1. The performances of the proposed SURED method compared to the acquired high-resolution ASL

Figure 3. Enhanced ASL on external and internal datasets. (a) Synthetic ASL from ADNI, (b) Low-resolution ASL from ADNI, (c) Internal data, and (d) zoom-in images from (c).

Proc. Intl. Soc. Mag. Reson. Med. 32 (2024)

DOI: https://doi.org/10.58530/2024/1259