0384

CMRDiff: Multi-sequence CMR synthesis
Puguang Xie1, Zhongsen Li2, Yu Ma1, and Jingjing Xiao3
1Chongqing Emergency Medical Centre, Chongqing University Central Hospital, School of Medicine, Chongqing University, Chongqing, China, 2Center for Biomedical Imaging Research, Tsinghua University, Beijing, China, 3Bio-Med Informatics Research Centre & Clinical Research Centre, Xinqiao Hospital, Army Medical University, Chongqing, China

Synopsis

Keywords: AI Diffusion Models, Cardiovascular

Motivation: Synthesizing multi-sequence cardiac magnetic resonance (CMR) images could shorten scan durations and extend CMR examination to a wider patient population.

Goal(s): To achieve accurate multi-sequence CMR synthesis, which is particularly challenging because of inherently suboptimal image quality and persistent noise interference.

Approach: We propose CMRDiff, a novel diffusion-model-based method for multi-sequence CMR synthesis.

Results: We evaluated CMRDiff on the MICCAI 2020 MyoPS Challenge dataset. Our experiments demonstrate that CMRDiff outperforms other state-of-the-art multi-modal MRI synthesis methods.

Impact: To the best of our knowledge, CMRDiff is the first denoising diffusion probabilistic model in the literature for multi-sequence CMR synthesis, and it promises to serve as an effective tool for this task.

Introduction

A standard CMR examination integrates a variety of distinct pulse sequences (1). However, acquiring multiple sequences inevitably prolongs the scan, which can be particularly challenging for patients who suffer from claustrophobia. Moreover, certain sequences, especially the Late Gadolinium Enhancement (LGE) sequence, require intravenous contrast agents, which limits their applicability to individuals allergic to these agents (2; 3). Given these challenges, the emerging field of multi-sequence CMR synthesis holds significant promise. However, accurate CMR image synthesis remains difficult. The difficulty arises primarily from suboptimal image quality, varied cardiac morphology, indistinct pathological boundaries, and the challenge of fusing information across sequences (4). Although a few reports exist on synthesizing CMR images with generative adversarial network (GAN) models (5; 6), the implicit characterization of the data distribution in GAN-based models can adversely affect the quality and diversity of the synthesized images (7). In contrast, diffusion models, anchored in explicit likelihood formulations and a gradual sampling process, are emerging as potent alternatives for image synthesis (8; 9). To the best of our knowledge, however, no prior work has employed diffusion models for the synthesis of CMR images. In this context, our proposed model, CMRDiff, pioneers the use of diffusion models for multi-sequence CMR synthesis.

Method

CMRDiff uses two existing CMR images as conditional inputs to guide the synthesis of a third CMR image. To illustrate the method, we use the synthesis of LGE images as an example (Fig. 1): the T2-weighted and bSSFP images serve as the conditional inputs for generating the LGE image. Training diffusion models directly in high-resolution pixel space is computationally demanding. Drawing inspiration from LDM (10), CMRDiff performs the forward noising process and the backward denoising process within the latent embedding space. To preserve the anatomical structure in the synthesized images, we incorporate a guided image synthesis technique in the model inference phase (Fig. 2). Accurate generation of the myocardium and lesions in CMR images is crucial, as they play a vital role in disease diagnosis and treatment. To enhance the fidelity and quality of these pivotal regions, we propose a heatmap-based denoising loss (Fig. 3). Furthermore, we introduce a multi-condition classifier-free guidance designed to modulate the weighting of multiple CMR sequences during the synthesis process. This allows flexible control over the similarity between the generated images and the condition images.
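The idea of weighting two conditions separately at sampling time can be illustrated with a minimal sketch. The abstract does not give the exact guidance formula, so the combination rule below is an assumption: a common symmetric form in which each condition contributes its own offset from the unconditional noise prediction. The function name, the toy values, and the flattened-list representation of the UNet outputs are all hypothetical.

```python
def multi_condition_cfg(eps_uncond, eps_cond_1, eps_cond_2, scale_1, scale_2):
    """Blend three noise predictions into one guided prediction.

    ASSUMED rule (not taken from the abstract):
        eps = eps_uncond
              + scale_1 * (eps_cond_1 - eps_uncond)
              + scale_2 * (eps_cond_2 - eps_uncond)

    eps_uncond : prediction with both conditions replaced by the null input
    eps_cond_1 : prediction with only condition 1 (e.g. bSSFP) present
    eps_cond_2 : prediction with only condition 2 (e.g. T2) present
    """
    return [
        u + scale_1 * (c1 - u) + scale_2 * (c2 - u)
        for u, c1, c2 in zip(eps_uncond, eps_cond_1, eps_cond_2)
    ]


# Toy flattened noise predictions standing in for UNet outputs.
eps_uncond = [0.0, 0.0, 0.0]
eps_bssfp = [1.0, 2.0, 0.0]
eps_t2 = [0.0, 2.0, 4.0]

guided = multi_condition_cfg(eps_uncond, eps_bssfp, eps_t2, scale_1=2.0, scale_2=0.5)
print(guided)  # [2.0, 5.0, 2.0]
```

Under this assumed rule, setting both scales to zero recovers the unconditional prediction, and raising one scale pulls the result toward the corresponding condition image.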

Results

We evaluated our model using the MyoPS 2020 dataset (4). This dataset encompasses three distinct CMR sequences: end-diastolic phase of bSSFP, T2-weighted, and LGE CMR. The MyoPS 2020 dataset comprises 25 labeled (102 slices) CMR images and 20 unlabeled (72 slices) CMR images. The labeled images were used for model training. The unlabeled images were randomly divided into a validation set (5 CMR images) and a test set (15 CMR images). Because very few studies address CMR synthesis, we compared CMRDiff with state-of-the-art GAN-based MRI synthesis models and diffusion models. Fig. 4 lists the PSNR and SSIM metrics of CMRDiff and the competing methods in multi-sequence CMR image synthesis. CMRDiff achieves the highest performance in all three tasks: bSSFP, T2 → LGE; bSSFP, LGE → T2; and T2, LGE → bSSFP. Representative images are displayed in Fig. 5. CMRDiff synthesizes target images with lower artifact levels and sharper tissue depiction.
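For readers unfamiliar with the reported metric, the following minimal sketch shows how PSNR is defined and how a mean ± std summary across test images could be computed. It is illustrative only: the toy images, the assumed intensity range of [0, 1], and the function name are not from the abstract, and the authors' evaluation code may differ. SSIM is omitted here because it needs windowed local statistics.

```python
import math
import statistics


def psnr(reference, synthesized, data_range=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    mse = sum((r - s) ** 2 for r, s in zip(reference, synthesized)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(data_range ** 2 / mse)


# Toy flattened images, intensities assumed to lie in [0, 1].
reference = [0.0, 0.5, 1.0, 0.5]
synth_a = [0.1, 0.4, 0.9, 0.6]      # every pixel off by 0.1, about 20 dB
synth_b = [0.01, 0.51, 0.99, 0.49]  # every pixel off by 0.01, about 40 dB

scores = [psnr(reference, synth_a), psnr(reference, synth_b)]
print(f"PSNR: {statistics.mean(scores):.2f} ± {statistics.stdev(scores):.2f} dB")
```

A tenfold reduction in per-pixel error raises PSNR by 20 dB, which is why small dB differences between methods can still reflect visible changes in image quality.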

Discussion and Conclusion

We propose CMRDiff, a novel diffusion-model-based approach for multi-sequence CMR image synthesis. CMRDiff enhances the integrity of anatomical structures and the accuracy of the myocardial region in the synthesized image through a heatmap-based denoising loss and guided image synthesis. Our results demonstrate that CMRDiff outperforms state-of-the-art GAN-based approaches and latent diffusion models in multi-sequence synthesis for CMR. However, the MyoPS 2020 dataset used in our study contains only bSSFP, T2-weighted, and LGE images, which may lead to inaccurate synthesis of certain sequences, such as the generation of lesion regions in LGE images without T1 images. Despite this limitation, CMRDiff provides valuable insights into the synthesis of CMR images using diffusion models and holds promise for reducing scan durations, decreasing costs, and expanding the patient population that can benefit from this technology.

Acknowledgements

None

References

1. Wang TKM, Ayoub C, Chetrit M, Kwon DH, Jellis CL, Cremer PC, Bolen MA, Flamm SD, Klein AL. Cardiac magnetic resonance imaging techniques and applications for pericardial diseases. Circ Cardiovasc Imaging 2022;15:e014283.
2. Stevenson A, Bray JJ, Tregidgo L, Ahmad M, Sharma A, Ng A, Siddiqui A, Khalid AA, Hylton K, Ionescu A. Prognostic value of late gadolinium enhancement detected on cardiac magnetic resonance in cardiac sarcoidosis. Cardiovascular Imaging 2023;16:345-357.
3. van Assen M, Muscogiuri G, Caruso D, Lee SJ, Laghi A, De Cecco CN. Artificial intelligence in cardiac radiology. Radiol Med 2020;125:1186-1199.
4. Qiu J, Li L, Wang S, Zhang K, Chen Y, Yang S, Zhuang X. MyoPS-Net: Myocardial pathology segmentation with flexible combination of multi-sequence CMR images. Med Image Anal 2023;84:102694.
5. Zhang Q, Burrage MK, Shanmuganathan M, Gonzales RA, Lukaschuk E, Thomas KE, Mills R, Leal Pelado J, Nikolaidou C, Popescu IA. Artificial intelligence for contrast-free MRI: Scar assessment in myocardial infarction using deep learning–based virtual native enhancement. Circulation 2022;146:1492-1503.
6. Zhang Q, Burrage MK, Lukaschuk E, Shanmuganathan M, Popescu IA, Nikolaidou C, Mills R, Werys K, Hann E, Barutcu A. Toward replacing late gadolinium enhancement with artificial intelligence virtual native enhancement for gadolinium-free cardiovascular magnetic resonance tissue characterization in hypertrophic cardiomyopathy. Circulation 2021;144:589-599.
7. Borji A. Pros and cons of GAN evaluation measures. Computer Vision and Image Understanding 2019;179:41-65.
8. Dhariwal P, Nichol A. Diffusion models beat GANs on image synthesis. Adv Neural Inf Process Syst 2021;34:8780-8794.
9. Ho J, Jain A, Abbeel P. Denoising diffusion probabilistic models. Adv Neural Inf Process Syst 2020;33:6840-6851.
10. Rombach R, Blattmann A, Lorenz D, Esser P, Ommer B. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. p. 10684-10695.

Figures

Figure 1. CMRDiff overview and training pipeline. We first encode the input images using a pre-trained VAE, then add noise to the encoded LGE, concatenate it with the encoded bSSFP and T2, and feed the result into the UNet. A pre-trained CLIP model is used to further extract bSSFP and T2 features, which are introduced into the UNet through the cross-attention mechanism. We propose a heatmap-based denoising loss to supervise the model training. During training, bSSFP and T2 are set to $$${\varnothing}$$$ with a certain probability.
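The training-time steps in this caption (noising the target latent, randomly dropping conditions, and concatenating the inputs) can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the linear beta schedule, the all-zero stand-in for the null condition, the drop probability of 0.1, and every function name are hypothetical, and plain lists stand in for VAE latents. Only the forward noising formula itself follows the standard DDPM definition (9).

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for an ASSUMED linear beta schedule."""
    alpha_bars, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(z0, noise, alpha_bar_t):
    """DDPM forward step: z_t = sqrt(alpha_bar_t) * z0 + sqrt(1 - alpha_bar_t) * noise."""
    a, b = math.sqrt(alpha_bar_t), math.sqrt(1.0 - alpha_bar_t)
    return [a * z + b * e for z, e in zip(z0, noise)]


def drop_conditions(cond_1, cond_2, p_drop, rng):
    """Independently replace each condition with a null input (ASSUMED all zeros)."""
    out_1 = [0.0] * len(cond_1) if rng.random() < p_drop else cond_1
    out_2 = [0.0] * len(cond_2) if rng.random() < p_drop else cond_2
    return out_1, out_2


rng = random.Random(0)
alpha_bars = make_alpha_bars(num_steps=1000)

# Toy flattened latents standing in for VAE encodings.
z_lge = [0.5, -1.0, 0.25, 2.0]
z_bssfp = [0.1, 0.2, 0.3, 0.4]
z_t2 = [0.4, 0.3, 0.2, 0.1]

t = rng.randrange(len(alpha_bars))             # random timestep
noise = [rng.gauss(0.0, 1.0) for _ in z_lge]   # target the UNet learns to predict
z_t = add_noise(z_lge, noise, alpha_bars[t])

c1, c2 = drop_conditions(z_bssfp, z_t2, p_drop=0.1, rng=rng)
unet_input = z_t + c1 + c2  # list concatenation stands in for channel-wise concat
```

Dropping conditions during training is what later allows the same network to produce conditional and unconditional predictions for classifier-free guidance.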

Figure 2. CMRDiff inference pipeline. We first encode the input images using a pre-trained VAE, then add a controlled amount of noise to the encoded bSSFP, concatenate it with the encoded bSSFP and T2, and feed the result into the UNet. A pre-trained CLIP model is used to further extract bSSFP and T2 features, which are introduced into the UNet as in the training stage. We introduce a pair of discrete guidance scales $$$c_{I_{1}}$$$ and $$$c_{I_{2}}$$$. These scales allow the weight associated with each condition to be modulated individually.

Figure 3. Framework for heatmap loss. This loss enables the model to focus more on the generation of myocardial regions, especially for poorly performing regions.
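The abstract does not specify how the heatmap enters the loss, so the sketch below shows only one plausible reading: a denoising mean squared error in which each element is weighted by 1 plus a multiple of a heatmap value. The weighting rule, the `weight` parameter, the function name, and the toy heatmap are all assumptions made for illustration.

```python
def heatmap_weighted_mse(eps_pred, eps_true, heatmap, weight=1.0):
    """Denoising MSE with per-element weights (1 + weight * heatmap).

    ASSUMPTION: heatmap values are in [0, 1] and are larger where the model
    should focus, e.g. inside the myocardium or where errors are high.
    """
    total = 0.0
    for p, t, h in zip(eps_pred, eps_true, heatmap):
        total += (1.0 + weight * h) * (p - t) ** 2
    return total / len(eps_pred)


eps_true = [0.0, 0.0, 0.0, 0.0]
eps_pred = [1.0, 1.0, 1.0, 1.0]   # uniform error of 1 everywhere
heatmap = [0.0, 1.0, 1.0, 0.0]    # toy mask: middle two elements are myocardium

plain = heatmap_weighted_mse(eps_pred, eps_true, [0.0] * 4)
focused = heatmap_weighted_mse(eps_pred, eps_true, heatmap, weight=2.0)
print(plain, focused)  # 1.0 2.0
```

With a zero heatmap the expression reduces to the ordinary denoising loss, so the heatmap term only adds extra penalty where it is non-zero.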

Figure 4. Performance for multi-sequence CMR synthesis tasks (bSSFP, T2 → LGE; bSSFP, LGE → T2; and T2, LGE → bSSFP). PSNR (dB) and SSIM are listed as mean ± std across the test set. Boldface indicates the top-performing model for each task.

Figure 5. CMRDiff was demonstrated on the MyoPS dataset for three multi-sequence CMR synthesis tasks: a) bSSFP, T2 → LGE, b) bSSFP, LGE → T2, and c) T2, LGE → bSSFP. Synthesized images from all competing methods are shown along with the source images and the reference image. CMRDiff improves synthesis performance, especially in pathological regions (e.g., lesions), in comparison to competing methods. Overall, CMRDiff images have better-delineated tissue boundaries and lower artifact/noise levels.

Proc. Intl. Soc. Mag. Reson. Med. 32 (2024)
0384
DOI: https://doi.org/10.58530/2024/0384