2246

Improved Deep Learning MR Image Enhancement with Synthetic Images

Zechen Zhou¹ and Ryan Chamberlain¹
¹Subtle Medical Inc, Menlo Park, CA, United States

Synopsis

Keywords: Analysis/Processing, Machine Learning/Artificial Intelligence

Motivation: Deep learning (DL) based image enhancement requires paired data for supervised training. But separately acquired data pairs may encounter spatial mis-alignment that limits the model performance.

Goal(s): Incorporate synthetic data into the training set to address the mis-alignment issue and improve the quality and diversity of the training set.

Approach: Develop and validate the diffusion based image degrader to synthesize low quality images. Compare the performance of DL models trained with/without synthetic data.

Results: DL models trained with synthetic data can achieve similar performance compared to training with acquired pairs. Additional synthetic data can improve DL image enhancement.

Impact: Synthetic data allows building more diverse training sets to achieve multi-task DL models. How much faster the DL model can support and whether it can control the quality of output to meet different clinical preferences is worth further investigation.

Purpose

Deep learning (DL) has been clinically used in image quality restoration for accelerated MR scans. The quality of the training image pairs largely determines the DL model performance. However, the paired high quality (HQ) standard-of-care (SOC) and rapidly scanned low quality (LQ) image may encounter structural mis-alignment and inconsistent artifacts, which reduce the number and diversity of qualified training pairs and adversely affect the model performance. This work investigated the feasibility of incorporating synthetic data to address these issues.

Methods

Synthesized LQ and HQ training pairs
With 20,170 acquired low quality (LQ) and high quality (HQ) 2D MR images, two conditional latent diffusion models (cLDMs) [1] were trained to synthesize the noisy/blurry MR images given the acquired HQ image and degradation level (figure 1A). Another two baseline convolutional neural networks (CNNs) [2] were trained to restore the image quality given the noisy/blurry MR image. Both cLDMs and baseline CNNs can produce well-aligned synthetic image pairs, and the synthetic images were used when misalignment occurs between acquired LQ and HQ image pairs (figure 1B). This enables larger-sized and more diverse data pairs for training.
Evaluation of synthetic data for model training
A synthetic training set was built by replacing the LQ image with the cLDMs' outputs. The performance of CNN trained with synthetic data was compared against the baseline CNN. Pearson correlation and Bland-Altman bias analysis were performed on the SSIM measurements.
Another larger training set was created by incorporating additional 6,725 synthetic pairs with the pre-trained cLDMs. Also, soft targets were generated with the baseline CNN for knowledge distillation (KD) loss. This CNN trained with combined data pairs was compared against the two baseline CNNs in terms of MANIQA image quality score [3] using T-test.

Results

Compared to baseline CNN, CNN trained on synthetic data achieved consistent model performance in both denoising (DNE) and super-resolution (SRE) tasks with high correlation coefficient and no significant bias in terms of SSIM scores (figure 2 and table 1). Qualitatively, CNN trained with synthetic pairs can better preserve the detail structures in both DNE and SRE tasks.
After adding more synthetic training data, significant MANIQA improvement was found in the DNE task (median MANIQA of CNN w/wo additional synthetic data: 0.473285/0.453578, p < 0.01), but not for SRE task. On the other hand, CNN trained with synthetic SRE data alone performed better than the other two CNNs (median MANIQA of CNN with synthetic/acquired data: 0.472702/0.465445, p < 0.01). This result also aligns with the qualitative review.

Discussion and Conclusion

DL models trained with synthetic images can achieve similar performance compared to training with acquired pairs, particularly benefiting small structure enhancement due to better aligned training pairs. Combining acquired and synthetic data can improve the training data diversity that potentially enables a single DL model for both DNE and SRE tasks. Compared to the task specific model, the single model allows more adaptive image enhancement for faster scans with different acceleration approaches.

Acknowledgements

We'd like to acknowledge the funding support from NIH SBIR grant (1R44MH135725-01).

References

1. Rombach R, Blattmann A, Lorenz D, Esser P, and Ommer B. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition 2022, pp. 10684-10695.

2. Zhou Z, Chamberlain R, Gulaka P, Gong E, Zaharchuk G, Shankaranarayanan A. Adaptive Deep Learning MR Image Enhancement with Property Constrained Unrolled Network. In Proceedings of the ISMRM 2023, p4031.

3. Yang S, Wu T, Shi S, Lao S, Gong Y, Cao M, Wang J and Yang Y. Maniqa: Multi-dimension attention network for no-reference image quality assessment. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2022, pp. 1191-1200.

Figures

Figure 1: (A) Diagram of conditional denoising diffusion probabilistic model (DDPM) in latent space (cLDM) for low quality image synthesis. Latent space was created by training an autoencoder in step 1, and the cLDM was trained within such latent space in step 2. (B) Two strategies to address the mis-aligned training pairs by incorporating synthetic data. If acquired high quality (HQ) image has no artifacts, the cLDM degrader model can be used to synthesize the low quality (LQ) image as the pair. Otherwise, the baseline CNN model can be used to synthesize the HQ image as the pair.

Figure 2: Comparison of models trained with acquired and synthetic data pairs. Top and bottom correspond to the enhancement results from super-resolution (SRE) and denoising (DNE) task, respectively. Left panel illustrates an example case comparison across the SOC and two model outputs, and right panel provides the linear regression of SSIM measurements between the two model outputs for all slices in the test set. Note the correlation coefficient and 95% confidence intervals are provided in the top.

Table 1: Bland-Altman bias analysis for SSIM measurements between the models trained with acquired and synthetic data for SRE and DNE task.

Figure 3: Comparison of different models trained with acquired, synthetic and combined data pairs. Top and bottom correspond to the enhancement results from super-resolution (SRE) and denoising (DNE) task, respectively. Left panel illustrates an example case comparison across the SOC and three model outputs, and the DNE example shows two window level settings to better visualize the difference in background noise suppression. Right panel provides the summary of MANIQA image quality measurements across all slices in the test set.

Proc. Intl. Soc. Mag. Reson. Med. 32 (2024)

2246

DOI: https://doi.org/10.58530/2024/2246