1982

Image Registration using Averaging VoxelMorph with CNN Edge Detector
Xuan Lei1, Philip Schniter1, Chong Chen1, Yingmin Liu1, and Rizwan Ahmad1
1The Ohio State University, Columbus, OH, United States

Synopsis

Keywords: Analysis/Processing, Motion Correction, MOCO, VoxelMorph

Motivation: Image registration followed by averaging is a common technique to improve the quality of free-breathing single-shot cardiac images. However, registering images becomes challenging when the SNR is low.

Goal(s): Improve image registration for free-breathing cardiac MRI.

Approach: We train a network, called AvgMorph, to jointly register all source images to one target image. In addition, we compute the training loss on the output of a deep learning-based edge detector rather than on the images themselves.

Results: We validate AvgMorph using a realistic MRXCAT digital phantom for late gadolinium enhancement. AvgMorph outperforms existing methods in terms of NMSE, SSIM, and perceptual quality metrics.

Impact: Pairwise registration of free-breathing images is suboptimal. We propose a network to register all source images to a single target image and utilize a loss function computed on edge maps rather than on the images themselves.

Introduction

Free-breathing single-shot (FBSS) imaging is gaining popularity for cardiovascular MRI (CMR) because of its high acquisition efficiency and patient comfort. Common CMR examples of FBSS include first-pass perfusion, late gadolinium enhancement (LGE), and parametric mapping. Typically, FBSS images need to be registered for averaging or further processing. Existing methods often rely on pairwise registration, where each noisy source image is separately registered to the noisy target image. These methods either use a variational nonrigid image registration (NRIR) framework [1] or deep learning (DL)-based VoxelMorph [2] with normalized cross-correlation (NCC) loss [3]. However, this pairwise processing is suboptimal, especially when the signal-to-noise ratio (SNR) is low. We present a method, called AvgMorph, that jointly registers all noisy source images to a single noisy target image. In addition, for training AvgMorph, we compute the loss function on the output of a DL-based edge detector [4]. Our preliminary results show that AvgMorph outperforms NRIR and VoxelMorph. These results also show that AvgMorph can be applied to unseen images without further training.

Methods

We aim to register n source images to one target image using AvgMorph. The network architecture for AvgMorph is shown in Figure 1. First, we train a single VoxelMorph network using NCC loss. Second, using the trained VoxelMorph network as a warm start, we train n weight-tied VoxelMorph networks. In this step, we pass the averaged output of the n VoxelMorph networks, as well as the target image, through the lightweight dense CNN (LDC), which is a recently proposed DL-based edge detector. The loss function is based on the L1-norm of the pixelwise difference between the two LDC edge maps. To generate the edge maps, we use the output of the first block in the LDC network. The training is performed in PyTorch using the Adam optimizer for 20,000 epochs on an NVIDIA RTX 3090 GPU.

To train AvgMorph, we used data from the MRXCAT phantom [5]. We simulated 64 unique short-axis slices from 8 digital patients. The parameters defining tissue contrast, e.g., T1, proton density, inversion time, and contrast concentration, were selected to match routine LGE scans. To simulate free-breathing imaging, a unique breathing pattern was used for each slice, and 15 (n=14 source, 1 target) free-breathing images were collected per slice. The SNR of the source and target images was randomly varied between 8 dB and 14 dB during each epoch. Once trained, AvgMorph was tested on 32 slices from 4 digital patients, each at 1 dB, 6 dB, and 11 dB SNR. Although the training data did not have a scar, we adjusted the parameter file in MRXCAT to include myocardial scars of varying sizes and shapes in all test slices. For evaluation, AvgMorph was compared to NRIR [6] and VoxelMorph in terms of NMSE (in dB), SSIM, LPIPS, and DISTS [7].
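The edge-based loss described in the Methods (average the registered source images, pass the average and the target through an edge detector, and take the L1-norm of the difference) can be illustrated with a minimal NumPy sketch. This is not the training code used in this work: the function names are hypothetical, and a simple finite-difference gradient magnitude is used as a stand-in for the first block of the LDC network. The sketch also assumes the source images have already been warped by the registration networks.

```python
import numpy as np

def edge_map(img):
    """Stand-in edge detector: finite-difference gradient magnitude.
    Assumption: the abstract uses the first block of LDC here instead."""
    gy, gx = np.gradient(img.astype(float))  # derivatives along rows, columns
    return np.hypot(gx, gy)

def avg_edge_l1_loss(warped_sources, target):
    """L1-norm of the difference between the edge map of the averaged
    warped sources (shape n x H x W) and the edge map of the target (H x W)."""
    averaged = np.mean(warped_sources, axis=0)  # mean (averaging) layer
    return np.abs(edge_map(averaged) - edge_map(target)).sum()

# Toy example: a square phantom, 14 noisy copies, and a misregistered version.
rng = np.random.default_rng(0)
target = np.zeros((32, 32))
target[8:24, 8:24] = 1.0
sources = target + 0.05 * rng.standard_normal((14, 32, 32))
shifted = np.roll(sources, 3, axis=2)  # simulate residual motion of 3 pixels

print("aligned loss:", avg_edge_l1_loss(sources, target))
print("shifted loss:", avg_edge_l1_loss(shifted, target))
```

In this toy setting the misregistered stack yields a larger loss than the aligned one, which is the behavior the loss relies on. In actual training the edge detector and warping would need to be differentiable, as they are in PyTorch.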

Results

The results are summarized in Figure 2. Across all three testing SNRs (1 dB, 6 dB, and 11 dB), AvgMorph outperforms NRIR and VoxelMorph for all performance metrics. As expected, the performance of all three methods degrades as the SNR decreases. The improvement of AvgMorph over the other methods is more pronounced at low SNR. Figures 3, 4, and 5 show representative images at the three testing SNRs along with the error maps. Although the training images did not have a scar, AvgMorph was successful in registering and denoising the scar area in the test images. The computation time for NRIR was 2 minutes per slice, while the computation time for VoxelMorph and AvgMorph was between 1 and 2 seconds at the inference stage.
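For readers unfamiliar with NMSE reported in dB, the following sketch shows one common way to compute it against a noiseless reference. The exact normalization used for Figure 2 is not stated in this abstract, so the definition below (error energy divided by reference energy, on a 10 log10 scale) is an assumption, and the function name is hypothetical.

```python
import numpy as np

def nmse_db(estimate, reference):
    """Normalized mean squared error in dB (assumed definition):
    10 * log10( sum|estimate - reference|^2 / sum|reference|^2 )."""
    err_energy = np.sum(np.abs(estimate - reference) ** 2)
    ref_energy = np.sum(np.abs(reference) ** 2)
    return 10.0 * np.log10(err_energy / ref_energy)

# Toy example: a uniform 10% amplitude error corresponds to about -20 dB.
reference = np.ones((4, 4))
estimate = 1.1 * reference
print(nmse_db(estimate, reference))
```

Under this definition, lower (more negative) values indicate a closer match to the reference, consistent with the convention noted for Figure 2.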

Discussion

Image registration is often used to improve the quality of FBSS images. Inaccurate registration can lead to image blurring and motion artifacts. Existing image registration methods perform pairwise processing and thus do not fully utilize the joint redundancy in all source images. We propose an extension of VoxelMorph, called AvgMorph. In AvgMorph, all source images are jointly registered and averaged to match one target image. In contrast to mean squared error or NCC loss, the LDC-based loss aligns the images based on edge information. As a result, AvgMorph is more robust to noise and potentially to changes in the image contrast. Our results show that AvgMorph with LDC-based loss outperforms NRIR and VoxelMorph in the MRXCAT study.

Conclusion

We have proposed and validated a new image registration method that can outperform commonly used registration methods. After the training phase, AvgMorph offers fast computation and performs well for unseen pathologies. Our ongoing efforts are focused on applying AvgMorph to LGE data from human subjects, and our future efforts will include integrating AvgMorph into a reconstruction method.

Acknowledgements

This research was supported by NIH/NIBIB grant R01EB029957.

References

[1] M. Droske and M. Rumpf, “A variational approach to nonrigid morphological image registration,” SIAM J Appl Math, vol. 64, no. 2, 2004, doi: 10.1137/S0036139902419528.

[2] G. Balakrishnan, A. Zhao, M. R. Sabuncu, J. Guttag, and A. V. Dalca, “VoxelMorph: A Learning Framework for Deformable Medical Image Registration,” IEEE Trans Med Imaging, vol. 38, no. 8, 2019, doi: 10.1109/TMI.2019.2897538.

[3] K. D. Ban, J. Lee, D. H. Hwang, and Y. K. Chung, “Face image registration methods using Normalized Cross Correlation,” in 2008 International Conference on Control, Automation and Systems, ICCAS 2008, 2008, doi: 10.1109/ICCAS.2008.4694210.

[4] X. Soria, G. Pomboza-Junez, and A. D. Sappa, “LDC: Lightweight Dense CNN for Edge Detection,” IEEE Access, vol. 10, 2022, doi: 10.1109/ACCESS.2022.3186344.

[5] L. Wissmann, C. Santelli, W. P. Segars, and S. Kozerke, “MRXCAT: Realistic numerical phantoms for cardiovascular magnetic resonance,” Journal of Cardiovascular Magnetic Resonance, vol. 16, no. 1, 2014, doi: 10.1186/s12968-014-0063-3.

[6] H. Xue et al., “Motion correction for myocardial T1 mapping using image registration with synthetic image estimation,” Magn Reson Med, vol. 67, no. 6, pp. 1644–55, Jun. 2012, doi: 10.1002/mrm.23153.

[7] S. Kastryulin, J. Zakirov, N. Pezzotti, and D. V. Dylov, “Image Quality Assessment for Magnetic Resonance Imaging,” IEEE Access, vol. 11, 2023, doi: 10.1109/ACCESS.2023.3243466.

Figures

Figure 1: The architecture of AvgMorph, which consists of n weight-tied VoxelMorph networks and a mean (averaging) layer. The network is trained using LDC-based loss.

Figure 2: Results of the MRXCAT study. The numbers reported here were averaged over 32 slices from four digital patients. For SSIM, higher values are better, while for NMSE, LPIPS, and DISTS, lower values are better.

Figure 3: A representative example when testing SNR = 11 dB. The noiseless target was used only to compute the performance metrics. The images labeled NRIR, VoxelMorph, and AvgMorph show the final registered images from the three methods. The noisy source images (not shown) and the noisy target image were used to generate the results shown in the first row. The error maps were amplified five times for better visualization.

Figure 4: A representative example when testing SNR = 6 dB. The noiseless target was used only to compute the performance metrics. The images labeled NRIR, VoxelMorph, and AvgMorph show the final registered images from the three methods. The noisy source images (not shown) and the noisy target image were used to generate the results shown in the first row. The error maps were amplified five times for better visualization.

Figure 5: A representative example when testing SNR = 1 dB. The noiseless target was used only to compute the performance metrics. The images labeled NRIR, VoxelMorph, and AvgMorph show the final registered images from the three methods. The noisy source images (not shown) and the noisy target image were used to generate the results shown in the first row. The error maps were amplified five times for better visualization.

Proc. Intl. Soc. Mag. Reson. Med. 32 (2024)
1982
DOI: https://doi.org/10.58530/2024/1982