1758

Accurate and efficient co-registration of diffusion and T1-weighted MRI using self-supervised deep learning

Keyu Chen¹, Ziyu Li², Zihan Li³, and Qiyuan Tian³
¹School of Biological Science and Medical Engineering, Beihang University, Beijing, China, ²Wellcome Centre for Integrative Neuroimaging, FMRIB, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, United Kingdom, ³Department of Biomedical Engineering, Tsinghua University, Beijing, China

Synopsis

Keywords: Analysis/Processing, Machine Learning/Artificial Intelligence, co-registration, distortion correction, voxelmorph

Motivation: Co-registration between diffusion and T1-weighted data is important for various diffusion analyses but is challenging because of the geometric distortion in diffusion images.

Goal(s): To achieve accurate and efficient co-registration between diffusion data and T1w images.

Approach: The self-supervised deep learning-based framework VoxelMorph was used to non-linearly align the distorted diffusion b=0 image to the T1w image. Our proposal was systematically and quantitatively compared to other linear and non-linear transformations. The benefit was also demonstrated.

Results: VoxelMorph achieved co-registration accuracy comparable to NiftyReg with a processing time of seconds, which was more than 40 times faster than NiftyReg on a CPU and about 300 times faster on a GPU.

Impact: Our proposal achieved fast and accurate co-registration between distorted diffusion data and T1w images, which has great potential to benefit various diffusion MRI data analyses for neuroscientific studies, including region-of-interest specific quantification and surface-based analysis.

Introduction

Diffusion MRI (dMRI) is a valuable noninvasive technique for mapping tissue microstructure and structural connectivity. Various diffusion analyses require accurate co-registration of diffusion with T1-weighted (T1w) images for transforming brain segments derived on T1w data to the diffusion image space. Accurate co-registration is challenging because diffusion data, commonly acquired using single-shot echo-planar imaging (EPI), are prone to B0 inhomogeneity-induced geometric distortions. The correction of such distortions requires an additionally acquired field map, which is often unavailable. Non-linear co-registration methods implemented in software packages such as NiftyReg can compensate for the non-linear distortions but are very slow.

To address this challenge, we leveraged the unsupervised deep learning framework VoxelMorph, which uses a convolutional neural network (CNN), to achieve fast and accurate co-registration between distorted diffusion data and T1w images. The distorted b=0 image was non-linearly aligned with the T1w image to obtain a warp field, which can be applied to the whole diffusion dataset. Our proposal benefits from the CNN’s superior performance in learning the non-linear mapping and from fast forward processing once the network is trained. We also show that transfer learning substantially accelerated the network training.

Methods

Data. Data from 30 subjects in the MGH Connectome Diffusion Microstructure Dataset were used in this study. The spatial resolutions were 2 mm isotropic and 1 mm isotropic for the diffusion and T1w data, respectively. The diffusion data were corrected for eddy current-induced distortions using FSL’s “eddy” but were not corrected for B0 inhomogeneity-induced geometric distortions.

Networks. VoxelMorph (version 0.2, https://github.com/voxelmorph/voxelmorph/) with self-supervised learning was adopted in this study, which involved a U-Net and a spatial transformer network (Fig.1A). The distorted b=0 image in the diffusion data and the target T1w image were set as the moving and fixed images, respectively, and served as the inputs to VoxelMorph. Training and testing were performed on data from 20 subjects and another 10 subjects, respectively. The optimized network was further fine-tuned on the data of one evaluation subject.
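The resampling performed by the spatial transformer can be illustrated in one dimension. The following is a minimal sketch and not VoxelMorph's implementation: the function name `warp_1d` and the toy signals are hypothetical, displacements are assumed to be expressed in voxel units, and linear interpolation with edge clamping is assumed. It shows how a single warp field, once estimated from the b=0 image, can be reused to resample any other volume of the diffusion dataset.

```python
def warp_1d(signal, displacement):
    """Resample `signal` at positions i + displacement[i] (linear interpolation)."""
    n = len(signal)
    out = []
    for i, d in enumerate(displacement):
        x = min(max(i + d, 0.0), n - 1.0)  # clamp the sample position to the grid
        lo = int(x)                        # left neighbour (x >= 0, so int() floors)
        hi = min(lo + 1, n - 1)            # right neighbour, clamped at the edge
        t = x - lo                         # fractional distance from the left neighbour
        out.append((1.0 - t) * signal[lo] + t * signal[hi])
    return out


# Hypothetical toy data: one warp field reused for two "volumes".
warp = [0.5, 0.5, 0.5, 0.5]
b0_line = [0.0, 10.0, 20.0, 30.0]
dwi_line = [0.0, 4.0, 8.0, 12.0]

warped_b0 = warp_1d(b0_line, warp)    # [5.0, 15.0, 25.0, 30.0]
warped_dwi = warp_1d(dwi_line, warp)  # [2.0, 6.0, 10.0, 12.0]
```

In three dimensions the same idea applies with one displacement component per axis and trilinear interpolation.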

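An Adam optimizer was used to minimize the loss function L, which consists of L_sim, which penalizes the differences between the co-registered and fixed images, and L_smooth, which penalizes local spatial variations in the warp field ϕ:
L(f, m, ϕ) = L_sim(f, m ∘ ϕ) + λ L_smooth(ϕ)
where f and m are the fixed and moving images, m ∘ ϕ is the moving image warped by ϕ, and λ is the regularization weight.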
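To make the role of the two terms concrete, the sketch below evaluates the loss for a toy 1D displacement field. It assumes a mean squared forward difference as the smoothness penalty, which is one common choice and is not confirmed by this abstract; the function names and the numeric values are hypothetical, and the similarity term is passed in as a precomputed number.

```python
def smoothness_penalty(displacement):
    """Mean squared forward difference of a 1D displacement field."""
    diffs = [(b - a) ** 2 for a, b in zip(displacement, displacement[1:])]
    return sum(diffs) / len(diffs)


def total_loss(similarity_term, displacement, lam):
    """L = L_sim + lambda * L_smooth, with L_sim supplied by the caller."""
    return similarity_term + lam * smoothness_penalty(displacement)


# A constant shift is perfectly smooth, so only the similarity term remains.
loss_constant_shift = total_loss(0.5, [2.0, 2.0, 2.0, 2.0], lam=0.1)

# A jagged field is penalized in proportion to lambda.
loss_jagged = total_loss(0.5, [0.0, 2.0, 0.0, 2.0], lam=0.1)
```

A larger λ favors smoother warp fields at the cost of a weaker fit to the fixed image.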

Mutual information (MI) is generally used for inter-modality registration. However, conventional histogram binning is non-differentiable, which prevents gradients from being backpropagated to train the network that predicts ϕ. Therefore, the differential mutual information (dMI) technique was employed, in which each voxel makes a continuous contribution to a range of histogram bins, so that the metric remains continuous with respect to image intensity.
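The idea of soft binning can be sketched in a few lines. The example below is an illustration under stated assumptions, not the dMI implementation used in this study: it assumes intensities normalized to [0, 1], Gaussian weights around equally spaced bin centers, and hypothetical function names and default parameters. Because each voxel spreads its contribution over neighboring bins, a small change in intensity produces a small change in the metric.

```python
import math


def soft_bin_weights(value, centers, sigma):
    """Normalized Gaussian contribution of one intensity to every bin."""
    weights = [math.exp(-0.5 * ((value - c) / sigma) ** 2) for c in centers]
    total = sum(weights)
    return [w / total for w in weights]


def soft_mutual_information(a, b, num_bins=8, sigma=None):
    """MI (in nats) of two equally long intensity lists with values in [0, 1]."""
    centers = [(k + 0.5) / num_bins for k in range(num_bins)]
    if sigma is None:
        sigma = 0.5 / num_bins  # assumed kernel width: half a bin
    joint = [[0.0] * num_bins for _ in range(num_bins)]
    for va, vb in zip(a, b):
        wa = soft_bin_weights(va, centers, sigma)
        wb = soft_bin_weights(vb, centers, sigma)
        for i, x in enumerate(wa):
            for j, y in enumerate(wb):
                joint[i][j] += x * y / len(a)  # soft joint histogram
    pa = [sum(row) for row in joint]  # marginal of a
    pb = [sum(joint[i][j] for i in range(num_bins)) for j in range(num_bins)]
    mi = 0.0
    for i in range(num_bins):
        for j in range(num_bins):
            p = joint[i][j]
            if p > 0.0:
                mi += p * math.log(p / (pa[i] * pb[j]))
    return mi


ramp = [i / 19 for i in range(20)]
mi_identical = soft_mutual_information(ramp, ramp)        # strongly dependent
mi_constant = soft_mutual_information(ramp, [0.5] * 20)   # no shared information
```

In a training setting the same computation would be written with differentiable tensor operations so that gradients can flow back to the network.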

Evaluation. The accuracy of the co-registration was evaluated by quantifying the similarity of the co-registered distorted b=0 image to a reference b=0 image, which was distortion-corrected using FSL’s “topup” and co-registered to the T1w image using FreeSurfer’s “bbregister”. For comparison, the distorted b=0 image was also co-registered to the T1w image using “reg_f3d” from NiftyReg. The structural similarity index measure (SSIM), cross-correlation (CC) and MI between the co-registered and reference b=0 images were calculated to quantify their similarity.
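Two of these metrics can be sketched on flattened intensity lists. This is a simplified illustration rather than the evaluation code used in the study: CC is taken here as the zero-mean normalized cross-correlation, and SSIM is computed once over the whole image instead of over local windows as is usual in practice. The function names are hypothetical, images are assumed to be non-constant, and the constants follow the commonly used defaults (0.01 and 0.03 times the dynamic range, squared).

```python
import math


def mean(values):
    return sum(values) / len(values)


def covariance(x, y):
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


def normalized_cross_correlation(x, y):
    """Zero-mean normalized cross-correlation, in [-1, 1]."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))


def global_ssim(x, y, dynamic_range=1.0):
    """Single-window SSIM over the whole image (a simplification)."""
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    mx, my = mean(x), mean(y)
    numerator = (2 * mx * my + c1) * (2 * covariance(x, y) + c2)
    denominator = (mx * mx + my * my + c1) * (covariance(x, x) + covariance(y, y) + c2)
    return numerator / denominator


# Hypothetical toy intensities standing in for reference and co-registered images.
reference = [0.1, 0.4, 0.35, 0.8]
registered = [0.12, 0.38, 0.36, 0.79]
cc = normalized_cross_correlation(reference, registered)
ssim = global_ssim(reference, registered)
```

Both values approach 1 as the co-registered image approaches the reference.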

Results

Fig.1B illustrates the training and fine-tuning process of VoxelMorph. VoxelMorph converged rapidly, after 14 epochs (learning rate = 1e-4). Fine-tuning a pre-trained model significantly accelerated the training, allowing a specialized model to be developed within a short time frame.

Fig.2 demonstrates the superior performance of VoxelMorph. VoxelMorph significantly outperformed the affine transformation from FreeSurfer’s “bbregister” and performed comparably to the nonlinear warping using NiftyReg’s “reg_f3d”. Further fine-tuning led to a slight performance improvement. The alignment between the native T1w and co-registered b=0 images (with the overlaid gray-white boundary derived from T1w data) is shown in Fig.4.

Fig.5A presents the performance of each method across the ten evaluation subjects. VoxelMorph and NiftyReg exhibited significant improvements over the affine transformation and performed similarly to each other across all three metrics.

Fig.5B compares the execution time of the different methods. On a CPU, VoxelMorph took 14 seconds on average while NiftyReg took over 10 minutes, so VoxelMorph reduced the registration time by more than 40-fold. Using a GPU (NVIDIA A800), VoxelMorph took less than two seconds, boosting the efficiency by a factor of 300.
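The reported speed-up factors follow directly from these timings. The short calculation below treats "over 10 minutes" as a lower bound of 600 seconds and "less than two seconds" as an upper bound of 2 seconds, so both results are lower bounds on the speed-up; the variable names are illustrative only.

```python
niftyreg_cpu_s = 10 * 60  # "over 10 minutes", used as a lower bound
voxelmorph_cpu_s = 14     # average CPU time reported for VoxelMorph
voxelmorph_gpu_s = 2      # "less than two seconds", used as an upper bound

cpu_speedup = niftyreg_cpu_s / voxelmorph_cpu_s
gpu_speedup = niftyreg_cpu_s / voxelmorph_gpu_s

print(f"CPU speed-up: at least {cpu_speedup:.1f}x")  # at least 42.9x
print(f"GPU speed-up: at least {gpu_speedup:.0f}x")  # at least 300x
```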

Discussion and Conclusion

Our study demonstrates the efficacy and efficiency of self-supervised deep learning-based co-registration for improving the alignment between diffusion data that suffer from geometric distortions and T1w structural images. VoxelMorph has high potential for large-scale data processing tasks and for real-time processing and visualization, such as in imaging-based neurosurgical guidance. These findings underscore the potential application of this method in clinical practice and neuroscientific research. Future work will assess the generalizability of VoxelMorph using training and testing datasets acquired with different hardware systems and imaging protocols.

Acknowledgements

The T1w and diffusion MRI data were provided by the MGH Connectome Diffusion Microstructure Dataset (CDMD).

References

Basser, P.J., Mattiello, J., LeBihan, D., 1994. MR diffusion tensor spectroscopy and imaging. Biophys. J. 66 (1), 259–267.

Ordidge, R., “The development of echo-planar imaging (epi): 1977–1982,” Magnetic Resonance Materials in Physics, Biology and Medicine 9(3), 117–121 (1999).

Modat, M. et al. Fast free-form deformation using graphics processing units. Computer Methods and Programs in Biomedicine 98, 278–284 (2010).

Balakrishnan, G., Zhao, A., Sabuncu, M. R., Guttag, J. & Dalca, A. V. VoxelMorph: A Learning Framework for Deformable Medical Image Registration. IEEE Trans. Med. Imaging 38, 1788–1800 (2019).

Tian, Q., Fan, Q., Witzel, T. et al. Comprehensive diffusion MRI dataset for in vivo human brain microstructure mapping using 300 mT/m gradients. Sci Data 9, 7 (2022).

Andersson, J. L. R. & Sotiropoulos, S. N. An integrated approach to correction for off-resonance effects and subject movement in diffusion MR imaging. NeuroImage 125, 1063–1078 (2016).

Viola, P. and Wells III, W. M., “Alignment by maximization of mutual information,” IJCV 24(2), 137–154 (1997).

Thévenaz, P. and Unser, M., “Optimization of mutual information for multiresolution image registration,” IEEE Trans. Image Process. 9(12), 2083–2099 (2000).

Guo, C. K., Multi-modal image registration with unsupervised deep learning, PhD thesis, Massachusetts Institute of Technology (2019).

Greve, D. N. & Fischl, B. Accurate and robust brain image alignment using boundary-based registration. NeuroImage 48, 63–72 (2009).

Andersson, J. L. R., Skare, S. & Ashburner, J. How to correct susceptibility distortions in spin-echo echo-planar images: application to diffusion tensor imaging. NeuroImage 20, 870–888 (2003).

Figures

Figure 1. Framework and training. (A) A U-Net g_θ was employed to map the 3D distorted b=0 image m and the T1w image f to the warp field ϕ. The co-registered b=0 image was then computed using the warp field. The U-Net was trained by minimizing the difference (DMI error) between the T1w and co-registered b=0 images. (B) The blue curve represents the DMI error during the VoxelMorph training process on a single subject, while the green curve represents the DMI error when fine-tuning the pre-trained model on the same subject.

Figure 2. Image results. Exemplar axial (a, b) and sagittal (c, d) slices of reference b = 0 images, corrected for susceptibility-induced geometric distortion (a, i and c, i) and co-registered to native T1w (b, i and d, i) using various methods (a, c, ii-v) with residual maps compared to reference b = 0 images (b, d, ii-vi), including (ii) affine co-registration using FreeSurfer’s “bbregister” function, (iii) nonlinear co-registration using NiftyReg’s “reg_f3d” function, (iv) VoxelMorph trained on data from training subjects, and (v) pre-trained VoxelMorph fine-tuned on the evaluation subject.

Figure 3. Warp field. The warp field ϕ predicted by the trained U-Net is used to register the distorted b=0 image to the native T1w image. Exemplar axial (a) and sagittal (b) slices of the predicted warp field ϕ (i) are shown alongside the reference warp field ϕ (iii), which registers the distorted b=0 image to the reference b=0 image. In columns i and iii, displacements along each spatial dimension are mapped to RGB color channels; columns ii and iv show the corresponding grid deformations.

Figure 4. Boundary alignment. The red curve represents the gray/white boundary derived from the native T1w image (i). Exemplar axial (a) and sagittal (b) image slices of co-registered b=0 data using: (ii) FreeSurfer’s “bbregister” function, (iii) NiftyReg’s “reg_f3d” function, (iv) VoxelMorph optimized on data from training subjects, and (v) the VoxelMorph model fine-tuned on the evaluation subject, with overlaid gray/white boundary curves. A gray/white boundary in the b=0 image that aligns more closely with the red curve indicates better registration performance.

Figure 5. Quantitative comparison. (A) The similarity between the co-registered distorted b=0 image and T1w data is measured using three metrics: SSIM, CC, and MI. Ten different subjects are evaluated. Higher values indicate better performance. (B) The execution time of the different methods on a CPU or GPU.

Proc. Intl. Soc. Mag. Reson. Med. 32 (2024)

DOI: https://doi.org/10.58530/2024/1758