1981

AA-VoxelMorph: A Weakly-supervised Learning Model for PET-MRI Registration via Adaptive Attention
Shaoze Zhang¹, Yiwei Liu¹, Xingyue Wei¹, Rui Wang¹, Ziwei Liang², Jianwen Luo¹, and Zuo-Xiang He²
¹Tsinghua University, Beijing, China, ²Beijing Tsinghua Changgung Hospital, Beijing, China

Synopsis

Keywords: Analysis/Processing, PET/MR, Multi-modal Registration, Dual Attention Mechanism

Motivation: Registered PET-MRI supports diagnosis better than either modality alone, but traditional registration algorithms are time-consuming and perform poorly in cross-modal registration.

Goal(s): Improve registration efficiency and shorten registration time by enhancing a conventional deep learning registration network.

Approach: We propose a weakly-supervised PET-MRI registration network based on a hybrid adaptive attention mechanism. Masks extracted from a fine-tuned large model are used to constrain the network.

Results: We validate the proposed method on liver PET-MRI images. The experimental results show that it achieves a higher DICE value and shorter registration time than other state-of-the-art registration algorithms.

Impact: The proposed network can help doctors register PET and MRI images and reach a diagnosis in a short period of time.

Introduction

PET, a non-invasive imaging technology, is extensively used for cancer detection and monitoring, while MRI offers high resolution in soft tissue imaging. Registering the two modalities improves doctors' diagnostic efficiency, but factors such as patient respiration and variations in data across devices make it difficult to align corresponding regions. Traditional intensity-based registration methods perform poorly in cross-modal registration because the two modalities rely on different physical principles. VoxelMorph [1] is a representative end-to-end convolutional neural network (CNN) for registration and demonstrates the potential of deep learning in this field. Trained with image similarity losses, it predicts a displacement field that aligns a moving image to a fixed image. However, the limited receptive field of CNNs makes it difficult to extract features from multimodal images for registration. To address these challenges, we propose a weakly-supervised PET-MRI registration network based on a hybrid adaptive attention mechanism and employ multi-scale residual structures to construct displacement fields. Masks extracted from the fine-tuned Segment Anything Model (SAM) [2] are used to constrain the network and guide efficient registration. We validate the proposed method on liver PET-MRI images. The experimental results show that the proposed method achieves a higher DICE value and shorter registration time than other state-of-the-art registration algorithms.

Methods

The proposed PET-MRI registration method yields a dense displacement field between the two images. Our method is based on the VoxelMorph network architecture, as depicted in Fig. 1. First, rough labels were acquired with SAM and used to provide boundary constraints on the region of interest (ROI). We replaced the original convolutional layers with a hybrid adaptive attention module (HAAM) [3], which allows flexible selection of receptive field scales in the channel and spatial dimensions. The channel self-attention mechanism helps select relevant features within a broader receptive field, while the spatial self-attention mechanism helps identify positional relationships among corresponding features; together they contribute to a more accurate deformation field. In the decoder stage, residual structures were introduced to fuse multi-scale displacement field information. The overall loss function consists of three parts: similarity between the moved and fixed images, regularization, and a boundary constraint. We employed mutual information (MI) loss and the modality independent neighborhood descriptor (MIND) [4] loss to balance global and local registration. The liver dataset, comprising 40 pairs of PET-MRI images and corresponding organ labels, was obtained from Beijing Tsinghua Changgung Hospital. Of these, 35 subjects were used for training and 5 for validation. Prior to training, all images underwent a series of preprocessing steps. First, the PET and MRI images were aligned using ITK-SNAP [5]. The images were then resampled to a voxel size of 2.79×2.79×2.79 mm³ and finally cropped and zero-padded to 160×160×160 voxels. The network was implemented in Python using the PyTorch framework on an NVIDIA RTX A6000 GPU. The model was optimized using the Adam optimizer with a learning rate of 10⁻⁴.
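The three-part loss described above can be written schematically as follows. This is only a sketch under stated assumptions: the abstract does not give the exact formulation, so the weights λ_reg, λ_bnd, α, and β, the weighted-sum combination of MI and MIND, and the symbols themselves are illustrative. F denotes the fixed image, M the moving image, φ the predicted deformation, and S_F, S_M the SAM-derived masks of the fixed and moving images.

```latex
% Schematic only: weights and the exact form of each term are assumptions,
% not values or definitions reported in the abstract.
\[
\begin{aligned}
\mathcal{L} &= \mathcal{L}_{\mathrm{sim}}(F,\, M \circ \phi)
  + \lambda_{\mathrm{reg}}\, \mathcal{L}_{\mathrm{reg}}(\phi)
  + \lambda_{\mathrm{bnd}}\, \mathcal{L}_{\mathrm{bnd}}(S_F,\, S_M \circ \phi), \\
\mathcal{L}_{\mathrm{sim}} &= \alpha\, \mathcal{L}_{\mathrm{MI}}(F,\, M \circ \phi)
  + \beta\, \mathcal{L}_{\mathrm{MIND}}(F,\, M \circ \phi).
\end{aligned}
\]
```

In this reading, the MI term would drive global alignment and the MIND term local structure, while the boundary term penalizes disagreement between the warped and fixed ROI masks.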
To quantitatively evaluate the registration performance of the proposed method, we compared it with ANTs [6], affine registration, and VoxelMorph. The evaluation metrics were the DICE score, voxels with non-positive Jacobian determinants (|JΦ| ≤ 0), and registration time.
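The two image-based metrics named above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' evaluation code: the function names `dice_score` and `nonpositive_jacobian_fraction` are hypothetical, the displacement field is assumed to be stored as a (3, D, H, W) array in voxel units, the Jacobian is approximated with finite differences, and reporting folding as a fraction of voxels is an assumption.

```python
import numpy as np


def dice_score(mask_a, mask_b):
    """Dice overlap 2|A∩B| / (|A| + |B|) between two binary masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        # Both masks empty: treat as perfect agreement (a convention, assumed here).
        return 1.0
    return float(2.0 * np.logical_and(a, b).sum() / denom)


def nonpositive_jacobian_fraction(disp):
    """Fraction of voxels where det(J_phi) <= 0 for phi(x) = x + u(x).

    disp: array of shape (3, D, H, W), displacement u in voxel units (assumed layout).
    """
    disp = np.asarray(disp, dtype=float)
    jac = np.empty(disp.shape[1:] + (3, 3))
    for i in range(3):
        # np.gradient returns one finite-difference derivative array per spatial axis.
        grads = np.gradient(disp[i])
        for j in range(3):
            # J_phi = I + du/dx, so add 1 on the diagonal.
            jac[..., i, j] = grads[j] + (1.0 if i == j else 0.0)
    det = np.linalg.det(jac)  # determinant of the 3x3 matrix at every voxel
    return float(np.mean(det <= 0))
```

A non-positive determinant marks a voxel where the deformation folds or collapses locally, so a lower value indicates a more plausible deformation field.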

Results

Our experiments evaluate the accuracy of registration between PET and MRI images. Fig. 2 presents the results obtained by the different methods together with visualizations of the deformation fields. The quantitative results in Fig. 3 show that the proposed method achieves a higher DICE value and shorter registration time than other state-of-the-art registration algorithms.

Conclusion

In this work, we replaced the original convolutional layers with modules featuring a hybrid adaptive channel and spatial attention mechanism, giving the neural network a more robust ability to extract features. Masks extracted from the fine-tuned large model guide the network toward better registration results. The final registration performance significantly surpassed that of other state-of-the-art methods in terms of DICE score and registration time.

Acknowledgements

No acknowledgement found.

References

  1. G. Balakrishnan, A. Zhao, M. R. Sabuncu, J. Guttag, and A. V. Dalca, “VoxelMorph: A Learning Framework for Deformable Medical Image Registration,” IEEE Trans. Med. Imaging, vol. 38, no. 8, pp. 1788–1800, Aug. 2019, doi: 10.1109/TMI.2019.2897538.
  2. A. Kirillov, E. Mintun, N. Ravi, et al., “Segment Anything,” arXiv preprint arXiv:2304.02643, 2023.
  3. G. Chen, L. Li, Y. Dai, J. Zhang, and M. H. Yap, “AAU-Net: An Adaptive Attention U-Net for Breast Lesions Segmentation in Ultrasound Images,” IEEE Trans. Med. Imaging, vol. 42, no. 5, pp. 1289–1300, May 2023, doi: 10.1109/TMI.2022.3226268.
  4. M. P. Heinrich et al., “MIND: Modality independent neighbourhood descriptor for multi-modal deformable registration,” Medical Image Analysis, vol. 16, no. 7, pp. 1423–1435, Oct. 2012, doi: 10.1016/j.media.2012.05.008.
  5. P. A. Yushkevich, Y. Gao, and G. Gerig, “ITK-SNAP: An interactive tool for semi-automatic segmentation of multi-modality biomedical images,” in Proc. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. (EMBS), Orlando, FL, USA, pp. 3342–3345, Aug. 2016, doi: 10.1109/EMBC.2016.7591443.
  6. B. B. Avants, N. J. Tustison, G. Song, P. A. Cook, A. Klein, and J. C. Gee, “A reproducible evaluation of ANTs similarity metric performance in brain image registration,” NeuroImage, vol. 54, no. 3, pp. 2033–2044, Feb. 2011, doi: 10.1016/j.neuroimage.2010.09.025.

Figures

Fig. 1. Illustration of AA-VoxelMorph. (a) A registration network integrating the hybrid adaptive attention module (HAAM) within the VoxelMorph framework. (b) The HAAM consists of two main components: the channel self-attention module and the spatial self-attention module.

Fig. 2. The first row shows the PET images registered using different methods; the numbers in the lower right corner give the DICE score for the tumor area. The second row shows the fixed MRI image, with red arrows indicating differences in registration details. MRI contours (red) and PET contours (blue) of the tumors are overlaid. The zoomed-in views of the deformation fields in the tumor region illustrate the registration accuracy for tumors.

Fig. 3. Quantitative evaluation of the results.

Proc. Intl. Soc. Mag. Reson. Med. 32 (2024)
DOI: https://doi.org/10.58530/2024/1981