1231

Deep learning-based automated scan planning for brain MRI

Gaojie Zhu¹,², Xiongjie Shen², and Hua Guo¹
¹Department of Biomedical Engineering, School of Medicine, Tsinghua University, Center for Biomedical Imaging Research, Beijing, China, ²Anke High-tech Co., Ltd, Shenzhen, China

Synopsis

Keywords: Analysis/Processing, Brain, automatic scan planning

Motivation: Manual scan planning in clinical MRI is inaccurate, inconsistent and time-consuming.

Goal(s): To develop a deep learning-based end-to-end automated scan planning framework for MRI head scans.

Approach: We propose a two-stage end-to-end 3D cascaded convolutional network framework, called 3D CFP-UNet, which localizes five key anatomical landmarks and refines the result from coarse to fine. We also propose two loss functions with physical meaning for automatic scan planning: the point regression loss (PRL) and the direction regression loss (DRL).

Results: Our approach yields satisfactory scan planning results on 229 test subjects, with a point-to-point absolute error (PAE) of 0.872 mm and a point-to-point relative error (PRE) of 0.10%.

Impact: Automated MRI scan planning can improve scan efficiency and scan consistency for follow-up comparisons.

Introduction

Manual planning of scans in clinical magnetic resonance imaging (MRI) exhibits poor accuracy, lacks consistency, and is time-consuming. Meanwhile, classical automated scan planning methods that rely on certain assumptions are not accurate or stable enough, and are computationally inefficient for practical application scenarios.

Methods

As presented in Figure 1, a deep learning-based end-to-end automated scan planning framework has been developed for MRI head scans. Our model takes a 3D pre-scan image as input and uses a cascaded 3D convolutional neural network to detect anatomical landmarks and plane directions from coarse to fine. The structure of the 3D CFP-UNet is derived from the typical 3D U-Net (1) and has been improved in three key areas: network architecture, feature extraction module and cascade structure. First, inspired by residual networks (2), we construct a deeper 3D U-Net backbone with enhanced feature extraction ability by incorporating additional residual convolutional kernels. Second, the 3D U-Net backbone is designed with an asymmetric structure in which the encoder comprises more convolutional kernels than the decoder. This design choice enhances the model's feature extraction capability and effectively limits the number of parameters in the model (3). Finally, because the network is deeper, the number of convolutional kernel channels is reduced to mitigate the risk of overfitting.

The loss function in this study comprises two main components: the point regression loss (PRL), typically used in landmark identification problems, and the direction regression loss (DRL), which is based on the physical meaning of the scan planning task. Both are calculated in each of the two stages of the 3D CFP-UNet model, so the final loss function includes four loss terms that are weighted and combined for supervised training of the network. The overall loss function is computed as follows: $$L_{CFP} = \sum_{i=1}^{2}\alpha_i\,(w_1 L_{PRL}+w_2 L_{DRL}),$$ where $$$L_{PRL}$$$ and $$$L_{DRL}$$$ are the point regression loss and direction regression loss, respectively, evaluated on the output of stage $$$i$$$. The weights $$$w_1$$$ and $$$w_2$$$ of these two losses were both set to 1 in the experiment. $$$\alpha_i$$$ represents the loss weight of each stage in the cascade network; the first and second stages have weights of 1/3 and 2/3, respectively.
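The weighted combination above can be illustrated with a minimal Python sketch. The abstract does not give the exact forms of the two losses, so the definitions here are assumptions chosen only for illustration: the point regression loss is taken as the mean Euclidean distance between predicted and reference landmarks, and the direction regression loss as one minus the cosine similarity between predicted and reference plane normals. All function and variable names are hypothetical.

```python
import math

def point_regression_loss(pred_points, true_points):
    """Assumed form of L_PRL: mean Euclidean distance between landmark pairs."""
    dists = [math.dist(p, t) for p, t in zip(pred_points, true_points)]
    return sum(dists) / len(dists)

def direction_regression_loss(pred_normal, true_normal):
    """Assumed form of L_DRL: 1 - cosine similarity between plane normals."""
    dot = sum(a * b for a, b in zip(pred_normal, true_normal))
    norm = math.hypot(*pred_normal) * math.hypot(*true_normal)
    return 1.0 - dot / norm

def cfp_loss(stage_outputs, true_points, true_normal,
             alphas=(1 / 3, 2 / 3), w1=1.0, w2=1.0):
    """L_CFP = sum_i alpha_i * (w1 * L_PRL + w2 * L_DRL) over the cascade stages.

    stage_outputs holds one (predicted_points, predicted_normal) pair per stage.
    The default weights follow the values stated in the text.
    """
    total = 0.0
    for alpha, (points, normal) in zip(alphas, stage_outputs):
        prl = point_regression_loss(points, true_points)
        drl = direction_regression_loss(normal, true_normal)
        total += alpha * (w1 * prl + w2 * drl)
    return total

# Toy example: two landmarks and one plane normal per stage.
true_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
true_normal = (0.0, 0.0, 1.0)
coarse = ([(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)], (0.0, 0.0, 1.0))  # points off by 3
fine = ([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], (0.0, 1.0, 0.0))    # normal off by 90 degrees
example_loss = cfp_loss([coarse, fine], true_points, true_normal)
```

With the stage weights 1/3 and 2/3, errors in the second (fine) stage contribute twice as much to the total as equal errors in the first (coarse) stage.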

In this study, we used the Turbo Field Echo 3D (TFE3D) sequence to acquire data and generate 3D images. Clinicians manually labelled the test set using 3D Slicer software (4). A total of 559 volunteers underwent automatic brain scan planning, with 312 subjects assigned to the training set, 12 to the validation set, and 229 to the test set. The data were obtained using a SuperMark 1.5T MRI system (Anke High-Tech Co., Ltd., Shenzhen, China). The study was approved by the institutional review board of Tsinghua University, and all volunteers provided written informed consent. We implemented the 3D CFP-UNet model in the PyTorch deep learning framework, using four NVIDIA A4000 graphics cards with 16 GB of video memory each. The batch size was 12, and the learning rate was set to 0.001. With a regularization coefficient of 0.0005, 200 epochs of training took approximately 12 hours.

Results

Evaluation of 3D CFP-UNet Performance.
The performance of 3D CFP-UNet was assessed on 229 samples, with an average prediction time of 0.2 seconds per sample. The results are presented in Table 1, and a typical example is shown in Figure 2. On the test data, the point-to-point absolute error (PAE) was 0.872 mm, the point-to-point relative error (PRE) was 0.10%, and the average angular errors (AAE) were 0.502°, 0.381°, and 0.675° in the sagittal, transverse, and coronal planes, respectively.
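As an illustration of how errors of this kind can be computed, the following Python sketch evaluates a point-to-point absolute error and an angular error. It assumes that PAE is the mean Euclidean distance (in mm) between predicted and reference landmarks and that the angular error is the angle between predicted and reference plane normals. These definitions and the function names are assumptions, not taken from the abstract. PRE is omitted because the abstract does not state what the relative error is normalized by.

```python
import math

def point_absolute_error(pred_points, true_points):
    """Mean Euclidean distance between predicted and reference landmarks (mm)."""
    dists = [math.dist(p, t) for p, t in zip(pred_points, true_points)]
    return sum(dists) / len(dists)

def angular_error_deg(pred_normal, true_normal):
    """Angle in degrees between predicted and reference plane normals."""
    dot = sum(a * b for a, b in zip(pred_normal, true_normal))
    norm = math.hypot(*pred_normal) * math.hypot(*true_normal)
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))

# Toy example with made-up coordinates in mm.
pae = point_absolute_error([(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
                           [(3.0, 4.0, 0.0), (1.0, 1.0, 1.0)])
angle = angular_error_deg((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
```

Averaging the per-subject angular error separately for the sagittal, transverse, and coronal plane normals would give one AAE value per plane, matching the way the results are reported.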

Effect of network structure and physical loss function.
In this study, we compared four combinations of network structures and loss functions: 1) 3D U-Net, a basic 3D U-Net with mean square error (MSE) as the loss function; 2) 3D U-Net+PRL, a basic 3D U-Net with point regression loss as the loss function; 3) 3D CFP-UNet+PRL, a 3D CFP-UNet with point regression loss as the loss function; 4) 3D CFP-UNet+PRL+DRL, a 3D CFP-UNet with point regression loss and direction regression loss as the combined loss function. Table 2 shows the quantitative performance evaluation results for these four models. The proposed 3D CFP-UNet+PRL+DRL has the best performance in all quantitative metrics, with a PAE of 0.886 mm, a PRE of 0.11%, and AAEs of 0.521°, 0.384°, and 0.681° in the sagittal, transverse, and coronal planes, respectively.

Conclusion & Discussion

The proposed deep learning-based automated scan planning method shows high accuracy and robustness for clinical brain scans and is highly efficient, with an average prediction time of 0.2 seconds per sample.

Acknowledgements

No acknowledgement found.

References

1. Çiçek Ö, Abdulkadir A, Lienkamp SS, Brox T, Ronneberger O. 3D U-Net: Learning Dense Volumetric Segmentation from Sparse Annotation. International Conference on Medical Image Computing and Computer-Assisted Intervention; 2016.
2. He K, Zhang X, Ren S, Sun J. Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2016:770-8.
3. Rosas-Gonzalez S, Birgui-Sekou T, Hidane M, Zemmoura I, Tauber C. Asymmetric Ensemble of Asymmetric U-Net Models for Brain Tumor Segmentation With Uncertainty Estimation. Frontiers in Neurology. 2021;12.
4. Fedorov A, Beichel RR, Kalpathy-Cramer J, Finet J, Fillion-Robin J-C, Pujol S, et al. 3D Slicer as an image computing platform for the Quantitative Imaging Network. Magnetic Resonance Imaging. 2012;30(9):1323-41.

Figures

Figure 1. Overview of the proposed 3D CFP-UNet for automatic scan planning. (A) The general framework of 3D CFP-UNet, which comprises two sets of 3D U-Net and feature pyramid modules, arranged alternately and connected in cascade. (B) The feature pyramid module, in which four feature maps with varying resolutions are combined to provide both identification precision and efficiency.

Figure 2. A typical scan planning outcome based on the prediction of the 3D CFP-UNet model. Before automatic scan planning, the subject's 3D view shows a noticeable tilt. After automatic scan planning, the scan position is placed accurately in each plane. Landmarks such as the nasal root (red dot), the point at the rostral end of the corpus callosum (yellow dot) and the point at the end of the corpus callosum compression section (light green dot) are clearly visible in the sagittal plane.

Table 1. 3D CFP-UNet test results on 229 samples.

Table 2. Quantitative performance evaluation results of 4 models on 229 test samples.

Proc. Intl. Soc. Mag. Reson. Med. 32 (2024)

DOI: https://doi.org/10.58530/2024/1231