Gaojie Zhu1,2, Xiongjie Shen2, and Hua Guo1
1Department of Biomedical Engineering, School of Medicine, Tsinghua University, Center for Biomedical Imaging Research, Beijing, China, 2Anke High-tech Co., Ltd, Shenzhen, China
Synopsis
Keywords: Analysis/Processing, Brain, automatic scan planning
Motivation: Manual scan planning in clinical MRI is inaccurate, inconsistent and time-consuming.
Goal(s): To develop a deep learning-based end-to-end automated scan planning framework for MRI head scans.
Approach: We propose a two-stage, end-to-end, cascaded 3D convolutional network, called 3D CFP-UNet, which localizes five key anatomical landmarks in a coarse-to-fine manner. We also propose a loss function that combines a point regression loss (PRL) with a direction regression loss (DRL) reflecting the physical meaning of scan planning.
Results: Our approach yields satisfactory scan planning results on 229 test subjects, with a point-to-point absolute error (PAE) of 0.872 mm and a point-to-point relative error (PRE) of 0.10%.
Impact: Automated MRI scan planning can improve scan efficiency and the consistency of scans used for follow-up comparisons.
Introduction
Manual planning of scans in clinical
magnetic resonance imaging (MRI) exhibits poor accuracy, lacks consistency, and
is time-consuming. Meanwhile, classical automated scan planning methods that
rely on certain assumptions are not accurate or stable enough, and are
computationally inefficient for practical application scenarios.
Methods
As presented in Figure 1, a deep
learning-based end-to-end automated scan planning framework has been developed
for MRI head scans. Our model takes a 3D pre-scan image as input and uses a
cascaded 3D convolutional neural network to detect anatomical landmarks and plane
directions from coarse to fine. The structure of the 3D CFP-UNet is derived
from the typical 3D U-Net (1) and has been improved in three key areas:
network architecture, feature extraction module and cascade structure. First, inspired
by residual networks (2), we construct a deeper 3D U-Net backbone with
enhanced feature extraction ability by incorporating additional residual
convolutional kernels. Second, the 3D U-Net backbone is designed with an
asymmetric structure in which the encoder comprises more convolutional kernels
than the decoder. This design choice enhances the model's feature extraction
capability while limiting the number of parameters in the model (3).
Finally, we reduce the number of convolutional kernel channels
while increasing network depth to mitigate the risk of overfitting.
The loss function in this study comprises
two main components: the point regression loss, typically used in landmark
identification problems, and the direction regression loss, which is based on
the physical meaning of the scan planning task. Calculated consecutively in two
stages of the 3D CFP-UNet model, the final loss function includes four loss
factors that are combined and weighted for supervised training of the network.
The general loss function is computed as follows:
$$L_{CFP} = \sum_{i=1}^{2}\alpha_i\,(w_1 L_{PRL}+w_2 L_{DRL}),$$
where $$$L_{PRL}$$$ and $$$L_{DRL}$$$ are the point regression loss and the
direction regression loss, respectively, computed at stage $$$i$$$. The
weights $$$w_1$$$ and $$$w_2$$$ for these two losses are both
set to 1 in the experiments. $$$\alpha_i$$$ represents the loss weight of each stage in
the cascade network; the first and second stages of the network have
weights of 1/3 and 2/3, respectively.
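The weighting in the equation above can be illustrated with a short worked example. This is only a sketch of the arithmetic: the function name `cfp_loss` and the per-stage loss values are hypothetical, and the internal definitions of the PRL and DRL terms are not reproduced here. Only the weights (1/3 and 2/3 for the stages, 1 for both loss terms) come from the text.

```python
from typing import Sequence, Tuple


def cfp_loss(
    stage_losses: Sequence[Tuple[float, float]],
    alphas: Sequence[float] = (1 / 3, 2 / 3),
    w_prl: float = 1.0,
    w_drl: float = 1.0,
) -> float:
    """Combine per-stage (PRL, DRL) values into one scalar.

    stage_losses holds one (L_PRL, L_DRL) pair per cascade stage,
    ordered coarse stage first, fine stage second.
    """
    if len(stage_losses) != len(alphas):
        raise ValueError("need exactly one stage weight per stage")
    total = 0.0
    for alpha, (l_prl, l_drl) in zip(alphas, stage_losses):
        # alpha_i * (w_1 * L_PRL + w_2 * L_DRL) for stage i
        total += alpha * (w_prl * l_prl + w_drl * l_drl)
    return total


# Hypothetical loss values, chosen only to make the arithmetic easy to follow.
coarse_stage = (0.9, 0.3)   # (L_PRL, L_DRL) at stage 1
fine_stage = (0.3, 0.15)    # (L_PRL, L_DRL) at stage 2
total = cfp_loss([coarse_stage, fine_stage])
print(f"{total:.3f}")  # 1/3 * 1.2 + 2/3 * 0.45 = 0.700
```

Because the second stage carries twice the weight of the first, errors in the fine prediction dominate the combined loss, which matches the coarse-to-fine intent described above.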
In this study, we used the Turbo Field Echo
3D (TFE3D) sequence to acquire data and generate 3D images. Clinicians manually
labelled the test set using 3D Slicer software (4). 559 volunteers underwent
automatic brain scan planning, with 312 subjects assigned to the training set,
12 to the validation set, and 229 to the test set. The data were acquired on
a SuperMark 1.5T MRI system (Anke High-Tech Co., Ltd., Shenzhen, China). The study was approved by the institutional review board of Tsinghua University, and
all volunteers provided written informed consent. We implemented our 3D
CFP-UNet model in the PyTorch deep learning framework, using four
NVIDIA A4000 graphics cards with 16 GB of video memory. The batch size was 12, and
the learning rate was set to 0.001. With a regularization coefficient
of 0.0005, training for 200 epochs took approximately 12 hours.
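To put the reported training settings in perspective, the sketch below collects them in a small configuration object and derives the number of optimizer steps. The class name `TrainConfig` and its field names are hypothetical. The sketch assumes that the batch size of 12 is the total across all four GPUs and that no partial batch is dropped; neither detail is stated in the text.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    # Values reported in the text.
    n_train: int = 312            # training subjects
    batch_size: int = 12          # assumed to be the global batch size
    learning_rate: float = 1e-3
    regularization: float = 5e-4  # "regular term coefficient" in the source
    epochs: int = 200
    train_hours: float = 12.0     # approximate wall-clock time

    @property
    def steps_per_epoch(self) -> int:
        # Round up so that a final partial batch would still be counted.
        return math.ceil(self.n_train / self.batch_size)

    @property
    def total_steps(self) -> int:
        return self.epochs * self.steps_per_epoch

    @property
    def seconds_per_step(self) -> float:
        return self.train_hours * 3600 / self.total_steps


cfg = TrainConfig()
print(cfg.steps_per_epoch)             # 26
print(cfg.total_steps)                 # 5200
print(f"{cfg.seconds_per_step:.1f}")   # 8.3
```

Under these assumptions, each epoch consists of 26 steps and the whole run of 5200 steps averages roughly 8 seconds per step, a figure that would also include any validation and data-loading overhead.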
Results
Evaluation of 3D CFP-UNet Performance.
The performance of 3D CFP-UNet was assessed
on 229 samples with an average prediction time of 0.2 seconds per sample. The
results are presented in Table 1 and shown in Figure 2. The test data revealed
a point-to-point absolute error (PAE) of 0.872 mm, point-to-point relative
error (PRE) of 0.10%, and an average angular error (AAE) of 0.502°, 0.381°, and
0.675° in sagittal, transverse, and coronal planes, respectively.
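The abstract does not give formulas for these metrics, so the following is a minimal sketch under stated assumptions: PAE is taken as the mean Euclidean distance (in mm) between predicted and labelled landmarks, and the angular error is taken as the angle between the unit normals of the predicted and reference planes. All function names are hypothetical, and how each scan plane is built from the five landmarks is not specified in the text; here a plane is simply defined by three points. PRE is omitted because its normalization is not stated.

```python
import math
from typing import Sequence

Point = Sequence[float]


def point_abs_error(pred: Point, true: Point) -> float:
    """Euclidean distance between one predicted and one labelled landmark."""
    return math.dist(pred, true)


def mean_pae(preds: Sequence[Point], trues: Sequence[Point]) -> float:
    """Mean point-to-point absolute error over a set of landmarks."""
    return sum(point_abs_error(p, t) for p, t in zip(preds, trues)) / len(trues)


def plane_normal(p0: Point, p1: Point, p2: Point) -> tuple:
    """Unit normal of the plane through three non-collinear points."""
    u = [b - a for a, b in zip(p0, p1)]
    v = [b - a for a, b in zip(p0, p2)]
    n = (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )
    length = math.sqrt(sum(c * c for c in n))
    if length == 0.0:
        raise ValueError("points are collinear; the plane is undefined")
    return tuple(c / length for c in n)


def angular_error_deg(n_pred: Point, n_true: Point) -> float:
    """Angle in degrees between two planes given their unit normals."""
    dot = sum(a * b for a, b in zip(n_pred, n_true))
    # abs(): a plane's normal is only defined up to sign.
    # min(): guards acos against rounding slightly above 1.
    return math.degrees(math.acos(min(1.0, abs(dot))))


# Reference plane z = 0, and a predicted plane tilted by 0.5 degrees about x.
t = math.radians(0.5)
n_true = plane_normal((0, 0, 0), (1, 0, 0), (0, 1, 0))
n_pred = plane_normal((0, 0, 0), (1, 0, 0), (0, math.cos(t), math.sin(t)))
print(f"{angular_error_deg(n_pred, n_true):.3f}")  # 0.500
```

Taking the absolute value of the dot product treats a plane and its flipped copy as identical; if slice orientation matters for the planning task, the sign would have to be kept.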
Effect of network structure and physical loss function.
In this study, we compare four
combinations of network structures and loss functions: 1) 3D
U-Net, a basic 3D U-Net network structure with mean square error (MSE) as
the loss function; 2) 3D U-Net+PRL, a basic 3D U-Net network structure with
point regression loss as the loss function; 3) 3D CFP-UNet+PRL, the 3D CFP-UNet
with point regression loss as the loss function; 4) 3D CFP-UNet+PRL+DRL, the 3D
CFP-UNet with point regression loss and direction regression loss as the combined
loss function. Table 2 shows the quantitative performance evaluation results
for these four models. The table shows that the 3D CFP-UNet+PRL+DRL proposed in
this paper has the best performance in all quantitative metrics, with a
PAE of 0.886 mm, a PRE of 0.11%, and AAEs of 0.521°, 0.384°, and 0.681°
in the sagittal, transverse, and coronal planes,
respectively.
Conclusion & Discussion
The proposed deep learning-based automated
scan planning framework shows high accuracy and robustness for clinical brain scans and
is highly efficient.
Acknowledgements
No acknowledgement found.
References
1. Çiçek Ö, Abdulkadir A, Lienkamp SS, Brox T, Ronneberger O. 3D U-Net: Learning Dense Volumetric Segmentation from Sparse Annotation. International Conference on Medical Image Computing and Computer-Assisted Intervention; 2016.
2. He K, Zhang X, Ren S, Sun J. Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2016:770-8.
3. Rosas-Gonzalez S, Birgui-Sekou T, Hidane M, Zemmoura I, Tauber C. Asymmetric Ensemble of Asymmetric U-Net Models for Brain Tumor Segmentation With Uncertainty Estimation. Frontiers in Neurology. 2021;12.
4. Fedorov A, Beichel RR, Kalpathy-Cramer J, Finet J, Fillion-Robin J-C, Pujol S, et al. 3D Slicer as an image computing platform for the Quantitative Imaging Network. Magnetic Resonance Imaging. 2012;30(9):1323-41.