Multi-Spectral Imaging (MSI) methods, such as SEMAC and MAVRIC-SL, resolve metal-induced field perturbations by applying additional encoding in the spectral dimension, at the cost of increased scan time. In this work, we introduce a 3D-CNN-based reconstruction to accelerate MSI utilizing spatial-spectral features of aliasing artifacts. We demonstrate in in vivo experiments that the proposed method can accelerate MAVRIC-SL acquisitions by a factor of 3 when used alone, and 17-25 when combined with parallel imaging and half-Fourier acquisition. The 3D-CNN showed significant improvement in image quality compared with parallel image and compressed sensing (PI&CS), with negligible additional computation time.
Multi-Spectral Imaging (MSI) techniques, such as SEMAC1 and MAVRIC-SL2, resolve most metal-induced artifacts by acquiring separate 3D spatial encodings for multiple spectral bins (similar to slices), at the cost of increased scan time. Various methods have been explored to accelerate MSI by exploiting correlations between spectral bins3-5. Model-based reconstruction4 and RPCA5 explicitly or implicitly model spectral bin images as the same underlying magnetization modulated by different RF excitation profiles. They can provide around 20-fold acceleration when combined with parallel imaging (PI) and partial Fourier sampling (PF). However, they require long reconstruction times due to computationally expensive iterations.
Deep learning is an emerging technique in MRI reconstruction with negligible computation time compared to iterative methods6-9. In this work, we introduce a 3D convolutional neural network (3D-CNN) for accelerating MSI, which learns features of aliasing artifacts at different scales in spatial-spectral domain. We assess image quality in reader studies, comparing 3D-CNN output from retrospectively under-sampled data to the fully-sampled reference.
Reconstruction framework
The framework of the proposed 3D-CNN-based reconstruction is shown in Fig.1a. The under-sampled k-space data of each bin is first processed separately with PI & CS10. Next a 3D-CNN is used to remove residual aliasing artifacts and blurring in the initial bin images. The initial bin images are first sliced along the readout (x) direction, so that the input of the network has dimension [y, z, bin]. The output of the network is the residual image, which has the same dimension. The reconstructed bin images (initial + residual) are combined to form the final composite image.
Network architecture
A 3D U-Net11 architecture (Fig. 1b) with a total of 16 convolution layers is used for this study. The 2D U-Net12 has been popular in MRI reconstruction for its ability to learn features at different scales without losing high-spatial-frequency information7,8. Here the spectral dimension is treated equivalently to the spatial dimensions in convolutions, downscaling and upscaling operations. The spectral dimension can be preserved throughout the network by using 3D convolutions, but not 2D convolutions, as demonstrated in Fig.2.
Experiments
The 3D-CNN was trained and tested with MAVRIC-SL proton-density-weighted scans of 15 volunteers (8 for training, 7 for test) with total-hip-replacement implants. All scans were performed on GE 3T MRI systems with 24 spectral bins, 2x2 uniform under-sampling and half-Fourier acquisition. Other parameters include: 32-channel torso array, matrix size=384x256x(24-44), voxel size=1.0x1.6x4.0mm3. The images reconstructed by bin-by-bin PI&CS using all acquired data were used as the reference. Outer k-space was further under-sampled by 5 retrospectively with complementary Poisson-disc sampling13 (total R=17-25), and reconstructed by PI&CS and 3D-CNN. A total of 3072 training samples were used since y-z slices were reconstructed separately. Images were evaluated with normalized root-mean-square error (nRMSE), structural similarity index (SSIM), and scored by an experienced MSK radiologist using a 5-point scale (from 0 to 5: non-diagnostic; limited; diagnostic; good; excellent) in three categories: image sharpness, artifacts near metal and overall image quality.
The 3D-CNN significantly improved the image quality of bin-by-bin PI&CS with negligible computation time (<10s with 1 GPU). The network learns aliasing artifacts effectively, because the encoder/decoder blocks can analyze and synthesize features at multiple spatial-spectral scales. We also experimented with a 2D U-Net architecture, which collapses the spectral dimension at the beginning, and the results were blurrier than 3D U-Net. The 3D U-Net architecture may be useful in other image reconstruction problems to utilize correlations in non-spatial dimensions, such as the temporal dimension.
The bin-by-bin initialization step integrates PI into the proposed reconstruction and reduced the dimension of the network by combining the coil images. Since this step is currently the bottleneck in terms of both computation and image quality, we will explore end-to-end deep learning in future work.