4667

Reconstruction of high undersampling rate images using a cascade of convolutional neural networks

Xi Chen¹, Shuo Chen², and Rui Li²

¹Beijing Institute of Technology, Beijing, China, ²Center for Biomedical Imaging Research, Department of Biomedical Engineering, School of Medicine, Tsinghua University, Beijing, China

Synopsis

Imaging speed is important in many magnetic resonance imaging (MRI) applications because long scan times increase the risk of artifacts. Reconstruction methods based on compressed sensing and deep learning have significantly increased the speed of MRI scans. However, current models perform poorly at high undersampling rates. Here we used a large dataset to raise the undersampling rate of a CNN-based MR reconstruction while maintaining high image quality. Our results showed an average root-mean-square error of about 2.6% when reconstructing from 16-fold undersampled k-space, which outperforms a traditional method.

Introduction

Imaging speed is important in many magnetic resonance imaging (MRI) applications. Long scan times increase the risk of motion artifacts and are impractical for breath-hold acquisitions. Many works therefore aim to reduce the amount of acquired data and to speed up MRI scans. Compressed sensing makes it possible to sample below the Nyquist rate, and deep learning can further improve image quality. Recently, a deep cascade of convolutional neural networks (CNNs) that exploits the similarity of adjacent frames¹ was proposed and showed promising results. However, constrained by a small training dataset, the undersampling rate of this CNN method is limited. In this study, we aim to use a large dataset to raise the undersampling rate of this CNN-based MR reconstruction while maintaining high image quality.

Method

Image dataset acquisition: In this study, 41 fully sampled short-axis cardiac cine MR scans and 352 fully sampled long-axis cardiac cine MR scans were used. Each cine scan contains 30 temporal frames. The data were divided into a training set and a test set at a ratio of about 3:1. The two sections were trained separately. Each image was resized to 256×256 pixels.

Sensor-domain dataset acquisition: K-space data were obtained by applying the FFT to the input images and were then undersampled with masks. Each frame was fully sampled along the kx-axis and randomly undersampled along the ky-axis. For each frame, eight central k-space lines were acquired, and the remaining lines were sampled randomly according to a Gaussian distribution.
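The sampling scheme described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the Gaussian width (sigma_frac), and the rule "total lines = n_ky // accel" are assumptions, since the abstract does not state them.

```python
import numpy as np

def gaussian_ky_mask(n_ky=256, n_kx=256, accel=12, n_center=8,
                     sigma_frac=0.25, rng=None):
    """Binary (n_ky, n_kx) mask: full along kx, undersampled along ky.

    sigma_frac (Gaussian width as a fraction of n_ky) and the line budget
    n_ky // accel are assumed values, not taken from the abstract.
    """
    rng = np.random.default_rng() if rng is None else rng
    center = n_ky // 2
    keep = np.zeros(n_ky, dtype=bool)
    lo = center - n_center // 2
    keep[lo:lo + n_center] = True            # always-acquired central lines

    # Gaussian sampling density over the lines that are not yet acquired
    ky = np.arange(n_ky)
    pdf = np.exp(-0.5 * ((ky - center) / (sigma_frac * n_ky)) ** 2)
    pdf[keep] = 0.0
    pdf /= pdf.sum()

    n_extra = max(n_ky // accel - n_center, 0)
    extra = rng.choice(n_ky, size=n_extra, replace=False, p=pdf)
    keep[extra] = True
    return np.repeat(keep[:, None], n_kx, axis=1)

def undersample(image, mask):
    """Centered 2D FFT of one frame, followed by masking of k-space."""
    k = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(image)))
    return k * mask

# One mask per frame; a different random draw would be used for each frame.
mask = gaussian_ky_mask(accel=16, rng=np.random.default_rng(0))
k_us = undersample(np.ones((256, 256)), mask)
```

With these assumptions, a 16-fold mask on a 256-line grid keeps 16 ky lines per frame, 8 of them in the center.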

Model architecture: We implemented a cascade of CNNs¹. The input of the CNN first passes through a data sharing layer, in which each frame combines k-space information from adjacent frames by averaging. To maintain the fidelity of the reconstructed image, the predicted k-space data are recombined with the acquired undersampled data in a data consistency layer. The output of each CNN is then fed as input to the next CNN, forming a cascade. The architecture is denoted Dn_d-Cn_c, where n_d is the depth of each CNN and n_c is the number of cascaded CNNs. The model architecture is shown in Figure 1.
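The two non-learned layers described above can be sketched in NumPy. This is a hedged illustration under stated assumptions rather than the implementation used here: the function names, the neighbor window n_adj, the clamping at the first and last frame, and the hard (noiseless) replacement in the data consistency step are all assumptions.

```python
import numpy as np

def data_sharing(kspace, masks, n_adj=1):
    """Average acquired k-space samples over adjacent frames.

    kspace, masks: arrays of shape (frames, ky, kx); masks is True where a
    sample was acquired. n_adj (neighbors on each side) is an assumed value.
    Positions that no frame in the window acquired stay zero.
    """
    kspace = np.asarray(kspace, dtype=complex)
    masks = np.asarray(masks, dtype=bool)
    n_frames = kspace.shape[0]
    shared = np.zeros_like(kspace)
    for t in range(n_frames):
        lo, hi = max(t - n_adj, 0), min(t + n_adj + 1, n_frames)
        total = (kspace[lo:hi] * masks[lo:hi]).sum(axis=0)
        count = masks[lo:hi].sum(axis=0)
        np.divide(total, count, out=shared[t], where=count > 0)
    return shared

def data_consistency(image_pred, k_acquired, mask):
    """Keep acquired k-space samples and fill the rest from the CNN output.

    Assumes noiseless data (hard replacement). k_acquired and mask must use
    the same unshifted FFT layout as np.fft.fft2.
    """
    k_pred = np.fft.fft2(image_pred, axes=(-2, -1))
    k_dc = np.where(mask, k_acquired, k_pred)
    return np.fft.ifft2(k_dc, axes=(-2, -1))
```

In a Dn_d-Cn_c cascade, a data consistency step of this kind would follow each of the n_c CNNs, so that acquired samples are restored before the next CNN sees the image.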

Reconstruction: Each model was trained on a GTX980M GPU until the training loss plateaued (about 70 to 100 epochs). Constrained by GPU memory, we first used a D2-C2 architecture for 12-fold undersampled 256×256 images and then a D2-C3 architecture for 16-fold undersampled 256×256 images. This is a relatively small structure, and the performance is likely to improve with a better GPU.
For comparison, a k-t SLR²,³ reconstruction using radially sampled k-space with the same undersampling rate was performed.

Results and Discussion

Figure 3 shows the reconstruction result of a 12-fold undersampled 256×256 image using CNNs. The mean RMSE (root-mean-square error) of the prediction is 1.74% (standard deviation 2.196e-3). For the whole 12-fold test set, the mean RMSE of short-axis cardiac cine MR is 2.20% (4.2e-3) and the mean RMSE of long-axis cardiac cine MR is 2.20% (5.1e-3). By contrast, Figure 4 shows the image reconstructed by k-t SLR at the same undersampling rate. The RMSE of k-t SLR is 3.07% (1.15e-3).
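For reference, a percentage RMSE of this kind could be computed as in the sketch below. The abstract does not state how the error was normalized, so dividing by the maximum magnitude of the ground truth is an assumption, and the function name rmse_percent is hypothetical.

```python
import numpy as np

def rmse_percent(recon, truth):
    """RMSE between magnitude images, as a percentage of the ground-truth maximum.

    The normalization by truth.max() is an assumption; the abstract does not
    specify which normalization was used.
    """
    recon = np.abs(np.asarray(recon))
    truth = np.abs(np.asarray(truth))
    return 100.0 * np.sqrt(np.mean((recon - truth) ** 2)) / truth.max()

# A uniform 2% intensity error on a unit image gives an RMSE of about 2%.
truth = np.ones((4, 4))
print(rmse_percent(0.98 * truth, truth))
```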

The result demonstrates that the cascade of CNNs performs better than the k-t SLR method. As for reconstruction time, each frame took 170 ms on average on the GTX980M using CNNs (about 5 s for a 30-frame cine), whereas k-t SLR took 30 minutes to reconstruct 30 frames with 500 iterations.

Figure 5 shows the result for a 16-fold undersampled 256×256 image reconstructed with a D2-C3 cascade of CNNs. Its RMSE is 2.09% (1.62e-3). For the whole 16-fold test set, the mean RMSE of short-axis cardiac cine MR is 2.67% (5.2e-3) and the mean RMSE of long-axis cardiac cine MR is 2.62% (6.2e-3). This result also outperforms the k-t SLR method.

Conclusion

We applied a larger dataset to the deep cascade of convolutional neural networks and trained the model to reconstruct cardiac cine MRI. The deep cascade CNN performed well at 12- and 16-fold undersampling rates. Its image quality and reconstruction speed outperform the traditional method.

Acknowledgements

No acknowledgement found.

References

1. J. Schlemper, J. Caballero, J. V. Hajnal, A. Price, and D. Rueckert, “A deep cascade of convolutional neural networks for dynamic MR image reconstruction,” IEEE Transactions on Medical Imaging, vol. 37, no. 2, 2018.

2. S. G. Lingala, Y. Hu, E. DiBella, and M. Jacob, “Accelerated dynamic MRI exploiting sparsity and low rank structure: k-t SLR,” IEEE Transactions on Medical Imaging, vol. 30, pp. 1042-1054, May 2011.

3. S. G. Lingala, Y. Hu, E. DiBella, and M. Jacob, “Accelerated myocardial perfusion imaging using improved k-t SLR,” IEEE International Symposium on Biomedical Imaging (ISBI), 2011.

Figures

Figure 1. Model architecture.

Figure 2. Undersampling mask.

Figure 3. Reconstruction of a 12-fold undersampled 256×256 image using CNNs. (a) ground truth; (b) direct iFFT of undersampled k-space; (c) prediction of CNN reconstruction; (d) error of the prediction.

Figure 4. Reconstruction of a 12-fold undersampled 256×256 image using k-t SLR. (a) ground truth; (b) direct iFFT of undersampled k-space; (c) reconstruction of k-t SLR; (d) error of k-t SLR reconstruction.

Figure 5. Reconstruction of a 16-fold undersampled 256×256 image using CNNs. (a) ground truth; (b) direct iFFT of undersampled k-space; (c) prediction of CNN reconstruction; (d) error of the prediction.

Proc. Intl. Soc. Mag. Reson. Med. 27 (2019)
