Keywords: AI/ML Image Reconstruction, Machine Learning/Artificial Intelligence
Motivation: MRI guidance of an interventional procedure (iMRI) requires fast image reconstruction. A neural network (NN)-based approach can exploit the similarities between consecutive frames to improve iMRI image reconstruction.
Goal(s): We investigate whether an LSTM can reconstruct images from just ten spokes per frame within a timeframe compatible with iMRI.
Approach: A convolutional (conv)LSTM was trained using the open-source ACDC dataset. Results were compared with the multi-domain convolutional neural network (MD-CNN), a recently published 3D NN-based method for undersampled MRI reconstruction.
Results: The convLSTM reconstructed frames at ~226 fps, about 17x faster than MD-CNN (~13 fps). SSIM for the convLSTM was slightly lower than for MD-CNN (0.85 vs 0.89).
Impact: With our LSTM-based model, we achieved a 17x speed-up in iMRI image reconstruction relative to MD-CNN, with only a small reduction in image quality (SSIM 0.85 vs 0.89). This suggests that an LSTM-based method could be used to accelerate iMRI reconstruction at a small cost in image quality.
El-Rewaidy H, Fahmy AS, Pashakhanloo F, Cai X, Kucukseymen S, Csecs I, Neisius U, Haji-Valizadeh H, Menze B, Nezafat R. Multi-domain convolutional neural network (MD-CNN) for radial reconstruction of dynamic cardiac MRI. Magn Reson Med. 2021 Mar;85(3):1195-1208. doi: 10.1002/mrm.28485. Epub 2020 Sep 13. PMID: 32924188.
Wenjie Lu, Jiazheng Li, Yifan Li, Aijun Sun, Jingyang Wang, and Abd E. I.-Baset Hassanien. 2020. A CNN-LSTM-Based Model to Forecast Stock Prices. Complex. 2020 (2020). https://doi.org/10.1155/2020/6622927
Yu Chen, Ruixin Fang, Ting Liang, Zongyu Sha, Shicheng Li, Yugen Yi, Wei Zhou, Huilin Song, and Yi-Zhang Jiang. 2021. Stock Price Forecast Based on CNN-BiLSTM-ECA Model. Sci. Program. 2021 (2021). https://doi.org/10.1155/2021/2446543
Xingjian Shi, Zhourong Chen, Hao Wang, Dit-Yan Yeung, Wai-kin Wong, and Wang-chun Woo. 2015. Convolutional LSTM Network: a machine learning approach for precipitation nowcasting. In Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 1 (NIPS'15). MIT Press, Cambridge, MA, USA, 802–810.
O. Bernard, A. Lalande, C. Zotti, F. Cervenansky, et al. "Deep Learning Techniques for Automatic MRI Cardiac Multi-structures Segmentation and Diagnosis: Is the Problem Solved?" in IEEE Transactions on Medical Imaging, vol. 37, no. 11, pp. 2514-2525, Nov. 2018. doi: 10.1109/TMI.2018.2837502
El-Rewaidy H, Neisius U, Mancio J, et al. Deep complex convolutional network for fast reconstruction of 3D late gadolinium enhancement cardiac MRI. NMR Biomed. 2020:e4312
Ronneberger, O., Fischer, P., Brox, T. (2015). U-Net: Convolutional Networks for Biomedical Image Segmentation. In: Navab, N., Hornegger, J., Wells, W., Frangi, A. (eds) Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015. Lecture Notes in Computer Science, vol 9351. Springer, Cham. https://doi.org/10.1007/978-3-319-24574-4_28
Fig 1a: MD-CNN architecture [1]. The input has (Nc*Nw) frames, where Nc is the number of coils and Nw is the size of the sliding window. The model predicts the centre frame of this window. Fig 1b: Diagram of the proposed architecture. At every time step T, Nc frames are taken as input. The KSpace Network has two LSTMs, one for amplitude and one for theta (phase); together, amplitude and theta can represent any complex number. The LSTMs exploit temporal similarities between successive frames to fill in missing k-space data. This is followed by an IFFT and the ImageSpace UNet, which performs artefact removal and image enhancement.
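The amplitude/theta split described for the KSpace Network can be illustrated with a minimal NumPy sketch. It only shows that a complex k-space array can be separated into two real-valued channels and recombined without loss before the IFFT; the array size and random values are placeholders, and no LSTM is involved here.

```python
import numpy as np

# Hypothetical complex k-space samples; shape and values are illustrative only.
rng = np.random.default_rng(0)
kspace = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

# Split into two real-valued channels, matching the caption's naming.
amplitude = np.abs(kspace)   # magnitude, always >= 0
theta = np.angle(kspace)     # phase in radians, in (-pi, pi]

# Recombine: z = amplitude * exp(i * theta) recovers the complex values.
recombined = amplitude * np.exp(1j * theta)

# An inverse FFT then maps the (recombined) k-space back to image space.
image = np.fft.ifft2(recombined)
```

In the proposed architecture the two channels would each be processed by their own LSTM before recombination; that step is omitted in this sketch.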
Fig 2a,2b: Ground truth cine MRI frame from the ACDC [5] dataset: image (top) and k-space (bottom).
Fig 2c: Simulated coil images, generated by multiplying the input frame with moving Gaussian blur masks. These are used as the ground truth for the images predicted by the k-space model.
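One way such a coil simulation could be sketched is shown below. This is an assumption-laden illustration, not the authors' code: the function names `gaussian_mask` and `simulate_coils`, the placement of mask centres on a circle around the image, and the default `sigma` are all hypothetical, and the masks here are static rather than moving between frames.

```python
import numpy as np

def gaussian_mask(shape, center, sigma):
    """2D Gaussian weighting mask with peak value 1 at `center` (row, col)."""
    rows, cols = np.indices(shape)
    d2 = (rows - center[0]) ** 2 + (cols - center[1]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

def simulate_coils(frame, n_coils=4, sigma=None):
    """Multiply one frame by n_coils Gaussian masks (hypothetical layout).

    Assumption: mask centres are spaced evenly on a circle touching the
    image border; the abstract does not specify the actual layout.
    """
    h, w = frame.shape
    if sigma is None:
        sigma = 0.5 * min(h, w)
    angles = 2 * np.pi * np.arange(n_coils) / n_coils
    centers = [((h - 1) / 2 * (1 + np.sin(a)), (w - 1) / 2 * (1 + np.cos(a)))
               for a in angles]
    return np.stack([frame * gaussian_mask(frame.shape, c, sigma)
                     for c in centers])

# Usage on a uniform placeholder frame: each output is one simulated coil image.
frame = np.ones((32, 32))
coils = simulate_coils(frame)
```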
Fig 2d,2e: (bottom) Undersampled Fourier transform computed using NUFFT. The spokes are chosen at golden-angle intervals. This undersampled k-space data is the input to the convLSTM. The inverse FFT of this undersampled data (top) is also shown to illustrate how the reconstruction looks under such severe undersampling.
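The golden-angle spoke selection can be sketched as follows. The sketch assumes the golden angle commonly used in radial MRI (180° divided by the golden ratio, about 111.246°) and assumes that each frame continues the spoke sequence where the previous frame stopped; the function name, the readout length of 128, and the normalised k-space range are hypothetical. The NUFFT step itself is not shown.

```python
import numpy as np

# Golden angle for radial MRI: pi / golden ratio (about 111.246 degrees).
GOLDEN_ANGLE = np.pi / ((1 + np.sqrt(5)) / 2)

def golden_angle_spokes(n_spokes, n_readout, first_spoke=0):
    """Return spoke angles and normalised k-space coordinates.

    Coordinates are in cycles/pixel, in [-0.5, 0.5) along each spoke.
    `first_spoke` lets consecutive frames continue the same sequence
    (an assumption; the abstract does not state this).
    """
    idx = first_spoke + np.arange(n_spokes)
    angles = (idx * GOLDEN_ANGLE) % np.pi
    radius = np.linspace(-0.5, 0.5, n_readout, endpoint=False)
    kx = radius[None, :] * np.cos(angles)[:, None]
    ky = radius[None, :] * np.sin(angles)[:, None]
    return angles, kx, ky

# Ten spokes for one frame, then the next ten for the following frame.
angles, kx, ky = golden_angle_spokes(n_spokes=10, n_readout=128)
angles_next, _, _ = golden_angle_spokes(10, 128, first_spoke=10)
```

Because successive golden-angle spokes never repeat an orientation, consecutive frames sample complementary parts of k-space, which is the redundancy a recurrent model can draw on.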
Representative images generated by the k-space subnetwork. The convLSTM generates images for all the coils, but just one coil is shown. Each pair of images corresponds to a different time frame; t=1 denotes the first frame generated. Fig (A-D) The first frame is very similar to the 10-spoke input, but as more frames are passed to the LSTM network, it retains information from earlier frames, and spokes can be seen being added to the Fourier transform. Fig (E-H) These images are of much better quality, since they are generated after the LSTM has processed several frames and reached a steady state.
Fig 5a,5b: Boxplot and histogram of the frames per second (FPS) achieved by both methods. The convLSTM reconstructs images at ~226 FPS, whereas MD-CNN reaches ~13 FPS.
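The reported frame rates can be converted to per-frame reconstruction time with simple arithmetic. The sketch below uses only the approximate figures from the caption (~226 and ~13 FPS); the helper name `frame_latency_ms` is hypothetical.

```python
def frame_latency_ms(fps):
    """Per-frame reconstruction time in milliseconds for a given frame rate."""
    return 1000.0 / fps

conv_lstm_fps = 226.0  # approximate value reported for the convLSTM
md_cnn_fps = 13.0      # approximate value reported for MD-CNN

conv_lstm_ms = frame_latency_ms(conv_lstm_fps)  # roughly 4.4 ms per frame
md_cnn_ms = frame_latency_ms(md_cnn_fps)        # roughly 77 ms per frame
speedup = conv_lstm_fps / md_cnn_fps            # roughly 17x
```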
Fig 5c, 5d: Boxplot and histogram of the SSIM scores obtained using MD-CNN (blue) and convLSTM (green). The distribution of MD-CNN SSIM scores peaks at a higher value than that of the convLSTM scores. This implies that the convLSTM has somewhat lower overall reconstruction quality than MD-CNN, in exchange for a 17x faster reconstruction speed.
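For readers unfamiliar with the metric, the SSIM formula can be sketched in a simplified form. The version below evaluates SSIM once over the whole image (a single global window) with the standard constants k1 = 0.01 and k2 = 0.03; it is not the locally windowed SSIM that imaging toolkits typically report, and the abstract does not state which implementation was used. The function name `global_ssim` and the test images are hypothetical.

```python
import numpy as np

def global_ssim(x, y, data_range=1.0, k1=0.01, k2=0.03):
    """Single-window SSIM over the whole image (simplified illustration)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

# Usage: compare a placeholder reference image with a noisy copy of itself.
rng = np.random.default_rng(0)
reference = rng.random((64, 64))
noisy = np.clip(reference + 0.1 * rng.standard_normal((64, 64)), 0.0, 1.0)
score_same = global_ssim(reference, reference)
score_noisy = global_ssim(reference, noisy)
```

An identical pair scores 1, and the score falls as the reconstruction departs from the reference.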