Zechen Zhou¹, Christophe Schülke², Chun Yuan³, and Peter Börnert²
¹Philips Research North America, Cambridge, MA, United States; ²Philips Research Hamburg, Hamburg, Germany; ³Department of Radiology, University of Washington, Seattle, WA, United States
Synopsis
Recently, convolutional neural network (CNN) based reconstruction has emerged
as a promising implementation of compressed sensing tailored to specific fast
imaging applications. The reconstruction performance of such data-driven models
may depend on the CNN structure, which determines the feature extraction process
for sparse representation. In this study, a locally and globally concatenated
network is proposed and compared with the residual network as well as the
traditional L1-wavelet ESPIRiT. Preliminary experiments on a public knee
imaging database showed that the proposed approach provided improved fine
structure (e.g. vessel wall) restoration and background noise reduction.
Introduction
Compressed sensing (CS) MRI [1] allows highly subsampled acquisition while restoring
diagnostic image quality by properly leveraging the corresponding sparse
representation. Data-driven sparse transforms (e.g. dictionary learning, neural
networks) have shown more powerful CS reconstruction performance than
universal sparse transforms (e.g. total variation, wavelets) [2].
In particular, convolutional neural networks (CNNs) integrated with parallel
imaging (PI) or data consistency have recently achieved great success as
end-to-end optimized MR image reconstruction methods, and offer significantly reduced
reconstruction times on modern GPU hardware [3-6]. However,
most CNN-based reconstruction models do not take full advantage of the
hierarchical features extracted from the subsampled image for sparse
representation, resulting in relatively limited performance [7,8]. In
this work, a CNN with local and global concatenations is proposed to enable the
full exploitation of hierarchical features across all convolutional layers. Its
effectiveness is demonstrated on a public knee dataset.
Methods
CNN-based Reconstruction Model with Local and Global Concatenation
The PI-CS
reconstruction problem can be solved by interleaving two blocks
(figure 1c): 1) an update block that enforces the PI and/or data consistency constraint;
2) a proximal block that enforces the sparse representation constraint. Different
CNN architectures can be used in the proximal block to create different CNN-based
reconstruction models. To adaptively synthesize more effective features from
preceding and current features, a stack of local feature concatenation blocks (figure
1a) was exploited to learn a better sparse transform in each proximal block. To
further improve the feature connection across different proximal blocks, a
global feature concatenation path was added that skips the update block,
so that hierarchical features are carried directly from one proximal block to the next.
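The interleaved update/proximal structure described above can be sketched in NumPy as follows. This is only an illustrative sketch under several assumptions that are not taken from this abstract: a single-coil Cartesian forward model (sampling mask times FFT) stands in for the PI operator, 1x1 convolutions stand in for spatial convolutions, and all function names, channel counts, and the random untrained weights are hypothetical.

```python
import numpy as np

def conv1x1(w, f):
    # 1x1 convolution: w is (C_out, C_in), f is (C_in, H, W).
    return np.einsum("oc,chw->ohw", w, f)

def update_block(x, y, mask, step=1.0):
    # Data-consistency gradient step x - step * A^H (A x - y),
    # with the assumed forward model A = mask * FFT.
    k = np.fft.fft2(x, norm="ortho")
    return x - step * np.fft.ifft2(mask * (mask * k - y), norm="ortho")

def local_concat_block(f, layer_ws, fuse_w):
    # Local concatenation: each layer sees all preceding feature maps.
    stack = [f]
    for w in layer_ws:
        inp = np.concatenate(stack, axis=0)
        stack.append(np.maximum(conv1x1(w, inp), 0.0))  # 1x1 conv + ReLU
    # Fuse every level of features back to the block's channel count.
    return conv1x1(fuse_w, np.concatenate(stack, axis=0))

def proximal_block(x, g, p):
    # g holds features from the previous proximal block (global concatenation);
    # it reaches this block without passing through the update block.
    img = np.stack([x.real, x.imag])                          # (2, H, W)
    f = conv1x1(p["lift"], np.concatenate([img, g], axis=0))
    f = local_concat_block(f, p["layers"], p["fuse"])
    out = conv1x1(p["proj"], f)                               # (2, H, W)
    return x + out[0] + 1j * out[1], f

def init_params(rng, n_feat=8, growth=4, n_layers=3, scale=0.01):
    # Random placeholder weights; a real model would learn these.
    layers = [scale * rng.standard_normal((growth, n_feat + i * growth))
              for i in range(n_layers)]
    return {
        "lift": scale * rng.standard_normal((n_feat, 2 + n_feat)),
        "layers": layers,
        "fuse": scale * rng.standard_normal((n_feat, n_feat + n_layers * growth)),
        "proj": scale * rng.standard_normal((2, n_feat)),
    }

def unrolled_recon(y, mask, params, n_feat=8):
    x = np.fft.ifft2(y, norm="ortho")        # zero-filled starting image
    g = np.zeros((n_feat,) + x.shape)        # no global features yet
    for p in params:                         # one update + proximal pair per iteration
        x = update_block(x, y, mask)
        x, g = proximal_block(x, g, p)
    return x

# Toy usage on synthetic data (not the knee dataset).
rng = np.random.default_rng(0)
truth = rng.standard_normal((16, 16)) + 1j * rng.standard_normal((16, 16))
mask = (rng.random((16, 16)) < 0.3).astype(float)
y = mask * np.fft.fft2(truth, norm="ortho")
recon = unrolled_recon(y, mask, [init_params(rng) for _ in range(3)])
```

With a binary mask and a unit step size, the update block alone reproduces the acquired k-space samples exactly, which is why the learned part of such a model can be confined to the proximal blocks.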
MR Experiments
Twenty fully sampled 3D fast spin echo (FSE)
knee datasets were downloaded from mridata.org, of which 15/2/3 cases were used
for training/validation/testing, respectively. A 9.4-fold variable-density
Poisson-disc pattern was used for retrospective random undersampling. The L1-norm
was used as the loss function during CNN training. CNN training and inference
were performed with TensorFlow on an Nvidia Xp GPU. Several different CNN
models were trained and compared, including the residual network based model
(Proximal Block, Model 1 in figure 1b) and the proposed concatenated network
(Proximal Block-C, Model 2 in figure 1b) with/without the global feature
concatenation. The total number of convolutional layers and parameters in
different CNNs were approximately matched to facilitate a fair comparison. In
addition, the L1-wavelet regularized ESPIRiT reconstruction approach [9]
in the SigPy package (https://github.com/mikgroup/sigpy)
was applied as another baseline approach. The peak signal-to-noise ratio (PSNR),
normalized root mean square error (nRMSE), and mean structural similarity index
(mSSIM) within the tissue region were calculated as quantitative image quality
metrics between the fully sampled ground truth and zero-filled/reconstructed
images.
Results
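As a rough illustration of the error metrics named in Methods, PSNR and nRMSE restricted to a tissue region could be computed along these lines. The exact definitions used in this study are not stated, so the formulas here are assumptions: PSNR takes the peak as the maximum reference magnitude inside the region, and nRMSE normalizes the error norm by the reference norm. The function names and the boolean `region` mask are hypothetical, and mSSIM is omitted because it depends on window settings that are not given.

```python
import numpy as np

def psnr(ref, img, region):
    # Assumed definition: 20 * log10(peak / RMSE), both taken inside the region.
    err = np.abs(img - ref)[region]
    rmse = np.sqrt(np.mean(err ** 2))
    return 20.0 * np.log10(np.abs(ref[region]).max() / rmse)

def nrmse(ref, img, region):
    # Assumed definition: ||img - ref||_2 / ||ref||_2 inside the region.
    return np.linalg.norm((img - ref)[region]) / np.linalg.norm(ref[region])

# Toy example: a uniform reference with a constant offset error.
ref = np.ones((4, 4))
img = ref + 0.1
region = np.ones((4, 4), dtype=bool)   # stands in for a tissue mask
example_psnr = psnr(ref, img, region)
example_nrmse = nrmse(ref, img, region)
```

Restricting both metrics to the tissue region keeps the large, nearly empty background from dominating the averages.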
For the three testing cases, the CNN-based reconstruction models achieved an
approximately 0.4 dB PSNR increase, 0.01 nRMSE reduction, and 0.04 mSSIM
improvement compared with L1-wavelet ESPIRiT (figure 2). Also, an
additional 0.01 mSSIM increase was found with the concatenated network in comparison
to the residual network, indicating improved structural restoration with
the locally and globally concatenated network. Figure 3 shows one example comparison
demonstrating the overall improvement in vessel wall delineation of the popliteal
artery with the concatenated network. Figure 4 illustrates a cross-sectional
view comparison of the reconstructed vessel wall. The L1-wavelet ESPIRiT may have
overestimated the wall thickness. The residual network improved the
morphological accuracy but introduced blurring. The concatenated network
provided the most similar vessel wall morphology and sharpness to the ground
truth, while other detailed structures were also better preserved (as shown by
red arrows in figure 4). With the global feature concatenation, an improved
background noise reduction was observed in comparison to the locally
concatenated network (figures 3 and 4). In addition, the average reconstruction
time for each 3D volume was reduced substantially, from ~1 hour using L1-wavelet
ESPIRiT to ~1 min using CNN-based reconstruction with GPU acceleration.
Discussion and Conclusion
The locally and globally concatenated network based image reconstruction approach
has demonstrated improved restoration of detailed structures (e.g. vessel wall) and
less background noise amplification in highly accelerated 3D FSE knee imaging.
The local feature concatenation can better represent the fine structure by actively
reusing different levels of features, and the global feature concatenation enables
a contiguous mechanism to differentiate the meaningful structure from background
noise across all proximal blocks. Therefore, a more powerful sparse representation
model can be trained by properly taking advantage of local and global feature
fusion. This approach may require further evaluation of its robustness and
generalization capabilities in different applications.
Acknowledgements
We would like to
acknowledge the contributors of mridata.org for sharing the fully sampled knee
database, and Dr. Joseph Y. Cheng for sharing his example code on deep learning
based reconstruction (https://github.com/MRSRL/dl-cs).
References
1. Lustig M, Donoho D, Pauly JM. Sparse MRI: The application of compressed sensing for rapid MR imaging. Magn Reson Med. 2007;58(6):1182-1195.
2. Yu G, Sapiro G, Mallat S. Solving inverse problems with piecewise linear estimators: From Gaussian mixture models to structured sparsity. IEEE Trans Image Process. 2011;21(5):2481-2499.
3. Sun J, Li H, Xu Z. Deep ADMM-Net for compressive sensing MRI. NIPS 2016, pp. 10-18.
4. Hammernik K, Klatzer T, Kobler E, Recht MP, Sodickson DK, Pock T, Knoll F. Learning a variational network for reconstruction of accelerated MRI data. Magn Reson Med. 2018;79(6):3055-3071.
5. Adler J, Öktem O. Learned primal-dual reconstruction. IEEE Trans Med Imaging. 2018;37(6):1322-1332.
6. Cheng JY, Chen F, Sandino C, Mardani M, Pauly JM, Vasanawala SS. Compressed sensing: from research to clinical practice with data-driven learning. arXiv:1903.07824v1.
7. Huang G, Liu Z, Van Der Maaten L, Weinberger KQ. Densely connected convolutional networks. IEEE CVPR 2017, pp. 4700-4708.
8. Zhang Y, Tian Y, Kong Y, Zhong B, Fu Y. Residual dense network for image restoration. arXiv preprint arXiv:1812.10477.
9. Uecker M, Lai P, Murphy MJ, Virtue P, Elad M, Pauly JM, Vasanawala SS, Lustig M. ESPIRiT--an eigenvalue approach to autocalibrating parallel MRI: where SENSE meets GRAPPA. Magn Reson Med. 2014;71(3):990-1001.