Learning a Variational Model for Compressed Sensing MRI Reconstruction
Kerstin Hammernik1, Florian Knoll2, Daniel K Sodickson2, and Thomas Pock1,3

1Institute for Computer Graphics and Vision, Graz University of Technology, Graz, Austria, 2Center for Biomedical Imaging and Center for Advanced Imaging Innovation and Research (CAI2R), Department of Radiology, NYU School of Medicine, New York, NY, United States, 3Safety & Security Department, AIT Austrian Institute of Technology GmbH, Vienna, Austria

Synopsis

Compressed sensing techniques allow MRI reconstruction from undersampled k-space data. However, most reconstruction methods suffer from high computational cost, depend on the careful selection of adequate regularizers, and are limited to low acceleration factors for non-dynamic 2D imaging protocols. In this work, we propose a novel and efficient approach that overcomes these limitations by learning a sequence of optimal regularizers that removes typical undersampling artifacts while retaining important details of the imaged objects and preserving the natural appearance of anatomical structures. We test our approach on patient data and show that it achieves superior results compared to commonly used reconstruction methods.

Purpose

In this work, we propose a novel Compressed Sensing (CS) MRI reconstruction method that learns an optimal regularization, effectively suppressing undersampling artifacts while preserving the natural image appearance. Existing methods are based on simple regularizers such as Total Variation (TV)$$$^1$$$ or sparsity in the wavelet domain$$$^2$$$. However, these handcrafted models are too simple to capture the characteristic structure of complex anatomies. Additionally, the application of CS in clinical practice is still challenging due to expensive computations, parameter selection and limited applicability to 2D Cartesian protocols. To overcome these limitations, we propose to learn a set of filters and corresponding penalty functions of a variational model.

Methods

Following the recent work of Chen et al.$$$^3$$$ for image denoising, we propose to generate a sequence $$$\{u_t\},\,t=0...T-1$$$ of reconstructed images by minimizing a sequence of simple quadratic energies $$$\{Q_t\}$$$:$$u_{t+1}=\arg\min_uQ_t(u,\theta_t)=\left\langle{s_t(u_t),u-u_{t}}\right\rangle+\frac{1}{2}\left\Vert{u-u_{t}}\right\Vert^2_2,$$where $$$s_t(u_t)$$$ are learned functions defined below. The minimizers of the quadratic energies form a gradient step with gradients $$$s_t(u_t)$$$:$$u_{t+1}=u_t-s_t(u_t).$$Inspired by classical CS regularization approaches, we define the functions $$$s_t(u_t)$$$ as gradients of the following variational model:$$E_t(u)=\frac{\lambda_t}{2}\left\Vert{Au-f}\right\Vert_2^2+\sum\limits_{i=1}^{N_k}\phi_{i,t}\left(k_{i,t}\ast{u}\right).$$The first term enforces consistency with the measured k-space data $$$f$$$ using the linear forward sampling operator $$$A$$$, which implements pointwise multiplications with coil sensitivity profiles and FFTs. The second (regularization) term convolves the image $$$u$$$ with $$$N_k$$$ filter kernels $$$k_{i,t}$$$ and then applies differentiable penalty functions $$$\phi_{i,t}$$$. The influence of both terms is controlled via a regularization parameter $$$\lambda_t$$$. We thus define $$$s_t(u_t)$$$ as the gradient of the variational model:$$s_t(u_t)=\left.\nabla{E_t(u)}\right|_{u=u_t}=\lambda_tA^T\left(Au_t-f\right)+\sum\limits_{i=1}^{N_k}\bar{k}_{i,t}\ast\phi_{i,t}'\left(k_{i,t}\ast{u_t}\right),$$where $$$A^T$$$ denotes the backward operator, which performs inverse FFTs and combines the coil images, $$$\bar{k}_{i,t}$$$ denote the filter kernels $$$k_{i,t}$$$ rotated by 180$$$^\circ$$$, and $$$\phi_{i,t}'$$$ are the first derivatives of the penalty functions $$$\phi_{i,t}$$$. Hence, our method can be interpreted as a specialized convolutional neural network that performs a fixed number of learned gradient descent steps starting from an initial solution $$$u_0$$$ (see Figure 1).
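For intuition, a single update step can be sketched in NumPy as follows. This is a minimal sketch under simplifying assumptions: the operator $$$A$$$ is reduced to a single-coil masked FFT, the filters act on the real part only, and all names (e.g. grad_step) are hypothetical illustrations rather than our implementation.

```python
import numpy as np
from scipy.signal import convolve2d

def A(u, mask):
    """Simplified forward operator: FFT followed by a binary k-space sampling mask.
    (The full model additionally multiplies with coil sensitivity profiles.)"""
    return mask * np.fft.fft2(u)

def AT(f, mask):
    """Adjoint of the simplified forward operator: masking followed by inverse FFT."""
    return np.fft.ifft2(mask * f)

def grad_step(u, f, mask, filters, penalty_derivs, lam):
    """One learned gradient step u_{t+1} = u_t - s_t(u_t)."""
    # Data term gradient: lambda_t * A^T (A u_t - f)
    grad = lam * AT(A(u, mask) - f, mask)
    # Regularization gradient: sum_i  k_bar_{i,t} * phi'_{i,t}(k_{i,t} * u_t)
    for k, phi_prime in zip(filters, penalty_derivs):
        k_bar = np.rot90(k, 2)  # filter rotated by 180 degrees
        response = convolve2d(u.real, k, mode='same', boundary='symm')
        grad = grad + convolve2d(phi_prime(response), k_bar, mode='same', boundary='symm')
    return u - grad
```

Starting from the initialization $$$u_0=A^Tf$$$, applying $$$T$$$ such steps yields the reconstructed image.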

To learn the parameters of our network, we minimize the following loss function over a set of $$$S$$$ training images:$$\min_{\theta}\frac{1}{2}\sum\limits_{s=1}^S\left\Vert{u_{T,s}(\theta)-g_s}\right\Vert_2^2,$$where $$$u_{T,s}$$$ denotes the output of the network after $$$T$$$ steps and $$$g_s$$$ denotes the corresponding reference image. The vector $$$\theta=\{\lambda_t,k_{i,t},\phi_{i,t},\,i=1...N_k,\,t=0...T-1\}$$$ holds all unknown parameters of our network. After training, the learned network can be applied efficiently to new k-space data, as it performs only $$$T$$$ steps of simple operations.
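Under the same simplifying assumptions, the training objective can be sketched as below, reusing the hypothetical grad_step from above; theta is assumed to hold one tuple (regularization weight, filters, penalty derivatives) per step. This only illustrates how the loss is evaluated; the actual training procedure follows Chen et al.$$$^3$$$.

```python
def network(u0, f, mask, theta):
    """Unrolled network: T learned gradient steps starting from u_0 (cf. Figure 1)."""
    u = u0
    for lam, filters, penalty_derivs in theta:  # one parameter set per step t
        u = grad_step(u, f, mask, filters, penalty_derivs, lam)
    return u

def training_loss(theta, samples):
    """Sum of squared errors between network outputs u_{T,s} and references g_s."""
    loss = 0.0
    for u0, f, mask, g in samples:  # one tuple per training image s
        u_T = network(u0, f, mask, theta)
        loss += 0.5 * np.sum(np.abs(u_T - g) ** 2)
    return loss
```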

Experimental Setup

We scanned 10 knee patients on a clinical 3T system (Siemens Magnetom Skyra) using a 15-channel knee coil. The study was approved by the IRB and written informed consent was obtained. A coronal PD-weighted Cartesian 2D TSE sequence was used with the following sequence parameters: TR=2690ms, TE=33ms, matrix size $$$320\times320$$$, in-plane resolution $$$0.4\times0.4\text{mm}^2$$$, 34 slices of $$$3\text{mm}$$$ thickness, acceleration factor $$$R=2$$$. This clinical protocol delivers reference images without aliasing by using the conventional parallel imaging (PI) reconstruction from the scanner vendor.

We conduct the training on a subset of 40 images from 4 different patients. These data are additionally undersampled to an acceleration factor of $$$R=6$$$. The training and parameter initialization of the network are performed similarly to Chen et al.$$$^3$$$. We train 48 filters of size $$$7\times7$$$, their corresponding penalty functions and the regularization parameter $$$\lambda_t$$$ for each of the $$$T=15$$$ steps.
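To make the model size concrete, a hypothetical initialization of the parameter container used in the sketches above could look as follows; the random filters and the tanh placeholder for $$$\phi_{i,t}'$$$ are illustrative assumptions only, whereas the actual initialization follows Chen et al.$$$^3$$$.

```python
import numpy as np

T, N_K, K_SIZE = 15, 48, 7            # 15 steps, each with 48 filters of size 7x7
rng = np.random.default_rng(0)

theta = [
    (1.0,                                                                 # lambda_t
     [rng.standard_normal((K_SIZE, K_SIZE)) * 0.01 for _ in range(N_K)],  # filters k_{i,t}
     [np.tanh for _ in range(N_K)])                                       # placeholder phi'_{i,t}
    for _ in range(T)
]
```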

Results

For evaluation, we apply the learned network to $$$R=2$$$ and $$$R=6$$$ data from 6 patients that were not used for training, covering a wide range of anatomical knee structures. We compare our method to zero-filled initialization, iterative CG-SENSE$$$^4$$$ and TV-regularized combined CS and PI$$$^1$$$ reconstructions. Figure 2 depicts the results for $$$R=2$$$. As expected, all methods perform equally well. When the acceleration factor is pushed to $$$R=6$$$, CG-SENSE and TV cannot remove all artifacts, as illustrated in Figure 3. This is also evident in Figure 4, which shows several reconstructed slices of a single patient. Although the TV-regularized reconstructions still contain residual aliasing at this level of regularization, they already look unnatural due to the piecewise-constancy assumption, and small low-contrast features vanish. Our learning approach yields artifact-free and natural-looking results with a reconstruction time of 1.78 seconds for a single $$$320\times320$$$ slice.

Discussion and Conclusion

In this work, we presented a novel approach for CS reconstruction that learns the optimal regularization of a variational model. This allows for highly efficient reconstruction of new k-space data in terms of both image quality and runtime. The impact of our algorithm is most visible at high acceleration factors, where we achieve visually superior results compared to CG-SENSE and TV. While conventional CS reconstruction methods use handcrafted regularizers that are not particularly suitable for 2D Cartesian sequences, which are still essential in clinical practice, our learned network enables the separation of characteristic undersampling artifacts from the natural appearance of MR images.

Acknowledgements

We acknowledge grant support from the Austrian Science Fund (FWF) under the START project BIVISION, No. Y729, and from NIH grants P41 EB017183 and R01 EB000447.

References

1. Block K T, Uecker M, and Frahm J. Undersampled radial MRI with multiple coils. Iterative image reconstruction using a total variation constraint. Magn Reson Med, 2007, 57, 1086-1098.

2. Lustig M, Donoho D, and Pauly J M. Sparse MRI: The application of compressed sensing for rapid MR imaging. Magn Reson Med, 2007, 58, 1182-1195.

3. Chen Y, Yu W, and Pock T. On learning optimized reaction diffusion processes for effective image restoration. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, 5261-5269.

4. Pruessmann K P, Weiger M, Boernert P, and Boesiger P. Advances in sensitivity encoding with arbitrary k-space trajectories. Magn Reson Med, 2001, 46, 638-651.

Figures

Figure 1: Network structure. To reconstruct an image $$$u_T$$$, we propagate the image $$$u_0=A^Tf$$$ through $$$T$$$ steps of the network which correspond to quadratic energies $$$E_t$$$ (blue). During a training process, optimal filter kernels $$$k_{i,t}$$$, penalty functions $$$\phi_{i,t}$$$ and the regularization parameter $$$\lambda_t$$$ (green) are learned for each update step.

Figure 2: Comparison of reconstruction methods for $$$R=2$$$. Each slice is taken from a different patient. Since this is the same acceleration as in the conventional clinical protocols, all methods perform equally well, as expected.

Figure 3: Comparison of reconstruction methods for acceleration factor $$$R=6$$$. Each slice is taken from a different patient. CG-SENSE and TV show residual aliasing artifacts and the TV result exhibits the characteristic blocky appearance. In comparison, image quality of our learning approach is similar to the $$$R=2$$$ case.


Figure 4: Comparison of reconstruction methods for acceleration factor $$$R=6$$$ for different anatomical knee structures of a single patient. CG-SENSE and TV contain residual aliasing and detectability of low-contrast features is reduced for TV. Our learning method shows artifact-free reconstructions.
