Learning a Variational Model for Compressed Sensing MRI Reconstruction

Kerstin Hammernik^{1}, Florian Knoll^{2}, Daniel K Sodickson^{2}, and Thomas Pock^{1,3}

Following the recent work of Chen et al.$$$^3$$$ for image denoising, we propose to generate a sequence $$$\{u_t\},\,t=0...T-1$$$ of reconstructed images by minimizing a sequence of simple quadratic energies $$$\{E_t\}$$$:$$u_{t+1}=\arg\min_uE_t(u,\theta_t)=\left\langle{s_t(u_t),u-u_{t}}\right\rangle+\frac{1}{2}\left\Vert{u-u_{t}}\right\Vert^2_2,$$where $$$s_t(u_t)$$$ are learned functions defined below. The minimizers of the quadratic energies yield a gradient step with gradients $$$s_t(u_t)$$$:$$u_{t+1}=u_t-s_t(u_t).$$Inspired by classical CS regularization approaches, we define the functions $$$s_t(u_t)$$$ as gradients of the following variational model:$$E_t(u)=\frac{\lambda_t}{2}\left\Vert{Au-f}\right\Vert_2^2+\sum\limits_{i=1}^{N_k}\phi_{i,t}\left(k_{i,t}\ast{u}\right).$$The first term enforces consistency with the measured k-space data $$$f$$$ via the linear forward sampling operator $$$A$$$, which implements pointwise multiplications with the coil sensitivity profiles followed by FFTs. The second (regularization) term convolves the image $$$u$$$ with $$$N_k$$$ filter kernels $$$k_{i,t}$$$ and then applies differentiable penalty functions $$$\phi_{i,t}$$$. The influence of the two terms is balanced by a regularization parameter $$$\lambda_t$$$. We now define $$$s_t(u_t)$$$ as the gradient of the variational model:$$s_t(u_t)=\left.\nabla{E_t(u)}\right|_{u=u_t}=\lambda_tA^T\left(Au_t-f\right)+\sum\limits_{i=1}^{N_k}\bar{k}_{i,t}\ast\phi_{i,t}'\left(k_{i,t}\ast{u_t}\right),$$where $$$A^T$$$ denotes the backward operator, which performs inverse FFTs and combines the coil images, $$$\bar{k}_{i,t}$$$ denote the filter kernels $$$k_{i,t}$$$ rotated by 180$$$^\circ$$$, and $$$\phi_{i,t}'$$$ are the first derivatives of the penalty functions $$$\phi_{i,t}$$$. Hence, we can interpret our method as a specialized convolutional neural network that performs a fixed number of learned gradient descent steps starting from an initial solution $$$u_0$$$ (see Figure 1).
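One update step can be sketched in a few lines of NumPy. This is a minimal illustration, not the trained model: it assumes a single-coil Cartesian acquisition (so that $$$A$$$ reduces to a sampling mask applied after an FFT, omitting the coil sensitivity profiles), and a caller-supplied `phi_prime` stands in for the learned penalty derivatives $$$\phi_{i,t}'$$$.

```python
import numpy as np
from scipy.signal import convolve2d

def gradient_step(u, f, mask, kernels, lam, phi_prime):
    """One gradient step u_{t+1} = u_t - s_t(u_t)  (single-coil sketch).

    u         : current image estimate (2D complex array)
    f         : measured, undersampled k-space data
    mask      : binary sampling mask (here A = mask * FFT; coil profiles omitted)
    kernels   : list of 2D filter kernels k_{i,t}
    lam       : regularization parameter lambda_t
    phi_prime : derivative of the penalty function (stand-in for learned phi'_{i,t})
    """
    # Data term gradient: lambda_t * A^T (A u - f)
    residual = mask * np.fft.fft2(u, norm="ortho") - f
    grad = lam * np.fft.ifft2(mask * residual, norm="ortho")
    # Regularization term: sum_i  kbar_i * phi'(k_i * u),
    # with kbar_i the kernel rotated by 180 degrees
    for k in kernels:
        k_bar = np.rot90(k, 2)
        grad += convolve2d(phi_prime(convolve2d(u, k, mode="same")),
                           k_bar, mode="same")
    return u - grad
```

When the current estimate is already consistent with fully sampled data and no filters are active, the step leaves the image unchanged, matching the fixed-point behavior of the gradient update.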

To learn the parameters of our network, we minimize the following loss function over a set of $$$S$$$ training images:$$\min_{\theta}\frac{1}{2}\sum\limits_{s=1}^S\left\Vert{u_{T,s}(\theta)-g_s}\right\Vert_2^2,$$where $$$u_{T,s}$$$ denotes the output of the network after $$$T$$$ steps and $$$g_s$$$ the corresponding reference image. The vector $$$\theta=\{\lambda_t,k_{i,t},\phi_{i,t},\,i=1...N_k,\,t=0...T-1\}$$$ holds all unknown parameters of our network. After training, the learned network can be efficiently applied to new k-space data, as it performs only $$$T$$$ steps of simple operations.
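The unrolled forward pass and the training loss above can be sketched as follows. This is a schematic only: `step_fns` abstracts the learned functions $$$s_t$$$ (one per step), and the gradient computation through the unrolled steps needed to actually optimize $$$\theta$$$ (as in Chen et al.$$$^3$$$) is not shown.

```python
import numpy as np

def unrolled_reconstruction(u0, step_fns):
    """Run T learned gradient-descent steps: u_{t+1} = u_t - s_t(u_t)."""
    u = u0
    for s_t in step_fns:          # one learned function per step t = 0..T-1
        u = u - s_t(u)
    return u

def training_loss(u0_batch, g_batch, step_fns):
    """L(theta) = 1/2 * sum_s ||u_{T,s}(theta) - g_s||_2^2 over S training pairs."""
    loss = 0.0
    for u0, g in zip(u0_batch, g_batch):
        uT = unrolled_reconstruction(u0, step_fns)
        loss += 0.5 * np.linalg.norm(uT - g) ** 2
    return loss
```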

We scanned 10 knee patients on a clinical 3T system (Siemens Magnetom Skyra) using a 15-channel knee coil. The study was approved by the IRB and written informed consent was obtained. A coronal PD-weighted Cartesian 2D TSE sequence was used with the following sequence parameters: TR=2690ms, TE=33ms, matrix size $$$320\times320$$$, in-plane resolution $$$0.4\times0.4\text{mm}^2$$$, 34 slices of $$$3\text{mm}$$$ thickness, acceleration factor $$$R=2$$$. This clinical protocol delivers reference images without aliasing by using the conventional parallel imaging (PI) reconstruction from the scanner vendor.

We conduct the training on a subset of 40 images from 4 different patients. This data is additionally undersampled to an acceleration factor of $$$R=6$$$. The training and parameter initialization of the network are performed similarly to Chen et al.$$$^3$$$. We train 48 filters of size $$$7\times7$$$, their corresponding penalty functions and the regularization parameter $$$\lambda_t$$$ for each of $$$T=15$$$ steps. For evaluation, we apply the learned network to $$$R=2$$$ and $$$R=6$$$ data from 6 patients that was not used for training, covering a wide range of anatomical knee structures. We compare our method to zero-filled initialization, iterative CG-SENSE$$$^4$$$ and TV-regularized combined CS and PI$$$^1$$$ reconstructions. Figure 2 depicts the results for $$$R=2$$$. As expected, all methods perform equally well. When the acceleration factor is pushed to $$$R=6$$$, CG-SENSE and TV cannot remove all artifacts, as illustrated in Figure 3. This is also depicted in Figure 4, which shows several reconstructed slices of a single patient. Although TV-regularized reconstructions still include residual aliasing at this level of regularization, they already look unnatural due to the piecewise-constancy assumption, and small low-contrast features vanish. Our learning approach leads to artifact-free and natural results with a reconstruction time of 1.78 seconds for a single $$$320\times320$$$ slice.
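The additional undersampling of the training and test data can be illustrated with a simple retrospective Cartesian mask. The exact sampling pattern is not specified in the text, so the regular line spacing and the fully sampled k-space center below are assumptions chosen for illustration.

```python
import numpy as np

def cartesian_mask(ny, nx, R, n_center=24):
    """Regular Cartesian undersampling: keep every R-th phase-encode line
    plus a fully sampled k-space center block (pattern details assumed;
    only the acceleration factor R is given in the text)."""
    mask = np.zeros((ny, nx))
    mask[::R, :] = 1                                      # every R-th line
    c = ny // 2
    mask[c - n_center // 2 : c + n_center // 2, :] = 1    # center block
    return mask
```

Applying this mask elementwise to fully sampled k-space yields the retrospectively undersampled data; note that the fully sampled center lowers the effective acceleration slightly below the nominal $$$R$$$.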

In this work, we presented a novel approach for CS reconstruction that learns the optimal regularization of a variational model. This allows for highly efficient reconstruction of new k-space data in terms of both image quality and runtime. The impact of our algorithm is visible at high acceleration factors, where we achieve visually superior results compared to CG-SENSE and TV. While conventional CS reconstruction methods use handcrafted regularizers that are not particularly suitable for 2D Cartesian sequences, which are still essential in clinical practice, our learned network enables the separation of characteristic undersampling artifacts from the natural appearance of MR images.

1. Block K T, Uecker M, and Frahm J. Undersampled radial MRI with multiple coils. Iterative image reconstruction using a total variation constraint. Magn Reson Med, 2007, 57, 1086-1098.

2. Lustig M, Donoho D, and Pauly J M. Sparse MRI: The application of compressed sensing for rapid MR imaging. Magn Reson Med, 2007, 58, 1182-1195.

3. Chen Y, Yu W, and Pock T. On learning optimized reaction diffusion processes for effective image restoration. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, 5261-5269.

4. Pruessmann K P, Weiger M, Boernert P, and Boesiger P. Advances in sensitivity encoding with arbitrary k-space trajectories. Magn Reson Med, 2001, 46, 638-651.

Figure 1: Network structure. To reconstruct an image $$$u_T$$$, we propagate the initial image $$$u_0=A^Tf$$$ through $$$T$$$ steps of the network, which correspond to the quadratic energies $$$E_t$$$ (blue). During the training process, optimal filter kernels $$$k_{i,t}$$$, penalty functions $$$\phi_{i,t}$$$ and the regularization parameter $$$\lambda_t$$$ (green) are learned for each update step.

Figure 2: Comparison of reconstruction methods for $$$R=2$$$. Each slice is taken from a different patient. Since this is the same acceleration as in the conventional clinical protocols, all methods perform equally well, as expected.

Figure 3: Comparison of reconstruction methods for acceleration factor $$$R=6$$$. Each slice is taken from a different patient. CG-SENSE and TV show residual aliasing artifacts and the TV result exhibits the characteristic blocky appearance. In comparison, image quality of our learning approach is similar to the $$$R=2$$$ case.

Figure 4: Comparison of reconstruction methods for acceleration factor $$$R=6$$$ for different anatomical knee structures of a single patient. CG-SENSE and TV contain residual aliasing and detectability of low-contrast features is reduced for TV. Our learning method shows artifact-free reconstructions.

Proc. Intl. Soc. Mag. Reson. Med. 24 (2016), 1088