Temporal point spread function interpretation of low rank, dictionary learning models in dynamic MRI

Sajan Goud Lingala^{1}, Sampada Bhave^{2}, Yinghua Zhu^{1}, Krishna Nayak^{1}, and Mathews Jacob^{2}

** Dynamic signal model: **The dynamic pixel time profile $$$\gamma(\mathbf x,t)$$$ is modeled as a linear combination of temporal basis functions $$$v_{i}(t)$$$ derived from the data [1-2]: $$\mathbf \Gamma_{p\times m} = \mathbf V_{p \times r} \mathbf U_{r \times m};$$ where $$$\mathbf \Gamma$$$ is the Casorati matrix representation of the dynamic data, and $$$\mathbf U$$$ is the spatial weight matrix; $$$p$$$, $$$r$$$, and $$$m$$$ are respectively the number of time frames, the number of basis functions, and the number of pixels in a time frame. The low rank model assumes a linear combination of a small number of orthogonal bases (i.e., $$$r<p$$$), while dictionary learning assumes a

** Denoising using the low-rank model: **Denoising with the low rank model can be formulated as $$\min_{\mathbf U}\|\mathbf V\mathbf U-\mathbf \Gamma_{n}\|_{F}^{2};$$ where $$$\mathbf \Gamma_{n}$$$ is the noisy dynamic data. $$$\mathbf V$$$ is estimated from the data itself via a rank-$$$r$$$ truncated SVD of $$$\mathbf \Gamma_n$$$, with $$$r<p$$$. The denoised solution is given as: $$\hat{\mathbf \Gamma}=\underbrace{\mathbf V(\mathbf V^{T}\mathbf V)^{-1}\mathbf V^{T}}_{\mathbf Q_{p\times p}}\mathbf \Gamma_n;$$ The above shows that every spatial frame in $$$\hat{\mathbf \Gamma}$$$ is a weighted linear combination of the spatial frames in $$$\mathbf \Gamma_n$$$. The weights are given by the columns (or, equivalently, the rows) of the symmetric matrix $$$\mathbf Q$$$. We term the columns of this matrix temporal point spread functions (TPSFs), as they characterize the averaging across time.
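The projection above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `low_rank_tpsf`, the synthetic periodic data, the noise level, and the sizes `p`, `m`, `r` are all assumptions chosen for illustration. Because the SVD bases are orthonormal, $$$\mathbf V^{T}\mathbf V$$$ is the identity and $$$\mathbf Q$$$ reduces to $$$\mathbf V\mathbf V^{T}$$$; the general form is kept in the code to mirror the expression in the text.

```python
import numpy as np

def low_rank_tpsf(gamma_n, r):
    """Return (Q, denoised data) for a p x m Casorati matrix and rank r.

    Hypothetical helper for illustration only.
    """
    # Left singular vectors of the noisy data span the temporal subspace.
    V, _, _ = np.linalg.svd(gamma_n, full_matrices=False)
    V = V[:, :r]                              # p x r temporal basis
    # Q = V (V^T V)^{-1} V^T; V^T V is the identity for SVD bases.
    Q = V @ np.linalg.solve(V.T @ V, V.T)     # p x p
    return Q, Q @ gamma_n

# Synthetic example (assumed): three periodic temporal bases, random weights.
rng = np.random.default_rng(0)
p, m, r = 40, 200, 3
t = np.arange(p)
V_true = np.stack(
    [np.ones(p), np.sin(2 * np.pi * t / 10), np.cos(2 * np.pi * t / 10)],
    axis=1,
)
gamma = V_true @ rng.standard_normal((r, m))          # noise-free data
gamma_n = gamma + 0.1 * rng.standard_normal((p, m))   # noisy data

Q, gamma_hat = low_rank_tpsf(gamma_n, r)
tpsf_frame_5 = Q[:, 5]   # TPSF of frame 5: weights applied to all p frames
```

Plotting `tpsf_frame_5` against the frame index shows which frames contribute to the denoised frame 5. In a periodic setup such as this one, the larger weights would be expected at frames in a similar phase of the cycle rather than only at temporally adjacent frames.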

** Denoising using the dictionary-learning model: **The problem is formulated as the joint estimation of $$$\mathbf U$$$ and $$$\mathbf V$$$: $$\min_{\mathbf U, \mathbf V}\|\mathbf V \mathbf U-\mathbf \Gamma_n\|_{F}^{2} \mbox{ such that } \|u_{i}\|_{0}\leq k;\ \|v_i\|_{2}^{2}\leq 1.$$ The above can be solved by dictionary learning algorithms such as k-SVD [7], with the resulting solution: $$\hat{\mathbf \Gamma}=\underbrace{\mathbf V_{red}(\mathbf V_{red}^{T}\mathbf V_{red})^{-1}\mathbf V_{red}^{T}}_{\mathbf Q_{p\times p}}\mathbf \Gamma_n;$$ where the columns of the matrix $$$\mathbf V_{red}$$$ are the temporal basis functions that are active at a specified pixel. Note that $$$\mathbf V_{red}$$$ is a reduced subset of the dictionary $$$\mathbf V$$$ and varies across spatial pixels. This implies that the TPSFs are spatially varying.
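A minimal NumPy sketch of the per-pixel projection follows. It does not implement k-SVD: it assumes that a dictionary `V` and sparse coefficient vectors are already available, and uses a random unit-norm dictionary with hand-picked supports as stand-ins. The function name `pixel_tpsf` and all sizes are hypothetical. The pseudo-inverse is used so that $$$\mathbf V_{red}\mathbf V_{red}^{+}$$$ equals $$$\mathbf V_{red}(\mathbf V_{red}^{T}\mathbf V_{red})^{-1}\mathbf V_{red}^{T}$$$ whenever the active atoms are linearly independent.

```python
import numpy as np

def pixel_tpsf(V, u, tol=1e-12):
    """TPSF matrix for one pixel, given dictionary V (p x r) and sparse code u (r,).

    Hypothetical helper for illustration only.
    """
    support = np.flatnonzero(np.abs(u) > tol)   # atoms active at this pixel
    V_red = V[:, support]                       # p x k reduced dictionary
    # Orthogonal projector onto the span of the active atoms.
    return V_red @ np.linalg.pinv(V_red)        # p x p

# Stand-in dictionary (assumed): random atoms normalized to unit norm.
rng = np.random.default_rng(1)
p, r = 30, 45
V = rng.standard_normal((p, r))
V /= np.linalg.norm(V, axis=0)

# Two pixels with different 3-sparse codes (k = 3).
u_a = np.zeros(r)
u_a[[2, 7, 11]] = [1.0, -0.5, 2.0]
u_b = np.zeros(r)
u_b[[0, 7, 40]] = [0.3, 1.2, -1.0]

Q_a = pixel_tpsf(V, u_a)
Q_b = pixel_tpsf(V, u_b)   # differs from Q_a because the supports differ
```

Since the two pixels activate different atoms, `Q_a` and `Q_b` differ, which is the sense in which the TPSFs vary across space.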


[1] A. S. Gupta and Z. Liang, “Dynamic imaging by temporal modeling with principal component analysis,” 2001, p. 10.

[2] Z.-P. Liang, “Spatiotemporal imaging with partially separable functions,” in Joint Meeting of the 6th International Symposium on Noninvasive Functional Source Imaging of the Brain and Heart and the International Conference on Functional Biomedical Imaging (NFSI-ICFBI 2007). IEEE, 2007, pp. 181–182.

[3] H. Jung, K. Sung, K. S. Nayak, E. Y. Kim, and J. C. Ye, “k-t FOCUSS: A general compressed sensing framework for high resolution dynamic MRI,” Magnetic Resonance in Medicine, vol. 61, no. 1, pp. 103–116, 2009.

[4] H. Pedersen, S. Kozerke, S. Ringgaard, K. Nehrke, and W. Y. Kim, “k-t PCA: Temporally constrained k-t BLAST reconstruction using principal component analysis,” Magnetic Resonance in Medicine, vol. 62, no. 3, pp. 706–716, 2009.

[5] S. G. Lingala, Y. Hu, E. DiBella, and M. Jacob, “Accelerated dynamic MRI exploiting sparsity and low-rank structure: k-t SLR,” IEEE Transactions on Medical Imaging, vol. 30, no. 5, pp. 1042–1054, 2011.

[6] S. G. Lingala and M. Jacob, “Blind compressive sensing dynamic MRI,” IEEE Transactions on Medical Imaging, vol. 32, no. 6, pp. 1132–1145, 2013.

[7] M. Aharon et al., “k-SVD: An algorithm for designing overcomplete dictionaries for sparse representation,” IEEE Transactions on Signal Processing, vol. 54, no. 11, pp. 4311–4322, 2006.

Low rank denoising as non-local view-sharing. The TPSFs are the columns of the $$$\mathbf Q$$$ matrix. Different frames have different TPSFs. Note how the peaks of the TPSFs correspond to frames in similar motion states, implying implicit non-local view-sharing.

Dictionary learning denoising as spatially varying non-local view-sharing. The TPSFs are the columns of the $$$\mathbf Q_{red}$$$ matrix, and are spatially varying. Note how the peaks of the TPSFs correspond to the peaks of the underlying dynamic pixel time profiles, implying implicit spatially varying non-local view-sharing.

Denoising results using data-driven algorithms: The non-local time averaging in the data-driven models enables robust denoising while preserving temporal fidelity. In this example, dictionary learning denoising shows subtle gains in performance over low rank denoising, which are attributed to the spatially varying nature of the TPSFs.

Proc. Intl. Soc. Mag. Reson. Med. 24 (2016)

4233