Denoising  of diffusion MRI data using Random Matrix Theory
Jelle Veraart1,2, Dmitry S. Novikov2, Jan Sijbers1, and Els Fieremans2

1iMinds Vision Lab, University of Antwerp, Antwerp, Belgium, 2Center for Biomedical Imaging, New York University School of Medicine, New York, NY, United States

Synopsis

We here adopt the idea of noise removal by means of transforming redundant data into the Principal Component Analysis (PCA) domain and preserving only the components that contribute to the signal to denoise diffusion MRI (dMRI) data. We objectify the threshold on the PCA eigenvalues for denoising by exploiting the fact that the noise-only eigenvalues are expected to obey the universal Marchenko-Pastur (MP) distribution. By doing so, we design a selective denoising technique that reduces signal fluctuations solely rooting in thermal noise, not in fine anatomical details.

Purpose

To denoise diffusion MRI (dMRI) data in a model-independent way. The thermal noise in dMRI signals propagates through the estimation process to the diffusion model parameters of interest. Therefore, reducing the noise variance in the dMRI signals increases the precision of the diffusion parameter estimators. We here adopt the idea of noise removal by means of transforming redundant data into the Principal Component Analysis (PCA) domain and preserving only the components that contribute to the signal1. Unfortunately, an objective criterion to discriminate between the “signal-carrying” and “noise-only” components has been lacking. Indeed, the number of signal-carrying principal components is unknown and is expected to depend on factors such as resolution and SNR. Commonly used criteria include thresholding of the eigenvalues associated to the principal components by an empirically set threshold value2. Here, we objectify the threshold for PCA denoising by exploiting the fact that the noise-only eigenvalues are expected to obey the universal Marchenko-Pastur (MP) distribution3.

Methods

Oversampled dMRI data is represented by a number of parameters (P) that is significantly smaller than the number of diffusion encodings (N). Therefore, a M×N data matrix covering $$$M\gg 1$$$ voxels within a sliding window (e.g. $$$[5\times 5\times 5]$$$) is highly redundant. The PCA transformation of redundant data compresses the noise-free signal variance into P components, whereas the noise is spread over all components. The bulk of the PCA eigenspectrum of the data matrix is expected to show the spectral behavior of noise. According to Random Matrix Theory, the MP distribution describes the universal noise eigenspectrum3 (Fig.1):

$$ p(\lambda)=\frac{\sqrt{(\lambda_+ -\lambda)(\lambda -\lambda_-)}}{2\pi \gamma \sigma^2} \quad\mathrm{if}\quad\lambda_-\leq \lambda\leq\lambda_+\quad (p(\lambda)=0\:\mathrm{ otherwise)}$$

with $$$\lambda_\pm=\sigma^2(1\pm\sqrt{\gamma})^2$$$, $$$\sigma$$$ the noise level and $$$\gamma=M/(N-P)$$$. Distribution fitting, i.e. minimizing the error between the MP distribution $$$p(\lambda)$$$ and the histogram $$$n(\lambda)$$$ of the lowest eigenvalues, yields an accurate estimate of $$$\sigma$$$, P, and an objective threshold to idenitify significant signal components, i.e. $$$\lambda_+$$$ . We use the P significant eigenvalues $$$(\lambda >\lambda_+)$$$ to estimate the expected value of the signal in each voxel and diffusion gradient direction. Since the noise level and signal are estimated simultaneously, both parameters can be corrected for the nc-$$$\chi$$$ noise bias afterwards4. We compare our technique (MPPCA) with adaptive non-local mean5 (ANLM) and total generalized variation6 (TGV).

Results

Simulation: A M×N matrix was generated by projecting M, ranging from 100 to 2500, axially symmetric diffusion tensors onto N=30, 60, and 90 diffusion gradient directions with $$$b=1\,\mathrm{ms/\mu m^2}$$$. Monte Carlo simulations showed that the accuracy as function of M and N the method was >99.3%, whereas the noise standard deviation (SD) was reduced to $$$σ\sqrt{P/N+P/M}$$$.

Real data ($$$5\times b=0$$$, and 3 repetitions of $$$ 90\times b=\{1,\, 2.5\}\,\mathrm{ms/\mu m^2}$$$): We applied MPPCA, ANLM, and TGV to a subset of the $$$b=2.5\,\mathrm{ms/\mu m^2}$$$-shells (1 repetition, N=60). Fig. 2 shows the denoised images for qualitative comparison. The blurring for ANLM and TV is quantitatively confirmed by comparing the k-energy density as function of the distance to the k-space center (Fig. 3). Indeed, the suppression of high frequencies explains the low-pass filtering effect. The suppression of high frequencies not strictly reflecting thermal noise, but rather anatomical details, also results in residuals, normalized by σ, that follow a distribution with SD exceeding one (meaning that they are contaminated by signal), whereas MPPCA shows a distribution that matches the simulations, i.e. zero-centered normal distribution with SD=$$$\sigma\sqrt{1-P/N-P/M}$$$(Fig. 4). Hence, MPPCA preserves the actual signal better than competing methods. The same trends are observed in the fractional anisotropy (FA) maps derived from the $$$ b=\{0,\,1\} \,ms/\mu m^2$$$ subsets (Nb>0=30, 60, and 90). We observe a significant increase in precision in the estimation of FA, determined using bootstrapping with replacement based on the 3 repetitions (Fig. 5). Although ANLM might outperform MPPCA in terms of precision, ANLM is shown to be less accurate when comparing the denoised results to a ground truth, derived from $$$3\times (5\times b=0+ 90 \times b=\{1,\, 2.5\} \,\mathrm{ms/\mu m^2})$$$.

Discussion & Conclusion

Unlike competing methods, MPCCA is a selective denoising technique that reduces signal fluctuations solely rooting in thermal noise. The selective nature of the proposed technique is created by exploiting data redundancy in the PCA domain using universal properties of the eigenspectrum of random covariance matrices. Indeed, Typical dMRI data has shown to exhibit sufficient redundancy ($$$P\ll M,N$$$), which allows to distinguish between the noise and signal components in a model-independent way, and without the need to empirically set the PCA threshold value. The result show improved precision in the estimation of diffusion model parameters whereas the accuracy is preserved. Similar results have been obtained for other all diffusion parameters.

Acknowledgements

Work financially supported by:

Fund for Scientific Research-Flanders (FWO) (#12S1615N)

NIH R01 NS088040

Interuniversity Attraction Poles Program (P7/11) initiated by the Belgian Science Policy Office

References

1Hotelling (1933) J. of Educational Psychology 24:417-441

2Manjon et al. (2013) Plos One, DOI: 10.1371/journal.pone.0073021

3Marchenko et al. (1967) Mat. Sb. (N.S.) 72(114):507–536

4Koay et al. (2006) JMR 179(2):317-22

5Manjon et al. (2010) JMRI 31(1):192-203

6Knoll et al. (2011) MRM 65(2):480-91

Figures

Fig. 1: Eigenvalue spectrum of sample covariance matrix of simulated DW data and MP distribution superimposed. The P "signal-carrying" eigenvalues (black) are separated from the MP noise bulk (gray).

Fig. 2: (top row) Denoised single DW images (b=2.5); (bottom row) average residuals across all DW images. The “structure” in the ANLM and TGV maps indicates that those techniques affect the actual signal, whereas the residuals of MPPCA are "pure" noise.

Fig. 3: k-space energy density as function of the distance to the k-space center, δk. Corrected MPPCA (dashed line) closely approximates the predicted REF data. ANLM and TGV have a low-pass filtering effect. The reference is obtained by computing the k-space density of the original data and subtracting the noise power at δk, i.e. Nkσ2 with Nk the number of k-space points at δk . MPPCA was corrected similarly since the residual noise level can be predicted (dashed line).

Fig. 4: The distribution of normalized residuals for the different methods as observed (o) and best fitting normal distribution (solid line). The unit normal distribution is shown for reference. ANLM and TGV clearly over-do the denoising (i.e. SD>1) by interfering with the underlying signal.

Fig. 5: Variability (colored maps) and bias (grayscale maps) in the estimation of fractional anisotropy



Proc. Intl. Soc. Mag. Reson. Med. 24 (2016)
1047