0108

Patch2Self denoising reveals a new theoretical understanding of Diffusion MRI
Shreyas Fadnavis1, Joshua Batson2, and Eleftherios Garyfallidis3
1Intelligent Systems Engineering, Indiana University Bloomington, Bloomington, IN, United States, 2Chan Zuckerberg Biohub, San Francisco, CA, United States, 3Indiana University Bloomington, Bloomington, IN, United States

Synopsis

Diffusion MRI (dMRI) is a promising tool for evaluating the spinal cord in health and disease; however, low SNR can impede accurate, repeatable, quantitative measurements. Here, we apply a recently proposed denoiser, Patch2Self, that strictly suppresses statistically independent random fluctuations in the signal originating from various sources of noise. Typical spinal cord dMRI scans have a small number of gradient directions (10-20), making PCA-based 4D denoisers (which require at least 30) inapplicable. Using self-supervised learning, Patch2Self addresses these issues, which we show quantitatively through improved repeatability and conspicuity of pathology in the spinal cord.

Introduction, Methods, Results and Discussion

Introduction:
Diffusion MRI (dMRI), which is commonly used to characterize white matter and quantify tissue microstructure, is often corrupted by noise from a variety of sources such as thermal fluctuations, physiological motion, and signal drifts. Mapping and characterizing this noise is difficult because of the variety of unknown sources that confound the signal. Different approaches have been proposed to tackle this problem by assuming certain properties of the signal, such as low rank [2,13], self-similarity/repetition [4,5], compressibility/sparsity [12], and local smoothness [6]. PCA-based low-rank approximation methods have been particularly widely adopted. These methods are often limited in performance because PCA assumes that the noise is homoscedastic, which is not always valid. PCA also places limitations on the number of gradient directions: if the number is too high, PCA becomes ill-posed, and if the number is too low (< 30), the number of principal components is hard to estimate. In practice, the assumptions these methods make (e.g. MPPCA [2]) are often violated, which reduces denoising performance. Here we propose Patch2Self [1] to address these issues by removing all assumptions on the signal structure. It relies only on the fact that noise in different gradient directions is uncorrelated and statistically independent. Using the theory of J-invariance [1], we show that Patch2Self builds an optimal denoiser within a regression framework. With comparisons on simulated and real data, we show improved denoising performance over other methods.
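As a brief sketch of why independence of the noise is sufficient, the J-invariance argument can be written out in a few lines. The notation is an assumption made for illustration and does not come from the abstract: x = y + n is the noisy measurement, y the noise-free signal, J the held-out gradient direction, J^c the remaining directions, and f a regressor whose output on J depends only on x restricted to J^c.

```latex
% Assumptions (illustrative notation): x = y + n, E[n | y] = 0,
% n_J independent of n_{J^c} given y, and f(x)_J depends only on x_{J^c}.
\begin{align}
\mathbb{E}\,\lVert f(x)_J - x_J \rVert^2
  &= \mathbb{E}\,\lVert f(x)_J - y_J \rVert^2
   + \mathbb{E}\,\lVert x_J - y_J \rVert^2
   - 2\,\mathbb{E}\,\langle f(x)_J - y_J,\; x_J - y_J \rangle \\
  &= \underbrace{\mathbb{E}\,\lVert f(x)_J - y_J \rVert^2}_{\text{ground-truth loss}}
   + \underbrace{\mathbb{E}\,\lVert n_J \rVert^2}_{\text{noise variance}} .
\end{align}
% The cross term vanishes because f(x)_J is a function of x_{J^c} only,
% which is independent of n_J = x_J - y_J given y, and E[n_J | y] = 0.
```

Under these assumptions the noise-variance term does not depend on f, so the regressor that minimizes the self-supervised loss also minimizes the ground-truth loss.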
Methods:
Patch2Self learns one regression function to denoise each 3D volume (gradient direction). To do so, Patch2Self works with 3D patches. We start by extracting 3D patches from all ‘n’ gradient directions. Next, we hold out the patches corresponding to the gradient direction to be denoised. Using the patches from the remaining ‘n-1’ gradient directions as input, we train a regression function with the held-out patches as the target (see Fig. 1 for a flow diagram). This training uses a self-supervised loss that satisfies the theorem of J-invariance. Since the noise fluctuates randomly across volumes and is spatially uncorrelated, the regressor cannot learn it. The regression function therefore ends up learning only the true underlying signal, which exhibits a correlation structure across gradient directions. To obtain the denoised volume, we feed the patches from the same ‘n-1’ gradient directions into the trained regressor. Patch2Self builds a denoiser using J-invariance, which states that if the noise in one gradient direction is statistically independent of the noise in the other gradient directions, then we can use the J dimensions as self-labels for training on the J^c (complement of J) dimensions. One can prove that the J-invariant loss equals the ground-truth loss plus the noise variance, a term that does not depend on the regressor, so minimizing one minimizes the other. The advantages of this design are that Patch2Self works on a single subject, has low time complexity, and is clinically viable. It places no restrictions on the nature of the acquisition and denoises effectively provided the noise is statistically independent.
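The hold-out-and-regress procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names `extract_patches` and `patch2self_sketch` are hypothetical, ordinary least squares stands in for the regressor, the center voxel of each held-out patch is assumed to be the regression target, and the test data are synthetic.

```python
import numpy as np


def extract_patches(data, radius):
    """Collect cubic patches around every voxel.

    data: 4D array (x, y, z, n_volumes).
    Returns an array of shape (n_voxels, patch_size, n_volumes).
    """
    x, y, z, n = data.shape
    padded = np.pad(data, [(radius, radius)] * 3 + [(0, 0)], mode="reflect")
    shifts = range(2 * radius + 1)
    cols = [
        padded[i:i + x, j:j + y, k:k + z, :].reshape(-1, n)
        for i in shifts for j in shifts for k in shifts
    ]
    return np.stack(cols, axis=1)


def patch2self_sketch(data, radius=0):
    """Denoise each volume by regressing it on patches of the other volumes."""
    patches = extract_patches(data, radius)          # (n_voxels, P, n_volumes)
    n_vox, patch_size, n_vol = patches.shape
    center = patch_size // 2                         # index of the patch center
    denoised = np.empty((n_vox, n_vol))
    for j in range(n_vol):
        # Hold out volume j; the remaining n-1 volumes form the features.
        others = np.delete(patches, j, axis=2).reshape(n_vox, -1)
        features = np.column_stack([others, np.ones(n_vox)])  # add intercept
        # Assumption: the target is the center voxel of the held-out patch.
        target = patches[:, center, j]
        weights, *_ = np.linalg.lstsq(features, target, rcond=None)
        # Predict volume j from the same n-1 volumes used for training.
        denoised[:, j] = features @ weights
    return denoised.reshape(data.shape)


# Synthetic check: a rank-3 signal across 20 volumes plus independent noise.
rng = np.random.default_rng(0)
coeffs = rng.standard_normal((8, 8, 8, 3))
mixing = rng.standard_normal((3, 20))
clean = coeffs @ mixing                              # (8, 8, 8, 20)
noisy = clean + 0.1 * rng.standard_normal(clean.shape)
denoised = patch2self_sketch(noisy, radius=0)

mse_noisy = np.mean((noisy - clean) ** 2)
mse_denoised = np.mean((denoised - clean) ** 2)
print(f"MSE noisy: {mse_noisy:.5f}, MSE denoised: {mse_denoised:.5f}")
```

The toy example uses `radius=0` so that the number of voxels (512) stays well above the number of regression features (20); with larger patches on such a small volume, least squares would have enough freedom to reproduce the noisy target in-sample.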
Results (Simulated and Real Data):
We start by comparing Patch2Self on real data using the Sherbrooke 3-Shell Dataset [7]. We compare against the empirical Local PCA (LPCA), Adaptive Optimized Non-local Means (AONLM), and Marchenko-Pastur PCA (MPPCA). In all cases, Patch2Self outperforms the other methods and, interestingly, finds a locally linear full-rank representation similar to LPCA [13] (ranks 7-10), as shown in Fig. 2A, automatically and in a data-driven way. To compare the performance of Patch2Self with other methods, we evaluated it against MPPCA and AONLM on real and simulated data. The simulation contained 62 volumes (2 b0s and 30 gradient directions at each of the b-values 1000 and 2000) with noise added to the real and imaginary parts of each channel [10]. In Fig. 2B we show that Patch2Self outperforms MPPCA and AONLM in each case. We also computed the RMSE and R² statistic between the ground truth and the denoised signal. In each case Patch2Self outperformed MPPCA (AONLM heavily smooths the signal and is therefore not compared quantitatively). Lastly, we compare the effect of Patch2Self on the common downstream tasks of microstructure modelling and tractography. We compute a voxel-wise cross-validation [8] by fitting the DTI [14] and Constrained Spherical Deconvolution (CSD) [11] models to the data. Using an R² statistic, we show that the goodness-of-fit of the models always improved after Patch2Self denoising in comparison with the noisy and MPPCA-denoised data (see Fig. 3). Patch2Self also yields more coherent tracts, which we quantify using the Fiber-to-Bundle Coherence (FBC) [9] metric shown in Fig. 4. We performed this comparison on the Optic Radiation bundle using probabilistic tractography with exactly the same parameters. Our experiments show that Patch2Self automatically prunes the incoherent false-positive streamlines generated by the tracking algorithm.
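For readers who want to reproduce the quantitative comparison on their own data, the two summary statistics mentioned above can be computed as follows. This is a hedged sketch: the abstract does not define R², so the coefficient of determination is assumed here, and the function names and toy arrays are illustrative only.

```python
import numpy as np


def rmse(truth, estimate):
    """Root-mean-square error between ground truth and a denoised estimate."""
    truth = np.asarray(truth, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    return np.sqrt(np.mean((truth - estimate) ** 2))


def r_squared(truth, estimate):
    """Coefficient of determination (assumed definition of the R² statistic)."""
    truth = np.asarray(truth, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    ss_res = np.sum((truth - estimate) ** 2)         # residual sum of squares
    ss_tot = np.sum((truth - truth.mean()) ** 2)     # total sum of squares
    return 1.0 - ss_res / ss_tot


# Toy values standing in for a ground-truth signal and a denoised signal.
truth = np.array([1.0, 2.0, 3.0, 4.0])
estimate = np.array([1.1, 1.9, 3.2, 3.8])
print(f"RMSE = {rmse(truth, estimate):.4f}, R2 = {r_squared(truth, estimate):.4f}")
```

A lower RMSE and an R² closer to 1 both indicate that the denoised signal is closer to the ground truth.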
Discussion:
Patch2Self thus proposes a new way of looking at dMRI denoising using the notion of statistical independence. It has been made available to the community via the DIPY [7] software package. Patch2Self allows users to switch between different regression functions depending on the data. However, our experiments show that linear regression performs best, as shown in Fig. 5. This is not surprising given the past success of linear denoisers (e.g. PCA) and the oversampled q-space in dMRI acquisitions.

Acknowledgements

We sincerely thank Prof. Qiuting Wen (Indiana University School of Medicine) for providing the simulated data used in the above experiment. S.F. and E.G. were supported by the National Institute of Biomedical Imaging and Bioengineering (NIBIB) of the National Institutes of Health (NIH) under Award Number R01EB027585. J.B. was supported by the Chan Zuckerberg Biohub.

References

[1] Fadnavis, Shreyas, Joshua Batson, and Eleftherios Garyfallidis. "Patch2Self: Denoising Diffusion MRI with Self-Supervised Learning." Advances in Neural Information Processing Systems 33 (2020).

[2] Veraart, Jelle, et al. "Denoising of diffusion MRI using random matrix theory." Neuroimage 142 (2016): 394-406.

[3] Ramos-Llordén, Gabriel, et al. "SNR-enhanced diffusion MRI with structure-preserving low-rank denoising in reproducing kernel Hilbert spaces." arXiv preprint arXiv:2009.06600 (2020).

[4] Coupé, Pierrick, et al. "An optimized blockwise nonlocal means denoising filter for 3-D magnetic resonance images." IEEE transactions on medical imaging 27.4 (2008): 425-441.

[5] Dabov, Kostadin, et al. "BM3D image denoising with shape-adaptive principal component analysis." 2009.

[6] Knoll, Florian, et al. "Second order total generalized variation (TGV) for MRI." Magnetic resonance in medicine 65.2 (2011): 480-491.

[7] Garyfallidis, Eleftherios, et al. "Dipy, a library for the analysis of diffusion MRI data." Frontiers in neuroinformatics 8 (2014): 8.

[8] Rokem, Ariel, et al. "Evaluating the accuracy of diffusion MRI models in white matter." PloS one 10.4 (2015): e0123272.

[9] Portegies, Jorg M., et al. "Improving fiber alignment in HARDI by combining contextual PDE flow with constrained spherical deconvolution." PloS one 10.10 (2015): e0138122.

[10] Graham, Mark S., Ivana Drobnjak, and Hui Zhang. "Realistic simulation of artefacts in diffusion MRI for validating post-processing correction techniques." NeuroImage 125 (2016): 1079-1094.

[11] Tournier, J-Donald, Fernando Calamante, and Alan Connelly. "Robust determination of the fibre orientation distribution in diffusion MRI: non-negativity constrained super-resolved spherical deconvolution." Neuroimage 35.4 (2007): 1459-1472.

[12] Gramfort, Alexandre, Cyril Poupon, and Maxime Descoteaux. "Denoising and fast diffusion imaging with physically constrained sparse dictionary learning." Medical image analysis 18.1 (2014): 36-49.

[13] Manjón, José V., et al. "Diffusion weighted image denoising using overcomplete local PCA." PloS one 8.9 (2013): e73021.

[14] Basser, Peter J., James Mattiello, and Denis LeBihan. "MR diffusion tensor spectroscopy and imaging." Biophysical journal 66.1 (1994): 259-267.

Figures

Figure 1: Flow of the Patch2Self framework, including the self-supervised loss used in the J-invariant training.

Figure 2: (2A) Results on real data and the corresponding residual maps obtained from each of the methods: (A) LPCA, (B) AONLM, (C) MPPCA, and (D) Patch2Self. (2B) Comparison on ground-truth data via visual inspection, RMSE, and R² scores.

Figure 3: Effect on Microstructure Modeling

Figure 4: Effect on Tractography via FBC

Figure 5: Regressor Comparisons

Proc. Intl. Soc. Mag. Reson. Med. 29 (2021)