1651

Deep learning with synthetic data for free water elimination in diffusion MRI

Miguel Molina-Romero^1,2, Pedro A. Gómez^1,2, Shadi Albarqouni¹, Jonathan I. Sperl², Marion I. Menzel², and Bjoern H. Menze¹

¹Technical University of Munich, Munich, Germany, ²GE Global Research Europe, Munich, Germany

Synopsis

Diffusion metrics are typically biased by cerebrospinal fluid (CSF) contamination. In this work, we present a deep-learning-based solution to remove the CSF contribution. First, we train an artificial neural network (ANN) with synthetic data to estimate the tissue volume fraction. Second, we use the trained network to estimate the tissue volume fraction for real data, and use these estimates to correct for CSF contamination. Results show that the CSF contribution is corrected, which in turn indicates that the tissue volume fraction can be estimated using this joint data generation and deep learning approach.

Introduction

Cerebrospinal fluid (CSF) partial volume contamination hinders the detection of changes in tissue microstructure¹ because it biases the diffusion measurements and derived metrics. CSF is mostly composed of free water, which diffuses isotropically with a diffusivity about three times larger than that of parenchyma².

FLAIR DWI³ tackles the problem by suppressing the CSF signal during acquisition, at the cost of lower SNR and longer acquisition times. Post-processing solutions have focused on fitting a bi-tensor model; yet this is an ill-posed problem for which several regularizations have been proposed^1,2,4-7.

In this work, we hypothesise and show that artificial neural networks (ANN) can estimate the tissue volume fraction from the diffusion signal. Given this estimate, the CSF contribution can be corrected.

Methods

Theory

CSF has isotropic diffusion with diffusivity $$$D_{CSF}=3·10^{-3}mm^2/s$$$ ², so its signal can be computed from the b-value, $$$b$$$:

$$S_{CSF}=e^{-b·D_{CSF}}. $$[Eq.1]

The measured signal is the contribution of CSF and tissue (parenchyma) components:

$$S = fS_{tissue} + (1-f)S_{CSF}.$$ [Eq.2]

Eq.2 is ill-posed since $$$S_{tissue}$$$ and its volume fraction, $$$f$$$, are both unknown. In this work, we present a deep learning approach that uses ANNs to estimate $$$f$$$, which regularizes the problem and allows Eq.2 to be solved for the tissue signal:

$$S_{tissue} = \frac{S-(1-f)S_{CSF}}{f}.$$ [Eq.3]
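As an illustration of Eq.1 to Eq.3, the following minimal sketch mixes a tissue signal with the CSF signal and then inverts the mixture for a known $$$f$$$. The function names, the example tissue values, and the `f_min` guard against division by zero are assumptions made for this example and are not part of the original method.

```python
import math

D_CSF = 3e-3  # mm^2/s, free-water diffusivity quoted in the text


def csf_signal(bvals):
    """Eq.1: CSF signal for each b-value (in s/mm^2)."""
    return [math.exp(-b * D_CSF) for b in bvals]


def mix(s_tissue, f, bvals):
    """Eq.2: measured signal for a tissue volume fraction f."""
    return [f * st + (1.0 - f) * sc
            for st, sc in zip(s_tissue, csf_signal(bvals))]


def correct(s, f, bvals, f_min=1e-6):
    """Eq.3: tissue signal recovered from the measured signal and f.

    f_min is a hypothetical guard against division by zero
    in voxels estimated as pure CSF.
    """
    f = max(f, f_min)
    return [(sm - (1.0 - f) * sc) / f
            for sm, sc in zip(s, csf_signal(bvals))]


bvals = [0.0, 500.0, 1000.0]
s_tissue = [1.0, 0.70, 0.50]  # made-up tissue attenuation, illustration only
f = 0.8

s = mix(s_tissue, f, bvals)
recovered = correct(s, f, bvals)  # matches s_tissue up to rounding error
```

In practice $$$f$$$ is not known and has to be estimated, so the quality of the recovered tissue signal depends directly on the quality of that estimate.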

Generation of synthetic data

The training dataset was designed to teach the ANN to detect CSF-like components mixed with a random signal (Fig.1). The CSF signal was derived from Eq.1 and the acquisition parameter b. The tissue signal was randomly generated to simulate undetermined directions. The generation steps were:

1. $$$S_{CSF}^{training}$$$ was computed (Eq.1).
2. $$$S_{tissue}^{training}$$$ was randomly created simulating arbitrary directions: $$$U(0,1)$$$.
3. $$$f$$$ was randomly generated: $$$U(0,1)$$$.
4. $$$S^{training}$$$ was computed (Eq.2).
5. The ANN was trained with input $$$S^{training}$$$ to match the output $$$f$$$ (Fig.2).
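The generation steps can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the seed handling, and the example protocol (two b=0 volumes plus 30 volumes at b=1000s/mm², giving 32 inputs) are hypothetical. Setting the tissue signal to 1 where b=0 follows the description in Fig.1 of maxima at b=0s/mm².

```python
import math
import random

D_CSF = 3e-3  # mm^2/s, CSF diffusivity used in Eq.1


def make_training_set(bvals, n_samples, seed=0):
    """Return (signals, fractions) pairs for supervised training."""
    rng = random.Random(seed)
    # CSF signal from the protocol b-values (Eq.1)
    s_csf = [math.exp(-b * D_CSF) for b in bvals]
    signals, fractions = [], []
    for _ in range(n_samples):
        # Random tissue signal, U(0,1); assumed equal to 1 at b=0
        s_tissue = [1.0 if b == 0 else rng.random() for b in bvals]
        # Random tissue volume fraction, U(0,1)
        f = rng.random()
        # Mixed training signal (Eq.2)
        signals.append([f * st + (1.0 - f) * sc
                        for st, sc in zip(s_tissue, s_csf)])
        fractions.append(f)
    return signals, fractions


# Hypothetical single-shell protocol with 32 inputs
bvals = [0.0] * 2 + [1000.0] * 30
X, y = make_training_set(bvals, n_samples=1000)
```

Each row of `X` would be presented to the network input and the matching entry of `y` used as the training target.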

Free water elimination

For comparison, we trained¹⁰ five ANN architectures (Fig.2) in MATLAB (MathWorks, Natick, MA) for datasets with 32 directions (one shell) and 64 directions (two shells). We chose the best-performing ANNs and compared them against Pasternak's⁴ and Hoy’s^6,11 methods.

Data acquisition

A volunteer underwent a diffusion acquisition (GE 3T MR750w, Milwaukee, WI) with 30 directions and two shells (b=500 and 1000s/mm²); four b=0s/mm² volumes; TR/TE=8000/80ms; FOV=200mm; resolution 128x128; ASSET=2; and 25 slices of 3.6mm thickness with no gap.

Pipeline

1. Diffusion measurement.
2. Synthetic data generation from the experimental b (Fig.1).
3. ANN training.
4. Volume fraction estimation: $$$f=ANN(S)$$$.
5. Computation of $$$S_{tissue}$$$ (Eq.3).
6. Fitting of the tensor model^8,9 on $$$S_{tissue}$$$.

Results

The five ANN architectures (Fig.2) showed similar performance (Fig.3). ANNs trained for two shells (ANN2s) outperformed those trained for one shell (ANN1s), due to the better CSF encoding of two-shell protocols. The best-performing ANNs were L=2 and L=3 for one and two shells, respectively, suggesting a potential coupling between the number of hidden layers and the number of shells.

DTI metrics after ANN correction differed depending on the number of shells. ANN1s estimated larger volumes of CSF than ANN2s (Fig.4c), which resulted in larger FA (Fig.4a) and lower MD (Fig.4b) estimates. This difference in the $$$f$$$ estimate might be explained by the limited CSF information contained in the single-shell protocol. MD values for ANN2s (Fig.4b) agreed with the reference².

ANNs preserved the anatomical integrity of the FA, MD, and $$$f_{CSF}$$$ maps (Fig.5). We observed the CSF correction in the enlargement of the corpus callosum and fornix, and in a general increase of FA in white matter, compared to the standard DTI (Fig.5a,c,e,f). CSF contribution was accurately removed from MD maps, especially for ANN2s (Fig.5g,h,k,l). ANN1s and ANN2s differed in the $$$f$$$ estimate in white matter (Fig.5m,o), as previously explained.

Discussion

ANNs trained with synthetic data are capable of estimating the tissue volume fraction from the measured diffusion signal. Their correction is comparable to that of well-established methods: Pasternak et al.⁴ and Hoy et al.⁶ (Fig.4 and Fig.5).

Using ANNs has a performance advantage. Their training time is on the order of ten minutes, and once trained they can be used for any data acquired with the same protocol. CSF correction is faster than with traditional methods. For one shell, Pasternak’s method ran for 38.4s and ANN1s for 0.7s (55x). For two shells, Hoy’s method ran for 392.5s and ANN2s for 1.3s (302x). In addition, accuracy could be improved by carefully designing the training dataset to mimic only tissue characteristics (here it is random), or by incorporating prior knowledge of the bi-exponential problem and noise model into the learning process¹².
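The reported speed-up factors follow directly from the runtimes quoted above. The short sketch below reproduces that arithmetic and, as an additional illustration, estimates after how many datasets the training cost is recovered. The 600s training time is an assumption standing in for "on the order of ten minutes", and all variable names are hypothetical.

```python
import math

# Runtimes quoted in the text, in seconds
pasternak_s, ann1s_s = 38.4, 0.7   # one shell
hoy_s, ann2s_s = 392.5, 1.3        # two shells

speedup_one_shell = pasternak_s / ann1s_s
speedup_two_shells = hoy_s / ann2s_s
print(round(speedup_one_shell), round(speedup_two_shells))  # prints "55 302"

# Assumed one-off training cost (about ten minutes)
training_s = 600.0

# Number of datasets after which training plus ANN inference
# becomes cheaper than running the reference method each time
break_even_one_shell = math.ceil(training_s / (pasternak_s - ann1s_s))
break_even_two_shells = math.ceil(training_s / (hoy_s - ann2s_s))
```

Under this assumed training time, the training cost would be recovered after 16 single-shell datasets or 2 two-shell datasets acquired with the same protocol.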

Conclusions

This is the first application of ANNs to remove CSF contamination. We showed that the tissue volume fraction can be estimated by ANNs trained with synthetic data, creating a new tool for free water elimination.

Acknowledgements

This work was supported by the TUM Institute of Advanced Study, funded by the German Excellence Initiative and the European Commission under Grant Agreement Number 605162.

References

¹ C. Metzler-Baddeley, M. J. O’Sullivan, S. Bells, O. Pasternak, and D. K. Jones, “How and how not to correct for CSF-contamination in diffusion MRI,” Neuroimage, vol. 59, no. 2, pp. 1394–1403, Jan. 2012.

² C. Pierpaoli and D. K. Jones, “Removing CSF Contamination in Brain DT-MRIs by Using a Two-Compartment Tensor Model,” in Proceedings of the 12th Annual Meeting of ISMRM, Kyoto, 2004, p. 1215.

³ G. Liu, P. Van Gelderen, J. Duyn, and C. T. W. Moonen, “Single-shot diffusion MRI of human brain on a conventional clinical instrument,” Magn. Reson. Med., vol. 35, no. 5, pp. 671–677, 1996.

⁴ O. Pasternak, N. Sochen, Y. Gur, N. Intrator, and Y. Assaf, “Free Water Elimination and Mapping from Diffusion MRI,” Magn. Reson. Med., vol. 62, no. 3, pp. 717–730, 2009.

⁵ Z. Eaton-Rosen, A. Melbourne, M. J. Cardoso, N. Marlow, and S. Ourselin, “Beyond the Resolution Limit: Diffusion Parameter Estimation in Partial Volume,” MICCAI 2016, vol. 1, pp. 605–612, 2016.

⁶ A. R. Hoy, C. G. Koay, S. R. Kecskemeti, and A. L. Alexander, “Optimization of a Free Water Elimination Two-Compartment Model for Diffusion Tensor Imaging,” Neuroimage, vol. 103, pp. 323–333, 2014.

⁷ M. Molina-Romero, P. A. Gómez, J. I. Sperl, A. J. Stewart, D. K. Jones, M. I. Menzel, and B. H. Menze, “Theory, validation and application of blind source separation to diffusion MRI for tissue characterisation and partial volume correction,” in Proceedings of the 25th Annual Meeting of ISMRM, Honolulu, 2017, p. 3462.

⁸ P. J. Basser, J. Mattiello, and D. LeBihan, “MR diffusion tensor spectroscopy and imaging,” Biophys. J., vol. 66, no. 1, pp. 259–267, 1994.

⁹ M. Jenkinson, C. F. Beckmann, T. E. J. Behrens, M. W. Woolrich, and S. M. Smith, “FSL,” Neuroimage, vol. 62, no. 2, pp. 782–790, 2012.

¹⁰ M. T. Hagan and M. B. Menhaj, “Training Feedforward Networks with the Marquardt Algorithm,” IEEE Trans. Neural Networks, vol. 5, no. 6, pp. 989–993, 1994.

¹¹ E. Garyfallidis, M. Brett, B. Amirbekian, A. Rokem, S. van der Walt, M. Descoteaux, and I. Nimmo-Smith, “Dipy, a library for the analysis of diffusion MRI data,” Front. Neuroinform., vol. 8, 2014.

¹² J. Adler and O. Öktem, “Solving ill-posed inverse problems using iterative deep neural networks,” no. 1, pp. 1–24, 2017. arXiv:1704.04058v2

Figures

Fig.1 Generation of synthetic data. The vectorization of the diffusion MRI signal along the diffusion directions (B) shows a tissue-dependent pattern. $$$S_{CSF}$$$ is characterized by Eq.1 and can be calculated from the diffusion protocol (b values). $$$S_{tissue}$$$ depends on the tissue anisotropy and the acquired directions, so it cannot be predicted. We represented it as a uniformly distributed signal, $$$U(0,1)$$$, with maxima where b=0s/mm². The tissue volume fraction, $$$f$$$, was also generated uniformly, $$$U(0,1)$$$. Finally, $$$S^{training}$$$ was computed as in Eq.2 and presented to the input of the ANN, with $$$f$$$ as the target output for training (Fig.2).

Fig.2 ANN architectures. Five architectures with L=1–5 were tested to determine their performance. For L=1 hidden layer, $$$I_1=I_0/3$$$. For L=2 hidden layers, $$$I_1=I_0/2$$$ and $$$I_2=I_0/4$$$. For L=3 hidden layers, $$$I_1=I_0/2$$$, $$$I_2=I_0/3$$$ and $$$I_3=I_0/4$$$. For L=4 hidden layers, $$$I_1=I_0/2$$$, $$$I_2=I_0/3$$$, $$$I_3=I_0/4$$$ and $$$I_4=I_0/5$$$. For L=5 hidden layers, $$$I_1=2I_0/3$$$, $$$I_2=I_0/2$$$, $$$I_3=I_0/3$$$, $$$I_4=I_0/4$$$ and $$$I_5=I_0/5$$$. The number of inputs, $$$I_0$$$, matched the number of diffusion directions and non-diffusion-weighted volumes. In these experiments, we used $$$I_0=32$$$ for one shell and $$$I_0=64$$$ for two shells. One million signal combinations and volume fractions were generated; of these, 20% were set aside for validation and 20% for testing, and the remainder were used for training.
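To make the layer sizes in Fig.2 concrete, the sketch below evaluates the stated fractions of $$$I_0$$$ for each architecture. Several fractions do not give whole numbers for $$$I_0=32$$$ or $$$I_0=64$$$, and the caption does not say how they were rounded, so rounding to the nearest integer is an assumption here, as are the names `HIDDEN_FRACTIONS` and `hidden_sizes`.

```python
from fractions import Fraction

# Hidden-layer widths as fractions of the input size I_0 (from Fig.2)
HIDDEN_FRACTIONS = {
    1: [Fraction(1, 3)],
    2: [Fraction(1, 2), Fraction(1, 4)],
    3: [Fraction(1, 2), Fraction(1, 3), Fraction(1, 4)],
    4: [Fraction(1, 2), Fraction(1, 3), Fraction(1, 4), Fraction(1, 5)],
    5: [Fraction(2, 3), Fraction(1, 2), Fraction(1, 3),
        Fraction(1, 4), Fraction(1, 5)],
}


def hidden_sizes(i0, n_layers):
    """Neurons per hidden layer for an input size i0.

    Assumption: non-integer widths are rounded to the nearest integer.
    """
    return [round(i0 * frac) for frac in HIDDEN_FRACTIONS[n_layers]]


one_shell_best = hidden_sizes(32, 2)   # L=2, one shell
two_shell_best = hidden_sizes(64, 3)   # L=3, two shells
```

With this rounding, the selected one-shell network (L=2) would have hidden layers of 16 and 8 neurons, and the selected two-shell network (L=3) hidden layers of 32, 21 and 16 neurons.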

Fig.3 Performance comparison for the five ANN architectures. We generated 5000 artificial diffusion signals for FA=0-1 and $$$f$$$=0-1. They were mixed as in Eq.2 and Fig.1 and presented to the trained ANNs to estimate $$$\hat{f}$$$. We plot the error ($$$f-\hat{f}$$$) of the estimated volume fraction ($$$\hat{f}$$$) against its true value ($$$f$$$), their correlation ($$$\rho$$$), and the standard deviation of the error ($$$\sigma$$$). The largest $$$\rho$$$ and smallest $$$\sigma$$$ were found for L=2 with ANN1s and for L=3 with ANN2s. We therefore used L=2 for one shell and L=3 for two shells in the in vivo experiments.
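The two summary metrics in Fig.3 can be computed as sketched below. The caption does not specify the exact definitions, so this example assumes that $$$\rho$$$ is the Pearson correlation between $$$f$$$ and $$$\hat{f}$$$ and that $$$\sigma$$$ is the population standard deviation of the error $$$f-\hat{f}$$$. The function name and the example values are hypothetical.

```python
import math


def error_metrics(f_true, f_est):
    """Return (rho, sigma) for true and estimated volume fractions.

    Assumptions: rho is the Pearson correlation between f_true and
    f_est; sigma is the population standard deviation of f_true - f_est.
    """
    n = len(f_true)
    errors = [t - e for t, e in zip(f_true, f_est)]
    mean_err = sum(errors) / n
    sigma = math.sqrt(sum((x - mean_err) ** 2 for x in errors) / n)

    mean_t = sum(f_true) / n
    mean_e = sum(f_est) / n
    cov = sum((t - mean_t) * (e - mean_e) for t, e in zip(f_true, f_est))
    var_t = sum((t - mean_t) ** 2 for t in f_true)
    var_e = sum((e - mean_e) ** 2 for e in f_est)
    rho = cov / math.sqrt(var_t * var_e)
    return rho, sigma


# Made-up example: estimates with a constant offset of 0.05
f_true = [0.1, 0.3, 0.5, 0.7, 0.9]
f_est = [0.15, 0.35, 0.55, 0.75, 0.95]
rho, sigma = error_metrics(f_true, f_est)
```

A constant offset such as the one in this example leaves both metrics at their ideal values, so the mean error is worth inspecting alongside $$$\rho$$$ and $$$\sigma$$$ when a bias is suspected.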

Fig.4 Comparison of FA, MD and $$$f$$$ histograms. FA (a) and MD (b) were consistent for standard DTI of one and two shells, providing a common reference. ANN1s showed a larger correction of FA. Hoy’s method did not correct FA=0.15–0.45. ANN2s and Pasternak’s method showed stable correction for all FA values. However, ANN1s and Pasternak’s method suffered from over-regularization of MD (b), with peaks off the reference ($$$0.7·10^{-3}mm^2/s$$$). Volume fraction estimates (c) for ANN1s and Pasternak’s method were similar, but the latter struggled to estimate small $$$f$$$ values. ANN2s estimated less CSF volume (c) in white matter than the other methods (Fig.5o).

Fig.5 Comparison of FA, MD and $$$f_{CSF}$$$ maps. The ANN maps showed anatomical coherence with standard DTI, Pasternak’s and Hoy’s methods. We observed an enlargement of the corpus callosum and recovery of the fornix in FA for all the methods (a-d) compared to the standard (e-f). CSF contribution was removed from all the MD maps (g-j vs k-l). MD maps for two-shell methods (i-j) contained more information than those for one shell (g-h). Single-shell methods (m-n) showed a larger CSF volume estimate in white matter. ANN2s estimated a lower and more homogeneous CSF volume (o) than Hoy’s method (p) in white matter.

Proc. Intl. Soc. Mag. Reson. Med. 26 (2018)
