3406

Providing realistic ground truth and AI-ready data for fiber tractography: The 99 simulated brains dataset
Peter Neher1 and Klaus Maier-Hein1,2,3
1Medical Image Computing, German Cancer Research Center (DKFZ), Heidelberg, Germany, 2Medical Faculty, University of Heidelberg, Heidelberg, Germany, 3Pattern Analysis and Learning Group, Heidelberg University Hospital, Heidelberg, Germany

Synopsis

We present our new dataset of 99 simulated brains, suitable for fiber tractography training, validation and beyond. With the proposed approach it was possible to create a large dataset of 792 simulated MR images based on 99 subjects. A large variety of acquisition settings and artifacts could be realized. This dataset is the first large collective of diversely simulated brain-like MRI datasets. We believe that this unique dataset is an important contribution to the ongoing efforts of the tractography community to enable proper validation with a real ground truth and to further enable the training of new machine-learning based approaches.

Introduction

Fiber tractography has come a long way over the last twenty years. A wide variety of approaches has been published, including local and global, deterministic and probabilistic, model-based and model-free, as well as classic and machine-learning based methods. One challenge common to all of these approaches is that they are difficult to validate. This stems from the difficulty of creating even a roughly accurate reference for diffusion-weighted MRI (dMRI), let alone a real ground truth. Multiple datasets usable for validating tractography and dMRI methods in general have been presented in recent years, including physical phantoms 1–3, manually created references for in vivo datasets 4, as well as simulated datasets 5,6. Nevertheless, all of the currently available datasets have issues that limit their general usability and permit only preliminary evaluation studies. Typical limitations are unrealistically low geometric complexity, unrealistic diffusion parameters, the lack of a real ground truth, or a very limited sample size, often consisting of only a single image. These limitations have become increasingly problematic since the advent of machine-learning based tractography, which requires large realistic datasets with a real ground truth not only for validation but also for training. While simulations can never reproduce reality perfectly, they can in theory mitigate the above-mentioned limitations to a large extent: they come with a real ground truth, they enable complex phantom configurations with realistic diffusion properties, and they can be produced in large numbers with relatively low effort. Nevertheless, until now no large collective of brain-like simulated dMRI datasets has been published. To change this, we present our new dataset of 99 simulated brains, suitable for fiber tractography training, validation and beyond.

Methods

Using the Fiberfox MRI simulation tool 7,8, we created a large collective of brain MR images based on the HCP dataset 9.
Data preparation: Using TractSeg and MITK Diffusion, bundle-specific fiber tractography was performed in 99 subjects 7,10, yielding 71 reference tracts per subject that serve as input for the simulation. Further simulation input, in the form of volume fraction maps for grey matter (GM), white matter (WM) and cerebrospinal fluid (CSF), was generated using MRtrix 11,12.
Parameter optimization: The simulation parameters, including the tissue properties of the 4 compartments (intra-axonal, inter-axonal, GM, CSF) and the scanner parameters, were optimized using Hyppopy and Optunity to simulate realistic image contrasts for dMRI as well as T1- and T2-weighted images 13,14. The optimization was performed on the central image slice of a single subject, with the goal of minimizing the compartment-wise normalized L1 error between the real and simulated T1, T2, B0, FA and ADC volumes. A multi-step approach was used for the different contrasts, in which the relevant parameters of each optimization step are transferred to the next step:
  1. dMRI B0 contrast: Optimize the compartment-specific T2 relaxation constant, scanner settings (TR, TE) and a global signal scaling (see Figure 1).
  2. FA and ADC: Optimize the diffusion parameters of the intra-axonal (stick model), inter-axonal (tensor model) and GM (ball model) compartments as well as two compartment scaling parameters. CSF diffusivity (ball model) was fixed to the respective mean value of the reference.
  3. T1 image contrast: Optimize the compartment-specific T1 relaxation constant, scanner settings (TR, TE) and a global signal scaling.
  4. T2 image contrast: Optimize the scanner settings (TR, TE) and a global signal scaling.
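The objective minimized in each of the steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not spell out the normalization, so the sketch assumes that each compartment's summed absolute error is divided by the summed real signal in that compartment, and that voxels are assigned to compartments by thresholding the volume fraction maps. The function name, the `threshold` parameter and the dictionary layout are hypothetical.

```python
import numpy as np


def compartment_l1_error(real, simulated, fractions, threshold=0.5, eps=1e-12):
    """Sum of per-compartment L1 errors, each normalized by the real signal.

    real, simulated: arrays of identical shape (one image slice or volume).
    fractions: dict mapping a compartment name to its volume fraction map.
    threshold: hypothetical cutoff that assigns a voxel to a compartment.
    """
    real = np.asarray(real, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    total = 0.0
    for fraction in fractions.values():
        mask = np.asarray(fraction) >= threshold
        if not mask.any():
            continue  # compartment not present in this slice
        abs_diff = np.abs(real[mask] - simulated[mask]).sum()
        norm = np.abs(real[mask]).sum() + eps  # assumed normalization
        total += abs_diff / norm
    return total


# Toy 2x2 slice: top row is WM, bottom row is GM.
real = np.array([[100.0, 100.0], [50.0, 50.0]])
simulated = np.array([[110.0, 90.0], [50.0, 50.0]])
fractions = {
    "WM": np.array([[1.0, 1.0], [0.0, 0.0]]),
    "GM": np.array([[0.0, 0.0], [1.0, 1.0]]),
}
error = compartment_l1_error(real, simulated, fractions)
```

Normalizing per compartment keeps a bright compartment such as CSF in the B0 image from dominating the objective. A hyperparameter search would call such a function once per candidate parameter set, after simulating the slice with those parameters.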
Simulation: For each of the 99 reference datasets and using the optimized parameters, one T1, one T2 and six diffusion-weighted images were simulated. T1/T2 images were simulated with 1 mm isotropic voxels. Diffusion-weighted images were simulated with and without artifacts for each of the following configurations:
  • 2.5 mm isotropic voxels, 1× b = 0 s/mm², 32× b = 1000 s/mm²
  • 2.0 mm isotropic voxels, 9× b = 0 s/mm², 30× b = 1000 s/mm², 30× b = 2000 s/mm², 30× b = 3000 s/mm²
  • 1.25 mm isotropic voxels, 18× b = 0 s/mm², 90× b = 1000 s/mm², 90× b = 2000 s/mm², 90× b = 3000 s/mm²
All datasets were simulated with complex Gaussian noise. Images without additional artifacts were simulated with a white-matter SNR of about 40. For images with artifacts, the SNR was varied randomly (±10%).
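The noise model described above can be illustrated with a short sketch. It rests on several assumptions that are not stated in the abstract: noise is added in image space rather than in k-space, SNR is defined as the mean white-matter signal divided by the per-channel noise standard deviation, and the ±10% variation is drawn from a uniform distribution. The function names are hypothetical and do not correspond to the Fiberfox API.

```python
import numpy as np


def add_complex_gaussian_noise(image, wm_mask, snr=40.0, rng=None):
    """Return the magnitude of `image` after adding complex Gaussian noise.

    SNR is assumed to be mean white-matter signal / per-channel noise sigma.
    """
    rng = np.random.default_rng() if rng is None else rng
    image = np.asarray(image, dtype=float)
    wm_mask = np.asarray(wm_mask, dtype=bool)
    sigma = image[wm_mask].mean() / snr
    # Independent Gaussian noise on the real and imaginary channels.
    noise = rng.normal(0.0, sigma, image.shape) + 1j * rng.normal(0.0, sigma, image.shape)
    return np.abs(image + noise)


def jittered_snr(base_snr=40.0, spread=0.10, rng=None):
    """Draw an SNR within +-spread of base_snr (uniform draw is an assumption)."""
    rng = np.random.default_rng() if rng is None else rng
    return base_snr * rng.uniform(1.0 - spread, 1.0 + spread)


rng = np.random.default_rng(0)
clean = np.full((64, 64), 100.0)          # toy image: uniform white matter
wm_mask = np.ones_like(clean, dtype=bool)
noisy = add_complex_gaussian_noise(clean, wm_mask, snr=40.0, rng=rng)
snr_with_artifacts = jittered_snr(40.0, 0.10, rng=rng)
```

Taking the magnitude of the complex signal yields Rician-distributed intensities, which is why the noise is generated on both channels instead of being added directly to the magnitude image.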

Results and Discussion

With the proposed approach it was possible to create a large dataset of 792 simulated MR images based on 99 subjects (eight images per subject: one T1, one T2 and six diffusion-weighted images). A large variety of acquisition settings and artifacts such as head motion, spikes, eddy currents, ghosts, Gibbs ringing, inhomogeneity-induced distortions and signal drift could be realized (see Figure 2). All images and the corresponding 71 reference fiber bundles for each subject (~433 GB) are openly available online 15,16. The presented dataset still has some limitations: it is based only on the very homogeneous HCP dataset, and the microstructural simulation in the form of a multi-compartment model is relatively simple in comparison to the actual simulation of diffusing particles in a restricted environment 17. Nevertheless, it is the first large collective of diversely simulated brain-like MRI datasets. We believe that this unique dataset is an important contribution to the ongoing efforts of the dMRI processing community, and the fiber tractography community in particular, to enable proper validation of methods with a real ground truth and to further enable the training of new machine-learning based approaches on a large and variable dataset, which is possible only to a very limited extent with the other datasets already available.

Acknowledgements

Data were provided [in part] by the Human Connectome Project, WU-Minn Consortium (Principal Investigators: David Van Essen and Kamil Ugurbil; 1U54MH091657) funded by the 16 NIH Institutes and Centers that support the NIH Blueprint for Neuroscience Research; and by the McDonnell Center for Systems Neuroscience at Washington University.

This work was supported by the Collaborative Research Center (SFB/TRR 125 Cognition-Guided Surgery) of the German Research Foundation (DFG) grant number INST 35/1120-1, DFG grant MA 6340/10-1, DFG grant MA 6340/12-1 and by the Helmholtz Association Initiative and Networking Fund under project number ZT-I-0003.

References

  1. Fieremans E, De Deene Y, Baete S, Lemahieu I. Design of Anisotropic Diffusion Hardware Fiber Phantoms. Diffus Fundam. 2009;10:1–3.
  2. Fillard P, Descoteaux M, Goh A, Gouttard S, Jeurissen B, Malcolm J, et al. Quantitative evaluation of 10 tractography algorithms on a realistic diffusion MR phantom. Neuroimage. 2011;56:220–234.
  3. Bach M, Fritzsche KH, Stieltjes B, Laun FB. Investigation of resolution effects using a specialized diffusion tensor phantom. Magn Reson Med. 2013;
  4. Wasserthal J, Neher PF, Maier-Hein KH. High quality white matter reference tracts [Internet]. Zenodo; 2017 [cited 2018 Feb 27]. Available from: https://zenodo.org/record/1088278#.WpUqyuZG36U
  5. Neher PF, Maier-Hein KH. Simulated dMRI images and ground truth of random fiber phantoms in various configurations [Internet]. Zenodo; 2019 [cited 2020 Oct 23]. Available from: https://zenodo.org/record/2533250
  6. Maier-Hein K, Neher P, Houde J-C, Caruyer E, Daducci A, Dyrby T, et al. Tractography Challenge ISMRM 2015 Data [Internet]. Zenodo; 2015 [cited 2017 Oct 30]. Available from: https://zenodo.org/record/572345#.WfcPiHBrzRY
  7. Neher P. MIC-DKFZ/MITK-Diffusion [Internet]. MIC-DKFZ; 2020 [cited 2020 Apr 24]. Available from: https://github.com/MIC-DKFZ/MITK-Diffusion
  8. Neher PF, Laun FB, Stieltjes B, Maier-Hein KH. Fiberfox: facilitating the creation of realistic white matter software phantoms. Magn Reson Med. 2014 Nov;72(5):1460–70.
  9. Van Essen DC, Smith SM, Barch DM, Behrens TEJ, Yacoub E, Ugurbil K, et al. The WU-Minn Human Connectome Project: an overview. NeuroImage. 2013 Oct 15;80:62–79.
  10. Wasserthal J, Neher PF, Hirjak D, Maier-Hein KH. Combined tract segmentation and orientation mapping for bundle-specific tractography. Med Image Anal. 2019 Dec 1;58:101559.
  11. Tournier J-D, Calamante F, Connelly A. MRtrix: Diffusion tractography in crossing fiber regions. Int J Imaging Syst Technol. 2012 Mar;22(1):53–66.
  12. Jenkinson M, Beckmann CF, Behrens TEJ, Woolrich MW, Smith SM. FSL. Neuroimage. 2012;62:782–790.
  13. Claesen M, Simm J, Popovic D, Moreau Y, De Moor B. Easy Hyperparameter Search Using Optunity. ArXiv14121114 Cs [Internet]. 2014 Dec 2 [cited 2020 Apr 24]; Available from: http://arxiv.org/abs/1412.1114
  14. MIC-DKFZ/Hyppopy [Internet]. MIC-DKFZ; 2020 [cited 2020 Oct 23]. Available from: https://github.com/MIC-DKFZ/Hyppopy
  15. Neher P, Maier-Hein K. Simulated MRI images and reference fiber tracts of 99 subjects [Internet]. German Cancer Research Center; 2020 [cited 2020 Jun 10]. Report No.: DKFZ-2020-01042. Available from: https://inrepo01.inet.dkfz-heidelberg.de/record/156611?ln=en
  16. Neher P, Maier-Hein K. Sample data of the 99 simulated brains dataset [Internet]. Zenodo; 2020 [cited 2020 Oct 27]. Available from: https://zenodo.org/record/4139626
  17. Palombo M, Alexander DC, Zhang H. A generative model of realistic brain cells with application to numerical simulation of the diffusion-weighted MR signal. NeuroImage. 2019 Mar 1;188:391–402.

Figures

Figure 1: Comparison between the B0 image contrast of a simulated image (a) and the corresponding real reference image (b). This figure illustrates that the automatic parameter optimization made it possible to closely approximate the contrasts of a real image.

Figure 2: Illustration of simulated MR images with various animated artifacts (exaggerated for illustration purposes): eddy current distortions (a), intensity drift (b), head motion and spike (c), head motion, eddy currents and noise (d), Gibbs ringing (e), inhomogeneity distortions (f).

Proc. Intl. Soc. Mag. Reson. Med. 29 (2021) 3406