2850

A STATISTICAL FRAMEWORK FOR EVALUATING THE RELIABILITY OF MYELIN IMAGING
Agah Karakuzu1,2, Cyril Pernet3, Tanguy Duval1, Julien Cohen-Adad1,4, and Nikola Stikov1,2

1Polytechnique Montreal, Montreal, QC, Canada, 2Montréal Heart Institute, Montréal, QC, Canada, 3Brain Research Center, Division of Clinical Neurosciences, University of Edinburgh, Edinburgh, Scotland, 4Functional Neuroimaging Unit, CRIUGM, Université de Montréal, Montreal, QC, Canada

Synopsis

Given the importance of myelin in brain structure and function, the advancement of MR-based myelin imaging techniques has drawn a great deal of attention. In this abstract we propose a statistical framework for analyzing myelin imaging, taking us one step closer to standardizing and industrializing MR-based myelin biomarkers. In a nutshell, we compute Pearson correlation coefficients for scan-rescan reliability and take their differences to determine whether some myelin techniques are more reliable than others. We tested this framework in an ex vivo dog spinal cord and found the differences between myelin metrics to be subtle, indicating that one metric can often serve as a surrogate for another.

PURPOSE

Most myelin imaging is performed in-house, making it difficult to compare protocols objectively. Our goal is to develop a statistical framework for comparing myelin imaging techniques. To perform the comparison, we first compute the scan-rescan correlation (Pearson's correlation coefficient) of each individual myelin metric. Then, we take the difference between each pair of correlations (see Figure 1). If this difference is not significant (as assessed with a bootstrapping method), we conclude that the two metrics are statistically indistinguishable in terms of scan-rescan reliability. We apply this framework to the analysis of an ex vivo dog spinal cord, scanned with four MR-based myelin imaging techniques.
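
As a minimal sketch of the two steps described above, the following Python snippet computes a scan-rescan Pearson correlation for each metric and then the difference between every pair of correlations. The data are synthetic; the labels A to D (borrowed from Figure 1), the noise levels and the variable names are illustrative assumptions rather than values from the study.

```python
from itertools import combinations

import numpy as np

rng = np.random.default_rng(0)
n_voxels = 1000

# Hypothetical "true" myelin content per voxel (synthetic, not study data).
truth = rng.normal(size=n_voxels)

# Assumed measurement-noise level for each hypothetical metric A to D.
noise_sd = {"A": 0.2, "B": 0.4, "C": 0.6, "D": 0.8}

# Simulate a scan and a rescan for every metric.
scan = {m: truth + sd * rng.normal(size=n_voxels) for m, sd in noise_sd.items()}
rescan = {m: truth + sd * rng.normal(size=n_voxels) for m, sd in noise_sd.items()}

# Step 1: scan-rescan Pearson correlation of each metric.
r_within = {m: float(np.corrcoef(scan[m], rescan[m])[0, 1]) for m in noise_sd}

# Step 2: difference between each pair of scan-rescan correlations.
r_diff = {(a, b): r_within[a] - r_within[b] for a, b in combinations(noise_sd, 2)}

for pair, diff in r_diff.items():
    print(pair, round(diff, 3))
```

Four metrics yield six pairwise differences, which is the number of comparisons that the significance threshold in the Methods is corrected for.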

METHODS

Data Overview: In this study, we used our previously acquired dataset, which is publicly available at http://rebrand.ly/dogSCdata. This dataset contains quantitative maps of i) pool size ratio F acquired by two quantitative magnetization transfer methods (off-resonance SPGR [1] and selective recovery fast spin-echo [2]), ii) macromolecular tissue volume (MTV) [3] and iii) myelin water fraction (MWF) [4] in one ex vivo dog spinal cord. All images were acquired at 7T within a scan-rescan paradigm (Fig. 2). For further details regarding sample preparation, image acquisition and data fitting, please see [5].

Statistics: Statistical analyses were performed using the SPM reliability toolbox (https://github.com/CPernet/spmrt).

Figure 1 shows a schematic of the tests that were carried out in the whole spinal cord (WSC), gray matter (GM) and white matter (WM):

  • Within-method reliability test: For each method, Pearson correlation coefficients (r) were calculated between scan-rescan pairs across all voxels (diagonals in Fig. 3).
  • Between-method similarity test: The scan and rescan data were averaged, and then Pearson correlation coefficients were calculated for all pairwise combinations of methods (upper triangle in Fig. 3).
  • Between-method comparison of reliability: To determine whether one myelin metric is more reliable than another, we used percentile bootstrapping [6]. We report the differences between the r-values along the diagonals in WSC, GM and WM (lower triangle in Fig. 3). If p < .008 (.05 corrected for the six pairwise comparisons, i.e. .05/6 ≈ .008), we deem the difference significant.
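
The tests listed above can be sketched in Python for a single pair of methods. This is a plain percentile bootstrap over voxels written for illustration only; it is not the spmrt implementation, which may differ in detail (for example, in how the percentile bounds are adjusted). The function names, the synthetic data and the number of bootstrap samples are assumptions.

```python
import numpy as np

ALPHA = 0.05 / 6  # .05 corrected for six pairwise comparisons (about .008)


def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])


def between_method_r(a1, a2, b1, b2):
    """Correlation between two methods after averaging scan and rescan."""
    return pearson_r((a1 + a2) / 2, (b1 + b2) / 2)


def bootstrap_r_difference(a1, a2, b1, b2, n_boot=1000, seed=0):
    """Percentile bootstrap of r(a1, a2) - r(b1, b2).

    Voxels are resampled with replacement, using the same indices for both
    methods so that the dependence between them is preserved.
    Returns the observed difference and a two-sided p-value.
    """
    rng = np.random.default_rng(seed)
    n = a1.size
    observed = pearson_r(a1, a2) - pearson_r(b1, b2)
    diffs = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)
        diffs[i] = pearson_r(a1[idx], a2[idx]) - pearson_r(b1[idx], b2[idx])
    # Two-sided p-value: twice the smaller tail proportion around zero.
    p = 2 * min(np.mean(diffs <= 0), np.mean(diffs >= 0))
    return observed, float(min(p, 1.0))


# Synthetic example: method A is less noisy than method B (assumed values).
rng = np.random.default_rng(1)
truth = rng.normal(size=500)
a1, a2 = truth + 0.1 * rng.normal(size=500), truth + 0.1 * rng.normal(size=500)
b1, b2 = truth + 1.0 * rng.normal(size=500), truth + 1.0 * rng.normal(size=500)

observed, p_value = bootstrap_r_difference(a1, a2, b1, b2)
similarity = between_method_r(a1, a2, b1, b2)
significant = p_value < ALPHA
print(f"r difference = {observed:.3f}, p = {p_value:.4f}, significant: {significant}")
```

Resampling the same voxel indices for both methods keeps the two correlations paired, which matters here because all methods image the same tissue. Note that the smallest nonzero p-value the sketch can return is limited by the number of bootstrap samples.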

RESULTS

The matrix diagonals in Fig. 3 indicate that all four myelin markers show high within-method reliability in WSC (.84 < r < .97). Subregional inspection also reveals comparably high correlations in GM (.88 < r < .93) and moderate correlations in WM (.63 < r < .74). The upper triangle in Fig. 3 indicates that the inter-method voxelwise correlations are comparable to the intra-method scan-rescan correlations. The lower triangle in Fig. 3 reports the differences between entries along the diagonal, indicating that the scan-rescan Pearson's correlation coefficients are similar.

Figure 4 shows the statistical significance of the data reported in Fig. 3. Interestingly, all techniques are similarly reliable in white matter, except for the MTV/MWF combination (r = .63 vs. r = .74). The improvement observed in the correlation coefficients when the whole spinal cord is considered can be ascribed to the strong contrast between gray and white matter. This in turn leads to significant differences between all pairs of techniques except MWF and SPGR, which cannot be distinguished over the whole spinal cord.
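
The effect of pooling gray and white matter can be illustrated with a small simulation. The sketch below uses synthetic values only: the tissue means, spreads and noise level are assumptions chosen for illustration, not estimates from the dog spinal cord data, and the output is not meant to reproduce the correlations reported above. It shows that a scan-rescan correlation computed over two tissue classes with different mean values exceeds the correlation within either class, because the between-tissue contrast widens the range of the measured quantity relative to the noise.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000  # voxels per tissue class (assumed)

# Hypothetical true metric values: WM is higher than GM, with similar spread.
true_wm = rng.normal(loc=0.30, scale=0.03, size=n)
true_gm = rng.normal(loc=0.10, scale=0.03, size=n)


def scan_rescan_r(true_values, noise_sd=0.03):
    """Simulate two noisy acquisitions and return their Pearson correlation."""
    scan = true_values + rng.normal(scale=noise_sd, size=true_values.size)
    rescan = true_values + rng.normal(scale=noise_sd, size=true_values.size)
    return float(np.corrcoef(scan, rescan)[0, 1])


r_wm = scan_rescan_r(true_wm)
r_gm = scan_rescan_r(true_gm)
# Whole cord: pool both tissue classes before correlating.
r_wsc = scan_rescan_r(np.concatenate([true_wm, true_gm]))

print(f"WM r = {r_wm:.2f}, GM r = {r_gm:.2f}, pooled r = {r_wsc:.2f}")
```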

DISCUSSION

The statistical framework proposed in this abstract gives us a tool to determine whether one myelin metric is more reliable than another. A blanket conclusion is that five out of six combinations of myelin metrics are indistinguishable in WM, at least for this spinal cord where no WM pathology is present. While there are certain myelin metrics that are distinguishable, such as MTV and MWF (p < .008, Fig. 4), the differences are subtle, and indicate that one myelin metric can often serve as a surrogate for another. In WSC and GM it is easier to distinguish between techniques, yet the differences are small. Our approach was applied to only one spinal cord, but we expect this framework to be even more useful when applied to a larger cohort with varying levels of myelination (e.g., in lesions). We hope to show these results in a follow-up study.

Acknowledgements

This work was funded by the Canada Research Chair in Quantitative Magnetic Resonance Imaging (JCA), the Canadian Institutes of Health Research [CIHR FDN-143263], the Canada Foundation for Innovation [32454], the Fonds de Recherche du Québec – Santé [28826], the Fonds de Recherche du Québec – Nature et Technologies [2015-PR-182754], the Natural Sciences and Engineering Research Council of Canada [435897-2013, 06774-2016] and the Quebec BioImaging Network.

The authors would like to thank Mark Does and Kevin Harkins from Vanderbilt University, and Eva Alonso-Ortiz and Ives Levesque from McGill University, for their support.

References

  1. Sled et al., Magn Reson Med, 2001. 46:923-931
  2. Gochberg et al., Magn Reson Med, 2007. 57:437-441
  3. Mezer et al., Nature Medicine, 2013. 19:1667
  4. Mackay et al., Magn Reson Med, 1994. 31:673-677
  5. Vuong et al., Proceedings of the ISMRM Annual Meeting, Honolulu, 2017
  6. Wilcox et al., Commun Stat Simul Comput, 2009. 38:2220-223

Figures

Figure 1. Schematic representation of the relation between data groups and statistical tests. Correlations were calculated across all voxels for scan (A1-D1) and rescan (A2-D2). An interactive version of the complete set of results is available online at http://rebrand.ly/dogSCspmrt.

Figure 2. Scan-rescan results of four MRI-based myelin metrics: F using SPGR, F using FSIR, MTV and MWF. All the metrics exhibit a similar trend in myelin content distribution, with higher values observed in the dorsal column and lower values in the gray matter.

Figure 3. Diagonal entries show within-method reliability by reporting voxelwise scan-rescan Pearson correlation coefficients. Upper triangle shows between-method similarity by reporting pairwise Pearson correlation coefficients. Lower triangle shows the differences between the Pearson correlation coefficients reported along the diagonals.

Figure 4. Binary tables showing the statistical significance of the entries in Fig. 3. Along the diagonal and in the upper triangle, green means that the Pearson correlations were significant with p < .0001. In the lower triangle, green means that the two metrics are different with p < .008, and red means that they are statistically indistinguishable.

Proc. Intl. Soc. Mag. Reson. Med. 26 (2018)