The main aim of this study is to investigate whether the choice of processing method has a relevant influence on the results of a neuroimaging study. We evaluated the intra-method repeatability and the inter-method reproducibility of two widely used automatic segmentation methods for brain MRI: the FreeSurfer (FS) and Statistical Parametric Mapping (SPM) software packages. We segmented the gray matter (GM), the white matter (WM) and the subcortical structures in test-retest MRI data of healthy volunteers from two publicly available datasets. High intra-method repeatability was found for both SPM and FS, but SPM was more consistent than FS in measuring ROI volumes.
Methods
We examined two publicly available data samples: the Kirby-21 (Kirby) dataset [3] and the OASIS dataset [4]. The first consists of 3D T1-weighted images of 21 healthy volunteers (11 males and 10 females; age: 32 ± 9 years) acquired using a 3 T MRI scanner. The OASIS subset considered here consists of 3D T1-weighted images of 20 healthy volunteers (10 males and 10 females; age: 23.4 ± 3.9 years) acquired using a 1.5 T MRI scanner. To enable reproducibility studies, the MRI images were acquired twice with the same acquisition parameters, after a short time for the Kirby dataset and with a delay of 1-89 days for the OASIS dataset. First, we quantified the intra-method repeatability of SPM and FS in estimating brain tissue volumes through a test-retest analysis. Second, we evaluated the inter-method reproducibility of the estimated volumes by comparing the two processing methods and quantifying the overlap of the segmented regions. Finally, we compared the brain volume measures obtained with each software package for the male and female subsamples, in order to reveal gender-related volume differences. These analyses were conducted in parallel on the two data samples, using Pearson’s correlation, Bland-Altman plots, Cohen’s d effect size and the Dice similarity index.
Results
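As an illustration of the four agreement measures named in the Methods, the following is a minimal sketch in plain Python. It is not the authors' analysis code. The function names, the choice of normalising Bland-Altman differences by the pair mean, the pooled-standard-deviation form of Cohen's d, and the representation of masks as sets of voxel indices are all assumptions made for the example.

```python
import math
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation between paired volumes (e.g. test vs. retest)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def bland_altman_percent(x, y):
    """Bias and 95% limits of agreement of the percent difference.

    Assumption: each difference is expressed relative to the pair mean.
    The study may have used a different normalisation.
    """
    diffs = [100.0 * (a - b) / ((a + b) / 2.0) for a, b in zip(x, y)]
    bias = mean(diffs)
    spread = stdev(diffs)
    return bias, bias - 1.96 * spread, bias + 1.96 * spread


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled std. dev."""
    na, nb = len(group_a), len(group_b)
    pooled_var = (
        (na - 1) * stdev(group_a) ** 2 + (nb - 1) * stdev(group_b) ** 2
    ) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


def dice(mask_a, mask_b):
    """Dice index between two binary masks given as sets of voxel indices."""
    if not mask_a and not mask_b:
        return 1.0  # two empty masks overlap trivially
    return 2.0 * len(mask_a & mask_b) / (len(mask_a) + len(mask_b))
```

In a setting like the one described here, `pearson_r` and `bland_altman_percent` would be applied to paired test and retest volumes (or to SPM and FS volumes of the same scan), `cohens_d` to the male and female subsamples, and `dice` to the SPM and FS masks of the same ROI once they are resampled to a common voxel grid.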
We found high Pearson’s correlation (0.98-0.99) between the volumes obtained on test and retest MRI data, for both SPM and FS. The Bland-Altman plots revealed systematic biases between the test and retest measures for the GM and WM volumes obtained by FS on the OASIS dataset: the test GM volumes were systematically greater than the retest ones, with a difference of (4.6 ± 1.3)%, and the test WM volumes systematically smaller, with a difference of (-2.8 ± 1.2)%. The comparison between the volumes provided by SPM and those provided by FS showed that FS overestimated the volumes with respect to SPM, except for the GM volume. These findings were consistent across the Kirby and OASIS data samples, as shown in Table 1. Figure 1 shows the overlays of the ROI masks obtained by SPM and FS for the worst cases of the Kirby and OASIS data samples. Dice values were in the 0.76-0.83 range. In the male vs. female brain volume comparisons, inconsistencies arose for the OASIS dataset, where the gender-related differences appear subtler than in the Kirby dataset. In particular, gender-related differences on the Kirby dataset were consistently detected in the volumes segmented by both SPM and FS (Cohen’s effect size > 1.1), whereas on the OASIS dataset significant volume differences were not consistently detected by the two methods, and the Cohen’s effect sizes are generally lower (see Table 2).
Discussion
Conclusion
We support SPM as the more consistent tool for evaluating ROI volumes. In any case, the two methods rely on different algorithmic pipelines, which can be affected differently by the presence of abnormalities, image artifacts, or variations in the acquisition protocol parameters. We therefore suggest cross-validating the findings of each research study against different segmentation methods before proceeding to their interpretation.
The OASIS project was funded by grants P50 AG05681, P01 AG03991, R01 AG021910, P50 MH071616, U24 RR021382, R01 MH56584. This work has been partially funded by the Tuscany Government (Bando FAS Salute by Sviluppo Toscana, ARIANNA Project), and by the National Institute of Nuclear Physics (nextMR project).
Conflict of Interest Statement: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
1. SPM. Available at https://www.fil.ion.ucl.ac.uk/spm/, 2018.
2. FreeSurfer [online]. Available at https://surfer.nmr.mgh.harvard.edu/. Accessed 15 June 2011.
3. Landman BA, Huang AJ, Gifford A, Vikram DS, Lim I, Farelli J, Smith S: Multi-parametric neuroimaging reproducibility: a 3-T resource study. Neuroimage 54: 2854–2866, 2011.
4. Marcus DS, Wang TH, et al.: Open Access Series of Imaging Studies (OASIS): Cross-Sectional MRI Data in Young, Middle Aged, Nondemented, and Demented Older Adults. Journal of Cognitive Neuroscience 19(9): 1498–1507, 2007.