MRI-derived brain structural measurements from multicenter datasets can be strongly affected by factors such as the acquisition protocol, the static magnetic field strength and the scanner manufacturer. A preliminary study was performed to assess the homogeneity of population metrics from 3DT1 scans acquired with established routine protocols in a dataset of 174 healthy subjects from 18 Italian Research Hospital Centers (IRCCS). The impact of each center's acquisition parameters on outcomes was assessed with quality control measurements and FreeSurfer volumetric metrics of cortical and subcortical structures. Future multicenter studies will benefit from harmonizing the acquisition protocols.
INTRODUCTION
In recent years, novel MRI biomarkers for the diagnosis and prognosis of neurodegenerative and neurodevelopmental diseases have emerged1–6. These biomarkers include neuroimaging-derived measures such as subcortical structure volumes and cortical areas and thicknesses from high-resolution 3DT1 scans. However, given the intrinsically non-quantitative nature of structural MRI, brain structural metrics computed from large multisite datasets can be strongly affected by factors such as the acquisition protocol, the static magnetic field strength and the scanner manufacturer, which increase the variability of the measured metrics7–9. In this study, we explored the variability of routine brain scan acquisitions on healthy volunteers among Italian neuroimaging centers (IRCCS) and assessed the impact of the different acquisition parameters on both image quality and the values of the derived metrics.
METHODS
We pooled 174 3DT1-weighted MRI scans acquired on healthy volunteers (age 29 ± 5 years) recruited in 18 IRCCS. The acquisition parameters were extracted directly from the DICOM headers. For each scan, the following quality control measurements were computed according to the Preprocessed Connectomes Project (PCP) protocol (http://preprocessed-connectomes-project.org):
-Signal-to-Noise Ratio (SNR): The mean intensity within gray matter divided by the standard deviation of the values outside the brain.
-Contrast-to-Noise Ratio (CNR): The difference between the mean gray matter intensity and the mean white matter intensity, divided by the standard deviation of the values outside the brain.
-Entropy Focus Criterion (EFC): The Shannon entropy of voxel intensities normalized by the maximum possible entropy, indicative of ghosting and head motion-induced blurring.
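The three measurements defined above can be sketched in a few lines of NumPy. This is a minimal illustration, not the PCP implementation: the function names and the boolean masks (`gm_mask`, `wm_mask`, `bg_mask`) are hypothetical, the masks are assumed to come from a prior tissue segmentation, and the EFC normalization (intensities scaled by the root-sum-of-squares of the image) is one common formulation assumed here because the abstract does not spell it out.

```python
import numpy as np


def snr(image, gm_mask, bg_mask):
    """Mean gray matter intensity divided by the SD of the values outside the brain."""
    return image[gm_mask].mean() / image[bg_mask].std()


def cnr(image, gm_mask, wm_mask, bg_mask):
    """(Mean GM - mean WM) divided by the SD of the values outside the brain.

    The sign follows the definition in the text, so it is negative when
    white matter is brighter than gray matter.
    """
    return (image[gm_mask].mean() - image[wm_mask].mean()) / image[bg_mask].std()


def efc(image):
    """Shannon entropy of voxel intensities relative to the maximum possible entropy.

    Assumed formulation: each intensity is scaled by the root-sum-of-squares of
    the image, and the maximum entropy is that of a perfectly uniform image.
    """
    x = np.abs(np.asarray(image, dtype=float)).ravel()
    n = x.size
    p = x / np.sqrt(np.sum(x ** 2))
    nonzero = p > 0                                   # 0 * log(0) is taken as 0
    entropy = -np.sum(p[nonzero] * np.log(p[nonzero]))
    max_entropy = np.sqrt(n) * np.log(np.sqrt(n))     # entropy of a uniform image
    return entropy / max_entropy
```

Under this formulation a perfectly uniform image gives an EFC of 1 and an image with a single non-zero voxel gives 0, so lower values correspond to sharper, less ghosted images.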
The scans were also processed through the FreeSurfer recon-all workflow10 (version 6.0), and the volumes of the subcortical structures and the areas and thicknesses of the cortical structures were extracted. On these metrics, a linear SVM classifier was trained in a 5-fold cross-validation scheme to assess how independent the derived measures were of the acquisition site, the static magnetic field strength and the scanner vendor. To verify that an analogous classifier could perform a well-known task, an SVM classifier was also trained on the FreeSurfer metrics to distinguish male from female subjects.
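The classification step can be illustrated with scikit-learn. The sketch below is not the authors' pipeline: the feature standardization, the value of `C`, the stratified and shuffled folds, and the helper name `cv_accuracy` are all assumptions, and the data are synthetic stand-ins for the FreeSurfer feature table.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def cv_accuracy(features, labels, n_splits=5, seed=0):
    """Mean accuracy of a linear SVM over stratified k-fold cross-validation."""
    # The scaler is refitted on each training fold, so no test-fold
    # information leaks into the model.
    model = make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0))
    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = cross_val_score(model, features, labels, cv=folds, scoring="accuracy")
    return float(scores.mean())


# Synthetic stand-in for the FreeSurfer metrics (NOT the study data).
rng = np.random.default_rng(0)
n_subjects, n_features = 174, 20
field_strength = np.arange(n_subjects) % 2           # hypothetical: 0 = 1.5T, 1 = 3T
features = rng.normal(size=(n_subjects, n_features))
features[:, :5] += 3.0 * field_strength[:, None]     # inject a field-dependent offset

# Labels shuffled independently of the features give an empirical chance level.
shuffled = rng.permutation(field_strength)

acc_real = cv_accuracy(features, field_strength)
acc_null = cv_accuracy(features, shuffled)
print(f"field strength: {acc_real:.3f}, shuffled labels: {acc_null:.3f}")
```

The same helper applies to site, vendor, or sex labels by swapping the label vector. With 18 sites and 174 subjects, some sites may have fewer than five scans, in which case stratified 5-fold splitting cannot place each site in every fold; how the study handled this is not stated.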
RESULTS
The acquisition parameters across the centers were as follows:
-Static Magnetic Field: 1.5T/3T
-Scanning sequence: Fast Gradient
Vendor 1: TR within the range [7, 11.5] ms, TE within the range [2.5, 5.5] ms
Vendor 2: TR within the range [1900, 2400] ms, TE within the range [2.75, 3.4] ms
Vendor 3: TR within the range [6.9, 12.6] ms, TE within the range [2.9, 12] ms
-Flip angle (FA): within the range [8, 15]°
-Receiving coil: [8, 64] channels
An example of the variability of the quality control measurements (SNR) is reported in Figure 1.
An SVM classifier assigned the FreeSurfer-derived metrics of a scan to its acquisition site with an accuracy of 55.1% (chance level = 5.6% for 18-class classification). A similar classifier identified the static magnetic field strength from the same features with an accuracy of 85.6% (chance level = 50%). The 3 vendors were correctly classified with an accuracy of 75.4% (chance level = 33%). A very similar performance was obtained in the well-known task of male vs. female discrimination from the FreeSurfer metrics (accuracy = 76.6%). As a benchmark for the overall variability of the data, we considered the hippocampal volume, which was 4240 ± 340 mm³ (8%) in our cohort of young volunteers.
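The chance levels and the relative variability quoted above follow from simple arithmetic, sketched below. The helper names are hypothetical, and the 1/k chance level assumes equally populated classes, which the abstract does not confirm for sites, field strengths, or vendors.

```python
def chance_level(n_classes):
    """Expected accuracy of random guessing when all classes are equally frequent."""
    return 1.0 / n_classes


def coefficient_of_variation(mean, sd):
    """Standard deviation expressed as a fraction of the mean."""
    return sd / mean


# Number of classes in each classification task reported in the text.
tasks = {"site": 18, "field strength": 2, "vendor": 3}
for name, n_classes in tasks.items():
    print(f"{name}: chance = {100 * chance_level(n_classes):.1f}%")

# Hippocampal volume reported in the text: 4240 ± 340 mm³.
hippocampus_cv = coefficient_of_variation(mean=4240.0, sd=340.0)
print(f"hippocampal volume CV = {100 * hippocampus_cv:.0f}%")
```

If the classes are unbalanced, the accuracy of always predicting the most frequent class is a stricter baseline than 1/k.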
1. Jack, C. R. Alliance for Aging Research AD Biomarkers Work Group: structural MRI. Neurobiol. Aging 32, S48–S57 (2011).
2. Frisoni, G. B., Fox, N. C., Jack, C. R., Scheltens, P. & Thompson, P. M. The clinical use of structural MRI in Alzheimer disease. Nat. Rev. Neurol. 6, 67–77 (2010).
3. Dickerson, B. C. et al. Alzheimer-signature MRI biomarker predicts AD dementia in cognitively normal adults. Neurology 76, 1395–1402 (2011).
4. Ecker, C., Bookheimer, S. Y. & Murphy, D. G. M. Neuroimaging in autism spectrum disorder: brain structure and function across the lifespan. Lancet Neurol. 14, 1121–1134 (2015).
5. Thayyil, S. et al. Cerebral Magnetic Resonance Biomarkers in Neonatal Encephalopathy: A Meta-analysis. Pediatrics 125, e382–e395 (2010).
6. Woodward, L. J., Anderson, P. J., Austin, N. C., Howard, K. & Inder, T. E. Neonatal MRI to Predict Neurodevelopmental Outcomes in Preterm Infants. N. Engl. J. Med. 355, 685–694 (2006).
7. Martinez-Murcia, F. J. et al. On the brain structure heterogeneity of autism: Parsing out acquisition site effects with significance-weighted principal component analysis. Hum. Brain Mapp. 38, 1208–1223 (2017).
8. Jovicich, J. et al. Reliability in multi-site structural MRI studies: Effects of gradient non-linearity correction on phantom and human data. Neuroimage 30, 436–443 (2006).
9. Suckling, J. et al. The neuro/PsyGRID calibration experiment. Hum. Brain Mapp. 33, 373–386 (2012).
10. Fischl, B. et al. Whole Brain Segmentation: Automated Labeling of Neuroanatomical Structures in the Human Brain. Neuron 33, 341–355 (2002).