2245

Accounting for Measurement Noise in a Multi-Site Newborn Diffusion Weighted Imaging Study
Jerod M Rasmussen1, Alice M Graham2, Pathik D Wadhwa1, Sonja Entringer3, Martin Styner4, Beatriz Luna5, Thomas G O'Connor6, Damien A Fair7, and Claudia Buss3
1UC Irvine, Irvine, CA, United States, 2Oregon Health Sciences University, Portland, OR, United States, 3Charité – Universitätsmedizin Berlin, Berlin, Germany, 4University of North Carolina, Chapel Hill, NC, United States, 5University of Pittsburgh, Pittsburgh, PA, United States, 6University of Rochester, Rochester, NY, United States, 7University of Minnesota, Minneapolis, MN, United States

Synopsis

We leverage an exemplar relationship (FAWM vs. postmenstrual age [PMA]) to quantify the benefits conferred by adjustment for QC measures in a multi-site infant DWI study. Data were preprocessed using a common pipeline (dHCP), and the FAWM-PMA association was tested without and with controlling for SNR, CNR, and mean FD. All three QC measures were broadly associated with FAWM within sites. Within and across sites, inclusion of QC measures greatly increased the FAWM variance explained by PMA. Accounting for QC measures results in an improved capability for identifying reproducible small effect-size relationships in large multi-site infant imaging studies.

Introduction

Over the course of the last decade, there has been a dramatic improvement in the ability to non-invasively image the developing human brain using MRI-based techniques. However, the identification of highly reproducible associations between inter-individual variation in brain structure/function and their environmental and behavioral correlates remains challenging due to statistical power constraints [1]. Increasing sample size is one approach, but is not always possible (e.g., in extant studies or where there are limited clinical cases and resources) and, in any event, does not formally address the central problem of measurement error. Here, we leverage a well-established, large effect-size exemplar relationship (newborn white matter fractional anisotropy [FAWM] vs. postmenstrual age at scan [PMA]) [2,3] to quantify the impact of statistically controlling for commonly available QC measures in the context of a multi-site infant Diffusion Weighted Imaging dataset.

Methods

The current dataset includes a total of N=469 infants heterogeneously scanned across four sites (Table 1, NA=94; NB=51; NC=66; ND=258). The dHCP (site D) data is a subsample matched for minimum gestational age at delivery (GA>35 weeks) to sites A-C. Collectively, this sample includes healthy, near full-term deliveries without any major obstetric risks scanned in early postnatal life (postmenstrual age at scan [PMA]=42.8±2.4 wks; gestational age at birth=39.6±1.6 wks; postnatal age at scan=3.1±2.6 wks; Male/Female=254/214).

Data from Sites A-C were preprocessed in a manner consistent with the dHCP minimal diffusion pipeline [4] using FSL. Site D data were obtained fully preprocessed, and QC measures were parsed from the existing QC reports (PyPDF2). Preprocessing included distortion correction (excepting Site A), motion/eddy current correction, outlier slice detection, and slice-to-volume correction. FA scalar maps were fit and put into a common template space via a two-stage (individual FA to age/site-specific template to HCP FA template) concatenated transformation. White matter was masked (whole-sample FA threshold >0.2) to extract whole white matter FAWM for further analysis. Three commonly used QC measures (signal-to-noise ratio [SNR], contrast-to-noise ratio [CNR], and mean framewise displacement [FD]) were extracted via FSL QUAD. CNR was defined using the shell most proximal to b=1000 s/mm². Linear regression was used to characterize the associations between FAWM, PMA, and QC (SNR, CNR, mean FD) measures, controlling for infant sex. Modeling was performed stepwise by constructing a model without and then with QC measures. First, this was done within sites to establish a consistent improvement in model performance when including QC adjustment despite the heterogeneity in samples and collection. Second, the same stepwise framework was applied to multi-site aggregated data by adding a main site term and site × QC interaction terms, owing to collection heterogeneity.
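The within-site comparison described above can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' analysis code: the variable names, effect sizes, and sample size are assumptions chosen for demonstration, and partial R² is computed in the standard way as the fraction of residual variance left by the covariates that PMA then explains.

```python
import numpy as np

def sse(X, y):
    """Residual sum of squares from an ordinary least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def partial_r2(y, target, covariates):
    """Partial R^2 of `target` after adjusting for `covariates` (list of 1-D arrays)."""
    n = len(y)
    reduced = np.column_stack([np.ones(n)] + covariates)  # intercept + covariates
    full = np.column_stack([reduced, target])             # ... + predictor of interest
    sse_reduced = sse(reduced, y)
    sse_full = sse(full, y)
    return (sse_reduced - sse_full) / sse_reduced

# Synthetic single-site data (all values are hypothetical, for illustration only).
rng = np.random.default_rng(0)
n = 200
pma = rng.normal(42.8, 2.4, n)                # postmenstrual age at scan (weeks)
sex = rng.integers(0, 2, n).astype(float)     # infant sex, coded 0/1
snr = rng.normal(0.0, 1.0, n)                 # standardized QC measures
cnr = rng.normal(0.0, 1.0, n)
fd = rng.normal(0.0, 1.0, n)
fa_wm = (0.004 * pma + 0.01 * snr + 0.01 * cnr - 0.01 * fd
         + rng.normal(0.0, 0.005, n))         # simulated whole white matter FA

# Step 1: model without QC measures (sex only as covariate).
r2_without_qc = partial_r2(fa_wm, pma, [sex])
# Step 2: model with QC measures added as covariates.
r2_with_qc = partial_r2(fa_wm, pma, [sex, snr, cnr, fd])

print(f"partial R^2 (PMA), without QC: {r2_without_qc:.2f}")
print(f"partial R^2 (PMA), with QC:    {r2_with_qc:.2f}")
```

In this simulation, the QC covariates absorb variance in FA that is unrelated to age, so the share of the remaining variance attributable to PMA rises when they are included.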

Results

Single-site model performance is summarized in Table 2. Across all sites, inclusion of QC measures in the model resulted in increased FAWM variance explained by PMA. All three QC measures were significantly (p<0.05) associated with FAWM across all sites, with the exception of SNR (sites B and C) and mean FD (site B). QC measures accounted for a large proportion of FAWM variance after adjustment for age (partial R² for QC: site A=42%, site B=12%, site C=56%, site D=63%).

Scatter plots depicting the multi-site association between observed FAWM and modeled FAWM are provided in Figure 1. Residual error (Mean Squared Error [MSE]), a combination of remaining biological variation and measurement noise, decreased with increasing model complexity (p_F<10⁻¹⁰). The partial correlation between PMA and FAWM increased dramatically (from partial R²=26% to 42%; p-value from p<10⁻³² to p<10⁻⁵⁴) when adjusting for simple QC measures.
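A nested-model comparison of this kind can be sketched as below. This is a hedged illustration on synthetic data rather than the authors' implementation: the exact model terms are assumed to be PMA, sex, and a site main effect in the reduced model, with QC main effects and site × QC interactions added in the full model. All names and simulated effect sizes are hypothetical. Only the F statistic and the two MSE values are computed; a p-value would follow from the F distribution with the stated degrees of freedom.

```python
import numpy as np

def fit_sse(X, y):
    """Return residual sum of squares and design-matrix rank for an OLS fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid), int(np.linalg.matrix_rank(X))

def nested_f(X_reduced, X_full, y):
    """F statistic for the extra terms in X_full, plus the MSE of each model."""
    n = len(y)
    sse_r, p_r = fit_sse(X_reduced, y)
    sse_f, p_f = fit_sse(X_full, y)
    df_num = p_f - p_r            # number of added parameters
    df_den = n - p_f              # residual degrees of freedom of the full model
    f_stat = ((sse_r - sse_f) / df_num) / (sse_f / df_den)
    return f_stat, sse_r / (n - p_r), sse_f / df_den

# Synthetic multi-site data (hypothetical values, for illustration only).
rng = np.random.default_rng(1)
n_sites, n_per_site = 3, 100
site = np.repeat(np.arange(n_sites), n_per_site)
n = site.size
pma = rng.normal(42.8, 2.4, n)
sex = rng.integers(0, 2, n).astype(float)
qc = rng.normal(0.0, 1.0, (n, 3))                      # columns: SNR, CNR, mean FD
site_slopes = rng.normal(0.01, 0.003, (n_sites, 3))    # site-specific QC effects
fa_wm = (0.004 * pma + 0.01 * site
         + (qc * site_slopes[site]).sum(axis=1)
         + rng.normal(0.0, 0.005, n))

# Dummy-code site with site 0 as the reference level.
site_dummies = (site[:, None] == np.arange(1, n_sites)).astype(float)

# Reduced model: intercept, PMA, sex, site main effect.
reduced = np.column_stack([np.ones(n), pma, sex, site_dummies])
# Full model: adds QC main effects and site x QC interaction terms.
interactions = np.column_stack(
    [qc * site_dummies[:, [k]] for k in range(n_sites - 1)]
)
full = np.column_stack([reduced, qc, interactions])

f_stat, mse_reduced, mse_full = nested_f(reduced, full, fa_wm)
print(f"MSE without QC: {mse_reduced:.2e}, with QC: {mse_full:.2e}, F = {f_stat:.1f}")
```

The interaction columns let each site have its own QC slopes, which matches the stated rationale that acquisition differences may change how QC measures relate to FA at each site.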

Discussion/Conclusion

The current observations suggest increased statistical power when aggregating data across multiple sites, despite expected increases in measurement error introduced by heterogeneous multi-site newborn data collection. Further, using a well-known large effect-size exemplar relationship (FAWM vs. PMA), we demonstrate and quantify substantial improvements in statistical power within and across sites when also accounting for measurement error (QC measures). Based on the premise that residual “error” in this case constitutes a combination of measurement noise and remaining true biological variation in FAWM after adjusting for age (i.e., the study outcome of interest), we conclude that the reductions in MSE conferred by the inclusion of QC measures reflect an improved capability for identifying reproducible small effect-size relationships in large multi-site infant imaging studies.

References

[1] Marek S, Tervo-Clemmens B, Calabro FJ, Montez DF, Kay BP, Hatoum AS, Donohue MR, Foran W, Miller RL, Feczko E, Miranda-Dominguez O. Towards reproducible brain-wide association studies. BioRxiv. 2020 Jan 1.

[2] Rasmussen JM, Kruggel F, Gilmore JH, Styner M, Entringer S, Consing KN, Potkin SG, Wadhwa PD, Buss C. A novel maturation index based on neonatal diffusion tensor imaging reflects typical perinatal white matter development in humans. International Journal of Developmental Neuroscience. 2017 Feb 1;56:42-51.

[3] McGraw P, Liang L, Provenzale JM. Evaluation of normal age-related changes in anisotropy during infancy and childhood as shown by diffusion tensor imaging. American Journal of Roentgenology. 2002 Dec;179(6):1515-22.

[4] Bastiani M, Andersson JL, Cordero-Grande L, Murgasova M, Hutter J, Price AN, Makropoulos A, Fitzgibbon SP, Hughes E, Rueckert D, Victor S. Automated processing pipeline for neonatal diffusion MRI in the developing Human Connectome Project. NeuroImage. 2019 Jan 15;185:750-63.

Figures

Table 1. Site demographics and basic imaging parameters. Sites were heterogeneous in age (C>A|B>D) and acquisition strategies. PMA=Postmenstrual age, PE=Phase-Encode.

Table 2. Single Site Association Between Whole White Matter FA and Postmenstrual Age at Scan Modeled Without and With Controlling for Quality Control Measures. All sites demonstrated marked improvement in model performance when controlling for simple QC measures (mean FD, SNR, CNR). Improved performance is indicated by increased partial R² values, t-scores, and increased consistency in age-FA slope estimates across sites. All age-FA associations were significant at a p<10⁻³ threshold for all models and sites.

Figure 1. Multi-Site White Matter FA Model Without and With Consideration of Quality Control Measures. QC measures accounted for 47% of the remaining variance in FAWM after adjusting for PMA. Note the decrease in MSE and increase in the remaining variance in FAWM explained by PMA after adjusting for QC measures.

Proc. Intl. Soc. Mag. Reson. Med. 29 (2021)