4834

Harmonization of Cortical Thickness Measurements Throughout the Human Lifespan
Sahar Ahmad1, Fang Nan2, Ye Wu1, Zhengwang Wu1, Weili Lin1, Li Wang1, Gang Li1, Di Wu3,4, and Pew-Thian Yap1
1Department of Radiology and Biomedical Research Imaging Center (BRIC), The University of North Carolina at Chapel Hill, Chapel Hill, NC, United States, 2Department of Biostatistics, University of Washington, Seattle, WA, United States, 3Department of Biostatistics, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, NC, United States, 4Division of Oral and Craniofacial Health Research, Adams School of Dentistry, The University of North Carolina at Chapel Hill, Chapel Hill, NC, United States

Synopsis

Pooling and integrating diverse imaging data across multiple sites is key to big data analytics in neuroimaging. Data amassed from multiple studies are inevitably heterogeneous due to differences in scanners, acquisition protocols, and post-acquisition image processing pipelines, substantially complicating downstream analyses. Here, we present a harmonization technique for multi-site large-scale longitudinal and cross-sectional data. We demonstrate the utility of our method in removing non-biological variability in cortical thickness measurements of individuals from birth to 100 years of age.

Introduction

Neuroimaging data collected at a single site exhibit less scanner-induced variability, which simplifies downstream analyses. However, the trend has recently shifted toward collaborative data collection at multiple sites for open sharing and joint analysis, allowing greater statistical power in identifying changes associated with brain development and aging1. These collaborative efforts also promote transparency, which is vital for reproducing findings and driving research.
Pooling data from different sites is challenging. The use of non-standardized image acquisition protocols, scanner hardware and software, and post-acquisition image processing pipelines causes non-biological variability and introduces inconsistency in downstream analyses. While various approaches have been proposed to remove non-biological variability in multi-site data1-3, none of them has been applied to and tested on data spanning from birth to 100 years of age. Here, we present a harmonization method to remove site-related variability in cortical thickness measurements pooled from five different longitudinal and cross-sectional studies covering individuals from birth to 100 years of age. Our method allows us to chart for the first time the spatiotemporal changes in cortical thickness across the entire human lifespan.

Materials and Methods

We collected structural MRI data (T1-weighted, T2-weighted, and white and pial cortical surfaces) from five different datasets: Developing HCP4 (dHCP), Baby Connectome Project5 (BCP), HCP-Development (HCP-D), HCP-Young Adult (HCP-YA), and HCP-Aging (HCP-A)6-7. These datasets include typically developing and aging individuals scanned from birth to 100 years of age. Cortical thickness measurements were obtained for dHCP, HCP-D, HCP-YA, and HCP-A using the minimally preprocessed data and for BCP using an infant-dedicated processing pipeline8-10.
We harmonized the cortical thickness measurements across the lifespan using a new method called piecewise adjustment of location and scale (PALS), which is realized partly via ComBat11. PALS first adjusts for site-related location differences in the cortical thickness measurements and then adjusts for scale differences, estimated from the residuals of the location-adjusted measurements with respect to a fitted curve (Fig. 1a). PALS takes a piecewise approach: it considers the data points of two sites at a time, in the vicinity of their intersection, to accommodate dynamic age-specific measurement changes (Fig. 1b).
Mathematically, for site $$$i$$$, subject $$$j$$$, and surface vertex $$$k$$$, the cortical thickness $$$Y_{ijk}(t)$$$ at scan time $$$t$$$ is modeled using a generalized additive mixed model (GAMM) $$$f_{k}(X_{ij})$$$ as a function of age $$$X_{ij}$$$, i.e.,
$$Y_{ijk}(t)=f_{k}(X_{ij})=f(X_{ij})+\gamma_{jk},$$
where $$$f(X_{ij})$$$ is the smooth nonlinear term for age, and $$$\gamma_{jk}$$$ is the subject-specific random intercept. The location-corrected cortical thickness $$$\hat{Y}_{ijk}(t)$$$ is obtained by
$$\hat{Y}_{ijk}(t)=\left[Z_{ijk}(t)-g^{\ast}_{ik}\right]s^{2}_{k}+f_{k}(X_{ij}),$$
where $$$Z_{ijk}(t)$$$ is the standardized cortical thickness computed as
$$Z_{ijk}(t)=\frac{Y_{ijk}(t)-f_{k}(X_{ij})}{s^{2}_{k}},$$
$$$s^{2}_{k}$$$ is the vertex-specific variance and $$$g^{\ast}_{ik}$$$ are location differences estimated using an empirical Bayes framework11. $$$\hat{Y}_{ijk}(t)$$$ is then corrected for site-related scale differences $$$d^{\ast}_{ik}$$$ to obtain PALS-harmonized cortical thickness $$$Y^{\ast}_{ijk}(t)$$$:
$$Y^{\ast}_{ijk}(t)=\left[\frac{\hat{Z}_{ijk}(t)}{d^{\ast}_{ik}}\right]\hat{s}^{2}_{k}+\hat{f}_{k}(X_{ij}),$$
where $$$\hat{Z}_{ijk}(t)=\frac{\hat{Y}_{ijk}(t)-\hat{f}_{k}(X_{ij})}{\hat{s}^{2}_{k}}$$$. $$$d^{\ast}_{ik}$$$ is estimated by applying ComBat in conjunction with GAMM to the residual of $$$\hat{Y}_{ijk}(t)$$$. To account for the effect of age on the variance, harmonization is carried out in two steps in PALS with scale differences determined based on absolute residuals.
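The two-step adjustment above can be illustrated with a minimal sketch for a single vertex. Substituting $$$Z_{ijk}(t)$$$ into the location equation gives $$$\hat{Y}_{ijk}(t)=Y_{ijk}(t)-g^{\ast}_{ik}s^{2}_{k}$$$, so the first step shifts each site by its estimated offset, and the second step rescales the residuals of the shifted data. The sketch makes several simplifying assumptions that are not part of PALS: a cubic polynomial stands in for the GAMM fit, there is no subject-specific random intercept, plain per-site means replace the empirical Bayes estimates, the scale factor is taken as the ratio of a site's mean absolute residual to the pooled mean absolute residual, and the piecewise pairing of sites is omitted. The function names `fit_trend` and `harmonize_location_scale` are hypothetical.

```python
import numpy as np


def fit_trend(age, y, degree=3):
    """Assumed stand-in for the GAMM age fit: polynomial least squares."""
    coeffs = np.polyfit(age, y, degree)
    return np.polyval(coeffs, age)


def harmonize_location_scale(y, age, site, degree=3):
    """Simplified location-then-scale adjustment for one vertex (not PALS itself)."""
    y = np.asarray(y, dtype=float)
    age = np.asarray(age, dtype=float)
    site = np.asarray(site)
    sites = np.unique(site)

    # Step 1: location. Standardize against the fitted curve, remove each
    # site's mean offset g_i, then map back to the original units.
    f = fit_trend(age, y, degree)
    s2 = np.var(y - f)              # variance, following the text's notation
    z = (y - f) / s2
    y_hat = np.empty_like(y)
    for i in sites:
        m = site == i
        g_i = z[m].mean()           # plain mean, not an empirical Bayes estimate
        y_hat[m] = (z[m] - g_i) * s2 + f[m]

    # Step 2: scale. Refit the curve to the location-corrected data and
    # rescale each site's residuals by d_i, based on absolute residuals.
    f_hat = fit_trend(age, y_hat, degree)
    resid = y_hat - f_hat
    pooled = np.abs(resid).mean()
    y_star = np.empty_like(y)
    for i in sites:
        m = site == i
        d_i = np.abs(resid[m]).mean() / pooled   # assumed scale estimate
        y_star[m] = resid[m] / d_i + f_hat[m]
    return y_star


if __name__ == "__main__":
    # Synthetic example: two sites with overlapping ages; site 1 has an
    # additive offset and a larger noise level than site 0.
    rng = np.random.default_rng(0)
    n = 400
    age = rng.uniform(0, 100, n)
    site = rng.integers(0, 2, n)
    noise_sd = np.where(site == 1, 0.3, 0.1)
    y = 3.0 - 0.01 * age + np.where(site == 1, 0.2, 0.0) + rng.normal(0, 1, n) * noise_sd
    y_star = harmonize_location_scale(y, age, site)
    print(y_star[:5])
```

Because the sketch fits one curve to all sites at once, it is only meaningful when the sites overlap in age; handling sites that cover different age ranges is what the piecewise strategy described above is designed for.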

Results

We compared PALS with (i) no harmonization, (ii) longitudinal ComBat2 (longCombat), and (iii) ComBat-GAM1. PALS is more effective in removing site-specific location and scale differences while controlling for age with GAMM (Fig. 2). The residuals given by PALS are centered around zero with a gradual scale increase (Fig. 3); longCombat and ComBat-GAM unnecessarily alter data dispersion and result in jagged trajectories. GAMM-fitted PALS-harmonized cortical thickness measurements, visualized on age-specific cortical atlases, exhibit smooth spatial and temporal changes across the lifespan (Fig. 4).

Conclusion

PALS effectively removes non-biological variability, yielding smooth trajectories of cortical thickness measurements across the entire human lifespan, both in terms of location and scale.

Acknowledgements

This work was supported in part by the United States National Institutes of Health (NIH) through grants MH125479 and EB008374.

Data were provided by the developing Human Connectome Project, KCL-Imperial-Oxford Consortium funded by the European Research Council under the European Union Seventh Framework Programme (FP/2007-2013) / ERC Grant Agreement no. 319456. We are grateful to the families who generously supported this trial.

Data were provided in part by the Human Connectome Project, WU-Minn Consortium (Principal Investigators: David Van Essen and Kamil Ugurbil; 1U54MH091657) funded by the 16 NIH Institutes and Centers that support the NIH Blueprint for Neuroscience Research; and by the McDonnell Center for Systems Neuroscience at Washington University.

References

1. Pomponio R, et al. Harmonization of large MRI datasets for the analysis of brain imaging patterns throughout the lifespan. NeuroImage. 2020;208:116450.

2. Beer JC, et al. Longitudinal ComBat: A method for harmonizing longitudinal multi-scanner imaging data. NeuroImage. 2020;220:117129.

3. Venkatraman VK, et al. Region of interest correction factors improve reliability of diffusion imaging measures within and across scanners and field strengths. NeuroImage. 2015;119:406-416.

4. Makropoulos A, et al. The developing human connectome project: A minimal processing pipeline for neonatal cortical surface reconstruction. NeuroImage. 2018;173:88-112.

5. Howell BR, et al. The UNC/UMN Baby Connectome Project (BCP): An overview of the study design and protocol development. NeuroImage. 2019;185:891-905.

6. Van Essen DC, et al. The Human Connectome Project: a data acquisition perspective. NeuroImage. 2012;62(4):2222-2231.

7. Glasser MF, et al. The minimal preprocessing pipelines for the Human Connectome Project. NeuroImage. 2013;80:105-124.

8. Wang L, et al. Volume-based analysis of 6-month-old infant brain MRI for autism biomarker identification and early diagnosis. International Conference on Medical Image Computing and Computer-Assisted Intervention. 2018;411-419.

9. Li G, et al. Measuring the dynamic longitudinal cortex development in infants by reconstruction of temporally consistent cortical surfaces. NeuroImage. 2014;90:266-279.

10. Li G, et al. Construction of 4D high-definition cortical surface atlases of infants: Methods and applications. Medical Image Analysis. 2015;25:22-36.

11. Johnson WE, et al. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics. 2007;8(1):118-127.

Figures

Figure 1: (a) PALS procedure for harmonizing cortical thickness measurements across the lifespan. (b) Piecewise adjustment of location differences in cortical thickness using ComBat in conjunction with GAMM.

Figure 2: Comparison of cortical thickness measurements before and after harmonization.

Figure 3: Distribution of cortical thickness residuals across sites, before and after harmonization.

Figure 4: Harmonized cortical thickness measurements mapped onto inflated age-specific atlases of white surface.

Proc. Intl. Soc. Mag. Reson. Med. 30 (2022), 4834
DOI: https://doi.org/10.58530/2022/4834