Keywords: Diagnosis/Prediction, Multimodal, Generalization; Generalizable; Longitudinal; Disease progression modeling
Motivation: Current longitudinal AD-dementia progression prediction studies lack cross-cohort evaluation, raising concerns about the clinical applicability of prediction models.
Goal(s): Our goal was to develop a generalizable ML algorithm, L2C-FNN, and assess its generalizability across entirely distinct test cohorts.
Approach: L2C-FNN and baseline models were trained solely on ADNI and subsequently evaluated on AIBL, MACC, and OASIS. Multimodal biomarkers were leveraged for forecasting future clinical diagnosis, cognition, and ventricle volume.
Results: L2C-FNN compared favorably against strong baseline models on all three external test datasets, supporting its superior generalizability.
Impact: The demonstrated potential for improved generalizability in L2C-FNN signifies progress toward enhancing AI prediction models for clinical application. This underscores the continued need for cross-cohort evaluation in future AD-dementia progression modeling studies.
This work was supported by the Singapore National Research Foundation (NRF) Fellowship (Class of 2017), the National University of Singapore Yong Loo Lin School of Medicine (NUHSRO/2020/124/TMR/LOA), the Singapore National Medical Research Council (NMRC) LCG (OFLCG19May-0035), NMRC STaR (STaR20nov-0003), and the United States National Institutes of Health (R01MH120080). Our computational work was partially performed on resources of the National Supercomputing Centre, Singapore (https://www.nscc.sg). Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not reflect the views of the Singapore NRF or the Singapore NMRC. Data collection and sharing for this project was funded by the Alzheimer's Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie; Alzheimer's Association; Alzheimer's Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Cogstate; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada.
Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer's Therapeutic Research Institute at the University of Southern California. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California. Data were provided [in part] by OASIS: Longitudinal: Principal Investigators: D. Marcus, R. Buckner, J. Csernansky, J. Morris; P50 AG05681, P01 AG03991, P01 AG026276, R01 AG021910, P20 MH071616, U24 RR021382, OASIS-3: Principal Investigators: T. Benzinger, D. Marcus, J. Morris; NIH P50AG00561, P30NS09857781, P01AG026276, P01AG003991, R01AG043434, UL1TR000448, R01EB009352. AV-45 doses were provided by Avid Radiopharmaceuticals, a wholly owned subsidiary of Eli Lilly.
1. Scheltens P, Blennow K, Breteler MMB, et al. Alzheimer’s disease. The Lancet. 2016;388(10043):505-517. doi:10.1016/S0140-6736(15)01124-1
2. Mehdipour Ghazi M, Nielsen M, Pai A, et al. Training recurrent neural networks robust to incomplete data: Application to Alzheimer’s disease progression modeling. Med Image Anal. 2019;53:39-46. doi:10.1016/j.media.2019.01.004
3. Nguyen M, He T, An L, Alexander DC, Feng J, Yeo BTT. Predicting Alzheimer’s disease progression using deep recurrent neural networks. NeuroImage. 2020;222:117203. doi:10.1016/j.neuroimage.2020.117203
4. Zhou J, Liu J, Narayan VA, Ye J. Modeling disease progression via multi-task learning. NeuroImage. 2013;78:233-248. doi:10.1016/j.neuroimage.2013.03.073
5. Wang C, Li Y, Tsuboshita Y, et al. A high-generalizability machine learning framework for predicting the progression of Alzheimer’s disease using limited data. Npj Digit Med. 2022;5(1):1-10. doi:10.1038/s41746-022-00577-x
6. Dewey BE, Zhao C, Reinhold JC, et al. DeepHarmony: A deep learning approach to contrast harmonization across scanner changes. Magn Reson Imaging. 2019;64:160-170. doi:10.1016/j.mri.2019.05.041
7. Kang DW, Wang SM, Na HR, et al. Differences in cortical structure between cognitively normal East Asian and Caucasian older adults: a surface-based morphometry study. Sci Rep. 2020;10(1):20905. doi:10.1038/s41598-020-77848-8
8. Jack CR, Bernstein MA, Borowski BJ, et al. Update on the magnetic resonance imaging core of the Alzheimer’s disease neuroimaging initiative. Alzheimers Dement J Alzheimers Assoc. 2010;6(3):212-220. doi:10.1016/j.jalz.2010.03.004
9. Ellis KA, Rowe CC, Villemagne VL, et al. Addressing population aging and Alzheimer’s disease through the Australian Imaging Biomarkers and Lifestyle study: Collaboration with the Alzheimer’s Disease Neuroimaging Initiative. Alzheimers Dement. 2010;6(3):291-296. doi:10.1016/j.jalz.2010.03.009
10. Hilal S, Tan CS, van Veluw SJ, et al. Cortical cerebral microinfarcts predict cognitive decline in memory clinic patients. J Cereb Blood Flow Metab Off J Int Soc Cereb Blood Flow Metab. 2020;40(1):44-53. doi:10.1177/0271678X19835565
11. LaMontagne PJ, Benzinger TLS, Morris JC, et al. OASIS-3: Longitudinal Neuroimaging, Clinical, and Cognitive Dataset for Normal Aging and Alzheimer Disease. medRxiv. Published online January 1, 2019:2019.12.13.19014902. doi:10.1101/2019.12.13.19014902
12. Zhao Y, Wong L, Goh WWB. How to do quantile normalization correctly for gene expression data analyses. Sci Rep. 2020;10(1):15534. doi:10.1038/s41598-020-72664-6
13. Marinescu RV, Oxtoby NP, Young AL, et al. TADPOLE Challenge: Prediction of Longitudinal Evolution in Alzheimer’s Disease. Published online August 30, 2018. doi:10.48550/arXiv.1805.03909
14. Marinescu RV, Oxtoby NP, Young AL, et al. The Alzheimer’s Disease Prediction Of Longitudinal Evolution (TADPOLE) Challenge: Results after 1 Year Follow-up. Published online December 27, 2021. doi:10.48550/arXiv.2002.03419
15. Bouckaert RR, Frank E. Evaluating the Replicability of Significance Tests for Comparing Learning Algorithms. In: Dai H, Srikant R, Zhang C, eds. Vol 3056. Lecture Notes in Computer Science. Springer Berlin Heidelberg; 2004:3-12. doi:10.1007/978-3-540-24775-3_3
16. Benjamini Y, Hochberg Y. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. J R Stat Soc Ser B Methodol. 1995;57(1):289-300. doi:10.1111/j.2517-6161.1995.tb02031.x
17. Fan C, Wang J, Gang W, Li S. Assessment of deep recurrent neural network-based strategies for short-term building energy predictions. Appl Energy. 2019;236:700-710. doi:10.1016/j.apenergy.2018.12.004
Figure 1. Training and testing procedure and L2C-FNN model workflow. (A) Models were trained on ADNI and adapted to three unseen test cohorts. ADNI participants were divided into training, validation, and test sets for hyperparameter tuning and within-cohort evaluation. The models were then adapted to the test cohorts for cross-cohort evaluation, with 20 repetitions to ensure result stability. (B) Multimodal longitudinal inputs were transformed into a cross-sectional format. A deep feedforward neural network takes the transformed data as input and generates multimodal forecasts.
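The longitudinal-to-cross-sectional step in panel (B) can be illustrated with a minimal sketch. The abstract does not specify which summary features are computed, so the statistics below (last observed value, mean, slope, and time since last measurement), the function name, and the input layout are all assumptions made for illustration only.

```python
from statistics import mean

def longitudinal_to_cross_sectional(visits, feature):
    """Summarise one subject's visit history for one biomarker as a flat dict.

    visits: list of (months_since_baseline, {feature_name: value or None}),
            sorted by time. The chosen summary statistics are illustrative
            assumptions, not the features used by L2C-FNN.
    """
    # Keep only the visits where this biomarker was actually measured.
    observed = [(t, v[feature]) for t, v in visits if v.get(feature) is not None]
    keys = ("last", "mean", "slope", "months_since_last")
    if not observed:
        return {f"{feature}_{k}": None for k in keys}
    times, values = zip(*observed)
    # Rate of change per month between the first and last observation.
    if times[-1] > times[0]:
        slope = (values[-1] - values[0]) / (times[-1] - times[0])
    else:
        slope = 0.0
    return {
        f"{feature}_last": values[-1],
        f"{feature}_mean": mean(values),
        f"{feature}_slope": slope,
        # Gap between the most recent visit and the most recent measurement.
        f"{feature}_months_since_last": visits[-1][0] - times[-1],
    }

# Hypothetical subject with MMSE missing at two of four visits.
history = [(0, {"mmse": 29.0}), (6, {"mmse": None}),
           (12, {"mmse": 27.0}), (18, {"mmse": None})]
row = longitudinal_to_cross_sectional(history, "mmse")
```

Because every subject is reduced to a fixed-length row regardless of how many visits they have, a plain feedforward network can consume the result without needing a recurrent architecture to handle missing or irregular visits.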
Figure 2. Architecture of the Feedforward Neural Network (FNN) and the range of the hyperparameter search. (A) The FNN incorporates leaky rectified linear units (LeakyReLU) between layers. The final layer simultaneously outputs all target variables to enable multi-task learning. (B) The model’s structure (e.g., number of layers and hidden layer size) and training configuration (e.g., learning rate and weight regularization) are hyperparameters selected using the validation sets.
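As a rough illustration of the architecture in panel (A), the sketch below runs a forward pass through a small feedforward network with LeakyReLU between layers and a single output layer that emits all targets at once. The layer sizes, the LeakyReLU slope of 0.01, the three diagnosis classes, and the use of MMSE as the cognition target are assumptions for this example; the actual values were chosen by the hyperparameter search and are not given here.

```python
import random

def leaky_relu(x, slope=0.01):
    """LeakyReLU: identity for positive inputs, small linear slope otherwise."""
    return x if x > 0 else slope * x

def dense(inputs, weights, biases):
    """Affine layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def init_layer(n_in, n_out, rng):
    """Random weights and zero biases (illustrative initialisation only)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def forward(features, hidden_layers, output_layer):
    """Forward pass with LeakyReLU between layers and a shared multi-task head."""
    h = features
    for weights, biases in hidden_layers:
        h = [leaky_relu(z) for z in dense(h, weights, biases)]
    out = dense(h, *output_layer)
    # Assumed output layout: 3 diagnosis logits, then MMSE, then ventricle volume.
    return {"diagnosis_logits": out[:3], "mmse": out[3], "ventricle_volume": out[4]}

rng = random.Random(0)
hidden = [init_layer(4, 8, rng), init_layer(8, 8, rng)]  # hypothetical sizes
head = init_layer(8, 5, rng)
prediction = forward([0.1, -0.3, 0.7, 1.2], hidden, head)
```

Sharing the hidden layers across diagnosis, cognition, and ventricle volume is what makes this multi-task: in training, a per-target loss would be summed over the outputs of the same final layer.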
Figure 3. L2C-FNN compares favorably against baseline methods for within-cohort evaluation. (A) Boxplots represent variability across 20 test sets. (B) Statistical difference between models. “***” indicates p < 0.00001 and statistical significance after multiple comparison correction (FDR q < 0.05). “n.s.” indicates a comparison that was not statistically significant (p ≥ 0.05) or that did not survive FDR correction. Green indicates that L2C-FNN significantly outperforms the baseline method.
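The FDR correction at q < 0.05 mentioned in panel (B) can be sketched with the Benjamini-Hochberg step-up procedure. This is a generic implementation of that procedure with made-up p-values, not the authors' analysis code, and the function name is hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected at FDR q."""
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * q.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            max_rank = rank
    # Step-up rule: reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected

# Made-up p-values for four model comparisons.
survives = benjamini_hochberg([0.01, 0.02, 0.03, 0.5])
```

A comparison with p < 0.05 can still fail this test when its rank-scaled threshold is stricter than 0.05, which is why a comparison can be labeled “n.s.” despite a nominally small p-value.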
Figure 5. Breakdown of MMSE prediction performance from Figure 4 by yearly interval, up to 6 years into the future. The performance of all algorithms degraded as the forecast horizon increased. L2C-FNN performed comparably to or better than all baseline algorithms across all years and test cohorts.