Investigating the correspondence of clinical diagnostic grouping with underlying neurobiological and phenotypic clusters using unsupervised learning: An application to the Alzheimer’s spectrum
Xinyu Zhao1, D Rangaprakash1, D Narayana Dutt2, and Gopikrishna Deshpande1,3,4

1Electrical and Computer Engineering, AU MRI Research Center, Auburn, AL, United States, 2Medical Electronics, Dayananda Sagar College of Engineering, Bangalore, India, 3Psychology, Auburn University, Auburn, AL, United States, 4Alabama Advanced Imaging Consortium, Auburn University and University of Alabama Birmingham, Auburn, AL, United States

Synopsis

Many brain-based disorders are traditionally diagnosed based on clinical interviews and behavioral assessments. Using the Alzheimer’s spectrum (i.e., mild cognitive impairment [MCI] and Alzheimer’s disease [AD]) as a test case, we investigated whether clinical diagnostic grouping is grounded in underlying neurobiological and phenotypic clusters. To do so, three unsupervised learning methods were applied to resting-state fMRI connectivity measures obtained from subjects with MCI and AD. High similarity was achieved between connectivity and phenotypic clusters, whereas similarity with clinical diagnosis was low. This suggests that neurobiological and phenotypic markers could be used to improve the precision of clinical diagnosis.

Introduction:

Alzheimer’s disease (AD) is a chronic neurodegenerative disorder that gradually impairs memory and cognitive performance, and eventually the ability to carry out the simplest tasks. AD is traditionally diagnosed based on clinical interviews and psychometric testing, which are recognized to be imperfect. Neuroimaging-based biomarkers are therefore needed to improve diagnostic precision. Resting-state functional magnetic resonance imaging (rs-fMRI) has shown promise for automatically distinguishing AD from healthy controls (HC)1,2,3. However, most of these classification methods are supervised, i.e., they require a priori clinical labels to guide classification. In this study, we applied three unsupervised clustering methods to rs-fMRI connectivity to investigate hidden structures in the AD spectrum without relying on a priori clinical labels.

Methods:

Rs-fMRI data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (http://adni.loni.usc.edu/) were used in this study. The sample consisted of subjects at three progressive stages of cognitive impairment, namely early mild cognitive impairment (MCI, N=23), late MCI (N=29) and AD (N=13), along with matched healthy controls (HC, N=31). Data were acquired on a Philips 3T MRI scanner with TR=3s, TE=30ms and slice thickness=3.3mm. Standard pre-processing steps were performed and mean fMRI time-series were obtained from 200 functionally homogeneous brain regions (cc200 template4). Static and dynamic functional connectivity5 (SFC and DFC) were obtained between all pairs of brain regions. SFC gives the connectivity strength, whereas the variance of DFC (vDFC) gives the temporal variability of connectivity5 and has been shown to convey biologically relevant information6 that is distinct from static connectivity. Group differences in SFC and vDFC were tested using ANOVA, and only the significant features (p<0.01) were carried forward to the clustering analysis. The main idea of unsupervised clustering is to group objects in such a way that objects in the same group are more similar to each other than to those in other groups. In this study, three clustering methods were adopted, i.e., hierarchical clustering7, ordering points to identify the clustering structure (OPTICS)8 and density peak clustering (DPC)9. These methods were specifically chosen because they do not require a priori specification of the number of clusters. Since clustering accuracy is often lower in high-dimensional feature spaces, feature selection methods were applied. A forward search (FS) method ranked features by statistical significance and added them sequentially to the clustering. The optimal subset was then taken to be the one resulting in the highest accuracy.
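The forward search described above can be illustrated with a minimal sketch. It assumes the features have already been ranked by p-value, and it treats the "cluster, then measure accuracy" step as an arbitrary callable. The p-values, the toy_score function and all names below are hypothetical and are not taken from the study.

```python
from typing import Callable, List, Sequence, Tuple


def forward_search(
    ranked_features: Sequence[int],
    score: Callable[[List[int]], float],
) -> Tuple[List[int], float, List[float]]:
    """Score the nested prefixes of a ranked feature list and keep the best one."""
    best_subset: List[int] = []
    best_score = float("-inf")
    curve: List[float] = []  # score obtained after adding each feature
    for k in range(1, len(ranked_features) + 1):
        subset = list(ranked_features[:k])
        s = score(subset)
        curve.append(s)
        if s > best_score:
            best_subset, best_score = subset, s
    return best_subset, best_score, curve


# Hypothetical p-values (feature index -> p); not data from the study.
p_values = {0: 0.004, 1: 0.0001, 2: 0.009, 3: 0.002, 4: 0.0005}
ranked = sorted(p_values, key=p_values.get)  # most significant first


def toy_score(subset: List[int]) -> float:
    """Toy stand-in for 'cluster on this subset and measure accuracy'.

    Features 1 and 3 help separation; every other feature only adds noise.
    """
    helpful = {1, 3}
    return sum(1.0 if f in helpful else -0.4 for f in subset)


best_subset, best_score, curve = forward_search(ranked, toy_score)
print(ranked, best_subset, best_score)
```

In this toy run the best prefix still contains feature 4, which the toy objective penalizes, because a prefix-based search cannot drop a feature that was ranked earlier.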
However, this method does not guarantee a global optimum, as statistical significance of individual features does not necessarily guarantee cluster separation when they are combined. To overcome this problem, a genetic algorithm (GA) was used in this study10. GA is a stochastic search heuristic inspired by evolutionary theory. It starts from a set of randomly generated solutions and iteratively retains the solutions with larger objective values, generating new candidates through crossover and mutation operations (Fig. 1). Clustering was applied to three types of features: (i) SFC and vDFC, (ii) clinical diagnostic labels, and (iii) phenotypic and genetic variables11,12,13 (Table 1). The accuracy of the clustering and feature selection was assessed by computing the clustering similarity14 between all pairs of the three feature types. We hypothesized that clinical diagnosis would have high clustering similarity with neurobiological (connectivity) and phenotypic/genetic markers of disease.
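The GA-based feature selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the population size, mutation rate, single-point crossover, elitism and the toy objective are all choices made here for the example, and in practice the objective would be the clustering similarity obtained with the selected features.

```python
import random
from typing import Callable, List, Tuple

Mask = List[int]  # one candidate solution: 1 = feature selected, 0 = discarded


def genetic_feature_selection(
    n_features: int,
    objective: Callable[[Mask], float],
    pop_size: int = 20,
    n_generations: int = 40,
    mutation_rate: float = 0.05,
    seed: int = 0,
) -> Tuple[Mask, float, List[float]]:
    """Search over binary feature masks; requires n_features >= 2."""
    rng = random.Random(seed)
    # Start from randomly generated solutions (one bit mask per candidate).
    population = [
        [rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)
    ]
    history: List[float] = []  # best objective value seen in each generation
    for _ in range(n_generations):
        ranked = sorted(population, key=objective, reverse=True)
        history.append(objective(ranked[0]))
        parents = ranked[: pop_size // 2]  # selection: keep the better half
        children: List[Mask] = [ranked[0][:]]  # elitism: best mask survives as-is
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n_features - 1)  # single-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: flip each bit with a small probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
            children.append(child)
        population = children
    best = max(population, key=objective)
    return best, objective(best), history


# Hypothetical "ideal" subset used only to build a toy objective.
target = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1]


def toy_objective(mask: Mask) -> float:
    """Number of bits that agree with the hypothetical target mask."""
    return float(sum(m == t for m, t in zip(mask, target)))


best_mask, best_value, history = genetic_feature_selection(len(target), toy_objective)
print(best_mask, best_value)
```

Because the best mask of each generation is carried over unchanged, the per-generation best value in history never decreases, which gives a step-wise curve.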

Results and Discussion:

FS and GA methods were compared in terms of the similarity obtained at different iterations (see Fig. 2). With FS, the curve oscillated dramatically, while with GA, a clear step-wise convergence was observed. GA also led to a larger peak similarity. OPTICS gave higher similarity than the DPC and hierarchical methods (Table 2), and the number of clusters it determined was consistent with clinical diagnosis. Features selected by GA and OPTICS included connections in all lobes of the brain, particularly those related to the default mode network (DMN), e.g., medial temporal lobe (MTL), posterior cingulate cortex (PCC), precuneus, and superior and inferior parietal gyri (Fig. 3). These findings are consistent with previous studies1,2 showing alterations in the DMN in MCI and AD. Selected phenotypic/genetic variables are shown in Table 1 and are consistent with known behavioral and genetic alterations on the AD spectrum12,13,15. The similarity was highest between connectivity and phenotypic variables, while clinical diagnosis had low similarity with both of them. This suggests that clinical diagnostic criteria for MCI and AD are not completely grounded in underlying neurobiological, neurobehavioral and genetic markers. Further, our framework using unsupervised clustering may be used to evaluate the fidelity of clinical diagnostic criteria in other brain-based disorders.
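To make the notion of similarity between two clusterings concrete, the sketch below compares two label assignments over the same subjects. The abstract does not state the formula for the similarity measure14, so the Jaccard-overlap form used here is an assumption and may differ from the measure actually applied. The function names and example labels are hypothetical.

```python
from typing import Dict, Hashable, List, Sequence, Set


def _clusters(labels: Sequence[Hashable]) -> List[Set[int]]:
    """Turn a label vector into a list of clusters (sets of subject indices)."""
    groups: Dict[Hashable, Set[int]] = {}
    for idx, lab in enumerate(labels):
        groups.setdefault(lab, set()).add(idx)
    return list(groups.values())


def clustering_similarity(
    labels_a: Sequence[Hashable], labels_b: Sequence[Hashable]
) -> float:
    """Assumed form: sum of pairwise Jaccard overlaps between clusters,
    divided by the larger of the two cluster counts."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same subjects")
    ca, cb = _clusters(labels_a), _clusters(labels_b)
    total = sum(len(x & y) / len(x | y) for x in ca for y in cb)
    return total / max(len(ca), len(cb))


# Same grouping under different label names: maximal similarity.
same = clustering_similarity([0, 0, 1, 1, 2, 2], ["x", "x", "y", "y", "z", "z"])
# One clustering splits a group that the other keeps whole.
split = clustering_similarity([0, 0, 0, 0], [0, 0, 1, 1])
print(same, split)
```

Under this assumed form the value depends only on how subjects are grouped, not on the label names, so cluster indices from different methods can be compared directly.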

Acknowledgements

Data used in this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). Investigators within ADNI contributed to design and implementation of ADNI and provided data but did not participate in analysis or writing of this report. Complete listing of ADNI investigators can be found at: http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf.

Data collection and sharing for this work was funded by the Alzheimer's Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie, Alzheimer’s Association; Alzheimer’s Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for NIH (www.fnih.org). The grantee is the Northern California Institute for Research and Education, and the study is coordinated by Alzheimer's Disease Cooperative Study at the University of California, San Diego. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California.

References

1. Khazaee A, Ebrahimzadeh A, Babajani-Feremi A. Identifying patients with Alzheimer’s disease using resting-state fMRI and graph theory. Clin Neurophysiol. 2015;126(11):2132-2141.

2. Chen G, Ward BD, Xie C, et al. Classification of Alzheimer disease, mild cognitive impairment, and normal cognitive status with large-scale network analysis based on resting-state functional MR imaging. Radiology. 2011;259(1):213-221.

3. Filipovych R, Resnick SM, Davatzikos C. JointMMCC: Joint maximum-margin classification and clustering of imaging data. IEEE Trans Med Imaging. 2012;31(5):1124-1140.

4. Craddock RC, James GA, Holtzheimer PE, Hu XP, Mayberg HS. A whole brain fMRI atlas generated via spatially constrained spectral clustering. Hum Brain Mapp. 2012;33(8):1914-1928.

5. Deshpande G, Laconte S, Peltier SJ, Hu X. Connectivity Analysis of Human Functional MRI Data: From Linear to Nonlinear and Static to Dynamic. In: Medical Imaging and Augmented Reality. 2006;4091:17-24.

6. Jia H, Hu X, Deshpande G. Behavioral relevance of the dynamics of functional brain connectome. Brain Connect. 2014;4(9):741-759.

7. Cheng D, Kannan R, Vempala S, Wang G. A divide-and-merge methodology for clustering. ACM Trans Database Syst. 2006;31(4):1499-1525.

8. Ankerst M, Breunig MM, Kriegel H-P, Sander J. OPTICS: Ordering points to identify the clustering structure. ACM SIGMOD Rec. 1999:49-60.

9. Rodriguez A, Laio A. Clustering by fast search and find of density peaks. Science. 2014;344(6191):1492-1496.

10. Yang J, Honavar V. Feature Subset Selection Using A Genetic Algorithm. Pattern Recognit. 1997;13(2):380.

11. Cummings JL. The Neuropsychiatric Inventory: assessing psychopathology in dementia patients. Neurology. 1997;48(5 Suppl 6):S10-S16.

12. Burke WJ, Houston MJ, Boust SJ, Roccaforte WH. Use of the Geriatric Depression Scale in dementia of the Alzheimer type. J Am Geriatr Soc. 1989;37(9):856-860.

13. Galasko D, Klauber MR, Hofstetter CR, Salmon DP, Lasker B, Thal LJ. The Mini-Mental State Examination in the early diagnosis of Alzheimer’s disease. Arch Neurol. 1990;47(1):49-52.

14. Torres G, Basnet R, Sung A. A similarity measure for clustering and its applications. Proc World Acad Sci Eng Technol. 2008;31(1307-6884):490-496.

15. Kim J, Basak JM, Holtzman DM. The role of apolipoprotein E in Alzheimer’s disease. Neuron. 2009;63(3):287-303.

Figures

Table 1: Phenotypic/genetic variables selected by GA with different clustering methods.

Table 2: Peak similarity, corresponding number of features, and number of clusters obtained using GA with different clustering methods.

Figure 1: Flowchart of GA process for feature selection. In the M-by-N matrix, each row represents a candidate solution, describing a subset of selected features. Each of the N bits in a row represents whether a feature is selected (1) or discarded (0).

Figure 2: Similarity obtained from different iterations using (a) FS and (b) GA.

Figure 3: (a) SFC and (b) vDFC features selected by GA and OPTICS.



Proc. Intl. Soc. Mag. Reson. Med. 24 (2016)
4034