Extensive research is under way on classifying healthy controls (HC) and patients using features extracted from resting-state fMRI dynamic connectivity states, which are typically identified by applying a clustering algorithm. For classification purposes, however, not every dynamic state necessarily carries useful information. In this work, we propose a brute-force (BF) approach that uses a subset of these states to perform classification. Our results indicate that, in most cases, there exists a subset of states that yields higher accuracy than using the information from all of the states.
For our experiments, we use resting-state fMRI data from 163 HC and 151 schizophrenia (SZ) patients, previously used in [1]. For each subject, we took 1081 static functional network connectivity (sFNC) measures; for dynamic FNC (dFNC), we took 136 windows per subject, each comprising 1081 dFNC measures. We evaluated the classification performance of sFNC and dFNC as in our recent work [2]. In all experimental settings, we applied a linear support vector machine (SVM) with 5-fold cross-validation, repeated 10 times.
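The evaluation protocol described above can be sketched as follows. This is a minimal illustration, assuming scikit-learn, stratified folds, and a regularization constant of C=1.0, none of which are specified in the text; the data are synthetic stand-ins with the same dimensions as the sFNC features, not the study data, and the function name `repeated_cv_accuracy` is hypothetical.

```python
import numpy as np
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.svm import SVC


def repeated_cv_accuracy(X, y, n_splits=5, n_repeats=10, seed=0):
    """Return (mean accuracy, standard error of the mean across repetitions)."""
    # Stratified splitting is an assumption; the text only states 5-fold CV.
    cv = RepeatedStratifiedKFold(
        n_splits=n_splits, n_repeats=n_repeats, random_state=seed
    )
    clf = SVC(kernel="linear", C=1.0)  # C=1.0 is an assumed default
    scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
    # Scores are ordered repetition by repetition, so each row is one repetition.
    per_repeat = scores.reshape(n_repeats, n_splits).mean(axis=1)
    sem = per_repeat.std(ddof=1) / np.sqrt(n_repeats)
    return per_repeat.mean(), sem


# Synthetic stand-in data (NOT the study data): 163 HC, 151 SZ, 1081 features.
rng = np.random.default_rng(0)
n_hc, n_sz, n_features = 163, 151, 1081
X = rng.standard_normal((n_hc + n_sz, n_features))
y = np.array([0] * n_hc + [1] * n_sz)
X[y == 1, :20] += 0.5  # inject a small, artificial group difference

mean_acc, sem = repeated_cv_accuracy(X, y)
print(f"accuracy = {mean_acc:.3f} +/- {sem:.3f} (SEM over 10 repetitions)")
```

Averaging within each repetition before computing the standard error treats the 10 repetitions, rather than the 50 individual folds, as the unit of variation; other conventions are possible.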
For sFNC, we trained a linear SVM classifier on the training set of a given fold and computed its accuracy on the held-out test set of that fold. The remaining four folds were processed in the same manner. The whole procedure was repeated 10 times to compute the standard error of the mean accuracy.
For dFNC, using the same folds as for sFNC, we ran k-means on the training set separately for each group (HC and SZ) with model orders (numbers of clusters) from 2 to 5. We then combined the cluster centroids extracted from the two groups and applied the BF approach to select the subset of centroids that provides the best classification accuracy. For each candidate subset of HC and SZ centroids, every observation (i.e., dynamic window) was regressed onto the centroids in that subset to estimate its contribution weights (beta coefficients); the detailed procedure for computing the beta coefficients is described in [2]. After regression, one mean beta per centroid state (the mean over all observations of an individual subject) was obtained as a feature for the SVM. Next, we trained a linear SVM classifier on the beta coefficients from the training fold. Beta coefficients for the test set were computed using the same combined centroids as for the training set, and the classification accuracy on the test set was then computed with the trained SVM classifier. Finally, the subsets of centroids that provided the maximum classification accuracy were selected.
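The feature extraction and subset search can be sketched as below. This is an illustrative outline under stated assumptions, not the exact procedure of [2]: the betas are estimated by ordinary least squares without an intercept, every non-empty subset of the combined centroid pool is a candidate, and `mean_betas`, `best_centroid_subset`, and `score_fn` are hypothetical names. In practice `score_fn` would compute mean betas for the training and test subjects, fit a linear SVM, and return the accuracy.

```python
from itertools import combinations

import numpy as np


def mean_betas(windows, centroids):
    """Regress each window onto the centroids; return the mean beta per centroid.

    windows:   (n_windows, n_features) dFNC windows of one subject
    centroids: (n_states, n_features) selected cluster centroids
    """
    # Least-squares solution of centroids.T @ beta = window, for all windows at once.
    # Ordinary least squares without an intercept is an assumption of this sketch.
    betas, *_ = np.linalg.lstsq(centroids.T, windows.T, rcond=None)
    return betas.mean(axis=1)  # shape (n_states,)


def best_centroid_subset(centroids, score_fn):
    """Evaluate every non-empty subset of centroids; return (indices, score) of the best."""
    best_score, best_idx = -np.inf, None
    n_states = len(centroids)
    for size in range(1, n_states + 1):
        for idx in combinations(range(n_states), size):
            score = score_fn(centroids[list(idx)])
            if score > best_score:  # ties keep the first subset found
                best_score, best_idx = score, idx
    return best_idx, best_score


# Toy check with synthetic numbers: 2 "HC" + 2 "SZ" centroids in 6 dimensions.
rng = np.random.default_rng(0)
demo_centroids = rng.standard_normal((4, 6))
demo_weights = rng.standard_normal((10, 4))    # true per-window weights
demo_windows = demo_weights @ demo_centroids   # 10 noiseless windows
demo_features = mean_betas(demo_windows, demo_centroids)
```

With k centroids per group, the combined pool has 2k centroids and therefore 2^(2k) - 1 non-empty subsets (1023 for k = 5), so exhaustive enumeration remains inexpensive for the model orders considered here.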
[1] Damaraju, E., Allen, E. A., Belger, A., Ford, J. M., McEwen, S., Mathalon, D. H., ... & Calhoun, V. D. (2014). Dynamic functional connectivity analysis reveals transient states of dysconnectivity in schizophrenia. NeuroImage: Clinical, 5, 298-308.
[2] Rashid, B., Arbabshirani, M. R., Damaraju, E., Cetin, M. S., Miller, R., Pearlson, G. D., & Calhoun, V. D. (2016). Classification of schizophrenia and bipolar patients using static and dynamic resting-state fMRI brain connectivity. NeuroImage, 134, 645-657.