Nested support vector machine applied to structural and diffusion MR features for Alzheimer's disease prediction
Giovanni Giulietti1, Mara Cercignani2, and Marco Bozzali1

1Neuroimaging Laboratory, Santa Lucia Foundation, Rome, Italy, 2Clinical Imaging Sciences Centre, Brighton and Sussex Medical School, University of Sussex, Brighton, United Kingdom

Synopsis

This study applies nested support vector machines (SVMs) to distinguish healthy subjects from patients with Alzheimer's disease using very few features derived from structural (T1) and diffusion (DWI) MR. After segmenting the T1 images into GM, WM and CSF, mean values of fractional anisotropy, mean diffusivity, radial diffusivity and axial diffusivity were computed in GM and WM; GM and WM volumes were also expressed as percentages of total intracranial volume. We then trained 1023 different SVMs, one for each possible combination of the 10 features. Surprisingly, the WM diffusion measures proved to be the most specific for dementia status.

INTRODUCTION

The support vector machine (SVM) is a supervised classification method [1]. An SVM "learns" from a training dataset how to distinguish data that belong to one of two possible groups (e.g., healthy subjects and patients), then "builds" a prediction model that is used to assign new data (the testing dataset) to one group or the other. The present study focuses on the application of nested SVM to the prediction of Alzheimer's disease (AD) on the basis of a very small number of features (1 to 10) extracted from structural and diffusion MR scans.

METHODS

Subjects and data acquisition

We recruited 40 patients (25F/15M) diagnosed with probable AD (age=69.5±6.5; MMSE=19.1±4.3, range=11-28) and 28 (12F/16M) healthy subjects (HS; age=66.4±7.0; MMSE=28.8±1.6, range=25-30), age- and gender-matched to the AD group. All subjects underwent an MRI acquisition at 3.0T including: (1) a T1-weighted (MDEFT) scan (TR=1338ms, TE=2.4ms, Matrix=256x224, n. slices=176, thick=1mm) and (2) a diffusion-weighted (DW) twice-refocused spin-echo echo-planar imaging scan (SE EPI; TR=7s, TE=85ms, b factor=1000s/mm2, isotropic resolution=2.3mm3), collecting seven images with no diffusion weighting (b=0) and 61 images with diffusion gradients applied along 61 non-collinear directions.

Image analysis

The MDEFT images were first processed with SPM8 to yield maps of gray matter (GM), white matter (WM) and CSF volume in native space. Brain tissue volumes (GMvol, WMvol, CSFvol) were calculated for each subject. To account for differences in head size between subjects, GMvol and WMvol were expressed as percentages of the total intracranial volume, yielding the GM fraction (GMf) and WM fraction (WMf). DW images were processed (using FSL and CAMINO) to compute fractional anisotropy (FA), mean diffusivity (MD), radial diffusivity (RAD) and axial diffusivity (AXD). The FA maps were warped to the T1 images (used for the tissue segmentation). In this way we could compute, for each subject, the mean values of FA, MD, RAD and AXD in GM and in WM.
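The feature extraction described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: it assumes that total intracranial volume is the sum of the GM, WM and CSF volumes and that each tissue is represented by a binary mask, neither of which is stated in the text. The function names and the toy numbers are hypothetical.

```python
def tissue_fractions(gm_vol, wm_vol, csf_vol):
    """Return (GMf, WMf) as percentages of total intracranial volume.

    Assumption: TIV is taken as GM + WM + CSF volume.
    """
    tiv = gm_vol + wm_vol + csf_vol
    return 100.0 * gm_vol / tiv, 100.0 * wm_vol / tiv


def masked_mean(values, mask):
    """Mean of a diffusion map over the voxels flagged in a binary tissue mask."""
    selected = [v for v, m in zip(values, mask) if m]
    if not selected:
        raise ValueError("mask selects no voxels")
    return sum(selected) / len(selected)


# Toy, made-up numbers (volumes in ml, flattened 4-voxel FA map).
gmf, wmf = tissue_fractions(gm_vol=600.0, wm_vol=500.0, csf_vol=400.0)
fa_map = [0.2, 0.5, 0.7, 0.1]
wm_mask = [0, 1, 1, 0]
wm_fa = masked_mean(fa_map, wm_mask)
```

Applying `masked_mean` to the FA, MD, RAD and AXD maps with the GM and WM masks gives eight values per subject; together with GMf and WMf these make up the 10 features.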

SVM analysis

The 10 MR features obtained from the image analysis (GMf, WMf; FA, MD, RAD, AXD in GM and WM) were used to compute 1023 different n-feature SVM classifiers (n=1,2,...,10), one for each possible non-empty combination of the features (taken 1 at a time, 2 at a time, …, 9 at a time, and all 10 together; 2^10 − 1 = 1023), following a brute-force approach. In particular, we used non-linear SVMs with a Gaussian (RBF) kernel, implemented in a custom MATLAB script built on the LIBSVM library [2]. For each of the 1023 classifiers, the parameters of the SVM model (the soft-margin constant C and the width of the Gaussian kernel γ) were tuned through a grid search with leave-one-out (LOO) nested cross-validation (CV) [3]. For each classifier, the optimized values of C and γ were then used to create the optimized SVM model, for which we computed the classification accuracy (i.e., the proportion of AD and HS subjects correctly classified), sensitivity (i.e., the proportion of AD patients correctly classified) and specificity (i.e., the proportion of HS correctly classified).
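The brute-force enumeration and the nested LOO scheme can be sketched as follows. This is an illustration under stated assumptions, not the authors' MATLAB/LIBSVM code: a k-nearest-neighbour classifier stands in for the RBF SVM so that the sketch runs with the Python standard library alone, and the single hyperparameter k plays the role that the (C, γ) grid plays in the study. The feature labels, function names and toy data are hypothetical.

```python
from itertools import combinations

# Hypothetical labels for the 10 features described in the text.
FEATURES = ["GMf", "WMf", "GM_FA", "GM_MD", "GM_RAD", "GM_AXD",
            "WM_FA", "WM_MD", "WM_RAD", "WM_AXD"]


def feature_subsets(names):
    """Yield every non-empty combination of feature names."""
    for n in range(1, len(names) + 1):
        yield from combinations(names, n)


def knn_predict(train_X, train_y, x, k):
    """Stand-in classifier (NOT an SVM): majority vote of the k nearest points."""
    order = sorted(range(len(train_X)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)))
    votes = sum(train_y[i] for i in order[:k])
    return 1 if 2 * votes > k else 0


def loo_accuracy(X, y, k):
    """Inner loop: leave-one-out accuracy for one hyperparameter value."""
    hits = 0
    for i in range(len(X)):
        tr_X, tr_y = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        hits += knn_predict(tr_X, tr_y, X[i], k) == y[i]
    return hits / len(X)


def nested_loo_predictions(X, y, grid):
    """Outer LOO for testing; hyperparameter tuned on the remaining subjects only."""
    preds = []
    for i in range(len(X)):
        tr_X, tr_y = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        best = max(grid, key=lambda k: loo_accuracy(tr_X, tr_y, k))
        preds.append(knn_predict(tr_X, tr_y, X[i], best))
    return preds


subsets = list(feature_subsets(FEATURES))
n_subsets = len(subsets)  # 2**10 - 1 non-empty subsets

# Toy, made-up data for one 2-feature subset (1 = AD, 0 = HS).
toy_X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2],
         [1.0, 1.1], [1.2, 0.9], [0.9, 1.3], [1.1, 1.0]]
toy_y = [0, 0, 0, 0, 1, 1, 1, 1]
toy_preds = nested_loo_predictions(toy_X, toy_y, grid=[1, 3])
```

The point of the nesting is that the held-out subject in the outer loop never contributes to the choice of hyperparameters, which keeps the reported performance from being optimistically biased by the tuning step.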

RESULTS

FIG.1 summarizes the main results of the study. Surprisingly, among the SVMs exploiting a single feature (1-feature SVMs), the best accuracy (83.82%) and specificity (78.57%) were obtained with one of the WM features, namely AXD (FIG.2). Similarly, the best sensitivity (92.50%) was obtained with MD of WM (FIG.2). These classification performances were improved by SVMs exploiting more features (n-feature SVMs, n>1). In particular, the overall best accuracy (89.71%) was obtained with two different SVMs: the 4-feature SVM exploiting the GM diffusion measures and WMf, and the 9-feature SVM including all measures except FA of WM. The specificity and sensitivity of these two SVMs nevertheless differed (FIG.3).
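As a worked check of how the three performance measures relate to subject counts, the sketch below computes them from a confusion matrix. The counts are hypothetical: they are back-calculated from the percentages reported for the 1-feature WM AXD classifier and the cohort sizes (40 AD, 28 HS), and are not taken from the figures.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn: AD patients classified correctly/incorrectly;
    tn/fp: healthy subjects classified correctly/incorrectly.
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts inferred from the reported percentages (assumption):
# 22 of 28 HS correct gives 78.57% specificity, and 57 of 68 subjects
# correct gives 83.82% accuracy, which leaves 35 of 40 AD patients correct.
wm_axd = classification_metrics(tp=35, fn=5, tn=22, fp=6)
wm_axd_percent = {name: round(100 * value, 2) for name, value in wm_axd.items()}
```

With 68 subjects in total, each additional correctly classified subject changes the accuracy by about 1.5 percentage points, which is worth keeping in mind when comparing classifiers whose accuracies differ by only a few percent.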

DISCUSSION

In the current study, we investigated the classification of HS versus patients with AD, using multimodal (structural and diffusion) MR data as input to SVM classifiers. The 1-feature SVM performances (FIG.2) indicate that, among whole-brain measures, the diffusion properties of WM provide the better discrimination between AD and HS. In particular, the false positive rate implied by the specificity of AXD in WM (78.57%) is considerably lower than that obtained with GM measures. Using a brute-force approach, we explored all possible n-feature SVMs and found that the best classification performance (ACC=89.71%) was obtained by combining diffusion and structural features from both GM and WM (FIG.3). This finding indicates that, as expected, structural and diffusion MR properties of the brain provide complementary information on dementia status. However, the best sensitivity (97.50%) was obtained with a 3-feature SVM including only GM measures (FIG.3), and was outweighed by a very poor specificity (57.14%), which suggests that aging is a confounding factor mainly affecting GM measures.

Acknowledgements

No acknowledgement found.

References

1. Cortes C, Vapnik V. Machine Learning. 1995;20:273-297.
2. Chang CC, Lin CJ. LIBSVM: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology. 2011;2:27:1-27:27.
3. Krstajic D et al. Journal of Cheminformatics. 2014;6:10.

Figures

FIGURE 1. Each panel shows the maximum accuracy obtained with n-feature SVMs (n=1,2,…,10) that include that specific feature. For example, the upper left panel shows that the accuracy of SVMs featuring GM FA ranges from about 80% (with 1- and 2-feature SVMs) to about 90% (with the 9-feature SVM).

FIGURE 2. Accuracy (ACC), sensitivity (SENS) and specificity (SPEC) values obtained with SVMs exploiting only one feature (1-feature SVMs). Maximum values are highlighted in bold.

FIGURE 3. Best accuracy (ACC), sensitivity (SENS) and specificity (SPEC) values among all 1023 n-feature SVM classifiers. Maximum values are highlighted in bold.



Proc. Intl. Soc. Mag. Reson. Med. 24 (2016)
1260