Zhengshi Yang1, Xiaowei Zhuang1, Karthik Sreenivasan1, Virendra Mishra1, Christopher Bird1, Tim Curran2, Sarah J Banks1, and Dietmar Cordes1,2
1Cleveland Clinic Lou Ruvo Center for Brain Health, Las Vegas, NV, United States, 2University of Colorado, Boulder, CO, United States
Synopsis
Classifying different episodic memory tasks
at individual time points is challenging because the signal-to-noise
ratio in the affected brain regions of the medial temporal lobes is low and the same
brain regions (such as the hippocampus) contribute to activation in all of these tasks. No
studies have implemented a deep neural network (DNN) to classify memory tasks at
each fMRI time point using whole-brain data. We have implemented a
region-of-interest based DNN framework and applied it to classify three
different episodic memory tasks. Results indicate that this DNN classifier can
accurately discriminate between all these tasks.
Introduction
Deep neural networks (DNNs) have recently been applied in neuroimaging
research for tissue segmentation, group classification, task classification,
and prediction of disease severity in patients [1,2,3,4]. No studies have
implemented a DNN to classify different memory tasks by time point using
whole-brain data. Classifying memory tasks by time point is challenging
because some brain regions, such as the hippocampus, may be involved in all episodic
memory tasks. We have constructed a region-of-interest (ROI) based DNN
framework and applied it to classify three different episodic memory tasks.
Methods
Subjects: Episodic memory task
fMRI data of 16 subjects were acquired on a 3.0 T GE MRI scanner. All subjects
were scanned with three memory tasks consisting of faces paired with occupations,
natural scenery pictures, and word pairs describing common objects. Each memory
task consisted of six periods of encoding (21 s), distraction (11 s),
recognition (42 s), and instruction (5 s).
Analysis: Only the recognition blocks were extracted for
task classification. A schematic diagram of task classification is shown in
Fig.1. Each time point in the recognition blocks is treated as a sample, and
the mean BOLD signal intensities in ROIs, determined by the fconn atlas [5],
are the input features to the DNN classifier. The DNN is constructed with an input
layer, two 128-node hidden layers, and an output layer with 3 nodes
corresponding to the three memory tasks. The commonly used ReLU [6] activation
function was used in the DNN framework. The weight
matrices ($$$W_1$$$, $$$W_2$$$ and $$$W_3$$$) are updated with an
adaptive gradient descent method [7], and the dropout technique [8] is applied to
alleviate overfitting. The softmax function is used for the output layer so
that the output can be interpreted as the posterior probability of each
corresponding task. We trained the DNN with leave-one-subject-out
cross-validation and predicted the tasks for the left-out subject. To
determine the importance of each ROI in classifying different tasks, we applied
a principal sensitivity analysis (PSA) [9] to the trained DNN classifier. Similar
to principal component analysis, which maximizes the variance along a set of
orthogonal directions in feature space, PSA searches for orthogonal directions
in the ROI feature space and ranks these directions by eigenvalue. The
direction with the maximal eigenvalue is the most important one for discriminating one
task from all the other tasks. To further validate the maps found by PSA, we
compared them with the group activation map of each task for the contrast
“recognition-control” (R-C).
Results
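As an illustration of the classifier and validation scheme described in Methods, the following is a minimal sketch of the forward pass (ROI features, two 128-node ReLU layers, 3-node softmax output) and of a leave-one-subject-out split. It is not the authors' implementation: the number of ROIs (`N_ROIS`), the random weight initialization, and all function names are assumptions made for illustration, and training (the adaptive gradient update and dropout) is omitted.

```python
import math
import random


def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in v]


def softmax(v):
    """Numerically stable softmax; outputs are positive and sum to 1."""
    m = max(v)
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]


def dense(x, weights, bias):
    """Affine layer: one row of `weights` per output node."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def init_layer(n_in, n_out, rng):
    """Random Gaussian weights (an assumed initialization) and zero biases."""
    scale = math.sqrt(2.0 / n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(x, layers):
    """ReLU on every hidden layer, softmax on the output layer."""
    h = x
    for weights, bias in layers[:-1]:
        h = relu(dense(h, weights, bias))
    weights, bias = layers[-1]
    return softmax(dense(h, weights, bias))


def leave_one_subject_out(subject_ids):
    """Yield (held_out, train_idx, test_idx), holding out one subject per fold."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


rng = random.Random(0)
N_ROIS = 10  # hypothetical feature count; the real value depends on the atlas
layers = [init_layer(N_ROIS, 128, rng),   # hidden layer 1
          init_layer(128, 128, rng),      # hidden layer 2
          init_layer(128, 3, rng)]        # output layer: three memory tasks

sample = [rng.gauss(0.0, 1.0) for _ in range(N_ROIS)]  # one time point
probs = forward(sample, layers)  # posterior probability of each task

# Each sample (time point) carries the ID of the subject it came from.
folds = list(leave_one_subject_out([0, 0, 1, 1, 2]))
```

Splitting by subject rather than by time point keeps all samples of the held-out subject out of training, so the test accuracy reflects generalization to unseen subjects.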
The DNN classifier achieves 94.7% accuracy in predicting the different memory
tasks at individual fMRI time points. Table 1 shows the accuracy for each
specific task. The time points for the picture task can be predicted with near
perfection, and accuracy for the face task is the lowest. Fig.2 shows the top
ten eigenvalues computed from PSA for all three tasks, with the largest
eigenvalue normalized to be one for the purpose of visualization. The first
component found by PSA is most important since the first eigenvalue dominates
all others. In Fig.3, the top panel shows the group activation maps for these
three tasks. The bottom panel shows the direction obtained from PSA with
maximal eigenvalue. The activations in the fusiform gyrus are distinct among
these three tasks. The activated fusiform area for the picture task is located medial to the activated area for the face task. In contrast, there
is little activation in the fusiform gyrus for the word task. The regions found
in the PSA maps are consistent with the group maps.
Discussion and Conclusion
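The idea behind the PSA directions reported in the Results can be sketched as follows: build a sensitivity matrix as the average outer product of the gradient of a class score with respect to the input features, then take its dominant eigenvector. This is a toy sketch under stated assumptions, not the analysis pipeline of the study: `toy_score` is a hypothetical stand-in for a trained classifier output, gradients are estimated by finite differences, and only the leading direction is computed (by power iteration).

```python
import math
import random


def numerical_gradient(f, x, eps=1e-5):
    """Central-difference estimate of the gradient of scalar f at x."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2.0 * eps))
    return grad


def sensitivity_matrix(f, samples):
    """Average outer product of the gradient of f over the samples."""
    d = len(samples[0])
    K = [[0.0] * d for _ in range(d)]
    for x in samples:
        g = numerical_gradient(f, x)
        for i in range(d):
            for j in range(d):
                K[i][j] += g[i] * g[j] / len(samples)
    return K


def matvec(K, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(k * c for k, c in zip(row, v)) for row in K]


def dominant_eigenpair(K, n_iter=200):
    """Power iteration: largest eigenvalue and its unit eigenvector."""
    d = len(K)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(n_iter):
        w = matvec(K, v)
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break
        v = [c / norm for c in w]
    eigenvalue = sum(a * b for a, b in zip(v, matvec(K, v)))
    return eigenvalue, v


def toy_score(x):
    """Hypothetical class score that depends mostly on feature 0."""
    return math.tanh(3.0 * x[0] + 0.1 * x[1])


rng = random.Random(0)
samples = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(200)]
K = sensitivity_matrix(toy_score, samples)
eigenvalue, direction = dominant_eigenpair(K)
# `direction` weights each input feature; here feature 0 dominates by construction.
```

In this toy case the score ignores features 2 and 3, so the leading direction assigns them zero weight, which mirrors how large entries in the leading direction flag the ROIs that matter most for a task.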
We constructed a deep learning framework by stacking multiple dense
layers to extract non-linear features from episodic memory data. The DNN is
highly accurate in classifying different memory tasks by time point; however,
it misclassifies the face task as the word task at a rate of 10.5%. Both face
and word tasks show activation in language regions, such as the left inferior
frontal gyrus and left inferior temporal gyrus. Because each face was paired with a
verbal label (occupation), this overlap likely explains the difficulty in discriminating
the face and word tasks. This DNN framework is also robust against heterogeneity
among subjects since testing was performed on independent subject data.
Acknowledgements
The study is supported by the National Institutes of Health (grant numbers 1R01EB014284 and P20GM109025).
References
[1] Wang et al., 2015. Deep convolutional neural networks for multi-modality isointense infant brain image segmentation. NeuroImage, 108, pp. 214-224.
[2] Suk et al., 2014. Hierarchical feature representation and multimodal fusion with deep learning for AD/MCI diagnosis. NeuroImage, 101(1), pp. 569-582.
[3] Jang et al., 2017. Task-specific feature extraction and classification of fMRI volumes using a deep neural network initialized with a deep belief network: Evaluation using sensorimotor tasks. NeuroImage, 145, pp. 314-328.
[4] Cole et al., 2017. Predicting brain age with deep learning from raw imaging data results in a reliable and heritable biomarker. NeuroImage, 163, pp. 115-124.
[5] Shen et al., 2013. Groupwise whole-brain parcellation from resting-state fMRI data for network node identification.
[6] Jarrett, K., Kavukcuoglu, K., Ranzato, M., LeCun, Y., 2009. What is the best multi-stage architecture for object recognition? In: 2009 IEEE 12th International Conference on Computer Vision. IEEE, pp. 2146-2153.
[7] Zeiler, Matthew D., 2012. ADADELTA: an adaptive learning rate method.
[8] Hinton, G. E., Srivastava, N., Krizhevsky, A., Sutskever, I., Salakhutdinov, R. R., 2012. Improving neural networks by preventing co-adaptation of feature detectors. arXiv preprint arXiv:1207.0580, pp. 1-18.
[9] Koyamada, S., Koyama, M., Nakae, K., Ishii, S., 2014. Principal Sensitivity Analysis. arXiv preprint arXiv:1412.6785, pp. 1-13.