Mario Serrano-Sosa1, Jared X. Van Snellenberg1,2,3, and Chuan Huang1,2,4
1Biomedical Engineering, Stony Brook University, Stony Brook, NY, United States, 2Psychiatry, Stony Brook Medicine, Stony Brook, NY, United States, 3Psychology, Stony Brook University, Stony Brook, NY, United States, 4Radiology, Stony Brook Medicine, Stony Brook, NY, United States
Synopsis
Interpretable Deep Learning (DL) models are the next step in establishing DL prediction models as accepted tools that give researchers data-driven methods for understanding neuroimaging data. In this work, we developed two interpretable DL models that predict Working Memory (WM) scores from task fMRI data in order to assess the neural circuitry pertaining to WM. The first, a traditional Convolutional Neural Network (CNN)(1-3), took fMRI activation data from the cortical vertices as a single image; the second took cortical activation data from the two hemispheres as separate channels. Overall, the interpretable DL model provided high-quality saliency maps that potentially display novel regions pertaining to WM.
Introduction
Working Memory (WM) is the temporary maintenance of information in a state of heightened accessibility, such that it can be actively manipulated and subjected to other higher-order cognitive control processes. This type of cognition has been linked to fluid intelligence and is thought to be impaired in many psychiatric and neurological conditions. Standard generalized linear modeling of functional magnetic resonance imaging (fMRI) task data has identified regions activated during WM tasks. However, it is also important to measure non-linear associations between fMRI task activation and task performance, as these might uncover neural processes in WM that standard regression and correlational analyses cannot.
Deep Learning (DL) provides non-linear analysis of data, which is highly useful for complex neuroimaging data. This non-linear analysis may improve understanding of the basic computational circuits and mechanisms of higher-order cognition in humans. To visualize the non-linear analyses afforded by the DL network, it is imperative to develop reliable saliency maps that highlight the data most important for prediction, thereby providing researchers with a data-driven tool that can potentially identify novel neural correlates of cognition and WM in fMRI data. For cortical fMRI activation maps, both left and right cortical activations should be fed into the network so that predictions are contextualized across the whole cortex. These activations can be supplied as separate channels, or they can be concatenated and input as a single image.
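The two input layouts described above can be illustrated with a short NumPy sketch. This is only an illustration under stated assumptions: the 64 x 64 grid size, the side-by-side concatenation axis, and the function names `as_single_image` and `as_channels` are hypothetical and are not taken from this work.

```python
import numpy as np

# Hypothetical 2D grid size; the abstract does not state the image resolution.
H, W = 64, 64
rng = np.random.default_rng(0)
left = rng.standard_normal((H, W)).astype(np.float32)   # stand-in for left-hemisphere activation
right = rng.standard_normal((H, W)).astype(np.float32)  # stand-in for right-hemisphere activation


def as_single_image(left, right):
    """Concatenate hemispheres side by side into one single-channel image.

    Output shape is (1, H, 2W). The concatenation axis is an assumption.
    """
    return np.concatenate([left, right], axis=1)[np.newaxis, ...]


def as_channels(left, right):
    """Stack hemispheres as two channels. Output shape is (2, H, W)."""
    return np.stack([left, right], axis=0)


single = as_single_image(left, right)
channels = as_channels(left, right)
print(single.shape, channels.shape)  # (1, 64, 128) (2, 64, 64)
```

Both layouts carry the same values; they differ only in whether a convolutional kernel sees the hemispheres as spatial neighbors in one image or as aligned channels at the same pixel location.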
In this work, we develop an interpretable DL model to predict WM subconstruct scores using fMRI data from a WM task. Kernel Ridge Regression (KRR) is used for comparison. However, KRR does not allow for interpretability, as it cannot generate saliency maps. Therefore, we compare two interpretable DL models: our proposed CNN, which takes concatenated left and right cortical activations as a single image, and a similar network that takes the hemispheres as channels. This comparison allows us to measure the performance and the degree of interpretability afforded by both methods. Saliency maps based on model prediction from the interpretable DL model are assessed.
Methods
420 fMRI datasets from the Human Connectome Project (HCP)(4) were used in the development of the interpretable DL model. Cortical activation maps from a 2-back WM task were extracted for the individual subjects. These data were stored in CIFTI format, which contains ~32K vertices for each hemisphere (left and right). The ~32K vertices of each hemisphere were linearly interpolated into a 2D image (Figure 1).
The network used in this work followed a strategy similar to that of the VGG network, consisting of several convolutional blocks as shown in Figure 2. We considered two networks: a traditional Convolutional Neural Network (CNN) with left and right cortical fMRI activations in a single image (CNNS), and a network with left and right cortical fMRI activations as channels (CNNC). The 420 subjects were split into 5 subsets to perform 5-fold cross-validation and obtain saliency maps for each subject in the corresponding fold. The quantitative metrics used for assessment were mean absolute error (MAE), mean absolute percent error (MAPE), root mean square error (RMSE), and R2.
Results
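For readers who want to reproduce the comparison in Table 1, the four metrics and the 5-fold split described in the Methods can be sketched as follows. This is a minimal sketch and not the authors' code. It assumes that R2 denotes the coefficient of determination (the abstract does not say whether it is this or a squared correlation), that no target score is zero (required for MAPE), and that folds are formed by a random permutation. The names `regression_metrics` and `kfold_indices` are hypothetical.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return MAE, MAPE (in percent), RMSE, and R2 for predicted scores."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    mae = np.mean(np.abs(err))
    # MAPE assumes every true score is non-zero.
    mape = 100.0 * np.mean(np.abs(err) / np.abs(y_true))
    rmse = np.sqrt(np.mean(err ** 2))
    # R2 as coefficient of determination (an assumption, see lead-in).
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MAPE": mape, "RMSE": rmse, "R2": r2}


def kfold_indices(n_subjects, n_folds=5, seed=0):
    """Split subject indices into n_folds disjoint test subsets."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_subjects), n_folds)


# Toy scores for illustration only; these are not data from the study.
metrics = regression_metrics([80, 90, 100, 110], [82, 88, 103, 107])
folds = kfold_indices(420, n_folds=5)
print(metrics)
print([len(f) for f in folds])
```

With 420 subjects and 5 folds, each test subset holds 84 subjects, and every subject appears in exactly one test subset, which is what allows a saliency map to be produced for each subject from a model that did not train on that subject.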
CNNS outperformed KRR on all quantitative metrics, as shown in Table 1: MAE (6.588 vs 6.665), MAPE (7.50% vs 7.67%), and RMSE (8.46 vs 8.60), where smaller is better, and R2 (0.389 vs 0.373), where larger is better. CNNC did not outperform KRR (Table 1). Saliency maps generated from CNNS are shown in Figure 4.
Discussion
We have developed an interpretable deep learning model to predict WM scores using cortical activation maps from a 2-back task. We also optimized two CNN models, one with cortical fMRI activations in a single image and another with cortical activations as channels. Both models were compared to a KRR model. Our proposed CNNS performed the best across all quantitative metrics. KRR was the winning method in a previous neurocognitive prediction challenge(5); however, that entry used cortical thickness from T1-weighted imaging as input. Our results show that CNNS provides more accurate prediction of WM scores than KRR when fMRI cortical activation maps are used as input data.
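For reference, the KRR baseline discussed above has a closed-form fit, which can be sketched in a few lines of NumPy. This is a generic illustration under stated assumptions and not the configuration used in this work: the RBF kernel, the values of `lam` and `gamma`, the absence of an intercept term, and the function names are all hypothetical.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """RBF kernel exp(-gamma * ||a - b||^2) between rows of A and rows of B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq, 0.0))


def krr_fit(X, y, lam=1.0, gamma=1e-3):
    """Solve (K + lam * I) alpha = y for the dual coefficients alpha."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)


def krr_predict(X_train, alpha, X_new, gamma=1e-3):
    """Predict as a kernel-weighted sum over the training subjects."""
    return rbf_kernel(X_new, X_train, gamma) @ alpha


# Toy data for illustration only: 20 "subjects" with 5 features each.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 5))
y = np.sin(X.sum(axis=1))
alpha = krr_fit(X, y, lam=1e-6, gamma=0.5)
y_hat = krr_predict(X, alpha, X, gamma=0.5)
```

Each prediction is a weighted sum of kernel similarities to all training subjects, so the fitted model has no per-vertex weights or input gradients that are inspected in the way a CNN saliency map is, which is consistent with the interpretability limitation noted for KRR.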
Moreover, since the CNN models can generate attention maps and take cortical fMRI activation directly as input, they were able to create interpretable maps that pinpoint the specific cortical regions most predictive of WM. As shown in Figure 4, the saliency maps highlight voxels that agree well with the known neurocircuitry engaged by working memory tasks(6), further indicating that interpretable deep learning-based predictive models can feasibly identify the mechanisms and neurocircuits responsible for a given task. The saliency maps also show interesting results, such as activation in the somatomotor cortex, possibly indicating a novel region associated with WM. Further analysis using the generated saliency maps is warranted.
Conclusion
This work has utilized a convolutional neural network to predict WM scores from task fMRI activation maps. This network outperformed other methods, including KRR, the method that previously won a neurocognitive prediction challenge, and produced attention maps that highlight the regions most important for WM prediction. Therefore, our network provides a step toward interpretable DL networks that can be used as an analytical tool to further understand working memory and cognition.
Acknowledgements
This work was in part supported by R01 MH120293.
References
1. Spuhler KD, Ding J, Liu C, Sun J, Serrano‐Sosa M, Moriarty M, Huang C. Task‐based assessment of a convolutional neural network for segmenting breast lesions for radiomic analysis. Magnetic resonance in medicine. 2019. PubMed PMID: 30957936; PMCID: 6510591.
2. Xu J, Gong E, Pauly J, Zaharchuk G. 200x Low-dose PET reconstruction using deep learning. arXiv preprint. 2017.
3. Cui J, Gong K, Guo N, Wu C, Meng X, Kim K, Zheng K, Wu Z, Fu L, Xu B, Zhu Z, Tian J, Liu H, Li Q. PET image denoising using unsupervised deep learning. European Journal of Nuclear Medicine and Molecular Imaging. 2019;46(13):2780-9. doi: 10.1007/s00259-019-04468-4.
4. Van Essen DC, Ugurbil K, Auerbach E, Barch D, Behrens T, Bucholz R, Chang A, Chen L, Corbetta M, Curtiss SW. The Human Connectome Project: a data acquisition perspective. Neuroimage. 2012;62(4):2222-31.
5. Mihalik A, Brudfors M, Robu M, Ferreira FS, Lin H, Rau A, Wu T, Blumberg SB, Kanber B, Tariq M, editors. ABCD Neurocognitive Prediction Challenge 2019: predicting individual fluid intelligence scores from structural MRI using probabilistic segmentation and kernel ridge regression. Challenge in Adolescent Brain Cognitive Development Neurocognitive Prediction; 2019: Springer.
6. Curtis CE, D'Esposito M. Persistent activity in the prefrontal cortex during working memory. Trends in cognitive sciences. 2003;7(9):415-23.