0489

Combined fMRS and fMRI During Reinforcement Learning in a Large Cohort at 7T: When Does Cognitive Processing Occur?
Tal Finkelman1, Edna Furman-Haran2, Kristoffer Carl Mikael Aberg3, Rony Paz3, and Assaf Tal1
1Chemical and Biological Physics, Weizmann Institute of Science, Rehovot, Israel, 2Life Sciences Core Facilities, Weizmann Institute of Science, Rehovot, Israel, 3Brain Sciences, Weizmann Institute of Science, Rehovot, Israel

Synopsis

Keywords: Spectroscopy, fMRI (task based), functional MRS

We present multimodal functional MRS-fMRI-behavioral data that demonstrate how the excitatory-inhibitory (E/I) balance changes in the dorsal anterior cingulate cortex (dACC) during a reinforcement learning paradigm. The E/I balance decreases during rest periods between tasks, supporting a consolidation phase that is invisible to BOLD-fMRI. Additionally, during the decision-making game, both GABA and glutamate (Glu) are significantly negatively correlated with the mean z-score of the BOLD signal from the spectroscopic voxel. We suggest that the elevation in Glu is related to cellular activity rather than neuronal activity, indicating a GABAergic activation during the task.

Background

Reinforcement Learning (RL) is a fundamental learning process that involves updating one's beliefs about the environment. RL tracks outcome expectations and updates them when there is a mismatch between actual and predicted outcomes, the so-called prediction error. Prediction error signals correlate with activity in a number of brain regions [1], including the anterior cingulate cortex (ACC) [2]. Magnetic Resonance Spectroscopy (1H-MRS) enables non-invasive measurement of the ratio of glutamate (Glu) to γ-aminobutyric acid (GABA), also known as the excitatory-inhibitory (E/I) balance, which modulates a wide range of cognitive and behavioral processes, including RL [3]. Functional MRS (fMRS) studies of GABA and Glu during rest and task performance have shown that the E/I balance is associated with cognitive control [4] and correlated with the BOLD response [5]. Understanding these dynamics can reveal how BOLD signal changes are regulated, and can help identify neural correlates of cognition that are undetectable by BOLD. Here we present a large cohort study examining the dynamics of both GABA and Glu in the dorsal ACC (dACC) during rest and different RL conditions, and their correlation with the BOLD-fMRI signal.

Methods

83 healthy volunteers (age 27±5 years; 38 females) were scanned on a 7T scanner (Terra, Siemens) with a 1Tx/32Rx head coil (Nova Medical Inc.) while performing a reinforcement learning task (Fig. 1A). MRS data were acquired with a single-voxel SemiLASER sequence (TE = 80 ms, TR = 7 s), which was shown to detect GABA and Glu with good precision [6], and fMRI data were acquired with a multiband gradient-echo EPI sequence (TE = 22.2 ms, TR = 1 s, flip angle = 45°, MB = 5, iPAT = 2) [7,8]. The MRS voxel was positioned in the dACC (40×25×10 mm³; Fig. 1B). During the task, participants chose between two letters with different letter probabilities (LP) of reward (p) and loss (1-p). The task had four conditions: (1) LP 65-35 with positive RL (p = 0.65); (2) LP 50-50 with positive RL (p = 0.5); (3) LP 65-35 with negative RL (1-p = 0.65); (4) LP 50-50 with negative RL (1-p = 0.5). During the fMRI scan, participants played two games, (1) LP 50-50 and (2) LP 65-35; half of the cohort played with negative RL and half with positive RL (Fig. 1C). The absolute concentrations of the metabolites were calculated using LCModel. Metabolite concentrations were corrected for tissue fractions within the voxel and for relaxation times. fMRI data analysis was done using FSL 6.0.4. We used FSL FEAT to generate the decision-making phase contrasts, in which each voxel's z-score reflected how well its BOLD activity correlated with each decision-making stimulus. Trait anxiety was estimated in each participant using Spielberger's State-Trait Anxiety Inventory (STAI). Multiple comparisons were accounted for by applying False Discovery Rate (FDR) corrections. A modified Q-learning model was used to fit the behavioral data and extract the learning rates for positive and negative prediction errors [9].
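The abstract does not spell out the modified Q-learning model, so the following is only a minimal sketch of one common variant: a two-option Q-learner with separate learning rates for positive and negative prediction errors and a softmax choice rule. The function names, the softmax rule, the inverse temperature `beta`, the parameter values, and the number of trials are all illustrative assumptions, not the authors' implementation. The learning score follows the definition in the Figure 2 caption (fraction of correct choices in the last ten trials).

```python
import math
import random


def update_q(q_value, outcome, alpha_pos, alpha_neg):
    """Update one value estimate with a sign-dependent learning rate.

    delta is the prediction error (actual minus expected outcome).
    """
    delta = outcome - q_value
    alpha = alpha_pos if delta >= 0 else alpha_neg
    return q_value + alpha * delta


def simulate_game(p_outcome, outcome, alpha_pos, alpha_neg, beta, n_trials, rng):
    """Simulate one two-letter game (hypothetical helper).

    p_outcome : (p_first, p_second), probability that each letter yields
                `outcome` (e.g. +5 in gain games); otherwise the outcome is 0.
    Returns the list of choices (0 or 1) and the final value estimates.
    """
    q = [0.0, 0.0]
    choices = []
    for _ in range(n_trials):
        # Softmax over two options (assumed choice rule).
        p_first = 1.0 / (1.0 + math.exp(-beta * (q[0] - q[1])))
        choice = 0 if rng.random() < p_first else 1
        received = outcome if rng.random() < p_outcome[choice] else 0.0
        q[choice] = update_q(q[choice], received, alpha_pos, alpha_neg)
        choices.append(choice)
    return choices, q


def learning_score(choices, best_letter, window=10):
    """Fraction of choices of the better letter over the last `window` trials."""
    recent = choices[-window:]
    return sum(1 for c in recent if c == best_letter) / len(recent)


# Example: a 65-35 gain game with assumed parameters and 60 assumed trials.
rng = random.Random(0)
choices, q = simulate_game((0.65, 0.35), 5.0, 0.3, 0.1, 1.0, 60, rng)
print(learning_score(choices, best_letter=0))
```

Fitting such a model to real choices would typically mean maximizing the likelihood of the observed choices over `alpha_pos`, `alpha_neg`, and `beta`; that step is omitted here.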

Results

Behavioral - Participants performed better in the learnable 65-35 games than in the unlearnable 50-50 games (Fig. 2A), indicating that learning had occurred.
fMRS - Glu and GABA at the initial rest were positively correlated with the learning score in the 65-Gain games (p = 0.03 and p = 0.001, respectively; Fig. 2B). Glu concentration during the games was elevated compared to the initial rest, while the E/I balance remained unchanged (Fig. 3). In the rest periods following the games, GABA and Glu increased further and the E/I ratio decreased significantly (except in the rest after the 65-Loss game), in contrast to its stability during the games, which suggests a consolidation process.
fMRI - GABA and Glu levels in the 65-Gain game were negatively correlated with the mean z-score from the dACC spectroscopic voxel in the same game, during the decision-making phase (p = 0.01 and p = 0.006, respectively; Fig. 4). The correlation between Glu and GABA might imply GABAergic neuronal activity. No correlation was found between the mean spectroscopic voxel z-score and the learning score.
Other - Glu levels were positively correlated with trait anxiety scores under all conditions (Fig. 5), which extends earlier evidence of a correlation between anxiety scores and Glu levels at rest [10,11].
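The Methods state that multiple comparisons were handled with False Discovery Rate corrections but do not name the procedure. As a hedged illustration, the sketch below implements the widely used Benjamini-Hochberg step-up adjustment; whether this is the variant the authors applied is an assumption, and the p-values in the example are made up for demonstration rather than taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical uncorrected p-values from four correlation tests.
raw_p = [0.01, 0.04, 0.03, 0.005]
adjusted_p = benjamini_hochberg(raw_p)
significant = [p <= 0.05 for p in adjusted_p]
print(adjusted_p, significant)
```

A correlation would then be reported as significant only if its adjusted p-value falls below the chosen FDR level (0.05 in this example).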

Conclusions

Glu levels were elevated during the RL games compared to the initial rest, whereas the E/I balance remained unchanged. There was a further increase in GABA and Glu during the rest periods following the RL games and, unlike during the games, a decrease in the E/I balance. Our results suggest an increase in Glu during cognitive load, and point to an added role of Glu outside the context of the E/I balance. The further, unexpected increase in GABA in the rest periods between games, in the absence of BOLD activity, might indicate consolidation after RL. Additionally, GABA and Glu were negatively correlated with the mean z-score during the decision phase in the dACC, potentially due to GABAergic neuronal activity. We suggest that the elevation in Glu is related to cellular activity rather than neuronal activity, indicating a GABAergic activation during the task. Our data suggest that dynamic inhibition plays an important role during learning and decision-making.

Acknowledgements

Assaf Tal acknowledges the support of the Monroy-Marks Career Development Fund and the Israeli Science Foundation (personal grant 416/20). Dr. E. Furman-Haran holds the Calin and Elaine Rovinescu Research Fellow Chair for Brain Research. Dr. K.C. Aberg holds the Sam and Frances Belzberg Research Fellow Chair in Memory and Learning. We would like to acknowledge the receipt of the pulse sequences from the Center for Magnetic Resonance Research (CMRR), University of Minnesota, USA, and to acknowledge Edward J. Auerbach, Ph.D. and Małgorzata Marjańska, Ph.D. (CMRR) for the development of the spectroscopy pulse sequence.

References

1. Rushworth, M. F. S. & Behrens, T. E. J. Choice, uncertainty and value in prefrontal and cingulate cortex. Nat. Neurosci. 11, 389–397 (2008).

2. Schultz, W. & Dickinson, A. Neuronal coding of prediction errors. Annu. Rev. Neurosci. 23, 473–500 (2000).

3. Bezalel, V., Paz, R. & Tal, A. Inhibitory and excitatory mechanisms in the human cingulate-cortex support reinforcement learning: A functional Proton Magnetic Resonance Spectroscopy study. Neuroimage 184, 25–35 (2019).

4. Maruyama, S. et al. Sequential Finger-tapping Learning Mediated by the Primary Motor Cortex and Fronto-parietal Network: A Combined MRI-MRS Study. (2021) doi:10.21203/rs.3.rs-197014/v1.

5. Ip, I. B. et al. Combined fMRI-MRS acquires simultaneous glutamate and BOLD-fMRI signals in the human brain. Neuroimage 155, 113–119 (2017).

6. Finkelman, T., Furman-Haran, E., Paz, R. & Tal, A. Quantifying the excitatory-inhibitory balance: A comparison of SemiLASER and MEGA-SemiLASER for simultaneously measuring GABA and glutamate at 7T. Neuroimage 247, 118810 (2022).

7. Mikkelsen, M. et al. Correcting frequency and phase offsets in MRS data using robust spectral registration. NMR Biomed. 33, e4368 (2020).

8. Moeller, S. et al. Multiband multislice GE-EPI at 7 tesla, with 16-fold acceleration using partial parallel imaging with application to high spatial and temporal whole-brain fMRI. Magn. Reson. Med. 63, 1144–1153 (2010).

9. Frank, M. J., Doll, B. B., Oas-Terpstra, J. & Moreno, F. Prefrontal and striatal dopaminergic genes predict individual differences in exploration and exploitation. Nat. Neurosci. 12, 1062–1068 (2009).

10. Strawn, J. R. et al. A Pilot Study of Anterior Cingulate Cortex Neurochemistry in Adolescents with Generalized Anxiety Disorder. Neuropsychobiology 67, 224–229 (2013).

11. Modi, S., Rana, P., Kaur, P., Rani, N. & Khushu, S. Glutamate level in anterior cingulate predicts anxiety in healthy humans: A magnetic resonance spectroscopy study. Psychiatry Res. Neuroimaging 224, 34–41 (2014).

Figures

Figure 1. A) One trial of the RL game paradigm. Each letter has a probability of reward. Two types of letter probabilities were examined: 65%-35% and 50%-50%. After choosing one letter, the participant is presented with the outcome: 0 or +5 in the gain condition games (positive RL); 0 or -5 in the loss condition games (negative RL). B) Voxel position in the dACC in a sample volunteer. C) Scanning protocol. The order of games was randomized within the MRS and fMRI acquisitions, as was the order of the fMRI and MRS acquisitions.

Figure 2. A) Learning performance, as quantified by the relative fraction of correct responses over a ten-trial window. Dynamic curves are shown for the 65-35 and 50-50 games in the positive RL (Gain condition) and negative RL (Loss condition). The model-fitted performance is presented together with the actual performance. The shaded regions correspond to the standard deviation across subjects. B) Correlation of GABA and Glu at rest with the modeled learning scores (percent of correct choices in the last ten trials) in the 65-Gain condition.

Figure 3. Mean concentration in the different scanning conditions. A) GABA. B) Glu. C) E/I balance. The error bars represent the standard error of the mean. Significant differences between conditions are marked. *: p<0.05, **: p<0.01, ***: p<0.0001.

Figure 4. fMRI measurements: mean z-score of the 65-Gain game during the stimuli, when the decision took place, plotted against GABA, Glu, and the E/I balance during the 65-Gain game. The correlation was calculated using linear model fitting. Significant correlations are marked with a bold red fitting line. All correlations are FDR-corrected. Both GABA and Glu are negatively correlated with the mean z-score, suggesting that the elevation in Glu is related to cellular activity rather than neuronal activity.

Figure 5. Glu correlations with State-Trait Anxiety Inventory (STAI) scores. The correlation was calculated using linear model fitting. Glu levels are positively correlated with anxiety scores, and the correlation is stronger during the task and the following rest periods than during the initial rest.

Proc. Intl. Soc. Mag. Reson. Med. 31 (2023)
0489
DOI: https://doi.org/10.58530/2023/0489