Matteo Ferrante1, Tommaso Boccato2, Furkan Ozcelik3, Rufin VanRullen4, and Nicola Toschi2
1Biomedicine and prevention, University of Rome Tor Vergata, Rome, Italy, 2University of Rome Tor Vergata, Rome, Italy, 3CerCo, University of Toulouse III Paul Sabatier, Toulouse, France, 4CNRS, CerCo, ANITI, TMBI, Univ. Toulouse, Toulouse, France
Synopsis
Keywords: AI Diffusion Models, fMRI (task based), brain decoding
Motivation: Brain decoding has been limited by the need for large amounts of data and subject-specific methodologies. Current techniques require extensive scanning, which is costly and time-consuming, restricting their applicability.
Goal(s): The study aims to establish a novel, more efficient approach for cross-subject brain decoding of visual stimuli.
Approach: Using the Natural Scenes Dataset (NSD), we applied ridge regression (L2-regularized linear regression) to align brain activity across subjects based on their responses to common stimuli, and employed the state-of-the-art Brain-Diffuser pipeline for decoding and image reconstruction.
Results: The ridge regression alignment method surpassed the other alignment methods, enabling consistent cross-subject decoding with significantly less data and demonstrating the feasibility of a potential 90% reduction in scan time.
Impact: A reliable technique for cross-subject, cross-scanner and cross-field-strength alignment can pave the way for efficient brain decoding without the need for extensive data collection and/or ultra-high field strengths.
Introduction
Brain decoding, a cornerstone of modern
neuroscience, seeks to decipher the intricate neural patterns underpinning
cognitive functions. Within this domain, functional magnetic resonance imaging (fMRI)
has proven invaluable, especially for decoding visual stimuli [1,2,3,4]. By associating
neural patterns with the latent space of deep learning models, several works have
shown the value of using fMRI to predict or reconstruct visual experiences
based on neural responses exclusively.
However, current methodologies are often
tailored to individual subjects and require very large amounts of data, resulting
in prolonged (>24 h) and costly scanning. This subject-centric and
data-intensive approach has severely restricted the broader applicability of
brain decoding. In this context, so-called functional data alignment techniques,
aimed at using models trained on one subject to decode other subjects' data,
are being developed.
Here, we propose a novel alignment approach which
sets new benchmarks in cross-subject brain decoding for visual stimuli. Our
approach is scalable and can reduce the amount of data needed by as much as 90%,
paving the way for broader applicability across varied fields and subjects. We
also demonstrate cross-dataset decoding showing how visual stimuli can be
reconstructed across different subjects, magnetic field strengths and scanners.
Methods
We leverage the Natural Scenes Dataset (NSD)
[4], comprising fMRI data from four subjects exposed to 10,000 natural images. Of these images, 1,000
were shared across subjects and were used to devise our functional
alignment procedure. The subjects participated in several 7T fMRI scanning sessions
(TR=1.6s, 1.8mm isotropic voxel), where distinct natural images from the COCO
dataset were presented for 2 seconds each (1 second interval). GLMsingle [5]
was used to extract task-related voxel-wise activations, also resulting in visual
cortex masks (~14,000 voxels/subject).
In our approach, subject-wise activations
in response to the same stimulus are aligned using a linear regression model with
L2 regularization (ridge regression). As baselines, we employed 1) anatomical alignment through
T1-based coregistration, and 2) functional hyperalignment, which optimally
aligns local activity patterns across subjects, designating one as a
"template."
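The ridge-based alignment described above can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the regularization strength `alpha`, the use of the dual (kernel) form of the ridge solution, and the synthetic arrays standing in for voxel activations are all hypothetical. It assumes that responses of a source subject to the shared images are mapped onto the responses of the template subject to the same images.

```python
import numpy as np

def fit_ridge_alignment(X_src, X_tgt, alpha=1.0):
    """Fit W, b so that X_src @ W + b approximates X_tgt with an L2 penalty on W.

    X_src: (n_shared_images, n_voxels_source) responses of the subject to align.
    X_tgt: (n_shared_images, n_voxels_template) responses of the template subject.
    alpha: regularization strength (hypothetical value; would need tuning).
    """
    mu_src, mu_tgt = X_src.mean(axis=0), X_tgt.mean(axis=0)
    Xc, Yc = X_src - mu_src, X_tgt - mu_tgt
    n = Xc.shape[0]
    # Dual form W = Xc^T (Xc Xc^T + alpha I)^-1 Yc: only an n x n system is
    # solved, which is convenient when there are far fewer images than voxels.
    dual = np.linalg.solve(Xc @ Xc.T + alpha * np.eye(n), Yc)
    W = Xc.T @ dual
    b = mu_tgt - mu_src @ W
    return W, b

def apply_alignment(X_src, W, b):
    """Project source-subject activations into the template subject's voxel space."""
    return X_src @ W + b

# Synthetic stand-in data (not real fMRI): 200 shared images, small voxel counts.
rng = np.random.default_rng(0)
X_other = rng.standard_normal((200, 80))
true_map = rng.standard_normal((80, 70)) / np.sqrt(80)
X_template = X_other @ true_map + 0.1 * rng.standard_normal((200, 70))

W, b = fit_ridge_alignment(X_other[:160], X_template[:160], alpha=1.0)
aligned = apply_alignment(X_other[160:], W, b)  # held-out images, template space
print(aligned.shape)
```

In a setting like the one described here, the aligned activations would then be passed to a decoder trained only on the template subject; the decoder itself is outside the scope of this sketch.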
We employed the Brain-Diffuser decoding pipeline [2]
as the decoder, training it to decode
visual stimuli on NSD Subj01 exclusively and then using it to decode the aligned
activity of the other subjects. This pipeline linearly projects brain activity into
the latent space of pretrained models such as CLIP and VD-VAE, subsequently using
VersatileDiffusion for image reconstruction from neural activity. We
benchmarked our approach against the other alignment methods, also varying the proportion
of shared data, using both qualitative evaluation and quantitative metrics such as PixCorr, SSIM
and CLIP 2-way accuracy.
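Two of the quantitative metrics named above can be sketched in a few lines. The sketch below is an assumption about how such metrics are commonly computed, not the evaluation code used in this work: `pixcorr` is taken to be the mean Pearson correlation between flattened reconstructed and ground-truth images, and `two_way_accuracy` is taken to be a two-way identification score on feature vectors (which, for the CLIP variant, would come from a CLIP image encoder; random arrays stand in for them here). SSIM is omitted for brevity.

```python
import numpy as np

def pixcorr(recon, target):
    """Mean Pearson correlation between flattened reconstructions and targets.

    recon, target: arrays of shape (n_images, ...) with matching shapes.
    Images with zero variance would yield NaN and are not handled here.
    """
    r = recon.reshape(len(recon), -1).astype(float)
    t = target.reshape(len(target), -1).astype(float)
    r -= r.mean(axis=1, keepdims=True)
    t -= t.mean(axis=1, keepdims=True)
    num = (r * t).sum(axis=1)
    den = np.linalg.norm(r, axis=1) * np.linalg.norm(t, axis=1)
    return float(np.mean(num / den))

def two_way_accuracy(feat_recon, feat_true):
    """Fraction of (i, j != i) pairs in which reconstruction i correlates more
    strongly with its own target i than with a distractor target j."""
    n = len(feat_recon)
    # Top-right block of the stacked correlation matrix: recon rows vs true rows.
    c = np.corrcoef(feat_recon, feat_true)[:n, n:]
    wins = c.diagonal()[:, None] > c  # diagonal entries compare as False
    return float(wins.sum() / (n * (n - 1)))

# Synthetic stand-ins for images and encoder features (hypothetical shapes).
rng = np.random.default_rng(0)
images = rng.random((10, 32, 32, 3))
noisy = images + 0.2 * rng.standard_normal(images.shape)
features = rng.standard_normal((10, 64))
print(pixcorr(noisy, images))
print(two_way_accuracy(features + 0.5 * rng.standard_normal((10, 64)), features))
```

Under this definition, chance level for the two-way score is 0.5, so values close to 1 indicate that reconstructions are reliably closer to their own stimulus than to other stimuli.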
We also conducted a cross-dataset decoding
experiment using the BOLD5000 [6] dataset, which differs in acquisition
protocol from NSD but contains some common images used as stimuli. The
experiment centered on BOLD5000's CSI1 subject, who shares 1,000 images with the NSD subjects.
Both qualitative and quantitative results
are reported to evaluate decoding quality across datasets.
Results
The results indicate that the ridge regression-based alignment method
outperformed the other methods, especially when only a fraction of the shared data
between subjects was used. We demonstrated that aligning brain activity across subjects
is feasible and that it is possible to match the qualitative and
quantitative performance of within-subject decoding using only the 1,000 shared images
rather than the 10,000 images shown to each subject, hence reducing
scan time by 90%. This was evident in the quality of the decoded
images, which remained consistent irrespective of the subjects chosen for
training and alignment. The images correctly and consistently reproduced high-level
content and foundational shapes across varying subject combinations. Moreover, we
show experimentally that cross-field, cross-machine and cross-paradigm decoding
is feasible.
Discussion
This study highlights the potential of a simple
alignment method (ridge regression) for streamlining the brain decoding process;
it performed better than the other linear and nonlinear alignment methods considered. This
removes the need to repeat most of the experiment when a new subject is
introduced, rendering brain decoding much more affordable in terms of time and
resources. We also showed that anatomical alignment, which relies on matching
brain structure for alignment and decoding, underperforms due to the inherent
anatomical variability across individuals, which may not correspond
to functional inter-subject variability.
Conclusions
Our
research demonstrates that our approach facilitates cross-subject brain
decoding, suggesting a potential reduction of up to 90% in scan time for
subjects other than the “template”. The approach is also able to generalize
across datasets and/or field strengths, significantly lowering the data quality
and quantity requirements for successful brain decoding algorithms.
Acknowledgements
This work was supported by NEXTGENERATIONEU (NGEU) and funded by the Italian Ministry of University and Research (MUR), National Recovery and Resilience Plan (NRRP), project MNESYS (PE0000006) (to NT) – A Multiscale integrated approach to the study of the nervous system in health and disease (DN. 1553 11.10.2022); by the MUR-PNRR M4C2I1.3 PE6 project PE00000019 Heal Italia (to NT); by the NATIONAL CENTRE FOR HPC, BIG DATA AND QUANTUM COMPUTING, within the spoke "Multiscale Modeling and Engineering Applications" (to NT); the EXPERIENCE project (European Union’s Horizon 2020 Research and Innovation Programme under grant agreement No. 101017727); the CROSSBRAIN project (European Union’s European Innovation Council under grant agreement No. 101070908).
References
[1] Chen, Z., Qing, J., Xiang, T., Yue, W.L., Zhou, J.H.: Seeing beyond the brain: Conditional diffusion model with sparse masked modeling for vision decoding (2022)
[2] Ozcelik, F., VanRullen, R.: Brain-Diffuser: Natural scene reconstruction from fMRI signals using generative latent diffusion (2023)
[3] Ferrante, M., Ozcelik, F., Boccato, T., VanRullen, R., Toschi, N.: Brain captioning: Decoding human brain activity into images and text (2023)
[4] Ferrante, M., Boccato, T., Toschi, N.: Semantic brain decoding: from fMRI to conceptually similar image reconstruction of visual stimuli (2023)
[4] Allen, E.J., St-Yves, G., Wu, Y., Breedlove, J.L., Prince, J.S., Dowdle, L.T., Nau, M., Caron, B., Pestilli, F., Charest, I., Hutchinson, J.B., Naselaris, T., Kay, K.: A massive 7T fMRI dataset to bridge cognitive neuroscience and artificial intelligence. Nature Neuroscience 25(1), 116–126 (Jan 2022). https://doi.org/10.1038/s41593-021-00962-x
[5] Prince, J.S., Charest, I., Kurzawski, J.W., Pyles, J.A., Tarr, M., Kay, K.N.: Improving the accuracy of single-trial fMRI response estimates using GLMsingle. eLife (2022)
[6] Chang, N., Pyles, J.A., Marcus, A., Gupta, A., Tarr, M.J., Aminoff, E.M.: BOLD5000, a public fMRI dataset while viewing 5000 visual images. Scientific Data 6(1), 49 (May 2019). https://doi.org/10.1038/s41597-019-0052-3