Ren-Horng Wang1, Shu-Yu Huang1, Hsin-Ju Lee2,3, Wen-Jui Kuo3, and Fa-Hsuan Lin2,4,5
1Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei, Taiwan, 2National Taiwan University, Taipei, Taiwan, 3National Yang-Ming University, Taipei, Taiwan, 4Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada, 5Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland
Synopsis
We used fast fMRI sampled at 10 Hz to study fMRI timing in audiovisual
integration. Using the McGurk protocol, we found that the superior temporal
gyrus (STG) showed a significant difference in BOLD time-to-half (TTH) between
McGurk and congruent audiovisual stimulus pairs. The TTH difference between the
congruent and McGurk conditions became progressively more significant from
posterior to anterior regions (p-values at the visual cortex, lateral occipital
lobe, occipital parietal junction, STG, and auditory cortex were 0.13, 0.12,
0.05, 0.02, and 0.01, respectively), suggesting that incongruent audiovisual
stimuli delay the brain response more at regions closer to primary auditory
processing.
INTRODUCTION
Combining information from the auditory and visual modalities leads to better
speech comprehension1,2. Yet an incongruent combination of auditory and visual
information can cause illusions. Specifically, the McGurk effect describes the
auditory percept of the syllable /ta/ or /ha/ when viewing a video clip with
the mouth movement of /ka/ and the sound of /pa/3,4. Previous studies show that
the left superior temporal sulcus (STS) is a critically important brain area
for integrating auditory and visual information5-10. People perceiving the
McGurk illusion have increased BOLD signal in the STS11,12. For McGurk
perceivers, how the BOLD signal differs between congruent, incongruent, and
McGurk stimuli has been less explored.
Most fMRI studies use the amplitude of the BOLD signal to correlate with
behaviors or to contrast between experimental conditions. Yet we recently found
that BOLD signal latency can be more sensitive than BOLD signal amplitude in
differentiating between attentional states13. Accordingly, we hypothesize that
BOLD signal latencies differ between congruent and McGurk stimuli.
METHODS
Eleven participants took part in the study and provided written informed
consent. The experiment was approved by the Institutional Review Board of
National Taiwan University Hospital. Stimuli were short video clips with sound.
The visual component of the video clips was the whole face of a man pronouncing
the syllables /pa/ and /ka/; the auditory component was the same person
uttering the syllables /pa/ and /ka/. Four stimulus conditions were generated
by pairing visual and auditory components: congruent /pa/ (visual /pa/ +
auditory /pa/), congruent /ka/ (visual /ka/ + auditory /ka/), incongruent
non-McGurk (visual /pa/ + auditory /ka/), and incongruent McGurk (visual /ka/ +
auditory /pa/). Each trial consisted of two utterances and lasted 1.8 s.
Participants were asked to push the left button upon hearing /ka/ or /pa/ and
the right button upon hearing /ha/, /ta/, or other percepts. Every run lasted
7 min, and each participant completed three to five runs.
Functional MRI was measured by SMS-InI14 to sample the whole-brain hemodynamic
response at 10 Hz with 5-mm isotropic resolution. Structural MRI data were
obtained using a T1-weighted 3D sequence. Data pre-processing included
slice-timing correction, motion correction, co-registration between functional
and anatomical data, spatial normalization to the MNI space, and spatial
smoothing. All pre-processing steps were done in SPM8 (Wellcome Department,
University College London, UK). We used a general linear model with finite
impulse response (FIR) basis functions to estimate the BOLD response. The
timing of a BOLD waveform was quantified by its onset, time-to-half (TTH),
time-to-peak (TTP), and time-to-half-off (TTHoff).
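As an illustration of how such timing indices could be extracted from an estimated waveform, the following is a minimal sketch and not the analysis code used in this study. It assumes that half-height means 50% of the peak value measured from zero, that TTH and TTHoff are the half-height crossings on the rising and falling edges, and that crossings are linearly interpolated between 10-Hz samples. The function names `timing_indices` and `_crossing_time` are hypothetical. Onset is left out because the text does not state its criterion.

```python
import numpy as np


def _crossing_time(t, w, i, level):
    """Linearly interpolate when w crosses `level` between samples i-1 and i."""
    if i == 0 or w[i] == w[i - 1]:
        return float(t[i])
    frac = (level - w[i - 1]) / (w[i] - w[i - 1])
    return float(t[i - 1] + frac * (t[i] - t[i - 1]))


def timing_indices(waveform, fs=10.0):
    """Return TTH, TTP, and TTHoff (seconds) of a waveform with a positive peak.

    Assumed definitions (not stated in the text): half-height is 50% of the
    peak value measured from zero; TTH is the crossing on the rising edge and
    TTHoff the crossing on the falling edge.
    """
    w = np.asarray(waveform, dtype=float)
    t = np.arange(w.size) / fs  # sample times in seconds; fs = 10 Hz as in the text
    peak = int(np.argmax(w))
    half = 0.5 * w[peak]

    # Rising edge: first sample at or above half-height, searched up to the peak.
    rise = int(np.argmax(w[: peak + 1] >= half))
    tth = _crossing_time(t, w, rise, half)

    # Falling edge: first sample after the peak that drops below half-height.
    below = w[peak:] < half
    if below.any():
        fall = peak + int(np.argmax(below))
        tthoff = _crossing_time(t, w, fall, half)
    else:
        tthoff = float("nan")  # waveform never returns below half-height

    return {"tth": tth, "ttp": float(t[peak]), "tthoff": tthoff}


# Toy triangular waveform (synthetic, not study data).
print(timing_indices([0, 1, 2, 3, 4, 3, 2, 1, 0], fs=10.0))
```

Interpolating between samples keeps the indices from being quantized to the 0.1-s sampling grid, which matters when the latency differences of interest are small.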
The significance of a timing index was evaluated by a permutation test, in
which the waveform labels were randomly shuffled 1,000 times to create an
empirical null distribution.
RESULTS
Figure 1 shows the regions of interest (ROIs) at the visual cortex, lateral
occipital cortex, superior temporal gyrus (STG), and auditory cortex in this
study. These ROIs were defined from the group analysis of the BOLD signal.
Figure 2 shows the BOLD waveforms for the congruent, incongruent, and McGurk
conditions at the STG. The incongruent waveform had a visibly larger amplitude
than the congruent and McGurk waveforms. No clear difference was found between
the congruent and McGurk waveforms. Normalizing all time courses to their
maximal values suggested that the incongruent waveform had a later offset than
the congruent and McGurk waveforms. Permutation test results on BOLD timing are
shown in Figure 3. Only the auditory cortex (Aud), STG, and occipital parietal
junction (OPJ) showed a significant time-to-half (TTH) difference between the
McGurk and congruent conditions. No significant TTH difference between these
two conditions was found at the lateral occipital cortex (LOC) or visual cortex
(Vis).
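A label-shuffling permutation test of this kind can be sketched as follows. This is a hedged illustration rather than the study's implementation: it assumes one timing value per waveform, a difference-of-means test statistic, unpaired shuffling, and a two-sided p-value, none of which is specified in the text. The function name `permutation_pvalue` and the example values are hypothetical.

```python
import numpy as np


def permutation_pvalue(a, b, n_perm=1000, seed=0):
    """Two-sided label-shuffling test on the difference of means of a and b."""
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    observed = a.mean() - b.mean()

    pooled = np.concatenate([a, b])
    null = np.empty(n_perm)
    for k in range(n_perm):
        # Shuffle condition labels by permuting the pooled values.
        shuffled = rng.permutation(pooled)
        null[k] = shuffled[: a.size].mean() - shuffled[a.size:].mean()

    # Add-one correction so the empirical p-value is never exactly zero.
    count = int(np.sum(np.abs(null) >= abs(observed)))
    p = (count + 1) / (n_perm + 1)
    return float(observed), float(p)


# Synthetic TTH values in seconds for 11 hypothetical participants (not study data).
rng = np.random.default_rng(1)
tth_congruent = 3.0 + 0.2 * rng.standard_normal(11)
tth_mcgurk = 3.2 + 0.2 * rng.standard_normal(11)
diff, p = permutation_pvalue(tth_mcgurk, tth_congruent)
print(f"mean TTH difference = {diff:.3f} s, p = {p:.3f}")
```

For a within-subject design, randomly swapping the two condition labels within each participant would be a natural alternative to the unpaired shuffle used here.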
The TTH difference between the congruent and McGurk conditions became
progressively more significant from posterior to anterior ROIs. Specifically,
the p-values at Vis, LOC, OPJ, STG, and Aud were 0.13, 0.12, 0.05, 0.02, and
0.01, respectively.
DISCUSSION
This study revealed that BOLD timing is sensitive to the difference between the
congruent and McGurk conditions. This difference was not found in the
comparison of BOLD signal amplitude. A fine difference in BOLD timing but not
in BOLD amplitude was also reported in our previous fast fMRI study, which
differentiated between attentional states13. Interestingly, we found that the
timing difference became progressively more significant as regions were further
from the visual cortex and closer to the auditory cortex, suggesting that
incongruent audiovisual stimuli delay the brain response more at regions closer
to primary auditory processing. Further electrophysiological data are required
to find the neuronal basis of region-dependent BOLD latency differences
between congruent, incongruent, and McGurk conditions.
Acknowledgements
This work was partially supported by the Ministry of Science and Technology,
Taiwan (103-2628-B-002-002-MY3, 105-2221-E-002-104), the National Health
Research Institutes, Taiwan (NHRI-EX107-10727EI), and the Academy of Finland
(No. 298131).
References
1 Poeppel D., Idsardi W. J. & van Wassenhove V. Philos Trans R Soc Lond B Biol Sci. 2008; 363:1071-1086.
2 Sheppard J. P., Wang J. P. & Wong P. C. PLoS One. 2011; 6:e16510.
3 McGurk H. & MacDonald J. Nature. 1976; 264:746-748.
4 Tiippana K. Front Psychol. 2014; 5:725.
5 Beauchamp M. S., Lee K. E., Argall B. D. et al. Neuron. 2004; 41:809-823.
6 Callan D. E., Jones J. A., Munhall K. et al. J Cogn Neurosci. 2004; 16:805-816.
7 Calvert G. A., Campbell R. & Brammer M. J. Curr Biol. 2000; 10:649-657.
8 Dahl C. D., Logothetis N. K. & Kayser C. J Neurosci. 2009; 29:11924-11932.
9 Miller L. M. & D'Esposito M. J Neurosci. 2005; 25:5884-5893.
10 Noesselt T., Rieger J. W., Schoenfeld M. A. et al. J Neurosci. 2007; 27:11431-11441.
11 Benoit M. M., Raij T., Lin F. H. et al. Hum Brain Mapp. 2010; 31:526-538.
12 Nath A. R. & Beauchamp M. S. Neuroimage. 2012; 59:781-787.
13 Chu Y.-H., Lin J.-F., Wu P.-Y. et al. Proc Intl Soc Magn Reson Med. 2017; 5251.
14 Hsu Y. C., Chu Y. H., Tsai S. Y. et al. Sci Rep. 2017; 7:17019.