3901

FMRI response representation associated with lower space structure of scene perception
Yun Qin1, Xiaole Zhao1, and Tao Zhang1
1University of Electronic Science and Technology of China, Chengdu, China

Synopsis

This study examined the similarity of fMRI response patterns in scene-selective regions in relation to scene category and low-level spatial structure. In addition, a similar hierarchy was found between fMRI responses and convolutional neural network features from different layers. The results may provide new insights into the processing of low-level and high-level visual information, as well as into performance-optimized hierarchical models.

Introduction

The scene-selective regions (SSRs) of the brain play an important role in navigating and identifying the visual environment 1. Previous neuroimaging studies have identified specific brain regions involved in scene perception, such as the parahippocampal place area (PPA), lateral occipital complex (LOC) and retrosplenial complex (RSC) 2, 3. Although these regions are sensitive to scene category, how they accomplish their diverse functions in scene identification remains unclear. In this study, to characterize how SSR responses relate to low-level spatial frequency content and high-level semantic category, we used representational similarity analysis to compare regional response patterns with the low-level spatial structure and the category of the stimuli, quantifying the correlation with each of the two attributes. Moreover, the relationship between human vision and a convolutional neural network (CNN) was examined to explore the hierarchy of the human visual system in scene processing and the neurophysiological basis of CNNs.

Methods

Images from four scene categories (mountains, forests, cities and streets), each in three versions (a full-spectrum unfiltered version, a low spatial frequency version and a high spatial frequency version), were used as visual stimuli during fMRI scanning. An additional localizer experiment was also conducted 2. After fMRI data preprocessing, the fMRI response to each stimulus condition was estimated with a generalized linear model, and the PPA, LOC, RSC, occipital place area (OPA) and primary visual area (V1) were identified. A representational dissimilarity matrix (RDM) was obtained for each region by comparing the activity patterns associated with each pair of experimental conditions. Model RDMs describing the stimulus conditions were also computed, covering low-level properties (the spatial frequency content and the GIST descriptor of the images) and scene category 4, 5. To investigate the relationship between the scene-selective brain regions and the CNN, the RDMs of brain activity were correlated with the RDMs of features from a classic CNN.
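The RDM construction and model comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the abstract does not name the dissimilarity measure or the correlation statistic, so correlation distance (1 - Pearson r) and Spearman rank correlation are assumed here, the voxel patterns are synthetic, and the function names are hypothetical. The GIST model RDM is omitted because it requires the stimulus images.

```python
import numpy as np

# Design taken from the text: 4 categories x 3 spatial-frequency versions = 12 conditions.
categories = np.repeat(np.arange(4), 3)    # 0,0,0,1,1,1,...
frequencies = np.tile(np.arange(3), 4)     # 0,1,2,0,1,2,... (full, low, high)


def pattern_rdm(patterns):
    """Correlation-distance RDM (assumed measure) from a conditions x voxels array."""
    rdm = 1.0 - np.corrcoef(patterns)
    np.fill_diagonal(rdm, 0.0)
    return rdm


def label_rdm(labels):
    """Binary model RDM: 0 where two conditions share a label, 1 otherwise."""
    labels = np.asarray(labels)
    return (labels[:, None] != labels[None, :]).astype(float)


def rank_average(x):
    """Ranks starting at 1, with tied values given their average rank."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="mergesort")
    ranks = np.empty(len(x))
    ranks[order] = np.arange(1, len(x) + 1)
    _, inverse, counts = np.unique(x, return_inverse=True, return_counts=True)
    sums = np.bincount(inverse, weights=ranks)
    return sums[inverse] / counts[inverse]


def compare_rdms(rdm_a, rdm_b):
    """Spearman correlation (assumed statistic) over the upper triangles of two RDMs."""
    iu = np.triu_indices_from(rdm_a, k=1)
    return float(np.corrcoef(rank_average(rdm_a[iu]), rank_average(rdm_b[iu]))[0, 1])


# Synthetic, category-driven region: one prototype pattern per category plus noise.
rng = np.random.default_rng(0)
prototypes = rng.standard_normal((4, 200))
roi_patterns = prototypes[categories] + 0.5 * rng.standard_normal((12, 200))

roi_rdm = pattern_rdm(roi_patterns)
r_category = compare_rdms(roi_rdm, label_rdm(categories))
r_frequency = compare_rdms(roi_rdm, label_rdm(frequencies))
print(f"category model: {r_category:.2f}, frequency model: {r_frequency:.2f}")
```

With real data, `roi_patterns` would hold one response estimate per condition for the voxels of a region. `scipy.stats.spearmanr` computes the same rank correlation if SciPy is available.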

Results

The SSRs (PPA, RSC, OPA and LOC) and V1 were identified from the functional localizer experiment. In Figure 1, the tree diagrams illustrate the RDMs of brain activity in these regions across all pairs of conditions 5, 6. In PPA, OPA, RSC and LOC, brain activity was more similar for images of the same category, regardless of the spatial frequency version. In contrast, V1 was more sensitive to changes in spatial frequency than to category. In addition, the dissimilarity of brain activity in V1 was significantly correlated with both spatial frequency and the GIST descriptor (P<0.01) (Figure 2). On the other hand, activity in PPA, RSC, OPA and LOC was significantly correlated with category (P<0.01). The association analysis between the SSRs and the CNN showed that PPA and OPA were significantly correlated with the features in the last three fully connected layers (P<0.01) (Figure 4). RSC showed a progressively increasing correlation across CNN layers, reaching its largest correlation in the final layer. V1 was significantly correlated with the first five layers of the CNN, and this correlation diminished in the following three layers. LOC showed a relatively stable correlation with most CNN layers, which suggests an integrative role of LOC in visual information processing.
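A tree view of an RDM, as in Figure 1, can be produced by agglomerative clustering of the conditions. The sketch below is illustrative only: the abstract does not state the linkage rule, so average linkage is assumed, the patterns are synthetic and category-driven, and `average_linkage` is a hypothetical helper. Condition labels follow the _F, _L and _H suffixes of the Figure 1 caption.

```python
import numpy as np


def average_linkage(rdm, n_clusters=1):
    """Agglomerative clustering with average linkage (assumed rule) on a square RDM.

    Returns the remaining clusters (lists of condition indices) and the merge
    history as (members_a, members_b, mean_dissimilarity) tuples.
    """
    clusters = [[i] for i in range(rdm.shape[0])]
    history = []
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Mean dissimilarity between all members of the two clusters.
                d = rdm[np.ix_(clusters[a], clusters[b])].mean()
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        history.append((clusters[a], clusters[b], float(d)))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return clusters, history


# Condition labels: category name plus version suffix (F = full, L = low, H = high).
category_names = ["mountain", "forest", "city", "street"]
labels = [f"{c}_{v}" for c in category_names for v in ("F", "L", "H")]
categories = np.repeat(np.arange(4), 3)

# Synthetic, category-driven patterns (not real fMRI data).
rng = np.random.default_rng(0)
prototypes = rng.standard_normal((4, 200))
patterns = prototypes[categories] + 0.5 * rng.standard_normal((12, 200))
rdm = 1.0 - np.corrcoef(patterns)
np.fill_diagonal(rdm, 0.0)

# Cut the tree at four clusters and list the conditions in each branch.
clusters, history = average_linkage(rdm, n_clusters=4)
for members in clusters:
    print(sorted(labels[i] for i in members))
```

For a category-sensitive region, the three frequency versions of each category should fall into the same branch. For plotting a full dendrogram, `scipy.cluster.hierarchy.linkage` and `dendrogram` are the usual tools.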

Discussion

The hierarchical nature of the visual system in both the dorsal and ventral pathways provides the basis for the representation of scenes and objects 2, 7. Visual information processing starts with the collection of low-level features, such as spatial frequency, color and orientation; semantically higher-level concepts, such as category, are then accessed through neural computations that integrate these features 8. The SSRs play an important role in integrating visual information for navigating our world. To investigate how the dissimilarity of brain responses relates to low-level and higher-level properties of images, this study used spatial frequency and category as the stimulus conditions 9, 10. The results showed that PPA, OPA, RSC and LOC were more sensitive to the category of the stimuli, regardless of the spatial frequency version. In contrast, V1 was more sensitive to changes in spatial frequency and to the shape of the spatial envelope than to category. This result is in line with the hierarchy of the visual system. V1, as an early stage in the visual pathway, has the capacity to extract low-level properties of visual stimuli, whereas high-level stimulus properties such as semantic representations are involved in later stages of neural processing. Probing the relationship between fMRI response representations and CNN representations can provide new insights into performance-optimized hierarchical models, as well as into the processing of high-level cognitive knowledge 11, 12.
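The comparison between fMRI and CNN representations can be sketched as a layer-by-layer RDM correlation. This is a hedged illustration, not the authors' analysis: the abstract names neither the network nor the significance test, so an eight-layer network (five layers followed by three, as the Results suggest), Spearman correlation and a condition-label permutation test are all assumptions. The layer features and region patterns are synthetic, and the function names are hypothetical.

```python
import numpy as np


def pattern_rdm(patterns):
    """Correlation-distance RDM (assumed measure) from a conditions x features array."""
    rdm = 1.0 - np.corrcoef(patterns)
    np.fill_diagonal(rdm, 0.0)
    return rdm


def rank_average(x):
    """Ranks starting at 1, with tied values given their average rank."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="mergesort")
    ranks = np.empty(len(x))
    ranks[order] = np.arange(1, len(x) + 1)
    _, inverse, counts = np.unique(x, return_inverse=True, return_counts=True)
    return np.bincount(inverse, weights=ranks)[inverse] / counts[inverse]


def spearman(a, b):
    return float(np.corrcoef(rank_average(a), rank_average(b))[0, 1])


def rdm_permutation_test(rdm_a, rdm_b, n_perm=500, seed=0):
    """Spearman correlation of two RDMs with a one-sided permutation p-value.

    The condition order of rdm_a is shuffled (rows and columns together) to
    build the null distribution; this test is an assumption, not from the source.
    """
    rng = np.random.default_rng(seed)
    iu = np.triu_indices_from(rdm_a, k=1)
    observed = spearman(rdm_a[iu], rdm_b[iu])
    exceed = 0
    for _ in range(n_perm):
        p = rng.permutation(rdm_a.shape[0])
        if spearman(rdm_a[np.ix_(p, p)][iu], rdm_b[iu]) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


def layer_profile(roi_patterns, layer_features):
    """(correlation, p-value) between one region's RDM and each layer's RDM."""
    roi_rdm = pattern_rdm(roi_patterns)
    return [rdm_permutation_test(roi_rdm, pattern_rdm(f)) for f in layer_features]


# Synthetic stand-ins: layer features drift from frequency-driven to category-driven.
rng = np.random.default_rng(1)
categories = np.repeat(np.arange(4), 3)
frequencies = np.tile(np.arange(3), 4)
cat_proto = rng.standard_normal((4, 300))
freq_proto = rng.standard_normal((3, 300))
layer_features = []
for k in range(8):  # hypothetical eight-layer network
    w = k / 7
    feats = (1 - w) * freq_proto[frequencies] + w * cat_proto[categories]
    layer_features.append(feats + 0.3 * rng.standard_normal((12, 300)))

# Synthetic category-driven region (not real fMRI data).
roi_patterns = rng.standard_normal((4, 200))[categories] + 0.5 * rng.standard_normal((12, 200))
for k, (r, p) in enumerate(layer_profile(roi_patterns, layer_features), start=1):
    print(f"layer {k}: r = {r:+.2f}, p = {p:.3f}")
```

In practice, each entry of `layer_features` would hold the activations of one CNN layer for the 12 stimulus conditions, and the profile would be computed separately for each region.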

Conclusion

In this study, we characterized the similarity of brain activity in scene-selective regions in relation to scene category and low-level spatial frequency. According to hierarchical clustering and correlation analysis, PPA, OPA, RSC and LOC were more sensitive to image category, whereas V1 was sensitive to spatial frequency and spatial envelope information rather than category. In addition, a similar hierarchy was found between brain activity and CNN features from different layers.

Acknowledgements

This work was supported in part by the National Key Research and Development Program of China (No. 2016YFC0100800 and 2016YFC0100802).

References

1. Walther DB, Caddigan E, Fei-Fei L, Beck DM. Natural Scene Categories Revealed in Distributed Patterns of Activity in the Human Brain. Journal of Neuroscience 2009;29:10573-10581.

2. Peissig JJ, Tarr MJ. Visual object recognition: Do we know more now than we did 20 years ago? Annu Rev Psychol 2007;58:75-96.

3. Eger E, Ashburner J, Haynes JD, Dolan RJ, Rees G. fMRI activity patterns in human LOC carry information about object exemplars within category. Journal of Cognitive Neuroscience 2008;20:356-370.

4. Nili H, Wingfield C, Walther A, Su L, Marslen-Wilson W, Kriegeskorte N. A Toolbox for Representational Similarity Analysis. PLoS Comput Biol 2014;10.

5. Edelman S, Grill-Spector K, Kushnir T, Malach R. Toward direct visualization of the internal shape representation space by fMRI. Psychobiology 1998;26:309-321.

6. Oliva A, Torralba A. Modeling the shape of the scene: A holistic representation of the spatial envelope. Int J Comput Vision 2001;42:145-175.

7. Nishida S. Advancement of motion psychophysics: Review 2001-2010. J Vision 2011;11.

8. Berman D, Golomb JD, Walther DB. Scene content is predominantly conveyed by high spatial frequencies in scene-selective visual cortex. PLoS ONE 2017;12.

9. Kauffmann L, Ramanoel S, Peyrin C. The neural bases of spatial frequency processing during scene perception. Frontiers in Integrative Neuroscience 2014;8:37.

10. Schindler A, Bartels A. Visual high-level regions respond to high-level stimulus content in the absence of low-level confounds. NeuroImage 2016;132:520-525.

11. Scholte HS, Losch MM, Ramakrishnan K, de Haan EHF, Bohte SM. Visual pathways from the perspective of cost functions and multi-task deep neural networks. Cortex 2018;98:249-261.

12. Guclu U, van Gerven MAJ. Deep Neural Networks Reveal a Gradient in the Complexity of Neural Representations across the Ventral Stream. Journal of Neuroscience 2015;35:10005-10014.

Figures

Figure 1. Tree diagram of the fMRI response representation in different scene-selective regions. Note: _H denotes conditions in the high spatial frequency version; _L denotes conditions in the low spatial frequency version; _F denotes conditions in the full-spectrum unfiltered version.

Figure 2. The correlations between fMRI response representations in scene-selective regions and the model representations of spatial frequency, GIST descriptor and category.

Figure 3. Illustration of the representations in different layers of a classic CNN.

Figure 4. The relationship between fMRI response representations in scene-selective regions and the model representations in different layers of the CNN.

Proc. Intl. Soc. Mag. Reson. Med. 28 (2020)