
Effective and Efficient ROI-wise Visual Encoding Using an End-to-End CNN Regression Model and Selective Optimization

  • Conference paper
Human Brain and Artificial Intelligence (HBAI 2021)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1369)


Abstract

In neuroscience, visual encoding based on functional magnetic resonance imaging (fMRI) has attracted much attention, especially with the recent development of deep learning. A visual encoding model aims to predict a subject's brain activity in response to presented image stimuli. Current visual encoding models first extract image features with a pre-trained convolutional neural network (CNN) and then learn a linear mapping from the extracted features to each voxel. However, this two-step approach cannot guarantee that the extracted features are linearly well matched to the fMRI voxels, which limits the final encoding performance. Drawing an analogy with the development of computer vision, we introduce the end-to-end approach into the visual encoding domain. In this study, we designed an end-to-end convolution regression model (ETECRM) with selective optimization in a region-of-interest (ROI)-wise manner to achieve more effective and efficient visual encoding. Trained end to end, the model automatically learns to extract features that are better matched to the encoding task. Operating ROI-wise, it directly encodes an entire visual ROI containing many voxels, improving encoding efficiency, while selective optimization prevents ineffective voxels in the same ROI from interfering with training. Experimental results demonstrate that ETECRM achieves better encoding performance and efficiency than previous two-step models. Comparative analysis further suggests that the end-to-end approach and large volumes of fMRI data hold promise for the visual encoding domain.




Copyright information

© 2021 Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Qiao, K., Zhang, C., Chen, J., Wang, L., Tong, L., Yan, B. (2021). Effective and Efficient ROI-wise Visual Encoding Using an End-to-End CNN Regression Model and Selective Optimization. In: Wang, Y. (eds) Human Brain and Artificial Intelligence. HBAI 2021. Communications in Computer and Information Science, vol 1369. Springer, Singapore. https://doi.org/10.1007/978-981-16-1288-6_5


  • DOI: https://doi.org/10.1007/978-981-16-1288-6_5


  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-16-1287-9

  • Online ISBN: 978-981-16-1288-6

  • eBook Packages: Computer Science (R0)
