
A model of how hierarchical representations constructed in the hippocampus are used to navigate through space

Published: 07 January 2025

Abstract

Animals navigate complex environments with remarkable flexibility and efficiency: they forage over large areas, quickly learn rewarding behaviors, and change their plans when necessary. Some insights into the neural mechanisms supporting this ability come from the hippocampus (HPC), a brain structure involved in navigation, learning, and memory. Neuronal activity in the HPC provides a hierarchical representation of space, representing an environment at multiple scales. Moreover, when memory-consolidation processes in the HPC are inactivated, animals can still plan and navigate in familiar environments but not in new ones. Findings like these suggest three useful principles: spatial learning is hierarchical, learning a hierarchical world-model is intrinsically valuable, and action planning occurs as a downstream process separate from learning. Here, we demonstrate computationally how an agent could learn hierarchical models of an environment using off-line replay of trajectories through that environment, and we show empirically that this enables computationally efficient planning to reach arbitrary goals in a reinforcement learning setting. Using the computational model to simulate hippocampal damage reproduces navigation behaviors observed in rodents with hippocampal inactivation. The approach presented here may help to clarify differing interpretations of some rodent spatial navigation studies and carries implications for future studies of both machine and biological intelligence.
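The core mechanism the abstract describes, planning over a learned hierarchy of spatial regions, can be illustrated in a few lines. The sketch below is an assumption-laden toy, not the paper's implementation: the clusters are hard-coded 3x3 blocks of a small grid world standing in for regions that the model would learn from off-line trajectory replay, and planning runs coarse-to-fine with breadth-first search, first over the cluster graph and then only over cells on the coarse route.

```python
from collections import deque

# Illustrative sketch only: the paper's model learns spatial clusters from
# off-line replay of trajectories; here clusters are fixed 3x3 blocks of a
# 6x6 grid, and planning is breadth-first search run coarse-to-fine.

N = 6  # grid is N x N

def neighbors(s):
    """4-connected neighbours of a coordinate (works for cells or clusters)."""
    x, y = s
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        yield (x + dx, y + dy)

def cluster(s):
    """Which 3x3 block a grid cell belongs to."""
    return (s[0] // 3, s[1] // 3)

def bfs(start, goal, allowed):
    """Shortest path from start to goal, moving only through `allowed`."""
    frontier, parent = deque([start]), {start: None}
    while frontier:
        s = frontier.popleft()
        if s == goal:
            path = []
            while s is not None:
                path.append(s)
                s = parent[s]
            return path[::-1]
        for n in neighbors(s):
            if n in allowed and n not in parent:
                parent[n] = s
                frontier.append(n)
    return None

def hierarchical_plan(start, goal):
    # Coarse plan over the small cluster graph, then a fine plan restricted
    # to cells whose cluster lies on the coarse route. The fine search never
    # visits most of the map, which is the source of the claimed efficiency.
    clusters = {(cx, cy) for cx in range(N // 3) for cy in range(N // 3)}
    coarse = bfs(cluster(start), cluster(goal), clusters)
    allowed = {(x, y) for x in range(N) for y in range(N)
               if cluster((x, y)) in set(coarse)}
    return bfs(start, goal, allowed)

path = hierarchical_plan((0, 0), (5, 5))
```

Because the fine search is confined to the corridor of clusters on the coarse route, its cost scales with the corridor size rather than the whole environment; the same decomposition also makes replanning to a new goal cheap, since only the coarse plan and one corridor need recomputing.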



Published In

Adaptive Behavior: Animals, Animats, Software Agents, Robots, Adaptive Systems, Volume 33, Issue 1, February 2025, 76 pages
This article is distributed under the terms of the Creative Commons Attribution 4.0 License (https://creativecommons.org/licenses/by/4.0/) which permits any use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access page (https://us.sagepub.com/en-us/nam/open-access-at-sage).

Publisher

Sage Publications, Inc.

United States


Author Tags

  1. Reinforcement learning
  2. navigation
  3. hierarchical learning
  4. hippocampus
  5. cognitive map
  6. model-based learning

Qualifiers

  • Research-article

