Abstract
Mammals, and humans in particular, are endowed with an exceptional capacity for cumulative learning. This capacity crucially depends on the presence of intrinsic motivations, that is, motivations that are directly related not to an organism’s survival and reproduction but rather to its ability to learn. Recently, there have been a number of attempts to model and reproduce intrinsic motivations in artificial systems. Different kinds of intrinsic motivations have been proposed both in psychology and in machine learning and robotics: some are based on the knowledge of the learning system, while others are based on its competence. In this contribution, we discuss the distinction between knowledge-based and competence-based intrinsic motivations with respect to both the functional roles that motivations play in learning and the mechanisms by which those functions are implemented. In particular, after arguing that the principal function of intrinsic motivations consists in allowing the development of a repertoire of skills (rather than of knowledge), we suggest that at least two different sub-functions can be identified: (a) discovering which skills might be acquired and (b) deciding which skill to train when. We propose that in biological organisms, knowledge-based intrinsic motivation mechanisms might implement the former function, whereas competence-based mechanisms might underlie the latter one.
Acknowledgements
Thanks to Pierre-Yves Oudeyer, Andrew Barto, Kevin Gurney, and Jochen Triesch for their useful comments, which substantially helped to improve the paper. Any remaining omissions or mistakes are our own responsibility. This research has received funds from the European Commission 7th Framework Programme (FP7/2007-2013), “Challenge 2: Cognitive Systems, Interaction, Robotics,” Grant Agreement No. ICT-IP-231722, Project “IM-CLeVeR: Intrinsically Motivated Cumulative Learning Versatile Robots.”
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Mirolli, M., Baldassarre, G. (2013). Functions and Mechanisms of Intrinsic Motivations. In: Baldassarre, G., Mirolli, M. (eds) Intrinsically Motivated Learning in Natural and Artificial Systems. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32375-1_3
DOI: https://doi.org/10.1007/978-3-642-32375-1_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32374-4
Online ISBN: 978-3-642-32375-1
eBook Packages: Computer Science (R0)