
Teachable robots: Understanding human teaching behavior to build more effective robot learners

Authors: Andrea L. Thomaz, Cynthia Breazeal

Artificial Intelligence, Volume 172, Issue 6-7

Pages 716 - 737

https://doi.org/10.1016/j.artint.2007.09.009

Published: 01 April 2008

Abstract

While Reinforcement Learning (RL) is not traditionally designed for interactive supervisory input from a human teacher, several works in both robot and software agents have adapted it for human input by letting a human trainer control the reward signal. In this work, we experimentally examine the assumption underlying these works, namely that the human-given reward is compatible with the traditional RL reward signal. We describe an experimental platform with a simulated RL robot and present an analysis of real-time human teaching behavior found in a study in which untrained subjects taught the robot to perform a new task. We report three main observations on how people administer feedback when teaching a Reinforcement Learning agent: (a) they use the reward channel not only for feedback, but also for future-directed guidance; (b) they have a positive bias to their feedback, possibly using the signal as a motivational channel; and (c) they change their behavior as they develop a mental model of the robotic learner. Given this, we made specific modifications to the simulated RL robot, and analyzed and evaluated its learning behavior in four follow-up experiments with human trainers. We report significant improvements on several learning measures. This work demonstrates the importance of understanding the human-teacher/robot-learner partnership in order to design algorithms that support how people want to teach and simultaneously improve the robot's learning behavior.
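The setup the abstract examines, a learner whose reward signal is controlled by a human trainer, can be illustrated with a minimal sketch. It assumes a tabular Q-learning learner on a toy five-state corridor, neither of which the abstract specifies, and it replaces the human with a scripted stand-in. The names `simulated_trainer`, `q_update`, `train`, and `greedy_policy` are hypothetical and chosen only for illustration; this is not the authors' platform or algorithm.

```python
import random
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step; `reward` is whatever the trainer supplied."""
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])


def simulated_trainer(state, next_state, goal):
    """Hypothetical stand-in for a human trainer (an assumption, not study data):
    +1 when the step moved the agent closer to the goal, -1 otherwise."""
    return 1.0 if abs(goal - next_state) < abs(goal - state) else -1.0


def train(n_states=5, goal=4, episodes=50, epsilon=0.1, max_steps=100, seed=0):
    """Teach an agent to walk a corridor of states 0..n_states-1 to `goal`."""
    rng = random.Random(seed)
    actions = (-1, 1)  # step left, step right
    q = defaultdict(float)  # unseen (state, action) pairs start at 0.0
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            if state == goal:
                break
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                action = rng.choice(actions)
            else:
                action = max(actions, key=lambda a: q[(state, a)])
            # Moves are clamped so the agent stays inside the corridor.
            next_state = min(max(state + action, 0), n_states - 1)
            # The only reward the learner sees comes from the trainer.
            reward = simulated_trainer(state, next_state, goal)
            q_update(q, state, action, reward, next_state, actions)
            state = next_state
    return q


def greedy_policy(q, n_states=5, actions=(-1, 1)):
    """Greedy action for every non-goal state."""
    return [max(actions, key=lambda a: q[(s, a)]) for s in range(n_states - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

The scripted trainer here gives purely retrospective feedback on the step just taken. The study's observations suggest real trainers do not behave this way: they also use the channel for future-directed guidance and skew positive, which is the mismatch with the standard reward assumption that the paper investigates.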




Published In

Artificial Intelligence, Volume 172, Issue 6-7, April 2008, 267 pages

ISSN: 0004-3702

Copyright © 2007 Elsevier B.V.

Publisher: Elsevier Science Publishers Ltd., United Kingdom


