DOI: 10.1145/2696454.2696455
research-article

Efficient Model Learning from Joint-Action Demonstrations for Human-Robot Collaborative Tasks

Published: 02 March 2015
    Abstract

    We present a framework for automatically learning human user models from joint-action demonstrations that enables a robot to compute a robust policy for a collaborative task with a human. First, the demonstrated action sequences are clustered into different human types using an unsupervised learning algorithm. A reward function is then learned for each type through the employment of an inverse reinforcement learning algorithm. The learned model is then incorporated into a mixed-observability Markov decision process (MOMDP) formulation, wherein the human type is a partially observable variable. With this framework, we can infer online the human type of a new user that was not included in the training set, and can compute a policy for the robot that will be aligned to the preference of this user. In a human subject experiment (n=30), participants agreed more strongly that the robot anticipated their actions when working with a robot incorporating the proposed framework (p<0.01), compared to manually annotating robot actions. In trials where participants faced difficulty annotating the robot actions to complete the task, the proposed framework significantly improved team efficiency (p<0.01). The robot incorporating the framework was also found to be more responsive to human actions compared to policies computed using a hand-coded reward function by a domain expert (p<0.01). These results indicate that learning human user models from joint-action demonstrations and encoding them in a MOMDP formalism can support effective teaming in human-robot collaborative tasks.
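    The abstract outlines a three-stage pipeline: demonstrated joint-action sequences are clustered into human types, a reward function is learned per type via inverse reinforcement learning, and the type is treated as the partially observable variable of a MOMDP so that a new user's type can be inferred online. The sketch below is a minimal, illustrative rendering of that pipeline, not the authors' implementation: it substitutes k-means on action histograms for the paper's clustering step, a cluster-mean weight vector for the learned reward, and a softmax ("noisily rational") action model with a simple Bayes update for the belief over the hidden type. All function names, features, and the toy data are assumptions made for illustration only.

        import numpy as np

        rng = np.random.default_rng(0)

        # Toy demonstrations: each is a sequence of discrete action indices
        # (hypothetical stand-ins for the demonstrated joint-action sequences).
        demos = [rng.integers(0, 3, size=8) for _ in range(20)]

        def action_histogram(seq, n_actions=3):
            """Fixed-length feature vector: empirical action frequencies."""
            h = np.bincount(seq, minlength=n_actions).astype(float)
            return h / h.sum()

        X = np.array([action_histogram(d) for d in demos])

        # Step 1 -- cluster demonstrations into human "types" (k-means here is
        # only a placeholder for the paper's unsupervised clustering step).
        def kmeans(X, k=2, iters=50):
            centers = X[rng.choice(len(X), k, replace=False)]
            for _ in range(iters):
                labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
                for j in range(k):
                    if np.any(labels == j):
                        centers[j] = X[labels == j].mean(axis=0)
            return centers, labels

        centers, _ = kmeans(X, k=2)

        # Step 2 -- a per-type reward weight vector over action features.
        # Placeholder for inverse reinforcement learning: the cluster mean is
        # used directly as a linear reward for that type.
        type_rewards = centers                      # shape: (n_types, n_actions)

        # Step 3 -- online belief update over the hidden human type; in the
        # MOMDP the type is partially observable and is inferred from the
        # human's observed actions by Bayes' rule.
        def likelihood(action, reward_weights, temperature=5.0):
            """Softmax ('noisily rational') action model under one type."""
            logits = temperature * reward_weights
            p = np.exp(logits - logits.max())
            return (p / p.sum())[action]

        belief = np.full(len(type_rewards), 1.0 / len(type_rewards))
        new_user_actions = demos[0]                 # pretend this is a new user
        for a in new_user_actions:
            belief *= [likelihood(a, w) for w in type_rewards]
            belief /= belief.sum()

        print("Belief over human types for the new user:", np.round(belief, 3))

    In the paper, a belief of this kind conditions a MOMDP policy computed offline, so the robot's actions adapt as the belief over the user's type sharpens; that planning step is omitted from the sketch above.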




    Published In

    HRI '15: Proceedings of the Tenth Annual ACM/IEEE International Conference on Human-Robot Interaction
    March 2015
    368 pages
    ISBN: 9781450328838
    DOI: 10.1145/2696454
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 02 March 2015


    Author Tags

    1. human-robot collaboration
    2. mixed observability Markov decision process
    3. model learning

    Qualifiers

    • Research-article

    Conference

    HRI '15

    Acceptance Rates

    HRI '15 Paper Acceptance Rate 43 of 169 submissions, 25%;
    Overall Acceptance Rate 268 of 1,124 submissions, 24%


    Cited By

    • (2024) Offline Risk-sensitive RL with Partial Observability to Enhance Performance in Human-Robot Teaming. Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, pp. 58-67. DOI: 10.5555/3635637.3662852. Online publication date: 6-May-2024.
    • (2024) Integration of Deep Learning and Collaborative Robot for Assembly Tasks. Applied Sciences, 14(2): 839. DOI: 10.3390/app14020839. Online publication date: 18-Jan-2024.
    • (2024) HOTSPOT: An ad hoc teamwork platform for mixed human-robot teams. PLOS ONE, 19(6): e0305705. DOI: 10.1371/journal.pone.0305705. Online publication date: 28-Jun-2024.
    • (2024) Multi-scale progressive fusion-based depth image completion and enhancement for industrial collaborative robot applications. Journal of Intelligent Manufacturing, 35(5): 2119-2135. DOI: 10.1007/s10845-023-02299-7. Online publication date: 1-Jun-2024.
    • (2024) HAbot: a human-centered augmented reality robot programming method with the awareness of cognitive load. Journal of Intelligent Manufacturing, 35(5): 1985-2003. DOI: 10.1007/s10845-023-02096-2. Online publication date: 1-Jun-2024.
    • (2023) Dec-AIRL: Decentralized Adversarial IRL for Human-Robot Teaming. Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, pp. 1116-1124. DOI: 10.5555/3545946.3598753. Online publication date: 30-May-2023.
    • (2023) Literature Review on Recent Trends and Perspectives of Collaborative Robotics in Work 4.0. Robotics, 12(3): 84. DOI: 10.3390/robotics12030084. Online publication date: 7-Jun-2023.
    • (2023) FABRIC: A Framework for the Design and Evaluation of Collaborative Robots with Extended Human Adaptation. ACM Transactions on Human-Robot Interaction, 12(3): 1-54. DOI: 10.1145/3585276. Online publication date: 17-Mar-2023.
    • (2023) Transfer Learning of Human Preferences for Proactive Robot Assistance in Assembly Tasks. Proceedings of the 2023 ACM/IEEE International Conference on Human-Robot Interaction, pp. 575-583. DOI: 10.1145/3568162.3576965. Online publication date: 13-Mar-2023.
    • (2023) Individual Squash Training is More Effective and Social with a Humanoid Robotic Coach. 2023 32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), pp. 621-626. DOI: 10.1109/RO-MAN57019.2023.10309567. Online publication date: 28-Aug-2023.
