DOI: 10.1145/3406499.3418769
poster

A Robust Approach for Continuous Interactive Reinforcement Learning

Published: 10 November 2020

Abstract

Interactive reinforcement learning is an approach in which an external trainer helps an agent learn through advice. A trainer is especially useful in large or continuous scenarios; however, when the characteristics of the environment change over time, learning can be adversely affected. Robust reinforcement learning allows an agent to learn a task reliably despite disturbances in the environment. In this work, we present an approach that addresses interactive reinforcement learning problems in a dynamic environment with continuous states and actions. Our results show that the proposed approach allows an agent to complete the cart-pole balancing task satisfactorily in a dynamic, continuous state-action domain.
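The setting described above, a continuous-action cart-pole task in which a trainer occasionally advises the agent while a disturber perturbs the dynamics, can be sketched as follows. This is an illustrative toy under stated assumptions, not the authors' algorithm: the hand-coded PD-rule trainer, the uniform random disturber, the advice probability, and all constants are hypothetical, and no learning update is shown, only the interactive/robust episode loop.

```python
import math
import random

def cart_pole_step(state, force, disturbance=0.0, dt=0.02):
    """One Euler step of the classic cart-pole dynamics
    (Barto, Sutton & Anderson, 1983)."""
    g, m_cart, m_pole, length = 9.8, 1.0, 0.1, 0.5
    x, x_dot, theta, theta_dot = state
    total_mass = m_cart + m_pole
    polemass_length = m_pole * length
    f = force + disturbance                    # disturber acts on the control force
    costh, sinth = math.cos(theta), math.sin(theta)
    temp = (f + polemass_length * theta_dot ** 2 * sinth) / total_mass
    theta_acc = (g * sinth - costh * temp) / (
        length * (4.0 / 3.0 - m_pole * costh ** 2 / total_mass))
    x_acc = temp - polemass_length * theta_acc * costh / total_mass
    return (x + dt * x_dot, x_dot + dt * x_acc,
            theta + dt * theta_dot, theta_dot + dt * theta_acc)

def run_episode(policy, trainer=None, advice_prob=0.3,
                disturber=None, max_steps=200):
    """Interactive episode: with probability advice_prob the trainer's
    continuous action replaces the agent's; a disturber perturbs the
    dynamics (the robust-RL setting). Returns the number of steps survived."""
    state, steps = (0.0, 0.0, 0.05, 0.0), 0
    for _ in range(max_steps):
        action = policy(state)
        if trainer is not None and random.random() < advice_prob:
            action = trainer(state)            # trainer advice overrides the agent
        d = disturber(state) if disturber else 0.0
        state = cart_pole_step(state, action, d)
        steps += 1
        if abs(state[0]) > 2.4 or abs(state[2]) > 12 * math.pi / 180:
            break                              # pole fell or cart left the track
    return steps

# Hypothetical hand-coded trainer: a PD rule on the pole angle.
trainer = lambda s: 10.0 * s[2] + 1.0 * s[3]
random_policy = lambda s: random.uniform(-10.0, 10.0)
disturber = lambda s: random.uniform(-1.0, 1.0)  # bounded random disturbance
```

In this sketch a learning agent would replace `random_policy`, updating its actor and critic from the episode's transitions; the trainer and disturber hooks are the interactive and robust components, respectively.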

Supplementary Material

MP4 File (3406499.3418769.mp4)
This video presents an approach that addresses interactive reinforcement learning problems in a dynamic environment with continuous states and actions. It illustrates the problems that arise when an agent learns a policy on its own, and the difficulty the agent faces in exploring regions of the state space when the environment's characteristics change. The approach is applied to the cart-pole balancing task, and the results show that the agent completes the task satisfactorily in a dynamic, continuous state-action domain.




Published In

HAI '20: Proceedings of the 8th International Conference on Human-Agent Interaction
November 2020
304 pages
ISBN:9781450380546
DOI:10.1145/3406499
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. actor-critic
  2. actor-disturber-critic
  3. interactive reinforcement learning
  4. interactive robust reinforcement learning
  5. policy-shaping
  6. reinforcement learning
  7. robust reinforcement learning

Qualifiers

  • Poster

Funding Sources

  • Universidad Central de Chile
  • Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
  • National Council for Scientific and Technological Development
  • Prysmian Group

Conference

HAI '20

Acceptance Rates

Overall acceptance rate: 121 of 404 submissions, 30%


Article Metrics

  • Downloads (last 12 months): 8
  • Downloads (last 6 weeks): 0
Reflects downloads up to 17 Oct 2024.


Cited By

  • (2022) Keeping Humans in the Loop: Teaching via Feedback in Continuous Action Space Environments. 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 863-870. DOI: 10.1109/IROS47612.2022.9982282. Online publication date: 23-Oct-2022.
  • (2022) Quantifying the effect of feedback frequency in interactive reinforcement learning for robotic tasks. Neural Computing and Applications 35:23, 16931-16943. DOI: 10.1007/s00521-022-07949-0. Online publication date: 5-Dec-2022.
  • (2022) Human engagement providing evaluative and informative advice for interactive reinforcement learning. Neural Computing and Applications 35:25, 18215-18230. DOI: 10.1007/s00521-021-06850-6. Online publication date: 12-Jan-2022.
  • (2021) An Evaluation Methodology for Interactive Reinforcement Learning with Simulated Users. Biomimetics 6:1, 13. DOI: 10.3390/biomimetics6010013. Online publication date: 9-Feb-2021.
  • (2021) A Robust Approach for Continuous Interactive Actor-Critic Algorithms. IEEE Access 9, 104242-104260. DOI: 10.1109/ACCESS.2021.3099071. Online publication date: 2021.
  • (2021) A conceptual framework for externally-influenced agents: an assisted reinforcement learning review. Journal of Ambient Intelligence and Humanized Computing 14:4, 3621-3644. DOI: 10.1007/s12652-021-03489-y. Online publication date: 18-Sep-2021.
