DOI: 10.1145/3406499.3418769
poster

A Robust Approach for Continuous Interactive Reinforcement Learning

Published: 10 November 2020

Abstract

Interactive reinforcement learning is an approach in which an external trainer helps an agent learn through advice. A trainer is especially useful in large or continuous scenarios; however, when the characteristics of the environment change over time, learning can be adversely affected. Robust reinforcement learning allows an agent to learn a task reliably despite disturbances in the environment. In this work, we present an approach that addresses interactive reinforcement learning problems in a dynamic environment with continuous states and actions. Our results show that the proposed approach allows an agent to complete the cart-pole balancing task satisfactorily in a dynamic, continuous state-action domain.
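The setting described above, a continuous-action cart-pole task in which a trainer occasionally advises the agent while a disturber perturbs the dynamics, can be sketched as follows. This is an illustrative toy under stated assumptions, not the authors' algorithm: the hand-coded PD-rule trainer, the uniform random disturber, the advice probability, and all constants are hypothetical, and no learning update is shown, only the interactive/robust episode loop.

```python
import math
import random

def cart_pole_step(state, force, disturbance=0.0, dt=0.02):
    """One Euler step of the classic cart-pole dynamics
    (Barto, Sutton & Anderson, 1983)."""
    g, m_cart, m_pole, length = 9.8, 1.0, 0.1, 0.5
    x, x_dot, theta, theta_dot = state
    total_mass = m_cart + m_pole
    polemass_length = m_pole * length
    f = force + disturbance                    # disturber acts on the control force
    costh, sinth = math.cos(theta), math.sin(theta)
    temp = (f + polemass_length * theta_dot ** 2 * sinth) / total_mass
    theta_acc = (g * sinth - costh * temp) / (
        length * (4.0 / 3.0 - m_pole * costh ** 2 / total_mass))
    x_acc = temp - polemass_length * theta_acc * costh / total_mass
    return (x + dt * x_dot, x_dot + dt * x_acc,
            theta + dt * theta_dot, theta_dot + dt * theta_acc)

def run_episode(policy, trainer=None, advice_prob=0.3,
                disturber=None, max_steps=200):
    """Interactive episode: with probability advice_prob the trainer's
    continuous action replaces the agent's; a disturber perturbs the
    dynamics (the robust-RL setting). Returns the number of steps survived."""
    state, steps = (0.0, 0.0, 0.05, 0.0), 0
    for _ in range(max_steps):
        action = policy(state)
        if trainer is not None and random.random() < advice_prob:
            action = trainer(state)            # trainer advice overrides the agent
        d = disturber(state) if disturber else 0.0
        state = cart_pole_step(state, action, d)
        steps += 1
        if abs(state[0]) > 2.4 or abs(state[2]) > 12 * math.pi / 180:
            break                              # pole fell or cart left the track
    return steps

# Hypothetical hand-coded trainer: a PD rule on the pole angle.
trainer = lambda s: 10.0 * s[2] + 1.0 * s[3]
random_policy = lambda s: random.uniform(-10.0, 10.0)
disturber = lambda s: random.uniform(-1.0, 1.0)  # bounded random disturbance
```

In this sketch a learning agent would replace `random_policy`, updating its actor and critic from the episode's transitions; the trainer and disturber hooks are the interactive and robust components, respectively.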

Supplementary Material

MP4 File (3406499.3418769.mp4)
This video presents an approach that addresses interactive reinforcement learning problems in a dynamic environment with continuous states and actions. It illustrates the problems that arise when an agent learns a policy on its own, and the difficulty the agent faces in exploring regions of the state space when the environment's characteristics change. The approach is applied to the cart-pole balancing task, and the results show that the agent completes the task satisfactorily in a dynamic, continuous state-action domain.




Published In

HAI '20: Proceedings of the 8th International Conference on Human-Agent Interaction
November 2020
304 pages
ISBN:9781450380546
DOI:10.1145/3406499
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. actor-critic
  2. actor-disturber-critic
  3. interactive reinforcement learning
  4. interactive robust reinforcement learning
  5. policy-shaping
  6. reinforcement learning
  7. robust reinforcement learning

Qualifiers

  • Poster

Funding Sources

  • Universidad Central de Chile
  • Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
  • National Council for Scientific and Technological Development
  • Prysmian Group

Conference

HAI '20

Acceptance Rates

Overall acceptance rate: 121 of 404 submissions, 30%


Article Metrics

  • Downloads (last 12 months): 8
  • Downloads (last 6 weeks): 0
Reflects downloads up to 17 Oct 2024.


Cited By

  • (2022) Keeping Humans in the Loop: Teaching via Feedback in Continuous Action Space Environments. 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 863-870. DOI: 10.1109/IROS47612.2022.9982282. Online publication date: 23-Oct-2022.
  • (2022) Quantifying the effect of feedback frequency in interactive reinforcement learning for robotic tasks. Neural Computing and Applications 35:23, 16931-16943. DOI: 10.1007/s00521-022-07949-0. Online publication date: 5-Dec-2022.
  • (2022) Human engagement providing evaluative and informative advice for interactive reinforcement learning. Neural Computing and Applications 35:25, 18215-18230. DOI: 10.1007/s00521-021-06850-6. Online publication date: 12-Jan-2022.
  • (2021) An Evaluation Methodology for Interactive Reinforcement Learning with Simulated Users. Biomimetics 6:1, 13. DOI: 10.3390/biomimetics6010013. Online publication date: 9-Feb-2021.
  • (2021) A Robust Approach for Continuous Interactive Actor-Critic Algorithms. IEEE Access 9, 104242-104260. DOI: 10.1109/ACCESS.2021.3099071. Online publication date: 2021.
  • (2021) A conceptual framework for externally-influenced agents: an assisted reinforcement learning review. Journal of Ambient Intelligence and Humanized Computing 14:4, 3621-3644. DOI: 10.1007/s12652-021-03489-y. Online publication date: 18-Sep-2021.
