DOI: 10.1145/3383668.3419938

MarioMix: Creating Aligned Playstyles for Bots with Interactive Reinforcement Learning

Published: 03 November 2020

Abstract

In this paper, we propose a generic framework that enables game developers without machine learning expertise to create bot behaviors whose playstyles align with their preferences. The framework is based on interactive reinforcement learning (RL), and we used it to build a behavior authoring tool called MarioMix. This tool lets non-experts create bots with varied playstyles for the game Super Mario Bros. MarioMix's main interaction procedure presents end-users with short gameplay clips of precomputed bots with different playstyles; end-users then select the bot whose playstyle behaves as intended. We evaluated MarioMix with input from game designers working in the industry.
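
The interaction loop described in the abstract (precompute bots with distinct playstyles, show the end-user short clips, let them pick the bot that behaves as intended for each part of the level) can be sketched in a few lines of Python. The sketch below is illustrative only: the toy policies, the clip representation, and the per-segment assignment are hypothetical scaffolding under our own assumptions, not the authors' implementation.

```python
import random

SEGMENT_LEN = 20  # hypothetical clip length, in decision steps

def cautious_policy(obs):
    """Toy playstyle: waits out nearby enemies, otherwise walks right."""
    return "wait" if obs["enemy_near"] else "walk_right"

def aggressive_policy(obs):
    """Toy playstyle: stomps enemies and keeps running."""
    return "stomp" if obs["enemy_near"] else "run_right"

# Stand-ins for the paper's precomputed bots with different playstyles.
PRECOMPUTED_BOTS = {"cautious": cautious_policy, "aggressive": aggressive_policy}

def rollout(policy, seed):
    """Produce a short action trace standing in for a gameplay clip."""
    rng = random.Random(seed)
    return [policy({"enemy_near": rng.random() < 0.3}) for _ in range(SEGMENT_LEN)]

def author_bot(num_segments, choose):
    """Assign a playstyle to each level segment from end-user choices.

    `choose` plays the role of the end-user in the loop: it receives the
    segment index and a {bot_name: clip} dict, and returns the name of the
    bot whose clip behaved as intended for that segment.
    """
    assignment = {}
    for seg in range(num_segments):
        clips = {name: rollout(p, seed=seg) for name, p in PRECOMPUTED_BOTS.items()}
        assignment[seg] = choose(seg, clips)
    return assignment

if __name__ == "__main__":
    # Scripted stand-in for the human: prefer the aggressive bot early
    # in the level and the cautious one later.
    picker = lambda seg, clips: "aggressive" if seg < 2 else "cautious"
    print(author_bot(num_segments=4, choose=picker))
```

In MarioMix itself, per the abstract, the clips are actual Super Mario Bros. gameplay from precomputed bots, and the user's selections drive an interactive RL procedure rather than the fixed per-segment lookup shown here.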

Supplementary Material

• SRT file (cprc1025vfc.srt): preview video
• MP4 file (cprc1025vf.mp4): supplemental video


Cited By

  • (2021) Interactive Explanations: Diagnosis and Repair of Reinforcement Learning Based Agent Behaviors. 2021 IEEE Conference on Games (CoG), 1–8. DOI: 10.1109/CoG52621.2021.9618999. Online publication date: 17 August 2021.
  • (2021) Interactive Reinforcement Learning for Autonomous Behavior Design. In Artificial Intelligence for Human Computer Interaction: A Modern Approach, 345–375. DOI: 10.1007/978-3-030-82681-9_11. Online publication date: 5 November 2021.



    Published In

    CHI PLAY '20: Extended Abstracts of the 2020 Annual Symposium on Computer-Human Interaction in Play
    November 2020
    435 pages
    ISBN: 9781450375870
    DOI: 10.1145/3383668

    Publisher

    Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. human-agent interaction
    2. interactive machine learning
    3. interactive reinforcement learning

    Qualifiers

    • Short-paper

    Conference

    CHI PLAY '20

    Acceptance Rates

    Overall Acceptance Rate 421 of 1,386 submissions, 30%
