
The Shoutcasters, the Game Enthusiasts, and the AI: Foraging for Explanations of Real-time Strategy Players

Published: 15 March 2021
    Abstract

    Assessing and understanding intelligent agents is a difficult task for users who lack an AI background. “Explainable AI” (XAI) aims to address this problem, but what should be in an explanation? One route toward answering this question is to turn to theories of how humans try to obtain information they seek. Information Foraging Theory (IFT) is one such theory. In this article, we present a series of studies using IFT: the first investigates how expert explainers supply explanations in the real-time strategy (RTS) domain, the second investigates what explanations domain experts demand from agents in the RTS domain, and the last focuses on how both populations try to explain a state-of-the-art AI. Our results show that RTS environments like StarCraft offer so many options that change so rapidly that foraging tends to be very costly. Foragers attempted to manage these costs in three ways: “satisficing” approaches that reduce cognitive load, such as focusing more on What information than on Why information; strategic use of language to communicate a lot of nuanced information in a few words; and optimizing their environment, when possible, to make their most valuable information patches readily available. Further, when a real AI entered the picture, even very experienced domain experts had difficulty understanding and judging some of the AI’s unconventional behaviors. Finally, our results reveal ways Information Foraging Theory can inform future XAI interactive explanation environments, and also how XAI can inform IFT.

    Cited By

    • (2022) Finding AI’s Faults with AAR/AI: An Empirical Study. ACM Transactions on Interactive Intelligent Systems 12:1 (1-33). DOI: 10.1145/3487065. Online publication date: 4-Mar-2022
    • (2021) “Why did my AI agent lose?”: Visual Analytics for Scaling Up After-Action Review. 2021 IEEE Visualization Conference (VIS) (16-20). DOI: 10.1109/VIS49827.2021.9623268. Online publication date: Oct-2021

    Published In

    ACM Transactions on Interactive Intelligent Systems, Volume 11, Issue 1
    March 2021
    245 pages
    ISSN:2160-6455
    EISSN:2160-6463
    DOI:10.1145/3453938
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 15 March 2021
    Online AM: 07 May 2020
    Accepted: 01 April 2020
    Revised: 01 February 2020
    Received: 01 September 2019
    Published in TIIS Volume 11, Issue 1

    Author Tags

    1. Explainable AI
    2. StarCraft
    3. information foraging
    4. empirical studies with humans
    5. qualitative analysis
    6. humans evaluating AI

    Qualifiers

    • Research-article
    • Refereed

    Funding Sources

    • DARPA
    • NSF
