DOI: 10.1145/1329125.1329367
research-article

A globally optimal algorithm for TTD-MDPs

Published: 14 May 2007

    Abstract

    In this paper, we discuss the use of Targeted Trajectory Distribution Markov Decision Processes (TTD-MDPs), a variant of MDPs in which the goal is to realize a specified distribution of trajectories through a state space, as a general agent-coordination framework.
    We present several advances to previous work on TTD-MDPs. We improve on the existing algorithm for solving TTD-MDPs by deriving a greedy algorithm that finds a policy that provably minimizes the global KL-divergence from the target distribution. We test the new algorithm by applying TTD-MDPs to drama management, where a system must coordinate the behavior of many agents to ensure that a game follows a coherent storyline, is in keeping with the author's desires, and offers a high degree of replayability.
    Although we show that suboptimal greedy strategies will fail in some cases, we validate previous work that suggests that they can work well in practice. We also show that our new algorithm provides guaranteed accuracy even in those cases, with little additional computational cost. Further, we illustrate how this new approach can be applied online, eliminating the memory-intensive offline sampling necessary in the previous approach.
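
    As an illustration of the objective named in the abstract, the following minimal sketch computes the KL-divergence between a target distribution over complete trajectories and the distribution that a stochastic policy induces on a toy one-step problem. It is not the paper's algorithm; the function names, the toy states (`s`, `sx`, `sy`), the actions, and the transition probabilities are all hypothetical and chosen only for this example.

```python
import math

def induced_distribution(policy, transitions, state, prob=1.0):
    """Distribution over complete trajectories reached from `state`.

    policy[state] maps each action to its selection probability;
    transitions[(state, action)] maps each successor to its probability.
    A state with no policy entry is treated as a complete trajectory.
    """
    if state not in policy:
        return {state: prob}
    dist = {}
    for action, p_action in policy[state].items():
        for child, p_next in transitions[(state, action)].items():
            branch = induced_distribution(
                policy, transitions, child, prob * p_action * p_next
            )
            for leaf, p in branch.items():
                dist[leaf] = dist.get(leaf, 0.0) + p
    return dist

def kl_divergence(target, induced):
    """KL(target || induced) in nats over a finite set of trajectories."""
    total = 0.0
    for trajectory, p in target.items():
        if p == 0.0:
            continue  # zero-probability targets contribute nothing
        q = induced.get(trajectory, 0.0)
        if q == 0.0:
            return math.inf  # a desired trajectory is unreachable
        total += p * math.log(p / q)
    return total

# Hypothetical toy problem: one decision at the root, two possible endings.
TRANSITIONS = {
    ("s", "a"): {"sx": 0.8, "sy": 0.2},
    ("s", "b"): {"sx": 0.2, "sy": 0.8},
}
TARGET = {"sx": 0.5, "sy": 0.5}

UNIFORM_POLICY = {"s": {"a": 0.5, "b": 0.5}}
DETERMINISTIC_POLICY = {"s": {"a": 1.0}}

if __name__ == "__main__":
    for name, policy in [("uniform", UNIFORM_POLICY),
                         ("deterministic", DETERMINISTIC_POLICY)]:
        induced = induced_distribution(policy, TRANSITIONS, "s")
        print(name, induced, kl_divergence(TARGET, induced))
```

    In this toy setting the uniform policy reproduces the target distribution, so its divergence is approximately zero, whereas the deterministic policy skews the endings toward `sx` and incurs a positive divergence.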

    Information & Contributors

    Information

    Published In

    AAMAS '07: Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems
    May 2007
    1585 pages
    ISBN:9788190426275
    DOI:10.1145/1329125
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    • IFAAMAS

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Markov decision processes
    2. convex optimization
    3. interactive entertainment

    Qualifiers

    • Research-article

    Conference

    AAMAS07

    Acceptance Rates

    Overall Acceptance Rate 1,155 of 5,036 submissions, 23%

    Cited By

    • (2014) Story similarity measures for drama management with TTD-MDPs. Proceedings of the 2014 international conference on Autonomous agents and multi-agent systems, pages 77-84. DOI: 10.5555/2615731.2615747. Online publication date: 5-May-2014
    • (2014) Lessons on Using Computationally Generated Influence for Shaping Narrative Experiences. IEEE Transactions on Computational Intelligence and AI in Games, 6:2, pages 188-202. DOI: 10.1109/TCIAIG.2013.2287154. Online publication date: Jun-2014
    • (2014) Personalized Interactive Narratives via Sequential Recommendation of Plot Points. IEEE Transactions on Computational Intelligence and AI in Games, 6:2, pages 174-187. DOI: 10.1109/TCIAIG.2013.2282771. Online publication date: Jun-2014
    • (2012) A sequential recommendation approach for interactive personalized story generation. Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems - Volume 1, pages 71-78. DOI: 10.5555/2343576.2343586. Online publication date: 4-Jun-2012
    • (2010) Investigating director agents' decision making in interactive narrative. Proceedings of the Intelligent Narrative Technologies III Workshop, pages 1-8. DOI: 10.1145/1822309.1822322. Online publication date: 18-Jun-2010
    • (2009) Using influence and persuasion to shape player experiences. Proceedings of the 2009 ACM SIGGRAPH Symposium on Video Games, pages 23-30. DOI: 10.1145/1581073.1581077. Online publication date: 4-Aug-2009
    • (2008) Computational influence for training and entertainment. Proceedings of the 23rd national conference on Artificial intelligence - Volume 3, pages 1865-1866. DOI: 10.5555/1620270.1620392. Online publication date: 13-Jul-2008
    • (2008) Another look at search-based drama management. Proceedings of the 23rd national conference on Artificial intelligence - Volume 2, pages 792-797. DOI: 10.5555/1620163.1620195. Online publication date: 13-Jul-2008
    • (2008) Another look at search-based drama management. Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 3, pages 1293-1298. DOI: 10.5555/1402821.1402854. Online publication date: 12-May-2008