A globally optimal algorithm for TTD-MDPs

Authors: David L. Roberts, Mark J. Nelson, Charles L. Isbell, Michael Mateas

AAMAS '07: Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems

Article No.: 199, Pages 1 - 8

https://doi.org/10.1145/1329125.1329367

Published: 14 May 2007

Abstract

In this paper, we discuss the use of Targeted Trajectory Distribution Markov Decision Processes (TTD-MDPs), a variant of MDPs in which the goal is to realize a specified distribution of trajectories through a state space, as a general agent-coordination framework.
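
As a rough illustration of what "a distribution of trajectories" can mean, the sketch below writes down the probability that a stochastic policy assigns to one trajectory. The notation is an assumption made for this illustration and is not taken from the abstract: $\tau$ is a trajectory, $\tau_{:i}$ its prefix ending at state $s_i$, $\pi$ a stochastic policy, $P$ the transition model, $q_\pi$ the induced distribution, and $p$ the authored target distribution.

```latex
% Assumed notation (not from the abstract): a trajectory \tau = (s_0, s_1, \dots, s_n)
% starts from a fixed initial state s_0; \tau_{:i} = (s_0, \dots, s_i) is a prefix;
% \pi(a \mid \tau_{:i}) is the probability of choosing action a after that prefix;
% P(s_{i+1} \mid a, \tau_{:i}) is the probability that action a leads to s_{i+1}.
\[
  q_\pi(\tau) \;=\; \prod_{i=0}^{n-1} \; \sum_{a} \pi(a \mid \tau_{:i}) \, P(s_{i+1} \mid a, \tau_{:i})
\]
% The goal is then to pick \pi so that q_\pi is as close as possible to the target p(\tau).
```

Under this reading, the policy is judged by the whole distribution $q_\pi$ it induces over trajectories rather than by an expected reward.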

We present several advances to previous work on TTD-MDPs. We improve on the existing algorithm for solving TTD-MDPs by deriving a greedy algorithm that finds a policy that provably minimizes the global KL-divergence from the target distribution. We test the new algorithm by applying TTD-MDPs to drama management, where a system must coordinate the behavior of many agents to ensure that a game follows a coherent storyline, is in keeping with the author's desires, and offers a high degree of replayability.
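
To make the KL-divergence objective concrete, here is a minimal sketch for a single decision point with two actions. It is not the algorithm derived in the paper: the function names, the transition table, and the target distribution are all hypothetical, and the one-dimensional ternary search is only a stand-in that works because KL(target || induced) is convex in the action probability.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as equal-length lists."""
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i > 0.0:
            if q_i <= 0.0:
                return math.inf  # target puts mass where the policy puts none
            total += p_i * math.log(p_i / q_i)
    return total


def induced_distribution(policy, transitions):
    """Successor distribution obtained by mixing each action's outcome
    distribution according to the policy's action probabilities."""
    n_next = len(transitions[0])
    return [
        sum(policy[a] * transitions[a][s] for a in range(len(policy)))
        for s in range(n_next)
    ]


def best_two_action_policy(target, transitions, iters=200):
    """Ternary search over w = probability of action 0 (hypothetical helper).

    The induced distribution is linear in w, so KL(target || induced)
    is convex in w and a ternary search converges to its minimizer.
    """
    def objective(w):
        return kl_divergence(target, induced_distribution([w, 1.0 - w], transitions))

    lo, hi = 0.0, 1.0
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            hi = m2
        else:
            lo = m1
    w = (lo + hi) / 2.0
    return [w, 1.0 - w]


# Hypothetical example: two actions, three possible successor states.
transitions = [
    [0.7, 0.2, 0.1],  # outcome probabilities under action 0
    [0.1, 0.3, 0.6],  # outcome probabilities under action 1
]
target = [0.4, 0.25, 0.35]  # authored target over the three successors

policy = best_two_action_policy(target, transitions)
induced = induced_distribution(policy, transitions)
print("policy:", policy)
print("KL(target || induced):", kl_divergence(target, induced))
```

In this toy setup the target happens to be an exact mixture of the two action rows, so the search should settle near an even split between the actions; with an unreachable target the same code would return the closest mixture in the KL sense.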

Although we show that suboptimal greedy strategies will fail in some cases, we validate previous work that suggests that they can work well in practice. We also show that our new algorithm provides guaranteed accuracy even in those cases, with little additional computational cost. Further, we illustrate how this new approach can be applied online, eliminating the memory-intensive offline sampling necessary in the previous approach.

Published In

AAMAS '07: Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems

May 2007

1585 pages

ISBN:9788190426275

DOI:10.1145/1329125

Conference Chairs: Edmund Durfee (University of Michigan), Makoto Yokoo (Kyushu University)

Program Chairs: Michael Huhns (University of South Carolina), Onn Shehory (IBM Haifa Research Lab, Israel)

Copyright © 2007 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

IFAAMAS

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

Research-article

Conference

AAMAS07: International Conference on Autonomous Agents and Multiagent Systems

May 14 - 18, 2007

Honolulu, Hawaii

Acceptance Rates

Overall Acceptance Rate 1,155 of 5,036 submissions, 23%

Cited By

Jones J, Isbell C, Bazzan A, Huhns M, Lomuscio A, Scerri P (2014). Story similarity measures for drama management with ttd-mdps. Proceedings of the 2014 international conference on Autonomous agents and multi-agent systems, pp. 77-84. doi:10.5555/2615731.2615747. Online publication date: 5-May-2014.
https://dl.acm.org/doi/10.5555/2615731.2615747

Roberts D, Isbell C (2014). Lessons on Using Computationally Generated Influence for Shaping Narrative Experiences. IEEE Transactions on Computational Intelligence and AI in Games 6:2, pp. 188-202. doi:10.1109/TCIAIG.2013.2287154. Online publication date: Jun-2014.
https://doi.org/10.1109/TCIAIG.2013.2287154

Yu H, Riedl M (2014). Personalized Interactive Narratives via Sequential Recommendation of Plot Points. IEEE Transactions on Computational Intelligence and AI in Games 6:2, pp. 174-187. doi:10.1109/TCIAIG.2013.2282771. Online publication date: Jun-2014.
https://doi.org/10.1109/TCIAIG.2013.2282771

Yu H, Riedl M, van der Hoek W, Padgham L, Conitzer V, Winikoff M (2012). A sequential recommendation approach for interactive personalized story generation. Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems - Volume 1, pp. 71-78. doi:10.5555/2343576.2343586. Online publication date: 4-Jun-2012.
https://dl.acm.org/doi/10.5555/2343576.2343586

Lee S, Mott B, Lester J, Jhala A, Riedl M, Roberts D (2010). Investigating director agents' decision making in interactive narrative. Proceedings of the Intelligent Narrative Technologies III Workshop, pp. 1-8. doi:10.1145/1822309.1822322. Online publication date: 18-Jun-2010.
https://dl.acm.org/doi/10.1145/1822309.1822322

Roberts D, Furst M, Dorn B, Isbell C, Davidson D, Fullerton T, Schrier K (2009). Using influence and persuasion to shape player experiences. Proceedings of the 2009 ACM SIGGRAPH Symposium on Video Games, pp. 23-30. doi:10.1145/1581073.1581077. Online publication date: 4-Aug-2009.
https://dl.acm.org/doi/10.1145/1581073.1581077

Roberts D (2008). Computational influence for training and entertainment. Proceedings of the 23rd national conference on Artificial intelligence - Volume 3, pp. 1865-1866. doi:10.5555/1620270.1620392. Online publication date: 13-Jul-2008.
https://dl.acm.org/doi/10.5555/1620270.1620392

Nelson M, Mateas M (2008). Another look at search-based drama management. Proceedings of the 23rd national conference on Artificial intelligence - Volume 2, pp. 792-797. doi:10.5555/1620163.1620195. Online publication date: 13-Jul-2008.
https://dl.acm.org/doi/10.5555/1620163.1620195

Nelson M, Mateas M (2008). Another look at search-based drama management. Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 3, pp. 1293-1298. doi:10.5555/1402821.1402854. Online publication date: 12-May-2008.
https://dl.acm.org/doi/10.5555/1402821.1402854
