
Bisimulation metrics are optimal value functions

Published: 23 July 2014

Abstract

Bisimulation is a notion of behavioural equivalence on the states of a transition system. Its definition has been extended to Markov decision processes, where it can be used to aggregate states. A bisimulation metric is a quantitative analog of bisimulation that measures how similar states are from the perspective of long-term behavior. Bisimulation metrics have been used to establish approximation bounds for state aggregation and other forms of value function approximation. In this paper, we prove that a bisimulation metric defined on the state space of a Markov decision process is the optimal value function of an optimal coupling of two copies of the original model. We prove the result in the general case of continuous state spaces. This result has important implications for understanding the complexity of computing such metrics, and opens up the possibility of more efficient computational methods.
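
To make the object concrete, the sketch below computes the discounted bisimulation metric for a small finite MDP by iterating its fixed-point equation, d(s,t) = max_a [ |R(s,a) - R(t,a)| + γ · K_d(P(·|s,a), P(·|t,a)) ], where K_d is the Kantorovich (optimal-coupling) distance with ground metric d. This is an illustrative reconstruction of the standard finite-state formulation, not code from the paper; the function names, the toy MDP, and the discount γ = 0.9 are assumptions made for the example.

```python
# Minimal sketch (assumed formulation, not the authors' code): discounted
# bisimulation metric for a finite MDP via fixed-point iteration, with the
# Kantorovich (optimal-coupling) distance solved as a transportation LP.
import numpy as np
from scipy.optimize import linprog


def kantorovich(p, q, d):
    """Optimal-coupling distance between distributions p and q under ground metric d."""
    n = len(p)
    cost = d.reshape(-1)                     # cost of the coupling pi, flattened row-major
    A_eq = np.zeros((2 * n, n * n))
    for i in range(n):
        A_eq[i, i * n:(i + 1) * n] = 1.0     # row marginals:    sum_j pi[i, j] = p[i]
        A_eq[n + i, i::n] = 1.0              # column marginals: sum_k pi[k, i] = q[i]
    b_eq = np.concatenate([p, q])
    res = linprog(cost, A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    return res.fun


def bisimulation_metric(R, P, gamma=0.9, iters=500, tol=1e-8):
    """Iterate d(s,t) <- max_a [ |R[s,a] - R[t,a]| + gamma * K_d(P[s,a], P[t,a]) ].

    R has shape (S, A); P has shape (S, A, S), with P[s, a] a next-state distribution."""
    S, A = R.shape
    d = np.zeros((S, S))
    for _ in range(iters):
        d_new = np.zeros_like(d)
        for s in range(S):
            for t in range(s + 1, S):        # the metric is symmetric with zero diagonal
                gap = max(abs(R[s, a] - R[t, a])
                          + gamma * kantorovich(P[s, a], P[t, a], d)
                          for a in range(A))
                d_new[s, t] = d_new[t, s] = gap
        if np.max(np.abs(d_new - d)) < tol:
            return d_new
        d = d_new
    return d


# Toy 3-state, 2-action MDP (illustrative numbers only).
R = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0]])
P = np.array([[[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]],
              [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]],
              [[0.0, 0.1, 0.9], [0.9, 0.1, 0.0]]])
print(bisimulation_metric(R, P))
```

The inner linear program produces an optimal coupling of the two next-state distributions; the paper's result is that, viewed on pairs of states, iterating this update amounts to value iteration for a coupled copy of the model, so the metric itself is an optimal value function.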


Cited By

  • (2023) State-action similarity-based representations for off-policy evaluation. Proceedings of the 37th International Conference on Neural Information Processing Systems, pp. 42298-42329. DOI: 10.5555/3666122.3667956. Online publication date: 10 December 2023.
  • (2021) A Sufficient Statistic for Influence in Structured Multiagent Environments. Journal of Artificial Intelligence Research, 70, 789-870. DOI: 10.1613/jair.1.12136. Online publication date: 24 February 2021.

Published In

UAI'14: Proceedings of the Thirtieth Conference on Uncertainty in Artificial Intelligence
July 2014
926 pages
ISBN: 9780974903910
Editors: Nevin Zhang, Jin Tian

Sponsors

  • Google Inc.
  • Artificial Intelligence Journal
  • IBM Research
  • Microsoft Research
  • Facebook

Publisher

AUAI Press

Arlington, Virginia, United States

