DOI: 10.1609/aaai.v33i01.33019902

Verifiable and interpretable reinforcement learning through program synthesis

Published: 27 January 2019

Abstract

We study the problem of generating interpretable and verifiable policies for Reinforcement Learning (RL). Unlike the popular Deep Reinforcement Learning (DRL) paradigm, in which the policy is represented by a neural network, the aim of this work is to find policies that can be represented in high-level programming languages. Such programmatic policies have several benefits, including being more easily interpreted than neural networks and being amenable to verification by scalable symbolic methods. The generation methods for programmatic policies also provide a mechanism for systematically using domain knowledge to guide the policy search. The interpretability and verifiability of these policies provide the opportunity to deploy RL-based solutions in safety-critical environments. This thesis draws on, and extends, work from both the machine learning and formal methods communities.
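To make the contrast concrete, here is a minimal illustrative sketch (not the thesis's actual synthesis method) of what a "programmatic policy" looks like: an ordinary, readable program over named state features for a CartPole-style balancing task. The feature layout and the threshold are invented for illustration.

```python
def programmatic_policy(state):
    """Map a state (cart_pos, cart_vel, pole_angle, pole_vel) to an action.

    Returns 1 (push right) or 0 (push left). Because the policy is a short
    program, a human can read it directly, and a symbolic verifier can check
    properties such as "whenever the pole leans right, the action is 1" by
    simple case analysis over the branches.
    """
    cart_pos, cart_vel, pole_angle, pole_vel = state
    # Push in the direction the pole is falling to counteract the fall;
    # the velocity term anticipates where the pole is heading.
    if pole_angle + 0.5 * pole_vel > 0.0:
        return 1  # push right
    return 0      # push left
```

Establishing the same kind of property for a neural-network policy requires specialized solvers (e.g., SMT-based tools for networks with ReLU activations), which is precisely the verification gap that programmatic policies narrow.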


Cited By

  • (2024) Redefining Counterfactual Explanations for Reinforcement Learning: Overview, Challenges and Opportunities. ACM Computing Surveys 56(9):1–33. doi:10.1145/3648472
  • (2021) Program synthesis guided reinforcement learning for partially observed environments. Proceedings of the 35th International Conference on Neural Information Processing Systems, 29669–29683. doi:10.5555/3540261.3542532
  • (2021) Automatic discovery of interpretable planning strategies. Machine Learning 110(9):2641–2683. doi:10.1007/s10994-021-05963-2

Published In

AAAI'19/IAAI'19/EAAI'19: Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence
January 2019
10088 pages
ISBN:978-1-57735-809-1

Sponsors

  • Association for the Advancement of Artificial Intelligence

Publisher

AAAI Press


Qualifiers

  • Research-article
  • Research
  • Refereed limited

