RLPy: a value-function-based reinforcement learning framework for education and research

Published: 01 January 2015

Abstract

RLPy is an object-oriented reinforcement learning software package with a focus on value-function-based methods using linear function approximation and discrete actions. The framework was designed for both educational and research purposes. It provides a rich library of fine-grained, easily exchangeable components for learning agents (e.g., policies or representations of value functions), facilitating recently increased specialization in reinforcement learning. RLPy is written in Python to allow fast prototyping, but is also suitable for large-scale experiments through its built-in support for optimized numerical libraries and parallelization. Code profiling, domain visualizations, and data analysis are integrated in a self-contained package available under the Modified BSD License at http://github.com/rlpy/rlpy. All of these properties allow users to compare various reinforcement learning algorithms with little effort.
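
To illustrate how these exchangeable components fit together, the sketch below composes a domain, a value-function representation, an exploration policy, and a learning agent into an experiment, following the structure of the RLPy tutorial. Module, class, and parameter names (e.g., GridWorld, Tabular, eGreedy, Q_Learning, Experiment, initial_learn_rate) are assumptions based on RLPy 1.x and may differ across versions; consult the documentation in the repository above for exact signatures.

    # Minimal sketch of an RLPy experiment; names and signatures are assumptions
    # based on the RLPy 1.x tutorial and may differ between versions.
    from rlpy.Domains import GridWorld
    from rlpy.Representations import Tabular
    from rlpy.Policies import eGreedy
    from rlpy.Agents import Q_Learning
    from rlpy.Experiments import Experiment

    def make_experiment(exp_id=1, path="./Results/gridworld-qlearning"):
        domain = GridWorld()                    # discrete grid-world task (default map assumed)
        representation = Tabular(domain)        # tabular (one feature per state) value function
        policy = eGreedy(representation, epsilon=0.2)   # epsilon-greedy exploration
        agent = Q_Learning(policy=policy, representation=representation,
                           discount_factor=domain.discount_factor,
                           initial_learn_rate=0.1)
        # The Experiment object runs the agent on the domain, periodically
        # evaluates the current policy, and logs results under `path`.
        return Experiment(agent=agent, domain=domain, exp_id=exp_id, path=path,
                          max_steps=10000, num_policy_checks=10)

    if __name__ == "__main__":
        experiment = make_experiment()
        experiment.run()    # run learning
        experiment.save()   # store results for later analysis and plotting

Because each component is passed in explicitly, swapping the Tabular representation for a different linear function approximator, or eGreedy for another policy, changes only a single line, which is what keeps side-by-side comparisons of algorithms inexpensive.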




      Published In

      The Journal of Machine Learning Research, Volume 16, Issue 1
      January 2015
      3855 pages
      ISSN: 1532-4435
      EISSN: 1533-7928

      Publisher

      JMLR.org

      Publication History

      Published: 01 January 2015
      Revised: 01 November 2014
      Published in JMLR Volume 16, Issue 1

      Author Tags

      1. empirical evaluation
      2. open source
      3. reinforcement learning
      4. value-function

      Qualifiers

      • Article


      Cited By

      • (2023) One risk to rule them all. Proceedings of the 37th International Conference on Neural Information Processing Systems, pp. 77520-77545. DOI: 10.5555/3666122.3669511 (10-Dec-2023)
      • (2023) Disentangled Representation Learning for Generative Adversarial Multi-task Imitation Learning. Proceedings of the 2023 4th International Conference on Control, Robotics and Intelligent System, pp. 76-80. DOI: 10.1145/3622896.3622909 (25-Aug-2023)
      • (2023) Differentially Private Temporal Difference Learning with Stochastic Nonconvex-Strongly-Concave Optimization. Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining, pp. 985-993. DOI: 10.1145/3539597.3570470 (27-Feb-2023)
      • (2023) An Intentional Forgetting-Driven Self-Healing Method for Deep Reinforcement Learning Systems. Proceedings of the 38th IEEE/ACM International Conference on Automated Software Engineering, pp. 1314-1325. DOI: 10.1109/ASE56229.2023.00121 (11-Nov-2023)
      • (2022) Proximal point imitation learning. Proceedings of the 36th International Conference on Neural Information Processing Systems, pp. 24309-24326. DOI: 10.5555/3600270.3602035 (28-Nov-2022)
      • (2021) Invariant causal imitation learning for generalizable policies. Proceedings of the 35th International Conference on Neural Information Processing Systems, pp. 3952-3964. DOI: 10.5555/3540261.3540563 (6-Dec-2021)
      • (2020) Control frequency adaptation via action persistence in batch reinforcement learning. Proceedings of the 37th International Conference on Machine Learning, pp. 6862-6873. DOI: 10.5555/3524938.3525575 (13-Jul-2020)
      • (2020) Evaluating the performance of reinforcement learning algorithms. Proceedings of the 37th International Conference on Machine Learning, pp. 4962-4973. DOI: 10.5555/3524938.3525399 (13-Jul-2020)
      • (2020) Active deep Q-learning with demonstration. Machine Learning 109(9-10), pp. 1699-1725. DOI: 10.1007/s10994-019-05849-4 (1-Sep-2020)
      • (2019) Monte-Carlo tree search for policy optimization. Proceedings of the 28th International Joint Conference on Artificial Intelligence, pp. 3116-3122. DOI: 10.5555/3367471.3367474 (10-Aug-2019)
      • Show More Cited By
