RLPy: a value-function-based reinforcement learning framework for education and research

Published: 01 January 2015

Abstract

RLPy is an object-oriented reinforcement learning software package with a focus on value-function-based methods using linear function approximation and discrete actions. The framework was designed for both educational and research purposes. It provides a rich library of fine-grained, easily exchangeable components for learning agents (e.g., policies or representations of value functions), facilitating recently increased specialization in reinforcement learning. RLPy is written in Python to allow fast prototyping, but is also suitable for large-scale experiments through its built-in support for optimized numerical libraries and parallelization. Code profiling, domain visualizations, and data analysis are integrated in a self-contained package available under the Modified BSD License at http://github.com/rlpy/rlpy. All of these properties allow users to compare various reinforcement learning algorithms with little effort.
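
To illustrate how these exchangeable components fit together, the sketch below composes a domain, a value-function representation, an exploration policy, and a learning agent into an experiment, following the structure of the RLPy tutorial. Module, class, and parameter names (e.g., GridWorld, Tabular, eGreedy, Q_Learning, Experiment, initial_learn_rate) are assumptions based on RLPy 1.x and may differ across versions; consult the documentation in the repository above for exact signatures.

    # Minimal sketch of an RLPy experiment; names and signatures are assumptions
    # based on the RLPy 1.x tutorial and may differ between versions.
    from rlpy.Domains import GridWorld
    from rlpy.Representations import Tabular
    from rlpy.Policies import eGreedy
    from rlpy.Agents import Q_Learning
    from rlpy.Experiments import Experiment

    def make_experiment(exp_id=1, path="./Results/gridworld-qlearning"):
        domain = GridWorld()                    # discrete grid-world task (default map assumed)
        representation = Tabular(domain)        # tabular (one feature per state) value function
        policy = eGreedy(representation, epsilon=0.2)   # epsilon-greedy exploration
        agent = Q_Learning(policy=policy, representation=representation,
                           discount_factor=domain.discount_factor,
                           initial_learn_rate=0.1)
        # The Experiment object runs the agent on the domain, periodically
        # evaluates the current policy, and logs results under `path`.
        return Experiment(agent=agent, domain=domain, exp_id=exp_id, path=path,
                          max_steps=10000, num_policy_checks=10)

    if __name__ == "__main__":
        experiment = make_experiment()
        experiment.run()    # run learning
        experiment.save()   # store results for later analysis and plotting

Because each component is passed in explicitly, swapping the Tabular representation for a different linear function approximator, or eGreedy for another policy, changes only a single line, which is what keeps side-by-side comparisons of algorithms inexpensive.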




      Published In

      The Journal of Machine Learning Research, Volume 16, Issue 1
      January 2015
      3855 pages
      ISSN: 1532-4435
      EISSN: 1533-7928

      Publisher

      JMLR.org

      Publication History

      Published: 01 January 2015
      Revised: 01 November 2014
      Published in JMLR Volume 16, Issue 1

      Author Tags

      1. empirical evaluation
      2. open source
      3. reinforcement learning
      4. value-function

      Qualifiers

      • Article


      Cited By

      • (2023) One risk to rule them all. Proceedings of the 37th International Conference on Neural Information Processing Systems, pp. 77520-77545. DOI: 10.5555/3666122.3669511 (10-Dec-2023)
      • (2023) Disentangled Representation Learning for Generative Adversarial Multi-task Imitation Learning. Proceedings of the 2023 4th International Conference on Control, Robotics and Intelligent System, pp. 76-80. DOI: 10.1145/3622896.3622909 (25-Aug-2023)
      • (2023) Differentially Private Temporal Difference Learning with Stochastic Nonconvex-Strongly-Concave Optimization. Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining, pp. 985-993. DOI: 10.1145/3539597.3570470 (27-Feb-2023)
      • (2023) An Intentional Forgetting-Driven Self-Healing Method for Deep Reinforcement Learning Systems. Proceedings of the 38th IEEE/ACM International Conference on Automated Software Engineering, pp. 1314-1325. DOI: 10.1109/ASE56229.2023.00121 (11-Nov-2023)
      • (2022) Proximal point imitation learning. Proceedings of the 36th International Conference on Neural Information Processing Systems, pp. 24309-24326. DOI: 10.5555/3600270.3602035 (28-Nov-2022)
      • (2021) Invariant causal imitation learning for generalizable policies. Proceedings of the 35th International Conference on Neural Information Processing Systems, pp. 3952-3964. DOI: 10.5555/3540261.3540563 (6-Dec-2021)
      • (2020) Control frequency adaptation via action persistence in batch reinforcement learning. Proceedings of the 37th International Conference on Machine Learning, pp. 6862-6873. DOI: 10.5555/3524938.3525575 (13-Jul-2020)
      • (2020) Evaluating the performance of reinforcement learning algorithms. Proceedings of the 37th International Conference on Machine Learning, pp. 4962-4973. DOI: 10.5555/3524938.3525399 (13-Jul-2020)
      • (2020) Active deep Q-learning with demonstration. Machine Learning 109(9-10), pp. 1699-1725. DOI: 10.1007/s10994-019-05849-4 (1-Sep-2020)
      • (2019) Monte-Carlo tree search for policy optimization. Proceedings of the 28th International Joint Conference on Artificial Intelligence, pp. 3116-3122. DOI: 10.5555/3367471.3367474 (10-Aug-2019)
      • Show More Cited By
