Article

Gradient descent for general reinforcement learning

Authors:

Leemon Baird,

Andrew Moore

Proceedings of the 1998 conference on Advances in neural information processing systems II

Pages 968 - 974

Published: 20 July 1999

Abstract

No abstract available.

Cited By

Zhu L, Chen Z, Schlegel M, White M, Oh A, Naumann T, Globerson A, Saenko K, Hardt M, Levine S (2023). General Munchausen reinforcement learning with Tsallis Kullback-Leibler divergence. Proceedings of the 37th International Conference on Neural Information Processing Systems, pp. 57639-57659. DOI: 10.5555/3666122.3668636. Online publication date: 10-Dec-2023
https://dl.acm.org/doi/10.5555/3666122.3668636
Yin B, Dridi M, Moudni A (2019). Recursive least-squares temporal difference learning for adaptive traffic signal control at intersection. Neural Computing and Applications 31:2, pp. 1013-1028. DOI: 10.1007/s00521-017-3066-9. Online publication date: 1-Feb-2019
https://dl.acm.org/doi/10.1007/s00521-017-3066-9
Cheng Y, Zhang W (2018). Concise deep reinforcement learning obstacle avoidance for underactuated unmanned marine vessels. Neurocomputing 272:C, pp. 63-73. DOI: 10.1016/j.neucom.2017.06.066. Online publication date: 10-Jan-2018
https://dl.acm.org/doi/10.1016/j.neucom.2017.06.066
Index Terms

Gradient descent for general reinforcement learning
1. Computing methodologies
  1. Machine learning
    1. Machine learning approaches
      1. Neural networks
2. Mathematics of computing
  1. Probability and statistics

Recommendations

Gradient descent for general reinforcement learning
NIPS'98: Proceedings of the 11th International Conference on Neural Information Processing Systems

A simple learning rule is derived, the VAPS algorithm, which can be instantiated to generate a wide range of new reinforcement-learning algorithms. These algorithms solve a number of open problems, define several new approaches to reinforcement learning,...
Empirical Comparison of Gradient Descent and Exponentiated Gradient Descent in Supervised and Reinforcement Learning
Swarm robots reinforcement learning convergence accuracy-based learning classifier systems with gradient descent (XCS-GD)

This paper presented a novel approach accuracy-based learning classifier system with gradient descent (XCS-GD) to research on swarm robots reinforcement learning convergence. XCS-GD combines covering operator and genetic algorithm. XCS-GD is responsible ...

Information & Contributors

Information

Published In

Proceedings of the 1998 conference on Advances in neural information processing systems II

July 1999

1090 pages

ISBN: 0262112450

Chairmen:
Michael S. Kearns (AT&T Labs, Florham Park, NJ),
Sara A. Solla (Northwestern Univ., Evanston, IL)

Editor:
David A. Cohn (Harlequin, Inc.; and Just Research)

Publisher

MIT Press

Cambridge, MA, United States

Publication History

Published: 20 July 1999

Qualifiers

Article

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 38
Total Downloads: 0
Downloads (Last 12 months): 0
Downloads (Last 6 weeks): 0

Reflects downloads up to 10 Nov 2024

Citations

Cited By

Asadi K, Littman M (2017). An alternative softmax operator for reinforcement learning. Proceedings of the 34th International Conference on Machine Learning - Volume 70, pp. 243-252. DOI: 10.5555/3305381.3305407. Online publication date: 6-Aug-2017
https://dl.acm.org/doi/10.5555/3305381.3305407
Luo J, Dong X, Yang H, Allan J, Croft B, de Vries A, Zhai C (2015). Session Search by Direct Policy Learning. Proceedings of the 2015 International Conference on The Theory of Information Retrieval, pp. 261-270. DOI: 10.1145/2808194.2809461. Online publication date: 27-Sep-2015
https://dl.acm.org/doi/10.1145/2808194.2809461
Lu S, Tessier R, Burleson W, Jones A, Li H, Coskun A, Margala M (2015). Reinforcement Learning for Thermal-aware Many-core Task Allocation. Proceedings of the 25th edition on Great Lakes Symposium on VLSI, pp. 379-384. DOI: 10.1145/2742060.2742078. Online publication date: 20-May-2015
https://dl.acm.org/doi/10.1145/2742060.2742078
Liang J, Miikkulainen R, Esparcia-Alcázar A (2015). Evolutionary Bilevel Optimization for Complex Control Tasks. Proceedings of the 2015 Annual Conference on Genetic and Evolutionary Computation, pp. 871-878. DOI: 10.1145/2739480.2754732. Online publication date: 11-Jul-2015
https://dl.acm.org/doi/10.1145/2739480.2754732
Schmidhuber J (2015). Deep learning in neural networks. Neural Networks 61:C, pp. 85-117. DOI: 10.1016/j.neunet.2014.09.003. Online publication date: 1-Jan-2015
https://dl.acm.org/doi/10.1016/j.neunet.2014.09.003
Jakab H, Csató L (2011). Improving Gaussian process value function approximation in policy gradient algorithms. Proceedings of the 21st international conference on Artificial neural networks - Volume Part II, pp. 221-228. DOI: 10.5555/2029604.2029633. Online publication date: 14-Jun-2011
https://dl.acm.org/doi/10.5555/2029604.2029633
Robards M, Sunehag P (2011). Gradient based algorithms with loss functions and kernels for improved on-policy control. Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning, pp. 30-41. DOI: 10.1007/978-3-642-29946-9_7. Online publication date: 9-Sep-2011
https://dl.acm.org/doi/10.1007/978-3-642-29946-9_7