Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning

Published: 01 May 1992

Abstract

This article presents a general class of associative reinforcement learning algorithms for connectionist networks containing stochastic units. These algorithms, called REINFORCE algorithms, are shown to make weight adjustments in a direction that lies along the gradient of expected reinforcement in both immediate-reinforcement tasks and certain limited forms of delayed-reinforcement tasks, and they do this without explicitly computing gradient estimates or even storing information from which such estimates could be computed. Specific examples of such algorithms are presented, some of which bear a close relationship to certain existing algorithms while others are novel but potentially interesting in their own right. Also given are results that show how such algorithms can be naturally integrated with backpropagation. We close with a brief discussion of a number of additional issues surrounding the use of such algorithms, including what is known about their limiting behaviors as well as further considerations that might be used to help develop similar but potentially more powerful reinforcement learning algorithms.
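The abstract describes the update only in words, so the following is a minimal sketch of one REINFORCE-style step for a single Bernoulli-logistic stochastic unit. It assumes the commonly cited update form, weight change = alpha * (r - baseline) * d ln Pr(y) / d w. The toy reward (1 when the unit outputs 1), the running-average baseline, and the names `reinforce_step`, `train`, `alpha`, and `baseline` are illustrative assumptions and do not come from this page.

```python
import math
import random


def sigmoid(z):
    """Logistic squashing function."""
    return 1.0 / (1.0 + math.exp(-z))


def reinforce_step(w, x, reward_fn, alpha, baseline, rng):
    """One REINFORCE-style update for a single Bernoulli-logistic unit.

    Assumed (hypothetical) setup: the unit fires y = 1 with probability
    p = sigmoid(w . x), then receives a scalar reinforcement r.
    """
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    y = 1 if rng.random() < p else 0  # sample the stochastic output
    r = reward_fn(x, y)               # immediate reinforcement
    # For this unit, d ln Pr(y) / d w_i = (y - p) * x_i, so no explicit
    # gradient estimate of expected reinforcement is stored or computed.
    new_w = [wi + alpha * (r - baseline) * (y - p) * xi
             for wi, xi in zip(w, x)]
    return new_w, y, r


def train(steps=2000, alpha=0.5, seed=0):
    """Toy immediate-reinforcement task: reward is 1 when the unit outputs 1."""
    rng = random.Random(seed)
    w, x = [0.0], [1.0]   # a single bias-like weight with constant input
    baseline = 0.0        # running average of reward (an assumed choice)
    for _ in range(steps):
        w, y, r = reinforce_step(w, x, lambda _x, out: float(out == 1),
                                 alpha, baseline, rng)
        baseline += 0.1 * (r - baseline)
    return sigmoid(w[0])  # learned probability of emitting 1


if __name__ == "__main__":
    print(f"P(y = 1) after training: {train():.3f}")
```

In this toy task every update is non-negative, so the probability of emitting the rewarded output drifts upward over training. The baseline changes only the variance of the updates, not their expected direction.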

    Published In

    Machine Learning, Volume 8, Issue 3-4
    May 1992
    167 pages
    ISSN: 0885-6125

    Publisher

    Kluwer Academic Publishers

    United States

    Author Tags

    1. Reinforcement learning
    2. connectionist networks
    3. gradient descent
    4. mathematical analysis

    Qualifiers

    • Article

    Cited By

    • (2024) Electric Vehicle Routing for Emergency Power Supply with Deep Reinforcement Learning. Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, pp. 2336-2338. DOI: 10.5555/3635637.3663152. Online publication date: 6-May-2024
    • (2024) Attention-based Priority Learning for Limited Time Multi-Agent Path Finding. Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, pp. 1993-2001. DOI: 10.5555/3635637.3663063. Online publication date: 6-May-2024
    • (2024) Collaborative Deep Reinforcement Learning for Solving Multi-Objective Vehicle Routing Problems. Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, pp. 1956-1965. DOI: 10.5555/3635637.3663059. Online publication date: 6-May-2024
    • (2024) Emergent Cooperation under Uncertain Incentive Alignment. Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, pp. 1521-1530. DOI: 10.5555/3635637.3663012. Online publication date: 6-May-2024
    • (2024) The Stochastic Evolutionary Dynamics of Softmax Policy Gradient in Games. Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, pp. 1101-1109. DOI: 10.5555/3635637.3662966. Online publication date: 6-May-2024
    • (2024) Logistics Distribution Route Optimization With Time Windows Based on Multi-Agent Deep Reinforcement Learning. International Journal of Information Technologies and Systems Approach 17:1, pp. 1-23. DOI: 10.4018/IJITSA.342084. Online publication date: 16-Apr-2024
    • (2024) Structure-Information-Based Reasoning over the Knowledge Graph: A Survey of Methods and Applications. ACM Transactions on Knowledge Discovery from Data 18:8, pp. 1-42. DOI: 10.1145/3671148. Online publication date: 16-Aug-2024
    • (2024) A Combinatorial Optimization Framework for Probability-Based Algorithms by Means of Generative Models. ACM Transactions on Evolutionary Learning and Optimization 4:3, pp. 1-28. DOI: 10.1145/3665650. Online publication date: 22-May-2024
    • (2024) Efficient Automation of Neural Network Design: A Survey on Differentiable Neural Architecture Search. ACM Computing Surveys 56:11, pp. 1-36. DOI: 10.1145/3665138. Online publication date: 28-Jun-2024
    • (2024) Creativity and Machine Learning: A Survey. ACM Computing Surveys 56:11, pp. 1-41. DOI: 10.1145/3664595. Online publication date: 28-Jun-2024