
Simple reinforcement learning agents: Pareto beats Nash in an algorithmic game theory study

Published in: Information Systems and e-Business Management

Abstract

Repeated play in games by simple adaptive agents is investigated. The agents use Q-learning, a special form of reinforcement learning, to learn behavioral strategies in a number of 2×2 games. The agents effectively maximize the total wealth extracted, which often leads to Pareto optimal outcomes. When the reward signals are sufficiently clear, Pareto optimal outcomes are largely achieved. This effect can select Pareto outcomes that are not Nash equilibria, and it can select Pareto optimal outcomes among Nash equilibria.
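The setup described in the abstract can be illustrated with a minimal sketch: two Q-learning agents repeatedly playing a 2×2 game. Everything below is an illustrative assumption, not the authors' implementation: the Prisoner's Dilemma payoffs, the learning rate, the exploration rate, and the use of stateless learners are all hypothetical choices. In particular, memoryless learners like these typically drift toward the Nash outcome; the paper's agents learn behavioral strategies that can instead reach Pareto optimal outcomes.

```python
import random

# Illustrative Prisoner's Dilemma payoffs (row, column): C = cooperate, D = defect.
# (C, C) is Pareto optimal; (D, D) is the unique Nash equilibrium.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

class QAgent:
    """Stateless Q-learner over the two actions of a 2x2 game."""

    def __init__(self, alpha=0.2, epsilon=0.1):
        self.q = {"C": 0.0, "D": 0.0}  # estimated value of each action
        self.alpha = alpha             # learning rate (assumed value)
        self.epsilon = epsilon         # exploration probability (assumed value)

    def act(self):
        # Epsilon-greedy action selection.
        if random.random() < self.epsilon:
            return random.choice(["C", "D"])
        return max(self.q, key=self.q.get)

    def update(self, action, reward):
        # One-step Q-update with no next-state term, since the stage
        # game is treated as a repeated one-shot interaction.
        self.q[action] += self.alpha * (reward - self.q[action])

random.seed(0)
row, col = QAgent(), QAgent()
for _ in range(5000):
    a, b = row.act(), col.act()
    ra, rb = PAYOFFS[(a, b)]
    row.update(a, ra)
    col.update(b, rb)

# Greedy readout of the learned joint behavior.
row.epsilon = col.epsilon = 0.0
print(row.act(), col.act())
```

Conditioning the Q-values on recent history (as the paper's agents do) is what opens the door to reciprocity, and with it the Pareto optimal outcomes the abstract reports.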


Author information

Corresponding author

Correspondence to Steven O. Kimbrough.

Additional information

Acknowledgement: This material is based upon work supported, in whole or in part, by NSF grant number SES-9709548. We wish to thank an anonymous referee for a number of very helpful suggestions.


About this article

Cite this article

Kimbrough, S., Lu, M. Simple reinforcement learning agents: Pareto beats Nash in an algorithmic game theory study. ISeB 3, 1–19 (2005). https://doi.org/10.1007/s10257-003-0024-0
