Abstract
In this paper we present a new method for improving reinforcement learning training times under the following two assumptions: (1) we know the conditions under which the environment gives reward; and (2) we can control the initial state of the environment at the beginning of a training episode. Our method, called intra-task curriculum learning, presents the different episode starting states to an agent in order of increasing distance to immediate reward.
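The ordering described in the abstract can be sketched as follows, under the paper's two assumptions: the reward (goal) states are known, and the starting state of each episode can be chosen. The sketch searches backwards from the goal states to label every reachable state with its distance to reward, then sorts the candidate starting states by that distance. The function names and the `neighbours` callback are ours, for illustration only; they are not the paper's API. Searching backwards assumes the transition graph can be traversed in reverse (true for the deterministic, symmetric gridworld moves used later in the paper).

```python
from collections import deque

def bfs_distances(start_states, goal_states, neighbours):
    """Label states with their minimum number of transitions to a goal
    state, via breadth-first search backwards from the goal states.
    `neighbours(s)` must return the states one transition away from s."""
    dist = {g: 0 for g in goal_states}
    frontier = deque(goal_states)
    while frontier:
        s = frontier.popleft()
        for n in neighbours(s):
            if n not in dist:
                dist[n] = dist[s] + 1
                frontier.append(n)
    # Keep only the candidate starting states that can reach a goal.
    return {s: dist[s] for s in start_states if s in dist}

def curriculum_order(start_states, goal_states, neighbours):
    """Order episode starting states by increasing distance to reward."""
    d = bfs_distances(start_states, goal_states, neighbours)
    return sorted(d, key=d.get)
```

Training would then begin episodes from the easiest (closest) states first and move outward as the agent improves.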
Notes
- 1.
Eligibility traces are recorded for each value in value-function-based reinforcement learning. We use the term “state” here, but in practice the values can also be stored for state-action pairs (e.g. in Q-learning).
- 2.
For the purpose of this paper, we define the distance from state \(s_a\) to state \(s_b\) as the minimum number of transitions required to get from \(s_a\) to \(s_b\). This definition is sufficient because the environments we use have deterministic transitions; a different definition would be required for stochastic environments.
- 3.
We have a \(5 \times 5\) grid with 7 walls, leaving 18 free spaces.
- 4.
We number the states in the 2D environment from left to right, bottom to top.
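As an illustration of notes 2–4, the sketch below computes the transition distance of note 2 on a \(5 \times 5\) gridworld and numbers states left to right, bottom to top. The wall layout here is invented for the example (any 7 walls leave \(25 - 7 = 18\) free spaces); it is not the paper's actual map.

```python
from collections import deque

WIDTH, HEIGHT = 5, 5
# Hypothetical wall layout: 7 walls, chosen only for illustration.
WALLS = {(1, 1), (1, 2), (1, 3), (2, 3), (3, 1), (3, 2), (3, 3)}

def state_index(x, y):
    """Number states left to right, bottom to top (note 4):
    (0, 0) is state 0 and (4, 4) is state 24."""
    return y * WIDTH + x

def min_transitions(src, dst):
    """Minimum number of moves between two free cells (note 2),
    via breadth-first search over the four grid directions."""
    if src == dst:
        return 0
    dist = {src: 0}
    frontier = deque([src])
    while frontier:
        x, y = frontier.popleft()
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            nx, ny = nxt
            if (0 <= nx < WIDTH and 0 <= ny < HEIGHT
                    and nxt not in WALLS and nxt not in dist):
                dist[nxt] = dist[(x, y)] + 1
                if nxt == dst:
                    return dist[nxt]
                frontier.append(nxt)
    return None  # dst unreachable from src
```

Because the gridworld's transitions are deterministic and symmetric, this shortest-path count matches the distance definition in note 2 exactly.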
Copyright information
© 2018 Springer Nature Switzerland AG
Cite this paper
du Preez-Wilkinson, N., Gallagher, M., Hu, X. (2018). Intra-task Curriculum Learning for Faster Reinforcement Learning in Video Games. In: Mitrovic, T., Xue, B., Li, X. (eds) AI 2018: Advances in Artificial Intelligence. AI 2018. Lecture Notes in Computer Science(), vol 11320. Springer, Cham. https://doi.org/10.1007/978-3-030-03991-2_6
Print ISBN: 978-3-030-03990-5
Online ISBN: 978-3-030-03991-2