Reset-Free Guided Policy Search: Efficient Deep Reinforcement Learning with Stochastic Initial States

Montgomery, William; Ajay, Anurag; Finn, Chelsea; Abbeel, Pieter; Levine, Sergey

arXiv:1610.01112 (cs)

[Submitted on 4 Oct 2016 (v1), last revised 6 Oct 2016 (this version, v2)]

Abstract: Autonomous learning of robotic skills can allow general-purpose robots to learn wide behavioral repertoires without requiring extensive manual engineering. However, robotic skill learning methods typically make one of several trade-offs to enable practical real-world learning, such as requiring manually designed policy or value function representations, initialization from human-provided demonstrations, instrumentation of the training environment, or extremely long training times. In this paper, we propose a new reinforcement learning algorithm for learning manipulation skills that can train general-purpose neural network policies with minimal human engineering, while still allowing for fast, efficient learning in stochastic environments. Our approach builds on the guided policy search (GPS) algorithm, which transforms the reinforcement learning problem into supervised learning from a computational teacher (without human demonstrations). In contrast to prior GPS methods, which require a consistent set of initial states to which the system must be reset after each episode, our approach can handle randomized initial states, allowing it to be used in environments where deterministic resets are impossible. We compare our method to existing policy search techniques in simulation, showing that it can train high-dimensional neural network policies with the same sample efficiency as prior GPS methods, and present real-world results on a PR2 robotic manipulator.

Subjects:	Machine Learning (cs.LG); Robotics (cs.RO)
Cite as:	arXiv:1610.01112 [cs.LG]
	(or arXiv:1610.01112v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1610.01112

Submission history

From: William Montgomery Iv
[v1] Tue, 4 Oct 2016 18:05:52 UTC (1,050 KB)
[v2] Thu, 6 Oct 2016 05:10:11 UTC (1,050 KB)
