Showing 1–5 of 5 results for author: Doerr, A

Searching in archive stat.
  1. arXiv:1905.05710

    cs.LG cs.AI stat.ML

    Trajectory-Based Off-Policy Deep Reinforcement Learning

    Authors: Andreas Doerr, Michael Volpp, Marc Toussaint, Sebastian Trimpe, Christian Daniel

    Abstract: Policy gradient methods are powerful reinforcement learning algorithms and have been demonstrated to solve many complex tasks. However, these methods are also data-inefficient, afflicted with high variance gradient estimates, and frequently get stuck in local optima. This work addresses these weaknesses by combining recent improvements in the reuse of off-policy data and exploration in parameter s…

    Submitted 14 May, 2019; originally announced May 2019.

    Comments: Includes appendix. Accepted for ICML 2019

  2. arXiv:1904.02642

    stat.ML cs.AI cs.LG

    Meta-Learning Acquisition Functions for Transfer Learning in Bayesian Optimization

    Authors: Michael Volpp, Lukas P. Fröhlich, Kirsten Fischer, Andreas Doerr, Stefan Falkner, Frank Hutter, Christian Daniel

    Abstract: Transferring knowledge across tasks to improve data-efficiency is one of the open key challenges in the field of global black-box optimization. Readily available algorithms are typically designed to be universal optimizers and, therefore, often suboptimal for specific tasks. We propose a novel transfer learning method to obtain customized optimizers within the well-established framework of Bayesia…

    Submitted 14 February, 2020; v1 submitted 4 April, 2019; originally announced April 2019.

  3. arXiv:1810.12263

    stat.ML cs.LG

    Learning Gaussian Processes by Minimizing PAC-Bayesian Generalization Bounds

    Authors: David Reeb, Andreas Doerr, Sebastian Gerwinn, Barbara Rakitsch

    Abstract: Gaussian Processes (GPs) are a generic modelling tool for supervised learning. While they have been successfully applied on large datasets, their use in safety-critical applications is hindered by the lack of good performance guarantees. To this end, we propose a method to learn GPs and their sparse approximations by directly optimizing a PAC-Bayesian bound on their generalization performance, ins…

    Submitted 28 December, 2018; v1 submitted 29 October, 2018; originally announced October 2018.

    Comments: 11 pages main text, 12 pages appendix. v2: minor changes, new NeurIPS style file. Final camera-ready version submitted to NeurIPS 2018

    Journal ref: Advances in Neural Information Processing Systems 31 (Proceedings of the NeurIPS Conference 2018), https://papers.nips.cc/paper/7594-learning-gaussian-processes-by-minimizing-pac-bayesian-generalization-bounds

  4. arXiv:1801.10395

    stat.ML

    Probabilistic Recurrent State-Space Models

    Authors: Andreas Doerr, Christian Daniel, Martin Schiegg, Duy Nguyen-Tuong, Stefan Schaal, Marc Toussaint, Sebastian Trimpe

    Abstract: State-space models (SSMs) are a highly expressive model class for learning patterns in time series data and for system identification. Deterministic versions of SSMs (e.g. LSTMs) proved extremely successful in modeling complex time series data. Fully probabilistic SSMs, however, are often found hard to train, even for smaller problems. To overcome this limitation, we propose a novel model formulat…

    Submitted 10 February, 2018; v1 submitted 31 January, 2018; originally announced January 2018.

  5. arXiv:1703.02899

    cs.LG cs.RO eess.SY stat.ML

    Model-Based Policy Search for Automatic Tuning of Multivariate PID Controllers

    Authors: Andreas Doerr, Duy Nguyen-Tuong, Alonso Marco, Stefan Schaal, Sebastian Trimpe

    Abstract: PID control architectures are widely used in industrial applications. Despite their low number of open parameters, tuning multiple, coupled PID controllers can become tedious in practice. In this paper, we extend PILCO, a model-based policy search framework, to automatically tune multivariate PID controllers purely based on data observed on an otherwise unknown system. The system's state is extend…

    Submitted 8 March, 2017; originally announced March 2017.

    Comments: Accepted final version to appear in 2017 IEEE International Conference on Robotics and Automation (ICRA)