Showing 1–44 of 44 results for author: Leike, J

  1. arXiv:2407.13692  [pdf, other]

    cs.CL

    Prover-Verifier Games improve legibility of LLM outputs

    Authors: Jan Hendrik Kirchner, Yining Chen, Harri Edwards, Jan Leike, Nat McAleese, Yuri Burda

    Abstract: One way to increase confidence in the outputs of Large Language Models (LLMs) is to support them with reasoning that is clear and easy to check -- a property we call legibility. We study legibility in the context of solving grade-school math problems and show that optimizing chain-of-thought solutions only for answer correctness can make them less legible. To mitigate the loss in legibility, we pr…

    Submitted 1 August, 2024; v1 submitted 18 July, 2024; originally announced July 2024.

  2. arXiv:2407.00215  [pdf, other]

    cs.SE cs.LG

    LLM Critics Help Catch LLM Bugs

    Authors: Nat McAleese, Rai Michael Pokorny, Juan Felipe Ceron Uribe, Evgenia Nitishinskaya, Maja Trebacz, Jan Leike

    Abstract: Reinforcement learning from human feedback (RLHF) is fundamentally limited by the capacity of humans to correctly evaluate model output. To improve human evaluation ability and overcome that limitation this work trains "critic" models that help humans to more accurately evaluate model-written code. These critics are themselves LLMs trained with RLHF to write natural language feedback highlighting…

    Submitted 28 June, 2024; originally announced July 2024.

  3. arXiv:2406.04093  [pdf, other]

    cs.LG cs.AI

    Scaling and evaluating sparse autoencoders

    Authors: Leo Gao, Tom Dupré la Tour, Henk Tillman, Gabriel Goh, Rajan Troll, Alec Radford, Ilya Sutskever, Jan Leike, Jeffrey Wu

    Abstract: Sparse autoencoders provide a promising unsupervised approach for extracting interpretable features from a language model by reconstructing activations from a sparse bottleneck layer. Since language models learn many concepts, autoencoders need to be very large to recover all relevant features. However, studying the properties of autoencoder scaling is difficult due to the need to balance reconstr…

    Submitted 6 June, 2024; originally announced June 2024.

  4. arXiv:2312.09390  [pdf, other]

    cs.CL

    Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision

    Authors: Collin Burns, Pavel Izmailov, Jan Hendrik Kirchner, Bowen Baker, Leo Gao, Leopold Aschenbrenner, Yining Chen, Adrien Ecoffet, Manas Joglekar, Jan Leike, Ilya Sutskever, Jeff Wu

    Abstract: Widely used alignment techniques, such as reinforcement learning from human feedback (RLHF), rely on the ability of humans to supervise model behavior - for example, to evaluate whether a model faithfully followed instructions or generated safe outputs. However, future superhuman models will behave in complex ways too difficult for humans to reliably evaluate; humans will only be able to weakly su…

    Submitted 14 December, 2023; originally announced December 2023.

  5. arXiv:2305.20050  [pdf, other]

    cs.LG cs.AI cs.CL

    Let's Verify Step by Step

    Authors: Hunter Lightman, Vineet Kosaraju, Yura Burda, Harri Edwards, Bowen Baker, Teddy Lee, Jan Leike, John Schulman, Ilya Sutskever, Karl Cobbe

    Abstract: In recent years, large language models have greatly improved in their ability to perform complex multi-step reasoning. However, even state-of-the-art models still regularly produce logical mistakes. To train more reliable models, we can turn either to outcome supervision, which provides feedback for a final result, or process supervision, which provides feedback for each intermediate reasoning ste…

    Submitted 31 May, 2023; originally announced May 2023.

  6. arXiv:2303.08774  [pdf, other]

    cs.CL cs.AI

    GPT-4 Technical Report

    Authors: OpenAI, Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, Red Avila, Igor Babuschkin, Suchir Balaji, Valerie Balcom, Paul Baltescu, Haiming Bao, Mohammad Bavarian, Jeff Belgum, Irwan Bello, Jake Berdine, Gabriel Bernadett-Shapiro, Christopher Berner, Lenny Bogdonoff, Oleg Boiko, et al. (256 additional authors not shown)

    Abstract: We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based mo…

    Submitted 4 March, 2024; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: 100 pages; updated authors list; fixed author names and added citation

  7. arXiv:2206.05802  [pdf, other]

    cs.CL cs.LG

    Self-critiquing models for assisting human evaluators

    Authors: William Saunders, Catherine Yeh, Jeff Wu, Steven Bills, Long Ouyang, Jonathan Ward, Jan Leike

    Abstract: We fine-tune large language models to write natural language critiques (natural language critical comments) using behavioral cloning. On a topic-based summarization task, critiques written by our models help humans find flaws in summaries that they would have otherwise missed. Our models help find naturally occurring flaws in both model and human written summaries, and intentional flaws in summari…

    Submitted 13 June, 2022; v1 submitted 12 June, 2022; originally announced June 2022.

  8. arXiv:2203.02155  [pdf, other]

    cs.CL cs.AI cs.LG

    Training language models to follow instructions with human feedback

    Authors: Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan Lowe

    Abstract: Making language models bigger does not inherently make them better at following a user's intent. For example, large language models can generate outputs that are untruthful, toxic, or simply not helpful to the user. In other words, these models are not aligned with their users. In this paper, we show an avenue for aligning language models with user intent on a wide range of tasks by fine-tuning wi…

    Submitted 4 March, 2022; originally announced March 2022.

  9. arXiv:2201.08102  [pdf, other]

    cs.LG

    Safe Deep RL in 3D Environments using Human Feedback

    Authors: Matthew Rahtz, Vikrant Varma, Ramana Kumar, Zachary Kenton, Shane Legg, Jan Leike

    Abstract: Agents should avoid unsafe behaviour during both training and deployment. This typically requires a simulator and a procedural specification of unsafe behaviour. Unfortunately, a simulator is not always available, and procedurally specifying constraints can be difficult or impossible for many real-world tasks. A recently introduced technique, ReQueST, aims to solve this problem by learning a neura…

    Submitted 21 January, 2022; v1 submitted 20 January, 2022; originally announced January 2022.

  10. arXiv:2109.10862  [pdf, other]

    cs.CL cs.AI cs.LG

    Recursively Summarizing Books with Human Feedback

    Authors: Jeff Wu, Long Ouyang, Daniel M. Ziegler, Nisan Stiennon, Ryan Lowe, Jan Leike, Paul Christiano

    Abstract: A major challenge for scaling machine learning is training models to perform tasks that are very difficult or time-consuming for humans to evaluate. We present progress on this problem on the task of abstractive summarization of entire fiction novels. Our method combines learning from human feedback with recursive task decomposition: we use models trained on smaller parts of the task to assist hum…

    Submitted 27 September, 2021; v1 submitted 22 September, 2021; originally announced September 2021.

  11. arXiv:2107.03374  [pdf, other]

    cs.LG

    Evaluating Large Language Models Trained on Code

    Authors: Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Ponde de Oliveira Pinto, Jared Kaplan, Harri Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, Alex Ray, Raul Puri, Gretchen Krueger, Michael Petrov, Heidy Khlaaf, Girish Sastry, Pamela Mishkin, Brooke Chan, Scott Gray, Nick Ryder, Mikhail Pavlov, Alethea Power, Lukasz Kaiser, Mohammad Bavarian, Clemens Winter, et al. (33 additional authors not shown)

    Abstract: We introduce Codex, a GPT language model fine-tuned on publicly available code from GitHub, and study its Python code-writing capabilities. A distinct production version of Codex powers GitHub Copilot. On HumanEval, a new evaluation set we release to measure functional correctness for synthesizing programs from docstrings, our model solves 28.8% of the problems, while GPT-3 solves 0% and GPT-J sol…

    Submitted 14 July, 2021; v1 submitted 7 July, 2021; originally announced July 2021.

    Comments: corrected typos, added references, added authors, added acknowledgements

  12. Institutionalising Ethics in AI through Broader Impact Requirements

    Authors: Carina Prunkl, Carolyn Ashurst, Markus Anderljung, Helena Webb, Jan Leike, Allan Dafoe

    Abstract: Turning principles into practice is one of the most pressing challenges of artificial intelligence (AI) governance. In this article, we reflect on a novel governance initiative by one of the world's largest AI conferences. In 2020, the Conference on Neural Information Processing Systems (NeurIPS) introduced a requirement for submitting authors to include a statement on the broader societal impacts…

    Submitted 30 May, 2021; originally announced June 2021.

    Journal ref: Nature Machine Intelligence 3.2 (2021): 104-110

  13. arXiv:2011.06709  [pdf, other]

    cs.LG cs.AI stat.ML

    Active Reinforcement Learning: Observing Rewards at a Cost

    Authors: David Krueger, Jan Leike, Owain Evans, John Salvatier

    Abstract: Active reinforcement learning (ARL) is a variant on reinforcement learning where the agent does not observe the reward unless it chooses to pay a query cost c > 0. The central question of ARL is how to quantify the long-term value of reward information. Even in multi-armed bandits, computing the value of this information is intractable and we have to rely on heuristics. We propose and evaluate sev…

    Submitted 24 November, 2020; v1 submitted 12 November, 2020; originally announced November 2020.

    Comments: Originally appeared at the NeurIPS 2016 "Future of Interactive Learning Machines (FILM)" workshop

  14. arXiv:2009.09153  [pdf, other]

    cs.LG cs.AI stat.ML

    Hidden Incentives for Auto-Induced Distributional Shift

    Authors: David Krueger, Tegan Maharaj, Jan Leike

    Abstract: Decisions made by machine learning systems have increasing influence on the world, yet it is common for machine learning algorithms to assume that no such influence exists. An example is the use of the i.i.d. assumption in content recommendation. In fact, the (choice of) content displayed can change users' perceptions and preferences, or even drive them away, causing a shift in the distribution of…

    Submitted 18 September, 2020; originally announced September 2020.

  15. arXiv:2006.13900  [pdf, other]

    cs.LG cs.AI stat.ML

    Quantifying Differences in Reward Functions

    Authors: Adam Gleave, Michael Dennis, Shane Legg, Stuart Russell, Jan Leike

    Abstract: For many tasks, the reward function is inaccessible to introspection or too complex to be specified procedurally, and must instead be learned from user data. Prior work has evaluated learned reward functions by evaluating policies optimized for the learned reward. However, this method cannot distinguish between the learned reward function failing to reflect user preferences and the policy optimiza…

    Submitted 17 March, 2021; v1 submitted 24 June, 2020; originally announced June 2020.

    Comments: Published at ICLR 2021. 9 pages main paper, 42 pages total

    ACM Class: I.2.6

  16. arXiv:2004.13654  [pdf, other]

    cs.AI

    Pitfalls of learning a reward function online

    Authors: Stuart Armstrong, Jan Leike, Laurent Orseau, Shane Legg

    Abstract: In some agent designs like inverse reinforcement learning an agent needs to learn its own reward function. Learning the reward function and optimising for it are typically two different processes, usually performed at different stages. We consider a continual (``one life'') learning approach where the agent both learns the reward function and optimises for it at the same time. We show that this co…

    Submitted 28 April, 2020; originally announced April 2020.

  17. arXiv:1912.05652  [pdf, other]

    cs.CY cs.LG stat.ML

    Learning Human Objectives by Evaluating Hypothetical Behavior

    Authors: Siddharth Reddy, Anca D. Dragan, Sergey Levine, Shane Legg, Jan Leike

    Abstract: We seek to align agent behavior with a user's objectives in a reinforcement learning setting with unknown dynamics, an unknown reward function, and unknown unsafe states. The user knows the rewards and unsafe states, but querying the user is expensive. To address this challenge, we propose an algorithm that safely and interactively learns a model of the user's reward function. We start with a gene…

    Submitted 24 March, 2021; v1 submitted 5 December, 2019; originally announced December 2019.

    Comments: Published at International Conference on Machine Learning (ICML) 2020

  18. arXiv:1812.05979  [pdf, ps, other]

    cs.LG cs.CR cs.NE

    Scaling shared model governance via model splitting

    Authors: Miljan Martic, Jan Leike, Andrew Trask, Matteo Hessel, Shane Legg, Pushmeet Kohli

    Abstract: Currently the only techniques for sharing governance of a deep learning model are homomorphic encryption and secure multiparty computation. Unfortunately, neither of these techniques is applicable to the training of large neural networks due to their large computational and communication overheads. As a scalable technique for shared model governance, we propose splitting deep learning model betwee…

    Submitted 14 December, 2018; originally announced December 2018.

    Comments: 9 pages

  19. arXiv:1811.07871  [pdf, other]

    cs.LG cs.AI cs.NE stat.ML

    Scalable agent alignment via reward modeling: a research direction

    Authors: Jan Leike, David Krueger, Tom Everitt, Miljan Martic, Vishal Maini, Shane Legg

    Abstract: One obstacle to applying reinforcement learning algorithms to real-world problems is the lack of suitable reward functions. Designing such reward functions is difficult in part because the user only has an implicit understanding of the task objective. This gives rise to the agent alignment problem: how do we create agents that behave in accordance with the user's intentions? We outline a high-leve…

    Submitted 19 November, 2018; originally announced November 2018.

  20. arXiv:1811.06521  [pdf, other]

    cs.LG cs.AI cs.NE stat.ML

    Reward learning from human preferences and demonstrations in Atari

    Authors: Borja Ibarz, Jan Leike, Tobias Pohlen, Geoffrey Irving, Shane Legg, Dario Amodei

    Abstract: To solve complex real-world problems with reinforcement learning, we cannot rely on manually specified reward functions. Instead, we can have humans communicate an objective to the agent directly. In this work, we combine two approaches to learning from human feedback: expert demonstrations and trajectory preferences. We train a deep neural network to model the reward function and use its predicte…

    Submitted 15 November, 2018; originally announced November 2018.

    Comments: NIPS 2018

  21. arXiv:1806.01946  [pdf, other]

    cs.AI cs.LG

    Learning to Understand Goal Specifications by Modelling Reward

    Authors: Dzmitry Bahdanau, Felix Hill, Jan Leike, Edward Hughes, Arian Hosseini, Pushmeet Kohli, Edward Grefenstette

    Abstract: Recent work has shown that deep reinforcement-learning agents can learn to follow language-like instructions from infrequent environment rewards. However, this places on environment designers the onus of designing language-conditional reward functions which may not be easily or tractably implemented as the complexity of the environment and the language scales. To overcome this limitation, we prese…

    Submitted 23 December, 2019; v1 submitted 5 June, 2018; originally announced June 2018.

    Comments: 19 pages, 9 figures

  22. arXiv:1711.09883  [pdf, other]

    cs.LG cs.AI

    AI Safety Gridworlds

    Authors: Jan Leike, Miljan Martic, Victoria Krakovna, Pedro A. Ortega, Tom Everitt, Andrew Lefrancq, Laurent Orseau, Shane Legg

    Abstract: We present a suite of reinforcement learning environments illustrating various safety properties of intelligent agents. These problems include safe interruptibility, avoiding side effects, absent supervisor, reward gaming, safe exploration, as well as robustness to self-modification, distributional shift, and adversaries. To measure compliance with the intended safe behavior, we equip each environ…

    Submitted 28 November, 2017; v1 submitted 27 November, 2017; originally announced November 2017.

  23. arXiv:1706.03741  [pdf, other]

    stat.ML cs.AI cs.HC cs.LG

    Deep reinforcement learning from human preferences

    Authors: Paul Christiano, Jan Leike, Tom B. Brown, Miljan Martic, Shane Legg, Dario Amodei

    Abstract: For sophisticated reinforcement learning (RL) systems to interact usefully with real-world environments, we need to communicate complex goals to these systems. In this work, we explore goals defined in terms of (non-expert) human preferences between pairs of trajectory segments. We show that this approach can effectively solve complex RL tasks without access to the reward function, including Atari…

    Submitted 17 February, 2023; v1 submitted 12 June, 2017; originally announced June 2017.

  24. arXiv:1705.10557  [pdf, other]

    cs.AI

    Universal Reinforcement Learning Algorithms: Survey and Experiments

    Authors: John Aslanides, Jan Leike, Marcus Hutter

    Abstract: Many state-of-the-art reinforcement learning (RL) algorithms typically assume that the environment is an ergodic Markov Decision Process (MDP). In contrast, the field of universal reinforcement learning (URL) is concerned with algorithms that make as few assumptions as possible about the environment. The universal Bayesian agent AIXI and a family of related URL algorithms have been developed in th…

    Submitted 30 May, 2017; originally announced May 2017.

    Comments: 8 pages, 6 figures, Twenty-sixth International Joint Conference on Artificial Intelligence (IJCAI-17)

  25. arXiv:1703.01358  [pdf, ps, other]

    cs.AI

    Generalised Discount Functions applied to a Monte-Carlo AImu Implementation

    Authors: Sean Lamont, John Aslanides, Jan Leike, Marcus Hutter

    Abstract: In recent years, work has been done to develop the theory of General Reinforcement Learning (GRL). However, there are few examples demonstrating these results in a concrete way. In particular, there are no examples demonstrating the known results regarding generalised discounting. We have added to the GRL simulation platform AIXIjs the functionality to assign an agent arbitrary discount function…

    Submitted 3 March, 2017; originally announced March 2017.

    Comments: 12 pages, 4 figures

  26. arXiv:1611.08944  [pdf, other]

    cs.AI

    Nonparametric General Reinforcement Learning

    Authors: Jan Leike

    Abstract: Reinforcement learning (RL) problems are often phrased in terms of Markov decision processes (MDPs). In this thesis we go beyond MDPs and consider RL in environments that are non-Markovian, non-ergodic and only partially observable. Our focus is not on practical algorithms, but rather on the fundamental underlying problems: How do we balance exploration and exploitation? How do we explore optimall…

    Submitted 27 November, 2016; originally announced November 2016.

    Comments: PhD thesis

  27. arXiv:1609.05207  [pdf, ps, other]

    cs.LO

    Geometric Nontermination Arguments

    Authors: Jan Leike, Matthias Heizmann

    Abstract: We present a new kind of nontermination argument, called geometric nontermination argument. The geometric nontermination argument is a finite representation of an infinite execution that has the form of a sum of several geometric series. For so-called linear lasso programs we can decide the existence of a geometric nontermination argument using a nonlinear algebraic $\exists$-constraint. We show t…

    Submitted 16 September, 2016; originally announced September 2016.

    Comments: 18 pages

  28. arXiv:1609.05058  [pdf, ps, other]

    cs.AI cs.GT cs.LG

    A Formal Solution to the Grain of Truth Problem

    Authors: Jan Leike, Jessica Taylor, Benya Fallenstein

    Abstract: A Bayesian agent acting in a multi-agent environment learns to predict the other agents' policies if its prior assigns positive probability to them (in other words, its prior contains a \emph{grain of truth}). Finding a reasonably large class of policies that contains the Bayes-optimal policies with respect to this class is known as the \emph{grain of truth problem}. Only small classes are known t…

    Submitted 16 September, 2016; originally announced September 2016.

    Comments: UAI 2016

  29. arXiv:1609.04994  [pdf, other]

    cs.LG cs.AI

    Exploration Potential

    Authors: Jan Leike

    Abstract: We introduce exploration potential, a quantity that measures how much a reinforcement learning agent has explored its environment class. In contrast to information gain, exploration potential takes the problem's reward structure into account. This leads to an exploration criterion that is both necessary and sufficient for asymptotic optimality (learning to act optimally across the entire environme…

    Submitted 18 November, 2016; v1 submitted 16 September, 2016; originally announced September 2016.

    Comments: 10 pages, including proofs

  30. arXiv:1604.03343  [pdf, ps, other]

    cs.LG stat.ML

    Loss Bounds and Time Complexity for Speed Priors

    Authors: Daniel Filan, Marcus Hutter, Jan Leike

    Abstract: This paper establishes for the first time the predictive performance of speed priors and their computational complexity. A speed prior is essentially a probability distribution that puts low probability on strings that are not efficiently computable. We propose a variant to the original speed prior (Schmidhuber, 2002), and show that our prior can predict sequences drawn from probability measures t…

    Submitted 12 April, 2016; originally announced April 2016.

    Comments: AISTATS 2016

  31. arXiv:1602.07905  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Thompson Sampling is Asymptotically Optimal in General Environments

    Authors: Jan Leike, Tor Lattimore, Laurent Orseau, Marcus Hutter

    Abstract: We discuss a variant of Thompson sampling for nonparametric reinforcement learning in countable classes of general stochastic environments. These environments can be non-Markov, non-ergodic, and partially observable. We show that Thompson sampling learns the environment class in the sense that (1) asymptotically its value converges to the optimal value in mean and (2) given a recoverability assu…

    Submitted 3 June, 2016; v1 submitted 25 February, 2016; originally announced February 2016.

    Comments: UAI 2016

  32. arXiv:1510.05572  [pdf, ps, other]

    cs.AI

    On the Computability of AIXI

    Authors: Jan Leike, Marcus Hutter

    Abstract: How could we solve the machine learning and the artificial intelligence problem if we had infinite computation? Solomonoff induction and the reinforcement learning agent AIXI are proposed answers to this question. Both are known to be incomputable. In this paper, we quantify this using the arithmetical hierarchy, and prove upper and corresponding lower bounds for incomputability. We show that AIXI…

    Submitted 19 October, 2015; originally announced October 2015.

    Comments: UAI 2015

  33. arXiv:1510.04931  [pdf, ps, other]

    cs.AI cs.LG

    Bad Universal Priors and Notions of Optimality

    Authors: Jan Leike, Marcus Hutter

    Abstract: A big open question of algorithmic information theory is the choice of the universal Turing machine (UTM). For Kolmogorov complexity and Solomonoff induction we have invariance theorems: the choice of the UTM changes bounds only by a constant. For the universally intelligent agent AIXI (Hutter, 2005) no invariance theorem is known. Our results are entirely negative: we discuss cases in which unluc…

    Submitted 16 October, 2015; originally announced October 2015.

    Comments: COLT 2015

  34. arXiv:1507.04124  [pdf, other]

    cs.AI cs.LG

    On the Computability of Solomonoff Induction and Knowledge-Seeking

    Authors: Jan Leike, Marcus Hutter

    Abstract: Solomonoff induction is held as a gold standard for learning, but it is known to be incomputable. We quantify its incomputability by placing various flavors of Solomonoff's prior M in the arithmetical hierarchy. We also derive computability bounds for knowledge-seeking agents, and give a limit-computable weakly asymptotically optimal reinforcement learning agent.

    Submitted 15 July, 2015; originally announced July 2015.

    Comments: ALT 2015

  35. arXiv:1507.04121  [pdf, other]

    cs.LG cs.AI math.ST

    Solomonoff Induction Violates Nicod's Criterion

    Authors: Jan Leike, Marcus Hutter

    Abstract: Nicod's criterion states that observing a black raven is evidence for the hypothesis H that all ravens are black. We show that Solomonoff induction does not satisfy Nicod's criterion: there are time steps in which observing black ravens decreases the belief in H. Moreover, while observing any computable infinite string compatible with H, the belief in H decreases infinitely often when using the un…

    Submitted 15 July, 2015; originally announced July 2015.

    Comments: ALT 2015

  36. arXiv:1506.07359  [pdf, ps, other]

    cs.AI

    Sequential Extensions of Causal and Evidential Decision Theory

    Authors: Tom Everitt, Jan Leike, Marcus Hutter

    Abstract: Moving beyond the dualistic view in AI where agent and environment are separated incurs new challenges for decision making, as calculation of expected utility is no longer straightforward. The non-dualistic decision theory literature is split between causal decision theory and evidential decision theory. We extend these decision algorithms to the sequential setting where the agent alternates betwe…

    Submitted 24 June, 2015; originally announced June 2015.

    Comments: ADT 2015

  37. arXiv:1505.04497  [pdf, other]

    cs.AI

    A Definition of Happiness for Reinforcement Learning Agents

    Authors: Mayank Daswani, Jan Leike

    Abstract: What is happiness for reinforcement learning agents? We seek a formal definition satisfying a list of desiderata. Our proposed definition of happiness is the temporal difference error, i.e. the difference between the value of the obtained reward and observation and the agent's expectation of this value. This definition satisfies most of our desiderata and is compatible with empirical research on h…

    Submitted 17 May, 2015; originally announced May 2015.

    Comments: AGI 2015

  38. Ranking Templates for Linear Loops

    Authors: Jan Leike, Matthias Heizmann

    Abstract: We present a new method for the constraint-based synthesis of termination arguments for linear loop programs based on linear ranking templates. Linear ranking templates are parameterized, well-founded relations such that an assignment to the parameters gives rise to a ranking function. Our approach generalizes existing methods and enables us to use templates for many different ranking functions wi…

    Submitted 29 March, 2015; v1 submitted 28 February, 2015; originally announced March 2015.

    Journal ref: Logical Methods in Computer Science, Volume 11, Issue 1 (March 31, 2015) lmcs:797

  39. arXiv:1408.3169  [pdf, ps, other]

    cs.LG math.PR math.ST

    Indefinitely Oscillating Martingales

    Authors: Jan Leike, Marcus Hutter

    Abstract: We construct a class of nonnegative martingale processes that oscillate indefinitely with high probability. For these processes, we state a uniform rate of the number of oscillations and show that this rate is asymptotically close to the theoretical upper bound. These bounds on probability and expectation of the number of upcrossings are compared to classical bounds from the martingale literature.…

    Submitted 13 August, 2014; originally announced August 2014.

    Comments: ALT 2014, extended technical report

  40. arXiv:1405.4413  [pdf, ps, other]

    cs.LO

    Geometric Series as Nontermination Arguments for Linear Lasso Programs

    Authors: Jan Leike, Matthias Heizmann

    Abstract: We present a new kind of nontermination argument for linear lasso programs, called geometric nontermination argument. A geometric nontermination argument is a finite representation of an infinite execution of the form $(\vec{x} + \sum_{i=0}^t \lambda^i \vec{y})_{t \geq 0}$. The existence of this nontermination argument can be stated as a set of nonlinear algebraic constraints. We show that every linear…

    Submitted 17 May, 2014; originally announced May 2014.

    Comments: WST 2014

  41. arXiv:1401.5351  [pdf, ps, other]

    cs.LO

    Ranking Function Synthesis for Linear Lasso Programs

    Authors: Jan Leike

    Abstract: The scope of this work is the constraint-based synthesis of termination arguments for the restricted class of programs called linear lasso programs. A termination argument consists of a ranking function as well as a set of supporting invariants. We extend existing methods in several ways. First, we use Motzkin's Transposition Theorem instead of Farkas' Lemma. This allows us to consider linear la…

    Submitted 21 January, 2014; originally announced January 2014.

    Comments: Master's Thesis

  42. arXiv:1401.5347  [pdf, ps, other]

    cs.LO

    Linear Ranking for Linear Lasso Programs

    Authors: Matthias Heizmann, Jochen Hoenicke, Jan Leike, Andreas Podelski

    Abstract: The general setting of this work is the constraint-based synthesis of termination arguments. We consider a restricted class of programs called lasso programs. The termination argument for a lasso program is a pair of a ranking function and an invariant. We present the---to the best of our knowledge---first method to synthesize termination arguments for lasso programs that uses linear arithmetic. W…

    Submitted 21 January, 2014; originally announced January 2014.

    Comments: ATVA 2013

  43. arXiv:1401.5338  [pdf, ps, other]

    cs.LO

    Ranking Templates for Linear Loops

    Authors: Jan Leike, Matthias Heizmann

    Abstract: We present a new method for the constraint-based synthesis of termination arguments for linear loop programs based on linear ranking templates. Linear ranking templates are parametrized, well-founded relations such that an assignment to the parameters gives rise to a ranking function. This approach generalizes existing methods and enables us to use templates for many different ranking functions wi…

    Submitted 21 January, 2014; originally announced January 2014.

    Comments: TACAS 2014

  44. arXiv:1311.4046  [pdf, ps, other]

    cs.LO

    Synthesis for Polynomial Lasso Programs

    Authors: Jan Leike, Ashish Tiwari

    Abstract: We present a method for the synthesis of polynomial lasso programs. These programs consist of a program stem, a set of transitions, and an exit condition, all in the form of algebraic assertions (conjunctions of polynomial equalities). Central to this approach is the discovery of non-linear (algebraic) loop invariants. We extend Sankaranarayanan, Sipma, and Manna's template-based approach and prov…

    Submitted 16 November, 2013; originally announced November 2013.

    Comments: Paper at VMCAI'14, including appendix