Showing 1–12 of 12 results for author: Ashley, D R

Searching in archive cs.
  1. arXiv:2406.08404  [pdf, other]

    cs.LG cs.AI

    Scaling Value Iteration Networks to 5000 Layers for Extreme Long-Term Planning

    Authors: Yuhui Wang, Qingyuan Wu, Weida Li, Dylan R. Ashley, Francesco Faccio, Chao Huang, Jürgen Schmidhuber

    Abstract: The Value Iteration Network (VIN) is an end-to-end differentiable architecture that performs value iteration on a latent MDP for planning in reinforcement learning (RL). However, VINs struggle to scale to long-term and large-scale planning tasks, such as navigating a $100\times 100$ maze -- a task which typically requires thousands of planning steps to solve. We observe that this deficiency is due…

    Submitted 12 June, 2024; originally announced June 2024.

    ACM Class: I.2.6

  2. arXiv:2404.08093  [pdf, other]

    cs.RO cs.AI cs.LG

    Towards a Robust Soft Baby Robot With Rich Interaction Ability for Advanced Machine Learning Algorithms

    Authors: Mohannad Alhakami, Dylan R. Ashley, Joel Dunham, Francesco Faccio, Eric Feron, Jürgen Schmidhuber

    Abstract: Artificial intelligence has made great strides in many areas lately, yet it has had comparatively little success in general-use robotics. We believe one of the reasons for this is the disconnect between traditional robotic design and the properties needed for open-ended, creativity-based AI systems. To that end, we, taking selective inspiration from nature, build a robust, partially soft robotic l…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: 5 pages in main text + 1 page of references, 7 figures in main text; source code available at https://github.com/dylanashley/robot-limb-testai

    ACM Class: I.2.9; I.2.6

  3. arXiv:2305.17066  [pdf, other]

    cs.AI cs.CL cs.CV cs.LG cs.MA

    Mindstorms in Natural Language-Based Societies of Mind

    Authors: Mingchen Zhuge, Haozhe Liu, Francesco Faccio, Dylan R. Ashley, Róbert Csordás, Anand Gopalakrishnan, Abdullah Hamdi, Hasan Abed Al Kader Hammoud, Vincent Herrmann, Kazuki Irie, Louis Kirsch, Bing Li, Guohao Li, Shuming Liu, Jinjie Mai, Piotr Piękos, Aditya Ramesh, Imanol Schlag, Weimin Shi, Aleksandar Stanić, Wenyi Wang, Yuhui Wang, Mengmeng Xu, Deng-Ping Fan, Bernard Ghanem, et al. (1 additional author not shown)

    Abstract: Both Minsky's "society of mind" and Schmidhuber's "learning to think" inspire diverse societies of large multimodal neural networks (NNs) that solve problems by interviewing each other in a "mindstorm." Recent implementations of NN-based societies of minds consist of large language models (LLMs) and other NN-based experts communicating through a natural language interface. In doing so, they overco…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: 9 pages in main text + 7 pages of references + 38 pages of appendices, 14 figures in main text + 13 in appendices, 7 tables in appendices

    MSC Class: 68T07 ACM Class: I.2.6; I.2.11

  4. arXiv:2211.12423  [pdf, other]

    cs.CL cs.AI cs.LG cs.MM cs.NE cs.SD eess.AS

    On Narrative Information and the Distillation of Stories

    Authors: Dylan R. Ashley, Vincent Herrmann, Zachary Friggstad, Jürgen Schmidhuber

    Abstract: The act of telling stories is a fundamental part of what it means to be human. This work introduces the concept of narrative information, which we define to be the overlap in information space between a story and the items that compose the story. Using contrastive learning methods, we show how modern artificial neural networks can be leveraged to distill stories and extract a representation of the…

    Submitted 13 February, 2023; v1 submitted 22 November, 2022; originally announced November 2022.

    Comments: presented in the Information-Theoretic Principles in Cognitive Systems Workshop at the 36th Conference on Neural Information Processing Systems; 4 pages in main text + 2 pages of references + 8 pages of appendices, 2 figures in main text + 3 in appendices, 1 table in main text, 2 algorithms in appendices; source code available at https://github.com/dylanashley/story-distiller

    MSC Class: 68T07 (Primary) 68P30; 68W50; 94A15 (Secondary) ACM Class: H.1.1; H.5.5; I.2.6; I.5.1; J.5

  5. arXiv:2205.06595  [pdf, other]

    stat.ML cs.AI cs.LG

    Upside-Down Reinforcement Learning Can Diverge in Stochastic Environments With Episodic Resets

    Authors: Miroslav Štrupl, Francesco Faccio, Dylan R. Ashley, Jürgen Schmidhuber, Rupesh Kumar Srivastava

    Abstract: Upside-Down Reinforcement Learning (UDRL) is an approach for solving RL problems that does not require value functions and uses only supervised learning, where the targets for given inputs in a dataset do not change over time. Ghosh et al. proved that Goal-Conditional Supervised Learning (GCSL) -- which can be viewed as a simplified version of UDRL -- optimizes a lower bound on goal-reaching perfo…

    Submitted 13 May, 2022; originally announced May 2022.

    Comments: presented at the 5th Multidisciplinary Conference on Reinforcement Learning and Decision Making; 5 pages in main text + 1 page of references + 3 pages of appendices, 1 figure in main text; source code available at https://github.com/struplm/UDRL-GCSL-counterexample.git

    MSC Class: 68T05 ACM Class: I.2.6

  6. arXiv:2202.12742  [pdf, other]

    cs.LG cs.AI

    Learning Relative Return Policies With Upside-Down Reinforcement Learning

    Authors: Dylan R. Ashley, Kai Arulkumaran, Jürgen Schmidhuber, Rupesh Kumar Srivastava

    Abstract: Lately, there has been a resurgence of interest in using supervised learning to solve reinforcement learning problems. Recent work in this area has largely focused on learning command-conditioned policies. We investigate the potential of one such method -- upside-down reinforcement learning -- to work with commands that specify a desired relationship between some scalar value and the observed retu…

    Submitted 10 May, 2022; v1 submitted 23 February, 2022; originally announced February 2022.

    Comments: presented at the 5th Multidisciplinary Conference on Reinforcement Learning and Decision Making; 5 pages in main text, 2 figures in main text

    ACM Class: I.2.6

  7. arXiv:2202.11960  [pdf, other]

    cs.LG cs.AI

    All You Need Is Supervised Learning: From Imitation Learning to Meta-RL With Upside Down RL

    Authors: Kai Arulkumaran, Dylan R. Ashley, Jürgen Schmidhuber, Rupesh K. Srivastava

    Abstract: Upside down reinforcement learning (UDRL) flips the conventional use of the return in the objective function in RL upside down, by taking returns as input and predicting actions. UDRL is based purely on supervised learning, and bypasses some prominent issues in RL: bootstrapping, off-policy corrections, and discount factors. While previous work with UDRL demonstrated it in a traditional online RL…

    Submitted 24 February, 2022; originally announced February 2022.

  8. arXiv:2111.02216  [pdf, other]

    cs.CL cs.LG cs.MM cs.SD eess.AS

    Automatic Embedding of Stories Into Collections of Independent Media

    Authors: Dylan R. Ashley, Vincent Herrmann, Zachary Friggstad, Kory W. Mathewson, Jürgen Schmidhuber

    Abstract: We look at how machine learning techniques that derive properties of items in a collection of independent media can be used to automatically embed stories into such collections. To do so, we use models that extract the tempo of songs to make a music playlist follow a narrative arc. Our work specifies an open-source tool that uses pre-trained neural network models to extract the global tempo of a s…

    Submitted 3 November, 2021; originally announced November 2021.

    Comments: 2 pages in main text + 1 page of references + 6 pages of appendices, 2 figures in main text + 3 figures in appendices, 1 algorithm in appendices; source code available at https://gist.github.com/dylanashley/1387a99deb85bfc0bce11286810cd98b

    ACM Class: H.5.5; I.2.6; J.5

  9. arXiv:2107.09088  [pdf, other]

    stat.ML cs.AI cs.LG

    Reward-Weighted Regression Converges to a Global Optimum

    Authors: Miroslav Štrupl, Francesco Faccio, Dylan R. Ashley, Rupesh Kumar Srivastava, Jürgen Schmidhuber

    Abstract: Reward-Weighted Regression (RWR) belongs to a family of widely known iterative Reinforcement Learning algorithms based on the Expectation-Maximization framework. In this family, learning at each iteration consists of sampling a batch of trajectories using the current policy and fitting a new policy to maximize a return-weighted log-likelihood of actions. Although RWR is known to yield monotonic im…

    Submitted 23 February, 2022; v1 submitted 19 July, 2021; originally announced July 2021.

    Comments: 7 pages in main text + 2 pages of references + 6 pages of appendices, 1 figure in main text + 1 figure in appendices; source code available at https://github.com/dylanashley/reward-weighted-regression

    MSC Class: 68T05 ACM Class: I.2.6

  10. arXiv:2102.07686  [pdf, other]

    cs.LG cs.AI stat.ML

    Does the Adam Optimizer Exacerbate Catastrophic Forgetting?

    Authors: Dylan R. Ashley, Sina Ghiassian, Richard S. Sutton

    Abstract: Catastrophic forgetting remains a severe hindrance to the broad application of artificial neural networks (ANNs), however, it continues to be a poorly understood phenomenon. Despite the extensive amount of work on catastrophic forgetting, we argue that it is still unclear how exactly the phenomenon should be quantified, and, moreover, to what degree all of the choices we make when designing learni…

    Submitted 9 June, 2021; v1 submitted 15 February, 2021; originally announced February 2021.

    Comments: 9 pages in main text + 3 pages of references + 16 pages of appendices, 6 figures in main text + 21 figures in appendices, 6 tables in appendices; source code available at https://github.com/dylanashley/catastrophic-forgetting/tree/arxiv

    ACM Class: I.2.6

  11. arXiv:2001.04025  [pdf, other]

    cs.LG cs.AI stat.ML

    Universal Successor Features for Transfer Reinforcement Learning

    Authors: Chen Ma, Dylan R. Ashley, Junfeng Wen, Yoshua Bengio

    Abstract: Transfer in Reinforcement Learning (RL) refers to the idea of applying knowledge gained from previous tasks to solving related tasks. Learning a universal value function (Schaul et al., 2015), which generalizes over goals and states, has previously been shown to be useful for transfer. However, successor features are believed to be more suitable than values for transfer (Dayan, 1993; Barreto et al…

    Submitted 4 January, 2020; originally announced January 2020.

  12. arXiv:1801.08287  [pdf, other]

    cs.AI

    Directly Estimating the Variance of the λ-Return Using Temporal-Difference Methods

    Authors: Craig Sherstan, Brendan Bennett, Kenny Young, Dylan R. Ashley, Adam White, Martha White, Richard S. Sutton

    Abstract: This paper investigates estimating the variance of a temporal-difference learning agent's update target. Most reinforcement learning methods use an estimate of the value function, which captures how good it is for the agent to be in a particular state and is mathematically expressed as the expected sum of discounted future rewards (called the return). These values can be straightforwardly estimate…

    Submitted 14 February, 2018; v1 submitted 25 January, 2018; originally announced January 2018.