Showing 1–25 of 25 results for author: Hausknecht, M

Searching in archive cs.
  1. arXiv:2211.10869  [pdf, other]

    cs.LG

    UniMASK: Unified Inference in Sequential Decision Problems

    Authors: Micah Carroll, Orr Paradise, Jessy Lin, Raluca Georgescu, Mingfei Sun, David Bignell, Stephanie Milani, Katja Hofmann, Matthew Hausknecht, Anca Dragan, Sam Devlin

    Abstract: Randomly masking and predicting word tokens has been a successful approach in pre-training language models for a variety of downstream tasks. In this work, we observe that the same idea also applies naturally to sequential decision-making, where many well-studied tasks like behavior cloning, offline reinforcement learning, inverse dynamics, and waypoint conditioning correspond to different sequenc…

    Submitted 19 November, 2022; originally announced November 2022.

    Comments: NeurIPS 2022 (Oral). A prior version was published at an ICML Workshop, available at arXiv:2204.13326

  2. arXiv:2208.07363  [pdf, other]

    cs.RO cs.GR cs.LG eess.SY

    MoCapAct: A Multi-Task Dataset for Simulated Humanoid Control

    Authors: Nolan Wagener, Andrey Kolobov, Felipe Vieira Frujeri, Ricky Loynd, Ching-An Cheng, Matthew Hausknecht

    Abstract: Simulated humanoids are an appealing research domain due to their physical capabilities. Nonetheless, they are also challenging to control, as a policy must drive an unstable, discontinuous, and high-dimensional physical system. One widely studied approach is to utilize motion capture (MoCap) data to teach the humanoid agent low-level skills (e.g., standing, walking, and running) that can then be…

    Submitted 13 January, 2023; v1 submitted 15 August, 2022; originally announced August 2022.

    Comments: Appearing in NeurIPS 2022 Datasets and Benchmarks Track

  3. arXiv:2204.13326  [pdf, other]

    cs.LG

    Towards Flexible Inference in Sequential Decision Problems via Bidirectional Transformers

    Authors: Micah Carroll, Jessy Lin, Orr Paradise, Raluca Georgescu, Mingfei Sun, David Bignell, Stephanie Milani, Katja Hofmann, Matthew Hausknecht, Anca Dragan, Sam Devlin

    Abstract: Randomly masking and predicting word tokens has been a successful approach in pre-training language models for a variety of downstream tasks. In this work, we observe that the same idea also applies naturally to sequential decision making, where many well-studied tasks like behavior cloning, offline RL, inverse dynamics, and waypoint conditioning correspond to different sequence maskings over a se…

    Submitted 9 December, 2022; v1 submitted 28 April, 2022; originally announced April 2022.

    Comments: Superseded by arXiv:2211.10869

  4. arXiv:2203.04806  [pdf, other]

    cs.CL

    One-Shot Learning from a Demonstration with Hierarchical Latent Language

    Authors: Nathaniel Weir, Xingdi Yuan, Marc-Alexandre Côté, Matthew Hausknecht, Romain Laroche, Ida Momennejad, Harm Van Seijen, Benjamin Van Durme

    Abstract: Humans have the capability, aided by the expressive compositionality of their language, to learn quickly by demonstration. They are able to describe unseen task-performing procedures and generalize their execution to other contexts. In this work, we introduce DescribeWorld, an environment designed to test this sort of generalization skill in grounded agents, where tasks are linguistically and proc…

    Submitted 9 March, 2022; originally announced March 2022.

  5. arXiv:2202.11818  [pdf, other]

    cs.LG cs.AI

    Consistent Dropout for Policy Gradient Reinforcement Learning

    Authors: Matthew Hausknecht, Nolan Wagener

    Abstract: Dropout has long been a staple of supervised learning, but is rarely used in reinforcement learning. We analyze why naive application of dropout is problematic for policy-gradient learning algorithms and introduce consistent dropout, a simple technique to address this instability. We demonstrate consistent dropout enables stable training with A2C and PPO in both continuous and discrete action envi…

    Submitted 23 February, 2022; originally announced February 2022.

  6. arXiv:2103.15332  [pdf, other]

    cs.LG cs.AI

    Measuring Sample Efficiency and Generalization in Reinforcement Learning Benchmarks: NeurIPS 2020 Procgen Benchmark

    Authors: Sharada Mohanty, Jyotish Poonganam, Adrien Gaidon, Andrey Kolobov, Blake Wulfe, Dipam Chakraborty, Gražvydas Šemetulskis, João Schapke, Jonas Kubilius, Jurgis Pašukonis, Linas Klimas, Matthew Hausknecht, Patrick MacAlpine, Quang Nhat Tran, Thomas Tumiel, Xiaocheng Tang, Xinwei Chen, Christopher Hesse, Jacob Hilton, William Hebgen Guss, Sahika Genc, John Schulman, Karl Cobbe

    Abstract: The NeurIPS 2020 Procgen Competition was designed as a centralized benchmark with clearly defined tasks for measuring Sample Efficiency and Generalization in Reinforcement Learning. Generalization remains one of the most fundamental challenges in deep reinforcement learning, and yet we do not have enough benchmarks to measure the progress of the community on Generalization in Reinforcement Learnin…

    Submitted 29 March, 2021; originally announced March 2021.

  7. arXiv:2103.13552  [pdf, other]

    cs.CL cs.AI

    Reading and Acting while Blindfolded: The Need for Semantics in Text Game Agents

    Authors: Shunyu Yao, Karthik Narasimhan, Matthew Hausknecht

    Abstract: Text-based games simulate worlds and interact with players using natural language. Recent work has used them as a testbed for autonomous language-understanding agents, with the motivation being that understanding the meanings of words or semantics is a key component of how humans understand, reason, and act in these worlds. However, it remains unclear to what extent artificial agents utilize seman…

    Submitted 29 April, 2021; v1 submitted 24 March, 2021; originally announced March 2021.

    Comments: NAACL 2021. Project page: https://blindfolded.cs.princeton.edu

  8. arXiv:2010.03768  [pdf, other]

    cs.CL cs.AI cs.CV cs.LG cs.RO

    ALFWorld: Aligning Text and Embodied Environments for Interactive Learning

    Authors: Mohit Shridhar, Xingdi Yuan, Marc-Alexandre Côté, Yonatan Bisk, Adam Trischler, Matthew Hausknecht

    Abstract: Given a simple request like "Put a washed apple in the kitchen fridge", humans can reason in purely abstract terms by imagining action sequences and scoring their likelihood of success, prototypicality, and efficiency, all without moving a muscle. Once we see the kitchen in question, we can update our abstract plans to fit the scene. Embodied agents require the same abilities, but existing work does…

    Submitted 14 March, 2021; v1 submitted 8 October, 2020; originally announced October 2020.

    Comments: ICLR 2021; Data, code, and videos are available at alfworld.github.io

  9. arXiv:2010.02903  [pdf, other]

    cs.CL

    Keep CALM and Explore: Language Models for Action Generation in Text-based Games

    Authors: Shunyu Yao, Rohan Rao, Matthew Hausknecht, Karthik Narasimhan

    Abstract: Text-based games present a unique challenge for autonomous agents to operate in natural language and handle enormous action spaces. In this paper, we propose the Contextual Action Language Model (CALM) to generate a compact set of action candidates at each game state. Our key insight is to train language models on human gameplay, where people demonstrate linguistic priors and a general game sense…

    Submitted 6 October, 2020; originally announced October 2020.

    Comments: EMNLP 2020

  10. arXiv:2006.07409  [pdf, other]

    cs.AI cs.CL cs.LG stat.ML

    How to Avoid Being Eaten by a Grue: Structured Exploration Strategies for Textual Worlds

    Authors: Prithviraj Ammanabrolu, Ethan Tien, Matthew Hausknecht, Mark O. Riedl

    Abstract: Text-based games are long puzzles or quests, characterized by a sequence of sparse and potentially deceptive rewards. They provide an ideal platform to develop agents that perceive and act upon the world using a combinatorially sized natural language state-action space. Standard Reinforcement Learning agents are poorly equipped to effectively explore such spaces and often struggle to overcome bott…

    Submitted 12 June, 2020; originally announced June 2020.

  11. arXiv:2001.08837  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Graph Constrained Reinforcement Learning for Natural Language Action Spaces

    Authors: Prithviraj Ammanabrolu, Matthew Hausknecht

    Abstract: Interactive Fiction games are text-based simulations in which an agent interacts with the world purely through natural language. They are ideal environments for studying how to extend reinforcement learning agents to meet the challenges of natural language understanding, partial observability, and action generation in combinatorially-large text-based action spaces. We present KG-A2C, an agent that…

    Submitted 23 January, 2020; originally announced January 2020.

    Comments: Accepted to ICLR 2020

  12. arXiv:1911.07141  [pdf, other]

    cs.LG cs.AI cs.CL

    Working Memory Graphs

    Authors: Ricky Loynd, Roland Fernandez, Asli Celikyilmaz, Adith Swaminathan, Matthew Hausknecht

    Abstract: Transformers have increasingly outperformed gated RNNs in obtaining new state-of-the-art results on supervised tasks involving text sequences. Inspired by this trend, we study the question of how Transformer-based models can improve the performance of sequential decision-making agents. We present the Working Memory Graph (WMG), an agent that employs multi-head self-attention to reason over a dynam…

    Submitted 18 August, 2020; v1 submitted 16 November, 2019; originally announced November 2019.

    Comments: 11 pages, 6 figures, 7 page appendix

  13. arXiv:1910.01179  [pdf, other]

    cs.LG stat.ML

    Learning Calibratable Policies using Programmatic Style-Consistency

    Authors: Eric Zhan, Albert Tseng, Yisong Yue, Adith Swaminathan, Matthew Hausknecht

    Abstract: We study the problem of controllable generation of long-term sequential behaviors, where the goal is to calibrate to multiple behavior styles simultaneously. In contrast to the well-studied areas of controllable generation of images, text, and speech, there are two questions that pose significant challenges when generating long-term behaviors: how should we specify the factors of variation to cont…

    Submitted 16 July, 2020; v1 submitted 2 October, 2019; originally announced October 2019.

  14. arXiv:1909.05398  [pdf, other]

    cs.AI cs.CL

    Interactive Fiction Games: A Colossal Adventure

    Authors: Matthew Hausknecht, Prithviraj Ammanabrolu, Marc-Alexandre Côté, Xingdi Yuan

    Abstract: A hallmark of human intelligence is the ability to understand and communicate with language. Interactive Fiction games are fully text-based simulation environments where a player issues text commands to effect change in the environment and progress through the story. We argue that IF games are an excellent testbed for studying language-based autonomous agents. In particular, IF games combine chall…

    Submitted 25 February, 2020; v1 submitted 11 September, 2019; originally announced September 2019.

  15. arXiv:1904.03295  [pdf, other]

    cs.LG cs.AI stat.ML

    Multi-Preference Actor Critic

    Authors: Ishan Durugkar, Matthew Hausknecht, Adith Swaminathan, Patrick MacAlpine

    Abstract: Policy gradient algorithms typically combine discounted future rewards with an estimated value function, to compute the direction and magnitude of parameter updates. However, for most Reinforcement Learning tasks, humans can provide additional insight to constrain the policy learning. We introduce a general method to incorporate multiple different feedback channels into a single policy gradient lo…

    Submitted 5 April, 2019; originally announced April 2019.

    Comments: NeurIPS Workshop on Deep RL, 2018

  16. arXiv:1904.01126  [pdf, other]

    cs.CR cs.AI cs.LG

    ScriptNet: Neural Static Analysis for Malicious JavaScript Detection

    Authors: Jack W. Stokes, Rakshit Agrawal, Geoff McDonald, Matthew Hausknecht

    Abstract: Malicious scripts are an important computer infection threat vector in the wild. For web-scale processing, static analysis offers substantial computing efficiencies. We propose the ScriptNet system for neural malicious JavaScript detection which is based on static analysis. We use the Convoluted Partitioning of Long Sequences (CPoLS) model, which processes Javascript files as byte sequences. Lower…

    Submitted 1 April, 2019; originally announced April 2019.

  17. arXiv:1902.04259  [pdf, other]

    cs.AI

    NAIL: A General Interactive Fiction Agent

    Authors: Matthew Hausknecht, Ricky Loynd, Greg Yang, Adith Swaminathan, Jason D. Williams

    Abstract: Interactive Fiction (IF) games are complex textual decision making problems. This paper introduces NAIL, an autonomous agent for general parser-based IF games. NAIL won the 2018 Text Adventure AI Competition, where it was evaluated on twenty unseen games. This paper describes the architecture, development, and insights underpinning NAIL's performance.

    Submitted 14 February, 2019; v1 submitted 12 February, 2019; originally announced February 2019.

  18. arXiv:1806.11532  [pdf, other]

    cs.LG cs.CL stat.ML

    TextWorld: A Learning Environment for Text-based Games

    Authors: Marc-Alexandre Côté, Ákos Kádár, Xingdi Yuan, Ben Kybartas, Tavian Barnes, Emery Fine, James Moore, Ruo Yu Tao, Matthew Hausknecht, Layla El Asri, Mahmoud Adada, Wendy Tay, Adam Trischler

    Abstract: We introduce TextWorld, a sandbox learning environment for the training and evaluation of RL agents on text-based games. TextWorld is a Python library that handles interactive play-through of text games, as well as backend functions like state tracking and reward assignment. It comes with a curated list of games whose features and challenges we have analyzed. More significantly, it enables users t…

    Submitted 8 November, 2019; v1 submitted 29 June, 2018; originally announced June 2018.

    Comments: Presented at the Computer Games Workshop at IJCAI 2018, Stockholm

  19. arXiv:1806.11525  [pdf, other]

    cs.CL cs.LG

    Counting to Explore and Generalize in Text-based Games

    Authors: Xingdi Yuan, Marc-Alexandre Côté, Alessandro Sordoni, Romain Laroche, Remi Tachet des Combes, Matthew Hausknecht, Adam Trischler

    Abstract: We propose a recurrent RL agent with an episodic exploration mechanism that helps discovering good policies in text-based game environments. We show promising results on a set of generated text-based games of varying difficulty where the goal is to collect a coin located at the end of a chain of rooms. In contrast to previous text-based RL approaches, we observe that our agent learns policies that…

    Submitted 6 March, 2019; v1 submitted 29 June, 2018; originally announced June 2018.

  20. arXiv:1805.04276  [pdf, other]

    cs.LG stat.ML

    Leveraging Grammar and Reinforcement Learning for Neural Program Synthesis

    Authors: Rudy Bunel, Matthew Hausknecht, Jacob Devlin, Rishabh Singh, Pushmeet Kohli

    Abstract: Program synthesis is the task of automatically generating a program consistent with a specification. Recent years have seen proposal of a number of neural approaches for program synthesis, many of which adopt a sequence generation paradigm similar to neural machine translation, in which sequence-to-sequence models are trained to maximize the likelihood of known reference programs. While achieving…

    Submitted 22 May, 2018; v1 submitted 11 May, 2018; originally announced May 2018.

    Comments: ICLR 2018

  21. arXiv:1710.04157  [pdf, other]

    cs.AI

    Neural Program Meta-Induction

    Authors: Jacob Devlin, Rudy Bunel, Rishabh Singh, Matthew Hausknecht, Pushmeet Kohli

    Abstract: Most recently proposed methods for Neural Program Induction work under the assumption of having a large set of input/output (I/O) examples for learning any underlying input-output mapping. This paper aims to address the problem of data and computation efficiency of program induction by leveraging information from related tasks. Specifically, we propose two approaches for cross-task knowledge trans…

    Submitted 11 October, 2017; originally announced October 2017.

    Comments: 8 Pages + 1 page appendix

  22. arXiv:1709.06009  [pdf, other]

    cs.LG

    Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents

    Authors: Marlos C. Machado, Marc G. Bellemare, Erik Talvitie, Joel Veness, Matthew Hausknecht, Michael Bowling

    Abstract: The Arcade Learning Environment (ALE) is an evaluation platform that poses the challenge of building AI agents with general competency across dozens of Atari 2600 games. It supports a variety of different problem settings and it has been receiving increasing attention from the scientific community, leading to some high-profile success stories such as the much publicized Deep Q-Networks (DQN). In t…

    Submitted 30 November, 2017; v1 submitted 18 September, 2017; originally announced September 2017.

  23. arXiv:1511.04143  [pdf, other]

    cs.AI cs.LG cs.MA cs.NE

    Deep Reinforcement Learning in Parameterized Action Space

    Authors: Matthew Hausknecht, Peter Stone

    Abstract: Recent work has shown that deep neural networks are capable of approximating both value functions and policies in reinforcement learning domains featuring continuous state and action spaces. However, to the best of our knowledge no previous work has succeeded at using deep neural networks in structured (parameterized) continuous action spaces. To fill this gap, this paper focuses on learning withi…

    Submitted 3 May, 2024; v1 submitted 12 November, 2015; originally announced November 2015.

  24. arXiv:1507.06527  [pdf, other]

    cs.LG

    Deep Recurrent Q-Learning for Partially Observable MDPs

    Authors: Matthew Hausknecht, Peter Stone

    Abstract: Deep Reinforcement Learning has yielded proficient controllers for complex tasks. However, these controllers have limited memory and rely on being able to perceive the complete game screen at each decision point. To address these shortcomings, this article investigates the effects of adding recurrency to a Deep Q-Network (DQN) by replacing the first post-convolutional fully-connected layer with a…

    Submitted 11 January, 2017; v1 submitted 23 July, 2015; originally announced July 2015.

  25. arXiv:1503.08909  [pdf, other]

    cs.CV

    Beyond Short Snippets: Deep Networks for Video Classification

    Authors: Joe Yue-Hei Ng, Matthew Hausknecht, Sudheendra Vijayanarasimhan, Oriol Vinyals, Rajat Monga, George Toderici

    Abstract: Convolutional neural networks (CNNs) have been extensively applied for image recognition problems giving state-of-the-art results on recognition, detection, segmentation and retrieval. In this work we propose and evaluate several deep neural network architectures to combine image information across a video over longer time periods than previously attempted. We propose two methods capable of handli…

    Submitted 13 April, 2015; v1 submitted 31 March, 2015; originally announced March 2015.