
Showing 1–11 of 11 results for author: Miller, A H

Searching in archive cs.
  1. arXiv:2210.05492  [pdf, other]

    cs.GT cs.AI cs.LG cs.MA

    Mastering the Game of No-Press Diplomacy via Human-Regularized Reinforcement Learning and Planning

    Authors: Anton Bakhtin, David J Wu, Adam Lerer, Jonathan Gray, Athul Paul Jacob, Gabriele Farina, Alexander H Miller, Noam Brown

    Abstract: No-press Diplomacy is a complex strategy game involving both cooperation and competition that has served as a benchmark for multi-agent AI research. While self-play reinforcement learning has resulted in numerous successes in purely adversarial games like chess, Go, and poker, self-play alone is insufficient for achieving optimal performance in domains involving cooperation with humans. We address…

    Submitted 11 October, 2022; originally announced October 2022.

  2. arXiv:2006.13760  [pdf, other]

    cs.LG cs.AI cs.CL cs.NE stat.ML

    The NetHack Learning Environment

    Authors: Heinrich Küttler, Nantas Nardelli, Alexander H. Miller, Roberta Raileanu, Marco Selvatici, Edward Grefenstette, Tim Rocktäschel

    Abstract: Progress in Reinforcement Learning (RL) algorithms goes hand-in-hand with the development of challenging environments that test the limits of current methods. While existing RL environments are either sufficiently complex or based on fast simulation, they are rarely both. Here, we present the NetHack Learning Environment (NLE), a scalable, procedurally generated, stochastic, rich, and challenging…

    Submitted 1 December, 2020; v1 submitted 24 June, 2020; originally announced June 2020.

    Comments: 28 pages. Accepted at NeurIPS 2020

  3. arXiv:2005.04611  [pdf, other]

    cs.CL

    How Context Affects Language Models' Factual Predictions

    Authors: Fabio Petroni, Patrick Lewis, Aleksandra Piktus, Tim Rocktäschel, Yuxiang Wu, Alexander H. Miller, Sebastian Riedel

    Abstract: When pre-trained on large unsupervised textual corpora, language models are able to store and retrieve factual knowledge to some extent, making it possible to use them directly for zero-shot cloze-style question answering. However, storing factual knowledge in a fixed number of weights of a language model clearly has limitations. Previous approaches have successfully provided access to information…

    Submitted 10 May, 2020; originally announced May 2020.

    Comments: accepted at AKBC 2020

  4. arXiv:1910.04054  [pdf, other]

    cs.LG cs.DC cs.NI stat.ML

    MVFST-RL: An Asynchronous RL Framework for Congestion Control with Delayed Actions

    Authors: Viswanath Sivakumar, Olivier Delalleau, Tim Rocktäschel, Alexander H. Miller, Heinrich Küttler, Nantas Nardelli, Mike Rabbat, Joelle Pineau, Sebastian Riedel

    Abstract: Effective network congestion control strategies are key to keeping the Internet (or any large computer network) operational. Network congestion control has been dominated by hand-crafted heuristics for decades. Recently, Reinforcement Learning (RL) has emerged as an alternative to automatically optimize such control strategies. Research so far has primarily considered RL interfaces which block the…

    Submitted 26 May, 2021; v1 submitted 9 October, 2019; originally announced October 2019.

    Comments: Workshop on ML for Systems at NeurIPS 2019

  5. arXiv:1909.01066  [pdf, other]

    cs.CL

    Language Models as Knowledge Bases?

    Authors: Fabio Petroni, Tim Rocktäschel, Patrick Lewis, Anton Bakhtin, Yuxiang Wu, Alexander H. Miller, Sebastian Riedel

    Abstract: Recent progress in pretraining language models on large textual corpora led to a surge of improvements for downstream NLP tasks. Whilst learning linguistic knowledge, these models may also be storing relational knowledge present in the training data, and may be able to answer queries structured as "fill-in-the-blank" cloze statements. Language models have many advantages over structured knowledge…

    Submitted 4 September, 2019; v1 submitted 3 September, 2019; originally announced September 2019.

    Comments: accepted at EMNLP 2019

  6. arXiv:1811.00907  [pdf, other]

    cs.CL cs.LG

    Importance of Search and Evaluation Strategies in Neural Dialogue Modeling

    Authors: Ilia Kulikov, Alexander H. Miller, Kyunghyun Cho, Jason Weston

    Abstract: We investigate the impact of search strategies in neural dialogue modeling. We first compare two standard search algorithms, greedy and beam search, as well as our newly proposed iterative beam search which produces a more diverse set of candidate responses. We evaluate these strategies in realistic full conversations with humans and propose a model-based Bayesian calibration to address annotator…

    Submitted 3 November, 2019; v1 submitted 2 November, 2018; originally announced November 2018.

    Comments: iNLG 2019 camera ready version

  7. arXiv:1808.04776  [pdf, ps, other]

    cs.CL

    Retrieve and Refine: Improved Sequence Generation Models For Dialogue

    Authors: Jason Weston, Emily Dinan, Alexander H. Miller

    Abstract: Sequence generation models for dialogue are known to have several problems: they tend to produce short, generic sentences that are uninformative and unengaging. Retrieval models on the other hand can surface interesting responses, but are restricted to the given retrieval set leading to erroneous replies that cannot be tuned to the specific context. In this work we develop a model that combines th…

    Submitted 6 September, 2018; v1 submitted 14 August, 2018; originally announced August 2018.

  8. arXiv:1711.07950  [pdf, other]

    cs.CL

    Mastering the Dungeon: Grounded Language Learning by Mechanical Turker Descent

    Authors: Zhilin Yang, Saizheng Zhang, Jack Urbanek, Will Feng, Alexander H. Miller, Arthur Szlam, Douwe Kiela, Jason Weston

    Abstract: Contrary to most natural language processing research, which makes use of static datasets, humans learn language interactively, grounded in an environment. In this work we propose an interactive learning procedure called Mechanical Turker Descent (MTD) and use it to train agents to execute natural language commands grounded in a fantasy text adventure game. In MTD, Turkers compete to train better…

    Submitted 16 April, 2018; v1 submitted 21 November, 2017; originally announced November 2017.

  9. arXiv:1705.06476  [pdf, other]

    cs.CL

    ParlAI: A Dialog Research Software Platform

    Authors: Alexander H. Miller, Will Feng, Adam Fisch, Jiasen Lu, Dhruv Batra, Antoine Bordes, Devi Parikh, Jason Weston

    Abstract: We introduce ParlAI (pronounced "par-lay"), an open-source software platform for dialog research implemented in Python, available at http://parl.ai. Its goal is to provide a unified framework for sharing, training and testing of dialog models, integration of Amazon Mechanical Turk for data collection, human evaluation, and online/reinforcement learning; and a repository of machine learning models…

    Submitted 8 March, 2018; v1 submitted 18 May, 2017; originally announced May 2017.

  10. arXiv:1612.04936  [pdf, other]

    cs.CL cs.AI

    Learning through Dialogue Interactions by Asking Questions

    Authors: Jiwei Li, Alexander H. Miller, Sumit Chopra, Marc'Aurelio Ranzato, Jason Weston

    Abstract: A good dialogue agent should have the ability to interact with users by both responding to questions and by asking questions, and importantly to learn from both types of interaction. In this work, we explore this direction by designing a simulator and a set of synthetic tasks in the movie domain that allow such interactions between a learner and a teacher. We investigate how a learner can benefit…

    Submitted 13 February, 2017; v1 submitted 15 December, 2016; originally announced December 2016.

  11. arXiv:1611.09823  [pdf, other]

    cs.AI cs.CL

    Dialogue Learning With Human-In-The-Loop

    Authors: Jiwei Li, Alexander H. Miller, Sumit Chopra, Marc'Aurelio Ranzato, Jason Weston

    Abstract: An important aspect of developing conversational agents is to give a bot the ability to improve through communicating with humans and to learn from the mistakes that it makes. Most research has focused on learning from fixed training sets of labeled data rather than interacting with a dialogue partner in an online fashion. In this paper we explore this direction in a reinforcement learning setting…

    Submitted 13 January, 2017; v1 submitted 29 November, 2016; originally announced November 2016.