Showing 1–10 of 10 results for author: Klassen, T Q

Searching in archive cs.
  1. arXiv:2406.00120  [pdf, other]

    cs.LG cs.AI cs.FL

    Reward Machines for Deep RL in Noisy and Uncertain Environments

    Authors: Andrew C. Li, Zizhao Chen, Toryn Q. Klassen, Pashootan Vaezipoor, Rodrigo Toro Icarte, Sheila A. McIlraith

    Abstract: Reward Machines provide an automata-inspired structure for specifying instructions, safety constraints, and other temporally extended reward-worthy behaviour. By exposing complex reward function structure, they enable counterfactual learning updates that have resulted in impressive sample efficiency gains. While Reward Machines have been employed in both tabular and deep RL settings, they have typ…

    Submitted 17 June, 2024; v1 submitted 31 May, 2024; originally announced June 2024.

    ACM Class: I.2.0; I.2.6; I.2.4; F.4.3

  2. arXiv:2312.04772  [pdf, other]

    cs.AI cs.CY cs.LG

    Remembering to Be Fair: Non-Markovian Fairness in Sequential Decision Making

    Authors: Parand A. Alamdari, Toryn Q. Klassen, Elliot Creager, Sheila A. McIlraith

    Abstract: Fair decision making has largely been studied with respect to a single decision. Here we investigate the notion of fairness in the context of sequential decision making where multiple stakeholders can be affected by the outcomes of decisions. We observe that fairness often depends on the history of the sequential decision-making process, and in this sense that it is inherently non-Markovian. We fu…

    Submitted 19 June, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

  3. arXiv:2211.10902  [pdf, other]

    cs.LG cs.AI cs.FL

    Noisy Symbolic Abstractions for Deep RL: A case study with Reward Machines

    Authors: Andrew C. Li, Zizhao Chen, Pashootan Vaezipoor, Toryn Q. Klassen, Rodrigo Toro Icarte, Sheila A. McIlraith

    Abstract: Natural and formal languages provide an effective mechanism for humans to specify instructions and reward functions. We investigate how to generate policies via RL when reward functions are specified in a symbolic language captured by Reward Machines, an increasingly popular automaton-inspired structure. We are interested in the case where the mapping of environment state to a symbolic (here, Rewa…

    Submitted 23 November, 2022; v1 submitted 20 November, 2022; originally announced November 2022.

    Comments: NeurIPS Deep Reinforcement Learning Workshop 2022

  4. arXiv:2211.04591  [pdf, other]

    cs.LG cs.AI cs.CL

    Learning to Follow Instructions in Text-Based Games

    Authors: Mathieu Tuli, Andrew C. Li, Pashootan Vaezipoor, Toryn Q. Klassen, Scott Sanner, Sheila A. McIlraith

    Abstract: Text-based games present a unique class of sequential decision making problem in which agents interact with a partially observable, simulated environment via actions and observations conveyed through natural language. Such observations typically include instructions that, in a reinforcement learning (RL) setting, can directly or indirectly guide a player towards completing reward-worthy tasks. In…

    Submitted 8 November, 2022; originally announced November 2022.

    Comments: NeurIPS 2022

  5. arXiv:2112.09477  [pdf, other]

    cs.LG cs.AI

    Learning Reward Machines: A Study in Partially Observable Reinforcement Learning

    Authors: Rodrigo Toro Icarte, Ethan Waldie, Toryn Q. Klassen, Richard Valenzano, Margarita P. Castro, Sheila A. McIlraith

    Abstract: Reinforcement learning (RL) is a central problem in artificial intelligence. This problem consists of defining artificial agents that can learn optimal behaviour by interacting with an environment -- where the optimal behaviour is defined with respect to a reward signal that the agent seeks to maximize. Reward machines (RMs) provide a structured, automata-based representation of a reward function…

    Submitted 17 December, 2021; originally announced December 2021.

  6. arXiv:2106.02617  [pdf, other]

    cs.AI cs.LG

    Be Considerate: Objectives, Side Effects, and Deciding How to Act

    Authors: Parand Alizadeh Alamdari, Toryn Q. Klassen, Rodrigo Toro Icarte, Sheila A. McIlraith

    Abstract: Recent work in AI safety has highlighted that in sequential decision making, objectives are often underspecified or incomplete. This gives discretion to the acting agent to realize the stated objective in ways that may result in undesirable outcomes. We contend that to learn to act safely, a reinforcement learning (RL) agent should include contemplation of the impact of its actions on the wellbein…

    Submitted 4 June, 2021; originally announced June 2021.

  7. Reward Machines: Exploiting Reward Function Structure in Reinforcement Learning

    Authors: Rodrigo Toro Icarte, Toryn Q. Klassen, Richard Valenzano, Sheila A. McIlraith

    Abstract: Reinforcement learning (RL) methods usually treat reward functions as black boxes. As such, these methods must extensively interact with the environment in order to discover rewards and optimal policies. In most RL applications, however, users have to program the reward function and, hence, there is the opportunity to make the reward function visible -- to show the reward function's code to the RL…

    Submitted 17 January, 2022; v1 submitted 5 October, 2020; originally announced October 2020.

    Journal ref: Journal of Artificial Intelligence Research 73 (2022) 173-208

  8. arXiv:2010.01753  [pdf, other]

    cs.LG cs.AI

    The act of remembering: a study in partially observable reinforcement learning

    Authors: Rodrigo Toro Icarte, Richard Valenzano, Toryn Q. Klassen, Phillip Christoffersen, Amir-massoud Farahmand, Sheila A. McIlraith

    Abstract: Reinforcement Learning (RL) agents typically learn memoryless policies---policies that only consider the last observation when selecting actions. Learning memoryless policies is efficient and optimal in fully observable environments. However, some form of memory is necessary when RL agents are faced with partial observability. In this paper, we study a lightweight approach to tackle partial observ…

    Submitted 4 October, 2020; originally announced October 2020.

  9. arXiv:2005.02963  [pdf, ps, other]

    cs.AI

    Towards the Role of Theory of Mind in Explanation

    Authors: Maayan Shvo, Toryn Q. Klassen, Sheila A. McIlraith

    Abstract: Theory of Mind is commonly defined as the ability to attribute mental states (e.g., beliefs, goals) to oneself, and to others. A large body of previous work - from the social sciences to artificial intelligence - has observed that Theory of Mind capabilities are central to providing an explanation to another agent or when explaining that agent's behaviour. In this paper, we build and expand upon p…

    Submitted 6 May, 2020; originally announced May 2020.

  10. arXiv:1112.3323  [pdf, ps, other]

    cs.DS

    Independence of Tabulation-Based Hash Classes

    Authors: Toryn Qwyllyn Klassen, Philipp Woelfel

    Abstract: A tabulation-based hash function maps a key into d derived characters indexing random values in tables that are then combined with bitwise xor operations to give the hash. Thorup and Zhang (2004) presented d-wise independent tabulation-based hash classes that use linear maps over finite fields to map a key, considered as a vector (a,b), to derived characters. We show that a variant where the deriv…

    Submitted 14 December, 2011; originally announced December 2011.

    Comments: 12 pages with 2 page appendix showing experimental results