Reinforcement learning

32 Results for: Book/Issue: AAMAS '18: Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems

Showing 1 - 20 of 32 Results

research-article
July 2018
DOP: Deep Optimistic Planning with Approximate Value Function Evaluation
AAMAS '18: Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, Pages 2210–2212

Research on reinforcement learning has demonstrated promising results in manifold applications and domains. Still, efficiently learning effective robot behaviors is very difficult, due to unstructured scenarios, high uncertainties, and large state ...
Total Citations: 1
Total Downloads: 63
Last 12 Months: 0
Last 6 weeks: 0
research-article
Public Access
July 2018
Multi-Armed Bandit Algorithms for Spare Time Planning of a Mobile Service Robot
- Max Korein,
- Manuela Veloso
AAMAS '18: Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, Pages 2195–2197

We assume that service robots will have spare time in between scheduled user requests, which they could use to perform additional unrequested services in order to learn a model of users' preferences and receive reward. However, a mobile service robot is ...
Total Citations: 0
Total Downloads: 165
Last 12 Months: 52
Last 6 weeks: 2
research-article
July 2018
Recurrent Deep Multiagent Q-Learning for Autonomous Agents in Future Smart Grid
AAMAS '18: Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, Pages 2136–2138

The broker mechanism is widely applied to serve for interested parties to derive long-term policies to reduce costs or gain profits in smart grid. However, brokers are faced with a number of challenging problems such as balancing demand and supply from ...
Total Citations: 0
Total Downloads: 132
Last 12 Months: 3
Last 6 weeks: 0
research-article
July 2018
An Optimal Algorithm for the Stochastic Bandits with Knowing Near-optimal Mean Reward
AAMAS '18: Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, Pages 2130–2132

This paper studies a variation of stochastic multi-armed bandit (MAB) problem where the agent knows a prior knowledge named Near-optimal Mean Reward (NoMR). We show that the cumulative regret of this bandit variation has a lower bound of Ω(1/Δ), ...
Total Citations: 0
Total Downloads: 66
Last 12 Months: 0
Last 6 weeks: 0
research-article
July 2018
Adaptive Incentive Selection for Crowdsourcing Contests
AAMAS '18: Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, Pages 2100–2102
Total Citations: 0
Total Downloads: 64
Last 12 Months: 0
Last 6 weeks: 0
research-article
July 2018
Trial without Error: Towards Safe Reinforcement Learning via Human Intervention
AAMAS '18: Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, Pages 2067–2069

During training, model-free reinforcement learning (RL) systems can explore actions that lead to harmful or costly consequences. Having a human "in the loop" and ready to intervene at all times can prevent these mistakes, but is prohibitively expensive ...
Total Citations: 7
Total Downloads: 496
Last 12 Months: 18
Last 6 weeks: 2
research-article
July 2018
RAIL: Risk-Averse Imitation Learning
AAMAS '18: Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, Pages 2062–2063

Imitation learning algorithms learn viable policies by imitating an expert's behavior when reward signals are not available. Generative Adversarial Imitation Learning (GAIL) is a state-of-the-art algorithm for learning policies when the expert's ...
Total Citations: 1
Total Downloads: 204
Last 12 Months: 0
Last 6 weeks: 0
research-article
July 2018
Algorithms to Manage Load Shedding Events in Developing Countries
AAMAS '18: Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, Pages 2034–2036

Due to the limited generation capacity of power stations, many developing countries frequently resort to disconnecting large parts of the power grid from supply, a process termed load shedding. This leaves homes without electricity, causing them ...
Total Citations: 0
Total Downloads: 49
Last 12 Months: 2
Last 6 weeks: 0
research-article
July 2018
Link-based Parameterized Micro-tolling Scheme for Optimal Traffic Management
AAMAS '18: Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, Pages 2013–2015

In the micro-tolling paradigm, different toll values are assigned to different links within a congestible traffic network. Self-interested agents then select minimal cost routes, where cost is a function of the travel time and tolls paid. A centralized ...
Total Citations: 1
Total Downloads: 59
Last 12 Months: 0
Last 6 weeks: 0
research-article
July 2018
Leveraging Observational Learning for Exploration in Bandits
AAMAS '18: Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, Pages 2001–2003

Learning from a target has been tackled in the reinforcement learning (RL) setting [1, 7] as imitation learning, either through behaviour cloning or inverse RL. In the former, the agent regresses directly onto the policy of a target [5], while in the ...
Total Citations: 0
Total Downloads: 78
Last 12 Months: 0
Last 6 weeks: 0
research-article
July 2018
Introspective Reinforcement Learning and Learning from Demonstration
AAMAS '18: Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, Pages 1992–1994

Reinforcement learning is a paradigm to model how an autonomous agent learns to maximise its cumulative reward by interacting with the environment. One challenge faced by reinforcement learning is that in many environments the reward signal is sparse, ...
Total Citations: 2
Total Downloads: 138
Last 12 Months: 1
Last 6 weeks: 0
research-article
July 2018
Guiding Reinforcement Learning Exploration Using Natural Language
AAMAS '18: Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, Pages 1956–1958

In this work we present a technique for using natural language to help reinforcement learning generalize to unseen environments using neural machine translation techniques. These techniques are then integrated into policy shaping to make it more ...
Total Citations: 1
Total Downloads: 125
Last 12 Months: 3
Last 6 weeks: 0
research-article
July 2018
Incident Prediction and Response Optimization
- Ayan Mukhopadhyay
AAMAS '18: Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, Pages 1758–1760

In urban areas across the globe, incidents like crime, fire and accidents often result in massive losses of life and property. In such scenarios, quick response can minimize or prevent damage. Emergency responder services are eager to adopt mechanisms ...
Total Citations: 0
Total Downloads: 83
Last 12 Months: 0
Last 6 weeks: 0
research-article
July 2018
Adaptive Dynamic Pricing for Market-based Allocation of Interdependent Commodities
- Jan Mrkos
AAMAS '18: Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, Pages 1755–1757

Ongoing digitization of all kinds of human enterprise is allowing sophisticated pricing strategies to be used in domains where previously this has not been feasible. In the mobility domain, commodities such as shared cars or electric vehicle charging ...
Total Citations: 0
Total Downloads: 47
Last 12 Months: 0
Last 6 weeks: 0
research-article
July 2018
Utility Decomposition for Planning under Uncertainty for Autonomous Driving
- Maxime Bouton
AAMAS '18: Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, Pages 1731–1732

The objective of this research is to provide scalable decision making algorithms for autonomously navigating urban environments. The vehicle must plan in a stochastic environment with many entities to avoid, rapid changes in driver behavior, and partial ...
Total Citations: 0
Total Downloads: 105
Last 12 Months: 0
Last 6 weeks: 0
research-article
July 2018
Decentralized Reinforcement Learning Inspired by Multiagent Systems
- Dhaval Adjodah
AAMAS '18: Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, Pages 1729–1730

Existence can perhaps be viewed as an exercise of searching high-dimensional, rugged, and approximated (using training data) landscapes for (often time-delayed) rewards. Bounded rationality imposes limits on the success of solutions that can be found by ...
Total Citations: 0
Total Downloads: 94
Last 12 Months: 1
Last 6 weeks: 0
research-article
July 2018
Behavior Model Calibration for Epidemic Simulations
AAMAS '18: Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, Pages 1640–1648

Computational epidemiologists frequently employ large-scale agent-based simulations of human populations to study disease outbreaks and assess intervention strategies. The agents used in such simulations rarely capture the real-world decision-making of ...
Total Citations: 3
Total Downloads: 118
Last 12 Months: 3
Last 6 weeks: 0
research-article
July 2018
Human-Interactive Subgoal Supervision for Efficient Inverse Reinforcement Learning
- Xinlei Pan,
- Yilin Shen
AAMAS '18: Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, Pages 1380–1387

Humans are able to understand and perform complex tasks by strategically structuring tasks into incremental steps or sub-goals. For a robot attempting to learn to perform a sequential task with critical subgoal states, these subgoal states can provide a ...
Total Citations: 1
Total Downloads: 135
Last 12 Months: 1
Last 6 weeks: 0
research-article
July 2018
Multi-Armed Bandit Algorithms for Crowdsourcing Systems with Online Estimation of Workers' Ability
- Anshuka Rangi,
- Massimo Franceschetti
AAMAS '18: Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, Pages 1345–1352

Crowdsourcing systems have become a valuable solution for various organizations to outsource work on a temporary basis. Quality assurance in these systems remains a key issue due to the distributed setup of the crowdsourcing platforms and the absence of ...
Total Citations: 3
Total Downloads: 188
Last 12 Months: 9
Last 6 weeks: 0
research-article
Public Access
July 2018
Faster Policy Adaptation in Environments with Exogeneity: A State Augmentation Approach
AAMAS '18: Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, Pages 1035–1043

The reinforcement learning literature typically assumes fixed state transition functions for the sake of tractability. However, in many real-world tasks, the state transition function changes over time, and this change may be governed by exogenous ...
Total Citations: 0
Total Downloads: 95
Last 12 Months: 36
Last 6 weeks: 10
