- research-article, July 2023
Trajectory-aware eligibility traces for off-policy reinforcement learning
ICML'23: Proceedings of the 40th International Conference on Machine Learning, Article No.: 273, Pages 6818–6835
Off-policy learning from multistep returns is crucial for sample-efficient reinforcement learning, but counteracting off-policy bias without exacerbating variance is challenging. Classically, off-policy bias is corrected in a per-decision manner: past ...
- article, June 2023
On Centralized Critics in Multi-Agent Reinforcement Learning
Centralized Training for Decentralized Execution, where agents are trained offline in a centralized fashion and execute online in a decentralized manner, has become a popular approach in Multi-Agent Reinforcement Learning (MARL). In particular, it has ...
- extended-abstract, May 2021
Stratified Experience Replay: Correcting Multiplicity Bias in Off-Policy Reinforcement Learning
AAMAS '21: Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems, Pages 1486–1488
Deep Reinforcement Learning (RL) methods rely on experience replay to approximate the minibatched supervised learning setting; however, unlike supervised learning where access to lots of training data is crucial to generalization, replay-based deep RL ...
- research-article, May 2021
Contrasting Centralized and Decentralized Critics in Multi-Agent Reinforcement Learning
AAMAS '21: Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems, Pages 844–852
Centralized Training for Decentralized Execution, where agents are trained offline using centralized information but execute in a decentralized manner online, has gained popularity in the multi-agent reinforcement learning community. In particular, actor-...
- research-article, December 2019
Reconciling λ-returns with experience replay
NIPS'19: Proceedings of the 33rd International Conference on Neural Information Processing Systems, December 2019, Article No.: 102, Pages 1133–1142
Modern deep reinforcement learning methods have departed from the incremental learning required for eligibility traces, rendering the implementation of the λ-return difficult in this context. In particular, off-policy methods that utilize experience ...
- research-article, January 2015
NUPAR: A Benchmark Suite for Modern GPU Architectures
- Yash Ukidave,
- Fanny Nina Paravecino,
- Leiming Yu,
- Charu Kalra,
- Amir Momeni,
- Zhongliang Chen,
- Nick Materise,
- Brett Daley,
- Perhaad Mistry,
- David Kaeli
ICPE '15: Proceedings of the 6th ACM/SPEC International Conference on Performance Engineering, Pages 253–264
https://doi.org/10.1145/2668930.2688046
Heterogeneous systems consisting of multi-core CPUs, Graphics Processing Units (GPUs) and many-core accelerators have gained widespread use by application developers and data-center platform developers. Modern day heterogeneous systems have evolved to ...