Optimizing data usage via differentiable rewards
Publisher
JMLR.org
Qualifiers
- Research-article
- Research
- Refereed limited