Jul 7, 2020 · In this paper, we unify these estimators as regularized Lagrangians of the same linear program. The unification allows us to expand the space of ...
We have proposed a unified view of off-policy evaluation via the regularized Lagrangian of the d-LP. Under this unification, existing DICE algorithms are ...
Off-Policy Evaluation via the Regularized Lagrangian. Sherry Yang*, Ofir Nachum*, Bo Dai*, Lihong Li, Dale Schuurmans. Google Brain. 1. Paper: https://arxiv.
Summary and Contributions: This paper tries to unify the recent minimax approaches for off-policy evaluation using the Lagrangian. The main contribution is the ...
Jul 24, 2020 · We have proposed a unified view of off-policy evaluation via the regularized Lagrangian of the d-LP. Under this unification, existing DICE ...
The unification of DICE estimators as regularized Lagrangians of the same linear program finds that dual solutions offer greater flexibility in navigating ...
Dec 6, 2020 · In this paper, we unify these estimators as regularized Lagrangians of the same linear program. The unification allows us to expand the space of ...
In this paper, we unify these estimators as regularized Lagrangians of the same linear program. The unification allows us to expand the space of DICE estimators ...
Dec 6, 2020 · Neural Information Processing Systems (NeurIPS) is a multi-track machine learning and computational neuroscience conference that includes ...