In this paper we show that for discounted MDPs with discount factor γ > 1/2 the asymptotic rate of convergence of Q-learning ...
In this paper we show that for discounted MDPs with discount factor γ > 1/2 the asymptotic rate of convergence of Q-learning is O(1/t^{R(1-γ)}) if R(1-γ) < 1/2 and O(√(log log ...
γ > 1/2 the asymptotic rate of convergence of Q-learning is O(1/t^{R(1-γ)}) if R(1-γ) < 1/2 and O(√(log log t / t)) otherwise provided that the state-action pairs are ...
Dec 1, 1997 · In this paper we show that for discounted MDPs with discount factor γ > 1/2 the asymptotic rate of convergence of Q-learning is O(1/t^{R(1-γ)}) ...
In this paper we show that for discounted MDPs with discount factor $\gamma>1/2$ the asymptotic rate of convergence of Q-learning is $O(1/t^{R(1-\gamma)})$ ...
Jan 23, 2023 · We show a sufficient condition on the rate of exploration such that the Q-Learning dynamics is guaranteed to converge to a unique equilibrium in ...
Jul 11, 2019 · The asymptotic rate of convergence of Q-learning is O(1/t^{R(1-γ)}) if R(1-γ) < 0.5, where R = P_min/P_max and P is the state-action occupation frequency.
In this paper we derive convergence rates for Q-learning. We show an interesting relationship between the convergence rate and the learning rate used in Q- ...
In this paper, we provide such a framework, and use it to derive the first finite-time convergence rates (sample size bounds) for both Q-learning and the ...
May 29, 2023 · We show a sufficient condition on the rate of exploration such that the Q-Learning dynamics is guaranteed to converge to a unique equilibrium ...