Cited By
- Bozkus T, Mitra U (2024) Multi-Timescale Ensemble $Q$-Learning for Markov Decision Process Policy Optimization. IEEE Transactions on Signal Processing 72, 1427-1442. DOI: 10.1109/TSP.2024.3372699. Online publication date: 1-Jan-2024
- Goktas D, Prakash A, Greenwald A (2023) Convex-concave 0-sum Markov Stackelberg games. In Oh A, Naumann T, Globerson A, Saenko K, Hardt M, Levine S (eds.), Proceedings of the 37th International Conference on Neural Information Processing Systems, 66818-66832. DOI: 10.5555/3666122.3669039. Online publication date: 10-Dec-2023
- Zhang S, Li H, Wang M, Liu M, Chen P, Lu S, Liu S, Murugesan K, Chaudhury S (2023) On the convergence and sample complexity analysis of deep Q-networks with ε-greedy exploration. In Oh A, Naumann T, Globerson A, Saenko K, Hardt M, Levine S (eds.), Proceedings of the 37th International Conference on Neural Information Processing Systems, 13064-13102. DOI: 10.5555/3666122.3666694. Online publication date: 10-Dec-2023