Jan 9, 2012 · This algorithm generates the splitting policies so that each pair of consecutive policies differs at exactly one state. The results are ...
... as a convex combination of the occupancy measures of stationary policies, each selecting deterministic actions on the given ...
This paper studies a discrete-time total-reward Markov decision process (MDP) with a given initial state distribution. A (randomized) stationary policy can ...
An efficient algorithm is provided that presents the occupancy measure of a given policy as a convex combination of the occupancy measures of finitely many ...
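The occupancy measure such an algorithm works with is itself easy to compute for a stationary policy. The following is a minimal sketch in the discounted special case of the total-reward criterion; the two-state MDP, the transition probabilities, and the initial distribution are made-up illustrative data, not taken from the paper:

```python
import numpy as np

# Illustrative sketch (discounted special case of a total-reward MDP);
# all numerical data below is made up for demonstration.
gamma = 0.9
# P[a, s, s2] = probability of moving from state s to s2 under action a.
P = np.array([
    [[0.8, 0.2], [0.3, 0.7]],   # action 0
    [[0.1, 0.9], [0.6, 0.4]],   # action 1
])
mu = np.array([0.5, 0.5])       # given initial state distribution

def occupancy(policy):
    """State-action occupancy x(s, a) = sum_t gamma^t Pr(s_t = s, a_t = a)
    of a stationary policy, where policy[s, a] = probability of a in s."""
    # Transition matrix induced by the policy:
    # P_pi[s, s2] = sum_a policy[s, a] * P[a, s, s2].
    P_pi = np.einsum('sa,ast->st', policy, P)
    # State occupancies solve the flow equation nu = mu + gamma * P_pi^T nu.
    nu = np.linalg.solve(np.eye(len(mu)) - gamma * P_pi.T, mu)
    return nu[:, None] * policy  # x(s, a) = nu(s) * policy(a | s)

x = occupancy(np.array([[0.3, 0.7], [0.0, 1.0]]))
# Total mass of a discounted occupancy measure is 1 / (1 - gamma).
print(x.sum())
```

Solving one small linear system replaces the infinite discounted sum; the same flow equations define the polytope of occupancy measures whose vertices correspond to deterministic stationary policies.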
Splitting Randomized Stationary Policies in Total-Reward Markov ...
Splitting Randomized Stationary Policies in Total-Reward Markov Decision Processes. E. Feinberg and U. Rothblum. Math. Oper. Res., 37(1): 129-153 (2012).
If this is possible for a given policy, we say that the policy can be split. In particular, we are interested in splitting a randomized stationary policy into ...
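The simplest case of such a split is a policy that randomizes at exactly one state. Below is a hedged numerical sketch in the discounted special case, again with made-up data: the randomized policy's occupancy measure turns out to be a convex combination of the occupancy measures of the two deterministic policies that agree with it everywhere else, with the mixing weight pinned down by the mass the randomized policy places on one action at the split state:

```python
import numpy as np

# Illustrative sketch (discounted special case, made-up data): split a policy
# randomizing at a single state into two deterministic policies and verify
# the convex-combination identity for the occupancy measures.
gamma = 0.9
P = np.array([
    [[0.8, 0.2], [0.3, 0.7]],   # P[a, s, s2] for action 0
    [[0.1, 0.9], [0.6, 0.4]],   # ... for action 1
])
mu = np.array([0.5, 0.5])       # given initial state distribution

def occupancy(policy):
    """x(s, a) = sum_t gamma^t Pr(s_t = s, a_t = a) for a stationary policy."""
    P_pi = np.einsum('sa,ast->st', policy, P)
    nu = np.linalg.solve(np.eye(len(mu)) - gamma * P_pi.T, mu)
    return nu[:, None] * policy

# pi randomizes only at state 0; d1 and d2 are its deterministic "splits".
pi = np.array([[0.3, 0.7], [0.0, 1.0]])
d1 = np.array([[1.0, 0.0], [0.0, 1.0]])   # take action 0 at state 0
d2 = np.array([[0.0, 1.0], [0.0, 1.0]])   # take action 1 at state 0

x, x1, x2 = occupancy(pi), occupancy(d1), occupancy(d2)
# The mixing weight is determined by the mass pi puts on (state 0, action 0):
alpha = x[0, 0] / x1[0, 0]
print(np.allclose(x, alpha * x1 + (1 - alpha) * x2))   # True
```

Note that alpha is not simply the randomization probability 0.3: the weight depends on how often each deterministic policy visits the split state.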
In this paper, we investigate a Markov decision process with constraints on a Borel state space under the expected total-reward criterion.
Apr 28, 2015 · The initial state distribution is fixed. According to ..., for a given randomized stationary policy, its occupation measure can be represented as a convex combination ...
Rothblum, "Splitting Randomized Stationary Policies in Total-Reward Markov Decision Processes," Mathematics of Operations Research, v.37, 2012, p.129. E.A. ...
This paper presents three conditions, each of which guarantees the uniqueness of optimal policies in discounted Markov decision processes. The conditions ...