Abstract
We develop an algorithm to compute optimal policies for Markov decision processes subject to constraints arising from observability restrictions on the process. The state of the Markov process is assumed to be unobservable; instead, an observable process related to the unobservable state is available, and we seek a decision rule that depends only on this observable process. The objective is to minimize the expected average cost over an infinite horizon. We also analyze the possibility of performing more detailed observations in order to obtain improved policies.
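The setting described in the abstract, a controlled Markov chain whose policy may depend only on an observation of the state rather than the state itself, can be illustrated with a small numerical sketch. The instance below (state space, transition matrices, costs, and the observation map) is entirely hypothetical and not taken from the paper; it simply evaluates the long-run expected average cost of each observation-based stationary policy on a toy chain by computing the stationary distribution of the induced Markov chain.

```python
import numpy as np

# Hypothetical toy instance (not from the paper): 3 states, 2 actions.
# P[a][s, s'] are transition probabilities under action a;
# c[s, a] is the one-step cost of taking action a in state s.
P = np.array([
    [[0.7, 0.2, 0.1],
     [0.1, 0.8, 0.1],
     [0.2, 0.3, 0.5]],
    [[0.3, 0.4, 0.3],
     [0.2, 0.2, 0.6],
     [0.5, 0.4, 0.1]],
])
c = np.array([[1.0, 2.0],
              [4.0, 0.5],
              [2.0, 3.0]])

# Observation map: states 0 and 1 are indistinguishable (both emit
# observation 0), while state 2 emits observation 1.  An admissible
# policy may depend only on the observation, not the hidden state.
obs = np.array([0, 0, 1])

def average_cost(policy):
    """Long-run expected average cost of an observation-based policy.

    policy[o] is the action taken whenever observation o is seen.
    """
    actions = policy[obs]              # action actually taken in each state
    Q = P[actions, np.arange(3), :]    # induced chain: row s is P[a(s)][s, :]
    # Stationary distribution: solve pi Q = pi together with sum(pi) = 1.
    A = np.vstack([Q.T - np.eye(3), np.ones(3)])
    b = np.array([0.0, 0.0, 0.0, 1.0])
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return float(pi @ c[np.arange(3), actions])

# Enumerate all observation-based stationary policies, keep the cheapest.
best = min(((p0, p1) for p0 in (0, 1) for p1 in (0, 1)),
           key=lambda pol: average_cost(np.array(pol)))
print("best policy:", best, "average cost:", average_cost(np.array(best)))
```

Brute-force enumeration is feasible only for tiny observation and action spaces; the algorithm developed in the paper addresses the general optimization problem under such observability constraints.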
Additional information
Manuscript received: March 2004/Final version received: June 2004
Cite this article
Serin, Y., Kulkarni, V. Markov decision processes under observability constraints. Math Meth Oper Res 61, 311–328 (2005). https://doi.org/10.1007/s001860400402