A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2016; you can also visit the original URL.
The file type is application/pdf.
Efficient Identification of State in Reinforcement Learning
2009
Künstliche Intelligenz
A very general framework for modeling uncertainty in learning environments is given by partially observable Markov decision processes (POMDPs). In a POMDP setting, the learning agent infers a policy for acting optimally in all possible states of the environment while receiving only partial observations of these states. To represent an optimal policy for a POMDP, it is generally necessary to employ some form of memory. Perfect memory is represented by the belief space, i.e. the space of probability distributions over the environment's states.
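The belief space mentioned above is maintained by a Bayes-filter update: after taking an action and receiving an observation, the agent predicts with the transition model and then reweights by the observation likelihood. Below is a minimal sketch of such a belief update for a hypothetical two-state POMDP; the matrices `T` and `O` and the single-action simplification are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Hypothetical two-state POMDP used only for illustration.
# T[s, s']  : transition probability P(s' | s) (one action, for brevity)
# O[s', o]  : observation probability P(o | s')
T = np.array([[0.9, 0.1],
              [0.2, 0.8]])
O = np.array([[0.85, 0.15],
              [0.30, 0.70]])

def belief_update(b, obs):
    """Bayes-filter update of a belief state (a distribution over states)."""
    predicted = T.T @ b              # predict: sum_s P(s'|s) * b(s)
    unnorm = O[:, obs] * predicted   # correct: weight by P(obs | s')
    return unnorm / unnorm.sum()     # renormalize to a probability distribution

b0 = np.array([0.5, 0.5])            # uniform initial belief
b1 = belief_update(b0, obs=0)        # belief after observing o = 0
```

Because each update maps one distribution to another, an exact optimal policy must in general be defined over this continuous space of beliefs, which is what makes memory representation in POMDPs costly.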
dblp:journals/ki/TimmerR09
fatcat:mtledogt5jcqjnlpczhya4x6di