Multi-time models for temporally abstract planning

D. Precup, R. S. Sutton - Advances in Neural Information Processing Systems, 1997 - proceedings.neurips.cc
Abstract
Planning and learning at multiple levels of temporal abstraction is a key problem for artificial intelligence. In this paper we summarize an approach to this problem based on the mathematical framework of Markov decision processes and reinforcement learning. Current model-based reinforcement learning is based on one-step models that cannot represent common-sense higher-level actions, such as going to lunch, grasping an object, or flying to Denver. This paper generalizes prior work on temporally abstract models [Sutton, 1995] and extends it from the prediction setting to include actions, control, and planning. We introduce a more general form of temporally abstract model, the multi-time model, and establish its suitability for planning and learning by virtue of its relationship to the Bellman equations. This paper summarizes the theoretical framework of multi-time models and illustrates their potential advantages in a grid world planning task.
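The abstract's claim that multi-time models are suited to planning "by virtue of their relationship to the Bellman equations" can be sketched as follows (the notation here is an illustrative assumption, not the paper's own): a one-step model predicts the expected immediate reward and the discounted next-state distribution, while a multi-time model of a temporally abstract action predicts the expected discounted reward accumulated over the action's entire duration, together with a discounted prediction of the state in which the action terminates. Both kinds of model then plug into the same Bellman-style planning backup:

```latex
% Hedged sketch; symbols r_a, p_a, \mathcal{A}(s) are assumed notation.
V(s) \;\leftarrow\; \max_{a \in \mathcal{A}(s)} \Big[\, r_a(s) + \sum_{s'} p_a(s, s')\, V(s') \,\Big]
```

For a one-step action, $r_a(s)$ is the expected immediate reward and $p_a(s,s') = \gamma \Pr(s' \mid s, a)$; for an abstract action, $r_a(s)$ accumulates discounted rewards over the action's whole duration and $p_a$ absorbs the discounting for the (possibly random) number of steps the action takes. Because both satisfy the same backup form, one-step and abstract models can be mixed freely during planning.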
The need for hierarchical and abstract planning is a fundamental problem in AI (see, e.g., Sacerdoti, 1977; Laird et al., 1986; Korf, 1985; Kaelbling, 1993; Dayan & Hinton, 1993). Model-based reinforcement learning offers a possible solution to the problem of integrating planning with real-time learning and decision-making (Peng & Williams, 1993; Moore & Atkeson, 1993; Sutton & Barto, 1998). However, current model-based reinforcement learning is based on one-step models that cannot represent common-sense, higher-level actions. Modeling such actions requires the ability to handle different, interrelated levels of temporal abstraction.
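To make the contrast between one-step and temporally abstract models concrete, here is a minimal sketch in Python with NumPy. The chain MDP, the variable names, and the composition scheme are illustrative assumptions, not the paper's construction: a model of an abstract "keep going right" action is composed from a one-step model, so that a single backup with the abstract model summarizes what would otherwise take several one-step backups.

```python
import numpy as np

# Toy 4-state chain MDP (an assumed example, not from the paper).
# One-step action "right" moves s -> s+1; state 3 is absorbing.
n_states, gamma = 4, 0.9

# One-step model of "right": transition matrix P and expected-reward vector r.
P_right = np.zeros((n_states, n_states))
for s in range(n_states - 1):
    P_right[s, s + 1] = 1.0
P_right[n_states - 1, n_states - 1] = 1.0  # absorbing terminal state
r_right = np.zeros(n_states)
r_right[n_states - 2] = 1.0  # reward for stepping into the terminal state

# Compose a multi-time model of the abstract action "go right n_states
# times": rho accumulates the expected discounted reward along the way,
# and beta becomes gamma^k P^k, a discounted terminal-state prediction.
beta = np.eye(n_states)
rho = np.zeros(n_states)
for _ in range(n_states):
    rho = rho + beta @ r_right     # add the k-th step's discounted reward
    beta = gamma * beta @ P_right  # discount and advance one more step

# One Bellman-style backup with the abstract model, starting from V = 0,
# is equivalent to n_states one-step backups from the same start.
V = np.zeros(n_states)
V_abstract = rho + beta @ V
```

A single application of the pair `(rho, beta)` compresses many one-step backups into one, which is the kind of planning speed-up the paper illustrates in its grid-world task.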