Discover Life Skills for Planning with Bandits via Observing and Learning How the World Works

Lai, Tin

Computer Science > Artificial Intelligence

arXiv:2207.08130v1 (cs)

[Submitted on 17 Jul 2022]

Title: Discover Life Skills for Planning with Bandits via Observing and Learning How the World Works

Authors: Tin Lai

Abstract: We propose a novel approach for planning agents to compose abstract skills by observing and learning from historical interactions with the world. Our framework operates in a Markov state-space model with a set of actions whose pre-conditions are unknown. We formulate skills as high-level abstract policies that propose action plans based on the current state. Each policy learns new plans by observing state transitions while the agent interacts with the world. This approach automatically learns new plans that achieve specific intended effects, but the success of a plan often depends on the state in which it is applied. We therefore formulate the evaluation of such plans as infinitely many multi-armed bandit problems, in which we balance allocating resources between evaluating the success probability of existing arms and exploring new options. The result is a planner capable of automatically learning robust high-level skills in a noisy environment; such skills implicitly learn the action pre-conditions without explicit knowledge of them. We show experimentally that this planning approach is very competitive in high-dimensional state-space domains.
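The trade-off described in the abstract, between re-evaluating plans already tried and exploring new ones, can be illustrated with a small bandit sketch. This is not the paper's algorithm: it assumes Bernoulli success outcomes, uses Thompson sampling over Beta posteriors as a stand-in selection rule, and introduces a new arm with a fixed probability. The names `PlanBandit`, `new_arm_prob`, and `propose_plan` are hypothetical and chosen only for illustration.

```python
import random


class PlanBandit:
    """Bernoulli bandit over an open-ended set of candidate plans.

    Illustrative sketch only (not the paper's method). Plans must be
    hashable, e.g. tuples of action names.
    """

    def __init__(self, new_arm_prob=0.2, rng=None):
        self.new_arm_prob = new_arm_prob   # assumed exploration rate
        self.rng = rng or random.Random()
        self.stats = {}                    # plan -> [successes, failures]

    def select(self, propose_plan):
        """Return a plan to try next.

        propose_plan is a caller-supplied function that returns a
        candidate plan (possibly one that has been seen before).
        """
        # With no arms yet, or with probability new_arm_prob, ask for a
        # candidate plan and register it if it is new.
        if not self.stats or self.rng.random() < self.new_arm_prob:
            plan = propose_plan()
            self.stats.setdefault(plan, [0, 0])
            return plan
        # Otherwise use Thompson sampling: draw one sample from each
        # plan's Beta(successes + 1, failures + 1) posterior and pick
        # the plan with the largest draw.
        return max(
            self.stats,
            key=lambda p: self.rng.betavariate(
                self.stats[p][0] + 1, self.stats[p][1] + 1
            ),
        )

    def update(self, plan, success):
        """Record one observed outcome of executing the plan."""
        counts = self.stats.setdefault(plan, [0, 0])
        counts[0 if success else 1] += 1

    def estimate(self, plan):
        """Posterior mean success probability under a uniform prior."""
        successes, failures = self.stats[plan]
        return (successes + 1) / (successes + failures + 2)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical plans with hidden success probabilities.
    hidden = {("pick", "place"): 0.8, ("push",): 0.3, ("pull", "push"): 0.5}
    bandit = PlanBandit(new_arm_prob=0.2, rng=rng)
    for _ in range(200):
        plan = bandit.select(lambda: rng.choice(list(hidden)))
        bandit.update(plan, rng.random() < hidden[plan])
    for plan in bandit.stats:
        print(plan, round(bandit.estimate(plan), 2))
```

In this sketch the fixed `new_arm_prob` plays the role of the exploration budget; a scheme that shrinks it as more arms accumulate would be an equally plausible choice.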

Subjects:	Artificial Intelligence (cs.AI); Robotics (cs.RO); Machine Learning (stat.ML)
Cite as:	arXiv:2207.08130 [cs.AI]
	(or arXiv:2207.08130v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2207.08130

Submission history

From: Tin Lai
[v1] Sun, 17 Jul 2022 10:05:54 UTC (213 KB)
