Learning Policies with External Memory

Peshkin, Leonid; Meuleau, Nicolas; Kaelbling, Leslie

arXiv:cs/0103003 (cs)

[Submitted on 2 Mar 2001]

Abstract: In order for an agent to perform well in partially observable domains, it is usually necessary for actions to depend on the history of observations. In this paper, we explore a stigmergic approach, in which the agent's actions include the ability to set and clear bits in an external memory, and the external memory is included as part of the input to the agent. In this case, we need to learn a reactive policy in a highly non-Markovian domain. We explore two algorithms: SARSA(λ), which has had empirical success in partially observable domains, and VAPS, a new algorithm due to Baird and Moore, with convergence guarantees in partially observable domains. We compare the performance of these two algorithms on benchmark problems.
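
The external-memory idea in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration, not the paper's code: the `CueEnv` toy task, the `ExternalMemoryWrapper` class, the single memory bit, the `(env_action, memory_op)` action encoding, and the `reset`/`step` interface are all hypothetical. The policy is written by hand to show that a purely reactive mapping from (observation, memory bit) to action can solve a task that requires remembering an earlier observation; the paper instead learns such policies with SARSA(λ) and VAPS.

```python
from typing import Hashable, Tuple

# Hypothetical encoding of the memory-modifying part of an action.
MEMORY_OPS = ("keep", "set", "clear")


class CueEnv:
    """Toy partially observable task (hypothetical, not from the paper).

    The first observation shows a cue (0 or 1). The next observation is
    blank, and the action taken there is rewarded only if it equals the cue.
    """

    def __init__(self, cue: int):
        self.cue = cue
        self.t = 0

    def reset(self) -> Hashable:
        self.t = 0
        return ("cue", self.cue)

    def step(self, action: int) -> Tuple[Hashable, float, bool]:
        self.t += 1
        if self.t == 1:
            return "blank", 0.0, False
        reward = 1.0 if action == self.cue else 0.0
        return "end", reward, True


class ExternalMemoryWrapper:
    """Adds one external memory bit to both observations and actions."""

    def __init__(self, env):
        self.env = env
        self.bit = 0

    def reset(self):
        self.bit = 0
        # The memory bit is part of the agent's input.
        return (self.env.reset(), self.bit)

    def step(self, action):
        env_action, memory_op = action
        if memory_op not in MEMORY_OPS:
            raise ValueError(f"unknown memory op: {memory_op!r}")
        if memory_op == "set":
            self.bit = 1
        elif memory_op == "clear":
            self.bit = 0
        obs, reward, done = self.env.step(env_action)
        return (obs, self.bit), reward, done


def reactive_policy(augmented_obs):
    """Hand-written reactive policy: depends only on the current input."""
    obs, bit = augmented_obs
    if isinstance(obs, tuple):  # cue observation: store the cue in the bit
        return (0, "set" if obs[1] == 1 else "clear")
    return (bit, "keep")  # blank observation: act on the stored bit


def run_episode(cue: int) -> float:
    env = ExternalMemoryWrapper(CueEnv(cue))
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, done = env.step(reactive_policy(obs))
        total += reward
    return total
```

Calling `run_episode(0)` or `run_episode(1)` runs the hand-written policy on the toy task. Without the memory bit, the blank observation would look identical for both cues, so no reactive policy over raw observations could choose the rewarded action reliably.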

Comments:	8 pages
Subjects:	Machine Learning (cs.LG)
ACM classes:	I.2.8;I.2.6;I.2.11;I.2;I.2.3
Cite as:	arXiv:cs/0103003 [cs.LG]
	(or arXiv:cs/0103003v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.cs/0103003
Journal reference:	In Bratko, I., and Dzeroski, S., eds., Machine Learning: Proceedings of the Sixteenth International Conference, pp. 307-314. Morgan Kaufmann, San Francisco, CA

Submission history

From: Leonid Peshkin
[v1] Fri, 2 Mar 2001 01:55:46 UTC (39 KB)
