Abstract
We formalize a model for supervised learning of action strategies in dynamic stochastic domains and show that PAC-learning results on Occam algorithms hold in this model as well. We then identify a class of rule-based action strategies for which polynomial time learning is possible. The representation of strategies is a generalization of decision lists; strategies include rules with existentially quantified conditions, simple recursive predicates, and small internal state, but are syntactically restricted. We also study the learnability of hierarchically composed strategies where a subroutine already acquired can be used as a basic action in a higher level strategy. We prove some positive results in this setting, but also show that in some cases the hierarchical learning problem is computationally hard.
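Since the strategy representation described above generalizes decision lists, the flavor of the underlying learning procedure can be illustrated by the classic greedy algorithm for plain (propositional) 1-decision lists in the style of Rivest (1987). The sketch below is illustrative only: the feature encoding and function names are assumptions, and it omits the existentially quantified conditions, recursive predicates, and internal state that the paper's strategies add.

```python
# Greedy learning of a 1-decision list (Rivest, 1987) -- a minimal
# illustrative sketch, not the paper's algorithm for action strategies.
# An example is (features, label) where features maps names to booleans.

def learn_decision_list(examples):
    """Return rules [(feature, value, label), ...] consistent with examples."""
    remaining = list(examples)
    rules = []
    feature_names = sorted({f for x, _ in examples for f in x})
    tests = [(f, v) for f in feature_names for v in (True, False)]
    while remaining:
        for feat, val in tests:
            covered = [(x, y) for x, y in remaining if x[feat] == val]
            labels = {y for _, y in covered}
            # Greedy step: pick any test whose covered examples agree on a label.
            if covered and len(labels) == 1:
                rules.append((feat, val, labels.pop()))
                remaining = [(x, y) for x, y in remaining if x[feat] != val]
                break
        else:
            raise ValueError("no consistent 1-decision list exists")
    return rules

def predict(rules, x):
    """Evaluate the list: the first rule whose test fires gives the label."""
    for feat, val, label in rules:
        if x[feat] == val:
            return label
    return False  # default label when no rule fires
```

Because each iteration removes at least one example, the loop runs at most linearly in the sample size, and the hypothesis has at most as many rules as examples; this size bound is what lets the Occam-algorithm results mentioned in the abstract apply.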
Khardon, R. Learning to Take Actions. Machine Learning 35, 57–90 (1999). https://doi.org/10.1023/A:1007571119753