Abstract
The International Planning Competition is a biennial event organized in the context of the International Conference on Automated Planning and Scheduling. The 2008 competition included, for the first time, a learning track for comparing approaches to improving automated planners via learning. In this paper, we describe the structure of the learning track, the planning domains used for evaluation, the participating systems, the results, and our observations. To support the goal of domain-independent learning, a key feature of the competition was to disallow any code changes or parameter tweaks after the training domains were revealed to the participants. The competition results show that, at this stage, no learning-for-planning system outperforms state-of-the-art planners in a domain-independent manner across a wide range of domains. However, the best systems appear close to providing such performance. Evaluating learning-for-planning systems in a blind competition raises important questions concerning the criteria that should be taken into account in future competitions.
Additional information
Editors: S. Whiteson and M. Littman.
Cite this article
Fern, A., Khardon, R. & Tadepalli, P. The first learning track of the international planning competition. Mach Learn 84, 81–107 (2011). https://doi.org/10.1007/s10994-011-5234-y