Abstract
This chapter surveys methods that transform a relational representation of a learning problem into a propositional (feature-based, attribute-value) representation. This kind of representation change is known as propositionalization. With such an approach, feature construction can be decoupled from model construction. It has been shown that in many relational data mining applications this can be done without loss of predictive performance. After reviewing both general-purpose and domain-dependent propositionalization approaches from the literature, the chapter describes an extension to the Linus propositionalization method that overcomes the system’s earlier inability to deal with non-determinate local variables.
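To make the idea of propositionalization concrete, the following is a minimal sketch, not the Linus algorithm or any method from the chapter. It assumes a hypothetical toy dataset in the spirit of the East-West trains problem: a `trains` relation (the individuals, with a class label) and a `cars` relation linked to it one-to-many. The names `trains`, `cars`, `propositionalize`, and the feature names are illustrative assumptions. Because a train can have several cars, the car acts as a non-determinate local variable; the sketch handles it with existential Boolean features and a count.

```python
from typing import Dict, List, Tuple

# Hypothetical toy relational data (an assumption for illustration):
# trains are the individuals; cars are linked to trains one-to-many.
trains: Dict[str, str] = {"t1": "east", "t2": "west"}
cars: List[Tuple[str, str, str]] = [
    # (train_id, shape, load)
    ("t1", "rectangle", "circle"),
    ("t1", "oval", "triangle"),
    ("t2", "rectangle", "triangle"),
]


def propositionalize(trains, cars):
    """Build one attribute-value row per train from the two relations."""
    # Candidate features are existential tests of the form
    # "the train has at least one car with attr = value".
    tests = sorted(
        {("shape", shape) for _, shape, _ in cars}
        | {("load", load) for _, _, load in cars}
    )
    rows = []
    for train_id in sorted(trains):
        own_cars = [c for c in cars if c[0] == train_id]
        row = {"train": train_id, "n_cars": len(own_cars)}
        for attr, value in tests:
            column = 1 if attr == "shape" else 2
            row[f"has_car_{attr}_{value}"] = any(c[column] == value for c in own_cars)
        row["class"] = trains[train_id]
        rows.append(row)
    return rows


table = propositionalize(trains, cars)
for row in table:
    print(row)
```

The resulting `table` is a single flat table that any attribute-value learner could consume, which is the sense in which feature construction is decoupled from model construction. The choice of existential tests and a count is only one possible feature language.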
References
R. Agrawal, H. Mannila, R. Srikant, H. Toivonen, and A.I. Verkamo. Fast discovery of association rules. In U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, editors, Advances in Knowledge Discovery and Data Mining, pages 307–328. MIT Press, Cambridge, MA, 1996.
E. Alphonse and C. Rouveirol. Lazy propositionalisation for relational learning. In Proceedings of the Fourteenth European Conference on Artificial Intelligence, pages 256–260. IOS Press, Amsterdam, 2000.
I. Bratko, I. Mozetič, and N. Lavrač. KARDIO: A Study in Deep and Qualitative Knowledge for Expert Systems. MIT Press, Cambridge, MA, 1989.
C.J.C. Burges. A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery, 2(2):121–167, 1998.
B. Cestnik, I. Kononenko, and I. Bratko. ASSISTANT 86: A knowledge elicitation tool for sophisticated users. In Proceedings of the Second European Working Session on Learning, pages 31–44. Sigma Press, Wilmslow, UK, 1987.
Y. Chevaleyre and J-D. Zucker. Noise-tolerant rule induction from multi-instance data. In Proceedings of the ICML-2000 Workshop on Attribute-Value and Relational Learning: Crossing the Boundaries, pages 1–11. Stanford University, Stanford, CA, 2000.
P. Clark and R. Boswell. Rule induction with CN2: Some recent improvements. In Proceedings of the Fifth European Working Session on Learning, pages 151–163. Springer, Berlin, 1991.
P. Clark and T. Niblett. The CN2 induction algorithm. Machine Learning, 3(4):261–283, 1989.
W.W. Cohen. PAC-learning nondeterminate clauses. In Proceedings of the Twelfth National Conference on Artificial Intelligence, pages 676–681. AAAI Press, Menlo Park, CA, 1994.
W.W. Cohen. Learning trees and rules with set-valued features. In Proceedings of the Thirteenth National Conference on Artificial Intelligence, pages 709–716. AAAI Press, Menlo Park, CA, 1996.
D.J. Cook and L.B. Holder. Substructure discovery using minimum description length and background knowledge. Journal of Artificial Intelligence Research, 1:231–255, 1994.
L. Dehaspe and H. Toivonen. Discovery of frequent Datalog patterns. Data Mining and Knowledge Discovery, 3(1):7–36, 1999.
L. De Raedt. Logical settings for concept learning. Artificial Intelligence, 95:187–201, 1997.
L. De Raedt. Attribute-value learning versus inductive logic programming: The missing links (extended abstract). In Proceedings of the Eighth International Conference on Inductive Logic Programming, pages 1–8. Springer, Berlin, 1998.
T.G. Dietterich, R.H. Lathrop and T. Lozano-Perez. Solving the multiple-instance problem with axis-parallel rectangles. Artificial Intelligence, 89(1–2):31–71, 1997.
S. Džeroski, H. Blockeel, B. Kompare, S. Kramer, B. Pfahringer, and W. Van Laer. Experiments in predicting biodegradability. In Proceedings of the Ninth International Workshop on Inductive Logic Programming, pages 80–91. Springer, Berlin, 1999.
D. Fensel, M. Zickwolff, and M. Wiese. Are substitutions the better examples? Learning complete sets of clauses with Frog. In Proceedings of the Fifth International Workshop on Inductive Logic Programming, pages 453–474. Department of Computer Science, Katholieke Universiteit Leuven, 1995.
P. Flach. Knowledge representation for inductive learning. In Proceedings of the European Conference on Symbolic and Quantitative Approaches to Reasoning and Uncertainty, pages 160–167. Springer, Berlin, 1999.
P. Flach, C. Giraud-Carrier, and J.W. Lloyd. Strongly typed inductive concept learning. In Proceedings of the Eighth International Conference on Inductive Logic Programming, pages 185–194. Springer, Berlin, 1998.
P. Flach and N. Lachiche. 1BC: A first-order Bayesian classifier. In Proceedings of the Ninth International Workshop on Inductive Logic Programming, pages 92–103. Springer, Berlin, 1999.
P. Flach and N. Lachiche. Confirmation-guided discovery of first-order rules with Tertius. Machine Learning, 42(1–2): 61–95, 2001.
P. Geibel and F. Wysotzki. Relational learning with decision trees. In Proceedings of the Twelfth European Conference on Artificial Intelligence, pages 428–432. IOS Press, Amsterdam, 1996.
G. Klopman. Artificial intelligence approach to structure-activity studies: computer automated structure evaluation of biological activity of organic molecules. Journal of the American Chemical Society, 106:7315–7321, 1984.
G. Klopman. MultiCASE: A hierarchical computer automated structure evaluation program. Quantitative Structure Activity Relationships, 11:176–184, 1992.
W. Klösgen. EXPLORA: A multipattern and multistrategy discovery assistant. In U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, editors, Advances in Knowledge Discovery and Data Mining, pages 249–271. AAAI Press, Menlo Park, CA, 1996.
R. Kohavi, D. Sommerfield, and J. Dougherty. Data mining using MLC++: A machine learning library in C++. In Proceedings of the Eighth IEEE International Conference on Tools for Artificial Intelligence, pages 234–245. IEEE Computer Society Press, Los Alamitos, CA, 1996. http://www.sgi.com/Technology/mlc.
D. Koller and M. Sahami. Toward optimal feature selection. In Proceedings of the Thirteenth International Conference on Machine Learning, pages 284–292. Morgan Kaufmann, San Francisco, CA, 1996.
S. Kramer. Structural regression trees. In Proceedings of the Thirteenth National Conference on Artificial Intelligence, pages 812–819. AAAI Press, Menlo Park, CA, 1996.
S. Kramer and E. Frank. Bottom-Up propositionalization. In Proceedings of the ILP-2000 Work-in-Progress Track, pages 156–162. Imperial College, London, 2000.
S. Kramer, B. Pfahringer, and C. Helma. Stochastic propositionalization of non-determinate background knowledge. In Proceedings of the Eighth International Conference on Inductive Logic Programming, pages 80–94. Springer, Berlin, 1998.
N. Lavrač and S. Džeroski. Inductive Logic Programming: Techniques and Applications. Ellis Horwood, Chichester, 1994. Freely available at http://www-ai.ijs.si/SasoDzeroski/ILPBook/.
N. Lavrač, S. Džeroski, and M. Grobelnik. Learning nonrecursive definitions of relations with LINUS. In Proceedings of the Fifth European Working Session on Learning, pages 265–281. Springer-Verlag, Berlin, 1991.
N. Lavrač, D. Gamberger, and P. Turney. A relevancy filter for constructive induction. IEEE Intelligent Systems, 13:50–56, 1998.
H. Mannila and H. Toivonen. Levelwise search and borders of theories in knowledge discovery. Data Mining and Knowledge Discovery, 1:241–258, 1997.
D. Michie, S. Muggleton, D. Page, and A. Srinivasan. To the international computing community: A new East-West challenge. Technical report, Oxford University Computing Laboratory, Oxford, UK, 1994.
F. Mizoguchi, H. Ohwada, M. Daidoji, and S. Shirato. Learning rules that classify ocular fundus images for glaucoma diagnosis. In Proceedings of the Sixth International Workshop on Inductive Logic Programming, pages 146–162. Springer-Verlag, Berlin, 1996.
I. Mozetič. NEWGEM: Program for learning from examples, technical documentation and user’s guide. Reports of Intelligent Systems Group UIUCDCS-F-85-949, Department of Computer Science, University of Illinois, Urbana Champaign, IL, 1985.
S. Muggleton. Inverse entailment and Progol. New Generation Computing, 13:245–286, 1995.
S. Muggleton and C. Feng. Efficient induction of logic programs. In S. Muggleton, editor, Inductive Logic Programming, pages 281–298. Academic Press, London, 1992.
S. Muggleton, R.D. King, and M.J.E. Sternberg. Protein secondary structure prediction using logic. In Proceedings of the Second International Workshop on Inductive Logic Programming, pages 228–259. TM-1182, ICOT, Tokyo, 1992.
S. Muggleton, A. Srinivasan, R. King, and M. Sternberg. Biochemical knowledge discovery using Inductive Logic Programming. In Proceedings of the First Conference on Discovery Science, pages 326–341. Springer, Berlin, 1998.
A.L. Oliveira and A. Sangiovanni-Vincentelli. Constructive induction using a non-greedy strategy for feature selection. In Proceedings of the Ninth International Workshop on Machine Learning, pages 354–360. Morgan Kaufmann, San Francisco, CA, 1992.
G. Pagallo and D. Haussler. Boolean feature discovery in empirical learning. Machine Learning, 5:71–99, 1990.
J. R. Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo, CA, 1993.
B.L. Richards and R.J. Mooney. Learning relations by pathfinding. In Proceedings of the Tenth National Conference on Artificial Intelligence, pages 50–55. AAAI Press, Menlo Park, CA, 1992.
M. Sebag and C. Rouveirol. Tractable induction and classification in first order logic via stochastic matching. In Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence, pages 888–893. Morgan Kaufmann, San Francisco, CA, 1997.
A. Srinivasan and R. King. Feature construction with inductive logic programming: a study of quantitative predictions of biological activity aided by structural attributes. Data Mining and Knowledge Discovery, 3(1):37–57, 1999.
A. Srinivasan, R. King and D.W. Bristol. An assessment of submissions made to the Predictive Toxicology Evaluation Challenge. In Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence, pages 270–275. Morgan Kaufmann, San Francisco, CA, 1999.
A. Srinivasan, S. Muggleton, R.D. King and M. Sternberg. Theories for mutagenicity: a study of first-order and feature based induction. Artificial Intelligence, 85(1–2):277–299, 1996.
I. Stahl. Predicate invention in inductive logic programming. In L. De Raedt, editor, Advances in Inductive Logic Programming, pages 34–47. IOS Press, Amsterdam, 1996.
P. Turney. Low size-complexity inductive logic programming: The East-West challenge considered as a problem in cost-sensitive classification. In L. De Raedt, editor, Advances in Inductive Logic Programming, pages 308–321. IOS Press, Amsterdam, 1996.
V. Vapnik. Estimation of Dependencies Based on Empirical Data. Springer Verlag, Berlin, 1982.
V. Vapnik. The Nature of Statistical Learning Theory. Springer Verlag, Berlin, 1995.
J. Wnek and R.S. Michalski. Hypothesis-driven constructive induction in AQ17: A method and experiments. In Proceedings of IJCAI-91 Workshop on Evaluating and Changing Representations in Machine Learning, pages 13–22. Sydney, Australia, 1991.
S. Wrobel. An algorithm for multi-relational discovery of subgroups. In Proceedings of the First European Symposium on Principles of Data Mining and Knowledge Discovery, pages 78–87. Springer, Berlin, 1997.
J-D. Zucker and J-G. Ganascia. Representation changes for efficient learning in structural domains. In Proceedings of the Thirteenth International Conference on Machine Learning, pages 543–551. Morgan Kaufmann, San Francisco, CA, 1996.
J-D. Zucker and J-G. Ganascia. Learning structurally indeterminate clauses. In Proceedings of the Eighth International Conference on Inductive Logic Programming, pages 235–244. Springer, Berlin, 1998.
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Kramer, S., Lavrač, N., Flach, P. (2001). Propositionalization Approaches to Relational Data Mining. In: Džeroski, S., Lavrač, N. (eds) Relational Data Mining. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-04599-2_11
DOI: https://doi.org/10.1007/978-3-662-04599-2_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-07604-6
Online ISBN: 978-3-662-04599-2
eBook Packages: Springer Book Archive