Abstract
Graph-Based Induction (GBI) is a machine learning technique for extracting typical patterns from graph-structured data by stepwise pair expansion (pair-wise chunking). GBI is very efficient because of its greedy search strategy; however, it suffers from the problem of overlapping subgraphs. As a result, some typical patterns cannot be discovered by GBI, even though a beam search has been incorporated in an improved version called Beam-wise GBI (B-GBI). In this paper, the search capability is improved by a new search strategy in which frequent pairs are never chunked but are used as pseudo-nodes in subsequent steps, thus allowing overlapping subgraphs to be extracted. The new algorithm, called Cl-GBI (Chunkingless GBI), was tested on two datasets, the promoter dataset from the UCI repository and the hepatitis dataset provided by Chiba University, and was shown to extract more typical patterns than B-GBI.
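The overlap problem described above can be illustrated with a toy sketch. It is not the authors' implementation: the function names, the tiny path graph, and the use of sorted node labels as a stand-in for pattern identity (instead of a proper isomorphism check) are all assumptions made for illustration. The sketch contrasts counting pair occurrences when each match is chunked, so that its nodes are consumed, with counting when pairs are only recorded as pseudo-nodes and the graph is left intact.

```python
from collections import Counter

# Hypothetical toy graph: a path a - b - a - b (node id -> label).
labels = {0: "a", 1: "b", 2: "a", 3: "b"}
edges = [(0, 1), (1, 2), (2, 3)]


def pair_pattern(u, v):
    """Crude pattern identity: the sorted labels of the two nodes (an assumption)."""
    return tuple(sorted((labels[u], labels[v])))


def count_with_chunking():
    """Greedy chunking: a node absorbed into one chunk cannot be matched again."""
    used, counts = set(), Counter()
    for u, v in edges:
        if u in used or v in used:
            continue  # overlapping occurrence is lost
        used.update((u, v))
        counts[pair_pattern(u, v)] += 1
    return counts


def count_chunkingless():
    """No chunking: every occurrence is counted; the graph is never rewritten."""
    return Counter(pair_pattern(u, v) for u, v in edges)


def pseudo_node_instances(pattern):
    """Occurrences of a pair pattern, kept as node sets (pseudo-nodes)."""
    return [frozenset((u, v)) for u, v in edges if pair_pattern(u, v) == pattern]


def expand_pseudo_nodes(instances):
    """Pair each pseudo-node with one adjacent node to form larger candidates."""
    grown = set()
    for inst in instances:
        for u, v in edges:
            if (u in inst) != (v in inst):  # edge leaves the pseudo-node
                grown.add(inst | {u, v})
    # Sorted label tuples again stand in for a real canonical form.
    return Counter(tuple(sorted(labels[n] for n in g)) for g in grown)


chunked = count_with_chunking()
chunkingless = count_chunkingless()
larger = expand_pseudo_nodes(pseudo_node_instances(("a", "b")))
```

On this graph the chunking variant counts two occurrences of the pair (a, b), while the chunkingless variant counts all three, including the one that overlaps its neighbours. Expanding the retained pseudo-nodes then yields the overlapping three-node candidates a-b-a and b-a-b.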
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nguyen, P.C., Ohara, K., Motoda, H., Washio, T. (2005). Cl-GBI: A Novel Approach for Extracting Typical Patterns from Graph-Structured Data. In: Ho, T.B., Cheung, D., Liu, H. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2005. Lecture Notes in Computer Science, vol 3518. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11430919_74
DOI: https://doi.org/10.1007/11430919_74
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26076-9
Online ISBN: 978-3-540-31935-1
eBook Packages: Computer Science, Computer Science (R0)