Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
article

Effective graph classification based on topological and label attributes

Published: 01 August 2012 Publication History

Abstract

Graph classification is an important data mining task, and various graph kernel methods have been proposed recently for this task. These methods have proven to be effective, but they tend to have high computational overhead. In this paper, we propose an alternative approach to graph classification that is based on feature vectors constructed from different global topological attributes, as well as global label features. The main idea is that the graphs from the same class should have similar topological and label attributes. Our method is simple and easy to implement, and via a detailed comparison on real benchmark datasets, we show that our topological and label feature-based approach delivers competitive classification accuracy, with significantly better results on those datasets that have large unlabeled graph instances. Our method is also substantially faster than most other graph kernels. © 2012 Wiley Periodicals, Inc. Statistical Analysis and Data Mining, 2012 © 2012 Wiley Periodicals, Inc.

References

[1]
T. Joachims, T. Hofmann, Y. Yue, and C.-N. Yu, Predicting structured objects with support vector machines, Commun ACM 52(11) (2009), 97–104.
[2]
L. Ralaivola, S. J. Swamidass, H. Saigo, and P. Baldi, Graph kernels for chemical informatics, Neural Netw 18 (2005), 1093–1110.
[3]
P. Mahè and J.-P. Vert, Graph kernels based on tree patterns for molecules, Mach Learn 75(1) (2009), 3–35.
[4]
K. M. Borgwardt, C. S. Ong, S. Schönauer, S. V. N. Vishwanathan, A. J. Smola, and H.-P. Kriegel, Protein function prediction via graph kernels, In ISMB (Supplement of Bioinformatics), 2005, 47–56.
[5]
C. Bilgin, C. Demir, C. Nagi, and B. Yener, Cell-graph mining for breast tissue modeling and classification, In 29th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2007, 5311–5314.
[6]
B. Scholkopf and A. J. Smola, Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, Cambridge, MA, MIT Press, 2001.
[7]
V. N. Vapnik, The Nature of Statistical Learning Theory, New York, Springer-Verlag, 1995.
[8]
T. Gärtner, P. Flach, and S. Wrobel, On graph kernels: hardness results and efficient alternatives, In 16th Annual Conference on Computational Learning Theory, 2003, 129–143.
[9]
H. Kashima, K. Tsuda, and A. Inokuchi, Marginalized kernels between labeled graphs, In International Conference on Machine Learning, 2003, 321–328.
[10]
K. M. Borgwardt and H.-P. Kriegel, Shortest-path kernels on graphs, In 5th IEEE International Conference on Data Mining, Washington, DC, 2005, 74–81.
[11]
T. Horvàth, T. Gärtner, and S. Wrobel, Cyclic pattern kernels for predictive graph mining, In 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2004, 158–167.
[12]
J. Ramon and T. Gärtner, Expressivity versus efficiency of graphs kernels, In 1st International Workshop on Mining Graphs, Trees and Sequences, 2003.
[13]
N. Shervashidze and K. M. Borgwardt, Fast subtree kernels on graphs, In Advances in Neural Information Processing Systems, 2009, 1660–1668.
[14]
N. Shervashidze, S. V. Vishwanathan, T. H. Petri, K. Mehlhorn, and K. M. Borgwardt, Efficient graphlet kernels for large graph comparison, In 12th International Conference on Artificial Intelligence and Statistics, 2009, 488–495.
[15]
I. R. Kondor, N. Shervashidze, and K. M. Borgwardt, The graphlet spectrum, In International Conference on Machine Learning, Vol. 382, A. P. Danyluk, L. Bottou, and M. L. Littman, eds., 2009, 67.
[16]
M. Thoma, H. Cheng, A. Gretton, J. Han, H.-P. Kriegel, A. Smola, L. Song, P. S. Yu, X. Yan, and K. M. Borgwardt, Discriminative frequent subgraph mining with optimality guarantees, Stat Anal Data Mining 3(5) (2010), 302–318.
[17]
S. V. N. Vishwanathan, K. M. Borgwardt, and N. N. Schraudolph, Fast computation of graph kernels, In Neural Information Processing Systems, B. Schölkopf, J. Platt, and T. Hoffman, eds., Cambridge, MA, MIT Press, 2006, 1449–1456.
[18]
I. R. Kondor and J. Lafferty, Diffusion kernels on graphs and other discrete structures, In International Conference on Machine Learning, 2002, 315–322.
[19]
G. Li, M. Semerci, B. Yener, and M. J. Zaki, Graph classification via topological and label attributes, In 9th Workshop on Mining and Learning with Graphs (with SIGKDD), 2011.
[20]
P. Mahè, N. Ueda, T. Akutsu, J.-L. Perret, and J.-P. Vert, Extensions of marginalized graph kernels, In International Conference on Machine Learning, 2004.
[21]
S. V. N. Vishwanathan, N. N. Schraudolph, I. R. Kondor, and K. M. Borgwardt, Graph kernels, J Mach Learn Res 11 (2010), 1201–1242.
[22]
X. Yan and J. Han, gspan: Graph-based substructure pattern mining, In IEEE International Conference on Data Mining, 2002.
[23]
C. Meyer, Matrix Analysis and Applied Linear Algebra, Philadelphia, PA, SIAM, 2000.
[24]
Z. Bai, J. Demmel, J. Dongarra, A. Ruhe, and H. Vorst, Templates for the Solution of Algebraic Eigenvalue Problems: A Practical Guide, Philadelphia, SIAM, 2000.
[25]
S. Nijssen and J. Kok, A quickstart in frequent structure mining can make a difference, Proceedings of the 2004 ACM SIGKDD international conference o n Knowledge discovery and data mining, 2004, 647–652.
[26]
E. Jones, T. Oliphant, P. Peterson, et al. SciPy: Open source scientific tools for Python, 2001. Available at .
[27]
A. A. Hagberg, D. A. Schult, and P. J. Swart, Exploring network structure, dynamics, and function using NetworkX, In Proceedings of the 7th Python in Science Conference (SciPy2008), Pasadena, CA,2008, 11–15.
[28]
C.-C. Chang and C.-J. Lin, LIBSVM: a library for support vector machines, ACM Trans Intell Syst Technol 2 (2011), 27:1–27:27. Software available at .
[29]
A. Debnath, R. L. de Compadre, G. Debnath, A. Shusterman, and C. Hansch, The structure-activity relationship of mutagenic aromatic nitro compounds. correlation with molecular orbital energies and hydrophobicity, J Med Chem 34 (1991), 786–797.
[30]
N. Wale and G. Karypis, Comparison of descriptor spaces for chemical compound retrieval and classification, In IEEE International Conference on Data Mining, 2006.
[31]
P. Dobson and A. J. Doig, Distinguishing enzyme structures from non-enzymes without alignments, J Mol Biol 330(4) (2003), 771–783.
[32]
C. C. Bilgin, P. Bullough, G. E. Plopper, and B. Yener, Ecm-aware cell-graph mining for bone tissue modeling and classification, Data Mining Knowl Discov 20 (2010), 416–438.
[33]
C. Demir, S. H. Gultekin, and B. Yener, Learning the topological properties of brain tumors, IEEE/ACM Trans Comput Biol Bioinform 2(3) (2005), 262–270.
[34]
I. Guyon, J. Weston, S. Barnhill, and V. Vapnik, Gene selection for cancer classification using support vector machines, Mach Learn 46(1) (2002), 389–422.
[35]
L. Lovász and K. Vesztergombi, Geometric representations of graphs, Paul Erdos and his Mathematics, 1999.
[36]
U. Brandes and D. Fleischer, Centrality measures based on current flow, STACS 2005, 2005, 533–544.
[37]
M. Newman, Networks: An Introduction, New York, NY, Oxford University Press, 2010.

Cited By

View all

Recommendations

Comments

Information & Contributors

Information

Published In

cover image Statistical Analysis and Data Mining
Statistical Analysis and Data Mining  Volume 5, Issue 4
August 2012
88 pages
ISSN:1932-1864
EISSN:1932-1872
Issue’s Table of Contents

Publisher

John Wiley & Sons, Inc.

United States

Publication History

Published: 01 August 2012

Author Tags

  1. graph classification
  2. graph kernels
  3. graph mining
  4. topological attributes

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 09 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Novel Initialization Functions for Metaheuristic-Based Online Virtual Network EmbeddingJournal of Network and Systems Management10.1007/s10922-024-09822-y32:3Online publication date: 30-Apr-2024
  • (2023)Multi-proximity based embedding scheme for learning vector quantization-based classification of biochemical structured dataNeurocomputing10.1016/j.neucom.2023.126632554:COnline publication date: 14-Oct-2023
  • (2022)User-driven diverse scenario exploration in model findersScience of Computer Programming10.1016/j.scico.2021.102745215:COnline publication date: 1-Mar-2022
  • (2022)Bipartite graph capsule networkWorld Wide Web10.1007/s11280-022-01009-226:1(421-440)Online publication date: 14-Feb-2022
  • (2021)Adversarial Separation Network for Cross-Network Node ClassificationProceedings of the 30th ACM International Conference on Information & Knowledge Management10.1145/3459637.3482228(2618-2626)Online publication date: 26-Oct-2021
  • (2021)Graph Embedding Using Frequency FilteringIEEE Transactions on Pattern Analysis and Machine Intelligence10.1109/TPAMI.2019.292951943:2(473-484)Online publication date: 9-Jan-2021
  • (2020)Graph Similarity and its Applications to Hardware SecurityIEEE Transactions on Computers10.1109/TC.2019.295375269:4(505-519)Online publication date: 1-Apr-2020
  • (2020)Diverse Scenario Exploration in Model Finders Using Graph Kernels and ClusteringRigorous State-Based Methods10.1007/978-3-030-48077-6_3(27-43)Online publication date: 27-May-2020
  • (2019)Mining data provenance to detect advanced persistent threatsProceedings of the 11th USENIX Conference on Theory and Practice of Provenance10.5555/3359032.3359040(6-6)Online publication date: 3-Jun-2019
  • (2018)Separating Terrorist-Like Topological Signatures Embedded in Benign NetworksMILCOM 2018 - 2018 IEEE Military Communications Conference (MILCOM)10.1109/MILCOM.2018.8599854(1-9)Online publication date: 29-Oct-2018
  • Show More Cited By

View Options

View options

Get Access

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media