DOI: 10.1145/1458082.1458124
research-article

On effective presentation of graph patterns: a structural representative approach

Published: 26 October 2008

Abstract

Many fast algorithms have been developed to mine frequent patterns over graph data, covering a large spectrum of problem variants. However, the real bottleneck for knowledge discovery on graphs is neither efficiency nor scalability, but the usability of the mined patterns. Currently, state-of-the-art techniques produce a lengthy list of exact patterns, which is undesirable in two respects: (1) on the micro side, because of inherent noise and data diversity, exact patterns are often of limited use in real applications; and (2) on the macro side, the rigid structural requirement often generates an excessive number of patterns that differ only slightly from each other, which easily overwhelms users.
In this paper, we study the presentation problem of graph patterns, where structural representatives are the key mechanism that makes the whole strategy effective. To fill the usability gap, we adopt a two-step smoothing-clustering framework: the first step adds error tolerance to individual patterns (the micro side), and the second step reduces output cardinality by collapsing multiple structurally similar patterns into one representative (the macro side). This integrative approach has not been tried in previous studies; it essentially rolls up our attention to a more appropriate level that no longer looks into every minute detail. The framework is general: it may apply under various settings and can incorporate many extensions. Empirical studies indicate that a compact group of informative delegates can be obtained on real datasets and that the proposed algorithms are both efficient and scalable.
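
The abstract does not spell out the algorithms, so the following is only a minimal sketch of the two-step idea under simplifying assumptions that are not taken from the paper: vertices are uniquely labeled (so containment is a plain subset test rather than subgraph isomorphism), smoothing tolerates a fixed number of missing edges, pattern distance is the size of the symmetric difference of edge sets, and representatives are chosen by a simple greedy cover. All function names and parameters (`smoothed_support`, `pick_representatives`, `tolerance`, `radius`) are hypothetical.

```python
def graph(*pairs):
    """Build an undirected graph as a frozenset of canonical edges.

    Assumption of this sketch: vertex labels are unique, so an edge is
    fully identified by its sorted pair of labels.
    """
    return frozenset(tuple(sorted((u, v))) for u, v in pairs)


def smoothed_support(pattern, database, tolerance):
    """Step 1 (smoothing): count graphs that contain the pattern while
    missing at most `tolerance` of its edges."""
    return sum(1 for g in database if len(pattern - g) <= tolerance)


def distance(p, q):
    """Number of edges by which two patterns differ."""
    return len(p ^ q)


def pick_representatives(patterns, radius):
    """Step 2 (clustering): greedy cover. Larger patterns are visited
    first; each pattern joins the first representative within `radius`,
    otherwise it becomes a new representative."""
    clusters = {}
    for p in sorted(patterns, key=lambda s: (-len(s), sorted(s))):
        for rep in clusters:
            if distance(p, rep) <= radius:
                clusters[rep].append(p)
                break
        else:
            clusters[p] = [p]
    return clusters


# Toy data, invented for illustration only.
database = [
    graph("ab", "bc", "cd"),
    graph("ab", "bc", "cd", "de"),
    graph("ab", "bc"),
    graph("ab", "cd", "de"),
]
p1 = graph("ab", "bc", "cd")
p2 = graph("ab", "bc")
p3 = graph("ab", "bc", "cd", "de")
p4 = graph("xy", "yz")
patterns = [p1, p2, p3, p4]

exact = smoothed_support(p1, database, tolerance=0)
tolerant = smoothed_support(p1, database, tolerance=1)
representatives = pick_representatives(patterns, radius=2)
```

On this toy input, the exact support of `p1` is 2 and rises to 4 once one missing edge is tolerated, and the four patterns collapse to two representatives at `radius=2`. A real implementation would need subgraph matching for general labeled graphs and a principled choice of distance and clustering objective.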

    Published In

    CIKM '08: Proceedings of the 17th ACM conference on Information and knowledge management
    October 2008
    1562 pages
    ISBN:9781595939913
    DOI:10.1145/1458082
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. frequent graph pattern
    2. smoothing-clustering
    3. structural representative

    Conference

    CIKM08: Conference on Information and Knowledge Management
    October 26 - 30, 2008
    Napa Valley, California, USA

    Acceptance Rates

    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%


    Cited By

    • (2021) Frequent Subgraph Mining Algorithms in Static and Temporal Graph-Transaction Settings: A Survey. IEEE Transactions on Big Data, pages 1-1. DOI: 10.1109/TBDATA.2021.3072001. Online publication date: 2021.
    • (2020) Online social network trend discovery using frequent subgraph mining. Social Network Analysis and Mining, 10:1. DOI: 10.1007/s13278-020-00682-3. Online publication date: 11-Aug-2020.
    • (2018) Optimized and Frequent Subgraphs: How Are They Related? IEEE Access, 6, pages 37237-37249. DOI: 10.1109/ACCESS.2018.2846604. Online publication date: 2018.
    • (2016) Summarizing scale-free networks based on virtual and real links. Physica A: Statistical Mechanics and its Applications, 444, pages 360-372. DOI: 10.1016/j.physa.2015.08.048. Online publication date: Feb-2016.
    • (2016) Mining social networks for anomalies. Journal of Network and Computer Applications, 68:C, pages 213-229. DOI: 10.1016/j.jnca.2016.02.021. Online publication date: 1-Jun-2016.
    • (2015) Motif Discovery in Protein 3D‐Structures using Graph Mining Techniques. Pattern Recognition in Computational Molecular Biology, pages 165-189. DOI: 10.1002/9781119078845.ch10. Online publication date: 18-Dec-2015.
    • (2013) A Study of XML Models for Data Mining. Data Mining, pages 1-27. DOI: 10.4018/978-1-4666-2455-9.ch001. Online publication date: 2013.
    • (2013) Frequent subgraph summarization with error control. Proceedings of the 14th international conference on Web-Age Information Management, pages 1-12. DOI: 10.1007/978-3-642-38562-9_1. Online publication date: 14-Jun-2013.
    • (2012) A Study of XML Models for Data Mining. XML Data Mining, pages 1-28. DOI: 10.4018/978-1-61350-356-0.ch001. Online publication date: 2012.
    • (2012) Efficient mining of correlated sequential patterns based on null hypothesis. Proceedings of the 2012 international workshop on Web-scale knowledge representation, retrieval and reasoning, pages 17-24. DOI: 10.1145/2389656.2389660. Online publication date: 29-Oct-2012.
