Capturing topology in graph pattern matching

Published: 01 December 2011 Publication History

Abstract

Graph pattern matching is often defined in terms of subgraph isomorphism, an NP-complete problem. To lower its complexity, various extensions of graph simulation have been considered instead. These extensions allow pattern matching to be conducted in cubic time. However, they fall short of capturing the topology of data graphs: matched graphs may have a structure drastically different from the pattern graphs they match, and the matches found are often too large to understand and analyze. To rectify these problems, this paper proposes a notion of strong simulation, a revision of graph simulation, for graph pattern matching. (1) We identify a set of criteria for preserving the topology of graphs matched. We show that strong simulation preserves the topology of data graphs and finds a bounded number of matches. (2) We show that strong simulation retains the same complexity as earlier extensions of simulation, by providing a cubic-time algorithm for computing strong simulation. (3) We present the locality property of strong simulation, which allows us to effectively conduct pattern matching on distributed graphs. (4) We experimentally verify the effectiveness and efficiency of these algorithms, using real-life data and synthetic data.
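
As a rough illustration of the baseline the abstract starts from, the sketch below computes plain graph simulation (not the paper's strong simulation) with a naive fixpoint iteration. It assumes the standard definition: a data node matches a pattern node if the labels agree and every pattern edge leaving that pattern node can be followed to a matching data node. The function name `graph_simulation` and the edge-list and label-dictionary inputs are hypothetical choices made for this example, and the loop is a simple polynomial-time refinement rather than an optimized algorithm.

```python
from collections import defaultdict


def graph_simulation(pattern_edges, pattern_labels, data_edges, data_labels):
    """Naive fixpoint for plain graph simulation (illustrative sketch only).

    Returns {pattern_node: set_of_matching_data_nodes}, or an empty dict
    if some pattern node ends up with no matching data node.
    """
    # Successor sets of the data graph.
    succ = defaultdict(set)
    for u, v in data_edges:
        succ[u].add(v)

    # Start from every label-compatible candidate.
    sim = {
        q: {v for v, label in data_labels.items() if label == q_label}
        for q, q_label in pattern_labels.items()
    }

    # Repeatedly drop candidates that cannot follow some pattern edge.
    changed = True
    while changed:
        changed = False
        for q, q_child in pattern_edges:
            keep = {v for v in sim[q] if succ[v] & sim[q_child]}
            if keep != sim[q]:
                sim[q] = keep
                changed = True

    if any(not candidates for candidates in sim.values()):
        return {}
    return sim


# Pattern: one edge from a node labelled "A" to a node labelled "B".
pattern_edges = [("A", "B")]
pattern_labels = {"A": "A", "B": "B"}

# Data graph: a1 -> b1, plus two isolated nodes a2 and b2.
data_edges = [("a1", "b1")]
data_labels = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}

result = graph_simulation(pattern_edges, pattern_labels, data_edges, data_labels)
```

In this toy input, `a2` is dropped because it has no edge to a "B" node, while the isolated node `b2` stays in the result because plain simulation only checks outgoing pattern edges. This is one small way a simulation match can differ structurally from the pattern, which is the kind of looseness the abstract sets out to address.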

Information & Contributors

Information

Published In

Proceedings of the VLDB Endowment, Volume 5, Issue 4
December 2011
120 pages

Publisher

VLDB Endowment

Publication History

Published: 01 December 2011
Published in PVLDB Volume 5, Issue 4

Qualifiers

  • Research-article

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 4
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 10 Nov 2024

Cited By

  • (2024) Towards efficient simulation-based constrained temporal graph pattern matching. World Wide Web 27:3. DOI: 10.1007/s11280-024-01259-2. Online publication date: 3-Apr-2024
  • (2022) Flexible application-aware approximation for modern distributed graph processing frameworks. Proceedings of the 5th ACM SIGMOD Joint International Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA), pp. 1-10. DOI: 10.1145/3534540.3534693. Online publication date: 12-Jun-2022
  • (2022) A survey of continuous subgraph matching for dynamic graphs. Knowledge and Information Systems 65:3, pp. 945-989. DOI: 10.1007/s10115-022-01753-x. Online publication date: 19-Oct-2022
  • (2021) A Survey on Distributed Graph Pattern Matching in Massive Graphs. ACM Computing Surveys 54:2, pp. 1-35. DOI: 10.1145/3439724. Online publication date: 9-Feb-2021
  • (2021) Efficient parallel edge-centric approach for relaxed graph pattern matching. The Journal of Supercomputing 78:2, pp. 1642-1671. DOI: 10.1007/s11227-021-03938-7. Online publication date: 15-Jun-2021
  • (2020) Simulation-based Approximate Graph Pattern Matching. Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, pp. 2825-2827. DOI: 10.1145/3318464.3384401. Online publication date: 11-Jun-2020
  • (2020) A Twig-Based Algorithm for Top-k Subgraph Matching in Large-Scale Graph Data. Web Information Systems and Applications, pp. 475-487. DOI: 10.1007/978-3-030-60029-7_43. Online publication date: 23-Sep-2020
  • (2019) Discovering Patterns for Fact Checking in Knowledge Graphs. Journal of Data and Information Quality 11:3, pp. 1-27. DOI: 10.1145/3286488. Online publication date: 7-May-2019
  • (2019) Graph simulation on large scale temporal graphs. Geoinformatica 24:1, pp. 199-220. DOI: 10.1007/s10707-019-00381-y. Online publication date: 30-Nov-2019
  • (2019) Querying Knowledge Graphs with Natural Languages. Database and Expert Systems Applications, pp. 30-46. DOI: 10.1007/978-3-030-27618-8_3. Online publication date: 26-Aug-2019
