research-article

High-performance complex event processing over hierarchical data

Published: 04 December 2013

Abstract

While Complex Event Processing (CEP) constitutes a considerable portion of so-called Big Data analytics, current CEP systems can only process data with a simple structure, and are otherwise limited in their ability to efficiently support complex continuous queries on structured or semistructured information. However, XML-like streams are a very popular form of data exchange, comprising large portions of social network and RSS feeds, financial feeds, configuration files, and similar applications requiring advanced CEP queries. In this article, we present the XSeq language and system, which support CEP on XML streams via an extension of XPath that is both powerful and amenable to an efficient implementation. Specifically, the XSeq language extends XPath with natural operators to express sequential and Kleene-* patterns over XML streams, while remaining highly amenable to efficient execution. In fact, XSeq is designed to take full advantage of the recently proposed Visibly Pushdown Automata (VPA), where higher expressive power can be achieved without compromising the computationally attractive properties of finite state automata. Besides the efficiency and expressivity benefits, the choice of VPA as the underlying model also enables XSeq to go beyond XML streams and be easily applicable to any data with both sequential and hierarchical structures, including JSON messages, RNA sequences, and software traces. Therefore, we illustrate XSeq's power for CEP applications through examples from different domains and provide formal results on its expressiveness and complexity. Finally, we present several optimization techniques for XSeq queries. Our extensive experiments indicate that XSeq brings outstanding performance to CEP applications: an improvement of two orders of magnitude is obtained over the same queries executed in general-purpose XML engines.
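
To make the role of visibly pushdown automata more concrete, the following is a minimal Python sketch of the general idea: the kind of each input symbol alone decides the stack action (an opening tag pushes, a closing tag pops, text leaves the stack untouched). It is not XSeq syntax and not the authors' implementation. The event encoding and the function names `accepts_well_nested` and `longest_sibling_run` are assumptions made up for this illustration. The second function hints at how a Kleene-style run of sibling elements can be tracked with the same stack discipline.

```python
from typing import Iterable, List, Tuple

# Hypothetical event encoding: (kind, label), kind in {"open", "close", "text"}.
Event = Tuple[str, str]


def accepts_well_nested(events: Iterable[Event]) -> bool:
    """Return True if the stream is a well-matched nested word."""
    stack: List[str] = []
    for kind, label in events:
        if kind == "open":        # call symbol: always push
            stack.append(label)
        elif kind == "close":     # return symbol: always pop and compare
            if not stack or stack.pop() != label:
                return False
        elif kind != "text":      # internal symbols leave the stack untouched
            raise ValueError(f"unknown event kind: {kind}")
    return not stack              # accept only if every element was closed


def longest_sibling_run(events: Iterable[Event], parent: str, child: str) -> int:
    """Longest run of consecutive `child` elements directly under a `parent`.

    Assumes the input is well nested (see accepts_well_nested).
    """
    stack: List[list] = []        # each frame: [label, current run length]
    best = 0
    for kind, label in events:
        if kind == "open":
            if stack:
                frame = stack[-1]
                if frame[0] == parent and label == child:
                    frame[1] += 1             # the run continues
                    best = max(best, frame[1])
                else:
                    frame[1] = 0              # a different sibling ends the run
            stack.append([label, 0])
        elif kind == "close":
            stack.pop()
    return best


feed = [
    ("open", "feed"),
    ("open", "quote"), ("text", "10.0"), ("close", "quote"),
    ("open", "quote"), ("text", "10.5"), ("close", "quote"),
    ("open", "note"), ("close", "note"),
    ("open", "quote"), ("text", "9.8"), ("close", "quote"),
    ("close", "feed"),
]
print(accepts_well_nested(feed))
print(longest_sibling_run(feed, "feed", "quote"))
```

Both functions make a single pass over the stream and keep only a stack whose height is bounded by the nesting depth, which is the property that makes this style of automaton a natural fit for streaming hierarchical data.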

Published In

ACM Transactions on Database Systems  Volume 38, Issue 4
Invited papers issue
November 2013
294 pages
ISSN:0362-5915
EISSN:1557-4644
DOI:10.1145/2539032

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 04 December 2013
Accepted: 01 February 2013
Received: 01 October 2012
Published in TODS Volume 38, Issue 4

Author Tags

  1. Complex event processing
  2. JSON
  3. XML
  4. big data analytics
  5. visibly pushdown automata

Qualifiers

  • Research-article
  • Research
  • Refereed

Cited By

  • (2021) Review on Data Science and Prediction. 2021 2nd International Conference on Computing and Data Science (CDS), 548-555. DOI: 10.1109/CDS52072.2021.00100. Online publication date: Jan-2021
  • (2019) A query language for semantic complex event processing. Semantic Web 10:1, 53-93. DOI: 10.3233/SW-180313. Online publication date: 1-Jan-2019
  • (2019) Big data stream analysis: a systematic literature review. Journal of Big Data 6:1. DOI: 10.1186/s40537-019-0210-7. Online publication date: 6-Jun-2019
  • (2019) Event modeling and mining: a long journey toward explainable events. The VLDB Journal — The International Journal on Very Large Data Bases 29:1, 459-482. DOI: 10.1007/s00778-019-00545-0. Online publication date: 1-Jul-2019
  • (2018) Parallel complex event detection based on regular tree pattern matching. Proceedings of the 1st International Conference on Big Data Technologies, 34-39. DOI: 10.1145/3226116.3226128. Online publication date: 18-May-2018
  • (2018) Tractable queries on big data via preprocessing with logarithmic-size output. Knowledge and Information Systems 56:1, 141-163. DOI: 10.1007/s10115-017-1092-7. Online publication date: 1-Jul-2018
  • (2017) Complex event recognition in the big data era. Proceedings of the VLDB Endowment 10:12, 1996-1999. DOI: 10.14778/3137765.3137829. Online publication date: 1-Aug-2017
  • (2017) Fusing effectful comprehensions. ACM SIGPLAN Notices 52:6, 17-32. DOI: 10.1145/3140587.3062362. Online publication date: 14-Jun-2017
  • (2017) Complex Event Recognition Languages. Proceedings of the 11th ACM International Conference on Distributed and Event-based Systems, 7-10. DOI: 10.1145/3093742.3095106. Online publication date: 8-Jun-2017
  • (2017) Fusing effectful comprehensions. Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation, 17-32. DOI: 10.1145/3062341.3062362. Online publication date: 14-Jun-2017