DOI: 10.1145/3452021.3458320
research-article

Stackless Processing of Streamed Trees

Published: 20 June 2021
    Abstract

    Processing tree-structured data in the streaming model is challenging: capturing regular properties of streamed trees with a stack is costly in memory, while falling back to finite-state automata drastically limits computational power. We propose an intermediate stackless model based on register automata equipped with a single counter that maintains the current depth in the tree. We explore the power of this model to validate and query streamed trees. Our main result is an effective characterization of the regular path queries (RPQs) that can be evaluated stacklessly, both with and without registers. In particular, we confirm the characterization conjectured by Segoufin and Vianu (2002) of the tree languages defined by DTDs that are recognizable without registers, in the special case of tree languages defined by means of an RPQ.
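
    To make the idea of stackless evaluation concrete, the following is a minimal illustrative sketch, not the construction from the paper. It assumes the tree arrives as a stream of opening and closing tag events, and it evaluates one hypothetical descendant query (select every `b` node that has an `a` ancestor) using only a depth counter and a single register holding a depth value. The function name `select_b_under_a`, the event encoding, and the choice of query are all assumptions made for illustration.

```python
from typing import Iterable, Iterator, Optional, Tuple

# Hypothetical event encoding: ("open", label) or ("close", label).
Event = Tuple[str, str]


def select_b_under_a(events: Iterable[Event]) -> Iterator[int]:
    """Yield document-order indices of 'b' nodes that have an 'a' ancestor.

    Memory use is one counter (current depth) and one register (the depth
    of the outermost currently open 'a'), independent of the tree depth.
    """
    depth = 0                    # the single counter: current depth
    reg: Optional[int] = None    # the register: depth of the outermost open 'a'
    position = 0                 # index of the next opening tag in document order

    for kind, label in events:
        if kind == "open":
            depth += 1
            # An 'a' is open strictly above this node exactly when reg is set.
            if label == "b" and reg is not None:
                yield position
            # Only remember the outermost 'a'; nested ones add no information.
            if label == "a" and reg is None:
                reg = depth
            position += 1
        else:
            # Closing at the stored depth means the remembered 'a' has ended.
            if reg == depth:
                reg = None
            depth -= 1


# Stream for the tree r(b, a(c(b), b), b); nodes are numbered 0..6.
example = [
    ("open", "r"),
    ("open", "b"), ("close", "b"),
    ("open", "a"),
    ("open", "c"), ("open", "b"), ("close", "b"), ("close", "c"),
    ("open", "b"), ("close", "b"),
    ("close", "a"),
    ("open", "b"), ("close", "b"),
    ("close", "r"),
]
print(list(select_b_under_a(example)))
```

    The closing-tag label is never inspected here, so the same sketch applies whether or not closing tags carry labels. A stack-based evaluator would instead store one entry per open ancestor; this particular query needs only the depth of the outermost open `a`, which is why a counter plus one register suffices.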

    References

    [1]
    Vince Bárány, Christof Löding, and Olivier Serre. Regularity problems for visibly pushdown languages. In Proc. STACS 2006, pages 420--431. Springer, 2006.
    [2]
    David A. Mix Barrington and James C. Corbett. On the relative complexity of some languages in NC$^1$. Inf. Process. Lett., 32(5):251--256, 1989.
    [3]
    Angela Bonifati, George H. L. Fletcher, Hannes Voigt, and Nikolay Yakovets. Querying Graphs. Morgan & Claypool Publishers, 2018.
    [4]
    Robert D. Cameron, Ehsan Amiri, Kenneth S. Herdy, Dan Lin, Thomas C. Shermer, and Fred Popowich. Parallel scanning with bitstream addition: An XML case study. In Proc. Euro-Par 2011, pages 2--13. Springer, 2011.
    [5]
    Cristiana Chitic and Daniela Rosu. On validation of XML streams using finite state machines. In Proc. WebDB 2004, pages 85--90. ACM, 2004.
    [6]
    Denis Debarbieux, Olivier Gauwin, Joachim Niehren, Tom Sebastian, and Mohamed Zergaoui. Early nested word automata for XPath query answering on XML streams. Theor. Comput. Sci., 578:100--125, 2015.
    [7]
    Patrick Dymond. Input-driven Languages Are in Log N Depth. Inf. Process. Lett., 26(5):247--250, January 1988.
    [8]
    Evangelos Georganas, Sasikanth Avancha, Kunal Banerjee, Dhiraj D. Kalamkar, Greg Henry, Hans Pabst, and Alexander Heinecke. Anatomy of high-performance deep learning convolutions on SIMD architectures. In Proc. SC 2018, pages 66:1--66:12. IEEE / ACM, 2018.
    [9]
    Alejandro Grez, Cristian Riveros, and Martín Ugarte. A formal framework for complex event processing. In Proc. ICDT 2019, pages 5:1--5:18. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2019.
    [10]
    Sascha Grunert and Daniel Schmidt. A comparison of regex engines, 2017. https://rust-leipzig.github.io/regex/2017/03/28/comparison-of-regex-engines/.
    [11]
    Ashish Kumar Gupta and Dan Suciu. Stream processing of XPath queries with predicates. In Proc. SIGMOD 2003, pages 419--430. ACM, 2003.
    [12]
    Yeye He, Siddharth Barman, and Jeffrey F. Naughton. On load shedding in complex event processing. In Proc. ICDT 2014, pages 213--224. OpenProceedings.org, 2014.
    [13]
    Eryk Kopczynski. Invisible pushdown languages. In Proc. LICS 2016, pages 867--872. ACM, 2016.
    [14]
    Geoff Langdale and Daniel Lemire. Parsing gigabytes of JSON per second. VLDB J., 28(6):941--960, 2019.
    [15]
    Filip Murlak, Charles Paperman, and Michal Pilipczuk. Schema validation via streaming circuits. In Proc. PODS 2016, pages 237--249. ACM, 2016.
    [16]
    Damian Niwinski and Igor Walukiewicz. A gap property of deterministic tree languages. Theor. Comput. Sci., 303(1):215--231, 2003.
    [17]
    Dan Olteanu. SPEX: streamed and progressive evaluation of XPath. IEEE Trans. Knowl. Data Eng., 19(7):934--949, 2007.
    [18]
    Shoumik Palkar, Firas Abuzaid, Peter Bailis, and Matei Zaharia. Filter before you parse: Faster analytics on raw data with sparser. Proc. VLDB Endow., 11(11):1576--1589, 2018.
    [19]
    Jean-Eric Pin. Propriétés syntactiques du produit non ambigu. In Proc. ICALP 1980, pages 483--499. Springer, 1980.
    [20]
    Jean-Eric Pin. On reversible automata. In Proc. LATIN 1992, pages 401--416. Springer, 1992.
    [21]
    Jean Eric Pin and Raymond E. Miller. Varieties Of Formal Languages. Plenum Publishing Co., 1986.
    [22]
    Orestis Polychroniou, Arun Raghavan, and Kenneth A. Ross. Rethinking SIMD vectorization for in-memory databases. In Proc. SIGMOD 2015, pages 1493--1508. ACM, 2015.
    [23]
    Gang Ren, Peng Wu, and David A. Padua. An empirical study on the vectorization of multimedia applications for multimedia extensions. In Proc. IPDPS 2005. IEEE, 2005.
    [24]
    Luc Segoufin and Cristina Sirangelo. Constant-memory validation of streaming XML documents against DTDs. In Proc. ICDT 2007, pages 299--313. Springer, 2007.
    [25]
    Luc Segoufin and Victor Vianu. Validating streaming XML documents. In Proc. PODS 2002, pages 53--64. ACM, 2002.
    [26]
    Dan Suciu. From searching text to querying XML streams. J. Discrete Algorithms, 2(1):17--32, 2004.
    [27]
    Vincent Vanhoucke, Andrew Senior, and Mark Z. Mao. Improving the speed of neural networks on CPUs, 2011. Deep Learning and Unsupervised Feature Learning Workshop @ NIPS 2011.
    [28]
    Burchard von Braunmühl and Rutger Verbeek. Input-driven languages are recognized in log n space. In Proc. FCT 1983, pages 40--51. Springer, 1983.
    [29]
    Xiang Wang, Yang Hong, Harry Chang, KyoungSoo Park, Geoff Langdale, Jiayu Hu, and Heqing Zhu. Hyperscan: A fast multi-pattern regex matcher for modern CPUs. In Proc. NSDI 2019, pages 631--648. USENIX Association, 2019.
    [30]
    Haopeng Zhang, Yanlei Diao, and Neil Immerman. On complexity and optimization of expensive queries in complex event processing. In Proc. SIGMOD 2014, pages 217--228. ACM, 2014.
    [31]
    Yichun Zhang. Regex engine matching speed benchmark, 2015. http://openresty.org/misc/re/bench/.
    [32]
    Jingren Zhou and Kenneth A. Ross. Implementing database operations using SIMD instructions. In Proc. SIGMOD 2002, pages 145--156. ACM, 2002.


    Published In

    PODS'21: Proceedings of the 40th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems
    June 2021
    440 pages
    ISBN:9781450383813
    DOI:10.1145/3452021

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. automata
    2. json
    3. querying
    4. streaming
    5. weak validation
    6. xml

    Qualifiers

    • Research-article

    Conference

    SIGMOD/PODS '21
    Acceptance Rates

    Overall Acceptance Rate 642 of 2,707 submissions, 24%
