DOI: 10.1145/3093742.3093921
research-article

Maximizing Determinism in Stream Processing Under Latency Constraints

Published: 08 June 2017

Abstract

Coping with the demands of determinism while meeting latency constraints is challenging in distributed data stream processing systems, which must process high-volume data streams arriving from different, unsynchronized input sources. To process the streaming data deterministically, such systems need mechanisms that synchronize the order in which operators process tuples. At the same time, achieving real-time response requires a careful trade-off between determinism and low-latency performance. We build on ScaleGate, a recently proposed approach to data exchange and synchronization in stream processing that comes with determinism guarantees and an efficient lock-free implementation, enabling high scalability. Considering the challenges and trade-offs implied by real-time constraints, we propose a system that comprises (a) a novel data structure called Slack-ScaleGate (SSG), along with its algorithmic implementation, which guarantees deterministic processing of tuples as long as they can meet their latency constraints; and (b) a method to dynamically tune the maximum amount of time that a tuple can wait in the SSG data structure, relaxing the determinism guarantees when needed in order to satisfy the latency constraints. Our detailed experimental evaluation, using a traffic monitoring application deployed in the city of Dublin, illustrates the operation and benefits of our approach.
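
To make the slack idea from the abstract concrete, here is a minimal single-threaded Python sketch. It is not the paper's SSG algorithm, which is lock-free and concurrent. The class name `SlackMergeBuffer`, its methods, the readiness rule (a tuple is treated as ready once every source has delivered a timestamp at least as large), and the expiry policy are all assumptions made for illustration.

```python
import heapq


class SlackMergeBuffer:
    """Toy, single-threaded illustration of slack-bounded ordered merging.

    Hypothetical names and policies throughout; not the authors' SSG.
    """

    def __init__(self, sources, slack):
        self.slack = slack                        # max wait per tuple (assumed time units)
        self.latest = {s: None for s in sources}  # newest timestamp seen per source
        self.heap = []                            # (timestamp, seq, arrival_time, payload)
        self.seq = 0                              # tie-breaker so payloads are never compared

    def add(self, source, timestamp, payload, now):
        # Assumes each source delivers its own tuples in timestamp order.
        self.latest[source] = timestamp
        heapq.heappush(self.heap, (timestamp, self.seq, now, payload))
        self.seq += 1

    def poll(self, now):
        """Return releasable tuples as (timestamp, payload, deterministic)."""
        seen = list(self.latest.values())
        # A tuple is ready when no source can still deliver an older one.
        watermark = None if None in seen else min(seen)
        # Tuples that have waited at least `slack` force a release (assumed policy).
        expired = [ts for ts, _, arrived, _ in self.heap if now - arrived >= self.slack]
        forced = max(expired) if expired else None
        out = []
        while self.heap:
            ts = self.heap[0][0]
            ready = watermark is not None and ts <= watermark
            if not ready and (forced is None or ts > forced):
                break
            _, _, _, payload = heapq.heappop(self.heap)
            out.append((ts, payload, ready))
        return out


buf = SlackMergeBuffer(sources=["a", "b"], slack=5)
buf.add("a", 1, "a1", now=0)
print(buf.poll(now=0))  # [] : source "b" has not reported yet
buf.add("b", 2, "b2", now=1)
print(buf.poll(now=1))  # [(1, 'a1', True)]
buf.add("a", 3, "a3", now=2)
print(buf.poll(now=2))  # [(2, 'b2', True)]
print(buf.poll(now=7))  # [(3, 'a3', False)] : slack expired, ordering no longer guaranteed
```

The `deterministic` flag marks whether a tuple left the buffer because it was safely ordered or because its waiting time ran out. In this sketch `slack` is fixed, whereas the abstract describes tuning the maximum waiting time dynamically.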

    Published In

    DEBS '17: Proceedings of the 11th ACM International Conference on Distributed and Event-based Systems
    June 2017
    393 pages
    ISBN:9781450350655
    DOI:10.1145/3093742
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Complex Event Processing
    2. Deterministic Processing
    3. Stream Processing

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    DEBS '17

    Acceptance Rates

    DEBS '17 Paper Acceptance Rate 22 of 60 submissions, 37%;
    Overall Acceptance Rate 145 of 583 submissions, 25%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 9
    • Downloads (Last 6 weeks): 1
    Reflects downloads up to 11 Feb 2025

    Citations

    Cited By

    • (2023) FORTE: an extensible framework for robustness and efficiency in data transfer pipelines. Proceedings of the 17th ACM International Conference on Distributed and Event-based Systems, 139-150. DOI: 10.1145/3583678.3596892. Online publication date: 27-Jun-2023
    • (2022) Research Summary: Deterministic, Explainable and Efficient Stream Processing. Proceedings of the 2022 Workshop on Advanced tools, programming languages, and PLatforms for Implementing and Evaluating algorithms for Distributed systems, 65-69. DOI: 10.1145/3524053.3542750. Online publication date: 25-Jul-2022
    • (2022) Edge-Based Runtime Verification for the Internet of Things. IEEE Transactions on Services Computing 15:5, 2713-2727. DOI: 10.1109/TSC.2021.3074956. Online publication date: 1-Sep-2022
    • (2022) STRETCH: Virtual Shared-Nothing Parallelism for Scalable and Elastic Stream Processing. IEEE Transactions on Parallel and Distributed Systems 33:12, 4221-4238. DOI: 10.1109/TPDS.2022.3181979. Online publication date: 1-Dec-2022
    • (2021) Motivations and Challenges for Stream Processing in Edge Computing. Companion of the ACM/SPEC International Conference on Performance Engineering, 17-18. DOI: 10.1145/3447545.3451899. Online publication date: 19-Apr-2021
    • (2020) The role of event-time order in data streaming analysis. Proceedings of the 14th ACM International Conference on Distributed and Event-based Systems, 214-217. DOI: 10.1145/3401025.3404088. Online publication date: 13-Jul-2020
    • (2020) TinTiN. Proceedings of the 14th ACM International Conference on Distributed and Event-based Systems, 141-152. DOI: 10.1145/3401025.3401769. Online publication date: 13-Jul-2020
    • (2020) Simplifying CPS Application Development through Fine-grained, Automatic Timeout Predictions. ACM Transactions on Internet of Things 1:3, 1-30. DOI: 10.1145/3385960. Online publication date: 1-Jun-2020
    • (2020) Delegation sketch. Proceedings of the Fifteenth European Conference on Computer Systems, 1-16. DOI: 10.1145/3342195.3387542. Online publication date: 15-Apr-2020
    • (2020) Accelerated LiDAR data processing algorithm for self‐driving cars on the heterogeneous computing platform. IET Computers & Digital Techniques 14:5, 201-209. DOI: 10.1049/iet-cdt.2019.0166. Online publication date: 19-May-2020
