DOI: 10.1145/3318464.3389753
research-article

LightSaber: Efficient Window Aggregation on Multi-core Processors

Published: 31 May 2020

Abstract

Window aggregation queries are a core part of streaming applications. To support window aggregation efficiently, stream processing engines face a trade-off between exploiting parallelism (at the instruction/multi-core levels) and incremental computation (across overlapping windows and queries). Existing engines implement ad-hoc aggregation and parallelization strategies. As a result, they only achieve high performance for specific queries depending on the window definition and the type of aggregation function. We describe a general model for the design space of window aggregation strategies. Based on this, we introduce LightSaber, a new stream processing engine that balances parallelism and incremental processing when executing window aggregation queries on multi-core CPUs. Its design generalizes existing approaches: (i) for parallel processing, LightSaber constructs a parallel aggregation tree (PAT) that exploits the parallelism of modern processors. The PAT divides window aggregation into intermediate steps that enable the efficient use of both instruction-level (i.e., SIMD) and task-level (i.e., multi-core) parallelism; and (ii) to generate efficient incremental code from the PAT, LightSaber uses a generalized aggregation graph (GAG), which encodes the low-level data dependencies required to produce aggregates over the stream. A GAG thus generalizes state-of-the-art approaches for incremental window aggregation and supports work-sharing between overlapping windows. LightSaber achieves up to an order of magnitude higher throughput compared to existing systems: on a 16-core server, it processes 470 million records/s with 132 μs average latency.
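
The work-sharing idea mentioned in the abstract can be illustrated with a small sketch. This is not LightSaber's PAT or GAG; it is a generic pane-based sliding-window sum over count-based windows, written only to show how overlapping windows can reuse partial aggregates. The function names and the choice of sum as the aggregation function are assumptions made for illustration.

```python
from math import gcd


def naive_sliding_sum(values, size, slide):
    """Recompute every window from scratch: O(size) additions per window."""
    return [sum(values[start:start + size])
            for start in range(0, len(values) - size + 1, slide)]


def pane_sliding_sum(values, size, slide):
    """Share work between overlapping windows via per-pane partial sums.

    Hypothetical helper for illustration; assumes count-based windows.
    """
    pane = gcd(size, slide)          # largest pane that tiles both size and slide
    n_panes = len(values) // pane    # only complete panes are used
    # Step 1: one partial aggregate per pane. Panes do not depend on each
    # other, so this step could in principle be split across cores.
    partials = [sum(values[i * pane:(i + 1) * pane]) for i in range(n_panes)]
    # Step 2: each window combines size/pane consecutive partials instead of
    # re-reading the raw records.
    per_window = size // pane
    step = slide // pane
    return [sum(partials[p:p + per_window])
            for p in range(0, n_panes - per_window + 1, step)]


if __name__ == "__main__":
    data = list(range(1, 11))        # records 1..10
    print(pane_sliding_sum(data, size=4, slide=2))
```

With a window of 4 records sliding by 2, each pane holds 2 records and every window is the combination of 2 pane partials, so each record is read once rather than once per window that contains it.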

Supplementary Material

MP4 File (3318464.3389753.mp4)
Presentation Video

    Published In

    SIGMOD '20: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data
    June 2020
    2925 pages
    ISBN:9781450367356
    DOI:10.1145/3318464

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. code generation
    2. incremental computation
    3. stream processing
    4. window aggregation

    Qualifiers

    • Research-article

    Funding Sources

    • EPSRC Centre for Doctoral Training in High Performance Embedded and Distributed Systems

    Conference

    SIGMOD/PODS '20

    Acceptance Rates

    Overall Acceptance Rate 785 of 4,003 submissions, 20%
