research-article

Workload characterization for operator-based distributed stream processing applications

Authors: Xiaolan J. Zhang, Henrique Andrade, Kun-Lung Wu

DEBS '10: Proceedings of the Fourth ACM International Conference on Distributed Event-Based Systems

Pages 235 - 247

https://doi.org/10.1145/1827418.1827466

Published: 12 July 2010

Abstract

Operator-based programming languages provide an effective development model for large-scale stream processing applications. A stream processing application consists of many runtime-deployable software processing elements (PEs) that work in flows to process incoming messages. Operators (OPs) are logical building blocks hosted by PEs. One or more OPs can be fused into a PE at compile time. Performance optimization for our streaming system includes compile-time fusion optimization and runtime PE-to-host deployment. One goal of an optimized stream application is to sustain maximal message throughput using minimal computing resources.

Characterizing the resource usage of PEs is critical for performance optimization. During compile-time optimization, OP-level resource usage is used to predict the resource usage of fused PEs. When an application starts, the scheduler uses PE-level resource usage as its initial estimate. In this paper, we propose an efficient workload characterization approach for data stream processing systems. Our method includes procedures for obtaining reusable OP-level resource usage information from profiling data and for recomposing OP-level profiles to predict PE-level resource usage. We present several techniques to overcome measurement errors in the OP data collection. The impact of hardware speed and of multi-threading contention on hyper-threading and multi-core machines is also studied. We show that our method can be applied to several streaming applications and that the predicted PE CPU usage is within 15% of the actual CPU usage.
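The recomposition idea in the abstract can be illustrated with a minimal sketch. It assumes a simple additive model in which a fused PE's CPU usage is the sum, over its OPs, of per-tuple CPU cost times input tuple rate. The abstract does not state the paper's actual model, so the additive assumption, the names `OpProfile`, `predict_pe_cpu`, and `relative_error`, and every number below are hypothetical and for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class OpProfile:
    """Hypothetical OP-level profile: CPU seconds spent per input tuple."""
    name: str
    cpu_seconds_per_tuple: float


def predict_pe_cpu(ops: Iterable[Tuple[OpProfile, float]]) -> float:
    """Estimate the CPU usage (in cores) of a PE that fuses the given OPs.

    Each item pairs an OP profile with its input rate in tuples per second.
    Assumption: costs add linearly and fusion adds no overhead of its own.
    """
    return sum(op.cpu_seconds_per_tuple * rate for op, rate in ops)


def relative_error(predicted: float, measured: float) -> float:
    """Relative prediction error with respect to the measured CPU usage."""
    if measured <= 0:
        raise ValueError("measured CPU usage must be positive")
    return abs(predicted - measured) / measured


# Illustrative, made-up profiles and rates for one fused PE.
fused_pe = [
    (OpProfile("parse", 20e-6), 10_000.0),
    (OpProfile("filter", 5e-6), 10_000.0),
    (OpProfile("aggregate", 50e-6), 2_000.0),
]
predicted = predict_pe_cpu(fused_pe)
measured = 0.40  # made-up measurement, in cores
print(f"predicted={predicted:.3f} error={relative_error(predicted, measured):.1%}")
```

With these made-up inputs the estimate is about 0.35 cores, which can then be compared against a measured value in the same way the abstract compares predicted and actual PE CPU usage.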


Published In

DEBS '10: Proceedings of the Fourth ACM International Conference on Distributed Event-Based Systems

July 2010

303 pages

ISBN:9781605589275

DOI:10.1145/1827418

Co-chair: Peter Pietzuch, Imperial College London, UK

Conference Chair: Jean Bacon, University of Cambridge, UK

Program Chairs: Joe Sventek, University of Glasgow, UK; Ugur Cetintemel, Brown University, Rhode Island

Copyright © 2010 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Conference

DEBS '10

DEBS '10: The 4th ACM International Conference on Distributed Event-based Systems

July 12 - 15, 2010

Cambridge, United Kingdom

Acceptance Rates

Overall Acceptance Rate 145 of 583 submissions, 25%
