DOI: 10.1145/1827418.1827466

Workload characterization for operator-based distributed stream processing applications

Published: 12 July 2010

Abstract

Operator-based programming languages provide an effective development model for large-scale stream processing applications. A stream processing application consists of many runtime-deployable software processing elements (PEs) that work in flows to process incoming messages. Operators (OPs) are logical building blocks hosted by PEs. One or more OPs can be fused into a PE at compile time. Performance optimization for our streaming system includes compile-time fusion optimization and runtime PE-to-host deployment. One of the goals of an optimized stream application is to use minimal computing resources to sustain maximal message throughput.
Characterizing the resource usage of PEs is critical for performance optimization. During compile-time optimization, OP-level resource usage is used to predict the resource usage of fused PEs. When an application starts, the scheduler uses PE-level resource usage as an initial estimate. In this paper, we propose an efficient workload characterization approach for data stream processing systems. Our method includes procedures for obtaining reusable OP-level resource usage information from profiling data and for recomposing OP-level profiles to predict PE-level resource usage. We present several techniques to overcome measurement errors in the OP data collection. The impact of hardware speed and multi-threading contention on hyper-threading and multi-core machines is also studied. We show that our method can be applied to several streaming applications and that the predicted PE CPU usage is within 15% of the actual CPU usage.
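
The recomposition idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual model, so everything here is an assumption made for illustration: the additive cost model (a fused PE costs the sum of its OPs), the linear scaling with host speed, and all names (`OpProfile`, `predict_pe_cpu`, `relative_error`, `speed_ratio`) and numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class OpProfile:
    """Hypothetical reusable OP-level profile taken on a profiling host."""
    name: str
    cpu_seconds_per_tuple: float  # CPU time spent per processed tuple
    tuples_per_second: float      # expected input rate for this OP


def predict_pe_cpu(ops: Iterable[OpProfile], speed_ratio: float = 1.0) -> float:
    """Estimate the CPU usage (fraction of one core) of a PE fused from ops.

    Assumption: the PE cost is the plain sum of its OPs' costs, and CPU time
    scales inversely with speed_ratio (target host speed / profiling host
    speed). Contention effects are ignored in this sketch.
    """
    if speed_ratio <= 0:
        raise ValueError("speed_ratio must be positive")
    total = sum(op.cpu_seconds_per_tuple * op.tuples_per_second for op in ops)
    return total / speed_ratio


def relative_error(predicted: float, actual: float) -> float:
    """Relative prediction error against a measured, non-zero CPU usage."""
    if actual <= 0:
        raise ValueError("actual must be positive")
    return abs(predicted - actual) / actual


# Made-up profiles for three OPs fused into one PE.
example_ops = [
    OpProfile("parse", 20e-6, 10_000),
    OpProfile("filter", 5e-6, 10_000),
    OpProfile("aggregate", 50e-6, 2_000),
]
example_prediction = predict_pe_cpu(example_ops)
```

With these made-up numbers the prediction is about 0.35 of one core; against a hypothetical measured usage of 0.40, `relative_error` gives about 12.5%, which would fall inside the 15% bound the abstract reports.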

Published In

DEBS '10: Proceedings of the Fourth ACM International Conference on Distributed Event-Based Systems
July 2010
303 pages
ISBN: 9781605589275
DOI: 10.1145/1827418

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. profiling
  2. stream processing system
  3. workload characterization

Qualifiers

  • Research-article

Conference

DEBS '10

Acceptance Rates

Overall Acceptance Rate 145 of 583 submissions, 25%

Cited By

  • (2014) Online Data Processing on S4 Engine: A Study Case on Natural Disasters. 2014 33rd International Conference of the Chilean Computer Science Society (SCCC), pages 60-64. DOI: 10.1109/SCCC.2014.10. Online publication date: Nov-2014.
  • (2013) Generating synthetic task graphs for simulating stream computing systems. Journal of Parallel and Distributed Computing, 73(10):1362-1374. DOI: 10.1016/j.jpdc.2013.06.002. Online publication date: 1-Oct-2013.
  • (2013) A performance analysis of System S, S4, and Esper via two level benchmarking. Proceedings of the 10th International Conference on Quantitative Evaluation of Systems, pages 225-240. DOI: 10.1007/978-3-642-40196-1_19. Online publication date: 27-Aug-2013.
  • (2012) Hirundo. Proceedings of the 3rd ACM/SPEC International Conference on Performance Engineering, pages 335-346. DOI: 10.1145/2188286.2188347. Online publication date: 22-Apr-2012.
  • (2011) A Flexible Workload Generator for Simulating Stream Computing Systems. Proceedings of the 2011 IEEE 19th Annual International Symposium on Modelling, Analysis, and Simulation of Computer and Telecommunication Systems, pages 409-417. DOI: 10.1109/MASCOTS.2011.54. Online publication date: 25-Jul-2011.
