DOI: 10.1145/2851141.2851148
research-article

Keep calm and react with foresight: strategies for low-latency and energy-efficient elastic data stream processing

Published: 27 February 2016

Abstract

This paper addresses the problem of designing scaling strategies for elastic data stream processing. Elasticity allows applications to rapidly change their configuration on the fly (e.g., the amount of resources used) in response to dynamic workload fluctuations. We address this problem by adopting Model Predictive Control, a control-theoretic method that finds the optimal application configuration over a limited prediction horizon by solving an online optimization problem. Our control strategies address latency constraints, using Queueing Theory models, and energy consumption, by changing the number of cores used and the CPU frequency through the Dynamic Voltage and Frequency Scaling (DVFS) support available in modern multicore CPUs. The proactive capabilities, together with latency- and energy-awareness, are the novel features of our approach. To validate our methodology, we run a thorough set of experiments on a high-frequency trading application. The results demonstrate the high degree of flexibility and configurability of our approach and show the effectiveness of our elastic scaling strategies compared with existing state-of-the-art techniques used in similar scenarios.
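
To make the receding-horizon idea in the abstract concrete, the following is a minimal sketch and not the paper's actual controller. It assumes a toy M/M/1-style latency estimate, a toy cubic power model, and a brute-force search over a short horizon; every name and constant (`CORES`, `FREQS_GHZ`, `WORK_PER_ITEM`, `latency`, `power`, `mpc_step`, `switch_cost`) is hypothetical and chosen only for illustration.

```python
from itertools import product

# All constants and model forms below are illustrative assumptions,
# not values or formulas taken from the paper.
CORES = [1, 2, 4, 8]
FREQS_GHZ = [1.2, 2.0]
WORK_PER_ITEM = 1e-3  # seconds of work per item on one core at 1 GHz (hypothetical)


def latency(rate, cores, freq):
    """Mean response time under an M/M/1-style approximation (an assumption)."""
    service_rate = cores * freq / WORK_PER_ITEM  # items per second
    if rate >= service_rate:
        return float("inf")  # unstable queue: latency grows without bound
    return 1.0 / (service_rate - rate)


def power(cores, freq):
    """Toy power model: per-core cost grows with the cube of the frequency."""
    return cores * (1.0 + freq ** 3)


def mpc_step(current, forecast, max_latency, switch_cost=0.5):
    """Return the first configuration of the cheapest feasible plan over the horizon."""
    configs = list(product(CORES, FREQS_GHZ))
    best_plan, best_cost = None, float("inf")
    # Brute-force search: one (cores, freq) choice per forecast step.
    for plan in product(configs, repeat=len(forecast)):
        cost, prev = 0.0, current
        for (cores, freq), rate in zip(plan, forecast):
            if latency(rate, cores, freq) > max_latency:
                cost = float("inf")  # plan violates the latency constraint
                break
            changed = (cores, freq) != prev
            cost += power(cores, freq) + (switch_cost if changed else 0.0)
            prev = (cores, freq)
        if cost < best_cost:
            best_plan, best_cost = plan, cost
    # Receding horizon: apply only the first step, then re-plan with new forecasts.
    # If no plan is feasible, fall back to the largest configuration.
    return best_plan[0] if best_plan else max(configs)
```

With a forecast of 500 and then 3000 items per second and a 5 ms latency bound, `mpc_step((1, 1.2), [500, 3000], 0.005)` keeps the single slow core for the current step, because the plan only needs more cores at the second step. The exhaustive search grows exponentially with the horizon length, so a sketch like this is only practical for very short horizons.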

Supplementary Material

Supplemental material. (a13-de_metteis.zip)

Published In

PPoPP '16: Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
February 2016
420 pages
ISBN:9781450340922
DOI:10.1145/2851141
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. DVFS
  2. data stream processing
  3. elasticity
  4. model predictive control
  5. multicore programming

Qualifiers

  • Research-article

Conference

PPoPP '16
Acceptance Rates

Overall Acceptance Rate 230 of 1,014 submissions, 23%

Article Metrics

  • Downloads (Last 12 months): 38
  • Downloads (Last 6 weeks): 4
Reflects downloads up to 15 Oct 2024

Cited By

  • (2022) Alps: An Adaptive Load Partitioning Scaling Solution for Stream Processing System on Skewed Stream. Database and Expert Systems Applications, pages 17-31. DOI: 10.1007/978-3-031-12426-6_2. Online publication date: 29-Jul-2022
  • (2021) Data-Intensive Workload Consolidation in Serverless (Lambda/FaaS) Platforms. 2021 IEEE 20th International Symposium on Network Computing and Applications (NCA), pages 1-8. DOI: 10.1109/NCA53618.2021.9685244. Online publication date: 23-Nov-2021
  • (2021) Graceful Performance Degradation in Apache Storm. Parallel and Distributed Computing, Applications and Technologies, pages 389-400. DOI: 10.1007/978-3-030-69244-5_35. Online publication date: 21-Feb-2021
  • (2020) Resource Management and Scheduling in Distributed Stream Processing Systems. ACM Computing Surveys, 53:3, pages 1-41. DOI: 10.1145/3355399. Online publication date: 28-May-2020
  • (2019) Efficient resource scheduling for the analysis of Big Data streams. Intelligent Data Analysis, 23:1, pages 77-102. DOI: 10.3233/IDA-173691. Online publication date: 20-Feb-2019
  • (2019) Combining it all. Proceedings of the 20th International Middleware Conference, pages 255-267. DOI: 10.1145/3361525.3361551. Online publication date: 9-Dec-2019
  • (2019) A Comprehensive Survey on Parallelization and Elasticity in Stream Processing. ACM Computing Surveys, 52:2, pages 1-37. DOI: 10.1145/3303849. Online publication date: 30-Apr-2019
  • (2019) Transformation-Based Streaming Workflow Allocation on Geo-Distributed Datacenters for Streaming Big Data Processing. IEEE Transactions on Services Computing, 12:4, pages 654-668. DOI: 10.1109/TSC.2016.2614297. Online publication date: 1-Jul-2019
  • (2019) Pec: Proactive Elastic Collaborative Resource Scheduling in Data Stream Processing. IEEE Transactions on Parallel and Distributed Systems, 30:7, pages 1628-1642. DOI: 10.1109/TPDS.2019.2891587. Online publication date: 1-Jul-2019
  • (2019) Performance-Oriented Deployment of Streaming Applications on Cloud. IEEE Transactions on Big Data, 5:1, pages 46-59. DOI: 10.1109/TBDATA.2017.2720622. Online publication date: 1-Mar-2019
