research-article

Reinforcement Learning Based Policies for Elastic Stream Processing on Heterogeneous Resources

Authors:

Gabriele Russo Russo,

Valeria Cardellini,

Francesco Lo Presti

DEBS '19: Proceedings of the 13th ACM International Conference on Distributed and Event-based Systems

Pages 31 - 42

https://doi.org/10.1145/3328905.3329506

Published: 24 June 2019

Abstract

Data Stream Processing (DSP) has emerged as a key enabler for pervasive services that need to process data in near real time. DSP applications keep up with the high volume of produced data by scaling their execution across multiple computing nodes, so as to process the incoming data flow in parallel. Workload variability requires the application parallelism to be adapted elastically at run-time in order to avoid over-provisioning. Elasticity policies for DSP have been widely investigated, but mostly under the simplifying assumption of homogeneous infrastructures. The resulting solutions do not capture the richness and inherent complexity of modern infrastructures, where heterogeneous computing resources are available on demand. In this paper, we formulate the problem of controlling elasticity on heterogeneous resources as a Markov Decision Process (MDP). The resulting MDP is not easily solved by traditional techniques due to state space explosion, and thus we show how linear Function Approximation and Tile Coding can be used to efficiently compute elasticity policies at run-time. To deal with parameter uncertainty, we integrate the proposed approach with Reinforcement Learning algorithms. Our numerical evaluation shows the efficacy of the presented solutions compared to standard methods in terms of accuracy and convergence speed.
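As a rough illustration of the two techniques named in the abstract, the sketch below combines a simple tile coder with a linear action-value function and a Q-learning style update. It is not the formulation used in the paper: the class names `TileCoder` and `LinearQ`, the normalization of the state to [0, 1), the uniform tiling offsets, and the choice to minimize a cost (rather than maximize a reward) are all assumptions made only for this example.

```python
from typing import List


class TileCoder:
    """Maps a point in [0, 1)^d to one active tile index per tiling (illustrative)."""

    def __init__(self, dims: int, tiles_per_dim: int, tilings: int) -> None:
        self.dims = dims
        self.tiles_per_dim = tiles_per_dim
        self.tilings = tilings
        # One extra tile per dimension absorbs the shift applied to each tiling.
        self.tiles_per_tiling = (tiles_per_dim + 1) ** dims
        self.n_features = tilings * self.tiles_per_tiling

    def active_features(self, state: List[float]) -> List[int]:
        active = []
        for t in range(self.tilings):
            # Assumed offset scheme: each tiling is shifted by a fraction of a tile.
            offset = t / (self.tilings * self.tiles_per_dim)
            index = 0
            for x in state:
                cell = min(int((x + offset) * self.tiles_per_dim), self.tiles_per_dim)
                index = index * (self.tiles_per_dim + 1) + cell
            active.append(t * self.tiles_per_tiling + index)
        return active


class LinearQ:
    """Linear Q-function over binary tile features: Q(s, a) = sum of active weights."""

    def __init__(self, coder: TileCoder, n_actions: int,
                 alpha: float = 0.1, gamma: float = 0.9) -> None:
        self.coder = coder
        self.weights = [[0.0] * coder.n_features for _ in range(n_actions)]
        # Divide the step size by the number of tilings, a common convention.
        self.alpha = alpha / coder.tilings
        self.gamma = gamma

    def value(self, state: List[float], action: int) -> float:
        return sum(self.weights[action][i] for i in self.coder.active_features(state))

    def update(self, state: List[float], action: int,
               cost: float, next_state: List[float]) -> float:
        # Q-learning target with costs: take the minimum over next actions.
        best_next = min(self.value(next_state, a) for a in range(len(self.weights)))
        td_error = cost + self.gamma * best_next - self.value(state, action)
        for i in self.coder.active_features(state):
            self.weights[action][i] += self.alpha * td_error
        return td_error


# Hypothetical usage: a 2-D state (e.g. normalized load and normalized parallelism).
coder = TileCoder(dims=2, tiles_per_dim=4, tilings=3)
q = LinearQ(coder, n_actions=2, alpha=0.3, gamma=0.9)
q.update([0.1, 0.1], action=0, cost=1.0, next_state=[0.9, 0.9])
```

Because nearby states share tiles, a single update also shifts the estimate for neighboring states. This generalization is what lets a linear approximator cope with a state space too large to enumerate in a table.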


Index Terms

1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Reinforcement learning
    2. Machine learning approaches
      1. Markov decision processes
2. Information systems
  1. Data management systems
    1. Database management system engines
      1. Stream management


DEBS '19: Proceedings of the 13th ACM International Conference on Distributed and Event-based Systems

June 2019

291 pages

ISBN:9781450367943

DOI:10.1145/3328905

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

Research-article
Research
Refereed limited

Conference

DEBS '19

Sponsor:

DEBS '19: The 13th ACM International Conference on Distributed and Event-based Systems

June 24 - 28, 2019

Darmstadt, Germany

Acceptance Rates

DEBS '19 Paper Acceptance Rate 13 of 47 submissions, 28%;

Overall Acceptance Rate 145 of 583 submissions, 25%
