DOI: 10.1145/3342195.3387553

PLASMA: programmable elasticity for stateful cloud computing applications

Published: 17 April 2020

    Abstract

    Developers look for simple ways to manage their applications on cloud platforms, and major cloud providers already offer automatic elasticity management (e.g., AWS Lambda, Azure Durable Functions). However, many cloud applications are stateful: while executing, functions need to share their state with others. Providing elasticity for such stateful functions is much more challenging, because a deployment or elasticity decision for one stateful entity can strongly affect others in ways that are hard to predict without application knowledge. Existing solutions either support only stateless applications (e.g., AWS Lambda) or provide only limited elasticity management for stateful applications (e.g., Azure Durable Functions).
    PLASMA (Programmable Elasticity for Stateful Cloud Computing Applications) is a programming framework for elastic stateful cloud applications. It includes (1) an elasticity programming language that serves as a second "level" of programming, complementing the main application programming language, for describing elasticity behavior, and (2) a novel semantics-aware elasticity management runtime that tracks program execution and acts upon application features as suggested by the elasticity behavior. We have implemented more than 10 applications with PLASMA. Extensive evaluation on Amazon AWS shows that PLASMA significantly improves their efficiency, e.g., achieving the same performance as a vanilla setup with 25% fewer resources, or improving performance by 40% compared to the default setup.
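
The two-level idea in the abstract (elasticity behavior declared separately from application logic, and a runtime that acts on it) can be illustrated with a small Python sketch. This is not PLASMA's elasticity language or runtime, neither of which is shown on this page. Every name here (`ActorStats`, `Rule`, `colocate_chatty_peers`, `isolate_hot_entity`, `plan`) and every threshold is a hypothetical stand-in chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class ActorStats:
    """Observations for one stateful entity (hypothetical fields)."""
    name: str
    server: str
    cpu_share: float          # fraction of its server's CPU used by this entity
    calls_to: Dict[str, int]  # peer name -> messages sent in the last window


Stats = Dict[str, ActorStats]


@dataclass
class Rule:
    """One elasticity rule: a condition over observations and a suggested action."""
    condition: Callable[[ActorStats, Stats], bool]
    action: Callable[[ActorStats, Stats], str]


def _busiest_peer(a: ActorStats) -> Optional[str]:
    # The peer this entity messages most, or None if it sends no messages.
    return max(a.calls_to, key=a.calls_to.get) if a.calls_to else None


def colocate_chatty_peers(threshold: int) -> Rule:
    """Suggest co-locating an entity with its busiest peer when they sit on different servers."""
    def cond(a: ActorStats, stats: Stats) -> bool:
        peer = _busiest_peer(a)
        return (peer is not None and peer in stats
                and a.calls_to[peer] >= threshold
                and stats[peer].server != a.server)

    def act(a: ActorStats, stats: Stats) -> str:
        peer = _busiest_peer(a)
        return f"colocate {a.name} with {peer} on {stats[peer].server}"

    return Rule(cond, act)


def isolate_hot_entity(cpu_limit: float) -> Rule:
    """Suggest moving an entity that uses more than cpu_limit of its server's CPU."""
    return Rule(lambda a, _: a.cpu_share > cpu_limit,
                lambda a, _: f"migrate {a.name} to a less loaded server")


def plan(rules: List[Rule], observed: List[ActorStats]) -> List[str]:
    """Toy runtime step: apply the first matching rule to each observed entity."""
    stats = {s.name: s for s in observed}
    decisions = []
    for s in observed:
        for rule in rules:
            if rule.condition(s, stats):
                decisions.append(rule.action(s, stats))
                break  # first matching rule wins for this entity
    return decisions


# The application code (the "room" and "player" entities) is not shown; the rules
# below are written without touching it, which is the point of the second level.
rules = [colocate_chatty_peers(threshold=100), isolate_hot_entity(cpu_limit=0.8)]
observed = [
    ActorStats("room", "s1", 0.2, {"player": 120}),
    ActorStats("player", "s2", 0.9, {}),
]
decisions = plan(rules, observed)
```

In this toy run the rules use application-level information (who talks to whom) that a generic CPU-threshold autoscaler would not see, which is the kind of knowledge the abstract argues is needed for stateful entities.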

    Published In

    EuroSys '20: Proceedings of the Fifteenth European Conference on Computer Systems
    April 2020
    49 pages
    ISBN: 9781450368827
    DOI: 10.1145/3342195

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • Research-article

    Conference

    EuroSys '20
    EuroSys '20: Fifteenth EuroSys Conference 2020
    April 27 - 30, 2020
    Heraklion, Greece

    Acceptance Rates

    EuroSys '20 Paper Acceptance Rate 43 of 234 submissions, 18%;
    Overall Acceptance Rate 241 of 1,308 submissions, 18%

    Article Metrics

    • Downloads (last 12 months): 24
    • Downloads (last 6 weeks): 1
    Reflects downloads up to 09 Aug 2024

    Cited By

    • (2024) Extending parallel programming patterns with adaptability features. Cluster Computing. DOI: 10.1007/s10586-024-04622-0. Online publication date: 15-Jun-2024
    • (2024) A Survey of Actor-Like Programming Models for Serverless Computing. Active Object Languages: Current Research Trends, pages 123-146. DOI: 10.1007/978-3-031-51060-1_5. Online publication date: 29-Jan-2024
    • (2023) Glider. Proceedings of the 24th International Middleware Conference, pages 247-260. DOI: 10.1145/3590140.3629119. Online publication date: 27-Nov-2023
    • (2022) Cloud Services Projects to Support Small and Medium-Sized Businesses and Approaches for their Commercialization. 2022 13th National Conference with International Participation (ELECTRONICA), pages 1-4. DOI: 10.1109/ELECTRONICA55578.2022.9874426. Online publication date: 19-May-2022
    • (2022) Varda: A Framework for Compositional Distributed Programming. Networked Systems, pages 16-30. DOI: 10.1007/978-3-031-17436-0_2. Online publication date: 28-Sep-2022
    • (2021) ETAS: predictive scheduling of functions on worker nodes of Apache OpenWhisk platform. The Journal of Supercomputing, 78(4), pages 5358-5393. DOI: 10.1007/s11227-021-04057-z. Online publication date: 23-Sep-2021
    • (2020) Scalable and serializable networked multi-actor programming. Proceedings of the ACM on Programming Languages, 4(OOPSLA), pages 1-30. DOI: 10.1145/3428266. Online publication date: 13-Nov-2020
