Jiffy: Elastic far-memory for stateful serverless analytics

A Khandelwal, Y Tang, R Agarwal, A Akella… - Proceedings of the …, 2022 - dl.acm.org
Proceedings of the Seventeenth European Conference on Computer Systems, 2022 - dl.acm.org
Stateful serverless analytics can be enabled using a remote memory system for inter-task communication and for storing and exchanging intermediate data. However, existing systems allocate memory resources at job granularity: jobs specify their memory demands at submission time, and the system allocates memory equal to that demand for the job's entire lifetime. This leads to resource underutilization and/or performance degradation when intermediate data sizes vary during job execution.
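The trade-off described above can be illustrated with a toy calculation. This is a minimal sketch, not taken from the paper: the function names and the demand trace are hypothetical, and the model assumes a job's intermediate data size is sampled at fixed time steps. It shows that reserving the peak demand for the whole lifetime wastes memory, while reserving less forces data beyond the reservation onto slower storage.

```python
from typing import List


def job_granularity_utilization(demand: List[int]) -> float:
    """Fraction of reserved memory actually used when a job reserves
    its peak demand for every time step of its lifetime.
    Assumes a non-empty trace with a positive peak."""
    reserved = max(demand) * len(demand)
    return sum(demand) / reserved


def spilled_under_cap(demand: List[int], cap: int) -> int:
    """Total data (summed over time steps) that exceeds a fixed
    reservation `cap` and would have to go to slower storage."""
    return sum(max(0, d - cap) for d in demand)


# Hypothetical intermediate-data sizes in GB at five time steps.
demand_gb = [2, 8, 16, 4, 2]

# Reserving the peak (16 GB) avoids spilling but leaves most memory idle.
print(job_granularity_utilization(demand_gb))  # 0.4

# Reserving only 8 GB improves utilization but spills the excess.
print(spilled_under_cap(demand_gb, cap=8))  # 8
```

In this toy trace, a peak reservation is only 40% utilized, and halving the reservation pushes 8 GB to slower storage. An allocator that tracks instantaneous demand would avoid both costs.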
This paper presents Jiffy, an elastic far-memory system for stateful serverless analytics that meets the instantaneous memory demand of a job at timescales of seconds. Jiffy efficiently multiplexes memory capacity across concurrently running jobs, reducing the overheads of reads and writes to slower persistent storage, resulting in 1.6-2.5× improvements in job execution time over production workloads. The Jiffy implementation currently runs on Amazon EC2, enables a wide variety of distributed programming models including MapReduce, Dryad, StreamScope, and Piccolo, and natively supports a large class of analytics applications on AWS Lambda.