Limited Access: The Truth Behind Far Memory

Authors: Radhika Niranjan Mysore, Marcos K. Aguilera, Amy Ousterhout, Alex C. Snoeren

WORDS '23: Proceedings of the 4th Workshop on Resource Disaggregation and Serverless

Pages 37 - 43

https://doi.org/10.1145/3605181.3626288

Published: 23 October 2023

Abstract

Memory capacity in data centers is becoming a scarce resource. To address this issue, emerging runtimes enable applications to supplement their local memory with additional tiers of compressed, non-volatile, or far memory, often accessed via OS-supported paging. In these systems, minimizing page faults is crucial for good performance. Yet, there is little common understanding of which parts of application code are responsible for triggering page faults. In this paper, we analyze page-fault behavior across a suite of 26 applications and find that the vast majority of page faults are triggered by a very small number of lines of application code. In the light of this and related observations, we discuss the feasibility of several ways to reduce page faults.
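The concentration claim in the abstract can be made concrete with a small calculation. The sketch below is illustrative only and is not the authors' methodology: it assumes a hypothetical trace in which each page fault has already been attributed to a source location string, and the function name `locations_covering` and the sample data are invented for this example. It reports the fewest code locations that together account for a chosen fraction of all faults.

```python
from collections import Counter


def locations_covering(fault_locations, fraction):
    """Return the fewest code locations that together account for at
    least `fraction` of all recorded page faults (hypothetical helper)."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    counts = Counter(fault_locations)
    total = sum(counts.values())
    covered = 0
    chosen = []
    # Visit locations from most to least faults, stopping once the
    # requested share of faults is covered.
    for location, n in counts.most_common():
        if covered >= fraction * total:
            break
        chosen.append(location)
        covered += n
    return chosen


# Hypothetical trace: one entry per page fault, labeled "file:line".
trace = (["scan.c:88"] * 90 + ["index.c:17"] * 7
         + ["util.c:5", "util.c:9", "main.c:42"])

print(locations_covering(trace, 0.95))  # ['scan.c:88', 'index.c:17']
```

In this made-up trace, two of five faulting locations cover 95% of the faults, which is the shape of distribution the abstract describes.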

Published In

WORDS '23: Proceedings of the 4th Workshop on Resource Disaggregation and Serverless

October 2023

60 pages

ISBN: 9798400702501

DOI: 10.1145/3605181

Copyright © 2023 Owner/Author(s).

This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

SIGOPS: ACM Special Interest Group on Operating Systems

In-Cooperation

USENIX

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

WORDS '23

Sponsor:

SIGOPS

WORDS '23: 4th Workshop on Resource Disaggregation and Serverless

October 23, 2023

Koblenz, Germany
