DOI: 10.1145/3605181.3626288
research-article
Open access

Limited Access: The Truth Behind Far Memory

Published: 23 October 2023

Abstract

Memory capacity in data centers is becoming a scarce resource. To address this issue, emerging runtimes enable applications to supplement their local memory with additional tiers of compressed, non-volatile, or far memory, often accessed via OS-supported paging. In these systems, minimizing page faults is crucial for good performance. Yet, there is little common understanding of which parts of application code are responsible for triggering page faults. In this paper, we analyze page-fault behavior across a suite of 26 applications and find that the vast majority of page faults are triggered by a very small number of lines of application code. In the light of this and related observations, we discuss the feasibility of several ways to reduce page faults.
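The concentration claim in the abstract can be made concrete with a small sketch. Assuming a page-fault trace has already been reduced to one source location per fault (the `file:line` strings, the function name `sites_covering`, and the toy trace below are all hypothetical and are not taken from the paper or its tooling), the following Python computes the smallest set of most frequent fault sites that accounts for a chosen fraction of all faults.

```python
from collections import Counter


def sites_covering(fault_sites, fraction=0.9):
    """Return the most frequent fault sites, in descending order of count,
    that together account for at least `fraction` of all recorded faults."""
    counts = Counter(fault_sites)
    needed = fraction * sum(counts.values())
    covered = 0
    chosen = []
    for site, n in counts.most_common():
        if covered >= needed:
            break  # enough faults are already explained
        chosen.append(site)
        covered += n
    return chosen


# Synthetic trace for illustration only: 100 faults over 4 made-up locations.
trace = (
    ["kv.c:120"] * 92
    + ["kv.c:45"] * 5
    + ["util.c:7"] * 2
    + ["main.c:12"] * 1
)

hot = sites_covering(trace, 0.9)
print(f"{len(hot)} of {len(set(trace))} sites cover 90% of faults: {hot}")
```

On a real trace, a short `hot` list relative to the total number of distinct sites would correspond to the kind of skew the abstract describes. How faulting addresses are mapped to source lines is outside the scope of this sketch.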

Published In

WORDS '23: Proceedings of the 4th Workshop on Resource Disaggregation and Serverless
October 2023
60 pages
ISBN: 9798400702501
DOI: 10.1145/3605181
This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

In-Cooperation

  • USENIX

Publisher

Association for Computing Machinery

New York, NY, United States
