DOI: 10.1145/3409964.3461814

Paging and the Address-Translation Problem

Published: 06 July 2021

Abstract

The classical paging problem, introduced by Sleator and Tarjan in 1985, formalizes the problem of caching pages in RAM in order to minimize IOs. Their online formulation ignores the cost of address translation: programs refer to data via virtual addresses, and these must be translated into physical locations in RAM. Although the cost of an individual address translation is much smaller than that of an IO, every memory access involves an address translation, whereas IOs can be infrequent. In practice, one can spend money to avoid paging by over-provisioning RAM; in contrast, address translation is effectively unavoidable. Thus address-translation costs can sometimes dominate paging costs, and systems must simultaneously optimize both.
To mitigate the cost of address translation, all modern CPUs have translation lookaside buffers (TLBs), which are hardware caches of common address translations. What makes TLBs interesting is that a single TLB entry can potentially encode the address translation for many addresses. This is typically achieved via the use of huge pages, which translate runs of contiguous virtual addresses to runs of contiguous physical addresses. Huge pages reduce TLB misses at the cost of increasing the IOs needed to maintain contiguity in RAM. This tradeoff between TLB misses and IOs suggests that the classical paging problem does not tell the full story.
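The way a single huge-page entry covers many addresses can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes a fully associative LRU TLB with 64 entries, 4 KiB base pages, and 2 MiB huge pages, and the name `count_tlb_misses` is hypothetical. It counts TLB misses for the same access trace under both page sizes.

```python
from collections import OrderedDict

BASE_PAGE = 4 * 1024          # assumed base page size (4 KiB)
HUGE_PAGE = 2 * 1024 * 1024   # assumed huge page size (2 MiB)

def count_tlb_misses(addresses, page_size, tlb_entries):
    """Count misses in a fully associative LRU TLB (a simplifying assumption)."""
    tlb = OrderedDict()  # virtual page number -> None, ordered by recency
    misses = 0
    for addr in addresses:
        vpn = addr // page_size          # one TLB entry covers a whole page
        if vpn in tlb:
            tlb.move_to_end(vpn)         # hit: mark as most recently used
        else:
            misses += 1
            if len(tlb) >= tlb_entries:
                tlb.popitem(last=False)  # evict the least recently used entry
            tlb[vpn] = None
    return misses

# Touch one byte in each 4 KiB page of an 8 MiB region, twice over.
region = 8 * 1024 * 1024
trace = [a for _ in range(2) for a in range(0, region, BASE_PAGE)]

base_misses = count_tlb_misses(trace, BASE_PAGE, tlb_entries=64)
huge_misses = count_tlb_misses(trace, HUGE_PAGE, tlb_entries=64)
print(base_misses, huge_misses)  # prints "4096 4"
```

Under these assumed parameters the 64-entry TLB covers 256 KiB with base pages but 128 MiB with huge pages, so the scan misses on every access in the first case and only four times in the second.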
This paper introduces the Address-Translation Problem, which formalizes the problem of maintaining a TLB, a page table, and RAM in order to minimize the total cost of both TLB misses and IOs. We present an algorithm that achieves the benefits of huge pages for TLB misses without the downsides of huge pages for IOs.
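The combined objective can be sketched as a weighted sum of TLB misses and IOs. The toy model below is an illustration only and is neither the paper's formal model nor its algorithm. It assumes LRU replacement in both the TLB and RAM, that loading a page made of `h` base pages costs `h` IOs, and that a TLB miss costs a small hypothetical weight `eps`. All function and parameter names are invented for this example.

```python
from collections import OrderedDict

def lru_misses(keys, capacity):
    """Count misses of an LRU cache that holds `capacity` keys."""
    cache = OrderedDict()
    misses = 0
    for k in keys:
        if k in cache:
            cache.move_to_end(k)
        else:
            misses += 1
            if len(cache) >= capacity:
                cache.popitem(last=False)
            cache[k] = None
    return misses

def total_cost(trace, h, tlb_entries, ram_base_pages, eps):
    """Cost of serving `trace` (base-page numbers) with pages of h base pages.

    Assumptions (not from the paper): LRU in both TLB and RAM, an IO that
    loads one page of size h costs h, and each TLB miss costs eps.
    """
    pages = [p // h for p in trace]
    tlb_misses = lru_misses(pages, tlb_entries)
    ios = h * lru_misses(pages, ram_base_pages // h)
    return eps * tlb_misses + ios, tlb_misses, ios

# 32 regions of 512 base pages; touch the first 8 base pages of each, 4 passes.
trace = [r * 512 + j for _ in range(4) for r in range(32) for j in range(8)]

small = total_cost(trace, h=1, tlb_entries=64, ram_base_pages=1024, eps=0.01)
huge = total_cost(trace, h=512, tlb_entries=64, ram_base_pages=1024, eps=0.01)
print("base pages:", small[1], "TLB misses,", small[2], "IOs")
print("huge pages:", huge[1], "TLB misses,", huge[2], "IOs")
```

On this trace, the huge-page configuration cuts TLB misses from 1024 to 32 but raises IOs from 256 to 65536, because RAM holds only two huge pages and each fault transfers 512 base pages. This is the tension between TLB misses and IOs that the problem statement captures.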

References

[1]
Couchbase: Disabling transparent huge pages (THP). https://docs.couchbase.com/server/current/install/thp-disable.html. Accessed: 2/11/2021.
[2]
MongoDB: Disable transparent huge pages (THP). https://docs.mongodb.com/manual/tutorial/transparent-huge-pages/. Accessed: 2/11/2021.
[3]
Oracle database: Disabling transparent hugepages. https://docs.oracle.com/en/database/oracle/oracle-database/12.2/ladbi/disabling-transparent-hugepages.html. Accessed: 2/11/2021.
[4]
Percona: Settling the myth of transparent hugepages for databases. https://www.percona.com/blog/2019/03/06/settling-the-myth-of-transparent-hugepages-for-databases/. Accessed: 2/11/2021.
[5]
Alok Aggarwal and Jeffrey S. Vitter. The input/output complexity of sorting and related problems. Commun. ACM, 31(9):1116--1127, September 1988.
[6]
Kunal Agrawal, Michael A. Bender, and Jeremy T. Fineman. The worst page-replacement policy. In Proceedings of the 4th International Conference on Fun with Algorithms (FUN), pages 135--145. Springer-Verlag, 2007.
[7]
AMD, Inc. AMD-V nested paging.
[8]
Arkaprava Basu, Jayneel Gandhi, Jichuan Chang, Mark D. Hill, and Michael M. Swift. Efficient virtual memory for big memory servers. In Proceedings of the 40th Annual International Symposium on Computer Architecture (ISCA). ACM, 2013.
[9]
Abhishek Bhattacharjee. Preserving virtual memory by mitigating the address translation wall. IEEE Micro, 37(5):6--10, 2017.
[10]
Abhishek Bhattacharjee. Translation-triggered prefetching. In Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 63--76. ACM, 2017.
[11]
Allan Borodin and Ran El-Yaniv. Online Computation and Competitive Analysis. Cambridge University Press, USA, 1998.
[12]
Joan Boyar, Lene M. Favrholdt, and Kim S. Larsen. The relative worst-order ratio applied to paging. J. Comput. Syst. Sci., 73(5):818--843, August 2007.
[13]
Mark Brehob, Richard Enbody, Eric Torng, and Stephen Wagner. On-line restricted caching. In Proceedings of the Twelfth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 374--383. Society for Industrial and Applied Mathematics, 2001.
[14]
Niv Buchbinder, Shahar Chen, and Joseph (Seffi) Naor. Competitive algorithms for restricted caching and matroid caching. In Proceedings of the 22nd European Symposium on Algorithms (ESA), pages 209--221. Springer Berlin Heidelberg, 2014.
[15]
Intel's Cascade Lake microarchitecture. https://en.wikichip.org/wiki/intel/microarchitectures/cascade_lake. Accessed: 02/02/2020.
[16]
Fernando J. Corbató. A paging experiment with the Multics system. In MIT Project MAC Report MAC-M-384, 1969.
[17]
Guilherme Cox and Abhishek Bhattacharjee. Efficient address translation for architectures with multiple page sizes. In Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 435--448. ACM, 2017.
[18]
Peter J. Denning. The working set model for program behavior. Commun. ACM, 11(5):323--333, May 1968.
[19]
Reza Dorrigiv and Alejandro López-Ortiz. Closing the gap between theory and practice: New measures for on-line algorithm analysis. In Shin-ichi Nakano and Md. Saidur Rahman, editors, WALCOM: Algorithms and Computation, pages 13--24. Springer Berlin Heidelberg, 2008.
[20]
Reza Dorrigiv, Alejandro López-Ortiz, and J. Ian Munro. On the relative dominance of paging algorithms. Theor. Comput. Sci., 410(38--40):3694--3701, September 2009.
[21]
Y. Du, M. Zhou, B. R. Childers, D. Mossé, and R. Melhem. Supporting superpages in non-contiguous physical memory. In 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA), pages 223--234, Feb 2015.
[22]
Amos Fiat, Richard M. Karp, Michael Luby, Lyle A. McGeoch, Daniel D. Sleator, and Neal E. Young. Competitive paging algorithms. Journal of Algorithms, 12(4):685--699, 1991.
[23]
Amos Fiat, Manor Mendel, and Steven Seiden. Online companion caching. Theoretical Computer Science, 324:499--511, 09 2002.
[24]
Matteo Frigo, Charles E. Leiserson, Harald Prokop, and Sridhar Ramachandran. Cache-oblivious algorithms. In Proceedings of the 40th Annual Symposium on Foundations of Computer Science (FOCS), page 285, 1999.
[25]
Matteo Frigo, Charles E. Leiserson, Harald Prokop, and Sridhar Ramachandran. Cache-oblivious algorithms. ACM Transactions on Algorithms, 8(1):4, 2012.
[26]
Mel Gorman. Linux huge pages. https://lwn.net/Articles/375096/, 2010.
[27]
Mel Gorman. AMD Zen architecture. https://en.wikichip.org/wiki/amd/microarchitectures/zen, 2018.
[28]
Intel, Inc. Intel® 64 and IA-32 architectures software developer's manual volume 3A: System programming guide, part 1.
[29]
V. Karakostas, J. Gandhi, A. Cristal, M. D. Hill, K. S. McKinley, M. Nemirovsky, M. M. Swift, and O. S. Unsal. Energy-efficient address translation. In 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA), pages 631--643, 2016.
[30]
Youngjin Kwon, Hangchen Yu, Simon Peter, Christopher J. Rossbach, and Emmett Witchel. Coordinated and efficient huge page management with Ingens. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI), pages 705--721. USENIX Association, November 2016.
[31]
Richard C Murphy, Kyle B Wheeler, Brian W Barrett, and James A Ang. Introducing the graph 500. Cray Users Group (CUG), 19:45--74, 2010.
[32]
Juan Navarro, Sitaram Iyer, Peter Druschel, and Alan L. Cox. Practical, transparent operating system support for superpages. In 5th Symposium on Operating System Design and Implementation (OSDI), 2002.
[33]
Stanko Novakovic, Yizhou Shan, Aasheesh Kolli, Michael Cui, Yiying Zhang, Haggai Eran, Liran Liss, Michael Wei, Dan Tsafrir, and Marcos K. Aguilera. Storm: a fast transactional dataplane for remote data structures. CoRR, abs/1902.02411, 2019.
[34]
Omitted for Anonymity. Dynamic balls-and-bins and iceberg hashing. Under review, 2021. Manuscript available upon request.
[35]
Ashish Panwar, Sorav Bansal, and K. Gopinath. Hawkeye: Efficient fine-grained OS support for huge pages. In Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 347--360. Association for Computing Machinery, 2019.
[36]
Ashish Panwar, Aravinda Prasad, and K. Gopinath. Making huge pages actually useful. SIGPLAN Not., 53(2):679--692, March 2018.
[37]
Chang Hyun Park, Taekyung Heo, Jungi Jeong, and Jaehyuk Huh. Hybrid TLB coalescing: Improving TLB translation coverage under diverse fragmented memory allocations. SIGARCH Comput. Archit. News, 45(2):444--456, June 2017.
[38]
Chang Hyun Park, Taekyung Heo, Jungi Jeong, and Jaehyuk Huh. Hybrid TLB coalescing: Improving TLB translation coverage under diverse fragmented memory allocations. In Proceedings of the 44th Annual International Symposium on Computer Architecture (ISCA), pages 444--456. Association for Computing Machinery, 2017.
[39]
Enoch Peserico. Online paging with arbitrary associativity. In Proceedings of the Fourteenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 555--564. Society for Industrial and Applied Mathematics, 2003.
[40]
B. Pham, A. Bhattacharjee, Y. Eckert, and G. H. Loh. Increasing TLB reach by exploiting clustering in page translations. In 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA), pages 558--567, Feb 2014.
[41]
B. Pham, V. Vaidyanathan, A. Jaleel, and A. Bhattacharjee. CoLT: coalesced large-reach TLBs. In 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), pages 258--269, December 2012.
[42]
B. Pham, J. Veselý, G. H. Loh, and A. Bhattacharjee. Large pages and lightweight memory management in virtualized environments: Can you have it both ways? In Proceedings of the 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), pages 1--12, Dec 2015.
[43]
H. Prokop. Cache oblivious algorithms. Master's thesis, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, June 1999.
[44]
Martin Raab and Angelika Steger. “Balls into bins” -- a simple and tight analysis. In Randomization and Approximation Techniques in Computer Science, pages 159--170. Springer Berlin Heidelberg, 1998.
[45]
SandyBridge. https://www.7-cpu.com/cpu/SandyBridge.html.
[46]
Sandeep Sen and Siddhartha Chatterjee. Towards a theory of cache-efficient algorithms. In Proceedings of the Eleventh Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 829--838, USA, 2000. Society for Industrial and Applied Mathematics.
[47]
Daniel D. Sleator and Robert E. Tarjan. Amortized efficiency of list update and paging rules. Commun. ACM, 28(2):202--208, February 1985.
[48]
Michael M. Swift. Towards O(1) memory. In Proceedings of the 16th Workshop on Hot Topics in Operating Systems (HotOS), pages 7--11, 2017.
[49]
Berthold Vöcking. How asymmetry helps load balancing. J. ACM, 50(4):568--589, July 2003.
[50]
Zi Yan, Daniel Lustig, David Nellans, and Abhishek Bhattacharjee. Translation ranger: Operating system support for contiguity-aware TLBs. In Proceedings of the 46th International Symposium on Computer Architecture (ISCA), pages 698--710, New York, NY, USA, 2019.
[51]
Jian Yang, Joseph Izraelevitz, and Steven Swanson. FileMR: Rethinking RDMA networking for scalable persistent memory. In 17th USENIX Symposium on Networked Systems Design and Implementation (NSDI 20), pages 111--125, Santa Clara, CA, February 2020. USENIX Association.
[52]
N. Young. The k-server dual and loose competitiveness for paging. Algorithmica, 11(6):525--541, Jun 1994.
[53]
Neal E. Young. On-line file caching. In Proceedings of the Ninth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 82--86, USA, 1998. Society for Industrial and Applied Mathematics.
[54]
AMD's Zen microarchitecture. https://en.wikichip.org/wiki/amd/microarchitectures/zen. Accessed: 07/15/2020.

Published In

SPAA '21: Proceedings of the 33rd ACM Symposium on Parallelism in Algorithms and Architectures
July 2021
463 pages
ISBN:9781450380706
DOI:10.1145/3409964
General Chair: Kunal Agrawal
Program Chair: Yossi Azar
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. address translation
  2. hashing
  3. iceberg
  4. paging
  5. tlb
  6. virtual memory

Qualifiers

  • Research-article

Conference

SPAA '21

Acceptance Rates

Overall Acceptance Rate 447 of 1,461 submissions, 31%

Article Metrics

  • Downloads (Last 12 months): 262
  • Downloads (Last 6 weeks): 27
Reflects downloads up to 27 Jan 2025

Cited By

  • (2024) Mosaic Pages: Big TLB Reach With Small Pages. IEEE Micro 44:4 (52-59). DOI: 10.1109/MM.2024.3409181. Online publication date: 6-Jun-2024
  • (2024) Elastic Translations: Fast Virtual Memory with Multiple Translation Sizes. 2024 57th IEEE/ACM International Symposium on Microarchitecture (MICRO) (17-35). DOI: 10.1109/MICRO61859.2024.00012. Online publication date: 2-Nov-2024
  • (2023) Iceberg Hashing: Optimizing Many Hash-Table Criteria at Once. Journal of the ACM 70:6 (1-51). DOI: 10.1145/3625817. Online publication date: 30-Nov-2023
  • (2023) MEMTIS: Efficient Memory Tiering with Dynamic Page Classification and Page Size Determination. Proceedings of the 29th Symposium on Operating Systems Principles (17-34). DOI: 10.1145/3600006.3613167. Online publication date: 23-Oct-2023
