
Surpassing the TLB performance of superpages with less operating system support

Published: 01 November 1994

Abstract

Many commercial microprocessor architectures have added translation lookaside buffer (TLB) support for superpages. Superpages differ from segments because their size must be a power of two multiple of the base page size and they must be aligned in both virtual and physical address spaces. Very large superpages (e.g., 1MB) are clearly useful for mapping special structures, such as kernel data or frame buffers. This paper considers the architectural and operating system support required to exploit medium-sized superpages (e.g., 64KB, i.e., sixteen times a 4KB base page size). First, we show that superpages improve TLB performance only after invasive operating system modifications that introduce considerable overhead.
We then propose two subblock TLB designs as alternate ways to improve TLB performance. Analogous to a subblock cache, a complete-subblock TLB associates a tag with a superpage-sized region but has valid bits, physical page number, attributes, etc., for each possible base page mapping. A partial-subblock TLB entry is much smaller than a complete-subblock TLB entry, because it shares physical page number and attribute fields across base page mappings. A drawback of a partial-subblock TLB is that base page mappings can share a TLB entry only if they map to consecutive physical pages and have the same attributes. We propose a physical memory allocation algorithm, page reservation, that makes this sharing more likely. When page reservation is used, experimental results show partial-subblock TLBs perform better than superpage TLBs, while requiring simpler operating system changes. If operating system changes are inappropriate, however, complete-subblock TLBs perform best.
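The sharing rule for a partial-subblock TLB entry can be illustrated with a small model. The following is a minimal sketch, not the paper's design: the class and method names, the string encoding of attributes, and the toy allocator are all hypothetical. It also assumes that "consecutive physical pages" means each base page sits at the same offset inside an aligned, superpage-sized physical block as it does inside its virtual block; the abstract does not spell out that placement rule.

```python
from dataclasses import dataclass

BASE_PAGE = 4 * 1024                 # 4KB base page (from the abstract)
SUPERPAGE = 64 * 1024                # 64KB medium-sized superpage (from the abstract)
SUBBLOCKS = SUPERPAGE // BASE_PAGE   # 16 base pages per superpage-sized region


@dataclass
class PartialSubblockEntry:
    """One entry: a tag plus PPN and attribute fields shared by up to 16 base pages."""
    vblock: int      # tag: virtual page number // SUBBLOCKS
    pblock: int      # shared physical field: physical page number // SUBBLOCKS
    attrs: str       # shared attributes (hypothetical encoding, e.g. "rw")
    valid: int = 0   # one valid bit per base page in the region

    def try_add(self, vpn: int, ppn: int, attrs: str) -> bool:
        """Record vpn -> ppn only if it can share this entry's fields."""
        idx = vpn % SUBBLOCKS
        shareable = (
            vpn // SUBBLOCKS == self.vblock
            and ppn // SUBBLOCKS == self.pblock
            and ppn % SUBBLOCKS == idx   # assumed placement rule (see lead-in)
            and attrs == self.attrs
        )
        if shareable:
            self.valid |= 1 << idx
        return shareable

    def lookup(self, vpn: int):
        """Return the physical page number on a hit, or None on a miss."""
        idx = vpn % SUBBLOCKS
        if vpn // SUBBLOCKS == self.vblock and (self.valid >> idx) & 1:
            return self.pblock * SUBBLOCKS + idx
        return None


class ReservingAllocator:
    """Toy page reservation: the first touch of a virtual block reserves a
    whole aligned physical block, so later pages land at matching offsets.
    A real allocator would also have to reclaim unused reservations."""

    def __init__(self) -> None:
        self.next_pblock = 0
        self.reserved = {}   # vblock -> reserved pblock

    def allocate(self, vpn: int) -> int:
        vblock = vpn // SUBBLOCKS
        if vblock not in self.reserved:
            self.reserved[vblock] = self.next_pblock
            self.next_pblock += 1
        return self.reserved[vblock] * SUBBLOCKS + vpn % SUBBLOCKS
```

In this model, pages handed out by `ReservingAllocator` always pass `try_add`, while a page placed in an unrelated frame or given different attributes is rejected and would need a TLB entry of its own. That is the sharing restriction the abstract describes as the drawback of the partial-subblock design, and the reason an allocation policy matters.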



Published In

ACM SIGOPS Operating Systems Review, Volume 28, Issue 5
Dec. 1994
323 pages
ISSN: 0163-5980
DOI: 10.1145/381792

ASPLOS VI: Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems
November 1994
341 pages
ISBN: 0897916603
DOI: 10.1145/195473

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Cited By

  • (2021) Rebooting virtual memory with midgard. Proceedings of the 48th Annual International Symposium on Computer Architecture, pp. 512-525. DOI: 10.1109/ISCA52012.2021.00047. Online publication date: 14-Jun-2021
  • (2021) Exploiting page table locality for agile TLB prefetching. Proceedings of the 48th Annual International Symposium on Computer Architecture, pp. 85-98. DOI: 10.1109/ISCA52012.2021.00016. Online publication date: 14-Jun-2021
  • (2020) Superpage-Friendly Page Table Design for Hybrid Memory Systems. Data Science, pp. 623-641. DOI: 10.1007/978-981-15-7981-3_46. Online publication date: 20-Aug-2020
  • (2020) CoPTA: Contiguous Pattern Speculating TLB Architecture. Embedded Computer Systems: Architectures, Modeling, and Simulation, pp. 67-83. DOI: 10.1007/978-3-030-60939-9_5. Online publication date: 7-Oct-2020
  • (2019) Prefetched Address Translation. Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, pp. 1023-1036. DOI: 10.1145/3352460.3358294. Online publication date: 12-Oct-2019
  • (2015) Redundant memory mappings for fast access to large memories. ACM SIGARCH Computer Architecture News, 43:3S, pp. 66-78. DOI: 10.1145/2872887.2749471. Online publication date: 13-Jun-2015
  • (2014) BarTLB: Barren page resistant TLB for managed runtime languages. 2014 IEEE 32nd International Conference on Computer Design (ICCD), pp. 270-277. DOI: 10.1109/ICCD.2014.6974692. Online publication date: Oct-2014
  • (2010) Memory Systems. The Electrical Engineering Handbook, Second Edition. DOI: 10.1201/9781420049763.ch88. Online publication date: 12-Mar-2010
  • (2010) Introduction to the wire-speed processor and architecture. IBM Journal of Research and Development, 54:1, pp. 27-37. DOI: 10.1147/JRD.2009.2036980. Online publication date: 1-Jan-2010
  • (2009) IBM system z10 support for large pages. IBM Journal of Research and Development, 53:1, pp. 183-190. DOI: 10.5555/1850618.1850635. Online publication date: 1-Jan-2009
