Hardware-software trade-offs in a direct Rambus implementation of the RAMpage memory hierarchy

Authors:

Philip Machanick,

Pierre Salverda,

Lance Pompe

ACM SIGPLAN Notices, Volume 33, Issue 11

Pages 105 - 114

https://doi.org/10.1145/291006.291032

Published: 01 October 1998

Abstract

The RAMpage memory hierarchy is an alternative to the traditional division between cache and main memory: main memory is moved up a level and DRAM is used as a paging device. The idea behind RAMpage is to reduce hardware complexity, if at the cost of software complexity, with a view to allowing more flexible memory system design. This paper investigates some issues in choosing between RAMpage and a conventional cache architecture, with a view to illustrating trade-offs which can be made in choosing whether to place complexity in the memory system in hardware or in software. Performance results in this paper are based on a simple Rambus implementation of DRAM, with performance characteristics of Direct Rambus, which should be available in 1999. This paper explores the conditions under which it becomes feasible to perform a context switch on a miss in the RAMpage model, and the conditions under which RAMpage is a win over a conventional cache architecture: as the CPU-DRAM speed gap grows, RAMpage becomes more viable.
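
The abstract's last point can be illustrated with a deliberately simplified model. This is a sketch under stated assumptions, not the paper's simulation: it assumes a context switch on a miss is worthwhile only when the DRAM miss penalty, measured in CPU cycles, exceeds the cost of the switch. The function names, the break-even rule, and all numbers used with them are hypothetical and do not come from the paper.

```python
def miss_penalty_cycles(dram_latency_ns: float, cpu_clock_ghz: float) -> float:
    """DRAM miss latency expressed in CPU cycles (ns times cycles per ns)."""
    return dram_latency_ns * cpu_clock_ghz


def switch_on_miss_pays_off(dram_latency_ns: float,
                            cpu_clock_ghz: float,
                            switch_overhead_cycles: float) -> bool:
    """Assumed break-even rule: switch only if stalling would cost more
    cycles than the context switch itself (hypothetical simplification)."""
    penalty = miss_penalty_cycles(dram_latency_ns, cpu_clock_ghz)
    return penalty > switch_overhead_cycles


if __name__ == "__main__":
    # Hypothetical values: fixed DRAM latency, rising CPU clock rate.
    for ghz in (0.5, 2.0, 8.0, 16.0):
        cycles = miss_penalty_cycles(50.0, ghz)
        worthwhile = switch_on_miss_pays_off(50.0, ghz, 400.0)
        print(f"{ghz} GHz: miss costs {cycles} cycles, switch worthwhile: {worthwhile}")
```

Under this model, holding DRAM latency fixed while the clock rate rises makes each miss cost more cycles, so a fixed switch overhead is eventually amortized. That matches the direction of the abstract's claim, though the actual thresholds depend on details the abstract does not give.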






Published In

ACM SIGPLAN Notices Volume 33, Issue 11

Nov. 1998

309 pages

ISSN:0362-1340

EISSN:1558-1160

DOI:10.1145/291006

Chairmen: Dileep Bhandarkar (Intel), Anant Agarwal (Massachusetts Institute of Technology, Cambridge)

ASPLOS VIII: Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
October 1998
326 pages
ISBN:1581131070
DOI:10.1145/291069
Chairmen: Dileep Bhandarkar (Intel), Anant Agarwal (Massachusetts Institute of Technology, Cambridge)

Copyright © 1998 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

