Improving performance of small on-chip instruction caches

Published: 01 April 1989

Abstract

Most current single-chip processors employ an on-chip instruction cache to improve performance. A miss in this instruction cache will cause an external memory reference which must compete with data references for access to the external memory, thus affecting the overall performance of the processor. One common way to reduce the number of off-chip instruction requests is to increase the size of the on-chip cache. An alternative approach is presented in this paper, in which a combination of an instruction cache, instruction queue and instruction queue buffer is used to achieve the same effect with a much smaller instruction cache size. Such an approach is significant for emerging technologies where high circuit densities are initially difficult to achieve yet a high level of performance is desired, or for more mature technologies where chip area can be used to provide more functionality. The viability of this approach is demonstrated by its implementation in an existing single-chip processor.
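
The conventional remedy mentioned in the abstract, enlarging the on-chip cache to cut off-chip instruction requests, can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the paper: the direct-mapped organization, the four-word line size, the function name `count_misses`, and the synthetic loop trace. The sketch does not model the instruction queue or instruction queue buffer that the paper proposes.

```python
def count_misses(trace, num_lines, words_per_line=4):
    """Count misses of a direct-mapped instruction cache (hypothetical model).

    trace          -- iterable of instruction word addresses
    num_lines      -- number of cache lines (the cache size knob)
    words_per_line -- instruction words per line (assumed value)
    """
    tags = [None] * num_lines          # block currently held by each line
    misses = 0
    for addr in trace:
        block = addr // words_per_line  # memory block containing this word
        index = block % num_lines       # line that block maps to
        if tags[index] != block:        # miss: fetch the block from off chip
            tags[index] = block
            misses += 1
    return misses


# Synthetic trace (assumption): a 64-instruction loop body executed 10 times.
trace = [pc for _ in range(10) for pc in range(64)]

for num_lines in (4, 8, 16):
    print(num_lines, "lines:", count_misses(trace, num_lines), "misses")
```

For this particular trace the miss count stays high until the cache is large enough to hold the whole loop body, which is the cost the paper's alternative aims to avoid paying in chip area.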

Information & Contributors

Information

Published In

ACM SIGARCH Computer Architecture News, Volume 17, Issue 3
Special Issue: Proceedings of the 16th annual international symposium on Computer Architecture
June 1989
400 pages
ISSN:0163-5964
DOI:10.1145/74926
  • ACM Conferences
    ISCA '89: Proceedings of the 16th annual international symposium on Computer architecture
    April 1989
    426 pages
    ISBN:0897913191
    DOI:10.1145/74925

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 April 1989
Published in SIGARCH Volume 17, Issue 3

Qualifiers

  • Article

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 73
  • Downloads (Last 6 weeks): 20
Reflects downloads up to 15 Oct 2024

Cited By

  • (1995) Instruction fetching. ACM SIGARCH Computer Architecture News 23:2 (345-356). DOI: 10.1145/225830.224445. Online publication date: 1-May-1995
  • (1991) Implementation of the PIPE Processor. Computer 24:1 (65-70). DOI: 10.1109/2.67195. Online publication date: 1-Jan-1991
  • (1990) An evaluation of functional unit lengths for single-chip processors. Proceedings of the 23rd annual workshop and symposium on Microprogramming and microarchitecture (209-215). DOI: 10.5555/255237.255278. Online publication date: 30-Nov-1990
  • (1998) Improving direct-mapped cache performance by the addition of a small fully-associative cache prefetch buffers. 25 years of the international symposia on Computer architecture (selected papers) (388-397). DOI: 10.1145/285930.285998. Online publication date: 1-Aug-1998
  • (1996) Wrong-path instruction prefetching. Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture (165-175). DOI: 10.5555/243846.243882. Online publication date: 2-Dec-1996
  • (1995) SPAID. Proceedings of the 28th annual international symposium on Microarchitecture (231-236). DOI: 10.5555/225160.225197. Online publication date: 1-Dec-1995
  • (1995) Instruction fetching. Proceedings of the 22nd annual international symposium on Computer architecture (345-356). DOI: 10.1145/223982.224445. Online publication date: 1-Jul-1995
  • (1994) Optimal allocation of on-chip memory for multiple-API operating systems. ACM SIGARCH Computer Architecture News 22:2 (358-369). DOI: 10.1145/192007.192070. Online publication date: 1-Apr-1994
  • (1994) Optimal allocation of on-chip memory for multiple-API operating systems. Proceedings of the 21st annual international symposium on Computer architecture (358-369). DOI: 10.1145/191995.192070. Online publication date: 18-Apr-1994
