Superscalar multiprocessor design
January 1991
Publisher:
  • Prentice-Hall, Inc.
  • Division of Simon and Schuster, One Lake Street, Upper Saddle River, NJ
  • United States
ISBN: 978-0-13-875634-5
Published: 03 January 1991
Pages: 288
Cited By

  1. Gebregiorgis A, Du Nguyen H, Yu J, Bishnoi R, Taouil M, Catthoor F and Hamdioui S (2022). A Survey on Memory-centric Computer Architectures, ACM Journal on Emerging Technologies in Computing Systems, 18:4, (1-50), Online publication date: 31-Oct-2022.
  2. Yu J, Yan M, Khyzha A, Morrison A, Torrellas J and Fletcher C (2021). Speculative taint tracking (STT), Communications of the ACM, 64:12, (105-112), Online publication date: 1-Dec-2021.
  3. Christie D, Clark M and Schulte M (2021). What Made Us Stronger: An Inside Look Back at the History of AMD Microprocessor Development, IEEE Micro, 41:6, (29-36), Online publication date: 1-Nov-2021.
  4. Aşılıoğlu G, Jin Z, Köksal M, Javeri O and Önder S (2015). LaZy superscalar, ACM SIGARCH Computer Architecture News, 43:3S, (260-271), Online publication date: 4-Jan-2016.
  5. Aşılıoğlu G, Jin Z, Köksal M, Javeri O and Önder S. LaZy superscalar, Proceedings of the 42nd Annual International Symposium on Computer Architecture, (260-271)
  6. Jin Z, Aşilioğlu G and Önder S. Mower, Proceedings of the 29th ACM on International Conference on Supercomputing, (285-294)
  7. Dubey P, O'Brien K, O'Brien K and Barton C. Single-program speculative multithreading (SPSM) architecture, Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques, (109-121)
  8. Farrens M, Tyson G and Pleszkun A. A study of single-chip processor/cache organizations for large numbers of transistors, Proceedings of the 21st annual international symposium on Computer architecture, (338-347)
  9. Farrens M, Tyson G and Pleszkun A (1994). A study of single-chip processor/cache organizations for large numbers of transistors, ACM SIGARCH Computer Architecture News, 22:2, (338-347), Online publication date: 1-Apr-1994.
Contributors
  • Advanced Micro Devices, Inc.

Reviews

Ashoke Deb

Miniaturization of computers, and the resulting proximity of their building blocks, usually shortens data transfer time. To maximize the execution speed of programs, many interrelated parameters need to be considered. The total execution time of a program can be reduced by minimizing the number of instructions per program, the average number of cycles per instruction, and the processor clock cycle. Achieving these goals requires investigating issues such as the size and complexity (packing) of an instruction (RISC, CISC, and VLIW); parallelism (data parallelism and instruction parallelism); multiplicity of operations (multiple load/store, parallel fetch, parallel decode, and multiple functional units); detection and exploitation of parallelism (out-of-order issue and out-of-order execution); design and control of the pipelines; memory architecture; and the bus or communication network among the elements of the machine. Johnson focuses on these design issues as they relate to general-purpose superscalar RISC machines.
The book contains 12 chapters, more than 150 figures and charts, 10 tables, an appendix, and a bibliography of more than 80 recent papers. The tables and figures are insightful; for example, tables show a “Comparison of Scalar and Superscalar Pipelines” and a “Critical Path for Central Window Issuing.” Figures include such items as “Performance Growth of Scalar and Superscalar Processor,” “Performance of Scoreboarding Compared to Renaming,” and “Relative Contribution of Reservation Stations to Lost Instruction Bandwidth.”
Chapter 1 is “Beyond Pipelining, CISC, and RISC.” Chapter 2, “An Introduction to Superscalar Concepts,” covers their fundamental limitations, instruction issues and machine parallelism, the related concepts of VLIW and superpipelined processors, and unrelated parallel schemes.
Chapter 3, “Developing an Execution Model,” discusses the simulation technique, benchmarking performance, basic observations on hardware design, the design of the standard processor, the real performance limit, and background. In Chapter 4, “Instruction Fetching and Decoding,” Johnson presents branches and instruction-fetch inefficiencies, improving fetch efficiency, implementing hardware branch-prediction, implementing a four-instruction decoder, implementing branches, and reducing the penalty of procedural dependencies. Chapter 5, “The Role of Exception Recovery,” includes buffering state information for restart, restart implementation and its effect on performance, and observations on processor restart. Chapter 6, “Register Dataflow,” covers dependency mechanisms, result buses and arbitration, result forwarding, and supplying instruction operands. Chapter 7, “Out-of-Order Issue,” discusses reservation stations and implementing a central instruction window, and offers some observations. Chapter 8, “Memory Dataflow,” presents the ordering of loads and stores, and addressing and dependencies. Johnson then asks, “What Is More Load/Store Parallelism Worth?” and discusses “Esoterica: Multiprocessing Considerations” and some observations on accessing external data. Chapter 9, “Complexity and Controversy,” contains a brief glimpse at design complexity, major hardware features, hardware simplifications, and a section that asks “Is the Complexity Worth It?” Chapter 10, “Basic Software Scheduling,” covers the benefits of scheduling, program information needed for scheduling, the relationship of the scheduler and the compiler, and algorithms for scheduling basic blocks. It concludes by revisiting the hardware. Chapter 11, “Software Scheduling Across Branches,” discusses trace scheduling, loop unrolling, software pipelining, global code motion, and out-of-order issue and scheduling across branches.
Chapter 12, “Evaluating Alternatives: A Perspective on Superscalar Microprocessors,” is divided into two sections: “The Case for Software Solutions” and “The Case for Hardware Solutions.” An appendix presents the architecture and implementation of a superscalar 386.
It should be apparent that this work is not a general-purpose book on computer organization or computer architecture. According to the author, “This book is intended as a technical tutorial and introduction for engineers and computer scientists as well as a graduate-level text for students who have a strong background in computer architecture.” It is both specialized and special. In this extremely well-written text, Johnson systematically guides the reader through the issues, problems, and choices of a machine designer with a pragmatic viewpoint. The pragmatism is derived from extensive simulation studies using a collection of general-purpose programs, such as awk, simple, LINPACK, yacc, Whetstone, and LaTeX. I recommend this book highly for anyone interested in computer architecture.
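The review's opening point, that execution time depends on the instruction count, the average cycles per instruction, and the clock cycle, can be sketched as a simple product of those three factors. The Python below is only an illustration: the function name and every number in it are hypothetical and are not taken from the book or the review.

```python
def execution_time(instruction_count, cycles_per_instruction, clock_period_s):
    """Execution time as the product of the three factors named in the review."""
    return instruction_count * cycles_per_instruction * clock_period_s


# Hypothetical workload: one million instructions, 40 ns clock period (25 MHz).
# The CPI values are assumptions chosen only to show the effect of issuing
# more than one instruction per cycle on average.
scalar_time = execution_time(1_000_000, 1.5, 40e-9)
superscalar_time = execution_time(1_000_000, 0.75, 40e-9)

# Same program and clock, so the speedup comes entirely from the lower CPI.
speedup = scalar_time / superscalar_time
print(f"scalar: {scalar_time:.3f} s, superscalar: {superscalar_time:.3f} s, "
      f"speedup: {speedup:.1f}x")
```

With the instruction count and clock period held fixed, halving the average cycles per instruction halves the execution time, which is the lever a superscalar design pulls.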
