The optimum pipeline depth for a microprocessor

Published: 01 May 2002

Abstract

The impact of pipeline length on the performance of a microprocessor is explored both theoretically and by simulation. An analytical theory is presented that shows two opposing architectural parameters affect the optimal pipeline length: the degree of instruction-level parallelism (superscalar) decreases the optimal pipeline length, while the lack of pipeline stalls increases the optimal pipeline length. This theory is tested by analyzing the optimal pipeline length for 35 applications representing three classes of workloads. Trace tapes are collected from SPEC95 and SPEC2000 applications, traditional (legacy) database and on-line transaction processing (OLTP) applications, and modern (e.g., web) applications primarily written in Java and C++. The results show that there is a clear and significant difference in the optimal pipeline length between the SPEC workloads and both the legacy and modern applications. The SPEC applications, written in C, optimize to a shorter pipeline length than the legacy applications, largely written in assembler language, with relatively little overlap in the two distributions. Additionally, the optimal pipeline length distribution for the C++ and Java workloads overlaps with the legacy applications, suggesting similar workload characteristics. These results are explored across a wide range of superscalar processors, both in-order and out-of-order.
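
The two opposing effects named in the abstract can be illustrated with a small toy model. The sketch below is an illustrative simplification, not necessarily the paper's exact formulation. It assumes that total logic delay t_p is split evenly across p stages, that each stage adds a fixed latch overhead t_o, that alpha instructions issue per cycle when nothing stalls, and that each hazard stalls the pipeline for a fraction gamma of its depth. All function names, parameter names, and numeric values are hypothetical.

```python
import math

def time_per_instruction(p, alpha, hazard_rate, gamma, t_p, t_o):
    """Average time per instruction at pipeline depth p (toy model, assumed form)."""
    cycle = t_o + t_p / p                    # cycle time: latch overhead + logic per stage
    busy = cycle / alpha                     # stall-free cost with alpha-wide issue
    stall = hazard_rate * gamma * p * cycle  # each hazard costs gamma * p cycles
    return busy + stall

def optimal_depth(alpha, hazard_rate, gamma, t_p, t_o):
    """Depth minimizing time_per_instruction, from setting dT/dp = 0."""
    return math.sqrt(t_p / (alpha * gamma * hazard_rate * t_o))

if __name__ == "__main__":
    # Illustrative values only; delays are in arbitrary gate-delay units.
    base = dict(alpha=2.0, hazard_rate=0.1, gamma=0.5, t_p=140.0, t_o=2.5)
    print("baseline optimum:", optimal_depth(**base))
    print("wider issue     :", optimal_depth(**{**base, "alpha": 4.0}))
    print("fewer hazards   :", optimal_depth(**{**base, "hazard_rate": 0.05}))
```

Under these assumptions the optimum scales as the inverse square root of both alpha and the hazard rate, so wider issue pushes the optimum toward shorter pipelines while fewer stalls pushes it toward longer ones, matching the qualitative trend the abstract describes.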

Published In

ACM SIGARCH Computer Architecture News  Volume 30, Issue 2
Special Issue: Proceedings of the 29th annual international symposium on Computer architecture (ISCA '02)
May 2002
304 pages
ISSN:0163-5964
DOI:10.1145/545214

ISCA '02: Proceedings of the 29th annual international symposium on Computer architecture
May 2002
346 pages
ISBN:076951605X
Conference Chair: Yale Patt
Program Chair: Dirk Grunwald
Publications Chair: Kevin Skadron

Publisher

Association for Computing Machinery

New York, NY, United States
