The optimum pipeline depth for a microprocessor

Authors:

A. Hartstein,

Thomas R. Puzak

ACM SIGARCH Computer Architecture News, Volume 30, Issue 2

Pages 7 - 13

https://doi.org/10.1145/545214.545217

Published: 01 May 2002

Abstract

The impact of pipeline length on the performance of a microprocessor is explored both theoretically and by simulation. An analytical theory is presented that shows that two opposing architectural parameters affect the optimal pipeline length: the degree of instruction-level parallelism (superscalar) decreases the optimal pipeline length, while the lack of pipeline stalls increases the optimal pipeline length. This theory is tested by analyzing the optimal pipeline length for 35 applications representing three classes of workloads. Trace tapes are collected from SPEC95 and SPEC2000 applications, traditional (legacy) database and on-line transaction processing (OLTP) applications, and modern (e.g., web) applications primarily written in Java and C++. The results show that there is a clear and significant difference in the optimal pipeline length between the SPEC workloads and both the legacy and modern applications. The SPEC applications, written in C, optimize to a shorter pipeline length than the legacy applications, largely written in assembler language, with relatively little overlap in the two distributions. Additionally, the optimal pipeline length distribution for the C++ and Java workloads overlaps with the legacy applications, suggesting similar workload characteristics. These results are explored across a wide range of superscalar processors, both in-order and out-of-order.
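The two opposing effects named in the abstract can be illustrated with a small numerical sketch. The abstract does not state the model, so everything below is an assumption made for illustration: the cycle time is taken to be a latch overhead `t_o` plus the total logic delay `t_p` divided by the depth `p`, the busy time per instruction shrinks with the superscalar degree `alpha`, and each hazard stalls a fraction `gamma` of the pipeline. The function names, parameter names, and numeric values are hypothetical and are not taken from the paper.

```python
import math

def time_per_instruction(p, t_p, t_o, alpha, gamma, hazard_rate):
    """Time per instruction under the assumed model (not the paper's own code).

    p           pipeline depth (stages)
    t_p         total logic delay of the unpipelined processor
    t_o         latch overhead added per stage
    alpha       average degree of superscalar issue
    gamma       fraction of the pipeline stalled per hazard
    hazard_rate hazards per instruction
    """
    cycle_time = t_o + t_p / p
    busy = cycle_time / alpha                       # shrinks as ILP grows
    stall = gamma * hazard_rate * p * cycle_time    # grows with depth
    return busy + stall

def optimal_depth(t_p, t_o, alpha, gamma, hazard_rate):
    """Depth that minimizes time_per_instruction (set d/dp to zero)."""
    return math.sqrt(t_p / (alpha * gamma * hazard_rate * t_o))

# Hypothetical parameter values, chosen only to exercise the model.
base = dict(t_p=140.0, t_o=2.5, alpha=2.0, gamma=0.5, hazard_rate=0.1)
p_base = optimal_depth(**base)
p_more_ilp = optimal_depth(**{**base, "alpha": 4.0})
p_fewer_stalls = optimal_depth(**{**base, "hazard_rate": 0.05})
print(round(p_base, 2), round(p_more_ilp, 2), round(p_fewer_stalls, 2))
```

Under these assumptions, raising `alpha` lowers the optimal depth and lowering `hazard_rate` raises it, which is the qualitative behavior the abstract describes.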

Index Terms

The optimum pipeline depth for a microprocessor
1. Computer systems organization
  1. Embedded and cyber-physical systems
  2. Real-time systems
2. Hardware
  1. Integrated circuits
    1. Logic circuits

Published In

ACM SIGARCH Computer Architecture News Volume 30, Issue 2

Special Issue: Proceedings of the 29th annual international symposium on Computer architecture (ISCA '02)

May 2002

304 pages

ISSN:0163-5964

DOI:10.1145/545214

ISCA '02: Proceedings of the 29th annual international symposium on Computer architecture
May 2002
346 pages
ISBN:076951605X
Conference Chair: Yale Patt, The University of Texas at Austin
Program Chair: Dirk Grunwald, University of Colorado at Boulder
Publications Chair: Kevin Skadron, University of Virginia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Article
