Article

A First-Order Superscalar Processor Model

Authors:

Tejas S. Karkhanis,

James E. Smith

ISCA '04: Proceedings of the 31st annual international symposium on Computer architecture

Page 338

Published: 02 March 2004

Abstract

A proposed performance model for superscalar processors consists of 1) a component that models the relationship between instructions issued per cycle and the size of the instruction window under ideal conditions, and 2) methods for calculating transient performance penalties due to branch mispredictions, instruction cache misses, and data cache misses. Using trace-derived data dependence information, data and instruction cache miss rates, and branch misprediction rates as inputs, the model can arrive at performance estimates for a typical superscalar processor that are within 5.8% of detailed simulation on average and within 13% in the worst case. The model also provides insights into the workings of superscalar processors and long-term microarchitecture trends such as pipeline depths and issue widths.
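The two-part structure described in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration: the power-law shape of the ideal IPC curve (`alpha`, `beta`), the cap at the issue width, the simple additive treatment of penalties, and all names and numbers. It is not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class MissEvent:
    """One class of miss event. All values used below are illustrative assumptions."""
    name: str
    events_per_instruction: float  # e.g. branch mispredictions per instruction
    penalty_cycles: float          # assumed average transient penalty per event


def ideal_ipc(window_size: int, alpha: float, beta: float, issue_width: int) -> float:
    """Ideal IPC as a function of window size.

    Assumption: a power-law fit alpha * W**beta, capped at the issue width.
    """
    return min(float(issue_width), alpha * window_size ** beta)


def estimate_cpi(window_size, alpha, beta, issue_width, miss_events):
    """Ideal CPI plus per-instruction miss-event penalties.

    Assumption: penalties are independent and simply add up.
    """
    base_cpi = 1.0 / ideal_ipc(window_size, alpha, beta, issue_width)
    penalty_cpi = sum(e.events_per_instruction * e.penalty_cycles for e in miss_events)
    return base_cpi + penalty_cpi


def relative_error(model_cpi: float, simulated_cpi: float) -> float:
    """Relative error of a model estimate against a detailed-simulation result."""
    return abs(model_cpi - simulated_cpi) / simulated_cpi


# Hypothetical inputs, chosen only to exercise the functions.
events = [
    MissEvent("branch_mispredict", 0.01, 10.0),
    MissEvent("icache_miss", 0.002, 10.0),
    MissEvent("dcache_miss", 0.001, 200.0),
]
cpi = estimate_cpi(window_size=64, alpha=1.0, beta=0.5, issue_width=4, miss_events=events)
print(f"estimated CPI: {cpi:.3f}")
```

In this sketch the inputs mirror the ones the abstract names: dependence-derived window behavior (folded into `alpha` and `beta`), cache miss rates, and the branch misprediction rate. A function like `relative_error` is how an estimate of this kind would be compared against detailed simulation.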

Published In

ISCA '04: Proceedings of the 31st annual international symposium on Computer architecture

June 2004

373 pages

ISBN: 0769521436

ACM SIGARCH Computer Architecture News Volume 32, Issue 2
ISCA 2004
March 2004
373 pages
ISSN: 0163-5964
DOI: 10.1145/1028176

Sponsors

SIGARCH: ACM Special Interest Group on Computer Architecture

Publisher

IEEE Computer Society

United States

Qualifiers

Article

Conference

ISCA04

Sponsor:

SIGARCH

ISCA04: The 31st Annual International Symposium on Computer Architecture 2004

June 19 - 23, 2004

München, Germany

Acceptance Rates

ISCA '04 Paper Acceptance Rate: 31 of 217 submissions (14%)

Overall Acceptance Rate: 543 of 3,203 submissions (17%)

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 166
Total Downloads: 1,939
Downloads (Last 12 months): 69
Downloads (Last 6 weeks): 6

Reflects downloads up to 27 Jan 2025

Citations

Cited By

Radulovic M, Sánchez Verdejo R, Carpenter P, Radojković P, Jacob B, Ayguadé E (2019). PROFET. Proceedings of the ACM on Measurement and Analysis of Computing Systems 3:2 (1-33). Online publication date: 19-Jun-2019.
https://dl.acm.org/doi/10.1145/3341617.3326149

Wang Y, Lee V, Wei G, Brooks D (2019). Predicting New Workload or CPU Performance by Analyzing Public Datasets. ACM Transactions on Architecture and Code Optimization 15:4 (1-21). Online publication date: 8-Jan-2019.
https://dl.acm.org/doi/10.1145/3284127

Grass T, Carlson T, Rico A, Ceballos G, Ayguade E, Casas M, Moreto M (2019). Sampled Simulation of Task-Based Programs. IEEE Transactions on Computers 68:2 (255-269). Online publication date: 1-Feb-2019.
https://dl.acm.org/doi/10.1109/TC.2018.2860012

Ji K, Ling M, Shi L, Pan J (2018). An Analytical Cache Performance Evaluation Framework for Embedded Out-of-Order Processors Using Software Characteristics. ACM Transactions on Embedded Computing Systems 17:4 (1-25). Online publication date: 9-Aug-2018.
https://dl.acm.org/doi/10.1145/3233182

Badr M, Jerger N (2018). Fast and Accurate Performance Analysis of Synchronization. Proceedings of the 9th International Workshop on Programming Models and Applications for Multicores and Manycores (31-40). Online publication date: 24-Feb-2018.
https://dl.acm.org/doi/10.1145/3178442.3178446

Dey M, Nazari A, Zajic A, Prvulovic M, Oskin M, Inoue K (2018). TEMProf. Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture (881-893). Online publication date: 20-Oct-2018.
https://dl.acm.org/doi/10.1109/MICRO.2018.00076

Jang H, Jo J, Lee J, Kim J, Oskin M, Inoue K (2018). RpStacks-MT. Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture (586-599). Online publication date: 20-Oct-2018.
https://dl.acm.org/doi/10.1109/MICRO.2018.00054

Van Den Steen S, Eeckhout L (2018). Modeling Superscalar Processor Memory-Level Parallelism. IEEE Computer Architecture Letters 17:1 (9-12). Online publication date: 1-Jan-2018.
https://dl.acm.org/doi/10.1109/LCA.2017.2701370

Cui W, Ding Y, Dangwal D, Holmes A, McMahan J, Javadi-Abhari A, Tzimpragos G, Chong F, Sherwood T (2018). Charm. Proceedings of the 45th Annual International Symposium on Computer Architecture (152-165). Online publication date: 2-Jun-2018.
https://dl.acm.org/doi/10.1109/ISCA.2018.00023

Ravi G, Lipasti M (2017). CHARSTAR. ACM SIGARCH Computer Architecture News 45:2 (147-160). Online publication date: 24-Jun-2017.
https://dl.acm.org/doi/10.1145/3140659.3080212
