DOI: 10.1145/30350.30383

WISQ: a restartable architecture using queues

Published: 01 June 1987

    Abstract

    This paper describes the WISQ architecture, which is designed to achieve high performance by exploiting new compiler technology and a highly segmented pipeline. A highly segmented pipeline permits a very-high-speed clock, but it also makes the pipeline relatively long, so a way must be provided to minimize the effects of pipeline bubbles caused by data and control dependencies. It is also important to support precise interrupts. These goals are met, in part, by providing a reorder buffer to help restore the machine to a precise state. The architecture then makes the pipelining visible to the programmer/compiler by making the reorder buffer accessible and by explicitly providing that issued instructions cannot be affected by immediately preceding ones. Compiler techniques have been identified that can take advantage of the reorder buffer and permit a sustained execution rate approaching or exceeding one instruction per clock. These techniques include trace scheduling and a relatively easy way to “undo” instructions if the predicted branch path is not taken. We have also studied ways to further reduce the effects of branches by not executing them in the execution unit: branches are detected and resolved in the instruction fetch unit, so the execution unit is sent a stream of instructions (without branches) that are guaranteed to execute.
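
    The role the abstract gives the reorder buffer (holding results until they can be retired in program order, and letting speculative instructions be undone) can be illustrated with a small sketch. This is a generic, simplified reorder-buffer model, not the WISQ design itself: the class `ReorderBuffer`, its methods `issue`, `complete`, `commit`, and `undo`, and the register names are all hypothetical, chosen only for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Entry:
    dest: str           # destination register name (hypothetical naming)
    value: int          # result waiting to be committed
    done: bool = False  # set once execution of the instruction has finished


class ReorderBuffer:
    """Toy model: results wait in a queue and retire in program order."""

    def __init__(self):
        self.regs = {}        # committed (precise) register state
        self.queue = deque()  # in-flight results, oldest first

    def issue(self, dest, value):
        # Allocate an entry at the tail in program order.
        entry = Entry(dest, value)
        self.queue.append(entry)
        return entry

    def complete(self, entry):
        # Instructions may finish in any order.
        entry.done = True

    def commit(self):
        # Retire finished entries from the head only, so the register
        # file always reflects a precise, in-order state.
        retired = 0
        while self.queue and self.queue[0].done:
            entry = self.queue.popleft()
            self.regs[entry.dest] = entry.value
            retired += 1
        return retired

    def undo(self, count):
        # Discard the youngest uncommitted entries, e.g. instructions
        # issued along a predicted branch path that was not taken.
        for _ in range(min(count, len(self.queue))):
            self.queue.pop()


def demo():
    rob = ReorderBuffer()
    a = rob.issue("r1", 10)
    b = rob.issue("r2", 20)
    rob.issue("r3", 30)    # assumed speculative: issued past a predicted branch
    rob.complete(b)        # b finishes before a (out of order)
    first = rob.commit()   # nothing retires, because the head (a) is not done
    rob.undo(1)            # prediction was wrong: drop the speculative entry
    rob.complete(a)
    second = rob.commit()  # a and then b retire, in program order
    return first, second, dict(rob.regs)
```

    In this sketch the speculative write to `r3` never reaches the committed registers, which is the property that makes both precise interrupts and cheap recovery from a mispredicted branch possible. How WISQ exposes its reorder buffer to the compiler is described in the paper, not here.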


    Published In

    ISCA '87: Proceedings of the 14th annual international symposium on Computer architecture
    June 1987
    321 pages
    ISBN:0818607769
    DOI:10.1145/30350
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • Article

    Conference

    ISCA87

    Acceptance Rates

    Overall Acceptance Rate 543 of 3,203 submissions, 17%

    Cited By

    • (1998) Instruction issue logic for high-performance, interruptable pipelined processors. 25 years of the international symposia on Computer architecture (selected papers), pp. 329-336. DOI: 10.1145/285930.285992. Online publication date: 1-Aug-1998
    • (1995) Exploiting short-lived variables in superscalar processors. Proceedings of the 28th annual international symposium on Microarchitecture, pp. 292-302. DOI: 10.5555/225160.225206. Online publication date: 1-Dec-1995
    • (1995) Exploiting short-lived variables in superscalar processors. Proceedings of the 28th Annual International Symposium on Microarchitecture, pp. 292-302. DOI: 10.1109/MICRO.1995.476839. Online publication date: Nov-1995
    • (1992) Efficient Instruction Sequencing with Inline Target Insertion. IEEE Transactions on Computers 41:12, pp. 1537-1551. DOI: 10.1109/12.214662. Online publication date: 1-Dec-1992
    • (1990) Instruction Issue Logic for High-Performance, Interruptible, Multiple Functional Unit, Pipelined Computers. IEEE Transactions on Computers 39:3, pp. 349-359. DOI: 10.1109/12.48865. Online publication date: 1-Mar-1990
    • (1989) Instruction fetch unit for parallel execution of branch instructions. Proceedings of the 3rd international conference on Supercomputing, pp. 417-426. DOI: 10.1145/318789.318884. Online publication date: 1-Jun-1989
    • (1987) Instruction issue logic for high-performance, interruptable pipelined processors. Proceedings of the 14th annual international symposium on Computer architecture, pp. 27-34. DOI: 10.1145/30350.30354. Online publication date: 1-Jun-1987
