
Minimizing register requirements under resource-constrained rate-optimal software pipelining

Published: 30 November 1994
    Abstract

    In this paper we address the following software pipelining problem: given a loop and a machine architecture with a fixed number of processor resources (e.g., function units), how can one construct a software-pipelined schedule that runs on the given architecture at the maximum possible iteration rate (à la rate-optimal) while minimizing the number of registers?
    The main contributions of this paper are:
    • First, we demonstrate that such a problem can be described by a simple mathematical formulation with precise optimization objectives under a periodic linear scheduling framework. The mathematical formulation provides a clear picture that permits one to visualize the overall solution space (for rate-optimal schedules) under different sets of constraints.
    • Second, we show that a precise mathematical formulation and its solution do make a significant performance difference! We evaluated the performance of our method against three other leading contemporary heuristic methods: Huff's Slack Scheduling; Wang, Eisenbeis, Jourdan and Su's FRLC; and Gasperoni and Schwiegelshohn's modified list scheduling. Experimental results show that the method described in this paper performed significantly better than these methods.
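
    The problem statement above can be illustrated with a small brute-force sketch. This is not the paper's formulation or solver: the four-instruction loop, the latencies, the one-unit-per-type machine, the fully pipelined function units, and the per-value "lifetime rounded up to whole periods" register estimate are all assumptions made for illustration. The sketch tries periods T in increasing order, so the first feasible T is the rate-optimal one for this toy model, and then picks the feasible schedule with the smallest register estimate.

```python
from itertools import product

# Hypothetical loop body (assumption): name -> (latency, function-unit type).
OPS = {"load": (2, "mem"), "mul": (2, "alu"), "add": (1, "alu"), "store": (1, "mem")}
# Dependence edges (assumption): (producer, consumer, iteration distance).
EDGES = [("load", "mul", 0), ("mul", "add", 0), ("add", "add", 1), ("add", "store", 0)]
# Assumed machine: one fully pipelined unit of each type.
UNITS = {"mem": 1, "alu": 1}


def feasible(start, T):
    """Check dependence and modulo resource constraints for period T."""
    # A consumer that is `dist` iterations later starts T * dist cycles later;
    # it must not start before the producer's result is available.
    for src, dst, dist in EDGES:
        if start[dst] + T * dist < start[src] + OPS[src][0]:
            return False
    # Operations issued in the same slot (mod T) compete for the same units.
    usage = {}
    for op, t in start.items():
        key = (OPS[op][1], t % T)
        usage[key] = usage.get(key, 0) + 1
        if usage[key] > UNITS[OPS[op][1]]:
            return False
    return True


def registers(start, T):
    """Register estimate: each value's lifetime rounded up to whole periods."""
    total = 0
    for src in OPS:
        ends = [start[dst] + T * dist for s, dst, dist in EDGES if s == src]
        if ends:
            lifetime = max(ends) - start[src]
            total += -(-lifetime // T)  # ceiling division
    return total


def best_schedule(max_T=8):
    """Return (T, register estimate, start times) or None if nothing fits."""
    names = list(OPS)
    for T in range(1, max_T + 1):  # smallest feasible T = highest iteration rate
        horizon = sum(lat for lat, _ in OPS.values()) + T
        candidates = []
        for times in product(range(horizon), repeat=len(names)):
            start = dict(zip(names, times))
            if feasible(start, T):
                candidates.append((registers(start, T), times))
        if candidates:
            regs, times = min(candidates)
            return T, regs, dict(zip(names, times))
    return None


if __name__ == "__main__":
    print(best_schedule())
```

    For this toy loop the search settles on T = 2, because two memory operations share one memory unit, and on a register estimate of 4. Exhaustive enumeration only works at this scale; the abstract's point is that a precise mathematical formulation can be solved directly instead.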

    References

    [1]
    A. Aiken and A. Nicolau. A realistic resource-constrained software pipelining algorithm. In A. Nicolau, D. Gelernter, T. Gross, and D. Padua, editors, Advances in Languages and Compilers for Parallel Processing, Res. Monographs in Parallel and Distrib. Computing, chapter 14, pages 274-290. 1991.
    [2]
    J. R. Allen, K. Kennedy, C. Porterfield, and J. Warren. Conversion of control dependence to data dependence. In Conf. Rec. of the Tenth Ann. ACM Symp. on Principles of Programming Languages, pages 177-189, Austin, TX, Jan. 24-26, 1983.
    [3]
    E. R. Altman, R. Govindarajan, and G. R. Gao. Software pipelining to minimize registers and resources. ACAPS Technical Memo 79, School of Computer Science, McGill University, Montréal, Qué., 1994. In preparation.
    [4]
    J. C. Dehnert and R. A. Towle. Compiling for Cydra 5. Journal of Supercomputing, 7:181-227, May 1993.
    [5]
    K. Ebcioğlu and T. Nakatani. A new compilation technique for parallelizing loops with unpredictable branches on a VLIW architecture. In D. Gelernter, A. Nicolau, and D. Padua, editors, Languages and Compilers for Parallel Computing, Res. Monographs in Parallel and Distrib. Computing, chapter 12, pages 213-229. 1990.
    [6]
    F. Gasperoni and U. Schwiegelshohn. Efficient algorithms for cyclic scheduling. Res. Rep. RC 17068, IBM T. J. Watson Res. Center, Yorktown Heights, NY, 1991.
    [7]
    M. B. Girkar, M. R. Haghighat, C. L. Lee, B. P. Leung, and D. A. Schouten. Parafrase-2 user's manual. TR RC-17068(#75743), Center for Supercomputing Research and Development, University of Illinois at Urbana-Champaign, IL, 1991.
    [8]
    R. Govindarajan, E. R. Altman, and G. R. Gao. Minimizing register requirement in resource-constrained software pipelining. ACAPS Technical Memo 80, School of Computer Science, McGill University, Montréal, Qué., 1994.
    [9]
    R. A. Huff. Lifetime-sensitive modulo scheduling. In Proc. of the SIGPLAN '93 Conf. on Programming Language Design and Implementation, pages 258-267, Albuquerque, NM, Jun. 23-25, 1993.
    [10]
    C.-T. Hwang, J.-H. Lee, and Y.-C. Hsu. A formal approach to the scheduling problem in high-level synthesis. IEEE Trans. on Computer-Aided Design, 10(4):464-475, Apr. 1991.
    [11]
    M. Lam. Software pipelining: An effective scheduling technique for VLIW machines. In Proc. of the SIGPLAN '88 Conf. on Programming Language Design and Implementation, pages 318-328, Atlanta, GA, Jun. 22-24, 1988.
    [12]
    S. Moon and K. Ebcioğlu. An efficient resource-constrained global scheduling technique for superscalar and VLIW processors. In Proc. of the 25th Ann. Intl. Symp. on Microarchitecture, pages 55-71, Portland, OR, Dec. 1-4, 1992.
    [13]
    Q. Ning and G. R. Gao. A novel framework of register allocation for software pipelining. In Conf. Rec. of the Twentieth Ann. ACM SIGPLAN-SIGACT Symp. on Principles of Programming Languages, pages 29-42, Charleston, SC, Jan. 10-13, 1993.
    [14]
    M. Rajagopalan and V. H. Allan. Efficient scheduling of fine grain parallelism in loops. In Proc. of the 26th Ann. Intl. Symp. on Microarchitecture, pages 2-11, Austin, TX, Dec. 1-3, 1993.
    [15]
    S. Ramakrishnan. Software pipelining in PA-RISC compilers. Hewlett-Packard J., pages 39-45, Jun. 1992.
    [16]
    B. R. Rau and J. A. Fisher. Instruction-level parallel processing: History, overview and perspective. J. of Supercomputing, 7:9-50, May 1993.
    [17]
    B. R. Rau and C. D. Glaeser. Some scheduling techniques and an easily schedulable horizontal architecture for high performance scientific computing. In Proc. of the 14th Ann. Microprogramming Work., pages 183-198, Chatham, MA, Oct. 12-15, 1981.
    [18]
    B. R. Rau, M. Lee, P. P. Tirumalai, and M. S. Schlansker. Register allocation for software pipelined loops. In Proc. of the SIGPLAN '92 Conf. on Programming Language Design and Implementation, pages 283-299, San Francisco, CA, Jun. 17-19, 1992.
    [19]
    B. R. Rau, M. S. Schlansker, and P. P. Tirumalai. Code generation schema for modulo scheduled loops. In Proc. of the 25th Ann. Intl. Symp. on Microarchitecture, pages 158-169, Portland, OR, Dec. 1-4, 1992.
    [20]
    B. R. Rau, D. W. L. Yen, W. Yen, and R. A. Towle. The Cydra 5 departmental supercomputer. Computer, 22(1):12-35, Jan. 1989.
    [21]
    R. Reiter. Scheduling parallel computations. J. of the ACM, 15(4):590-599, Oct. 1968.
    [22]
    R. F. Touzeau. A Fortran compiler for the FPS-164 scientific computer. In Proc. of the SIGPLAN '84 Symp. on Compiler Construction, pages 48-57, Montréal, Qué., Jun. 17-22, 1984.
    [23]
    J. Wang, C. Eisenbeis, M. Jourdan, and B. Su. DEcomposed Software Pipelining: A new approach to exploit instruction-level parallelism for loop programs. Res. Rep. RR-1838, INRIA-Rocquencourt, France, Jan. 1993.
    [24]
    N. J. Warter, S. A. Mahlke, W. Hwu, and B. Ramakrishna Rau. Reverse If-Conversion. In Proc. of the SIGPLAN '93 Conf. on Programming Language Design and Implementation, pages 290-299, Albuquerque, NM, Jun. 23-25, 1993.

    Published In

    MICRO 27: Proceedings of the 27th annual international symposium on Microarchitecture
    November 1994
    233 pages
    ISBN:0897917073
    DOI:10.1145/192724
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • Article

    Conference

    MICRO94
    MICRO94: 27th Annual International Symposium on Microarchitecture
    November 30 - December 2, 1994
    San Jose, California, USA

    Acceptance Rates

    Overall Acceptance Rate 484 of 2,242 submissions, 22%

    Article Metrics

    • Downloads (last 12 months): 49
    • Downloads (last 6 weeks): 9
    Reflects downloads up to 26 Jul 2024

    Cited By

    • (2023) Long-life Sensitive Modulo Scheduling with Adaptive Loop Expansion. 2022 IEEE 28th International Conference on Parallel and Distributed Systems (ICPADS), pages 530-537. DOI: 10.1109/ICPADS56603.2022.00075. Online publication date: Jan-2023
    • (2022) Adaptive Low-Cost Loop Expansion for Modulo Scheduling. Network and Parallel Computing, pages 30-41. DOI: 10.1007/978-3-031-21395-3_3. Online publication date: 1-Dec-2022
    • (2019) Survey on Combinatorial Register Allocation and Instruction Scheduling. ACM Computing Surveys, 52(3):1-50. DOI: 10.1145/3200920. Online publication date: 18-Jun-2019
    • (2016) Modulo Scheduling. Instruction Level Parallelism, pages 133-165. DOI: 10.1007/978-1-4899-7797-7_6. Online publication date: 30-Nov-2016
    • (2014) Improving performance of loops on DIAM-based VLIW architectures. ACM SIGPLAN Notices, 49(5):135-144. DOI: 10.1145/2666357.2597825. Online publication date: 12-Jun-2014
    • (2014) Improving performance of loops on DIAM-based VLIW architectures. Proceedings of the 2014 SIGPLAN/SIGBED conference on Languages, compilers and tools for embedded systems, pages 135-144. DOI: 10.1145/2597809.2597825. Online publication date: 12-Jun-2014
    • (2014) Flushing-Enabled Loop Pipelining for High-Level Synthesis. Proceedings of the 51st Annual Design Automation Conference, pages 1-6. DOI: 10.1145/2593069.2593143. Online publication date: 1-Jun-2014
    • (2014) Optimum modulo schedules for minimum register requirements. ACM International Conference on Supercomputing 25th Anniversary Volume, pages 227-236. DOI: 10.1145/2591635.2667171. Online publication date: 10-Jun-2014
    • (2014) Author retrospective for optimum modulo schedules for minimum register requirements. ACM International Conference on Supercomputing 25th Anniversary Volume, pages 35-36. DOI: 10.1145/2591635.2591653. Online publication date: 10-Jun-2014
    • (2014) Predicate-aware, makespan-preserving software pipelining of scheduling tables. ACM Transactions on Architecture and Code Optimization, 11(1):1-26. DOI: 10.1145/2579676. Online publication date: 1-Feb-2014