Meld Scheduling: A Technique for Relaxing Scheduling Constraints

Published: 01 August 1998

Abstract

Meld scheduling melds the schedules of neighboring scheduling regions so that they respect the latencies of operations issued in one region that complete only after control transfers to the other. In contrast, conventional schedulers ignore latency constraints from other regions, leading to potentially avoidable stalls on interlocked (superscalar) machines or incorrect schedules on noninterlocked (VLIW) machines. Alternatively, schedulers that conservatively require all operations to complete before the branch takes effect produce inefficient schedules. In this paper, we present general data structures for maintaining latency constraint information at region boundaries. We present a meld scheduling algorithm for noninterlocked processors that generates latency constraints at the boundaries of scheduled regions and uses this information when scheduling other regions. We present a range of design options and describe the reasons behind our particular choices. We evaluate the performance of meld scheduling across a range of machine models using a set of SPEC92 and UNIX benchmarks.
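The idea of carrying latency constraints across a region boundary can be illustrated with a minimal sketch. It is not the paper's data structure or algorithm: the names Op, exit_constraints, meld, and earliest_issue are hypothetical, only register flow (read-after-write) latencies are modeled, and cycle 0 of a successor region is assumed to coincide with the exit cycle of its predecessor.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Op:
    """Hypothetical operation record (not taken from the paper)."""
    name: str
    dest: Optional[str]      # register written, or None
    srcs: Tuple[str, ...]    # registers read
    latency: int             # cycles from issue until the result is usable


def exit_constraints(schedule: List[Tuple[Op, int]], exit_cycle: int) -> Dict[str, int]:
    """For a scheduled region, record how many cycles each register result
    is still outstanding when control leaves the region at exit_cycle."""
    pending: Dict[str, int] = {}
    for op, issue in schedule:
        if op.dest is None:
            continue
        remaining = issue + op.latency - exit_cycle
        if remaining > 0:
            pending[op.dest] = max(pending.get(op.dest, 0), remaining)
    return pending


def meld(*boundaries: Dict[str, int]) -> Dict[str, int]:
    """Combine the constraints of several predecessors by keeping, for each
    register, the largest outstanding latency (the safe choice when the
    hardware does not interlock)."""
    merged: Dict[str, int] = {}
    for pending in boundaries:
        for reg, remaining in pending.items():
            merged[reg] = max(merged.get(reg, 0), remaining)
    return merged


def earliest_issue(op: Op, entry: Dict[str, int]) -> int:
    """Earliest cycle in the successor region at which op may issue, given
    the latency constraints inherited at region entry."""
    return max((entry.get(reg, 0) for reg in op.srcs), default=0)


# Two already-scheduled predecessor regions, each exiting at cycle 4.
region_a = [(Op("load", "r1", ("r9",), 3), 3), (Op("add", "r2", ("r8",), 1), 2)]
region_b = [(Op("mul", "r1", ("r7",), 3), 2), (Op("load", "r3", ("r6",), 4), 3)]

entry = meld(exit_constraints(region_a, 4), exit_constraints(region_b, 4))
consumer = Op("sub", "r4", ("r1", "r3"), 1)
print(entry, earliest_issue(consumer, entry))
```

In this sketch a scheduler that ignored the inherited constraints could place the consumer at cycle 0 of the successor region, which would read stale registers on a noninterlocked machine, while a scheduler that forced every operation to complete before the branch would have to lengthen both predecessor regions instead.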

Cited By

  • (2002) Backtracking-Based Instruction Scheduling to Fill Branch Delay Slots. International Journal of Parallel Programming 30:6 (397-418). DOI: 10.1023/A:1020601110391. Online publication date: 1-Dec-2002

Published In

International Journal of Parallel Programming, Volume 26, Issue 4
August 1998
189 pages

Publisher

Kluwer Academic Publishers

United States

Author Tags

  1. COMPILER OPTIMIZATION
  2. GLOBAL SCHEDULING
  3. INSTRUCTION SCHEDULING
  4. INSTRUCTION-LEVEL PARALLEL PROCESSORS
  5. LATENCY CONSTRAINT PROPAGATION

Qualifiers

  • Article
