Overlapped loop support in the Cydra 5

Authors:

James C. Dehnert,

Peter Y.-T. Hsu,

Joseph P. Bratt

ACM SIGARCH Computer Architecture News, Volume 17, Issue 2

Pages 26 - 38

https://doi.org/10.1145/68182.68185

Published: 01 April 1989

Abstract

The Cydra™ 5 architecture adds unique support for overlapping successive iterations of a loop to a very long instruction word (VLIW) base. This architecture allows highly parallel loop execution for a much larger class of loops than can be vectorized, without requiring the unrolling of loops usually used by compilers for VLIW machines. This paper discusses the Cydra 5 loop scheduling model, the special architectural features which support it, and the loop compilation techniques used to take full advantage of the architecture.

Index Terms

Overlapped loop support in the Cydra 5
1. Computer systems organization
  1. Architectures
    1. Parallel architectures
      1. Very long instruction word
    2. Serial architectures
      1. Complex instruction set computing
      2. Reduced instruction set computing
2. Theory of computation
  1. Design and analysis of algorithms
    1. Approximation algorithms analysis
      1. Scheduling algorithms
    2. Online algorithms
      1. Online learning algorithms
        Scheduling algorithms
  2. Theory and algorithms for application domains
    1. Machine learning theory
      1. Reinforcement learning
        Sequential decision making

Published In

ACM SIGARCH Computer Architecture News Volume 17, Issue 2

Special issue: Proceedings of ASPLOS-III: the third international conference on architectural support for programming languages and operating systems

April 1989

291 pages

ISSN:0163-5964

DOI:10.1145/68182

Editor:
Joel Emer

ASPLOS III: Proceedings of the third international conference on Architectural support for programming languages and operating systems
April 1989
303 pages
ISBN:0897913000
DOI:10.1145/70082
Chairman:
Joel Emer
General Chair:
John Hennessy
Stanford University

Copyright © 1989 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery

New York, NY, United States
