Aggressive loop unrolling in a retargetable, optimizing compiler

Davidson, Jack W.; Jinturkar, Sanjay

doi:10.1007/3-540-61053-7_53

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1060)

Included in the following conference series:

International Conference on Compiler Construction

Abstract

A well-known code transformation for improving the run-time performance of a program is loop unrolling. The most obvious benefit of unrolling a loop is that the transformed loop usually requires fewer instruction executions than the original loop. The reduction comes from two sources: fewer branch instructions are executed, and the control variable is modified fewer times. In addition, on architectures with features designed to exploit instruction-level parallelism, loop unrolling can expose more of that parallelism. Loop unrolling is an effective transformation: for programs that spend much of their execution time in loops, it often improves execution performance by 10 to 30 percent. Possibly because even a simple application of loop unrolling is effective, it has not been studied as extensively as other code improvements such as register allocation or common subexpression elimination. As a result, many compilers employ simplistic loop unrolling algorithms that miss many opportunities for improving run-time performance. This paper describes how aggressive loop unrolling is done in a retargetable optimizing compiler. The effectiveness of this more aggressive approach is evaluated using a set of 32 benchmark programs. The results show that aggressive loop unrolling can yield an additional performance increase of 10 to 20 percent over the simple, naive approaches employed by many production compilers.
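
The two sources of savings named in the abstract can be seen in a small hand-written sketch. This is not the authors' algorithm or compiler output. The function names `sum_rolled` and `sum_unrolled4` and the unroll factor of 4 are assumptions chosen purely for illustration. The unrolled version executes roughly one loop branch and one update of the control variable `i` per four elements, and it needs a cleanup loop for trip counts that are not a multiple of the unroll factor.

```c
#include <stddef.h>

/* Rolled loop: one compare-and-branch and one increment of i per element. */
long sum_rolled(const int *a, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Hand-unrolled by a factor of 4 (the factor is an illustrative assumption). */
long sum_unrolled4(const int *a, size_t n)
{
    long sum = 0;
    size_t i = 0;

    /* Main body: four elements per branch and per update of i. */
    for (; i + 4 <= n; i += 4) {
        sum += a[i];
        sum += a[i + 1];
        sum += a[i + 2];
        sum += a[i + 3];
    }

    /* Cleanup loop: handles the n % 4 leftover elements. */
    for (; i < n; i++)
        sum += a[i];

    return sum;
}
```

Both functions return the same result for any `n`. A compiler that unrolls automatically has to generate the equivalent of the cleanup loop itself whenever it cannot prove that the trip count is a multiple of the unroll factor.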

Author information

Authors and Affiliations

Department of Computer Science, University of Virginia, Thornton Hall, Charlottesville, VA 22903, USA
Jack W. Davidson & Sanjay Jinturkar

Editor information

Tibor Gyimóthy

Cite this paper

Davidson, J.W., Jinturkar, S. (1996). Aggressive loop unrolling in a retargetable, optimizing compiler. In: Gyimóthy, T. (ed.) Compiler Construction. CC 1996. Lecture Notes in Computer Science, vol 1060. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-61053-7_53

DOI: https://doi.org/10.1007/3-540-61053-7_53
Published: 07 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-61053-3
Online ISBN: 978-3-540-49939-8
eBook Packages: Springer Book Archive
