
Array Data Flow Analysis for Load-Store Optimizations in Fine-Grain Architectures

International Journal of Parallel Programming

Abstract

The performance of scientific programs on modern processors can be significantly degraded by the memory traffic that load and store operations for array references generate. We have developed techniques for optimally allocating registers to array elements whose values are repeatedly referenced over one or more loop iterations. The resulting placement of loads and stores is optimal in that the number of loads and stores encountered along each path through the loop is minimal for the given program branching structure. Placing load, store, and register-to-register shift operations without introducing fully or partially redundant memory operations, or dead ones, requires a detailed value flow analysis of array references. We present an analysis framework that efficiently solves the various data flow problems required by array load-store optimizations. The framework determines the collective behavior of recurrent references spread over multiple loop iterations. We also demonstrate how our algorithms can be adapted for various fine-grain architectures.
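To make the idea of keeping reused array elements in registers concrete, here is a minimal sketch. It is not the article's algorithm: the paper's placement handles branching inside the loop and is driven by data flow analysis, whereas this example only covers a straight-line loop body. The names `CountingArray`, `recurrence_naive`, and `recurrence_pipelined` are hypothetical, and the load and store counters merely stand in for memory traffic.

```python
class CountingArray:
    """List wrapper that counts element loads and stores (a stand-in for memory traffic)."""

    def __init__(self, values):
        self.data = list(values)
        self.loads = 0
        self.stores = 0

    def __len__(self):
        return len(self.data)

    def __getitem__(self, i):
        self.loads += 1
        return self.data[i]

    def __setitem__(self, i, value):
        self.stores += 1
        self.data[i] = value


def recurrence_naive(a):
    """a[i] = a[i-1] + a[i-2], reloading both operands from memory every iteration."""
    for i in range(2, len(a)):
        a[i] = a[i - 1] + a[i - 2]


def recurrence_pipelined(a):
    """Same recurrence, with the two reused values carried in local 'registers'."""
    if len(a) < 3:
        return
    r2 = a[0]  # holds the value of a[i-2]; loaded once before the loop
    r1 = a[1]  # holds the value of a[i-1]; loaded once before the loop
    for i in range(2, len(a)):
        r0 = r1 + r2  # no memory load: both operands are already in registers
        a[i] = r0     # the store is still needed so memory holds the result
        r2 = r1       # register-to-register shift for the next iteration
        r1 = r0


naive = CountingArray([1, 1, 0, 0, 0, 0])
recurrence_naive(naive)

pipelined = CountingArray([1, 1, 0, 0, 0, 0])
recurrence_pipelined(pipelined)

print(naive.data, naive.loads, naive.stores)
print(pipelined.data, pipelined.loads, pipelined.stores)
```

Both versions compute the same array contents. The naive loop performs two loads per iteration, while the pipelined version performs two loads in total and replaces the rest with register shifts; the number of stores is unchanged.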



Additional information

Partially supported by National Science Foundation Presidential Young Investigator Award CCR-9157371 to the University of Pittsburgh and a grant from Hewlett-Packard Laboratories.

About this article

Cite this article

Bodík, R., Gupta, R. Array Data Flow Analysis for Load-Store Optimizations in Fine-Grain Architectures. Int J Parallel Prog 24, 481–512 (1996). https://doi.org/10.1007/BF03356757

  • DOI: https://doi.org/10.1007/BF03356757

Key Words