Optimal loop storage allocation for argument-fetching dataflow machines

International Journal of Parallel Programming

Abstract

In this paper, we consider the optimal loop scheduling and minimum storage allocation problems for the argument-fetching dataflow architecture model. Under the argument-fetching model, the result generated by a node is stored in a unique location that is addressable by its successors. The main contribution of this paper is as follows: for loops containing no loop-carried dependences, we prove that the problem of allocating the minimum storage required to support rate-optimal loop scheduling can be solved in polynomial time. The polynomial-time algorithm relies on the fact that the constraint matrix in the formulation is totally unimodular. Since the instruction processing unit of an argument-fetching dataflow architecture closely resembles a conventional processor architecture without a program counter, the solution of the optimal loop storage allocation problem for the former will also be useful for the latter.
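
Total unimodularity matters here for a standard reason: when every square submatrix of an integer constraint matrix has determinant -1, 0, or 1, and the right-hand side is integral, every vertex of the feasible polyhedron is integral. An ordinary linear program then yields an integer solution, and linear programs are solvable in polynomial time. The sketch below only illustrates the property; it is not the paper's formulation, which the abstract does not show. The four-node dependence graph, the function names, and the brute-force check are assumptions made for the example. It tests the edge-node incidence matrix of a small directed graph, a classic totally unimodular matrix, against a small matrix that is not.

```python
from itertools import combinations


def det(m):
    """Exact integer determinant by Laplace expansion along the first row."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        if m[0][j] == 0:
            continue  # zero entries contribute nothing
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def is_totally_unimodular(a):
    """Brute-force check: every square submatrix has determinant in {-1, 0, 1}.

    Exponential in the matrix size, so only suitable for tiny examples.
    """
    rows, cols = len(a), len(a[0])
    for k in range(1, min(rows, cols) + 1):
        for r in combinations(range(rows), k):
            for c in combinations(range(cols), k):
                sub = [[a[i][j] for j in c] for i in r]
                if det(sub) not in (-1, 0, 1):
                    return False
    return True


# Hypothetical loop-body dependence graph (an assumption for illustration):
# node 0 feeds nodes 1 and 2, which both feed node 3.
edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
num_nodes = 4

# One row per edge: -1 at the producer node, +1 at the consumer node.
incidence = [
    [1 if v == dst else -1 if v == src else 0 for v in range(num_nodes)]
    for src, dst in edges
]

tu_example = is_totally_unimodular(incidence)

# A 0/1 matrix whose full determinant is 2, hence not totally unimodular.
odd_cycle = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
non_tu_example = is_totally_unimodular(odd_cycle)
```

For constraint matrices of realistic size, a structural argument (such as showing the matrix is a network matrix) replaces the exhaustive check, which is used here only because it is short and easy to verify by hand.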



About this article

Cite this article

Ning, Q., Gao, G.R. Optimal loop storage allocation for argument-fetching dataflow machines. Int J Parallel Prog 21, 421–448 (1992). https://doi.org/10.1007/BF01379405


  • DOI: https://doi.org/10.1007/BF01379405
