Optimal loop storage allocation for argument-fetching dataflow machines

Ning, Qi; Gao, Guang R.

doi:10.1007/BF01379405

Published: December 1992

Volume 21, pages 421–448, (1992)

International Journal of Parallel Programming

Qi Ning¹ & Guang R. Gao²

Abstract

In this paper, we consider the optimal loop scheduling and minimum storage allocation problems based on the argument-fetching dataflow architecture model. Under the argument-fetching model, the result generated by a node is stored in a unique location which is addressable by its successors. The main contribution of this paper is the following: for loops containing no loop-carried dependences, we prove that the problem of allocating the minimum storage required to support rate-optimal loop scheduling can be solved in polynomial time. The polynomial-time algorithm relies on the fact that the constraint matrix in the formulation is totally unimodular. Since the instruction processing unit of an argument-fetching dataflow architecture is very much like a conventional processor architecture without a program counter, the solution of the optimal loop storage allocation problem for the former will also be useful for the latter.
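
The role of total unimodularity can be illustrated with a small sketch. A matrix is totally unimodular when every square submatrix has determinant -1, 0, or 1. When the constraint matrix of a linear program has this property and the right-hand side is integral, the vertices of the feasible region are integral, so an integer-valued problem can be solved by ordinary linear programming, which runs in polynomial time. The abstract does not give the paper's actual constraint matrix, so everything below is an assumption made for illustration: the names `det`, `is_totally_unimodular`, `edges`, and `difference_matrix` are hypothetical, and the example matrix is a generic difference-constraint matrix (one row per dependence edge, with a single +1 and a single -1), not the formulation from the paper. The check is brute force and only suitable for tiny matrices.

```python
from itertools import combinations


def det(m):
    """Integer determinant by Laplace expansion along the first row.

    Exponential in the matrix size, which is acceptable for tiny examples.
    """
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        if m[0][j] == 0:
            continue
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def is_totally_unimodular(a):
    """Brute-force check: every square submatrix has determinant in {-1, 0, 1}."""
    rows, cols = len(a), len(a[0])
    for k in range(1, min(rows, cols) + 1):
        for r in combinations(range(rows), k):
            for c in combinations(range(cols), k):
                sub = [[a[i][j] for j in c] for i in r]
                if det(sub) not in (-1, 0, 1):
                    return False
    return True


# Hypothetical loop body with four nodes and four dependence edges (u -> v).
# Each row encodes the left-hand side of a difference constraint t_v - t_u.
edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
num_nodes = 4
difference_matrix = [
    [1 if n == v else -1 if n == u else 0 for n in range(num_nodes)]
    for (u, v) in edges
]

# For contrast: a 0/1 matrix whose determinant is 2, so it is not totally unimodular.
counterexample = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]

if __name__ == "__main__":
    print(is_totally_unimodular(difference_matrix))
    print(is_totally_unimodular(counterexample))
```

The difference-constraint matrix passes the check because it is the transpose of a directed-graph incidence matrix, a standard totally unimodular family; the counterexample fails because its full 3×3 determinant is 2.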

Author information

Authors and Affiliations

Centre de Recherche Informatique de Montreal, 1801 McGill College Ave, Bureau 800, Montreal, Quebec, Canada
Qi Ning
School of Computer Science, McGill University, Montreal, Quebec, Canada
Guang R. Gao

About this article

Cite this article

Ning, Q., Gao, G.R. Optimal loop storage allocation for argument-fetching dataflow machines. Int J Parallel Prog 21, 421–448 (1992). https://doi.org/10.1007/BF01379405

Received: 15 April 1992
Revised: 15 August 1993
Issue Date: December 1992
DOI: https://doi.org/10.1007/BF01379405
