Abstract
Over the last few years, research on future-generation high-performance computing systems has been moving toward multi-threaded parallel architectures. The importance of exploiting and controlling parallelism has therefore been growing: parallel activities must be both synchronized and reduced. In fine-grain parallel computation, designing efficient micro-synchronization, at the same level of granularity as the grain size, is essential for implementation. This article discusses methods of synchronizing parallel activities, focusing on the case where the number of activities to be gathered is determined at run time. A new reduction graph that incurs no loss of parallelism is proposed. It is especially useful when the number of parallel activities is determined dynamically. The method was developed primarily for instruction-level dataflow computers. Its full potential should be realized when trends in parallel processing return to finer grain sizes.
Copyright information
© 1995 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sekiguchi, S., Hiraki, K. (1995). Automatic reduction tree generation for fine-grain parallel architectures when iteration count is unknown. In: Pingali, K., Banerjee, U., Gelernter, D., Nicolau, A., Padua, D. (eds) Languages and Compilers for Parallel Computing. LCPC 1994. Lecture Notes in Computer Science, vol 892. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0025898
DOI: https://doi.org/10.1007/BFb0025898
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-58868-9
Online ISBN: 978-3-540-49134-7
eBook Packages: Springer Book Archive