research-article

Minimum Distance: A Method for Partitioning Recurrences for Multiprocessors

Published: 01 August 1989

Abstract

Parallel execution of nonvectorizable uniform recurrences is considered. When naively scheduled, such recurrences could create unacceptable communication and synchronization on a multiprocessor. The minimum-distance method partitions such recurrences into totally independent computations without increasing redundancy or perturbing numerical stability. The independent computations are well suited for execution on a multiprocessor, but they may not utilize all available processors. How extra processors can be applied to the independent computations is addressed. The methods are especially attractive for multiprocessors comprised of clusters.
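The idea of splitting a recurrence into totally independent computations can be illustrated with a small one-dimensional sketch. This is not the paper's minimum-distance algorithm, whose details the abstract does not give. It assumes a recurrence with constant dependence distances, in which iteration i can only depend on iteration j when i and j leave the same remainder modulo the greatest common divisor of the distances. The sample recurrence and the names `partition_by_gcd`, `run_chain`, `run_sequential`, and `run_partitioned` are hypothetical and chosen only for illustration.

```python
from functools import reduce
from math import gcd


def partition_by_gcd(n, distances):
    """Group iterations 0..n-1 into chains with no dependences between chains."""
    g = reduce(gcd, distances)
    return [list(range(r, n, g)) for r in range(g)]


def run_sequential(n, distances):
    """Reference: x[i] = 1 + sum of x[i - d]; x[i] = i for the first max(d) terms."""
    m = max(distances)
    x = []
    for i in range(n):
        x.append(i if i < m else 1 + sum(x[i - d] for d in distances))
    return x


def run_chain(chain, distances):
    """Evaluate one chain using only values produced inside that chain."""
    m = max(distances)
    local = {}
    for i in chain:
        # Every i - d lies in the same chain, so a lookup outside `local`
        # (which would mean communication) never happens.
        local[i] = i if i < m else 1 + sum(local[i - d] for d in distances)
    return local


def run_partitioned(n, distances):
    """Run each chain in isolation, then gather the results."""
    x = [None] * n
    for chain in partition_by_gcd(n, distances):
        for i, value in run_chain(chain, distances).items():
            x[i] = value
    return x


if __name__ == "__main__":
    print(partition_by_gcd(6, [4, 6]))
    print(run_partitioned(20, [4, 6]) == run_sequential(20, [4, 6]))
```

With distances 4 and 6 the sketch yields two chains (the even and the odd iterations), so at most two processors are kept busy. This matches the abstract's remark that the independent computations may not use all available processors. Each value is still computed exactly once and in the same operand order as in the sequential loop.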


Published In

IEEE Transactions on Computers, Volume 38, Issue 8
August 1989
168 pages
ISSN: 0018-9340

Publisher

IEEE Computer Society

United States

Author Tags

  1. clusters
  2. computer networks
  3. minimum distance
  4. multiprocessing systems
  5. multiprocessors
  6. nonvectorizable uniform recurrences
  7. numerical stability
  8. parallel execution
  9. partitioning recurrences
  10. totally independent computations

Cited By

  • (2011) Dynamic cache contention detection in multi-threaded applications. ACM SIGPLAN Notices 46:7 (27-38). DOI: 10.1145/2007477.1952688. Online publication date: 9-Mar-2011
  • (2011) Dynamic cache contention detection in multi-threaded applications. Proceedings of the 7th ACM SIGPLAN/SIGOPS international conference on Virtual execution environments (27-38). DOI: 10.1145/1952682.1952688. Online publication date: 9-Mar-2011
  • (2009) A reindexing based approach towards mapping of DAG with affine schedules onto parallel embedded systems. Journal of Parallel and Distributed Computing 69:1 (1-11). DOI: 10.1016/j.jpdc.2008.08.004. Online publication date: 1-Jan-2009
  • (2007) Reducing off-chip memory access via stream-conscious tiling on multimedia applications. International Journal of Parallel Programming 35:1 (63-98). DOI: 10.5555/1241828.1241831. Online publication date: 1-Feb-2007
  • (2004) Mapping rectangular mesh algorithms onto asymptotically space-optimal arrays. Journal of Parallel and Distributed Computing 64:3 (345-359). DOI: 10.1016/j.jpdc.2003.04.002. Online publication date: 1-Mar-2004
  • (2002) Complexity of Multi-dimensional Loop Alignment. Proceedings of the 19th Annual Symposium on Theoretical Aspects of Computer Science (179-191). DOI: 10.5555/646516.696302. Online publication date: 14-Mar-2002
  • (2000) Evaluation of Loop Grouping Methods Based on Orthogonal Projection Spaces. Proceedings of the 2000 International Conference on Parallel Processing. DOI: 10.5555/850941.852932. Online publication date: 21-Aug-2000
  • (2000) Chain Grouping. IEEE Transactions on Parallel and Distributed Systems 11:9 (941-955). DOI: 10.1109/71.879777. Online publication date: 1-Sep-2000
  • (1999) A Space-Time Representation Method of Iterative Algorithms for the Design of Processor Arrays. Journal of VLSI Signal Processing Systems 22:3 (151-162). DOI: 10.1023/A:1008103504836. Online publication date: 20-Sep-1999
  • (1998) An Efficient Solution to the Cache Thrashing Problem Caused by True Data Sharing. IEEE Transactions on Computers 47:5 (527-543). DOI: 10.1109/12.677228. Online publication date: 1-May-1998
