DOI: 10.1007/978-3-540-69848-7_69
Article

Finding Synchronization-Free Slices of Operations in Arbitrarily Nested Loops

Published: 30 June 2008

Abstract

This paper presents a new approach for extracting synchronization-free parallelism, represented by slices of dependent statement instances, from an arbitrarily nested loop. The presented algorithms apply to both uniform and non-uniform loops. Their main advantage is that they may extract more synchronization-free parallelism than existing techniques. The approach is based on operations on relations and sets and requires exact dependence analysis, such as that of Pugh and Wonnacott, in which dependences are expressed as tuple relations. Results of experiments with the NAS benchmark are presented.
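
To make the notion of a synchronization-free slice concrete, the following is a minimal Python sketch. It is not the paper's algorithm: the paper works symbolically on tuple relations obtained from exact dependence analysis, whereas this sketch assumes fixed loop bounds, enumerates every iteration by brute force, and groups dependent iterations with a union-find structure. The example loop `a[i][j] = a[i][j-1]` and the names `dependence_relation` and `sync_free_slices` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import product


def dependence_relation(domain, write, read):
    """Brute-force dependence pairs (earlier, later) between iterations.

    Two iterations are dependent when they touch the same array cell and
    at least one of them writes it. Tuple comparison is lexicographic,
    which matches execution order for a loop nest with increasing indices.
    """
    touched = defaultdict(list)  # cell -> [(iteration, is_write)]
    for it in domain:
        touched[write(it)].append((it, True))
        touched[read(it)].append((it, False))
    rel = set()
    for accesses in touched.values():
        for (a, a_writes), (b, b_writes) in product(accesses, repeat=2):
            if a < b and (a_writes or b_writes):
                rel.add((a, b))
    return rel


def sync_free_slices(domain, rel):
    """Group iterations into weakly connected components of the relation.

    Iterations in different components share no dependence, so the
    components can run in parallel without synchronization.
    """
    parent = {it: it for it in domain}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in rel:
        parent[find(a)] = find(b)
    groups = defaultdict(list)
    for it in domain:
        groups[find(it)].append(it)
    return sorted(sorted(g) for g in groups.values())


# Assumed example: for i in 1..N, for j in 1..N: a[i][j] = a[i][j-1]
N = 3
domain = list(product(range(1, N + 1), repeat=2))
rel = dependence_relation(
    domain,
    write=lambda it: (it[0], it[1]),
    read=lambda it: (it[0], it[1] - 1),
)
result = sync_free_slices(domain, rel)
for s in result:
    print(s)
```

In this assumed loop every dependence stays within one value of `i`, so each row of the iteration space forms its own slice. Enumeration only works for small, known bounds; a symbolic representation is what allows slices to be described for parameterized loops.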

References

[1] Allen, R., Kennedy, K.: Optimizing Compilers for Modern Architectures, p. 790. Morgan Kaufmann, San Francisco (2001)
[2] Amarasinghe, S.P., Lam, M.S.: Communication optimization and code generation for distributed memory machines. In: Proceedings of the SIGPLAN 1993, pp. 126-138 (1993)
[3] Ancourt, C., Irigoin, F.: Scanning polyhedra with do loops. In: Proc. of the Third ACM/SIGPLAN Symp. on Principles and Practice of Parallel Programming, pp. 39-50. ACM Press, New York (1991)
[4] Banerjee, U.: Unimodular transformations of double loops. In: Proceedings of the Third Workshop on Languages and Compilers for Parallel Computing, pp. 192-219 (1990)
[5] Bastoul, C., Cohen, A., Girbal, S., Sharma, S., Temam, O.: Putting polyhedral loop transformations to work. In: LCPC 16 International Workshop on Languages and Compilers for Parallel Computing. LNCS, vol. 2958, pp. 209-225. College Station (September 2003)
[6] Bastoul, C.: Code Generation in the Polyhedral Model Is Easier Than You Think. In: Proceedings of the PACT 13 IEEE International Conference on Parallel Architecture and Compilation Techniques, Juan-les-Pins, pp. 7-16 (2004)
[7] Beletska, A., Bielecki, W., San Pietro, P.: Extracting Synchronization-Free Slices of Operations in Perfectly-Nested Loops. In: Proceedings of PDCS 2007 (2007)
[8] Boulet, P., Darte, A., Silber, G.A., Vivien, F.: Loop parallelization algorithms: from parallelism extraction to code generation. Parallel Computing 24, 421-444 (1998)
[9] Cohen, A., Girbal, S., Temam, O.: A polyhedral approach to ease the composition of program transformations. In: Danelutto, M., Vanneschi, M., Laforenza, D. (eds.) Euro-Par 2004. LNCS, vol. 3149, pp. 292-303. Springer, Heidelberg (2004)
[10] Darte, A., Robert, Y., Vivien, F.: Scheduling and Automatic Parallelization. Birkhäuser Boston (2000)
[11] Feautrier, P.: Some efficient solutions to the affine scheduling problem, part I, one dimensional time. International Journal of Parallel Programming 21, 313-348 (1992)
[12] Feautrier, P.: Some efficient solutions to the affine scheduling problem, part II, multidimensional time. International Journal of Parallel Programming 21, 389-420 (1992)
[13] Feautrier, P.: Toward automatic distribution. Journal of Parallel Processing Letters 4, 233-244 (1994)
[14] Gavaldà, R., Ayguadé, E., Torres, J.: Obtaining Synchronization-Free Code with Maximum Parallelism, Technical Report LSI-96-23-R, Universitat Politècnica de Catalunya (1996)
[15] Griebl, M., Lengauer, C.: Classifying Loops for Space-Time Mapping. In: Proceedings of the Euro-Par. LNCS, pp. 467-474. Springer, Heidelberg (1996)
[16] Huang, C., Sadayappan, P.: Communication-free hyperplane partitioning of nested loops. Journal of Parallel and Distributed Computing 19, 90-102 (1993)
[17] Kelly, W., Pugh, W., Rosser, E., Shpeisman, T.: Transitive Closure of Infinite Graphs and its Applications. International Journal of Parallel Programming 24(6), 579-598 (1996)
[18] Kelly, W., Pugh, W.: Minimizing communication while preserving parallelism. In: Proc. of the 1996 ACM International Conference on Supercomputing, pp. 52-60 (1996)
[19] Kelly, W., Maslov, V., Pugh, W., Rosser, E., Shpeisman, T., Wonnacott, D.: The omega library interface guide, Technical Report CS-TR-3445, University of Maryland (1995)
[20] Lim, W., Lam, M.S.: Communication-free parallelization via affine transformations. In: Proc. of the 7th workshop on languages and compilers for parallel computing, pp. 92-106 (1994)
[21] Lim, W., Cheong, G.I., Lam, M.S.: An affine partitioning algorithm to maximize parallelism and minimize communication. In: Proceedings of the 13th ACM SIGARCH International Conference on Supercomputing (1999)
[22] Lim, W., Liao, S.W., Lam, M.: Blocking and Array Contraction Across Arbitrarily Nested Loops Using Affine Partitioning. In: Proceedings of ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (2001)
[23] Pugh, W., Wonnacott, D.: Constraint-based array dependence analysis. ACM Trans. on Programming Languages and Systems (1998)
[24] Pugh, W., Rosser, E.: Iteration Space Slicing and Its Application to Communication Optimization. In: Proc. of the International Conf. on Supercomputing, pp. 221-228 (1997)
[25] Quillere, F., Rajopadhye, S., Wilde, D.: Generation of efficient nested loops from polyhedra. International Journal of Parallel Programming 28 (2000)
[26] Weiser, M.: Program slices: formal, psychological, and practical investigations of an automatic program abstraction method, PhD thesis, University of Michigan, Ann Arbor, MI (1979)
[27] Weiser, M.: Program Slicing. IEEE Transactions on Software Engineering SE-10(7), 352-357 (1984)
[28] Wolf, M.E.: Improving locality and parallelism in nested loops, Ph.D. Dissertation CSLTR-92-538, Stanford University, Dept. Computer Science (1992)
[29] Vasilache, N., Bastoul, C., Cohen, A.: Polyhedral code generation in the real world. In: Proceedings of the International Conference on Compiler Construction (ETAPS CC 2006). LNCS, pp. 185-201. Springer, Vienna (2006)
[30] Netlib Repository at UTK and ORNL, http://www.netlib.org/benchmark/livermorec
[31] http://www.nas.nasa.gov

Cited By

  • (2010) An iterative algorithm of computing the transitive closure of a union of parameterized affine integer tuple relations. Proceedings of the 4th international conference on Combinatorial optimization and applications - Volume Part I, pp. 104-113. DOI: 10.5555/1940390.1940400. Online publication date: 18-Dec-2010
  • (2009) Synchronization-Free automatic parallelization. Proceedings of the 22nd international conference on Languages and Compilers for Parallel Computing, pp. 233-246. DOI: 10.1007/978-3-642-13374-9_16. Online publication date: 8-Oct-2009

Published In

ICCSA '08: Proceedings of the international conference on Computational Science and Its Applications, Part II
June 2008
1273 pages

Publisher

Springer-Verlag

Berlin, Heidelberg

Author Tags

  1. arbitrarily nested loops
  2. loop transformations
  3. slicing
  4. synchronization-free parallelism
