Static Coarse Grain Task Scheduling with Cache Optimization Using OpenMP

Nakano, Hirofumi; Ishizaka, Kazuhisa; Obata, Motoki; Kimura, Keiji; Kasahara, Hironori

doi:10.1007/3-540-47847-7_44

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2327)

Included in the following conference series:

International Symposium on High Performance Computing

Abstract

Effective use of cache memory is becoming more important as the gap between processor speed and memory access speed widens. Exploiting multigrain parallelism is also becoming more important for improving effective performance beyond the limits of loop-iteration-level parallelism. With these factors in mind, this paper proposes a static scheduling scheme for coarse grain tasks that takes cache optimization into account. After tasks and data are decomposed at compile time according to the cache size, the proposed scheme schedules coarse grain tasks to threads so that data shared among coarse grain tasks can be passed via the cache. The scheme is implemented in the OSCAR Fortran multigrain parallelizing compiler and evaluated on a Sun Ultra80 four-processor SMP workstation using Swim and Tomcatv from SPECfp95. On 4 processors, the proposed scheme achieves a 4.56 times speedup for Swim and a 2.37 times speedup for Tomcatv against the Sun Forte HPC 6 loop parallelizing compiler.
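The idea in the abstract can be illustrated with a small sketch. It is not the OSCAR compiler's algorithm, which this page does not describe in detail; it only shows one plausible way to split loops into cache-sized parts and then statically place tasks that share a data block on the same thread, back to back. The names `Task`, `decompose`, and `static_schedule`, the greedy heaviest-first placement, and all sizes and costs are assumptions made for illustration. Data dependences between different blocks are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    cost: int    # estimated execution cost, arbitrary units (assumed)
    group: str   # tasks in the same group touch the same data block


def decompose(loop_name, total_bytes, cost, cache_bytes):
    """Split one loop into the fewest equal parts whose data fits in cache."""
    parts = -(-total_bytes // cache_bytes)  # ceiling division
    return [Task(f"{loop_name}[{i}]", cost // parts, f"block{i}")
            for i in range(parts)]


def static_schedule(tasks, num_threads):
    """Assign whole data-sharing groups to threads, heaviest group first."""
    # Collect tasks by group, keeping program order inside each group.
    groups = {}
    for t in tasks:
        groups.setdefault(t.group, []).append(t)

    loads = [0] * num_threads
    plan = [[] for _ in range(num_threads)]
    # Greedy placement: heaviest group goes to the least-loaded thread.
    for members in sorted(groups.values(),
                          key=lambda g: -sum(t.cost for t in g)):
        tid = loads.index(min(loads))
        plan[tid].extend(members)  # same thread, consecutive execution
        loads[tid] += sum(t.cost for t in members)
    return plan, loads


# Hypothetical input: two loops over the same 8 MiB of arrays, 2 MiB cache.
MIB = 1 << 20
tasks = (decompose("loopA", 8 * MIB, 400, 2 * MIB)
         + decompose("loopB", 8 * MIB, 800, 2 * MIB))
plan, loads = static_schedule(tasks, num_threads=4)
for tid, assigned in enumerate(plan):
    print(tid, [t.name for t in assigned], loads[tid])
```

In this sketch each thread runs its part of `loopA` immediately followed by the matching part of `loopB`, so the block written by the first task may still be in that processor's cache when the second task reads it. In an OpenMP program, such a fixed plan could be expressed by having each thread select its own task list by thread number inside a parallel region.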

Author information

Authors and Affiliations

Waseda University, 3-4-1 Ohkubo, Shinjuku-ku, Tokyo, 169-8555, Japan
Hirofumi Nakano
Waseda University & Advanced Parallelizing Compiler Project, Japan
Kazuhisa Ishizaka, Motoki Obata, Keiji Kimura & Hironori Kasahara

Editor information

Editors and Affiliations

Institute of Software Science, University of Vienna, Liechtensteinstr. 22, 1090, Vienna, Austria
Hans P. Zima
Department of Information and Computer Science, Nara Women’s University, Kitauoyanishimachi, Nara City, 630-8506, Japan
Kazuki Joe
Institute of Information Science and Electronics, University of Tsukuba, Tenno-dai 1-1-1, Tsukuba, Ibaraki, 305-8577, Japan
Mitsuhisa Sato
Internet Systems Research Laboratories, NEC Corporation, 4-1-1, Miyazaki, Miyamae, Kawasaki, Kanagawa, 216-8555, Japan
Yoshiki Seo
Kyoto University, Yoshidahonmachi, Sakyo-ku, Kyoto, 606-8501, Japan
Masaaki Shimasaki

About this paper

Cite this paper

Nakano, H., Ishizaka, K., Obata, M., Kimura, K., Kasahara, H. (2002). Static Coarse Grain Task Scheduling with Cache Optimization Using OpenMP. In: Zima, H.P., Joe, K., Sato, M., Seo, Y., Shimasaki, M. (eds) High Performance Computing. ISHPC 2002. Lecture Notes in Computer Science, vol 2327. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47847-7_44

DOI: https://doi.org/10.1007/3-540-47847-7_44
Published: 29 April 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43674-4
Online ISBN: 978-3-540-47847-8
eBook Packages: Springer Book Archive