Abstract
Effective use of cache memory is becoming more important as the gap between processor speed and memory access speed widens. Multigrain parallelism is likewise becoming more important as a way to improve effective performance beyond the limits of loop-iteration-level parallelism. In view of these factors, this paper proposes a static scheduling scheme for coarse grain tasks that takes cache optimization into account. At compile time, the scheme decomposes tasks and data with the cache size in mind, then schedules the coarse grain tasks onto threads so that data shared among them can be passed via the cache. It is implemented in the OSCAR Fortran multigrain parallelizing compiler and evaluated on a Sun Ultra80 four-processor SMP workstation using Swim and Tomcatv from the SPEC fp 95. On 4 processors, the proposed scheme achieves a 4.56 times speedup for Swim and a 2.37 times speedup for Tomcatv relative to the Sun Forte HPC 6 loop parallelizing compiler.
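As an illustrative sketch only, and not the OSCAR compiler's actual algorithm, the idea of sizing data partitions to the cache and then placing tasks that share a partition on the same thread might look like the following. The function names, the task tuple layout, and the greedy least-loaded placement are all assumptions made for this example, and task dependencies are ignored for brevity.

```python
from collections import defaultdict


def partitions_for_cache(working_set_bytes, cache_bytes):
    """Hypothetical helper: number of data partitions needed so that
    each partition's share of the working set fits in the cache."""
    return -(-working_set_bytes // cache_bytes)  # integer ceiling


def schedule_by_data_group(tasks, num_threads):
    """Hypothetical static scheduler.

    tasks: list of (name, cost, group) tuples, where tasks with the
    same group access the same data partition.
    Returns (assignment, loads): the task names per thread, in
    execution order, and the total cost per thread.
    """
    # Collect tasks by the data partition they share.
    groups = defaultdict(list)
    for name, cost, group in tasks:
        groups[group].append((name, cost))

    # Place the most expensive groups first (greedy heuristic).
    ordered = sorted(
        groups.items(),
        key=lambda kv: -sum(cost for _, cost in kv[1]),
    )

    loads = [0] * num_threads
    assignment = [[] for _ in range(num_threads)]
    for _, members in ordered:
        # Pick the currently least-loaded thread for the whole group,
        # so tasks sharing data run back to back on one thread.
        t = min(range(num_threads), key=lambda i: loads[i])
        for name, cost in members:
            assignment[t].append(name)
            loads[t] += cost
    return assignment, loads


if __name__ == "__main__":
    demo = [
        ("loop1_a", 4, "a"), ("loop2_a", 4, "a"),
        ("loop1_b", 3, "b"), ("loop2_b", 3, "b"),
        ("loop1_c", 2, "c"),
    ]
    print(partitions_for_cache(10 * 2**20, 4 * 2**20))
    print(schedule_by_data_group(demo, 2))
```

Keeping a whole group on one thread is what lets a later task reuse data that an earlier task left in that processor's cache. A real scheduler would also have to respect the dependencies between tasks.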
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nakano, H., Ishizaka, K., Obata, M., Kimura, K., Kasahara, H. (2002). Static Coarse Grain Task Scheduling with Cache Optimization Using OpenMP. In: Zima, H.P., Joe, K., Sato, M., Seo, Y., Shimasaki, M. (eds) High Performance Computing. ISHPC 2002. Lecture Notes in Computer Science, vol 2327. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47847-7_44
DOI: https://doi.org/10.1007/3-540-47847-7_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43674-4
Online ISBN: 978-3-540-47847-8
eBook Packages: Springer Book Archive