Abstract
Developing complex simulations with high computational demands often requires the efficient parallel execution of a large number of numerical simulation tasks. Heterogeneous compute resources can be exploited for these parallel tasks by integrating dedicated scheduling methods into the simulation code. This article describes the development of an engineering optimization application that consists of several individual components. Several scheduling methods for distributing parallel simulation tasks among heterogeneous compute nodes are presented. Performance results compare two novel scheduling methods with several existing scheduling algorithms for parallel tasks. The scheduling and execution of benchmark tasks and FEM simulation tasks are demonstrated on a heterogeneous compute cluster.
Notes
MERGE Technologies for Multifunctional Lightweight Structures, http://www.tu-chemnitz.de/merge.
Acknowledgments
This work was performed within the Federal Cluster of Excellence EXC 1075 “MERGE Technologies for Multifunctional Lightweight Structures” and supported by the German Research Foundation (DFG). Financial support is gratefully acknowledged.
Cite this article
Dietze, R., Hofmann, M. & Rünger, G. Water-Level scheduling for parallel tasks in compute-intensive application components. J Supercomput 72, 4047–4068 (2016). https://doi.org/10.1007/s11227-016-1711-1
DOI: https://doi.org/10.1007/s11227-016-1711-1