Abstract
This paper introduces general techniques for optimizing the parallel execution time of sorting on distributed architectures whose processors run at different speeds. Such an application requires a partitioning step. For uniformly related processors (processor speeds are related by a constant factor), we develop a constant-time technique for controlling processor load and execution time in a heterogeneous environment, as well as a technique for dealing with unknown cost functions. For non-uniformly related processors, we use a technique based on dynamic programming. Most of the time, the solutions are in \(\mathcal{O}(p)\), where p is the number of processors, independent of the problem size n. Consequently, the overhead relative to the problem at hand is small, but the approach is inherently limited by how well the time complexity of the code that follows the partitioning step is known.
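To make the partitioning step concrete, here is a minimal sketch of how chunk sizes might be chosen for uniformly related processors. It is not the constant-time technique of the paper: it assumes a local sort cost of x·log2(x)/speed for a chunk of x keys, ignores communication, and substitutes plain numerical bisection for any closed-form solution. The function name `balanced_partition` and its parameters are hypothetical, and the sketch assumes n is at least the number of processors.

```python
import math


def balanced_partition(n, speeds, iters=100):
    """Split n keys among processors so that the estimated local sort
    times x*log2(x)/speed are roughly equal (assumed cost model)."""

    def size_for(t, s):
        # Largest real x in [1, max(n, 2)] with x*log2(x) <= t*s (bisection).
        lo, hi = 1.0, float(max(n, 2))
        for _ in range(iters):
            mid = (lo + hi) / 2
            if mid * math.log2(mid) <= t * s:
                lo = mid
            else:
                hi = mid
        return lo

    # Bisection on the common finishing time t shared by all processors.
    t_lo, t_hi = 0.0, n * math.log2(max(n, 2)) / min(speeds)
    for _ in range(iters):
        t = (t_lo + t_hi) / 2
        if sum(size_for(t, s) for s in speeds) < n:
            t_lo = t
        else:
            t_hi = t

    # Round down, then correct the total one key at a time,
    # visiting the fastest processors first.
    sizes = [int(size_for(t_hi, s)) for s in speeds]
    leftover = n - sum(sizes)
    order = sorted(range(len(speeds)), key=lambda i: speeds[i], reverse=True)
    for k in range(abs(leftover)):
        sizes[order[k % len(order)]] += 1 if leftover > 0 else -1
    return sizes


if __name__ == "__main__":
    # Hypothetical relative speeds: the last processor is twice as fast.
    print(balanced_partition(3000, [1.0, 1.0, 2.0]))
```

Under this cost model a processor that is twice as fast receives somewhat less than twice as many keys, because the log factor grows with the chunk size. The work done here depends on the number of processors and the bisection precision, not on scanning the n keys themselves.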
Work supported in part by France Agence Nationale de la Recherche under grants ANR-05-SSIA-0005-01 and ANR-05-SSIA-0005-05, programme ARA sécurité.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cérin, C., Dubacq, JC., Roch, JL. (2006). Methods for Partitioning Data to Improve Parallel Execution Time for Sorting on Heterogeneous Clusters. In: Chung, YC., Moreira, J.E. (eds) Advances in Grid and Pervasive Computing. GPC 2006. Lecture Notes in Computer Science, vol 3947. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11745693_18
DOI: https://doi.org/10.1007/11745693_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33809-3
Online ISBN: 978-3-540-33810-9
eBook Packages: Computer Science, Computer Science (R0)