Abstract
MapReduce has been shown to be a simple and efficient way to harness the massive resources of clusters. Recently, researchers propose using partitioned global address space (PGAS) based language and runtime to ease the programming of large-scale clusters. In this paper, we present an empirical study on the effectiveness of running MapReduce applications on a typical PGAS language runtime called X10. By tuning the performance of two applications on X10 platforms, we successfully eliminate several performance bottlenecks related to I/O processing. We also identify several remaining problems and propose several approaches to remedying them. Our final performance evaluation on a small-scale multicore cluster shows that the MapReduce applications written with X10 notably outperform those in Hadoop in most cases. Detailed analysis reveals that the major performance advantages come from a simplified task management and data storage scheme.
This work was funded by IBM X10 Innovation Faculty Award, China National Natural Science Foundation under grant numbered 61003002, a grant from the Science and Technology Commission of Shanghai Municipality numbered 10511500100, a research grant from Intel, Fundamental Research Funds for the Central Universities in China and Shanghai Leading Academic Discipline Project (Project Number: B114).
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Dean, J., Ghemawat, S.: Mapreduce: simplified data processing on large clusters. Commun. ACM 51, 107–113 (2008)
Saraswat, V., Almasi, G., Bikshandi, G., Cascaval, C., Cunningham, D., Grove, D., Kodali, S., Peshansky, I., Tardieu, O.: The asynchronous partitioned global address space model. In: Proceedings of Workshop on Advances in Message Passing (2010)
Charles, P., Grothoff, C., Saraswat, V., Donawa, C., Kielstra, A., Ebcioglu, K., von Praun, C., Sarkar, V.: X10: an object-oriented approach to non-uniform cluster computing. In: Proc. OOPLSA, pp. 519–538 (2005)
Bialecki, A., Cafarella, M., Cutting, D., Omalley, O.: Hadoop: a framework for running applications on large clusters built of commodity hardware, http://lucene.apache.org/hadoop
Saraswat, V.A., Sarkar, V., von Praun, C.: X10: concurrent programming for modern architectures. In: Proc. PPoPP, pp. 271–271 (2007)
Murthy, P.: Parallel computing with x10. In: Proceedings of the 1st International Workshop on Multicore Software Engineering, pp. 5–6 (2008)
Saraswat, V.A., Kambadur, P., Kodali, S., Grove, D., Krishnamoorthy, S.: Lifeline-based global load balancing. In: Proc. PPoPP, pp. 201–212 (2011)
Agarwal, S., Barik, R., Nandivada, V.K., Shyamasundar, K., Varma, P.: Static detection of place locality and elimination of runtime checks. In: Ramalingam, G. (ed.) APLAS 2008. LNCS, vol. 5356, pp. 53–77. Springer, Heidelberg (2008)
Agarwal, S., Barik, R., Bonachea, D., Sarkar, V., Shyamasundar, R.K., Yelick, K.: Deadlock-free scheduling of x10 computations with bounded resources. In: Proc. SPAA, pp. 229–240 (2007)
Zhao, J., Shirako, J., Nandivada, V.K., Sarkar, V.: Reducing task creation and termination overhead in explicitly parallel programs. In: Proc. PACT, pp. 169–180 (2010)
Barik, R.: Efficient optimization of memory accesses in parallel programs (2009), www.cs.rice.edu/~vsarkar/PDF/rajbarik_thesis.pdf
Yan, Y., Zhao, J., Guo, Y., Sarkar, V.: Hierarchical place trees: A portable abstraction for task parallelism and data movement. In: Gao, G.R., Pollock, L.L., Cavazos, J., Li, X. (eds.) LCPC 2009. LNCS, vol. 5898, pp. 172–187. Springer, Heidelberg (2010)
Raman, R.: Compiler support for work-stealing parallel runtime systems. M.S. thesis, Department of Computer Science, Rice University (2009)
Bikshandi, G., Castanos, J.G., Kodali, S.B., Nandivada, V.K., Peshansky, I., Saraswat, V.A., Sur, S., Varma, P., Wen, T.: Efficient, portable implementation of asynchronous multi-place programs. In: Proc. PPoPP, pp. 271–282 (2009)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, C., Xie, C., Xiao, Z., Chen, H. (2011). Evaluating the Performance and Scalability of MapReduce Applications on X10. In: Temam, O., Yew, PC., Zang, B. (eds) Advanced Parallel Processing Technologies. APPT 2011. Lecture Notes in Computer Science, vol 6965. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24151-2_4
Download citation
DOI: https://doi.org/10.1007/978-3-642-24151-2_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24150-5
Online ISBN: 978-3-642-24151-2
eBook Packages: Computer ScienceComputer Science (R0)