Distributed Data Processing on Microcomputers with Ascheduler and Apache Spark

Korkhov, Vladimir; Gankevich, Ivan; Iakushkin, Oleg; Gushchanskiy, Dmitry; Khmel, Dmitry; Ivashchenko, Andrey; Pyayt, Alexander; Zobnin, Sergey; Loginov, Alexander

doi:10.1007/978-3-319-62404-4_28


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10408)

Included in the following conference series:

International Conference on Computational Science and Its Applications


Abstract

Modern data acquisition and processing architectures often rely on low-cost, low-power devices that can be bound together to form a distributed infrastructure. In this paper we survey the options for organizing a distributed computing testbed based on microcomputers similar to Raspberry Pi and Intel Edison. The goal of the research is to investigate and develop a scheduler for orchestrating distributed data processing and general-purpose computations on such unreliable and resource-constrained hardware. We also consider integrating the scheduler with the well-known distributed data processing framework Apache Spark. We outline the project, carried out in collaboration with Siemens LLC, to compare different hardware and software deployment configurations and to evaluate the performance and applicability of the tools on the testbed.



Acknowledgments

The research was supported by Siemens LLC.

Author information

Authors and Affiliations

Saint Petersburg State University, 7/9 Universitetskaya nab., St. Petersburg, 199034, Russia
Vladimir Korkhov, Ivan Gankevich, Oleg Iakushkin, Dmitry Gushchanskiy, Dmitry Khmel & Andrey Ivashchenko
Siemens LLC, St. Petersburg, Russia
Alexander Pyayt, Sergey Zobnin & Alexander Loginov


Corresponding author

Correspondence to Vladimir Korkhov.

Editor information

Editors and Affiliations

University of Perugia, Perugia, Italy
Osvaldo Gervasi
University of Basilicata, Potenza, Italy
Beniamino Murgante
Covenant University, Ota, Nigeria
Sanjay Misra
University of Trieste, Trieste, Italy
Giuseppe Borruso
Polytechnic University of Bari, Bari, Italy
Carmelo M. Torre
University of Minho, Braga, Portugal
Ana Maria A.C. Rocha
Monash University, Clayton, Victoria, Australia
David Taniar
Kyushu Sangyo University, Fukuoka, Japan
Bernady O. Apduhan
Saint Petersburg State University, Saint Petersburg, Russia
Elena Stankova
University of Trieste, Trieste, Italy
Alfredo Cuzzocrea


About this paper

Cite this paper

Korkhov, V. et al. (2017). Distributed Data Processing on Microcomputers with Ascheduler and Apache Spark. In: Gervasi, O., et al. Computational Science and Its Applications – ICCSA 2017. ICCSA 2017. Lecture Notes in Computer Science, vol 10408. Springer, Cham. https://doi.org/10.1007/978-3-319-62404-4_28


DOI: https://doi.org/10.1007/978-3-319-62404-4_28
Published: 15 July 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-62403-7
Online ISBN: 978-3-319-62404-4
eBook Packages: Computer Science, Computer Science (R0)
