Abstract
The growing ubiquity of IoT, together with the manufacturing industry's move towards full digitalization, makes it easier to monitor equipment activity continuously and to implement predictive maintenance approaches. Big Data solutions are best suited to processing the large amounts of data generated by this monitoring. They also allow the processing of unstructured data, such as documents used in processes that are not fully digitalized. This paper describes the creation of a small Hadoop cluster without high availability, its integration into the InValue architecture, and the processes through which it was populated with historical data from a relational warehouse. The degree of parallelization of the data ingestion tasks and its effect on performance were evaluated for the different kinds of datasets currently used for batch data processing.
Acknowledgments
The present work has been developed under the EUREKA - ITEA2 Project INVALUE (ITEA-13015), INVALUE Project (ANI|P2020 17990), and has received funding from FEDER Funds through NORTE2020 program and from National Funds through FCT under the project UID/EEA/00760/2013.
Copyright information
© 2019 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Canito, A., Fernandes, M., Conceição, L., Praça, I., Marreiros, G. (2019). A Big Data Platform for Industrial Enterprise Asset Value Enablers. In: De La Prieta, F., Omatu, S., Fernández-Caballero, A. (eds) Distributed Computing and Artificial Intelligence, 15th International Conference. DCAI2018 2018. Advances in Intelligent Systems and Computing, vol 800. Springer, Cham. https://doi.org/10.1007/978-3-319-94649-8_18
DOI: https://doi.org/10.1007/978-3-319-94649-8_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-94648-1
Online ISBN: 978-3-319-94649-8
eBook Packages: Intelligent Technologies and Robotics (R0)