Performance Meta-analysis for Big-Data Univariate Auto-Imputation in the Building Sector

Stefanopoulou, Aliki; Michailidis, Iakovos; Dimara, Asimina; Krinidis, Stelios; Kosmatopoulos, Elias B.; Anagnostopoulos, Christos-Nikolaos; Tzovaras, Dimitrios

doi:10.1007/978-3-031-08341-9_23

Part of the book series: IFIP Advances in Information and Communication Technology (IFIPAICT, volume 652)

Included in the following conference series:

IFIP International Conference on Artificial Intelligence Applications and Innovations

Abstract

Filtering refers to the process of defining, detecting and correcting errors in a given dataset in order to achieve system reliability and minimize the impact of errors on data analysis. Automated and accurate tools for data filtering and healing are crucial to ensuring system reliability. This study investigates statistical and machine-learning-based methodologies for healing data gaps and imputing missing values. Five models are investigated individually: the well-known ARIMA model, Linear Interpolation, Polynomial Interpolation, General Regression and Facebook Prophet. The raw data used to evaluate these methods are simulated, and artificial data gaps are imposed randomly within the dataset so that the univariate imputation performance of the models can be evaluated using Mean Squared Error and Mean Absolute Error. As expected, the evaluation results illustrate the advantage of the highly elaborate machine-learning model Facebook Prophet over the simpler statistical ARIMA model, at the expense of time and computational effort. However, for Big Data univariate imputation applications, the findings suggest that a combination of ARIMA and Facebook Prophet, selected according to the data gap size, could balance the required computational resources while maintaining highly accurate imputation results.
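The evaluation protocol described in the abstract (simulate a series, impose an artificial gap at a random position, impute it, and score the result with Mean Squared Error and Mean Absolute Error) can be illustrated with a minimal sketch. Only linear interpolation, the simplest of the five models, is shown. The sinusoidal signal, the series length, the gap length and all function names below are assumptions made for illustration; they are not taken from the paper.

```python
import math
import random


def simulate_series(n, period=24):
    """Simulated signal with a repeating cycle (an assumed shape, not the paper's data)."""
    return [20.0 + 5.0 * math.sin(2.0 * math.pi * t / period) for t in range(n)]


def impose_gap(series, gap_len, rng):
    """Return a copy with one contiguous gap of None values at a random interior position."""
    start = rng.randint(1, len(series) - gap_len - 1)
    gapped = list(series)
    for i in range(start, start + gap_len):
        gapped[i] = None
    return gapped, start


def linear_impute(gapped):
    """Fill each interior run of None with a straight line between its known neighbours."""
    filled = list(gapped)
    i = 0
    while i < len(filled):
        if filled[i] is None:
            j = i
            while filled[j] is None:
                j += 1
            left, right = filled[i - 1], filled[j]
            span = j - i + 1  # distance between the two known neighbours
            for k in range(i, j):
                filled[k] = left + (right - left) * (k - i + 1) / span
            i = j
        else:
            i += 1
    return filled


def mse(truth, pred, idx):
    """Mean Squared Error over the imputed positions only."""
    return sum((truth[i] - pred[i]) ** 2 for i in idx) / len(idx)


def mae(truth, pred, idx):
    """Mean Absolute Error over the imputed positions only."""
    return sum(abs(truth[i] - pred[i]) for i in idx) / len(idx)


GAP_LEN = 6  # hypothetical gap size
rng = random.Random(0)
truth = simulate_series(240)
gapped, start = impose_gap(truth, GAP_LEN, rng)
filled = linear_impute(gapped)
gap_idx = range(start, start + GAP_LEN)
print("MSE:", mse(truth, filled, gap_idx))
print("MAE:", mae(truth, filled, gap_idx))
```

Scoring only the positions inside the gap keeps the untouched observations from diluting the error. Repeating the run over several gap lengths would show how the error of a given model grows with gap size, which is the kind of comparison the abstract's gap-size-dependent recommendation relies on.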

Acknowledgements

The research leading to these results was partially funded by the European Commission “EEB-07-2017 Integration of energy harvesting at building and district level” - PLUG-N-HARVEST H2020 project (Grant agreement ID: 768735) https://www.plug-n-harvest.eu/, accessed on 22 February 2022; and “LC-SC3-B4E-3-2020 Upgrading smartness of existing buildings through innovations for legacy equipment” - Smart2B H2020 project (Grant agreement ID: 101023666) https://www.smart2b-project.eu/, accessed on 2 March 2022.

Author information

Authors and Affiliations

Information Technologies Institute, Centre for Research and Technology Hellas, 57001, Thessaloniki, Greece
Aliki Stefanopoulou, Iakovos Michailidis, Asimina Dimara, Stelios Krinidis, Elias B. Kosmatopoulos & Dimitrios Tzovaras
Electrical and Computer Engineering Department, Democritus University of Thrace, 67000, Xanthi, Greece
Iakovos Michailidis & Elias B. Kosmatopoulos
Management Science and Technology Department, International Hellenic University (IHU), Kavala, Greece
Stelios Krinidis
Department of Cultural Technology and Communication, Intelligent Systems Lab, University of the Aegean, Mytilene, Greece
Asimina Dimara & Christos-Nikolaos Anagnostopoulos

Corresponding author

Correspondence to Asimina Dimara.

Editor information

Editors and Affiliations

University of Piraeus, Piraeus, Greece
Ilias Maglogiannis
Democritus University of Thrace, Xanthi, Greece
Lazaros Iliadis
University of Sunderland, Sunderland, UK
John Macintyre
Universidade do Minho, Guimaraes, Portugal
Paulo Cortez

About this paper

Cite this paper

Stefanopoulou, A. et al. (2022). Performance Meta-analysis for Big-Data Univariate Auto-Imputation in the Building Sector. In: Maglogiannis, I., Iliadis, L., Macintyre, J., Cortez, P. (eds) Artificial Intelligence Applications and Innovations. AIAI 2022 IFIP WG 12.5 International Workshops. AIAI 2022. IFIP Advances in Information and Communication Technology, vol 652. Springer, Cham. https://doi.org/10.1007/978-3-031-08341-9_23

DOI: https://doi.org/10.1007/978-3-031-08341-9_23
Published: 10 June 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-08340-2
Online ISBN: 978-3-031-08341-9
eBook Packages: Computer Science, Computer Science (R0)
