Time-Efficient Ensemble Learning with Sample Exchange for Edge Computing

Authors:

Jiamou Liu, and

Kim-Kwang Raymond Choo

ACM Transactions on Internet Technology (TOIT), Volume 21, Issue 3

Article No.: 76, Pages 1 - 17

https://doi.org/10.1145/3409265

Published: 16 June 2021

Abstract

In existing ensemble learning algorithms (e.g., random forest), each base learner’s model needs the entire dataset for sampling and training. However, this may not be practical in many real-world applications, and it incurs additional computational costs. To achieve better efficiency, we propose a decentralized framework: Multi-Agent Ensemble. The framework leverages edge computing to facilitate ensemble learning techniques by focusing on the balancing of access restrictions (small sub-dataset) and accuracy enhancement. Specifically, network edge nodes (learners) are utilized to model classifications and predictions in our framework. Data is then distributed to multiple base learners who exchange data via an interaction mechanism to achieve improved prediction. The proposed approach relies on a training model rather than conventional centralized learning. Findings from the experimental evaluations using 20 real-world datasets suggest that Multi-Agent Ensemble outperforms other ensemble approaches in terms of accuracy even though the base learners require fewer samples (i.e., significant reduction in computation costs).
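
The abstract describes the general shape of the framework: each edge node trains a base learner on a small sub-dataset, nodes exchange samples, and predictions are combined. The following is a minimal sketch of that idea only, not the authors' algorithm. The nearest-centroid learner, the ring-shaped random exchange rule, the majority vote, the synthetic data, and every name in the code (`CentroidAgent`, `exchange`, `run`, and their parameters) are assumptions made for illustration; the abstract does not specify the interaction mechanism.

```python
import random
from collections import Counter


def make_data(n, seed=0):
    """Synthetic 2-D data (an assumption): label 0 near (0, 0), label 1 near (3, 3)."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        center = 3.0 * label
        data.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label))
    rng.shuffle(data)
    return data


class CentroidAgent:
    """Hypothetical edge node: holds a small local sub-dataset and a simple learner."""

    def __init__(self, samples):
        self.samples = list(samples)
        self.centroids = {}

    def fit(self):
        # Nearest-centroid classifier: one mean point per class label.
        sums = {}
        for (x, y), label in self.samples:
            sx, sy, n = sums.get(label, (0.0, 0.0, 0))
            sums[label] = (sx + x, sy + y, n + 1)
        self.centroids = {lab: (sx / n, sy / n) for lab, (sx, sy, n) in sums.items()}

    def predict(self, point):
        x, y = point
        return min(
            self.centroids,
            key=lambda lab: (x - self.centroids[lab][0]) ** 2
            + (y - self.centroids[lab][1]) ** 2,
        )


def exchange(agents, k, rng):
    """Assumed exchange rule: each agent copies k random samples to its ring neighbor."""
    outgoing = [rng.sample(a.samples, min(k, len(a.samples))) for a in agents]
    for i, batch in enumerate(outgoing):
        agents[(i + 1) % len(agents)].samples.extend(batch)


def ensemble_predict(agents, point):
    """Combine base learners by majority vote."""
    votes = Counter(agent.predict(point) for agent in agents)
    return votes.most_common(1)[0][0]


def run(n_agents=5, per_agent=20, rounds=3, k=5, seed=0):
    rng = random.Random(seed)
    train = make_data(n_agents * per_agent, seed)
    test = make_data(200, seed + 1)
    # Each agent starts with a disjoint slice, never the entire dataset.
    agents = [
        CentroidAgent(train[i * per_agent:(i + 1) * per_agent])
        for i in range(n_agents)
    ]
    for agent in agents:
        agent.fit()
    for _ in range(rounds):
        exchange(agents, k, rng)
        for agent in agents:
            agent.fit()  # retrain on local plus received samples
    correct = sum(ensemble_predict(agents, p) == label for p, label in test)
    return correct / len(test), [len(a.samples) for a in agents]


if __name__ == "__main__":
    accuracy, sizes = run()
    print("ensemble accuracy:", accuracy)
    print("samples held per agent:", sizes)
```

In this sketch every agent ends up holding far fewer samples than the full training set, which is the trade-off the abstract highlights. How closely the accuracy tracks a centralized learner depends entirely on the assumed data and exchange rule.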

Index Terms

Computing methodologies → Distributed computing methodologies → Distributed algorithms


Published In

ACM Transactions on Internet Technology Volume 21, Issue 3

August 2021

522 pages

ISSN:1533-5399

EISSN:1557-6051

DOI:10.1145/3468071

Editor:
Ling Liu
Georgia Institute of Technology, USA

Copyright © 2021 Association for Computing Machinery.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 16 June 2021

Accepted: 01 March 2021

Revised: 01 November 2020

Received: 01 December 2019

Published in TOIT Volume 21, Issue 3

Qualifiers

Research-article
Refereed

Funding Sources

Fundamental Research Funds for the Central Universities
Key Research Base of Humanities and Social Sciences of Chongqing
National Natural Science Foundation of China
Beijing Municipal Natural Science Foundation
Shandong Provincial Natural Science Foundation
Cloud Technology Endowed Professorship

Bibliometrics

Total Citations: 4
Total Downloads: 184
Downloads (Last 12 months): 21
Downloads (Last 6 weeks): 0

Cited By

Y. Luo. 2022. The Implementation Path of Labor Education in Applied Universities Driven by Artificial Intelligence Technology. Mobile Information Systems 2022. Online publication date: 5-Sep-2022.
https://dl.acm.org/doi/10.1155/2022/5375449

Z. Gao, Y. Zhang, and W. Sun. 2022. Artificial Intelligence Service by Satellite Networks based on Ensemble Learning with Cloud-Edge-End Integration. In 2022 IEEE/CIC International Conference on Communications in China (ICCC Workshops). 158–163. Online publication date: 11-Aug-2022.
https://doi.org/10.1109/ICCCWorkshops55477.2022.9896696

Z. Gao, W. Sun, and Y. Zhang. 2022. Comparison of Ensemble and Federal Learning for Secure Data Collaboration in Satellite Networks. In 2022 IEEE/CIC International Conference on Communications in China (ICCC Workshops). 176–181. Online publication date: 11-Aug-2022.
https://doi.org/10.1109/ICCCWorkshops55477.2022.9896660

D. Zheng, Y. Zhang, and Z. Xiao. 2021. Deep Learning-Driven Gaussian Modeling and Improved Motion Detection Algorithm of the Three-Frame Difference Method. Mobile Information Systems 2021. Online publication date: 1-Jan-2021.
https://dl.acm.org/doi/10.1155/2021/9976623
