
The online performance estimation framework: heterogeneous ensemble learning for data streams

Published: 01 January 2018

Abstract

Ensembles of classifiers are among the best performing classifiers available in many data mining applications, including the mining of data streams. Rather than training one classifier, multiple classifiers are trained, and their predictions are combined according to a given voting scheme. An important prerequisite for ensembles to be successful is that the individual models are diverse. One way to vastly increase the diversity among the models is to build a heterogeneous ensemble, composed of fundamentally different model types. However, most ensembles developed specifically for the dynamic data stream setting rely on only one type of base-level classifier, most often Hoeffding Trees. We study the use of heterogeneous ensembles for data streams. We introduce the Online Performance Estimation framework, which dynamically weights the votes of the individual classifiers in an ensemble. Using an internal evaluation on recent training data, it measures how well each ensemble member has performed recently and updates their weights accordingly. Experiments over a wide range of data streams show performance that is competitive with state-of-the-art ensemble techniques, including Online Bagging and Leveraging Bagging, while being significantly faster. All experimental results from this work are easily reproducible and publicly available online.
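
The weighting idea described in the abstract can be sketched in a few lines of code. The Python snippet below is a minimal illustration under stated assumptions, not the authors' implementation: it assumes incremental base learners exposing `predict` and `partial_fit` methods (names chosen for illustration), and it uses a sliding window of per-member 0/1 correctness as the online performance estimate. Each incoming instance is first used to score every member (test-then-train) and only then used for training; a member's vote is weighted by its estimated accuracy over the recent window.

```python
from collections import defaultdict, deque

class OnlinePerformanceEstimationEnsemble:
    """Minimal sketch of a heterogeneous ensemble whose members are
    weighted by their accuracy over a window of recent instances.
    An illustration of the idea, not the paper's implementation."""

    def __init__(self, members, window_size=1000):
        self.members = members          # heterogeneous incremental learners (assumed API)
        # one window of 0/1 correctness flags per ensemble member
        self.windows = [deque(maxlen=window_size) for _ in members]

    def weight(self, i):
        # estimated performance = fraction of recent instances member i got right
        w = self.windows[i]
        return sum(w) / len(w) if w else 1.0   # members with no history get full weight

    def predict(self, x):
        # weighted majority vote over the members' predictions
        votes = defaultdict(float)
        for i, m in enumerate(self.members):
            votes[m.predict(x)] += self.weight(i)
        return max(votes, key=votes.get)

    def partial_fit(self, x, y):
        # test-then-train: score each member on the new instance first,
        # then let it learn from that instance
        for i, m in enumerate(self.members):
            self.windows[i].append(1 if m.predict(x) == y else 0)
            m.partial_fit(x, y)
```

The fixed window size trades responsiveness against stability: a short window tracks concept drift quickly but yields noisy weight estimates, while a long window smooths the estimate at the cost of slower adaptation. A fading-factor (exponential decay) estimate could be substituted for the window to weight the most recent performance more heavily.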

      Published In

Machine Learning, Volume 107, Issue 1
      January 2018
      307 pages

      Publisher

Kluwer Academic Publishers, United States

      Author Tags

      1. Data streams
      2. Ensembles
      3. Meta-learning
