Abstract
The volume of data generated by sources such as smartphones, social networks, email, customer click streams, sensors, and the Internet of Things is growing rapidly, and much of it exhibits Big Data attributes. Recent efforts have focused on developing models for knowledge discovery from such data within the research area of stream mining, and data stream classification in particular. Ensemble learners have become a popular approach to data stream classification because of their stability-elasticity property, which helps them handle data stream challenges such as concept drift, recurrent concepts, novel class detection, and class imbalance. In this paper, we compare ten ensemble classifiers with respect to concept drift and class imbalance using Prequential AUC. In addition, the Friedman nonparametric statistical test and the Nemenyi post-hoc test are used to identify the best approach among them. To some extent, this work can also serve as part of a review of existing ensemble classifier algorithms for non-stationary data streams.
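The statistical comparison mentioned in the abstract can be illustrated with a minimal sketch of the Friedman statistic and the Nemenyi critical difference, in the form described by Demšar (2006). This is not the authors' code: the function names are hypothetical, the AUC values are made-up toy inputs rather than results from the paper, and the value q_alpha = 2.343 is assumed to be the tabulated critical value for three classifiers at a 0.05 significance level.

```python
from math import sqrt


def average_ranks(scores):
    """Mean rank of each classifier over all datasets.

    scores[i][j] is the score (e.g. AUC) of classifier j on dataset i.
    Higher is better, rank 1 is best, and tied scores share the average rank.
    """
    n, k = len(scores), len(scores[0])
    totals = [0.0] * k
    for row in scores:
        order = sorted(range(k), key=lambda j: -row[j])  # best score first
        pos = 0
        while pos < k:
            end = pos
            # Extend the group while the next classifier has the same score.
            while end + 1 < k and row[order[end + 1]] == row[order[pos]]:
                end += 1
            shared = (pos + end) / 2 + 1  # average of 1-based ranks pos+1..end+1
            for j in order[pos:end + 1]:
                totals[j] += shared
            pos = end + 1
    return [t / n for t in totals]


def friedman_statistic(mean_ranks, n):
    """Friedman chi-square statistic for k classifiers over n datasets."""
    k = len(mean_ranks)
    sum_sq = sum(r * r for r in mean_ranks)
    return 12 * n / (k * (k + 1)) * (sum_sq - k * (k + 1) ** 2 / 4)


def nemenyi_cd(k, n, q_alpha):
    """Critical difference: two classifiers differ significantly when
    their mean ranks differ by at least this amount."""
    return q_alpha * sqrt(k * (k + 1) / (6 * n))


# Toy input (made up): 4 datasets (rows) x 3 classifiers (columns).
toy_auc = [
    [0.90, 0.85, 0.80],
    [0.88, 0.86, 0.70],
    [0.75, 0.80, 0.60],
    [0.92, 0.91, 0.85],
]
ranks = average_ranks(toy_auc)
chi2 = friedman_statistic(ranks, n=len(toy_auc))
cd = nemenyi_cd(k=3, n=len(toy_auc), q_alpha=2.343)  # assumed table value
print(ranks, chi2, cd)
```

The Friedman statistic is compared against a chi-square distribution with k - 1 degrees of freedom. Only if the null hypothesis of equal performance is rejected does the Nemenyi step apply, comparing each pairwise difference in mean rank with the critical difference.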
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Tambuwal, A.I., Neagu, D., Gheorghe, M. (2017). An Experimental Comparison of Ensemble Classifiers for Evolving Data Streams. In: Bramer, M., Petridis, M. (eds) Artificial Intelligence XXXIV. SGAI 2017. Lecture Notes in Computer Science, vol 10630. Springer, Cham. https://doi.org/10.1007/978-3-319-71078-5_14
DOI: https://doi.org/10.1007/978-3-319-71078-5_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-71077-8
Online ISBN: 978-3-319-71078-5
eBook Packages: Computer Science (R0)