DOI: 10.1007/978-3-642-13672-6_30
Article

Fast perceptron decision tree learning from evolving data streams

Published: 21 June 2010

Abstract

Mining of data streams must balance three evaluation dimensions: accuracy, time, and memory. Excellent accuracy on data streams has been obtained with Naive Bayes Hoeffding Trees (Hoeffding Trees with naive Bayes models at the leaf nodes), albeit with increased runtime compared to standard Hoeffding Trees. In this paper, we show that runtime can be reduced by replacing naive Bayes with perceptron classifiers, while maintaining highly competitive accuracy. We also show that accuracy can be increased even further by combining majority vote, naive Bayes, and perceptrons. We evaluate four perceptron-based learning strategies and compare them against appropriate baselines: simple perceptrons, Perceptron Hoeffding Trees, hybrid Naive Bayes Perceptron Trees, and bagged versions thereof. We implement a perceptron that uses the sigmoid activation function instead of the threshold activation function and optimizes the squared error, with one perceptron per class value. We test our methods by performing an evaluation study on synthetic and real-world datasets comprising up to ten million examples.
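
The perceptron described in the abstract (sigmoid activation, squared-error objective, one perceptron per class value) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `SigmoidPerceptron`, the learning rate, the zero weight initialisation, the bias term, and the toy stream in `demo` are all hypothetical choices made for the example.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class SigmoidPerceptron:
    """One sigmoid perceptron per class, trained online on squared error.

    Hypothetical sketch: names and hyperparameters are assumptions.
    """

    def __init__(self, n_features, n_classes, learning_rate=1.0):
        self.lr = learning_rate
        # One weight vector per class; the last entry is a bias weight.
        self.weights = [[0.0] * (n_features + 1) for _ in range(n_classes)]

    def _outputs(self, x):
        xb = list(x) + [1.0]  # append constant input for the bias
        return [
            sigmoid(sum(w_i * x_i for w_i, x_i in zip(w, xb)))
            for w in self.weights
        ]

    def predict(self, x):
        # Predict the class whose perceptron gives the largest output.
        outs = self._outputs(x)
        return max(range(len(outs)), key=outs.__getitem__)

    def learn_one(self, x, y):
        # Stochastic gradient step on 0.5 * (target - output)^2 per class.
        xb = list(x) + [1.0]
        outs = self._outputs(x)
        for c, (w, o) in enumerate(zip(self.weights, outs)):
            target = 1.0 if c == y else 0.0
            # Derivative of the sigmoid is o * (1 - o).
            delta = self.lr * (target - o) * o * (1.0 - o)
            for i, x_i in enumerate(xb):
                w[i] += delta * x_i


def demo(n_examples=5000, seed=0):
    """Test-then-train on a toy two-class stream (assumed for illustration)."""
    rng = random.Random(seed)
    model = SigmoidPerceptron(n_features=2, n_classes=2)
    correct = 0
    for _ in range(n_examples):
        x = [rng.random(), rng.random()]
        y = 1 if x[0] + x[1] > 1.0 else 0
        correct += int(model.predict(x) == y)  # test first
        model.learn_one(x, y)                  # then train
    return model, correct / n_examples
```

Calling `demo()` returns the trained model together with its test-then-train accuracy on the toy stream. Each example is seen once and then discarded, which matches the single-pass constraint of stream mining.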



Published In

PAKDD'10: Proceedings of the 14th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining - Volume Part II
June 2010
514 pages
ISBN: 3642136710
Editors: Mohammed J. Zaki, Jeffrey Xu Yu, B. Ravindran, Vikram Pudi

Sponsors

  • AOARD: Asian Office of Aerospace Research and Development
  • AFOSR: AFOSR
  • ONRGlobal: U.S. Office of Naval Research Global

Publisher

Springer-Verlag

Berlin, Heidelberg


Qualifiers

  • Article
