Abstract
Accurately identifying network traffic at its early stage is very important for network management and security. In recent years, more and more studies have been devoted to finding effective machine learning models that identify traffic from the few packets available at the early stage. In this paper, we build an early stage traffic identification model by applying flexible neural trees (FNT). Three network traffic data sets, including two open data sets, are used for the study. We first extract both packet-level features and statistical features from the first six continuous packets and six noncontinuous packets of each flow. Packet sizes are used as packet-level features; for statistical features, the average, standard deviation, maximum and minimum are selected. Eight classical classifiers are employed as comparison methods in the identification experiments. Accuracy, true positive rate (TPR) and false positive rate (FPR) are used to evaluate the performance of the compared methods. FNT outperforms the other methods in most cases in the identification experiments, and it performs very well on both TPR and FPR. Furthermore, the optimal tree it produces shows which features were selected. The experimental results show that FNT is effective for early stage traffic identification.
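The feature construction and the evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the feature ordering, the use of the population standard deviation, and the one-vs-rest definition of TPR and FPR are all assumptions made for illustration, and only the case of the first six continuous packets is covered.

```python
import statistics
from typing import List, Sequence, Tuple

N_PACKETS = 6  # the abstract uses the first six packets of each flow


def early_stage_features(packet_sizes: Sequence[int], n: int = N_PACKETS) -> List[float]:
    """Build one feature vector from the first n packet sizes of a flow.

    Assumed layout: the n packet sizes, then average, standard deviation,
    maximum and minimum of those sizes.
    """
    if len(packet_sizes) < n:
        raise ValueError(f"flow has fewer than {n} packets")
    head = [float(size) for size in packet_sizes[:n]]
    stats = [
        statistics.fmean(head),
        statistics.pstdev(head),  # population std is an assumption
        max(head),
        min(head),
    ]
    return head + stats


def tpr_fpr(y_true: Sequence[str], y_pred: Sequence[str], positive: str) -> Tuple[float, float]:
    """One-vs-rest true positive rate and false positive rate for one class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


if __name__ == "__main__":
    # Hypothetical packet sizes and labels, for illustration only.
    print(early_stage_features([100, 200, 300, 400, 500, 600, 700]))
    print(tpr_fpr(["http", "http", "p2p", "p2p", "p2p"],
                  ["http", "p2p", "p2p", "http", "p2p"],
                  positive="http"))
```

Each flow yields a ten-element vector under these assumptions (six packet sizes plus four statistics), which could then be passed to any classifier. TPR and FPR are computed per application class by treating that class as positive and all others as negative.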
Acknowledgments
This study is supported by the National Natural Science Foundation of China under Grant No. 61472164 and by the Natural Science Foundation of Shandong Province under Grant Nos. ZR2014JL042 and ZR2012FM010.
Ethics declarations
Conflict of interest
The authors declare that they have no conflicts of interest related to this work.
Additional information
Communicated by V. Loia.
About this article
Cite this article
Chen, Z., Peng, L., Gao, C. et al. Flexible neural trees based early stage identification for IP traffic. Soft Comput 21, 2035–2046 (2017). https://doi.org/10.1007/s00500-015-1902-3
DOI: https://doi.org/10.1007/s00500-015-1902-3