Active learning for P2P traffic identification

Liu, San-Min; Sun, Zhi-Xin

doi:10.1007/s12083-014-0281-3

Published: 17 May 2014

Volume 8, pages 733–740 (2015)

Peer-to-Peer Networking and Applications

San-Min Liu^1,2 &
Zhi-Xin Sun^1,3

Abstract

Many machine-learning methods for P2P traffic identification have been proposed, but they depend on a large, representative set of labeled samples. To overcome the sample-labeling problem, we present a new active-learning approach to P2P traffic identification called P2PTIAL. P2PTIAL is composed of two parts: a support vector machine as the learner and distance-based uncertainty selection. To improve the effectiveness of P2PTIAL, we add a filtering policy and a balanced policy. First, we use support vector data description (SVDD) theory to filter out unlabeled samples that contribute little to active learning, which saves computation cost and storage space. Second, we use the pre-labeled information of the unlabeled samples to develop a balanced policy that keeps learning balanced. Finally, we support our design with extensive simulation experiments, and the results show that P2PTIAL is feasible.
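
The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea it names: a margin-based learner that queries the unlabeled samples closest to its decision boundary, with queries alternated between the two predicted classes as one possible reading of a "balanced policy". Everything here is an assumption made for illustration: the function names (`train_linear_svm`, `select_queries`, `active_learn`), the `oracle` callback that stands in for a human labeler, and the plain subgradient-descent linear SVM that stands in for whatever SVM solver the authors used. The SVDD filtering step is not shown.

```python
import random

def decision_value(w, b, x):
    """Signed score w.x + b; its magnitude grows with distance from the hyperplane."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(X, y, lam=0.01, eta=0.1, epochs=100, seed=0):
    """Linear SVM (hinge loss + L2 penalty) fitted by stochastic subgradient descent.
    Hypothetical stand-in for a real SVM solver."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            violated = y[i] * decision_value(w, b, X[i]) < 1
            w = [(1 - eta * lam) * wj for wj in w]   # L2 shrinkage
            if violated:                              # hinge subgradient step
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b

def select_queries(w, b, X, unlabeled, batch):
    """Pick up to `batch` unlabeled indices nearest the hyperplane, alternating
    between the predicted-positive and predicted-negative sides (assumed policy)."""
    scores = {i: decision_value(w, b, X[i]) for i in unlabeled}
    pos = sorted((i for i in unlabeled if scores[i] >= 0), key=lambda i: abs(scores[i]))
    neg = sorted((i for i in unlabeled if scores[i] < 0), key=lambda i: abs(scores[i]))
    chosen = []
    while len(chosen) < batch and (pos or neg):
        for side in (pos, neg):
            if side and len(chosen) < batch:
                chosen.append(side.pop(0))
    return chosen

def active_learn(X, oracle, seed_idx, rounds, batch):
    """Train on a few seed labels, then repeatedly query the most uncertain
    samples, ask the oracle for their labels, and retrain."""
    labeled = {i: oracle(i) for i in seed_idx}
    unlabeled = [i for i in range(len(X)) if i not in labeled]
    idx = sorted(labeled)
    w, b = train_linear_svm([X[i] for i in idx], [labeled[i] for i in idx])
    for _ in range(rounds):
        if not unlabeled:
            break
        for i in select_queries(w, b, X, unlabeled, batch):
            labeled[i] = oracle(i)
            unlabeled.remove(i)
        idx = sorted(labeled)
        w, b = train_linear_svm([X[i] for i in idx], [labeled[i] for i in idx])
    return w, b, labeled
```

In this sketch the labeling cost is the number of `oracle` calls, which is `len(seed_idx)` plus at most `rounds * batch`, rather than the size of the whole pool. A real traffic classifier would replace the toy learner with a kernel SVM and the one-dimensional scores with flow-level features.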

Acknowledgments

The authors would like to thank the anonymous referees for their very valuable comments. This work was supported in part by the National Science Foundation of China under Grants 60973140, 61170276, 61300170, 61373135, and 71371012; the Key Project for Outstanding Young Talents in Higher Education Institutions of Anhui Province of China under Grant 2013 SQRL034ZD; the Natural Science Foundation of the Higher Education Institutions of Jiangsu Province of China under Grant 12 KJA520003; and the Innovation Fund for Technology-based Enterprise of Jiangsu Province of China under Grant BC2013027.

Author information

Authors and Affiliations

College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, 210016, People's Republic of China
San-Min Liu & Zhi-Xin Sun
College of Computer and Information, Anhui Polytechnic University, Wuhu, 241000, People's Republic of China
San-Min Liu
Key Laboratory of Broadband Wireless Communication and Sensor Network Technology, Ministry of Education, Nanjing University of Posts and Telecommunications, Nanjing, 210003, People's Republic of China
Zhi-Xin Sun

Corresponding author

Correspondence to San-Min Liu.

About this article

Cite this article

Liu, SM., Sun, ZX. Active learning for P2P traffic identification. Peer-to-Peer Netw. Appl. 8, 733–740 (2015). https://doi.org/10.1007/s12083-014-0281-3

Received: 06 August 2013
Accepted: 05 May 2014
Published: 17 May 2014
Issue Date: September 2015
DOI: https://doi.org/10.1007/s12083-014-0281-3
