Timely and continuous machine-learning-based classification for interactive IP traffic

Published: 01 December 2012

Abstract

Machine Learning (ML) for classifying IP traffic has relied on analyzing statistics of either full flows or only their first few packets. However, automated QoS management for interactive traffic flows requires timely classification well before the flows finish. Interactive flows are also often long-lived and should be monitored continuously during their lifetime. We propose to achieve this by using statistics derived from sub-flows: a small number of the most recent packets taken at any point in a flow's lifetime. The ML classifier must then be trained on a set of sub-flows, and we investigate different sub-flow selection strategies. We also propose to augment training datasets so that classification accuracy is maintained even when a classifier mixes up client-to-server and server-to-client directions for applications exhibiting asymmetric traffic characteristics. We demonstrate the effectiveness of our approach with the Naive Bayes and C4.5 Decision Tree ML algorithms for the identification of first-person-shooter online game and VoIP traffic. Our results show that we can classify both applications with up to 99% Precision and 95% Recall in less than 1 s. Results remain stable regardless of the traffic direction and of where within a flow the classifier captures the packets.
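
The two ideas in the abstract, computing features over a window of the most recent packets and augmenting the training set with direction-swapped copies, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the `Packet` type, the feature names, the choice of statistics, and the fixed-step window selection in `training_instances` are all assumptions made for the example, since the abstract does not specify them.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class Packet:
    """Hypothetical packet record; field names are assumptions."""
    timestamp: float  # arrival time in seconds
    length: int       # packet length in bytes
    forward: bool     # True if the classifier sees it as client-to-server


def _direction_stats(packets: Sequence[Packet], prefix: str) -> Dict[str, float]:
    """Simple per-direction statistics (an assumed feature set)."""
    lengths = [p.length for p in packets]
    gaps = [b.timestamp - a.timestamp for a, b in zip(packets, packets[1:])]
    return {
        f"{prefix}_pkt_count": float(len(lengths)),
        f"{prefix}_len_mean": float(mean(lengths)) if lengths else 0.0,
        f"{prefix}_len_std": float(pstdev(lengths)) if lengths else 0.0,
        f"{prefix}_iat_mean": float(mean(gaps)) if gaps else 0.0,
    }


def subflow_features(flow: Sequence[Packet], end: int, n: int) -> Dict[str, float]:
    """Features of the sub-flow made of the n most recent packets before index `end`."""
    window = flow[max(0, end - n):end]
    fwd = [p for p in window if p.forward]
    bwd = [p for p in window if not p.forward]
    return {**_direction_stats(fwd, "fwd"), **_direction_stats(bwd, "bwd")}


def swap_directions(features: Dict[str, float]) -> Dict[str, float]:
    """Mirror an instance by exchanging forward and backward features."""
    swapped = {}
    for name, value in features.items():
        prefix, rest = name.split("_", 1)
        other = "bwd" if prefix == "fwd" else "fwd"
        swapped[f"{other}_{rest}"] = value
    return swapped


def training_instances(flow: Sequence[Packet], n: int, step: int) -> List[Dict[str, float]]:
    """Take one sub-flow every `step` packets (an assumed selection strategy)
    and add a direction-swapped copy of each instance."""
    instances = []
    for end in range(n, len(flow) + 1, step):
        feats = subflow_features(flow, end, n)
        instances.append(feats)
        instances.append(swap_directions(feats))
    return instances
```

At classification time the same `subflow_features` call would be applied to whatever window of recent packets is available, so the classifier does not need to see the start of the flow. The swapped copies are there so that a model trained on them sees both orientations of an asymmetric application.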

Published In

IEEE/ACM Transactions on Networking, Volume 20, Issue 6
December 2012
336 pages

Publisher

IEEE Press

Publication History

Published: 01 December 2012
Accepted: 29 January 2012
Revised: 15 November 2011
Received: 05 May 2011
Published in TON Volume 20, Issue 6

Author Tags

  1. interactive traffic
  2. machine learning (ML)
  3. sub-flows
  4. traffic classification

Qualifiers

  • Article
