research-article

MalAlert: Detecting Malware in Large-Scale Network Traffic Using Statistical Features

Published: 25 January 2019
  • Abstract

    In recent years, we have witnessed the spread of a wide variety of malware that operates and propagates over network communications. Due to the staggering growth of traffic over the same period, detecting malicious software on a packet-by-packet basis has become infeasible. In this paper, we address this challenge by investigating malware behaviors and designing a method to detect them using only network flow-level data. In our analysis, we identify malware types with regard to their impact on a network and the way they achieve their malicious purposes. Leveraging this knowledge, we propose a machine learning-based, privacy-preserving method to detect malware. We evaluate our method on two malware datasets (MalRec and CTU-13) containing traffic of over 65,000 malware samples, as well as one month of network traffic from the University of Oxford containing over 23 billion flows. We show that, despite the coarse-grained information provided by network flows and the imbalance between legitimate and malicious traffic, MalAlert can distinguish between different types of malware with an F1 score of 90%.
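
To make the two ideas in the abstract concrete, the sketch below shows (a) how flow records might be aggregated into per-host statistical features and (b) how an F1 score is computed, which is the usual metric when malicious traffic is rare relative to legitimate traffic. This is an illustration only, not MalAlert's actual feature set or pipeline: the record layout, the feature names, and the helpers `host_features` and `f1_score` are all assumptions introduced here.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical flow record layout (an assumption, not taken from the paper):
# (src_ip, dst_ip, dst_port, n_packets, n_bytes, duration_s)


def host_features(flows):
    """Aggregate flow records into simple per-source-host statistics."""
    by_host = defaultdict(list)
    for src, dst, port, pkts, byts, dur in flows:
        by_host[src].append((dst, port, pkts, byts, dur))

    features = {}
    for host, recs in by_host.items():
        features[host] = {
            "n_flows": len(recs),
            "n_dst_ips": len({r[0] for r in recs}),
            "n_dst_ports": len({r[1] for r in recs}),
            "mean_packets": mean(r[2] for r in recs),
            "mean_bytes": mean(r[3] for r in recs),
            "mean_duration": mean(r[4] for r in recs),
        }
    return features


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero/undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    flows = [
        ("10.0.0.1", "1.1.1.1", 80, 10, 1000, 1.0),
        ("10.0.0.1", "1.1.1.2", 22, 2, 100, 0.5),
        ("10.0.0.2", "1.1.1.1", 80, 5, 500, 2.0),
    ]
    print(host_features(flows)["10.0.0.1"])
    print(f1_score([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

Because the features are computed from flow metadata alone (addresses, ports, counts, durations), no packet payload is inspected, which is the sense in which a flow-level approach can be privacy-preserving. F1 is preferred over plain accuracy here because a classifier that labels everything as legitimate would score high accuracy on heavily imbalanced traffic while detecting nothing.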

    Published In

    ACM SIGMETRICS Performance Evaluation Review, Volume 46, Issue 3
    December 2018
    174 pages
    ISSN: 0163-5999
    DOI: 10.1145/3308897
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Cited By

    • (2024) An Ensemble-Based Machine Learning-Envisioned Intrusion Detection in Industry 5.0-Driven Healthcare Applications. IEEE Transactions on Consumer Electronics, 70:1 (1903-1912). DOI: 10.1109/TCE.2023.3318850. Online publication date: Feb-2024
    • (2024) Securing the Industrial Internet of Things against ransomware attacks: A comprehensive analysis of the emerging threat landscape and detection mechanisms. Journal of Network and Computer Applications, 223 (103809). DOI: 10.1016/j.jnca.2023.103809. Online publication date: Mar-2024
    • (2023) The ascent of network traffic classification in the dark net. Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology, 45:3 (3679-3700). DOI: 10.3233/JIFS-231099. Online publication date: 1-Jan-2023
    • (2022) Exposing the Rat in the Tunnel. Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security (875-889). DOI: 10.1145/3548606.3560604. Online publication date: 7-Nov-2022
    • (2022) NBP-MS: Malware Signature Generation Based on Network Behavior Profiling. 2022 26th International Conference on Pattern Recognition (ICPR) (1865-1870). DOI: 10.1109/ICPR56361.2022.9956412. Online publication date: 21-Aug-2022
    • (2022) Procedures, Criteria, and Machine Learning Techniques for Network Traffic Classification: A Survey. IEEE Access, 10 (61135-61158). DOI: 10.1109/ACCESS.2022.3181135. Online publication date: 2022
    • (2022) Establishing the Contaminating Effect of Metadata Feature Inclusion in Machine-Learned Network Intrusion Detection Models. Detection of Intrusions and Malware, and Vulnerability Assessment (23-41). DOI: 10.1007/978-3-031-09484-2_2. Online publication date: 29-Jun-2022
    • (2021) Detection of illicit cryptomining using network metadata. EURASIP Journal on Information Security, 2021:1. DOI: 10.1186/s13635-021-00126-1. Online publication date: 4-Dec-2021
    • (2021) MalPhase: Fine-Grained Malware Detection Using Network Flow Data. Proceedings of the 2021 ACM Asia Conference on Computer and Communications Security (774-786). DOI: 10.1145/3433210.3453101. Online publication date: 24-May-2021
    • (2021) AndroCreme: Unseen Android Malware Detection Based on Inductive Conformal Learning. 2021 IEEE 20th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom) (651-658). DOI: 10.1109/TrustCom53373.2021.00097. Online publication date: Oct-2021
