DOI: 10.1145/1978672.1978676

Statistical analysis of honeypot data and building of Kyoto 2006+ dataset for NIDS evaluation

Published: 10 April 2011

Abstract

With the rapid evolution and proliferation of botnets, large-scale cyber attacks such as DDoS and spam email are becoming increasingly dangerous and serious threats. As a result, network-based security technologies such as Network-based Intrusion Detection Systems (NIDSs), Intrusion Prevention Systems (IPSs), and firewalls have received considerable attention as a means of defending crucial computer systems, networks, and sensitive information from attackers on the Internet. In particular, much effort has gone into high-performance NIDSs based on data mining and machine learning techniques. However, there is a fatal problem: the existing evaluation dataset, the KDD Cup '99 dataset, cannot reflect current network situations or the latest attack trends, because it was generated by simulation over a virtual network more than 10 years ago. To the best of our knowledge, there is no alternative evaluation dataset. In this paper, we present a new evaluation dataset, called Kyoto 2006+, built on 3 years of real traffic data (Nov. 2006 to Aug. 2009) obtained from diverse types of honeypots. The Kyoto 2006+ dataset will help IDS researchers obtain more practical, useful, and accurate evaluation results. Furthermore, we provide detailed analysis results of the honeypot data and share our experiences so that security researchers can gain insight into the trends of the latest cyber attacks and the state of the Internet.

        Published In

        BADGERS '11: Proceedings of the First Workshop on Building Analysis Datasets and Gathering Experience Returns for Security
        April 2011
        111 pages
        ISBN: 9781450307680
        DOI: 10.1145/1978672
        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. Kyoto 2006+ dataset
        2. NIDS
        3. honeypot data

        Qualifiers

        • Research-article

        Funding Sources

        • Strategic Information and Communications R&D Promotion Programme

        Conference

        EuroSys '11: Sixth EuroSys Conference 2011
        April 10, 2011
        Salzburg, Austria

        Acceptance Rates

        Overall Acceptance Rate 4 of 7 submissions, 57%
