Algorithms to accelerate multiple regular expressions matching for deep packet inspection

Published: 11 August 2006

Abstract

There is a growing demand for network devices capable of examining the content of data packets in order to improve network security and provide application-specific services. Most high-performance systems that perform deep packet inspection implement simple string matching algorithms to match packets against a large but finite set of strings. However, there is growing interest in the use of regular expression-based pattern matching, since regular expressions offer superior expressive power and flexibility. Deterministic finite automata (DFA) representations are typically used to implement regular expressions. However, DFA representations of regular expression sets arising in network applications require large amounts of memory, limiting their practical application. In this paper, we introduce a new representation for regular expressions, called the Delayed Input DFA (D2FA), which substantially reduces space requirements as compared to a DFA. A D2FA is constructed by transforming a DFA via incrementally replacing several transitions of the automaton with a single default transition. Our approach dramatically reduces the number of distinct transitions between states. For a collection of regular expressions drawn from current commercial and academic systems, a D2FA representation reduces transitions by more than 95%. Given the substantially reduced space requirements, we describe an efficient architecture that can perform deep packet inspection at multi-gigabit rates. Our architecture uses multiple on-chip memories in such a way that each remains uniformly occupied and accessed over a short duration, thus effectively distributing the load and enabling high throughput. Our architecture can provide cost-effective packet content scanning at OC-192 rates with memory requirements that are consistent with current ASIC technology.
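The idea of replacing several labeled transitions with a single default transition can be illustrated with a minimal Python sketch. It is not the paper's construction algorithm: the greedy rule used here (each state may default only to a lower-numbered state, which keeps default chains acyclic) is a simplifying assumption, and the names `build_d2fa`, `step`, `run_dfa`, `run_d2fa`, `count_transitions`, and `EXAMPLE_DFA` are all hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# A complete DFA: dfa[state][symbol] -> next state, for every symbol.
DFA = List[Dict[str, int]]

# Hypothetical toy DFA over the alphabet "abcd" that detects the substring "ab".
# State 0: start, state 1: just saw 'a', state 2: match found (absorbing).
EXAMPLE_DFA: DFA = [
    {"a": 1, "b": 0, "c": 0, "d": 0},
    {"a": 1, "b": 2, "c": 0, "d": 0},
    {"a": 2, "b": 2, "c": 2, "d": 2},
]


def build_d2fa(dfa: DFA) -> Tuple[List[Dict[str, int]], List[Optional[int]]]:
    """Return (labeled, default) tables for a default-transition automaton.

    Assumption of this sketch: state u may only default to a state v < u,
    so following defaults always ends at a state with a full table.
    """
    labeled = [dict(row) for row in dfa]
    default: List[Optional[int]] = [None] * len(dfa)
    for u in range(1, len(dfa)):
        best_v, best_shared = None, 1  # need >= 2 shared to save anything
        for v in range(u):
            shared = sum(1 for c, t in dfa[u].items() if dfa[v].get(c) == t)
            if shared > best_shared:
                best_v, best_shared = v, shared
        if best_v is not None:
            default[u] = best_v
            # Keep only the transitions on which u and its default target differ.
            labeled[u] = {c: t for c, t in dfa[u].items()
                          if dfa[best_v].get(c) != t}
    return labeled, default


def step(labeled, default, state: int, symbol: str) -> int:
    """Consume one symbol; default transitions are followed without consuming input."""
    while symbol not in labeled[state]:
        state = default[state]
    return labeled[state][symbol]


def run_dfa(dfa: DFA, text: Iterable[str], start: int = 0) -> int:
    state = start
    for c in text:
        state = dfa[state][c]
    return state


def run_d2fa(labeled, default, text: Iterable[str], start: int = 0) -> int:
    state = start
    for c in text:
        state = step(labeled, default, state, c)
    return state


def count_transitions(labeled, default) -> int:
    """Labeled transitions plus one per default transition."""
    return sum(len(row) for row in labeled) + sum(d is not None for d in default)
```

In this toy automaton, state 1 agrees with state 0 on three of its four symbols, so those three transitions collapse into one default transition to state 0 and only the transition on `b` stays labeled. The savings on such a small example are modest; the reduction of more than 95% reported in the abstract refers to the authors' own construction on real rule sets, not to this sketch.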

    Published In

    ACM SIGCOMM Computer Communication Review, Volume 36, Issue 4
    Proceedings of the 2006 conference on Applications, technologies, architectures, and protocols for computer communications
    October 2006
    445 pages
    ISSN: 0146-4833
    DOI: 10.1145/1151659

    SIGCOMM '06: Proceedings of the 2006 conference on Applications, technologies, architectures, and protocols for computer communications
    September 2006
    458 pages
    ISBN: 1595933085
    DOI: 10.1145/1159913

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. DFA
    2. deep packet inspection
    3. regular expressions

    Qualifiers

    • Article
