research-article

CICERO: A Domain-Specific Architecture for Efficient Regular Expression Matching

Published: 17 September 2021
    Abstract

    Regular Expression (RE) matching is a computational kernel used in several applications. Because RE complexity and data volumes are steadily increasing, hardware acceleration for this problem is gaining attention. Existing approaches have limited flexibility, as they require a different implementation for each RE. On the other hand, it is complex to map efficient RE representations, such as non-deterministic finite-state automata, onto software-programmable engines or parallel architectures. In this work, we present CICERO, an end-to-end framework composed of a domain-specific architecture and a companion compilation framework for RE matching. Our solution is suitable for many applications, such as genomics/proteomics and natural language processing. CICERO aims to exploit the intrinsic parallelism of non-deterministic representations of REs. Thanks to its programmable architecture and compilation framework, CICERO can trade off accelerators’ efficiency and processors’ flexibility. We implemented CICERO prototypes on embedded FPGA, achieving up to 28.6× and 20.8× higher energy efficiency than embedded and mainstream processors, respectively. Since it is a programmable architecture, it can be implemented as a custom ASIC that is orders of magnitude more energy-efficient than mainstream processors.
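
    To illustrate what "intrinsic parallelism of non-deterministic representations" can mean in practice, the following minimal sketch simulates a non-deterministic automaton by keeping a set of simultaneously active states and advancing all of them on each input character. The instruction names (char, split, jmp, match), the hand-written program for the RE a(b|c)*d, and the function names are assumptions made for illustration only; they are not CICERO's instruction set or compiler output.

```python
# Hypothetical instruction encoding (illustration only, not CICERO's ISA):
#   ("char", c)     consume one input character equal to c
#   ("split", x, y) fork execution to instructions x and y without consuming input
#   ("jmp", x)      continue at instruction x without consuming input
#   ("match",)      accept if reached when the input is exhausted

# Hand-compiled program for the RE a(b|c)*d
PROGRAM = [
    ("char", "a"),   # 0
    ("split", 2, 7), # 1: enter the loop body or leave the loop
    ("split", 3, 5), # 2: choose the b branch or the c branch
    ("char", "b"),   # 3
    ("jmp", 6),      # 4
    ("char", "c"),   # 5
    ("jmp", 1),      # 6: back to the loop head
    ("char", "d"),   # 7
    ("match",),      # 8
]


def closure(program, start):
    """Return every instruction reachable from start without consuming input."""
    reached = set()
    stack = [start]
    while stack:
        pc = stack.pop()
        if pc in reached:
            continue
        reached.add(pc)
        op = program[pc]
        if op[0] == "jmp":
            stack.append(op[1])
        elif op[0] == "split":
            stack.extend(op[1:])
    return reached


def matches(program, text):
    """Return True if the whole text is accepted by the program."""
    active = closure(program, 0)
    for ch in text:
        following = set()
        # Each active state is independent of the others: this loop is the
        # part that a parallel engine could evaluate concurrently.
        for pc in active:
            op = program[pc]
            if op[0] == "char" and op[1] == ch:
                following |= closure(program, pc + 1)
        active = following
        if not active:
            return False  # no state survived this character
    return any(program[pc][0] == "match" for pc in active)


if __name__ == "__main__":
    for sample in ["ad", "abcbd", "abx"]:
        print(sample, matches(PROGRAM, sample))
```

    Because the active set never holds more entries than the program has instructions, the work per input character is bounded by the program size, which is one reason set-based simulation of non-deterministic automata is a common starting point for parallel matching engines.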


    Published In

    ACM Transactions on Embedded Computing Systems  Volume 20, Issue 5s
    Special Issue ESWEEK 2021, CASES 2021, CODES+ISSS 2021 and EMSOFT 2021
    October 2021
    1367 pages
    ISSN:1539-9087
    EISSN:1558-3465
    DOI:10.1145/3481713
    Editor: Tulika Mitra
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 17 September 2021
    Accepted: 01 July 2021
    Revised: 01 June 2021
    Received: 01 April 2021
    Published in TECS Volume 20, Issue 5s

    Author Tags

    1. Domain-specific architecture
    2. regular expressions
    3. non-deterministic automata
    4. energy efficiency

    Qualifiers

    • Research-article
    • Refereed

    Cited By

    • (2024) PSyGS Gen A Generator of Domain-Specific Architectures to Accelerate Sparse Linear System Resolution. 2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 41–47. DOI: 10.1109/IPDPSW63119.2024.00015. Online publication date: 27-May-2024
    • (2024) One Automaton to Rule Them All: Beyond Multiple Regular Expressions Execution. 2024 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 193–206. DOI: 10.1109/CGO57630.2024.10444810. Online publication date: 2-Mar-2024
    • (2023) Exploiting Structure in Regular Expression Queries. Proceedings of the ACM on Management of Data 1:2, 1–28. DOI: 10.1145/3589297. Online publication date: 20-Jun-2023
    • (2023) An Energy-Efficient Domain-Specific Architecture for Regular Expressions. IEEE Transactions on Emerging Topics in Computing 11:1, 3–17. DOI: 10.1109/TETC.2022.3157948. Online publication date: 1-Jan-2023
    • (2023) YARB: a Methodology to Characterize Regular Expression Matching on Heterogeneous Systems. 2023 IEEE International Symposium on Circuits and Systems (ISCAS), 1–5. DOI: 10.1109/ISCAS46773.2023.10181547. Online publication date: 21-May-2023
    • (2023) Enabling Efficient Regular Expression Matching at the Edge through Domain-Specific Architectures. 2023 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 71–74. DOI: 10.1109/IPDPSW59300.2023.00023. Online publication date: May-2023
    • (2023) A Bird's Eye View on Quantum Computing: Current and Future Trends. IEEE EUROCON 2023 - 20th International Conference on Smart Technologies, 689–694. DOI: 10.1109/EUROCON56442.2023.10198957. Online publication date: 6-Jul-2023
    • (2022) Offset-FA: A Uniform Method to Handle Both Unbounded and Bounded Repetitions in Regular Expression Matching. Sensors 22:20, 7781. DOI: 10.3390/s22207781. Online publication date: 13-Oct-2022
    • (2022) Pushing the Level of Abstraction of Digital System Design: A Survey on How to Program FPGAs. ACM Computing Surveys 55:5, 1–48. DOI: 10.1145/3532989. Online publication date: 3-Dec-2022
    • (2022) Online Learning RTL Synthesis for Automated Design Space Exploration. 2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 69–76. DOI: 10.1109/IPDPSW55747.2022.00021. Online publication date: May-2022
