DOI: 10.5555/3195638.3195692
research-article

HARE: hardware accelerator for regular expressions

Published: 15 October 2016

Abstract

Rapidly processing text data is critical for many technical and business applications. Traditional software-based tools for processing large text corpora use memory bandwidth inefficiently due to software overheads and thus fall far short of peak scan rates possible on modern memory systems. Prior hardware designs generally target I/O rather than memory bandwidth. In this paper, we present HARE, a hardware accelerator for matching regular expressions against large in-memory logs. HARE comprises a stall-free hardware pipeline that scans input data at a fixed rate, examining multiple characters from a single input stream in parallel in a single accelerator clock cycle.
We describe a 1 GHz, 32-character-wide HARE design targeting ASIC implementation that processes data at 32 GB/s, matching modern memory bandwidths. This ASIC design outperforms software solutions by as much as two orders of magnitude. We further demonstrate a scaled-down FPGA proof-of-concept that operates at 100 MHz with 4-wide parallelism (400 MB/s). Even at this reduced rate, the prototype outperforms grep by 1.5x to 20x on commonly used regular expressions.
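
The scan rates quoted in the abstract follow from clock frequency times characters consumed per cycle. The sketch below checks that arithmetic and emulates, in software, the idea of consuming a fixed number of characters per cycle. It is an illustration only, not HARE's hardware design: it assumes one byte per character, and the names `scan_rate_bytes_per_s`, `toy_dfa_step`, and `scan_wide`, as well as the toy automaton for the substring "ab", are hypothetical and do not come from the paper.

```python
from typing import Tuple


def scan_rate_bytes_per_s(clock_hz: float, chars_per_cycle: int,
                          bytes_per_char: int = 1) -> float:
    """Fixed-rate scan throughput: one group of characters every clock cycle.

    Assumes one byte per character unless stated otherwise.
    """
    return clock_hz * chars_per_cycle * bytes_per_char


def toy_dfa_step(state: int, ch: str) -> int:
    """Hypothetical toy DFA for the substring "ab" (not taken from the paper).

    State 0: nothing seen, state 1: just saw 'a', state 2: match (absorbing).
    """
    if state == 2:
        return 2
    if ch == "a":
        return 1
    if state == 1 and ch == "b":
        return 2
    return 0


def scan_wide(text: str, width: int) -> Tuple[bool, int]:
    """Consume `width` characters per simulated cycle.

    Returns (matched, cycles). Hardware would evaluate the `width` characters
    in parallel; this software model simply groups them into one cycle.
    """
    state, cycles = 0, 0
    for start in range(0, len(text), width):
        for ch in text[start:start + width]:
            state = toy_dfa_step(state, ch)
        cycles += 1
    return state == 2, cycles


if __name__ == "__main__":
    print(scan_rate_bytes_per_s(1e9, 32) / 1e9, "GB/s (ASIC figures)")
    print(scan_rate_bytes_per_s(100e6, 4) / 1e6, "MB/s (FPGA figures)")
    print(scan_wide("xxabxxxx", 4))
```

Because the simulated pipeline never stalls, the cycle count depends only on input length and width, not on the pattern or the data, which is what makes the throughput a fixed rate.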


Published In

MICRO-49: The 49th Annual IEEE/ACM International Symposium on Microarchitecture
October 2016
816 pages

Publisher

IEEE Press

Author Tags

  1. finite automata
  2. regular expression matching
  3. text processing

Qualifiers

  • Research-article

Conference

MICRO-49

Acceptance Rates

Overall Acceptance Rate 484 of 2,242 submissions, 22%

Cited By

  • (2021) CICERO: A Domain-Specific Architecture for Efficient Regular Expression Matching. ACM Transactions on Embedded Computing Systems 20:5s (1-24). DOI: 10.1145/3476982. Online publication date: 17-Sep-2021
  • (2019) Accelerating raw data analysis with the ACCORDA software and hardware architecture. Proceedings of the VLDB Endowment 12:11 (1568-1582). DOI: 10.14778/3342263.3342634. Online publication date: 1-Jul-2019
  • (2019) eAP. Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture (87-99). DOI: 10.1145/3352460.3358324. Online publication date: 12-Oct-2019
  • (2019) Automata Processing in Reconfigurable Architectures. ACM Transactions on Reconfigurable Technology and Systems 12:2 (1-25). DOI: 10.1145/3314576. Online publication date: 17-May-2019
  • (2019) Fast String Searching on PISA. Proceedings of the 2019 ACM Symposium on SDN Research (21-28). DOI: 10.1145/3314148.3314356. Online publication date: 3-Apr-2019
  • (2019) REGISTOR. ACM Transactions on Storage 15:1 (1-24). DOI: 10.1145/3310149. Online publication date: 26-Mar-2019
  • (2019) CoNDA. Proceedings of the 46th International Symposium on Computer Architecture (629-642). DOI: 10.1145/3307650.3322266. Online publication date: 22-Jun-2019
  • (2018) Optimistic regular expression matching on FPGAs for near-data processing. Proceedings of the 14th International Workshop on Data Management on New Hardware (1-3). DOI: 10.1145/3211922.3211926. Online publication date: 11-Jun-2018
  • (2018) REGISTOR. Proceedings of the 11th ACM International Systems and Storage Conference (13-25). DOI: 10.1145/3211890.3211900. Online publication date: 4-Jun-2018
  • (2018) ASPEN. Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture (921-932). DOI: 10.1109/MICRO.2018.00079. Online publication date: 20-Oct-2018
