TiReX: Tiled Regular eXpressions
matching architecture
Alessandro Comodi, Davide Conficconi {alessandro.comodi, davide.conficconi}@mail.polimi.it
Alberto Scolari, Marco Santambrogio {alberto.scolari, marco.santambrogio}@polimi.it
25th Reconfigurable Architectures Workshop (RAW) 2018
21/05/2018
Context
Genomic
Intrusion Detection Systems
Current issues
• The trade-off between performance and flexibility
• Current approaches lack flexibility
– FPGA-based approaches require embedding the regex into
the architecture (= re-synthesis)
– ASIC technology offers no flexibility at all
Our solution and claims 3
Based on previous work [1] proposing Regular Expressions
as a high-level language driving a custom processor
The improvements with respect to ReCPU are:
• A better preprocessing mechanism for the RegExp and a
renewed single-core design
• A scalable multi-core architecture for parallelized
computations, reaching a 100x speedup over Flex
• A cross-platform design, easily integrable with
heterogeneous architectures
[1] M. Paolieri et al., “ReCPU: A parallel and pipelined architecture for regular expression matching,” in VLSI-SoC, Springer, 2009
Outline
• Related work
• TiReX design and implementation
• Evaluation
• Conclusions and future work
Related Work (1) 5
Most works use DFAs (Deterministic Finite Automata) and address DFA
limitations, offering high matching speed at the cost of a fixed structure
Memory usage grows along with RegExp complexity
• [1], [2] cluster states and group transitions
Others focus on achieving an efficient lookup process
• Hash-based encoding schemes are another way to solve the problem [3]
• Bitmap index structures [4]
[1] L. Jiang et al., “A fast regular expression matching engine for NIDS applying prediction scheme,” in Computers and Communication (ISCC), 2014
[2] J. van Lunteren and A. Guanella, “Hardware-accelerated regular expression matching at multiple tens of Gb/s,” in INFOCOM, 2012
[3] K. Agarwal and R. Polig, “A high-speed and large-scale dictionary matching engine for information extraction systems,” in Application-Specific Systems, Architectures and Processors (ASAP), 2013 IEEE 24th International Conference on. IEEE, 2013
[4] X.-T. Nguyen, H.-T. Nguyen, K. Inoue, O. Shimojo, and C.-K. Pham, “Highly parallel bitmap-based regular expression matching for text analytics,” in Circuits and Systems (ISCAS), 2017
Related Work (2) 6
Some works leverage hardware parallelism to match input against multiple RegExps
• [8] uses a GPU to activate a new DFA for every initial character
Single-character analysis for the basic version
• [5] Ternary Content Addressable Memories (TCAMs)
• [6], [7] precomputation of transitions
A DFA encodes a single RegExp and matches one character at a time, so it is
intrinsically sequential
[5] C. R. Meiners et al., “Fast regular expression matching using small TCAMs for network intrusion detection and prevention systems,” 2010
[6] J. Yang et al., “PiDFA: A practical multi-stride regular expression matching engine based on FPGA,” ICC 2016
[7] K. Atasu et al., “Hardware-accelerated regular expression matching for high-throughput text analytics,” in FPL 2013
[8] G. Vasiliadis, M. Polychronakis, S. Antonatos, E. P. Markatos, and S. Ioannidis, “Regular expression matching on graphics hardware for intrusion detection,” in International Workshop on Recent Advances in Intrusion Detection. Springer, 2009
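To illustrate why a DFA is intrinsically sequential, here is a minimal Python sketch of a DFA for a literal pattern. It uses the textbook KMP-style construction, not the method of any cited work, and the four-letter alphabet and the function names are assumptions made for this example. Each input character triggers exactly one state transition, and each transition depends on the previous state.

```python
def build_dfa(pattern, alphabet="ACGT"):
    """Build a KMP-style DFA for a literal pattern (textbook construction)."""
    m = len(pattern)
    # dfa[state][char] -> next state; state m means "pattern found"
    dfa = [{c: 0 for c in alphabet} for _ in range(m)]
    dfa[0][pattern[0]] = 1
    restart = 0  # state the automaton falls back to on a mismatch
    for j in range(1, m):
        for c in alphabet:
            dfa[j][c] = dfa[restart][c]      # mismatch: copy fallback transitions
        dfa[j][pattern[j]] = j + 1           # match: advance
        restart = dfa[restart][pattern[j]]
    return dfa


def dfa_search(pattern, text):
    """Return the index of the first occurrence of pattern in text, or -1."""
    dfa = build_dfa(pattern)
    state = 0
    for i, ch in enumerate(text):
        state = dfa[state][ch]  # one transition per character, strictly in order
        if state == len(pattern):
            return i - len(pattern) + 1
    return -1


print(dfa_search("ACCGTGGA", "TTACCGTGGAC"))
```

The loop in `dfa_search` cannot be split across characters without extra machinery, because `state` carries a dependency from one iteration to the next.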
Our Approach 7
As in ReCPU, RegExps are translated into program
instructions
The TiReX matching core runs instructions on input data based on
a dedicated Instruction Set Architecture (ISA)
Each RegExp is compiled in software into a sequence of TiReX
instructions
Flow RE 8
[Figure: a Regular Expression passes through the Compiler, which emits instructions from the Instruction Set; the instructions are run against the input Data to produce Match results]
Instruction Set
1 & ACGT
2 JIM offset
3 (
4 |)* AC
5 & TT
Data
ACGTCGGGGCGTGCAAATGCCCCGTGCGA
TTTGCGTGACGTCGGGGCGTGCAAATGCC
CCGTGCGATTTGCGTGACGTCGGGGCGTG
CAAATGCCCCGTGCGATTTGCGTGACGTC
GGGGCGTGCAAATGCCCCGTGCGATTTGC
GTGCGTGCGATTTGCGTGACGTCGGGGCG
TGCAAACGTGCGATTTGCGTGACGTCGGG
GCGTGCAAAGCTCGATCGATCGATCGA…
TiReX ISA 9
Opcode     RegExp   Description
0 00 000   NOP      No Operation
1 00 000   (        Enter subroutine
0 10 000   AND      And of cluster matches
0 01 000   OR       Or of cluster matches
0 11 000   .        Match any character
0 00 001   )*       Match any number of sub-RE
0 00 010   )+       Match one or more sub-RE
0 00 011   )|       Match previous sub-RE or next one
0 00 100   )        End of subroutine
0 00 101   OKP      Open Kleene Parenthesis
0 00 111   JIM      Jump If Match
Reference: 32 bits for at most 4 characters
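As a rough illustration of the table, the following Python sketch packs an opcode and a reference into a single instruction word. The opcode bit patterns come from the table; everything else is an assumption made for this example: the 6-bit opcode sitting above a 32-bit reference, ASCII characters, zero padding for short references, and the name `encode`.

```python
# Opcode bit patterns as listed in the ISA table (read as 6-bit values).
OPCODES = {
    "NOP": 0b000000,
    "(":   0b100000,
    "AND": 0b010000,
    "OR":  0b001000,
    ".":   0b011000,
    ")*":  0b000001,
    ")+":  0b000010,
    ")|":  0b000011,
    ")":   0b000100,
    "OKP": 0b000101,
    "JIM": 0b000111,
}


def encode(mnemonic, reference=""):
    """Pack opcode and reference into one integer (hypothetical layout)."""
    if len(reference) > 4:
        raise ValueError("the reference holds at most 4 characters")
    # 4 characters x 8 bits = 32 bits; shorter references are zero-padded (assumption)
    ref_bytes = reference.encode("ascii").ljust(4, b"\0")
    return (OPCODES[mnemonic] << 32) | int.from_bytes(ref_bytes, "big")


print(hex(encode("AND", "ACGT")))
```

Note that the pattern for `.` (0 11 000) has both the AND and OR bits set; the sketch simply stores it as listed.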
TiReX ISA 10
All characters in the Reference must be
equal to the input data to have a match
RegExp: ACCGTGGA
Input 1: TGGA GACCTACACCG
Input 2: ACCA TGGACTAGAGG
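The matching rule on this slide can be sketched in Python as follows. A literal such as ACCGTGGA is longer than one 4-character reference, so the sketch assumes it is split into consecutive references that must all match in order; the splitting strategy and the function names are assumptions for illustration, not the TiReX compiler's actual behaviour.

```python
def split_references(literal, width=4):
    """Split a literal into references of at most `width` characters."""
    return [literal[i:i + width] for i in range(0, len(literal), width)]


def and_match(reference, data, pos):
    """True only if every reference character equals the input at pos."""
    chunk = data[pos:pos + len(reference)]
    return len(chunk) == len(reference) and all(
        r == d for r, d in zip(reference, chunk)
    )


def match_literal_at(literal, data, pos=0):
    """Match all references of the literal back to back, starting at pos."""
    for ref in split_references(literal):
        if not and_match(ref, data, pos):
            return False
        pos += len(ref)
    return True


print(split_references("ACCGTGGA"))
print(match_literal_at("ACCGTGGA", "ACCATGGACTAGAGG"))
```

With the second input from the slide, the first reference ACCG already differs from ACCA in its last character, so the match fails.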
TiReX ISA 11
Special instruction to direct the jump backward
in the program like in a «for loop» with Kleene
operators
RegExp: (ACGT)+
Input 1: ACGT ACGT GACC
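The backward jump can be pictured with a toy interpreter in Python. This is only a sketch of the «for loop» idea: the three-instruction program, the program-counter handling, and the name `match_plus` are assumptions, not the actual TiReX control path.

```python
def match_plus(reference, data, pos=0):
    """Match (reference)+ at pos; return the end index or None."""
    if not reference:
        raise ValueError("reference must not be empty")
    program = [("(", None), ("AND", reference), (")+", None)]
    pc = 0            # program counter
    loop_start = 0    # where the backward jump lands
    repetitions = 0
    while pc < len(program):
        op, ref = program[pc]
        if op == "(":
            loop_start = pc + 1
            pc += 1
        elif op == "AND":
            if data.startswith(ref, pos):
                pos += len(ref)
                pc += 1
            else:
                break  # the body no longer matches: leave the loop
        elif op == ")+":
            repetitions += 1
            pc = loop_start  # backward jump: try the body again
    return pos if repetitions > 0 else None


print(match_plus("ACGT", "ACGTACGTGACC"))
```

On the slide's input, the body matches twice and the third attempt fails on GACC, so the match ends after the second repetition.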
TiReX ISA 12
Special instruction to direct the jump forward in the
program like in «if else» statement with chained
ORs
RegExp: (TTTT)|(GCAT)|(CTGA)
Input 1: GCAT GACCTAC
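The forward jump for chained ORs can be sketched the same way. In this Python example each alternative is followed by a jump that skips the remaining alternatives once one has matched, like an «if else» chain. The `JUMP_IF_MATCH` step is modelled loosely on the JIM entry of the table; that mapping, the program layout, and the function name are assumptions.

```python
def match_alternatives(alternatives, data, pos=0):
    """Match alt1|alt2|... at pos; return the end index or None."""
    program = []
    for alt in alternatives:
        program.append(("AND", alt))
        program.append(("JUMP_IF_MATCH", None))
    exit_pc = len(program)  # common exit after the last alternative

    pc = 0
    matched = False
    end = None
    while pc < exit_pc:
        op, ref = program[pc]
        if op == "AND":
            matched = data.startswith(ref, pos)
            if matched:
                end = pos + len(ref)
            pc += 1
        else:
            # forward jump: skip the remaining alternatives after a match
            pc = exit_pc if matched else pc + 1
    return end if matched else None


print(match_alternatives(["TTTT", "GCAT", "CTGA"], "GCATGACCTAC"))
```

For the slide's input, TTTT fails, GCAT matches, and the jump skips CTGA entirely.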
Single Core Architecture: Overview 13
[Figure: block diagram. The Instruction Memory feeds the Fetch & Decode stage and the Data Buffer feeds the Execution stage, with a Control Path driving both. Labelled signals: Address, Instruction, Data, Opcode, Reference, Match, Control]
Single Core Architecture: Details 14
Single Core Architecture: Details 15
Fetch & Decode
F&D Unit A: Back up
F&D Unit B: Next one
F&D Unit C: Jump
Single Core Architecture: Details 16
Execute
4 clusters of 4 comparators
The engine computes the stage result
Single Core Architecture: Details 17
Data Buffer
Addressable Buffer
Intermediate registers:
• Back up
• Hold data
• Shift of 1-4 characters
Single Core Architecture: Details 18
Control Path
Status Register of the computation
Stack for nested parentheses
Completely redesigned FSM
Multi core 19
Since the recognition process is highly parallelizable, we adopt a multi-core
architecture
[Figure: n TiReX cores (core 1 to core n), each with its own BRAM, all receive the same Data; each core runs a different RegExp: AGCT(A|C)*TT, AGCT, AG*(TTAC), GTTTG(AC)*]
Multi core 20
Since the recognition process is highly parallelizable, we adopt a multi-core
architecture
[Figure: n TiReX cores (core 1 to core n), each with its own BRAM, all run the same RegExp AGCT(A|C)*TT; each core receives a different portion of the data (Data 1 to Data n)]
Multi core: Boundary conditions 21
Customizable conditions to avoid boundary match
[Figure: the Data is split into Chunk 0 to Chunk 3, with a match of length N shown against the chunk boundaries]
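One common way to handle matches near chunk boundaries is to let each chunk overlap the next by N-1 characters, where N is the match length. The Python sketch below shows that idea for a fixed-length pattern; it is an assumption about how such conditions could be set up, not the TiReX mechanism, and the function names are hypothetical.

```python
def split_with_overlap(data, n_chunks, match_len):
    """Split data into n_chunks pieces, each extended by match_len - 1 chars."""
    overlap = match_len - 1
    size = -(-len(data) // n_chunks)  # ceiling division
    chunks = []
    for i in range(n_chunks):
        start = i * size
        end = min(len(data), start + size + overlap)
        chunks.append((start, data[start:end]))
    return chunks


def find_all(pattern, data, n_chunks):
    """Search every chunk independently (one chunk per core, conceptually)."""
    hits = set()
    for start, chunk in split_with_overlap(data, n_chunks, len(pattern)):
        idx = chunk.find(pattern)
        while idx != -1:
            hits.add(start + idx)  # translate back to a global offset
            idx = chunk.find(pattern, idx + 1)
    return sorted(hits)


print(find_all("TACG", "TTTTACGTTTTT", 3))
```

Here TACG starts in the first chunk and ends in the second; without the overlap, neither chunk alone would contain it.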
Experimental setup and results 22
Evaluation environment:
• VC707 evaluation platform powered by a Virtex-7 FPGA
• Digilent PYNQ-Z1 board powered by a ZYNQ SoC
comprising an ARM CPU and a Xilinx FPGA
We compare against:
• A Flex program compiled with O3 optimizations and
running on an Intel i7 with a peak frequency of 2.8 GHz
Single Core Area Utilization 23
VC707 resource utilization
VC707 Board   Slice LUTs   Slice Reg.   F7 Muxes
Used          1921         1175         261
Percentage    0.63%        0.29%        0.17%
PYNQ resource utilization
PYNQ Board    Slice LUTs   Slice Reg.   F7 Muxes
Used          1845         1775         261
Percentage    3.46%        1.66%        0.98%
VC707 and PYNQ Results 24
Regular Expression               Flex     16-core (VC707) @ 130 MHz   Speedup
ACCGTGGA                         271 µs   2.07 µs                     130.90x
(TTT)+CT                         121 µs   4.54 µs                     26.65x
(CAGT)|(GGGG)|(TTGG)TGCA(C|G)+   263 µs   3.36 µs                     78.27x
Regular Expression               Flex     8-core (PYNQ) @ 70 MHz      Speedup
ACCGTGGA                         271 µs   7.2 µs                      37.63x
(TTT)+CT                         121 µs   8.21 µs                     14.73x
(CAGT)|(GGGG)|(TTGG)TGCA(C|G)+   263 µs   30.3 µs                     8.67x
Dataset: 16 KB of the first Homo sapiens chromosome
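The speedup column is the Flex time divided by the accelerator time. The short Python check below recomputes it from the times in the tables; the variable names are chosen for this example. The recomputed values agree with the reported ones to within a few hundredths, and the small differences presumably come from the slide's times being rounded.

```python
# Execution times in microseconds, copied from the result tables.
flex_us = {
    "ACCGTGGA": 271.0,
    "(TTT)+CT": 121.0,
    "(CAGT)|(GGGG)|(TTGG)TGCA(C|G)+": 263.0,
}
vc707_us = {
    "ACCGTGGA": 2.07,
    "(TTT)+CT": 4.54,
    "(CAGT)|(GGGG)|(TTGG)TGCA(C|G)+": 3.36,
}
pynq_us = {
    "ACCGTGGA": 7.2,
    "(TTT)+CT": 8.21,
    "(CAGT)|(GGGG)|(TTGG)TGCA(C|G)+": 30.3,
}


def speedup(baseline_us, accelerated_us):
    """Speedup = baseline time / accelerated time."""
    return baseline_us / accelerated_us


for regex, flex_time in flex_us.items():
    vc707 = speedup(flex_time, vc707_us[regex])
    pynq = speedup(flex_time, pynq_us[regex])
    print(f"{regex}: VC707 {vc707:.2f}x, PYNQ {pynq:.2f}x")
```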
Comparisons with Related works 25
Solution        Clock Frequency [MHz]   Bitrate [Gb/s]   Flexibility
VC707 16-core   130                     16.64 – 66.54
PYNQ 8-core     70                      4.48 – 17.92
[1] ASIC        318.47                  10.19 – 18.18
[2] FPGA        150                     230 – 430
[3] FPGA        100                     3.2
[3] ASIC        1000                    256
[1] M. Paolieri et al., “ReCPU: A parallel and pipelined architecture for regular expression matching,” in VLSI-SoC: Advanced Topics on Systems on a Chip. Springer, 2009
[2] L. Jiang et al., “A fast regular expression matching engine for NIDS applying prediction scheme,” in Computers and Communication (ISCC), 2014 IEEE Symposium on.
[3] V. Gogte et al., “HARE: Hardware accelerator for regular expressions,” in Microarchitecture (MICRO), 2016 49th Annual IEEE/ACM International Symposium on.
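The TiReX bitrate ranges in the table appear consistent with a simple model in which every core consumes between 1 and 4 eight-bit characters per clock cycle. That reading is an assumption, not something stated on the slide; the Python sketch below only checks the arithmetic, and the function name is hypothetical.

```python
def bitrate_gbps(clock_mhz, cores, chars_per_cycle, bits_per_char=8):
    """Aggregate bitrate in Gb/s under the assumed per-cycle throughput."""
    bits_per_second = clock_mhz * 1e6 * cores * chars_per_cycle * bits_per_char
    return bits_per_second / 1e9


for name, clock_mhz, cores in [("VC707", 130, 16), ("PYNQ", 70, 8)]:
    low = bitrate_gbps(clock_mhz, cores, chars_per_cycle=1)
    high = bitrate_gbps(clock_mhz, cores, chars_per_cycle=4)
    print(f"{name}: {low:.2f} to {high:.2f} Gb/s")
```

Under this model the lower bounds and the PYNQ upper bound match the table, while the VC707 upper bound comes out at 66.56 Gb/s against the reported 66.54.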
Comparisons with Related works 26
Solution Clock Frequency
[MHz]
Bitrate [Gb/s] Flexibility
VC707 16 – core 130 16.64 – 66.54
PYNQ 8 – core 70 4.48 – 17.92
[1] ASIC 318.47 10.19 – 18.18
[2] FPGA 150 230 – 430
[3] FPGA 100 3.2
[3] ASIC 1000 256
[1] M. Paolieri et al “Recpu: A parallel and pipelined architecture for regular expression matching,” in Vlsi-Soc: Advanced Topics on Systems on a Chip.
Springer, 2009
[2] L. Jiang et al“A fast regular expression matching engine for nids applying prediction scheme,” in Computers and Communication (ISCC), 2014 IEEE
Symposium on.
[3] V. Gogte et al “Hare: Hardware accelerator for regular expressions,” in Microarchitecture (MICRO), 2016 49th Annual IEEE/ACM International Symposium
on.
Comparisons with Related works 27
Solution Clock Frequency
[MHz]
Bitrate [Gb/s] Flexibility
VC707 16 – core 130 16.64 – 66.54
PYNQ 8 – core 70 4.48 – 17.92
[1] ASIC 318.47 10.19 – 18.18
[2] FPGA 150 230 – 430
[3] FPGA 100 3.2
[3] ASIC 1000 256
[1] M. Paolieri et al “Recpu: A parallel and pipelined architecture for regular expression matching,” in Vlsi-Soc: Advanced Topics on Systems on a Chip.
Springer, 2009
[2] L. Jiang et al“A fast regular expression matching engine for nids applying prediction scheme,” in Computers and Communication (ISCC), 2014 IEEE
Symposium on.
[3] V. Gogte et al “Hare: Hardware accelerator for regular expressions,” in Microarchitecture (MICRO), 2016 49th Annual IEEE/ACM International Symposium
on.
Comparisons with Related works 28
Solution Clock Frequency
[MHz]
Bitrate [Gb/s] Flexibility
VC707 16 – core 130 16.64 – 66.54
PYNQ 8 – core 70 4.48 – 17.92
[1] ASIC 318.47 10.19 – 18.18
[2] FPGA 150 230 – 430
[3] FPGA 100 3.2
[3] ASIC 1000 256
[1] M. Paolieri et al “Recpu: A parallel and pipelined architecture for regular expression matching,” in Vlsi-Soc: Advanced Topics on Systems on a Chip.
Springer, 2009
[2] L. Jiang et al“A fast regular expression matching engine for nids applying prediction scheme,” in Computers and Communication (ISCC), 2014 IEEE
Symposium on.
[3] V. Gogte et al “Hare: Hardware accelerator for regular expressions,” in Microarchitecture (MICRO), 2016 49th Annual IEEE/ACM International Symposium
on.
Conclusions and future work 29
• We have presented a multi-core pattern matching
architecture implemented on an FPGA
• It outperforms the Flex solution, reaching a 100x speedup
while retaining remarkable flexibility
• Future work
– Performance improvements
• Exploration of different memory hierarchies
• Multi-core interconnection studies
Thank you for your attention… Questions?
Alessandro Comodi, Davide Conficconi {alessandro.comodi, davide.conficconi}@mail.polimi.it
Alberto Scolari, Marco Santambrogio {alberto.scolari, marco.santambrogio}@polimi.it
NECST: www.necst.it
Slideshare NECST: www.slideshare.net/necstlab
RAW FB Group: facebook.com/groups/ReconfigurableArchitecturesWorkshop

More Related Content

What's hot

A Scalable Dataflow Implementation of Curran's Approximation Algorithm
A Scalable Dataflow Implementation of Curran's Approximation AlgorithmA Scalable Dataflow Implementation of Curran's Approximation Algorithm
A Scalable Dataflow Implementation of Curran's Approximation Algorithm
NECST Lab @ Politecnico di Milano
 
C-SAW: A Framework for Graph Sampling and Random Walk on GPUs
C-SAW: A Framework for Graph Sampling and Random Walk on GPUsC-SAW: A Framework for Graph Sampling and Random Walk on GPUs
C-SAW: A Framework for Graph Sampling and Random Walk on GPUs
Pandey_G
 
[Paper Reading] Steering Query Optimizers: A Practical Take on Big Data Workl...
[Paper Reading] Steering Query Optimizers: A Practical Take on Big Data Workl...[Paper Reading] Steering Query Optimizers: A Practical Take on Big Data Workl...
[Paper Reading] Steering Query Optimizers: A Practical Take on Big Data Workl...
PingCAP
 
Photon Technical Deep Dive: How to Think Vectorized
Photon Technical Deep Dive: How to Think VectorizedPhoton Technical Deep Dive: How to Think Vectorized
Photon Technical Deep Dive: How to Think Vectorized
Databricks
 
Flink Batch Processing and Iterations
Flink Batch Processing and IterationsFlink Batch Processing and Iterations
Flink Batch Processing and Iterations
Sameer Wadkar
 
Super COMPUTING Journal
Super COMPUTING JournalSuper COMPUTING Journal
Super COMPUTING Journal
Pandey_G
 
Apache Flink: API, runtime, and project roadmap
Apache Flink: API, runtime, and project roadmapApache Flink: API, runtime, and project roadmap
Apache Flink: API, runtime, and project roadmap
Kostas Tzoumas
 
強化学習の分散アーキテクチャ変遷
強化学習の分散アーキテクチャ変遷強化学習の分散アーキテクチャ変遷
強化学習の分散アーキテクチャ変遷
Eiji Sekiya
 
From Trill to Quill and Beyond
From Trill to Quill and BeyondFrom Trill to Quill and Beyond
From Trill to Quill and Beyond
Badrish Chandramouli
 
Advances in the Solution of Navier-Stokes Eqs. in GPGPU Hardware. Modelling F...
Advances in the Solution of Navier-Stokes Eqs. in GPGPU Hardware. Modelling F...Advances in the Solution of Navier-Stokes Eqs. in GPGPU Hardware. Modelling F...
Advances in the Solution of Navier-Stokes Eqs. in GPGPU Hardware. Modelling F...
Storti Mario
 
Apache Flink Deep Dive
Apache Flink Deep DiveApache Flink Deep Dive
Apache Flink Deep Dive
Vasia Kalavri
 
Apache Flink internals
Apache Flink internalsApache Flink internals
Apache Flink internals
Kostas Tzoumas
 
Migrate 10TB to Exadata Tips and Tricks (Presentation)
Migrate 10TB to Exadata Tips and Tricks (Presentation)Migrate 10TB to Exadata Tips and Tricks (Presentation)
Migrate 10TB to Exadata Tips and Tricks (Presentation)
Amin Adatia
 
SLE2015: Distributed ATL
SLE2015: Distributed ATLSLE2015: Distributed ATL
SLE2015: Distributed ATL
Amine Benelallam
 
Developing Your Own Flux Packages by David McKay | Head of Developer Relation...
Developing Your Own Flux Packages by David McKay | Head of Developer Relation...Developing Your Own Flux Packages by David McKay | Head of Developer Relation...
Developing Your Own Flux Packages by David McKay | Head of Developer Relation...
InfluxData
 
Michael Häusler – Everyday flink
Michael Häusler – Everyday flinkMichael Häusler – Everyday flink
Michael Häusler – Everyday flink
Flink Forward
 
Migrate 10TB to Exadata -- Tips and Tricks
Migrate 10TB to Exadata -- Tips and TricksMigrate 10TB to Exadata -- Tips and Tricks
Migrate 10TB to Exadata -- Tips and Tricks
Amin Adatia
 
How to Introduce Telemetry Streaming (gNMI) in Your Network with SNMP with Te...
How to Introduce Telemetry Streaming (gNMI) in Your Network with SNMP with Te...How to Introduce Telemetry Streaming (gNMI) in Your Network with SNMP with Te...
How to Introduce Telemetry Streaming (gNMI) in Your Network with SNMP with Te...
InfluxData
 
Addressing performance issues in titan+cassandra
Addressing performance issues in titan+cassandraAddressing performance issues in titan+cassandra
Addressing performance issues in titan+cassandra
Nakul Jeirath
 
Brief introduction on Hadoop,Dremel, Pig, FlumeJava and Cassandra
Brief introduction on Hadoop,Dremel, Pig, FlumeJava and CassandraBrief introduction on Hadoop,Dremel, Pig, FlumeJava and Cassandra
Brief introduction on Hadoop,Dremel, Pig, FlumeJava and Cassandra
Somnath Mazumdar
 

What's hot (20)

A Scalable Dataflow Implementation of Curran's Approximation Algorithm
A Scalable Dataflow Implementation of Curran's Approximation AlgorithmA Scalable Dataflow Implementation of Curran's Approximation Algorithm
A Scalable Dataflow Implementation of Curran's Approximation Algorithm
 
C-SAW: A Framework for Graph Sampling and Random Walk on GPUs
C-SAW: A Framework for Graph Sampling and Random Walk on GPUsC-SAW: A Framework for Graph Sampling and Random Walk on GPUs
C-SAW: A Framework for Graph Sampling and Random Walk on GPUs
 
[Paper Reading] Steering Query Optimizers: A Practical Take on Big Data Workl...
[Paper Reading] Steering Query Optimizers: A Practical Take on Big Data Workl...[Paper Reading] Steering Query Optimizers: A Practical Take on Big Data Workl...
[Paper Reading] Steering Query Optimizers: A Practical Take on Big Data Workl...
 
Photon Technical Deep Dive: How to Think Vectorized
Photon Technical Deep Dive: How to Think VectorizedPhoton Technical Deep Dive: How to Think Vectorized
Photon Technical Deep Dive: How to Think Vectorized
 
Flink Batch Processing and Iterations
Flink Batch Processing and IterationsFlink Batch Processing and Iterations
Flink Batch Processing and Iterations
 
Super COMPUTING Journal
Super COMPUTING JournalSuper COMPUTING Journal
Super COMPUTING Journal
 
Apache Flink: API, runtime, and project roadmap
Apache Flink: API, runtime, and project roadmapApache Flink: API, runtime, and project roadmap
Apache Flink: API, runtime, and project roadmap
 
強化学習の分散アーキテクチャ変遷
強化学習の分散アーキテクチャ変遷強化学習の分散アーキテクチャ変遷
強化学習の分散アーキテクチャ変遷
 
From Trill to Quill and Beyond
From Trill to Quill and BeyondFrom Trill to Quill and Beyond
From Trill to Quill and Beyond
 
Advances in the Solution of Navier-Stokes Eqs. in GPGPU Hardware. Modelling F...
Advances in the Solution of Navier-Stokes Eqs. in GPGPU Hardware. Modelling F...Advances in the Solution of Navier-Stokes Eqs. in GPGPU Hardware. Modelling F...
Advances in the Solution of Navier-Stokes Eqs. in GPGPU Hardware. Modelling F...
 
Apache Flink Deep Dive
Apache Flink Deep DiveApache Flink Deep Dive
Apache Flink Deep Dive
 
Apache Flink internals
Apache Flink internalsApache Flink internals
Apache Flink internals
 
Migrate 10TB to Exadata Tips and Tricks (Presentation)
Migrate 10TB to Exadata Tips and Tricks (Presentation)Migrate 10TB to Exadata Tips and Tricks (Presentation)
Migrate 10TB to Exadata Tips and Tricks (Presentation)
 
SLE2015: Distributed ATL
SLE2015: Distributed ATLSLE2015: Distributed ATL
SLE2015: Distributed ATL
 
Developing Your Own Flux Packages by David McKay | Head of Developer Relation...
Developing Your Own Flux Packages by David McKay | Head of Developer Relation...Developing Your Own Flux Packages by David McKay | Head of Developer Relation...
Developing Your Own Flux Packages by David McKay | Head of Developer Relation...
 
Michael Häusler – Everyday flink
Michael Häusler – Everyday flinkMichael Häusler – Everyday flink
Michael Häusler – Everyday flink
 
Migrate 10TB to Exadata -- Tips and Tricks
Migrate 10TB to Exadata -- Tips and TricksMigrate 10TB to Exadata -- Tips and Tricks
Migrate 10TB to Exadata -- Tips and Tricks
 
How to Introduce Telemetry Streaming (gNMI) in Your Network with SNMP with Te...
How to Introduce Telemetry Streaming (gNMI) in Your Network with SNMP with Te...How to Introduce Telemetry Streaming (gNMI) in Your Network with SNMP with Te...
How to Introduce Telemetry Streaming (gNMI) in Your Network with SNMP with Te...
 
Addressing performance issues in titan+cassandra
Addressing performance issues in titan+cassandraAddressing performance issues in titan+cassandra
Addressing performance issues in titan+cassandra
 
Brief introduction on Hadoop,Dremel, Pig, FlumeJava and Cassandra
Brief introduction on Hadoop,Dremel, Pig, FlumeJava and CassandraBrief introduction on Hadoop,Dremel, Pig, FlumeJava and Cassandra
Brief introduction on Hadoop,Dremel, Pig, FlumeJava and Cassandra
 

Similar to TiReX: Tiled Regular eXpression matching architecture

TiReX: Tiled Regular eXpression matching architecture
TiReX: Tiled Regular eXpression matching architectureTiReX: Tiled Regular eXpression matching architecture
TiReX: Tiled Regular eXpression matching architecture
NECST Lab @ Politecnico di Milano
 
The CAOS framework: democratize the acceleration of compute intensive applica...
The CAOS framework: democratize the acceleration of compute intensive applica...The CAOS framework: democratize the acceleration of compute intensive applica...
The CAOS framework: democratize the acceleration of compute intensive applica...
NECST Lab @ Politecnico di Milano
 
A Highly Parallel Semi-Dataflow FPGA Architecture for Large-Scale N-Body Simu...
A Highly Parallel Semi-Dataflow FPGA Architecture for Large-Scale N-Body Simu...A Highly Parallel Semi-Dataflow FPGA Architecture for Large-Scale N-Body Simu...
A Highly Parallel Semi-Dataflow FPGA Architecture for Large-Scale N-Body Simu...
NECST Lab @ Politecnico di Milano
 
3rd 3DDRESD: ReCPU 4 NIDS
3rd 3DDRESD: ReCPU 4 NIDS3rd 3DDRESD: ReCPU 4 NIDS
3rd 3DDRESD: ReCPU 4 NIDS
Marco Santambrogio
 
Klessydra-T: Designing Configurable Vector Co-Processors for Multi-Threaded E...
Klessydra-T: Designing Configurable Vector Co-Processors for Multi-Threaded E...Klessydra-T: Designing Configurable Vector Co-Processors for Multi-Threaded E...
Klessydra-T: Designing Configurable Vector Co-Processors for Multi-Threaded E...
RISC-V International
 
design-compiler.pdf
design-compiler.pdfdesign-compiler.pdf
design-compiler.pdf
FrangoCamila
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and Development
IJERD Editor
 
A Cryptographic Hardware Revolution in Communication Systems using Verilog HDL
A Cryptographic Hardware Revolution in Communication Systems using Verilog HDLA Cryptographic Hardware Revolution in Communication Systems using Verilog HDL
A Cryptographic Hardware Revolution in Communication Systems using Verilog HDL
idescitation
 
11
1111
cug2011-praveen
cug2011-praveencug2011-praveen
cug2011-praveen
Praveen Narayanan
 
IRJET- A Review on Various Secured Data Encryption Models based on AES Standard
IRJET- A Review on Various Secured Data Encryption Models based on AES StandardIRJET- A Review on Various Secured Data Encryption Models based on AES Standard
IRJET- A Review on Various Secured Data Encryption Models based on AES Standard
IRJET Journal
 
A04660105
A04660105A04660105
A04660105
IOSR-JEN
 
Novel Adaptive Hold Logic Circuit for the Multiplier using Add Round Key and ...
Novel Adaptive Hold Logic Circuit for the Multiplier using Add Round Key and ...Novel Adaptive Hold Logic Circuit for the Multiplier using Add Round Key and ...
Novel Adaptive Hold Logic Circuit for the Multiplier using Add Round Key and ...
IJMTST Journal
 
An OpenCL Method of Parallel Sorting Algorithms for GPU Architecture
An OpenCL Method of Parallel Sorting Algorithms for GPU ArchitectureAn OpenCL Method of Parallel Sorting Algorithms for GPU Architecture
An OpenCL Method of Parallel Sorting Algorithms for GPU Architecture
Waqas Tariq
 
Aes
AesAes
Area efficient parallel LFSR for cyclic redundancy check
Area efficient parallel LFSR for cyclic redundancy check  Area efficient parallel LFSR for cyclic redundancy check
Area efficient parallel LFSR for cyclic redundancy check
IJECEIAES
 
Architecture innovations in POWER ISA v3.01 and POWER10
Architecture innovations in POWER ISA v3.01 and POWER10Architecture innovations in POWER ISA v3.01 and POWER10
Architecture innovations in POWER ISA v3.01 and POWER10
Ganesan Narayanasamy
 
20120140506024
2012014050602420120140506024
20120140506024
IAEME Publication
 
The CAOS framework: Democratize the acceleration of compute intensive applica...
The CAOS framework: Democratize the acceleration of compute intensive applica...The CAOS framework: Democratize the acceleration of compute intensive applica...
The CAOS framework: Democratize the acceleration of compute intensive applica...
NECST Lab @ Politecnico di Milano
 
Compare Performance-power of Arm Cortex vs RISC-V for AI applications_oct_2021
Compare Performance-power of Arm Cortex vs RISC-V for AI applications_oct_2021Compare Performance-power of Arm Cortex vs RISC-V for AI applications_oct_2021
Compare Performance-power of Arm Cortex vs RISC-V for AI applications_oct_2021
Deepak Shankar
 

Similar to TiReX: Tiled Regular eXpression matching architecture (20)

TiReX: Tiled Regular eXpression matching architecture
TiReX: Tiled Regular eXpression matching architectureTiReX: Tiled Regular eXpression matching architecture
TiReX: Tiled Regular eXpression matching architecture
 
The CAOS framework: democratize the acceleration of compute intensive applica...
The CAOS framework: democratize the acceleration of compute intensive applica...The CAOS framework: democratize the acceleration of compute intensive applica...
The CAOS framework: democratize the acceleration of compute intensive applica...
 
A Highly Parallel Semi-Dataflow FPGA Architecture for Large-Scale N-Body Simu...
A Highly Parallel Semi-Dataflow FPGA Architecture for Large-Scale N-Body Simu...A Highly Parallel Semi-Dataflow FPGA Architecture for Large-Scale N-Body Simu...
A Highly Parallel Semi-Dataflow FPGA Architecture for Large-Scale N-Body Simu...
 
3rd 3DDRESD: ReCPU 4 NIDS
3rd 3DDRESD: ReCPU 4 NIDS3rd 3DDRESD: ReCPU 4 NIDS
3rd 3DDRESD: ReCPU 4 NIDS
 
Klessydra-T: Designing Configurable Vector Co-Processors for Multi-Threaded E...
Klessydra-T: Designing Configurable Vector Co-Processors for Multi-Threaded E...Klessydra-T: Designing Configurable Vector Co-Processors for Multi-Threaded E...
Klessydra-T: Designing Configurable Vector Co-Processors for Multi-Threaded E...
 
design-compiler.pdf
design-compiler.pdfdesign-compiler.pdf
design-compiler.pdf
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and Development
 
A Cryptographic Hardware Revolution in Communication Systems using Verilog HDL
A Cryptographic Hardware Revolution in Communication Systems using Verilog HDLA Cryptographic Hardware Revolution in Communication Systems using Verilog HDL
A Cryptographic Hardware Revolution in Communication Systems using Verilog HDL
 
11
1111
11
 
cug2011-praveen
cug2011-praveencug2011-praveen
cug2011-praveen
 
IRJET- A Review on Various Secured Data Encryption Models based on AES Standard
IRJET- A Review on Various Secured Data Encryption Models based on AES StandardIRJET- A Review on Various Secured Data Encryption Models based on AES Standard
IRJET- A Review on Various Secured Data Encryption Models based on AES Standard
 
A04660105
A04660105A04660105
A04660105
 
Novel Adaptive Hold Logic Circuit for the Multiplier using Add Round Key and ...
Novel Adaptive Hold Logic Circuit for the Multiplier using Add Round Key and ...Novel Adaptive Hold Logic Circuit for the Multiplier using Add Round Key and ...
Novel Adaptive Hold Logic Circuit for the Multiplier using Add Round Key and ...
 
An OpenCL Method of Parallel Sorting Algorithms for GPU Architecture
An OpenCL Method of Parallel Sorting Algorithms for GPU ArchitectureAn OpenCL Method of Parallel Sorting Algorithms for GPU Architecture
An OpenCL Method of Parallel Sorting Algorithms for GPU Architecture
 
Aes
AesAes
Aes
 
Area efficient parallel LFSR for cyclic redundancy check
Area efficient parallel LFSR for cyclic redundancy check  Area efficient parallel LFSR for cyclic redundancy check
Area efficient parallel LFSR for cyclic redundancy check
 
Architecture innovations in POWER ISA v3.01 and POWER10
Architecture innovations in POWER ISA v3.01 and POWER10Architecture innovations in POWER ISA v3.01 and POWER10
Architecture innovations in POWER ISA v3.01 and POWER10
 
20120140506024
2012014050602420120140506024
20120140506024
 
The CAOS framework: Democratize the acceleration of compute intensive applica...
The CAOS framework: Democratize the acceleration of compute intensive applica...The CAOS framework: Democratize the acceleration of compute intensive applica...
The CAOS framework: Democratize the acceleration of compute intensive applica...
 
Compare Performance-power of Arm Cortex vs RISC-V for AI applications_oct_2021
Compare Performance-power of Arm Cortex vs RISC-V for AI applications_oct_2021Compare Performance-power of Arm Cortex vs RISC-V for AI applications_oct_2021
Compare Performance-power of Arm Cortex vs RISC-V for AI applications_oct_2021
 

More from NECST Lab @ Politecnico di Milano

Mesticheria Team - WiiReflex
Mesticheria Team - WiiReflexMesticheria Team - WiiReflex
Mesticheria Team - WiiReflex
NECST Lab @ Politecnico di Milano
 
Punto e virgola Team - Stressometro
Punto e virgola Team - StressometroPunto e virgola Team - Stressometro
Punto e virgola Team - Stressometro
NECST Lab @ Politecnico di Milano
 
BitIt Team - Stay.straight
BitIt Team - Stay.straight BitIt Team - Stay.straight
BitIt Team - Stay.straight
NECST Lab @ Politecnico di Milano
 
BabYodini Team - Talking Gloves
TiReX: Tiled Regular eXpression matching architecture

  • 1. TiReX: Tiled Regular eXpressions matching architecture Alessandro Comodi, Davide Conficconi {alessandro.comodi, davide.conficconi}@mail.polimi.it Alberto Scolari, Marco Santambrogio {alberto.scolari, marco.santambrogio}@polimi.it 25th Reconfigurable Architectures Workshop (RAW) 2018 21/05/2018
  • 3. Current issues: the trade-off between performance and flexibility. Current approaches lack flexibility: FPGA-based designs require embedding the regex into the architecture (= re-synthesis), and ASIC technology offers no flexibility at all.
  • 4. Our solution and claims: Based on previous work [1] proposing Regular Expressions as a high-level language driving a custom processor. The improvements with respect to ReCPU are: (a) a better preprocessing mechanism for the RegExp and a renewed single-core design; (b) a scalable multi-core architecture for parallelized computations, reaching a 100x speedup over Flex; (c) a cross-platform design that is easily integrable with heterogeneous architectures. [1] M. Paolieri et al., “ReCPU: A parallel and pipelined architecture for regular expression matching,” in VLSI-SoC, Springer, 2009.
  • 5. Outline: Related work; TiReX design and implementation; Evaluation; Conclusions and future work.
  • 6. Related Work (1): Most works use DFAs (Deterministic Finite Automata) and address DFA limitations, offering high matching speed at the cost of a fixed structure. Memory usage grows along with RegExp complexity: [1], [2] cluster states and group transitions. Others focus on achieving an efficient lookup process: hash-based encoding schemes are another way to solve the problem [3], as are bitmap index structures [4]. [1] L. Jiang et al., “A fast regular expression matching engine for NIDS applying prediction scheme,” in Computers and Communication (ISCC), 2014. [2] J. van Lunteren and A. Guanella, “Hardware-accelerated regular expression matching at multiple tens of Gb/s,” in INFOCOM, 2012. [3] K. Agarwal and R. Polig, “A high-speed and large-scale dictionary matching engine for information extraction systems,” in Application-Specific Systems, Architectures and Processors (ASAP), IEEE, 2013. [4] X.-T. Nguyen, H.-T. Nguyen, K. Inoue, O. Shimojo, and C.-K. Pham, “Highly parallel bitmap-based regular expression matching for text analytics,” in Circuits and Systems (ISCAS), 2017.
  • 7. Related Work (2): A DFA encodes a single RegExp and matches one character at a time, so it is intrinsically sequential. Single-character analysis for the basic version: [5] Ternary Content Addressable Memories (TCAMs); [6], [7] precomputation of transitions. Some works leverage hardware parallelism to match input against multiple RegExps: [8] uses a GPU to activate a new DFA for every initial character. [5] C. R. Meiners et al., “Fast regular expression matching using small TCAMs for network intrusion detection and prevention systems,” 2010. [6] J. Yang et al., “PiDFA: A practical multi-stride regular expression matching engine based on FPGA,” in ICC, 2016. [7] K. Atasu et al., “Hardware-accelerated regular expression matching for high-throughput text analytics,” in FPL, 2013. [8] G. Vasiliadis, M. Polychronakis, S. Antonatos, E. P. Markatos, and S. Ioannidis, “Regular expression matching on graphics hardware for intrusion detection,” in International Workshop on Recent Advances in Intrusion Detection, Springer, 2009.
  • 8. Our Approach: As in ReCPU, RegExps are translated into program instructions. The TiReX matching core runs these instructions on the input data, based on a dedicated Instruction Set Architecture (ISA). Each RegExp is compiled in software into a sequence of TiReX instructions.
  • 9. Flow: [Figure: a Regular Expression (RE) enters the Compiler, which emits a sequence from the Instruction Set (1: & ACGT; 2: JIM offset; 3: (; 4: |)* AC; 5: & TT). The instructions run on the Data (a DNA sequence beginning ACGTCGGGGCGTGCAAATGCCCCGTGCGA…) and produce the Match results.]
  • 10. TiReX ISA. Each instruction carries an Opcode and a Reference (32 bits, for at most 4 characters):
      Opcode     RegExp   Description
      0 00 000   NOP      No Operation
      1 00 000   (        Enter subroutine
      0 10 000   AND      And of cluster matches
      0 01 000   OR       Or of cluster matches
      0 11 000   .        Match any character
      0 00 001   )*       Match any number of sub-RE
      0 00 010   )+       Match one or more sub-RE
      0 00 011   )|       Match previous sub-RE or next one
      0 00 100   )        End of subroutine
      0 00 101   OKP      Open Kleene Parenthesis
      0 00 111   JIM      Jump If Match
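As a rough illustration of the table above, the opcodes can be written down as bit patterns and packed next to a 4-character Reference. This is only a sketch: the `pack` helper, the field order (opcode placed above the Reference) and the zero padding of short References are assumptions made for the example, not details given in the slides.

```python
# Opcode bit patterns copied from the ISA table (spaces removed).
OPCODES = {
    "NOP": 0b000000,
    "(":   0b100000,
    "AND": 0b010000,
    "OR":  0b001000,
    ".":   0b011000,
    ")*":  0b000001,
    ")+":  0b000010,
    ")|":  0b000011,
    ")":   0b000100,
    "OKP": 0b000101,
    "JIM": 0b000111,
}


def pack(op: str, reference: str = "") -> int:
    """Pack an opcode and up to 4 ASCII characters into one integer.

    Hypothetical layout: 6 opcode bits above a 32-bit Reference,
    with short References padded with zero bytes on the right.
    """
    if len(reference) > 4:
        raise ValueError("the Reference holds at most 4 characters")
    ref_bytes = reference.encode("ascii").ljust(4, b"\0")
    return (OPCODES[op] << 32) | int.from_bytes(ref_bytes, "big")


if __name__ == "__main__":
    # Print the packed word for an AND instruction with Reference "ACGT".
    print(f"{pack('AND', 'ACGT'):038b}")
```

Keeping the Reference at a fixed 32 bits mirrors the "at most 4 characters" note in the table; everything else about the word format is left open by the slides.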
  • 11. TiReX ISA (continued): All characters in the Reference must be equal to the input data to have a match. RegExp: ACCGTGGA. Input 1: Input 2: TGGA GACCTACACCG ACCA TGGACTAGAGG
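The matching rule on this slide (presumably the AND instruction, given its description in the ISA table) can be sketched in software. The helper names `split_literal`, `and_match` and `find_literal` are hypothetical, and splitting a literal into 4-character References is an assumption based on the Reference size; the hardware compares characters in parallel, while this sketch simply compares strings.

```python
def split_literal(pattern: str, width: int = 4) -> list[str]:
    """Cut a literal pattern into References of at most `width` characters."""
    return [pattern[i:i + width] for i in range(0, len(pattern), width)]


def and_match(reference: str, data: str, pos: int) -> bool:
    """True only if every character of the Reference equals the data at pos."""
    return data[pos:pos + len(reference)] == reference


def find_literal(pattern: str, data: str) -> int:
    """Return the first offset where all References match in sequence, or -1."""
    references = split_literal(pattern)
    for start in range(len(data) - len(pattern) + 1):
        pos = start
        for reference in references:
            if not and_match(reference, data, pos):
                break
            pos += len(reference)
        else:
            return start
    return -1


if __name__ == "__main__":
    print(split_literal("ACCGTGGA"))
    print(find_literal("ACCGTGGA", "ACCGTGGAGACC"))
    print(find_literal("ACCGTGGA", "ACCATGGACTAG"))
```

The second input differs from the pattern in a single character of the first Reference, so the whole match fails, which is the behavior the slide describes.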
  • 12. TiReX ISA (continued): A special instruction directs the jump backward in the program, like in a «for loop», for Kleene operators. RegExp: (ACGT)+. Input 1: ACGT ACGT GACC
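The backward jump for Kleene operators behaves like a loop that keeps re-running the sub-expression while it matches. A minimal software sketch for the `)+` case follows; `match_plus` is a hypothetical helper, and it models only a literal sub-expression, not the actual instruction sequencing in the core.

```python
def match_plus(reference: str, data: str, pos: int = 0) -> int:
    """Match one or more repetitions of `reference` starting at `pos`.

    Returns the position just after the last repetition, or -1 if the
    sub-expression does not match even once.
    """
    repetitions = 0
    # Each successful comparison "jumps back" to try the sub-expression again.
    while data.startswith(reference, pos):
        pos += len(reference)
        repetitions += 1
    return pos if repetitions >= 1 else -1


if __name__ == "__main__":
    # The slide's input, without the spaces used for readability.
    print(match_plus("ACGT", "ACGTACGTGACC"))
```

On the slide's input the loop consumes the two ACGT repetitions and stops at GACC.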
  • 13. TiReX ISA (continued): A special instruction directs the jump forward in the program, like in an «if else» statement, for chained ORs. RegExp: (TTTT)|(GCAT)|(CTGA). Input 1: GCAT GACCTAC
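Chained ORs can be pictured as an if/else ladder: the alternatives are tried in order, and the first one that matches lets the program skip the remaining ones. The sketch below assumes literal alternatives only; `match_alternatives` is a hypothetical helper, not part of TiReX.

```python
from typing import Optional


def match_alternatives(alternatives: list[str], data: str,
                       pos: int = 0) -> Optional[tuple[int, int]]:
    """Try each alternative in order at `pos`.

    Returns (index of the matching alternative, position after the match),
    or None when no alternative matches.
    """
    for index, reference in enumerate(alternatives):
        if data.startswith(reference, pos):
            # A match here corresponds to jumping forward past the other alternatives.
            return index, pos + len(reference)
    return None


if __name__ == "__main__":
    # RegExp (TTTT)|(GCAT)|(CTGA) on the slide's input, spaces removed.
    print(match_alternatives(["TTTT", "GCAT", "CTGA"], "GCATGACCTAC"))
```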
  • 14. Single Core Architecture: Overview. [Block diagram: Instruction Memory, Data Buffer, Fetch & Decode, Execution and Control Path, connected by Address, Instruction, Data, Opcode, Reference, Match and Control signals.]
  • 16. Single Core Architecture: Details. Fetch & Decode: F&D Unit A: back up; F&D Unit B: next one; F&D Unit C: jump.
  • 17. Single Core Architecture: Details. Execute: 4 clusters of 4 comparators; an engine computes the stage result.
  • 18. Single Core Architecture: Details. Data Buffer: addressable buffer. Intermediate registers: back up; hold data; shift of 1-4 characters.
  • 19. Single Core Architecture: Details. Control Path: status register of the computation; stack for nested parentheses; completely redesigned FSM.
  • 20. Multi core: Since the recognition process is highly parallelizable, we adopt a multi-core architecture. [Figure: TiReX cores 1 to n, each with its own BRAM; RegExps AGCT(A|C)*TT, AGCT, AG*(TTAC), GTTTG(AC)*; one Data input.]
  • 21. Multi core: Since the recognition process is highly parallelizable, we adopt a multi-core architecture. [Figure: TiReX cores 1 to n, each with its own BRAM; one RegExp AGCT(A|C)*TT; data inputs Data1 to Datan.]
  • 22. Multi core: Boundary conditions. Customizable conditions to avoid boundary match. [Figure: Data split into Chunk 0 to Chunk 3, with a match of length N.]
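The slides do not say how the boundary conditions are implemented. One common software technique for the same problem, shown here purely as an assumption-labeled sketch, is to let each chunk overlap the next by N-1 characters so that a match of length N that straddles a chunk boundary is still seen whole by one core. The names `overlapping_chunks` and `find_in_chunks` are hypothetical.

```python
def overlapping_chunks(data: str, n_chunks: int, match_len: int) -> list[tuple[int, str]]:
    """Split `data` into n_chunks pieces, each extended by match_len - 1 characters.

    Returns (global start offset, chunk text) pairs.
    """
    size = max(1, -(-len(data) // n_chunks))  # ceiling division
    overlap = match_len - 1
    return [(start, data[start:start + size + overlap])
            for start in range(0, len(data), size)]


def find_in_chunks(pattern: str, data: str, n_chunks: int) -> list[int]:
    """Find every occurrence of a literal pattern, one chunk at a time.

    Because each chunk is exactly size + len(pattern) - 1 characters long,
    every match found in a chunk starts inside that chunk's own region,
    so no match is reported twice.
    """
    hits = []
    for start, chunk in overlapping_chunks(data, n_chunks, len(pattern)):
        local = chunk.find(pattern)
        while local != -1:
            hits.append(start + local)
            local = chunk.find(pattern, local + 1)
    return hits


if __name__ == "__main__":
    data = "AAA" + "ACCGT" + "A" * 8  # the match crosses the Chunk 0 / Chunk 1 boundary
    print(find_in_chunks("ACCGT", data, 4))
```

In this sketch each chunk would go to a different core; the overlap costs N-1 extra characters per core but avoids any communication between cores.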
  • 23. Experimental setup and results. Evaluation environment: a VC707 evaluation platform powered by a Virtex-7 FPGA, and a Digilent PYNQ-Z1 board powered by a ZYNQ SoC comprising an ARM CPU and a Xilinx FPGA. We compare against a Flex program compiled with O3 optimizations and run on an Intel i7 with a peak frequency of 2.8 GHz.
  • 24. Single Core Area Utilization:
      VC707 board   Slice LUTs   Slice Reg.   F7 Muxes
      Used          1921         1175         261
      Percentage    0.63%        0.29%        0.17%
      PYNQ board    Slice LUTs   Slice Reg.   F7 Muxes
      Used          1845         1775         261
      Percentage    3.46%        1.66%        0.98%
      [Charts: VC707 resources utilization; PYNQ resources utilization.]
  • 25. VC707 and PYNQ Results (dataset: 16 KB of the first Homo sapiens chromosome):
      Regular Expression               Flex     16-core (VC707) @ 130 MHz   Speedup
      ACCGTGGA                         271 µs   2.07 µs                     130.90x
      (TTT)+CT                         121 µs   4.54 µs                     26.65x
      (CAGT)|(GGGG)|(TTGG)TGCA(C|G)+   263 µs   3.36 µs                     78.27x
      Regular Expression               Flex     8-core (PYNQ) @ 70 MHz      Speedup
      ACCGTGGA                         271 µs   7.2 µs                      37.63x
      (TTT)+CT                         121 µs   8.21 µs                     14.73x
      (CAGT)|(GGGG)|(TTGG)TGCA(C|G)+   263 µs   30.3 µs                     8.67x
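The speedup column is the ratio between the Flex time and the TiReX time. The short script below recomputes it from the rounded timings on the slide; the `speedup` helper and the `ROWS` layout are only for illustration. Since the inputs are rounded, the recomputed ratios can differ from the reported ones in the second decimal place.

```python
# (platform, RegExp, Flex time in µs, TiReX time in µs, speedup reported on the slide)
ROWS = [
    ("VC707", "ACCGTGGA", 271, 2.07, 130.90),
    ("VC707", "(TTT)+CT", 121, 4.54, 26.65),
    ("VC707", "(CAGT)|(GGGG)|(TTGG)TGCA(C|G)+", 263, 3.36, 78.27),
    ("PYNQ", "ACCGTGGA", 271, 7.2, 37.63),
    ("PYNQ", "(TTT)+CT", 121, 8.21, 14.73),
    ("PYNQ", "(CAGT)|(GGGG)|(TTGG)TGCA(C|G)+", 263, 30.3, 8.67),
]


def speedup(flex_us: float, tirex_us: float) -> float:
    """Speedup of TiReX over Flex: how many times shorter the TiReX run is."""
    return flex_us / tirex_us


if __name__ == "__main__":
    for platform, regexp, flex_us, tirex_us, reported in ROWS:
        computed = speedup(flex_us, tirex_us)
        print(f"{platform:6} {regexp:32} {computed:7.2f}x (slide: {reported}x)")
```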
  • 26. Comparisons with Related works:
      Solution        Clock Frequency [MHz]   Bitrate [Gb/s]   Flexibility
      VC707 16-core   130                     16.64–66.54
      PYNQ 8-core     70                      4.48–17.92
      [1] ASIC        318.47                  10.19–18.18
      [2] FPGA        150                     230–430
      [3] FPGA        100                     3.2
      [3] ASIC        1000                    256
      [1] M. Paolieri et al., “ReCPU: A parallel and pipelined architecture for regular expression matching,” in VLSI-SoC: Advanced Topics on Systems on a Chip, Springer, 2009. [2] L. Jiang et al., “A fast regular expression matching engine for NIDS applying prediction scheme,” in Computers and Communication (ISCC), 2014 IEEE Symposium on. [3] V. Gogte et al., “HARE: Hardware accelerator for regular expressions,” in Microarchitecture (MICRO), 2016 49th Annual IEEE/ACM International Symposium on.
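The slides do not state how the TiReX bitrate ranges were obtained. The figures are, however, close to what one gets by assuming that every core consumes between 1 and 4 characters of 8 bits per clock cycle. The sketch below checks that reading; the formula and the `bitrate_gbps` helper are assumptions, not the authors' stated method.

```python
def bitrate_gbps(clock_mhz: float, cores: int, chars_per_cycle: int,
                 bits_per_char: int = 8) -> float:
    """Aggregate bitrate in Gb/s if every core consumes `chars_per_cycle`
    characters per clock cycle (an assumed model, see the text above)."""
    return clock_mhz * 1e6 * cores * chars_per_cycle * bits_per_char / 1e9


if __name__ == "__main__":
    for name, clock_mhz, cores in [("VC707 16-core", 130, 16), ("PYNQ 8-core", 70, 8)]:
        low = bitrate_gbps(clock_mhz, cores, 1)
        high = bitrate_gbps(clock_mhz, cores, 4)
        print(f"{name}: {low:.2f} to {high:.2f} Gb/s")
```

Under this model both lower bounds and the PYNQ upper bound agree with the table, while the VC707 upper bound comes out at 66.56 rather than the reported 66.54 Gb/s.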
  • 30. Conclusions and future work: We have presented a multicore pattern matching architecture implemented on an FPGA. It outperforms the Flex solution, gaining a 100x speedup with remarkable flexibility. Future work: performance improvements; exploration of different memory hierarchies; multicore interconnection studies.
  • 31. Conclusions and future work (closing slide): Thank you for your attention… Questions? Alessandro Comodi, Davide Conficconi {alessandro.comodi, davide.conficconi}@mail.polimi.it; Alberto Scolari, Marco Santambrogio {alberto.scolari, marco.santambrogio}@polimi.it. NECST: www.necst.it. Slideshare NECST: www.slideshare.net/necstlab. RAW FB Group: facebook.com/groups/ReconfigurableArchitecturesWorkshop