Hardware Approaches for Fast Lookup & Classification
November 09, 2005
by Jignesh Patel
CS590AC: Advanced Computing System Design

Outline
  • Motivation
  • RAM Based Lookup
  • CAM Based Lookup
  • References

Motivation
  • Need for speed
      • High-speed packet processing
      • Interfaces can support OC-192c (10 Gbps) and OC-768c (40 Gbps)
      • Interfaces for 10 Tbps are under development
  • Software/RAM-based implementation
      • Disadvantage: lookup is not fast enough to match wire speed
      • Advantage: flexible for later modifications
          • But such modifications are unlikely in the near future
          • E.g., the IP addressing scheme based on best matching prefix

RAM Based Lookup
  • All software approaches use some form of random access memory (RAM) to store and retrieve data structures
  • RAM operations
      • Writing data into a specific address
      • Reading data from a given address
  • IP address lookup and packet classification/filtering need multiple RAM operations

RAM Based Lookup
  • How to perform a lookup in a single memory access?
      • Use the destination address as a direct index (address) into memory
      • The data stored at this address is the next-hop information
  • Issues?
      • The size of RAM required for direct indexing grows exponentially with the number of bits in the destination address
      • E.g., a 32-bit IPv4 address needs 4 GB of RAM (2^32 entries, assuming one byte per entry)
      • A 128-bit IPv6 address needs 316912650057057350374175801344 GB (2^98 GB)
  • RAM-based direct-index lookup is not used by any router vendors
  [Figure: memory indexed directly by destination address (000.000.000.000, 128.128.128.128, 172.12.180.20, 255.255.255.255), with next-hop values 4, 2, 6 stored at the indexed locations]

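The sizes quoted on this slide can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming one byte of next-hop information per entry and 1 GB = 2^30 bytes; the function names are made up for illustration.

```python
def direct_index_entries(address_bits: int) -> int:
    """Number of table entries needed so that every address is a valid index."""
    return 2 ** address_bits


def direct_index_gb(address_bits: int, bytes_per_entry: int = 1) -> int:
    """Table size in GB (2**30 bytes); bytes_per_entry = 1 is an assumption."""
    return direct_index_entries(address_bits) * bytes_per_entry // 2 ** 30


print(direct_index_gb(32))    # 4
print(direct_index_gb(128))   # 316912650057057350374175801344
```

Each extra address bit doubles the table, which is why the IPv6 figure is 2^96 times the IPv4 one.
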
CAM Based Lookup
  • Content-addressable memories (CAMs)
      • Hardware search engines
      • Much faster than algorithmic search techniques
      • Use conventional memory (usually SRAM) with additional circuitry for comparisons
      • The comparison circuitry lets a search of the entire memory complete in a single clock cycle

CAM Based Lookup
  • How is it different from RAM?
  • RAM
      • Data is stored at a particular location called the address
      • The user supplies the address to retrieve the data
  • CAM
      • Data can be stored without knowing the address; it is stored in the next free location
      • The user supplies the data and gets the address back
  • A CAM word consists of
      • search-field: matched against the search key
      • return-field: the information returned after a successful search
  • E.g., the search-field usually contains addresses of known destinations and the return-field contains the next hop or related information

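The RAM/CAM contrast can be illustrated with a small software model. This is only a sketch: the class name `BinaryCAM` and its methods are hypothetical, and the loop in `search` stands in for the parallel comparison that real CAM hardware performs in one clock cycle.

```python
class BinaryCAM:
    """Toy model of a CAM whose words hold (search_field, return_field)."""

    def __init__(self, num_words):
        self.words = [None] * num_words

    def store(self, search_field, return_field):
        """Store the word in the next free location and return its address."""
        for address, word in enumerate(self.words):
            if word is None:
                self.words[address] = (search_field, return_field)
                return address
        raise MemoryError("CAM is full")

    def search(self, key):
        """Return (address, return_field) for the matching word, or None."""
        # Hardware compares every word at once; this loop is a software stand-in.
        for address, word in enumerate(self.words):
            if word is not None and word[0] == key:
                return address, word[1]
        return None


cam = BinaryCAM(num_words=4)
cam.store("10.0.0.1", "port 2")
cam.store("10.0.0.2", "port 5")
print(cam.search("10.0.0.2"))   # (1, 'port 5')
```

The caller never picks an address when storing; the address only appears as a result of the search.
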
CAM Based Lookup
  • The size of the CAM depends on
      • The number of prefixes that need to be stored
      • The size of the key only affects the number of bits stored in each location
      • E.g., for searching 256 entries of 32-bit IPv4 addresses, the CAM must have 256 words, each at least 32 bits long
  • The access speed depends on the size of the associated information
      • If the associated information is small (e.g., output port/interface number)
          • The CAM word can store it along with the address to match
          • Provides fast, direct access since it requires a single CAM read
      • If the associated information is large (e.g., layer-2 MAC address)
          • The CAM word stores an index to the associated information
          • Needs both a CAM read and a RAM read

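The two access patterns differ only in whether a second memory read is needed. The sketch below models them with plain Python containers; the address, port number, and MAC address are made-up illustration values, and the dictionaries stand in for the CAM.

```python
DEST = 0xC0A80001  # 192.168.0.1, a hypothetical destination address

# Small associated information: the CAM word holds the output port itself.
cam_inline = {DEST: 3}

# Large associated information: the CAM word holds an index into a RAM table.
cam_index = {DEST: 0}
ram = ["00:1a:2b:3c:4d:5e"]  # hypothetical layer-2 MAC address


def lookup_inline(key):
    """One CAM read returns the answer directly."""
    return cam_inline.get(key)


def lookup_indirect(key):
    """One CAM read yields an index, then one RAM read yields the data."""
    index = cam_index.get(key)   # CAM read
    if index is None:
        return None
    return ram[index]            # RAM read


print(lookup_inline(DEST))     # 3
print(lookup_indirect(DEST))   # 00:1a:2b:3c:4d:5e
```
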
CAM Based Lookup
  • For IP address lookup
      • A longest prefix matching operation can be performed using exact-match search in 32 separate CAMs
      • CAM-i stores prefixes of length i
      • The incoming IP address is given as input to all CAMs
      • The outputs of the CAMs are filtered through a priority encoder, which picks the longest matching CAM
      • Expensive: each CAM needs to be big enough to store a large number of prefixes
  [Figure: the IP address feeds CAM-1, CAM-2, ..., CAM-32 in parallel; their outputs go to a priority encoder, whose result indexes the next-hop table in RAM]

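The 32-CAM arrangement can be sketched in software as one exact-match table per prefix length plus a step that keeps the longest hit. This is an illustrative model, not a hardware description: the class name and the use of dictionaries as CAMs are assumptions, and the zero-length default route is deliberately left out because the slide only describes CAM-1 through CAM-32.

```python
import ipaddress


class LengthPartitionedLPM:
    """Longest prefix match using one exact-match table per prefix length."""

    def __init__(self):
        # cams[i] models CAM-i, holding only prefixes of length i (1..32).
        self.cams = {length: {} for length in range(1, 33)}

    def add_prefix(self, cidr, next_hop):
        net = ipaddress.IPv4Network(cidr)
        if net.prefixlen == 0:
            raise ValueError("default route is not modeled in this sketch")
        # The exact-match key is just the top prefixlen bits of the address.
        key = int(net.network_address) >> (32 - net.prefixlen)
        self.cams[net.prefixlen][key] = next_hop

    def lookup(self, address):
        addr = int(ipaddress.IPv4Address(address))
        # In hardware all CAMs are searched at once; here we collect the hits.
        hits = {}
        for length, cam in self.cams.items():
            key = addr >> (32 - length)
            if key in cam:
                hits[length] = cam[key]
        if not hits:
            return None
        # Priority encoder: the longest matching prefix length wins.
        return hits[max(hits)]


table = LengthPartitionedLPM()
table.add_prefix("10.0.0.0/8", "hop A")
table.add_prefix("10.1.0.0/16", "hop B")
print(table.lookup("10.1.2.3"))   # hop B
print(table.lookup("10.2.0.1"))   # hop A
```
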
CAM Based Lookup
  • A binary CAM stores only two states, 0 and 1
  • Ternary CAM (TCAM)
      • Stores one of three states: 0, 1, and X (don’t care)
      • Allows single-clock-cycle lookups for arbitrary bit-mask matches
      • Stores each W-bit field as a (value, bitmask) pair, where value and bitmask are each W bits wide
      • E.g., if W = 4, the prefix 01* is stored as the pair (0100, 1100)
      • A given input key K matches a stored (value, bitmask) pair if (K & bitmask) = (value & bitmask)

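The match rule and the W = 4 example translate directly into code. A minimal sketch; the helper names `tcam_matches` and `prefix_to_entry` are hypothetical.

```python
def tcam_matches(key, value, bitmask):
    """True if key agrees with value on every bit where bitmask is 1."""
    return (key & bitmask) == (value & bitmask)


def prefix_to_entry(prefix_bits, width):
    """Convert a prefix such as '01' (meaning 01*) into (value, bitmask)."""
    length = len(prefix_bits)
    value = int(prefix_bits, 2) << (width - length) if length else 0
    bitmask = ((1 << length) - 1) << (width - length)
    return value, bitmask


value, bitmask = prefix_to_entry("01", width=4)
print(format(value, "04b"), format(bitmask, "04b"))   # 0100 1100
print(tcam_matches(0b0111, value, bitmask))           # True
print(tcam_matches(0b1001, value, bitmask))           # False
```

Bits where the bitmask is 0 are the X (don't care) positions, so any key of the form 01xx matches.
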
CAM Based Lookup
  • Such prefix matching works well for IP address lookup
  • But it is not well suited for matching ranges (e.g., a port number range)
  • Solution
      • Replace each rule with several rules, each covering a portion of the desired range
      • Requires splitting the range into smaller ranges that can each be expressed as a (value, bitmask) pair
      • E.g., with 4-bit values, the range 2-10 can be split into the set 001*, 01*, 100*, and 1010

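The range split can be computed greedily: repeatedly take the largest power-of-two-aligned block that starts at the low end and still fits in the range. This is one common way to do it, sketched here under the assumption that 0 <= low <= high < 2^width; the slide does not say which procedure the author had in mind, and the function name is hypothetical.

```python
def range_to_prefixes(low, high, width):
    """Split the inclusive range [low, high] into prefixes of width-bit values."""
    prefixes = []
    while low <= high:
        # Largest aligned block that can start at low (everything if low is 0).
        size = low & -low if low else 1 << width
        # Shrink the block until it no longer runs past high.
        while size > high - low + 1:
            size >>= 1
        free_bits = size.bit_length() - 1   # number of don't-care bits
        fixed_bits = width - free_bits
        bits = format(low >> free_bits, "0{}b".format(fixed_bits)) if fixed_bits else ""
        prefixes.append(bits + ("*" if free_bits else ""))
        low += size
    return prefixes


print(range_to_prefixes(2, 10, width=4))   # ['001*', '01*', '100*', '1010']
```

For 4-bit values this reproduces the four entries listed on the slide, so one range rule costs four TCAM entries.
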
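CAM Based Lookup
  • For multiple-field classifiers
      • Needs a TCAM for each field
      • E.g., a two-field classifier
  [Figure: fields F1 and F2 are searched in TCAM-A and TCAM-B; the match results are ANDed and passed to a priority encoder, which selects an entry in the action memory (RAM)]
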
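The two-field arrangement can be modeled as two per-rule match vectors that are ANDed before a priority encoder picks one rule. This is a sketch under stated assumptions: rule i occupies entry i in both TCAMs, the lowest-numbered matching rule wins, and the rules and actions shown are invented for illustration.

```python
def match_vector(tcam, key):
    """One match bit per entry; each entry is a (value, bitmask) pair."""
    return [(key & mask) == (value & mask) for value, mask in tcam]


def classify(tcam_a, tcam_b, actions, f1, f2):
    """Return the action of the first rule matching both fields, or None."""
    hits_a = match_vector(tcam_a, f1)                       # TCAM-A searches F1
    hits_b = match_vector(tcam_b, f2)                       # TCAM-B searches F2
    combined = [a and b for a, b in zip(hits_a, hits_b)]    # AND stage
    for rule_index, hit in enumerate(combined):             # priority encoder
        if hit:
            return actions[rule_index]                      # action memory (RAM)
    return None


# Hypothetical 4-bit rules: rule 0 is F1 = 01**, F2 = 1010; rule 1 matches anything.
tcam_a = [(0b0100, 0b1100), (0b0000, 0b0000)]
tcam_b = [(0b1010, 0b1111), (0b0000, 0b0000)]
actions = ["deny", "permit"]

print(classify(tcam_a, tcam_b, actions, 0b0111, 0b1010))   # deny
print(classify(tcam_a, tcam_b, actions, 0b0111, 0b0001))   # permit
```
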
CAM Based Lookup
  • TCAMs are increasingly being used because of their simplicity and speed
      • Generally used in routers
      • Binary CAMs are used in switches
  • Some disadvantages
      • High cost per bit
      • High power consumption
      • Storage inefficiency

References
  • Chapter 4 draft from the book by Dr. Medhi and Ramasamy
  • Chapter 17 draft from the book by Dr. Medhi and Ramasamy
