-
2014-05-20
10/832,086
2004-04-26
US 8,730,961 B1
2014-05-20
-
-
Yong Zhou
Kilpatrick Townsend & Stockton LLP
2026-10-09
A system and method for reducing the number of cycles used in CAM lookup. A network comprises a plurality of network devices connected to a router. The router comprises a media access controller which is effective to receive an input packet and a packet processor which is effective to receive the input packet from the media access controller and to extract data stored in the input packet. The router further comprises a CAM which is effective to receive the data stored in the input packet from the packet processor, a PRAM, a control processor and a bus. The control processor controls the packet processor and the CAM so that the packet processor extracts a destination address from the input packet and forwards the destination address to the CAM. The packet processor extracts a source address from the input packet and forwards the source address to the CAM. The CAM performs a lookup of the destination and source addresses in parallel. The packet processor extracts miscellaneous information, a source protocol address, and a destination protocol address from the received packet and the CAM performs a lookup of the miscellaneous information, the source protocol address, and the destination protocol address at the same time.
H04L12/56 IPC
Data switching networks; Store-and-forward switching systems; Packet switching systems
The invention relates to router resource management and, more particularly, to a system which optimizes lookup of header information in a received packet.
Referring to FIG. 1, there is shown a basic network architecture in accordance with the prior art. A computer or other network device 50 communicates with other network devices by sending packets of information 52 through a router 54. Each packet 52 includes a header indicating basic information such as the source of the packet (for example computer 50a) or source address (SA) and a destination of the packet (for example computer 50b) or destination address (DA). Router 54 receives each packet, determines the SA and DA, and forwards each packet to its appropriate destination.
Routers typically operate at the Data Link layer (“layer 2”) of the Open Systems Interconnection (“OSI”) model. Their operation is defined in the American National Standards Institute (“ANSI”)/Institute of Electrical and Electronics Engineers (“IEEE”) 802.1D standard. A copy of the ANSI/IEEE Standard 802.1D, 1998 Edition, is incorporated by reference herein in its entirety.
Telecommunication traffic among network devices is divided into seven layers under the OSI model and the layers themselves split into two groups. The upper four layers are used whenever a message passes to or from a user. The lower three layers are used when any message passes through the host computer, whereas messages intended for the receiving computer pass to the upper four layers. “Layer 2” refers to the data-link layer, which provides synchronization for the physical level and furnishes transmission protocol knowledge and management.
Referring to FIG. 2, router 54 may include a media access controller “MAC” 60, a packet processor 62, a content addressable memory (“CAM”) 80, a random access memory including parameter information (“PRAM”) 70, and a transmission manager 66 coupled through a bus 74 and controlled by a processor 72. MAC 60 is an interface by which data in the form of packets is transmitted to and received from router 54. MAC 60 performs any data conversions needed for the packets to later be processed by packet processor 62. Received packets 52 are forwarded by MAC 60 to packet processor 62. For example, if the packets are in the form of 32 bit double data rate data and packet processor 62 processes 64 bit single data rate data, MAC 60 performs the needed conversion. Packet processor 62 acts as a conduit between operations performed inside router 54 and MAC 60. For example, packet processor 62 extracts the DA and SA from a received packet.
CAM 80 receives the DA and SA of a received packet forwarded from packet processor 62 and compares this information with information stored within CAM 80. If the DA and SA match an entry in CAM 80, additional forwarding information regarding the disposition of the received packet is available from PRAM 70 and is retrieved for incorporation into the header of the packet. For example, information such as destination port of the packet, port mirror requirement, packet type, VLAN handling information, prioritization, multicast group membership, etc., may be included in PRAM 70. The received packet is reformatted with a new header using the PRAM information. If the header information in the received packet does not match information in CAM 80, forwarding information is appended directing the packet to a system manager interface (not shown) for additional processing.
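The match-then-retrieve flow described above can be sketched in software as follows. This is only an illustrative model: the dictionary standing in for CAM 80, the list standing in for PRAM 70, the field names, and the names `cam_lookup` and `forwarding_info` are all assumptions introduced here, not part of the described router.

```python
from typing import Optional

# Hypothetical model: the CAM maps a (DA, SA) key to an index, and the PRAM
# holds the forwarding parameters stored at that index.
cam_entries = {("00:11:22:33:44:55", "66:77:88:99:aa:bb"): 0}
pram = [{"dest_port": 3, "vlan": 10, "priority": 1}]

# Assumed representation of "direct the packet to the system manager interface".
SYSTEM_MANAGER = {"dest_port": "system_manager"}


def cam_lookup(da: str, sa: str) -> Optional[int]:
    """Return the index of the matching CAM entry, or None on a miss."""
    return cam_entries.get((da, sa))


def forwarding_info(da: str, sa: str) -> dict:
    """On a hit, fetch parameters from the PRAM; on a miss, punt the packet."""
    index = cam_lookup(da, sa)
    if index is None:
        return SYSTEM_MANAGER
    return pram[index]


hit = forwarding_info("00:11:22:33:44:55", "66:77:88:99:aa:bb")
miss = forwarding_info("de:ad:be:ef:00:01", "de:ad:be:ef:00:02")
print(hit["dest_port"], miss["dest_port"])
```

In hardware the CAM compares the key against all stored entries at once; the dictionary lookup here only mirrors the hit-or-miss outcome, not the timing.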
Current packet formatting standards are moving from Internet Protocol version 4 (IPV4) to Internet Protocol version 6 (IPV6). IPV4 includes a lookup table of 128 bits whereas IPV6 has a table of 320 bits. A typical CAM is 64 bits wide so it can receive 64 bits at one time. To handle the increased table requirements of IPV6, a conventional approach may be simply to use the same cycle timing and run the CAM faster so that processing even with the extra cycles may be performed in a desired time period. Such a solution may work in applications which use application specific integrated circuits (ASICs). However, in CAMs which use field programmable gate arrays (FPGAs), simply running the CAM faster is not available. Yet, use of an FPGA is sometimes desirable as FPGAs are easier to use and program, more readily available, and easier to modify.
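The pressure that the wider IPV6 table places on a 64-bit CAM can be made concrete with a small calculation. The sketch below assumes that the 128-bit and 320-bit figures are the widths of the data that must be loaded into the CAM, and that each cycle loads one CAM-width word; the function name `words_needed` is hypothetical.

```python
import math


def words_needed(key_bits: int, cam_width_bits: int) -> int:
    """Number of CAM-width words (one per load cycle, by assumption)
    needed to present key_bits of lookup data to the CAM."""
    return math.ceil(key_bits / cam_width_bits)


ipv4_words = words_needed(128, 64)
ipv6_words = words_needed(320, 64)
print(ipv4_words, ipv6_words)
```

Under these assumptions the IPV6 data takes five 64-bit loads where the IPV4 data takes two, before any no-operation or turnaround cycles are counted.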
Due to the increased demands of IPV6, it is desirable to reduce the number of cycles used by a CAM and thereby increase processing speed. Therefore, there is a need in the art for a system and method for optimizing the lookup timing of a CAM without simply forcing the CAM to run faster.
One aspect of the invention is a method for performing a lookup of a received packet using a CAM. The method comprises receiving a packet, extracting a destination address from the received packet, and extracting a source address from the received packet. The method further comprises performing a lookup of the destination and source address wherein the extracting and performing is effectuated using only two clock cycles.
Another aspect of the invention is a router comprising a media access controller which is effective to receive an input packet, and a packet processor which is effective to receive the input packet from the media access controller and to extract data stored in the input packet. The router further comprises a CAM which is effective to receive the data stored in the input packet from the packet processor, a PRAM, and a control processor. The router further comprises a bus connected to the media access controller, the packet processor, the CAM, the PRAM and the control processor. The control processor is effective to control the packet processor and the CAM so that the packet processor extracts a destination address from the input packet and forwards the destination address to the CAM. The control processor also controls the packet processor to extract a source address from the input packet and forward the source address to the CAM and the CAM performs a lookup of the destination and source address; wherein the extracting and performing is effectuated using only two clock cycles.
Yet another aspect of the invention is a network comprising a first network device, a second network device, and a router connected to both the first and second network devices. The router comprises a media access controller which is effective to receive an input packet and a packet processor which is effective to receive the input packet from the media access controller and to extract data stored in the input packet. The router further comprises a CAM which is effective to receive the data stored in the input packet from the packet processor, a PRAM, a control processor, and a bus connected to the media access controller, the packet processor, the CAM, the PRAM and the control processor. The control processor is effective to control the packet processor and the CAM so that the packet processor extracts a destination address from the input packet and forwards the destination address to the CAM. The control processor also controls the packet processor to extract a source address from the input packet and forward the source address to the CAM. The control processor also controls the CAM to perform a lookup of the destination and source address, wherein the extracting and performing is effectuated using only two clock cycles.
Still yet another aspect of the invention is a method for performing a lookup of a received packet, the method comprising extracting a destination address of a received packet, and extracting a source address of the received packet. The method further comprises performing a parallel lookup of both the destination and source address, extracting miscellaneous information from the received packet and extracting a source protocol address from the received packet. The method further comprises extracting a destination protocol address from the received packet and performing a lookup of the miscellaneous information, the source protocol address, and the destination protocol address in the same one or more clock cycles.
FIG. 1 is a network diagram showing a prior art network architecture.
FIG. 2 is a network diagram showing a prior art router.
FIG. 3 is a network diagram showing a router in accordance with the invention.
FIG. 4 is a timing diagram illustrating the operation of a CAM in a router in accordance with the invention.
FIG. 5 is a timing diagram illustrating the operation of a CAM in a router in accordance with the invention.
FIG. 6 is a timing diagram illustrating the operation of a CAM in a router in accordance with the invention.
FIG. 7 is a timing diagram illustrating the operation of a CAM in a router in accordance with the invention.
Referring to FIG. 3, there is shown a router 154 in accordance with the invention. Router 154 may include a media access controller “MAC” 160, a packet processor 162, a content addressable memory (“CAM”) 180, a random access memory including parameter information (“PRAM”) 170, and a transmission manager 166 coupled through a bus 174 and controlled by an improved processor 172. MAC 160 is an interface by which data in the form of packets is transmitted to and received from router 154. MAC 160 performs any data conversions needed for the packets to later be processed by packet processor 162. Received packets 52 (FIG. 1) are forwarded by MAC 160 to packet processor 162. For example, if the packets are in the form of 32 bit double data rate data and packet processor 162 processes sixty four (64) bit single data rate data, MAC 160 performs the needed conversion. Packet processor 162 acts as a conduit between operations performed inside router 154 and MAC 160. For example, packet processor 162 extracts the DA and SA from a received packet.
Referring now to FIG. 4, there is shown a clock timing diagram illustrating CAM lookup for use with an IPV6 packet protocol and a CAM capable of 576-bit lookup. The CAM illustrated is available through, for example, NETLOGIC. As shown in FIG. 4, fourteen (14) clock cycles are used to perform a complete lookup for a received packet. In cycle 1, the 48 bit (0 . . . 47) destination address in layer 2 of a received packet is received by CAM 180, temporarily stored in locations 0 . . . 71, and compared with a stored destination address in the CAM. It is noted that while CAMs are shown having widths of 72 bits, clearly the invention may be implemented using a CAM that is 64 bits wide. If such a CAM is utilized, the bit allocations referenced throughout may be simply changed to multiples of 64 bits. In cycle 2, no operation is performed as commercially available CAMs typically do not perform two lookup cycles consecutively.
In cycle 3, the 48 bit (0 . . . 47) source address in layer 2 of the received packet is received by CAM 180, temporarily stored again using locations 0 . . . 71, and compared with a stored source address in the CAM. This comparison in cycle 3 is performed in a different database than that used in cycle 1. Parallel searching in multiple databases is generally not available using the NETLOGIC CAM in FIG. 4.
In cycle 4, 64 (64 . . . 127) bits of the destination protocol address (DPA) stored in layer 3 of the received packet is received by CAM 180 and written to locations 72 . . . 143. In cycle 5, a second 64 bits (0 . . . 63) of the DPA stored in layer 3 of the received packet is received by CAM 180, written to locations 0 . . . 71 and compared with a DPA stored in CAM 180.
In cycles 6-8, dummy data is stored in positions 360 . . . 575 of memory 68. Dummy data is used because previous CAMs could only handle 360 bits and so most received packets do not have data beyond 360 bits. Dummy data is also sometimes used because many CAM systems, like that shown in FIG. 4, do not support early termination of a lookup and so all fourteen cycles need to be used.
In cycle 9, 64 bits of miscellaneous data in layer 4 of the received packet is received by CAM 180 and written to locations 288 . . . 359. In cycle 10, the source protocol address SPA in layer 3 of the received packet having 64 bits is received by CAM 180 and written to locations 216 . . . 287. In cycle 11, the next 64 bits of the SPA is read from the received packet and stored in locations 144 . . . 215. In cycle 12, bits 64 . . . 127 of the DPA stored in layer 3 of the received packet is again forwarded to CAM 180, and stored in locations 72 . . . 143 of memory 68. The DPA is typically processed twice as it usually includes only 32 bits of data. CAMs usually cannot process fewer than 64 bits at one time. Therefore, the DPA is read a first time, stored in half of the 72 (or 64) bits of memory 68, and a mask is applied to the second half of the bits in memory 68 to enable a comparison. Thereafter, the DPA is written to the second half of the 72 (or 64) bits of memory 68 and a mask is applied to the first half of the bits.
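The two-pass masking of a 32 bit DPA described above can be illustrated with a small sketch. It assumes a 64-bit compare word and a conventional ternary match in which masked bits are ignored; the names `masked_match`, `LOW_HALF` and `HIGH_HALF`, and the sample values, are hypothetical.

```python
HALF_BITS = 32
LOW_HALF = (1 << HALF_BITS) - 1      # bits 0..31 of the 64-bit word
HIGH_HALF = LOW_HALF << HALF_BITS    # bits 32..63 of the 64-bit word


def masked_match(word: int, entry: int, care_mask: int) -> bool:
    """Compare only the bits selected by care_mask; all other bits are masked."""
    return ((word ^ entry) & care_mask) == 0


dpa = 0xC0A80001          # sample 32-bit destination protocol address
filler = 0xDEADBEEF       # whatever happens to occupy the unused half

# First pass: DPA in the low half, mask applied to the high half.
first_word = (filler << HALF_BITS) | dpa
first_hit = masked_match(first_word, dpa, LOW_HALF)

# Second pass: DPA in the high half, mask applied to the low half.
second_word = (dpa << HALF_BITS) | filler
second_hit = masked_match(second_word, dpa << HALF_BITS, HIGH_HALF)

# Without a mask the filler bits would spoil the comparison.
unmasked_hit = masked_match(first_word, dpa, LOW_HALF | HIGH_HALF)
print(first_hit, second_hit, unmasked_hit)
```

The mask is what lets a word wider than the data take part in a comparison: the half that does not hold the DPA simply does not contribute to the result.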
In cycle 13, bits 0 . . . 63 of the DPA are again forwarded to CAM 180, and stored in locations 0 . . . 71. Also in cycle 13, CAM 180 performs a comparison of the information in bits 0 . . . 575 and other information stored in the CAM. Such a comparison typically involves a lookup in a single database and so the CAM in FIG. 4 can handle this operation. In cycle 14, no operation is performed and the CAM can turn around its processing.
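The location ranges used in the FIG. 4 walk-through line up with consecutive 72-bit slots of the 576-bit lookup width. The sketch below simply enumerates those slots; treating the register as eight equal slots is a reading of the ranges quoted above, and the names `SLOT_BITS`, `REGISTER_BITS` and `slot_range` are hypothetical.

```python
SLOT_BITS = 72        # CAM width used in the walk-through
REGISTER_BITS = 576   # lookup width stated for the CAM of FIG. 4


def slot_range(slot: int) -> tuple:
    """Return the (first, last) location covered by a 72-bit slot."""
    first = slot * SLOT_BITS
    return first, first + SLOT_BITS - 1


slots = [slot_range(i) for i in range(REGISTER_BITS // SLOT_BITS)]
for i, (first, last) in enumerate(slots):
    print(f"slot {i}: locations {first} . . . {last}")
```

Slots 0 through 4 correspond to the ranges written with packet data (0 . . . 71 up to 288 . . . 359), and slots 5 through 7 cover locations 360 . . . 575, which is consistent with three cycles (6-8) being spent on dummy data.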
Similarly, referring to FIG. 5, there is shown a timing diagram illustrating CAM lookup for use with an IPV4 packet protocol. In cycle 1, the 48 bit (0 . . . 47) destination address in layer 2 of the received packet is received by CAM 180, temporarily stored using locations 0 . . . 71, and compared with a stored destination address in the CAM. In cycle 2, no operation is performed as commercially available CAMs typically do not perform two lookup cycles consecutively.
In cycle 3, the 48 bit (0 . . . 47) source address in layer 2 of the received packet is received by CAM 180, temporarily stored again using locations 0 . . . 71, and compared with a stored source address in the CAM. In cycle 4, no operation is performed. In cycle 5, 32 bits of the DPA stored in layer 3 of the received packet is received by CAM 180 and written to locations 0 . . . 71 of memory 68. A comparison is also performed. As the CAM typically cannot compare only 32 bits, a mask is applied to one of the first or second 32 bits of the received DPA so that the comparison may be made.
In cycle 6, no operation is performed, again to avoid consecutive lookups. In cycle 7, as in cycle 5, 32 bits of the DPA stored in layer 3 of the received packet is received by CAM 180 and written to locations 0 . . . 71. A comparison is performed. A mask is applied to the other of the first or second 32 bits of the received DPA so that the comparison may be made. In cycle 8, 32 bits of layer 3 DPA & SPA are received by CAM 180 and written to locations 72 . . . 143. In cycle 9, bits 0 . . . 63 of miscellaneous data in layer 4 is read into CAM 180, written to locations 0 . . . 71 and a comparison is performed. Cycles 10 and 11 are reserved for output ACL. Cycle 12 has no operations and allows the CAM to turn around.
Referring now to FIG. 6, there is shown another timing diagram illustrating control of another CAM 180a for packets received under the IPV6 protocol. In cycle 1, the 48 bit (0 . . . 47) destination address in layer 2 of a received packet is received by CAM 180a and temporarily stored using locations 72 . . . 143. In cycle 2, the 48 bit (0 . . . 47) source address in layer 2 of the received packet is received by CAM 180a and stored using locations 0 . . . 71. Unlike in the previous embodiment, in cycle 2, all 144 bits are compared with other information in CAM 180a. In this way, both DA and SA lookup are performed in parallel looking in two different databases.
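The parallel DA and SA lookup of cycle 2 can be modeled in software as two masked searches over the same 144-bit word, one per database. This is a sketch only: the databases are plain lists, the searches run one after the other rather than in a single clock cycle, and the names `pack`, `search` and `parallel_lookup` as well as the sample addresses are assumptions.

```python
SLOT_BITS = 72
SLOT_MASK = (1 << SLOT_BITS) - 1
SA_BITS = SLOT_MASK                 # locations 0 . . . 71 hold the SA
DA_BITS = SLOT_MASK << SLOT_BITS    # locations 72 . . . 143 hold the DA


def pack(da: int, sa: int) -> int:
    """Place the DA and SA in adjacent 72-bit slots of one 144-bit word."""
    return (da << SLOT_BITS) | sa


def search(database, word, care_mask):
    """Return the index of the first entry matching on the unmasked bits."""
    for index, entry in enumerate(database):
        if ((word ^ entry) & care_mask) == 0:
            return index
    return None


def parallel_lookup(da_db, sa_db, da, sa):
    """Look up the DA (SA bits masked) and the SA (DA bits masked)."""
    word = pack(da, sa)
    return search(da_db, word, DA_BITS), search(sa_db, word, SA_BITS)


da_db = [pack(0x001122334455, 0)]   # hypothetical destination database
sa_db = [pack(0, 0x66778899AABB)]   # hypothetical source database

both = parallel_lookup(da_db, sa_db, 0x001122334455, 0x66778899AABB)
da_only = parallel_lookup(da_db, sa_db, 0x001122334455, 0x0000DEADBEEF)
print(both, da_only)
```

Because each search masks the slot belonging to the other address, the same 144-bit word can serve both databases, which is what allows the separate DA and SA lookup cycles of FIG. 4 to collapse into one.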
In cycle 3, 64 bits of miscellaneous data in layer 4 of the received packet is received by CAM 180a and written to locations 288 . . . 359. In cycle 4, bits 64 . . . 127 of the SPA in layer 3 of the received packet is received by CAM 180a and written to locations 216 . . . 287. In cycle 5, the next 64 bits 0 . . . 63 of the SPA is read from the received packet and stored in locations 144 . . . 215. In cycle 6, bits 64 . . . 127 of the DPA stored in layer 3 of the received packet is forwarded to CAM 180a, and stored in locations 72 . . . 143. In cycle 7, bits 0 . . . 63 of the DPA are forwarded to CAM 180a, and stored in locations 0 . . . 71. Also in cycle 7, CAM 180a performs a comparison of the information in bits 0 . . . 360 with other information stored in the CAM. Cycles 8 and 9 are reserved and could be used for access control list output. Cycles 10-14 include no operations and allow for a turnaround of CAM 180a.
Referring to FIG. 7, there is shown a timing diagram illustrating CAM lookup for CAM 180a in accordance with the invention for use with an IPV4 packet protocol. In cycle 1, the 48 bit (0 . . . 47) destination address in layer 2 of the received packet is received by CAM 180a, temporarily stored using locations 0 . . . 71, and compared with a stored destination address in the CAM. In cycle 2, no operation is performed.
In cycle 3, the 48 bit (0 . . . 47) source address in layer 2 of the received packet is received by CAM 180a, temporarily stored again using locations 0 . . . 71, and compared with a stored source address in the CAM. In cycle 4, no operation is performed. In cycle 5, 32 bits of the DPA stored in layer 3 of the received packet is received by CAM 180a and written to locations 0 . . . 71. A comparison is also performed. As the CAM typically cannot compare only 32 bits, a mask is applied to one of the first or second 32 bits of the received DPA so that the comparison may be made.
In cycle 6, 32 bits of layer 3 DPA & SPA are received by CAM 180a and written to locations 72 . . . 143. A mask is applied to the other of the first or second 32 bits of the received DPA so that the comparison may be made. In cycle 7, bits 0 . . . 63 of miscellaneous data in layer 4 is read into CAM 180a in locations 0 . . . 71 and a comparison is performed. Cycles 8 and 9 are reserved for output ACL. Cycles 10-12 have no operations and allow the CAM to turn around.
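The four timing diagrams can be compared by noting, for each, the cycles in which the description explicitly states that a comparison is performed and the total number of cycles shown. The sketch below tabulates those figures as read from the text of FIGS. 4-7; the dictionary layout and the name `summarize` are hypothetical.

```python
# Cycles with an explicitly described comparison, and total cycles shown,
# as read from the descriptions of FIGS. 4-7.
SCHEDULES = {
    "FIG. 4 (IPV6, CAM 180)":  {"total": 14, "compares": [1, 3, 5, 13]},
    "FIG. 5 (IPV4, CAM 180)":  {"total": 12, "compares": [1, 3, 5, 7, 9]},
    "FIG. 6 (IPV6, CAM 180a)": {"total": 14, "compares": [2, 7]},
    "FIG. 7 (IPV4, CAM 180a)": {"total": 12, "compares": [1, 3, 5, 7]},
}


def summarize(schedule: dict) -> tuple:
    """Return (cycle of the last comparison, cycles remaining after it)."""
    last = max(schedule["compares"])
    return last, schedule["total"] - last


summary = {name: summarize(s) for name, s in SCHEDULES.items()}
for name, (last, remaining) in summary.items():
    print(f"{name}: last comparison in cycle {last}, {remaining} cycles left")
```

Read this way, the final comparison moves from cycle 13 to cycle 7 for IPV6 and from cycle 9 to cycle 7 for IPV4, leaving the later cycles free for reserved ACL output and turnaround.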
Thus, by identifying data dependency characteristics in received packets, and performing parallel lookup in multiple databases, the number of cycles needed in a CAM lookup operation is reduced.
While the invention has been described and illustrated in connection with preferred embodiments, many variations and modifications as will be evident to those skilled in this art may be made without departing from the spirit and scope of the invention, and the invention is thus not to be limited to the precise details of methodology or construction set forth above as such variations and modifications are intended to be included within the scope of the invention.
1. A method of performing a lookup for a packet using a content-addressable memory (CAM), the method comprising:
storing first information extracted from the packet in a first set of bits of the CAM;
storing second information extracted from the packet in a second set of bits of the CAM, the second set of bits being contiguous to the first set of bits in the CAM; and
performing, in parallel in a single clock cycle, a first lookup of the first information stored in the first set of bits of the CAM in a first database of the CAM and a second lookup of the second information stored in the second set of bits of the CAM in a second database of the CAM, the second database being different from the first database,
wherein performing the first lookup comprises using the first set of bits and the second set of bits of the CAM and applying a mask to the second set of bits of the CAM, and
wherein performing the second lookup comprises using the first set of bits and the second set of bits of the CAM and applying a mask to the first set of bits of the CAM.
2. The method of claim 1 wherein the first information comprises Layer 3 information extracted from the packet.
3. The method of claim 1 wherein the second information comprises Layer 4 information extracted from the packet.
4. A network device comprising:
a processor configured to extract first information and second information from a packet and store the first information and the second information in contiguous sets of bits of a content-addressable memory (CAM); and
the CAM configured to perform, in parallel in a single clock cycle, a first lookup of the first information in a first database of the CAM and a second lookup of the second information in a second database of the CAM, the second database being different from the first database,
wherein the CAM is further configured, in performing the first lookup, to use the contiguous sets of bits of the CAM and apply a mask to a second set of bits of the contiguous sets of bits of the CAM, and
wherein the CAM is further configured, in performing the second lookup, to use the contiguous sets of bits of the CAM and apply a mask to a first set of bits of the contiguous sets of bits of the CAM.
5. The network device of claim 4 wherein the first information comprises Layer 3 information extracted from the packet.
6. The network device of claim 4 wherein the CAM comprises a register configured to store the first information in a first set of bits of the register and the second information in a second set of bits of the register, the second set of bits being contiguous to the first set of bits in the register.
7. The network device of claim 4 wherein the second information comprises Layer 4 information extracted from the packet.
8. An apparatus for processing a packet, the apparatus comprising:
means for extracting first information and second information from the packet;
means for storing the first information and the second information in contiguous sets of bits of a content-addressable memory (CAM); and
means for performing, in parallel in a single clock cycle, a first lookup of the first information in a first database of the CAM and a second lookup of the second information in a second database of the CAM, the second database being different from the first database,
wherein the means for performing the first lookup comprise means for using the contiguous sets of bits of the CAM and applying a mask to a second set of bits of the contiguous sets of bits of the CAM, and
wherein the means for performing the second lookup comprise means for using the contiguous sets of bits of the CAM and applying a mask to a first set of bits of the contiguous sets of bits of the CAM.
9. The apparatus of claim 8 wherein the first information is Layer 3 information extracted from the packet.
10. The apparatus of claim 8 wherein the CAM comprises a register configured to store the first information in a first set of bits of the register and the second information in a second set of bits of the register, the second set of bits being contiguous to the first set of bits in the register.
11. The apparatus of claim 8 wherein the second information is Layer 4 information extracted from the packet.
12. A content-addressable memory (CAM) comprising:
a memory portion configured to store first information and second information extracted from a packet in contiguous sets of bits of the memory portion; and
multiple databases for performing lookups;
wherein the CAM is configured to perform, in parallel in a single clock cycle, a first lookup of the first information in a first database of the multiple databases and a second lookup of the second information in a second database of the multiple databases, the second database being different from the first database,
wherein the CAM is further configured, in performing the first lookup, to use the contiguous sets of bits of the memory portion and apply a mask to a second set of bits of the contiguous sets of bits of the memory portion, and
wherein the CAM is further configured, in performing the second lookup, to use the contiguous sets of bits of the memory portion and apply a mask to a first set of bits of the contiguous sets of bits of the memory portion.
13. The CAM of claim 12 wherein the first information comprises Layer 3 information extracted from the packet.
14. The CAM of claim 12 wherein the second information comprises Layer 4 information extracted from the packet.
15. A network device comprising:
a processor configured to extract a source information and a destination information from a packet and store the source information and the destination information in contiguous sets of bits of a content-addressable memory (CAM); and
the CAM configured to perform, in parallel in a single clock cycle, a first lookup of the source information in a first database of the CAM and a second lookup of the destination information in a second database of the CAM, the second database being different from the first database,
wherein the CAM is further configured, in performing the first lookup, to use the contiguous sets of bits of the CAM and apply a mask to a second set of bits of the contiguous sets of bits of the CAM, the second set of bits storing the destination information, and
wherein the CAM is further configured, in performing the second lookup, to use the contiguous sets of bits of the CAM and apply a mask to a first set of bits of the contiguous sets of bits of the CAM, the first set of bits storing the source information.