440GX Application Note
Overview of TCP/IP Acceleration Hardware
January 22, 2008

Introduction

Modern interconnect technology offers Gigabit/second (Gb/s) speed that has shifted the bottleneck in communication from the physical connection to the protocol stack. In traditional systems, message processing by the operating system can use a significant number of CPU cycles. The TCP/IP Acceleration Hardware (TAH) subsystem of the 440GX offloads the checksum and segmentation aspects of protocol processing from the operating system, leaving the CPU free to dedicate more cycles to the application. This application note describes the features and benefits of TAH and its implementation in the 440GX. For additional information on the 440GX, please refer to http://www.amcc.com.

TCP/IP Acceleration Hardware

In the 440GX, TAH provides hardware acceleration functions for the two 10/100/gigabit Ethernet Media Access Controllers (EMACs) to improve bandwidth and lower CPU utilization. TAH provides checksum verification for TCP/UDP/IP headers in the receive path, checksum generation for TCP/UDP/IP headers in the transmit path, and TCP segmentation support in the transmit path. TAH provides support for standard and jumbo packets. No acceleration functions are available for the two 10/100 EMACs.

For receive packets, setting the Checksum Verification on Receive (CVR) bit in the accelerate mode register, TAHx_MR, enables hardware checksum verification for all incoming packets for a given EMAC/TAH.

For transmitted packets, hardware-generated checksums and/or packet segmentation is done on a per-packet basis. This is accomplished by setting the proper bits in the descriptor control/status field of the buffer descriptor. The size that packets should be segmented into (if segmentation is enabled) is controlled by one of six Segment Size Registers (TAH_SSRx). Bits in the buffer descriptor also determine which SSR is used.
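
For reference, the checksum that IP, TCP, and UDP carry is the 16-bit ones' complement sum described in RFC 1071. The following is a minimal software sketch of that calculation, included only to show the kind of work that checksum offload removes from the CPU. The function name inet_checksum is hypothetical, and nothing here describes how TAH implements the calculation internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Ones' complement checksum over a byte buffer, as described in RFC 1071.
 * The buffer is read as 16-bit words in network (big-endian) byte order.
 * The 32-bit accumulator is sufficient for buffers up to 64 KB, which
 * covers any single IP datagram. inet_checksum is a hypothetical name. */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    assert(data != NULL || len == 0);

    /* Add up the buffer as a sequence of 16-bit big-endian words. */
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];

    /* An odd trailing byte is treated as the high byte of a final word. */
    if (len & 1)
        sum += (uint32_t)data[len - 1] << 8;

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xFFFFu) + (sum >> 16);

    /* The checksum is the ones' complement of the folded sum. */
    return (uint16_t)~sum;
}
```

To generate a checksum, the checksum field is set to zero before the sum is taken and the result is then stored in that field. To verify, the same sum is taken over the data with the checksum in place, and a result of zero indicates no detected error. Note that the TCP and UDP checksums also cover a pseudo-header built from IP header fields, which this sketch leaves to the caller.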
TCP/IP Overview

TCP/IP is a term that commonly refers to a collection of protocols more accurately called the “internet protocol suite”. In addition to TCP (Transmission Control Protocol) and IP (Internet Protocol), the Internet protocol suite also includes additional protocols such as UDP (User Datagram Protocol) and ICMP (Internet Control Message Protocol). Because TAH only manipulates TCP, IP, and UDP packets, these are the only protocols discussed in this paper.

The Internet protocol suite is still evolving through the Request For Comments (RFC) mechanism. RFCs are available online at ftp://ftp.rfc-editor.org/in-notes.

The Internet Protocol (RFC 791) provides services that are roughly equivalent to the OSI Network Layer. The unit of transfer in an IP network is called a datagram. IP provides a connectionless datagram transport service across the network. This service is sometimes referred to as unreliable because packets may be lost, arrive out of order, or perhaps even be duplicated. The network does not guarantee delivery or notify the end host system about packets lost due to errors or network congestion. IP assumes higher-layer protocols will address these anomalies. IP datagrams contain a message, or one fragment of a message, that may be up to 65,535 bytes (octets) in length. IP does not provide a mechanism for flow control.

The TCP and UDP protocols correspond to the OSI Transport Layer. TCP, described in RFC 793, provides a virtual circuit (connection-oriented) communication service across the network. TCP includes rules for formatting messages, establishing and terminating virtual circuits, sequencing, flow control, and error correction. Most of the applications in the TCP/IP suite operate over the reliable transport service provided by TCP. Common applications such as ftp and smtp communicate through TCP.

UDP, described in RFC 768, provides an end-to-end datagram (connectionless) service.
Some applications, such as those that involve a simple query and response, are better suited to the datagram service of UDP because there is no time lost to virtual circuit establishment and termination. UDP's primary function is to add a port number to the IP address to provide a socket for the application. Applications such as bootp and rtelnet communicate through UDP.

Revision 1.01 Application Note (Proprietary) AN2017

Each of these three components of the protocol suite adds its own header to the data to be transferred (or payload). The UDP header includes source and destination port numbers, a length, and a checksum. Because TCP is a connection-oriented protocol, more information is needed in the header. The TCP header includes source and destination port numbers, a sequence number, an acknowledgement number, an ‘urgent pointer’, and a checksum.

Once the IP layer receives the packet, it needs to add information to ensure the packet is sent to the proper destination. The IP header includes a version number, the type of service, the length of the packet, another checksum, source and destination addresses, plus some additional fields that will not be discussed in detail. The format of a TCP/IP packet is shown in Figure 1.

Figure 1: TCP/IP Packet Format

Each row below is one 32-bit word (bit positions 0, 8, 16, 24, 31).

IP Header
  VERS | HLEN | SERVICE TYPE | TOTAL LENGTH
  IDENTIFICATION | FLAGS | FRAGMENT OFFSET
  TIME TO LIVE | PROTOCOL | HEADER CHECKSUM
  SOURCE IP ADDRESS
  DESTINATION IP ADDRESS
  IP OPTIONS (If Any) | PADDING

TCP Header
  SOURCE PORT | DESTINATION PORT
  SEQUENCE NUMBER
  ACKNOWLEDGEMENT NUMBER
  HLEN | RESERVED | CODE BITS | WINDOW
  CHECKSUM | URGENT POINTER
  OPTIONS (IF ANY) | PADDING

DATA
  ...

Memory Access Layer (MAL)

One additional component used to facilitate network communications within the 440GX is the Memory Access Layer (MAL). The MAL is a hardware core that manages data transfers between the TAH (or the EMAC if no TAH is present) and memory.
The MAL utilizes a buffer descriptor ring structure in memory. A software device driver, such as the TCP/IP protocol stack, uses the buffer descriptor to inform the MAL about buffer locations and packet or buffer status. The MAL uses the buffer descriptors to convey packet transfer status from the EMAC back to the protocol stack.

Packet Send with TCP Acceleration

An application wishes to send a packet to another system on the network. This example assumes a simple TCP application sending a packet that does not require segmentation. The application first builds an arbitrary payload and sends it to TCP. TCP adds its header and, in a system without TAH enabled, calculates a checksum for the entire packet, including the header itself. With hardware checksum calculation enabled, the checksum is not calculated until the TCP/IP Acceleration Hardware receives the packet.

The packet is then sent to the IP software layer, and another header is added. The IP layer needs to be able to verify that the header does not get damaged in transit, so another checksum is needed. If hardware checksum calculation is disabled, IP calculates a new checksum and stores it in the appropriate location in the IP header. If hardware checksum calculation is enabled, software does not need to modify either checksum field. The remaining sequence of events is illustrated in Figure 2.

The protocol stack portion of the operating system initiates a packet transmit (1). The device driver parses the protocol stack buffer into descriptor table entries and buffers (2). It is important to note that the buffer descriptors should be placed in noncacheable memory because they are eight bytes each and must be contiguous.
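
The consequence of the eight-byte descriptor size can be sketched with a little arithmetic. The structure below is a hypothetical layout: only the total size and the existence of a control/status field come from this note, while the other field names, the field order, and the 32-byte cache line size are assumptions of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 8-byte buffer descriptor. Only the 8-byte size and the
 * control/status field are taken from the text; the remaining fields
 * and their order are assumptions made for illustration. */
struct bd_sketch {
    uint16_t ctrl_status;  /* descriptor control/status field */
    uint16_t data_len;     /* assumed: buffer length in bytes */
    uint32_t data_ptr;     /* assumed: 32-bit buffer address */
};

/* Assumed cache line size in bytes; not stated in this note. */
#define CACHE_LINE_BYTES_ASSUMED 32u

static_assert(sizeof(struct bd_sketch) == 8,
              "descriptor sketch must be eight bytes");

/* Number of contiguous descriptors that share one cache line. */
size_t bds_per_cache_line(void)
{
    return CACHE_LINE_BYTES_ASSUMED / sizeof(struct bd_sketch);
}

/* Index of the first descriptor in the cache line that also holds
 * descriptor idx, assuming the ring starts on a cache-line boundary. */
size_t first_bd_in_line(size_t idx)
{
    size_t per_line = bds_per_cache_line();
    return idx - (idx % per_line);
}
```

Under these assumptions four descriptors share each cache line, so a cache operation aimed at descriptor 6, for example, also touches descriptors 4, 5, and 7.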
If they are placed in cacheable memory, maintaining software cache coherency may not be possible, as a cache flush of a single descriptor could corrupt the other three (in real memory) that are within the same cache line. The 440GX includes 256KB of on-chip SRAM ideally suited for storing buffers and buffer descriptors.

The device driver then instructs the EMAC to process a new transmit packet (3). The EMAC requests the TAH to retrieve descriptor information (4). This request is passed to MAL (5). The MAL then fetches the buffer descriptor (6), writes it to TAH (7), and initiates a data move. The packet is then passed through the MAL and written to TAH (8). If hardware checksum calculation is enabled, the checksums are calculated and written into the appropriate packet headers. Checksums are calculated on the fly as the packet is sent from MAL. After TAH has finished inserting the checksums into the packet, the packet is sent to the EMAC (9) and then transmitted on the media (10), and the EMAC sends a read packet status request to TAH (11). The status information is passed through MAL (12) and written in the buffer descriptor (13). Software is interrupted and is then responsible for clearing the interrupt status bits in the EMAC and MAL and for notifying the protocol stack that the transmission is complete (14). The device driver acknowledges the interrupt and clears the interrupt status bits in the EMAC and MAL (15). The device driver notifies the protocol stack that the operation is complete (16).

Figure 2: Send Operation with Acceleration (block diagram showing the OS, CPU, buffers, MAL, TAH, and EMAC, with arrows numbered 1 through 16 that correspond to the numbered steps in the text)

Packet processing occurs in much the same way when hardware segmentation is enabled. The difference occurs when manipulating the packet headers within the hardware acceleration function, between steps 8 and 9 in the previous example. When segmentation is enabled, the original headers from the packet received from MAL are saved for later use.
New headers based on the original are built and stored. When the amount of data equal to the selected segment size has been transferred from MAL, the TAH stores the new checksums in the appropriate locations in the header, and sends the packet on to the EMAC.
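
The way a large payload divides into segments can be sketched as simple arithmetic. The code below is an illustration only: the type and function names are hypothetical, the segment size stands in for the value held in the selected Segment Size Register, and the assumption that the final segment carries whatever data remains is not stated in this note.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one segment: where its payload starts in
 * the original packet payload and how many payload bytes it carries. */
struct segment_sketch {
    size_t offset;
    size_t length;
};

/* Number of segments needed for payload_len bytes when each segment
 * carries at most seg_size bytes (rounded up). */
size_t segment_count(size_t payload_len, size_t seg_size)
{
    assert(seg_size > 0);
    return payload_len / seg_size + (payload_len % seg_size != 0);
}

/* Fill in *out for segment number idx (counted from zero).
 * Returns 0 on success, or -1 if idx is past the last segment. */
int segment_at(size_t payload_len, size_t seg_size, size_t idx,
               struct segment_sketch *out)
{
    size_t remaining;

    assert(seg_size > 0 && out != NULL);

    if (idx >= segment_count(payload_len, seg_size))
        return -1;

    out->offset = idx * seg_size;
    remaining = payload_len - out->offset;

    /* Assumed: the last segment carries only the remaining bytes. */
    out->length = remaining < seg_size ? remaining : seg_size;
    return 0;
}
```

For example, a 4000-byte payload with a 1460-byte segment size yields three segments of 1460, 1460, and 1080 bytes, each of which would be sent with its own headers and checksums.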