Get More Out Of the Foxhollow Platform

Akber Kazmi, Marketing Director, PLX Technology

Introduction

As reported by the mainstream technology media, Intel is leveraging technology from its latest server-class Nehalem CPU to offer the Lynnfield CPU, targeted at high-end desktops and entry-level servers. This platform is codenamed "Foxhollow". Intel is expected to launch the new platform sometime in the second half of 2009. This entry-level uni-processor (UP) server platform will feature two to four cores, as Intel wants to pack substantial processing power into all its platforms.

The Foxhollow platform is quite different from previous desktop and UP server platforms in that it reduces the solution from three chips to two, eliminating the northbridge and replacing the southbridge with a new device called the Platform Controller Hub (PCH), code named Ibexpeak (5 Series chipset).

As Intel has moved the memory controller and the graphics function into the CPU, there is no need for an MCH (Memory Controller Hub), so Intel has simplified its chipset design to keep costs down in the entry-level and mainstream segments. The PCH interfaces with the CPU through Intel's DMI interconnect. The PCH will support eight PCIe lanes, up to four PCI slots, the GE MAC, display interface controllers, I/O controllers, RAID controllers, SATA controllers, USB 2.0 controllers, etc.

Foxhollow

Foxhollow motherboards are being offered in two configurations, providing either two or three x8 PCIe ports for high-performance I/O. However, vendors can use an alternate configuration that provides one more PCIe x8 port with no significant burden, offering 33% more value than the three-port solution and 50% more value than the two-port solution.

As illustrated in the Standard Configuration figure to the right, the CPU offers two PCIe x8 ports that allow access to two high-performance I/Os, such as a 10GE NIC or an 8G FC card. Other interconnects are offered through the PCH (southbridge). However, the combined bandwidth of all southbound interfaces far exceeds the bandwidth of the DMI interface. This may result in reduced performance and, in some cases, starvation of the I/Os connected to the PCH.

*Foxhollow information has been acquired from public sources
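To see why the DMI link can become a bottleneck, consider a rough back-of-the-envelope sum of southbound demand. All of the bandwidth figures below are illustrative assumptions for the sketch, not Intel specifications:

```python
# Rough oversubscription check for the DMI link between CPU and PCH.
# All bandwidth figures are illustrative assumptions, not Intel specs.
DMI_CAPACITY_GBPS = 2.0  # assumed usable DMI bandwidth, GB/s

# Hypothetical aggregate demand of devices behind the PCH, in GB/s
southbound_demand = {
    "8 PCIe Gen 1 lanes (~0.25 GB/s each)": 2.0,
    "GE MAC": 0.125,
    "SATA (two 3 Gb/s ports)": 0.6,
    "USB 2.0": 0.06,
}

total_demand = sum(southbound_demand.values())
oversubscribed = total_demand > DMI_CAPACITY_GBPS

print(f"Southbound demand: {total_demand:.2f} GB/s, "
      f"DMI capacity: {DMI_CAPACITY_GBPS:.2f} GB/s, "
      f"oversubscribed: {oversubscribed}")
```

Even with conservative numbers, the PCIe lanes behind the PCH alone can saturate the DMI link, which is why performance-critical I/Os belong on the CPU's own PCIe ports.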


The Port Expansion Configuration 1 figure above offers some relief by adding one x8 PCIe port, but it also consumes one of the CPU's two x8 ports. This configuration does not offer enough return for adding one more component to the board, consuming more power, and burdening the bill of materials (BOM). In most cases, it does not make business sense to add complexity and cost for just one additional PCIe port.

The configuration shown in the Port Expansion Configuration 2 figure above offers two additional x8 PCIe ports, allowing better amortization of cost, board space, and power. Additionally, the PEX 8632's flexible port configuration allows bifurcation of its x8 ports into two x4 ports, enabling the creation of even more PCIe ports. Furthermore, PLX offers the PEX 8648, a 48-lane PCIe Gen 2 switch with x16 PCIe ports that can be connected directly to the CPU to create additional PCIe ports. The PEX 8648, PEX 8632, and PEX 8624 are part of the broad portfolio of PCIe switches offered by PLX, sharing a common architecture. The highlights of the PLX PCIe Gen 2 switch architecture and its benefits are discussed below.

PLX PCIe Enhanced Features

PCIe Gen 2 switches from PLX have exclusive features, such as performancePAK and visionPAK, that go beyond what the PCIe specification requires. PLX's performancePAK offers features such as read pacing, dynamic buffer allocation, and dual-cast to improve overall system performance while keeping its switches fully compliant with the PCIe Gen 2 specification. PLX visionPAK offers features such as access to internal data paths, receive eye-width measurements, SerDes loopback, error injection, line-rate packet generation, and performance counters. Some of these features are discussed below.

Read Pacing

Read Pacing is a feature that provides dramatic improvement in overall system performance in server and storage applications. When two or more endpoints are connected to a root complex or CPU through a PCIe switch and an asymmetric number of read requests are being made by the endpoints, one endpoint inevitably dominates the bandwidth of the root complex or CPU queue. As a result, the other endpoints suffer reduced performance. This is known as "endpoint starvation", which can make it appear as if the system is congested and not performing optimally.

With Read Pacing, the switch limits the number of outstanding read requests any one endpoint can have at a time. Programmable registers in the switch control the number of read requests forwarded to the host. When a bursty I/O makes small read requests while an aggressive I/O makes large block reads, the switch allows both sets of requests through, balancing the flow of data for both I/Os.
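The pacing behavior described above can be sketched as a toy model. The class, limit, and port names here are hypothetical illustrations, not the actual PEX 8632 programming interface:

```python
from collections import deque

class ReadPacingModel:
    """Toy model of read pacing: at most `limit` outstanding read
    requests per downstream port are forwarded to the host; the rest
    are queued inside the switch until completions free a slot."""

    def __init__(self, limit=4):  # stands in for a programmable register
        self.limit = limit
        self.outstanding = {}  # port -> reads currently in flight
        self.queued = {}       # port -> reads held back by pacing

    def request(self, port):
        """Return True if the read is forwarded now, False if paced."""
        self.outstanding.setdefault(port, 0)
        self.queued.setdefault(port, deque())
        if self.outstanding[port] < self.limit:
            self.outstanding[port] += 1
            return True
        self.queued[port].append(object())  # held back inside the switch
        return False

    def complete(self, port):
        """A completion returns; release one queued read, if any."""
        self.outstanding[port] -= 1
        if self.queued[port]:
            self.queued[port].popleft()
            self.outstanding[port] += 1

# An aggressive endpoint bursts 10 reads; only 4 are forwarded at once,
# leaving completion bandwidth available for other ports.
switch = ReadPacingModel(limit=4)
results = [switch.request("aggressive_io") for _ in range(10)]
forwarded = results.count(True)
```

The model captures the key idea: the per-port limit, not the endpoint's appetite, decides how much completion bandwidth any one port can monopolize.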

Read Pacing increases system performance by providing a more balanced allocation of bandwidth to the downstream ports of the switch. With Read Pacing, the switch can apply rules to prevent one port from overwhelming the completion bandwidth or buffering in the system.


Dynamic Buffer Allocation

The Dynamic Buffer Allocation scheme used in PLX's PCIe switches allows a large pool of buffers to be allocated, under user control, to the active ports. The user can allocate a dedicated set of credits to an individual port based on expected traffic load. Conversely, the user can allocate a portion of the credits to the buffer pool for dynamic sharing amongst the ports as traffic load changes. This allows the system to absorb fluctuations in traffic load without causing congestion. At the same time, no credits are lost or unused when certain ports on the switch are not used.
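A minimal sketch of the dedicated-plus-shared credit scheme follows. The class, field names, and credit counts are illustrative assumptions, not PLX register definitions:

```python
class CreditPool:
    """Sketch of dynamic buffer allocation: each port gets a dedicated
    credit reservation, and the remaining credits form a shared pool
    that active ports draw from as traffic load changes."""

    def __init__(self, total_credits, dedicated):
        # dedicated: {port: credits reserved exclusively for that port}
        reserved = sum(dedicated.values())
        assert reserved <= total_credits
        self.dedicated = dict(dedicated)
        self.shared = total_credits - reserved  # dynamically shared pool
        self.used = {port: 0 for port in dedicated}
        self.shared_used = 0

    def acquire(self, port):
        """Take one credit; prefer the port's dedicated reservation."""
        if self.used[port] < self.dedicated[port]:
            self.used[port] += 1
            return "dedicated"
        if self.shared_used < self.shared:
            self.shared_used += 1
            return "shared"
        return None  # no credit available: apply back-pressure

# Unused ports don't strand credits: port B is idle, so port A can
# consume its own reservation plus the entire shared pool.
pool = CreditPool(total_credits=16, dedicated={"A": 4, "B": 4})
grants = [pool.acquire("A") for _ in range(13)]
```

Port A's first four grants come from its reservation, the next eight from the shared pool, and the thirteenth request is refused, illustrating both the sharing and the back-pressure behavior.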

Dual Cast

In addition to balancing bandwidth and improving buffer allocation, the PLX Gen 2 family of switches supports Dual Cast, a feature which allows the copying of data packets from one ingress port to two egress ports, enabling higher performance in dual-graphics, storage, security, and redundant applications. Without Dual Cast, the CPU must generate twice the number of packets, requiring twice the processing power. In some applications, the performance of the CPU can be doubled by using the Dual Cast feature of PLX switches.
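Conceptually, Dual Cast replicates a single posted write to two egress queues, so the CPU builds and sends each packet only once. The sketch below is a conceptual model, not a description of PLX silicon:

```python
def dual_cast(packet, egress_ports):
    """Copy one ingress packet to two egress ports. Without this, the
    CPU would have to build and transmit the packet twice."""
    assert len(egress_ports) == 2  # dual cast targets exactly two ports
    for queue in egress_ports:
        queue.append(dict(packet))  # independent copy per egress port

primary, mirror = [], []  # e.g. a main storage path and its mirror
dual_cast({"addr": 0x1000, "data": b"\xde\xad"}, [primary, mirror])
```

After one CPU-side send, both egress queues hold an identical, independent copy of the packet, which is the source of the CPU-offload benefit in mirrored and redundant configurations.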

Latency

A read operation is generally considered a blocking operation: once a read request is initiated, no additional instructions in the thread of processing can be undertaken until it completes. Simple applications have the following work flow:

1. Make a read request
2. Wait for data
3. Process the data
4. Loop back to 1

In this simple example, the latency of the read directly affects the throughput. If the read latency is much smaller than the processing time, then latency isn't a problem. When it's not small, the processor has to wait, wasting valuable CPU cycles.
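The effect of read latency on that loop can be quantified with a simple idealized model; the byte counts and timings below are illustrative numbers, not measured values:

```python
def loop_throughput(bytes_per_read, latency_us, processing_us):
    """Throughput of the blocking read loop above: each iteration
    spends latency_us waiting plus processing_us working, and moves
    bytes_per_read bytes (idealized; ignores overlap and overhead)."""
    return bytes_per_read / (latency_us + processing_us)  # bytes per us

# When latency is small relative to processing time, it barely matters;
# when it is comparable, throughput drops noticeably.
low_latency = loop_throughput(4096, latency_us=1, processing_us=100)
high_latency = loop_throughput(4096, latency_us=50, processing_us=100)
```

With 1 us of latency against 100 us of processing, latency costs about 1% of throughput; at 50 us it costs about a third, which is why switch latency matters for read-heavy workloads.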

PLX's PCIe Gen 1 switch family has the lowest packet latency in the industry, thanks to its cut-through architecture. PLX has carried this architecture over to its PCIe Gen 2 designs, so all its PCIe Gen 2 switches have the industry's lowest packet latency as well.

Summary

When designing around the Foxhollow platform, it is highly recommended to use a 32- or 48-lane PCIe switch from PLX in order to amortize the cost, power, and board space of a PCIe switch over a larger number of PCIe ports. The DMI interface bandwidth may not be enough to support the needs of all the I/Os attached to the PCH. Creating additional PCIe ports through PLX PCIe switches allows performance-driven I/Os to be connected to the CPU through the PCIe switch instead of the PCH.

PLX's PCIe Gen 2 portfolio offers a large selection of switches, ranging from 4 lanes and 4 ports to 48 lanes and 12 ports. PLX's PCIe Gen 2 switches offer industry-best performance and exclusive debug features; they have been shipping since Q4 2007 and are deployed with hundreds of customers.


PLX Confidential