TECHNOLOGY FEATURE

HyperTransport™ as an emerging new I/O standard

By Peter Robinson

Faster processors and silicon processes have squeezed traditional single-ended Address/Data and Control I/O buses to their engineering limits. These I/O buses, such as PCI-X, can no longer meet the demand for high-speed, multi-node data connectivity, leaving system designers with few design options. Either they must limit system I/O and throughput and work within the limits of current standards, or they must adopt an emerging I/O bus standard such as InfiniBand™, RapidIO™, 3GIO™, HyperTransport technology, and others.

Today, there is an emerging standard that can deliver the needed throughput and leverage existing PCI drivers: HyperTransport technology. While competing next-generation serial I/O technologies try to address this problem by claiming to be more capable than PCI, HyperTransport technology is both backward-compatible with the PCI era and an extensible bridge to the post-PCI future.

Figure 1

Co-developed by AMD and API NetWorks, HyperTransport technology addresses the need for increased system performance and provides a high-speed, scalable point-to-point link that reduces data bottlenecks and boosts the performance of current and future communications equipment, including PCs, workstations, servers, and Internet routers (see Figure 1). With the development of HyperTransport technology, embedded systems developers now have a highly scalable bus standard that addresses I/O throughput while preserving their PCI software investments.

HyperTransport technology extends PCI by providing a high-performance fabric (see Figure 1) that maintains backward compatibility with existing software developed for the PCI bus, including the ability to support non-coherent memory read/write operations. From a system design viewpoint, HyperTransport technology provides the same type of “tree” topology and the same ordering rules widely used by PCI buses (see Figure 2).

Figure 2. Extremely Scalable HyperTransport Networking Topology. HyperTransport makes it possible to build complex systems such as a dual DRAM cached RAID based storage subsystem with minimal cost and chip count, while remaining extremely scalable compared to technologies currently on the market. In the diagram, data arrives from the network or Internet over fiber channel, is held in DRAM while waiting to be written to the redundant RAID arrays, and is sent by the host CPU through a four-port switch and chains of HT tunnel devices over 8-bit wide HyperTransport links at up to 2 Gigabytes/sec. Each leg can support up to 31 tunnel devices. Source: API Networks, Inc.

Copyright 2001 CompactPCI Systems. Reprinted from CompactPCI Systems / September 2001

This complementary relationship with PCI is particularly significant with PCI-X and double-rate PCI-X, viewed as the only practical near-term solution to I/O performance bottlenecks in communications and storage applications. Although PCI-X extends the popular PCI standard, its successful deployment needs a high-speed chip-to-chip mezzanine bus, such as HyperTransport, to serve as a fabric backbone. Likewise, HyperTransport technology will support and complement InfiniBand: HyperTransport is the preferred “in the box” fabric, while InfiniBand is the preferred “outside the box” fabric.

Gaining acceptance

Currently, HyperTransport technology architecture is in the standards review process and has gained the support of more than 100 early adopters, including Cisco, Sun, Broadcom, ATI, and Fujitsu. Compared to competing technologies, its current time-to-market advantage portends economies of scale, which will drive down prices and increase availability of parts. In addition, with its backward compatibility with PCI, HyperTransport technology is the ideal choice for companies looking to increase performance while preserving existing technology investments.

HyperTransport technology I/O interconnect physical layer

There are two elements to HyperTransport technology: the PHY, or physical pin layer, and the logical, or protocol, layer. At the pin or physical layer, HyperTransport technology is based on a proprietary point-to-point Low Voltage Differential Signaling (LVDS) signal set. The electrical specification of HyperTransport technology differs from industry-standard LVDS as noted in Table 1.

Table 1

Specification   LVDS                 HyperTransport LVDS
Supply          2.5 volt             1.2 volt
Swing           1.25 volt            600 mV
Termination     External resistors   Dynamically balanced
Peak            622 MHz              1 GHz

There are three reasons why the proprietary HyperTransport technology LVDS outperforms the industry standard:

• Lower differential voltage swing
• The integration of an intelligent, dynamically adjustable source and termination resistor mechanism
• The transmission of the data clock with groups of eight or fewer RX/TX data pairs

The low differential voltage alone implies lower current and higher speed. However, the real performance gain comes from the integrated, dynamically adjustable source and termination resistor mechanism combined with the transmitted data clock. On the one hand, they add complexity to the HyperTransport technology PHY; on the other hand, they simplify the total PHY approach by eliminating the need for external termination resistors and for a PLL to recover the RX clock.

It is projected that, with continued design improvements to the existing PHY, combined with wider channels and reduced silicon geometries, the HyperTransport technology DDR channel could continue to double in throughput every 18 months, following Moore’s law.

Initially, differential signaling would seem to double the pin count to two pins per bit, but the increase in signal pins is offset by two factors. First, operating at higher frequencies than single-ended buses, HyperTransport technology requires fewer pins to deliver equivalent or better bandwidth. Second, differential signaling provides a return current path for each signal, which greatly reduces the number of power and ground pins required for the interface.

In addition to command, addressing, and control, each HyperTransport technology device has PWROK (Power Okay) and RESET_H (Reset_HyperTransport) pins. These pins are single-ended because of their low-frequency use. At the user’s option, HyperTransport technology devices for use in lower-power applications such as handheld appliances can implement HT_Stop (HyperTransport_Stop). This pin puts the HyperTransport technology channel in a low-power state where virtually no power is used by the channel.

HyperTransport technology is scalable, supporting a wide selection of bus widths and speeds to fit the power, space, and cost requirements of next-generation devices. A HyperTransport technology channel can be 2, 4, 8, 16, or 32 bits wide in each direction. Asymmetric interconnects with different upstream and downstream bandwidths are also supported. As an example, it is possible to construct a fabric with a 2-bit wide physical link upstream and an 8-bit wide link downstream. There is no need for special software I/O drivers to support this type of asymmetric topology, as the protocol layer at reset will ensure proper flow of data throughout the fabric. The ability to support different upstream and downstream widths gives the designer the flexibility to allocate bandwidth according to the needs of the system. This feature alone will simplify PCB layout and reduce pin count on ASICs.

The differential data is double-clocked, or double data rate (DDR). Today HyperTransport technology supports clock rates from 200 MHz to 1 GHz in 100 MHz steps. At the top 1 GHz clock, two transfers per cycle give up to 2.0 Gbits per second per physical channel. Each 8-bit channel has its own clock (a 16-bit channel would have two clocks and a 32-bit channel, four), with one control line per interface to differentiate commands from address/data. This reduces clock line skew and ensures a low bit error rate. HyperTransport technology also supports dynamic clock configuration, making it possible for the software to change the data clock rate between any two devices. This is useful for dynamic power management in battery-powered applications.

HyperTransport host interfaces have been announced on a wide range of MIPS architecture embedded processors, providing a high-speed mezzanine bus that both enhances and extends the life cycle of products built with MIPS architecture processors, with no need for software modifications.

The HyperTransport technology fabric topology

Unlike PCI, HyperTransport technology is a point-to-point interconnect, requiring a source and termination device for each segment of the bus. There are three types of interfaces: host, tunnel, and single ended slave. HyperTransport technology was initially conceived by AMD and API as a high-speed, low pin-count Northbridge/Southbridge interconnect bus with a host port on the Northbridge and a single ended slave port on the Southbridge. The tunnel construct permits inline bridging to devices such as PCI 66/64, PCI-X, and related standards. An example of a HyperTransport networking topology is shown in Figure 2.

The smallest possible HyperTransport technology construct is a host and one single ended slave device, i.e., a Northbridge and Southbridge. The largest construct, without introducing a switch, is at least one host with 31 slave devices, either tunnels or single ended slaves (HyperTransport technology supports a multiple-host fabric using a master/slave approach). All transactions originate and terminate via a host port. Hence, all peer-to-peer transactions within a chain must pass to and from the local host port.

A fourth type of device within a fabric is a MUX or switch. The ports on these devices are either single ended slave or host ports. MUXes and switches are used to extend the fabric where there are a large number of devices, a large amount of peer-to-peer communication, or a requirement for multiple processors (master/slave host devices).

A switch passes data from one HyperTransport technology chain to another. The switch will configure each of its interfaces as either a host or a single ended slave. This allows the system architect to use a switch to build a switching fabric while isolating a HyperTransport technology chain from a host, in order to:

• Offload peer-to-peer traffic
• Extend a fabric beyond a single chain of 31 tunnel devices
• Enable a multi-processor/host fabric
• Create a minimal-latency tree

Data management

One of the more complex data management tasks in a system is tracking a resource and mapping the available resource so the host can use it effectively. Like PCI, HyperTransport technology conducts an enumeration task after reset that maps each attached I/O device into the host memory space. Since HyperTransport technology follows PCI’s enumeration model, all attached HyperTransport technology devices appear to the operating system like a PCI-to-PCI bridge. When the host writes to (or reads from) a device's address range, the HyperTransport technology logic transfers the data and writes it into the target device’s memory without host processor intervention. Unlike PCI, HyperTransport technology can address an almost limitless number of attached I/O devices when switches are used to expand the fabric beyond the host port limit of 31. Today data can be routed through the fabric at up to 2 Gbits/sec per differential pair. This gives the designer a highly scalable, faster, and significantly larger fabric while still using existing PCI I/O drivers.

HyperTransport technology implementation

Today, the HyperTransport technology ecosystem is limited to one MIPS-based processor and one 66/64 PCI bridge. From this humble beginning there are literally hundreds of designs based solely on these two parts. By 2002 there will be “tens” of HyperTransport technology peripheral devices and four or more host-port-equipped processors.

As the technology moves forward, more and more HyperTransport technology devices will emerge, bridging not only to PCI but also to InfiniBand, Gigabit Ethernet, PCI-X, and related standards. They will provide the building blocks capable of spanning a wide range of platforms and applications.

The scalability of HyperTransport technology I/O makes it an ideal supplement or replacement for traditional serial and/or address-data bus peripheral ports, and it delivers the high-performance link and coherency needed for scalable multiprocessor servers and next-generation network equipment. It also provides up to an order of magnitude improvement in throughput, with reduced EMI, over today’s traditional processor-to-peripheral interfaces, and it delivers that throughput with a lower pin count, resulting in lower manufacturing and system costs.

Peter Robinson, HyperTransport technology technical evangelist at API NetWorks (formerly Alpha Processor), is charged with promoting HyperTransport technology and leading the technical sales and marketing efforts of the API NetWorks HyperTransport technology line of products.

For more information, contact Peter at:

Peter Robinson
API NetWorks
API NetWorks Silicon Business Unit
130C Baker Avenue Extension
Concord, MA 01742
Tel: 978-318-1100
E-mail: [email protected]
Web site: www.api-networks.com/silicon

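The chain and enumeration rules described in the fabric topology and data management sections can also be modeled in a few lines. This Python sketch is a toy model under stated assumptions, not the actual enumeration algorithm: the class names, the window sizes, the starting address, and the rule that a single ended slave closes the chain are all assumptions made for illustration, and real PCI-style enumeration also handles alignment and bridge windows, which are ignored here.

```python
# Toy model of one HyperTransport chain behind a host port.
# Names, sizes, and addresses are hypothetical illustrations.
from dataclasses import dataclass, field

MAX_SLAVES_PER_CHAIN = 31  # per-host-port limit stated in the article


@dataclass
class Device:
    name: str
    kind: str       # "tunnel" or "slave" (single ended slave)
    size: int       # bytes of host address space requested (assumed value)
    base: int = -1  # filled in by enumeration


@dataclass
class Chain:
    devices: list = field(default_factory=list)

    def attach(self, dev):
        """Append a device to the end of the chain, enforcing the limits."""
        if dev.kind not in ("tunnel", "slave"):
            raise ValueError(f"unknown device kind: {dev.kind}")
        if len(self.devices) >= MAX_SLAVES_PER_CHAIN:
            raise ValueError("chain is full; a switch is needed to go further")
        # Assumption: a single ended slave has one port, so nothing follows it.
        if self.devices and self.devices[-1].kind == "slave":
            raise ValueError("a single ended slave terminates the chain")
        self.devices.append(dev)

    def enumerate_devices(self, start):
        """Give each device a consecutive window in host memory space."""
        base = start
        for dev in self.devices:
            dev.base = base
            base += dev.size
        return base  # first free address after the chain

    def route(self, address):
        """Return the device whose window contains the address, if any."""
        for dev in self.devices:
            if dev.base <= address < dev.base + dev.size:
                return dev
        return None


if __name__ == "__main__":
    chain = Chain()
    chain.attach(Device("pci_bridge", "tunnel", 0x1000))
    chain.attach(Device("southbridge", "slave", 0x2000))
    chain.enumerate_devices(0x1000_0000)
    print(chain.route(0x1000_1004).name)
```

Once the windows are assigned, a host read or write is routed purely by address, which is the sense in which existing PCI-style drivers can keep working without knowing about the underlying fabric.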