PCI Express and Advanced Switching: Evolutionary Path to Building Next Generation Interconnects

David Mayhew and Venkata Krishnan StarGen, Inc. Marlborough, MA 01752 [Mayhew, Krishnan]@StarGen.com http://www.stargen.com

Abstract

With processor and memory technologies pushing the performance limit, the bottleneck is clearly shifting towards the system interconnect. Any solution that addresses PCI's bus-based interconnect, which has serious scalability problems, must also protect the huge legacy infrastructure. PCI Express provides such an evolutionary approach and allows a smooth migration towards building a highly scalable next generation interconnect. Advanced Switching further boosts the capabilities of PCI Express by allowing a rich application space to be covered that includes multiprocessing and peer-to-peer communication. Indeed, the synergy between PCI Express and Advanced Switching permits the adoption of an evolutionary yet revolutionary approach for building the interconnect of the future.

1. Introduction

System interconnects are currently in a significant state of flux. The resolution of this uncertainty will have a significant impact on system architecture and, in the longer term, on the definition of a system itself. The historic distinctions between processor buses, mezzanine interconnects, and LANs will be challenged by the wide-scale adoption of low-latency, high-bandwidth switched serial interconnects.

The current uncertainty surrounding system interconnects should resolve itself in a relatively short time with the nearly unanimous backing from the industry for the PCI-SIG controlled - and yet largely Intel created - PCI Express (PCIe)[1] specification. We believe that the emergence of PCIe as the heir apparent to the PCI interconnect legacy will result in the widespread adoption of a PCIe follow-on technology called Advanced Switching (AS)[2].

It has long been the goal of many in both the computer and communications industries to bridge the gap between their respective systems. In many ways, the adoption of inexpensive computational interfaces and parts by the telecommunications industry has already realized this goal in part; but the emergence of PCIe, to a small degree, and AS, to a much greater degree, will make the integration of computer and communications equipment a reality.

1.1. Terminology

For the sake of this discussion, we define three types of interconnects typically used in today's systems. Based on where it resides as well as its purpose, an interconnect may be classified as (a) a Processor, (b) a Mezzanine, or (c) a Local Area interconnect.

The Processor interconnect, typically known as the processor or front-side bus, is the final (or initial) conduit for data moving into (or from) the CPU's on-chip cache and registers. A processor interconnect is typically used to connect the processor to another component that contains a memory controller and one or more mezzanine-level interconnection ports[3].

A mezzanine interconnect has historically been a processor (CPU) independent interconnect with very limited reach (typically measured in centimeters, not meters). Mezzanine interconnects generally employ an address/data read/write model with memory-like semantics. These interfaces have been targeted at allowing simple translation between processor bus memory operations and mezzanine interconnect transactions. PCI is, to date, by far the most popular and successful of the mezzanine interconnects.
Finally, Local Area interconnects or networks (LANs) grew up between the public access networks (WANs) and mezzanine interconnects. LANs have continued to mature into protocols that now completely blur the historic distinction between LAN and WAN, with perhaps the only significant current distinction being that LANs use computer-borne technology and WANs use telecommunication protocols (SONET and OC-48). For our purposes, the distinction between a LAN and a mezzanine interconnect lies in the LAN's reliance on protocols that do not share the mezzanine's memory read/write semantics, though it may be better to draw a simpler distinction based on the ability, or lack thereof, to issue a read at the hardware level. Mezzanine interconnects can read and write; LANs can only write.

1.2. Mezzanine Interconnects

There are three types of mezzanine interconnects: shared parallel, switched parallel, and switched serial. The distinction between shared and switched is based on whether an interconnect's electrical interface has a multi-drop capability (shared) or whether the interface only supports point-to-point connections (switched). The distinction between parallel and serial is based on whether an interface is source clocked, that is, the interface is strobed by a clock signal (parallel), or whether the interface relies on some form of clock recovery, that is, the interface is self clocked (serial).

1.2.1. Shared Parallel Interconnects

The list of significant shared parallel interconnects is long and storied. A few of the major players here include S-100[4], PC-AT[5], Multi-Bus (II)[6], VME[6], PCI[7], and PCI-X[8]. As previously stated, PCI is, to date, the shining star of this category and, indeed, of all categories. Its ubiquity provides an existence proof for the purported advantages of mezzanine interconnects, namely that mezzanine interconnects allow a vendor-independent mechanism through which a broad range of processor architectures can talk to a broader range of device types.

Of note is the fact that currently almost the entire software infrastructure of the computer industry is tied to the interconnect model of PCI. At least some of the technologies that have attempted or are attempting to displace PCI as the standard mezzanine interconnect of the future provide a completely transparent PCI bus programming model (namely PCI-X[8], HyperTransport[9], and PCIe[1]). Of equal interest are the fortunes of InfiniBand[10] and RapidIO[11], which lack PCI transparency. Conversely, PCI-X's compatibility with both the logical and physical characteristics of PCI resulted in its relatively quick adoption in the computational server space.

Shared parallel interconnects do, in most respects, present the worst set of characteristics of the three mezzanine structures. The primary advantage of sharing is that, for very small systems, it is relatively easy to implement. Unfortunately, shared parallel interconnects have fundamental electrical limitations (loads on a circuit), speed concerns (capacitance of many bus loads), reliability problems (the ability of one ill-behaved resident on a bus to make communication between all other residents impossible), scalability problems (everyone shares bandwidth from a common pool; the more residents on the bus, the smaller the effective bandwidth per resident), and physical distance problems (skew in source clocked parallel structures places fundamental limitations on the distance between connectors).

1.2.2. Switched Parallel Interconnects

Many of the fundamental limitations of shared parallel interconnects are addressed by switched parallel interconnects. The electrical loading, reliability, scalability, and, to some degree, speed and physical distance issues of shared structures are solved by the adoption of switched structures. Signal skew remains an issue, as does the effective width of such connections when cabled, but, for some applications, a switched parallel interconnect is the best current technology.

In general, a parallel interface can always be made serial. The only real issue is whether an appreciable amount of silicon supports a potentially serial version of a parallel interface.

A number of new standards are emerging that fall into the switched parallel mode of operation, most notably HyperTransport[9] and RapidIO[11] (which is attempting to migrate to a serial interface), though switched parallel interconnects are hardly new[12][13][14].

1.2.3. Switched Serial Interconnects

Switched serial interconnects solve the last of the issues with interconnect technology, which is physical reach. A parallel interface, by definition, has at least two signals (one clock and one data). A serial interface can be reduced to a single signal. Serial signals can also be ganged into groups of physically independent streams that can be striped on arbitrary bit intervals (bytes seem to be very popular). In many instances, the physical reach of an interconnect dictates the scale of system solution that can be addressed by the technology.

Many of the interconnect solutions that were initially parallel have migrated to serial solutions. The shift from parallel to serial is naturally made when transitioning a technology from a copper-cabled solution to an optical solution. However, high solution cost remains a significant barrier to the adoption of optical solutions, and serial copper-cabled solutions can be very cost effective for applications that can be solved on a backplane or that can make use of short-haul (less than 5 meters) cabling.

One may vehemently claim that PCIe along with AS is just another solution in the vast compendium of interconnects that includes InfiniBand[10]. There is, however, a major distinction between PCIe and other interconnect solutions, such as InfiniBand. The primary strength of PCIe is its support for legacy PCI while addressing the inadequacies of PCI. Indeed, PCIe is fundamentally nothing more than a serialization and packetization of PCI. It is completely compatible with PCI. The additional features that PCIe offers over PCI are in addition to, and not a replacement of, the mechanisms supported by PCI. Hence, the reams of existing software that expect PCI to underpin them will continue to function untouched in a PCIe world. Advanced Switching further enhances the capabilities of PCI Express to include support for multiprocessing and peer-to-peer computing. We believe that the evolutionary path afforded by PCIe from PCI, along with AS capabilities, will enable this interconnect to become a dominant mezzanine technology in the near future. In the following sections, we highlight the important features of both PCI Express and Advanced Switching.

The rest of the paper is organized as follows: Section 2 describes PCI Express and highlights its important features. This is followed by a discussion of Advanced Switching in Section 3. Finally, Section 4 concludes the paper.

2. PCI Express

One of the reasons for PCI's remarkable success was that a few of its visionary pioneers recognized the importance of a standard model for bridging between PCI bus segments[15]. This capability was intended to solve the electrical load restrictions of PCI by allowing a single (pair if on an add-in card) load on one bus to spawn an entirely new subordinate bus segment that could have a full complement of additional electrical loads. This standardization allowed for the creation of a robust, software-independent, PCI-to-PCI bridge market with a standard software model for recognizing devices behind bridges. More importantly for PCIe, it provided a legacy-software-transparent model for switching within the PCI model: a PCI switch can logically be thought of as a collection of PCI-to-PCI bridges in which one bridge is the upstream bridge, which is then connected, via a private local bus on its downstream side, to the upstream sides of a group of additional PCI-to-PCI bridges. This is exactly the model adopted by PCIe for its switching. PCIe switching represents a strict hierarchy of logical PCI-to-PCI bridges. A logical view of a PCIe switch that provides such a hierarchical model is shown in Figure 1.

Figure 1. A PCI Express switch provides a traditional PCI-bridging hierarchical structure.
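As an illustration of this bridge hierarchy, the following C sketch (our own simplified model, not taken from any PCI specification; the structure and field names are invented for this example) enumerates a switch modeled as one upstream bridge feeding several downstream bridges, assigning primary, secondary, and subordinate bus numbers the way a depth-first PCI-style bus scan would.

#include <stdio.h>

/* Simplified model of a PCIe switch viewed as a tree of logical
 * PCI-to-PCI bridges (illustrative structure, not a real config space). */
struct bridge {
    int primary;        /* bus number on the upstream side           */
    int secondary;      /* bus number created on the downstream side */
    int subordinate;    /* highest bus number reachable below        */
    int nchildren;
    struct bridge *child[8];
};

/* Depth-first enumeration: each bridge spawns a new bus and records
 * the highest bus number found below it. */
static int enumerate(struct bridge *b, int upstream_bus, int *next_bus)
{
    b->primary = upstream_bus;
    b->secondary = (*next_bus)++;
    int highest = b->secondary;
    for (int i = 0; i < b->nchildren; i++)
        highest = enumerate(b->child[i], b->secondary, next_bus);
    b->subordinate = highest;
    return highest;
}

int main(void)
{
    /* One upstream bridge with three downstream bridges: the logical
     * view of a small PCIe switch. */
    struct bridge down[3] = { {0}, {0}, {0} };
    struct bridge up = { .nchildren = 3,
                         .child = { &down[0], &down[1], &down[2] } };

    int next_bus = 1;                 /* bus 0 is the root bus */
    enumerate(&up, 0, &next_bus);

    printf("upstream bridge:   pri=%d sec=%d sub=%d\n",
           up.primary, up.secondary, up.subordinate);
    for (int i = 0; i < 3; i++)
        printf("downstream %d:     pri=%d sec=%d sub=%d\n", i,
               down[i].primary, down[i].secondary, down[i].subordinate);
    return 0;
}

Because the switch presents itself as nothing but bridges, exactly this legacy scan discovers the devices behind it, which is the software transparency the section describes.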
Indeed, PCIe strictly preserves the PCI logical model, while simultaneously transforming PCI from a shared parallel mezzanine structure into a switched serial one. In the following subsection, we briefly describe the different layers of PCI Express. In the subsequent subsection, we highlight the important features of PCI Express.

2.1. PCI Express Layers

2.1.1. Physical Layer

A PCIe link is composed of one or more transmit/receive pairs. The initial specification is based on a 2.5 gigabit per second raw data rate SERDES (referred to as a lane). The PCIe SERDES specification is very similar to the InfiniBand specification[10] with a lowered drive capability. PCIe lanes are scrambled to reduce electromagnetic interference (EMI). A PCIe link is composed of 1, 2, 4, 8, 12, 16, or 32 lanes. A link is also lane-order reversible, which is termed lane reversal. For example, a 4-lane port with lane numbers A0, A1, A2, A3 could be connected to another 4-lane port with lane numbers B3, B2, B1, B0, respectively; during link negotiation, one of the components would agree to logically reverse its lane ordering such that A0 would be logically connected to B0, in spite of the fact that physically A0 is connected to B3. Lane reversal allows ease of routing on printed circuit boards.

PCIe links are 8b/10b encoded[16]. This has the net effect of reducing the usable bandwidth of a PCIe lane to 2 gigabits per second. PCIe uses the commonly used 10-bit control codes[16] (known as comma or K codes) to communicate at this layer. Such communication includes link width negotiation, lane reversal, and packet delimiting. All PCIe packets are 32-bit aligned, and the K code delimiters are included in that alignment restriction.
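As a quick check of the figures above, the sketch below (our own illustration; only the 2.5 Gb/s lane rate, the 8b/10b overhead, and the permitted link widths are taken from the text) computes the usable bandwidth for each link width and prints the logical lane mapping that lane reversal would negotiate for the 4-lane example.

#include <stdio.h>

#define RAW_GBPS_PER_LANE   2.5   /* 2.5 Gb/s SERDES rate per lane      */
#define ENCODING_EFFICIENCY 0.8   /* 8b/10b: 8 data bits per 10 on wire */

int main(void)
{
    /* Usable (post-8b/10b) bandwidth for each permitted link width. */
    int widths[] = { 1, 2, 4, 8, 12, 16, 32 };
    for (unsigned i = 0; i < sizeof widths / sizeof widths[0]; i++)
        printf("x%-2d link: %5.1f Gb/s usable\n", widths[i],
               widths[i] * RAW_GBPS_PER_LANE * ENCODING_EFFICIENCY);

    /* Lane reversal on a x4 link: local lane i is physically wired to
     * remote lane (width-1-i); one side logically reverses its ordering
     * so that logical lane 0 still talks to logical lane 0. */
    int width = 4;
    for (int lane = 0; lane < width; lane++)
        printf("logical lane A%d <-> physical wire to B%d\n",
               lane, width - 1 - lane);
    return 0;
}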

2.1.2. Data Link Layer

PCIe links are managed by Data Link Layer Packets, referred to as DLLPs. These packets, sent between link partners, can be lost (a bit corruption of a packet results in the packet being discarded and thus lost), and such loss is tolerated rather than corrected. DLLPs are used to synchronize links, exchange and update link credit information, acknowledge the receipt of packets, and exchange link state information.

DLLPs are fixed 64-bit packets that are covered by a 16-bit CRC. After accounting for the K-code delimiters, up to 32 bits are available for DLLP type identification and data payload.

2.1.3. Transaction Layer

User data is exchanged via Transaction Layer Packets (TLPs). TLPs are reliable; hence, if TLP loss or corruption is detected, packet retransmission is required. Continued inability to successfully transmit a TLP will result in link retraining and, ultimately, in a link being disabled. A TLP consists of a pair of K code delimiters, a 16-bit packet preamble, a packet header, potentially some packet payload, and a 32-bit CRC. The packet preamble identifies a packet's sequence number for packet acknowledgement and loss recovery purposes. The packet header options are a straightforward set of reads, writes, and read completions for configuration, I/O, 32-bit memory, and 64-bit memory spaces. Maximum packet size is 4K bytes, though implementations may further restrict this limit. PCIe employs a segmentation and reassembly mechanism for transcending packet size restrictions. A per-packet CRC-32 supports the detection of transmission errors.

2.2. Features

2.2.1. Credit-Based Flow Control

PCI manages congestion via retry. Bi-directional packetized protocols cannot efficiently rely on a retry mechanism. Instead, PCIe employs a credit-based flow-control model[17]. This model requires that a transmitter have credit for a TLP before forwarding the packet (DLLPs do not require credit and can be sent at any time). This mechanism guarantees that a receiver has space to hold the packet before it is sent and allows the transmitter to adjust its data flow to match the congestive characteristics of its link partner.

The link credit update model employs a monotonically increasing counter model that is immune to DLLP loss. The significant characteristic of this model is that the transmitter uses one counter and one register and the receiver employs one counter. The receiver returns the cumulative number of credits that it has granted (with wrap at the limit of the counter size). The transmitter has a cumulative counter for the number of credits it has consumed (again with wrap) and a register in which the last update value sent by the receiver is stored. The initial credit value allocated by the receiver is used to initialize this register. The transmitter subtracts its counter from its register to determine whether it can send any given packet. The major advantage of this model is that, while the loss of a credit update may result in temporary credit starvation, there is no possibility of credit leakage. A successfully received credit DLLP is fully state restorative.
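The wrap-tolerant credit arithmetic described above fits in a few lines of code. The following C sketch is our own illustration of the counter-and-register model (the 12-bit counter width, the names, and the credit amounts are assumptions made for this example, not values from the specification).

#include <stdint.h>
#include <stdio.h>

/* Illustrative 12-bit credit counters; real field widths differ by credit type. */
#define CREDIT_MASK 0x0FFFu

struct tx_credit_state {
    uint16_t consumed;     /* cumulative credits consumed (wraps)          */
    uint16_t credit_limit; /* last cumulative limit advertised by receiver */
};

/* A credit update DLLP is fully state restorative: simply overwrite the limit. */
static void on_credit_update(struct tx_credit_state *s, uint16_t advertised_limit)
{
    s->credit_limit = advertised_limit & CREDIT_MASK;
}

/* Modular subtraction tolerates wrap: the packet may be sent if the
 * credits it needs fit within (limit - consumed) mod 2^12. */
static int can_send(const struct tx_credit_state *s, uint16_t credits_needed)
{
    uint16_t available = (uint16_t)(s->credit_limit - s->consumed) & CREDIT_MASK;
    return credits_needed <= available;
}

static void consume(struct tx_credit_state *s, uint16_t credits)
{
    s->consumed = (s->consumed + credits) & CREDIT_MASK;
}

int main(void)
{
    struct tx_credit_state tx = { .consumed = 0, .credit_limit = 0 };
    on_credit_update(&tx, 8);               /* receiver initially grants 8 credits */

    for (int pkt = 0; pkt < 12; pkt++) {
        if (!can_send(&tx, 1)) {
            /* Starved: wait for the receiver to advertise a larger cumulative limit. */
            on_credit_update(&tx, (uint16_t)(tx.credit_limit + 4));
            printf("packet %2d stalled until credit update (limit=%u)\n",
                   pkt, (unsigned)tx.credit_limit);
        }
        consume(&tx, 1);
        printf("packet %2d sent (consumed=%u)\n", pkt, (unsigned)tx.consumed);
    }
    return 0;
}

Because both sides exchange cumulative values with wrap, a lost update is harmless: the next successfully received update carries the complete state, which is why the text calls a credit DLLP fully state restorative.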
2.2.2. Virtual Channels

PCI transactions do not contain any priority information. Also, the transactions are such that writes are strongly ordered while reads cannot bypass writes. PCIe provides QoS support by employing virtual channels (VCs) and traffic classes (TCs). This allows future systems to be enhanced to take advantage of the ability to prioritize data within the fabric. PCIe TLPs have a 3-bit traffic class field that identifies the relative priority of the associated packet. PCIe components can implement 1, 2, 4, or 8 virtual channels. A virtual channel represents an entire set of switching resources dedicated to a specific channel. Traffic classes are mapped to virtual channels. For example, a legacy PCI infrastructure could be supported by mapping all 8 TCs to a single VC. A more sophisticated next generation system could employ 2 VCs to create two independent data flows, with one flow being given priority over the other. The mapping between VCs and TCs is software controlled, with a modicum of flexibility in some cases. Of course, implementation of a single VC or all 8 VCs fixes the VC-to-TC mapping.

2.3. PCI Express Limitations

PCIe's greatest advantage, its strict compatibility with PCI, is also one of its greatest long-term liabilities. A PCIe fabric spans a single global address space. There is no notion of a system boundary within PCIe. A PCIe fabric is, by definition, a single system. Multiple hosts (root complexes) cannot share a fabric. All routing within a fabric is strictly hierarchical; that is, if any significant amount of peer-to-peer communication existed in a fabric, then, for a statistically random traffic pattern, half of all such traffic would traverse the switch attached to the root complex. No options exist for shortening the paths between leaf nodes by directly connecting switches across hierarchies. Advanced Switching (AS), the topic of the next section, addresses all of these limitations, and more.

3. Advanced Switching

Its support for PCI legacy notwithstanding, PCIe is suited only to providing scalability in systems with a single host processor (master) and a number of devices (slaves). Since all communication is under the control of a single host, PCIe is not well suited to a large application space that includes multiprocessing and peer-to-peer communication.

Advanced Switching (AS) not only enables direct unicast communication between AS end devices but also supports multicast. Its support for multiple hosts permits a high degree of redundancy to be achieved. The additional capabilities are primarily the result of defining a new transaction layer for AS. Indeed, AS uses the same physical layer as PCIe. It also shares much of the link layer; additional DLLPs have been incorporated for the purpose of exchanging VC credit flow control as well as congestion management messages between AS link partners. AS also inherits the differentiated traffic class (TC) and virtual channel (VC) concepts and provides rich QoS capabilities.

The transaction layer for AS enables a protocol-agnostic model for routing packets within AS fabrics. AS essentially provides a medium through which not only PCIe, but also other protocols such as HyperTransport[9], RapidIO[11], InfiniBand[10], and StarFabric[18], may tunnel. See Figure 2.

Figure 2. Protocol tunneling through AS: PCIe host bridges and InfiniBand and RapidIO bridges attach PCIe fabrics, InfiniBand networks, RapidIO networks, and I/O devices to a central AS fabric.

In the rest of this section, we restrict ourselves to discussing three important aspects of AS, namely (a) protocol encapsulation, (b) routing, and (c) congestion management.

3.1. Protocol Encapsulation

AS employs a split header approach. The first half of the header is purely concerned with packet routing within an AS fabric and with the identification of the type of the second half of the packet. The second half of the header is route independent and purely concerned with control and/or data delivery. In this approach, almost any protocol can fill the role of the second half of a packet. The AS fabric proper does not care what type of data it is routing, and, in general, the routed data has relatively little interest in exactly how it was routed.

AS supports a simple protocol encapsulation model in which an AS route header can be attached to a data protocol and that data protocol can then be switched within AS. For example, an AS fabric can carry PCIe packets without a PCIe fabric even being aware of the existence of AS. AS can look like a big (or small) PCIe switch. A PCIe-to-AS bridge could encapsulate a PCIe protocol packet within an AS packet on one side of a fabric such that the PCIe packet is de-encapsulated on the other side of the fabric.
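One way to picture the split header is as a small, fixed route header prepended to an opaque payload. The C sketch below is purely illustrative (the field names, widths, and values are our own inventions, not the AS header format): the forwarding function looks only at the route half and never parses the encapsulated protocol.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Illustrative AS-style split header: the fabric looks only at this part. */
struct route_header {
    uint32_t turnpool;        /* path through the fabric (route half)   */
    uint8_t  turnpointer;     /* current position within the turnpool   */
    uint8_t  traffic_class;   /* TC used to pick a virtual channel      */
    uint16_t protocol_id;     /* identifies the encapsulated protocol   */
};

/* The second half is opaque to the fabric: here, a stand-in for a PCIe TLP. */
struct encapsulated_packet {
    struct route_header route;
    uint8_t  payload[64];     /* encapsulated packet, never interpreted */
    uint16_t payload_len;
};

/* A switch forwards purely on the route half; the payload is not parsed. */
static void forward(const struct encapsulated_packet *p)
{
    printf("forwarding: turnpool=0x%08x tc=%u proto=0x%04x (%u payload bytes)\n",
           (unsigned)p->route.turnpool, (unsigned)p->route.traffic_class,
           (unsigned)p->route.protocol_id, (unsigned)p->payload_len);
}

int main(void)
{
    uint8_t fake_tlp[16] = { 0x40, 0x00, 0x00, 0x04 };  /* placeholder bytes only */
    struct encapsulated_packet pkt = {
        .route = { .turnpool = 0x01CE7u, .turnpointer = 0,
                   .traffic_class = 3, .protocol_id = 0x0001 },
        .payload_len = sizeof fake_tlp,
    };
    memcpy(pkt.payload, fake_tlp, sizeof fake_tlp);
    forward(&pkt);
    return 0;
}

A PCIe-to-AS bridge in this picture would fill the payload with the original TLP bytes on one side of the fabric and hand them back out unchanged on the other.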
3.2. Routing

The design goal of AS was focused on fabrics with fewer than a thousand endpoints (though tens of thousands of endpoints are supported). This is in stark contrast to a typical IP-based LAN/WAN environment, where the number of endpoints is orders of magnitude higher. Also, the frequency with which AS endpoints are expected to enter or exit a fabric is relatively low, especially when compared to Ethernet/IP environments. Consequently, the routing adopted for AS differs from that of a traditional IP approach. In the context of describing AS routing, we briefly describe the two approaches for routing a packet through a network, namely (a) destination-based routing and (b) source-based routing.

A destination-based routing approach requires the source node to specify only the source and destination addresses. A destination address is required to make a forwarding decision at each hop as a packet traverses a network. A source address is not strictly required, except when a response is required. This approach is taken in an Ethernet/IP environment, in which a switch as well as a router is used for forwarding packets. The switch, which operates on the packet's link layer destination address (MAC), serves to eliminate Ethernet collisions at the end nodes, while the router is responsible for packet forwarding decisions between such switches. The forwarding decision, which requires a longest prefix match of the network layer destination address[19], is a computationally arduous task that must be performed at wire speed and requires additional hardware support[20]. To alleviate this problem, label-based techniques such as MPLS have been proposed[21].

It may, however, be contended that if the total number of nodes that can be supported in a fabric is restricted, the forwarding table can accommodate an entry for every potential end node and thereby allow a straightforward lookup. Indeed, InfiniBand[10] uses such an approach, where the number of local identifiers (LIDs) within a subnet for a unicast packet is limited to a 14-bit range. Nevertheless, such a scheme still requires a routing table of 16K entries to be supported in every switch node in the fabric[22].

A source-based routing approach, on the other hand, requires the exact path to the destination to be specified by the source within the packet header. This obviates the need for a destination identifier. Since the actual forwarding decision need not be made within the switching hardware, complexity is reduced. Endpoints or internal switch nodes can determine the reverse path quite easily, so no source identifier needs to be included in the header. The ability to exactly identify the source of any given packet also solves the source identification problem (problems that arise from the misidentification of a packet's origin).

AS uses a source-based path routing approach for unicast packets and a destination-based table-lookup routing approach for multicast packets. For unicast packets, a path consists of a 32-bit turnpool (which includes a direction bit) and a 5-bit turnpointer. The turnpool contains the sequence of turns a packet must take as it traverses the fabric, while the turnpointer allows each switch node to pick out its particular turn from the turnpool. The turnpointer is updated as a packet moves from one switch node to the next. The update is based on the number of bits each switch node consumes from the turnpool; assuming support for loopback, an N-port switch consumes ⌈log2 N⌉ bits. Figure 3 illustrates how a packet is forward- and backward-routed through a set of AS switch nodes using the turnpool and turnpointer fields.

Figure 3. AS source-based path routing: (a) forward routing and (b) backward routing of a packet through an 8-port switch and two 4-port switches, showing the turnpool and the turnpointer value at each hop.
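To make the turn-based forwarding concrete, the sketch below (our own simplification; the bit-packing convention and the example path are assumptions, not the AS packing rules) walks a packet through a sequence of switches, letting each switch of N ports consume ⌈log2 N⌉ bits of the turnpool to select its exit port and advancing the turnpointer accordingly.

#include <stdint.h>
#include <stdio.h>

/* Bits a switch consumes from the turnpool: ceil(log2(nports)),
 * assuming loopback is supported as described in the text. */
static unsigned turn_bits(unsigned nports)
{
    unsigned bits = 0;
    while ((1u << bits) < nports)
        bits++;
    return bits;
}

int main(void)
{
    /* A hypothetical three-hop path: an 8-port then two 4-port switches. */
    unsigned hops[]  = { 8, 4, 4 };
    unsigned exits[] = { 5, 2, 1 };   /* desired exit port at each hop */
    uint32_t turnpool = 0;
    unsigned turnpointer = 0;

    /* Source builds the turnpool: turns packed from the least significant
     * end (an illustrative convention for this example). */
    unsigned shift = 0;
    for (unsigned h = 0; h < 3; h++) {
        turnpool |= (uint32_t)exits[h] << shift;
        shift += turn_bits(hops[h]);
    }
    printf("turnpool = 0x%08x\n", (unsigned)turnpool);

    /* Each switch extracts its turn and advances the turnpointer. */
    for (unsigned h = 0; h < 3; h++) {
        unsigned bits = turn_bits(hops[h]);
        unsigned port = (turnpool >> turnpointer) & ((1u << bits) - 1u);
        turnpointer += bits;
        printf("hop %u: %u-port switch exits on port %u (turnpointer now %u)\n",
               h + 1, hops[h], port, turnpointer);
    }
    return 0;
}

Reversing the walk, consuming the same bit fields from the opposite end, yields the backward path, which is how a response or a switch-generated notification can reach the source without a source identifier in the header.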

Since packets for a particular source-destination pair use the same path for routing, there is no scope for the packet re-ordering[23] that occurs in adaptive destination-based routing approaches, allowing end node protocols to be greatly simplified. Such a deterministic approach to routing, however, has the potential to deadlock. Though conventional deadlock avoidance techniques that allow packets to move from one VC to another during transit[24] are not applicable in AS, deadlock can indeed be avoided in AS. One solution would be to partition the fabric into multiple sub-fabrics where each sub-fabric uses a particular VC[25]. Alternatively, at configuration time, a central fabric manager can set up paths in each source node such that deadlock-free routing is guaranteed.

Unlike unicast, the source-based path routing approach becomes prohibitively expensive for a multicast packet that targets several destination nodes; hence the destination-based table-lookup approach for multicast. The lookup is based on a multicast group identifier, which may be as large as 64K. However, a switch is required to support a minimum of only one multicast group. Each valid entry in the multicast table contains the set of egress ports in the switch onto which the incoming packet must be replicated. Finally, even though multicast packets do not use a path routing approach, such packets do build a path en route to their destination. This permits any response packet from a destination node to make use of path routing. As a result, there is also no need for the source node to provide its identifier for multicast packets.

The destination-based approach is more flexible and can better accommodate changes to the fabric. However, the relatively small number of end devices in AS (10s to 100s of nodes) and the infrequent changes to the fabric itself permit each source in the fabric to pre-determine its path to every other destination. Overall, the initial set-up cost for enabling path routing may be preferable to the overhead incurred in the destination-based approach.
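A multicast table of the kind described above boils down to an indexed lookup that yields a set of egress ports. The sketch below is an illustrative model only (the table size, the bitmask representation, and the decision not to replicate back onto the ingress port are our own choices for the example, not AS requirements).

#include <stdint.h>
#include <stdio.h>

#define MAX_GROUPS 16   /* illustrative; the identifier space may be up to 64K */

/* Each valid entry is a bitmask of egress ports to replicate onto. */
struct multicast_table {
    uint32_t egress_mask[MAX_GROUPS];
    uint8_t  valid[MAX_GROUPS];
};

static void replicate(const struct multicast_table *t, unsigned group,
                      unsigned ingress_port)
{
    if (group >= MAX_GROUPS || !t->valid[group]) {
        printf("group %u: no valid entry, packet dropped\n", group);
        return;
    }
    for (unsigned port = 0; port < 32; port++)
        if (((t->egress_mask[group] >> port) & 1u) && port != ingress_port)
            printf("group %u: replicate to port %u\n", group, port);
}

int main(void)
{
    struct multicast_table t = { {0}, {0} };
    t.valid[3] = 1;
    t.egress_mask[3] = (1u << 1) | (1u << 4) | (1u << 6);  /* ports 1, 4, 6 */

    replicate(&t, 3, 4);   /* arrives on port 4: copies go to ports 1 and 6 */
    replicate(&t, 7, 0);   /* no entry configured for this group            */
    return 0;
}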

3.3. Congestion Management

AS, like PCIe, uses credit-based flow control[17]. This requires the sender to have the requisite VC credits for a packet before it can begin transmitting on its output port. Unlike in a TCP/IP environment, the receiver does not drop packets due to lack of buffer space. AS only uses retransmission for packets that experience data corruption during transmission, and then only between link partners. As a result, the infamous congestion collapse condition[26] is avoided. The credit-based flow control notwithstanding, an AS switch is still prone to congestion when the aggregate bandwidth of incoming flows targeting a common output port exceeds the port's bandwidth.

Congestion in AS occurs on a per-VC basis, since the output port bandwidth is typically partitioned across multiple VCs based on VC priority. Thus, a switch may be severely congested for a particular VC and yet may allow other VC flows to proceed unimpeded. Congestion as described in the rest of this section is therefore in the context of a specific VC.

In keeping with other approaches, congestion detection is done in AS switch nodes. The method adopted to detect congestion is, however, implementation specific. For instance, switches with output buffers may use the average queue size along with a threshold[27]; alternatively, one may also detect congestion based on input buffer status[28]. Packets, however, may not be dropped[27] when congestion is detected unless they are explicitly marked as perishable by the source node.
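As one concrete possibility for the implementation-specific detection just mentioned, an output-buffered switch could keep an exponentially weighted moving average of queue occupancy and compare it against a threshold, in the spirit of the average-queue-size scheme cited above[27]. The sketch below is our own illustration (the weight, threshold, and sample values are arbitrary choices for the example), not a mechanism mandated by AS.

#include <stdio.h>

/* EWMA of output-queue occupancy; weights and threshold are illustrative. */
#define WEIGHT     0.2    /* contribution of each new sample                */
#define THRESHOLD 15.0    /* average occupancy (packets) deemed congested   */

int main(void)
{
    double avg = 0.0;
    /* Instantaneous queue depths sampled at packet arrivals: a short burst
     * that should raise the average only while it persists. */
    int samples[] = { 4, 6, 10, 18, 30, 34, 36, 28, 20, 12, 6, 2 };

    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        avg = (1.0 - WEIGHT) * avg + WEIGHT * samples[i];
        printf("sample %2u: depth=%2d avg=%5.1f -> %s\n",
               i, samples[i], avg, avg > THRESHOLD ? "congested" : "ok");
    }
    return 0;
}

Smoothing the instantaneous depth keeps a momentary burst from being treated as congestion while still flagging sustained overload, which is what the notification mechanisms described next react to.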

The congestion notification uses the Explicit Congestion Notification (ECN)[29] approach. Here, the switch plays a passive role by marking downstream packets when congestion is detected. It is the responsibility of the upper-level protocols on the destination node to take appropriate action on receipt of such marked packets. The action may include notifying the source node to back off. AS not only supports the typical ECN model but also takes a step further by allowing the switch node to play an active role in the notification process. Indeed, the switch may notify the source directly, and, with path routing in place, the backward path to the source is easily generated in the switch. The source can respond to the switch-initiated notification without involving upper-level protocols. An AS switch generates an independent BECN (Backward ECN) packet to the source of a congested packet and marks a congested packet's FECN (Forward ECN) flag to signal the destination that the packet experienced congestion.

Finally, the source node responds to a BECN by adjusting its packet injection rate using the additive increase-multiplicative decrease (AIMD) algorithm or one of its variants[30]. Overall, the approach taken by AS is similar to the BECN/FECN settings in frame relay switches[31]. However, unlike AS switches, frame relay switches do not create a new control packet but set the BECN bit on an existing packet heading to the source.

In addition to supporting the conventional endpoint-based congestion management approach, AS also supports a switch-centric scheme. Since network traffic is extremely bursty and does not necessarily follow a Poisson model[32], transient congestion will be a common occurrence in some systems.

Consider the case when multiple flows share the same output port on a switch and diverge in the adjacent downstream node. Since AS uses flow control on a per-link basis, transient congestion on one of the output ports in the downstream node affects all of these flows. This is illustrated in Figure 4.

Figure 4. Transient congestion on port 2 in downstream switch B affects flows f0 and f1, as they share a common output port in switch A with flow f2. A LECN message from switch B (denoting port 2 being full) allows switch A to give higher priority to f0 and f1 packets.

AS addresses this problem by allowing switch nodes to periodically send notification messages to adjacent fabric nodes about the buffer usage of their output ports. These LECN (Local ECN) messages permit the upstream node to refine its scheduling such that higher priority may be accorded to packets targeting non-congested output ports in the downstream switch over those targeting potentially congested output ports.

4. Conclusions

The approach afforded by the PCI Express architecture allows complete support for PCI, thereby protecting the existing investment in the legacy PCI infrastructure, yet it addresses the bus-based limitations of PCI. With such ease of migration afforded by PCI Express, its widespread adoption is indeed beyond any doubt. Advanced Switching, while sharing the link and physical layers of PCI Express, further boosts PCIe's capabilities. This includes support for multiprocessing and peer-to-peer communication. AS also goes well beyond enhancing PCI Express: its protocol-agnostic nature allows it to serve as a universal platform for supporting a much wider range of protocols.

Together, PCI Express and Advanced Switching have the potential to provide an evolutionary yet revolutionary approach for building the next generation of interconnects.
5. References

1. PCI Special Interest Group, "PCI Express Base Specification Revision 1.0a," Apr. 2003.
2. Intel White Paper, "Advanced Switching for the PCI Express Architecture," May 2003. http://www.intel.com/design/network/25173701.pdf
3. AMD White Paper, "AMD Eighth-Generation Processor Architecture," Oct. 2001. http://www.amd.com
4. "IEEE Standard 696 Interface Devices," IEEE Computer Society, June 1982.
5. "PC/104 Specification," PC/104 Embedded Consortium, Aug. 2001.
6. J. Zalewski, "Advanced Multimicroprocessor Bus Architectures," IEEE Computer Society Press, 1995.
7. PCI Special Interest Group, "PCI Local Bus Specification, Revision 2.2," Dec. 1998.
8. PCI Special Interest Group, "PCI-X 2.0 Protocol Specification Revision 2.0," July 2003.
9. HyperTransport Consortium, "HyperTransport Technology: Simplifying System Design," Oct. 2002. http://www.hypertransport.org
10. InfiniBand Trade Association, "InfiniBand Architecture Specification, Release 1.0," Oct. 2000. http://www.infinibandta.org
11. RapidIO Trade Association, "RapidIO Technical Whitepaper, Rev. 3." http://www.rapidio.org
12. N. J. Boden et al., "Myrinet: A Gigabit Per Second Local Area Network," IEEE Micro, Vol. 15, No. 1, Feb. 1995, pp. 29-36.
13. F. Petrini et al., "The Quadrics Network: High Performance Clustering Technology," IEEE Micro, Vol. 22, No. 1, Jan./Feb. 2002, pp. 46-57.
14. R. W. Horst and D. Garcia, "ServerNet SAN I/O Architecture," Hot Interconnects V, Aug. 1997.
15. PCI Special Interest Group, "PCI-to-PCI Bridge Architecture Specification Revision 1.1," Dec. 1998.
16. A. X. Widmer and P. A. Franaszek, "A DC-Balanced Partitioned-Block 8B/10B Transmission Code," IBM Journal of Research and Development, Vol. 27, Sep. 1983, pp. 440-451.
17. H. T. Kung and R. Morris, "Credit-Based Flow Control for ATM Networks," IEEE Network, Vol. 9, No. 2, Mar./Apr. 1995, pp. 40-48.
18. StarFabric Trade Association, "StarFabric: Universal Switched Interconnect Technology." http://www.starfabric.org
19. V. Fuller, T. Li, J. Yu, and K. Varadhan, "Classless Inter-Domain Routing," IETF RFC 1519, June 1993.
20. P. Gupta, S. Lin, and N. McKeown, "Routing Lookups in Hardware at Memory Access Speeds," Proc. IEEE Infocom'98, Mar. 1998.
21. E. Rosen, A. Viswanathan, and R. Callon, "Multiprotocol Label Switching Architecture," IETF RFC 3031, Jan. 2001.
22. Agilent Technologies, HDMP-2840 Product Brief, Aug. 2002.
    V. Jacobson, "Congestion Avoidance and Control," Proc. SIGCOMM'88, Aug. 1988, pp. 314-329.
23. J. Bennett, C. Partridge, and N. Shectman, "Packet Reordering is Not Pathological Network Behavior," IEEE/ACM Transactions on Networking, Vol. 7, No. 6, pp. 789-798, Dec. 1999.
24. W. J. Dally and C. L. Seitz, "Deadlock-Free Message Routing in Multiprocessor Interconnection Networks," IEEE Transactions on Computers, Vol. 36, pp. 547-553, May 1987.
25. T. Skeie, O. Lysne, and I. Theiss, "Layered Shortest Path (LASH) Routing in Irregular System Area Networks," Workshop on Communication Architecture for Clusters (CAC'02), Apr. 2002.
26. J. Nagle, "Congestion Control in IP/TCP Internetworks," IETF RFC 896, Jan. 1984.
27. S. Floyd and V. Jacobson, "Random Early Detection Gateways for Congestion Avoidance," IEEE/ACM Transactions on Networking, Vol. 1, No. 4, pp. 397-413, Aug. 1993.
28. J. R. Santos, Y. Turner, and G. Janakiraman, "End-to-End Congestion Control for InfiniBand," Proc. IEEE Infocom'03, Apr. 2003.
29. K. K. Ramakrishnan, S. Floyd, and D. Black, "The Addition of Explicit Congestion Notification (ECN) to IP," IETF RFC 3168, Sep. 2001.
30. S. Floyd, "Congestion Control Principles," IETF RFC 2914, Sep. 2000.
31. W. Goralski, "Frame Relay for High-Speed Networks," John Wiley & Sons, Inc., New York, Jan. 1999.
32. V. Paxson and S. Floyd, "Wide Area Traffic: The Failure of Poisson Modeling," IEEE/ACM Transactions on Networking, Vol. 3, No. 3, pp. 226-244, June 1995.