INTERCONNECTS

Understanding RapidIO, PCIe and Ethernet

By Barry Wood, Tundra

While there are many ways to connect components in embedded systems, the most prominent are the high speed serial standards of Ethernet, PCI Express, and RapidIO. All of these standards leverage similar Serialiser/Deserialiser (SerDes) technology to deliver throughput and latency performance greater than what is possible with wide parallel technology. For example, RapidIO and PCI Express leveraged the XAUI SerDes technology developed for Ethernet. The trend towards leveraging a common SerDes technology will continue with future versions of these specifications. The implication is that raw bandwidth is not a significant differentiator for these protocols. Instead, the usefulness of each protocol is determined by how the bandwidth is used.

Protocol summaries

Most designers are familiar with basic Ethernet protocol characteristics. Ethernet is a 'best effort' means of delivering packets. The software protocols built on top of Ethernet, such as TCP/IP, are necessary to provide reliable delivery of information, as Ethernet-based systems generally perform flow control at the network layer, not the physical layer. Typically, the bandwidth of Ethernet-based systems is over-provisioned between 20 and 70 per cent. Ethernet is best suited for high latency inter-chassis applications, or on-board/inter-board applications where bandwidth requirements are low.

PCI Express (PCIe), in contrast, is optimised for reliable delivery of packets within an on-board interconnect where latencies are typically in the microsecond range. The PCIe protocol exchanges Transaction Layer Packets (TLPs), such as reads and writes, and smaller quantities of link-specific information called Data Link Layer Packets (DLLPs). DLLPs are used for link management functions, including physical layer flow control. PCIe was designed to be backwards compatible with the legacy of PCI and PCI-X devices, which assumed that the processor(s) sat at the top of a hierarchy of buses. This had the advantage of leveraging PCI-related software and hardware intellectual property. As discussed later in this article, the PCI bus legacy places significant constraints on the switched PCIe protocol.

RapidIO technology has been optimised for embedded systems, particularly those which require multiple processing elements to cooperate. Like PCIe, the RapidIO protocol exchanges packets and smaller quantities of link-specific information, called control symbols. RapidIO has characteristics of both PCIe and Ethernet. For example, RapidIO provides both reliable and unreliable packet delivery mechanisms. RapidIO also has many unique capabilities which make it the optimal interconnect for on-board, inter-board, and short distance (<100 m) inter-chassis applications.

Physical layer

At the physical/link layer, the protocols have different capabilities when it comes to flow control and error recovery. Ethernet flow control is primarily implemented in software at the network layer, as this is the most effective approach for large networks. Ethernet's only physical layer flow control mechanism is PAUSE, which halts transmission for a specified period of time. The limited physical layer flow control means that Ethernet networks discard packets to deal with congestion.

In contrast, PCIe and RapidIO physical-layer flow control mechanisms ensure reliable delivery of packets. Each packet is retained by the transmitter until it is acknowledged. If a transmission error is detected, a link maintenance protocol ensures that corrupted packets are retransmitted.
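The reliable-delivery behaviour described above — each packet held by the transmitter until it is acknowledged, with retransmission when an error is detected — can be sketched as a toy model. This is an illustrative simplification in Python, not the actual PCIe or RapidIO link protocol; the class name and the go-back-N-style recovery are assumptions made for the sketch:

```python
from collections import deque

class ReliableLink:
    """Toy model of PCIe/RapidIO-style link-level reliable delivery.

    The transmitter retains every packet in a retry buffer until it is
    acknowledged, and replays the buffer when the receiver reports a
    transmission error (go-back-N-style recovery). Purely illustrative.
    """

    def __init__(self):
        self.retry_buffer = deque()  # sent but not yet acknowledged
        self.next_seq = 0
        self.delivered = []          # packets accepted by the receiver

    def send(self, payload, corrupt=False):
        packet = (self.next_seq, payload)
        self.next_seq += 1
        self.retry_buffer.append(packet)  # retained until acknowledged
        self._transmit(packet, corrupt)

    def _transmit(self, packet, corrupt):
        if corrupt:
            # Receiver detected a bad packet: the link maintenance
            # protocol replays everything still in the retry buffer.
            for pkt in list(self.retry_buffer):
                self._transmit(pkt, corrupt=False)
        else:
            self.delivered.append(packet)
            # A good reception acknowledges the oldest outstanding packet.
            if self.retry_buffer and self.retry_buffer[0][0] == packet[0]:
                self.retry_buffer.popleft()

link = ReliableLink()
link.send("A")
link.send("B", corrupt=True)  # first transmission of "B" is corrupted
link.send("C")
print([p for _, p in link.delivered])  # -> ['A', 'B', 'C']
```

Despite the corrupted first attempt, every payload arrives exactly once and in order, which is the guarantee both protocols provide at the link level.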
PCIe ensures delivery using DLLPs, while RapidIO uses control symbols. Unlike DLLPs, RapidIO control symbols can be embedded within packets. This allows RapidIO flow control information, such as buffer occupancy levels, to be exchanged with low latency, allowing more packets to be sent sooner. Figure 1 illustrates this concept. In the leftmost panel, Device A cannot send any packets to Device B because the buffers in Device B are full. Device B is sending packets to Device A continually. In the middle panel, one buffer in Device B becomes free. At this point, Device B must inform Device A that it can send a packet. In the PCIe panel on the bottom right, the DLLP cannot be transmitted until transmission of the current packet is complete. In the RapidIO panel on the top right, a control symbol is embedded in a packet that is being transmitted. This allows the RapidIO protocol to deliver packets with lower latency and higher throughput than the other protocols. The ability to embed control symbols within packets allows the rest of the RapidIO flow control story to be much richer than that of PCIe or Ethernet, as discussed later in this article.

Figure 1: RapidIO embedded control symbols and PCIe DLLPs.

PCIe currently does not allow DLLPs to be embedded within TLPs, as this concept is incompatible with the legacy of PCI/PCI-X bus operation. DLLPs embedded within TLPs create periods where no data is available to be placed on the legacy bus. PCIe end-points could operate in store-and-forward mode to ensure that packets are completely received before being forwarded to the bus, at the cost of a drastic increase in latency and lowered throughput. Given the PCIe focus on on-board interconnect for a uniprocessor system, and the continued need to maintain compatibility with legacy bus standards, it is unlikely that the PCIe community will be motivated to allow DLLPs to be embedded within TLPs.

Beyond more efficient flow control, the ability to embed control symbols within packets gives RapidIO an ability that currently neither PCIe nor Ethernet can offer. Control symbols can be used to distribute events throughout a RapidIO system with low latency and low jitter (Figure 2). This enables applications such as distribution of a common real time clock signal to multiple end-points, or a frame signal for antenna systems. It can also be used for signalling other system events, and for debugging within a multi-processor system. As shown in Figure 2, PCIe DLLPs introduce a significant amount of latency and jitter every time the DLLP is transferred through a switch. In contrast, the RapidIO protocol allows a signal to be distributed throughout a RapidIO fabric with less than 10 Unit Intervals of jitter and 50 nanoseconds of latency per switch, regardless of packet traffic.

Figure 2: RapidIO Multicast Event Control Symbol and PCIe DLLP.

PCIe and Ethernet may choose to extend their respective specifications in future to allow events to be distributed with low latency. Introduction of a control-symbol-like concept would be a large step for Ethernet. Several initiatives are underway within the Ethernet ecosystem to improve Ethernet's abilities within storage applications that may need a control-symbol-like concept. Ethernet is also being enhanced to incorporate simple XON/XOFF flow control.

Figure 3: Read, write, and message semantics. The read semantic retrieves data 01 02 03 04 from address 0x1234_0000. The write semantic writes data FF FE FD FC to address 0x1234_0000. The message semantic determines the location of Fault_Handler without referencing the memory map.

Figure 4: RapidIO's virtual output queue backpressure mechanism.

Bandwidth options

Beyond flow control and link maintenance, the most obvious difference between Ethernet, PCIe and RapidIO at the physical/link layer is the bandwidth options supported. Ethernet has a long history of evolving bandwidth by ten times with each step. Ethernet currently operates at 10Mbps, 100Mbps, 1Gbps, and 10Gbps. Some proprietary parts also support a 2Gbps (2.5 Gbaud) option. Next generation Ethernet will be capable of operating at 40 and/or 100Gbps.

PCIe and RapidIO take a different approach, as on-board, inter-board and inter-chassis interconnects require power to be matched with the data flows. As a result, PCIe and RapidIO support more lane rate and lane width combinations than Ethernet. PCIe 2.0 allows lanes to operate at either 2 or 4Gbps (2.5 and 5 Gbaud), while RapidIO supports lane rates of 1, 2, 2.5, 4 and 5Gbps (1.25, 2.5, 3.125, 5, and 6.25 Gbaud). Both PCIe and RapidIO support lane width combinations from a single lane up to 16 lanes. PCIe also supports a 32 lane port in its specification.
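The baud rates in parentheses above are larger than the quoted data rates because these lane rates use 8b/10b encoding: every 10 bits on the wire carry 8 bits of data, so the usable rate is 0.8 times the baud rate (2.5 Gbaud gives 2 Gbps, 6.25 Gbaud gives 5 Gbps). A short sketch of the arithmetic, ignoring packet header overhead, so the results are upper bounds on usable port bandwidth:

```python
def lane_data_rate_gbps(baud_gbaud):
    """Usable data rate of one 8b/10b-encoded lane, in Gbps."""
    return baud_gbaud * 8 / 10

def port_bandwidth_gbps(baud_gbaud, lanes):
    """Raw unidirectional port bandwidth: lane data rate times lane count."""
    return lane_data_rate_gbps(baud_gbaud) * lanes

# Lane rates quoted in the article (baud rates in Gbaud)
rapidio_bauds = [1.25, 2.5, 3.125, 5.0, 6.25]
pcie2_bauds = [2.5, 5.0]

print([lane_data_rate_gbps(b) for b in rapidio_bauds])  # -> [1.0, 2.0, 2.5, 4.0, 5.0]
print([lane_data_rate_gbps(b) for b in pcie2_bauds])    # -> [2.0, 4.0]
print(port_bandwidth_gbps(3.125, 4))  # 4-lane, 3.125 Gbaud RapidIO port -> 10.0
```

These rate-by-width combinations are what let a designer match port power to the actual data flow, as discussed next.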
For a given lane width, a RapidIO port can supply both more and less bandwidth than PCIe, allowing system designers to tune the amount of power used to the data flows in a system.

Transport layer

The RapidIO and Ethernet specifications are topology agnostic. Any set of end-points can be connected using any fabric topology, including rings, trees, meshes, hyper-cubes, and more esoteric geometries such as entangled networks. Packets are routed through these topologies based on their network address. An example of an Ethernet network address is an Internet Protocol (IP) address. RapidIO network addresses are called destination identifiers, abbreviated as destIDs.

Topology-agnostic protocols enable a wide variety of resource sparing and redundancy options. For example, some systems use an N+1 sparing strategy to ensure high availability and reliability without significantly increasing the amount of hardware. (Sparing provides unused equipment as a backup for in-use equipment. In N+1 sparing, N components have one common spare, so that the system can continue to operate if one of the N components fails. This strategy is also known as 1:N sparing.) Support for different topology options also allows system designers to eliminate performance bottlenecks by matching data paths to data flows. System expansion and evolution is unconstrained within RapidIO and Ethernet networks.

In contrast, PCIe supports a tree structure with a single Root Complex at the top of the hierarchy. Several routing algorithms are used, depending on the type of TLP. The single topology supported by PCIe is part of the legacy of PCI/PCI-X bus support. The PCIe standard has been extended with Multi-Root I/O Virtualisation (MRIOV), which increases the number of physical topologies that PCIe can support. However, as MRIOV is expensive to implement, it is not certain if it will be adopted within the PCIe ecosystem. PCIe components also support Non-Transparent Bridging (NTB) functionality, whereby one PCIe hierarchy is granted access to a defined memory range in another PCIe hierarchy. The system reliability and availability of this approach are discussed further on in this article.

Many Ethernet-based systems are also functionally limited to tree structures. For example, many implementations of Ethernet make use of variations on the Spanning Tree Protocol to pare down the set of available links to a tree of used links. As part of an initiative to extend Ethernet into the storage network, protocols are being specified that will enable the creation of a spanning tree for each of the 4096 virtual channels supported by Ethernet. This will allow more links to be used simultaneously. However, support for 4096 virtual channels increases the complexity of switching, and may require more buffering and increased latency through the switch.

Logical layer

There are several large differences between RapidIO, PCIe and Ethernet at the logical layer. The most obvious differences are the semantics supported, as shown in Figure 3. PCIe packets support address-based read and write semantics. In PCIe, the entity which originates a read or write must know the target address in the system's global memory map. This is a natural approach for a control plane application. However, this reliance on a global address map can lead to tightly coupled software systems which are difficult to evolve.

The PCIe protocol also supports messaging through Message TLPs. However, Message TLPs support a limited number of functions, such as interrupt and reset signals. This is much different than Ethernet and RapidIO message packets, which are used for inter-process communication.

Unlike PCIe, software protocols built on the Ethernet physical layer only support messaging semantics. To send a message, the originator only needs to know the address of the recipient. Addressing schemes are generally hierarchical, so that no node must know all addresses. Addresses may change as a system evolves, enabling software components to be loosely coupled to each other. These attributes are necessary for data plane applications.

RapidIO supports both read/write and messaging semantics. Beyond the obvious architectural advantages and system flexibility, the support of read/write and messaging allows the use of a single interconnect for both control and data planes. RapidIO systems are thus simpler than those which must incorporate both PCIe and Ethernet, which in turn provides the benefit of reducing both power consumption and cost.

PCIe is unlikely to incorporate inter-process communication messaging semantics within its packet definitions, as these conflict with the operation of the legacy PCI and PCI-X busses. PCIe is designed around the concept of a Root Complex which contains the only processors in the system. In this paradigm, no messaging is done over PCIe to other software entities, so messaging semantics have no value.

It could be argued that Ethernet supports read-and-write semantics through the Remote Direct Memory Access (RDMA) protocol, which supports direct memory copies between devices. However, RapidIO (and PCIe) read-and-write semantics are far more efficient than RDMA. The RDMA protocol is built upon other Ethernet-based protocols, such as TCP, and requires a significant amount of header overhead for each packet. RDMA implementations are much more expensive in terms of latency and bandwidth when compared with RapidIO read/write transactions. While some RDMA offload engines exist, it is difficult to envision using RDMA for control plane functions such as programming registers.

One possible application of messaging semantics is the encapsulation of other protocols. RapidIO has standards for encapsulating a variety of protocols. The ability to encapsulate gives system designers many advantages, including future-proofing RapidIO backplanes. Any future, legacy or proprietary protocol can be encapsulated and transported using standard RapidIO components. For example, the existing RapidIO specification for Ethernet encapsulation allows designers to leverage Ethernet-based software within a RapidIO-based system.

Ethernet also supports encapsulation, and has several encapsulation standards to choose from. RapidIO can encapsulate more efficiently than Ethernet, as the layers of Ethernet-based protocols require more header overhead. At one time, there was an initiative which attempted to standardise encapsulation of a variety of protocols over a PCIe-like fabric. This initiative failed years ago due to the complexities involved. It was too difficult to extend PCIe, with its legacy bus related requirements, to be a fabric for embedded systems.

Reliability, availability

Most systems have requirements for reliability and/or availability. Such systems need mechanisms for detection of errors, notification of errors, analysis and isolation of the faulty component, and recovery. At a high level, PCIe, RapidIO, and Ethernet have similar capabilities in all of these areas. There are significant differences in sparing strategies, and in the ability to rapidly isolate the system from the effects of faulty components.

In the beginning, the Internet used software protocols at the network level to deliver reliable communications over a network that could experience severe damage. Consequently, Ethernet's original error management capabilities were aimed at detection, isolation and avoidance of new holes in the network, rather than robustness of individual systems. System reliability was achieved through duplication of network components. Later enhancements have added standardised error capture and error notification mechanisms to simplify network management.

RapidIO has sparing, error capture and error notification capabilities similar to those of Ethernet. PCIe supports limited sparing strategies, as its transport layer is limited to tree structures. PCIe's Non-Transparent Bridging, previously mentioned, allows two or more tree structures to communicate, and is sufficient for 1+1 sparing (also known as 1:1 sparing). NTB is difficult to scale to systems employing N+M sparing. In theory, Multi-Root I/O Virtualisation (MRIOV) can be used to support N+M sparing in PCIe systems where N+M totals eight or less. However, because the sub-trees within a MRIOV system cannot communicate with each other, recovery from failures may require a system outage in order to reconfigure the system to isolate failed components and to incorporate new ones.

Ethernet's error detection mechanisms are generally slower when compared to PCIe and RapidIO, as Ethernet was designed for a network that is distributed across the planet. Both PCIe and RapidIO have error detection and notification mechanisms with much lower latency compared to Ethernet.

While both PCIe and RapidIO guarantee delivery of packets, under error conditions they can discard packets to prevent faulty components from causing fatal congestion. However, the PCIe error condition mechanism is not configurable. Packets are always discarded when a link must retrain. Additionally, the PCIe isolation mechanism is activated only after several milliseconds. These are not desirable behaviours in all systems.

In contrast, the RapidIO standard allows a system-specific response to errors such as link retraining. When errors occur, the system can immediately start discarding packets, or it can retain packets and allow congestion to occur. RapidIO makes use of a 'leaky bucket' method of counting errors, with two configurable thresholds. The DEGRADED threshold gives system management software an early warning that a link is experiencing errors. The FAILED threshold can be set to trigger packet discard at a user defined error rate. The flexibility of RapidIO error management reflects the varying needs of embedded system designers.

Flow control

Flow control cuts across the physical, transport and logical layers of interconnect specifications. Flow control capabilities are critical to ensuring the correct and robust operation of a system under a range of conditions, including partial failure and overload. Flow control mechanisms allow the available bandwidth to be used as efficiently and completely as possible. Flow control strategies are becoming more and more important in order to minimise the amount of bandwidth and power wasted by over-provisioning high frequency serial links.

It is not possible to talk about a unified Ethernet flow control strategy, as many unrelated Ethernet-based messaging standards have protocol-specific flow control strategies to avoid packet discard. Generally, the flow control strategies of these standards are based on reducing the rate of transmission when packet loss is detected. These strategies are generally implemented in software, and require significant buffering capabilities to allow for retransmission.

PCIe flow control is limited to the physical layer. The PCIe flow control mechanism is based on tracking credits for packet headers and data chunks, tracked separately for Posted, Non-Posted, and Completion transactions.

RapidIO specifies flow control mechanisms at the physical and logical layers. Physical layer flow control mechanisms are designed to deal with multiple-microsecond periods of congestion. At the physical layer, RapidIO offers PCIe-style flow control supplemented with a simple retry mechanism. The simple retry mechanism is very efficient to implement, with minimal performance penalty compared to PCIe-style flow control. The RapidIO physical-layer flow control also includes a virtual output queue-based backpressure mechanism. This mechanism, introduced in RapidIO 2.0, allows switches and end-points to learn which destinations are congested, and to send traffic to uncongested destinations. This feature enables distributed decision making, ensuring that available fabric bandwidth is maximally utilised. The latency of decision making is low, as congestion information is exchanged using control symbols which, as already discussed, can be embedded within RapidIO packets.

The virtual output queue backpressure mechanism is illustrated in Figure 4. In the top panel, the source is able to send much faster than EndPoint (EP) 1 can accept packets. This results in a congestion status control symbol being sent by EP 1 to Switch 2, which cascades the message back to Source. The congestion status control symbol could also have been originated by Switch 2 when it detected congestion on the port connected to EP 1. Once Source receives the congestion status control symbol, it starts to send packets to EP 2, reducing the rate of packet transmission to EP 1.

RapidIO's logical layer flow control mechanisms are designed to avoid congestion within the fabric by metering the admission of packets to the fabric, thus managing congestion at the network level. This approach is similar to Ethernet-based software protocols. Admission of packets for specific flows can be administered through an XON/XOFF type protocol, as well as by rate-based and credit-based flow control. Perhaps most importantly, these flow control mechanisms can also be used at the application layer to engineer software application performance. Best of all, these mechanisms were designed to be implemented in hardware, freeing precious CPU cycles to deliver value to the customer. RapidIO's flow control mechanisms ensure that RapidIO-based systems use available bandwidth in an efficient, predictable manner.

Summary

Ethernet, PCIe, and RapidIO are all based on similar SerDes technology. SerDes technology is therefore not a differentiator for these technologies, but the way they make use of the available bandwidth is. Each technology is optimised for a particular application space.

Ethernet has been optimised for networks which are geographically distributed, have long latencies, and have dynamic network configurations. PCIe has been optimised to support a hierarchical bus structure on a single board. Both have been used for on-board, inter-board, and inter-chassis communications, and in many cases both are used in the same system. RapidIO has the potential to combine the benefits of these two interconnects into a single interconnect, with associated power and cost savings.

RapidIO is the best interconnect choice for embedded systems. RapidIO has capabilities similar to PCIe and Ethernet, as well as capabilities that other interconnects will not duplicate, such as:

• Low latency, low jitter distribution of system events
• Combined link level and network level flow control mechanisms
• Configurable error detection and topology agnostic routing, enabling efficient sparing, high reliability and availability
• Hardware implementation of both read/write and inter-process communication messaging semantics

These capabilities allow system architects to create better performing systems which consume less power and are easier to scale.
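The credit-based flow control discussed above can be reduced to a minimal sketch: a transmitter may send only while it holds credits, and the receiver returns a credit each time it frees a buffer. This Python model is purely illustrative; it uses a single credit pool and invented names, whereas real PCIe tracks header and data credits separately for Posted, Non-Posted and Completion transactions:

```python
class CreditLink:
    """Minimal sketch of credit-based flow control on one link."""

    def __init__(self, buffers):
        self.credits = buffers  # credits advertised at link initialisation
        self.received = []      # packets accepted into receiver buffers

    def try_send(self, packet):
        if self.credits == 0:
            return False        # no free receiver buffer: transmitter waits
        self.credits -= 1       # one credit consumed per packet sent
        self.received.append(packet)
        return True

    def free_buffer(self):
        # Receiver drains a buffer and returns a credit to the transmitter
        # (carried by a DLLP in PCIe, or a control symbol in RapidIO).
        self.credits += 1

link = CreditLink(buffers=2)
print([link.try_send(p) for p in ("p0", "p1", "p2")])  # -> [True, True, False]
link.free_buffer()                                     # one buffer drained
print(link.try_send("p2"))                             # -> True
```

Because the transmitter stops the moment its credits run out, no packet is ever dropped for lack of buffering — the property that lets PCIe and RapidIO guarantee delivery without Ethernet-style over-provisioning.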

 eetindia.com | EE Times-India