Analysis and Optimisation of Communication Links for Signal Processing Applications

ANDREAS ÖDLING

Degree Project in Electronic and Computer Systems, second level, 30 credits
School of Information and Communication Technology (ICT)
KTH Royal Institute of Technology
Supervisor: Johnny Öberg
Examiner: Ingo Sander
Stockholm, November 12, 2012

TRITA-ICT-EX-2012:287

Abstract

There are lots of communication links and standards currently being employed to build systems. Many of these methods are standardised, but far from all of them. The trick is to select the communication method that best suits your needs. There is also currently a trend that things have to be cheaper and have a shorter time to market, which leads to more systems being built from Commercial Off-The-Shelf (COTS) commodity components.

As one part of this work, Gigabit Ethernet is evaluated as a COTS solution for building large, high-end systems. The computers used run Windows, and the protocols used over Ethernet are both TCP and UDP. In this work an attempt is also made to evaluate one of the non-standard protocols: the Link Port protocol for the TigerSHARC 20X series, which is a narrow, double-data-rate protocol able to provide multi-gigabit-per-second performance.

The studies have shown many interesting things, e.g. that using a standard desktop computer and network card, the theoretical throughput of TCP over Ethernet can almost be met, reaching well over 900 Mbps. UDP performance, on the other hand, gives birth to a series of new questions about how to achieve good performance in a Windows environment, since it is consistently outperformed by the TCP connections.

For the Link Port assessment, a custom-built IP block is made that is able to support the protocol at full speed, using a Xilinx Virtex 6 FPGA. The IP block is verified through simulation against a model of the Link Port protocol. It is also shown that the transmitter of the IP block is able to send successfully to the receiver IP block. The created IP block is evaluated against some competing multi-gigabit protocols for comparison; it is a rather small IP block, capable of handling all transactions on the bus as long as data is provided by its host.

Referat

At present there are many different kinds of communication links, both standardised and not. In addition, demands for shorter time to market have in many cases led to more and more systems being built from ready-made components that are connected together into complete systems. As a part of this, well-proven techniques that are known to work are often used.

As one part of this work, the performance of Gigabit Ethernet will be evaluated for ordinary personal computers running Windows, using the TCP and UDP protocols. The computers are equipped with low-cost standard network cards, and the investigation aims to find out whether these cards and computers can be used to build systems with high performance. In addition, a non-standardised protocol, the Link Port protocol for the TigerSHARC 20X series, which supports several Gbps, will be evaluated for performance.

The study of TCP and UDP led to very interesting results. Among other things, the study has shown that TCP communication between two personal computers can come within just a few Mbps of the theoretical maximum, and communication speeds well over 900 Mbps have been measured for TCP. UDP, in turn, raised more questions than it answered, and it consistently had worse performance than the TCP tests. This suggests that, when writing programs for ordinary personal computers, there is nothing to gain from using UDP; rather the opposite.

For the study of Link Ports, an IP block was created which can send and receive data at the highest rate specified in the protocol description: four gigabits per second. The block was verified through simulation and by letting the transmitter send data which the receiver successfully received. Finally, the Link Port was compared against other protocols with similar characteristics, and the comparison presents the created IP block as a good alternative to the other protocols, much due to its simplicity.

Contents

Abstract
Referat
Contents
List of Figures
List of Tables
Listings
Definitions

I Prelude

1 Introduction
1.1 Purpose
1.2 Goals
1.3 Motivations for This Work
1.4 Limitations for This Work
1.5 Layout for the Report

2 Background and Related Work
2.1 History of Radar Systems
2.1.1 Radar Construction Basics
2.1.2 A Probable Future
2.2 An Example System
2.2.1 Conceptual Radar System
2.2.2 Data Transfers in the Conceptual Radar System
2.3 A Background to Physical Signalling
2.4 Multi-Gigabit Transceivers
2.5 The Link Port Protocol
2.5.1 Some Link Port Characteristics
2.5.2 Previous Work on Link Ports
2.6 Communication Protocols
2.7 Previous Work on Protocol Comparison
2.7.1 TCP and UDP Performance Over Ethernet
2.7.2 RapidIO Analysis
2.7.3 PCI Express Evaluation
2.7.4 USB Experiments
2.7.5 Infiniband Studies
2.7.6 Intel Thunderbolt
2.8 Data Acquisition Networks
2.8.1 Common Features for DAQ Networks

II Contributions

3 Methods
3.1 Link Port
3.2 Gigabit Ethernet
3.2.1 Setup for the Experiment
3.3 Other High-Speed Protocols

4 Ethernet On Windows Computers
4.1 Hardware and Software Setup
4.1.1 Offloading Checksum Calculations
4.1.2 Increasing the Transfer Buffers
4.1.3 Increasing the Receiver Buffers
4.1.4 Increasing the Frame Size
4.1.5 Control the Interrupt Rate
4.2 Evaluating the Performance
4.2.1 The Measurement Environment
4.3 TCP Specifics
4.3.1 TCP and IP Checksum Offloading
4.3.2 Effects from Interrupt Moderation
4.3.3 Changing the Ethernet Frame Size
4.3.4 Variable Buffer Size
4.4 TCP Evaluation and Summary
4.5 UDP Specifics
4.5.1 Interrupt Moderation Effects
4.5.2 Buffer Size Exploration
4.5.3 Does Frame Size Affect UDP Performance?
4.6 Analysis of UDP Performance
4.7 Summary of Ethernet Performance
4.8 Which Settings to Choose

5 Creating a Link Port IP Block
5.1 Link Port Implementation Idea
5.1.1 Key Coding Considerations
5.2 Link Port Transmitter
5.2.1 Transmitter Clocking
5.2.2 Transmitter State Machine
5.2.3 Transmitter LVDS Outputs
5.2.4 The Data Path and Memory Design
5.2.5 Controlling the Transmitter
5.2.6 Checksum Calculator
5.2.7 The Implementation of Block Complete
5.3 Link Port Receiver
5.3.1 Receiver Finite State Machine
5.3.2 Controlling the Receiver
5.3.3 The Deserialisation of Incoming Data
5.3.4 Receiver LVDS Inputs
5.3.5 Getting the Receiver Through Timing
5.4 Testing and Verification
5.5 IP Block Restrictions
5.6 IP Block Metrics
5.7 Link Port Implementation Time
5.8 This Link Port Implementation Contributions
5.9 Comments and Analysis of the Link Port IP Block

6 Comparison of Communication Techniques
6.1 Hard facts
6.2 Making a Choice

7 Goal Follow Up and Conclusions

8 Future Work

Bibliography

III Appendices

A Abbreviations

B A Selection of Used Xilinx Primitives

C Selection of Needed Constraints

D The OSI Model
D.1 Physical Layer
D.2 Data Link Layer
D.3 Network Layer
D.4 Transport Layer
D.5 Session Layer
D.6 Presentation Layer
D.7 Application Layer

E PCI Express
E.1 Associated Overhead

F Gigabit Ethernet
F.1 Real-Time Ethernet
F.2 Efficiency of Gigabit Ethernet

G TCP/IP Protocol Suite
G.1 The Internet Protocol Version 4
G.1.1 Efficiency of the Internet Protocol Datagrams
G.2 The User Datagram Protocol
G.3 The Transmission Control Protocol
G.3.1 Socket Buffer Size
G.3.2 Different TCP Implementations
G.3.3 TCP Offload Engine
G.3.4 RDMA-Enhanced TCP Decoding
G.3.5 TCP Efficiency Over Ethernet

H Link Port for TS20X-Series
H.1 Performance of Link Ports
H.2 Uses of Link Ports

I RapidIO
I.1 The Logical Layer
I.2 Transaction Layer
I.3 Physical Layers
I.3.1 Serial RapidIO
I.3.2 Parallel RapidIO

J USB

K Infiniband

L 8B/10B Encoding

M Case Study: The ATLAS TDAQ System
M.1 The Communication Protocols in ATLAS
M.2 The Physical Interconnects and Software of ATLAS TDAQ

List of Figures

2.1 Radar PPI
2.2 Example of partitioned radar system
2.3 Example data processing flow
2.4 Example Radar System
2.5 Differential Signalling
2.6 Multi-Gigabit Transceiver placements
2.7 Link Port Back-to-back Transmissions
2.8 Link Port Checksum Transmission
2.9 Link Port Start and Stop of Transmission

4.1 Flowchart description of Ethernet measurement
4.2 TCP Checksum Offloading Effects
4.3 More Checksum Offloading Examples
4.4 Interrupt Moderation Effects On TCP Performance
4.5 Throughput of 4088 B Jumbo Frames
4.6 Throughput of 9018 B Jumbo Frames
4.7 TCP Throughput With Variable Sender Buffer Size
4.8 TCP Throughput With Variable Receive Buffer Size
4.9 UDP Performance With Varying Interrupt Moderation
4.10 Packet Loss With Different Interrupt Moderation Settings
4.11 UDP Throughput With Variable Buffer Size
4.12 Packet Loss For UDP with Variable Buffer Size
4.13 UDP Throughput at Different Frame Sizes
4.14 Comparing Received and Sent Bytes per Second for UDP
4.15 Linear Approximation of Measured UDP Throughput

5.1 Original Link Port Receiver
5.2 Original Link Port Transmitter
5.3 Link Port Transmitter Block Diagram
5.4 Transmitter Clocking Relationships
5.5 Transmitter FSM Chart
5.6 Link Port Transmitter Enable Schematic
5.7 Output Clock of Link Port Transmitter
5.8 Writable Control Registers
5.9 Readable Status Register
5.10 Link Port Receiver Block Schematic
5.11 Receiver FSM Flowchart
5.12 Link Port receiver timing start
5.13 Link Port receiver timing end
5.14 Link Port receiver timing with CoreGenerator
5.15 Link Port receiver first schematic
5.16 Receiver LVDS Inputs
5.17 Input Clocking of Link Port Receiver
5.18 Input Logic With Clocking Net Shown
5.19 Link Port Receiver Clock Crossing

D.1 OSI reference model

E.1 PCI Express packet

F.1 An overview of the layers in Gigabit Ethernet
F.2 Ethernet MAC Frame
F.3 Theoretical Ethernet Throughput

G.1 An IPv4 Packet Header with the optional options field following it
G.2 IP over Ethernet maximum throughput
G.3 UDP Packet Outline
G.4 UDP over Ethernet theoretical throughput
G.5 TCP Packet Outline
G.6 TCP over Ethernet theoretical throughput

I.1 RapidIO to OSI Mapping
I.2 The layout of a serial RapidIO packet of arbitrary size. The pink is the logical layer, the light gray is the transport layer and the blue is the physical layer. All sizes are in bits unless otherwise specified.

M.1 The concept of the original ATLAS network. It is split in two separate sub-networks, where one computes application one and the other computes application two.
M.2 ATLAS Split Network

List of Tables

1 Definitions
2 Special Text Decorations
2.1 Link Port Input/Outputs
4.1 Computer Setup in Ethernet Test
5.1 Resource usage for IP blocks
5.2 Time Spent On IP Block Creation
6.1 IP Block Resources Comparison
B.1 Summary of Xilinx Primitives

Listings

4.1 Ethernet Setup Message
4.2 Client Program Pseudocode
C.1 Multicycle Checksum Constraints
C.2 Link Port Input Constraints

Definitions

Byte: Eight bits, equal to an Octet
Half Word: Two Octets (16 bits)
Octet: Eight bits, equal to a Byte
Packet: A unit of transmission of a protocol
Quad Word: Four Words (128 bits)
Quartet: Four bits
Word: Four Octets (32 bits)

Table 1. A table of some common definitions that will be used throughout this report.

Some definitions will be used throughout the report; they are specified in Table 1. There are also some special text decorations in use, which are specified in Table 2.

PRIMITIVES: Primitives are written with capital letters in a typewriter font.
Signal: Signals are written in bold letters.
1 and 0: Logical one and zero are written in typewriter font as 0 and 1.

Table 2. A table summarising the font decorations of special words.


Part I

Prelude


Chapter 1

Introduction

The industrial revolution created something humans had never dreamt of before: standards for components. This has arguably been a great improvement, and it made the technical revolution of the last century possible. However, it has not applied to everything. In the computer world a lot of things are standardised, especially in the personal computer domain, but on the industrial side there are more non-standardised solutions. This lack of standards increases time to market, because custom communication solutions have to be created and thoroughly tested before shipping the product.

In the light of harder competition, and the need to sell products that are already developed rather than concepts that have to be developed after purchase, the use of pretested and verified techniques is inevitable. In many areas commodity standards are already in use, e.g. for processors, memory modules etc.; in communications, however, this transition is still taking place.

For these reasons a comprehensive overview of the communication standards available for large-scale embedded systems is needed, and that is why this work was initiated. The main target application is embedded radar systems of varying sizes, with applications in civil security as well as in the military field.

1.1 Purpose

The purpose of this work is to evaluate the effect of different communication links in embedded systems for radar applications. The type of communication is divided into three categories:

• Inter-chip communication. The communication between chips on the same printed circuit board (PCB). Examples are RocketIO, which are Multi-Gigabit Transceivers (MGTs) on Xilinx FPGAs, and Link Port, which is a Low Voltage Differential Signalling (LVDS) communication protocol by Analog Devices.


• Inter-board communication. The communication between different PCBs in the same system. Examples are RocketIO, which is a Multi-Gigabit Transceiver (MGT) on Xilinx FPGAs [1], and Link Port, which is a Low Voltage Differential Signalling (LVDS) communication protocol.

• System-to-host communication. The communication between the host and the rest of the system. Examples are Gigabit Ethernet (GbE), USB 3.0, Thunderbolt, Infiniband (IB) and possibly some other techniques.

Some of these techniques will be studied separately, since they all pose different demands on the communication in terms of reliability, throughput and latency. Some of them will only be compared theoretically, while others are tested or simulated in order to measure their performance.

1.2 Goals

The goals of this work are to:

• Create a VHDL implementation of a Link Port [2] for a TigerSHARC (TS20X) processor, to provide a communication interface between an FPGA and a DSP. This model should be verified for functionality through simulation and, in that simulation, tested for maximum throughput, latency, area and power.

• Investigate how the transfer speed of the TCP and UDP protocols over GbE between two units is affected by altering the maximum payload of the Ethernet frame (jumbo frames), as well as the buffer sizes and the interrupt settings of the network cards. From the results, draw conclusions on how to best utilise GbE in embedded system design.

• Collect research results regarding some high-speed protocols supported by the multi-gigabit transceivers (MGTs) inside a Xilinx FPGA, as well as USB 3.0 and Thunderbolt. Compare the protocols to show what the benefits and drawbacks of each protocol are, and give recommendations on when to use which protocol.

• Examine the latest research results to try to predict the future standards and trends of digital communication within embedded systems.

1.3 Motivations for This Work

The Gigabit Ethernet part of this work will focus on the TCP/IP protocols over Ethernet. In contrast to most previous work, which has been done in Linux environments, this study will look into the TCP/IP protocols in a Windows environment and how to optimise the networking performance. Furthermore, this work tries to specify

how a certain traffic pattern associated with radar signal processing will affect the performance of such interconnected machines, instead of optimising for an arbitrary traffic pattern.

For the Link Port implementation, this work will contribute an FPGA interface that communicates at gigabit speeds with DSPs in a cluster. Furthermore, if this implementation is successful, it could also be used for lower-end FPGA-to-FPGA communication, bringing this rather low-speed communication to FPGAs without any gigabit transceivers. This work will then be compared to other, standardised communication techniques in order to examine which method is the most beneficial.

The studies concerning other protocols will be beneficial when selecting which communication standard or standards to implement in which link when constructing a high-end communications network for radar data processing. Since every single technique has different characteristics and is optimised for different traffic patterns, several different communication techniques may be chosen in order to best serve the traffic pattern of the selected application. In this part, the evaluation of future communication techniques will also be included to some extent, since the future standards are the most recent research trends.

1.4 Limitations for This Work

The aim of this work is to examine how different techniques and protocols are best utilised and to give some guidelines on which to choose when implementing a system. However, it is beyond the scope of this work to set up environments to actually test all of these protocols. The aim is to take a theoretical approach and evolve it into guidelines for how to select the most appropriate communication protocol. Two techniques will be studied in more depth: the Link Port, and TCP and UDP over GbE. The Link Port protocol will be examined by creating an IP block, in order to simulate and measure its characteristics. The study of TCP and UDP over GbE will consist of evaluating the achievable throughput over GbE lines while using COTS components.

1.5 Layout for the Report

In this chapter the topic is introduced and the purpose of this work is explained. It covers some details about radar systems which may be superfluous; however, a concept system is introduced and looked at. Chapter 2 attempts to lay some groundwork for readers on this subject of communications, with a focus on radar and embedded systems. It also tries to summarise some of the contemporary research made in these areas. By reading chapter 3, readers will get an explanation of which methods have been used in this project to reach the goals.


Chapter 4 presents how Ethernet was tested, as well as the results, the conclusions made and an analysis of the reached results. Chapter 5 explains the creation of the Link Port IP blocks and their evaluation; all metrics are presented and all parts of the IP blocks are specified. Chapter 6 compares the different techniques that have been evaluated, not only the two which were tested but also some which have only been studied in theory. In chapter 7 some final conclusions are drawn regarding the work that has been carried out. Finally, chapter 8 suggests improvements and future work which has to be done in order to straighten out some of the question marks raised in this work. As an aid to the reader, all (or most) of the abbreviations used are listed in Appendix A, and all of the Xilinx primitives used to describe the FPGA part are listed in Appendix B. The appendices also present some in-depth material in certain areas for readers who wish to have a deeper understanding of the subject, even though that content is in no way necessary for the results.

Chapter 2

Background and Related Work

In this chapter, the focus is to introduce some concepts of radar and computer communication. It starts with some history on the subject of radar, and then moves on to an example radar system setup, discussing the system and its data flow. It then describes which techniques can be used and what is currently used. Finally, the chapter finishes with a walk-through of some comparisons of common protocols in systems with similar specifications as the radar systems.

2.1 History of Radar Systems

The development of radar began when Heinrich Hertz in the late 19th century verified a prediction of Maxwell's electromagnetic field theory [3]. When doing so, he used an apparatus with functionality resembling pulsed radar. This work was later continued and built upon by Hülsmeyer, who created a radar that he wanted to mount on ships in order to monitor other ships and thus avoid collisions at sea.

During the Second World War, a lot of radar development was carried out [3]. All participating forces developed their own radars, including the forces from America, Great Britain, Germany, the Soviet Union, Italy, Japan, France and the Netherlands. The radars they developed were both land-based and ship-borne; some with long range and others with shorter range, but their main task was to search the airspace. After the Second World War, the Moving Target Indicator (MTI) was invented to find moving objects when analysing the radar echo. In order to find the moving objects, the Doppler effect was exploited. Further on in radar development, radars have travelled into space for surveillance of our planet and the exploration of the universe [3].

Another application, first theorised in 1951 but long used only sparsely to its full extent, is SAR (Synthetic Aperture Radar). This technology has been difficult to realise in real time due to the large amounts of data that need to be processed continuously. But thanks to the ever-shrinking size and increasing performance

of microcontrollers and integrated circuits, more and more SAR systems are seen today. For example, modern aircraft carry SAR systems in order to map the surrounding terrain [4].

2.1.1 Radar Construction Basics

Historically, when building a radar system, all components were custom-built; but in recent years, when prices have started to fall and cost savings are a reality for developers, many radar systems are made from COTS (Commercial Off-The-Shelf) components [3]. Still, however, the front end with antenna, transmitter and receiver is created specifically for each kind of radar. The changes have mostly been further back in the data processing line, in the signal processing and detection parts (see Figure 2.4).

The signal processing in a radar system mostly operates on the I and Q components of the received signal, i.e. its real and imaginary components [3]. The objective of the processing is to remove clutter as well as unwanted noise and jamming signals. For removing clutter from the incoming signal, the signal processing applies different filters to extract data from it, e.g. MTI (Moving Target Indicator) and MTD (Moving Target Detection) to detect non-stationary objects in range. Here, the trend has been to move from very specialised hardware to COTS components.
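As an illustration of the kind of filtering involved (a textbook example, not a result from this work), the simplest MTI filter is a single delay-line canceller operating on the complex I/Q samples from consecutive pulses at a fixed range bin:

\[
  x[n] = I[n] + jQ[n], \qquad y[n] = x[n] - x[n-1]
\]

An echo from a stationary object has the same phase from pulse to pulse, so it cancels in the subtraction, while an echo from a moving object is Doppler-shifted, its phase rotates between pulses, and it passes through the filter.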

Figure 2.1. An example of a PPI radar image, common in surveillance radars. Used with permission of Christian Wolff © at www.radartutorial.eu

After the filters have been applied to the signal, often several detections are recorded, and they need additional filtering to understand how many real objects have been detected and their positions [3]. When the number and locations of targets are determined, the data may be displayed on a monitor in the shape of a plan position indicator (PPI, see Figure 2.1), which is a common display

used in surveillance radar applications, which indicates the underlying terrain. On this, the objects detected by the different filters are displayed, together with some additional information, such as an aircraft's calling code if it is a friendly aircraft in a military system. This is of course only one example of what a radar can do, since there are several other uses as well. In some applications a monitor showing data is unnecessary, e.g. in traffic cameras measuring speed. In such an application it is sufficient to determine the speed of moving objects in order to decide whether to photograph them or not.

2.1.2 A Probable Future

The era of custom-built components is coming to an end for most radar systems [3]. Instead, new systems have to be cheap enough and have a short time to market. To enable this, the use of COTS components is crucial, and thus the future designers of such systems need to be able to choose the correct components to build these high-end systems. In those systems, data transfers will be of crucial importance, since large quantities of data need to be transferred quickly. This will be the main focus of this report: the data links connecting the COTS components.

2.2 An Example System

For purposes of discussion further on in this report, an example of a radar system will be shown. The system is a pipelined system, with different components doing their specific tasks all the time. The problem in such a system is that data has to be transferred between the different stages of the pipeline, and in the case of a radar system, the amounts of data are large. To address this problem there is a substantial need for high-throughput, low-latency links.

2.2.1 Conceptual Radar System

A very basic diagram of a radar system is shown in Figure 2.2, where the basic macro-components and their interconnects are visible. A brief summary of its operation is that an operator watches the radar screen with a PPI image on it; the operator is also able to set some parameters of the system in order to control what output the system gives and its responsiveness. The transmitter sends a signal to the antenna, which transmits that signal, and the receiver then receives the echo of that radar signal. The received echo is then sent into the signal processing of the radar system.

When data arrives at the signal processing, it is often raw data in large amounts. Signal processing and detection may be looked at as one step, since both involve computations on the raw radar data in an often sequential manner.

Figure 2.2. An example radar system partitioning where the major parts are shown: Transmitter, Duplexer, Antenna, Receiver, Signal processing, Detector, Video and User, connected by signal and control paths.

The data that arrives has to be moved down the signal processing and detection system in a timely manner, and for radar applications there are often hard real-time deadlines that need to be met. This is basically because new data continuously arrives from the next scan, and all data needs processing in order to create the radar images, target indication, target tracking, ground maps etc. [4]. These are techniques that require the movement of very large quantities of data [5], all with hard real-time requirements, even though the transfer characteristics of different kinds of algorithms may differ a lot [5].

Since this work targets the data transfers from the point where data enters the front end of the signal processing until it exits the back end, a deeper understanding of these data transfers will be sought in the next subsection.

2.2.2 Data Transfers in the Conceptual Radar System

Figure 2.3. Example processing flow where the input data passes through five filters on its way towards the output. It is visible that filters 3 and 4 may run in parallel, but apart from that the processing needs to be done sequentially.

In order to extract all the valuable information from the received radar data the data needs to be processed. The processing consists of a number of filters, such as FIR filters, FFTs (Fast Fourier Transforms) and other digital filters with

specific purposes [5]. Since the processing is in many ways a series of sequential calculations [5] (see Figure 2.3), where each calculation in most cases needs to be completed prior to the start of the next, there are some ways to cope with this. The two most obvious solutions are to either pipeline the processing, so that the later filters do not have to wait for data except in the start-up phase, or to use processors powerful enough to complete all processing between the arrival times of two consecutive datasets. The pipelining solution provides higher latency of the operation, probably with the benefit of less strict timing restrictions on each filter. If the latency is low enough when pipelining, it is a feasible solution. If it is not, then a more powerful processing solution must be created.

Some exploration into this subject is presented in Bueno et al. [6] and in Bueno, Conger and George [5], where they try to implement a space-based radar system for Synthetic Aperture Radar (SAR) and Ground Moving Target Indicator (GMTI). In order to do so, a network of processing cards is implemented, where the network should be fast enough to finish the processing of the data between two data arrivals. They also experiment with a pipelined solution where new data is fed into the system while computing results from the old data, in order to improve performance. Their system has high demands on throughput inside the network for efficient partitioning of data. The problem with transferring dataset N+1 into the system while calculating set N is that the interconnection network may be congested with data, thus lowering the performance of the N:th calculation. However, they found that this pipelining was beneficial for the performance of the GMTI algorithm, while it was more difficult when implementing SAR.

In solving the problem with data transfers when implementing the radar system ([5] and [6]), a Parallel RapidIO solution was chosen (see Appendix I for an overview of RapidIO). By interconnecting the units with Parallel RapidIO, an FPGA-supported industrial standard was chosen, which may deliver data rates over serial lines from one up to five gigabits per second [7]. The benefit of using an industrial interconnect with an open standard such as RapidIO is that there is no single-vendor dependency: as long as the components support the standard, they may be connected to each other.

An alternative approach to having a network calculate every sample in between data releases is to pipeline the events. This is done in [8], where a hardware SAR processor is built which pipelines the calculations into several parts, naturally improving throughput. The downside of pipelining is the increased latency associated with it, since everything cannot proceed at full speed. However, if implemented in a smart manner, the penalties of pipelining might be very small or in some cases even negligible, and pipelining may increase throughput without increasing latency so much that it violates the timing deadlines.

In Figure 2.4, an example of a pipelined signal processing and detection system in a radar is shown, consisting of four stages. In the first stage, data is read into the system from a number of Analog-to-Digital Converters (ADCs).


Figure 2.4. A layout of an example radar signal processing system. The letters i, M, N are arbitrarily chosen to make a scalable system with correct characteristics.

The data is then passed on to front ends, which initiate the calculations and probably try to reduce the dataset and remove unnecessary data before moving it down the line. In the back end, which is the third pipeline stage, the data is further processed in order to extract the wanted information, e.g. moving targets in an MTI or GMTI. This data is finally sent to some video processing for visualisation, and possibly to some data storage for off-line processing at a later stage.

If comparing Figure 2.3 with Figure 2.4, we may map the filters in Figure 2.3 to the stages in Figure 2.4: filter 1 would go into the ADC step, filter 2 into the front-end signal processor, filters 3 and 4 into the back-end signal processor, whereas filter 5 is implemented in the video processing processor. By partitioning the system like this, often reducing the amount of data between steps, an effective pipeline is created. However, between the components there are lots of data transfers that need to take place, and they often need to do so in very short time. This puts requirements on the systems to have high throughput in order to provide sufficient data transfer capability. There are several techniques for transferring data, but the question is which components are to be used, and how.
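The latency/throughput trade-off of such a pipeline can be made concrete with a simplified model (an illustration only, not a model taken from the cited works). For a pipeline with stages k = 1, ..., K, where stage k takes T_k seconds per dataset and new datasets arrive every T_in seconds:

\[
  \text{latency} = \sum_{k=1}^{K} T_k, \qquad
  \text{throughput} = \frac{1}{\max_k T_k}
\]

The pipeline keeps up with the input as long as \(\max_k T_k \le T_{in}\), even though each result is delayed by the full sum of the stage times plus the inter-stage transfer times; this is exactly why the links between the stages must be fast.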

2.3 A Background to Physical Signalling

First, an introduction to different physical signalling techniques will be presented. In the field of computer communications, as in all other computer fields, there is a desire to increase the speed and throughput of data.

Figure 2.5. The idea of differential signalling is to take one input and then create both that signal and its complement and send both. By doing this, noise immunity improves and emitted noise is lowered; furthermore, the voltage swing of the difference between V+ and V- is double their individual swings.

In the past, systems were often made of chips interconnected with multi-drop buses. The solution when more data had to be sent was then to either increase the bus width or the bus frequency, thus increasing the total throughput of the system. Recently the trend has changed. The problems associated with multi-drop buses are, amongst others, skew and increased pin count. This becomes a problem when frequency is increased along with the width, causing serious routing problems on the boards. These problems have driven the trend towards high-speed serial communications. The benefit of serial links is that little or no skew occurs, depending on the layout of the serial bus. If the clock is embedded into the data stream, the data transfer may happen over only a single line; since it is only one line, skew is zero. The serial link also has the advantage of using fewer pins, giving a smaller footprint in the I/O of a design.

However, many of the high-speed links use differential signalling for their communication, which doubles the number of pins compared to single-ended signals. This might seem like a problem, but in most cases it is not, since differential signalling has other large benefits. The idea is to use two pins instead of one, where one of the pins (p) carries the positive voltage, i.e. it is high for 1 and low for 0. In addition to this pin, its negative complement is put on the n-pin, which has high voltage for 0 and low voltage for logical 1 [9]. By using these differential signals and subtracting them from each other, the difference V+ - V- has double the swing of V+ alone. This means that the voltage swing on each line only needs to be half of what it would have had to be with single-ended signalling; thus the rise and fall times of the differential pins are lowered, enabling higher frequencies [9]. To increase noise immunity, the differential pairs should be routed tightly together, since the electrical field is emitted between the conductors. This way they emit less noise, and they are almost identically affected by external noise. The noise immunity comes from the fact that both lines are affected by the same noise, which cancels in the voltage subtraction, leaving the resulting difference the same.


The only standardised differential signalling technique is Low-Voltage Differential Signalling (LVDS) [9, 10]. This is a very energy-efficient way of signalling, with raw data rates up to 3.125 Gbps. Optional encoding may be included in order to provide good signal integrity; if such encoding is present, the payload rate will be lower than the raw signalling speed. This technique is used in several serial communication standards, even though not all of them use the standardised LVDS signalling. One example is Serial RapidIO [7], which uses LVDS signalling in order to increase data integrity.

The second alternative for differential signalling is Emitter-Coupled Logic, ECL. ECL is the oldest of the differential techniques and is today widely used in different military applications, mainly due to its ability to work in all temperature ranges [9]. The main drawback of ECL is that it operates at negative voltages. This is a problem since it is not common to supply negative voltages to chips, and designers tend to use mostly positive voltages.

The last physical technique is Current-Mode Logic, CML. CML is a kind of ECL, but with some differing characteristics. The biggest difference is the transistor circuitry, which causes CML to have a higher common-mode output voltage [9]. This structure makes CML the fastest choice when creating a differential link, with transfer speeds exceeding the LVDS standard's. However, CML links are restricted in length due to their high transfer speed, and may almost exclusively be used for chip-to-chip communication on a single board. Furthermore, CML is far more power-consuming than LVDS at a given bit rate.

2.4 Multi-Gigabit Transceivers

To enable high bit rates between devices, the shift has been from wide parallel buses to multi-lane independent serial lines with point-to-point connections, as discussed in the previous section. In using these serial connections, circuits need to be able to transfer single bits at several gigabits per second. This requires specially built chips, which could be external to the processing element (PE), as in Figure 2.6 a), or an integral part of an FPGA or embedded processor of some sort, as in Figure 2.6 b). These components are sometimes referred to as Multi-Gigabit Transceivers, or MGTs, and are used to generate very high-speed signals.

Figure 2.6 b) shows a typical layout of an FPGA with embedded processing elements and integrated MGTs. This is the implementation used by both Altera and Xilinx, the two major FPGA producers. By integrating these MGTs into different IP cores supplied with the FPGAs, the system developer has most of the common high-speed serial communication protocols readily available. Some of the protocols supported by both Altera and Xilinx high-end FPGAs are PCI Express, Serial RapidIO, XAUI (a part of the 10GbE standard) and SATA, but many more are supported [11, 12].

The ability to use these standards ensures high transfer speeds between chips, boards and chassis. Their availability inside FPGAs makes it a lot easier for developers to use these high-speed interconnect technologies, compared to having to add an external card which handles the serial transfer, since everything is handled on-chip and thus may be thoroughly tested and simulated inside the FPGA development environment.


Figure 2.6. Different placements of MGTs, either as a separate chip, as in a), or as an integral part of e.g. an FPGA, as in b). In both cases the processing element (PE) connects to the MGT over a wide, slower bus, while the MGT drives a narrow multi-gigabit serial line.

2.5 The Link Port Protocol

First out of the studied protocols is the Link Port protocol, since it is a key part of the project. The Link Port protocol exists in several versions, but the one looked at here is the protocol for the TigerSHARC TS20x series, specified in [13]. The Link Port is specified as a differential data bus and an associated source-synchronous clock, with two additional control signals: an acknowledgement and a block complete. The idea of the Link Port protocol is for the TigerSHARC DSP to be able to interface to other components through a multi-gigabit-per-second serial interface.

2.5.1 Some Link Port Characteristics

The Link Port protocol exists in many versions, for different processor families from Analog Devices. The protocol of interest here is the one targeting the TigerSHARC 20x processor series. The Link Port is a DDR (double data rate) protocol, which means that two data items are presented each clock cycle, one on each clock edge. The protocol targets point-to-point connections only, meaning that for each link there can be only one sender and one receiver. Several sender and receiver circuits can coexist on the same device, however, enabling the creation of processing clusters with many devices.

The Link Ports have a specific set of ports, listed in Table 2.1. Of these four ports, the two most crucial and fast-switching are differential, while the two control signals (Ack and n_BCMP) are normal single-ended signals.

The start-up of a Link Port transmission is done by the transmitter setting n_BCMP to logic 1 (deasserting it).


Port Name   Width (Data/Physical)   Description
Data        4/8                     Four differential data pairs. Outputs from the sender.
Clk         1/2                     Differential clock pair clocking the data. Outputs from the sender.
Ack         1/1                     Acknowledgement sent by the receiver, indicating that it may receive data. Output from the receiver.
n_BCMP      1/1                     Block Complete, used to signal the last quad-word of a transmission and to set up the link after reset. Output from the sender.

Table 2.1. The inputs and outputs of the Link Port. Their respective transfer origin is specified in the table; naturally, each signal is an input at the side which does not drive it.

Then the receiver knows that the transmitter is present, and it may indicate the possibility to receive data by asserting Ack. Data may then be transmitted until Ack is deasserted again.

Due to the inner workings of the TigerSHARC processor, whose data bus is 128 bits wide, that is also the transmission unit size of the Link Port. This means that data is sent in chunks of 128 bits, with a checksum option available which sends an additional 16 bits for increased data integrity. The checksum itself is one byte long and is sent after the data; after the checksum byte, a dummy byte is also sent before the transfer of the next data begins. An example of the end of a transmission with checksum enabled is presented in Figure 2.8.

The Link Port protocol specifies a discontinuous clock for clocking the data. The first input data is to be clocked in at the first rising edge of the input clock, and the last received data arrives at the last falling edge of the clock. Also, the clock output is low when no data transmission is currently happening. The start and end of a transmission are shown in Figure 2.9, where we see that the clock is driven low when no transaction takes place. When transmitting more than a single quad-word there is no need for a gap between the words, and the next transmission starts on the rising edge following the last falling edge of the previous clock, see Figure 2.7.


Figure 2.7. The transmission of two back-to-back quad-words has no gap between them.

The clock signal of the Link Port protocol may be clocked at up to 500 MHz, and the data arrives at DDR on a 4-bit-wide bus. This means that it can receive 1 byte per clock cycle, or 500 MB per second. The unit of transfer is either 128 or 144 bits, depending on whether the transmission has the checksum enabled or not.



Figure 2.8. The end of a transmission with the checksum option enabled.


Figure 2.9. The start and end of a transmission, showing the discontinuous clock at both times.

This means that transfers of small quantities of data are rather inefficient, but if lots of data are sent, the throughput is either the full 500 MB/s or, if the checksum is enabled,

\[
  \frac{128}{144} \cdot 500\,\text{MB/s} \approx 444\,\text{MB/s}.
\]

For some more information on the Link Port protocol, see Appendix H, or read on to chapter 5 where details will be explained along the way.
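To summarise the framing described above, the following C sketch models how one quad-word would appear as 4-bit nibbles on the data pins. It is a behavioural illustration only: the nibble ordering and the checksum algorithm are assumptions (the real definitions are in the Analog Devices specification [13]); a simple byte-wise sum is used here as a stand-in.

    #include <stdint.h>
    #include <stdio.h>

    /* Behavioural sketch of Link Port framing (illustration only).
     * A quad-word is 128 bits = 16 bytes = 32 nibbles on the 4-bit DDR bus.
     * With the checksum option, one checksum byte plus one dummy byte follow,
     * i.e. 144 bits in total per transmission unit.
     * ASSUMPTIONS: nibble order (low nibble first) and the checksum function
     * (byte-wise sum modulo 256) are placeholders, not taken from the spec. */

    #define QW_BYTES 16

    static uint8_t checksum(const uint8_t *qw)    /* placeholder algorithm */
    {
        unsigned sum = 0;
        for (int i = 0; i < QW_BYTES; i++)
            sum += qw[i];
        return (uint8_t)(sum & 0xFF);
    }

    /* Emit the nibble stream for one quad-word; out must hold 36 nibbles.
     * Two nibbles are transferred per clock cycle (DDR). */
    static int serialise_qw(const uint8_t *qw, int with_csum, uint8_t *out)
    {
        int n = 0;
        for (int i = 0; i < QW_BYTES; i++) {      /* 32 data nibbles */
            out[n++] = qw[i] & 0x0F;              /* assumed: low nibble first */
            out[n++] = qw[i] >> 4;
        }
        if (with_csum) {
            uint8_t c = checksum(qw);
            out[n++] = c & 0x0F; out[n++] = c >> 4;   /* checksum byte */
            out[n++] = 0;        out[n++] = 0;        /* dummy byte */
        }
        return n;  /* 32 nibbles (16 clocks) or 36 nibbles (18 clocks) */
    }

    int main(void)
    {
        uint8_t qw[QW_BYTES] = {0}, nibbles[36];
        int n = serialise_qw(qw, 1, nibbles);
        /* At 500 MHz DDR this takes n/2 clock cycles per quad-word. */
        printf("%d nibbles, %d clock cycles\n", n, n / 2);
        return 0;
    }

With these numbers the efficiency figure above follows directly: 32 of the 36 nibbles carry payload, i.e. 128/144 of the raw 500 MB/s.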

2.5.2 Previous Work on Link Ports

In the area of Link Ports, comparative studies are less common than for standard protocols. However, this does not mean that the performance of Link Ports has not been tested. The Link Port is a point-to-point technology which connects two units (DSPs) for inter-chip communication. The fact that they are associated with DSPs means that Link Ports are often used in computationally intensive applications, e.g. radar and image processing. The use of Link Ports in the literature often concerns the creation of real-time processing systems. In both [14] and [15], Link Ports are used to create pipelined radar processing systems. The first work uses a pipelined version of the radar system, where several DSPs have different tasks and data is passed down through the pipeline. The other approach is more of a brute-force one, where the nodes in a cluster of DSPs are connected to an FPGA and communicate through the Link Port protocol.

A third work which uses Link Ports is presented in [16]. This design uses Link Ports to interconnect DSP clusters with each other and with FPGAs, as well as to interconnect several FPGAs. The article addresses some design problems when creating FPGA Link Ports, such as receiver clocking and input design. The biggest design challenge in their Link Port design was the receiver input clocking. In their Virtex 5 FPGA, they used a global clock buffer to gain an equal clock delay to all the input clocking components. They also used a number of primitives similar to ISERDES to deserialise the incoming data.


2.6 Communication Protocols

After summarising the most common differential signalling techniques and how they are implemented in FPGA solutions, a summary of different communication protocols will now be presented. Their common characteristic is, in some sense, that all of the protocols specify everything from a physical layer upwards to a layer where application data may be transferred.

RapidIO One common feature of all the protocols that will be studied is that they are almost exclusively serial in their nature, with one exception. The RapidIO link [7] has both a parallel and a serial physical interface. RapidIO is a fairly new communication protocol which targets embedded high-end systems with requirements on high transfer speeds [9] and high connectivity. Since it provides both serial and parallel interfaces, and a switch should be able to handle both parallel and serial modes [7], high interoperability is possible to achieve. The serial RapidIO links operate at effective speeds ranging from one to five Gbps. On top of that, there is an overhead of 12 to 20 bytes for payloads of up to 256 bytes, giving a maximum effective throughput of 95.5% of the link speed. RapidIO is further described in Appendix I.
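The 95.5% figure follows directly from the numbers above, at the maximum payload with the minimum overhead:

\[
  \eta_{\max} = \frac{256}{256 + 12} \approx 95.5\%,
  \qquad
  \eta_{\min} = \frac{256}{256 + 20} \approx 92.8\%
\]

where the second figure, with the full 20-byte overhead, matches the maximum theoretical utilisation of just over 92% quoted for the SRIO measurements in section 2.7.2.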

PCI Express Another technology which is of interest to this study is PCI Express (PCIe) [17], which is explored in depth in Appendix E. PCI Express is the evolution of the legacy PCI bus from a parallel, multi-drop bus to a serial, point-to-point topology. PCI Express has the disadvantage that it has to be backwards compatible with PCI, to maintain operability with older operating systems. However, if implemented in a completely new system, it has potential as an embedded interconnect with low overhead and low latency. Given the correct conditions, it may achieve a 99.5% efficiency over its links (Appendix E).

Ethernet One of the most common interconnect technologies today is Ethernet (see Appendix F); almost every new PC is sold with an Ethernet interface. Ethernet is an old interconnect network which has seen many improvements since it was first introduced. As of now, Ethernet has evolved from a half-duplex, sub-10 Mbps system to a full-duplex 100 Gbps system. This makes Ethernet one of the most popular networks in existence today, and it is the interconnect technology in several high-performance computers [18]. The main advantages of Ethernet are its low price/performance ratio and the number of people with knowledge about it.

TCP/IP Ethernet, however, is only standardised up to Layer 2 in the OSI model (Appendix D), and above that other protocols are commonly implemented in order to ensure reliability. The most well-known protocol framework is the TCP/IP suite. In this suite several protocols are fitted, including TCP, UDP and IP, all

explained in depth in Appendix G. Together these serve as the backbone of the most well-known network of all, the Internet. There are a lot of different protocols in the suite, but the most well-known are TCP, which is a reliable end-to-end protocol that guarantees delivery; UDP, which is a connectionless protocol with very little overhead; and IP, which takes care of the routing of both TCP and UDP packets. TCP guarantees delivery of packets in sequential order [19], which might be very beneficial. The TCP protocol has little overhead in bytes; however, there might be a larger processing overhead in providing its guarantees.
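As a minimal illustration of the difference in programming model between the two transport protocols (a generic Winsock sketch, not the measurement program used later in this work; the address and port are placeholders):

    /* Minimal Winsock sketch: sends one buffer to 192.0.2.1:5000 over TCP,
     * then one datagram over UDP. Error handling is reduced to a bare
     * minimum. Link with ws2_32.lib. */
    #include <winsock2.h>
    #include <stdio.h>

    int main(void)
    {
        WSADATA wsa;
        char buf[1460] = {0};                 /* one MSS-sized payload */
        struct sockaddr_in dst = {0};

        if (WSAStartup(MAKEWORD(2, 2), &wsa) != 0) return 1;
        dst.sin_family = AF_INET;
        dst.sin_port   = htons(5000);         /* hypothetical test port */
        dst.sin_addr.s_addr = inet_addr("192.0.2.1");

        /* TCP: connection set-up, then a reliable, ordered byte stream. */
        SOCKET t = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
        if (connect(t, (struct sockaddr *)&dst, sizeof dst) == 0)
            send(t, buf, sizeof buf, 0);
        closesocket(t);

        /* UDP: no connection, no delivery guarantee, one datagram per call. */
        SOCKET u = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
        sendto(u, buf, sizeof buf, 0, (struct sockaddr *)&dst, sizeof dst);
        closesocket(u);

        WSACleanup();
        return 0;
    }

The point of the sketch is the asymmetry: the UDP path has no handshake and no retransmission machinery, which is why its per-byte overhead is lower even though, as chapter 4 will show, that does not automatically translate into higher throughput on Windows.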

Infiniband The other popular interconnect for high-performance computers, besides Ethernet, is Infiniband [18]. Infiniband is less mainstream than Ethernet, but delivers higher performance in aspects such as latency. Furthermore, it has several different transmission techniques, in some sense like the TCP/IP suite. The bandwidth of Infiniband is scalable, and it is possible to scale up by increasing the number of parallel lanes. A more in-depth explanation exists in Appendix K.

USB A very common and mainstream interconnect is USB [20, 21]. It has been released in three specifications, each of which has increased the bandwidth by a great margin. The current USB 3.0 standard specifies a full-duplex, 5 Gbps connection. However, USB is not as easy to interpret in terms of communication speed as several other protocols. It has different transaction types, the ability to reserve bandwidth (up to 80% for USB 3.0) and so forth. This makes the link usable for real-time traffic, but not to a full extent. Furthermore, the specified rate is the raw bit speed, meaning that penalties for encoding need to be accounted for. For a more comprehensive summary of USB, please refer to Appendix J.
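As an example of such an encoding penalty: USB 3.0 uses 8b/10b encoding (see Appendix L), sending 10 line bits for every 8 data bits, so the usable bit rate before any protocol overhead is

\[
  5\,\text{Gbps} \times \frac{8}{10} = 4\,\text{Gbps} = 500\,\text{MB/s}.
\]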

Thunderbolt A more recent technique is Thunderbolt [22], released by Intel, which is a technology that encapsulates both PCI Express and DisplayPort communication in an external cable [22]. The specification for this standard is only available under non-disclosure agreements. The technology supports a bidirectional, full-duplex, 10 Gbps channel for inter-chassis communication. The link itself works with both isochronous data transfers and burst transfers, although the amount of bandwidth which may be reserved is not clear from the source.

2.7 Previous Work on Protocol Comparison

A lot of work has been done to evaluate these different communication methods, and below a summary of that work is attempted. The authors in [23, 24] state that there are three main backbone architectures for embedded systems: Ethernet with TCP or some other upper-level protocol, PCI Express, and RapidIO. These three are considered by them to be the backbone architectures best suited for embedded systems. In addition to those three, Infiniband, USB and the newly developed Thunderbolt will be reviewed.


2.7.1 TCP and UDP Performance Over Ethernet

In [25], one of the few comparisons between Linux and Windows TCP performance is carried out. Furthermore, they investigate how the performance varies with different NICs, different internal bus widths and payloads (MTUs). They show that the performance when a card is installed directly out of the box is often far below its optimum configuration. However, they find it easier to improve performance in a Linux environment than in Windows. Some factors they find can improve performance are an increased MTU, increased socket buffers and a reduced interrupt rate. However, the conclusion is not that increasing everything gives the best benefits; instead, it is that tuning a NIC correctly improves performance the most.

In [26], they also investigate how buffer sizes and MTUs affect the performance over long transmission lines with TCP/IP over 10GbE. They show that increased buffer sizes and MTU sizes increase the performance of the communication when transmitting over long distances.

In [27], the objective is to compare Fedora Linux with Windows XP and Server 2003. The experiment examines the operating systems' ability to forward packets by trying to send as many packets as possible through a PC. This work, however, does not examine how to improve performance, but is only a measurement of how fast the operating systems are at forwarding packets in user and kernel space.

In [28], an attempt is made to monitor the different delays in the Windows and Linux UDP stacks. The study indicates, as the previous studies have also suggested, that the processing time in Linux is shorter than in Windows. However, their tests were conducted using the minimal Ethernet packet size, and thus the overhead would be maximal. In contrast, this study will look at larger packets. The article also discusses some performance enhancements and their impact on real-time behaviour. The setting that limits real-time performance the most in the Windows case is interrupt moderation, i.e. waiting for more packets before issuing an interrupt, thus reducing the number of interrupts. Tests showed that if this setting was configured improperly, the system showed very poor performance in terms of latency.

In addition to these works, a lot of work concerning Ethernet performance was done when building the ATLAS detector at CERN [29]. A thorough examination is available in Appendix M, but a brief summary will be presented here. The decision made when building the data acquisition for ATLAS was to use Ethernet as the backbone communication methodology, to do the real-time filtering of data from approximately 60 TB/s when captured down to 300 MB/s when stored to disk [30]. The initial thought was to go with Fast Ethernet, at 100 Mbps, but as technology evolved the chosen technology became a combination of Gigabit and 10-Gigabit Ethernet [31]. As a communications protocol for ATLAS, TCP/IP was considered, but was later abandoned due to its non-real-time effects [32]. Since the application layer already had timeouts which were much more predictable in ensuring real-time behaviour, the use of TCP was looked at as a performance risk rather than a benefit.


This was mainly due to over-occupation of the buffers, polluting the network with acknowledgements and potentially sending unwanted data. However, for non-real-time data, TCP is looked at as a great option, since with it the application does not itself have to guarantee delivery, as it would have to with raw Ethernet or UDP packets.
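Of the tuning knobs identified above, the socket buffers are the one that can be set from application code; the following hedged Winsock sketch shows how (the 1 MB value is an arbitrary example, and settings such as MTU and interrupt moderation live in the NIC driver, not in the socket API):

    /* Sketch: enlarging the per-socket buffers discussed above (Winsock).
     * The 1 MB figure is an arbitrary example; optimal sizes are exactly
     * what experiments like [25, 26] try to determine. Link with ws2_32.lib. */
    #include <winsock2.h>
    #include <stdio.h>

    static void set_buffers(SOCKET s, int bytes)
    {
        /* SO_SNDBUF / SO_RCVBUF resize the kernel send/receive buffers. */
        setsockopt(s, SOL_SOCKET, SO_SNDBUF, (const char *)&bytes, sizeof bytes);
        setsockopt(s, SOL_SOCKET, SO_RCVBUF, (const char *)&bytes, sizeof bytes);
    }

    int main(void)
    {
        WSADATA wsa;
        if (WSAStartup(MAKEWORD(2, 2), &wsa) != 0) return 1;

        SOCKET s = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
        set_buffers(s, 1 << 20);              /* example: 1 MB each way */

        int got, len = sizeof got;
        getsockopt(s, SOL_SOCKET, SO_RCVBUF, (char *)&got, &len);
        printf("receive buffer is now %d bytes\n", got);

        closesocket(s);
        WSACleanup();
        return 0;
    }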

2.7.2 RapidIO Analysis

RapidIO is a rather new standard with both a serial and a parallel interface; a more thorough explanation may be found in Appendix I. The idea is to present a high-speed interconnect for embedded systems with low overhead and high throughput. In contrast to PCI Express, it does not need to be compatible with old PCI buses, which removes a lot of inherited design constraints [9]. Unlike PCI Express, the main area of use for RapidIO is as an embedded system's interconnect.

Some implementations using RapidIO have been analysed in the research community. One of the most interesting for this thesis work was made at the University of Florida [5, 6], where a distributed signal processing system for radar applications was implemented with parallel RapidIO. In their work they conclude that RapidIO is able to meet the communication demands of their real-time processing.

In [24], claims are made that RapidIO is the best of the most popular embedded interconnect architectures (Ethernet, RapidIO, PCI Express) in the sense that it combines the strengths of the other two in a single solution; for example, only RapidIO offers both unreliable and reliable transactions. This is exemplified in an experiment where a highly interconnected system with very high bandwidth is created in the form of a dual star.

In [33], an application is built on Serial RapidIO (SRIO), but without any performance metrics measured beyond the statement that the system is scalable. In [34], however, measurements are made of the efficiency of SRIO, with results of up to 86% of the link utilised for payload data. It is not entirely clear which settings they used when creating their packets, or whether they used the maximum payload, but it is clear that this is not very far from the maximum theoretical utilisation of just over 92% (see Appendix I and [7] part 6).

Another interesting performance evaluation of RapidIO, the parallel kind, is made in [35]. Here the latency and saturation link utilisation are measured. They conclude that with a single switch between two end nodes, a 64-bit read may be completed in fewer than 100 ns, showing that latency-sensitive data may be sent over a RapidIO link.

2.7.3 PCI Express Evaluation

In [36], a data acquisition system is built into a PC using a PCI Express interface. The study involves measuring the maximum throughput of the link, which in most cases was at 75% of the theoretical maximum, even though their calculations estimated that, with their transfer characteristics, the theoretical maximum would be at 85% (see Appendix E for more details on why 85%). However, their study shows that for 99.998% of the time, throughput was over 50%, and in all cases it exceeded 45%.

In [37], a study is presented where a COTS PCIe-enabled motherboard is used to speed up calculations in a parallel benchmark. They also show differences between PCIe and PCI-X, where PCIe has several advantages in terms of lower latency and higher bandwidth.

2.7.4 USB Experiments

A pseudo-real-time USB application is created in [38] for a Windows 7 environment. This work only uses a full-speed (12 Mbps) USB connection, but it does achieve timely behaviour in the sense that one read is performed exactly every 10 ms, which is their deadline, thus demonstrating the possibility of using USB for real-time applications. In [39], an FPGA implementation of a USB device is created with a slower and a faster mode, both operating in high-speed USB. Their transfer limitations lie in the underlying architecture on the FPGA, where the transfer rates are approximately 100 and 400 Mbps. These rates are achieved and exceeded in testing, indicating that their solution is able to utilise high-speed USB almost to its full extent.

2.7.5 Infiniband Studies

Infiniband (IB) is of high interest to high-energy physics [40] in terms of data acquisition, very similar to the work conducted here. The appealing factors of Infiniband are its high throughput and low latency. It is also very important in the field of High Performance Computing (HPC), where high throughput and low latency are equally critical [41].

In [42], an assessment of Internet Protocol (IP) performance over Infiniband is performed, and in that evaluation they find that IB is a competitor to 10GbE, since it delivers very high throughput, in this particular study up to 11 Gbps. Adding the results in [43], which show that the latency of IB is lower than Ethernet latency, the conclusion has to be that IB is a strong competitor to Ethernet. In [42], the two modes of IB, connected and unreliable, are also compared. They find that by using connected mode, and thus not needing a TCP layer on top, the system works faster than the usual IP defragmentation algorithms do. Hence, a speedup is gained when IB performs the fragmentation/defragmentation instead of the UDP stack.

Further research in [40] indicates that the choice between Infiniband and 10G Ethernet depends on the expected packet size, since Ethernet outperforms Infiniband for small packets, and vice versa for big packets. However, they point out that these tests were only conducted for a point-to-point case and might not be valid for another setup with several machines over another type of network.


2.7.6 Intel Thunderbolt

There is little research to find on Thunderbolt. Some work has been done, though, e.g. an attempt to interconnect several PCs running Windows Server 2008 R2 and have them communicate with each other over Thunderbolt [44], or Light Peak as it was called prior to public release. Their results suggest that Thunderbolt may be used to create data clusters in the future, since their prototype managed to achieve well over 50% utilisation of the buses.

2.8 Data Acquisition Networks

Since the aim here is to evaluate performance in radar signal processing hardware, a characteristic of which is large amounts of data transfers, this subchapter will look into systems with similar requirements, called Data Acquisition systems (DAQs). There are lots of similarities, since these systems are created to transfer data at very high rates from their input to the storage or post-processing at the back.

According to [45], DAQ systems may be divided into three categories: PC-based, embedded and FPGA systems. PC-based systems are those which use a PC to visualise the data in some way. Either connected to an internal bus or to an external connector, these are standard PCs with an extension that captures data. For transmission of data, some techniques are USB, FireWire, RS232, Ethernet and so on; an internal alternative is to connect to a PCI or PCI Express bus. Embedded DAQ systems are found in cars, aeroplanes, medical equipment and several other applications [45]. These are often fast, high-performance systems, but they have a fixed architecture and are not hardware-reconfigurable after they are built. This is in contrast to the FPGA solution, which may be hardware-reconfigured after the system has been deployed [45].

Following this definition, the work conducted in this thesis focuses mainly on a hybrid between FPGA and PC-based systems. The theoretical setup, shown earlier in Figure 2.2, has a PC at the end where the user gets to see the output data. However, the data passes through some non-PC components prior to visualisation, e.g. COTS components or custom-created components, which may very well be implemented in an FPGA.

In general, many systems are in some sense PC-based; they only differ in the amount of processing done in the PC. Two examples which use a PCI-bus-based card for capture and then do the processing on the CPU are [46, 47]. This is done in order to use commodity computers for the processing. Another common approach is to place high-end computers in the middle, between the data collection and the visualisation PC. This is done, for example, in the ATLAS experiment (see Appendix M for a case study) as well as in other experiments such as the Daya Bay Neutrino DAQ [48, 49] and the KM3NeT Detector [50, 51]. All of these use computational clusters of different sizes, but they all use specialised computers to process the data and filter out the important parts of it. They also all employ an Ethernet backbone in the cluster, and VME buses for the in-chassis communication.

2.8.1 Common Features for DAQ Networks

Even though all data acquisition systems are different and created to fulfil their own purpose, they have a lot of features in common. One such feature is that several of them use the VME bus for communication within a chassis, often in the front end [48, 52, 53, 54, 55, 56]. It is used for the communication between cards within the same chassis, and it is a proven, open, standard bus.

The feature that most systems have in common is that they utilise Ethernet to some extent. Some use it for the communication between the front end and the back end [48, 52, 54, 57]. In the Deep Sea Neutrino Telescope [51], Ethernet is used to connect the detectors to the processing system, which is located on land. Ethernet may also be used only to transmit the control signals to the DAQ system [53]. Finally, Ethernet may be used to carry almost every signal in the entire communication of a DAQ system (Appendix M).

A technique that is also present in some of the DAQs is the use of FPGAs. For example, the KATRIN DAQ [58] uses FPGAs for the level-triggering of the acquired signal. In the T2K experiment [57], FPGAs are used both for controlling the front-end ASIC collector circuitry and in the back end for calculations.

Finally, the most common feature of all the described DAQ systems is the need for a high sustained data rate with true real-time behaviour. The amounts of generated data vary, ranging from a few megabytes [52, 54], via around 0.1 terabyte transmitted over several kilometres from KM3NeT [51], to the enormous amount of around 60 terabytes from ATLAS [30]; this data arrives every second and has to be processed in time so that the systems may continue processing.

Part II

Contributions


Chapter 3

Methods

There are three parts to this thesis work. The first is the investigation of an LVDS-communicating Link Port; the second is the Gigabit Ethernet investigation; and the last is the investigation of other potentially interesting protocols. The methodology for each of them is described below.

3.1 Link Port

To investigate the Link Port for the TigerSHARC processor, a VHDL model will be created from the specifications of the Link Port protocol [2]. The model will be simulated in a test bench to verify that it functions as specified. Since the Link Port interface is targeted at communication between an FPGA and the TigerSHARC, a back-end bus connection will be built for the Link Port model as well. The choice of bus connection will be based on the probable bus architecture in a possible future use of this module.

The model will also be used to check timings and to see what the effective bandwidth of the Link Port is under some real-case traffic patterns from a typical radar system. This will be compared to the theoretical values from a mathematical model that will be created.

To ensure the accuracy of the model when implemented as an IP block, the model will be constrained with the necessary specifications of the TigerSHARC processor [2]. This will ensure that the design meets its timing every time. The model will be placed and routed on a probable end-application FPGA to ensure functionality.

During the implementation of the Link Port IP block, the implementation time for different tasks in the design will be monitored and reported, enabling efficiency comparisons between VHDL and other implementation languages.

Finally, the VHDL design will be evaluated for common FPGA design parameters, namely area, power, bandwidth and latency. This is done in order to be able to compare this implementation with other implementations.

3.2 Gigabit Ethernet

For the investigation of the Gigabit Ethernet link, there will be two parts. Firstly, research articles will be reviewed in an attempt to determine which factors affect the effective bandwidth the most. From this, experiments will be designed to either support or refute the theory derived from the research results.

The experiments will involve interconnecting two or more machines running the Windows operating system, to evaluate how their Ethernet efficiency varies when the parameters found in the literature are altered. Based on the test results, the optimal settings for a typical case of an Ethernet link on a Windows network will be presented.

3.2.1 Setup for the Experiment

In order to collect a sufficient amount of data for the evaluation of the Ethernet link, a configurable testing program was implemented. The program allows many parameters to be set while conducting a sweep over different transmission sizes. The collected data will be analysed with respect to bandwidth and latency, and the effects on bandwidth and latency will be related to the corresponding driver parameters, in order to show which parameter values should be chosen.

The effects will later be taken into consideration when comparing embedded-systems communication, where Ethernet will be considered as a commodity option for inter-entity communication.

3.3 Other High-Speed Protocols

For the rest of the protocols, recent research results will be collected and reviewed. These results will hopefully shed some light on which kind of data link is suitable for which application. The links will be evaluated with respect to latency, throughput and reliability. If time permits, the most promising protocols will also be modelled mathematically and evaluated through calculations. The results from this study will form the basis of the recommendations for embedded radar system bus technology.

Chapter 4

Ethernet On Windows Computers

As mentioned earlier, the use of Ethernet is widespread, and it is rather cheap compared to many other technologies. The knowledge and support are also extensive, since the technology has been around for several decades. In many high-performance systems, Ethernet is one of the main communication media. For embedded applications there are advantages to using Ethernet, especially due to the number of IP blocks available for direct use inside a system.

Previous studies of Ethernet efficiency have observed quite unstable measurements, which may cause system performance to suffer. Here, the Ethernet communication between two PCs will be used to evaluate the performance achievable when communicating with an Ethernet-enhanced embedded system. This could be a method of interconnecting the back end of a system, e.g. as shown in section 2.2.

To test the Ethernet connections, two standard PCs will be used to measure the performance in terms of maximum throughput. The reason is to test how COTS components may be used in a modern high-performance system in order to achieve performance while at the same time cutting costs. Even though there is software available for testing Ethernet connectivity, the idea has been to develop a test program that can test the connections between computers for many types of protocols, with the same program and without changing the program interface.

The measurements will look at the performance obtainable from a computer running Windows XP, which may be regarded as the back end of a system, responsible for post-processing, data recording or visualisation. This link may be crucial in the system design, as the data must arrive at a minimum rate. The test system will be built from two PCs running Windows XP, connected through a cross-over cable. The tests will be implemented in C#, using the Common Language Runtime environment in Windows.


4.1 Hardware and Software Setup

The two computers used in the experiment were of different types; their specifications are found in Table 4.1. The computer called Computer 1 is a high-end computer, whereas Computer 2 is a typical desktop computer.

                              Computer 1                 Computer 2
Processor                     Intel Xeon E5620           Intel Pentium 4
Clock Frequency               2.40 GHz                   2.60 GHz
Number of Cores               4                          1
RAM (GB)                      12.0                       3.0
Network Card                  Intel CT Desktop Adapter   Intel PRO/1000 MT
Network Card Driver Version   10.3.42.0                  8.10.3.0
Operating System              Windows XP SP3             Windows XP SP3

Table 4.1. The specifications of the computers used for Ethernet testing.

Both computers were equipped with the .NET 2.0 environment for running the programs; this was the natural choice, since the programming environment used was Visual Studio 2005. The program created for the measurements was rather straightforward, with the sole purpose of measuring the transmission time of a certain number of packets with varying payload sizes. These experiments were repeated while altering device driver settings. The settings primarily looked at are:

• Checksum Offloading

• Interrupt Moderation

• Ethernet Frame Size

• Receive and Transmit Buffer Sizes

All of these parameters were altered and examined for both the TCP and UDP protocols. The following subsections explain the effect of each alteration and the expected outcome.

4.1.1 Offloading Checksum Calculations

The testing of the checksum offloading capabilities of the network cards was primarily conducted to see whether offloading has any effect, or whether the processor can cope with the TCP calculations on its own. Calculating the checksums could take considerable time on a weaker processor, and hence offloading could be a performance-enhancing operation, probably increasing performance at least for the weaker computer.
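To make concrete what work is moved from the CPU to the NIC, the sketch below computes the 16-bit one's-complement Internet checksum (RFC 1071) that IP, TCP and UDP use. This is illustrative code only, not part of the test program.

using System;

static class InternetChecksum
{
    // Computes the 16-bit one's-complement checksum (RFC 1071) used by
    // IP, TCP and UDP. With checksum offloading enabled, the NIC performs
    // this summation instead of the CPU.
    public static ushort Compute(byte[] data)
    {
        uint sum = 0;

        // Sum the data as big-endian 16-bit words.
        for (int i = 0; i + 1 < data.Length; i += 2)
            sum += (uint)((data[i] << 8) | data[i + 1]);

        // An odd trailing byte is treated as if padded with a zero byte.
        if (data.Length % 2 == 1)
            sum += (uint)(data[data.Length - 1] << 8);

        // Fold the carries back into the low 16 bits.
        while ((sum >> 16) != 0)
            sum = (sum & 0xFFFF) + (sum >> 16);

        return (ushort)~sum;
    }
}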


4.1.2 Increasing the Transfer Buffers

By increasing the transfer buffers, there is more space for storing data on the transmitter side. This means that the transmitter may have more outstanding transmissions at any given time on a TCP connection. For a UDP connection, an increased buffer may allow the application to do more consecutive writes into the buffer before filling it up; but since no confirmations are awaited, the increased buffer size will probably not have much effect.

4.1.3 Increasing the Receiver Buffers

When the receiver buffer is increased, the receiver may accept more data before having to throw any away. If sufficiently dimensioned, the receiver buffers will allow enough data to be stored so that none has to be discarded. If data is thrown away on a TCP link, it will cause many retransmissions; on a UDP link, it will result in many data losses.
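In the .NET socket API, the corresponding per-socket buffer sizes can be set directly, as in the minimal sketch below. The sizes shown are example values; note also that these are the operating system's socket buffers, whereas the driver's receive and transmit descriptor counts are configured in the NIC driver settings. Whether the thesis test program used exactly this mechanism is an assumption.

using System.Net.Sockets;

static class BufferSetup
{
    static void Configure(Socket socket)
    {
        // Space for outstanding (unacknowledged) data on the sender side.
        socket.SendBufferSize = 4 * 1024 * 1024;   // e.g. 4 MB transmit buffer

        // Space for received data the application has not yet read.
        socket.ReceiveBufferSize = 64 * 1024;      // e.g. 64 kB receive buffer
    }
}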

4.1.4 Increasing the Ethernet Frame Size

When transmitting large quantities of data, the data will not fit into a single Ethernet frame. Instead, it is fragmented into several consecutive frames which contain the data. Since every frame has an Ethernet header, an IP header and also a TCP or UDP header, the effective transfer ratio of a frame is smaller for small frames than for large frames. For a typical Ethernet frame, the maximum payload when transferring TCP is 1460 octets, and 1472 when transferring UDP. For such a frame, the number of overhead octets is 78 on a TCP link and 66 on a UDP link, which in the best case results in a link overhead of 78/1538 ≈ 5.1% for TCP or 66/1538 ≈ 4.3% for UDP (a more in-depth explanation is found in Appendix G).

By increasing the Ethernet frame size, the overhead remains the same in terms of octets, but since the number of payload octets increases, the throughput of the link also increases. The network cards on the computers in this experiment support Ethernet frame sizes of 1518, 4088 and 9014 octets. With the larger sizes, the frames carry a higher fraction of payload data and the link is therefore more efficient, which can be expected to increase performance in terms of bandwidth.
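The overhead argument can be made concrete with a short calculation. The sketch below reproduces the best-case TCP figures for the frame sizes the NICs support, under the frame layout assumed in this section (14 B Ethernet header, 4 B FCS, 8 B preamble, 12 B inter-frame gap, 20 B IP header, 20 B TCP header); the exact payload of the largest jumbo frame depends on whether the reported frame size includes the FCS.

using System;

static class FrameEfficiency
{
    static void Main()
    {
        const int overhead = 78;                 // all non-payload octets per TCP frame
        int[] frameSizes = { 1518, 4088, 9014 }; // sizes supported by the NICs

        foreach (int frame in frameSizes)
        {
            int payload = frame - 18 - 40;       // strip Ethernet, IP and TCP headers
            double wire = payload + overhead;    // octets actually on the wire
            Console.WriteLine("{0} B frame: payload {1} B, overhead {2:P1}",
                              frame, payload, overhead / wire);
        }
        // The 1518 B frame gives 1460 B payload and 78/1538 ≈ 5.1 % overhead,
        // matching the figure above; larger frames shrink the overhead further.
    }
}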

4.1.5 Controlling the Interrupt Rate

Interrupt moderation is a feature that evens out the number of interrupts per second in order to save CPU time and cycles. The network cards in this experiment have a variable interrupt moderation setting with the options Off, Minimal, Low, Medium, High, Extreme and Adaptive. This setting is expected to affect performance, since it controls how often the CPU has to do a context switch to handle the network traffic. The probable outcome is that the most favourable amount of moderation depends on other factors, such as frame and packet size.
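To see why moderation matters, consider the interrupt load without it. The following is a rough, illustrative estimate (not a measurement from this thesis), assuming one interrupt per received frame at line rate with full standard frames:

using System;

static class InterruptRate
{
    static void Main()
    {
        const double lineRateBps = 1e9;   // Gigabit Ethernet
        const int wireFrameBytes = 1538;  // full frame incl. preamble and inter-frame gap

        // Without moderation, one interrupt per received frame is a
        // reasonable worst-case assumption.
        double framesPerSecond = lineRateBps / 8 / wireFrameBytes;
        Console.WriteLine("~{0:F0} interrupts/s at line rate", framesPerSecond);
        // Roughly 81 000 interrupts per second, each costing a context switch.
    }
}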

4.2 Evaluating the Performance

To evaluate the performance of the Ethernet connection, a communication model with a server and several clients was created. The idea is that the server scans the network for listening clients and then sets up several clients to send data to each other. The program is very scalable, and the server has complete control of the transmission paths between all connected clients. The clients may transmit using any specified and supported protocol, and each client may be set up as a data source or a sink.

4.2.1 The Measurement Environment

The measurements take place in a custom-built environment with the sole purpose of examining the throughput of bulk transfers over Ethernet using standard protocols. These conditions were chosen since radar data is often transmitted in large batches that need to be processed. The environment measures such transfers while varying the size of the transfer, i.e. the payload of each packet. That way, an optimal setting might be found and used in future systems. For this purpose, a server-client application was created, which is explained in more detail below.

The Server Functionality

The server application is the program which sets up all communication. It may set up an arbitrary number of connections between clients in a network, either by itself, using a predefined algorithm for deciding the connections, or with the user specifying every parameter of every connection. The program flow can be seen in Figure 4.1 and is explained briefly here.

The server starts by broadcasting a message telling all clients that they should report to the server. The server then produces a list of all active clients on the network and waits for the user to select how the connections should be made. Once the connections are decided, the server issues a TCP packet to every client with an outgoing or incoming connection, telling it how to set up the connection. These packets consist of a predefined number of setup messages, which are specified in Listing 4.1. Each setup message defines either the receiving or the transmitting side of a connection, and the client is then required to create that connection. Once the setup messages have been sent, the server waits for the clients to finish; it is then able to send a new set of setup messages to set up a new connection and use it for another measurement.
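As an illustration of the discovery step, the sketch below shows one way the broadcast could be done with .NET's UdpClient; the port number and message contents are hypothetical and not taken from the thesis code.

using System.Net;
using System.Net.Sockets;
using System.Text;

static class Discovery
{
    const int DiscoveryPort = 9050; // hypothetical port

    // Server side: broadcast a "report in" message to all clients on the LAN.
    static void FindClients()
    {
        using (UdpClient udp = new UdpClient())
        {
            udp.EnableBroadcast = true;
            byte[] msg = Encoding.ASCII.GetBytes("REPORT");
            udp.Send(msg, msg.Length, new IPEndPoint(IPAddress.Broadcast, DiscoveryPort));
        }
    }

    // Client side: wait for the broadcast, then notify the server of presence.
    static void AwaitServer()
    {
        using (UdpClient udp = new UdpClient(DiscoveryPort))
        {
            IPEndPoint server = new IPEndPoint(IPAddress.Any, 0);
            udp.Receive(ref server);               // blocks until the broadcast arrives
            byte[] reply = Encoding.ASCII.GetBytes("PRESENT");
            udp.Send(reply, reply.Length, server); // tell the server we are present
        }
    }
}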


[Flowchart, Figure 4.1: parallel client and server program flows. Client: start → wait for broadcast → notify server that client is present → wait for connection details → create connections from setup messages → start measurement → create performance report. Server: start → find clients with broadcast message → wait for replies → enter connection setup details → send setup messages → send a start message.]

Figure 4.1. A flowchart showing the communication flow when setting up and starting a measurement of Ethernet performance.

The Client Functionality

Even though the server is responsible for the setup, it is the clients that conduct the actual measurements. The client program flow goes through the steps specified in Figure 4.1 and is explained briefly here. The client starts by listening for broadcasts from the server, and on reception it notifies the server that it is present. The client then waits for the setup data from the server; when it receives the setup data, it creates senders and receivers, which basically are threads with the purpose of sending or receiving data.


public struct SetupMessage
{
    public const int MESSAGE_LENGTH = 44;

    public UInt32 ProtocolType;       // Type: 00 TxUDP, 01 RxUDP, 10 TxTCP, 11 RxTCP
    public UInt32 minPackSize;        // Minimum packet payload size of the run
    public UInt32 maxPackSize;        // Maximum packet payload size of the run
    public UInt32 packIncrement;      // Difference in payload size between two packets
    public UInt32 sourceIP;           // Source (local) IP address
    public UInt32 SourcePort;         // Source (local) port number
    public UInt32 EndpointIP;         // Endpoint (remote) IP address
    public UInt32 EndpointPort;       // Endpoint (remote) port number
    public IPAddress SourceIPAddress; // The local IP address again
    public Int32 Iterations;          // Number of packets to send of each payload size
}

Listing 4.1. A setup message from the server to a client, telling the client how to set up its endpoints in order to communicate correctly.

The threads wait for a start signal sent by the server, and after that the client starts both its sender threads and receiver threads. The threads run and measure the data transfers for the different payload sizes; when finished, the client generates a report containing the transferred bytes, the time taken, the lost bytes and the CPU cycles spent in the program. These reports are later used to generate graphs and plots from which conclusions are drawn.

The Measurement Idea

The idea of the measurement is to conduct one measurement for every specified payload size, report the measured throughput for that payload, and then initiate a new transfer with the next specified payload size (see the pseudocode in Listing 4.2). For this to be as efficient as possible, the algorithm synchronises through dedicated TCP ports. This way a transmitter can tell when it has finished transmitting, and it can wait until the receiver has finished receiving before starting the next transmission. Thus both receiver and transmitter know that every packet has been sent in that transmission and that they are ready to conduct another one. By making sure that each transfer takes place in an isolated environment, where no other packet sizes are being sent, it is ensured that each packet size is measured individually, unaffected by adjacent measurements. This is important in order to determine what happens at the exact packet sizes.
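A minimal sketch of the synchronisation handshake described above (the one-byte messages and their meaning are hypothetical):

using System.IO;
using System.Net.Sockets;

static class SyncChannel
{
    // A dedicated TCP connection is used only for synchronisation. After
    // each payload size, the transmitter signals that it is done and blocks
    // until the receiver confirms that it is ready for the next run.
    static void SyncAfterRun(NetworkStream syncStream)
    {
        syncStream.WriteByte(1);          // transmitter: finished this run
        int ack = syncStream.ReadByte();  // block until the receiver answers
        if (ack == -1)
            throw new IOException("sync connection closed");
    }
}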

4.3 TCP Specifics

When measuring TCP performance, the test suite uses sockets that connect to each other and then utilises the ability to read and write arbitrarily sized byte arrays to and from a socket. The tests were made repeatedly for different sizes of the Ethernet payload, as well as with the other parameters varied.


...
// INITIALIZATION
ListenForSetupMessage();        // Wait for a server connection
ConnectToServer();              // Connect to the server
WaitForConnectionDetails();     // Receive connection details from the server
SetupConnections();             // Set up the necessary connections

// Do all measurements
for (int i = 0; i < Number_Of_Different_Payloads; i++)
{
    MeasureThroughput();        // Measure the throughput of the link at the specified payload
    PrintReport();              // Save a report for the current payload
    SynchroniseBeforeNextRun(); // Synchronise receivers and transmitters
}
...

Listing 4.2. A pseudocode example presenting the workings of the measurement client. The methods are named after what they should do and do not necessarily correspond to any actual method.

To ensure that packets were not paired up in transmission, Nagle's algorithm was turned off; this algorithm otherwise packs small packets together to improve efficiency [59]. When testing the TCP performance, the measurements were conducted on a series of different configurations. However, at least initially, every enhancement that the network cards could provide was turned off, in order to be certain that no clever mechanisms could interfere with the raw performance. The standard buffer size was set to 64 kilobytes for both the receive and the transmit buffer. Unless otherwise stated in a figure or the text, the interrupt moderation setting was off and the checksum calculations were enabled; any deviations from this standard case are noted in the text and figures.
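Disabling Nagle's algorithm is a single socket option in .NET; a minimal sketch:

using System.Net.Sockets;

static class TcpSenderSetup
{
    static Socket CreateSocket()
    {
        Socket socket = new Socket(AddressFamily.InterNetwork,
                                   SocketType.Stream, ProtocolType.Tcp);

        // Disable Nagle's algorithm so that small writes are sent immediately
        // instead of being coalesced into larger segments.
        socket.NoDelay = true;
        return socket;
    }
}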

4.3.1 TCP and IP Checksum Offloading

When testing the TCP and IP checksum offloading mechanisms, all other enhancements were turned off and the frame size was set to the standard Ethernet size of 1518 bytes. The test suite ran for packet sizes up to a completely full Ethernet frame, with a payload of 1460 bytes. The effects of the offloading are presented in Figure 4.2. The effects are quite hard to determine from this single measurement: there seems to be a better worst-case throughput when not using offloading, but the offloaded transfer produces a more even graph and does not flatten out just above 500 Mbps, as the transfer without offloading does. Since this measurement raised more questions, more comparative measurements were made.

The next step was to introduce some interrupt moderation and study whether the interrupt moderation level in some way controlled the performance under offloading. Two settings, low and high, were selected as examples. The results from the run are shown in Figure 4.3. Looking at those results, it is evident that the effects of checksum offloading are small. This is especially visible for low interrupt moderation, where the throughput is almost unaffected by the offloading. In the case of high interrupt moderation the difference is bigger, but still not very big. It is, however, evident that interrupt moderation plays a big role in the capacity for high-speed transfers, and its effects will be examined next.

[Plot: TCP throughput (Mbps) versus payload (B), with checksum offloading ON and OFF.]

Figure 4.2. Difference between checksum offloaded and not offloaded TCP transfers. The transfers are not very different, so checksum offloading does not appear to have any major effect on throughput.

4.3.2 Effects from Interrupt Moderation

When the effect of offloading was studied, the effects of interrupt moderation were hinted to be large; they are therefore interesting to study thoroughly. This led the study to test more cases and settings on the network interface card regarding interrupt moderation.

The effects of interrupt moderation are studied with a constant buffer size of 64 kilobytes and with the offloading capabilities of the network card in use. The reason for using the offloading capabilities is that offloading relieves the CPU, which results in more processor time available for the TCP calculations. Furthermore, the previous measurements showed that the offloaded transmissions were at least as stable as, and with greater or equal throughput than, the transmissions without offloading. The interrupt moderation setting was altered while the rest of the parameters were held constant. The investigated payload size varied between 10 and 1460 bytes, i.e. up to the maximum standard payload size of a single Ethernet frame. The experiments were conducted with a single flow from Computer 1 to Computer 2.


[Plot: TCP throughput (Mbps) versus payload (B) for combinations of offloading and interrupt moderation; curves: High/On, High/Off, Low/Off, Low/On.]

Figure 4.3. TCP offloading comparison with both low and high interrupt moderation. In the legend, Low or High indicates the interrupt moderation, and On or Off indicates whether offloading is turned on. For low interrupt moderation there is no benefit from offloading, whilst for high moderation, offloading increases the performance of the transmissions.

The results from the interrupt moderation changes may be seen in Figure 4.4, and it is beyond doubt that this setting affects performance. Tuning this setting results in performance varying from very close to the theoretical maximum (adaptive, low and medium) to really poor performance for large packets (minimal and off).

The high and extreme interrupt moderation settings also show less than optimal performance, but this is expected. The expectation is based on the fact that the number of interrupts is limited, as is the buffering capacity of the NIC; thus the buffers might fill up before the interrupt is issued, which caps the maximum throughput. The lack of performance for the settings off and minimal is somewhat harder to explain. The probable cause is that the CPU gets too many interrupts and thus has to do too much context switching between the main application and the interrupt service routine. The reason why both the low and medium settings work well may be that they issue a good number of interrupts compared to the number of bytes received. The fact that adaptive outperforms them both for large packets might be due to a setting that enables the NIC to send interrupts more optimally, with the most beneficial amount of data for the processor to process between interrupts.


[Plot: TCP throughput (Mbps) versus payload (B) with offloading ON, for interrupt moderation settings Off, Minimal, Low, Medium, High, Extreme and Adaptive, compared with the theoretical maximum.]

Figure 4.4. The effects of interrupt moderation on Ethernet performance. The performance benefit of tuning the interrupt moderation setting correctly is evident from these plots.

4.3.3 Changing the Ethernet Frame Size

The next investigation concerned changing the Ethernet frame size, also known as using jumbo frames (see Appendix F). In addition to the standard 1518 byte frame, the network cards supported a 4088 byte and a 9014 byte frame. The benefit of bigger frames comes from enabling a larger payload in each frame and thus having less overhead. The different frame sizes were also tested with several different interrupt moderation settings, to evaluate the effect of both settings together.

The first setting tested was the 4088 byte frame size, which gives a total payload of up to 4030 bytes in a single frame. The run was made with several interrupt moderation settings, and the results may be seen in Figure 4.5. In the graph there are some interesting things to note. Most interesting is that when the payload is in the 4030 B area, the measured throughput is very close to its theoretical value when the interrupt moderation setting is low. This is very interesting, since it indicates that these high throughputs may actually be achieved. Furthermore, the throughput achieved with minimal interrupt moderation is rather unpredictable for small payload sizes. However, this stabilises when approaching 2000 B payload, after which it constantly achieves very high throughput, almost as high as with low interrupt moderation. A few final remarks about the results in Figure 4.5 follow.


[Plot: TCP throughput (Mbps) versus payload (B) for 4088 B jumbo frames, for interrupt moderation settings OFF, Minimal, Low, Medium, High, Extreme and Adaptive, compared with the theoretical maximum.]

Figure 4.5. The effects of using a 4088 B frame when transferring data. Note how close the low interrupt moderation setting comes to the theoretical value when sending full 4088 B frames (equal to a payload of 4030 B).

Firstly, the higher interrupt moderation settings have a peak throughput at around 2 kB payload and then stabilise at a certain throughput regardless of payload size. Secondly, the adaptive interrupt moderation seems to work less efficiently with jumbo frames and large payloads, as both low and minimal interrupt moderation are better for large packets.

The other jumbo frame size is 9018 B, which enables payloads of up to 8960 B; almost a full ten kilobytes may thus be sent in a single frame. The tests run for this size were identical to those for the smaller jumbo frame, and the measured results are shown in Figure 4.6. Compared to the smaller jumbo frame, the transmissions with large jumbo frames are less predictable and less stable, and the measured throughputs are lower for almost all interrupt moderation settings. However, this is the only case where the transfers without interrupt moderation turn out well: close to the 8960 B point, the unmoderated setting actually provides the highest throughput.

4.3.4 Variable Buffer Size

Another part was to look at how the buffer size limits the throughput. For this measurement, the interrupt moderation was set to low and checksum offloading was turned on. The test was conducted by increasing the payload size in steps of 50 bytes and recording the throughput at each payload. The buffer sizes were increased by a factor of four for each iteration, moving through the set {2^14, 2^16, 2^18, 2^20, 2^22, 2^24} bytes.


[Plot: TCP throughput (Mbps) versus payload (B) for 9014 B jumbo frames, for interrupt moderation settings OFF, Minimal, Low, Medium, High, Extreme and Adaptive, compared with the theoretical maximum.]

Figure 4.6. This graph evaluates the effects of using a 9018 B jumbo frame when transferring data.

The iterations first moved through the sender buffer sizes, and then through the receiver buffer sizes, while keeping the other at a constant value.

The first measurements were made while cycling through different transmission buffer sizes and keeping the receive buffer at a constant 2^16 bytes. The results from that measurement are shown in Figure 4.7. What may be seen from the graph is that a larger transmission buffer enables transmission with a higher throughput at smaller packet sizes, but that for large packets the benefits of the larger buffer seem to fade away. This is true for buffer sizes above 64 kB (2^16 bytes), which is the smallest buffer size that reaches the maximum throughput (the highest achieved, around 900 Mbps). The buffer sizes smaller than 64 kB reach an upper limit in throughput and cannot send more data. The upper limit could very well come from the need to receive an ACK for every sent message: each message has to remain in the buffer until its ACK arrives, so the buffer fills up and cannot store any more messages until some messages are confirmed.

Moving on to the variable receiver buffer size: for this experiment, the receive buffer is cycled through the set of sizes specified earlier while the transmit buffer is kept constant at 2^22 bytes. That transmit buffer size was chosen as one of the top performers from the previous run. The results from this measurement are presented in Figure 4.8. The graph makes it clear that the size of the receive buffer is important. The small buffers outperform the larger ones, as opposed to the transmit buffers in Figure 4.7. The buffer with the highest throughput for larger packets is the 64 kB buffer, but the 4 kB buffer has a steeper rising edge and is better for small packets.


[Plot: TCP throughput (Mbps) versus payload size (B) for sender buffer sizes 4k, 16k, 64k, 256k, 1M, 4M and 16M.]

Figure 4.7. The transmit buffer size is seen to affect the throughput. The tendency seems to be that a larger buffer size gives greater throughput in the transmission.

One peculiar thing is the peaks around 250-300 B payload. Above that limit, all the bigger buffers suddenly decrease in throughput and stay quite low until reaching 1460 B payload, which is the full MTU (maximum transfer unit) of a TCP packet on a standard Ethernet connection. It is therefore interesting to see that the throughput increases again when packets of the full MTU size are transmitted.

4.4 TCP Evaluation and Summary

An evaluation of the results obtained for TCP performance is presented here. The evaluation of the Ethernet performance has been very interesting in mapping which factors affect the performance the most. Looking at the offloading (subsection 4.3.1), the conclusion has to be that offloading is of little or no interest to the performance, since the performance difference is a lot bigger when changing the interrupt setting.

The interrupt setting, on the other hand, is very interesting. By changing the interrupt moderation from off to low, great benefits are gained in terms of performance, and especially stability. The curves obtained when measuring with different interrupt moderation settings are so diverse that the interrupt setting must be considered crucial for the performance of TCP. An example can be picked from Figure 4.4, where some settings achieve almost theoretical throughput for payloads over one kilobyte, while the others achieve throughputs in the range 100-500 Mbps. In most of those cases, selecting either adaptive, low or medium interrupt moderation yields an improvement of over 100%.


[Plot: TCP throughput (Mbps) versus payload size (B) for receiver buffer sizes 4k, 16k, 64k, 256k, 1M, 4M and 16M.]

Figure 4.8. The receive buffer size is varied and the throughput measured for different payloads. The small buffers clearly outperform the larger ones for medium-sized payloads. However, when reaching 1460 B payload, the throughput for all of them rises again.

For the buffer sizes, the conclusion is that a small buffer in the receiver is for some reason desirable; with a small receive buffer, performance seems to be a lot better than with a large one. For the transmit buffer, the conclusion must be: the larger the buffer, the better. This is logical, since a larger buffer allows more transactions to take place simultaneously, as all data has to be stored in the buffer until its acknowledgement comes back from the receiver. Having a large buffer minimises the risk that the buffer fills up before transactions are completed.

4.5 UDP Specifics

When evaluating the performance of UDP, the same test suite as for TCP was used, except for the evaluation of offloading. That test was excluded due to the lack of UDP offloading support on one of the NICs.

UDP is a protocol in the TCP/IP protocol suite; for more general information, see Appendix G. UDP is an unreliable protocol and requires a lot less processing. That it is unreliable means that there is no guarantee that packets reach their destination: lost packets are never retransmitted, as they would have been if TCP had been used. Therefore the number of lost packets is added as one of the measured parameters. Another remark is that the throughputs presented from the UDP measurements are those received at the sink, i.e. the number of bytes that actually arrive per second. In addition to lost packets, the same parameters are examined for UDP as for TCP.
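As an illustration of how the sink-side throughput could be measured, the sketch below counts the bytes that actually arrive during a fixed interval; the port and duration are example values, and the real test program may differ.

using System.Diagnostics;
using System.Net;
using System.Net.Sockets;

static class UdpSink
{
    // Returns the throughput seen at the sink in Mbps: only bytes that
    // actually arrive are counted, so losses simply never show up here.
    static double MeasureThroughputMbps(int port, int seconds)
    {
        using (UdpClient udp = new UdpClient(port))
        {
            IPEndPoint remote = new IPEndPoint(IPAddress.Any, 0);
            long received = 0;
            Stopwatch clock = Stopwatch.StartNew();

            while (clock.Elapsed.TotalSeconds < seconds)
                received += udp.Receive(ref remote).Length;  // blocking receive

            return received * 8.0 / clock.Elapsed.TotalSeconds / 1e6;
        }
    }
}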

4.5.1 Interrupt Moderation Effects

The initial measurements were done with the different interrupt moderation settings of the network card. The setting was varied over several of the available options on both sender and receiver. The initial sweep spanned payloads from 10 to 1500 bytes.

[Plot: UDP throughput (Mbps) versus payload (B) for interrupt moderation settings Adaptive, Off, Minimal, Low, Medium and Extreme.]

Figure 4.9. The throughput measured when varying the data payload and the interrupt moderation setting. It shows that the correct interrupt moderation setting may have a huge effect on the performance achieved from the connection.

The results from the measurements with different interrupt moderation settings are visualised in Figure 4.9. When examining the graph, some interesting facts appear. There seems to be a linear increase in throughput as the payload increases towards 1020 bytes, for the settings off, minimal, low and medium. Zooming in on this limit shows that the drop occurs at exactly 1024 bytes of payload, equal to 2^10. The drop is not related to packet losses, as will be seen in the next subsection, but has some other explanation; this measurement only shows that it happens, not why.

The adaptive setting does not outperform the others; instead it is very unstable and unpredictable.


[Plot: UDP lost data (Mbps) versus payload (B) for interrupt moderation settings Adaptive, Off, Minimal, Low, Medium and Extreme.]

Figure 4.10. This graph shows how many bits per second are lost. The losses rise as the packet payload increases, for all settings but extreme moderation. Above 1024 bytes of payload the losses suddenly go to zero, which is the same point at which the throughput suddenly drops.

Nor is the extreme interrupt moderation linear and nice in its behaviour; instead it shows a sudden decrease in performance around 650 bytes of payload, after which it hardly recovers before reaching the 1024 byte marker, where it drops like all the others.

As stated earlier, the throughput of the UDP traffic drops dramatically when passing 1024 bytes of payload. After that point, all the interrupt moderation settings show a linear increase, but with different slopes and different starting points. Here the highest throughputs are achieved with no interrupt moderation, followed by adaptive, then minimal and low, then medium and extreme. In this region, a low interrupt moderation setting, or none at all, seems to be beneficial.

How Packet Losses Vary With Interrupt Moderation

One interesting thing is that the interrupt moderation settings affect the achieved throughput at the receiver differently; the percentage of lost packets is also interesting. In Figure 4.10 the losses are plotted as the number of bits per second that are lost. Comparing this to Figure 4.9, the two graphs appear to be roughly each other's opposites. Looking at the packet losses, it is evident that a higher interrupt moderation is beneficial for low packet losses, with medium being the best. Another interesting note is that above 1024 bytes of payload the packet losses are non-existent, i.e. the UDP transmissions lose no packets. This might have the same cause as the dramatic drop in UDP throughput at the same limit.

Another thing worth mentioning is that the main part of the dropped packets are dropped in the OS stack or in the network driver. This was measured using the network card's performance counters, which show that no packets are discarded by the card itself; hence, the losses must occur in the OS.

4.5.2 Buffer Size Exploration

To explore the effect of a variable buffer size on the UDP communication, the low interrupt moderation setting is used. In this experiment, the buffer size varies from 2^12 to 2^26 bytes in increases by a factor of four, i.e. the exponent grows by two for each measurement.

[Plot: UDP throughput (Mbps) versus payload size (B) for buffer sizes 4k, 16k, 64k, 256k, 1M, 4M, 16M and 64M.]

Figure 4.11. Here the buffer size varies along with the payload size. The throughput looks best for the larger buffer sizes.

In Figure 4.11 the results from the measurements are shown. Increasing the buffer size seems to have a positive impact on performance, since the largest measured buffer size produces the highest-throughput communication. The peculiar thing in this measurement is, however, the sudden drops occurring at 750 bytes of payload; the drop at 1024 bytes is still there, regardless of the buffer size.

Looking at the area around 750 bytes of payload, the measured throughput suddenly drops for most buffer sizes. Figure 4.12, which shows the percentage of lost packets at each payload size, reveals a sudden, dramatic increase in losses at this point. Quite fascinatingly, the smallest buffer size does not experience this, and the largest buffer size has no huge spike either. For some of the others, though, the increase is several hundred percent, from 5% to 50% with the 16 MB buffer. However, even when looking at the packet losses, the largest 64 MB buffer seems like the best choice.

[Plot: UDP lost bytes (%) versus packet size (B) for buffer sizes 4k, 16k, 64k, 256k, 1M, 4M, 16M and 64M.]

Figure 4.12. The packet loss as a percentage of the transmitted bytes. The graph shows an interesting sudden increase in packet loss at approximately 750 bytes of payload for most of the buffer sizes.

4.5.3 Does Frame Size Affect UDP Performance?

The final exploration is whether the Ethernet frame size has any effect on the performance of a UDP transmission. To test this, a constant 64 MB buffer size and medium interrupt moderation were used. The frame size was varied between the three settings supported by the Ethernet NIC, namely 1518, 4088 and 9014 bytes. The measurements were once again conducted by varying the payload for each frame size setting.

The results are presented in Figure 4.13, and they show that all of the settings are very similar in terms of throughput, especially for small payloads. All of the settings have a performance drop when exceeding 1024 bytes of payload. As the payload increases, the different frame sizes diverge slightly, and performance gets less stable for very large payloads. The main observation from the graph, however, is that there is not much difference, regardless of frame size.

4.6 Analysis of UDP Performance

When looking at the measurements of UDP performance made in the previous section, one thing is the most striking: the drop in throughput between 1024 and 1025 bytes of payload in the UDP packet.


[Plot: UDP throughput (Mbps) versus payload (B) for 9014 B, 4088 B and standard frames.]

Figure 4.13. The throughput of UDP traffic when varying frame size and payload. The throughput does not vary much with the different frame sizes, especially for small payloads.

If we look at a single case, for example the medium interrupt moderation in Figure 4.9, the drop in throughput goes from over 650 to below 50 Mbps, more than a 90% drop. Other measurements show that this is not due to excessive packet loss, since the packet loss drops to zero percent for payloads above 1024 bytes. This means that the limitation does not lie in the line itself, but in some implementation-specific parameter in either the UDP stack or the NIC device driver.

When trying to find what causes this limit, one might suspect some limitation in the buffers; but since the buffer size has been varied and the drop in performance happens regardless of buffer size (Figure 4.11), the buffer size cannot be the limiting factor. It has to be something else. Since the performance behaves linearly both before and after the drop, there seems to be some limiting factor that determines how many bytes per second may be sent. For simplicity, the first investigation looks at the performance for payloads above 1025 bytes.

The computer's performance monitor gives an indication of what might be the limiting factor in this transfer: during a transfer, the number of interrupts sent by the NIC every second is almost constant. A test run at payload size 1025 revealed that the NIC issues approximately 2400 interrupts per second and sends approximately 2400 packets per second. We know that each packet consists of 1025 bytes of payload plus 66 B of overhead.


[Plot: UDP sent and received data (Mbps) versus payload (B) for sender and receiver with interrupt moderation Off, Medium and Extreme; solid lines show sent data, dotted lines received data.]

Figure 4.14. The graph shows the difference between sent and received data. The sent data is indicated by the solid lines, while the dotted lines are the received data. The gap between a dotted and a solid line indicates how many packets are lost in transmission.

The measured number of bytes sent each second is around 2.6 million, or around 20 Mbps. Dividing 2.6 million by the size of a packet (1091 B) gives approximately 2380 packets per second, roughly 2400, which is equal to the number of interrupts. It might therefore be that the NIC driver, or the OS UDP stack, only transmits one packet per interrupt.

Extending this reasoning to payloads smaller than 1025 bytes, the hypothesis would be quite simple to verify, since the interrupt rate in that range is also very constant. The combined graphs in Figure 4.9 and Figure 4.10 show similar growth for most interrupt moderation settings if the losses are added to the received data. The fact that the extreme interrupt moderation has a lower ceiling in those cases might have something to do with its limitation of the number of interrupts. To visualise the combined graphs and show that the number of bytes sent per second is identical, look at Figure 4.14. There one may see that the off and medium interrupt settings result in the same number of sent bytes per second. However, with the setting off, many more packets are dropped before reaching the destination than with medium. This could very well indicate that the setting affects the weaker receiving computer to a greater extent than the sending computer. With rather high certainty, however, one may say that the UDP stack probably allows a maximum of one packet to be sent per interrupt.
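The packet-rate check above can be written out explicitly, using the reported figures:

\[
\frac{2.6 \times 10^{6}\ \text{B/s}}{(1025 + 66)\ \text{B/packet}} \approx 2383\ \text{packets/s} \approx 2400\ \text{interrupts/s},
\]

which is what suggests a limit of one packet per interrupt.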


4.7 Summary of Ethernet Performance

Summarising what has come forth in this investigation, there have been some expected results as well as some unexpected ones. One rather fascinating thing is the high level of throughput actually achieved in the TCP-over-Ethernet testing. By comparison, [25] needed really big packet sizes in order to achieve any real performance, and their best throughputs were achieved in Linux, not Windows. However, their computers ran older versions of Windows, namely Windows 2000, so some performance may have been gained since then. Nevertheless, the results here show that it is nowadays possible to achieve a high-throughput link using Windows computers, if the parameters are tuned correctly. The best achieved throughput was a mere 5 Mbps below the theoretical value, when sending TCP with low interrupt moderation and 4088 byte jumbo frames with payloads around 4030 bytes.

As for the UDP traffic, a very interesting phenomenon was stumbled upon: the decrease in performance when sending packets with a payload greater than 1024 bytes. This limit seems to stem from the underlying system only sending one packet per interrupt above that size. Why that happens is, however, beyond the scope of this project and will have to be examined in the future.


Figure 4.15. Curve fitting for the UDP throughput. The approximation is made in Matlab from the measured throughput of the UDP transmission with interrupt moderation setting medium.

The UDP traffic also seems to be limited on the receiver side in the low span of packet payloads, i.e. up to 1024 bytes, by the receiver's ability not to lose packets. This may be controlled by choosing the interrupt moderation optimally. Finally, the UDP throughput follows a rather linear pattern; it looks like two linear equations in different intervals. By using Matlab to do curve


fitting, linear approximations were made for the two intervals. The resulting linear fit is presented in Figure 4.15 and it is clear that the linear approximation is rather good. The approximation made is the one stated in Equation 4.1, where Θ is the achieved throughput in Mbps and p is the payload size of the UDP packet in bytes. As seen in Figure 4.15, this approximation works rather well for the UDP throughput. It is, however, a special case for the interrupt moderation setting medium; for other interrupt moderation settings, only the coefficient of p and the constant term differ.

\[
\Theta(p) =
\begin{cases}
0.67 \cdot p + 19 & p \le 1024 \\
0.023 \cdot p + 0.074 & p > 1024
\end{cases}
\tag{4.1}
\]
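To make the size of the predicted drop concrete, Equation 4.1 can be evaluated just below and just above the 1024-byte boundary:

\[
\Theta(1024) = 0.67 \cdot 1024 + 19 \approx 705\ \text{Mbps}, \qquad
\Theta(1025) = 0.023 \cdot 1025 + 0.074 \approx 23.6\ \text{Mbps},
\]

i.e. the fitted model loses over 96% of its throughput for a single additional byte of payload, in line with the measured drop from over 650 to below 50 Mbps.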

4.8 Which Settings to Choose

The final output of the Ethernet evaluation is a set of guidelines and recommendations on how to set up the communications in an Ethernet-based system which uses commodity computers as components.

The first thing worth mentioning is that it was quite easy to set up Ethernet communication using very cheap components and still achieve high bandwidth. The other thing this study suggests is that UDP communication between such computers is a bad deal: the maximum measured throughput was only around 700 Mbps, while TCP communication could achieve well over 900 Mbps. Furthermore, the TCP protocol also guarantees delivery and ordering of the received data, making the coding of such a system a lot easier, since the layers above are relieved of that duty.

Secondly, the adaptive interrupt moderation setting might be handy to use and often achieves good throughput. However, it seems to be optimised for smaller packets, as seen when testing TCP performance with jumbo frames; especially with 9018-byte frames the adaptive setting loses many of its benefits. So when using large packets and jumbo frames, manual tuning of the interrupt moderation might be necessary to achieve good throughput.

A third and final remark concerns the buffer sizes. For TCP, the most beneficial setup seems to be a large transmit and a small receive buffer. The reason is probably that the large send buffer increases the number of transactions currently taking place. For UDP, the trick seems to be to increase the buffer sizes as much as possible, which yields low packet losses and thereby high throughputs.

Chapter 5

Creating a Link Port IP Block

As a part of the investigation of different transfer protocols in embedded signal processing systems, an IP block for Link Port communication has been implemented. The Link Port is a communication protocol supplied by Analog Devices [2], primarily used for chip-to-chip communication between TigerSHARC DSP processors. In order to interface it to an FPGA, a special IP block is needed in the FPGA that supports the Link Port protocol. The idea is to implement a reusable IP block which may transfer data at full speed according to the specifications. This means that it should be able to transmit and receive data at DDR with a frequency of 500 MHz. The Link Port is not naturally a bi-directional transfer medium, since the transmitter and receiver of a single Link Port entity in a TigerSHARC may send to one destination and receive from another. There is therefore no need to combine a receiver with a transmitter and, to enable designers to save area by instantiating only the number of transmitters or receivers they actually use, one transmitter block and one receiver block will be created.

5.1 Link Port Implementation Idea

The investigation of the Link Port protocol involves creating IP blocks which should be able to transmit data according to the Link Port protocol specifications [2]. The work started with drawing a sketch outline to work from, and then trying to implement all the pieces in the outline. To do this, a transmitter and a receiver block idea were first drawn up, as shown in Figure 5.1 and Figure 5.2. In the figures, the basic blocks are present and the data path width is specified, as well as the outputs specified by the protocol specification [2]. Apart from those, the interconnect widths are not specified, since the specifics about them were unknown at the time and would be decided when implementing the parts of the design. As may be seen in Figure 5.1 and Figure 5.2, the IP blocks contain some control registers to enable the host to control the IP block.



Figure 5.1. The original block diagram for the Link Port receiver at the beginning of the design phase. As in Figure 5.2, not many interconnect bit widths are specified. Not much is decided about the inner workings of the parts; only their existence is noted.

However, all of the actual communication will be controlled by the controller logic of the IP block and hence not require any user intervention. By ensuring that these controllers work according to the specifications [13], we also ensure that the communication always happens according to specification. The blocks also contain a FIFO to provide some buffering, and control and status registers that may be read to follow what is happening in the block. Finally, input and output logic is present to produce the signals required by the protocol.

5.1.1 Key Coding Considerations

When implementing the blocks for the Link Port, a lot of consideration has been put into coding the design to be as synthesisable as possible from the beginning, in order to minimise the time spent getting the design through synthesis. To ensure this, most of the primitives have been of type std_logic_vector, and for arithmetic the unsigned type has been used. Both these types are synthesisable by the Xilinx synthesiser.
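As a minimal illustration of this coding style (the entity and signal names are invented for the example, not taken from the actual design), arithmetic is done on unsigned signals while the ports stay std_logic_vector:

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;  -- synthesisable unsigned arithmetic

entity style_example is
  port (
    clk     : in  std_logic;
    data_in : in  std_logic_vector(15 downto 0);
    count   : out std_logic_vector(3 downto 0)
  );
end entity;

architecture rtl of style_example is
  signal cnt : unsigned(3 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      cnt <= cnt + 1;              -- arithmetic on the unsigned type
    end if;
  end process;
  count <= std_logic_vector(cnt);  -- plain vectors on the ports
end architecture;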

5.2 Link Port Transmitter

The Link Port transmitter module is not just a connection to the physical lines; it also contains other parts that make usage easier for the user. It has buffering and a state machine that handles the entire communication. The data bus for the Link Port transceivers of the TigerSHARC processor is 128 bits wide, and the smallest transfer unit is a 128-bit quad-word. This motivated the design choice of a native 128-bit-wide bus, and of keeping the internal



Figure 5.2. This was the original block diagram for the Link Port transmitter at the beginning of the project. Only the known widths are specified, i.e. those given by the protocol specification for the ports that interconnect entities. No internal widths are given except the 128-bit data path native to the Link Port protocol.

data width at 128 bits. This internal data path runs at only half the rate of the output logic and is therefore easier to clock than if it had run at the full transfer frequency. This design choice was made for ease of interfacing the rest of the design to the Link Port outputs, and will be described in further detail later. The main worker in the Link Port transmitter module is the Finite State Machine (FSM), which makes sure that the correct data is always sent at the correct time. The FSM takes care of reading data out of the internal FIFO buffer, moving it through a 128-to-16 serialiser, handling the optional transmission of a checksum byte, and enabling and disabling the transmitter clock at the correct times. In Figure 5.3 the final block diagram of the transmitter IP block is shown. It shows the FSM, which controls the data path and how transmissions are made on the line. The block also contains a readable and writable register file, which reports status, controls the use of the checksum and starts up the IP block. The FIFO is 128 bits wide and feeds data to the serialiser. The serialiser is implemented as a shift register which shifts out 16 new bits on each rising edge. The data then arrives at the output logic, which further serialises the 16 bits into four 4-bit groups on the output lines. The FSM controls the discontinuous clock with an enable signal when it is time to transmit. Each of the parts is explained below in greater detail, starting with the clocking of the block.
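A minimal sketch of the 128-to-16 serialiser stage could look as follows; the entity name and the load handshake are assumptions, since the report only states that it is a shift register emitting 16 new bits per rising edge:

library ieee;
use ieee.std_logic_1164.all;

entity serialiser_128_to_16 is
  port (
    clk_250   : in  std_logic;
    load      : in  std_logic;                       -- from the FSM
    quad_word : in  std_logic_vector(127 downto 0);  -- from the FIFO
    half_word : out std_logic_vector(15 downto 0)    -- to the output logic
  );
end entity;

architecture rtl of serialiser_128_to_16 is
  signal shreg : std_logic_vector(127 downto 0) := (others => '0');
begin
  process (clk_250)
  begin
    if rising_edge(clk_250) then
      if load = '1' then
        shreg <= quad_word;                       -- load a new quad-word
      else
        shreg <= shreg(111 downto 0) & x"0000";   -- shift out 16 bits
      end if;
    end if;
  end process;
  half_word <= shreg(127 downto 112);  -- current half-word at the top
end architecture;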



Figure 5.3. The block diagram for the transmitter IP block. One may see the different blocks, interconnected with the FSM as the controller logic that controls the data flow. The input controller handles Block Complete as well as newly written quad-words. The three clocks (Clk 250, Clk 500 and Clk 500 shift) are all generated by an MMCM to get them properly aligned.

5.2.1 Transmitter Clocking


Figure 5.4. Timing diagram showing the relationship between the different clocks involved in the transfer and the output data. Data is clocked with Clk500 on the output and the clock with which the data is received is ClkShifted, which has a 90° phase shift from the data clock (Clk500).

The first part to be examined is the clocking of the IP block. The block has three input clocks, all generated from an MMCM for correct alignment. The relationship between the clocks is shown in Figure 5.4. One of the most important things to notice is that Clk250 and Clk500 are aligned on the rising edge; this is needed to ensure that data is clocked correctly through the outputs. Also, ClkShifted is shifted 90° after Clk500 in order to adhere to the Link Port specification. As for the internals of the IP block, almost everything runs in the 250 MHz domain, as seen in Figure 5.3. This is done for simplicity, since only the outputs need to be clocked at the higher speed.
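The report does not state the MMCM configuration or the reference clock frequency, but a sketch of how the three aligned clocks could be generated on a Virtex-6, assuming a 100 MHz reference, is shown below. In a real design the outputs would pass through global clock buffers before use.

library ieee;
use ieee.std_logic_1164.all;
library unisim;
use unisim.vcomponents.all;

entity lp_clocking is
  port (
    clk_ref   : in  std_logic;   -- assumed 100 MHz reference
    clk_250   : out std_logic;
    clk_500   : out std_logic;
    clk_500_s : out std_logic    -- 90 degrees after clk_500
  );
end entity;

architecture rtl of lp_clocking is
  signal fb : std_logic;
begin
  mmcm : MMCM_BASE
    generic map (
      CLKIN1_PERIOD    => 10.0,  -- 100 MHz in
      CLKFBOUT_MULT_F  => 10.0,  -- VCO at 1000 MHz
      CLKOUT0_DIVIDE_F => 4.0,   -- 250 MHz
      CLKOUT1_DIVIDE   => 2,     -- 500 MHz
      CLKOUT2_DIVIDE   => 2,     -- 500 MHz ...
      CLKOUT2_PHASE    => 90.0   -- ... shifted 90 degrees
    )
    port map (
      CLKIN1   => clk_ref,
      CLKFBIN  => fb,
      CLKFBOUT => fb,
      CLKOUT0  => clk_250,
      CLKOUT1  => clk_500,
      CLKOUT2  => clk_500_s,
      LOCKED   => open,
      RST      => '0',
      PWRDWN   => '0'
    );
end architecture;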



Figure 5.5. The transmitter FSM state chart showing all states of the FSM and how they are connected to each other. Note that in the Last Send state the Checksum Enabled option has the highest priority before the other edges are checked. If a reset signal is asserted the FSM restarts in the Uninitialised state. Ack is the input acknowledgement signal from the connected receiver, and data available indicates whether there is data available in the internal FIFO buffer.

5.2.2 Transmitter State Machine

The state machine of the transmitter is its backbone and is described in Figure 5.5. In the figure we see all the states and the connections between them. Every state is described in detail below, and a VHDL sketch of the state machine follows the state descriptions.

Uninitialised is the state where the transmitter starts. It will leave this state as soon as possible after receiving a start command from the host indicating that the transmitter should start.

Idle is the state that the transmitter is in while waiting for data to send. When data arrives, the transmitter advances to the next state.

Ready To Send indicates that the transmitter has data to send but the receiver is not ready to receive yet, indicated by Ack being low. As soon as Ack goes high,

the transmitter advances to the next state and starts to transmit.

Sending is where the transmitter transmits the main part of the data. It remains here for the transmission of the first seven half-words, leaving for Last Send when the last word should be sent.

Last Send is a special send state where decisions are made. It transmits the last half-word of the output data, then checks whether the checksum byte should be sent, whether the acknowledgement signal is high and whether data is available, and decides which state to move to based on that.

Send Checksum is the state where the checksum and dummy bytes are transmitted. In this implementation, the dummy byte is always equal to the last sent data byte due to implementation specifics.
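Condensing the state descriptions above into VHDL gives a sketch like the following. Signal names are illustrative, and the transitions out of Send Checksum are assumed to mirror those of Last Send:

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity lp_tx_fsm is
  port (
    clk_250          : in  std_logic;
    reset            : in  std_logic;
    host_ready       : in  std_logic;   -- from the control register
    checksum_enabled : in  std_logic;   -- from the control register
    data_available   : in  std_logic;   -- from the internal FIFO
    ack              : in  std_logic;   -- from the connected receiver
    transmitting     : out std_logic    -- high while data is on the line
  );
end entity;

architecture rtl of lp_tx_fsm is
  type state_t is (UNINITIALISED, IDLE, READY_TO_SEND,
                   SENDING, LAST_SEND, SEND_CHECKSUM);
  signal state    : state_t := UNINITIALISED;
  signal hw_count : unsigned(2 downto 0) := (others => '0');  -- half-words sent
begin
  transmitting <= '1' when state = SENDING or state = LAST_SEND
                        or state = SEND_CHECKSUM else '0';

  process (clk_250)
  begin
    if rising_edge(clk_250) then
      if reset = '1' then
        state <= UNINITIALISED;
      else
        case state is
          when UNINITIALISED =>            -- wait for the host start command
            if host_ready = '1' then
              state <= IDLE;
            end if;
          when IDLE =>                     -- wait for data to send
            if data_available = '1' then
              state <= READY_TO_SEND;
            end if;
          when READY_TO_SEND =>            -- wait for the receiver's Ack
            if ack = '1' then
              state    <= SENDING;
              hw_count <= (others => '0');
            end if;
          when SENDING =>                  -- the first seven half-words
            hw_count <= hw_count + 1;
            if hw_count = 6 then
              state <= LAST_SEND;
            end if;
          when LAST_SEND =>                -- eighth half-word, then decide
            if checksum_enabled = '1' then -- checksum has highest priority
              state <= SEND_CHECKSUM;
            elsif data_available = '1' and ack = '1' then
              state    <= SENDING;
              hw_count <= (others => '0');
            elsif data_available = '1' then
              state <= READY_TO_SEND;
            else
              state <= IDLE;
            end if;
          when SEND_CHECKSUM =>            -- checksum and dummy byte
            if data_available = '1' and ack = '1' then
              state    <= SENDING;
              hw_count <= (others => '0');
            elsif data_available = '1' then
              state <= READY_TO_SEND;
            else
              state <= IDLE;
            end if;
        end case;
      end if;
    end if;
  end process;
end architecture;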

5.2.3 Transmitter LVDS Outputs

Since the Link Port is an LVDS port with a double data rate, special care has been taken when designing the output logic. Help was obtained from the Xilinx CoreGenerator Wizard, a tool in which a specified setup of an output port may be created from a graphical wizard where serialisation, data rate and so forth are specified. The upside of using this wizard to create the outputs was that the tool itself created all the necessary primitives and connected them to each other. The downside is that it might be non-optimal for a specific solution, since it may introduce unnecessary logic into the design. However, it was used to generate an output, and after generation the logic was thoroughly tested in simulation to ensure that its functionality was correct. The created output block has a 16-bit input which is serialised onto four differential pin pairs, which are the Link Port data lines. These pairs supply data at 500 MHz DDR, i.e. one new bit every nanosecond per data line. The block needs two input clocks: the 500 MHz clock for the data, but also a clock at half that speed, i.e. 250 MHz, which is the rate at which new 16-bit words arrive at the inputs of the block. To realise the serialisation the block uses the OSERDES1 primitive, a serialiser present in the output block of the FPGA. The idea of the OSERDES1 is to present new output data on both the rising and the falling edge of the input clock, which in this case is the 500 MHz clock. Since the data path is four bits wide, four OSERDES1 are used in parallel to obtain the correct data path width. In addition to the Link Port data output lines, the source-synchronous clock also has to be supplied. Compared to the data, the clock edge should be 90° phase shifted relative to the data switching edge. The relationship between the clocks and the output data is shown in Figure 5.4. The ClkShifted is always present.
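As a rough illustration of what the four parallel OSERDES1 blocks accomplish, the following behavioural simulation model shows the 16-bit-to-four-line DDR mapping. It is not the primitive-based logic the design actually uses, and the low-nibble-first ordering is an assumption:

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity ddr_out_model is
  port (
    clk_500   : in  std_logic;
    half_word : in  std_logic_vector(15 downto 0);
    data_out  : out std_logic_vector(3 downto 0)
  );
end entity;

architecture sim of ddr_out_model is
  signal idx : unsigned(1 downto 0) := "00";
begin
  -- advance one nibble on every clock edge (DDR); simulation only
  process (clk_500)
  begin
    idx <= idx + 1;
  end process;
  data_out <= half_word( 3 downto  0) when idx = 0 else
              half_word( 7 downto  4) when idx = 1 else
              half_word(11 downto  8) when idx = 2 else
              half_word(15 downto 12);
end architecture;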

56 5.2. LINK PORT TRANSMITTER

However, the output source-synchronous clock should only be enabled when actual data transfer is happening, which requires the use of an enable signal. The enable signal Enable originates from the FSM, since the FSM is the part which controls the transmission. As it happens, the previously used OSERDES1 block has an enable input. Since the characteristics of the output clock relative to the output data are identical to those between ClkShifted and DataOut in Figure 5.4, and DataOut is generated through the OSERDES1, the output clock could also be generated through the same kind of block. However, since the Enable signal comes from the Clk250 domain and should end up enabling something in the ClkShifted domain, it has to cross domains. This is done as shown in Figure 5.6, by crossing through two registers in the ClkShifted domain. This is also needed to create the same delay as the data path from the FIFO, through the serialiser and out to the data outputs. Since the clock only shifts between 0 and 1, the data inputs are tied to those values, as seen in Figure 5.6.


Figure 5.6. The schematic layout of the enable-signal clock domain crossing in the transmitter. Notice how the Enable signal crosses from the Clk250 domain (CLK_250) to the ClkShifted domain (CLK_500_SHIFTED) through a register clocked on the falling edge of ClkShifted, to give the signal as long a time as possible to propagate.


Figure 5.7. Timing diagram of the clock domain crossing of the enable signal for the output clock in Figure 5.6, from which all signal names are taken. Q1 is the data between the registers, EN is the enable of the OSERDES1 (the rightmost component) and ClkOut is the Q port of the OSERDES1.

The design was, however, not elementary. Since the enabling of the clock is decided on a rising edge of Clk250 and it should reach a register in the ClkShifted domain, the signal would have had only 0.5 ns to travel to that register. To overcome this, the signal is clocked on the falling edge of ClkShifted, and

hence the enable signal has a full 1.5 ns to get from the Clk250 to the ClkShifted domain; the enable signal then has 1 ns to enable the output before the shifted clock goes high again. This makes the timing requirements easier to meet. The design may be seen in Figure 5.6 and its timing is shown in Figure 5.7.
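A sketch of this crossing in VHDL could look as follows; the entity and signal names are illustrative, and the output is assumed to drive the enable input of the output-clock OSERDES1:

library ieee;
use ieee.std_logic_1164.all;

entity enable_cdc is
  port (
    clk_shifted : in  std_logic;  -- the 90-degree shifted 500 MHz clock
    enable_250  : in  std_logic;  -- Enable, registered in the Clk250 domain
    enable_out  : out std_logic   -- to the enable input of the OSERDES1
  );
end entity;

architecture rtl of enable_cdc is
  signal en_fall : std_logic := '0';
  signal en_rise : std_logic := '0';
begin
  process (clk_shifted)  -- falling-edge stage: 1.5 ns for the crossing
  begin
    if falling_edge(clk_shifted) then
      en_fall <= enable_250;
    end if;
  end process;

  process (clk_shifted)  -- rising-edge stage: matches the data-path delay
  begin
    if rising_edge(clk_shifted) then
      en_rise <= en_fall;
    end if;
  end process;

  enable_out <= en_rise;
end architecture;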

5.2.4 The Data Path and Memory Design

The data travels along its own path, controlled by the FSM, on its way to the output logic. The design consists of a memory FIFO buffer which is used to clock data in from the IP block connections. The FIFO is adjustable to different sizes, but for this design it was created as an 8-entry-deep FIFO, where each entry is a 128-bit quad-word. The FIFO is made up of three separate counters and four 32-bit-wide memory banks. 32-bit-wide memory banks were chosen due to their wide availability inside FPGAs, and because they map more easily onto built-in memory blocks. The three counters in the FIFO are an input address counter, an output address counter and a full/almost-full counter. The full/almost-full counter is used by the logic to control whether more data may be written into it. The almost-full signal is not used on the transmitter side, but on the receiver side, where the receiver must act in advance to stop the transmission in order not to lose any data.
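A simplified, behavioural sketch of such a FIFO is given below. It uses a single 128-bit-wide memory array instead of the four 32-bit banks of the real design, and the almost-full threshold is an assumption:

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity lp_fifo is
  port (
    clk         : in  std_logic;
    wr_en       : in  std_logic;
    wr_data     : in  std_logic_vector(127 downto 0);
    rd_en       : in  std_logic;
    rd_data     : out std_logic_vector(127 downto 0);
    full        : out std_logic;
    almost_full : out std_logic;  -- used on the receiver side to drop Ack
    empty       : out std_logic
  );
end entity;

architecture rtl of lp_fifo is
  type mem_t is array (0 to 7) of std_logic_vector(127 downto 0);
  signal mem            : mem_t;
  signal wr_ptr, rd_ptr : unsigned(2 downto 0) := (others => '0');
  signal fill           : unsigned(3 downto 0) := (others => '0');  -- 0..8
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if wr_en = '1' and fill /= 8 then
        mem(to_integer(wr_ptr)) <= wr_data;
        wr_ptr <= wr_ptr + 1;
      end if;
      if rd_en = '1' and fill /= 0 then
        rd_ptr <= rd_ptr + 1;
      end if;
      -- keep the fill counter consistent with the two pointers
      if (wr_en = '1' and fill /= 8) and not (rd_en = '1' and fill /= 0) then
        fill <= fill + 1;
      elsif (rd_en = '1' and fill /= 0) and not (wr_en = '1' and fill /= 8) then
        fill <= fill - 1;
      end if;
    end if;
  end process;

  rd_data     <= mem(to_integer(rd_ptr));
  full        <= '1' when fill = 8 else '0';
  almost_full <= '1' when fill >= 6 else '0';  -- threshold is an assumption
  empty       <= '1' when fill = 0 else '0';
end architecture;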

5.2.5 Controlling the Transmitter

Some setup is needed in order to get data transmissions up and running. There are status and control registers with which the IP block is controlled. The status register tells whether the transmitter has buffer space or is full, and whether any errors have occurred. The writable register is used for setting parameters such as the transfer width and whether to enable the checksum, and finally for enabling the transmit block when it should be used.

Writeable Control Register

Bits:   31 - 3    2              1                  0
        Unused    Transfer Size  Checksum Enabled   Host Ready

Figure 5.8. The layout of the writable control register. Only three things need to be controlled in the design. The most important is to start the transmitter by writing a 1 to bit position 0. Simultaneously, a 0 or 1 is written at position 1, depending on whether the checksum is wanted or not.

To control the transmitter there is a writeable register for setting the preferences desired in the current application; it is also used to start the Link Port transmitter. The writeable register is laid out as in Figure 5.8 and the different bits control the behaviour of the Link Port transmitter as described below; a set of VHDL constants matching this layout is sketched after the field descriptions.


Transfer size is a one-bit field that sets the width of the data path for the transmissions. However, in this release only the four-bit-wide interface is implemented. Possible values: 0 | 1-bit-wide Link Port (not yet implemented). 1 | 4-bit-wide Link Port output.

Checksum enabled sets if the optional checksum calculation is enabled or not. Possible values: 0 | Checksum disabled. 1 | Checksum enabled.

Host ready is the indication from the host that it is ready to use the Link Port transmitter block. When writing a 1 to the IP block, the FSM starts the necessary setup sequence for enabling the Link Port link with the remote receiver. Possible values: 0 | Not ready. 1 | Ready to transfer data.
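The bit positions of Figure 5.8 can be collected in a small VHDL package; the package and constant names are invented for the example:

library ieee;
use ieee.std_logic_1164.all;

package lp_ctrl_pkg is
  -- Bit positions in the writable control register of Figure 5.8
  constant CTRL_HOST_READY : natural := 0;  -- write '1' to start the block
  constant CTRL_CHKSUM_EN  : natural := 1;  -- '1' enables the checksum byte
  constant CTRL_XFER_SIZE  : natural := 2;  -- '1' = 4-bit-wide Link Port
end package;

-- Example host write: start with checksum enabled and a 4-bit port:
--   ctrl_reg <= (CTRL_XFER_SIZE => '1', CTRL_CHKSUM_EN => '1',
--                CTRL_HOST_READY => '1', others => '0');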

Readable Status Register

There is also a readable status register telling the current status of the transmitter IP block. It provides several pieces of information about the current state of the transmitter. The bit layout of the readable register may be seen in Figure 5.9, and each bit position and its meaning is explained below.

Bits:   31 - 6   5              4               3               2              1           0
        Unused   FIFO Overflow  FIFO Underflow  Checksum Error  Timeout Error  FIFO Empty  FIFO Full

Figure 5.9. The layout of the readable status register. The register is used to determine if more data could be written into the IP block, or if there is available data on output. In addition it may tell if too much data has been written into the FIFO by indicating FIFO overflow.

FIFO overflow is an indication of whether the FIFO has had an overflow or not. Should not be able to happen in a receiver due to internal logic. However in a transmitter there is a possibility that the host writes to the link port module without checking if it is possible to write or not and thus creating an overflow. Possible values: 0 | No overflow. 1 | Overflow detected.

FIFO underflow is an indication of whether the FIFO has had an underflow or not. Should not be able to happen in a transmitter due to internal logic. However

in a receiver there is a possibility that the host reads from the link port module without checking if it is possible to read or not, thus creating an underflow. Possible values: 0 | No underflow. 1 | Underflow detected.

Checksum error indicates if a checksum error has happened in a transmission and the value received is not to be considered correct. Possible values: 0 | No checksum error. 1 | Checksum error detected.

Timeout error indicates that a timeout error has happened in a transmission. This could be due to lost bits in a transmission, which makes the receiver come out of sync. Possible values: 0 | No timeout error. 1 | Timeout error detected.

FIFO empty is an indication that there is no data in the FIFO and thus no data may be read. This however guarantees that data may be written into the FIFO. Possible values: 0 | FIFO not empty. 1 | FIFO empty.

FIFO full indicates that the FIFO is full and thus cannot receive any more data writes into it. If a write is done anyway it will result in a FIFO overflow error. Possible values: 0 | FIFO not full. 1 | FIFO full.

5.2.6 Checksum Calculator

To increase the integrity of the data there is a checksum calculator which may produce a checksum to send. The checksum is specified in the TigerSHARC specification [2] and is calculated as in Equation 5.1, where LSB denotes the least significant byte of its argument and B_i is the i:th byte of the transferred quad-word.

\[
\mathrm{CHECKSUM} = \mathrm{LSB}\left(\sum_{i=0}^{15} B_i\right)
\tag{5.1}
\]

When implementing the checksum calculator, a large adder tree is created in order to sum all the parts of the quad-word. To create this in the VHDL code, an easy approach was chosen: the adder tree was created as one big adder tree without any internal registers, and the path from its inputs to its outputs was specified as a multi-cycle path in the constraints file. This is possible since the

checksum may be calculated during the entire time that the quad-word is being sent, i.e. several clock cycles. By doing this the logic needed is decreased, since the timing is far more relaxed.
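A sketch of such an unregistered adder tree is shown below; in the real design the input-to-output path is additionally declared as a multi-cycle path in the constraints:

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity checksum_calc is
  port (
    quad_word : in  std_logic_vector(127 downto 0);
    checksum  : out std_logic_vector(7 downto 0)
  );
end entity;

architecture rtl of checksum_calc is
begin
  process (quad_word)
    variable sum : unsigned(11 downto 0);  -- wide enough for 16 byte-adds
  begin
    sum := (others => '0');
    for i in 0 to 15 loop
      sum := sum + unsigned(quad_word(8*i + 7 downto 8*i));
    end loop;
    checksum <= std_logic_vector(sum(7 downto 0));  -- LSB of the sum
  end process;
end architecture;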

5.2.7 The Implementation of Block Complete

In many applications it is beneficial if the number of bytes to be sent can be unknown to the receiver, or at least variable. To enable such transmissions, the Link Port protocol uses the n_BCMP signal. For the receiver side it might be very important to know when a block is complete, and thus the last word has to be markable. The way to do that in this Link Port implementation is to write to a specific port on the IP block, notifying it that the next quad-word is the last in its block. By writing to that port, a value is stored in memory alongside the next quad-word written to the block. Later, when the FSM reads that quad-word out of the FIFO, it also signals block complete by setting n_BCMP low, so that the receiver knows that the block is now complete.

5.3 Link Port Receiver

The other end of the Link Port link is the receiver. The receiver could be thought of as merely the transmitter reversed, but that is not entirely true. The receiver has a somewhat harder design due to two main factors: different clock domains and a discontinuous clock. Before dealing with these problems, however, a description of the Link Port receiver functionality is presented.


Figure 5.10. The block layout of the Link Port receiver. All blocks are presented with their interconnects and the width of the data paths and the input and output ports. The critical clock domain crossing takes place inside the Receiver Logic block.

The Link Port receiver is in many respects similar to the transmitter. It contains an FSM which is responsible for keeping the data transfers running continuously, it has LVDS inputs, and it contains a FIFO buffer where received words

are buffered before being read out by the host. The block layout of the design can be seen in Figure 5.10, which also shows where the data crosses clock domains, from the input logic to the rest of the data path. Each of the building blocks is presented below, followed by a description of the technical difficulties with the clock domain crossing and the discontinuous clock.

5.3.1 Receiver Finite State Machine

The heart of the receiver is its Finite State Machine (FSM). The FSM needs to be fast in order to meet the requirements of the communication protocol; the required timings are set externally and have to be followed. One of the hardest requirements is the receiver's ability to handle back-to-back transmissions, i.e. transmissions with no idle time in between, which requires that when one quad-word has been received the receiver is immediately ready to receive another. Therefore, the receiver FSM has been created with as few states as possible. The four states are called OFF, RECEIVING, LAST_RECEIVE and CHECKSUM, and their respective meanings are explained below.


Figure 5.11. The Link Port receiver state diagram. It shows the states and the transitions between them.

OFF is the reset and error state. The receiver tries to stay in this state for as short a time as possible. It leaves when reset is de-asserted and it receives the indication

from the connected transmitter that the link is active, i.e. the de-assertion of n_BlockComplete. The receiver re-enters this state only when reset is asserted.

RECEIVING is the state where the receiver spends most of its time. This state indicates that the receiver is waiting to receive a full quad-word. It counts the bytes that have arrived, and when one reception remains it moves to the LAST_RECEIVE state.

LAST_RECEIVE is the state responsible for resetting the counter before the next receive. It is also responsible for the next transition into either CHECKSUM or straight back to RECEIVING depending on the user settings.

CHECKSUM is the state which reads the checksum byte out of the stream. It should then compare the received checksum byte to the byte calculated in the receiver. This is not yet implemented; however, the functionality needed to support checksum transmissions is working. This means that the receiver may connect to a checksum-enabled transmitter and receive data correctly, but it simply discards the checksum and dummy bytes.

5.3.2 Controlling the Receiver

Controlling the receiver is identical to controlling the transmitter circuitry, as explained in subsection 5.2.5, with one exception: bit number 6 in the readable register is used in the receiver to indicate whether the current output word is the last word in a block. So, prior to reading out the next word, bit position 6 in the readable register can be checked to see whether it is the last quad-word in a block.

5.3.3 The Deserialisation of Incoming Data

When data arrives at the receiver it is serialised and needs to be deserialised correctly in order to be read out. The deserialisation takes place in two steps, which occur in two separate blocks. The first step occurs in the input logic block and deserialises the 4-bit DDR stream into a 16-bit-wide SDR stream. The 16-bit SDR stream from the input logic arrives at a 16-to-128 deserialiser, which takes eight half-words and combines them into a quad-word that is then stored in the FIFO buffer for the host to read. The FIFO which the data is stored into is very similar to the one in the transmitter IP block (see page 58). The FIFO on the receiver side, however, uses the AlmostFull functionality of the internal counters. This is used in order to know when to de-assert the Ack signal to the transmitter. By de-asserting the signal, the receiver informs the transmitter that it is not ready to receive any more data, which in turn forces the transmitter to pause the transmission.


5.3.4 Receiver LVDS Inputs

The creation of the LVDS input logic was one of the most complex parts of the design. There were several difficulties, e.g. the discontinuous source-synchronous clock and the very high clock speed and data rate. The input data stream is a 500 MHz DDR stream, which means that data has to be captured every nanosecond. By examining the specification of the Link Port protocol, the beginning and end of a transmission may be drawn. In Figure 5.12 the beginning of a transmission may be seen: the data starts on the first rising edge of the clock. In Figure 5.13 the end of a transmission is shown: the last reception of data happens on the last falling edge. This shows the role of the discontinuous clock, which causes many problems when implementing the Link Port receiver.


Figure 5.12. Timing diagram for the receiving side of the Link Port block. It shows the start of a reception of a quad-word. Note that the transmission starts on the first rising edge of the clock.


Figure 5.13. Timing diagram for the receiving side of the Link Port block. It shows the end of a reception of a quad-word. After the last falling clock edge, there is no guarantee that another clock pulse will come, so the last data must be clocked in on the last falling edge.

In the creation of the receiver, several implementations were evaluated before reaching one that met the requirements and was good enough to use in the design. Below, some of the attempted implementations are presented and their respective flaws pointed out.

The First Receiver Implementation

The first attempt used the Core Generator from Xilinx to generate the wanted input logic. Since this approach worked well for the creation of the transmitter, the idea was that it should work equally well for the receiver. However, due to the higher complexity of the receiver, this was not the case. Since data starts to be collected at the first rising edge of the input clock, the receiver must already be primed and ready when the first rising edge arrives. The correct timing is shown in Figure 5.12; however, when using the generated core the input lost the first input bits, as well as cut off the input at the end. This was due to internal delays in the ISERDES1 block inside the FPGA, which made the primitive hard to work with.


The resulting timing of a circuit created in Core Generator is presented in Figure 5.14. When studying the resulting timing and input/output, the flaw in the design is visible, since there is a delay of possibly undefined length in the design.


Figure 5.14. Timing diagram for the input of the Link Port receiver when using CoreGenerator. DataIn is the data arriving at the inputs and DataOut is the data that goes out to the host.

The generated design is presented in Figure 5.15. It is clear that the generated schematic uses ISERDES1 primitives from the Virtex-6 FPGA family for the four input ports, and that it divides the input clock down to half speed for clocking out data. With this design, however, there were two problems. The first was that the first quartet never actually appeared on the output of the circuit. The second was that the last data does not exit the module on the last falling clock edge; instead, the timing is as shown in Figure 5.14. The problem is that there is no knowledge about the arrival of the last quartet at the receiver, since its output relies on the next clock pulse, i.e. the start of the next transmission.

Another Attempted Design

When the attempt to use only CoreGenerator failed, a pure VHDL solution was chosen as the next step. This attempt used fast FIFOs as receiver logic. The design contained two FIFO buffers, one for the rising edge of the clock and one for the falling edge. The idea was to capture sixteen bits at a time while constantly reading the FIFOs out with the output clock. The internal logic notifies the receiver FSM when there is data available, and the FSM then reads it out and decides how to process it, i.e. whether it is data or checksum bytes. The output data width was chosen to be 16 bits for a simple reason, namely the desire to support checksum calculations. Since the frame length differs by 16 bits between checksum-enabled and checksum-disabled transmissions, a 128-bit-wide datapath was inconvenient. By using only 16 bits, the design could support both checksum-enabled and checksum-disabled transmissions without the input logic being aware of what it reads. This block is intended to be fed with a 500 MHz clock on both the input and output clock ports. The input clock should be supplied from outside the chip as a differential signal. The output clock is the clock that outputs data from the block to the FSM. These two clocks have no knowledge of how they are correlated to each other; the important thing is that they share the same frequency. The output data is clocked out through flip-flops driven by the output clock.



Figure 5.15. The schematic for the generated receiver logic.

This design was, however, not able to meet the timing requirements, which caused the move to the next implementation.

Close to Goal

When the deserialiser circuitry (ISERDES1) on the FPGA, which was initially attempted, turned out to cause problems, and the dual-FIFO solution was inadequate, a combination of standard VHDL and Xilinx primitives was considered. The primitive Xilinx parts were used for the differential-to-single-ended conversion, and the rest was handled by standard VHDL constructs. The solution this time was to write a VHDL implementation without the generator, introducing some Xilinx primitives in the VHDL to deal with certain aspects of the signalling. For example, the differential inputs are connected to IBUFDS primitives, which are input buffers that take a differential input and convert it to a single-ended output. These were in turn connected to a number of registers used to deserialise the inputs by clocking them in; on every second falling edge a data-available signal is set high to ensure that the FSM logic registers the data. The schematic of the input design is available in Figure 5.16, where it may be seen how the data is clocked into the registers in a particular order to deserialise the data.



Figure 5.16. The final layout of the receiver LVDS inputs. Everything is clocked with the source synchronous clock sent out by the transmitter. Data is made single ended by the IBUFDS primitive, located furthest out to the left.


Figure 5.17. Timing diagram showing how the signals in Figure 5.16 behave and are clocked.

To follow the data and clocking in Figure 5.16, we can look at the timing diagram in Figure 5.17, which shows the start of a transmission and the reception of the first two 16-bit output words. The data is clocked into R1 at the first rising clock edge. At the first falling edge, data is clocked into register R3. At the second rising edge, the enable signal goes high, the new data is stored in R1 and the data from R1 is clocked into R2. At the second falling edge, enable is high and the data from R1, R2 and R3 is clocked into OutReg, along with the current input data at DataIn. This data is later read by the FSM in order to get the received data to the host. The data is indicated by the DAV signal, which is not shown in the schematic but in the timing diagram. The DAV and OutReg are then connected to the rest of the design through double registers clocked with the receiver's

local 500 MHz clock, to be processed in the receiver. This design was indeed the final design in terms of VHDL code, but it was not yet sufficient to get through timing. The next section explains what else needed to be done in order to meet the timing requirements; however, this was the design that came closest to meeting them.
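A VHDL sketch of this input stage, reconstructed from Figure 5.16 and the timing diagram in Figure 5.17, could look as follows. The exact nibble ordering and the behaviour of the enable counter are assumptions based on those figures:

library ieee;
use ieee.std_logic_1164.all;

entity lp_input_deser is
  port (
    clk_in   : in  std_logic;                      -- source-synchronous clock
    data_in  : in  std_logic_vector(3 downto 0);   -- after the IBUFDS buffers
    data_out : out std_logic_vector(15 downto 0);  -- to the 16-to-128 stage
    dav      : out std_logic                       -- data available to the FSM
  );
end entity;

architecture rtl of lp_input_deser is
  signal r1, r2, r3 : std_logic_vector(3 downto 0);
  signal en_cnt     : std_logic := '0';  -- toggles on every falling edge
begin
  process (clk_in)                        -- rising-edge registers
  begin
    if rising_edge(clk_in) then
      r1 <= data_in;
      if en_cnt = '1' then
        r2 <= r1;
      end if;
    end if;
  end process;

  process (clk_in)                        -- falling-edge registers
  begin
    if falling_edge(clk_in) then
      r3     <= data_in;
      en_cnt <= not en_cnt;
      if en_cnt = '1' then                -- every second falling edge
        data_out <= data_in & r1 & r3 & r2;  -- assumed nibble ordering
        dav      <= '1';
      else
        dav      <= '0';
      end if;
    end if;
  end process;
end architecture;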

5.3.5 Getting the Receiver Through Timing

The timing issues came largely from the discontinuous clock in combination with the high frequency. With a DDR signal and a frequency of half a gigahertz, there is not much time to play with. The timing problem came, however, largely from the tool's willingness to implement the input clock through global clocking resources, i.e. an IBUFGDS. Since these are located in the middle of the fabric and are intended for low-skew distribution throughout the chip, they are not ideal for a small network that needs very low latency from pin to register. In a design where the IBUFGDS clocks the data at both ends of a path (a register-to-register delay) this is not an issue, but since the data in the Link Port receiver is source synchronous with respect to the input clock, such a delay is not acceptable. To deal with this, the FPGA has other clock networks than the global ones. The most interesting here are the I/O clock networks and the regional clock networks. The I/O networks may, however, only clock input registers and input logic, and referring to Figure 5.16 it is clear that more than just an input register needs to be clocked. This leaves the regional clocking networks, which have a limited size in order to provide their high speed. This is not a problem, since the logic clocked by the input clock is rather limited in size: only 39 flip-flops and one inverter, far below the size that a regional clocking buffer is able to clock. The resulting layout is presented in Figure 5.18. In addition to forcing the tool to use a regional clocking buffer, care had to be taken to place the input logic close to its input pins as well. The exact location relative to the pins is, however, not very important, as long as the routes have equal distances to travel, so that all signals arrive at their registers at the same time, with as little skew as possible relative to the input clock. To achieve this, manual placement of the input pins was used: the pins were placed as differential pairs, as close to each other as possible, to minimise the placement problems. In addition to the above, there is a clocking issue since there are both a 500 MHz and a 250 MHz clock in the design. They are, however, aligned on the rising edge (not the input clock, but the internal 500 MHz clock; compare with clocks Clk250 and Clk500 in Figure 5.4). This can cause trouble when signals cross from one domain to the other, but to ensure timing correctness, dedicated clocking blocks (MMCMs) are used and correct constraints are placed on the design where signals cross between domains. That, however, is not true for the input clock. This clock is totally unrelated to the other clocks in the design and thus it is



Figure 5.18. Here we see how the clock is routed through a standard differential input pair (IBUFDS) and in through a regional clock buffer (BUFR). This design choice is made to minimise the input delay of the clock.

impossible to constrain the signals that cross between those domains. To fix this, the design has double flip-flops at each crossing to avoid metastability, as shown in Figure 5.19. This design practice has been suggested in [60] and should thus be a reliable way to cross between two clock domains; it minimises the risk of metastable states.
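The double flip-flop crossing is a standard construct; a minimal sketch with illustrative names is given below:

library ieee;
use ieee.std_logic_1164.all;

entity sync_2ff is
  port (
    clk_dst  : in  std_logic;   -- destination-domain clock
    async_in : in  std_logic;   -- signal from the unrelated clock domain
    sync_out : out std_logic
  );
end entity;

architecture rtl of sync_2ff is
  signal meta, stable : std_logic := '0';
begin
  process (clk_dst)
  begin
    if rising_edge(clk_dst) then
      meta   <= async_in;   -- first stage may go metastable
      stable <= meta;       -- second stage gives it a cycle to settle
    end if;
  end process;
  sync_out <= stable;
end architecture;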

Figure 5.19. Here we see the clock domain crossing for the Link Port receiver circuitry. This design has been chosen to avoid metastability.


5.4 Testing and Verification

To test and verify the functionality of the IP blocks, they were both simulated as stand-alone IP blocks with an outside test bench which simulated the Link Port behaviour. This was the preferred method when developing the blocks, since it makes it easy to see what is going wrong and to fix it; therefore it was the primary simulation method during development. However, to ensure that the transmitter and receiver, which both seemed to work well with their independent test benches, also worked in a sharper environment, they were put into the same test bench and connected to each other. The only connections made were from the Link Port outputs of Figure 5.3 to the Link Port inputs of Figure 5.10 and vice versa. By doing so, we ensured that the test bench did not interfere with the signals belonging to the actual Link Port protocol; the test bench could only affect the IP blocks through their respective backside connections. Tests were then run both with and without checksum-enabled transmissions, both filling up the buffers and transmitting them empty, in order to ensure that the IP blocks work according to the Link Port specification. This testing verified that the two IP blocks could indeed connect to each other, and that they handle the ACK and n_BCMP signals as intended. These simulations also made it possible to measure the latency and throughput of the IP blocks. The throughput is an elementary calculation, since the IP blocks conform to the Link Port standard, which gives a throughput of 500 MB/s [2], see Appendix H. The calculation of the latency was a bit trickier, but the values could be read out from the simulation. Starting with the receiver, the measured latency lies between 51 and 55 ns, depending on the relationship between the input clock and the local clocks. Since data is clocked out with a clock that has a 4 ns period, the start of reception, relative to the 4 ns period, may differ by up to 4 ns. Moving on to the transmitter, its latency is measured from the write of a word to the time the last bit has left the output, given that everything is started up. The latency of the transmitter is deterministic, since all its clock relationships are fixed. The latency of the transmitter is similar in size, 56.5 ns when starting cold.
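The 500 MB/s figure follows directly from the link parameters:

\[
4\ \text{data lines} \times 500\ \text{MHz} \times 2\ \text{bits per line and cycle (DDR)} = 4\ \text{Gbps} = 500\ \text{MB/s}.
\]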

5.5 IP Block Restrictions

For the IP block there are some restrictions which apply in order for it to work as specified. One restriction is that the device has to support a 500 MHz frequency in its regional clocking nets, which restricts the choice of FPGA to those which support that. It also restricts the energy efficiency of the FPGA, since it has to run in a non-energy-save mode. The design targeted

a Xilinx Virtex 6 FPGA, and in that family the design is applicable to every model supporting speed grade -3. When the new Virtex 7 arrives, the design will fit into any Virtex 7 with speed grade -2 or -3, of which all devices are able to support -2 according to the preliminary data sheet [61].

5.6 IP Block Metrics

The final implementation of the Link Port IP blocks was placed and routed using Xilinx ISE 13.4 and the previously specified FPGA. The necessary constraints were specified, and from that placement the statistics in Table 5.1 show that the receiver implementation is slightly larger than the transmitter implementation. One cause of this could be the increased complexity on the receiver side, with the clock domain crossing and the discontinuous clock.

                     Transmitter  Receiver
Slice Registers      584          605
Slice LUTs           284          289
Slices               154          193
Dynamic Power (mW)   184          181
Latency (ns)         56.5         51.0-55.0
Bandwidth (MB/s)     500          500

Table 5.1. The resource usage of the two IP blocks in tabular form. The table shows that the receiver is a bit larger than the transmitter; however, it also contained more technical challenges.

In Table 5.1 we also see the power consumption of the Link Port blocks. Worth mentioning is that approximately 160 mW of the estimated power is dissipated in the clocking resources, the clock buffers and the MMCM. This implies that very little power is actually used in the logic parts of the IP blocks.

5.7 Link Port Implementation Time

As a part of this work, the time taken has been recorded, in order to enable future work in this area and possibly to compare the time taken to implement this in VHDL with another implementation method, such as another language. The work has been divided into several categories and the time spent on each kind of work has been recorded. The categories are Protocol Understanding, VHDL Coding, Coding for Synthesis, Protocol Debugging and Solving Timing Problems. The total time of creation for these two IP blocks was 222 hours. The hours were spent as specified in Table 5.2 and are reported as approximate hours.


Category                  Hours (approximate)
Protocol Understanding    26
VHDL Coding               65
Coding for Synthesis      47
Protocol Debugging        43
Solving Timing Problems   41

Table 5.2. This table specifies the number of hours spent on creating the IP blocks for the Link Port transmitter and receiver.

5.8 Contributions of This Link Port Implementation

One of the main contributions of this Link Port implementation is the use of the native 128-bit-wide data path. In the IP blocks available from the two main FPGA manufacturers, Altera [62] and Xilinx [63, 64], as well as in Qiang et al. [16], the data path has often been a common 32-bit-wide bus. By using a 128-bit-wide bus, more efficient data transfers may be achieved, since fewer transfers are necessary and the transfer frequency may be lower. Further, this implementation can operate at full speed (500 MB/s), which neither of the Xilinx-supplied implementations [63, 64] is able to do. The possibility of reusing the design in another type of FPGA is also very good: only the device-specific primitives (ISERDES1 and OSERDES1) need to be replaced by their equivalent components in the new architecture; the rest of the code is portable to any architecture. The frequency is also easy to scale, since the block works at any transfer frequency as long as the input clocks are aligned as specified in Figure 5.4.

5.9 Comments and Analysis of the Link Port IP Block

The work on the Link Port IP blocks has resulted in two separate working IP blocks. These blocks may be used together or independently. The blocks also support other link frequencies than 500 MHz, although this is not discussed in this report. The important thing when configuring for another frequency is to keep the clock relationships as specified in this report (see Figure 5.4). By keeping those relationships, the design works at any frequency up to 500 MHz. To ensure correctness of the block, the correct constraints have to be set. That includes positioning the pins, inputs or outputs, as close together as possible to minimise the skew between them. Examples of the needed constraints can be found in Listings C.1 and C.2 in Appendix C. These specify the setup and hold times needed, as well as the multi-cycle paths of the design. If the design passes Place and Route using these constraints, it should also work in a real application. The implementation time of the Link Port block was approximately 220 hours,

from the study of the protocol specifications until the entire protocol was implemented and verified in VHDL. This is not an extreme amount of time, but compared to using preexisting IP blocks for other protocols, it is a lot of effort just for the communication part of a system. However, talking to this specific processor cannot be achieved in any other way, and thus the need for the IP block is large. When comparing to other implementations of Link Port IP blocks, e.g. from Altera [62] and Xilinx [64], this implementation succeeds in reaching 500 MHz. That is unattainable in the Xilinx implementation, which has a maximum speed of 450 MHz; it is, however, implemented in a previous Virtex version (Virtex 4), which could be the reason. The size of that module is 220 or 150 slices, depending on implementation style. The smaller one uses the equivalent of ISERDES1, which was considered hard or impossible to use by this work. That means the 220-slice implementation is the most comparable, and in that respect this work is slightly smaller. Comparing sizes with Altera devices is harder, since they report sizes in Logic Elements (LE) instead of slices. However, their design may reach full 500 MHz operation in a Stratix device [62]. The sizes reported are 301 LE for the receiver and 222 LE for the transmitter, a ratio rather similar to the one achieved in this work.


Chapter 6

Comparison of Communication Techniques

Now that different techniques have been studied in practice, it is a good time to compare them with some of the other transmission techniques mentioned earlier. This comparison will look at their usage and, since they are regulated by specifications, only a theoretical throughput study will be carried out. Furthermore, available IP blocks will be evaluated to compare which techniques are most efficient per used chip area and resources.

6.1 Hard Facts

The first thing to do is to summarise some of the protocols that were introduced and examined in chapter 2. The summary looks at IP blocks with respect to size on the chip as well as transfer speed.

Protocol Type               LUTs  FFs   LUTRAM  BRAM  BUFG  GTX  Gbps (effective)
Ethernet (Tri-speed) [65]   360   430   0       2     4     0    0.975
PCI Express x1 (v.2) [66]   375   425   0       4-8   0     1    3.85
PCI Express x2 (v.2) [66]   525   525   0       4-8   0     2    7.70
SRIO x1 (3.125 Gbps) [67]   5800  5250  0       2     2     1    2.39
Link Port Rx                289   670   88      0     3     0    4.0
Link Port Tx                284   563   88      0     4     0    4.0

Table 6.1. A comparison of how much space an IP block of each listed protocol takes up in an FPGA. All IP blocks except the Link Port ones are commercially available through the Xilinx webpage.

In Table 6.1 there is a selection of some common protocol transceiver IP blocks available on the market, alongside the IP blocks created in this work. Comparing these IP blocks gives an overview of the sizes of different transfer protocol transceivers implemented in an FPGA and how much area they consume.

By cross-referencing this with the throughput of each of the protocols, a comparison of which protocol is suitable where is obtained. Firstly, with the PCI Express core, the maximum payload that may be carried is 512 bytes [66], as opposed to the protocol maximum of 4096 bytes. The link speed of a v.2 PCI Express lane is 5 Gbps. Consulting Appendix E, we see that the link uses 80% of the bits sent, due to the overhead of the encoding. Furthermore, it uses at least 20 bytes of overhead per packet. This gives a maximum usable throughput of 5 Gbps · 0.80 · 512/(512 + 20) ≈ 3.85 Gbps. For the two-lane case the throughput grows linearly to 7.70 Gbps. The PCI Express core also supports many other configurations, using more or fewer resources at different transfer speeds. The Ethernet core [65] supports 10/100/1000 Mbps, but for throughput comparisons 1000 Mbps is selected. With the associated Ethernet overhead (Appendix F), the maximum throughput of a Gigabit Ethernet link is 975 Mbps. The SRIO block (Appendix I) differs from the others in that it actually supports the full set of RapidIO layers up to the top [67], which is one reason why it is so large in terms of area. The large area is the price paid for a functional IP block that covers all layers of the stack.
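Appendix F is not reproduced here, but the 975 Mbps figure quoted above is consistent with the standard per-frame overhead of Gigabit Ethernet (8 bytes preamble, 14 bytes header, 4 bytes FCS and 12 bytes inter-frame gap around a 1500-byte payload):

\[
1000\ \text{Mbps} \cdot \frac{1500}{1500 + 8 + 14 + 4 + 12} \approx 975\ \text{Mbps}.
\]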

6.2 Making a Choice

When it comes to choosing which technique should be used, many factors play in. Looking only at Table 6.1, one might think that SRIO should not be used, due to its size. That is not entirely fair, since the IP block actually contains logic to handle all transmission layers of SRIO, giving the user a great advantage compared to doing all packet handling and creation themselves. Furthermore, as stated in Appendix I, the efficiency of SRIO is rather high: the usable part of the link may be as high as 95.5% of the link speed. The link speeds are comparable with those of PCI Express, which might be the greatest competitor when selecting a scalable multi-gigabit technique. PCI Express is also able to support really high utilisation if configured correctly. The maximum achievable data throughput is 99.5%, but that requires that all units can handle a four-kilobyte payload. If one unit in a network cannot handle the large payload, all units will decrease theirs to the highest payload supported by all [68]. This means that the system designer needs to know that all connected components support sufficiently large payloads for the network to gain from having a large payload. One example is the IP block presented in the previous section, which only supports payloads up to 512 bytes, i.e. only 96.2% of the transferred bits. This is in fact very comparable with the SRIO link. Looking at the Link Port, it is evident that it is no fancy protocol in any way. It is strictly a point-to-point protocol, in contrast to PCI Express and SRIO, which both support addressing. However, it has some benefits that the others lack: its simplicity gives it a very small silicon footprint. In Table 6.1 the Link Port and SRIO are the only ones with buffering of words, even though the buffers

76 6.2. MAKING A CHOICE in the SRIO is larger than the eight quad-word buffer in the Link Port . Even so, it supports full 4 Gbps in its operation without using any MGTs. The price that has to be paid is the number of pins used, four pairs for data, one for clock and two pins for control signals. Ethernet on the other hand might not fit very well into this comparison. Gigabit speed is not that impressive. However, it could very well be used as a mean of communication over longer distances. For that purpose, neither of the other ones are especially good. They all have timing constraints that are rather hard and are not perfectly suited for such transportation. For long connections, the Link Port is an especially bad choice, since it has very delicate timing considerations to fulfil in order to work properly. This is largely due to the fact that the clock is separate from the data, unlike SRIO and PCIe. However, using Link Ports as a short communication method is no bad idea and result in low silicon footprint without the need for MGTs and is still able to achieve gigabit per second speeds.
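As a back-of-the-envelope check on the quoted Link Port numbers, the raw bandwidth follows directly from the bus geometry. The sketch below assumes a 500 MHz bit clock, which is the value that yields the stated 4 Gbps; the clock rate is an assumption here, not a figure from this chapter:

# Raw Link Port bandwidth: 4 data pairs sampled on both clock edges (DDR).
data_width_bits = 4
ddr_factor = 2            # two data beats per clock period
clock_hz = 500e6          # assumed nominal bit clock

raw_bps = data_width_bits * ddr_factor * clock_hz
print(raw_bps / 1e9)      # 4.0 Gbps, matching the quoted link speed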


Chapter 7

Goal Follow Up and Conclusions

The work set out to fulfil some goals, as seen in section 1.2, and this is a summary of how that went. The first goal was to implement a Link Port IP block for use in FPGAs, and that goal is fulfilled. The work resulted in a model which works in simulation and is able to operate at the full Link Port link speed of 4 Gbps. The model has been constrained in order to guarantee that it meets its requirements.

The second goal was to evaluate UDP and TCP over GbE and see what affected performance. That evaluation worked out well, and several factors that affect performance were revealed. It was shown that with correct settings, a throughput was achieved that was within 1% of the theoretical throughput. The UDP communication, however, raised more questions than it answered. It is clear that the settings affect performance a lot, but there seem to be some operating-system or device-driver implementation details that limit UDP performance and make it linear in, and dependent on, the number of interrupts. The GbE work resulted not only in the measurements and characterisation, but also in testing software with which interconnected computers can test the Ethernet connectivity between them. In addition to being a functional testing platform already, it has the potential to be extended with more protocols besides TCP and UDP. Although only single-stream connections have been tested so far, it could easily be extended to test multiple concurrent connections, since the support for multiple connections already exists; the only thing needed is an easier user interface.

The third goal was to collect research results and compare protocols that could be used in system design. This work was carried out on a very theoretical level and was mostly a collection of existing research results and some theoretical throughput calculations. The recommendations that exist are found in chapter 6 and focus mostly on embedded systems and backbone connections. The feeling was that there was a bit too little time to give the protocol comparison the attention needed to complete it.

In terms of the fourth goal, about future standards, there is not too much to say. The feeling is that the future will contain many protocols, but the thing they will have in common is that they will be serial multi-gigabit-per-second protocols that use differential signalling, at least in the near future. In the longer run, the differential electrical signals will probably be replaced by optical signals, which create less noise and do not cross-couple to each other.

Chapter 8

Future Work

There are a lot of unanswered questions which could be interesting to look at in the future. One of them is to look at other protocols that time did not allow this work to cover. The idea was to have time to study inter-system connections with protocols that are intended for longer transmissions. The only protocol that was thoroughly studied was GbE, with TCP and UDP over it. It would be interesting to have the possibility to test other protocols over GbE as well, but even more interesting to get an evaluation of Infiniband. From the work on UDP and TCP there also arose questions that need to be investigated further. What is the reason for the unexpected behaviour of UDP when going from a 1024-byte payload to 1025 bytes? Furthermore, it would be interesting to see if a more ideal transmitter could be found, so that the effects of the receiver could be mapped better at sizes above 1025 bytes. For the TCP connections, there was no good answer as to why the throughput dropped at large receive buffer sizes, and that is something that will need further investigation. Another interesting thing is to look into the possibility of using raw Ethernet frames instead of TCP or UDP packets and see how this would affect performance. For the shorter communications, hardware validation of the IP block is required. It is confirmed that it works in theory by simulating the generated netlist, but it is not confirmed in hardware. A comparison with e.g. an implemented SRIO link would also be an interesting thing to do, looking into whether that link actually can perform as well as it is supposed to according to specifications. Finally, some more studies of larger-area networks have to be done. Looking at e.g. Infiniband more in depth and comparing it with GbE would be a good thing. Also, testing the new 10GbE would be interesting, since it puts a lot more pressure on the CPU.


Bibliography

[1] Xilinx, “Virtex-6 FPGA family.” Available at http://www.xilinx.com/products/silicon-devices/fpga/virtex-6/index.htm, March 2012.

[2] Analog Devices, ADSP-TS201S: TigerSHARC Embedded Processor Data Sheet. Analog Devices, C ed., Dec. 2006.

[3] A. Farina, “INTRODUCTION TO RADAR SIGNAL & DATA PROCESSING: THE OPPORTUNITY,” Nov. 2003. RTO-EN-SET-063.

[4] SAAB, “PS-05/A AIRBORNE MULTI-MODE RADAR.” Available at http://www.saabgroup.com/Global/Documents%20and%20Images/Air/Sensor%20Systems/PS%2005_A/saab_PS-05%20A%204pg%20Screen%20PDF.pdf, 2011.

[5] D. Bueno, C. Conger, and A. D. George, “Optimizing rapidIO architectures for onboard processing,” ACM Trans. Embed. Comput. Syst., vol. 9, pp. 18:1–18:30, Mar. 2010.

[6] D. Bueno, C. Conger, A. D. George, I. Troxel, and A. Leko, “RapidIO for radar processing in advanced space systems,” ACM Trans. Embed. Comput. Syst., vol. 7, pp. 1:1–1:38, Dec. 2007.

[7] RapidIO Trade Association, “RapidIO specification 2.2,” May 2011.

[8] H. Jian-Xi, W. Jing-Hong, and H. Shun-Ji, “The hardware implementation of real-time SAR signal processor,” in Radar Conference, 2000. The Record of the IEEE 2000 International, pp. 205–209, IEEE, 2000.

[9] T. Granberg, Handbook of digital techniques for high-speed design: design examples, signaling and memory technologies, fiber optics, modeling and simulation to ensure signal integrity. Upper Saddle River, NJ: Prentice Hall PTR, 2004.

[10] National Semiconductor Corp, LVDS Owner’s Manual 4th Edition - Low Voltage Differential Signaling. National Semiconductor, 2008.

[11] Altera, “Altera’s 28-nm, Power-Efficient transceivers,” Nov. 2011. Available at http://www.altera.com/literature/po/ss-28nm-transceivers.pdf.


[12] Xilinx, “7 series FPGAs GTX transceivers: User guide,” Nov. 2011. Available at http://www.altera.com/literature/po/ss-28nm-transceivers.pdf.

[13] Analog Devices, “Link ports,” in ADSP-TS201 TigerSHARC Processor Hardware Reference, Analog Devices, 1.1 ed., Dec. 2004.

[14] J. Wang, W. Wu, W. Zhang, P. Lei, and W. Li, “Parallel realization of high resolution radar on multi-DSP system,” in Radar Conference, 2009 IET International, pp. 1–4, 2009.

[15] Z. Fang and J. Xia, “A miniature implementation of air-born SAR real-time processing,” in Asia-Pacific Conference on Synthetic Aperture Radar 2009, pp. 939–942, IEEE, Oct. 2009.

[16] W. Qiang, G. Qing, L. Xuwen, and J. Kebin, “Hardware design of image information processor based on ADSP-TS201 DSPs,” in International workshop on Imaging Systems and Techniques 2009, pp. 155–158, IEEE, May 2009.

[17] PCISIG, “PCI express base 3.0 FAQ.” Available at http://www.pcisig.com/specifications/pciexpress/resources/PCIe_3.0_External_FAQ_Nereus_9.20.pdf, March 2012.

[18] TOP500.org, “Home | TOP500 supercomputing sites.” Available at http://top500.org/, March 2012.

[19] J. Postel, “Transmission control protocol,” Sept. 1981. RFC793 at http://tools.ietf.org/html/rfc793.

[20] USB-IF, Universal Serial Bus Specification. USB.org, 2 ed., Apr. 2000.

[21] USB-IF, Universal Serial Bus 3.0 Specification, vol. 1. USB.org, June 2011.

[22] “Thunderbolt technology: The transformational PC I/O,” 2012. Available at http://www.intel.com/content/www/us/en/architecture-and-technology/thunderbolt/thunderbolt-technology-brief.html.

[23] B. Wood, “Backplane tutorial: RapidIO, PCIe and ethernet.” Available at http://www.eetimes.com/design/signal-processing-dsp/4017736/Backplane-tutorial-RapidIO-PCIe-and-Ethernet, Jan. 2009.

[24] Y. Zhang, Y. Wang, and P. Zhang, “A High-Performance scalable computing system on the RapidIO interconnect architecture,” in 2010 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), pp. 288–292, IEEE, Oct. 2010.

[25] P. Gray and A. Betz, “Performance evaluation of copper-based gigabit ethernet interfaces,” in International Conference on Local Computer Networks 2002, pp. 679–690, IEEE Comput. Soc, November 2002.


[26] Y. Wu, S. Kumar, and S. Park, “Measurement and performance issues of transport protocols over 10Gbps high-speed optical networks,” Computer Networks, vol. 54, pp. 475–488, Feb. 2010.

[27] K. Salah and M. Hamawi, “Comparative packet-forwarding measurement of three popular operating systems,” J. Netw. Comput. Appl., vol. 32, pp. 1039–1048, Sept. 2009.

[28] G. Prytz and S. Johannessen, “Real-time performance measurements using UDP on windows and linux,” in International Conference on Emerging Technologies and Factory Automation 2005, vol. 2, pp. 8 pp.–932, IEEE, Sept. 2005.

[29] CERN, “ATLAS - the technical challenges,” June 2011. Available at http://www.atlas.ch/atlas_brochures_pdf/tech_brochure-11.pdf.

[30] S. Stancu, M. Ciobotaru, C. Meirosu, L. Leahu, and B. Martin, “NETWORKS FOR ATLAS TRIGGER AND DATA ACQUISITION,” Feb. 2006. Available at http://indico.cern.ch/getFile.py/access?contribId=289&sessionId=6&resId=1&materialId=paper&confId=048.

[31] S. Stancu, M. Ciobotaru, and K. Korcyl, “ATLAS TDAQ DataFlow network architecture analysis and upgrade proposal,” IEEE Transactions on Nuclear Science, vol. 53, pp. 826–833, June 2006.

[32] R. E. Hughes-Jones, “The use of TCP/IP for real-time messages in ATLAS Trigger/DAQ.” Available at https://edms.cern.ch/file/393752/1/DC-062.pdf, 2003.

[33] X. Zhang, M. Gao, and G. Liu, “A scalable heterogeneous multi-processor signal processing system based on the RapidIO interconnect,” in Intelligent Information Technology Application Workshops, 2008. IITAW’08. International Symposium on, pp. 761–764, 2008.

[34] J. Zhang, H.-b. Su, Q.-z. Wu, and J. Zhang, “Research and implement of serial RapidIO based on Mul-DSP,” in International Conference on Computational Intelligence and Software Engineering, 2009, pp. 1–4, IEEE, Dec. 2009.

[35] J. Adams, C. Katsinis, W. Rosen, D. Hecht, V. Adams, H. V. Narravula, S. Sukhtankar, and R. Lachenmaier, “Simulation experiments of a high-performance RapidIO-based processing architecture,” in IEEE International Symposium on Network Computing and Applications, 2001. NCA 2001, pp. 336–339, IEEE, 2001.

[36] J. Santos, M. Zilker, L. Guimarais, W. Treutterer, C. Amador, and M. Manso, “COTS-Based High-Data-Throughput acquisition system for a Real-Time reflectometry diagnostic,” IEEE Transactions on Nuclear Science, vol. 58, pp. 1751–1758, Aug. 2011.


[37] A. Nishida, “Building cost effective high performance computing environment via PCI express,” in 2006 International Conference on Parallel Processing Workshops, 2006. ICPP 2006 Workshops, pp. 8 pp.–526, IEEE, 2006.

[38] Y. Watanabe, A. Yamada, M. Nitta, and K. Kato, “Pseudo-real-time control of a USB I/O device under windows 7,” in 2010 International Conference on Control Automation and Systems (ICCAS), pp. 975–980, IEEE, Oct. 2010.

[39] F. A. Jolfaei, N. Mohammadizadeh, M. S. Sadri, and F. FaniSani, “High speed USB 2.0 interface for FPGA based embedded systems,” in 4th International Conference on Embedded and Multimedia Computing, 2009. EM-Com 2009, pp. 1–6, IEEE, Dec. 2009.

[40] D. Bortolotti, A. Carbone, D. Galli, I. Lax, U. Marconi, G. Peco, S. Perazzini, V. M. Vagnoni, and M. Zangoli, “Comparison of UDP transmission performance between IP-Over-InfiniBand and 10-Gigabit ethernet,” IEEE Transactions on Nuclear Science, vol. 58, pp. 1606–1612, Aug. 2011.

[41] B. Bogdanski, F. O. Sem-Jacobsen, S. A. Reinemo, T. Skeie, L. Holen, and L. P. Huse, “Achieving predictable high performance in imbalanced fat trees,” in 2010 IEEE 16th International Conference on Parallel and Distributed Systems (ICPADS), pp. 381–388, IEEE, Dec. 2010.

[42] D. Bortolotti, A. Carbone, D. Galli, I. Lax, U. Marconi, G. Peco, S. Perazzini, V. Vagnoni, and M. Zangoli, “High rate packet transmission via IP-over-InfiniBand using commodity hardware,” in Real Time Conference (RT), 2010 17th IEEE-NPSS, pp. 1–6, IEEE, May 2010.

[43] H. Zhang, W. Huang, J. Han, J. He, and L. Zhang, “A performance study of java communication stacks over InfiniBand and giga-bit ethernet,” in International Conference on Network and 2007, pp. 602–607, IEEE, Sept. 2007.

[44] S. Addagatla, M. Shaw, S. Sinha, P. Chandra, A. S. Varde, and M. Grinkrug, “Direct network prototype leveraging light peak technology,” in IEEE Symposium on High Performance Interconnects 2010, pp. 109–112, IEEE, Aug. 2010.

[45] M. Abdallah and O. Elkeelany, “A survey on data acquisition systems DAQ,” in International Conference on Computing, Engineering and Information 2009, pp. 240–243, IEEE, Apr. 2009.

[46] W. Lixin, S. Wei, and L. Chao, “Implementation of high speed real time data acquisition and transfer system,” in Industrial Electronics and Applications, 2009. ICIEA 2009. 4th IEEE Conference on, pp. 382–386, IEEE, May 2009.


[47] H. Ning, W. Hua, X. Jianping, J. Changlong, and J. Huibo, “An implementation of sustained real-time radar data recording system on FPGA,” in Communications, Circuits and Systems and West Sino Expositions, IEEE 2002 International Conference on, vol. 2, pp. 1526–1529, IEEE, 2002.

[48] F. Li, X. Ji, X. Li, and K. Zhu, “DAQ architecture design of daya bay reactor neutrino experiment,” IEEE Transactions on Nuclear Science, vol. 58, pp. 1723–1727, Aug. 2011.

[49] X. Ji, F. Li, M. Ye, Y. An, K. Zhu, and Y. Wang, “Research and design of DAQ system for daya bay reactor neutrino experiment,” in Nuclear Science Symposium Conference Record, 2008. NSS ’08. IEEE, pp. 2119–2121, IEEE, Oct. 2008.

[50] C. Bigongiari, “Km3NeT, a deep sea challenge for neutrino astronomy,” in International Conference on Sensor Technologies and Applications, 2007. SensorComm 2007, pp. 248–253, IEEE, Oct. 2007.

[51] S. Anvar, “Data acquisition architecture studies for the KM3NeT deep sea neutrino telescope,” in IEEE Nuclear Science Symposium Conference Record, 2008. NSS ’08, pp. 3558–3561, IEEE, Oct. 2008.

[52] K. Krygier and K. Merle, “MECDAS-a distributed data acquisition system for experiments at MAMI,” IEEE Transactions on Nuclear Science, vol. 41, pp. 86–88, Feb. 1994.

[53] A. Belias, G. Crone, E. Falk Harris, C. Howcroft, S. Madani, T. Nicholls, G. Pearce, D. Reyna, N. Tagg, and M. Thomson, “The MINOS data acquisition system,” in Nuclear Science Symposium Conference Record 2003, pp. 1663–1667 Vol.3, IEEE, 2003.

[54] Y. Sugaya, J. Ahn, H. Akimune, Y. Asano, W. Chang, S. Date, M. Fujiwara, K. Hicks, T. Hotta, K. Imai, T. Ishikawa, T. Iwata, H. Kawai, Z. Kim, Y. Kishimoto, N. Kumagai, S. Makino, N. Matsukoka, T. Matsumura, T. Mibe, S. Minami, M. Miyabe, Y. Miyachi, T. Nakano, M. Nomachi, Y. Ohashi, T. Ooba, C. Rangacharylu, A. Sakaguchi, T. Sasaki, D. Seki, H. Shimizu, M. Sumihama, H. Toki, T. Toyama, H. Toyokawa, A. Wakai, C. Wang, W. Wang, T. Yonehara, T. Yorita, and M. Yosoi, “DAQ system for LEPS experiment,” IEEE Transactions on Nuclear Science, vol. 48, pp. 1282–1285, Aug. 2001.

[55] D. Gold and K. Anantha, “Software development and real-time target systems on a common backplane,” in Signals, Systems and Computers, 1991. 1991 Conference Record of the Twenty-Fifth Asilomar Conference on, pp. 69–73, IEEE Comput. Soc. Press, 1991.


[56] G. Avolio, “DAQ system at the 2002 ATLAS muon test beam,” IEEE Transactions on Nuclear Science, vol. 51, pp. 2081–2085, Oct. 2004.

[57] M. Thorpe, C. Angelsen, G. Barr, C. Metelko, T. Nicholls, G. Pearce, and N. West, “The T2K near detector data acquisition systems,” IEEE Transactions on Nuclear Science, vol. 58, pp. 1800–1806, Aug. 2011.

[58] D. G. Phillips, T. Bergmann, T. J. Corona, F. Frankle, M. A. Howe, M. Kleifges, A. Kopmann, M. Leber, A. Menshikov, D. Tcherniakhovski, B. VanDevender, B. Wall, J. F. Wilkerson, and S. Wustling, “Characterization of an FPGA-based DAQ system in the KATRIN experiment,” in Nuclear Science Symposium Conference Record (NSS/MIC), 2010 IEEE, pp. 1399–1403, IEEE, Oct. 2010.

[59] L. L. Peterson, Computer Networks ISE: A Systems Approach. Morgan Kaufmann, 2007.

[60] Cadence Design Systems, “Clock domain crossing, closing the loop on clock domain functional implementation problems,” technical paper, Cadence Design Systems, Dec 2004.

[61] Xilinx, Virtex-7 FPGAs Data Sheet: DC and Switching Characteristics, Feb 2012.

[62] Altera, “Analog devices link-port reference design,” application note, Altera, February 2005. AN332.

[63] N. Sawyer, “Interfacing virtex-ii series fpgas with analog devices tigersharc ts20x dsps via lvds link ports,” Application Note XAPP635, Xilinx, February 2005.

[64] M. Defossez, “Virtex-4 interface to an analog devices adsp-ts20xx link port,” Application Note XAPP727, Xilinx, January 2006.

[65] Xilinx, Virtex-6 FPGA Embedded Tri-Mode Ethernet MAC Wrapper v1.4. Xilinx, 1.4 ed., April 2010. DS710.

[66] Xilinx, LogiCORE IP Virtex-6 FPGA Integrated Block v2.5 for PCI Express. Xilinx, 2.5 ed., January 2012. DS800.

[67] Xilinx, LogiCORE IP Serial RapidIO v5.6. Xilinx, 5.6 ed., March 2011. DS696.

[68] A. Goldhammer and J. J. Ayer, “Understanding performance of PCI express systems.” Available at http://www.xilinx.com/support/documentation/white_papers/wp350.pdf, Sept. 2008.

[69] Xilinx, Virtex-6 Libraries Guide for HDL Designs. Xilinx, 12.3 ed., September 2010.


[70] H. Zimmermann, “OSI reference Model–The ISO model of architecture for open systems interconnection,” IEEE Transactions on Communications, vol. 28, pp. 425–432, Apr. 1980.

[71] J. Day and H. Zimmermann, “The OSI reference model,” Proceedings of the IEEE, vol. 71, no. 12, pp. 1334–1340, 1983.

[72] P. Norton, Peter Norton’s complete guide to networking. Indianapolis, Ind.; Hemel Hempstead: Sams; Prentice Hall, 1999.

[73] M. G. Naugle, Network protocols. New York: McGraw-Hill, 1999.

[74] A. H. Wilen, J. P. Schade, and R. Thornburg, Introduction to PCI Express : a hardware and software developer’s guide. Hillsboro, Or.: Intel Press, 2003.

[75] National Instruments, “PCI express an overview of the PCI express standard.” Available at http://zone.ni.com/devzone/cda/tut/p/id/3767, Aug. 2009.

[76] GE Intelligent Platforms, “PCI express Peer-to-Peer interconnect.” Available at http://defense.ge-ip.com/library/detail/12854, 2011.

[77] C. E. Spurgeon, Ethernet: the definitive guide. Beijing [etc.]: O’Reilly, 2000.

[78] Institute of Electrical and Electronics Engineers and IEEE-SA Standards Board, IEEE standard for information technology - telecommunications and information exchange between systems - local and metropolitan area networks - specific requirements. Part 3, Amendment 4, Carrier sense multiple access with collision detection (CSMA/CD) access method and physical layer specifications. Media access control parameters, physical layers, and management parameters for 40 Gb/s and 100 Gb/s operation. New York: Institute of Electrical and Electronics Engineers, 2010.

[79] C. Callegari, S. Giordano, M. Pagano, and T. Pepe, “Behavior analysis of TCP linux variants,” Computer Networks, vol. 56, pp. 462–476, Jan. 2012.

[80] Institute of Electrical and Electronics Engineers, “IEEE standard for information Technology–Telecommunications and information exchange between Systems–Local and metropolitan area Networks–Specific requirements part 3: Carrier sense multiple access with collision detection (CSMA/CD) access method and physical layer specifications - section three,” 2008.

[81] Institute of Electrical and Electronics Engineers, “IEEE standard for information Technology–Telecommunications and information exchange between Systems–Local and metropolitan area Networks–Specific requirements part 3: Carrier sense multiple access with collision detection (CSMA/CD) access method and physical layer specifications - section one,” 2008.


[82] Institute of Electrical and Electronics Engineers, “IEEE standard for information technology: Telecommunications and information exchange between systems Local and metropolitan area networks Specific requirements Part 2: Logical link control,” 1998. ISO/IEC 8802-2:1998.

[83] T. Steinbach, F. Korf, and T. Schmidt, “Comparing time-triggered ethernet with FlexRay: an evaluation of competing approaches to real-time for in-vehicle networks,” in Factory Communication Systems (WFCS), 2010 8th IEEE International Workshop on, pp. 199–202, 2010.

[84] K. Muller, T. Steinbach, F. Korf, and T. C. Schmidt, “A real-time ethernet prototype platform for automotive applications,” in 2011 IEEE International Conference on Consumer Electronics - Berlin (ICCE-Berlin), pp. 221–225, IEEE, Sept. 2011.

[85] Y. Takayanagi and T. Akima, “Latest trend of industrial Real-Time ethernet for the SICE-ICASE international joint conference 2006 (SICE-ICCAS 2006),” in SICE-ICASE, 2006. International Joint Conference, pp. 165–169, IEEE, Oct. 2006.

[86] L. Seno and C. Zunino, “A simulation approach to a Real-Time ethernet protocol: EtherCAT,” in Emerging Technologies and Factory Automation, 2008. ETFA 2008. IEEE International Conference on, pp. 440–443, 2008.

[87] W. Wolf, Computers as components: principles of embedded computing system design. Amsterdam; Boston: Elsevier/Morgan Kaufmann, 2008.

[88] T. Skeie, S. Johannessen, and O. Holmeide, “Timeliness of real-time IP communication in switched networks,” Industrial Informatics, IEEE Transactions on, vol. 2, no. 1, pp. 25–39, 2006.

[89] C. L. Liu and J. W. Layland, “Scheduling algorithms for multiprogramming in a Hard-Real-Time environment,” J. ACM, vol. 20, pp. 46–61, Jan. 1973.

[90] J. Postel, “Internet protocol,” Sept. 1981. RFC791 at http://tools.ietf.org/html/rfc791.

[91] J. Postel, “Assigned numbers,” Sept. 1981. RFC790 at http://tools.ietf.org/html/rfc790.

[92] J. Postel, “User datagram protocol,” Aug. 1980. RFC768 at http://tools.ietf.org/html/rfc768.

[93] C. J. Kale and T. J. Socolofsky, “TCP/IP tutorial,” Jan. 1991. RFC1180 at http://tools.ietf.org/html/rfc1180.

[94] R. Guillier, S. Soudan, and P. V. Primet, “TCP variants and transfer time predictability in very high speed networks,” in High-Speed Networks Workshop, 2007, pp. 6–10, IEEE, May 2007.


[95] J. Hurwitz and W. chun Feng, “Initial end-to-end performance evaluation of 10-Gigabit ethernet,” in 11th Symposium on High Performance Interconnects, 2003. Proceedings, pp. 116–121, IEEE, Aug. 2003.

[96] T. Uchida, “Hardware-Based TCP processor for gigabit ethernet,” IEEE Transactions on Nuclear Science, vol. 55, pp. 1631–1637, June 2008.

[97] N. Alachiotis, S. A. Berger, and A. Stamatakis, “Efficient PC-FPGA communication over gigabit ethernet,” in 2010 IEEE 10th International Conference on Computer and Information Technology (CIT), pp. 1727–1734, IEEE, July 2010.

[98] D. Dalessandro and P. Wyckoff, “A performance analysis of the ammasso RDMA enabled ethernet adapter and its iWARP API,” in Cluster Computing, 2005. IEEE International, pp. 1–7, IEEE, Sept. 2005.

[99] M. J. Rashti and A. Afsahi, “10-Gigabit iWARP ethernet: Comparative performance analysis with InfiniBand and Myrinet-10G,” in Parallel and Distributed Processing Symposium, 2007. IPDPS 2007. IEEE International, pp. 1–8, IEEE, Mar. 2007.

[100] A. D. George and C. T. Cole, “Comparative performance analysis of RDMA-Enhanced ethernet.” Available at http://www.cercs.gatech.edu/ hpidc2005/presentations/CaseyReardon.pdf, 2005.

[101] Q. Wang, D. Lv, and F. Zhou, “Reconfigurable RDMA communication framework of MULTI-DSP,” Journal of Electronics (China), vol. 26, pp. 380–386, May 2009.

[102] S. Audityan, “Implementing the RapidIO interconnect specification part i – understanding the RapidIO interconnect specification.” Available at www.analogzone.com/iot_1117.pdf, 2004.

[103] W. J. Dally and B. Towles, Principles and practices of interconnection networks. Amsterdam; San Francisco: Morgan Kaufmann Publishers, 2004.

[104] J. Axelson, USB complete the developer’s guide. Madison, Wis.: Lakeview Research LLC, 4 ed., 2009.

[105] USB Implementers Forum, “USB.org - getting a vendor ID.” Available at http://www.usb.org/developers/vendor/, March 2012.

[106] “InfiniBand trade association: Home.” Available at http://www.infinibandta.org/index.php, March 2012.

[107] InfiniBand Trade Association, “InfiniBandTM architecture specification volume 1 release 1.2.1,” Nov. 2007.


[108] InfiniBand Trade Association, “InfiniBandTM architecture specification volume 2 release 1.2.1,” Oct. 2006.

[109] P. A. Franaszek and A. X. Widmer, “United states patent: 4486739 - byte oriented DC balanced (0,4) 8B/10B partitioned block transmission code,” Dec. 1984.

[110] “The ATLAS experiment.” Available at http://www.atlas.ch/index.html, March 2012.

[111] H. Beck, R. Dobinson, K. Korcyl, and M. LeVine, ATLAS TDAQ: A Network-based Architecture. Available at https://edms.cern.ch/file/391592/2.2/DC-059.pdf, Feb. 2003.

[112] M. Ciobotaru, S. Stancu, M. LeVine, and B. Martin, “GETB, a gigabit ethernet application platform: its use in the ATLAS TDAQ network,” in Real Time Conference, 2005. 14th IEEE-NPSS, p. 6 pp., IEEE, 2005.

[113] C. Meirosu, B. Martin, A. Topurov, and A. Al-Shabibi, “Planning for predictable network performance in the ATLAS TDAQ.” Available at http://indico.cern.ch/getFile.py/access?contribId=41&sessionId=2&resId=0&materialId=paper&confId=048.

[114] C. Haeberli, A. dos Anjos, H. Beck, A. Bogaerts, D. Botterill, S. Gadomski, P. Golonka, R. Hauser, M. LeVine, R. Mommsen, V. Reale, S. Stancu, J. Schlereth, P. Werner, F. Wickens, and H. Zobernig, “ATLAS TDAQ DataCollection software,” IEEE Transactions on Nuclear Science, vol. 51, pp. 585–590, June 2004.

[115] P. Golonka, “Linux network performance study for the ATLAS data flow system - draft 0.50.” Available at https://edms.cern.ch/file/368844/0.50/LinuxNetPerf.pdf, March 2003.

[116] R. Jones, S. Kolos, L. Mapelli, and Y. Ryabov, “Applications of CORBA in the atlas prototype DAQ,” in Real Time Conference 1999, pp. 469–474, IEEE, 1999.

[117] Object Management Group, Inc, “CORBA FAQ.” Available at http://www.omg.org/gettingstarted/corbafaq.htm, March 2012.

[118] J. Vermeulen, M. Abolins, and Alexandrav, “ATLAS DataFlow: the read-out subsystem, results from trigger and data-acquisition system testbed studies and from modeling,” in Real Time Conference 2005. 14th IEEE-NPSS, p. 5 pp., IEEE, 2005.

Part III

Appendices


Appendix A

Abbreviations

B       Byte or Octet, 8 bits
BRAM    Block RAM
CML     Current Mode Logic
CORBA   Common Object Request Broker Architecture
COTS    Component Off The Shelf or Commodity Off The Shelf
DAV     Data AVailable
ECL     Emitter Coupled Logic
EMI     Electro-Magnetic Interference
EMS     Electro-Magnetic Susceptibility
FF      Flip-Flops
FPGA    Field Programmable Gate Array
FSM     Finite State Machine
GbE     Gigabit Ethernet (1000BASE-T, -X, etc.)
IB      Infiniband
IP      Internet Protocol or Intellectual Property
LUT     Look Up Table
LVDS    Low-Voltage Differential Signalling
MAC     Media Access Control
MGT     Multi Gigabit Transceiver
NIC     Network Interface Card
OoO     Out-of-order
PCB     Printed Circuit Board
PCIe    PCI Express
RAM     Random Access Memory
RDMA    Remote Direct Memory Access
RNIC    RDMA-enhanced NIC
RTT     Round Trip Time
SerDes  Serialiser / Deserialiser


Appendix B

A Selection of Used Xilinx Primitives

Primitive   Usage
BUFG        A clock buffer for clocking the global clock nets. Used for high-fanout clocks.
BUFIO       A clock buffer for clocking input and/or output registers.
BUFR        A regional clocking buffer used to clock a regional clocking net. The regional net is larger than the net the BUFIO may drive.
IBUFDS      A differential clock input buffer. Turns the input differential signal into a single-ended signal inside the FPGA fabric.
IBUFGDS     A differential signal input buffer. Turns the input differential signal into a single-ended signal inside the FPGA fabric and is able to drive a BUFG or MMCM.
ISERDES1    A built-in deserialiser which takes a serial input and turns it into a parallel output.
MMCM        Used for synthesising different clock frequencies inside the FPGA fabric.
OSERDES1    A serialiser circuit that takes wide data and produces a serial output at either SDR or DDR.

Table B.1. A list of used Xilinx primitives. The definition of what the primitives do is cited from [69].


Appendix C

Selection of Needed Constraints

INST "LinkPort_ChecksumWideMulticycle_checksum_int[7]_dff_15_7" TNM = ChksumOutputReg;
INST "LinkPort_ChecksumWideMulticycle_checksum_int[7]_dff_15_0" TNM = ChksumOutputReg;
INST "LinkPort_ChecksumWideMulticycle_checksum_int[7]_dff_15_1" TNM = ChksumOutputReg;
INST "LinkPort_ChecksumWideMulticycle_checksum_int[7]_dff_15_2" TNM = ChksumOutputReg;
INST "LinkPort_ChecksumWideMulticycle_checksum_int[7]_dff_15_3" TNM = ChksumOutputReg;
INST "LinkPort_ChecksumWideMulticycle_checksum_int[7]_dff_15_4" TNM = ChksumOutputReg;
INST "LinkPort_ChecksumWideMulticycle_checksum_int[7]_dff_15_5" TNM = ChksumOutputReg;
INST "LinkPort_ChecksumWideMulticycle_checksum_int[7]_dff_15_6" TNM = ChksumOutputReg;
TIMESPEC TS_multi_chksum = FROM "FFS" TO "ChksumOutputReg" 20 ns;

Listing C.1. Constraints specifying that the checksum should be treated as a multicycle path

INST "DataIn_n<0>" TNM = DataIn;
INST "DataIn_n<1>" TNM = DataIn;
INST "DataIn_n<2>" TNM = DataIn;
INST "DataIn_n<3>" TNM = DataIn;
INST "DataIn_p<0>" TNM = DataIn;
INST "DataIn_p<1>" TNM = DataIn;
INST "DataIn_p<2>" TNM = DataIn;
INST "DataIn_p<3>" TNM = DataIn;
TIMEGRP "DataIn" OFFSET = IN 0.5 ns VALID 1 ns BEFORE "ClkIn_p" RISING;
TIMEGRP "DataIn" OFFSET = IN 0.5 ns VALID 1 ns BEFORE "ClkIn_p" FALLING;

Listing C.2. Constraints specifying the setup and hold times of the Link Port receiver


Appendix D

The OSI Model

In the years around 1980, a lot of work was done to standardise communication. This was done in order to conform the industry to a standard before the market grew big and out of hand. That is why a general communications model, called the OSI reference model, was introduced [70, 71]. This model is a seven-layer model where layer N may only talk to layers N+1 and N-1. The N:th layer puts requests to the (N-1):th layer and, after the response, provides its own response to layer N+1. As seen in Figure D.1, the layers communicate only with the ones just above or just below, but when communicating with other devices, the N:th layer of device 1 communicates logically directly with the N:th layer of device 2, hence making the lower layers transparent. The ability to make lower layers transparent improves portability, since the underlying network topology and applications never need to be considered when implementing a higher-layer protocol. Below follows a short description of each of the layers, with help from [70, 71].
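To make the layering concrete, the following sketch (illustrative only, not from the thesis) shows how each layer wraps the data from the layer above with its own header on the way down, and strips it on the way up, so that peer layers see each other's data directly:

# Illustrative only: three of the seven layers, each adding its own header.
layers = ["transport", "network", "data link"]

def send_down(payload: bytes) -> bytes:
    for layer in layers:                    # traverse the stack downwards
        payload = f"[{layer}]".encode() + payload
    return payload                          # what goes onto the medium

def receive_up(frame: bytes) -> bytes:
    for layer in reversed(layers):          # traverse upwards on the peer
        header = f"[{layer}]".encode()
        assert frame.startswith(header)     # peer layers must agree
        frame = frame[len(header):]
    return frame

wire = send_down(b"application data")
print(wire)                # b'[data link][network][transport]application data'
print(receive_up(wire))    # b'application data'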

D.1 Physical Layer

The physical layer is the lowest layer of the model. In this layer, all actual data communication takes place; it specifies how the physical transaction takes place, be it by electrical, optical, mechanical or any other physical means of transferring data. However, it only specifies how to put signals onto, and read them out from, the transmission medium, e.g. a cable. The medium itself is not included, except for some performance metrics for how it should behave [72].

D.2 Data Link Layer

On the data link layer, there is specific support for different physical layers. Different physical media require different control, and this is supported in the data link layer. It also provides means for error correction on the physical layer through a data integrity check [73]. Data units used on this layer are called


[Figure: the seven OSI layers (Application, Presentation, Session, Transport, Network, Data Link, Physical), shown side by side for two communicating devices]

Figure D.1. A sketch of the OSI reference model. The dashed lines show how the layers communicate logically directly on the same level; the actual transaction, however, takes the way of the dotted line, down through all lower layers and then up again on the receiver side to the equally high layer.

frames, and each frame that is sent is acknowledged to its sender [72]. Finally, in this layer, physical addressing is taken care of.

D.3 Network Layer

The network layer is independent of how data is transferred and takes care of the aspects required by the higher layers of the network. The network layer also takes care of routing data in the most economic way [73]. One of the downsides of the network layer is the lack of a native error correction mechanism; hence it is important that the data integrity algorithm in the data link layer is reliable [72].

D.4 Transport Layer

The transport layer is concerned only with end-to-end transfer. This layer takes care of providing communication between endpoints and tries to optimise network usage. In this layer, packets that arrive out of order (OoO) are rearranged into the sequence in which they were originally sent [73, 72]. Furthermore, the transport layer provides more error detection capabilities and may re-request packets which are dropped during transmission in lower layers.


D.5 Session Layer

On the session layer, two entities are connected through either simplex, half-duplex or full-duplex communication. It also provides tokens to determine what kind of data structures have been sent, and it controls the communication between entities. In this layer, there is also support for synchronisation points to provide error recovery strategies. It also ensures that requests are completed in order [72].

D.6 Presentation Layer

The presentation layer is used by the application layer to understand the data that has been transmitted, since the underlying architectures may have different ways of representing the data [72]. The layer therefore performs some encoding and decoding. This is also where any encryption and decryption in a network transaction takes place [73]. The layer is independent of the underlying protocols, which eases the development of application layer protocols.

D.7 Application Layer

The application layer is the layer closest to the user, and it is in some sense the API to which a user program issues a request in order to traverse down and up again through the layers of the OSI model [72].


Appendix E

PCI Express

PCI Express (PCIe) is a serial protocol where two devices communicate in a point-to-point fashion in full-duplex mode [68]. Unlike a traditional bus, PCI Express emulates a multidrop bus by using switches [9]. Furthermore, it utilises a serial communication strategy on differential lines instead of the traditional parallel single-ended ones. On the top layers, it is completely transparent to operating systems, which are able to communicate with it as with a traditional PCI device. This means that for most applications, PCI Express is only new up to the third layer of the OSI model (Appendix D); the layers above remain unaffected [9]. Currently there are three major versions of PCIe, namely PCI Express 1.0, 2.0 and 3.0. These are largely similar to each other, with doubled effective transfer speed for each generation, from 2 Gbps in version 1.0 to 8 Gbps in version 3.0 [17]. In this section, most of the focus is on the first version of PCIe. As with most other high-speed serial communication, PCI Express embeds its clock into the data signal using 8B/10B encoding [9], which adds a 25% overhead to the physical transactions; put the other way around, only 80% of the bits sent are actual payload (see Appendix L). This encoding also ensures that the data has a bounded running disparity and a guaranteed minimum switching frequency for any input pattern. However, the inefficiency of 8B/10B encoding has been mitigated somewhat in version 3.0, where the encoding has changed to 128B/130B [17]. The encoding is only visible to the physical layer of PCI Express, as is the partitioning of bytes in a multi-lane PCI Express port. There may be 1, 2, 4, 8, 16 or 32 parallel lanes of serial transmission, and if more than one lane exists, data is partitioned equally between them. However, all this negotiation is transparent to all but the physical layer. Each transmission in the physical layer is made in frames. Each frame starts with a start-of-frame (SOF) character and ends with an end-of-frame character [74]. These characters add only two bytes of extra overhead, as seen in Figure E.1, but enable the receiver to know exactly when a frame starts and when it ends. The SOF character furthermore differs depending on which layer the message originates from [74].


[Figure: PCI Express physical layer packet layout, bits 0-31: Start Of Frame; Sequence Number; TLP Header (3-4 words); Data (0-1024 words); Optional CRC; LCRC; End Of Frame. The sequence number and LCRC belong to the data link layer; the TLP header, data and optional CRC belong to the transaction layer.]

Figure E.1. A PCI Express physical layer packet, showing which layer each part belongs to. The only parts which are specific to the physical layer are the Start Of Frame (SOF) and the End Of Frame (EOF).

At the data link layer, the concept of data integrity is maintained. To each packet, a CRC number is attached to ensure that there has been no data corruption on the way [9, 74], seen in Figure E.1 as the LCRC. It also adds a 16-bit sequence number to ensure in-order delivery [74]. If data has been corrupted or received out of order, the link layer will automatically retry sending the packet, hence making sure that no higher-level protocol needs to intervene to ensure data integrity. Furthermore, the data link layer implements a credit-based flow-control algorithm, meaning that it always keeps track of how many empty spaces there are in the receiver's buffer. However, in order not to send too many credits back, a PCIe device usually sends one credit update per several packets [68]. This policy of aggregating several items before sending back information is also utilised for the acknowledgement returns, which are usually only sent after some number of packets.

The transaction layer creates the packets (TLPs) for the PCIe communication [9]. This is the point where software packets traverse down into the communication. Each packet is equipped with a header which specifies where the packet is bound, its length and some status information, and it may end with another CRC number for improved integrity [68]. The information about the packet is contained in the TLP header (Figure E.1), and the end-to-end CRC is actually an optional field [68].

On top of the transaction layer there is the software layer. This layer gives the user a connection to the PCIe bus, and when creating the PCIe standard, care was taken to make sure that PCIe would work on all pre-existing operating systems without modification, implying that PCIe works using only a PCI-style operating system driver [9, 75]. Furthermore, the enhanced features of PCIe become available if a PCIe driver is created [75]. Although mostly used for host-to-device communication, PCI Express may be used for peer-to-peer communication as well [76]. Two host devices may be directly connected and use PCIe for high-speed communication, or more units may be connected together through switches, creating a PCIe network [76]. One of the main advantages of PCIe compared to other communication techniques is its small need for silicon area [76].
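The credit-based flow control described above can be illustrated with a toy simulation (illustrative only; the buffer size, batch size and drain rate are made-up values, whereas real devices negotiate them at link initialisation):

# Toy model of PCIe-style credit-based flow control with batched updates.
BUFFER_SLOTS = 8          # assumed receive buffer size
CREDIT_BATCH = 4          # assumed batching of credit updates

credits = BUFFER_SLOTS    # transmitter's view of free receiver slots
rx_queue = 0              # packets waiting in the receiver's buffer
freed = 0                 # drained slots not yet reported back
sent = stalled = 0

for cycle in range(64):
    if credits > 0:       # one credit is consumed per packet sent
        credits -= 1
        rx_queue += 1
        sent += 1
    else:
        stalled += 1      # transmitter must wait for a credit update
    if rx_queue > 0 and cycle % 2 == 0:    # receiver drains at half rate
        rx_queue -= 1
        freed += 1
    if freed >= CREDIT_BATCH:
        credits += freed  # one batched update instead of many small ones
        freed = 0

print(f"sent={sent}, stalled cycles={stalled}")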

E.1 Associated Overhead

With PCI Express, as with almost every other protocol, there is an associated overhead. This overhead is mainly due to routing information, packet bit integrity and other control information. Furthermore, since PCIe utilises both credit-based flow control and acknowledgements, even more overhead is imposed [68]. The first thing to look at is the packet size on a PCIe network. The packet size and layout may be seen in Figure E.1. As may be seen, the maximum overhead in a packet is 28 bytes, given that the CRC code in the transaction layer and 64-bit addressing are used [68]; the minimum overhead, with no transaction-layer CRC and 32-bit addressing, is 20 bytes. This, combined with a payload of 0-4096 bytes, makes the maximum achievable link utilisation 4096/(4096 + 20) = 99.5% data bits. This is a lot and makes PCIe a very good choice for a high-efficiency network or communication link. Furthermore, each device has a maximum allowed payload [68], and when the system is initially set up, the device with the smallest allowed payload dictates the maximum payload size in the entire PCIe network. This may impact performance, since limiting the payload reduces the efficiency of the transfers. For example, if the maximum payload is 128 bytes, the maximum theoretical utilisation is limited to 128/(128 + 20) = 86.5%, which is significantly less than when the payload is four kilobytes. This makes the design of the network crucial in this sense: the designer needs to ensure that all devices can handle the payload size calculated for. Summarising, we see that the overhead associated with PCIe is not exceptional but actually rather modest, given that the entire network is set up correctly. E.g., it may be seen [68] that in real-world examples, a write over PCIe may achieve around 85% of the theoretical throughput of a link, while a read has a lower utilisation due to the latency of the memory controller and not due to the PCIe protocol.
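The payload dependence is easy to tabulate. A small sketch, using the minimum 20-byte per-packet overhead from the text:

# Link utilisation as a function of the negotiated maximum payload.
OVERHEAD = 20             # minimum per-packet overhead, in bytes

for payload in (128, 256, 512, 1024, 2048, 4096):
    utilisation = payload / (payload + OVERHEAD)
    print(f"{payload:5d} B payload -> {utilisation:.1%}")
# 128 B -> 86.5%, 512 B -> 96.2%, 4096 B -> 99.5%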


Appendix F

Gigabit Ethernet

The widespread infrastructure already built around Ethernet is a significant factor when discussing it. The enormous amount of equipment already sold to the market, and the rather well-built infrastructure that supports it, make Ethernet a very cost-effective competitor when it comes to high-speed interconnects. Ethernet has evolved from being a half-duplex 10 Mbps medium [77] when the first Ethernet standard was released, to now supporting up to 100 Gbps in full duplex over fiber-optic cable [78]. This is quite remarkable progress: the effective two-way bandwidth has increased about 20,000 times. However, the most recent technology is not mainstream yet; today the mainstream consumer standard is 1 Gbps, or Gigabit Ethernet (GbE). This makes it a gigabit transfer link at a very low cost in COTS components, which is why the focus here is on GbE.

However, GbE is only a standard up to the second layer of the OSI reference model (Appendix D). Above that, there are several different implementations of protocols which deliver packets over GbE; the most well known is the TCP/IP protocol suite, described in Appendix G. It is probably the most common protocol suite today, since it is used for most internet applications [79]. Another way of communicating over Ethernet is to send raw LLC frames, which is fast since it introduces little overhead, but unreliable due to the lack of a delivery guarantee.

Gigabit Ethernet is specified in section three of the IEEE 802.3 standard [80], which specifies several operating modes, over copper as well as over single- and multi-mode fiber optics. The different techniques differ slightly in the physical layer but are identical from the data link layer and upwards. As shown in Figure F.1, the Logical Link Control (LLC) and Media Access Control (MAC) are identical and independent of the underlying physical layer [79], conforming to the idea of layering for interoperability.

The bottom part of the data link layer, the MAC layer, specifies the fundamental frame and packet in which all transactions over Ethernet take place. The packet is shown in Figure F.2, where it may also be seen which parts of it build the frame. As specified in section one of the IEEE 802.3 standard [81], these frames


[Figure: the Gigabit Ethernet layer stack: Higher Layers; Logical Link Control; Media Access Control; Reconciliation; GMII; Physical Medium Attachment; Physical Medium Dependent; MDI; Medium, grouped into the network, data link and physical layers]

Figure F.1. An overview of the layers in Gigabit Ethernet

[Figure: Ethernet packet layout: Preamble (7), SFD (1), Destination Address (6), Source Address (6), Length/Type (2), Payload (46-1500), Frame Check Sequence (4), Inter-Frame Gap (12)]

Figure F.2. The Ethernet MAC frame with its surrounding physical layer preamble, Start-Of-Frame Delimiter and Inter-Frame Gap. The frame is marked in the image and is the actual Ethernet frame. The data outside of the frame is there to keep the link working according to its specifications.

have a size of between 64 bytes minimum and 1518 bytes maximum, given that they are basic frames. There are also Envelope frames, with a maximum length of 2000 bytes, which are intended for higher-layer protocols that need extra information [81]; however, according to the standard, these are not recommended for other purposes. In the frame, there are three fields: the length/type field, the payload and the frame check sequence [81]. The length/type field is a two-byte field that has two different interpretations depending on the value in the field. If the value is

greater than or equal to 1536, the field is interpreted as a type field and indicates which kind of upper-layer traffic the packet carries. If the value is smaller, it is interpreted as a length field and indicates the number of bytes in the payload part of the frame. The payload is, as it sounds, the place where the data from upper layers is placed in order to be sent. The size of the payload field in standard frames is between 46 and 1500 bytes; if the actual payload is less than 46 bytes, it is padded to 46 bytes [81]. The last part is the frame check sequence, a 32-bit CRC number calculated by a polynomial as stated in 3.2.9 of [81] to increase signal integrity. The CRC check covers the bits from the first bit of the destination address to the last bit of the payload. In the packet there are additional fields outside the frame. The first is the seven-byte preamble, which is used for synchronisation of the Ethernet interfaces [72]. Following the preamble is the SFD; this one-byte field signals to the receiver that the packet starts after this byte. It is followed by the destination and source MAC addresses of 48 bits each, which are the physical layer addresses of the network cards. After the addresses the frame is sent, followed by an optional extension. The extension is used in the half-duplex mode of GbE to ensure that all packets can be detected by the collision-detection algorithm CSMA/CD; it ensures that all transmissions on GbE are at least 4096 bits, or 512 bytes, long, so that collisions will be detected. Furthermore, Ethernet requires an inter-packet gap (IPG) of at least 96 bits [81], i.e. two separate transmissions over Ethernet need at least 96 bit times in between, for both sender and receiver to be allowed to recover from the last transmission. At the top of the hierarchy we have the Logical Link Control (LLC), whose frame layout is specified in [82].

F.1 Real-Time Ethernet

The area of real-time communication is an ever-growing field, and there have been several technologies to provide the means to extract data in a real-time fashion. Due to the low cost and high availability of high-performance Ethernet systems, even real-time systems are evolving towards being Ethernet-based, as in [83], where a real-time Ethernet solution is evaluated as a replacement for a standard vehicle transfer medium in cars. That study suggests that real-time Ethernet is a viable competitor to earlier real-time communication standards.

There are several different kinds of real-time implementations of Ethernet protocols, for example TTEthernet, AFDX, Profinet and EtherCAT; [84, 85] give an overview of which protocols might be suited for which application. These protocols realise their real-time behaviour in different ways: some are token-based, where only one host is the sender, while others limit device bandwidth or predefine a schedule for when the Ethernet connection is available to the sender. These approaches all have the same goal, to avoid collisions [84], which seems to lead to a predictable network that fulfils the demands of a hard real-time system [84]. There are also other implementations that use the regular TCP stack [86], but these suffer from the poor real-time performance of TCP, with long delays and non-guaranteed delivery times. This may be improved by designing the network carefully, but it still cannot guarantee hard real-time demands. By utilising other protocols with more real-time behaviour, a more reliable communication can be created, better suited for real-time use. By using a scheduler, as stated before, a lot of real-time performance may be achieved [84, 86]. The scheduling algorithm could be e.g. Earliest Deadline First (EDF) or Rate Monotonic Scheduling (RMS) [87, 88]; Liu and Layland proved in 1973 [89] that feasible schedules can be constructed for both EDF and RMS.

F.2 Efficiency of Gigabit Ethernet

The maximum theoretical throughput of GbE depends on the total overhead associated with GbE. As seen earlier in this appendix, several parts add to the overhead of a transfer. The theoretical throughput is the payload divided by the sum of payload and overhead, as seen in Equation F.1, where Θ is the throughput.

Θ = payload / (payload + overhead)    (F.1)

The overhead associated with a GbE frame is firstly the length/type field of two bytes and the frame check sequence of four bytes, giving a frame overhead of six bytes. Additionally, in the packet, there is an 8-byte start-up sequence (preamble and SFD), 12 bytes of addresses and, between packets, the inter-packet gap of 12 bytes, making the total overhead 38 bytes. To make things even worse, the minimum transmission size is 512 bytes on a network which allows collisions, which worsens the conditions for small packets, making GbE a less good transfer technique on collision-allowing networks.

In Figure F.3 we may see that it is rather inefficient to send small packets over Gigabit Ethernet, but that initially the throughput rises rather quickly. However, somewhere around 1000 bytes of payload the increase starts to level off, and payload increases need to be quite large to increase the effective throughput. It is also visible that as the payload approaches 10 kB, the effective throughput is almost the full gigabit per second.
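Equation F.1 is straightforward to evaluate with the 38-byte overhead and the 46-byte minimum payload derived above; this sketch reproduces the 975 Mbps figure quoted earlier for a 1500-byte payload:

# Theoretical GbE throughput for a given payload per frame (Eq. F.1).
# Overhead: preamble+SFD 8, addresses 12, length/type 2, FCS 4, IPG 12.
OVERHEAD = 38
MIN_PAYLOAD = 46          # smaller payloads are padded up to 46 bytes
LINE_RATE_MBPS = 1000

def gbe_throughput_mbps(payload):
    on_wire = max(payload, MIN_PAYLOAD) + OVERHEAD
    return LINE_RATE_MBPS * payload / on_wire

print(gbe_throughput_mbps(46))    # ~548 Mbps: small frames are costly
print(gbe_throughput_mbps(1500))  # ~975 Mbps: the figure quoted above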


[Figure: theoretical Ethernet throughput (Mbps) versus payload size (B), on a logarithmic x-axis, for frame sizes 1518 B, 4088 B, 9014 B and unlimited]

Figure F.3. Effective throughput of gigabit Ethernet networks with varying payload on logarithmic x-axis. The theoretical limit is plotted for several different frame sizes, i.e. different payloads in a single packet, and it is seen that it may increase performance substantially to use a larger frame size.


Appendix G

TCP/IP Protocol Suite

The TCP/IP protocol suite is one of the most widespread protocol suites for communication over Ethernet, since it is where the protocols used for Internet access reside. Here, a brief description of a selection of them will be presented.

G.1 The Internet Protocol Version 4

The Internet Protocol (IP) is a host-to-host protocol [90] which delivers datagrams from one host to another. The protocol is the underlying architecture which usually serves TCP or UDP applications. It is probably the most common protocol, since it is used for messages over the Internet, the most famous network in the world. IP handles two things: addressing and fragmentation [90]. The addressing scheme is called IP addresses, and each address is unique on the network. The address is used to route the datagrams correctly so that they reach their destination in the network. An address is always 32 bits long, but addresses are differentiated into several classes. Class A addresses are for big networks, with the first bit set to zero, the next seven bits being the network number, and the final 24 bits specifying the local address on the network. Class B has 16 bits for the network number (always starting with 10) and 16 bits for the local address, and Class C has 24 bits for the network number (always starting with 110) and the last eight bits specifying the local address on the network [90]. The other thing that IP handles is fragmentation, which is the ability to take larger packets and split them into smaller ones for transmission over networks with smaller payload sizes [90]. Datagrams may be marked not to be fragmented, but that comes with the risk of being discarded if a data link too narrow to hold the entire packet has to be traversed. If a packet is fragmented, a status flag is set in the IP header. Then, for each fragment, an offset is specified in the header of that fragment. That way, no two fragments can be misinterpreted, since no two fragments may have the same offset. The fields of the IPv4 header [90], as may be seen in Figure G.1, are used in


[Figure: IPv4 header layout, bits 0-31: Version, Header Length, Type Of Service, Total Length; Identification, Flags, Fragment Offset; Time To Live, Protocol, Header Checksum; Source Address; Destination Address; Options, padded to a 32-bit boundary; followed by the data]

Figure G.1. An IPv4 Packet Header with optional options field following it

order to route the message correctly and to interpret it correctly. The first field, the version, tells which version of the IP protocol is used. This is followed by a four-bit field with the length (in 32-bit words) of the header. The header is always at least five words, and may span up to 15 (the maximum value of a 4-bit number) at most, in which case ten words of options follow the address fields. The Type of Service field is an 8-bit field which indicates some quality-of-service parameters that might be used for switching and routing of messages [90]. For example, a message may specify that it needs low delay, and in that case the routing may be done in an attempt to reduce delay. The following field specifies the total length of the IP datagram, which may be at most 64 kB of data. The IP specification [90] says that the total length may be as much as 65,535 bytes, but only requires that a host be able to accept 576 bytes of data. The next word contains fields for reassembly of fragmented packets [90]. The Identification field holds a 16-bit identifier for the datagram. The flags tell whether the packet is fragmented, and whether this packet is the last fragment in the series. The final field is a 13-bit field specifying the offset of this packet with respect to the first packet in a fragmentation. The offset is specified in number of double-words, i.e. 64 bits times the value in the offset field is the actual offset in bits. The third word of the header starts with the time-to-live, which is a mechanism to remove packets that cannot be delivered in a network, after a specified time or a specified maximum number of hops [90]. The next field is the Protocol, which specifies what upper-level protocol is used above IP; some examples are TCP, which has the number six (6), and UDP, which has the number seventeen (17) [91]. This word ends with a 16-bit checksum to ensure the data integrity of the IP header [90]. It is important to note that this checksum only ensures the integrity of the header, and not of the rest of the datagram. The final two words are the source and the destination address, which are composed as stated above. They are

These addresses are unique and are used when making routing decisions. After the address fields, optional Option words may follow if the header length is greater than five. The options may be used to select a security class, or to tell routers which routing algorithm to apply [90]. One final point about the Internet Protocol is that it is media independent: even though it is sometimes treated as a synonym for Ethernet, there is nothing in either the Ethernet or the IP standard that ties them to each other.
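As an illustration of the header layout described above, the following Python sketch parses the fixed 20-byte IPv4 header and verifies its checksum. It is a minimal illustration, not part of any implementation in this work; the field positions follow Figure G.1.

import struct

def ip_checksum(header: bytes) -> int:
    """One's-complement sum of 16-bit words, as used for the IPv4 header."""
    if len(header) % 2:
        header += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def parse_ipv4(header: bytes) -> dict:
    """Parse the fixed part of an IPv4 header (the first 20 bytes)."""
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", header[:20])
    return {
        "version": ver_ihl >> 4,
        "header_words": ver_ihl & 0x0F,                # header length in 32-bit words
        "type_of_service": tos,
        "total_length": total_len,                     # bytes, header plus data
        "more_fragments": bool(flags_frag & 0x2000),   # "more fragments" flag
        "fragment_offset": (flags_frag & 0x1FFF) * 8,  # field counts 8-byte units
        "time_to_live": ttl,
        "protocol": proto,                             # 6 = TCP, 17 = UDP
        "source": ".".join(str(b) for b in src),
        "destination": ".".join(str(b) for b in dst),
        "checksum_ok": ip_checksum(header[:20]) == 0,  # sums to zero when intact
    }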

G.1.1 Efficiency of the Internet Protocol Datagrams

[Figure G.2: a plot of theoretical IP throughput (Mbps, 0-1000) versus payload size (B, logarithmic scale from 10^0 to 10^5), for Ethernet frame sizes 1518 B, 4088 B and 9014 B, and for an unlimited frame size.]

Figure G.2. Maximum theoretical throughput when sending standard IP datagrams over an Ethernet link with varying IP payload. The graph shows several existing Ethernet frame sizes and their IP throughput, compared to an "unlimited" packet, i.e. a packet of arbitrary size.

The IP datagram has the potential to be a very efficient protocol, as the description above suggests. The header needs only five 32-bit words, or a total of 20 bytes. Compared with the large maximum payload of 64 kB, the overhead of an IP datagram is negligible: less than 0.05%. In practice, however, IP is typically sent over an Ethernet network with a maximum transfer size of 1500 bytes in the standard case, or 9000 bytes in a common Jumbo frame case.

The overhead is then still small, but it grows to around 1.3% in the standard case and 0.2% in the Jumbo frame case. Very little overhead is thus added to the transfer when large Ethernet frames are used; the total overhead of the Ethernet standard itself dominates over the extra overhead induced by IPv4. Examining Figure G.2, where the Ethernet payload size is on the x-axis and the resulting IP throughput on the y-axis, it can be seen that at least 20 B of Ethernet payload is needed just to hold the IP header, which is why the graph starts there. A comparison between IP datagram efficiency and raw Ethernet frame efficiency (seen in Figure 12) shows the increased overhead of IP datagrams: they reach 90% utilisation at about 170 B of Ethernet payload, whereas raw Ethernet reaches 90% utilisation at just over 120 B. This is of course natural, since IP datagrams provide additional features compared to raw Ethernet.
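The overhead figures above are easy to verify. The small Python sketch below assumes the standard per-frame Ethernet cost of 38 bytes (8 B preamble, 14 B header, 4 B FCS and 12 B inter-frame gap) and a 20-byte IPv4 header, and computes the fraction of the wire rate left for IP payload; it is a back-of-the-envelope calculation in the spirit of Figure G.2, not an exact reproduction of it.

LINK_OVERHEAD = 38   # bytes per Ethernet frame: preamble 8 + header 14 + FCS 4 + IFG 12
IP_HEADER = 20       # bytes, minimum IPv4 header

def ip_payload_fraction(mtu: int) -> float:
    """Fraction of the wire rate left for IP payload at a given MTU."""
    return (mtu - IP_HEADER) / (mtu + LINK_OVERHEAD)

for mtu in (1500, 9000):
    print(mtu, round(ip_payload_fraction(mtu), 3))   # 0.962 and 0.994

# The IP header alone costs 20/1500 = 1.3 % of a standard frame's payload
# and 20/9000 = 0.2 % of a jumbo frame's payload, as stated above.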

G.2 The User Datagram Protocol

The User Datagram Protocol (UDP) is one of the transport-layer protocols in the TCP/IP protocol family [73]. It is designed to send datagrams between applications running on different computers, but it guarantees neither freedom from duplicates, nor data arrival, nor the integrity of the data payload [92]. Its specification [92] even states that it is unreliable, and that TCP is the choice if ordered, reliable delivery is required. However, this simplicity is also an advantage in small embedded systems that lack the processing power to handle a complete TCP stack, since UDP is not as computationally intensive as TCP.

Word 1: Source Port (16 bits) | Destination Port (16 bits)
Word 2: Length (16 bits) | Checksum (16 bits)

Data

Figure G.3. A UDP packet. The Length field gives the size of the whole datagram in bytes, including the eight-byte header.

Looking at Figure G.3, we see that the overhead of a UDP datagram is only 64 bits, or 8 bytes. Thus UDP does not add significant overhead compared to a plain IP datagram. The difference lies in the two fields named Source Port and Destination Port, which specify which logical port incoming data should be delivered to [73]; this makes it possible to distinguish several applications running on the same host, so that each application only receives its own messages. The difference in effective throughput between plain IP and UDP/IP datagrams can be seen by comparing Figure G.2 and Figure G.4. The addition of eight bytes of overhead makes it possible to steer traffic into different ports at very little extra cost from a throughput perspective.

This provides the functionality to direct traffic to specific applications, and thus allows several streams of data to share one single physical link while still being kept apart.
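To make the port mechanism concrete, the following minimal Python sketch sends one UDP datagram between two sockets on the same machine; the host, port number and payload size are arbitrary example values.

import socket

# Receiver: bind a datagram socket to a local port and wait for one packet.
rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
rx.bind(("127.0.0.1", 50000))

# Sender: no connection setup; every sendto() is one self-contained datagram.
# 1472 B payload + 8 B UDP header + 20 B IP header fills a 1500 B Ethernet MTU.
tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
tx.sendto(b"x" * 1472, ("127.0.0.1", 50000))

data, sender = rx.recvfrom(2048)
print(len(data), "bytes from", sender)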

[Figure G.4: a plot of theoretical UDP throughput (Mbps, 0-1000) versus payload size (B, logarithmic scale from 10^0 to 10^5), for Ethernet frame sizes 1518 B, 4088 B and 9014 B, and for an unlimited frame size.]

Figure G.4. The theoretical maximum transfer throughput of UDP packets over a Gigabit Ethernet link. As we see there are different achievable throughputs depending on the actual Ethernet frame size.

G.3 The Transmission Control Protocol

Originally intended for the military [19], the Transmission Control Protocol (TCP) is probably the most widespread protocol for connecting equipment together in networks today. It is designed as a reliable, connection-oriented protocol with guaranteed delivery once a connection is established. It maps to the fourth layer of the OSI model (see Appendix D) and is designed to work with several underlying architectures. TCP is built on top of lower-level protocols, most often IP, and relies on their ability to address hosts and to provide fragmentation support [19]. TCP is meant to be a communication method between two processes, and interfaces with the higher layers of the OSI model to provide this inter-process communication. By selecting TCP as the communication protocol, users get a reliable protocol which ensures delivery, reordering of packets that arrive out of order, suppression of duplicates, and retransmission of damaged packets.

This is ensured by a handshake mechanism in which the receiver transmits an acknowledgement each time a message is accepted [19]. If no acknowledgement has arrived before a timeout expires, the packet is retransmitted. However, in order to provide all these guarantees, TCP is a more costly protocol than UDP in terms of CPU load and bandwidth [93]; for example, each acknowledgement sent back consumes bandwidth, which may impair network performance. One difference with respect to UDP is that TCP is connection oriented, i.e. it needs to set up a connection before sending data [73], which ensures that data sent later will also be received. The round-trip time (RTT) affects the maximum throughput of TCP implementations, and a large RTT makes throughput drop [94].
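As a rough illustration of why a large RTT hurts: a sender can have at most one receive window of unacknowledged data in flight per round trip, so a back-of-the-envelope upper bound on the throughput is

    throughput <= window size / RTT.

With the classic 64 kB maximum window, an RTT of 1 ms allows at most about 64 kB / 1 ms, roughly 524 Mbps, while an RTT of 10 ms caps the rate at roughly 52 Mbps, far below the Gigabit Ethernet line rate. (The numbers are purely illustrative; TCP window scaling raises this limit in practice.)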

G.3.1 Socket Buffer Size

The default socket buffer size in Linux has not always been adequate for handling TCP traffic over high-speed links [26, 95]. It has been shown that increasing the buffer size also increases the average throughput of TCP, since fewer or no packets are dropped due to insufficient buffering. This applies to both the sending and the receiving buffers, since the protocol needs to keep data in memory for at least one round-trip time, the time it takes for data to go from the sender to the receiver and back again, while waiting for acknowledgements. The required buffer size is thus related to the round-trip time: roughly, the buffer should hold at least the product of the link bandwidth and the round-trip time, and hence, when sending messages over long distances, the window may need to be quite large to sustain maximum throughput [26].
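The following Python sketch shows one way of enlarging the socket buffers from an application, sizing them to the bandwidth-delay product; the link rate and round-trip time are assumed example values, and the operating system may clamp the requested size.

import socket

LINK_BPS = 1_000_000_000                 # assumed link rate: 1 Gbps
RTT_SECONDS = 0.05                       # assumed round-trip time: 50 ms
bdp = int(LINK_BPS / 8 * RTT_SECONDS)    # bandwidth-delay product: ~6.25 MB here

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, bdp)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, bdp)

# The kernel may clamp the request, so read back what was actually granted.
print(sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))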

G.3.2 Different TCP Implementations

There are several different implementations of TCP [26]; most of them have evolved as attempts to reduce latency, increase throughput, or improve some other characteristic in a specific application [79], and consequently they all perform differently in different situations. Wu et al. [26] evaluate how different TCP variants behave when transmitting over a widespread Wide Area Network, while Callegari et al. [79] compare the TCP implementations available in a modern Linux operating system. Both studies conclude that for long-distance transmissions, an implementation called Scalable TCP offers the best solution. In [79], the results show that an implementation called CUBIC is the best overall performer, but that for both wired in-house networks and wireless networks, the standard TCP implementation is not far behind. It is also visible that standard TCP suffers less from moderate packet losses, in terms of maximum throughput, than it suffers from large round-trip times. Some TCP implementations are also compared in [94]. That study shows that many factors limit throughput, one of them being the amount of reverse traffic.

The study shows that when transmitting unidirectionally over a switched backbone network, performance is better than when transmitting bidirectionally, despite the full-duplex nature of GbE and 10GbE.

G.3.3 TCP Offload Engine

A TCP Offload Engine (TOE) is a co-processor on a NIC that relieves the CPU of the TCP processing, making overall processing more efficient since the CPU is free to spend more time on non-networking work. One approach for embedded systems is to use an IP (Intellectual Property) core which does the TCP encoding/decoding. Such a solution is presented in [96], where an IP core achieves almost full TCP bandwidth in full-duplex mode. However, this core delivers a minimal TCP implementation: it supports only one channel, no jumbo frames, and nothing else that is not essential for TCP. A more limited approach is taken in [97], where an IP core handling UDP/IP packets is created for communication between a PC and an FPGA, or for UDP communication between FPGAs. Restricting the scope to UDP does, however, reduce the resource usage of the core substantially. Furthermore, the transfer rate between the FPGA and the PC is not limited by the UDP/IP core, but by the PC, at approximately 90% of the GbE link rate.

G.3.4 RDMA-Enhanced TCP Decoding

Since TCP decoding takes up a lot of processing time in modern high-speed networking, several enhancements have been tried in order to relieve the CPUs of the actual TCP decoding. One technique is RDMA (Remote DMA)-enhanced network interface cards, which perform the entire TCP decoding and even transfer the data directly into application memory [98, 99, 100]. This approach has already been taken in Infiniband (see Appendix K), and the idea has now moved to TCP/IP over Ethernet as well [98]. RDMA over TCP/IP is known as iWARP [98] and has great commercial prospects, mainly due to the widespread use of Ethernet and the fact that Ethernet NICs are decreasing dramatically in cost. The greatest benefit of using RDMA is to relieve the main CPUs of handling the TCP/IP stack [98, 99]. There are several solutions for relieving the CPU: one is TCP offload engines (TOEs, described in the previous section), another is RDMA. RDMA is more advanced, since it provides an API with commands that put received data straight into application memory without any OS intervention, thus creating a zero-copy protocol. The difficulty with zero-copy is that it must support out-of-order arrival of packets as well as dropped packets. Attempts have been made to let regular socket programming benefit from RDMA, and possibly no source-code changes will be needed, only a recompilation against new libraries, to fully exploit the benefits of an RNIC [98]. However, the efficiency of this approach remains to be proven, since it is hard to foresee how it will work out.


Nevertheless, testing shows that RNICs, with their RDMA capability, have an edge over setups where no DMA is used [98, 99, 100]. The experiments show that using RDMA may increase performance by as much as 30% [98], and it is clearly stated in [100] that it frees the CPU for other tasks, since the TCP/IP performance increases even more when running computationally intensive benchmarks than when running less intense ones.

G.3.5 TCP Efficiency Over Ethernet

Just like the UDP and IP protocols travelling across Ethernet, the TCP protocol introduces an overhead. The overhead of a TCP packet is that of the underlying protocols plus its own header, drawn in Figure G.5. The TCP header is somewhat larger than that of UDP (20 vs 8 bytes), but the main difference is TCP's built-in reliability. The reliability comes at the price of considerable computation, and of acknowledgements being sent back to the sender. However, since the protocol does not mandate any particular hardware, these computations do not show up in the protocol efficiency figures. The theoretical throughput achievable by TCP over an Ethernet link is shown in Figure G.6.

Word 1: Source Port (16 bits) | Destination Port (16 bits)
Word 2: Sequence Number (32 bits)
Word 3: Acknowledgement Number (32 bits)
Word 4: Data Offset (4 bits) | Reserved (6 bits) | Control Bits (6 bits) | Window (16 bits)
Word 5: Checksum (16 bits) | Urgent Pointer (16 bits)
Words 6- (optional): Options, padded to a 32-bit boundary (the header is usually 20 bytes)

Data

Figure G.5. A TCP packet outline. Every part of the packet is shown; among the most important fields are the Source and Destination ports, indicating which flow is concerned, and the sequence and acknowledgement numbers, which are used to guarantee in-order, reliable delivery [19].


[Figure G.6: a plot of theoretical TCP throughput (Mbps, 0-1000) versus payload size (B, logarithmic scale from 10^0 to 10^5), for Ethernet frame sizes 1518 B, 4088 B and 9014 B, and for an unlimited frame size.]

Figure G.6. The theoretical maximum transfer throughput of TCP packets over a Gigabit Ethernet link. As we see there are different achievable throughputs depending on the actual Ethernet frame size.


Appendix H

Link Port for TS20X-Series

The TS20X series is a DSP series from Analog Devices which goes under the name TigerSHARC. These processors have an LVDS [101] interface called the Link Port. The Link Port is a full-duplex LVDS channel which may have a bus width of one or four lanes. Since the internal data bus width of the DSP is 128 bits, the data sent in one transaction is also 128 bits [13] (a quad-word). Unlike RapidIO or PCI Express, Link Ports do not embed the clock signal in the data; it is sent as a separate LVDS signal alongside the data. Additional signals carried alongside the data and clock are an acknowledgement signal and a block-completion signal [13]. One feature of the Link Ports on the TigerSHARCs is their DMA engines: there is one DMA at the receive side and one at the transmit side of each link port, and these DMAs may interface with internal or external memory as well as with other link-port buffers. Another potential benefit of the Link Port is its completion signal [13]. This signal enables the Link Port to accept arbitrarily sized packets, so that packet sizes need not be predefined. This is in contrast to many other protocols, e.g. Ethernet, PCIe or RapidIO, where a packet header informs the receiver of the size of the packet.

H.1 Performance of Link Ports

The Link Port is, as stated earlier, an LVDS-signalled bus protocol. The inclusion of a dedicated clock signal enables the receiver to sample the inputs on every clock edge, i.e. it is a Double Data Rate (DDR) protocol [13]. Together with the maximum bus width, this means that the maximum data transfer rate per direction is approximately 4 lanes * 2 transfers per clock * the bus frequency. According to the data sheet [2], this frequency may be up to 500 MHz, which gives a per-direction transfer rate of 4 Gbps, or 0.5 GB/s. Since, according to the Link Port specification [13], there is no need to send any control characters, nor any maximum transfer length, a useful bandwidth of 0.5 GB/s can be maintained for as long as necessary, which is an uncommon feature among protocols.

However, this also implies that any bit errors that occur during a transfer go unnoticed. There are means to improve reliability by sending a checksum byte after every completed quad-word; when the checksum option is turned on, the sender also sends a dummy byte after the checksum byte [13].
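The rates above, and the cost of the optional checksum, can be summarised in a few lines. The sketch below follows the reading of [13] given here, i.e. that one checksum byte plus one dummy byte accompany every 16-byte quad-word when the checksum option is on; it is an illustration, not a measured figure.

def link_port_rate(clock_hz: float, lanes: int = 4, checksum: bool = False) -> float:
    """Payload rate in bits per second for one direction of a Link Port."""
    raw = lanes * 2 * clock_hz          # bits/s: DDR doubles the clock rate
    if checksum:
        raw *= 16 / 18                  # 16 payload bytes out of every 18 sent
    return raw

print(link_port_rate(500e6) / 1e9)                 # 4.0 Gbps, i.e. 0.5 GB/s
print(link_port_rate(500e6, checksum=True) / 1e9)  # ~3.56 Gbps with checksum on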

H.2 Uses of Link Ports

In [101] a cluster of four TigerSHARC processors is used to speed up and parallelise a computation task. All processors within the cluster are interconnected, enabling all of them to communicate simultaneously with each other. The idea was to have one DSP receive all incoming data and split the task amongst all processors, and finally to assemble the result in another DSP, which is then responsible for sending the calculated result further down the communication line. To do this, an upper-level protocol was used: a zero-copy protocol that utilised predetermined memory areas for each of the link ports, and flags set in memory to indicate the status of a transmission [101]. This way, data could be sent using the DMA engine in the link port, and all transmission could be done without the intervention of the DSP core. There are several other uses as well, such as a radar application described in [15], where link ports are used for DSP communication in order to create real-time SAR radar images. Another radar application that utilises link ports is presented in [14], where a pulsed Doppler radar had its backbone interconnected with link-port links.

Appendix I

RapidIO

RapidIO is classified as an intra-system interconnect [9] with a strong focus on high-performance embedded computing. RapidIO may be implemented with a low footprint on the silicon and is rather transparent to software, which, in addition to it being an open standard, makes it worth considering when designing embedded systems. There are two types of RapidIO, serial and parallel, but both share the same base characteristics. One of the most important goals is to keep transaction overhead low [9] in order to utilise the communication links well. Furthermore, the main focus has been on an in-chassis interconnect, meaning short distances and very high throughput. This holds to different degrees for serial and parallel RapidIO: parallel is more suited for short distances, while serial suits somewhat longer connections [9]. RapidIO is a layered architecture with three layers: the Logical Layer, the Transaction Layer and the Physical Layer. These do not map directly onto the OSI layers; instead, as seen in Figure I.1, the top RapidIO layer maps to the fourth and fifth OSI layers, the middle RapidIO layer maps to the third OSI layer, and the lowest RapidIO layer maps to the first and second OSI layers [6].

I.1 The Logical Layer

RapidIO has three operating modes: message passing, globally shared memory and an I/O-oriented mode [102]. The I/O-oriented mode sends data as simple read/write operations between RapidIO endpoints. A single write may carry any data up to 256 bytes, which is the maximum packet payload. In this mode there are only six types of messages: Read, Write, Write-with-response, Streaming write, Atomic and Maintenance [7]. Of these six, three generate responses from the receiver: Read, Write-with-response and Atomic. The Atomic operation is a read-modify-write operation: it atomically reads a memory location (byte, half-word or word), returns the read value, and then writes the new data sent with the Atomic command back to that location.


RapidIO layer        OSI layer(s)
Logical Layer        Session layer, Transport layer
Transaction Layer    Network layer
Physical Layer       Data Link layer, Physical layer

Figure I.1. How the layers of the RapidIO model map onto the OSI model, as described by the RapidIO specification [6]. The RapidIO layers are larger than the OSI layers, containing more functionality. The idea is to have a small, common Transaction Layer and different Logical and Physical layers communicating through it.

The Atomic operation may perform several different modifications of the data at the specified location: set all bits, clear all bits, increment or decrement the value by one, and three conditional write operations [7]. The other operations are [7]:

• Read operations request data from a specified address. The packet has a total size of between 52 and 84 bits, depending on the size of the memory address space (34, 50 or 66 bits). Each Read is answered by a Response packet that contains 20 bits of overhead plus the number of double-words specified in the Read request.

• Write operations send data, from a single byte up to a total of 64 double-words, to a specified address and expect no answer. The operation takes no responsibility for system-wide cache coherency. The packet has a 52, 68 or 84 bit overhead, depending on the size of the address space, in addition to the one to 64 double-words of data.

• Streaming Write operations write a stream of aligned double-words from the sender to the receiver. This is a special write operation which induces less overhead into the transaction: only 32, 48 or 64 bits, in contrast to the regular write operation. The price is that the transmission must be double-word aligned and cannot carry anything that is not an integer number of double-words, in contrast to the normal write transaction. This transaction type also has the maximum payload of 256 bytes.


• Write-with-response operations are identical to ordinary write operations except for a response packet that the receiver sends upon completion. This makes them less efficient, yet more reliable, than ordinary writes. The response is not necessarily big, however; only 20 bits are needed.

The message-passing operating mode supports mailbox-like communication in which RapidIO endpoints send messages to each other [7]. A message may span from one byte up to 16 consecutive packets, for a total of 4 kB. When a message consists of multiple packets, the standard allows the packets to arrive out of order [7] while still guaranteeing that the message is delivered in correct order. When using maximum-sized packets, the standard allows up to four parallel mailboxes, which could be used by e.g. different applications at the receiving endpoint. As with every network, there are some considerations to take into account when constructing a RapidIO network. The routing algorithm is not specified, so deadlock is possible [7]; this may be avoided by using a deadlock-free routing algorithm, e.g. dimension-ordered routing [7, 103]. There are two types of request packet formats in the message-passing scheme, Doorbell and Data Message, and both are briefly explained below [7]:

• Doorbell messages are a means of sending short messages between endpoints. Doorbell messages have their own queue, which they do not share with the regular mailboxes used by Data Messages. The total length of a Doorbell message is 36 bits, with a 16-bit field carrying the message itself. This, in addition to the 20-bit response on reception, means that the overhead of Doorbells is significant. They may, however, serve well as e.g. control signals or for passing small amounts of data.

• " Data Message messages are the messages that goes into mailboxes at the receiver. It is a sender-initiated communication where the sender sends any- thing from a single byte, up to 16 full-length packets. These packets are further guaranteed to arrive in-order which alleviates upper-level sorting of data. Even though bringing some extra features to the transmission, the ac- tual overhead is very small, only 20 bits needed per packet for the entire transmission. This, in a 16-packet transmission is 320 bits, or 40 bytes, for a theoretical maximum payload of 16*256B = 4kB, which is lower than one percent overhead.

I.2 Transaction Layer

The transaction layer is a common layer for all logical specifications of RapidIO as well as all physical specifications. This layer adds a header with source and

destination addresses, so that the physical layer can send the packets to their intended destination. The idea with one common transaction layer is to provide a single middle layer between several physical and logical layers [7], giving them all a common way to communicate with each other. The transaction layer adds three fields with a total of 18 or 34 bits: a two-bit field which states whether 8- or 16-bit addressing is used, followed by either two 8-bit or two 16-bit address fields that specify the source and destination addresses of the packet. The added overhead is thus smaller in a system with at most 256 endpoints than in a system with more than 256 endpoints.

I.3 Physical Layers

Because it is intended to be versatile, the RapidIO specification allows for several present and future physical specifications. Here an overview of two of them is given.

I.3.1 Serial RapidIO

Serial RapidIO (SRIO) is defined in Part 6 of [7] as an alternative to the original parallel RapidIO. By using LVDS signalling and 8B/10B encoding (see Appendix L), it is a high-speed serial interface with high transfer speed and greater possible link length than the parallel interface. Each SRIO link is full duplex, providing the same bandwidth upstream and downstream. The signalling frequency of an SRIO link is 1.25, 2.5, 3.125, 5 or 6.25 Gbps; this is the raw signalling rate including the encoding overhead, making the data transmission rate 1, 2, 2.5, 4 or 5 Gbps depending on the operating speed [7]. To further increase bandwidth, SRIO allows up to 16 links to work in parallel for increased throughput, still with the benefits of serial communication. In contrast to parallel RapidIO, there is no additional clock line; each signal carries its own clock, making the individual lines less sensitive to skew and jitter with respect to each other. Instead, the hardware at the receiver side must align the data arriving on the different serial lines. The serial protocol encapsulates the upper transport-layer RapidIO packet in a physical SRIO packet, which adds fields totalling 26 or 42 bits, depending on the total transmission length: if a packet is smaller than 80 bytes, the protocol adds a 16-bit CRC, but if the packet exceeds 80 bytes, a 32-bit CRC is used instead to ensure data integrity. An example of what can be done with SRIO is presented in [33], where researchers create an embedded system with very high throughput, utilising SRIO as the high-speed communication channel; by using an SRIO switch, they gain very high flexibility and extensibility of the system.


[Figure I.2 here. Recoverable fields (widths in bits unless noted): ackID (6), Virtual Channel (1), Critical Flow (1), Priority (2), Rest of Transport Type (2), Destination ID (8 or 16), Source ID (8 or 16), Logical Type (4), Logical Header and Data (4-76 bytes), Data, and two CRC16 fields (16 each).]

Figure I.2. The layout of a serial RapidIO packet of arbitrary size. The pink is the Logical layer, the light gray is the transport layer and the blue is the physical layer. All sizes are in bits unless otherwise specified.

In Figure I.2 the layout of a serial RapidIO packet is drawn. The maximum size of a packet is 276 bytes [7]: 256 bytes of payload, 10 bytes of Logical Layer overhead, four bytes of transport-layer overhead, five bytes of physical overhead and one byte of overhead shared between the layers [7]. Even in this case the total overhead is modest: 20 bytes of overhead in a 276-byte packet gives a link efficiency of 92.7%. In the best case, the IDs are only 8 bits and the message is a streaming write in a 32-bit address space, making the total logical-layer overhead 4 bytes. With the help of Figure I.2, this gives an overhead of 5 bytes physical, 2 bytes transactional, one byte mixed and four bytes logical, for a total of 12 bytes of overhead on a 256-byte payload: a total message size of 268 bytes and an efficiency of 95.5%, which is the maximum utilisation of an SRIO link. One might argue that acknowledgements need to be sent as well, but since the link is full duplex, and assuming data flows in one direction, acknowledgements can be sent as needed on the return channel. Combining these results with the data transfer rates, the maximum data transfer rate over a single SRIO link is 95.5% of a 5 Gbps link, i.e. 4.78 Gbps, or almost 600 MB/s. Even the slowest SRIO link, which operates at one fifth of this speed, still exceeds 100 MB/s of maximum throughput. This may not be achievable in a real-world scenario, though, unless careful system design enables it.
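The efficiency arithmetic above can be reproduced with a few lines of Python; the overhead constants are the byte counts quoted from [7] in this section.

def srio_efficiency(payload, logical, transport, physical, shared=1):
    """Payload fraction of a serial RapidIO packet (all sizes in bytes)."""
    return payload / (payload + logical + transport + physical + shared)

worst = srio_efficiency(256, logical=10, transport=4, physical=5)  # 256/276
best = srio_efficiency(256, logical=4, transport=2, physical=5)    # 256/268
print(f"{worst:.2%}  {best:.2%}")               # ~92.7 %  ~95.5 %
print(f"{best * 5:.2f} Gbps on a 5 Gbps link")  # ~4.78 Gbps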

I.3.2 Parallel RapidIO

Parallel RapidIO was the originally released specification, and is defined in Part 4 of [7]. It uses an eight- or sixteen-bit-wide bus where data is sent on differential LVDS lines alongside a clock and a frame signal.

When the bus grows from 8 to 16 bits, the clock signal is duplicated, so that no more than eight lines share the same clock signal. The specification also defines how flow control should be implemented, and requires the nodes to be able to handle several kinds of communication errors. The upper layers of the architecture are the same as for the serial implementation, and much is similar in terms of packet layout, size and CRCs; the flow-control algorithms for deadlock avoidance are also the same [7]. One main thing to keep in mind when implementing a RapidIO system is the flow control, which is exemplified in [7]: there are different modes which prove better or worse for different applications. The speed and throughput of a parallel RapidIO link are first of all determined by the clock frequency. The specification [7] gives examples from 500 MHz to 1 GHz, while [9] also mentions lower frequencies such as 250 MHz. Since parallel RapidIO samples data on both the rising and the falling clock edge, the raw data rate is twice the clock frequency times the number of lanes; for an 8-bit-wide link at 500 MHz, the raw throughput is 8 Gbps, or 1 GB/s, in each direction. This scales up to a 1 GHz, 16-bit-wide bus with a throughput of 32 Gbps under unidirectional traffic.

Appendix J

USB

USB was the peripheral connection that won most consumers over after the serial (RS-232) and parallel printer ports [104]. It is nowadays a common connector found on almost any PC. The USB standard aims to be a flexible standard supporting numerous peripherals of different characteristics, and it provides several connection possibilities at a wide variety of speeds. The original specification supported line rates of 1.5 and 12 Mbps. This was extended to 480 Mbps (High Speed) in the USB 2.0 specification, and in the latest specification, USB 3.0, the supported line rate is 5 Gbps (SuperSpeed). These are raw bit rates including encoding; in the SuperSpeed case the encoding is 8B/10B (see Appendix L), which gives an effective rate of 4 Gbps, while the lower speeds use an encoding with less overhead [104]. The main focus of this USB description is on High Speed (480 Mbps) and SuperSpeed (5 Gbps), since the aim of this work is to transport large quantities of data.

Transmissions on a USB bus take place between a host and a device [104]. The host initiates all transactions by informing the device that it is its turn to respond. The device must then be ready, since time on the bus is limited and a transaction must be initiated immediately; after completing a transfer, the device must prepare to be ready for the next transfer when its turn comes again. Transfers on the USB bus are organised in time slots called frames or microframes. On high- and super-speed buses, the time slot is 125 µs, and transfers are scheduled within these slots [104]. At the beginning of every microframe, the USB host sends a Start-Of-Frame to all attached units to help them synchronise.

USB allows four kinds of transfers with different characteristics. The first is the Control transfer, which is supported at every speed and is used to initialise connections and, in some cases, to transfer data [104]; for high- and super-speed devices, 20% of the available bandwidth is reserved for this purpose. The second type is the Bulk transfer, which has low overhead but no reserved bandwidth [104]. The low overhead makes Bulk potentially the fastest transfer method over USB, but achieving that bandwidth requires a link with low utilisation from other applications: a bulk transfer will be served eventually, but with no guarantee as to when.

The third and fourth transfer types are Interrupt and Isochronous transfers, which together may be granted the remaining 80% of the transfer time [104]. Interrupt transfers are the solution for devices which send data at random times, whereas isochronous transfers are for constant bandwidth usage. A device may send up to three interrupt transfers per microframe, each with a payload of 1024 B. This equals 1024 B three times per 125 µs, or a theoretical maximum of just above 24 MB/s, which is also the maximum transfer speed for a high-speed USB endpoint in isochronous mode. To deliver more transfer speed, super-speed USB isochronous transfers allow up to 48 transfers per microframe, each with a payload of 1024 B [104]. This gives 16 times the high-speed transfer rate, or around 393 MB/s maximum. However, isochronous transfers have a downside: the lack of error correction.

Traditionally, USB has not allowed peer-to-peer, or host-to-host, communication. This changed slightly in the new USB 3.0 standard, where a host-to-host cable was introduced which enables two hosts, e.g. two PCs, to communicate directly with each other, much like a cross-over LAN cable. Long distances are, however, not feasible with USB, since the cable length is initially limited to 9 ft [104]. This may be extended by the use of hubs, but even then the maximum is 98 ft in USB 2.0 and 49 ft in the more recent USB 3.0. One final note on licensing: to create USB hardware, the manufacturer needs a Vendor ID, which can be obtained from the USB-IF (Implementers Forum) at a cost of $2000 for non-members; for members the fee is waived, but the annual membership fee of $4000 must still be paid [105].
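The isochronous figures above follow directly from the microframe arithmetic, as the small sketch below shows.

MICROFRAME = 125e-6                 # seconds per microframe
PACKET = 1024                       # bytes per isochronous packet

high_speed = 3 * PACKET / MICROFRAME    # 3 packets per microframe
super_speed = 48 * PACKET / MICROFRAME  # 48 packets per microframe
print(high_speed / 1e6, "MB/s")         # 24.576 -> "just above 24 MB/s"
print(super_speed / 1e6, "MB/s")        # 393.216 -> "around 393 MB/s"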

Appendix K

Infiniband

InfiniBand (IB) is a specification maintained by the InfiniBand Trade Association, a consortium with members such as HP, Intel and IBM [106]. IB is an architecture aiming at being a system area network, intended to interconnect processing nodes, peripherals and other equipment into a complete system [107]. The architecture of IB is a switched-fabric network, where everything is interconnected with switches. It allows redundancy in order to provide high quality of service with very low downtime. IB has native support for IPv6 headers [107], which is an advantage now that the Internet is moving to IPv6. IB signals may traverse a x1, x4, x8 or x12 channel, where the number denotes the number of lines; when signalled over copper, they are transmitted using differential signalling [108]. Regardless of whether the signal is sent over copper or optical fibre, it is encoded using 8B/10B encoding. The signalling may be Single, Double or Quad Data Rate [108], carrying 250, 500 or 1000 MB/s of data per lane respectively. This gives a Quad Data Rate, 12-lane IB channel a maximum of 12 GB/s in each direction. The packet structure of an IB signal differs depending on whether it is a local or a global packet. Since this study deals primarily with local communication, a discussion of global packets is out of scope; the only thing worth mentioning is that they contain either an IPv6 header or a global IB header, which has the same structure [107]. Henceforth, only messages sent on the local subnet are considered. The main focus here is on achieving high throughput and high reliability. InfiniBand provides means for this by offering reliable communication with error-detection capabilities, and it also offers RDMA capabilities [107]. This is an advantage, since it removes transport responsibilities from the CPU and moves data straight into memory.


Appendix L

8B/10B Encoding

Herein, a brief summary of the 8B/10B encoding is given, together with the reasons why several protocols use it. In order to obtain a signal which has almost no low-frequency content and is DC free, the encoding technique called 8B/10B encoding was constructed [109]. The purpose of the encoding is to limit the run length of the code, i.e. the number of consecutive ones or zeros. With a bounded run length, it is guaranteed that the signal switches between ones and zeros regularly. The code is built by combining one 3B/4B and one 5B/6B encoding into a total 8B/10B code [109]. The encoded signal also contains enough transitions between low and high for a clock to be recovered from it, thus relieving the transmitter of sending a clock signal alongside the encoded data [7]. The encoding introduces a 25% overhead compared to sending only the data, but it enables some attractive features: the signal can be AC-coupled, the clock travels with the data, and special control characters can be used in the data stream. There is plenty of room for such control characters, since encoding 8 bits into 10 bits uses only 25% of the code space; the remaining code words can be used to maintain disparity (keeping the signal DC-balanced) and to send control characters. Some of the communication techniques that utilise 8B/10B encoding are Gigabit Ethernet in its 1000BASE-X implementation (not the one running over four copper pairs), Serial RapidIO, and PCI Express, which may be read about in Appendix I and Appendix E.
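The run-length property is easy to state in code. The Python helper below measures the longest run of identical bits in a bit string; for a valid 8B/10B stream this never exceeds five. The example input is an arbitrary bit pattern, not an actual 8B/10B codeword.

from itertools import groupby

def max_run_length(bits: str) -> int:
    """Length of the longest run of identical symbols in a bit string."""
    return max(len(list(run)) for _, run in groupby(bits))

print(max_run_length("1100000101"))   # -> 5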


Appendix M

Case Study: The ATLAS TDAQ System

ATLAS is the name of a detector at the Large Hadron Collider at CERN in Switzerland [110]. The detector is used to study sub-atomic physics and particles. The amount of data generated by the detector is around 60 TB/s [30], a very large amount of data. This data is filtered, through several computation and selection stages, down to a total of 300 MB/s that is stored on disk and considered valuable data for research.

At the beginning of the DAQ-system project, the technology of choice was Fast Ethernet [111] for most connections, with GbE switches at the centre of the network to reduce cabling. Redundancy was built into this system, which consisted of four central switches, two for each type of application (see Figure M.1). This way, one central switch could go down without a total interruption of data gathering. Later, as GbE became cheaper and a commodity, an upgrade of the system was planned [31]. The choice of standard link technology was changed from Fast Ethernet to Gigabit Ethernet due to the increased availability, the performance-over-cost ratio, and the belief that Ethernet products would not become obsolete in the near future [112]. However, due to strict real-time performance requirements, the components used had to be of very high quality to meet the demands of the project. This forced the creation of a testbed in which to evaluate components and figure out which ones were best suited for the task; several switches were investigated for their performance before the selection was made.

The final implementation of the DAQ system uses different kinds of switches depending on how critical they are [30]. The full-scale system comprises over 3000 high-end processing computers in a well-thought-out topology with approximately 200 Ethernet switches [113]. The main concern when creating the network was reliability, due to the enormous number of packets sent every second. To achieve faster switching, Layer 2 switches were used in the high-performance nets [30], whereas Layer 3 switches with static IP switching were used in the less demanding parts of the network. Initially, the ATLAS TDAQ system was planned to use only GbE as the signalling Ethernet standard.


Figure M.1. The concept of the original ATLAS network. It is split into two separate sub-networks, where one computes application one and the other computes application two.

This had to be reconsidered due to the massive amount of data (over 90 Gbps) [31] and the fact that six units had to share one Gigabit Ethernet connection. Each cabinet held 30 units, which meant that the bandwidth out of a cabinet needed to be 5 Gbps. To achieve this bandwidth, one cannot simply add five GbE cables, because that would create rings in the network, which is not allowed. Three solutions are described [31]: trunking, using VLANs, and upgrading to 10GbE.

Trunking uses the 802.3ad standard [31] and is slightly unreliable since the standard does not define how the load is to be balanced, only that frames must arrive in the correct order. When using this, the authors of [31] experienced a very high packet-loss rate of approximately 50%, i.e. every second packet was lost. The second solution they tested [31] was using VLANs, which in practice means that one physical switch is split into several logical switches. The experiments showed that the loss rate was low and the throughput good. However, since there is no load balancing between the uplinks, this gives a less than optimal solution if the load between the logical switches is unbalanced, with some uplinks saturated while others are close to idle. The third option was to use 10GbE uplinks. This option has no real impact on cost, since the price per byte sent is close to the same [31] for a GbE card and a 10GbE card. Furthermore, standard off-the-shelf switches nowadays often contain a few 10GbE ports. A third reason was that two cabinets could be connected together, giving full utilisation of the 10 Gbps uplink instead of just the 5 Gbps that each cabinet provides.

Figure M.2. A later ATLAS DAQ network. It is split into two VLANs so that data does not go from data generation through the right half and end up in the left half. This is done to shorten the path travelled by packets.

Their choice fell on the 10GbE links, because they had fewer drawbacks than any of the other techniques. In all the dimensioning of the network, the Ethernet links were dimensioned for an average utilisation of 60% [31], due to the reduced risk of packet losses [111]. When upgrading the network [31], another approach to the central network was also chosen, replacing the four central switches specified in [111]. This time only two central switches were used (see Figure M.2), with each central switch serving both types of applications [31]. This way, one of the central switches can still go down without a total disruption of operation. This approach was chosen after tests indicated that there was no difference in placing all applications of the same type behind the same switch; they then went for the solution with the greatest redundancy, which was splitting the network into two separate parts. The network needed to be divided in two to prevent loops, and furthermore to stop packets from travelling unnecessarily long paths, e.g. Data generation -> Central switch 1 -> Control network -> Central switch 2 -> Application 1 or 2 in VLAN 2. The Control Network seen in the figure is a network that needs to work on both VLANs in order to send commands to all application-1 and application-2 type computers.

The Ethernet technology used in the ATLAS system evolved from being only Gigabit Ethernet [112] to also including 10 Gigabit Ethernet links between essential parts of the network, especially between the central nodes, for redundancy. One problem with redundancy in Ethernet is the lack of support for it in the standard specification. However, this may be solved by using VLANs [30] and letting each separate connection path traverse a different VLAN. This can further be exploited with the MST protocol, which without VLANs would allow only one active path per target address, with all other paths kept dormant. As the physical medium, the 1000BASE-T standard over copper wires was used. This limits the cable length to 100 m [80]; to cover longer distances, additional switching was inserted between nodes so that each hop stays below 100 m [30].

M.1 The Communication Protocols in ATLAS

The choice of communication protocols in the ATLAS project was crucial for performance, and several different means of communication are therefore used throughout the system. One thing to keep in mind is that ATLAS already runs on an Ethernet infrastructure, and hence the most commonly used protocols for Ethernet networks were evaluated.

The TCP/IP protocol suite is the backbone of the Internet, containing the well-known protocols TCP/IP and UDP/IP. Both of these were evaluated for use in the ATLAS TDAQ [32, 114, 115], along with raw Ethernet frames. TCP/IP has the benefit of guaranteeing the delivery of each packet, in the correct order. However, the acknowledgement mechanism may be ill-suited to a real-time system such as the ATLAS TDAQ. This is due to the long timeout before a packet is retransmitted, which is above 35 ms [32] and, according to [115], as high as 100 ms. This makes the system rather unresponsive, especially since the application-layer protocol of the system is a request-respond protocol [32] which already has a shorter timeout [115] than the TCP/IP timeout. The second drawback of TCP/IP is its use of acknowledgements to confirm reception. These ACKs may piggyback on other packets [32], but if no other outgoing data is leaving, a specific packet has to be created to deliver the ACK. Thus, when sending responses, an ACK may piggyback on the response data, but if it cannot, a separate ACK packet has to be generated and sent; furthermore, another ACK packet must be sent back to acknowledge that the data was received correctly. These ACKs add up to many packets that take up space on the Ethernet links without carrying payload [32, 114].

Another option is to use UDP/IP packets. These packets are not guaranteed to reach their destination, as TCP packets are. However, as the application-layer protocol already guarantees that packets arrive, duplicating this feature in the transport layer is considered unnecessary [32]. Furthermore, the studies show that only one in 10^9 UDP packets is lost in transmission [114], making the handshake


ACK protocol even more superfluous.

Finally, some details about the use of raw Ethernet packets, or frames. Transmission of data over Ethernet is always carried out in frames. The difficulty is that frames are harder to send from an application, since there is no higher-level encapsulation [115]; Linux does, however, provide means to do this [115] (a minimal example is sketched at the end of this section). The great benefit is the very low overhead, since no higher-level protocol is used. It also removes the parts of the IP header that serve no function within a single network, where MAC addresses are sufficient on the LAN and no IP addresses are needed. This makes it possible to send useful data in the entire MTU of the frame. On Ethernet this is limited to 1518 bytes, but with Jumbo frames the MTU may be extended [115]; a common Jumbo frame size is 9000 bytes, although this is vendor specific since no standard exists. It can be seen in [115] that the CPU load rises much more slowly when using raw Ethernet frames than with higher-level protocols, indicating that higher speeds might be achieved.

Furthermore, some experiments [116] were done during the development of ATLAS using a framework for inter-process communication called CORBA, the Common Object Request Broker Architecture [117]. One reason why CORBA was evaluated is that it is very language independent: it runs under several operating systems, and code can be written in many different languages [116]. The study found that the communication time increased linearly as devices were added to the network, and that using CORBA imposed a latency around 30% higher than using plain TCP.

Finally, a word about the recommended data transmission for the ATLAS DAQ. The high-bandwidth data links doing real-time data processing were recommended to use either raw Ethernet frames or UDP, since the application-layer protocol provides the service guarantee [32, 114]. For control signals and other infrequent messages, however, TCP is considered a good choice, since its ACKs do not pollute links that carry so little traffic anyhow.
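As referred to above, Linux can send raw Ethernet frames directly from an application through AF_PACKET sockets. The following minimal Python sketch illustrates the mechanism; it is Linux-specific, requires root privileges, and the interface name, MAC addresses and EtherType are example values only (0x88B5 is an EtherType set aside for local experiments).

import socket

IFACE = "eth0"                             # example interface name
DST = bytes.fromhex("ffffffffffff")        # broadcast, for illustration
SRC = bytes.fromhex("001122334455")        # hypothetical source MAC
ETHERTYPE = (0x88B5).to_bytes(2, "big")    # local experimental EtherType

# SOCK_RAW on AF_PACKET lets the application build the whole frame itself.
s = socket.socket(socket.AF_PACKET, socket.SOCK_RAW)
s.bind((IFACE, 0))
payload = b"\x00" * 46                     # pad to the 60-byte minimum frame
s.send(DST + SRC + ETHERTYPE + payload)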

M.2 The Physical Interconnects and Software of ATLAS TDAQ

The TDAQ design uses commodity products to a large extent, and very little is custom made [114]. Only the first detector stages are custom-made hardware; after that, commodity computers do the rest of the processing. From the sensors to the receiving computers, optical fibres carry the data at rates around 160 MB/s (1280 Mbps) [118]. Each receiving computer is equipped with custom-designed cards to receive this data, and each computer can simultaneously take in up to twelve optical channels, for a total input of 1920 MB/s (15 Gbps). The data is then filtered so that only the most interesting objects are sent further down the network.

Each of these receiving computers is also equipped with GbE ports, so that it can send the data down the network for further processing and, finally, storage.
