High Level Synthesis Tool for High Speed Packet Processing

Y.Wang & V.S.Pakki Y.Wang High Level Synthesis for Tool Master’s Thesis High Level Synthesis Tool for High Speed Packet Processing High Speed Packet Processing SpeedHigh Packet Yian Wang Venkata Soumya Pakki Series of Master’s theses Department of Electrical and Information Technology LU/LTH-EIT 2014-417 Department of Electrical and Information Technology, http://www.eit.lth.se Faculty of Engineering, LTH, Lund University, December 2013. High Level Synthesis Tool for High Speed Packet Processing by Yian Wang and Venkata Soumya Pakki Master’s of Science in System on Chip Department of Electrical and Information Technology Lund University Dec 2013 Abstract The main objective of this Master Thesis is to design and implement a high level synthesis tool for high speed packet processing. For a given network packet, determining the destination and performing the required alterations to the packet are the key parts of Packet Processing. The idea is to provide customers a customized Ethernet switch which is reliable and flexible. As a requirement for this, a high level packet processing language (PPL) is designed instead of any hardware descriptive language because of the regularity of packet processing. The packet processing is described in a powerful way based on the PPL. In this thesis, a design of Ethernet switch based on the PPL is proposed. Hardware implementation is done for the design and MyHDL is used as the hardware description language. Using Python, the compiled PPL program is translated into an hardware model. A tool has been developed which consists of a hardware generator and certain hardware infrastructures. Another part in the thesis is optimization of the initial design. For instance, optimization is done to run as much code as possible in parallel or for removal of unused hardware in the generated switch. Verification is done and synthesis results have been listed comparing the two designs. Hence, we conclude that the initial design is more flexible and has more redundancy while the optimized design is more friendly to hardware cost and power consumption. iii Acknowledgements We initially thank Packet Architects AB for giving us an opportunity to work on this Thesis. We are grateful to our supervisor Robert Wikander, CTO who gave us the opportunity to do this thesis and for encouraging us during the complete duration of the project. His ideas eased a lot to finish the project work. Our sincere gratitude to Examiner at LTH, Viktor Owall¨ for his patience, constant support, motivation. Without his several feedbacks on our report, it would not have been done at this quality. Finally, we would like to thank our family and friends for their encouragement. v Contents Abstract iii Acknowledgements v List of Tables ix List of Figures xi List of Acronyms xiii 1 Introduction 1 1.1Background................................ 1 1.2 Scope of the Thesis Project ....................... 6 1.2.1 Target of the Thesis Project ................... 7 1.2.2 OrganizationoftheThesisReport................ 8 2 Ethernet Switch 11 2.1Overview.................................. 11 2.2EthernetFrameStandard........................ 14 2.3Layer2switchsubsystems........................ 15 vii 2.4 Packet Forwarding Algorithm ...................... 19 3 Packet Processing Implementation 21 3.1HardwareInfrastructure......................... 23 3.1.1 Packet Decoding ......................... 23 3.1.2 Packet Modification ........................ 27 3.1.3 Packet Processing Interface ................... 28 3.1.4 ConfigurableMemoryInterface................. 31 3.2 Packet Forwarding Pipeline ....................... 33 3.2.1 InstructionSet.......................... 35 3.2.2 RTLGenerator.......................... 41 4 Implementation Results 49 4.1HardwareInfrastructureSchematic................... 50 4.2TestingEnvironment........................... 56 4.3SynthesisResult.............................. 58 4.4 Implementation Analysis ......................... 61 5 Packet Processing Optimization 65 5.1IntermediateStage............................ 65 5.2 Infrastructure around Packet Forwarding ................ 68 5.3 Optimized Implementation Result .................... 72 6 Conclusions and Future work 75 Bibliography 77 viii List of Tables 1.1OSImodel................................. 2 2.1TypicalEthernetFramestructure.................... 14 2.2 802.3 Ethernet Header .......................... 15 2.3EthernetHeaderwithVLANTag.................... 15 3.1VariableFileManagementUnit..................... 43 3.2 Components Created by RTL Builder .................. 47 4.1Packetdecoderports........................... 50 4.2Fieldmemoryports............................ 52 4.3 Modification memory ports ........................ 55 4.4 Packet modifier ports ........................... 57 4.5Maximumfrequencyfortwoconfigurations............... 59 4.6 Power summary for two configurations ................. 60 4.7On-chipcomponentspowerreportfora2-portLayer2switch.... 61 4.8On-chipcomponentspowerreportfora6-portLayer2switch.... 62 5.1 Timing Comparison ............................ 72 5.2 Device Utilization Comparison ...................... 73 ix 5.3PowerComparison............................ 73 x List of Figures 1.1 Example of data transmission through OSI model ........... 3 1.2 OSI Model VS IEEE 802.3 ........................ 4 1.3 Projected Share of Network Ports - 2015 ................ 5 2.1 Example of a Layer 2 Switch ....................... 12 2.2 Example of a Layer 2 Switch with two VLANs ............. 13 2.3ABirdEyeviewofthedataflow.................... 16 2.4 Block Diagram of Packet Processing Subsystem ............ 18 2.5 Flowchart of Proposed Packet Forwarding Algorithm ......... 20 3.1FrameworkofFlexswitch......................... 22 3.2 Design division of Packet Processing Subsystem ............ 23 3.3 Block diagram of Packet Decoding ................... 24 3.4 Flowchart of reading certain bits from one packet frame ........ 25 3.5Blockdiagramofpacketreadstage................... 26 3.6FSMofpacketreadstagewithsingleport............... 27 3.7 Block diagram of Packet Modification .................. 28 3.8PacketTransmissionInterface...................... 29 xi 3.9 Interface of Field Memory and Modifier Memory ............ 30 3.10InterfaceBetweenTablesandPFP................... 32 3.11OverallViewofHardwareInfrastructures................ 34 3.12 Structure of packet forwarding pipeline ................. 35 3.13 Example of one packet forwarding pipeline implementation ...... 35 3.14 Example of creation and deletion of variables .............. 36 3.15 Example of Branch in Packet Forwarding ................ 39 3.16StepsofAutoRTLGeneration...................... 41 4.1SchematicofPacketDecoder....................... 51 4.2SchematicofFieldMemory....................... 53 4.3 Schematic of Modification Memory ................... 54 4.4 Schematic of Packet Modifier ...................... 56 4.5TestingConnectionsonBoard...................... 58 5.1 Example of a GraphViz dot file ..................... 67 5.2 Example of a Graphviz DFG . ...................... 69 5.3MovetheSPtotheheadofthePFP.................. 70 5.4 Simplified Packet Forwarding Pipeline Structure ............ 71 5.5 Simplified Packet Processing Subsystem ................ 72 xii Acronyms ARP Address Resolution Protocol. 12 AXI4 Advanced Extensible Interface 4. 57 DEI Drop Eligible Indicator. 15 DFG Data Flow Graph. 66, 67, 69, 72 FCS Frame Check Sequence. 14 FIFO First In First Out. 22, 28, 29, 31, 52, 55, 58, 62, 63, 70, 77 FM Field Memory. 23, 24, 26, 27, 29, 31, 51, 52 FPGA Field Programmable Gate Array. 7, 8, 56, 57, 59 FSM Finite State Machine. 26, 40 IEEE Institute of Electrical and Electronics Engineers. 2, 5, 14–16, 71 IFG Inter Frame Gap. 16 IP Internet Protocol. 13 LAN Local Area Network. 2–4, 12, 14 LLC Logical Link Control. 2 MAC Media Access Control. 2, 4, 11, 12, 14–20, 56, 57, 62, 71 MM Modification Memory. 27–29, 31, 51, 52, 55 OSI Open Systems Interconnection. 1, 2, 7, 11 PCIe Peripheral Component Interconnect Express. 57, 77 PCP Priority Code Point. 15 xiii PCS Physical Coding Sublayer. 56 PD Packet Decoder. 22, 23, 30, 40, 50, 61, 63, 64, 70, 71 PFP Packet Forwarding Pipeline. 22–31, 33, 34, 36–41, 44, 46, 47, 51, 52, 61, 62, 69–72 PHY Physical Layer. 1, 11, 56 PM Packet Modifier. 22, 27, 29, 31, 51, 55, 61–63 PMA Physical Medium Attachment. 56 PPL Packet Processing Language. 7, 8, 21, 22, 34, 37, 38, 67, 68 PS Parallel-to-Serial Converter. 17 RAM Random Access Memory. 31, 51, 52, 60 RR Round Robin. 52 RTL Register Transfer Level. 6–8, 21, 22, 34, 35, 38, 41–47, 49, 62, 63, 66–68, 72 SFP+ Enhanced Small Form-factor Pluggable. 56 SP Serial-to-Parallel Converter. 17, 18, 29, 71, 76 TPID Tag Protocol Identifier. 15 VF Variable File. 43–45 VID VLAN Identifier. 15, 19 VIO Virtual Input/Output. 57 VLAN Virtual LAN. 12–15, 17–20 WAN Wide Area Network. 13, 14 XGMII Ten Gigabit Media Independent Interface. 56 xiv Chapter 1 Introduction 1.1 Background It has been 40 years since the invention of Ethernet. This technology was born at Xe- rox PARC in 1973 which initially aimed at solving communication problems between personal computers and peripherals. In the past 40 years an explosive development has occurred and the speed of Ethernet has improved by one order of magnitude every 10 years, which gives us reason to believe that Ethernet will keep this growing trend in the future. In December of 1982, the advent of IEEE 802.3 remarked the take-off of Ethernet standards. IEEE 802.3 provides the technical standards in order to use Ethernet in the

High Level Synthesis Tool for High Speed Packet Processing

Packet Processing at Wire Speed Using Network Processors

Packet Processing Execution Engine (PROX) - Performance Characterization for NFVI User Guide

Lightweight Internet Protocol Stack Ideal Choice for High-Performance HFT and Telecom Packet Processing Applications Lightweight Internet Protocol Stack

High Level Synthesis for Packet Processing Pipelines Cristian

Evaluating the Power of Flexible Packet Processing for Network Resource Allocation Naveen Kr

An Architecture for High-Speed Packet Switched Networks (Thesis)

ICMP for Ipv6

Packet Processing with a Noc-Enhanced FPGA

Pathminer Powered Predictable Packet Processing

Impressive Packet Processing Performance Enables Greater Workload Consolidation Single Intel® Xeon® Processor Platform Achieves Over 80 Mpps Throughput1

High Performance Packet Processing with Flexnic

IMPLEMENTING VOICE OVER IP Copyright 6 2003 by John Wiley & Sons, Inc