Towards a Programmable Dataplane
TOWARDS A PROGRAMMABLE DATAPLANE

A Dissertation Presented to the Faculty of the Graduate School of Cornell University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

by Han Wang

May 2017

© 2017 Han Wang
ALL RIGHTS RESERVED

TOWARDS A PROGRAMMABLE DATAPLANE
Han Wang, Ph.D.
Cornell University 2017

Programmable network dataplanes can significantly improve the flexibility and functionality of computer networks. This dissertation investigates two building blocks of network dataplane programming for network devices: the packet processing pipeline and the network device interface.

In the first part of the dissertation, we show that packet processing pipelines designed in hardware can be both fast and flexible (programmable). We present P4FPGA, a network dataplane compiler and runtime that generates a custom FPGA dataplane from a program written in P4 (programming protocol-independent packet processors), a dataplane programming language. P4FPGA generates designs that can be synthesized to either Xilinx or Altera FPGAs. We have benchmarked several representative P4 programs, and the experiments show that code generated by P4FPGA runs at line rate at all packet sizes, with latencies comparable to commercial ASICs.

In the second part of the dissertation, we present a programmable network interface for the network dataplane. We show that a software programmable physical layer (programmable PHY) can capture and control the timing of physical layer bits with sub-nanosecond precision, greatly increasing precision in network measurements. The benefits of a programmable PHY are demonstrated with an available bandwidth estimation algorithm and a decentralized clock synchronization protocol that provides bounded precision, where no two clocks differ by more than tens of nanoseconds.

BIOGRAPHICAL SKETCH

Han Wang attended the University of Auckland from 2003 to 2007 for his undergraduate studies in Computer Systems Engineering.
Han spent a few months working as an embedded system software engineer before starting his PhD program in Electrical and Computer Engineering at Cornell University in 2008. He spent the first two years working with Professor Francois Guimbretiere on low power embedded system design before joining Hakim Weatherspoon’s group to focus on systems and networking research. In 2013 and 2014, Han spent two summers at Nicira/VMware working on Software Defined Networking (SDN) controller and dataplane projects. In 2014 and 2015, Han helped build the initial prototypes that form the basis of Waltz Networks Inc with Kevin Tang and Nithin Michael. In 2016, Han accepted a position at Barefoot Networks Inc to pursue what he believed to be the next generation of programmable switch ASICs and compiler technology.

for my family

ACKNOWLEDGEMENTS

First, I would like to express my deepest gratitude to my advisor, Hakim Weatherspoon, who has profoundly influenced my life at Cornell. I have been extremely lucky to have an advisor who put enormous trust and confidence in my ability, who offered unparalleled guidance for me to become an independent researcher, and who provided complete freedom for me to pursue exciting ideas. I could not ask for more from an advisor.

My work wouldn’t be possible without my wonderful collaborators. I would like to thank Ki Suh Lee for his extraordinary ability in system programming and his meticulous attention to detail. Without him, the SoNIC project would never have been possible. Hyun Tu Dang and Robert Soule are the co-authors of the P4Paxos project. Thanks to Tu for taking the leap of faith to use my P4 compiler. Thanks to Robert for being a great collaborator and a great advisor. His help with my writing and presentation skills has been invaluable in the late stage of my PhD career. Jamey Hicks introduced me to Bluespec and Connectal, which have been enormously helpful for the development of P4FPGA.
Thanks to Jamey for his unreserved help and guidance during my visit to MIT and after I returned to Cornell.

I would like to thank Hakim Weatherspoon, Emin Gun Sirer, and Rajit Manohar for serving on my thesis committee and giving me valuable feedback on this work.

Kevin A. Tang, Nithin Michael and Ning Wu have been another source of inspiration for my PhD career. Thanks to Kevin and Nithin for giving me the opportunity to participate in their early endeavors at Waltz Networks Inc. The experience of working with an early stage start-up has been an invaluable lesson.

I would like to thank Alan Shieh and Mukesh Hira for taking me on as an intern in the VMware networking group in Palo Alto. Thanks to Baris Kasikci and Jeff Rasley for being wonderful friends and teammates at VMware. I enjoyed the brainstorming and lunch discussions.

I am grateful to many friends at Cornell: Songming Peng, Yuerui Lu, Han Zhang, Pu Zhang, Xi Yan, Ryan Lau, Weiwei Wang, Jiajie Yu, Jiahe Li, Qi Huang, Zhiming Shen, Weijia Song, Haoyan Geng, Erluo Li, Zhefu Jiang, Qin Jia, and too many others to mention. I cannot forget all the parties, game nights, and midnight conversations we have had. You made my life in Ithaca much more fun.

Last, I would like to thank my family. I thank my parents, dad Xijin Wang and mom Qiaofeng Li, for their enduring love, support and belief, without which I would not have finished this journey. I thank my wife Fan Zhang for her love and faith in me through the ups and downs of my PhD career. It is the optimism and happiness from her that drove me through the long journey. I dedicate this dissertation to them all.

TABLE OF CONTENTS

Biographical Sketch
Dedication
Acknowledgements
Table of Contents
List of Tables
List of Figures

1 Introduction
  1.1 Network Dataplane Programming
  1.2 Problems with Existing Network Dataplanes
  1.3 The Rise of Programmable Network Dataplanes
  1.4 Challenges in the Programmable Dataplane Design
    1.4.1 Lack of Programmable Network Dataplanes: Packet Processing Pipeline
    1.4.2 Lack of Programmable Network Dataplanes: Network Interface
  1.5 Contributions Towards a Programmable Network Dataplane
  1.6 Organization

2 Scope and Methodology
  2.1 Scope: Understanding the Network Dataplane
    2.1.1 Dataplane Programming
    2.1.2 Network Protocol Stack
    2.1.3 Fast and Flexible Packet Processors
    2.1.4 Precise Packet Timing Control with a Programmable PHY
  2.2 Methodology
    2.2.1 Systems
    2.2.2 Evaluation
  2.3 Summary

3 Towards a Programmable Network Dataplane: P4FPGA and Programmable Packet Processing Pipelines
  3.1 Background
  3.2 Design
    3.2.1 Programmable Pipeline
    3.2.2 Fixed-Function Runtime
    3.2.3 Control Plane API
    3.2.4 Extension
  3.3 Implementation
    3.3.1 Optimization
    3.3.2 Prototype
  3.4 Evaluation
    3.4.1 Case Studies
    3.4.2 Microbenchmarks
  3.5 Application: Hardware Accelerated Consensus Protocol
    3.5.1 Background
    3.5.2 Design
    3.5.3 Discussion
  3.6 Summary

4 Towards a Programmable Network Dataplane: SoNIC and Programmable PHYs
  4.1 Design
    4.1.1 Access to the PHY in software
    4.1.2 Realtime Capability
    4.1.3 Scalability and Efficiency
    4.1.4 Precision
    4.1.5 User Interface
    4.1.6 Discussion
  4.2 Implementation
    4.2.1 Software Optimizations
    4.2.2 Hardware Optimizations
  4.3 Evaluation
    4.3.1 Packet Generator
    4.3.2 Packet Capturer
    4.3.3 Profiler
  4.4 Application: Measuring Available Bandwidth
    4.4.1 Background
    4.4.2 Design
    4.4.3 Implementation
    4.4.4 Evaluation
  4.5 Application: Precise Clock Synchronization
    4.5.1 Background
    4.5.2 Design
    4.5.3 Implementation
    4.5.4 Evaluation
  4.6 Summary

5 Related Work
  5.1 Programming Network Elements
    5.1.1 Hardware
    5.1.2 Software
    5.1.3 Language
  5.2 Network Applications
    5.2.1 Consensus Protocol
    5.2.2 Timestamping
    5.2.3 Bandwidth Estimation
    5.2.4 Clock Synchronization

6 Future Direction
  6.1 Rack-scale Computing
  6.2 Scaling to 100G
  6.3 Packet Scheduling
  6.4 Distributed and Responsive Network Control Plane

7 Conclusions

A Network Concepts
  A.1 Networking Basic Terminology
  A.2 Network Layering Model
  A.3 Packet Encapsulation
  A.4 Switch and Router

B IEEE 802.3 Standard

C Language and Frameworks
  C.1 Bluespec System Verilog
  C.2 Connectal Framework
    C.2.1 Top level structure of Connectal applications
    C.2.2 Development Cycles

D FPGA Implementation
  D.1 P4FPGA Hardware Implementation
    D.1.1 NetFPGA SUME
    D.1.2 Hardware Software Interface
    D.1.3 Packet Processing Pipeline Templates
    D.1.4 Code Generation