
Doctoral Dissertation
Doctoral Program in Computer and Control Engineering (32nd cycle)

Rethinking Software Network Planes in the Era of Microservices

Sebastiano Miano

Supervisor Prof. Fulvio Risso

Doctoral examination committee:
Prof. Antonio Barbalace, Referee, University of Edinburgh (UK)
Prof. Costin Raiciu, Referee, Universitatea Politehnica Bucuresti (RO)
Prof. Giuseppe Bianchi, University of Rome “Tor Vergata” (IT)
Prof. Marco Chiesa, KTH Royal Institute of Technology (SE)
Prof. Riccardo Sisto, Polytechnic University of Turin (IT)

Politecnico di Torino
2020

This thesis is licensed under a Creative Commons License, Attribution – NonCommercial – NoDerivative Works 4.0 International: see www.creativecommons.org. The text may be reproduced for non-commercial purposes, provided that credit is given to the original author.

I hereby declare that the contents and organisation of this dissertation constitute my own original work and do not compromise in any way the rights of third parties, including those relating to the security of personal data.

Sebastiano Miano
Turin, 2020

Summary

With the advent of Software Defined Networking (SDN) and Network Functions Virtualization (NFV), software started playing a crucial role in computer network architectures, with the end-hosts representing natural enforcement points for core network functionalities that go beyond simple switching and routing. Recently, there has been a definite shift in the paradigms used to develop and deploy server applications in favor of microservices, which has also brought a visible change in the type and requirements of the network functionalities deployed across the data center. Network applications should be able to continuously adapt to the runtime behavior of cloud-native applications, which might regularly change or be re-scheduled by an orchestrator, and to easily interact with existing “native” applications by leveraging kernel functionalities - all of this without sacrificing performance or flexibility. In this dissertation, we explore the design space of software packet processing applications within the new “cloud-native” era, and we propose a novel paradigm to design, run, and manage software network functions that follows the same approach as microservices. We present Polycube, a software framework that enables the creation of efficient, modular, and dynamically reconfigurable in-kernel networking components, available with vanilla Linux. Polycube exploits the extended Berkeley Packet Filter (eBPF) framework to execute the data plane of those network functions and introduces a set of additional components and common APIs that make it easier to develop and manage those services. We design and evaluate the use of this paradigm through bpf-iptables, a clone of iptables characterized by improved performance and scalability. Then, we explore the possibility of enhancing the capabilities of end-hosts through the use of programmable network interface cards (SmartNICs) to partially (or fully) offload existing packet processing applications, in particular in the domain of DDoS mitigation. In the last part of the dissertation, we present Kecleon, a compiler framework that can be used to dynamically optimize generic software data planes, taking into account the runtime characteristics and packet processing behavior of the original network function. We believe that the combination of these works can lay the foundation for a new model of packet processing applications that is better suited for modern cloud environments, having the capability to be dynamically re-combined, re-generated, and re-optimized without sacrificing programmability, extensibility and performance.


Acknowledgements

This Ph.D. has totally changed my personality and way of thinking, thanks to all the wonderful (and, sometimes, bad) experiences that I had and the amazing people that I met and worked with. It was a journey that I will always remember and something that I would definitely start again if I could. Many people have assisted me during this journey, and I would like to express my gratitude to all of them for helping me become a researcher and the person that I am today.

First of all, I would like to thank my advisor, Prof. Fulvio Risso, for all the lessons that I learned from him, for his patience, his immense love for the field, and his capacity to steer me in the right direction when it was necessary. I will always be grateful to him. My colleagues and friends Matteo Bertrone and Mauricio Vasquez (the initial Polycube team), with whom I have shared all the years of my Ph.D. and who helped me refine my research and provided invaluable help when I needed it most. It is also thanks to them that I was able to reach a certain level of maturity and robustness in the works presented in this thesis.

All the people and friends that I have met in Cambridge. Prof. Andrew Moore for giving me the chance to have this wonderful experience, and Christos, Hilda, Salvador, Yuta, and Marcin for all the interesting discussions that we had and for helping me (a poor Sicilian guy) survive in the UK during the entire summer. A huge thanks goes to Prof. Gianni Antichi, for having welcomed me in Cambridge and for teaching me a lot in the last year. He became a real role model for me and a great friend.

A great thanks to my parents, who gave me the chance to do a Ph.D., brought me up to love school and science, and always supported me during the entire Ph.D. A special thanks to my sister, Debora, for all the suggestions and support that she gave me and for always being present when needed. She was the first person that I always called when, in panic, I was looking for advice. Last but not least, my biggest thanks to Valentina. Although sometimes I neglected her to focus on the Ph.D., she has always been by my side and supported me in every choice I have made, even if this sometimes took me away from her.

Contents

List of Tables ix

List of Figures x

1 Introduction 1
  1.1 Summary of Contributions 5
  1.2 Outline 8
  1.3 Research Projects Not Included in This Dissertation 8

2 Background and Motivations 10
  2.1 Userspace vs. Kernelspace networking 10
  2.2 The extended Berkeley Packet Filter (eBPF) 12
    2.2.1 eBPF for Network Functions 14

3 Creating Network Service with eBPF: Experience and Lessons Learned 16
  3.1 Introduction 16
  3.2 Experiences and Insights 17
    3.2.1 eBPF limitations 17
    3.2.2 Enabling more aggressive service optimization 21
    3.2.3 Data structures 22
    3.2.4 High performance processing with XDP 24
    3.2.5 Service function chaining 26
  3.3 Experimental Evaluation 26
    3.3.1 Test environment and evaluation metrics 26
    3.3.2 Overcoming eBPF limitations 27
    3.3.3 Enabling more aggressive service optimization 29
    3.3.4 High performance processing with XDP 31
    3.3.5 Service function chaining 31
  3.4 Conclusions 32

4 Polycube: A Framework for Flexible and Efficient In-Kernel Network Services 34
  4.1 Introduction 34
  4.2 Design Goals and Challenges 36
  4.3 Architecture Overview 37
    4.3.1 Unified Point of Control 37
    4.3.2 Structure of Polycube services 37
    4.3.3 Remote vs Local services 40
  4.4 APIs and Abstractions 40
    4.4.1 Transparent port handling 41
    4.4.2 Fast-slow path interaction 42
    4.4.3 Debug mechanism 44
    4.4.4 abstractions 44
    4.4.5 Transparent Support for Multiple Hook Points 44
    4.4.6 Transparent Services 45
  4.5 Service Chaining Design 46
  4.6 Management and Control Plane 49
    4.6.1 Model-driven service abstraction 49
  4.7 Implementation 51
    4.7.1 Polycube Core 51
    4.7.2 Polycube Services 53
  4.8 Evaluation 57
    4.8.1 Setup 57
    4.8.2 Test Applications 58
    4.8.3 Framework Overheads 64
    4.8.4 Polycube vs Userspace Frameworks 65
  4.9 Conclusions 66

5 Accelerating Linux Security with eBPF iptables 67
  5.1 Introduction 67
  5.2 Design Challenges and Assumptions 69
    5.2.1 Guaranteeing filtering semantic 69
    5.2.2 Efficient classification algorithm in eBPF 70
    5.2.3 Support for stateful filters (conntrack) 71
    5.2.4 Working with upstream Linux kernel 71
  5.3 Overall Architecture 71
  5.4 Data plane 73
    5.4.1 Header Parser 73
    5.4.2 Chain Selector 73
    5.4.3 Matching algorithm 74
    5.4.4 Classification Pipeline 74
    5.4.5 Connection Tracking 79

  5.5 Control plane 83
  5.6 Evaluation 85
    5.6.1 Test environment 85
    5.6.2 System benchmarking 86
    5.6.3 Realistic Scenarios 91
    5.6.4 Microbenchmarks 95
  5.7 Additional Discussion 97
  5.8 Conclusions 98

6 Introducing SmartNICs in Server-based Data Plane Processing: the DDoS Mitigation Use Case 100
  6.1 Introduction 100
  6.2 Background 101
    6.2.1 SmartNICs 101
    6.2.2 TC Flower 102
  6.3 DDoS Mitigation: Approaches 102
  6.4 Architecture and Implementation 105
    6.4.1 Mitigation 105
    6.4.2 Feature extraction 107
    6.4.3 Detection 109
    6.4.4 Rate Monitor 109
  6.5 Performance evaluation 109
    6.5.1 Test environment 110
    6.5.2 Mitigation performance 110
    6.5.3 Effect on legitimate traffic 113
  6.6 Related work 114
  6.7 Conclusions 115

7 Kecleon: A Dynamic Compiler and Optimizer for Software Network Data Planes 116
  7.1 Introduction 116
  7.2 The Case for Dynamic Network Function Optimizations 118
  7.3 Kecleon System Design 122
    7.3.1 Design Goal 122
    7.3.2 Design Challenges and Assumption 122
    7.3.3 Design Overview 123
  7.4 Kecleon Compilation Pipeline 125
    7.4.1 Packet Processing Logic Identification 125
    7.4.2 Runtime Statistics and Data Collection 128
    7.4.3 Kecleon Data Path Optimizations 130
    7.4.4 Kecleon Pipeline Update 134
  7.5 Prototype Implementation 135

    7.5.1 eBPF Plugin 135
  7.6 Evaluation 135
    7.6.1 Setup 136
    7.6.2 eBPF NFs (Polycube) 136
    7.6.3 eBPF-firewall (bpf-iptables) 137
    7.6.4 Microbenchmarks 138
  7.7 Conclusions and Future Works 139

8 Concluding Remarks 141

A List of Publications 143

Bibliography 145

List of Tables

3.1 Reloading time of various eBPF services 30
4.1 Helper functions provided by Polycube at different levels of the NF 41
4.2 A list of NFs implemented with Polycube 52
4.3 Comparison between vanilla-eBPF applications and a Polycube network function. All throughput results are single-core 64
5.1 Comparison of the time required to append the (n+1)th rule in the ruleset, in milliseconds (ms) 96

List of Figures

2.1 eBPF overview 13
2.2 (a) Forwarding performance comparison between XDP and DPDK with small packets (64B) redirected between different NICs. DPDK uses one control thread, so only 5 to 6 cores are available for the forwarding. (b) CPU usage differences between DPDK, XDP, and Linux when dropping packets with a variable offered load. The data were obtained from [75] 14
3.1 Throughput (left) and latency (right) for the Bridge service when redirecting packets entirely in the fast path and when using the slow path 28
3.2 Effect on the end-to-end throughput of the code tailoring technique (left) and of moving the configuration from memory to code (right) 30
3.3 End-to-end throughput with an increasing number of tail calls 32
4.1 High-level architecture of the system 38
4.2 Message flow for Encapsulator and Decapsulator 43
4.3 (a) Transparent cubes attached to a port of the service. (b) Transparent cube attached to a netdev 46
4.4 Internal details of the Polycube service chains 47
4.5 YANG to REST/CLI service description 51
4.6 Packet forwarding throughput comparison between the Polycube pcn-bridge NF (in both XDP and TC mode) and “standard” Linux implementations such as Linux bridge (btctl) and OpenvSwitch (ovs) 58
4.7 Throughput performance between a Polycube load balancer NF (i.e., pcn-lbdsr), ipvs, the standard L4 load balancing software inside the Linux kernel, and Katran, an XDP-based load balancer developed by Facebook 59
4.8 Throughput performance comparison with 1000 rules between a Polycube firewall NF (i.e., pcn-firewall), iptables and nftables, which are two commonly used Linux firewalls, and OpenvSwitch (ovs) with OpenFlow rules 61
4.9 Architecture of the Polycube K8s plugin 62

4.10 Performance of different k8s network providers for direct Pod to Pod communication 63
4.11 Performance of different k8s network providers for Pod to ClusterIP communication 63
4.12 Overhead of the Polycube service chain compared to the standard eBPF tail call mechanism 65
5.1 Location of netfilter and eBPF hooks 70
5.2 High-level architecture of bpf-iptables 72
5.3 Linear Bit Vector Search 75
5.4 bpf-iptables classification pipeline 76
5.5 TCP state machine for bpf-iptables conntrack. Grey boxes indicate the states saved in the conntrack table; labels represent the value assigned by the first conntrack module before the packet enters the classification pipeline 82
5.6 Single-core (5.6a) and multi-core (5.6b) comparison when increasing the number of loaded rules. Generated traffic (64B UDP packets) is uniformly distributed among all the rules 87
5.7 Performance of the INPUT chain with an increasing number of rules. bpf-iptables runs on a single CPU core and iperf on another core 88
5.8 Multi-core performance comparison when varying the number of fields in the rulesets. Generated traffic (64B UDP packets) is uniformly distributed among all the rules 89
5.9 Connection tracking with an increasing number of clients (number of successfully completed requests/s) 90
5.10 Throughput when protecting a variable number of services within a DMZ. Multi-core tests with UDP 64B packets, bidirectional flows 92
5.11 Multi-core performance under DDoS attack. Number of successful HTTP requests/s under different load rates 93
5.12 Performance with a single default ACCEPT rule (baseline). Left: UDP traffic, 64B packets matching the FORWARD chain. Right: number of HTTP requests/s (downloading a 1MB web page), TCP packets matching the INPUT chain 94
5.13 TCP throughput when the bpf-iptables ingress pipeline (with zero rules) is executed on either the XDP or TC ingress hook; bpf-iptables running on a single CPU core; iperf running on all the other cores 97
6.1 High-level architecture of the system 106
6.2 Dropping rate with an increasing number of attackers. (a): uniformly distributed traffic; (b): traffic normally distributed among all sources 111
6.3 Host CPU usage of the different mitigation approaches under a simulated DDoS attack (uniform distribution) 112

6.4 Number of successfully completed HTTP requests/s under different load rates of a Distributed Denial of Service (DDoS) attack carried out by (a) 1K attackers and (b) 4K attackers 114
7.1 Static vs. dynamic generation. The static compiler generates an NF-agnostic data path, while Kecleon takes into account runtime data, statistics, and behavior to generate the NF data path 117
7.2 L2 bridge NF (eBPF-TC) performance with variable runtime configuration settings between the original version and the one with the unused features compiled out; traffic is always the same across the different runtime configurations 119
7.3 (a) Overhead given by consecutive match-action empty table lookups and (b) throughput with different map sizes and algorithms 120
7.4 Throughput improvements when caching the computation of the top-5 flows within a high-locality trace for the DPDK flow-classify sample application 121
7.5 Kecleon Architecture 124
7.6 Single-core throughput for various Polycube eBPF-based NFs. We report the average, maximum, and minimum throughput values under different configuration and traffic setups when using Kecleon to generate the optimized code 137
7.7 Single-core throughput for bpf-iptables. We report the throughput under three different Classbench-generated traces, from no locality (traffic is uniformly distributed) to high locality (few elephant flows) 138
7.8 Single-core throughput for bpf-iptables when only the Kecleon instrumentation is applied, without any of the optimization passes taking effect. The first bar indicates the overhead of the implicit traffic-specific approach used by Kecleon, while the second indicates the overhead for the use of guard tables 139

Chapter 1

Introduction

One of the main reasons behind the success of the Internet and its widespread adoption revolves around the “end-to-end” principle [138, 15]. The general idea behind it is that a network designed according to this principle should keep application-specific features on the end-hosts of the network while maintaining its core simple. Intermediate nodes, such as switches and routers, are just “dumb” forwarding devices that send packets from one end-host to another, while more sophisticated features are implemented in the software running on the end-host devices. Over the years, this approach has been successfully applied to both wide area networks (WAN) and data center networks (DCN). The simplicity and stateless model of the core network have facilitated the implementation of fixed functionalities running on specialized network hardware processors or application-specific integrated circuits (ASICs), fostering the introduction of more network providers, which led to the rapid growth of Internet traffic. On the end-hosts, the increased flexibility and programmability of the software have encouraged innovation, resulting in the realization of ground-breaking applications such as the World Wide Web.

While the end-to-end principle still stands as a fundamental design framework in computer networking, in recent years the distinction between the “dumb core” and the “smart edge” has gotten blurred, with different trends that tip the balance more in one direction or the other. Modern network providers (ISPs) started running their networks as a business, relying on the provision of value-added services such as content distribution [137, 6, 122], safe browsing, or anti-malware, also rented to third parties to provide more advanced features and increase their revenue [95]. Data center networks followed the same path; the growing need for security and reliability has caused the network to implement more sophisticated features such as proxies, intrusion detection and prevention systems, firewalls, caches and load-balancers, implemented within dedicated hardware network appliances (i.e., middleboxes) built for specific application purposes [140].

On the other hand, motivated by the high infrastructure and management costs, which result from the complex and specialized processing of the above-mentioned

devices, the advent of “network softwarization” proposed that traditional network services be transformed into pure software images that can be executed on (cheap) general-purpose servers, running in VMs or cloud sites. Then, by using Software Defined Networking (SDN) [131], network traffic can be steered through a chain of Virtual Network Functions (VNFs) in order to provide aggregated services.

With the introduction of these concepts, it became more evident that software started playing a crucial role in network data plane architectures, with the end-hosts increasingly becoming a natural enforcement point for core network functions such as load balancing [76, 94, 74, 16], congestion control [3, 89, 119], and application-specific network workloads, thanks also to the development and availability of programmable network devices (e.g., SmartNICs). The new requirements in terms of flexibility (software is intrinsically easier to program and to update compared to hardware implementations), and the recent advances in terms of speed of software packet processing, have contributed to the proliferation of a myriad of VNF frameworks that provide implementations of efficient and easily programmable software middleboxes [35, 90, 169, 86, 93, 100, 170, 123, 125]. Current solutions to implement the data plane of those software packet processing applications rely mostly on kernel-bypass approaches, for example by giving user space direct access to the underlying hardware (e.g., DPDK [54], netmap [133], FD.io [83]) or by following a unikernel approach, where only the minimal set of OS functionalities required for the application to run is built within the application itself (e.g., ClickOS [106, 66, 96]). These approaches have perfectly served their purposes, with efficient implementations of software network functions that have shown the potential for processing 10-100 Gbps on a single server [134, 105, 72].

Recently, the advent of new networking technologies such as 5G, edge computing, and IoT has brought a significant increase in the total number of connected devices and in the cloud service load, requiring network operators to change the previously monolithic paradigm used to develop and deploy server applications in favor of microservices. Cloud-native technologies are used to develop applications built with services packaged in containers, deployed as microservices, and managed on elastic infrastructure through agile DevOps processes and continuous delivery workflows [87]. This paradigm shift has also brought a visible change in the type and requirements of the network functionalities deployed across the data center, given the new workloads and applications running on the servers. For instance, cloud-native platforms, like Kubernetes [82], rely on different network providers (a.k.a., network plugins) to implement the underlying data plane functionalities and transparently steer packets between the microservices. Those new requirements can be summarized by the following points:

• Runtime flexibility: A single cloud-native application is composed of several services, each of which has thousands of instances that might constantly


change and be dynamically scheduled by a data center software component called the orchestrator. Within this world, service communication is fundamental, and the network should follow the same behavior by continuously adapting to the runtime characteristics to ensure end-to-end performance and reliability [13].

• Low overhead: Networking components should handle the communication of each server in the cluster. As a consequence, the cost to be paid, in terms of resource consumption, for running network functions on each server becomes significant. This is even more evident within edge clouds, where the number of available resources is limited.

• Agile service development: One of the biggest advantages of the micro-service paradigm, together with the concept of building loosely-coupled, fine-grained services, is the continuous delivery (CD) software development process; a small change to the application requires re-building and re-deploying only a small part of the entire service. Newer software data planes should follow the same approach, making it possible to easily update the existing application by providing a replacement that does not disrupt the typical service workflow [43].

• Automatic optimizations: From a developer’s point of view, writing an efficient and, at the same time, easy-to-maintain software data plane is a daunting task. Most of the time it is just a matter of finding the right trade-off between simplicity (e.g., modularity, easy-to-read code) and performance. As a consequence, we often see application-specific and ad-hoc techniques applied only to particular use-cases [72, 105], which do not perform well in other scenarios or for a broader spectrum of applications. To better adapt to this new “frenetic” environment, the network components should be able to automatically adapt themselves to the runtime conditions with the minimum amount of programming overhead.

• Work with the vanilla Linux kernel: Applications are now running on a host operating system that is shared between the different components (e.g., containers), which in turn rely on existing kernel functionality to accomplish their tasks [9]. It is then crucial that network functions can easily interact with existing “native” applications and also leverage kernel functionalities (e.g., TSO, skb metadata, etc.), without sacrificing performance and flexibility.

Unfortunately, the previously mentioned kernel-bypass approaches fall short when adopted in this new scenario, for different reasons [120]. First, they require the exclusive allocation of resources (i.e., CPU cores) to achieve good performance; this is perfectly fine when we have a single machine dedicated to networking purposes, but it becomes overwhelming when this cost has to be paid for every server in the cluster, since those cores permanently steal precious CPU cycles from other application tasks. Second, they require re-implementing the entire network stack in userspace, losing all the well-tested configuration, deployment and management tools developed over the years within the operating system. Third, they rely on custom or modified versions of network drivers, which may not be available on on-demand cloud platforms, and also carry a non-negligible maintenance cost. Last but not least, they have difficulties (and poor performance) when dealing with existing kernel implementations or when communicating with applications that are not implemented using the same approach, requiring them to adhere to custom-defined APIs (e.g., mTCP [88]) or to change the original application logic (e.g., StackMap [165]). As a consequence, most of the existing cloud-native network providers today still rely on functionalities and tools embedded into the operating system network stack (e.g., iptables, ipvs, linux bridge). Unfortunately, the drawbacks of this approach are also evident. First of all, fixed kernel network applications are notoriously slow and inefficient given their generality, which impairs the possibility to specialize the software network function depending on the workloads or the type of application that is running on top of it. Secondly, software network functions that live in the kernel have also proven hard to evolve, due to the complexity of the code and the difficulties in maintaining, up-streaming or modifying the kernel code (or the respective kernel modules).

In this dissertation, we present our steps towards the definition and realization of a novel class of software network applications that can fulfill those new requirements, removing the limitations seen with the existing approaches. We first present Polycube, a framework that can be used to build flexible and efficient in-kernel software network functions that follow the same approach as “cloud-native” applications, enabling the creation of efficient, modular and dynamically reconfigurable networking components, available with vanilla Linux. It exploits the recently added eBPF [107] subsystem to build the data plane of the network function. We will show how Polycube enables the writing of complex networking components that can be used to provide flexible and optimized replacements of existing (fixed) in-kernel implementations such as iptables, while keeping the same semantic and syntax of the original application, but with improved performance.
Then, we move the focus to Kecleon, a compiler framework that can be used to dynamically optimize software data planes at run time by adopting a dynamic approach to data path compilation, where not only the static features but also the runtime data of the original application are exploited to generate a custom version of the original data plane that is optimal with respect to both the data plane semantics and the packet processing behavior. While the two systems focus on different layers, the combination of both can lay the foundation of a new paradigm for building software packet processing applications that can be dynamically re-combined, re-generated, re-compiled and re-optimized without sacrificing performance, programmability or extensibility.


1.1 Summary of Contributions

We start the first part of the dissertation (Chapter 3) by exploring the possibility of using the extended Berkeley Packet Filter (eBPF) [107] as a base subsystem to build complex software packet processing applications. So far, eBPF has mainly been used for monitoring tasks such as memory, CPU, page faults, traffic, and more, with only a few examples of traditional network services, e.g., that modify the data in transit. However, it also has some characteristics that make it the right candidate for running data plane applications. First of all, eBPF can process a packet entirely in kernel space, without context switches or packet copies between kernel and user space. Second, it leverages a set of features that are already present in a modern Linux kernel, without requiring additional kernel modules that are difficult to create and maintain. Third, the possibility to compile and inject the code at runtime paves the way to context-based customization of each network function. Finally, eBPF programs can cooperate with the kernel TCP/IP stack, possibly complementing existing networking features. We make the following contributions. We first analyze the possibility to create complex network functions that go beyond simple proof-of-concept data plane applications, and we present the most promising characteristics of this technology. Then, we indicate the main encountered limitations, and some solutions that can mitigate them. Second, we summarize the most important lessons learned and, finally, we provide a quantitative characterization of the most significant aspects of this technology.

The above findings led us to the conclusion that creating network functions based entirely on eBPF is sometimes complicated, given the lack of a common framework that provides useful abstractions to developers to solve common problems or known limitations. Even though eBPF allows complex and user-defined operations to be performed in the kernel, it is not Turing-complete. As a consequence, it cannot support truly arbitrary processing, making the implementation of some common functionalities (e.g., ARP handling in a router) challenging. Furthermore, no abstractions currently exist to implement the (complex) control plane of a service, hence forcing developers to dedicate a considerable amount of time to handling common control plane operations (e.g., user-kernel interaction). This motivated us to rethink how to enable a network function model where in-kernel network applications can be adjusted, injected and modified in a simple and well-defined way, hence enabling the dynamicity and flexibility required by new data center workloads. We present Polycube (Chapter 4), an overarching coherent software architecture that allows in-kernel network services to be managed in a logically centralized manner. With Polycube, we make the following contributions. First, we make it possible to create complex network services that can overcome the eBPF limitations. We define a specific structure of a service where each function can include an efficient in-kernel data plane (based on eBPF) and a flexible user-space control plane. Then,

we handle, through a specific programming API, the interaction between the different components, while keeping strong characteristics of isolation and composability. Second, we manage the creation of complex network applications by allowing the arbitrary construction and concatenation of different services to create arbitrary chains of in-kernel network functions. Third, we introduce a generic model for the control and management plane of each network function that simplifies manageability and accelerates the development of new network services. We make it possible to change the current version of a given Polycube service dynamically or to modify the service chain at runtime, hence achieving a new type of customizability and flexibility inside the Linux kernel. Finally, we identify and quantify the overhead introduced by the Polycube abstractions, and we show the design and the performance improvements that Polycube services can bring into the new cloud-native environment (e.g., by presenting an implementation of a k8s CNI plugin). In a nutshell, Polycube allows us to achieve the same modularity, flexibility, and development process that is possible with user-space Network Functions, but inside the Linux kernel, with all the benefits that go with it.

To better present the advantages of this new model and the power of Polycube, in Chapter 5 we present bpf-iptables, a fully-functional replacement of iptables that emulates its filtering semantic while exploiting a more efficient matching algorithm. We show how bpf-iptables achieves a notable boost in terms of performance compared to the current implementation of iptables, particularly when a high number of rules is involved, all of this within a vanilla Linux kernel. We make three main contributions in this work. Firstly, we present the design of bpf-iptables together with the main challenges and possible solutions to preserve the original iptables filtering semantic. To the best of our knowledge, bpf-iptables is the first solution that provides an implementation of the iptables filtering completely in eBPF. Secondly, we give a comprehensive analysis of the main limitations and challenges required to implement a fast matching algorithm in eBPF, taking into account the current limitations of the above technology. Third, we show a set of data plane optimizations that are possible thanks to the flexibility and dynamic compilation (and injection) features of eBPF, allowing us to create at runtime an optimized data path that fits perfectly with the current ruleset being used.

To further enhance the capabilities of end hosts, while keeping the best possible trade-off between resource consumption and performance, we have evaluated the possibility to exploit programmable network interface cards (a.k.a., SmartNICs) to partially (or fully) offload existing packet processing functions. In particular, we took DDoS mitigation as a use-case for this study. In Chapter 6, we then present an architecture that can be used to transparently offload a portion of the DDoS mitigation rules into a SmartNIC, thus achieving a balanced combination of the in-kernel packet processing flexibility of eBPF, used to operate traffic sampling and aggregation, with the performance of hardware-based filtering. We first analyze the various approaches that can be used to design an efficient and cost-effective DDoS mitigation

solution. As generally expected, our results show that offloading the mitigation task to the programmable NIC yields significant performance improvements. However, we also demonstrate that, due to the memory and compute limitations of current SmartNIC technologies, a fully offloaded solution may lead to deleterious performance. Second, as a consequence of the previous findings, we propose the design and implementation of a hybrid mitigation pipeline architecture that leverages the flexibility of eBPF/XDP to handle the different types of traffic, and the efficiency of the hardware-based filtering in the SmartNIC to discard traffic from malicious sources. Third, we present a mechanism to transparently offload part of the DDoS mitigation rules into the SmartNIC, which takes into account the most aggressive sources, i.e., the ones that primarily impact the mitigation effectiveness.

Starting from the experience gained in the design and implementation of this new type of in-kernel network services (e.g., bpf-iptables), we noticed that one of the most important characteristics that contributed to the better efficiency and performance of these applications is the possibility to specialize the corresponding network data plane at runtime, according to the specific application logic. For example, in bpf-iptables many optimizations are applied at runtime according to the organization of the rules in the data set; this, combined with the dynamic injection of eBPF programs in the kernel, allowed us to notably outperform all the other existing kernel implementations. Motivated by this work, we then tried to answer the following question: “is it possible to perform these runtime optimizations outside of the application context?”. Taking a step back, we realized that traditional approaches to design and develop packet processing functions are based on static compilation. The compiler’s input is a description of the forwarding plane semantic, and the output is a binary code that can accommodate any pre-defined processing behavior. Although improving hot-code paths during the execution of a software program is nowadays possible with compilers that support Profile Guided Optimizations (PGO) and Feedback Directed Optimizations (FDO), those techniques have been architected to optimize generic computer programs. Therefore, they do not easily accommodate the requirements of network data plane programs that have packets as input. In the last part of this dissertation, we then present Kecleon, a runtime compiler for generic packet processing applications that can dynamically generate an optimized version of the original application’s data plane depending on its runtime behavior. Kecleon is designed to be independent not only from the application itself but also from the technology with which the application is implemented. It applies several dynamic optimizations at the compiler’s Intermediate Representation (IR) level, both to remain generic and to transparently exploit existing compiler analysis techniques and optimizations. Kecleon takes into account (i) the network configuration, e.g., to prune branches and instructions that are considered unreachable at runtime, (ii) the runtime table content, e.g., selecting the most appropriate data structure (hash-based versus tree) to store the current data can speed up the lookup process, and (iii) traffic patterns,

e.g., detecting the most frequently accessed entries to create an optimized fast-path code for them. We illustrate the details of Kecleon in Chapter 7.

1.2 Outline

The rest of this dissertation is organized as follows. Chapter 3 briefly introduces the eBPF subsystem and its main features, together with several insights and limitations we have found while implementing complex data plane applications with this technology. Chapter 4 presents Polycube, a software architecture that applies the micro-service paradigm to the world of in-kernel network functions. Chapter 5 shows one of its applications, bpf-iptables, a more efficient and scalable clone of iptables built around eBPF. Then, Chapter 6 focuses on the DDoS mitigation use case, and presents a study of the combination of the in-kernel packet processing flexibility provided by eBPF with the performance of programmable hardware-based filtering performed in a SmartNIC. In Chapter 7, we present Kecleon, a compiler framework that can be used to dynamically optimize software data planes at run time. Finally, Chapter 8 concludes this dissertation and discusses suggestions for future work.

Previously Published Work. This thesis includes previously published and co-authored works. In particular, Chapter 3 is adapted from [114], Chapter 4 from [108] and Chapter 5 from [115, 22]. Chapter 6 is adapted from [109], which is the result of a collaboration with the research center “Fondazione Bruno Kessler” (FBK). Finally, the work in Chapter 7, which is still in a preliminary phase, is the result of a collaboration with the University of Cambridge, UK, and it has not been published yet.

1.3 Research Projects Not Included in This Dissertation

As part of the master course, the author worked on other topics that are not covered in this dissertation. However, they fall under the common goal of building more efficient and flexible networks at the edge of the network (e.g., on resource-constrained Customer Premise Equipment). Even though these topics were explored almost three years ago, their motivations and the results achieved are still considered timely, given the recent works that go in the same direction (e.g., [139], [62], [40]).

• Transforming a Traditional Home Gateway into a Hardware-accelerated OpenFlow switch [112, 113]. Software Defined Networking (SDN) proposes a new paradigm that allows network administrators to manage network services from a centralized point of control through abstraction of lower

level functionality. The SDN innovation brought significant advancements to different areas, such as the administration of home networks. Traditional home gateways are, however, hard to manage as new applications are introduced and moved to the customer premises; applying SDN to these devices would enable programming and controlling the home network from a centralized point of control, allowing users to manage and configure the behavior of their network via high-level applications. In this work, we described our experience in porting OpenFlow onto an already existing hardware switch with no support for the OpenFlow standard. We presented our architecture, which integrates a hybrid software and hardware pipeline and is able to compensate for the hardware limitations in terms of supported matches and actions, offloading only part of the OpenFlow rules, which are properly translated into the corresponding hardware-related commands. We illustrated the design choices used to implement the functionalities required by the OpenFlow protocol (e.g., packet-in, packet-out messages) and, finally, we evaluated the resulting architecture, showing the significant advantage in terms of performance that can be achieved by exploiting the underlying hardware, while maintaining an SDN-type ability to program and to instantiate the desired network operations from a central controller.

• Enabling NFV Services on Resource-Constrained CPEs [27]. Virtual Network Functions (VNFs) are often implemented using virtual machines (VMs), since they provide an isolated environment compatible with classical cloud computing technologies. Unfortunately, VMs are demanding in terms of required resources and therefore are not suitable for resource-constrained devices such as residential CPEs. Such hardware often runs a Linux-based operating system that supports several software modules (e.g., iptables) that can be used to implement network functions (e.g., a firewall), which can be exploited to provide some of the services offered by simple VNFs, but with reduced overhead. In this work, we proposed and validated an architecture that integrates native software components in a Network Function Virtualization (NFV) platform, making their use transparent from the user’s point of view. Our solution enables an NFV orchestrator to optimize the scheduling of the Network Functions (NFs) by instantiating services that need to be close to end users, such as IPsec terminators or low-latency services, directly on the user CPE, while other components of the same service (e.g., the NAT module) are executed in a remote data center.

Chapter 2

Background and Motivations

In this chapter, we explore the available alternatives that we have today to build software packet processing applications, along with their motivations, implications, and challenges. First, we explore the most important characteristics of packet processing performed in kernel space or user space, showing the corresponding pros and cons of both approaches (Section 2.1). Then, we provide a brief overview of the extended Berkeley Packet Filter subsystem (Section 2.2) and explain why we think it may be beneficial for networking applications (Section 2.2.1).

2.1 Userspace vs. Kernelspace networking

In recent years, thanks to the introduction and spread of Network Function Virtualization (NFV), software has gained considerable importance, with many network functionalities deployed at the end host that go beyond traditional switching and routing between VMs and containers. Current alternatives for building such software network applications can be organized into two categories: user-level (or kernel-bypass) approaches, and kernel-level implementations, which use functionalities embedded in the operating system networking stack.

Userspace networking. Moving functionalities from hardware to software has increased the performance demands on software network applications, favoring the appearance of specialized network facilities such as DPDK, netmap, and SR-IOV. To avoid expensive user-kernel transitions, those systems directly access the network hardware from userspace, implementing their own network stack focused on performance and optimization. As a result, they produce considerable benefits both in terms of throughput and latency. However, they also have some drawbacks, which we summarize in the following points:

• High resource consumption. Most of the kernel-bypass solutions require the exclusive allocation of one (or more) CPU cores to handle high-speed

packet processing. This model works well for replicating in software a “middlebox” approach, where a single machine is dedicated to networking purposes (e.g., routing, switching, or load-balancing). However, in a scenario (e.g., new cloud environments) where most of the machines (and CPUs) are dedicated to userspace applications (e.g., web servers), those approaches may prove too costly.

• Difficult integration with “native” applications/environments. Applications that currently rely on kernel-level functionalities or use kernel (e.g., socket) APIs cannot be transparently used when the userspace application has direct access to the underlying network hardware. Several companies have debugging tools, developed over the years, that cannot be used with kernel-bypass approaches [104]; Kubernetes network providers still count on iptables, ipset, ipvs to provide security or load-balancing between containers. Solutions exist to allow their co-existence. For instance, the Kernel NIC Interface (KNI) in DPDK, or Netmap, allow passing packets from the userspace application to the kernel network stack, while StackMap [165] allows using the kernel TCP/IP stack directly. Unfortunately, these solutions either result in sub-optimal performance [101] or require the use of custom drivers, custom kernel modules, or modified kernels, which are not always available in every environment.

• Security issues. Bypassing the Linux networking stack not only implies an incompatibility with several applications but may also bring security issues [60]. The Linux TCP/IP stack has some non-trivial and incredibly powerful features that have been developed over the years and that are well-tested and quite reliable. Of course, re-writing everything from scratch has the advantage of a cleaner design and thus better performance [88, 105], but it would still require a lot of time before having a stable userspace networking stack comparable, in terms of features and compatibility, to the Linux kernel one.

Kernelspace networking. The alternative to handling all packets from a userspace process is to perform the processing entirely in the kernel, reducing to a minimum the number of kernel/userspace transitions, which are the primary source of overhead. Besides the evident advantages of compatibility with existing applications and tools, and the ability to run within a vanilla Linux kernel, this approach also has some drawbacks.

• Poor performance. The Linux TCP/IP stack is made of several layers and abstractions dictated by the generality of the design, which has to be valid for several types of hardware and architectures. On the one hand, this offers better compatibility between the different layers and a good level of abstraction from

the underlying hardware, but on the other hand, it is also the main source of the poor performance of the system. A packet entering the Linux TCP/IP stack has to cross all the different layers, making it very hard to bypass or “jump” from one layer to another.

• Difficult development process. Adding kernel-level functionalities is a non-trivial and lengthy process. Writing custom kernel modules may require a non-negligible effort to keep compatibility across the different kernel versions, or it may incur reluctance from customers to include them in the production system (a crash in the kernel module will hurt the entire system). On the other hand, upstreaming functionalities into the mainline kernel requires “convincing” maintainers of their relevance for the whole community. It often results in long delays before a feature is accepted and available in future kernel releases, and it breaks the innovation cycle for a company that wants to implement features that are useful only for its own use cases.

Until a few years ago, the approaches mentioned above were the only alternatives available to develop kernel-level network functionalities. However, the recent evolution of the “classic” BPF [107] subsystem into the extended BPF has introduced exciting features and concepts that make developing and running kernel-level functionalities easier and more efficient.

2.2 The extended Berkeley Packet Filter (eBPF)

The Berkeley Packet Filter (BPF) is an in-kernel virtual machine for packet filtering that has been deeply revisited starting from 2013 and is now known as extended BPF (eBPF). In addition to several architectural improvements, eBPF introduces the capability of handling generic event processing in the kernel, JIT compilation for increased performance, stateful processing using maps, and libraries (helpers) to handle more complex tasks, available within the kernel. eBPF allows a user-space application to inject code in the kernel at runtime, i.e., without recompiling the kernel or installing any optional kernel module. eBPF programs can be written either using eBPF assembly instructions, then converted to bytecode with the bpf_asm utility, or in restricted C, compiled using the LLVM Clang compiler. The bytecode can then be loaded using the bpf() system call. For this process to succeed, the program has to get through a sanity check from the eBPF verifier, which walks the control flow graph to ensure termination, simulates the execution to check that memory and registers are always in a valid state, and verifies that the calls to helper functions are allowed. A loaded eBPF program follows an event-driven architecture and is therefore hooked to a particular type of event (e.g., the arrival of a packet).
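For illustration purposes only (this sketch is not part of the original dissertation), a minimal XDP program written in restricted C could look as follows; the program name xdp_drop_udp and the choice of dropping UDP traffic are purely hypothetical:

```c
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/ip.h>
#include <linux/in.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

/* Hedged sketch: drop UDP packets at the XDP hook, let everything else
 * continue through the normal kernel network stack. */
SEC("xdp")
int xdp_drop_udp(struct xdp_md *ctx)
{
    void *data     = (void *)(long)ctx->data;
    void *data_end = (void *)(long)ctx->data_end;

    struct ethhdr *eth = data;
    if ((void *)(eth + 1) > data_end)       /* bounds check required by the verifier */
        return XDP_PASS;
    if (eth->h_proto != bpf_htons(ETH_P_IP))
        return XDP_PASS;

    struct iphdr *ip = (void *)(eth + 1);
    if ((void *)(ip + 1) > data_end)
        return XDP_PASS;

    if (ip->protocol == IPPROTO_UDP)
        return XDP_DROP;                    /* early drop, before skb allocation */

    return XDP_PASS;
}

char LICENSE[] SEC("license") = "GPL";
```

Such a program would typically be compiled with something like clang -O2 -target bpf -c prog.c -o prog.o and then attached to an interface, for instance with ip link set dev eth0 xdp obj prog.o sec xdp (interface and file names are placeholders).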


Figure 2.1: eBPF overview.

Each occurrence of the event triggers the eBPF program execution and, based on the type of event, the program might be able to alter the event context1. Furthermore, programs are stateless by nature, as each run is independent of the others. For this reason, eBPF provides maps, data structures accessible through helper functions, needed to share information between (i) different runs of the same program, (ii) different programs, or (iii) a program and userspace. For networking purposes, program execution is triggered by the arrival of a packet. Two hooks are available to intercept packets and possibly mangle, forward or drop them: eXpress Data Path (XDP) and Traffic Control (TC). XDP programs intercept RX packets right out of the NIC driver, possibly before the allocation of the Linux socket buffer (skb), allowing, e.g., early packet drop. TC programs intercept data when it reaches the kernel traffic control function, either in RX or TX mode. Multiple eBPF programs can be instantiated at the same time, even attached to different hooks. Furthermore, eBPF programs can either operate in isolation (returning the packet to the hook they are attached to) or be chained, e.g., to create a more complex service, using a low-overhead linking primitive called tail call.

1 For networking programs, the event context is represented by the packet itself, while for eBPF programs attached to generic kernel functions, the context is represented by the function’s arguments.
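To make the role of maps and helpers more concrete, the following hedged sketch (again, not taken from the original text) defines a per-CPU array map and a small TC program that updates it through the bpf_map_lookup_elem() helper; the map name pkt_count and the section name are arbitrary choices:

```c
#include <linux/bpf.h>
#include <linux/pkt_cls.h>
#include <bpf/bpf_helpers.h>

/* Hedged sketch: a per-CPU array map shared between different runs of the
 * program (and readable from userspace); every stateful access goes
 * through an eBPF helper function. */
struct {
    __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
    __uint(max_entries, 1);
    __type(key, __u32);
    __type(value, __u64);
} pkt_count SEC(".maps");

SEC("tc")
int count_packets(struct __sk_buff *skb)
{
    __u32 key = 0;
    __u64 *value = bpf_map_lookup_elem(&pkt_count, &key);

    if (value)                 /* NULL check required by the verifier */
        (*value)++;            /* per-CPU map: no atomic operation needed */

    return TC_ACT_OK;          /* let the packet continue through the stack */
}

char LICENSE[] SEC("license") = "GPL";
```

A userspace control plane can read the same map through the bpf() system call, which is exactly the kind of user/kernel interaction the maps are designed for.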



Figure 2.2: (a) Forwarding performance comparison between XDP and DPDK with small packets (64B) redirected between different NICs. DPDK uses one control thread, so only 5 to 6 cores are available for the forwarding. (b) CPU usage differences between DPDK, XDP, and Linux when dropping packets with a variable offered load. The data were obtained from [75].

Tail calls are a sort of long jump from one program to another; differently from function calls, this primitive does not allow returning to the previous context. A high-level view of the eBPF architecture, including both code injection and run-time processing, is depicted in Figure 2.1.
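As a hedged illustration of this chaining primitive (not from the thesis), the sketch below jumps from a dispatcher program to whatever program the control plane has installed in a BPF_MAP_TYPE_PROG_ARRAY map; the map name and slot index are placeholders:

```c
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* Hedged sketch of chaining via tail calls: a program array map holds the
 * next programs of the chain, indexed by an arbitrary slot id that the
 * control plane fills at runtime. */
struct {
    __uint(type, BPF_MAP_TYPE_PROG_ARRAY);
    __uint(max_entries, 8);
    __type(key, __u32);
    __type(value, __u32);
} jmp_table SEC(".maps");

#define NEXT_PROG_SLOT 1       /* hypothetical slot populated by the control plane */

SEC("xdp")
int dispatcher(struct xdp_md *ctx)
{
    /* Jump to the next program in the chain; execution does NOT return here. */
    bpf_tail_call(ctx, &jmp_table, NEXT_PROG_SLOT);

    /* Reached only if the slot is empty or the tail call fails. */
    return XDP_PASS;
}

char LICENSE[] SEC("license") = "GPL";
```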

2.2.1 eBPF for Network Functions

The most attractive features that we have found in the eBPF subsystem can be summarized in the following points:

• Easy development process. eBPF programs follow the same development process as userspace applications. Custom applications can be developed independently from the kernel development process. They can be dynamically injected into the kernel using the bpf() system call, without having to install custom kernel modules or rely on modified kernels (a minimal loader sketch is shown at the end of this section). This also lets the user define the behavior of the program, specializing it for the current applications and workloads, as opposed to fixed kernel implementations.

• High-speed packet processing. eBPF programs are JIT-compiled into native machine code before being executed2, which provides considerable advantages compared to interpreted execution. Moreover, the execution

2 This is valid only if the eBPF JIT flag is enabled in the kernel. However, the latest kernel releases have this flag enabled by default.


of these programs at the XDP [75] level enables high-speed packet processing and forwarding, with performance close to that of user-level approaches, as shown in Figure 2.2a.

• Excellent performance/efficiency trade-off. As shown in Figure 2.2b, eBPF programs do not require polling to read network packets from the device hardware; they are triggered only when a packet is received. As a consequence, if the machine running the eBPF application is not receiving any traffic, no CPU cycles are consumed, as opposed to DPDK, where the CPU usage is always 100% even if no packets are received.

• Integration with the Linux subsystems. eBPF programs cooperate with the kernel TCP/IP stack, interact with other kernel-level data structures (e.g., FIB or neighbor table), and leverage kernel functionalities (e.g., TSO, skb metadata, etc.), possibly complementing existing networking features. Legacy applications or debugging tools can continue to be used without any change to the existing applications.

• Security. As opposed to custom kernel modules, eBPF programs cannot crash the system. The in-kernel verifier ensures that all the operations performed inside the eBPF programs are correct and safe, rejecting the injection of faulty programs3.

• Explicit structure of NF operations. Finally, eBPF programs have a clear distinction between stateless and stateful code; the operations outside the eBPF VM environment are carried out only through specific “helper” functions that have a well-defined syntax and construct. Multiple analysis frameworks developed over the years [166, 153, 126] have enforced this assumption to simplify the analysis of NF operations, to find buggy semantic behavior during development (verifier safety does not imply “semantic safety”) or to analyze the performance of an NF [85, 132]. eBPF programs have this assumption as part of the original design, making it easier to apply that type of analysis and those concepts.

3Of course, a bug in the verifier [52, 51] can invalidate this assumption, but this can be considered a remote scenario.

Chapter 3

Creating Network Services with eBPF: Experience and Lessons Learned

3.1 Introduction

The extended Berkeley Packet Filter (eBPF) is a recent technology that enables flexible data processing thanks to the capability to inject new code into the Linux kernel at run-time, which is fired each time a given event occurs, e.g., a packet is received. While its ancestor, the Berkeley Packet Filter (BPF), was used mainly to create packet filtering programs, eBPF has been successfully used primarily in monitoring tasks [69, 45, 28, 70]. Surprisingly, its usage in traditional network applications, such as data plane services, has been less intense. In fact, the creation of complex network functions that go beyond simple proof-of-concept data plane applications has proven to be challenging, due to several limitations of this technology, although it is evolving fast, as shown by the significant number of patches and new features added almost daily to the Linux kernel. In addition, eBPF is not (yet) backed by a rich ecosystem of tools and libraries aimed at simplifying the life of potential developers; the BPF Compiler Collection (bcc) [17] is more oriented to tracing than to packet manipulation. However, eBPF is very promising due to some characteristics that can hardly be found all together, such as the capability to execute code directly in the vanilla Linux kernel, hence without the necessity to install any additional kernel module; the possibility to compile and inject the code dynamically; the capability to support arbitrary service chains; and the integration with the Linux eXpress Data Path (XDP) for early (and efficient) access to incoming network packets. At the same time, eBPF is known for some limitations, such as limited program size, limited support for loops, and more, which may impair its capacity to create powerful networking programs.

This chapter presents our experience in developing complex network services with eBPF and shows the most promising characteristics of this technology as well as the main limitations we encountered, as they appear in the everyday life of a typical developer. The chapter discusses the actual importance of the above limitations with respect to the creation of complex network applications, together with the possible solutions (if any). Finally, it discusses some of the peculiar advantages of this platform, backed by experimental evidence taken from our services, and weighs the advantages and limits of the eBPF technology, analyzing its suitability as a platform for the development of future complex network services targeting mainly virtualized environments. This chapter is structured as follows. Section 3.2 represents the central part of the chapter, highlighting our experience when coping with different aspects of eBPF and the main lessons learned. Finally, Section 3.3 provides the necessary evidence for the previous findings and Section 3.4 concludes the chapter.

3.2 Experiences and Insights

This section presents the main challenges encountered while implementing complex network functions with eBPF, together with the different solutions we adopted (or that could be adopted) to accomplish this task.

3.2.1 eBPF limitations

eBPF suffers from some well-known limitations due to its restricted virtual machine, whose constraints are needed to guarantee the integrity of the system. This section discusses the impact of these limitations and highlights some other, less known issues that arise when creating network services.

3.2.1.1 Limited program size

eBPF programs are executed within the kernel context; for this reason, their size is limited to a maximum of 4096 assembly instructions, to guarantee that any program will terminate within a bounded amount of time. This restriction may be limiting when creating network functions that perform complex actions in the data plane, considering that the number of BPF assembly instructions generated by the compilation can be significantly higher than the number of lines of code of the C source file.

Learning 1: This limitation can be circumvented by partitioning the network function into multiple eBPF programs and jumping from one to another through tail calls. This technique enables the creation of network services as a collection of

loosely coupled modules, each one implementing a different function (e.g., packet parsing, classification, field modification), with a very low overhead in jumping from one piece to another, as shown in Section 3.3.5. This attractive feature also comes with an upper bound of 32 nested jumps, which we found to be more than enough to implement complex services.
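
As an illustration, the following minimal sketch (written in libbpf-style restricted C, with illustrative names and the actual service logic omitted, hence not the exact code of our services) splits a TC service into a parsing and a classification program connected through a tail call; the control plane is assumed to populate the program array with the file descriptor of the second program after loading it.

    #include <linux/bpf.h>
    #include <linux/pkt_cls.h>
    #include <bpf/bpf_helpers.h>

    /* Jump table between the pieces of the service; the control plane
     * fills slot PROG_CLASSIFY with the fd of the second program. */
    struct {
        __uint(type, BPF_MAP_TYPE_PROG_ARRAY);
        __uint(max_entries, 4);
        __uint(key_size, sizeof(__u32));
        __uint(value_size, sizeof(__u32));
    } jmp_table SEC(".maps");

    #define PROG_CLASSIFY 1

    SEC("tc")
    int parse(struct __sk_buff *skb)
    {
        /* ... header parsing, results saved in a shared map ... */

        /* Long jump into the classification stage: if the jump
         * succeeds, control never returns to this program. */
        bpf_tail_call(skb, &jmp_table, PROG_CLASSIFY);

        /* Reached only if the target slot is empty. */
        return TC_ACT_OK;
    }

    SEC("tc")
    int classify(struct __sk_buff *skb)
    {
        /* ... classification and forwarding decision ... */
        return TC_ACT_OK;
    }

    char _license[] SEC("license") = "GPL";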

3.2.1.2 Unbounded loops

Since eBPF applications can be loaded at runtime in the kernel, they are checked by an in-kernel verifier that ensures programs cannot harm the system. The verifier looks for multiple possible threats (e.g., state of BPF registers, type of values, etc.), rejecting the code in case backward jumps are detected, thus ensuring that all programs will terminate. We list some cases where this limitation may be a problem.

• Parsing nested headers: this is the case of IPv6, which requires looping through all the extension headers to find the last one, which indicates the type of the upper-layer protocol in the payload of the packet. A similar issue affects MPLS and VLAN headers, whose number of instances is not known a priori. Creating a network function that performs these actions in eBPF is not possible unless we introduce additional constraints, as described below.

• Arbitrary processing of packet payloads: this is required, for example, to check the presence of a signature in the payload. While there are cases in which this loop can be avoided thanks to the availability of specific helpers that perform the entire job (e.g., to recalculate the L3/L4 checksum), in general the necessity to perform a loop scanning the entire packet cannot be excluded a priori.

• Linear scanning of data structures: algorithms that require a linear scan of data structures (e.g., a map) may need to be adapted for the eBPF environment. A possible example is a firewall that looks for the rule that matches a given packet, which is usually performed through a linear scan across all the active policies.

Learning 2: Although the eBPF assembly does not support backward jumps, as far as bounded loops are concerned we can exploit the pragma unroll directive of the LLVM compiler to rewrite the loop as a repeated sequence of similar independent statements. This can be achieved by imposing a constant upper bound on the loop, such as the maximum number of IPv6 extension headers, nested MPLS labels,

or the packet size1. This solution presents two limitations: (i) the size of the program increases, with the possible consequences (and solutions) shown in Section 3.2.1.1; (ii) we may not be able to guarantee that all cases are handled, e.g., in case an exceptional number of IPv6 headers is present. However, even if the lack of support for unbounded loops seems to be an important limitation, we found it not so critical in our programs, as it can often be circumvented by creating bounded loops, although this is left to the responsibility (and experience) of the developer.
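
As an example of this technique, the following sketch (with illustrative names and bounds, not taken verbatim from our services) skips a stack of 802.1Q/802.1ad tags in XDP by unrolling a loop with a constant upper bound:

    #include <linux/bpf.h>
    #include <linux/if_ether.h>
    #include <bpf/bpf_helpers.h>
    #include <bpf/bpf_endian.h>

    #define MAX_VLAN_DEPTH 4    /* illustrative upper bound */

    struct vlan_hdr {
        __be16 h_vlan_TCI;
        __be16 h_vlan_encapsulated_proto;
    };

    SEC("xdp")
    int parse_vlans(struct xdp_md *ctx)
    {
        void *data = (void *)(long)ctx->data;
        void *data_end = (void *)(long)ctx->data_end;
        struct ethhdr *eth = data;
        __u64 off = sizeof(*eth);
        __u16 proto;

        if ((void *)(eth + 1) > data_end)
            return XDP_DROP;
        proto = eth->h_proto;

        /* The verifier rejects unbounded loops; unrolling a loop with a
         * constant bound keeps the generated code acyclic. */
    #pragma unroll
        for (int i = 0; i < MAX_VLAN_DEPTH; i++) {
            struct vlan_hdr *vh;

            if (proto != bpf_htons(ETH_P_8021Q) &&
                proto != bpf_htons(ETH_P_8021AD))
                break;
            vh = data + off;
            if ((void *)(vh + 1) > data_end)
                return XDP_DROP;        /* truncated header */
            proto = vh->h_vlan_encapsulated_proto;
            off += sizeof(*vh);
        }

        /* 'proto' now holds the first non-VLAN EtherType. */
        return XDP_PASS;
    }

    char _license[] SEC("license") = "GPL";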

3.2.1.3 Send the same packet on multiple ports

This is still a rather common operation even in modern local area networks, e.g., to handle broadcast frames (e.g., ARP requests), multicast, or flooding (e.g., in an 802.1D bridge). However, at least three issues can be encountered when implementing this feature. First, we may need to loop through all the interfaces to forward the packet the desired number of times; this can be implemented only if we are able to set an upper limit on the loop and unroll it (as discussed in Section 3.2.1.2). Second, the packet must be cloned before sending it on an additional interface; this can be done with the bpf_skb_clone_redirect() helper, which simultaneously duplicates and forwards the original packet to a target interface. However, this helper is available only when the program is attached to the TC hook, while an equivalent helper is not available for XDP programs2. Third, if the service is part of a virtual chain composed of multiple NFs connected through tail calls, such as in [1], the aforementioned approach fails. In fact, the redirect function will be followed by a tail call, which never returns control to the caller, hence preventing the caller code from sending the packet to multiple ports.

3.2.1.4 Packet-driven processing

While the execution of an eBPF program is triggered by an event, the only event supported by TC/XDP programs is a frame traversing the selected kernel hook. This prevents the eBPF data plane from reacting to other events, such as a timeout that signals the necessity to periodically send a packet (e.g., neighbor greetings in routing protocols) or to refresh an expired entry in a table. The above events have to be handled elsewhere, such as in the slow path (Learning 3) or in the control plane (Section 3.2.1.6).

1A patch [50] that adds support for bounded loops without the necessity to use the pragma unroll directive has been recently proposed and may be integrated in future kernel versions.
2For the sake of precision, XDP offers only the bpf_redirect_map() helper, which sends the packet to a port but does not clone it.


3.2.1.5 Putting packets on hold

In some cases, network functions may need to put the current frame on hold while waiting for another event to happen. This is the case of a router that holds a packet while waiting for the answer to its own ARP request, aimed at discovering the MAC address of the next hop; the original packet should be released after receiving the ARP reply. Unfortunately, eBPF does not have a “steal” action such as in Netfilter, hence preventing this technology from taking ownership of the packet. Possible workarounds to this problem can be envisioned, such as copying the entire packet into a temporary memory area, but they may not be suited for all cases (e.g., handling retransmissions, as the packet has to be released when a timeout occurs).

Learning 3: Taken together, limitations 3.2.1.3-3.2.1.5 suggest the necessity to introduce a novel data plane component that is no longer limited by the eBPF virtual machine and that executes arbitrary code to cope with the cases in which the current eBPF technology cannot be used. This makes evident the need for a slow path module, executed in userspace, that receives packets from the eBPF program and reacts with arbitrary processing defined by the developer, for example by modifying the packet and sending it back to the egress queue of a specific netdevice3. The necessity of this module is also highlighted in [157], where the authors use the OvS userspace module to process packets that do not match a flow in the OvS kernel eBPF data path.
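
To give an idea of how such a kernel-to-userspace channel can be realized, the following sketch punts a packet, together with a small metadata structure, to a userspace process through a perf event array; the structure layout and names are illustrative, and the userspace side (which reads the perf buffer and re-injects the packet) is omitted.

    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>

    struct {
        __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
        __uint(key_size, sizeof(int));
        __uint(value_size, sizeof(int));
        __uint(max_entries, 128);   /* >= number of CPUs */
    } slowpath_events SEC(".maps");

    struct slowpath_md {
        __u32 in_port;      /* virtual port the packet arrived on */
        __u32 reason;       /* e.g., unknown destination MAC */
        __u32 pkt_len;
    };

    SEC("xdp")
    int punt_to_slowpath(struct xdp_md *ctx)
    {
        struct slowpath_md md = {
            .in_port = 0,   /* filled by the service logic */
            .reason  = 1,
            .pkt_len = ctx->data_end - ctx->data,
        };

        /* The upper 32 bits of the flags ask the helper to append the
         * first pkt_len bytes of the packet after the metadata. */
        __u64 flags = BPF_F_CURRENT_CPU | ((__u64)md.pkt_len << 32);
        bpf_perf_event_output(ctx, &slowpath_events, flags, &md, sizeof(md));

        /* Drop here; the userspace module decides whether and where to
         * re-inject the packet. */
        return XDP_DROP;
    }

    char _license[] SEC("license") = "GPL";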

3.2.1.6 No support for complex control planes

So far, eBPF has been used mostly for tracing applications, which feature a very simple control plane, such as reading data from maps. As a consequence, existing eBPF software frameworks provide a nice set of abstractions that help developers create data plane code (hook handling, maps, etc.), while they are rather primitive with respect to the control plane, enabling userspace programs mainly to read/write data from maps. Networking services are rather different and often require a sophisticated control plane, not just to read/write data from maps, but to create/handle special packets (e.g., routing protocols), to cope with special processing that may complicate (and slow down) the data plane if handled there (e.g., ARP handling; Section 3.2.1.5), or to react to special events such as timeouts (Section 3.2.1.4).

Learning 4: This results in non-negligible difficulties when implementing the (complex) control plane of a service, as it forces developers to dedicate a considerable amount of time to writing from scratch the code that handles common control plane operations, without any help from existing software frameworks.

3In Linux, a netdevice is a physical or virtual network interface card.

3.2.2 Enabling more aggressive service optimization

The traditional approach when implementing a network function is to (i) create a program that contains all possible use cases and control flows (branches) and (ii) make it completely agnostic with respect to its actual configuration, which is pushed into the data plane afterwards. With eBPF, this approach is no longer the only option; in fact, programs can be compiled from their C source code and injected into the kernel at runtime, with the system already up and running. This allows us to take advantage of the runtime service conditions (e.g., traffic pattern, service configuration, interface from which the traffic is received/sent) to apply more aggressive optimizations compared to traditional programs. This section presents three techniques of this type, enabled by the use of eBPF as the data plane of our networking services.

3.2.2.1 Moving configuration data from memory to code

A conventional approach for configuring a network function is to save data (e.g., the public/private ports in a NAT, the set of rules in a firewall) in memory, which is accessed by the run-time code each time a packet is received. In the eBPF domain, this corresponds to saving data in maps, which provide a bidirectional userspace-kernel communication channel. While this approach is the only viable option for other technologies, eBPF enables the loading of new code dynamically, hence allowing the creation of situation-specific code that also embeds the data needed for the current processing. This technique leverages the superior processing capabilities of modern CPUs (e.g., speculative execution), trading more processing instructions for fewer memory accesses, which are known to introduce a noticeable penalty, in particular when random data access patterns lead to cache ineffectiveness.

Learning 5: Hardcoding parameters in the eBPF code, so that the service can directly use them without any explicit memory access, may lead to significant performance gains (Section 3.3.3.2). However, this requires handling configuration changes by dynamically reloading the program with the updated parameters; more details are presented in Section 3.2.2.3.
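
The following fragment sketches the idea for the filtering database of a bridge: branch (a) is the code that the control plane would generate at reload time, with the most common entries embedded directly in it, while branch (b) is the conventional per-packet map lookup, kept as a fallback; MAC addresses and port numbers are purely illustrative.

    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>

    /* Filtering database: destination MAC (lower 48 bits of a __u64) -> port. */
    struct {
        __uint(type, BPF_MAP_TYPE_HASH);
        __uint(max_entries, 1024);
        __type(key, __u64);
        __type(value, __u32);
    } fdb SEC(".maps");

    static __always_inline int fwd(struct xdp_md *ctx, __u32 port)
    {
        /* Placeholder for the actual redirect logic. */
        return XDP_TX;
    }

    SEC("xdp")
    int bridge_forward(struct xdp_md *ctx)
    {
        __u64 dst_mac = 0;   /* filled by the parsing stage, omitted here */

        /* (a) Generated code: entries embedded at reload time, so the
         * most common destinations require no memory access. */
        switch (dst_mac) {
        case 0x0a0b0c0d0e01ULL: return fwd(ctx, 1);
        case 0x0a0b0c0d0e02ULL: return fwd(ctx, 2);
        default: break;
        }

        /* (b) Conventional fallback: per-packet map lookup. */
        __u32 *port = bpf_map_lookup_elem(&fdb, &dst_mac);
        if (port)
            return fwd(ctx, *port);

        return XDP_PASS;
    }

    char _license[] SEC("license") = "GPL";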

3.2.2.2 Code tailoring

A network function can have a different set of features that are not always needed at runtime. For example, our bridge supports both VLANs and Spanning Tree, but they may not be required (hence be turned off) at a given time. The amount of code needed to handle these features is not negligible and can impact the forwarding performance of the eBPF network function.


Learning 6: Our experiments showed that cutting the superfluous code at runtime brings a significant reduction in the number of control flows and branches of the program, hence simplifying the code and improving the overall performance of the service, as shown in Section 3.3.3.1.

3.2.2.3 Dynamic reloading

The previous two techniques can provide substantial performance gains; however, their value would be limited without the possibility to reload the program at runtime with a more appropriate version, while maintaining at the same time the state (e.g., maps) and configuration of the old program. Dynamic code reloading is currently supported in eBPF, but the existing software frameworks do not offer any help, leaving this responsibility in the developer’s hands and hence adding complexity when writing efficient network services. Our prototype code that supports this feature is strongly focused on reducing service disruption and packet loss (see Section 3.3.3). While the new service is compiled and injected, the old one still handles the traffic. When the new program is ready, the maps of the old instance are attached to it and the programs are atomically swapped by substituting the pointer to the old program with the new one. At this point, the new program starts processing the traffic, and the old one is unloaded.
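
The userspace side of this sequence can be sketched as follows with modern libbpf primitives (our prototypes rely on bcc, so this is an illustration of the mechanism rather than our actual code; file, map and program names are illustrative): the new object reuses the map file descriptors of the old instance, is loaded and verified, and is then attached to the interface, replacing the old program in a single operation.

    #include <bpf/libbpf.h>
    #include <bpf/bpf.h>
    #include <linux/if_link.h>

    int reload(int ifindex, int old_fdb_fd)
    {
        /* Open the recompiled object; error handling is simplified. */
        struct bpf_object *obj = bpf_object__open_file("bridge_new.o", NULL);
        if (!obj)
            return -1;

        /* Attach the maps of the old instance to the new program, so that
         * the runtime state (e.g., the filtering database) is preserved. */
        struct bpf_map *fdb = bpf_object__find_map_by_name(obj, "fdb");
        if (!fdb || bpf_map__reuse_fd(fdb, old_fdb_fd))
            return -1;

        if (bpf_object__load(obj))
            return -1;

        struct bpf_program *prog =
            bpf_object__find_program_by_name(obj, "bridge_forward");
        if (!prog)
            return -1;

        /* Attaching the new program to the hook replaces the old one;
         * the old program keeps serving traffic until this call returns. */
        return bpf_xdp_attach(ifindex, bpf_program__fd(prog),
                              XDP_FLAGS_DRV_MODE, NULL);
    }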

3.2.3 Data structures

eBPF does not have the concept of “raw” memory as used by classical computers; data are in fact stored in memory areas structured according to a predefined access model (e.g., hash map, LRU map, array). As of this writing, there are seventeen types of map that can be used by an eBPF program. Even though the existing set of maps is very large and fulfills the requirements of the majority of applications, in the next two subsections we present some cases in which it may not be enough.

3.2.3.1 Stack map

We may envision a service that needs to maintain a pool of elements that can be consumed (e.g., through a pop action to get the first free element of the pool) or produced (e.g., through a push operation to insert an item back into the pool) atomically, similar to the behavior of a stack; this is the case of a NAT service, which needs to keep the list of available TCP/UDP ports. Unfortunately, this type of data structure is not present among the set of maps available in eBPF. Although its behavior can be emulated using an array and a global counter, used as the index of the first element to retrieve, this emulation is subject to concurrency problems when multiple instances of the same program access the same data from different kernel threads, causing race conditions.
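
For reference, more recent kernels added a native stack map type (see also the note in Section 3.4), which removes the need for this emulation; a sketch of a NAT port pool built on top of it could look as follows (names and sizes are illustrative):

    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>

    /* Pool of free source ports for a NAT service. The map has no key:
     * elements are pushed and popped in LIFO order. */
    struct {
        __uint(type, BPF_MAP_TYPE_STACK);
        __uint(max_entries, 65536);
        __uint(key_size, 0);
        __uint(value_size, sizeof(__u32));
    } free_ports SEC(".maps");

    SEC("xdp")
    int nat_alloc(struct xdp_md *ctx)
    {
        __u32 port;

        /* Pop the first free port atomically; a negative return value
         * means that the pool is empty. */
        if (bpf_map_pop_elem(&free_ports, &port) < 0)
            return XDP_DROP;

        /* ... use 'port' for the new binding; when the session ends,
         * the port is returned to the pool with:
         * bpf_map_push_elem(&free_ports, &port, 0); */

        return XDP_PASS;
    }

    char _license[] SEC("license") = "GPL";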

3.2.3.2 Map with timeout

A typical scenario for networking functions is to have entries in a table with an associated timeout; when an entry is not accessed for a specific time interval, it expires and is removed from the table (e.g., the filtering database of a bridge). Unfortunately, eBPF does not have such a map. This behavior can be emulated by (i) inserting an additional field in the entry that holds the timestamp of the last update and (ii) checking, at every access, whether the item has expired; if so, the entry is deleted and the service continues as if the entry was not present. Obviously, entries that are no longer accessed will never be deleted unless an LRU (least recently used) map is used. This approach partly compensates for the lack of such a table in eBPF (indeed, it is the approach used to implement our 802.1D bridge), even if it complicates the data and control plane of the service, which must take care of discarding old entries, with possible race conditions.
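
The following fragment sketches this emulation for the filtering database of a bridge (the names and the aging interval are illustrative): each entry stores the timestamp of its last refresh and is deleted lazily when found expired at lookup time.

    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>

    #define FDB_AGING_NS (300ULL * 1000000000ULL)   /* 300 s */

    struct fdb_value {
        __u32 port;
        __u64 timestamp;    /* bpf_ktime_get_ns() of the last refresh */
    };

    struct {
        __uint(type, BPF_MAP_TYPE_LRU_HASH);  /* LRU evicts entries never touched again */
        __uint(max_entries, 4096);
        __type(key, __u64);                   /* destination MAC */
        __type(value, struct fdb_value);
    } fdb SEC(".maps");

    static __always_inline struct fdb_value *fdb_lookup(__u64 *mac)
    {
        struct fdb_value *val = bpf_map_lookup_elem(&fdb, mac);
        __u64 now = bpf_ktime_get_ns();

        if (!val)
            return NULL;
        if (now - val->timestamp > FDB_AGING_NS) {
            /* Expired: delete it and behave as if it was never there. */
            bpf_map_delete_elem(&fdb, mac);
            return NULL;
        }
        return val;
    }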

3.2.3.3 Concurrent map access

When a hook triggers an eBPF program in the kernel, multiple instances of the same eBPF application can be executed simultaneously on different cores. A normal eBPF map has a single instance across all cores and can be accessed simultaneously by the same eBPF program running on different cores. eBPF maps are native kernel objects that are protected through the kernel Read-Copy-Update (RCU) [163] mechanism, which makes their access thread-safe, regardless of whether the interaction occurs from userspace or directly from the eBPF program. The fact that map access is thread-safe does not exclude the presence of data races, given the implicit multi-threading nature of eBPF and the impossibility to use locks. The interaction between the control plane and the data plane is also subject to race conditions, since they do not have a standard synchronization mechanism. For example, in a network bridge, if the cleanup of the filtering database is performed in the control plane, we could create the following situation: (i) the control plane reads an entry from the filtering database and realizes that it is too old, so it must be removed; (ii) the data plane receives a packet for that entry and updates the filtering database with a new timestamp; (iii) the control plane eliminates the newly updated entry, producing an unexpected behavior.

Learning 7: We have noticed that map access is thread-safe, since these structures are protected by the RCU mechanism. However, a race condition can still happen, either between the control and data plane or within the same eBPF program running on different cores. Unfortunately, we have not yet found a definitive and general solution for all cases. It is, therefore, the developer who has to take care of this problem and find alternative solutions depending on the application logic.
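
Two workarounds we found useful in practice are sketched below (names are illustrative): per-CPU maps, which give each core its own copy of the value and therefore avoid races for statistics that do not need a globally consistent view (the control plane aggregates the per-CPU values when reading them), and atomic built-ins for counters that must be shared across cores.

    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>

    struct {
        __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);  /* one copy per core */
        __uint(max_entries, 1);
        __type(key, __u32);
        __type(value, __u64);
    } rx_packets SEC(".maps");

    struct {
        __uint(type, BPF_MAP_TYPE_ARRAY);          /* shared across cores */
        __uint(max_entries, 1);
        __type(key, __u32);
        __type(value, __u64);
    } shared_counter SEC(".maps");

    SEC("xdp")
    int count(struct xdp_md *ctx)
    {
        __u32 key = 0;

        __u64 *pc = bpf_map_lookup_elem(&rx_packets, &key);
        if (pc)
            (*pc)++;                       /* safe: the value is per-CPU */

        __u64 *sc = bpf_map_lookup_elem(&shared_counter, &key);
        if (sc)
            __sync_fetch_and_add(sc, 1);   /* atomic add on the shared value */

        return XDP_PASS;
    }

    char _license[] SEC("license") = "GPL";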


3.2.4 High performance processing with XDP

XDP provides a mechanism to run eBPF programs at the lowest level of the Linux networking stack, directly upon receipt of a packet and immediately out of the driver receive queues. It has two operating modes. The first one, called Driver (or Native) mode, is the primary mode of operation; to load eBPF programs at this level, the driver of the netdevice must support this model. Running network applications in XDP produces significant performance benefits (as shown in Section 3.3.4), since the application can perform operations on the packet (e.g., redirect, drop or modify) before any allocation of kernel meta-data structures such as the skb, spending fewer CPU cycles to process the packet compared to the conventional stack delivery. The second one, called Generic (or SKB) mode, allows using XDP with drivers that do not have native support for it, providing a simple way to use and test XDP programs with fewer dependencies. In Section 3.2.4.1 we show the main differences between XDP and the other network hook point, i.e., TC; we then present the main drawbacks found in the current support of the Linux kernel for both XDP Driver (Section 3.2.4.2) and XDP Generic mode (Section 3.2.4.3).

3.2.4.1 Limited helpers

XDP programs are only allowed to call a subset of helpers compared to eBPF services attached to the Traffic Control (TC) layer. In general, eBPF has a set of base helpers (e.g., map lookup/update, tail calls) available for all types of programs, plus some specific helpers for each category of hook; approximately 29 are available in TC and only 7 in XDP. We summarize the main differences in the following list:

• Checksum calculation: Primitives for checksum computation were not fully available in XDP as they are in TC. This is going to be fixed in the upcoming version 4.16 [31] of the Linux kernel, but it prevents an eBPF program exploiting this feature from being executed on an older kernel. In that case, the solution is to recompute the checksum “by hand”, with dedicated code in the XDP program.

• Push/Pop headers: XDP does not offer any helper to push and pop a VLAN tag from the packet or to perform tunnel encapsulation or decapsulation. In case this feature is needed, the XDP program can use the more generic bpf_xdp_adjust_head() helper, which provides the ability to adjust the starting offset of the packet along with its size, so it is possible to manipulate the packet according to the application logic (a minimal sketch is shown after this list).

• Multi-port transmission: As already highlighted in Section 3.2.1.3, an equivalent of the bpf_skb_clone_redirect() helper available in TC is missing in XDP. This does not allow forwarding a packet on several ports at the same time,

which is required by different network applications (e.g., bridge, router), unless we implement this feature in the slow path.
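
As an example of the use of the generic helper, the following sketch pops the outermost 802.1Q tag in XDP with bpf_xdp_adjust_head(); it is a simplified illustration that handles a single tag only.

    #include <linux/bpf.h>
    #include <linux/if_ether.h>
    #include <bpf/bpf_helpers.h>
    #include <bpf/bpf_endian.h>

    struct vlan_hdr {
        __be16 h_vlan_TCI;
        __be16 h_vlan_encapsulated_proto;
    };

    SEC("xdp")
    int vlan_pop(struct xdp_md *ctx)
    {
        void *data = (void *)(long)ctx->data;
        void *data_end = (void *)(long)ctx->data_end;
        struct ethhdr *eth = data;
        struct vlan_hdr *vh = data + sizeof(*eth);
        struct ethhdr eth_copy;

        if ((void *)(vh + 1) > data_end)
            return XDP_PASS;
        if (eth->h_proto != bpf_htons(ETH_P_8021Q))
            return XDP_PASS;

        /* Save the MAC addresses and take the inner EtherType from the tag. */
        __builtin_memcpy(&eth_copy, eth, sizeof(eth_copy));
        eth_copy.h_proto = vh->h_vlan_encapsulated_proto;

        /* Shrink the packet by the 4-byte 802.1Q tag from the front. */
        if (bpf_xdp_adjust_head(ctx, (int)sizeof(*vh)))
            return XDP_DROP;

        /* The helper invalidates packet pointers: reload and re-check them. */
        data = (void *)(long)ctx->data;
        data_end = (void *)(long)ctx->data_end;
        eth = data;
        if ((void *)(eth + 1) > data_end)
            return XDP_DROP;

        /* Rewrite the Ethernet header at its new position. */
        __builtin_memcpy(eth, &eth_copy, sizeof(eth_copy));
        return XDP_PASS;
    }

    char _license[] SEC("license") = "GPL";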

Learning 8: Writing programs with XDP does not present significant differences compared to TC; most of the actions, such as direct packet modification, access to maps or the use of tail calls, remain identical to the other hook points. However, the limited number of helpers forces the developer to use more generic functions or to implement those functionalities in the slow path, complicating the code and making it less portable, with the consequence that the code must differ depending on the kernel hook to which it is attached.

3.2.4.2 XDP Driver mode limitations

Most XDP-enabled drivers today use a specific memory model (e.g., one packet per page) to support XDP on their devices. Among the different actions allowed in XDP, there is the possibility to redirect the packet to another physical interface (XDP_REDIRECT). While this action is currently possible within the same driver, to the best of our understanding it is not possible between interfaces of different drivers. The main problem is the lack of a common layer/API that the drivers can use to allocate and free pages. With this model, when a driver performs a packet transmission, it can communicate the actual sending back to the Rx driver, which may recycle the page without having to run into costly DMA unmap operations4. This lack of generality limits the network applications that can take advantage of the speed gain provided by XDP, which have to rely on the normal stack processing to perform the correct forwarding of the packet.

3.2.4.3 Generic XDP limitations

As previously mentioned, XDP Generic can be used to run XDP programs even on drivers that do not have native XDP support. Although such programs are executed right after the skb allocation, thus losing the advantages available in the Driver mode, this mode still provides better performance than other hook points such as TC, as shown in Section 3.3.4. When triggered, XDP Generic programs can modify the content of the received packet; however, if the packet data are part of a cloned skb, an XDP program cannot be executed on that packet, since cloned skbs cannot be modified. This leads to some limitations, such as in handling TCP traffic; in fact, our network function running on the XDP Generic hook was never able to receive TCP traffic, since most of the packets belonging to TCP sessions are cloned, in order to be later retransmitted if necessary.

4An interesting discussion on this topic is available in [37]. A patch in that direction is available in [38].


Learning 9: Although writing programs compatible with both XDP and TC is not a significant problem, their use is not interchangeable. Using XDP programs as substitutes for TC services is not always possible, and it is advantageous only for specific applications. For example, connecting containers with XDP services may not be appropriate, since most of the advantages given by the early stage at which frames are captured would be lost. The XDP hook has been designed to work mainly on ingress, making it tricky to model services such as a firewall that would need, for example, to capture packets generated by the host and leaving the network interface; this is instead possible using TC as a hook point both in ingress and in egress.

3.2.5 Service function chaining

The possibility to connect eBPF programs through tail calls in the kernel facilitates the combination of network services (e.g., bridge, router, NAT) in a virtual chain, with considerable advantages as shown in [1]. In this way, eBPF services can be connected either (i) through a netdevice or (ii) through a tail call, directly to another eBPF module. However, this requires the creation of different source code depending on the port the eBPF program is attached to, since the assembly instructions used in the two cases are different, which is an unnecessary complication for a developer.

Learning 10: To make the internal logic of the service independent of the connection type, we can introduce the concept of a virtual port, which is used by the network function to receive and forward the traffic, and we can dynamically generate the proper source code for any given port. However, the creation of this level of abstraction, so that eBPF programs are independent of the type of interconnection with the outside world, is certainly possible but requires a significant effort from the programmer, as it is not explicitly foreseen by any available framework.

3.3 Experimental Evaluation

This section provides experimental evidence for the topics discussed in Section 3.2, showing the impact of the main eBPF limitations and the improvements made in our prototypes. This evaluation leverages some of the network services we have implemented, hence exploiting real applications as a test bench for our measurements.

3.3.1 Test environment and evaluation metrics

Our testbed encompasses two machines (Intel i7-4770 CPU running at 3.40GHz, four cores plus hyper-threading, 8MB of L3 cache and 32GB RAM) physically

connected to each other through two direct 10Gbps links terminated in an Intel X540-AT2 Ethernet NIC. Both machines run Ubuntu Server 16.04.4 LTS, kernel 4.14.15, with the eBPF JIT flag enabled. Throughput was tested by generating a unidirectional stream of 64B packets with Pktgen-DPDK 3.4.9, with a rate that is dynamically adjusted to achieve no more than 1% packet loss; depending on the test, packets may be looped back to the sender machine. Latency tests were carried out with Moongen [58], which generates the same traffic pattern as before but exploits the hardware timestamp on the NIC to determine the traveling time of a frame when it returns to the sender; by default, one frame every millisecond is sampled. All tests were repeated ten times and the figures contain error bars representing the standard error calculated from the different runs. Unless explicitly stated, all tests generate traffic so that only one CPU core is involved in the processing. Consequently, the throughput in a real deployment can be much higher than the reported values, thanks to the session-based traffic load balancing provided by the Linux kernel, which automatically exploits multiple CPU cores for the processing.

3.3.2 Overcoming eBPF limitations

3.3.2.1 Slow-path forwarding performance

Section 3.2.1 showed the most important eBPF limitations, which prevent the implementation of all the features required by complex network applications in the data path; such features can be delegated to a more flexible userspace component, although with reduced performance. To validate this module, we used our 802.1D bridge with two ports connected to the physical interfaces of the machine under test. The bridge service uses the slow path when a packet with an unknown destination MAC address is received; in that case, the packet is sent to the slow path module, which floods it to all output ports. To test this feature, we forced our bridge to send each packet to the slow path, from where it is forwarded to the output port. Figure 3.1a compares the throughput measured at the different networking hook points, i.e., Traffic Control, XDP Generic and XDP Native, between the slow path and the fast path. We notice that, while the fast path forwarding varies depending on the hook point used (XDP performs better than Traffic Control), the same is not valid for the slow path; the latter, in fact, uses the same mechanism to send the packet to userland regardless of the hook point type to which the eBPF program is

5We noticed that with newer kernels (e.g., 4.15/4.16) there is a marked performance deterioration, as shown in Figure 3.1a, supposedly due to the fixes introduced after the disclosure of the Meltdown and Spectre vulnerabilities.


[Figure 3.1 plots: (a) throughput (Mpps) of the fast path (kernels 4.14.1 and 4.16rc4) and of the slow path at Traffic Control, XDP Generic and XDP Native; (b) latency (µs) versus offered throughput (Mpps) for Traffic Control, XDP Generic, XDP Native and the slow path.]

Figure 3.1: Throughput (left) and latency (right) for the Bridge service when redirecting packets entirely in the fast path and when using the slow path.

attached, hence representing the bottleneck for this test. Moreover, we notice that processing a packet in the slow path takes about three times longer than the same processing done in the kernel. In this case, the maximum forwarding rate is about 0.3Mpps regardless of packet size, since most of the time is spent moving the packet to userspace (and back), significantly reducing the forwarding speed. Figure 3.1b reports the latency measured in the same conditions as before; we notice that, as long as the traffic remains below the maximum throughput measured for each hook point, the latency values are almost the same regardless of the hook type; this is due to the batching mechanisms adopted by the driver, which introduce a fixed cost on the packet processing. On the other hand, slow path processing incurs a higher overhead due to the additional copy of the packet between kernel and userspace. Also, we notice how the latency grows when approaching the maximum value measured in the previous test; in this case, the packet loss increases considerably, with a consequent delay in the packet processing. For example, the throughput of the bridge service measured at Traffic Control is 2.22Mpps, and the peak in latency is seen at 2.5Mpps (in the graph, we do not show the absolute value for readability reasons, but it is, in all three cases, over one millisecond). The same behavior is observed with XDP Generic and XDP Native, where the spike is at 3Mpps and 4Mpps respectively. These numbers indicate the importance of performing most of the actions in the eBPF fast path, reducing the number of trips to the slow path. It is worth noticing that in real cases we have seen only a few packets raised as exceptions and sent through the slow path, with close to 100% of the packets handled in the kernel for normal applications.


3.3.3 Enabling more aggressive service optimization

3.3.3.1 Code tailoring

To evaluate the potential benefits of this technique, we took our bridge service and changed its features at runtime, enabling VLAN and Spanning Tree (STP) support and measuring the final throughput with the same traffic, independently of the enabled features. Figure 3.2a shows how the complexity increases, in terms of BPF instructions, when functionalities requiring more complex actions are enabled. For instance, handling STP requires parsing and recognizing BPDUs, which also requires the intervention of the control plane. This complexity is then reflected in the overall forwarding performance of the service, where we can see a throughput drop of up to 20% between the baseline, in which the service acts as a simple L2 learning bridge, and the full version of the bridge that supports VLAN and STP. Note that this performance improvement would not be possible without code tailoring and dynamic reloading, which would force the user to work with the full version of the bridge even when those features are not required.

3.3.3.2 Moving configuration data from memory to code

To estimate the benefits of this technique, we used our bridge service, comparing the actual throughput at the three different hook points when the optimization is enabled and when it is not, as shown in Figure 3.2b. In this specific case, the optimization consists in moving the content of the filtering database from memory (i.e., a map) directly into the code, with the entries statically embedded in it, for example via a switch-case on the destination MAC address to forward the packet on the right port. Obviously, we set a maximum limit to the number of entries directly written in the code. This technique optimizes the most common case by statically inserting the entry within the code, hence avoiding (costly) memory accesses, with performance benefits ranging from 5% to 10%, and with the possibility of obtaining higher gains by combining this technique with the previous one.

3.3.3.3 Reloading

This technique is heavily used in our services (in conjunction with the ones described above) to adjust the code injected in the kernel to the updated runtime service parameters. Table 3.1 shows the cost of this technique by comparing the reloading time for different services, which can be split into two parts. The first one is the time needed to compile the C source code into eBPF assembly instructions, while the second is the time required to inject the program in the kernel, which involves a pass through the in-kernel verifier. We notice that the compilation phase consumes most of the reloading time (we leverage bcc [17] to appropriately package eBPF modules in the kernel) while only


[Figure 3.2 plots: (a) throughput (Mpps) and code size (eBPF LOC and BPF instructions) for the Baseline, +STP, +VLAN and +VLAN+STP configurations; (b) throughput (Mpps) with and without the optimization at Traffic Control, XDP Generic and XDP Native.]

Figure 3.2: Effect on the end-to-end throughput of the code tailoring technique (left) and of moving configuration data from memory to code (right).

Table 3.1: Reloading time of various eBPF services

Service     C LOC   BPF Insn   Compilation, Kernel .h (ms)   Compilation, UAPI .h (ms)   Injection (ms)
Firewall*    1094       1564                          8683                         850               50
LBDSR         305        910                           889                         830                6
LBRP          470        723                           885                         853                2
Router        331        458                           885                         799                1
Bridge        243        464                           854                         236                2
NAT           564        441                           847                         809                1
DDoS          136         74                           806                         642                1

* This service uses a chain of eBPF programs; the time shown refers to the time needed to compile and inject the entire chain, which comprises several eBPF programs.

a small part, around 1%, is spent on the in-kernel injection. Also, we realized that using the standard Linux kernel headers introduces a considerable overhead, as they internally include other headers that are often not needed by our eBPF programs. By carefully selecting the headers present in the uapi directory, which are often smaller and more compact, we noticed a significant reduction in compilation time, as shown for the firewall service in Table 3.1. In fact, this service uses multiple eBPF programs in the data path; optimizing the compilation time of every small program reduces the overall reloading time of the service. It is important to notice that bcc provides additional primitives to facilitate the interaction with the eBPF ecosystem; during the compilation phase, the code is rewritten mapping the bcc-provided

helpers into the corresponding eBPF functions. The compilation time shown in Table 3.1 is the sum of three phases (of approximately the same duration) during which the code is pre-processed, rewritten and ultimately compiled, producing the final BPF assembly code. Some works, such as [42], keep compiled versions of their services and then perform an optimization directly on the compiled code, without this additional overhead. However, we believe that this mechanism limits the potential of the previous techniques, reducing their possible optimizations. Although in this case the overall reloading time is not negligible, as explained in Section 3.2.2.3 we swap the existing program with its optimized version only after the latter has been compiled and injected, thus avoiding any service disruption.

3.3.4 High performance processing with XDP

XDP Native. The performance benefits of Native XDP services are evident from the previous figures. In fact, attaching the same program in XDP Native mode leads to an increase in performance of about 65% (Figures 3.1a-3.1b), allowing higher throughput to be achieved. In addition, when comparing XDP programs with the other hook points at the same throughput, XDP brings a significant reduction in CPU consumption, due to the lower overhead of packet management. This speed comes with the limitation that programs of this type can only be used when the entry points of the chain are physical or virtual interfaces6, while they must fall back to the normal stack delivery in case of different workloads (e.g., containers).

XDP Generic. The previous figures show that XDP Generic provides about a 25% performance improvement compared to Traffic Control (TC), allowing us to conclude that it can be used to speed up the performance of services even for drivers that do not have native XDP support. Generic XDP programs may be directly attached to virtual ethernet interfaces (veth), providing services to workloads such as containers, hopefully with a performance increment. However, this feature may be difficult to apply in real-world scenarios, due to the problem mentioned in Section 3.2.4.3.

3.3.5 Service function chaining

The ability to directly connect eBPF services to each other in the kernel (Section 3.2.5) is a significant advantage of this technology. This section evaluates the overhead of tail calls at the three different hook points by using a simple program that forwards a packet between two physical interfaces, but whose code

6Recently, support for receiving frames with XDP has been added to the tuntap driver [162].

31 3 – Creating Network Service with eBPF: Experience and Lessons Learned


Figure 3.3: End-to-end throughput with an increasing number of tail calls.

includes a growing number of (empty) tail calls. As shown in Figure 3.3, the overhead of the tail calls is almost negligible up to 8, which is enough for most applications, while it increases significantly with 16 and 32; the reason for this additional overhead is the presence of indirect jumps, which are converted into retpolines [30] to protect against the branch target injection vulnerability (a.k.a., Spectre).

3.4 Conclusions

This chapter presents our experience in developing complex network applications with eBPF, an up-and-coming technology that allows executing code at run-time in the Linux kernel, without the need to package custom kernel modules. We described the main limitations of this technology, demonstrating how, in most cases, they can be circumvented without affecting the necessities of real network applications. In other cases, we have proposed alternative solutions to overcome these limitations. We also identified and discussed several ideas related to the possibility of injecting code dynamically in the kernel, which opens the way to several new optimization strategies. Finally, we verified the real applicability of the proposed ideas and the consequent performance advantages, exploiting the several applications we created for our experiments. We are confident that this type of practical lessons can positively influence ongoing development and advancements in the field of eBPF-based network services and applications. Note: The eBPF subsystem is in continuous development and new features, as well as performance improvements, are added daily. As a consequence, it is possible that some performance results or limitations presented in this chapter are now obsolete. For instance, support for stack and queue maps was added recently, the overhead

given by consecutive tail calls has been almost removed [32] and the maximum number of instructions for every eBPF program has been increased to 1 million [150]. On the other hand, most of the limitations and results shown in this work are still valid, since they are part of the security model of the eBPF subsystem.

Chapter 4

Polycube: A Framework for Flexible and Efficient In-Kernel Network Services

4.1 Introduction

Network Functions Virtualization (NFV) enables network services to be transformed into pure software images that are executed on standard servers. This technology guarantees lower costs thanks to the reduction of the number of physical appliances [59] and to the possibility to rely on (cheap) commodity hardware. At the same time, it enables more agile services thanks to the click-and-play nature of the software. The most common approach to NFV is through a set of (chained) VMs or containers, connected by means of a virtual switch [127]. This often includes heterogeneous applications, built by different vendors, with diverse characteristics, e.g., in terms of configuration protocols and life-cycle management, which complicates day-by-day operations [123]. This heterogeneity also impairs the possibility to achieve higher throughput through cross-NF optimizations, as each application operates in isolation. For instance, even simple approaches, such as zero-copy or shared memory between cascading functions (e.g., [20, 79, 123, 134]), are often very difficult to deploy in practice. Furthermore, advanced features such as service decomposition [96], fault tolerance [143] and high availability [67, 129] have to be provided separately for each VNF, complicating the design of the control plane and making the system more complex [63]. Finally, the computational requirements of the above VNFs are often huge, due to the number of different components involved (e.g., hypervisors, VMs with their guest operating systems, vSwitch, etc.), not to mention the cost of moving a packet along its journey, due to the many components traversed and the several transitions between kernel and user space. Driven by the above-mentioned issues, several NFV frameworks [125, 73, 123] have been presented with the goal of providing both a programming model (for building NFs) and an execution environment (for running NFs), together with a set of common abstractions that avoid re-implementing the same functionality across different services. The majority of the state-of-the-art NFV frameworks rely on kernel-bypass approaches (e.g., DPDK [54], Netmap [133]) to provide high-level performance in terms of throughput and latency. Although they bring unquestionable performance improvements, there are some scenarios where they may not be appropriate or where their performance is limited, as we have seen in the previous chapters. On the other hand, the alternative to them is to perform the processing entirely in the kernel. A few years ago, this would have been a “dumb” idea; the monolithic design of the Linux kernel was a great limitation for two desired properties of network functions: performance and flexibility. Network applications relying on kernel components (e.g., OvS [127]) have proven hard to evolve due to the cost of managing kernel modules or of dealing with a large and complex codebase, “convincing” maintainers of their usefulness, and waiting for a new release; a process that may take several months, or even years.
However, the recent introduction of the extended Berkeley Packet Filter has brought a new application deployment method for the Linux kernel, reviving the opportunity to achieve (i) the desired flexibility, thanks to the possibility to dynamically inject userspace-defined programs into the kernel without having to restart the machine or rely on custom kernel modules, and (ii) performance, since it is possible to “skip” specific sections of the TCP/IP stack when not needed. In this chapter we present Polycube, a novel software framework that enables the creation, deployment and management of in-kernel network functions. In particular, it (i) offers the possibility to implement complex functions whose data plane is executed in the kernel context, (ii) enables the creation of arbitrary service chains, hence simplifying the creation of complex services through the composition of many elementary components, and (iii) offers a service-agnostic interface that decouples the control and management logic (which is generic and valid for all services) from the actual service logic, hence enabling the dynamic and seamless deployment of arbitrary network services. In summary, this chapter makes the following contributions:

• We show the design and architecture of Polycube (Section 4.3). Although Polycube contains several ideas and concepts that are shared with existing NFV frameworks (e.g., modularity, functional service decomposition), it applies them to the kernel world, which implies different challenges and design solutions. To the best of our knowledge, Polycube is currently the only open-source framework that provides a unified architecture for the development, execution and chaining of in-kernel network functions.


• We provide a description of the APIs and abstractions offered to developers to simplify the development of new services (Section 4.4).

• We demonstrate the practical benefits of the Polycube programming model with a complex application, namely a network provider plugin for Kubernetes (Section 4.8).

Polycube source code, documentation and implementations of the various network services are available on GitHub [12].

4.2 Design Goals and Challenges

The main objective of Polycube is to provide a common framework for network function developers, bringing the power and innovation promised by NFV to the world of in-kernel packet processing. This has been made possible in recent years thanks to the introduction of eBPF. However, eBPF was not created with this goal in mind; it serves only as a generic virtual machine that enables the execution of user-defined programs in the kernel, attaching them to specific points in the Linux TCP/IP stack or to generic kernel functions (e.g., kprobes).

Common structure and abstractions for developing in-kernel NFs. The eBPF subsystem and, more generally, in-kernel applications do not have the concept of virtual ports from which the traffic is received or sent out. Moreover, as we have seen in Chapter 3, the realization of complex network functionalities is not always possible in eBPF, given its security model, which is enforced by the in-kernel verifier to ensure that the execution of the program does not harm the system. For instance, no abstractions currently exist to implement the (complex) control plane of a service, hence forcing developers to dedicate a considerable amount of time to handling common control plane operations (e.g., user-kernel interaction). Polycube must provide a common programming framework and models that allow developers to use high-level abstractions to solve common problems or known limitations in an efficient and transparent way, while the framework optimizes the implementation of those abstractions, ensuring high performance.

Simple management and execution of the NFs. Polycube must allow external operators (e.g., SDN controllers, orchestrators, network administrators) to configure in-kernel functionalities to support a diverse set of use cases. This implies the possibility to compose and configure the datapath functionalities or to dynamically upgrade or substitute a given service at runtime. A clear separation between the in-kernel data plane and the control plane is desirable, so that it becomes easier to dynamically regenerate and reconfigure the data path to implement the user policies.

Programmable and extensible service chaining. In an NFV environment, a packet is typically processed by a sequence of NFs, possibly running at the same time and manufactured by multiple vendors. Polycube must enable the creation of chains of NFs in the kernel, guaranteeing the correct forwarding sequence and the same degree of isolation between them.

4.3 Architecture Overview

This section first introduces the main ideas that inspired the design of Polycube; then it presents the resulting software architecture and the most significant implementation details.

4.3.1 Unified Point of Control

All network functions within Polycube feature a unified point of control, which enables the configuration of high-level directives such as the desired service topology. In addition, it facilitates the provisioning of cross-network function optimizations that could not be applied with separately managed services. Polycube supports this model through a single, service-agnostic, userspace daemon, called polycubed, which is in charge of interacting with the different network function instances. Each type of virtual function is called a Cube; Cubes are similar to plugins that can be installed and launched at runtime. A new Cube can be easily added to the framework through a specific registration phase, in which the service sends the information required for its identification, such as the service type or the minimum kernel version required to run the service. When the service is registered, different instances of it can be created by contacting polycubed, which acts mainly as a proxy; it receives a request from a northbound REST interface and forwards it to the proper service instance, returning the answer to the user.

4.3.2 Structure of Polycube services

Each Polycube service is made up of a control plane and a data plane. The data plane is responsible for per-packet processing and forwarding, while the control and management plane is in charge of service configuration and non-dataplane tasks (e.g., routing protocols). Although this separation between control and data plane is common in many network function architectures, Polycube makes it explicit; each service is composed of a set of standard parts that make it easier for programmers to implement the desired behavior, while Polycube takes care of creating all the surrounding glue logic, handling all the interactions and communications between the different components.


[Figure 4.1 depicts the high-level architecture: in user space, polycubed with its REST API, management interface and kernel abstraction layer, plus one service module per service (management/control plane, service library, slow path); in kernel space, the eBPF fast path of each Cube, with an encapsulator/decapsulator connecting the Cubes.]

Figure 4.1: High-level architecture of the system

4.3.2.1 Data plane

The data plane portion of a network service is executed per packet, with the consequent necessity to keep its cost as small as possible. A Polycube service data plane is composed of a fast path, namely the eBPF code that is injected into the kernel, and a slow path, which handles packets that cannot be fully processed in the kernel or that would require additional operations that would slow down the processing of the other packets.

Fast path. When fired, the fast path retrieves the packet and its associated meta-data from the receive queues, then it executes the injected eBPF instructions. Typical operations are usually very fast, such as packet parsing, lookups in memory (e.g., to classify the packet), and map updates (e.g., storing statistics for further processing). When those operations are carried out, the fast path returns a forwarding decision for that particular packet or sends it to the slow path for further processing. Polycube supports the composition of a service fast path as a set of micro functional blocks, which can be created by stitching multiple eBPF programs together, controlled and injected separately from the control plane using the Polycube service-independent APIs. This set of eBPF micro-blocks, called micro-cubes (µCubes), is part of a single Cube instance and is handled by a single control plane and slow path module, as opposed to different Cube instances, which have their own controllers. This modular design introduces the necessity to specify an order of

execution of the µCubes inside the NF; when a packet reaches a Cube composed of different micro-blocks, Polycube has to know the first module to execute, which in turn will trigger the execution of the other µCubes in an arbitrary order based on its internal logic. To do this, Polycube introduces the concept of the HEAD µCube, which is unique within the Cube and represents the entry point of the entire service, whose execution is triggered upon the reception of a packet on a port. From there, it can “jump” to the other µCubes. Having a different set of eBPF programs, each one performing a specific function, is useful in particular for two reasons. First, it allows the developer to handle each feature separately, enabling the creation of loosely coupled services with different functionalities (e.g., packet parsing, classification, field modification) that can be dynamically composed and replaced; each single µCube can be substituted at runtime with a different version or directly removed from the chain if its features are not needed anymore. Second, it can be useful to overcome some well-known eBPF limitations, such as the maximum size of an eBPF program or the inability to create unbounded loops in the code.

Slow path. Although eBPF offers the possibility to perform some complex and arbitrary actions on packets, it suffers from some well-known limitations due to its restricted virtual machine, which are however necessary to guarantee the integrity of the system. Those limitations may impair the flexibility of the network function, which (i) may not be able to perform complex actions directly in the eBPF fast path or (ii) could slow down its execution by adding more instructions in the fast path to handle exceptional cases. To overcome those limitations, Polycube introduces an additional data plane component that is no longer limited by the eBPF virtual machine and can hence execute arbitrary code. The slow path module is executed in userspace and interacts with the eBPF fast path using a set of components provided by the framework. The eBPF fast path program can redirect packets (with custom meta-data) to the slow path, similar to Packet-In messages in OpenFlow. Similarly, the slow path can send packets back to the fast path; in this case, Polycube provides the possibility to inject the packet into the ingress queue of the network function port, simulating the reception of a new packet from the network, or into the egress queue, hence pushing the packet out of the network function.

4.3.2.2 Control and management plane

The control plane of a virtual network function is the place where out-of-band tasks, needed to control the data plane and to react to possibly complex events (e.g., routing protocols, Spanning Tree), are implemented. It is the point of entry for external players (e.g., a service orchestrator or the user CLI) that need to access the service's resources, modify (e.g., for configuration) or read service parameters (e.g., reading statistics) and receive notifications from the service fast path or slow path. Polycube defines a specific control and management module that performs the previously described functions. It exposes a set of REST APIs used to perform the typical CRUD (create-read-update-delete) operations on the service itself; these APIs are automatically generated by the framework starting from the service description (i.e., a YANG model of the service), removing this additional implementation burden from the programmer. To interact with the service, an external player has to contact polycubed, which uses the service instance contained in the URL to identify to which service the request is directed and dispatches it to the corresponding service control path, which in turn serves the request by modifying its internal state or by reflecting the changes to the service data path instance. More details on how the control plane and the REST APIs of each service are automatically generated are presented in Section 4.6.

4.3.3 Remote vs Local services

The separation between the data and control plane makes it possible to execute the two components separately, not necessarily on the same server. While the former runs in the server, the latter can be executed either locally or remotely. Polycube supports both local services, installed as local applications on the same server as polycubed and interacting with it through direct calls, and remote services, possibly running on a different machine and communicating with polycubed through an RPC mechanism. When a new service is plugged into the framework, it communicates to polycubed whether it is local or remote. In the first case, the path to the service executable (i.e., a dynamic library file) is specified and polycubed loads the library at runtime, forwarding the requests to the service control path as normal function calls. In the second case, the service is registered by providing the remote RPC endpoint; all subsequent requests for that service will then be redirected through the RPC channel. Polycube provides a management interface that allows controlling any service data plane, regardless of the service type and structure, hence being agnostic to the control plane location. It provides access to any registered service in the same way from its REST interface, facilitating the service developer, who does not have to deal with the low-level details of the communication with the daemon.

4.4 APIs and Abstractions

Polycube provides a set of high-level APIs and abstractions to the developers to simplify the writing of a new service, both from the control plane and the data plane point of view. For example, it adds useful abstractions to manage special packets, to cope with special processing that may complicate (and slow down) the fast path, or to react to special events such as timeouts. Polycube implements such functions as additional methods loaded and compiled together with the user-provided data plane code; they will then be part of the final eBPF object file that is injected into the kernel. Table 4.1 shows some of the main helper functions introduced by Polycube at the different levels of the network service code, i.e., the eBPF fast path, the slow path and the control and management plane.

Level            | Helper function          | Arguments       | Description
Fast path (eBPF) | pcn_pkt_send             | md, out_port    | Redirect a pkt to a VNF interface (physical or virtual)
Fast path (eBPF) | pcn_pkt_controller       | reason          | Send a pkt to the slow path with a given reason
Fast path (eBPF) | pcn_pkt_controller_md    | md, reason      | Send a notification to userspace with a specific reason
Fast path (eBPF) | pcn_call_ingress_program | index           | Call the µCube at a given index attached to the ingress pipeline
Fast path (eBPF) | pcn_call_egress_program  | index           | Call the µCube at a given index attached to the egress pipeline
Slow path (user) | pcn_packet_in            | pkt, md, reason | Callback executed when a notification is sent to userspace
Slow path (user) | pcn_send_packet_out      | pkt, dir        | Send a packet out to the ingress or egress pipeline
Fast/slow path   | pcn_log                  | level, txt      | Print debug messages with a given verbosity level
Control plane    | pcn_reload               | code, idx       | Reload the µCube at a given index with the new code

Table 4.1: Helper functions provided by Polycube at the different levels of the NF.
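To give a feeling of how these helpers appear in a service, the fragment below is a hedged sketch of a minimal fast path that forwards packets arriving on one virtual port and punts everything else to the slow path; the reason code is an illustrative constant, and the helper signatures follow the three-argument calling style that also appears in the service listings of Section 4.7.2 rather than the authoritative Polycube headers.

    /* Hedged sketch of a minimal fast path using the Table 4.1 helpers.
     * REASON_NEEDS_INSPECTION is an illustrative, application-defined value. */
    enum { REASON_NEEDS_INSPECTION = 1 };

    int handle_rx(struct CTXTYPE *ctx, struct pkt_metadata *md) {
        /* Packets received on virtual port 0 are handled entirely in kernel... */
        if (md->in_port == 0) {
            pcn_log(ctx, LOG_DEBUG, "Forwarding packet from port %d", md->in_port);
            return pcn_pkt_redirect(ctx, md, 1);   /* send out on virtual port 1 */
        }
        /* ...anything else is delegated to the slow path with a reason code. */
        pcn_pkt_controller(ctx, md, REASON_NEEDS_INSPECTION);
        return RX_DROP;
    }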

4.4.1 Transparent port handling

A Polycube network service instance is composed of a set of virtual ports that are uniquely identified through a name and an index inside the service itself. Each port of the service can be attached to a Linux network device or to another service port by means of the peer parameter. When the fast path of the service decides to redirect the packet to a specific output port, it can use the pcn_pkt_redirect() function to send the packet to the next hop, whether it is a net-device or another Polycube service. Although the implementations for the above two types of next hops are quite different, Polycube hides this difference by providing a generic helper that receives the virtual index of the output port and, if the port is connected to a netdevice, redirects the packet to the attached netdevice, otherwise jumps directly to the next Polycube network function in the chain.

Ports connected to the Linux TCP/IP stack. One of the significant advantages of eBPF is that it is deeply tied to the Linux kernel, providing the same safety guarantees as the kernel as well as access to primitives and functionalities available within it, without having to implement them from scratch. A Cube may want to customize just a small portion of the entire journey of a packet through the Linux TCP/IP stack (e.g., performing rate limiting or DDoS mitigation); in this scenario, after being processed by the Cube, the packet should follow the usual processing flow through the networking stack. Polycube supports this scenario through the introduction of a special port value, called Host Port, which connects a port of the Cube directly to the Linux networking stack. When a Cube redirects a packet to this port, either from the control or the data plane, Polycube transparently sends it to the host kernel networking stack, where it continues the standard processing.

4.4.2 Fast-slow path interaction

In Polycube, each instance of a service has its own private copies of the fast and slow paths; Polycube takes care of service isolation by delivering packets generated by the fast path to the corresponding slow path instance and vice versa. It uses two separate (hidden from the developers) eBPF programs, the Encapsulator and the Decapsulator, which are instantiated and injected into the kernel when polycubed is started.

Encapsulator. Figure 4.2a shows the flow of operations performed when sending a packet from the eBPF fast path to the slow path running in userspace. If the packet currently processed in the eBPF service fast path requires additional inspection or further processing, it can be sent to the slow path module of the service by means of the pcn_pkt_redirect_controller() helper. This function receives as parameters the reason why the packet has to be sent to the slow path and, optionally, additional meta-data fields. Polycube hides the implementation details of the communication between the eBPF fast path program and the service slow path; it sends the packet to an eBPF control module (the Encapsulator shown in Figure 4.2a), which copies the packet and its meta-data into a perf ring buffer that is used by polycubed to read the corresponding data from userspace1.

Figure 4.2: Message flow for the Encapsulator (a) and the Decapsulator (b).

Together with the custom metadata, the Encapsulator adds some internal information, such as the index of the Cube instance that generated the message, which is used by the Polycube daemon to call the packet_in() function of the associated service's slow path.
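Under the hood, the kernel-to-userspace channel relies on the standard eBPF perf-event machinery (see footnote 1). The fragment below is a generic, BCC-style sketch of how an eBPF program can push a small event record into a perf ring buffer; it is illustrative only and is not Polycube's actual Encapsulator, which also copies the packet payload and additional metadata.

    /* Generic BCC-style sketch (not the real Encapsulator): publishing a small
     * event record to a per-CPU perf ring buffer that userspace drains. */
    struct event {
        u32 cube_index;   /* which Cube generated the message            */
        u32 in_port;      /* ingress virtual port of that Cube           */
        u16 reason;       /* reason code chosen by the service fast path */
    };

    BPF_PERF_OUTPUT(events);

    static __always_inline int notify_userspace(struct CTXTYPE *ctx, u32 cube_index,
                                                u32 in_port, u16 reason) {
        struct event e = {
            .cube_index = cube_index,
            .in_port = in_port,
            .reason = reason,
        };
        /* perf_submit() wraps bpf_perf_event_output(); polycubed reads the
         * per-CPU buffers from userspace and dispatches to packet_in(). */
        return events.perf_submit(ctx, &e, sizeof(e));
    }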

Decapsulator. This component handles the reverse communication, which happens when the slow path (or the control plane) of the service wants to inject a packet back into the fast path or send it out on a specific Cube port. In the first case, Polycube simulates the reception of the packet from a specific Cube port, which is specified by the control or slow path module through a specific Polycube helper function, while in the second case the output port of the Cube is provided. When called, this function triggers the execution of the Demultiplexer, which copies the index of the Cube originating the message into a specific eBPF map shared with the Decapsulator; then, it sends the packet on a TAP interface specifically created by polycubed. Unlike the Encapsulator, this path does not use the perf buffer, which is only available for kernel-to-userspace communication and not for the opposite direction. The reception of a packet on the TAP interface triggers the execution of the Decapsulator eBPF program, which is attached to the eBPF hook point of the TAP interface. When executed, the Decapsulator extracts the index of the eBPF program to call from the map and jumps to the next program, following the same operations described in Section 4.5.

1 eBPF provides specific helpers that allow storing custom data into a perf event ring buffer. Userspace programs can then use this buffer as a data channel for receiving events from the kernel.

4.4.3 Debug mechanism

Polycube provides a debug helper that can be used in both the fast and the slow/control path to print debug messages. Although this feature is quite common for userspace programs, this is not the case for eBPF programs. They can use the bpf_trace_printk() function to print debug messages; once the program is loaded, the verifier checks whether the program calls this function and allocates additional buffers, which may slow down the processing of the function. Polycube uses a more efficient mechanism through the pcn_log() helper; when called, this helper uses a perf ring buffer to send debug messages to polycubed, which redirects them to the current log file, as for the slow and control path. Finally, using different log levels, polycubed is able to dynamically remove all the references to the debug messages under the specified log level, reloading the service fast path to reflect the changes.

4.4.4 Table abstractions

To store the network function state across different runs of the same program or to pass configuration data from the control path to the fast path, a Polycube service uses eBPF tables, which are defined in the service fast path and are created when the program is loaded. Every eBPF table has a scope within the system, which expresses the possibility to read and/or modify the table content from another eBPF program. Polycube introduces the possibility to define PRIVATE tables, which are only accessible from the same µCube where they have been declared, and PUBLIC tables, which are instead accessible from every µCube running in the machine. In addition, since Polycube supports the possibility to compose the network function data path as a collection of µCubes (i.e., simple eBPF programs), we added the concept of SHARED tables, where a table can be shared between a given set of µCubes. In this case, when the table is instantiated, it is possible to specify the namespace within which the table will be shared.
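Since Polycube is built on top of BCC (Section 4.7), map scope is ultimately expressed at declaration time in the fast-path code. The snippet below is a hedged sketch of the underlying BCC mechanism that PRIVATE/PUBLIC/SHARED tables build upon; the Polycube-specific keywords and their exact syntax are not shown, and the table names are illustrative.

    /* BCC-style declarations sketching different map scopes (the mechanism
     * underneath Polycube's PRIVATE/PUBLIC/SHARED tables). */

    /* Visible only to the program that declares it (PRIVATE-like). */
    BPF_TABLE("hash", u32, u64, local_stats, 1024);

    /* Exported under a global name so that other programs can attach to it
     * (PUBLIC-like). */
    BPF_TABLE_PUBLIC("hash", u32, u64, shared_config, 256);

    /* Declared as "extern" in another program: it reuses the existing table
     * instead of creating a new one (SHARED-like, within the namespace chosen
     * when the programs are loaded). */
    BPF_TABLE("extern", u32, u64, shared_config, 256);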

4.4.5 Transparent Support for Multiple Hook Points

Polycube supports two different types of services, corresponding to the existing networking "attachment" points (a.k.a. eBPF hooks) available in the eBPF subsystem, namely Traffic Control (TC) and eXpress Data Path (XDP). Although in eBPF the initial context and the type of operations available differ between the two hooks, Polycube hides those differences, allowing the developer to focus only on the network function semantics, while the framework takes care of handling the differences in the implementation and management of the two subsystems.

Fast path transparency. Polycube provides the ability to write the data plane of a network function (i.e., the eBPF program) independently from the type of hook to which it is attached. TC and XDP programs differ in: (i) the initial context available when the eBPF program is triggered, which basically represents the received packet, (ii) the type of helpers that the program is allowed to call and (iii) the return type of the program, which communicates the forwarding decision for that specific packet. Polycube hides the above differences by providing additional custom helper functions that are used to manipulate or perform different operations on the packets regardless of the type of program currently used. In particular, it "wraps" the execution of the program with two additional components that are executed before and after a packet enters and/or leaves the Cube. Those wrappers take care of converting the original context into a standard Polycube format and perform the reverse translation in the opposite direction. Note: we will see in Section 4.5 how these wrappers are fundamental for the implementation of the Polycube service chaining.

Slow & Control path transparency. Even the interaction between the fast and slow path is subject to differences depending on the type of hook point in which the function is executed. As in the previous case, Polycube provides a transparent API for this interaction, hiding the details from the developers, who do not have to worry about the mechanisms used to exchange data. The Encapsulator and Decapsulator shown in Section 4.4.2 are another example of this seamless interaction. The implementation of those components depends on the hook point at which the service is attached; when a packet from the control/slow path of a service is injected into the fast path, polycubed selects the appropriate control program depending on the type of the service instance.

4.4.6 Transparent Services

A Polycube standard service (i.e., a Cube) is made of a set of virtual ports. When a packet is received on a specific port, the Cube takes a forwarding decision to redirect the packet to one of its output interfaces. This process may depend on the specific port configuration and the behavior configured for each service. For example, a router Cube has an IP address associated with each of its ports; when a new packet arrives, it checks the routing table to find the next-hop address and forwards the packet accordingly. On the other hand, there are services that do not have any specific port configuration, and whose behavior is independent of the number of ports or their parameters. To address this case, Polycube introduces the concept of transparent cubes.

Figure 4.3: (a) Transparent cubes attached to a port of the service. (b) Transparent cube attached to a netdev.

A transparent cube does not take any forwarding decision. It is composed of an ingress pipeline that is called when a packet enters the service, and an egress pipeline that is called when the packet leaves the cube. Moreover, it has to be attached to a specific port, which can be either a virtual port of another cube or a Linux netdevice.

Transparent services attached to a Cube's port. Multiple transparent cubes can be attached to a port of a Cube by specifying the position in which the transparent service should be placed with respect to the other transparent cubes; this also defines the order of execution of the services. When attached, they have the possibility to inherit some specific parent's port configurations that can be used to automatically configure the service itself. For example, in Figure 4.3a a NAT service attached to a router's port can read the corresponding IP address and use it for the NAT translation.

Transparent services attached to netdevs. A transparent cube can also be directly attached to a Linux netdev, as shown in Figure 4.3b. When a packet reaches the netdevice, the ingress pipeline of the service is called; once completed, the packet is passed to the host's TCP/IP stack. In the same way, when a packet is sent out to the netdevice, the egress pipeline is executed. Polycube loads the ingress pipeline into the eBPF ingress hook (i.e., XDP or TC_INGRESS) and the egress pipeline into the egress hook (i.e., TC_EGRESS).

4.5 Service Chaining Design

A Polycube service chain consists of a set of network function instances that are connected to each other by means of virtual ports, which are in turn peered with a Linux networking device or another in-kernel NF instance. In the standard model, eBPF programs do not have the concept of a port from which traffic is received or sent out; eBPF only provides a tail call mechanism to "jump" from one program to another.


Figure 4.4: Internal details of the Polycube service chains.

To provide this abstraction, Polycube uses a set of additional eBPF components and wrappers around the user-defined code; Figure 4.4 shows the resulting design. When a packet traverses a chain in Polycube, it carries some metadata (e.g., ingress virtual port, module index), which are internally used by Polycube to correctly isolate the various NFs and implement the desired chain. In particular, the cube index is used to uniquely identify the Cube fast path inside the framework and is generated when the Cube instance is created. Before injecting the Cube's fast path, Polycube augments the user-defined code with a set of wrappers that are executed before and after the service itself. In particular, the pre-processor contains the set of functions necessary to process the incoming traffic, while the post-processor contains the helpers used by the fast path to send the traffic outside of the Cube.

Pipeline Example. We will now walk through a simple example to illustrate how packets are passed through the Polycube service chain. In Figure 4.4 we have an instance of a Polycube Bridge service, called br1, with two virtual ports connected respectively to a Linux networking device (i.e., netdev1) and to the first port of an instance of a Router Cube, called r1. First, when a new port is created in the br1 instance, Polycube assigns a unique virtual port identifier (i.e., vport) to each port; in our case, the port attached to the router has id #0, while the other has id #1. In the same way, the router's port attached to the bridge has id #0. When a new packet is received on the physical interface netdev1, the br1 fast path has to be executed and the packet presented as coming from virtual port #1. To support this abstraction, every time a Cube port is peered with a Linux networking device, Polycube loads two additional eBPF programs. The Input eBPF program, which is attached to the ingress hook of the interface, is loaded and compiled at runtime with some pre-defined information such as the virtual port id (i.e., #1 in the example in the figure) associated by Polycube to that Cube port and the index of the module to which the port is attached (i.e., #11). Upon the reception of a new packet from the physical interface attached to the bridge, the input program is triggered; it copies the vport value into an eBPF per-cpu array map shared with the bridge instance and performs a tail call to the br1 pre-processor, using the hard-coded module index2. At this point, the control passes to the pre-processor module of br1, which extracts the vport from the shared map (and possible additional metadata such as the packet length and headers) and invokes the br1 code. The Cube fast path can then use the vport to send traffic outside as the result of a forwarding decision; this is possible through a specific helper function contained in the post-processor, which will redirect the packet to the next module of the chain. The post-processor uses an additional auxiliary data structure, the ForwardChain, to obtain the index of the next module of the chain corresponding to the given vport. This map has a local scope and represents the actual connection matrix of the service instance with the rest of the world. In our example, the lookup into the ForwardChain map with vport #0 returns the next virtual port id of r1 (i.e., 0) and the index of the eBPF program corresponding to that Cube (i.e., 10). As before, the post-processor copies the vport into the shared map and "jumps" to the r1 pre-processor, which can then perform a tail call using the PatchPanel to get the real address of the next eBPF program. On the other hand, if br1 decides to redirect the packet to port #0, the post-processor retrieves the next module index (i.e., 1) from the ForwardChain and jumps to this module, which corresponds to the Output program associated with the physical interface. As for the input program, this module is injected with a pre-defined if_index of the netdevice, which it uses in the bpf_redirect() helper function to send the packet out on netdev1. To simplify the pipeline example, we have omitted the case in which the Cube is composed of several µCubes (Section 4.3.2.1). Conceptually, the operations remain the same; the only difference is that, after the pre-processor, the code of the HEAD µCube is called, which in turn uses an internal ForwardChain to jump from one µCube to another.
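To make these mechanics more concrete, the hedged sketch below shows the kind of eBPF constructs the pre/post-processors rely on: a per-CPU array that carries the vport across the jump and a program-array map (the PatchPanel) used for the tail call. Map names, sizes and the helper shape are illustrative and do not correspond to Polycube's actual internal code.

    /* Illustrative BCC-style sketch of the chaining mechanism; names and
     * indices are examples, not the real Polycube internals. */

    /* Per-CPU scratch area used to pass the ingress vport (and possibly other
     * metadata) across the tail call; safe because a packet is processed on a
     * single core without preemption. */
    BPF_PERCPU_ARRAY(chain_md, u32, 1);

    /* Global "PatchPanel": module index -> eBPF program. */
    BPF_PROG_ARRAY(patch_panel, 256);

    static __always_inline int jump_to_module(struct CTXTYPE *ctx,
                                              u32 vport, u32 module_idx) {
        u32 zero = 0;

        /* 1. Publish the virtual port that the next module will see. */
        chain_md.update(&zero, &vport);

        /* 2. Tail call into the pre-processor of the next module; on success
         *    this call never returns. */
        patch_panel.call(ctx, module_idx);

        /* 3. Reached only if no program is loaded at that index. */
        return RX_DROP;
    }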

2 The only way to share information from one eBPF program to another is to copy the data into shared eBPF maps. For internal communications, Polycube uses per-CPU maps, which provide better performance thanks to their lockless access. Since a packet is processed only within a single core (eBPF does not allow preemption), this mechanism is safe.

4.6 Management and Control Plane

The capability to add (or remove) a network function dynamically (even from a remote server) into polycubed provides several advantages, such as the possibility to update an existing service or to add functionality without modifying the network functions currently deployed and running. To support this model, the Polycube core (i.e., polycubed) has been designed to be completely independent from the type of network function that is installed. Polycubed has no knowledge of how the network function is composed internally or of what its functionalities are; it only takes care of forwarding each request to the proper service instance. This approach does not require changes to polycubed whenever changes to an individual service are needed; when a service is updated, it is unplugged from the framework, updated and plugged in again without affecting the other services. On the other hand, it complicates the service design, which has to define the interface to the outside (i.e., the REST APIs). To simplify this process, Polycube uses YANG [25] models, each one describing a specific service, to automatically synthesize the REST interface of the service.

4.6.1 Model-driven service abstraction

The YANG data modeling language makes it possible to (i) model the structure of the data and the functionalities provided by the Polycube service, (ii) define the semantics of the service data and their relationships and (iii) express their syntax, which will be used to interact with the service itself. When a new service is registered, polycubed reads the provided YANG model and generates an internal representation of the service data together with a specific path mapping table used to access those data from outside. Whenever a new request for that service arrives, polycubed validates it (e.g., checking the correct format of an IP address, or that ports are in a given range) according to the information specified in the YANG model, without having to rely on the service itself for those "ancillary" tasks.

4.6.1.1 Service base data model

Polycube provides developers with a basic service structure that can be extended to compose the desired network function, offering fundamental abstractions (e.g., VNF ports, port peers, hook type) that are used to simplify the interaction between the different services and system components. The basic structure, shown in the YANG Listing 4.1, reflects the internal representation of a Polycube service, with its primary parameters and components. Every service within the framework is uniquely identified through a name, which is specified using the service-name extension; Polycube does not allow multiple services with the same name.


module pcn-base-service-model {
  extension service-description { ... }
  extension service-version { ... }
  extension service-name { ... }
  extension service-min-kernel-version { ... }

  grouping base-service-instance {
    leaf name { ... }
    leaf uuid { ... }
    leaf loglevel { ... }
    leaf hook {
      type enumeration {
        enum TYPE_TC;
        enum TYPE_XDP_SKB;
        enum TYPE_XDP_DRV;
      }
    }
    list ports {
      key "name";
      unique "uuid";

      leaf name { ... }
      leaf uuid { ... }
      leaf peer { ... }
    }
  }
}

Listing 4.1: The base YANG model of a Cube.

The service-min-kernel-version extension is used to indicate the minimum kernel version required to execute the service, since some eBPF functionalities are available only on newer kernel versions; when the service is loaded, Polycube checks whether the host is running a kernel version greater than or equal to this value. Finally, the service-description and service-version extensions are used to describe the current service. While the previously mentioned information describes the service itself (e.g., a firewall Polycube NF), the variables under the grouping statement are specific to each service instance (e.g., a firewall fw1). Each instance is identified by a name, which is unique inside the service scope, the hook point at which the instance is attached, and a list of ports, each identified by a name that is unique inside the service instance and by a peer.

4.6.1.2 Automatic REST API generation

Polycube uses the information in the YANG model to automatically derive the set of REST APIs that are used to interact with the service.


Figure 4.5: YANG to REST/CLI service description

As shown in Figure 4.5, each YANG resource is automatically mapped to a specific URL, while the different HTTP methods identify the operations to be performed on a particular resource. The GET operation retrieves the current value of a given resource of the Polycube service, the POST operation creates an instance of the resource, the PATCH operation modifies the current value of the resource and, finally, the DELETE operation deletes the specified resource. This approach is similar to the one adopted by the RESTCONF specification [24], which aims at providing a programmatic interface for accessing data defined in YANG, allowing any client to communicate with the Polycube service by just knowing its YANG module.

4.7 Implementation

4.7.1 Polycube Core

As of today, the core of Polycube, i.e., polycubed, is implemented in 28k lines of C++ code and runs within an unmodified Linux, without having to install custom drivers or specific kernel modules. It only requires kernel v4.15 or newer to run the daemon; then, every specific service may have its own requirements depending on the functionalities that it uses. The Polycube daemon contains both the code required to handle the different Polycube NFs and the service-agnostic server proxies, which parse at runtime the YANG model of every loaded service to generate the appropriate REST API and to perform the validation of the service parameters within the server itself.

Network Function       | Min. kernel | LoC FP | LoC SP | LoC CP (AU/MAN) | Description
Bridge                 | 4.15        | 239    | 40     | 5798 / 1105     | An L2 switch NF with support for VLAN and STP.
DDoS Mitigator         | 4.15        | 140    | /      | 1850 / 20       | An NF that drops (malicious) packets based on a blacklist applied on either IP src or dst addresses.
Firewall               | 4.19        | 1654   | /      | 5951 / 2945     | A firewall service that drops or allows packets based on the configured rules.
Dynamic Monitor        | 4.15        | /      | /      | 2330 / 673      | Generic service used to inject eBPF code that monitors network traffic, collecting and exporting custom metrics. The data plane size depends on the injected code.
Load Balancer (DSR)    | 4.15        | 362    | /      | 3476 / 566      | A version of the Maglev scalable load-balancer [57].
Packet Capture         | 4.15        | /      | /      | 2511 / 822      | An NF used to capture packets flowing through a Linux netdevice or between other cubes.
Policy-Based Forwarder | 4.15        | 243    | /      | 2605 / 240      | A simple ACL-based forwarder.
Router                 | 4.15        | 276    | 120    | 3168 / 1030     | A router NF.
NAT                    | 4.15        | 380    | /      | 6302 / 208      | An NF that supports Source, Masquerade, Destination NAT and Port Forwarding.
Iptables               | 5.3         | 2254   | /      | 6409 / 1787     | Special service that emulates the behavior of iptables [115].

Table 4.2: A list of NFs implemented with Polycube.


Polycube is built around the BPF Compiler Collection (BCC) [17], which provides a set of abstractions to interact with eBPF data structures and to load/unload eBPF programs, together with a compilation toolchain that includes Clang/LLVM to allow dynamic generation of the eBPF code that is injected into the kernel. Polycube extends those abstractions with additional helper functions targeted at networking services and at the Polycube NF structure. In particular, the availability of the compilation toolchain makes it possible to re-compile the code at runtime, enabling more aggressive service optimizations that can be dynamically applied within the NF, as we will see in the next chapters.

4.7.2 Polycube Services

To stress test the generality of the Polycube programming model and abstractions, we have implemented a large range of network functions. Table 4.2 briefly describes the type and role of each of them, together with the lines of code (LoC) required to implement the fast path (FP), the slow path (SP) and the control path (CP). For the latter, we distinguish between the LoC automatically generated from the YANG model (i.e., CP/AU) and those manually written (i.e., CP/MAN). At the time of writing, there are 18 different services implemented in Polycube, with an overall number of about 54k lines of code, which include both the C++/C code of the control and slow path and the eBPF code of the fast path. These implementations suggest that Polycube succeeds in its role of providing a generic and highly customizable framework that can be used to implement a wide variety of NFs.

4.7.2.1 Example: L2 Switch

Control Plane. In Listing 4.2 we show a small code snippet of the control plane (CP) of the Polycube L2 Switch NF, in order to show the Polycube programming model and the abstractions provided.

1 Switch(const std::string name, const SwitchJsonObject &conf)
2     : Cube(dp_code, conf.getCubeType()) {
3   logger()->info("Creating Switch instance");
4
5   stp_enabled_ = conf.getStpEnabled();
6
7   addPortsList(conf.getPorts());
8   insertFdb(conf.getFdb());
9   updateStpList(conf.getStp());
10 }

Listing 4.2: Sample CP code from the Polycube L2 Switch NF.

When a new Switch Cube instance is requested to polycubed (via CLI or REST API), the constructor of the Switch object is automatically called, receiving the name of the Cube together with a configuration object that contains all the parameters specified in the YANG model and automatically validated by polycubed. As shown in line #2, the constructor provides the data plane code (i.e., the eBPF code) to the Cube object, together with its type, which corresponds to the attachment point of the cube itself. Then, Polycube will take care of compiling and loading the code in the appropriate point of the NF chain.

When a new request to the control plane arrives, other methods of the Switch object are asynchronously called, allowing the user to implement the desired behavior without having to handle the single REST API call. Listing 4.3 shows an example of those methods. The stub of these methods is generated automatically from the YANG model; then, the user has to fill the corresponding methods with the specific implementation. In lines #8 and #36 we can notice how Polycube also provides abstractions to interact with the eBPF data plane. In this case, it can retrieve the map fwdtable defined in the eBPF code and interact with it to set or get a specific entry.

1 std::shared_ptr<FdbEntry> Fdb::getEntry(const uint16_t &vlan,
2                                         const std::string &mac) {
3   std::lock_guard<std::mutex> guard(fdb_mutex_);
4
5   logger()->debug("[Fdb] Received request to read map entry");
6   logger()->debug("[Fdb] vlan: {0} mac: {1}", vlan, mac);
7
8   auto fwdtable = get_hash_table<fwd_key, fwd_entry>("fwdtable");
9   fwd_key key{
10      .vlan = vlan,
11      .mac = polycube::service::utils::mac_string_to_nbo_uint(mac),
12  };
13
14  logger()->debug("[Fdb] Entry read successfully");
15
16  return fwdtable.get(key);
17 }
18 ...
19
20 void Fdb::addEntry(const uint16_t &vlan, const std::string &mac,
21                    const FdbEntryJsonObject &conf) {
22  std::lock_guard<std::mutex> guard(fdb_mutex_);
23
24  fwd_key key{
25      .vlan = vlan,
26      .mac = polycube::service::utils::mac_string_to_nbo_uint(mac),
27  };
28
29  fwd_entry value{
30      .timestamp = timestamp,    // timestamp and port_index derived from conf (omitted)
31      .port = port_index,
32      .type = STATIC,
33  };
34
35  auto fwdtable = get_hash_table<fwd_key, fwd_entry>("fwdtable");
36  fwdtable.set(key, value);
37  logger()->debug("[Fdb] Entry inserted");
38 }

Listing 4.3: Automatically generated methods of the Polycube L2 Switch NF.

Finally, Listing 4.4 shows the packet_in method that is asynchronously called when a packet is sent from the DP to the CP. This method receives the packet itself and a set of metadata associated with it, which can be used to understand the reason for the CP intervention. Then, when the packet has been processed, as shown in line #13, it can be sent out to a specific port.

1 void Switch::packet_in(Ports &port, PacketInMetadata &md,
2                        const std::vector<uint8_t> &packet) {
3   logger()->debug("Packet received from port {0}", port.name());
4   switch (static_cast<SlowPathReason>(md.reason)) {
5   case SlowPathReason::BROADCAST:
6     broadcastPacket(port, md, packet);
7     break;
8   case SlowPathReason::BPDU:
9     if (stp_enabled_)
10      processBPDU(port, md, packet);
11    else
12      broadcastPacket(port, md, packet);
13    port.send_packet_out(packet, tagged, vlan);  // tagged/vlan computed earlier (omitted)
14    break;
15  default:
16    logger()->error("Not valid reason {0} received", md.reason);
17  }
18 }

Listing 4.4: Packet-in method called when a packet is sent from the DP to the CP.

Data Plane. In the same way, Listing 4.5 shows a code snippet of the Polycube L2 Switch NF data plane. The ingress point of the NF is the handle_rx method, which is called every time a new packet enters this NF. This method is called after the internal (and hidden) Polycube functions, whose behavior is to simulate the NF chain and to fill the appropriate helper variables such as the pkt_metadata structure, as explained in Section 4.5.

When the processing is completed, the DP can use the Polycube-specific helper to redirect the packet to a specific NF interface (e.g., line #59), following its journey through the Polycube NF chain, or drop the packet, as shown in line #56. On the other hand, if the packet requires further processing, it can be sent to the control plane with a specific reason, as shown in line #63. Finally, it is worth noting that the pcn_log() functions can be used to print debug messages with specific levels of debug and that they are automatically compiled in or out by Polycube without any user intervention.

1
2 int handle_rx(struct CTXTYPE *ctx, struct pkt_metadata *md) {
3   struct eth_hdr *eth = data;
4
5   if (data + sizeof(*eth) > data_end)
6     return RX_DROP;
7
8   u32 in_ifc = md->in_port;
9
10  pcn_log(ctx, LOG_TRACE, "New packet from port %d", in_ifc);
11
12  // LEARNING PHASE
13  __be64 src_key = eth->src;
14  u32 now = time_get_sec();
15
16  struct fwd_entry *entry = fwdtable.lookup(&src_key);
17
18  if (!entry) {
19    struct fwd_entry e;  // used to update the entry in the fdb
20
21    e.timestamp = now;
22    e.port = in_ifc;
23
24    fwdtable.update(&src_key, &e);
25    pcn_log(ctx, LOG_TRACE, "MAC: %M learned", src_key);
26  } else {
27    entry->port = in_ifc;
28    entry->timestamp = now;
29  }
30
31  // FORWARDING PHASE: select interface(s) to send the packet
32  __be64 dst_mac = eth->dst;
33  entry = fwdtable.lookup(&dst_mac);
34  if (!entry) {
35    pcn_log(ctx, LOG_DEBUG, "Entry not found");
36    goto DO_FLOODING;
37  }
38
39  u64 timestamp = entry->timestamp;
40
41  // Check if the entry is still valid (not too old)
42  if ((now - timestamp) > FDB_TIMEOUT) {
43    pcn_log(ctx, LOG_TRACE, "Entry is too old. FLOODING");
44    fwdtable.delete(&dst_mac);
45    goto DO_FLOODING;
46  }
47
48  pcn_log(ctx, LOG_TRACE, "Entry is valid. FORWARDING");
49
50 FORWARD:;
51  u32 dst_interface = entry->port;
52
53  // HIT in forwarding table
54  /* do not send packet back on the ingress interface */
55  if (dst_interface == in_ifc) {
56    return RX_DROP;
57  }
58
59  return pcn_pkt_redirect(ctx, md, dst_interface);
60
61 DO_FLOODING:
62  pcn_log(ctx, LOG_DEBUG, "Send to controller");
63  pcn_pkt_controller(ctx, md, REASON_FLOODING);
64  return RX_DROP;
65 }

Listing 4.5: DP of the Polycube L2 Switch NF.

4.8 Evaluation

In this section we evaluate the performance of a set of Polycube NFs and we compare the results with existing in-kernel implementations. First, we evaluate the performance of standalone Polycube NFs, and then we move to more complex scenarios where chains of NFs are involved (Section 4.8.2). Finally, we evaluate the overhead imposed by the Polycube programming model when compared to baseline programs written using vanilla eBPF (Section 4.8.3).

4.8.1 Setup

We run our experiments on a server equipped with an Intel Xeon Gold 5120 14-core CPU @2.20GHz (hyper-threading disabled), 19.25 MB of L3 cache and two 32GB RAM modules. The packet generator is equipped with an Intel Xeon E3-1245 v5 4-core CPU @3.50GHz (8 cores with hyper-threading), 8MB of L3 cache and two 16GB RAM modules. We used Pktgen-DPDK [55] to generate 64-byte UDP packets and to count the received packets. Each server has a dual-port Intel XL710 40Gbps NIC, directly connected to the corresponding one of the other server. Both servers run Ubuntu 18.04.1 LTS, with the DUT running kernel v5.6 and the eBPF JIT flag enabled (the default behavior for newer kernels).


Figure 4.6: Packet forwarding throughput (Mpps, for an increasing number of cores) of the Polycube pcn-bridge NF (in both XDP and TC mode) compared with "standard" Linux implementations such as Linux bridge (brctl) and OpenvSwitch (ovs).

4.8.2 Test Applications

4.8.2.1 Case Study 1: L2 Switch

In this test scenario, we evaluate the performance of a Polycube NF that emulates the behavior of a fully functional L2 switch with support for VLAN and the Spanning Tree Protocol (STP). The pcn-bridge data plane is implemented entirely in eBPF, including the MAC learning phase. More complex functionalities such as the handling of STP protocol BPDUs, or flooding as a result of a miss in the filtering database, are relegated to the slow path, given the impossibility of performing such actions entirely in eBPF. We measure the UDP forwarding performance between pcn-bridge and commonly used L2 switch Linux tools such as Linux bridge (v1.5) and OpenvSwitch (v2.13)3.



Figure 4.7: Throughput (Mpps, for an increasing number of cores) of the Polycube load balancer NF (i.e., pcn-lbdsr), of ipvs, the standard L4 load balancing software inside the Linux kernel, and of Katran, an XDP-based load balancer developed by Facebook.

We can clearly notice, from Figure 4.6, that pcn-bridge outperforms the other tools in both TC and XDP mode, with a performance gain of about 3.6x for the latter. The advantages of XDP are more evident since packets are processed directly at the driver level, avoiding the overhead caused by the allocation of kernel data structures, which may be unnecessary for the simple forwarding use case. Moreover, we can notice how the performance of our pcn-bridge NF scales linearly with the number of cores used, reaching 10Gbps throughput with 64B packets when six cores are involved4.

3 All the tests with OpenvSwitch have been carried out on kernel v5.0 since it does not support newer versions.
4 To gradually use all cores, we have configured the hardware filtering on the NIC (i.e., Flow Director rules) to redirect different flows to distinct RX queues of the NIC.


4.8.2.2 Case Study 2: Load Balancer

In this test, we measure the performance of the Polycube load balancer NF (i.e., pcn-lbdsr). As for the previous scenario, this NF is implemented as a single µCube, with the data plane entirely handled in eBPF (no slow path is involved). Polycube pcn-lbdsr can be configured with a list of virtual IPs (VIPs), each one with an associated list of back-ends; we use the Maglev [57] hash to select the back-end server, which provides better resilience to back-end server failures, a better distribution of the traffic load among the back-ends and the possibility to set different weights for each back-end server. To test this scenario, we used a fixed number of hosts5 and compared the throughput results with IPVS v1.28 and Katran [76], an eBPF/XDP-based load balancer developed and released as open source by Facebook. Figure 4.7 shows the results of this test, with pcn-lbdsr in TC and XDP mode outperforming ipvs by a factor of 2.2x and 6.5x, respectively. Moreover, we note that pcn-lbdsr in XDP mode offers performance comparable to (or even higher than) Katran. This result is mainly due to the heavy use that Polycube NFs make of the dynamic reloading feature, which allows the NF to better adapt to the runtime configuration by compiling out features that are not required at runtime, hence improving the overall data plane performance. The other big difference is that Katran is a standalone application built only for the load-balancing use case; on the other hand, Polycube offers a general framework to build and create complex NF chains, allowing the creation of more complex network topologies, while still providing performance comparable with "native" eBPF implementations.
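The data-plane side of the Maglev scheme essentially reduces to a lookup into a table precomputed by the control plane; the sketch below shows only that lookup step under simplifying assumptions (a single VIP, an illustrative table size and an externally computed flow hash) and is not pcn-lbdsr's actual code.

    /* Hedged sketch of a Maglev-style backend selection in the fast path.
     * The control plane fills `maglev_table` (size M, ideally prime) so that
     * each backend appears with the desired weight; the data plane only
     * hashes the flow and indexes the table. Constants are illustrative. */
    #define MAGLEV_TABLE_SIZE 65537   /* M, a prime number */

    struct backend {
        __be32 ip;
        __be16 port;
    };

    BPF_ARRAY(maglev_table, struct backend, MAGLEV_TABLE_SIZE);

    static __always_inline struct backend *select_backend(u32 flow_hash) {
        /* flow_hash is assumed to be computed over the TCP/IP 5-tuple. */
        u32 idx = flow_hash % MAGLEV_TABLE_SIZE;
        return maglev_table.lookup(&idx);   /* NULL only if idx is out of range */
    }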

4.8.2.3 Case Study 3: Firewall

The Polycube pcn-firewall NF is implemented as a series of µCubes that compose the entire firewall pipeline (even in this case, the slow path is not involved). It implements the Linear Bit Vector Search (LBVS) [98] classification algorithm to filter packets, with a sequence of µCubes, each one in charge of handling specific fields of the packet (e.g., IP source, destination, protocol, etc.)6. Moreover, it is also able to "communicate" with the Linux routing table to check the next-hop address before forwarding a packet to the egress interface7. For this test we used a synthetic ruleset generated by classbench [155], with all the rules matching the TCP/IP 5-tuple. The default rule of the firewall is to drop all the traffic, while only the matching flows are "allowed" and redirected to the second interface of the DUT. The generated traffic is uniformly distributed among all the rules so that all generated packets should be forwarded.
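To recall how LBVS works in this setting: each field-specific µCube looks up the packet's value for its field and obtains a bit vector of the rules that match that value; AND-ing the vectors of all fields and taking the first set bit yields the highest-priority matching rule. The plain-C sketch below illustrates only this combination step, with illustrative sizes and only three fields; it is not the actual pcn-firewall pipeline (where, for instance, the loop would be unrolled to satisfy the eBPF verifier).

    /* Hedged sketch of the LBVS combination step. Each per-field lookup
     * (performed by a dedicated uCube) yields a bitmap of matching rules;
     * AND-ing the bitmaps and scanning for the first set bit gives the
     * highest-priority rule. Sizes and the field set are illustrative. */
    #define MAX_RULES    1024
    #define BITMAP_WORDS (MAX_RULES / 64)

    static inline int first_matching_rule(const unsigned long long *bv_src_ip,
                                          const unsigned long long *bv_dst_ip,
                                          const unsigned long long *bv_proto) {
        for (int w = 0; w < BITMAP_WORDS; w++) {
            unsigned long long hit = bv_src_ip[w] & bv_dst_ip[w] & bv_proto[w];
            if (hit)
                /* Lowest set bit = lowest rule number = highest priority. */
                return w * 64 + __builtin_ctzll(hit);
        }
        return -1;   /* no rule matched: apply the default action (drop) */
    }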

5 We used the same configuration as [75] for the load balancer use case, setting one virtual IP per CPU core and 100 back-end servers for each VIP.
6 A more detailed explanation of the Polycube firewall architecture is provided in [115], whose data and control plane have been implemented as a standalone Polycube network service.
7 eBPF provides a specific helper that can be used to look up the Linux FIB table directly from the XDP code.



Figure 4.8: Throughput (Mpps, for an increasing number of cores) with 1000 rules of the Polycube firewall NF (i.e., pcn-firewall), compared with iptables and nftables, two commonly used Linux firewalls, and with OpenvSwitch (ovs) configured with OpenFlow rules.

We compare the performance of pcn-firewall with both iptables and nftables, the most widely used packet filtering software in the Linux subsystem today. Then, we also load the same ruleset as a set of OpenFlow rules in the OpenvSwitch pipeline8 and we measure the performance under the same conditions mentioned before. The results are shown in Figure 4.8. Even in this case, we can clearly see how pcn-firewall outperforms the existing solutions by a 31.8x, 7.5x and 1.4x factor for nftables, iptables and ovs, respectively. The reason for this is twofold. First, pcn-firewall implements a faster classification algorithm compared to the linear scanning implemented by the Linux-native firewalls; second, it can adopt more aggressive optimizations thanks to the dynamic reloading feature of Polycube, allowing the control plane to specialize the packet processing behavior depending on the actual firewall configuration (e.g., the deployed ruleset).

8 To generate the OpenFlow rules we used Classbench-ng [7966918].



Figure 4.9: Architecture of the Polycube K8s plugin.

4.8.2.4 Case Study 4: K8s Network Provider

To demonstrate the capability of Polycube to enable the creation of complex applications by chaining different network functions together, we present a real-world use case that can be implemented with Polycube and the type of performance improvements that we can expect. In particular, we implemented a CNI plugin for Kubernetes [82], one of the most important open-source orchestration systems for containerized applications. A K8s network provider must implement Pod-to-Pod communication9, provide support for ClusterIP services10 and enforce security policies; our prototype supports all of them. Our overall design includes five different components: a NAT NF (pcn-nat), an L3 routing module (pcn-router), an L2 switch (pcn-bridge), a load balancer (pcn-lb) and a firewall component (pcn-firewall).

9 A Pod is the smallest manageable unit in a k8s cluster and is composed of a group of one or more containers sharing the same network.
10 A ClusterIP is a type of service that is only accessible within a Kubernetes cluster through a virtual IP. When a Pod communicates with this virtual IP, the request can be mapped to an arbitrary Pod running within the same physical host or in another one.


Figure 4.10: Performance (TCP throughput, Gbps) of different k8s network providers (polycube, calico v3.3, cilium v1.3, kube-router, romana, weavenet) for direct Pod to Pod communication, with the two Pods on the same or on different nodes.

Figure 4.11: Performance (TCP throughput, Gbps) of the same k8s network providers for Pod to ClusterIP communication, with the two Pods on the same or on different nodes.

These NFs are chained together by means of the Polycube APIs, as shown in Figure 4.9, and are configured to support the main operations required by the k8s network plugin interface. Tests were carried out on a 3-node cluster, a master and two workers, with Linux kernel v4.15, Intel Xeon CPU E3-1245v5 @3.50GHz and dual-port Intel XL710 40Gbps NIC cards connected point-to-point. We report the TCP throughput measured with iperf3 using the default parameters; the server was always running in a Pod, while the client was either in a physical machine or in another Pod depending on the test.

In Figures 4.10 and 4.11 we assess the performance of our Polycube network provider compared to other existing solutions. In particular, we consider the Pod-to-Pod connectivity and the Pod-to-ClusterIP connectivity. Results show that the Polycube k8s plugin reaches 15-20% higher throughput than the other solutions in the case where server and client are on the same node. When Pods are on different nodes, the advantage of the plugin becomes less evident because of the influence of the physical network, but it remains better than the other solutions. Indeed, those providers often rely on existing kernel components, such as iptables and Linux bridge, which we showed in the previous cases to be less efficient than their Polycube counterparts. Although our k8s plugin achieves better performance than the others in the two cases under consideration, its functionality is comparable with that of the existing solutions, which are however more stable and complete. The main purpose here is to demonstrate the generality of the Polycube programming model and the performance benefits that can be obtained from eBPF-based NFs.


4.8.3 Framework Overheads

In this section we evaluate the overheads imposed by the Polycube programming model when compared to baseline eBPF programs written outside the Polycube environment.

4.8.3.1 Overhead for simple NFs

To measure the baseline performance and the overhead introduced by the Polycube abstraction model on a single NF, we implemented the same operations performed by the xdp_redirect application [46], available under the Linux samples, as a standalone NF inside the Polycube framework (i.e., pcn-simplefwd). The application receives traffic from a given interface and, after swapping the source and destination L2 addresses of the packet, redirects it to a second interface.
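For reference, the core of such a function is only a handful of lines; the hedged sketch below reproduces the described behavior (swap the Ethernet addresses, redirect to the other interface) in the style of Listing 4.5. The output port index and the minimal Ethernet header definition are illustrative, and the data/data_end variables are assumed to be provided by the Polycube wrappers, as in Listing 4.5.

    /* Hedged sketch of a pcn-simplefwd-like fast path: swap the Ethernet
     * source/destination addresses and redirect the packet to the other port.
     * OUT_PORT and the header definition are illustrative. */
    #define OUT_PORT 1

    struct eth_hdr_min {
        u8 dst[6];
        u8 src[6];
        __be16 proto;
    } __attribute__((packed));

    int handle_rx(struct CTXTYPE *ctx, struct pkt_metadata *md) {
        struct eth_hdr_min *eth = data;    /* data/data_end set up by the wrappers */

        if (data + sizeof(*eth) > data_end)
            return RX_DROP;

        /* Swap L2 source and destination addresses, one byte at a time. */
        for (int i = 0; i < 6; i++) {
            u8 tmp = eth->dst[i];
            eth->dst[i] = eth->src[i];
            eth->src[i] = tmp;
        }

        return pcn_pkt_redirect(ctx, md, OUT_PORT);
    }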

Application        | Throughput | LoC (FP) | LoC (S/CP)
xdp_redirect       | 6.97 Mpps  | 64       | 176
tc_redirect        | 1.60 Mpps  | 53       | 56
pcn-simplefw (XDP) | 6.86 Mpps  | 17       | 0
pcn-simplefw (TC)  | 1.55 Mpps  | 17       | 0

Table 4.3: Comparison between vanilla-eBPF applications and a Polycube network function. All throughput results are single-core.

Table 4.3 shows a comparison between two very simple vanilla eBPF applications and a Polycube NF that performs the same operations, attached to either the XDP or the Traffic Control (TC) hook. As we can notice, Polycube introduces a very small overhead compared to vanilla eBPF applications (6.86 Mpps vs 6.97 Mpps), which is required to provide the abstractions mentioned before (e.g., virtual ports). This is mainly caused by the additional processing that happens before and after calling the fast path of the NF, which is totally hidden from the NF developer. As a result, the number of LoC for both the fast path (FP) and the slow and control path (S/CP) is considerably reduced, allowing the developer to focus on the core logic of the program and leaving the common tasks and the possible optimizations to the Polycube daemon. Note also that the sample vanilla applications that we are taking into account are extremely simple; for more complex applications, a developer using vanilla eBPF has to implement, for example, the entire fast-slow path interaction, which requires a non-negligible amount of effort.

4.8.3.2 Overhead for chains of NFs

Figure 4.12 shows the overhead introduced by the Polycube service chain compared to the standard eBPF tail call mechanism. Of course, eBPF does not support any type of abstraction required by an NF framework; a tail call performs only an indirect jump from one eBPF program to another.



Figure 4.12: Overhead of the Polycube service chain (chains of Cubes and chains of µCubes) compared to the standard eBPF tail call mechanism, for an increasing number of NFs.

On the other hand, Polycube uses a set of additional components and abstractions that are executed before a packet enters and after it leaves an NF, e.g., to ensure isolation or the virtual port abstraction. This additional overhead is more evident when a packet runs through an increasing number of virtual Cubes, but it is necessary to ensure the correct execution of the NF chain. Conversely, if the same Cube is composed of a set of µCubes, the overhead of crossing the different µCubes is almost negligible, matching the behavior of the tail call mechanism in the "standard" eBPF approach.

4.8.4 Polycube vs Userspace Frameworks The difference between kernel and user-space networking is well-known in the literature, with pro and cons on both sides that we partially explored in Section2. Høiland-Jørgensen et al. [75] have analyzed the performance differences between

65 4 – Polycube: A Framework for Flexible and Efficient In-Kernel Network Services

XDP and DPDK, showing a gap between the two approaches in favor of the latter; our tests (not reported here for the sake of brevity) simply confirm previous results. This overhead is almost inevitable, and is mainly due to the generality of the Linux operating system, which is structured in a way that makes it easier to support different systems and configurations. Edeline et al. [56] have evaluated their userspace, VPP-based [44] middlebox framework, called mmb, against eBPF/XDP-based middlebox implementations, concluding that the latter makes a good in-kernel alternative to their solution. Polycube fills this gap, introducing a negligible overhead over the vanilla eBPF subsystem (Section 4.8.3), but greatly simplifying both the development and execution of chains of such in-kernel network services.

4.9 Conclusions

In this Chapter we have presented Polycube, a framework for developing, deploying and managing in-kernel virtual network functions. While most NFV frameworks today rely on kernel-bypass approaches, allowing userspace applications to directly access the underlying hardware resources, Polycube brings all the advantages and power of NFV to the world of in-kernel packet processing. It exploits the eBPF subsystem available in the Linux kernel to dynamically inject custom user-defined applications into specific points of the Linux networking stack, providing an unprecedented level of flexibility and customization that would have been unthinkable before. In addition, Polycube has been created with the new requirements brought by the microservice evolution in cloud computing in mind. Polycube services follow the same continuous-delivery development model of server applications and are able to adapt to continuous configuration and topology changes at runtime, thanks to the possibility to dynamically inject and update existing services without any traffic disruption. At the same time, they offer a level of integration and co-existence with the "traditional" ecosystem that is difficult and inefficient to achieve with kernel-bypass solutions, enabling a high level of introspection and debugging that is fundamental in this environment.

We have implemented a vast range of applications with Polycube and shown that it is not only easy to deploy and program but also improves the network performance of existing in-kernel solutions. Polycube adds a very small overhead compared to vanilla eBPF applications but provides several abstractions that simplify the programming and deployment of new services and enables the creation of complex topologies by concatenating different services together, while maintaining and improving performance.

Chapter 5

Accelerating Linux Security with eBPF iptables

5.1 Introduction

Nowadays, the traditional network security features of a Linux host are centered on iptables, which allows applying different security policies to the traffic, such as to protect from possible network threats or to prevent specific communication patterns between different machines. Starting from its introduction in kernel v2.4.0, iptables has remained the most used packet filtering mechanism in Linux, despite being strongly criticized under many aspects, such as its far from cutting-edge matching algorithm (i.e., linear search) that limits its scalability in terms of number of policy rules, its not always intuitive syntax, and its old code base, which is difficult to understand and maintain. In recent years, the increasing demand for network speed has led to the awareness that the current implementation may not be able to cope with modern requirements, particularly in terms of performance, flexibility, and scalability [68].

Nftables [47] was proposed in 2014 with the aim of replacing iptables; it reuses the existing netfilter subsystem through an in-kernel virtual machine dedicated to firewall rules, which represents a significant departure from the iptables filtering model. Although this yields advantages over its predecessor, nftables (and other previous attempts such as ufw [161] or nf-HiPAC [117]) did not have the desired success, mainly due to the reluctance of system administrators to adapt their existing configurations (and scripts) operating on the old framework and move to the new one [48]. This is also highlighted by the fact that the majority of today's open-source orchestrators (e.g., Kubernetes [82], Docker [81]) are strongly based on iptables.

Recently, another in-kernel virtual machine has been proposed, the extended BPF (eBPF) [8, 151, 65], which offers the possibility to dynamically generate, inject and execute arbitrary code inside the Linux kernel, without the necessity to install any additional kernel module. eBPF programs can be attached to different hook points in the networking stack such as eXpress DataPath (XDP) [75] or Traffic Control (TC), hence enabling arbitrary processing on the intercepted packets, which can be either dropped or returned (possibly modified) to the stack. Thanks to its flexibility and excellent performance, functionality, and security, recent activities in the Linux networking community have tried to bring the power of eBPF into the newer nftables subsystem [14]. Although this would enable nftables to converge towards an implementation of its VM entirely based on eBPF, the proposed design does not fully exploit the potential of eBPF, since the programs are directly generated in the kernel and not in userspace, thus losing all the separation and security properties guaranteed by the eBPF code verifier that is executed before the code is injected in the kernel.

On the other hand, bpfilter [29] proposes a framework that enables the transparent translation of existing iptables rules into eBPF programs; system administrators can continue to use the existing iptables-based configuration without even knowing that the filtering is performed with eBPF. To enable such a design, bpfilter introduces a new type of kernel module that delegates its functionality to user space processes, called user mode helper (umh), which can implement the rule translation in userspace and then inject the newly created eBPF programs in the kernel.
Currently, this work focuses mainly on the design of a translation architecture for iptables rules into eBPF instructions, with a small proof of concept that shows the advantages of intercepting (and therefore filtering) the traffic as early as possible in the kernel, and even in the hardware (SmartNICs) [160].

The work presented in this Chapter continues along the bpfilter proposal of creating a faster and more scalable clone of iptables, but with the following two additional challenges. The first is to preserve the iptables filtering semantic. Providing a transparent replacement of iptables, without users noticing any difference, imposes not only the necessity to respect its syntax but also to implement exactly its behavior; small or subtle differences could create serious security problems for those who use iptables to protect their systems. The second is to improve the speed and scalability of iptables; in fact, the linear search algorithm used for matching traffic is the main factor responsible for its limited scalability, particularly in the presence of a large number of firewall rules, which is perceived as a considerable limitation from both the latency and performance perspective.

Starting from the above considerations, this Chapter presents the design of an eBPF-based Linux firewall, called bpf-iptables, which implements an alternative filtering architecture in eBPF, while maintaining the same iptables filtering semantic but with improved performance and scalability. bpf-iptables leverages XDP [75] to provide a fast path for packets that do not need additional processing by the Linux stack (e.g., packets routed by the host) or to discard traffic as soon as it reaches the host. This avoids useless networking stack processing for packets that must be dropped by moving some firewall processing off the host CPU entirely,

while still leveraging the rest of the kernel infrastructure to route packets between the different network interfaces.

Our contributions are: (i) the design of bpf-iptables; it provides an overview of the main challenges and possible solutions in order to preserve the iptables filtering semantic given the difference, from the hook point perspective, between eBPF and netfilter. To the best of our knowledge, bpf-iptables is the first application that provides an implementation of the iptables filtering in eBPF. (ii) A comprehensive analysis of the main limitations and challenges required to implement a fast matching algorithm in eBPF, taking into account the current limitations of the above technology (Chapter 3). (iii) A set of data plane optimizations that are possible thanks to the flexibility and dynamic compilation (and injection) features of eBPF, allowing us to create at runtime an optimized data path that fits perfectly with the current ruleset being used.

This Chapter presents the challenges, design choices and implementation of bpf-iptables and compares it with existing solutions such as iptables and nftables. We take into account only the support for the FILTER table, while we leave as future work the support for additional features such as NAT or MANGLE, although they can easily be implemented by concatenating different eBPF-based network services that implement such functionalities, as presented in Chapter 4.

5.2 Design Challenges and Assumptions

This Section introduces (i) the main challenges encountered while designing bpf-iptables, mainly derived from the necessity to emulate the iptables behavior with eBPF, and (ii) our initial assumptions for this work, which influenced some design decisions.

5.2.1 Guaranteeing filtering semantic

The main difference between iptables and bpf-iptables lies in their underlying frameworks, netfilter and eBPF respectively. Iptables defines three default chains for filtering rules associated with the three netfilter hooks [135] shown in Figure 5.1, which allow filtering traffic in three different locations of the Linux networking stack. Particularly, those hook points filter traffic that (i) terminates on the host itself (INPUT chain), (ii) traverses the host, such as when it acts as a router and forwards IP traffic between multiple interfaces (the FORWARD chain), and (iii) leaves the host (OUTPUT chain).

On the other hand, eBPF programs can also be attached to different hook points. As shown in Figure 5.1, ingress traffic is intercepted in the XDP or traffic control (TC) module, hence earlier than netfilter; the opposite happens for outgoing traffic, which is intercepted later than netfilter.



Figure 5.1: Location of netfilter and eBPF hooks.

The different location of the filtering hooks in the two subsystems introduces the challenge of preserving the semantic of the rules, which, when enforced in an eBPF program, operate on a different set of packets compared to the one that would cross the same netfilter chain. For example, the rule "iptables -A INPUT -j DROP" drops all the incoming traffic crossing the INPUT chain, hence directed to the current host; however, it does not affect the traffic forwarded by the host itself, which traverses the FORWARD chain. A similar "drop all" rule, applied in the XDP or TC hook, would instead drop all the incoming traffic, including packets that are forwarded by the host itself. As a consequence, bpf-iptables must include the capability to predict the iptables chain that would be traversed by each packet, maintaining the same semantic although attached to a different hook point.

5.2.2 Efficient classification algorithm in eBPF

The selection and implementation of a better matching algorithm proved to be challenging due to the intrinsic limitations of the eBPF environment (Chapter 3). In fact, albeit better matching algorithms are well known in the literature (e.g., cross-producting [148], decision-tree approaches [53, 147, 144, 130, 71, 159]), they require either sophisticated data structures that are not currently available in eBPF1 or an unpredictable amount of memory, which is not desirable for a module operating at

the kernel level. Therefore, the selected matching algorithm must be efficient and scalable, but also feasible with the current eBPF technology.

1 eBPF programs do not have the right to use traditional memory; instead, they need to rely on a limited set of predefined memory structures (e.g., hash tables, arrays, and a few others), which are used by the kernel to guarantee safety properties and possibly avoid race conditions. As a consequence, algorithms that require different data structures are not feasible in eBPF.

5.2.3 Support for stateful filters (conntrack)

Netfilter tracks the state of TCP/UDP/ICMP connections and stores them in a session (or connection) table (conntrack). This table can be used by iptables to support stateful rules that accept/drop packets based on the characteristics of the connection they belong to. For instance, iptables may accept only outgoing packets belonging to NEW or ESTABLISHED connections, e.g., enabling the host to generate traffic toward the Internet (and to receive return packets), while connections initiated from the outside world may be forbidden. As shown in Figure 5.1, bpf-iptables operates before packets enter netfilter; being unable to exploit the Linux conntrack module to classify the traffic, it has to implement its own equivalent component (Section 5.4.5).

5.2.4 Working with upstream Linux kernel

Our initial assumption for this work is to operate with the existing Linux kernel in order to bring the advantages of bpf-iptables to the widest audience as soon as possible. In fact, the process required by the Linux kernel community to agree on any non-trivial code change and have it available in a mainline kernel is rather long and may easily require more than one year. This assumption influenced, in some cases, the design choices taken within bpf-iptables (e.g., the definition of a new conntrack in eBPF instead of relying on the existing Linux one); we further analyze this point in Section 5.7, providing a discussion of the possible modifications to the eBPF subsystem that could further improve bpf-iptables.

5.3 Overall Architecture

Figure 5.2 shows the overall system architecture of bpf-iptables. The data plane includes four main classes of eBPF programs. The first set (blue) implements the classification pipeline, i.e., the INPUT, FORWARD or OUTPUT chain; a second set (yellow) implements the logic required to preserve the semantics of iptables; a third set (orange) is dedicated to connection tracking. Additional programs (grey) are devoted to ancillary tasks such as packet parsing.

The ingress pipeline is called upon receiving a packet either on the XDP or TC hook. By default, bpf-iptables works in XDP mode, attaching all the eBPF programs to the XDP hook of the host's interfaces.



Figure 5.2: High-level architecture of bpf-iptables.

However, this requires explicit support for XDP in the NIC drivers2; bpf-iptables automatically falls back to TC mode when the NIC drivers are not XDP-compatible. In the latter case, all the eBPF programs composing the ingress pipeline are attached to the TC hook. The egress pipeline is instead called upon receiving a packet on the TC egress hook, before the packet leaves the host, as XDP is not available in egress [39].

Once in the ingress pipeline, the packet can enter either the INPUT or the FORWARD chain depending on the routing decision; in the first case, if the packet is not dropped, it will continue its journey through the Linux TCP/IP stack, ending up in a local application. In the second case, if the FORWARD pipeline ends with an ACCEPT decision, bpf-iptables redirects the packet to the target NIC, without returning it to the Linux networking stack (more details in Section 5.4.4.2). On the other hand, a packet leaving the host triggers the execution of bpf-iptables when it reaches the TC egress hook, where it will be processed by the OUTPUT chain.

Finally, a control plane module (not depicted in Figure 5.2) is executed in userspace and provides three main functions: (i) initialization and update of the bpf-iptables data plane, (ii) configuration of the eBPF data structures required to run the classification algorithm and (iii) monitoring for changes in the number and state of available NICs, which is required to fully emulate the behavior of iptables, handling the traffic coming from all the host interfaces. We describe the design and architecture of the bpf-iptables data plane in Section 5.4, while the operations performed by the control plane are presented in Section 5.5.

2 NIC drivers with native support for XDP can be found at [36].


5.4 Data plane

In the following subsections we present the different components belonging to the bpf-iptables data plane, as shown in Figure 5.2.

5.4.1 Header Parser

The bpf-iptables ingress and egress pipelines start with a Header Parser module that extracts the packet headers required by the current filtering rules and stores each field value in a per-CPU array map shared among all the eBPF programs in the pipeline, called packet metadata. This avoids the necessity of packet parsing capabilities in the subsequent eBPF programs and guarantees both better performance and a more compact processing code. The code of the Header Parser is dynamically generated on the fly; when a new filtering rule that requires the parsing of an additional protocol field is added, the control plane re-generates, compiles and re-injects the resulting eBPF program in the kernel so that it also extracts the required field. As a consequence, the processing cost of this block is limited exactly to the fields that are needed by the current bpf-iptables rules.
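The sketch below gives an idea of what one generated parser instance could look like; structure, field and map names are illustrative (not the actual bpf-iptables code), and the example assumes a ruleset that only needs the IPv4 addresses, the L4 protocol, and the TCP destination port, ignoring IP options.

// Illustrative parser stage: extract only the fields required by the current
// ruleset and store them in a per-CPU map read by the rest of the pipeline.
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/in.h>
#include <linux/ip.h>
#include <linux/tcp.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

struct pkt_metadata {
    __u32 src_ip;
    __u32 dst_ip;
    __u8  l4_proto;
    __u16 dst_port;
    __u8  target_chain;   /* filled later by the Chain Selector */
};

struct {
    __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
    __uint(max_entries, 1);
    __type(key, __u32);
    __type(value, struct pkt_metadata);
} pkt_md SEC(".maps");

SEC("xdp")
int header_parser(struct xdp_md *ctx)
{
    void *data = (void *)(long)ctx->data;
    void *data_end = (void *)(long)ctx->data_end;
    __u32 zero = 0;
    struct pkt_metadata *md = bpf_map_lookup_elem(&pkt_md, &zero);
    if (!md)
        return XDP_ABORTED;

    struct ethhdr *eth = data;
    if ((void *)(eth + 1) > data_end || eth->h_proto != bpf_htons(ETH_P_IP))
        return XDP_PASS;

    struct iphdr *ip = (void *)(eth + 1);
    if ((void *)(ip + 1) > data_end)
        return XDP_PASS;
    md->src_ip = ip->saddr;
    md->dst_ip = ip->daddr;
    md->l4_proto = ip->protocol;

    if (ip->protocol == IPPROTO_TCP) {
        /* Sketch: assume no IP options, so the TCP header follows directly */
        struct tcphdr *tcp = (void *)(ip + 1);
        if ((void *)(tcp + 1) > data_end)
            return XDP_PASS;
        md->dst_port = bpf_ntohs(tcp->dest);
    }
    /* In the real pipeline a tail call jumps to the next stage here */
    return XDP_PASS;
}

char _license[] SEC("license") = "GPL";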

5.4.2 Chain Selector

The Chain Selector is the second module in the data plane; its role is to classify and forward the traffic to the correct classification pipeline (i.e., chain), according to the iptables semantic (Section 5.2.1). It anticipates the routing decision that would be performed later in the TCP/IP stack and predicts the chain that will be hit by the current packet. The traffic coming from a network interface crosses the INPUT chain only if it is directed to a local IP address, visible from the host root namespace, while incoming packets directed to a non-local IP address cross the FORWARD chain. On the other hand, an outgoing packet traverses the OUTPUT chain only if it has been generated locally, i.e., by a local IP address.

To achieve this behavior, bpf-iptables uses a separate Chain Selector module for the ingress and egress pipeline. The Ingress Chain Selector checks whether the destination IP address of the incoming packet is present in the BPF_HASH map that keeps the local IPs and writes the resulting target chain in the packet metadata per-CPU map shared across the entire pipeline. This value is used by the next module of the chain (i.e., the conntrack) to jump to the correct target chain. On the other hand, the Egress Chain Selector, which is part of the egress pipeline, classifies traffic based on the source IP address and sends it to either the OUTPUT chain or directly to the output interface. In fact, traffic traversing the FORWARD chain has already been matched in the ingress pipeline, hence it should not be handled by the OUTPUT chain.
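A minimal sketch of the Ingress Chain Selector logic follows; the map layout, chain identifiers and metadata structure are assumptions for illustration, not the actual bpf-iptables definitions.

// Illustrative Ingress Chain Selector: local destination -> INPUT chain,
// otherwise FORWARD chain; the label is written into the shared metadata.
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

enum chain { CHAIN_INPUT = 0, CHAIN_FORWARD = 1, CHAIN_OUTPUT = 2 };

struct pkt_metadata { __u32 dst_ip; __u8 target_chain; };

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 1024);
    __type(key, __u32);     /* local IPv4 address, network byte order */
    __type(value, __u8);    /* presence flag */
} local_ips SEC(".maps");

struct {
    __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
    __uint(max_entries, 1);
    __type(key, __u32);
    __type(value, struct pkt_metadata);
} pkt_md SEC(".maps");

SEC("xdp")
int ingress_chain_selector(struct xdp_md *ctx)
{
    __u32 zero = 0;
    struct pkt_metadata *md = bpf_map_lookup_elem(&pkt_md, &zero);
    if (!md)
        return XDP_ABORTED;

    /* dst_ip was stored by the Header Parser stage */
    if (bpf_map_lookup_elem(&local_ips, &md->dst_ip))
        md->target_chain = CHAIN_INPUT;
    else
        md->target_chain = CHAIN_FORWARD;

    /* The next stage (Conntrack Label) is reached through a tail call */
    return XDP_PASS;
}

char _license[] SEC("license") = "GPL";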


5.4.3 Matching algorithm

To overcome the linear search performance of iptables, bpf-iptables adopts the more efficient Linear Bit-Vector Search (LBVS) [98] classification algorithm. LBVS provides a reasonable compromise between feasibility and speed; it has an intrinsic pipelined structure that maps nicely onto the eBPF technology, hence enabling the optimizations presented in Section 5.4.4.2. The algorithm follows the divide-and-conquer paradigm: it splits filtering rules into multiple classification steps, based on the number of protocol fields in the ruleset; intermediate results that carry the potentially matching rules are combined to obtain the final solution.

Classification. LBVS requires a specific (logical) bi-dimensional table for each field on which packets may match, such as the three fields shown in the example of Figure 5.3. Each table contains the list of unique values for that field present in the given ruleset, plus a wildcard for rules that do not care about any specific value. Each value in the table is associated with a bitvector of length N equal to the number of rules, in which the i-th '1' bit indicates that rule i may be matched when the field assumes that value. Filtering rules, and the corresponding bits in the above bitvectors, are ordered with the highest priority rule first. The matching process is repeated for each field we operate on, such as the three fields shown in Figure 5.3. The final matching rule can be obtained by performing a bitwise AND operation on all the intermediate bitvectors returned in the previous steps and determining the most significant '1' bit in the resulting bitvector. This represents the matched rule with the highest priority, which corresponds to rule #1 in the example in Figure 5.3. Bitmaps enable the evaluation of rules in large batches, whose size depends on the parallelism of the main memory; while still theoretically a linear algorithm, this scaling factor enables a 64x speedup compared to a traditional linear search on common CPUs.

5.4.4 Classification Pipeline

The bpf-iptables classification pipeline (Figure 5.4) is in charge of filtering packets according to the rules configured for a given chain. It is made of a sequence of eBPF programs, each one handling a single matching protocol field of the current ruleset. The pipeline contains two per-CPU shared maps that keep some common information among all the programs, such as the temporary bitvector containing the partial matching result, which is initialized with all the bits set to '1' before a packet enters the pipeline.

Each module of the pipeline performs the following operations: (i) it extracts the needed packet fields from the packet metadata map, previously filled by the Header Parser module; (ii) it performs a lookup on its private eBPF map to find the bitvector associated with the current packet value for that field.


Rule #1: iptables -A INPUT -p tcp --dport 80 -j ACCEPT
Rule #2: iptables -A INPUT --dport 80 -j ACCEPT
Rule #3: iptables -A INPUT -p udp --dport 53 -j ACCEPT
Rule #4: iptables -A INPUT -p tcp --dport 53 -j ACCEPT
Rule #5: iptables -A INPUT -d 10.0.0.0/8 -j ACCEPT
Default rule: iptables -P INPUT DROP

Dest. IP       Matched rules     Protocol   Matched rules     Dest. port   Matched rules
0/0            11110             *          01001             *            00001
10.0.0.0/8     11111             TCP        11011             80           11001
                                 UDP        01101             53           00111

Input packet: ip.dst = 10.1.0.1, ip.proto = TCP, tcp.dport = 80
Lookup results: 11111 & 11011 & 11001 = 11001 → Rule #1

Figure 5.3: Linear Bit Vector Search.

If the lookup succeeds, (iii-a) it performs a bitwise AND between this bitvector and the temporary bitvector contained in the per-CPU map. If the lookup fails and there is a wildcard rule, (iii-b) the AND is performed between the bitvector associated with the wildcard rules and the one present in the per-CPU map. Instead, (iii-c) if the lookup fails and there are no wildcard rules for that field, we can immediately conclude that the current packet does not match any rule within the ruleset; hence, we can exploit this situation for an early break of the pipeline (Section 5.4.4.2). Finally, except in the last case, (iv) it saves the new bitvector in the shared map and calls the next module of the chain.

Bitvectors comparison. Since each matching rule is represented as a '1' in the bitvector, bpf-iptables uses an array of N 64-bit unsigned integers to support a large number of rules (e.g., 2,048 rules can be represented as an array of 32 uint64_t). As a consequence, when performing the bitwise AND, the current eBPF program has to perform N cycles on the entire array to compare the two bitvectors. Given the lack of loops in eBPF, this process requires loop unrolling and is therefore limited by the maximum number of possible instructions within an eBPF program, thus also limiting the maximum number of supported rules. The necessity to perform loop unrolling is, as a consequence, the most compelling reason for splitting the classification pipeline of bpf-iptables across many eBPF modules, instead of concentrating all the processing logic within the same eBPF program.



Figure 5.4: bpf-iptables classification pipeline.
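The sketch below illustrates one such per-field stage for the L4 destination port; map names, the 2,048-rule bound and the wrapper program are assumptions made for illustration, not the generated bpf-iptables code.

// Illustrative per-field stage: AND the bitvector found for the current
// field value with the running result kept in a per-CPU map. The loop over
// the bitvector words must be unrolled for the verifier.
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

#define MAX_RULE_WORDS 32   /* 32 * 64 = 2,048 rules */

struct bitvector { __u64 bits[MAX_RULE_WORDS]; };

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 4096);
    __type(key, __u16);              /* L4 destination port */
    __type(value, struct bitvector);
} dport_rules SEC(".maps");

struct {
    __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
    __uint(max_entries, 1);
    __type(key, __u32);
    __type(value, struct bitvector); /* partial matching result */
} tmp_result SEC(".maps");

static __always_inline int match_field(__u16 dport)
{
    __u32 zero = 0;
    struct bitvector *tmp = bpf_map_lookup_elem(&tmp_result, &zero);
    struct bitvector *val = bpf_map_lookup_elem(&dport_rules, &dport);

    if (!tmp)
        return -1;
    if (!val)
        return 0;  /* caller falls back to the wildcard bitvector, if any */

#pragma unroll
    for (int i = 0; i < MAX_RULE_WORDS; i++)
        tmp->bits[i] &= val->bits[i];
    return 1;
}

SEC("xdp")
int match_dport_stage(struct xdp_md *ctx)
{
    /* In the real pipeline the port comes from the packet metadata map and
     * control passes to the next stage via tail call; here we only show the
     * bitvector update for a fixed example value. */
    if (match_field(80) < 0)
        return XDP_ABORTED;
    return XDP_PASS;
}

char _license[] SEC("license") = "GPL";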

Action lookup. Once we reach the end of the pipeline, the last program has to find the rule that matched the current packet. This program extracts the bitvector from the per-CPU shared map and looks for the position of the first bit set to '1' in the bitvector, using de Bruijn sequences [102] to find the index of the first bit set in a single word; once obtained, it uses that position to retrieve the final action associated with that rule from a given BPF_ARRAY map and finally applies the action. Obviously, if no rule has been matched, the default action is applied.
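The following userspace illustration shows the branch-free de Bruijn trick used to find the index of the least-significant set bit in a 64-bit word; in the data plane the 64-entry lookup table would simply be hard-coded, here it is built at startup from the de Bruijn constant for clarity. This is a generic sketch of the technique, not the bpf-iptables code.

/* Userspace illustration of first-set-bit lookup via a de Bruijn sequence. */
#include <stdint.h>
#include <stdio.h>

#define DEBRUIJN64 0x03f79d71b4cb0a89ULL  /* a 64-bit de Bruijn sequence */

static int index64[64];

static void build_table(void)
{
    /* (DEBRUIJN64 << i) >> 58 is unique for every i in 0..63 */
    for (int i = 0; i < 64; i++)
        index64[(DEBRUIJN64 << i) >> 58] = i;
}

/* Returns the position of the least significant set bit; bb must be != 0. */
static int first_set_bit(uint64_t bb)
{
    return index64[((bb & -bb) * DEBRUIJN64) >> 58];
}

int main(void)
{
    build_table();
    uint64_t bitv = 0x19;   /* example result bitvector: 0b11001 */
    printf("index of first set bit: %d\n", first_set_bit(bitv));
    return 0;
}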

5.4.4.1 Clever data sharing

bpf-iptables makes massive use of eBPF per-CPU maps, which represent memory that can be shared among different cascading programs, but that exists in multiple independent instances, one per available CPU core. This memory structure guarantees very fast access to data, as it statically assigns a set of memory locations to each CPU core; consequently, data is never realigned with the L1 caches of other CPU cores, hence avoiding the (hidden) hardware cost of cache synchronization. Per-CPU maps represent the perfect choice in our scenario, in which multiple packets can be processed in parallel on different CPU cores, but where all the eBPF programs that are part of the same chain are guaranteed to be executed on the same CPU core.

As a consequence, all the programs processing a packet P are guaranteed to have access to the same shared data, without performance penalties due to possible cache pollution, while multiple processing pipelines, each one operating on a different packet, can be executed in parallel. The consistency of data in the shared map is guaranteed by the fact that eBPF programs are never preempted by the kernel (even across tail calls). They can use the per-CPU map as a sort of stack for temporary data, which can be subsequently retrieved by the downstream program in the chain with the guarantee that data is not overwritten during the parallel execution of another eBPF program on another CPU, thus ensuring the correctness of the processing pipeline.

5.4.4.2 Pipeline optimizations

Thanks to the modular structure of the pipeline and the possibility to re-generate part of it at runtime, we can adopt several optimizations that allow us (i) to jump out of the pipeline when we realize that the current packet does not require further processing and (ii) to modify and rearrange the pipeline at runtime based on the current bpf-iptables ruleset values.

Early-break. While processing a packet, bpf-iptables can discover in advance that it will not match any rule. This can happen in two separate cases. The first occurs when, at any step of the pipeline, a lookup in the bitvector map fails; in such an event, if that field does not have a wildcard value, we can directly conclude that the current packet will not match any rule. The second case takes place when the result of the bitwise AND between the two bitvectors is the empty set (all bits set to 0). In either circumstance, the module that detects this situation can jump out of the pipeline by applying the default policy for the chain, without the additional overhead of executing all the following components. If the policy is DROP, the packet is immediately discarded, concluding the pipeline processing; if the default policy is ACCEPT, the packet will be delivered to the destination, before being processed by the Conntrack Update module (Section 5.4.5).

Accept all established connections. A common configuration applied in most iptables rulesets contains an "ACCEPT all ESTABLISHED connections" as the first rule of the ruleset. When the bpf-iptables control plane discovers this configuration in a chain, it forces the Conntrack Label program to skip the classification pipeline if it recognizes that a packet belongs to an ESTABLISHED connection. Since this optimization is performed per-chain (we could have different configurations among the chains), the Conntrack Label module reads the target chain from the packet metadata per-CPU map previously filled by the Chain Selector and immediately performs a tail call to the final connection tracking module, which will update the conntrack table accordingly (e.g., updating the timestamp for that connection).


Optimized pipeline. Every time the current ruleset is modified, bpf-iptables creates a processing pipeline that contains the minimum (optimal) number of processing blocks required to handle the fields of the current ruleset, avoiding unnecessary processing. For instance, if there are no rules matching TCP flags, that processing block is not added to the pipeline. New processing blocks can be dynamically added at run-time if matching against a new field is required. In addition, bpf-iptables is able to re-organize the classification pipeline by changing the order of execution of the various components. For example, if some components require only an exact matching, a failed match on that field would lead to an early-break of the pipeline; putting those modules at the beginning of the pipeline could speed up processing, avoiding unnecessary memory accesses and modules.

HOmogeneous RUleset analySis (HORUS). This optimization (that we called HORUS) is used to (partially) overcome two main restrictions of bpf-iptables: the maximum number of matching rules, given by the necessity to perform loop unrolling to compare the bitvectors, and the rule updating time, since we need to re-compute all the bitvectors when the ruleset is updated. The idea behind HORUS is based on the observation that firewall rulesets (in particular, the ones automatically configured by orchestration software) often contain a set of homogeneous rules that operate on the same set of fields. If we are able to discover this set of "similar" rules that are not conflicting with the previous ones (with higher priority), we can bring them in front of the matching pipeline for an additional chance of early-break. In addition, since those rules are independent from the others in the ruleset, we can compact all their corresponding bits in the bitvectors into just one, hence increasing the space for other non-HORUS rules. Moreover, adding (or removing) a HORUS rule does not require updating or even changing the entire matching pipeline; a single map insertion (or deletion) is enough, thus considerably reducing the rule update time. When enabled, the HORUS module is inserted right before the Conntrack Label module and consists of another eBPF program with a BPF_HASH table that contains, as key, the set of fields of the HORUS set and, as value, the final action to apply when a match is found. If the final action is DROP, the packet is immediately dropped; if the action is ACCEPT, it directly jumps to the last module of the pipeline, the Conntrack Update. Finally, if no match is found, HORUS jumps to the first program of the classification pipeline, following the usual processing path. An important scenario where HORUS shows its great advantages is under DoS attacks. In fact, if all the rules of the HORUS ruleset contain a DROP action, matching packets are immediately discarded, hence exploiting (i) the early processing provided by XDP, which allows dropping packets at a high rate, and (ii) the ability to run this program on hardware accelerators (e.g., SmartNICs) that support the offloading of "simple" eBPF programs, further reducing the system load and the resource consumption, as shown in Chapter 6.
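A minimal sketch of the HORUS fast path is shown below as a fragment meant to be called before the first classification stage; the key layout, map name and action encoding are assumptions made for illustration, not the actual bpf-iptables definitions.

// Illustrative HORUS shortcut: homogeneous rules matching the same fields
// collapse into a single hash map keyed by those fields, so one lookup can
// short-circuit the whole classification pipeline.
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct horus_key {
    __u32 src_ip;     /* network byte order */
    __u16 dst_port;
    __u8  l4_proto;
    __u8  pad;        /* keep the hash key free of uninitialized padding */
};

enum { HORUS_DROP = 0, HORUS_ACCEPT = 1 };

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 65536);
    __type(key, struct horus_key);
    __type(value, __u8);
} horus_set SEC(".maps");

/* Returns an XDP verdict, or -1 to continue with the normal pipeline. */
static __always_inline int horus_classify(__u32 src_ip, __u16 dst_port,
                                          __u8 l4_proto)
{
    struct horus_key key = { .src_ip = src_ip, .dst_port = dst_port,
                             .l4_proto = l4_proto, .pad = 0 };
    __u8 *action = bpf_map_lookup_elem(&horus_set, &key);

    if (!action)
        return -1;            /* tail-call into the first classification stage */
    if (*action == HORUS_DROP)
        return XDP_DROP;      /* e.g., DoS sources dropped immediately at XDP */
    return XDP_PASS;          /* jump straight to Conntrack Update in the real code */
}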


Optimized forwarding. If the final decision for a packet traversing the FORWARD chain is ACCEPT, it has to be forwarded to the next hop, according to the routing table of the host. Since, starting from kernel version 4.18, eBPF programs can directly query the Linux routing table, bpf-iptables can optimize the path of the above packet by directly forwarding it to the target NIC, shortening its route within the Linux stack, with a significant performance advantage (Section 5.6). In the (few) cases in which the needed information is not available (e.g., because the MAC address of the next hop is not yet known), bpf-iptables delivers the first few packets to the Linux stack, which follows the usual path.
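A sketch of this shortcut using the bpf_fib_lookup helper (available since kernel 4.18) is shown below; it is a simplified standalone program, not the bpf-iptables code, and it omits details such as TTL handling and IPv6.

// Illustrative FORWARD-chain shortcut: after an ACCEPT verdict, query the
// kernel FIB directly from XDP and, if a next hop is known, rewrite the MAC
// addresses and redirect to the egress interface, bypassing the Linux stack.
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/ip.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

#ifndef AF_INET
#define AF_INET 2
#endif

SEC("xdp")
int forward_accepted(struct xdp_md *ctx)
{
    void *data = (void *)(long)ctx->data;
    void *data_end = (void *)(long)ctx->data_end;
    struct ethhdr *eth = data;
    struct iphdr *ip;
    struct bpf_fib_lookup fib = {};

    if ((void *)(eth + 1) > data_end || eth->h_proto != bpf_htons(ETH_P_IP))
        return XDP_PASS;
    ip = (void *)(eth + 1);
    if ((void *)(ip + 1) > data_end)
        return XDP_PASS;

    fib.family   = AF_INET;
    fib.ipv4_src = ip->saddr;
    fib.ipv4_dst = ip->daddr;
    fib.ifindex  = ctx->ingress_ifindex;

    if (bpf_fib_lookup(ctx, &fib, sizeof(fib), 0) != BPF_FIB_LKUP_RET_SUCCESS)
        return XDP_PASS;   /* e.g., neighbour not resolved: use the normal path */

    /* Use the next hop resolved by the kernel routing table */
    __builtin_memcpy(eth->h_dest, fib.dmac, ETH_ALEN);
    __builtin_memcpy(eth->h_source, fib.smac, ETH_ALEN);
    return bpf_redirect(fib.ifindex, 0);
}

char _license[] SEC("license") = "GPL";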

5.4.4.3 Atomic rule update

One of the characteristics of the LBVS classifier is that, whenever a rule is added, updated or removed, it needs to re-compute all the bitvectors associated with the current fields. However, to avoid inconsistency problems, we must atomically update the content of all the maps in the pipeline. Unfortunately, eBPF allows the atomic update of a single map, while it does not support atomic updates of multiple maps. Furthermore, defining a synchronization mechanism for the update (e.g., using locks to prevent traffic from being filtered by bpf-iptables) could lead to unacceptable service disruption, given the impossibility for the data plane to process traffic in that time interval.

To solve this issue, bpf-iptables exploits the fact that the classification pipeline is stateless and therefore creates a new chain of eBPF programs and maps in parallel, based on the new ruleset. While this new pipeline is assembled and injected in the kernel, packets continue to be processed by the initial matching pipeline, accessing the current state and configuration; when this reloading phase is completed, the Chain Selector is updated to jump to the first program of the new chain, allowing new packets to flow through it. This operation is performed atomically, enabling the continuous processing of the traffic with a consistent state and without any service disruption, thanks to a property of the eBPF subsystem that uses a particular map (BPF_PROG_ARRAY) to keep the addresses of the instantiated eBPF programs. Finally, when the new chain is up and running, the old one is unloaded. We discuss and evaluate the performance of rule updates within bpf-iptables, iptables and nftables in Section 5.6.4.2.
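From the control-plane side, the swap boils down to a single update of the program-array entry used for the tail call, as in the userspace sketch below; function and slot names are hypothetical, not the bpf-iptables API.

/* Userspace sketch of the atomic pipeline swap. The new chain is fully
 * loaded first; then one update of the BPF_MAP_TYPE_PROG_ARRAY slot used by
 * the Chain Selector's tail call atomically steers new packets into it. */
#include <linux/bpf.h>
#include <linux/types.h>
#include <bpf/bpf.h>
#include <stdio.h>

#define CHAIN_SELECTOR_SLOT 0   /* hypothetical tail-call slot */

int swap_pipeline(int prog_array_fd, int new_first_prog_fd)
{
    __u32 key = CHAIN_SELECTOR_SLOT;

    /* Single-map updates are atomic: packets in flight keep using the old
     * chain, new packets enter the new one; the old chain is unloaded later. */
    if (bpf_map_update_elem(prog_array_fd, &key, &new_first_prog_fd, BPF_ANY)) {
        perror("bpf_map_update_elem");
        return -1;
    }
    return 0;
}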

5.4.5 Connection Tracking

To support stateful filters, bpf-iptables implements its own connection tracking module, which consists of four additional eBPF programs placed in both the ingress and egress pipeline, plus an additional matching component in the classification pipeline that filters traffic based on the current connection's state. These modules share the same BPF_HASH conntrack map, as shown in Figure 5.2.


To properly update the state of a connection, the bpf-iptables conntrack has to intercept the traffic in both directions (i.e., host to the Internet and vice versa). Even if the user installs a set of rules operating only on the INPUT chain, outgoing packets have to be processed, in any case, by the conntrack modules located in the egress pipeline. The bpf-iptables connection tracking supports TCP, UDP, and ICMP traffic, although it does not handle advanced features such as related connections (e.g., when a SIP control session triggers the establishment of voice/video RTP flows3), nor does it support IP reassembly.

Packet walkthrough. The Conntrack Label module associates a label to the current packet4 by detecting any possible change in the conntrack table (e.g., a TCP SYN packet starting a new connection triggers the creation of a new session entry), which is written into the packet metadata per-CPU map shared within the entire pipeline. This information is used to filter the packet according to the stateful rules of the ruleset. Finally, if the packet "survives" the classification pipeline, the second conntrack program (Conntrack Update) updates the conntrack table with the new connection state or, in the case of a new connection, it creates the new associated entry. Since no changes occur if the packet is dropped, forbidden sessions will never consume space in the connection tracking table.

Conntrack entry creation. To identify the connection associated with a packet, bpf-iptables uses the packet 5-tuple (i.e., src/dst IP address, L4 protocol, src/dst L4 port) as the key in the conntrack table. Before saving the entry in the table, the Conntrack Update orders the key as follows:

key = {min(IpSrc, IpDest), max(IpSrc, IpDest), Proto,
       min(PortSrc, PortDest), max(PortSrc, PortDest)}          (5.1)

This process allows creating a single entry in the conntrack table for both directions, speeding up the lookup process. In addition, together with the new connection state, the Conntrack Update module stores in the conntrack table two additional flags, ip reverse (ipRev) and port reverse (portRev), indicating whether the IPs and the L4 ports have been reversed compared to the current packet 5-tuple. This information will be used during the lookup process to understand whether the current packet is in the same direction as the one originating the connection, or the opposite.
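A possible C rendering of the normalization in (5.1) is sketched below; types and field names are illustrative, and a real key structure would also need to be fully zero-initialized before being used as a hash map key.

/* Illustrative 5-tuple normalization: both directions of a connection map
 * to the same key; the two flags record whether addresses/ports were swapped
 * with respect to the current packet. */
#include <linux/types.h>

struct ct_key {
    __u32 ip_a, ip_b;     /* min/max of source and destination IP */
    __u8  proto;
    __u16 port_a, port_b; /* min/max of source and destination L4 port */
};

static inline void ct_make_key(struct ct_key *k,
                               __u32 ip_src, __u32 ip_dst,
                               __u16 port_src, __u16 port_dst, __u8 proto,
                               __u8 *ip_rev, __u8 *port_rev)
{
    *ip_rev   = ip_src > ip_dst;     /* 1 if the IPs were swapped in the key */
    *port_rev = port_src > port_dst; /* 1 if the ports were swapped in the key */

    k->ip_a   = *ip_rev   ? ip_dst   : ip_src;
    k->ip_b   = *ip_rev   ? ip_src   : ip_dst;
    k->port_a = *port_rev ? port_dst : port_src;
    k->port_b = *port_rev ? port_src : port_dst;
    k->proto  = proto;
}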

3 eBPF programs can read the payload of the packet (e.g., [7]), which is required to recognize related connections. Supporting these features in bpf-iptables can be done by extending the conntrack module to recognize the different L7 protocols from the packet and inserting the correct information in the conntrack table.

4 The possible labels that the conntrack module associates to a packet are the same defined by the netfilter framework (i.e., NEW, ESTABLISHED, RELATED, INVALID).


Lookup process. When a packet arrives at the Conntrack Label module, it computes the key for the current packet according to the previous formula and determines the ip reverse and port reverse flags as before. At this point it performs a lookup into the conntrack table with this key; if the lookup succeeds, the new flags are compared with those saved in the conntrack table to detect which direction the packet belongs to. For instance, if

(currIpRev != IpRev) && (currPortRev != PortRev)          (5.2)

we are dealing with the reverse packet related to the stored session; this is used, e.g., to mark an existing TCP session as ESTABLISHED, i.e., to update its state from SYN_SENT to SYN_RCVD (Figure 5.5).

Stateful matching module. If at least one rule of the ruleset requires a stateful match, bpf-iptables also instantiates the Conntrack Match module within the classification pipeline to find the bitvector associated with the current label. While this module is present only when the ruleset contains stateful rules, the two connection tracking modules outside the classification pipeline are always present, as they have to track all the current connections in order to be ready for state-based rules instantiated at a later time.

TCP state machine. A summary of the TCP state machine implemented in the connection tracking module is shown in Figure 5.5. The first state transition is triggered by a TCP SYN packet (all other packets not matching that condition are marked with the INVALID label); in this case, if the packet is accepted by the classification pipeline, the new state (i.e., SYN_SENT) is stored in the conntrack table together with some additional flow context information such as the last seen sequence number, which is used to check the packet before updating the connection state. Figure 5.5 refers to the forward or reverse packet (i.e., pkt or rPkt) depending on the initiator of the connection. Finally, when the connection reaches the TIME_WAIT state, only a timeout event or a new SYN will trigger a state change. In the first case the entry is deleted from the conntrack table; otherwise the current packet direction is marked as forward and the new state becomes SYN_SENT.

Conntrack Concurrency. Even though the conntrack table, like all the other eBPF maps, is protected by the kernel's RCU mechanism, this is not enough to ensure the atomicity of the operations performed inside the conntrack module itself. Indeed, to work correctly, the algorithm presented above assumes that both the direct and reverse direction of a connection are received on the same CPU core, requiring RSS/RPS5

5 Receive Side Scaling (RSS) is a hardware mechanism implemented in the NIC itself to distribute different flows to multiple receive and transmit descriptor queues, which are generally mapped to different CPU cores. Receive Packet Steering (RPS) is logically a software implementation of RSS and is used when RSS is not supported by the underlying NIC.


Figure 5.5: TCP state machine for bpf-iptables conntrack. Grey boxes indicate the states saved in the conntrack table; labels represent the value assigned by the first conntrack module before the packet enters the classification pipeline.


to be enabled on the system. However, this assumption alone is still not enough to ensure the correctness of the entire conntrack module. In fact, before updating the value, the Conntrack Update first checks the current state of the packet in the state machine and then updates the value with the new state. Although the update can happen atomically through the bpf_map_update() system call, the same does not hold for the combination of the two operations (check and update), thus causing possible inconsistencies between the two conntrack modules. bpf-iptables solves this problem by using (for v5.1+ kernels) the bpf_spin_lock mechanism [49, 152], which allows acquiring a lock on a specific conntrack entry and then performing, atomically, the read and update of the value.
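The fragment below sketches how a per-entry bpf_spin_lock can make the check-then-update atomic (kernel v5.1+); the value layout and state constants are illustrative, not the actual bpf-iptables definitions.

// Illustrative conntrack value with an embedded spin lock: the state check
// and the state update happen under the same per-entry lock, so the reverse
// direction handled on another CPU cannot interleave between them.
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct ct_value {
    struct bpf_spin_lock lock;
    __u8  state;          /* e.g., SYN_SENT, SYN_RCVD, ESTABLISHED, ... */
    __u32 last_seq;
    __u64 last_seen;      /* coarse timestamp written by the data plane */
};

static __always_inline void ct_transition(struct ct_value *entry,
                                          __u8 expected, __u8 next)
{
    bpf_spin_lock(&entry->lock);
    if (entry->state == expected)
        entry->state = next;
    bpf_spin_unlock(&entry->lock);
}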

Conntrack Cleanup. bpf-iptables implements the cleanup of conntrack entries in the control plane, where a dedicated thread checks for the presence of expired sessions. For this reason, the Conntrack Update module updates the timestamp associated with each session entry when a new packet is received. Since we noticed that the usage of the bpf_ktime() helper to retrieve the current timestamp causes a non-negligible performance overhead, we store the current time every second in a dedicated per-CPU array, which is used by the data plane to timestamp the session entries. We are confident that a per-second precision is a reasonable trade-off between performance and accuracy for this type of application.

5.5 Control plane

This Section describes the main operations of our control plane, which are triggered whenever one of the following events occurs.

Start-up. To behave as iptables, bpf-iptables has to intercept all incoming and outgoing traffic and handle it in its custom eBPF pipeline. When started, bpf-iptables attaches a small eBPF redirect program to the ingress (and egress) hook of each host interface visible from the root namespace, as shown in Figure 5.2. This program intercepts all packets flowing through the interface and calls the first program of the bpf-iptables ingress or egress pipeline. This enables the creation of a single processing pipeline that handles all the packets, whatever interface they come from, as eBPF programs attached to a NIC cannot be called from other interfaces. Finally, bpf-iptables retrieves all the local IP addresses active on any NIC and configures them in the Chain Selector; this initialization phase is done by subscribing to the proper set of netlink events.

Netlink notification. When a new netlink notification arrives, bpf-iptables checks if it relates to specific events in the root namespace, such as the creation of an interface or the update of an IP address. In the first case, the redirect program


Algorithm 1: Pre-processing algorithm
Input: N, the list of filtering rules

 1  Extract K, the set of matching fields used in N
 2  foreach ki ∈ K do
 3      bi ← number of bits of field ki
 4      θi ← {ki,j | ∀j ≤ min(card(N), 2^bi)}   /* set of distinct values */
 5      if ∃ a wildcard rule ∈ N for ki then
 6          add a wildcard entry to θi
 7      foreach ki,j ∈ θi do
 8          bitvector(i,j)[N] ← {0}
 9          foreach ni ∈ N do
10              if ki,j ⊆ ni then
11                  set the bit of rule ni in bitvector(i,j) to 1

is attached to the eBPF hook of the new interface, enabling bpf-iptables6 to inspect its traffic. In the second case, we update the list of local IPs used in the Chain Selector with the new address.

Ruleset changes. When the user updates the ruleset, bpf-iptables starts the execution of the pre-processing algorithm, which calculates the value-bitvector pairs for each field; those values are then inserted in the new eBPF maps and the new programs are created on the parallel chain. The pre-processing algorithm (pseudo-code in Algorithm 1) works as follows. Let us assume we have a list of N packet filtering rules that require exact or wildcard matching on a set of K fields; (i) for each field ki ∈ K we extract a set of distinct values θi = {ki,1, ki,2, ..., ki,j} with j ≤ card(N) from the current ruleset N; (ii) if there are rules that require wildcard matching for the field ki, we add an additional entry to the set θi that represents the wildcard value; (iii) for each ki,j ∈ θi we scan the entire ruleset and, for each ni ∈ N, if ki,j ⊆ ni we set the bit corresponding to the position of the rule ni in the bitvector for the value ki,j to 1, otherwise we set the corresponding bit to 0. Repeating these steps for each field ki ∈ K allows constructing the final value-bitvector pairs to be used in the classification pipeline.

The final step of this phase is to insert the generated values in their eBPF maps. Each matching field has a default map; however, bpf-iptables is also able

6 There is a transition window between the reception of the netlink notification and the loading of the redirect program, during which the firewall is not yet active. As far as eBPF is concerned, this transition cannot be totally removed.

to choose the map type at runtime, based on the current ruleset values. For example, an LPM_TRIE is used as the default map for IP addresses, which is the ideal choice when a range of IP addresses is used; however, if the current ruleset contains only rules with fixed (/32) IP addresses, it changes the map into a HASH_TABLE, making the matching more efficient. Before instantiating the pipeline, bpf-iptables modifies the behavior of every single module by regenerating and recompiling the eBPF program that best represents the current ruleset. When the most appropriate map for a given field has been chosen, bpf-iptables fills it with the computed value-bitvector pairs. The combination of eBPF map and field type affects the way in which bpf-iptables represents the wildcard rule. For maps such as the LPM_TRIE, used to match IP addresses, the wildcard can be represented as the value 0.0.0.0/0, which is inserted as any other value. On the other hand, for L4 source and destination ports, which use a HASH_MAP, bpf-iptables instantiates the wildcard value as a variable hard-coded in the eBPF program; when the match in the table fails, it uses the wildcard variable as if it had been directly retrieved from the map.

bpf-iptables adopts a variant of the previous algorithm for fields that have a limited number of possible values, where instead of generating the set θi of distinct values for the field ki, it produces all possible combinations for that field. The advantage is that (i) it does not need to generate a separate bitvector for the wildcard, since all possible combinations are already contained within the map, and (ii) it can be implemented with an eBPF ARRAY_MAP, which is faster compared to other maps. An example is the processing of TCP flags; since the number of all possible values for this field is limited (i.e., 2^8), it is more efficient to expand the entire field with all possible cases instead of computing exactly the values in use.
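As a concrete illustration of the pre-processing step, the userspace sketch below computes the value-bitvector pairs for a single field (the L4 destination port) of a toy five-rule ruleset; the data layout and the bit ordering (bit i for rule i) are assumptions made for clarity, not the bpf-iptables control-plane code.

/* Userspace sketch of Algorithm 1 restricted to one field: for every distinct
 * value (plus the wildcard) build the bitvector of rules that may match it. */
#include <stdint.h>
#include <stdio.h>

#define N_RULES 5
#define WILDCARD 0          /* 0 is not a valid rule port in this toy example */

/* dport required by each rule; WILDCARD means "any port" */
static const uint16_t rule_dport[N_RULES] = { 80, 80, 53, 53, WILDCARD };

static uint64_t bitvector_for(uint16_t value)
{
    uint64_t bv = 0;
    for (int i = 0; i < N_RULES; i++)
        if (rule_dport[i] == value || rule_dport[i] == WILDCARD)
            bv |= 1ULL << i;   /* rule i may match this field value */
    return bv;
}

int main(void)
{
    /* Distinct values found in the ruleset, plus the wildcard entry */
    uint16_t values[] = { 80, 53, WILDCARD };
    for (unsigned v = 0; v < sizeof(values) / sizeof(values[0]); v++)
        printf("dport=%u -> bitvector=0x%llx\n", values[v],
               (unsigned long long)bitvector_for(values[v]));
    return 0;
}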

5.6 Evaluation

5.6.1 Test environment

Setup. Our testbed includes a first server used as DUT, running the firewall under test, and a second used as packet generator (and possibly receiver). The DUT encompasses an Intel Xeon Gold 5120 14-core CPU @2.20GHz (hyper-threading disabled) with support for Intel's Data Direct I/O (DDIO) [84], 19.25 MB of L3 cache and two 32GB RAM modules. The packet generator is equipped with an Intel Xeon E3-1245 v5 4-core CPU @3.50GHz (8 cores with hyper-threading), 8MB of L3 cache and two 16GB RAM modules. Both servers run Ubuntu 18.04.1 LTS, with the packet generator using kernel 4.15.0-36 and the DUT running kernel 4.19.0. Each server has a dual-port Intel XL710 40Gbps NIC, each port directly connected to the corresponding one of the other server.


Evaluation metrics. Our tests analyze both the TCP and UDP throughput of bpf-iptables compared to existing (and commonly used) Linux tools, namely iptables and nftables. TCP tests evaluate the throughput of the system under "real" conditions, with all the offloading features commonly enabled in production environments (i.e., IP fragmentation, TCP segmentation offloading, checksum offloading). Instead, UDP tests stress the capability of the system in terms of packets per second, hence we use 64B packets without any offloading capability. When testing bpf-iptables, we disabled all the kernel modules related to iptables and nftables (e.g., x_tables, nf_tables) and the corresponding connection tracking modules (i.e., nf_conntrack and nft_ct). Although most of the evaluation metrics are common among all tests, we provide additional details on how the evaluation has been performed for each test separately.

Testing tools. UDP tests used Pktgen-DPDK v3.5.6 [55] and DPDK v18.08 to generate traffic, while for TCP tests we used both iperf v2.0.10 to measure the TCP throughput and weighttp [10] v0.4 to generate a high number of new parallel HTTP connections towards the DUT, counting only the successfully completed connections [91]. In particular, the latter reports the actual capability of the server to perform real work.

Rulesets and packet traces. We used the same ruleset for all the firewalls under consideration. In particular, nftables rules have been generated using the same rules loaded for bpf-iptables and iptables but converted using iptables-translate [11]. Since the synthetic rulesets vary depending on the test under consideration, we describe their content in the corresponding test's section. Regarding the generated traffic, we configured Pktgen-DPDK to generate traffic that matches the configured rules; also in this case we discuss the details in each test description.

5.6.2 System benchmarking

This Section evaluates the performance and efficiency of individual bpf-iptables components (e.g., conntrack, matching pipeline).

5.6.2.1 Performance dependency on the number of rules

This test evaluates the performance of bpf-iptables with an increasing number of rules, from 50 to 5k. We generated five synthetic rulesets with rules matching the TCP/IP 5-tuple and then analyzed a first scenario in which rules are loaded in the FORWARD chain (Section 5.6.2.2) and a second that involves the INPUT chain (Section 5.6.2.3). In the first case, performance is influenced by both the classification algorithm and the TCP/IP stack bypass; in the second case packets are delivered to a local application, hence the performance is mainly influenced by the classification algorithm.



Figure 5.6: Single-core (5.6a) and multi-core (5.6b) comparison when increasing the number of loaded rules. Generated traffic (64B UDP packets) is uniformly distributed among all the rules.

5.6.2.2 Performance dependency on the number of rules (FORWARD chain)

This test loads all the rules in the FORWARD chain, and the DUT is configured as a router in order to forward all traffic received from one interface to the other. The generated traffic is uniformly distributed among all the rules7, without any packet hitting the default rule. Since each rule is a 5-tuple, the number of generated TCP flows is equal to the number of rules.

Evaluation metrics. We report the UDP throughput (in Mpps) averaged over 10 different runs. This value is obtained by adjusting the sending rate so as not to exceed 1% packet loss. Single-core results are taken by setting the interrupt mask of each ingress receive queue to a single core, while multi-core performance (14 cores in our case) represents the standard case where all the available cores in the DUT are used.

Results. Figures 5.6a and 5.6b show respectively the single-core and multi-core forwarding performance results. We can notice from Figure 5.6a how bpf-iptables outperforms iptables by a factor of two even with a small number of rules (i.e., 50); this gap is even larger with nftables, which is almost 5 times slower in the same conditions.

7 We used a customized version of Pktgen-DPDK [110] to randomly generate packets for a given range of IPv4 addresses and L4 port values.


Figure 5.7: Performance of the INPUT chain with an increasing number of rules. bpf-iptables runs on a single CPU core and iperf on another core.

The advantage of bpf-iptables is even more evident with more rules; the main performance bottleneck is the scanning of the entire bitvector in order to find the final matching rule, whose size depends on the number of rules (Section 5.4.4). Finally, Figure 5.6b shows how bpf-iptables scales across multiple cores; the maximum throughput is achieved with 1K rules, since with a smaller number of rules the number of generated flows is not enough to guarantee uniform processing across multiple cores (due to the RSS/RFS feature of the NIC), with a resulting lower throughput.
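To make the cost of that final scan concrete, the following plain-C sketch shows how the per-field bitvectors produced by a Linear Bit Vector Search pipeline can be combined and scanned for the first (highest-priority) matching rule; the word size, the field set and the bit ordering are illustrative assumptions, not the actual bpf-iptables code.

#include <stdint.h>
#include <stdio.h>

#define MAX_RULES 8192
#define WORD_BITS 64
#define N_WORDS   (MAX_RULES / WORD_BITS)

/* AND the per-field bitvectors and return the index of the first
 * (i.e., highest-priority) matching rule, or -1 if no rule matches.
 * The scan is linear in the number of 64-bit words, which is why its
 * cost grows with the size of the ruleset. */
static int first_matching_rule(const uint64_t *bv_srcip,
                               const uint64_t *bv_dstip,
                               const uint64_t *bv_l4)
{
    for (int w = 0; w < N_WORDS; w++) {
        uint64_t hit = bv_srcip[w] & bv_dstip[w] & bv_l4[w];
        if (hit)
            return w * WORD_BITS + __builtin_ctzll(hit);
    }
    return -1;
}

int main(void)
{
    static uint64_t srcip[N_WORDS], dstip[N_WORDS], l4[N_WORDS];
    /* pretend rule 69 (word 1, bit 5) matches on every field */
    srcip[1] = dstip[1] = l4[1] = 1ULL << 5;
    printf("first matching rule: %d\n", first_matching_rule(srcip, dstip, l4));
    return 0;
}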

5.6.2.3 Performance dependency on the number of rules (INPUT chain)

This test loads all the rules in the INPUT chain; traffic traverses the firewall and terminates on a local application, hence following the same path through the TCP/IP stack for all the firewalls under testing. As a consequence, any performance difference is mainly due to the different classification algorithms. We used iperf to generate UDP traffic (using its default packet size for UDP) towards the DUT, where the default accept policy causes all packets to be delivered to the local iperf server, where we compute the final throughput. To further stress the firewall, we used eight parallel iperf clients to generate the traffic, saturating the 40Gbps link.

Evaluation metrics. We report the UDP throughput (in Gbps) averaged over 10 different runs; we forced the firewall to run on a single core, while the iperf server runs on a different core.8

8We set the affinity of iperf on a single core and then we forced all interrupts on another one.


Figure 5.8: Multi-core performance comparison when varying the number of fields in the rulesets (IP src; +IP dst; +L4 proto; +L4 src port; +L4 dst port). Generated traffic (64B UDP packets) is uniformly distributed among all the rules.

Results. Figure 5.7 shows how bpf-iptables performs better than the other firewalls, with an increasing gap with larger rulesets. However, the advantage with a low number of rules is smaller compared to the previous case; in fact, in this scenario, bpf-iptables cannot avoid the cost of passing through the TCP/IP stack (and the allocation of the sk_buff). Therefore its performance advantage is given only by the different classification algorithm, which is more evident when the number of rules grows. However, it is important to note that in case of DROP rules, bpf-iptables discards the packets long before they reach the local application, with a significant performance advantage thanks to the early processing of XDP.

5.6.2.4 Performance dependency on the number of matching fields

Since the bpf-iptables modular pipeline requires a separate eBPF program (hence an additional processing penalty) for each matching field, this test evaluates the throughput of bpf-iptables when increasing the number of matching fields in the deployed rules, in order to characterize the (possible) performance degradation when operating on a growing number of protocol fields.

Ruleset. We generated five different rulesets with a fixed number of rules (i.e., 1000) and with an increasing complexity that goes from matching only the srcIP address to the entire 5-tuple. All the rules have been loaded in the FORWARD chain and have the ACCEPT action, while the default action of the chain is DROP.

Test setup and evaluation metrics. Same as Section 5.6.2.2.



Figure 5.9: Connection tracking with an increasing number of clients (number of successfully completed requests/s). (a) NIC interrupts set to a single core, nginx running on the remaining ones; (b) NIC interrupts set to all the cores, nginx running without any restrictions.

Results. Results in Figure 5.8 show that iptables performs almost the same independently of the complexity of the rules; this is expected, given that its cost is dominated by the number of rules. Results for bpf-iptables are less obvious. While, in the general case, increasing the number of fields corresponds to a decrease in performance (e.g., rules operating on the 5-tuple show the lowest throughput), this is not always true, with the first four columns showing roughly the same value and the peak observed when operating on two fields. In fact, the performance of bpf-iptables is also influenced by the type of field and the number of values for each field. For instance, matching against IP addresses requires, in the general case, a longest prefix match algorithm; as a consequence, bpf-iptables uses an LPM_TRIE, whose performance depends on the number of distinct values. In this case, a single match on a bigger LPM_TRIE turns out to be more expensive than two matches on two far smaller LPM_TRIEs, which is the case when rules operate on both IP source and destination addresses9.
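As a reference for the LPM_TRIE discussion above, the following eBPF-C sketch shows how a longest-prefix-match map on IPv4 source addresses can be declared and queried from an XDP program. The map name, its size and the value layout are hypothetical and a recent clang/libbpf toolchain is assumed; this is only an illustration of the data structure, not the bpf-iptables source.

#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/ip.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

/* Key layout mandated by BPF_MAP_TYPE_LPM_TRIE: prefix length + data. */
struct lpm_v4_key {
    __u32 prefixlen;   /* number of significant bits of addr */
    __u32 addr;        /* IPv4 address, network byte order */
};

struct {
    __uint(type, BPF_MAP_TYPE_LPM_TRIE);
    __uint(map_flags, BPF_F_NO_PREALLOC);   /* required for LPM tries */
    __uint(max_entries, 10240);
    __type(key, struct lpm_v4_key);
    __type(value, __u64);                   /* e.g., one word of the rule bitvector */
} srcip_lpm SEC(".maps");

SEC("xdp")
int match_srcip(struct xdp_md *ctx)
{
    void *data = (void *)(long)ctx->data;
    void *data_end = (void *)(long)ctx->data_end;

    struct ethhdr *eth = data;
    if ((void *)(eth + 1) > data_end || eth->h_proto != bpf_htons(ETH_P_IP))
        return XDP_PASS;

    struct iphdr *ip = (void *)(eth + 1);
    if ((void *)(ip + 1) > data_end)
        return XDP_PASS;

    struct lpm_v4_key key = { .prefixlen = 32, .addr = ip->saddr };
    __u64 *bitvector = bpf_map_lookup_elem(&srcip_lpm, &key);
    /* In the real pipeline the returned bitvector would be ANDed with the
     * ones produced by the other matching fields; here we just pass the packet. */
    (void)bitvector;
    return XDP_PASS;
}

char _license[] SEC("license") = "GPL";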

5.6.2.5 Connection Tracking Performance

This test evaluates the performance of the connection tracking module, which enables stateful filtering. We used HTTP traffic to stress the rather complex state machine of that protocol (Section 5.4.5) by generating a high number of new connections per second, taking the number of successfully completed sessions as performance indicator.

9First ruleset had 1000 rules, all operating on source IP addresses. Second ruleset used 50 distinct srcIPs and 20 distinct dstIPs, resulting again in 1000 rules.


Test setup. In this test weighttp [10] generated 1M HTTP requests towards the DUT, using an increasing number of concurrent clients to stress the connection tracking module. At each request, a file of 100 bytes is returned by the nginx web server running in the DUT. Once the request is completed, the current connection is closed and a new connection is created. This required increasing the limit of 1024 open file descriptors per process imposed by Linux, in order to allow the sender to generate a larger number of new requests per second, and enabling the net.ipv4.tcp_tw_reuse flag to reuse sessions in TIME_WAIT state on both sender and receiver machines10.

Ruleset. This ruleset is made of three rules loaded in the INPUT chain, hence operating only on packets directed to a local application. The first rule accepts all packets belonging to an ESTABLISHED session; the second rule accepts all the NEW packets coming from the outside and with the TCP destination port equal to 80; the last rule drops all the other packets coming from outside.

Evaluation metrics. We measure the number of successfully completed requests; in particular, weighttp increments the above number only if a request is completed within 5 seconds.

Results. bpf-iptables scores better in both single-core and multi-core tests, with iptables performing 3 to 5% worse and nftables being down by 7 to 10%, as shown in Figures 5.9a and 5.9b. However, for the sake of precision, the connection tracking module of bpf-iptables does not include all the features supported by iptables and nftables (Section 5.4.5). Nevertheless, we remark that this logic can be customized at run-time to fit the needs of the particular running application, including only the required features, without having to update the Linux kernel.
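The stateful logic exercised in this test—accept ESTABLISHED traffic, accept NEW connections to port 80, drop everything else—can be summarized by the small C sketch below; the types and the function name are hypothetical and only illustrate the decision encoded by the three-rule chain, not the actual bpf-iptables conntrack implementation.

#include <stdio.h>

enum ct_state { CT_NEW, CT_ESTABLISHED, CT_INVALID };
enum verdict  { VERDICT_ACCEPT, VERDICT_DROP };

struct pkt_meta {
    enum ct_state  state;   /* filled in by the connection tracking module */
    unsigned short dport;   /* TCP destination port, host byte order */
};

static enum verdict input_chain(const struct pkt_meta *pkt)
{
    if (pkt->state == CT_ESTABLISHED)               /* rule 1 */
        return VERDICT_ACCEPT;
    if (pkt->state == CT_NEW && pkt->dport == 80)   /* rule 2 */
        return VERDICT_ACCEPT;
    return VERDICT_DROP;                            /* rule 3 (default) */
}

int main(void)
{
    struct pkt_meta syn_to_web = { CT_NEW, 80 };
    struct pkt_meta syn_to_ssh = { CT_NEW, 22 };
    printf("NEW to port 80: %s\n",
           input_chain(&syn_to_web) == VERDICT_ACCEPT ? "ACCEPT" : "DROP");
    printf("NEW to port 22: %s\n",
           input_chain(&syn_to_ssh) == VERDICT_ACCEPT ? "ACCEPT" : "DROP");
    return 0;
}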

5.6.3 Realistic Scenarios

In this set of tests we analyzed some scenarios that are common in enterprise environments, such as (i) protecting servers in a DMZ, and (ii) performance under DDoS attack.

5.6.3.1 Enterprise public servers

This test mimics the configuration of an enterprise firewall used as a front-end device, which controls the traffic directed to a protected network (e.g., DMZ) that hosts a set of servers that must be reachable from the outside world. We increase the number of public servers that need to be protected, hence tests were repeated with a different number of rules.

10We also tuned some parameters (e.g., max backlog, local port range) in order to reduce the overhead of the web server.

Figure 5.10: Throughput when protecting a variable number of services within a DMZ. Multi-core tests with UDP 64B packets, bidirectional flows.

Ruleset. The first rule accepts all the ESTABLISHED connections towards the protected network; then, a set of rules accepts NEW connections generated by the servers in the protected network towards the outside world; the last set of rules enables the communication towards the services exposed in the protected network by matching on the destination IP, protocol and L4 destination port of the incoming packets. Among the different runs we used an increasing number of rules ranging from 50 to 5K, depending on the number of public services that are exposed to the outside world.

Test setup. All the rules are loaded in the FORWARD chain and the traffic is generated so that 90% of it is evenly distributed among all the rules and 10% matches the default DROP rule. The packet generator is connected to the DUT through two interfaces, simulating a scenario where the firewall sits between the two (public and protected) networks. When traffic belonging to a specific flow is seen in both directions, the session is considered ESTABLISHED and will then match the first rule of the ruleset.

Evaluation metrics. The test has been repeated 10 times; results report the throughput in Mpps (for 64B UDP packets).



Figure 5.11: Multi-core performance under DDoS attack. Number of successful HTTP requests/s under different load rates.

Results. bpf-iptables outperforms existing solutions thanks to the optimized path for the FORWARD chain, which transparently avoids the overhead of the Linux TCP/IP stack, as shown in Figure 5.10. In addition, its throughput is almost independent of the number of rules thanks to the optimization on the ESTABLISHED connections (Section 5.4.4.2), which avoids the overhead of the classification pipeline if the conntrack module recognizes an ESTABLISHED connection that should be accepted. Even though iptables would also benefit from the fact that most packets match the first rule, hence making the linear search faster, the overall performance in Figure 5.10 shows a decrease in throughput when the number of rules in the ruleset grows. This is primarily due to the overhead required to recognize the traffic matching the default rule (DROP in our scenario), which still requires scanning (linearly) the entire ruleset.

5.6.3.2 Performance under DDoS Attack

This test evaluates the performance of the system under DDoS attack. We also analyzed two optimized configurations of iptables and nftables that make use of ipset and of nftables sets, respectively, which ensure better performance when matching an entry against a set of values.

Ruleset. We used a fixed set of rules (i.e., 1000) matching on IP source, protocol and L4 source port, with the DROP action. Two additional rules involve the connection tracking to guarantee the reachability of internal servers: (i) accept all the ESTABLISHED connections and (ii) accept all the NEW connections with destination L4 port 80.



Figure 5.12: Performance with single default ACCEPT rule (baseline). Left: UDP traffic, 64B packets matching the FORWARD chain. Right: number of HTTP requests/s (downloading a 1MB web page), TCP packets matching the INPUT chain.

Test setup and evaluation metrics. The packet generator sends 64B UDP packets towards the server with the same set of source IP addresses and L4 ports configured in the blacklisted rules. DDoS traffic is sent on a first port connected to the DUT, while a weighttp client sends traffic on a second port, simulating legitimate traffic towards an nginx server running in the DUT. Weighttp generates 1M HTTP requests using 1000 concurrent clients; we report the number of successfully completed requests/s, with a timeout of 5 seconds, varying the rate of the DDoS traffic.

Results. Figure 5.11 shows that the performance of bpf-iptables, ipset and nft-set are similar for low-volume DDoS attacks; iptables and nftables are slightly worse because of their inferior matching algorithm. However, with higher DDoS load (> 8Mpps), the performance of ipset and nft-set drops rapidly and the server becomes unresponsive, with almost no requests served; iptables and nftables are even worse (zero goodput at 2.5Mpps). Vice versa, thanks to its matching pipeline at the XDP level, bpf-iptables can successfully sustain ~95,000 HTTP requests/s of legitimate traffic when the DDoS attack rate is more than 40Mpps, i.e., ~60% of the maximum achievable load. Higher DDoS loads were not tested because of a limitation of our traffic generator.


5.6.4 Microbenchmarks

5.6.4.1 Baseline performance

This test analyzes the overhead of bpf-iptables on a vanilla system, without any firewall rule. This represents the most favorable case for iptables, whose cost grows linearly with the number of rules, while bpf-iptables has to pay the cost of some programs at the beginning of the pipeline that must always be active, such as the connection tracking and the logic that applies the default action to all packets (i.e., ALLOW). The left side of Figure 5.12 shows the performance of bpf-iptables, iptables and nftables when the traffic (64B UDP packets) traverses the FORWARD chain. This case shows a considerable advantage of bpf-iptables thanks to its optimized forwarding mechanism (Section 5.4.4.2). The situation is slightly different when the traffic hits the INPUT chain (Figure 5.12, right). In fact, in such a case packets have to follow the usual path towards the stack before reaching the local application, with no chance to shorten their journey. While bpf-iptables does not show the advantages seen in the previous case, it does not show any worsening either, hence demonstrating that the overhead of the running components is definitely limited.

5.6.4.2 Rules insertion time

The LBVS matching algorithm requires the update of the entire pipeline each time the ruleset changes (Section 5.4.4.3). This test evaluates the time required to insert the (n + 1)th rule when the ruleset already contains n rules; in the case of iptables and nftables, this has been measured by computing the time required to execute the corresponding userspace tool. Results, presented in Table 5.1, show that both iptables and nftables are very fast in this operation, which completes in some tens of milliseconds; bpf-iptables, instead, requires a far larger time (varying from 1 to 2.5s with larger rulesets). To understand the reason for this higher cost, we split the bpf-iptables rule insertion time into three different parts. t1 indicates the time required by the bpf-iptables control plane to compute all the value-bitvector pairs for the current ruleset. t2 indicates the time required to compile and inject the new eBPF classification pipeline in the kernel; during this time, bpf-iptables continues to process the traffic according to the old ruleset, with the swap performed only when the new pipeline is ready11. t3 is the time required to delete the old chain, which has no impact on the user experience as the new pipeline is already filtering traffic after t2.
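A common way to obtain this atomic swap in eBPF, and a reasonable approximation of what happens at the end of t2, is to repoint a single entry of a program-array (tail-call) map from the control plane once the new pipeline has been loaded. The pinned map path and the slot index below are hypothetical, so this is a sketch of the pattern rather than the exact bpf-iptables code.

#include <bpf/bpf.h>
#include <linux/types.h>
#include <stdio.h>

/* Point the dispatcher's tail call to the freshly loaded pipeline.
 * A single map update is atomic with respect to running programs:
 * every packet is processed either by the old chain or by the new
 * one, never by a mix of the two. */
int swap_pipeline(int new_pipeline_prog_fd)
{
    int jmp_table_fd = bpf_obj_get("/sys/fs/bpf/bpf_iptables/jmp_table");
    if (jmp_table_fd < 0) {
        perror("bpf_obj_get");
        return -1;
    }

    __u32 slot = 0;   /* entry tail-called by the chain dispatcher */
    if (bpf_map_update_elem(jmp_table_fd, &slot,
                            &new_pipeline_prog_fd, BPF_ANY) != 0) {
        perror("bpf_map_update_elem");
        return -1;
    }
    return 0;
}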

11Since time t2 depends on the number of matching fields required by each rule (bpf-iptables instantiates the minimum set of eBPF programs necessary to handle the current configuration), numbers in Table 5.1 take into account the worst case where all the rules require matching on all the supported fields.


Table 5.1: Comparison of the time required to append the (n + 1)th rule to the ruleset, in milliseconds (ms).

# rules   iptables   nftables   bpf-iptables                      HORUS
                                t1 (1)    t2 (2)    t3 (3)        tH1 (4)   tH2 (5)
0         15         31         0.15      1165      0.34          382       0.0024
50        15         34         2.53      1560      0.36          1.08      0.0026
100       15         35         5.8       1925      0.35          2.06      0.0026
500       16         36         17        1902      0.34          8.60      0.0027
1000      17         69         33.4      1942      0.34          14.4      0.0027
5000      28         75         135       2462      0.38          37.3      0.0031

(1) Time required to compute all the value-bitvector pairs.
(2) Time required to create and load the new chain.
(3) Time required to remove the old chain.
(4) Time required to identify the rules belonging to a HORUS set.
(5) Time required to insert the new rule in the HORUS set.

Finally, the last column of Table 5.1 reports the time required to insert a rule handled by HORUS (Section 5.4.4.2). Excluding the first entry of this set, which requires loading the HORUS eBPF program, all the other entries are inserted in the HORUS set within an almost negligible amount of time (tH2). Detecting whether the new rule belongs to a HORUS set takes more time (tH1 ranges from 1 to 40ms), but this can definitely be reduced with a more optimized algorithm.
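For comparison, a HORUS insertion boils down to a single hash-map update issued from user space, which explains why tH2 stays in the microsecond range regardless of the ruleset size; the map path, key layout and value encoding below are hypothetical and only illustrate the operation.

#include <bpf/bpf.h>
#include <linux/types.h>

struct horus_key {
    __u32 saddr;   /* IPv4 source, network byte order */
    __u32 daddr;   /* IPv4 destination, network byte order */
    __u16 sport;   /* L4 source port */
    __u16 dport;   /* L4 destination port */
    __u8  proto;   /* L4 protocol */
    __u8  pad[3];
};

/* One bpf() syscall per rule: no bitvector recomputation and no
 * pipeline reload are needed for rules handled by HORUS. */
int horus_insert(const struct horus_key *key, __u64 action)
{
    int fd = bpf_obj_get("/sys/fs/bpf/bpf_iptables/horus_set");
    if (fd < 0)
        return -1;
    return bpf_map_update_elem(fd, key, &action, BPF_ANY);
}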

5.6.4.3 Ingress pipeline: XDP vs. TC

bpf-iptables attaches its ingress pipeline to the XDP hook, which enables traffic processing as early as possible in the Linux networking stack. This is particularly convenient when the packet matches the DROP action or when we can bypass the TCP/IP stack and forward the packet immediately to its final destination (optimized forwarding, Section 5.4.4.2). However, when an eBPF program is attached to the XDP hook, the Generic Receive Offload12 feature on that interface is disabled; as a consequence, we may incur higher processing costs in the presence of large incoming TCP flows. Results in Figure 5.13, which refer to a set of parallel TCP flows between the traffic generator and the DUT, with a void INPUT chain and the default ACCEPT action, clearly show how the XDP ingress pipeline pays a higher cost compared to TC, which easily saturates our 40Gbps link13. This higher cost is given by the larger number of (small) packets to be processed by bpf-iptables because of the lack of GRO aggregation; it is important to note that this cost is not present if TCP data exits from the server (outgoing traffic), which is a far more common scenario.

12Generic Receive Offload (GRO) is a software-based offloading technique that reduces the per-packet processing overhead by reassembling small packets into larger ones.



Figure 5.13: TCP throughput when the bpf-iptables ingress pipeline (with zero rules) is executed on either the XDP or the TC ingress hook; bpf-iptables running on a single CPU core; iperf running on all the other cores.
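For completeness, the following user-space sketch shows how an ingress pipeline compiled to an eBPF object file could be attached to the XDP hook with a recent libbpf (1.0+); the object path, program name and interface are placeholders, and attaching to the TC hook would instead go through the tc/netlink interface. This is only an illustration of the attachment step, not the actual bpf-iptables loader.

#include <bpf/libbpf.h>
#include <linux/if_link.h>
#include <net/if.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    const char *ifname = argc > 1 ? argv[1] : "eth0";
    int ifindex = if_nametoindex(ifname);
    if (!ifindex) { perror("if_nametoindex"); return 1; }

    struct bpf_object *obj = bpf_object__open_file("ingress_pipeline.bpf.o", NULL);
    if (!obj || bpf_object__load(obj)) {
        fprintf(stderr, "failed to open/load the eBPF object\n");
        return 1;
    }

    struct bpf_program *prog = bpf_object__find_program_by_name(obj, "xdp_ingress");
    if (!prog) { fprintf(stderr, "program not found\n"); return 1; }

    /* Native (driver-mode) XDP: packets are seen before any sk_buff is
     * allocated, but GRO is disabled on the interface as discussed above. */
    if (bpf_xdp_attach(ifindex, bpf_program__fd(prog), XDP_FLAGS_DRV_MODE, NULL) < 0) {
        perror("bpf_xdp_attach");
        return 1;
    }
    printf("XDP ingress pipeline attached to %s\n", ifname);
    return 0;
}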

5.7 Additional Discussion

Although one of the main assumptions of our work was to rely on a vanilla Linux kernel (Section 5.2.4), as possible future work we present here a set of viable kernel modifications that are compatible with bpf-iptables and that may enable new optimizations.

New kernel hooks. Being based on eBPF, bpf-iptables uses a different set of hooks compared to the ones used by the netfilter subsystem (Section 5.2.1). This introduces the need to predict, in a preceding eBPF hook, some decisions that would be performed only later in the Linux stack. A possible alternative consists of adding new eBPF hooks that operate in netfilter, hence enabling the replacement of the portions of that framework that suffer the most in terms of performance (e.g., the iptables classification pipeline), while reusing existing

13To prevent TCP and application-level processing from becoming the bottleneck, we set all the NIC interrupts to a single CPU core, on which bpf-iptables has to be executed, while iperf uses all the remaining ones.

and well-tested code (e.g., netfilter conntrack). Although this would be the most suitable choice for a 100% iptables-compatible eBPF-based firewall, on the other hand it would unavoidably limit the overall performance of the system. In fact, this would set the baseline performance of bpf-iptables to that of the corresponding TCP/IP stack layer, because of the large amount of code shared between the two approaches and the impossibility to leverage the earlier processing provided by the XDP hook. Moreover, the early packet steering provided by XDP also enables the creation of the exact processing pipeline that is required at any given moment in time, instantiating only the proper eBPF modules. This avoids any source of overhead in the processing path of a packet, which would not be possible in case existing kernel network stack components are used.

New eBPF helpers. Adding new eBPF helpers is definitely a suitable direction, in particular with respect to our eBPF conntrack (Section 5.4.5), which is far from complete and supports only basic scenarios. A dedicated helper would enable a more complete implementation without having to deal with the well-known limitations of eBPF programs (e.g., number of instructions, loops). A similar helper that reused the netfilter connection tracking was proposed in [158], and it was at the foundation of an alternative version of bpf-iptables [111]. However, based on the above prototype, we would suggest a custom implementation of the conntrack module in order to be independent of the network stack; the above implementation assumed the use of the sk_buff structure and hence was available only to eBPF programs attached to the TC hook.

Improve eBPF internals. One of the biggest limitations of the eBPF subsystem that we faced in this work is the maximum number of allowed instructions, currently constrained to 4K, which limited the maximum number of supported rules to ~8K (Section 5.4.4, without HORUS). In this respect, the extension of the eBPF verifier (available for kernel v5.3+) to support bounded loops would be extremely helpful [136]. In the same way, a recent patch [150] that introduced support for larger eBPF programs, up to one million instructions14, increases the number of supported rules without any change in the overall design of the system.

5.8 Conclusions

This Chapter presents bpf-iptables, an eBPF-based Linux firewall designed to preserve the iptables filtering semantics while improving its speed and scalability, in particular when a high number of rules is used. Being based on eBPF,

14This value indicates the number of instructions processed by the verifier.

bpf-iptables is able to take advantage of the characteristics of this technology, such as the dynamic compilation and injection of the eBPF programs in the kernel at run-time, in order to build an optimized data path based on the actual firewall configuration. The tight integration of bpf-iptables with the Linux kernel may represent a great advantage over other solutions (e.g., DPDK) because of the possibility to cooperate with the rest of the kernel functions (e.g., routing) and the other tools of the Linux ecosystem. Furthermore, bpf-iptables does not require custom kernel modules or additional software frameworks that may not be allowed in some scenarios such as public data centers. Bpf-iptables guarantees a huge performance advantage compared to existing solutions, particularly in case of a high number (i.e., up to 32K in the current prototype) of filtering rules; furthermore, it does not introduce undue overheads in the system when no rules are instantiated, even though in some cases the use of XDP on the ingress hook could hurt the overall performance of the system. Existing eBPF limitations have been circumvented with ad-hoc engineering choices (e.g., classification pipeline) and clever optimizations (e.g., HORUS), which guarantee further scalability and fast update times. On the other hand, bpf-iptables currently supports only a subset of the features available in netfilter-based firewalls. For instance, iptables is often used to also handle NAT functions, which we have not considered in this Chapter, as well as the features available in ebtables and arptables. Those functionalities, together with the support for additional matching fields, are considered as possible directions for our future work.

Chapter 6

Introducing SmartNICs in Server-based Data Plane Processing: the DDoS Mitigation Use Case

6.1 Introduction

With the objective of further reducing the workload on the precious general-purpose CPU cores of the servers, system administrators have recently resumed the old idea of introducing programmable intelligent networking adapters (a.k.a. SmartNICs) in their servers [156, 103, 64, 41], combining the flexibility of software network functions with the improved performance of hardware NIC acceleration. Smart Network Interface Cards (SmartNICs) offer hardware accelerators that make it possible to partially (or fully) offload packet processing functions; examples include load balancing [116], key-value stores [145] or more generic flow-level network functions [128, 118]. On the other hand, SmartNICs may present additional challenges due to their limited memory and computation capabilities compared to current high-performance servers. In this Chapter we explore the potential of exploiting SmartNICs on a specific use case, i.e., the mitigation of volumetric DDoS attacks, which are considered one of the major threats in today's Internet, accounting for 75.7% of the total DDoS attacks [5, 149, 4]. While the detection of DDoS attacks is a largely studied problem in the literature, with several algorithms proposed to rapidly and efficiently detect an ongoing attack, in this Chapter we focus on the challenges related to DDoS attack mitigation; in particular, we explore how the recent advances in host data-plane acceleration can be used to adequately handle the large speeds required by today's networks. This work is particularly beneficial also for the


application shown in Chapter 5, where the bpf-iptables classification pipeline can be enhanced and improved with the hardware filtering mechanism available in the SmartNIC. We provide the following contributions. First, we analyze the various approaches that can be used to design an efficient and cost-effective DDoS mitigation solution. As generally expected, our results show that offloading the mitigation task to the programmable NIC yields significant performance improvements; however, we also demonstrate that, due to the memory and compute limitations of current SmartNIC technologies, a fully offloaded solution may lead to deleterious performance. Second, as a consequence of the previous findings, we propose the design and implementation of a hybrid mitigation pipeline architecture that leverages the flexibility of eBPF/eXpress Data Path (XDP) to handle different types of traffic and attackers, and the efficiency of the hardware-based filtering in the SmartNIC to discard traffic from malicious sources. Third, we present a mechanism to transparently offload part of the DDoS mitigation rules into the SmartNIC, which takes into account the most aggressive sources, i.e., the ones that most heavily impact the mitigation effectiveness. The rest of the Chapter is structured as follows. Section 6.2 presents a high-level overview of the SmartNIC and of TC Flower, the flow classifier of the Linux traffic control kernel subsystem. Section 6.3 analyzes the different approaches that can be used to build an efficient DDoS mitigation solution. Section 6.4 presents the design of an architecture that uses the above-mentioned technologies to both detect and mitigate DDoS attacks, including the offloading algorithm adopted to install the rules into the SmartNIC (Section 6.4.1.1), while keeping the flexibility and improved performance of in-kernel XDP packet processing. Finally, Section 6.5 provides the experimental evidence for the previous findings, Section 6.6 briefly discusses the related works and Section 6.7 concludes the Chapter.

6.2 Background

6.2.1 SmartNICs

Smart Network Interface Cards (SmartNICs) are intelligent adapters used to boost the performance of servers by offloading (part of) the network processing workload from the host CPU to the NIC itself [154]. Although the term SmartNIC is widely used in the industry and academic world, there is still some confusion over its precise definition. We consider traditional NICs the devices that provide several pre-defined offloaded functions (e.g., transmit/receive segmentation offload, checksum offload) without including a fully programmable processing path, e.g., which may involve the presence of a general-purpose CPU on board. In our context, a SmartNIC is a NIC equipped with a fully-programmable system-on-chip


(SoC) multi-core processor that is capable of running a fully-fledged operating system, offering more flexibility and hence potentially taking care of any arbitrary network processing task. This type of SmartNIC can also be enhanced with a set of specialized hardware functionalities that can be used to accelerate specific classes of functions (e.g., the Open vSwitch data plane) or to perform generic packet and flow filtering. On the other hand, they have limited compute and memory capabilities, making it not always possible (or efficient) to completely offload all types of tasks. Furthermore, SmartNICs feature their own operating system and therefore may have to be handled separately from the host. For instance, offloading a network task to the SmartNIC may require the host to have multiple interactions with the card, such as compiling and injecting the new eBPF code, or executing additional commands (either on the host, or directly on the card) to exploit the available features, such as configuring hardware co-processors. Finally, no standard currently exists to interact with SmartNICs, hence different (and often proprietary) methods have to be implemented when support for several manufacturers is required.

6.2.2 TC Flower

The Flow Classifier is a feature of the Linux Traffic Control (TC) kernel subsystem that provides the possibility to match, modify and apply different actions to a packet based on the flow it belongs to. It offers a common interface for hardware vendors to implement an offloading logic within their devices; when a TC Flower rule is added, active NIC drivers check whether that rule is supported in hardware; in that case the rule is pushed to the physical card, causing packets to be directly matched in the hardware device [77], hence resulting in greater throughput and a decrease in host CPU usage. TC Flower represents a promising technology that can hide the differences between hardware manufacturers, but it is not (yet) able to support all the high-level features that may be available in modern SmartNICs.

6.3 DDoS Mitigation: Approaches

Once a DDoS attack is detected, efficient packet dropping is a fundamental part of a DDoS attack mitigation solution. In a typical DDoS mitigation pipeline, a set of mitigation rules are deployed in the server’s data plane to filter the malicious traffic. The strategy used to block the malicious sources may be determined by several factors such as the characteristics of the server (e.g., availability of a SmartNIC, its hardware capabilities), the characteristics of the malicious traffic (e.g., number of attackers) or the type and complexity of the rules that are used to classify the illegitimate traffic. In particular, we envision the following three approaches.


6.3.0.1 Host-based mitigation

In this case all traffic (either malicious or legitimate) is processed by the host CPU, which drops incoming packets that match a given blacklist of malicious sources; this represents the only viable option if the system lacks any underlying hardware accelerator/offloading functionality. All the host-based mitigation techniques and tools used today fall into two different categories, depending on whether packets are processed at kernel or user-space level. Focusing on Linux-based systems, the first category includes iptables and its derivatives, such as nftables, which represent the main tools used to mitigate DDoS attacks. It allows expressing complex policies on the traffic, filtering packets inside the netfilter subsystem. However, the deep level of the networking stack where the packet processing occurs causes poor performance when coping with the increasing speed of today's DDoS attacks, making this solution practically unfeasible, as demonstrated in Section 6.5. As opposed to kernel-level processing, a multitude of fast packet I/O frameworks relying on specialized NIC/networking drivers and user-space processing have been built over the past years. Examples such as Netmap [133], DPDK [54] and PF_RING ZC [121] rely on a small kernel component that maps the NIC device memory directly to user space, hence making it directly available to (network-specialized) userland applications instead of relying on the normal kernel data-path processing. This approach provides huge performance benefits compared to standard kernel packet processing, but incurs several non-negligible drawbacks. First of all, these frameworks require taking exclusive ownership of the NIC, so that all received packets are processed by the userspace application. This means that, in a DDoS mitigation scenario, packets belonging to legitimate sources have to be inserted back into the kernel, causing unnecessary packet copies that slow down the performance1. Furthermore, these frameworks require the fixed allocation of one (or more) CPU cores to the above programs, independently of the presence of an ongoing attack, hence reducing the performance-cost ratio, as precious CPU resources are no longer available for normal processing tasks (e.g., virtual machines). eBPF/XDP can be considered as a mix of the previous approaches. It is technically a kernel-space framework, although XDP programs can be injected from userspace into the kernel, after guaranteeing that all security properties are satisfied. XDP programs are executed in the kernel context but as early as possible, well before the netfilter framework, hence providing an improvement of an order of magnitude compared to iptables. The adoption of XDP to implement packet filtering functionalities has grown over the years: (i) its perfect integration with the

1It is worth mentioning that Netmap has a better kernel integration compared to DPDK; in fact, it is possible to inject packets back into the kernel by just passing a pointer, without any copy. However, it is still subject to a high CPU consumption compared to eBPF/XDP.


Linux kernel makes it more efficient to pass legitimate packets up to the stack, (ii) its simple programming model makes it easy to express customized filtering rules without taking care of low-level details, as required by common user-space frameworks, and (iii) its event-driven execution makes it possible to consume resources only when necessary, providing a good trade-off between performance and CPU consumption.
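A minimal XDP blacklist of the kind discussed above can be written in a few lines of restricted C; the map name, its size and the per-source counter are illustrative assumptions (and a recent clang/libbpf toolchain is assumed), but the structure mirrors the typical XDP-based dropping programs used throughout this Chapter.

#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/ip.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 131072);
    __type(key, __u32);     /* blacklisted IPv4 source address */
    __type(value, __u64);   /* per-source dropped-packet counter */
} blacklist SEC(".maps");

SEC("xdp")
int xdp_ddos_filter(struct xdp_md *ctx)
{
    void *data = (void *)(long)ctx->data;
    void *data_end = (void *)(long)ctx->data_end;

    struct ethhdr *eth = data;
    if ((void *)(eth + 1) > data_end)
        return XDP_PASS;
    if (eth->h_proto != bpf_htons(ETH_P_IP))
        return XDP_PASS;

    struct iphdr *ip = (void *)(eth + 1);
    if ((void *)(ip + 1) > data_end)
        return XDP_PASS;

    __u32 saddr = ip->saddr;
    __u64 *dropped = bpf_map_lookup_elem(&blacklist, &saddr);
    if (dropped) {
        __sync_fetch_and_add(dropped, 1);   /* statistics for the rate monitor */
        return XDP_DROP;                    /* blacklisted source: drop early */
    }
    return XDP_PASS;                        /* benign traffic goes up the stack */
}

char _license[] SEC("license") = "GPL";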

6.3.0.2 SmartNIC-based mitigation

If the server is equipped with a SmartNIC, an alternative approach is to offload the entire mitigation task to this device. This makes it possible to dedicate all the available resources of the host CPU to the target workloads, operating only on the legitimate traffic and freeing the host CPU from spending precious cycles in the mitigation. However, although SmartNICs (by definition) support arbitrary data path processing, they often differ on how this can be achieved. Possible options range from running a custom executable, which should already be present on the card, to dynamically injecting a new program created on the fly, e.g., thanks to technologies such as XDP or P4, or to directly compiling those programs into the hardware device [34]. This makes it more cumbersome to implement offloading features that run on cards from multiple manufacturers. In our context, we envision two different options: (i) exploit any hardware filter (if available) in the SmartNIC and, if the number of blacklisted addresses exceeds the capability of the hardware (which may be likely, given the typical size of the above structure), block the rest of the traffic with a custom dropping program (e.g., XDP) running on the NIC CPU; (ii) block all the packets in software, running entirely on the SmartNIC CPU, e.g., in case the card does not have any hardware filtering capability. In both cases, the surviving (benign) traffic is redirected to the host where the rest of the server applications are running. An evaluation of the above possibilities will be carried out in Section 6.5.

6.3.0.3 Hybrid (SmartNIC + XDP Host)

An alternative strategy that combines the advantages of the previous approaches is to adopt a hybrid solution where part of the malicious traffic is dropped by the SmartNIC (reducing the overhead on the host's CPU) and the remaining part is handled on the host, possibly leveraging the much greater processing power available in modern server CPUs compared to the one available in embedded devices. In this scenario, we exploit the fixed hardware functions commonly available in current SmartNICs to perform stateless matching on selected packet fields and apply simple actions such as modifying, dropping or allowing packets. To avoid redirecting all the traffic to the (less powerful) SmartNIC CPU, we could let it pass through the


above hardware tables (where the match/drop is performed at line rate) and forward the rest of the packets to the host, where the remaining part of the mitigation pipeline is running. However, given the limited number of entries often available in the above hardware tables, which are not enough to contain the large number of mitigation rules needed during a large DDoS attack, the whole list of dropping targets is partitioned between the NIC and the host dropping program (e.g., XDP). This requires specific algorithms to perform this splitting, which should take into account the difference in terms of supported rules and their importance. Interestingly, this scenario, in which the companion filtering XDP program is executed in the server, is also compatible with some traditional NICs that support fixed hardware traffic filtering, such as Intel cards with Flow Director2. In this case, the mitigation module can use the card-specific syntax (e.g., Flow Director commands) to configure filtering rules, with a consequent decrease of the filtering processing load in the host.

6.4 Architecture and Implementation

This section presents a possible architecture that can be used to compare the previous three approaches in the important use case of DDoS mitigation, enabling a fair comparison of their respective strengths and weaknesses in the implementation of an efficient and cost-effective mitigation pipeline. In particular, we present the different components constituting the proposed architecture (shown in Figure 6.1) and their role, together with some implementation details that result from the use of the assessed technologies.

6.4.1 Mitigation

The first program encountered in the pipeline is the filtering module, which matches the incoming traffic against the list of blacklisted entries to drop packets coming from malicious sources; surviving packets are redirected to the host, where additional (more advanced) checks can be performed, before being passed directly to the next program in the pipeline (i.e., the feature extraction). Although our architecture is flexible enough to instantiate the filtering program in different locations (e.g., SmartNIC, host, or even partitioned across the two above), at the beginning we instantiate an XDP filtering program in the host in order to obtain the necessary traffic information and decide the best mitigation strategy. If the userspace DDoS mitigation module recognizes the availability of the hardware offload functionality in the SmartNIC, it starts adding the filtering

2The Flow Director is an Intel feature that supports advanced filters and packet processing in the NIC; for this reason it is often used in scenarios where packets are small and traffic is heavy (e.g., DoS attacks).



Figure 6.1: High-level architecture of the system.

rules into the hardware tables, causing malicious packets to be immediately dropped in hardware. However, since those tables often have a limited size (typically ∼1-2K entries), we place the most active top-K malicious talkers in the SmartNIC hardware tables, where K is the size of those tables, while the remaining ones are filtered by the XDP program running either on the SmartNIC CPU or on the host, depending on a configuration option that enables us to compare the results under different operating conditions.

6.4.1.1 Offloading algorithm

The selection of the top-K malicious talkers that are most appropriate for hardware offloading is carried out by the rate monitor module, which computes a set of statistics on the dropped traffic and applies a hysteresis-based function to predict the advantages of possibly modifying the list of offloaded rules that are active in the SmartNIC. In fact, altering this list requires either computational resources or time (on our card a single rule update may require up to 2 ms), which may be unnecessary if the rank of the new top-K rules does not effectively impact the mitigation effectiveness. The pseudo-code of our algorithm is shown in Algorithm 2. First, it computes a list of the global top-K sources, which contains both SmartNIC and XDP entries


Algorithm 2: Offloading algorithm
Input: K, the max # of supported SmartNIC entries
Output: υ′k, the list of SmartNIC entries

1   γk ← TOP-K global entries
2   υk ← TOP-K SmartNIC entries
3   sortDescending(γk)
4   sortAscending(υk)
5   γ′k ← γk − υk                /* remove already offloaded entries */
6   υ′k ← υk − γk                /* list of non TOP-K rules */
7   foreach γ′i,k ∈ γ′k do
8       βi ← offloadGain(γ′i,k, υ′i,k)
9       if βi ≥ threshold then
10          υ′k ← υ′k − υ′i,k    /* remove old entry from offload list */
11          υ′k ← υ′k + γ′i,k    /* add new entry into offload list */

sorted in descending order according to their rate, and a second list containing only the offloaded entries, i.e., the ones present in the SmartNIC hardware tables, which is arranged in ascending order. Next, it computes the difference of the above lists, resulting in two lists containing two disjoint sets of elements; the first list contains all the candidate rules that are not yet in the SmartNIC and the second list includes the SmartNIC entries that are no longer in the top-K. At this point, starting from the first element of the former list, it calculates the possible benefit obtained by removing the first entry of the second list (given by the ratio between the rates of the two entries) and inserting this new entry in the SmartNIC; if the value is greater than a certain threshold, the entry is moved into the offloaded list and the algorithm continues with the next entry. This threshold is adjusted according to the current volume of DDoS traffic and is inversely proportional to it; this avoids unnecessary changes in the top-K SmartNIC list when the traffic rate is low (compared to the maximum achievable rate), which would bring a negligible improvement. On the other hand, it increases the update likelihood when the volume of traffic is close to the maximum achievable rate; in this scenario, where the system is overloaded, mitigating even slightly more aggressive talkers may introduce substantial performance benefits.
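The hysteresis check at the core of this algorithm can be sketched in plain C as follows; the threshold policy and the numbers are illustrative assumptions chosen only to show how the decision becomes more aggressive as the attack volume approaches the line rate, and do not reproduce the exact implementation.

#include <stdio.h>
#include <stdbool.h>

struct talker {
    unsigned int ip;     /* blacklisted source */
    double       rate;   /* dropped packets/s measured by the rate monitor */
};

/* Gain of replacing the least aggressive offloaded entry with the
 * candidate: the ratio between the two rates, as in Algorithm 2. */
static double offload_gain(const struct talker *candidate,
                           const struct talker *victim)
{
    if (victim->rate <= 0)
        return candidate->rate > 0 ? 1e9 : 0;   /* empty slot: always worth it */
    return candidate->rate / victim->rate;
}

static bool should_swap(const struct talker *candidate,
                        const struct talker *victim,
                        double attack_rate, double max_rate)
{
    /* Threshold inversely proportional to the current attack volume:
     * be conservative when the system is far from saturation. */
    double threshold = 1.0 + 2.0 * (1.0 - attack_rate / max_rate);
    return offload_gain(candidate, victim) >= threshold;
}

int main(void)
{
    struct talker candidate = { .ip = 0x0a000001, .rate = 1.8e6 };
    struct talker victim    = { .ip = 0x0a0000ff, .rate = 1.0e6 };
    printf("swap at low load:  %d\n", should_swap(&candidate, &victim, 5e6, 37.2e6));
    printf("swap at high load: %d\n", should_swap(&candidate, &victim, 35e6, 37.2e6));
    return 0;
}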

6.4.2 Feature extraction

Although not strictly belonging to the mitigation pipeline, the feature extraction module monitors the incoming traffic and collects relevant parameters required by the mitigation algorithm (e.g., counting the number of packets for each combination of source and destination hosts). Being placed right after the mitigation module, it

receives all the (presumed) benign traffic that has not been previously dropped, so that it can be further analyzed and then passed up to the target applications. XDP represents the perfect technology to implement this component since it provides (i) the low overhead given by kernel-level processing and (ii) the possibility to dynamically change the behavior of the system by re-compiling and re-injecting (in the kernel) an updated program when we require the extraction of a different set of features. Moreover, XDP offers the possibility to export the extracted information into specific key-value data structures shared between the kernel and userspace (i.e., where the DDoS attack detection algorithm is running) or to directly send the entire packet up to userspace if a more in-depth analysis is needed. In the former case, data are stored in a per-CPU eBPF hash map, which is periodically read by the userspace attack detection application. Since multiple instances of the same XDP program are executed in parallel on different CPU cores, each one processing a different packet, the use of a per-CPU map guarantees very fast access to data thanks to its per-core dedicated memory. As a result, each instance of the feature extraction works independently, saving the statistics of each IP source/destination in its own private map. In the latter case, a specific eBPF helper is used to copy packets to a perf event ring buffer, which is then read by the userspace application.

Analysis and Aggregation. Computed traffic statistics are retrieved from each kernel-level hash map, aggregated by the companion userspace application and saved in memory for further processing. This process was found to be relatively slow; our tests report an average of 30µs to read a single entry from the eBPF map, requiring more than ten seconds to process the entire dataset in case of large DDoS attacks (e.g., ∼300K entries). In fact, eBPF does not provide any possibility to read an entire map within a single bpf() system call, hence each value must be read separately. As a consequence, to guarantee coherent data to the userspace detection application, we should lock the entire table while reading the values, but this would result in the impossibility for the kernel to process the current incoming traffic for a considerable amount of time. To overcome this issue, we adopted a swappable dual-map approach, in which the userspace application reads data from a first eBPF map that represents a snapshot of the traffic statistics at a given time, while the XDP program computes the traffic information for the incoming packets received in the previous timespan and saves it in a second map. This process is repeated every time the periodic user-space detection process is triggered, allowing the detection algorithm to always work with consistent data. From the implementation point of view, we opted for a swappable dual-program approach instead of a swappable dual-map because of its reduced swapping latency. We create two feature extraction XDP programs, each one with its own hash map, and swap them atomically by asking the filtering module to dynamically update the address of the next program in the pipeline, which basically means updating the target address of an assembly jump instruction.
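The aggregation step described above boils down to walking the per-CPU hash map key by key and summing the per-core values, as in the following user-space sketch; the pinned map path and the key layout are hypothetical. The loop also makes evident why a full scan of a large map takes seconds, since every key costs two bpf() system calls.

#include <bpf/bpf.h>
#include <bpf/libbpf.h>
#include <linux/types.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int map_fd = bpf_obj_get("/sys/fs/bpf/feature_stats");   /* pinned per-CPU hash */
    if (map_fd < 0) { perror("bpf_obj_get"); return 1; }

    int ncpus = libbpf_num_possible_cpus();
    __u64 *values = calloc(ncpus, sizeof(*values));
    if (!values) return 1;

    /* Hypothetical key: packed IPv4 source/destination pair (64 bits). */
    __u64 key, next_key;
    void *prev = NULL;
    /* Walk all keys: one bpf() syscall for get_next_key and one for the
     * lookup per entry, which is what makes scanning ~300K entries slow. */
    while (bpf_map_get_next_key(map_fd, prev, &next_key) == 0) {
        if (bpf_map_lookup_elem(map_fd, &next_key, values) == 0) {
            __u64 total = 0;
            for (int cpu = 0; cpu < ncpus; cpu++)
                total += values[cpu];           /* sum the per-core counters */
            printf("key %llu: %llu packets\n",
                   (unsigned long long)next_key, (unsigned long long)total);
        }
        key = next_key;
        prev = &key;
    }
    free(values);
    return 0;
}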


6.4.3 Detection

The identification of a DDoS attack is performed by the detection module, which operates on the traffic statistics presented in the previous section and exploits the retrieved information to identify the right set of malicious sources, which are then inserted in the blacklist map used by the filtering module to drop the traffic. Since the selection of the best detection algorithm is out of focus here, we provide only a brief description of the possible choices which, however, need to be carefully selected depending on the characteristics of the environment and the type of workloads running on the end-hosts. Different approaches are available [149, 92], falling into two main categories: (i) anomaly-based detection mechanisms based on entropy [19, 18, 23], used to detect variations in the distribution of traffic features observed in consecutive timeframes, and (ii) signature-based approaches that employ a-priori knowledge of attack signatures to match incoming traffic and detect intrusions. Of course, the type of detection algorithm may influence the traffic information exported by the feature extraction module; however, thanks to the excellent programmability of XDP we can change the behavior of the program without impacting the rest of the architecture.
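As an illustration of the first category, the following plain-C sketch computes the Shannon entropy of the source-IP distribution collected by the feature extraction module over one time window; it is only a toy example of the entropy-based approach cited above, not the detection algorithm used in our evaluation.

#include <math.h>
#include <stdio.h>

/* counts[i] = packets seen from source i in the last window */
static double source_entropy(const unsigned long *counts, int n)
{
    unsigned long total = 0;
    for (int i = 0; i < n; i++)
        total += counts[i];
    if (total == 0)
        return 0.0;

    double h = 0.0;
    for (int i = 0; i < n; i++) {
        if (!counts[i])
            continue;
        double p = (double)counts[i] / (double)total;
        h -= p * log2(p);
    }
    return h;   /* a sharp change across windows may signal an attack */
}

int main(void)
{
    unsigned long normal[4] = { 100, 120, 90, 110 };   /* balanced traffic */
    unsigned long attack[4] = { 10000, 5, 3, 2 };      /* one dominant source */
    printf("normal window: H = %.3f bits\n", source_entropy(normal, 4));
    printf("attack window: H = %.3f bits\n", source_entropy(attack, 4));
    return 0;   /* compile with -lm */
}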

6.4.4 Rate Monitor

Sometimes, a given detection algorithm may erroneously classify some legitimate sources as attackers. To counter this situation, a specific mechanism is used to remove from the blacklist a source that is no longer considered malicious, e.g., because it was marked as an attacker by mistake or because it no longer participates in the attack. This task is performed by the rate monitor, which starts from the global list of blacklisted addresses, sorted according to their traffic volume, and examines the entries at the bottom of the list (i.e., the ones sending less traffic), comparing them with a threshold value; if the current transmission rate of the source under consideration is below the threshold, defined as the highest rate of packets with the same source observed under normal network activity, it is removed from the blacklist. In case the host is removed by mistake, the detection algorithm will re-add it to the list of malicious sources in the next iteration.

6.5 Performance evaluation

This section provides an insight into the benefits of SmartNICs in the important use case of DDoS mitigation. First, it outlines the test environment and the evaluation metrics; then, exploiting the previously described architecture, it analyzes different approaches that exploit SmartNICs and/or other recent Linux technologies such as eBPF/XDP for DDoS mitigation, comparing them with the performance achievable with commonly used Linux tools (i.e., iptables).


6.5.1 Test environment

Our testbed includes a first machine used as packet generator, which creates a massive DDoS attack with an increasing number of attack sources, and a second server running the DDoS mitigation pipeline. Both servers are equipped with a quad-core Intel Xeon E3-1245 v5 CPU @3.50GHz, 8MB of L3 cache and two 16GB DDR4-2400 RAM modules, running Ubuntu 18.04.2 LTS and kernel 4.15. The two machines are linked with two 25Gbps Broadcom Stingray PS225 SmartNICs [80], with each port directly connected to the corresponding one of the other server. We used Pktgen-DPDK v3.6.4 and DPDK v19.02 to generate the UDP traffic (with small 64B packets3) simulating the attack. We report the dropping rate of the system and the CPU usage, which are the two fundamental parameters to take into account during an attack. We also measure the capability of the server to perform real work (i.e., serve web pages) while under attack, comparing the results of the different mitigation approaches. In this case, the legitimate traffic is generated using the open-source benchmarking tool weighttp, which creates a high number of parallel TCP connections towards the device under test; in this case we count only the successfully completed TCP sessions.

6.5.2 Mitigation performance

The first test measures the ability of the server to react to massive DDoS attacks that involve an increasing number of sources (i.e., bots), showing the performance of different mitigation approaches in terms of dropping rate (Mpps) and CPU consumption. We generate 64B UDP packets at line rate at 25Gbps (i.e., 37.2Mpps); we consider both a scenario where the traffic is uniformly distributed among all sources (Figure 6.2a) and a situation where the traffic generated by each source follows a Gaussian distribution (Figure 6.2b). In addition, we report the CPU consumption for the first test (uniform distribution) in Figure 6.3.

6.5.2.1 Iptables

One of the most common approaches for DDoS attack mitigation relies on iptables, a Linux tool anchored to the netfilter framework that can filter traffic, perform network address translation and manipulate packets. For this test we deployed all the rules containing the source IPs to drop in the PREROUTING netfilter chain, which provides higher efficiency compared to the more common INPUT chain, which is encountered later in the networking stack. Figures 6.2a and 6.2b show how the dropping rate of iptables is rather limited, around 2.5-4.5Mpps, even with a relatively small number of attack sources, making this solution incapable of dealing with the massive DDoS attacks under consideration.

3We use 64B packets because they represent the minimum size of an Ethernet frame and thus are a good measure of the lower-bound performance achievable by the system.



Figure 6.2: Dropping rate with an increasing number of attackers. (a): uniformly distributed traffic; (b): traffic normally distributed among all sources.

This is mainly due to the linear matching algorithm used by iptables, whose performance degrades rapidly when an increasing number of rules is used, leading to a throughput almost equal to zero with more than 4K rules. The CPU consumption (Figure 6.3) confirms this limitation; using iptables to mitigate large DDoS attacks would saturate the CPUs of the system, which would be busy discarding traffic rather than executing the target services.

6.5.2.2 Host-based mitigation

Compared to iptables, XDP intercepts packets at a lower level of the stack, right after the NIC driver. This test runs the entire mitigation pipeline in XDP without any help from the SmartNIC, which simply redirects all the packets to the host where the XDP program is triggered. The dropping efficiency of XDP is much higher than that of iptables, being able to discard ∼26Mpps with up to 1K sources, and still ∼10Mpps with 128K attackers, using all CPU cores of the target machine4. This performance degradation is due to the eBPF map used (BPF_HASH), in which the lookup time, needed to match the IP source of the current packet against the blacklist, is influenced by the total number of map entries.

4In our case, the limiting factor is our Intel Xeon E3-1245 CPU, which is able to drop around 10Mpps within a single core, as opposed to other (more powerful) CPUs that are able to achieve higher rates (e.g., 24Mpps [75]).


Figure 6.3: Host CPU usage of the different mitigation approaches under a simulated DDoS attack (uniform distribution).

6.5.2.3 SmartNIC-based mitigation

In this case the mitigation pipeline is executed entirely on the SmartNIC. We performed a first test where the attack is mitigated only through an XDP filtering program running on the SmartNIC CPU, without any help from the hardware filter. The results in Figures 6.2a and 6.2b confirm a performance degradation compared to the host-based mitigation due to the slower CPU of the NIC, balanced by the fact that we do not consume any CPU cycles in the host (Figure 6.3), hence leaving room for other applications. A second test exploits a mixture of hardware filtering and XDP-based software filtering in the card. The results show that for relatively small numbers of attack sources (less than 512) the dropping rate is equal to the maximum achievable rate (37.2Mpps); in fact, the first K rules (where K=512 in our card) are inserted in the SmartNIC hardware tables, causing all the packets to be dropped at line rate. However, when dealing with larger attacks (more than 1K sources), the dropping rate immediately decreases, since an increasing number of entries fall outside the SmartNIC hardware tables; as a consequence, the dropping rate is determined by the performance of the XDP program running on the SmartNIC CPU. This approach may be reasonable when the DDoS attack rate does not exceed the maximum dropping rate achievable by the SmartNIC CPU, which in our case is approximately 15Mpps; more massive attacks will cause the SmartNIC to drop packets without processing them, with a higher chance of dropping legitimate traffic as well, as highlighted in Section 6.5.3.


6.5.2.4 Hybrid (NIC Hardware Tables + XDP Host)

In this case the offloading algorithm splits the mitigation pipeline between the SmartNIC hardware tables and the XDP filtering program running on the host. We notice that for large attacks the dropping rate is considerably higher than in the HW + XDP SmartNIC case, thanks to the higher performance of the host CPU compared to the SmartNIC one. Although hardware filtering is also available on some “traditional” NICs (e.g., Intel with Flow Director), we were unable to implement the hybrid approach on them because of the lack of hardware counters measuring the dropped packets for each source, which are required by our algorithm; however, we cannot exclude that other mitigation algorithms could leverage the hardware speed-up provided by those cards as well.

6.5.2.5 Final considerations

Figures 6.2a and 6.2b confirm a clear advantage of hardware offloading, which becomes even more evident depending on the distribution of the traffic. For instance, in the second scenario (Figure 6.2b, with some sources generating more traffic than others) we reach even higher dropping performance, thanks to the offloading algorithm that places the top-K malicious talkers in the SmartNIC, resulting in more traffic dropped in hardware. The CPU consumption shown in Figure 6.3 also confirms the clear advantage of the offloading, particularly when most of the traffic is handled by the SmartNIC hardware, hence sparing the host CPU from handling that portion of malicious traffic. It is worth noticing that the case where a server has to cope with a limited number of malicious sources may be rather common, as the incoming traffic in datacenters may be balanced across multiple servers (backends), each one being asked to handle a portion of the connections and, hence, also a subset of the current attackers.
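The partitioning step of the adaptive offloading mechanism can be summarized as in the sketch below. This is an illustrative reconstruction based on the description above (rank the attack sources by how much traffic they generate and push the K most aggressive ones into the SmartNIC hardware tables); it is not the code of the prototype, and install_hw_rule()/install_xdp_rule() are hypothetical placeholders.

#include <stdint.h>
#include <stdlib.h>

#define HW_TABLE_SIZE 512  /* K: hardware filtering entries available on our SmartNIC */

struct source_stat {
    uint32_t ip;       /* attacker IPv4 address */
    uint64_t dropped;  /* packets dropped for this source in the last interval */
};

/* Placeholders for the driver- and XDP-specific installation routines. */
static void install_hw_rule(uint32_t ip)  { (void)ip; }
static void install_xdp_rule(uint32_t ip) { (void)ip; }

static int by_drops_desc(const void *a, const void *b)
{
    const struct source_stat *sa = a, *sb = b;
    return (sa->dropped < sb->dropped) - (sa->dropped > sb->dropped);
}

/* Periodically re-partition the blacklist: the K most aggressive talkers are
 * filtered in the SmartNIC hardware tables, the remaining ones in XDP. */
void repartition(struct source_stat *src, size_t n)
{
    qsort(src, n, sizeof(*src), by_drops_desc);
    for (size_t i = 0; i < n; i++) {
        if (i < HW_TABLE_SIZE)
            install_hw_rule(src[i].ip);
        else
            install_xdp_rule(src[i].ip);
    }
}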

6.5.3 Effect on legitimate traffic

This test evaluates the capability of the system to perform useful work (e.g., serve web pages) even in the presence of a DDoS attack. We generate 64B UDP packets towards the server, simulating different attack rates and numbers of attackers, while a weighttp client generates 1M HTTP requests (using 200 concurrent clients) towards the nginx server running on the target device. The capability of the server to perform real work is reported as the number of successfully completed requests/s, with a timeout of 5 seconds, while varying the rate of the DDoS traffic. The results, depicted in Figures 6.4a and 6.4b, show the performance with 1K and 4K attackers respectively. In the first case, both hardware-based solutions reach the same number of connections/s, since almost all entries are dropped by the hardware, leaving the host’s CPU free to perform real work. The same behavior can be noticed when the mitigation is performed entirely on the SmartNIC CPU; in this case, the host’s CPU is underused, achieving the maximum number of HTTP requests/s that the DUT is able to handle.


Figure 6.4: Number of successfully completed HTTP requests/s under different load rates of a DDoS attack carried out by (a) 1K attackers and (b) 4K attackers.

However, performance drops sharply when the attack rate exceeds 15Mpps, which is the maximum rate the SmartNIC CPU can sustain; in such a scenario, the NIC queues rapidly fill up, so packets are dropped without going through the mitigation pipeline, increasing the chance of dropping legitimate traffic as well. With respect to the XDP Host mitigation, we notice that the number of connections/s is initially lower, for small attack rates, compared to the SmartNIC-based solution, since the host’s CPU has to handle the HTTP requests and, at the same time, execute the XDP program. However, as the rate of the attack grows, it continues to handle an adequate number of connections/s up to 25Mpps, which is the maximum rate that the host XDP program is able to handle. Finally, iptables-based mitigation proves unfeasible with large numbers of attack sources because of its very poor processing efficiency, which severely impacts the capability of the server to handle the legitimate traffic. The same analysis holds for larger attacks (e.g., 4K sources); the main difference is that the HW + XDP Host solution performs significantly better in this case, thanks to the higher processing capabilities of the host’s CPU compared to those of the SmartNIC.

6.6 Related work

The advantages of using XDP to filter packets at high rates have been widely discussed and demonstrated [26, 171]; several companies (e.g., Facebook, Cloudflare) have integrated XDP in their data center networks to protect end hosts from unwanted traffic, given the enormous benefits in terms of both filtering performance and low resource consumption.


In particular, Cloudflare presented in [21] a DDoS mitigation architecture, called L4Drop [61], that performs packet sampling and dropping with XDP. Our approach is slightly different: we use an XDP program to extract the relevant packet headers from all the received traffic, instead of sending entire packet samples to the userspace detection application, and we consider simpler filtering rules, which are needed to cope with the SmartNIC hardware limitations. Finally, our architecture considers the use of SmartNICs to improve packet processing, which introduces additional complexity (e.g., selecting the rules to offload) that is not needed in a host-based solution.

6.7 Conclusions

Given the sheer increase in the amount of traffic handled by modern datacenters, SmartNICs represent a promising solution to offload part of the network processing to dedicated (and possibly more optimized) components. In this Chapter we analyzed the possibility of combining the host processing performed with eBPF/XDP, as presented in the previous chapters, with the performance speedup that may result from hardware offloading. In particular, we explored various approaches that could be adopted to introduce SmartNICs in server-based data plane processing, assessing the achievable results in particular for the DDoS mitigation use case. We described a solution that combines SmartNICs with eBPF/XDP to handle large amounts of traffic and attackers. The key aspect of our solution is the adaptive hardware offloading mechanism, which partitions the attacking sources to be filtered among SmartNIC and/or host, smartly delegating the filtering of the most aggressive DDoS sources to the former.
According to our experiments, the best approach is a combination of hardware filtering on the SmartNIC and XDP software filtering on the host, which is more efficient in terms of dropping rate and CPU usage. In fact, running part of the filtering pipeline on the SmartNIC CPU leads to inferior dropping performance due to its slower CPU, resulting in a lower capability to cope with large and massive DDoS attacks. Our findings suggest that current SmartNICs can help mitigate the network load on congested servers, but may not represent a turn-key solution. For instance, an effective SmartNIC-based solution for DDoS attacks may require the presence of a DDoS-aware load balancer that distributes incoming datacenter traffic so as to reduce the number of attackers landing on each server, which should be compatible with the size of the hardware tables of the SmartNIC. Otherwise, the solution may require the software running on the SmartNICs to cooperate with other components running on the host, reducing the effectiveness of the solution in terms of saved resources on the servers.

Chapter 7

Kecleon: A Dynamic Compiler and Optimizer for Software Network Data Planes

7.1 Introduction

In recent years, the role of software packet processing in computer networking has increased significantly. Software switches, Kubernetes CNIs, service mesh data planes, software load balancers, and software-based in-network computation are becoming increasingly popular, given their unquestionable advantages in terms of agility and rapid innovation. However, despite the improved accessibility and flexibility, the performance gap between hardware and software network data planes is still evident. To bridge this gap and achieve a better performance-cost trade-off than hardware, we need to optimize their performance.
In the previous chapters we explored, thanks to the dynamic injection of eBPF programs combined with the modular design provided by Polycube, the possibility of applying more application-specific optimizations to network functions at runtime (e.g., bpf-iptables in Chapter 5). However, those optimizations were applied by the NF control plane, which knows the data plane behavior and is well aware of the different configuration parameters (e.g., the composition of the fields in the ruleset) and how they are mapped onto the underlying pipeline. At this point, one question arises: can we apply those optimizations automatically, without knowing the semantics and behavior of the NF?
We realized that traditional approaches to designing and developing those applications (both user-space and kernel-space network functions) are based on static compilation: the compiler receives as input a description of the forwarding-plane semantics and outputs binary code that is agnostic to the actual configuration or run-time behavior of the NF, as shown in Figure 7.1a. Since this code is executed on general-purpose servers, standard compilers (e.g., GCC, LLVM) are used to generate the software data plane: they both transform the high-level code into the target machine code and apply classical compiler optimization techniques (e.g., rearranging instructions, inlining functions, eliminating branches) to generate a more efficient version of the original program.

Figure 7.1: Static vs. dynamic generation. The static compiler generates a NF-agnostic data path, while Kecleon takes into account runtime data, statistics, and behavior to generate the NF data path.

In addition to the static (offline) optimizations, which happen before the program is deployed, another branch of optimization techniques in computer science falls under the name of adaptive optimizations. An adaptive optimizer can take advantage of local data conditions to dynamically recompile portions of the original program based on its runtime execution profile. These techniques have been successfully applied to a vast range of languages and applications, from JIT-ed languages such as the JVM or JavaScript, which use run-time specialization to optimize hot code paths during the execution of the program, to database management systems, which use them to optimize query evaluation time [168], [97]. The majority of compilers today support Profile Guided Optimizations (PGO) or Feedback Directed Optimizations (FDO) to improve program run-time performance. However, since these compilers are used to generate generic computer programs, they apply a set of optimizations and techniques that are entirely independent of the network function domain. Networking programs need a set of domain-specific optimizations that are not possible without contextual information (i.e., the fact that we are processing packets, the existence of match-action tables) and that, if carefully adopted, can potentially improve software data-plane performance [142, 141]. They often involve a lot of manual tweaking and modifications (e.g., batch size, prefetching) that are not valid for every type of application, or that require an in-depth knowledge of the application context, but that can radically change the performance [2].

In this chapter, we propose a dynamic approach to data plane compilation in which not only the static information but also runtime data are used to generate the software data plane, as shown in Figure 7.1b. Our dynamic compiler, Kecleon, uses control plane information (e.g., the type of rules installed in a firewall NF) as input to the compilation stage to generate a custom version of the original program that is specialized (and optimized) based on the actual behavior of the NF. In addition to configuration data, Kecleon instruments the original NF data path code to retrieve the runtime packet processing behavior and statistics (e.g., specific traffic patterns), which are used to optimize its output further.
As for “standard” compilers, Kecleon applies all the optimizations at the Intermediate Representation (IR) level, which makes it agnostic to the language in which the NF is written and allows us to re-use some already existing compiler optimizations at runtime. It combines static code analysis techniques to gather information about the runtime behavior of the NF and the type of operations performed inside the NF data plane. Then, it applies a set of different optimization passes (or templates) that exploit control plane information (e.g., configuration parameters), runtime data structure content, and dynamic traffic profiles to further specialize the final software data plane.
The rest of the Chapter is organized as follows. We first motivate the need for a dynamic compiler for software data planes by taking different applications as examples, showing how the knowledge of runtime data can be beneficial for achieving higher performance (Section 7.2). We then present the main challenges that we had to address to implement a dynamic compiler for generic applications and the overall architecture of Kecleon in Section 7.3, together with a description of the different optimization passes available. Finally, we present more low-level details of the Kecleon core implementation and its eBPF plugin (Section 7.5) and a (preliminary) performance evaluation of the improvements that can be achieved with the use of Kecleon in Section 7.6. Section 7.7 concludes the Chapter.

7.2 The Case for Dynamic Network Function Optimizations

In this section, we show how the knowledge of runtime information can be exploited by a dynamic specializer to improve the throughput and latency of a given network application.

Performance depends on run-time configuration. NF software is often developed as a single monolithic block containing different features, which are usually disabled at runtime or enabled only in specific cases. Since those applications are released and used in different use cases and for various purposes, it is almost impossible for the vendors to distribute several versions of the same network function with different features compiled out, given the apparent difficulties that would result from their management.


Figure 7.2: L2 bridge NF (eBPF-TC) performance with variable runtime configuration settings, comparing the original version and the one with the unused features compiled out; the traffic is the same across the different runtime configurations.

Moreover, although NF developers may make a non-negligible effort to isolate the code that handles the different features, this often increases the complexity of the code, making it harder to debug and diagnose, while still leaving the additional overhead needed to check whether a given feature is enabled or not. As a result, most NFs today pay an unnecessary cost for unused features, which waste additional CPU cycles and memory even when they are not needed at that specific moment. For example, an L2 bridge NF developed with support for VLAN and Spanning Tree (STP) may have those features enabled only under specific circumstances or only for a subset of servers in a given cluster; however, their overhead, even when not used, may not be negligible. As we can see in Figure 7.2, the same eBPF-based bridge NF, compiled with only the needed features, can achieve up to 20% better throughput compared to the version compiled with support for VLAN and STP, even if they are not used at runtime. In the same way, the run-time configuration (ruleset) of a firewall NF that supports filtering on different packet fields may contain rules matching only a subset of the supported fields. Even in this case, the cost paid to parse, extract, and process those additional values can be removed, saving further CPU cycles. Figure 5.8 in Chapter 5 shows that this extra cost is not negligible. A concrete sketch of this kind of specialization is shown below.
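To make the idea concrete, the sketch below (hypothetical C code, not the actual Polycube bridge) contrasts a generic data path with the specialized version a dynamic compiler could emit once it knows that both VLAN and STP are disabled at runtime: the configuration loads become constants and the corresponding branches disappear.

#include <stdbool.h>

struct frame;  /* opaque packet representation, for illustration only */

/* Minimal stubs standing in for the real bridge logic. */
static bool parse_vlan(struct frame *f)  { (void)f; return true; }
static bool is_bpdu(struct frame *f)     { (void)f; return false; }
static int  handle_bpdu(struct frame *f) { (void)f; return 0; }
static int  forward(struct frame *f)     { (void)f; return 0; }

struct bridge_cfg { bool vlan_enabled; bool stp_enabled; };

/* Generic data path: every packet pays for the configuration checks. */
int handle_frame_generic(const struct bridge_cfg *cfg, struct frame *f)
{
    if (cfg->vlan_enabled && !parse_vlan(f))
        return -1;                      /* malformed VLAN tag */
    if (cfg->stp_enabled && is_bpdu(f))
        return handle_bpdu(f);
    return forward(f);
}

/* Specialized data path that can be generated when, at run time,
 * vlan_enabled == false and stp_enabled == false. */
int handle_frame_specialized(struct frame *f)
{
    return forward(f);
}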



Figure 7.3: (a) Overhead given by consecutive match-action empty table lookup and (b) throughput with different map sizes and algorithms.

Takeaway: Being able to recognize and associate the configuration options installed via the control plane of a given NF with the corresponding instructions operating on those variables allows a dynamic compiler to automatically remove those additional overheads from the NF and re-create a specialized version that is optimized for the given run-time configuration.

Performance depends on run-time table content. Many NFs use different data structures to store data that is accessed when processing a packet in the pipeline. Those structures are defined by the developers when writing the NF data plane, by carefully choosing the algorithm (e.g., hash, LPM) and the size that are most appropriate for the data they will hold. For example, a router NF may use a longest prefix match table to keep the IP addresses of the routing table, or a stateful firewall NF may use a large hash table to hold the different connections that pass through it. The values in the data structures can be inserted either by an external player (e.g., the network administrator) from the control plane of the network application or by the data plane itself (e.g., the conntrack table in a stateful firewall, the filtering database in a L2 learning bridge). However, at runtime, given the values that the table actually holds, the type, algorithm, or size of the data structures are not always the most efficient possible. For example, the pipeline of a vswitch may have some stages with empty ACL tables, or the routing table of a router may contain only entries with fixed IPs (e.g., /32). Of course, when designing the data plane, the developer cannot make any assumption on these values and has to construct it in the most general way. Figure 7.3a shows the cost that a simple NF (in this case, the bpf-iptables application shown in Chapter 5) pays to perform lookups into a set of consecutive match-action tables with no values inside; just performing a lookup to discover that the table is empty can waste CPU cycles that could be used to perform other useful work. In the same way, Figure 7.3b shows the throughput of a router NF that uses an LPM table to hold all the entries of the routing table; if we recognize that those entries can be safely installed in a more efficient table (i.e., a hash table), we can further improve its throughput. Moreover, if we use a smaller hash table as a sort of cache for the most used entries, the performance increase is even more evident, since the small table is more likely to fit entirely into a lower-level cache.



Figure 7.4: Throughput improvements when caching the computation of the top-5 flows within a high-locality trace for the DPDK flow-classify sample application.

Takeaway: Being able to dynamically read the content of the data structures used by the NF may allow a dynamic compiler to choose a table (or a set of tables) that is more appropriate for the data currently stored. This concerns not only the type of the table but also its actual size or algorithm.

Performance depends on run-time traffic. As in the previous cases, when developing the data plane code the NF developer cannot make any assumption about the type of traffic the NF will receive, and hence makes the data plane as general as possible so that it performs equally well independently of the traffic that is received. However, Figure 7.4 shows that, under skewed traffic patterns, we can build a fast path by pre-computing values for a set of high-volume flows, drastically improving the performance of the original NF. Takeaway: A runtime compiler that is able to transparently detect specific traffic patterns can dynamically create fast-path code that is optimized for the most common scenario, drastically improving the performance of the original NF.


7.3 Kecleon System Design

7.3.1 Design Goal

Kecleon aims at building an optimized software data path by leveraging runtime information such as the configuration of a NF or the runtime content of its match-action tables. Its main goal is to provide a unified architecture and framework that can be used across different software network function implementations, reusing the same type of transformations and optimizations. To achieve this goal, Kecleon applies the optimizations at the compiler Intermediate Representation (IR) level, which, in addition to data plane independence, provides the following advantages:

1. Faster compilation speed. Once generated, Kecleon keeps the IR representation of the original program to continuously produce the optimized output data plane. All the initial compilation steps performed by the compiler's front-end (e.g., parsing the program, checking the syntax) do not need to be performed every time, speeding up the generation of the optimized data planes.

2. Modularity. Every Kecleon pass works independently from the others or uses the result of other passes as input. This allows Kecleon to dynamically compose and choose the optimization passes that should be applied depending on the runtime data that have been obtained. Such an approach also simplifies the development of new passes, which can use the results of other steps without knowing their details.

3. Reuse of existing passes. Working at the IR level allows Kecleon to take advantage of, and interact with, some of the same optimizations that conventional compilers use, such as dead-code elimination, dead-store elimination, and code motion, which can be re-applied at runtime after other Kecleon transformations.

On the other hand, working at the IR level, compared to applying source-code level transformations, is considerably more complicated, requiring compiler knowledge and a certain level of expertise. However, this task is required only once for every new supported data plane. Kecleon defines a common interface for every plugin, which is used to export the runtime information from the data plane (e.g., the content of the tables) or to apply specific transformations (e.g., create/delete a table).

7.3.2 Design Challenges and Assumptions

Challenge #1: Identify relevant logic. Developing accurate optimizations for software data planes is challenging for a number of reasons. Many software data planes are very generic and can be implemented in a number of different ways.


They can perform operations on network packets based on state that may be hidden deep in the data-plane code, making it extremely difficult to recognize those operations. To identify specific patterns and apply the given optimization templates we need to restrict the field of application. Although there is currently no standard and structured approach to writing software data-plane code, the advent of programmable data planes (PDP) has increased the interest in the adoption of high-level domain-specific languages (DSL), which are specifically designed (and can be easily adopted) to express the packet processing logic of a network application. Starting from a description of the data plane that uses one of these DSL languages (e.g., P4 [33], eBPF [115, 114]), or that adopts a specific framework to write NF code, Kecleon can easily extract particular patterns in the code (e.g., code accessing a specific data structure or performing operations on the packets), making it easier to apply the different optimizations. For this reason, Kecleon assumes a clear distinction between stateless and stateful operations in the NF data plane. The former indicates the packet processing logic and forwarding behavior of the NF, while all the stateful operations are performed through a set of fixed functions that access libraries of different data structures. This approach is followed by various NF frameworks (e.g., Vigor [167], Click-NF [106], BESS [73]), while other languages such as eBPF follow this approach by design (Section 2.2.1 of Chapter 2).

Challenge #2: The cost of instrumentation. Adding additional logic to characterize input traffic patterns can negatively affect performance, to the point that it may nullify the effect of the optimization. This overhead can be reduced by instrumenting only the performance-critical parts of the code.

Challenge #3: Preserve the original data-plane semantics. Kecleon must ensure that the compiler transformations and optimizations adopted do not modify the semantics of the original application. The compiler should introduce safety measures (e.g., guards) to fall back to the original data plane code when an optimization is no longer valid.

Challenge #4: Harmless pipeline swap. At each execution step, Kecleon creates a new optimized version of the original program that should replace the old pipeline. To avoid inconsistencies and packet loss, this switch should be performed atomically, keeping the processing state of the old pipeline unaltered and preserving the semantics of the original application.

7.3.3 Design Overview

Figure 7.5 shows the overall system architecture of Kecleon. The input of the compiler is the data plane code of the application, which is first transformed into an Intermediate Representation (IR), the layer where the dynamic optimization passes are applied, together with the other standard compiler optimizations.

Figure 7.5: Kecleon Architecture

In Kecleon, we distinguish four main steps:

1. Analysis and logic identification (§7.4.1): Once a new application is deployed, Kecleon starts calling the different analysis passes, which are in charge of analyzing the data plane code to identify specific patterns (e.g., retrieving configuration variables, operations on match-action tables). Every pass “marks” the IR code by adding debug information to the most important instructions and stores the result of the analysis in the Kecleon internal data structures. For instance, if a given analysis pass discovers some Match-Action-Table (MAT) related methods (e.g., lookup, update), it marks the corresponding IR instructions so that the other passes do not have to perform this analysis again.

2. Collection and retrieval of run-time statistics (§7.4.2): After the initial analysis, Kecleon calls the set of available instrumentation passes, which are used to collect runtime information from the running code. According to the debug info set during the previous analysis, these passes usually instrument the code by adding more instructions (e.g., to record the most frequent flows) or interact with a specific Kecleon plugin to retrieve the runtime information (e.g., to read the content of a particular table).

3. Data path optimizations (§7.4.3): In this phase, the actual set of optimization passes is called, which is in charge of performing the dynamic transformations on the original code. These passes use (i) the result of the analysis to detect candidate methods that can be optimized and (ii) the runtime data retrieved from the instrumented code or from specific data structures.

4. Pipeline substitution (§7.4.4): Once the optimization phase is completed, Kecleon emits the optimized IR code and calls the compiler's back-end to generate the final executable. Then, it hands control to the back-end-specific plugin, which takes care of performing the actual substitution of the old pipeline with the newly generated one.

The execution of the optimization pipeline is triggered (i) at specific time slots or (ii) as a consequence of an external event. In the first case, Kecleon will repeat the entire process by checking if the runtime conditions have changed. In the second case, the pipeline can be triggered after an interaction between the control plane and the data plane, such as the update of a configuration variable or a data structure, which may invalidate the optimization applied within the previous cycle.

7.4 Kecleon Compilation Pipeline

In this section, we go through the various steps of the Kecleon compilation pipeline. We show how Kecleon interacts with the back-end-specific plugin to identify the instructions that determine the packet processing logic of the NF and, when needed, we provide more details on the implementation of the eBPF plugin for Kecleon.

7.4.1 Packet Processing Logic Identification

Identification of configuration instructions. Given their configuration-agnostic implementation, most software data plane functions need to retrieve configuration variables at runtime from specific Match-Action Tables (MATs) or configuration files, which can be dynamically changed by the control plane without having to restart the data plane with the new configuration. In most NF frameworks, this is done through fixed APIs that allow the control plane to configure those variables and the data plane to retrieve the runtime configuration. Configuration variables have the characteristic of (i) being generated by a single read from a configuration file or a MAT and (ii) never being modified during the execution of the data plane. During the analysis phase, a Kecleon analysis pass operates as described in Algorithm 3. It scans the entire set of instructions available in the IR code and retrieves the corresponding signature, which must be identified by the back-end plugin as a configuration instruction. If the answer is positive, the method and the associated variable are marked, and their use is tracked throughout the execution of the program.


Algorithm 3: Identification of configuration instructions
  Input: IRM (Module), IR of the original program
  Input: P (Plugin), reference to the target Kecleon plugin
  Function AnalysisConfigPass:
  1   foreach IRinstr in IRM do
  2     is_config ← P.IsConfigInstr(IRinstr)
  3     if is_config is True and IsReadOnly(IRinstr) then
  4       IRM.MarkConfig(IRinstr)

Algorithm 4: Identification of state instructions
  Input: IRM (Module), IR of the original program
  Input: P (Plugin), reference to the target Kecleon plugin
  Function AnalysisStatePass:
  1   foreach IRinstr in IRM do
  2     is_table ← P.IsTableFunction(IRinstr)
  3     if is_table is True then
  4       tid ← P.GetTableID(IRinstr)
  5       IRM.MarkTable(IRinstr)
  6       IRM.StoreTID(tid)

The marking procedure works by constructing specific debug metadata for the IR instruction, following the DWARF terminology1, which can be retrieved by the other passes. Kecleon adds to the IR instruction debug information indicating the configuration nature of the function and associates a unique metadata identifier that references an entry in an internal Kecleon data structure. This entry contains additional details about the method itself (e.g., whether the configuration variable is stored in a MAT or not, the maximum number of possible values). The subsequent passes can then use that information to perform the optimizations.

Identification of MAT instructions. As for the configuration variables, Kecleon has to discover the instructions accessing stateful information, which can then be retrieved and used by the subsequent optimization passes. Algorithm 4 shows the behavior of this analysis pass; it scans the entire set of instructions to find the functions associated with an operation on a MAT. When found, it “marks” that method (e.g., the lookup function used to read the table content) and extracts (i) a unique table ID (TID) used to retrieve the table content at runtime, (ii) the layout associated with the table itself, which contains the type of data structure used (e.g., array, hash, lpm), and (optionally) (iii) a cost function used by Kecleon to estimate the cost of performing a data structure change.

1The Debugging With Attributed Record Formats (DWARF) is a widely recognized and standardized debug data format used to store information about a compiled computer program.
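To give a feel for what such an analysis pass looks like in practice, the fragment below scans a module for call instructions whose callee name resembles a map-lookup helper, using the LLVM C API. This is only a rough sketch: the actual prototype works on the C++ API inside BCC, and, as Listing 7.2 in the next subsection shows, eBPF map lookups appear in the IR as indirect helper calls rather than calls to a named function, so matching by name is a deliberate simplification.

#include <stdio.h>
#include <string.h>
#include <llvm-c/Core.h>

/* Walk every instruction of the module and report call sites whose callee
 * name looks like a map-access helper; a real plugin would instead match the
 * helper-call pattern visible in the IR and attach debug metadata carrying
 * the table identifier (TID). */
void scan_for_mat_accesses(LLVMModuleRef mod)
{
    for (LLVMValueRef fn = LLVMGetFirstFunction(mod); fn;
         fn = LLVMGetNextFunction(fn)) {
        for (LLVMBasicBlockRef bb = LLVMGetFirstBasicBlock(fn); bb;
             bb = LLVMGetNextBasicBlock(bb)) {
            for (LLVMValueRef ins = LLVMGetFirstInstruction(bb); ins;
                 ins = LLVMGetNextInstruction(ins)) {
                if (!LLVMIsACallInst(ins))
                    continue;
                size_t len = 0;
                const char *callee =
                    LLVMGetValueName2(LLVMGetCalledValue(ins), &len);
                if (callee && strstr(callee, "map_lookup"))
                    printf("candidate MAT access found\n");
            }
        }
    }
}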

7.4.1.1 Implementation Details (eBPF Plugin)

We will now take as an example the Kecleon eBPF plugin and apply these operations to a small code snippet (Listing 7.1) of the Katran load balancer [76], whose data plane is written in eBPF.

1  vip.port = pckt.flow.port16[1];
2  vip.proto = pckt.flow.proto;
3  vip_inf = bpf_map_lookup_elem(&vip_map, &vip);
4  if (!vip_inf) {
5      vip.port = 0;
6      vip_inf = bpf_map_lookup_elem(&vip_map, &vip);
7      if (!vip_inf) {
8          return XDP_PASS;
9      }
10 }
11 ...

Listing 7.1: Sample eBPF code from the Katran NF

In an eBPF data path, the only way to set configuration values is to use specific eBPF maps, which are filled by the control plane when the program starts and read at runtime by the data plane. On line 3, the code performs a lookup into an eBPF map containing the list of virtual IPs associated with a particular flow. The role of the Kecleon analysis pass is to discover that method and the associated eBPF map, so that the subsequent passes can use the runtime values to optimize the new code. Once executed, Kecleon generates the IR code of the original eBPF program, shown in Listing 7.2, which is then given as input to the different analysis passes. Following the previously mentioned procedure, the Kecleon analysis pass recognizes the signatures at lines 5 and 12 as operations on a MAT (in this case, a map lookup) and proceeds with Algorithm 3 to check whether the variable retrieved from the map is later modified in the data plane2.

1  %226 = gep %vip, %vip* %22, i64 0, i32 1
2  store i16 %220, i16* %226, align 4
3  %227 = gep %vip, %vip* %22, i64 0, i32 2
4  store i8 %167, i8* %227, align 2
5  %228 = call i8* inttoptr (i64 1 to i8* (i8*, i8*)*)(i8* bitcast (@vip_map to i8*), i8* nonnull %46) #3
6  %229 = bitcast i8* %228 to %vip_meta*
7  %230 = icmp eq i8* %228, null
8  br i1 %230, label %231, label %241
9  ...
10 ; ...

7.4.2 Runtime Statistics and Data Collection

The second phase of the Kecleon compilation pipeline is composed of two steps. The first is where the actual runtime data of the previously identified instructions are retrieved; after having identified the configuration and state operations, Kecleon can retrieve all the runtime values by issuing a request to the back-end plugin. For instance, in the Katran example shown before, after having identified that vip_inf is a configuration variable, Kecleon can “ask” the eBPF plugin to retrieve the set of values within the vip_map. The second is where the original data plane code is instrumented to retrieve more specific information about the current execution of the program, which is particularly useful to adopt optimizations that improve the performance for particular traffic patterns. There are several ways in which this instrumentation can be achieved. A first possible approach would be to add a local MAT into the data plane of the NF that records the values of a packet (e.g., its 5-tuple) traversing the NF. However, this may incur additional and unnecessary overhead if the actions performed within the NF data plane do not use those fields. A better approach would be to monitor only the used fields, but this requires completing an additional analysis of the NF data plane, making the instrumentation pass aware of the actions performed on the packets or dependent on their content.

2A careful reader may argue that, if the variable is retrieved from a map with a lookup operation and is later modified without an update operation, it can still be considered read-only. This is not valid for eBPF, where the result of a lookup is a pointer to the entry in the map; as a consequence, a modification of the pointer's content will modify the corresponding map entry.

To avoid this, Kecleon adopts an implicit traffic-specific mechanism by automatically instrumenting the sections of the code that access or modify the internal state of the NF. In particular, the Kecleon instrumentation pass retrieves all the MAT accesses and adds a local MAT that stores the same data as the original table. It samples only the most-accessed entries within the original MAT and their corresponding values, saving only a limited number of entries to reduce its size and overhead. Then, the Kecleon platform-specific plugin exports the entries from the instrumented MAT using the Kecleon common data format, so that the various optimization passes can easily consume them in a target-independent way. This approach brings several advantages. First, it does not require any static analysis of the NF code to retrieve a common set of packet values that should be representative of the incoming traffic. Second, it gives Kecleon a more fine-grained control over the instrumentation that is applied. If it recognizes that the overhead of a single table is minimal, or that there is no room for improvement, the instrumentation can be disabled only for that specific branch or table, while still allowing information to be retrieved for the other methods. Kecleon can change at runtime the size of the instrumented tables, according to the level of information it needs, and the sampling rate of the instrumented entries, which is a compromise between the accuracy of the instrumented traffic and the performance overhead introduced by the instrumentation. Finally, by merging all the information coming from the different local MATs, it is possible to reconstruct the hot code paths and then optimize for them.

7.4.2.1 Implementation Details (eBPF Plugin)

We show here a small example based on the bpf-iptables network function presented in Chapter 5. Listing 7.3 illustrates a short code snippet of the module matching the L4 ports of the packet, excluding the highlighted code that we will explain later. On line 8, a lookup in the port_map is performed to retrieve the associated bitvector; then, from line 11 to 15, the final bitvector is calculated by performing a bitwise AND with the current bitvector obtained in the other steps of the pipeline. If the resulting bitvector is zero, the default action is applied (e.g., the packet is dropped); otherwise, the next module in the pipeline is called.

1  u64 *val = bpf_lookup(&port_map_c, &l4port);  /* red: instrumentation code */
2  (*val)++;
3  bpf_map_update(&cpu_port_map, &l4port, val);
4  if (l4port == 8080) {                         /* green: cached computation */
5      isAllZero = true;
6      goto NEXT;
7  }
8  ele = bpf_lookup(&port_map, &l4port);
9  ...
10 #pragma unroll
11 for (i = 0; i < _NR_ELEMENTS; ++i) {
12     bits[i] = bits[i] & (ele->bits)[i];
13     if (res->bits[i] != 0)
14         isAllZero = false;
15 }
16 goto NEXT;
17 ...
18 NEXT:;
19 if (isAllZero) {
20     applyDefaultAction();
21     return;
22 }
23 call_bpf_program(ctx, _NEXT_HOP_1);
24 ...

Listing 7.3: Sample code of the bpf-iptables L4-port matching module

In the eBPF case, the Kecleon instrumentation uses a per-CPU BPF_HASH map to store all the L4 ports that reach the original map, with a corresponding counter3. The red code in Listing 7.3 (lines 1-3) shows the corresponding C code that would be added by the instrumentation pass. The Kecleon optimization pass can then try to optimize the code using the result of the instrumentation. For instance, in this example, if most of the runtime flows contain packets with L4 port 8080, and the associated4 bitvector contains all bits at zero, a Kecleon optimization pass can pre-compute the value (green code of Listing 7.3, lines 4-7), so that the loop from line 11 to 15 is avoided, saving a lot of CPU cycles in the most common case.

7.4.3 Kecleon Data Path Optimizations

In this section, we describe the main Kecleon optimization passes and how they exploit the runtime information gathered during the previous steps to generate an optimized data plane.

7.4.3.1 Dead Code Elimination Pass (DCE)

The goal of the Kecleon dynamic DCE pass is equivalent to that of the corresponding DCE pass used in most compiler optimization pipelines; it prunes “dead” branches and instructions that are considered unreachable at compilation time.

3The instrumented tables can also contain entries that are not present in the original map; this information is also beneficial to the optimization passes, which may optimize the code for the most common case.
4Kecleon does not know that the map value is a “bitvector”; it just substitutes the value in the code and tries to find a better path using the already existing compiler (LLVM) optimization passes.


Algorithm 5: Kecleon dynamic DCE algorithm
  Input: IRM (Module), IR of the original program
  Input: P (Plugin), reference to the target Kecleon plugin
  Function DynDCEPass:
  1   forall IRConfInstr in IRM do
  2     runtime_val ← P.GetConfigValue(IRConfInstr)
  3     if runtime_val.size() > 1 then
  4       foreach val in runtime_val do
  5         P.IRBuilder.PropagateValueSet(val, IRConfInstr)
        else
  6       P.IRBuilder.PropagateSingleValue(runtime_val, IRConfInstr)
  7   RunConstantPropagationPass(IRM)
  8   RunConstantFoldingPass(IRM)
  9   RunDCEPass(IRM)
  10  return IRM

This pass works as described in Algorithm 5. As the first step (line 2), the algorithm scans the entire set of IR instructions to find the ones marked as configuration (as described in Section 7.4.1). Then, it interrogates the back-end plugin to retrieve the runtime values of the associated configuration variables. It is important to note that the exported variables can assume different values and types; it is the role of the Kecleon back-end plugin to provide the required information according to the defined plugin interface. For example, if a configuration variable is a list, a parameter is_list is stored in the internal Kecleon data structures during the analysis step; the various optimization passes can then use this information to act differently based on their goal. As the second step (lines 3-6), the algorithm substitutes the original configuration instruction with the runtime value (or set of values). Then, it can directly call the existing compiler passes to automatically perform Constant Propagation, Constant Folding, and Dead Code Elimination (lines 7-9), which eliminate the unreachable code. The DCE pass described above can remove redundancy in the NF code, but there are some scenarios where it cannot be fully applied. It may happen that, after the propagation of the configuration, only a subset of the paths in the code is subject to dead code elimination. In its current state, Kecleon is not able to recognize this scenario, given the path-insensitive type of analysis that is applied. A solution would be to use symbolic execution methods to perform a path-aware DCE and then merge the results of each path to generate the final output. Although not currently implemented, this extension is part of our future work.


Algorithm 6: Kecleon dynamic DSS algorithm
  Input: IRM (Module), IR of the original program
  Input: P (Plugin), reference to the target Kecleon plugin
  Function DynDSSPass:
  1   forall IRTableInstr in IRM do
  2     tid ← IRM.RetrieveTID(IRTableInstr)
  3     info ← P.GetTableInfo(tid)
  4     values ← P.GetTableValues(tid)
  5     if IsDSSFeasible(info, values) then
  6       new_info ← ApplyDSSToTable(info)
  7       P.CheckCostFunction(new_info, info)
  8       P.IRBuilder.CreateNewTable(new_info, IRTableInstr)
  9   return IRM

7.4.3.2 Data Structure Specialization Pass (DSS)

The Kecleon DSS pass is in charge of analyzing the runtime content of the MATs used by the NF data plane and modifying their layout, size, or algorithm to the one that performs best under the given runtime conditions. For example, knowing which field of a key in a Longest Prefix Match (LPM) table indicates the prefix, under specific circumstances (e.g., all entries have the same prefix length) the Kecleon DSS could automatically convert the LPM table into a hash table (a sketch of this check is shown below). Algorithm 6 shows the behavior of this pass. It starts by extracting the runtime values from the map, using the Table Identifier (TID) obtained by the previous analysis pass and retrieved from the debug information associated with the IR instruction. Then, depending on the runtime values, it decides whether the data structure content matches a possible transformation. If a cost function is given, Kecleon first checks whether the change would provide the expected performance benefits; otherwise, the transformation is discarded. For frameworks that offer a clear definition of the data structures and their implementations, the associated cost function can be automatically deduced using static analysis or symbolic execution methods [124], [126], instead of requiring a manual effort from the plugin developer.
Note: Today, many NFs are stateful: packet processing updates state that, in turn, influences the packet data path. Changing the data structure used by the data plane of a NF may be dangerous if the table is modified by the data plane itself. If the newly inserted entries invalidate the assumptions used by the DSS pass when it performed the optimization, we may corrupt the original application semantics. As a consequence, Kecleon applies the DSS pass only to tables that are not modified in the data plane (the DSS analysis pass can recognize this situation). A change to those tables can happen only from the control plane, and, in this scenario, Kecleon triggers the execution of the optimization pipeline to re-evaluate the feasibility of the optimization.
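As an illustration of the feasibility check that the DSS pass performs (IsDSSFeasible in Algorithm 6), the sketch below decides whether every entry of an IPv4 LPM table carries a full /32 prefix, in which case the table behaves as an exact-match structure and could be specialized into a hash table. The entry layout is invented for the example; the real pass inspects the values exported through the plugin interface.

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative layout of one exported LPM entry. */
struct lpm_entry {
    uint32_t prefix;     /* IPv4 prefix, network byte order */
    uint8_t  prefixlen;  /* 0..32 */
};

/* DSS feasibility check: an LPM table whose entries are all /32 behaves like
 * an exact-match table, so it can be specialized into a hash table. */
bool lpm_table_fits_hash(const struct lpm_entry *entries, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (entries[i].prefixlen != 32)
            return false;
    return n > 0;
}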

7.4.3.3 Cached Computation (CC) Pass

The CC pass can be seen as a further specialization of the DSS pass, where the results of the instrumented MATs are used together with the runtime MAT values. The results of the instrumentation give the different optimization passes a hint about the most common paths in the code or the most used entries in a MAT. The Cached Computation (CC) pass can reduce the overhead of specific memory accesses by, as the name suggests, caching the computation of the most accessed entries within the code itself or in other, more efficient, tables. For example, a simple lookup operation in a large MAT may account for 70% of the overall overhead of the NF, causing a lot of cache misses, even if the matched entries (taken from the runtime instrumentation) represent only 5% of the overall number of entries in the table. The CC pass can take the 5% most used entries of the MAT and compile them directly into the code, by pre-computing the result of the MAT lookup for those entries. A parameter controls the maximum number of entries under which a table is directly compiled. The pre-computation depends on the original layout of the compiled table and the type of variable used for the lookup. For example, the cached entries of a hash table are converted into a series of switch cases matching the single key value used in the lookup, or a hash of this value for complex entries5; in the latter case, a pre-computed collision-free hash (among the other cached values) is used to access the pre-computed variable. In the same way, for an LPM table of IP addresses, we could derive from the instrumentation a set of most-accessed IPs, whose lookup result can be pre-computed and cached within the code, using the same procedure described before. The sketch below illustrates the switch-case form of this transformation.
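The switch-case form of the cached computation can be sketched in C as follows; the real pass emits the equivalent at the IR level, and both the cached port and the pre-computed value mirror the port-8080 example of Listing 7.3 rather than real data.

#include <stdint.h>

/* Stand-in for the original (expensive) match-action table lookup. */
static const uint64_t *port_map_lookup(uint16_t l4port)
{
    (void)l4port;
    return 0;  /* stub: the real function searches the table */
}

/* Fast path generated by the CC pass for the hottest keys observed by the
 * instrumentation; any other key falls back to the regular lookup. */
const uint64_t *port_map_lookup_cc(uint16_t l4port)
{
    static const uint64_t bits_8080 = 0;  /* pre-computed: all-zero bitvector */

    switch (l4port) {
    case 8080:
        return &bits_8080;
    /* further hot ports would be added here by the pass */
    default:
        return port_map_lookup(l4port);   /* original slow path */
    }
}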

Avoid inconsistency of cached entries. When the Kecleon CC pass caches some entries directly in the code, it needs to ensure that a subsequent modification of the table, which may happen both from the control plane and, for non-configuration MATs, from the data plane, is reflected immediately: the code should then use the updated information instead of the old one compiled directly into the code. That means that packets arriving after the update should be immediately redirected to the original code branch (i.e., performing the regular table lookup) instead of accessing the cached information. Triggering the execution of the Kecleon optimization pipeline every time an update is found would result in packets arriving immediately after the update matching the old cached version until Kecleon emits a new pipeline, breaking one of the main Kecleon assumptions, namely keeping the original data plane semantics untouched.

5With complex entries we indicate variables that contain more than one primitive value (e.g., a C struct).

To avoid this issue, Kecleon makes use of guards. A guard is a control variable (or a specific MAT) that contains the version of the cached computation currently compiled into the code; before accessing the CC, the NF data plane checks whether the value contained in the guard matches the one of the compiled version - if not, it falls back to the “default” path, which corresponds to the original NF data plane. The update of the guard is done atomically before the map associated with it is modified (of course, if the update does not impact the cached entries, the guard is not modified), and the way it is done depends on the given target platform. For example, in the eBPF plugin implementation, the guard map is implemented as an additional PER_CPU_ARRAY table with a single entry containing the current version of the code; an additional kprobe is attached to the bpf_map_update helper, so that, when an update is performed, the Kecleon eBPF program attached to the kprobe is executed and updates the guard accordingly, guaranteeing the consistency of the original data plane. Of course, the use of guards increases the overhead of the NF, since it spends some CPU cycles checking the version before accessing the cached entries. We measure this cost in Section 7.6.
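A minimal sketch of such a guard is shown below, assuming a libbpf-style toolchain; map and constant names are illustrative, and the kprobe-based update of the guard is omitted.

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

#define CACHED_VERSION 3  /* version of the table content compiled into this program */

/* Single-entry per-CPU array holding the current version of the guarded table. */
struct {
    __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
    __uint(max_entries, 1);
    __type(key, __u32);
    __type(value, __u32);
} guard SEC(".maps");

/* Returns non-zero when the compiled-in (cached) computation is still valid. */
static __always_inline int cached_path_valid(void)
{
    __u32 zero = 0;
    __u32 *version = bpf_map_lookup_elem(&guard, &zero);

    return version && *version == CACHED_VERSION;
}

SEC("xdp")
int cc_guard_example(struct xdp_md *ctx)
{
    (void)ctx;
    if (cached_path_valid()) {
        /* use the pre-computed results compiled into the code */
    }
    /* otherwise fall back to the original table lookups */
    return XDP_PASS;
}

char LICENSE[] SEC("license") = "GPL";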

7.4.4 Kecleon Pipeline Update

The execution of the Kecleon optimization pipeline can be triggered (i) as a consequence of an intercepted control plane event (e.g., the update of a configuration MAT) or (ii) periodically, at given time slots. Kecleon performs the updates of the original data plane in a non-destructive way. When all the optimizations have been applied, Kecleon calls the rest of the compiler pipeline to convert the IR representation of the NF data plane into the target native language in which the data plane is executed. During the rebuild, the old running data path can continue processing packets without any service disruption. The newly created data path (in its binary form) is then passed to the specific Kecleon plugin, which has the role of injecting it and substituting it for the previous pipeline. To do so, Kecleon installs, before the original pipeline, an additional program whose purpose is to “jump” to the first instruction of the pipeline, using the address of the new code, through an indirect jump (or trampoline). When the new code is ready, it is loaded (and properly relocated) at a new address in the process address space, which will be pointed to by the trampoline instruction. The new pipeline representation is constructed side by side with the running data path; it is then inserted into the pipeline by atomically redirecting the jump instruction to the address of the new code.
Note: As we have seen in the previous chapters, Polycube provides a more natural way to perform this pipeline substitution, thanks to the program array map, whose role is to hold a set of “pointers” to different eBPF programs. By performing a tail call to a given index of the array map, the corresponding eBPF program is invoked; in our case, by updating the index of the old pipeline with the new one, we can easily achieve the swap behavior.
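The Polycube-style substitution mentioned in the note boils down to a dispatcher that tail-calls into a program-array map; the control plane then atomically re-points the active slot to the newly generated program. The sketch below is illustrative (libbpf-style syntax, invented names) rather than the actual Polycube code.

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* Slots holding the eBPF programs that implement the pipeline versions. */
struct {
    __uint(type, BPF_MAP_TYPE_PROG_ARRAY);
    __uint(max_entries, 2);
    __uint(key_size, sizeof(__u32));
    __uint(value_size, sizeof(__u32));
} pipeline SEC(".maps");

#define ACTIVE_SLOT 0  /* the control plane re-points this slot to the new pipeline */

SEC("xdp")
int dispatcher(struct xdp_md *ctx)
{
    /* Jump to the currently active data path; packets keep flowing through the
     * old program until the map entry is atomically replaced. */
    bpf_tail_call(ctx, &pipeline, ACTIVE_SLOT);

    /* Reached only if the slot is empty or the tail call fails. */
    return XDP_PASS;
}

char LICENSE[] SEC("license") = "GPL";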

7.5 Prototype Implementation

Kecleon is a specialization of an existing compiler that adds networking domain information to existing (or new) compiler optimization passes by extracting the needed information from the source code and from the runtime execution of the NF, a mechanism similar to the one used by Just-In-Time (JIT) compilers.
Kecleon Core. We implemented Kecleon using the LLVM [99] framework (v10.0.0) and, in particular, the MCJIT execution engine, which provides an environment for the code generation and the manipulation of the original data plane code in its IR representation. Kecleon is a separate daemon running in the system that receives the NF code. It converts the code into the LLVM IR representation (i.e., a single LLVM Module object or a set of them), which is then analyzed and optimized through the various Kecleon optimization passes. The conversion between the source code and the LLVM IR representation is done by the compiler front-end (e.g., Clang for languages in the C language family); Kecleon does not perform any manipulation of the front-end's Abstract Syntax Tree (AST) and is therefore independent of the front-end. It would then be possible to run Kecleon by directly providing the IR representation of the original program (i.e., the .bc file) instead of the entire source code.
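For orientation only, loading a pre-compiled .bc module and handing it to MCJIT looks roughly as follows with the LLVM C API; the Kecleon core uses the C++ API, error handling is minimal, and the invocation of the Kecleon passes is omitted.

#include <stdio.h>
#include <llvm-c/Core.h>
#include <llvm-c/IRReader.h>
#include <llvm-c/ExecutionEngine.h>
#include <llvm-c/Target.h>

int load_and_jit(const char *bc_path)
{
    char *err = NULL;
    LLVMMemoryBufferRef buf;
    LLVMModuleRef mod;
    LLVMExecutionEngineRef ee;
    struct LLVMMCJITCompilerOptions opts;

    LLVMLinkInMCJIT();
    LLVMInitializeNativeTarget();
    LLVMInitializeNativeAsmPrinter();

    if (LLVMCreateMemoryBufferWithContentsOfFile(bc_path, &buf, &err)) {
        fprintf(stderr, "cannot read %s: %s\n", bc_path, err);
        return -1;
    }
    /* Obtain the IR module that the Kecleon passes would analyze,
     * instrument, and optimize. */
    if (LLVMParseIRInContext(LLVMGetGlobalContext(), buf, &mod, &err)) {
        fprintf(stderr, "cannot parse IR: %s\n", err);
        return -1;
    }
    /* MCJIT provides the environment used to emit the optimized data path. */
    LLVMInitializeMCJITCompilerOptions(&opts, sizeof(opts));
    if (LLVMCreateMCJITCompilerForModule(&ee, mod, &opts, sizeof(opts), &err)) {
        fprintf(stderr, "cannot create MCJIT engine: %s\n", err);
        return -1;
    }
    return 0;
}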

7.5.1 eBPF Plugin

The Kecleon eBPF plugin is implemented as part of the BPF Compiler Collection (BCC), which already includes a set of abstractions and helper functions to interact with the eBPF APIs. In addition, BCC is already built around the Clang/LLVM toolchain, which simplifies the interaction between the eBPF source code and the rest of the compiler toolchain. Since Polycube (§4) is also built around BCC, this Kecleon eBPF plugin implementation can be used to dynamically (and automatically) optimize Polycube NFs, as we will see in Section 7.6.

7.6 Evaluation

This evaluation, although still preliminary, aims at validating (i) the performance improvements that real-world applications can

achieve by applying the Kecleon dynamic compilation framework and (ii) how much operational and performance overhead Kecleon introduces in the normal NF workflow.

7.6.1 Setup

Our testbed includes a first server used as DUT, running the firewall under test, and a second one used as packet generator (and possibly receiver). The DUT encompasses an Intel Xeon Gold 5120 14-core CPU @2.20GHz (hyper-threading disabled) with support for Intel's Data Direct I/O (DDIO) [84], 19.25 MB of L3 cache and two 32GB RAM modules. The packet generator is equipped with an Intel® Xeon E3-1245 v5 4-core CPU @3.50GHz (8 logical cores with hyper-threading), 8MB of L3 cache and two 16GB RAM modules. Both servers run Ubuntu 18.04.1 LTS, with the packet generator using kernel 4.15.0-36 and the DUT running kernel 4.19.0. Each server has a dual-port Intel XL710 40Gbps NIC, each port directly connected to the corresponding port of the other server.

7.6.2 eBPF NFs (Polycube)

In this section, we take some of the eBPF-based NFs implemented in Polycube and evaluate the performance improvements that result from using Kecleon to optimize their pipeline. Figure 7.6 shows the result of this evaluation for four different NFs. The effectiveness and performance advantages brought by Kecleon depend on different factors. The first one is the way the data plane of the NF is written. We have not modified the original code of the Polycube NFs under consideration, but it is important to note that most of these NFs were developed with the dynamic reloading capability of eBPF and Polycube in mind; therefore, several "manual" optimizations are already applied to their source code, which already improve the speed of the original NF data plane. Nevertheless, the performance advantages brought by the use of Kecleon are evident, with an improvement ranging from 20% to 50% depending on the scenario. The other two factors that affect the performance and the evaluation results are the runtime configuration of the NF and the input traffic that is received. Kecleon uses instrumentation to gather runtime information about the traffic received by the NF and to optimize the data path accordingly. If the traffic has a well-defined behavior and a high locality, the improvements brought by Kecleon are more evident. The same applies to the MAT contents and configuration settings. We have evaluated ten different configuration settings and 20 different input traffic traces for each NF under consideration; we report the average, maximum, and minimum throughput obtained across the different runs in Figure 7.6. We will further investigate the impact of the configuration and input traffic in Section 7.6.3.


[Figure 7.6 (plot): grouped bars for Original, Maximum Opt., Average Opt., Minimum Opt.; y-axis: Throughput (Mpps), 0–5; x-axis: L2 Bridge, Router, NAT, LB (DSR).]

Figure 7.6: Single-core throughput for various Polycube eBPF-based NFs. We report the average, maximum, and minimum throughput values under different configuration and traffic setups when using Kecleon to generate the optimized code.

7.6.3 eBPF firewall (bpf-iptables)

In this section, we evaluate the effectiveness of Kecleon with a more complex application, namely the bpf-iptables firewall presented in Chapter 5. We used Classbench [155] to generate ten different configuration settings for the firewall and tested them with three packet traces characterized by different traffic locality. The no locality trace contains traffic that is uniformly distributed over the entire ruleset; this is the worst scenario for the Kecleon optimization passes that rely on the results of the instrumentation to optimize the output code (e.g., the CC pass in Section 7.4.3.3). On the other hand, the high locality trace contains a set of elephant flows matching a subset of the entire ruleset, which is a more favorable scenario for the Kecleon dynamic passes. As we can see in Figure 7.7, in all three cases the Kecleon dynamic compilation approach provides performance advantages, which are more evident when the high locality trace is used. With the no locality trace, the improvement is less visible, since the cost paid for the instrumentation outweighs the enhancement that we can get from the instrumentation-based optimization passes and can sometimes even cancel out the improvements of the other passes (e.g., DSS or DCE). We will evaluate the cost of the instrumentation in Section 7.6.4.1.


[Figure 7.7 (plot): bars for Original and Optimized; y-axis: Throughput (Mpps), 0–4; x-axis: No locality, Low locality, High locality.]

Figure 7.7: Single-core throughput for bpf-iptables. We report the throughput under three different Classbench-generated traces, ranging from no locality (traffic uniformly distributed) to high locality (few elephant flows).

7.6.4 Microbenchmarks

7.6.4.1 Instrumentation overhead

The overhead of Kecleon comes mainly from the instrumentation phase, where the original data plane code is "augmented" with additional instructions used to gather runtime data from the running data path; this augmentation is done automatically by Kecleon. In Figure 7.8 we evaluate the cost of this overhead for the bpf-iptables application. The results shown are taken from ten different runs with ten different configuration settings and traffic profiles, with the average result displayed in the figure. For this test we disabled the optimization passes, so that only the overhead is measured. As we can see, the cost of the instrumentation is non-negligible, and it explains the "low" improvement with the no locality trace shown in Figure 7.7. Although better approaches exist in the literature to achieve the same goal [78, 146, 164] with a lower cost and better efficiency (e.g., count–min sketches), they require special data structures or operations that may not be available on every target platform. On the other hand, our approach based on local MATs, although more costly, does not impose special requirements in terms of data structures.
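To give a sense of what this instrumentation could look like, the following is a minimal sketch, in BCC-style restricted C, of a per-entry hit counter inserted around a MAT lookup. The map name (rule_hits), the MAX_RULES bound and the helper are hypothetical, since the real instrumentation is generated automatically by Kecleon and may differ.

    /* Hypothetical instrumentation: count, per CPU, how often each rule of a
     * classification MAT is hit, so that the optimizer can later specialize
     * the generated code for the most frequent entries. Names are illustrative. */
    #define MAX_RULES 1024

    /* Local per-CPU MAT used only to collect execution profiles. */
    BPF_PERCPU_ARRAY(rule_hits, u64, MAX_RULES);

    static __always_inline void record_hit(u32 rule_id) {
        if (rule_id >= MAX_RULES)
            return;
        u64 *cnt = rule_hits.lookup(&rule_id);
        if (cnt)
            (*cnt)++;                 /* per-CPU counter: no atomics needed */
    }

    /* At the original lookup site, the instrumented code only adds a call
     *     record_hit(matched_rule_id);
     * right after the rule matching logic, leaving the verdict path untouched. */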


[Figure 7.8 (plot): bars for Original and Overhead; y-axis: Throughput (Mpps), 0–2; x-axis: Instrumentation, Guards.]

Figure 7.8: Single-core throughput for bpf-iptables when only the Kecleon instrumentation is applied, without any of the optimization passes taking effect. The first bar indicates the overhead of the implicit traffic-specific approach used by Kecleon, while the second indicates the overhead of the guard tables.

Finally, we have also measured the overhead of the guard tables, which are used by the Kecleon CC pass to preserve the original NF semantics when the various transformations are applied. The cost of these tables is less evident because, differently from the instrumentation tables, they are used only when a stateful operation is found; in bpf-iptables only one map, the conntrack table, is used to store the state of the connections.

7.7 Conclusions and Future Works

In this Chapter, we have first demonstrated that the performance of most software data planes today is heavily influenced by runtime parameters, which cannot be exploited with the standard "static" compilation process. We presented Kecleon, a runtime compiler for software data planes that exploits runtime information, such as configuration data, runtime table contents or current execution profiles, to generate a version of the original data plane that is better tailored to its runtime behavior.

The Kecleon compilation pipeline is divided into three different steps. First, an analysis phase identifies and extracts the packet processing logic of the application. This step works under the assumption that all the data plane code under analysis is written according to a DSL that provides common abstractions and a fixed set of APIs to perform stateful operations.

Then, an instrumentation phase is used to monitor and extract the current execution profiles of the data plane. Finally, an optimization phase uses all the previously obtained data to optimize the code: for example, it removes redundant instructions that are never hit under the current configuration and execution profiles, or it substitutes a MAT with a more efficient version based on its content. We have evaluated the performance of Kecleon on different eBPF-based NFs and we have shown that it is effective under several configuration settings and traffic profiles.

Future works. We have implemented in Kecleon all the previously mentioned compiler passes and the eBPF plugin; however, we are still missing the evaluation of other, independently developed real-world applications such as OvS-eBPF, Cilium, and Katran. We are planning to include a DPDK plugin for Kecleon so that it can be used within a vast range of applications and DPDK-based frameworks. According to our "manual" experiments, the performance benefits that we can obtain for those applications are even more evident, given the greater control over the operations that can be performed and, consequently, over the optimizations that Kecleon can apply. Finally, as mentioned in Section 7.4.3.1, we are planning to extend the type of analysis currently performed by Kecleon by adding a path-aware analysis. Through symbolic execution and program slicing methods, Kecleon could apply even more aggressive optimizations that are not possible with the current prototype.

Chapter 8

Concluding Remarks

In this dissertation, we explored the new challenges and requirements that software packet processing is facing in the microservice era. Traditional monolithic approaches to building and running virtual network services are at odds with the cloud-native paradigm, and simply transforming those monolithic patterns into analogous cloud-native counterparts results in manageability, scalability, and performance problems. Our attempt to solve this issue brought us to the design and implementation of Polycube and Kecleon, and to an additional study of how SmartNICs can be used to improve such network services. The former's goal is to provide a common infrastructure to build, manage, and run such networking services following the same paradigms as microservice applications. Polycube services can be dynamically combined to form in-kernel service chains and can be externally controlled through a set of service-specific APIs that allow customizing the service behavior. The Polycube service data plane, which uses the extended Berkeley Packet Filter (eBPF), enables a level of customization, programmability, and performance that was not possible before, while keeping compatibility with existing applications and with "native" environments. On the other hand, Kecleon enables an automatic specialization of those network services by exploiting runtime knowledge of configuration data, data structure contents and execution profiles to automatically generate an optimized version of the original network function data plane that is more efficient for the current runtime behavior.

In the future, we envision a novel model of software network functions that satisfies the following conditions. First, they should be able to provide advanced, programmable and efficient services to cloud-native applications, while being able to dynamically adapt to the continuously changing runtime environment and configuration conditions that are common in this scenario. Second, they should adhere to the same design patterns and concepts of the microservice paradigm, enabling a continuous delivery workflow, horizontal scaling with efficient use of network resources, and greater control through well-defined APIs.


We hope that the combination of the works presented in this dissertation can lay the foundation for a new model of packet processing applications that can be dynamically re-combined, re-generated and re-optimized without sacrificing programmability, extensibility and performance. We release the source code of all the systems presented in this thesis, and we hope it will be useful for both researchers and industrial system developers to build upon.

Appendix A

List of Publications

The following is the complete list of publications produced during the Ph.D.

• Miano, S., & Risso, F. (2020, March). A Micro-service Approach for Cloud- Native Network Services. In 2020 ACM SIGCOMM Symposium on SDN Research (SOSR). ACM.

• Miano, S., & Risso, F. (2020). Transforming a traditional home gateway into a hardware-accelerated SDN switch. International Journal of Electrical and Computer Engineering, 10(3), 2668.

• Miano, S., Bertrone, M., Risso, F., Bernal, M. V., Lu, Y., Pi, J., & Shaikh, A. (2019, September). A Service-Agnostic Software Framework for Fast and Efficient In-Kernel Network Services. In 2019 ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS) (pp. 1- 9). IEEE.

• Miano, S., Doriguzzi-Corin, R., Risso, F., Siracusa, D., & Sommese, R. (2019). Introducing SmartNICs in Server-Based Data Plane Processing: The DDoS Mitigation Use Case. IEEE Access, 7, 107161-107170.

• Miano, S., Bertrone, M., Risso, F., Bernal, M. V., Lu, Y., & Pi, J. (2019). Securing Linux with a faster and scalable iptables. ACM SIGCOMM Com- puter Communication Review, 49(3), 2-17.

• Miano, S., Bertrone, M., Risso, F., & Tumolo, M. (2018, August). Acceler- ating linux security with ebpf iptables. In Proceedings of the ACM SIGCOMM 2018 Conference on Posters and Demos (pp. 108-110).

• Bertrone, M., Miano, S., Pi, J., Risso, F., & Tumolo, M. (2018). Toward an eBPF-based clone of iptables. Netdev’18.


• Miano, S., Bertrone, M., Risso, F., Tumolo, M., & Bernal, M. V. (2018, June). Creating complex network services with ebpf: Experience and lessons learned. In 2018 IEEE 19th International Conference on High Performance Switching and Routing (HPSR) (pp. 1-8). IEEE.

• Miano, S., Risso, F., & Woesner, H. (2017, July). Partial offloading of Open- Flow rules on a traditional hardware switch ASIC. In 2017 IEEE Conference on Network Softwarization (NetSoft) (pp. 1-9). IEEE.

• Bonafiglia, R., Miano, S., Nuccio, S., Risso, F., & Sapio, A. (2016, Oc- tober). Enabling NFV services on resource-constrained CPEs. In 2016 5th IEEE International Conference on Cloud Networking (Cloudnet) (pp. 83-88). IEEE.

Bibliography

[1] Zaafar Ahmed, Muhammad Hamad Alizai, and Affan A. Syed. “InKeV: In-Kernel Distributed Network Virtualization for DCN”. In: SIGCOMM Comput. Commun. Rev. 46.3 (July 2018). issn: 0146-4833. doi: 10.1145/ 3243157.3243161. url: https://doi.org/10.1145/3243157.3243161. [2] Omid Alipourfard and Minlan Yu. “Decoupling Algorithms and Optimiza- tions in Network Functions”. In: Proceedings of the 17th ACM Workshop on Hot Topics in Networks. 2018, pp. 71–77. [3] Mohammad Alizadeh et al. “CONGA: Distributed Congestion-Aware Load Balancing for Datacenters”. In: Proceedings of the 2014 ACM Conference on SIGCOMM. SIGCOMM ’14. Chicago, Illinois, USA: Association for Com- puting Machinery, 2014, pp. 503–514. isbn: 9781450328364. doi: 10.1145/ 2619239.2626316. url: https://doi.org/10.1145/2619239.2626316. [4] Esraa Alomari et al. “Botnet-based distributed denial of service (DDoS) at- tacks on web servers: classification and art”. In: arXiv preprint arXiv:1208.0403 (2012). [5] Arbor Networks. Worldwide Infrastructure Security Report. 2017. url: https: / / pages . arbornetworks . com / rs / 082 - KNA - 087 / images / 13th % 5C _ Worldwide%5C_Infrastructure%5C_Security%5C_Report.pdf (visited on 03/17/2019). [6] AT&T. AT&T DIRECTV. [Online; last-retrieved 08-April-2020]. 2020. url: https://www.att.com/bundles/directv-wireless/. [7] BCC Authors. HTTP Filter. [Online; last-retrieved 15-November-2018]. Feb. 2016. url: https://github.com/iovisor/bcc/tree/master/examples/ networking/http_filter. [8] Cilium Authors. BPF and XDP Reference Guide. July 2018. url: https: //cilium.readthedocs.io/en/latest/bpf/ (visited on 03/17/2019). [9] Cilium authors. Diagram of Kubernetes / kube-proxy iptables rules architec- ture. Jan. 2019. url: https://web.archive.org/web/20200414131802/ https://github.com/cilium/k8s-iptables-diagram.


[10] Lighttpd authors. weighttp: a lightweight and simple webserver benchmarking tool. [Online; last-retrieved 10-November-2018]. Nov. 2018. url: https:// web.archive.org/web/20190718231322/http://redmine.lighttpd. net/projects/weighttp/wiki. [11] Netfilter Authors. Moving from iptables to nftables. [Online; last-retrieved 10-October-2018]. Oct. 2018. url: https : / / web . archive . org / web / 20200308125421/https://wiki.nftables.org/wiki-nftables/index. php/Moving_from_iptables_to_nftables. [12] Polycube Authors. Polycube: eBPF/XDP-based software framework for fast network services running in the Linux kernel. [Online; last-retrieved 22- July-2019]. Jan. 2019. url: https://github.com/polycube- network/ polycube. [13] The Network Service Mesh authors. What is Network Service Mesh? Jan. 2020. url: https://web.archive.org/web/20200414131411/https: //networkservicemesh.io/docs/concepts/what-is-nsm/. [14] Pablo Neira Ayuso. [PATCH RFC PoC 0/3] nftables meets bpf. [Online; last-retrieved 29-March-2019]. Feb. 2018. url: https://web.archive.org/ web/20191024111400/https://www.mail-archive.com/netdev@vger. kernel.org/msg217425.html. [15] Paul Baran. “On distributed communications networks”. In: IEEE transac- tions on Communications Systems 12.1 (1964), pp. 1–9. [16] Tom Barbette et al. “A High-Speed Load-Balancer Design with Guaranteed Per-Connection-Consistency”. In: 17th USENIX Symposium on Networked Systems Design and Implementation (NSDI 20). Santa Clara, CA: USENIX Association, Feb. 2020, pp. 667–683. isbn: 978-1-939133-13-7. url: https: //www.usenix.org/conference/nsdi20/presentation/barbette. [17] BCC authors. BPF Compiler Collection (BCC). url: https://web.archive. org/web/20181106133143/https://www.iovisor.org/technology/bcc. [18] Sunny Behal and Krishan Kumar. “Detection of DDoS attacks and flash events using information theory metrics–an empirical investigation”. In: Com- puter Communications 103 (2017), pp. 18–28. [19] Sunny Behal and Krishan Kumar. “Detection of DDoS attacks and flash events using novel information theory metrics”. In: Computer Networks 116 (2017), pp. 96–110.


[20] Mauricio Vásquez Bernal et al. “A Transparent Highway for Inter-Virtual Network Function Communication with Open VSwitch”. In: Proceedings of the 2016 ACM SIGCOMM Conference. SIGCOMM ’16. Florianopolis, Brazil: Association for Computing Machinery, 2016, pp. 603–604. isbn: 9781450341936. doi: 10 . 1145 / 2934872 . 2959068. url: https : / / doi . org / 10 . 1145 / 2934872.2959068. [21] Gilberto Bertin. “XDP in practice: integrating XDP into our DDoS mit- igation pipeline”. In: Technical Conference on Linux Networking, Netdev. 2017. [22] Matteo Bertrone et al. “Accelerating Linux Security with EBPF Iptables”. In: Proceedings of the ACM SIGCOMM 2018 Conference on Posters and De- mos. SIGCOMM ’18. Budapest, Hungary: Association for Computing Ma- chinery, 2018, pp. 108–110. isbn: 9781450359153. doi: 10.1145/3234200. 3234228. url: https://doi.org/10.1145/3234200.3234228. [23] Monowar H Bhuyan, DK Bhattacharyya, and Jugal K Kalita. “An empirical evaluation of information metrics for low-rate and high-rate DDoS attack detection”. In: Pattern Recognition Letters 51 (2015), pp. 1–7. [24] Andy Bierman, Martin Björklund, and Kent Watsen. RESTCONF Protocol. RFC 8040. Jan. 2017. doi: 10.17487/RFC8040. url: https://rfc-editor. org/rfc/rfc8040.txt. [25] Martin Bjorklund. “The YANG 1.1 Data Modeling Language”. In: (2016). url: https://tools.ietf.org/html/rfc7950. [26] Brenden Blanco and Yunsong Lu. Leveraging XDP for Programmable, High Performance Data Path in OpenStack. Oct. 2016. url: https : / / www . openstack.org/videos/barcelona-2016/leveraging-express-data- path - xdp - for - programmable - high - performance - data - path - in - openstack (visited on 03/17/2019). [27] Roberto Bonafiglia et al. “Enabling NFV services on resource-constrained CPEs”. In: 2016 5th IEEE International Conference on Cloud Networking (Cloudnet). IEEE. 2016, pp. 83–88. [28] Gianluca Borello. Sysdig and Falco now powered by eBPF. Feb. 2019. url: https://web.archive.org/web/20200528103951/https://sysdig.com/ blog/sysdig-and-falco-now-powered-by-ebpf/. [29] D. Borkmann. net: add bpfilter. [Online; last-retrieved 30-June-2018]. Feb. 2018. url: https://web.archive.org/web/20190517134842/https: //lwn.net/Articles/747504/. [30] Daniel Borkmann. bpf, x64: implement retpoline for tail call. Feb. 2018. url: https://web.archive.org/web/20200624134452/https://lwn.net/ Articles/747859/. 147 BIBLIOGRAPHY

[31] Daniel Borkmann. bpf: add csum_diff helper to xdp as. well In Linux Kernel, commit 205c380778d0. Jan. 2018. url: https://web.archive.org/web/ 20200405150954/https://patchwork.ozlabs.org/patch/863867/. [32] Daniel Borkmann. Optimize BPF tail calls for direct jumps. Nov. 2019. url: https://web.archive.org/web/20200624134635/https://lwn.net/ Articles/805660/. [33] Pat Bosshart et al. “P4: Programming protocol-independent packet proces- sors”. In: ACM SIGCOMM Computer Communication Review 44.3 (2014), pp. 87–95. [34] Pat Bosshart et al. “Forwarding Metamorphosis: Fast Programmable Match- action Processing in Hardware for SDN”. In: Proceedings of the ACM SIG- COMM 2013 Conference on SIGCOMM. SIGCOMM ’13. Hong Kong, China: ACM, 2013, pp. 99–110. isbn: 978-1-4503-2056-6. doi: 10.1145/2486001. 2486011. url: http://doi.acm.org/10.1145/2486001.2486011. [35] Anat Bremler-Barr, Yotam Harchol, and David Hay. “OpenBox: A Software- Defined Framework for Developing, Deploying, and Managing Network Func- tions”. In: Proceedings of the 2016 ACM SIGCOMM Conference. SIGCOMM ’16. Florianopolis, Brazil: Association for Computing Machinery, 2016, pp. 511– 524. isbn: 9781450341936. doi: 10.1145/2934872.2934875. url: https: //doi.org/10.1145/2934872.2934875. [36] Jesper Dangaard Brouer. XDP Drivers. [Online; last-retrieved 18-September- 2018]. 2018. url: https : / / web . archive . org / web / 20190514224949 / https://prototype-kernel.readthedocs.io/en/latest/networking/ XDP/implementation/drivers.html. [37] Jesper Dangaard Brouer. XDP redirect measurements, gotchas and trace- points. Aug. 2017. url: https://web.archive.org/web/20180311113307/ https://www.spinics.net/lists/xdp-newbies/msg00269.html. [38] Jesper Dangaard Brouer. XDP redirect memory return API. Mar. 2018. url: https://web.archive.org/web/20180316133406/https://lwn.net/ Articles/748866/. [39] Jesper Dangaard Brouer and Toke Høiland-Jørgensen. “XDP: challenges and future work”. In: LPC’18 Networking Track. Linux Plumbers Conference. 2018. [40] Maxweel Carmo et al. “Network-cloud Slicing definitions for Wi-Fi shar- ing systems to enhance 5G ultra dense network capabilities”. In: Wireless Communications and Mobile Computing 2019 (2019).


[41] Adrian Caulfield, Paolo Costa, and Manya Ghobadi. “Beyond SmartNICs: Towards a Fully Programmable Cloud”. In: IEEE International Conference on High Performance Switching and Routing. June 2018. url: https:// www.microsoft.com/en-us/research/publication/beyond-smartnics- towards-fully-programmable-cloud/. [42] Cilium authors. HTTP, gRPC, and Kafka Aware Network Security and Net- working for Containers with BPF and XDP. url: https://cilium.io/. [43] Cisco. Cloud-native Network Function. Jan. 2020. url: https : / / web . archive . org / web / 20200414130638 / https : / / www . cisco . com / c / dam / m / en _ us / network - intelligence / service - provider / digital - transformation / knowledge - network - webinars / pdfs / 1128 _ TECHAD _ CKN_PDF.pdf. [44] Cisco. FD.io - Vector Packet Processing. Whitepaper. Intel, 2017. url: https://fd.io/docs/whitepapers/FDioVPPwhitepaperJuly2017.pdf. [45] Mike Cohen. Monitoring Kubernetes Networking with eBPF. Apr. 2020. url: https://web.archive.org/web/20200624075055/https://www. flowmill.com/monitoring-kubernetes-networking-with-ebpf/. [46] Linux community. xdp_redirect_map_kern application. May 2020. url: https: / / web . archive . org / web / 20200522140154 / https : / / github . com / torvalds/linux/blob/master/samples/bpf/xdp_redirect_map_kern. c/. [47] J. Corbet. Nftables: a new packet filtering engine. Ed. by LWN.nnet. [Online; last-retrieved 30-June-2018]. Mar. 2009. url: https://web.archive.org/ web/20200331215756/https://lwn.net/Articles/324989/. [48] Jonathan Corbet. BPF comes to firewalls. [Online; last-retrieved 29-March- 2019]. Feb. 2018. url: https://web.archive.org/web/20200401061115/ https://lwn.net/Articles/747551/. [49] Jonathan Corbet. Concurrency management in BPF. Feb. 2019. url: https: //web.archive.org/web/20200331222010/https://lwn.net/Articles/ 779120/. [50] Edward Cree. Bounded Loops for eBPF. Feb. 2018. url: https://web. archive . org / web / 20180308141915 / https : / / lwn . net / Articles / 748032/. [51] CVE-2019-7308. Available from MITRE, CVE-ID CVE-2019-7308. Feb. 2020. url: https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019- 7308. [52] CVE-2020-8835. Available from MITRE, CVE-ID CVE-2020-8835. Feb. 2020. url: https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020- 8835. 149 BIBLIOGRAPHY

[53] J. Daly and E. Torng. “TupleMerge: Building Online Packet Classifiers by Omitting Bits”. In: 2017 26th International Conference on Computer Com- munication and Networks (ICCCN). 2017, pp. 1–10. [54] DPDK. Data Plane Development Kit. June 2018. url: https://www.dpdk. org/ (visited on 03/17/2019). [55] DPDK. Pktgen Traffic Generator Using DPDK. Aug. 2018. url: http:// dpdk.org/git/apps/pktgen-dpdk. [56] Korian Edeline et al. “mmb: flexible high-speed userspace middleboxes”. In: Proceedings of the Applied Networking Research Workshop. 2019, pp. 62–68. [57] Daniel E. Eisenbud et al. “Maglev: A Fast and Reliable Software Network Load Balancer”. In: 13th USENIX Symposium on Networked Systems Design and Implementation (NSDI 16). Santa Clara, CA, 2016, pp. 523–535. url: https://www.usenix.org/conference/nsdi16/technical-sessions/ presentation/eisenbud. [58] Paul Emmerich et al. “MoonGen: A Scriptable High-Speed Packet Gener- ator”. In: Proceedings of the 2015 Internet Measurement Conference. IMC ’15. Tokyo, Japan: Association for Computing Machinery, 2015, pp. 275– 287. isbn: 9781450338486. doi: 10.1145/2815675.2815692. url: https: //doi.org/10.1145/2815675.2815692. [59] ETSI. Network Functions Virtualization. Feb. 2017. url: https://web. archive.org/web/20200321091042/https://www.etsi.org/technologies/ nfv. [60] The Technology Evangelist. Kernel Bypass = Security Bypass. Dec. 2017. url: https : / / web . archive . org / web / 20200107102112 / https : / / technologyevangelist . co / 2017 / 12 / 05 / kernel - bypass - security - bypass/. [61] Arthur Fabre. L4Drop: XDP DDoS Mitigations. 2018. url: https://web. archive.org/web/20190927231336/https://blog.cloudflare.com/ l4drop-xdp-ebpf-based-ddos-mitigations/ (visited on 03/17/2019). [62] Soheli Farhana et al. “Impact of big data congestion in IT: An adaptive knowledge-based Bayesian network.” In: International Journal of Electrical & Computer Engineering (2088-8708) 10 (2020). [63] Seyed Kaveh Fayazbakhsh et al. “Enforcing Network-Wide Policies in the Presence of Dynamic Middlebox Actions using FlowTags”. In: 11th USENIX Symposium on Networked Systems Design and Implementation (NSDI 14). Seattle, WA: USENIX Association, Apr. 2014, pp. 543–546. isbn: 978-1- 931971-09-6. url: https : / / www . usenix . org / conference / nsdi14 / technical-sessions/presentation/fayazbakhsh.


[64] Daniel Firestone et al. “Azure Accelerated Networking: SmartNICs in the Public Cloud”. In: 15th USENIX Symposium on Networked Systems De- sign and Implementation (NSDI 18). Renton, WA: USENIX Association, 2018, pp. 51–66. isbn: 978-1-939133-01-4. url: https://www.usenix.org/ conference/nsdi18/presentation/firestone. [65] Matt Fleming. A thorough introduction to eBPF. Dec. 2017. url: https: //lwn.net/Articles/740157/ (visited on 03/17/2019). [66] Massimo Gallo and Rafael Laufer. “ClickNF: a Modular Stack for Cus- tom Network Functions”. In: 2018 USENIX Annual Technical Conference (USENIX ATC 18). Boston, MA: USENIX Association, July 2018, pp. 745– 757. isbn: 978-1-939133-01-4. url: https://www.usenix.org/conference/ atc18/presentation/gallo. [67] Aaron Gember-Jacobson et al. “OpenNF: Enabling Innovation in Network Function Control”. In: Proceedings of the 2014 ACM Conference on SIG- COMM. SIGCOMM ’14. Chicago, Illinois, USA: Association for Comput- ing Machinery, 2014, pp. 163–174. isbn: 9781450328364. doi: 10 . 1145 / 2619239.2626313. url: https://doi.org/10.1145/2619239.2626313. [68] T. Graf. Why is the kernel community replacing iptables with BPF? [Online; last-retrieved 30-June-2018]. Apr. 2018. url: https://web.archive.org/ web/20191203111322/https://cilium.io/blog/2018/04/17/why-is- the-kernel-community-replacing-iptables/. [69] Brendan Gregg. Linux Extended BPF (eBPF) Tracing Tools. May 2020. url: https://web.archive.org/web/20200528100948/http://www. brendangregg.com/ebpf.html. [70] Brendan Gregg. Security Monitoring with eBPF. Feb. 2017. url: https: //web.archive.org/web/20200523175543/http://www.brendangregg. com/Slides/BSidesSF2017_BPF_security_monitoring.pdf. [71] Pankaj Gupta and Nick McKeown. “Packet classification using hierarchical intelligent cuttings”. In: Hot Interconnects VII. Vol. 40. 1999. [72] Sangjin Han et al. “PacketShader: A GPU-Accelerated Software Router”. In: Proceedings of the ACM SIGCOMM 2010 Conference. SIGCOMM ’10. New Delhi, India: Association for Computing Machinery, 2010, pp. 195– 206. isbn: 9781450302012. doi: 10.1145/1851182.1851207. url: https: //doi.org/10.1145/1851182.1851207. [73] Sangjin Han et al. “SoftNIC: A software NIC to augment hardware”. In: (2015).


[74] Keqiang He et al. “Presto: Edge-Based Load Balancing for Fast Datacen- ter Networks”. In: Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication. SIGCOMM ’15. London, United Kingdom: Association for Computing Machinery, 2015, pp. 465–478. isbn: 9781450335423. doi: 10.1145/2785956.2787507. url: https://doi.org/ 10.1145/2785956.2787507. [75] Toke Høiland-Jørgensen et al. “The eXpress Data Path: Fast Programmable Packet Processing in the Operating System Kernel”. In: Proceedings of the 14th International Conference on Emerging Networking EXperiments and Technologies. CoNEXT ’18. Heraklion, Greece: ACM, 2018, pp. 54–66. isbn: 978-1-4503-6080-7. doi: 10.1145/3281411.3281443. url: http://doi. acm.org/10.1145/3281411.3281443. [76] Christian Hopps. “Katran: A high performance layer 4 load balancer”. In: https://github.com/facebookincubator/katran. Sept. 2019. [77] Simon Horman. “TC Flower Offload”. In: Technical Conference on Linux Networking, Netdev. 2017. [78] Qun Huang et al. “Sketchvisor: Robust network measurement for software packet processing”. In: Proceedings of the Conference of the ACM Special Interest Group on Data Communication. 2017, pp. 113–126. [79] Jinho Hwang, K. K. Ramakrishnan, and Timothy Wood. “NetVM: High Performance and Flexible Networking Using Virtualization on Commodity Platforms”. In: Proceedings of the 11th USENIX Conference on Networked Systems Design and Implementation. NSDI’14. Seattle, WA: USENIX Asso- ciation, 2014, pp. 445–458. isbn: 9781931971096. [80] Broadcom Inc. Stingray PS225: Product Brief. July 2019. url: https:// web.archive.org/web/20200625112605/https://docs.broadcom.com/ doc/PS225-PB. [81] Docker Inc. Docker. [Online; last-retrieved 30-June-2018]. 2018. url: https: //www.docker.com/. [82] Google Inc. Kubernetes: Production-Grade Container Orchestration. [Online; last-retrieved 22-July-2019]. July 2019. url: https://kubernetes.io/. [83] Intel. FD.io - Vector Packet Processing. Whitepaper. Intel, 2017. [84] Intel(R). Intel® Data Direct I/O Technology. [Online; last-retrieved 09-November- 2018]. 2018. url: https : / / web . archive . org / web / 20170226040117 / https://www.intel.it/content/www/it/it/io/data-direct-i-o- technology.html. [85] Rishabh Iyer et al. “Performance contracts for software network functions”. In: 16th {USENIX} Symposium on Networked Systems Design and Imple- mentation ({NSDI} 19). 2019, pp. 517–530. 152 BIBLIOGRAPHY

[86] Ethan J. Jackson et al. “SoftFlow: A Middlebox Architecture for Open vSwitch”. In: 2016 USENIX Annual Technical Conference (USENIX ATC 16). Denver, CO: USENIX Association, June 2016, pp. 15–28. isbn: 978- 1-931971-30-0. url: https : / / www . usenix . org / conference / atc16 / technical-sessions/presentation/jackson. [87] Ajit Jaokar. An introduction to Cloud Native applications and Kubernetes. Mar. 2020. url: https://web.archive.org/web/20200413093940/https: //www.datasciencecentral.com/profiles/blogs/an-introduction- to-cloud-native-applications-and-kubernetes. [88] EunYoung Jeong et al. “mTCP: a Highly Scalable User-level TCP Stack for Multicore Systems”. In: 11th USENIX Symposium on Networked Sys- tems Design and Implementation (NSDI 14). Seattle, WA: USENIX Associ- ation, Apr. 2014, pp. 489–502. isbn: 978-1-931971-09-6. url: https://www. usenix.org/conference/nsdi14/technical-sessions/presentation/ jeong. [89] Vimalkumar Jeyakumar et al. “EyeQ: Practical Network Performance Isola- tion at the Edge”. In: Presented as part of the 10th USENIX Symposium on Networked Systems Design and Implementation (NSDI 13). Lombard, IL: USENIX, 2013, pp. 297–311. isbn: 978-1-931971-00-3. url: https://www. usenix.org/conference/nsdi13/technical-sessions/presentation/ jeyakumar. [90] Murad Kablan et al. “Stateless Network Functions: Breaking the Tight Cou- pling of State and Processing”. In: 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 17). Boston, MA: USENIX As- sociation, Mar. 2017, pp. 97–112. isbn: 978-1-931971-37-9. url: https:// www.usenix.org/conference/nsdi17/technical-sessions/presentation/ kablan. [91] József Kadlecsik and György Pásztor. “Netfilter performance testing”. In: (2004). [92] Priyanka Kamboj et al. “Detection techniques of DDoS attacks: A survey”. In: 2017 4th IEEE Uttar Pradesh Section International Conference on Elec- trical, Computer and Electronics (UPCON). IEEE. 2017, pp. 675–679. [93] Georgios P. Katsikas et al. “Metron: NFV Service Chains at the True Speed of the Underlying Hardware”. In: 15th USENIX Symposium on Networked Systems Design and Implementation (NSDI 18). Renton, WA: USENIX As- sociation, Apr. 2018, pp. 171–186. isbn: 978-1-939133-01-4. url: https: //www.usenix.org/conference/nsdi18/presentation/katsikas.


[94] Naga Katta et al. “Clove: Congestion-Aware Load Balancing at the Virtual Edge”. In: Proceedings of the 13th International Conference on Emerging Networking EXperiments and Technologies. CoNEXT ’17. Incheon, Republic of Korea: Association for Computing Machinery, 2017, pp. 323–335. isbn: 9781450354226. doi: 10.1145/3143361.3143401. url: https://doi.org/ 10.1145/3143361.3143401. [95] J. Kempf, R. Austein, and IAB. RFC3724: The Rise of the Middle and the Future of End-to-End: Reflections on the Evolution of the Internet Architec- ture. USA, 2004. [96] Eddie Kohler et al. “The Click Modular Router”. In: ACM Trans. Comput. Syst. 18.3 (Aug. 2000), pp. 263–297. issn: 0734-2071. doi: 10.1145/354871. 354874. url: https://doi.org/10.1145/354871.354874. [97] André Kohn, Viktor Leis, and Thomas Neumann. “Adaptive execution of compiled queries”. In: 2018 IEEE 34th International Conference on Data Engineering (ICDE). IEEE. 2018, pp. 197–208. [98] T.V. Lakshman and D. Stiliadis. “High-speed policy-based packet forward- ing using efficient multi-dimensional range matching”. In: ACM SIGCOMM Computer Communication Review. Vol. 28. 4. ACM. 1998, pp. 203–214. [99] Chris Lattner and Vikram Adve. “LLVM: A compilation framework for life- long program analysis & transformation”. In: International Symposium on Code Generation and Optimization, 2004. CGO 2004. IEEE. 2004, pp. 75– 86. [100] Rafael Laufer et al. “CliMB: Enabling Network Function Composition with Click Middleboxes”. In: SIGCOMM Comput. Commun. Rev. 46.4 (Dec. 2016), pp. 17–22. issn: 0146-4833. doi: 10.1145/3027947.3027951. url: https://doi.org/10.1145/3027947.3027951. [101] Moon-Sang Lee. [dpdk-dev] about poor KNI performance. Sept. 2015. url: https://web.archive.org/web/20200420141121/http://mails.dpdk. org/archives/dev/2015-September/023826.html. [102] Charles E Leiserson, Harald Prokop, and Keith H Randall. “Using de Bruijn sequences to index a 1 in a computer word”. In: Available on the Internet from http://supertech. csail. mit. edu/papers. html 3 (1998), p. 5. [103] Ming Liu et al. “Offloading distributed applications onto smartNICs using iPipe”. In: Proceedings of the ACM Special Interest Group on Data Commu- nication. 2019, pp. 318–333. [104] Marek Majkowski. Why we use the Linux kernel’s TCP stack. July 2016. url: https://web.archive.org/web/20200210223048/https://blog. cloudflare.com/why-we-use-the-linux-kernels-tcp-stack/.


[105] Ilias Marinos, Robert N.M. Watson, and Mark Handley. “Network Stack Specialization for Performance”. In: Proceedings of the 2014 ACM Confer- ence on SIGCOMM. SIGCOMM ’14. Chicago, Illinois, USA: Association for Computing Machinery, 2014, pp. 175–186. isbn: 9781450328364. doi: 10.1145/2619239.2626311. url: https://doi.org/10.1145/2619239. 2626311. [106] Joao Martins et al. “ClickOS and the Art of Network Function Virtualiza- tion”. In: Proceedings of the 11th USENIX Conference on Networked Systems Design and Implementation. NSDI’14. Seattle, WA: USENIX Association, 2014, pp. 459–473. isbn: 9781931971096. [107] Steven McCanne and Van Jacobson. “The BSD Packet Filter: A New Ar- chitecture for User-level Packet Capture”. In: Proceedings of the USENIX Winter 1993 Conference Proceedings on USENIX Winter 1993 Conference Proceedings. USENIX’93. San Diego, California: USENIX Association, 1993, pp. 2–2. url: http://dl.acm.org/citation.cfm?id=1267303.1267305. [108] S. Miano et al. “A Service-Agnostic Software Framework for Fast and Ef- ficient in-Kernel Network Services”. In: 2019 ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS). 2019, pp. 1–9. [109] S. Miano et al. “Introducing SmartNICs in Server-Based Data Plane Process- ing: The DDoS Mitigation Use Case”. In: IEEE Access 7 (2019), pp. 107161– 107170. [110] Sebastiano Miano. Custom Pktgen-DPDK version. Oct. 2018. url: https: //github.com/sebymiano/pktgen-dpdk. [111] Sebastiano Miano. eBPF Iptables with Netfilter conntrack. Mar. 2019. url: https : / / github . com / sebymiano / polycube / tree / iptables _ linux _ conntrack. [112] Sebastiano Miano and Fulvio Risso. “Transforming a traditional home gate- way into a hardware-accelerated SDN switch”. In: International Journal of Electrical and Computer Engineering 10.3 (2020), p. 2668. [113] Sebastiano Miano, Fulvio Risso, and Hagen Woesner. “Partial offloading of OpenFlow rules on a traditional hardware switch ASIC”. In: 2017 IEEE Conference on Network Softwarization (NetSoft). IEEE. 2017, pp. 1–9. [114] Sebastiano Miano et al. “Creating complex network services with ebpf: Ex- perience and lessons learned”. In: 2018 IEEE 19th International Conference on High Performance Switching and Routing (HPSR). IEEE. 2018, pp. 1–8.


[115] Sebastiano Miano et al. “Securing Linux with a Faster and Scalable Ipta- bles”. In: SIGCOMM Comput. Commun. Rev. 49.3 (Nov. 2019), pp. 2–17. issn: 0146-4833. doi: 10.1145/3371927.3371929. url: https://doi.org/ 10.1145/3371927.3371929. [116] Rui Miao et al. “SilkRoad: Making Stateful Layer-4 Load Balancing Fast and Cheap Using Switching ASICs”. In: Proceedings of the Conference of the ACM Special Interest Group on Data Communication. SIGCOMM ’17. Los Angeles, CA, USA: ACM, 2017, pp. 15–28. isbn: 978-1-4503-4653-5. doi: 10.1145/3098822.3098824. url: http://doi.acm.org/10.1145/ 3098822.3098824. [117] Thomas Heinz Michael Bellion. NF-HIPAC: High Performance Packet Clas- sification for Netfilter. Sept. 2002. url: https://web.archive.org/web/ 20051129002028/http://lwn.net/Articles/10951/. [118] Young Gyoun Moon et al. “Accelerating Flow Processing Middleboxes with Programmable NICs”. In: Proceedings of the 9th Asia-Pacific Workshop on Systems. APSys ’18. Jeju Island, Republic of Korea: ACM, 2018, 14:1–14:3. isbn: 978-1-4503-6006-7. doi: 10 . 1145 / 3265723 . 3265744. url: http : //doi.acm.org/10.1145/3265723.3265744. [119] M. Nasimi et al. “Edge-Assisted Congestion Control Mechanism for 5G Net- work Using Software-Defined Networking”. In: 2018 15th International Sym- posium on Wireless Communication Systems (ISWCS). 2018, pp. 1–5. [120] Netronome. Avoid kernel-bypass in your network infrastructure. Jan. 2017. url: https://web.archive.org/save/https://www.netronome.com/ blog/avoid-kernel-bypass-in-your-network-infrastructure/. [121] ntop. PF_RING ZC (Zero Copy). 2018. url: https : / / web . archive . org/web/20190912132122/https://www.ntop.org/products/packet- capture/pf_ring/pf_ring-zc-zero-copy/ (visited on 03/17/2019). [122] Orange. Orange TV. [Online; last-retrieved 08-April-2020]. 2020. url: https: //boutique.orange.fr/tv. [123] Shoumik Palkar et al. “E2: A Framework for NFV Applications”. In: Pro- ceedings of the 25th Symposium on Operating Systems Principles. SOSP ’15. Monterey, California: Association for Computing Machinery, 2015, pp. 121– 136. isbn: 9781450338349. doi: 10.1145/2815400.2815423. url: https: //doi.org/10.1145/2815400.2815423. [124] Maksim Panchenko et al. “Bolt: a practical binary optimizer for data centers and beyond”. In: Proceedings of the 2019 IEEE/ACM International Sympo- sium on Code Generation and Optimization. IEEE Press. 2019, pp. 2–14.


[125] Aurojit Panda et al. “NetBricks: Taking the V out of NFV”. In: Proceed- ings of the 12th USENIX Conference on Operating Systems Design and Im- plementation. OSDI’16. Savannah, GA, USA: USENIX Association, 2016, pp. 203–216. isbn: 9781931971331. [126] Luis Pedrosa et al. “Automated Synthesis of Adversarial Workloads for Net- work Functions”. In: Proceedings of the 2018 Conference of the ACM Spe- cial Interest Group on Data Communication. SIGCOMM ’18. Budapest, Hungary: Association for Computing Machinery, 2018, pp. 372–385. isbn: 9781450355674. doi: 10.1145/3230543.3230573. url: https://doi.org/ 10.1145/3230543.3230573. [127] Ben Pfaff et al. “The Design and Implementation of Open vSwitch”. In: Proceedings of the 12th USENIX Conference on Networked Systems Design and Implementation. NSDI’15. Oakland, CA: USENIX Association, 2015, pp. 117–130. isbn: 978-1-931971-218. [128] Salvatore Pontarelli et al. “FlowBlaze: Stateful Packet Processing in Hard- ware”. In: 16th USENIX Symposium on Networked Systems Design and Im- plementation (NSDI 19). Boston, MA: USENIX Association, 2019, pp. 531– 548. isbn: 978-1-931971-49-2. url: https://www.usenix.org/conference/ nsdi19/presentation/pontarelli. [129] Zafar Ayyub Qazi et al. “SIMPLE-Fying Middlebox Policy Enforcement Using SDN”. In: Proceedings of the ACM SIGCOMM 2013 Conference on SIGCOMM. SIGCOMM ’13. Hong Kong, China: Association for Computing Machinery, 2013, pp. 27–38. isbn: 9781450320566. doi: 10.1145/2486001. 2486022. url: https://doi.org/10.1145/2486001.2486022. [130] Y. Qi et al. “Packet Classification Algorithms: From Theory to Practice”. In: IEEE INFOCOM 2009. 2009, pp. 648–656. [131] Barath Raghavan et al. “Software-defined internet architecture: decoupling architecture from infrastructure”. In: Proceedings of the 11th ACM Workshop on Hot Topics in Networks. 2012, pp. 43–48. [132] Felix Rath et al. “SymPerf: Predicting Network Function Performance”. In: Proceedings of the SIGCOMM Posters and Demos. SIGCOMM Posters and Demos ’17. Los Angeles, CA, USA: Association for Computing Machinery, 2017, pp. 34–36. isbn: 9781450350570. doi: 10.1145/3123878.3131977. url: https://doi.org/10.1145/3123878.3131977. [133] Luigi Rizzo. “Netmap: a novel framework for fast packet I/O”. In: 21st USENIX Security Symposium (USENIX Security 12). 2012, pp. 101–112.


[134] Luigi Rizzo and Giuseppe Lettieri. “VALE, a Switched Ethernet for Virtual Machines”. In: Proceedings of the 8th International Conference on Emerging Networking Experiments and Technologies. CoNEXT ’12. Nice, France: As- sociation for Computing Machinery, 2012, pp. 61–72. isbn: 9781450317757. doi: 10 . 1145 / 2413176 . 2413185. url: https : / / doi . org / 10 . 1145 / 2413176.2413185. [135] P. Russell. The netfilter.org project. [Online; last-retrieved 30-June-2018]. 1998. url: https://netfilter.org/. [136] Marta Rybczyńska. Bounded loops in BPF for the 5.3 kernel. July 2019. url: https://web.archive.org/web/20200518033815/https://lwn. net/Articles/794934/. [137] Telecom Italia S.p.A. TIM Vision. [Online; last-retrieved 08-April-2020]. 2020. url: https://www.timvision.it/. [138] Jerome H Saltzer, David P Reed, and David D Clark. “End-to-end ar- guments in system design”. In: ACM Transactions on Computer Systems (TOCS) 2.4 (1984), pp. 277–288. [139] Richard Sanger, Matthew Luckie, and Richard Nelson. “Towards Transform- ing OpenFlow Rulesets to Fit Fixed-Function Pipelines”. In: Proceedings of the Symposium on SDN Research. SOSR ’20. San Jose, CA, USA: Associ- ation for Computing Machinery, 2020, pp. 123–134. isbn: 9781450371018. doi: 10 . 1145 / 3373360 . 3380844. url: https : / / doi . org / 10 . 1145 / 3373360.3380844. [140] Vyas Sekar et al. “Design and Implementation of a Consolidated Middle- box Architecture”. In: Presented as part of the 9th USENIX Symposium on Networked Systems Design and Implementation (NSDI 12). San Jose, CA: USENIX, 2012, pp. 323–336. isbn: 978-931971-92-8. url: https://www. usenix.org/conference/nsdi12/technical-sessions/presentation/ sekar. [141] Muhammad Shahbaz and Nick Feamster. “The Case for an Intermediate Representation for Programmable Data Planes”. In: Proceedings of the 1st ACM SIGCOMM Symposium on Software Defined Networking Research. SOSR ’15. https://doi.org/10.1145/2774993.2775000. Santa Clara, California: Association for Computing Machinery, 2015. isbn: 9781450334518. doi: 10.1145/2774993.2775000. [142] Muhammad Shahbaz et al. “Pisces: A programmable, protocol-independent software switch”. In: Proceedings of the 2016 ACM SIGCOMM Conference. ACM. 2016, pp. 525–538.


[143] Justine Sherry et al. “Rollback-Recovery for Middleboxes”. In: Proceedings of the 2015 ACM Conference on Special Interest Group on Data Commu- nication. SIGCOMM ’15. London, United Kingdom: Association for Com- puting Machinery, 2015, pp. 227–240. isbn: 9781450335423. doi: 10.1145/ 2785956.2787501. url: https://doi.org/10.1145/2785956.2787501. [144] Sumeet Singh et al. “Packet Classification Using Multidimensional Cutting”. In: Proceedings of the 2003 Conference on Applications, Technologies, Ar- chitectures, and Protocols for Computer Communications. SIGCOMM ’03. Karlsruhe, Germany: Association for Computing Machinery, 2003, pp. 213– 224. isbn: 1581137354. doi: 10.1145/863955.863980. url: https://doi. org/10.1145/863955.863980. [145] Giuseppe Siracusano and Roberto Bifulco. “Is It a SmartNIC or a Key- Value Store?: Both!” In: Proceedings of the SIGCOMM Posters and Demos. SIGCOMM Posters and Demos ’17. Los Angeles, CA, USA: ACM, 2017, pp. 138–140. isbn: 978-1-4503-5057-0. doi: 10 .1145 / 3123878. 3132014. url: http://doi.acm.org/10.1145/3123878.3132014. [146] Vibhaalakshmi Sivaraman et al. “Heavy-hitter detection entirely in the data plane”. In: Proceedings of the Symposium on SDN Research. 2017, pp. 164– 176. [147] V. Srinivasan, S. Suri, and G. Varghese. “Packet Classification Using Tu- ple Space Search”. In: Proceedings of the Conference on Applications, Tech- nologies, Architectures, and Protocols for Computer Communication. SIG- COMM ’99. Cambridge, Massachusetts, USA: Association for Computing Machinery, 1999, pp. 135–146. isbn: 1581131356. doi: 10.1145/316188. 316216. url: https://doi.org/10.1145/316188.316216. [148] V. Srinivasan et al. “Fast and Scalable Layer Four Switching”. In: Proceed- ings of the ACM SIGCOMM ’98 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication. SIGCOMM ’98. Vancouver, British Columbia, Canada: Association for Computing Machin- ery, 1998, pp. 191–202. isbn: 1581130031. doi: 10.1145/285237.285282. url: https://doi.org/10.1145/285237.285282. [149] A. Srivastava et al. “A Recent Survey on DDoS Attacks and Defense Mech- anisms”. In: Advances in Parallel Distributed Computing. Ed. by Dhinaha- ran Nagamalai, Eric Renault, and Murugan Dhanuskodi. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011, pp. 570–580. isbn: 978-3-642-24037-9. [150] Alexei Starovoitov. bpf: improve verifier scalability. [Online; last-retrieved 02-April-2019]. Apr. 2019. url: https://web.archive.org/web/20200406105518/ https://patchwork.ozlabs.org/cover/1073775/.


[151] Alexei Starovoitov. net: filter: rework/optimize internal BPF interpreter’s instruction set. In Linux Kernel, commit bd4cf0ed331a. Mar. 2014. url: https : / / web . archive . org / web / 20200406085303 / https : / / lore . kernel.org/patchwork/patch/452162/. [152] Alexei Starovoitov. PATCH BPF-NEXT: Introduce BPF Spinlock. Jan. 2019. url: https://web.archive.org/web/20200625084026/https://lwn. net/ml/netdev/[email protected]/. [153] Radu Stoenescu et al. “SymNet: Scalable Symbolic Execution for Modern Networks”. In: Proceedings of the 2016 ACM SIGCOMM Conference. SIG- COMM ’16. Florianopolis, Brazil: Association for Computing Machinery, 2016, pp. 314–327. isbn: 9781450341936. doi: 10.1145/2934872.2934881. url: https://doi.org/10.1145/2934872.2934881. [154] Nick Tausanovitch. What Makes a NIC a SmartNIC, and Why is it Needed? Sept. 2016. url: https : / / web . archive . org / web / 20190616162342 / https://www.netronome.com/blog/what-makes-a-nic-a-smartnic- and-why-is-it-needed/ (visited on 03/17/2019). [155] David E Taylor and Jonathan S Turner. “Classbench: A packet classifica- tion benchmark”. In: IEEE/ACM transactions on networking 15.3 (2007), pp. 499–511. [156] Maroun Tork, Lina Maudlej, and Mark Silberstein. “Lynx: A SmartNIC- driven Accelerator-centric Architecture for Network Servers”. In: Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems. 2020, pp. 117–131. [157] Cheng-Chun Tu, Joe Stringer, and Justin Pettit. “Building an Extensible Open VSwitch Datapath”. In: SIGOPS Oper. Syst. Rev. 51.1 (Sept. 2017), pp. 72–77. issn: 0163-5980. doi: 10.1145/3139645.3139657. url: https: //doi.org/10.1145/3139645.3139657. [158] William Tu. [iovisor-dev] [PATCH RFC] bpf: add connection tracking helper functions. [Online; last-retrieved 30-March-2019]. Sept. 2017. url: https:// web.archive.org/web/20200406105231/https://lists.linuxfoundation. org/pipermail/iovisor-dev/2017-September/001023.html. [159] Balajee Vamanan, Gwendolyn Voskuilen, and TN Vijaykumar. “EffiCuts: optimizing packet classification for memory and throughput”. In: ACM SIG- COMM Computer Communication Review 41.4 (2011), pp. 207–218. [160] Nic Viljoen. BPF, eBPF, XDP and Bpfilter...What are These Things and What do They Mean for the Enterprise? [Online; last-retrieved 15-November- 2018]. Apr. 2018. url: https://web.archive.org/web/20190829063109/ https://www.netronome.com/blog/bpf-ebpf-xdp-and-bpfilter-what- are-these-things-and-what-do-they-mean-enterprise/. 160 BIBLIOGRAPHY

[161] J. Wallen. An Introduction to Uncomplicated Firewall (UFW). Ed. by Linux.com. [Online; last-retrieved 30-June-2018]. Oct. 2015. url: https://web.archive. org/web/20190603093744/https://www.linux.com/learn/introduction- uncomplicated-firewall-ufw. [162] Jason Wang and David S. Miller. XDP transmission for tuntap. Dec. 2017. url: https://web.archive.org/web/20180315083526/https://lwn. net/Articles/742501/. [163] Fundamentally? What is RCU. McKenney, Paul E. and Walpole, Jonathan. Dec. 2007. url: https://web.archive.org/web/20180125051005/https: //lwn.net/Articles/262464/. [164] Tong Yang et al. “Elastic Sketch: Adaptive and Fast Network-Wide Measure- ments”. In: Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication. SIGCOMM ’18. Budapest, Hungary: Asso- ciation for Computing Machinery, 2018, pp. 561–575. isbn: 9781450355674. doi: 10 . 1145 / 3230543 . 3230544. url: https : / / doi . org / 10 . 1145 / 3230543.3230544. [165] Kenichi Yasukata et al. “StackMap: Low-Latency Networking with the OS Stack and Dedicated NICs”. In: Proceedings of the 2016 USENIX Conference on Usenix Annual Technical Conference. USENIX ATC ’16. Denver, CO, USA: USENIX Association, 2016, pp. 43–56. isbn: 9781931971300. [166] Arseniy Zaostrovnykh et al. “A Formally Verified NAT”. In: Proceedings of the Conference of the ACM Special Interest Group on Data Communication. SIGCOMM ’17. Los Angeles, CA, USA: Association for Computing Ma- chinery, 2017, pp. 141–154. isbn: 9781450346535. doi: 10.1145/3098822. 3098833. url: https://doi.org/10.1145/3098822.3098833. [167] Arseniy Zaostrovnykh et al. “A Formally Verified NAT”. In: Proceedings of the Conference of the ACM Special Interest Group on Data Communication. SIGCOMM ’17. https://doi.org/10.1145/3098822.3098833. Los An- geles, CA, USA: Association for Computing Machinery, 2017, pp. 141–154. isbn: 9781450346535. doi: 10.1145/3098822.3098833. [168] Rui Zhang, Saumya Debray, and Richard T Snodgrass. “Micro-specialization: dynamic code specialization of database management systems”. In: Proceed- ings of the Tenth International Symposium on Code Generation and Opti- mization. 2012, pp. 63–73. [169] Wei Zhang et al. “OpenNetVM: A Platform for High Performance Net- work Service Chains”. In: Proceedings of the 2016 Workshop on Hot Top- ics in Middleboxes and Network Function Virtualization. HotMIddlebox ’16. Florianopolis, Brazil: Association for Computing Machinery, 2016, pp. 26– 31. isbn: 9781450344241. doi: 10.1145/2940147.2940155. url: https: //doi.org/10.1145/2940147.2940155. 161 BIBLIOGRAPHY

[170] Yang Zhang et al. “ParaBox: Exploiting Parallelism for Virtual Network Functions in Service Chaining”. In: Proceedings of the Symposium on SDN Research. SOSR ’17. Santa Clara, CA, USA: Association for Computing Ma- chinery, 2017, pp. 143–149. isbn: 9781450349475. doi: 10.1145/3050220. 3050236. url: https://doi.org/10.1145/3050220.3050236. [171] Huapeng Zhou, Nikita, and Martin Lau. XDP Production Usage: DDoS Pro- tection and L4LB. Apr. 2017. url: https://www.netdevconf.org/2.1/ slides/apr6/zhou-netdev-xdp-2017.pdf (visited on 03/17/2019).
