IMPERIAL COLLEGE LONDON

FINAL YEAR PROJECT

JUNE 14, 2010

Improving Networking

by moving the network stack to userspace

Author: Matthew WHITWORTH

Supervisor: Dr. Naranker DULAY

Abstract

In our modern, networked world the software, protocols and algorithms involved in communication are among some of the most critical parts of an operating system. The core communication software in most modern systems is the network stack, but its basic monolithic design and functioning has remained unchanged for decades. Here we present an adaptable user-space network stack, as an addition to my operating system Whitix. The ideas and concepts presented in this report, however, are applicable to any mainstream operating system. We show how re-imagining the whole architecture of networking in a modern operating system offers numerous benefits for stack-application interactivity, protocol extensibility, and improvements in network throughput and latency.

Acknowledgements

I would like to thank Naranker Dulay for supervising me during the course of this project. His time spent offering constructive feedback about the progress of the project is very much appreciated. I would also like to thank my family and friends for their support, and also anybody who has contributed to Whitix in the past or offered encouragement with the project.

Contents

1 Introduction
  1.1 Motivation
    1.1.1 Adaptability and interactivity
    1.1.2 Multiprocessor systems and locking
    1.1.3 Cache performance
  1.2 Whitix
  1.3 Outline

2 Hardware and the LDL
  2.1 Architectural overview
  2.2 Network drivers
    2.2.1 Driver and device setup
    2.2.2 Sending packets
    2.2.3 Receiving packets
  2.3 Linux Driver Layer
    2.3.1 Caveats
  2.4 Network device manager
  2.5 Hardware address cache
  2.6 Packet I/O
  2.7 Summary

3 Network channels
  3.1 Background
  3.2 Design
    3.2.1 Possibilities
    3.2.2 Tradeoffs
    3.2.3 Architectural overview
  3.3 Channel management
    3.3.1 Setup
    3.3.2 Control
    3.3.3 Organization
    3.3.4 Destruction
  3.4 Memory management
    3.4.1 Memory layout
    3.4.2 Usercode library
  3.5 File emulation
  3.6 Packet classification
    3.6.1 Family matching
    3.6.2 Protocols: Matching packets to channels
  3.7 Routing
    3.7.1
  3.8 Testing
  3.9 Discussion
  3.10 Summary

4 Userspace networking
  4.1 Background
    4.1.1 Microkernel research
  4.2 Overview
  4.3 Network stack
    4.3.1 Architecture
    4.3.2
    4.3.3 Internal layers: channels and IP
    4.3.4 UDP and ICMP sockets
  4.4 TCP
    4.4.1 Design choices
    4.4.2 Possible TCP changes
    4.4.3 Sending packets
    4.4.4 Retransmission
    4.4.5 Receiving packets and the state machine
    4.4.6 Socket polling
  4.5 Utilities
    4.5.1 firewall
    4.5.2 nprof
    4.5.3 dhcp
    4.5.4 dns
    4.5.5 ping
  4.6 Applications
    4.6.1 ftp
    4.6.2 telnet
    4.6.3 httpd
  4.7 Testing
    4.7.1 Specific methods
    4.7.2 Application testing and summary
  4.8 Discussion
  4.9 Summary

5 Dynamic protocols
  5.1 Background
    5.1.1 Adaptable and interactive protocols
    5.1.2 Statistics, profiling and adaptation
    5.1.3 Asynchronous I/O
  5.2 Architecture
  5.3 Statistics and profiling
    5.3.1 Categories
    5.3.2 Use
  5.4 Adaptation
    5.4.1 Current adaptive technologies
    5.4.2 Application hints
    5.4.3 Profiling data
  5.5 Events
    5.5.1 Design
    5.5.2 TCP
    5.5.3 Asynchronous I/O
    5.5.4 Event delivery
  5.6 Discussion
  5.7 Summary

6 Performance
  6.1 Background
  6.2 Method
    6.2.1 Caveats
    6.2.2 Procedure
    6.2.3 Timers
  6.3 Experimental testbed
  6.4 Analysis
    6.4.1 Send
    6.4.2 Receive
  6.5 Summary

7 Evaluation
  7.1 Flexibility and adaptability
  7.2 Correctness and stability
    7.2.1 Comparisons
  7.3 Functionality and usability
  7.4 Scalability
  7.5 Summary

8 Conclusion
  8.1 Scale and schedule
  8.2 Future work
    8.2.1 Merging and message queues
    8.2.2 More protocol suites
    8.2.3 Stateful firewall
    8.2.4 Remote network stack management
  8.3 Final comments

9 Bibliography

Appendices

A Network statistics file format
  A.1
  A.2 Port entries
  A.3 Host entries

B System calls
  B.1 SysChannelCreate
  B.2 SysChannelControl

C Usercode library functions

Chapter 1 Introduction

In this report, I present an adaptable, interactive TCP/IP user-space network stack as an addition to my operating system Whitix.1 As well as decoupling the policy of producing and processing packets from the mechanism of sending and receiving them, it scales and adapts better than existing solutions and adheres more strongly to the end-to-end principle of the Internet. First of all, I will list the current problems and limitations in the field of network stacks, before proposing my solution in detail, describing the original contributions that I will make, along with a list of objectives that must be achieved to satisfy the project goals.

1.1 Motivation

In our modern, networked world the software, protocols and algorithms involved in communication are among some of the most critical parts of an operating system. The core communication software in most modern systems is the network stack, but its basic monolithic design and functionality appears to have remained unchanged since the 1970s. With the progress in processor technology aiming towards multi-core and many-core systems,[38] there is a need for a new network stack design to replace the current monolithic stacks of today with (what I believe to be) their restricted legacy designs. First of all, I shall summarize the problems inherent in current implementations.

1.1.1 Adaptability and interactivity

The first major problem is the lack of adaptability in, and feedback from, the network stack in general. Although certain transport protocols like TCP can adapt their data transmission algorithms (through mechanisms such as window scaling and congestion control) to suit the circumstances, this useful information, which

1Whitix is a 32-bit Unix-like operating system that I wrote from scratch. It is detailed in Section 1.2.

includes estimates about the round-trip time and other knowledge about the network, is lost once the socket is closed or the application exits – most of these estimates would be very useful in future connections.[5] Applications receive no feedback on the state of the network from the lower layers, and can unknowingly flood a congested network with packets. Applications also vary in their usage patterns; an SSH session consisting mainly of single keystrokes needs a different strategy for data transmission compared to a large file transferred over an HTTP session – the exact use of TCP and network bandwidth varies by protocol.[9]

Wouldn't it be useful for the network stack to offer feedback to the application, so that in times of network congestion the application could adapt accordingly? This would allow the application to be more efficient with bandwidth and offer a better user experience in all network conditions. Such a symbiotic relationship between the application and the network could minimize delays and network congestion and reduce temporal disturbance (i.e. "buffering") to the user. If we move the network stack into userspace, we can let the application know when to throttle TCP traffic. This would greatly help with network congestion; in current implementations, there is no way of notifying interested stakeholders about network congestion involving a particular host.

Another problem with current network stacks is that they treat their incoming data as a stream of bytes, leaving error checking, protocol verification and other networking-related issues to the application. Just as some applications have predictable network usage patterns, some applications have predictable network protocols. For instance, every HTTP session over a TCP connection comprises messages with a defined request and response format. If we moved the network stack to userspace, we could place high-level network functions like validation and verification at that level, as a common userspace library. By letting the network stack verify packets in a generic way, we could produce much more secure, smaller and safer code.

1.1.2 Multiprocessor systems and locking

The second problem is the ever-more multi-core design of today's systems. The typical design of the network stack is running into a number of problems as systems move towards a many-core design; a design where the number of cores is large enough that traditional multiprocessor paradigms are no longer efficient.[64] For example, passing a packet through different layers of the network stack often involves switching between different CPUs in the system, although network stack developers have improved CPU locality recently. To illustrate this, a typical path in the Linux kernel from receiving a packet to distributing it to the application is:[29]

1. Interrupt. The Ethernet card signals an interrupt, which is handled by an arbitrary CPU. The interrupt routine is run, which typically schedules a soft interrupt (softirq) to handle the packet.

2. Device processing (optional). The remainder of the device driver work usually takes place here. The routine fetches the packet from the


card via direct memory access (DMA) or programmed I/O (PIO) into RAM. A socket buffer is then allocated for the data, added to a work-queue list, and then the receive soft-interrupt is marked for execution.

3. Packet processing. Most of the network stack processing occurs here. The packet itself is checked for integrity (such as a valid IP checksum), then, if possible, it is processed as part of a UDP session or TCP connection. Typically the packet data is copied to a per-socket list, ready for the application to process it.

4. Application. At this point the application which owns the packet's socket has made a read or recv system call to retrieve the new data. The data is copied from the per-socket list into the user-space buffer, and the application returns to user-space to process the data.

Although Linux’s packet processing has in fact been optimized for today’s processors (typically, the first three stages run on the same CPU as a result of assigning a work-queue per processor), there are still several issues present in most monolithic network stacks (which are similar in design to that of Linux):

• Lack of CPU locality. There is still a chance that the application will be running on a different CPU from the one that processed the packet. This incurs a large cache penalty, as every memory access on the new processor will involve a trip to lower-level caches or main memory.

• CPU usage. Since the processor that handles the initial interrupt is chosen arbitrarily, the processing of the network packet could be assigned to a CPU that is currently running other tasks. For example, in the worst case, the window manager could be bound to a certain CPU, and if the processing of incoming packets occurs on that CPU, there will be issues with data throughput and latency from the network, especially if the window manager is busy rendering graphics at the time.

The way to address these issues is to optimize the workflow as much as possible, by processing the packet in the hardware interrupt and saving as much work as possible for the application's context, while preferably eliminating the middle two steps in Linux's path. This saves multiple context changes and reduces CPU usage; the lack of CPU locality is unavoidable if we allow a process to run on a range of processors. In traditional network stacks, and in some systems where a kernel-space network spinlock2 exists, the tendency to frequently take locks moves a cache line between all processors and requires a cross-system atomic operation.[2] Both are expensive, and certainly do not scale well for an increasing number of processors. Therefore, locking should be avoided as much as possible, especially in the network stack.3

2A spinlock is a synchronization primitive that provides mutual exclusion to a critical section, including code running in the network stack (for example, exclusive access to a particular network interface).
3Note that the Whitix kernel does not yet support SMP, but this does not prevent us from designing a multiprocessor-friendly network stack.


1.1.3 Cache performance

The third and final problem, also linked to today's multi-core systems and locking, is the cache performance of code. In today's systems, loading a cache line from memory often takes 50 nanoseconds or more to complete.[25] As a result, when considering the performance of kernel code, and therefore of today's network stack, cache performance is often the dominant factor.[29] Linked to this, particularly in the Linux kernel, is the extensive use of linked lists to queue packets, which generally have poor cache performance and often lead to cache thrashing, especially in multi-core systems.

Although modern kernels have attempted to tackle the above problems, mostly by employing synchronization algorithms that avoid locking, such as read-copy-update,[2] I would argue that the way to make the network stack scalable, flexible and multi-core-friendly is to perform as much work as possible in user-space. This avoids many of the situations where data may need to be shared, and so it is unlikely that a cross-processor lock will have to be taken.

Another aspect of cache performance that relates to the network stack is the copying of packets. Large memory-to-memory copies are notoriously cache-unfriendly, as the same data is duplicated in the L1 data cache, evicting useful data that may be used following the copy. After a copy operation, typically only one of the copies is used again before it is evicted from the cache.[15] Therefore, to increase cache performance, and thereby speed up the networking code, copying should be avoided where possible. A zero-copy send operation, for example, would be ideal for performance and memory usage.

1.2 Whitix

My userspace network stack has been developed for my operating system Whitix. Whitix is a 32-bit Unix-inspired operating system based around a monolithic but modular kernel. I have developed it from scratch over a number of years, and it is not forked from any other operating system. Unless noted otherwise, the Whitix design can be assumed to be similar to that of Linux. The only networking that was present at the start of the project was a small local socket implementation, developed for the operating system’s windowing environment. This is likely to be made obsolete following the project, and replaced with a form of local network channels. The key architectural points of difference are:

• System configuration. The Whitix kernel and its drivers do not use ioctl. Instead, the Information and Configuration Filesystem (IcFs) is a hierarchical tree for system configuration, broader in scope than sysfs in Linux and intended as a complete replacement for ioctl. Devices can expose a set of name-value pairs that allow userspace applications to read and write configuration values via the filesystem or by using special system calls such as SysConfRead and SysConfWrite.

• Processes and threads. In Whitix, new processes are created via SysCreateProcess – there is no fork equivalent. Threads, using the kernel threads implementation, can be created by calling SysCreateThread, suspended and restarted using SysSuspendThread and SysResumeThread, and destroyed by calling SysExitThread (a brief sketch of this interface follows below).
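The sketch below illustrates this thread interface; only the call names above come from Whitix itself, while the signatures, argument conventions and return values shown here are assumptions made for the example.

/* Hypothetical sketch of the Whitix thread calls named above. The
 * signatures and return conventions are assumed for illustration. */

static int WorkerEntry(void *argument)
{
    /* ... perform some background work ... */
    SysExitThread(0);       /* destroy the calling thread */
    return 0;               /* never reached */
}

int StartWorker(void)
{
    /* Assumed signature: entry point plus an opaque argument, returning
     * a thread identifier, negative on error. */
    int thread = SysCreateThread(WorkerEntry, NULL);

    if (thread < 0)
        return thread;      /* creation failed */

    SysSuspendThread(thread);   /* the new thread can be paused ...   */
    SysResumeThread(thread);    /* ... and restarted by other threads */
    return 0;
}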

1.3 Outline

To solve the above problems and to add increased performance and flexibility to networked applications, I have developed a novel networking subsystem (applicable to almost any operating system) that places most of the functionality into user-space, with a small zero-copy array of buffers linking user-space and kernel-space. The components of this new network stack are:

1. Network drivers. We present the Linux Driver Layer, a framework for porting Linux network drivers and providing a compatible runtime interface, and a set of components for managing packet I/O and network interfaces. (Chapter 2)

2. Network channels. Building on top of the Linux Driver Layer, we present network channels, high-performance packet arrays designed for zero-copy send and receive. We detail an innovative, fully-featured layer that is much more complete than competing research implementations. We chronicle the development behind network channels, and explain their superior cache performance and scalability as well as their memory layout. We describe the small protocol-specific kernel components for IPv4, and the framework for classifying incoming packets. The process for sending and receiving packets is outlined – we show that the in-kernel additions, even including a packet filter, remain small, stable and fast. We detail the usercode library, a flexible mechanism that abstracts the internal structures of a channel for flexibility and stability. (Chapter 3)

3. Userspace network stack. We demonstrate that a modern userspace TCP/IP stack can be built as a shared library on top of network channels for increased security and performance, with a special focus on the Transmission Control Protocol (TCP) as the most complex part of the stack. We investigate the userspace interface for network channels, and how the TCP/IP stack is built upon this. We detail the fully functional implementations of transport protocols such as UDP, ICMP and, the most complex of all, TCP, along with their accompanying applications and utilities – many of which use the innovative adaptability and interactivity features of the network stack. We detail the two APIs available for the stack, the native and POSIX APIs, and how the former exposes the new functionality available in the stack. (Chapter 4)

4. Network stack interactivity. We will then explore the concept of events in a TCP/IP connection, focusing especially on TCP events. We then detail an innovative method of exposing these events to an application through the use


of callback functions, a finer-grained primitive compared to other implementations, which use whole processes to handle events. We then detail a framework for user applications to monitor, filter and respond to these events. The ability to establish a flow of events and hints, through the use of callbacks and helper functions, between the application and the network stack breaks new ground compared to other monolithic kernel-based stacks. We describe the myriad uses of events in providing applications with performance and diagnostic information about network conditions, as well as the ability to profile network usage and feed the data into future runs to optimize performance. (Chapter 5)

Chapter 2 Hardware and the LDL

The basis of any networking software is the exchange of data over physical or virtual network interfaces. If we are to transmit or receive data and pass it to the network stack, we will require a layer for network drivers, and a framework to manage the raw packets involved. In this chapter, we demonstrate how we have transferred support for a range of Ethernet cards from Linux, by developing the Linux Driver Layer. We explain the caveats of supporting Linux drivers through a binary interface, and how the similarity of the Linux and Whitix driver models allows us to use Linux drivers to transmit and receive packets on almost all Ethernet cards.

The low-level networking layer provides one main abstraction: the network interface. The network interface provides an extra layer around a network card device for sending and receiving data packets, assigning network and hardware addresses via management functions, and for referencing in routing tables (by a human-readable name). There are two classes of network interfaces found on Whitix. The first is the loopback interface, a virtual interface implemented in software only, which is always available as Loopback0. Any packets sent to the interface are immediately received on the interface. The second is the hardware Ethernet interfaces, where one physical interface corresponds to an Ethernet card. The Ethernet interfaces are named Ethernet0, Ethernet1 and so on, although most desktop machines only have one such card.

2.1 Architectural overview

There are four main components of the low-level networking layer, as detailed in Figure 2.1. All are placed in the kernel for reasons of speed and direct hardware access (and the idea of placing network drivers in user-space is evaluated and discarded in Section 3.2.1). The network drivers are the main abstraction of the layer; they allow the other components to interact with the device via management functions, as well as providing a single interface for sending and receiving packets

on the device. One other key driver component is the Linux Driver Layer, which is explained in detail in Section 2.3 and allows standard Linux network drivers to run as part of the Whitix kernel.

The network device manager manages the different physical and virtual network cards, assigning them a human-readable name and exposing them to the Information and Configuration Filesystem (IcFs). The manager registers a standard set of values for configuration purposes: each device has a network address, used by higher-level routing code, and, for physical interfaces, a hardware address. (For example, the MAC address of the first Ethernet card in the system is available at /System/Devices/Network/Ethernet0/macAddress as a file of six bytes.) These values are used by network utilities to configure the system appropriately.

For physical network interfaces, which actually transmit packets to other immediate hosts based on their hardware address, a global hardware address cache is present in the kernel. This is mainly used by the routing code, which queries the cache for a translation from network to hardware address (in that respect, it functions like a translation look-aside buffer). To satisfy these requests, this particular implementation of the hardware address cache searches the generic cache, and, if a suitable entry is not found, issues Address Resolution Protocol (ARP) requests to translate IP addresses to Ethernet MAC addresses. However, the caching code is written to function with any combination of request packet type, network address format and hardware address class.

The last component of the layer is the packet I/O layer, which interfaces with the high-level layers to provide an abstraction for sending a single network packet, as well as asynchronously notifying upper layers of a packet receive event. These packets are treated by the layer as a stream of bytes; hardware addressing and packet construction take place at the network channel layer, with the help of device-class-specific functions for retrieving hardware addresses. A per-card queue for outgoing packets is provided, and used if the network card is busy transmitting a previously queued packet.

2.2 Network drivers

The driver architecture of Whitix is inspired by that of the Linux kernel, although with a stricter attitude to modularization; drivers are distributed as modules, and there are very few device drivers built into the kernel itself. The Whitix kernel is a monolithic kernel, with a range of subsystems available to drivers, such as timers, scheduling and bus subsystems. Device drivers use functions provided by the kernel and associated modules. The different subsystems of the Whitix kernel, such as the bus components (PCI and USB, for example), make functions used by drivers available via the SYMBOL_EXPORT macro (e.g. SYMBOL_EXPORT(TimerAdd), from the part of the kernel that provides single-shot timers for use by other components).
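As an illustration of this export mechanism, a subsystem might make a function available to driver modules as shown below; the SYMBOL_EXPORT(TimerAdd) usage is taken from the text, while the TimerAdd signature and the struct Timer type are assumed.

/* Sketch: the timer subsystem exporting its single-shot timer helper to
 * modules. Only SYMBOL_EXPORT(TimerAdd) is taken from the text; the
 * signature and struct Timer are assumed. */

int TimerAdd(struct Timer *timer, unsigned long expiryMs)
{
    /* ... insert the single-shot timer into the kernel's timer queue ... */
    return 0;
}

SYMBOL_EXPORT(TimerAdd);

A network driver module that links against the kernel can then call TimerAdd directly once its own symbols have been resolved at load time.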


Figure 2.1: The low-level networking layer comprises the network drivers, the network device manager, the hardware address cache and the packet I/O layer. It interfaces between the networking hardware and the network channel layer to abstract the class-specific and hardware-specific functions of each device to allow packets to be transmitted and received in a device-transparent manner.


struct PciDeviceId pcnetIdTable[] =
{
    { 0x1022, 0x2000, PCI_ID_ANY, PCI_ID_ANY, 0, 0, NULL },
    { 0x1022, 0x2001, PCI_ID_ANY, PCI_ID_ANY, 0, 0, NULL },
    PciTableEnd(),
};

Figure 2.2: An example of the list of devices supported by the Whitix pcnet32 driver (not used in this project). The first two fields are the vendor ID and device ID, the two defining fields that identify a device.

2.2.1 Driver and device setup

Whitix drivers, denoted by the .sys suffix, are loaded via the SysModuleAdd system call, and the module is then linked into the kernel by resolving its symbols. The module's list of symbols is then in turn made available to other modules. Network drivers mainly interact with the bus manager and network device manager layers. Ethernet devices are typically found on the Peripheral Component Interconnect (PCI) bus, and drivers wishing to find devices register their list of supported devices (provided as a filter on the different fields of the device's PCI configuration space), as in Figure 2.2, along with a function to be called when a suitable PCI device is connected. Drivers (and modules) remain in memory until explicitly unloaded via SysModuleRemove. On being assigned a device, the driver generally performs the following setup procedure (this example is taken from the Ne2kInitOne function in the Whitix ne2k-pci driver, available at devices/net/ne2k-pci.c in the source tree):

1. PCI device and resource setup. First of all, the device is enabled, with any relevant PCI settings (such as DMA bus mastering) performed at this point. The driver then locates the device resources, such as interrupt lines (a function is assigned to handle the interrupt), I/O ports and memory regions. The NE2000 driver locates the base address for its physical registers in the x86 I/O space, which it then accesses via the inb and outb machine instructions.

2. Hardware setup. The driver then performs a device-specific setup proce- dure. Note that the device does not have to be ready for packet sending and receiving immediately after setup; this can be delayed until the interface is brought up. This may also involve retrieving the hardware address; in the NE2000 driver, the Ethernet MAC address is read from the programmable ROM.

3. Network device registration. This is the part of the setup procedure that is of interest to the network device manager. The exact registration process, along with the information the driver supplies about the device, is detailed in Section 2.4. The driver also registers a structure containing device operations, essentially its interface to the higher layers, with operations specific to


struct NetDevOps
{
    ...
    int (*buildHeader)(struct NetDevice *device, struct NetBuffer *sockBuff,
                       void *address);
    int (*send)(struct NetDevice *device, struct NetBuffer *sockBuff);
    ...
};

Figure 2.3: A subset of the operations available in the NetDevOps structure, which describes the device-specific operations for a device. send is the packet transmission function assigned by the driver, and buildHeader is used by higher-level code to build a class-specific hardware header. For example, EthDevRegister assigns EthBuildHeader if no device-specific function is supplied.

the device (Send for different devices) and its class (GetHeaderLen for Ethernet cards), with the help of class-specific functions like EthDevRegister (a registration sketch follows below).
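A minimal sketch of this registration step for an Ethernet driver is shown below. The NetDevOps structure, its send and buildHeader operations and EthDevRegister come from the text and Figure 2.3; the exact parameters of EthDevRegister and the surrounding driver code are assumptions.

/* Sketch of step 3 for an Ethernet driver. NetDevOps, send, buildHeader and
 * EthDevRegister are named in the text; the EthDevRegister parameters and
 * the NE2000-style details are assumed for illustration. */

static int Ne2kSend(struct NetDevice *device, struct NetBuffer *sockBuff)
{
    /* Program the card's transmit registers with the buffer contents. */
    return 0;
}

static struct NetDevOps ne2kOps =
{
    .send        = Ne2kSend,
    .buildHeader = NULL,   /* fall back to the class default, EthBuildHeader */
};

static int Ne2kRegisterDevice(struct NetDevice *device)
{
    /* EthDevRegister fills in the class-specific fields (name prefix,
     * hardware addressing callbacks, packet sizes). Parameters assumed. */
    return EthDevRegister(device, &ne2kOps);
}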

After the device is registered by the driver, the device stays idle until the network interface is brought up by dhcp, essentially allowing it to transmit and receive and function as a "live" interface. Bringing up an interface, in terms of the network driver, involves awakening the device and preparing it for transmitting and receiving packets. Bringing down an interface is the opposite process.

2.2.2 Sending packets

Packet transmission is a synchronous operation from the viewpoint of higher layers. It is an operation described in more depth in later sections and carried out through multiple layers of the networking code. From the driver's viewpoint, the only send-related operation that needs to be implemented is the send operation (as described in Figure 2.3), which, along with the device, takes a single packet (available as a NetSendBuffer) and transmits it in a device-specific manner. The network device may also signal to the higher layers that it is busy transmitting the packet as a result; the packet I/O code can respond by queuing packets, and once the network device is no longer busy, can resume packet transmission.

2.2.3 Receiving packets

In contrast, receiving a packet is an asynchronous operation. In most cases, the network card will raise a receive interrupt, and the driver's interrupt code will be called as soon as possible to handle it. The driver typically allocates a temporary network buffer (the NetBuffer structure) to transfer the packet from the card, fills in key fields in the buffer structure, and passes it up to the packet I/O code by calling the NetRecv function, which is part of the channel layer. Ignoring device-specific code, the receive code is simple and does not inspect the content of the packet.
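In a driver, the receive path just described might look roughly like the sketch below. NetBuffer and NetRecv are named in the text; the allocation helper, the device-specific read routines and the field names are assumptions.

/* Sketch of a driver receive interrupt. NetBuffer and NetRecv come from the
 * text; NetBufferAlloc, the Ne2k* helpers and the field names are assumed. */

static void Ne2kInterrupt(struct NetDevice *device)
{
    int length = Ne2kReadPacketLength(device);          /* device-specific */
    struct NetBuffer *buffer = NetBufferAlloc(length);  /* assumed allocator */

    if (!buffer)
        return;                  /* drop the packet if memory is short */

    /* Copy the frame from the card into the buffer (DMA or PIO). */
    Ne2kReadPacket(device, buffer->data, length);

    buffer->device = device;     /* record where the packet arrived */
    buffer->length = length;

    NetRecv(buffer);             /* hand the packet to the channel layer */
}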


Due to the nature of receiving packets, there may be a queue of received packets on the network card itself. This is invisible to the driver and the rest of the networking subsystem, and is usually handled by the device raising a series of interrupts to handle each packet, or noting to the driver via a device register that multiple packets need to be received in one interrupt.

2.3 Linux Driver Layer

Developing a driver for network hardware that works effectively and efficiently on all models is a time-consuming and difficult task. Writing network drivers for Whitix, a laborious task that would replicate almost exactly previously written drivers, would take many hours of development time away from the more innovative aspects of the project, even if I covered only the set of machines on which I will test the project code. I therefore decided to look at porting network drivers from another operating system, such as Linux or one of the BSDs. I chose to port Linux drivers, found at drivers/net in recent Linux source trees, instead of the BSD network drivers, due to licensing compatibility concerns and the extent to which the kernel would be a "derived work" of the drivers.1

It became clear I would have to write a compatibility layer, named the Linux Driver Layer (LDL), to emulate the functions provided by the different subsystems of the Linux kernel. However, the similarity in the driver architecture of the two operating systems, as noted in Section 2.2, was a bonus; in particular, the network device and bus functions of the two kernels are very similar. This is fortunate: performance could have been degraded if major translation of structures and data had been needed between the Linux layer and the corresponding native Whitix interfaces. Although the concept of Linux driver compatibility is extensible to virtually all types of device driver, I focused only on the functions and subsystems needed to run certain drivers.

Device setup, interrupt handling and transmission work in a very similar fashion to the native network drivers; the only difference is that the Linux driver believes it is running on a Linux kernel, and the emulation of the Linux subsystems, including the translation of structures in the layer, is sufficient for this to hold. The LDL code itself is located in devices/linux/ in the source tree, and is organized approximately by subsystem type.
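To make the compatibility layer concrete, one possible shim is sketched below: Linux drivers hand received frames to the stack by calling netif_rx with a struct sk_buff, so the LDL must map that call onto the native NetRecv path. The helper shown and the exact mapping are assumptions rather than the actual LDL code.

/* Illustrative LDL shim, not the actual Whitix code: a Linux driver hands
 * a received frame to the stack by calling netif_rx() with a struct sk_buff.
 * The shim copies the frame into a native NetBuffer and forwards it to
 * NetRecv(). LdlSkbToNetBuffer is an assumed helper. */

int netif_rx(struct sk_buff *skb)
{
    /* Assumed helper: allocates a NetBuffer, copies skb->data (skb->len
     * bytes) into it and records the receiving interface. */
    struct NetBuffer *buffer = LdlSkbToNetBuffer(skb);

    if (!buffer) {
        dev_kfree_skb(skb);
        return 1;            /* corresponds to Linux's NET_RX_DROP */
    }

    NetRecv(buffer);         /* hand the packet to the native channel layer */
    dev_kfree_skb(skb);      /* the Linux-side buffer is no longer needed */
    return 0;                /* corresponds to NET_RX_SUCCESS */
}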

2.3.1 Caveats

However, one major problem I foresaw involved the stability of the Linux kernel driver interface. Unlike the (binary) kernel to user-space interface of Linux, which is very stable, the internal driver interface at source-level changes frequently from version to version, and at binary-level, because of the wide range of configuration

1The Whitix kernel is licensed under the GNU General Public License (GPL), whereas the network drivers from the BSD kernels would be licensed under the BSD license.

options for the Linux kernel,2 changes from machine to machine. This is suitable for the Linux kernel developers, who can update kernel drivers in the development tree, but awkward for closed-source driver developers who develop outside of the kernel tree; they must adapt for each version of the software.[34]

As a result, and to simplify the implementation, I have selected the kernel modules supplied with my Ubuntu Linux distribution. These are built for the same kernel version (2.6.28.11) and with the same configuration options. Therefore, the Linux Driver Layer only supports modules built against that kernel version. Any new devices and drivers that are released will mean the LDL needs to be updated to support a new kernel version. This is the disadvantage of an unstable kernel binary interface and driver API, but the time saved on my part outweighs any flexibility cost involved.

2.4 Network device manager

To enable efficient routing of packets and interface management, the different network devices are managed by the network device manager. The manager represents devices as interfaces to higher-level code; the extra abstraction allows the storing of statistics and the status of the interface, as well as grouping devices by hardware class (Ethernet and virtual devices are just two examples of device classes), with a common set of IcFs configuration name-value pairs and a human-readable name to describe each interface.

The network drivers register the device by calling NetDevRegister (or a wrapper function), as described in Section 2.2, using a structure that contains the attributes used to describe the new interface. An Ethernet card typically calls the EthDevRegister function, which fills in the fields specific to the hardware class, such as the device name prefix, the hardware addressing callbacks and the packet sizes (a similar process would occur for IEEE 802.11 devices3 when support for them is implemented).

To take the packet size as an example, 10-megabit Ethernet has a minimum packet size of 64 bytes and a maximum size of 1500 bytes. This information is important when allocating buffers for network channels; the maximum size of a buffer corresponds to the maximum transmission unit (MTU) of the underlying device. For loopback interfaces, the maximum packet size is arbitrary (there are no hardware limitations), but it is generally limited to under the page size of the machine, mainly to ease memory management in the network channel layer.

The class-specific registration function calls NetDevRegister, which attaches the device to the internal device tree and registers generic configuration variables

2For example, kernel data structures will contain different field alignments, depending on the C compiler and its particular version. An SMP or non-SMP kernel can be built, which means some fields may not be present in memory at all for a non-SMP build.
3Also known as WiFi devices, they use MAC addresses like Ethernet devices, but transmit and receive over a radio-based transmission medium and have different packet header requirements from Ethernet devices.

23 2.5. HARDWARE ADDRESS CACHE CHAPTER 2. HARDWARE AND THE LDL

/System/Config/Devices/Network/> ls
Loopback0/  Ethernet0/
/System/Config/Devices/Network/> ls Ethernet0
running  macAddress  netAddress
/System/Config/Devices/Network/> ls Loopback0
running

Figure 2.4: A list of basic information and configuration variables that can be read and written by applications and utilities. A fuller implementation of device management would expose more statistics, internal hardware information and configuration options, similar to class-specific programs such as ethtool or iwconfig on Linux.

with IcFs: running is a single-byte value that, when written to, calls NetDeviceSetRunning. The written value, either 0 or 1, brings the device down or up respectively. The class-specific registration function also adds suitable IcFs name-value pairs; EthDevRegister adds two entries for addresses: macAddress and netAddress (for IPv4). These configuration values are used by network utilities such as dhcp to construct raw packets and register any new network addresses. The configuration directory structure for a test machine is shown in Figure 2.4.

As the network device manager has a list of all the interfaces in the system, it also provides some functions to search the list or aggregate information over the set of interfaces. NetDeviceFind finds a NetDevice, given a human-readable name; this is useful for applications using network channels that want to bind to a particular interface, with dhcp again a good example.4 Functions that aggregate information over the set include NetDevFindMinMtu and NetDevFindMaxHeader, used for determining the maximum packet size and header size for channels (and their list of buffers) whose packets could be sent out over any interface; one example is a UDP channel, whose destination address could be the local host (in which case the packet is sent over the Loopback interface) or any other host on the Internet (in which case the Ethernet interface is used). The two interfaces have different minimum and maximum packet sizes.
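For example, a utility can bring Ethernet0 up simply by writing the byte 1 to its running entry. The sketch below assumes the IcFs tree is reachable through ordinary file calls from user-space; SysWrite appears later in this report, while SysOpen and SysClose are assumed names.

/* Sketch: bringing Ethernet0 up by writing 1 to its running entry, which
 * causes the kernel to call NetDeviceSetRunning. SysOpen and SysClose are
 * assumed; only SysWrite is named elsewhere in this report. */

int InterfaceUp(void)
{
    char value = 1;    /* 1 brings the device up, 0 brings it down */
    int fd = SysOpen("/System/Config/Devices/Network/Ethernet0/running", 0);

    if (fd < 0)
        return fd;     /* no such interface, or IcFs unavailable */

    SysWrite(fd, &value, 1);
    SysClose(fd);
    return 0;
}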

2.5 Hardware address cache

The low-level networking layer deals with two classes of addresses: the network address (e.g. IPv4 addresses like 143.23.2.43) and the corresponding hardware address that the packet should be routed to. For example, when building the Ethernet header, we need to resolve the destination network address to the destination hardware address (which may or may not correspond to the actual destination's

4dhcp is a rather unusual system utility: it assigns an IPv4 address to a device, and so, because it transmits and receives over a network interface that does not yet have an IPv4 address, it has special routing and configuration needs that no other program requires.


int ArpGetLinkAddress(struct NetDevice *device, ulong sourceAddr,
                      ulong destAddr, char *address);

Figure 2.5: Definition of the ArpGetLinkAddress function. It is called by higher layers to translate destAddr to a hardware address, which is then stored in address.

network address – this depends on the routing table). The hardware address cache assists hardware address resolution by providing a framework for translating between the two classes of address, formatting address request packets using the Address Resolution Protocol (ARP) and parsing the replies. The cache also contains a mechanism for discarding old or out-of-date entries, and is optimized for random read accesses.

Currently, the hardware address cache only supports resolving IPv4 network addresses to Ethernet MAC hardware addresses. (There is no support for a reverse lookup, as very few network-related operations need such a lookup.) The central function that the cache provides to the higher layers (most calls come from the routing layer) is ArpGetLinkAddress, which takes the parameters described in Figure 2.5.

The ArpGetLinkAddress function first looks up destAddr in the cache. If it is not found, or is out of date, the function constructs an ARP request packet, which broadcasts a message over device asking "Who has destAddr? Tell sourceAddr."5 The calling thread then waits for a reply to the packet; ARP replies and requests from other machines are handled separately in a dedicated kernel thread, with waiting threads being notified asynchronously of success or failure. The returned hardware address is then copied to the buffer at address, and the function returns successfully. If there is no hardware address corresponding to the network address, because the host does not exist on the network or there is no ARP reply, the cache resends the ARP request several times before returning an error to the caller, which usually returns an -ENOROUTE value to userspace to indicate this.

Since, in a rapidly changing network, the mappings between network addresses and hardware addresses can change frequently, the ARP cache always checks whether a stored entry is out of date. Incoming ARP packets also update the cache; ARP announcements, optionally sent by hosts when their IP or MAC address changes, always update the cache, and we can store the hardware address of another machine that sends an ARP request packet to us.
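A typical call from the routing layer might look like the sketch below. The ArpGetLinkAddress signature comes from Figure 2.5; the surrounding routing code, the six-byte buffer convention and the error handling are illustrative.

/* Sketch: the routing layer resolving a next-hop IPv4 address to an Ethernet
 * MAC address before the hardware header is built. Only the signature of
 * ArpGetLinkAddress is taken from Figure 2.5; the rest is illustrative. */

int ResolveNextHop(struct NetDevice *device, ulong sourceAddr,
                   ulong nextHopAddr, char hwAddress[6])
{
    int err = ArpGetLinkAddress(device, sourceAddr, nextHopAddr, hwAddress);

    if (err < 0)
        return err;    /* e.g. -ENOROUTE: no ARP reply, host unreachable */

    /* hwAddress now holds the six-byte destination MAC address and can be
     * passed to the device's buildHeader operation. */
    return 0;
}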

2.6 Packet I/O

The main abstraction that the low-level code provides is a means for sending and receiving packets over a selected network interface. The packet I/O layer includes the generic NetDeviceSend function, per-device queues for outgoing packets, and

5This is how the Wireshark packet sniffer describes ARP requests in its short packet summary.

the NetBuffer and NetSendBuffer structures, which abstract incoming and outgoing buffers respectively. These buffers can originate from either user-space or kernel-space (e.g. ECHO REPLY messages), and the structures store context (depending on whether the packet is incoming or outgoing) such as the length, the start of the data and the current read pointer.

The NetBuffer and NetSendBuffer structures are the most important abstraction of the packet I/O layer; it may at first appear, taking incoming packets represented by NetBuffer as an example, that only the memory address of the start of the packet and its length need to be passed through the different networking layers of the kernel. However, storing the device on which the network packet was received becomes important when selecting a channel type to distribute the packet to in NetRecv in the channel layer.

The NetSendBuffer structures are statically allocated by the channel. One NetSendBuffer maps to one send buffer in the channel memory, and stores context information about the channel buffer while it is passing through the kernel.
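Pulling these fields together, an incoming-packet structure along the following lines would carry everything the later layers need. This layout is inferred from the description above and is not the actual Whitix definition.

/* Hypothetical layout of NetBuffer, inferred from the surrounding text;
 * the real Whitix definition may differ. */

struct NetBuffer
{
    struct NetDevice *device;    /* interface the packet was received on    */
    void             *data;      /* start of the packet data                */
    void             *readPtr;   /* current read pointer used while parsing */
    int               length;    /* total length of the packet in bytes     */
};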

2.7 Summary

In this chapter, we covered the extent of network device support in the kernel. Because of the comprehensive Linux Driver Layer, which provides a translation layer between Ethernet drivers for Linux and the native network device API, we can support virtually every Ethernet card available on the market. We detailed the paths taken to transmit and receive packets in a lightweight fashion in the low levels of the stack, and how we queue packets and represent them in the system.

We then described the network device manager, and how it represents the network devices in the system as interfaces. We showed how information about the interface and its class, such as the MTU and the size of the hardware header, is used by the higher layers to configure the memory layout for channels. The hardware address cache helps the network stack translate the network addresses of higher layers down to the hardware addresses each interface needs to transmit packets. Finally, we summarized the structures and functions comprising the packet I/O layer in the kernel.

Chapter 3 Network channels

In this chapter, we discuss the role of network channels in the user-space network stack architecture. The network channel, with its shared list of buffers, allows zero-copy sending and receiving of packets via a variety of network interfaces, with extremely fast buffer allocation and deallocation. This project's implementation of network channels adapts to different classes of connection paradigms seamlessly and with minimal configuration by user-space libraries and applications, and shared data structures are encapsulated using a special user-level library provided by the kernel.

Throughout this chapter, we discuss the design decisions involved and the tradeoffs inherent in any implementation, and give a detailed survey of the project's implementation, noting its operation and memory layout. We conclude with an overview of the API provided by the network channel libraries, and several use cases by applications and libraries.

3.1 Background

Van Jacobson, in his 2006 talk,[29] after discussing the early history of network stacks and how they settled upon the Unix "standard model" of design, shows how locking, synchronization, multiple software interrupts and poor cache behavior conspire to reduce the scalability and performance of the current design of network stacks. He also notes that a network connection can be abstracted as a servo-loop,1 whereas a kernel-based implementation converts this to two coupled loops. After noting that a "very general theorem" (Routh-Hurwitz) says that two coupled loops are less stable than one, he also says that the kernel loop hides the application dynamics from the sender, affecting socket time estimates and causing spurious retransmissions. He states that the existence of two loops in most implementations has more than

1A servo-loop is a self-regulating feedback system or mechanism.

tripled the size of TCP, added the term "window" to distinguish between the arrival of data at the host and its consumption, and caused issues between implementations, such as the Silly Window Syndrome (SWS). SWS happens when poorly implemented TCP flow control algorithms request that the sender reduce its window until the window becomes so small that data transmission becomes extremely inefficient.

He also goes on to say that, even for TCP, a "one size fits all" protocol implementation does not exist, implying that a network stack able to automatically adjust to different classes of connections (such as transactions, bulk data or event streams) would be preferred, rather than the static network stacks of today. He relates the problem to the increasing network speed of many connections, and the mismatch in the multiprocessing response to this development. He claims the end-to-end principle should be followed, saying that in a multiprocessing system the protocol work "should be done on the processor that's going to consume the data". To ensure this, he reworks the idea of packets being stored in a linked list, modelling them instead as a lock-free FIFO structure called a network channel, noting that many network components have a producer/consumer relationship.

My project will not be the first implementation of network channels on a UNIX-like system. In 2006, Evgeniy Polyakov wrote an implementation of network channels for the Linux kernel, accompanying his proof-of-concept implementation of a userspace network stack.2 However, it radically differs in design from my eventual design, as noted in later sections. Most importantly, it is not a zero-copy implementation; netchannel_copy_from_user and netchannel_copy_to_user copy network channel buffers to and from userspace, with the static array of buffers not shared with userspace.

3.2 Design

To make a user-space network stack work, there has to be an efficient method of linking the network stack and the drivers. There are a number of possibilities here. In line with the project's aims, in particular that of increasing cache performance, we can evaluate each design in turn with regard to the current architecture of Whitix:

3.2.1 Possibilities

User-space network drivers

One possibility is to place the network drivers at the same privilege level as the network stack itself, to avoid expensive system calls into the kernel (and, depending on the exact design, the overhead of data copying). However, the plan is unworkable, insecure and certainly not scalable, especially considering the current architecture of Whitix, for the following reasons.

2The source code for the implementation can be found at http://www.ioremap.net/cgi-bin/gitweb.cgi?p=netchannel.git;a=tree;f=net/core/netchannel;h=dfde00af264aa144a459d212bdeaa5f748e070db;hb=HEAD


First, consider a single privileged process per hardware driver, known as a server process, as it handles requests from the kernel or user processes via a message-passing mechanism. This is a design typically used in many microkernel operating systems, where each driver is given its own address space, mainly to address reliability concerns with driver stability, especially at a privileged level.

This first design was quickly discarded; the main disadvantage was the overhead associated with message passing. For example, copying data (in the form of sending a packet) from the client process to the server process might involve two memory copies and two context switches on a uniprocessor machine (from the client process to the server process, and the reverse in order to return the packet status to the client process), even without including the overhead of system calls. For a bandwidth-intensive application, these costs would result in very limited performance.

The next design involves a compromise of less stability and much less security to gain higher performance. In this design, network drivers are linked into the user-space network stack library by dynamic linking, and the library is treated as a special type of user-space library. This design is unworkable as well; besides the network stack library having different semantics compared to all other shared user libraries, each process wanting to use network capabilities must obtain or be given the appropriate privileges to manipulate hardware, which is undesirable for the many potentially insecure applications that contain security holes, such as web browsers!

Raw sockets and file I/O

The next design I considered was an adaptation of the raw socket for unprivileged applications. Raw sockets are a class of socket, available in Unix-like operating systems such as Linux, that can be used to generate any type of network packet. A raw socket functions as a normal socket, and allows users to bypass kernel network packet processing (usually for pure performance or for special network setups, such as DHCP in IPv4). Although this appears to be the ideal type of kernel object to use, there are a number of disadvantages that mean a more specialized design is required:

• Security. Allowing any user to construct their own packets, without oversight by the kernel or a trusted process, can lead to network abuse. For example, in TCP/IP, raw sockets can be used to launch SYN flood attacks, where a particular host is flooded with SYN packets (which function as connection requests in TCP), or to perform session hijacking, where a malicious host sends a specially constructed FIN packet to spoof a connection close by one of the hosts.[30]

• Copying and performance. When sending a raw packet from user-space to the kernel,3 the kernel must copy the packet to a private buffer or allocate

3In Linux, this is performed by calling sendto, or write and send if the application has specified the destination address to the kernel previously.


kernel-mapped pages to map the packet, to avoid the possibility of a malicious application unmapping the packet memory from another thread and causing the kernel to crash or fail to transmit the packet. As detailed in the project's introduction, this is not ideal for cache performance or the memory footprint of the application.[59]

• Packet classification. Owing to the very nature of raw sockets, received packets cannot be classified accurately by the kernel, as no information is given to the kernel by the raw socket about the socket's connection context (if any). Depending on the layer at which the raw socket operates (either the data link or network layer), we must duplicate packets among all raw sockets in all processes (ignoring those that do not pass some optional criteria that user-space can specify about the socket's packets). This is a major disadvantage, as it leads to very poor cache performance, massive data duplication and a range of security issues. With raw sockets, different applications run by different users can snoop on each other's packets with ease, which is one reason why raw sockets are limited to privileged users.[59]

The above points, in particular the lack of efficient packet classification, mean that the raw socket design would need to be adapted to be suitable for systems with many bandwidth-intensive network connections.
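For comparison, this is roughly what using a raw socket looks like on a Unix-like system such as Linux today: the caller must be privileged (CAP_NET_RAW on Linux), and with IP_HDRINCL set the application supplies the complete IP header itself.

/* A conventional Linux raw socket, shown for comparison with the designs
 * above: creation requires privilege, and IP_HDRINCL means the application
 * constructs the entire IP header itself. */

#include <sys/socket.h>
#include <netinet/in.h>

int OpenRawSocket(void)
{
    int one = 1;
    int fd = socket(AF_INET, SOCK_RAW, IPPROTO_RAW);   /* privileged */

    if (fd < 0)
        return -1;

    /* Tell the kernel not to prepend its own IP header to outgoing data. */
    setsockopt(fd, IPPROTO_IP, IP_HDRINCL, &one, sizeof(one));
    return fd;
}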

Network channels

The final design that I considered was an idea presented by Van Jacobson, known as network channels.[12] The key to more scalable network stacks is to make sure as much processing work as possible is done on the CPU where the application that needs the data is running, extending the end-to-end principle as far as possible. Jacobson intends user-space applications to deal with one network channel per connection; a network channel is a circular buffer of constant size containing packet pointers. Circular buffers remove the need for locking and writable cache lines between producer and consumer (which improves performance greatly, as both operations are very cache-unfriendly), with the kernel classifying packets to the appropriate channels given certain protocol-specific information. This appeared to be an ideal object for further investigation.

Although Jacobson did not go into detail about a possible implementation, he gave the example of a user-space network stack (after some kernel-space examples concerning driver and socket code) that utilized zero-copy I/O. To send a complete TCP/IP packet in user-space, the application would place a pointer to the buffer in the network channel and signal, most likely via a write system call, that the packet should be transmitted. The kernel would then copy the packet to the network card without any intermediate copying involved.

When receiving a packet, the kernel utilizes channel context information given to it when the channel was created. Through a small amount of protocol-specific code, the kernel can then classify any incoming packet into the appropriate

channels and copy the data into the channel.4 The application, presumably polling the channel for incoming data via poll or blocking on incoming data in user-space, can then read the packet, process it as a TCP or UDP packet (for example) and return the data to the user if applicable. Conceptually, the network channel addresses the three major disadvantages of a user-space network stack that uses only raw sockets, mainly through the small protocol-specific areas of the in-kernel implementation:

• Security. With raw sockets, many of the security issues arise from unmonitored packet transmission. By using the protocol-specific information passed to the kernel when a channel is created, the kernel can verify, at any point during in-kernel packet transmission – right up until the packet is copied to the network card – that the packet source and destination match those of the supplied context. This verification can easily stop session hijacking, as the packets in a channel can no longer masquerade as packets from another host.

• Copying and performance. Because of the circular buffer design, packets can remain in user-space while the kernel transmits them. Each packet stays in the circular buffer until it is successfully transmitted (or, in this implementation, the application has the option to manage transmission buffers); this allows the kernel to copy data from the underlying physical pages (via a temporary in-kernel mapping) to the network card without copying to a temporary buffer.

• Packet classification. As described above, protocol-specific context can be supplied upon channel creation to allow the kernel, with the help of protocol-specific code, to match the packet with the correct channel, with no duplication or unnecessary copying.

Possible disadvantages of Jacobson’s network channels, and particularly my implementation, mainly stem from the circular buffer design; leaving the packets in user-space will allow applications to unmap the packet or change the contents before transmission, which, if not handled carefully by the kernel, may lead to a system crash or attack packets slipping past the kernel’s packet verification procedures. Jacobson does not specify exactly how userspace would share to-be-transmitted packets with the kernel. Without a last-minute packet verification check, one possible hole, which is effectively a time-of-check-to-time-of-use bug, can be exploited by a malicious application in the following manner:

1. Create two threads. One will send the packet using SysWrite, and the other will alter the packet’s contents while the call is taking place.

4This memory copy is unavoidable. We must read the packet into a buffer in kernel space in order to inspect its contents for classification. One possible optimization, especially with large packets, is to read part of the header into the temporary buffer, classify the packet, and then read the rest of the packet directly into the appropriate channel buffer – however, this split receive operation is not supported by many network cards.


2. Create the network channel with typical source and destination values, as well as specifying a protocol the channel will follow. In this example, we will consider TCP.

3. Construct a data packet in a network channel buffer that will pass verification, such as a TCP SYN packet, and pass the buffer address to the other thread.

4. Signal that the channel buffer is ready to be transmitted, via a write (or similar) system call, and (especially on a multiprocessor machine, where two threads in the same process can run concurrently) alter the buffer contents subtly to form an attack packet. One possible attack might involve changing the source address field in the IP header. If the attacker wins the race, a malicious packet has been transmitted.

However, due to the nature of packet transmission the attacker is unlikely to win the race, and even less likely if the packet is verified immediately (via a quick header checksum) before transmission or if the machine is a uniprocessor. The chances of an attack packet being successful are also diminishing with increasingly secure network stacks; for example, TCP session hijacking or reset attacks without snooping are now extremely unlikely when both hosts perform initial sequence number randomization.[4] However, it is possible to detect manipulated packets (perhaps not with a 100% success rate), and the kernel could block the application from using network channel objects as a precaution to prevent further attacks.

The goal of network channels is to help build more cache-friendly, stable and scalable network stacks. Jacobson designs his network channels so that no locks or shared writable cache lines are needed between the producer and consumer, so adding and removing data to and from the channel becomes a cache-friendly operation.[12] This lines up with most of the project goals, and so Jacobson’s network channels appear to be the best design to build upon. My changes to his proposed kernel object are detailed in the following sections, along with a more complete description of the implementation and operation of network channels.
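As an illustration of the last-minute check described above, the following sketch shows the kind of verification the kernel could perform immediately before handing a buffer to the network card. The structure layouts and the ChanVerifyBeforeTransmit helper are hypothetical, assuming only the fields discussed in this section (the channel’s source/destination context and the packet’s IP header); this is not the actual Whitix implementation.

/* Hypothetical sketch: re-check the packet against the channel's context
 * immediately before transmission, closing the time-of-check-to-time-of-use
 * window described above. Field and structure names are illustrative only. */

#include <stdint.h>

struct Ipv4Header {
    uint8_t  versionIhl;
    uint8_t  tos;
    uint16_t totalLength;
    uint16_t id;
    uint16_t fragOffset;
    uint8_t  ttl;
    uint8_t  protocol;
    uint16_t checksum;
    uint32_t sourceAddr;
    uint32_t destAddr;
};

struct ChannelContext {
    uint32_t sourceAddr;   /* verified when the channel was created */
    uint32_t destAddr;
};

/* Returns non-zero if the buffer still matches the context supplied when the
 * channel was created; called as late as possible, just before the buffer is
 * handed to the hardware. */
static int ChanVerifyBeforeTransmit(const struct ChannelContext* ctx,
                                    const struct Ipv4Header* header)
{
    if (header->sourceAddr != ctx->sourceAddr)
        return 0;
    if (header->destAddr != ctx->destAddr)
        return 0;

    /* A quick header checksum (as suggested above) would also catch a
       concurrent modification of other header fields. */
    return 1;
}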

3.2.2 Tradeoffs

The main tradeoff with moving network processing to user-space is that sockets no longer map to file descriptors in the kernel.5 This is especially important for operations shared with other file descriptors, such as typical files, which will still be represented as file descriptors. Operations like poll or the equivalent SysPoll, which take a list of file descriptors and return, informing the application of events on those file descriptors, may be tricky to emulate with objects such as network channels or raw sockets.

5What type of file descriptor depends on the particular user-space interface; the class of file descriptor for a network stack based on raw sockets is not the same as that for an implementation based on network channels, for instance.


Emulation is especially complicated because incoming data on these file descriptors does not always map to user-readable data (consider ACK messages on a TCP socket represented by a network channel). This means the application, if it is not careful, will be woken up by SysPoll or poll, attempt to read the socket via SocketRead or recv, and block, perhaps for an undetermined amount of time, or not be woken up at all (if the network stack is simultaneously polling that channel). However, the documentation for the select function (an equivalent) notes the presence of spurious read notifications, even with kernel sockets. This issue is further discussed in Section 4.4.6 in the context of TCP sockets in the userspace network stack.

Other tradeoffs include being unable to aggregate or enumerate the network connections present in a machine (or only at a very low level, depending on the protocol). This is not normally important, but when we come to implement utilities such as netstat, which (among other things) prints a list of open sockets, or implement SNMP MIBs,6 it is difficult to decide exactly how to retrieve or expose this information when most connection context is held in separate userspace processes.

3.2.3 Architectural overview

Figure 3.1 depicts the network channel layer with regard to the overall networking architecture of Whitix. The user-space network stack interacts with the network channel layer via system calls such as SysChannelCreate and SysChannelControl, as well as general I/O calls such as SysWrite and SysPoll, where the network channel is treated as an ordinary file descriptor. The layer performs the following functions:

• Channel management. Each process in Whitix has a private array of file handles. An application may open channels using the SysChannelCreate call, specifying the source and destination addresses in a protocol-specific manner for the protocol classifier code, and receive a file handle for use with functions such as SysWrite and SysPoll. These handles are then closed with SysClose, and modified via SysChannelControl and SysFileControl.

• Memory management. The memory for incoming and outgoing packets needs to be managed correctly. The channel is represented as a static array of buffers shared between the kernel and userspace, partitioned by the buffer’s purpose; the arrays of send and receive buffers are formatted and treated differently. Allocation and deallocation are performed by setting bits in one of two bitmaps, with information about incoming and outgoing packets (such as the packet length) specified in a standard header format.

• File emulation. Since the network channel is represented as a file descriptor, it must fulfill a certain set of operations. Operations such as SysWrite and SysPoll should obey all existing semantics where possible, so that calling the functions on a channel file descriptor is indistinguishable from calling them on any other file descriptor.

6A future SNMP implementation is discussed in Section 8.2.4.


Figure 3.1: The network channel layer interfaces between the lower-level networking code (including the packet I/O code) and user-space code. The network channel layer structures packet I/O, routes outgoing packets to the appropriate network interface and classifies incoming packets.


• Packet classification. Incoming packets must be inspected, in a protocol-specific manner, to match them with a channel. For example, incoming UDP packets are matched with a channel that has the same source port; since only one IPv4 channel can map to a given (IPv4) port, the packet must be destined for that channel. Because classification is performed in an interrupt context, the process must be as efficient as possible to ensure a responsive system.

• Routing. The userspace network stack has no knowledge of the routing table. Outgoing packets have their source address applied by the kernel after signalling a packet transmit via SysWrite; the protocol-specific code then updates any other dependent fields, such as the packet’s checksum, and transmits the packet. The routing layer is updated by the dhcp utility as part of its initial configuration, and it calls the lower layers to map network addresses to hardware addresses.

• Firewall. A basic stateless firewall, or packet filter, has been implemented as part of the packet classification code. Support for filtering incoming and outgoing TCP and UDP packets by properties such as protocol, destination or source port, and packet direction (or a combination) has been added to a generic firewall layer that barely affects the throughput and latency of the networking subsystem.

• Usercode library. Because network channels share knowledge of data structures between the kernel and the user, we supply a small usercode library that abstracts the lowest layer of structure manipulation (both for the channel in general and individual packets) into a user-space library. This kernel code is made available via a user-accessible shared page; the network stack on startup finds this code via an in-memory directory and links a set of function pointers (representing operations such as ChanSendBuffAlloc) to the export table, so that the channel data structures can be updated in a manner invisible to the application. This allows for greater maintainability and flexibility in the kernel implementation.

3.3 Channel management

Before we can send and receive packets on network interfaces, we need to specify to the kernel the address family (in this implementation, only IPv4 is supported), protocol (TCP, UDP or ICMP) and source and destination addresses we will be dealing with. This is to ensure security for outgoing packets and also to give the kernel enough context to classify incoming packets to that channel. As a result, the network channel truly is a bidirectional interface for data communication.

In channel creation, we can also specify several other options, such as retaining the ability to manage deallocation of sent packets (important for a connection-oriented protocol that retransmits lost or corrupt packets, such as TCP) or to later bind to a certain interface (useful for dhcp, which initializes a particular interface). The exact number of send and receive buffers needed can also be specified.
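To make this concrete, the sketch below shows roughly how a userspace component might create a UDP channel. The real parameters of SysChannelCreate are defined in Appendix B.1, which is not reproduced here, so the ChannelAddress layout, the constants and the argument order used below are assumptions for illustration only.

/* Hypothetical usage sketch: the structure layout, constants and argument
 * order are placeholders; see Appendix B.1 for the real interface. */

struct ChannelAddress {
    unsigned int   address;     /* IPv4 address in network byte order */
    unsigned short port;
};

enum { CHAN_FAMILY_IPV4 = 1, CHAN_PROTO_UDP = 17 };   /* placeholder values */

extern int SysChannelCreate(int family, int protocol,
                            struct ChannelAddress* source,
                            struct ChannelAddress* dest, void* options);

int CreateUdpChannel(unsigned int destIp, unsigned short destPort)
{
    /* A zero source port asks the kernel for an ephemeral port (footnote 7). */
    struct ChannelAddress source = { 0, 0 };
    struct ChannelAddress dest   = { destIp, destPort };

    int channelFd = SysChannelCreate(CHAN_FAMILY_IPV4, CHAN_PROTO_UDP,
                                     &source, &dest, 0);
    if (channelFd < 0)
        return channelFd;    /* creation failed: error code from the kernel */

    /* channelFd can now be used with SysWrite, SysPoll, SysChannelControl
       and eventually SysClose like any other file descriptor. */
    return channelFd;
}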


This information is used to set up the network channel generically, as well as by any protocol-specific code that may be interested in the source and destination addresses.7 The exact manner depends on the sub-protocol (such as TCP or UDP) and what purpose the channel is intended to serve. For example, TCP master sockets, which create child sockets to accept connections, are allowed to accept TCP packets from any source on a certain port, but cannot then write to those sources without creating a channel with a more specific channel address.

After creating the channel, we can also manage channel-specific options through the SysChannelControl8 system call interface. For instance, if we create a channel that supplies raw packets (functioning like a raw socket), as dhcp does when it initializes a particular interface, we need to specify what interface we are sending these packets out on. We can use the human-readable name of the interface to do this, and the channel will then be bound to this interface. (Appendix B.2)

After creation and channel-specific control calls, the rest of the channel management interface comprises the standard filesystem calls, bearing in mind that the network channel behaves like a character device in UNIX.9 In particular, closing the channel and releasing the associated resources is achieved by calling SysClose. The file descriptor referring to the channel can be passed to new processes via SysCreateProcess, duplicated (SysFileDup) or controlled using SysFileControl.

How channels are internally organized by the kernel depends on the channel’s address family and protocol; different protocols use their protocol-specific source and destination addresses to arrange channels for the most efficient access. However, the method of channel organization is generic; incoming packets are hashed and matched in a hash table, and this is common to all address families and protocols. The exact mechanism is described in Section 3.3.3.

3.3.1 Setup

As part of socket creation or connection in the userspace network stack, we call SysChannelCreate, specifying the address family and protocol, source and destination addresses and optional features of the channel (see Appendix B.1 for the exact parameters). After validating the parameters, the kernel then inspects the specified channel address family and protocol type.

The address family is used to determine the protocol-specific channel operations for the channel, in the form of the ChannelOps structure (Figure 3.2). Instances of the ChannelOps structure are registered by protocol-specific code via the ChanRegisterFamily call, which links an address family, identified by an integer, to a particular instance of the structure.

7For example, IPv4 assigns a temporary source port, known as an ephemeral port, if the source port of the channel is set to zero.

8The exact description of the two channel-specific system calls that comprise part of the channel management interface can be found in Appendix B.

9There are two types of device in UNIX-like operating systems: block devices, which can be read at any point in the file using SysSeek and represent a repository of bytes, and character devices, which represent a stream of bytes (in our case, packets!) and so cannot be read from at a random location; this affects other operations such as synchronization.


struct ChannelOps {
    int (*create)(struct Channel* channel, struct ChannelHead** head,
                  struct ChannelOptions* options);
    int (*control)(struct Channel* channel, int code, void* data);
    int (*write)(struct Channel* channel, struct NetSendBuffer* buffer);
    int (*recvBuffer)(struct NetBuffer* buffer);
    int (*close)(struct Channel* channel);
};

Figure 3.2: The ChannelOps structure is used by protocol-specific components, such as the IPv4 code, to present a generic interface to the general channel layer code. The operations are used in general channel management and packet transmission and receiving.

This structure is then used to call the protocol-specific create operation, which verifies the source and destination address (updating them if necessary), and if possible, assigns the channel to a particular network interface, updating the ChannelInfo structure described in the next section with values for the hardware header bytes needed and the maximum packet size permitted when sending packets out over the interface.10 Finally, the channel is assigned to a list (represented by the head parameter in create) for internal organization, using information from the source and destination addresses.
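The registration step might look roughly like the sketch below: at initialization, the IPv4 code hands its ChannelOps instance to the generic channel layer. The ChanRegisterFamily call and the operation names in the table are described in this chapter, but the exact registration signature, the family constant and the Ipv4ChannelControl/Ipv4ChannelClose names are assumptions.

/* Hypothetical registration sketch for the IPv4 family. */

extern int Ipv4ChannelCreate(struct Channel* channel, struct ChannelHead** head,
                             struct ChannelOptions* options);
extern int Ipv4ChannelControl(struct Channel* channel, int code, void* data);
extern int Ipv4Write(struct Channel* channel, struct NetSendBuffer* buffer);
extern int Ipv4RecvBuffer(struct NetBuffer* buffer);
extern int Ipv4ChannelClose(struct Channel* channel);

static struct ChannelOps ipv4ChannelOps = {
    .create     = Ipv4ChannelCreate,    /* verify addresses, pick interface and list */
    .control    = Ipv4ChannelControl,   /* family-specific SysChannelControl codes */
    .write      = Ipv4Write,            /* route, fill in checksums, transmit */
    .recvBuffer = Ipv4RecvBuffer,       /* classify an incoming packet (Section 3.6) */
    .close      = Ipv4ChannelClose,     /* free ports and per-channel state */
};

int Ipv4Init(void)
{
    /* Assumed signature: link an integer family identifier to the ops table. */
    return ChanRegisterFamily(CHAN_FAMILY_IPV4, &ipv4ChannelOps);
}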

Memory and the VFS

At this point, the channel has been assigned to a particular list of channels, the source and destination addresses have been verified and modified, and if possible, the channel has been assigned to an interface. We still need to initialize the channel’s memory, which comprises an array of buffers, complete with a general channel header and packet headers. The exact setup is described later in Section 3.4, but in brief, the memory setup operation (ChannelMemorySetup in net/channels/memory.c) calculates the number of pages needed to store the number of send and receive buffers (the exact number may be specified in the options parameter of SysChannelCreate), which is a function of the maximum packet size (determined by the interface or interfaces that the packet can be sent and received on), the number of buffers and the hardware header size.

Once we have calculated the number of pages needed for the send and receive buffers, we can allocate a list of page pointers and set up the general channel header structure. At this point, applications can interact with the network channel’s memory correctly. After memory initialization, the last operation to perform is binding the channel to a file descriptor using the generic virtual filesystem (VFS) infrastructure; the channel layer supplies a list of file operations it can implement, and the VFS returns an integer corresponding to an index in the file descriptor table for use in later file operations. This file descriptor value is returned to the calling application in userspace – channel setup is now complete, and the application can now send and receive packets on it.

10See Section 3.6 for the exact rules involved in assigning IPv4 channels to network interfaces.
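The sizing arithmetic performed by the memory setup operation might look like the sketch below. The rounding rules (unspanned send buffers, spanned receive buffers) follow the layout in Figure 3.3, but the helper name and exact arithmetic are illustrative rather than the actual ChannelMemorySetup code.

/* Illustrative sketch of the sizing arithmetic in channel memory setup.
 * A buffer holds a per-buffer header plus the maximum packet size. */

#define PAGE_SIZE 4096

struct ChannelSizing {
    int numSendPages;
    int numRecvPages;
    int sendPacketsPerPage;
};

static void ChanCalcSizing(struct ChannelSizing* sizing,
                           int numSendBuffers, int numRecvBuffers,
                           int maxPacketSize,
                           int sendHeaderBytes, int recvHeaderBytes)
{
    int sendBufferSize = sendHeaderBytes + maxPacketSize;
    int recvBufferSize = recvHeaderBytes + maxPacketSize;

    /* Send buffers are unspanned: a buffer never crosses a page boundary,
       so only whole buffers that fit in a page are used. */
    sizing->sendPacketsPerPage = PAGE_SIZE / sendBufferSize;
    sizing->numSendPages = (numSendBuffers + sizing->sendPacketsPerPage - 1)
                           / sizing->sendPacketsPerPage;

    /* Receive buffers are spanned: the kernel copies packets in by software,
       so the region is simply the total size rounded up to whole pages. */
    sizing->numRecvPages = (numRecvBuffers * recvBufferSize + PAGE_SIZE - 1)
                           / PAGE_SIZE;
}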

3.3.2 Control

It may not be efficient or possible to supply all the needed optional features for a channel at channel creation. For instance, a suitably-privileged application may want the network channel to ignore addresses (for raw network access) so it can specify a binding to an interface, but then later use the routing functions in the kernel when raw network access is no longer needed. This can be achieved with the following series of calls to SysChannelControl (which is further described in Appendix B.2):

...
SysChannelControl(channelFd, CHANNEL_GET_FLAGS, &flags);
flags |= CHANNEL_IGNORE_ADDRESSES;
SysChannelControl(channelFd, CHANNEL_SET_FLAGS, &flags);
SysChannelControl(channelFd, CHANNEL_SET_INTERFACE, "Ethernet0");
.. (send raw packets) ..
flags &= ~CHANNEL_IGNORE_ADDRESSES;
SysChannelControl(channelFd, CHANNEL_SET_FLAGS, &flags);

At the moment, there are only a few options that can be controlled by the user; this is a testament to the generality of network channels, as well as the fact that most connection state is stored in userspace. In contrast, Linux provides three system calls for controlling kernel socket operation: ioctl, getsockopt and setsockopt. In Whitix, virtually all of the socket options are handled by the userspace network stack without complex demultiplexing of code arguments in the kernel. Other options will be added when needed, but the options currently available (see Appendix B.2) seem sufficient for virtually all applications and utilities.

3.3.3 Organization

There are two goals to be achieved in channel organization:

• Efficient organization. The exact arrangement of channels is up to the address family, and is often dependent on the matching criteria and information available. This usually differs between protocols in a family. The IPv4 family uses a hash table per protocol, with the protocol determining the exact inputs to the hash function. (Section 3.6)

• Generalized lookup. It is useful to make the lookup and organization code as generalized as possible, to avoid code duplication and subtle bugs. Therefore, the high-level operations, such as adding and removing channels, should be a general mechanism, with the protocol-specific code providing the placement policy, the most efficient organization possible and the matching code, as stated above.


To achieve this, during channel setup, the protocol-specific create function returns a list, represented by a ChannelHead structure, to attach to. The protocol classifier code supplies the same ChannelHead pointer to the generic matching function ChannelSearchList (include/net/channels.h) when the recvBuffer operation is called (see Figure 3.2), along with a callback to match against any channels found. When the channel is closed, it is removed from the list without calling any protocol-specific code.
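A sketch of this generic lookup is shown below. The real ChannelSearchList lives in include/net/channels.h; the list node layout and callback signature shown here are assumptions, but the shape – walk a ChannelHead list and hand each candidate to a protocol-supplied compare callback – matches the description above.

/* Illustrative sketch of a generic list search with a protocol-supplied
 * compare callback; structure layouts are assumed for the example. */

struct ChannelHead {
    struct Channel* first;      /* head of a singly-linked list of channels */
};

struct Channel {
    struct Channel* next;
    /* ... generic and protocol-specific fields ... */
};

typedef int (*ChanCompareFunc)(struct Channel* channel, void* context);

/* Walk the list supplied by the protocol's create operation and return the
 * first channel the protocol-specific callback accepts (or NULL). */
struct Channel* ChannelSearchList(struct ChannelHead* head,
                                  ChanCompareFunc compareFunc, void* context)
{
    struct Channel* current;

    for (current = head->first; current != 0; current = current->next)
        if (compareFunc(current, context))
            return current;

    return 0;
}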

3.3.4 Destruction

When the channel is no longer used by the application, it may be freed to save memory and processing by calling SysClose, passing the file descriptor. Unlike closing a kernel-level socket, closing a channel does not mean any close messages are sent.11 In a connection-oriented protocol, userspace code should send the appropriate connection termination messages (along with the usual application-level protocol close handshaking) before closing the channel.

Since the channel’s context is kept mostly in a Channel structure instance, a pointer to which is kept in the File structure, a channel close involves removing the channel from a ChannelHead list and freeing the Channel structure, with a short in-memory protocol-dependent close procedure, which may include freeing any ports allocated to the channel.

3.4 Memory management

The exchange of packets in a network channel is achieved via a memory-mapped area shared between userspace and the kernel, divided into two main regions, send and receive, each with a different number of buffers, header structure and allocation map. The core of the network channel memory management involves a tight coupling between kernel-space and user-space, with both the kernel and the userspace application updating the packets and allocation maps as data is sent and received using the network channel.

Because of this interaction between kernel and user-space, the implementation relies on correctly functioning user-space code to work correctly. This is not an ideal situation; a good rule of thumb is to assume that any user-space code is incorrect and unreliable by nature. However, if the kernel can detect any errors made or inconsistencies created by user-space, we can recover as quickly as possible or let the application know in as detailed a fashion as possible.

It is also undesirable for user-space libraries to have knowledge of the internal channel data structures involved, as this risks problems with keeping the kernel and user code in synchronization, especially with a less tightly-coupled development process than at present. To resolve this, most of the low-level accesses to shared data structures are provided by a user-code library that the kernel makes available via a dynamic mechanism (Section 3.4.2).

11One example of a connection termination protocol is the FIN handshake of TCP that is initiated when either host in a connection signals a socket close.


Figure 3.3: Layout of the network channel mapped contiguously in virtual memory. The left side of the diagram shows the number of pages used for each region, with the number of buffers in each region shown on the right-hand side. Note that send pages are mapped with an unspanned organization (to avoid sending a buffer on two discontiguous pages), but receive pages are mapped with a spanned organization.

The core motivation of the network channel design is the avoidance of costly packet copying where possible; we follow a zero-copy philosophy. The zero-copy philosophy in this case states "place the data where it is needed to begin with", as a copy essentially means the data was not in the desired place to begin with. With the correct design, placing the data in the correct place is perfectly possible to achieve.

3.4.1 Memory layout

Figure 3.3 shows the memory layout of a fully memory-mapped network channel (from the viewpoint of userspace). It can be divided into three regions:

• Channel header. The channel header stores the channel information structure (Section 3.4.1), the identification number of the last received packet (updated by the usercode library and used when polling the channel), and two allocation maps, one each for send and receive.

• Send packets. This is a static array of sent or soon-to-be-sent packets. Each packet comprises a send header and its accompanying buffer, which is the maximum length that allows the packet to be transmitted over a certain interface or all possible interfaces (depending on the semantics of the data protocol). Send buffers follow an unspanned organization, as network cards may be unable to transmit buffers that span two virtual pages, where the pages may correspond to non-contiguous physical pages.

struct ChanUserHeader {
    int magic;
    WORD recvId;
    BYTE sendAllocMap[(CHAN_MAX_SEND_BUFFERS + 7) / 8];
    BYTE recvAllocMap[(CHAN_MAX_RECV_BUFFERS + 7) / 8];
    struct ChannelInfo info;
};

Figure 3.4: The general header of the channel, placed in the first page of the shared memory. It contains the information structure and the memory allocation bitmaps for the two memory regions of the channel. The magic field could be used by the usercode library to verify that the pointer passed to the various functions points to a ChanUserHeader structure. The role of recvId when receiving packets is described in the receive semantics in Section 3.4.1.

• Receive packets. This is a static array of received packets. It has a similar structure to the send packets: one packet comprises the receive header and its buffer. Each buffer is the maximum packet length that can be received on a certain interface or all possible interfaces. Receive buffers do follow a spanned organization, as the incoming packet is copied into the receive region in software by the kernel and so can span two pages.

Channel header

The channel header structure is used for memory management of the channel’s entire shared memory area. In the usercode library, the sendAllocMap and recvAllocMap are updated using bit manipulation functions such as BitTestAndSet and BitClear to allocate and free buffers respectively. The structure itself is also updated by the kernel, and remains opaque to user applications. The application can retrieve a pointer to the ChannelInfo structure by calling uChanGetInfo.

Our approach to allocation differs from both Jacobson’s and Polyakov’s implementations. In those, the network channel is a cache-aware, cache-friendly queue, with a circular buffer approach. Incoming packets, if we take receive for example, are placed at the tail of the queue, with any waiters being notified if there is data between the head and the tail. Both presumably employ a copy between userspace and kernel (for user-facing network channels), since their network channel consists of a list of pointers to buffers, whereas the Whitix implementation is a list of buffers in the channel itself. The queue (Jacobson and Polyakov) and allocation bitmap (Whitix) have similar cache and locking properties, but lookup in the case where buffers may persist in memory is much faster with bitmaps; on many processor architectures, finding the first free bit involves no loops and very few processor instructions.


static inline void net_channel_queue(net_channel_t *chan, uint32_t item)
{
    uint16_t tail = chan->p.tail;
    uint16_t nxt = (tail + 1) % NET_CHANNEL_Q_ENTRIES;

    if (nxt != chan->c.head) {
        chan->q[tail] = item;
        STORE_BARRIER;
        chan->p.tail = nxt;
        if (chan->p.wakecnt != chan->c.wakecnt) {
            ++chan->p.wakecnt;
            net_chan_wakeup(chan);
        }
    }
}

Figure 3.5: Van Jacobson’s network channel, as a queue implementation. This works well for a producer-consumer relationship. However, if we would like some packets to persist (say, for retransmission), iterating through the queue to find the first free buffer becomes less effective. There is still a shared cache line if the channel is accessed by two threads – the q array. Whitix’s allocation bitmaps optimize lookup for allocation, deallocation and location of the earliest unread packet for receives.
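For comparison with the queue in Figure 3.5, the following is a sketch of the bitmap-based send-buffer allocation used in Whitix. BitTestAndSet is assumed to be an atomic test-and-set returning the previous bit value, and the simple addressing ignores the unspanned page layout; the point is only that allocation amounts to finding and setting a free bit, with no walk over a queue.

/* Illustrative sketch: bitmap-based send-buffer allocation over the shared
 * region. Helper names and buffer addressing are assumptions. */

struct ChanSendHeader* uChanSendBuffAllocSketch(struct ChanUserHeader* header,
                                                struct ChannelInfo* info,
                                                char* sendRegionBase)
{
    int bufferSize = info->sendHeaderBytes + info->maxPacketSize;
    int i;

    /* A real implementation can use a find-first-zero-bit instruction here
       rather than a loop. */
    for (i = 0; i < info->numSendBuffers; i++) {
        /* A previous value of 0 means the bit was clear: we now own buffer i. */
        if (BitTestAndSet(header->sendAllocMap, i) == 0)
            return (struct ChanSendHeader*)(sendRegionBase + i * bufferSize);
    }

    return 0;   /* no free send buffers: the caller must wait or free one */
}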

ChannelInfo

As well as sharing memory between the kernel and userspace, the network channel also shares a structure describing the layout and configuration of the channel, the ChannelInfo structure. This is used by the usercode library, as well as the network stack library, to construct packets. The structure contains the members shown in Figure 3.6. This is the key mechanism for exposing information about the channel to the general application.

struct ChannelInfo {
    unsigned short numSendBuffers, numSendPages;
    unsigned short numRecvBuffers, numRecvPages;
    unsigned short sendHeaderBytes, recvHeaderBytes;
    unsigned short sendPacketsPerPage;
    unsigned short hwHeaderBytes;
    unsigned short maxPacketSize;
};

Figure 3.6: The ChannelInfo structure, placed in the first page of the shared channel memory. Unlike the other structures shared between the kernel and userspace, it follows a known standard format and the internals are available to applications (who use it to construct and parse packets).


struct ChanSendHeader {
    unsigned int magic;
    unsigned short length;
    unsigned short indexFlags;
    void *priv;
    struct Time time;    /* transmitted */
};

Figure 3.7: The channel buffer send header. This is placed before the data of each send buffer. A transmitted buffer has a different set of properties to a receive buffer; for example, the time field is used to note the system time at which the packet was transmitted (for round-trip time estimates). Unlike receive buffers, send buffers are automatically freed when sent, unless the CHANNEL_KEEP_SEND_BUFFERS flag is specified upon channel creation (Appendix B.1).

For example, to calculate the local maximum segment size12 in TCP, we must have an idea of the maximum packet size (maxPacketSize, which excludes hwHeaderBytes) that the channel’s interface can support. Other fields of the channel structure are generally used by the usercode library; uChanRecvBuffData uses the recvHeaderBytes field to return a pointer to the start of the received packet’s data, given a pointer to the start of the ChanRecvBuffer.

Transmission semantics

To send a packet, the application (usually via a channel layer in a userspace library) executes the following steps (a short sketch combining them follows the list):

1. The application allocates a send buffer by calling ChanSendBuffAlloc in the usercode library.

(a) If there are free send buffers, the function is successful and sets the appropriate bit in the send allocation map, returning an (opaque to the application) ChanSendHeader pointer to the calling code.

(b) If there are no free buffers, the function is unsuccessful and a NULL pointer is returned; the application must wait for send buffers to be freed by the kernel, or, if it has chosen to manage its own send buffers, free the buffers in the allocation map itself.

2. The application constructs the packet at the network layer and above (e.g. IPv4 and either TCP, UDP or ICMP), with an offset of sendHeaderBytes (reserved for the data link layer header and the buffer header itself). Checksum fields may be left to the kernel protocol-specific code to fill in, but this depends on the protocol code.

12The maximum segment size is the largest packet that can be received by a host in one unfragmented piece, excluding the size of the TCP or IP header.


struct ChanRecvHeader {
    unsigned int magic;
    unsigned short read;
    unsigned short id;
    unsigned short length;
    unsigned short indexFlags;
    void *priv;
};

Figure 3.8: The channel buffer receive header. This is placed before the data of each receive buffer. A received buffer mainly stores application information about the amount of processing the buffer has gone through. read is updated by the application to note how many bytes of the buffer have been processed, and id is used to indicate the place of the buffer relative to others in the incoming stream of data.

3. The application sets the length of the buffer and calls the SysWrite system call, passing the address of the start of the buffer (which is the send buffer header) and the buffer length (including sendHeaderBytes).

4. The SysWrite system call returns, indicating how many bytes were successfully transmitted.

5. By default, the buffer is freed when transmitted by the kernel. If the application has chosen to manage the send buffers for that channel (for example, to build a retransmission list such as in TCP), the buffer can later be freed by the application by calling uChanSendBuffFree in the usercode library.
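The sketch below ties the steps above together for a hypothetical UDP send. The helper signatures and the header construction (BuildUdpHeaders) are assumptions standing in for the usercode library and the application’s protocol code; only the overall sequence (allocate, build at an offset of sendHeaderBytes, set the length, SysWrite) follows the list above.

/* Hypothetical send-path sketch; helper names and signatures are assumed. */

int ChannelSendDatagram(int channelFd, struct ChannelInfo* info,
                        const void* payload, unsigned short payloadLength)
{
    /* Step 1: allocate a send buffer (sets a bit in the send allocation map). */
    struct ChanSendHeader* buffer = uChanSendBuffAlloc();
    if (buffer == 0)
        return -1;     /* no free buffers: wait for the kernel or free one */

    /* Step 2: build the packet after the reserved sendHeaderBytes offset;
       checksum fields may be left for the kernel's protocol code to fill in. */
    char* packet = (char*)buffer + info->sendHeaderBytes;
    unsigned short packetLength = BuildUdpHeaders(packet, payload, payloadLength);

    /* Steps 3 and 4: set the buffer length and signal transmission. */
    buffer->length = packetLength;
    int written = SysWrite(channelFd, buffer,
                           info->sendHeaderBytes + packetLength);

    /* Step 5: the kernel frees the buffer once transmitted, unless the
       channel was created with CHANNEL_KEEP_SEND_BUFFERS. */
    return written;
}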

Receive semantics

Receiving a packet on the network channel can be an asynchronous or synchronous process, as well as blocking or non-blocking. The implementation details are left to the application. The most common kind of receive, a synchronous blocking receive, is as follows (a sketch follows the list):

1. (optional) The application calls SysPoll, passing the channel file descriptor and the required event (POLL_IN) as a PollItem, along with an optional timeout which can be used to implement the receive timeouts familiar in user-level sockets.

(a) If a packet was received before SysPoll returns, the kernel will have copied the packet into a free receive buffer, incremented the latestRecv field of the in-kernel Channel structure, and signalled that data was available to read in the channel by setting POLL_IN in the revents field. SysPoll returns a positive number to signal the number of file descriptors with events, and the receive process continues.

(b) If no packet was received, no POLL_IN events are signalled for the channel file descriptor, and the SysPoll system call returns zero to indicate no events happened on the file descriptor. The application can continue polling the channel, or perform other work in the meantime.

2. The application then calls the usercode library function uChanRecvBuffGet to retrieve the (opaque pointer) ChanRecvBuffer. If the operation could be performed in the context of multiple threads, the appropriate synchronization is left to the application – the function may return NULL if another thread has already acknowledged receipt of the buffer. It may also return NULL if SysPoll was not called beforehand to ensure waiting data was present.

(a) uChanRecvBuffGet acknowledges receipt of the packet. It is a simple operation in the usercode library with a fast general case (where only a few buffers are allocated in the receive region). The allocation map is scanned for any allocated buffers, and those allocated buffers are then inspected to see if they have a newer ID than the current recvId value in the channel header. If a buffer is new, it is returned to the application.

3. The data can then be inspected by calling uChanRecvBuffData13 – the receive buffer remains allocated until the application has processed all the data and called uChanRecvBuffFree.

13This pointer generally points to the start of the buffer data – there is no link-layer header present in received packets, as such information is useless without more context.
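Correspondingly, a blocking receive following the steps above might look like the sketch below; the PollItem layout, the SysPoll signature and the usercode helper signatures are assumptions.

/* Hypothetical blocking-receive sketch built from the steps above. */

#include <string.h>

int ChannelReceive(int channelFd, void* out, int outSize)
{
    /* Step 1: wait until the kernel has copied a packet into a receive buffer. */
    struct PollItem item = { channelFd, POLL_IN, 0 };
    int ready = SysPoll(&item, 1, -1);      /* -1: assumed "wait forever" value */
    if (ready <= 0)
        return ready;                       /* timeout or error */

    /* Step 2: acknowledge the newest unread buffer via the usercode library. */
    struct ChanRecvHeader* buffer = uChanRecvBuffGet();
    if (buffer == 0)
        return 0;        /* another thread took the buffer, or a spurious wakeup */

    /* Step 3: copy the payload out, then release the receive buffer. */
    int length = buffer->length < outSize ? buffer->length : outSize;
    memcpy(out, uChanRecvBuffData(buffer), length);
    uChanRecvBuffFree(buffer);

    return length;
}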

3.4.2 Usercode library

Because the internal structure of all the headers mentioned in this section may change (in the future, the arrangement of fields in ChanSendBuffer and ChanRecvBuffer might depend on cache line size, word alignment, or the processor’s word size), it is preferable to abstract the lowest layer of structure manipulation so that we can update the structures without worrying about backwards compatibility.

The kernel provides a small usercode library used for manipulating the packet and channel headers, as well as allocating and deallocating packets. The code itself is in net/channels/userlib.c, and is copied to a nominal high userspace-accessible address in each application, along with a directory of pointers to various function pointer tables; these tables comprise a NULL-terminated list of function pointers, in a prearranged order (not unlike system calls) for applications to call. The userlib mechanism is available to all parts of the kernel in lib/userlib.c.

The function pointer tables are used in the user-space network library, where they are copied from kernel space when the shared library is first loaded (via the init function available to shared libraries). General user-space code can then call these functions via a small inline wrapper for each function, but generally the internal network stack code calls these functions through the channel layer, which further abstracts channel operation into high-level functions (such as reading a packet from a channel in ChanRecvNb and polling a channel, described in the next chapter). Many of these usercode functions are "getters" and "setters", and they provide an abstraction layer for channel structures to allow for maximum flexibility in the kernel implementation. For a list of the functions available in the usercode library, see Appendix C.

3.5 File emulation

Although a network channel certainly does not have the properties of a traditional file, SysChannelCreate (Appendix B.1) returns a file descriptor for use in file operations, so we must implement a number of common file operations to satisfy the semantics expected by user applications of any file descriptor. However, some of these operations are not implemented, because the alternative is faster and therefore should be used by libraries and applications. The file operations implemented in net/channels/file.c so far include:

• Writing (special semantics). The SysWrite system call is a signal to the channel layer that the application wants the buffer to be transmitted. The application, perhaps via a userspace shared library, constructs the buffer as described in Section 3.4.1 and passes the address of the buffer (and header) and the total length of the packet (including the bytes of the send header). To avoid the corruption of packets, ChanWrite verifies the magic number and length before assigning a NetSendBuffer and passing the buffer to protocol-specific code (such as Ipv4Write) for transmission – as a result, only correctly formatted ChanSendHeaders can be transmitted.

• Polling. Polling is the main mechanism, instead of blocking reads, for waiting for data from a network channel. The channel can always send out data (so long as it has enough free buffers), and waiting data is signalled by a change in the value of the latestRecv ID in the in-kernel Channel structure (which is incremented whenever a packet is received by the channel).

• Handle duplication. This is handled automatically by SysFileDup. However, it does raise interesting questions about duplicating channels across processes – by passing a file descriptor to SysCreateProcess, it is automatically duplicated and given to the new process. Given this, how can userspace network stacks transfer whatever wraps around the channel (such as the TCP socket state) with the duplicated file handle? This is an open question that needs to be addressed.

• Closing. When SysClose is called, the channel destruction process begins in ChanClose. The protocol-specific channel resources are then freed, followed by the channel itself, as described in Section 3.3.4.

• Memory mapping. Since the channel’s memory is faulted in when needed, we implement the ChanMapNoPage virtual memory operation. This maps a zero page into an appropriate place in the channel’s memory (updating the internal array of pages, a list of addresses of kernel-accessible network channel pages). The page is then formatted by ChanMapNoPage to become part of either the general channel header or the send region, or, if the accessed page is part of the receive region, the page is left zeroed by ChanMapNoPage – ChannelCopyPacket formats receive buffer headers before data is copied in.

I have chosen not to implement reading directly from the file (by calling SysRead), because there is a superior alternative: if we wish to block, we poll the channel for data and then read it from userspace as described in Section 3.4.1; if we wish to perform a non-blocking read, we scan the packets in the channel for a newer ID than the last packet read using uChanRecvBuffGet. Using the channel library involves at most one user-kernel transition, compared to a SysRead, which may involve multiple system calls if the specified buffer is smaller than the maximum packet size. It also simplifies the in-kernel channel implementation, as we do not have to consider specifying timeouts for reads as with in-kernel sockets.

3.6 Packet classification

When packets arrive in the network channel, the only context associated with the data is the length of the packet and the device it was received on. To efficiently match packets with channels, we need to solve the protocol demultiplexing problem associated with moving the network stack out of the kernel. This is done in the network channel layer, in the kernel, in two stages: (1) matching the packet to a protocol family in NetRecv, and (2) in the protocol family-specific code, calling a small protocol-specific matching function using extra context (such as the protocol field in the IP header)14 for transport-layer protocols such as TCP, UDP and ICMP, along with a generic IP layer.

The only address family that has been implemented so far is the IPv4 suite of protocols, which currently also includes other lower-level IP-related protocols such as ARP. A flowchart of the various steps taken when receiving an IPv4 packet is illustrated in Figure 3.9. The process is described in depth below:

3.6.1 Family matching

When the device driver receives a packet, it calls NetRecv, often through eth_recv or similar in the device driver layer. At this stage, we have a packet buffer and a set of address family operations (an array of ChannelOps structures – see Figure 3.2 for the layout), and we must first match the incoming packet with the correct address family.

To simplify the process, we can work from the assumption that network interfaces will generally receive packets from the same address family15 – it then follows that the address family of the current received packet is likely to be the same as that of the last packet.

14The protocol field is an 8-bit integer ID describing the protocol the IP packet encapsulates; for instance, TCP is given value 6, and UDP 17. This is useful in matching packets to protocol-specific handlers via a function pointer table indexed by the ID.

15This is an assumption that does not always hold. On the local area network used for testing, packets from both IPv4 and IPv6, which can be considered to be different address families, were received. Obviously there were more IPv4 packets than IPv6, but this assumption may be under question in the future.


Figure 3.9: A flowchart of receiving an IP packet, as detailed in Section 3.6, and matching it with a channel. Processing of the packet takes place in an application-specific manner. Compare with the process for processing an IP packet in Linux, described in Section 1.1.2.

Working from this assumption, we store a pointer in the NetDevice structure to the ChannelOps structure of the address family that successfully handled the last packet. The recvBuffer method is used to handle the incoming packet (and hopefully match it to a channel); it will return CHAN_HANDLED if matched, and CHAN_NOT_HANDLED if it becomes apparent the incoming packet does not match the address family.

In the case where the recvBuffer operation does not handle the packet, or where it is the first received packet on an interface, we also try the recvBuffer of other address families. If one of them matches, we (not always correctly) predict that the next packet will belong to the same address family. However, as there is only one address family supported (IPv4), this infrastructure is only useful when more address families (like IPv6) are implemented. If no address family can successfully recognise the packet, it is silently dropped by returning from NetRecv.
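The per-interface caching of the last successful family could be expressed roughly as below; the lastFamily field, the familyOps table and numFamilies are assumptions about the kernel’s internal bookkeeping, but the control flow (try the cached family’s recvBuffer first, then fall back to the others) follows the description above.

/* Illustrative sketch of family matching in NetRecv. */

void NetRecvSketch(struct NetDevice* device, struct NetBuffer* buffer)
{
    int i;

    /* Fast path: assume the packet belongs to the same address family as
       the previous packet received on this interface. */
    if (device->lastFamily &&
        device->lastFamily->recvBuffer(buffer) == CHAN_HANDLED)
        return;

    /* Slow path: try the other registered families and remember the winner. */
    for (i = 0; i < numFamilies; i++) {
        struct ChannelOps* family = &familyOps[i];

        if (family == device->lastFamily)
            continue;

        if (family->recvBuffer(buffer) == CHAN_HANDLED) {
            device->lastFamily = family;
            return;
        }
    }

    /* No family recognised the packet: it is silently dropped. */
}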

3.6.2 Protocols: Matching packets to channels

Once we have matched the incoming packet to the address family in recvBuffer (if the packet does not belong to the address family, the function returns with a CHAN_NOT_HANDLED status), we can start matching the packet to a particular channel. To take the IPv4 address family as an example, the following steps in Ipv4RecvBuffer (depicted in Figure 3.9) are performed to match the incoming packet to a protocol and channel:

1. The packet type is read from the data-link layer header – in our case the type field in the Ethernet header – and inspected by Ipv4RecvBuffer. If it is an ARP packet, the IPv4 code lets the hardware address cache (Section 2.5) handle the packet. Ipv4RecvBuffer then returns, signalling it has handled the packet.

2. Having handled any non-IP packets (i.e. ARP) in special cases, we then return if the packet is not marked as an IP packet.

3. We can now inspect the packet to verify it is an IPv4 packet, and retrieve the source and destination addresses (along with port numbers for TCP and UDP) from the IP header.

4. If it is an ICMP packet, this is handled as a special case and the packet is copied to all ICMP channels. ICMP channels essentially function like the raw sockets described earlier, because of a lack of port multiplexing in the protocol.

5. Using the protocol ID in the header, we find the protocol operations structure (ProtoOps) that corresponds to the packet’s protocol (either TCP or UDP). If there is no operation structure for the protocol, then we cannot handle it.


struct RouteEntry {
    DWORD destination;
    DWORD gateway;
    DWORD mask;
    int flags;
    struct ListHead next;
    struct NetDevice *device;
};

Figure 3.10: The RouteEntry structure, similar in format to routing tables in other network stacks. The destination address of a packet is used to retrieve the corresponding RouteEntry in RouteLookup; if a gateway is present, the hardware address is set to that of the gateway.

6. We use the protocol operation fwInput to verify that we are allowed to accept the packet. If we must drop it, we signal to NetRecv that we cannot handle the packet.

7. We then hash the source and destination address of the packet to locate the protocol-specific ChannelHead structure.

8. We then search the linked list represented by the returned ChannelHead structure using ChannelSearchList. There are two cases:

(a) We have successfully matched the packet to a channel (using the protocol-specific compareFunc passed to ChannelSearchList). If so, we copy the packet to the first free receive buffer in the channel and increment latestRecv. If there are no free buffers, we drop the packet.

(b) We could not find a channel to match the packet to. In this case, we call the chanNotFound function, which sends back a small RST packet (for TCP).

Although much of the implementation of the recvBuffer function is protocol-specific, the main steps apply to any protocol family: (1) inspection, (2) verification, (3) firewalling, (4) matching and (5) copying.
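The five generic steps can be summarized as a skeleton of the family’s receive hook. In the sketch below, the ProtoOps layout, the helpers Ipv4HeaderOf, Ipv4VerifyHeader and hashLookup, and the placement of fwInput, compareFunc and chanNotFound on ProtoOps are assumptions; the ARP and ICMP special cases are omitted and error handling is condensed.

/* Condensed sketch of an IPv4-style recvBuffer implementation. */

int Ipv4RecvBufferSketch(struct NetBuffer* buffer)
{
    struct Ipv4Header* header = Ipv4HeaderOf(buffer);        /* (1) inspection */

    if (!Ipv4VerifyHeader(header))                           /* (2) verification */
        return CHAN_NOT_HANDLED;

    struct ProtoOps* proto = protoTable[header->protocol];   /* e.g. TCP = 6 */
    if (proto == 0)
        return CHAN_NOT_HANDLED;

    if (proto->fwInput(buffer) == FW_DROP)                   /* (3) firewalling */
        return CHAN_NOT_HANDLED;

    struct ChannelHead* head = proto->hashLookup(header);    /* (4) matching */
    struct Channel* channel = ChannelSearchList(head, proto->compareFunc, buffer);

    if (channel == 0) {
        proto->chanNotFound(buffer);     /* e.g. reply with a TCP RST */
        return CHAN_HANDLED;
    }

    ChannelCopyPacket(channel, buffer);  /* (5) copy in and bump latestRecv */
    return CHAN_HANDLED;
}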

3.7 Routing

In the network channel layer, routing outgoing packets is the opposite problem to the classification of incoming packets. Given a channel and an outgoing packet, we must choose the network interface used to transmit the packet (and the hardware address to send it to). Contrast this with classification, where we are given an incoming packet and its network interface, and must match it to a channel. Currently, the routing functionality is IPv4-specific and contained in net//route.c, but it can be generalized to suit other protocol families.


Taking the IPv4 protocol family code as an example again, we inspect the destination IP address of the outgoing packet in Ipv4Write (unless we are in "raw channel" mode, such as the channels used by dhcp), and pass this to the core of the routing component: RouteLookup. This retrieves a RouteEntry (Figure 3.10), given an IP address. If the gateway field is non-zero, then the packet must be routed outside of the local network by sending it to the gateway. The implementation is similar to implementations in other networking subsystems, so it will not be covered in depth here.

The routing table is constructed with information from interface drivers (such as Loopback0, which knows packets sent to 127.0.0.0 to 127.255.255.255 should be routed to it) and the dhcp utility, which uses dynamic host configuration information from the local network to build the routing table. A routing utility may be developed in future that would also allow the addition, listing and removal of table entries, but only addition of entries via IcFs (/Network/Routes/addRoute) is supported at the moment.
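A lookup over the RouteEntry structures in Figure 3.10 could look like the sketch below: a masked comparison of the packet’s destination against each entry. The list iterator macro and the global list head are assumptions, and the real table may be organized differently.

/* Illustrative sketch of route lookup: return the first entry whose masked
 * destination matches the packet's destination address. */

struct RouteEntry* RouteLookupSketch(DWORD destAddr)
{
    struct RouteEntry* entry;

    ListForEachEntry(entry, &routeList, next) {   /* assumed ListHead iterator */
        if ((destAddr & entry->mask) == (entry->destination & entry->mask))
            return entry;    /* if entry->gateway != 0, send via the gateway */
    }

    return 0;    /* no matching route: the packet cannot be transmitted */
}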

3.7.1 Firewall

The Whitix stateless firewall, a simple packet filtering mechanism, also inspects the packet before it is routed to an interface (as well as after a packet has been received on the interface and sent to a protocol family). It was included to show that such filtering and security is still possible, even if most protocol processing takes place in userspace. Currently, firewalling is supported for the UDP and TCP transport protocols.

The core functionality is the FwRuleMatch function, which, given a rule and packet context (currently the protocol, source and destination port), returns FW_ACCEPT, meaning protocol processing can continue, or FW_DROP, meaning the packet must not be processed further (matched to a channel or sent on an interface). FwRuleMatch is generally called from protocol-specific code, either on input or output. The abstraction of a firewall rule is the IpFwRule structure, which is translated from an IpFwUserRule structure passed to the kernel via the IcFs interface (at /Network/Firewall/).

There is no support for ordering rules by generality, so a general rule to drop all input packets, if inserted first, will always take precedence. There is also no support for listing or removing rules, but this can be trivially added. See the description of the user application firewall (Section 4.5.1) for a further description of the firewall rule format and the available filtering mechanisms.
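The following sketch illustrates how the rules described here might be applied to a packet. The real FwRuleMatch operates on a single rule and packet context; the rule field names, the rule list and the zero-means-wildcard convention below are assumptions for illustration.

/* Illustrative sketch of stateless filtering: walk the rule list and apply
 * the verdict of the first rule whose criteria all match the packet. */

int FwCheckPacketSketch(int protocol, unsigned short sourcePort,
                        unsigned short destPort, int direction)
{
    struct IpFwRule* rule;

    for (rule = firstRule; rule != 0; rule = rule->next) {
        if (rule->protocol && rule->protocol != protocol)
            continue;
        if (rule->direction && rule->direction != direction)
            continue;
        if (rule->sourcePort && rule->sourcePort != sourcePort)
            continue;
        if (rule->destPort && rule->destPort != destPort)
            continue;

        return rule->verdict;   /* FW_ACCEPT or FW_DROP */
    }

    return FW_ACCEPT;           /* no rule matched: allow the packet */
}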

3.8 Testing

The main test framework used to test the network channel layer involved automated testing of the channel system calls using the chan_test test suite. The test suite could be described as grey box testing of the layer; we have knowledge of the internal data structures and algorithms when designing the test suite, but we test at the userspace level, just like typical users of the API.


Test set #1: Channel creation
=============================
Test #1.1: Invalid parameters to ChannelCreate ... [ passed ]
Test #1.2: UDP: Zero IP address as destination ... [ passed ]
Test #1.3: Invalid number of send and receive pages for channel ... [ passed ]

Test set #2: ChanSendBuffer: memory map
=======================================
Test #2.1: Create valid UDP network channel ... [ passed ]
    numSendBuffers > 0 ... [ passed ]
    numRecvBuffers > 0 ... [ passed ]
    numSendPages > 0 ... [ passed ]
    numRecvPages > 0 ... [ passed ]
Test #2.2: Allocate one send buffer ... [ passed ]
    correct address ... [ passed ]
Test #2.3: Freeing buffer then allocating again returns same buffer ... [ passed ]
Test #2.4: Allocating all (32) buffers ... [ passed ]
    Verifying buffer allocation fails afterwards ... [ passed ]
...

Figure 3.11: Part of the output of the chan_test program. The test suite tests a range of conditions, using the channel system calls and the usercode library to verify the channel implementation works correctly. Although it is not yet an exhaustive test, the test suite covers as much of the channel code as possible, with a focus on IPv4 channels.

Figure 3.11 depicts part of the output of the test suite. The test suite is divided into sets, covering tests involving creation, send and receive buffer mapping, and sending and receiving packets, and covering many of the channel library calls as well. The tests also cover the IPv4 protocol family code, but the tests can be easily reused across multiple protocol families.

chan_test was useful in locating bugs. One example involved the send region allocation map. The bit operation BitTestAndSet was used in uChanSendBufferAlloc, a usercode library function. Testing the creation of channels where the number of send buffers was larger than 32 caused the test to fail – uChanSendBufferAlloc seemingly failed to allocate all the send buffers in the channel. Using this test, I discovered that the processor instruction that the BitTestAndSet function was using, bts, only accepted a bit argument of 0 to 31, as the supplied argument was taken modulo 32. The automated test allowed me to identify the issue with ease (after all, the issue was located in the few lines of code in uChanSendBufferAlloc), which would have been hard to find later when sending packets using the userspace network stack.

Although the automated test suite was the only dynamic testing performed on the layer, I also utilized static testing methods, such as reviews and walkthroughs. I reviewed the core kernel code regularly, often with a focus on justifying each operation and walking through rare cases in functions such as Ipv4ChannelCreate and ChanWrite, to focus on cases where errors were not detected or pointers were not validated.


An ideal testing aid, especially for the automated test suite, would have been a mock null protocol family, so that I could isolate the generic channel layer and test it with a variety of invalid inputs – this would make sure that invalid inputs were detected by the generic layer, rather than by just one protocol family.

One area of testing I should have explored more was destructive testing. One example of such a test would have been setting the fields of a ChanSendBuffer header to invalid values (having knowledge of the internal structure) – there are only a few such tests, and those are simple destructive tests, such as setting the entire buffer to zero and attempting to send it by calling SysWrite. Most of the memory corruption issues (and the lack of checking) were caught during testing of the various application programs of the network stack, and not by the channel test suite first.

3.9 Discussion

Network channels are a useful basis for handling packet I/O in a protocol-transparent manner and helping to transfer work out of the kernel. They would also be useful in a kernel-based network stack, avoiding the cache-unfriendly linked lists prevalent in many current implementations. Aside from some concerns about race conditions, they are also reasonably secure, and it is difficult to undermine the verification procedure for outgoing packets. Channels, especially during transmission, are multiprocessor-friendly – for one thing, all the work of transmitting the packet is performed on the same CPU, and there is typically only one shared cache line (the allocation maps) that is atomically updated by multiple processors. Incoming packets are classified efficiently into protocol families and protocols in a generic manner – further research would most likely replace the simplistic hash table in the IPv4 code with a lookup-friendly data structure such as a trie for further performance gains.

Even though we have shown that it is possible to avoid time-of-check-to-time-of-use security holes by verifying the packet immediately before transmission, the fact that we can theoretically send attack packets is not ideal. We could deny access to network channels to programs whose packets pass all verification except the final check before transmission; a checksum mismatch on the packet indicates that either they are corrupting their packets maliciously or they are overwriting channel memory (in either case, making the two causes of the bug as obvious as possible is useful). Even though it is becoming increasingly hard to construct useful attack packets (and regardless, attackers can still do so using their own computer and raw sockets), we should protect against future attacks by including other mechanisms.

Limiting the bandwidth usage per network channel will be important on public systems. If we do not enforce resource usage limits, or bandwidth quotas, then one process could perform a denial-of-service attack against the system, denying other processes and channels the bandwidth resources needed to operate. A quota system could be implemented that enforces limits on the amount of data processes or users could send using network channels (such functionality already exists in the Linux firewall iptables). With kernel-based sockets, this is not a problem, as the kernel’s network stack controls the data output through the use of send windows (depending on the protocol) and socket pages in the kernel.

3.10 Summary

This chapter detailed the network channel – a new kernel object for handling packet I/O with high performance. We first evaluated the current work and implementations of network channels: where they were lacking and how their design was changed in this implementation. Other possibilities, such as userspace network drivers and raw sockets, were evaluated and discarded for security and performance reasons along the way. We described how channel management, via the new system calls SysChannelCreate and SysChannelControl, worked by representing channels as files, and depicted the standard memory layout of the channel, how transmission and receive operations interact with the shared memory, and the design's scalability and performance advantages.

Packet classification, which takes place in the kernel and involves matching incoming packets to protocol families, protocols and channels, is then outlined for IPv4 – including the in-kernel packet filter that filters the packets of most transport protocols. We summarize the routing that takes place in the channel layer, matching outgoing packets to interfaces. Finally, we demonstrate the automated test suite chan_test and how it helped discover bugs and improve stability in the new network channel layer.

Chapter 4 Userspace networking

The network channel objects described in the previous chapter are of limited use to most applications. We would like to present the abstraction of connections and packets to applications. To do this, the userspace network stack provides support for a range of protocols from the TCP/IP suite, including implementations of TCP, UDP and ICMP and a DNS resolver. The userspace network stack provides an implementation of the transport and session layers to the application.

We then list the various utilities and applications written to demonstrate and complement the networking subsystem developed for this project, and note any applications that use unique features of the network stack, including the extensibility and adaptability described in Chapter 5. Distributed with the operating system and the networking subsystem is a simple HTTP server, httpd, FTP and Telnet clients, ftp and telnet, and a name resolver, dns. Networking utilities include nprof, which displays network statistics and profiling data collected from applications, firewall, for managing packet filtering rules, as well as key utilities such as dhcp and ping.

4.1 Background

This project is not the first implementation of a userspace network stack that uses the concept of network channels, albeit in a different form, to prove that alternative network stack designs are possible. In 2006, Evgeniy Polyakov, who also wrote a patch-set to implement network channels in the Linux kernel (mentioned in Section 3.1), wrote a small network stack, unetstack, focusing mainly on the TCP implementation,1 as a small proof-of-concept.

Although Polyakov's implementation is fairly complete in terms of basic TCP features, including TCP timestamps, fast retransmit support and support for UDP and TCP, it is lacking in several areas, such as a general socket API (the userspace network stack is distributed as part of a test program, not as a shared library) or the correct socket semantics for server sockets (no new network channels are created to handle server connections using accept).

1The code is available at http://www.ioremap.net/cgi-bin/gitweb.cgi?p=unetstack.git;a=tree

Nevertheless, I have based the design of my TCP state machine and TCP option handling on Polyakov's, but many of the semantics, especially those relating to server sockets, are completely different. There is also little similarity in how the network stack interfaces with the channels, as well as a different approach to allocation (Section 3.4.1). I have also added support for other features not found in Polyakov's stack: retransmission timeout estimation, and of course the adaptability, profiling and interactivity APIs (Chapter 5).

4.1.1 Microkernel research

unetstack is not the only network stack that has been placed in userspace. Other researchers have utilized different kernel objects to transfer the network stack into userspace – it is a valid research objective, as the network stack logic is especially vulnerable to outside attack, and most of the code involves manipulating data structures and therefore does not require privileged execution in a monolithic kernel. One example is the 1995 paper "Experiences implementing a high performance TCP in user-space" by Aled Edwards and Steve Muir,[20] which developed a userspace network stack running on the JetStream token-ring network2 – even back then, the performance of the userspace TCP implementation increased by 50%, by reducing the number of data copies per packet from two to one.

The research before then on userspace network stacks mainly involved microkernels, a hot topic in operating system research at that time, and two different implementations for the Mach microkernel were developed. The first, described in "Implementing Network Protocols at User Level" by Chandramohan Thekkath et al. (1993),[62] argues that considerations of development efficiency and application-protocol mismatches mandate a more distributed network stack, with the network stack moving to userspace and divided into protocol libraries.

Thekkath et al. develop three components: a set of transport protocol libraries (UDP and TCP), an in-kernel network I/O module to provide protected access to the network by the libraries, and a registry server that allocates and deallocates communication endpoints. They face the same problems with demultiplexing of incoming packets, and they even mention transport protocol adaptability to applications as a benefit, but do not mention an implementation. "Protocol service decomposition for high-performance networking" by Chris Maeda and Brian Bershad (1994) describes a similar design to [62], but moves most of the non-performance-critical operations and protocol management code, such as ARP requests and replies, to a single userspace network server. The performance-critical code interacts with the network stack directly via protocol libraries.

In summary, the two research projects involving the Mach kernel match well with the flexibility and adaptability-related goals of the project, but suffered poor performance in many cases due to excessive data copying (a consequence of the more distributed architecture) – they also did not provide convincing reasons (through their implementations) why the network stack should be placed in userspace.

2A token-ring network uses a special data-link message, known as a token, to control access to the medium.

In contrast, Polyakov's implementation is similar to mine, mainly because we are both adopting the network channel as a shared static array of buffers, but the scope of his project and implementation is much smaller, and the goals and mechanisms used are different.

4.2 Overview

The only kernel object directly exposed to userspace through a set of system calls is the network channel. Although the kernel supplies a usercode library and supports classifying a variety of protocols, the channel is intended as a low-level object for zero-copy I/O between userspace and the kernel (with a focus on network transmission), unlike a kernel-level socket, which represents one endpoint of a bidirectional communication link. In short, the network channel has much more general applicability to data transfer than the socket, at the expense of only providing a low-level interface to userspace.

The userspace network stack provides the main abstraction for application developers wishing to use networking in their application. It is still possible for userspace applications to use the raw channel system calls, but doing so can be error-prone (as described in the previous chapter). The userspace network stack comprises wrappers and abstractions around all levels of the TCP/IP protocol suite, including a simple userspace abstraction of the network channel for raw network channel access (used by dhcp, for example), and provides support for UDP, ICMP and TCP to applications through two APIs: the Whitix native API, designed for adaptability and interactivity, and the POSIX API, supported for UNIX applications that use it.

Networking utilities interact with the userspace network stack and other parts of the networking subsystem through the Information and Configuration filesystem (IcFs). One example is firewall, the Whitix packet filter, which processes rules specified by the user and adds and removes firewall rules using the IcFs interface in /Network/Firewall/, without interacting with the network stack (unless functionality to test new firewall rules is added). These rules are immediately processed by the kernel firewall and added to the appropriate chain of rules.

Other utilities interact exclusively with the network stack. ping, used to verify that a host is reachable, constructs ICMP Echo Request messages using the network stack. dns verifies that a DNS hostname resolves correctly. dhcp uses raw network channels to assign an IP address to a network interface. nprof does not directly interact with any network stack component, but reads and synthesizes the output of the network stack from a run of a networked application.

Applications constitute the majority of the users of the network stack. Applications (and utilities3) link to the libnetwork.so userspace shared library, and then call functions from one of the two APIs to use networking functionality in their application. For POSIX applications, which Whitix supports for compatibility reasons, the POSIX API is seamlessly supported by the libposix shared library. The usage pattern of the network stack depends on the application and protocol: ftp and telnet use TCP client sockets, and httpd creates a TCP "server" socket to respond to HTTP requests.

3The differentiation between utility and application is traditionally a characteristic of the program's function, rather than its form.


Figure 4.1: The network stack resides in userspace as a shared library and is the only component of the networking subsystem that applications interact with. Utilities interact with the network stack and kernel-level components such as the firewall and routing table (via the Information and Configuration filesystem).


Figure 4.2: The userspace network stack architecture. The network stack is distributed as a shared library (.so). The diagram shows the usercode library copied down to userspace – this is performed by the kernel, and the code is in fact mapped into every process' address space. The statistics and events components are covered in Chapter 5.

4.3 Network stack

4.3.1 Architecture

The userspace network stack and library comprises a number of components that interface either with the kernel or with applications and utilities. Figure 4.2 depicts the components and the interactions between them. The userspace network stack itself is distributed as a shared library; applications link to the libnetwork.so library that contains the socket API, protocol implementations and various libraries. As a result, much of the network stack code is shared amongst applications in memory; however, different processes contain different network stack state in the data section of the shared library.

To implement the userspace network stack and provide primitives for socket I/O and name resolution, the libnetwork.so library contains the following components:

• Network channels. Functions to control network channels and transmit and receive data on them are provided. These mainly wrap around and directly link to the usercode library provided by the kernel, although higher-level operations such as blocking receive are also provided. The Channel structure, created when the kernel channel object is created using ChannelCreate, provides a wrapper for the lower-level channel file descriptor, as well as the memory resources associated with the channel.

• IP layer. Protocols that use IPv4 use the IP layer for constructing IP headers for later transmission. The IP layer is also used for receiving packets from the network channel layer, and verifies the packet's IP header before returning the packet as valid data to higher layers.

• ICMP and UDP sockets. These two protocols are fully supported by the userspace network stack. Applications can create UDP servers and clients using the native API, and transmit and receive packets with the usual semantics. Because of the simplicity of the two protocols, we have little to add in original contributions to their implementation in the network stack.

• DNS resolver library. This component will probably be moved into a separate library in the future (libdns.so), but it is included in the main network library because traditionally name resolution functions are available along with sockets (cf. gethostbyname in libc.so in Linux). As well as providing a high-level function that resolves a human-readable domain name to an address, the DNS resolver code also provides functions for the construction of DNS request packets and the parsing of DNS reply packets; these functions are used by command-line utilities such as dns.

• TCP sockets. The most complex part of the network stack is the TCP implementation, which supports all the major features of a typical TCP implementation, along with code for gathering statistics, profiling and using profiling data for the adaptive algorithms. It abstracts TCP packets as a reliable, ordered delivery of a stream of bytes, and handles slow start, congestion avoidance, window sizes, all TCP states and connection establishment and termination. Many of the advanced features unique to our implementation are covered in detail in the next chapter; however, we summarize the architecture of the TCP implementation in this chapter.

• Statistics and profiling. Statistics from TCP connections are collected while the connection is active and written to disk after the application exits, to be later read by programs such as nprof. This component also reads in profiling data from a previous run of the application and uses it as additional input for fine-tuning the TCP adaptive algorithms, which is described in the next chapter.


struct SocketOps {
    int (*accept)(Socket* socket, Socket* child, struct SockAddr* addr);
    int (*bind)(Socket* socket, struct SockAddr* addr);
    int (*listen)(Socket* socket, int backlog);
    int (*create)(Socket* socket);
    int (*send)(Socket* socket, const void* buffer, unsigned long length, int flags);
    int (*sendTo)(Socket* socket, const void* buffer, unsigned long length, int flags, struct SockAddr* dest);
    int (*recvFrom)(Socket* socket, void* buffer, unsigned long length, int flags, struct SockAddr* dest);
    int (*recv)(Socket* socket, void* buffer, unsigned long length, int flags);
    int (*connect)(Socket* socket, struct SockAddr* sockAddr);
    int (*connectEx)(Socket* socket, struct SockAddr* src, struct SockAddr* dest, int flags);
    int (*shutdown)(Socket* socket);
};

Figure 4.3: The socket operations structure. All of these operations are available to user applications by calling functions such as SocketAccept, SocketBind etc., which are wrappers around this simple form of object-oriented design. Depending on whether the protocol is connectionless or connection-oriented, the exact set of functions implemented by a protocol differs.


• Asynchronous I/O and events. The network stack provides a set of functions for handling events from, and directly interacting with, the network stack. Asynchronous I/O functions, such as SocketAsyncSend and SocketAsyncRecv, are special cases of network stack events and are handled accordingly. Applications such as httpd, ftp and telnet use these functions for performance reasons. The mechanisms involved in event registration and notification are detailed in the following chapter.

4.3.2 APIs

Two application programming interfaces (APIs) are provided for use in Whitix. There is the native API, which is part of the userspace network stack in libnetwork.so, and the POSIX API in libposix.so, which wraps around the native socket API and is intended to support POSIX applications ported from other operating systems.

The two follow the same methodology in exposing networking functionality. Both abstract connection endpoints as sockets, and the functions have similar names and the same order of parameters. However, the native API includes sets of functions for statistics, events and asynchronous I/O, which either do not exist, are operating system-specific, or are not widely implemented in the POSIX API. In short, the POSIX API supplies a subset of the functionality of the native API and the network stack.

Native

The native API is specific to Whitix. The main handle in the API is the Socket structure pointer, created using SocketCreate in a similar fashion to socket in libposix.so. Most of the basic socket functions (SocketSend, SocketRecv etc.) wrap around their corresponding entry in the SocketOps structure (Figure 4.3). As a result, if an operation is not implemented by a protocol, the native API will return an error.

The main addition, and the reason for creating a separate API for the userspace network stack, is that we can deal directly with socket objects (rather than indirectly through file descriptors or indices) and easily expose an interface to the new functionality of the userspace network stack. Functions such as TcpSetTransferSize take a Socket pointer (the function verifies that it points to a TCP socket) and interact with the TCP implementation directly through the generic network events API detailed in the next chapter.
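As an illustration of the native API, a minimal TCP client might look like the sketch below. The exact prototypes of SocketCreate and the other calls are assumptions based on the operations structure in Figure 4.3 (in particular, the SOCKET_STREAM constant and the SocketCreate argument order are hypothetical), so this is a sketch of the usage pattern rather than a verbatim excerpt.

/* Sketch: connect to a server, send a request and read a reply
 * using the native socket API in libnetwork.so. */
int FetchReply(struct SockAddr* server, void* reply, unsigned long replyLength)
{
    Socket* socket;
    int received = -1;

    if (SocketCreate(SOCKET_STREAM, &socket) < 0)     /* create a TCP socket */
        return -1;

    if (SocketConnect(socket, server) == 0) {         /* wraps SocketOps->connect */
        SocketSend(socket, "HELLO\r\n", 7, 0);        /* wraps SocketOps->send */
        received = SocketRecv(socket, reply, replyLength, 0);
    }

    SocketClose(socket);                              /* shut the connection down */
    return received;
}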

POSIX

The POSIX sockets API is part of the general POSIX compatibility layer available in libposix.so. It emulates the networking socket layer user interface of UNIX-like OSes; functions, instead of dealing directly with a Socket pointer, pass around a file descriptor to functions such as connect, send, recv and shutdown, which in turn call their native equivalents. Essentially, the POSIX API is a thin wrapper over the native API, which is itself a thin wrapper over the protocol-specific code and essentially the network stack itself.

This file descriptor actually indexes into an internal table of file pointers, with each file pointer having a different type (e.g. socket or disk file). The action for generic functions like close depends on the type of the file structure pointed to by the file descriptor; for normal files, the SysClose system call is issued, but for sockets, we call the SocketClose function in libnetwork.so to shut down the connection.

The "file descriptor as socket pointer" metaphor runs into trouble with the poll function, which, to work effectively, must poll both sockets and files efficiently. The workaround for TCP sockets, which has so far not been implemented, would involve creating a local pipe using the SysPipe call during socket creation, handing the write end to the network stack and the read end to the application. Whenever application data arrives at the socket, the network stack would write to the pipe to signal any waiters in SysPoll that there was new data available.4 The only disadvantage of the workaround is that the POLL_OUT descriptor event would no longer signal, as writing to the read end of the pipe is never allowed. A workaround for that is to create pipes in both directions, so that the pipe whose write end is given to the application would always signal POLL_OUT. The overhead of this workaround has not been measured, but for servers that have a large number of client connections and poll using the POSIX API, tripling the number of file descriptors per connection (one channel and two pipe file descriptors) may result in considerable overhead when translating poll to SysPoll calls in the kernel.

4Presumably UDP and ICMP sockets, which have no transport-level messages like TCP, could directly poll the channel using the channel file descriptor.
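The dispatch on file type described above (for generic calls such as close) might look roughly as follows; the FileEntry structure, the FILE_TYPE_* constants and the table layout are assumptions used only to illustrate the idea.

/* Sketch of the POSIX close() wrapper in libposix.so: the descriptor
 * indexes a table of typed file entries, and the action taken depends
 * on whether the entry is a socket or an ordinary file. */
typedef struct {
    int type;            /* FILE_TYPE_DISK, FILE_TYPE_SOCKET, ... (hypothetical) */
    int kernelFd;        /* underlying Whitix file descriptor, if any            */
    Socket* socket;      /* native socket pointer when type == FILE_TYPE_SOCKET  */
} FileEntry;

extern FileEntry fileTable[];

int close(int fd)
{
    FileEntry* entry = &fileTable[fd];

    if (entry->type == FILE_TYPE_SOCKET)
        return SocketClose(entry->socket);   /* shut down the connection */

    return SysClose(entry->kernelFd);        /* ordinary file: kernel close */
}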


typedef struct tagChannel {
    int fd;
    ChannelAddr src, dest;
    char* baseAddress;
    ChannelInfo* info;
} Channel;

Figure 4.4: In the network stack, the Channel structure keeps track of key resources associated with the channel, including its file descriptor (fd), the address pair for connections (src and dest), the memory associated with the channel (baseAddress) and a pointer to the information structure of the channel (kept in the shared memory between userspace and kernel), the ChannelInfo structure.

4.3.3 Internal layers: channels and IP

The channel layer of the network stack is not designed to be directly used by applications, except in certain utilities such as dhcp (see Section 4.5.3). Instead, as depicted in Figure 4.2, the higher-level protocol implementations use a small set of channel-centric network stack functions and the usercode library to interact with the kernel.

The main abstraction in the channel layer is the (small) Channel structure, shown in Figure 4.4. Each socket has a pointer to a private instance of the structure, and the functions in the channel layer, such as ChanBufferSend and ChanBufferRecv, use it to access the resources associated with that channel when issuing a system call to the kernel. Other structures used include the opaque pointers ChanSendBuffer and ChanRecvBuffer, used for representing a send and a receive network channel buffer respectively.

Other functions wrap directly around the usercode library provided by the kernel (Section 3.4.2), passing the baseAddress field, which points to the header page of the channel, to the main usercode library function. For example, ChanSendBufferAlloc, which allocates a ChanSendBuffer and the corresponding data pointer, calls uChanSendBufferAlloc (and then uChanSendBufferData to retrieve the data pointer) – both are function pointers that point to their implementations in the usercode library (see Appendix C) and directly manipulate the shared memory structures of the network channel on behalf of the channel layer.
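For example, the allocation path could be sketched as below; the argument lists are assumptions, while the usercode entry points uChanSendBufferAlloc and uChanSendBufferData are those described in Section 3.4.2.

/* Sketch: allocate a send buffer on a channel and obtain its data pointer.
 * baseAddress points to the channel's header page (see Figure 4.4). */
ChanSendBuffer* ChanSendBufferAlloc(Channel* channel, void** data)
{
    ChanSendBuffer* buffer;

    /* The usercode function pointers manipulate the shared memory directly. */
    buffer = uChanSendBufferAlloc(channel->baseAddress);
    if (buffer == NULL)
        return NULL;                 /* no free slots in the send region */

    *data = uChanSendBufferData(channel->baseAddress, buffer);
    return buffer;
}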


Figure 4.5: The layout of an IP packet in the channel send buffer represented by ChanSendBuffer. The send header is filled in by functions in the usercode library, and the hardware header and part of the IP header (the source address and checksum) are filled in by the kernel. The rest of the packet is constructed by the userspace network stack.

IP layer

The IP layer is not directly responsible for packet transmission, but IP layer functions are called to construct the packet. IpBuildHeader, given information about the packet, prepends an IP header describing the packet, following the format depicted in Figure 4.5.

All protocols in the userspace network stack use a common set of functions for receiving an IP packet on a channel. These functions, IpChanRecv and IpChanRecvNb, call the channel-layer equivalent to receive a ChanRecvBuffer pointer to the packet from the network channel, and perform basic error checking on the packet. For example, the received packet length is compared against the IP header's idea of the packet's length; if the received length is less, the packet has been truncated and may not contain valid data. Eventually, when IP fragmentation support is implemented, reassembly of the packet will take place at this layer.
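The truncation check described above might be sketched as follows; the IpHeader field name and the ChanRecvBuffer accessor functions are hypothetical, and ntohs is assumed to be available.

/* Sketch of the length validation performed when receiving an IP packet:
 * compare the number of bytes actually received against the length the
 * IP header claims, and reject truncated packets. */
int IpVerifyLength(ChanRecvBuffer* buffer)
{
    IpHeader* header = (IpHeader*)ChanRecvBufferData(buffer);   /* hypothetical accessor */
    unsigned long received = ChanRecvBufferLength(buffer);      /* hypothetical accessor */

    if (received < ntohs(header->totalLength))
        return -1;          /* truncated: may not contain valid data */

    return 0;
}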

4.3.4 UDP and ICMP sockets

These are two of the core protocols of the network stack, and are actually very similar in operation. UDP delivers datagrams containing user-specified data to other hosts in a connectionless and unreliable manner. ICMP is used, mainly by the system and utilities, to send error and diagnostic messages. Since both are datagram-based, their mechanisms for sending and receiving data are very similar, but the data carried by them and their usage patterns are very different. They are also connectionless protocols, so little to no state is required for a socket. These characteristics make them an ideal place to start implementation of the higher-level protocols. As a result, both of the implementations are similar in operation, and involve simple packet handling and error checking.

We shall now survey the different aspects of the protocols, noting any unique features of either protocol that affect operation. These include:

• Socket creation. When the SocketCreate function is called, both protocol implementations create a socket, noting their respective operation structures. The UDP socket then calls UdpSocketConnect to create a network channel for later calls; ICMP leaves this to the application or utility to call explicitly before allowing the application to send ICMP messages.


• Socket "connection". Although both are datagram protocols, we must allocate a source port (among other things) using the kernel's framework, and so we must notify the kernel of our resource usage by creating a network channel. Both create a network channel with a zero source address (so the IP-specific code in the kernel can assign a suitable source address and port, as in Section 3.6), and the broadcast address (255.255.255.255) is supplied to the ChannelCreate function to indicate that this socket can send to all hosts. The destination address provided to the function, if not NULL, is used as the default destination address when the application calls SocketSend instead of SocketSendTo. It is also the only address, if specified, from which SocketRecv datagrams can be received, although this is only enforced at the userspace level.

• Sending and receiving data. Both ICMP and UDP transmit data using SocketSendTo, or SocketSend if the default destination address has been supplied by calling SocketConnect (a usage sketch follows this list). The difference between the two protocols involves the input buffer in transmission, since ICMP is a diagnostic protocol and UDP a transport one. Packets sent out over a UDP socket have their header automatically added, whereas applications using ICMP sockets must construct the header before transmitting the packet. The reasoning for these differing semantics is that use of ICMP sockets and channels will in the future be restricted to diagnostic applications run by privileged users (where, in a security context, constructing raw packets is permissible), and diagnostic applications typically need more control over the transmitted packet. For example, the ping application uses the ID field of the ICMP header in ECHO REQUEST packets to identify corresponding ECHO REPLY responses (Section 4.5.5).

Because UDP and ICMP are packet-based protocols, the send and receive calls operate with packet semantics. Transmissions larger than the maximum packet size of the channel (which corresponds to the minimum MTU of any interface) are not allowed, and will return an error. This simplifies the implementation, and only a couple of small packets are sent over UDP in a typical session anyway.

In a similar fashion to sending a packet, UDP and ICMP sockets can receive packets by calling SocketRecvFrom, or SocketRecv if the default destination address has been supplied using SocketConnect. The same header and packet semantics also apply when receiving packets; only one packet is received per call, and if the socket uses the ICMP protocol, the ICMP header is included in the received data. If the caller supplies a buffer that is too small for the entire packet, the packet is partially copied into the buffer, truncated, and the rest dropped.

• Socket close and destruction. Since UDP and ICMP are datagram protocols, closing the socket does not cause any messages to be sent; no connection termination handshaking is involved. The socket is freed (there is no UDP- or ICMP-specific state stored) and the channel silently closed.
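The following is a minimal sketch of the send and receive pattern above for a UDP client; the prototypes and the SOCKET_DATAGRAM constant are assumptions consistent with the operations in Figure 4.3, not the definitive API.

/* Sketch: send one datagram to a server and wait for a single reply. */
int UdpQuery(struct SockAddr* server, const void* request, unsigned long requestLength,
             void* reply, unsigned long replyLength)
{
    Socket* socket;
    int received = -1;

    if (SocketCreate(SOCKET_DATAGRAM, &socket) < 0)    /* create a UDP socket */
        return -1;

    /* "Connecting" allocates the source port, creates the network channel,
     * and sets the default destination used by SocketSend/SocketRecv. */
    if (SocketConnect(socket, server) == 0) {
        SocketSend(socket, request, requestLength, 0); /* one packet, <= channel MTU */
        received = SocketRecv(socket, reply, replyLength, 0);
    }

    SocketClose(socket);
    return received;
}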

4.4 TCP

Since it is estimated that more than 80% of the traffic on wide-area networks such as the Internet involves TCP,[9] it was key that there was a functional TCP implementation that would implement all the major features described in the relevant RFCs,[48] so we could build an expressive, interactive and adaptable event layer on top of TCP. The major features supported are slow start and congestion control with retransmission (including the TCP Vegas congestion control algorithm), TCP client and server sockets, Protect Against Wrapped Sequence numbers (PAWS), Initial Sequence Number Randomization, dynamic receive and send windows, and other options, such as maximum segment size and window scaling.

4.4.1 Design choices

Since TCP involves a complex implementation (compared to the UDP and ICMP datagram protocols), the architecture of the TCP code involved two major design decisions, as well as deciding which optional features should be included, based on how they would advance the eventual goals of adaptability and interactivity or improve security.

The first decision involved how to handle incoming packets in TCP. Since datagram protocols do not have stream management, there are no transport-layer-specific packets that must be handled in UDP or ICMP. Since we have to send out ACKs and handle connection initiation and termination without the knowledge of the application (to obey general socket semantics), should we handle incoming packets in-band, in a single-threaded approach, whenever the application receives data,5 or should we have a multithreaded approach, where a separate thread or set of threads waits and polls for incoming data?

In the end I chose the latter, multithreaded option. From initial observation, the single-threaded route results in spurious retransmission of packets from the remote host; this is because if a socket is rarely used, acknowledgement packets are not sent when remote data arrives, but when the socket is next used (either to send or receive data). The remote host might also deduce that the network connection is congested and reduce bandwidth, reducing performance further.

Instead, newly created sockets (both client and server) are polled by a single thread, and packets are handled and processed as soon as they arrive. This results in low latency for ACK packets, and as a result the network stack may be busy even if the application is not; this may be a more efficient use of processing time, instead of paying the processing cost when the application chooses to receive or send data, which could be latency-critical. As a downside, this may result in less processing time allocated to each socket if SysPoll indicates there are many incoming TCP packets waiting.

5We may have to check any received packets when we send data as well, so we can send back ACKs for received but unread data to avoid unnecessary retransmissions.

Another feature that differentiates established network stacks is their choice of measurement-based congestion control algorithm. Using the round-trip time estimate (the amount of time from a data packet being sent to its corresponding acknowledgement being received) as a guide, different algorithms adjust the congestion window (the network stack's idea of how many bytes can be sent without overloading the network with too much traffic)6 in different ways. TCP Vegas, the algorithm that I chose to focus my efforts on, measures the round-trip time from each packet and linearly increases or decreases the congestion window to compensate.

Features that were not included (due to constraints on the length of the project) were selective acknowledgement, which would have further reduced wasted bandwidth by avoiding unnecessary retransmission, and a framework for an application to select from a variety of congestion control algorithms for a particular socket's needs.
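As a sketch of the idea, a Vegas-style adjustment step is shown below. This is the textbook form of the algorithm rather than a copy of the Whitix tcp_congestion.c code, and the field names and the alpha/beta thresholds are illustrative.

/* Sketch: TCP Vegas compares the expected and actual sending rates and
 * nudges the congestion window (here in segments) by one per round trip. */
void TcpVegasUpdate(TcpSocket* socket)
{
    double expected = (double)socket->cwnd / socket->baseRtt;   /* best-case rate */
    double actual   = (double)socket->cwnd / socket->lastRtt;   /* measured rate  */
    double diff     = (expected - actual) * socket->baseRtt;    /* segments queued in the network */

    const double alpha = 1.0, beta = 3.0;

    if (diff < alpha)
        socket->cwnd += 1;          /* network underused: increase linearly */
    else if (diff > beta)
        socket->cwnd -= 1;          /* queues building up: decrease linearly */
    /* otherwise leave the congestion window unchanged */
}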

4.4.2 Possible TCP changes

While developing the TCP implementation, I realized several of the features of TCP were aimed at network stacks with a different design – especially in-kernel network stacks where the received data regions are allocated in terms of bytes, rather than buffers in the network channel.

For example, the receive window in flow control becomes irrelevant if we have enough bandwidth to process incoming data but rarely read data from the socket. Because of the way received packets are stored in the network channel's list of buffers (to avoid unnecessary copies, although there are other possibilities), we could handle much more data, but are limited by the advertised receive window.[29]

My belief is that the receive window partly originates from environments where machines of diverse network speeds communicate, and also from the realities of memory management in the kernel; we can only allocate X bytes per socket because, especially in monolithic kernels, the pages allocated cannot be swapped out7 and so the memory is limited. Enforcing a limit on the bytes for each receive window also avoids especially high memory usage for one socket – however, if the received buffers are stored in user memory (so per-process, with more allocated pages per network channel meaning less memory available for the rest of the application) and can be swapped out if needed, there may be one less reason for the concept of a receive window in the protocol.8

6Note that this may differ from the advertised receive window at the opposite host.
7In Linux some classes of pages can, but an interesting situation arises with swapping out network pages if the swap file is located on a networked filesystem!
8Window scaling, a TCP option that allows the scaling of windows beyond 64KB, allows us to treat receive windows as less important in the protocol by setting the value to an artificially high number if needed.


Figure 4.6: The retransmission list comprises the ChanSendBuffer pointers that point to memory in the network channel send region (hence the array layout in the diagram). The oldest transmitted unacknowledged packet is at the head of the list. The transmission list is a linked list of variable-size buffers in ordinary memory that are waiting to be transmitted, but cannot be yet due to the receiver's window being full or no buffers being available in the channel's send region.

If the received buffers are kept in channel memory until the data in the packets is read by the application (ACK packets containing no data are processed and discarded), then, in terms of memory, we are limited by the number of packets we can hold in the buffer at one time. The number can be changed by supplying a value for numRecvBuffers during channel creation, but the important point is that currently the number of buffers per channel stays constant.

This particular channel memory configuration means, in our implementation, that the receive window, defined in bytes, may be larger than the number of packets we can receive to fill the window.9 This is a convincing argument for copying data out of the channel buffer – a sensible approach may be to perform copying only when there are no free buffers in the receive array. However, since the receive region is meant as a temporary area for buffers, data tends to be copied out of it quickly.

4.4.3 Sending packets

Although at first sight sending packets may seem trivial – just construct a packet with the appropriate TCP header, update the relevant sequence numbers and send – the presence of SocketSend calls with a buffer larger than maxPacketSize or the maximum segment size (MSS) of the remote host, and the potential for packet loss (and its cousin, retransmission), complicates the implementation somewhat.

9For example, a channel with N buffers can receive N (non-fragmented) packets in the receive window. If the packets are one byte long and the application does not continually read to free the corresponding packets, then the receive window will, in reality, only be N bytes.


The fact that the remote host may also have a small receive window (or that data is being sent quickly by the application) means we must also buffer data dynamically outside the network channels and transmit when there is free space in the remote host's window.

The typical state of a TCP socket with regard to transmission buffers is depicted in Figure 4.6. Applications wanting to send data through a TCP socket end up calling TcpSocketSend, which attempts to transmit any packets waiting on the transmission queue. If the transmission queue could not be completely emptied, the data to be transmitted is added to the tail of the transmission queue linked list. Otherwise, TcpTryToSend is called, which then calls TcpDoSend if enough space is available in the congestion and receive windows. TcpDoSend sends one buffer, doing the work of splitting the buffer up into multiple TCP packets, updating sequence numbers, and adding packets to the transmission queue if there is no space in the channel send buffer array (which also functions as the retransmission list).

The two queues, the transmission list and the retransmission array, are updated at different times and have different usage patterns. The retransmission list is updated when an ACK is received – packets with a sequence number less than the acknowledged sequence number are regarded as successfully transmitted and discarded. The round-trip time and dependent variables (such as the retransmission timeout) are recalculated. The transmission list is updated in a similar fashion when packets are constructed, sent, and placed in the retransmission list – the two operations are usually sequential.

4.4.4 Retransmission

Following Karn's algorithm and the relevant RFCs,[43] each time a packet is transmitted, the retransmission timer is set up to expire in rto seconds, where rto, the retransmission timeout, is the amount of time the sender will wait for a given packet to be acknowledged. If the retransmission timer expires, the packet is resent by calling ChanBufferSend with the oldest packet not yet acknowledged – the ChanSendBuffer at the head of the retransmission list. The retransmission timeout value is then updated and the retransmission timeout application-level event is fired.
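The timeout recalculation follows the standard smoothed round-trip time estimator from the RFCs cited above; a sketch is shown below, with illustrative field names (the Whitix code may keep these values in different units).

/* Sketch: recalculate the retransmission timeout from a new RTT sample.
 * Karn's algorithm additionally requires that samples taken from
 * retransmitted packets are ignored. */
void TcpUpdateRto(TcpSocket* socket, double rttSample)
{
    if (socket->srtt == 0.0) {
        socket->srtt   = rttSample;            /* first measurement */
        socket->rttvar = rttSample / 2.0;
    } else {
        double deviation = socket->srtt > rttSample ? socket->srtt - rttSample
                                                    : rttSample - socket->srtt;
        socket->rttvar = 0.75 * socket->rttvar + 0.25 * deviation;
        socket->srtt   = 0.875 * socket->srtt + 0.125 * rttSample;
    }

    socket->rto = socket->srtt + 4.0 * socket->rttvar;
    if (socket->rto < 1.0)
        socket->rto = 1.0;                     /* clamp to a sensible minimum */
}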

4.4.5 Receiving packets and the state machine

Packets are received asynchronously through a separate worker thread, as described in Section 4.4.6. Once a packet arrives at the socket, it is processed by TcpHandleData, which first performs header processing (such as the processing of TCP options, described in [48]) and then runs the packet handler for the particular state (one of the eleven shown in Figure 4.7) – for example, sockets in the ESTABLISHED state have their packets processed by TcpEstablished. Most of the state functions filter the packets to find the single packet that moves the socket to the next state in the state machine. For example, the CLOSING state only moves to the TIME_WAIT state once a FIN+ACK packet is received. The most complex state function is the ESTABLISHED handler (this is where the exchange of data takes place), which handles verifying sequence numbers and acknowledgement numbers, receiving data payloads, and handling any FIN packets received.


Figure 4.7: TCP state diagram. There are eleven standard states for a TCP socket; the exact path through the state machine depends on whether it acts as a client or a server socket. The Whitix userspace network stack implements all eleven, along with error handling for erroneously sent packets. Packets sent to closed ports are handled either by the kernel or by the channel corresponding to the now-closed TCP socket created by the application.

Receiving a new ACK packet causes the retransmission and transmission queues to be updated as in the previous section. Successfully received ACKs also increase or decrease the congestion window, with the handling code in tcp_congestion.c.

Data received in the ChanRecvBuffers in the ESTABLISHED state is added to the data linked list, and the number of packets waiting to be received by the application is increased. The application picks up the packets by calling TcpSocketRecv, which, in a producer-consumer relationship, copies out packets as they arrive until the buffer is full or there are no new packets arriving. The read field in the ChanRecvBuffer structure, described in Section 3.4.1, is updated when the packet is read by the application (and the packet is then freed in TcpSocketRecv if all its data has been read).
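The dispatch from TcpHandleData to the per-state handlers can be pictured as a simple switch over the eleven states; in the sketch below, only TcpEstablished is a handler named in this report, and the other handler and constant names are assumed for illustration.

/* Sketch: after header and option processing, run the handler for the
 * socket's current state (one of the eleven states in Figure 4.7). */
void TcpRunStateMachine(TcpSocket* socket, ChanRecvBuffer* packet)
{
    switch (socket->state) {
    case TCP_LISTEN:       TcpListen(socket, packet);      break;
    case TCP_SYN_SENT:     TcpSynSent(socket, packet);     break;
    case TCP_SYN_RECEIVED: TcpSynReceived(socket, packet); break;
    case TCP_ESTABLISHED:  TcpEstablished(socket, packet); break;  /* data exchange */
    case TCP_FIN_WAIT_1:   TcpFinWait1(socket, packet);    break;
    case TCP_FIN_WAIT_2:   TcpFinWait2(socket, packet);    break;
    case TCP_CLOSING:      TcpClosing(socket, packet);     break;  /* waits for FIN+ACK */
    case TCP_TIME_WAIT:    TcpTimeWait(socket, packet);    break;
    case TCP_CLOSE_WAIT:   TcpCloseWait(socket, packet);   break;
    case TCP_LAST_ACK:     TcpLastAck(socket, packet);     break;
    default:               /* CLOSED: drop the packet */   break;
    }
}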

4.4.6 Socket polling

To efficiently poll for incoming data on all open TCP sockets, the network stack, through a worker thread that executes the TcpWatcherThread function, calls SysPoll on an array of file descriptors of the network channels corresponding to the list of TCP sockets. If no data is arriving on any socket, the thread either suspends itself or sleeps in SysPoll, leaving as much processing time as possible to the application.

The array of file descriptors (and an array of the socket pointers corresponding to these file descriptors) is constructed via the TcpSocketAdd and TcpSocketRemove functions; when a TCP socket sends a connection request (TcpSendSyn) or reply (TcpSendSynAck), for example, and expects a reply, the channel's file descriptor is appended to the array. If it is the first socket opened, the worker thread is created and begins polling, so the cost of an extra thread is not paid by sockets that do not use TCP functionality. When the TCP connection enters the CLOSED state and the socket stops sending and receiving data, the socket's channel file descriptor is removed from the array. The array dynamically expands to hold the number of open sockets, but does not shrink when sockets are closed.

If any network channels have incoming data, revents in the poll structure is set to an appropriate value by the kernel. When SysPoll returns, along with a return value that indicates the number of ready channels, the data is read from the channel's list of receive buffers by calling ChanBufferRecvNb, as described in Section 3.4. This data is passed to TcpHandleData, which is the core of the state machine described above; the packet is appropriately handled and we then signal any events or the arrival of new application-level data – however, much of the traffic may be session messages such as data acknowledgements and keep-alive requests.
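The main loop of the watcher thread might be sketched as follows; the poll structure fields, the POLL_IN constant and the exact SysPoll and ChanBufferRecvNb signatures are assumptions, although the function names themselves are those used in the implementation.

/* Sketch of TcpWatcherThread: poll every open TCP socket's channel
 * descriptor and feed any arriving packets to the state machine. */
void TcpWatcherThread(void)
{
    for (;;) {
        /* pollFds and pollSockets are maintained by TcpSocketAdd/Remove. */
        int ready = SysPoll(pollFds, numTcpSockets, -1 /* block indefinitely */);

        if (ready <= 0)
            continue;

        for (int i = 0; i < numTcpSockets; i++) {
            ChanRecvBuffer* packet;

            if (!(pollFds[i].revents & POLL_IN))
                continue;

            /* Drain the channel's receive list without blocking. */
            while ((packet = ChanBufferRecvNb(pollSockets[i]->channel)) != NULL)
                TcpHandleData(pollSockets[i], packet);
        }
    }
}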

Application polling semantics

There is one caveat with this method. Due to the way TCP sockets are asynchronously polled in the shared library for new incoming data, using SysPoll on the channel's file descriptor in the main application leads to unreliable notification of events on that file descriptor. This is because, while the application is polling for new data on that descriptor, the TCP worker thread is simultaneously polling the same descriptor. This causes a race to handle new notifications, which the TCP worker thread, due to its frequency of polling, generally wins. The result is that the application typically never, or only rarely, receives an event on the file descriptor; if the application waits forever (specified by a negative value in the timeout parameter of SysPoll) on just that file descriptor, it may appear to "hang", with negative consequences for interactivity and throughput.

The above is only true for TCP sockets that are polled using the direct system call SysPoll. For users of POSIX sockets (which, because of the numerous ports of applications from UNIX platforms, will be the main users of polling), a workaround can be implemented because one UNIX file descriptor functions as a wrapper around more than just one Whitix file descriptor. This workaround is described in Section 4.3.2.


input drop -p tcp -d 80    // Drop all incoming packets with destination port of 80 (HTTP)
input drop -p udp          // Drop all incoming UDP packets.

// Let only outgoing TELNET packets through the firewall.
output accept -p tcp -d 23
output drop

Figure 4.8: Examples of the rules enabled by the Whitix packet filter. The firewall utility parses the command-line input and updates the kernel-level firewall, which is implemented in the protocol classifier code for the TCP and UDP protocols and described in Section 3.7.1.

4.5 Utilities

4.5.1 firewall

firewall is the command-line user interface to the stateless firewall implementation available at the channel layer (Section 3.7.1). The utility interacts with the kernel firewall via the /Network/Firewall directory in the Information and Configuration filesystem, allowing the user to add, edit and delete rules in the INPUT and OUTPUT sets of rules. firewall does not provide for ordering of rules, so a very general rule added first of all (such as firewall -a input drop) will take precedence over all others, including more specific rules that contradict the general one.

The program supports the same set of atoms for specifying firewall rules as the kernel implementation. Packets can be blocked based on protocol, source port and destination port (as well as a combination of the three; see Figure 4.8); the firewall program functions only as a simple wrapper around the kernel-level firewall, sanity-checking values and passing them directly to it.

72 CHAPTER 4. USERSPACE NETWORKING 4.5. UTILITIES

4.5.2 nprof

nprof takes the (binary) statistical output (available as .ns files) from libnetwork.so and displays it in a human-readable format, also adding any derived figures (such as the average packet size and standard deviation of packet sizes) that are not directly included in the file. After verifying that the supplied file is an output file, it outputs information about the most accessed ports, the last accessed hosts and ports, and several aggregate functions over the data. There is no facility for editing the file. The internals are covered further in Section 5.3.

It is designed to be run in a similar fashion to gprof, which profiles the CPU time used in applications with a per-function breakdown; gprof itself only analyzes the automatically instrumented output from a program compiled with suitable command-line flags. nprof is designed to be used by developers and system administrators for diagnostic and analytical purposes; with the program, we can easily determine exactly how much bandwidth a particular connection is using, without any of the statistical inaccuracies of other profilers.

4.5.3 dhcp

dhcp is the Whitix implementation of a DHCP client, which contacts a DHCP server to retrieve its IP address assignment and other configuration information, like the IP address of the DNS server for the local network.[19] It is a key utility – it assigns an interface an IP address so that it can communicate as a host with the rest of the network.

dhcp is a unique case among applications and utilities; it is the only program that begins with no IP addresses assigned to the interface. After being supplied the name of a network interface by the user, it creates a network channel using SysChannelCreate with the CHAN_IGNORE_ADDRESSES flag, brings the network interface up and binds the channel to that interface using the "bind to interface" operation of SysChannelControl, before sending raw packets (including the broadcast Ethernet and IP addresses) and receiving packets through that interface. dhcp uses the above mechanism to send out the DHCPDISCOVER and DHCPREQUEST messages, and to receive the DHCPOFFER and DHCPACK messages.

The program is also used to bring down the given interface, but does not yet send out the DHCPRELEASE message; the IP address is still assigned to the machine by the DHCP server, but the host interface is no longer up.
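The channel setup performed by dhcp might be sketched as below. SysChannelCreate, SysChannelControl and CHAN_IGNORE_ADDRESSES are the names used above, but the argument order, the protocol family constant and the control operation constant are assumptions.

/* Sketch: open a raw channel bound to a single interface so that dhcp can
 * broadcast DHCPDISCOVER before any IP address has been assigned. */
int DhcpOpenChannel(const char* interfaceName)
{
    /* Addresses are ignored for this channel: packets are constructed raw,
     * including the broadcast Ethernet and IP addresses. */
    int fd = SysChannelCreate(CHAN_FAMILY_IPV4, NULL, NULL, CHAN_IGNORE_ADDRESSES);
    if (fd < 0)
        return fd;

    /* Bring the interface up and bind the channel directly to it. */
    SysChannelControl(fd, CHANNEL_BIND_INTERFACE, (void*)interfaceName);
    return fd;
}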

4.5.4 dns

dns is the DNS lookup tool for querying DNS name servers. It performs DNS lookups and displays the replies generated from the queried name servers. It is similar to tools such as dig (the Domain Information Groper) on other systems. Currently only name lookups to other names (aliases) and IPv4 addresses are supported (the most common use of the tool); Figure 4.9 shows a typical run of the application.

73 4.6. APPLICATIONS CHAPTER 4. USERSPACE NETWORKING

/> dns www.google.com
www.google.com is an alias for www.l.google.com
www.l.google.com is an alias for www-tmmdi.l.google.com
www-tmmdi.l.google.com has address 66.102.9.147
www-tmmdi.l.google.com has address 66.102.9.105
...

Figure 4.9: dns output for www.google.com. The dns utility is useful for diagnostic pur- poses, and provides a simple test of UDP network channels and UDP client sockets.

/> ping www.whitix.org
PING www.whitix.org with 56 bytes of data
64 bytes from www.whitix.org: sequence=0
64 bytes from www.whitix.org: sequence=1
64 bytes from www.whitix.org: sequence=3
64 bytes from www.whitix.org: sequence=2
...

Figure 4.10: An example of the output of ping. Although Whitix's implementation does not have the configurability of the standard BSD implementation, it is intended as a diagnostic test of the ICMP protocol implementation and the packet classifier, and as a small but comprehensive test of the low-level networking layers.

dns uses the DNS resolver library in libnetwork.so to construct a DNS packet and inspect the subsequent reply packet, which consists of a set of resource records. dns displays replies for records of type CNAME (aliases) and A (32-bit IPv4 addresses) in a human-readable format. Other resource records could be supported, but they are not as frequently used.

4.5.5 ping

ping uses ICMP sockets to test whether a particular host is reachable through the local network or the Internet. It constructs ICMP Echo Request packets and sends them to a specified host, recording the time at which they were sent so that we can use the reply (which duplicates the data section of the packet) to calculate the round-trip time of the packet in a stateless manner. It is the only test of ICMP channels and their corresponding userspace sockets devised so far, and is useful for comparing the TCP implementation's estimate of the round-trip time to a host with a more direct measurement.

4.6 Applications

Each of the applications developed for this project used a special feature, such as profiling output and input or asynchronous I/O, or a combination of features unique to the project. This was mainly for testing and demonstration purposes, and their use is described below.


...
230-NFS and SMB/CIFS are no longer available.
230-
230-For comments on this site, please contact .
230-Please do not use this address for questions that are not related to
230-the operation of this site. Please see our homepage at
230-http://www.kernel.org/ for links to Linux documentation resources.
230-
230 Login successful.
ftp> ls
227 Entering Passive Mode (204,152,191,37,75,39).
150 Here comes the directory listing.
drwxrwxrwx    3 0      0       109 May 27 04:06 bin
dr-xr-xr-x    2 0      0        28 Aug 29  1997 dev
d--x--x--x    2 0      0        49 May 20  1998 etc
drwxrwx---    2 536    528     124 May 21  2001 for_mirrors_only
drwxr-xr-x    2 0      0      4096 May 20  1998 lib
drwx------    2 0      0         6 Oct 02  2005 lost+found
drwxrwsr-x   11 536    536    4096 Feb 12  2009 pub
lrwxrwxrwx    1 0      0         1 Apr 21  2007 usr -> .
lrwxrwxrwx    1 0      0        10 Apr 21  2007 welcome.msg -> pub/README
226 Directory send OK.
ftp>

Figure 4.11: Output from the ftp program showing a connection to ftp.kernel.org. Statistics are collated for both the main control connection and the PASV data connection, but because of the variable destination port characteristic of PASV mode (e.g. the passive mode tuple above), only the statistics for the control connection are saved to disk and reused in future runs of the application.

4.6.1 ftp

ftp is an implementation of a File Transfer Protocol client,10 the protocol used for copying a file from one host to another.[51] ftp is an interesting test case for profiling data reuse, because it utilizes separate control and data TCP connections between the client and server applications; this two-port structure is referred to as out-of-band. The control connection remains open for the duration of the session, with a number of data connections opened to the server to transfer directory listings and files.

ftp implements passive mode (PASV), where it connects as a client to an address and port tuple supplied by the server over the control connection; the port value in this destination tuple is variable and often an ephemeral port. As a result, for these ephemeral data ports, the port profiling output generated by this program, which only stores persistent information for connections on well-known ports, cannot currently be used as input to the next run of ftp; since the port combinations are variable, it would be difficult to identify which ports were used in a previous PASV connection.

10ftp only supports read-only anonymous access currently.

The host profiling output is still extremely useful, since the information in the host entry, which is especially useful for bulk transfers (as described in Section 5.4), is available for connections to all ports on that host. To this end, ftp was intended as a demonstration of the profiling and adaptive capabilities of the new network stack, even in the absence of much of the profiling information for many of ftp's connections.

4.6.2 telnet

telnet is an implementation of a Telnet client for Whitix, used to provide a bidirectional, interactive, text-oriented connection via a terminal.[50] It uses one TCP connection for the entire session. The unique feature of telnet (among well-known protocols) is that, after an optional initial negotiation (or handshaking) of terminal parameters, the protocol itself is entirely dependent on which host the client has connected to and what program is running at the remote endpoint.

Because of these bidirectional features, telnet is an ideal test of the asynchronous receive primitives available in the userspace network stack and described in Section 5.5. During normal operation in asynchronous mode, telnet creates the server connection, designates one asynchronous receive context (with a callback function, TelnetSocketRecv) via SocketAsyncRecv and, during the main loop, polls the keyboard input for data to send over the server connection. When data arrives at the socket, TelnetSocketRecv is called in a separate thread context and handles the Telnet control and data (distributed in-band) information, updating the console if necessary.

This asynchronous feature of telnet was used for testing purposes and to avoid the issue of socket polling (described in Section 4.4.6). However, since asynchronous I/O is not really suitable for interactive programs, the asynchronous behavior can be switched on or off via a preprocessor #define statement. If the socket receive is synchronous, the socket itself is set to non-blocking, so that the keyboard can be continuously polled.
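The asynchronous setup in telnet might be sketched as follows; the SocketAsyncRecv signature, the callback prototype and the helper functions TelnetProcessInput and TelnetPollKeyboard are assumptions used for illustration.

static char recvBuffer[512];

/* Sketch: the receive callback runs in a separate thread context whenever
 * data arrives on the socket, handling in-band Telnet control sequences. */
static void TelnetSocketRecv(Socket* socket, void* data, int length)
{
    TelnetProcessInput(data, length);      /* update the console if necessary */
}

int TelnetStart(struct SockAddr* server)
{
    Socket* socket;

    if (SocketCreate(SOCKET_STREAM, &socket) < 0 || SocketConnect(socket, server) < 0)
        return -1;

    /* Register one asynchronous receive context with the network stack. */
    SocketAsyncRecv(socket, recvBuffer, sizeof(recvBuffer), TelnetSocketRecv);

    for (;;)
        TelnetPollKeyboard(socket);        /* main loop: forward keystrokes */
}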

4.6.3 httpd

httpd is an HTTP 1.0 server used to test the application feedback capabilities of the userspace network stack.[6] httpd responds to GET requests by transferring files from the local filesystem. This typically involves distributing large files to clients, which is an ideal test of the userspace network stack's ability to reach peak performance quickly; it is useful as a benchmarking tool. httpd is the sole user of the POSIX API described in Section 4.3.2, with several Whitix-specific calls (when HTTP_TEST_HANDLERS is defined) added for testing features unique to the Whitix network stack. It is also a test of the TCP server socket implementation in the userspace network stack.

httpd tests the interactive and adaptive capabilities of the stack. When a client connects to the server, httpd registers a number of callbacks with the socket for performance, benchmarking, and diagnostic purposes (the exact combination of these can be expressed through a set of preprocessor #define statements).


Figure 4.12: Wireshark used to inspect a TCP connection as the userspace network stack on Whitix runs in VirtualBox. Wireshark and VirtualBox were useful in debugging sequence number problems, checksum errors and packet formats for certain classes of application protocols, such as FTP. Shown here is the start of an FTP session using ftp to connect to ftp.kernel.org.

When a client connects to the server, httpd registers a number of callbacks with the socket for performance, benchmarking, and diagnostic purposes (the exact combination of these can be selected through a set of preprocessor #define statements). These include the round-trip time measurement update event and the remote window size changed event. In particular, the remote window changed event is used to adjust the buffer size used to read from disk in the application; there are probably more efficient ways of optimizing data flow given the right information and algorithms.

After the initial callback setup, the server handles the client request (optionally printing out debug information). In HTTP, when the server responds to a GET request, it sends a small packet containing the HTTP headers, followed by the data of the file. Before sending out the data, httpd lets the userspace network stack know how much data will be transferred by calling TcpSetTransferSize with the size of the file and the size of the buffer used to transfer data from the file (using SysRead) to the network.
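A sketch of this request path is shown below. The TcpSetTransferSize argument order, the SysRead signature and the SendHeaders/SendAll helpers are assumptions for illustration; only the general shape (hint the stack, then read and send the file in buffer-sized chunks) is taken from the description above.

/* Sketch of httpd's GET handling (illustrative; the TcpSetTransferSize
 * argument order and the SendHeaders/SendAll helpers are assumed). */
#define XFER_BUF_SIZE (32 * 1024)

void HttpdSendFile(Socket *client, int fileHandle, unsigned long fileSize)
{
    static char buffer[XFER_BUF_SIZE];
    long bytesRead;

    /* Tell the TCP implementation how much data is about to be sent, and
     * with what buffer size, so congestion control can adapt quickly. */
    TcpSetTransferSize(client, fileSize, XFER_BUF_SIZE);

    SendHeaders(client, fileSize);              /* "HTTP/1.0 200 OK" plus headers */

    /* Stream the file to the client in buffer-sized chunks. */
    while ((bytesRead = SysRead(fileHandle, buffer, XFER_BUF_SIZE)) > 0)
        SendAll(client, buffer, bytesRead);     /* loops until the whole chunk is sent */
}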

The transfer size supplied by the application, combined with the previous TCP slow start values and the round-trip time estimate from previous connections to the client (automatically incorporated when the connection is created), could be used to implement a variation on the network adaptive slow start algorithm described in [7].


/>tcp_test --verbose
Test suite #1: Socket creation
==============================
Test #1.1: Create server socket with source port 3434 ... [ passed ]
Test #1.2: Create TCP channel connecting to 3434 on localhost ... [ passed ]

Test suite #2: Listen
=====================
Test #2.1: Send packet containing ACK ... [ passed ]
    (received packet with correct SEQ and RST set)
    (socket still in LISTEN state)
Test #2.2: Send SYN ... [ passed ]
    (received packet with correct SEQ, ACK and SYN, ACK set)
    [ accepting connection ]
Test #2.3: Sending RST bit to socket in SYN-RECEIVED ... [ passed ]
    (child socket now closed)
Test #2.4: Repeat Test 2.2 ... [ passed ]
    [ accepting connection ]
...

Figure 4.13: The output of tcp_test. Like TIRTS (Section 7.2.1), tcp_test constructs a series of raw packets to test the TCP stack's conformance to RFC 793.[48] The technique is applicable to testing any host on the Internet, but tcp_test also creates a local raw socket and verifies its state after each raw packet has been sent, something TIRTS does not do.

4.7 Testing

During development of the userspace network stack, I used a variety of testing methods to ensure, as far as possible, that the stack, and especially the TCP implementation, worked under a variety of network conditions and network interfaces. The scarcity of debugging tools on Whitix (there is no traditional source debugger such as gdb, nor a code coverage tool like gcov) hampered quick testing, but it also meant that I had to be more inventive with testing methods, employing automated testing where possible. The fact that virtually all the software I wrote was network-facing aided testing greatly.

4.7.1 Specific methods

In the early stages of development, the packet sniffer Wireshark (used while the operating system was being tested in VirtualBox) was a useful diagnostic tool for ensuring we generated correct output for ICMP, UDP and TCP packets. Figure 4.12 depicts the tool being used in development to snoop on a TCP connection; the fact that it clearly displays any errors found in sniffed packets, at all layers of the stack (including various application protocols like DNS, FTP and HTTP), was useful for debugging packet construction and transmission at all levels.

Verifying that the packets followed the standard format using Wireshark was a useful method of finding obvious bugs (especially when network communication appeared not to work at all due to some low-level error), but it would be time-consuming to exhaustively test each possible scenario in each protocol. To that end, I started to develop a unit test suite, tcp_test, with the objective of exhaustively testing the TCP implementation in an automated manner. Figure 4.13 shows the set of tests executing – although the test suite is not exhaustive yet, the goal is to achieve 100% code coverage and to test all states in the state machine. With the port of a code coverage tool such as gcov, I will soon be able to verify this.

Another possibility for testing was the TCP/IP Regression Test Suite (by Nanjun Li), but since it runs solely on FreeBSD systems (which I did not have easy access to), I could not test regularly with it. (The software is further described in Section 7.2.1.) The advantage of testing network-facing software is that it can be easily tested and verified remotely, which greatly increases the range of software available – however, since there appear to be no TCP unit testing suites that run on Linux, it was difficult to perform automated tests regularly.

One area of the network stack that is not well tested is the multithreaded aspect of the TCP implementation. The locking between the main application's calls to TcpSocketSend and TcpSocketRecv (where the sendUpdate, txLock and retxLock locks are taken to update the sequence numbers, the transmission list and the retransmission buffers respectively) can be exercised through a number of different paths – although the common case (and many other tested cases) appears to be free of deadlocks, we have not been able to verify this. Errors in synchronization may lead to deadlock or to gaps in synchronization protection, so testing efforts should focus on verifying that there are no multithreading issues before continuing future development of new features.

4.7.2 Application testing and summary

When creating the range of user applications distributed along with the network stack, I chose applications that would test the ability to handle general workloads for all the major socket types. Therefore, ftp and telnet are tests of TCP client sockets, httpd is a test of TCP server socket handling, echoserv is a very simple UDP server implementation, and dns is a test of the DNS resolver library and UDP client sockets.

In summary, the most effective type of testing was the automated test suite tcp_test used for the TCP implementation; the use of unit tests, non-functional testing methods (such as sending incorrect packets that the TCP implementation should reject), and regression testing improved the quality of the TCP implementation – using the TCP RFC [48] as a guide, we could test a variety of packets sent to the host and verify that the response followed the RFC. The Wireshark packet sniffer was also useful as a first resort whenever we found errors; the ability to inspect a single packet, and for Wireshark to describe errors in the packet, allowed quick verification of functionality during the main implementation stage.


4.8 Discussion

In the timescale of the project, we have managed to demonstrate that a modern TCP/IP stack can be implemented in userspace. Apart from issues with polling sockets, which is important enough for high-performance servers to demand a permanent solution in future work, and more thorough testing of the multithreaded aspects of TCP sockets, there are relatively few issues with moving the stack to userspace.

One major open problem that has yet to be addressed is how to transfer socket state between processes. Although the fork-and-handle-connection metaphor is not used in Whitix, there are times when we would like to pass a socket to a child process for further handling. Possible solutions include a common area of shared memory giving multiple applications access to the protocol state data,[20] which reintroduces the problem of locking state that we have tried to avoid by developing network channels. For Whitix applications, however, passing sockets between applications (especially since they are only available as Socket pointers) will not be a very common operation and can probably be ignored. This is, however, an important issue for systems that do use fork regularly to create new processes, and future work should focus on it.

One area to be explored concerns the zero-copy I/O beneath the network stack. This could be utilized by user applications (avoiding memory copies while building packets) in a high-performance network API, such as the interface mentioned in [18], that exposes the network channel buffers to the application itself. Explicit management of these buffers by high-performance applications, discarding the POSIX send and recv calls, would improve performance by enforcing the zero-copy philosophy through all the layers of the stack.

4.9 Summary

We described in this chapter the new userspace network design, and how a modern TCP/IP stack can be implemented in userspace with all major features of popular protocols, such as TCP and UDP, included. We discovered that the latest research in the field, which was either at the proof-of-concept stage or intended for other kernel architectures, suffered from performance and security issues. After setting out the general architecture of the network stack, with the network channels at the base, we covered the two APIs available for Whitix, the native and POSIX APIs. Building from the basic network channel layer in userspace, we explored the simple IP, UDP and ICMP protocol support available.

In the rest of the chapter, we outlined the architecture of the TCP implementation, and how it uses a multithreaded architecture to receive packets and post them to the application. The complete implementation of features like congestion control, retransmission, and polling sockets was also discussed. Applications such as httpd, ftp and telnet, as well as the numerous utilities (ping, dhcp and dns included), were described – many of the applications use the advanced adaptability and interactivity covered in Chapter 5. We then evaluated the testing performed, including the automated tcp_test suite and the verification of network stack output using Wireshark, and then discussed, among other things, extending the APIs available with a high-performance network API.


Chapter 5 Dynamic protocols

In this chapter, we discuss the next-generation features that have been developed, helped by the flexibility that a userspace network stack offers. We first survey the gathering and use of statistics by the network stack to generate persistent profiling data, its inspection by the nprof program, which outputs and aggregates the data in a human-readable manner, and we then investigate how the network stack can use this performance data in future runs of the application.

Triggering an interactive event in the network stack no longer involves a transition to and from the kernel. Instead, we have designed a simple event delivery mechanism for all sorts of TCP events using callback functions, where the application registers for the events from the network stack that it is interested in and receives notifications in a timely fashion.

5.1 Background

5.1.1 Adaptable and interactive protocols

After I decided to write the network stack in user-space, I considered how it could produce a tighter coupling between applications and the network stack. After reading literature on event-driven systems, and noting how many APIs are partly event driven (for example, the Windows event messages that arrive at WndProc), I thought about how the network stack could notify the application of certain events, and decided to look for any research implementations.

The only implementation I found was a modification to the FreeBSD 4.3 network stack called the Interactive Transmission Control Protocol (iTCP).[32] The iTCP project at Kent State University investigated the possibility of making TCP extensible by introducing elements such as an event subscription, tracking and notification mechanism for certain TCP events that applications can subscribe to. Their work was motivated mainly by the fact that congestion control for multimedia traffic (such as VOIP) has remained a difficult problem; many schemes introduce time distortion, which is not ideal for time-sensitive data.


Figure 5.1: The architecture of Project iTCP, which adapts the FreeBSD 4.3 kernel-based network stack to generate TCP events. Applications (which are the subscriber programs in the model) subscribe to TCP events – these events are forwarded to the "InTraN-enabled protocol entity" (i.e. TCP) for processing. When an event occurs, such as a retransmit timeout, the in-kernel TCP implementation signals a separate user process, known as a T-ware, which probes the kernel for the relevant socket descriptor. The T-ware program then handles the original event.[32]

As part of this application and network symbiosis, the application registers a subscription to a subset of TCP events and, to handle them, small T-ware processes are either supplied by the programmer or by a third party. These programs then alter the behavior of the subscriber program when the kernel network stack sends a signal with event data to the T-ware process.

Practical applications of the technology were detailed on their website and in their papers; they mostly involved symbiotic video and audio transcoders. One demonstration was a network-friendly adaptive MPEG-2 encoder, which altered the stream bit rate based on the transport layer's perception of the level of network congestion. The mechanism was also used to implement a TCP extension, FAST TCP, without altering the core protocol code. It will be interesting to see in my project if this can be expanded to other parts of TCP.

iTCP may be the only implementation I found in my research, but other groups have had the idea of interaction between the application and the network stack. For example, Thekkath et al. in [62] advocate "exploiting application-specific knowledge to fine-tune a particular instance of a protocol". Other researchers, as early as 1991 (e.g. [63]), noted that transport protocols should be specialized to the needs of a particular application.1

However, [62] proposes addressing this problem by partially evaluating a general-purpose protocol such as TCP with respect to a particular set of requirements. Each application would use a slightly different variant of the general-purpose protocol. They advocate the use of protocol compilers and language-based protocols to generate efficient specialized protocols – an approach that I did not consider in this implementation, but which is perfectly possible with a userspace design. Developers using language-based protocols such as Morpheus [1] implement protocols by refining five base classes, deriving subclasses specific to the protocol.

In summary, event-based architectures for adapting protocols have rarely been implemented, and where they have been, they have used heavyweight notification mechanisms, such as signals to processes located in separate address spaces. The specialization of protocols using language-based solutions transfers more of the burden of network usage optimization onto the programmer, although such an approach is still perfectly possible in our userspace network stack. However, since event-based solutions are more dynamic, use only general-purpose protocols and offer more runtime feedback to an application, they seem a more fruitful avenue for generating adaptive and interactive network protocols.

5.1.2 Statistics, profiling and adaptation

Bhumralkar et al. (2000) [7] note that discovering the bandwidth to a host afresh each time is extremely inefficient. The bandwidth probing mechanism used during slow start takes several round-trip time (RTT) estimates – if the transfer is small and the network has sufficient bandwidth, the transfer never moves into the congestion avoidance phase, and so the duration of the transfer is related only to the RTT, rather than to the actual amount of available bandwidth.

1 [62] also predicted that request/response protocols would co-exist with existing byte-stream protocols such as TCP, arguing that in systems that need to support both throughput-intensive and latency-critical applications, there is a good reason for both to exist.

Studies (for example, [3]) show that 85% of the packets transmitted by a typical high-performance web server are sent in the slow-start phase, which means the bulk of the transfer occurs while the connection is still probing for bandwidth. Their aim was to provide an alternative to the TCP slow start algorithm, network adaptive slow start, so that small files could be transferred much more efficiently.

In their algorithm, they store a history of the congestion window and the smoothed round-trip times (the average estimate over the course of an entire connection) at the end of previous connections. They also use the size of the file being transferred to choose the particular method of transfer (for large files, normal TCP is perfectly efficient), and from these values they estimate two quantities. The first is the slow start time ss = R × log2(W), where R is the round-trip time estimate and W is the size of the congestion window in terms of the number of segments. Then, using ss, they calculate the number of packets (and therefore the amount of data) that will be sent during slow start: maxpossible = (W × ss) / R.

Bhumralkar et al. then use this maxpossible value to choose between using normal TCP for large file transfers (if filesize > maxpossible) and a modified slow-start algorithm for small transfers (filesize <= maxpossible). Combining that with the value of ssthresh, the window size value at which the connection moves into congestion avoidance mode, we can work out whether we still have spare bandwidth and can increase the receive window size further.

However, there is no research on persistent adaptive protocols, because storing statistics and profiling data across connections is not feasible for kernel-based network stacks, which is where most adaptive TCP research takes place.
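As a concrete illustration of this decision rule, the sketch below computes the two estimates and chooses between the normal and modified slow start. The function name, parameter list and the conversion from segments to bytes via the maximum segment size are assumptions made for illustration; only the two formulas are taken from [7] as summarized above.

/* Sketch of the decision rule from [7] as described above; names and the
 * byte conversion via the MSS are illustrative assumptions. */
#include <math.h>

int UseNormalSlowStart(double rtt,             /* round-trip time estimate R, in seconds  */
                       unsigned int cwnd,      /* stored congestion window W, in segments */
                       unsigned int mss,       /* maximum segment size, in bytes          */
                       unsigned long fileSize) /* transfer size hinted by the application */
{
    double ss = rtt * log2((double)cwnd);      /* slow start time: ss = R * log2(W)       */
    double maxPossible = (cwnd * ss) / rtt;    /* segments sent during slow start         */
    unsigned long maxPossibleBytes =
        (unsigned long)(maxPossible * mss);    /* convert to bytes for the comparison     */

    /* Large transfers: normal slow start is already efficient.
     * Small transfers: use the modified algorithm seeded with stored history. */
    return fileSize > maxPossibleBytes;
}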

5.1.3 Asynchronous I/O

Asynchronous non-blocking I/O has always been a useful form of I/O: it allows processing to continue while an I/O request is being serviced. I/O devices (especially hard drives) can be extremely slow in relative terms, and the application could be performing other work while waiting for data. It makes sense to extend this metaphor to sockets, whose receive operation can block indefinitely if no new data arrives. As a result, socket AIO is already available in other operating systems, although how well it is supported varies; Windows has supported asynchronous I/O for longer than Linux and its support is arguably more complete – in general, Unix-like operating systems do not support asynchronous I/O on sockets well.

My base for designing the asynchronous event interface was the Linux and POSIX AIO API. Incidentally, asynchronous I/O has been a problematic issue in the Linux kernel for many years, and is "difficult to use, incomplete in its coverage, and hard to support in the kernel".[14] However, most of the UNIX-like operating systems have begun to support the POSIX AIO specification. In FreeBSD, AIO support is not built into the default kernel and must be loaded as a module. In other Unix-like operating systems, such as Mac OS X, AIO support has only recently been added for file descriptor types other than disk files.[10]


Figure 5.2: Architecture of the adaptable components of the network stack. Since TCP is a complex protocol, a number of TCP events have been created for applications to use, such as retransmission timeout and dropped packets, as well as an asynchronous I/O implementation.

In Windows, asynchronous I/O, referred to as overlapped I/O, has been supported for a number of years by the NT kernel. Developers use objects such as I/O completion ports, which provide a queue object and a set of threads to handle requests on the queue, to synchronize between threads.[40] Typically synchronous calls to ReadFile and WriteFile can be performed asynchronously by initially calling CreateFile with the FILE_FLAG_OVERLAPPED flag and then passing a pointer to an OVERLAPPED structure to subsequent I/O operations.

There is no mechanism for notifying applications of network stack events in Windows, Linux or the BSDs. Since all of those operating systems employ kernel-based network stacks, the only method of signalling events to applications is through signals in the UNIX-like OSes and WndProc messages in Windows – there has not been any widespread adoption of the interactive protocols mentioned above.

5.2 Architecture

To enable the adaptability and interactivity features, three main components are added to the network stack, as depicted in Figure 5.2. They are as follows:

• Statistics and profiling. The TCP implementation in the userspace network stack now gathers a variety of statistics (such as the total packets sent to a particular port) and data (including the round-trip time estimate) about open connections; these statistics are stored in memory until the end of the program. If the application has specified an output file, we write the statistical data to a .ns file, which can be used for profiling the application's network usage. We then show how this network data could be used in the next run of the application in an efficient and useful manner.

• Application feedback. We can notify the TCP implementation of any upcoming data transfer with a known size (such as a large document being transferred over HTTP), enabling the various congestion control algorithms to adjust quickly in combination with previous profiling data about the host. We show how algorithms such as network adaptive slow start, and algorithms involving other estimates, can be optimized with this data, as well as how they extend the adaptability inherent in current byte-stream transport protocols such as TCP.

• Event handling and asynchronous I/O. Placing the network stack in userspace as a shared library allows us to easily notify interested applications through the use of event callbacks. Combined with a mechanism for registering interest in events and submitting asynchronous I/O requests, we explore how applications could use this interactivity to optimize bandwidth usage and adjust their behavior to suit network conditions.

5.3 Statistics and profiling

After consulting the adaptivity research, I realized that when a connection is closed, a lot of useful context about the state of the network connecting the two hosts is lost, including information about the traffic profile and network conditions. This data has to be derived again and again, usually from estimates. Since IPv4 endpoints (and those of many other protocol families and protocols that perform port multiplexing) consist of two elements, the address and the port, I came to the conclusion that the two represented different state spaces; there were certain estimates and statistics that held for hosts across all of their ports, and likewise for ports across all hosts. This meant that, to make the statistics as reusable as possible, I would have to divide the collected statistics into two categories: host statistics and port statistics.

After storing this information throughout the lifetime of the program, I also considered that this data would be useful as history from previous runs. I could feed the collected data into sockets when they were created, and adjust their congestion control for maximum performance with algorithms such as network adaptive slow start. The information collected would also be useful for profiling the application's network usage via the nprof utility – tools such as iptraf exist, but it is not possible to feed the data they collect directly back into the Linux network stack.


/>nprof /System/Temp/ftp.ns
There are 1 ports (20 bytes) and 3 hosts (52 bytes)
[ Press any key to continue ]

PORT 21    flags = CLIENT
-------------------------------------------
total bytes sent = 32
total packets sent = 3
avg data bytes per packet sent = 10
-------------------------------------------
total bytes recv = 2265
total packets recv = 10
avg data bytes per packet recv = 226
-------------------------------------------
[ Press any key to continue ]

Figure 5.3: Output from nprof, after running the ftp application several times. The port data could be used by the network stack to determine whether the program focused on throughput (a small number of large packets) or latency (a large number of small packets) – however, it is unclear exactly how this information could be used to improve performance.

5.3.1 Categories

This section describes the mechanism by which we collect data about network usage and performance. The actual format of the network stack's persistent statistics file is described in Appendix A. In libnetwork.so, we collect and store statistics from TCP connections that involve a well-known port (connections where the source or destination port number is below 1024). After we have collected statistics about a TCP connection and the connection has been closed (by calling SocketClose) by the application, the memory containing the actual TCP-specific state of the connection (TcpSocketInfo, which in turn contains the TcpStatistics structure) is not freed. Instead, it is written out, when the application exits, to a specified file in the format described above. To enable efficient collection and use of the data in subsequent runs of the network stack, we decided to divide the TCP statistics and the stored file into two main categories:

• Port (protocol) statistics. These involve information about the packets sent and received using the connection. Statistics like minSndPacketSize and maxSndPacketSize, totalBytesRecv and totalBytes are useful for calculating the number of send and receive buffers needed on subsequent runs; application developers can inspect the statistics using nprof to find out the buffer size they should be using for optimum performance when calling SocketRecv (avoiding either repeated calls to TcpSocketRecv or wasted memory for buffers that are much too large). Information about the parameters passed to TcpSocketSend and TcpSocketRecv is stored and aggregated too.


• Host statistics. These mainly focus on the actual network conditions during a connection, including the transmission and retransmission functions and a measure of their success or failure. This data can be used again if the application connects to that host again; however, to prevent very old data from being used, we include an expiry time in the host entry.

Examples of host-based statistics include droppedPackets and droppedBytes, which count the packets dropped by the userspace network stack for invalid or out-of-range sequence numbers. retxBytesSent and retxPacketsSent are similar statistics, summarizing the number of packets retransmitted due to timeout.

Important host statistics that are immediately useful in adapting the TCP transmission algorithms include rtt and rto – accurate estimates of the round-trip time and the retransmission timeout value, which will not vary too much in the time between application runs. Using these values as initial estimates will reduce the number of duplicate retransmissions and save network bandwidth (Section 5.4.3).[43]

This organization of statistics is reflected in the format of the stored file written to disk at program exit, which aggregates information about all well-known ports used by the application and stores statistics about the N most recent port connections and hosts (as a most-recently-used cache, which is useful if the application accesses the same host and port repeatedly). N can be chosen by the user, who may want to balance the processing cost at program and connection startup associated with many entries against the increased adaptive coverage for TCP connections.
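To make the two categories concrete, the sketch below shows one possible layout for the per-port and per-host records. The field names that appear in the text above (droppedPackets, retxBytesSent, rtt, rto and so on) are taken from this report; the remaining fields, the types and the exact grouping are assumptions, not the actual TcpStatistics layout used by libnetwork.so.

/* Hypothetical layout of the two statistics categories; the real
 * structures in libnetwork.so may differ. */
#include <stdint.h>

struct PortStats {                /* keyed by well-known port number */
    uint16_t port;
    uint32_t flags;               /* e.g. CLIENT or SERVER */
    uint64_t totalBytesSent, totalBytesRecv;
    uint32_t totalPacketsSent, totalPacketsRecv;
    uint32_t minSndPacketSize, maxSndPacketSize;
};

struct HostStats {                /* keyed by remote IPv4 address */
    uint32_t address;
    uint32_t expiryTime;          /* prevents very old data from being reused */
    uint32_t rtt, rto;            /* round-trip time and retransmission timeout estimates */
    uint32_t droppedPackets, droppedBytes;
    uint32_t retxPacketsSent, retxBytesSent;
};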

5.3.2 Use

The statistical data in the .ns file can be used by the developer in two ways. First of all, if the application specifies an output file for network data at runtime via NetworkInitEx, the data already present at that location will be automatically incorporated into the adaptive components of the userspace network stack – this is similar to profile-guided optimization with gcc, where profiling data collected from one run of the application is incorporated when it is next compiled. (However, our optimization and adaptation take place solely at runtime.)

A second use of the persistent statistical data is to manually inspect the network usage using nprof, the network profiler program I developed to complement the profiling mechanisms in the network stack. An example of its output is shown in Figure 5.3 – the per-port output especially illustrates how the program uses the network over a number of runs. One port may be sending very large packets, or packets whose size varies widely. This may be useful for developers wishing to break down and track their network usage by port and host to conserve bandwidth.


5.4 Adaptation

In this section, we describe how the statistics gathered above are used in the same and subsequent runs of the application to optimize TCP connections for maximal bandwidth usage. We also detail how the aggregate statistics are stored and quickly accessed to retrieve information about previous connections from a host and port tuple supplied to TcpSocketConnect or TcpSocketBind.

5.4.1 Current adaptive technologies

Current TCP implementations already employ algorithms for data transmission that adapt to the perceived congestion in the network. A number of interrelated algorithms are used for high performance and to avoid network congestion. They are:

• Slow-start and congestion avoidance. These are both transmission strategies; the aim of the pairing is to avoid congestion collapse, where network performance falls by a large amount due to an excessive number of packets being dropped. We try to avoid sending more data than the network is capable of transmitting.2 The key object is the congestion window (see p. 67): in slow-start, the protocol increases the congestion window exponentially in size until congestion is detected (or the slow-start threshold is passed); afterwards, in the congestion avoidance stage, a congestion control algorithm such as TCP Vegas is employed.[57]

• Fast retransmit and recovery. This is a variation on the slow start algorithm – if we receive a number of duplicate ACKs (usually 3) for a single sequence number, we assume that the next segment was lost in the network and quickly retransmit it. We then reduce the congestion window size to the slow-start threshold, rather than to the initial congestion window size.

• Retransmission timeout. One mechanism for retransmission is based not on duplicate ACKs, but on the estimated round-trip time (see p. 67) for the packets in a connection. The round-trip time essentially states how long we can expect to wait before we receive an acknowledgement; if we have been waiting a great deal longer than this, we should retransmit. The retransmission timeout therefore retransmits the last packet not acknowledged by the receiver. These are all estimates, which is why data from persistent profiling is very useful for more accurate estimation.

5.4.2 Application hints

Because the network stack operates at a lower level than the application (and unnecessary system calls tend to be avoided by the application, which could be doing useful work instead of paying the system call overhead), the network stack and application do not exchange much information. This is undesirable, especially with large file transfers, where, combined with the historical values of the TCP congestion window and the round-trip time for packets, we know how much data we need to transfer and the size of the buffers involved. Essentially, the application knows what the data transfer pattern will look like in the immediate future, while current implementations of network stacks are blind to it.

2 Congestion control is in contrast to flow control, where we avoid sending more data than the remote host is capable of handling.


Take httpd, for example. If the network stack knows in advance the size of the file that will be transferred, it can avoid the inefficiencies of the TCP slow start algorithm and help determine the degree of adaptability to network conditions that the transfer will need. A small file transfer over HTTP will experience much less variability in bandwidth than a large video or audio file being transferred (or streamed) over the same connection. The network adaptive slow start algorithm uses application hints, such as the file size, combined with historical data about the host, to speed up the slow start mechanism. (The algorithm is currently not part of the TCP implementation, but it is fairly trivial to implement given the supplied data.)

5.4.3 Profiling data

Profiling data from previous runs is incorporated into various adaptable algorithms. Since many of the values TCP uses to adjust its algorithms are estimates, like the retransmission time, starting with a more accurate estimate gives the socket a more accurate starting point. There are several ways in which the TCP stack uses (and potentially could use) this data to improve performance:

• Round-trip time estimates (host). All connections to the host could benefit from a more accurate estimate of the round-trip time – this means we have a much better idea of the state of the network between the two hosts (obviously the network and the network load could change between application runs, but the data is worth taking into account) and can use it to modify the retransmission timeout value, so that we avoid unnecessary retransmissions during the slow-start process over a range of sockets in the same application.

• Retransmission statistics and window sizes (host). We can gauge the congestion and reliability of the path to the host from these measurements. They can be combined with the round-trip time estimates to gain an overall picture of the past network congestion between the local and remote hosts.

• Packet sizes (protocol). It is not clear exactly how we can incorporate these statistics when the socket is created, and, as a result, they are ignored at socket startup. They may be useful for deducing whether the connection is throughput- or latency-critical, based on the number of packets sent and their average size. One possible idea is to automatically add the PSH flag for applications that send many very small packets, as they are likely part of an interactive connection in a protocol such as telnet – currently the application must signal this manually.

When a TCP socket is created, the host and port linked lists (implemented in stats_host.c and stats_port.c in user/sdk/network/stats/ respectively) are searched; if a suitable entry is found for either the host (in the list of recently accessed hosts) or the port, then the relevant group of statistics is incorporated into the TcpSocket structure – for example, the round-trip estimates are copied from the HostEntry structure to the TcpEntry structure, so we already have a reliable estimate of the value of rtt and rto. A sketch of this lookup follows.
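The sketch below shows how this seeding might look. HostEntry and the rtt/rto fields are named in the text; the lookup helper, the expiry check, the field names on the socket side and the function signature are assumptions made for illustration.

/* Illustrative sketch of seeding a new TCP socket from stored host
 * statistics (not the actual stats_host.c code). */
#include <stdint.h>

struct HostEntry *HostStatsFind(uint32_t address);   /* assumed lookup over the MRU host list */
int HostEntryExpired(struct HostEntry *entry);       /* assumed expiry-time check             */

void TcpSeedFromHistory(TcpSocket *socket, uint32_t destAddress)
{
    struct HostEntry *entry = HostStatsFind(destAddress);

    if (!entry || HostEntryExpired(entry))
        return;                      /* no usable history: fall back to the default estimates */

    /* Start from the previous connection's estimates instead of deriving
     * them again from scratch during slow start. */
    socket->rtt = entry->rtt;
    socket->rto = entry->rto;
}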


Figure 5.4: Diagram of the internal event component of the adaptable network stack. An ability to automatically log events would be useful for network stack and application debugging. There is no central storage of all events; each TCP socket has a list of asynchronous contexts per event, which function as lists of subscriptions.

...
int SocketAsyncCreate(struct AsyncCtx *async, AsyncCallback func, int flags);
int SocketAsyncAdd(struct AsyncCtx *head, struct AsyncCtx *item);
int SocketAsyncCancel(struct AsyncCtx **async);
int SocketAsyncReturn(struct AsyncCtx *async);
int _SocketEventFireList(Socket *socket, struct AsyncCtx *head, void *data, int length);
...

Figure 5.5: Some of the functions in the socket event API. Many applications call other APIs, such as AIO (Section 5.5.3) or the TCP event API (Section 5.5.2), that wrap around these event primitives and provide automatic storage or differing semantics.

5.5 Events

Events are asynchronous notifications generated by a protocol implementation in the stack. Interested applications register for a certain event or class of events using an asynchronous context, a structure usually allocated on the stack or globally, and supply a callback that is invoked when the event fires. The idea behind events is to provide interactivity primitives so that the application can base its network output on network conditions – they are a form of message coupling enabled by the fact that both the application and the network stack are located in the same address space and at the same privilege level.


5.5.1 Design

There were a number of design choices for the asynchronous API – the library code can be hard to write, and we would like to avoid potential resource conflicts and their associated failure conditions. For example, one potential problem with callback functions is the depth of the call stack growing unmanageably; if an event handler registers interest in another event and that event fires immediately, we could have problems with procedure call recursion if the stack context of the first callback is not unwound. This proved not to be a problem in the implementation.

We also wish to minimize the latency from a packet being received or sent successfully (or a TCP event firing) to the application handling it. This means we must find an efficient way of delivering the notification, performing as little work as possible without introducing synchronization issues, and balancing the load so that the events of one socket are not prioritized above another. Figure 5.4 shows the event subscription and notification architecture in the network stack. I will summarize the major issues tackled in developing the asynchronous event code:

• Repetitive or single-shot events. I chose to leave this particular policy decision to each class of event. There should be no overall semantics dictated by the generic event handling code. Certain classes of events, such as an asynchronous send, should only fire once; once the data has been sent, the associated context is no longer needed, as the event will not fire again.

• Subscription. How should interested stakeholders register for events? Most importantly, how should we handle storing event subscriptions and accessing them later? One possibility is a set of dynamic arrays in a central event subscriber component, but I eventually decided on a more distributed mechanism, with protocol code storing per-socket linked lists (updated by a common API) of AsyncCtxs, a structure that serves as an abstraction for an asynchronous context; these are allocated and registered by the application by calling event functions categorized by protocol (one possible layout of AsyncCtx is sketched after this list). Removal and cancellation of subscriptions is then handled by the generic Async* API.

• Event delivery. Once notifications are generated by the network stack, how should they be delivered to subscribers? One design is to call the event functions as the events are generated (in the same thread context in which we receive the packet) – however, this has stability and latency issues if an event callback hangs or the set of event handlers performs a large amount of processing. In the end, I opted for a separate global event worker thread to handle events, with an option for the application to create a worker thread per socket if needed.

• Event data. Should we supply a standard array for each event, or let events pass their own data to handlers? In the end, I chose the latter; since, in the event delivery mechanism, I wanted to avoid unnecessary copying of general structures, I let each event pass its own data to handlers. In the main event-firing function, _SockEventFireList, the data (given its length) is then copied to each asynchronous context as part of the standard AsyncCtx structure, which is linked to the Socket the event fired on. If the event handler requires more data about the socket, it can retrieve the information using the Socket pointer.

• Time-based events. Should we support timer events that update an application on the status of a socket periodically? These kinds of events are useful for diagnostics, but the overhead of managing many events using kernel timers means that a global scheme involving one kernel timer, with the ability to register timed events from different classes, would be the preferred solution. However, I did not focus on these timer events, because they are not directly linked with the concept of a protocol event itself and are therefore not the object of my research – they are periodic updates of socket state, rather than notifications of socket events.
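The design above implies roughly the following shape for the asynchronous context. The structure below is a hypothetical reconstruction from the prose (callback, per-event data buffer, list linkage, active/cancelled flags); the actual AsyncCtx definition in the Whitix SDK may differ.

/* Hypothetical sketch of an asynchronous context; field names and types are
 * inferred from the design discussion, not taken from the SDK headers. */
struct AsyncCtx;
typedef int (*AsyncCallback)(struct AsyncCtx *ctx);

struct AsyncCtx {
    AsyncCallback callback;     /* handler invoked by the event worker thread     */
    Socket *socket;             /* socket the context is registered with          */
    void *data;                 /* per-event data copied in by _SockEventFireList */
    int length;                 /* length of the copied event data                */
    int flags;                  /* e.g. active (being delivered) or cancelled     */
    struct AsyncCtx *next;      /* per-socket, per-event subscription list link   */
};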

5.5.2 TCP

I decided to focus on implementing a set of events for TCP and demonstrating their applicability to applications such as httpd. As TCP is the most complex and widely-used protocol in the Internet protocol suite, the flow control and congestion control algorithms in TCP are vital in preventing data from overwhelming the receiver3 and in preventing congestion collapse – feedback to the application about when the host or the network is becoming congested or uncongested would allow the application to adjust its data transmission accordingly (e.g. adjusting the quality of streaming media). In a very congested network, it may make sense for the application to perform other work in the meantime, or to display a warning message indicating the network conditions.

The events I imagined would be useful to applications, for diagnostic and performance reasons, are detailed below. This list is taken from the events available in user/sdk/network/tcp/tcp_events.c:

• Retransmission timeout. This event is fired when the retransmission timeout timer expires. This is usually a reliable sign that the network is congested, and passing the current retransmission timeout value, the next one and the current estimate of the round-trip time should allow the application to judge whether it is more useful to perform other work (such as reading more bytes from a file on disk in a file transfer) and throttle its data flow accordingly.

• Round-trip time change. This event is related to the retransmission timeout. Depending on the exact congestion control algorithm used, the round-trip time will be sampled regularly or rarely. This is the best possible estimate of network latency, which may be useful to latency-critical applications that use TCP connections – although many such applications use UDP combined with a custom application protocol, streaming video on web pages such as YouTube often utilizes a TCP connection to deliver data. If the round-trip time goes above a certain threshold, we can adjust the data rate accordingly to try to lower it again. The only problem with this event is that it may fire rapidly (as certain congestion control algorithms, such as TCP Vegas, incorporate timings from every packet into the sample) – the ability to sample the round-trip time only every Nth measurement would help the general event load, but is currently not implemented.

3Flow control is not always a function of the host; the host may have a heavy traffic load from other networks, and so the receive window, the main abstraction in flow control, may be a function of congestion on other networks.



• Dropped packets. Obviously we cannot detect packets that are dropped during transmission through the network, but there are several cases in TCP where we may have to drop packets at the host. These include out-of-order packets, duplicate packets and packets with incorrect checksums. All of these events can be used to gauge network reliability (for instance, the error rate of transmitted packets in wireless networks is much higher than in wired networks) and congestion. This is also a useful diagnostic tool, and can be adapted to other transport protocols that use checksums, such as UDP.

• Local window change. This event is fired when the window size we advertise to the remote host changes. Although we should not allow the application to set the receive window directly, an idea of how much data the TCP socket can receive is useful. We would like to receive as much data as possible to process in one SocketRecv call, but avoid allocating buffers that are too large and will never be filled.

• Remote window change. This event is fired when the window size advertised by the remote host changes. Using this, we can perform application-level flow control and deal with overwhelmed hosts at the application level – we can do other work while the remote window is small and restart transmission when it becomes larger. This helps avoid problems such as silly window syndrome, where the receive window becomes so small that the data transmitted in each packet is smaller than the packet header, at an application level. This is preferable to forcing the TCP socket to cope with a burst of data and the transmission backlog that follows.

Subscriptions

After a socket is created, we can register for an event using its corresponding registration function. This applies to TCP events and asynchronous I/O. For example, to register for the remote window change event, we execute the following steps:


int RemWinChanged(struct AsyncCtx *ctx)
{
        /* Handle event */
        return 0;
}

...
struct AsyncCtx ctx;
...
TcpEventCtxCreate(&ctx, RemWinChanged);
TcpRemWinRegister(child, &ctx);
...

1. We call the TcpEventCtxCreate function, passing a TcpEventCtx, which wraps around a SockEventCtx object and provides storage for a range of TCP events. The caller is responsible for allocating the TcpEventCtx. We also pass a callback function, RemWinChanged in this example, whose only parameter is an AsyncCtx pointer – this is the event handler. TcpEventCtxCreate sets up the context structure, but does not register the context with any socket.

2. We attach the event context, represented by TcpEventCtx, to a socket by calling the relevant registration function for the remote window change event, TcpRemWinRegister. The TCP implementation adds this to the linked list of handlers for that event on the socket. At this point, the event could fire.

To cancel a TCP event subscription, we call TcpEventCtxCancel, which simply calls SocketAsyncCancel on the internal SockEventCtx object. The mechanism of asynchronous event delivery is covered in Section 5.5.4.
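To show what a handler might actually do, here is a fleshed-out version of the RemWinChanged callback above, in the spirit of httpd's use of the remote window change event to adjust its disk read buffer (Section 4.6.3). The event data layout (a window size delivered in the context's data field) and the global buffer-size variable are assumptions made for illustration.

/* Illustrative handler for the remote window change event (not httpd's
 * actual code). Assumes the new window size is delivered in ctx->data. */
#include <stdint.h>

#define MIN_READ_SIZE  4096
#define MAX_READ_SIZE  (64 * 1024)

static volatile uint32_t diskReadSize = MAX_READ_SIZE;

int RemWinChanged(struct AsyncCtx *ctx)
{
    uint32_t remoteWindow = *(uint32_t *)ctx->data;

    /* Read from disk in chunks no larger than the receiver can accept,
     * clamped to a sensible range. */
    if (remoteWindow < MIN_READ_SIZE)
        diskReadSize = MIN_READ_SIZE;
    else if (remoteWindow > MAX_READ_SIZE)
        diskReadSize = MAX_READ_SIZE;
    else
        diskReadSize = remoteWindow;

    return 0;
}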

Other protocols

It will be hard to deliver events asynchronously if the protocol implementation does not receive packets asynchronously (that is, does not handle receiving packets in a dedicated thread, as the TCP implementation does) – event notifications would only occur once the application synchronously receives a packet. The datagram protocols, such as UDP and ICMP, are also rather simplistic; there would be few important performance-related events available. The only obvious class of event would be a checksum failure event, although that would be more useful for diagnostics than for performance.

One possibility is introducing the idea of a UDP "connection". This would allow for the automatic timing of packets in the "connection"; an automatic estimate of the round-trip time taken for packets would be useful for applications, such as online role-playing games, that require a low-latency connection. With an estimate of the round-trip time, we would then be able to introduce the round-trip time change event for UDP – typically applications perform their own ping time measurements. Adding socket operations for datagram sockets that are meant solely for the benefit of the local network stack, such as those described above, could be a fruitful avenue to pursue for multimedia applications.


Figure 5.6: Asynchronous I/O in the context of sockets and how it fits into the I/O model of UNIX-like operating systems. Whitix’s functions are similarly named.

5.5.3 Asynchronous I/O

Overview

Figure 5.6 depicts the four methods of performing I/O in operating systems today. In existing programs the most common model of socket interaction is the synchronous blocking I/O model, where the userspace application performs a system call such as send or recv on sockets and write and read on files; the file is represented on Unix-derived systems as a single file descriptor.[31] In a single-threaded application, no work can be performed while the system call waits for the data to complete (or a timeout to occur). In a multithreaded application, other threads can still perform work, but this leads to synchronization issues if multiple threads access the same data, and cross-thread communication might be difficult wherever a blocking system call could occur.

Synchronous non-blocking I/O is also possible (see the O_NONBLOCK flag); however, this can be inefficient if the application has to busy-wait until the data is available anyway – the time between data becoming available on the file descriptor and the application calling read or recv, especially when it decides to perform other work in the meantime, will be non-zero, leading to increased latency and decreased throughput.

For networked applications with unpredictable I/O needs, especially ones where data arrives in an irregular fashion, asynchronous I/O is very useful. As Figure 5.6 shows, there are two types of asynchronous I/O, blocking and non-blocking. Blocking asynchronous I/O, SysPoll in Whitix, is useful if the application's work only involves reacting to events on multiple file descriptors; one class of examples is interactive applications like a command-line shell (burn in Whitix), where work is only performed upon user input. (There are several issues with SysPoll interacting with the userspace network stack and network channels – these are covered in Section 4.4.6 – but in short we can assume that the semantics are similar.)


The last I/O model is non-blocking asynchronous I/O (usually referred to as AIO, with blocking asynchronous I/O known as multiplexing). In this model, applications post a request, either to send or to receive data, to the network stack, supplying information about how to be notified upon I/O completion. The application is then free to continue other work while the background operation completes. A thread-based callback, usually performed by an AIO worker thread, can then be generated to complete the I/O transaction. This model is available with a wide range of configuration options, such as the mapping of AIO worker threads for each socket to an I/O request. The use of multithreading, handled by the AIO layer, can really help an application take advantage of multiple processors.

AIO and the network stack

Only the asynchronous receive operation is available in this implementation of the userspace network stack. I chose not to implement asynchronous send, but to focus on other TCP events instead – the two operations are very similar to other events anyway. To register for an asynchronous receive event, the application performs the following steps:

1. The application allocates an instance of the AsyncCtx structure.

2. SocketAsyncCreate is then called, passing the asynchronous context and a callback function to handle the asynchronous receive event. This callback function is called every time a packet is received on the connection.

3. The application then calls SocketAsyncRecv to register the context with a socket, passing the data buffer (also allocated by the application) and its length. The asynchronous event could fire at any point on this socket as a result. Only one asynchronous receive context can be registered at a time to handle packets; because a packet receive event is fairly common and can involve a lot of data copying, limiting asynchronous receives to one context reduces the global event load greatly (and a well-designed application should only need one copy of the data, as with a synchronous receive). A minimal registration sequence is sketched below.
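Put together, the three steps look roughly as follows. The function names come from the steps above, but their exact signatures, the buffer size and the ProcessData placeholder are assumptions made for illustration.

/* Minimal sketch of registering an asynchronous receive (signatures assumed). */
#define RECV_BUF_SIZE 2048

static char recvBuffer[RECV_BUF_SIZE];
static struct AsyncCtx recvCtx;                        /* step 1: allocated by the application */

static int OnReceive(struct AsyncCtx *ctx)
{
    /* Called on the event worker thread each time a packet arrives. */
    ProcessData(ctx->data, ctx->length);               /* placeholder for application logic */
    return 0;
}

void RegisterAsyncReceive(Socket *socket)
{
    SocketAsyncCreate(&recvCtx, OnReceive, 0);                    /* step 2 */
    SocketAsyncRecv(socket, &recvCtx, recvBuffer, RECV_BUF_SIZE); /* step 3 */
}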

In the TCP implementation, the asynchronous receive event fires in TcpEstablished, the main data receive function, when we receive a packet containing data. If the socket has registered an asynchronous context for the receive operation, we call _SockEventFireList, which distributes the event to the subscriber as described in Section 5.5.4.

5.5.4 Event delivery

Mechanism

Once an event has occurred, we need to deliver a notification to the subscriber as quickly as possible, using event worker threads to call the specified callbacks; there is at least one global thread, and possibly a thread for each socket. The generic internal function for this is _SockEventFireList, which takes the following parameters:

int _SockEventFireList(Socket *socket, struct AsyncCtx *head, void *data, int length);

where socket is the socket on which the event has occurred (we may want to use its private event worker thread), head is the linked list of contexts (with callback functions) that we wish to invoke, and data and length describe the event data, which is different for each type of event.

_SockEventFireList locates the appropriate event worker thread and copies the data to each context's buffer, setting each context as active (i.e. in the process of being invoked). The function then signals to the worker thread that the notifications are ready to deliver by appending them to the worker thread's list of waiting events and signalling the presence of work to the thread (resuming the thread if necessary). The waiting event list (the shared object) is appropriately synchronized between the two threads.

The worker thread (which runs the SocketEventWorker function) initially suspends itself. Once it resumes and discovers there is work, it takes the waiting event list by copying the head pointer (this is the critical section); it then iterates through the list, calling each callback function (unless the context is marked as cancelled). Once there is no longer any work (the pointer to the list of waiting events is NULL), the thread sleeps until more work is ready.
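The worker loop described above might look roughly like the sketch below. POSIX threads primitives are used purely for illustration (Whitix has its own threading and synchronization API), the ASYNC_CANCELLED flag is assumed, and the AsyncCtx fields follow the hypothetical layout sketched in Section 5.5.1.

/* Illustrative event worker loop (not the Whitix SocketEventWorker source).
 * POSIX mutex/condition primitives stand in for Whitix's own synchronization. */
#include <pthread.h>
#include <stddef.h>

#define ASYNC_CANCELLED 0x1                     /* assumed flag value */

static struct AsyncCtx *waitingEvents;          /* shared list filled by _SockEventFireList */
static pthread_mutex_t eventLock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t eventReady = PTHREAD_COND_INITIALIZER;

void *SocketEventWorker(void *arg)
{
    (void)arg;
    for (;;) {
        /* Sleep until _SockEventFireList signals that work is waiting. */
        pthread_mutex_lock(&eventLock);
        while (waitingEvents == NULL)
            pthread_cond_wait(&eventReady, &eventLock);

        /* Critical section: take the whole list by stealing the head pointer. */
        struct AsyncCtx *list = waitingEvents;
        waitingEvents = NULL;
        pthread_mutex_unlock(&eventLock);

        /* Deliver each notification outside the lock, skipping cancelled contexts. */
        for (struct AsyncCtx *ctx = list; ctx != NULL; ctx = ctx->next) {
            if (!(ctx->flags & ASYNC_CANCELLED))
                ctx->callback(ctx);
        }
    }
    return NULL;
}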

Latency concerns

With multiple threads, especially on a uniprocessor, the delay, or latency, between posting an event or AIO response to an event delivery thread and the worker thread being scheduled by the processor may impact the application's performance. This latency may be the result of extra event worker threads being active at once, along with a thread scheduler that is focused on throughput rather than latency. The load balance between the main application thread and the worker thread may not always be even as a result; CPU-intensive parts of the main thread, where it does not spend any time waiting for I/O and uses up its entire timeslice, may cause a backlog of events in an event worker thread if that thread is never scheduled (the socket poller thread may also require a large portion of CPU time under high network traffic). High latency is of concern to responsive applications using asynchronous I/O.

To illustrate this, I ran a short experiment on an otherwise quiescent uniprocessor system, measuring the number of cycles (using rdtsc; see Section 6.2.3) that passed between the event being fired and being delivered to the asynchronous context, and, for synchronous blocking I/O, between the packet being made available on the socket's data list and the application copying the data out. The results are shown in Figure 5.7. They show that currently, as the number of processors per system is fairly low, latency is still an important issue.
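For reference, the cycle counts were taken with the x86 rdtsc instruction; a minimal GCC-style helper is sketched below. The measurement points in the stack are not shown, and the helper name is an assumption.

/* Minimal x86 cycle-counter read via rdtsc (GCC inline assembly). */
#include <stdint.h>

static inline uint64_t ReadCycleCounter(void)
{
    uint32_t lo, hi;
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
}

/* Usage: record the counter when the event fires and again when the callback
 * runs; the difference is the delivery latency in cycles. */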


Figure 5.7: The cycles consumed in the system between the received data arriving at the socket and being delivered to telnet. telnet has the ability, due to the asynchronous properties of the protocol, to receive packets in a non-blocking synchronous or asynchronous manner. The synchronous I/O has fewer active threads running at once, which may account for the asynchronous I/O’s increased latency.

One solution to this latency, when there is a large backlog on the global event delivery thread, is to spawn a new event delivery thread for a particular socket. This is a simple measure that I have incorporated into the event API: sockets with many events can call SocketEventCreateThread to spawn a separate worker thread at any time – however, at most one event worker thread can be spawned per socket.
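As a usage sketch, an application (or the stack itself) might promote a busy socket to a private worker thread along the following lines. SocketEventCreateThread is the call named above; the Socket helpers, the backlog counter and the threshold are assumptions made purely for illustration.

    /* Assumed declarations: the real Whitix native API and Socket layout may differ. */
    struct Socket;
    extern int SocketEventCreateThread(struct Socket *socket);
    extern int SocketPendingEvents(struct Socket *socket);     /* hypothetical helper */
    extern int SocketHasWorkerThread(struct Socket *socket);   /* hypothetical helper */

    #define EVENT_BACKLOG_THRESHOLD 64   /* illustrative tuning value */

    /* Promote a busy socket to its own event worker thread once the shared
     * global worker falls behind on deliveries. */
    static void MaybePromoteSocket(struct Socket *sock)
    {
        if (SocketPendingEvents(sock) > EVENT_BACKLOG_THRESHOLD &&
            !SocketHasWorkerThread(sock))
            SocketEventCreateThread(sock);   /* at most one worker per socket */
    }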

5.6 Discussion

Overall, one of the main advantages of a userspace network stack is being able to provide much closer coupling with the application to improve performance. By providing interactivity and adaptability through events and the network profiling functionality, this implementation proves it is possible to use events in a constructive way to couple the application and network stack. However, I believe this to be a proof-of-concept implementation: there is plenty more scope for other data to be incorporated into the TCP/IP stack, but it requires inventive uses of the stored statistics in order to realize performance gains. Most of the data stored in the network statistics, especially the port-related data, is either not used by the network stack when a related socket is next created, or it is not used to its full potential. I would investigate further the possibility of storing network data in a separate service that stores and analyzes socket data – this would provide information about other hosts across the system and avoid partitioning useful

information into per-process network statistic files. More testing, especially stress testing, should also be performed on the event layer. The only tests we have performed with sockets have involved at most 4 or 5 sockets – it would be interesting to see if the event concept remained viable with many sockets (say at least hundreds) generating many events. A wider range of TCP events (along with several UDP events), a more specific language for specifying subscriptions, and perhaps composite events would increase the adaptability of the network stack. More research into what information the application can provide to the TCP stack (beyond crude notifications such as TcpSetTransferSize) would also improve stack performance by incorporating more types of information.

5.7 Summary

This chapter covered how we used the userspace architecture of the network stack to add innovative new features, such as protocol events, application hints, persistent profiling information and asynchronous I/O, as part of achieving the goal of feedback between the application and network stack. After examining the current research, we noted that contemporary event-based network stacks signalled events using heavyweight processes, that no other network stack kept persistent statistics, and that asynchronous socket I/O was not well supported across all operating systems. We then covered how the network stack collects statistics, stores them to disk after program exit, and uses the output data to optimize future runs or allow the user to inspect networking data by host and port using nprof. Moving on, we outlined the techniques used for adaptation in the network stack; after summarizing the adaptive technologies already present in TCP, we explained how the application can influence the network stack’s operation through application hints and profiling data. In this relationship, data also flows the other way: network stacks post events – several TCP events now available were described, along with the means of subscribing to them. Finally, we covered asynchronous I/O and the means of delivering all these events, before discussing storing these statistics in a centralized service to share with other applications, among other things.

Chapter 6 Performance

We now evaluate the completed network stack implementation. In this chapter, we begin by quantitatively comparing the implementation against other similar operating system network stacks, using a series of benchmarks. When inspecting the performance of the network stack implementation, it is important to note that there is no comprehensive, non-invasive performance measurement framework for the Whitix operating system to measure cache-related behaviour (such as cache misses), nor support for high-performance timers. In spite of this, we used a number of existing timers to measure performance, mainly metrics involving CPU cycles (from which latency can be derived) and throughput.

6.1 Background

Research into what constitutes high networking performance and the benchmarking of the various network layers is widespread and involves all layers of the network subsystem,1 aiming to optimize the operation of individual components, with much of the research focusing on TCP and the various congestion control algorithms. There are also a number of studies evaluating the performance of the overall networking subsystem, which, given the time constraints on the project, is the most feasible way of obtaining the most performance information within a general framework. Perhaps the most comprehensive study is "Performance Characterization of the FreeBSD Network Stack" by Hyong-youb Kim and Scott Rixner (2005) [33]. They analyzed the behavior of high-performance web servers on the FreeBSD 4.7 operating system along three axes: packet rate, number of connections and communication latency. They used low-overhead non-statistical profiling by employing the performance counter events available in all modern CPUs to count events such as L2 cache misses2 and TLB misses.

1See http://www.csm.ornl.gov/~dunigan/netperf/netlinks.html for a directory of network performance related links.


Figure 6.1: Tasks generally performed by the network stack. There are three main actors in the system: the hardware timer, the user application and the network interface card (through its interrupt routine).

Such misses involve an access to main memory, and since memory latencies are now equivalent to several hundred processor cycles, the amount of time processors must wait for main memory data affects throughput. This project is not the first benchmark of network channels, though it is the first within a wider and more complete framework of a multithreaded userspace network stack. In 2008, Evgeniy Polyakov tested his implementation of network channels and a proof-of-concept userspace network stack. His bandwidth and CPU usage figures were impressive (see Figure 6.2): using a third of the CPU time, his implementation achieves three times the maximum bandwidth when sending small packets.[13] The difference is less clear when sending or receiving large packets – Polyakov has not implemented zero-copy I/O in his network channel patch.

6.2 Method

6.2.1 Caveats

Because there are no other implementations of a TCP/IP network stack on Whitix,3 it is not possible to perform a direct comparison of network stacks using the same operating system. This means only a comparison against the network stacks of other operating systems, such as Windows, Linux and the BSDs, is possible. There are caveats to this, however.

2In the tested CPUs, an L2 cache miss would result in an access to main memory. The addition of an L3 cache, available in most modern multicore CPUs, would affect this particular measurement.
3The closest equivalent is a simple implementation of local sockets. That can probably now be replaced with a local implementation of network channels, or data channels.


Sending
  Object        Packet size   Bandwidth       CPU usage
  Netchannels   128           27-28 MB/sec    20-30%
  Sockets       128           7-8 MB/sec      80-90%
  Netchannels   4096          27-28 MB/sec    20-30%
  Sockets       4096          30-31 MB/sec    30-40%

Receiving
  Netchannels   128           70-71 MB/sec    80-90%
  Sockets       128           24-25 MB/sec    80-90%
  Netchannels   4096          73-74 MB/sec    80-90%
  Sockets       4096          79-80 MB/sec    80-90%

Figure 6.2: Evgeniy Polyakov’s benchmark of network channels versus Linux kernel sockets. His implementation of the network channel I/O copies the data to and from userspace. It is likely the cost of copying the packets into and out of the kernel is the dominating factor for large packets.[13]

Many of the performance benchmarks concerning the network stack, especially a multithreaded one as in Whitix, also indirectly measure the performance of other operating system components. For example, performance benchmarks of a multithreaded web server would indirectly measure the throughput and latency of the filesystem, disk I/O and thread scheduler components. However, recent studies have shown that a modern, high-performance web server spends over 80% of its time within the network subsystem of an operating system,[33] and, so long as the rest of the system is quiescent and the served files are cached in main memory, the difference should be minimal. Where this is not the case, we can still perform a relative comparison of performance gains in the Whitix network stack against network stacks in other operating systems.

6.2.2 Procedure

A number of tests were performed with the echoserv implementation available in Whitix, which echoes back packets it is sent. Depending on a startup option, it creates a TCP or UDP server socket; for TCP sockets, it echoes back any packets it is sent until the connection is closed. Unfortunately, unlike other operating systems, Whitix has no comprehensive framework for performance evaluation and profiling; I have not developed an API for retrieving the performance counters available in modern CPUs, unlike other operating systems. To measure system performance, I measured the cycle counts for each operation. I measured the different layer counts separately and varied the packet size (from the smallest possible to the largest possible packet size on a standard Ethernet local area network), keeping all other variables constant. The system was otherwise quiescent, with no other connections and no firewall rules added. I measured the best-case operation, even though there are many slower paths through the transmission and receive code.
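For reference, the core of a UDP echo server of the kind described above is only a few lines. The sketch below uses the POSIX socket API (which the report says Whitix also exposes) rather than the actual echoserv source; the port number and buffer size are arbitrary choices for illustration.

    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/types.h>
    #include <unistd.h>

    /* Minimal UDP echo loop in the spirit of echoserv (illustrative, not the
     * actual Whitix source): every datagram received is sent straight back. */
    int main(void)
    {
        int fd = socket(AF_INET, SOCK_DGRAM, 0);
        struct sockaddr_in addr;

        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(7);              /* the traditional echo port */
        if (fd < 0 || bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
            return 1;

        char buf[2048];
        for (;;) {
            struct sockaddr_in peer;
            socklen_t peerLen = sizeof(peer);
            ssize_t n = recvfrom(fd, buf, sizeof(buf), 0,
                                 (struct sockaddr *)&peer, &peerLen);
            if (n > 0)                         /* echo the datagram straight back */
                sendto(fd, buf, (size_t)n, 0, (struct sockaddr *)&peer, peerLen);
        }
    }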


For the send operation, the different stages of packet construction and transmis- sion, as depicted in Figures 6.4, 6.5 and 6.8, are:

• TCP/UDP. The cycles spent in userspace constructing the packet to transmit, timed from the SocketSend method for each socket type. For TCP, I made sure to trigger only the immediate send path (i.e. not a path where the packet is added to the transmission list), where the data buffer is sent as a single packet using TcpDoSend during the call to TcpSocketSend.

• IP. I timed both the IP header construction performed in userspace (in IpBuildHeader) and the header modification (in Ipv4Write), and added both counts to obtain the total cycles spent building the IP header and calculating the packet checksum.

• Ethernet. I timed EthBuildHeader, which includes a call to ArpGetLinkAddress. I ignored any outlying figures that suggested an ARP request was sent by ArpGetLinkAddress, focusing instead on the common case where the hardware destination address would be found in the cache.

• Driver. I timed the NetDeviceSend function. Obviously the time spent in the driver will differ between Linux drivers, but a rough idea of how long transmission takes in the driver will indicate whether or not it is the dominant factor in transmission.

For the receive operation, depicted in Figures 6.6, 6.7 and 6.9, since normal header parsing did not take place in the kernel, I decided to divide the receive operation into a different set of layers:

• Inspection. This is all of NetRecv and the majority of Ipv4RecvBuffer, where the important contents of the IP header, such as the source and destination port for UDP and TCP, are copied to temporary variables for the matching process.

• Matching. The cycles taken to match a packet with a channel, using the protocol hash function and ChannelSearchList. Since the system was quiescent, the only operation that is really timed is the hash function (since the only channel will be at the head of the list). To improve profiling, I should investigate under different system loads.

• Copying. The cycles it takes to find a free receive buffer, set up the header and copy the data from the temporary buffer to the header. The majority of the cycles should be spent copying the data.

• TCP/UDP. The overhead for the userspace network stack to receive the packet. Note that for TCP this is the cycles it takes to return from SysPoll, read in the packet using IpRecvNb, and place any data in the packet on the data list for the TCP socket.


6.2.3 Timers

As Whitix lacks a profiling framework, the only two means of measuring performance are the Intel 8253 programmable interval timer (PIT), available on all PC-compatible systems, and the Time Stamp Counter (TSC), a 64-bit register that counts the number of ticks since reset and is present on all Intel processors since the Pentium. There are disadvantages with both methods. With the PIT, the timing resolution on Whitix (since it is used for the scheduling quantum) is relatively coarse-grained, at 10ms – this means that events shorter in duration than 10ms will not be timed accurately. Even if we repeat the measurement many times, the approximation error of the sample will be quite large. As an alternative, the TSC initially appears to be a convenient method of retrieving CPU timing information. In the age of multicore and hyperthreaded power-saving CPUs and hibernating operating systems (lowered CPU clocks due to power saving are a concern with operating systems other than Whitix), the TSC can give varying results. However, since we essentially run the benchmarks with the system at maximum performance, hibernation or low-clocked power-saving CPUs should not be an issue. The varying cycle count of the TSC (which arises because out-of-order processors may schedule the execution of rdtsc at different points) can be smoothed out by averaging the results.
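For reference, reading the TSC on x86 needs only a short inline-assembly helper; the sketch below (GCC syntax) shows how such a wrapper and an averaging loop are commonly written, and is an illustration rather than the Whitix source. The TimeAverageCycles name and the callback parameter are assumptions.

    /* Read the Time Stamp Counter (GCC inline assembly, x86): rdtsc returns
     * the 64-bit tick count in EDX:EAX. */
    static inline unsigned long long rdtsc(void)
    {
        unsigned int lo, hi;
        __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
        return ((unsigned long long)hi << 32) | lo;
    }

    /* Average a measurement over many iterations to smooth out the effect of
     * out-of-order execution on where rdtsc is actually issued. */
    static unsigned long long TimeAverageCycles(void (*op)(void), int times)
    {
        unsigned long long start = rdtsc();
        for (int i = 0; i < times; i++)
            op();
        return (rdtsc() - start) / (unsigned long long)times;
    }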

6.3 Experimental testbed

I ran the benchmarks described in the previous section in two environments. Whitix was run off a CD-ROM drive, and warm-up requests were issued before benchmarking to ensure that both the programs and the pages to be benchmarked were available in memory – as a result, hard-drive and CD-ROM-drive speed were irrelevant when benchmarking. The environments were:

1. Development laptop. Intel Core Duo T2300 processor (1667 MHz), 1GB of DDR2 SDRAM with a Broadcom BCM4401-B0 100Base-TX Ethernet interface, connected to the local area network by a 100BaseT Ethernet connection. The Core Duo processor has a unified L2 cache and separate L1 instruction and data caches. Each L1 cache in the processor is a two-way set associative 32KB cache with 64-byte lines, and the unified L2 cache is a shared 2MB cache with 64-byte lines.

2. VirtualBox virtualisation environment. This was the main machine used for testing the Whitix networking subsystem. On my development laptop (the above machine), the software emulates a 1600 MHz Intel processor with 256MB RAM and an AMD PCnet-PCI II (Am79C970A) Ethernet interface, sending packets over the Ethernet interface described in the previous setup.4

4Because of the way virtual machines run guest operating system code on the local computer, the virtual machine interacts with the host system’s cache in a complex and involved way.


    cycles = rdtsc();
    for (i = 0; i < TIMES; i++)
        ret = device->ops->send(device, buffer);
    cycles2 = rdtsc();
    KePrint("%u, %u cycles\n", buffer->length, (DWORD)(cycles2 - cycles)/TIMES);

Figure 6.3: NetDeviceSend annotated with cycle measurements. To average results, we sent TIMES packets to take the mean of the cycle timings, including outlying measurements in the final figures.

Figure 6.4: Transmit cycles on VirtualBox. The VirtualBox cycle counts (using rdtsc) for the various stages. Since the network device is emulated by VirtualBox, the driver cycle counts are not an accurate representation of the real timings for the device.

Although VirtualBox may seem like a poor benchmarking environment, measuring performance under hardware virtualisation is important, because such environments are possibly how many people will first evaluate the Whitix operating system, as well as being a popular way to run many operating systems.

6.4 Analysis

The results depicted in the supplied graphs and tables are useful for judging the CPU time taken in the different layers of the networking subsystem. Although not directly useful as a comparison between operating systems or environments (or at least not as useful as instruction counts, cache misses or latency timings would be), they let us judge what the limiting factor of optimization may be, and compare how long each network stack spends in each layer, comparing with studies such as [23] and [33]. From the results above, we can determine which layers are constant with regard to packet size, and which layers scale well or badly when the packet size is increased. These performance results will be useful for directing future optimization, and for comparing how different layers scale to increased packet sizes relative to other network stacks.


Figure 6.5: Transmit cycles on laptop. The laptop cycle counts (rdtsc) for the various stages.

Figure 6.6: Receive cycles on VirtualBox. The VirtualBox cycle counts (rdtsc) for the various stages.


Figure 6.7: Receive cycles on laptop. The laptop cycle counts (rdtsc) for the various stages.

Environment   Packet size   Driver           Ethernet       IP             TCP            Total
VirtualBox    42            4117 (55.5%)     804 (10.8%)    1043 (14.0%)   1454 (19.6%)   7418
              850           7658 (59.4%)     805 (6.25%)    1810 (14.0%)   2610 (20.3%)   12883
              1514          13815 (67.5%)    805 (4.12%)    2532 (12.3%)   3021 (15.5%)   19543
Laptop        42            910 (12.3%)      631 (16.3%)    810 (20.9%)    1522 (39.2%)   3873
              850           1134 (26.0%)     632 (14.5%)    926 (21.2%)    1669 (38.3%)   4361
              1514          1381 (28.1%)     635 (12.9%)    1142 (23.3%)   1750 (35.7%)   4908

Figure 6.8: Summary of the cycles consumed sending small, medium and large TCP packets on VirtualBox and the development laptop. The two sets of figures are generally not comparable between environments.


Environment   Packet size   Inspection     Matching       Copying        TCP            Total
VirtualBox    42            361 (16.9%)    183 (8.55%)    443 (20.7%)    1154 (53.9%)   2141
              850           360 (12.2%)    184 (6.26%)    638 (21.7%)    1757 (59.8%)   2939
              1514          362 (10.4%)    183 (5.26%)    930 (26.7%)    2004 (57.6%)   3479
Laptop        42            290 (15.7%)    123 (6.66%)    398 (21.5%)    1036 (56.1%)   1847
              850           294 (11.3%)    123 (9.00%)    564 (21.7%)    1621 (62.3%)   2603
              1514          290 (9.64%)    124 (4.12%)    798 (26.5%)    1855 (61.7%)   3007

Figure 6.9: Summary of the cycles consumed receiving small, medium and large TCP packets on VirtualBox and the development laptop (in the best case). The driver cycles spent receiving the packet are not included.

6.4.1 Send

From the graphs, we can see that network channels scale well to larger packets, mainly because we perform no copying from userspace to the kernel to transmit. Figure 6.4 shows that most of the cycles are spent transmitting the packet in the driver – this stage also scales poorly compared to the other layers involved in transmission. The cycles spent in the Ethernet layer, which comprises resolving network addresses and constructing a small hardware header (neither of which varies with the packet length), are constant for all packet sizes. Figure 6.5 shows a similar breakdown of cycle counts by layer, except that far less processor time is spent in the driver layer compared to the VirtualBox environment – the TCP implementation takes the most cycles on the laptop. Inspecting the table in Figure 6.8, we observe that for larger packets, the cycles spent in the driver and in the userspace TCP implementation are the dominating factor, at least in the VirtualBox environment. On the laptop, the picture is much less clear; the TCP stack performs much better (the caches do not have to be shared with other processes as in the VirtualBox environment) – optimization here would focus on the IP layer, which takes up around 10% more of the processing time than in the VirtualBox environment. Because there is no dynamic allocation when sending a packet, the cycles spent transmitting a packet compare favorably with the network stacks in other operating systems. More work is needed to compare performance, especially CPU usage and bandwidth, between the Whitix implementation and others.


6.4.2 Receive

Receiving a packet has a similar workload – the inspection and matching stages are constant, since they depend only on the packet type and the number of channels in the system. Although we could not take driver measurements (the current cycle counts were obtained by adding a special field to the NetBuffer structure, which is not present before packet receive), the receive results indicate that most of the cycles spent receiving a packet are in the driver and the TCP (and UDP) code. Figure 6.6 shows that most of the time spent handling a TCP packet is in the TCP implementation in the userspace network stack. The UDP receive, which only involves reading the buffer from the channel and performing a checksum over the entire packet, takes far fewer cycles, although the two transport protocols dominate the processing time in a typical packet receive. The graph depicting the receive cycles on the laptop, Figure 6.7, shows similar results – in both graphs, inspection and matching take constant time, regardless of the size of the packet (matching depends on the number of channels registered to that protocol – in our example, there were no others). What the receive table in Figure 6.9 shows is that 20-25% of the time receiving a TCP packet is spent copying the packet to the channel buffers. With the correct design, this copy may be unnecessary – if we allocate pages to represent receive buffers and copy the data from the network card into them, we can use the virtual memory of the operating system to remap the network channel’s first free buffer (page) to point to the received page. If we can achieve that, we will be able to receive a packet in three quarters of the time, as well as improving cache performance greatly. Otherwise, in general, in both environments most of the processor time is spent processing the TCP packet.

6.5 Summary

In this chapter, we first summarized the current techniques for general network stack benchmarking, focusing on the FreeBSD 4.7 study in particular. After a survey of previous network channel benchmarks, which delivered impressive throughput and CPU usage figures, we outlined a method for measuring this userspace network stack that involved a TCP and UDP echo server receiving and sending packets. However, there were caveats in the experimental method, including the lack of a comprehensive profiling framework available in Whitix. After detailing the experimental testbed, we ran the experiments and produced an analysis of the relative performance and scalability of the userspace network stack. We offered comparisons to other network channel implementations and potential areas of optimization.

Chapter 7 Evaluation

In this chapter, we evaluate and compare the project’s implementation against other stacks in a qualitative manner, based on metrics such as flexibility and adaptability, correctness and stability, functionality and usability, and scalability. We also consider each level of this userspace network stack, and compare it against its counterparts in other network stacks. After discussing each area, we propose ways to improve in the various areas, and conclude by summarizing the strong and weak areas of this project’s implementation, as well as its successes.

7.1 Flexibility and adaptability

The flexibility and adaptability provided by the asynchronous event and interactivity layer (Chapter 5) allow a much wider range of diagnostic and performance feedback algorithms to be incorporated into both client and server applications. Aside from academic proof-of-concept implementations of events for the TCP protocol using listener processes (see Section 5.1), there is no equivalent set of features in competing network stacks, mainly due to their kernel-based design. Event distribution to many sockets using an asynchronous notification mechanism such as signals would not scale very well. Although the set of events provided by the TCP implementation is fairly small, and their use by applications may fall short of a truly interactive relationship, the framework for specifying, registering and delivering events to sockets is easily extensible to more events and protocols. We have demonstrated this through a number of implementations in client applications such as ftp, httpd and telnet, and although some I/O events may not be suitable for interactive applications (see Section 4.6.2), generally the profiling, adaptability and feedback features, with more work, would improve application performance to an even greater degree with imaginative uses of the events. Perhaps the only major thing missing from the implementation is a framework for filtering events by value when registering, in order to decrease the event workload.

For example, if an application wants to register an interest in the TCP retransmission timeout event, it can do so by calling TcpRetxTimerOutRegister. If, however, it is only interested in situations where the retransmission timeout is excessively high (to display a message to the user indicating poor network connectivity), it must receive all the retransmission timeout events and perform its own filtering. In that respect, the event mechanism is not as flexible as desired.
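In practice that filtering ends up in the application's callback, roughly as sketched below. TcpRetxTimerOutRegister is the registration call named above; the callback signature, the event payload layout and the threshold are assumptions made purely for illustration.

    #include <stdio.h>

    /* Assumed event payload and registration signature; the real native API
     * may differ. */
    struct TcpRetxTimeoutEvent {
        unsigned int timeoutMs;       /* the retransmission timeout that fired */
    };

    extern int TcpRetxTimerOutRegister(int socketFd,
                                       void (*callback)(struct TcpRetxTimeoutEvent *ev,
                                                        void *arg),
                                       void *arg);

    #define RETX_WARN_THRESHOLD_MS 3000   /* illustrative "poor connectivity" level */

    /* Without value-based filtering in the stack, every timeout event is
     * delivered and the application discards the uninteresting ones itself. */
    static void OnRetxTimeout(struct TcpRetxTimeoutEvent *ev, void *arg)
    {
        (void)arg;
        if (ev->timeoutMs < RETX_WARN_THRESHOLD_MS)
            return;                                   /* filter in the callback */
        fprintf(stderr, "poor connectivity: retransmission timeout %u ms\n",
                ev->timeoutMs);
    }

    /* Registration (illustrative): TcpRetxTimerOutRegister(fd, OnRetxTimeout, NULL); */

A value-based subscription language in the stack would let this test run before the event is queued, saving the delivery cost entirely.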

7.2 Correctness and stability

In the absence of a framework for formal testing (such as verification and validation) and testing to ensure complete code coverage, and due to time constraints on the project, it was not possible to devise a comprehensive automated test suite for every one of the four main sets of components of the networking subsystem. However, different methods were used for different layers; for example, unit testing suited testing the system call interface and memory management of the channel layer, but was ineffective in comprehensively testing the Linux Driver Layer. The testing of low-level networking involved a number of different techniques, but the focus was mainly on automating testing at the higher levels and indirectly testing the packet transmission and receiving functionality. Since the Linux drivers used in the Linux Driver Layer were already well tested (many of the drivers have been in the kernel tree for a number of years, so their core functionality can be assumed to be well-tested) and there was a strict API to conform to when coding the layer, most of the testing of the layer was informal and comprised sending packets of different sizes and at different rates as part of general use of the layer. The other components of the low-level networking layer, apart from packet send and receive, were the network device manager and the hardware address cache. Again, since the input to the functions in this layer was assumed to stay relatively constant (for example, the NetDevRegister function would always be called from register_netdev in the LDL and would always register an Ethernet driver from the b44 or pcnet32 Linux drivers), I did not focus much on testing this layer under different inputs. To improve stability and correctness, more verification of the passed net_device structure should take place with a wider range of Linux drivers. The unit testing (the chan_test program) performed for the network channel layer was sufficient to improve the code quality of the memory management and channel management components greatly, and as a result was a success in terms of improving system stability and correctness. An initial suite of tests written after the first iteration of the channel object was created uncovered a number of bugs that would have lain dormant had I only tested with the default flags. Writing automated tests for such a widely-used layer saved a lot of time later on when I began to develop the userspace network stack on top of the network channel layer. The userspace network stack was well-tested by the range of applications that I developed, many of which functioned as basic tests of a class of sockets and of transmission and receiving using the socket (cf. ping). This also tested the protocol classifier code indirectly. Sending and receiving UDP and ICMP packets

was tested with a variety of packet sizes, but once I verified (using Wireshark) that the output of the implementation was correct, I considered the UDP and ICMP functionality complete (in terms of the project’s scope) and did not see a benefit in writing further automated unit or regression tests. Overall, my main concerns about correctness lie with the TCP implementation in the userspace network stack. It is relatively complex compared to the rest of the userspace network stack – in fact, the size of the TCP implementation in lines of code is larger than the rest of the network stack combined. The core functionality involves multithreading, and the synchronization between multiple threads adds to the complexity, making it much harder to test automatically. Add the fact that it was developed and built upon much later in the project than the other components, and it becomes clear that the implementation may have stability and correctness issues that the TCP automated test suite (tcp_test, described in Section 4.7) and the mock unreliable connection (which randomly drops packets) in the TcpHandleData function may not uncover. The verification of the state machine performed by running tcp_test for automated testing (not complete) and, during general use, by "sniffing" the TCP packet exchanges between the Whitix network stack and other hosts using Wireshark, was satisfactory for general use. Considering other aspects of the implementation, testing for deadlock conditions and multithreading bugs in the TCP polling thread will be more difficult due to their unreliable nature and the random delays of network packets. In many ways, the event handling and profiling subsystems of the userspace network stack were well-tested by general application use, even though no automated testing took place. I tested each of the asynchronous events implemented, including all of the TCP events (Section 5.5.2) and the AioReceive event, in different programs. This was one area where using the protocol sniffer Wireshark was useful for certain classes of events; we can easily verify that the remote window changed event fires when and only when the remote window size in a received packet changes. (It is harder to verify the retransmission timeout event’s firing pattern, but retransmission timeouts can be simulated in software anyway.) The most effective testing method overall for this project was the automated unit and regression testing performed for the TCP and network channel components, followed in particular by verification of the network stack output with protocol sniffers such as Wireshark. Although it is relatively expensive in terms of time and effort, the benefit gained from ensuring, under most conditions, the correctness of possibly the two most complex and widely-used layers was key to the overall stability of the project. With a larger timescale for the project, I would have extended the automated testing to other areas of the networking subsystem. My focus would still remain on the complex parts of the system; the advantage is that testing networking functionality can easily be done by running software on other hosts. Tackling the multithreading bugs that may be present in the various components would require a lock debugging suite and dependency modelling (illustrating why correct multithreaded software is so difficult to achieve), but the stability of such a core component may be worth the price in terms of time and effort.


7.2.1 Comparisons

The automated test suites present in Whitix compare favorably with those in Linux and the BSDs.1 The Linux Test Project, a body of regression tests designed to evaluate the behavior of the Linux kernel, includes a large number of tests for application programs such as ssh, ping and rcp (included as shell scripts), internal networking functionality such as iptables (the standard firewall), and the Network Filesystem (NFS).2 No equivalent exists in the Whitix test framework yet. Crucially, there appears to be no test of the socket layer at the system call level in the Linux Test Project, and no automated testing of UDP, TCP or ICMP. In this respect, the small TCP test suite tcp_test is superior, and its high-level design could be adopted for automated testing of the network stack in Linux. The only similar project, and one that tests at a lower level than my test suite, is the TCP/IP Regression Test Suite (TIRTS), a set of programs that tests the state machine correctness (1) and the transmission reliability3 (2) of a host by constructing a series of raw packets to see if the host meets the requirements of TCP RFC 793.4 However, unlike tcp_test, it does not verify the other local socket’s state.

7.3 Functionality and usability

Is the current networking implementation adequate for both client and server applications? For client applications, such as ftp and telnet, the APIs provided are adequate for a simple equivalent implementation, and since most of the functions called in these programs are the basic socket I/O functions such as send, recv and close (and their Whitix equivalents), we can consider the APIs provided to be sufficient for this class of programs. Other functionality provided in the BSD Socket API, such as retrieving the local and remote endpoint addresses via getsockname and getpeername, would be trivial to implement anyway. In terms of server applications, such as high-performance web servers, the polling quirks described in Section 4.4.6 and the lack of a high-performance change notification API (such as epoll in Linux), as well as the single thread used to poll all sockets for new TCP data, may negatively impact performance or make it harder to implement a server that handles many clients simultaneously. However, the ability to easily provide a new direct memory access API (as described in Section 4.8) would decrease latency and avoid unnecessary copies. To summarize, if we wish to make this networking subsystem usable by very high-performance servers, we must develop a high-performance socket buffer API and an I/O event notification facility. Our solution, as described earlier, has the potential to surpass other network stacks in performance if these two features are implemented.

1Since Windows has a closed development process, the only diagnostic tools that are known about are the Network Diagnostic Framework, but this is mainly focused on helping users to diagnose network problems at runtime. It is likely, however, that they run a large range of automated test suites internally.
2The body of test cases can be found at http://ltp.cvs.sourceforge.net/viewvc/ltp/ltp/testcases/network
3Transmission reliability is the host’s ability to handle segments with out-of-order sequence numbers.
4The code for the TCP/IP Regression Test Suite can be found at http://code.google.com/p/tirts/, and short technical documentation at http://wiki.freebsd.org/NanjunLi

In terms of the end-user functionality provided, there are no significant differences in the interface to networking applications, utilities and internal system configuration presented to the user in Whitix. All of the standard tools can be implemented using the new networking subsystem, and essentially the new functionality, while affecting performance, does not result in any functional changes to core features; this means that adoption of the system by end-users will be aided by a wealth of transferable knowledge from other Unix-like operating systems. Missing features in the network channel layer are few and far between. Several features, as summarized in Section 3.9, that may be needed by a more complete implementation of a network stack include the ability to receive multiple streams of packets. As one more example, some protocols support a notion of out-of-band data – a mechanism to retrieve more than one stream of data, perhaps at the network channel level, will need to be added for completeness. Overall, in terms of widely-used functionality, the network channel layer is complete, and most of the remaining work will involve adding mechanisms to support rarely-used features of TCP or diagnostic elements of ICMP. The implementation is lacking in other areas of end-user functionality, such as a comprehensive set of commands to manage Ethernet devices (configuring hardware MTUs and so forth) in a similar fashion to ifconfig in Linux or the BSDs, but I judged this to be outside the scope of building the network stack – many of the configuration options are purely diagnostic or of little direct use (such as configuring the medium type used by an Ethernet card) anyway. A more complete implementation would provide this, along with more options for diagnostic tools such as ping, dhcp and dns.

7.4 Scalability

Would the network stack scale to multiple processors and a much more intensive workload? In many ways this is the most important objective of the project; we began by arguing in Section 1.1.2, among other points, that the use of locking and the poor cache performance in current network stacks would not scale well to systems with many more processors than today. Although we could not directly test this when evaluating performance (as Whitix does not currently support SMP), we can assess the implementation against the project objectives and the initial design to judge whether the network stack would indeed scale. In the low-level networking subsystem, with regard to packet transmission, the only places where locking may still be an issue are inside the driver itself and in the queuing of packets in the Linux Driver Layer. Some locking must take place between different transmission contexts to avoid hardware corruption of packets. The implementation of the LDL has not addressed multiprocessor locking5 – as a result, this may be the bottleneck on network transmission in future many-core

5IrqSaveFlags and IrqRestoreFlags are used to disable preemption and interrupts on the local processor, but obviously would not work on a multiprocessor system.

systems (as it is one of the few common locks continually accessed by every packet transmission context), and so it may scale poorly as a result. More work is needed, perhaps by employing an alternative to spinlocks, such as a readers-writer lock.[2]

The network channel layer is, in my opinion, more scalable than its competitors in distributing packets to the rest of the network stack. By partitioning channel matching into sets of channel lists, and in turn removing linked lists from packet distribution, we have created a scalable layer: it improves cache performance (at least in theory) and removes widely-taken locks by partitioning channels into lists and by using static arrays of buffers – structures that fit into one cache line and are managed with simple bit operations such as test-and-set. Overall, we have improved scalability and partitioned many of the locks in existing network stacks into individual network stack and process contexts.
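A minimal sketch of the kind of allocation described above is shown below: a fixed array of buffers whose ownership is tracked in a single bitmap word, claimed with an atomic test-and-set style operation. The names, sizes and the use of GCC atomic builtins are illustrative assumptions, not the Whitix channel code.

    #include <stdint.h>

    #define CHANNEL_BUFFERS 32                  /* fixed buffer count per channel */
    #define BUFFER_SIZE     2048

    struct ChannelBufferPool {
        uint32_t used;                          /* one ownership bit per buffer */
        unsigned char buffers[CHANNEL_BUFFERS][BUFFER_SIZE];
    };

    /* Claim a free buffer by atomically setting its ownership bit; returns the
     * buffer index, or -1 if the pool is full. */
    static int ChannelBufferAlloc(struct ChannelBufferPool *pool)
    {
        for (int i = 0; i < CHANNEL_BUFFERS; i++) {
            uint32_t bit = 1u << i;
            uint32_t old = __sync_fetch_and_or(&pool->used, bit);
            if (!(old & bit))
                return i;                       /* we set the bit, so the buffer is ours */
        }
        return -1;
    }

    /* Release a buffer by atomically clearing its ownership bit. */
    static void ChannelBufferFree(struct ChannelBufferPool *pool, int index)
    {
        __sync_fetch_and_and(&pool->used, ~(1u << index));
    }

Because the allocation state is a single word, contention touches one cache line rather than a shared linked list of packet structures.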

The main issue in the userspace network stack with regards to scalability is the handling of incoming and outgoing socket data. There is little state shared between sockets (apart from the common socket array), so locking is not too big a concern, but the ability to dynamically create new threads for polling particular sockets based on workload would be much more scalable. Currently only one global thread polls all TCP sockets, so the ability to create new TCP data handler threads for a socket or group of sockets based on workload would address any scalability issues present.

Thinking along different lines from purely processor and memory scalability, the events and asynchronous I/O component of the network stack has not been thoroughly stress-tested with a large number of connections, asynchronous I/O events or protocol events. However, handling a large number of events quickly and responsively is partly a result of the event handling implementation, but mostly a result of the priorities of the scheduler (whether it is focused on throughput or interactivity). Although the likelihood of thousands of events occurring per second is slim, further work needs to be done on constructing workloads and optimizing both the event handling and the thread scheduler to deal with such processing, or on adding a lightweight userspace threads implementation customized for event handling.

Overall, I believe that as the number of processors per system increases and further work is done on the Whitix network subsystem to address the remaining locks, this implementation will prove to be much more scalable than others. However, lacking useful performance counters for cache misses and locks taken, we cannot currently assert this with performance data. Future work should focus on the common locks in low-level networking and on scaling events and asynchronous I/O to thousands of sockets, and verify improvements with performance benchmarks and scalability tests, perhaps in environments such as PlanetLab (a testbed for computer networking research); in general, more stress tests are needed at the higher levels in userspace, and a focus on reducing the number of cache misses and shared cache lines through careful design and locking is needed at the kernel level.


7.5 Summary

Overall, and taking into account that the Whitix networking subsystem is at the stage of a proof of concept compared to the much more established network stacks of Linux, Windows and the BSDs, the entire networking subsystem compares favorably with other implementations. Although we lack some breadth in the low-level configuration of network interfaces, the low-level networking layer, through the help of the Linux Driver Layer, supports packet transmission and receiving on almost every Ethernet device available today, which is no mean feat, and its raw performance is respectable compared to others. Amid some concerns about stability and multiprocessor locking, the performance and functionality of the current low-level layer implementation make it a suitable base for future work. The network channel layer provides probably the most convincing set of reasons for adopting the new networking architecture. Its scalability, stability and high performance mean that, even if the network stack itself is kept in the kernel, it is worth porting to other operating systems as a comprehensive replacement for the linked lists of network packets in other architectures. Any gaps in functionality, such as those described in Section 3.9, can be easily addressed and are not limited by the initial design. The userspace network stack, although not complete to the standard of competing implementations (especially with regard to TCP), functions as an acceptable host on the wider Internet and is stable and correct enough for general use. The UDP and ICMP datagram protocols can be regarded as fairly complete, with work needed solely on the TCP implementation, in areas such as retransmission, state machine correctness and reliability. We discovered that the functionality offered by the lower level easily satisfied the needs of the protocol implementations in the userspace network stack. The implementation of asynchronous I/O and events in the stack worked well as a proof of concept; general use of the layer may be unsuitable unless we can show large performance gains and the ability to handle large numbers of events and sockets simultaneously. However, the functionality supplied by the AIO and event component is not found in such breadth in any other network stack, and this can be judged a validation of moving the network stack to userspace. Along with the interactivity API, the layer transforms the userspace network stack into an adaptable and flexible component of the operating system. In summary, and including the performance results detailed in the previous chapter, the ideal applications for this network stack include both client and server applications, especially high-performance servers operating under varying network conditions. The latency and throughput gain between conventional sockets and the new network channel-based design would only become apparent with a large workload per application – however, the decreased CPU usage, illustrated by Polyakov’s results (which, if we were able to run those types of comparisons, we would possibly improve upon), would be useful for any application. Environments where our implementation may be questionable are public servers where users are able to log in and run their own programs – the small possibility of fooling the kernel network channel layer into transmitting attack packets may be a risk for certain

high-security organizations.

Chapter 8 Conclusion

Throughout this report we have shown that our alternative design for the network stack, one of the key components in any operating system, increases performance and security, decreases latency and increases packet throughput for a whole range of applications. Moving the network stack into userspace brings with it a number of advantages, including more persistent and interactive networking that adapts to the needs of particular applications. We conclude the following:

• There is a faster means of moving packets of data through the networking subsystem. We adapted and implemented an alternative kernel object, the network channel, to move data through the network stack with low latency and minimal kernel-level code, while preserving most of the existing semantics of packet transport. (Chapter 3)

– We saw how network channels are superior to other socket alternatives, and how the shared memory between userspace and the kernel allowed for zero-copy I/O, while avoiding many security concerns.

– We explored the channel’s innate high cache performance, and, due to the static nature of the network channel’s buffers and the small amount of work required to allocate and free those buffers, we hypothesized that the design would be much more scalable than the linked lists used to store socket data in current network stacks.

– Replacing sockets with network channels allowed us to push packet processing out to userspace, and enabled us to create a fully-featured network stack in userspace. A minimal amount of privileged code was required to classify incoming packets and route outgoing packets in the kernel.

• It is possible to create a high-performance userspace network stack, with the protocol processing and state in a shared library, with the benefit of increased stack flexibility and system-wide security. (Chapter 4)


– Aside from certain issues involving file descriptors, such as polling, we noted how our network stack implementation included all the major features of complex protocols like TCP, with congestion control, retransmission and flow control. We outlined the implementation’s support for datagram protocols such as UDP and ICMP.

– We discussed the multithreaded design of the TCP implementation, and how it provided a base for flexible events and low-latency responses to transport-level messages in the protocol, while obeying the traditional socket semantics in Unix.

– We described the two socket APIs available for Whitix, the native API and the POSIX API, and explained how the native API offered a framework for interactive next-generation features such as event subscription and, in the future, zero-copy I/O to the application.

• Network stack adaptability, and the use of profiling data from previous runs of a networked program, is a viable technique for optimizing bandwidth usage and improving latency. Even a small amount of statistical output from a previous run can lead to a large relative increase in performance, for only a small time cost at socket creation and program start. (Chapter 5)

– The userspace architecture of our novel network stack was used to add features such as protocol events, application hints, persistent profiling information and asynchronous I/O, as part of achieving the goal of feedback between the application and network stack.

– The automatic collection of statistics by the network stack was covered, along with the mechanisms for categorizing the data and storing it to disk at program exit. We described how this profiling output was used to inspect network usage and automatically incorporated into future runs of the application, along with application hints. These are innovative functions not part of contemporary kernel-based network stacks.

– Events, asynchronous notifications about a socket, can now be posted to an application if requested. TCP events, such as retransmission timeout or round-trip time change, can be subscribed to through the native API. We also added the capability for asynchronous I/O using the new event framework.

8.1 Scale and schedule

The project’s scale was relatively large for a six-month development process (see Table 8.1 for a breakdown of lines of code by component). By far the largest component in terms of lines of code was the userspace network stack (including the applications and utilities developed for it). However, reviewing my schedule, the layers that I spent the most time researching were the network channel layer (to investigate competing implementations) and the adaptability and interactivity layer.

122 CHAPTER 8. CONCLUSION 8.1. SCALE AND SCHEDULE 98 97 24 112 130 223 418 964 130 173 243 214 257 112 153 546 LoC 1241 1709 1358 1706 1512 3892 Source tree net/arp.c net/ipv4/ user/net/ net/device.c net/channels/ devices/linux/ user/sdk/network/ user//socket/ net/{eth,network}.c user/sdk/network/tcp/ net/channels/userlib.c user/sdk/network/aio.c user/sdk/network/ipv4.c user/sdk/network/stats/ user/sdk/network/socket.c user/sdk/network/channels.c user/sdk/network/{udp,icmp}.c user/sdk/network/tcp_events.c O / O / IP TCP Total Total Total Total Events Table 8.1: Lines of code per component Channels Total lines of code 7656 POSIX API Native API TCP events Component IPv4 channels UDP and ICMP Usercode library Generic channels Asynchronous I Linux Driver Layer Network driver layer Statistics and profiling Hardware address cache Applications, utilities etc. Network device manager, packet I Layer Low-level Network stack Network channels Adaptability and interactivity

The network channel layer, by my estimates, took the longest to develop – planning the architecture of the layer and its various features, and constantly reviewing it for stability, security, performance and scalability, involved a large design effort. The fact that the generic channel layer, at 418 lines, is relatively small and yet fully-featured is a testament to the succinct design of the layer. With a longer development schedule, I would have been able to extend the adaptability and interactivity layer and supplement the range of TCP events with events from other protocols – the total number of lines in that layer might double in a more complete implementation. The AIO layer may look tiny, but most of the asynchronous logic is placed in the TCP packet receive code – the generic layer merely manages the list of AIO contexts present in a socket.

8.2 Future work

The remainder of this chapter discusses improvements to the current implementation and extensions for particular parts of the networking subsystem. This project functions as a small overview of the eventual potential of network channels in Whitix. We also note any design considerations and potential compromises compared with the current design.

8.2.1 Merging shared memory and message queues

Currently, the only protocol suite that uses the new network channel objects is the TCP/IP suite. Unix domain sockets, or IPC sockets, are used by applications for local inter-process communication (without the overhead of a networking protocol); these sockets use the filesystem or a separate special namespace for named endpoints. Currently, the communication between two local endpoints uses buffers in kernel memory. With network channels, it should be possible to arrange the internal channel memory management so that, for two local endpoints A and B, the send buffers of A correspond to the receive buffers of B. For this to work correctly, the send and receive headers for each buffer must match exactly (the total cost in memory would be very small).1 If this is possible, then essentially we will have an object that corresponds to structured shared memory, and that happens to map to both local and remote endpoints efficiently. Following the above point, we could possibly look towards unification of local and remote endpoints and names; the application would specify what class of transport it needed (reliable, in-order delivery or packet-based transport), along with the name of an endpoint, and let an intelligent network stack handle the data transport.

1An alternative design might involve a special header structure for local IPC that uses most of the network channel infrastructure; however, it would most likely lead to a lot of duplicated code and mechanisms.

In this respect, networked resources would then become location transparent to the application.
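One way to picture the proposed pairing is a single shared ring per direction, with the two endpoints holding opposite roles. The structures below are purely illustrative; they assume, as the text requires, that the send and receive buffer headers can be made identical, and every name and size is an assumption.

    #define LOCAL_RING_BUFFERS 32
    #define LOCAL_BUFFER_SIZE  2048

    /* Assumed common header format shared by the send and receive sides. */
    struct ChanBufferHeader {
        unsigned int length;
        unsigned int flags;
    };

    struct ChanRing {
        struct ChanBufferHeader headers[LOCAL_RING_BUFFERS];
        unsigned char data[LOCAL_RING_BUFFERS][LOCAL_BUFFER_SIZE];
    };

    /* A local channel pair: endpoint A's send ring is mapped into endpoint B's
     * address space as its receive ring, and vice versa, so local IPC needs no
     * copy through separate kernel buffers. */
    struct LocalChannelPair {
        struct ChanRing aToB;    /* written by A, read by B */
        struct ChanRing bToA;    /* written by B, read by A */
    };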

8.2.2 More protocol suites

Would our userspace network stack work with other networking protocols? We may have demonstrated that our userspace network stack provides (virtually) all of the functionality of TCP/IP, a now-ubiquitous protocol suite, but can the same mechanisms still work for others? It is hard to decide on another protocol suite that would be worth the investment of time, but protocols based on Asynchronous Transfer Mode (ATM),[41] a cell-based switching technique that uses time division multiplexing, may be worth considering. Judging whether our design would be compatible with other protocols, including ATM, would require deeper knowledge and research; questions such as "how would we represent a in userspace?" are pertinent. Most protocol suites would probably be suitable, as they generally use a layered design based on the OSI model, a model with finer distinctions than the TCP/IP model. Such a layered design allows the low-level packet code (the physical, data link and network layers in the OSI model) to be placed in the kernel, with the higher-level layers (the transport and session layers) handled by userspace code.

8.2.3 Stateful firewall

Although we have successfully implemented a basic packet filter for the new networking subsystem (Section 3.7.1), which accepts or rejects single packets based on user-defined rules, the main challenge of integrating a firewall into the network channel layer lies in adding a knowledge of TCP connections (and, via heuristics, UDP connections) to the firewall and incorporating the part a packet plays in its connection into the user-defined rules, creating a stateful firewall. The code to keep track of connection state should be fairly small; as a comparison, the userspace network stack’s TCP state machine comprises about 20% of the total TCP codebase, so stateful support for different connections would only involve a couple of hundred lines of new code, for a much more expressive firewall.
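To make the "couple of hundred lines" estimate concrete, the heart of such a stateful filter is a small connection table keyed by endpoint pair, with a per-entry state advanced by the TCP flags of each matched packet. The sketch below is an assumption about how that could look, deliberately simplified from the full TCP state machine, and none of the names come from the report.

    #include <stdint.h>

    /* Coarse connection states tracked by a stateful filter (simplified from
     * the full TCP state machine). */
    enum ConnState { CONN_NEW, CONN_SYN_SEEN, CONN_ESTABLISHED, CONN_CLOSING };

    struct ConnEntry {
        uint32_t srcAddr, dstAddr;
        uint16_t srcPort, dstPort;
        enum ConnState state;
    };

    /* Update tracked state from the TCP flags of a packet already matched to
     * this connection; rules can then test entry->state as well as the headers. */
    static void ConnTrackUpdate(struct ConnEntry *entry, uint8_t tcpFlags)
    {
        const uint8_t FIN = 0x01, SYN = 0x02, RST = 0x04, ACK = 0x10;

        if (tcpFlags & RST)
            entry->state = CONN_CLOSING;
        else if ((tcpFlags & SYN) && !(tcpFlags & ACK))
            entry->state = CONN_SYN_SEEN;
        else if (entry->state == CONN_SYN_SEEN && (tcpFlags & ACK))
            entry->state = CONN_ESTABLISHED;
        else if (tcpFlags & FIN)
            entry->state = CONN_CLOSING;
    }

A rule such as "only accept inbound packets belonging to an established connection" then reduces to a single comparison against the tracked state.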

8.2.4 Remote network stack management

The Simple Network Management Protocol (SNMP) is a network protocol running over UDP that is used to monitor network-attached devices. It includes sets of data objects of a variety of basic types, packaged as a management information base (MIB), that describe the configuration of a system, including its network stack; in many ways, the SNMP standard implicitly assumes a central implementation of networking configuration. There are many different types of MIB, but the ones that would prove the most challenging to implement would be the TCP [53], IP [56] and UDP [21] MIBs.


For example, consider implementing the tcpCurrEstab SNMP object [see 53, page 6], an integer containing the total number of established TCP connections in the system. We would need to collate this information from all of the processes with active network stacks in the system: we would either have to poll each network stack for information via a userspace SNMP server process, or make sure that each network stack implementation exposed the relevant SNMP objects and statistics in a standard way to the userspace SNMP application. Both methods make it harder to implement an alternative network stack by requiring more information and mechanisms from an implementation.

8.3 Final comments

Overall, I believe the project was an ambitious undertaking in its scope, but I have established proof-of-concept implementations in enough areas to dispel many of the preconceptions held about network channels and alternative network designs, as well as about event delivery and AIO. The network stack is easily extensible with user-level events and (relatively) low-latency asynchronous I/O, and as a result it has features that are lacking in other network stacks. Had I been allocated twice the amount of time for the project, I am sure I could have delivered a much more fully-featured implementation, one able to compete with existing, established network stacks. In the six months I have worked on the project, however, I believe I have reached the point where my implementation is a viable base for future development, and various elements, such as the userspace network stack, could easily be ported to other operating systems thanks to the layered design. More time would also have been spent on automated testing and performance evaluation; developing a simple performance framework would have proved useful in generating data such as statistics on cache misses. In conclusion, even though tough goals were set at the beginning of the project, the implementation and analysis presented here are a success in terms of exploring and achieving those objectives.

Chapter 9

Bibliography

[1] M. Abbott and L. Peterson. A language-based approach to protocol implementation. IEEE/ACM Transactions on Networking (TON), 1(1):19, 1993.

[2] T. Anderson. The performance of spin lock alternatives for shared-memory multiprocessors. IEEE Transactions on Parallel and Distributed Systems, 1(1):6–16, 1990.

[3] H. Balakrishnan, V. Padmanabhan, S. Seshan, M. Stemm, and R. Katz. TCP behavior of a busy Internet server: Analysis and improvements. In IEEE INFOCOM, volume 1, pages 252–262. Citeseer, 1998.

[4] S. Bellovin. Defending Against Sequence Number Attacks. RFC 1948 (Informational), May 1996.

[5] C. Benvenuti. Understanding Linux Network Internals. O’Reilly Media, 1st edition, 2005. ISBN 0-596-002-556.

[6] T. Berners-Lee, R. Fielding, and H. Frystyk. Hypertext Transfer Protocol – HTTP/1.0. RFC 1945 (Informational), May 1996.

[7] Y. Bhumralkar, J. Lung, and P. Varaiya. Network Adaptive TCP Slow Start. 2000.

[8] D. Bovet and M. Cesati. Understanding The Linux Kernel. O’Reilly Media, 3rd edition, 2005. ISBN 0-596-005-652.

[9] R. Cáceres, P. Danzig, S. Jamin, and D. Mitzel. Characteristics of wide-area TCP/IP conversations. In Proceedings of the conference on Communications architecture & protocols, page 112. ACM, 1991.

[10] D. Chisnall. POSIX Asynchronous I/O. InformIT, 2006.


[11] J. Corbet. A proposal for a new networking . 2006. Available online (last accessed 25-11-2009) at http://lwn.net/Articles/192410/.

[12] J. Corbet. Van Jacobson’s network channels. 2006. Available online (last accessed 24-11-2009) at http://lwn.net/Articles/169961/.

[13] J. Corbet. The return of network channels. 2007. Available online (last accessed 24-11-2009) at http://lwn.net/Articles/260880/.

[14] J. Corbet. LCA: A new approach to asynchronous I/O. 2009. Available online (last accessed 03-04-2010) at https://lwn.net/Articles/316806/.

[15] P. de Langen and B. Juurlink. Memory copies in multi-level memory systems. In Application-Specific Systems, Architectures and Processors, 2008. ASAP 2008. International Conference on, pages 281–286, 2008.

[16] S. Deering and R. Hinden. Internet Protocol, Version 6 (IPv6) Specification. RFC 2460 (Draft Standard), Dec. 1998.

[17] T. Dierks and E. Rescorla. The Transport Layer Security (TLS) Protocol Version 1.2. RFC 5246 (Proposed Standard), Aug. 2008.

[18] U. Drepper. The Need for Asynchronous, Zero-Copy Network I/O. 2006. Available online (last accessed 13-02-2010) at http://people.redhat.com/drepper/newni.pdf.

[19] R. Droms. Dynamic Host Configuration Protocol. RFC 2131 (Draft Standard), Mar. 1997.

[20] A. Edwards and S. Muir. Experiences implementing a high performance TCP in user-space. In Proceedings of the conference on Applications, technologies, architectures, and protocols for computer communication, page 205. ACM, 1995.

[21] B. Fenner and J. Flick. Management Information Base for the User Datagram Protocol (UDP). RFC 4113 (Proposed Standard), June 2005.

[22] R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, and T. Berners-Lee. Hypertext Transfer Protocol – HTTP/1.1. RFC 2616 (Draft Standard), June 1999.

[23] X. Fu, J. Demter, C. Dickmann, H. Peters, and N. Steinleitner. Performance analysis of the TCP/IP stack of Linux Kernel 2.6.9. 2005.

[24] S. Gay, V. Vasconcelos, A. Ravara, et al. Session types for inter-process communication. Preprint, Dept. of Computer Science, Univ. of Lisbon, 2002.

[25] J. Hennessy, D. Patterson, D. Goldberg, and K. Asanovic. Computer architecture: a quantitative approach. Morgan Kaufmann, 2003.

[26] R. Hinden and S. Deering. Internet Protocol Version 6 (IPv6) Addressing Architecture. RFC 3513 (Proposed Standard), Apr. 2003.


[27] R. M. Hinden. IP next generation overview. Commun. ACM, 39:61–71, June 1996.

[28] V. Jacobson. Congestion avoidance and control. In Symposium proceedings on Communications architectures and protocols, SIGCOMM ’88, pages 314–329, New York, NY, USA, 1988. ACM.

[29] V. Jacobson. Speeding up networking, 2006. Available online (last accessed 24-11-2009) at http://www.lemis.com/grog/Documentation/vj/lca06vj.pdf.

[30] F. Jennings. Linux socket programming - what lies beneath a socket? 1995. Available online (last accessed 03-03-2010) at http://linux.sys-con.com/node/34589.

[31] T. Jones. Boost application performance using asynchronous I/O. 2006. Available online (last accessed 13-05-2010) at http://www.ibm.com/developerworks/linux/library/l-async/.

[32] J. Khan and R. Zaghal. Symbiotic rate adaptation for time sensitive elastic traffic with interactive transport. Computer Networks, 51(1):239–257, 2007.

[33] H. Kim and S. Rixner. Performance characterization of the FreeBSD network stack. Computer Science Department, Rice University, 2005.

[34] G. Kroah-Hartman. The Linux Kernel Driver Interface. 2008. Available online (last accessed 03-03-2010) at http://lxr.linux.no/#linux+v2.6.34/Documentation/stable_api_nonsense.txt.

[35] B. Loo, T. Condie, M. Garofalakis, D. Gay, J. Hellerstein, P. Maniatis, R. Ramakrishnan, T. Roscoe, and I. Stoica. Declarative networking: language, execution and optimization. In Proceedings of the 2006 ACM SIGMOD international conference on Management of data, page 108. ACM, 2006.

[36] A. Lounsbury. Gigabit Ethernet: The difference is in the details. Data Communications, 26(6):4, 1997.

[37] K. McCloghrie and M. Rose. Management Information Base for Network Management of TCP/IP-based internets: MIB-II. RFC 1213 (Standard), Mar. 1991.

[38] R. Merrit. CPU designers debate multi-core future. 2008. Available online (last accessed 03-03-2010) at http://www.eetimes.com/showArticle.jhtml?articleID=206105179.

[39] P. Mockapetris. Domain names - implementation and specification. RFC 1035 (Standard), Nov. 1987.

[40] MSDN. Synchronous and Asynchronous I/O. 2010. Available online (last accessed 23-05-2010) at http://msdn.microsoft.com/en-us/library/aa365683%28VS.85%29.aspx.


[41] P. Neelakanta. A textbook on ATM telecommunications: principles and implementation. Boca Raton, 1st edition. ISBN 0-849-318-05X.

[42] M. Neubauer and P. Thiemann. An implementation of session types. Practical Aspects of Declarative Languages, pages 56–70.

[43] V. Paxson and M. Allman. Computing TCP’s Retransmission Timer. RFC 2988 (Proposed Standard), Nov. 2000.

[44] D. Plummer. Ethernet Address Resolution Protocol. RFC 826 (Standard), Nov. 1982.

[45] J. Postel. User Datagram Protocol. RFC 768 (Standard), Aug. 1980.

[46] J. Postel. Internet Control Message Protocol. RFC 792 (Standard), Sept. 1981.

[47] J. Postel. Internet Protocol. RFC 791 (Standard), Sept. 1981.

[48] J. Postel. Transmission Control Protocol. RFC 793 (Standard), Sept. 1981.

[49] J. Postel and J. Reynolds. Telnet Option Specifications. RFC 855 (Standard), May 1983.

[50] J. Postel and J. Reynolds. Telnet Protocol Specification. RFC 854 (Standard), May 1983.

[51] J. Postel and J. Reynolds. File Transfer Protocol. RFC 959 (Standard), Oct. 1985.

[52] N. Pryce and S. Crane. Component interaction in distributed systems. In Configurable Distributed Systems, 1998. Proceedings. Fourth International Conference on, pages 71–78, 1998.

[53] R. Raghunarayan. Management Information Base for the Transmission Control Protocol (TCP). RFC 4022 (Proposed Standard), Mar. 2005.

[54] T. Roscoe. Network architecture test-beds as platforms for ubiquitous computing. Phil. Trans. R. Soc. A, pages 3709–3716, 2008.

[55] J. Rosenberg, H. Schulzrinne, G. Camarillo, A. Johnston, J. Peterson, R. Sparks, M. Handley, and E. Schooler. SIP: Session Initiation Protocol. RFC 3261 (Proposed Standard), June 2002.

[56] S. Routhier. Management Information Base for the Internet Protocol (IP). RFC 4293 (Proposed Standard), Apr. 2006.

[57] W. Shay. Understanding data communications and networks. PWS Publishing Co., Boston, MA, USA, 2nd edition, 1999.

[58] W. Stallings. IPv6: The new Internet protocol. IEEE Communications Magazine, 34(7):96–108, 1996.


[59] R. W. Stevens. Unix Network Programming, Volume 1: The Sockets Networking API. Addison-Wesley Professional, 3rd edition, 2003. ISBN 0-131-411-555.

[60] A. Tanenbaum. Computer Networks. Prentice-Hall, 4th edition, 2002. ISBN 0-130-661-029.

[61] A. S. Tanenbaum. Modern Operating Systems. Prentice Hall, 2nd edition, 2001. ISBN 0-13-092641-8.

[62] C. Thekkath, T. Nguyen, E. Moy, and E. Lazowska. Implementing network protocols at user level. IEEE/ACM Transactions on Networking (TON), 1(5):565, 1993.

[63] C. Tschudin. Flexible protocol stacks. In Proceedings of the conference on Communications architecture & protocols, pages 197–205. ACM, 1991.

[64] P. Tulloch. Discussing the Many-Core Future. HPCWire, 2007.


Appendix A

Network statistics file format

The statistical output from the network stack is stored on disk, at the filesystem location specified in NetworkInitEx, in the following format. As described in Section 5.3, the file is divided into three parts: the header, describing the structure sizes and the number of structures in the file; the port entries, where protocol information is aggregated over all the runs of the application; and the host entries, which act as a cache for connection information (such as the retransmission timeout) for programs that regularly connect to the same set of hosts.

A.1 Header

Item                             Size (bytes)   Comments
Magic number                     4              Always has the value 0xDE23FA11
Number of port entries           4
Size of port entry structure     4
Number of host entries           4
Size of host entry structure     4
Maximum number of host entries   4              Usually set by the application or a general system configuration value
Port entries                     variable
Host entries                     variable
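As an illustration, the header could be transcribed into a C structure along the following lines; the structure and field names are hypothetical, chosen here to mirror the table rather than taken from the implementation.

    #include <stdint.h>

    /* Hypothetical on-disk layout of the statistics file header. */
    struct NetStatsFileHeader {
        uint32_t magic;          /* always 0xDE23FA11 */
        uint32_t numPortEntries; /* number of port entries that follow */
        uint32_t portEntrySize;  /* size in bytes of each port entry structure */
        uint32_t numHostEntries; /* number of host entries that follow */
        uint32_t hostEntrySize;  /* size in bytes of each host entry structure */
        uint32_t maxHostEntries; /* usually set by the application or system configuration */
        /* Followed by the port entries and then the host entries. */
    };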


A.2 Port entries

Item                     Size (bytes)   Comments
Port                     2              Value is always a well-known port, i.e. <= 1024
Flags                    2              Either PORT_SERVER (0) or PORT_CLIENT (1), depending on the port’s role in the communication
Total bytes sent         4              All bytes sent in all runs of the application since profiling began
Total packets sent       4              All packets sent in all runs of the application since profiling began
Total bytes received     4              All bytes received from all hosts on the port
Total packets received   4              All packets received from all hosts on the port
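For reference, the same layout could be transcribed into a C structure such as the following; the structure and field names are again hypothetical.

    #include <stdint.h>

    /* Hypothetical on-disk layout of a port entry. */
    struct NetStatsPortEntry {
        uint16_t port;         /* always a well-known port, i.e. <= 1024 */
        uint16_t flags;        /* PORT_SERVER (0) or PORT_CLIENT (1) */
        uint32_t bytesSent;    /* all bytes sent in all runs since profiling began */
        uint32_t packetsSent;  /* all packets sent in all runs since profiling began */
        uint32_t bytesRecv;    /* all bytes received from all hosts on the port */
        uint32_t packetsRecv;  /* all packets received from all hosts on the port */
    };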

A.3 Host entries

Item                         Size (bytes)   Comments
Address                      4              The address of the remote host
Flags                        4              Either HOST_SERVER (0) or HOST_CLIENT (1), depending on the host’s role in the communication
Retransmitted bytes sent     4              Total number of bytes retransmitted across all ports to the host
Retransmitted packets sent   4              Total number of packets retransmitted across all ports to the host
Retransmit timeouts          4              Total number of retransmission timeout events that occurred
rto, rtt, mdev               12             The three variables used in the estimation of round-trip time and the retransmission timeout value
minWin, maxWin, endWin       12             The minimum, maximum and last receive window size of the remote host
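Similarly, a hypothetical C transcription of the host entry layout might be:

    #include <stdint.h>

    /* Hypothetical on-disk layout of a host entry. */
    struct NetStatsHostEntry {
        uint32_t address;          /* address of the remote host */
        uint32_t flags;            /* HOST_SERVER (0) or HOST_CLIENT (1) */
        uint32_t retransBytes;     /* bytes retransmitted across all ports to the host */
        uint32_t retransPackets;   /* packets retransmitted across all ports to the host */
        uint32_t retransTimeouts;  /* number of retransmission timeout events */
        uint32_t rto, rtt, mdev;   /* round-trip time and retransmission timeout estimation */
        uint32_t minWin, maxWin, endWin; /* minimum, maximum and last receive window of the host */
    };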

Appendix B

System calls

B.1 SysChannelCreate

    int SysChannelCreate(int chnType, void *source, void *dest,
                         struct ChannelOptions *options);

The parameters have the following meaning:

• chnType. (in) The protocol that will be used to inspect and classify incoming and outgoing packets. chnType is a 32-bit integer, with the high word specifying the address family, such as IPv4, and the low word specifying the exact protocol (TCP, UDP or ICMP). This is roughly equivalent to the domain and type parameters of the socket system call in UNIX, except with the type field depending on the protocol used.

• source, dest. (in/out) The exact format of these parameters depends on the address family. For IPv4 channels, the data pointed to by the two pointers has the following structure:

    struct Ipv4EndPoint {
        ulong address;
        ushort port;
    };

The protocol family specified in chnType determines the exact semantics of the address and port fields. Generally, for all protocols that involve a source and destination port, specifying a zero port will cause an ephemeral source port to be allocated automatically by the Ipv4PortAllocate function in net/ipv4/port.c.

• options. (in/out, optional) This can be NULL if the caller does not want to specify any particular options. In this case, sensible defaults for the number of send and receive buffers are supplied, and there are no special flags assigned to the channel. The exact structure of ChannelOptions is shown below:


    struct ChannelOptions {
        int flags;
        unsigned short sendBuffers, recvBuffers;
    };

    #define CHANNEL_IGNORE_ADDRESSES    0x01
    #define CHANNEL_KEEP_SEND_BUFFERS   0x02

If the call is successful, a valid file descriptor is returned to the user, which can be used in later file operations to transmit and receive data. On an error, one of the standard UNIX error codes is returned.
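As a usage illustration only, the fragment below creates a TCP-over-IPv4 channel to a web server on 192.168.0.1. The CHN_IPV4 and CHN_TCP constants, the OpenHttpChannel function and the address byte ordering are assumptions made for the sketch; only SysChannelCreate, Ipv4EndPoint and ChannelOptions come from the description above.

    /* Hypothetical values standing in for the real address family and protocol
       constants, which are defined by the channel headers in Whitix. */
    #define CHN_IPV4 1
    #define CHN_TCP  2

    int OpenHttpChannel(void)
    {
        struct Ipv4EndPoint source = { 0, 0 };           /* any local address, ephemeral port */
        struct Ipv4EndPoint dest   = { 0xC0A80001, 80 }; /* 192.168.0.1, port 80 */
        struct ChannelOptions options = { 0, 16, 16 };   /* no flags, 16 send and 16 receive buffers */

        /* High word selects the address family, low word the exact protocol. */
        int chnType = (CHN_IPV4 << 16) | CHN_TCP;

        return SysChannelCreate(chnType, &source, &dest, &options);
    }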

B.2 SysChannelControl

    int SysChannelControl(int chnFd, int code, void *data);

The parameters have the following meaning:

• chnFd. The file descriptor returned by a previous call to SysChannelCreate.

• code. The operation to perform on the channel. There are currently three values available:

    #define CHANNEL_SET_FLAGS      0x01
    #define CHANNEL_GET_FLAGS      0x02
    #define CHANNEL_SET_INTERFACE  0x03

• data. Dependent on the value of code. For the flag-related operations, data is the address of an integer to read from or write to respectively. For CHANNEL_SET_INTERFACE, it is the address of a string containing the human-readable name of the interface: "Ethernet0" for example.

The function generally returns zero if successful, or one of the standard UNIX error codes if unsuccessful.
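Continuing the hypothetical fragment from Section B.1, a caller might pin the channel to a named interface and enable a flag as follows; the ConfigureChannel function is illustrative and error handling is kept minimal for brevity.

    /* Hypothetical example: bind the channel to "Ethernet0" and then set the
       CHANNEL_KEEP_SEND_BUFFERS flag by reading, modifying and writing the flags. */
    int ConfigureChannel(int chnFd)
    {
        int flags;

        if (SysChannelControl(chnFd, CHANNEL_SET_INTERFACE, "Ethernet0") != 0)
            return -1;

        if (SysChannelControl(chnFd, CHANNEL_GET_FLAGS, &flags) != 0)
            return -1;

        flags |= CHANNEL_KEEP_SEND_BUFFERS;
        return SysChannelControl(chnFd, CHANNEL_SET_FLAGS, &flags);
    }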

Appendix C

Usercode library functions

The network channel usercode library contains a number of functions, as described in Section 3.4.2, that abstract away the low-level structure manipulation involved in the operation of any class of network channel. Each function acts on the general channel header (ChanUserHeader) or one of the two packet header types (ChanSendHeader, ChanRecvHeader); there is no extra context stored or invoked by the usercode library. As a result, the channel memory appears opaque to the userspace application; the structure of any internal memory, like the ChannelInfo structure, is revealed on a need-to-know basis. The function prototypes, which can be divided into three categories, are shown below:

    /* 1. Allocation map functions */
    void *uChanSendBufferAlloc(struct ChanUserHeader *header);
    struct ChanRecvHeader *ChanRecvBuffGet(struct ChanUserHeader *header);
    void uChanSendBuffFree(struct ChanUserHeader *header, struct ChanSendHeader *send);
    void uChanRecvBuffFree(struct ChanUserHeader *header, struct ChanRecvHeader *recv);

    /* 2. Header functions */
    int uChanRecvBuffLen(struct ChanRecvHeader *header);
    void uChanRecvBuffSetPriv(struct ChanRecvHeader *header, void *data);
    void *uChanRecvBuffGetPriv(struct ChanRecvHeader *header);
    unsigned short uChanRecvBuffGetRead(struct ChanRecvHeader *recv);
    void uChanRecvBuffAddRead(struct ChanUserHeader *header, struct ChanRecvHeader *recv, unsigned short add);
    void uChanSendBuffSetPriv(struct ChanSendHeader *header, void *data);
    void *uChanSendBuffGetPriv(struct ChanSendHeader *header);

    /* 3. Information structure functions */
    void *uChanGetInfo(struct ChanUserHeader *header);

Each function in each category performs a different task, and each category can be further divided according to whether the functions act on send or receive headers, which have a different structure, semantics and memory layout.

The functions can be summarized as follows:

1. Allocation map functions. These functions update the send and receive allocation bitmaps (there is nothing to stop another data structure being substituted); the semantics differ depending on the type of buffer. For send buffers, the order of allocation does not matter (although the order of transmission might), but for receive buffers, the application should Get the oldest unread buffer (following the semantics in Section 3.4). Once the application is done with a receive buffer, it should free it, and, if it manages the deallocation of send buffers itself, it should also free each send buffer once finished with it (see the sketch after this list).

2. Header functions. The internal structure of the send and receive headers is not exposed to userspace applications and libraries. The main reason is that exposing it would make it impossible to update or optimize either header without recompiling every application (or, more commonly, the network stack library) that has ever used network channels, which would reduce the flexibility we have to optimize the internal structure. These functions therefore get and set the various fields of the structures. There are more fields in the receive header, because we store most of the context of a received packet in userspace anyway; the opposite holds for send headers. One use for the Chan*Buff*Priv functions is to build a retransmission list for sent packets or a backlog for received ones.

3. Information structure functions. A pointer to the information structure could easily have been a parameter of the SysChannelCreate function. However, the information contained in ChannelInfo is key to the channel’s correct operation (including the functions in the usercode library), and the channel can be passed between processes, so storing the channel’s context in the channel’s memory itself is preferable to passing a pointer around different layers of code (such as between the usercode library and the application).
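To make the allocation-map semantics in item 1 concrete, the sketch below drains a channel’s receive buffers using the prototypes listed earlier. The assumption that ChanRecvBuffGet returns NULL when no unread buffer remains, and the ProcessPayload callback, are illustrative and not part of the documented interface.

    /* Hypothetical receive loop: repeatedly take the oldest unread receive buffer,
       hand it to the application, and then return the buffer to the channel. */
    void DrainReceiveBuffers(struct ChanUserHeader *channel)
    {
        struct ChanRecvHeader *recv;

        /* Assumed: ChanRecvBuffGet returns NULL when no unread buffer remains. */
        while ((recv = ChanRecvBuffGet(channel)) != NULL) {
            int length = uChanRecvBuffLen(recv);

            ProcessPayload(recv, length);    /* hypothetical application callback */

            /* Mark the buffer as free so the kernel can reuse it. */
            uChanRecvBuffFree(channel, recv);
        }
    }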
