Why protocol stacks should be in user-space?

Michio Honda (NEC Europe), Joao Taveira Araujo (UCL), Luigi Rizzo (Università di Pisa), Costin Raiciu (Universitatea Politehnica București), Felipe Huici (NEC Europe)

IETF87 MPTCP WG, July 30, 2013, Berlin

Motivation

• Extending layer 4 functionality addresses a lot of problems
  – Increased performance
    • MPTCP, WindowScale, FastOpen, TLP, PRR
  – Ubiquitous encryption
    • TcpCrypt

Is it really possible to deploy layer 4 extensions?

• Networks still accommodate TCP extensions
  – 86% of the paths are usable for well-designed TCP extensions

Protocol stacks in end systems

• OSes implement stacks
  – High performance
  – Isolation between applications
  – Socket API
• New OS versions adopt new protocols/extensions

Extending protocol stacks: reality

• OSes’ release cycle is slow
• Support in the newest OS version does not imply deployment
  – Stakeholders are reluctant to upgrade their OS
  – Disabling a new feature by default also hinders timely deployment

How long does deployment take?

[Figure: ratio of flows (0.00 to 1.00) carrying the SACK, Timestamp and Windowscale options, inbound and outbound, by date from 2007 to 2012]

* Analyzed in traffic traces monitored at a single transit link in Japan (MAWI traces)

  – SACK has been enabled by default since Windows 2000
  – WS and TS have been implemented since Windows 2000, but enabled by default only since Windows Vista

• To ease upgrades, we need to move protocol stacks up into user-space
• Problem: we don’t have a practical way to do this while providing
  – Isolation between applications
  – Support for legacy applications and the OS’s stack
  – High performance

MultiStack: support for user-space stacks

[Figure: MultiStack architecture. User space: App 1 ... App N, each on its own stack (Stack 1 ... Stack N), plus legacy apps on the Socket API; example bindings shown are TCP port 22, TCP port 80 and UDP port 53. Kernel: the OS's stack and MultiStack, which exposes virtual ports through the Netmap API and multiplexes/demultiplexes packets by 3-tuple above the NIC.]

• Support for multiple stacks (including the OS’s stack)
• Namespace isolation based on the traditional 3-tuple
• Very high performance
• Runs in FreeBSD and Linux
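The 3-tuple namespace isolation above can be illustrated with a small model. This is only a sketch under stated assumptions, not MultiStack's code: the names `ThreeTuple`, `Mux`, `register`, `demux_rx`, `validate_tx` and `OS_STACK_PORT` are hypothetical, and sending unregistered traffic to the OS's stack is an assumption made for the example. It shows one table, keyed by (protocol, local address, local port), being used in both directions: to pick the virtual port for a received packet, and to check that a stack only transmits from a 3-tuple it owns.

```python
from dataclasses import dataclass
from typing import Dict

# Hypothetical id of the virtual port attached to the OS's stack.
OS_STACK_PORT = 0


@dataclass(frozen=True)
class ThreeTuple:
    """Namespace key: protocol, local IP address, local port."""
    proto: str
    addr: str
    port: int


class Mux:
    """Toy model of a multiplexer between virtual ports and the NIC."""

    def __init__(self) -> None:
        self._owner: Dict[ThreeTuple, int] = {}

    def register(self, key: ThreeTuple, vport: int) -> bool:
        """Bind a 3-tuple to a virtual port; refuse if it is already taken."""
        if key in self._owner:
            return False
        self._owner[key] = vport
        return True

    def demux_rx(self, dst: ThreeTuple) -> int:
        """RX: choose the virtual port from the packet's destination 3-tuple.

        Assumption: anything not registered goes to the OS's stack.
        """
        return self._owner.get(dst, OS_STACK_PORT)

    def validate_tx(self, vport: int, src: ThreeTuple) -> bool:
        """TX: accept a packet only if the sending port owns its source 3-tuple.

        Only user-space stack ports are modelled here.
        """
        return self._owner.get(src) == vport


# Example bindings, using the ports from the figure and a documentation address.
mux = Mux()
ssh = ThreeTuple("tcp", "192.0.2.1", 22)
web = ThreeTuple("tcp", "192.0.2.1", 80)
dns = ThreeTuple("udp", "192.0.2.1", 53)
mux.register(ssh, vport=1)  # Stack 1 owns TCP port 22
mux.register(web, vport=2)  # Stack 2 owns TCP port 80
```

In this model a stack on virtual port 2 cannot send with source port 22, and a packet for UDP port 53, which no user-space stack registered, is handed to the OS's stack.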


MultiStack performance

[Figure: throughput (Gbps) versus packet size (64, 128, 256 and 512 bytes) for 1, 2 and 4 cores, compared with line rate; one panel for TX and one for RX]

TX
• The app creates every packet from scratch and sends it to the kernel
• MultiStack validates the source 3-tuple of every packet and copies the packet to the NIC’s TX buffer

RX
• MultiStack receives a packet
• It identifies the destination 3-tuple of the packet
• It delivers the packet to the corresponding app/stack

Performance with stacks

• A simple HTTP server on top of our work-in-progress user-space TCP (UTCP)
• The same app running on top of the OS’s TCP

[Figure: throughput (Gbps) versus fetch size (1, 8, 16 and 32 KB) for nginx-TSO, OSTCP-TSO, OSTCP and UTCP]

• Client establishes a TCP connection and sends an HTTP GET
• Server replies with HTTP OK (1-32 KB)
• A single TCP connection is used for a single HTTP transaction

Conclusion

• Rekindle user-space stacks for widespread, timely deployment of new protocols/extensions
• Ongoing work:
  – Improving the implementation of MultiStack
  – Building a complete user-space TCP implementation
  – Integrating user-space stacks into networking libraries like libevent and

MultiStack performance (many apps/stacks)

[Figure: throughput (Gbps) versus packet size (64, 128, 256 and 512 bytes) for 8, 16 and 64 ports, compared with line rate; one panel for TX and one for RX]

• Performance is slightly lower with many ports because fewer packets are taken in a single system call
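To see why the number of packets taken per system call matters, a rough calculation helps. The sketch below assumes a 10 Gbit/s Ethernet link (suggested by the 0 to 10 Gbps axis, but not stated on the slide) and that the packet sizes on the axis are Ethernet frame sizes including the CRC. The function names and the cost model are hypothetical and only illustrate amortization; they are not measurements from this work.

```python
# Per-frame Ethernet overhead on the wire: 8 bytes of preamble and
# start-of-frame delimiter plus a 12-byte inter-frame gap.
ETH_OVERHEAD_BYTES = 20


def line_rate_pps(link_bps: float, frame_bytes: int) -> float:
    """Maximum frames per second on an Ethernet link for a given frame size."""
    return link_bps / ((frame_bytes + ETH_OVERHEAD_BYTES) * 8)


def per_packet_cost_ns(syscall_ns: float, packet_ns: float, batch: int) -> float:
    """Average cost per packet when one system call moves `batch` packets.

    Hypothetical model: a fixed cost per system call is shared by the batch,
    and a fixed cost per packet is always paid.
    """
    if batch < 1:
        raise ValueError("batch must be at least 1")
    return packet_ns + syscall_ns / batch


# Assumed 10 Gbit/s link; 64-byte frames give about 14.88 million frames/s.
LINK_BPS = 10e9
rates = {size: line_rate_pps(LINK_BPS, size) for size in (64, 128, 256, 512)}
```

With small frames the link delivers many millions of packets per second, so the fixed cost of a system call has to be spread over a large batch. When traffic is divided among many ports, each call on a given port finds fewer packets waiting, and the per-packet share of that fixed cost rises.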