WireGuard A next generation VPN tunnel

PRESENTED BY QINGWEI “VINCENT” ZHANG MARCH 7, 2019. OTTAWA CANADA USER GROUP. Who am I

 Qingwei Zhang, a software development engineer with 5 years of experience in high technology and finance.  A previous small business entrepreneurs  Background in computer networks  Motivated to introduce a VPN that avoids the problems in both crypto and implementation

What is WireGuard?

 Layer 3 secure network tunnel for IPv4 and IPv6.  Designed for the Linux kernel

 Slower cross platform implementations.  UDP-based. Punches through firewalls.  Modern conservative cryptographic principles.  Emphasis on simplicity and auditability.  Authentication model similar to SSH’s ./.ssh/authenticated_keys.  Replacement for OpenVPN and IPsec.  Grew out of a stealth rootkit project.

Security Design Principle 1: Easily Auditable

OpenVPN Linux XFRM StrongSwan SoftEther WireGuard

116,730 LoC 119,363 LoC 405,894 LoC 329,853 LoC 3,771 LoC Plus OpenSSL! Plus Plus XFRM! StrongSwan!

Security Design Principle 1: Easily Auditable

WireGuard 3771 LoC

IPsec OpenVPN SoftEther (XFRM+Strongswan) 119,363 329,853LoC 419,792 LoC LoC Security Design Principle 2: Simplicity of Interface

 WireGuard presents a normal network interface:

# iplink add wg0 type WireGuard # ipaddress add 192.168.3.2/24 dev wg0 # iproute add default via wg0 # ifconfig wg0 … # iptables–A INPUT -iwg0 …

/etc/hosts.{allow,deny}, bind(), …

 Everything that ordinarily builds on top of network interfaces –like eth0or wlan0–can build on top of wg0.

Cryptokey Routing

 The fundamental concept of any VPN is an association between public keys of peers and the IP addresses that those peers are allowed to use.  A WireGuard interface has:  A private key  A listening UDP port  A list of peers  A peer:  Is identified by its public key  Has a list of associated tunnel IPs  Optionally has an endpoint IP and port

Cryptokey Routing PUBLIC KEY :: IP ADDRESS CryptokeyRouting

 Server Configure  Client Configure

[Interface] [Interface] PrivateKey= yAnz5TF+lXXJte14tji3zlMNq+hd2rYUIgJBgB3fBmk= PrivateKey= gI6EdUSYvn8ugXOt8QQD6Yc+JyiZxIhp3GInSWRfWGE= ListenPort= 41414 ListenPort= 21841

[Peer] [Peer] PublicKey= xTIBA5rboUvnH4htodjb6e697QjLERt1NAB4mZqp8Dg= PublicKey= HIgo9xNzJMWLKASShiTqIybxZ0U3wGLiUeJ1PKf8ykw=Endpoint = AllowedIPs= 10.192.122.3/32,10.192.124.1/24 192.95.5.69:41414 AllowedIPs= 0.0.0.0/0 [Peer] PublicKey= TrMvSoP4jYQlY6RIzBgbssQqY3vxI2Pi+y71lOWWXX0= AllowedIPs= 10.192.122.4/32,192.168.0.0/16 Cryptokey Routing

WireGuard: Linux kernel: WireGuard: encrypt(packet) Userspace: Ordinary Destination IP send(encrypted) send(packet) routing table address → → peer’s → wg0 which peer endpoint

WireGuard: WireGuard: Linux: WireGuard: decrypt(packe Source IP recv(encrypte Hand packet t) → which address ←→ d) to networking peer peer’s allowed stack IPs Cryptokey Routing

Makes system administration very simple.  If it comes from interface wg0 and is from your friends Bob’ tunnel IP address of 192.168.5.17, then the packet definitely came from Bob. The iptables rules are plain and clear Timers: A Stateless Interface for a Stateful Protocol

 As mentioned prior, WireGuard appears “stateless” to ; you set up your peers, and then it just works.  A series of timers manages session state internally, invisible to the user.  Every transition of the state machine has been accounted for, so there are no undefined states or transitions.  Event based. Timers

• If no session has been established for 120 seconds,send User space sends packet. handshake initiation.

No handshake response after • Resend handshake initiation. 5 seconds.

Successful authentication of • Send an encrypted empty packet after 10 seconds, if we incoming packet. don’t have anything else to send during that time.

No successfully authenticated incoming packets after 15 • Send handshake initiation. seconds. Security Design Principle 2: Simplicity of Interface

 The interface appears stateless to the system administrator.  Add an interface – wg0, wg1, wg2, … – configure its peers, and immediately packets can be sent.  If it’s not set up correctly, most of the time it will just refuse to work, rather than running insecurely: fails safe, rather than fails open.  Endpoints roam, like in mosh.  Identities are just the static public keys, just like SSH. Everything else, like session state, connections, and so forth, is invisible to admin. Demo Simple Composable Tools

 Since wg(8) is a very simple tool, that works with ip(8), other more complicated tools can be built on top.  Integration into various network managers:  OpenWRT  OpenRC netifrc  NixOS  -networkd  LinuxKit  Ubiquiti’s EdgeOS  NetworkManager Simple Composable Tools: wg-quick

 Simple shell script # wg-quick up vpn0 # wg-quick down vpn0

 /etc//vpn0.conf: [Interface] Address = 10.200.100.2 DNS = 10.200.100.1 PostDown = resolvconf -d %i PrivateKey = uDmW0qECQZWPv4K83yg26b3L4r93HvLRcal997IGlEE=

[Peer] PublicKey = +LRS63OXvyCoVDs1zmWRO/6gVkfQ/pTKEZvZ+CehO1E= AllowedIPs = 0.0.0.0/0 Endpoint = demo.wireguard.io:51820 Security Design Principle 3: Static Fixed Length Headers

 All packet headers have fixed width fields, so no parsing is necessary.  Eliminates an entire class of vulnerabilities.  No parsers → no parser vulnerabilities.  Quite a different approach to formats like ASN.1/X.509 or even variable length IP and TCP packet headers. Security Design Principle 4: Static Allocations and Guarded State

 All state required for WireGuard to work is allocated during config.  No memory is dynamically allocated in response to received packets.

 Eliminates another entire classes of vulnerabilities.

 Places an unusual constraint on the crypto, since we are operating over a finite amount of preallocated memory.  No state is modified in response to unauthenticated packets.

 Eliminates yet another entire class of vulnerabilities.

 Also places unusual constraints on the crypto. Security Design Principle 5: Stealth

 Some aspects of WireGuard grew out of akernel rootkit project.  Should not respond to any unauthenticated packets.  Hinder scanners and service discovery.  Service only responds to packets with correct crypto.  Not chatty at all.

 When there’s no data to be exchanged, both peers become silent. Security Design Principle 6: Solid Crypto

 We make use of Noise Protocol Framework – noiseprotocol.org

 WireGuard was involved early on with the design of Noise, ensuring it could do what we needed.

 Custom written very specific implementation of Noise_IKpsk2 for the kernel.

 Related in spirit to the Protocol.  The usual list of modern desirable properties you’d want from an authenticated key exchange  Modern primitives: Curve25519, Blake2s, ChaCha20,  Lack of cipher agility! (Opinionated.) Security Design Principle 6: Solid Crypto

 Strong key agreement & authenticity  Key-compromise impersonation resistance  Unknown key-share attack resistance  Key secrecy  Forward secrecy  Session uniqueness  Identity hiding  Replay-attack prevention, while allowing for network packet reordering Crypto Designed for Kernel

 Design goals of guarded memory safety, few allocations, etc have direct effect on cryptography used.

 Ideally be 1-RTT.  Fast crypto primitives.  Clear division between slowpath for ECDH and fastpath for symmetric crypto.  Handshake in kernel space, instead of punted to userspace daemon like IKE/IPsec.

 Allows for more efficient and less complex protocols.

 Exploit interactions between handshake state and packet encryption state. Multicore Cryptography

 Encryption and decryption of packets can be spread out to all cores in parallel.  Nonce/sequence number checking, netif_rx, and transmission must be done in serial order.  Requirement: fast for single flow traffic in addition to multiflow traffic.  Different from usual assumptions. Multicore Cryptography

 Single queue, shared by all CPUs, rather than queue per CPU

 No reliance on process scheduler, which tends to add latency when waiting for packets to complete

 Serial transmission queue waits on ordered completion of parallel queue items

 Using netif_receive_skb instead of netif_rx to push back on encryption queue  Bunching bundles of packets together to be encrypted on one CPU results in high performance gains

 How to choose the size of the bundle? Multicore Cryptography Performance Performance

 Being in kernel space means that it is fast and low latency.  No need to copy packets twice between user space and kernel space.  ChaCha20Poly1305 is extremely fast on nearly all hardware, and safe.  AES-NI is fast too, obviously, but as Intel and ARM vector instructions become wider and wider, ChaCha is handedly able to compete with AES-NI, and even perform better in some cases.  AES is exceedingly difficult to implement performantly and safely (no cache-timing attacks) without specialized hardware.  ChaCha20 can be implemented efficiently on nearly all general purpose processors.  Simple design of WireGuard means less overhead, and thus better performance.  Less code → Faster program? Not always, but in this case, certainly. Measurements Confluence of Principles → The Key Exchange The Key Exchange

 The key exchange designed to keep our principles static allocations, guarded state, fixed length headers, and stealthiness.  In order for two peers to exchange data, they must first derive ephemeral symmetric crypto session keys from their static public keys.  Either side can reinitiate the handshake to derive new session keys.

 So initiator and responder can “swap” roles.  Invalid handshake messages are ignored, maintaining stealth The Key Exchange: (Elliptic Curve) Diffie-Hellman Review

private A = random() public A = derive_public(private A)

private B = random() public B = derive_public(private B)

ECDH(private A, public B) == ECDH(private B, public A)