Software Defining System Devices with the Banana Double-Split Driver Model

Dan Williams, Hani Jamjoom (IBM Watson Research Center)
Hakim Weatherspoon (Cornell University)
Decoupling Gives Flexibility
• The cloud's flexibility comes from decoupling device functionality from physical devices, a.k.a. "virtualization"
• A VM can be placed anywhere:
  – Consolidation
  – Instantiation
  – Migration
  – Placement optimizations
Are All Devices Decoupled?
• Today: the split driver model
  – Guests don't need a device-specific driver
  – A system portion interfaces with the physical device
• But dependencies on hardware remain:
  – Presence of a device (e.g., GPU, FPGA)
  – Device-related configuration (e.g., VLAN)

[Figure: the split driver model. The front-end driver in Dom U (VM) connects through a ring buffer to the back-end driver in Dom 0 (host OS); the system portion in Dom 0 also contains the hardware driver, all running on the hypervisor of one physical machine.]

Devices Limit Flexibility
• Split driver model: dependencies break if the VM moves
• There is no easy place to plug into the hardware driver
  – The system portion is connected to it in an ad-hoc way

[Figure: the split driver model again, highlighting the ad-hoc connection between the system portion and the hardware driver in Dom 0.]
Banana Double-Split
• Clean separation between the hardware driver (Corm) and the back-end driver (Spike)
• A standard interface, the endpoint, between Corm and Spike
• Endpoints are connected with wires

[Figure: the Banana model. The front-end driver in Dom U connects through a ring buffer to the Spike in Dom 0; the Spike's endpoint is connected over a wire to the Corm's endpoint, and the Corm drives the physical device.]

Roadmap
• Banana Double-Split Driver Model
• Implementation
• Experiments
• Related Work
• Summary

(Anatomy of a Banana Plant)
[Figure: a banana plant, with the flower spike and the corm labeled.]

Corms
• One Corm per physical device
• Each Corm exposes one or more endpoints
  – Virtual NIC-like interfaces
• A Corm connects to one or more Spikes

Spikes
• One Spike per front-end driver
• Each Spike exposes an endpoint
• A Spike attaches to a Corm
• The Spike and its Corm need not share a machine! (see the sketch below)
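As a rough illustration of these cardinalities, here is a minimal C sketch; every name in it is hypothetical, since the talk does not show Banana's actual data structures:

```c
/* Hypothetical sketch of the Corm/Spike relationships described above;
 * none of these names come from the actual Banana implementation. */

#define MAX_ENDPOINTS 8

/* An endpoint is a virtual NIC-like interface on a Corm or a Spike. */
struct endpoint {
    char name[64];            /* name advertised by the endpoint controller */
    struct endpoint *peer;    /* the endpoint at the other end of the wire */
};

/* One Corm per physical device; it may expose several endpoints,
 * each wired to a different Spike. */
struct corm {
    const char *device;                  /* e.g., "eth0" */
    struct endpoint eps[MAX_ENDPOINTS];
    int nr_eps;
};

/* One Spike per guest front-end driver; it exposes exactly one
 * endpoint and attaches to a Corm, possibly on another machine. */
struct spike {
    const char *frontend;                /* e.g., "vif1.0" */
    struct endpoint ep;
};
```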
Wires
• Wires are connections between endpoints
  – E.g., a tunnel, a VPN, or a local bridge
• Each hypervisor contains an endpoint controller, which
  – Advertises endpoints (e.g., for Corms)
  – Looks up endpoints (e.g., for Spikes)
  – Sets the wire type
  – Integrates with VM migration
• Simple interface: connect/disconnect (sketched below)
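A sketch of what the controller's connect/disconnect interface might look like; the signatures below are assumptions for illustration, not the actual Banana API:

```c
/* Hypothetical endpoint-controller interface; the talk only states that
 * the interface consists of connect and disconnect. */

enum wire_type {
    WIRE_NATIVE,         /* local software bridge */
    WIRE_ENCAPSULATING,  /* in-kernel encapsulation between hosts */
    WIRE_TUNNELING,      /* VPN-based tunnel between clouds */
};

/* Look up the advertised remote endpoint (e.g., a Corm's) and wire the
 * local endpoint (e.g., a Spike's) to it with the given wire type.
 * Returns 0 on success, negative on failure. */
int endpoint_connect(const char *local_ep, const char *remote_ep,
                     enum wire_type type);

/* Tear the wire down, e.g., just before live migration; the controller
 * on the destination hypervisor re-issues the connect afterwards. */
int endpoint_disconnect(const char *local_ep);
```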
Implementation
• Implemented for Xen network devices
• Each endpoint is exposed as a netdev
• Endpoints are grafted onto existing devices via bridges
  – The Spike's endpoint is bridged to the back-end vif (vif1.0); the Corm's endpoint is bridged to the outgoing interface (eth0) or a vSwitch, reaching the remote host via eth0

[Figure: the Xen prototype. Dom U's eth0 front-end connects through a ring buffer to the vif1.0 back-end in Dom 0; a bridge grafts the Spike's endpoint onto vif1.0, a wire connects it to the Corm's endpoint, and the Corm's endpoint is bridged to the outgoing interface.]
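To make "the endpoint exposes a netdev" concrete, here is a minimal Linux kernel-module sketch, assuming an endpoint is a plain Ethernet net_device; the banana_* names are hypothetical:

```c
/* Minimal sketch of exposing an endpoint as a Linux netdev; illustrative
 * only, not Banana's actual code. */
#include <linux/module.h>
#include <linux/netdevice.h>
#include <linux/etherdevice.h>
#include <linux/skbuff.h>

static netdev_tx_t banana_xmit(struct sk_buff *skb, struct net_device *dev)
{
    /* A real endpoint would hand the frame to its wire (bridge,
     * encapsulation, or tunnel); this placeholder just drops it. */
    dev_kfree_skb(skb);
    return NETDEV_TX_OK;
}

static const struct net_device_ops banana_netdev_ops = {
    .ndo_start_xmit = banana_xmit,
};

static struct net_device *banana_dev;

static int __init banana_init(void)
{
    banana_dev = alloc_etherdev(0);
    if (!banana_dev)
        return -ENOMEM;
    banana_dev->netdev_ops = &banana_netdev_ops;
    eth_hw_addr_random(banana_dev);   /* the endpoint gets its own MAC */
    return register_netdev(banana_dev);
}

static void __exit banana_exit(void)
{
    unregister_netdev(banana_dev);
    free_netdev(banana_dev);
}

module_init(banana_init);
module_exit(banana_exit);
MODULE_LICENSE("GPL");
```

Exposing the endpoint as an ordinary netdev is what allows it to be grafted onto the existing Xen path with standard bridges, with no changes to the guest.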
Implementation
• Three types of wires:
  – Native (a local bridge)
  – Encapsulating (an in-kernel module)
  – Tunneling (OpenVPN-based)
• A /proc interface for configuring wires (example below)
• Integrated with live migration
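Configuration might then look like this from userspace; both the /proc path and the command syntax are assumptions, since the talk only says a /proc interface exists:

```c
/* Hypothetical userspace use of the /proc wire-configuration interface;
 * the path and command format are invented for illustration. */
#include <stdio.h>

int main(void)
{
    FILE *f = fopen("/proc/banana/wires", "w");   /* assumed path */
    if (!f) {
        perror("fopen");
        return 1;
    }
    /* Assumed format: <verb> <local endpoint> <remote endpoint> <wire type> */
    fprintf(f, "connect spike0 corm0 encap\n");
    return fclose(f) ? 1 : 0;
}
```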
Initial Experimentation
• The ultimate test of location independence:
  – Cross-cloud live migration
  – While the VM keeps using the same Corm
• Banana wire performance
14 Setup
• Amazon EC2 and local resources – EC2 (4XL): 33 ECUs, 23 GB memory, 10 Gbps Ethernet – Local: 12 cores @ 2.93 GHz, 24 GB memory, 1Gbps Ethernet
• Xen-blanket for nested virtualization – Dom 0: 8 vCPUs, 4 GB memory – PV guests: 4 vCPUs, 8 GB memory
• Local NFS server for VM disk images
• netperf to measure throughput and latency – 1400 byte packets
Cross-cloud Live Migration
• Continuous netperf traffic runs between two VMs while both migrate
• It works!

[Figure: throughput (Mbps) and latency (ms) over time (s), with markers for the start and end of VM1's and VM2's migrations.]

Corm Switching
• The dependency on the original Corm is eliminated after the second VM migrates
• Switching Corms can recover performance

[Figure: throughput (Mbps) and latency (ms) over time (s) while switching Corms, with VM1 and VM2 migration markers.]

Migration Numbers
• The experimental setup has overhead:
  – Nesting increases downtime by 43% (0.7 s to 1.0 s)
  – Migrating Banana endpoints adds a further 30% (1.0 s to 1.3 s)

  Mean [std. dev] local-to-local live VM migration (s):
                       Downtime    Duration
  Banana               1.3 [0.5]   20.13 [0.1]
  None (nested)        1.0 [0.5]   20.04 [0.1]
  None (not nested)    0.7 [0.4]   19.86 [0.2]

• The flexibility to do such extreme migration has a cost
  – Cross-cloud live migration had 2.8 s of downtime

  Mean [std. dev] live VM migration with Banana (s):
                       Downtime    Duration
  Local to Local       1.3 [0.5]   20.13 [0.1]
  EC2 to EC2           1.9 [0.3]   10.69 [0.6]
  Local to EC2         2.8 [1.2]   162.25 [150.0]

Encapsulating Wires
• A UDP throughput experiment
• Using a kernel module is especially important in the nested environment
  – A factor-of-1.55 improvement becomes a factor of 3.28 when nested

[Figure: UDP throughput (Mbps), non-nested vs. nested, for baseline, tinc, OpenVPN, vtun, and the Banana wire.]

Related Work
• Other cut points
  – Block tap drivers (block devices)
• Continuous access
  – Netchannel
• Specialized solutions
  – Nomad, USB/IP
• Virtual networking
  – VL2, NetLord, VIOLIN, and others

Summary
• The Banana Double-Split driver model decouples virtual devices from physical ones
  – Provides location independence
  – Device mappings are software-defined
• So far, implemented for network devices
  – Demonstrated cross-cloud live migration
• Future directions
  – A controller to manage mappings and re-wirings?
  – Corm switching?
  – Other devices?

Thank You