RedLeaf: Isolation and Communication in a Safe Operating System
Vikram Narayanan¹, Tianjiao Huang¹, David Detweiler¹, Dan Appel¹, Zhaofeng Li¹, Gerd Zellweger², Anton Burtsev¹
OSDI ’20
¹University of California, Irvine
²VMware Research

History of Isolation
[Figure: timeline with year ticks 1973, 1980, 1983, 1995, 1996, 1999, 2002, 2005; systems shown: Cedar, KaffeOS, Multics, Pilot, Scomp, SPIN, J-Kernel, Mondrian, VINO, Singularity]
• Isolation of kernel subsystems
• Final report of Multics (1976)
• Scomp (1983)
• Systems remained monolithic
• Isolation was expensive

Isolation mechanisms
• Hardware Isolation
  • Segmentation (46 cycles) [1]
  • Page table isolation (797 cycles) [2]
  • VMFUNC (396 cycles) [3]
  • Memory protection keys (20-26 cycles) [4]
• Language based isolation
  • Comparison of DPDK-style drivers written in C and in safe high-level languages (Rust, Go, C#, etc.) [5]
  • Managed runtime and garbage collection (20-50% overhead on a device-driver workload)
[1] L4 Microkernel: Jochen Liedtke
[2] https://sel4.systems/About/Performance/
[3] Lightweight Kernel Isolation with Virtualization and VM Functions, VEE 2020
[4] Hodor: Intra-process isolation for high-throughput data plane libraries
[5] The Case for Writing Network Drivers in High-Level Programming Languages, ANCS 2019

Traditional Safe languages vs Rust
Java, C# etc.
[Figure: objects A and B both reference a Vector; once A and B are gone, the Vector remains. Garbage collection?]
Rust
[Figure: A holds the Vector, then B holds it]
• Linear types
• Enforces type and memory safety
• Statically checked at compile time
• Safety without runtime garbage collection overhead
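
The Rust side of this comparison can be illustrated with a minimal sketch. The types `A` and `B`, the function `hand_over`, and the `Vec<u32>` payload are illustrative assumptions that mirror the figure; they are not code from the talk.

```rust
// Hypothetical owners mirroring the A / B / Vector figure.
struct A {
    data: Vec<u32>,
}

struct B {
    data: Vec<u32>,
}

// Moving the vector out of `a` transfers ownership to the new `B`.
// `a` is consumed by the call and cannot be used afterwards.
fn hand_over(a: A) -> B {
    B { data: a.data }
}

fn main() {
    let a = A { data: vec![1, 2, 3] };
    let b = hand_over(a);
    // println!("{:?}", a.data); // would not compile: `a` was moved
    println!("{:?}", b.data);
    // `b` goes out of scope here and the vector is freed at that point,
    // with no garbage collector involved.
}
```

The commented-out line is the interesting one: the compiler rejects any use of `a` after the move, which is the static check the bullets refer to.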

Language-based isolation - Rust
• Rust is mostly used as a drop-in replacement for C
• Numerous possibilities
  • Fault Isolation
  • Transparent device-driver recovery
  • Safe Kernel extensions
  • Fine-grained capability-based access control etc.

Fault isolation in Language-based systems
[Figure: Domain Foo and Domain Bar]
    // Domain Foo
    fn foo(obj1: &Object1, obj2: &Object2) { do_work(obj1); do_work(obj2); }
    // Domain Bar
    fn bar() { do_foo(&obj1, &obj2); }
• e.g., SPIN (using Modula-3 pointers)

Language-based isolation: Deep copy
[Figure: Domain Foo and Domain Bar]
    // Domain Foo
    fn foo(obj1: Object1, obj2: Object2) { call_other(obj1, obj2); }
    // Domain Bar
    fn bar(....) { do_work(&obj1, &obj2); }
• e.g., J-Kernel, KaffeOS

Language-based isolation: Capabilities
[Figure: Domain Foo and Domain Bar]
    // Domain Foo
    fn foo(obj1: Object1, obj2: Object2) { call_other(obj1, &obj2); }
    // Domain Bar
    fn bar(....) { do_work(&obj1, &obj2); }
• e.g., J-Kernel

Language-based isolation: Singularity
[Figure: Domain Foo and Domain Bar with a Shared Heap between them; labels: Static analysis, Verification tools]
    // Domain Foo
    fn foo(obj1: Object1, obj2: Object2) { call_other(obj1, &obj2); }
    // Domain Bar
    fn bar(....) { do_work(&obj1, &obj2); }
• Statically enforced ownership discipline
• Single ownership model
• Static analysis and verification tools (Sing#)
• Reusing the moved object returns an error
• Zero-copy

RedLeaf
Architecture
[Figure: compiler-enforced protection domains (RedLeaf User, two rv6 User domains, rv6 Core, rv6 FS, rv6 Net, NVMe Driver, Ixgbe Driver, each driver with a Trusted crate) connected through Proxies, exchanging RRefs on a Shared Heap, all on top of the Microkernel]

Fault Isolation
• After a domain crash:
  • Unwind all threads running inside
  • Subsequent invocations return error
  • All resources are deallocated
  • Other threads continue execution
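
The behavior listed above can be approximated in ordinary user-space Rust. The following is a rough sketch under stated assumptions: `DomainProxy`, `RpcError`, and `invoke` are hypothetical names, `std::panic::catch_unwind` stands in for the kernel's own unwinding, only a single calling thread is modeled, and resource deallocation is left out.

```rust
use std::panic::{self, AssertUnwindSafe};
use std::sync::atomic::{AtomicBool, Ordering};

// Hypothetical error returned to callers of a crashed domain.
#[derive(Debug, PartialEq)]
enum RpcError {
    DomainCrashed,
}

// Hypothetical proxy guarding entry into one domain.
struct DomainProxy {
    alive: AtomicBool,
}

impl DomainProxy {
    fn new() -> Self {
        DomainProxy { alive: AtomicBool::new(true) }
    }

    // Runs `f` "inside" the domain. A panic is caught at the domain
    // boundary, the domain is marked dead, and the caller gets an error.
    fn invoke<T>(&self, f: impl FnOnce() -> T) -> Result<T, RpcError> {
        if !self.alive.load(Ordering::SeqCst) {
            return Err(RpcError::DomainCrashed);
        }
        match panic::catch_unwind(AssertUnwindSafe(f)) {
            Ok(v) => Ok(v),
            Err(_) => {
                self.alive.store(false, Ordering::SeqCst);
                Err(RpcError::DomainCrashed)
            }
        }
    }
}

fn main() {
    let proxy = DomainProxy::new();
    assert_eq!(proxy.invoke(|| 1 + 1), Ok(2));

    // The domain crashes; the calling thread keeps running.
    let crashed = proxy.invoke(|| -> u32 { panic!("driver bug") });
    assert_eq!(crashed, Err(RpcError::DomainCrashed));

    // Subsequent invocations return an error without entering the domain.
    assert_eq!(proxy.invoke(|| 3), Err(RpcError::DomainCrashed));
}
```

The panic message still appears on stderr through the default panic hook, but the caller only ever observes the `Err` value.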

Heap Isolation
[Figure: Domain Foo and Domain Bar, each with its own Private Heap; obj2 is passed as an RRef]
    // Domain Foo
    fn foo(obj1: Object1, obj2: Object2) { call_other(obj1, &obj2); }
    // Domain Bar
    fn bar(....) { do_work(&obj1, &obj2); }
• Domains never hold pointers into other domains
• Special shared heap for passing objects between domains

Exchangeable types
[Figure: Domain Foo with its Private Heap and an RRef on the Shared Heap; code fragment: fn foo(obj1: Object1, obj2: RRef]
• Objects in shared heap can only be exchangeable types
• They cannot contain ordinary pointers into the shared heap or a private heap
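
One way to picture the restriction is a marker trait that gates what may be placed on the shared heap. This is a sketch only: the trait name `Exchangeable`, the explicit list of implementations, and the `Box`-backed `RRef` are assumptions made for illustration, and allowing an `RRef` inside another shared-heap object is likewise an assumption of this sketch rather than something the slide states.

```rust
// Hypothetical marker trait: only types that implement it may live on the
// shared heap. Implementations are listed explicitly in this sketch.
trait Exchangeable {}

impl Exchangeable for u8 {}
impl Exchangeable for u32 {}
impl Exchangeable for u64 {}
impl<T: Exchangeable, const N: usize> Exchangeable for [T; N] {}

// Stand-in for a reference to a shared-heap object; here a plain Box.
struct RRef<T: Exchangeable> {
    value: Box<T>,
}

// Assumption: an RRef may itself be stored inside a shared-heap object.
impl<T: Exchangeable> Exchangeable for RRef<T> {}

impl<T: Exchangeable> RRef<T> {
    fn new(value: T) -> Self {
        RRef { value: Box::new(value) }
    }

    fn get(&self) -> &T {
        &self.value
    }
}

fn main() {
    let buf = RRef::new([0u8; 512]);
    let nested = RRef::new(RRef::new(7u32));
    println!("{} {}", buf.get().len(), nested.get().get());

    // let bad = RRef::new(Box::new(1u32)); // rejected: Box<u32> is not Exchangeable
    // let bad = RRef::new(&buf);           // rejected: references are not Exchangeable
}
```

Because ordinary references and `Box` never implement the trait, the two commented-out lines fail to compile, which is the compile-time form of the second bullet.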

Ownership tracking
[Figure: Domain Foo (with its Private Heap) calls Domain Bar through a Trusted Proxy; code fragments: struct RRef, fn foo(obj1: RRef]
• RRef
• Metadata keeps track of owner domain and ref count
• Mediated through trusted proxies

Heap reclamation
[Figure: Dom A calls b.foo(x) through a Proxy into Dom B (fn foo(x: RRef); the Shared Heap object is tagged dom_id: A; y: RRef; the Microkernel sits below]
• Maintains a global registry of allocated objects
• On panic
  • Deallocate all objects owned by the crashing domain
  • Defer deallocation of borrowed RRefs until their ref count is zero
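
The reclamation rule above can be sketched with a small registry. Everything here is hypothetical and simplified for illustration: the names `Registry`, `Meta`, `on_panic`, and `release`, the integer domain and object ids, and the `deferred` list are assumptions, and no real memory is managed.

```rust
use std::collections::HashMap;

type DomainId = u32;
type ObjectId = u64;

// Hypothetical per-object metadata: owner domain and borrow count.
struct Meta {
    owner: DomainId,
    borrow_count: u32,
}

// Hypothetical global registry of shared-heap allocations.
#[derive(Default)]
struct Registry {
    objects: HashMap<ObjectId, Meta>,
    next_id: ObjectId,
    deferred: Vec<ObjectId>, // owned by a dead domain but still borrowed
}

impl Registry {
    fn alloc(&mut self, owner: DomainId) -> ObjectId {
        let id = self.next_id;
        self.next_id += 1;
        self.objects.insert(id, Meta { owner, borrow_count: 0 });
        id
    }

    fn borrow(&mut self, id: ObjectId) {
        if let Some(m) = self.objects.get_mut(&id) {
            m.borrow_count += 1;
        }
    }

    // Called when a borrow ends; frees the object if it was deferred.
    fn release(&mut self, id: ObjectId) {
        let unborrowed = match self.objects.get_mut(&id) {
            Some(m) => {
                m.borrow_count = m.borrow_count.saturating_sub(1);
                m.borrow_count == 0
            }
            None => return,
        };
        if unborrowed && self.deferred.contains(&id) {
            self.deferred.retain(|d| *d != id);
            self.objects.remove(&id);
        }
    }

    // On panic of `dom`: free its unborrowed objects, defer the rest.
    fn on_panic(&mut self, dom: DomainId) {
        let owned: Vec<ObjectId> = self
            .objects
            .iter()
            .filter(|(_, m)| m.owner == dom)
            .map(|(id, _)| *id)
            .collect();
        for id in owned {
            if self.objects[&id].borrow_count == 0 {
                self.objects.remove(&id);
            } else {
                self.deferred.push(id);
            }
        }
    }
}

fn main() {
    let mut reg = Registry::default();
    let x = reg.alloc(1); // owned by domain 1, not borrowed
    let y = reg.alloc(1); // owned by domain 1, borrowed elsewhere
    let z = reg.alloc(2); // owned by domain 2
    reg.borrow(y);

    reg.on_panic(1);
    assert!(!reg.objects.contains_key(&x)); // freed immediately
    assert!(reg.objects.contains_key(&y)); // deferred while borrowed
    assert!(reg.objects.contains_key(&z)); // other domains unaffected

    reg.release(y);
    assert!(!reg.objects.contains_key(&y)); // freed once the borrow ends
}
```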

Cross-domain call proxying
[Figure: Domain Foo (with its Private Heap) calls Domain Bar through a Trusted Proxy; code fragments: struct RRef, fn foo(obj1: RRef]
• Checks if domain is alive
• Creates continuation
• Moves ownership of all RRef

Interface validation
• Validate domain interfaces
• Generate proxies to enforce ownership discipline
• e.g., Block Device domain Interface
    pub trait BDev { fn read(&self, block: u32, data: RRef<[u8; BSIZE]>) -> RpcResult

RedLeaf
[Figure: RedLeaf architecture: compiler-enforced protection domains (RedLeaf User, rv6 User, rv6 Core, rv6 FS, rv6 Net, NVMe Driver, Ixgbe Driver) connected through Proxies, RRefs on the Shared Heap, Trusted crates, Microkernel]
• Heap isolation
• Exchangeable types
• Ownership tracking
• Interface validation
• Cross-domain call proxying

Device driver Recovery

Device driver Recovery
[Figure: Domain Foo (with its Private Heap) reaches Domain Bar through a Shadow domain and a Trusted Proxy; the shadow's proxy_bar() calls check_if_alive() and replay_init(); bar(....) calls do_work(&obj1, &obj2)]
• Support transparent device driver recovery
• Wraps the driver interface and exposes an identical interface
• Interposes on all communication
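
The shadow idea can be sketched as a wrapper that implements the same trait as the driver and retries after a restart. This is a toy model under loud assumptions: the simplified `BDev` trait here (one `read` returning a byte), the `Driver` that fails after a fixed number of calls, and the body of `replay_init` are all invented for illustration and do not reflect the actual RedLeaf interfaces.

```rust
// Hypothetical, simplified block-device interface shared by the driver
// and its shadow.
trait BDev {
    fn read(&mut self, block: u32) -> Result<u8, ()>;
}

// Toy driver that fails at a chosen call to simulate a crash.
struct Driver {
    calls_until_crash: u32,
}

impl BDev for Driver {
    fn read(&mut self, block: u32) -> Result<u8, ()> {
        if self.calls_until_crash == 0 {
            return Err(()); // stands in for "domain crashed"
        }
        self.calls_until_crash -= 1;
        Ok(block as u8)
    }
}

// The shadow exposes the same interface and interposes on every call.
struct Shadow {
    driver: Driver,
    restarts: u32,
}

impl Shadow {
    // Stand-in for re-creating the driver domain and replaying its init.
    fn replay_init(&mut self) {
        self.driver = Driver { calls_until_crash: u32::MAX };
        self.restarts += 1;
    }
}

impl BDev for Shadow {
    fn read(&mut self, block: u32) -> Result<u8, ()> {
        match self.driver.read(block) {
            Ok(v) => Ok(v),
            Err(()) => {
                self.replay_init();
                // Retry once; the caller does not see the crash.
                self.driver.read(block)
            }
        }
    }
}

fn main() {
    let mut dev = Shadow { driver: Driver { calls_until_crash: 1 }, restarts: 0 };
    assert_eq!(dev.read(5), Ok(5));
    // The driver fails on this call and is restarted behind the scenes.
    assert_eq!(dev.read(6), Ok(6));
    assert_eq!(dev.restarts, 1);
}
```

Since `Shadow` implements the same trait as `Driver`, callers written against `BDev` need no changes, which is what makes the recovery transparent in this model.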

Evaluation

System setup
• 2 x Intel E5-2660 v3 10-core CPUs at 2.60 GHz (Haswell EP)
• Disabled: Hyper-threading, Turbo boost, CPU Idle states
• Linux and DPDK benchmarks run on Linux 4.8.4
• RedLeaf benchmarks run on bare metal

Communication costs
Operation                                            Cycles
seL4                                                 834
VMFUNC                                               169
VMFUNC-based call/reply invocation                   396
RedLeaf cross-domain invocation                      124
RedLeaf cross-domain invocation (passing an RRef)    141
RedLeaf cross-domain invocation via shadow           279
RedLeaf cross-domain via shadow (passing an RRef)    297

Language overheads: C vs Rust
[Figure: TSC ratio normalized to C (0 to 1.5) against the number of elements in the hash table (2^12 to 2^26); series: idiomatic-rust-set(), idiomatic-rust-get(), c-style-rust-set(), c-style-rust-get()]
• Hashtable - (FNV hash, open addressing, <8B, 8B>)
• C, Idiomatic Rust, C-style Rust
• C-style Rust: No higher order functions, usize, usize
• Idiomatic Rust - Option<(usize, usize)>
• Vary the size (2^12 to 2^26 at 75% full)
• Idiomatic Rust - 25% overhead, C-style - Closer to C
• Cycles for a set on 2^20 elements (C - 51, idiomatic Rust - 81, C-style Rust - 48)
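
A table of the kind benchmarked here might look roughly as follows. This is a sketch in the spirit of the `Option<(usize, usize)>` variant, not the benchmark code: the choice of the 64-bit FNV-1a variant, the power-of-two capacity, and the names `Table`, `set`, and `get` are assumptions.

```rust
// 64-bit FNV-1a over the 8 bytes of the key (assumed variant).
fn fnv1a(key: usize) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325; // FNV offset basis
    for b in (key as u64).to_le_bytes() {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3); // FNV prime
    }
    h
}

// Open-addressing table with linear probing; empty slots are `None`.
struct Table {
    slots: Vec<Option<(usize, usize)>>,
}

impl Table {
    // `capacity` must be a power of two so that masking works.
    fn new(capacity: usize) -> Self {
        assert!(capacity.is_power_of_two());
        Table { slots: vec![None; capacity] }
    }

    // Inserts or overwrites; returns false only if the table is full.
    fn set(&mut self, key: usize, value: usize) -> bool {
        let mask = self.slots.len() - 1;
        let start = (fnv1a(key) as usize) & mask;
        for i in 0..self.slots.len() {
            let idx = (start + i) & mask;
            match self.slots[idx] {
                Some((k, _)) if k != key => continue, // occupied by another key
                _ => {
                    self.slots[idx] = Some((key, value));
                    return true;
                }
            }
        }
        false
    }

    fn get(&self, key: usize) -> Option<usize> {
        let mask = self.slots.len() - 1;
        let start = (fnv1a(key) as usize) & mask;
        for i in 0..self.slots.len() {
            match self.slots[(start + i) & mask] {
                Some((k, v)) if k == key => return Some(v),
                Some(_) => continue,
                None => return None, // an empty slot ends the probe sequence
            }
        }
        None
    }
}

fn main() {
    let mut t = Table::new(16);
    for k in 0..12 {
        // 12 of 16 slots: 75% full, as in the benchmark setup
        assert!(t.set(k, k * 10));
    }
    assert_eq!(t.get(5), Some(50));
    assert_eq!(t.get(99), None);
}
```

The `Option` wrapper is what distinguishes this layout from a C-style table of raw `(usize, usize)` pairs with a sentinel for empty slots.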

Case Study: Device Drivers
Configurations (each stack runs on top of the Microkernel)
• redleaf-driver: Ixgbe Driver
• redleaf-domain: RedLeaf User, Proxy, Ixgbe Driver
• rv6-domain: rv6 User, Proxy, rv6 Core, Proxy, rv6 Net, Proxy, Ixgbe Driver
• redleaf-shadow: RedLeaf User, Proxy, Shadow Driver, Proxy, Ixgbe Driver
• rv6-shadow: rv6 User, Proxy, rv6 Core, Proxy, rv6 Net, Proxy, Shadow Driver, Proxy, Ixgbe Driver

Ixgbe performance benchmark
[Figure: Pkts/s (Million), 0 to 20, for Tx-64-1, Tx-64-32, Rx-64-1, Rx-64-32; series: Linux, DPDK, redleaf-driver, redleaf-domain, redleaf-shadow, rv6-domain, rv6-shadow, line rate; annotated values: 0.89 M, 6.7 M, 6.5 M, 4 M, 2.9 M, 2.8 M, 2.4 M]
• Linux application (2921 cycles)
• DPDK (388 cycles) and RedLeaf driver (400 cycles)
• redleaf-domain and redleaf-shadow
• rv6-domain and rv6-shadow
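
The cycle counts and the packet rates on this slide appear to be two views of the same measurement. A small worked calculation, assuming the cycle counts are per-packet costs on a single core running at the nominal 2.60 GHz from the system setup, reproduces the annotated rates. The helper `pkts_per_sec` is introduced here only for illustration.

```rust
// Packets per second one core can sustain at a given per-packet cost.
fn pkts_per_sec(cpu_hz: f64, cycles_per_packet: f64) -> f64 {
    cpu_hz / cycles_per_packet
}

fn main() {
    let hz = 2.6e9; // 2.60 GHz, assumed constant (Turbo boost disabled)
    let costs = [("Linux", 2921.0), ("DPDK", 388.0), ("redleaf-driver", 400.0)];
    for (name, cycles) in costs {
        println!("{}: {:.2} Mpps", name, pkts_per_sec(hz, cycles) / 1e6);
    }
    // Prints:
    // Linux: 0.89 Mpps
    // DPDK: 6.70 Mpps
    // redleaf-driver: 6.50 Mpps
}
```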

Application Benchmarks
• Maglev load balancer
• Network-attached key-value store
• A minimal webserver

Application benchmarks: Maglev
• Load balancer developed by Google
• Lookup or insert into the flow tracking table
• Compare C vs Rust Maglev versions
[Figure: Pkts/s (Millions), 0 to 16, for maglev-32; series: Linux, DPDK, redleaf-driver, redleaf-domain, redleaf-shadow, rv6-domain, rv6-shadow, line rate; annotated values: 1 M, 9.7 M, 7.2 M, 5.3 M, 5.1 M]
• Linux Application - Limited due to synchronous socket interface
• DPDK Application (batch of 32) - 9.7 Mpps per-core
• redleaf-driver - 7.2 Mpps
• rv6-domain - 5.3 Mpps, rv6-shadow - 5.1 Mpps

Application: Key Value Store
[Figure: Pkts/s (Million), 0 to 8, for 8-8-1M, 8-8-16M, 16-64-1M, 64-64-1M, 16-64-16M, 64-64-16M; series: c-dpdk, redleaf-driver, redleaf-unsafe]
• Network attached key-value store
• FNV Hash with open addressing (linear probing)
• Rust (C-Style) vs a DPDK application
• Various Key value sizes (<8B, 8B>, <16B, 64B>, <64B, 64B>)
• Achieves 61-86% of the C DPDK performance (extend_from_slice() x 3)
• With Unsafe Rust, 85-94% of the C DPDK performance

Device driver recovery
[Figure: Read Throughput (MB/sec), 0 to 2000, and Write Throughput (MB/sec), 0 to 800, over Time (sec), 0 to 10]
• Rv6 program <-> in-memory block device
• Trigger crash every second
• Small drop in performance
  • Read: 2062 MB/s (restart), 2164 MB/s (without restart)
  • Writes: 356 MB/s (restart), 423 MB/s (without restart)

Conclusion
• Heap isolation, exchangeable types, ownership tracking, interface validation, cross-domain call proxying
• Provides a collection of mechanisms for enabling isolation
• A step forward in enabling future system architectures
• Secure kernel extensions, fine-grained access control, transparent recovery etc.
Source code
https://mars-research.github.io/redleaf/

Thank you!