From Global to Local Quiescence: Wait-Free Code Patching of Multi-Threaded Processes
Florian Rommel, Christian Dietrich, Daniel Friesel, Marcel Köppen, Christoph Borchert, Michael Müller, Olaf Spinczyk, Daniel Lohmann
Leibniz Universität Hannover, Universität Osnabrück
2020-11-05

Dynamic Software Updating
Apply updates during the run time
High Availability: service quality must not decrease
Expensive Reboot: e.g., applications with large run-time state
Prime Example: Operating Systems → Linux Kernel (Ksplice, kGraft)
Userspace Applications? → DSU rarely used in practice
2020-11-05, OSDI '20, From Global to Local Quiescence: Wait-Free Code Patching of Multi-Threaded Processes, Florian Rommel

Example: OpenLDAP
[Figure: OpenLDAP server. Clients connect to a listener thread, which dispatches the requests to worker threads that compute the responses.]

Example: OpenLDAP
    void worker_thread() {
        while (1) {
            wait_for_work();
            do_work();

            // quiescence point
            if (patch_pending()) {
                barrier();
                wait_for_patch();
            }
        }
    }

    void patcher_thread() {
        while (1) {
            wait_for_patch_request();
            set_patch_pending();
            barrier();
            apply_patch();
            reset_patch_pending();
            resume_workers();
        }
    }

Global Quiescence: All workers must be in the barrier before patching

do_work(), buggy:

    void rwm_op_rollback( Operation *op, SlapReply *rs, rwm_op_state *ros ) {
        ...
        op->o_tmpfree( ros->mapped_attrs, op->o_tmpmemctx );
        filter_free_x( op, op->ors_filter, 1 );
        op->o_tmpfree( op->ors_filterstr.bv_val, op->o_tmpmemctx );
        op->ors_attrs = ros->ors_attrs;
        op->ors_filter = ros->ors_filter;
        op->ors_filterstr = ros->ors_filterstr;
        ...
    }

do_work(), patched:

    void rwm_op_rollback( Operation *op, SlapReply *rs, rwm_op_state *ros ) {
        ...
        if ( op->ors_filter != ros->ors_filter ) {
            filter_free_x( op, op->ors_filter, 1 );
            op->ors_filter = ros->ors_filter;
        }
        if ( op->ors_filterstr.bv_val != ros->ors_filterstr.bv_val ) {
            op->o_tmpfree( op->ors_filterstr.bv_val, op->o_tmpmemctx );
            op->ors_filterstr = ros->ors_filterstr;
        }
        ...
    }

Global Quiescence
The to-be-patched code is not active in any thread

Problems
#1: Long Calculations
#2: I/O Operations
#3: Inter-Thread Dependencies

[Figure: Responses [1/s] (top) and Max. Latency [ms] (bottom), each plotted over Response Time relative to Patch Request [s]]

Global Quiescence

Problems
#1: Long Calculations
#2: I/O Operations
#3: Inter-Thread Dependencies
    → MariaDB: Transaction Locks

[Figure: timeline of Thread 1 (tx_lock(), then barrier()) and Thread 2 (tx_lock()), ending in a deadlock]

Kernelspace
Ksplice: Probe for quiescence instead of waiting in a barrier
→ Patch may never get applied
kGraft, DynAMOS: Keep patched and unpatched functions in parallel
Decide on a per-thread basis which version to use
Global quiescence → local quiescence
→ Problems: kernel-specific, performance penalty

Local Quiescence
Basic Idea: Patching threads independently from each other.
Wait-Free Code Patching via Address-Space Generations
OS extension for run-time modification in multi-threaded processes
AS generations: Multiple views of an address space
Thread-local quiescence
Thread-by-thread migration between AS generations
→ Implementation in the Linux Kernel

Local Quiescence
Inter-Thread Dependencies → No more deadlocks!

[Figure: Responses [1/s] (top) and Max. Latency [ms] (bottom) over Response Time relative to Patch Request [s], comparing Global Quiescence and Local Quiescence]

Address-Space Generations

[Figure: wf_create() adds Generation 1 next to Generation 0 of the address space. The text segment, marked with wf_pin(), is a copy-on-write mapping and is patched in Generation 1; data & stack are a shared mapping.]

Address-Space Generations

[Figure: Generation 0 and Generation 1 of the address space, managed with wf_create() and wf_delete(). The text segment, marked with wf_pin(), is a copy-on-write mapping and is patched in Generation 1; data & stack are a shared mapping. The threads th1, th2, and th3 switch generations with wf_migrate().]

Example: OpenLDAP
    void worker_thread() {
        while (1) {
            wait_for_work();
            do_work();

            // quiescence point
            if (migration_pending()) {
                wf_migrate();
            }
        }
    }

    void patcher_thread() {
        while (1) {
            wait_for_patch_request();
            wf_create();
            wf_migrate();
            apply_patch();
            set_migration_pending();
        }
    }

do_work(), buggy:

    void rwm_op_rollback( Operation *op, SlapReply *rs, rwm_op_state *ros ) {
        ...
        op->o_tmpfree( ros->mapped_attrs, op->o_tmpmemctx );
        filter_free_x( op, op->ors_filter, 1 );
        op->o_tmpfree( op->ors_filterstr.bv_val, op->o_tmpmemctx );
        op->ors_attrs = ros->ors_attrs;
        op->ors_filter = ros->ors_filter;
        op->ors_filterstr = ros->ors_filterstr;
        ...
    }

do_work(), patched:

    void rwm_op_rollback( Operation *op, SlapReply *rs, rwm_op_state *ros ) {
        ...
        if ( op->ors_filter != ros->ors_filter ) {
            filter_free_x( op, op->ors_filter, 1 );
            op->ors_filter = ros->ors_filter;
        }
        if ( op->ors_filterstr.bv_val != ros->ors_filterstr.bv_val ) {
            op->o_tmpfree( op->ors_filterstr.bv_val, op->o_tmpmemctx );
            op->ors_filterstr = ros->ors_filterstr;
        }
        ...
    }
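
The control flow of this slide can be imitated in plain user space, without the kernel extension. The sketch below is a simulation under loud assumptions: a "generation" is just a slot in a function-pointer table instead of a separate view of the address space, and sim_create_patched_generation, run_local_quiescence_demo, and the other names are hypothetical stand-ins, not the wf_* interface itself. What it does show is the key property: the patcher publishes the new generation and returns immediately, and every worker switches on its own at its next quiescence point.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define NUM_WORKERS 4
#define MAX_GENERATIONS 4
#define PATCHED_CALLS_PER_WORKER 100

typedef int (*work_fn)(int);

static int work_buggy(int x)   { return x + 1; }  /* old code */
static int work_patched(int x) { return x + 2; }  /* new code */

/* Simulated "text segment" of each generation: one function pointer. */
static work_fn generation_text[MAX_GENERATIONS] = { work_buggy };
static atomic_int latest_generation;   /* newest published generation */
static atomic_int migrated_workers;    /* how many workers have switched */

/* Stand-in for wf_create() + apply_patch() + set_migration_pending(). */
static int sim_create_patched_generation(work_fn patched) {
    int next = atomic_load(&latest_generation) + 1;
    assert(next < MAX_GENERATIONS);
    generation_text[next] = patched;          /* prepare the new generation */
    atomic_store(&latest_generation, next);   /* publish it; never blocks */
    return next;
}

static void *worker_thread(void *arg) {
    (void)arg;
    int my_generation = 0;                    /* thread-local view */
    int patched_calls = 0;
    while (patched_calls < PATCHED_CALLS_PER_WORKER) {
        if (generation_text[my_generation](0) == 2)   /* do_work() */
            patched_calls++;

        /* quiescence point: only this thread has to be quiescent */
        int latest = atomic_load(&latest_generation);
        if (latest != my_generation) {        /* migration_pending() */
            my_generation = latest;           /* stand-in for wf_migrate() */
            atomic_fetch_add(&migrated_workers, 1);
        }
    }
    return NULL;
}

/* Returns the number of workers that migrated to the patched generation. */
int run_local_quiescence_demo(void) {
    pthread_t workers[NUM_WORKERS];

    atomic_store(&latest_generation, 0);
    atomic_store(&migrated_workers, 0);
    for (int i = 0; i < NUM_WORKERS; i++) {
        int rc = pthread_create(&workers[i], NULL, worker_thread, NULL);
        assert(rc == 0);
        (void)rc;
    }

    /* The patcher does not wait for any worker here. */
    sim_create_patched_generation(work_patched);

    for (int i = 0; i < NUM_WORKERS; i++)
        pthread_join(workers[i], NULL);
    return atomic_load(&migrated_workers);
}
```

Unlike the barrier variant, no thread ever waits for another one, so a slow or blocked worker only postpones its own migration.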

Implementation in the Linux Kernel

wf_create    Clones the memory map (MM) = AS generation
             like fork() but without COW
wf_pin       However, pinned mappings use COW
wf_migrate   Changes the thread’s MM pointer
             Context switch

[Figure: Memory Map (MM) and Cloned MM created by wf_create, each with text (COW), data & bss, heap, file mappings, and stack; Thread 1, Thread 2, and Thread 3 each hold an *MM pointer.]

Implementation in the Linux Kernel

wf_create         Clones the memory map (MM) = AS generation
                  like fork() but without COW
wf_pin            However, pinned mappings use COW
wf_migrate        Changes the thread’s MM pointer
                  Context switch
Mapping Changes   Synchronized on all MMs
                  Master MM: Lazy page initialization, Locking proxy

[Figure: Master MM and Cloned MM created by wf_create, each with text (COW), data & bss, heap, file mapping, and stack; page faults on the heap are shown for both MMs; Thread 1, Thread 2, and Thread 3 each hold an *MM pointer.]

Evaluation: Patches
Debian 10.0 packages and Debian patches (except MariaDB)

                 OpenLDAP   Apache    Memcached   Samba   MariaDB   Node.js
Patches (CVE)    13 (2)     10 (10)   1 (1)       2 (2)   74 (26)   4 (0)
text-only        9 (2)      7 (7)     1 (1)       2 (2)   67 (24)   4 (0)
kpatch’able      9 (2)      7 (7)     1 (1)       2 (2)   16 (5)    0 (0)

Restrict to code-only patches (text-only): 87% (88%)
Generate patches via Kpatch (kpatch’able): 39% (47%)
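
The slide does not state the base of the two percentages. Summing the Patches (CVE), text-only, and kpatch’able counts over the six applications suggests that each percentage is relative to the preceding row, with the CVE subset in parentheses. The following sketch merely redoes that arithmetic; the array and function names are made up for illustration.

```c
#include <assert.h>

#define NUM_APPS 6

/* Column order: OpenLDAP, Apache, Memcached, Samba, MariaDB, Node.js */
static const int patches_all[NUM_APPS] = {13, 10, 1, 2, 74, 4};
static const int patches_cve[NUM_APPS] = { 2, 10, 1, 2, 26, 0};
static const int text_all[NUM_APPS]    = { 9,  7, 1, 2, 67, 4};
static const int text_cve[NUM_APPS]    = { 2,  7, 1, 2, 24, 0};
static const int kpatch_all[NUM_APPS]  = { 9,  7, 1, 2, 16, 0};
static const int kpatch_cve[NUM_APPS]  = { 2,  7, 1, 2,  5, 0};

/* Sum of one table row over all applications. */
static int row_sum(const int *row) {
    int total = 0;
    for (int i = 0; i < NUM_APPS; i++)
        total += row[i];
    return total;
}

/* Share of `part` in `whole` in percent, rounded to the nearest integer. */
int percent_of(const int *part, const int *whole) {
    int p = row_sum(part);
    int w = row_sum(whole);
    return (100 * p + w / 2) / w;
}
```

With these inputs, 90 of 104 patches (36 of 41 CVE patches) are text-only, and 35 of those 90 (17 of 36) could be generated via Kpatch.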

Evaluation: Request Latencies

[Figure: six histograms of request latency [ms] (Samba: file I/O latency [ms]) with the number of requests on a logarithmic axis, each shown for Global Quiescence and Local Quiescence]

P99.5 latency   Global Quiescence   Local Quiescence
OpenLDAP        143.52 ms           9.56 ms
Apache          601.00 ms           541.00 ms
Memcached       855.90 ms           32.38 ms
Samba           760.68 ms           55.69 ms
MariaDB         323.62 ms           7.84 ms
Node.js         236.08 ms           243.15 ms

Evaluation: Request Latencies

[Figure: OpenLDAP, histogram of request latency [ms]. P99.5: 143.52 ms with Global Quiescence, 9.56 ms with Local Quiescence]

Evaluation: Request Latencies

[Figure: MariaDB, histogram of request latency [ms]. P99.5: 323.62 ms with Global Quiescence, 7.84 ms with Local Quiescence]

Evaluation: Overhead
Run Time (microbenchmarks under load):
wf_create 88±23 µs (Memcached) to 2171±139 µs (Node.js)
wf_migrate 5±5 µs (Samba) to 8±7 µs (Node.js)
Memory (under load): 132 KiB (Memcached) to 1808 KiB (Node.js)

Future Work
Basic mechanism: Synchronized address-space clones with partial differences
Possible applications:
    Combination with JIT compiler
    Path-specific kernel modification (→ Synthesis)
    Implementation of dynamic variability (→ Multiverse)
    Address-space views for data (thread isolation)

Conclusion
Wait-Free Code Patching via Address-Space Generations
Goal: Reduce global quiescence to local quiescence
    Easier to establish
    Maintain quality of service
Approach: Synchronized address-space clones
Evaluation: 6 server applications
    Successful application of code-only patches
    Improved tail latencies during patching

Thank you for your attention.
Try it: https://www.sra.uni-hannover.de/p/wfpatch [email protected]