Wait-Free Code Patching of Multi-Threaded Processes

From Global to Local Quiescence: Wait-Free Code Patching of Multi-Threaded Processes

Florian Rommel, Christian Dietrich, Daniel Friesel, Marcel Köppen, Christoph Borchert, Michael Müller, Olaf Spinczyk, Daniel Lohmann
Leibniz Universität Hannover, Universität Osnabrück
OSDI '20, 2020-11-05


Dynamic Software Updating

- Apply updates during run time
- High availability: service quality must not decrease
- Expensive reboot: e.g., applications with large run-time state
- Prime example: operating systems → Linux kernel (Ksplice, kGraft)
- Userspace applications → DSU rarely used in practice


Example: OpenLDAP

[Figure: OpenLDAP server with one listener thread and several worker threads. The listener dispatches client requests; the workers compute the responses.]


Example: OpenLDAP (global quiescence)

Worker and patcher threads:

    void worker_thread() {
        while (1) {
            wait_for_work();
            do_work();

            // quiescence point
            if (patch_pending()) {
                barrier();
                wait_for_patch();
            }
        }
    }

    void patcher_thread() {
        while (1) {
            wait_for_patch_request();
            set_patch_pending();
            barrier();
            apply_patch();
            reset_patch_pending();
            resume_workers();
        }
    }

Global quiescence: all workers must be in the barrier before patching.

Code reached from do_work(), buggy version:

    void rwm_op_rollback( Operation *op, SlapReply *rs, rwm_op_state *ros ) {
        ...
        op->o_tmpfree( ros->mapped_attrs, op->o_tmpmemctx );
        filter_free_x( op, op->ors_filter, 1 );
        op->o_tmpfree( op->ors_filterstr.bv_val, op->o_tmpmemctx );
        op->ors_attrs = ros->ors_attrs;
        op->ors_filter = ros->ors_filter;
        op->ors_filterstr = ros->ors_filterstr;
        ...
    }

Patched version:

    void rwm_op_rollback( Operation *op, SlapReply *rs, rwm_op_state *ros ) {
        ...
        if ( op->ors_filter != ros->ors_filter ) {
            filter_free_x( op, op->ors_filter, 1 );
            op->ors_filter = ros->ors_filter;
        }
        if ( op->ors_filterstr.bv_val != ros->ors_filterstr.bv_val ) {
            op->o_tmpfree( op->ors_filterstr.bv_val, op->o_tmpmemctx );
            op->ors_filterstr = ros->ors_filterstr;
        }
        ...
    }


Global Quiescence

The to-be-patched code is not active in any thread.

Problems:
(1) Long calculations
(2) I/O operations
(3) Inter-thread dependencies

[Figure: two plots over "Response Time relative to Patch Request [s]" (0.0 to 1.0). Top: Responses [1/s]. Bottom: Max. Latency [ms].]


Global Quiescence → MariaDB: Transaction Locks

Problem (3), inter-thread dependencies:

[Figure: timeline of two threads. Thread 1: tx_lock(), then barrier(). Thread 2: tx_lock(). Result: deadlock.]


Kernelspace

- Ksplice: probe for quiescence instead of waiting in a barrier
  → Patch may never get applied
- kGraft, DynAMOS: keep patched and unpatched functions in parallel and decide on a per-thread basis which version to use
  → Global quiescence becomes local quiescence
  → Problems: kernel-specific, performance penalty


Local Quiescence

Basic idea: patch threads independently from each other.

Wait-free code patching via address-space generations:
- OS extension for run-time modification in multi-threaded processes
- AS generations: multiple views of an address space
- Thread-local quiescence
- Thread-by-thread migration between AS generations
→ Implementation in the Linux kernel

[Figure: Responses [1/s] and Max. Latency [ms] over "Response Time relative to Patch Request [s]", for global quiescence and local quiescence.]

→ Inter-thread dependencies: no more deadlocks


Address-Space Generations

[Figure: one address space with Generation 0 and Generation 1.
- wf_create(): creates Generation 1 next to Generation 0
- wf_pin(): the text segment is a copy-on-write mapping and is patched in Generation 1
- data ... stack: shared mappings
- wf_migrate(): moves the threads (th1, th2, th3) between generations
- wf_delete(): removes a generation]


Example: OpenLDAP (local quiescence)

The barrier is replaced by a per-thread migration; rwm_op_rollback() (buggy and patched) is the same as in the global-quiescence example.

    void worker_thread() {
        while (1) {
            wait_for_work();
            do_work();

            // quiescence point
            if (migration_pending()) {
                wf_migrate();
            }
        }
    }

    void patcher_thread() {
        while (1) {
            wait_for_patch_request();
            wf_create();
            wf_migrate();
            apply_patch();
            set_migration_pending();
        }
    }


Implementation in the Linux Kernel

- wf_create: clones the memory map (MM) = AS generation; like fork() but without COW
- wf_pin: however, pinned mappings use COW
- wf_migrate: changes the thread's MM pointer; context switch
- Page faults, mapping changes: synchronized on all MMs
- Master MM: lazy page initialization, locking proxy

[Figure: Master MM and cloned MM, each with text (COW), data & bss, heap, file mappings, and stack. Threads 1 to 3 each hold an *MM pointer to one of the MMs.]


Evaluation: Patches

Debian 10.0 packages and Debian patches (except MariaDB). Each cell gives the number of patches, with the number of CVE patches in parentheses. A "?" marks a digit that is illegible in the source; the percentages attached to the two filter steps are illegible as well.

                                  OpenLDAP  Apache   Memcached  Samba  MariaDB  Node.js
    Patches (CVE)                 13 (2)    10 (10)  1 (1)      2 (2)  74 (26)  4 (0)
    Restrict to code-only patches
      text-only                   9 (2)     7 (?)    1 (1)      2 (2)  67 (2?)  4 (0)
    Generate patches via Kpatch
      kpatch-able                 9 (2)     7 (?)    1 (1)      2 (2)  16 (5)   0 (0)


Evaluation: Request Latencies

[Figure: six histogram pairs (number of requests over latency [ms], logarithmic y-axis), one pair per benchmark, each comparing global quiescence with local quiescence.]

P99.5 latency as annotated in the histograms:

    Benchmark                     Global quiescence   Local quiescence
    OpenLDAP (request latency)    143.52 ms           9.56 ms
    Apache (request latency)      601.00 ms           541.00 ms
    Memcached (request latency)   855.90 ms           32.38 ms
    Samba (file I/O latency)      760.68 ms           55.69 ms
    MariaDB (request latency)     323.62 ms           7.84 ms
    Node.js (request latency)     236.08 ms           243.15 ms
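The thread-local quiescence pattern from the OpenLDAP example can be tried out in plain userspace C. The sketch below is only an illustration under stated assumptions: the wf_* calls from the slides are a kernel extension and are not available on a stock system, so a per-thread function pointer stands in for an address-space generation here (closer in spirit to the per-thread version selection the slides attribute to kGraft and DynAMOS). All names (`do_work_v0`, `do_work_v1`, `newest_generation`, `run_local_quiescence_demo`) are hypothetical and do not come from the talk.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define MAX_WORKERS 16

/* Two versions of the same function stand in for two generations.
 * Hypothetical stand-ins, not the real OpenLDAP code. */
static int do_work_v0(int x) { return x + 1; } /* "buggy"   */
static int do_work_v1(int x) { return x + 2; } /* "patched" */

typedef int (*work_fn)(int);
static const work_fn generations[2] = { do_work_v0, do_work_v1 };

static atomic_int newest_generation; /* published by the patcher        */
static atomic_int migrated_threads;  /* workers that switched versions  */
static atomic_int stop_workers;      /* demo shutdown flag              */

static void *worker_thread(void *arg)
{
    int my_generation = 0; /* thread-local view, like a per-thread MM pointer */
    long sum = 0;
    (void)arg;

    while (!atomic_load(&stop_workers)) {
        sum += generations[my_generation](1); /* do_work() */

        /* Quiescence point: migrate independently, never wait for others. */
        int newest = atomic_load(&newest_generation);
        if (newest != my_generation) {
            my_generation = newest;
            atomic_fetch_add(&migrated_threads, 1);
        }
    }
    (void)sum;
    return NULL;
}

/* Starts the workers, publishes generation 1, and returns how many
 * workers migrated. No worker ever blocks in a barrier. */
int run_local_quiescence_demo(int n_workers)
{
    pthread_t tids[MAX_WORKERS];
    int started = 0;

    if (n_workers < 1) n_workers = 1;
    if (n_workers > MAX_WORKERS) n_workers = MAX_WORKERS;

    atomic_store(&newest_generation, 0);
    atomic_store(&migrated_threads, 0);
    atomic_store(&stop_workers, 0);

    for (int i = 0; i < n_workers; i++) {
        if (pthread_create(&tids[started], NULL, worker_thread, NULL) != 0)
            break;
        started++;
    }

    /* "Patcher": make the patched version available to everyone. */
    atomic_store(&newest_generation, 1);

    /* Demo only: the caller (not the workers) waits until every worker
     * has passed its quiescence point once, then shuts the demo down. */
    while (atomic_load(&migrated_threads) < started)
        sched_yield();

    atomic_store(&stop_workers, 1);
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    assert(atomic_load(&migrated_threads) == started);
    return atomic_load(&migrated_threads);
}
```

Calling `run_local_quiescence_demo(4)` from a `main` function and compiling with `-pthread` should report four migrated workers. The point of the design is that a slow or blocked worker delays only its own migration; unlike the barrier variant, it cannot stall the other workers or the patcher's publication of the new version.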
