Kpatch Without Stop Machine the Next Step of Kernel Live Patching
Total Page:16
File Type:pdf, Size:1020Kb
LinuxCon NA 2014 (Aug. 2014) Kpatch Without Stop Machine The Next Step of Kernel Live Patching Masami Hiramatsu <[email protected]> Linux Technology Research Center Yokohama Research Lab. Hitachi Ltd., © Hitachi, Ltd. 2014. All rights reserved. 1 Speaker • Masami Hiramatsu – A researcher, working for Hitachi • Researching many RAS features – A linux kprobes-related maintainer • Ftrace dynamic kernel event (a.k.a. kprobe-tracer) • Perf probe (a tool to set up the dynamic events) • X86 instruction decoder (in kernel) © Hitachi, Ltd. 2014. All rights reserved. 2 Agenda: Kpatch Without Stop Machine Background Story Kpatch internal Kpatch without stop_machine Conclusion and Discussion Note: this presentation is only focusing on the kernel-module side of kpatch. More generic design and implementation, please attend to - kpatch: Have Your Security And Eat It Too! – Josh Poimboeuf Aug. 22. pm2:30 © Hitachi, Ltd. 2014. All rights reserved. 3 Agenda: Kpatch Without Stop Machine Background Story What is kpatch? Live patching requirements Major updates and Minor updates Kpatch internal Kpatch Overview Active Safeness check Stop_machine Kpatch without stop_machine Live Patcing Rules Kpatch Reference Counter Safeness check without stop_machine Conclusion and Discussion © Hitachi, Ltd. 2014. All rights reserved. 4 What’s Kpatch? • Kpatch is a LIVE patching function for kernel – This applys a binary patch to kernel on-line – Patching is done without shutdown • Only for a small and critical issues – Not for major kernel update © Hitachi, Ltd. 2014. All rights reserved. 5 Why Live Patching Required? • Live patching is important for appliances for mission critical systems – Some embedded appliances are hard to maintain frequently • Those are distributed widely in country side • Not in the big data center! – Some appliances can’t accept 10ms downtime • Factory control system etc. © Hitachi, Ltd. 2014. All rights reserved. 6 But for Major Updates? • M.C. systems have periodic maintenance – Major fixes can be applied and rebooted – In between the maintenance, live patching will be used • Live patching and major update are complement each other – Live patching temporarily fixes small critical incidents – Major update permanently fixes all bugs Planned Maintenance minor critical Live Patching Major Update © Hitachi, Ltd. 2014. All rights reserved. 7 The History of Live patching on Linux • Live patching is not new 2004 2014 For Pannus Live Patch (2004-2006) Developed for CGL http://pannus.sourceforge.net/ applic No distribution support ation Ptrace and mmap based one Livepatch (2005) http://ukai.jp/Slides/2005/1202-b2con/mop/livepatch.html ksplice (2007-) For https://www.ksplice.com/ kernel Build from scratch kGraft (2014) Acquired and supported by Oracle kpatch (2014) Use ftrace to replace function Will be supported by major distributors © Hitachi, Ltd. 2014. All rights reserved. 8 Agenda: Kpatch Without Stop Machine Background Story What is kpatch? Live patching requirements Major updates and Minor updates Kpatch internal Kpatch Overview Active Safeness check Stop_machine Kpatch without stop_machine Live Patcing Rules Kpatch Reference Counter Safeness check without stop_machine Conclusion and Discussion © Hitachi, Ltd. 2014. All rights reserved. 9 Kpatch: Overview • Kpatch has 2 components – Kpatch build: Build a binary patch module – Kpatch.ko: The kernel module of Kpatch Running Original Build Current Per-function src vmlinux diff linux Patch Patch module Kpatch.ko Patched Patched Build A module which Do patch src vmlinux contains new New functions functions On old functions Done by Kpatch build Done by kpatch.ko Today’s talk © Hitachi, Ltd. 2014. All rights reserved. 10 Kpatch: How to Patch • Kpatch uses Ftrace to patch – Hook the target function entry with registers – Change regs->ip to new function (change the flow) Ftrace hooks Find the new Foo() entry IP from hash Call foo() table foo() ftrace … Call fentry Save Regs kpatch.ko … Call ftrace_ops Get new IP from Return Hash table Restore Regs Change regs->ip Ret to regs->ip Foo() is called even Ret to regs->ip Return After patching. New foo() Function pointer is Available … Ftrace returns Return To new foo() © Hitachi, Ltd. 2014. All rights reserved. 11 Conflict of Old and New Functions • Kpatch will update the execution path of a function – Q: What happen if the patched function is under executed? – A: Old and new functions are executed at the same time !!This should not happen!! • Kpatch ensures the old functions are not executed when patching – “Active Safeness Check” © Hitachi, Ltd. 2014. All rights reserved. 12 Active Safeness Check • Executing functions are on the stack • And IP register points current function too Stack at here Func1 Func1+XX Func2 Stack at here Func2+YY Func1+XX Func3 • Active Safeness Check – Do stack dump to check the target functions are not executed, for each thread. – Need to be done when the process is stopped. – stop_machine is used © Hitachi, Ltd. 2014. All rights reserved. 13 Active Safeness Check With Stop_machine • Kpatch uses stop_machine to check stacks Time Patching Add Process Ftrace entry Call old Process1 func1 Process2 Call old Process3 func2 © Hitachi, Ltd. 2014. All rights reserved. 14 Active Safeness Check With Stop_machine • Kpatch uses stop_machine to check stacks Call Stop Machine Time Patching Add Process Ftrace entry Call old Process1 func1 Process2 Call old Process3 func2 All running processes and interrupts are stopped © Hitachi, Ltd. 2014. All rights reserved. 15 Active Safeness Check With Stop_machine • Kpatch uses stop_machine to check stacks Call Stop Machine Time Patching Add Safeness Process Ftrace entry Check Call old Process1 func1 Walk through the all Thread and check Process2 Old funcs on stacks Call old Process3 func2 All running processes and interrupts are stopped © Hitachi, Ltd. 2014. All rights reserved. 16 Active Safeness Check With Stop_machine • Kpatch uses stop_machine to check stacks Call Stop Machine Return Time Patching Add Safeness Update hash Process Ftrace entry Check table Call old Process1 func1 Walk through the all Thread and check Process2 Old funcs on stacks Call New func2 Call old Call New Process3 func2 func1 Now switch to All running processes and New functions interrupts are stopped © Hitachi, Ltd. 2014. All rights reserved. 17 Stop_machine: Pros and Cons • Pros – Safe, simple and easy to review, Good for the 1st version • Cons – Stop_machine stops all processes a while • It is critical for control/network appliances – In virtual environment, this takes longer time • We need to wait all VCPUs are scheduled on the host machine © Hitachi, Ltd. 2014. All rights reserved. 18 Agenda: Kpatch Without Stop Machine Background Story What is kpatch? Live patching requirements Major updates and Minor updates Kpatch internal Kpatch and Ftrace Kpatch and Safeness check Stop_machine Kpatch without stop_machine Live Patching Rules Kpatch Reference Counter Safeness check without stop_machine Conclusion and Discussion © Hitachi, Ltd. 2014. All rights reserved. 19 Live Patching Rules • Live patching must follow the rules 1. All the new functions in a patch must be applied at once ● We need an atomic operation 2. After switching new function, the old function must not be executed ● We have to ensure no threads runs on old functions ● And no threads sleeps on them © Hitachi, Ltd. 2014. All rights reserved. 20 Two actions for the solution 1. Introduce an atomic reference counter 2. Active safeness check at the context switch © Hitachi, Ltd. 2014. All rights reserved. 21 Atomic Function Reference Counter 1. Introduce an atomic reference counter – Without stop_machine, functions can be called while patching • Ensure no one actually runs functions -> refcounter • Increment the refcounter at entry • Decrement the refcounter at exit – If refcounter is 0, update ALL function paths • We are sure there is no users © Hitachi, Ltd. 2014. All rights reserved. 22 Kpatch Reference Counter • Patching(switching) controlled by refcount Time Patching Add Process +1 Ftrace entry Process1 Process2 Process3 Start patcing Refcnt = 1 © Hitachi, Ltd. 2014. All rights reserved. Kpatch Reference Counter • Patching(switching) controlled by refcount Time Patching Add Prepare new Safeness Process +1 Ftrace entry Hash table Check Call old Call old Process1 +1 func1 -1 +1 func2 -1 Each function call Process2 Inc/dec refcnt While patching Call old Process3 +1 func1 -1 Start patcing Refcnt = 1 © Hitachi, Ltd. 2014. All rights reserved. Kpatch Reference Counter • Patching(switching) controlled by refcount Time Patching Add Prepare new Safeness Process +1 Ftrace entry Hash table Check -1 Call old Call old Call old Process1 +1 func1 -1 +1 func2 -1 +1 func3 Each function call Call old Process2 Inc/dec refcnt While patching +1 func3 -1 Call old Process3 +1 func1 -1 Patching process Is over, but refcnt Is not 0 Start patcing Refcnt = 1 © Hitachi, Ltd. 2014. All rights reserved. 25 Kpatch Reference Counter • Patching(switching) controlled by refcount Time Patching Add Prepare new Safeness +1 -1 Process Ftrace entry Hash table Check Don’t count Call old Call old Call old refcnt anymore Process1 +1 func1 -1 +1 func2 -1 +1 func3 -1 Each function call Call old Call New Process2 Inc/dec refcnt Call New While patching +1 func3 -1 func2 Call old Process3 +1 func1 -1 Patching process Call New Is over, but refcnt func1 Is not 0 Refcnt is 0 Start patcing Patch enabled on Refcnt = 1 all funcs © Hitachi, Ltd. 2014. All rights reserved. 26 Kpatch Reference Counter (cont.) • Control the reference counter – Need to stop counting before and after patching – Use atomic_inc_not_zero/dec_if_positive • These are stopped automatically if counter == 0 refcnt 0 0 func call Do not hook function 27 Entry/exit before patching © Hitachi, Ltd. 2014. All rights reserved. Kpatch Reference Counter (cont.) • Control the reference counter – Need to stop counting before and after patching – Use atomic_inc_not_zero/dec_if_positive • These are stopped automatically if counter == 0 refcnt 0 0 1 2 1 0 atomic_inc forcibly inc refcnt atomic_dec Patching atomic_inc_not_zero atomic_dec_if_positive func call func call Do not hook function 28 Entry/exit before patching © Hitachi, Ltd. 2014.