
SKI: Exposing Kernel Concurrency Bugs through Systematic Schedule Exploration
Pedro Fonseca (MPI-SWS), Rodrigo Rodrigues (NOVA University of Lisbon), Björn Brandenburg (MPI-SWS)
OSDI 2014

SKI: Exposing Kernel Concurrency Bugs    Pedro Fonseca

Kernel concurrency bugs
● Bugs that depend on the instruction interleavings
– Triggered only by a subset of the interleavings
● Plenty of kernel concurrency bugs in kernels!

"The bug is a race and not always easy to reproduce. [...] On my particular machine, [the test case] usually triggers [the bug] within 10 minutes but enabling debug options can change the timing such that it never hits. Once the bug is triggered, the machine is in trouble and needs to be rebooted."
Linux 3.0.41 change log

"[The bug] was quite hard to decode as the reproduction time is between 2 days and 3 weeks and intrusive tracing makes it less likely [...]"
Linux 3.4.41 change log

"Three of the five 3.4.9 machines [...] locked up. I've tried reproducing the issue, but so far I've been unsuccessful [...]"
Linux kernel mailing list (5/1/2013)

Approaches to explore interleavings
● Stress testing approach
– Hope to find the interleaving
● Systematic approach
– Take full control of the interleavings
– Existing tools focus on user-mode applications
This talk: focus on operating system kernels

SKI
Finding kernel concurrency bugs
● Testing applications versus kernels
● Our approach
● Implementation
● Evaluation

Existing user-mode tools
[Figure: an App running on top of the Kernel and its Scheduler. The existing user-mode systematic tools (LD_PRELOAD, ptrace) act as a user-mode testing tool and work with kernel-level abstractions: threads and sync. objects.]

Kernel-mode challenges
● Kernel doesn't have a good instrumentation interface
● An alternative would be to modify the kernel
– But kernel modifications:
● Change the tested software
● Are non-trivial
● Hinder portability
Avoid kernel modifications
[Figure: a testing tool next to the Kernel and its Scheduler.]

User-mode versus kernel-mode
[Figure: layered stack of App, Kernel (with Scheduler), and Hardware. Existing user-mode systematic tools (LD_PRELOAD, ptrace) sit between App and Kernel and use kernel-level abstractions: threads and sync. objects. Our tool (modified VMM), the kernel testing tool, sits between Kernel and Hardware and uses HW-level abstractions: mov, add, jmp, registers, APIC.]

SKI
Finding kernel concurrency bugs
Systematic
– Full control of the kernel interleavings
+ Practical
– No modifications to the kernel
– Fast

SKI
Finding kernel concurrency bugs
● Challenges testing the kernel code
● SKI's approach
● Implementation
● Evaluation

SKI's approach
[Figure: a VM containing the App and the Kernel, running on the SKI VMM, which works with HW-level abstractions: mov, add, jmp, registers, APIC.]
Challenges
1. How to control the schedules?
2. Which contexts are schedulable?
3. Which schedules to choose?

1. How to control the kernel schedules?
● Pin each tested thread to a different CPU (thread affinity)
● Pause and resume CPUs to control schedules
[Figure: the instruction streams of Thread 1 and Thread 2 (MOV, ADD, PUSH, SUB, JMP) interleaved over time t on a single CPU. "Pin" moves each thread onto its own CPU (CPU 1, CPU 2). "Control" then interleaves the instructions of the two CPUs in a chosen order.]
Leverage thread affinity and control CPUs

2. Which contexts are schedulable?
● The execution of some instructions is a good hint
[Figure: instruction streams on CPU 1 and CPU 2, with the HALT and PAUSE instructions highlighted among MOV, ADD, SUB, and PUSH.]
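The pin-and-pause idea from step 1 can be illustrated with a toy model. This is only a sketch under stated assumptions, not SKI's implementation: each "CPU" is a plain Python list of instructions standing in for a pinned thread, "resuming" a CPU means executing its next instruction while the other stays paused, and the helper names (`make_increment_thread`, `all_schedules`, `run_schedule`, `explore`) are hypothetical. The test case is a non-atomic counter increment, so the lost-update bug appears only in a subset of the interleavings.

```python
from itertools import combinations


def make_increment_thread():
    """Hypothetical test thread: a non-atomic increment (load, add, store)."""
    def load(shared, regs):
        regs["tmp"] = shared["counter"]

    def add(shared, regs):
        regs["tmp"] += 1

    def store(shared, regs):
        shared["counter"] = regs["tmp"]

    return [load, add, store]


def all_schedules(n0, n1):
    """Yield every interleaving of n0 steps on CPU 0 and n1 steps on CPU 1.

    A schedule is a tuple of CPU ids; entry i names the CPU resumed at step i.
    """
    total = n0 + n1
    for cpu0_slots in combinations(range(total), n0):
        slots = set(cpu0_slots)
        yield tuple(0 if i in slots else 1 for i in range(total))


def run_schedule(schedule, threads):
    """Resume exactly one CPU per step; every other CPU stays paused."""
    shared = {"counter": 0}
    regs = [{} for _ in threads]   # private registers, one set per CPU
    pc = [0 for _ in threads]      # next instruction index, one per CPU
    for cpu in schedule:
        instruction = threads[cpu][pc[cpu]]
        instruction(shared, regs[cpu])
        pc[cpu] += 1
    return shared["counter"]


def explore():
    """Run the two-thread test under every schedule and record the result."""
    threads = [make_increment_thread(), make_increment_thread()]
    return {
        schedule: run_schedule(schedule, threads)
        for schedule in all_schedules(len(threads[0]), len(threads[1]))
    }


if __name__ == "__main__":
    results = explore()
    buggy = [s for s, value in results.items() if value != 2]
    print(f"{len(buggy)} of {len(results)} schedules lose an update")
```

In this model only the two fully serialized schedules keep both increments; every schedule that overlaps the two load/store sequences loses one. Exhaustive enumeration works here because the threads are three instructions long; choosing which schedules to run for real kernel code is challenge 3 above.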