Memory safety
Cyrille Artho and Roberto Guanciale
KTH Royal Institute of Technology, Stockholm, Sweden
School of Electrical Engineering and Computer Science, Theoretical Computer Science
2018-05-14
What is memory safety?
Safe memory access: Each memory location that is used must have been
◆ allocated (statically, on stack, or on heap),
◆ initialized (write before read).
Resource usage:
◆ Each dynamically allocated block of memory must be freed exactly once.
◆ No memory exhaustion.
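The rules above can be illustrated with a minimal sketch. The function name sum_of_squares and the limit of 1000 elements are assumptions made for this example only; the point is the order of operations: allocate, write, read, free once.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: returns 0^2 + 1^2 + ... + (n-1)^2, computed via a
   heap buffer. Returns -1 if the allocation fails. */
long sum_of_squares(size_t n)
{
    assert(n <= 1000);                  /* keeps the sum within the range of long */
    if (n == 0)
        return 0;                       /* nothing to allocate */

    long *buf = malloc(n * sizeof *buf);    /* allocated before any use */
    if (buf == NULL)
        return -1;

    for (size_t i = 0; i < n; i++)
        buf[i] = (long)i * (long)i;         /* initialized: write before read */

    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += buf[i];                      /* every read hits a written cell */

    free(buf);                              /* freed exactly once */
    return sum;
}
```

Calling sum_of_squares(4) adds 0 + 1 + 4 + 9. Every exit path after the successful malloc passes through the single free, which is one simple way to satisfy "exactly once".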
Frequent problems
◆ Unwanted aliasing, race conditions.
◆ Access errors: Buffer overflow, use before malloc or after free.
◆ Invalid access: null pointer dereference, uninitialized pointer.
◆ Memory leak or double free.
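Two of the problems listed above, double free and null pointer dereference, can be guarded against with small conventions. The sketch below is one possible convention, not the only one; the helper names release and first_byte are made up for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: frees *pp and resets it to NULL, so that a second
   call is harmless (free(NULL) is defined to do nothing). */
void release(char **pp)
{
    assert(pp != NULL);
    free(*pp);
    *pp = NULL;
}

/* Returns the first byte of buf, or -1 if there is no buffer to read. */
int first_byte(const char *buf)
{
    if (buf == NULL)
        return -1;              /* avoid a null pointer dereference */
    return (unsigned char)buf[0];
}
```

This only protects the one pointer variable passed to release. Any other copy of the same address (an alias) still dangles after the free, which is why unwanted aliasing appears in the list above.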
Memory corruption
◆ Serious security risk.
◆ Access to an unallocated memory region, or a region outside given buffer.
◆ May read uninitialized memory, or write to memory used by other buffer.
◆ One of the most common vulnerabilities.
  – Exploited to get access to protected data, or to overwrite important data that governs control flow; may hijack the process.
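A write outside a given buffer typically comes from a copy that never compares the input length with the destination size, as with an unchecked strcpy. A minimal sketch of a checked copy follows; the name copy_bounded and its return convention are assumptions for this example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copies the string src into dst only if it fits,
   including the terminating '\0'. Returns 0 on success, -1 otherwise. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);
    size_t need = strlen(src) + 1;      /* bytes required, with '\0' */
    if (need > dst_size)
        return -1;                      /* would write outside the buffer */
    memcpy(dst, src, need);
    return 0;
}
```

With an 8-byte destination, a 7-character string fits and an 8-character string is rejected. Without the size check, the extra byte would land in whatever happens to follow the buffer in memory.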
Memory leak
◆ Allocated memory is never freed, even though it is never used again.
◆ Two versions:
  – Memory is no longer reachable (lost for good): may be garbage collected.
  – Memory is still reachable (potentially lost): cannot be garbage collected.
◆ Memory leak is a serious problem in long-running processes.
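Both versions of a leak can be pictured with a linked list. The sketch below uses made-up names (struct node, push, free_list) and shows the release step that a leaking program omits.

```c
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Prepends a value. Returns the new head, or NULL if allocation fails
   (in which case the old list is untouched and still owned by the caller). */
struct node *push(struct node *head, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->value = value;
    n->next = head;
    return n;
}

/* Frees every node and returns how many were released. */
size_t free_list(struct node *head)
{
    size_t freed = 0;
    while (head != NULL) {
        struct node *next = head->next;  /* read the link before freeing */
        free(head);
        head = next;
        freed++;
    }
    return freed;
}
```

If the caller overwrites its head pointer without calling free_list, the nodes become unreachable. If the head is kept in a long-lived variable but never used again, the nodes stay reachable yet are just as lost in practice. Standard C has no garbage collector, so in both cases the memory stays allocated until the process exits.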
How to detect memory corruption?
◆ Use a memory safety checking tool.
◆ In this course: valgrind (http://valgrind.org/).
  – Mature memory checker, used in many projects.
  – Finds most problems related to heap-allocated memory and some stack-allocated cases.
◆ Run-time overhead about a factor of 10–15.
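As a small illustration of what such a checker looks at, the sketch below reads heap memory that is guaranteed to be initialized. The helper names make_counters and count_zero are assumptions for this example.

```c
#include <stdlib.h>

/* Hypothetical helper: returns n counters, all starting at zero.
   calloc zero-fills the block, so reading a counter before writing it
   is well defined. With malloc instead, the same read would use
   uninitialized memory. */
int *make_counters(size_t n)
{
    if (n == 0)
        return NULL;
    return calloc(n, sizeof(int));   /* NULL on failure; caller frees */
}

/* Counts how many of the n counters are still zero. */
size_t count_zero(const int *counters, size_t n)
{
    size_t zeros = 0;
    for (size_t i = 0; i < n; i++)
        if (counters[i] == 0)        /* branch depends on the stored value */
            zeros++;
    return zeros;
}
```

To try this under valgrind, one would call these functions from a small main, compile with debug information (for example cc -g), and run valgrind ./prog. If calloc were replaced by malloc, the branch in count_zero would depend on uninitialized data, which valgrind's default Memcheck tool typically reports as "Conditional jump or move depends on uninitialised value(s)". Adding --leak-check=full additionally lists blocks that were never freed.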
Other memory checking tools
◆ Newer checking tool: clang’s AddressSanitizer:
  – -fsanitize=address flag for the clang compiler.
  – Home page: https://clang.llvm.org/docs/AddressSanitizer.html
  – Fast, robust, but does not work well with debugger.
◆ Many, many other tools exist (open source or commercial).
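AddressSanitizer instruments the program at compile time, so it can also watch accesses to stack and global arrays. The sketch below is a hypothetical table lookup (the name days_in_month is made up) whose range check is exactly what keeps the access inside the array.

```c
/* Hypothetical lookup into a small local table. Returns the number of
   days in the given month (1..12) of a non-leap year, or -1 if the
   month is out of range. */
int days_in_month(int month)
{
    const int days[12] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    if (month < 1 || month > 12)
        return -1;               /* without this check, days[month - 1]
                                    could read outside the array */
    return days[month - 1];
}
```

Building with clang -g -fsanitize=address and calling the function with an out-of-range month after removing the check is a simple way to see the kind of report the tool produces for an out-of-bounds read.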
Problem: Memory errors require right test input
Unit testing automates test execution with fixed (human-designed) inputs.
✔ Can model specific "difficult" input sequences.
✔ Can include test oracle (check of output against expected value).
✘ Difficult to cover many inputs.
Random testing can be used to generate many different inputs automatically.
✔ Automatically generate many inputs.
✘ No check of output (other than crashes).
✘ Shallow coverage (specific input format not known).
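The shallow coverage of random testing can be seen in a minimal sketch. The record format (a magic byte 0x7F followed by a length byte) and the names parse_record and random_test are invented for this example.

```c
#include <stdlib.h>

/* Hypothetical function under test: accepts a record whose first byte is
   the magic value 0x7F and whose second byte equals the payload length. */
int parse_record(const unsigned char *data, size_t len)
{
    if (len < 2)
        return -1;
    if (data[0] != 0x7F)
        return -1;                       /* most random inputs stop here */
    if ((size_t)data[1] != len - 2)
        return -1;
    return 0;                            /* the "deep" code would start here */
}

/* Feeds `runs` random inputs of 0..16 bytes to the parser and returns
   how many of them were accepted. */
unsigned random_test(unsigned runs, unsigned seed)
{
    unsigned char buf[16] = {0};
    unsigned accepted = 0;
    srand(seed);
    for (unsigned i = 0; i < runs; i++) {
        size_t len = (size_t)(rand() % 17);
        for (size_t j = 0; j < len; j++)
            buf[j] = (unsigned char)(rand() % 256);
        if (parse_record(buf, len) == 0)
            accepted++;
    }
    return accepted;
}
```

Only about one random input in 256 gets past the magic byte, and far fewer also match the length field, so nearly all test runs exercise only the first few lines of the parser. The only oracle here is "did not crash", which matches the limitation noted above.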
How to improve beyond random crash testing?
Data model:
◆ Model-based testing: Create a model or grammar to derive inputs from.
◆ Option: Learn model from existing tests/inputs!
Properties: ◆ Memory safety!
Fuzz testing
◆ Specify structure or format of input by examples.
◆ Randomly mutate examples to generate new inputs.
✔ Generates many slightly invalid inputs based on valid examples.
✔ Much better coverage than random testing.
✔ Ideal to check against memory corruption.
✘ Good examples can be hard to find. (Fuzzer may deviate too much from inputs that would produce full coverage.)
◆ Example fuzzer: radamsa: https://github.com/aoh/radamsa
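The idea of mutating a valid example can be sketched with the simplest possible mutation, a random bit flip. This is not how radamsa works internally (it combines many richer mutations); the names mutate and bit_difference are made up for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copies seed into out and then flips `flips` randomly chosen bits.
   out must have room for len bytes. Uses rand(), so call srand() first
   for reproducible runs. */
void mutate(const unsigned char *seed, size_t len,
            unsigned char *out, unsigned flips)
{
    assert(seed != NULL && out != NULL);
    memcpy(out, seed, len);
    if (len == 0)
        return;                              /* nothing to flip */
    for (unsigned i = 0; i < flips; i++) {
        size_t pos = (size_t)rand() % len;   /* which byte */
        int bit = rand() % 8;                /* which bit in that byte */
        out[pos] ^= (unsigned char)(1u << bit);
    }
}

/* Number of bit positions in which a and b differ. */
unsigned bit_difference(const unsigned char *a, const unsigned char *b,
                        size_t len)
{
    unsigned diff = 0;
    for (size_t i = 0; i < len; i++)
        for (unsigned char x = a[i] ^ b[i]; x != 0; x >>= 1)
            diff += x & 1u;
    return diff;
}
```

With a small number of flips, most of the seed's structure survives, so the mutated input tends to get past the early format checks and reach deeper code. With many flips the output drifts toward plain random data, which is the deviation problem mentioned above.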
Final exercise: memory safety
[Figure: exercise workflow. Initial inputs (examples) → fuzzer → modified inputs → application under memory safety check (valgrind) → crash?]
◆ Reproduce a real memory bug in a JAR file parser.
◆ Mandatory part: valgrind.
  – Explain bug: Origin of memory corruption.
  – Fix bug: Prevent memory corruption.
◆ Finding the bug: radamsa.
  – Create good seed input for fuzzer.
  – Experiment with it: What works well?
Optional part: memory leak
◆ The program given in the exercise has (real) memory leaks.
◆ Use valgrind to look at the source of the leaks.
◆ Choose one leak and fix it.
Summary: Memory safety
1. Memory corruption: Serious flaw in C/C++ programs.
2. Memory safety checkers can find memory bugs.
3. Fuzzing can provide inputs for better coverage.