
Memory safety

Cyrille Artho and Roberto Guanciale
KTH Royal Institute of Technology, Stockholm, Sweden
School of Electrical Engineering and Computer Science
Theoretical Computer Science
[email protected]

2018-05-14

What is memory safety?

Safe memory access: Each memory location that is used must have been
◆ allocated (statically, on stack, or on heap),
◆ initialized (write before read).

Resource usage:
◆ Each dynamically allocated memory block must be freed exactly once.
◆ No memory exhaustion.
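The rules above can be illustrated with a minimal sketch in C. The function name `sum_first_n` is hypothetical and only serves to show the sequence allocate, initialize, use, free in one place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: returns 0 + 1 + ... + (n - 1), computed through
 * a heap buffer. Returns -1 if the allocation fails. */
long sum_first_n(size_t n)
{
    assert(n > 0);
    long *buf = malloc(n * sizeof *buf);    /* allocated (on heap) */
    if (buf == NULL) {
        return -1;                          /* allocation failed */
    }
    for (size_t i = 0; i < n; i++) {
        buf[i] = (long)i;                   /* initialized: write before read */
    }
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += buf[i];                    /* every read hits initialized memory */
    }
    free(buf);                              /* freed exactly once */
    return total;
}
```

Every path that leaves the function after a successful `malloc` passes through exactly one `free`.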

Frequent problems

◆ Unwanted , race conditions.

◆ Access errors: Buffer overflow, use before malloc or after free.

◆ Invalid access: dereference, uninitialized pointer.

◆ Memory leak or double free.
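One common defensive idiom against double free and use after free is to reset a pointer right after freeing it. The sketch below assumes a hypothetical helper `release`; it is one possible convention, not the only one.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: free the block *pp points to and reset *pp to
 * NULL. A second call is then harmless, because free(NULL) does
 * nothing, and a later dereference fails visibly instead of touching
 * freed memory. */
void release(int **pp)
{
    assert(pp != NULL);
    free(*pp);
    *pp = NULL;
}
```

This only protects the one pointer that is passed in; other aliases of the same block still dangle after the call.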

Buffer overflow

◆ Serious security risk.

◆ Access to an unallocated memory region, or a region outside the given buffer.

◆ May read uninitialized memory, or write to memory used by another buffer.

◆ One of the most common vulnerabilities.
– Exploited to get access to protected data, or to overwrite important data that governs control flow; may hijack the process.
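A write outside the given buffer is typically prevented by checking the length before copying. The following is a minimal sketch; `copy_checked` is a hypothetical name, not a standard library function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the string src into dst, which has room
 * for dst_size bytes including the terminating '\0'. Returns 0 on
 * success, or -1 if src does not fit; dst is untouched in that case. */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);
    size_t len = strlen(src);
    if (len >= dst_size) {
        return -1;                  /* would write past the end of dst */
    }
    memcpy(dst, src, len + 1);      /* len characters plus '\0' */
    return 0;
}
```

An unchecked `strcpy(dst, src)` in the same place would write past `dst` whenever `src` is too long.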

Memory leak

◆ Allocated memory is not freed but never used again.

◆ Two versions:
– Memory is no longer reachable (lost for good): may be garbage collected.
– Memory is still reachable (potentially lost): cannot be garbage collected.

◆ Memory leaks are a serious problem in long-running processes.
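The "no longer reachable" case can be sketched in C as follows. All function names are hypothetical; `length_leaky` shows the bug and `length_fixed` one possible repair.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of s, or NULL.
 * The caller owns the result and must free it exactly once. */
char *copy_string(const char *s)
{
    assert(s != NULL);
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p != NULL) {
        memcpy(p, s, n);
    }
    return p;
}

/* Leaky version: the only pointer to the copy goes out of scope at
 * return, so the block can never be freed afterwards. */
size_t length_leaky(const char *s)
{
    char *tmp = copy_string(s);
    if (tmp == NULL) {
        return 0;
    }
    return strlen(tmp);             /* BUG: tmp is never freed */
}

/* Fixed version: the copy is released before the pointer is lost. */
size_t length_fixed(const char *s)
{
    char *tmp = copy_string(s);
    if (tmp == NULL) {
        return 0;
    }
    size_t n = strlen(tmp);
    free(tmp);
    return n;
}
```

In a long-running process, each call to `length_leaky` loses one more block, so memory use grows with the number of calls.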

How to detect memory corruption?

◆ Use a memory safety checking tool.

◆ In this course: valgrind (http://valgrind.org/).
– Mature memory checker, used in many projects.
– Finds most problems related to heap-allocated memory and some stack-allocated cases.

◆ Run-time overhead about a factor of 10–15.

Other memory checking tools

◆ Newer checking tool: clang’s address sanitizer:
– -fsanitize=address flag for the clang compiler.
– Home page: https://clang.llvm.org/docs/AddressSanitizer.html
– Fast, robust, but does not work well with a debugger.

◆ Many, many other tools exist (open source or commercial).
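As an illustration of what such a checker reports, here is a minimal sketch of an off-by-one heap write next to its fix. The function names are hypothetical, and the buggy version is shown only for contrast.

```c
#include <assert.h>
#include <stdlib.h>

/* Buggy: allocates n ints but writes n + 1 of them. The write to a[n]
 * lands just past the end of the heap block. */
int *squares_buggy(size_t n)
{
    int *a = malloc(n * sizeof *a);
    if (a == NULL) {
        return NULL;
    }
    for (size_t i = 0; i <= n; i++) {   /* BUG: should be i < n */
        a[i] = (int)(i * i);
    }
    return a;
}

/* Fixed: the loop stays inside the allocated block. The caller must
 * free the result. */
int *squares_fixed(size_t n)
{
    assert(n > 0);
    int *a = malloc(n * sizeof *a);
    if (a == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        a[i] = (int)(i * i);
    }
    return a;
}
```

If `squares_buggy` is called from a small test program, running that program under valgrind should flag the extra write as an invalid write, and a build with `-fsanitize=address` should report a heap buffer overflow at the same line. Without a checker, the program may appear to work.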

Problem: Memory errors require the right test input

Unit testing automates test execution with fixed (human-designed) inputs.

✔ Can model specific "difficult" input sequences.

✔ Can include test oracle (check of output against expected value).

✘ Difficult to cover many inputs.

Random testing can be used to generate many different inputs automatically.

✔ Automatically generate many inputs.

✘ No check of output (other than crashes).

✘ Shallow coverage (specific input format not known).

How to improve beyond random crash testing?

Data model:
◆ Model-based testing: Create a model or grammar to derive inputs from.
◆ Option: Learn model from existing tests/inputs!

Properties:
◆ Memory safety!

Fuzz testing

◆ Specify structure or format of input by examples.

◆ Randomly mutate examples to generate new inputs.

✔ Generates many slightly invalid inputs based on valid examples.

✔ Much better coverage than random testing.

✔ Ideal to check against memory corruption.

✘ Good examples can be hard to find. (Fuzzer may deviate too much from inputs that would produce full coverage.)

◆ Example fuzzer: radamsa: https://github.com/aoh/radamsa
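The idea of randomly mutating a valid example can be sketched in a few lines of C. This is a deliberately naive byte-level mutator and not radamsa's algorithm; the names `next_random` and `mutate_bytes` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Small linear congruential generator, so that a given seed always
 * produces the same mutations. Any other PRNG would do. */
static uint32_t next_random(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state;
}

/* Hypothetical mutator: overwrite `count` randomly chosen bytes of
 * buf (length len) with random values. The buffer is changed in
 * place; its length stays the same. */
void mutate_bytes(unsigned char *buf, size_t len, size_t count, uint32_t seed)
{
    assert(buf != NULL);
    if (len == 0) {
        return;
    }
    uint32_t state = seed;
    for (size_t i = 0; i < count; i++) {
        size_t pos = next_random(&state) % len;                 /* where */
        buf[pos] = (unsigned char)(next_random(&state) >> 24);  /* what */
    }
}
```

A test driver would copy a valid seed file into a buffer, call `mutate_bytes` with a fresh seed for each run, and feed the result to the application under a memory checker. Keeping `count` small keeps the inputs "slightly invalid" rather than pure noise.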

Final exercise: memory safety

[Figure: workflow diagram with the elements Examples (initial inputs), Fuzzer, Modified inputs, Application, Memory safety check (valgrind), Crash?]

◆ Reproduce a real memory bug in a JAR file parser.

◆ Mandatory part: valgrind.
– Explain bug: Origin of memory corruption.
– Fix bug: Prevent memory corruption.

◆ Finding the bug: radamsa.
– Create good seed input for fuzzer.
– Experiment with it: What works well?

Optional part: memory leak

◆ The program given in the exercise has (real) memory leaks.

◆ Use valgrind to look at the source of the leaks.

◆ Choose one leak and fix it.

Summary: Memory safety

1. Memory corruption: Serious flaw in C/C++ programs.

2. Memory safety checkers can find memory bugs.

3. Fuzzing can provide inputs for better coverage.
