SECURE PROGRAMMING

A.A. 2018/2019 TERMINOLOGY

System, Social and SECURITY FLAWS

Software engineering has long been concerned with the elimination of software defects. A software defect is the encoding of a human error into the software, including omissions. Software defects can originate at any point in the software development life cycle. For example, a defect in a deployed product can originate from a misstated or misrepresented requirement.

Security Flaw üA software defect that poses a potential security risk.

System, Social and Mobile Security SECURITY FLAWS

Not all software defects pose a security risk. Those that do are security flaws. If we accept that a security flaw is a software defect, then we must also accept that by eliminating all software defects, we can eliminate all security flaws. This premise underlies the relationship between software engineering and secure programming. An increase in quality, as might be measured by defects per thousand lines of code, would likely also result in an increase in security. Many tools, techniques, and processes that are designed to eliminate software defects also can be used to eliminate security flaws.

System, Social and Mobile Security DIFFICULT/EXPENSIVE TO REMOVE

However, many security flaws go undetected because traditional software development processes seldom assume the existence of attackers. For example, testing will normally validate that an application behaves correctly for a reasonable range of user inputs. Unfortunately, attackers are seldom reasonable and will spend an inordinate amount of time devising inputs that will break a system.

System, Social and Mobile Security VULNERABILITY

Not all security flaws lead to vulnerabilities. However, a security flaw can cause a program to be vulnerable to attack when the program’s input data (for example, command-line parameters) crosses a security boundary en route to the program. This may occur when a program containing a security flaw is installed with execution privileges greater than those of the person running the program or is used by a network service where the program’s input data arrives via the network connection. Vunerability üA set of conditions that allows an attacker to violate an explicit or implicit security policy.

System, Social and Mobile Security POLICY

Taken verbatim from RFC 2828, the Internet Security Glossary (https://www.ietf.org/standards/rfcs/) Security Policy üA set of rules and practices that specify or regulate how a system or organization provides security services to protect sensitive and critical system resources.

Security policies that are documented, well known, and visibly enforced can help establish expected user behaviour.

System, Social and Mobile Security FLAWS ARE NOT VULNERABILITIES

A security flaw can also exist without all the preconditions required to create a vulnerability. For example, a program can contain a defect that allows a user to run arbitrary code inheriting the permissions of that program. üThis is not a vulnerability if the program has no special permissions and can be accessed only by local users, because there is no possibility that a security policy will be violated.

System, Social and Mobile Security VULNERABILITIES

Vulnerabilities can exist without a security flaw. Because security is a quality attribute that must be traded off with other quality attributes such as performance and usability, software designers may intentionally choose to leave their product vulnerable to some form of exploitation. Making an intentional decision not to eliminate a vulnerability does not mean the software is secure, only that the software designer has accepted the risk on behalf of the software consumer.

System, Social and Mobile Security EXPLOITS

Exploit üA technique that takes advantage of a security vulnerability to violate an explicit or implicit security policy. Vulnerabilities in software are subject to exploitation. Exploits can take many forms, including worms, viruses, and trojans. Understanding how programs can be exploited is a valuable tool that can be used to develop secure software. üHowever, disseminating exploit code against known vulnerabilities can be damaging to everyone.

System, Social and Mobile Security ZERO-DAY EXPLOIT

A zero-day (also known as 0-day) vulnerability is a computer-software vulnerability that is unknown to those who would be interested in mitigating the vulnerability (including the vendor of the target software). An exploit directed at a zero-day is called a zero-day exploit, or zero-day attack Developed by highly-skilled groups (NSA, central governments) üEternal blue https://cve.mitre.org/cgi- bin/cvename.cgi?name=CVE-2017-0144 On the market ühttps://0day.today

System, Social and Mobile Security MITIGATIONS

Mitigation üMethods, techniques, processes, tools, or runtime libraries that can prevent or limit exploits against vulnerabilities. A mitigation (or countermeasure) is a solution for a software flaw or a workaround that can be applied to prevent exploitation of a vulnerability. üAt the source code level, mitigations can be as simple as replacing an unbounded string copy operation with a bounded one. üAt a system or network level, a mitigation might involve turning off a port or filtering traffic to prevent an attacker from accessing a vulnerability.

System, Social and Mobile Security MITIGATIONS

The preferred way to eliminate security flaws is to find and correct the actual defect. However, in some cases it can be more cost-effective to eliminate the security flaw by preventing malicious inputs from reaching the defect. Vulnerabilities can also be addressed operationally by isolating the vulnerability. Of course, operationally addressing vulnerabilities significantly increases the cost of mitigation because the cost is pushed out from the developer to system administrators and end users.

System, Social and Mobile Security TO SUM UP

System, Social and Mobile Security A LINK TO CIA

A resource (either physical or logical) may have one or more vulnerabilities that can be exploited by a threat agent in a threat action. The result can potentially compromise the confidentiality, integrity or availability of resources.

System, Social and Mobile Security TAXONOMY

System, Social and Mobile Security AN EXAMPLE OF CLASSIFICATION

paper

System, Social and Mobile Security A BIT OF C HISTORY

System, Social and Mobile Security WHY C/C++

In this course, the decision to use C and C++ was based on the popularity of these languages, the enormous legacy code base, and the amount of new code being developed in these languages.

System, Social and Mobile Security WHY C

The C programming language is intended to be a lightweight language with a small footprint. This characteristic of C leads to vulnerabilities when programmers fail to implement required logic because they assume it is handled by C (but it is not). This problem is magnified when programmers are familiar with superficially similar languages such as Java, Pascal, or Ada, leading them to believe that C protects the programmer better than it actually does. These false assumptions have led to programmers failing to prevent writing beyond the boundaries of an array, failing to catch integer overflows and truncations and calling functions with the wrong number of arguments.

System, Social and Mobile Security WHY C

The C Standard [ISO/IEC 2011] defines several kinds of behaviors: Locale-specific behavior: behavior that depends on local conventions of nationality, culture, and language that each implementation documents. üAn example of locale-specific behavior is whether the islower() function returns true for characters other than the 26 lowercase Latin letters. Unspecified behavior: use of an unspecified value, or other behavior where the C Standard provides two or more possibilities and imposes no further requirements on which is chosen in any instance. üAn example of unspecified behaviour is the order in which the arguments to a function are evaluated. Implementation-defined behaviour: unspecified behaviour where each implementation documents how the choice is made. üAn example of implementation-defined behaviour is the propagation of the high-order bit when a signed integer is shifted right. Undefined behavior: behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements.

System, Social and Mobile Security LET’S START WITH C

C is a high-level general-purpose, procedural programming language. Dennis Ritchie first devised C in the 1970s at AT&T Bell Laboratories in Murray Hill, New Jersey, for the purpose of implementing the Unix operating system and utilities with the greatest possible degree of independence from specific hardware platforms. The key characteristics of the C language are the qualities that made it suitable for that purpose: üSource code portability üThe ability to operate “close to the machine” üEfficiency As a result, the developers of Unix were able to write most of the operating system in C, leaving only a minimum of system-specific hardware manipulation to be coded in assembler.

System, Social and Mobile Security C

C’s ancestors are the typeless programming languages BCPL (the Basic Combined Programming Language), developed by Martin Richards; and B, a descendant of BCPL, developed by Ken Thompson. A new feature of C was its variety of data types: characters, numeric types, arrays, structures, and so on. Brian Kernighan and Dennis Ritchie published an official description of the C programming language in 1978. Few hardware-dependent elements. For example, the C language proper has no file access or dynamic memory management statements. No input/output. Instead, the extensive standard C library provides the functions for all of these purposes.

System, Social and Mobile Security VIRTUES OF C

Fast (it's a compiled language and so is close to the machine hardware) Portable (you can compile you program to run on just about any hardware platform out there) The language is small (unlike C++ for example) Mature (a long history and lots of resources and experience available) There are many tools for making programming easier (e.g. IDEs like Xcode) You have direct access to memory You have access to low-level system features if needed

System, Social and Mobile Security CHALLENGES OF USING C

The language is small (but there are many APIs) It's easy to get into trouble, e.g. with direct memory access & pointers You must manage memory yourself Sometimes code is more verbose than in high-level scripting languages like Python, R, etc

System, Social and Mobile Security STANDARDS

K & R C (Brian Kernighan and Dennis Ritchie) ü1972 First created by Dennis Ritchie ü1978 The C Programming Language described ANSI C ü1989 ANSI X.159-1989 aka C89 - First standardized version ISO C 1990 ISO/IEC 9899:1990 aka C90 - Equivalent to C89 1995 Amendment 1 aka C95 1999 ISO/IEC 9899:1999 aka C99 2011 ISO/IEC 9899:2011 aka C11 2018 ISO/IEC 9899:2018 aka C18 gcc file.c –std=c11 System, Social and Mobile Security DENNIS

System, Social and Mobile Security HISTORY OF C++

In the early 1970s, Dennis Ritchie introduced “C” at Bell Labs. ühttp://cm.bell-labs.co/who/dmr/chist.html As a Bell Labs employee, Bjarne Stroustrup was exposed to and appreciated the strengths of C, but also appreciated the power and convenience of higher-level languages like Simula, which had language support for object-oriented programming (OOP). üOriginally called C With Classes, in 1983 it becomes C++ In 1985, the first edition of The C++ Programming Language was released Standard in 1998 (ISO/IEC 14882:1998)

System, Social and Mobile Security HISTORY

Adding support for OOP turned out to be the right feature at the right time for the ʽ90s. At a time when GUI programming was all the rage, OOP was the right paradigm, and C++ was the right implementation. At over 700 pages, the C++ standard demonstrated something about C++ that some critics had said about it for a while: C++ is a complicated beast. The first decade of the 21st century saw desktop PCs that were powerful enough that it didn’t seem worthwhile to deal with all this complexity when there were alternatives that offered OOP with less complexity. üJava

System, Social and Mobile Security STROUSTRUP

System, Social and Mobile Security CHARACTERISTICS

The most important feature of C++ is that it is both low- and high- level. Programming in C++ requires a discipline and attention to detail that may not be required of kinder, gentler languages that are not as focused on performance üNo garbage collector!

System, Social and Mobile Security STANDARDS

1998 ISO/IEC 14882:1998 C++98 2003 ISO/IEC 14882:2003 C++03 2011 ISO/IEC 14882:2011 C++11 2014 ISO/IEC 14882:2014 C++14 2017 ISO/IEC 14882:2017 C++17 2020 ???? C++20

System, Social and Mobile Security LET’S START!

System, Social and Mobile Security CERT C CODING STANDARD

https://wiki.sei.cmu.edu/confluence/display/c/SEI+CERT +C+Coding+Standard

The SEI CERT C Coding Standard is a software coding standard for the C programming language, developed by the CERT Coordination Center to improve the safety, reliability, and security of software system üCERT division at Carnagie Mellon üA non-profit United States federally funded research and development center

System, Social and Mobile Security ARRAYS AND STRINGS

System, Social and Mobile Security ARRAYS ARE COMPLICATED

When passed as a parameter, an array name is a pointer sizeof(int *) == 8 always

The CERT C Secure Coding Standard includes “ARR01-C. Do not apply the sizeof operator to a pointer when taking the size of an array,” which warns against this problem.

System, Social and Mobile Security STRINGS

Strings are a fundamental concept in software engineering, but they are not a built-in type in C or C++. The standard C library supports strings of type char and wide strings of type wchar_t.

System, Social and Mobile Security IMPROPERLY BOUNDED COPIES

Reading from a source to a fixed-length array. This program has undefined behaviour if more than eight characters (including the null terminator) are entered at the prompt. The main problem with the gets() function is that it provides no way to specify a limit on the number of characters to read.

The gets() function has been deprecated in C99 and eliminated from C11. The CERT C Secure Coding Standard, “MSC34-C. Do not use deprecated or obsolescent functions”.

System, Social and Mobile Security READING FROM STDIN

Reading data from unbounded sources (such as stdin()) creates an interesting problem for a programmer. Because it is not possible to know beforehand how many characters a user will supply, it is not possible to preallocate an array of sufficient length. A common solution is to statically allocate an array that is thought to be much larger than needed. In this example, the programmer expects the user to enter only one character and consequently assumes that the eight-character array length will not be exceeded. üWith friendly users, this approach works well. But with malicious users, a fixed-length character array can be easily exceeded, resulting in undefined behaviour. This approach is prohibited by The CERT C Secure Coding Standard, “STR35-C. Do not copy data from an unbounded source to a fixed-length array.”

System, Social and Mobile Security FROM PROGRAM PARAMETERS

Vulnerabilities can occur when inadequate space is allocated to copy a program input such as a command-line argument. Although argv[0] contains the program name by convention, an attacker can control the contents of arg[0] to cause a vulnerability in the following program by providing a string with more than 128 bytes.

Problems with C++ as well

System, Social and Mobile Security HOW TO FIX IT

System, Social and Mobile Security NULL TERMINATING STRINGS

The result is that the strcpy() to c may write well beyond the bounds of the array because the string stored in a[] is not correctly null-terminated.

The CERT C Secure Coding Standard includes “STR32-C. Null-terminate byte strings as required.”

System, Social and Mobile Security ERRORS

Most of the functions defined in the standard string-handling library . Visual Studio has deprecated most of them. However, errors are still possible without them, since strings are arrays of chars…

System, Social and Mobile Security STRING VULNERABILITIES AND EXPLOITS

System, Social and Mobile Security TAINTED VALUES

Previous sections described common errors in manipulating strings in C or C++. üThese errors become dangerous when code operates on untrusted data from external sources such as command-line arguments, environment variables, console input, text files, and network connections. It is safer to view all external data as untrusted. In software security analysis, a value is said to be tainted if it comes from an untrusted source (outside of the program’s control) and has not been sanitized to ensure that it conforms to any constraints on its value that consumers of the value require, for example, that all strings are null-terminated.

System, Social and Mobile Security PASSWORD EXAMPLE

System, Social and Mobile Security SECURITY FLAWS

The security flaw in the IsPasswordOK program that allows an attacker to gain unauthorized access is caused by the call to gets(). The condition that allows an out-of-bounds write to occur is referred to in software security as a .

System, Social and Mobile Security ONE MORE FLAW

The IsPasswordOK program has another problem: it does not check the return status of gets(). This is a violation of “FIO04- C. Detect and handle input and output errors.” When gets() fails, the contents of the Password buffer are indeterminate, and the subsequent strcmp() call has undefined behaviour. In a real program, the buffer might even contain the good password previously entered by another user.

System, Social and Mobile Security BUFFER OVERFLOWS

Buffer overflows occur when data is written outside of the boundaries of the memory allocated to a particular data structure. C and C++ are susceptible to buffer overflows because these languages: üDefine strings as null-terminated arrays of characters. üDo not perform implicit bounds checking. üProvide standard library calls for strings that do not enforce bounds checking. Depending on the location of the memory and the size of the overflow, a buffer overflow may go undetected but can corrupt data, cause erratic behaviour, or terminate the program abnormally.

System, Social and Mobile Security BUFFER OVERFLOWS

Not all buffer overflows lead to software vulnerabilities. However, a buffer overflow can lead to a vulnerability if an attacker can manipulate user-controlled inputs to exploit the security flaw. There are, for example, well-known techniques for overwriting frames in the stack to execute arbitrary code. Buffer overflows can also be exploited in heap or static memory areas by overwriting data structures in adjacent memory.

To help, static (at program time) and dynamic analysis tools (if right data is passed)

System, Social and Mobile Security MEMORY

System, Social and Mobile Security PROCESS MEMORY ORGANIZATION

A program instance that is loaded into memory and managed by the operating system.

System, Social and Mobile Security MEMORY

The code or text segment includes instructions and read-only data. It can be marked read-only so that modifying memory in the code section results in faults. The data segment contains initialized data, uninitialized data, static variables, and global variables. The heap is used for dynamically allocating process memory. The stack is a last-in, first-out (LIFO) data structure used to support process execution.

The exact organization of process memory depends on the operating system, compiler, linker, and loader—in other words, on the implementation of the programming language

System, Social and Mobile Security STACK

int fun(int p1, int p2, int p3) { int res= 0; res res= p1 + p2 + p3; return return res; address } p3

p2 int main() { int a= 4, b= 5, c= 7; p1 a= fun(a,b,c); } c

b

a

System, Social and Mobile Security . EXAMPLE . . include frame f2 void f1();

void f2() { frame f1 int c; f1(); puts(“bye f2”); frame f2 }

void f1() { frame f1 int b= 0; f2(); puts(“bye f1”); main }

int main() { int a= 0; MacBook-Francesco:ProgrammI francescosantini$ ./test f1(); Segmentation fault: 11 puts(“bye main”); System,} Social and Mobile Security STACK

To return control to the proper location, the sequence of return addresses must be stored. A stack is well suited for maintaining this information because it is a dynamic data structure that can support any level of nesting within memory constraints. The address of the current frame is stored in the frame or base pointer register. On x86-32, the extended base pointer (ebp) register is used for this purpose. The frame pointer is used as a fixed point of reference within the stack. When a subroutine is called, the frame pointer for the calling routine is also pushed onto the stack so that it can be restored when the subroutine exits.

System, Social and Mobile Security DISASSEMBLY IN INTEL

System, Social and Mobile Security INSTRUCTION POINTER

The instruction pointer (eip) points to the next instruction to be executed. When executing sequential instructions, it is automatically incremented by the size of each instruction, so that the CPU will then execute the next instruction in the sequence. Normally, the eip cannot be modified directly; instead, it must be modified indirectly by instructions such as jump, call, and return. Extended stack pointer (esp) is the current pointer to the stack. The stack pointer points to the top of the stack. üFor many popular architectures, including x86, SPARC, and MIPS processors, the stack grows toward lower memory.

System, Social and Mobile Security DISASSEMBLY OF FOO (PROLOGUE)

System, Social and Mobile Security STACK FRAME FOR FOO

System, Social and Mobile Security DISASSEMBLING FOO (EPILOGUE)

System, Social and Mobile Security RETURN VALUES

If there is a return value, it is stored in eax by the called function before returning. The caller function knows it can be found in eax and can use it.

int MyFunction2(int a, int b) { x = MyFunction2(2, 3); return a + b; }

:_MyFunction2 push ebp push 3 mov ebp, esp push 2 mov eax, [ebp + 8] call _MyFunction2 mov edx, [ebp + 12] %Use x in eax add eax, edx mov esp, ebp pop ebp ret System, Social and Mobile Security STACK SMASHING

System, Social and Mobile Security WHAT IS IT?

Stack smashing is when an attacker purposely overflows a buffer on stack to get access to forbidden regions of computer memory. Stack smashing occurs when a buffer overflow overwrites data in the memory allocated to the execution stack. It can have serious consequences for the reliability and security of a program. Buffer overflows in the stack segment may allow an attacker to modify the values of automatic variables or execute arbitrary code.

System, Social and Mobile Security WHAT CAN HAPPEN?

Overwriting automatic variables can result in a loss of data integrity or, in some cases, a security breach (for example, if a variable containing a user ID or password is overwritten). More often, a buffer overflow in the stack segment can lead to an attacker executing arbitrary code by overwriting a pointer to an address to which control is (eventually) transferred. A common example is overwriting the return address, which is located on the stack.

System, Social and Mobile Security EXAMPLE

The IsPasswordOk program is vulnerable to a stack- smashing attack. The IsPasswordOK program has a security flaw because the Password array is improperly bounded and can hold only an 11-character password plus a trailing null byte. This flaw can easily be demonstrated by entering a 20- character password of “1234567890123456W▸*!” that causes the program to jump in an unexpected way

System, Social and Mobile Security BACK TO THE EXAMPLE

System, Social and Mobile Security EXAMPLE

It crashes!

System, Social and Mobile Security WHAT HAPPENS

Each of these characters has a corresponding hexadecimal value: W = 0x57, ▸ = 0x10, * = 0x2A, and ! = 0x21. In memory, this sequence of four characters corresponds to a 4-byte address that overwrites the return address on the stack, so instead of returning to the instruction immediately following the call in main(): üThe IsPasswordOK() function returns control to the “Access granted” branch, bypassing the password validation logic and allowing unauthorized access to the system

System, Social and Mobile Security GUESS THE RIGHT ADDRESS

void * __builtin_return_address (unsigned int level)

A value of 0 yields the return address of the current function, a value of 1 yields the return address of the caller of the current function, and so forth

System, Social and Mobile Security ARC INJECTION

The arc injection technique (sometimes called return- into-libc) involves transferring control to code that already exists in process memory. These exploits are called arc injection because they insert a new arc (control-flow transfer) into the program’s control-flow graph as opposed to injecting new code. More sophisticated attacks are possible using this technique, including installing the address of an existing function (such as system() or exec(), which can be used to execute commands and other programs already on the local system) on the stack along with the appropriate arguments.

System, Social and Mobile Security ARC INJECTION

An attacker may prefer arc injection over for several reasons. Because arc injection uses code already in memory on the target system, the attacker merely needs to provide the addresses of the functions and arguments for a successful attack. üThe footprint for this type of attack can be significantly smaller and may be used to exploit vulnerabilities that cannot be exploited by the code injection technique. Because the exploit consists entirely of existing code, it cannot be prevented by memory-based protection schemes such as making memory segments (such as the stack) non-executable.

System, Social and Mobile Security ONE MORE EXAMPLE

System, Social and Mobile Security CODE INJECTION (SHELL CODE)

System, Social and Mobile Security INJECTION AND

When the return address is overwritten because of a software flaw, it seldom points to valid instructions. Consequently, transferring control to this address typically causes a trap and results in a corrupted stack. But it is possible for an attacker to create a specially crafted string that contains a pointer to some malicious code, which the attacker also provides. üWhen the function invocation whose return address has been overwritten returns, control is transferred to this code. The malicious code runs with the permissions that the vulnerable program has when the subroutine returns. üThis is why programs running with root or other elevated privileges are normally targeted. The malicious code can perform any function that can otherwise be programmed but often simply opens a remote shell on the compromised machine. For this reason, the injected malicious code is referred to as shellcode.

System, Social and Mobile Security HOW IT HAS TO BE

The pièce de résistance of any good exploit is the malicious argument. A malicious argument must have several characteristics: üIt must be accepted by the vulnerable program as legitimate input. üThe argument, along with other controllable inputs, must result in execution of the vulnerable code path. üThe argument must not cause the program to terminate abnormally before control is passed to the shellcode.

System, Social and Mobile Security BACK TO THE EXAMPLE

System, Social and Mobile Security INJECTION

% ./BufferOverflow < exploit.bin

(exploit.bin is the “payload”)

System, Social and Mobile Security HOW IT WORKS

The lea instruction used in this example stands for “load effective address.” The lea instruction computes the effective address of the second operand (the source operand) and stores it in the first operand (destination operand). The source operand is a memory address (offset part) specified with one of the processor’s addressing modes; the destination operand is a general purpose register.

System, Social and Mobile Security HOW IT WORKS

The exploit code works as follows: ü1. The first mov instruction is used to assign 0xB to the %eax register. 0xB is the number of the execve() system call in Linux. • int execve(const char *filename, char *const argv[], char *const envp[]); ü2. The three arguments for the execve() function call are set up in the subsequent three instructions (the two lea instructions and the mov instruction). The data for these arguments is located on the stack, just before the exploit code. ü3. The int $0x50 instruction is used to invoke execve(), which results in the execution of the Linux calendar program.

System, Social and Mobile Security RESULT

System, Social and Mobile Security RETURN-ORIENTED PROGRAMMING

System, Social and Mobile Security OVERCOMING DEFENCES

Code that already exists in the process image. üThe standard C library, libc, is loaded in nearly every Unix program, it contains routines useful for an attacker. üBut in principle any available code, either from the program’s text segment or from a library it links to, could be used.

By contrast, the building blocks for our attack are short code sequences, each just two or three instructions long. Some are present in libc as a result of the code- generation choices of the compiler. üThese code sequences would be very difficult to eliminate without extensive modifications to the compiler and assembler.

System, Social and Mobile Security HOW IT WORKS

The return-oriented programming exploit technique is similar to arc injection, but instead of returning to functions, the exploit code returns to sequences of instructions followed by a return (ret) instruction. Any such useful sequence of instructions is called a gadget. Each gadget specifies certain values to be placed on the stack that make use of one or more sequences of instructions in the code segment. üGadgets perform well-defined operations, such as a load, an add, or a jump. It allows an attacker to execute code in the presence of security defences such as executable space protection and code signing.

System, Social and Mobile Security HOW IT WORKS

Return-oriented programming is an advanced version of a stack smashing attack. In a standard buffer overrun attack, the attacker would simply write attack code (the "payload") onto the stack and then overwrite the return address with the location of these newly written instructions. ü Since the late 90s, OS/compilers have protections: data zones cannot be executed. • DEP Data Execution prevention (there is a hardware bit) • NX (no execute), on Intel XD (execute disable)

https://cseweb.ucsd.edu/~hovav/dist/geometry.pdf

System, Social and Mobile Security pop %ebx; EXAMPLE ret;

The left side shows the x86-32 assembly language instruction necessary to copy the constant value $0xdeadbeef into the ebx register, and the right side shows the equivalent gadget. With the stack pointer referring to the gadget, the return instruction is executed by the CPU. The resulting gadget pops the constant from the stack and returns execution to the next gadget on the stack.

System, Social and Mobile Security EXAMPLE 2 pop %esp; ret;

An unconditional branch can be used to branch to an earlier gadget on the stack, resulting in an infinite loop.

System, Social and Mobile Security EXAMPLE OF ATTACK

The goal of the attack is to invoke system call sys_write and output “xxxHACKxxx” to screen ssize_t sys_write(unsigned int fd, const char * buf, size_t count)

System, Social and Mobile Security EXAMPLE OF ATTACK http://www.cs.virginia.edu/~ww6r/CS4630/lectures/return_oriented_programming.pdf

int main(int argc, char *argv[]){ char buf[4]; gets(buf) return 0; }

System, Social and Mobile Security GADGETS

5:

System, Social and Mobile Security HOW TO DO IT

System, Social and Mobile Security SOME THEORETICAL ISSUES

Can you always find the gadgets you need? üSome small executable files may not have all the gadgets for you üIf the executable file is larger than 3MB there is a good chance that you can find a set of gadgets for any exploits Do you need “ret”? üNo, other jumps also work

ROP can work also without lib, only with provided code (in case mitigation on libc have been considered)

System, Social and Mobile Security SOME THEORETICAL ISSUES

Return-oriented programming provides a fully functional "language" (Turing complete) that an attacker can use to make a compromised machine perform any operation desired.

System, Social and Mobile Security SUMMARY

Return-oriented Programming (ROP) addresses the limitations of code-injection and return-to-libc üCode-injection: need executable stack üReturn-to-libc: • Highly depends on libc's implementation • Can be defended with mapped memory randomization Gadgets: a small sequence of code ending with “ret” within a program's code section üNo need to inject code, so no need of executable stack üDo not use libc's full function implementation, may even only use just application's code ROP attacks chain several gadgets together to execute arbitrary code Enough ROP gadgets can be found in most executable files using scanning tools

System, Social and Mobile Security COMPLICATED

While return-oriented programming might seem very complex, this complexity can be abstracted behind a programming language, compiler, and support tools, making it a viable technique for writing exploits.

System, Social and Mobile Security HOW TO EXPLOIT/PREVENT IT

An automated tool has been developed to help automate the process of locating gadgets and constructing an attack against a binary. This tool, known as ROPgadget, searches through a binary looking for potentially useful gadgets, and attempts to assemble them into an attack payload that spawns a shell to accept arbitrary commands from the attacker. https://github.com/JonathanSalwan/ROPgadget

System, Social and Mobile Security HOW TO BUILD THEM

pwntools is a CTF (Capture the Flag) framework and exploit development library. Written in Python, it is designed for rapid prototyping and development, and intended to make exploit writing as simple as possible. ühttp://docs.pwntools.com/en/stable/rop/rop.html

System, Social and Mobile Security MITIGATION

System, Social and Mobile Security STACK-SMASHING PROTECTOR (PROPOLICE) In version 4.1, GCC introduced the Stack-Smashing Protector (SSP) feature, which implements canaries derived from StackGuard. Also known as ProPolice, SSP is a GCC extension for protecting applications written in C from the most common forms of stack buffer overflow exploits and is implemented as an intermediate language translator of GCC. Specifically, SSP reorders local variables to place buffers after pointers and copies pointers in function arguments to an area preceding local variable buffers to avoid the corruption of pointers that could be used to further corrupt arbitrary memory locations.

System, Social and Mobile Security CANARIES

Canaries consist of a value that is difficult to insert or spoof and are written to an address before the section of the stack being protected. A sequential write would consequently need to overwrite this value on the way to the protected region. The canary is initialized immediately after the return address is saved and checked immediately before the return address is accessed.

A hard-to-spoof or random canary is a 32-bit secret random number that changes each time the program is executed.

System, Social and Mobile Security CANARIES

SSP works by introducing a canary to detect changes to the arguments, return address, and previous frame pointer in the stack. SSP inserts code fragments into appropriate locations as follows: a random number is generated for the guard value during application initialization, preventing discovery by an unprivileged user.

System, Social and Mobile Security DISABLING IT

The -fstack-protector and -fno-stack-protector options enable and disable stack-smashing protection for functions with vulnerable objects (such as arrays).

System, Social and Mobile Security ASLR

Address space layout randomization (ASLR) is a security feature of many operating systems; its purpose is to prevent arbitrary code execution. The feature randomizes the address of memory pages used by the program. ASLR cannot prevent the returnaddress on the stack from being overwritten by a stack-based overflow. However, by randomizing the address of stack pages, it may prevent attackers from correctly predicting the address of the shellcode, system function, or return- oriented programming gadget that they want to invoke.

System, Social and Mobile Security ASLR AND OS

ASLR was first introduced to Linux in the PaX project in 2000. While the PaX patch has not been submitted to the mainstream Linux kernel, many of its features are incorporated into mainstream Linux distributions. For example, ASLR has been part of Ubuntu since 2008 and Debian since 2007. Both platforms allow for fine- grained tuning of ASLR via the following command: üsysctl -w kernel.randomize_va_space=2 It can be turned off (= 0). ASLR has been available on Windows since Vista.

System, Social and Mobile Security ASLR

ASLR randomly arranges the address space positions of key data areas of a process, including the base of the executable and the positions of the stack, heap and libraries.

System, Social and Mobile Security NON-EXECUTABLE STACK

A nonexecutable stack is a runtime solution to buffer overflows that is designed to prevent executable code from running in the stack segment. üMany operating systems can be configured to use nonexecutable stacks. Nonexecutable stacks are often represented as a panacea in securing against buffer overflow vulnerabilities, but… üThey do not prevent buffer overflows from occurring in the heap or data segments. üThey do not prevent an attacker from using a buffer overflow to modify a return address, variable, object pointer, or function pointer. üAnd they do not prevent arc injection or injection of the execution code in the heap or data segments (or ROP).

System, Social and Mobile Security DRAWBACKS

Depending on how they are implemented, nonexecutable stacks can affect performance. Nonexecutable stacks can also break programs that execute code in the stack segment, including Linux signal delivery and GCC trampolines.

System, Social and Mobile Security W^X

Several operating systems, including OpenBSD, Windows, Linux, and OS X, enforce reduced privileges in the kernel so that no part of the process address space is both writable and executable. This policy is called W xor X (W⊕X), or more concisely W^X, and is supported by the use of a No eXecute (NX) bit on several CPUs. The NX bit enables memory pages to be marked as data, disabling the execution of code on these pages. This bit is named NX on AMD CPUs, XD (for eXecute Disable) on Intel CPUs. W^X requires that no code is intended to be executed that is not part of the program itself. This prevents the execution of shellcode on the stack, heap, or data segment

System, Social and Mobile Security IMPLEMENTATION

Deployment: Linux (via PaX patches); OpenBSD; Windows (since XP SP2); OS X (since 10.5); ... Hardware support: Intel “XD” bit, AMD “NX” bit (and many RISC processors)

System, Social and Mobile Security HEAP ISSUES

System, Social and Mobile Security SOME PROBLEMS WITH MALLOC

Initializing large blocks of memory can degrade performance and is not always necessary. The decision by the C standards committee to not require malloc() to initialize this memory reserves this decision for the programmer. “MEM09-C. Do not assume memory allocation functions initialize memory.”

System, Social and Mobile Security SOME SECURITY PROBLEMS

Where sensitive information is used, it is important to clear or overwrite the sensitive information before calling free(). üMEM03-C of The CERT C Secure Coding Standard: “Clear sensitive information stored in reusable resources”. Unfortunately, compiler optimizations may silently remove a call to memset() if the memory is not accessed following the write. To avoid this possibility, you can use the memset_s() function (if available). Unlike memset(), the memset_s() function assumes that the memory being set may be accessed in the future. CERT C Secure Coding Standard, “MSC06-C. Be aware of compiler optimization when dealing with sensitive data”.

System, Social and Mobile Security FAILING TO CHECK RETURN VALUES

Memory is a limited resource and can be exhausted (memory leaks, overall memory, other processes). Once all virtual memory is allocated, requests for more memory will fail. “MEM32-C. Detect and handle memory allocation errors,”

System, Social and Mobile Security MEMORY LEAKS AND SECURITY

Memory leaks occur when dynamically allocated memory is not freed after it is no longer needed. Memory leaks can be problematic in long-running processes or in resource-exhaustion attacks. üIf an attacker can identify an external action that causes memory to be allocated but not freed, memory can eventually be exhausted. ü Once memory is exhausted, additional allocations fail, and the application is unable to process valid user requests without necessarily crashing.

System, Social and Mobile Security EXAMPLE

System, Social and Mobile Security REFERENCING FREED MEMORY

It is possible to access freed memory unless all pointers to that memory have been set to NULL or otherwise overwritten. Reading from freed memory is undefined behaviour but almost always succeeds without a memory fault because freed memory is recycled by the memory manager. üHowever, there is no guarantee that the contents of the memory have not been altered.

System, Social and Mobile Security SO

When you free, also set the pointer to freed memory to NULL.

Writing to a memory location that has been freed is also unlikely to result in a memory fault but could result in a number of serious problems. If the memory has been reallocated, a programmer may overwrite memory, believing that a memory chunk is dedicated to a particular variable when in reality it is being shared

System, Social and Mobile Security OTHER ERRORS

Dereferencing Null or Invalid Pointers. If the operand doesn’t point to an object or function, the behaviour of the unary * operator is undefined. Cases: ünull pointer, an address inappropriately aligned for the type of object pointed to, and the address of an object after the end of its lifetime Double free: free the same memory chunk more than once.

Also a memory leak

System, Social and Mobile Security HEAP OVERFLOWS

System, Social and Mobile Security DLMALLOC

The GNU C library and most versions of Linux (for example, Red Hat, Debian) are based on Doug Lea’s malloc (dlmalloc) the default native version of malloc. In the following we describe the internals of dlmalloc version 2.7.2, and examples of how these flaws can be exploited. üNow after version 2.8.6 The security flaws responsible for the following vulnerabilities are common to all versions of dlmalloc (and other memory managers as well). üWe suppose Intel and 32bit architecture.

System, Social and Mobile Security DLMALLOC

In dlmalloc, memory chunks are either allocated to a process or are free.

The first 4 bytes of both allocated and free chunks contain either the size of the previous adjacent chunk, if it is free, or the last 4 bytes of user data of the previous chunk, if it is allocated.

System, Social and Mobile Security CHUNCKS

Allocated chunk Free chunk

4 4 4 ?

System, Social and Mobile Security FREE CHUNKS

In dlmalloc, free chunks are arranged in circular double- linked lists, or bins. Each double-linked list has a head that contains forward and backward pointers to the first and last chunks in the list. Both the forward pointer in the last chunk of the list and the backward pointer in the first chunk of the list point to the head element. When the list is empty, the head’s pointers reference the head itself.

System, Social and Mobile Security A BIN

Both allocated and free chunks make use of a PREV_INUSE bit (represented by P in the figure) to indicate whether or not the previous chunk is allocated.

System, Social and Mobile Security ULINK

unlink() macro is used to remove a chunk from its double-linked list. It is used when memory is consolidated and when a chunk is taken off the free list because it has been allocated to a user.

System, Social and Mobile Security ATTACKING THE HEAP

Through DLmalloc http://grantcurell.com/2015/08/16/protostar-exploit- challenges-heap3-solution-exploiting-dlmalloc/

System, Social and Mobile Security BUFFER OVERFLOW ON THE HEAP

Dynamically allocated memory is vulnerable to buffer overflows. Exploiting a buffer overflow in the heap is generally considered to be more difficult than smashing the stack. For this reason, buffer overflows in the heap are not always appropriately addressed, and developers adopt solutions that protect against stack-smashing attacks but not buffer overflows in the heap. Buffer overflows, for example, can be used to corrupt data structures used by the memory manager to execute arbitrary code. üBoth the unlink and frontlink techniques described in the following can be used for this purpose as well.

System, Social and Mobile Security ULINK TECHNIQUE

The unlink technique was successfully used against versions of Netscape browsers and traceroute using DLMalloc.

The unlink technique is used to exploit a buffer overflow to manipulate the boundary tags on chunks of memory to trick the unlink() macro into writing 4 bytes of data to an arbitrary location.

System, Social and Mobile Security VULNERABLE CODE

This vulnerable program allocates three chunks of memory (lines 5–7). The program accepts a single string argument that is copied into first (line 8). This unbounded strcpy() operation is susceptible to a buffer overflow. The boundary tag can be overwritten by a string argument exceeding the length of first because the boundary tag for second is located directly after the first buffer.

System, Social and Mobile Security THE HEAP

Div by 8 The content of the heap at the time free() is called for the first time

System, Social and Mobile Security CONSOLIDATION

If the second chunk is unallocated, the free() operation will attempt to consolidate it with the first chunk.

To determine whether the second chunk is unallocated, free() checks the PREV_INUSE bit of the third chunk.

The location of the third chunk is determined by adding the size of the second chunk to its starting address.

System, Social and Mobile Security HEAP IS OVERWRITTEN

The attacker can overwrite the boundary tag associated with the second chunk of memory, because this boundary tag is located immediately after the end of the first chunk.

System, Social and Mobile Security EXAMPLE

System, Social and Mobile Security IN PRACTICE

The unlink() macro writes 4 bytes of data supplied by an attacker to a 4-byte address also supplied by the attacker.

ulink() is called

System, Social and Mobile Security EFFECT

The first line of unlink, FD = P->fd, assigns the value in P->fd (provided as part of the malicious argument) to FD. The second line of the unlink macro, BK = P->bk, assigns the value of P->bk, also provided by the malicious argument to BK. The third line of the unlink() macro, FD->bk = BK,overwrites the address specified by FD + 12 (the offset of the bk field in the structure) with the value of BK.

System, Social and Mobile Security RESULT

We write at address fp the content in bk An attacker could, for example, provide the address of the return pointer on the stack and use the unlink() macro to overwrite the address with the address of malicious code. üDLMalloc change the stack for me

Exploitation of a buffer overflow in the heap is not particularly difficult. The most difficult part of this exploit is determining the size of the first chunk so that the boundary tag for the second argument can be precisely overwritten.

System, Social and Mobile Security DOUBLE FREE VULNERABILITIES

Doug Lea’s malloc is also susceptible to double-free vulnerabilities. This type of vulnerability arises from freeing the same chunk of memory twice without its being reallocated between the two free operations. For a double-free exploit to be successful, two conditions must be met. The chunk to be freed must be isolated in memory (that is, the adjacent chunks must be allocated so that no consolidation takes place), and the bin into which the chunk is to be placed must be empty.

System, Social and Mobile Security WE START WITH

System, Social and Mobile Security FRONTLINK

When a chunk of memory is freed, it must be linked into the appropriate double-linked list. In some versions of dlmalloc, this is performed by the frontlink code segment.

System, Social and Mobile Security AFTER A FREE

System, Social and Mobile Security ATTACK

The attacker supplies the address of a memory chunk and arranges for the first 4 bytes of this memory chunk to contain executable code (that is, a jump instruction to shellcode). This is accomplished by writing these instructions into the last 4 bytes of the previous chunk in memory.

System, Social and Mobile Security MITIGATION

Randomization works on the principle that it is harder to hit a moving target than a still target. Addresses of memory allocated by malloc() are fairly predictable. Randomizing the addresses of blocks of memory returned by the memory manager can make it more difficult to exploit a heap-based vulnerability.

Tools for static and dynamic analysis: Valgrind üMemcheck is a memory error detector. It can detect the following problems that are common in C and C++ programs. üAccessing memory you shouldn't, e.g. overrunning and underrunning heap blocks, overrunning the top of the stack, and accessing memory after it has been freed.

System, Social and Mobile Security POINTER SUBTERFUGE

System, Social and Mobile Security POINTERS CAN BE MODIFIED

Pointer subterfuge is a general term for exploits that modify a pointer’s value. C and C++ differentiate between pointers to objects and pointers to functions. Function pointers can be overwritten to transfer control to attacker-supplied shellcode. When the program executes a call via the function pointer, the attacker’s code is executed instead of the intended code.

System, Social and Mobile Security POINTERS TO FUNCTIONS

Overflow in the data segment!!!!

Shellcode can be pointed by funcPtr!!

System, Social and Mobile Security POINTERS TO OBJECTS

This program contains an unbounded memory copy on line 5. After overflowing the buffer, an attacker can overwrite ptr and val. When *ptr = val is consequently evaluated on line 6, an arbitrary memory write is performed.

System, Social and Mobile Security ONE MORE EXAMPLE WITH FUNCT

For an attacker to succeed in executing arbitrary code on x86-32, an exploit must modify the value of the instruction pointer to reference the shellcode. The instruction pointer register (eip) contains the offset in the current code segment for the next instruction to be executed. The eip register cannot be accessed directly by software. It is advanced from one instruction boundary to the next when executing code sequentially or modified indirectly by control transfer instructions (such as jmp, jcc, call, and ret), interrupts, and exceptions.

System, Social and Mobile Security ONE MORE EXAMPLE WITH FUNCT

The call instruction, for example, saves return information on the stack and transfers control to the called function specified by the destination (target) operand. The target operand specifies the address of the first instruction in the called function. This operand can be an immediate value, a general-purpose register, or a memory location.

System, Social and Mobile Security EXAMPLE

System, Social and Mobile Security DISASSEMBLING IT ModR/M= 15 points to an absolute, indirect call

System, Social and Mobile Security CONCLUSION

These invocations of good_function() provide examples of call instructions that can and cannot be attacked. üThe static invocation uses an immediate value as relative displacement, and this displacement cannot be overwritten because it is in the code segment. The invocation through the function pointer uses an indirect reference, and the address in the referenced location (typically in the data or stack segment) can be overwritten.

System, Social and Mobile Security TOOLS AND LINKS

System, Social and Mobile Security TOOLS

Tools for static analysis ühttps://en.wikipedia.org/wiki/List_of_tools_for_static_code_a nalysis ühttps://samate.nist.gov/index.php/Source_Code_Security_An alyzers.html Tools for dynamic analysis ühttps://en.wikipedia.org/wiki/Dynamic_program_analysis Deassembling ühttps://rada.re/r/ ROPgadget (gadgets finder and auto-roper) ühttp://shell-storm.org/project/ROPgadget/

System, Social and Mobile Security

Shellcode database for study cases ühttp://shell-storm.org/shellcode/ ühttps://www.exploit-db.com ühttps://0day.today

System, Social and Mobile Security LIBRARIES

Pwntools (Python) ühttps://docs.pwntools.com/en/stable/

System, Social and Mobile Security HOWTO

https://thecyberrecce.net/tag/pwn-tools/ https://ocw.cs.pub.ro/courses/cns/labs/lab-01 https://github.com/FabioBaroni/awesome-exploit- development

System, Social and Mobile Security