Defense Against Software Exploits El Habib Boudjema
Total Page:16
File Type:pdf, Size:1020Kb
Defense against software exploits El Habib Boudjema To cite this version: El Habib Boudjema. Defense against software exploits. Mathematical Software [cs.MS]. Université Paris-Est, 2018. English. NNT : 2018PESC1015. tel-01970951 HAL Id: tel-01970951 https://tel.archives-ouvertes.fr/tel-01970951 Submitted on 6 Jan 2019 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Universite´ Paris Est ECOLE´ DOCTORALE MSTIC THESE` DE DOCTORAT pour obtenir le titre de Docteur de l'Universit´eParis-Est Specialit´ e´ : Informatique d´efenduepar BOUDJEMA El habib D´efense contre les attaques de logiciels soutenue le 04 mai 2018 devant le jury compos´ede : Prof. ZEADALLY Sherali Rapporteur University of Kentucky, USA Dr. SAUVERON Damien Rapporteur Universit´ede Limoges, France Prof. BEN-OTHMAN Jalel Pr´esident Universit´eParis 13, France Prof. HABBAS Zineb Examinateur Universit´ede Lorraine, France Prof. KHEDOUCI Hamamache Examinateur Universit´eLyon 1, France Dr. VERLAN Serghei Examinateur Universit´eParis-Est, France Prof. MOKDAD Lynda Directrice de th`ese Universit´eParis-Est, France Dr. FAURE Christ`ele Co-directrice de th`ese Saferiver, France Dr. DELEBARRE V´eronique Invit´ee Dirigeante Saferiver, France i Th`esepr´epar´ee`a: Saferiver, Montrouge. France et Laboratoire LACL Universit´eParis-Est, Cr´eteil.France Abstract Defense Against Software Exploits In the beginning of the third millennium, we are witnessing a new age. This new age is characterized by the shift from an industrial economy to an economy based on information technology. It is the Information Age. Today, we rely on soft- ware in practically every aspect of our life. Information technology is used by all economic actors: manufactures, governments, banks, universities, hospitals, retail stores, etc. A single software vulnerability can lead to devastating consequences and irreparable damage. The situation is worsened by the software becoming larger and more complex making the task of avoiding software flaws more and more difficult task. Automated tools finding those vulnerabilities rapidly before it is late, are becoming a basic need for software industry community. This thesis is investigating security vulnerabilities occurring in C language appli- cations. We searched the sources of these vulnerabilities with a focus on C library functions calling. We dressed a list of property checks to detect code portions lead- ing to security vulnerabilities. Those properties give for a library function call the conditions making this call a source of a security vulnerability. When these condi- tions are met, the corresponding call must be reported as vulnerable. These checks were implemented in Carto-C tool and experimented on the Juliet test base and on real life application sources. We also investigated the detection of exploitable vulnerability at binary code level. We started by defining what exploitable vulner- ability behavioral patterns are. The focus was on the most exploited vulnerability classes such as stack buffer overflow, heap buffer overflow and use-after-free. Af- ter, a new method on how to search for these patterns by exploring application execution traces is proposed. During the exploration, necessary information is ex- tracted and used to find the patterns of the searched vulnerabilities. This method was implemented in our tool Vyper and experimented successfully on Juliet test base and real life application binaries. R´esum´e D´efensecontre les attaques de logiciels Dans ce d´ebutdu troisi`eme mill´enium,nous sommes t´emoinsd'un nouvel ^age. Ce nouvel ^ageest caract´eris´epar la transition d'une ´economieindustrielle vers une ´economie bas´eesur la technologie de l'information. C'est l'^agede l'information. Aujourd'hui le logiciel est pr´esent dans pratiquement tous les aspects de notre vie. Une seule vuln´erabilit´elogicielle peut conduire `ades cons´equencesd´evastatrices. La d´etectionde ces vuln´erabilit´esest une t^ache qui devient de plus en plus dure surtout avec les logiciels devenant plus grands et plus complexes. Dans cette th`ese,nous nous sommes int´eress´esaux vuln´erabilit´esde s´ecurit´eim- pactant les applications d´evelopp´eesen langage C et particuli`erement les vuln´era- bilit´esprovenant de l'usage des fonctions de ce langage. Nous avons propos´e une liste de v´erificationspour la d´etectiondes portions de code causant des vuln´erabilit´esde s´ecurit´e.Ces v´erificationssont sous la forme de conditions ren- dant l'appel d'une fonction vuln´erable.Des impl´ementations dans l'outil Carto-C et des exp´erimentations sur la base de test Juliet et les sources d'applications r´eellesont ´et´er´ealis´ees.Nous nous sommes ´egalement int´eress´es`ala d´etectionde vuln´erabilit´esexploitables au niveau du code binaire. Nous avons d´efinien quoi consiste le motif comportemental d'une vuln´erabilit´e. Nous avons propos´eune m´ethode permettant de rechercher ces motifs dans les traces d'ex´ecutionsd'une application. Le calcul de ces traces d'ex´ecutionest effectu´een utilisant l'ex´ecution concolique. Cette m´ethode est bas´eesur l'annotation de zones m´emoiressensibles et la d´etection d'acc`esdangereux `aces zones. L'impl´ementation de cette m´ethode a ´et´er´ealis´eedans l'outil Vyper et des exp´erimentations sur la base de test Juliet et les codes binaires d'applications r´eellesont ´et´emen´eesavec succ`es. iv Keywords: static analysis, vulnerability detection, software exploit Mots-cl´es: analyse statique, d´etection de vuln´erabilit´e, attaque logicielle Contents Abstract ii R´esum´e iii Contentsv List of Figures viii List of Tables ix Abbreviationsx Introduction1 1 Overview of cyber security domains5 1.1 Cyber security: a growing large field.................5 1.2 Security vulnerabilities: public enemy of cyber security.......8 1.3 Causes of security vulnerabilities................... 10 1.4 High level security measures...................... 11 2 Source of vulnerabilities in C language 14 2.1 C language definition.......................... 15 2.2 Formatted output functions in C language.............. 16 2.2.1 Output format string structure................ 18 2.2.2 Output functions usage problems............... 20 2.2.3 Checking and reporting issues................. 22 2.3 Input formatted functions in C language............... 23 2.3.1 Input functions format string structure............ 24 2.3.2 Input formatted functions usage problems.......... 26 2.3.3 Checking and reporting issues................. 28 2.4 POSIX formatted functions...................... 28 2.4.1 Format string structure difference............... 28 2.4.2 List of additional functions................... 30 2.4.3 Safety and security issues................... 30 2.4.4 Checking and reporting issues................. 30 v Contents vi 2.5 GNU/Linux formatted functions.................... 31 2.5.1 Format string structure difference............... 31 2.5.2 List of additional functions................... 31 2.5.3 Safety and security issues................... 32 2.5.4 Checking and reporting issues................. 32 2.6 Windows formatted function...................... 32 2.6.1 Format string structure difference............... 32 2.6.2 List of additional functions................... 33 2.6.3 Safety and security issues................... 35 2.6.4 Checking and reporting issues................. 35 2.7 Command and program execution functions............. 35 2.7.1 Safety and security issues................... 37 2.8 Memory manipulation functions.................... 38 2.8.1 Safety and security issues................... 39 3 Static analysis tools 42 3.1 Front-end: parsing and translating.................. 43 3.2 Middle-end : computation and detection............... 49 3.3 Back-end: results collection and reporting.............. 51 3.4 List of tools............................... 51 3.5 Major problems of static analysis................... 52 4 Security vulnerabilities in C language applications 55 4.1 Static analysis for security....................... 57 4.2 Covered C language parts....................... 58 4.3 Covered security vulnerabilities.................... 60 4.4 Properties for security vulnerability detection and reporting.... 63 4.4.1 Format string vulnerabilities.................. 63 4.4.2 Command execution vulnerabilities.............. 69 4.4.3 Buffer and memory vulnerabilities............... 73 4.5 Test and evaluation of Carto-C.................... 76 4.5.1 Test with synthetic test cases................. 77 4.5.2 Test on real applications.................... 81 4.6 Discussion and conclusions....................... 83 5 Vulnerability detection by behavioral pattern recognition 84 5.1 Concolic execution and binary code analysis............. 86 5.2 Exploitable vulnerabilities....................... 87 5.3 Formalization of exploitable vulnerability detection......... 92 5.4 Formal model application on exploitable vulnerabilities....... 95 5.5 Test and evaluation of Vyper..................... 97 5.5.1 Testing on custom test cases.................. 98 5.5.2 Testing on Juliet test base................... 98 5.5.3 Testing