CoDisasm Medium Scale Concatic Disassembly of Self- Modifying Binaries with Overlapping Instructions
!
Jean-Yves Marion Professor at l’Université de Lorraine Institut Universitaire de France ! !
! ! ! NOTE!DE!DESCRIPTION!DE!LA!PROCEDURE!DE!CREATION!! D’UNE!EQUIPE1PROJET!COMMUNE!ENTRE!INRIA!! ET!UN!OU!PLUSIEURS!PARTENAIRES!ACADEMIQUES! AU!SEIN!D’UNE!UMR! ! Septembre!2011! ! Préambule! Le#CNRS#et#Inria#ont#signé#le#15#avril#2011#un#accord#cadre#visant#à#développer#leur# coopération# dans# le# domaine# des# sciences# et# technologies# du# numérique,# et# ainsi# renforcer#le#dispositif#mis#en#place#avec#la#création#de#l’alliance#Allistene.#Cet#accord# cadre#complète#les#modalités#de#coopération#entre#Inria,#le#CNRS,#et#les#universités# telles# que# notamment# exprimées# dans# l’accord# cadre# Inria# –# CPU# en# date# du#17# décembre#2009#et#dans#l’accord#cadre#CNRS#–#CPU#en#date#du#4#novembre#2010.## Cet# accord# cadre# précise# notamment# l’articulation# entre# le# mode# d’organisation# d’Inria# en# équipesNprojets# et# celui# du# CNRS# en# unités# de# recherche# (unités# le# plus# souvent#mixtes#avec#des#établissements#d’enseignement#supérieur).# Une#équipeNprojet#commune#(EPC,#aussi#appelée#équipe#mixte#de#recherche#selon#la# nomenclature# adoptée# par# les# universités)# est# une# équipe# de# personnes# ayant# des# objectifs# scientifiques# et# un# programme# de# recherche# clairement# définis,# sur# une# thématique#focalisée#et#une#durée#fixée.#Une#équipeNprojet#est#animée#par#un#leader# scientifique# qui# a# la# responsabilité# de# coordonner# les# travaux# de# l’ensemble# de# l’équipe.## ! Objet! Cette#note#décrit#le#processus#convenu#entre#le#CNRS#et#Inria#lors#de#la#création#d’une# équipeNprojet# au# sein# d’une# UMR.1# Elle# sera# instanciée# localement# avec# les# universités# et# écoles# concernées,# notamment# pour# prendre# en# compte# les# accords# existants#(par#exemple#dans#le#cas#de#convention#d’UMR#dont#Inria#est#coNtutelle#ou# partenaire).# ! Diffusion! Etablissements#coNtutelles#des#UMR,#directions#des#UMR#et#des#centres#Inria.#
1 Pour simplifier, cette note ne traite que le cas le plus fréquent de création d’une EPC entièrement incluse au sein d’une UMR. Les cas concernant des EPC « à cheval » sur deux UMR, ou non strictement incluses dans une seule UMR, ou encore faisant intervenir d’autres établissements en sus des établissements co-tutelles/partenaires de l’UMR seront traités de façon similaire, mutatis mutandis.
This is a joint works with
Guillaume Bonfante José Fernandez Benjamin Rouxel
Aurélien Thierry Fabrice Sabatier
Jean-Yves Marion Problem What really makes this program?
Input: an x86 obfuscated binary code
What Happen After You Hit Return ?
Jean-Yves Marion Towards a high level semantics
0000000100000e80 push rbp ! 0000000100000e81 mov rbp, rsp! 0000000100000e84 mov qword [ss:rbp+var_8], rdi! 0000000100000e88 mov qword [ss:rbp+var_10], 0x0! Disassembly 0000000100000e90 mov qword [ss:rbp+var_18], 0x0! !0000000100000e98 mov qword [ss:rbp+var_18], 0x0! 0000000100000ea0 mov rax, qword [ss:rbp+var_18] ! 0000000100000ea4 cmp rax, qword [ss:rbp+var_8]! !0000000100000ea8 jge 0x100000f1a! 0000000100000eae mov rax, 0x2! 0000000100000eb8 mov rcx, qword [ss:rbp+var_18]! Memory or PE file 0000000100000ebc mov qword [ss:rbp+var_20], rax! 0000000100000ec0 mov rax, rcx! 0000000100000ec3 cqo ! E80 : 55 48 89 E5 48 89 7D F8 48 C7 45 F0 00 00 00 00 48 C7 45 E8 00 00 0000000100000ec5 mov rcx, qword [ss:rbp+var_20]! E90 : 00 00 48 C7 45 E8 00 00 00 00 48 8B 45 E8 48 3B 45 F8 0F 8D 6C 00 0000000100000ec9 idiv rcx! EA0 : 00 00 48 B8 02 00 00 00 00 00 00 00 48 8B 4D E8 48 89 45 E0 48 89 0000000100000ecc cmp rdx, 0x0! EB0 : C8 48 99 48 8B 4D E0 48 F7 F9 48 81 FA 00 00 00 00 0F 85 17 00 00 0000000100000ed3 jne 0x100000ef0! EC0 : 00 48 8B 45 F0 48 C1 E0 01 48 05 02 00 00 00 48 89 45 F0 E9 12 00 ED0 : 00 00 48 8B 45 F0 48 C1 E0 01 48 05 01 00 00 00 48 89 45 F0 E9 00 ! 0000000100000ed9 mov rax, qword [ss:rbp+var_10]! EE0 : 00 00 00 48 8B 45 E8 48 05 01 00 00 00 48 89 45 E8 E9 86 FF FF FF 0000000100000edd shl rax, 0x1! EF0 : 48 8B 45 F0 5D C3 0000000100000ee1 add rax, 0x2! 0000000100000ee7 mov qword [ss:rbp+var_10], rax! !0000000100000eeb jmp 0x100000f02! 0000000100000ef0 mov rax, qword [ss:rbp+var_10] ! 0000000100000ef4 shl rax, 0x1! 0000000100000ef8 add rax, 0x1! !0000000100000efe mov qword [ss:rbp+var_10], rax! !0000000100000f02 jmp 0x100000f07 ! 0000000100000f07 mov rax, qword [ss:rbp+var_18] ! 0000000100000f0b add rax, 0x1! 0000000100000f11 mov qword [ss:rbp+var_18], rax! !0000000100000f15 jmp 0x100000ea0! 0000000100000f1a mov rax, qword [ss:rbp+var_10] ! 0000000100000f1e pop rbp! 0000000100000f1f ret
Jean-Yves Marion Towards a high level semantics
0000000100000e80 push rbp ! _suite:0000000100000e81 mov rbp, rsp! push rbp mov 0000000100000e84 rbp, rsp mov qword [ss:rbp+var_8], rdi! mov 0000000100000e88 qword [ss:rbp+var_8], movrdi qword [ss:rbp+var_10], 0x0! mov qword [ss:rbp+var_10], 0x0 mov 0000000100000e90 qword [ss:rbp+var_18], mov 0x0 qword [ss:rbp+var_18], 0x0! mov !0000000100000e98 qword [ss:rbp+var_18], mov 0x0 qword [ss:rbp+var_18], 0x0! 0000000100000ea0 mov rax, qword [ss:rbp+var_18] ! 0000000100000ea4 cmp rax, qword [ss:rbp+var_8]! 0000000100000ea8 jge 0x100000f1a! 0x100000ea0: mov ! rax, qword [ss:rbp+var_18] cmp 0000000100000eae rax, qword [ss:rbp +var_8] mov rax, 0x2! jge 0x100000f1a 0000000100000eb8 mov rcx, qword [ss:rbp+var_18]! 0000000100000ebc mov qword [ss:rbp+var_20], rax! 0000000100000ec0 mov rax, rcx! 0000000100000ec3 cqo ! E80 : 55 48 89 E5 48 89 7D F8 48 C7 45 F0 00 00 00 00 48 C7 45 E8 00 00 0000000100000ec5 mov rcx, qword [ss:rbp+var_20]! E90 : 00 00 48 C7 45 E8 00 00 00 00 48 8B 45 E8 48 3B 45 F8 0F 8D 6C 00 0x100000eae: mov 0000000100000ec9 rax, 0x2 idiv rcx! EA0 : 00 00 48 B8 02 00 00 00 00 00 00 00 48 8B 4D E8 48 89 45 E0 48 89 mov 0000000100000ecc rcx, qword [ss:rbp +var_18] cmp rdx, 0x0! EB0 : C8 48 99 48 8B 4D E0 48 F7 F9 48 81 FA 00 00 00 00 0F 85 17 00 00 mov qword [ss:rbp+var_20], rax 0x100000f1a: mov 0000000100000ed3 rax, rcx jne 0x100000ef0mov ! rax, qword [ss:rbp+var_10] EC0 : 00 48 8B 45 F0 48 C1 E0 01 48 05 02 00 00 00 48 89 45 F0 E9 12 00 cqo pop rbp ED0 : 00 00 48 8B 45 F0 48 C1 E0 01 48 05 01 00 00 00 48 89 45 F0 E9 00 mov ! rcx, qword [ss:rbp+var_20] ret idiv 0000000100000ed9 rcx mov rax, qword [ss:rbp+var_10]! EE0 : 00 00 00 48 8B 45 E8 48 05 01 00 00 00 48 89 45 E8 E9 86 FF FF FF cmp rdx, 0x0 jne 0000000100000edd 0x100000ef0 shl rax, 0x1! EF0 : 48 8B 45 F0 5D C3 0000000100000ee1 add rax, 0x2! 0000000100000ee7 mov qword [ss:rbp+var_10], rax! !0000000100000eeb jmp 0x100000f02! 0x100000ed9:0000000100000ef0 mov rax,0x100000ef0: qword [ss:rbp+var_10] ! mov rax, qword [ss:rbp+var_10] mov rax, qword [ss:rbp+var_10] shl 0000000100000ef4 rax, 0x1 shl rax, 0x1shl ! rax, 0x1 add 0000000100000ef8 rax, 0x2 add rax, add0x1 ! rax, 0x1 mov qword [ss:rbp+var_10], rax mov qword [ss:rbp+var_10], rax jmp !0000000100000efe 0x100000f02 mov qword [ss:rbp+var_10], rax! !0000000100000f02 jmp 0x100000f07 ! 0000000100000f07 mov rax, qword [ss:rbp+var_18] ! 0x100000f02: 0000000100000f0bjmp 0x100000f07 add rax, 0x1! 0000000100000f11 mov qword [ss:rbp+var_18], rax! !0000000100000f15 jmp 0x100000ea0! 0000000100000f1a mov rax, qword [ss:rbp+var_10] ! 0x100000f07: 0000000100000f1e pop rbp! mov rax, qword [ss:rbp+var_18] Control Flow Graph add rax, 0x10000000100000f1f ret mov qword [ss:rbp+var_18], rax jmp 0x100000ea0
Jean-Yves Marion Towards a high level semantics
0000000100000e80 push rbp ! 0000000100000e81 mov rbp, rsp! _suite: push rbp 0000000100000e84 mov qword [ss:rbp+var_8], rdi! mov rbp, rsp 0000000100000e88 mov qword [ss:rbp+var_10], 0x0! mov qword [ss:rbp+var_8], rdi mov qword [ss:rbp+var_10], 0x0 0000000100000e90 mov qword [ss:rbp+var_18], 0x0! mov qword [ss:rbp+var_18], 0x0 !0000000100000e98 mov qword [ss:rbp+var_18], 0x0! mov qword [ss:rbp+var_18], 0x0 0000000100000ea0 mov rax, qword [ss:rbp+var_18] ! 0000000100000ea4 cmp rax, qword [ss:rbp+var_8]! 0000000100000ea8 jge 0x100000f1a! 0x100000ea0: ! mov rax, qword [ss:rbp+var_18] 0000000100000eae mov rax, 0x2! cmp rax, qword [ss:rbp+var_8] jge 0x100000f1a long int suite(long int x) 0000000100000eb8 mov rcx, qword [ss:rbp+var_18]! 0000000100000ebc mov qword [ss:rbp+var_20], rax! { 0000000100000ec0 mov rax, rcx! 0000000100000ec3 cqo ! long int u=0; 0000000100000ec5 mov rcx, qword [ss:rbp+var_20]! 0x100000eae: long int i=0; 0000000100000ec9 idiv rcx! mov rax, 0x2 mov rcx, qword [ss:rbp+var_18] 0000000100000ecc cmp rdx, 0x0! mov qword [ss:rbp+var_20], for(i=0;i Jean-Yves Marion Abstraction by reverse-refinement long int suite(long int x) High-level semantics { long int u=0; long int i=0; for(i=0;i _suite: push rbp mov rbp, rsp mov qword [ss:rbp+var_8], rdi mov qword [ss:rbp+var_10], 0x0 mov qword [ss:rbp+var_18], 0x0 mov qword [ss:rbp+var_18], 0x0 0x100000ea0: mov rax, qword [ss:rbp+var_18] cmp rax, qword [ss:rbp+var_8] jge 0x100000f1a 0x100000eae: mov rax, 0x2 mov rcx, qword [ss:rbp+var_18] mov qword [ss:rbp+var_20], rax 0x100000f1a: mov rax, rcx mov rax, qword [ss:rbp+var_10] cqo pop rbp mov rcx, qword [ss:rbp+var_20] ret idiv rcx cmp rdx, 0x0 jne 0x100000ef0 0x100000ed9: mov rax, qword [ss:rbp+var_10] 0x100000ef0: shl rax, 0x1 mov rax, qword [ss:rbp+var_10] add rax, 0x2 shl rax, 0x1 mov qword [ss:rbp+var_10], rax add rax, 0x1 jmp 0x100000f02 mov qword [ss:rbp+var_10], rax 0x100000f02: jmp 0x100000f07 0x100000f07: mov rax, qword [ss:rbp+var_18] add rax, 0x1 mov qword [ss:rbp+var_18], rax jmp 0x100000ea0 Disassembly is a crucial step 0000000100000e80 push rbp ! 0000000100000e81 mov rbp, rsvp! 0000000100000e84 mov qword [ss:rbp+var_8], rdi! 0000000100000e88 mov qword [ss:rbp+var_10], 0x0! 0000000100000e90 mov qword [ss:rbp+var_18], 0x0! 0000000100000e98 mov qword [ss:rbp+var_18], 0x0 E80 : 55 48 89 E5 48 89 7D F8 48 C7 45 F0 00 00 00 00 48 C7 45 E8 00 00 E90 : 00 00 48 C7 45 E8 00 00 00 00 48 8B 45 E8 48 3B 45 F8 0F 8D 6C 00 EA0 : 00 00 48 B8 02 00 00 00 00 00 00 00 48 8B 4D E8 48 89 45 E0 48 89 EB0 : C8 48 99 48 8B 4D E0 48 F7 F9 48 81 FA 00 00 00 00 0F 85 17 00 00 EC0 : 00 48 8B 45 F0 48 C1 E0 01 48 05 02 00 00 00 48 89 45 F0 E9 12 00 ED0 : 00 00 48 8B 45 F0 48 C1 E0 01 48 05 01 00 00 00 48 89 45 F0 E9 00 EE0 : 00 00 00 48 8B 45 E8 48 05 01 00 00 00 48 89 45 E8 E9 86 FF FF FF EF0 : 48 8B 45 F0 5D C3 Low-level semantics Jean-Yves Marion Context Practical issues related to binary analysis Code Protections against analysis Jean-Yves Marion MA"F')2%PH%>F6)$/)%J?%<)#63'I%6))$=% a31:Cb)()'%>F6*A#/*3"$6 Malware protections SB* 3$ "BA @"A'DG @) /#$ :#()H