Reverse Engineering III: PE Format
Gergely Erdélyi – Senior Manager, Anti-malware Research
Protecting the irreplaceable | f-secure.com Introduction to PE
• PE stands for Portable Executable • Microsoft introduced PE in Windows NT 3.1 • It originates from Unix COFF • Features dynamic linking, symbol exporting/importing • Can contain Intel, Alpha, MIPS and even .NET MSIL binary code • 64-bit version is called PE32+
2 Complete Structure of PE
3 File Headers: MZ header
struct _IMAGE_DOS_HEADER { 0x00 WORD e_magic; MZ Header 0x02 WORD e_cblp; 0x04 WORD e_cp; PE Header 0x06 WORD e_crlc; 0x08 WORD e_cparhdr; 0x0a WORD e_minalloc; 0x0c WORD e_maxalloc; Code Section 0x0e WORD e_ss; 0x10 WORD e_sp; 0x12 WORD e_csum; 0x14 WORD e_ip; Data Section 0x16 WORD e_cs; 0x18 WORD e_lfarlc; 0x1a WORD e_ovno; 0x1c WORD e_res[4]; Imports 0x24 WORD e_oemid; 0x26 WORD e_oeminfo; 0x28 WORD e_res2[10]; Resources 0x3c DWORD e_lfanew; };
4 File Headers: PE Header
MZ Header struct _IMAGE_NT_HEADERS { 0x00 DWORD Signature; 0x04 _IMAGE_FILE_HEADER FileHeader; File Header 0x18 _IMAGE_OPTIONAL_HEADER OptionalHeader; }; Optional Header
Section Header
Section Header
5 File Headers: File Header
MZ Header struct _IMAGE_FILE_HEADER { 0x00 WORD Machine; 0x02 WORD NumberOfSections; File Header 0x04 DWORD TimeDateStamp; 0x08 DWORD PointerToSymbolTable; 0x0c DWORD NumberOfSymbols; Optional 0x10 WORD SizeOfOptionalHeader; Header 0x12 WORD Characteristics; }; Section Header
Section Header
6 File Headers: Optional Header
struct _IMAGE_OPTIONAL_HEADER { 0x00 WORD Magic; 0x02 BYTE MajorLinkerVersion; MZ Header 0x03 BYTE MinorLinkerVersion; 0x04 DWORD SizeOfCode; 0x08 DWORD SizeOfInitializedData; 0x0c DWORD SizeOfUninitializedData; 0x10 DWORD AddressOfEntryPoint; File Header 0x14 DWORD BaseOfCode; 0x18 DWORD BaseOfData; 0x1c DWORD ImageBase; 0x20 DWORD SectionAlignment; Optional 0x24 DWORD FileAlignment; 0x28 WORD MajorOperatingSystemVersion; Header 0x2a WORD MinorOperatingSystemVersion; 0x2c WORD MajorImageVersion; 0x2e WORD MinorImageVersion; 0x30 WORD MajorSubsystemVersion; 0x32 WORD MinorSubsystemVersion; Section 0x34 DWORD Win32VersionValue; 0x38 DWORD SizeOfImage; 0x3c DWORD SizeOfHeaders; Header 0x40 DWORD CheckSum; 0x44 WORD Subsystem; 0x46 WORD DllCharacteristics; 0x48 DWORD SizeOfStackReserve; 0x4c DWORD SizeOfStackCommit; Section 0x50 DWORD SizeOfHeapReserve; 0x54 DWORD SizeOfHeapCommit; Header 0x58 DWORD LoaderFlags; 0x5c DWORD NumberOfRvaAndSizes; 0x60 _IMAGE_DATA_DIRECTORY DataDirectory[16]; };
7 File Headers: Optional Header
struct _IMAGE_OPTIONAL_HEADER { 0x00 WORD Magic; MZ Header 0x02 BYTE MajorLinkerVersion; 0x03 BYTE MinorLinkerVersion; 0x04 DWORD SizeOfCode; File Header 0x08 DWORD SizeOfInitializedData; 0x0c DWORD SizeOfUninitializedData; 0x10 DWORD AddressOfEntryPoint; Optional 0x14 DWORD BaseOfCode; 0x18 DWORD BaseOfData; Header 0x1c DWORD ImageBase; 0x20 DWORD SectionAlignment; 0x24 DWORD FileAlignment; 0x28 WORD MajorOperatingSystemVersion; Section 0x2a WORD MinorOperatingSystemVersion; Header 0x2c WORD MajorImageVersion; 0x2e WORD MinorImageVersion; 0x30 WORD MajorSubsystemVersion; 0x32 WORD MinorSubsystemVersion; Section 0x34 DWORD Win32VersionValue; 0x38 DWORD SizeOfImage; Header 0x3c DWORD SizeOfHeaders; 0x40 DWORD CheckSum; 0x44 WORD Subsystem; 0x46 WORD DllCharacteristics; 0x48 DWORD SizeOfStackReserve; 8 0x4c DWORD SizeOfStackCommit; 0x50 DWORD SizeOfHeapReserve; 0x54 DWORD SizeOfHeapCommit; 0x58 DWORD LoaderFlags; 0x5c DWORD NumberOfRvaAndSizes; 0x60 _IMAGE_DATA_DIRECTORY DataDirectory[16]; }; struct _IMAGE_OPTIONAL_HEADER { 0x00 WORD Magic; 0x02 BYTE MajorLinkerVersion; 0x03 BYTE MinorLinkerVersion; 0x04 DWORD SizeOfCode; 0x08 DWORD SizeOfInitializedData; 0x0c DWORD SizeOfUninitializedData; 0x10 DWORD AddressOfEntryPoint; 0x14 DWORD BaseOfCode; File Headers: Optional Header0x18 DWORD BaseOfData; 0x1c DWORD ImageBase; 0x20 DWORD SectionAlignment; 0x24 DWORD FileAlignment; MZ Header 0x28 WORD MajorOperatingSystemVersion; 0x2a WORD MinorOperatingSystemVersion; 0x2c WORD MajorImageVersion; File Header 0x2e WORD MinorImageVersion; 0x30 WORD MajorSubsystemVersion; 0x32 WORD MinorSubsystemVersion; Optional 0x34 DWORD Win32VersionValue; 0x38 DWORD SizeOfImage; Header 0x3c DWORD SizeOfHeaders; 0x40 DWORD CheckSum; 0x44 WORD Subsystem; Section 0x46 WORD DllCharacteristics; Header 0x48 DWORD SizeOfStackReserve; 0x4c DWORD SizeOfStackCommit; 0x50 DWORD SizeOfHeapReserve; 0x54 DWORD SizeOfHeapCommit; Section 0x58 DWORD LoaderFlags; 0x5c DWORD NumberOfRvaAndSizes; Header 0x60 _IMAGE_DATA_DIRECTORY DataDirectory[16]; };
9 File Headers: Image Directory
IMAGE_DIRECTORY_ENTRY_EXPORT
structIMAGE_DIRECT _IMAGE_DAORTA_DIRECTY_ENTRY_IMPORORY { T 0x00 DWORD VirtualAddress; IMAGE_DIRECTORY_ENTRY_RESOURCE MZ Header 0x04struct _IMAGE_DADWORD Size;TA_DIRECTORY { };0x00 DWORD VirtualAddress; 0x04IMAGE_DIRECTstruct _IMAGE_DADWORD Size;ORTA_DIRECTY_ENTRY_EXCEPTIONORY { };0x00 DWORD VirtualAddress; IMAGE_DIRECTstruct0x04 _IMAGE_DADWORD Size;ORTA_DIRECTY_ENTRY_SECURITYORY { File Header 0x00}; DWORD VirtualAddress; IMAGE_DIRECTstruct0x04 _IMAGE_DADWORD Size;ORTA_DIRECTY_ENTRY_BASERELOCORY { 0x00}; DWORD VirtualAddress; IMAGE_DIRECTstruct0x04 _IMAGE_DADWORD Size;ORTA_DIRECTY_ENTRY_DEBUGORY { Optional 0x00}; DWORD VirtualAddress; IMAGE_DIRECTstruct0x04 _IMAGE_DADWORD Size;ORTA_DIRECTY_ENTRY_COPYRIGHTORY { 0x00}; DWORD VirtualAddress; Header structIMAGE_DIRECT0x04 _IMAGE_DADWORD Size;ORTA_DIRECTY_ENTRY_GLOBALPTRORY { 0x00}; DWORD VirtualAddress; IMAGE_DIRECT0x04struct _IMAGE_DADWORD Size;ORTA_DIRECTY_ENTRY_TLSORY { };0x00 DWORD VirtualAddress; Section IMAGE_DIRECTstruct0x04 _IMAGE_DADWORD Size;ORTA_DIRECTY_ENTRY_LOAD_CONFIGORY { 0x00}; DWORD VirtualAddress; IMAGE_DIRECTstruct0x04 _IMAGE_DADWORD Size;ORTA_DIRECTY_ENTRY_BOUND_IMPORORY { T Header 0x00}; DWORD VirtualAddress; IMAGE_DIRECTstruct0x04 _IMAGE_DADWORD Size;ORTA_DIRECTY_ENTRY_IAORTY { 0x00}; DWORD VirtualAddress; struct0x04IMAGE_DIRECT _IMAGE_DADWORD Size;ORTA_DIRECTY_ENTRY_DELAORY {Y_IMPORT Section 0x00}; DWORD VirtualAddress; 0x04IMAGE_DIRECTstruct _IMAGE_DADWORD Size;ORTA_DIRECTY_ENTRY_COM_DESCRIPTORY { OR };0x00 DWORD VirtualAddress; Header struct0x04 _IMAGE_DADWORD Size;TA_DIRECTORY { 0x00}; DWORD VirtualAddress; 0x04 DWORD Size; };
10 File Headers: Section Header
MZ Header File Header typedef struct _IMAGE_SECTION_HEADER { 0x00 BYTE Name[IMAGE_SIZEOF_SHORT_NAME]; Optional union { 0x08 DWORD PhysicalAddress; Header 0x08 DWORD VirtualSize; } Misc; 0x0c DWORD VirtualAddress; Section 0x10 DWORD SizeOfRawData; Header 0x14 DWORD PointerToRawData; 0x18 DWORD PointerToRelocations; 0x1c DWORD PointerToLinenumbers; 0x20 WORD NumberOfRelocations; Section 0x22 WORD NumberOfLinenumbers; Header 0x24 DWORD Characteristics; };
11 PE Image Loading
File image Memory Image
MZ/PE Header MZ/PE Header Section Table Section Table .text .text .data .idata .data .rsrc .idata .rsrc
12 Memory Layout
0x00400000 App. Image
App. Stack
App. Heap
0x00800000 SOME.DLL
0x77E00000 KERNEL32.DLL
13 Importing Symbols
• Symbols (functions/data) can be imported from external DLLs • The loader will load external DLLs automatically • All the dependencies are loaded as well • DLLs will be loaded only once • External addresses are written to the Import Address Table (IAT) • IAT is most often located in the .idata section
14 DLL Dependency Tree
Depends tool shows all dependencies http://www.dependencywalker.com/
15 Importing Symbols
• Each DLL has one IMAGE_IMPORT_DESCRIPTOR • The descriptor points to two parallel lists of symbols to import • Import Address Table (IAT) • Import Name Table (INT) • The primary list is overwritten by the loader, the second one is not • Executables can be pre-bound to DLLs to speed up loading • Symbols can be imported by ASCII name or ordinal
16 Imports
.idata section
IMPORT_DESCRIPTORS: Calling: USER32.DLL ADVAPI32.DLL CALL [IMPORT1]
Import Name Table(INT): ... ‘IMPORT1’ ‘IMPORT2’ CALL [IMPORT2] Import Adddress Table (IAT): DWORD IMPORT1 DWORD IMPORT2
17 Import Descriptors
struct _IMAGE_IMPORT_DESCRIPTOR { 0x00structunion _IMAGE_IMPOR { T_DESCRIPTOR { 0x00 struct union _IMAGE_IMPOR /* {0 for terminatingT_DESCRIPT null importOR descriptor { */ 0x00 0x00 union DWORD /* {0 for terminatingCharacteristics; null import descriptor */ IMAGE_DIRECTORY_ENTRY_IMPORT 0x00 /* RDWORD V /*A 0to for original terminatingCharacteristics; unbound null IA importT */ descriptor */ 0x00 0x00 PIMAGE_THUNK_DA /* RDWORDVA to originalCharacteristics; unboundTA OriginalFirstThunk; IAT */ struct _IMAGE_DATA_DIRECTORY { 0x00 } u; PIMAGE_THUNK_DA /* RVA to original unboundTA OriginalFirstThunk; IAT */ 0x00 DWORD VirtualAddress; 0x040x00DWORD} u; PIMAGE_THUNK_DATimeDateStamp; TA/* OriginalFirstThunk;0 if not bound, 0x04 DWORD Size; 0x040x08DWORD} u;DWORD TimeDateStamp;ForwarderChain;/* 0 if/* not -1 ifbound, no forwarders */ }; 0x0c0x04DWORD0x08DWORDDWORDName;TimeDateStamp;ForwarderChain;/* 0 if/* not -1 ifbound, no forwarders */ 0x0c/* RDWORDV0x08A to IADWORDT (if boundName; thisForwarderChain; IAT has actual addresses)/* -1 if no*/ forwarders */ 0x100x0cPIMAGE_THUNK_DA/* RDWORDVA to IAT (if boundName;TA thisFirstThunk; IAT has actual addresses) */ }; 0x10 PIMAGE_THUNK_DA/* RVA to IAT (if boundTA thisFirstThunk; IAT has actual addresses) */ }; 0x10 PIMAGE_THUNK_DATA FirstThunk; };
typedef struct _IMAGE_THUNK_DATA { typedefunion struct { _IMAGE_THUNK_DATA { 0x00 unionLPBYTE { ForwarderString; 0x000x00 PDWORDLPBYTE Function; ForwarderString; 0x000x00 DWORDPDWORD Ordinal; Function; 0x000x00 PIMAGE_IMPORDWORD Ordinal;T_BY_NAME AddressOfData; 0x00} u1; PIMAGE_IMPORT_BY_NAME AddressOfData; } IMAGE_THUNK_DA} u1; TA,*PIMAGE_THUNK_DATA; } IMAGE_THUNK_DATA,*PIMAGE_THUNK_DATA;
typedef struct _IMAGE_IMPORT_BY_NAME { 0x00typedefWORD struct _IMAGE_IMPORHint; T_BY_NAME { 0x020x00BYTEWORDName[1];Hint; } IMAGE_IMPOR0x02 BYTE Name[1];T_BY_NAME,*PIMAGE_IMPORT_BY_NAME; } IMAGE_IMPORT_BY_NAME,*PIMAGE_IMPORT_BY_NAME;
18 Exporting Symbols
• Symbols can be exported with ordinals, names or both • Ordinals are simple index numbers of symbols • Name is a full ASCII name of the exported symbol • Exports are described by three simple lists • List of ordinals • List of symbol addresses • List of symbol names • Exports can be forwarded to another DLL • Example: NTDLL.RtlAllocHeap • Forwarded symbol’s address points to a name in the exports section
19 Resources
• Resources in PE are similar to an archive (think ZIP) • Resource files can be organised into directory trees • The data structure is quite complex but there are tools to handle it • Most common resources: • Icons • Version information • GUI resources
20 Base Relocation
Preferred Relocation offset image base MZ Header Actual PE Header MZ Header image base PE Header Code Section Code Section Data Section Data Section Imports
Resources Imports Resources
21 Base Relocation
• Sometimes a DLL can not be loaded to its preferred address • When rebasing, the loader has to adjust all hardcoded addresses • Relocations are done in 4KiB blocks (page size on x86) • Each relocation entry gives a type and points to a location • The loader calculates the base address difference • The offsets are adjusted according to the difference
22 Tools for PE Files
Hex Editors • HT Editor at http://hte.sourceforge.net/ • BIEW at http://sourceforge.net/projects/beye/ • Hiew at http://www.hiew.ru/ Programmatic Access • pefile – python module at http://code.google.com/p/pefile/
23 Reading
Matt Pietrek: An In-Depth Look into PE:
http://msdn.microsoft.com/msdnmag/issues/02/02/PE/default.aspx
http://msdn.microsoft.com/msdnmag/issues/02/03/PE2/default.aspx
Microsoft Portable Executable and Common Object File Format Specification http://www.microsoft.com/whdc/system/platform/firmware/PECOFF.mspx
24 Closing
Special thanks to Ero Carrera for his beautiful PE diagrams and the permission to use them in this presentation!
https://www.openrce.org/reference_library/files/reference/PE %20Format.pdf
http://blog.dkbza.org/2007/03/tiny-and-crazy-pe.html
25