Classification and Identification of Malicious Code Based on Heuristic Techniques Utilizing Meta Languages

Classification and identification of malicious code based on heuristic techniques utilizing Meta languages Dissertation zur Erlangung des Doktorgrades des Fachbereiches Informatik der Universität Hamburg vorgelegt von Dipl. Inf. Markus Schmall Hamburg 2003 Gutachter: • Prof. Dr. Klaus Brunnstein • Prof. Dr. H.J. Bentz • Dr. H.J. Mück Die letzte mündliche Prüfung wurde im März 1998 im Fach “Marketing” an der Universität Hildesheim abgelegt. Classification and identification of malicious code based on heuristic techniques utilizing meta languages Contents Table of figures ....................................................................................................................................... 4 Introduction ............................................................................................................................................. 5 Thanks ..................................................................................................................................................... 7 Statement................................................................................................................................................. 8 1. Definitions........................................................................................................................................... 9 2. MetaMS Meta language .................................................................................................................... 18 2.1 Description of the basic elements of the Meta language MetaMS.............................................. 24 2.1.1 Description of program flows based on MetaMS ................................................................ 48 2.1.2 Variant detection utilizing MetaMS .................................................................................... 53 2.2 Description of the W97M/Melissa.A functionality based on the MetaMS language.................. 57 2.3 Description of the VBS/Loveletter.A Email replication functionality based on the MetaMS language ............................................................................................................................................ 62 3. Presentations of malicious code and runtime environments......................................................... 66 3.1 Virus analysis: W97M/Chydow.A.............................................................................................. 66 3.2 Virus analysis: VBS/Loveletter.A............................................................................................... 68 3.2.1 MetaMS representation of VBS/Loveletter.A file replication routine................................. 72 3.3 Virus analysis: W97M/Melissa.A ............................................................................................... 83 3.4 Virus analysis: VBS/FakeHoax.A (VBS/NoWobbler) ............................................................... 87 3.5 Virus analysis: Palm/Liberty.A................................................................................................... 94 3.6 Virus analysis: Palm/Phage.963.................................................................................................. 97 3.7 Virus analysis: PHP/Pirus.A ..................................................................................................... 102 3.8 Virus analysis: Amiga/HitchHiker 5.00.................................................................................... 108 3.9 Kit analysis: VBS/VBSWG ...................................................................................................... 111 3.10 Virus analysis: W97M/Class.A............................................................................................... 115 3.11 Code analysis: JS/Xilos.A....................................................................................................... 120 3.12 Detailed look at applications, runtime environments and languages related to malicious code in context of MetaMS.......................................................................................................................... 125 3.12.1 Windows Scripting Host.................................................................................................. 126 3.12.2 Microsoft Office 200x ..................................................................................................... 128 3.12.3 Javascript ......................................................................................................................... 131 3.12.4 ActiveX/COM.................................................................................................................. 135 3.12.5 WML Script..................................................................................................................... 136 3.12.6 UML description.............................................................................................................. 139 3.12.7 PHP.................................................................................................................................. 141 3.12.8 HTML.............................................................................................................................. 143 4. Relevant detection/classification methods....................................................................................... 145 4.1 Heuristic technologies............................................................................................................... 146 4.2 Self-adaptation approaches as additional method to standard weight/rule based systems........ 153 4.3 Checksums ................................................................................................................................ 155 4.4 Scan string technologies............................................................................................................ 158 4.5 Script languages ........................................................................................................................ 160 4.6 Classification and rating of basis techniques and combination approaches.............................. 161 4.6.1 Weaknesses of the basis techniques................................................................................... 161 4.7 Theoretical concept „Classification of malicious code based on statistical information“ ........ 163 5. Detailed look at addressed, planned and related platforms ............................................................. 165 5.1 WML Script/WAP 1.2.x ........................................................................................................... 165 5.1.1 Aggression points for malicious WML script code ........................................................... 167 5.1.2 WML Script Libraries........................................................................................................ 168 5.1.3 WTAI functions ................................................................................................................. 173 5.1.4 Mass mailer functionality using WTAI library functions.................................................. 175 5.1.5 Payload functionality based on WTAI functions............................................................... 178 5.2 Detailed examination: Palm OS 4............................................................................................. 179 2 Classification and identification of malicious code based on heuristic techniques utilizing meta languages 5.3 Examination: Visual Basic Script 5.x ....................................................................................... 187 5.3 Analysis of the language requirements to realise malicious codes ........................................... 189 5.3.1 Language requirements for the creation of replicating code in the context of script languages .................................................................................................................................... 190 5.3.2 Language requirements for the creation of recursive replicating code in the context of binary languages ......................................................................................................................... 192 6. Detailed concept and development of an advanced heuristic engine .............................................. 193 6.0.1 Requirements on the software/applications ....................................................................... 197 6.0.2 Requirements/Definitions for the build process ................................................................ 198 6.0.3 Utilized applications / software ......................................................................................... 201 6.0.4 Structure of the source code project .................................................................................. 202 6.0.5 Function description for MetaMS plug-ins........................................................................ 204 6.1 Technical basis concept for the heuristic engine to detect script language based malicious codes ......................................................................................................................................................... 219 6.1.1 Variable emulator .............................................................................................................. 220 6.1.2 Parser ................................................................................................................................. 221 6.1.3 Object emulator / Library function emulation

Load more