
DEGREE PROJECT IN TECHNOLOGY, FIRST CYCLE, 15 CREDITS STOCKHOLM, SWEDEN 2020

Is it possible to reverse engineer obfuscated bytecode back to source code?

Är det möjligt att dekompilera obfuskerad bytekod tillbaka till källkod?

GUSTAV SMEDBERG

JENNY MALMGREN

KTH SCHOOL OF ENGINEERING SCIENCES IN CHEMISTRY, BIOTECHNOLOGY AND HEALTH

Is it possible to reverse engineer obfuscated bytecode back to source code?


Gustav Smedberg Jenny Malmgren

Degree project in Computer Engineering, first cycle, 15 credits. Supervisor at KTH: Anders Cajander. Examiner: Ibrahim Orhan. TRITA-CBH-GRU-2020:052

KTH School of Engineering Sciences in Chemistry, Biotechnology and Health, 141 52 Huddinge, Sweden

Summary

There is a great deal of old software in the world that is no longer maintained and that would need to be updated in order to close security holes or to update its functionality. In cases where the source code has been lost or deleted, would it be possible to use decompilation to recover it?

This report aims to describe what bytecode is and how it is used, how Java bytecode can be turned back into source code through a process called decompilation, and how code can be protected against this through obfuscation. It also presents previous research on decompilation and obfuscation, complemented by explanations of what Java, bytecode and obfuscation are and how they work. Three programs of varying complexity are compiled to bytecode, obfuscated and then decompiled, and the result is compared with the original source code.

In conclusion, it is possible to decompile the obfuscated code, but only some parts of the source code can be recovered. All variable names and unused methods disappear completely, and the code is sometimes changed to unconventional ways of programming.

Keywords: reverse engineering, Java, JVM, bytecode, obfuscation, decompilation, security.

Abstract

There is a lot of old software in the world that is no longer supported or kept up to date and that would need to be updated to close security vulnerabilities, as well as to update functions in the program. In cases where the source code has been lost or deliberately deleted, would it be possible to use reverse engineering to retrieve the source code?

This study aims to show what Java bytecode is and how it is used, as well as how one can go from Java bytecode back to source code in a process called reverse engineering. Furthermore, the study presents previous work in reverse engineering and obfuscation, and explains in further detail what Java, bytecode and obfuscation are and how they work. Three programs of varying complexity are compiled to bytecode and then obfuscated. The differences between the original code and the obfuscated code are then analyzed.

The results show that it is possible to reverse engineer obfuscated code, but only some parts of the source code can be recovered. Obfuscation does protect the code: all variable names are changed, every unused method is removed, and some methods are rewritten in unconventional ways.

Keywords: reverse engineering, Java, JVM, bytecode, obfuscation, safety.

Acknowledgements

The authors of this paper have attended KTH for an entire bachelor’s degree, and this is the biggest and last course. We wish to thank all of our teachers for supporting us during these years.

A big thank you to Anders Cajander, who has been our mentor for this project, and Ibrahim Orhan, who has been our examiner; both helped us with their experience and perspective. Thank you to Sebastian Zeerak, who lent us the code for the most advanced program, which was made together with Gustav Smedberg. A big thank you to Anders Lindstöm, the author of AudioStreamUDP.java, who gave us permission to use it in our thesis.

We also wish to thank AstraZeneca, and in particular Mikael Engström and Olle Sundholm, who helped us settle on a report subject even though they did not have the resources to supervise us during the project.

We also wish to thank our friends and families for the support they have given us during the making of this project.

Table of contents

1 Introduction
1.1 Problem definition
1.2 Objective
1.3 Limitations
2 Theory and background
2.1 Coding language
2.2 Java and Bytecode
2.2.1 Class file
2.3 Virtual Machine System
2.3.1 Java Virtual Machine
2.4 Reverse engineering analytics methods
2.4.1 Dynamic reverse engineering
2.4.2 Static reverse engineering
2.5 Bytecode Obfuscation
2.6 Previous work
2.7 Tools and frameworks
2.7.1 Javap - The Java Class File Disassembler
2.7.2 Decompilers
2.7.3 Dynamic analysis tool
2.7.4 Code obfuscating program
2.7.5 IntelliJ IDEA
2.8 Code
2.9 Practical implementation
3 Method
3.1 Analysis Methods
3.2 Java code obfuscation
3.3 Static analysis
3.4 Dynamic analysis
4 Results
4.1 Simple program
4.1.1 Obfuscated code
4.1.2 Decompilation and dynamic analyzation
4.2 More complex program
4.2.1 Obfuscated code
4.2.2 Decompilation and dynamic analysis
4.3 Advanced program
4.3.1 Obfuscated code
4.3.2 Decompilation and dynamic analyzing
4.4 Test environments
5 Analysis and Discussion
5.1 Small program
5.2 More complex program
5.3 Advanced program
5.4 Why different methods and tools were not used
5.5 The impact of this study on society
6 Conclusion
References

INTRODUCTION | 1

1 Introduction

Many companies run old software for which the developer, the company that developed it, or the source code has disappeared. As a result, the software receives no updates with new functionality, bug fixes or security patches. Examples of such software are drivers for industrial machines such as CNC routers, woodworking machines that are run from a computer, or simply a plug-in card for a computer. These are good examples of where reverse engineering of software could give such products a longer lifetime and better support.

Reverse engineering is the process of disassembling something to understand how it works. It can be done in several fields, such as electrical, mechanical and software engineering, where the purpose can be to recreate something, optimize it, add new functionality, provide documentation, or understand the parts of it that are needed for other work.

Some programming languages use bytecode, which is a form of instructions designed for an interpreter to read. The interpreter translates the bytecode into machine code that the hardware can understand and execute. This means that only the interpreter needs to be designed to fit a particular operating system, which makes all code written in the programming language portable regardless of which operating system is used (e.g. Windows, Linux or Mac).

This study applies reverse engineering in the field of software, in situations where the source code is assumed to be lost and the software in question has been obfuscated. Obfuscation in software is the deliberate act of creating source or machine code that is hard for humans to read and understand. This is done to limit how much can be reverse engineered, in order to protect against other companies or hackers finding what can be considered company secrets or otherwise proprietary.

1.1 Problem definition

This study aims to show the reverse engineering process on Java bytecode that has been obfuscated, as well as the effect that bytecode obfuscation has on the decompiled source code. Reverse engineering code to regain lost source code can be useful for larger corporations, where code may be in use for a long time, the person or even the department in charge may have changed many times, and the source code may have been accidentally lost or deliberately deleted by someone intending to hurt the company.

In cases where the code will be released or sold and the source code must be kept as secret as possible, there is the option to obfuscate it.

This study aims to investigate whether it is possible to reverse engineer obfuscated bytecode back to source code, and to what extent. Many existing studies focus on reverse engineering of bytecode without obfuscation, and this study is a way to increase the knowledge in this field.

1.2 Objective

The aim of this paper is to discuss, test and analyze different strategies to reverse engineer obfuscated bytecode back to source code. It also aims to provide two test environments, one in Windows and one in Linux, as well as a few strategies for how to reverse engineer code.

We have divided the study’s objective into smaller parts, which are as follows.

Pre-study:

● Literature studies
● Has reverse engineering of bytecode to source code been done before, and how successful was it?
● What techniques can be used to obfuscate code?
● What tools can be used to view bytecode?

Testing:

● What tools are available that can help reverse engineer code, and in what way?
● What tools are available to obfuscate bytecode?

Analysing results:

● How much of the source code could be reverse engineered from the obfuscated bytecode?
● Did the length and complexity of the program change the obfuscation in any way?

1.3 Limitations

While bytecode can be compiled from many different languages, this paper focuses only on Java bytecode.

Three programs of different complexity will be tested, both for what is lost when they are compiled to bytecode and decompiled, and for what is lost when they are compiled to obfuscated bytecode and decompiled. While the study would preferably test many more programs, this is not possible within the given timeframe.

Due to the limited time of 12 weeks, the study will limit itself to trying to reverse engineer source code from Java bytecode that has been obfuscated. Only a couple of tools will be used, and those tools will be explained and motivated in chapter 3.

THEORY AND BACKGROUND | 5

2 Theory and background

To understand how reverse engineering works, we must first analyze how code works and the different stages it can be in. This chapter aims to explain how Java and bytecode work at their core. It further explains the Java Virtual Machine and some common ways to analyze code, and describes the tools used for this project.

2.1 Coding language

This paper will focus on Java, developed by Sun Microsystems, which was later acquired by Oracle. According to PYPL, Java was ranked second among the most popular programming languages in May 2020, with a share of 17.75% of searched tutorials [1].

2.2 Java and Bytecode

To understand how some of the tools in this study are used, an understanding of the class file and bytecode is needed. According to the official Java documentation, bytecode is compiled source code with a specific structure describing the class, and bytecode always has a corresponding class file. The bytecode is created by the compiler, which produces a class file that will be described below. The bytecode itself is stored in an array in the Code attribute. The bytecode instructions can be condensed into some broad groups:

● Load and store (e.g. aload_0, istore)
● Arithmetic and logic (e.g. ladd, fcmpl)
● Type conversion (e.g. i2b, d2i)
● Object creation and manipulation (e.g. new, putfield)
● Operand stack management (e.g. swap, dup2)
● Control transfer (e.g. ifeq, goto)
● Method invocation and return (e.g. invokespecial, areturn)

[Table 2.1: The prefix and/or the suffix with their corresponding operation]

Prefix/suffix   Operand type      Prefix/suffix   Operand type
i               integer           c               character
l               long              f               float
s               short             d               double
b               byte              a               reference
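The naming convention in Table 2.1 can be illustrated with a short Java sketch. The class OperandTypes and its method operandType are hypothetical helpers written for this illustration and are not part of the JDK. The sketch only recognizes a small, hand-picked set of operations and does not handle type conversions such as i2b.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of the JDK: looks up the operand type that
// the leading letter of a typed mnemonic stands for (see Table 2.1).
class OperandTypes {
    private static final Map<Character, String> TYPES = Map.of(
            'i', "integer", 'l', "long", 's', "short", 'b', "byte",
            'c', "character", 'f', "float", 'd', "double", 'a', "reference");

    // A small, deliberately incomplete set of operation names that take a type prefix.
    private static final Set<String> TYPED_OPERATIONS = Set.of(
            "load", "store", "const", "add", "sub", "mul", "div", "rem",
            "neg", "cmp", "cmpl", "cmpg", "return", "aload", "astore");

    static String operandType(String mnemonic) {
        // Drop an index suffix such as "_0" or "_m1" before looking at the name.
        int underscore = mnemonic.indexOf('_');
        String base = underscore >= 0 ? mnemonic.substring(0, underscore) : mnemonic;
        if (base.length() < 2 || !TYPED_OPERATIONS.contains(base.substring(1))) {
            return "untyped"; // e.g. dup2, swap, goto, invokespecial
        }
        return TYPES.getOrDefault(base.charAt(0), "untyped");
    }

    public static void main(String[] args) {
        for (String m : new String[] {"aload_0", "istore", "ladd", "fcmpl", "dup2"}) {
            System.out.println(m + " -> " + operandType(m));
        }
    }
}
```

Instructions such as dup2 or invokespecial also start with a letter from the table, which is why the sketch checks the operation name before interpreting the first letter as a type.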

2.2.1 Class file

According to the official documentation of the Java class file, it is a file containing Java bytecode and a description of how a class was built. It consists of a stream of 8-bit bytes, where 16-bit and 32-bit quantities are read as two or four consecutive 8-bit bytes [2].

The structure of a class file is as follows, where u2 and u4 denote 2-byte and 4-byte quantities:

[Figure 2.1: quantities of bytes represent different commands]

Here, magic supplies a number that identifies the format of the file; for class files it always has the hexadecimal value 0xCAFEBABE. The minor and major version declare the version of the class file format, which determines which Java versions the class file can be run on. Constant_pool_count is equal to the number of entries in the constant_pool table plus one, and a constant_pool index is valid if it is greater than zero and less than constant_pool_count. The access flags declare what access permissions and properties a class has and are described in the list below:

● ACC_PUBLIC - the class has been declared public, which allows it to be accessed outside its package.
● ACC_FINAL - the class has been declared final, no subclasses allowed.
● ACC_SUPER - treats superclass methods specially when invoked by the invokespecial instruction.
● ACC_INTERFACE - the file is an interface and not a class.
● ACC_ABSTRACT - the file is declared abstract and may not be instantiated.
● ACC_SYNTHETIC - declared synthetic, not present in the source code.
● ACC_ANNOTATION - declared an annotation type.
● ACC_ENUM - declared as an enum.
● ACC_MODULE - the file is a module and not a class or interface.
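The access flags are stored as a single bit mask. The following is a minimal sketch of how such a mask could be decoded into the names listed above. The class name ClassAccessFlags and the method decode are hypothetical, while the numeric values are those given for class access flags in the Java Virtual Machine specification.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: turns a class-level access_flags value into flag names.
class ClassAccessFlags {
    private static final Map<String, Integer> FLAGS = new LinkedHashMap<>();
    static {
        FLAGS.put("ACC_PUBLIC", 0x0001);
        FLAGS.put("ACC_FINAL", 0x0010);
        FLAGS.put("ACC_SUPER", 0x0020);
        FLAGS.put("ACC_INTERFACE", 0x0200);
        FLAGS.put("ACC_ABSTRACT", 0x0400);
        FLAGS.put("ACC_SYNTHETIC", 0x1000);
        FLAGS.put("ACC_ANNOTATION", 0x2000);
        FLAGS.put("ACC_ENUM", 0x4000);
        FLAGS.put("ACC_MODULE", 0x8000);
    }

    // Returns the names of all flags whose bit is set, in the order listed above.
    static List<String> decode(int accessFlags) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Integer> flag : FLAGS.entrySet()) {
            if ((accessFlags & flag.getValue()) != 0) {
                names.add(flag.getKey());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // 0x0021 is a typical value for an ordinary public class.
        System.out.println(decode(0x0021));
    }
}
```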

This_class must contain a valid index into the constant_pool table, where the entry describes the class or interface defined by this class file and follows the CONSTANT_Class_info structure (see 2.2.1.1). Super_class is either zero or a valid index into the constant_pool table. When super_class is zero, the class file represents the class Object, the only class or interface without a superclass. If it is a valid index, the entry it points to must be a CONSTANT_Class_info structure representing the direct superclass of the class defined in the class file.

Interfaces_count is the number of entries in the interfaces array. Each entry in the interfaces array must be a valid index into the constant_pool table, where the entry must be a CONSTANT_Class_info structure representing an interface that is a direct superinterface of the class or interface type.

Fields_count declares the number of entries in the fields array. Each entry follows the field_info structure, which gives a complete description of a field in the class or interface. The fields array includes only the fields that are declared by this class or interface. It does not include fields inherited from superclasses or superinterfaces.

Methods_count declares the number of methods in the methods array. Each entry follows the method_info structure, which describes a method in the class or interface. If neither the ACC_NATIVE nor the ACC_ABSTRACT flag is set in the access_flags item of a method_info structure, the JVM (Java Virtual Machine) instructions implementing the method are also supplied. The method_info structures represent all methods declared by the class or interface. They do not include methods inherited from superclasses or superinterfaces.

Attributes_count is the number of attributes in the attributes array. Each entry in the array must be an attribute_info structure [3].
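The first items of the class file structure described in this section can be read with the standard DataInputStream class, which reads multi-byte values in the same big-endian order that the class file uses. The sketch below is a minimal illustration and not a full parser: the class name ClassFileHeader is hypothetical, and only magic, the two version numbers and constant_pool_count are read. The example bytes in main are hand-written for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical helper: reads the first four items of a class file.
class ClassFileHeader {
    final int minorVersion;
    final int majorVersion;
    final int constantPoolCount;

    ClassFileHeader(int minorVersion, int majorVersion, int constantPoolCount) {
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
        this.constantPoolCount = constantPoolCount;
    }

    static ClassFileHeader read(byte[] classBytes) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes))) {
            int magic = in.readInt();                 // u4 magic
            if (magic != 0xCAFEBABE) {
                throw new IOException("Not a class file");
            }
            int minor = in.readUnsignedShort();       // u2 minor_version
            int major = in.readUnsignedShort();       // u2 major_version
            int cpCount = in.readUnsignedShort();     // u2 constant_pool_count
            return new ClassFileHeader(minor, major, cpCount);
        }
    }

    public static void main(String[] args) throws IOException {
        // Hand-written header: magic, minor 0, major 52 (Java 8), constant_pool_count 16.
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52, 0, 16};
        ClassFileHeader h = read(header);
        System.out.println(h.majorVersion + "." + h.minorVersion + ", constant pool count " + h.constantPoolCount);
    }
}
```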

2.2.1.1 CONSTANT_Class_info structure

The CONSTANT_Class_info structure contains a one-byte (u1) tag with the value CONSTANT_Class and a two-byte (u2) name_index, which is the index of the constant_pool entry that holds the name of the class.

[Figure 2.2: The structure of a Constant_class_info]

2.2.1.2 Field_info structure

The access flags describe how the field can be accessed and what properties it has: public, private, protected, static, final, volatile, transient, synthetic or enum. The name index gives the position in the constant pool of a Utf8 info structure holding the field’s name. The descriptor index gives the position in the constant pool of another Utf8 info structure holding a valid field descriptor. The attributes count gives the number of attributes in the attributes array, which is composed of attribute_info structures.

[Figure 2.3: The structure of a Field_info]
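A field descriptor is a compact string encoding of the field's type. As a minimal sketch, the following Java code translates a descriptor into the type name as it would be written in source code. The class FieldDescriptors and the method toJavaType are hypothetical names chosen for this illustration; the letter codes follow the field descriptor grammar in the Java Virtual Machine specification.

```java
// Hypothetical helper: converts a field descriptor such as "[[D" into "double[][]".
class FieldDescriptors {
    static String toJavaType(String descriptor) {
        // Each leading '[' adds one array dimension.
        int dimensions = 0;
        while (dimensions < descriptor.length() && descriptor.charAt(dimensions) == '[') {
            dimensions++;
        }
        String base = descriptor.substring(dimensions);
        String type;
        switch (base) {
            case "B": type = "byte"; break;
            case "C": type = "char"; break;
            case "D": type = "double"; break;
            case "F": type = "float"; break;
            case "I": type = "int"; break;
            case "J": type = "long"; break;
            case "S": type = "short"; break;
            case "Z": type = "boolean"; break;
            default:
                // Reference types are written as L<class name>; with '/' as separator.
                if (base.length() > 2 && base.startsWith("L") && base.endsWith(";")) {
                    type = base.substring(1, base.length() - 1).replace('/', '.');
                } else {
                    throw new IllegalArgumentException("Invalid field descriptor: " + descriptor);
                }
        }
        StringBuilder result = new StringBuilder(type);
        for (int i = 0; i < dimensions; i++) {
            result.append("[]");
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(toJavaType("Ljava/lang/String;"));
        System.out.println(toJavaType("[[D"));
    }
}
```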

2.2.1.3 Method_info structure

All the methods in the class file are described by the method_info structure, which is built up of access flags, a name index, a descriptor index, an attributes count and an attribute_info array. The access flags describe how the method can be accessed by declaring it one or more of the following: public, private, protected, static, final, synchronized, bridge, varargs, native, abstract, strict or synthetic. Their meaning and use are described further down. The name index must be a valid index into the constant_pool table representing a valid unqualified name for the method, or one of the special method names <init> or <clinit>. The descriptor index must be a valid index into the constant_pool table representing a valid method descriptor.

[Figure 2.4: The structure of a Method_info]

2.2.1.4 Attribute_info structure

The attribute_info structure is used in the ClassFile, field_info, method_info and Code_attribute structures. It contains an attribute name index, an attribute length and an info array. The attribute name index is a two-byte index into the constant pool, pointing at a Utf8 entry that holds the name of the attribute. The attribute length is a four-byte value giving the length of the info array in bytes.

There are 28 predefined attributes, of which six are critical to the interpretation of the class file: ConstantValue, Code, StackMapTable, BootstrapMethods, NestHost and NestMembers. Nine of the attributes are not critical to the interpretation of the class file but are recommended for a more correct interpretation: Exceptions, InnerClasses, EnclosingMethod, Synthetic, Signature, SourceFile, LineNumberTable, LocalVariableTable and LocalVariableTypeTable. The last 13 attributes are not critical to the interpreter but contain metadata about the class file: SourceDebugExtension, Deprecated, RuntimeVisibleAnnotations, RuntimeInvisibleAnnotations, RuntimeVisibleParameterAnnotations, RuntimeInvisibleParameterAnnotations, RuntimeVisibleTypeAnnotations, RuntimeInvisibleTypeAnnotations, AnnotationDefault, MethodParameters, Module, ModulePackages and ModuleMainClass.

[Figure 2.5: The structure of an Attribute_info]
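Because every attribute starts with the same name index and length, a reader can skip attributes it does not understand. The following is a minimal sketch of reading one attribute_info structure from a stream. The class name RawAttribute is hypothetical, the example bytes are hand-written, and the sketch assumes an attribute length that fits in a Java array.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical helper: holds one attribute_info structure without interpreting its content.
class RawAttribute {
    final int nameIndex;  // index of a Utf8 entry in the constant pool
    final byte[] info;    // raw attribute content

    RawAttribute(int nameIndex, byte[] info) {
        this.nameIndex = nameIndex;
        this.info = info;
    }

    static RawAttribute read(DataInputStream in) throws IOException {
        int nameIndex = in.readUnsignedShort();   // u2 attribute_name_index
        int length = in.readInt();                // u4 attribute_length
        if (length < 0) {
            // Assumption of this sketch: lengths above Integer.MAX_VALUE are not supported.
            throw new IOException("Attribute too large for this sketch");
        }
        byte[] info = new byte[length];
        in.readFully(info);                       // u1 info[attribute_length]
        return new RawAttribute(nameIndex, info);
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = {0, 5, 0, 0, 0, 2, (byte) 0xAA, (byte) 0xBB};
        RawAttribute a = read(new DataInputStream(new ByteArrayInputStream(bytes)));
        System.out.println("name index " + a.nameIndex + ", " + a.info.length + " bytes of info");
    }
}
```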

2.2.1.5 Constant Pool

The constant pool is a table in which each entry contains a tag describing what kind of entity it represents, followed by an info array.

[Figure 2.6: the structure of a cp_info]

The possible tags are listed in Table 2.2:

[Table 2.2: List of all constant kinds and their corresponding tag]

Constant Kind          Tag    Constant Kind                  Tag
CONSTANT_Utf8          1      CONSTANT_InterfaceMethodref    11
CONSTANT_Integer       3      CONSTANT_NameAndType           12
CONSTANT_Float         4      CONSTANT_MethodHandle          15
CONSTANT_Long          5      CONSTANT_MethodType            16
CONSTANT_Double        6      CONSTANT_Dynamic               17
CONSTANT_Class         7      CONSTANT_InvokeDynamic         18
CONSTANT_String        8      CONSTANT_Module                19
CONSTANT_Fieldref      9      CONSTANT_Package               20
CONSTANT_Methodref     10

These entries describe the methods, strings, classes and everything else mentioned in the class file. Because the code refers to them by their position in the constant pool, it is easier to follow where certain values are assigned, methods are called or new classes are instantiated. An example can be seen in Figure 2.7.

[Figure 2.7: example of a constant pool with some bits removed for ease of viewing and explaining]

Figure 2.7 shows that the first method reference, which initializes the class, points to entry 26 in the array, which is a class declaration. That entry in turn points to entry 84, a Utf8 entry declaring the name of the class, which in this case is java/lang/Object. Another example is entry 11 in the array, a string that points to entry 68, a Utf8 entry containing the text “Second number: ”. The full documentation of the structures in the constant pool can be found in the official Java documentation [4].
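The chain of indexes described for Figure 2.7 can be mimicked with a small in-memory model. The sketch below is a simplification written for this illustration: the class ConstantPoolSketch and its nested types are hypothetical, only three kinds of entries are modelled, and only the four entries mentioned in the text are filled in.

```java
// Hypothetical, simplified model of a constant pool with three kinds of entries.
class ConstantPoolSketch {
    static final class Utf8 { final String value; Utf8(String value) { this.value = value; } }
    static final class ClassRef { final int nameIndex; ClassRef(int nameIndex) { this.nameIndex = nameIndex; } }
    static final class StringRef { final int stringIndex; StringRef(int stringIndex) { this.stringIndex = stringIndex; } }

    private final Object[] entries; // index 0 is unused, as in a real constant pool

    ConstantPoolSketch(int constantPoolCount) {
        entries = new Object[constantPoolCount];
    }

    void set(int index, Object entry) {
        entries[index] = entry;
    }

    // Follows a class entry to the Utf8 entry that holds the class name.
    String className(int index) {
        ClassRef ref = (ClassRef) entries[index];
        return ((Utf8) entries[ref.nameIndex]).value;
    }

    // Follows a string entry to the Utf8 entry that holds the text.
    String stringValue(int index) {
        StringRef ref = (StringRef) entries[index];
        return ((Utf8) entries[ref.stringIndex]).value;
    }

    // Builds only the entries of Figure 2.7 that are discussed in the text.
    static ConstantPoolSketch figureExample() {
        ConstantPoolSketch pool = new ConstantPoolSketch(85);
        pool.set(26, new ClassRef(84));
        pool.set(84, new Utf8("java/lang/Object"));
        pool.set(11, new StringRef(68));
        pool.set(68, new Utf8("Second number: "));
        return pool;
    }

    public static void main(String[] args) {
        ConstantPoolSketch pool = figureExample();
        System.out.println(pool.className(26));
        System.out.println(pool.stringValue(11));
    }
}
```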

2.3 Virtual Machine System

The last step for our code is the Virtual Machine System, henceforth called VMS. Since this report focuses on Java, the virtual machine in question is the Java Virtual Machine. By this step everything that is not needed for execution has been removed, which means, for example, that all comments in the code are lost with the compilation.

2.3.1 Java Virtual Machine

The Java Virtual Machine can also be called the Java interpreter. It takes the bytecode in the class file and translates it at run time into machine code that the system can execute. According to the official documentation by Oracle, each Java Virtual Machine thread has its own stack, which stores frames and can be either of fixed size or dynamically expandable. A StackOverflowError is thrown if a thread needs a larger stack than is permitted, and an OutOfMemoryError is thrown if a dynamically expandable stack cannot obtain enough memory to grow.
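The limit on the stack can be observed from ordinary Java code. The following minimal sketch calls a method recursively without a stopping condition, so that every call adds a frame until the stack is exhausted. The class name StackDepthDemo is hypothetical, and the depth that is reached depends on the JVM and its settings, so no particular number should be expected.

```java
// Hypothetical demo: counts how many nested calls fit on the stack of the current thread.
class StackDepthDemo {
    private static int depth;

    private static void recurse() {
        depth++;
        recurse(); // no stopping condition: each call pushes another frame
    }

    static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The stack is exhausted; the frames are discarded as the error propagates.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Frames before StackOverflowError: " + measureDepth());
    }
}
```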

The heap, as explained by the official Java Virtual Machine documentation, is shared between all threads currently running. It is created at start-up and managed by an automatic storage management system, which can use different storage management techniques depending on what suits the system best [4]. The heap may also be of fixed size or dynamically expandable.

The method area is also shared by all Java Virtual Machine threads and can likewise be of fixed or dynamic size. It is responsible for storing per-class structures such as the run-time constant pool, field and method data, and the code for methods and constructors.

Frames are used to store data and partial results, and also to perform dynamic linking. A new frame is created every time a method is invoked and is destroyed when the invocation completes, whether the completion is normal or abrupt. Each frame holds an array of local variables. A single local variable can hold a value of type boolean, byte, char, short, int, float, reference or returnAddress, and a pair of local variables can together hold a long or a double.

The Java Virtual Machine does not mandate a particular internal structure for objects. Instead, in some implementations a reference to a class instance points to a pair of pointers, where one leads to a table containing the methods of the object and the other to the memory allocated from the heap for the object data.

A Java Virtual Machine instruction consists of an opcode, a one-byte value specifying the operation to be executed, followed by zero or more operands, which supply the data or arguments used by the operation.

The official documentation of the Java Virtual Machine summarizes the inner loop of an interpreter as simply [5]:

“
do {
    atomically calculate pc and fetch opcode at pc;
    if (operands) fetch operands;
    execute the action for the opcode;
} while (there is more to do);
”
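To make the loop above concrete, the following is a toy interpreter written for this illustration. It is a sketch under strong assumptions and not how a real Java Virtual Machine is implemented: the class TinyInterpreter is hypothetical and supports only four instructions. The opcode values are those of bipush, iadd, irem and ireturn in the official instruction set.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical toy interpreter for four integer instructions.
class TinyInterpreter {
    static final int BIPUSH = 0x10;   // push the following byte as an int
    static final int IADD = 0x60;     // pop two ints, push their sum
    static final int IREM = 0x70;     // pop two ints, push the remainder
    static final int IRETURN = 0xAC;  // pop an int and return it

    static int run(byte[] code) {
        Deque<Integer> operandStack = new ArrayDeque<>();
        int pc = 0;
        while (true) {
            int opcode = code[pc++] & 0xFF;             // fetch opcode at pc
            switch (opcode) {
                case BIPUSH:
                    operandStack.push((int) code[pc++]); // fetch the single operand
                    break;
                case IADD: {
                    int right = operandStack.pop();
                    int left = operandStack.pop();
                    operandStack.push(left + right);
                    break;
                }
                case IREM: {
                    int right = operandStack.pop();
                    int left = operandStack.pop();
                    operandStack.push(left % right);
                    break;
                }
                case IRETURN:
                    return operandStack.pop();
                default:
                    throw new IllegalStateException("Unsupported opcode " + opcode);
            }
        }
    }

    public static void main(String[] args) {
        // Computes 7 % 2, the kind of test used to tell odd from even numbers.
        byte[] code = {BIPUSH, 7, BIPUSH, 2, IREM, (byte) IRETURN};
        System.out.println(run(code));
    }
}
```

The operand stack is what the load, store and arithmetic instructions listed in section 2.2 operate on: values are pushed, combined and popped rather than held in named registers.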

2.4 Reverse engineering analytics methods

There are several ways to reverse engineer code. Two common ways that appeared when researching for this paper were dynamic reverse engineering and static reverse engineering, which are discussed in the study “Analyzing and securing binaries through static disassembly” as well as the study “Static and Dynamic Reverse engineering for Java software systems”, to name a few [6, 7]. These two methods complement each other well, as they look at different aspects of a program. According to “Analyzing and securing binaries through static disassembly” by Dennis Adriaan Andriesse, to get really good results in a reverse engineering project it helps to have documentation about the program and how it functions and, where possible, source code to read, so that a better understanding can be gained from the comments that developers leave behind [8].

2.4.1 Dynamic reverse engineering

Dynamic reverse engineering is done by analyzing a running program. Dennis Adriaan Andriesse explains that this has pros and cons: it documents only the functions that are actually used and ignores the rest. It also affects the performance of the program, which may be undesirable where timing and performance are crucial. On the other hand, it documents how the program interacts with different libraries and functions at run time [8].

A dynamic analyzer takes the running process and analyzes the variables, function calls and exceptions, building up a visual interpretation of what has been run. This way the analyzer finds things about how data is handled that are not obvious from a static decompilation. The downside is that it does not analyze anything that is not actively run, so it can miss critical functions in a bigger program.
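The principle of recording what a running program does, and its main limitation, can be sketched with the dynamic proxies of the Java standard library. This is only an illustration of the idea and makes no claim about how any particular analysis tool works internally. The names CallRecorder, Calculator and SimpleCalculator are hypothetical.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: wraps an object so that every method call on it is logged.
class CallRecorder {
    interface Calculator {
        int add(int a, int b);
        int multiply(int a, int b);
    }

    static final class SimpleCalculator implements Calculator {
        public int add(int a, int b) { return a + b; }
        public int multiply(int a, int b) { return a * b; }
    }

    // Returns a proxy that records the name of each invoked method in the log
    // and then forwards the call to the real object.
    static <T> T trace(T target, Class<T> iface, List<String> log) {
        InvocationHandler handler = (proxy, method, args) -> {
            log.add(method.getName());
            return method.invoke(target, args);
        };
        Object traced = Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[] {iface}, handler);
        return iface.cast(traced);
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        Calculator calculator = trace(new SimpleCalculator(), Calculator.class, log);
        calculator.add(2, 3);
        // multiply is never called, so it never appears in the log.
        System.out.println("Observed calls: " + log);
    }
}
```

Since multiply is never executed in this run, the recording contains no trace of it, which mirrors the drawback described above.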

2.4.2 Static reverse engineering

Static reverse engineering is when a program is used to read the class file and produce a guess of how the source code looks based on the information gathered; this is often called decompiling. According to Dennis Adriaan Andriesse, the static method is used to get a good look at the structure of a program. Models that are sent between different layers or to a database are a good example of this [8].

2.5 Bytecode Obfuscation

The ability to take finished bytecode and reverse engineer it to gain access to the source code is also a risk to a company, as there is nothing to stop Company B from doing the same to Company A’s code and thus gaining knowledge about their hardware and software in a way that might damage Company A. There is also a cost difference: Company B does not have to pay its software engineers to design the program all over again, but can simply take and run the existing software that has already been paid for by its competitor. This is of course something that Company A wishes to avoid at all costs.

To handle this problem, the maker of the code can use bytecode obfuscation. There are many ways to do this, some patented and some not, and the goal is to confuse any person trying to unravel the source code by leading them astray. It is worth noting that the purpose is not to make a program impossible to reverse engineer, as that is simply not a realistic goal. Instead, it is meant to make those who want to reverse engineer the code spend more time and energy doing so, and thus stall them as much as possible.

In a paper by Jein-Tsai Chan and Wuu Yang called “Advanced obfuscation techniques for Java Bytecode”, the authors describe a strategy of using bogus classes and methods that look realistic while not actually being a true part of the code [9]. This is usually done by having a subclass inherit a method and then try to declare its own. This will of course not work since inherited methods cannot be re-declared. This makes that part of the code useless, which is exactly what obfuscation is all about.

They also explain that it is possible to add code after the return statement, which would result in errors if one tried to run it. This is usually done by a third-party program that manages to create this in a way that does not change the behavior of the code.

In a study called “Obfuscating Java: The Most Pain for the Least Gain” by Michael Batchelder and Laurie Hendren, the authors explain that another common tactic is to change variable and method names from simple and self-explanatory names such as GetName or NumberOfBooks to random strings, to further confuse a decoder [10].
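The effect of renaming can be illustrated with a hand-written before-and-after pair. The second class below is an imitation made for this illustration and not the output of any obfuscator, and the names Library and RenamingDemo are hypothetical. Both classes behave identically, but the second one no longer tells the reader what it is for.

```java
// Readable original: the names document the purpose of the class.
class Library {
    private final int numberOfBooks;

    Library(int numberOfBooks) {
        this.numberOfBooks = numberOfBooks;
    }

    int getNumberOfBooks() {
        return numberOfBooks;
    }
}

// Hand-written imitation of the same class after renaming: class, field and
// method may all share the same meaningless name, which is legal in Java.
class a {
    private final int a;

    a(int a) {
        this.a = a;
    }

    int a() {
        return a;
    }
}

class RenamingDemo {
    public static void main(String[] args) {
        System.out.println(new Library(3).getNumberOfBooks());
        System.out.println(new a(3).a());
    }
}
```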

The study goes on to explain that it is possible, in the bytecode itself, to insert ‘GOTO’ operations that split a method in two or switch the parts around, see Figure 2.8.

[Figure 2.8: Figure showing the shuffling of code using the GOTO command]

As the ‘GOTO’ operation is not available in Java source code, and the code is also shuffled, this increases the odds of the code being irreversible.

2.6 Previous work

The process of reverse engineering bytecode is not new; in fact, there exist tools for the sole purpose of reverse engineering bytecode into source code. A 2019 study by Nicolas Harrand, César Soto-Valero, Martin Monperrus and Benoit Baudry looked at bytecode decompilation, but did not include obfuscation [11].

While the above study did not include obfuscation, some of the studies found focus only on obfuscation and not at all on reverse engineering, such as the previously mentioned “Obfuscating Java: The Most Pain for the Least Gain” and “Advanced obfuscation techniques for Java Bytecode” [10, 9]. The latter looked at the option of making the decompiler fail, leaving manual decompilation as the only option. This study has chosen not to obfuscate the code so heavily that it cannot be decompiled by the decompilers. The reasons are explained further in chapter 5.4, Why different methods and tools were not used, but can be summarized as a lack of control over whether the bytecode still matches the source code.

2.7 Tools and frameworks

This chapter focuses on the programs that were used in the experiments and how they are used.

2.7.1 Javap - The Java Class File Disassembler

Javap is a class file disassembler that makes it possible to view the bytecode of a class file. It takes the data in the class file and outputs a long text listing of how the class file is built up in bytecode. Several options can be added to output more or less information [12].

2.7.2 Decompilers

One of the decompilers this study will use is the one built into the IDE IntelliJ IDEA, which is powered by the FernFlower decompiler. It takes the bytecode, assembles a guess of how the source code was built up, and outputs a Java file [13].

2.7.3 Dynamic analysis tool

The tool used for dynamic analysis was jSonde. According to jSonde’s page, it is used for generating diagrams, measuring performance and documenting dependencies such as jar files [14]. It can be used to dynamically reverse engineer programs, as it attaches itself to the running JVM and documents what happens, for example which methods are called, how long they take to execute and how much memory is used.

2.7.4 Code obfuscating program

To obfuscate the code, the program ProGuard will be used. ProGuard is an obfuscator and optimizer that takes a jar file and outputs a jar file that has been modified according to the settings used. More details on the settings are given in chapter 3.2, Java code obfuscation [15].

2.7.5 IntelliJ IDEA

The program used to write the Java code that is compiled into bytecode will be IntelliJ IDEA, made by the company JetBrains [16].

2.8 Code

The code for all the programs will be attached in the appendix or can be viewed on the GitHub page of one of the authors [17]. There, the medium advanced and the advanced programs can be found in the repositories mul-add (medium advanced) and SimpleSIP (advanced).

2.9 Practical implementation This subchapter will aim to prove the statements made earlier in this chapter, by making source code into bytecode.

THEORY AND BACKGROUND | 17

To be able to reverse engineer bytecode into source code we must first see how the source code behaves when compiled into bytecode. To do this, the study will take a short and simple Java program and then compile it into bytecode and see how the compiled code looks in comparison to the source code.

The program that was compiled consists of one import, one variable declaration, two print statements and one 'if, else' case. The code is shown in the figure below.

[Figure 2.9: Code to be compiled into bytecode]

The code generates a random number between 0 and 100 using the class java.util.Random and assigns it to the variable. If the number modulo 2 gives the remainder 0, and the number is thus even, the program prints a statement saying so. If not, a statement saying that the number is odd is printed.
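A program matching this description could look like the following sketch. It is not the authors' original listing: the class name OddOrEven, the wording of the printed messages and the bound passed to nextInt are assumptions. Only the overall shape (one import, one variable declaration, two print statements and one if/else) and the variable name r are taken from the text.

```java
import java.util.Random;

// Hypothetical reconstruction of the simple program described in the text.
class OddOrEven {
    public static void main(String[] args) {
        // One variable declaration; nextInt(100) yields a value in 0..99 (assumed bound).
        int r = new Random().nextInt(100);
        if (r % 2 == 0) {
            // Remainder 0 means the number is even.
            System.out.println(r + " is even");
        } else {
            System.out.println(r + " is odd");
        }
    }
}
```

Compiling such a file with javac and running javap -c on the resulting class file prints the bytecode instructions of the main method.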


[Figure 2.10: The bytecode to the left and the corresponding code after the //]

As seen in the figure above, the code contains the istore, iload, iconst, irem and ifne instructions described in the bytecode section above and in the official documentation of the JVM instruction set.

We can also see that the call System.out.println(“”) turns into several smaller operations, which always appear in the order invokespecial, iload_1, invokevirtual, ldc, invokevirtual, invokevirtual and invokevirtual. The same thing happens to java.util.Random, where an invokespecial instruction is followed by an invokevirtual instruction. This corresponds to the expected results shown in 2.2 Java and bytecode.


3 Method

To understand how to do reverse engineering, a literature study was done. It covered earlier studies and research on the subject. Further analysis of the process led the study to be based on three programs of increasing complexity. These were named 'Simple program', 'More complex program' and 'Advanced program', where, as the names suggest, the 'Simple program' is the least complex and the 'Advanced program' the most. The 'Advanced program' was taken from an earlier course, with the authors' permission, as it was a good example of a program that was actually used in the field and thus a bit more complicated. Different methods of reverse engineering were then tried on these programs, and the results were analyzed and are presented in the results chapter.

3.1 Analysis Methods

As the studies “Analysing and securing binaries through static disassembly” by Dennis Adriaan Andriesse and “Static and Dynamic Reverse engineering for Java software systems” by Tarja Systä show, static and dynamic analysis have proved effective for recovering source code (see 2.4 Reverse engineering analytics methods) [8, 7]. Because of this, this study also uses those methods, with the internal IntelliJ decompiler as the static analysis tool and jSonde as the dynamic analysis tool. The IntelliJ decompiler was chosen because the authors had some experience with it and because the IntelliJ IDE was also used to write the code for the study. jSonde was chosen because it was the only dynamic analysis tool usable on Java that was found during the pre-study of software.

The study is going to use three different code bases that differ in complexity, going from a very simple program to a slightly more complex program and lastly to a more usable program (Advanced program). The complexity is decided by the number of classes and the type of code structure. The first and easiest program has been described in chapter 2.9.1 Source Code to bytecode in practical implementation. The second program, of middle complexity, is a simple program in which it is possible to add or multiply two numbers, with an external class for the computation. The last program uses the Session Initiation Protocol (SIP) for online communication between two or more computers.

What the study is going to look at is the number of lines of code (LOC) that are the same as in the source code: how many LOC are the same after obfuscating and decompiling the code, and whether more LOC can be recovered by analyzing the code with both the static and the dynamic methods. Brackets ({, }) do not count as LOC. Try statements and return statements are also counted as source lines of code (SLOC). This method was used because it is the method used in other studies similar to this one, such as “Reverse engineering source code, Empirical studies of limitations and opportunities” by Landman, D [18].
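The comparison described above can be sketched in code. The following is a minimal illustration under stated assumptions, not the procedure the study actually followed (the text does not say whether the counting was automated): lines are compared after trimming whitespace, lines holding only a bracket are skipped, and each decompiled line may match at most one source line. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for counting source lines that survive decompilation.
class SlocCounter {
    // Trim each line and drop blank lines and lines holding only a bracket.
    static List<String> countable(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty() && !trimmed.equals("{") && !trimmed.equals("}")) {
                out.add(trimmed);
            }
        }
        return out;
    }

    // Count source lines that reappear verbatim in the decompiled output.
    static int matching(List<String> source, List<String> decompiled) {
        List<String> remaining = new ArrayList<>(countable(decompiled));
        int same = 0;
        for (String line : countable(source)) {
            if (remaining.remove(line)) { // each decompiled line is used once
                same++;
            }
        }
        return same;
    }

    // Share of identical lines, expressed as a percentage of the source LOC.
    static double percent(int sloc, int sourceLoc) {
        return 100.0 * sloc / sourceLoc;
    }
}
```

For example, percent(2, 7) gives roughly 28.57, which is how a result of 2 identical lines out of 7 source lines would be expressed.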


Decompilers are going to be used to access the code in the class file. The code accessed will be compared with the source code to determine the percentage that is identical. Later, jSonde will be used to get a bigger picture of how the class files are used together and to follow the flow of data, to see whether it is possible to determine what a file is.

3.2 Java code obfuscation

To obfuscate code, a program called ProGuard will be used. The output will then be investigated to see what changes ProGuard made to the code and whether they hindered the decompiler from properly decompiling it. The settings used for this obfuscation are the ProGuard defaults and are attached as A.1, A.2 and A.3.

3.3 Static analysis

To do a static analysis, a decompiler is going to be used. A decompiler reads the class file (described in chapter 2.2) and uses the methods, arguments and bytecode it finds to guess what the source code looks like. The result is then compared to the original source code to see how much could be restored. After that, the study analyses how the obfuscator has changed the bytecode, in an attempt to understand the code generated by the decompilers. The decompiler used is the Fernflower decompiler from JetBrains, the company that developed the IntelliJ IDE.

3.4 Dynamic analysis

jSonde is going to be used to analyze the running program, to see how it handles classes and method calls and thereby get an idea of how the program is built up [14].
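To illustrate the kind of information dynamic analysis collects (which methods are called and how long they take), the sketch below records calls at runtime using a dynamic proxy from the Java standard library. This does not reproduce jSonde's mechanism and is much more limited, since it only sees calls made through an interface. The names CallRecorder, Calculator and LOG are hypothetical.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical runtime call recorder; an illustration, not how jSonde works.
class CallRecorder {
    // Example interface whose calls are observed.
    interface Calculator {
        double add(double x, double y);
        double multiply(double x, double y);
    }

    // One entry per observed call: method name and elapsed time.
    static final List<String> LOG = new ArrayList<>();

    // Wrap target so that every call through iface is logged and forwarded.
    @SuppressWarnings("unchecked")
    static <T> T record(Class<T> iface, T target) {
        InvocationHandler handler = (proxy, method, args) -> {
            long start = System.nanoTime();
            try {
                return method.invoke(target, args); // forward to the real object
            } finally {
                long elapsed = System.nanoTime() - start;
                LOG.add(method.getName() + " took " + elapsed + " ns");
            }
        };
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler);
    }
}
```

After a few calls through the wrapped object, LOG holds the order of the calls and their durations, which is the raw material for the call diagrams a dynamic analysis tool draws.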


4 Results

This chapter aims to show the results of the reverse engineering process for the three programs mentioned in chapter 3.1 Analysis Methods. The focus is on the bytecode, the static and dynamic analysis, and the obfuscation. The main purpose is to see how many SLOC could be retrieved from the obfuscated code of programs of different length and complexity.

4.1 Simple program

This program was explained in 2.9.1 Source Code to bytecode in practical implementation, where a random number is generated and different statements are then printed depending on whether the number is odd or even. A UML diagram of this program is not provided because of the simple structure of the program.

4.1.1 Obfuscated code

The original length of the code was 7 LOC, and after obfuscation it increased to 8 LOC. The added line was an empty constructor without any in-parameters or out-parameters. The variable named r, as in 'random', in the source code was renamed to var1 in the obfuscated code. As the program is so small, the only real difference is the way the variable is initialized and used in the if-case. This is shown in the figures below, which show the difference between the non-obfuscated and the obfuscated variable initialization.

[Figure 4.1: Figure showing the non-obfuscated variable initialization and usage]

[Figure 4.2: Figure showing the obfuscated variable initialization and usage, as well as the changed main name]

The change in the code did not change any behavior of the code, which is to be expected. This change will be discussed in chapter 5 Analysis and discussion.

4.1.2 Decompilation and dynamic analysis

While there were not that many lines in the program, only about 29% of the code remained the same, as can be seen in table 4.1. Decompiling the non-obfuscated code resulted in a complete recovery of the original code, as can be seen in table 4.2.

The dynamic analysis tool jSonde did not work on this program, as it only allows programs with more than one class file.

[Table 4.1: Showing the results of the obfuscated code as well as the source code and the difference between them]

Simple program

| .Class file | Class | LOC (Source) | LOC (Obfs) | Nr of SLOC | % SLOC (SLOC / LOC(Source)) | % LOC (LOC(Obfs) / LOC(Source)) |
|---|---|---|---|---|---|---|
| main.class | main | 7 | 8 | 2 | 28,57% | 114,29% |
| Averages | | | | | 28,57% | 114,29% |

[Table 4.2: Showing the results of the static reverse engineering without obfuscation]

| Class | LOC (Source) | LOC (RE) | SLOC | % SLOC (SLOC / LOC(Source)) | % LOC (LOC(RE) / LOC(Source)) |
|---|---|---|---|---|---|
| main | 7 | 8 | 7 | 100,00% | 114,29% |


4.2 More complex program

This is a small program in which the user can either multiply or add two numbers and have the result displayed. The program is made up of two class files, where the main class file asks the user which operation they wish to perform and then asks for two numbers.

The second class file contains all of the operations for the program, which consist of multiplying ints, floats and doubles, as well as adding ints, floats and doubles. Out of these, only the operations on doubles are ever called.

A UML diagram of this program is pictured below.

[Figure 4.3: UML diagram of the More complex program]

4.2.1 Obfuscated code

The biggest difference between the source code and the obfuscated code is that the obfuscator completely removed one of the classes, leaving only the main class. It also created an empty class file that was not present in the source code. The variables were renamed to varX, where X seems to be an incrementing number. Worth noting is that the variable names are not numbered in the order in which the variables are used, and that some numbers seem to be missing.

args = var0
running = var17
scan = var1
mulladd = var3
number1 = var5
number2 = var7
result = var9
not existing in source = var18
not existing in source = var2


4.2.2 Decompilation and dynamic analysis

Table 4.3 shows that the Mathematics class has been removed and that static decompilation of the testDynamic class file retrieved 6 SLOC, which corresponds to around 21%. Table 4.4 shows a near 100% recovery by static reverse engineering of the non-obfuscated code. Only the testDynamic class file lost around 7% of the code.

[Table 4.3: Showing the results of the obfuscated code as well as the source code and the difference between them]

More complex program

| .Class file | Class | LOC (Source) | LOC (Obfs) | Nr of SLOC | % SLOC (SLOC / LOC(Source)) | % LOC (LOC(Obfs) / LOC(Source)) |
|---|---|---|---|---|---|---|
| Removed | Mathematics | 16 | 0 | 0 | 0,00% | 0,00% |
| testDynamic.class | testDynamic | 28 | 36 | 6 | 21,43% | 128,57% |
| Averages | | | | | 10,71% | 64,29% |

[Table 4.4: Showing the results of the static reverse engineering without obfuscation]

| Class | LOC (Source) | LOC (RE) | SLOC | % SLOC (SLOC / LOC(Source)) | % LOC (LOC(RE) / LOC(Source)) |
|---|---|---|---|---|---|
| Mathematics | 16 | 16 | 16 | 100,00% | 100,00% |
| testDynamic | 28 | 37 | 26 | 92,86% | 132,14% |
| Averages | | | | 96,43% | 116,07% |

After ProGuard's obfuscation and optimization had been run on the more complex program, the Mathematics class was removed and the result was set directly to the product or sum of the two numbers. This removes a lot of the original source code, even though it might not have been in use. It also makes it impossible to get any data from jSonde.

The major difference was the way the program was ended. The source program used a boolean variable, running, which was initially set to true. When a calculation was finished, the program prompted the user to enter 'y' to run it again, which kept the while-loop going. If the user entered anything other than 'y', the program exited the loop and thus ended.


[Figure 4.4: figure showing the code of the looping and end of the program]

While this seems like the logical way to do it, the obfuscated version does it differently. The code for the end of that program is pictured below.

[Figure 4.5: Figure showing the obfuscated codes version of the looping or end of the program]

The boolean running, renamed to var17, did exist in the obfuscated version but did not serve as the main way to end the program. Instead, there is a new variable of type byte, var2, which is initialized to -1. Another variable, of type String, is also created to store the answer. The byte variable is assigned the value 0 if the user answers 'y'.

This value is later used in a switch-case: if var2 is equal to 0, the program loops; otherwise var17 is assigned the value 'false' and the program stops.
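The two-step construct described above can be sketched as follows. The variable names var2, var17 and var18 follow the decompiled output quoted in the text, while the class, the method and the exact layout are assumptions. A switch on a String compiled by javac commonly takes this shape (a switch on the hash code that sets an index, followed by a switch on the index), so the pattern may stem from compilation rather than from the obfuscator; this study did not verify which.

```java
// Hypothetical sketch of the decompiled loop-ending logic.
class StringSwitchDemo {
    // Returns true if the while-loop should keep running after this answer.
    static boolean keepRunning(String answer) {
        boolean var17 = true;   // "running" in the source code
        String var18 = answer;  // stores the user's answer
        byte var2 = -1;         // index of the matched case, -1 means no match
        switch (var18.hashCode()) {
            case 121:           // hash code of the string "y"
                if (var18.equals("y")) {
                    var2 = 0;
                }
                break;
            default:
                break;
        }
        switch (var2) {
            case 0:
                break;          // matched "y": loop again
            default:
                var17 = false;  // anything else: leave the loop
        }
        return var17;
    }
}
```

The behavior is the same as that of the source switch on scan.next(): only the answer "y" keeps the loop running.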

4.3 Advanced program

When unpacking the jar file, one notices that the obfuscator has renamed every class file to a.class, b.class and so on, except the class file containing the main function. This made it harder to tell the classes apart, and to identify some of them the code has to be analyzed for tell-tale signs of where the code might belong.

4.3.1 Obfuscated code

While examining the bytecode produced by the command “javap -c -v classfile.class”, there were few things that helped in understanding the class directly, as most if not all of the variable names were obfuscated and only a few method names could be retrieved. These names were visible in the constant pool part of the bytecode. Because of this, few variables and methods should be visible after decompiling the class files either. One interesting observation was that a few of the method names were preserved after obfuscating the code.

After obfuscating the bytecode, an extra class file had been created. After decompiling the extra file, it was discovered to be a synthetic class file.

4.3.2 Decompilation and dynamic analysis

While trying to use a dynamic analyzer with the advanced program, parts of the program crashed and did not allow the user to connect to the remote program, that is, the exact same program running at a remote location with a different IP address and/or port. The exception given was “attempted duplicate class definition for name: SimpleSIP/d”.


[Table 4.5: Showing the results of the obfuscated code as well as the source code and the difference between the two]

Complex file

| .Class file | Class | LOC (Source) | LOC (Obfs) | Nr of SLOC | % SLOC | % LOC |
|---|---|---|---|---|---|---|
| a.class | AudioStreamUDP | 30 | 23 | 0 | 0,00% | 76,67% |
| b.class | BusyState | 3 | 4 | 1 | 33,33% | 133,33% |
| c.class | CallEstablished | 50 | 47 | 8 | 16,00% | 94,00% |
| d.class | ClientListener | 56 | 78 | 7 | 12,50% | 139,29% |
| e.class | Commands | 2 | 12 | 0 | 0,00% | 600,00% |
| f.class | Connecting | 18 | 16 | 5 | 27,78% | 88,89% |
| g.class | Free | 23 | 17 | 5 | 21,74% | 73,91% |
| h.class | KeepAlive | 17 | 15 | 2 | 11,76% | 88,24% |
| i.class | Quitting | 15 | 13 | 6 | 40,00% | 86,67% |
| j.class | Receiver | 89 | 83 | 9 | 10,11% | 93,26% |
| k.class | Sender | 85 | 80 | 10 | 11,76% | 94,12% |
| l.class | ServerListener | 40 | 38 | 5 | 12,50% | 95,00% |
| m.class | State | 40 | 37 | 4 | 10,00% | 92,50% |
| n.class | StateHandler | 60 | 36 | 0 | 0,00% | 60,00% |
| o.class | UserCommands | 120 | 140 | 31 | 25,83% | 116,67% |
| p.class | Synthetic class | 1 | 35 | 0 | 0,00% | 3500,00% |
| q.class | Waiting | 11 | 9 | 4 | 36,36% | 81,82% |
| SimpleSIPMain.class | SimpleSIPMain | 19 | 19 | 7 | 36,84% | 100,00% |
| Averages | | | | | 17,03% | 311,91% |

[Table 4.6: Showing the results of the static reverse engineering without ob- fuscation]

Table 4.5 shows that very few lines of code (LOC) are identical to the source code (SLOC). The table also shows that after the obfuscator and optimizer had been run, some of the class files were considerably larger, especially the Commands class, which is an enum, while others were smaller.

4.4 Test environments

All the programs used can be run in each test environment, and we did not notice any difference in the results between a Windows and a Linux environment.


5 Analysis and Discussion

In this chapter the results of chapter 4 are analyzed and discussed with regard to how the obfuscation changed the code. The chapter also explains why other methods were not used in this study, and discusses the study's impact on society and its ethics.

5.1 Small program

Figures 4.1 and 4.2 showed the difference in the variable initialization and usage. The source code declared and initialized the variable in one statement and then used it in the if-case without change. The obfuscated code turned the statement into a separate variable declaration and waited to initialize the variable until it was used. The assigned value is not limited to the if-case but remains valid in the code until a new assignment is made.

As Java does not allow a variable to be declared inside the condition of an if-case, the declaration and the condition cannot be merged into one statement, and thus only these two options remain for writing the code briefly.
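The two options can be illustrated side by side. This is a sketch based on the description above rather than a copy of Figures 4.1 and 4.2: the names r and var1 come from the text, whereas the class, the methods and the returned messages are hypothetical.

```java
import java.util.Random;

// Hypothetical comparison of the two initialization styles.
class InitStyles {
    // Source style: declaration and initialization in one statement.
    static String sourceStyle(Random random) {
        int r = random.nextInt(100);
        if (r % 2 == 0) {
            return r + " is even";
        } else {
            return r + " is odd";
        }
    }

    // Decompiled style: a bare declaration, with the assignment moved into
    // the if-condition where the value is first used.
    static String decompiledStyle(Random random) {
        int var1;
        if ((var1 = random.nextInt(100)) % 2 == 0) {
            return var1 + " is even";
        } else {
            return var1 + " is odd";
        }
    }
}
```

Both methods produce the same result for the same random sequence, but a line-by-line comparison treats the declaration and the if-line as changed.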

The variable name was also changed from 'r' to 'var1', which meant that the lines containing it were also changed and thus not counted as source lines of code in Table 4.1.

The main function has also been changed in all programs. Instead of the usual 'public static void main(String[] args)', the args variable has been renamed to var0. This is not illegal by any means, but it is unusual in the programming field.

5.2 More complex program

Although the change in the code might seem large, most of the code in the source was unused and thus not necessary for the program to function. That part of the code was removed by the shrinker part of the obfuscator, and as such no part of that code was available to be restored. That said, the removed source code does not have to be completely useless, as the functions might be used by a different main file, which could use other or maybe all of the functions.

The obfuscation completely removed one class from the code and instead integrated the two functions that were used into the main program. It also changed the way the switch case was used to make the program loop, and created new variables where there were none before.

Because of the removed class and its code, the program is smaller, with fewer lines in total, and thus less complex than when it started.


The obfuscated code also had its variables renamed from their original names to var1, var2 and so on. These variables were not in order, and a few numbers seem to be missing, such as var4, var6 and var8. As the obfuscation process cannot be observed, there is no way to know whether the missing numbers were in the code but were removed when the program was shrunk, or whether they are deliberately missing to cause confusion.

5.3 Advanced program

When comparing the code obtained from the decompilers with the source code, around 17% of the code was the same as the source code. This points to the obfuscator doing its job of making reverse engineering hard without a lot of time and effort spent on understanding the code. Some variable names might be obvious when analysing the code, but since they have been lost to the obfuscator, there is no way to know that a name is indeed the same. Most of the code that was recovered consisted of print lines to the console, which can be a good way to determine what type of class and function a line belongs to. As the study focused on the number of lines in the source code rather than on the methods, the constructors and the way the code is used, more information could probably be obtained by analyzing the methods and constructors.

In the obfuscated and optimized version each value has its own line of code, which explains the larger LOC. The class files that became smaller usually had functions moved into the main function. This is another reason why it can be harder to retrieve the original source code, since the methods have changed place.

The obfuscator also created synthetic classes. Synthetic constructs are classes, methods, fields or other constructs that the Java compiler generates without a corresponding construct in the source code, which allows new Java language features to be implemented without changing the Java Virtual Machine. Their presence further complicated the reverse engineering [19].
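As a small illustration of a synthetic construct, the sketch below lists the synthetic methods of a class through reflection. It assumes a class compiled with javac, which typically emits the body of a lambda expression as a compiler-generated method with no counterpart in the source. The class name SyntheticDemo is hypothetical, and the example is unrelated to the synthetic class found in the obfuscated SimpleSIP jar.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: finding compiler-generated methods with reflection.
class SyntheticDemo {
    // The lambda body has no named method in the source code; the compiler
    // generates one for it (with javac, named along the lines of lambda$...).
    static final Runnable TASK = () -> System.out.println("running");

    // Collect the names of all methods flagged as synthetic in a class.
    static List<String> syntheticMethods(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Method method : type.getDeclaredMethods()) {
            if (method.isSynthetic()) {
                names.add(method.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(syntheticMethods(SyntheticDemo.class));
    }
}
```

A decompiler sees such generated members in the class file exactly like hand-written ones, which is one reason the decompiled output can contain code that never existed in the source.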

This does not mean that one cannot retrieve some of the functionality of the code but since this study does not focus on retrieving the functionality, but instead focuses only on the similarity of the code, this will not be analyzed further.

The dynamic analysis tool showed an error message and did not work on the obfuscated code, even though the original program compiled and ran with working results. This suggests either that the obfuscator has built-in protection against dynamic analysis tools like the one used in the study, or that the cause lies in the way the bytecode is built up after it has gone through obfuscation. It is also possible that the error would not have occurred if a dynamic analysis tool other than jSonde had been used. However, because of the time constraint of this study, this will not be analysed further.


5.4 Why different methods and tools were not used

Worth noting in figure A.2, which shows the settings for the ProGuard obfuscation, is that the options 'overload aggressively' and 'use unique class member names' are not checked. The first option is described in the program as allowing fields and methods to get the same obfuscated names, even if their only difference is the type or return type. This is illegal in Java source code, but completely legal in bytecode. The option was left unchecked because it made the program impossible to test in the IntelliJ testing platform, which is not able to run pure bytecode. That in turn would have made it hard to check whether the obfuscated program still behaves like the input.

This study has chosen to focus only on the lines of code changed, instead of on changes in the structure of the code. The reason for this is discussed in chapter 3.1 Analysis Methods and will thus not be explained further here. The study also did not check the function of the code after obfuscation, as this was simply not possible within the timeframe. jSonde, the dynamic analysis tool used for this study, did not work with the two smaller programs, as their obfuscated code consisted of only one class and jSonde does not support such small programs. It also turned out to be unusable for the advanced program, as the obfuscated program made it display an error message and unable to run. It is possible that other dynamic analysis tools would have managed to analyse the obfuscated code without any problems, but this was not tested, as the problem only appeared in the later parts of the study and by that time no other dynamic analysis tools could be tested because of the time constraint.

5.5 The impact of this study on society

This study may have the most impact from an economic and social viewpoint, as it could help companies decide whether it is worth investing money and time in reverse engineering obfuscated bytecode. In that respect it might save a company money, whether the company decides that reverse engineering is worth the investment compared with buying or developing new software from scratch, or decides to employ people to reverse engineer a program whose code it has lost or that needs to be updated in one way or another.

From an ethical point of view the process of reverse engineering might seem a bit dodgy. In a perfect world code obfuscation would not be needed, as it increases the amount of work and the cost of "securing" software so that thieves, cheaters and other malicious people cannot access code they should not have access to. Reverse engineering can still be necessary to update programs whose source code has been lost, and that is what this study has focused on. It can also be used to reverse engineer malicious code so that tools for defending against it can be developed or adapted.


The study is not expected to affect the environment, as the work will most probably be done on already existing infrastructure.


6 Conclusion

The study has looked into reverse engineering obfuscated code back to source code. This was done with limited success, as only around 17% of the code could be retrieved from the advanced program tested. The complexity as well as the length of the programs did indeed change the way the obfuscation worked on them, as it sometimes even removed entire class files filled with unused code. In the case of the more complex program it also created synthetic classes, which made reverse engineering much harder and the reverse engineered code much less similar to the original.

It also changed the names of all files except the one with the main class, as well as all variable names. Because of this, the more class files there are besides the main one, the more obfuscated the outcome will be. This can be seen in the study done for this paper, where the total similarity between the obfuscated code and the source code was 28.57% for the smallest program of nine rows of code, 10.71% for the more complex one of 71 lines of code and 17.03% for the advanced program with over 1000 rows of code. This is a lot less than what could be retrieved from non-obfuscated code, where 100% of the original code was retrieved for the simplest program, 96.43% for the more complex program and 86.25% for the most advanced program.

Most of the retrieved source code consisted of console logging code and some try-catch statements, which can give an indication of what a method is supposed to do. Given more time and more testing, further code might have been retrieved and diagrams might have been generated to show the flow of the program used. Companies should feel relatively safe in using software to obfuscate bytecode, as it makes the code a lot harder to reverse engineer.

Going forward, one might try different tools for reverse engineering the programs used here, to see whether more of the source code can be retrieved. More studies should be done on even bigger programs, and on a larger number of programs, to see whether the results are consistent with those of this study. One should definitely try more tools for dynamically checking the code, to see whether they give a different result.


References

[1] Carbonnelle Pierre: 2019 “PYPL PopularitY of “ http://pypl.github.io/PYPL.html Information collected 10th May 2020 17:58

[2] Oracle America, Inc, 2020 “Chapter 2. The structure of the Java Virtual Machine” https://docs.oracle.com/javase/specs/jvms/se14/html/jvms-2.html Information collected 28th April 2020 09:42

[3] Haggar Peter, 2001 “Java bytecode: understanding bytecode makes you a better ” https://www.ibm.com/developerworks/library/it-haggar_bytecode/index.html Information collected 25th of April 2020 14:34

[4] Oracle America, Inc, 2020 “Chapter 4. The class ” https://docs.oracle.com/javase/specs/jvms/se13/html/jvms-4.html#jvms-4.4 Information collected 27th of April 13:44

[5] Oracle America, Inc, 2020 “The Java® Virtual Machine Specification” https://docs.oracle.com/javase/specs/jvms/se14 Information collected 28th of April 2020 10:23

[6] Andriesse D.A, 2017 “ANALYZING AND SECURING BINARIES THROUGH STATIC DISASSEMBLY”, Vrije Universiteit, the Netherlands, page 8-12

[7] Systä Tarja, 2000 “Static and Dynamic Reverse engineering for Java software systems”, University of Tampere, Department of Computer and Information Science, Finland, page 12-22

[8] Dennis Adriaan Andriesse, 2017 “ANALYZING AND SECURING BINARIES THROUGH STATIC DISASSEMBLY”, Vrije Universiteit, , page 8-10

[9] Chan Jein-Tsai, Wuu Yang: 2004 “Advanced obfuscation techniques for Java Bytecode” The Journal of Systems and Software 71, page 4-10


[10] Batchelder Michael, Hendren Laurie: 2007, “Obfuscating Java: The Most Pain for the Least Gain” International Conference on Compiler Construction, School of Computer Science, McGill University, , QC, Canada, page 4-13.

[11] Harrand Nicolas, Soto-Valero César, Monperrus Martin, Baudry Benoit, 2019 “The Strength and Behavioral Quirks of Java Bytecode Decompilers” 19th International Working Conference on Source Code Analysis and Manipulation (SCAM), page 92-97

[12] Oracle America, Inc, 2020 “Javap” https://docs.oracle.com/javase/8/docs/technotes/tools/windows/javap.html Information collected 9th of June 02:22

[13] Fernflower, different contributers, 2004-2020 https://github.com/JetBrains/intellij-community/tree/master/plugins/java-decompiler/engine Information collected 14 of May 13:34

[14] Bedrin Dmitry, “jSonde” https://bedrin.github.io/jsonde/ Information collected 25th of April 11:43

[15] Guardsquare, “ProGuard manual” https://www.guardsquare.com/en/products/proguard/manual Information collected 9th of June 02:42

[16] IntelliJ IDEA https://www.jetbrains.com/idea/ Information collected 9th of June 02:43

[17] Gustav Smedberg, http://github.com/night-shadows/ Information collected and accurate as of 1st of June 10:43

[18] Landman D, 2017. “Reverse engineering source code, Empirical studies of limitations and opportunities” page 8-12


[19] Oracle, Java Documentation “Nested Classes” https://docs.oracle.com/javase/tutorial/java/javaOO/nested.html Information collected 21st of May 2020 14:39


Appendix

Figure A.1: The settings of ProGuard’s shrinking. These are the ones used for the obfuscation.


Figure A.2: The settings of Proguard’s obfuscation. These are the ones used for the obfuscation.


Figure A.3: The settings of ProGuard’s optimization. These are the ones used for the obfuscation.

A.4: The program used as the middle complex program to reverse engineer. The program is called “mul-add” and contains the class files “testDynamic.java” and “Mathematics.java”

mul-add

testDynamic.java

    package dynamic;

    import java.util.Scanner;

    public class testDynamic {
        public static void main(String args[]) {
            boolean running = true;
            while(running) {
                Mathmatics maths = null;
                System.out.println("Do you want to multiply [1] or add [2] two numbers? \n[1] Multiply\n[2] Add");
                Scanner scan = new Scanner(System.in);
                double mulladd = 0, number1 = 0, number2 = 0;
                if (scan.hasNext())
                    mulladd = scan.nextDouble();

                System.out.println("First number: ");
                if (scan.hasNext())
                    number1 = scan.nextDouble();

                System.out.println("Second number: ");

                if (scan.hasNext())
                    number2 = scan.nextDouble();
                double result;
                if (mulladd == 1)
                    result = maths.multiplyDouble(number1, number2);
                else
                    result = maths.addingDouble(number1, number2);

                System.out.println("The result is: " + result);

                System.out.println("want to go again? \n[y] yes\n[n] no");
                switch (scan.next()){
                    case "y":
                        break;
                    default:
                        running = false;
                }
            }

        }
    }

Mathmatics.java

    package dynamic;

    public class Mathmatics {
        private Mathmatics() {

        }

        public static int multiplyInt(int x, int y) {
            return x*y;
        }

        public static float multiplyFloat(float x, float y) {
            return x*y;
        }

        public static double multiplyDouble(double x, double y) {
            return x*y;
        }

        public static int addingInt(int x, int y) {
            return x+y;
        }

        public static float addingFloat(float x, float y) {
            return x+y;
        }

        public static double addingDouble(double x, double y) {
            return x+y;
        }

        public static int adding(int x, int y) {
            return x+y;
        }

    }

A.5: The program used as the most advanced program to reverse engineer. The program is called “SimpleSIP”. It was written by Gustav Smedberg and Sebastian Zeerak, except for AudioStreamUDP.java. AudioStreamUDP was written by Anders Lindström at the Royal Institute of Technology in Stockholm, also called KTH (Kungliga tekniska högskolan), and he has granted permission for us to use it in our thesis.

SimpleSIP

AudioStreamUDP.java

package SimpleSIP;

import java.io.*;
import java.net.*;
import javax.sound.sampled.*;

public class AudioStreamUDP {

    public static final int BUFFER_VS_FRAMES_RATIO = 16; //32
    public static final boolean DEBUG = false;
    public static final int TIME_OUT = 5000; // Time out for receiving packets
    private static final int port = 2020;

    public AudioStreamUDP() throws IOException {
        this.receiverSocket = new DatagramSocket(port);
        //receiverSocket.setSoTimeout(TIME_OUT);
        this.senderSocket = new DatagramSocket();

        format = new AudioFormat(22050, 16, 1, true, true); // 44100
        this.receiver = new Receiver(receiverSocket, format);
        this.sender = new Sender(senderSocket, format);
    }

    public int getLocalPort() {
        return receiverSocket.getLocalPort();
    }

    public synchronized void connectTo(InetAddress remoteAddress, int remotePort) throws IOException {
        sender.connectTo(remoteAddress, remotePort);
        receiver.connectTo(remoteAddress);
    }

    public synchronized void startStreaming() {
        receiver.startActivity();
        sender.startActivity();
    }

    public synchronized void stopStreaming() {
        receiver.stopActivity();
        sender.stopActivity();
    }

    public synchronized void close() {
        if (receiverSocket != null) receiverSocket.close();
        if (senderSocket != null) senderSocket.close();
    }

    private DatagramSocket senderSocket, receiverSocket;
    private Receiver receiver = null;
    private Sender sender = null;
    private AudioFormat format;
}

class Receiver implements Runnable {

    Receiver(DatagramSocket socket, AudioFormat format) {
        this.socket = socket;
        this.format = format;
    }

    void connectTo(InetAddress remoteHost) {
        this.remoteHost = remoteHost;
    }

    synchronized void startActivity() {
        if (receiverThread == null) {
            receiverThread = new Thread(this);
            receiverThread.start();
        }
    }

    synchronized void stopActivity() {
        receiverThread = null;
    }

    public void run() {
        // Make the run method a private matter
        if (receiverThread != Thread.currentThread()) return;

        try {
            initializeLine();

            int frameSizeInBytes = format.getFrameSize();
            int bufferLengthInFrames = line.getBufferSize() / AudioStreamUDP.BUFFER_VS_FRAMES_RATIO;
            int bufferLengthInBytes = bufferLengthInFrames * frameSizeInBytes;
            if (AudioStreamUDP.DEBUG) {
                System.out.println("bufferLengthInFrames = " + bufferLengthInFrames);
                System.out.println("bufferLengthInBytes = " + bufferLengthInBytes);
            }
            byte[] data = new byte[bufferLengthInBytes];
            DatagramPacket packet = new DatagramPacket(data, bufferLengthInBytes);
            int numBytesRead = 0;

            line.start();
            int packets = 0;
            while (receiverThread != null) {
                socket.receive(packet);
                // Who's the sender?
                if (remoteHost.equals(packet.getAddress())) {
                    numBytesRead = packet.getLength();
                    if (AudioStreamUDP.DEBUG) {
                        System.out.println("Received bytes = " + numBytesRead + ", packets = " + packets++);
                    }
                    int numBytesRemaining = numBytesRead;
                    while (numBytesRemaining > 0) {
                        numBytesRemaining -= line.write(data, 0, numBytesRemaining);
                    }
                }
            }
        } catch (SocketTimeoutException ste) {
            System.out.println("Receive call timed out");
        } catch (SocketException se) {
            System.out.println("SimpleSIP.Receiver socket is closed");
            // If the thread is blocked in a receive call, an exception is thrown when
            // the socket is closed, causing the thread to unblock.
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            this.cleanUp();
        }
    }

    private DatagramSocket socket = null;
    private InetAddress remoteHost;
    private Thread receiverThread = null;
    private SourceDataLine line = null;
    private AudioFormat format = null;

    private void initializeLine() throws LineUnavailableException {
        DataLine.Info info = new DataLine.Info(SourceDataLine.class, format);
        if (!AudioSystem.isLineSupported(info)) {
            System.err.println("Line matching " + info + " not supported.");
            return;
        }

        //line = (SourceDataLine) AudioSystem.getLine(info);
        line = getSourceDataLine(format);
        if (!line.isOpen()) {
            line.open(format, line.getBufferSize());
        }
    }

    private void cleanUp() {
        try {
            if (line != null) {
                line.stop();
                line.close();
            }
        } catch (Exception e) {
        }
    }

    protected void finalize() {
        this.cleanUp();
    }

    /**
     * Thanks to: Paulo Levi.
     * Lines can fail to open because they are already in use.
     * Java sound uses OSS and some linuxes are using pulseaudio.
     * OSS needs exclusive access to the line, and pulse audio
     * highjacks it. Try to open another line.
     *
     * @param format
     * @return a open line
     * @throws IllegalStateException if it can't open a dataline for the
     * audioformat.
     */
    private SourceDataLine getSourceDataLine(AudioFormat format) {
        Exception audioException = null;
        try {
            DataLine.Info info = new DataLine.Info(SourceDataLine.class, format);

            for (Mixer.Info mi : AudioSystem.getMixerInfo()) {
                SourceDataLine dataline = null;
                try {
                    Mixer mixer = AudioSystem.getMixer(mi);
                    dataline = (SourceDataLine) mixer.getLine(info);
                    dataline.open(format);
                    dataline.start();
                    return dataline;
                } catch (Exception e) {
                    audioException = e;
                }
                if (dataline != null) {
                    try {
                        dataline.close();
                    } catch (Exception e) {
                    }
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException("Error trying to aquire dataline.", e);
        }
        if (audioException == null) {
            throw new IllegalStateException("Couldn't aquire a dataline, this "
                    + "computer doesn't seem to have audio output?");
        } else {
            throw new IllegalStateException("Couldn't aquire a dataline, probably "
                    + "because all are in use. Last exception:", audioException);
        }
    }
}

class Sender implements Runnable {

    Sender(DatagramSocket socket, AudioFormat format) {
        this.socket = socket;
        this.format = format;
    }

    public void connectTo(InetAddress remoteAddress, int remotePort) throws IOException {
        this.remoteAddress = remoteAddress;
        this.remotePort = remotePort;
        //socket.connect(new InetSocketAddress(remoteAddress, remotePort));
    }

    synchronized void startActivity() {
        if (senderThread == null) {
            senderThread = new Thread(this);
            senderThread.start();
        }
    }

    synchronized void stopActivity() {
        senderThread = null;
    }

    public void run() {
        // Make the run method a private matter
        if (senderThread != Thread.currentThread()) return;

        try {
            initializeLine();

            int frameSizeInBytes = format.getFrameSize();
            int bufferLengthInFrames = line.getBufferSize() / AudioStreamUDP.BUFFER_VS_FRAMES_RATIO;
            int bufferLengthInBytes = bufferLengthInFrames * frameSizeInBytes;
            byte[] data = new byte[bufferLengthInBytes];
            int numBytesRead;
            DatagramPacket packet = null;

            line.start();
            int packets = 0;
            System.out.println("Ready");
            while (senderThread != null) {
                if ((numBytesRead = line.read(data, 0, bufferLengthInBytes)) == -1) {
                    break;
                }
                packet = new DatagramPacket(data, numBytesRead, remoteAddress, remotePort);
                socket.send(packet);
                if (AudioStreamUDP.DEBUG) {
                    System.out.println("Bytes sent = " + numBytesRead + ", packets = " + packets++);
                }
            }
        } catch (SocketException se) {
            System.out.println("SimpleSIP.Sender socket is closed");
            // Exception is thrown if socket is closed before last call to send.
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            this.cleanUp();
        }
    }

    private DatagramSocket socket = null;
    private InetAddress remoteAddress = null;
    private int remotePort = 0;
    private Thread senderThread = null;
    private TargetDataLine line = null;
    private AudioFormat format = null;

    private void initializeLine() throws LineUnavailableException {
        DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);
        if (!AudioSystem.isLineSupported(info)) {
            System.err.println("Line matching " + info + " not supported.");
            return;
        }

        //line = (TargetDataLine) AudioSystem.getLine(info);
        line = getTargetDataLine(format);
        if (!line.isOpen()) {
            line.open(format, line.getBufferSize());
        }
    }

    private void cleanUp() {
        try {
            if (line != null) {
                line.stop();
                line.close();
            }
        } catch (Exception e) {
        }
    }

    protected void finalize() {
        this.cleanUp();
    }

    /**
     * Thanks to: Paulo Levi.
     * Lines can fail to open because they are already in use.
     * Java sound uses OSS and some linuxes are using pulseaudio.
     * OSS needs exclusive access to the line, and pulse audio
     * highjacks it. Try to open another line.
     *
     * @param format
     * @return a open line
     * @throws IllegalStateException if it can't open a dataline for the
     * audioformat.
     */
    private TargetDataLine getTargetDataLine(AudioFormat format) {
        Exception audioException = null;
        try {
            DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);

            for (Mixer.Info mi : AudioSystem.getMixerInfo()) {
                TargetDataLine dataline = null;
                try {
                    Mixer mixer = AudioSystem.getMixer(mi);
                    dataline = (TargetDataLine) mixer.getLine(info);
                    dataline.open(format);
                    dataline.start();
                    return dataline;
                } catch (Exception e) {
                    audioException = e;
                }
                if (dataline != null) {
                    try {
                        dataline.close();
                    } catch (Exception e) {
                    }
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException("Error trying to aquire dataline.", e);
        }
        if (audioException == null) {
            throw new IllegalStateException("Couldn't aquire a dataline, "
                    + "this computer doesn't seem to have audio output?");
        } else {
            throw new IllegalStateException("Couldn't aquire a dataline, probably "
                    + "because all are in use. Last exception:", audioException);
        }
    }
}

BusyState.java

package SimpleSIP;

public class BusyState extends State {

    @Override
    public boolean busy() {
        return true;
    }
}

CallEstablished.java

package SimpleSIP;

import java.io.IOException;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.Socket;

public class CallEstablished extends BusyState {

    private InetAddress serverIp;
    private int port;
    private AudioStreamUDP audioStreamUDP;

    private KeepAlive keepAlive;

    /**
     * Starts a call
     * @param ip
     */
    public CallEstablished(InetAddress ip, PrintWriter out) {
        this.serverIp = ip;
        this.port = 2020;

        printState();
        startRtp();

        keepAlive = new KeepAlive(out);
        Thread thread = new Thread(keepAlive);
        thread.start();
    }

    @Override
    public State receivedBYESendOK(PrintWriter out, Socket clientSocket) {
        System.out.println("Received BYE, sending OK");
        out.println("OK");
        stopRtp();
        try {
            if(clientSocket != null) clientSocket.close();
        } catch (IOException e) {
            System.err.println("Could not close socket");
        }
        return new Free();
    }

    @Override
    public State userWantsToQuitSendBYE(PrintWriter out) {
        System.out.println("User wants to quit, sending BYE");
        out.println("BYE");
        stopRtp();
        return new Quitting();
    }

    @Override
    public State errorReceived(PrintWriter out, Socket clientSocket) {
        return super.errorReceived(null, clientSocket);
    }

    @Override
    public void error(PrintWriter out) {
        stopRtp();
        super.error(out);
    }

    private void startRtp(){
        try {
            this.audioStreamUDP = new AudioStreamUDP();
            if(serverIp != null) {
                audioStreamUDP.connectTo(serverIp, port);
                audioStreamUDP.startStreaming();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    private void stopRtp(){
        if(audioStreamUDP != null) {
            audioStreamUDP.stopStreaming();
            audioStreamUDP.close();
        }
        keepAlive.setKeepAlive(false);
    }

    private void printState() {
        System.out.println("------");
        System.out.println("| Current state: SimpleSIP.CallEstablished |");
        System.out.println("------");
    }
}

ClientListener.java

package SimpleSIP;

import java.io.BufferedReader;
import java.io.IOException;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ClientListener implements Runnable {

    private BufferedReader in;
    private StateHandler handler;
    private Socket clientSocket;

    public ClientListener(BufferedReader in_, StateHandler handler_, Socket clientSocket_) {
        in = in_;
        handler = handler_;
        clientSocket = clientSocket_;
        System.out.println(clientSocket.toString());
    }

    @Override
    public void run() {
        String port = "";
        String inputLine;
        try {
            while (((inputLine = in.readLine()) != null) && !clientSocket.isClosed()) {
                if(!inputLine.equals("ALIVE"))
                    System.out.println("RECEIVED: " + inputLine);
                if(inputLine.startsWith("port")){
                    String[] array = inputLine.split("port");
                    if(array.length > 0)
                        port = array[1];
                }

                switch (inputLine) {
                    case "INVITE":
                        break;
                    case "TRO":
                        break;
                    case "ACK":
                        handler.receivedACKSendNothing(clientSocket);
                        break;
                    case "BYE":
                        break;
                    case "OK":
                        handler.receivedOKsendNothing(clientSocket);
                        break;
                    case "BUSY":
                        closeSocket();
                        break;
                    case "ALIVE":
                        break;
                    default:
                        handler.errorReceived(clientSocket);
                        break;
                }
            }
        } catch (SocketTimeoutException e){
            System.err.println("Timeout");
            handler.errorReceived(clientSocket);
            Thread.currentThread().interrupt();
        } catch (IOException e) {
            System.out.println("Socket closed");
            handler.errorReceived(clientSocket);
            Thread.currentThread().interrupt();
            return;
        }

        handler.noInputGoToFree();
    }

    private void closeSocket() {
        try {
            clientSocket.close();
        } catch (IOException e) {
            System.err.println("Could not close socket");
        }
    }
}

Commands.java

package SimpleSIP;

public enum Commands {
    INVITE, TRO, ACK, BYE, OK, EXIT, ERROR, BUSY, UNKNOWN;
}

Connecting.java

package SimpleSIP;

import java.io.IOException;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.Socket;

public class Connecting extends BusyState {

    public Connecting() {
        printState();
    }

    @Override
    public State receivedACKSendNothing(InetAddress ip, Socket clientSocket) {
        System.out.println("Received ACK, call established!");
        PrintWriter out = getOutputStream(clientSocket);
        return new CallEstablished(ip, out);
    }

    private PrintWriter getOutputStream(Socket clientSocket) {
        PrintWriter out = null;
        try {
            out = new PrintWriter(clientSocket.getOutputStream(), true);
        } catch (IOException e) {
            e.printStackTrace();
        }
        return out;
    }

    private void printState() {
        System.out.println("------");
        System.out.println("| Current state: SimpleSIP.Connecting |");
        System.out.println("------");
    }
}

Free.java

package SimpleSIP;

import java.io.PrintWriter;

public class Free extends State {

    public Free() {
        printState();
    }

    @Override
    public State receivedInviteSendTRO(PrintWriter out) {
        if(out == null) {
            this.error(out);
            return new Free();
        }
        System.out.println("Received INVITE, Sending TRO");
        out.println("TRO");
        return new Connecting();
    }

    @Override
    public State userWantsToConnectSendInvite(PrintWriter out) {
        System.out.println("Sending INVITE");
        out.println("INVITE");
        return new Waiting();
    }

    private void printState() {
        System.out.println("------");
        System.out.println("| Current state: SimpleSIP.Free |");
        System.out.println("------");
    }

    private void sleep() {
        try {
            Thread.sleep(1000);
        } catch (InterruptedException e) {
            System.out.println("Could not sleep");
        }
    }
}

KeepAlive.java

package SimpleSIP;

import java.io.PrintWriter;

public class KeepAlive implements Runnable {

    private PrintWriter out;
    private boolean keepAlive;

    public KeepAlive(PrintWriter out_) {
        keepAlive = true;
        out = out_;
    }

    @Override
    public void run() {
        while(keepAlive) {
            out.println("ALIVE");
            sleep(5);
        }
    }

    private void sleep(int seconds) {
        try {
            Thread.sleep(1000*seconds);
        } catch (InterruptedException e) {
            System.err.println("Thread could not sleep");
        }
    }

    public void setKeepAlive(boolean value) {
        keepAlive = value;
    }
}

Quitting.java

package SimpleSIP;

import java.io.IOException;
import java.net.Socket;

public class Quitting extends State {

    public Quitting() {
        printState();
    }

    @Override
    public State receivedOKsendNothing(Socket clientSocket) {
        // TODO: Received OK in SimpleSIP.Quitting state, return nothing and end the call
        System.out.println("Received OK, now available");
        try {
            if(clientSocket != null) clientSocket.close();
        } catch (IOException e) {
            System.err.println("Could not close socket");
        }
        return new Free();
    }

    private void printState() {
        System.out.println("------");
        System.out.println("| Current state: SimpleSIP.Quitting |");
        System.out.println("------");
    }
}

ServerListener.java

package SimpleSIP;

import java.io.*;
import java.net.ServerSocket;
import java.net.Socket;

public class ServerListener implements Runnable {

    private final int TIMEOUT = 10000;

    private ServerSocket serverSocket;
    private Socket clientSocket;
    private PrintWriter out;
    private BufferedReader in;
    private StateHandler handler;

    public ServerListener(ServerSocket serverSocket_, StateHandler handler_) {
        serverSocket = serverSocket_;
        handler = handler_;
        clientSocket = null;
    }

    @Override
    public void run() {
        while(true) {
            try {
                clientSocket = serverSocket.accept();
                if(handler.isBusy()) {
                    PrintWriter tempOut = new PrintWriter(clientSocket.getOutputStream(), true);
                    tempOut.println("BUSY");
                    tempOut.close();
                    clientSocket.close();
                    clientSocket = null;
                } else {
                    try {
                        clientSocket.setSoTimeout(TIMEOUT);
                        out = new PrintWriter(clientSocket.getOutputStream(), true);
                        out.println("CONNECT");
                        in = new BufferedReader(new InputStreamReader(clientSocket.getInputStream()));
                    } catch (IOException e) {
                        System.err.println("Couldn't create in and out streams");
                        if(clientSocket != null) clientSocket.close();
                    }
                    handler.setIn(in);
                    handler.setOut(out);
                    handler.setRemoteIp(clientSocket.getInetAddress());
                    ClientListener clientListener = new ClientListener(in, handler, clientSocket);
                    Thread thread = new Thread(clientListener);
                    thread.start();
                }
            } catch(IOException e) {
                handler.errorReceived(clientSocket);
                e.printStackTrace();
            }
        }
    }
}


SimpleSIPMain.java

package SimpleSIP;

import java.io.IOException;
import java.net.ServerSocket;

public class SimpleSIPMain {

    public static void main(String[] args) {
        if(args.length < 1) {
            System.out.println("Please enter argument [port number]");
            System.exit(1);
        }

        int port = 0;
        ServerSocket serverSocket = null;
        try {
            port = Integer.parseInt(args[0]);
            serverSocket = new ServerSocket(port);
            System.out.println("SimpleSIP.Waiting for a connection...");
        } catch (IOException e) {
            System.err.println("Could not listen on port " + port);
            System.exit(1);
        } catch (NumberFormatException e) {
            System.err.println("Port number must be an integer!");
            System.exit(1);
        }
        // ServerSocket shouldn't need a timeout
        /*
        try {
            serverSocket.setSoTimeout(200);
        } catch (SocketException e) {
            System.out.println("Could not set timeout");
        }
        */
        UserCommands userCommands = new UserCommands(serverSocket);
        userCommands.run();
    }
}

State.java

package SimpleSIP;

import java.io.IOException;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.Socket;

public abstract class State {

    public State receivedInviteSendTRO(PrintWriter out) {
        error(out);
        return new Free();
    }

    public State receivedACKSendNothing(InetAddress ip, Socket clientSocket) {
        error(null);
        return new Free();
    }

    public State userWantsToConnectSendInvite(PrintWriter out) {
        error(out);
        return new Free();
    }

    public State receivedTROSendACK(PrintWriter out, InetAddress ip, Socket clientSocket) {
        error(out);
        return new Free();
    }

    public State receivedBYESendOK(PrintWriter out, Socket clientSocket) {
        error(out);
        return new Free();
    }

    public State userWantsToQuitSendBYE(PrintWriter out) {
        error(out);
        return new Free();
    }

    public State receivedOKsendNothing(Socket clientSocket) {
        error(null);
        return new Free();
    }

    public State errorReceived(PrintWriter out, Socket clientSocket) {
        error(out);
        try {
            if(clientSocket != null) clientSocket.close();
        } catch (IOException e) {
            System.err.println("Could not close socket");
        }
        return new Free();
    }

    public boolean busy() {
        return false;
    }

    public State sendBusy(PrintWriter out) {
        if(out != null) out.println("BUSY");
        return new Free();
    }

    protected void error(PrintWriter out) {
        if(out != null) out.println("ERROR");
        System.err.println("ERROR! Unexpected response!");
    }
}

StateHandler.java

package SimpleSIP;

import java.io.BufferedReader;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.Socket;

public class StateHandler {

    private State state;
    private BufferedReader in;
    private PrintWriter out;
    private InetAddress remoteIp;

    public StateHandler() {
        in = null;
        out = null;
        state = new Free();
    }

    public void userWantsToConnectSendInvite() {
        state = state.userWantsToConnectSendInvite(out);
    }

    public void receivedInviteSendTRO() {
        state = state.receivedInviteSendTRO(out);
    }

    public void receivedTROSendACK(Socket clientSocket) {
        state = state.receivedTROSendACK(out, remoteIp, clientSocket);
    }

    public void userWantsToQuitSendBYE() {
        state = state.userWantsToQuitSendBYE(out);
    }

    public void receivedBYESendOK(Socket clientSocket) {
        state = state.receivedBYESendOK(out, clientSocket);
        in = null;
        out = null;
    }

    public void receivedACKSendNothing(Socket clientSocket) {
        state = state.receivedACKSendNothing(remoteIp, clientSocket);
    }

    public void receivedOKsendNothing(Socket clientSocket) {
        in = null;
        out = null;
        state = state.receivedOKsendNothing(clientSocket);
    }

    public void errorReceived(Socket clientSocket) {
        state = state.errorReceived(out, clientSocket);
    }

    public void noInputGoToFree() {
        state = new Free();
    }

    public BufferedReader getIn() {
        return in;
    }

    public void setIn(BufferedReader in) {
        this.in = in;
    }

    public PrintWriter getOut() {
        return out;
    }

    public void setOut(PrintWriter out) {
        this.out = out;
    }

    public InetAddress getRemoteIp() {
        return remoteIp;
    }

    public void setRemoteIp(InetAddress remoteIp) {
        this.remoteIp = remoteIp;
    }

    public boolean isBusy() {
        return state.busy();
    }

    public void sendBusy() {
        state = state.sendBusy(out);
    }

    public void printState() {
        if(state instanceof Free) {
            System.out.println("SimpleSIP.State is SimpleSIP.Free");
        } else if(state instanceof Connecting) {
            System.out.println("SimpleSIP.State is SimpleSIP.Connecting");
        } else if(state instanceof Waiting) {
            System.out.println("SimpleSIP.State is SimpleSIP.Waiting");
        } else if(state instanceof CallEstablished) {
            System.out.println("SimpleSIP.State is SimpleSIP.CallEstablished");
        } else if(state instanceof Quitting) {
            System.out.println("SimpleSIP.State is SimpleSIP.Quitting");
        } else {
            System.out.println("SimpleSIP.State is UNKNOWN");
        }
    }
}

UserCommands.java

package SimpleSIP;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.*;
import java.util.Scanner;

public class UserCommands {

    private final int TIMEOUT = 10000;

    private ServerSocket serverSocket;
    private Socket clientSocket;
    private PrintWriter out;
    private BufferedReader in;
    private Scanner scan;

    private StateHandler handler;
    private ServerListener serverListener;
    private Thread otherThread;

    public UserCommands(ServerSocket socket_) {
        serverSocket = socket_;
        scan = new Scanner(System.in);
        out = null;
        in = null;
        clientSocket = null;
        handler = new StateHandler();
        serverListener = new ServerListener(serverSocket, handler);
        otherThread = new Thread(serverListener);
        otherThread.start();
    }

    public void run() {
        while (true) {
            printChoices();
            String answer = scan.nextLine();

            Commands choice = findAnswer(answer);

            switch (choice) {
                case INVITE:
                    handler.userWantsToConnectSendInvite();
                    break;
                case TRO:
                    handler.receivedInviteSendTRO();
                    break;
                case ACK:
                    handler.receivedTROSendACK(clientSocket);
                    break;
                case BYE:
                    handler.userWantsToQuitSendBYE();
                    break;
                case OK:
                    handler.receivedBYESendOK(clientSocket);
                    break;
                case EXIT:
                    System.out.println("Bye bye!");
                    System.exit(0);
                case BUSY:
                    handler.sendBusy();
                case ERROR:
                    System.err.println("Error");
                    break;
                default:
                    System.err.println("Invalid input, please try again");
            }
        }
    }

    private Commands findAnswer(String input) {
        input = input.toUpperCase();
        System.out.println(input);
        if(input.contains("INVITE")) {
            String[] arr = input.split(" ");
            if(arr.length < 3)
                return Commands.ERROR;
            else if(connectionEstablished(arr))
                return Commands.INVITE;
            else
                return Commands.ERROR;
        }

        switch(input) {
            case "TRO":
                return Commands.TRO;
            case "ACK":
                return Commands.ACK;
            case "BYE":
                return Commands.BYE;
            case "OK":
                return Commands.OK;
            case "EXIT":
                return Commands.EXIT;
            case "BUSY":
                return Commands.BUSY;
            default:
                return Commands.UNKNOWN;
        }
    }

    /**
     * If an user wants an
     * @param arr
     * @return
     */
    private boolean connectionEstablished(String[] arr) {
        String clientIP = arr[1].toLowerCase();
        int clientPort;
        try {
            clientPort = Integer.parseInt(arr[2]);
        } catch (NumberFormatException e) {
            System.err.println("Invalid integer");
            return false;
        }

        try {
            clientSocket = new Socket();
            clientSocket.connect(new InetSocketAddress(clientIP, clientPort));
            clientSocket.setSoTimeout(TIMEOUT);
            out = new PrintWriter(clientSocket.getOutputStream(), true);
            in = new BufferedReader(new InputStreamReader(clientSocket.getInputStream()));

            String isBusy = in.readLine();
            if(isBusy.equals("BUSY"))
                return false;
        } catch (UnknownHostException e) {
            System.err.println("Unknown Host, quitting");
            return false;
        } catch (IOException e) {
            System.err.println("Failed to connect, quitting");
            return false;
        }

        handler.setOut(out);
        handler.setIn(in);
        try {
            handler.setRemoteIp(clientSocket.getInetAddress());
        }catch (Exception e){
            System.err.println("Could not set remote IP");
        }

        ClientListener clientListener = new ClientListener(in, handler, clientSocket);
        Thread thread = new Thread(clientListener);
        thread.start();

        return true;
    }

    private void printChoices() {
        System.out.println("\nPossible commands: ");
        System.out.println("INVITE ");
        System.out.println("TRO");
        System.out.println("BUSY");
        System.out.println("ACK (After receiving TRO)");
        System.out.println("BYE");
        System.out.println("OK (AFTER BYE)");
        System.out.println("EXIT (Quit program)\n");
    }

    private void restartOtherThread() {
        /*
        Listener newListener = new Listener(serverSocket, true, clientSocket, out, in);
        Thread thread = new Thread(newListener);
        otherThread = thread;
        listener = newListener;
        thread.start();
        */
    }
}

Waiting.java

package SimpleSIP;

import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.Socket;

public class Waiting extends State {

    public Waiting() {
        printState();
    }

    @Override
    public State receivedTROSendACK(PrintWriter out, InetAddress ip, Socket clientSocket) {
        System.out.println("Received TRO, sending ACK");
        out.println("ACK");
        return new CallEstablished(ip, out);
    }

    private void printState() {
        System.out.println("------");
        System.out.println("| Current state: SimpleSIP.Waiting |");
        System.out.println("------");
    }
}

TRITA CBH-GRU-2020:052

www.kth.se