Control Flow Obfuscation for FJ Using Continuation Passing Conference’17, July 2017, Washington, DC, USA
Total Page:16
File Type:pdf, Size:1020Kb
Control Flow Obfuscation for FJ using Continuation Passing Extended Version Kenny Zhuo Ming Lu School of Information Technology Nanyang Polytechnic [email protected] Information Systems Technology and Design Singapore University of Technology and Design [email protected] Abstract of the major security threat to Java applications. Code obfus- Control flow obfuscation deters software reverse engineer- cation is one of the effective mechanism to deter malicious ing attempts by altering the program’s control flow transfer. attack through decompilation. There are many obfuscation The alternation should not affect the software’s run-time be- techniques operating on the level of byte-codes [5, 17, 20]. haviour. In this paper, we propose a control flow obfuscation In the domain of source code obfuscation, we find solutions approach for FJ with exception handling. The approach is such as [9] applying layout obfuscation. We put our inter- based on a source to source transformation using continua- est in control flow obfuscation techniques, which include tion passing style (CPS). We argue that the proposed CPS control flow flattening [3, 11, 21] and continuation passing transformation causes malicious attacks using context in- [12]. Note that the difference between bytecode obfuscation sensitive static analysis and context sensitive analysis with and source code obfuscation is insignificant, because of the fixed call string to lose precision. strong correlation between the Java bytecodes and source codes. In this paper, we propose an extension to the con- CCS Concepts • Software and its engineering → Se- tinuation passing approach to obfuscate FJ with exception mantics; • Security and privacy → Software and applica- handling. tion security. We assume the attackers gain access to the byte-codes Keywords Control flow obfuscation, program transforma- to which layout obfuscation has been applied. The attack- tion, continuation passing style ers decompile the byte-codes into source codes and attempt to extract secret information by running control flow anal- ACM Reference Format: ysis on the decompiled code. Our goal here is to cause the Kenny Zhuo Ming Lu. 2021. Control Flow Obfuscation for FJ using control flow analysis become imprecise or more costly in Continuation Passing: Extended Version. In Proceedings of ACM computation. Conference (Conference’17). ACM, New York, NY, USA, 13 pages. hps://doi.org/10.1145/nnnnnnn.nnnnnnn 2 Motivating Example 1 Introduction Example 1. To motivate the main idea, let’s consider the Java applications are ubiquitous thanks to the wide adoption following Java code snippet arXiv:2012.06340v2 [cs.CR] 26 Feb 2021 of android devices. Since Java byte-codes are close to their source codes, it is easy to decompile Java byte-codes back to class FibGen { source codes with tools. For example, javap shipped with int f1, f2, lpos; JVM [15] can be used to decompile Java class files back to FibGen() { Java source. This makes the Man-At-The-End attack as one f1 = 0; f2 = 1; lpos = 1; } Permission to make digital or hard copies of all or part of this work for int get( int x ) { personal or classroom use is granted without fee provided that copies are int i = lpos; // (1) not made or distributed for profit or commercial advantage and that copies int r = −1; bear this notice and the full citation on the first page. Copyrights for compo- try { // (2) nents of this work owned by others than ACM must be honored. Abstract- ing with credit is permitted. To copy otherwise, or republish, to post on if (x< i) { // (3) servers or to redistribute to lists, requires prior specific permission and/or throw new Exception() ; // (4) a fee. Request permissions from [email protected]. } else { Conference’17, July 2017, Washington, DC, USA while (i <x) { // (5) © 2021 Association for Computing Machinery. int t = f1 + f2; // (6) ACM ISBN 978-x-xxxx-xxxx-x/YY/MM...$15.00 f1 = f2; f2 = t; i++; hps://doi.org/10.1145/nnnnnnn.nnnnnnn } Conference’17, July 2017, Washington, DC, USA Kenny Zhuo Ming Lu 1 an anonymous function whose input is of type void and the body returns a boolean value. For brevity, we omit the type annotations of the formal arguments where there is no con- 2 fusion. The return key word is omitted when there is only one statement in the function body. We omit curly brackets in curried expressions, e.g. x -> raise -> k -> { ... } 3 t is the same as x -> { raise -> { k -> { ... } } }, f where x, raise and k are formal arguments for the lambda 4 5 abstractions. For convenience, we treat method declaration f and lambda declaration as interchangeable. For instance, the t 7 8 lambda declaration 6 int => int => int f =x −> y −>{x+y} 9 is equivalent to the following method declaration int f ( int x, int y ) { return x + y;} Figure 1. CFG of get In the last section, we mentioned that the layout obfusca- tion such as identifier renaming should have been applied to } the obfuscated code; however in this paper we keep all the lpos = i; // (7) identifiers in the obfuscated code unchanged for the ease of r = f2; reasoning. For the sake of assessing the obfuscation potency, } catch (Exception e) { // (8) we “flatten” the nested function calls into sequences of as- println("the input should be greater than " signment statements. For example, let x and y be variables + i + "."); of type int, let f be a function of type int => int => int } and g be a function of type int => int; instead of int r = return r; // (9) f(x)(g(y)); , we write: } } int => int f_x = f(x); int g_y = g(y); In the above we define a Fibonacci number generator in int r = f_x(g_y); class FibGen. In the method get, we compute the Fibonacci number given the position as the input. Note that the gener- As we observe in Figure 2, all the building blocks are contin- ator maintains a state, in which we record the last two com- uation functions with type CpsFunc.The simple codeblocks puted Fibonacci numbers, namely, f1 and f2 and the last (1), (6), (7) and (8) from the original source code, which con- computed position lpos. In method get lines 10 and 11, we tain no control flow branching statements, are translated raise an exception if the given input is smaller than i which into nested CPS functions get1, get6, get7 and get8. Block has been initialized to lpos. Towards the end of the method, (4) containing a throw statement is translated into get4 which we catch the exception and print out the error message. applies the exception object to the exception handling con- The number comment on the right of each statement in- tinuation raise of type ExCont.Block(9)hasa return state- dicates the code block to which the statement belongs. In ment, which is translated into a function in which we assign Figure 1, we represent the function get’s control flow as a the variable being returned r to the res variable and call graph. Each circle denotes a code block from the source pro- the normal continuation k of type NmCont.Block (2) is a try gram. ✷ catch statement which is encoded as a call to the trycatch combinator in line 24. Similarly block (3) the if-else state- Inspired by the approach [12], our main idea is to trans- ment is encoded as a call to the ifelse combinator and late control flow constructs, such as sequence, if-else, loop block(5) the while loopis encodedas a call to the loop com- into CPS combinators. In the context of FJ with exception binator. handling, we translate try-catch statements into CPS com- In Figure 3, we present the definitions of the CPS combi- binators as well. nators used in the obfuscation. Combinator loop accepts a In Figure 2 we find the obfuscated code snippet of get condition test cond, a continuation executor visitor to be method in CPS style. The obfuscated code is in a variant executed when the condition is satisfied, a continuation ex- of FJ, named FJ_, which is FJ with higher order functions, ecutor exit to be activated when the condition is not satis- nested function declaration and mutable variables in func- fied. Combinator seq takes two continuation executors and tion closures. void => void denotes a function type whose executes them in sequence. Combinator trycatch takes a values accept no argument and return no result. Exception continuation executor tr and an exception handling contin- => void denotes a function type that accepts an exception uation hdl. It executes tr by replacing the current excep- and returns no result. type NmCont = void => void de- tion continuation with ex_hdl. Combinator ifelse accepts fines a type alias. (void n) -> {return i < x}; defines a condition test cond, a continuation for the then-branch th Control Flow Obfuscation for FJ using Continuation Passing Conference’17, July 2017, Washington, DC, USA type ExCont = Exception => void ; CpsFunc loop( void => Boolean cond, type NmCont = void => void ; CpsFunc visitor , CpsFunc exit) { type CpsFunc = ExCont => NmCont => void ; return raise −> k −> { int get( int x ) { if (cond()) { int i, t, r, res; Exception ex; NmCont => void visitor_raise = visitor(raise int => ExCont => ( int => void ) => void get_cps = ) ; x −> raise −> k −> { NmCont nloop = n −> { void => bool cond5 = n−> { i <x}; CpsFunc ploop = loop(cond, visitor , exit); void => bool cond3 = n −> {x < i}; NmCont => void = ploop_raise = ploop(raise CpsFunc get5 = loop(cond5, get6 , get7) ) ; CpsFunc get3 = ifelse(cond3,get4 ,get5); return ploop_raise (k) ; CpsFunc get1_2 = seq(get2 , get9); }; CpsFunc pseq = seq(get1 , get1_2); return visitor_raise (nloop); NmCont => void pseq_raise = pseq(raise); } else { NmCont nk_res = n−>k(res);