Framework for combined compile-time and runtime error checking in Open64

Jianxin Lai, Peng Yuan, Hongmei Wang, Dingfei Zhang
Unix System Lab, Hewlett-Packard Company
{Jianxin.lai, peng.yuan, hong-mei.wang, zhang.ding-fei}@hp.com

Abstract

A lot of techniques have been developed to detect errors in programs automatically. Static checkers scan the program source code to find potential problems. Runtime checkers change the program, either statically by compile-time instrumentation or dynamically by runtime instrumentation. Static checkers tend to report false positive warnings, and runtime checkers incur significant overhead at runtime.

In this paper, an Open64-based framework for combined compile-time and runtime error checking is presented. Based on the framework, a checker for uninitialized variables is implemented. The checker has less runtime overhead than traditional runtime checkers and does not issue any false positive warnings. With this framework, new checkers are able to make a better trade-off between false positive warnings and runtime overhead than traditional approaches.

Categories and Subject Descriptors D.3.3 [Programming Languages]: Processors

General Terms Algorithms

Keywords open64 compiler, static checker, runtime checker

1. Introduction

A lot of techniques have been developed to detect errors in programs automatically. The corresponding tools use two basic approaches for error checking. One is to check for errors by scanning the source code, also known as static checking. A static checker does not require running the program while checking it for errors. The other is the runtime checker, which starts the program in a managed environment and performs the checking while the program is running.

The static checker has many advantages over the runtime checker. It can be integrated with the development environment easily, it can check the program during the coding phase, and it can check part or all of the source code regardless of whether the program is executable. A static checker can also cover all code in the program regardless of whether that code is executed. The major disadvantage of the static checker is accuracy. A static checker may either miss real errors or issue false positive warnings.

In contrast, the runtime checker does not report any false positive warnings, but it usually has noticeable overhead. Also, a runtime checker is only able to detect errors in the executed code.

In this paper, a framework that combines compile-time and runtime error checking to detect errors in programs is presented. The framework is based on Open64 [1] in order to leverage the compiler infrastructure, the static single assignment form (SSA) based data flow analysis in the global optimizer (WOPT), and the inter-procedural analysis (IPA) framework. The framework provides the facility to instrument the intermediate representation (WHIRL) for runtime checking and to perform both intra- and inter-procedural data flow analysis for static checking. A protocol that passes the analysis result from the static checker to the runtime checker to guide the instrumentation is also developed. With this new framework, the checkers are able to eliminate the false positives of the static checker while decreasing the overhead of the runtime checker.

2. Framework overview

Our framework includes two major components. One is the runtime checker, which is based on instrumentation. The program is instrumented with the interface provided by the runtime checking library. The runtime checker is designed to detect errors in the program during the execution of the program. It is available at all optimization levels. The other is the static checker. This part is built in WOPT and IPA. One major task of the static checker is to calculate the candidate set for runtime checking and pass the information to the instrumentation phase. The static checker is able to identify the really safe cases and exclude them from the candidate set. Since the safe cases are not instrumented, the overhead of our approach is less than that of traditional checkers based on binary instrumentation. Another task is to identify the really unsafe cases and issue warnings to the programmer. Thus the false positive warnings are eliminated. With the collaboration of the two parts, we can strike a better balance between accuracy and overhead than before.

The whole framework is based on the existing Open64 infrastructure. It includes 4 parts:
- Instrumentation phase
- Library for runtime checking (librtc)
- Static checking phase in WOPT, enabled at O2 and O3
- Static checking phase in IPA

[Diagram components: FE (C/C++/F); IPL and IPA (-IPA); WOPT (-O2); -O0/1 path; Instrument; librtc; CG]
Figure 1. Framework overview

2.1 Instrumentation phase

The instrumentation phase is activated at all optimization levels. In general, it walks the whole WHIRL tree and instruments the WHIRL nodes based on the checker's requirements. This phase also creates new symbols and sets up the right layout for all variables used by the checking functions defined in the runtime library. This phase includes 2 steps. Step 1 is the candidate selection. If the static checker in either WOPT or IPA is available, the candidate set is provided by the static checker. Otherwise all possible cases are included in the candidate set. Step 2 is the transformation. All candidates are instrumented for runtime checking.

2.2 Library for runtime checking (librtc)

The library provides the functionality to check errors at runtime. Different checkers need different checking functions. Also, the instrumentation phase requires the interfaces provided by librtc in order to generate the right function call at the right position in the IR.

2.3 Static checking phase in WOPT

WOPT builds the hashed static single assignment form (HSSA) and does data flow analysis on top of the HSSA representation [2]. In the HSSA representation, the alias information is represented by chi nodes and mu nodes. The chi nodes are created for global variables around the call sites to model the side effects of the function call. The mu nodes are created for global variables and attached to the RETURN statement. The entry chi is inserted at the beginning of the program unit (PU) to make sure all uses have definitions. In our framework, the version defined by the entry chi is assigned a special version number. If the call site uses the special version in either the parameters or the chi attached to the call node, the variable is not changed on any path from the function entry to the call site. If the mu node on the RETURN statement uses the special version, the variable is not changed on any path from the entry to the exit.

The framework is built on top of WOPT. It provides methods to traverse the symbol tables, the HSSA representation, the Definition-Use (DU) manager and the alias manager. The checkers in WOPT are able to visit the IR and symbol tables through these interfaces to perform the intra-procedural checking.

2.4 Static checking phase in IPA

The Open64 IPA framework enables inter-procedural analysis and optimizations. But the current IPA framework is not context sensitive and its support for inter-procedural data flow analysis is limited. In the new framework, the current Open64 IPA infrastructure is enhanced to support context-sensitive and partially flow-sensitive inter-procedural analysis. The new IPA infrastructure can flexibly restrict the analysis to either the whole program or part of the program. The static checkers built on the new IPA infrastructure are able to perform checking in a larger scope with flexible resource control.

Similar to the existing Open64 IPA, the new IPA infrastructure also includes two phases: the local summary phase (IPL) and the IPA phase. In the IPL phase, the summary information for the PU is collected and written into the fake object file. To achieve context sensitivity, each call site is summarized separately regardless of whether the call sites call the same function. To achieve flow sensitivity, the SSA versions of function parameters and the chi nodes of global variables associated with the call site are also summarized. The SSA versions of global variables on the RETURN statement are summarized to model the side effects of the function. Also, a checker can have its own IPL phase to summarize the information needed for the IPA phase.

In the IPA phase, the call graph is built using the call site summary information. On the call graph, the SSA versions of actuals and the chi nodes of global variables are applied to the edges of the call graph. Once the call graph is built, the SSA versions of the RETURN statement are applied to the call graph using a bottom-up approach. All chi nodes on the edges are checked. If the symbol of the chi is not changed in the callee, the result version of the chi node is replaced by the operand version. All occurrences of the result version are replaced in all the summary information of the caller. Then we can treat the call graph as a control flow graph and run the classic SSA construction algorithm on it to construct the global SSA. If the callee has multiple call sites, an entry phi node is introduced to replace the entry chi. The number of operands of the entry phi node equals the number of in-edges of the callee. The version of each operand matches the result version of the chi node on the corresponding edge.

[Diagram, phases in order: Symbol resolution; Symtab merge; Build Callgraph; Propagate RETURN version; Entry PHI insertion; Rename global SSA; Update summary; IPA checking]
Figure 2. IPA phase ordering

In Figure 3 below, (A) is the source code, (B) is the intra-procedural SSA, (C) is the call graph before applying the global SSA construction, and (D) is the global SSA representation.

(A)
    int dbg;
    int foo(int x) {
      int y;
      y ← …;
      if (dbg > 3)
        ← x;
      ← y;
    }

    int bar() {
      int x;
      foo(x);
      if (dbg > 2)
        x ← …;
      foo(x);
    }

(B)
    int dbg;
    int foo(int x) {
      int y;
      x1 = chi(x0);
      dbg1 = chi(dbg0);
      y1 ← …;
      if (dbg1 > 3)
        ← x1;
      ← y1;
      mu(dbg1);
    }

    int bar() {
      int x;
      x1 = chi(x0);
      dbg1 = chi(dbg0);
      foo(x1);
      dbg2 = chi(dbg1);
      if (dbg2 > 2)
        x2 ← …;
      x3 = phi(x1, x2);
      foo(x3);
      dbg3 = chi(dbg2);
      mu(dbg3);
    }

(C)
    bar   Entry: (x0, dbg0)    Return: (dbg3)
            CS0: (x1, dbg1)
            CS1: (x3, dbg2)
    foo   Entry: (x0, dbg0)    Return: (dbg1)

(D)
    bar   Entry: (x_g0, dbg_g0)    Return: (dbg_g0)
            CS0: (x_g0, dbg_g0)
            CS1: (x_g1, dbg_g0)
    foo   Entry: (x_g2, dbg_g0)    x_g2 = phi(x_g0, x_g1)    Return: (dbg_g0)

Figure 3. Call graph and global SSA

Our framework provides interfaces to traverse the call graph and retrieve the global SSA information attached to all edges and vertices of the call graph. The IPA static checker can traverse the call graph through these interfaces and perform the checking actions using the summary information collected in the IPL phase.

3. Implementation of uninitialized variables checker

In this section, an implementation of the uninitialized variables checker is described. This checker is used to make sure that a scalar variable whose address is not taken is assigned before its value is used. The checker is applied to local variables, and also to parameters in the presence of IPA. It is available at all optimization levels. The checker is based on the framework described in section 2. The compile-time checking is performed either intra-procedurally or inter-procedurally.

3.1 Runtime checking for uninitialized variables

The runtime checker is available at all optimization levels. It is based on compiler instrumentation. For every local variable that might be used before it is defined, a new flag is introduced to record whether the variable is defined or not. In the presence of IPA, if the variable is passed to other functions, a new parameter is appended to the end of the parameter list and the address of the flag is passed through this new parameter. In the instrumentation phase, all STIDs and LDIDs of the identified variables are instrumented if the compile-time checking in WOPT or IPA is not available. Otherwise, the instrumentation is performed only on the STIDs and LDIDs identified by the static checker. The statement that checks whether the flag is set is inserted before the statement using the variable. The statement that sets the flag is inserted after the statement setting the variable. Without IPA, only local variables are checked, since it is unknown whether the address of the actual is taken in the caller. An example of the runtime checking of uninitialized variables is presented in Figure 4 (the control flow is omitted and x is assumed not to be address-taken).

Before instrumentation:
    int foo(int x) {
      int y;
      …
      y ← ;
      ← x;
      ← y;
    }
    int bar() {
      int x;
      x ← …;
      foo(x);
    }

After instrumentation:
    int foo(int x, bool* flag_x) {
      int y; bool flag_y = false;
      …
      y ← ; flag_y = true;
      if (!*flag_x) report_error();
      ← x;
      if (!flag_y) report_error();
      ← y;
    }
    int bar() {
      int x; bool flag_x = false;
      x ← …; flag_x = true;
      foo(x, &flag_x);
    }

Figure 4. Instrumentation at compile-time

3.2 Compile-time checking for uninitialized variables in WOPT

The basic compile-time checking in WOPT has been implemented for several years. That implementation is quite simple. As described in section 2.3, entry chi nodes are created to make sure each use has a definition. If a use of a local variable (other than a formal parameter) is reached by the entry chi, the variable may be used uninitialized, and a warning is issued.

In the new implementation, the responsibilities of the checker in WOPT are not only to issue warnings, but also to identify the safeness of each LDID/STID. The new checker needs to identify all STIDs/LDIDs on local variables that are not address-taken and pass the safeness information to the instrumentation phase. If both the uses and the definitions are conditional, the condition expressions on which the uses and definitions are control dependent are also compared. If the definition condition covers the use condition, that is, the use condition implies the definition condition, the use is still safe. In order to eliminate false positive warnings, warnings are only issued for the really unsafe cases. For the really safe cases, no instrumentation for runtime checking is generated, which reduces the runtime overhead.

The new algorithm includes three phases. The first phase identifies all statements that possibly use undefined variables and finds all definitions reaching each such use. The second phase checks whether the use is safe by comparing the conditions under which the variable is defined and used. In the last phase, all possibly unsafe uses and their definition points are passed to the instrumentation phase for runtime checking.

Source code:
    int foo() {
      int x, y;
      x ←;
      if (…)
        y ← …;
      ← x;
      ← y;
    }

SSA form:
    int foo() {
      int x, y;
      y1 = chi(y0);
      x1 ←;
      if (…)
        y2 ← …;
      y3 = phi(y1, y2);
      ← x1; /* safe */
      ← y3; /* unsafe */
    }

After instrumentation:
    int foo() {
      int x, y; bool flag_y = false;
      y1 = chi(y0);
      x1 ←; /* no instru */
      if (…) {
        y2 ← …; flag_y = true;
      }
      y3 = phi(y1, y2);
      ← x1; /* no instru */
      if (!flag_y) report_error();
      ← y3;
    }

Figure 5. Static checker in WOPT

3.3 Compile-time checking for uninitialized variables in IPA

The uninitialized variables checker in IPA contains two parts: one is in the IPL phase and the other is in the IPA phase. The IPL phase adopts an algorithm similar to the one used by the static checker in WOPT. The IPL phase covers both local variables and parameters. The summary information is written into the fake object file and passed to IPA. In the IPA phase, the summary data are applied to the call graph and checked by the IPA uninitialized variables checker.

3.3.1 IPL phase

In the IPL phase, the UD chain is traversed in a way similar to the one described in section 3.2. All uses of uninitialized variables, their definitions and the address-taken information are summarized. If the uses and definitions are conditional, their control dependencies are also summarized. At the end of the IPL phase, the summary information is written to the fake object file and passed to IPA. Figure 5 shows an example for the IPL phase.

Figure 6 describes how the versions of global variables and parameters are collected and written to the summary data. Each entry in the summary of use information has 5 fields: index, symbol, index of the WHIRL node that uses the symbol, index of the condition expression under which the symbol is used, and version of the symbol. Each entry in the summary of definition information has the fields: index, symbol, index of the WHIRL node that defines the symbol, index of the condition expression under which the symbol is defined, and index of the corresponding entry in the summary of use information.

Source code:
    int dbg;
    int foo(int x) {
      int y;
      y ← …;
      if (dbg > 3)
        ← x;
      ← y;
    }
    int bar() {
      int x;
      if (dbg > 2)
        x ← …;
      foo(x);
    }

SSA form:
    int dbg;
    int foo(int x) {
      int y;
      x1 = chi(x0);
      dbg1 = chi(dbg0);
      y1 ← …;
      if (dbg1 > 3)
        ← x1; // wn:1
      ← y1;
      mu(dbg1);
    }
    int bar() {
      int x;
      x1 = chi(x0);
      dbg1 = chi(dbg0);
      if (dbg1 > 2)
        x2 ← …; // wn:2
      x3 = phi(x1, x2);
      foo(x3); // wn:3
      dbg2 = chi(dbg1);
      mu(dbg2);
    }

foo summary
    Use of UV summary
      0  x  wn:1  exp:0  ver1
    Expr summary
      0  GT    kid:1  kid:2
      1  LDID  dbg    ver1
      2  CNT   int    3

bar summary
    Use of UV summary
      0  x  wn:3  N/A  ver3
    Def of UV summary
      0  x  wn:2  exp:0  use:0
    Expr summary
      0  GT    kid:1  kid:2
      1  LDID  dbg    ver1
      2  CNT   int    2

Figure 6. Summary data collected in IPL phase

3.3.2 IPA phase

In the IPA phase, the IPA uninitialized variables checking phase is invoked after the preparatory phases described in section 2.4, which build the call graph, connect the actuals and formals, merge the global SSA and update the SSA versions in the summary data. The IPA checker works similarly to the WOPT checking phase, but it is based on the global SSA versions of global variables and parameters. In the example of Figures 6 and 7, once the version of the formal parameter is replaced by x3, the use in foo and the definition in bar are connected. Since both the use and the definition are conditional, the two condition expressions are compared. Because the condition of the definition in bar (dbg > 2) covers the condition of the use in foo (dbg > 3), the use is proven to be safe and no runtime checking is needed. Finally, the safeness information is written to the IR file and passed to the backend for runtime checking instrumentation.

Figure 7 shows the version changes in the IPA phase. (A) shows the versions before connecting the formal parameters and actual parameters. (B) shows the versions after connecting the parameters and eliminating the duplicated entry chi nodes. (C) shows the versions after the global SSA renaming, updated with the new global versions. With the new global versions, the expressions in different functions can be compared. The satisfiability of the use condition and definition condition can be solved by Difference Decision Diagrams [19].

                     (A)               (B)               (C)
    bar  Entry:      (x0, dbg0)        (x0, dbg0)        (x_g0, dbg_g0)
         Callsite:   (x3, dbg1)        (x3, dbg1)        (x_g1, dbg_g0)
         Return:     (dbg2)            (dbg1)            (dbg_g0)
    foo  Entry:      (x0, dbg0)        (x3, dbg1)        (x_g1, dbg_g0)
         Return:     (dbg1)            (dbg1)            (dbg_g0)

Figure 7. Version changes in IPA phase

4. Other checkers

Besides the checker for uninitialized variables, several other runtime checkers have also been prototyped on this framework. Two of them are introduced briefly in this section. One is the truncation checker. The other is a heap-related memory checker. With useful compile-time diagnostic information, the static checker makes the runtime checkers more powerful and reduces their overhead.

The truncation checker detects data loss in assignments when integral values are truncated, no matter how the truncation is done in the program. If the result of an arithmetic operation is truncated, an error message is written to stderr and the program is aborted. Without the static checker, all arithmetic operations possibly causing truncation must be instrumented. The static checker is based on value range propagation, which is able to identify the possible range of the operands and results. Similar to the uninitialized variables checker, warnings are issued for operations that are known to truncate, and runtime instrumentation is applied to all operations that possibly truncate.

The heap memory checker can detect several kinds of memory problems: memory leaks, heap memory bounds overrun and underrun, reads of freed memory, bad free and duplicate free. It also provides heap memory information. It causes the user program to abort on writes beyond the boundaries of heap objects, on free or realloc calls for a pointer that is not a valid heap object, and on out-of-memory conditions. The runtime heap memory checker replaces the standard C memory allocator and free function with its own implementation, which does both memory management and checking. This enables the checker to record allocation information for every heap memory object, and to look up a freed pointer in the allocation information table to determine whether it is a bad free or violates memory bounds. If a memory problem (such as a bounds violation, bad free, or out of memory) is found during allocation or de-allocation, the program prints an error message and aborts the execution. At program exit, the checker uses the memory information to find memory leaks, and dumps the leak information and memory corruption information to log files. The static checker is able to identify the safeness of some patterns for memory problems and tell the runtime checker not to instrument them.

5. Related work

Many tools have been developed to find errors in programs. Existing methods for program error checking are either static checkers, runtime checkers, or a combination of these two types.

5.1 Static checker

Static checkers work directly on source code. They detect defects early in development, and their results cover more of the program than those of a runtime checker. Static checkers are often embedded in compilers like GCC [3][4]. HP cadvise [5][6] is a standalone static analysis tool which leverages advanced cross-file analysis technology by storing diagnostic information in a program database. With built-in knowledge of system APIs, it identifies potential coding errors, porting issues and security vulnerabilities with fewer false positives.

5.2 Runtime checker

Runtime checkers are widespread for C/C++ programs. Valgrind [7][8] provides a dynamic binary instrumentation (DBI) infrastructure for runtime checking. Memcheck [9][10] is a tool based on Valgrind for memory error detection and debugging. Memcheck inserts extra instrumentation code around almost all instructions to keep track of the validity and addressability of memory objects. Its drawback compared to the proposed approach is that it causes a dramatic performance penalty. IBM Rational Purify [11-14] is a well-known commercial runtime checker which is generally used for error detection, memory leak detection, and analysis. It inserts instrumentation code into the executable by parsing and adding to the object code, and tracks the status of each byte of memory with two bits: the first bit records whether the byte is allocated and the second bit records whether it is initialized. Its weakness is that it is highly dependent on manual input from the user to complete an analysis, and it sometimes produces false negatives, for example for uninitialized memory defects. Compilers can also perform instrumentation to detect errors in programs. HP aCC [15] provides runtime checks to detect coding errors in user programs through an embedded runtime checker. It inserts instrumentation code for designated objects at compile time and performs the check at runtime. It also has the disadvantage that the additional instrumentation instructions significantly slow down the user program.

5.3 Combined approach

Many existing approaches fall into this category [16]. SAFECode [17][18] mainly uses static compiler analysis to provide most of the error detection capabilities on memory objects. Compared to the proposed method, its focus is limited to program memory safety and it requires some semantic restrictions on the user program. Furthermore, it needs to make use of the hardware-supported address space protection mechanism to reduce runtime overhead, which is not so common in embedded systems.

6. Summary

In this paper, a framework for combined compile-time and runtime error checking in Open64 is presented. Based on the framework, a checker for uninitialized variables and other checkers are implemented. Compared to other similar checkers, our checkers do not report false positive warnings at compile time and have less overhead at runtime.

References

[1] Open64 Compiler, http://www.open64.net
[2] F. Chow, S. Chan, S. Liu, R. Lo, and M. Streich. Effective representation of aliases and indirect memory operations in SSA form. In Proceedings of the Sixth International Conference on Compiler Construction, pages 253-267, 1996.
[3] D. Engler, B. Chelf, A. Chou, and S. Hallem. Checking system rules using system-specific, programmer-written compiler extensions. In Proceedings of OSDI 2000, San Diego, California, USA, 2000.
[4] S. Hallem, B. Chelf, Y. Xie, and D. Engler. A system and language for building system-specific, static analyses. In Proceedings of PLDI 2002, Berlin, Germany, June 2002.
[5] HP Code Advisor C.02.20 User Guide, http://www.hp.com/go/cadvise
[6] HP Code Advisor (cadvise), http://www.hp.com/go/cadvise
[7] Valgrind, http://www.valgrind.org/
[8] N. Nethercote and J. Seward. Valgrind: A Framework for Heavyweight Dynamic Binary Instrumentation. PLDI 2007.
[9] J. Seward. Valgrind, an open-source memory debugger for x86-GNU/Linux.
[10] J. Seward and N. Nethercote. Using Valgrind to detect undefined value errors with bit-precision. In Proceedings of the USENIX'05 Annual Technical Conference, Anaheim, California, USA, April 2005.
[11] Youngki Hong, Taeho Kong, et al. Application and evaluation of Rational Purify. 2007.
[12] Jeff Campbell. Memory profiling for C/C++ with IBM Rational Test RealTime and IBM Rational PurifyPlus RealTime. 2008.
[13] G. Begic et al. An introduction to runtime analysis with Rational PurifyPlus. 2007. http://www-128.ibm.com/developerworks/rational/library/mar07/begic_pratt/index.html
[14] IBM Rational Purify, http://en.wikipedia.org/wiki/IBM_Rational_Purify
[15] HP aCC compiler, http://www.hp.com/go/acc
[16] Dinakar Dhurjati, Sumant Kowshik, Vikram Adve, and Chris Lattner. Memory Safety without Garbage Collection for Embedded Applications. ACM Transactions on Embedded Computing Systems, Vol. 4, No. 1, February 2005, pages 73-111.
[17] SAFECode, http://safecode.cs.illinois.edu/
[18] Dinakar Dhurjati, Sumant Kowshik, and Vikram Adve. SAFECode: Enforcing Alias Analysis for Weakly Typed Languages. PLDI 2006.
[19] Jesper Møller, Jakob Lichtenberg, Henrik R. Andersen, and Henrik Hulgaard. Difference Decision Diagrams. In Proceedings of the Annual Conference of the European Association for Computer Science Logic (CSL), Madrid, Spain, 1999.
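As an illustration of the flag-based runtime check for uninitialized variables (the transformation shown in Figure 4), the following is a minimal hand-written C sketch of what instrumented code could look like. It is not output of the actual instrumentation phase. The routine `report_error` and the counter `rtc_error_count` are hypothetical stand-ins, since the paper does not give the librtc interface, and the `define_x` parameter is added only so that both outcomes can be exercised.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a librtc routine: counts and logs reports. */
static int rtc_error_count = 0;

static void report_error(const char *name)
{
    rtc_error_count++;
    fprintf(stderr, "rtc: '%s' used before it was assigned\n", name);
}

/* Instrumented callee: flag_x is the extra parameter appended under IPA. */
static int foo(int x, const bool *flag_x)
{
    int y;
    bool flag_y = false;      /* shadow flag for the local y */

    y = 1;
    flag_y = true;            /* set right after the store to y */

    if (!*flag_x)             /* check right before the load of x */
        report_error("x");
    int sum = x;

    if (!flag_y)              /* check right before the load of y */
        report_error("y");
    return sum + y;
}

/* Instrumented caller: x is assigned only when define_x is true. */
static int bar(bool define_x)
{
    int x = 0;                /* placeholder so the sketch has no undefined behavior */
    bool flag_x = false;      /* shadow flag for the local x */

    if (define_x) {
        x = 41;
        flag_x = true;        /* set right after the store to x */
    }
    return foo(x, &flag_x);   /* address of the flag travels with x */
}
```

Calling `bar(true)` produces no report, while `bar(false)` triggers one report for `x` inside `foo`. A static checker that proves `flag_y` is always set before the load of `y` could drop that flag and its check entirely, which is the overhead reduction the combined approach aims for.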
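The comparison of a use condition against a definition condition (sections 3.2 and 3.3.2) can be illustrated for the very restricted case that appears in Figure 6, where both conditions have the form "symbol > constant". The sketch below is an assumption-laden simplification: the paper states that the general problem is solved with Difference Decision Diagrams, which this code does not implement, and the `GtCond` type and `use_implied_by_def` function are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of a guard of the form "symbol_version > bound". */
typedef struct {
    int sym_version;  /* global SSA version of the tested symbol */
    int bound;        /* constant on the right-hand side of '>' */
} GtCond;

/*
 * Returns true when every value that satisfies the use condition also
 * satisfies the definition condition, so the definition is guaranteed to
 * have executed whenever the use executes.
 */
static bool use_implied_by_def(GtCond use, GtCond def)
{
    if (use.sym_version != def.sym_version)
        return false;               /* different values: nothing can be concluded */
    return use.bound >= def.bound;  /* v > use.bound implies v > def.bound */
}
```

With the global versions of Figure 7 (C), the use under `dbg > 3` and the definition under `dbg > 2` test the same version of `dbg`, so the use is classified as safe. Swapping the two bounds, or testing different versions, leaves the use in the candidate set for runtime checking.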
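For the truncation checker of section 4, the division of work between value range propagation and runtime instrumentation can be sketched as a three-way classification. The code below is only an illustration for stores into a 16-bit signed integer. The names `Verdict`, `classify_store_int16` and `truncates_to_int16` are hypothetical, and the value range is assumed to be supplied by some earlier analysis that is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Outcome of the static range check for one store. */
typedef enum {
    ALWAYS_SAFE,          /* no warning, no instrumentation */
    ALWAYS_TRUNCATES,     /* compile-time warning */
    NEEDS_RUNTIME_CHECK   /* instrument the store */
} Verdict;

/* Classifies a store of a value known to lie in [lo, hi] into an int16_t. */
static Verdict classify_store_int16(long lo, long hi)
{
    assert(lo <= hi);
    if (lo >= INT16_MIN && hi <= INT16_MAX)
        return ALWAYS_SAFE;
    if (hi < INT16_MIN || lo > INT16_MAX)
        return ALWAYS_TRUNCATES;
    return NEEDS_RUNTIME_CHECK;
}

/* Runtime predicate that an instrumented store would evaluate. */
static bool truncates_to_int16(long value)
{
    return value < INT16_MIN || value > INT16_MAX;
}
```

Only stores classified as `NEEDS_RUNTIME_CHECK` would pay for a call to a predicate such as `truncates_to_int16` at runtime. The other two classes are resolved entirely at compile time.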