
Using Static and Runtime Analysis to Improve Developer Productivity and Product Quality

Bill Graham and Paul N. Leroux, Todd Landry, QNX Software Systems

April 2008 Static and runtime analysis QNX Software Systems

Abstract

Static analysis can discover a variety of defects and weaknesses in system source code, even before the code is ready to run. Runtime analysis, on the other hand, looks at running software to detect problems as they occur, usually through sophisticated instrumentation. Some may argue that one form of analysis precludes the other, but developers can combine both techniques to achieve faster development and testing as well as higher product quality.

The paper begins with static analysis, which prevents problems from entering the main code stream and ensures that any new code is up to standard. Using techniques such as abstract syntax tree (AST) validation and code path analysis, static analysis tools can uncover security vulnerabilities, logic errors, implementation defects, and other problems, both at the developer’s desktop and at system build time. The paper then explores runtime analysis, which developers can perform during module development and system integration to catch any problems missed by static analysis. Runtime analysis not only detects pointer errors and other violations, but also helps optimize utilization of CPU cycles, RAM, flash memory, and other resources.

The paper then discusses how developers can combine static and runtime analysis to prevent regressions as a product matures. This two-pronged approach helps to eliminate most problems early in the development cycle, when they cost least to fix.

Combining the best of both worlds

Static analysis tools find bugs early in the coding phase of a project, usually before the executing code is built. This early detection is particularly useful in large embedded projects, where developers cannot use runtime analysis tools until the software is complete enough to run on the target system.

Static analysis detects and describes areas of weakness in source code, including security vulnerabilities, logic errors, implementation defects, concurrency violations, rare boundary conditions, and many other problems. For instance, static analysis tools such as Klocwork Insight perform an in-depth analysis of the source code at a syntactical and semantic level; they also perform sophisticated interprocedural control- and data-flow analysis and use advanced techniques to prune false paths, estimate the values that variables will assume, and simulate potential runtime behavior.

Developers can perform static analysis at any time during development, even when only portions of the project are coded; however, the more complete the code, the better. Static analysis can analyze all potential paths through the code; conventional testing rarely does this, unless the project requires 100% code coverage. For instance, static analysis can uncover bugs hidden in “edge cases” or error paths in code not tested during development.

Because static analysis attempts to predict behavior based on models of the source code, it will sometimes detect an “error” when, in fact, none exists — this is called a false positive.


Many modern static analysis tools have implemented advanced techniques to avoid this problem and to perform highly accurate analyses.

Static analysis pros:

• Starts early in the software lifecycle, before the code is ready to run and before testing begins.

• Can analyze existing code bases that have already been tested.

• Can integrate into the development environment, as part of nightly builds and as part of each developer’s desktop toolset.

• Low labor costs: no need to generate test cases or stubs; developers can run their own analyses.

Static analysis cons:

• Can find bugs and vulnerabilities that don’t necessarily cause a crash or impact runtime behavior.

• Non-zero false positive rate.

Table 1 — Static analysis pros and cons.

Runtime analysis tools detect bugs in running code. They allow the developer to monitor or diagnose an application’s behavior at runtime, ideally in the application’s target environment.

In many cases, the runtime analysis tool modifies the source or binaries of the application to provide hooks for instrumentation; these hooks detect runtime bugs, memory usage, code coverage, and other conditions. Runtime analysis tools can also generate accurate stack trace information that allows debuggers to find the cause of an error. Therefore, when a runtime analysis tool finds a bug, it is likely a real error that the developer can quickly identify and fix. That said, the exact runtime conditions for creating the bug must exist for the bug to be detected. Consequently, developers must create a test case for that particular scenario.

Runtime analysis pros:

• Generates few false positives: high productivity rate for errors found.

• Can capture full stack trace and environment to track cause of error.

• Catches errors in the context of the running system, either simulated or real.

Runtime analysis cons:

• Instrumentation impairs realtime behavior; degree of impact depends on amount of instrumentation. Not always an issue, but needs to be considered for time-critical code.

• Completeness of error analysis depends on code coverage. Thus, the code path containing the error must be executed and the test case must create conditions required to create the error.

Table 2 — Runtime analysis pros and cons.


Early detection for lower development costs

The earlier that bugs are found, the faster and cheaper it is to correct them. Thus, static and runtime analysis tools offer real value by finding bugs early in the lifecycle. Various industry studies indicate that fixing an issue during system test (QA) or once the product has shipped is orders of magnitude more expensive than finding and fixing the same issue while the software is still being developed. Many organizations have specific cost-of-defect metrics; Figure 1 shows the numbers reported in a widely cited reference, Applied Software Measurement by Capers Jones.

Figure 1 — As a development project progresses, the cost of fixing software defects can increase exponentially. Static and runtime analysis tools help prevent these costs by finding bugs early in the development lifecycle.

Static analysis

Static analysis has been around almost as long as modern software development practices. In its first form, it included tools such as lint, which developers used on their desktop, within their local sandbox. When it came to bug detection, these early tools focused on low-hanging fruit, such as coding style and common syntactical mistakes. For example, even the most basic static analysis tools can detect the following bug:


    int foo(int x, int* ptr)
    {
        if( x & 1 );
        {
            *ptr = x;
            return;
        }
        ...
    }

Here, the erroneous addition of an extra semicolon leads to potentially disastrous results, with the incoming pointer being dereferenced under unexpected conditions. Whether the tested condition is met or not, the pointer is always dereferenced.
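To make the effect of the stray semicolon concrete, the following minimal sketch places the buggy form next to the form the author presumably intended. The function names store_if_odd_buggy and store_if_odd_fixed are hypothetical and exist only for this illustration.

```c
/* Buggy form: the stray semicolon ends the if statement, so the
 * block that follows is executed on every call. */
int store_if_odd_buggy(int x, int *ptr)
{
    if (x & 1);          /* empty statement: the condition guards nothing */
    {
        *ptr = x;        /* always executed */
        return 1;
    }
    return 0;            /* never reached */
}

/* Intended form: the block is the body of the if statement. */
int store_if_odd_fixed(int x, int *ptr)
{
    if (x & 1) {
        *ptr = x;        /* executed only when x is odd */
        return 1;
    }
    return 0;
}
```

Calling both functions with an even value of x shows the difference: the buggy version still writes through ptr, while the fixed version leaves the target untouched.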

These early tools focused largely on syntactic mistakes. So while they could find serious bugs, most of the problems they uncovered were relatively trivial. Also, the tools had too small a code context to produce accurate results. That is because they operated during a developer’s typical compile/link cycle, and the code on a developer’s desktop tends to represent only a small fraction of the code available within the entire code stream. This shortcoming forced the analysis tools to make estimations or inferences about what happens outside of the developer’s sandbox, leading to an excess of false positive reports.

Subsequent generations of static analysis tools addressed these shortcomings and expanded beyond syntactic and semantic analyses. These new tools build a rich representation or model of the provided code (akin to a compilation phase) and then simulate all possible execution paths through that model, mapping out the flow of logic on those paths, coupled with how and where data objects are created, used, and destroyed. The analysis can span program modules to include interprocedural control and data-flow analysis; it can also minimize false positives through new approaches for pruning false paths, estimating the values that variables will assume, and simulating potential runtime behavior. To generate this level of analysis, static analysis tools have to analyze the entire code base and integrate with a system build, rather than simply operate within the sandbox on a developer’s desktop.

To perform this more comprehensive form of analysis, static analysis tools employ two major types of code checking:

• Abstract syntax tree (AST) validation — For validating the basic syntax and structure of code.

• Code path analysis — For performing more complete types of analysis that depend on understanding the state of a program’s data objects at any particular point on a code execution path.

Abstract syntax trees

An abstract syntax tree, or AST, is simply a tree-structured representation of the source code as might be generated by the preliminary stages of a compiler. This tree contains a rich breakdown of the structure of the code in an unambiguous manner, allowing the tool to perform simple searches for anomalous syntax.

It’s easy to construct AST checkers that enforce standards around naming conventions and function-call restrictions, such as unsafe checks. Anything that can be inferred from the code without requiring knowledge of that code’s runtime behavior is typically a target for AST checking.
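As an illustration of how simple such a checker can be, the sketch below walks a toy AST and counts calls to functions that a coding standard might forbid. The node layout, the banned list, and the name count_banned_calls are all assumptions made for this example; they do not describe the internals of any particular tool.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified AST node; real tools carry far
 * more detail (types, source locations, attributes). */
typedef enum { NODE_BLOCK, NODE_CALL, NODE_IDENT, NODE_LITERAL } NodeKind;

typedef struct AstNode {
    NodeKind kind;
    const char *name;                    /* callee name for NODE_CALL */
    const struct AstNode *children[4];   /* fixed fan-out keeps the sketch short */
    size_t child_count;
} AstNode;

/* Example rule: these library calls are not allowed by the standard. */
static const char *const banned_calls[] = { "strcpy", "gets", "sprintf" };

static int is_banned(const char *name)
{
    for (size_t i = 0; i < sizeof banned_calls / sizeof banned_calls[0]; i++)
        if (strcmp(name, banned_calls[i]) == 0)
            return 1;
    return 0;
}

/* Walk the tree depth-first and count calls to banned functions. */
size_t count_banned_calls(const AstNode *node)
{
    if (node == NULL)
        return 0;
    size_t hits = (node->kind == NODE_CALL && is_banned(node->name)) ? 1 : 0;
    for (size_t i = 0; i < node->child_count; i++)
        hits += count_banned_calls(node->children[i]);
    return hits;
}
```

The checker needs no knowledge of runtime behavior: it only matches node kinds and names, which is why rules of this type are cheap to write and to extend.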

Many tools offer AST checking for a variety of languages, including tools such as PMD for Java. Several tools use XPath, or an XPath-derived grammar, to define the conditions that the checkers look for, and some provide extensibility mechanisms that let users create their own AST checkers. This type of checking is relatively simple to do, and many organizations construct new checkers of this type to enforce corporate coding standards or industry-recommended best practices.

Code path analysis

Let us consider a more complex example. This time, instead of looking for style violations, we wish to check whether an attempted dereference of a pointer should be expected to succeed or fail:

    if( x & 1 )
        ptr = NULL;
    *ptr = 1;

It is obvious from manual inspection that the variable ptr can assume a NULL value whenever the variable x is odd, and that this condition will cause an unavoidable zero-page dereference. Nonetheless, it is a challenge to create an AST checker that can find this type of bug. Consider the (simplified, for clarity) AST that would be created from the above code:

    Statement Block
        If-statement
            Check-Expression
                Binary-operator &
                    x
                    1
            True-Branch
                Expression-statement
                    Assignment-operator =
                        ptr
                        0
        Expression-statement
            Assignment-operator =
                Dereference-pointer
                    ptr
                1


In cases like this, no obvious tree search or simple node enumeration could discover the attempted, and at least occasionally illegal, dereferencing of ptr in a reasonably generalized form. Consequently, the analysis tool cannot simply search for syntax patterns; it must also analyze the lifecycle of data objects as they appear and are used within a control path’s flow of execution.

Code path analysis tracks objects within a code execution path and allows checkers to validate the legality or cleanliness of the data as it gets used. The use of code path analysis expands the number and types of questions that static analysis can answer. Instead of merely checking whether the code is written properly, code path analysis attempts to determine the code’s intent and whether it’s written as intended. It can answer questions such as:

• Is this newly created object released before all aliases to it are removed from scope?

• Is this data object ever range-checked before being passed to an OS function?

• Is this string ever checked for special characters before being submitted as a SQL query?

• Will this copy operation result in a buffer overflow?

• Is it safe to call this function at this time?

By following code execution paths, either forward from a trigger event towards a target scenario, or backwards from a trigger event towards a required initialization, the tool can determine the answers to these questions and provide error reports when the target scenario or initialization either does or does not occur as expected.
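The pointer example earlier in this section can be modeled with a very small forward analysis. The sketch below is an assumption-laden illustration rather than a description of how any commercial tool works: it tracks a single abstract state for ptr, merges the states arriving from the two branches of the if statement, and then checks the dereference. All type and function names are hypothetical.

```c
/* Abstract value tracked for one pointer along a code path. */
typedef enum { PTR_NONNULL, PTR_NULL, PTR_MAYBE_NULL } PtrState;

/* Merge the states that arrive from two branches at a join point. */
PtrState join_states(PtrState a, PtrState b)
{
    return (a == b) ? a : PTR_MAYBE_NULL;
}

/* Result of checking a dereference against the tracked state. */
typedef enum { DEREF_OK, DEREF_POSSIBLE_NULL, DEREF_DEFINITE_NULL } DerefResult;

DerefResult check_deref(PtrState s)
{
    switch (s) {
    case PTR_NONNULL: return DEREF_OK;
    case PTR_NULL:    return DEREF_DEFINITE_NULL;
    default:          return DEREF_POSSIBLE_NULL;
    }
}

/* Models:  if (x & 1) ptr = NULL;  *ptr = 1;
 * The parameter is the state of ptr before the if statement. */
DerefResult analyze_example(PtrState entry)
{
    PtrState true_branch  = PTR_NULL;   /* ptr = NULL on the taken branch */
    PtrState false_branch = entry;      /* ptr is unchanged otherwise */
    PtrState at_deref = join_states(true_branch, false_branch);
    return check_deref(at_deref);
}
```

Even when ptr is known to be valid on entry, analyze_example reports a possible null dereference, which is the finding that a pure syntax search over the AST cannot produce.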

This type of capability is essential to performing sophisticated analysis of source code. Thus, developers should look for tools that use comprehensive code path analysis to locate memory leaks, invalid pointer dereferences, unsafe or tainted data propagation, concurrency violations, and many other problem-causing conditions.

Static analysis workflow

Static analysis can uncover problems at two key points in the development process: at the developer desktop and at system build time. As mentioned, the current generation of tools operate mainly during system build to obtain a complete view of the entire code stream, resulting in highly accurate analyses.

Uniquely, Klocwork Insight provides a desktop analysis capability that, while operating within a developer’s sandbox, doesn’t have the inaccuracy problems typically associated with desktop tools. Klocwork provides a Connected Desktop Analysis capability that can analyze the developer's code with a full understanding of all system dependencies, leading to a local analysis that is every bit as powerful and accurate as centralized system analysis, but that occurs before code check-in.


From a workflow perspective, this capability gives developers the opportunity to run accurate, high-value static analysis at the earliest point in the development lifecycle. Klocwork Insight reports all issues to the integrated development environment (IDE) or command line as developers write their code and conduct their regular desktop compile-and-link process. This reporting occurs prior to any runtime analysis and well before developers check in their code.

[Figure 2 annotations: Klocwork Insight provides connected desktop analysis, allowing developers to remove critical bugs and vulnerabilities prior to code check-in. Klocwork Insight integrates with build environments to perform static analysis with a system-wide perspective. As a result, QA can test code with fewer bugs and focus on functional issues.]

Figure 2 — Static analysis workflow.

Runtime analysis techniques

To detect bugs, runtime analysis tools often insert small segments of code in either the initial source (source code insertion) or the object code (object code insertion). These code segments provide a sanity check on the program state and report an error if they detect something incorrect or invalid. They can perform other functions as well, such as tracking memory allocations and usage over time.

Runtime analysis techniques include:

• Source code insertion: A tool instruments the application’s source code by inserting error detection code before compilation. This approach doesn’t require a detailed knowledge of the execution environment, making it a popular choice for embedded testing and analysis tools. IBM Rational Test RealTime is an example of a tool that uses this technique.

• Object code insertion: The runtime analysis tools have sufficient understanding of the runtime environment to instrument the object code and libraries directly. This technique doesn’t require access to source code or a rebuild of the application. IBM Rational Purify is an example of a tool that uses this technique.

• Compile-time code insertion: The developer invokes specific compiler options to instrument the source code. The compiler itself provides the error detection capability. For instance, the GNU C/C++ 4.x compiler uses Mudflap technology to check for problems in pointer operations.

• Specialized runtime libraries: The developer uses instrumented versions of system libraries to detect errors in passed parameters. Functions such as strcpy() are notorious for causing runtime errors due to null or erroneous pointers; instrumented libraries can detect these bad parameters. This technique doesn’t require a rebuild of the application and has a smaller impact on performance than full source/object instrumentation. The memory analysis tool in the QNX® Momentics® IDE uses this technique.

In this paper we discuss the techniques used by the QNX Momentics development tools, specifically, GCC Mudflap and specialized runtime libraries.

GNU C/C++ Mudflap: compile-time source instrumentation

Introduced in version 4.x of the GNU C/C++ (GCC) compiler, Mudflap uses compile-time instrumentation of source code to perform runtime checks of potential error-causing constructs. Mudflap focuses on pointer operations since these lead to many runtime errors in C and C++ code.

Mudflap adds a pass to GCC’s normal compilation to insert validation code for pointer operations. The inserted code typically checks the value of supplied pointers for validity; invalid pointers will raise a violation that GCC outputs to stderr on the console. Mudflap pointer validation goes beyond checking for null dereferences: it keeps a database of valid memory objects and their properties, such as source location, timestamp, and a stack backtrace to the allocation and deallocation. This database enables rapid lookup when analyzing memory-access operations in the source code.
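The idea of validating an access against a database of known memory objects can be sketched in a few lines of C. This is not Mudflap's implementation or API; the names register_object, unregister_object, and access_is_valid are hypothetical, the table has a fixed size, and the lookup is a linear scan chosen for brevity where a real tool would use a faster structure and store the extra properties mentioned above.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_OBJECTS 64

/* One registered memory object; a fuller database would also hold the
 * source location, timestamp, and allocation backtrace. */
typedef struct {
    uintptr_t base;
    size_t    size;
    int       live;
} ObjectRecord;

static ObjectRecord objects[MAX_OBJECTS];

/* Record an object; returns 0 on success, -1 if the table is full. */
int register_object(const void *base, size_t size)
{
    for (size_t i = 0; i < MAX_OBJECTS; i++) {
        if (!objects[i].live) {
            objects[i].base = (uintptr_t)base;
            objects[i].size = size;
            objects[i].live = 1;
            return 0;
        }
    }
    return -1;
}

/* Forget an object, for example when it is freed or leaves scope. */
void unregister_object(const void *base)
{
    for (size_t i = 0; i < MAX_OBJECTS; i++)
        if (objects[i].live && objects[i].base == (uintptr_t)base)
            objects[i].live = 0;
}

/* Returns 1 if [ptr, ptr + len) lies entirely inside one live object. */
int access_is_valid(const void *ptr, size_t len)
{
    uintptr_t p = (uintptr_t)ptr;
    for (size_t i = 0; i < MAX_OBJECTS; i++) {
        if (!objects[i].live)
            continue;
        uintptr_t start = objects[i].base;
        if (p >= start && p - start <= objects[i].size &&
            len <= objects[i].size - (p - start))
            return 1;
    }
    return 0;
}
```

Inserted validation code would call a check of this kind before each pointer access and raise a violation when it fails, which covers null pointers, stale pointers to released objects, and accesses that run past the end of an object.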

Library functions such as strcpy() don’t check passed parameters, nor are they instrumented with Mudflap. However, Mudflap can create a symbol wrapper for statically linked libraries or a symbol interposition for dynamic libraries. This technique creates a layer between the application and the library, allowing Mudflap to validate parameters and to report violations. In cases where link-time validation doesn’t work, Mudflap uses heuristics based on knowledge of the application’s memory boundaries (heap, stack, code and data segments, etc.) to determine the validity of returned pointers.

Using GCC command-line options, the developer can enable Mudflap instrumentation and control behaviors such as violation handling, extra checking, heuristics, tuning, and introspection. For example, the -fmudflap option invokes the default Mudflap configuration. GCC reports Mudflap violations through the command line or console output (stderr). This detailed output gives information about the violation and about the variables, code location, and functions involved. The IDE can automatically import this somewhat detailed and cryptic information to provide visual notification and stack trace information. From there, the developer can quickly navigate to the appropriate location in the source code.

Figure 3 shows an example of an IDE reporting an error, along with relevant backtrace information. The backtrace acts like a link back to the source code, allowing the developer to quickly track down the cause.

Mudflap can increase build times and reduce runtime performance. Data reported in “Mudflap: Pointer Use Checking for C/C++” [1] shows that, when code is instrumented with Mudflap, build times take 3 to 5 times longer and programs run 1.25 to 5 times slower. Clearly, developers of time-critical code must apply Mudflap instrumentation judiciously. Nonetheless, Mudflap offers a powerful tool for uncovering error-prone and potentially fatal code constructs. QNX plans to utilize Mudflap in a future version of its runtime analysis tools.

[Figure 3 callouts: click on an error to access its backtrace; click on the backtrace to view or edit the offending source line.]

Figure 3 — Using backtrace information displayed in the QNX Momentics IDE to investigate the source of an error.

Runtime library instrumentation

Rather than use source code instrumentation, which can incur significant build and runtime penalties, developers can use pre-instrumented runtime libraries. These libraries place additional code around function calls to validate the received parameters. For example, the good old string copy function:

    strcpy(a,b);

has two parameters, source b and destination a, both pointers to type char. Despite its simplicity, this function has numerous possible errors:

• If a is null or invalid, copying to this destination will cause a memory access violation.

• If b is null or invalid, reading this source will cause a memory access violation.

• If b is missing a null termination, more than the intended number of bytes will be copied to the destination.

• If b is larger than the allocation size of a, more bytes than intended will be copied (a typical buffer overrun scenario).

An instrumented library checks the values of ‘a’ and ‘b’; it also checks their lengths to ensure they are compatible. If the library detects an invalid parameter, it reports a violation. In QNX Momentics, the IDE imports these errors from the target system and displays them onscreen. QNX Momentics also uses this technique to monitor memory allocations and deallocations, allowing for in-depth memory analysis.

The instrumented library will work with any application that uses its available functions; no code changes are required. Moreover, the developer can add the library during application startup, and the library will replace portions of the full standard library, eliminating the need for a complete instrumented libc. In the QNX Momentics IDE, the developer can add this library when launching a program as part of a normal interactive debug session. Figure 4 shows the technique that QNX Momentics uses to detect and report memory errors.

Instrumented libraries offer a proven, nonintrusive method to catch errors in library calls. This technique is ideal for memory analysis and other types of analysis that depend on matched pairs of system calls; for instance, malloc() and free(). That said, the technique can detect runtime errors only in code associated with library calls; it cannot detect more typical errors, such as inline pointer dereferences or bad pointer arithmetic. Also, it typically instruments only a subset of system calls in this way. For details on what is instrumented by QNX Momentics, see [2].


[Figure 4 content. Application code:

    str = (char *) malloc(10);
    strcpy(str, "AbCdEfGhIjK");
    strcpy(mystring, str);
    free(str);

Intercepted calls: malloc(), strcpy(), strcpy(), free(). The debug library traps these calls on behalf of the memory analysis tool.]

Process allocates 10 bytes, but then attempts to copy a string of 11 bytes.

Figure 4 — Memory analysis works by trapping memory-related API calls.
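The scenario in Figure 4 can be reproduced with a small set of wrapper functions. The sketch below is only an illustration of the trapping idea: traced_malloc, traced_strcpy, traced_free, and the two reporting helpers are hypothetical names, not the interface of the QNX debug library. It also assumes that the destination of a copy is the start of a tracked heap block, and it cannot detect a source string that lacks a null terminator.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_BLOCKS 128

/* One tracked heap block: its address and requested size. */
typedef struct { void *ptr; size_t size; } Block;

static Block  blocks[MAX_BLOCKS];
static size_t violations;

static Block *find_block(const void *p)
{
    if (p == NULL)
        return NULL;
    for (size_t i = 0; i < MAX_BLOCKS; i++)
        if (blocks[i].ptr == p)
            return &blocks[i];
    return NULL;
}

/* malloc() stand-in that remembers the size of each block. */
void *traced_malloc(size_t size)
{
    void *p = malloc(size);
    if (p == NULL)
        return NULL;
    for (size_t i = 0; i < MAX_BLOCKS; i++) {
        if (blocks[i].ptr == NULL) {
            blocks[i].ptr = p;
            blocks[i].size = size;
            return p;
        }
    }
    free(p);             /* table full: refuse rather than lose track */
    return NULL;
}

/* free() stand-in: a non-null pointer that was never tracked is a violation. */
void traced_free(void *p)
{
    if (p == NULL)
        return;
    Block *b = find_block(p);
    if (b == NULL) {
        violations++;
        return;
    }
    b->ptr = NULL;
    free(p);
}

/* strcpy() stand-in: records a violation and returns NULL instead of
 * copying when a parameter is bad or the destination is too small. */
char *traced_strcpy(char *dst, const char *src)
{
    Block *b = find_block(dst);
    if (b == NULL || src == NULL || strlen(src) + 1 > b->size) {
        violations++;
        return NULL;
    }
    return strcpy(dst, src);
}

size_t traced_violation_count(void) { return violations; }

/* Blocks still allocated; a nonzero value at exit suggests a leak. */
size_t traced_live_blocks(void)
{
    size_t n = 0;
    for (size_t i = 0; i < MAX_BLOCKS; i++)
        if (blocks[i].ptr != NULL)
            n++;
    return n;
}
```

With these wrappers, allocating 10 bytes and then copying the 11-character string from Figure 4 is refused and counted as a violation, while the count of live blocks gives a simple leak check based on matched allocation and release calls.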

Runtime analysis workflow

In a nutshell, runtime analysis consists of capturing violations or other significant events on the embedded target, importing that information into the development environment, and using visualization tools to quickly track the source of the error.

As Figure 5 indicates, runtime analysis not only detects errors, but also helps the developer to fine-tune consumption of memory, CPU cycles, disk space, and other resources. The workflow consists of several steps, and a well-designed runtime analysis tool will provide robust support for each one:

1 Observe: First, the tool catches runtime errors, detects memory leaks, and displays the results in the IDE.

2 Correct: Next, the tool allows the developer to trace each error back to the offending source line. A well-integrated IDE will display each error onscreen; the developer simply has to click on an error to open the source at the offending line. In many cases, the developer can solve the problem quickly by using the available stack trace and additional source tools (function browsers, call trace, etc.) in the IDE.

3 Profile: Having eliminated detected errors and leaks, the developer can now analyze resource usage over time, including average usage, peak usage, and overhead. Ideally, the tool will provide a visual presentation for long-term analysis, allowing the developer to identify spikes in usage and other anomalies.

4 Optimize: Using the tool’s profiling information, the developer can now fine-tune each program’s resource usage. Among other things, this optimization can minimize peak resource usage and overhead, including memory and CPU time.
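The Profile step boils down to simple statistics over usage samples collected from the target. As a rough sketch, assuming the tool delivers heap usage as an array of byte counts taken at regular intervals, the function below computes the peak, the average, and the number of samples that exceed a chosen multiple of the average. The UsageProfile type, the profile_usage name, and the spike_factor parameter are hypothetical.

```c
#include <stddef.h>

/* Summary of a series of resource-usage samples, in bytes. */
typedef struct {
    size_t peak;      /* highest sample */
    double average;   /* mean of all samples */
    size_t spikes;    /* samples above spike_factor times the average */
} UsageProfile;

UsageProfile profile_usage(const size_t *samples, size_t count,
                           double spike_factor)
{
    UsageProfile p = { 0, 0.0, 0 };
    if (count == 0)
        return p;

    double total = 0.0;
    for (size_t i = 0; i < count; i++) {
        if (samples[i] > p.peak)
            p.peak = samples[i];
        total += (double)samples[i];
    }
    p.average = total / (double)count;

    /* Second pass: flag samples well above the long-term average. */
    for (size_t i = 0; i < count; i++)
        if ((double)samples[i] > spike_factor * p.average)
            p.spikes++;
    return p;
}
```

A large gap between peak and average, or a nonzero spike count, points the developer at the allocations worth examining in the Optimize step.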


[Figure 5 content:

Observe: catch runtime errors; monitor resource leaks.

Correct: back trace to offending line of source code; remove errors and resource leaks.

Profile: review peak resource usage, average usage, overhead.

Optimize: fix abnormal usage patterns; reduce peak usage.]

Figure 5 — A typical runtime analysis workflow.

Combining analysis workflows in the developer’s environment

Static and runtime analysis tools each have their own strengths. Consequently, development teams should use them in tandem. For example, static analysis tools can find errors that runtime analysis tools miss, since runtime tools can detect a bug only if the code containing the bug is executed during testing. Runtime tools, on the other hand, find definitive bugs in the software; there is usually little dispute over a bug when it’s obvious that a null pointer is being dereferenced.

Ideally, a developer will use both types of tools on a daily basis — a task made easier if the tools are well-integrated in the desktop development environment.

Here’s an example of using the two types of tools together:

1. The developer starts the day by looking through the report from the overnight build. This report includes errors in the build as well as static analyses that ran as part of the build.

2. The static analysis report lists defects, along with information that can help fix the defects, including links to source code. Using the IDE, the developer can mark each report as either a defect or a false positive, then fix any actual defects.

3. The developer saves the fixes, along with any new code, locally within the IDE. The developer doesn’t commit these changes back to the source control system until they have been analyzed and tested.

4. The developer analyzes and corrects any new code, using static analysis on the local desktop. To ensure better error detection and fewer false positives, this analysis uses system-wide information created as part of the nightly build/analysis process.

5. Having analyzed and “cleaned” any new code, the developer builds the code into a local test image or executable.

6. Using runtime analysis tools, the developer runs tests to validate the changes.


7. Using the IDE, the developer can quickly locate and fix errors reported by the runtime analysis tools. The code is considered clean and ready to go when static analysis, testing, and runtime analysis are complete.

8. The developer checks the changes into source control; the changes then become part of the next nightly build.

This workflow is similar to that in a medium-to-large project where nightly builds, source control, and code ownership already exist. Because the tools are integrated into the IDE, developers can quickly perform static and runtime analyses without diverging from the typical workflow. As a result, code quality is greatly increased right at the source.

The role of RTOS architecture

A discussion of RTOS architecture may seem out of place in a discussion of static and runtime analysis tools. But as it turns out, a well-designed RTOS can make many programming errors easier to detect, isolate, and resolve.

For instance, in a microkernel RTOS like QNX Neutrino, applications, device drivers, file systems, and networking stacks all reside outside of the kernel in separate address spaces. As a result, they are isolated from the kernel and from one another. This approach offers superior fault containment: a fault in one component won’t bring down the entire system. Moreover, it’s easy to isolate a memory or logic error down to the component that caused it.

For instance, if a device driver attempts to access memory outside its process container, the OS can identify the process responsible, indicate the location of the fault, and create a process dump file viewable with source-level tools. Meanwhile, the rest of the system can continue to run, allowing the developer to isolate the problem and to focus on resolving it.


Figure 6 — In a microkernel OS, memory faults in drivers, protocol stacks, and other services cannot corrupt other processes or the kernel. Moreover, the OS can immediately detect an invalid attempt to access memory and indicate what line of code made the attempt.

Compared to conventional OS kernels, a microkernel also provides dramatically faster Mean Time to Repair (MTTR). Consider what happens if a device driver faults: the OS can terminate the driver, reclaim the resources the driver was using, and then restart the driver, often within a few milliseconds. With conventional monolithic kernels, the device would have to be rebooted, a process that can take seconds to minutes.

The bottom line

Static analysis tools can find bugs even before code is ready to run. They can also find bugs that go undetected in unit, integration, and system testing, since complete coverage for complex applications is very difficult and expensive to achieve. Moreover, teams can use static analysis tools during regular system builds to ensure that every piece of new code is analyzed.

Runtime analysis tools, meanwhile, support integration and testing efforts by reporting runtime errors (or potential problems) back into the development environment; they also provide complete backtrace information to track down the error. Using this information, developers can perform postmortem debugging of mysterious program failures and system crashes in significantly less time. Runtime analysis can help determine the root cause of a problem through stack and variable traces, a better alternative than coding “if (ptr != NULL)” checks to work around errors.

This combination of early detection, better test coverage and completeness, and thorough error correction helps developers create better software, and in less time.


Bibliography

[1] Eigler, Frank Ch., “Mudflap: Pointer Use Checking for C/C++”, Proceedings of the GCC Developers Summit 2003, pp. 57-70. http://www.linux.org.uk/~ajh/gcc/gccsummit-2003-proceedings.pdf

[2] “Heap Analysis: Making Memory Errors a Thing of the Past”, QNX Neutrino RTOS Programmer’s Guide. http://pegasus.ott.qnx.com/download/download/16853/neutrino_prog.pdf


About QNX Software Systems

QNX Software Systems is the leading global provider of innovative embedded technologies, including middleware, development tools, and operating systems. The component-based architectures of the QNX® Neutrino® RTOS, QNX Momentics® development suite, and QNX Aviage® middleware family together provide the industry’s most reliable and scalable framework for building high-performance embedded systems. Global leaders such as Cisco, Daimler, General Electric, Lockheed Martin, and Siemens depend on QNX technology for network routers, medical instruments, vehicle telematics units, security and defense systems, industrial robotics, and other mission- or life-critical applications. The company is headquartered in Ottawa, Canada, and distributes products in over 100 countries worldwide.

About Klocwork

Klocwork is an enterprise software company providing automated source code analysis software products that automate security vulnerability and quality risk assessment, remediation and measurement for C, C++ and Java software. More than 250 organizations have integrated Klocwork's automated source code analysis tools into their software development process in order to ensure their code is free of mission-critical flaws while freeing their developers to focus on what they do best: innovate.

www.qnx.com www.klocwork.com

© 2008 QNX Software Systems GmbH & Co. KG., a subsidiary of Research In Motion Limited. All rights reserved. QNX, Momentics, Neutrino, Aviage, Photon and Photon microGUI are trademarks of QNX Software Systems GmbH & Co. KG, which are registered trademarks and/or used in certain jurisdictions, and are used under license by QNX Software Systems Co. All other trademarks belong to their respective owners. 302130 MC411.59