Software

BETTER PLAY IT SAFE
Using static analysis tools for safety certification

By Robert Dewar, PhD, and Ben Brosgol, PhD

Building reliable software is difficult but achievable. Choosing an appropriate language is important but is only the first step. The careful selection of a coordinated set of tools is just as important or perhaps even more so. For safety-critical systems, using qualified verification tools that tell as much as possible about the software as early as possible helps increase confidence in the system’s correctness while reducing the costs for the system’s certification.

Reprinted from VMEbus Systems / April 2006. Copyright 2006.

Complying with safety-critical standards such as DO-178B involves demonstrating that a system meets its requirements and does not introduce safety hazards. An underlying issue is that the certification evidence is based on the system’s static source text but needs to relate to the system’s dynamic, runtime behavior. How does one guarantee prevention of runtime insecurities such as misusing an integer as an address, exhausting memory resources, selecting a nonexistent element in a data structure, buffer overflow, referencing a variable before it is initialized, or accessing shared data across concurrent activities without protection?

The programming language chosen and the set of features used are obviously important in the development of safety-critical systems. But even with a language designed for reliable programming, such as Ada, it will be possible to write programs that compile but that encounter such issues at runtime.

A solution is to use static analysis tools that can detect and thus prevent potential runtime insecurities. If such a tool is being used to replace a process that would otherwise be done manually during the system certification, then, in DO-178B parlance, the tool needs to be “qualified as a verification tool.” If the tool is such that its failure could introduce errors into the system, then it would need to be qualified as a development tool, a much more stringent requirement.

Among the static analysis tools applicable in a DO-178B certification context, a stack usage analyzer is particularly important for data space predictability. This tool calculates the maximum amount of stack space that a program would ever need. Such information is especially relevant in memory-constrained environments such as VMEbus systems.

Tool qualification and software certification

Before considering specific tools that assist in the production of safety-critical certified code, let’s first have a look at the issue of achieving confidence in the tools themselves. Obviously it is desirable to use tools that are known to be 100 percent reliable and free of any errors.

A first view would be that the tools should be certified with the same rigorous approach that is used for safety-critical applications. After all, if life safety depends on the program’s correctness, shouldn’t any tools that are used in its production also be totally reliable?

Unfortunately, this is not practical. Why not? The difficulty is that large systems are written using large and complex languages. Experience has shown that in order to develop maintainable applications, languages with rich features are needed. It is certainly possible to design very simple languages, but in practice such languages are not suitable for building large, modern, complex applications.

Even C now has an international standard document that is very large: the 1999 version exceeds 500 pages, comparable in size to the Ada standard. Compilers for modern languages are themselves very large and complex programs; a compiler for a language such as Ada can comprise a million lines of code or more. Furthermore, modern microprocessor design requires compilers to perform extensive optimizations. Indeed, the requirement for sophisticated optimization is becoming more stringent as time goes on; for example, reasonable performance on the Itanium (IA-64) microprocessor architecture is only possible for compilers carrying out advanced optimization algorithms. The Free Software Foundation’s GCC compiler implements many such optimizations but still does not realize the full potential of the Itanium chip. Development continues at a rapid pace, but so far no compiler achieves the chip’s full performance potential.

Not only are the compilers themselves out of reach of formal safety certification because of their inherent complexity, but to make things worse, there are no formal specifications of these languages suitable for certification. International standard documents define the languages in sufficient detail for programmers and compiler writers, but these definitions are not at the formal, complete level needed for certification using DO-178B or similar approaches.

So that sounds bad. What do software developers do if the tools they use cannot be shown to be completely reliable? The answer is that in the certification process, they either certify at the object code level, bypassing the issue of whether the compiler is fully reliable, or they certify at the source level and include a detailed analysis showing the correspondence of the source with the object code. Vendors such as Verocel provide tools that assist at the object code level. In either case, the focus is on the object code, so if the compiler fails to generate correct code, this will be discovered during the certification process.

The situation is similar for any tool that is involved in actual code generation. However, there is a brighter side to this picture. Software developers also use tools that are not directly involved in code generation, but instead provide information about the program. Such analysis tools have a completely different status from the compiler generating object code. An error in an analysis tool does not directly cause any error in the resulting program. At worst, the tool may complain about a nonexistent problem, which can be ignored, or it may miss something it should have caught, which will be found later during object code certification.

DO-178B recognizes the difference between a tool such as a compiler, which generates code that is part of the operational system, and an analysis tool that does not generate code. The differences between the qualification requirements for these tools are illustrated in Figure 1. The qualification procedure for analysis tools or verification tools still requires careful generation of objectives and thorough testing, but does not operate at the same stringent level as certification of the avionics application code itself.

Figure 1

    Does the tool replace processes otherwise required by DO-178B?
        No:  No need to qualify.
        Yes: Continue with the questions below.

    Does the tool produce output that is part of the airborne software
    (i.e., can the tool introduce errors)?
        Yes: Need to qualify as development tool.

    Can the tool fail to detect errors?
        Yes: Need to qualify as verification tool.
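The decision flow of Figure 1 can be sketched as a small Ada function. This is only an illustration, not part of DO-178B or of any tool: the type `Qualification`, the function `Classify`, and its parameter names are hypothetical, and the order in which the last two questions are asked, as well as the result when both answers are "no," are assumptions made here because the figure does not spell them out.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Qualification_Demo is

   --  Hypothetical type and names, invented for this sketch.
   type Qualification is
     (No_Qualification_Needed, Verification_Tool, Development_Tool);

   --  Encodes the three questions of Figure 1.
   function Classify
     (Replaces_Required_Process : Boolean;
      Output_In_Airborne_Code   : Boolean;
      Can_Fail_To_Detect_Errors : Boolean) return Qualification is
   begin
      if not Replaces_Required_Process then
         return No_Qualification_Needed;
      elsif Output_In_Airborne_Code then
         --  The tool can introduce errors into the system.
         return Development_Tool;
      elsif Can_Fail_To_Detect_Errors then
         return Verification_Tool;
      else
         --  Assumption: this case is not covered by the figure.
         return No_Qualification_Needed;
      end if;
   end Classify;

begin
   --  A code generator used in place of a required process.
   Put_Line ("Code generator:       "
             & Qualification'Image (Classify (True, True, False)));
   --  A stack usage analyzer: it only provides information.
   Put_Line ("Stack usage analyzer: "
             & Qualification'Image (Classify (True, False, True)));
   --  A tool that replaces no required process.
   Put_Line ("Text editor:          "
             & Qualification'Image (Classify (False, False, False)));
end Qualification_Demo;
```

Using an enumeration rather than two Boolean results keeps the three outcomes mutually exclusive, which mirrors the way the figure ends in exactly one box.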

Qualified tools can be an important component in the production of safety-critical code. By using analysis tools, the software developer can find errors earlier, which reduces development expense. By using qualified analysis tools, the developer can get credit for some of the automated verification work that otherwise would need to be done manually, which reduces the certification expense.

Programming language choice

Although they cannot be certified with regard to object code accuracy, compilers actually do far more than just generate code. Modern languages have been carefully designed so that the compiler can detect many problems before a program is even run. Most notably, the strong typing system of a language like Ada can find many errors early on. As an example, suppose a program needs to deal with physical units; in Ada, this might be expressed as follows, using distinct types:

    type Temperature is range 20 .. 30; -- Celsius
    type Pressure is range 0 .. 1000;

Suppose the programmer writes:

    Cabin_Pressure : Pressure;
    Cabin_Temperature : Temperature;
    …
    if Cabin_Temperature > Cabin_Pressure then …

The comparison is obviously a mistake, because of incompatible units. In C, C++, or Java, where these variables would likely have been declared simply of type int, this error would not be found until testing; in Ada, however, the two variables have distinct types. Since Ada’s predefined “>” operation requires operands of the same type, the error would be detected at compile time in Ada, and the compiler would reject the program. It is, of course, much cheaper when the compiler detects such errors than if time is spent debugging to track them down.

Although the compiler cannot be absolutely trusted to generate correct code, in practice it is safe to rely on the compiler to reject incorrect programs, and to make use of these error messages to find problems early. If an incorrect error message is generated, then that’s annoying but does not compromise safety. If a message is missed, that’s also annoying because the error won’t be found until later in the certification process, but still there is no compromise to safety.

In the case of Ada, there is a formal validation process for Ada compilers that was initially supervised by the U.S. Department of Defense through the Ada Joint Program Office and is now a codified international standard. Ada is the only language for which such a standard exists. Although validation cannot provide an absolute guarantee of correctness, it does provide a considerable level of assurance that the compiler correctly interprets the language, and it is reasonable to rely on such a compiler with respect to properly rejecting invalid programs.

The programming language has a key role in helping to prevent errors, but a language is only as good as the software development environment that supports it. A compiler that correctly implements the language is important but is not sufficient. Let’s now look at some other important attributes for a compiler in the context of safety-critical development, and then consider the role of static analysis tools in general and a stack usage analyzer in particular.

The compiler as a static analysis tool

Although compilers are usually thought of as tools for generating code, they can often perform much more extensive tasks of program analysis. In addition to the detection of incorrect programs, a good compiler can provide extensive warnings about suspicious code that is not actually wrong but which represents likely errors or at least code that should be reviewed.

As an example, consider an Ada program that contains:

    with Vector_Library;
    package Fire_Control is …

This fragment is the start of a package specification for Fire_Control, a module that may contain the declaration of types and relevant procedures and functions. Note that the ability in Ada to separate package specification and implementation is in itself a valuable aid in designing safe and reliable programs. The with construct says that this package specification uses features from another package named Vector_Library.

If this program is submitted to GNAT Pro, the Ada compiler from AdaCore, the compiler will check whether features from Vector_Library are actually used in the specification of Fire_Control. If not, it will generate a warning that the with construct is redundant. This is extremely valuable, since a critical element in the generation of complex programs is to minimize intermodule dependencies. Perhaps in this case it is the implementation rather than the specification of Fire_Control that references Vector_Library entities (in which case GNAT Pro will specifically recommend moving the with construct to the body), or perhaps it is entirely unused as a result of maintenance activity, in which case the with construct should be removed entirely.

Again the Ada language helps here because dependencies are a fundamental part of the language design. In C or C++, the approach of using #include means that the compiler does not clearly see the dependency structure, so it is much harder to detect such problems.

As another example, consider the assignment statement where X, Y, and Z are from some integer type:

    X := Y + Z;

Let’s address the issue of possible overflow in this statement. In C or C++, signed integer overflow is simply undefined: a program with an overflow has a completely undefined effect. In Java, there is no overflow, as the result just silently wraps around (adding two very large integers can yield a negative result), so overflow is not an error. At least it is not an error from the point of view of the language definition, although the application may not agree with this. In Ada, an overflow results in a runtime exception, certainly a better outcome, though in a certified application all such overflow situations would likely be eliminated. After all, it was a similar situation that caused the premature destruction of the first Ariane-5 rocket.

The compiler may be able to detect that overflow is certain or likely in such a situation and again may be able to issue a warning. Such warnings are not possible in Java since there is no way of knowing whether the programmer intended wraparound, but in C, C++, or Ada, there are situations in which warnings can be generated. Ada has the advantage of declared subranges, making it possible to generate warnings in many more cases. Here’s an example of the diagnostic messages from AdaCore’s GNAT Pro:

    X : Integer range 1 .. 5;
    Y : Integer range 10 .. 15;
    …
    X := X + Y;
           |
    >>> warning: value not in range of subtype…
    >>> warning: "Constraint_Error" will be raised at runtime

Modern compilers such as GNAT Pro have an extensive set of warnings, and, in practice, many critical programs are compiled in a mode that treats all warnings as fatal errors. By addressing all such warnings, the programmer can eliminate many errors, even before the unit testing stage.

Specialized languages

Compilers can issue useful error and warning messages because a program’s source text contains considerable information. For example, since variables need to be explicitly declared, the compiler can check that each usage of a variable is consistent with its type.

Imagine a language in which a program’s text contained all sorts of things that could then be checked. These would be language features having nothing to do with power of expression or efficient code generation, the two usual focuses of language design. Instead, they would be designed solely with the aim of being able to generate an even more useful set of error messages.

An interesting example of this approach is the SPARK language from Praxis Critical Systems. SPARK is a small subset of Ada that is then significantly expanded by the addition of annotations in Ada comment syntax that say things about the program. For example, if a global variable is used, the program explicitly indicates who is allowed to read and write this variable. A procedure can include preconditions that must be satisfied when the procedure is called and postconditions that must be satisfied when it returns.

The software developer writes the program in SPARK, including annotations, and then uses the SPARK Examiner tool to “compile” the code. Why is “compile” in quotation marks here? Well, the SPARK Examiner is an unusual kind of compiler. It does not generate code at all. Instead, it is only in the business of checking the program. Not only does it check for the standard Ada errors, it also checks the consistency of the annotations.

This approach results in a far more reliable program, before it is even fed to the real compiler. Note that SPARK programs look like legal programs to an Ada compiler, so a standard Ada compiler such as GNAT Pro is used to actually generate code. SPARK has been extensively used for generating safety-critical and high-security programs. An interesting point is that since SPARK is a small subset, it does have a clear formal definition. The SPARK Examiner is not generating code, so it is a much simpler program than a conventional compiler. As a result, it is feasible to certify the examiner at a rigorous level. In fact, the SPARK Examiner is written in SPARK and can successfully examine itself to show that it is free of any inconsistencies.

Static analysis tools

As we have discussed, compilers can perform significant analysis of the programs they compile. They are still, however, basically in the business of generating code, so analysis remains a by-product of the compilation process and is limited in scope.

A number of tools exist that are dedicated to examining a program for errors and are not involved at all with the issue of code generation. Examples of such tools may be found in the offerings of SofCheck and GrammaTech. Tools from these companies read a program and attempt to operate much as a human would in examining a program for errors. They trace possible flows of control, looking for suspicious usage such as uninitialized variables, possible incorrect values, conditions that are always true or false, and out-of-range subscripts in arrays. See Figure 2 for a short example of the sorts of analyses that such tools can perform.

    procedure Example is
       Flag : Boolean := True;
       M, N : Integer range 10..20;
       K : Integer range 1..10;
       Buffer : array (1..10) of Float;
       X : Float;                 -- Warning: Variable X not used
    begin
       if Flag then
          M := 10;
          N := 11;                -- Warning: Variable N overwritten before being referenced
          N := 12;
       else
          K := 1;                 -- Warning: Unreachable code
       end if;
       Buffer( K ) := 0.0;        -- Warning: Reference to uninitialized variable K
       Buffer( M+N ) := 1.0;      -- Warning: Array index M+N out of range 1 through 10
    end Example;

Figure 2

Of course, no such program can find all problems, but in practice many potential problems can be found. The issue is very similar to the value of warning messages from a compiler, but a program that concentrates on warnings can do a more extensive job than a compiler in this area. Again, the developer is not relying on such tools for safety, but finding an error earlier rather than later is important in increasing reliability and decreasing costs.

Such tools typically require a lot of information about the program, and thus must perform an extensive analysis of the structure of the program, similar to the analysis that must be performed by a compiler. In the case of Ada, there is an international standard defining a mechanism for a compiler to provide this information for use by other tools. The Ada Semantic Interface Specification (ASIS) library used for this purpose means that for Ada, tools need not duplicate the analysis algorithms of a compiler, which greatly simplifies the development of such tools. Interestingly, no comparable standard exists for C, C++, or Java, though there is no technical reason that prevents such a standard.

Runtime resource issues, stack checking

A program can be completely correct, and be subjected to all kinds of static analyses that fail to indicate any problems, and yet the program may fail at execution time. One very common case of this is the exhaustion of memory resources. This is particularly relevant for typical programs in the VMEbus environment where physical memory is often very limited and where programs thus need to have a small “footprint” in both code and data.

Dynamic allocation is often completely avoided in such systems precisely because it is hard to ensure that memory is not exhausted. Ada provides a convenient mechanism for such avoidance, built into the language. The program can contain the directive:

    pragma Restrictions (No_Allocators);

and the compiler will then reject the program if it contains any allocator, that is, any explicit use of dynamic allocation. Other languages, including C and C++, lack a corresponding mechanism in the language, but external analysis tools could provide a similar capability. However, there is an important form of dynamic allocation that cannot be statically eliminated, namely the allocation of space for the runtime stack or, in the case of a multi-threaded program, several stacks, one for each task in the system.

How is it possible to ensure that a program will not run out of stack space during execution? On a small memory system this is a major concern, and running out of stack space can bring down a plane just as surely as any bug in the program logic. To answer this question, consider an interesting example of an analysis tool in the AdaCore tool suite.

This tool is a static analyzer for stack usage. It has two components. First, the compiler itself calculates the maximum amount of stack used by each procedure and function in the program, based on the actual generated object code. Second, an analysis program looks at the structure of each task in the system and computes the total amount of stack space that can be used in the worst case by each task.

As always, it is in the details of such a tool that the real difficulties arise. For example, there are constructs in Ada that generate stack allocations, the size of which is not known until runtime (similar to alloca in C or C++). The tool can find worst-case values, but in practice, the worst case will be far too pessimistic. What the tool does in such cases is to warn the programmer and point out the possible trouble spot. The programmer can then either eliminate this usage or prove a practical maximum usage that can be fed back into the tool.

Similarly, recursion can lead to unbounded stack use. Again, the tool will point out the possible problem. The programmer can then react by eliminating the recursion, often the best choice in memory-limited systems, or by proving that the recursion is safely bounded.

The stack analysis tool does not generate code; it only provides information. It can thus be qualified as a verification tool in the DO-178B sense and can offer critical feedback early on to address possible memory limitations. Such limitations can be extremely costly to fix if not found until the formal certification process, possibly requiring major redesign.

Robert Dewar is CEO and president of AdaCore, which supplies tools for Ada software development. He is the author of numerous articles and books in the area of programming languages and microprocessor architecture. Robert has specialized in the Ada language and related technologies, and he is also a professor of Computer Science. He holds a PhD from the University of Chicago.

Ben Brosgol has been involved in the design and development of the Ada language for nearly 30 years since its inception. He has designed Ada and real-time training programs for Alsys, Aonix, and AdaCore, has published numerous articles and papers on Ada, and has consulted on a number of critical Ada programs. Ben is currently a senior member of the AdaCore technical staff. He holds a PhD from Harvard University.

To learn more, contact Robert and Ben at:

AdaCore
104 Fifth Ave., 15th Floor
New York, NY 10011
Tel: 212-620-7300
Website: www.adacore.com
