HOW THINGS WORK

Automated Code Review Tools for Security

Gary McGraw, Cigital

Static analysis identifies many common coding problems automatically, before a program is released.

Computer security has experienced important and fundamental changes over the past decade. The most promising developments in security involve arming software developers and architects with the knowledge and tools they need to build more secure software.

Among the many security tools available to software practitioners, static analysis tools for automated code review are the most effective. Here's how they work, and why all developers should use them.

THE RISE OF SOFTWARE SECURITY

Traditional approaches to computer security focused almost exclusively on the network; the idea was to keep malicious hackers away from vulnerable machines by placing a barrier between the two. Security vendors introduced the network firewall in the late 1980s as a way of creating such a barrier between a local area network and the Internet. Although firewalls certainly have their place in computer security and have since become ubiquitous, serious security problems persist.

Since the late 1990s, a new paradigm in computer security has evolved: software security (sometimes called application security). Software security is the engineering of software so that it continues to function correctly under malicious attack.

Although software security is relatively young as a discipline, much progress has been made on ways to integrate security best practices into the software development life cycle.

Microsoft has helped to spearhead software security through its Trustworthy Computing Initiative and the resulting Security Development Lifecycle (SDL). My company, Cigital, also has been instrumental in bringing software security to the wider market through thought leadership and professional services.

Numerous software security best practices have emerged in several methodologies, including OWASP (Open Web Application Security Project) CLASP (Comprehensive, Lightweight Application Security Process), Microsoft's SDL, and Cigital's Touchpoints. Figure 1 shows the Cigital Touchpoints as described in Software Security: Building Security In (McGraw, 2006). Without question, the top two best practices are code review (with a static analysis tool) and architectural risk analysis.

Among these two best practices, code review with a static analysis tool is the easiest and most straightforward to adopt. There are two reasons for this. First, every software project has code that can be reviewed (they should all have an architecture too, but that's a topic for another article). Second, code review has been partially automated with sophisticated tools.

WHY CODE REVIEW TOOLS?

Many security problems are caused by simple bugs that can be spotted in code. For example, a buffer overflow vulnerability is the common result of misusing various string functions, including strcpy() in C.

Using a tool makes sense because code review is boring, difficult, and tedious. Analysts who practice code review often are well familiar with the "get done, go home" phenomenon described in Building Secure Software: How to Avoid Security Problems the Right Way (J. Viega and G. McGraw, 2001). It's all too easy to start a review with diligence and care, cross-referencing definitions and variable declarations, and end it by giving function definitions (and sometimes even entire pages) only a cursory glance.

Programmers make little mistakes all the time: a missing semicolon here, an extra parenthesis there. Most of the time, such gaffes are inconsequential; the compiler notes the error, the programmer fixes the code, and the development process continues. This quick cycle of feedback and response stands in sharp contrast to what happens with most security vulnerabilities, which can lie dormant, sometimes for years, before discovery. The longer a vulnerability lies dormant, the more expensive it can be to fix; and, adding insult to injury, the programming community
has a long history of repeating the same security-related mistakes.

One problem is that security is not yet a standard part of the programming curriculum. You can't really blame programmers who introduce security problems into their software if nobody ever told them what to avoid or how to build secure software. Another problem is that most programming languages were not designed with security in mind. Unintentional (mis)use of various functions built into these languages leads to common and often exploited vulnerabilities.

Figure 1. The Cigital Touchpoints methodology. Software security best practices (arrows) applied to various software artifacts (boxes). Best practices shown: security requirements, abuse cases, risk analysis, external review, risk-based security tests, code review (tools), penetration testing, and security operations. Artifacts shown: requirements and use cases, architecture and design, test plans, code, tests and test results, and feedback from the field.

Creating simple tools to help look for these problems is an obvious way forward. The promise of static analysis is to identify many common coding problems automatically, before a program is released.

Static analysis tools (also called source code analyzers) examine a program's text without attempting to execute it. Theoretically, these tools can examine either a program's source code or a compiled form of the program to equal benefit, although the problem of decoding the latter can be difficult. I focus on source code analysis here because that's where the most mature technology exists.

Manual auditing is a form of static analysis. This is very time-consuming, and human code auditors must first know what security vulnerabilities look like before they can rigorously examine the code. Static analysis tools compare favorably to manual audits because they're faster, which means they can evaluate programs much more frequently, and they encapsulate security knowledge in a way that doesn't require the tool operator to have the same level of security expertise as a human auditor. Just as a programmer can rely on a compiler to enforce the finer points of language syntax consistently, the operator of a good static analysis tool can successfully apply that tool without being aware of the finer points of security bugs.

Testing for security vulnerabilities is complicated because they often exist in hard-to-reach states or crop up in unusual circumstances. Static analysis tools can peer into more of a program's dark corners with less fuss than dynamic analysis, which requires actually running the code. Static analysis also has the potential to be applied before a program reaches a level of completion at which testing can be meaningfully performed. The earlier that security risks are identified and managed in the software life cycle, the better.

A BRIEF HISTORY OF CODE REVIEW TOOLS

The first scanner built to look for security problems in code was Cigital's ITS4 (www.cigital.com/its4). (ITS4 is an acronym for "It's the Software Stupid Security Scanner," a name we invented much to the dismay of our marketing people. That was back in the day when Cigital was called Reliable Software Technologies.) Since ITS4's release in early 2000, the idea of detecting security problems by looking over source code with a tool has come of age. Much better approaches exist and are being rapidly commercialized.

ITS4 and its counterparts RATS (no longer available) and Flawfinder (www.dwheeler.com/flawfinder) are extremely simple: the tools scan through a file (lexically) looking for syntactic matches based on several simple "rules" that might indicate possible security vulnerabilities. One such rule might be "use of strcpy() should be avoided," which can be applied by looking through the software for the pattern "strcpy" and alerting the user when and where it is found. This is obviously a simple-minded approach, often referred to with the derogatory label "glorified grep."

The best thing about ITS4 and similar tools was that creating them involved gathering and publishing a preliminary set of software security rules all in one place. When we released ITS4 as an open source tool, our hope was that the world would participate in helping to gather and improve the rule set. Although more than 15,000 people downloaded ITS4 in its first year, we never received even one rule to add to its knowledge base. The world did not end, however, and several prominent commercial efforts to build up and evolve rule sets were undertaken.

ITS4 and its counterparts were never intended to be "push the button, see the bug" kinds of tools. The basic idea was instead to turn an impossible problem (remembering all those rules while doing manual code review) into a really hard one (figuring out whether the things the tool flagged matter or not). Simple tools like ITS4 help carry out a
source code security review, but they certainly don't do it for you. The same can be said for modern tools, although they definitely make things much easier than the first-generation tools did.

Probably the simplest and most straightforward approach to static analysis is the Unix utility grep, the same functionality implemented in the earliest tools such as ITS4. Armed with a list of good search strings, grep can reveal quite a lot about a code base. The downside is that grep is rather lo-fi because it doesn't understand anything about the files it scans. Comments, string literals, declarations, and function calls are all just part of a stream of characters to be matched against.

You might be amused to note that a grep through code for words like "bug," "XXX," "Fix," "Here," and best of all, "Assume" often reveals interesting and relevant tidbits. Any good security source code review should start with that.

Better fidelity requires taking into account the lexical rules that govern the programming language being analyzed. By doing this, a tool can distinguish between a vulnerable function call

    gets(&buf);

a comment

    /* never ever call gets */

and an unrelated identifier

    int begetsNextChild = 0;

Basic lexical analysis is the approach taken by early static analysis tools, including ITS4, Flawfinder, and RATS, all of which preprocess and tokenize source files (the same first steps a compiler would take) and then match the resulting token stream against a library of vulnerable constructs.

While lexical analysis tools are certainly a step up from grep, they produce a hefty number of false positives because they make no effort to account for the target code's semantics. A stream of tokens is better than a stream of characters, but it's still a long way from understanding how a program will behave when it executes. Although some security defect signatures are so strong that they don't require semantic interpretation to be identified accurately, most are not so straightforward.

To increase precision, a static analysis tool must leverage more compiler technology. By building an abstract syntax tree (AST) from source code, such a tool could take into account the basic semantics of the program being evaluated.

Armed with an AST, the next decision to make involves the scope of the analysis. Local analysis examines the program one function at a time and doesn't consider relationships between functions. Module-level analysis considers one class or compilation unit at a time, so it takes into account relationships between functions in the same module and considers properties that apply to classes, but it doesn't analyze calls between modules. Global analysis involves analyzing the entire program, so it takes into account all relationships between functions. The scope of the analysis also determines the amount of context the tool considers. More context is better when it comes to reducing false positives, but it can lead to a huge amount of computation to perform.

References for Secure Code Review

There are numerous good resources for readers interested in learning more about automated code review and other software security technologies. Here are the top five.

• R. Anderson, Security Engineering: A Guide to Building Dependable Distributed Systems, 2nd ed., John Wiley & Sons, 2008; www.cl.cam.ac.uk/~rja14/book.html.
• B. Chess and J. West, Secure Programming with Static Analysis, Addison-Wesley, 2007; http://buildingsecurityin.com.
• M. Howard and D. LeBlanc, Writing Secure Code, 2nd ed., Microsoft Press, 2003; http://blogs.msdn.com/michael_howard.
• G. McGraw, Software Security: Building Security In, Addison-Wesley, 2006; www.swsec.com.
• J. Viega and G. McGraw, Building Secure Software: How to Avoid Security Problems the Right Way, Addison-Wesley, 2001; www.buildingsecuresoftware.com.

MODERN CODE REVIEW TOOLS

In 2004 and 2005, several startups were formed to address the software security space. Many of these vendors have built and are selling basic source code analysis tools. Major vendors in the space include

• Coverity (www.coverity.com),
• Fortify (www.fortify.com), and
• Ounce Labs (www.ouncelabs.com).

These vendors take a similar technological approach, but some are more academically inclined than others. By basing their tools on compiler technology, these vendors have upped the level of sophistication far beyond the early, almost unusable tools like ITS4.

A critical feature that currently serves as an important differentiator in the static analysis tools market is the kind of knowledge (the rule set) that a tool enforces. The importance of a good rule set can't be overestimated.

One reason to use a source code analysis tool is that manual review is costly and time-consuming. Manual review is such a pain that reviewers
regularly suffer from the "get done, go home" phenomenon: starting strong and ending with a sputter. An automated tool can begin to check every line of code whenever a build is complete, allowing development shops to get on with the business of building software.

Integrating a source code analyzer into a development life cycle can be painless and easy. As long as your code builds, you should be able to run a modern analysis. Working through the results remains a challenge, but it is nowhere near as much trouble as painstakingly checking every line of code by hand. Figure 2 shows a screenshot from Fortify's SCA product, demonstrating how results are presented in a commercial tool.

Figure 2. Screenshot from Fortify SCA. Browsable results are displayed in commercial source code analysis tools.

Modern approaches to static analysis can now process on the order of millions of lines of code quickly and efficiently. Although a complete review certainly requires an analyst with a clue, the process of looking through the results of a tool and thinking through potential vulnerabilities beats looking through everything. Speeding up a review several-fold is feasible.

Modern tools have several built-in time-saving mechanisms. The first is the knowledge encapsulated in a tool. Keeping a burgeoning list of all known security problems found in a language like C (several hundred) in your head while attempting by hand to trace control flow, data flow, and an explosion of states is extremely difficult. Having a tool that remembers security problems (and can easily be expanded to cover new problems) is a huge help.

The second time-saving mechanism involves automatically tracking control flow, call chains, and data flow. Although commercial tools make tradeoffs when it comes to soundness, they certainly make the laborious process of control and data-flow analysis much easier. For example, a decent tool can locate a potential strcpy() vulnerability on a given line, present the result in a results browser, and arm the user with an easy and automated way of determining (through control flow, call chains, and data-flow structures) whether the possible vulnerability is real. Although tools are getting better at determining this kind of thing, they are not perfect.

The root cause of many security problems can be found in the source code and configuration files of common software applications, especially custom apps that you write yourself. Problems are seeded when vulnerable code is written into the system, undeniably the most efficient and effective time to remove them.

The way forward is to use automated tools and processes that systematically and comprehensively target the root cause of security issues in source code. Instead of sorting through millions of lines of code looking for vulnerabilities, a developer using an advanced software security tool that returns a small set of potential vulnerabilities can pinpoint actual vulnerabilities in seconds, precisely the same vulnerabilities that would take a malicious hacker or manual code reviewer weeks or even months to find. Of course, most bad guys know this and will use these kinds of tools themselves!

ARM THE DEVELOPERS

Good static analysis tools must be easy to use, even for non-security people. This means that the results from these tools must be understandable to normal developers who might not know much about security. In the end, source code analysis tools serve to educate their users about good programming practice. Good static checkers can help their users spot and eradicate common security bugs. This is especially important for languages such as C or C++, for which a very large corpus of rules already exists.

Static analysis for security should be applied regularly as part of any modern software development process. Automated code review for security uses straightforward technology to help solve an important hard problem: identifying security bugs in source code. As one of the top two best practices for software security, static code review is moving into widespread use.

Acknowledgment

Some portions of this article are reprinted by permission from Software Security: Building Security In, Addison-Wesley, 2006.

Gary McGraw is the CTO of Cigital, a software security firm. Contact him through www.cigital.com/~gem.
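
As a rough illustration of the contrast the article draws between grep-style matching and basic lexical analysis, here is a minimal Python sketch. It is not how ITS4, RATS, or Flawfinder are implemented: the rule set RISKY_CALLS, the function names, and the deliberately tiny tokenizer (no preprocessing, no macros) are all assumptions made for this example.

```python
import re

# Hypothetical rule set for illustration; not the actual ITS4 knowledge base.
RISKY_CALLS = {"gets", "strcpy"}

# Comments and literals are matched first so that their contents are never
# mistaken for identifiers. This is a simplified view of C's lexical rules.
TOKEN_RE = re.compile(
    r"""
    (?P<comment>/\*.*?\*/|//[^\n]*)              # block or line comment
  | (?P<string>"(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*')   # string or char literal
  | (?P<ident>[A-Za-z_][A-Za-z0-9_]*)            # identifier or keyword
  | (?P<other>.)                                 # any other single character
    """,
    re.VERBOSE | re.DOTALL,
)


def grep_scan(source):
    """Glorified grep: flag every line that contains a risky name anywhere."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for name in sorted(RISKY_CALLS):
            if name in line:
                hits.append((lineno, name))
    return hits


def lexical_scan(source):
    """Token-aware scan: flag only a risky identifier directly followed by '('."""
    tokens = [m for m in TOKEN_RE.finditer(source)
              if not (m.lastgroup == "other" and m.group().isspace())]
    hits = []
    for tok, nxt in zip(tokens, tokens[1:]):
        if (tok.lastgroup == "ident" and tok.group() in RISKY_CALLS
                and nxt.group() == "("):
            lineno = source.count("\n", 0, tok.start()) + 1
            hits.append((lineno, tok.group()))
    return hits


# The three cases from the article: a comment, an unrelated identifier,
# and a genuine call.
SAMPLE = '''\
/* never ever call gets */
int begetsNextChild = 0;
void read_name(void) {
    char buf[64];
    gets(buf);
}
'''

if __name__ == "__main__":
    print("grep-style:", grep_scan(SAMPLE))
    print("lexical:   ", lexical_scan(SAMPLE))
```

On this sample the substring search reports lines 1, 2, and 5, while the token-aware scan reports only the real call on line 5. The second scan still knows nothing about what the program does at run time, which is the limitation the article attributes to lexical tools.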

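The data-flow tracking that the article credits to modern tools can be hinted at with a toy example. The Python sketch below is only an assumption-laden illustration, not a description of any commercial analyzer: the Stmt record, the SOURCES and SINKS sets, and the function names read_input, make_greeting, and concat are hypothetical, and the analysis handles straight-line code only (no branches, loops, or calls between functions).

```python
from typing import NamedTuple, Optional


class Stmt(NamedTuple):
    """One simplified statement: target = func(args), or a bare call."""
    line: int
    target: Optional[str]   # variable assigned by the statement, or None
    func: str               # name of the function being called
    args: tuple             # names of the variables passed as arguments


SOURCES = {"read_input"}    # hypothetical: functions returning untrusted data
SINKS = {"strcpy"}          # functions that are dangerous with untrusted data


def tainted_sink_calls(stmts):
    """Return the lines where untrusted data reaches a sink.

    Taint is propagated forward through assignments; a variable that is
    reassigned from clean data stops being tainted.
    """
    tainted = set()
    findings = []
    for s in stmts:
        uses_taint = any(arg in tainted for arg in s.args)
        if s.func in SINKS and uses_taint:
            findings.append(s.line)
        if s.target is not None:
            if s.func in SOURCES or uses_taint:
                tainted.add(s.target)
            else:
                tainted.discard(s.target)
    return findings


PROGRAM = [
    Stmt(1, "name", "read_input", ()),               # untrusted input
    Stmt(2, "greeting", "make_greeting", ()),        # clean data
    Stmt(3, "msg", "concat", ("greeting", "name")),  # msg becomes tainted
    Stmt(4, None, "strcpy", ("buf", "msg")),         # tainted data hits sink
    Stmt(5, None, "strcpy", ("buf2", "greeting")),   # clean data only
]

if __name__ == "__main__":
    print(tainted_sink_calls(PROGRAM))
```

A purely lexical rule would flag both strcpy() calls on lines 4 and 5. Following the data narrows the report to line 4, the only call that receives input-derived data, which is the kind of false-positive reduction the article associates with added context.
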
Computer, December 2008, pp. 92-95