C/C++ Thread Safety Analysis
Total Page:16
File Type:pdf, Size:1020Kb
C/C++ Thread Safety Analysis DeLesley Hutchins Aaron Ballman Dean Sutherland Google Inc. CERT/SEI Email: [email protected] Email: [email protected] Email: [email protected] Abstract—Writing multithreaded programs is hard. Static including MacOS, Linux, and Windows. The analysis is analysis tools can help developers by allowing threading policies currently implemented as a compiler warning. It has been to be formally specified and mechanically checked. They essen- deployed on a large scale at Google; all C++ code at Google is tially provide a static type system for threads, and can detect potential race conditions and deadlocks. now compiled with thread safety analysis enabled by default. This paper describes Clang Thread Safety Analysis, a tool II. OVERVIEW OF THE ANALYSIS which uses annotations to declare and enforce thread safety policies in C and C++ programs. Clang is a production-quality Thread safety analysis works very much like a type system C++ compiler which is available on most platforms, and the for multithreaded programs. It is based on theoretical work analysis can be enabled for any build with a simple warning on race-free type systems [3]. In addition to declaring the flag: −Wthread−safety. The analysis is deployed on a large scale at Google, where type of data ( int , float , etc.), the programmer may optionally it has provided sufficient value in practice to drive widespread declare how access to that data is controlled in a multithreaded voluntary adoption. Contrary to popular belief, the need for environment. annotations has not been a liability, and even confers some Clang thread safety analysis uses annotations to declare benefits with respect to software evolution and maintenance. threading policies. The annotations can be written using either I. INTRODUCTION GNU-style attributes (e.g., attribute ((...))) or C++11- Writing multithreaded programs is hard, because developers style attributes (e.g., [[...]] ). For portability, the attributes are must consider the potential interactions between concurrently typically hidden behind macros that are disabled when not executing threads. Experience has shown that developers need compiling with Clang. Examples in this paper assume the use help using concurrency correctly [1]. Many frameworks and of macros; actual attribute names, along with a complete list libraries impose thread-related policies on their clients, but of all attributes, can be found in the Clang documentation [4]. they often lack explicit documentation of those policies. Where Figure 1 demonstrates the basic concepts behind the such policies are clearly documented, that documentation analysis, using the classic bank account example. The frequently takes the form of explanatory prose rather than a GUARDED BY attribute declares that a thread must lock mu checkable specification. before it can read or write to balance, thus ensuring that Static analysis tools can help developers by allowing the increment and decrement operations are atomic. Similarly, threading policies to be formally specified and mechanically REQUIRES declares that the calling thread must lock mu checked. Examples of threading policies are: “the mutex mu before calling withdrawImpl. Because the caller is assumed to should always be locked before reading or writing variable have locked mu, it is safe to modify balance within the body accountBalance” and “the draw() method should only be of the method. invoked from the GUI thread.” In the example, the depositImpl() method lacks a REQUIRES Formal specification of policies provides two main benefits. clause, so the analysis issues a warning. Thread safety analysis First, the compiler can issue warnings on policy violations. is not interprocedural, so caller requirements must be explicitly Finding potential bugs at compile time is much less expensive declared. There is also a warning in transferFrom(), because it in terms of engineer time than debugging failed unit tests, or fails to lock b.mu even though it correctly locks this−>mu. worse, having subtle threading bugs hit production. The analysis understands that these are two separate mutexes Second, specifications serve as a form of machine-checked in two different objects. Finally, there is a warning in the documentation. Such documentation is especially important withdraw() method, because it fails to unlock mu. Every lock for software libraries and APIs, because engineers need to must have a corresponding unlock; the analysis detects both know the threading policy to correctly use them. Although double locks and double unlocks. A function may acquire documentation can be put in comments, our experience shows a lock without releasing it (or vice versa), but it must be that comments quickly “rot” because they are not updated annotated to specify this behavior. when variables are renamed or code is refactored. A. Running the Analysis This paper describes thread safety analysis for Clang. The analysis was originally implemented in GCC [2], but the GCC To run the analysis, first download and install Clang [5]. version is no longer being maintained. Clang is a production- Then, compile with the −Wthread−safety flag: quality C++ compiler, which is available on most platforms, clang −c −Wthread−safety example.cpp #include ” mutex . h ” #include ”ThreadRole.h” class BankAcct f ThreadRole Input Thread ; Mutex mu; ThreadRole GUI Thread ; i n t balance GUARDED BY(mu ) ; class Widget f void depositImpl( i n t amount ) f public : / / WARNING! Must lock mu. virtual void onClick () REQUIRES(Input Thread ) ; balance += amount; virtual void draw ( ) REQUIRES( GUI Thread ) ; g g ; void withdrawImpl( i n t amount ) REQUIRES(mu) f class Button : public Widget f // OK. Caller must have locked mu. public : balance −= amount ; void onClick() override f g depressed = true ; draw ( ) ; / / WARNING! public : g void withdraw ( i n t amount ) f g ; mu. lock ( ) ; Fig. 2. Thread Roles // OK. We’ve locked mu. withdrawImpl(amount); // WARNING! Failed to unlock mu. or token, which the thread must present to perform the read g or write. Capabilities can be either unique or shared. A unique void transferFrom(BankAcct& b, i n t amount ) f mu. lock ( ) ; capability cannot be copied, so only one thread can hold // WARNING! Must lock b.mu. the capability at any one time. A shared capability may b.withdrawImpl(amount); have multiple copies that are shared among multiple threads. // OK. depositImpl() has no requirements. Uniqueness is enforced by a linear type system [9]. depositImpl(amount); The analysis enforces a single-writer/multiple-reader disci- mu.unlock (); g pline. Writing to a guarded location requires a unique capa- g ; bility, and reading from a guarded location requires either a Fig. 1. Thread Safety Annotations unique or a shared capability. In other words, many threads can read from a location at the same time because they can share Note that this example assumes the presence of a suitably the capability, but only one thread can write to it. Moreover, a thread cannot write to a memory location at the same time that annotated mutex.h [4] that declares which methods perform locking and unlocking. another thread is reading from it, because a capability cannot be both shared and unique at the same time. B. Thread Roles This discipline ensures that programs are free of data Thread safety analysis was originally designed to enforce races, where a data race is defined as a situation that occurs locking policies such as the one previously described, but locks when multiple threads attempt to access the same location in are not the only way to ensure safety. Another common pattern memory at the same time, and at least one of the accesses in many systems is to assign different roles to different threads, is a write [10]. Because write operations require a unique such as “worker thread” or “GUI thread” [6]. capability, no other thread can access the memory location The same concepts used for mutexes and locking can also at that time. be used for thread roles, as shown in Figure 2. Here, a A. Background: Uniqueness and Linear Logic widget library has two threads, one to handle user input, like mouse clicks, and one to handle rendering. It also enforces a Linear logic is a formal theory for reasoning about re- constraint: the draw() method should only be invoked only by sources; it can be used to express logical statements like: “You the GUI thread. The analysis will warn if draw() is invoked cannot have your cake and eat it too” [9]. A unique, or linear, directly from onClick(). variable must be used exactly once; it cannot be duplicated The rest of this paper will focus discussion on mutexes in (used multiple times) or forgotten (not used). the interest of brevity, but there are analogous examples for A unique object is produced at one point in the program, thread roles. and then later consumed. Functions that use the object without consuming it must be written using a hand-off protocol. The III. BASIC CONCEPTS caller hands the object to the function, thus relinquishing Clang thread safety analysis is based on a calculus of control of it; the function hands the object back to the caller capabilities [7] [8]. To read or write to a particular location in when it returns. memory, a thread must have the capability, or permission, to For example, if std :: stringstream were a linear type, stream do so. A capability can be thought of as an unforgeable key, programs would be written as follows: std::stringstream ss; // produce ss member itself; rather, the data it points to is protected by the auto& ss2 = ss << ” Hello ” ; // consume ss given capability. auto& ss3 = ss2 << ” World . ” ; // consume ss2 return ss3 . s t r ( ) ; // consume ss3 Mutex mu; i n t ∗p2 PT GUARDED BY(mu ) ; Notice that each stream variable is used exactly once. A linear type system is unaware that ss and ss2 refer to the same void t e s t ( ) f stream; the calls to << conceptually consume one stream and ∗p2 = 42; / / Warning ! p2 = new i n t ; / / OK ( no GUARDED BY) .