Alloy: a language and tool for exploring software designs

Citation Jackson, Daniel. “Alloy: a language and tool for exploring software designs.” Communications of the ACM, 62, 9 (September 2019): 66-76 © 2019 The Author

As Published 10.1145/3338843

Publisher Association for Computing Machinery (ACM)

Version Original manuscript

Citable link https://hdl.handle.net/1721.1/129357

Terms of Use Creative Commons Attribution-Noncommercial-Share Alike

Detailed Terms http://creativecommons.org/licenses/by-nc-sa/4.0/

Alloy: A Language and Tool for Exploring Software Designs

Daniel Jackson Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology

To appear, CACM. Draft of February 22, 2019.

Alloy is a language and a toolkit for exploring the kinds of structures that arise in many software designs. This brief article aims to give a flavor of Alloy in action, to summarize how Alloy has been used to date, and thereby to give you a sense of how you might use it in your own software design work.

Formal Design Languages

Software involves structures of many sorts: architectures, database schemas, network topologies, ontologies, and so on. When you design a software system, you need to be able to express the structures that are essential to the design, and to check that they have the properties you expect.

You can express a structure by sketching it on a napkin. That's a good start, but it's limited. Informal representations give inconsistent interpretations, and they can't be analyzed mechanically. So people have turned to formal notations that define structure and behavior precisely and objectively, and that can exploit the power of computation.

By using formality early in development, you can minimize the costs of ambiguity and get feedback on your work by running analyses. The most popular approach that advocates this is agile development, in which the formal representation is code in a traditional programming language and the analysis is conventional unit testing.

As a language for exploring designs, however, code is imperfect. It's verbose and often indirect, and it doesn't allow partial descriptions in which some details are left to be resolved later. And testing, as a way to analyze designs, leaves much to be desired. It's notoriously incomplete, and burdensome, since you need to write test cases explicitly. And it's very hard to use code to articulate design without getting mired in low level details (such as the choice of data representations).

An alternative, which has been explored since the 1970s, is to use a design language built not on conventional machine instructions but on logic. Partiality comes for free because, rather than listing each step of a computation, you write a logical constraint saying what's true after, and that constraint can say as little or as much as you please. To analyze such a language, you use specialized algorithms such as model checkers or satisfiability solvers (more on these below). This usually requires much less effort than testing, since you only need to express the property you want to check rather than a large collection of cases. And the analysis is much more complete than testing, because it effectively covers all (or almost all) test cases that you could have written by hand.

What Came Before: Theorem Provers and Model Checkers

To understand Alloy, it helps to know a bit about the context in which it was developed, and the tools that existed at the time.

Theorem provers are mechanical aids for constructing mathematical proofs. To apply a theorem prover to a software design problem, you formulate some intended property of the design, and then attempt to prove the theorem that the property follows from the design. Theorem provers tend to provide very rich logics, so they can usually express any property you might care about, at least about states and state transitions (more dynamic properties can require a temporal logic that theorem provers don't typically support directly). Also, because they generate mathematical proofs, which can be checked by tools that are smaller and simpler than the tool that finds the proof, you can be confident that the analysis is sound.

On the other hand, the combination of an expressive logic and sound proof has meant that finding proofs cannot generally be automated. So theorem provers usually require considerable effort and expertise from the user, often orders of magnitude greater than the effort of constructing a formal design in the first place. Moreover, failure to find a proof does not mean that a proof does not exist, and theorem provers don't provide counterexamples that explain concretely why a theorem is not valid. So theorem provers are not so useful when the intended property does not hold, which unfortunately is the common case in design work.

Model checkers revolutionized design analysis by providing exactly the features theorem provers lacked. They offer push-button automation, requiring the user to give only the design and property to be checked. They allow dynamic properties to be expressed (through temporal logics), and generate counterexamples when properties do not hold. Model checkers work by exploring the space of possible states of a system, and if that space is large, they may require considerable computational resources (or may fail to terminate). The so-called "state explosion" problem arises because model checkers are often used to analyze designs involving components that run in parallel, resulting in an overall state space that grows exponentially with the number of components.

Alloy was inspired by the successes and limitations of model checkers. For designs involving parallelism and simple state (comprising boolean variables, bounded integers, enumerations and fixed-size arrays), model checkers were ideal. They could easily find subtle synchronization bugs that appeared only in rare scenarios that involved long traces with multiple context switches, and therefore eluded testing.

For hardware designs, model checkers were often a good match. But for software designs they were less ideal. Although some software design problems involve this kind of synchronization, often the complexity arises from the structure of the state itself. Early model checkers (such as SMV [9]) had limited expressiveness in this regard, and did not support rich structures such as trees, lists, tables and graphs.

Explicit state model checkers, such as SPIN [14], and later Java Pathfinder [37], allowed designs with rich state to be modeled, but, despite providing support for temporal properties, gave little help for expressing structural ones. To express reachability (for example that two social media users are connected by some path of friend edges), you would typically need to code an explicit search, which would have to be executed at every point at which the property was needed. Also, explicit state model checkers have limited support for partiality (since the model checker would have to conduct a costly search through possible next states to find one satisfying the constraints).

Particularly hard for all model checkers are the kinds of designs that involve a configuration of elements in a graph or tree structure. Many network protocols are designed to work irrespective of the initial configuration (or of the configuration as it evolves), and exposing a flaw often involves not only finding a behavior that breaks a property but also finding a configuration in which to execute it.

Even the few model checkers that can express rich structures are generally not up to this task. Enumerating possible configurations is not feasible, because the number of configurations grows super-exponentially: if there are $n$ nodes, there are $2^{n \times n}$ ways to connect them.

Alloy's Innovations

Alloy brought a new kind of design language and analysis, made possible by three innovations.

Relational logic. Alloy uses the same logic for describing designs and properties. This logic combines the for-all and exists-some quantifiers of first-order logic with the operators of set theory and relational calculus.

The idea of modeling software designs with sets and relations had been pioneered in the Z language [32]. Alloy incorporated much of the power of Z, while simplifying the logic to make it more tractable.

First, Alloy allows only first-order structures, ruling out sets of sets and relations over sets, for example. This changes how designs are modeled, but not what can be modeled; after all, relational databases have flourished despite being first order.

Second, taking advantage of this restriction, Alloy's operators are defined in a very general way, so that most expressions can be written with just a few operators. The key operator is relational join, which in conventional mathematics only applies to binary relations, but in Alloy works on relations of any arity. By using a dot to represent the join operator, Alloy lets you write dereferencing expressions as you would in an object oriented programming language, but gives these expressions a simple mathematical interpretation. So, as in Java, given an employee e, a relation dept that maps employees to departments, and a relation manager that maps departments to their managers, e.dept.manager would give the manager of e's department. But unlike in Java, the expression will also work if e is a set of employees, or dept can map an employee to multiple departments, giving the expected result: the set of managers of the set of departments that the employees e belong to. The expression dept.manager is well defined too, and means the relation that maps employees to their managers. You can also navigate backwards, writing manager.m for the department(s) that m manages.

(A note for readers interested in language design: this flexibility is achieved by treating all values as relations, with a set being a relation with one column and a scalar being a set with one element, and by defining a join operator that applies uniformly over a pair of relations, irrespective of their arity. In contrast, other languages tend to have multiple operators, implicit coercions or overloading to accommodate variants that Alloy unifies.)

Alloy was influenced also by modeling languages such as UML. Like the class diagrams of UML, Alloy makes it easy to describe a universe of objects as a classification tree, with each relation defined over nodes in this tree. Alloy's dot operator was inspired in part by the navigational expressions of OCL (the Object Constraint Language [39] of UML), but by defining the dot as relational join, Alloy dramatically simplifies the semantics of navigation.

Small scope analysis. Even plain first-order logic (without relational operators) is not decidable. This means that no algorithm can exist that could analyze a software design written in a language like Alloy completely. So something has to give. You could make the language decidable, but that would cripple its expressive power and make it unable to express even the most basic properties of structures (although exciting progress has been made recently in applying decidable fragments of first-order logic to certain problems [29]). You could give up on automation, and require help from the user, but this eliminates most of the benefit of an analysis tool; analysis is no longer a reward for constructing a design model, but a major extra investment beyond modeling.

The other option is to somehow limit the analysis. Prior to Alloy, two approaches were popular. Abstraction reduces the analysis to a finite number of cases, by introducing abstract values that each correspond to an entire set of real values. This often results in false positives that are hard to interpret, and in practice picking the right abstraction calls for considerable ingenuity. Simulation picks a finite number of cases, usually by random sampling, but it covers such a small part of the state space that subtle flaws elude detection.

Alloy offered a new approach: running all small tests. The designer specifies a scope that bounds each of the types in the specification. A scope of 5, for example, would include tests involving at most 5 elements of each type: 5 network nodes, 5 packets, 5 identifiers, and so on.

The rationale for this is the small scope hypothesis, which asserts that most bugs can be demonstrated with small counterexamples. That means that if you test for all small counterexamples, you are likely to find any bug. Many Alloy case studies have confirmed the hypothesis, by performing an analysis in a variety of scopes and showing, retrospectively, that a small scope would have sufficed to find all the bugs that were discovered.

Translation to SAT. Even with small scopes, the state space of an Alloy model is fiendishly large. The state comprises a collection of variables whose values are relations. Just one binary relation in a scope of 5 has $5 \times 5 = 25$ possible edges, and thus $2^{25}$ possible values. A very small design might have 5 such relations, giving $(2^{25})^5$ possible states, about $10^{37}$ states. Even checking a billion cases per second, such an analysis would take many times the age of the universe.

Alloy therefore does not perform an explicit search, but instead translates the design problem to a satisfiability problem whose variables are not relations but simple bits. By flipping bits individually, a satisfiability (SAT) solver can usually find a solution (if there is one) or show that none exists by examining only a tiny portion of the space.

Alloy's analysis tool is essentially a compiler to SAT, which allows it to exploit the latest advances in SAT solvers. The success of SAT solvers has been a remarkable story in computer science: theoreticians had shown that SAT was inherently intractable, but it turned out that most of the cases that arise in practice can be solved efficiently. So SAT went from being the archetypal insoluble problem used to demonstrate the infeasibility of other problems to being a soluble problem that other problems could be translated to.

Alloy also applies a variety of tactics to reduce the problem prior to solving, most notably adding symmetry breaking constraints that save the SAT solver from considering cases that are equivalent to one another.

Example: Modeling Origins

To see Alloy in action, let's explore the design of an origin-tracking mechanism for web browsers. The model shown here is a toy version of a real model that exposed several serious flaws in browser security [1]. Although it cuts corners and is unrealistic in some respects, it does capture the spirit and style of the original model, and is fairly representative of how Alloy is often used.

First, some background for those unfamiliar with browser security. Cross-site request forgery (CSRF) is a pernicious and subtle attack in which a malicious script running in a page that the user has loaded makes a hidden and unwanted request to a website for which the user is already authenticated. This may happen either because the user was enticed to load a page from a malicious server, or because a supposedly safe server was the subject of a cross-site scripting attack, and served a page containing a malicious script. Such a script can issue any request the user can issue; one of the first CSRF vulnerabilities to be discovered, for example, allowed an attacker to change the delivery address for the user's account in a DVD rental site. What makes CSRF particularly problematic is that the browser sends authentication credentials stored as cookies spontaneously when a request is issued, whether that request is made explicitly by the user or programmatically by a script.

One way to counter CSRF is to track the origins of all responses received from servers. In our example, the browser would mark the malicious script as originating at the malicious or compromised server. The subsequent request made by that script to the rental site server (the target of the attack) would be labeled as having this other origin. The target server can be configured so that it only accepts requests that originate directly from the user (for example, by the user entering the URL for the request in the address bar), or from itself (for example, from a script embedded in a page previously sent by the target server). As always the devil is in the details, and we shall see that a plausible design of this mechanism turns out to be flawed.

Here are some features to look out for in this model, which distinguish Alloy from many other approaches:

· A rich structure of objects, classification and relationships;
· Constraints in a simple logic that exploits the relations and sets of the structure, avoiding the kind of low level structures (arrays and indices, etc.) that are often needed in model checkers;
· Capturing dynamic behavior without any need for a built-in notion of time or state;
· Intended properties to check expressed in the same language as the model itself;
· An abstract style of modeling that includes only those aspects essential to the problem at hand.

We start by declaring a collection of signatures (Fig. 1). A signature introduces a set of objects and some fields that relate them to other objects. So Server, for example, will represent the set of server nodes, and has a field causes that associates each server with the set of HTTP events that it causes.

     1  abstract sig EndPoint { }
     2  sig Server extends EndPoint {
     3    causes: set HTTPEvent
     4  }
     5  sig Client extends EndPoint { }
     6  abstract sig HTTPEvent {
     7    from, to, origin: EndPoint
     8  }
     9  sig Request extends HTTPEvent {
    10    response: lone Response
    11  }
    12  sig Response extends HTTPEvent {
    13    embeds: set Request
    14  }
    15  sig Redirect extends Response {
    16  }

Fig. 1. Structure declarations

Keywords (or their omission) indicate the multiplicity of the relations between objects: thus each HTTP event has exactly one from endpoint, one to endpoint, and one origin endpoint (line 7); each request has at most one response (line 10, with lone being read as "less than or equal to one"); and each response embeds any number of requests (line 13).

Objects are, mathematically, just atomic identifiers without any internal structure. So the causes relation includes tuples of the form (s, e) where the value of s is some atomic identifier representing a server object, and the value of e is some atomic identifier representing an event.

Fields are declared in signatures to allow a kind of object-oriented mindset. Alloy supports this by resolving field names contextually (so that field names need not be globally unique), and by allowing "signature facts" (not used here) that are implicitly scoped over the elements of a signature and their fields. But don't be misled into thinking that there is some kind of complex object semantics here. The signature structure is only a convenience, and just introduces a set and some relations.

The extends keyword defines one signature as a subset of another. An abstract signature has no elements that do not belong to a child signature, and the extensions of a signature are disjoint. So the declarations of EndPoint, Server and Client imply that the set of endpoints is partitioned into servers and clients: no server is also a client, and there is no endpoint that is neither client nor server. A relation defined over a set applies over its subsets too, so the declaration of from, for example, which says that every HTTP event is from a single endpoint, implies that the same is true for every request and response. (Alloy is best viewed as untyped. It turns out that conventional programming language types are far too restrictive for a modeling language. Alloy thus allows expressions such as HTTPEvent.response, denoting the set of responses to any events, but its type checker rejects an expression such as Request.embeds, which always denotes an empty set [12].)

The Alloy Analyzer can generate a graphical representation of the sets and relations from the signature declarations (Fig. 2); this is just an alternative view and involves no analysis.

[Fig. 2. Data model from declarations]

Moving to the substance of what the model actually means:

· The from and to fields are just the source and destination of the event's packet.
· For a response r, the expression r.embeds denotes a set of requests that are embedded as JavaScript in the response; when that response is loaded into the browser, the requests are executed spontaneously.
· A redirect is a special kind of response that indicates that a resource has moved, and spontaneously issues a request to a different server; this second request is modeled as an embedded request in the redirect response.
· The origin of an event is a notion computed by the browser as a means of preventing cross-site attacks. As we'll see later, the idea is that a server may choose to reject an event unless it originated at that server or at a browser.
· The cause of an event is not part of the actual state of the mechanism. It is introduced in order to express the essential design property: that an evil server cannot cause a client to send a request to a good server.

Now let's look at the constraints (Fig. 3). If there were no constraints, any behavior would be possible; adding constraints restricts the behavior to include only those that are intended by design.

    17  fact Directions {
    18    Request.from + Response.to in Client
    19    Request.to + Response.from in Server
    20  }
    21  fact RequestResponse {
    22    all r: Response | one response.r
    23    all r: Response |
    24      r.to = response.r.from
    25      and r.from = response.r.to
    26    all r: Request |
    27      r not in r.^(response.embeds)
    28  }
    29  fact Causality {
    30    all e: HTTPEvent, s: Server |
    31      e in s.causes iff e.from = s or
    32        some r: Response |
    33          e in r.embeds and r in s.causes
    34  }
    35  fact Origin {
    36    all r: Response, e: r.embeds |
    37      e.origin = r.origin
    38    all r: Response | r.origin =
    39      (r in Redirect implies
    40        response.r.origin else r.from)
    41    all r: Request |
    42      no embeds.r implies
    43        r.origin in r.from
    44  }
    45  pred EnforceOrigins (s: Server) {
    46    all r: Request |
    47      r.to = s implies
    48        r.origin = r.to or r.origin = r.from
    49  }

Fig. 3. Fact and predicate declarations

The constraints are grouped into separate named facts to make the model more understandable:

· The Directions fact contains two constraints. The first says that every request is from, and every response is to, a client; the second says that every request is to, and every response is from, a server. These kinds of constraints can be written in many ways. Here we've chosen to use expressions denoting sets of endpoints (for example, Request.from for the set of endpoints that requests are from). But we could equally well have written a constraint like

      from in Request -> Client + Response -> Server

  to say that the from relation maps requests to clients and responses to servers. Or in a more familiar but less succinct style, we could have used quantifiers:

      all r: Request | r.from in Client
      all r: Response | r.from in Server

  (which constrains only the range of the relations, which is sufficient in this case since the declarations constrain their domains).
· The RequestResponse fact defines some basic properties of how requests and responses work: that every response is the response to exactly one request (line 22); that every response is to the endpoint its request was from, and from the endpoint its request was to (line 23); and that a request cannot be embedded in a response to itself (line 26). Two expressions in these constraints merit explanation. The expression response.r exploits the flexibility of the join operator to navigate backwards from the response r to the request it responds to; it could equivalently be written r.~response using the transpose operator ~. The expression r.^(response.embeds) starts with the request r, and then applies to it one or more navigations (using the closure operator ^) of following the response and embeds relations, as if we'd written instead the infinite expression

      r.response.embeds
      + r.response.embeds.response.embeds
      + r.response.embeds.response.embeds.response.embeds
      + …

  defining the requests embedded in the response to r, the requests embedded in the response to the requests embedded in the response to r, and so on. (Equivalently, r.^p is the set of nodes reachable from r in the graph whose edges correspond to the relation p.)
· The Causality fact defines the causes relation. It says that an event is caused by a server if and only if it is from that server, or is embedded in a response that the server causes.
· The Origin fact describes the origin-tracking mechanism. Each constraint defines the origin of a different kind of HTTP event. The first (line 36) says that every embedded request e has the same origin as the response r that it is embedded in. The second (line 38) defines the origin of a response: it says that if the response is a redirect, it has the same origin as the original request, and otherwise its origin is the server that the response came from. The third (line 41) handles a request that is not embedded: its origin is the endpoint it comes from (which will usually be the browser).

Finally, EnforceOrigins is a predicate that can be applied to a server, indicating that it chooses to enforce the origin header, allowing incoming requests only if they originate at that server, or at the client that sent the request.

With all this in place (the structure of endpoints and messages, the rules about how origins are computed and used, and the definition of causality), we can define a design property to check (Fig. 4). The keyword check introduces a command that can be executed. This command instructs the Alloy Analyzer to search for a refutation for the given constraint. In this case, the constraint asserts the non-existence of a cross-site request forgery attack; refuting this will show that the origin mechanism is not designed correctly, and an attack is possible.

    50  check {
    51    no good, bad: Server {
    52      good.EnforceOrigins
    53      no r: Request |
    54        r.to = bad and r.origin in Client
    55      some r: Request |
    56        r.to = good and r in bad.causes
    57    }
    58  } for 5

Fig. 4. Check command

The constraint says that there are no two servers, good and bad, such that the good server enforces the origin header (line 52), there are no requests sent directly to the bad server that originate in the client (line 53), and yet there is some request to the good server that was caused by the bad server (line 55).

Analysis Results: Finding Bugs

The Alloy Analyzer finds a counterexample (Fig. 5) almost instantaneously: in 30ms on my 2012 MacBook (with a 2.6 GHz i7 processor and 16GB of RAM). The counterexample can be displayed in various ways: as text, as a table, or as a graph whose appearance can be customized. I've chosen the graph option, and have selected which objects are to appear as nodes (just the events and the servers), which relations are to appear as edges (those between events, and causes), and I've picked colors for the sets and relations. I've also chosen to use the Skolem constants (witnesses that the analyzer finds for the quantified variables) good and bad to label the servers.

[Fig. 5. Counterexample for check of Fig. 4]

Reading the graph from the top, looking just at the large rectangles representing the HTTP events, we see that a request (Req1) was sent from a client to the good server. The response (Resp) embeds a request (Req0) that is sent to the bad server; this is a cross-site request which won't be rejected because the bad server accepts incoming requests irrespective of origin. The bad server's response is to send a redirect whose embedded request (Req2) is received by the good server. (Note that the numbering of objects is arbitrary: Req1 actually happens before Req0.)

Now looking at the server nodes and the events they cause, we see that, as expected, the good server caused the response to the first request, and the bad server caused the redirect and its subsequent embedded request. The problem is the mismatch between cause and origin in the last request (Req2): we can see that it was caused by the bad server, but it was labelled as originating at the good server. In other words, the origin tracking design is allowing a cross-site request forgery by incorrectly identifying the origin of the request in the redirect.

The solution to this problem turns out to be non-trivial. Updating the origin header after each redirect would fail for websites that offer open redirection; a better solution is to list a chain of endpoints in the origin header [1].

Agile Modeling

As I mentioned earlier, our model is representative of many Alloy models. But the way I presented it was potentially misleading. In practice, users of Alloy don't construct a model in its entirety and then check its properties. Instead, they proceed in a more agile way, growing the model and simulating and checking it as they go.

Take, for example, the constraint on line 26 of Fig. 3. Initially, I hadn't actually noticed the need for this constraint. But when I ran the check for the first time (without this constraint), the analyzer presented me with counterexamples such as the one shown in Fig. 6, in which the response to a request is the very response in which the request is embedded!

[Fig. 6. A bogus counterexample]

One way to build a model, exploiting Alloy's ability to express and analyze very partial models, is to add one constraint at a time, exploring its effect. You don't need to have a property to check; you can just ask for an instance of the model satisfying all the constraints.

Doing this even before any explicit constraints have been included is very helpful. You can run just the data model by itself and see a series of instances that satisfy the constraints implicit in the declarations. Often doing this alone exposes some interesting issues. In this case, the first few instances include examples with no HTTP events, and with requests and responses that are disconnected.

To get more representative instances, you can specify an additional constraint to be satisfied. For example, the command

    run {some response}

will show instances in which the response relation has some tuples. The first one generated (Fig. 7) shows a request with a response that is a redirect from the same source as the request, and sent to an endpoint that is also its origin, and it includes an orphaned redirect unrelated to any request! These anomalies immediately suggest enrichments of the model.

[Fig. 7. A simulated instance]

When we developed Alloy, we underestimated the value of this kind of simulation. As we experimented with Alloy, however, we came to realize how helpful it is to have a tool that can generate provocative examples. These examples invariably expose basic misunderstandings, not only about what's being modeled but also about which properties matter. It's essential that Alloy provides this simulation for free: in particular, you don't need to formulate anything like a test case, which would defeat the whole point.

Growing a model in a declarative language like Alloy is very different from growing a program in a conventional programming language. A program starts with no behaviors at all, and as you add code, new behaviors become possible. With Alloy, it's the opposite. The empty model, since it lacks any constraints, allows every possible behavior; as you add constraints, behaviors are eliminated. This allows a powerful style of incremental development in which you only add constraints that are absolutely essential for the task at hand, whether that's eliminating pathological cases or ensuring that a design property holds.

Typically a model includes both a description of the mechanism being designed and some assumptions about the environment in which it operates. Our model does not separate these rigorously, but where brevity is not such a pressing concern, it would be wise to do so. We could separate, for example, the constraints that model the setting and checking of the origin field from those that describe what kinds of requests and responses are possible.

Obviously, the less you assume about the environment, the better, since every assumption you make is a risk (since it may turn out to be untrue). In our model, for example, we don't require every request to have a response. It would be easy to do (just change the declaration of response in line 10 of Fig. 1 by dropping the lone keyword) but would only make the result of the analysis less general. Likewise, the less you constrain the mechanism, the better. Allowing multiple behaviors gives implementation freedom, which is especially important in a distributed setting.

Simulation matters for a more profound reason. Verification (that is, checking properties) is often overrated in its ability to prevent failure. As Christopher Alexander explains [2], designed artifacts usually fail to meet their purposes not because specifications are violated but because specifications are unknown. The "unknown unknowns" of a software design are invariably discovered when the design is finally deployed, but can often be exposed earlier by simulation, especially in the hands of an imaginative designer.

Verification, in contrast, is too narrowly focused to produce such discoveries. This is not to say that property checking is not useful; it's especially valuable when a property can be assured with high confidence using a tool such as Alloy or a model checker or theorem prover (rather than by testing). But its value is always contingent on the sufficiency of the property itself, and techniques that help you explore properties have an important role to play.

Uses of Alloy

Hundreds of papers have reported on applications of Alloy in a wide variety of settings. Here are some examples to give a flavor of how Alloy has been used.

Critical systems. A team at the University of Washington constructed a dependability case [18] for a neutron radiotherapy installation. They devised an ingenious technique for verifying properties of code against specifications using lightweight, pluggable checkers. The end-to-end dependability case was assembled in Alloy from the code specifications, properties of the equipment and environment, and the expected properties, and then checked using the Alloy Analyzer. The analysis found several safety-critical flaws in the latest version of the control software, which the researchers were able to correct prior to its deployment. For a full description, see a recent research report
It would be [30] and additional information on the automatically several results for C11 (the true, and the Margrave tool [26] analyzes project’s website [36]. memory model introduced in 2011 for C firewall configurations. Last year, a team Network protocols. Pamela Zave, a re- and C++) and common compiler optimi- from Princeton and Nvidia built a tool searcher at AT&T, has been using Alloy zations associated with it, for the memory that uses Alloy to synthesize security at- for many years to construct and analyze models of the IBM Power and Intel x86 tacks that exploit the Spectre and Melt- models of networking, and for designing chips, and for compiler mappings from down vulnerabilities [35]. a new unifying network architecture. In OpenCL to AMD-style GPUs. They then Teaching. Alloy has been widely taught a major case study, she analyzed Chord, used their technique to develop and check in undergraduate and graduate courses a distributed hash table for peer-to-peer a new memory model for Nvidia GPUs. for many years. At the University of Min- applications. The original paper on Chord Code verification. Alloy can also be used ho in Portugal, Alcino Cunha teaches an [33]—one of the most widely cited papers to verify code, by translating the body of annual course on formal methods using in computer science—notes that an inno- a function into Alloy, and asking Alloy to Alloy, and has developed a web interface vation of Chord was its relative simplicity, find a behavior of the function that vio- to present students with Alloy exercises and consequently the confidence users lates its specification. Greg Dennis built a (which are then automatically checked). can have in its correctness. 
By modeling tool called Forge that wraps Alloy so that At Brown University, Tim Nelson teach- and analyzing the protocol in Alloy, Zave it can be applied directly to Java code an- es Logic for Systems, which uses Alloy for showed that the Chord protocol was not, notated with JML specifications. In a case modeling and analysis of system designs, however, correct, and she was able to de- study application [10], he checked a vari- and has become one of the most popular velop a fixed version that maintains its ety of implementations of the Java collec- undergraduate classes. Because the Al- simplicity and elegance while guarantee- tions list interface, and found bugs in one loy language is very close to a pure rela- ing correct behavior [43]. Zave also used (a GNU Trove implementation). Dennis tional logic, it has also been popular in the explicit model checker SPIN [14] in also applied his tool to KOA, an electron- the teaching of discrete mathematics, for this work, and wrote an insightful article ic voting system used in the Netherlands example in a course that Charles Wallace explaining the relative merits of the two that was annotated with JML specifica- teaches at Michigan Technological Uni- tools, and how she used them in tandem tions and had previously been analyzed versity [38] and appearing as a chapter in [42]. with a theorem proving tool, and found a popular textbook [15]. Web security. The demonstration exam- several functions that did not satisfy their Alloy Extensions ple of this paper is drawn from a real study specifications [11]. performed by a research group at Berke- Civil engineering. In one of the more Many extensions to Alloy—both to the ley and Stanford [1]. They constructed a innovative applications of Alloy, John language and to the tool—have been cre- library of Alloy models to capture various Baugh and his colleagues have been ap- ated. 
These offer a variety of improve- aspects of web security mechanisms, and plying Alloy to problems in large-scale ments in expressiveness, performance then analyzed five different mechanisms, physical simulation. They designed an and usability. For the most part, these including: WebAuth, a web-based au- extension to ADCIRC—an ocean circula- extensions have been mutually incom- thentication protocol based on Kerberos tion model widely used by the U.S. Army patible, but a new open source effort is deployed at several universities including Corps of Engineers and others for simu- now working to consolidate them. There Stanford; HTML5 forms; the Cross-Ori- lating hurricane storm surge—that intro- are too many efforts to include here, so gin Resource Sharing protocol; and pro- duces a notion of subdomains to allow we focus on representatives of the main posed designs for using the referer header more localized computation of changes classes. and the origin header to foil cross-site (and thus reduced overall computational Higher-order solving. The Alloy Analyz- attacks (of which the last is the basis for effort). Their extension, which has been er’s constraint solving mechanism cannot the example here). The base library was incorporated into the official ADCIRC handle formulas with universal quantifi- written in 2,000 lines of Alloy; the various release, was modeled and verified in Al- cations over relations—that is, problems mechanisms required between 20 and loy [7]. that reduce to “find some relationP such 214 extra lines; and every bug was found Alloy as a backend. Because Alloy of- that for every relation Q…” This is exactly within two minutes and a scope of 8. 
Two fers a small and expressive logic, along the form that many synthesis problems previously known vulnerabilities were with a powerful analyzer, it has been ex- take, in which the relation P represents a confirmed by the analysis, and three new ploited as a backend in many different structure to be synthesized, such as the ones discovered. tools. Developers have often used Alloy’s abstract syntax tree of a program, and the Memory models. John Wickerson and own engine, Kodkod [34], directly, rath- relation Q represents the state space over his colleagues have shown that four er than the API of Alloy itself, because it which certain behaviors are to be verified. common tasks in the design of memory offers a simpler programmatic interface Alloy* [24] is an extension of Alloy that models—generating conformance tests, with the ability to set bounds on rela- can solve such formulas, by generalizing comparing two memory models, check- tions, improving performance. Jasmine a tactic known as counterexample-guided ing compiler optimizations, and checking Blanchette’s Nitpick tool [8], for example, inductive synthesis that has been widely compiler mappings—can all be framed uses Kodkod to find counterexamples in used in synthesis engines. as constraint satisfaction problems in Isabelle/HOL, saving the user the trouble . Alloy has no built-in Alloy [41]. They were able to reproduce of trying to prove a theorem that is not notion of time or dynamic behavior. On the one hand, this is an asset, because it for requiring (or forbidding) a particular book’s website [17]). The Alloy communi- keeps the language simple, and allows tuple in the instance. Another extension ty answers questions tagged with the key- it to be used very flexibly. We exploited [21] of the Alloy Analyzer generates min- word alloy on StackOverflow, and hosts a this in the example model of this paper, imal and maximal instances, and choos- discussion forum [5]. 
A variety of tutori- where the flow of time is captured in the ing a next instance that is as close to, or als for learning Alloy are available online response relation that maps each request as far away from, the current instance as too, as well as blog posts with illustrative to its response. By adding a signature for possible. case studies and examples (eg [40, 19]). state, Alloy supports the specification Better numerics. Alloy handles numeri- The model used in this paper is available style common in languages such as B, cal operations by treating numbers as bit (along with its visualization theme) in the VDM and Z; and by adding a signature strings. This has the advantage of fitting Alloy community’s model repository [4]. for events, Alloy allows analysis over trac- into the SAT solving paradigm smoothly, Acknowledgments es that can be visualized as series of snap- and it allows a good repertoire of integer shots. On the other hand, it would often operations. But the analysis scales poorly, I am very grateful to David Chemouil, be preferable to have dynamic features making Alloy unsuitable for heavily nu- Alcino Cunha, Peter Kriens, Shriram built into the language. Electrum [20] ex- meric applications. The finite scopes of Krishnamurthi, Emina Torlak, Hillel tends Alloy with a keyword var to indicate Alloy can also be an issue when a design- Wayne, Pamela Zave, and the anonymous that a signature or field has a time-varying er would like numbers to be unbounded. reviewers, whose suggestions improved value, and with the quantifiers of linear A possible solution is to replace the SAT this paper greatly; to , who temporal logic (which fit elegantly with backend with an SMT backend instead. encouraged me to write it; and to Devdat- Alloy’s traditional quantifiers). DynAlloy This is challenging because SMT solvers ta Akhawe, Adam Barth, Peifung E. 
Lam, [31] offers similar functionality, but using have not traditionally supported relation- John Mitchell and Dawn Song whose dynamic logic instead, and is the basis of al operators. Nevertheless, a team at the work formed the basis of the paper’s ex- an impressive code analysis tool called University of Iowa has recently extended ample. Thank you also to the many mem- TACO [13] that outperforms Forge (men- CVC4, a leading SMT solver, with a theo- bers of the Alloy community who have tioned above) by employing domain-spe- ry of finite relations, and has promisingly contributed to Alloy over the years. cific optimizations. No extension of Alloy, demonstrated its application to some Al- References however, has yet addressed the problem loy problems [23]. of combining Alloy’s capacity for struc- Configurations. Many Alloy models 1. Devdatta Akhawe, Adam Barth, Peifung E. tural analysis with the ability of tradition- contain two loosely coupled parts, one Lam, John Mitchell and Dawn Song. Towards a Formal Foundation of Web Security. 23rd IEEE al model checkers to explore long traces, defining a configuration (say of a network) Computer Security Foundations Symposium, so Alloy analyses are still typically limited and the other the behavior (say of sending Edinburgh, 2010, pp. 290–304. to short traces. packets). By iterating through configura- 2. Christopher Alexander. Notes on the Synthesis Instance generation. The result of an Al- tions and analyzing each independently, of Form. Harvard University Press, 1964. loy analysis is not one but an entire set of one can often dramatically reduce anal- 3. Alloy Tools website: http://alloytools.org. 4. Alloy Models repository: https://github.com/ solutions to a constraint-solving problem, ysis time [22]. In some applications, a AlloyTools/models each of which represents either a positive configuration is already fully or partially 5. Alloy discussion forum: https://groups.google. 
example of a scenario, or a negative ex- known, and the goal is to complete the com/forum/#!forum/alloytools ample, showing how the design fails to instance—in which case searching for the 6. Adam Barth, Colin Jackson and John C. meet some property. The order in which configuration is a wasted effort. Kodkod, Mitchell. Robust defenses for cross-site request forgery. 15th ACM Conf. on Computer and these appear is somewhat arbitrary, being Alloy’s engine, allows the explicit defini- Communications Security (CCS 2008). ACM, determined both by how the problem is tion of a “partial instance” to support this, 2008, pp. 75–88. encoded and the tactics of the backend but in Alloy itself, this notion is not well 7. John Baugh and Alper Altuntas. Formal meth- SAT solver. Since SAT solvers tend to supported (and relies on a heuristic for ods and finite element analysis of hurricane try false before true values, the instanc- extracting partial instances from formu- storm surge: A case study in software veri- fication.Science of Computer Programming, es generated tend to be small ones—with las in a certain form). Researchers have 158:100–121, 2018. few nodes and edges. This is often desir- therefore proposed a language extension 8. Jasmine Blanchette and Tobias Nipkow. Nitpick: able, but is not always ideal. Various ex- [25] to allow partial instances to be de- A counterexample generator for higher-or- tensions to the Alloy Analyzer provide fined directly in Alloy itself. der logic based on a relational model finder. more control over the order in which in- First International Conference on Interactive How to Try Alloy Theorem Proving (ITP 2010), M. Kaufmann and stances appear. Aluminum [28] presents L.C. Paulson, eds. LNCS 6172, pp. 131–146, only minimal scenarios, in which every The Alloy Analyzer [3] is a free download Springer, 2010. relation tuple is needed to satisfy the con- available for Mac, Windows and Linux. 9. Jerry R. Burch, Edmund M. Clarke, Kenneth L. 
straints, and lets the user add new tuples, The Alloy book [16] provides a gentle in- McMillan, David L. Dill and L. J. Hwang. Sym- 20 automatically compensating with a (min- troduction to relational logic and to the bolic : 10 States and Beyond. Fifth Annual Symposium on Logic in Computer imal) set of additional tuples required for Alloy language, gives many examples of Science (LICS ’90), Philadelphia, Pennsylvania, consistency. Amalgam [27] lets users ask Alloy models, and includes a reference USA, June 4-7, 1990, pp. 428–439. about the provenance of an instance, in- manual and a comparison to other lan- 10. Greg Dennis, Felix Chang and Daniel Jackson. dicating which subformula is responsible guages (both of which are available on the Modular Verification of Code with SAT. Inter- national Symposium on Software Testing and 22. Nuno Macedo, Alcino Cunha and Eduardo Pes- Galeotti and Marcelo Frias. DynAlloy Analyzer: Analysis. Portland, ME, July 2006. soa. Exploiting Partial Knowledge for Efficient A Tool for the Specification and Analysis of Al- 11. Greg Dennis, Kuat Yessenov and Daniel Jack- Model Analysis. 15th International Symposium loy Models with Dynamic Behaviour. 11th Joint son. Bounded Verification of Voting Software. on Automated Technology for Verification and Meeting on Foundations of Software Engineering Second IFIP Working Conference on Verified Analysis (ATVA’17), pp 344-362. Springer, 2017. (ESEC/FSE 2017). ACM, New York, NY, USA, Software: Theories, Tools, and Experiments 23. Baoluo Meng, Andrew Reynolds, Cesare Tinelli pp. 969–973. (VSTTE 2008) . Toronto, Canada, October and Clark Barrett. Relational Constraint Solving 32. John Michael Spivey. The Z Notation: A refer- 2008. in SMT. 26th International Conference on Au- ence manual (2nd ed.), Prentice Hall, 1992. 12. Jonathan Edwards, Daniel Jackson and Emina tomated Deduction (CADE ’17) (Leonardo de 33. Ion Stoica, Robert Morris, David Liben-Nowell, Torlak. A type system for object models. 
12th Moura, ed.), Springer, Vol. 10395, Gothenburg, David R. Karger, M. Frans Kaashoek, Frank ACM SIGSOFT International Symposium on Sweden, 2017. Dabek and Hari Balakrishnan. Chord: A Foundations of Software Engineering, 2004, 24. Aleksandar Milicevic, Joseph P. Near, Eunsuk Scalable Peer-to-peer Lookup Protocol for Newport Beach, CA, USA, October 31 - No- Kang and Daniel Jackson. Alloy*: a gener- Internet Applications. IEEE/ACM Transactions vember 6, 2004, pages 189–199, 2004. al-purpose higher-order relational constraint on Networking (TON), Vol. 11, No. 1 (2003): 13. Juan P. Galeotti, Nicolas Rosner, Carlos G. solver. Formal Methods in System Design, 2017, pp.17–32. Lopez Pombo and Marcelo F. Frias. TACO: Ef- pp.1–32. 34. Emina Torlak and Daniel Jackson. Kodkod: ficient SAT-Based Bounded Verification Using 25. Vajih Montaghami and Derek Rayside. Extend- a relational model finder.13th International Symmetry Breaking and Tight Bounds. IEEE ing alloy with partial instances. Third Interna- Conference on Tools and Algorithms for the Con- Trans. Softw. Eng. 39, 9 (September 2013), pp. tional Conference on Abstract State Machines, struction and Analysis of Systems (TACAS’07), 1283–1307. Alloy, B, VDM, and Z (ABZ’12), 2012, pp. Braga, Portugal, 2007, pp. 632–647. 14. Gerard J. Holzmann. The Spin Model Checker: 122–135. 35. Caroline Trippel, Daniel Lustig and Margaret Primer and Reference Manual, Addison Wesley, 26. Timothy Nelson, Christopher Barratt, Daniel Martonosi. MeltdownPrime and SpectrePrime: 2003. J. Dougherty, Kathi Fisler, Shriram Krish- Automatically-Synthesized Attacks Exploiting 15. Michael Huth and Mark Ryan. Logic in Com- namurthi. The Margrave Tool for Firewall Anal- Invalidation-Based Coherence Protocols. arX- puter Science: Modeling and Reasoning about ysis. 24th USENIX Large Installation System iv:1802.03802, February 2018. Systems, Cambridge University Press, 2004. Administration Conference, San Jose, CA, 2010. 36. University of Washington. 
PLSE Neutrons. http: 16. Daniel Jackson. Software Abstractions, MIT 27. Timothy Nelson, Natasha Danas, Daniel J. neutrons.uwplse.org/ Press, Second edition, 2012. Dougherty and Shriram Krishnamurthi. The 37. W. Visser, K. Havelund, G. Brat, S.-J. Park, and 17. Daniel Jackson. Software Abstractions website. Power of Why and Why Not: Enriching Scenar- F. Lerda. Model Checking Programs. Automat- http://softwareabstractions.org. io Exploration with Provenance. Joint European ed Software Engineering Journal, 10(2), April 18. Daniel Jackson, Martyn Thomas, and Lynette I. Software Engineering Conference and ACM 2003. Millett, eds. Software For Dependable Systems: SIGSOFT Symposium on the Foundations of 38. Charles Wallace. Learning Discrete Structures Sufficient Evidence? Committee on Certifiably Software Engineering, 2017. Interactively with Alloy. 49th ACM Technical Dependable Software Systems, Computer Sci- 28. Timothy Nelson, Salman Saghafi, Daniel J. Symposium on Computer Science Education, ence and Telecommunications Board, Division Dougherty, Kathi Fisler and Shriram Krish- Baltimore, Maryland, USA.February 21–24, on Engineering and Physical Sciences, National namurthi. Aluminum: Principled Scenario 2018, pp. 1051–1051. Research Council of the National Academies. Exploration through Minimality. International 39. Jos B. Warmer and Anneke G. Kleppe. The The National Academies Press, Washington, Conference on Software Engineering, 2013. Object Constraint Language: Precise Modeling DC. 2007. 29. Oded Padon, Giuliano Losa, Mooly Sagiv, With UML. Addison-Wesley, 1999. 19. Peter Kriens. JPMS, The Sequel. http://aqute. and Sharon Shoham. 2017. Paxos Made EPR: 40. Hillel Wayne. Personal blog. https://www.hillel- biz/2017/06/14/jpms-the-sequel.html Decidable Reasoning about Distributed Proto- wayne.com 20. Nuno Macedo, Julien Brunel, David Chemouil, cols. Object-Oriented Programming, Systems, 41. John Wickerson, Mark Batty, Tyler Sorensen Alcino Cunha and Denis Kuperberg. 
Light- Languages & Applications (OOPSLA 2017), and George A. Constantinides. Automatically weight specification and analysis of dynamic Vancouver, 2017. comparing memory consistency models. 44th systems with rich configurations.24th ACM 30. Stuart Pernsteiner, Calvin Loncaric, Emina ACM SIGPLAN Symposium on Principles of SIGSOFT International Symposium on Founda- Torlak, Zachary Tatlock, Xi Wang, Michael D. Programming Languages (POPL 2017),Paris, tions of Software Engineering (FSE’16), Seattle, Ernst and Jonathan Jacky. Investigating Safety of France, 2017, pp. 190–204. WA, USA, 2016, pp. 373–383. a Radiotherapy Machine Using System Models 42. Pamela Zave. A practical comparison of Alloy 21. Nuno Macedo, Alcino Cunha and Tiago with Pluggable Checkers. Computer Aided and Spin. Formal Aspects of Computing, Vol. 27: Guimaraes. Exploring Scenario Exploration. Verification (CAV 2016). Lecture Notes in 239–253, 2015. Fundamental Approaches to Software Engineer- Computer Science, Vol. 9780, Springer. 43. Pamela Zave. Reasoning about identifier spaces: ing (FASE 2015), A. Egyed and I. Schaefer, eds. 31. German Regis, Cesar Cornejo, Simon Gutierrez How to make Chord correct. IEEE Transactions Lecture Notes in Computer Science, Vol 9033. Brida, Mariano Politano, Fernando Raverta, on Software Engineering, 43(12):1144–1156, Springer, Berlin, Heidelberg. Pablo Ponzio, Nazareno Aguirre, Juan Pablo December 2017.