Proceedings of the 26th ACM Symposium on Principles of Programming Languages (POPL ’99), San Antonio, Texas, USA, January 1999 JFlow: Practical Mostly-Static Information Flow Control

Andrew C. Myers

Laboratory for Computer Science

Massachusetts Institute of Technology http://www.pmg.lcs.mit.edu/ ~andru

Abstract they only control information release, not its propagation once released. Mandatory access control (MAC) mecha- A promising technique for protecting privacy and integrity of nisms prevent leaks through propagation by associating a sensitive data is to statically check information flow within run- security class with every piece of computed data. programs that manipulate the data. While previous work Every computation requires that the security class of the re- has proposed programming language extensions to allow this sult value also be computed, so multi-level systems using static checking, the resulting languages are too restrictive for this approach are slow. Also, these systems usually apply a practical use and have not been implemented. In this pa- security class to an entire process, tainting all data handled per, we describe the new language JFlow, an extension to by the process. This coarse granularity results in data whose the Java language that adds statically-checked information security class is overly restrictive, and makes it difficult to flow annotations. JFlow provides several new features that many useful applications. information flow checking flexible and conve- A promising technique for protecting privacy and integrity nient than in previous models: a decentralized label model, of sensitive data is to statically check information flows label polymorphism, run-time label checking, and automatic within programs that might manipulate the data. Static label inference. JFlow also supports many language features checking allows the fine-grained tracking of security classes that have never been integrated successfully with static infor- through program computations, without the run-time over- mation flow control, including objects, subclassing, dynamic of dynamic security classes. Several simple program- tests, access control, and exceptions. This paper defines ming languages have been proposed to allow this static check- the JFlow language and presents formal rules that are used to ing [DD77, VSI96, ML97, SV98, HR98]. However, the check JFlow programs for correctness. Because check- focus of these languages was correctly checking information ing is static, there is little code space, data space, or run-time flow statically, not providing a realistic programming model. overhead in the JFlow implementation. This paper describes the new language JFlow, an extension to the Java language [GJS96] that permits static checking of

flow annotations. JFlow seems to be the first practical pro- Intro duction 1 gramming language that allows this checking. Like other re- cent approaches [VSI96, ML97, SV98, HR98, ML98], JFlow Protection for the privacy of data is becoming increasingly treats static checking of flow annotations as an extended form important as data and programs become increasingly mobile. of type checking. Programs written in JFlow can be statically Conventional security techniques such as discretionary ac- checked by the JFlow compiler, prevents information cess control and information flow control (including manda- leaks through storage channels [Lam73]. JFlow is intended tory access control) have significant shortcomings as privacy- to support the writing of secure servers and applets that ma- protection mechanisms. nipulate sensitive data. The hard problem in protecting privacy is preventing pri- An important philosophical difference between JFlow and vate information from leaking through computation. Access other work on static checking of information flow is the focus control mechanisms do not with this kind of leak, since on a usable programming model. Despite a long , static information flow analysis has not been widely accepted as a This research was supported in part by DARPA Contract F30602-96-C-0303, monitored security technique. One major reason is that previous models by USAF Rome Laboratory, and in part by DARPA Contract F30602-98-1-0237, also of static flow analysis were too limited or too restrictive to monitored by USAF Rome Laboratory. be used in practice. The goal of the work presented in this Copyright c 1999 by the Association for Computing Machinery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom paper has been to add enough power to the static checking use is granted without fee provided that copies are not made or distributed for profit framework to allow reasonable programs to be written in a or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must natural manner. be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to This work has involved several new contributions: JFlow post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, extends a complex programming language and supports many or [email protected]. Lab els language features that have not been previously integrated 2.1 with static flow checking, including mutable objects (which subsume function values), subclassing, dynamic type tests, In the decentralized label model, data values are labeled and exceptions. JFlow also provides powerful new features with security policies. A label is a generalization of the that make information flow checking restrictive and more usual notion of a security class; it is a set of policies that convenient than in previous programming languages: restrict the movement of any data value to which the label is attached. Each policy in a label has an owner O, which  It supports the decentralized label model [ML97, ML98], which allows multiple principals to protect their is a principal whose data was observed in order to create the privacy even in the presence of mutual distrust. It value. Principals are users and other authority entities such also supports safe, statically-checked declassification, as groups or roles. Each policy also has a set of readers, or downgrading, allowing a principal to relax its own which are principals that O allows to observe the data. A privacy policies without weakening policies of other single principal may be the owner of multiple policies and

principals. may appear in multiple reader sets.

f o r r o r r g For example, the label L = 1: 1, 2; 2: 2, 3 has two

 It provides a simple but powerful model of access con- o policies in it (separated by semicolons), owned by o 1 and 2

trol that allows code privileges to be checked statically,

r r respectively. The policy of principal o 1 allows 1 and 2 to

and also allows authority to be granted and checked

r r read; the policy of principal o 2 allows 2 and 3 to read. The dynamically. effective reader set contains only the common reader r 2. The

least restrictive label possible is the label fg, which contains  It provides label polymorphism, allowing code that is generic with respect to the security class of the data it no policies. Because no principal expresses a privacy interest

manipulates. in this label, data labeled by fg is completely public as far as the labeling scheme is concerned.

 Run-time label checking and first-class label values pro- There are two important intuitions behind this model: first, vide a dynamic escape when static checking is too re- data may only be read by a user of the system if all of the strictive. Run-time checks are statically checked to en- policies on the data list that user as a reader. The effective sure that information is not leaked by the success or policy is an intersection of all the policies on the data. Second, failure of the run-time check itself. a principal may choose to relax a policy that it owns. This is a safe form of declassification — safe, because all of the  Automatic label inference makes it unnecessary to write many of the annotations that would otherwise be re- other policies on the data are still enforced. quired. A process has the authority to act on behalf of some (pos- sibly empty) set of principals. The authority possessed by The JFlow compiler is structured as a source-to-source a process determines the declassifications that it is able to translator, so its output is a standard Java program that can perform. Some principals are also authorized to act for other be compiled by any Java compiler. For the most part, trans- principals, creating a principal hierarchy. The principal hi- lation involves removal of the static annotations in the JFlow erarchy may change over time, but revocation is assumed to program (after checking them, of course). There is little occur infrequently. The meaning of a label is affected by the code space, data space, or run time overhead, because most 0 current principal hierarchy. For example, if the principal r

checking is performed statically. r can act for the principal r , then if is listed as a reader by

The remainder of this paper is structured as follows: Sec- 0 a policy, r is effectively listed by that policy as well. The tion 2 contains an overview of the JFlow language and a meaning of a label under different principal hierarchies is rationale for the decisions taken. Section 3 discusses static discussed extensively in an earlier paper [ML98]. checking, sketches the framework used to check program Every variable is statically bound to a static label. (The constructs in a manner similar to type checking, and both for- alternative, dynamic binding, largely prevents static analysis mally and informally presents some of the rules used. This and can be simulated in JFlow if needed.) If a value v has label

section also describes the translations that are performed by

x L L1 and a variable has label 2, we can assign the value to

the compiler. Section 4 compares this work to other work in

= v L L the variable (x : ) only if 1 can be relabeled to 2, which

related areas, and Section 5 provides some conclusions. The

v L is written as L1 2. The definition of this binary relation

grammar of JFlow is provided for reference in Appendix A.

v L L on labels is intuitive: L1 2 if for every policy in 1, there

is some policy in L 2 that is least as restrictive [ML98]. Language overview 2 Thus, the assignment does not leak information.

In this system, the label on x is assigned by the programmer

This section presents an overview of the JFlow language and writes the code that uses x. The power to select a label v a rationale for its design. JFlow is an extension to the Java for x does not give the programmer the ability to leak ,

language that incorporates the decentralized label model. In because the static checker permits the assignment to x only if

Section 2.1, the previous work on the decentralized label the label on x is sufficiently restrictive. After the assignment,

model [ML97, ML98] is reviewed. The language descrip- the static binding of the label of x prevents leakage. (Changes

tion in the succeeding sections focuses on the differences in who can read the value in x are effected by modifying the between JFlow and Java, since Java is widely known and principal hierarchy, but changes to the principal hierarchy well-documented [GJS96]. require appropriate privilege.)

2

fpublicg x;

Computations (such as multiplying two numbers) cause int

b o oleanfsecretg b; joining ( t ) of labels; the label of the result is the least restric-

tive label that is at least as restrictive as the labels of the values :::

= 0;

used in the computation; that is, the least upper bound of the int x

(b) f

labels. The of two sets of policies is simply the union if

x = 1;

of the sets of policies. The relation v generates a lattice of g

equivalence classes of labels with t as the LUB operator. Lattice properties are important for supporting automatic la-

bel inference and label polymorphism [ML97, ML98]. The Figure 1: Implicit flow example

 B A v B ^ B v A notation A is also used as a shorthand for (which does not mean that the labels are equal [ML98]). static type of each expression is a supertype of the actual, Declassification provides an escape hatch from strict infor- run-time type of every value it might produce; similarly, the mation flow tracking. If the authority of a process includes a goal of label checking is to ensure that the apparent label of principal p, a value may be declassified by dropping policies every expression is at least as restrictive as the actual label owned by principals that p acts for. The ability to declassify of every value it might produce. In addition, label checking provides the opportunity for p to choose to release informa- guarantees that, except when declassification is used, the tion based on a more sophisticated analysis. apparent label of a value is at least as restrictive as the actual All practical information flow control systems provide the label of every value that might affect it. In principle, the ability to declassify data because strict information flow con- actual label could be computed precisely at run time. Static trol is too restrictive to write real applications. More com- checking ensures that the apparent, static label is always plex mechanisms such as inference controls [Den82] often a conservative approximation of the actual label. For this are used to decide when declassification is appropriate. In reason, it is typically unnecessary to represent the actual previous systems, declassification is performed by a trusted label at run time. subject: code having the authority of a highly trusted princi- A labeled type may occur in a JFlow program in most pal. One key advantage of the new label structure is that it places where a type may occur in a Java program. For exam- is decentralized: it does not require that all other principals ple, variables may be declared with labeled type:

in the system trust a principal p’s declassification decision,

intfp:g x;

since p cannot weaken the policies of principals that it does

fxg y;

not act for. int

int z;

Lab eled typ es 2.2 The label may always be omitted from a labeled type, as in

the declaration of z. If omitted, the label of a local variable This section begins the description of the new work in this pa- is inferred automatically based on its uses. In other contexts per (the JFlow programming language), which incorporates where a label is omitted, a context-dependent default label the label model just summarized. In a JFlow program, a label is generated. For example, the default label of an instance

is denoted by a label expression, which is a set of component variable is the public label fg. Several other cases of default expressions. As in Section 2.1, a component expression of label assignment are discussed later. the form owner: reader1, reader2, ::: denotes a policy. A

label expression is a series of component expressions, sep-

2.3 Implicit ows

o r r o r r g arated by semicolons, such as f 1: 1, 2; 2: 2, 3 .Ina program, a component expression may take additional forms; In JFlow, the label of an expression’s value varies depending for example, it may be simply a variable name. In that case, on the evaluation context. This somewhat unusual property

it denotes the set of policies in the label of that variable. The is needed to prevent leaks through implicit flows: channels

ag label f contains a single component; the meaning of the created by the control flow structure itself.

label is that the value it labels should be as restricted as the Consider the code segment of Figure 1. By examining the

fa; o: rg x

variable a is. The label contains two components, value of the variable after this segment has executed, we b

indicating that the labeled value should be as restricted as a can determine the value of the secret boolean , even though x

is, and also that the principal o restricts the value to be read has only been assigned constant values. The problem is the x=1 by at most r. assignment , which should not be allowed.

In JFlow, every value has a labeled type that consists of To prevent information leaks through implicit flows, the pc two parts: an ordinary Java type such as int, and a label that compiler associates a program-counter label ( ) with every describes the ways that the value can propagate. The type and statement and expression, representing the information that label parts of a labeled type act largely independently. Any might be learned from the knowledge that the statement or

type expression t may be labeled with any label expression expression was evaluated. In this program, the value of pc

g f g if fbg

fl . This labeled type expression is written as t l ; for during the consequent of the statement is . After the

fp:g if pc = fg b

example, the labeled type int represents an integer that statement, , since no information about can be

p if principal p owns and only can read (the owner of a policy deduced from the fact that the statement after the statement

is always implicitly a reader). is executed. The label of a literal expression (e.g., 1)isthe

fbg The goal of type checking is to ensure that the apparent, same as its pc,or in this case. The unsafe assignment

3

lab elfLg lb; class Account f

intflbg x; nal principal customer;

intfp:g y; Stringfcustomer:g name;

switch lab el(x) f oatfcustomer:g balance;

case (intfyg z) y = z; g

else thrownew UnsafeTransfer(); g Figure 3: Bank account using run-time principals

Figure 2: Switch label Run-time labels can be manipulated statically, though con-

servatively; they are treated as an unknown but fixed label.

=1 x

(x ) in the example is prevented because the label of The presence of such opaque labels is not a problem for static

publicg 1

(f ) is not at least as restrictive as the label of in this analysis, because of the lattice properties of these labels. For

bg fsecretg

expression, which is f ,or .

L L v L

example, given any two labels L 1 and 2 where 1 2,

L t L v L t L it is the case for any third label L 3 that 1 3 2 3.

This implication makes it possible for an opaque label L 3 to Run-time lab els 2.4 appear in a label without preventing static analysis. Using it, In JFlow, labels are not purely static entities; they may also be unknown labels, including run-time labels, can be propagated used as values. First-class values of the new primitive type statically.

lab el represent labels. This functionality is needed when

the label of a value cannot be determined statically. For

Authority and declassi cation example, if a bank stores a number of customer accounts as 2.5 elements of a large array, each account might have a different JFlow has capability-like access control that is both dynam- label that expresses the privacy requirements of the individual ically and statically checked. A method executes with some customer. To implement this example in JFlow, each account authority that has been granted to it. The authority is es- can be labeled by an attached dynamic label value. sentially the capability to act for some set of principals, and

A variable of type lab el may be used both as a first-class controls the ability to declassify data. Authority also can be value and as a label for other values. For example, methods used to build more complex access control mechanisms. can accept arguments with run-time labels,as in the following At any given point within a program, the compiler under- method declaration: stands the code to be running with the ability to act for some

set of principals, called the static authority of the code at that

f*lbg compute(int xf*lbg, lab el lb) static oat point. The actual authority may be greater, because those

In this example, the component expression *lb denotes the principals may be able to act for other principals.

label contained in the variable lb, rather than the label of The principal hierarchy may be tested at any point using the

actsFor actsFor(p p ) lab el

the variable lb. To preserve safety, variables of type statement. The statement 1, 2 S executes p (such as lb) may be used to construct labels only if they are the statement S if the principal 1 can act for the principal

immutable after initialization; in Java terminology, if they are p2. Otherwise, the statement S is skipped. The statement nal nal. (Unlike in Java, arguments in JFlow are always .) S is checked under the assumption that this acts-for relation

The important power that run-time labels add is the ability exists: for example, if the static authority includes p 1, then p to be examined at run-time, using the switch lab el statement. during static checking of S, it is augmented to include 2.

An example of this statement is shown in Figure 2. The code A program can use its authority to declassify a value. The

e, L)

in this figure attempts to transfer an integer from the variable expression declassify( relabels the result of an expres-

e L y

x to the variable . This transfer is not necessarily safe, sion with the label . Declassification is checked statically, lb

because x’s label, , is not known statically. The statement using the static authority at the point of declassification. The declassify examines the run-time label of the expression x, and executes expression may relax policies owned by principals

one of several case statements. The statement executed is in the static authority. the first whose associated label is at least as restrictive as

the expression label; that is, the first statement for which the

Run-time principals

assignment of the expression value to the declared variable 2.6

flbgvfp g

(in this case, z) is legal. If it is the case that : , the Like labels, principals may also be used as first-class values rincipal

first arm of the switch will be executed, and the transfer will at run time. The type p represents a principal that is a

nal principal

occur safely via z. Otherwise, the code throws an exception. value. A variable of type may be used as if it nal

Since lb is a run-time value, information may be transferred were a real principal. For example, a policy may use a rincipal

through it. This can occur in the example by observing which variable of type p to name an owner or reader. These

actsFor of the two arms of the switch are executed. To prevent this variables may also be used in statements, allowing

information channel from becoming an information leak, the static reasoning about parts of the principal hierarchy that lb

pc in the first arm is augmented to include ’s label, which is may vary at run time. When labels are constructed using

Lg

f . The code passes static checking only if the assignment run-time principals, declassification may also be performed

z fLgvfyg from y to is legal; that is, if . on these labels.

4

Vector[lab el L] extends AbstractList[L] f Vector[fsecretg]  Vector[fpublicg]

public class that , since subtyping is

private intfLg length;

invariant in the parameter L (the subtype relation is denoted

private ObjectfLg[]fLg elements;

here by ). When such a relation is sound, the parameter

riant lab el lab el

may be declared as a cova rather than as a ,

Vector() :::

public which places additional restrictions on its use. For example,

Object elementAt(int i):fL; ig

public no method argument or mutable instance variable may be

throws (ArrayIndexOutOfBoundsException) f

elements[i]; return labeled using the parameter.

g A class always has one implicit label parameter: the label

public void setElementAtfLg(Objectfg o, intfg i) :::

thisg

f , which represents the label on an object of the class.

public intfLg size() f return length; g

v L C fL g

Because L1 2 implies that 1 acts like a subtype of

public void clearfLg() :::

fL g this C 2 , the label of is necessarily a covariant parameter, g and its use is restricted in the same manner as with other covariant parameters.

Figure 4: Parameterization over labels A class may have some authority granted to its objects by

rity adding an autho clause to the class header. The author- ity clause may name principals external to the program, or Run-time principals are needed in order to model systems principal parameters. If the authority clause names external that are heterogeneous with respect to the principals in the principals, the process that installs the class into the system system, without resorting to declassification. For example, must have the authority of the named principals. If the au- a bank might store bank accounts with the structure shown thority clause names principals that are parameters of the in Figure 3, using run-time principals rather than run-time class, the code that creates an object of the class must have

labels. With this structure, each account may be owned by the authority of the actual principal parameters used in the

C C

a different principal (the customer whose account it is). The call to the constructor. If a class has a superclass s ,any

authority C authority

security policy for each account has similar structure but is in s must be covered by the clause of C owned by the principal in the instance variable customer. . It is not possible to obtain authority by inheriting from a Code can manipulate the account in a manner that is generic superclass. with respect to the contained principal, but can also determine

at run-time which principal is being used. The principal cus-

2.8 Metho ds

actsFor tomer may be manipulated by an statement, and the

Like class declarations, JFlow method declarations also con-

customer:g switch lab el label f may be used by a statement. tain some extensions. There are a few optional annotations

to manage information flow and authority delegation. A Classes 2.7 method header has the following syntax (in the form of the Even in the type domain, parameterizing classes is important Java Language Specification [GJS96]): for building reusable data structures. It is even more impor- MethodHeader:

tant to have polymorphism in the information flow domain; Modifiersopt LabeledType Identifier ) the usual way to handle the absence of statically-checked type BeginLabelopt ( FormalParameterListopt EndLabelopt polymorphism is to perform dynamic type casts, but this ap- Throwsopt WhereConstraintsopt proach works poorly when applied to information flow since FormalParameter: new information channels are created by dynamic tests. LabeledType Identifier OptDims To allow usable data structures in JFlow, classes may be parameterized to make them generic with respect to some The return value, the arguments, and the exceptions may number of labels or principals. Class and interface decla- each be individually labeled. One subtle change from Java

rations are extended to include an optional set of explicitly is that arguments are always implicitly nal, allowing them

declared parameters. to be used as type parameters. This change is made for the

ector

For example, the Java V class is translated to JFlow convenience of the programmer and does not significantly

ector as shown in Figure 4. V is parameterized on the label change the power of the language.

L, which represents the label of the contained elements. As- There are also two optional labels called the begin-label public

suming that secret and are appropriately defined, the and the end-label. The begin-label is used to specify any

ector[fsecretg] Vector[fpublicg] pc types V and would represent restriction on at the point of invocation of the method.

vectors of elements of differing sensitivity. Without the abil- The end-label — the final pc — specifies what information

ity to parameterize classes on labels, it would be necessary can be learned by observing whether the method terminates

ector to reimplement V for every distinct element label. normally. Individual exceptions and the return value itself The addition of label and principal parameters to also may have their own distinct labels, which provides fine- JFlow makes parameterized classes into simple dependent grained tracking of information flow. types [Car91], since types contain values. To ensure that In Figure 5 are some examples of JFlow method declara- these dependent types have a well-defined meaning, only tions. When labels are omitted in a JFlow program, a default

immutable variables may be used as parameters. label is assigned. The effect of these defaults is that often

secretg v fpublicg Note that even if f , it is not the case methods require no label annotations whatever. Labels may

5

static intfx;yg add(int x, int y) f return x + y; g class passwordFile authority(ro ot) f

b o olean compare str(String name, String ):fname; pwdg public b o olean

throws(NullPointerException) f ::: g check (String user, String password)

b o olean storefLg(intfg x) where authority(ro ot) f

ws(NotFound) f ::: g //

thro Return whether password is correct

b o olean match = false;

try f

r (int i = 0; i < names.length; i++) f

Figure 5: JFlow method declarations fo

if (names[i] == user &&

passwords[i] == password) f

= true;

be omitted from a method declaration, signifying the use of match

reak;

implicit label polymorphism. For example, the arguments of b

g

compare str add and are unlabeled. When an argument label is omitted, the method is generic with respect to the label of g

the argument. The argument label becomes an implicit pa- g

catch (NullPointerException e) fg

rameter of the procedure. For example, the method add can

catch (IndexOutOfBoundsException e) fg

x y

(match, fuser; passwordg); be called with any two integers and , regardless of their return declassify

labels. This label polymorphism is important for building g

private String [ ] names;

libraries of reusable code. Without it, a math routine like add

rivate String f ro ot: g [ ] passwords; would have to be reimplemented for every argument label p ever used. g The default label for a return value is the end-label, joined

with the labels of all the arguments. For add, the default Figure 6: A JFlow password file

x;yg return value label is exactly the label written (f ), so the

return value could be written just as int. The default label

compare str on an exception is the end-label, as in the - devolves from the caller. A method with a caller clause ample. If the begin-label is omitted, as in add, it becomes may be called only if the calling code possesses the

an implicit parameter to the method. Such a method can be requisite static authority. pc called regardless of the caller’s pc. Because the within the

The principals named in the caller clause need not be method contains an implicit parameter, this method is pre- constants; they may also be the names of method argu-

vented from causing real side effects; it may of course modify rincipal ments whose type is p . By passing a principal as local variables and mutate objects passed as arguments if they the corresponding argument, the caller grants that prin- are appropriately declared, but true side effects would create cipal’s authority to the code. These dynamic principals static checking errors. may be used as first-class principals; for example, they Unlike in Java, the method may contain a list of constraints may be used in labels.

prefixed by the keyword where:

actsFor(p ,p ) actsFor WhereConstraints:  1 2 An constraint may be used to prevent the method from being called unless the spec-

where Constraints p ified acts-for relationship (p 1 acts for 2) holds at the

Constraint: call site. When the method body is checked, the static

rity( )

autho Principals principal hierarchy is assumed to contain any acts-for caller (

Principals ) relationships declared in the method header. This con-

or( , ) actsF Principal Principal straint allows information about the principal hierarchy to be transmitted to the called method without any dy- There are three different kinds of constraints:

namic checking.

 authority(p ;:::;p )

1 n This clause lists principals that

Example: passwordFile the method is authorized to act for. The static authority at 2.9 the beginning of the method includes the set of principals listed in this clause. The principals listed may be either Now that the essentials of the JFlow language are covered, we names of global principals, or names of class parameters are ready to consider some interesting JFlow code. Figure 6

contains a JFlow implementation of a simple password file, rincipal of type p . Every listed principal must be also

in which the passwords are protected by information flow

rity listed in the autho clause of the method’s class. This mechanism obeys the principle of least privilege, since controls. Only the method for checking passwords is shown.

not all the methods of a class need to possess the full This method, check, accepts a password and a user name, authority of the class. and returns a boolean indicating whether the string is the

right password for that user.

 caller(p ;:::;p ) if pass-

1 n Calling code may also dynamically The statement is conditional on the elements of

words user password

grant authority to a method that has a caller constraint. and on the variables and , whose labels

rity if

Unlike with the autho clause, where the authority are implicit parameters. Therefore, the body of the state-

= fuser; password; ro ot:g devolves from the object itself, authority in this case ment has pc , and the variable

6

Protected f Protected

class implementation shows, a is an immutable pair

nal lab elfthisg lb;

Object lb

containing a value content of type and a label

flbg content; Object that protects the value. Its value can be extracted with the

get method, but the caller must provide a label to use for

ProtectedfLLg(ObjectfLLg x, lab el LL) f

public extraction. If the label is insufficient to protect the data,

= LL; // lb must occur before call to super()

an exception is thrown. A value of type Protected behaves

sup er(); //

content = x; //

checked assuming lb == LL very much like a value in dynamic-checked information flow Protected

g systems, since it carries a run-time label. A has an

ObjectfLg get(lab el L):fLg

public obvious analogue in the type domain: a value dynamically

throws (IllegalAccess) f

associated with a type tag (e.g.,the Dynamic type [ACPP91]).

switch lab el(content) f

One key to making Protected convenient is to label the

case (ObjectfLg unwrapp ) return unwrapp ed;

fthisg Pro-

instance variable lb with . Without this labeling,

else throw new IllegalAccess();

tected would need an additional explicit covariant label pa- g

rameter to label lb with.

g

lab el() f public lab el get

return lb;

2.11 Limitations g g JFlow is not completely a superset of Java. Certain features have been omitted to make information flow control tractable.

Figure 7: The Protected class Also, JFlow does not eliminate all possible information leaks. Certain covert channels (particularly, various kinds of timing channels) are difficult to eliminate. Prior work has addressed

match also must have this label in order to allow the assign- static control of timing channels, though the resulting rules match ment match = true. This label prevents from being are restrictive [AR80, SV98]. Other covert channels arise

returned directly as a result, since the label of the return value from Java language features:

user; passwordg is the default label, f . Finally, the method

Threads. JFlow does not prevent threads from communi- declassifies match to this desired label, using its compiled-in

cating covertly via the timing of asynchronous modifications

NullPoint- authority to act for ro ot. Note that the exceptions

to shared objects. This covert channel can be prevented by IndexOutOfBoundsException erException and must be ex- requiring only single-threaded programs.

plicitly caught, since the method does not explicitly declare channels. them. More precise reasoning about the possibility of excep- Timing JFlow cannot prevent threads from tions would make JFlow code more convenient to write. improperly gaining information by timing code with the sys- Otherwise there is very little difference between this code tem clock, except by removing access to the clock.

and the equivalent Java code. Only three annotations have HashCo de. In Java, the built-in implementation of the

authority Object

been added: an clause stating that the principal hashCo de method, provided by the class , can be used declassify

ro ot trusts the code, a expression, and a label on to communicate information covertly. Therefore, in JFlow

passwords the elements of . The labels for all local variables every class must implement its own hashCo de.

and return values are either inferred automatically or assigned riables. Static va The order of static variable initializa- sensible defaults. The task of writing programs is made easier tion can be used to communicate information covertly. In in JFlow because label annotations tend to be required only JFlow, this channel is blocked by ruling out static variables. where interesting security issues are present. However, static methods are legal. This restriction does not In this method, the implementor of the class has decided significantly hurt expressive power, since a program that uses that declassification of match results in an acceptably small static variables usually can be rewritten as a program in which leak of information. Like all login procedures, this method the static variables are instance variables of an object. The does leak information, because exhaustively trying pass- order of initialization of these objects then becomes explicit words will eventually extract the passwords from the pass- and susceptible to analysis. word file. However, assuming that the space of passwords

Finalizers. Finalizers are run in a separate thread from is large and passwords are difficult to guess, the amount of the main program, and therefore can be used to communicate information gained in each trial is far less than one bit. Rea- covertly. Finalizers are not part of JFlow.

soning processes about acceptable leaks of information lie

exhaustion. OutOfMemoryError outside the domain of information flow control, but in this Resource An can be system, such reasoning processes can be accommodated in a used to communicate information covertly, by condition- natural and decentralized manner. ally allocating objects until the heap is exhausted. JFlow treats this error as fatal, preventing it from communicating

more than a single bit of information per program execu-

2.10 Example: Protected

wError tion. Other exhaustion errors such as StackOver o are treated similarly.

The class Protected provides a convenient way of managing

all-clo ck timing channels. run-time labels, as in the bank account example mentioned A JFlow program can earlier. Its implementation is shown in Figure 7. As the change its run time based on private information it has ob-

7 served. As an extreme example, it can enter an infinite loop. through a return statement, and so on. This fine-grained anal- JFlow does not attempt to control these channels. ysis avoids the unnecessary restrictiveness that would be pro-

duced by desugaring exceptions. Each exception that can be

ed exceptions. Uncheck Java allows users to define

exceptions that need not be declared in method headers raised by evaluating a statement or expression has a possibly catch (unchecked exceptions), although this practice is described distinct label that is transferred to the pc of statements as atypical [GJS96]. In JFlow, there are no unchecked ex- that might intercept it. Even finer resolution is provided for ceptions, since they could serve as covert channels. normal termination and for return termination; for example,

the label of the value of an expression may differ from the

yp e discrimination on parameters. T JFlow supports label associated with normal termination. Finally, termina-

the run-time cast and instanceof operators of standard Java,

reak continue tion of a statement by a b or statement is also

but they may only be invoked using classes that lack parame-

reak continue tracked without confusing distinct b or targets. ters. The reason for this restriction is that information about The path labels for a statement or expression are repre- the parameters is not available at run time. These operators sented as a map from symbols to labels. Each mapping could be permitted if the parameters were statically known represents a termination path that the statement or expres- to be matched, but this is not currently supported.

sion might take, and the label of the mapping indicates what

ard compatibility. Backw JFlow is not backward com- information may be transmitted if this path is known to be patible with Java, since existing Java libraries are not flow- the actual termination path. The domain of the map includes checked and do not provide flow annotations. However, in several different kinds of entities:

many cases, a Java library can be wrapped in a JFlow library n

that provides reasonable annotations.  The symbol , which represents normal termination. r

 The symbol , which represents termination through a

3 Static checking and translation

return statement.

Throwable This section covers the static checking that the JFlow com-  Classes that inherit from . A mapping from piler performs as it translates code, and the translation process a class represents termination by an exception.

itself.

nv rv  The symbols and represent the labels of the nor-

mal value of an expression and the return value of a Exceptions 3.1 statement, respectively. They do not represent paths An important limitation of earlier attempts to create lan- themselves, but it is convenient to include them as part guages for static flow checking has been the absence of usable of the map. Their labels are always at least as restrictive exceptions. For example, in Denning’s original work on static as the labels of the corresponding paths.

flow checking, exceptions terminated the program [DD77]

hgoto Li  A tuple of the form represents termination

because any other treatment of exceptions seemingly leaked

reak continue by executing a named b or statement that

information. Subsequent work has avoided exceptions en-

break continue jumps to the target L.A or statement tirely. that does not name a target is represented by the tuple

It might seem unnecessary to treat exceptions directly,

goto i h . These tuples are always mapped to the label since in many languages, a function that generates excep- > since the static checking rules do not use the actual tions can be desugared into a function that returns a discrim- label. inated union or oneof. However, there are problems with

this approach. The obvious way to handle oneofs causes all Path labels are denoted by the letter X in this paper, and s

exceptions to carry the same label — an unacceptable loss of members of the domain of X (paths) are denoted by . The

[s] X s

precision. Also, Java exceptions are actually objects, and the expression X denotes the label that maps to, and the

X [s = L]

::: catch

try statement functions like a typecase. This model expression : denotes a new map that is exactly

s L

cannot be translated directly into a oneof. like X except that is bound to . Path labels may also ;

Nevertheless, it is useful to consider how oneof types might map a symbol s to the pseudo-label , indicating that the ;

be handled in JFlow. The obvious way to treat oneof types statement cannot terminate through the path s. The label

t L = L

is by analogy with record types. Each arm of the oneof has acts as the bottom of the label lattice; ; for all labels

L fg X

a distinct label associated with it. In addition, there is an , including the label . The special path labels ; map ; added integer field tag that indicates which of the arms of all paths to , corresponding to an expression that does not

the oneof is active. The problem with this model is that terminate.

taggvpc

every assignment to the oneof will require that f ,

tagg

and every attempt to use the oneof will implicitly read f .

Typ e checking vs. lab el checking As a result, every arm of the oneof will effectively carry the 3.2 same label. For modeling exceptions, this is unacceptable. The JFlow compiler performs two kinds of static checking For each expression or statement, the static checker deter- as it compiles a program: type checking and label checking. mines its path labels, which are the labels for the informa- These two aspects of checking cannot be entirely disentan- tion transmitted by various possible termination paths: nor- gled, since labels are type constructors and appear in the rules mal termination, termination through exceptions, termination for subtyping. However, the checks needed to show that a

8

 

g

A [C ]=hclass C [::P ::] ::: f:: :gi

i true

0 0

A ` X [n = A[pc]; nv = A[pc]]

(A `Q Q ) _ (P = hcovariant lab el i^A `Q vQ )

i i i

id literal : ; : :

i i

0

A ` C [::Q ::]  C [::Q ::]

i

T i

true

 

g

A ` X [n = A[pc]]

A [C ]=hclass C [::P ::] extends t ::: f:: :gi

s

i

;: ; :

(t ; (C [::Q ::])) T =

s s

interp-T class- i

0 0

A ` T  C [::Q ::]

s

T

 

i

0 0

i [v ]= hvar nal T fLg

A uid

A ` C [::Q ::]  C [::Q ::]

i

T

i

X = X [n = A[pc]; nv = L t A[pc]]

; : :

` v X A :

Figure 8: Subtype rules

A ` E

: X

i [v ]=hvar T fLg A uid

statement or expression is safe largely can be classified as

` X [nv] v L

either type or label checks. This paper focuses on the rules A

` v = E X A : for checking labels, since the type checks are almost exactly the same as in Java.

There are several kinds of judgements made during static

A ` S

1 : X1

A ` E T E

checking. The judgment T : means that has

A; S )[pc = X [n]] ` S X

extend ( 1 : 1 2 : 2

A A ` E X

type T in environment . The judgment : is

= X [n = ; ]  X X 1 : 2

the information-flow counterpart: it means that E has path

` S S X

A 1; 2 :

X A ` labels in environment . The symbol T is used to de-

note inferences in the type domain. The environment A

maps identifiers (e.g., class names, parameter names, vari-

(X = X X )  8s (X [s]=X [s] t X [s])

able names) to various kinds of entities. As with path labels, 1  2 1 2

[s] s A

the notation A is the binding of symbol in . The nota-

[s = B ] s B tion A : is a new environment with rebound to .

In the rules given here, it is assumed that the declarations of Figure 9: Some simple label-checking rules g all classes are found in the global environment, A .

A few more comments on notation will be helpful at this g

A [:: (P ) = Q ::] i bound to the actual parameters: param-id i :

point. The use of large brackets indicates an optional syntac-

ector[L]

Using this rule, V (from Figure 4) would be a

T t

tic element. The letter represents a type, and represents 0

L  L subtype of AbstractList[L'] only if . Java arrays

a type expression. The letter C represents the name of a

fLg[]

(written as T ) are treated internally as a special type l

class. The letter L represents a label, and represents a L with two parameters, T and . As in Java, they are covariant

label expression.  represents an labeled type expression; L in T , but like most JFlow classes, invariant in . User-defined

that is, a pair containing a type expression and an optional types may not be parameterized on other types.

t; A)

label expression. The function interp-T ( converts type T

If S and are not instantiations of the same class, it is nec-

l; A)

expressions to types, and the function interp-L( converts T essary to walk up the type hierarchy from S to , rewriting label expressions to labels. The letter v represents a variable parameters, as shown in the second rule in Figure 8. Together, name. The letter P represents a formal parameter of a class, the two rules inductively prove the appropriate subtype rela- and the letter Q represents an actual parameter used in an tionships.

instantiation of a parameterized class.

3.4 Lab el-checking rules

3.3 Subtyp e rules Let us consider a few examples of static checking rules.

There are some interesting interactions between type and Space restrictions prevent presentation of all the rules, but a

A ` S  T

label checking. Consider the judgment T , mean- complete description of the static checking rules of JFlow is T

ing “S is a subtype of ”. This judgement must be made in available [Mye99]. T JFlow, as in all languages with subtyping. Here, S and are Consider Figure 9, which contains some of the most basic ordinary unlabeled types. The subtype rule, shown in Fig- rules for static checking. The first rule shows that a literal

ure 8, is as in Java, except that it must take account of class expression always terminates normally and that its value is

T pc parameters. If S or is an instantiation of a parameterized labeled with the current , as described earlier. The sec- class, subtyping is invariant in the parameters except when a ond rule shows that an empty statement always terminates

label parameter is declared to be covariant. This subtyping normally, with the same pc as at its start. rule is the first one shown in Figure 8. The function class-env, The third rule shows that the value of a variable is labeled

used in the figure, generates an extension of the global envi- with both the label of the variable and the current pc. Note ronment in which the formal parameters of a class (if any) are that the environment maps a variable identifier to an entry

9 ment is allowed if the variable’s label is more restrictive than

that of the value being assigned (which will include the cur-

A ` E X a a :

rent pc). Whether one label is more restrictive than other is

A[pc = X [n]] ` E X

a i

: i : inferred using the current environment, which contains in-

= X [n]] ` E X A[pc

v v

: i : formation about the static principal hierarchy. The complete

X = (X  X  X ;X [nv]; NullPointerException)

a v a

1 exc i rule for checking this statement would have an additional

= (X ;X [nv] t X [nv]; OutOfBoundsException) X a

exc i

` E T 2 1 A

antecedent T : , but such type-checking rules have

X = (X ;X [nv] t X [nv]; ArrayStoreException) v

exc 2 a been omitted in the interest of space.

A ` E T fL g[]

a a S

T : The final rule in Figure 9 covers two statements 1 and

A ` X [nv ] t X [n] v L

S v

a 2 performed in sequence. The second statement is executed

A ` E [E ]= E X

a v i : only if the first statement terminated normally, so the correct

pc for checking the second statement is the normal path label

[n] of the first statement (X1 ). The function extend extends

the environment A to add any local variable declarations in

S

A ` E X

: E the statement 1. The path labels of the sequence must be

= X [nv]] ` S X A[pc

: E 1 : 1 at least as restrictive as path labels of both statements; this



[pc = X [nv]] ` S X A condition is captured by the operator , which merges two

: E 2 : 2

= X [n = ; ]  X  X X sets of path labels, joining all corresponding paths from both.

E : 1 2

` if (E ) S else S X A 1 2 : Figure 10 contains some more complex rules. The rule for

array element assignment mirrors the order of evaluation of E

the expression. First, the array expression a is evaluated, X

yielding path labels a . If it completes normally, the index

E X i

expression i is evaluated, yielding . Then, the assigned

L = fresh-variable ()

0 value is evaluated. Java checks for three possible exceptions

= L; hgoto i = L] = A[pc A : :

0 before performing the assignment. The function exc, defined

A ` E X

: E

0 at the bottom, is used to simplify these conditions. This

= X [nv]] ` S X A [pc S : E :

function creates a set of path labels that are just like X except

A ` X [n ] v L S

that they include an additional path, the exception C , with

= ; ] X =(X  X )[hgoto i S E :

the path label L. Since observation of normal termination

` while (E ) S X

A : nv

(n) or the value on normal termination ( ) is conditional on

X ` do S while (E ) A :

the exception not being thrown, exc joins the label L to these

two mappings as well. Finally, avoiding leaks requires that L

the label on the array elements ( a ) is at least as restrictive

X [nv]

as the label on the information being stored ( v ).

A ` A[pc ] v A[hgoto Li]

The next rule shows how to check an if statement. First,

A ` continue L X [hgoto Li = >]

: ; : X

the path labels E of the expression are determined. Since

A ` break L X [hgoto Li = >]

: ; :

S E pc

execution of S 1 or 2 is conditional on , the for these

E X [nv]

statements must include the value label of , E . Fi-

nally, the statement as a whole can terminate through any of

0

X ` S

A :

S S

the paths that terminate E , 1,or 2— except normal ter-

s 2fn ; rg

S S

mination of E , since this would cause one of 1 or 2 to be

0 0 0 0

(s j s 2 ^ s 6= s) X [s ]=; 8 paths

executed. If the statement has no else clause, the statement

0

= X [s = A[pc]] X :

S2 is considered to be an empty statement, and the second

` S X A : rule in Figure 9 is applied.

The next rule, for the while statement, is more subtle be- cause of the presence of a loop. This rule introduces a label

variable L to represent the information carried by the con-

nv rv paths = all symbols except ,

tinuation of the loop through various paths. L represents an

unknown label that will be solved for later. It is essentially a

(X; L; C )=X  X [n = L; nv = L; C = L] exc ; : : :

loop invariant for information flow. L may carry information

S break from exceptional termination of E or , or from or

Figure 10: More label-checking rules continue statements that occur inside the loop. An entry is

goto i

added to the environment for the tuple h to capture

reak continue

information flows from any b or statements

reak continue

within the loop. The rules for checking b and ,

var T fLg i hvar nal T fLg i of either the form h uid or uid ,

shown below the rule for while, use these environment entries L where T is the variable’s type, is its label, and uid is a

unique identifier distinguishing it from other variables of the to apply the proper restriction on information flow.

L pc X [n]

same name. Assuming that is the entering label, S is the

0 L The fourth rule covers assignment to a variable. Assign- final pc label. The final condition requires that may be at

10

y = true;

A ` E class C f:: :g

try f

T :

if (x) throw new E();

X A ` E

: E

y = false;

X = (X ;X [nv];C)[n = ;] E

exc E :

g

` throw E X

A :

catch (E e) fg

A ` S X

throw

: S Figure 12: Implicit flow using

= (X ;C ) pc i

exc-label S

i

;v g A[pc = pc = hvar nalC fpc ()i ] ` S X

i i

: : fresh-uid i :

i i

L

try:::catch try::: nally throw

X )  =( (X ; (::; C ;::))

X a inside a .) The rule for is

i i uncaught S

i straightforward.

A ` try fS g ::catch(C v ) fS g:: X

i i

i :

:::catch catch The idea behind the try rule is that each

clause is executed with a pc that includes all the paths that

might cause the clause to be executed: all the paths that are

` S X A ` S X A 1 : 1 2 : 2

exceptions where the exception class is either a subclass or

= X [n = ;]  X X 1 : 2

a superclass of the class named in the catch clause. The

` try fS g nally fS g X A 1 2 : function exc-label joins the labels of these paths. The path

labels of the whole statement merge all the path labels of the X

various catch clauses, plus the paths from S that might not

F

0

X [C ] X; C )=

exc-label (

0 0 0

catch

(C C _C C )

C : be caught by some clause, which include the normal X

termination path of S if any.

0

try::: nally

(X; (::; C ;::)))  (X =

uncaught i The rule is very similar to the rule for sequenc-

0

S

X [s]= (if (9i (s  C )) then ; else X [s])

i ing two statements. One difference is that the statement 2 S

is checked with exactly the same initial pc that 1 is, since S S2 is executed no matter how 1 terminates.

Figure 11: Exception-handling rules To see how these exception rules work, consider the code

y b o olean

in Figure 12. In this example, x and are vari- y ables. This code transfers the information in x to by us- most as restrictive as L, which is what establishes the loop

invariant. ing an implicit flow resulting from an exception. In fact,

= x

The last rule in Figure 10 applies to any statement, and is the code is equivalent to the assignment y . Using the w

important for relaxing restrictive path labels. It is intuitive: if rule of Figure 11, the path labels of the thro statement

E !fxgg if

a statement (or a sequence of statements) can only terminate are f , so the path labels of the statement are

= fE ! fxg; n ! fxgg y = false

X . The assignment is pc

normally, the pc at the end is the same as the at the be-

= X [n]=fxg

ginning. The same is true if the statement can only terminate checked with pc , so the code is allowed only

xgv fyg if f . This restriction is correct since it is exactly with a return statement. This rule is called the single-path rule. It would not be safe for this rule to apply to exception what the equivalent assignment statement would have re- paths. To see why, suppose that a set of path labels formally quired. Finally, applying both the try-catch rule here and the

single-path rule from Figure 10, the value of pc after the code contains only a single exception path C . However, that path might include multiple paths consisting of exceptions that are fragment is seen to be the same as at its start. Throwing and catching an exception does not necessarily taint subsequent subclasses of C . These multiple paths can be discriminated

computation.

::: catch using a try statement. The unusual Java exception model prevents the single-path rule from being applied to

exception paths.

3.6 Run-time lab el checking However, Java is a good language to extend for static flow analysis in other ways because it fully specifies evaluation An interesting aspect of checking JFlow is checking the

order. This property makes static checking of information switch lab el statement, which inspects a dynamic label at run

flow simpler, because the rules tend to encode all possible time. The inference rule for checking this statement is given lab el

evaluation orders. If there were non-determinism in evalua- in Figure 13. Intuitively, the switch statement tests

X [nv ] v L i tion order, it could be encoded by adding label variables in a the equation E for every arm until it finds one for

manner similar to the rule for the while statement. which the equation holds, and executes it. However, this cannot be evaluated either statically or at run time. Therefore,

the test is into two stronger conditions: one that can be

Throwing and catching exceptions 3.5 tested statically, and one that can be tested dynamically. This

Exceptions can be thrown and caught safely in JFlow using rule naturally contains the static part of the test. L

the usual Java constructs. Figure 11 shows the rule for the Let RT be the join of all possible run-time-

w try::: catch nally

thro statement, a statement that lacks a representable components (i.e., components that do not

::: nally try

clause, and a try statement. (A statement with mention formal label or principal parameters). The

catch nally X [nv] t L v L t L

RT i RT both clauses and a clause can be desugared into static test is that E (equiva-

11

def

a

A ` (C [Q ]; (::; E ;::); S ;A ;L ;L )

i j

call-begin I

RV

def

a

A ` E X A ` (C [Q ]; S ;A ;L ;L ) X

i I

: E call-end :

RV

(l ;A) L =

A ` (C [Q ]; (::; E ;::); S ) X

i i j

interp-L call i :

A ` X [nv ] v L t L

i

E RT

T A ` E

T :

     

S = h static  m fI g (:: a ::) fRg throws (:: ::) where K i

r

(t ;A) A ` T 

j j l

: k i

T interp-T

X = X [n = A[pc]]

pc = X [n] ; :

0 E 0

A[pc = X [n]] ` E X

pc = pc t (X [nv] t L )

j j

: j : i

label E 1

i

i 1

() L =

A[pc = pc ;v = hvar nal T fL g ()i ] ` S X

j fresh-variable

i i i i

: i : fresh-uid :

i

L

() =

X ) X = X  (

uid j fresh-uid

i

E

i

c

A = (C [Q ])

A ` switch lab el(E )f::case (t fl g v ) S ::g X

class-env i

i i i

i :

a c c

A = A [::a = hvar nal ( ;A )fL g i ::]

j j j

j : type-part uid

 

a

(I; A ) else X ]) L =(if fI g then [n

I

j )

interp-L max(

a

A ` L  (if ( ) then ( ;A ) t L else L )

j j j j labeled label-part I

Figure 13: Inference rule for switch lab el

A ` X [nv] v L

j j

[n A ` X ] v L

I

j )

max(

F

def

X [nv]) L =(if ( = void) then fg else

r

j

RV

j

X [nv] v L t L

i RT

lently, E ); the dynamic test is that

a

(A; A ;A[::a = E ::]; (::K ::))

j j

satisfies-constraints : l

X [nv] u L v L u L

RT i RT

E . Together, these two tests im-

def

a

A ` (C [Q ]; (::E ::); S ;A ;L ;L )

i j

call-begin I

X [nv] v L

RV i ply the full condition E .

The test itself may be used as an information channel, so

pc X [nv ]

after the check, the must include the labels of E and

L

a m

every i up to this point. This rule uses the label function

(p)= (p; A; A ;A ) in let interp interp-P-call

to achieve this. When applied to a label L, it generates

case K of

a new label that joins together the labels of all variables i

rity (:::)

autho : true

L 0

that are mentioned in . However, the presence of label in 0

caller (::p ::) 8(p )9(p 2 A[auth]) A ` p  (p )

j j j : interp

constraint equations does not change the process of solving

;p ) A ` (p )  (p ) or (p actsF : interp interp

label constraints in any fundamental way. 1 2 1 2

end

end

3.7 Checking metho d calls

a m

(A; A ;A ; (::K ::))

satisfies-constraints i

Let us now look at some of the static checking associated

    

with objects. Static checking in object-oriented languages is 

fRg throws (:: ::) where K i S = h static  m fI g (:: a ::)

r

j j l

: k 

often complex, and the various features of JFlow only add to 

a

L = L t (if fRg then (R; A ) else fg) I

the complexity. This section shows how, despite this com- R : interp-L

def

a

L = L t (if ( ) then ( ;A ) else L )

r r R RV labeled label-part

plexity, method calls and constructor calls (via the operator RV

( ; (C [Q ])) C =

new

i k

) are checked statically. k type-part class-env

L

0

X =( X )[n = L ; nv = L ]

j RV

The rules for checking method and constructor calls are : R :

j

0 a

X = X  X [::C ( ;A ) t L ::]

shown in Figures 14 and 15. Figure 14 defines some generic =

R

k k

; : label-part

def

a

` (C [Q ]; S ;A ;L ;L ) X checking that is performed for all varieties of calls, and Fig- A i call-end I : ure 15 defines the rules for checking ordinary method calls, RV static method calls, and constructor calls. To avoid repetition, the checking of both static and non- static method calls, and also constructor calls, is expressed Figure 14: Checking calls in terms of the predicate call, which is defined in Figure 14.

This predicate is in turn expressed in terms of two predicates:

[n]

call-begin and call-end. X

j ) after evaluating all of the arguments, which is max( . The predicate call-begin checks the argument expressions The call site must satisfy all the constraints im- and establishes that the constraints for calling the method are posed by the method, which is checked by the predicate satisfied. In this rule, the functions type-part and label-part satisfies-constraints. The rule for this predicate, also in Fig-

interpret the type and label parts of a labeled type  . The ure 14, uses the function interp-P-call, which maps iden- L

rule determines the begin label I , the default return label tifiers used in the method constraints to the corresponding

def

a A

L , and the argument environment , which binds all the principals. To perform this mapping, the function needs en- RV

method arguments to appropriately labeled types. Invoking a vironments corresponding to the calling code (A), the called

a

E A

method requires evaluation of the arguments j , producing code ( ), and a special environment that binds the actual

m

X A A[auth]

corresponding path labels j . The argument labels are bound arguments ( ). The environment entry contains

a

A L X [nv] v L

j j

in to labels j , so the line ( ) ensures that the the set of principals that the code is known statically to act

` p  p p

actual arguments can be assigned to the formals. The begin- for. The judgement A 1 2 means that 1 is known

L pc p label I is also required to be more restrictive than the statically to act for 2. (The static principal hierarchy is also

12

A ` E C [::Q ::]

s i

T :

[[ actsFor (p ;p ) S ]] =

T 1 2

A ` E T

j j

T :

dynamic PH:actsFor (T [[ p ]] ; T [[ p ]])) T [[ S ]] (

if 1 2

(C [::Q ::];m(::T ::); S ) j

signature i

A ` E X s

s :

T [[ switch lab el(E ) f ::case( t fl g) S :: else S g ]] =

e

i i i

A[pc = X [nv]] ` (C [::Q ::]; (::E ::); S ) X

s j

: call i :

= T [[ E ]]

Tv ;

A ` E :m(::E ::) X s

j :

if (T [[ X [nv] u L ]] :relab elsTo(T [[ L u L ]])) f

RT RT

E 1

T [[ S ]

1 ]

g else :: :

(t; A) =

T interp-T

if (T [[ X [nv] u L ]] :relab elsTo(T [[ L u L ]]) f

i

E RT RT

A ` E T

j j

T :

T [[ S ]]

i

(T; m(::T ::); S )

signature j

g ::: else f T [[ S ]] g

e

(T; (::E ::); S ) X A `

call j :

A ` t:m(::E ::) X

j :

Figure 16: Interesting JFlow translations

T = C [::Q ::]= (t; A)

i interp-T

   

g

A [C ]= hclass C [::P ::] ::: authority (::p ::) :: :i i

l constraints may be policies, label variables, label parameters,

A ` E T

(L) L

j j

T : dynamic labels, or expressions label for some label .

(T; C(::T ::); S )

signature j The constraints can be efficiently solved, using a modifi-

   

S = hC fI g (:: a ::) fRg throws (:: ::) where K i

j j l

: k cation to a lattice constraint-solving algorithm [RM96] that

   

0

S = hstatic T fg m fI g (:: a ::) fRg throws (:: ::) where K i

j j applies an ordering optimization [HDT87] shown to produce l

: k

0

A ` T; (::E ::); S ) X ( the best of several commonly-used iterative dataflow algo-

call j :

rithms [KW94]. The approach is to initialize all variables

p ) 9(p 2 A[auth]) A ` p  (p ; (T )) 8( l parameters l interp-P class-env

in the constraints with the most restrictive label (>) and it-

A ` new t(::E ::) X

j : eratively relax their labels until a satisfying assignment or a contradiction is found. The label does not create problems because it is monotonic. The relaxation steps are ordered by Figure 15: Rules for specific calls topologically sorting the constraints and looping on strongly-

connected components. The number of iterations required is

(nh) h O where is the maximum height of the lattice struc-

placed in the environment.)

(nd) d ture [RM96], and also O where is the maximum back

Finally, the predicate call-end produces the path labels X edges in depth-first traversal of the constraint dependency d

of the method call by assuming that the method returns the [HDT87]. Both h and seem likely to be bounded for def

path labels that its signature claims. The label L is used reasonable programs. The observed behavior of the JFlow RV

as the label of the return value in the case where the return compiler is that constraint solving is a negligible part of run  type, r , is not labeled. It joins together the labels of all of time. the arguments, since typically the return value of a function

depends on all of its arguments.

Translation The rules for the various kinds of method calls are built on 3.9 of this framework, as shown in Figure 15. In these rules, The JFlow compiler is a static checker and source-to-source the function signature obtains the signature of the named translator. Its output is a standard Java program. Most of the method from the class. The rule for constructors contains annotations in JFlow have no run-time representation; trans- two subtle steps: first, constructors are checked as though lation erases them, leaving a Java program. For example, they were static methods with a similar signature. Second, all type labels are erased to produce the corresponding unla-

a call to a constructor requires that the caller possess the beled Java type. Class parameters are erased. The declassify authority of all principals in the authority of the class that are expression and statement are replaced by their contained ex-

parameters. The caller does not need to have the authority of pression or statement.

authority

principal

external principals named in the clause. Uses of the built-in types lab el and are translated

w.lang.Lab el j ow.lang.Principal to the Java types j o and ,

respectively. Variables declared to have these types remain Constraint solving

3.8 in the translated program. Only two constructs translate to

or switch lab el As the rules for static checking are applied, they generate a interesting code: the actsF and statement, constraint system of label variables for each method [ML97]. which dynamically test principals and labels, respectively.

For example, the assignment rule of Figure 9 generates a The translated code for each is simple and efficient, as shown

[nv] v L switch lab el

constraint X . All of the constraints are of the form in Figure 16. Note that the translation rule for

A t ::: t A v B t ::: t B n

1 m 1 . These constraints can be uses definitions from Figure 13. As discussed earlier, the run-

A v B t ::: t B X [nv] u L v L u L

n E RT RT split into individual constraints i 1 because time check is 1 , which in effect is of the lattice properties of labels. The individual terms in the a test on labels that are completely representable at run time.

13

o acts-

The translated code uses the methods relab elsT and performance of the label inference algorithm (the constraint

j ow.lang.Lab el j ow.lang.Principal or F of the classes and , solver) also has been improved. respectively. These methods are accelerated by a -table

lookup into a cache of results, so the translated code is fast.

5 Conclusions

Related work 4 Privacy is becoming an increasingly important security con- cern, and static program checking appears to be the only There has been much work on information flow control and technique that can provide this security with reasonable effi- on the static analysis of security guarantees. The lattice ciency. This paper has described the new language JFlow, an model of information flow comes from the early work of Bell extension of the Java language that permits static checking of and LaPadula[BL75] and Denning [Den76]. Most subse- flow annotations. To our knowledge, it is the first practical quent information control models use dynamic labels rather programming language that allows this checking. The goal than static labels and therefore cannot be checked statically. of this work has been to add enough power to the static check- The decentralized label model has similarities to the ORAC ing framework to allow reasonable programs to be written in model [MMN90]: both models provide some approxima- a natural manner. tion of the “originator-controlled release” labeling used by JFlow addresses many of the limitations of previous work the U.S. DoD/Intelligence community, although the ORAC in this area. It supports many language features that have model is dynamically checked. not been previously integrated with static flow checking, in- Static analysis of security guarantees also has a long his- cluding mutable objects (which subsume function values), tory. It has been applied to information flow [DD77, AR80], subclassing, dynamic type tests, dynamic access control, and to access control [JL78, RSC92], and to integrated models exceptions. [Sto81]. There has recently been more interest in provably- Avoiding unnecessary restrictiveness while supporting a secure programming languages, treating information flow complex language has required the addition of sophisticated checks in the domain of type checking. Some of this work language mechanisms: implicit and explicit polymorphism, has focused on formally characterizing existing information so code can be written in a generic fashion; dependent types, flow and integrity models [PO95, VSI96, Vol97]. Smith and to allow dynamic label checking when static label checking Volpano have recently examined the difficulty of statically would be too restrictive; static reasoning about access control; checking information flow in a multithreaded functional lan- statically-checked declassification. guage [SV98], which JFlow does not address. However, the This list of mechanisms suggests that one reason why static rules they define prevent the run time of a program from de- flow checking has not been widely accepted as a security tech- pending in any way on non-public data. Abadi [Aba97] has nique, despite having been invented over two decades ago, is examined the problem of achieving secrecy in security proto- that programming language techniques and type theory were cols, also using typing rules, and has shown that encryption not then sophisticated enough to support a sound, practical can be treated as a form of safe declassification through a programming model. By adapting these techniques, JFlow primitive encryption operator. makes a useful step towards usable static flow checking. Heintze and Riecke [HR98] have shown that information- flow-like labels can be applied to a simple language with

reference types (the SLam calculus). They show how to

Acknowledgments statically check an integrated model of access control, infor- mation flow control, and integrity. Their labels include two I would like to thank several people who read this paper and components: one which enforces conventional access con- gave useful suggestions, including Sameer Ajmani, Ulana trol, and one that enforces information flow control. Their Legedza, and the reviewers. Kavita Bala, Miguel Castro, model has the limitation that it is entirely static: it has no and Stephen Garland were particularly helpful in reviewing run-time access control, no declassification, and no run-time the static checking rules. I would also like to thank Nick flow checking. It also does not provide label polymorphism Mathewson for his work on the PolyJ compiler, from which or objects. Heintze and Riecke prove some useful soundness I was able to steal much code, and Barbara Liskov for her theorems for their model. This step would be desirable for support on this project. JFlow, but important features like objects, inheritance and

dependent types make formal proofs of correctness difficult

Grammar extensions at this point. A An earlier paper [ML97] introduced the decentralized label model and suggested a simple language for writing JFlow contains several extensions to the standard Java gram- information-flow safe programs. JFlow extends the ideas of mar, in order to allow information flow annotations to be that simple language in several important ways and shows added. The following productions must be added to or modi- how to apply them to a real programming language, Java. fied from the standard Java Language Specification [GJS96]. JFlow adds support for objects, fine-grained exceptions, As with the Java grammar, some modifications to this gram- explicit parameterization, and the full decentralized label mar are required if the grammar is to be input to a parser model [ML98]. Static checking is described by formal in- generator. These grammar modifications (and, in fact, the ference rules that specify much of the JFlow compiler. The code of the JFlow compiler itself) were to a considerable

14 extent derived from those of PolyJ, an extension to Java that ArrayType:

supports parametric polymorphism [MBL97, LMM98]. LabeledType []

ArrayCreationExpression:

A.1 Lab el expressions new LabeledType DimExprs OptDims

LabelExpr:

A.3 Class declarations g f Componentsopt

Components: ClassDeclaration: Component Modifiersopt class Identifier Paramsopt

Components ; Component Superopt Interfacesopt Authorityopt ClassBody Component: InterfaceDeclaration:

Principal : Principals opt Modifiersopt interface Identifier Paramsopt

this ExtendsInterfacesopt Identifier Interfacesopt InterfaceBody

* Identifier

Params: ] Principals: [ ParameterList Principal ParameterList: Principals , Principal Parameter

Principal: Name ParameterList , Parameter Parameter:

lab el Identifier

A.2 Lab eled typ es riant lab el

cova Identifier rincipal

Types are extended to permit labels. The new primitive types p Identifier

principal lab el and are also added.

Authority:

rity( ) LabeledType: autho Principals PrimitiveType LabelExpropt

ArrayType LabelExpr

Metho d declarations opt A.4 Name LabelExpropt TypeOrIndex LabelExpropt MethodHeader:

PrimitiveType: Modifiersopt LabeledType Identifier ) NumericType BeginLabelopt ( FormalParameterListopt EndLabelopt Throws WhereConstraints b o olean opt opt

Modifiers void Identifier

lab el opt )

BeginLabel ( FormalParameterList EndLabel rincipal p opt opt opt Throwsopt WhereConstraintsopt The TypeOrIndex production represents either an instantia- tion or an array index expression. Since both use brackets, ConstructorDeclaration:

the ambiguity is resolved after parsing. Modifiersopt Identifier ) BeginLabelopt ( FormalParameterList EndLabelopt

TypeOrIndex: Throwsopt WhereConstraintsopt ] Name [ ParamOrExprList FormalParameter: ArrayIndex: LabeledType Identifier OptDims TypeOrIndex

BeginLabel: ] PrimaryNoNewArray [ Expression LabelExpr ClassOrInterfaceType: Name EndLabel: TypeOrIndex : LabelExpr WhereConstraints: ParamOrExprList:

where Constraints ParamOrExpr

ParamOrExprList , ParamOrExpr Constraints: Constraint ParamOrExpr:

Constraints , Constraint Expression LabelExpr Constraint:

15

Authority [ACPP91] Mart«õn Abadi, Luca Cardelli, Benjamin C. Pierce, )

caller ( Principals and Gordon D. Plotkin. Dynamic typing in a stat-

actsFor( ) Principal , Principal ically typed language. ACM Transactions on Pro-

gramming Languages and Systems (TOPLAS), ws To avoid ambiguity, the classes in a thro list must be 13(2):237Ð268, April 1991. Also appeared as placed in parentheses. Otherwise a label might be confused SRC Research Report 47. with the method body. [AR80] Gregory R. Andrews and Richard P. Reitman. An Throws:

axiomatic approach to information flow in pro-

ws ( ) thro ThrowList grams. ACM Transactions on Programming Lan-

guages and Systems, 2(1):56Ð76, 1980. New statements A.5 [BL75] D. E. Bell and L. J. LaPadula. Secure com- puter system: Unified exposition and Multics in- Statement: terpretation. Technical Report ESD--75-306,

StatementWithoutTrailingSubstatement MITRE Corp. MTR-2997, Bedford, MA, 1975. ::: :::existing productions Available as NTIS AD-A023 588. ForStatement SwitchLabelStatement [Car91] Luca Cardelli. Typeful programming. In E. J. ActsForStatement Neuhold and M. Paul, editors, Formal Descrip- DeclassifyStatement tion of Programming Concepts. Springer-Verlag, 1991. An earlier version appeared as DEC Systems Research Center Research Report #45, SwitchLabelStatement:

February 1989.

) f g switch lab el ( Expression LabelCases LabelCases: [DD77] Dorothy E. Denning and Peter J. Denning. Certi- LabelCase fication of programs for secure information flow. LabelCases LabelCase . of the ACM, 20(7):504Ð513, 1977.

LabelCase: [Den76] Dorothy E. Denning. A lattice model of secure ) case ( Type LabelExpr Identifier OptBlockStatements information flow. Comm. of the ACM, 19(5):236Ð

case LabelExpr OptBlockStatements 243, 1976.

else OptBlockStatements [Den82] Dorothy E. Denning. Cryptography and Data

ActsForStatement: Security. Addison-Wesley, Reading, Mas-

or( , ) actsF Principal Principal Statement sachusetts, 1982.

The declassify statement executes a statement, but with [GJS96] James Gosling, Bill Joy, and Guy Steele. The some restrictions removed from pc. Java Language Specification. Addison-Wesley,

DeclassifyStatement: August 1996. ISBN 0-201-63451-1. ) declassify ( LabelExpr Statement [HDT87] Susan Horwitz, Alan Demers, and Tim Teitel- baum. An efficient general iterative algorithm for

dataflow analysis. Acta Informatica, 24:679Ð694,

New expressions A.6 1987.

Literal: [HR98] Nevin Heintze and Jon G. Riecke. The SLam - ::: :::existing productions culus: Programming with secrecy and integrity.

new lab el LabelExpr In Proc. 25th ACM Symp. on Principles of Pro- gramming Languages (POPL), San Diego, Cali-

DeclassifyExpression: fornia, January 1998.

, ) declassify ( Expression LabelExpr [JL78] Anita K. Jones and Barbara Liskov. A language extension for expressing constraints on data ac- cess. Comm. of the ACM, 21(5):358Ð367, May

References 1978.

[Aba97] Mart«õn Abadi. Secrecy by typing in security pro- [KW94] Atsushi Kanamori and Daniel Weise. Work- tocols. In Proc. Theoretical Aspects of Com- list management strategies for dataflow analy- puter Software: Third International Conference, sis. Technical Report MSRÐTRÐ94Ð12, Mi- September 1997. crosoft Research, May 1994.

16 [Lam73] Butler W. Lampson. A note on the confinement [SV98] Geoffrey Smith and Dennis Volpano. Secure in- problem. Comm. of the ACM, 16(10):613Ð615, formation flow in a multi-threaded imperative lan- October 1973. guage. In Proc. 25th ACM Symp. on Principles of Programming Languages (POPL), San Diego, [LMM98] Barbara Liskov, Nicholas Mathewson, and California, January 1998. Andrew C. Myers. PolyJ: Parameterized types for Java. Software release. Located at [Vol97] Dennis Volpano. Provably-secure programming http://www.pmg.lcs.mit.edu/polyj, July 1998. languages for remote evaluation. ACM SIGPLAN Notices, 32(1):117Ð119, January 1997. [MBL97] Andrew C. Myers, Joseph A. Bank, and Bar- bara Liskov. Parameterized types for Java. In [VSI96] Dennis Volpano, Geoffrey Smith, and Cynthia Proc. 24th ACM Symp. on Principles of Program- Irvine. A sound type system for secure flow analy- ming Languages (POPL), pages 132Ð145, Paris, sis. Journal of Computer Security, 4(3):167Ð187, France, January 1997. 1996. [ML97] Andrew C. Myers and Barbara Liskov. A de- centralized model for information flow control. In Proc. 17th ACM Symp. on Principles (SOSP), pages 129Ð142, Saint-Malo, France, 1997. [ML98] Andrew C. Myers and Barbara Liskov. Complete, safe information flow with decentralized labels. In Proc. IEEE Symposium on Security and Pri- vacy, Oakland, CA, USA, May 1998. [MMN90] Catherine J. McCollum, Judith R. Messing, and LouAnna Notargiacomo. Beyond the pale of MAC and DAC — defining new forms of access control. In Proc. IEEE Symposium on Security and Privacy, pages 190Ð200, 1990. [Mye99] Andrew C. Myers. Mostly-Static Decentralized Information Flow Control. PhD thesis, Mas- sachusetts Institute of Technology, Cambridge, MA, 1999. In progress.

[PO95] Jens Palsberg and Peter Ørbæk. Trust in the - calculus. In Proc. 2nd International Symposium on Static Analysis, number 983 in Lecture Notes in Computer Science, pages 314Ð329. Springer, September 1995. [RM96] Jakob Rehof and Torben ®. Mogensen. Trac- table constraints in finite semilattices. In Proc. 3rd International Symposium on Static Analysis, number 1145 in Lecture Notes in Computer Sci- ence, pages 285Ð300. Springer-Verlag, Septem- ber 1996. [RSC92] Joel Richardson, Peter Schwarz, and Luis-Felipe Cabrera. CACL: Efficient fine-grained protection for objects. In Proceedings of the 1992 ACM Con- ference on Object-Oriented Programming Sys- tems, Languages, and Applications, pages 154Ð 165, Vancouver, BC, Canada, October 1992. [Sto81] Allen Stoughton. Access flow: A protection model which integrates access control and infor- mation flow. In IEEE Symposium on Security and Privacy, pages 9Ð18. IEEE Computer Soci- ety Press, 1981.

17