Object-Oriented Units of Measurement

Eric Allen David Chase Victor Luchangco Jan-Willem Maessen Guy L. Steele Jr.

Sun Microsystems Laboratories Burlington MA 01803 .@sun.com

ABSTRACT meters); the unit and dimension of the measurement are Programs that manipulate physical quantities typically rep- not represented in the program at all (except possibly in resent these quantities as raw numbers corresponding to the the names of variables used to hold these numbers). Thus, quantities’ measurements in particular units (e.g., a length the units and dimensions cannot be used in the analysis of represented as a number of meters). This approach elimi- the program. In particular, they cannot be used to detect nates the possibility of catching errors resulting from adding errors resulting from mismatched quantities. While trivial, or comparing quantities expressed in different units (as in such errors are not uncommon, and can have costly reper- the Mars Climate Orbiter error [11]), and does not sup- cussions. For example, the loss of the Mars Climate Orbiter port the safe comparison and addition of quantities of the in September 1999 was ultimately traced to a failure in the same dimension. We show how to formulate dimensions and software to convert between English and metric units [11]. units as classes in a nominally typed object-oriented lan- In another example, a space shuttle rolled itself away from guage through the use of statically typed . Our the Earth in response to an instruction to bounce a laser off formulation allows both parametric and inheritance poly- a mountain it was told was 10,000 miles high, rather than morphism with respect to both dimension and unit types. the intended 10,000 feet [24]. Units were not programmed It also allows for integration of encapsulated measurement as part of the input data and therefore all inputs were im- systems, dynamic conversion factors, declarations of scales plicitly interpreted as measurements in miles. (including nonlinear scales) with defined zeros, and noncon- stant exponents on dimension types. We also show how to At least superficially, dimensions and units bear a strong encapsulate most of the “magic machinery” that handles the similarity to types. However, most contemporary type sys- algebraic nature of dimensions and units in a single meta- tems, even with generic types, are insufficiently expressive class that allows us to treat select static types as generators to capture several properties necessary for dimension check- of a free abelian group. ing. For example, dimension checking at compile-time re- quires the compiler to manipulate dimensions according to a dimensional algebra (e.g., Time × Length = Length × Time). Categories and Subject Descriptors Most previous formulations of dimension checking in pro- D.1.5 [Programming Techniques]: Object-Oriented Pro- gramming languages have encoded dimensions as an ad hoc gramming; D.3.1 [Programming Languages]: Formal Def- language feature rather than integrating them into a gen- initions and Theory; D.3.3 [Programming Languages]: eral . The few attempts at integrating dimen- Language Constructs and Features sions into more general type systems have focused on struc- tural type systems over functional and procedural languages General Terms with significantly different design considerations than those Design,Languages of mainstream object-oriented languages [20].

In this paper, we show how to integrate dimensions and units 1. INTRODUCTION with classes in a nominally typed object-oriented language Physical units and dimensions are a commonly used com- (such as the JavaTM or C#) in a putational construct in science and engineering, but there is way that supports static checking of units and dimensions, relatively little support for them in programming languages. and with a performance cost no greater than that of boxing Instead, physical quantities are typically represented as num- primitive values.1 bers that correspond to the quantities’ measurement in a particular unit (e.g., a length represented as a number of 1Boxing is a technique that enables a primitive value (e.g., 0) to be used in contexts requiring a reference value by wrap- ping the primitive in a new reference (new (0)). It OOPSLA’04, October 24-28, 2004 Vancouver, British Columbia, Canada, can negatively impact execution time because boxed values Copyright 2004 Sun Microsystems, Inc. All rights reserved. must be dereferenced when accessed. However, clever com- ACM 1-58113-712-5/03/0010 . piler optimizations, particularly in just-in-time compilers, Our formulation allows dimensions and units to be treated as 2.1 Basic terminology and analysis ordinary class-based types, and allows both parametric and The fundamental notion in our formulation is that of a quan- inheritance polymorphism with respect to both dimension tity. Every quantity exists in a particular dimension2 (e.g., and unit types. We describe the new language features re- Length, Time). When two quantities Q and Q0 exist in the quired to support this integration in the context of the Mix- same dimension, we can answer the question, “how much of Gen extension [1] of the Java Programming Language, where Q does it take to equal Q0?” generic types are first-class. MixGen allows type-dependent operations such as casts on arbitrary generic types, and even Some quantities are designated units and are denoted with expression of mixins (i.e., generic classes parameterized by special symbols (e.g., Meter, Second). Every quantity can their own superclasses). be denoted by a measurement in terms of any unit U in its dimension. If it takes x of U to equal Q then we denote Q First-class generic types play a crucial role in our formula- with the measurement xU, and we call x the magnitude of tion of dimensions. We present our language as an extension this measurement. For example, the denotation 5 Meters is of MixGen to ground it in a sound, statically typed object- a measurement for a quantity in the dimension Length whose oriented language that supports first-class generic types. A magnitude in terms of the unit Meter is 5. formulation in the context of C++ templates, in particular, would not meet this criterion. MixGen includes many sub- One advantage of using units is that it allows us to compare tle features to handle pathologies that do not arise in our two measurements of the same dimension algebraically. To treatment of dimensions and units—for example, support express a quantity Q denoted by a measurement M = xU in for , hygienic method overriding, and unit U as a measurement in another unit V , we need not re- nontrivial with clauses; we avoid discussing these features. measure the same quantity in V (which is seldom practical, Therefore, we expect our formulation to be accessible to and often impossible). Instead, we can measure our unit U readers familiar only with C++, C#, or the Java Program- with unit V , resulting in a measurement yV ; we say y is ming Language, and with the notion of encoding mixins via the conversion factor from U to V . Then we can form the generic types, as can be done with C++ templates. Also, measurement M 0 of Q in unit V by substituting yV for U because our real interest is in understanding the relation- in M: M = xU = xyV = (xy)V . Two measurements in the ships of dimension and unit checking with object-oriented same unit can be compared by comparing their magnitudes type systems in general, we make no attempt to maintain arithmetically. backward compatibility when extending MixGen. Magnitudes can be taken from an arbitrary algebraic field. Our main extension to MixGen is the addition of statically For now, we assume that magnitudes are always real num- typed metaclasses, which we use to model values that have bers (i.e., instances of a class Real, which we intentionally associated dimensional units. We also define a special meta- leave undefined). Furthermore, we use the standard liter- class abelian class that provides the algebraic properties als and mathematical operators to denote real numbers and necessary to model dimensions and units accurately. Al- their methods, and we do not worry about errors due to though we strive to introduce as few features as possible, computational limitations such as imprecision and overflow. the language we end up with is quite different from the lan- We discuss taking magnitudes from arbitrary fields in Sec- guage we started with; we dub our new language MetaGen. tion 4.2, and implementation issues in Section 5.2.

In Section 2, we briefly analyze some concepts important We can also combine quantities of distinct dimension via to dimension and unit checking, discuss design constraints multiplication and division to form new quantities. Such a for a system to do this checking, and explain why existing quantity exists in a new dimension denoted by combining object-oriented languages do not satisfy these constraints. the respective dimensions with combinators × and /, which In Section 3, we describe several common uses of physical satisfy certain algebraic properties (e.g., Length × Time = quantities for which we want to have static checking, and Time × Length). The of dimensions is an abelian group the features of MetaGen that support this ability, and we under × [21]. We define a special dimension, called unity, describe further extensions in Section 4. We discuss various to serve as the multiplicative identity for dimensions, and syntactic and performance issues in Section 5, survey related we denote the inverse of a dimension D by D−1. work in Section 6, and conclude with future directions in Section 7. Units of combined measurements are combinations of units of the constituent measurements. Just as we can have sev- eral units of a given primary dimension, we can have several 2. ANALYSISAND DESIGN CHALLENGES derived units for a given derived dimension. It is also possi- In this section, we lay out the basic framework and terminol- ble to define new primitive units of combined dimensions. ogy that we use throughout this paper. Although the view we take is not the only way to understand dimensions and 2.2 Design Constraints units, we have found it useful for elucidating many design We want a design that accurately models our understand- issues. We also discuss several pragmatic requirements that ing of dimensions and units. In addition, we impose the influence our design. following pragmatic design constraints: can eliminate much of this overhead by specializing code 2We adopt for now the na¨ıve view that every quantity exists with variants for unboxed values in contexts where observ- in a single, fixed, dimension. In Section 3.6, we discuss some able behavior is unaffected. ways in which this na¨ıve view needs to be refined. 1. It should be easy to convert a measurement in one any unit of D. We may also want to declare abstract methods unit to a measurement in another unit of the same such as shortName on dimensions, or a method devices that dimension. returns a collection of the devices used to measure quantities of a dimension. We cannot represent individual dimensions 2. Type-dependent operations on dimensions and units, as subclasses of Dimension because the instances of partic- such as casting and instanceof tests, as well as the ular dimensions—naturally identified as quantities in that ability to print and store measurements, should be sup- dimension—should not be used in contexts requiring a di- ported. mension. If we model quantities as instances of their corre- 3. It should be possible to require that a quantity be sponding dimension class, and we model dimension classes denoted by a measurement in a specific unit (e.g., a as subclasses of Dimension, then by transitivity of , length that must be measured in meters) as an addi- every quantity would also be an instance of Dimension, which tional error check, to avoid unnecessary conversions, is clearly wrong: 5 Meters is not an instance of Dimension. and to manage the inaccuracies inherent in floating point calculations. We might try to break this inheritance relationship with a shallow workaround, such as treating dimensions as sub- 4. Programmers should be able to define new units and classes of Dimension and introducing a parametric class Quantity dimensions and to define behavior specific to that di- to model quantities of a particular dimension: mension (e.g., a method dilate for Time). Even if all programmers and users were willing to use the SI sys- abstract class Dimension {...} tem, there are contexts in which units and dimension class Length extends Dimension {...} checking can be employed (currency, information, etc.) class Time extends Dimension {...} ... that are not part of this system. class Quantity {...} 5. We should allow for as much polymorphism as possible. In particular, we should support type parameters for Then lengths would be instances of Quantity and dimensions and units. times would be instances of Quantity

The generalization of class relationships turns out to be the class Measurement U> extends D { private final Real _magnitude; key language feature needed for integrating dimensions and ... units with object-oriented types. Furthermore, this gener- } alization has applicability beyond just dimension checking. It allows us to more accurately model many mathematical The special kind class* is introduced to designate a class structures based on set membership and subsets. Metaphor- whose instances are all instance classes. Unit is defined to ically, we can think of classes as sets. Class extension can be be of kind class* so we can enforce the property that every thought of as the subset relationship, and instanceof can be unit will be associated with its own type. An instantiation thought of as set membership. In this view, allowing classes of D must be an instance of Dimension (and a subtype of to contain other classes is quite natural. Object). For example, we can define the following instance classes: We now describe the core of the new semantics of MetaGen, Unit

A class can be assigned to a field or variable and passed as The expression Time.shortName() evaluates to "T". Also, if we defined a method: an argument to a method, provided that it is an instance class of the variable’s type. String dimName(Dimension d) { return d.shortName(); } In general, a variable may be bound to either an instance class or an ordinary (non-class) instance. Member accesses then the method call dimName(Time) evaluates to "T" as well. (i.e., field accesses and method calls) to a variable bound to 3Readers familiar with MixGen semantics may note that our an instance class are interpreted as accesses to static mem- restrictions on type parameters allow us to drop the use of bers of the instance class. Member access to a non-class MixGen’s with clauses on type parameters. instance behaves in the usual way. A variable may not be used in type contexts (e.g., the type Real getPrimaryConversion() { return _primaryConversion; } of a method parameter or the type specified in a cast opera- tion), even if it is bound to an instance class. Also, there is U> Real conversionFactor() { no way to call a constructor of a bound instance class on a return _primaryConversion / U.getPrimaryConversion(); } variable it is bound to (we are intentionally ignoring interac- tions of metaclasses with reflection facilities). For example, U> Measurement inUnit() { the constructor call in the following code is not allowed: return new Measurement(conversionFactor()); } Length measurementMaker(Measurement m) { ... return new m(); // not allowed } } The introduction of primary units also allows us to avoid A type parameter of kind K can be used in all type contexts. cyclic dependencies in unit definitions. It can also be used in any context requiring a value of class K. In particular, it can be assigned to variables of class K Note that the constructor of class Unit ensures that no and instance methods of K may be called on it, just like instances of Unit will have a _primaryConversion of 0. ground instances or other variables of class K. For example, Because this constructor is called in static blocks of each we include the following method definition in class Unit: instance class, we catch such violations when classes are loaded. class* Unit extends D { ... We still need to reify primary units with separate classes String dimName() { return D.shortName(); } so we can refer to them. Given a dimension D, we want to } be able to easily obtain its primary unit. We also need to ensure that each dimension has exactly one primary unit. Furthermore, Minute.dimName() evaluates to "T". We meet both of these requirements by defining a new class With this core semantics in hand, we now discuss advanced PrimaryUnit, parameterized by a dimension. Each instantia- aspects of our library definition, as well as some peripheral tion of this class with a particular dimension D is the primary language features motivated by it. Many of these additional unit of D. We enforce that class PrimaryUnit has no subclasses aspects are quite complex, but this complexity results pre- or instances by introducing a new modifier, final*: cisely from our intention to provide programmers with the final* Unit PrimaryUnit { behavior they intuitively expect. It is often the case with static { container(1); } language design that a semantics that correctly matches pro- } grammers’ intuitive expectations necessarily includes sub- tleties that are seldom thought about or even noticed by The final* keyword indicates that PrimaryUnit has no sub- most users of the language. classes or instances. Conceptually, we view final* classes as typed terminal values; they act like ordinary values with 3.2 Unit conversion and primary units unique types that could be represented in a more clumsy To convert between measurements in different units of the fashion with the Singleton Pattern. same dimension, we must specify conversion factors between various units of that dimension. A natural place to keep this This solution gives a syntactically enforced bijection be- information is in the definition of a unit: each unit specifies tween dimensions and primary units: the primary unit of how to convert measurements in that unit to measurements dimension D is PrimaryUnit. For convenience and read- in any other defined unit (for the same dimension). Al- ability, we use aliases to identify primary units with familiar though the number of such conversion factors is quadratic units. For example, in the number of units, it is not necessary to maintain so many factors explicitly: if we can convert between measure- alias Meter = PrimaryUnit; ments in units A and B, and between measurements in units B and C, then we can convert between measurements in A Using the conversion factors defined by the units, we can and C via B. Thus, it is sufficient to include in the defi- easily convert between measurements in different units: nition of every unit a single conversion factor to a primary class Measurement U> extends D { unit of that dimension, and convert between any two com- Real magnitude; mensurable units via their common primary unit. ... V> Measurement inUnit() { class* Unit extends D { return new Measurement ( final private Real _primaryConversion; magnitude * U.conversionFactor() ); Unit(Real primaryConversion) { } if (primaryConversion == 0) { throw new RuntimeException( Measurement> asPrimary() { "Attempt to define a unit with a 0 conversion." return new Measurement> ( ); magnitude * U.getPrimaryConversion() } ); _primaryConversion = primaryConversion; } } ... } The methods inUnit and asPrimary require calling a static group. For notational convenience, we also parameterize method on the type parameter U. Static checking ensures abelian class with two symbols, included in parentheses, that each instance class of Unit defines conversionFactor that denote (i) the binary operator of the group and (ii) a and getPrimaryConversion. repetition operator of the group, which takes an element of the abelian group and an integer n, and yields the result of We can also name commonly used, algebraically formed di- repeated application of the binary operator to the element mensions using type aliases. For example, we use Area to n times (where, if n is negative, application to an element n 2 Length times is defined to denote the inverse of the result of appli- refer to the dimension Length and Speed to refer to Time . We now turn turn to the question of how to handle alge- cation to that element |n| times). We write an instantiation braically formed dimensions such as these. of type abelian class as follows:

3.3 Dimensional algebra abelian class (bin op,rep op ) name {...} Like quantities, dimensions and units can be combined via The repetition operator is given higher precedence than the multiplication and division operators. We might (na¨ıvely) binary operator. Because we are modeling a multiplicative try to model these operators as classes parameterized by semantics in this section, we use * as the binary operator and their operands: ^ as the repetition operator. For readability, we represent applications of ^ with superscripts. Dimension DimProd {...} Dimension DimQuot {...} We refer to the set of explicitly defined instance classes of an abelian class G as the base types of G. The set of instances This model is unsatisfactory because the algebraic structure of G is the set generated by the following rules: of dimensions requires that we equate certain terms. For ex- ample, Length×Time = Time×Length. Furthermore, the units of a given dimension can be combined as well. Combined • All base types of G are elements of G. units obey algebraic rules, and the subtyping relationships between units and their dimensions should be respected dur- • If T and U are elements of G then T *U is an element ing algebraic manipulation. of G. Another complication is that combined forms of dimensions • If T is an element of G then for all n ∈ , T n is an and units may contain . For example, consider the Z element of G. parameter type of a sqrt method over quantities (where we part with the Java Programming Language specification by using ^ to refer to exponentiation rather than exclusive or): Additionally, generated elements are equated according to the following rules: D sqrt(D^2 x) { ... } • T *U = U*T Notice that the parameter type contains a numeric exponent • T *(U*V ) = (T *U)*V (namely 2). We want to allow combined dimensions that can be raised to (constant) integer exponents. • T 0 = identity

3.3.1 Abelian classes • T 1 = T Rather than support algebraic manipulation in an ad hoc fashion, we introduce a more general language extension m n m+n • T *T = T that gives us the expressiveness we need: a new (special) metaclass abelian class. As with class*, a class of kind abelian class must have only instance classes. The explic- These rules imply all of the ordinary rules for manipulating itly declared instance classes of an abelian class form the integer exponents that we make use of in this paper. identity and basis of a free abelian group.4 Metaclass abelian class is parameterized by an identity element that desig- Now we can redefine class Dimension as an abelian class, with nates the instance class that serves as the identity for the the special dimension Unity as its identity element: 4An abelian group is a set A together with an operator, often denoted + : A, A → A, that satisfies the following properties: abelian class (*,^) Dimension {...} • There exists i ∈ A such that a + i = a for all a ∈ A; i Dimension Unity {...} is called the identity of A. ... Dimension Length {...} • For any a ∈ A, there exists a0 ∈ A such that a+a0 = i; 0 Dimension Time {...} a is called the inverse of a in A. ... • For all a, a0, a00 ∈ A,(a + a0) + a00 = a + (a0 + a00). • For all a, a0 ∈ A, a + a0 = a0 + a. Then in addition to Length, Time, etc., class Dimension also contains Length∗Time-2, etc. 3.3.2 Abelian constructors stract methods. For example, class Dimension includes the For instance classes B and C of an abelian class A, the class B*C abstract method shortName. Abstract methods of an abelian includes (as static fields) all instance fields in the definition class A may be defined for algebraically formed instance of A. It does not include any of the fields defined in B or classes by defining them in the abelian class itself. Methods C. Initialization of the fields of B*C is defined by a special defined this way must include definitions of behavior under constructor in A labeled by the binary operator of A. The combination by both the binary operator and the repeti- arguments to this constructor must be exactly two elements tion operator of A, and these definitions must be prefixed of type A, denoting the two instance classes being combined by the binary and repetition operators. The definition pre- to form the new instance. Similarly, initialization of Bn is fixed by the binary operator includes two type parameters defined by a special constructor in A labeled by the repetition of kind A. The definition prefixed by the repetition operator operator of A. This constructor takes an argument of type includes a type parameter of kind A and a final int param- A and a final int n, denoting the numeric argument to the eter. For example, abstract method shortName is defined for repetition operator. The argument n is constrained by static algebraically formed instance classes of Dimension by adding checking to be a constant (we discuss a generalization to the following method definitions to class Dimension: nonconstant exponents in Section 4.1). We refer to both of these constructors as abelian constructors. Notice that an abelian constructor is shared by all algebraically formed * String shortName() { return E.shortName() + "*" + F.shortName(); elements of A. }

If A is a generic class, we need to declare type parameters on the abelian constructors to denote the instantiations of the ^ String shortName(final int n) { type parameters for a particular instance class. For example, return E.shortName() + "^" + n; we revise our definition of class Unit to be an abelian class, } with PrimaryUnit as its identity element, and abelian constructors as follows: In the example above, does Time*Length.shortName() eval- abelian class (*,^) > Unit uate to "L*T" or "T*L"? The answer would appear to de- extends D pend on the order of the arguments. However, by the alge- { braic properties of abelian classes, Time*Length is the same final private Real _primaryConversion; as Length*Time. To allow methods such as shortName to be well-defined, we need to canonicalize all combined abelian *(Unit V, Unit W) { this(V.getPrimaryConversion().multiply( types. More importantly, canonicalization is a natural way W.getPrimaryConversion() to statically determine type equivalence. We discuss a pro- )); cedure for canonicalization in the next section. } Before any constructors are called on an algebraically formed ^(Unit V, final int n) { instance of an abelian class, the instance is canonicalized. this(V.getPrimaryConversion().power(n)); n } After canonicalization, the terms B of the canonicalized ... product are initialized from left to right via the abelian con- } structor prefixed by the repetition operator. Then combi- nations of these terms are initialized from left to right via For generic abelian classes, we impose restrictions on the re- the abelian constructor prefixed by the binary operator. spective instantiations of type parameters for the elements of a combination. The instantiations of a type parameter 3.4 Canonicalization of abelian type references whose kind is not an abelian class must match; the type To statically check type equivalence between members of an parameter of the combination is instantiated with the same abelian class G, we identify members of G and put them type. The instantiations of a type parameter whose kind is in canonical form so they can be compared directly. an abelian class are combined pointwise in the new instan- tiation. For example, because Meter is an instance class of Ignoring for the moment the complication of type parame- Unit and Second is an instance class of Unit

7 A final int bound in a lexical scope is treated as a type Alternatively, we could define another special metaclass commutative ring and define int as an instance class of that parameter ranging over this abelian class. Thus, the set of kind. But this approach requires adding additional com- valid int types consists of all expressions of int variables and plexity to our language with little reward. Furthermore, constants formed via multiplication and addition. We iden- even the notion of a commutative ring does not capture all tify and canonicalize elements of this type in the same way as of the algebraic simplification possible with the integers. explained for general abelian classes. However, special prop- 8We use the term “magnitude” as it is employed in most erties of the ints allow us to equate even more terms than we contexts concerning dimension checking. However, this use differs from its typical use with regard to impedance, to refer 6We assume that autoboxing of primitives is built into the to the square root of the sum of the squares of the real and language, as it is in the JavaTM 2 SDK1.5. imaginary parts of what we call the magnitude. Because class Unity implements DimensionI, all in- a programmer to easily convert dimensionless quantities to stance classes of MagnitudeField must implement it as well. dimensioned quantities when necessary. Because dimension- Thus, an element of an instance class of MagnitudeField can less quantities are identified with elements of class Unity, be added to or subtracted from any other element of an in- which are also used to represent magnitudes of Measurements, stance class of MagnitudeField. It may also be multiplied by a programmer wishing to treat such a quantity as having a and divided by any other quantity. When an element e of an dimension other than unity can simply wrap it in a new instance class of MagnitudeField is multiplied by another ele- Measurement. For example, to treat a dimensionless quotient ment e0 of an instance class of MagnitudeField, the dimension of two Lengths l1 and l2 as an angle, we can write: of the result is Unity ∗ Unity = Unity. When e is divided by new Measurement(l1.divide(l2)) e0, the dimension of the result is Unity∗Unity−1 = Unity. For quantities of Dimension D, the dimension of a multiplication −1 −1 4.4 Scales of measurement is Unity ∗ D = D and of a division is Unity ∗ D = D . Along with multiplicative conversions between units of the same dimension, we often want to define scales with explicit DimensionI also includes method inUnit, but the only zeros that are based on a unit of measurement (e.g., the Cel- unit of dimension Unity is PrimaryUnit. Therefore, we sius, Kelvin, and Fahrenheit scales). Scale conversions may define this method as follows: if it is called on an instance of involve more than just multiplication by conversion factors; Unity and the given unit to convert to is PrimaryUnit, they can involve addition and subtraction as well (e.g., the then the method simply returns its receiver. Otherwise, it conversion of a Celsius scale reading to a Fahrenheit scale throws an exception. reading involves multiplication by 9/5 and addition of 32 F◦, where we follow the convention that n◦C means a Celsius We redefine existing classes Float and Double as instance scale reading of n but nC◦ means a distance of n degrees be- classes of kind MagnitudeField and provide them with ap- tween two readings on the scale). How are we to understand propriate definitions of the arithmetic operations required scales and scale conversions in the context of our analysis? to satisfy the extends* constraint on MagnitudeField: In particular, can we explain scale conversion solely in terms MagnitudeField Float extends Unity {...} of manipulation of quantities? MagnitudeField Double extends Unity {...} The key to understanding the relation between quantities Now we need only modify field _magnitude in class Measurement and scale readings is to realize that a measurement of a to be of type Unity. In this way, we allow the magnitude of quantity is really a measurement of the difference between a measurement to be any dimensionless quantity. Note that two points in a continuum: an origin (or zero point) and dimensionless products of measurements are both measure- an external point. For example, a measurement of time is ments and instances of Unity. A definition of a Measurement really a measurement between an origin (e.g., the point at class with methods for binary operations and for unit conver- which a stopwatch is started) and an external point (the sion is presented in Figure 1. Class Measurement includes sev- point at which the watch is stopped). Scale readings are eral polymorphic methods with type parameters that cannot measurements for which the origin is fixed by the scale it- be inferred from method arguments alone. As in MixGen, self. Thus, even when two scales are applied to the same polymorphic methods in MetaGen employ explicit polymor- external point, the quantities represented by the readings phism. may not be the same—the scales may have different origins. When we convert a reading R of external point M in scale S 0 0 We can also define new instance classes Complex, Rational, with zero point ZS to a reading R in another scale S with Real, Interval, etc. with kind MagnitudeField, along with ap- zero point ZS0 we are really deducing the size of a measure- propriate subclassing relationships between them. A proper ment from ZS0 to M given both a measurement from ZS to encoding of these classes and their relation to one another M and a measurement from ZS0 to ZS . This deduction in- is beyond the scope of this paper. volves a simple addition. Because scales may also represent quantities in different units, unit conversions of the initial 4.3 Discovered dimensions measurements may be necessary as well. Sometimes the combination of measurements results in di- mensionless quantities (i.e., quantities of dimension unity). For example, a reading of a temperature in degrees Celsius For example, we can think of an angle as being determined is really a measurement of difference of an external “tem- by measuring two lengths (an arc and a radius) and dividing perature point” with a reference point (0◦C) in unit C◦. If one by the other. Similarly, a relative density is obtained by we write the external point as A and 0◦C as Z then we can dividing one density by another. Although angle and rela- express this difference with the following equation: tive density are both of dimension unity, it is nonsense to A − Z = x C◦ compare an angle to a relative density. Now suppose we want to convert the measurement x C◦ to ◦ Therefore, a programmer may want to view (nonzero) mea- a measurement on the Fahrenheit scale. We represent 0 F surements as measurements of quantities in newly discovered as ZF . Then we have: primary dimensions (e.g., angle, relative density, etc.). It is ◦ Z − ZF = 32 F often possible to measure such quantities directly (e.g., with a protractor). We also know the conversion factor of C◦ to F◦: We leave the decision of when to treat a dimensionless quan- 9 C◦ = F◦ tity in this way to the programmer. Our formulation allows 5 class Measurement U> extends D { private final Unity _magnitude;

Measurement(Unity magnitude) { _magnitude = magnitude; }

Unity getMagnitude() { return _magnitude; }

Unity getPrimaryConversion() { return _magnitude.multiply(U.getPrimaryConversion()); }

V> Measurement inUnit() { if (U == V) return this; else return new Measurement(getMagnitude().multiply(V.conversionFactor())); }

// Arithmetic operations are overloaded for cases where more specific arguments can be statically determined.

// add D add(D x) { return new Measurement(getMagnitude().add(x.inUnit().getMagnitude())); }

Measurement add(Measurement x) { return new Measurement(getMagnitude().add(x.getMagnitude())); }

// subtract D subtract(D x) { return new Measurement(getMagnitude().subtract(x.inUnit().getMagnitude())); }

Measurement subtract(Measurement) { return new Measurement(getMagnitude().subtract(x.getMagnitude())); }

// multiply D*E multiply(E x) { return new Measurement>(getMagnitude().multiply(x.inUnit>().getMagnitude())); } V> Measurement multiply(Measurement x) { return new Measurement(getMagnitude().multiply(x.getMagnitude())); } V> Measurement multiply(V v) { return new Measurement(getMagnitude()); }

Zero multiply(Zero x) { return Zero; }

// divide D*E^-1 divide(E x) { return new Measurement>(getMagnitude().divide(x.inUnit>().getMagnitude())); } V> Measurement divide(Measurement x) { return new Measurement(getMagnitude().divide(x.getMagnitude())); } V> Measurement divide(V v) { return new Measurement(getMagnitude()); } Unity divide(D x) { return getMagnitude().divide(x.inUnit().getMagnitude()); }

Unity divide(Measurement x) { return getMagnitude().divide(x.getMagnitude()); } }

Figure 1: Class Measurement ◦ To derive an equation of the form A − ZF = yF , we add A ScaleReading can be converted to any other ScaleReading the two difference equations and perform unit conversion: of the same dimension via method inScale. Class Scale is merely a factory for new ScaleReadings. We may want to A − Z = x C◦ + 32 F◦ F alias particular instantiations of Scale with their common 9 = x F◦ + 32 F◦ names, e.g., 5 9 = ( x + 32) F◦ 5 alias KelvinScale = Scale>; yielding the standard formula. By viewing scale readings as final* Point ZeroCelsius { measurements from fixed reference points, we can convert static { between scales while manipulating only quantities. container ( new Measurement(273) ); We can model scales and points similarly to how we model } measurements and units. Just as every dimension has a pri- } mary unit (which prevents cyclic dependencies), so too it alias CelsiusScale = has a primary origin. Every point in a given dimension is Scale; defined with respect to the primary origin of that dimen- sion. Just as measurements are defined with respect to a particular unit, scales are defined with respect to a unit and 4.5 Nonlinear scales origin. Therefore, we define a class Point as follows: More generally, zeroed “scales” are also defined in which readings are nonlinear, e.g., the bel scale, which is a mea- class Point { sure of the logarithm of the ratio of two measurements of private final D _distanceToOrigin; Power, and the Gregorian calendar, where the length of a Point(D distanceToOrigin) { year depends on which year we’re talking about. A par- _distanceToOrigin = distanceToOrigin; ticularly unusual nonlinear scale from science fiction is the } warp scale from Star Trek: The Next Generation,9 which is asymptotic at Warp 10. D getDistanceToOrigin() { return _distanceToOrigin; Although such “scales” appear superficially to be distinct } D distanceTo(Point p) { from linear scales such as Celsius and Fahrenheit, they re- return _distanceToOrigin - p.distanceToOrigin(); ally are the same kind of entity. Any scale can be thought of } as linear along a particular dimension. In fact, temperature } itself is equal to the reciprocal of the rate of increase of en- tropy with respect to energy of a given system. In the case of Unlike Units, we do not require all points to be instance decibels, “absolute zero” corresponds to 0dB, and it is qual- classes, so we do not define Point as having kind class*. We itatively different from other decibel readings. Differences also do not define Point as an abelian class. can be taken between two readings in the dB scale; the result is expressed in a special unit (let us call it “decibel units”) As with PrimaryUnits, we define an instance class PrimaryOrigin that is not typically written down.10 Another example is the that is parametric in a dimension: musical scale, which can be thought of as based on the unit “halftone”. High C and middle C differ by 12 halftones. If final* Point PrimaryOrigin { static { container(0); } we start from middle C, and then add 4 halftones, we get } E, and if we add 8 more, we get high C. We can arbitrar- ily pick middle C as the 0 point, and count halftones from We can now define classes Scale and ScaleReading that are there. However, we can also associate a reading on the mu- parametric in a dimension, a unit, and a point: sical scale with the frequency of the note. A musical scale based on halftone is logarithmic in the frequency. But we class Scale U, Point P> { might as easily say that the frequency scale is exponential ScaleReading newReading(Point Q) { in the halftone scale. return new ScaleReading(Q.distanceTo(P)); } } Because scale readings are often logarithmic with respect to class ScaleReading U, Point P> { other scales of more fundamental quantities (based on units private final Measurement _value; that the designers of the original scale may not have been aware of), it is natural that we find readings on many scales ScaleReading(Measurement value) { _value = value; } to be restricted to proper subintervals of a field (e.g., the positive reals, the reals between 0 and 10, etc.). Measurement getValue() { return _value; }

9 V, Point Q> ScaleReading inScale() { The warp scale from the original Star Trek series was based return new ScaleReading ( on a less interesting formula. _value.subtract(P.distanceTo(Q)).inUnit() 10Because a decibel reading is measured as the logarithm of ); the ratio of two quantities of Power, a decibel unit is like an } angle in that it can be viewed either as dimensionless or as } a unit in a discovered dimension. 5. PRAGMATICS 6. RELATED WORK We now consider some pragmatic aspects of our system with 6.1 Dimension and unit checking respect to syntax and performance. Most formulations of dimension checking in programming languages encode dimensions as an ad hoc language feature 5.1 Syntax rather than integrating them into a more general type sys- Many of the syntactic constructs in our finished design are tem. Also, most formulations incorporate dimension check- long-winded. However, this is an artifact of C++-style syn- ing into functional and procedural languages rather than tax that is easily addressed with syntactic sugar. For one object-oriented languages, where the design issues are sig- thing, many of the instantiations of generic types involved nificantly different. Few formulations support parametric instantiations of multiple type parameters for Measurement, polymorphism over dimensions. No previous formulation Dimension, and Unit when instantiation of the Unit is all presents dimension checking as a generalization of static that is needed to infer the others. Also, we have selected checking for metaclasses. long names for our classes to help communicate ideas; in a production system, we may want to use shorter names in Andrew Kennedy presents an extension of a core calculus some cases. Finally, support for in our for ML with support for dimension checking [20, 21]. Be- language would allow us to specify arithmetic operations cause ML is a structurally typed functional language rather on Measurements more concisely. If we had autoboxing of than a nominally typed object-oriented language, many of numeric types and we could also overload arithmetic opera- the key issues that have driven our design do not apply to tions on numeric types to take a unit and return a measure- this language. Units are not given separate types and each ment in that unit, many expressions could be written more dimension is associated with a single unit. Other units are succinctly. For example, we could replace expressions such encoded by the programmer as constant values defined in as: terms of this single unit. Because dimensions are added as an ad hoc language extension, the issues involved in en- new Measurement(5) coding them via metaclasses do not arise, nor do those in- volved in general support for kinds whose elements satisfy with 5 * Meter. the properties of an abelian group. Nevertheless, Kennedy’s approach shares similarities with ours. Dimension exponents 5.2 Efficiency of our representation are constrained to be integers. The zero quantity is treated Our formulation of dimensioned quantities is powerful, and as having polymorphic dimension. Also, function types may it allows for implementation with reasonable performance: be polymorphic with respect to dimensions. The problem Our representation of measurements involves only a single of typing functions that are polymorphic with respect to field for each measurement. Elements of this field are el- a dimension exponent (such as power) is discussed but no ements of classes associated with primitive values such as solution is presented. Float and Double. Because the magnitude field is private and final, rudimentary escape analysis should allow us to A significant portion of Kennedy’s proposal involves the for- inline these field values in many cases. Thus, run-time repre- mulation of over dimension types. In this re- sentation of quantities should require overhead about equal spect, it builds on the work of Wand and O’Keefe [28] and to that of boxing primitive values. Additionally, various Goubault [16], which present formulations of dimension in- reductions in expressive power allow for even better per- ference for ML-like languages. Incorporating an inference formance, giving library designers the freedom to choose mechanism over metaclasses and abelian classes based on the power/performance tradeoff for their target applications. these inference systems is an important direction for future When run-time representation of quantities is not required, work. the Measurements themselves could be represented as primi- tive values. Even when run-time representation of quantities Kennedy extends his own work by relating it to the para- is important, it should be possible to simply pass this infor- metricity result of [22]. By proving a “dimen- mation as type parameters to methods that require it and sional invariance” principle over a core calculus based on represent the measurement values themselves as primitives. ML, Kennedy is able to establish several surprising prop- erties of this calculus. For example, it is possible to show All the arithmetic methods are simple nonrecursive one- that functions of the type ∀u . num u2 7→ u cannot return a line methods that are easily inlined by modern just-in-time nonzero result for any argument value, and therefore that an compilers. Through static overloading, calls to the arith- approximating square root function with this type cannot be metic operations are more efficient when additional infor- expressed in the language! It is not at all obvious how to mation can be statically inferred. For example, when the relate this work, which is based on F-bounded parametric static types of the operands are Measurements with the same polymorphism over structural types, to an object-oriented Unit, no unit conversion is necessary. Additionally, field language with typed metaclasses and nominal types. But it _magnitude is final; a savvy just-in-time compiler could cal- would be interesting to determine such a relationship. culate values for methods such as getPrimaryConversion on the first call for a value and cache the result for subsequent Other formulations of dimension checking do not support calls. If performance were still unsatisfactory, we could re- parametric polymorphism over dimensions. Formulations of quire all measurements to be kept in the primary unit of dimensions for object-oriented languages have typically en- their dimension. Although we may lose precision for some coded a fixed set of dimensions. A popular formulation of programs, we would be able to reduce each of the four arith- dimension checking in C++ (via the Standard Li- metic methods to a single arithmetic operation. brary) is SIUnits [5], which provides for the checking of quan- tities in the SI system by encoding the seven dimensions ad- does not include a notion of dimension in his language ex- dressed by that system as integer parameters of a template tension; instead, units of a common dimension are declared class Quantity. Parametric dimensions, class relationships to be related via a counits operator. As a result, unit con- between units dimensions and quantities, and programmer version can occur along multiple paths, adding significant extensions of this hardwired template are not supported. In complexity to an implementation. addition, the underlying type system is unsafe and does not directly represent the abstraction of generic types, greatly Gehani’s approach is criticized by House [18], who notes that reducing the value of static checking. units are not actually declared in Gehani’s proposal. In- stead, new units are simply used when needed by a variable Fowler discusses a modeling of units of measurement used declaration. As a result, it is possible to form pathological in the Cosmos project for the U.K. National Health Ser- definitions where the name of a quantity is identical to the vice [10]. In addition to units and quantities, Fowler in- name of a (distinct) unit. House also notes that Gehani’s cludes a notion of “measurement” that includes the notion system is not statically checkable without severely restrict- of an observation and an observed “patient”. Dimensions ing the use of the UNITS(*) operator. House goes on to cite of quantities and the relationships of algebraically formed a litany of other criticisms of Gehani’s proposal and then units and dimensions are not modeled. Static checking and proposes his own extension of Pascal supporting dimension dimensional parametricity are not discussed. checking.

The formulation closest to our own is by van Delft [27], who In House’s proposal, each dimension is associated with a describes a mechanism for implementing dimensions in Java unique primary unit. Exponents on dimensions are arbi- class files as annotated classes. As with our approach, van trary rational numbers to allow the definition of methods Delft treats the zero quantity as having polymorphic dimen- such as sqrt; the syntax for polymorphic dimensions pro- sion. However, van Delft does not provide a specification for hibits the solution we have presented, where the input di- the semantics of dimension types, does not encode dimen- mension is defined to be the square of a dimension vari- sions as part of a class hierarchy, does not allow for the able. House also restricts exponents on dimensions to be definition of new methods specific to a particular dimen- constants.11 Generic type declarations are supported for sion, and does not include parametric polymorphism over records and procedures. However, procedure return types dimension types or separate types for units. As a result, the are precluded from being generic dimensions independent issues in object-oriented modeling that we have addressed of the parameter dimensions. House proposes to solve the in our proposal do not arise. Van Delft does includes a con- problem of scale conversion by allowing for “units with dif- struct dimension(exp), which evaluates to the dimension of ferent zeros”; unit conversion involves both an additive and exp. Although this feature is useful for linking dimensions a multiplicative conversion. This approach fails to recog- of method parameters (the term “dimension” can be used nize that scale readings have distinct properties from other to take an argument of arbitrary dimension), and provides quantities and should be modeled as separate entities. some of the benefits of parametric dimensions, this limited form of parametricity, combined with the restriction that di- Gehani also discusses the application of units of measure- mension exponents be natural numbers, prevents definition ment to databases [14]. In this context, conversion factors of a polymorphic sqrt function: can change dynamically (e.g., between monetary units), sug- gesting that the best way to store data is not with a stan- double dimension(v)^1/2 sqrt(double*dimension v) dard unit per dimension, but instead in the units in which the data was originally entered. van Delft proposes the following alternative: In the context of database manipulation, Gehani recom- double*dimension sqrt(double*dimension(return*return) v) mends the following operators: Where return is the type of the return value. This nota- tion provides an ad hoc mechanism to allow for equating • counits: specifies that two or more units are of the the return type of a method with its parameter types. But same dimension this mechanism is really just a workaround for the lack of • remove counits: removes unit relationships general support for parametricity, and it prevents expres- sion of parametric dependencies between multiple method • ccc: changes the conversion constant between two units parameters, e.g., (useful when dealing with rates of exchange, improving precision of scientific constants, etc.) boolean lessThan(D x, D y) • in: evaluates an expression in a given unit Other published formulations of dimension checking have • unitsof: returns the units of a given value been over procedural languages. Gehani contrasts the use of derived types in Ada with the use of a facility for unit checking, showing how derived types are an inadequate sub- In our formulation, where dimensions of quantities are de- stitute for a special language feature for units [15]. He also noted explicitly, the counits operator is unnecessary, and presents an extension of Pascal with support for units [13], in 11In fact, because Pascal syntax does not include an expo- which he allows for polymorphic use of units via a UNITS(*) nentiation operator, compound dimensions are expressed via declaration, and a UNITSOF operator. However, his discus- multiplication and division, which is expressively equivalent sion often conflates the notion of units and dimensions: he to requiring dimension exponents to be constant. the remove counits operator is nonsensical (remove counits As a result, detecting commensurability of units becomes a is apparently motivated by the need to maintain possibly central aspect of their approach. However, there is an upshot corrupted, persistent data in a database). In a program- to their focus on commensurability: their mechanism for ming language, units either belong to a dimension or they manipulating unit relationships is more general than ours: do not. Programmers may introduce and correct erroneous Rather than defining the magnitudes of all units for a given declarations of unit classes in their programs, but dynam- dimension in terms of a primary unit, each unit may be re- ically altering the dimension of a quantity is not needed. lated to any other unit. Commensurability is determined The ability to support dynamically changing conversion con- by a interesting computational trick where equations relat- stants between two units could be supported by making the ing units are represented in logarithmic form, presented as conversion factor in class Unit mutable and allowing select matrices, and put into row echelon form. Cycles and in- classes to support a ccc method. The in and unitsof oper- consistencies in the unit relationships can be determined by ator are easily supported as methods in class Measurement. straightforward inspection of the reduced matrix. The fact that we are able to easily incorporate Gehani’s operators into our formulation is a demonstration of the ex- Because we allow for dimension types in our approach, com- tensibility we have attained by unifying our formulation into mensurability detection is not necessary for static checking. a general object-oriented type system. However, intelligent unit conversion at run time would al- low for improved precision. For example, when converting Manner describes a syntactic extension to Pascal and Ada in feet to inches, it may be advantageous to relate them di- which unit declarations are supported [23]. Each dimension rectly rather than to relate them each to meters and then is associated with a unique primary unit and all quantities divide. An interesting direction for future work is to try of that dimension are kept in the primary unit. Polymor- to embed the row-echelon technique of Karr and Loveman phism over quantities is not supported. Dreiheller et al. ex- into the definition of method inUnit on measurements. Ide- pand upon Manner’s syntax to provide a language extension ally, the equations composing this matrix would be checked and implementation (via a preprocessor) for Pascal called statically. PHYSCAL [8]. PHYSCAL supports scale factors on units as well as input/output facilities for quantities. PHYSCAL Parametric units are not discussed in Karr and Loveman’s is limited to supporting SI units; the programmer cannot de- proposal. Raising quantities to a (possibly nonconstant) fine new units or dimensions. Scales of measurement are not power is allowed, so long as the unit of the exponent is com- handled by PHYSCAL, nor is polymorphism over program mensurate with 1. If the exponent is nonconstant, the units constructs involving dimensions. of the expression are assumed to be 1, breaking soundness of static checking.12 Scale measurements and input/output Baldwin proposes another extension of Pascal supporting of quantities are not addressed by their proposal. units inspired by Manner [3]: a fixed set of dimensions is added to Pascal as siblings of type REAL. Each dimension The earliest mention of dimension checking for programming is associated with a unique primary unit, and quantities of languages that we are aware of occurs in J.C. Cleaveland’s that dimension are always kept in the primary unit. Multi- discussion of “pouches” [7]. Pouches are described as a gen- plication and division of dimensioned quantities by dimen- eral way to allow programmers to organize numeric variables sionless quantities is not supported despite recognition that into natural classes or categories. It is shown how a set of such support is needed for correctness. pouches can be organized as the base elements of a freely generated abelian group, and therefore as unit annotations. Hilfinger proposed an encoding of support for dimension No polymorphism is provided over pouches, but it is men- checking in Ada in [17]. Hilfinger’s approach shares some tioned in passing that a notion of “hidden” pouches would be of the flavor of ours in that he attempts to use existing lan- useful when defining arithmetic functions that are naturally guage abstraction mechanisms to encode dimensions rather polymorphic. Exponents on generated elements of pouches than introducing an ad hoc language feature. He relies on are restricted to be constants. No subtyping is provided over the subtyping mechanism of Ada to encode dimensions as pouches. Pouches are used to describe units rather than di- subtypes of records of constant integers. Each field in this mensions, but no facility is provided for converting between record supertype corresponds to a primary dimension. Be- commensurable units. cause the fields in this record are fixed, the programmer cannot define new dimensions. Also, quantities of a di- mension must be represented in a primary unit. Limited 6.2 Generalized class relationships in other polymorphism is allowed because a programmer can refer to languages the record supertype of all dimensions when defining meth- The use of class relationships beyond simple class exten- ods. However, general parametric polymorphism including sion is essential for modeling many real-world ontologies [9]. dependencies between the dimensions of multiple variables Many non-statically-typed object-oriented languages, such is not supported. 12Because parametric polymorphism is not supported, this design decision is reasonable as long as all occurrences of Karr and Loveman discuss mechanically incorporating units these exponent expressions are either accompanied by a into a procedural programming language [19]. Like us, they static warning or required to be annotated by the program- are concerned not with simply adding unit checking to an ex- mer. Provided that enough information is kept to deter- isting language, but rather with understanding more general mine the actual units of these expressions at run time and issues in adding unit checking to a class of languages. Karr to check their use, exponents with nonconstant exponents and Loveman do not include a separate notion of dimension. in this proposal are akin to downcasts in the Java Program- ming Language. as Smalltalk, Python, and Self, allow for more flexible re- concisely than can be done in Smalltalk. Instead of writing lationships. In the case of Smalltalk and Python, classes classes such as ClassClass, ObjectClass, etc., we can define a are instances of metaclasses. Static members of a class are metaclass Class that is parameterized by its contained class: modeled as ordinary members of the corresponding meta- class. Self, on the other hand, is a prototype-based object- Class> Class C> {...} oriented language, where there are no classes at all: object instantiation consists of cloning an existing instance. The Note that the bound on type parameter C is itself Class. members of the clone can be modified and added to at will. This bound allows us to define instance classes such as Class Object {...} Our notion of metaclass differs from that in languages such Class Dimension {...} as Smalltalk and Python, where each class has a unique metaclass that defines its class members. Metaclasses in So instead of having ObjectClass and DimensionClass, we these languages are typically used to define customized prop- have Class and Class. If we want to in- erties of select classes. In contrast, we use metaclasses to clude extra class members in the metaclass of Dimension (for augment traditional static checking of properties that are instance, if we want to model static members of Dimension as typically encoded indirectly. Not all classes have a distinct members of the metaclass), we can subclass Class metaclass. and define Dimension to be a member of the subclass. The important point is that we do not need an explicit Metaclass When viewed as “instance generators”, our metaclasses are class because we can represent metaclasses as instantiations similar to prototypes in untyped languages such as Self. Pro- of a single generic type. Likewise, there is no need for a totypes generate new instances, which themselves can gen- ClassDescription class, significantly simplifying the class hi- erate new instances. Our language is more restrictive than erarchy. prototype-based languages in the sense that all metaclasses and instance classes must be declared statically (i.e., by writ- Dynamically typed languages such as Smalltalk, Self, and ing down class definitions). But MetaGen is more expressive Python are inappropriate for modeling dimension checking. in the sense that we include classes and subclassing rela- The reflection facility in the Java Programming Language tionships. Also, unlike typical prototype-based languages, can also be thought of as providing some of the power of MetaGen is statically typed. metaclasses. For example, each class is regarded by the re- flection facility as an instance of class Class. However, use Because our formulation of metaclasses includes nominal of the reflection facility is not statically checked in the Java generic types, we can handle certain (conceptually paramet- Programming Language, effectively placing it in the same ric) metaclasses more gracefully than they are handled in category as dynamically typed languages in this respect. dynamically typed languages like Smalltalk. For example, in Smalltalk (pre Smalltalk 80),13 the upper ontology of classes Nominal subtyping is an essential property of our extended and metaclasses is as follows (using MetaGen syntax) [29]: language. Many of the instance classes we define may them- selves contain no internal structure, but establishing subtype ObjectClass Object{...} relationships that are independent of structure is central to ClassDescriptionClass ClassDescription extends Object {...} MetaclassClass Metaclass extends ClassDescription {...} our approach. A static type inference system is provided for ClassClass class extends ClassDescription {...} Smalltalk-like metaclasses in Strongtalk in [4]. However, this Metaclass ObjectClass extends class {...} type system is based on structural subtyping with no run- Metaclass classDescriptionClass extends ObjectClass {...} time representation of types. Therefore, it is inadequate for Metaclass MetaclassClass extends ClassDescriptionClass {...} encoding our formulation of dimensions and units, which Metaclass ClassClass extends ClassDescriptionClass {...} relies fundamentally on first-class genericity and run-time User-defined classes are instances of new metaclasses. For representation of types. example, class Dimension would be represented as: 7. FUTURE DIRECTIONS Metaclass DimensionClass extends ObjectClass {...} In this paper, we introduce a generic static-checking system DimensionClass Dimension extends Object {...} over dimensions and units. There are several interesting The metaclasses hold the static members of their instances. directions to extend this work. For example, our system employs explicit polymorphism, which is strictly more ex- Notice that in Smalltalk we are prevented from modeling the pressive in a nominal generic type system with support for structure we really want over these classes because we do not type-dependent operations (consider, for example, a poly- have generic types. An object-oriented generic type system morphic method that takes no arguments and returns an with nominal subtyping such as in NextGen and MixGen [6, instance of a type parameter). Nevertheless, it would be 2, 1] have the important property that each instantiation of natural to extend this system with limited forms of implicit a generic type has a distinct run-time representation (this is polymorphism where type instantiations can be inferred. not the case for a structural type system).14 We can lever- age this fact to encode a metaclass hierarchy much more Another important extension to our system would be to pro- 13 vide support for programmers to define new classes with Smalltalk 80 includes a more complex hierarchy with soft- equational constraints such as those we have built into abelian ware engineering benefits orthogonal to the issues under dis- cussion. classes. A natural approach to this problem would be to 14In Java 2 SDK v.1.5, generic types are erased to their upper operations such as casts and instanceof tests cannot be bounds after type checking. Consequently, type-dependent performed safely on generic types in that language. transfer and extend results from constrained type systems [11] D. Isbell, D. Savage. Mars Climate Orbiter Failure such as the HM(X) extension to Hindley-Milner type infer- Board Releases Report, Numerous NASA Actions ence, presented in [26] and implemented in the Chameleon Underway in Response. NASA Press Release 99-134. extension of Haskell [25], from a structural type system to http://nssdc.gsfc.nasa.gov/planetary/text/ a nominally typed object-oriented type system. mco pr 19991110.txt. Nov. 10, 1999.

We would also like to examine to what degree we can unbox [12] E. Gamma, R. Helm, R. Johnson, and J. Vlissides. dimensioned quantities in real programs to achieve perfor- Design Patterns: Elements of Reusable mance acceptable for high-performance computation. Object-Oriented Software. Addison-Wesley, 1995. [13] N. Gehani. Units of Measure as a Data Attribute. Finally, formalizing the MetaGen semantics over a core cal- Computer Languages, Vol. 2. pp. 93-111. Pergamon culus (e.g., an extension of Core MixGen) is an important Press, 1977. Printed in Great Britain. next step in helping us to better analyze its properties and interactions with other language extensions. [14] N. Gehani. Databases and Units of Measure. IEEE Transactions on Software Engineering. Vol. SE-8, No. Acknowledgements 6, November 1982. We are gratetful to Matthias Felleisen, Sam Tobin-Hochstadt, Cheryl McCosh, Miriam Kadansky, and the anonymous ref- [15] N. Gehani. Ada’s Derived Types and Units of erees for their comments on earlier drafts of this work. Measure. Software – Practice and Experience, Vol. 15(6), 555-569. June 1985. 8. REFERENCES [16] J. Goubault. Inference d’unites physiques en ML. In [1] E. Allen, J. Bannet, and R. Cartwright. A First-Class P. Cointe, C. Queinnec, and B. Serpette, editors, Approach to Genericity. In Proceedings of the Journees Francophones des Langages Applicatifs, Eighteenth Annual SIGPLAN Conference on Noirmoutier, p.3-20. INRIA, Collection didactique, Object-Oriented Systems, Languages, and 1994. Applications, Anaheim CA 2003. [17] P. Hilfinger. An Ada Package for Dimensional [2] E. Allen, R. Cartwright, and B. Stoler. Efficient Analysis. ACM Transactions on Programming implementation of run-time generic types for Java. In Languages and Systems, Vol. 10, No. 2, April 1988. p. IFIP WG2.1 Working Conference on Generic 189-203. Programming, 2002. [18] R. T. House. A Proposal for an Extended Form of [3] G. Baldwin. Implementation of Physical Units. Type Checking of Expressions. The Computer SIGPLAN Notices, V. 22(8). August 1987. Journal, Vol. 26, No. 4. 1983. [4] G. Bracha, D. Griswold. Strongtalk: Typechecking [19] M. Karr, D. B. Loveman III. Incorporation of units Smalltalk in a production environment. In Proceedings into programming languages. Communications of the of the Eighth Annual SIGPLAN Conference on ACM, 21(5):385-391, May 1978. Object-Oriented Programming, Systems, Languages and Applications, pages 215-230, 1993. [20] A. Kennedy. Dimension Types. In Proceedings of the [5] W. Brown. Applied Template Metaprogramming in 5th European Symposium on Programming Languages SIUnits: the Library of Unit-Based Computation. In and Systems. Edinburgh, U.K. 1994. Proceedings of the Second Workshop on C++ Template [21] A. Kennedy. Programming Languages and , October 2001. Available at Programming Dimensions. PhD Thesis. St. Catharine’s College. . http://www.oonumerics.org/tmpw01/brown.pdf November 1995. [6] R. Cartwright and G. Steele. Compatible genericity [22] A. Kennedy. Relational Parametricity and Units and with run-time types for the Java programming Measure. In Proceedings of the 24th Annual ACM language. In Proceedings of the Thirteenth Annual Symposium on Principles of Programming Languages, SIGPLAN Conference on Object-Oriented Systems, Paris, France. January 1997. Languages, and Applications, Vancouver BC 1998. [7] J.C. Cleaveland. Redundant Specification in [23] R. Manner. Strong Typing and Physical Units. Programming Languages Through Pouches. UCLA SIGPLAN Notices V. 21(3), March 1986. Technical Report. 1975. [24] P. Neumann. Risks to the public from the use of [8] A. Dreiheller, M. Moerschbacher, B. Mohr. computers. ACM Software Engineering Notes, 10(3) PHYSCAL: Programming Pascal with Physical Units. July 1985. SIGPLAN Notices, V. 21(12), December 1986. [25] M. Sulzmann, The Chameleon website. [9] D. Ferrucci, C. Welty. What’s in an Instance? RPI http://www.comp.nus.edu.sg/ sulzmann/chameleon/ Computer Science Technical Report. 1994. [26] M. Sulzmann, M. Odersky, M. Wehr, Type Inference [10] M. Fowler. Analysis Patterns: Reusable Object with Constrained Types. Theory and Practice of Models. Addison-Wesley 1997. Object Systems, 5(1), 1999. [27] A. van Delft. A Java Extension with Support for [29] A. Goldberg, D. Robson. Smalltalk-80: The Language. Dimensions. Software – Practice and Experience, Addison-Wesley, 1989. 29(7), 605-616. 1999. [28] M. Wand, P. M. O’Keefe. Automatic dimensional inference. In J.-L. Lassez and G. Plotkin, editors, Computational Logic: Essays in Honor of Alan Robinson. MIT Press, 1991.