Representation of Derived Units in UnitsML (revised December 8, 2006)
Peter J. Linstrom∗
December 8, 2006

∗phone: (301) 975-5422, e-mail: [email protected]

Contents

1 Introduction
2 Why this convention is needed
3 Information needed to define a unit
4 Proposed XML encoding
5 Important conventions
6 Potential problems
7 Possible alternatives
A Multiplicative prefixes
B SI units and units acceptable for use with the SI
C non-SI Units

1 Introduction

This document describes a proposed convention for defining derived units in terms of their base units. This convention is intended for use in the UnitsML markup language to allow a precise definition of a wide range of units. The goal of this convention is to improve interoperability among applications and databases which use derived units based on commonly encountered base units. It is understood that not all units can be represented using this convention. It is, however, anticipated that a wide range of scientific and engineering units of measure can be represented with this convention.

The convention consists of representing the unit in terms of multiplicative combinations of base units. For example the unit centimeter per second squared would be represented in terms of the following:

1. The unit meter with the prefix centi raised to the power 1.
2. The unit second raised to the power −2.

Please note that this convention seeks to address the problem of defining derived units, not to define conversion factors. For this reason it will only support multiplication by constants which have defined prefixes.

2 Why this convention is needed

Without this convention, there is no easy way to reliably compare unit definitions from different sources to see if they are the same.
The proposed symbolic identifier for UnitsML can be used for this purpose, but it is not parsable XML, so it requires a specialized parser and cannot be validated against an XML schema. As will be noted later, other than syntax, this proposal is similar to the symbolic identifier; the need to enumerate a set of base units and multiplicative prefixes is the same for both approaches.

Other identifying data in the current XML schema lack the qualities which would make them useful for comparing unit definitions from different sources. Numeric identifiers are assigned by the author of the definition and thus are only useful for comparison within the context in which they were assigned. Names are language specific. Even within a given language there may be multiple names for a given unit, so names may not be unique identifiers.

Under this proposal, information about the definition is provided in a structured format based on enumerated and external base units combined with multiplicative prefixes. This will allow comparison of unit definitions from different sources, something essential for interoperability of applications with different unit definition databases. Such comparison will be done by comparing base units, multiplicative prefixes, and exponents of units to see if they match.

3 Information needed to define a unit

In order to define a unit in terms of other units, the following information is needed for each unit which will be used in the definition:

identifier: An identifier which specifies the unit.
prefix: A code which notes a factor by which to multiply the unit.
exponent numerator: Numerator of the exponent to which the unit and prefix are raised. The exponent is expressed as a separate numerator and denominator to restrict it to rational numbers (by restricting the numerator and denominator to integers).
The exponent is applied to both the unit and the prefix.
exponent denominator: Denominator of the exponent to which the unit and prefix are raised.

Base units may be specified via a controlled vocabulary or reference to an external database. This proposal defines a controlled vocabulary for units likely to be found in a wide range of endeavors. The units for the controlled vocabulary were chosen to cover a wide range of base units encountered in practice.

The codes used to identify units in the controlled vocabulary are internal representations to be used by UnitsML. They are not to be confused with symbols to be used in text documents or official abbreviations for the units. For convenience, codes for the units were taken from CEFACT Recommendation 20 [1] where possible. In cases where Recommendation 20 does not define an appropriate code, a code was constructed from the unit name. Unlike the Recommendation 20 codes, these codes contain lower case letters and are solely intended for use in this controlled vocabulary. In most applications, users should never see the codes defined in the appendix.

Since it is not practical to enumerate all possible base units in a fixed controlled vocabulary, this proposal also allows base units from external databases to be specified. It is envisioned that this facility would only be used when a base unit is not in the controlled vocabulary, since use of such identifiers may limit interoperability.

It is proposed that only well-defined units which are not explicitly derived units be used for base units. This would mean that named derived units, such as newtons, could be used, but explicitly derived units, such as acre-feet, could not. Units such as acre-feet can be defined as derived units.

There is one important and potentially controversial unit listed in table 24. The item unit refers to a count of items and can be used to note derived units which include such counts (e.g.
neutron flux). This concept is at odds with the SI, which assigns such counts a unit of 1. This proposal includes counts as a named unit because some communities may find the semantic precision provided by this unit of value. It is anticipated that individual groups will decide if the use of this unit is of value for their applications. Thus it is likely that this unit will be used in some fields of endeavor (commerce), but not in others.

The units defined in the appendix have been taken from several sources [2, 3, 4, 5, 6, 7]. It is important to remember that this is a working document and that the list of units in the appendix is only an initial attempt at enumerating units to be defined. It is envisioned that units will be added or removed from the list based on input from the UnitsML developers. In addition, it should be recognized that the codes defined in this document are solely for enumerating base units in the XML schema; they are not intended for use in any way outside of representing derived units in UnitsML.

Proposed codes for prefixes are provided in appendix A. Supported prefixes include those defined by the SI and multipliers based on powers of two which are defined by the IEC.

Exponents are specified by indicating both an integer numerator and denominator. This avoids the problems associated with using floating point numbers to specify fractional quantities.

4 Proposed XML encoding

As noted above, derived units can be expressed as the product of base units with a multiplicative prefix raised to a specified power. It is proposed that such definitions be contained in an element named baseUnits. This element would contain one element for each base unit in the definition.
Each base unit would be noted in an enumeratedBaseUnit or externalBaseUnit element, depending on whether the enumerated list or an external definition is used. The enumeratedBaseUnit element would have the following attributes:

multiplier: One of the codes for the multiplicative prefixes defined in appendix A. If omitted there is no prefix.
unit: One of the unit codes defined in the appendix. This attribute is required.
numerator: The numerator of the exponent to which the unit and prefix are raised. The value should be an integer. The default value is one.
denominator: The denominator of the exponent to which the unit and prefix are raised. The value should be an integer, but must not be zero. The default value is one.

The externalBaseUnit element would also have the multiplier, numerator, and denominator attributes as defined above. To specify units this element would include the following mandatory attributes:

source: URN which specifies the namespace for the unit identifier.
unit: Identifier (numericID attribute) for a unit definition (unit element) at the resource specified by the URN noted above.

These attributes specify a unit defined by an external resource.

The baseUnits element is a child of the unit element. Only one baseUnits element per unit element would be allowed.

The proposed markup is best illustrated with a few examples. The text in figure 1 shows the relevant markup for a cubic international foot. Another example (showing the use of rational number exponents) is the markup for centimeters to the three-halves power given in figure 2.

    <baseUnits>
      <enumeratedBaseUnit unit="FOT" numerator="3" />
    </baseUnits>

Figure 1: Representation for a cubic international foot.

    <baseUnits>
      <enumeratedBaseUnit unit="MTR" prefix="c" numerator="3" denominator="2" />
    </baseUnits>

Figure 2: Representation for centimeters to the three-halves power.
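
As an illustration of how such markup could support the comparison of unit definitions from different sources, the following Python sketch reads a baseUnits element and reduces it to a set of (unit, prefix, exponent) entries. It is not part of the proposal. The function names canonical_form and same_unit are hypothetical, the sketch handles only enumeratedBaseUnit children, and it makes two assumptions: that the prefix code may appear under either the multiplier attribute named in the text or the prefix attribute used in figure 2, and that exponents which reduce to the same rational number (3/2 and 6/4) should be treated as equal.

```python
from fractions import Fraction
import xml.etree.ElementTree as ET


def canonical_form(base_units_xml):
    """Reduce a baseUnits element to a set of (unit, prefix, exponent) tuples.

    Hypothetical helper for illustration; externalBaseUnit is not handled.
    """
    root = ET.fromstring(base_units_xml)
    terms = []
    for child in root.findall("enumeratedBaseUnit"):
        unit = child.attrib["unit"]  # the unit attribute is required
        # Assumption: accept either attribute name for the prefix code;
        # an empty string stands for "no prefix".
        prefix = child.get("multiplier", child.get("prefix", ""))
        # Both parts of the exponent default to one when omitted.
        numerator = int(child.get("numerator", "1"))
        denominator = int(child.get("denominator", "1"))
        if denominator == 0:
            raise ValueError("denominator must not be zero")
        # Fraction keeps the exponent exact and reduces it to lowest terms.
        terms.append((unit, prefix, Fraction(numerator, denominator)))
    return frozenset(terms)


def same_unit(xml_a, xml_b):
    """Compare two definitions by base unit, prefix, and exponent."""
    return canonical_form(xml_a) == canonical_form(xml_b)


# The markup from figures 1 and 2, plus a variant of figure 2 whose
# exponent is written as 6/4 instead of 3/2.
CUBIC_FOOT = '<baseUnits><enumeratedBaseUnit unit="FOT" numerator="3" /></baseUnits>'
FIGURE_2 = ('<baseUnits><enumeratedBaseUnit unit="MTR" prefix="c" '
            'numerator="3" denominator="2" /></baseUnits>')
FIGURE_2_UNREDUCED = ('<baseUnits><enumeratedBaseUnit unit="MTR" prefix="c" '
                      'numerator="6" denominator="4" /></baseUnits>')
```

Under these assumptions, same_unit(FIGURE_2, FIGURE_2_UNREDUCED) holds while same_unit(FIGURE_2, CUBIC_FOOT) does not. Storing the exponent as an exact rational rather than a floating point number follows the reasoning given in section 3 for keeping the numerator and denominator separate.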