On Writing Op enMath Content Dictionaries
James H. Davenp ort
Dept. Mathematical Sciences
University of Bath
Bath BA2 7AY
England
Abstract by which Op enMath achieves its goal of b eing an \an ex-
tensible framework for exchanging semantically-ri ch repre-
This pap er is based on various discussion with the Op en-
sentations of mathematical ob jects". So the motivation for
Math Consortium, and recently at the UniversityofWest-
writing a CD is that there is some semantics that one wishes
ern Ontario. All errors are the author's. Many helpful sug-
to exchange. The consequences of this motivation are the
gestions have b een made, particularly by Dr. Dewar and
following.
Dr. Naylor.
1. There must b e a \new" piece of semantic information
This pap er outlines some of the issues that a ect au-
to convey.
thors of Op enMath Content Dictionarie s, and their asso ci-
ated Small Typ e System [4] les.
2. It must b e p ossible to write down informally the \com-
mented mathematical prop erties" or CMPs the se-
mantics that the author of the CD intends to convey.
1 Intro duction
It should also b e p ossible to write Formal Mathemat-
This pap er addresses the following questions ab out the writ-
ical Prop erties FMPs, but this is not necessary, and
ing of Op enMath Content Dictionaries CDs, and asso ci-
is clearly imp ossible for all of mathematics. It maybe
ated Small Typ e System STS [4] les.
that some of the new items can b e de ned in terms of
others, even if not everything can b e de ned formally.
1. Why write a Content Dictionary?
3. There must b e a motivation for wishing to convey it.
2. What should I lo ok at b efore/while writing a Content
One such motivation would b e that two algebra systems
Dictionary?
wished to communicate ob jects with these semantics,
but this is far from the only motivation. Databases
3. What should I b ear in mind while writing a Content
might contain these ob jects consider the data de-
Dictionary?
scrib ed in [2], or wemay wish to search in pap ers
containing such items in formulae.
4. What is approval for a Content Dictionary, and howdo
I get a Content Dictionary approved?
3 An example: multisets
1.1 Content Dictionary Groups
MathML [12] de nes the concept of set, and says that it can
1
A Content Dictionary Group is a le normally with exten-
haveatype attribute of normal or multiset. Op enMath
sion .cdg which sp eci es a grouping of CDs for a logical
has a set1.ocd and corresp onding set1.sts available as
purp ose. For example, the MathML group is a group of
[9] which de ne the semantics for ordinary sets. Though
CDs whose symb ols are equivalent to the symb ols of Con-
they do not explicitly say so, it is the case that A [ A = A.
2
tent MathML [12, App endix C].
Since Op enMath do es not have the same sort of attribute
A CD can b e in more than one CD group. For example,
concept that MathML has, we need a way to enco de multi-
setname1.ocd is in the MathML CD group, since it contains
sets in Op enMath.
symb ols Z etc. which are in content MathML. It is also in
The following p ossibili ti es exist.
the setname.cdg group, which also contains setname2.ocd,
1. Add a multiset constructor to set1.ocd
which has various set names, suchasP, which are not in
content MathML.
2. De ne a new CD probably called multiset1 with the
same op erators as set1 except that set would proba-
2 Why write a CD?
bly b e called multiset but di erent semantics.
3. De ne a new CD probably called multiset1 with op-
A Content Dictionary is a fundamental concept of Op en-
erators suchasmultiset-union etc.
Math. The symb ols contained in a CD form the mechanism
1
Presumably the default, though this is not made explicit.
Supp orted by the Op enMath Esprit Pro ject 24.696.
2
For go o d reasons: it's only p ossible in XML to have xed at-
tributes of the MathML kind, whichwould con ict with the extensi-
bility of Op enMath. 1
The rst has some drawbacks. 5 What to Bear in Mind
The semantics are not the same, so one needs to say
The key things to b ear in mind while writing a CD are the
something like \If A is a set rather than a multiset
following.
then A [ A = A". Lest this seem trivial, consider the
fact that
1. Op enMath is ab out semantics, rather than the elegance
of rendering. There are many alternativeways to de-
A [ B =AnB[A\B[BnA
termine how something is rendered, but this is the job
is true for sets but false for multisets: if A =
of MathML [12], particularly its presentation mo de,
f1; 1; 2g and B = f1; 2; 2g, the the left-hand side is
rather than of Op enMath.
f1; 1; 1; 2; 2; 2g, but the right-hand side is f1; 1; 2; 2g.
One example will illustrate this p oint. A colleague com-
mented as follows. \It o ccurred to me whilst writing
It is then imp ossible to distinguish b etween the op er-
that some Op enMath phrases like`a; b; c 2 Z' seem to
ation of union on sets and multisets. Some texts use
o ccur quite frequently in mathematics. At the moment
di erent symb ols for the two, thus allowing a text to
the only way to enco de this in Op enMath is the follow-
write \A [ A = A, but A t A 6= A". It is also p ossible
ing:
that an algorithm might wish to consider b oth.
Hence the choice lies b etween the second and third. This is
a matter of preference, but it seems that the general Op en-
3
Math convention has b een to cho ose the second rather than
the third. In the current context, one can see that
seems somewhat redundant compared with
Examples of a CD and the asso ciated STS le constructed
on this principle are given in the full version [5], also [9].
4 What to Lo ok at
The key do cument is the formal Op enMath standard [8].
Clearly this do cument is imp ortant, as is the description of
the Small Typ e System [4]. Existing CDs, particularly the
4
draft ocial ones at [9], are a useful source of layout , but
also of examples of how things can b e describ ed. In par-
ticular, when lo oking at writing multiset1.ocd, the author
Surely it would b e much neater if instead 'in' was made
lo oked carefully at set1.ocd.However, the reader will no-
n + 1-ary, and we could say something like:
tice that not all the examples were blindly copied: some have
b een changed to examples more appropriate for the case at
hand.
In the eld in whichyou are working, there may b e some
standard reference works, e.g. [1] in the area of sp ecial func-
tions. These should certainly b e consulted, but it should b e
b orne in mind that suchworks, however famous, may not b e
complete see [3] for examples. Equally, there may b e stan-
dard software systems, and it would make sense to lo ok at
them rst. However, one should not exp ect uniformity here.
:::".
The following table shows what happ ens on an apparently
simple example: the de nition of arccot 1.
We note, however, that no \new" semantics were in-
[1] 1st printing 3=4 inconsistent
volved: indeed the fact that an existing representation
[1] 9th printing =4
in Op enMath was quoted essentially proves this. This
[6] 5th edition ? inconsistent
led to the analysis in Table 1, and a general principl e:
[13] 30th edition 3=4 inconsistent
MathML Content and Op enMath will aim for sim-
Maple V release 5 3=4
plicity in the core language at the cost of requiring a
Axiom 2.1 3=4
more complex translation into \go o d" written mathe-
Mathematica [11] =4
matics.
Reduce 3.4.1 =4 in oating p oint
At this p oint, the reader maywell ob ject that this prin-
Matlab 5.3.0 =4 in oating p oint
ciple has not always b een followed in the most basic
Matlab 5.3.0 3=4 symb olic to olb ox
Op enMath Content Dictionaries . For example, wehave
3
For example, the op erator in arith2 is called times not not and in,sowhy notin, since the justi cation for re-
commutat iv e-t im es .However, this is not an invariable rule, since the
placing
prop osed CD for multi-valued inverse functions uses Log for the multi-
valued equivalent of log, since this is the common mathematical con-
vention.
4
The formal rules for an Op enMath CD are given in [7].
Area binary 2 status quo n-ary 2
Translation from T X etc. An easy transformation nothing needed
E
Formal Reasoning Nothing sp ecial An extra rule needs to b e added: in practice
most systems will probably remove n-ary 2
Generating `quality' human output Need a rule to atten binary Might not need such a rule, but that's not
2 to n-ary certain
Humans reading or writing MathML Frankly, they have enough problems that who cares?
or Op enMath directly
Table 1: Binary or n-ary 2?
so that we can say that J satis es Bessel's equa-
tion, rather than having to say that x:J ; x satis es
Bessel's equation.
3. Do not pun, or, harder, p erp etrate general puns that
you may not even have noticed. If there are di er-
ent meanings for a piece of notation, then there should
by
b e di erent symb ols. Two classic examples from the
MathML CD group are the existence of b oth minus
and unary_minus in arith1.ocd, and the existence of
b oth int and defint in calculus1.ocd.
4. Use CDgroups, and do not b elieve that a single CD
seems to b e the same as that rejected ab ove.
needs to solveeverything. CDs can b e split themati-
cally: for example, rather than having one CD of sp e-
The answer here is that the \core" CDs of Op en-
cial functions, one could have one for Bessel and, p os-
Math are required to p ossess a natural mapping onto
sibly, Bessel-like functions, one for sp ecial integrals,
MathML, and this was, unfortunately, a higher prior-
such as the sine-integral and, p ossibly erf, and so on.
ity than this general principle as far as these CDs were
All these could b e in one CDgroup of sp ecial functions.
concerned . This may sound like a case of \do as I say,
Doing this split has the advantage that
not as I do", but the Op enMath design community
have tried to apply this principle whenever MathML
compatibili ty did not override the rule.
2. It is imp ortant to de ne the mathematically natural
is slightly easier to write not that direct writing of
concept, rather than the computationall y natural one.
Op enMath is to b e encouraged than
This may sound obvious, but we will give an illustra-
tion: the Bessel functions. Consider the Bessel func-
tion J z for simplicity: the same p oints are true for
other functions. This could b e de ned, as it is in all
and, more seriously, is more likely to render as J with-
numerical libraries, as a function C C ! C, i.e. in
out sp ecial pro cessing.
STS
CDs can also b e split by depth, so group1.ocd might
contain some relatively common group-theoretic con-
cepts, whereas group2.ocd etc. might contain more
abstruse ones.
5. The choice of symb ol name is imp ortant. As seen in the
previous paragraph, it is p ossible to have brief symbol
names if they are unambiguous in the context of that
CD.
but it makes more sense to de ne it as C ! C ! C,
i.e. in STS
One p erennial question is the use of upp er/lower case
in a symb ol name. In general, lower case is used, except
that multi-word names in which no other punctuation is
used generally start the second and subsequent words
with a capital to make reading the symb ol name eas-
ier. Initial capitals are normally used in the following
settings.
deemed to include similar names such as Numer-
icalValue and Ob ject from sts.ocd. 3
b Well-known functions which normally take a cap- 7 The Approval Pro cess
ital: J etc., and NaN, also, as is usual in math-
The Op enMath So cietyhttp://www.openmath.org is the
ematics, the multi-valued elementaries: Log and
ocial guardian of approved Op enMath CDs. These can b e
Arcsin etc. This rule also covers LaTeX_encoding
referred to by their URL on the Op enMath Web site.
etc.
Currently until 30.9.2000 the Esprit Op enMath pro ject
c Typ e constructors like SigmaType and PiType,
is suggesting CDs to the Op enMath So ciety. Professor
named for capital Greek letters, and other con-
Davenp ort [email protected] is the co ordinator of
structors in the ECC world, suchas Setoid.
this pro cess, and suggestions should b e made to the Esprit
d meta.o cd is exempt, since the cases should match
pro ject [email protected] or directly to him.
those in Op enMath keywords.
After this p erio d, the Op enMath So ciety has formulated
a pro cedure for submitting CDs. The full details are at
e General abbreviations, such as CD itself, also
[10], but in brief the CD is submitted to any memberofthe
DMP for distributed multivariate p olynomial.
Executive Committee: http://www.nag.co.uk/projects/
OpenMath/omsoc/society/board.html
6 Go o d CD-writing style
There are various guidelines that make reading a CD easier,
References
for humans and for machines. We list some of them here:
[1] Abramowitz,M. & Stegun,I., Handb o ok of Mathemati-
the list will doubtless evolve as the community gains more
cal Functions with Formulas, Graphs, and Mathemati-
exp erience in writing, and criticisin g, CDs.
cal Tables. US Government Printing Oce, 1964. 10th
1. Give examples. Every symb ol de ned in a CD
Printing Decemb er 1972.
should have at least one example, either as such i.e.
[2] Conway,J.H., Hulpke,A. & McKay,J., On TransitivePer-
mutation Groups. LMS J. Computation and Math. 1
are signi cantly di erent uses of a symb ol see set in
1998 pp. 1{8.
set1.ocd for an example, multiple examples are prob-
ably appropriate. In general, FMPs are more useful
[3] Corless,R.M., Davenp ort,J.H., Je rey,D.J. & Watt,S.M.,
than examples, since formal systems can also read the
\According to Abramowitz and Stegun". This issue of
FMPs, whereas there are no additional semantics asso-
the SIGSAM Bulletin.
ciated with examples. However, examples can illustrate
[4] Davenp ort,J.H., A Small Op enMath Typ e Sys-
uses even idioms which don't have a place in FMPs,
tem. Op enMath Esprit Pro ject Deliverable 1.3.2c.
so there can b e no blind rule \convert all examples into
http://www.nag.co.uk/projects/OpenMath/omstd/
FMPs".
sts.ps.gz. See also this issue of the SIGSAM Bulletin.
2. If an FMP is given, then the equivalent English CMP
[5] Davenp ort,J.H., On Writing Op enMath Content Dic-
should also b e given. If p ossible, then the converse
tionaries. Op enMath Esprit Pro ject Deliverabl e 1.3.2c.
should also b e done. CMPs should b e written in En-
http://openmath.nag.co.uk/openmath/deliverabls/
glish, and preferably avoiding abbreviations or sp ecial
5
d142/d142.tex
notation, e.g. T X , e.g. \if a and b are real numb ers
E
:::" rather than \a,b in R :::". Note that symb ols
[6] Gradshteyn,I.S. & Ryzhik,I.M. ed A. Je rey, Table of
in CDs represent concepts, they do not represent, or
Integrals, Series and Pro ducts. 5th ed., Academic Press,
p erform, op erations.
1994.
3. FMPs should b e as comprehensive as reasonable
[7] The Op enMath Consortium. A DTD for
clearly not all theorems of set theory could b e given in
Op enMath Content Dictionaries. http://
set1.ocd, though. Examples tend, on the other hand,
www.nag.co.uk/projects/OpenMath/omstd/omcd.dtd.
to aim for brevity.
[8] The Op enMath Consortium. Draft Op enMath Standard.
4. The CDUses eld, which lists all the other CDs which
http://www.nag.co.uk/projects/OpenMath/omstd.
have symb ols o ccurring in this CD i.e. in examples
[9] The Op enMath Consortium. Draft Content Dictionarie s.
and FMPs should b e correct. There are to ols to assist
http://www.nag.co.uk/projects/OpenMath/corecd/.
with this.
[10] The Op enMath So ciety. Content Dictionary
5. While it would b e na ve to exp ect a CD to b e com-
Acceptance Pro cedures. http://www.nag.co.uk/
pletely self-contained, some restraint is necessary.In
projects/OpenMath/omsoc/standard/cdacc.html.
particular, a CD should not use CDs which are less
\ocial" than it itself is. Ideally CDs in a given CD
[11] Wolfram,S., The Mathematica Bo ok. Wolfram Me-
group would refer to each other and ocial CDs only.
dia/C.U.P., 1999.
[12] World-Wide Web Consortium, Mathematical Markup
6. The corresp onding STS le [4] should b e written.
Language MathML [tm] 2.0 Draft Sp eci cation .
These signatures should b e as helpful as p ossible, and
W3C Draft, revision of 11 February 2000. http://
not say, for example, Ob ject!Ob ject simply b ecause
www.w3.org/TR/2000/WD-MathML2-20000211.
the writer is lazy.
5
[13] Zwillinger,D. ed., CRC Standard Mathematical Ta-
It is hop ed at least by the author that, as MathML b ecomes
more widely adopted and conventions for mixing di erent XML lan-
bles and Formulae. 30th. ed., CRC Press, Bo ca Raton,
guages b ecome established, the DTD for CDs will allow MathML
1996.
descriptions in CDs. 4