Iso/Iec Jtc1/Sc2/Wg2 N3198r 2007-01-26 L2/07-011R

Iso/Iec Jtc1/Sc2/Wg2 N3198r 2007-01-26 L2/07-011R

N3198 - 29 Additional Mathematical and Symbol Characters ISO/IEC JTC1/SC2/WG2 N3198R 2007-01-26 L2/07-011R Universal Multiple Octet Coded Character Set International Organization for Standardization Organisation internationale de normalisation Международная организация по стандартизации Doc Type: Working Group Document Title: 29 Additional Mathematical and Symbol Characters Source: Asmus Freytag (Unicode), Barbara Beeton (American Mathematical Society), Patrick Ion, Murray Sargent and David Carlisle (MathML Working Group of the W3C), Roozbeh Pournader (Sharif FarsiWeb) Status: Expert contribution Action: For consideration by JTC1/SC2/WG2 and UTC Related: Summary As the mathematical community completes its migration to Unicode, the need for addi- tional mathematical symbols has been discovered, to complete the extension of the mathematical repertoire carried out over the last few revisions of the Standard. This document contains a request for 29 new symbols required for the publication of mathe- matical and technical material, to support mathematically oriented publications, or to address the needs for mathematical markup languages, such as MathML. The STIX project (http://www.ams.org/STIX) of the STIPub consortium of technical and scientific publishers, which includes the AMS, is engaged in creating a font encompass- ing the characters needed in the publication of technical and mathematical journal articles and similar works. A number of the requested characters come from this project of re- viewing the mathematical and technical literature, which implies that actual use can be attested and that the reviewers feel that the user community as represented by academic publishers has an interest in being able to encode the character in question. Other requests for additional characters come from the ongoing work of the MathML Working Group of the W3C as well as from public review comments. This document contains several numbered sections containing specific proposals for additional characters with background information, followed by three appendices con- taining related material for information, followed by the proposal summary form. 1 of 18 N3198 - 29 Additional Mathematical and Symbol Characters 1 Non-combining Diacritic A mathematical variable, or even an entire expression, can be decorated with what is called a math accent. By convention such a decoration is indicated with a Unicode char- acter, even in markup languages, such as MathML. MathML 2.0 tags the accent with <mo> (operator) tag. As a result of this, the markup syntax character “>” is occasionally followed by a non-spacing character. Where possible, MathML avoids this by recom- mending the use of spacing clones of diacritics where they are available. Where no spac- ing clone exists, the layout of the source text may be affected by the combining accent, but the interpretation of the source and the display of the actual content of the document remain unambiguous—with one exception. The exception is the diagonal stroke overlay (U+0338) often used for negation. The problem here is that the combination of > fol- lowed by U+0338 is canonically equivalent to U+226F ≯ NOT GREATER-THAN. This means that text containing > followed by U+0338 cannot be distinguished from NOT GREATER-THAN. Therefore, the MathML working group has expressed their concern: It's of considerable importance that we have a uniform treatment of math accenting using non-combining charac- ters in MathML. The interaction with Unicode Normaliza- tion, as recognized by the XML recommendations, and thus in XML parsers, is what drives the request. In support of this request, we propose the addition of a spacing clone for 0338: 27CB / MATHEMATICAL SPACING LONG SOLIDUS OVERLAY The recommended properties are the same as for other spacing clones of combining accents, except that this character will not have a decomposition. A general category of Sm (Symbol, math) is not recommended. 2 Invisible Plus Operator In analogy to the other invisible operators coded in the range 2061 to 2063, it is proposed to encode an INVISIBLE PLUS OPERATOR character to be able to unambiguously represent expressions like 3½, which occur frequently in school or engineering texts. For example, in presentation MathML notation, the markup source for this is: <mrow> <mn>3</mn> <mo>&InvisiblePlus;</mo> <mfrac><mn>1</mn><mn>2</mn></mfrac> </mrow> Not having an operator at all would imply multiplication as in the example 3 where the 3 represents a factor multiplying the following fraction. As is the case for the existing invisible operators, this character would primarily be required for unambiguous 2 of 18 N3198 - 29 Additional Mathematical and Symbol Characters representation of the mathematical intent, for example for machine parsing. In a publica- tion, the operator would not be visible to the human reader, who would, as usual, rely on larger context to determine the intended meaning of the juxtaposition. The MathML Working Group requests the addition of an invisible math operator: 2064 † INVISIBLE PLUS with a glyph consisting of a dashed square around a plus sign in analogy to other such operators, and General_Category=Cf and Default_Ignorable=true. 3 Math Delimiters In typesetting tall mathematical expressions, a special set of delimiters that look like flattened parentheses are occasionally used. The chosen encoding model for large math delimiters is to use the ordinary character code to select the type of delimiter and have the layout software do the required stretching to scale the delimiter to the enclosed content. The delimiters considered here are special, in that they have no ordinary-sized counter- parts. In TeX, these delimiters can be selected with the \lgroup or \rgroup macros, but for Unicode-based math layout, whether via MathML or other format, they cannot be used, as there are currently no corresponding characters. Accordingly we propose 27EE ° MATHEMATICAL LEFT FLATTENED PARENTHESIS = lgroup 27FF ¢ MATHEMATICAL RIGHT FLATTENED PARENTHESIS = rgroup with representative glyphs that look like scaled-down versions of the shapes in Figure 1. Like other paired punctuation, these should be assigned General_Category values of Ps/Pe and should be mirrored. Figure 1. \lgroup and \rgroup from the TeXbook 3 of 18 N3198 - 29 Additional Mathematical and Symbol Characters 3 Arrows The Unicode Standard contains the following quadruple stemmed arrows: U+27F0 ⟰ UPWARDS QUADRUPLE ARROW U+27F1 ⟱ DOWNWARDS QUADRUPLE ARROW. In reviewing the font being created for the STIX Project of the mathematical and scien- tific publishers, it was noted that the left/right orientations of these had been inadvertently omitted from earlier requests. The triple and double arrows are already encoded in all four orientations. Accordingly the following characters are proposed, with glyphs based on appropriately rotated shapes of the glyph for U+27F0. 2B45 £ LEFTWARDS QUADRUPLE ARROW 2B46 § RIGHTWARDS QUADRUPLE ARROW Like all other arrows in Unicode, these should not be mirrored. 4 Long Division In texts for elementary arithmetic, there’s a common notation for long division for which the correct shape looks something like this _____ 2 ) 616 where the horizontal extent matches the operand. The occasional alternate representation ____ 2 | 616 is regarded as an inferior fallback representation. While this symbol is not used in scientific publishing, its use in educational material is very widespread and key implementers have expressed interest in supporting it. Figure 1a shows several typical ways of using this symbol, both in displayed, as well as in inline form (in the “bubble”). Figure 1a. From a 4th Grade Math Workbook 4 of 18 N3198 - 29 Additional Mathematical and Symbol Characters The proper way to support this notation would be analogous to the support of the square root which also needs to expand based on the expression of which the root is taken. In other words, the character code would be used in a context, such as MathML or other mathematical notation that supports the necessary scoping. In the linear notation devel- oped by one of us (MS), for example, the operand would be enclosed in parens which are suppressed in display. Accordingly we propose 27CC • LONG DIVISION with properties to match U+221A √ SQUARE ROOT. Note that in ordinary text the charac- ter would simply act as a standalone symbol and not require special handling by non- mathematical layout engines. 5 Large Squares According to the discussion of the use of abstract geometrical shapes in UTR #25, Uni- code Support for Mathematics (http://www.unicode.org/reports/tr25/), the "normal" size of the characters for squares in the Standard is medium large. See Table 2.4 in that report. However, the STIX project has received a specific request from the American Institute for Physics (AIP) for squares that are larger than the standard squares mapped to U+25A0 BLACK SQUARE and U+25A1 WHITE SQUARE (see figure 2, but note that the source says “should be larger” than shown). Figure 2. AIP character request to STIX The entity names in Figure 2 imply a systematic distinction between large and regular squares (as corroborated by the comment for &squarelg;), even though the glyphs in the request document cited are inadequate. 5 of 18 N3198 - 29 Additional Mathematical and Symbol Characters The character mentioned for that purpose in table 2.4 of UTR #25, U+2588 █ FULL BLOCK does not appear appropriate for several reasons: By design this character is a full display cell in the font, and for nearly all fonts, that’s a rectangle and not a square. There is also no corresponding white form, as there is for all other squares. The final reason is that squares for mathematical fonts should be centered on the math centerline, while the various blocks are aligned on the font’s maximum ascent and descent. We therefore propose the addition of the following characters 2B1B ≤ BLACK LARGE SQUARE 2B1C ≥ WHITE LARGE SQUARE with a size in between U+25A0 ■ BLACK SQUARE and U+2588 █ FULL BLOCK. (See also the review paper by P.R.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    18 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us