Writing Math on the Web
A reprint from American Scientist, the magazine of Sigma Xi, The Scientific Research Society. This reprint is provided for personal and noncommercial use. For any other use, please send a request by electronic mail to [email protected]. © 2009 Brian Hayes. Reproduction with permission only. Contact [email protected]. American Scientist, Volume 97.

Computing Science

Brian Hayes

The Web would make a dandy blackboard if only we could scribble an equation

Brian Hayes is senior writer for American Scientist. Additional material related to the “Computing Science” column appears in Hayes’s blog at bit-player.org. Address: 211 Dacian Avenue, Durham, NC 27701. Internet: brian@bit-player.

The World Wide Web was invented at a physics laboratory, and the first users were scientists and engineers. You might think, therefore, that this new channel of communication would be especially well adapted to scientific discourse, that it would facilitate the expression of ideas like

\[ \nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \]

or

\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^{s}} = \prod_{p\ \mathrm{prime}} \frac{1}{1 - \frac{1}{p^{s}}}. \]

If only it were so! The truth is, the basic protocols of the Web offer almost no support for rendering mathematics or other specialized notations such as chemical formulas. Presenting such material on a Web page often requires software add-ons or plug-ins to be installed by the author or the reader or both. Fine-tuning the display of mathematics can be a fussy and finicky process, not much easier than formatting equations with a typewriter. The results sometimes render differently, or not at all, in various Web browsers. This is a sad situation: As the Web has evolved into a thriving marketplace and playground, the scholarly and scientific community that created the technology has not been well served.

The confused state of online mathematical typography is worrisome as well as sad. In years to come the Web will surely be the most important conduit for scientific information. Already it is a major channel for distributing publications and preprints in many disciplines, and it is becoming a venue for less-formal jottings and conversations: everything from homework assignments to blogs. Ideally, the Web would serve as an extension of the blackboard where people gather to talk about science and math during coffee breaks. We need chalk for that blackboard.

The problem is not one of simple neglect. Over the years there have been many earnest efforts to build a reliable facility for writing and reading mathematics online. The trouble is, no one solution has yet gained the kind of widespread adoption that would make it a standard, supported in mainstream Web servers and browsers. Still, there’s room for hope. We have technologies that work, if we can agree on how to use them. And a minor change to the infrastructure of the Web might smooth the way for more online math.

Penalty Copy

Even in the world of ink-on-paper publishing, mathematical notation can be a challenge. In the days when printing was done with metal type, equations had to be assembled by hand, piecing together symbols on individual slivers of metal and shimming them into position. A manuscript with a lot of mathematics was known as “penalty copy” because printers would charge extra to compose it.

By the 1970s, metal type was giving way to optical and electronic typesetting devices, which projected letterforms onto photographic film. In principle the new machinery might have aided mathematical typesetting because characters could be placed with equal facility at any position and scaled to any size. But the software for running the phototypesetting machines offered no easy way to exploit this flexibility. The printed product was often inferior to expertly set metal type.

Enter Donald E. Knuth of Stanford University, who was so discontented with the deteriorating quality of mathematical typography that he set aside other work and undertook to build his own typesetting system. The eventual result was TeX, a formatting language not just for equations but for entire documents. Leslie Lamport, now of Microsoft Research, soon introduced LaTeX, a higher-level language built atop Knuth’s TeX. (The names are pronounced tek and lah-tek.)

The first equation in the first paragraph on this page is encoded in LaTeX as follows:

\nabla \times \mathbf{E}= -\frac{\partial \mathbf{B}}{\partial{t}}.

Each of the terms beginning with a backslash is a LaTeX command. For example, \frac{}{} constructs a fraction, with whatever appears inside the two pairs of brackets forming the numerator and the denominator. Some commands simply insert a single character, as with \times (×), \partial (∂) and \nabla (∇). The \mathbf{} command applies a boldface font to the bracketed text.

Let me call attention to what is not present in the TeX encoding. There are no explicit instructions for arranging the symbols on the page; all of the geometric details, such as centering the numerator over the denominator and drawing a horizontal bar between them, are handled automatically. Also missing is any hint of what the equation means; TeX is a language for describing mathematical notation, not for doing mathematics. For example, “∇×E” is merely a sequence of three symbols, with no reference to the operation those symbols denote, namely the “curl” of a vector field, a measure of the field’s rotation or angular momentum. (The first equation is one of James Clerk Maxwell’s four famous equations describing the electromagnetic field. The second example gives two definitions of the Riemann zeta function, revealing a remarkable identity between an infinite sum and an infinite product.)

TeX has transformed the process of putting mathematical ideas on paper.
What once required the services of a skilled typesetter can now be done by a mere mathematician. A manuscript can go from the author to a printed journal without human intervention.

But what about the Web, which did not yet exist when Knuth developed TeX? Modern TeX systems produce output in the form of Postscript or PDF files. Although documents in these formats can be distributed over the Web (the arXiv preprint server dispatches thousands of them every day), they are not quite first-class citizens of the Web world. Browsers cannot display Postscript or PDF files without the aid of extra software, and many readers choose to download such files and view them offline or print them. This works well enough for static documents such as journal articles, but it’s not ideal for the more volatile and interactive areas of the Web, such as blogs or the always-under-revision pages of Wikipedia. For mathematics to be fully at home online, it needs to be translated into one of the Web’s native languages.

Figure: Mathematics remains a marginal participant on the World Wide Web, where finely typeset equations are difficult to produce and their appearance blends poorly with other textual elements. In this excerpt from a page of Wikipedia, the collaborative encyclopedia, most mathematical notation is rendered by means of embedded images, which differ in size, style and alignment from the rest of the text. In order to show how the page was assembled, the stylesheet governing the display was altered to give all images a green border and a contrasting background. A few bits of mathematical notation, marked in bright red, are not images but ordinary characters; yet even they do not match the size of the main text. The highlighted panel associated with the first equation shows the TeX code that produced the equation.

Marking It Up

From the outset, the primary language of the Web has been HTML (Hypertext Markup Language), in which “tags” identify various parts of a document’s structure, such as headings, paragraphs and lists. In its earliest versions, HTML was quite simple; it couldn’t even do subscripts and superscripts, so there wasn’t much hope of displaying elaborate mathematical notation.

Several revisions later, HTML does have subscript and superscript tags. And a supplementary language called CSS (Cascading Style Sheets) allows finer control over many aspects of the appearance of a Web page. [...] programming language, so that Web pages become not just static documents but interactive programs. Still, none of these features directly address the needs of scientific and mathematical writing. There are no HTML tags for integrals, say, or for matrices.

Two main impediments stand in the way of presenting mathematics on the Web. First is the alphabet problem: Mathematicians have created a sprawling zoo of novel symbols and embellished or transformed versions of familiar characters. The nabla (∇) that appears in Maxwell’s equations is a notable example, and it is joined by hundreds of other unusual glyphs (∂, R, ∀, ∃, ⊥, ∫), not to mention the entire Greek alphabet and occasional borrowings from Hebrew and other languages. [...] of Unicode fonts, which have room for a larger collection of glyphs than earlier font formats. But it’s still not to be taken for granted that every reader will have the necessary fonts installed.

The second problem is one of layout. Mathematical notation is two-dimensional; in order to represent a matrix, say, or a summation with upper and lower bounds, it’s necessary to specify the exact x and y coordinates of symbols. Some elements of mathematical notation, such as brackets and the radical that designates a square root, vary in shape and size as well as position. Encoding such geometric information in HTML and CSS is not impossible, but it stretches the technology to its limit.

Early in the history of the Web, a group of mathematicians and other