02/02/2010 Writing Math on the Web » American S…

COMPUTING SCIENCE

Writing Math on the Web

The Web would make a dandy blackboard if only we could scribble an equation

BrianHayes Theworldwidewebwasinventedataphysicslaboratory,andthefirstuserswerescientistsandengineers.Youmightthink,therefore,thatthisnew channelofcommunicationwouldbeespeciallywelladaptedtoscientificdiscourse—thatitwouldfacilitatetheexpressionofideaslike

or

Ifonlyitwereso!Thetruthis,thebasicprotocolsoftheWeboffer almostnosupportforrenderingmathematicsorotherspecialized notationssuchaschemicalformulas.PresentingsuchmaterialonaWeb oftenrequiressoftwareaddonsorpluginstobeinstalledbythe authororthereaderorboth.Finetuningthedisplayofmathematicscan beafussyandfinickyprocess,notmucheasierthanformatting equationswithatypewriter.Theresultssometimesrenderdifferently— ornotatall—invariousWebbrowsers.Thisisasadsituation:Asthe Webhasevolvedintoathrivingmarketplaceandplayground,the scholarlyandscientificcommunitythatcreatedthetechnologyhasnotbeenwellserved.

Theconfusedstateofonlinemathematicalisworrisomeaswellassad.InyearstocometheWebwillsurelybethemostimportantconduit forscientificinformation.Alreadyitisamajorchannelfordistributingpublicationsandpreprintsinmanydisciplines,anditisbecomingavenueforless formaljottingsandconversations—everythingfromhomeworkassignmentstoblogs.Ideally,theWebwouldserveasanextensionoftheblackboard wherepeoplegathertotalkaboutscienceandmathduringcoffeebreaks.Weneedchalkforthatblackboard.

Theproblemisnotoneofsimpleneglect.Overtheyearstherehavebeenmanyearnesteffortstobuildareliablefacilityforwritingandreading mathematicsonline.Thetroubleis,noonesolutionhasyetgainedthekindofwidespreadadoptionthatwouldmakeitastandard,supportedin mainstreamWebserversandbrowsers.Still,there’sroomforhope.Wehavetechnologiesthatwork,ifwecanagreeonhowtousethem.Andaminor changetotheinfrastructureoftheWebmightsmooththewayformoreonlinemath.

Penalty Copy Evenintheworldofinkonpaperpublishing,mathematicalnotationcanbeachallenge.Inthedayswhenprintingwas donewithmetaltype,equationshadtobeassembledbyhand,piecingtogethersymbolsonindividualsliversofmetaland shimmingthemintoposition.Amanuscriptwithalotofmathematicswasknownas“penaltycopy”becauseprinterswould chargeextratocomposeit.

Bythe1970s,metaltypewasgivingwaytoopticalandelectronicdevices,whichprojectedletterformsonto photographicfilm.Inprinciplethenewmachinerymighthaveaidedmathematicaltypesettingbecausecharacterscouldbe placedwithequalfacilityatanypositionandscaledtoanysize.Butthesoftwareforrunningthemachines offerednoeasywaytoexploitthisflexibility.Theprintedproductwasofteninferiortoexpertlysetmetaltype.

EnterDonaldE.KnuthofStanfordUniversity,whowassodiscontentedwiththedeterioratingqualityofmathematical typographythathesetasideotherworkandundertooktobuildhisowntypesettingsystem.TheeventualresultwasTeX,a formattinglanguagenotjustforequationsbutforentiredocuments.LeslieLamport,nowofResearch,soon introducedLaTeX,ahigherlevellanguagebuiltatopKnuth’sTeX.(Thenamesarepronounced tek and lah-tek .)

ThefirstequationinthefirstonthispageisencodedinLaTeXasfollows:

\nabla\times\mathbf{E}= \frac{\partial\mathbf{B}} {\partial{t}}

EachofthetermsbeginningwithabackslashisaLaTeXcommand.Forexample,\frac{}{}constructsafraction,withwhateverappearsinsidethetwo pairsofbracketsformingthenumeratorandthedenominator.Somecommandssimplyinsertasinglecharacter,aswith\times(×),\partial(∂)and \nabla( ∇).The\mathbf{}commandappliesaboldfacetothebracketedtext.

Letmecallattentiontowhatis not presentintheTeXencoding.Therearenoexplicitinstructionsforarrangingthesymbolsonthepage;allofthe geometricdetails,suchascenteringthenumeratoroverthedenominatoranddrawingahorizontalbarbetweenthem,arehandledautomatically.

Alsomissingisanyhintofwhattheequationmeans;TeXisalanguagefordescribingmathematical notation ,notfordoingmathematics.Forexample, “∇×E”ismerelyasequenceofthreesymbols,withnoreferencetotheoperationthosesymbolsdenote—namelythe“curl”ofavectorfield,ameasure ofthefield’srotationorangularmomentum.(ThefirstequationisoneofJamesClerkMaxwell’sfourfamousequationsdescribingtheelectromagnetic field.ThesecondexamplegivestwodefinitionsoftheRiemannzetafunction,revealingaremarkableidentitybetweenaninfinitesumandaninfinite product.)

TeXhastransformedtheprocessofputtingmathematicalideasonpaper.Whatoncerequiredtheservicesofaskilledtypesettercannowbedonebya meremathematician.Amanuscriptcangofromtheauthortoaprintedjournalwithouthumanintervention. americanscientist.org/…/issue.aspx 1/4 02/02/2010 Writing Math on the Web » American S… ButwhatabouttheWeb,whichdidnotyetexistwhenKnuthdevelopedTeX?ModernTeXsystemsproduceoutputintheformofPostscriptorPDFfiles. AlthoughdocumentsintheseformatscanbedistributedovertheWeb—thearXivpreprintserverdispatchesthousandsofthemeveryday—theyarenot quitefirstclasscitizensoftheWebworld.BrowserscannotdisplayPostscriptorPDFfileswithouttheaidofextrasoftware,andmanyreaderschooseto downloadsuchfilesandviewthemofflineorprintthem.Thisworkswellenoughforstaticdocumentssuchasjournalarticles,butit’snotidealforthe morevolatileandinteractiveareasoftheWeb,suchasblogsorthealwaysunderrevisionpagesofWikipedia.Formathematicstobefullyathome online,itneedstobetranslatedintooneoftheWeb’snativelanguages.

Marking It Up Fromtheoutset,theprimarylanguageoftheWebhasbeenHTML(HypertextMarkupLanguage),inwhich“tags”identify variouspartsofadocument’sstructure,suchasheadings,andlists.Initsearliestversions,HTMLwasquite simple;itcouldn’tevendosubscriptsandsuperscripts,sotherewasn’tmuchhopeofdisplayingelaboratemathematical notation.

Severalrevisionslater,HTMLdoeshavesubscriptandsuperscripttags.AndasupplementarylanguagecalledC SS (CascadingStyleSheets)allowsfinercontrolovermanyaspectsoftheappearanceofaWebpage.Modernbrowsersare alsoequippedwithaninterpreterforJavaScript,aprogramminglanguage,sothatWebpagesbecomenotjuststatic documentsbutinteractiveprograms.Still,noneofthesefeaturesdirectlyaddresstheneedsofscientificandmathematical writing.TherearenoHTMLtagsforintegrals,say,orformatrices.

TwomainimpedimentsstandinthewayofpresentingmathematicsontheWeb.Firstisthealphabetproblem: Mathematicianshavecreatedasprawlingzooofnovelsymbolsandembellishedortransformedversionsoffamiliar characters.Thenabla( ∇)thatappearsinMaxwell’sequationsisanotableexample,anditisjoinedbyhundredsofother unusual—∂, ∀, ⊥,∫—nottomentiontheentireGreekalphabetandoccasionalborrowingsfromHebrewandother languages.Thedifficultyofreproducingthesecharactershasbeenalleviatedtosomeextentbytherecentadoptionof ,whichhaveroomforalargercollectionofglyphsthanearlierfontformats.Butit’sstillnottobetakenfor grantedthateveryreaderwillhavethenecessaryfontsinstalled.

Thesecondproblemisoneoflayout.Mathematicalnotationistwodimensional;inordertorepresentamatrix,say,ora summationwithupperandlowerbounds,it’snecessarytospecifytheexact xand ycoordinatesofsymbols.Someelementsofmathematicalnotation, suchasbracketsandtheradicalthatdesignatesasquareroot,varyinshapeandsizeaswellasposition.EncodingsuchgeometricinformationinHTML andCSSisnotimpossible,butitstretchesthetechnologytoitslimit.

EarlyinthehistoryoftheWeb,agroupofmathematiciansandotherinterestedpartiesgatheredtoaddressthisissueinasystematicway.Theresult wasanewmarkuplanguagecalledMathML,whichwasendorsedin1998bytheWorldWideWebConsortium.I’llreturnbelowtothepresentstatusand futureprospectsofMathML,butitwillsufficefornowtonotethatmostofthemathematicalnotationtobefoundontheWebis not encodedinMathML. Insteaditreliesonavarietyofingeniousbutadhocworkarounds.MostoftenthereisaTeXsystemsomewhereinthebackground.

Onecommonstrategyistoconvertamathematicalexpressiontoanimage,or“bitmap.”Insomecaseseachsymbolbecomesaseparateimage,and multipleimageshavetobeassembledandcarefullypositionedtorepresentanequation.Inothercasesanentireequationisencapsulatedinasingle image.Thepracticetakesusbacktotheprealphabetictraditionofpictographicwriting.

Pictogramshaveonebigadvantage:Anysymbol,nomatterhowarcane,canbedisplayedinanybrowser.Buttherearealsodrawbacks.Symbolsinthe imagesmaynotblendwellwithonthepage,andit’shardtocontrolsize,spacingandalignment.Amathintensivedocumentcouldhave hundredsofsmallimages,whichareslowtoloadanddisplay.Imagescannotbecopiedandpastedinthesamewaythattextcan,andtheequations cannoteasilybeeditedorrevised.

Butrelyingonfontstosupplymathematicalsymbolsalsohashazards.TheauthorofaWebpagecannotknowwhatfontsareavailableonthereader’s ,orwhichoftheavailablefontswillbeselectedforanygivensymbol.Thusapagethatlooksfinetotheauthormaydisplayverydifferentlyfor somereaders,orcouldbecompletelyindecipherable.

From Author to Server to Reader FormathematicalprosewritteninTeX,oneWebpublishingstrategyistotranslatetheentiredocumentinadvance, creatinganHTMLversionthatcanthenbedisplayedinabrowserwithoutfurtherspecialhandling.Twoprogramsforthis purposeareLaTeX2HTML(writtenin1995byNikosDrakos,nowofGartnerResearch)andTeX4HT(byEitanM.Gurariof OhioStateUniversity).Inessence,thetranslationprogramsreplacethe“backend”ofaTeXsystemwithsoftwarethat generatesHTMLandbitmapimagesratherthanoutputintendedforprinting.

Atranslatorofthiskindrunsontheauthor’sowncomputer;theHTMLfilesandaccompanyingimagesarethenuploadedto aWebserverfordistribution.ThisworkflowissuitedtolonglivedandseldommodifieddocumentswritteninTeX.The arrangementislessconvenientwithfrequentlyupdatedcontentorWebsitesmaintainedbymultipleauthorsin collaboration.It’salsonotidealfordocumentsthatarewrittenmostlyinHTMLratherthanTeX,withjustoccasional snippetsofmathematics.

Asecondstrategyistoperformthetranslationnotontheauthor’scomputerbutontheWebserver.TwoprogramscreatedbyJohnForkoshadoptthis modus operandi .MathTeXreliesonaseparateTeXsystemtodotheactualrenderingofmathematicalnotation;MimeTeXisselfcontained,withitsown TeXinterpreter.Bothprogramsgeneratebitmapimagesofentireequations.ThisschemeisnotmeantforprocessingcompleteTeXdocuments;itworks withHTMLdocumentsthathaveinterspersedTeXexpressions.ProcessingmathematicsontheserverworksparticularlywellforcollaborativeWebsites, sincethetranslationsoftwarehastobeinstalledononlyonemachine.Wikipediaandmanysimilarsitesrelyonthisapproach.(Themathprocessorfor Wikipediaisaprogramcalledtexvc,whichgeneratesimagesforcomplexexpressionsbutoutputsHTMLforsimpleones.)

Thedrawbacksofserversidesoftwarearethosethatafflictallimagebasedsolutions—clumsytypographyandaprofusionoftinyimagefiles.Butifthe resultfallsshortofelegance,atleastitworksreliablyformostreaders,nomatterwhatbrowsertheychoose.

Let the Browser Do It Aswehavejustseen,translationsoftwarecanrunontheauthor’scomputeroronaserver.Thereisonemorepossibility:performingthetranslationon thereader’scomputer,orwhatisknownas“theclientside.”Underthisplan,TeXorsomeotherencodingofmathematicalcontentiswrittenintotheWeb documentandpassedalongunchangedbytheserver,tobeinterpretedbythebrowser.

OfcoursetheproblemwithsendingTeXtothebrowseristhatthebrowserhasnoideawhattodowithit.Thatissuecanbeaddressedbymeansofa plugin—asoftwarecomponentinstalledwithinthebrowser.

Thebestknownmathematicalpluginistechexplorer,aprograminitiallydevelopedbyRobertS.SutoratIBMandnowmaintainedanddistributedby IntegreTechnicalPublishing.TechexplorercandisplaycompleteLaTeXdocumentsoritcanrenderjustthemathematicalexpressionswithinanHTML americanscientist.org/…/issue.aspx 2/4 02/02/2010 Writing Math on the Web » American S… document.Ineithercasethedisplayofequationsisbasedonfontsratherthanimages.

ThebigadvantageofasoftwarepluginisthattherenderingofmathematicsisliberatedfromalltheconstraintsofHTML.Theplugincansupplyitsown fontsandcanplacecharacterswithasmuchprecisionasthehardwarewillallow.Thebig dis advantageisthatnoneofthismagicworksuntiltheend userdownloadsandinstallstheplugin.ThisbarriertoentrytendstodiscouragecasualvisitorstoaWebsite.Italsocreatesathresholdeffect:Authors hesitatetorelyonthetechnologyuntilenoughreadersadoptit,andviceversa.

Aninterestingalternativetoapluginmightbecalledaslipin.TheideaistobundleupthetranslatorprogramandincludeitaspartoftheWebpage itself.There’snoneedfortheusertoinstallanysoftware;thetranslationprogramrunsautomaticallywhenthepageisloadedintoabrowser.Aprogram ofthiskindcalledjsMathtakesadvantageoftheJavaScriptprogramminglanguagebuiltintoWebbrowsers.

JsMathisthecreationofDavideP.CervoneofUnionCollegeinSchenectady,N.Y.Cervoneundertooktheprojectmainlytomeethisownneeds:He wantedtodistributeclassnotesandhomeworkassignmentsviatheWeb,andnoneoftheavailablesolutionswereentirelysatisfactory.Sohewrotewhat amountstoaTeXinterpreterinaJavaScriptprogram.

JsMathoffersthreestylesofequationrendering.ThemostgracefuldisplayrequiresasetofsixfontsbasedontheComputerModernfacesintroducedby Knuth;thosefontsarefreelyavailable,butjsMathcanaccessthemonlyifthereaderdownloadsandinstallsthem.Afallbackstrategyistoassemble equationsfromindividualcharacterimages.Thefullsetofimages—some78megabytesworth—isstoredontheserver;onlythesubsetneededis downloadedwithanyparticulardocument.ThethirdoptionistobuildequationsfromUnicodefonts.Thereaderselectsoneofthethreerendering methodsthroughapopupcontrolpanel.

SqueezingaTeXinterpreterintoaWebpageisanimpressivefeat,butitaddsconsiderablebulkandcomplexity.Documentswithmanyequationstakea whiletofinishrendering.CervoneisnowlaunchingafollowonprojectcalledMathJAX,supportedbyseveralpublishersofmathematicalsoftware.The aimistomakethesystemmoreflexibleandresponsive.

Whatever Happened to MathML? TypesettingmathematicswithHTMLandbitmapimagesisratherliketurningabicycleintoasailboat.Youcan’thelpadmiringtheaudacityoftheattempt, buttheresultstilldoesn’tseemlikethebestvehicleforthepurpose.WhynotchooseMathML,whichwasdesignedexplicitlyforthistask?

MathMLisavarietyofXML(theeXtensibleMarkupLanguage).ComparedwithTeX,itisamoreformallanguage,anditisalsofarmoreverbose. Considerthesimpleexpression x+1,whichmightbeencodedinTeXas$x+1$.(Thedollarsignsmarkthecontentasmathematicsratherthanordinary text.)InMathMLthesameexpressiontakesthisform:

x + 1 . Eachsymbolistaggedtoindicateitsrole—foranidentifier,foranoperator,foranumber—andtheexpressionasawholeiswrapped inantagtoshowthatitbelongsonasingleline.Capturingthisinformationispotentiallyuseful,sinceidentifiers,operatorsandnumbersare accordeddifferenttypographictreatment.(TeXhastoinfertheroleofeachsymbol,andoccasionallygetsitwrong.)

There’smore.ThestyleofmarkupshownaboveisonlyhalfofMathML.It’scalledthepresentationlanguage;thereisalsoacontentlanguage,which attemptstoexpressmeaningratherthanlayout.Theexpression x+1wouldhavethiscontentmarkup:

x 1 . Heredoesnotrefertothesymbol+buttothemathematicaloperationofaddition.Inthisstructurewegetaglimpseofagrandvision—Web pageswith active mathematicalcontent,wherethemenuofthingsyoumightdowithanexpressionincludesnotjustcopying,pastingandprintingbut alsosolvinganequation,graphingafunctionandfactoringapolynomial.

MathMLhasanenthusiasticcommunityofdevelopersandusers.ThereiscommercialsoftwareforwritingandeditingMathMLdocuments(notablyfrom DesignScienceandIntegreTechnicalPublishing)aswellasanoncommercialtranslationprogramcalledASCIIMathML,createdbyPeterJipsenof ChapmanUniversityinOrange,Calif.Severallargescholarlypublishers,includingtheAmericanInstituteofPhysics,havebasedtheiroperationsonXML andMathML;sohastheU.S.PatentOffice.

OntheWeb,however,MathMLhasnotexactlysweptawaythecompetition.Onereasonislackofsupportinbrowsers.Intheearlyyears,theonlyway toreadMathMLWebdocumentswaswithpluginsoftware.Morerecentlyafewbrowsers—notablythoseoftheMozillafamily,suchasFirefox—have gainednativesupportforMathML.ButthereisstillconfusionoverhowMathMLcontentshouldbeembeddedinanHTMLdocument.Moreover,MathMLhas notsolvedthefontsproblem;readersarestillresponsibleforinstallingappropriatefonts.

AnotherfactorinhibitingthespreadofMathMLissimplythatTeXisdeeplyentrenched,particularlyinphysics,mathematicsandcomputerscience.Ifyou liveinaTeXcentricuniverse—IhaveafriendwhoevenwriteslovelettersinTeX—it’shardtoseeanybenefitofanewandverydifferentlanguage.

Embedded Assets Howwillitallturnout?WillWebsitesofthefuturebechockfullofMathML,orwillTeXandHTMLcontinuetoprevail?Orwillsomethingelsealtogether comealong?

Ihavenoanswersforthesequestions,butIwanttosuggestanadjustmentinthewaytheWebworks—asmallchangethatcouldimprove any strategy fordisplayingmathematicalnotation.Ithastodowithwherefontscomefrom.

Underthepresentrules,aWebauthorcanrequestaparticularfont,andthereader’sbrowserwillhonortherequestifthefontisavailableontheclient machine.Ifnot,somedefaultfontissubstituted.Wouldn’titbemorehelpfuliftheauthorcouldsupplythemissingfont,eitherbyembeddingitdirectlyin thepageorbyreferringthebrowsertoasitewherethefontisavailable?Givensuchamechanism,anyfontbasedsystemforpresentingmathematics couldensurethatalltheneededsymbolsarereadyathand.

Thisisnotanewidea.Aproposalfor“Webfonts”wasincludedinadraftCSSstandardin1998,andtheideawasevenimplementedinafewbrowsers, includingMicrosoft’sInternetExplorer.Buttheproposalnevercaughton,anditwasremovedfromlaterversionsofthestandard.Recently,HåkonWium LieofSoftwarehascalledforrenewedconsiderationoftheidea.

Muchofthediscussioncentersonlegalquestionsofinteresttotheownersofcopyrights.Thisdoesn’tseemlikeaninsuperableproblem.Itwas solvedinthecaseofPDFfiles,whichdoembedfonts.Evenifproprietarytypefaceswereofflimits,thereareenoughfreelyavailablefonts—includingall thosecommonlyusedwithTeX—tomaketheprospectattractive.

Forachangeofthiskindtohaveany,allthebrowsermakerswouldhavetoadoptit.Thosearethesamepeoplewhohavesofarresisted americanscientist.org/…/issue.aspx 3/4 02/02/2010 Writing Math on the Web » American S… Forachangeofthiskindtohaveanyimpact,allthebrowsermakerswouldhavetoadoptit.Thosearethesamepeoplewhohavesofarresisted implementingMathML.Whywouldtheytreatthefontproposalanydifferently?Ithinkthereisreasonforoptimismonthisscore,notbecausethe mathematicalcommunityhasmuchcloutbutbecauseembeddedfontswouldbeofvaluetootherconstituencies.Advertisers,inparticular,wouldbe pleasedtogaingreatercontroloverWebtypography.

Meanwhile,asIfinishpreparingthisforthepress,Ialsofacethetaskofhelpingtogetmyownpenaltycopyreadyforpublicationonthe American Scientist Website.ThoseirksomeequationsandcuriouscharactersI’vebeenwritingaboutwillsomehowhavetobemadeWebfriendly.I don’tknowexactlyhowwe’regoingtodothat,butIsuspectsomebicyclesaregoingtobeoutfittedwithspinnakersandjibs.

©BrianHayes

Bibliography

Beeton,Barbara,AsmusFreytagandMurraySargentIII.2008.Unicodesupportformathematics.UnicodeTechnicalReportNo.25. http://www.unicode.org/reports/tr25 Bos,Bert.2008.Forandagainststandardizingfontembedding. http://www.w3.org/Fonts/Misc/eotreport2008 Cervone,DavideP.Website.jsMath:AmethodofincludingmathematicsinWebpages. http://www.math.union.edu/~dpvc/jsMath Goossens,Michel,andSebastianRahtzwithEitanGurari,RossMooreandRobertSutor.1999. The LaTeX Web Companion: Integrating TeX, HTML, and XML .Reading,Mass.:AddisonWesleyLongman. Jipsen,Peter.Website.TranslatingASCIImathnotationtoPresentationMathML. http://www1.chapman.edu/~jipsen/asciimath.html Knuth,DonaldE.1984. The TEXbook .IllustrationsbyDuaneBibby.Reading,Mass.:AddisonWesleyPub.Co. Miner,Robert,andJeffSchaefer.1998.AgentleintroductiontoMathML. http://www.dessci.com/en/reference/mathml/default.htm Miner,Robert.2005.TheimportanceofMathMLtomathematicscommunication. Notices of the AMS 52:532–538. Soiffer,Neil.1997.MathML:AproposalforrepresentingmathematicsinHTML. SIGSAM Bulletin 31(3):44. Wright,FrancisJ.2000.InteractivemathematicsviatheWebusingMathML. SIGSAM Bulletin 34(2):49–57.

Youcanfindthisonlineathttp://www.americanscientist.org/issues/num2/2009/3/writingmathontheweb/2 ©SigmaXi,TheScientificResearchSociety

americanscientist.org/…/issue.aspx 4/4