Towards Making Mathematics a First Class Citizen in General Screen Readers
Volker Sorge ∗
School of Computer Science
The University of Birmingham, UK

Charles Chen, T.V. Raman, David Tseng
Google, Inc.
Mountain View, CA, USA
{clchen|raman|dtseng}@google.com

∗ This work was done while the author spent a sabbatical at Google, Inc., Mountain View, CA, USA.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
W4A 2014 - Technical, April 7-9, 2014, Seoul, Korea. Co-Located with the 23rd International World Wide Web Conference.
Copyright 2014 ACM 978-1-4503-2651-3 ...$15.00.

ABSTRACT
The text to speech translation of mathematical expressions has always been a challenging problem, which has not diminished as more and more content moves to the web. In this paper we present our efforts to make the speech translation of mathematical formulas a first class citizen in ChromeVox, a general screen reader for the Chrome browser. We exploit ChromeVox's ability to handle alternative representations of DOM elements to translate mathematical content given in a variety of web formats into uniform utterances. We present a format of flexible and adaptable speech rules that support the customization of aural rendering of mathematics and introduce a semantically enriched representation of expressions that allows for a more natural reading experience. To further aid understanding of the math we exploit ChromeVox's idea of letting users engage with content on different levels of granularity to enable interactive exploration of complex mathematical formulas.

Categories and Subject Descriptors
H.5.2 [User Interfaces]: Voice I/O; H.5.4 [Hypertext]: Navigation, User Issues

Keywords
Screen reader, Mathematics, ChromeVox

1. INTRODUCTION
Ensuring accessibility to scientific material has always been a challenging task and can be considered a major obstacle for full inclusiveness in education in the fields of science, technology, engineering and mathematics (the traditional STEM subjects), in particular at the late secondary and tertiary stage. As teaching moves more and more towards the provision of online courses, we are faced with rapidly changing content employing media ranging from traditional articles containing mathematical formulas and scientific diagrams to highly interactive web pages often exploiting novel media formats such as dynamic diagrams or simulations, which makes traditional methods of making content accessible all but obsolete. To avoid the risk that modern technology might create an even higher obstacle for inclusive education, it is important to ensure accessibility of scientific web content without the need for expensive, specialist software.

In this paper we concentrate on making mathematics accessible for visually impaired learners in the general screen reader ChromeVox [10]. Mathematics has always been a challenging problem as formulas can be arbitrarily complex in the sense that there is no limit on the nesting depth of sub-expressions or on how many layers of parentheses are used. Moreover, a large number of unusual symbols can occur that might have different meanings in different mathematical areas. An expression such as this one from [13]

\[
D_x^n w = \sum_{0 \le j \le n} \;
  \sum_{\underbrace{\substack{k_1 + k_2 + \cdots + k_n = j \\ k_1 + 2k_2 + \cdots + n k_n = n \\ k_1, k_2, \ldots, k_n \ge 0}}_{\text{lower constraint 1}}}
  D_u^j w \,
  \frac{\overbrace{n!\,(D_x^1 u)^{k_1} \cdots (D_x^n u)^{k_n}}^{\text{numerator 1}}}
       {\underbrace{k_1!\,(1!)^{k_1} \cdots k_n!\,(n!)^{k_n}}_{\text{denominator 1}}}
\]

can already be difficult to parse for an expert human reader with perfect eyesight. But conveying all nuances of the notation in speech automatically is considerably more difficult. Even well known examples like the quadratic formula can be non-trivial to translate into speech.

\[
x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \tag{1}
\]

Consequently, the work on mathematical screen reading has explored adaptive techniques to customize output with respect to mathematical domains and personal preferences [15, 1], the use of prosody and pausing to convey meaning [15, 2] as well as ways to enable better understanding via multi-modal presentation of equations [9]. But generally these efforts have been restricted to specialist systems [15, 18, 5, 21], which either are not web-ready solutions or are restricted to single markup formats and offer limited customizability, which has to be done by users rather than by the providers of web content.

The main contribution of our work is to embed many of these techniques into a general, open source screen reader that enables math accessibility over a wide range of platforms in a widely used web browser. We introduce the ChromeVox screen reader in Sec. 3. Moreover, our approach can deal uniformly with math on the web in a variety of formats (presented in Sec. 2) and in particular makes math accessible even when it is given only as images with annotations. Furthermore, we not only enable users to customize the reading experience, but also provide an API that allows authors of web content to embed specialist reading rules, thus allowing the very people who know best how the content they publish should be spoken to make adjustments to visiting screen reader clients. This is achieved by a flexible speech rule engine that makes it possible to customize the reading experience along several axes (Sec. 4). It also offers users a way to interactively explore mathematical expressions (Sec. 5) as well as the possibility to employ more effective representations in lieu of a web element. We exploit the latter by introducing a new semantic representation in Sec. 6 that leads to more natural pronunciation of math formulas.

2. MATHEMATICS ON THE WEB
Mathematical formulas on the web can be represented in their own specialized markup language, MathML [3]. But although MathML is officially part of the HTML5 standard [11], not all major browsers implement MathML rendering, hence support for displaying formulas included in pure MathML on web pages is sketchy. Consequently, mathematics on the web today comes in three predominant flavors:

1. Pure MathML markup: This relies on the user viewing the page with a MathML capable browser.

2. Rendered with MathJax: The web page author ensures that mathematical content is rendered independently of the browser it is viewed in, by including the third party MathJax library [4] in the page. Formulas can be given in several different markup languages and MathJax renders them client-side.

3. Pre-rendered images: Content is ensured to display correctly by including images of formulas in the web page. The markup from which the content was originally rendered is often given in an attribute of the image tag.

In order to enable access to the majority of mathematics that can be found on the web today, one has to provide text-to-speech support for all of the above formats, which we shall briefly sketch in the remainder of this section.

2.1 MathML
MathML is a specialized markup language that was developed with the express purpose of representing mathematics. Ordinarily it comes in two flavors: presentation MathML and content MathML. The former is a markup language geared towards adequately displaying mathematical formulas, thus playing a role for web documents similar to that of LaTeX [14] for printed documents. In contrast, content MathML aims to serve also as a meaningful exchange format between mathematical software systems by providing markup that allows to include semantic meaning of mathe- [...]

  <math xmlns="http://www.w3.org/1998/Math/MathML">
    <mstyle displaystyle="true">
      <mi>x</mi>
      <mo>=</mo>
      <mfrac>
        <mrow>
          <mo>−</mo>
          <mi>b</mi>
          <mo>±</mo>
          <msqrt>
            <msup>
              <mi>b</mi>
              <mn>2</mn>
            </msup>
            <mo>−</mo>
            <mn>4</mn>
            <mi>a</mi>
            <mi>c</mi>
          </msqrt>
        </mrow>
        <mrow>
          <mn>2</mn>
          <mi>a</mi>
        </mrow>
      </mfrac>
    </mstyle>
  </math>

Figure 1: MathML for the quadratic formula (1).

[...] sub- and superscripts, fractions, square roots etc. In addition it provides markup to define mathematics specific styles or spacing, as well as specialized attributes such as fonts, accents, etc. Fig. 1 presents our example of the quadratic formula (1) in MathML markup. Here the tags mi, mn, and mo mark up identifiers, numbers, and operators, respectively, while the layout elements fraction, square root and superscript are enclosed in the elements mfrac, msqrt, and msup. The mrow tag allows the horizontal combination of a string of elements, in case they need to be combined into a single node, as in the case of the two arguments for the fraction tag mfrac.

MathML is part of the specification for the HTML5 standard [11], where it lives in its own name space. By extension it is also part of the ePub3 standard [8] and consequently one can anticipate that future implementations of these standards will include native rendering of MathML. However, currently only a few browsers and ePub readers support MathML, and rendering is achieved either by specialist browser plugins or by third party libraries.

2.2 MathJax Rendering
MathJax is a JavaScript display engine [4] that consistently renders mathematical expressions in all browsers. It handles a number of input formats and translates them into simple HTML markup that visually renders a math expression in a browser, regardless of whether it supports MathML. Thus the basic goal is to shift control over whether and how formulas are displayed to the content author, by allowing them to include JavaScript in web pages that calls MathJax via a content distribution network to enable client side rendering of math expressions.
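To make the element roles described in Sec. 2.1 more concrete, the following is a minimal Python sketch that walks the presentation MathML of the quadratic formula (Fig. 1) and produces a naive utterance. It is not ChromeVox's speech rule engine: the function name speak, the SPOKEN_SYMBOLS table and the chosen wording are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "{http://www.w3.org/1998/Math/MathML}"

# Hypothetical symbol table for this sketch; not ChromeVox's actual rules.
SPOKEN_SYMBOLS = {"=": "equals", "\u2212": "minus", "\u00b1": "plus or minus"}

# Presentation MathML of the quadratic formula, as in Fig. 1.
QUADRATIC = """
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <mstyle displaystyle="true">
    <mi>x</mi><mo>=</mo>
    <mfrac>
      <mrow>
        <mo>\u2212</mo><mi>b</mi><mo>\u00b1</mo>
        <msqrt>
          <msup><mi>b</mi><mn>2</mn></msup>
          <mo>\u2212</mo><mn>4</mn><mi>a</mi><mi>c</mi>
        </msqrt>
      </mrow>
      <mrow><mn>2</mn><mi>a</mi></mrow>
    </mfrac>
  </mstyle>
</math>
"""


def speak(node):
    """Recursively turn a presentation MathML node into a naive utterance."""
    tag = node.tag.replace(MATHML_NS, "")
    children = [speak(child) for child in node]
    if tag in ("mi", "mn"):  # identifiers and numbers are read as-is
        return (node.text or "").strip()
    if tag == "mo":  # operators are looked up in the symbol table
        symbol = (node.text or "").strip()
        return SPOKEN_SYMBOLS.get(symbol, symbol)
    if tag == "mfrac":  # exactly two children: numerator, denominator
        return f"fraction {children[0]} over {children[1]} end fraction"
    if tag == "msqrt":
        return "square root of " + " ".join(children) + " end root"
    if tag == "msup":  # two children: base, exponent
        return f"{children[0]} to the power of {children[1]}"
    # math, mstyle, mrow and unknown tags: read children left to right
    return " ".join(children)


utterance = speak(ET.fromstring(QUADRATIC))
print(utterance)
```

The sketch reads every − as "minus", including the leading sign of the numerator, and it hard-codes one wording per element. Both limitations hint at why fixed, purely syntactic translation tends to sound unnatural and why customizable rules and richer representations are of interest.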