
End-to-end Solution for Accessible Chemical Diagrams

Volker Sorge, Mark Lee
School of Computer Science
University of Birmingham, UK
{V.Sorge|M.G.Lee}@cs.bham.ac.uk

Sandy Wilkinson
School of Education
University of Birmingham, UK
[email protected]

ABSTRACT

Chemical diagrams are an important means of conveying information in chemistry and biosciences to students, starting as early as secondary school. But even in electronic teaching material, diagrams are commonly given as bitmap graphics, leaving them inaccessible for visually impaired learners. We present an end-to-end solution to making these diagrams Web accessible, by employing image analysis solutions to recognise and semantically analyse diagrams, and by regenerating them in a format that makes them amenable to assistive technology. We provide software tools that allow readers to interactively engage with diagrams by exploring them step-wise and on different layers, enabling aural rendering of diagrams and their individual components together with highlighting and magnification to assist readers with low vision or learning difficulties. Our technology builds on open standards, supporting a number of computing platforms, browsers, and screen readers, and is extensible to diagrams in other STEM subjects.

Keywords

Accessible Diagrams, Image Transformation, Chemistry

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
W4A 2015, May 18 - 20, 2015, Florence, Italy
Copyright is held by the owner/author(s). Publication rights licensed to ACM.
ACM 978-1-4503-3342-9/15/05 ...$15.00
http://dx.doi.org/10.1145/2745555.2746667.

1. INTRODUCTION

Visually impaired learners represent a sizeable minority of users of scientific material. For example, it is estimated that there are 25,000 visually impaired children and young adults in England and Wales who require specialist education support. More than 60% of this group are educated in mainstream schools and are often without specialist technical equipment, such as high definition magnification tools and braille embossers.

The majority of users rely on software-based assistive technology such as screen readers and magnifiers. While text can, to a certain degree, be handled using existing screen readers and related tools, diagrams are often completely inaccessible: even if alternative text is provided, it in no way compares with the richness of information provided by even a relatively simple diagram. Since the majority of images and diagrams on the web use raster-based image formats (e.g. gif, png, and jpeg), magnification tools also struggle with diagrams, because magnification does not proportionally increase resolution. The resulting loss of image quality in practice leaves most diagrams completely inaccessible to visually impaired users.

There have been a number of approaches to make scientific diagrams accessible. In particular, for chemical diagrams a number of tools have been built to support visually impaired users in editing, reading and exploring molecules (see next section for details). The main problem with these approaches is that they require both authors and readers to use specialist software to create and read diagrams, possibly also restricting them to particular platforms only, which reduces their effectiveness in practice.

We solve this problem for the particular case of chemical diagrams by creating a workflow that bridges the gap from images to accessible diagrams in a way that neither relies on authors to produce images in some special format nor requires readers to familiarise themselves with a new bespoke tool. Instead, we apply an image analysis system to bitmap images to extract chemical meaning from diagrams. We then transform the images into Scalable Vector Graphics (SVG) and use cheminformatics tools to create an underlying semantic representation in a Shadow DOM, which enables the automatic generation of meaningful descriptions for components of the diagram. A JavaScript tool allows the user to “step through” the image via speech with synchronised highlighting and/or magnification in an ordinary web-browser using the semantic markup together with WAI-ARIA elements. Our experiments and initial user tests have indeed shown that our approach can cater to a variety of different assistive technology platforms and users.

We therefore see the main contribution of our work in providing the first end-to-end solution to creating accessible diagrams, starting with the analysis of bitmap images and generating explorable diagrams that seamlessly integrate with assistive technology solutions already familiar to a reader.

This paper is structured as follows: We discuss related work in the next section before giving the necessary background on chemical diagrams and different forms of their representation in Sec. 3. We then present an overview of our process that turns diagrams from bitmap format into fully accessible diagrams in scalable vector graphics (SVG) in Sec. 4. Sections 5–8 then describe the single steps in more detail, in particular, the image analysis, the generation of annotated SVG diagrams, their semantic enrichment and the support for interactive engagement with diagrams via a browser front end. We illustrate the latter with an extensive example in Sec. 9 before concluding by summarising results of our experiments and user studies in Sec. 10.

2. RELATED WORK

There have been a number of approaches to make scientific diagrams accessible, using a variety of means such as sonification, touch exploration and haptic feedback. [3] presents general guidelines for the presentation of tables and line graphs using sonification and speech, while [21] presents a multimodal tool that allows blind users to create and explore line graphs and via the Web using haptic feedback and sonification. This idea is extended to three-dimensional representations of graphs in [5, 4], which presents an authoring and teaching tool for the creation of accessible graphs and their exploration via highlighting and speech using touch screens. While these approaches are suitable for relatively simple structures with homogeneous layouts, they cannot be easily adapted for semantically richer graphs such as chemical diagrams. Consequently a number of tools and environments have been built to support visually impaired users in editing, reading and exploring molecules.

[19] describe various projects involving visually impaired chemistry students entering chemical information as ASCII-based SMILES (simplified molecular input line entry specification) markup, which is then translated via Open Babel for use in commercial screen readers, embossers and 3D printers. The use of SMILES allows visually impaired students to avoid the need for standard graphical user interfaces for drawing chemical molecules but does require prior knowledge of basic chemistry and of the SMILES markup language.

[1] presents the Kekulé system that reads molecular structures from Chemical Markup Language (CML) files, which are used to generate graph-based structures that can be explored by a combination of keyboard input and speech synthesiser output. A key aspect of the system is that it infers and makes explicit chemical structure which is only implicit in the original CML markup. This allows the user to fully explore any molecule at different levels, for instance by moving between adjacent atoms or higher level features.

[12, 7] present NavMol, a bespoke system that allows visually impaired users to edit and navigate molecular structures. NavMol is based on the open-source Chemistry Development Kit (CDK) and aims to present diagrammatic information via speech using an intuitive scheme based on clock face coordinates. In addition to CDK, NavMol uses both FreeTTS and eSpeak speech synthesis tools and can be integrated with standard screen readers.

A drawback of all these approaches is that they need significant investment from authors to prepare diagrams, as well as forcing visually impaired readers to use yet another specialist system they need to learn. Users of assistive technologies often have well-established and individual preferences for particular systems, setups and usage. Therefore any new technological solution should be transparent and not force the user to radically alter their preferred environment.

In our approach we therefore eliminate the need for re-educating authors by introducing image analysis to automatically generate an SVG diagram from given bitmap images. Similar to [7], we use algorithms from the Chemistry Development Kit (CDK) [17, 20] to identify important components, such as rings or functional groups in a molecule. However, we transform these results into an XML structure that connects this information to the SVG diagram. This allows us to use standard web technology, like JavaScript and WAI-ARIA, to implement reader interaction with the diagram, which is platform, browser and screen reader independent, thus offering users a barrier-free reading experience.

3. COMMUNICATING MOLECULES

In this section we present some background on how chemical diagrams are commonly communicated in teaching and scientific literature. There is naturally a wide gap between how molecules are taught to students in secondary school and how they are communicated to experienced chemists. From the point of view of creating accessible diagrams automatically, it is of course not known in advance who the target audience will be and how much detail a reader might need to comprehend a particular diagram. Therefore, it is necessary to make diagrams accessible on multiple levels of granularity as well as to cater for readers of different expertise by producing several alternative audio descriptions.

We introduce the different types of diagrammatic representations for molecules we are working with as well as the necessary concepts that we use to achieve our goal of adequately describing molecules for different audiences.

3.1 Forms of Diagrams

There are a number of different ways in which molecules can be represented diagrammatically, varying mainly in the explicitness with which the chemistry is presented. The particular choice of diagram often depends on the target audience of a document. While advanced scientific textbooks and publications usually go for the most abstract or most space-saving representation, introductory texts in secondary schools use the most explicit representations possible. However, in general, secondary school teaching material presents all different types of possible representations. Consequently, it is important that a system like ours can work with the whole variety of diagrams.

Figure 1 shows the three main categories of diagrams, each representing the molecule commonly known as Aspirin. Diagrams given in displayed formula are the most detailed ones, where all atoms are given explicitly by their chemical symbol (i.e., C, H, O standing for carbon, hydrogen and oxygen, respectively) and bonds are given by lines between atoms, in our case either depicting single or double bonds (for the whole variety of bonds that can occur in diagrams, see Figure 5). The arrangement of characters and lines indicates the two dimensional (in some cases three dimensional) structure of a molecule. For example, the six carbon atoms in the lower left of the formula form a so-called ring.

Skeletal formulas (Figure 1(b)) simplify the representation considerably, by omitting all explicit hydrogen atoms, displaying neither their names nor their connecting bonds, as well as by not writing carbon atoms but only indicating them as unnamed end points of bonds.

Structural formulas (Figure 1(c)) collapse the geometric structure of the molecule even further by replacing the molecule or parts thereof with a sequence of atom identifiers, from which one can infer the two dimensional layout. For instance, OCOCH3 stands for an oxygen bonded to a carbon, which in turn is bonded to an oxygen and a carbon with three hydrogen atoms. Furthermore the benzene ring is here depicted in a compressed manner, omitting all double bonds.

Figure 1: Different representations of the Aspirin molecule. (a) Displayed formula. (b) Skeletal formula. (c) Structural formula.
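To make the reading of a structural formula such as OCOCH3 concrete, the following Python sketch splits a condensed string into a sequence of atom identifiers with their counts. It is only an illustration under stated assumptions: the function name and the regular expression are hypothetical and not part of MolRec or CDK, and the sketch ignores brackets, charges and ring notation.

```python
import re

# Hypothetical helper, not part of MolRec or CDK: split a condensed
# structural formula into (element symbol, count) pairs.
# An element symbol is one capital letter optionally followed by one
# lower-case letter; a missing count means 1.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def tokenise_structural_formula(formula):
    tokens = []
    pos = 0
    while pos < len(formula):
        match = _TOKEN.match(formula, pos)
        if match is None:
            raise ValueError(
                f"unexpected character at index {pos}: {formula[pos]!r}"
            )
        symbol, count = match.groups()
        tokens.append((symbol, int(count) if count else 1))
        pos = match.end()
    return tokens


# Oxygen, carbon, oxygen, then a carbon followed by three hydrogens.
print(tokenise_structural_formula("OCOCH3"))
```

Because the pattern is greedy, a two-letter symbol such as Cl is read as one element rather than as carbon followed by an unknown character.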

Commonly, we find displayed formulas mainly in secondary school textbooks, while in undergraduate textbooks and scientific literature, one usually finds diagrams as a mixture of skeletal and structural formulas, depending on aspects such as what parts of the molecule are important in the context, space considerations in the document, and author’s choice.

3.2 Describing Molecules

A major part of effectively communicating chemical structures non-visually is to name the molecule and its interesting components. The simplest way to summarise a molecule is to use its molecular formula, which simply sums up all the occurring atoms. For example, the molecular formula for Aspirin is C9H8O4. Although this is trivial to compute, it is generally useless as a means of communicating structure as the formula is highly ambiguous.

The IUPAC nomenclature [8] provides a precise naming convention for uniquely identifying molecules, which also fully reflects the structure of the molecule. For example, Aspirin is represented as 2-(Acetyloxy)benzoic acid. While these names describe the components and how they are combined, one needs to have the required expertise to understand them in the first place. Thus this description is not suitable for teaching, and even trained experts often forgo using IUPAC names as they can become quite cumbersome for complex molecules. In many cases compounds are referred to by common names, such as Aspirin, which give few clues as to their composition or structure. Common names might not always exist and if they do, they are often not unique. For instance, an alternative name for Aspirin is Acetylsalicylic acid.

Using common names still requires that a reader is familiar with these molecules. Moreover, it is not useful if we want to communicate a diagram for an unusual molecule. One easy and trivial solution is to describe every single atom and bond in the molecule. However, for large molecules this can be very confusing without visual aids. Consequently, molecules are often described on an intermediate level in terms of the structural features they are composed of. We refer to components of a molecule as blocks (as they are effectively the building blocks of organic molecules), and they can in turn be broken down and described either by their common names, by structural formulas (e.g., as for Aspirin in Figure 1(c)), or in full detail.

Identifying and naming these components in a diagram is thus a major pre-requisite for effectively communicating chemical structures.

3.3 Blocks

We give a brief overview of blocks for molecules in organic chemistry, which in our findings are the prevalent molecule diagrams occurring in secondary school textbooks. Blocks of organic molecules are primarily identified by particular groups of carbon atoms.

Ring Systems: As an example we have already seen the benzene ring in the Aspirin molecule in Figure 1. A ring of that nature is called isolated, but we can also have so-called fused ring systems, such as the one displayed in Figure 2, where three benzene rings form a single system. The three benzene rings are the sub-rings of the fused system and are sometimes called its essential rings.

In addition, there can be rings of different size or number of constituent carbon atoms and with so-called internal substitutions, where one or several carbon atoms are replaced by other atoms such as N and O.

Aliphatic Chains: A second variant of carbon groups are simple chains of consecutive carbon atoms, called aliphatic chains. They can be of any length and form the most basic of organic molecules. Carbon atoms in aliphatic chains are mostly connected with single bonds, but there can also be double or triple bonds between two carbon atoms. Figure 3 shows a molecule with an aliphatic chain of length 4 with IUPAC name 2-bromo-1-chlorobutane.

External Substitutions by single atom: In the basic form of rings and aliphatic chains all occurring carbon atoms are either bonded to other carbon atoms or saturated with hydrogen atoms. However, some hydrogen atoms can be replaced by other elements, giving a molecule different characteristics. These are called external substitutions, or just substitutions, and are named according to their position in the chain or ring. E.g., the aliphatic chain in Figure 3 has substitutions at positions 1 and 2 by elements Cl and Br, respectively.

Functional Groups: Substitutions are often not just single atoms but more complex blocks that are made up of various groups of different atoms, which are called functional groups. They can also form a bridge between carbon chains or rings. Many important functional groups have common names that identify them easily. For example, Aspirin has two functional groups attached to the benzene ring, given with their structural formulas in Figure 1(c); these are the carboxylic acid and ester groups shown in Figure 4.

As with single atom substitutions, locations of functional groups on a chain or ring are communicated with their respective positions. See Sec. 7.2 for details on how these are computed.

Figure 2: Fused Ring System
Figure 3: Aliphatic chain of length 4.
Figure 4: Functional groups of Aspirin.
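The claim in Sec. 3.2 that a molecular formula is trivial to compute but ambiguous can be illustrated with a short Python sketch. The input format, a list of heavy atoms each with its number of attached hydrogens, and the function name are assumptions made for this example only; the output uses the conventional Hill order (carbon, then hydrogen, then all other elements alphabetically).

```python
from collections import Counter


def molecular_formula(atoms):
    """atoms: iterable of (element symbol, number of attached hydrogens)."""
    counts = Counter()
    for symbol, hydrogens in atoms:
        counts[symbol] += 1
        counts["H"] += hydrogens
    present = [s for s in counts if counts[s] > 0]
    if "C" in present:
        # Hill order: carbon, hydrogen, then the rest alphabetically.
        order = ["C"] + (["H"] if "H" in present else [])
        order += sorted(s for s in present if s not in ("C", "H"))
    else:
        order = sorted(present)
    return "".join(s + (str(counts[s]) if counts[s] > 1 else "") for s in order)


# Aspirin, written as heavy atoms with their hydrogen counts.
aspirin = (
    [("C", 1)] * 4 + [("C", 0)] * 2               # benzene ring carbons
    + [("C", 0), ("O", 0), ("O", 1)]              # carboxylic acid group
    + [("O", 0), ("C", 0), ("O", 0), ("C", 3)]    # ester group
)
# Two different molecules that share one molecular formula.
ethanol = [("C", 3), ("C", 2), ("O", 1)]
dimethyl_ether = [("C", 3), ("O", 0), ("C", 3)]

print(molecular_formula(aspirin))
print(molecular_formula(ethanol), molecular_formula(dimethyl_ether))
```

Ethanol and dimethyl ether have different structures but yield the same string, which is the ambiguity that makes the molecular formula unsuitable for communicating structure.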

4. OVERVIEW OF THE PROCESS

The goal of our work is to take bitmap images from a web site and replace them with interactive diagrams that support aural rendering and visual highlighting and magnification. To this end we provide a software pipeline that can roughly be divided into the following major steps:

(i) Image Analysis: Takes the original bitmap image as input, recognises the molecule diagram and returns its representation in the Chemical Markup Language (CML) [9], an XML format for encoding molecules.
    Input: Bitmap image file (TIF, PNG, . . . ).
    Output: CML file with molecule representation.

(ii) Annotated SVG Generation: Translates the CML information into a Scalable Vector Graphic (SVG), taking care that image components are grouped and annotated with respect to their chemical meaning.
    Input: CML file from step (i).
    Output: Annotated SVG.

(iii) Semantic Enrichment: Analyses the initial CML information further to recognise components and layers of abstraction, and to provide speech annotation, leading to a semantically enriched CML format.
    Input: CML file from step (i).
    Output: Enriched CML with semantic annotations.

(iv) Browser Front end: An AJAX service imports enriched CML and annotated SVG into web documents, and provides functionality to interactively engage with the diagram by navigating its layers, supporting aural rendering, highlighting and magnification of components.
    Input: Annotated SVG and enriched CML from steps (ii) and (iii), respectively.

5. IMAGE ANALYSIS

The first step in our process is the image analysis of a given molecule diagram. This is based on our previous work on chemical diagram recognition, which is implemented in the MolRec system [15, 14]. The image analysis itself is divided into two steps: (1) image segmentation, using vectorisation, and (2) diagram recognition, via a rule based approach.

5.1 Image Segmentation

In an initial step a given diagram image is vectorised in order to segment it into the main constituents making up the diagram. We briefly summarise the main steps of the vectorisation, which are binarisation of the image, recognition of characters and separation of elements representing bonds. This results in a set of distinct primitives that will be exploited in the actual diagram recognition phase.

Binarisation: As images can be of any format (i.e., colour or greyscale) and origin (scanned or digitally drawn), in a first step we apply some noise reduction, followed by image binarisation using Otsu’s method [11]. This yields a binary image, that is, one containing only black and white pixels, from which all the connected black components are extracted and labelled.

Optical Character Recognition: In the next step, optical character recognition is performed by extracting a set of structural features from each connected component in the image and comparing the resulting feature vectors to a pre-computed bag of features for Roman letters, applying a nearest neighbour classification based on a Euclidean metric. All connected components recognised as characters are removed from the image. Some contextual information is used to disambiguate difficult cases. For example, the lower case, sans serif letter “l” is often visually indistinguishable from short line segments in a molecule diagram. However, in molecule diagrams it usually does not appear except beside other letters (usually after a capital “C”, to denote a chlorine atom). The result of this step is a list of recognised characters together with their location information in the image as well as a skeleton molecule with all detected characters removed.

Separation of Bond Elements: At this point MolRec produces a new copy of the skeleton diagram and applies a thinning algorithm to connected components to thin them to a single pixel width. Using the thinned lines as a guide, we walk the corresponding paths in the original image to determine the average line width by finding the largest disk that fits wholly within the stroke width of the line. At the same time, we build a polyline representation of the thinned lines. At every junction where three or more polylines meet, we split them into separate polylines. Closed polylines, as occurring in rings, are also identified.

Because of scanning, discretisation and thinning artefacts, these polylines are not, as we would like, smooth idealised representations of the lines in the original diagram. Therefore we clean them up by applying the Douglas-Peucker line simplification algorithm [6], where we set the simplification threshold to between 1 and 2 average line widths as found above. This is sufficient to smooth out the polylines, removing almost all artefacts, without losing the significant corners in the lines in the diagram. Basing the threshold on the average line width allows the algorithm to adapt to the different line styles that appear in molecule diagrams.

In addition to detecting and separating polylines we also detect circles as well as lines with arrow heads and solid triangles. The latter two are then annotated with their respective direction. Thus the result of this step is a set of geometric primitives that compose a skeleton molecule, where primitives are either lines, circles, solid triangles or arrows, together with their geometric location in the original image.

5.2 Diagram Recognition

Note that up to now, the image analysis process is fully generic, that is, it is independent of the actual type of diagrams we are analysing and is only limited with respect to the type of geometric primitives we are extracting. Only the actual recognition phase is based on knowledge about molecule diagrams to distinguish bonds and atoms in the diagram. This primary recognition task is performed by a rule engine, in which largely disjoint rules are repeatedly applied to the initial set of geometric primitives, rewriting it into a graph representation of the given molecule diagram. The final graph structure serves as a basis from which efficient electronic representation formats can be generated.

The rule engine essentially works with the geometric primitives resulting from the vectorisation. In particular it uses character groups from the OCR step, as well as line segments, circles, solid triangles and arrows from the bond separation. The goal of the rule engine is to rewrite the input set of primitives into a graph structure that represents the molecule in terms of the atoms (or atom sequences for structural formulas) and different types of bonds between them. As an example, we list below the set of extracted geometric primitives for the Aspirin molecule from Figure 1(b):

    60;4;336;279;188.992693;206.825861
    chargroup;O;258;223;287;257
    chargroup;O;195;114;224;148
    chargroup;O;6;5;35;39
    chargroup;OH;132;5;192;39
    line;normal;82;62;85;56;4;4611686018427387904.000000
    line;normal;83;130;82;63;4;4611686018427387904.000000
    line;normal;83;276;145;240;4;-0.577093
    line;normal;20;239;82;276;4;0.578389
    line;normal;274;166;269;166;4;0.000000
    line;normal;267;220;268;167;4;4611686018427387904.000000
    line;normal;84;266;135;236;4;-0.577307
    line;normal;29;233;29;174;4;4611686018427387904.000000
    line;normal;276;220;276;167;4;4611686018427387904.000000
    line;normal;137;171;84;142;4;0.573204
    line;normal;268;166;226;141;4;0.581697
    line;normal;146;167;191;141;4;-0.574221
    line;normal;21;166;82;131;4;-0.578908
    line;normal;20;238;20;167;4;4611686018427387904.000000
    line;normal;146;168;146;239;4;4611686018427387904.000000
    line;normal;81;62;34;36;4;0.575445
    line;normal;86;56;127;32;4;-0.575723
    line;normal;85;55;39;28;4;0.580944

Rules are defined in terms of preconditions and consequences. A rule is applicable if there exist geometric objects that satisfy its preconditions. The consequence results in the removal of existing geometric objects and the addition of elements to the graph as well as possibly the addition of new geometric objects. In general, preconditions of different rules are mutually exclusive, and thus the order of rule application is irrelevant. Rules work with a number of parameters, both fuzzy and strict, that set certain thresholds, for instance the minimal bond length, under which decisions will be made. These parameters allow for the customisation of MolRec and its adaptation to particular requirements of data sets. As an example of a rule we give the simplified description of the rule recognising double bonds below. For precise definitions of all rules see [15, 14].

    1. Let $l_1$, $l_2$ be distinct line segments of a minimum length.
    2. If $l_1$ is nearly parallel to and in a neighbourhood of $l_2$.
    3. No other line segment is nearly parallel to $l_1$ or $l_2$.
    ⇒ Then $(l_1, l_2)$ form a double bond.

The rule engine is parameterisable with respect to different sets of rules. For the general task of recognising chemical diagrams we employ 18 rules altogether. With the exception of two rules that deal with the recognition of particular 3-dimensional structures, so-called bridge bonds, and that have to be applied at the very beginning, all other rules can be applied in arbitrary order. They deal with the recognition of the different types of chemical bonds, which are presented in Figure 5. All bonds consist of one or several geometric objects, which a rule can select using its preconditions and rewrite into a corresponding graph entry for the bond type.

There are also single geometric objects that possibly represent more than one bond. An example is the implicit nodes presented in Figure 6. Here carbon atoms are understood to be at the grey circled areas separating the bonds. These cases are dealt with by rules that pick double or triple bonds, respectively, while also producing new geometric objects by effectively cutting the bonds at the implicit nodes. These new objects are then further processed by other rules.

The graph resulting from the recognition step is translated into a standard chemical output format. In our case we use the Chemical Markup Language (CML), an XML format that specifies molecules in terms of XML elements for atoms and bonds. Atoms are commonly given explicitly with their element type, 2- or 3-dimensional coordinates and hydrogen count, that is, the number of hydrogen atoms bonded to the atom. Consequently, hydrogen atoms are generally not given explicitly. Bonds are represented in terms of their type (e.g., single, double, triple) and the atoms they connect. In addition, each XML element in a CML file has a unique id. For example, CML for Aspirin is:

    ....
    ....

Figure 5: Bond types recognised by MolRec. (a) Single Planar (b) Double Planar (c) Triple Planar (d) Wedge (e) Hollow wedge (f) Bold (g) Dashed wedge (h) Dashed (i) Dashed bold (j) Wavy (k) Dative

Figure 6: Implicit Nodes (circled)
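The simplified double-bond rule of Sec. 5.2 can be sketched in Python as follows. This is an illustration under stated assumptions and not MolRec's implementation: the three thresholds are hypothetical values, "in a neighbourhood" is approximated by the distance between segment midpoints, and condition 3 is read as applying only to segments in that same neighbourhood. The coordinates are taken from the primitive listing for Aspirin, assuming that the four numbers after `line;normal;` are x1;y1;x2;y2, which the listing itself does not state.

```python
import math
from itertools import combinations

MIN_LENGTH = 20.0              # assumed minimum bond length in pixels
MAX_ANGLE = math.radians(5.0)  # assumed tolerance for "nearly parallel"
MAX_DISTANCE = 15.0            # assumed neighbourhood radius in pixels


def _length(s):
    (x1, y1), (x2, y2) = s
    return math.hypot(x2 - x1, y2 - y1)


def _angle(s, t):
    """Undirected angle between two segments, in [0, pi/2]."""
    a = math.atan2(s[1][1] - s[0][1], s[1][0] - s[0][0])
    b = math.atan2(t[1][1] - t[0][1], t[1][0] - t[0][0])
    d = abs(a - b) % math.pi
    return min(d, math.pi - d)


def _midpoint_gap(s, t):
    """Distance between the midpoints of two segments."""
    sx, sy = (s[0][0] + s[1][0]) / 2, (s[0][1] + s[1][1]) / 2
    tx, ty = (t[0][0] + t[1][0]) / 2, (t[0][1] + t[1][1]) / 2
    return math.hypot(sx - tx, sy - ty)


def _close_parallel(s, t):
    return _angle(s, t) <= MAX_ANGLE and _midpoint_gap(s, t) <= MAX_DISTANCE


def find_double_bonds(segments):
    # Condition 1: only segments of a minimum length are considered.
    candidates = [s for s in segments if _length(s) >= MIN_LENGTH]
    bonds = []
    for s, t in combinations(candidates, 2):
        # Condition 2: nearly parallel and in each other's neighbourhood.
        if not _close_parallel(s, t):
            continue
        # Condition 3: no third nearby parallel segment (e.g. a triple bond).
        others = (u for u in candidates if u is not s and u is not t)
        if any(_close_parallel(u, s) or _close_parallel(u, t) for u in others):
            continue
        bonds.append((s, t))
    return bonds


A = ((267, 220), (268, 167))
B = ((276, 220), (276, 167))
C = ((268, 166), (226, 141))
D = ((83, 276), (145, 240))
E = ((84, 266), (135, 236))
F = ((146, 168), (146, 239))

for pair in find_double_bonds([A, B, C, D, E, F]):
    print(pair)
```

On this subset the sketch pairs A with B and D with E, while the parallel but distant segment F is left alone. A full rule engine would also remove the matched primitives and add a bond entry to the graph, which is omitted here.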

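The polyline clean-up described in Sec. 5.1 can be sketched as a plain recursive Douglas-Peucker simplification whose threshold is tied to the average line width. The sketch below is not MolRec's code: the line width of 4 pixels and the factor 1.5 (chosen inside the stated range of 1 to 2 line widths) are assumed values, and only open polylines are handled.

```python
import math


def _point_segment_distance(p, a, b):
    """Distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, epsilon):
    """Simplify an open polyline; drop points within epsilon of the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], first, last)
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= epsilon:
        return [first, last]
    # Keep the farthest point and simplify both halves recursively.
    left = douglas_peucker(points[: index + 1], epsilon)
    right = douglas_peucker(points[index:], epsilon)
    return left[:-1] + right


average_line_width = 4               # assumed value
epsilon = 1.5 * average_line_width   # assumed factor within the 1-2 range

# Hypothetical jagged stroke with one genuine corner at (10, 0).
polyline = [(0, 0), (3, 1), (6, -1), (10, 0), (11, 4), (9, 8), (10, 12)]
print(douglas_peucker(polyline, epsilon))
```

With these values the small wobbles are removed while the corner at (10, 0) survives, which mirrors the stated aim of smoothing artefacts without losing significant corners. Closed polylines, as found in rings, would need a start and end point to be chosen before the same procedure applies.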
6. ANNOTATED SVG GENERATION

Given the CML representation of a molecule, it is now fairly straightforward to compute the corresponding diagram in SVG, and there already exist a number of solutions to do this, such as the Open Babel library [10]. The drawback of these solutions is that they are geared exclusively towards rendering the diagram, discarding all chemical information in the process. That is, they will set all the geometric components, lines and characters, in a flat structure, losing information about bonds or atoms.

As making a connection between the geometric components of the SVG and the bonds and atoms in the input CML file is important for the purpose of highlighting and magnification, we have implemented our own SVG renderer as a simple extension of the Chemistry Development Kit (CDK) [17, 20], an open source toolkit for chemical computations. This allows us to exploit SVG facilities to group elements together as well as to add attributes reflecting their chemical purpose and connecting them to their origins in CML. The following example is the first double bond in the Aspirin molecule, where the group id refers to the corresponding CML element and the class of each line element indicates that they denote bonds.

7. SEMANTIC ENRICHMENT

As the standard CML format only contains information on atoms and bonds, all one can easily construct from it is a graph representation that allows trivial traversal from atom to bond. Our next goal is therefore to compute more information about the molecule to produce a semantically richer representation, in particular, by finding blocks and their relationship to each other and by naming the molecule and its components. We have implemented the semantic enrichment as a server side Java application using some CDK algorithms as well as the Chemical Identifier Resolver web service [18]. We point these out explicitly in the following exposition.

7.1 Abstraction

We first abstract the molecule by finding its blocks.

Ring systems are discovered with CDK’s ring finder library, which can compute sets of isolated rings and fused rings. For the latter we then use a subset coverage method to compute all the individual sub-rings.

Aliphatic Chains are computed using a variant of Floyd-Warshall, ignoring atoms that are already in a ring system.

Functional Groups are detected using CDK’s SMARTS query tool. SMARTS is a language for describing molecular patterns that can be employed to find matching components in a molecule. We employ a pattern library of around 300 different functional groups, given as name/pattern pairs. All patterns are matched with the molecule in order to retrieve a set of candidate groups, which is narrowed down by (1) only allowing groups that have at most one element in common with a ring or a chain, (2) ignoring groups that are proper subsets of another group, and (3) not allowing groups that have only elements that also occur in another block.

With the blocks available we can now compute the following graphs:

Block graph: For the molecule $M$ a graph $G_M = (V_M, E_M)$ with vertices $V_M$ being the blocks of $M$ together with all single atoms that do not belong to any block. Edges $E_M$ consist of bonds or shared atoms between elements of $V_M$.

Sub-ring graphs: For each fused ring $R \in V_M$ a graph $G_R = (V_R, E_R)$ with $V_R$ representing the sub-rings and $E_R$ denoting bonds and atoms shared between neighbouring rings in $R$.

[...] and $A_1, \ldots, A_m \in V_M$ be single atoms, then for each $S \in$ [...] a graph $G_S = (V_S, E_S)$, with $V_S$ all the atoms belonging to $S$ and $E_S$ all the bonds between elements in $V_S$ only. Hence any bond connecting an atom in $V_S$ with an atom not in $V_S$ does not belong to $E_S$.

We can now combine these graphs in a tree-like structure, with the root representing the molecule, the first level consisting of $G_M$, and each block in $V_M$ having a child that is either an atomic graph or, in the case of fused rings, a sub-ring graph, where the elements of the latter in turn have atomic graphs as children. As a consequence, this abstraction tree has a height of at most 4.

The abstraction tree for Aspirin is given in Figure 7. The molecule itself is represented by node m1, while on the second level the nodes as1, as2, as3 denote the benzene ring and the functional groups carboxylic acid and ester, respectively. The links on this layer correspond to the bond b3 between the ring and the acid group and the shared carbon atom a8 between the ring and ester.

Figure 7: Abstraction tree for Aspirin.

7.2 Ordering and Positions

Our next goal is to compute orderings on the vertices for each of the graphs defined in the previous section. We do this by computing a separate position function pos for each of the different vertex sets. This is important to obtain positions of substitutions in aliphatic chains and rings, for their natural language description as well as for providing orientation when navigating molecules.

The overarching idea is to assign the lowest position to the “heaviest” block or atom, with respect to their molecular weight. (There are some exceptions which we point out below.) We therefore define a partial order $\leq_m$ on chemical structures (blocks or single atoms), and say that $S \leq_m S'$ iff $S$ has a molecular weight less than or equal to that of $S'$. In practice, molecular weights are computed with functionality from the CDK library.

The block graph $G_M = (V_M, E_M)$ is ordered by first choosing a start block and then assigning other positions in a depth first traversal of the graph. Both the choice of the start block and the order of the depth first traversal are determined by the following ordering: For $S, S' \in V_M$ we let $pos(S) < pos(S')$ iff

(i) $S$ is a ring system, $S'$ is not a ring system,

[...]

(ii) external substitution that is heaviest with respect to $\leq_m$. Then enumerate carbon atoms in the direction such that the next external substitution gets the smallest possible position.

Isolated Ring $R$ with internal substitutions: Assign position 1 to (i) an internal substitution with oxygen atom, if one exists, (ii) an internal substitution that is heaviest with respect to $\leq_m$. Then enumerate atoms in $R$ in the direction such that the next internal substitution gets the smallest possible position. If no second internal substitution exists, enumerate similar to the previous case.

[...] $S$: Choose as start atom the one externally bonded to the block with the lowest position in $V_M$. Then assign positions in depth first manner, using molecular weight for ordering.

For the Aspirin molecule the positions in the benzene ring are 1 for the carbon connected to the ester group, 2 for the carbon connected to the carboxylic acid group, and so on.

Fused ring systems have an extremely complex naming and numbering scheme, which we have deliberately chosen not to employ and implement. Instead we simply assign positions to the outward facing atoms, similar to isolated rings without internal substitutions, and impose an order on the component rings by:

(i) Let $F$ be the fused ring system, and let $A$ be the ordered set of all atoms in $F$ with an assigned position $pos(a)$, $a \in A$.

(ii) Let $O \subseteq F$ be the set of all outer rings, such that each $R \in O$ contains an atom from $A$. Then $A$ induces an order pos on $O = \{R_1, \ldots, R_n\}$, with $pos(R_i) < pos(R_j)$, $i < j$ iff there exists an $a \in R_i \cap A$ with $pos(a) < pos(b)$ for all $b \in R_j \cap A$.

(iii) Let $I = \{R_1, \ldots, R_m\}$ be the set of all inner rings, such that for each $R_i \in I$, $R_i \cap A = \emptyset$, $i = 1, \ldots, m$.
0 (ii) S is an aliphatic chain, S is a functional group, Then we assign positions n + 1, . . . , n + m to elements (iii) S ≤ S0, otherwise. of I exhaustively by choosing the next R ∈ I without m pos(R) value, such that R has the neighbour with the This order always promotes rings over chains over functional smallest position in F. groups, starting with the heaviest system for each. Thus, Finally, given the local orderings for all graphs we can also in our Aspirin example the imposed order on the blocks is compute a global ordering over all atoms in the molecule, benzene ring, ester, carboxylic acid. exploiting the order of blocks and possibly resolving ambi- Our procedure to assign positions to elements in blocks guities for atoms occurring in more than one structure. uses rules based on simplified versions of the IUPAC nam- ing conventions [8] as well as for rings containing non-carbon 7.3 Naming and Description elements, on the Hantzsch-Widman system [13]. In partic- Once the abstraction graph is computed it is further en- ular, we use greatly simplified rules to assigning positions riched by generating basic descriptions for all its compo- in fused ring systems, and therefore, for simplicity, we first nents. In a first step, we try to name automatically all the explain ordering computation without fused rings. In addi- molecule and all its blocks. This is achieved by two means: tion to molecular weight, the main guideline for ordering is (1) We compute the molecular formula and the structural to have substitutions at the smallest positions. formula. (2) We use the Chemical Identifier Resolver web Aliphatic chain S: Enumerate carbon atoms in S starting service [18] to try and generate common names and IUPAC at one end such that first external substitution is at names. The web service can not always guarantee an an- carbon atom with the lowest possible position. 
If S swer: While IUPAC names can be computed for most, but has no external substitutions perform procedure with not all, blocks, common names only exists for some blocks. respect to internal substitutions. In the latter case we usually receive a list of names, from which we automatically choose, if possible the first name Isolated Ring R without internal substitutions: As- containing Latin characters only. sign position 1 to carbon element in R with (i) ex- Having this information available we compute natural lan- ternal substitution OH (a phenol group), if one exists, guage descriptions of all components of the abstraction tree that can later be used as speech strings. We thereby com- name, IUPAC name (if either is available), molecular and pute three different types of description each with poten- structural formula of the molecule. In brief, we compute tially two variants: basic and advanced. The former is a speech string by combining the summary descriptions for aimed at learners unfamiliar with much of the chemical ver- the elements of the block graph. Finally, verbose consist of nacular, while the latter requires some chemical knowledge a similar combination of descriptions for all atomic graphs. to understand. Since both coincide in some cases, the ad- vanced variant can be omitted. 8.2 Browser Front End Descriptions are hierarchical, starting with the basic com- In addition to the SVG and CML we also inject additional ponents and combined to more complex strings, for all levels JavaScript code that allows interactive exploration of the di- of the abstraction tree. We distinguish three types: basic, agrams. The main idea is that a user can enter a diagram context and summary. and interactively browse through its components on differ- Basic descriptions are attached to each and are, for in- ent levels and in different granularity. 
The components are stance, descriptions like “Carbon 3”, “Carbon 4 bonded to presented to the reader by making descriptions available for 1 hydrogen”, “double bond”, “six membered carbon ring” or aural rendering by a screen reader, highlighting as well as “benzene ring” as advanced variant. possibly magnification. Technically these three functionali- Context descriptions aim to describe a component from ties are realised as follows: a particular point of view or its role, consequently they are Aural Rendering is achieved by invisibly displaying and usually directed. For example, an edge representing a double updating a speech string in a DOM element designated as bond between two carbon atoms 3 and 4 is described as ARIA live region and ARIA role alert. “double bonded to Carbon 4” outgoing from 3 and “double Highlighting is automatic and synchronised with the au- bonded to Carbon 3” in the opposite direction. ral rendering, that is, the elements that are currently being Summary descriptions combine basic and context descrip- described are highlighted in the SVG graphic. Highlighting tions to elements on the level of the block graph. They are is achieved by dynamically changing the CSS parameters of constructed using the following pattern: the SVG nodes concerned. (i) Backbone, i.e., type of ring, length of aliphatic chain, Magnification is an option of the explorer module and technically achieved via SVG animation, realising the zoom (ii) internal substitutions with position in the backbone, by gradually constraining the View Box of the SVG element to the component that are currently being described. (iii) external substitutions at position of the backbone. The basic browsing functionality is implemented by an For the ring in the Aspirin molecule, we get as basic sum- explorer attached to the SVG element as event listener. 
A mary description“six membered carbon ring with substitu- reader is alerted to its existence via the ARIA role applica- tions at positions 1 and 2.” and as advanced “benzene ring tion. The explorer can be entered in two basic modes: key- with substitutions at positions 1 and 2”. board driven (keystroke Enter) or menu driven (keystroke After computing all descriptions the completed abstrac- Ctrl-Enter). In the latter case a set of buttons for naviga- tion tree can be translated into an XML representation and tion are displayed next to the diagram that can then either exported into a semantically enriched CML file. We use a be operated by mouse or by keyboard. Once the explorer separate name space for this, in order to retain a CML file has been entered, the user can switch additional features, that is still processable by other applications. such as magnification of components, display of the aural description, or switch to advanced variant of descriptions. The explorer can be left with Escape. 8. ACCESSIBLE DIAGRAMS For browsing the molecule we have chosen a relatively To convert images in web documents into accessible dia- simple exploration model which was honed in user testing. grams, we employ AJAX functionality to import the anno- Granularity and range of movement is determined by the tated SVG and the enriched CML as an SVG+XML media graphs in the abstraction tree. In particular, one cannot type into the web page. This allows us to effectively recre- move between different graphs on the same level in the tree. ate a version of the abstraction tree inside a browser and That is, when browsing a block on the atom level, it is im- to either connect it to the corresponding components in the possible to move to another block even if there is a bond SVG or, in some cases, to employ it to describe the original between the two in the molecule. It is necessary to move bitmap image. up to the level of the block graph first before moving to a neighbouring block. 
This guarantees that the reader is al- 8.1 Transcribed Bitmaps ways provided with the summary of the next block before During user testing (see Sec. 10 for details) with secondary exploring it in depth, which we have found in user testing school students we realised that some were still working with to improve orientation. relatively dated versions of operating systems, browsers and As a consequence we effectively allow browsing along two screen readers, and hence SVG or WAI-ARIA support was axes only: Right/Left and Down/Up. In keyboard driven not always available. We therefore implemented a fallback exploration this functionality is assigned to the correspond- model, making diagrams at least partially accessible, by ef- ing cursor keys. In addition we have a forward movement fectively enriching the original bitmap image with descrip- (assigned to Enter) that allows for choice at junction points. tions as alternative text that we compute from the imported In more detail, Down/Up corresponds to vertical move- CML and that can be picked up by screen readers. ment in the abstraction tree. Thereby down moves the In practice we replace the original image with three copies reader from the current position in a graph to the node with of itself, each with a different description, simulating a super- position 1 in its child graph. Up moves from anywhere in brief, brief and verbose mode. In superbrief we only give the the child graph and back to the parent graph. Step 1 Step 2 Step 3 & Step 5 Step 4 Step 6 Step 7

Figure 8: Stepping through the Aspirin molecule.
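
The two-axis browsing of Section 8.2 can be illustrated with a small sketch. This is a simplified illustration in Python, not the authors' JavaScript explorer: the `Node` and `Explorer` classes, their fields, and the Aspirin fragment built by `build_aspirin_fragment` are hypothetical, and the forward key for junction choices is left out. It only shows that vertical moves follow the abstraction tree, while horizontal moves stay inside one graph and follow the position ordering.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # identity comparison; nodes reference each other
class Node:
    """One element of a graph in the abstraction tree (hypothetical layout)."""
    name: str
    position: int = 1
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        """Attach a child and keep the child graph sorted by position."""
        child.parent = self
        self.children.append(child)
        self.children.sort(key=lambda n: n.position)
        return child


class Explorer:
    """Tracks the current node and returns its name as the speech string."""

    def __init__(self, root: Node) -> None:
        self.current = root

    def down(self) -> str:
        # Move to the node with the lowest position in the child graph.
        if self.current.children:
            self.current = self.current.children[0]
        return self.current.name

    def up(self) -> str:
        # Move from anywhere in a child graph back to its parent.
        if self.current.parent is not None:
            self.current = self.current.parent
        return self.current.name

    def _step(self, offset: int) -> str:
        # Horizontal moves never leave the current graph.
        parent = self.current.parent
        if parent is not None:
            siblings = parent.children
            index = siblings.index(self.current) + offset
            if 0 <= index < len(siblings):
                self.current = siblings[index]
        return self.current.name

    def right(self) -> str:
        return self._step(+1)

    def left(self) -> str:
        return self._step(-1)


def build_aspirin_fragment() -> Node:
    """Hypothetical fragment: block order ring, ester, acid as in Sec. 7.2."""
    molecule = Node("Aspirin")
    ring = molecule.add(Node("Benzene ring", position=1))
    molecule.add(Node("Ester", position=2))
    molecule.add(Node("Carboxylic acid", position=3))
    ring.add(Node("Carbon 1", position=1))
    ring.add(Node("Carbon 2", position=2))
    return molecule
```

In this sketch a reader positioned on an atom of the ring cannot reach the ester by moving right; they first call `up()` to return to the block graph and only then move to the neighbouring block, which mirrors the restriction described above.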

Right/Left corresponds to movement in a graph with respect to the ordering of the elements, right in ascending order, left in descending order. In case there is a junction point, the choices can also be browsed using right (or left for backward) and a choice can be made using the forward key. Note that we do not provide a backward motion, as this is effectively subsumed by the left/right axis.

9. EXAMPLE

As an example, we consider some steps in browsing the Aspirin molecule, with magnification and highlighting. Figure 8 presents the different states of magnification and highlighting. Observe that highlighting is done with two different colours, a primary (yellow) and a secondary (blue). The moves and speech output corresponding to the single steps are given in the table below.

Step  Key      Speech string
1     Enter    Aspirin
2     Down     Benzene ring with substitutions at position 1 and 2.
3     Right    Substitution at position 1 Ester.
4     Right    Substitution at position 2 Carboxylic Acid.
5     Left     Substitution at position 1 Ester.
6     Forward  Functional group Ester with shared atom at position 1.
7     Down     Carbon atom shared with Benzene ring, single bonded to oxygen.

In the initial step 1 we enter the explorer, the name "Aspirin" is announced and the entire molecule is highlighted in primary. Going one level down, we reach the benzene ring, which is magnified and highlighted. We can now browse the substitutions with cursor keys right or left, which corresponds to the steps 3–5. Observe that the magnification now zooms to present both the benzene ring and the functional group reachable. The ring is still highlighted in primary, while the functional group is highlighted in secondary.

In step 6 we go forward, thus entering the functional group ester. This leads to both zoom and highlighting concentrating on this block only. In the final step we go down to the lowest level of our structure, the atom level. Consequently, the first atom in the functional group is being described and magnified, together with the bonds connecting it inside the functional group only. This bond is highlighted in primary, while the reachable next atom, oxygen in our case, is highlighted in secondary.

10. EXPERIMENTS AND USER TESTING

We conducted a number of experiments to evaluate the scope of our technology. As a resource for evaluation, we used 63 scanned images from chemistry textbooks up to degree level and 20 complex molecules from a US patent database. In addition we harvested 100 chemical diagrams from the web in a wide range of image formats (e.g. jpeg, gif, tiff, png etc.). This collection features diagrams with different levels of complexity and image quality. Several of the images were hand drawn. Our approach is able to handle all images we collected, with the exception of 4 that were hand drawn and 1 with a red background. We therefore feel confident that our approach is capable of handling the full range of diagrams likely to be used for educational purposes.

We also tested using different platforms (i.e., Linux, Mac OS, Windows), with the five most popular internet browsers and common commercial screen readers and magnification tools (e.g., Jaws/NVDA on Microsoft Windows, VoiceOver on Mac OS etc.). We found that though there was a disparity between how different tools dealt with accessibility standards, the most common tools were all able to work transparently with our end-to-end solution.

We also conducted a series of user tests at different stages during the development of our technology, working with a local specialist school for students with visual impairments. Due to the numbers of students available it was not possible to do either a longitudinal or a quantitative study. In addition, the range of visual impairments was wide, from students who were blind to students with restricted sight. Our findings are thus anecdotal rather than statistically significant and need to be backed up by further user testing in the future.

For testing, students were using their own personal equipment and setup, and were only provided with the URL of samples, without any additional information about working with the technology. Nevertheless, they could very quickly engage with the diagrams independently. Blind students could also reproduce diagrams, even for molecules they were unfamiliar with, as physical models with so-called Molymod blocks, by listening to the produced descriptions.¹

The general feedback was that all learners as well as teachers present were extremely positive about the technology. Feedback from blind students indicated that our interactive exploration model could replace alternative techniques for presentation of chemical structures such as models or tactile graphics. One interesting point was that blind students preferred keyboard controlled exploration, whereas students relying on magnification were happier with menu buttons. The latter also has the added advantage that it minimises potential clashes of key bindings with the screen reader software, and can make diagrams accessible even on mobile platforms like Android and iOS.

¹ Some videos from user testing with secondary school students that had media clearance can be found at http://www.cs.bham.ac.uk/go/sdag/chemAccess/videos.php.

11. CONCLUSIONS AND FUTURE WORK

We have designed and implemented an end-to-end solution for making chemical diagrams accessible. It starts with an image analysis process that can recognise diagrams from regular bitmap images. While this means that much of the success of our method relies on the strength of the recogniser, we have shown in previous work that MolRec has a particularly robust vectorisation algorithm and therefore a very high recall rate [16]. While recognition errors can always be an issue, the advantage of starting with a bitmap image is that one can make existing literature accessible retrospectively and that there is no need to change author behaviour in order to produce accessible diagrams.

We strongly believe that bypassing the need for bespoke authoring and browsing tools significantly reduces the hurdle for production and acceptance of accessible diagrams. Instead our work is based on open standards (SVG, WAI-ARIA, XML, JavaScript) that make our technology platform independent and support a number of screen readers. Similarly the exploration can be run in any browser.

While our exploration model is deliberately held simple, our user studies have shown that a simple navigation was most conducive to comprehension. However, in our studies we have mainly concentrated on working with secondary school students, and thus restricted our examples to relatively simple structures (like Aspirin). We need to experiment further with more complex structures, in particular those containing large fused ring systems, to understand if our current model is adequate. One could shift to a wind rose style navigation, providing eight directions of movement and thereby communicating the layout of a molecule better. Alternatively, one could employ a more hierarchical presentation model or allow for setting way points that can be returned to when a reader gets lost.

We have also not yet dealt with molecules containing 3d components. Although our image analysis can cope with features like 3d bonds, bridge bonds and bridged rings, we have not yet made special provisions for those in the semantic analysis or in the exploration model. This might also need additional description variants, to keep speech strings concise, or make use of overview techniques as proposed in [2].

Image analysis could be employed on a wider scale to make more scientific and teaching material accessible. Initial experiments have shown that MolRec's vectorisation can deal well with diagrams in other STEM subjects that contain similar geometric primitives as molecule diagrams. Thus adding semantic recognition and enrichment for other STEM diagrams could transform our technology into a polyfill solution for diagrams, similar to the role MathJax plays for displaying mathematics on the web.

Finally, although we have concentrated on support for visually impaired readers initially, our approach can be beneficial for a larger user community. We have already integrated different high contrast settings to support readers with learning impairments such as dyslexia. But moreover, we believe that our technology could be a useful teaching tool for all students, regardless of their abilities.

Acknowledgements

We would like to thank the pupils and teachers at the New College Worcester for their time and effort in our user tests. We are grateful to Duncan Bell for expert testing, to Egon Willighagen and John May for initiating us into the CDK, and to Peter Murray-Rust for help with CML and providing us with many insights into advanced chemistry.

12. REFERENCES

[1] A. Brown, S. Pettifer, and R. Stevens. Evaluation of a non-visual molecule browser. ACM SIGACCESS Accessibility and Computing, 77-78:40–47, 2004.
[2] A. Brown, R. Stevens, and S. Pettifer. Audio representation of graphs: A quick look. In Proc. of Audit. Displays, 2006.
[3] L. Brown, S. Brewster, S. Ramloll, R. Burton, and B. Riedel. Design guidelines for audio presentation of graphs and tables. In Proc. on Auditory Display, 2003.
[4] R. Cohen, A. Meacham, and J. Skaff. Teaching graphs to visually impaired students using an active auditory interface. ACM SIGCSE Bulletin, 38(1):279–282, 2006.
[5] R. Cohen, R. Yu, A. Meacham, and J. Skaff. Plumb: displaying graphs to the blind using an active auditory interface. In Proc. of Comp. and Accessibility. ACM, 2005.
[6] D. Douglas and T. Peucker. Algorithms for the reduction of the number of points required to represent a digitized line or its caricature. Cartographica, 10(2):112–122, 1973.
[7] R. Fartaria, F. Pereira, V. Bonifácio, P. Mata, J. Aires-de-Sousa, and A. Lobo. NavMol 2.0: a molecular structure navigator/editor for blind and visually impaired users. Eur. J. Org. Chem., 2013(8):1415–1419, 2013.
[8] H. Favre and W. Powell. Nomenclature of Organic Chemistry: IUPAC. Royal Society of Chemistry, 2013.
[9] P. Murray-Rust and H. Rzepa. Chemical markup, XML and the world-wide web, part 2. J. Chem. Inf. Comput. Sci., 41(5):1113–1123, 2001.
[10] N. O'Boyle, M. Banck, C. James, C. Morley, T. Vandermeersch, and G. Hutchison. Open Babel: An open chemical toolbox. J. Cheminf., 3:33, 2011.
[11] N. Otsu. A threshold selection method from gray-level histograms. IEEE Trans. Syst. Man Cybern., 9:62–66, 1979.
[12] F. Pereira, J. Aires-de-Sousa, V. Bonifácio, P. Mata, and A. Lobo. MOLinsight: A web portal for the processing of molecular structures by blind students. J. Chem. Educ., 88(3):361–362, 2010.
[13] W. Powell. Revision of the extended Hantzsch-Widman system of nomenclature for heteromonocycles. Pure Appl. Chem., 55(2):409–416, 1983.
[14] N. Sadawi. Rule-based approach for recognition of chemical structure diagrams. PhD, Univ. Birmingham, 2013.
[15] N. Sadawi, A.P. Sexton, and V. Sorge. Chemical structure recognition: a rule-based approach. In Doc. Recognition & Retrieval XIX, volume 8297. SPIE, 2012.
[16] N. Sadawi, A.P. Sexton, and V. Sorge. MolRec at CLEF 2012: overview and analysis of results. In CLEF Evaluation Labs, 2012. http://clef2012.org.
[17] C. Steinbeck, Y. Han, S. Kuhn, O. Horlacher, E. Luttmann, and E. Willighagen. The Chemistry Development Kit (CDK): An open-source Java library for chemo- and bioinformatics. J. Chem. Inf. Comput. Sci., 43(2):493–500, 2003.
[18] CADD Group. Chemical identifier resolver. http://cactus.nci.nih.gov/chemical/structure.
[19] H. B. Wedler, et al. Applied computational chemistry for the blind and visually impaired. J. Chem. Educ., 89(11):1400–1404, 2012.
[20] E. Willighagen. Groovy Cheminformatics with the Chemistry Development Kit. EL Willighagen, 2011.
[21] W. Yu, K. Kangas, and S. Brewster. Web-based haptic applications for blind people to create virtual graphs. In Proc. of HAPTICS 2003, pages 318–325. IEEE, 2003.