Production of Speech and Braille Output from Formatted Documents
Total Page:16
File Type:pdf, Size:1020Kb
Towards Accessible Technical Documents: Production of Speech and Braille Output from Formatted Documents A T h e s is s u b m it t e d f o r t h e d e g r e e o f PhD Donal Fitzpatrick BSc School of Computer Applications Dublin City University September, 1999 Supervisor Dr Alex Monaghan This thesis is based on the candidate’s own work, and has not previously been submitted for a degree at any academic institution I hereby certify that this material, which I now submit for assessment on the programme of study leading to the award of PhD is entirely my own work and has not been taken from the work of others save and to the extent that such work has been cited and acknowledged within the text of my work Donai Fitzpatrick September 21, 1999 11 Abstract The primary objective of this research, was to devise methods for communicat ing highly technical material to blind people, through the medium of Braille or prosodically enhanced spoken output This solution necessitated devising strategies to both model the document internally, and to unambiguously pro duce the material m the two output media The first phase m the generation of intelligible output was the transforma tion of the source into well-formatted and accurate Braille Following on from this, methodologies were defined to convey the structure and textual content of documents using prosodic alterations to the synthetic voice We have devised mechamsms whereby mathematical content can be delivered m an intuitive manner, using the sole medium of prosodically enhanced spoken output This ensures that the listener will not have to learn specific non-speech auditory sound to gam access to this form of presentation We have also devised a newer, and more flexible means for representing the structure and content of the document m the computer’s memory This Directed Graph, is a radical departure from the traditional, tree-based approach of the past, and facilitates rapid and efficient browsing of the document’s hierarchy This thesis discusses the various aspects of the TechRead system, which will ultimately provide increased accessibility for blind people to technical doc uments We demonstrate how the methods used m TechRead differ from those previously employed to solve this problem Acknowledgements Though this work bears the name of a single author, I wish to acknowledge all those whose contributions made it possible First and foremost, I wish to sincerely thank my supervisor, Dr Alex Monaghan, who had to contend with my crazy ideas, and missed deadlines Most of the good ideas expressed in this work are his, and he should not be held responsible for the remainder of the junk' More importantly, I wish to express my gratitude to Dr Monaghan, for access to one of the finest CD collections I have ever seen I wish to express my gratitude to the two unfortunate people who had the tedious job of correcting this thesis. To Dr Lynn Killen, and Dr Serge Santi, all that I can say is thank you for both your thoroughness, and your especial fairness I look forward to working with both of you in the future, and research, and Braillmg projects, academic or otherwise I would like to thank the staff, and postgraduate students of the School of Computer Applications, at Dublin City University for their help over the last number of years Many of those still residing m the Postgraduate laboratory, will no doubt be relieved to know that they will not have to act as a screenreader when my computer crashes m future To my parents and sister, I owe a debt which will probably never be repaid During my incarceration m special education, my family were always supportive, and never ceased to struggle m their attempt to ensure that I obtained the best education possible This thesis, finally marks the end, and indeed the vindication of their battle against the system Therefore, Mum, Dad and Niamh all that can be said is thanks As all those who know me will agree, this thesis would not be m its present form without the inestimable contribution of Ms Orla Duffner, B Eng Ms Duffner proofread this thesis, and made all those visual alterations which en sured that I wouldn’t be embarrassed when people looked at itt During the last, and most difficult few weeks, she bullied, persuaded and threatened this thesis into existence, when I would have preferred to be anywhere else So, O J , for the thesis, the music, and a list of other things too numerous to mention, thank you I must conclude, by acknowledging the invaluable assistance of those mem- IV bers of the DCU Folk Group, (or whatever its’ name is this week) You are all too numberous to mention, but all that can be said is thanks for being m DCU at the same time as me In particular, however, I will single out two people for special mention Fr John Gilligan, (even though he’s an Everton fan) and Sr Bernadette McDonagh were always around when I needed tea, beer or someone to give grief to To both of you, I wish to say a sincere “thank you” Donal Fitzpatrick September 21, 1999 Contents 1 Introduction 1 1 1 Principal objectives 1 1 2 Braille origins and overview 10 1 3 What is prosody 13 14 Summary 20 2 Literature review 21 2 1 How do we read7 22 2 11 Audio or visual reading 22 2 2 Extracting Document Structure 28 2 2 1 Extracting structure from mark-up 29 2 2 2 The document model used m ASTER 31 2 3 Browsing documents using spoken output 34 2 3 1 Non-structured browsing 35 2 3 2 Structured textual browsing 37 VI 2.3.3 Structured mathematical browsing............................................39 2.4 Producing audio o u tp u t...........................................................................44 2.4.1 Chang’s rules for spoken m athem atics..................................... 45 2.4.2 Audio output generated by A S T E R .........................................47 2.4.3 Audio output produced by MathTalk.........................................53 2.5 Sum m ary.................................................................................................. 63 3 Different views of the same data 65 3.1 The document model..................................................................................65 3.1.1 Model description ..........................................................................69 3.1.2 Representing tex t.............................................................................76 3.1.3 Representing mathematical content............................................ 79 3.1.4 Representing tabular data ......................................................... 84 3.2 Reading strateg ies..................................................................................... 88 3.2.1 Continuous reading...................................................................... 88 3.2.2 Skim-reading....................................................................................91 3.2.3 Non-sequential reading...................................................................93 3.2.4 Reading mathematical material............................................... 95 3.2.5 Reading ta b le s................................................................................99 3.2.6 Reading newly defined objects .................................................101 3.3 The human interface................................................................................ 101 vii 3.3.1 The feel of TechRead............................................................................102 3.3.2 The menu system ................................................................................... 113 3.4 The visual interface ...........................................................................................116 3.5 Summary .................................................................................................................120 4 Producing Braille 121 4.1 Braille translation ...............................................................................................121 4.1.1 Obtaining the characters.....................................................................123 4.1.2 Obtaining mathematical symbols...................................................127 4.2 Braille layout..........................................................................................................132 4.2.1 Simple layout............................................................................................133 4.2.2 Headings, titles and page numbering............................................135 4.2.3 Conveying Emphasis.............................................................................139 4.2.4 Miscellaneous layout considerations............................................... 140 4.2.5 Braille mathematical layout..............................................................141 4.3 Improvements to existing standards..............................................................146 4.3.1 Textual layout ........................................................................................146 4.3.2 Enhancements to mathematical Braille........................................148 4.4 Summary ..................................................................................................................150 5 Producing spoken output 152 5.1 Introduction..............................................................................................................152 H ,v 5 2 Conveying structure 157 5 2 1 Sectional units 159 5 2 2 Textual alterations 164 5 2 3 Lists, and other miscellaneous environments 168 5 3 Mathematical prosody 172 5 3 1 What to speak 173 5 3 2 How