Technische Universität München
Fakultät für Mathematik
Lehrstuhl für Geometrie und Visualisierung
Stroke-based Handwriting Recognition: Theory and Applications
Bernhard Odin Werner
Complete reprint of the dissertation approved by the Fakultät für Mathematik of the Technische Universität München for the award of the academic degree of
Doktor der Naturwissenschaften (Dr. rer. nat.).
Chair: Prof. Dr. Massimo Fornasier
Examiners of the dissertation:
1. Prof. Dr. Dr. Jürgen Richter-Gebert
2. Prof. Dr. Kristina Reiss
3. Prof. Dr. Sven de Vries (Universität Trier)
The dissertation was submitted to the Technische Universität München on 12.02.2019 and accepted by the Fakultät für Mathematik on 16.07.2019.
Acknowledgements
First and foremost, I want to thank Jürgen Richter-Gebert and Kristina Reiss for this amazing opportunity to work in two different scientific fields, and Frank Reinhold and Stefan Hoch for the marvellous collaboration. I am very grateful that I learned many different things unrelated to my actual dissertation and happy that we managed to produce a piece of software that appealed to students, teachers and other researchers alike. The experiences I had during ALICE will, without a doubt, shape my professional future.
I want to thank everyone at the chairs for Geometry and Visualisation, for Algebra, and for Mathematics Education for all the enlightening academic discussions and the occasional diverting private ones. I want to thank all my friends in Traunstein, Munich, Münster, Berlin and Boston for many, many diverting private discussions and the occasional academic ones.
I want to thank Matt, Noel, Amora, Pedro, Lena, Kevin and Madeline for showing me how to climb a mountain; and Steve and every other Rebel for your support and help on my epic and less epic quests over the last four years.
Last, but most important, I want to thank Jan-Christian and Dorothea for keeping me sane during the last few years.

Contents
1 Writing on touchscreens in technology, education & mathematics ...... 3
1.1 Handwriting recognition in technology ...... 5
1.2 The ALICE project ...... 13
1.3 The problem of handwriting recognition ...... 29
1.4 Structure & notation ...... 34
2 A mathematical model of handwriting ...... 39
2.1 Mathematical fundamentals ...... 40
2.2 A base model for strokes ...... 65
2.3 An overview of ALICE:HWR ...... 78
3 Geometric transformations of strokes ...... 105
3.1 Four classes of geometric transformations ...... 106
3.2 Applications ...... 132
4 Aspects of stroke classification ...... 143
4.1 Directional vectors ...... 144
4.2 Exclusion rules via FCA hypotheses ...... 153
4.3 Fuzzy matching of feature vectors ...... 162
5 Characterising strokes via determinants ...... 171
5.1 Curvature ...... 173
5.2 Determinants ...... 178
6 Looking back at ALICE:HWR ...... 191
6.1 Training ...... 194
6.2 Classification ...... 197
6.3 Parsing ...... 200
6.4 Performance ...... 202
7 Looking ahead ...... 205
Appendix A The Manual for the companion iBook ...... 211
Appendix B The code of ALICE:HWR ...... 221
Bibliography ...... 241
1 Writing on touchscreens in technology, education & mathematics
“What are letters?” “Kinda like mediaglyphics except they’re all black, and they’re tiny, they don’t move, they’re old and boring and really hard to read. But you can use ’em to make short words for long words.” — Harv to Nell, The Diamond Age, by Neal Stephenson
A very well-known problem in machine learning and pattern recognition is to identify written or drawn characters, icons and figures. This usually comes in two flavours: off-line and on-line. The former deals with pixel data as the raw input and encompasses scanned book pages, handwritten exams by students, and even blackboards seen in video-recorded lessons. The latter, however, may use the precise position of the tip of a pen or stylus, sampled at a finite number of points, and arises when writing on an electronic whiteboard or a touchscreen. In this thesis, we will focus on the latter and explore several approaches to this task.

When surveying the literature on on-line handwriting recognition, three things become evident: First, as a purely practical problem, handwriting recognition is "solved". There are more than enough different algorithms that work well enough. Second, the various approaches to handwriting recognition (HWR for short) differ greatly depending on the area of application — Latin letters, simple geometric shapes, Chinese characters, etc. Third, most development processes for these algorithms seem to start with the application in mind and are tailored to recognise a specific symbol set.
This thesis aims to shift the thought process away from this purely pragmatic mindset and to establish mathematical foundations that can be applied to various tasks with only small adaptations. Additionally, we want to build the classification process in a way that produces descriptive, humanly understandable results. Both goals are motivated by the intended use in interactive educational software.
The practical background for this thesis is given by the ALICE:fractions project, which was established in 2015 by the Heinz Nixdorf Foundation. In it, the Heinz Nixdorf Chair for Mathematics Education and the Chair for Geometry and Visualization at the TUM together created an interactive schoolbook on iPads for sixth-graders on the topic of fractions. The handwriting recognition algorithm developed for it is the guidepost for the theoretical considerations here. So, in order to motivate the assumptions and requirements for the algorithm, we will also discuss the role of handwriting recognition in educational software and describe how it was implemented in ALICE:fractions. Before we do so, we start with an overview of the history of technical devices capable of handwriting recognition and how they benefit from such programs.
1.1 Handwriting recognition in technology
To understand the peculiarities of tablets and other touch surface devices better, we give a brief overview of how this technology arose.
1.1.1 Computers and touch surfaces
When asked about "computers on which you can write on the screen", most people nowadays think of state-of-the-art touchscreen devices: smartphones, tablets and video game consoles. People old enough to remember might come up with Personal Digital Assistants (PDAs) like the PalmPilot, which were prominent in the 1990s. That decade saw a surge of other technical gadgets like mobile phones and virtual reality devices. The image of the 1990s was shaped by these technical advances even though not all of them were successful.1 Of course, the origins of all these technologies lie farther in the past. For example, mobile phones — in the shape of car phones — were made public in 1946 by Bell System.
The advent of interactive computer screens came shortly after screens were added to computers at all. Before the 1960s, computers mostly had teleprinters (also known as teletypes) as their "graphical" output device. These were typewriter-like machines that printed the relevant output data on paper. The technology of screens was of course known in 1960, but updating the data to be displayed was very memory-expensive and, therefore, screens were used sparingly. Even after the introduction of a fixed character set to be displayed — circumventing the need to update every light point in the cathode ray tube individually — screens were mostly used to show process data. "Proper" output was still printed to paper via teletypes.
1 The failed Nintendo Virtual Boy comes to mind.

Then, in 1962 and 1963, three major inventions saw the light of day. First, Steve Russell and several other members of the Tech Model Railroad Club at MIT created Spacewar!, which many regard as the first video game in history and which is, therefore, one of the first "interactive" programs. The significance of this lies in the formal definition of the term "interactive": it is ambiguous, but most attempts to define it are in the vein of:
Interactive software is a computer program which enables and demands input from a (human) user while it is running.
See, for example, [18]. What that means — especially in the context of early computer science — is that interactive software is not a glorified calculator that gets started and then works away for hours uninfluenced by any human. It requires frequent, near-constant back-and-forth with a user.2

Second, Ivan Sutherland invented Sketchpad — also at MIT — as part of his PhD thesis in 1963, which in many eyes made him the father of interactive computing. This program used the input of a light pen: a light-sensitive pen that detects the electron beam generating the image on the screen. Because the beam moves with a known speed over a known path across the screen, the light pen is able to compute the exact pixel it points to. Sketchpad uses this information to allow the user to draw straight lines, circles and other simple geometric shapes "onto" the screen and manipulate them in real time. This is the predecessor of all the touch surfaces we are interested in in this thesis. Both applications made it necessary to update screens quickly and often, and made it clear that, from then on, this would be highly desired.

Third, Douglas Engelbart invented the computer mouse only a year later. For our concerns here, computer mice are less important as an input device. But its success made it clear that the interactivity of computer programs would come to the fore, and that a focus would be put on special peripherals for this interaction. So, even in the early 1960s, everything was set for educational software on a technological level. However, the mouse gained popularity over the light pen over the next decades.3
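The timing principle behind the light pen can be sketched in a few lines: since the beam sweeps the raster at a known rate, the moment the pen's photo sensor fires determines the pixel under the pen. This is a toy illustration, not code from this thesis; the raster dimensions and timing constants are invented for the example.

```python
# Toy sketch of the light-pen principle: map the beam-detection time
# (measured from the start of a frame) to the pixel the beam was drawing.
# All constants are illustrative, not those of any real CRT display.

SCREEN_W, SCREEN_H = 640, 480   # raster size in pixels (assumed)
PIXEL_TIME = 1.0                # time units the beam spends per pixel
HBLANK = 96.0                   # horizontal retrace time per scanline
LINE_TIME = SCREEN_W * PIXEL_TIME + HBLANK

def pen_position(t_detect: float) -> tuple[int, int]:
    """Return the (x, y) pixel under the pen at detection time t_detect."""
    row, t_in_line = divmod(t_detect, LINE_TIME)
    # Clamp into the visible area in case the detection falls into retrace.
    col = min(int(t_in_line / PIXEL_TIME), SCREEN_W - 1)
    return col, int(row)

# A detection ten time units into the third scanline maps to row 2:
print(pen_position(2 * LINE_TIME + 10.0))  # -> (10, 2)
```

The same idea, run in reverse, is how Sketchpad could track the pen continuously: the display hardware already "knows" where the beam is at every instant.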
2 This is also known as 'on-line processing', in contrast to the 'batch processing' that dominated the early years of computer science. Note the double use of the term 'on-line' both in the context of processing and of handwriting recognition.
3 Perhaps because holding a pen in front of a screen is much more strenuous than pushing a mouse over a flat surface? This phenomenon is sometimes called 'gorilla arm'.

We saw the first big wave of widespread devices with touch surfaces in the PDAs of the 1980s and 1990s, starting with the Psion Organizer in 1984, which evolved into smartphones during the first decade of the 21st century. Parallel to this advancement of computers with a touch-sensitive surface or screen, graphics tablets were developed. These are external devices transmitting the movement of a pen or stylus one-to-one to a computer. At the base level, they serve as a simple alternative to a computer mouse or directional keys, but their main application lies in graphic design, art and handwriting input. The first proper graphics tablet for home computers was Koala Technologies' KoalaPad, produced from 1983 for several computers including the Apple II and the Commodore 64. Around the same time, several handwriting recognition programs started to be distributed, taking advantage of both touchscreens and graphics tablets.
The culmination of all these various technologies — at least seen from the year 2019 — is the tablet computer. The idea of book- and notepad-like computers is prevalent throughout the history of computing, as can be seen in many works of science fiction as, for example, Isaac Asimov's novel Foundation (1951), Arthur C. Clarke and Stanley Kubrick's film 2001: A Space Odyssey (1968) and Neal Stephenson's novel The Diamond Age (1995). However, as such a device has to be small by design, it took a bit longer to see the first models. The earliest predecessor of what is nowadays considered a tablet can be found in 1989: the GRiDPad. But the big breakthrough for tablet computers came in 2010 with Apple's iPad. Since then, tablets have permeated more and more areas of everyday life.
It is obviously hard to predict what the future of computers and, in particular, of in- and output devices will look like. However, as of 2019, there is a push to advance machine learning based on neural networks. Also, the progress in hardware performance — forecast by Moore's law (see [38]) and similar exponential predictions of technological growth — seems set to continue for at least a couple of years. Moreover, virtual reality is on the rise, and there is even slow but steady progress on haptic feedback — as can be seen, for example, in the actuators used in mobile phones (like the iPhone X) and video game controllers (like the Nintendo Joy-Cons) to generate "meaningful" vibrations, and in the trackpads of the newer Apple MacBook series simulating a click via a small hammer.
This puts the technology of tablets in an interesting spot in the year 2018. On the one hand, such devices are capable of real-time computations (and high-quality rendering). On single-purpose HWR applications — e.g., recognising Latin letters, Chinese characters, or simple geometric forms — they perform exceptionally well. And while reducing the test error rate further when it is already well below one percent4 is often very hard, such gains are mostly unnoticeable in practice. On the other hand, sophisticated machine learning techniques need to be trained and fine-tuned and can often not be adapted on the fly to expand the symbol pool. Even if these algorithms allow such flexibility, asking the user to provide a significant number of samples for the new characters is impractical. And most mobile devices still lack the computational power to re-train the algorithm, especially while the main application program is supposed to continue running.5

Moreover, having to develop different variations of symbol recognition for different character sets to be included in the same program takes up many resources. However, a single recognition system built for all symbols will usually be rather ineffective. See, for example, the Detexify project by Daniel Kirsch, [32]: while the algorithm has solid theoretical foundations, it just cannot handle the sheer number of all LaTeX symbols.6
So, despite the rise of neural networks and powerful (soft) artificial intelligence, a more heuristic approach to handwriting recognition is still relevant. And works like [11] show that there is an interest in it. However, to understand why handwriting recognition is at all relevant, especially on tablets, we have to look at it from the point of view of a user.

4 For example, a drop of 0.04% when comparing the latest two MNIST database entries in [35] in the category 'convolutional nets'.
5 There is, of course, always the possibility of executing both the training and the classification on a dedicated web server. But even in 2019 the wifi coverage in schools and other public buildings is not universal. And relying on electronic devices being permanently connected to the internet creates many other problems.
6 Users of Detexify can provide their own samples, so common, often-used symbols are recognised much more easily by this program.
1.1.2 User experience
The rise of interactive software created another concept in its wake: user experience. It is usually defined as

how a user feels and reacts to the use of a program

or something similar. See [34] and [37] for some alternatives.7 Some questions that lead to good user experience are whether buttons with similar designs have similar functions, whether often-used tools are positioned high up in a drop-down menu, or whether there is significant input lag when using a peripheral. User experience also encompasses the vital topic of accessibility options for disabled people (and others).8
Seen from another perspective, user experience tries to facilitate optimal flow. This term was coined by the psychologist Mihály Csíkszentmihályi, most famously via his book [10], and it describes a mental state in which a person is completely absorbed in a task. With this in mind, improving user experience can be seen as the effort to reduce obstructions during a working process, allowing the user to work as efficiently as possible. For programs on tablets, in particular, this means making essential elements like buttons and (text) input areas large and easily identifiable (cf. [61]). Moreover, buttons should be at the lower side of the screen, and interface elements that display information should be at the top, as the user will cover the screen with her hand while interacting with various elements.9 As a rule of thumb: the easier a program is to operate, the better.
7 There is also an International Standard defining user experience in this vein for all products. However, the definition given here is enough for our considerations.
8 On a base level, this means a colour palette for colour-blind users or the option to increase the font size.
9 Designing for left- and right-handed users is another challenge. So, arranging screen elements vertically makes it easier overall.

Handwriting recognition now is one significant contribution to immersive work. There are two main reasons for this. Firstly, many programs that utilise the touchscreen need the whole area of the screen. Thus, an on-screen keyboard for text input usually has to pop up whenever needed. As its buttons have to be of a certain size to be usable — preventing the notorious fat-finger problem — the keyboard has to be large and will consequently cover a significant part of the application interface. Secondly, the ability to recognise handwritten or hand-drawn symbols increases the interaction possibilities, especially when control gestures are implemented. This is less of a problem with graphics tablets, as they are usually attached to a "proper" computer that also has a physical keyboard. On tablets, however, this becomes relevant as they are more limited as input devices. The basic input methods are a Home Button (which does not even exist on the iPhone X), volume buttons on the side and a stand-by button (which is also falling more and more out of fashion). "Advanced" input sensors are, for example, 6-axis accelerometers, gyroscopes and cameras. These are powerful tools that allow the use of sophisticated applications. However, they are highly specialised and are rarely used for "menial tasks" like operating an internet browser, a music player or a word processor. This leaves the touchscreen as the main interaction method between user and device.

The basic interaction between a user and the touchscreen is the sequence of Finger Down, Finger Move and Finger Up. Or, in other words, a stroke with a finger (or stylus).
All interaction falls back on these three building blocks: e.g., tapping on a spot on the screen is given by a very short Finger Move phase, and multi-touch gestures employ several of these sequences at the same time. For our goal of analysing handwriting, we assume that these strokes form concrete symbols and characters on the screen. They are characterised by having a definite, abstract meaning, but no unique way of being written. We will talk about this dissonance in Section 2.3.
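The assembly of the three touch phases into strokes can be sketched as a small recorder: Finger Down opens a stroke, Finger Move appends sampled points, Finger Up closes it. This is a minimal illustrative sketch, not the actual ALICE:HWR implementation; all class and method names here are invented for the example.

```python
# Minimal sketch: assembling Finger Down / Move / Up events into strokes.
from dataclasses import dataclass, field

@dataclass
class Stroke:
    # A stroke is just the finite sequence of sampled pen/finger positions.
    points: list[tuple[float, float]] = field(default_factory=list)

class StrokeRecorder:
    def __init__(self) -> None:
        self.strokes: list[Stroke] = []
        self._current: Stroke | None = None

    def finger_down(self, x: float, y: float) -> None:
        # Finger Down opens a new stroke at the touch position.
        self._current = Stroke([(x, y)])

    def finger_move(self, x: float, y: float) -> None:
        # Finger Move appends a sample to the open stroke.
        if self._current is not None:
            self._current.points.append((x, y))

    def finger_up(self, x: float, y: float) -> None:
        # Finger Up closes the stroke and hands it to the recogniser.
        if self._current is not None:
            self._current.points.append((x, y))
            self.strokes.append(self._current)
            self._current = None

# A tap is simply a stroke with a (near-)empty Finger Move phase:
rec = StrokeRecorder()
rec.finger_down(10.0, 10.0)
rec.finger_up(10.0, 10.0)
print(len(rec.strokes), len(rec.strokes[0].points))  # -> 1 2
```

In this picture, a character is one or more such point sequences, which is exactly the on-line representation the rest of the thesis operates on.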
Apart from these fundamental advantages as an interaction method between humans and computers, HWR has much potential in interactive educational software. Flight simulators have shown for years that educational and training programs can be very successful if they are coupled with the right devices and peripherals. And with the advancements in both hardware and software, there is tremendous potential across all subjects and across many areas of life. Some examples of this that already exist are: virtual tours through ancient Egypt; surgery simulations for young doctors without the danger of harming humans (or animals); teaching fractions in 6th grade by cutting and distributing virtual pizzas on an iPad. The primary reason why these applications work is their potential for immersion. But this potential is only fulfilled if working with the interactive software is not hindered by clunky designs (of hardware or software).

Looking specifically at interactive educational software for schools, it is evident that tablets are a very promising tool. From a design point of view, this becomes clear as tablets are basically electronic books and are built around that idea. But even though digital tools in general have been used in schools for a long time now (see for example [65], [54] or [57]), there are not many investigations into tablets specifically (cf. [21]). And as most studies in this area are devised for specific environments and set-ups, it is understandable to look at meta-analyses for deeper insights into the feasibility of tablets in schools (cf. [20] or [21]).
Specifically for handwriting input there are, for example, the works by Aziz et al. (see [2]) and by the group around Read and MacFarlane (see [43] and [39]), who study the design of HWR software to teach preschool and primary school children. In particular, as [44] and [42] show, there are few problems with children using a touchscreen to write per se — especially compared to an ordinary keyboard. However, many, many small issues hinder the user experience. Notably:
— The slow writing speed of children that results in wrong parsing and grouping of strokes into words. (See Section 5.1.1 in [44].)
— Children trying to “fix” letters by adding more strokes and lines; regularly after they have continued to write the next letters. (See Section 5.2.3 in [44].)
The last point is especially interesting as it exemplifies the dissonance between off- and on-line handwriting recognition we already mentioned at the very start: the former focusses more on the image produced, while the latter concentrates on the movement of the pen or finger. When using this second, gesture-focussed approach, adding additional lines is detrimental to the recognition process and has to be taken into account separately.
Due to the relatively small number of research endeavours into the feasibility of tablets, and the fact that the topic is approached from an academic angle, a flaw emerges that can be seen, for example, in the handwriting software used in [44] and [39]: the software itself is rarely optimised for broad use and often lacks polish.
In the next section, we will present the ALICE study which motivated this thesis. It takes full advantage of iPads as a digital learning tool. Moreover, the project was built with both an academic and a practical goal in mind: the two intervention studies conducted explore how effective interactive educational software can be, at least for the topic of fractions. Also, the iBook10 created in the process was not only designed for the studies but as a standalone, ready-made product for school lessons. This was achieved by a diverse team of mathematicians, teachers and programmers11 with a solid footing in educational science. In particular, the two main theories ALICE is based on — Cognitive Load Theory and Embodied Cognition — are closely related to flow and optimal user experience. We will summarise these theories, but only to the extent needed to understand their role in the design of ALICE.

As a last note before talking about ALICE in detail: there may be many more applications for the methods presented here and in other HWR research than just the actual recognition of what was written. Apart from obvious ideas like signature identification, there are even uses in medicine, like diagnosing Parkinson's disease (see [73]). However, here we will proceed with the use as an input method in interactive educational software in mind.
10 Apple re-branded the program iBooks as Apple Books. We will use the term iBook throughout this thesis to denote the file format that can be read via this program on Apple devices.
11 Every member fulfilled at least two of these roles.
1.2 The ALICE project
As discussed in the previous section, computer hard- and software changed and grew rapidly in the second half of the 20th century. Moreover, children are nowadays exposed to technical devices like smartphones from a very young age. Their growing brains allow them to learn many complicated things very fast and, most importantly, in a very natural way — so natural, in fact, that we have a word for them: digital natives. The term was coined by Marc Prensky in his 2001 paper Digital Natives, Digital Immigrants [41], but it can be traced back even further, to John Barlow's A Declaration of the Independence of Cyberspace [4] from 1996.
This contrast between digital natives and digital immigrants — people who grew up with computers and people who learned to work with them later in their lives, respectively — created a substantial problem in the educational system: while most parents, teachers and government officials agree that working with the new technology and media is essential and that we should take the chance to incorporate them in classrooms and curricula, there is no clear plan for how to do this. Two main hurdles are the training of the teachers and the development of adequate software.12 Since many teachers, as well as the authors of school books and employees of publishing houses, are not (yet) intuitively familiar with tablets, the training becomes time-consuming. Moreover, early software fails to utilise computers in general and touchscreens in particular. Notably, many digital schoolbooks are still just scans of ordinary schoolbooks and do not use digital, interactive media to their full potential.
12 Aside from obvious organisational problems like buying computers in class sets.

Here we will now present the research project ALICE:fractions, which explores the theoretical foundations of sensible interactive educational software design, and the appurtenant iBook implementing these ideas. This project tackles the second of the problems mentioned above directly: what can and should educational software for tablets look like, and how should it function? However, it also touches on the first one by taking the feedback of both teachers and pupils into account. The easier such a program is to use, the fewer resources have to be invested in training the teachers, and the fewer reservations exist against using it. Thematically, the project is restricted to fractions — in particular to introducing their fundamental properties. The primary reason for this is that the topic lends itself very well to an enactive approach. Also, it is an area where the performance of students leaves a lot to be desired, so there is room for measurable improvements.
Figure 1.1: Children working with ALICE:fractions.
1.2.1 Overview & design
The project ALICE:fractions was created under the (German) name
Lernen mit dem Tablet-PC: Eine Einführung in das Bruchrechnen für Klasse 6, which roughly translates to
Learning with tablet PCs: An introduction to fractions for grade 6.
As a joint research project between the Heinz Nixdorf Chair for Mathematics Education and the Chair for Geometry and Visualization, funded by the Heinz Nixdorf Foundation, its goal was to develop an interactive iBook on iPads to teach fractions. Three PhD students worked on this project under the supervision of Prof. Dr. Kristina Reiss and Prof. Dr. Dr. Jürgen Richter-Gebert. These students — Frank Reinhold, Stefan Hoch and the author — as well as the two project leads have backgrounds across mathematics, education and software development. Taking advantage of this, the entire content of the resulting iBook was created from the ground up. In particular, the interactive exercises were built using the dynamic geometry software CindyJS (see [13]). The main design goal was to take advantage of the computer environment to react in real time to the behaviour and answers of the students, and the desired outcome was to foster an intuitive and descriptive understanding of fractions. To emphasise these two aspects, the name of the project was changed to ALICE, which is an acronym for
Adaptive Learning in an Interactive Computer-supported Environment.
The addendum fractions is meant to accentuate that this general design idea is independent of the actual topic or subject, and to express the hope of applying the results to other areas.13
13 Topics from geometry and stochastics are especially desirable. They lend themselves well to an interactive learning experience, and various professors at the TUM have expressed their interest in seeing them come to fruition.

Here in this section, we will give a brief overview of the project — explaining the importance of adaptivity and interactivity as well as why the topic of fractions is both relevant and interesting in this context. We will also explain the role of handwriting recognition in this project and how it fits the design of the iBook. This will be a summary of ALICE:fractions intended to provide the practical framework for the work presented in this thesis. For more detailed information, please see the other two (German) dissertations written within this project:
— Mathematikdidaktische und psychologische Perspektiven zur Wirksamkeit von Tablet-PCs bei der Entwicklung des Bruchzahlbegriffs – Eine empirische Studie in Jahrgangsstufe 6 by Frank Reinhold describing the theoretical background of the iBook design and analysing how students improved using it. See [45].
— Prozessdaten aus digitalen Schulbüchern (working title) by Stefan Hoch, analysing the user behaviour and inferring how successful pupils work. See [22].
The two main theories from educational science used in ALICE:fractions are Cognitive Load and Embodied Cognition. These two concepts were developed in the late 20th century and have gained much more significance since the introduction of tablet computers into schools. Here we will give a brief overview of them and describe how they influenced the development of both the ALICE iBook in general and the handwriting recognition software in particular.
The basic idea behind Cognitive Load Theory (see the works of Sweller, starting in 1988 with [62]) is that any form of mental activity uses a form of cognitive capacity and that this capacity is finite. I.e., the more it is used throughout the day, the more laborious mental tasks become. There are many different experiments from psychology illustrating this and similar effects.14 In the context of education, the core insight of Cognitive Load Theory is that there are intrinsic and extraneous factors contributing to how taxing and hard to solve an exercise is. Moreover, these factors stand in the way of the actual learning process. (Cf. [63].)

Intrinsic factors are the parts that constitute the core problem of solving an exercise. E.g., doing mathematical computations is hard in itself, and solving 17 · 39 is much harder than solving 2 · 2. (Of course, barely anyone "solves" 2 · 2; it is usually recalled from memory.) The relevance of considering these intrinsic factors when designing educational environments lies, among other things, in the question whether solving many relatively easy exercises is more beneficial than solving fewer hard ones.

Extraneous factors are the elements outside the actual task that nevertheless contribute to its difficulty. They include things like the presence or absence and the quality of a sketch; the font style, colour, size and length of the instructional text; and the complexity of the words used. However, there are also external factors in the environment like loud (unpleasant) noises and bad lighting.

There is a third category, germane cognitive load, which describes the work someone puts into solving a problem by creating and using mental schemata. This is the desired focus of mental work while learning. I.e., the primary goal of designing exercises, explanatory texts, diagrams, etc. is to reduce intrinsic and extraneous factors in order to facilitate this process.

14 One interesting fact, for example, is how every form of mental strain can affect every other mental effort. This is illustrated by a famous experiment by Baumeister, Bratslavsky, Muraven and Tice from 1998, see [5]: participants waited in a room with two plates, of cookies and radishes, respectively. One group was told that they had to eat the radishes, while the other group could do and eat whatever they wanted. In a relatively demanding puzzle afterwards, people from group one performed significantly worse than those in group two.
The second building block of ALICE is Embodied Cognition. Its idea is that performing appropriate actions with one’s hands (or body) accompanying mental work can improve the degree and speed of understanding (and that performing inapt actions hinders it). An exact definition of embodied cognition is ambiguous15; the design of ALICE revolves around simple embodiment as introduced by Clark on page 348 of [9]:
[Simple embodiment] concentrates attention on an inner representational resource ... and is exploring the ways in which usefulness in the guidance of real-world action can both constrain and inform the nature of inner representations and processing.
We will return to Embodied Cognition after quickly presenting why the topic of fractions was chosen to be the focus of ALICE.
15 See [45] for references to eight different versions.
Studies like [3] show that there are connections between an understanding of fractions and more complex topics.16 And the extent of the inability to work efficiently with fractions is not limited to schoolchildren — it can also be found in adults. Moreover, there are various psychological effects which are at least partially explicable by this lack of an intuitive understanding of fractions: For example, Kimihiko Yamagishi has shown in [70] that a fictitious disease with a mortality rate of 1286 persons out of 10000 is perceived as worse than another one with a mortality rate of 24.14 persons out of 100. The reason why fractions are so hard to grasp is an aggregation and combination of many different conceptual changes, which emerge when going from integers to rational numbers, and the subsequent mishandling of fractions.17 Concepts that have to change when learning fractions are, for example, the idea of a unique successor for rational numbers, that a single number always has a unique representation and that multiplication makes numbers bigger. Additionally, fractions fulfil many different roles: as ratios, quotients, operators, portions, etc. And all of these are different conceptions in the mind of children. (Cf. [40].) See [45] for a detailed discussion on how and why fractions are complicated to understand when first introduced in grade 6.
Fractions and fractional calculus are decidedly descriptive and geometric concepts. Introducing them as mere notation without adequately explaining what the names ‘numerator’ and ‘denominator’ mean leads to pupils seeing them as a pair of “normal” numbers without any significant relation. The rules for expanding/reducing, comparing, addition and multiplication then have no reference to real objects and are perceived as arbitrary. Here Embodied Cognition comes into play: a central focus of the iBook design are the interactive widgets — both for exploratory tasks at the start of each chapter and for ordinary exercises — which predominantly show fractions in iconic representations. Manipulating them “directly” with a finger movement means that abstract operations and concepts — like expanding/reducing
16 Of course, this is not entirely surprising. Many “higher” topics like solving linear equations and defining the derivative of a function use the knowledge of fractions directly. 17 The study by Yamagishi illustrates one of these mistakes: denominator neglect. (A term attributed to Paul Slovic.) People often compare fractions by merely comparing the numerators.
or comparing — are linked to “real” ones. The hope behind this design is that the same graphical depictions and manipulations come to mind when the pupils have to solve more classical arithmetic tasks. (And the results of the studies conducted suggest that this is the case.) With all these ideas in mind, the iBook designed and built during the ALICE project took a certain form. First, all explanations of new concepts are opened by an exploratory task using iconic representations of fractions. Second, a majority of the interactive exercises also make use of iconic representations and allow the pupils to interact as directly as possible with fractions.
The goal of making as many things in the iBook “touchable” as possible now suggests the use of a handwriting recognition algorithm.
1.2.2 ALICE:HWR
As mentioned above in Section 1.1.2, being able to use handwritten words, sketches and gestures directly as input improves the user experience of software that focuses on art and text processing. It makes the touchscreen feel more versatile than it actually is.18 And in the ALICE iBook, we focus on manual manipulation of “real” and iconic depictions of fractions. Moreover, a work-in-progress feature for a future version of the iBook is a scribble mode. With it, a user can superimpose a semi-transparent layer on which they can write and draw with a small assortment of (digital) pens. As already alluded to in Section 1.1.2, it would break the immersion if an on-screen keyboard popped up whenever the input of numbers is requested. In tasks that require the pupils to enter running-text answers, such a keyboard is unavoidable. However, these tasks are scarce and mostly serve to record the pupils’ thoughts and allow for a discussion with the teacher afterwards.
Enhancing the benefits of iconic fraction representations via handwriting recognition becomes much more perceptible when sketch recognition is included. This was contemplated during the development of some interactive widgets.
18 For a similar reason, the buttons in the ALICE iBook are drawn and animated in a way that mimics three-dimensionality, and in pizza-cutting tasks a hand or knife is shown at the finger-/pen-down point on the screen.
However, it was not included due to the ambiguity of fraction representations and the problem of evaluating them. In Figure 1.2 we see examples of both these problems. On top is a typical error by someone who has just learned about fractions. When the task is to subdivide a circle into three equal parts, we will often see this picture: A diameter (most often the horizontal one) is divided into three equal parts, and then lines perpendicular to it are drawn to divide the circle. Of course, this is not correct. However, the correct solution of subdividing a circle by two parallel lines is so close to this wrong one that it is nigh impossible to assert the falseness of such a solution with high confidence. At least not when the user cannot provide explanations for their answer. An alternative is to label this kind of solution as wrong, regardless of the position of the parallel lines, and only permit lines through the centre of the circle. But, of course, it is very bad practice to condemn creative solutions just because they do not fit the established mould.
Figure 1.2: A typical error in subdividing a circle (top) and unusual graphical expanding by 2 (bottom).
The second problem can be observed at the bottom of Figure 1.2. A common way to explain expanding and reducing fractions is via refining and coarsening of an iconic depiction of a fraction. As Widget/Exercise 39 in the ALICE iBook shows, this can be done for reducing in a computer environment. However, it is hard to implement intuitive controls for such an exercise and to accurately classify mistakes. In a potential expanding widget we would have to decide which input by the pupils is considered correct. Is every correct subdivision of each rectangle eligible? If not, should the lines that can be added be limited? If yes, how can we ensure that the student did not find the correct solution by guessing? Also, if the answer options are limited, is there even a benefit in doing this exercise? And judging the correctness of a refinement becomes even more complicated when the pupils use non-straight lines.
These two small examples illustrate that automatic recognition of written and drawn symbols and shapes, while useful in general, is not applicable in every situation — in particular when the interpretation of the input is unclear. Therefore, it is used solely for numbers in the ALICE iBook, and we want to keep that as the prime example/application in mind. Practically, they are limited to the range 1, ..., 99 as the iBook is aimed at sixth graders and focusses on introducing fractions as a new concept. But theoretically, any integer is possible. As a guideline for the design of the HWR algorithm used in ALICE — which we will call ALICE:HWR from now on — we present some criteria that were decided upon at the beginning of the project. The algorithm should not depend on an extensive training dataset for two reasons: Firstly, children are notorious for their bad handwriting, and a sub-goal was to foster the proper ways of writing numbers. And since ALICE:HWR is a stroke-based, on-line classifier, it focuses more on the overall movement of the finger or pen tip than on the picture created on the touchscreen. So, this sub-goal is compatible with the general approach of the software. Secondly, it became clear in the early stages of ALICE that we want to “tell” the algorithm what certain strokes look like instead of just “showing” it many samples. This way, it is easier to generalise the algorithm and adapt it to other character sets. Speaking of generalisation: with a future application of the basic ideas of ALICE to other areas of mathematics or even other subjects in mind, the handwriting recognition algorithm is built universally. It should be easily adaptable to (block) letters, simple geometric forms, symbols from electrical engineering and also control gestures. Because of that, the actual classifying step in
ALICE:HWR is very simple and uses, decidedly, no advanced methods from machine learning. We revisit these design criteria in Section 1.3. And the actual classification steps of ALICE:HWR will be presented in Section 2.3, Chapter 4 and Section 6.2.
1.2.3 Results of the studies
As a last part of this overview of ALICE:fractions we will give a summary of the studies conducted to test the theoretical hypotheses and practical implementations. We will only give the main results here. A complete overview can be gained from the other two dissertations associated with the project, [45] and [22], as well as the following articles:
— General information and overviews can be found in [23], [31], [46], [48], [49], [50] and [53].
— Analyses of process data like finger movements, time on task and differences in visualisation in circle and bar diagrams are presented in [24], [28] and [30].
— The effect of explaining different strategies to compare fractions is discussed in [47] and [52].
— The influence of the demographic factors gender and school type is presented in [29] and [51].
Now, some details about the two studies that were conducted: the first in 2016 at German Gymnasien (higher-level high schools) and the second in 2017 at German Mittelschulen (lower-level high schools). Both studies had the same set-up. In particular, the participating classes were divided into three groups:
1. A first intervention group in which teachers and pupils worked with the iBook.
2. A second intervention group in which teachers and pupils worked with a printed workbook version of the iBook.
3. A control group in which teachers worked as they always did.
Figure 1.3: The structure of the ALICE studies.
The original German iBook, an English version of it and the German printed workbook can be found online; see [25], [27] and [26], respectively. The introduction of the second intervention group aimed at separating the effects of the general design of the teaching material from the additional benefits of the tablet-based environment. Layout and design of the printed workbook were identical to the iBook. But for interactive exercises, only a finite number of subtasks was printed. Notable characteristics of both studies were the following:
— The scope of the ALICE iBook is the introduction of fractions. It covers the following areas of comprehension: understanding fractions both as parts of a whole and as parts of many; expanding and reducing; the number line; mixed fractions and comparing fractions by size. The arithmetic of fractions was omitted. First, because of time constraints on the development phases: producing the remaining material would have taken more than a full year if it were to be of the same quality as the introductory part. So, the team decided on improving this introduction based on the results of the first study and on conducting another one at a different school type. Second, because the introduction can rely more heavily on iconic and graphical depictions — whereas the communication of “complicated” operations like the division of fractions cannot avoid imposing certain abstract rules.
— The layout and design of the iBook are the same as those of a regular schoolbook. In particular, it contains motivations, explanations, take-home messages as well as exercises. It can be used together with reading assignments, group work and all other teaching methods.
— The teachers in the first intervention group were asked to work as much as possible with the iBook. However, they were encouraged to use additional material if desired.
— The pupils were tested before and after the intervention. The pre-test was conducted to correct the post-test score for prior knowledge of the children — which they might have gained from being taught at home or from having had to repeat the sixth grade. In order to guarantee the reliability of the post-test, the teachers were told the areas of skill that had to be covered. Beyond that, they were given plenty of rope to arrange the content of the lessons.
— The iPads for the study were provided by the Technical University of Munich. Because of that, the pupils were not allowed to take them home. This in particular is a point of enquiry for future studies, since using interactive educational software for homework and revision should have a much bigger effect on the level of comprehension and skill than using it only at school.
In total, 1108 pupils participated in the studies. Unfortunately, some were sick during either the pre- or post-test and some did not get permission from their parents to take these tests. Moreover, several classes exceeded the scheduled 15 lessons and had to be excluded from the evaluation.
— 29 classes with a total of 808 children participated in the first study at Gymnasien, but only the results of 476 were evaluated. Of these, 156 pupils were in intervention group one, 182 in group two and 138 in the control group.
— 16 classes with a total of 300 children participated in the second study at Mittelschulen, but only the results of 236 were evaluated. Of these, 105 pupils were in intervention group one, 64 in group two and 67 in the control group.
The post-tests of both studies tested the knowledge of the pupils in the areas of comprehension mentioned above. Moreover, every test item was categorised in one of three areas of skill:
— Working with and understanding of graphical representations.
— Applying arithmetic rules.
— Explaining mathematical statements and situations.
That means that the task

Draw 3/4 of 4 pizzas

falls into the first category, while

Compute 3/4 · 4

falls into the second and

Explain why the following picture shows 3/4 of 4 pizzas

falls into the third. As the iBook focusses much more on the first skill both in conveying new content and in practice, it stands to reason that pupils might be less able to perform arithmetic calculations. Fortunately, as our results show, this is not the case. Moreover, only two out of 21 items fell into the third category. They were, for the most part, evaluated only qualitatively. The mean solution rates in the post-test, split between the first two areas of skill, are shown in Figures 1.4 and 1.5. The results of the pre-test at Mittelschulen were so low that the previous knowledge of these pupils can be considered non-existent. These scores are omitted in Figure 1.5.
Figure 1.4: Mean solution rates, together with their 95% confidence intervals, at Gymnasien.
At Gymnasien, both intervention groups were significantly better than the control group. This also holds when we only look at visualisation items. And when evaluating arithmetic items, all three groups performed approximately equally. I.e., even though the iBook focuses on graphical representations, pupils still improved in calculation tasks just as much as “traditionally” taught children. At Mittelschulen there is a much bigger difference between the first intervention group, which worked with iPads, and the others. At the same time, the second intervention group and the control group showed no significant differences.
Figure 1.5: Mean solution rates, together with their 95% confidence intervals, at Mittelschulen.
Overall, a potential explanation for the results might be that for higher-achieving students at Gymnasien the interactivity of the iPads itself is less important than the overall design and presentation. In contrast to that, children at Mittelschulen profited much more from working with iPads. Apart from these solution rates, much more data was collected and analysed. For example, when comparing the fractions 5/8 and 5/10, a large portion of students in both control groups expanded them to have the same denominator. In the intervention groups, however, this strategy was used much less. Another example is the fact that the total time spent on a task is a much better predictor for the performance in a test than the number of individual exercises worked on.
These and other results are discussed in great detail in the aforementioned dissertations and articles.
After this overview we want to shift our attention to actual handwriting recognition. We start by discussing why it is a problem at all.
1.3 The problem of handwriting recognition
When there is a choice, on-line handwriting recognition has the potential to be better simply because it allows more data to be analysed: at least the order of the recorded points, and at best the exact times at which they were recorded together with additional parameters of the pen like the pressure on the drawing surface or the azimuth of the pen19. The rapid advancement in machine learning in recent years, however, makes these differences minuscule. The newest method (as of January 2019) found on the website of the MNIST database of handwritten digits (see [35]) is by Cireşan, Meier and Schmidhuber [8], and they achieve an error rate of 0.23% on the test data set provided by the database. They use deep neural networks, which have shown time and time again that they are incredibly effective at pattern recognition of any kind. The central reservation against the use of neural networks comes from the need to have an extensive training set and to hand-craft their exact architecture.20 The last point, in particular, means that neural networks used for different character sets and areas of application can look vastly different.
In contrast, classical on-line handwriting recognition algorithms are highly heuristic and descriptive. For example, one can decompose a recorded stroke into smaller parts which are relatively easy to recognise, as done in [7]. A more common idea is to compute specific geometric properties of a stroke, like its length, its local curvature at every point, or point-spread information such as whether more points lie in the upper half of the stroke than in the lower. Then, after a certain number of these features has been measured, they are compared and classified via well-known methods like support vector machines and k-nearest-neighbour algorithms. In order to find or build appropriate features for such an on-line method, we have to talk about the ambiguity of strokes first.
19 I.e., the angle of the pen when projected onto the drawing surface. 20 There are, however, some approaches, like NeuroEvolution of Augmenting Topologies, to adjust the architecture of a neural network. See [60].
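To make the notion of hand-crafted stroke features concrete, here is a minimal sketch in Python. The function names (`stroke_length`, `upper_half_ratio`) are illustrative and not taken from ALICE:HWR; a stroke is modelled, as above, as an ordered list of (x, y) points, and each feature reduces it to a single number that could then be fed into a classifier such as k-nearest neighbours.

```python
import math

def stroke_length(points):
    """Total arc length of a stroke: the sum of the distances
    between consecutive sample points."""
    return sum(math.dist(points[i], points[i + 1])
               for i in range(len(points) - 1))

def upper_half_ratio(points):
    """Point-spread feature: the fraction of sample points lying in
    the upper half of the stroke's bounding box (assuming the
    y-coordinate grows upward; on raw touchscreen coordinates,
    which usually grow downward, the roles are reversed)."""
    ys = [y for _, y in points]
    mid = (min(ys) + max(ys)) / 2
    return sum(1 for y in ys if y > mid) / len(ys)
```

For instance, the two-point stroke `[(0, 0), (3, 4)]` has length 5, and a vertical stroke with one point above the bounding-box midline gets an `upper_half_ratio` of 1/3. A feature vector for classification is then simply the tuple of such numbers.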
1.3.1 Symbols as abstract pictures
In order to understand how we can describe hand-written symbols, we first illustrate the core problem with this particular kind of data. In chapter 2 of Understanding Comics – The Invisible Art (see [36]), Scott McCloud states the problem of abstract symbols the following way:
In the non-pictorial icons, meaning is fixed and absolute. Their appearance doesn’t affect their meaning because they represent invisible ideas.
This means that any visual resemblance to an established symbol will make this symbol recognisable. Figure 1.6 below shows a small collection of ways to write the symbol 3 by combining strokes and lines in extremely different ways. In all of them, the number 3 is easy to recognise.21
Figure 1.6: Different iconic representations of the number symbol 3. Inspired by panel 1 on page 28 in Understanding Comics – The Invisible Art [36].
The problem herein is that the inherent 3-ness of a stroke or a collection of strokes cannot be measured in a deterministic way. Moreover, the semantic meaning of such a picture might depend on the surrounding characters. E.g., the word test usually makes more sense than te5t. Keeping in mind that we are primarily interested in on-line handwriting recognition used for a school
21 Of course, this is massively intensified by the fact that the information that these pictures represent a 3 was given beforehand. Any reader is therefore primed to recognise it.
book, it is clear that the 3 at the top left in Figure 1.6 is what we want to regard as valid.
The goal now is to find geometric and graphic descriptions of what constitutes a stroke like this 3 at the top left. As explained in the sections above, this allows us to better comprehend any subsequent classification process and to control the algorithm more directly. Furthermore, it allows for a potentially automatic explainer: as we aim the application at young students, it is desirable to have future iterations of ALICE:HWR that can explain to them directly why something was not recognised. The procedure of finding explicit features, however, is just as ambiguous as the symbols themselves. In particular, we will see that certain properties are very easy to recognise and that — at least for the Arabic numerals — they are often enough to make a decision. But other features will behave more unpredictably. For example, we will see at various points that recognising straight line segments is very easy and that these particular strokes can be considered the simplest ones. That makes it enticing to create recognition software for Chinese characters, as they decompose quite easily into straight lines. And while [7] illustrates that Fuzzy Logic alone is good enough for the recognition process, there are numerous design challenges for a program used in practice; some of which are explored in [16]. On the other hand, it will turn out that distinguishing between 0 and 8 is relatively hard in contrast. A 0 forms, more or less, a circle and is as such characterised by a constant curvature. The feature to measure this (see Section 2.3.3), however, will be unstable and unreliable due to significant variations in the ways people write 0’s and 8’s. So, we combined this feature with others in ALICE:HWR.
However, we will not focus too much on how this problem of unclear strokes and features was tackled in praxis. It is mentioned to some extent in Section 2.3.3 and Chapter 6. The major part of this thesis will deal with more general ideas that can be applied to any form of stroke recognition.
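The contrast between easy and hard features can be illustrated with a small sketch (the function names are illustrative, not the features of Section 2.3.3): straightness, measured as the ratio of the endpoint distance to the arc length, is 1 exactly for straight line segments and robustly smaller for anything curved, while curvature-like information, here the turning angle at every interior point, fluctuates strongly between individual handwritten 0's and 8's.

```python
import math

def straightness(points):
    """Endpoint distance divided by arc length: exactly 1.0 for a
    straight stroke, strictly smaller for any curved one."""
    arc = sum(math.dist(points[i], points[i + 1])
              for i in range(len(points) - 1))
    if arc == 0:
        return 1.0
    return math.dist(points[0], points[-1]) / arc

def turning_angles(points):
    """Signed turning angle at each interior point. A 0-like closed
    loop ideally has angles of roughly constant sign and magnitude;
    an 8 changes the sign of its turning somewhere in the middle."""
    angles = []
    for a, b, c in zip(points, points[1:], points[2:]):
        v1 = (b[0] - a[0], b[1] - a[1])
        v2 = (c[0] - b[0], c[1] - b[1])
        angles.append(math.atan2(v1[0] * v2[1] - v1[1] * v2[0],
                                 v1[0] * v2[0] + v1[1] * v2[1]))
    return angles
```

The instability mentioned above shows up directly in `turning_angles`: small wobbles of the pen flip individual signs, which is why such a feature needs to be combined with others before a 0/8 decision can be made reliably.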
1.3.2 Leading questions
The main source for ALICE:HWR is Delaye and Anquetil’s 2013 paper [11] on a broad and universal feature set for on-line handwriting recognition. They showed that the actual classification method used is of lesser importance when it is based on a good and thorough model of strokes. They surveyed various works on handwriting recognition and compiled 49 features. But no matter how good these are, they are still hand-crafted and cherry-picked. Here in this thesis, we want to approach this with a little more generality in mind. The reason for this is a potential transfer of the ideas and results presented here to other areas and projects — e.g. other interactive schoolbooks under the ALICE label or a geometric sketch recognition plug-in for the dynamic geometry software CindyJS. This means that we care less about the actual process of finding features for the symbol set in question and focus on an analysis of the objects we want to classify.
As stated above, we are concerned with on-line handwriting recognition both here in this thesis and in ALICE:HWR. That means that the strokes we record are sequences of points with two coordinates that model the precise finger/pen movement of the user. They form the underlying object for all subsequent analyses and discussions. We will see that the set of all strokes (trivially) forms an affine space. So, the basic geometry of the set of all strokes is very simple. However, we are more interested in the connection between the semantics of a written symbol and its realisation as a specific stroke. So, the fundamental question of this thesis is
(1.) How do strokes of similar shape or meaning relate to each other?
Taking a cue from Felix Klein’s Erlangen program, we will focus a large part of this consideration on
(2.) How can strokes be deformed while retaining their meaning?
Finally, we want to keep the requirements of ALICE:HWR in mind. For applications like the ALICE project, it is desirable to have a recognition algorithm that can “tell” the user what went wrong in writing/drawing something. In particular, when the user is a schoolchild who is still learning. This leads to the practical questions
(3.) Can a recognition software be trained by only a small selection of good samples?
(4.) Can the classification itself be done such that its decision making is transparent and explainable?
As mentioned in Section 1.3.1, a large part of answering the last question is to find properties of strokes that have concrete meaning. In this thesis, however, we will, for the most part, assume that this is given and focus a bit more on the classification process itself. Nevertheless, we will always work with concrete and expressive properties.
Question (1.) will permeate every part of this thesis, but will be approached more directly in Chapters 2 and 5. Question (2.) is the central focus of Chapter 3; and that chapter will also answer Question (3.). Finally, Question (4.) will be the topic of Chapter 4. Before we start, we will describe the structure of this thesis more linearly and fix some general notation.
1.4 Structure & notation
1.4.1 Structure
In this thesis, we make various statements both about the general structure of strokes and features and about the way they are used in ALICE:HWR. The goal will always be to explain how and why ALICE:HWR works as well as the more abstract mathematical background. The focus might shift between sections, but both aspects should always be kept in mind. That means, in particular, that some statements will be presented more matter-of-factly if they are more concerned with the practical implementation. And some points will be analysed more in-depth if we expect broader usefulness. Moreover, we will make some implicit assumptions based on the concrete character set we want to classify in ALICE:HWR and the good will of the user. For example:
— Strokes that form Arabic numerals are usually of similar length and therefore we will discretise all of them with the same number of points.
— We assume that the finger/pen will not leave the touchscreen when a single stroke is written. I.e., we assume we do not have to concatenate two or more strokes together before we can analyse them.
— We assume that every number is written the proper way that is taught in German (or rather Bavarian) elementary school. In particular, we will work with only a small set of possible stroke types from which we assemble all numbers.
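The first assumption above, that every stroke is discretised with the same number of points, can be realised by resampling along the arc length. The following is a sketch of one standard way to do this, not the ALICE:HWR implementation: walk along the recorded polyline and place n new points at equal arc-length intervals by linear interpolation.

```python
import math

def resample(points, n):
    """Resample a stroke (list of (x, y) points) to exactly n points,
    equally spaced along its arc length, by linear interpolation."""
    if len(points) < 2 or n < 2:
        return list(points)
    seg = [math.dist(points[i], points[i + 1])
           for i in range(len(points) - 1)]
    total = sum(seg)
    if total == 0:  # degenerate stroke: all points coincide
        return [points[0]] * n
    result = [points[0]]
    step = total / (n - 1)
    dist_along, i = 0.0, 0
    for k in range(1, n - 1):
        target = k * step
        # advance to the segment containing the target arc length
        while dist_along + seg[i] < target:
            dist_along += seg[i]
            i += 1
        t = (target - dist_along) / seg[i]
        x = points[i][0] + t * (points[i + 1][0] - points[i][0])
        y = points[i][1] + t * (points[i + 1][1] - points[i][1])
        result.append((x, y))
    result.append(points[-1])
    return result
```

After this step, every stroke is a point in the same finite-dimensional space regardless of how fast or slowly it was written, which is what makes the affine-space view of strokes mentioned in Section 1.3.2 usable in practice.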
Some of these points will be repeated when they become relevant. The same holds true for the examples throughout this thesis: Some will present very concrete applications, some will be more general.
The order of the chapters will loosely follow the development process of ALICE:HWR. And while Chapter 2 contains the foundation for everything, the other chapters are more or less independent of each other. Their topics are the following:
In Chapter 2 we give a short, self-contained introduction to the three main mathematical theories used to build ALICE:HWR: Projective Geometry, Fuzzy Logic and Formal Concept Analysis. Afterwards we discuss the base model used for strokes and how strokes can be described by features. Lastly, we give an informal overview of how ALICE:HWR works and describe all features it uses.

In Chapter 3 we discuss how strokes can be deformed in different ways. These deformations will all be motivated by concrete visual incentives and, in one way or another, preserve the appearance of the strokes. We will also present applications both for ALICE:HWR and otherwise.

Chapter 4 then contains the three main methods ALICE:HWR uses to actually classify strokes.

In Chapter 5 we introduce a new way to model and describe strokes — in particular their shape. We also show that most transformations introduced in Chapter 3 (which were based on practical considerations) operate in a reasonable way on this new model. Lastly, we discuss how these descriptions allow for similar descriptive analyses as hand-crafted features do.

In Chapter 6 we come back to ALICE:HWR and describe in detail how it uses the ideas from the previous chapter. Moreover, we explain a few points that are essential for the software to work, but which were not introduced beforehand. Also, we give an assessment of the recognition rate ALICE:HWR achieves.

We close in Chapter 7 by presenting a selection of questions and problems, based on the results of this thesis, that should be tackled next.

The appendices then contain the manual for the companion iBook [67] for this thesis, which provides an interactive demonstration of many concepts presented. Moreover, the complete code for ALICE:HWR version 4 is laid out. This version is the one that is currently (as of January 2019) used in the ALICE iBook [27].
1.4.2 Notation
Throughout this thesis, we will talk about many concepts which have different meanings in our mathematical framework compared to colloquial English. To emphasise this difference, we will use different fonts at three specific occurrences:
First, when we talk about specific symbols written or drawn on a touch surface (and which we want to recognise afterwards), we write their names in typewriter font. E.g., when we talk about the mathematical object
C_{p,r} := { x ∈ R^2 | ‖x − p‖_2 = r }

of all points which have a fixed distance to a fixed point, we will call it a circle. However, when a circle is drawn on a touchscreen, we will call it a circle. Likewise, we will denote the number of elements of the set {a, b, c, d, e} by 5 or by five; but the S-like symbol representing this number will be called 5 or five. Second, when we talk about specific geometric properties of lines and symbols drawn on a touch surface, we will write their names in small capitals. E.g., we might talk about the length of a vector (i.e. its Euclidean norm) as a mathematical concept or about the Length of a stroke on a touchscreen as one of its specific geometric features. Third, we will give heuristic rules in italic and centred text. E.g., we might say that
all digits from 0 to 9 consist of at most two strokes.
Fourth, we write a ♦ at the right side of the page after the end of definitions, remarks and examples in order to clarify where “ordinary” prose starts again.

Fifth, some mathematical notation:
— On $\mathbb{R}$ we denote rounding down to the next integer by $\lfloor\cdot\rfloor$, rounding up by $\lceil\cdot\rceil$ and “ordinary” rounding by $\lfloor\cdot\rceil$.
— We denote the n × n unit matrix (over any ring) by In.
— We denote the power set of a set X by P(X).
— When we talk about subsets we will use the symbols ⊂ and ⊆ interchangeably — both will allow for the subset to be equal to the whole set.
— We abbreviate the zero vector and the all-one vector in $\mathbb{R}^m$ by
$$\mathbf{0} = (0, 0, \ldots, 0)^T \quad\text{and}\quad \mathbf{1} = (1, 1, \ldots, 1)^T.$$
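As a small illustration (not part of the thesis code), the three rounding conventions from the list above can be sketched in Python; here we take $\lfloor\cdot\rceil$ to mean $\lfloor x + \frac{1}{2}\rfloor$, which is one common convention and an assumption on our part.

```python
import math

def round_down(x):
    """⌊x⌋: round down to the next integer."""
    return math.floor(x)

def round_up(x):
    """⌈x⌉: round up to the next integer."""
    return math.ceil(x)

def round_nearest(x):
    """⌊x⌉: "ordinary" rounding, taken here as ⌊x + 1/2⌋ (an assumption)."""
    return math.floor(x + 0.5)
```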
Figure 1.7: Depicting strokes qualitatively and quantitatively.
Sixth, when we discuss strokes written on a touchscreen, we will usually do this either qualitatively or quantitatively. In the first case, we are more interested in the image drawn or created on the screen, which is the object the user relates to while writing. In the second case, we are interested in the underlying mathematical object which is used for the subsequent analysis. When we draw/visualise a stroke in the first, qualitative situation, we will do this via a brush stroke (created in the graphics software Krita). In the second, quantitative one, we will talk about a stroke as an ordered list of points. So we will show them as small circles representing these points, connected by line segments indicating both the order of the points and the picture on the screen when the stroke is properly rendered.²² See Figure 1.7. When we draw strokes and deform or change them in any way, the original stroke will be blue while the alterations will be orange.
²²In general, we will not indicate which point is the first or the last. So the writing order is ambiguous; but this ambiguity is negligible in the context of these images.

2 A mathematical model of handwriting
... if you want to talk about a person walking, you have to also describe the floor, because people don’t just dangle their legs around in empty space. — Alan Watts, Out of your Mind
This chapter deals with the general set-up of this thesis. As alluded to in Section 1.3, handwriting recognition is, in its essence, a purely practical problem. In the following chapters, however, we will explore mathematical properties of various aspects of strokes and, therefore, we will assume specific situations. So, the ideas and results here might not be transferable one-to-one to an actual HWR program.
2.1 Mathematical fundamentals
To start, we will give a self-contained introduction to Projective Geometry, Fuzzy Logic and Formal Concept Analysis. Almost all of this information can be found in [55], [33] and [15], respectively, but we will include it here explicitly since we will use it to various degrees in the subsequent chapters. Readers who are familiar with the notion of
— homogeneous coordinates (used for the general geometric set-up),
— fuzzy sets (used as a valuation function/likelihood predictor for various statements) and
— contexts, concepts and attribute exploration (used to find decision rules for the classifier)

should feel free to skip the next few pages and start directly with Section 2.2 about the basic model of handwriting recognition.
2.1.1 Projective Geometry
There are many different entry points into Projective Geometry. For example, from a purely algebraic point of view, projective spaces are sets of one-dimensional subspaces in a linear vector space over any field. Here, however, we want a more geometric and descriptive approach.
First of all, for handwriting recognition we are only interested in the real plane $\mathbb{R}^2$ — here called the drawing plane — since this models the touchscreen we write on. Then we ask the practical question: How can we determine the intersection of two lines without having to consider the separate case of the lines being parallel? The answer to this is simple: We add “points at infinity” which are then precisely the intersection points of parallel lines. Again, we could make this construction on a conceptual level by adding an abstract point for every bundle of parallel lines to the plane and then explain what the geometry of this new plane-like object looks like. This is done, for example, in Foundations Of Geometry. Here, we choose a more concrete way instead.
Imagine the plane R2 embedded into R3 as any affine plane which does not contain the origin. In particular, we can look at the embedding
$$(x, y) \mapsto \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}.$$
I.e., the plane we embed to is $z = 1$, and we call this the standard embedding, which we will use from now on. The act of converting 2D-coordinates to 3D-vectors is called homogenisation. The crucial part now is that we identify all scalar multiples $\lambda P$, $\lambda \in \mathbb{R}^\times$, of this vector $P$ with the same point. I.e. we represent any point in the embedded plane by the line running through it and the origin. We can retrieve the original Euclidean point simply via de-homogenisation
$$\begin{pmatrix} x \\ y \\ z \end{pmatrix} \mapsto \begin{pmatrix} x/z \\ y/z \\ 1 \end{pmatrix}, \quad\text{i.e., the Euclidean point } (x/z,\, y/z).$$
This is illustrated in Figure 2.1.
Figure 2.1: An embedding of the projective plane.
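As a minimal sketch (not taken from the ALICE:HWR code), homogenisation and de-homogenisation in the standard embedding look as follows:

```python
def homogenise(p):
    """Lift a Euclidean point (x, y) to the standard embedding z = 1."""
    x, y = p
    return (x, y, 1.0)

def dehomogenise(P):
    """Recover the Euclidean point from a representative (x, y, z), z != 0."""
    x, y, z = P
    if z == 0:
        raise ValueError("(x, y, 0) is a point at infinity")
    return (x / z, y / z)
```

Any non-zero scalar multiple of a representative de-homogenises to the same Euclidean point, reflecting that representatives are only defined up to scale.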
In particular, this means that every line through the origin in $\mathbb{R}^3$ that intersects the embedded plane represents a point in our original drawing plane. This leaves the question: What do lines parallel to this plane represent? These are precisely the points at infinity we talked about before. Lines in $\mathbb{R}^3$ given by $(x, y, 0)^T$ (or any non-zero multiple of it) can be imagined to intersect the plane $z = 1$ in a point which lies in direction $(x, y)$, but infinitely far away. This can be demonstrated by considering the vector $(tx, ty, 1)^T$ for a $t \in \mathbb{R}\setminus\{0\}$. By definition it represents the same point as $(x, y, \frac{1}{t})^T$. On the one hand, when $t$ approaches infinity this vector converges towards $(x, y, 0)^T$ and, on the other hand, the associated point $(tx, ty)$ in the drawing plane moves towards infinity in direction $(x, y)$. Combining the above ideas, we can consider the set
$$\mathcal{P}_{\mathbb{R}} = \frac{\mathbb{R}^3\setminus\{0\}}{\mathbb{R}^\times}$$
of all vectors in $\mathbb{R}^3$ modulo scalar multiples. The elements of $\mathcal{P}_{\mathbb{R}}$ are equivalence classes
$$[P] = \{\, \lambda P \mid \lambda \in \mathbb{R}^\times \,\},$$
however, we will usually omit the square brackets and only write the representatives. If the last coordinate is zero, it represents a point at infinity, and if it is non-zero, it represents a finite point from our original drawing plane. Note that most constructions below utilise explicit representatives and, therefore, we have to check whether they are unaffected by re-scaling of these representatives.
Next, we do something similar for lines: Every line in $\mathbb{R}^2$ is given by an equation of the form
$$ax + by + c = 0.$$
We associate the non-zero vector $(a, b, c)^T$ with it.¹ Scaling the equation by a non-zero real number does not change its solution, so we can, again, represent the
¹It is better and more sound to use elements of $\mathrm{Hom}(\mathbb{R}^3, \mathbb{R})$ or at least row vectors here. But the description we give here will suffice.
line by any scalar multiple of the vector $(a, b, c)^T$.

In $\mathbb{R}^3$, we can consider the plane spanned by the origin and such a line within the embedding plane $z = 1$. Then, $(a, b, c)^T$ is just one of its normal vectors. And, the other way around, any such vector $\neq 0$ represents its normal plane, which intersects the embedding plane, giving a line in the original drawing plane. The only vector whose plane does not intersect the embedding plane is $(0, 0, 1)^T$, and we call the line it represents the line at infinity. With a similar argument as above for the points, $(0, 0, 1)^T$ is the normal vector of the $x,y$-coordinate plane, which is parallel to $z = 1$, and hence they do intersect “infinitely far away”. This leads to the set
$$\mathcal{L}_{\mathbb{R}} = \frac{\mathbb{R}^3\setminus\{0\}}{\mathbb{R}^\times}$$
representing all lines, in which all elements represent finite lines from our drawing plane except for $(0, 0, 1)^T$.
Figure 2.2: Incidence in the projective plane.
The beauty of these definitions starts to show when we consider the incidence relation of points and lines. A finite point given by $(x, y, 1)^T$ lies on a finite line $(a, b, c)^T$ if and only if $ax + by + c = 0$. Moreover, this is compatible with the vectors in $\mathbb{R}^3$ representing our geometric objects: $(x, y, 1)^T$ represents a line through
the origin and $(a, b, c)^T$ represents the plane perpendicular to it. This line lies in this plane if and only if $(x, y, 1)^T$ is perpendicular to $(a, b, c)^T$. Thus the same holds for their intersections with the embedding plane, which are the points and lines in the drawing plane we are actually interested in. You can see this in Figure 2.2.
It is a natural choice to extend this to all points and lines: For any $P \in \mathcal{P}_{\mathbb{R}}$ and $l \in \mathcal{L}_{\mathbb{R}}$ we say that $P$ lies on $l$ if
$$P^T l = 0.$$
Note that this definition is independent of the choice of the representatives. I.e., we can scale P and l by any non-zero number without changing the fact whether their scalar product is zero or not. In particular we get the proposition that every point at infinity lies on the line at infinity. The reason is simply that
$$\begin{pmatrix} x \\ y \\ 0 \end{pmatrix}^T \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} = 0.$$
All the above considerations give rise to the following definition.
Definition 2.1.1: The real projective plane is the triple
$$\mathbb{RP}^2 := (\mathcal{P}_{\mathbb{R}}, \mathcal{L}_{\mathbb{R}}, I_{\mathbb{R}})$$
with $I_{\mathbb{R}} \subset \mathcal{P}_{\mathbb{R}} \times \mathcal{L}_{\mathbb{R}}$ a relation given by $P \, I_{\mathbb{R}} \, l \iff P^T l = 0$. This we call the incidence relation of $\mathbb{RP}^2$. ♦
A key feature of projective planes is that every two points have a unique line connecting them and every two lines have a unique intersection point. When projective planes are constructed as above, these connections and intersections can be explicitly computed.
Lemma 2.1.2:
— Let $P, Q \in \mathcal{P}_{\mathbb{R}}$ be distinct points. Then $P \vee Q := P \times Q \in \mathcal{L}_{\mathbb{R}}$ is the unique line on which both these points lie.

— Let $l, m \in \mathcal{L}_{\mathbb{R}}$ be distinct lines. Then $l \wedge m := l \times m \in \mathcal{P}_{\mathbb{R}}$ is the unique point which lies on both these lines.
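Both operations of the lemma are cross products of representatives, so a sketch needs only a few lines (the helper names are ours, not the thesis's):

```python
def cross(u, v):
    """Cross product of two 3-vectors, given as tuples."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, u3 * v1 - u1 * v3, u1 * v2 - u2 * v1)

join = cross   # P ∨ Q: the line connecting two distinct points
meet = cross   # l ∧ m: the intersection point of two distinct lines

def incident(P, l, eps=1e-12):
    """Incidence test P^T l = 0, up to floating-point tolerance."""
    return abs(sum(a * b for a, b in zip(P, l))) < eps
```

E.g. the parallel lines $x = 0$ and $x = 1$, represented by $(1, 0, 0)^T$ and $(1, 0, -1)^T$, meet in $(0, 1, 0)^T$: a point at infinity, exactly as the construction above promises.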
The next major point in understanding this projective construction is to define structure-preserving maps on it — with the incidence relation being the structure that should be preserved. We want to think of them as perspective distortions — e.g., what happens when one looks at a chess board from an angle. This directly leads to the following definition.
Definition 2.1.3: A projective transformation is a map given by a matrix $M \in \mathrm{GL}_3(\mathbb{R})$ which operates on $\mathcal{P}_{\mathbb{R}}$ via $P \mapsto MP$ and on $\mathcal{L}_{\mathbb{R}}$ via $l \mapsto (M^{-1})^T l$. ♦
Note that matrices that are scalar multiples of each other induce the same map on $\mathcal{P}_{\mathbb{R}}$ and $\mathcal{L}_{\mathbb{R}}$ and, consequently, the same projective transformation. That this is indeed structure-preserving is easy to see: We have
$$(MP)^T (M^{-1})^T l = P^T \underbrace{M^T (M^{-1})^T}_{= I_3} l = P^T l$$
for all points $P$, lines $l$ and projective transformations $M$. So, any point is incident to a line if and only if the same holds for the images under $M$.
Proposition 2.1.4: Let A, B, C, D and A0, B0, C0, D0 be two point quadruples, each in general position. That means that no three points of a quadruple lie on a common line. Then there exists a unique projective transformation M such that
MA = A0, MB = B0, MC = C0, MD = D0.
The last proposition is not surprising. A projective transformation has nine coordinates/entries and hence eight degrees of freedom, since we can scale the matrix by scalars. Similarly, every point has two degrees of freedom and so the above four equations impose two linear conditions each on the entries of a projective transformation.
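The construction behind Proposition 2.1.4 can be sketched with NumPy (a standard textbook approach, not code from the thesis): first build the matrix sending the standard projective basis to a quadruple, then compose two such matrices.

```python
import numpy as np

def base_transform(A, B, C, D):
    """Matrix sending e1, e2, e3 and (1,1,1)^T to A, B, C, D (general position)."""
    M = np.column_stack([A, B, C]).astype(float)
    lam = np.linalg.solve(M, np.asarray(D, dtype=float))  # D = λ1*A + λ2*B + λ3*C
    return M * lam  # scale the i-th column by λ_i

def projective_transform(src, dst):
    """The (unique up to scale) M with M @ src[i] ~ dst[i] for two quadruples."""
    return base_transform(*dst) @ np.linalg.inv(base_transform(*src))
```

Since representatives are only defined up to scale, checking the result means checking proportionality, e.g. via a vanishing cross product of image and target.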
This is all we can and will say about the basic structure of the real projective plane for now. Before we continue with the next topic, however, we will briefly talk about two notions which we will use in this thesis. First, there are higher-dimensional projective spaces $\mathbb{RP}^d$ for an integer $d \ge 1$. Their point sets are defined analogously as
$$\frac{\mathbb{R}^{d+1}\setminus\{0\}}{\mathbb{R}^\times}.$$
The dual objects there are then given by hyperplanes and not by lines. But we will only use the points and the basic fact that scalar multiples of vectors represent the same point again.

Second, for vectors $X_1, \ldots, X_n \in \mathbb{R}^n$ we set
$$[X_1, \ldots, X_n] := \det \begin{pmatrix} | & | & \cdots & | \\ X_1 & X_2 & \cdots & X_n \\ | & | & \cdots & | \end{pmatrix},$$
with these $n$ vectors as the columns of the matrix. Mostly, we will be interested in the case $n = 3$, i.e., when we deal with representatives of points in $\mathbb{RP}^2$. A variant of the first fundamental theorem of projective invariant theory states that every projectively invariant function of point configurations in $\mathbb{RP}^2$ can be expressed via such determinants, with the points appearing as the columns of the determinants.² The points we deal with — as discretisations of strokes — will never be in a stable relation to each other. Nevertheless, we will use this theorem as one motivation in Chapter 5 when we describe one possible characterisation of the shape of strokes. The central concept in understanding why determinants of points are predominant in projective geometry is the following lemma.
²See [55] for more on that.
Lemma 2.1.5: Let $A, B, C$ be three finite points in $\mathbb{RP}^2$ in standard embedding, i.e., their last coordinate is equal to 1. Then, $\frac{1}{2}[A, B, C]$ is the signed area of the triangle $ABC$.
Idea of proof. We can use a Euclidean transformation to map A to the origin and B onto the x-axis and then the statement can be shown by computing the determinant explicitly.
A consequence of this lemma is that three points $A, B, C$ are collinear if and only if $[A, B, C] = 0$. In a general projective setting, it makes little sense to use this interpretation of determinants, as some points in the determinant might lie at infinity. However, it is possible to generalise statements and proofs in the Euclidean plane to the projective one via determinants.
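The bracket $[A, B, C]$ and the resulting area and collinearity tests translate directly into a few lines (an illustrative sketch only):

```python
def bracket(A, B, C):
    """[A, B, C]: determinant of the 3x3 matrix with A, B, C as columns."""
    a1, a2, a3 = A
    b1, b2, b3 = B
    c1, c2, c3 = C
    return (a1 * (b2 * c3 - b3 * c2)
            - b1 * (a2 * c3 - a3 * c2)
            + c1 * (a2 * b3 - a3 * b2))

def signed_area(A, B, C):
    """Signed triangle area for finite points in standard embedding (z = 1)."""
    return bracket(A, B, C) / 2

def collinear(A, B, C, eps=1e-12):
    return abs(bracket(A, B, C)) < eps
```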
As we will use Projective Geometry mainly to describe our data, the basic definitions presented are everything we need.
2.1.2 Fuzzy Logic
Fuzzy Logic concerns itself with describing properties that do not hold with certainty — the prime example in this section will be whether a person can be considered old and how that can be modelled based on age. And of course, this notion of uncertainty is also a prominent phenomenon in handwriting recognition (and other applications of pattern matching and machine learning): We will rarely be 100% sure that a line drawn on a touch surface is, say, the number 3, but we will be able to give a gauge or likelihood for how much the recorded stroke resembles a 3.
All definitions and propositions here in this section can be found in [33].
We replace the concept of whether “a property holds” by “being an element of a set” because it is much more palpable. Moreover, we will also describe sets by functions, which makes it much easier to introduce a level of ambiguity.
Given any base set or universal set X there is the well-known bijection
$$\begin{aligned} \mathcal{P}(X) &\longrightarrow \{0, 1\}^X \\ A &\longmapsto \left( x \mapsto \begin{cases} 1, & \text{if } x \in A, \\ 0, & \text{if } x \notin A \end{cases} \right). \end{aligned}$$
It maps any subset $A$ to a function — sometimes called the characteristic function or indicator function of $A$ — which simply encodes whether an element of $X$ lies in $A$ by the integers 0 and 1. When we now want to generalise the notion of a set to account for uncertainty, we can do that easily with indicator functions.
Definition 2.1.6: Given a base set $X$, a fuzzy (sub-)set of $X$ is a function $m : X \to [0, 1]$. ♦
Every element $x \in X$ with $m(x) = 1$ is thought to be definitely an element of this “set” and an element with $m(x) = 0$ is thought to be definitely not in the set. For all other elements, the values between 0 and 1 describe varying degrees of “maybe”. To make it a bit easier to talk about this new definition of sets, one usually gives a name to the abstract set, say $A$, and calls the associated function $m_A : X \to [0, 1]$ the membership function of $A$. With this we can make statements like
Element $x$ most likely lies in $A$, because $m_A(x) = 0.99999$.
Before we illustrate fuzzy sets by an example, a few more basic definitions:
Definition 2.1.7: Let $X$ be a base set and $A$ a fuzzy subset of it. The complement $\bar{A}$ of $A$ is defined by the membership function $m_{\bar{A}}(x) := 1 - m_A(x)$. The fuzzy set $A$ is called a crisp set if $m_A(X) \subseteq \{0, 1\}$. ♦
So, crisp sets are just ordinary sets, for which the membership function takes the shape of an indicator function. Moreover, the definition of the complement of a fuzzy set is very natural in the sense that it is compatible with the indicator functions of ordinary sets. We will see similar constructions soon.
Example 2.1.8: Let $X = \mathbb{R}^+_0$ be the base set describing the age of a human in years. Then we can consider the fuzzy set OLD with the following membership function:
$$m_{\text{OLD}}(x) := \begin{cases} 0, & \text{if } x \le 20, \\ \frac{1}{40}x - \frac{1}{2}, & \text{if } 20 < x \le 60, \\ 1, & \text{if } 60 < x. \end{cases}$$
Its graph can be seen in Figure 2.3.
Figure 2.3: A simple function modelling how old or young someone is.
This is a very crude and simplistic model to describe when a person is OLD. With a life expectancy of roughly 80 years, it is at least somewhat reasonable to declare the upper quartile as OLD and the lower quartile as $\overline{\text{OLD}} =: \text{YOUNG}$. So it makes sense that $m_{\text{OLD}}(40) = 0.5$, with the graph of the membership function being centrosymmetric with respect to this point. We could say that a person who is 40 years old is both OLD and YOUNG at the same time; or neither of them. So, this model might be very simplistic, but at least it is not entirely flawed.

However, we can improve it. Note that the above function is piecewise linear, which implies that 20 years and 60 years are hard transition points for being OLD and YOUNG. To allow for more flexibility we consider the following (smooth) S-shaped function:
$$m_{\text{OLD}}(x) := \begin{cases} \frac{1}{3200}x^2, & \text{if } x \le 40, \\ -\frac{1}{3200}(x - 80)^2 + 1, & \text{if } 40 < x \le 80, \\ 1, & \text{if } 80 < x. \end{cases}$$
Figure 2.4: Another function modelling how old or young someone is.
Again a symmetric function, which is shown in Figure 2.4. This is already better, since the OLD-ness of a person gradually increases with age.
Lastly, one could also say that getting OLD does not even start before turning 60, i.e.,
$$m_{\text{OLD}}(x) := \begin{cases} 0, & \text{if } x \le 60, \\ \frac{1}{200}(x - 60)^2, & \text{if } 60 < x \le 70, \\ -\frac{1}{200}(x - 80)^2 + 1, & \text{if } 70 < x \le 80, \\ 1, & \text{if } 80 < x. \end{cases}$$
Applying the same logic to being YOUNG results in
$$m_{\text{YOUNG}}(x) := \begin{cases} -\frac{1}{200}x^2 + 1, & \text{if } x \le 10, \\ \frac{1}{200}(x - 20)^2, & \text{if } 10 < x \le 20, \\ 0, & \text{if } 20 < x. \end{cases}$$
These functions are illustrated in Figure 2.5. We see that $\overline{\text{OLD}}$ is not the same as YOUNG, in contrast to what we assumed in the first two versions above. This makes sense even in colloquial English, since YOUNG is not the complementary attribute to OLD, but its polar opposite. This shows that one must be careful modelling real-world phenomena via fuzzy sets. ♦
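The two asymmetric membership functions of the example translate directly into code (a sketch of the formulas above, nothing more):

```python
def m_old(x):
    """OLD-ness of a person aged x years (third model of Example 2.1.8)."""
    if x <= 60:
        return 0.0
    if x <= 70:
        return (x - 60) ** 2 / 200
    if x <= 80:
        return -((x - 80) ** 2) / 200 + 1
    return 1.0

def m_young(x):
    """YOUNG-ness of a person aged x years."""
    if x <= 10:
        return -(x ** 2) / 200 + 1
    if x <= 20:
        return (x - 20) ** 2 / 200
    return 0.0
```

At $x = 40$ both functions vanish, while the complement $1 - m_{\text{OLD}}(40)$ equals 1; this is the numerical face of the observation that the complement of OLD differs from YOUNG.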
The attribute names OLD and YOUNG, used in the example above to evaluate, judge or organise the elements of the base set via fuzzy values, are called linguistic variables. They allow the classification of elements via a natural language, which makes decision rules derived from them in practice much more
Figure 2.5: Asymmetric membership functions.

comprehensible. E.g., saying
Person x is rather old

is usually much more intelligible than
Person x’s age has a membership value of 0.9 with respect to the fuzzy set old.
And here we see the applicability for handwriting recognition, as we want to make precisely such qualitative statements like
this recorded stroke looks more like a 3 than a 7, which we now can easily model via fuzzy sets.
There is a lot more to say about Fuzzy Logic. For our purposes though, the basic definition above is almost enough. What is still missing is how we can combine fuzzy sets. More precisely: is there a way to intersect and join fuzzy sets such that for crisp sets we get the usual operations? To answer that, we introduce the concept of t-norms.
Definition 2.1.9: A t-norm is a binary operator $\otimes : [0, 1] \times [0, 1] \to [0, 1]$ with the following properties:
— Commutativity: $x \otimes y = y \otimes x$ for all $x, y \in [0, 1]$.
— Associativity: $(x \otimes y) \otimes z = x \otimes (y \otimes z)$ for all $x, y, z \in [0, 1]$.
— Monotonicity: $a \otimes b \le c \otimes b$ for all $a, b, c \in [0, 1]$ with $a \le c$.
— The number 1 is an identity element: $1 \otimes x = x$ for all $x \in [0, 1]$.
♦
Note that monotonicity also holds in the second argument due to the commutativity. The prime example for a t-norm is the ordinary product $x \cdot y$ of two real numbers — and that is the reason for the choice of the symbol $\otimes$. Other important examples are
— the minimum t-norm $(x, y) \mapsto \min\{x, y\}$,
— the Łukasiewicz t-norm $(x, y) \mapsto \max\{0, x + y - 1\}$ and
— the Hamacher t-norms
$$(x, y) \mapsto \begin{cases} 0, & \text{if } p = x = y = 0, \\ \dfrac{xy}{p + (1 - p)\cdot(x + y - xy)}, & \text{else,} \end{cases}$$
for any $p \in \mathbb{R}^+_0$.
The last one is itself a direct generalisation of the ordinary product (choose $p = 1$), it is the only t-norm given by a rational function, and it is related to relativistic physics. Fuzzy values share many properties with probabilities, even though they are technically vastly different things. However, after all, the interpretations of statements like
The element x is in a fuzzy set with an 80% likelihood and
The event A occurs with an 80% probability are very similar. So it is no surprise that many ideas in Fuzzy Logic are mod- elled not only after classical Set Theory but also after Probability Theory. In particular, the probability of two (independent) events happening at the same time is equal to the product of the individual probabilities. This leads to the idea that the likelihood of an element being in two fuzzy sets should be the product of the respective membership functions. In practice, this might lead to unsatisfying results depending on what the fuzzy sets model. And this is where t-norms come into the picture: as a gener- alisation of the product of two real numbers. 2.1 Mathematical fundamentals 53
Definition 2.1.10: Given two fuzzy sets $A, B$ on the same base set $X$ and a t-norm $\otimes$, the intersection $A \cap B$ of $A$ and $B$ (with respect to $\otimes$) is given by the membership function
$$m_{A \cap B}(x) := m_A(x) \otimes m_B(x).$$
♦
Before we show some examples, we give two fundamental properties of t-norms.
Proposition 2.1.11: Let $\otimes$ be any t-norm. Then the following properties hold:
(1.) For all x, y ∈ [0, 1] we have
$$x \otimes y \le \min\{x, y\}.$$
(2.) For all $x \in [0, 1]$ we have $0 \otimes x = 0$.
Note how the second property above makes sense in our context of generalised sets: If an element $x$ is not in a (fuzzy) set $A$, it will not be in an intersection of $A$ with another (fuzzy) set. There are many more properties of t-norms — e.g. whether they are continuous or have the Archimedean property³ — but these two fundamental ones are enough for our considerations.
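The t-norms listed above can be sketched in a few lines; the assertions afterwards mirror the identity property of Definition 2.1.9 and both parts of Proposition 2.1.11 (an illustrative sketch, not thesis code):

```python
def t_product(x, y):
    return x * y

def t_minimum(x, y):
    return min(x, y)

def t_lukasiewicz(x, y):
    return max(0.0, x + y - 1.0)

def t_hamacher(x, y, p=1.0):
    """Hamacher family; p = 1 recovers the ordinary product."""
    if p == x == y == 0:
        return 0.0
    return x * y / (p + (1.0 - p) * (x + y - x * y))
```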
Example 2.1.12: Computing the t-norm of two given membership functions of fuzzy sets is, in general, tedious and not very insightful. So, in order to illustrate the intersection of fuzzy sets, let us look at their graphs and describe the intersection membership function only qualitatively. Consider the base set $X = \mathbb{R}$, a fuzzy set $A$ given by a triangular membership function and a fuzzy set $B$ given by a bell-shaped membership function in
³This means that for all $x, y \in (0, 1)$ there exists an $n \in \mathbb{N}$ such that $\underbrace{x \otimes \cdots \otimes x}_{n \text{ times}} \le y$.
Figure 2.6.
Figure 2.6: Graphs of fuzzy sets and their intersection w.r.t. six different t-norms. Top row: product (left), minimum (middle) and Łukasiewicz norm (right). Bottom row: Hamacher norm for p = 2 (left), p = 30 (middle) and p = 100 (right).
Observe the intersection with respect to the minimum norm. It is given by the intersection of the areas limited by the $x$-axis and the graphs of the membership functions. This is the reason why the intersection of fuzzy sets is often defined by the minimum as the canonical way. Moreover, this allows for an interpretation of the area below the graphs as the actual fuzzy set. ♦
In order to define the union of fuzzy sets, we can do two things: Define a dual concept to t-norm, called t-conorm, or use De Morgan’s laws from Set Theory and consider the map
$$(x, y) \mapsto 1 - ((1 - x) \otimes (1 - y)).$$
As it turns out, the results are the same. For the scope of this thesis, however, we do not need that. We will use fuzzy sets to describe various uncertain geometric properties and determine the likelihood of combinations of them via t-norms.
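Combining Definition 2.1.10 with the De Morgan construction just discussed gives intersection and union operators on membership functions; the t-norm is a parameter (a sketch with hypothetical helper names):

```python
def t_minimum(x, y):
    return min(x, y)

def intersection(m_a, m_b, tnorm=t_minimum):
    """Membership function of A ∩ B with respect to the given t-norm."""
    return lambda x: tnorm(m_a(x), m_b(x))

def union(m_a, m_b, tnorm=t_minimum):
    """Membership function of A ∪ B via De Morgan: 1 - ((1 - a) ⊗ (1 - b))."""
    return lambda x: 1.0 - tnorm(1.0 - m_a(x), 1.0 - m_b(x))
```

For the minimum t-norm the union reduces to the pointwise maximum, matching the crisp-set picture.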
2.1.3 Formal Concept Analysis
Formal Concept Analysis (FCA for short) is an all-purpose data analysis and data visualisation technique that can be very useful, given the right circumstances. The basic idea is to describe objects by specific attributes and then analyse what a combination of certain attributes can say about the corresponding objects. This is precisely what we want to do in handwriting recognition: We have lines drawn on a touchscreen, we measure geometric properties of them, and then we want to make deductions about how the lines look by just considering the measured features.
All definitions and propositions here in this section can be found in [15].
Definition 2.1.13: A (formal) context is a triple $(G, M, I)$ of a set $G$ of objects, a set $M$ of attributes and an incidence relation $I \subset G \times M$. ♦
Subsequently, we will call these objects just 'contexts' and not 'formal contexts'. The same holds for all other definitions in this section: there are formal objects, formal attributes, formal concepts, formal implications, and so on. The adjective 'formal' is added because all these terms carry much meaning in colloquial speech, especially in the areas of data analysis FCA is mostly used in. However, for the purely mathematical considerations here, we drop it.
The incidence relation of a context is easily depicted as a table in which a cell has an ’x’ if and only if the object of its row and the attribute of its column are together in the relation.
Example 2.1.14: Consider the mathematicians
G = {Gauss, Noether, Faltings, Euclid, Jordan}

as objects and the attributes
M = {WAS BORN AFTER 1000 CE, IS MALE, IS GERMAN, HAS A FIELDS MEDAL, HAS A THEOREM NAMED AFTER THEM}.
Then the table (with shortened attribute names)
           1000 CE   MALE   GERMAN   FIELDS   THEOREM
Gauss         x       x       x                  x
Noether       x               x                  x
Faltings      x       x       x        x         x
Euclid                x                          x
Jordan        x       x                          x

represents a context (G, M, I). ♦
After defining a structure in mathematics, the obvious follow-up question is what the sub-structures look like. For any context we can easily define sub-contexts by subsets of the object set or the attribute set. But this just means looking at sub-tables, and it is neither handy nor interesting. We get the proper sub-structures by formalising how sets of objects and attributes relate to each other.
Definition 2.1.15: Let (G, M, I) be a context and define the two maps
$$\begin{aligned} {}' : \mathcal{P}(G) &\to \mathcal{P}(M), & A &\mapsto \{\, m \in M \mid \forall g \in A : gIm \,\} \quad\text{and} \\ {}' : \mathcal{P}(M) &\to \mathcal{P}(G), & B &\mapsto \{\, g \in G \mid \forall m \in B : gIm \,\}. \end{aligned}$$
They are called derivations. ♦
They describe the attributes a set of objects have in common and, dually, the objects in which a set of attributes appears together. The most important property of these maps for us is the following.
Proposition 2.1.16: The maps ${}'' : \mathcal{P}(G) \to \mathcal{P}(G)$ and ${}'' : \mathcal{P}(M) \to \mathcal{P}(M)$, obtained from concatenating the derivations above, are hull (or closure) operators. I.e., they fulfil the following properties for all $X, Y \subseteq G$ or $X, Y \subseteq M$.
— Extensiveness: $X \subseteq X''$.
— Idempotence: $X'''' = X''$.
— Monotonicity: $X \subseteq Y \implies X'' \subseteq Y''$.
Now, the back-and-forth between objects and attributes via the derivations gives us the desired sub-structures.
Definition 2.1.17: Let $(G, M, I)$ be a context and $A \subseteq G$ and $B \subseteq M$. Then the pair $(A, B)$ is called a concept of the context if $A' = B$ and $B' = A$. In this case, $A$ is called the extent and $B$ the intent of this concept. The set of all concepts of the context $(G, M, I)$ is denoted by $\mathfrak{B}(G, M, I)$. For single objects $g \in G$ and attributes $m \in M$ we write $g'$ and $m'$ instead of $\{g\}'$ and $\{m\}'$, respectively, and we call $g'$ an object intent and $m'$ an attribute extent. ♦
This means, (A, B) is a concept if the objects in A have exactly the attributes in B in common and the attributes in B appear together exactly in the objects in A. I.e., in a concept, the objects uniquely describe the attributes and vice-versa.
There are many properties of concepts. The next proposition lists the most important and fundamental ones.
Proposition 2.1.18: Let C = (G, M, I) be a context.
— The derivations in $C$ reverse the order of sets. I.e.,
$$A_1 \subseteq A_2 \implies A_1' \supseteq A_2'$$
for any subsets Ai of either G or M.
— Let $(A, B)$ be a concept of $C$. Then, $A$ and $B$ are hulls, i.e., $A = A''$ and $B = B''$.
— When we order concepts of C via
(A1, B1) ≤ (A2, B2) :⇐⇒ A1 ⊆ A2,
the set of concepts with this order $(\mathfrak{B}(G, M, I), \le)$ is a complete lattice. That means it is a partially ordered set, in which supremum and infimum of arbitrary
sets exist. Explicitly, they are
$$\bigvee_{t \in T} (A_t, B_t) = \left( \Bigl( \bigcup_{t \in T} A_t \Bigr)'' ,\; \bigcap_{t \in T} B_t \right)$$
and
$$\bigwedge_{t \in T} (A_t, B_t) = \left( \bigcap_{t \in T} A_t ,\; \Bigl( \bigcup_{t \in T} B_t \Bigr)'' \right),$$
respectively.
Example 2.1.19: Looking back at our context of mathematicians from Example 2.1.14
           1000 CE   MALE   GERMAN   FIELDS   THEOREM
Gauss         x       x       x                  x
Noether       x               x                  x
Faltings      x       x       x        x         x
Euclid                x                          x
Jordan        x       x                          x

we can consider the objects A = {Gauss, Noether}. Then,
A' = {1000 CE, GERMAN, THEOREM},

as these are the attributes Gauss and Noether have in common. Moreover,

A'' = {Gauss, Noether, Faltings}.
This means that not only are Gauss and Noether German mathematicians born in the last millennium with a theorem named after them, but Faltings is one, too. These three properties describe exactly these three mathematicians; at least here in this context. So,

({Gauss, Noether, Faltings}, {1000 CE, GERMAN, THEOREM})

is a concept of this context. ♦
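The derivation operators and the concept from this example can be checked mechanically. Below is a sketch over the mathematicians context, with attribute names abbreviated as in the table (the function names are ours):

```python
OBJECTS = {"Gauss", "Noether", "Faltings", "Euclid", "Jordan"}
ATTRIBUTES = {"1000CE", "MALE", "GERMAN", "FIELDS", "THEOREM"}
INCIDENCE = {
    ("Gauss", "1000CE"), ("Gauss", "MALE"), ("Gauss", "GERMAN"), ("Gauss", "THEOREM"),
    ("Noether", "1000CE"), ("Noether", "GERMAN"), ("Noether", "THEOREM"),
    ("Faltings", "1000CE"), ("Faltings", "MALE"), ("Faltings", "GERMAN"),
    ("Faltings", "FIELDS"), ("Faltings", "THEOREM"),
    ("Euclid", "MALE"), ("Euclid", "THEOREM"),
    ("Jordan", "1000CE"), ("Jordan", "MALE"), ("Jordan", "THEOREM"),
}

def derive_objects(A):
    """A': all attributes shared by every object in A."""
    return {m for m in ATTRIBUTES if all((g, m) in INCIDENCE for g in A)}

def derive_attributes(B):
    """B': all objects that have every attribute in B."""
    return {g for g in OBJECTS if all((g, m) in INCIDENCE for m in B)}
```

Applying the two derivations in turn yields the hull A'' of a set of objects; the pair (A'', A') is then always a concept.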
When applying FCA to actual data, one rarely has attributes that fall under a dichotomy of 'present' and 'not present'. Often there are real-valued quantities to consider, like price, weight or volume, or attributes with several different types, like colours or names of suppliers. In order to get ordinary contexts as we introduced them above, one has to make a few additional definitions.
Definition 2.1.20: A multi-valued context is a quadruple (G, M, W, I) with ob- jects G, attributes M, a value set W and an incidence relation I ⊂ G × M × W, such that for all g ∈ G and m ∈ M the implication
$$(g, m, w_1) \in I \;\wedge\; (g, m, w_2) \in I \implies w_1 = w_2$$
holds. ♦
This condition seems a bit technical at first. It is a lot easier to view the attributes in a multi-valued context as functions $G \to W$; then this condition just means that these functions are well-defined. And with this functional view in mind we often write $m(g) = w$ instead of $(g, m, w) \in I$. Unsurprisingly, multi-valued contexts are once again displayed as tables.
Example 2.1.21: Considering the objects/books
G = {Lord Of The Rings, Atlas Shrugged},

the attributes

M = {PRICE, PAGES, AUTHOR},

and the values

W = {Tolkien, Rand} ∪ ℝ,

the following table represents a multi-valued context (G, M, W, I):
                     PRICE   PAGES   AUTHOR
Lord Of The Rings    19.98   1134    Tolkien
Atlas Shrugged        7.89   1069    Rand
♦
Such a multi-valued context can and will be transformed into an ordinary one via a process called scaling. It is one of the most important steps in modelling a useful context from actual data.
Definition 2.1.22: Given a multi-valued context $(G, M, W, I)$, a scale for an attribute $m \in M$ is a context $S_m = (G_m, M_m, I_m)$ with $m(G) \subseteq G_m$. ♦
What this means is that the values (or ranges of values) of $W$ get turned into new attributes. However, instead of explaining this process in general, we will consider two illustrative examples.
Example 2.1.23: Consider the following multi-valued context that models cars and some of their properties.
           PRICE    NUMBER OF SEATS   MANUFACTURER
Skyline     20500          4             Nissan
Ford GT    300000          2             Ford
Fiesta      13000          4             Ford
Now we consider every value that can occur for the price, the number of seats and the manufacturer as a new attribute. For the (more or less continuous) values which describe the price, we use adequate bins/intervals as the attributes. So, instead of saying that the PRICE value of a Skyline is 20500, we say that the object Skyline has the attribute PRICE BETWEEN 10000 AND 100000. After reformulating the NUMBER OF SEATS and the MANUFACTURER in a similar way, we might get:
          PRICE / 1,000        SEATS           MANUFACTURER
          0–10  10–100  >100   1  2  3  4  5   NISSAN  BMW  FORD
Skyline          x                      x      x
Ford GT                  x        x                         x
Fiesta           x                      x                   x
Note that we could drop the third manufacturer BMW, since no car has it as an attribute. We included it in the first place to illustrate that scaling might introduce values that were not present before.
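This plain scaling step can be sketched as follows. The function and attribute names are ours, chosen for illustration; only the binning of the price and the set of cars come from the example above.

```python
# Plain (nominal) scaling: every occurring value becomes a new binary
# attribute; the continuous prices are binned first.
cars = {
    "Skyline": {"PRICE": 20500,  "SEATS": 4, "MANUFACTURER": "Nissan"},
    "Ford GT": {"PRICE": 300000, "SEATS": 2, "MANUFACTURER": "Ford"},
    "Fiesta":  {"PRICE": 13000,  "SEATS": 4, "MANUFACTURER": "Ford"},
}

def price_bin(p):
    # hypothetical bin labels matching the table's 0-10 / 10-100 / >100 columns
    if p < 10000:
        return "PRICE 0-10k"
    if p < 100000:
        return "PRICE 10k-100k"
    return "PRICE >100k"

def nominal_scale(cars):
    scaled = {}
    for car, attrs in cars.items():
        scaled[car] = {price_bin(attrs["PRICE"]),
                       f"SEATS {attrs['SEATS']}",
                       attrs["MANUFACTURER"].upper()}
    return scaled
```

After scaling, each object carries a plain set of binary attributes, exactly the crosses of the table above.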
In subsequent analyses we might ask questions like
Can three people sit in the car?
But in the last context, there is no connection between the different attributes encoding the number of seats. In reality, three people can sit in any car that has three or more seats. That means there is an internal order on the number of seats: if a car has four seats, it also has three seats. In order to account for this, we use an ordinal scale to describe this attribute. That means that
— if we have a value range in the multi-valued context that is linearly ordered and
— if an object in the scaled context has an attribute m, then it shall also have all attributes that are smaller than m (or larger, depending on what they model). We can do this for the number of seats and, of course, for the price, too. Contrary to prevalent belief, car manufacturers cannot be ordered, so for these attributes we do not use an ordinal scale.
          PRICE / 1,000        SEATS           MANUFACTURER
          0–10  10–100  >100   1  2  3  4  5   NISSAN  BMW  FORD
Skyline    x     x             x  x  x  x      x
Ford GT    x     x       x     x  x                         x
Fiesta     x     x             x  x  x  x                   x
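The ordinal scale for the number of seats can be sketched directly. This is an illustration of ours (the attribute labels "SEATS >= j" are an assumption, not notation from the thesis): a car with k seats receives all attributes for j ≤ k, so the question "can three people sit in the car?" becomes a single attribute test.

```python
# Ordinal scaling of the seat count: a car with k seats also "has"
# every smaller seat count, reflecting the internal order.
def ordinal_seats(k):
    return {f"SEATS >= {j}" for j in range(1, k + 1)}

skyline = ordinal_seats(4)
assert "SEATS >= 3" in skyline      # three people fit in a four-seater
assert "SEATS >= 5" not in skyline  # but not five
```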
A different scale we can use is the dichotomic scale. It encodes yes-no relations and can be applied to every context. For a multi-valued context like the one above, this means compressing each range of values into two bins, e.g. whether the price is below or above 50,000, or whether the manufacturer is Nissan or not. However, we can apply a dichotomic scale even to regular (single-valued) contexts. Consider the simple (and unspecified) example below.
      m1  m2  m3
g1     x   x
g2     x   x   x
g3             x
For every attribute mᵢ we introduce the complementary attribute m̄ᵢ, which shall be present in an object if and only if mᵢ is not. This results in:
      m1  m̄1  m2  m̄2  m3  m̄3
g1     x        x             x
g2     x        x        x
g3          x        x   x
There are many more ways to scale a (multi-valued) context (see [15]), but these two are the ones we will use in this thesis. △
With these constructions, we can model internal relations between attributes that might have gotten lost during the transition from multi-valued to ordinary contexts. The idea that certain attributes imply the presence of other attributes can be formalised and generalised. We do this now in a very abbreviated way, sufficient for our use in this thesis. For a thorough explanation we refer to [15] and [14] again.
Definition 2.1.24: Let C = (G, M, I) be a context and A, B ⊂ M be sets of attributes. The pair (A, B) is called an implication and is denoted by A → B. We say an implication holds in a context if every object that has all attributes in A also has all attributes in B, i.e., if A′ ⊆ B′. We call A the premise and B the conclusion. △
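Whether an implication holds in a given context is a direct check against Definition 2.1.24. The sketch below is ours; the context is a toy version of the vertebrate example discussed further down, with sets of attributes per object.

```python
# An implication A -> B holds if every object that has all attributes
# in A also has all attributes in B.
def holds(context, A, B):
    return all(B <= attrs for attrs in context.values() if A <= attrs)

ctx = {"wolf": {"vertebrate", "mammal", "hair"},
       "bear": {"vertebrate", "mammal", "hair"},
       "crow": {"vertebrate", "feathers"}}

assert holds(ctx, {"mammal"}, {"hair"})            # all mammals here have hair
assert not holds(ctx, {"vertebrate"}, {"mammal"})  # the crow is a counterexample
```

Adding the crow is exactly the counterexample step of attribute exploration: without it, the implication {vertebrate} → {mammal} would (wrongly) hold.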
These implications obey the same rules as implications in logic do. E.g., B̄ → Ā holds if A → B holds, and A → C holds if A → B and B → C do. In particular, given a set of implications that hold in a context, we can find and build other implications that hold, too. A main result in Formal Concept Analysis is that, for a context with a finite attribute set M, we can describe and explicitly construct a minimal generating set of implications, i.e., a set of implications that hold in a context, from which we can deduce every other implication, and which cannot be smaller. This Duquenne-Guigues basis, named after the discoverers of this object (see [19]), can be found via an algorithm called attribute exploration.

If applied to a given context, the algorithm simply produces this particular set of implications. In practice it has a much bigger significance: most contexts are a small snapshot of all possible objects. So, attribute exploration might find implications that hold for these examples, but not in general. The implications are found in a specific order, which allows a user/supervisor to check every implication for universality and to add objects as counterexamples if necessary. For example, we might want to analyse certain vertebrates. If we start with a context containing wolves, bears and gorillas, attribute exploration might propose the fact that all vertebrates are mammals and have hair. Then we can add animals that are vertebrates but are not mammals or do not have hair to complete the picture of vertebrates.⁴

In handwriting recognition this might look like the following: we build a context from many samples of handwritten strokes for which we compute certain geometric properties. Due to sampling anomalies, all 3's we recorded might have a loop in the middle and all 8's might have their start point at the very top.
However, there are of course ways to write a 3 without a loop, and some people start writing an 8 in the middle near the intersection. Adding such samples allows us to refine the context before an extensive analysis.⁵
There is a lot to say about how implications in contexts work and what can be done with them. But we will be content with the basic Definition 2.1.24.
In [14], Ganter and Kuznetsov introduce the notion of hypotheses for contexts. These are special implications; we will introduce them in Section 4.2 and describe their use in ALICE:HWR.
⁴By definition, all mammals have hair. So not all combinations of these attributes and their negations can be added.

⁵Later we discuss how it might be beneficial to introduce separate categories for these different writing styles.
2.2 A base model for strokes
In Section 1.3 we introduced the general HWR problem in a relatively informal way. Here we want to model this problem in a mathematically rigorous way. As already mentioned at the beginning of this chapter, the subsequent discussion might not necessarily transfer to practice. We start our considerations of "theoretical" handwriting recognition by giving a very broad definition of the terms strokes and features. Afterwards, we present how these ideas form the basis for ALICE:HWR.
2.2.1 Strokes
The central objects used in (feature-based) HWR are strokes. Ideally they would be curves, i.e., continuous (or even smooth) functions [0, 1] → ℝ². This approach is used in practice, for example by Frenkel and Basri in [12]. In this thesis, strokes will be finite sequences of points, discretising the continuous finger movement the user makes on a touch screen.⁶ What constitutes a point here is up for debate and is determined by what information is recorded. At the basic level, a point consists of x- and y-coordinates, but additional data like timestamps, pressure and the azimuth of a pen/stylus can be added.
Definition 2.2.1: Given an integer d ≥ 2, a point is an element of ℝ^d. For an integer n ≥ 3, a stroke is an element of (ℝ^d)ⁿ. △
The number d determines how many different parameters of a point are recorded. For the scope of this thesis, we assume that it is equal to 2, so we only record x- and y-coordinates. The number n, called the sampling rate of the stroke, determines by how many points a curve is sampled. In practice, it usually ranges between 20 and 100. For us, it is crucial that it is "large enough": when specific terms do not make sense for n ≤ 3 (e.g. because a denominator becomes zero), we can ignore this, as we will usually not consider strokes sampled by only three points or fewer.⁷ But for any practical purposes, imagine n to be around 32.

When writing strokes, we will alternate between various representations. First, at the base level given by the definition, a stroke s is of the form

$$s = (P_i)_{i=1}^{n},$$

with every point $P_i$ having coordinates $(x_i, y_i)$. We can also write this in the form

$$s_{\mathrm{sep}} := (x_1, x_2, \dots, x_n, y_1, y_2, \dots, y_n)^T,$$

listing all x-coordinates of all points first and then all y-coordinates.⁸ We might use homogeneous coordinates for points and indicate this with a hat symbol, i.e. points $\hat P_i = (x_i, y_i, 1)^T$, strokes $\hat s = (\hat P_i)_{i=1}^{n}$, and the set $\hat S$ of all such strokes with every $\hat P_i$ finite. In these cases we always assume that the points are given in standard embedding, i.e. with last coordinate equal to 1, and we will simply view the sets S and $\hat S$ as identical.

Moreover, if not stated otherwise, when we talk about strokes s and t, we will always assume that their points are named $P_i$ and $Q_i$, respectively. Also, we always assume that a point $P_i$ in a stroke has coordinates $(x_i, y_i)$. This is handy, as we sometimes need the whole stroke s and sometimes the explicit x- and y-coordinates of its points.

Now, we will call the set of all strokes we consider, and which we want to use for handwriting recognition, the stroke space, and we will denote it by S. For the major part of this thesis, we will concern ourselves with only one particular stroke space, namely $S = (\mathbb{R}^2)^n$. This will be our standard stroke space, and whenever we write S, we will mean this set by default.

⁶The reason we consider the discrete case is that we want to apply our results directly. Continuous models might be good in theory, but they must be adapted to finite point sets if they are to be used in practice.
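The three representations of a stroke can be sketched concretely. The function names below are ours; the conversions themselves follow the definitions of s_sep and the standard embedding directly.

```python
# A stroke as a list of (x, y) points, its separated form s_sep
# (all x-coordinates first, then all y-coordinates), and its points in
# homogeneous coordinates with last coordinate 1 (standard embedding).
def separated(stroke):
    xs = [p[0] for p in stroke]
    ys = [p[1] for p in stroke]
    return xs + ys

def homogeneous(stroke):
    return [(x, y, 1.0) for (x, y) in stroke]

s = [(0.0, 0.0), (1.0, 0.5), (2.0, 0.0), (3.0, 1.0)]  # n = 4 sample points
assert separated(s) == [0.0, 1.0, 2.0, 3.0, 0.0, 0.5, 0.0, 1.0]
assert homogeneous(s)[0] == (0.0, 0.0, 1.0)
```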
Before we continue, we will briefly present two smaller stroke spaces which represent normalised strokes. The reason to consider them is that strokes located at different positions of the touch screen (or, in theory, in the whole drawing plane ℝ²) might represent similar things, and so it is handy to bring them to the same place to make comparisons easier and more straightforward.

First, we might assume that the start and end point of strokes are fixed. Second, we can normalise the bounding box of the strokes. It is the smallest axes-parallel rectangle enclosing (the points of) the stroke. It is a common object in HWR (cf. [11]), and we can normalise strokes by demanding that the bounding box has edge length 1 and that its centre sits at the origin.

⁷ALICE:HWR actively prevents such fringe cases by ensuring that every stroke is sampled with at least four points.

⁸The index 'sep' here stands for 'separated'.
Definition 2.2.2:
1. For any two different S, E ∈ ℝ² we define the set of nailed strokes by

$$N_{S,E} := \left\{ (P_i)_{i=1}^{n} \in (\mathbb{R}^2)^n \;\middle|\; P_1 = S \wedge P_n = E \right\}.$$

2. For a stroke s ∈ S set the points

$$\min(s) := \Bigl(\min_{i=1,\dots,n} x_i,\; \min_{i=1,\dots,n} y_i\Bigr) \quad\text{and}\quad \max(s) := \Bigl(\max_{i=1,\dots,n} x_i,\; \max_{i=1,\dots,n} y_i\Bigr)$$

in ℝ². Then define the bounding box B(s) of s as the axes-parallel rectangle spanned by the points min(s) and max(s).

Define the set of centred strokes Q as

$$Q := \left\{ s \in S \;\middle|\; \min(s) = \left(-\tfrac{1}{2}, -\tfrac{1}{2}\right) \text{ and } \max(s) = \left(\tfrac{1}{2}, \tfrac{1}{2}\right) \right\}. \quad \triangle$$
In order to model the bounding box correctly, we would need to add the condition that on every edge of the square spanned by (−½, −½) and (½, ½) lies a point of the stroke. However, this would exclude the case of perfectly vertical and horizontal lines, for which the bounding box has width or height 0, respectively.

Normalising a stroke such that it lies in one of the above sets N or Q can be realised by applying suitable affine transformations to the elements of (ℝ²)ⁿ. We will talk about them more when we apply projective transformations to strokes in Section 3.1.
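Normalising into Q can be sketched as follows. This is an illustrative implementation of ours, not code from ALICE:HWR; the guard against zero extent is our assumption for handling the degenerate horizontal/vertical lines mentioned above.

```python
# Centre a stroke into Q: translate the bounding-box centre to the origin
# and scale width and height to 1.
def centre_stroke(stroke):
    xs = [p[0] for p in stroke]
    ys = [p[1] for p in stroke]
    cx, cy = (min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2
    w = max(xs) - min(xs) or 1.0  # avoid division by zero for vertical lines
    h = max(ys) - min(ys) or 1.0  # ... and for horizontal lines
    return [((x - cx) / w, (y - cy) / h) for (x, y) in stroke]

s = centre_stroke([(2, 2), (4, 3), (6, 6)])
assert min(p[0] for p in s) == -0.5 and max(p[0] for p in s) == 0.5
assert min(p[1] for p in s) == -0.5 and max(p[1] for p in s) == 0.5
```

Note that scaling width and height independently is a design choice; it distorts aspect ratios, which may or may not be desirable depending on the features computed afterwards.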
Apart from that, there are many other additional conditions we might demand: in practice we often have ε < ‖P_{i+1} − P_i‖₂ < δ for suitable 0 < ε < δ, due to the fact that we do not want consecutive points to coincide and that we use a touchscreen of finite size. Alternatively, we might only consider strokes for which the distance between consecutive points is constant. We could also model all strokes as discretised Bézier curves, which are polynomial splines commonly used in computer graphics.

And even if we are content with using all sequences of n points as the stroke space, using (ℝ²)ⁿ means that we sample every line recorded from the touchscreen by the same number of n points. For large screens and/or a big variance in the length of the lines, it might be better to consider $\bigcup_{i=n}^{N} (\mathbb{R}^2)^i$ for reasonable choices of n and N. However, because of the application in ALICE:HWR and the general focus on letter- and number-like icons, we use a fixed sample rate. In a more general stroke and symbol recognition algorithm, it is reasonable to sort the recorded strokes into coarse categories depending on their length. Re-sampling the strokes in one such category with a specified sampling rate reduces every subsequent analysis to the situation presented here in this thesis.
The stroke space S in and of itself does not tell us anything about the strokes we want to identify. An obvious idea is to consider a subset T ⊂ S of all strokes that represent the symbols we want to recognise. And in practice, we will do that. However, we will approach this from a particular angle.
The main problem in HWR is, as was stated in Chapter 1, the many different strokes that may represent the same abstract line. These differences can come from inherently distinct ways to write a specific stroke — e.g. a 3 with or without a loop — and user-specific writing styles — e.g. whether the "bulges" of a 3 are rather round or pointy. Both of these variations and others can be seen in Figure 2.7. In order to model this, we assume that there is a likelihood for every stroke to represent a specific stroke type. Here and in the rest of the thesis, stroke type will stand for the abstract thing a user wants to draw on a touch surface, whereas strokes are the discretised data recorded.
Figure 2.7: Various ways to write 3.

Definition 2.2.3: Let A be a finite set which we call the alphabet. It represents the abstract lines or stroke types drawn on a touch surface. Then, a sample dispersal for A on S is a set D = D_{A,S} of fuzzy subsets of S indexed by A. I.e.,

$$D = \{\, m_l : S \to [0,1] \mid l \in A \,\}. \quad \triangle$$
Following the general idea of fuzzy sets, we will informally call a stroke s ∈ S a good sample for the stroke type l ∈ A if the value m_l(s) is high and, conversely, a bad sample if this value is low. Take for example the two strokes in Figure 2.8.
Figure 2.8: Strokes with different likelihoods of being a 3.
Here we see that the left one is clearly a 3, so its fuzzy value might be m₃(s_left) = 0.9. The right one might be confused with a ]-bracket by an algorithm, and in any case it is at best a very sloppy 3. So, we might have m₃(s_right) = 0.2. But this depends, of course, on the underlying alphabet. If the only lines we want to recognise are 3 and V, then the value m₃(s_right) should probably be much higher.
Definition 2.2.4: For an alphabet A and a sample dispersal D for A on S, call a function T : A → P(S) a collection of good samples if for all l ∈ A we have m_l(T(l)) ⊆ [λ_l, 1] for some predetermined thresholds λ_l ∈ [0, 1]. Moreover, denote the set T(l) by T_l. △
Since we do not know the concrete sample dispersal when we want to analyse strokes, we will only work with collections of good samples from now on. I.e., whenever we have such a collection, we assume that the strokes in the sets T_l represent the type l well. Nevertheless, we need to keep in mind that there is an underlying assessment of which strokes represent which types to which degree.
An important point for the practical application here is that two (or more) elements of the alphabet might stand for the same type. Looking at the different 3's in Figure 2.7 again, and depending on the particular classification method, it might be handy to see the first row as representative of one specific type 3a, the first stroke in the second row as a representative of a type 3b, and the very last one as 3c. Only at the very end of the recognition process do we then collapse all these cases into the type 3.

However, it might also be the other way around: for certain analyses and classification steps, it might be handy to merge several different symbols into one. For example, we will soon see that recognising strokes which represent straight lines or closed loops is quite easy. So, when looking at the Arabic numerals 0, ..., 9, we can model them in a first step via the alphabet A₁ = {straight, loop, other} and then sub-classify the loop-like symbols based on the alphabet A_loop = {0, 8}.
2.2.2 Features
Now, the main problem in handwriting recognition is, of course, that we do not know which strokes are good representatives for any type we are interested in. If we had the actual membership functions at our disposal for all types, we could check which line has the highest likelihood to fit the recorded stroke.
The best we can do is to approximate these likelihoods by considering certain measurements.
Definition 2.2.5: A feature f is a fuzzy set f : S → [0, 1] on S. We say that a stroke s has feature f if f(s) = 1 and that a stroke does not have feature f if f(s) = 0. △
So, features are the same objects as the elements of a sample dispersal: functions into [0, 1]. The difference is the point of view: for any given line a user wants to write on a touch surface, they have a respective set of good examples in mind — the super-level set {s ∈ S | m_l(s) ≥ λ_l} from Definition 2.2.4 — and then they try to produce one of these good examples on the touchscreen. So the hand of the user, which can be seen as a random variable, loosely follows the distribution indicated by the sample dispersal. Features, in contrast, are values that get computed for a recorded stroke and from which we want to infer which line it supposedly represents.
If we have a set of features f₁, ..., f_m, we sometimes consider it just as a collection; sometimes we consider the function

$$F = \begin{pmatrix} f_1 \\ \vdots \\ f_m \end{pmatrix} : S \to [0,1]^m.$$
The latter we call a feature vector. Note that both features and feature vectors introduced here are used in the same way as in general Machine Learning. However, we want to think of a feature not just as any function from the abstract object space S into the numerical space [0, 1], but as a concrete geometrical property of strokes. To illustrate this, we give such a feature concretely, and it will serve as the go-to example for the rest of this thesis.
Example 2.2.6: Define the function LENGTH : S → ℝ as

$$\mathrm{LENGTH}(s) := \sum_{i=1}^{n-1} \| P_{i+1} - P_i \|_2$$

and STRAIGHTNESS : S → [0, 1] as

$$\mathrm{STRAIGHTNESS}(s) := \frac{\| P_1 - P_n \|_2}{\mathrm{LENGTH}(s)}.$$
The latter only takes values in the interval [0, 1], due to the triangle inequality. On a more descriptive level: the numerator is the distance between start and end point of the stroke — a quantity describing the extensiveness of the stroke — and the denominator is the length of the stroke — describing the complexity of the stroke. This also makes it graphically clear that the image of STRAIGHTNESS lies in the interval [0, 1]. Furthermore, we have STRAIGHTNESS(s) = 1 if and only if s is a straight line and STRAIGHTNESS(s) = 0 if and only if s is a closed loop, i.e. when start and end point coincide. Because of that, we call this feature the straightness of a stroke.
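The two functions of Example 2.2.6 translate almost verbatim into code. This is a direct sketch of the formulas above; only the function names are lower-cased.

```python
# LENGTH: sum of Euclidean distances between consecutive points.
# STRAIGHTNESS: distance from start to end, divided by LENGTH.
from math import hypot

def length(stroke):
    return sum(hypot(q[0] - p[0], q[1] - p[1])
               for p, q in zip(stroke, stroke[1:]))

def straightness(stroke):
    p1, pn = stroke[0], stroke[-1]
    return hypot(pn[0] - p1[0], pn[1] - p1[1]) / length(stroke)

line = [(0, 0), (1, 0), (2, 0), (3, 0)]  # a perfectly straight stroke
loop = [(0, 0), (1, 1), (2, 0), (0, 0)]  # start and end point coincide
assert straightness(line) == 1.0
assert straightness(loop) == 0.0
```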
In general, the connection between different feature values can be very, very complicated. But straight lines are unique up to similarity maps. So, for many practical cases, we can deduce all feature values from the fact that STRAIGHTNESS(s) = 1, or often even when it is close to 1. Conversely, if STRAIGHTNESS(s) = 0 or at least very small, we might have to analyse the stroke separately. Again using similarity maps, we often normalise strokes by fixing their start and end point at certain positions — briefly discussed above after introducing stroke spaces — which often leads to degenerate cases when they coincide or are close to each other. △
The above description of features as fuzzy sets is enough to describe any feature-based handwriting recognition algorithm adequately: the real-valued functions that are used to compute geometric properties of strokes are often naturally bounded, especially when strokes are normalised into Q. Hence, any such function can be normalised to map into [0, 1]. There are, however, two reservations to this.
First, the largest function value we obtain might not guarantee the presence of the geometric property we want, and the lowest value might not indicate its absence. Thus, simply rescaling the image range [a, b] to [0, 1] via x ↦ (x − a)/(b − a) might collide with the standard interpretation of fuzzy sets. Second, there might be more than one feature value describing the desired property. We introduce the following notion to allow for a little more flexibility in handling these cases and to retain more information from the primary computations.
Definition 2.2.7: We call a function f : S → [−1, 1] a proto-feature. Any such proto-feature induces a feature on S in two canonical ways: first, the shifted feature f_/ : S → [0, 1] given by

$$f_/(s) := \tfrac{1}{2} \cdot f(s) + \tfrac{1}{2},$$

and second, the folded feature f_∨ : S → [0, 1] given by

$$f_\vee(s) := | f(s) |. \quad \triangle$$
The indices of these induced maps visualise the graphs of the transformation maps x ↦ ½x + ½ and x ↦ |x|, respectively. The reason we introduce this additional term is that in practice many functional terms for geometric properties naturally have image [−1, 1], as we will see in Section 2.3.3. Furthermore, this allows for a more nuanced interpretation of these functional values; we will see this in the next example. Before that, we present the general idea: note that both induced features take on the value 1 if the original proto-feature takes the value 1. So, the interpretation of a proto-feature taking the value 1 will be the same as with ordinary features: the stroke in question "has" the geometric property the proto-feature measures. The difference lies in strokes s with f(s) = −1.

When we shift a proto-feature f, we interpret f(s) = −1 as the stroke not having the property in question, i.e. we want f_/(s) = 0. When we fold a proto-feature, we see this property as taking shape in two polar opposite ways, indicated by f(s) = 1 and f(s) = −1, respectively. The value f(s) = 0 is then the one representing the absence of the property, since f_∨(s) = 0.
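The two constructions of Definition 2.2.7 amount to one-line transformations. The sketch below is ours and uses a constant proto-feature purely for illustration; the numeric behaviour at the poles −1, 0 and 1 matches the interpretation just described.

```python
# Shifted feature f_/ : maps [-1, 1] affinely onto [0, 1].
# Folded feature  f_v : maps both poles -1 and 1 to 1, and 0 to 0.
def shifted(f):
    return lambda s: 0.5 * f(s) + 0.5

def folded(f):
    return lambda s: abs(f(s))

proto = lambda s: -1.0  # a stroke exhibiting the "opposite pole" of the property
assert shifted(proto)("any stroke") == 0.0  # shifted: property is absent
assert folded(proto)("any stroke") == 1.0   # folded: property present, other pole
```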