Machine Learning of Fonts
Total Page:16
File Type:pdf, Size:1020Kb
Machine Learning of Fonts Antanas Kascenas MInf Project (Part 1) Report Master of Informatics School of Informatics University of Edinburgh 2017 3 Abstract Font kerning is the adjustment of the spacing between specific pairs of characters that appear next to each other in text. Kerning is important for achieving visually pleas- ing displayed text. Due to the quadratic growth of possible character pairs, manually kerning a font takes time and effort. This project explores the possibility of automatic kerning using machine learning. Ex- isting automatic kerning tools were discussed but no implementations using machine learning were found. Kerning specification in font files and relation to other font met- rics were researched. A dataset for supervised regression learning of the spacings be- tween character pairs was constructed. A neural network model predicting the proper spacing between the glyph bounding boxes given the whitespace shape between the glyphs and the font tracking was developed. The neural network model was evaluated on a test set as well as on standard fonts, both quantitatively and visually. The results beat the baseline performance and an existing implementation of automatic kerning in FontForge font toolset. 4 Acknowledgements First of all, I would like to thank my supervisor, Dr. Iain Murray, for proposing an interesting project topic, helpful feedback, useful thoughts and ideas. Secondly, I would also like to thank Justas Zemgulys for feedback on the visual quality of the intermediate results during the development of the project. Table of Contents 1 Introduction 9 1.1 Main contributions . 10 1.2 Structure of the report . 11 2 Background 13 2.1 Glyph metrics . 14 2.2 Existing solutions . 15 2.2.1 iKern . 15 2.2.2 TypeFacet Autokern . 15 2.2.3 FontForge . 15 2.2.4 Other software . 16 2.3 Discussion . 17 3 Problem set up and dataset creation 19 3.1 The regression problem . 19 3.2 Source of the Font Files . 19 3.3 Software for Parsing Font Data . 20 3.4 Extraction of the Spacing and Kerning Data . 20 3.5 Feature design . 21 3.6 Cleaning, Filtering and Further Processing . 22 4 Baselines and Initial Experiments 23 4.1 Outlier Detection . 23 4.1.1 Results . 23 4.2 Regression models . 24 4.3 Linear Regression . 24 4.3.1 Principal component analysis . 25 4.3.2 Increasing the complexity of the model . 25 4.3.3 Discussion . 26 4.4 Decision trees and random forests . 27 4.4.1 Experiments . 27 4.4.2 Discussion . 28 4.5 Nearest Neighbours . 28 4.5.1 Experiments . 28 4.5.2 Discussion . 28 4.6 Basic neural networks . 29 5 6 TABLE OF CONTENTS 4.6.1 Introduction . 29 4.6.2 Experiments . 30 4.6.3 Discussion . 30 4.7 Discussion . 31 5 Encoding font information 33 5.1 One-hot encoding of fonts . 33 5.1.1 Experiments . 34 5.1.2 Results . 34 5.1.3 Problems with one-hot encoding . 35 5.2 Separating kerning and tracking . 36 5.2.1 Experiments . 37 5.2.2 Discussion . 37 6 Deep neural networks 39 6.1 Deep feed-forward neural networks . 39 6.1.1 Experiment set-up . 40 6.1.2 Experiments and results . 40 6.1.3 Discussion . 40 6.2 Convolutional neural networks . 41 6.2.1 Experiments . 42 6.2.2 Results . 43 6.2.3 Discussion . 44 7 Evaluation 47 7.1 Quantitive evaluation . 47 7.1.1 Test error on held out glyph pairs . 47 7.1.2 Errors on standard fonts . 47 7.1.3 Discussion . 49 7.2 Comparison to FontForge automatic kerning functionality . 50 7.2.1 FontForge set-up . 50 7.2.2 Results and discussion . 50 7.3 Visual comparisons . 51 7.3.1 Implementation details . 51 7.3.2 Standard fonts . 51 7.3.3 Predicting from a set of glyphs . 52 8 Conclusion 55 8.1 Future work . 56 8.1.1 Dataset quality . 56 8.1.2 Glyph shape representation . 56 8.1.3 Different approaches . 56 9 Plan for next year 57 9.1 Problem description . 57 9.2 Existing tools and approaches . 57 9.3 Proposed approach . 58 TABLE OF CONTENTS 7 Bibliography 59 Appendices 63 A Visual comparisons 65 A.1 Standard fonts . 65 A.2 Improved fonts . 68 Chapter 1 Introduction One of the main ways to consume information is through reading printed or otherwise displayed text. It is accepted that the layout and other design aspects of the text content can affect engagement, focus, reading speed, eye strain, text comprehension and other characteristics [1, 2, 3]. One of the text design decisions is the choice of a typeface or a font. A typeface is a family of fonts containing different variations of a font such as bold, italic, condensed etc. (for the purposes of this project, no distinction between typeface and font is made). However, the process of designing a font itself has a lot of subtle decisions that can affect the final design quality. The focus of this project is the kerning and letter spacing aspect of the font design pro- cess. Kerning and letter spacing both affect the intraword spacing. More specifically, letter spacing is the assignment of whitespace on the sides of every character. Kerning is the adjustment is spacing between specific pairs of characters. Together, letter spacing and kerning are used to finely control the space between the letters in the text. A well designed font is more visually pleasing to read and does not attract unnecessary attention to the specifics of the spatial configuration of the letters in a word. There is research indicating that intraword and interword spacings can affect the reading speed and legibility [4, 5] as well as that proportional fonts can be read faster than monospaced fonts [6] (fonts where all characters take up the same amount of horizontal space). The difficulty with producing well spaced fonts is that due to varied shapes of the glyphs it is not enough to set how much space there should be on either side of a glyph. A glyph is a shape that usually corresponds to a character. A character can be a combination of a few glyphs but in Latin letters (which are used in this project) the correspondence is one to one. Placing equal amounts of whitespace between the bounding boxes of neighbouring glyphs does not produce text with spacing that is visually consistent. To obtain text where each glyph pair appears to be equally spaced, manual offsets need to be applied to some specific pairs. In other words, some pairs need to be kerned. The number 9 10 Chapter 1. Introduction of possible glyph pairs in a font grows as a square of the glyphs defined in the font. Therefore, it quickly becomes infeasible to kern all the possible pairs manually. In practice, letter spacing (assigning whitespace to each side of every glyph) is used to get as close to a good result as possible and kerning is used to fix the worst spaced pairs. These usually include capital letter pairs such as “AV” or upper-case and lower-case combinations such as “To”. The aim of this project is to get insight into whether it is possible to leverage already existing manually designed fonts (this project uses fonts collected in the Google Fonts repository [7]) with machine learning methods to create a process that would partly or fully automate the letter spacing and kerning stages of font design. 1.1 Main contributions • Research was done on how fonts are implemented in the currently widely used TrueType and OpenType font file formats. It was investigated how the relevant glyph spacing information is represented in the mentioned font formats. • Existing approaches to automatic font letter spacing and kerning were explored and discussed. • A supervised regression learning problem was set up to predict the proper spac- ing of glyph pairs given the description of the whitespace between the glyph outlines. • Tools were sought to handle fonts and allow the extraction of the spacing infor- mation. A process of extracting or changing spacing information in font files was implemented using FontForge [8]. Code was written to visually compare the text with original and changed spacings in a font. • A dataset from popularly used existing fonts was constructed and features de- scribing the whitespace designed. • Baselines for predicting the glyph pair spacing were set and compared with a variety of basic machine learning regression methods. • Problem description was modified to model the tracking of fonts (a constant spacing offset common to all the pairs) • Fully connected and convolutional neural network models were implemented and tuned to achieve reasonable results on the validation set. • The best model was evaluated on a test set as well as on fonts unseen during train- ing time. The results were also visually compared to the baseline solution and an existing FontForge automatic kerning algorithm. The proposed model beat the baseline and the FontForge implementation both in the quantitative evaluation and visual comparisons in the vast majority of cases. 1.2. Structure of the report 11 1.2 Structure of the report Chapter 2 presents the relevant background information including the main specifica- tions of spacing information in TrueType font file formats and the discussion on the existing automatic kerning and letter spacing solutions. Chapter 3 describes the regression problem in more detail as well as presents the rea- soning, design and the implementation details of constructing the dataset for the re- gression problem. Chapter 4 details the initial experiments, baselines, basic methods and their results on the regression problem as well as the problems with the approach.