Using Simulated Annealing to Find Increased Sound-Shape Systematicity in Parametric Fonts

Adam DePauw

Master of Science
Artificial Intelligence
School of Informatics
University of Edinburgh
2019

Abstract

A sound-shape systematicity exists between the English pronunciation of letters and the shapes of their characters in the Roman alphabet. This systematicity may contribute to improved outcomes for early readers, language learners, and those with dyslexia. Motivated by these facts, we describe two methods for modifying fonts to increase this systematicity. Applying a simulated annealing algorithm to parametric coordinates resulted in parameterizations that increased the sound-shape systematicity on average by 132% over the default. Training a neural network to generate font glyphs from a set of shape distances did not result in a significant increase in systematicity. We explore the experimental results for relationships that might aid future research into systematicity and font design. These show that thicker strokes, wider characters, and a smaller optical size are correlated with higher sound-shape systematicity. The data also gives evidence of a negative sound-shape correlation in monospaced fonts, but positive sound-shape correlation in proportional-spaced fonts. We introduce a reusable software library to enable additional experiments and a new data set for further research and validation.

Acknowledgements

I am grateful for the many enlightening conversations I have enjoyed with Dr. Richard Shillcock. This work could not have been done without his guidance and encouragement. Thank you to Hana Jee for laying the groundwork for this project and for freely sharing information about her approach. This project is entirely motivated by the groundbreaking work that she, Dr. Shillcock, and Dr. Monica Tamariz have done. I am grateful for Ari Anastassiou’s interest in this project and his feedback on ideas and approaches. The neural network experiments in this work are based on his suggestions, though any failures in their execution are surely mine. Credit goes to Leon Overweel for assistance in finding proper citations for several common software packages. Finally, thank you to my wife Dayna, who has done so much to enable me to complete this work.

Table of Contents

1 Introduction
  1.1 Motivation and Objective
  1.2 Contributions
  1.3 Outline

2 Background: Systematicity
  2.1 Language
  2.2 Learning
  2.3 Fonts

3 Rendering Glyphs

4 Systematicity in Fonts
  4.1 Sound and Shape Representation
  4.2 Distance Measures
  4.3 Correlation and Significance

5 Parametric Font Optimization
  5.1 Parametric Fonts
  5.2 Metaheuristics
  5.3 Font Set
  5.4 Methodology
  5.5 Results
  5.6 Parametric Font Glyphs

6 Fonts from Distances Using Neural Networks
  6.1 Data Set
  6.2 Pre-processing
  6.3 Training
  6.4 Inference Experiments
  6.5 Results

7 Discussion
  7.1 Parametric Fonts
  7.2 Neural Networks
  7.3 Limitations
  7.4 Future Work
    7.4.1 Systematicity
    7.4.2 Optimization
    7.4.3 Other Languages and Phonologies

8 Conclusions

Bibliography

A List of Parametric Fonts

Chapter 1

Introduction

1.1 Motivation and Objective

Language is full of statistical patterns called systematicity. Our brains make use of these patterns, leveraging them in the learning, production, and comprehension of language. The neurological mechanisms that accomplish this are not always clear, but the empirical evidence demonstrates the benefits of systematicity (Imai et al., 2008; Kantartzis et al., 2011). One lesser-known aspect of systematicity is a relationship between orthography, the visual shape of written language, and phonology, the sound of spoken language. This systematicity extends to the shapes of individual letters and their attendant sounds. The relationship can be measured in the form of a correlation between the visual and phonological distances between each shape and sound pair. While this systematicity has been known to exist in other languages, it has only recently been shown to exist in limited amounts in the English language (Jee et al., 2018).

Since other types of language systematicity are helpful, it may be that increased sound-shape systematicity would also be of benefit. It could provide benefit to early readers or to those who experience dyslexia. Our goal in this project is to explore the possibility of increasing this sound-shape systematicity for a given font. Little, if any, work has been done to explore how this particular type of systematicity may be of benefit. Our hope is that the creation of tools for improving this systematicity, and the identification of methods for doing so, will enable further research. These methods and tools could be used by researchers in linguistics, psychology, and education to explore the benefits of sound-shape systematicity in language. Gains in systematicity should be balanced with the appeal and readability of a font.


One could imagine substituting the characters of the Roman alphabet with a set of arbitrary shapes that correlate highly with the variation of sounds in the English language. Such a set of shapes, however, could be unattractive and entirely unrecognizable to readers. On the other hand, the gains in systematicity that may be achievable with traditional Roman characters are likely to be limited by fidelity to these shapes. For these reasons, both readability and systematicity gains must be kept in view.

To balance these goals, two approaches were selected to increase sound-shape systematicity. The first approach used a metaheuristic method called simulated annealing to find coordinates for parametric fonts that resulted in higher sound-shape systematicity. We chose this approach because it would result in readable and attractive fonts. Gains in systematicity, however, were limited by the constraints of the font design. In the second approach, a neural network was trained to produce glyphs from a set of shape distances. This approach was chosen for its greater potential for sound-shape systematicity gains, albeit at the cost of clarity and readability.

In addition to new methods for increasing systematicity, we expected to find factors that correlate with higher systematicity. Parametric fonts have variation axes with common definitions. This enabled us to compare these commonalities across the resulting systematicity measurements. These relationships could be of use to researchers exploring the benefits of sound-shape systematicity and to typographers and font designers.

1.2 Contributions

To our knowledge, this is among the first work to explore methods for increasing sound-shape systematicity in fonts. We expect our unique contributions to enable further research in this area. The chief contributions made in this work are:

• A successful demonstration of the simulated annealing metaheuristic to find parameterizations of parametric fonts that result in higher sound-shape systematicity. This approach resulted in a mean systematicity increase of 119% over the default parameterizations.

• A method for generating glyphs from shape distances using a fully-connected feedforward neural network. Our experimental results show that this does not result in fonts with increased sound-shape systematicity. We theorize that the method of training does not allow for generalization beyond the distance distributions in the training set.

• Empirical results showing font qualities that correlate with higher sound-shape systematicity. Specifically, we have found that font weights and widths are positively correlated and that font optical size is negatively correlated with this systematicity. We have also found that monospaced fonts demonstrate sound-shape anti-correlation, in contrast to proportional-spaced fonts.

• A software library1 that can be used for further research into sound-shape systematicity in fonts, including glyph rendering, distance calculations, and systematicity measurements.

• A data set2 of over 85,000 unique glyph sets and their resulting systematicity measurements that can be used for further research and validation.

1.3 Outline

The rest of this work is structured as follows. Chapter 2 provides a brief background on systematicity in language, its role in learning, prior work on sound-shape systematicity, and prior work on fonts and reading. Chapter 3 discusses our work creating a reusable glyph-rendering module and reviews some of the design decisions made. Chapter 4 details our method of measuring sound-shape systematicity and grounds our choices for visual representation and distance measurement. In chapter 5 we discuss the methodology and results of our experiments with parametric fonts. Chapter 6 discusses producing glyphs from shape distances using a neural network. Chapter 7 discusses our findings and their limitations and highlights promising ideas for future work. Chapter 8 provides a brief summary of and conclusion to our work.

1 https://github.com/adamdotdev/font-systematicity
2 https://doi.org/10.5281/zenodo.3369478

Chapter 2

Background: Systematicity

We begin with the background in psychology and linguistics that motivates our work. We first review examples of systematicity in language. We then discuss the effects of systematicity and arbitrariness on learning. Finally, we review prior work done in the area of font design and its impact on learning.

2.1 Language

It is well known that systematicity exists between aspects of language. An easily understandable example is the systematicity between syntax and semantics. If a fluent speaker understands the sentence “John chases the cat”, it must entail that they can also understand the sentence “the cat chases John” (Fodor and Pylyshyn, 1988). Likewise, if a fluent speaker hears “The zorp is broken”, they can easily make sense of or produce sentences like “Please call the zorp repair shop” and “The zorp has been fixed” even if they have never encountered the word “zorp” before (Hadley, 1994). Systematicity is also present between phonology and semantics. The sound of a word is not entirely independent of its meaning. This, in particular, is an area of linguistics that was long thought to be almost entirely arbitrary until cross-language studies of sound symbolism showed otherwise (Hinton et al., 2006). It has also been demonstrated that similar sounding words are more likely to have similar or related meanings (Monaghan et al., 2014). In English, for example, words that are related to the nose more often begin with “sn-” as in snore, sniff, snot, and snicker, and words related to light are more likely to begin with “gl-” as in glitter, glisten, glare, and glow. This systematicity cannot be fully explained by etymology or other language features. The effect, while small, is nonetheless significant.


There is also systematicity between phonology and syntax. In Spanish, words that sound similar are more likely to appear in similar syntactic contexts (Tamariz, 2008). Interestingly, while this systematicity is supported by consonants, vowels have the effect of reducing this systematicity. This may help to reduce confusion that may result from too much similarity in both meaning and context. The language systematicity we are concerned with is between phonology (sounds) and orthography (shapes). This sound-shape systematicity has been known to exist in some languages and orthographies for some time. The alphabet used in the Korean language was designed in the 15th century in part with this systematicity in mind (Kim-Renaud, 1997). Recently, it has been shown that a smaller but still significant amount of this systematicity exists for some commonly used English fonts (Jee et al., 2018). In this work, Jee et al. (2018) outline a specific definition of this systematicity. Distances are first measured between each possible pair of sounds and each possible pair of characters. The edit distance between two phonological vector representations is used for sounds, while a Hausdorff distance between discretized character images is used for shapes. The systematicity is then defined as the correlation between these two distance measures. A Monte Carlo permutation test is then performed to confirm the statistical significance. The pre-print results show significant results for some, but not all, English fonts. In short, visually similar letters in the Roman alphabet are more likely to have similar English sounds. To our knowledge, theirs is the first work to measure the sound-shape systematicity of English fonts.

2.2 Learning

The brain makes use of systematicity to both process and produce language. Systematicity has been shown to be especially helpful for early readers and language learners. Sound-meaning systematicity has been shown to help children learn new words (Imai et al., 2008; Kantartzis et al., 2011). Words that are commonly acquired earlier by children are more likely to demonstrate sound-meaning systematicity (Monaghan et al., 2014). It may also be the case that sound-shape systematicity would be more beneficial to early readers. Methods that can increase this systematicity would aid research in this area.

Arbitrariness in language is also helpful. As a learner’s vocabulary and knowledge expand, too much systematicity can lead to confusion. Too many words with too few differences in meaning, sound, or shape can present a barrier to learning and memory, requiring the brain to derive an output from too little input. Arbitrariness, then, provides additional inputs to allow a reader or listener to discriminate (Gasser, 2004). The deviation of vowels from the sound-syntax systematicity in Spanish (described above) may be an example of this effect, helping the brain to discriminate between otherwise similar sounding words in similar contexts (Tamariz, 2008). Since some amount of both systematicity and arbitrariness is helpful, neither a completely systematic nor a completely arbitrary relationship is likely to be best. The evidence that both systematicity and arbitrariness help with learning leads us to suspect that a font with complete correlation between sounds and shapes would not be ideal. For this reason, our goal in this project is to increase the sound-shape systematicity, but not to achieve perfect correlation.

2.3 Fonts

In this project, we aim to modify fonts to increase their sound-shape correlation. Phonology does change over time and can vary across cultures and locations. On the whole, however, orthography is much more malleable than phonology. Readers regularly encounter radically different fonts and typographical styles in various contexts. Books, websites, signage, and advertisements all employ different typographical styles, often with differing goals. New fonts are frequently created, and different styles of fonts can come into and fall out of fashion. Because of this malleability, changes to fonts that might increase sound-shape systematicity are much more likely to be successfully adopted than any attempted changes to phonology.

Previous research on fonts and their effects on learning has shown that different qualities can have different positive effects. Fonts have been shown in one study to help readers retain information (Gasser et al., 2005). Other work has found that sans serif, monospaced, Roman fonts increased reading performance and were preferred by readers with dyslexia (Rello and Baeza-Yates, 2013). The font Sassoon was created specifically for reading, and its designer intentionally incorporated feedback from children (Sassoon, 1993). Some work shows that fonts with more similar characters are read more slowly due to a visual “stripe” effect caused by similarities in adjacent strokes (Wilkins et al., 2007). The same author found the characters in Sassoon had this undesirable similarity and showed it resulted in slower reading speeds (Wilkins et al., 2009). To our knowledge, no work has been done to measure the effects of fonts with varying levels of sound-shape systematicity on readers and learners.

A good font can have many objectives, some of which may be in conflict. A font can be meant to communicate an aesthetic or message. A font needs to appear readable in a variety of media. In electronic media, this means remaining legible on a variety of displays and at a variety of sizes. In this work, we focus solely on the single goal of increasing sound-shape systematicity in typography, but emphasize that it is not an isolated goal.

Chapter 3

Rendering Glyphs

We have developed a software library written in Python 3.7 that renders a set of glyphs in a format appropriate for experimentation. The rendering of glyphs as bitmaps is done using the FreeType rendering library (Turner, D., Wilhelm, R., Lemberg, W., Podtelezhnikov, A., and Toshiya, S., 2019) with the freetype-py (Rougier, Nicolas P., 2019) library as a Python wrapper. FreeType was chosen because it is open source and widely used on some of the most popular consumer systems today, including iOS, Android, and the Chromium browser. Some parts of FreeType’s support for OpenType font variations were missing from the freetype-py library. We have extended that support and intend to submit it as an open-source contribution to freetype-py in the future. Glyphs can be generated together as sets using any combination of characters supported by Unicode, making the library usable for languages other than English. Glyphs may also be generated at any point size.

We developed a procedure for aligning glyphs together as a set. Some distance measures are sensitive to the alignment of two glyphs, and the choice of alignment will affect the resulting measurement. For vertical alignment, aligning along the baseline (the imaginary line running along the bottom of a line of text) is the obvious choice. Some lower-case characters have descenders that drop below the baseline and must be aligned accordingly. The choice for horizontal alignment is less obvious. Options include left, right, and center alignment. Another possibility is to align major vertical strokes or arc centers, an idea we have not explored further. We aligned our glyphs vertically along the baseline and centered them horizontally. Center alignment was chosen because it most often aligns the most visually prominent parts of glyphs while minimizing the impact of serifs and other smaller embellishments on major strokes and curves.
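As an illustration of the rendering step, below is a minimal sketch using freetype-py. It thresholds FreeType's default grayscale output to obtain a binary bitmap, whereas our library renders in monochrome and, for variable fonts, uses the extended axis-coordinate support described above; the font file name is hypothetical.

    import freetype
    import numpy as np

    def render_glyph(font_path, char, point_size=12, dpi=96, threshold=128):
        """Render a single character to a binary (black/white) bitmap."""
        face = freetype.Face(font_path)
        # Character size is specified in 1/64ths of a point.
        face.set_char_size(point_size * 64, 0, dpi, dpi)
        face.load_char(char, freetype.FT_LOAD_RENDER)
        bmp = face.glyph.bitmap
        # The grayscale buffer is rows x pitch bytes; crop row padding to the glyph width.
        pixels = np.array(bmp.buffer, dtype=np.uint8).reshape(bmp.rows, bmp.pitch)[:, :bmp.width]
        # Threshold to black/white, approximating the monochrome rendering used in this work.
        return (pixels >= threshold).astype(np.uint8)

    # bitmap = render_glyph("Amstelvar-Roman.ttf", "a", point_size=24)  # hypothetical font file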


Figure 3.1: Magnified examples of outline differences in 11-pt Calibri bitmaps. The first glyph in each pair was displayed on an RGB LCD monitor using sub-pixel rendering, while the second glyph in each pair was rendered in monochrome by our library using FreeType. Notice the sub-pixel rendering has resulted in redder pixels on the left and bluer pixels on the right due to the dimming of the color nearest to the glyph edges.

When a glyph is rendered to a display it is represented as a set of pixels. To make glyph outlines appear smooth, the edges are often shades of gray rather than solid black, softening the transition from black to white (assuming black text on a white background). At small font sizes the number of pixels available on the display can be quite small, so sub-pixel rendering is applied. Sub-pixel rendering takes advantage of the fact that pixels on displays like LCD and OLED are subdivided into individual color parts. By activating the color subpixels closest to the glyph outline, the character appears smoother. These colors are not normally detectable to the human eye but can be seen under magnification (see figure 3.1). In our work, we use only black and white pixels in order to have a clear point where the edge of the glyph begins and ends. This leads to more jagged-appearing edges than one would actually experience, but provides a convenient binary definition of the shape, as outlined later.

To avoid repeated generation of glyphs and to efficiently enable repeated calculations using different distance measures over different experiments, a data store was created using a SQLite database (Hipp, D., 2018). This data store includes the following (a minimal schema sketch appears after the list):

• Font data.

• Glyph bitmaps rendered at various point sizes and using various axis coordinates.

• The measured distances for shapes and sounds using multiple distance measures.

• The sound-shape correlation measure for each glyph set and distance metric.
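The sketch below shows one way the core tables could be created with Python's built-in sqlite3 module; field names follow figure 3.2, but the published data set's actual schema may differ in details.

    import sqlite3

    conn = sqlite3.connect("systematicity.db")  # hypothetical file name
    conn.executescript("""
    CREATE TABLE IF NOT EXISTS Font (
        id INTEGER PRIMARY KEY,
        name TEXT, file_name TEXT, font_file BLOB,
        is_variable INTEGER, axes TEXT
    );
    CREATE TABLE IF NOT EXISTS GlyphSet (
        id INTEGER PRIMARY KEY,
        font_id INTEGER REFERENCES Font(id),
        size INTEGER, coords TEXT
    );
    CREATE TABLE IF NOT EXISTS Glyph (
        id INTEGER PRIMARY KEY,
        glyph_set_id INTEGER REFERENCES GlyphSet(id),
        character TEXT, bitmap BLOB
    );
    CREATE TABLE IF NOT EXISTS ShapeDistance (
        id INTEGER PRIMARY KEY,
        glyph1_id INTEGER REFERENCES Glyph(id),
        glyph2_id INTEGER REFERENCES Glyph(id),
        metric TEXT, distance REAL
    );
    """)
    conn.commit()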

Figure 3.2: Entity-relationship diagram of the data set. Some fields have been omitted for simplicity. The key entities and fields are: Font (id, name, file_name, font_file, is_variable, axes); GlyphSet (id, font_id, size, coords); Glyph (id, glyph_set_id, character, bitmap); ShapeDistance (id, glyph1_id, glyph2_id, metric, distance); SoundDistance (id, char1, char2, metric, distance); Correlation (id, glyph_set_id, shape_metric, sound_metric).

An entity-relationship diagram of the key tables and fields in this database can be found in figure 3.2. A clean data store can be generated and repopulated for future experiments using the software library. The results of these experiments have also been published in this format for further work and analysis.

We limited our measurements in all experiments to lower-case glyphs. We chose to focus on one letter case for simplicity, and used lower-case over upper-case characters because they comprise the majority of text in most contexts. It is worth noting, however, that some of the fonts tested used glyphs in the lower-case positions that have the appearance of traditional upper-case characters.

Chapter 4

Systematicity in Fonts

The systematicity we seek to maximize is a correlation between the sound (phonology) of the letters and the shape (orthography) of the characters. In this chapter, we discuss how we represented sounds and shapes. We then review possible distance metrics and justify our choices. Finally, we discuss the correlation measure and describe measuring significance using a Monte Carlo permutation test.

4.1 Sound and Shape Representation

A phonological representation is necessarily language-dependent. For English, we followed Jee et al. (2018) and used the representation defined by Harm and Seidenberg (1999). In this definition, each sound is represented by an 11-dimensional vector, where each element represents a phonetic feature. Each feature takes a value between 1 and -1. Most of the features defined are binary, taking on 1 or -1, or ternary, taking on 1, 0, or -1. The sole exception is sonority, which can take continuous values. A single letter can represent multiple sounds. For instance, the letter c can indicate either the /k/ or /s/ sound. To simplify our correlation measures, we take only a single pronunciation of each letter, using the sound first learned by British school children as the defining sound for each letter (Lloyd et al., 1998). We also omit q and x from our analysis as they are multi-phoneme sounds, leaving us with 24 phonetic representations. We again follow the recommendations of Jee et al. (2018) in both of these regards. While this phonological definition for English was used in our work, another phonological definition could be substituted, either for English or another language. The only requirement is that the phonology can be represented as a set of feature vectors.


The visual representations used were black-and-white images generated using the software library described in chapter 3. Each set contained 24 bitmaps, each of which was paired to a single phonological definition. Glyphs were generated at various font sizes. Each font could produce a unique visual representation of a given letter, and bitmaps for a given font and letter varied by size.

4.2 Distance Measures

All possible pairs of sounds and shapes were generated, a total of 274 pairs of each type, and distances between each pair were measured. The choice of distance measure used at this stage made a difference in the systematicity measured. Some possible options are discussed below, and the choice for these experiments defined.

As discussed above, sounds were represented by 11-dimensional feature vectors representing the phonological traits of the given sound. We measured both the feature edit distance and the Euclidean distance between these vectors. The feature edit distance in our context involves only the number of substitutions needed to make one vector identical to another, since insertions and deletions are not necessary. Figure 4.1 shows the distribution of correlations measured across all glyph sets for each distance measure. The feature edit distance reflected somewhat higher systematicity. Because the exact mechanism by which the brain utilizes systematicity is not always clear, any evidence of systematicity could prove to be helpful. For this reason, we used the sound edit distance for all experiments and results reported, except for those in figure 4.1.

For shapes, the measure of visual similarity is less straightforward. There are a large number of methods for measuring visual similarity in the literature (see Veltkamp (2001) for an introduction to a few). The methods range from an overly simplistic binary choice to complex representational approaches such as training a neural network to learn visual features. Many definitions of visual similarity are intended for specific further tasks such as classification and matching. Each of the approaches measures a different aspect of visual similarity and might be more or less suitable for a given task. Two measures that seem particularly useful for fonts are outline-based measures and the Hausdorff metric.

Outline-based methods such as the Fréchet distance (Har-Peled and Raichel, 2014) and the turning function distance (Chehreghan and Ali Abbaspour, 2017) trace the outlines of shapes, measuring the distance and changes in direction. These outline-based metrics could be useful as fonts are often defined as Bézier curves to enable smooth scaling to different sizes.


Figure 4.1: Histogram of the resulting correlations using the edit and Euclidean distances between sounds. Edit mean: 0.062, std dev: 0.065; Euclidean mean: 0.020, std dev: 0.069.

These curve-based definitions ultimately are rendered into a discrete set of points (pixels) by the font rendering engine. The exact shape thus varies slightly depending on the text size and display resolution. This difference in shape leads to different distance measures and different resulting sound-shape systematicity. Using an outline-based distance method would eliminate this variation and provide a single measure for a given font. It would, however, ignore choices made by the font rendering engine that result in real differences in the glyphs viewed by users.

The Hausdorff metric is widely used for image and shape matching algorithms. It operates on a set of points representing the shape. For our purposes, we can think of each pixel of a character as a point in the shape of that character. The directed Hausdorff distance is defined as follows:

Given two sets of points that make up two shapes, $A = \{a_1,\dots,a_n\}$ and $B = \{b_1,\dots,b_n\}$, where $A, B \subset \mathbb{R}^d$, the directed Hausdorff distance between them is given by:

$$\vec{d}_h(A,B) = \max_{a \in A} \min_{b \in B} |a - b|$$

where $|a - b|$ gives the distance between two individual points using a base distance metric such as the Euclidean distance. The directed Hausdorff distance is not symmetric, that is $\vec{d}_h(A,B) \neq \vec{d}_h(B,A)$, which is not sufficient to define a metric space and is not desirable for our purposes. To solve this, the full Hausdorff metric is given by:

$$H(A,B) = \max\big(\vec{d}_h(A,B),\ \vec{d}_h(B,A)\big)$$

The Hausdorff distance is greatly affected by outlying points. Two alternatives reduce this dependence: the mean Hausdorff distance and the partial Hausdorff distance. The mean Hausdorff distance seeks to minimize outlier impact by replacing the max and min with an average over distances (Baddeley, 1992). The partial Hausdorff distance minimizes outlier impact by ordering the point distances $|a - b|$ and then taking the k-th largest distance rather than the max, for some value of k (Huttenlocher et al., 1993). For our work, we selected the Hausdorff distance as our visual distance measure because it allows us to work directly with rendered glyphs as displayed to users and can be computed in near-linear time (Taha and Hanbury, 2015). Individual pixel locations are used as points in the Hausdorff calculation. Because font glyphs are made of continuous strokes, individual outlying pixels are less of a concern, and so we chose the Hausdorff distance as defined above, without using either of the outlier-mitigating variants. See figure 4.2 for samples of the Hausdorff metric using glyph bitmaps.
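As a sketch of this computation using SciPy (here, bitmap_a and bitmap_b are assumed to be 2-D binary arrays such as those produced by the rendering library in chapter 3):

    import numpy as np
    from scipy.spatial.distance import directed_hausdorff

    def hausdorff(bitmap_a, bitmap_b):
        """Symmetric Hausdorff distance between two binary glyph bitmaps."""
        # Treat each black pixel as a point (row, column) in the glyph's shape.
        points_a = np.argwhere(bitmap_a > 0)
        points_b = np.argwhere(bitmap_b > 0)
        # directed_hausdorff returns (distance, index_a, index_b); keep only the distance.
        d_ab = directed_hausdorff(points_a, points_b)[0]
        d_ba = directed_hausdorff(points_b, points_a)[0]
        return max(d_ab, d_ba)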

Figure 4.2: Hausdorff distances of selected 96-pt Amstelvar Roman glyph pairs, ordered by magnitude: (a) j-m: 39.1, (b) l-w: 34.7, (c) e-h: 30.8, (d) a-l: 26.0, (e) c-i: 21.0, (f) p-y: 15.2, (g) n-u: 9.4, (h) i-l: 8.0. The green arrow connects the determining points and shows the direction of the underlying directed Hausdorff distance.

4.3 Correlation and Significance

The final step in measuring sound-shape systematicity was to measure the Pearson’s correlation coefficient (Pearson’s r) of the sound and shape distances. This measures the extent to which there is a linear relationship between the sounds and shapes. It is also possible that a non-linear relationship exists between sounds and shapes, but that was not explored here. The significance of the correlation measure was then validated by performing a Monte Carlo permutation test (Dwass, 1957). In this procedure, the correlation of the actual distances is first measured as a test statistic. The sound and shape distances are then randomly and independently shuffled and the correlation re-measured. We repeated this sampling procedure 25,000 times for a given glyph set. The proportion of measured correlation coefficients that exceeded the test statistic was reported, with p-values under 0.05 considered significant.
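A sketch of this procedure, assuming shape_distances and sound_distances are equal-length NumPy arrays of pairwise distances for one glyph set (shuffling one array relative to the other is equivalent to shuffling both independently):

    import numpy as np
    from scipy.stats import pearsonr

    def correlation_with_significance(shape_distances, sound_distances,
                                      n_samples=25_000, seed=0):
        """Return observed Pearson's r and its Monte Carlo permutation p-value."""
        rng = np.random.default_rng(seed)
        observed, _ = pearsonr(shape_distances, sound_distances)
        exceed = 0
        for _ in range(n_samples):
            # Shuffle the pairing between sound and shape distances and re-measure.
            r, _ = pearsonr(shape_distances, rng.permutation(sound_distances))
            if r >= observed:
                exceed += 1
        return observed, exceed / n_samples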

Chapter 5

Parametric Font Optimization

In this chapter, we give a brief introduction to parametric fonts generally and to OpenType font variations specifically. We then discuss our choice of simulated annealing as an optimization procedure and how it was applied to find increased sound-shape systematicity. We describe the set of fonts used in our experiments and our experimental methodology. Finally, we review the experimental results.

5.1 Parametric Fonts

Parametric fonts are computer fonts that can produce different glyphs for the same letter based upon a set of input parameters. The earliest computer fonts were bitmap fonts, where each glyph was defined as a set of pixels. Bitmap fonts did not scale well to different sizes. This led to outline fonts replacing bitmap fonts. Outline fonts define glyphs as outlines using Bézier curves. Outline fonts enabled a single glyph definition to scale smoothly to any number of sizes and devices with the help of a rendering engine. If a different version of a glyph was desired, a new outline had to be defined and a new font created. This led to multiple fonts being created under a single typeface name. For example, there are six fonts in the Calibri typeface: bold, italic, bold italic, light, light italic, and regular.

Parametric fonts contain a single definition that can produce a variety of glyphs. The appearance of the glyphs can vary along a collection of predefined attributes called axes. Each axis represents a continuous variation of the font. Common examples are the glyph width, the character angle, or the stroke width in a particular direction. Less common variations include things like the style of serif or the rounding of edges.


Early incarnations of parametric fonts included Metafont (Knuth, 1986), Infinifont (McQueen III and Beausoleil, 1993), and the Multiple Master fonts of the 1990s. Most recently, version 1.8 of the OpenType standard introduced font variations in a joint effort between Microsoft, Apple, Google, and Adobe (Microsoft, 2018). OpenType parametric fonts can have one or more variation axes defined. Each axis must have a minimum and maximum value, but can vary continuously within that range. Each axis also has a default value used when no custom axis values are supplied by the rendering engine. The standard defines five registered axis definitions: weight, width, optical size, italic, and slant. Each standard axis has a definition designers should adhere to, but the actual implementation is dependent on the designer. For example, changing the width axis of the Bandeins Strange typeface modifies the width of only selected letters of the alphabet, while leaving the others unchanged (see figure 5.11). Each standard axis has specific requirements: some may require a specific range or default value, while others leave this to the designer to define. Custom axis definitions can also be created, enabling professional typographers to create fonts with a wide range of variations. OpenType has been a widely used standard for fonts for many years and is supported by virtually all major platforms. Support for the font variations defined in version 1.8 has been added to many major rendering libraries. The Web Open Font Format (WOFF) 2.0 also has full support for OpenType font variations, enabling parametric fonts to be used on the web.
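To make the axis mechanism concrete, the sketch below pins axis values of a variable font to produce a static instance using the fontTools library. This is not the rendering path used in our experiments (which goes through FreeType), and the file name and axis values are hypothetical.

    from fontTools import ttLib
    from fontTools.varLib import instancer

    # Load a variable (OpenType 1.8+) font and list its variation axes.
    font = ttLib.TTFont("SomeVariableFont.ttf")  # hypothetical file name
    for axis in font["fvar"].axes:
        print(axis.axisTag, axis.minValue, axis.defaultValue, axis.maxValue)

    # Pin the weight and width axes to produce a single static instance.
    static = instancer.instantiateVariableFont(font, {"wght": 700, "wdth": 110})
    static.save("SomeFont-wght700-wdth110.ttf")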

5.2 Metaheuristics

The aim was to find coordinates of a parametric font that yield increased sound-shape systematicity. This systematicity is a function of the parameterization-to-glyph output, of the Hausdorff and feature edit distances, and of the correlation between those distances. The function that produces glyphs from the font parameterization, while fully defined by the internals of the font rendering engine and the font itself, is sufficiently complex and individualized as to be considered a black box for our purposes. For this reason, optimization methods that require a differentiable function, such as gradient ascent, are not useful.

Metaheuristic methods are approaches that combine an optimization procedure with a stochastic method to find optima for a given function. They are particularly useful for non-invertible and non-differentiable functions, as gradients are not required. Metaheuristic methods rarely provide a guarantee of finding global optima. Commonly used examples of these methods include harmony search (Woo Geem et al., 2001), genetic algorithms (Melanie, 1999), tabu search, quantum annealing (Ohzeki and Nishimori, 2011), and simulated annealing (Kirkpatrick et al., 1983).

Figure 5.1: Examples of variants of a lower-case p in the OpenType parametric font Amstelvar Roman. Each glyph was created using a different pair of values for the axes controlling the x-thickness and y-thickness.

Each of these methods works differently, and choosing one is not always straightforward. We selected simulated annealing because it is a classical method that has been shown to find high-quality solutions quickly (Antosiewicz et al., 2013).

Simulated annealing was inspired by the process of cooling molten metals at a controlled rate in order to achieve a desired physical structure. Like most metaheuristic algorithms, simulated annealing starts with an initial solution and a mechanism for choosing a candidate solution stochastically from within some neighborhood of the current solution. Using this process, a new candidate is generated repeatedly. If the candidate provides a better outcome than the current solution, it is automatically accepted. Otherwise, it is accepted with some probability that depends on both the quality of the outcome and on a variable called the temperature. This temperature is decreased over time, reducing the probability that a solution producing a sub-optimal outcome is accepted. This has the effect of increasing exploration initially to avoid local optima while becoming progressively more greedy in the later stages as better solutions are found.
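The acceptance rule is commonly formalized with the Metropolis criterion; the thesis text does not state its exact form, but for a maximized objective $f$ (here, the sound-shape correlation), current solution $x$, candidate $x'$, and temperature $T$, a standard formulation is:

$$P(\text{accept } x') = \begin{cases} 1 & \text{if } f(x') \geq f(x) \\ \exp\!\left(\dfrac{f(x') - f(x)}{T}\right) & \text{otherwise.} \end{cases}$$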

Axis          Count
Weight        42
Width         12
Optical Size   3
Italic         1
Slant          1
Custom        54

Table 5.1: Number of variation axes in the data set, by type.

Number of axes   Number of fonts
1                29
2                11
3                 3
6                 2
12                1
15                2

Table 5.2: Number of fonts by variation axis count.

5.3 Font Set

The test set consisted of 47 parametric fonts. All selected fonts were open-source. Creators ranged from large corporations such as Adobe and IBM to individual creators. The complete list with samples can be found in section 5.6. These fonts span a variety of styles. Many are standard serif and sans-serif fonts, while others are more experimental and stylistically unusual. Three were monospaced fonts, that is, fonts with all characters the same width. The number of design axes varied from one to fifteen per font (see table 5.2). The standard weight axis was the most common variation axis, but custom variation axes outnumbered any single standard axis, accounting for almost half of all axes. Axis counts can be seen in table 5.1.

5.4 Methodology

An initial hyperparameter search was done using the simulated annealing algorithm and a variety of temperatures. A temperature of 0.2 yielded the best results. The algorithm was then run using a temperature of 0.2, a time budget of 500 iterations, and a linear annealing rate. The algorithm was run for each parametric font at both 12-pt and 24-pt sizes using a resolution of 96 dots per inch (DPI). The default axis values for each font were used as the initial solution. The sound-shape systematicity was measured at this parameterization and used as the baseline. At each iteration, a solution candidate was generated by altering each axis value by adding a quantity selected at random from a Gaussian distribution with mean 0 and variance of 10% of the axis range. A set of glyphs was generated using the candidate solution and then evaluated in terms of the resulting sound-shape systematicity. All glyph sets, distances, and systematicity measurements generated in these experiments were stored in the SQLite data store for later analysis.

Two of the fonts from the original test set were omitted from the 12-pt size due to issues during the experiment. In both cases, one or more glyphs failed to render any points at one or more axis coordinates. This made the distance and resulting correlation impossible to calculate without substituting some approximation.

It was observed in early trials that the three monospaced fonts in our font set demonstrated a statistically significant anti-correlation between sound and shape. All other, proportional, fonts showed either correlation or statistically insignificant results. Because anti-correlation is also a form of systematicity, we chose to optimize for anti-correlation with monospaced fonts and for correlation with proportional fonts.
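A minimal sketch of this procedure follows; systematicity stands in for the black-box measurement pipeline (render glyphs, compute distances, correlate), and axis_ranges and defaults for the font's variation-axis metadata, none of which are defined here. The text specifies Gaussian noise with variance equal to 10% of the axis range; for simplicity the sketch uses that quantity directly as the noise scale.

    import math
    import random

    def anneal(systematicity, axis_ranges, defaults,
               t0=0.2, iterations=500, noise_frac=0.10):
        """Simulated annealing over parametric-font axis coordinates."""
        current = list(defaults)
        current_score = systematicity(current)
        best, best_score = list(current), current_score

        for i in range(iterations):
            # Linear annealing schedule: temperature falls from t0 toward zero.
            t = t0 * (1 - i / iterations)

            # Perturb every axis with Gaussian noise scaled to its range,
            # clamping the result back into the legal interval.
            candidate = [min(max(v + random.gauss(0, noise_frac * (hi - lo)), lo), hi)
                         for v, (lo, hi) in zip(current, axis_ranges)]
            score = systematicity(candidate)

            # Accept improvements outright; accept worse candidates with a
            # temperature-dependent probability (Metropolis criterion).
            if score > current_score or random.random() < math.exp((score - current_score) / t):
                current, current_score = candidate, score
                if score > best_score:
                    best, best_score = list(candidate), score

        return best, best_score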

5.5 Results

The simulated annealing algorithm produced a parameterization that resulted in higher sound-shape correlation for 41 of 42 fonts tested at 12-pt size, and for 42 of 44 fonts at 24-pt size. Of the resulting correlations, 32 were statistically significant (p ≤ 0.05) at 12-pt size, and 25 were statistically significant at 24-pt size. The comparison for each font can be found in figures 5.2 and 5.3. The best parameterizations found using simulated annealing increased the mean sound-shape correlation over all fonts and sizes from 0.053 to 0.123, an increase of 132%. The maximum systematicity found increased from 0.129 to 0.232. The distributions are visualized in figure 5.4.

Having observed increases in systematicity, we looked at whether the font variation axes could give us information about the factors that might contribute to these increases. While the definition, range, and application of axes are entirely up to the font designer, the standard axes defined by the OpenType specification are applied sufficiently consistently to allow us to observe how changes in the systematicity are impacted by changes along these axes. Of the standard axes, only weight, width, and optical size were represented by more than one font in the test set.

Over 85,000 unique axis coordinates were used to produce and store glyph sets from parametric fonts over the course of these experiments. For each glyph set, we standardized each axis coordinate to a value between 0 and 100 based on the percentile of the possible range of that axis. As an example, a weight of 24 on an axis ranging from 20 to 40 and a weight of 200 on an axis ranging from 0 to 1000 were both standardized to a value of 20.


Figure 5.2: The sound-shape correlation for each font at 12-pt size using the default parameterization and using the best parameterization found using the simulated annealing algorithm.


Figure 5.3: The sound-shape correlation for each font at 24-pt size using the default parameterization and using the best parameterization found using the simulated annealing algorithm.


Figure 5.4: Comparison of the sound-shape correlation distributions resulting from the default parameterizations (left) and the best parameterizations found (right). Includes results from all fonts at all sizes.

Similar values do not necessarily represent similar visual appearances, but rather relative positions within the axis range and their relationship to the resulting systematicity. The correlation of these standardized axis values was then measured against the resulting sound-shape systematicity values. The results are in figure 5.5. The standardized axis values were also categorized into 50 ordinal buckets, and the mean of all correlations measured from glyph sets using parameterizations within that bucket was plotted. The results are in figure 5.6. Note that the measured correlation will vary for the same axis value and the same font because the variance of other axes may contribute to the measured systematicity.

Axis          r-value   p-value
Optical Size  -0.153    < 0.0001
Weight         0.121    < 0.0001
Width          0.225    < 0.0001

Figure 5.5: Correlation of common axes with resulting systematicity measurements over all glyph sets generated.

The width and weight axes both showed a positive correlation with the resulting systematicity, while optical size showed a negative correlation. Put another way, heavier strokes, wider characters, and smaller optical size seem to result in higher sound-shape correlation, while thinner strokes, narrower characters, and larger optical size tend to result in lower sound-shape correlation.

(a) Weight    (b) Width    (c) Optical Size

Figure 5.6: Variation axis values and resulting correlations over all glyph sets created.

Width, in particular, had the most significant correlation of any axis observed.

The 24-pt glyphs for each font used are shown in section 5.6. Each font shows the default parameterization and the parameterization yielding the highest sound-shape correlation for that font. A few qualitative observations can be made from these glyphs. Visual inspection seems to confirm the relationship between weight, width, and the resulting systematicity; the improved glyphs are often wider and heavier. It should be noted that the very highest systematicity measurements come from non-standard fonts. The best parameterizations for the two Decovar variants, in particular, seem to have resulted in distorted glyphs the designers likely did not intend. These may indicate that more radical changes in typography would be required to achieve particularly high levels of sound-shape correlation. The Bandeins Strange Variable font is unique in that modifying the width axis only affects the lower-case letters o, r, and t. Even still, increasing the width (but not maximizing it) nearly doubled the systematicity of that font, even though the other glyphs were unchanged.

5.6 Parametric Font Glyphs

This section contains a sample of the 24-pt monochrome glyph bitmaps generated for these experiments. The glyphs in the top row were generated using the font’s default parameterization, while those in the bottom row were generated using the parameterization that resulted in the best systematicity measurement. In some cases there may be more than one parameterization that resulted in the best measurement, in which case only one parameterization is shown. The correlation coefficients are displayed below the glyphs. The fonts are listed in order of their best systematicity measure. The three monospaced fonts can be found at the end of this list.

Figure 5.7: Decovar Subset. Top: r=0.024, p=0.3436 Bottom: r=0.213, p=0.0002

Figure 5.8: Nu Alfabet. Top: r=-0.050, p=0.7981 Bottom: r=0.201, p=0.0006

Figure 5.9: Decovar. Top: r=0.024, p=0.3484 Bottom: r=0.188, p=0.0006

Figure 5.10: Source Sans Variable Italic. Top: r=0.038, p=0.2650 Bottom: r=0.164, p=0.0030

Figure 5.11: Bandeins Strange Variable. Top: r=0.080, p=0.0959 Bottom: r=0.156, p=0.0041

Figure 5.12: Graduate. Top: r=0.074, p=0.1112 Bottom: r=0.155, p=0.0043

Figure 5.13: Voto Serif. Top: r=0.107, p=0.0374 Bottom: r=0.152, p=0.0064

Figure 5.14: Amstelvar Roman. Top: r=0.044, p=0.2324 Bottom: r=0.151, p=0.0071

Figure 5.15: Inter. Top: r=0.092, p=0.0637 Bottom: r=0.145, p=0.0088

Figure 5.16: Libre Franklin Italics. Top: r=0.072, p=0.1198 Bottom: r=0.141, p=0.0094

Figure 5.17: PT Root UI. Top: r=0.124, p=0.0192 Bottom: r=0.132, p=0.0128

Figure 5.18: Work Sans Italic. Top: r=0.081, p=0.0915 Bottom: r=0.128, p=0.0160

Figure 5.19: Inter Upright. Top: r=0.092, p=0.0607 Bottom: r=0.128, p=0.0157

Figure 5.20: Changa. Top: r=0.044, p=0.2389 Bottom: r=0.128, p=0.0168

Figure 5.21: Titillium Web Italic. Top: r=0.127, p=0.0174 Bottom: r=0.127, p=0.0215

Figure 5.22: Public Sans Italic. Top: r=0.048, p=0.2158 Bottom: r=0.124, p=0.0191

Figure 5.23: Secuela Regular. Top: r=0.091, p=0.0652 Bottom: r=0.122, p=0.0217

Figure 5.24: Inter Italic. Top: r=0.100, p=0.0472 Bottom: r=0.122, p=0.0226

Figure 5.25: Work Sans Roman. Top: r=0.075, p=0.1085 Bottom: r=0.120, p=0.0246

Figure 5.26: Public Sans Roman. Top: r=0.085, p=0.0800 Bottom: r=0.120, p=0.0236

Figure 5.27: Secuela Italic. Top: r=0.072, p=0.1141 Bottom: r=0.119, p=0.0237

Figure 5.28: Hepta Slab. Top: r=0.071, p=0.1169 Bottom: r=0.114, p=0.0275

Figure 5.29: Kairos Sans. Top: r=0.013, p=0.4089 Bottom: r=0.113, p=0.0289

Figure 5.30: Bahnschrift. Top: r=0.102, p=0.0467 Bottom: r=0.112, p=0.0325

Figure 5.31: Source Sans Variable Roman. Top: r=0.033, p=0.2909 Bottom: r=0.111, p=0.0325

Figure 5.32: Bandeins Sans Variable. Top: r=0.090, p=0.0690 Bottom: r=0.107, p=0.0357

Figure 5.33: Libre Franklin Roman. Top: r=0.053, p=0.1948 Bottom: r=0.104, p=0.0428

Figure 5.34: IBM Plex Sans Italic. Top: r=0.062, p=0.1551 Bottom: r=0.104, p=0.0440

Figure 5.35: Adobe Variable Font Prototype. Top: r=0.021, p=0.3670 Bottom: r=0.101, p=0.0487

Figure 5.36: Cabin. Top: r=0.060, p=0.1590 Bottom: r=0.098, p=0.0505

Figure 5.37: Source Serif Variable Roman. Top: r=0.022, p=0.3585 Bottom: r=0.097, p=0.0548

Figure 5.38: Titillium Web Roman. Top: r=0.065, p=0.1425 Bottom: r=0.094, p=0.0608

Figure 5.39: Kayak. Top: r=0.071, p=0.1227 Bottom: r=0.093, p=0.0620

Figure 5.40: Cabin Italic. Top: r=0.067, p=0.1352 Bottom: r=0.089, p=0.0733

Figure 5.41: dT Jakob. Top: r=0.079, p=0.0970 Bottom: r=0.079, p=0.0952

Figure 5.42: IBM Plex Sans Roman. Top: r=0.034, p=0.2881 Bottom: r=0.068, p=0.1300

Figure 5.43: Source Code Variable Italic. Top: r=0.056, p=0.1784 Bottom: r=0.067, p=0.1330

Figure 5.44: Movement. Top: r=0.037, p=0.5431 Bottom: r=0.064, p=0.1440

Figure 5.45: Avenir Next. Top: r=-0.011, p=0.6058 Bottom: r=0.064, p=0.1415

Figure 5.46: Amstelvar Italic. Top: r=-0.007, p=0.5405 Bottom: r=0.062, p=0.1986

Figure 5.47: Buffalo Gals. Top: r=0.007, p=0.4515 Bottom: r=0.060, p=0.1611

Figure 5.48: Crimson Pro Roman. Top: r=0.012, p=0.4160 Bottom: r=0.053, p=0.1882

Figure 5.49: Crimson Pro Italic. Top: r=-0.010, p=0.5712 Bottom: r=0.049, p=0.2124

Figure 5.50: Source Code Variable Roman. Top: r=0.033, p=0.2930 Bottom: r=0.048, p=0.2167

Figure 5.51: Code. Top: r=-0.110, p=0.0342 Bottom: r=-0.111, p=0.0341

Figure 5.52: League Mono. Top: r=-0.114, p=0.0298 Bottom: r=-0.138, p=0.0119

Figure 5.53: BP Dots. Top: r=-0.182, p=0.0014 Bottom: r=-0.182, p=0.001

Chapter 6

Fonts from Distances Using Neural Networks

Having accumulated a large number of glyphs, we explored whether it would be possible to generate novel glyphs with increased sound-shape systematicity directly from a set of shape distances. This approach would bypass searching in parametric spaces and directly learn glyph representations from the glyphs and corresponding distances we have accumulated so far. By then supplying new shape distances that are more correlated with the defined sound distances, we should get glyphs that produce higher sound-shape systematicity.

6.1 Data Set

The data used for this experiment consisted of 85,500 glyph sets containing over 2,000,000 unique glyphs, along with the visual distances between glyph pairs in each set. These glyphs were generated as part of our parametric font experiments following the methods described in chapter 5. The glyphs were in two sets, one at 12-pt size and one at 24-pt size.

6.2 Pre-processing

The visual distances for each glyph were standardized to have a mean of 0 and a standard deviation of 1 in order to allow relative distances across fonts to be represented consistently and to facilitate generation of desired distances during our inference experiments. The output bitmaps were all padded with white space to the largest bitmap

size in the data set in order to produce a consistently sized bitmap for output and to allow for later distance measurements on the novel glyphs created.

6.3 Training

Our approach used a neural network built in PyTorch (Paszke et al., 2017) to learn the desired glyph representation. The inputs to the network were the 274 visual distance measures associated with each pair of glyphs in a glyph set. The outputs were the binary pixel values of the 24 individual glyph bitmaps. For 12-pt glyphs, the output bitmaps were 19 pixels high by 22 pixels wide, for a total output of 10,032 binary pixel values. For 24-pt glyphs, the output bitmaps were 37 pixels high by 43 pixels wide, for a total output of 38,184 binary pixel values.

After an architecture search, a single-layer, fully-connected network with 80 hidden rectified linear units (ReLU) (Nair and Hinton, 2010) was used. The Adam optimization algorithm (Kingma and Lei Ba, 2014) was used with a learning rate of 1 × 10⁻³ and a weight decay value of 1 × 10⁻⁵ (Loshchilov and Hutter, 2017). The network was trained for 50 epochs with a batch size of 225, and used an L1 loss function. The resulting validation loss was 0.005.
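A sketch of a network matching the description above (274 distance inputs, a single hidden layer of 80 ReLU units, one output per pixel, L1 loss, and Adam with the stated learning rate and weight decay). The sigmoid output activation and the random tensors used to illustrate one training step are assumptions, not details taken from the text.

    import torch
    import torch.nn as nn

    class GlyphGenerator(nn.Module):
        """Map 274 pairwise shape distances to the flattened pixels of a 24-glyph set."""

        def __init__(self, n_distances=274, n_hidden=80, n_pixels=10_032):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(n_distances, n_hidden),
                nn.ReLU(),
                nn.Linear(n_hidden, n_pixels),
                nn.Sigmoid(),  # assumption: squash outputs toward binary pixel values
            )

        def forward(self, distances):
            return self.net(distances)

    # 12-pt configuration: 24 glyphs of 19 x 22 pixels = 10,032 outputs.
    model = GlyphGenerator()
    criterion = nn.L1Loss()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-5)

    # One illustrative training step on a random batch of 225 glyph sets.
    distances = torch.randn(225, 274)                     # standardized shape distances
    targets = torch.randint(0, 2, (225, 10_032)).float()  # binary pixel targets
    loss = criterion(model(distances), targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()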

6.4 Inference Experiments

The sets of 274 shape distances for each glyph set, standardized with mean 0 and standard deviation 1, were run through the trained network in 4 iterations. In the first iteration, the original standardized distances were input. On each following iteration, the shape distances were shifted toward the sound distances (also standardized with mean 0 and standard deviation 1) by a further increment of 8% of the difference. This had the effect of increasing the correlation between the sound distances and the input distances of the network. The glyph sets output by the network at each iteration were then captured, and the actual shape distances and their correlation to the sound distances were measured.
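A sketch of one plausible reading of how the network inputs were built for these runs; both arrays are assumed to be standardized pairwise distances, and the 8% step follows the figure 6.1 caption.

    import numpy as np

    def blended_distances(shape_d, sound_d, step, fraction=0.08):
        """Shift standardized shape distances toward the sound distances.

        Step 0 returns the original shape distances; each later step moves the
        inputs a further `fraction` of the difference, increasing their
        correlation with the sound distances.
        """
        return shape_d + step * fraction * (sound_d - shape_d)

    # The four inference iterations described above:
    # inputs = [blended_distances(shape_d, sound_d, s) for s in range(4)]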

6.5 Results

The glyphs generated directly from actual glyph distances were similar to the original targets, though with some perturbations. A sample can be seen in figure 6.1. In later iterations, as the input shape distances are modified to correlate more closely with the sound distances, the output glyphs become thicker with more jagged outlines. These are less visually appealing, as expected, but on the whole are still recognizable. The network does seem to have learned the association of weight with higher sound-shape correlation.

Figure 6.1: A sample of characters produced from novel distance sets. Each column, from left to right, was produced by input distances that were moved toward full correlation with the sound distances by a further 8% of the difference.

The correlation of the novel glyphs in later iterations grew slightly, but at only a fraction of that predicted by the inputs. Figure 6.2 shows a comparison of the correlation of the input distances and that measured from the glyphs output by the network. While there was an upward trend, it was not significant.

It seems clear that this approach does not generalize to glyphs that will produce correlations beyond those found in the training set. It did learn the relationship between weight and higher correlation, which is likely primarily responsible for the small increase we see, but beyond this it does not appear to provide us either with glyphs that produce higher sound-shape systematicity or with information that would help us identify contributing factors.


Figure 6.2: A comparison of the correlation coefficients of the shape distances input into the network and of the distances measured from the glyphs produced by the neural network.

Chapter 7

Discussion

7.1 Parametric Fonts

We have demonstrated simulated annealing as an effective method for finding increased sound-shape systematicity within the coordinate space of a parametric font. In addition, we have shown that heavier strokes, wider glyphs, and smaller optical size correlate with higher sound-shape systematicity, with glyph width providing the largest contribution.

The highest levels of systematicity were measured in unconventional fonts. This may indicate that non-traditional orthography will be required to achieve much higher systematicity levels. The unique features of these fonts may provide direction for further research into the factors that influence sound-shape systematicity. One trait shared by the three fonts with the highest systematicity is that they all omit stroke points found in more traditional typography. The two Decovar variants (figures 5.7, 5.9) have “hollow” strokes with space in the center, along with multiple perpendicular breaks. The Nu Alfabet font (figure 5.8) has glyphs with missing parts such as the second arcs of m and w, the two forward strokes of the letter k, and the top forward and bottom forward strokes of the letters s and z, respectively. The Nu Alfabet also has other unusual qualities. It has descenders on several characters that normally do not have them, particularly m and w. It also achieved a higher level of systematicity at a parameterization that is both narrower and has thinner strokes than the default, indicating that the stroke configuration may play a bigger part in the systematicity than the stroke weight and glyph width.

The Bandeins Strange Variable (figure 5.11) font nearly doubled in sound-shape

35 Chapter 7. Discussion 36 systematicity by modifying just three glyphs: the letters o, t, and r. This demonstrates that higher systematicity can be achieved by modifying only some letters. Finding sys- tematicity with different width values per glyphs could yield even better systematicity results. It is important to note that even though the correlation values measured are rela- tively small, they can still be significant for learning. The correlations Monaghan et al. (2014) measured between phonology and semantics were all below 0.04. The cumu- lative effect of these various types of systematicity is critical, and even a moderate amount could have an impact on learning outcomes Jee et al. (2018).
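To make the optimization summarized above concrete, the following is a minimal sketch of a simulated annealing loop over bounded parametric coordinates, with the measured sound-shape correlation as the objective. It is an illustration only: `measure_systematicity`, the bounds, and the cooling schedule are hypothetical stand-ins, not the configuration used in chapter 5.

```python
import math
import random

def simulated_annealing(measure_systematicity, bounds, steps=2000,
                        t_start=1.0, t_end=0.01, step_size=0.05):
    """Maximize a systematicity score over bounded parametric coordinates.

    `measure_systematicity(coords)` is a hypothetical callback that renders
    the glyphs at `coords` and returns the sound-shape correlation.
    `bounds` is a list of (min, max) pairs, one per variation axis.
    """
    coords = [random.uniform(lo, hi) for lo, hi in bounds]
    best = current = measure_systematicity(coords)
    best_coords = list(coords)
    for i in range(steps):
        t = t_start * (t_end / t_start) ** (i / steps)  # geometric cooling
        # Perturb one axis, scaled to its range, and clamp to the bounds.
        axis = random.randrange(len(bounds))
        lo, hi = bounds[axis]
        candidate = list(coords)
        candidate[axis] = min(hi, max(lo, coords[axis]
                                      + random.gauss(0, step_size * (hi - lo))))
        score = measure_systematicity(candidate)
        # Accept improvements always; accept worse moves with Boltzmann probability.
        if score > current or random.random() < math.exp((score - current) / t):
            coords, current = candidate, score
            if score > best:
                best, best_coords = score, list(coords)
    return best_coords, best
```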

7.2 Neural Networks

In chapter 6 we extended our work by training a neural network to generate novel fonts directly from desired shape distances. These efforts were mostly unsuccessful: the fonts created are unattractive though readable, and the increases in sound-shape systematicity are not significant. There may be a few reasons for this failure. First, because our design required identically sized bitmaps, white pixels dominated, making up 95% of the pixels in the data set. This imbalance shows in the results as a tendency toward white pixels and almost no unusual orthographic features. Second, our approach did not generalize to shape distance distributions outside the training set. In short, it did not learn to work backward from a desired correlation to glyph shapes.
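The pixel imbalance noted above is simple to quantify. A minimal sketch, assuming the glyph bitmaps are stacked into a binary NumPy array; the names are hypothetical and not part of the project library.

```python
import numpy as np

def background_fraction(bitmaps):
    """Fraction of background pixels across a stack of binary glyph bitmaps.

    Assumes `bitmaps` has shape (n_glyphs, height, width), with nonzero
    values marking ink and zeros marking the (white) background.
    """
    return float(np.mean(bitmaps == 0))

# Hypothetical usage: 26 random 64x64 glyphs with roughly 5% ink coverage.
rng = np.random.default_rng(0)
glyphs = (rng.random((26, 64, 64)) < 0.05).astype(np.uint8)
print(f"background pixels: {background_fraction(glyphs):.1%}")
```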

7.3 Limitations

There are a few limitations to observe about this work:

• Our parametric font set was limited in the number of axes available. The majority of fonts only had a single axis. We also did not have enough fonts with the standard italic and slant axes to evaluate their relationship to increased systematicity.

• The parametric font test set was limited to fonts that support the OpenType font variations format defined in version 1.8 of the standard, released in 2016. The most recognizable and widely used fonts, however, were created before 2016 and were thus not included in our test set. While we expect many common typefaces to gain variable versions in the future, at this time the most widely used fonts are not parametric. (A brief sketch of how variable fonts can be identified programmatically follows this list.)
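Whether a font file supports OpenType variations, and which axes it exposes, can be checked programmatically. The sketch below uses the fontTools library, which is not part of the toolchain described in this work; it is included only as one commonly available option, and the file name is a hypothetical example.

```python
from fontTools.ttLib import TTFont

def variation_axes(path):
    """Return the OpenType variation axes of a font, or [] if it is static.

    Each entry is (tag, min, default, max), e.g. ('wght', 100, 400, 900).
    """
    font = TTFont(path)
    if "fvar" not in font:
        return []
    return [(a.axisTag, a.minValue, a.defaultValue, a.maxValue)
            for a in font["fvar"].axes]

# Hypothetical usage with one of the fonts listed in appendix A.
for axis in variation_axes("Decovar-VF.ttf"):
    print(axis)
```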

7.4 Future Work

We suggest several avenues for future exploration and research. First, we outline further investigations into typographic features and their relationship to sound-shape systematicity, along with alternative measures. We then suggest optimization approaches alternative to those used here. Finally, we note that this work could easily be extended to other languages and phonologies.

7.4.1 Systematicity

The qualities we have shown to correlate with increased sound-shape systematicity could be explored further. It may be possible to design a font with higher systematicity by combining a heavy weight, wide glyphs, and a small optical size. We also did not explore the effect of the standard italic and slant axes; further work could examine their relationship to the resulting systematicity.

The role of individual character widths should also be explored. The example in figure 5.11 shows that individual glyph weights can have an outsized impact on systematicity. In our experiments, we used the same width parameter for all characters in each set. A set could instead be compiled from glyphs generated with different parameterizations. Extreme differences in width would likely be unpleasant to read, but it may be possible to apply small differences without extreme results.

Alternate shape distance measures may reveal more systematicity. The Hausdorff metric measures visual similarity in terms of the maximum difference between shapes. Other variants of the Hausdorff distance may yield different results. The average Hausdorff distance, in particular, would capture the amount of overlap between two shapes, something the classic Hausdorff metric ignores, driven instead solely by the most extreme difference. Different alignments may also be worth exploring: left- and right-aligned characters may yield different systematicity results, since the Hausdorff measure is sensitive to alignment. The brain is likely to make use of any systematicity present. If systematicity exists that can be better measured by a different visual distance measurement, it should be explored and exploited.
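To make the proposed alternative concrete, the sketch below contrasts the classic (symmetric, maximum-based) Hausdorff distance, as provided by SciPy, with one common definition of the average Hausdorff distance over glyphs represented as point sets of ink pixels. It is an illustration only, not the distance implementation used in this work, and definitions of the average variant differ in the literature.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.spatial.distance import directed_hausdorff

def average_hausdorff(a, b):
    """Average Hausdorff distance between two point sets.

    `a` and `b` are (n, 2) arrays of pixel coordinates (e.g. the ink pixels
    of two glyph bitmaps). Unlike the classic metric, which keeps only the
    single worst mismatch, this averages the nearest-neighbour distances in
    each direction, so the degree of overlap influences the result.
    """
    d_ab = cKDTree(b).query(a)[0].mean()  # mean distance from a to b
    d_ba = cKDTree(a).query(b)[0].mean()  # mean distance from b to a
    return max(d_ab, d_ba)

# Hypothetical usage with the ink pixels of two random binary glyph bitmaps.
rng = np.random.default_rng(0)
glyph_a = np.argwhere(rng.random((64, 64)) < 0.05)
glyph_b = np.argwhere(rng.random((64, 64)) < 0.05)
classic = max(directed_hausdorff(glyph_a, glyph_b)[0],
              directed_hausdorff(glyph_b, glyph_a)[0])
print(f"classic: {classic:.2f}, average: {average_hausdorff(glyph_a, glyph_b):.2f}")
```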

7.4.2 Optimization

It is possible to learn a latent representation of font glyphs. The latent space defined by this representation could then be explored for novel fonts that demonstrate increased sound-shape systematicity. An autoencoder is a neural network used to learn a lower-dimensional representation of data in an unsupervised manner (Ballard, 1987; Kingma and Welling, 2019). An autoencoder could be trained on a body of fonts to produce such a latent representation. This latent space would allow shape possibilities outside the designer-set bounds of parametric fonts, while retaining critical elements that make fonts readable. Campbell and Kautz (2014) also developed a latent representation, using a Gaussian process latent variable model. They trained it on polyline representations of characters rather than glyph bitmaps, resulting in very clear and readable results.

A reinforcement learning approach could be used to learn a policy of font modification for increased systematicity. Reinforcement learning algorithms learn a policy of actions through repeated trials driven by a reward signal. An agent could be trained to add and remove pixels in font glyphs based on a reward in the form of increased systematicity. A lower-dimensional, stochastic, parameterized policy could be learned using a policy gradient approach such as REINFORCE (Williams, 1992) or an actor-critic method (Bhatnagar et al., 2009). Such a policy would be entirely unconstrained in the changes it makes and could generate unusual typography. The results could also be quite unattractive and unreadable, but it may be possible to constrain the changes by, for example, limiting the number or location of changes. The policy learned may provide additional clues about the factors that influence increased systematicity even if the raw results are not ideal.
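As an illustration of the autoencoder idea, a convolutional model for glyph bitmaps might look like the PyTorch sketch below. The architecture, bitmap size (64×64), and latent dimensionality are hypothetical choices, not a design evaluated in this work.

```python
import torch
from torch import nn

class GlyphAutoencoder(nn.Module):
    """Minimal convolutional autoencoder for 1x64x64 glyph bitmaps."""

    def __init__(self, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 4, stride=2, padding=1), nn.ReLU(),   # -> 32x32
            nn.Conv2d(16, 32, 4, stride=2, padding=1), nn.ReLU(),  # -> 16x16
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 32 * 16 * 16), nn.ReLU(),
            nn.Unflatten(1, (32, 16, 16)),
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),  # -> 32x32
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1),              # -> 64x64
            nn.Sigmoid(),
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

# Hypothetical usage: reconstruct a batch of glyph bitmaps.
model = GlyphAutoencoder()
bitmaps = torch.rand(8, 1, 64, 64)  # stand-in for real glyph data
recon, latent = model(bitmaps)
loss = nn.functional.binary_cross_entropy(recon, bitmaps)
```

Once trained on reconstruction alone, the latent vectors could be perturbed or searched, for example with the same kind of annealing loop used for parametric coordinates, and decoded into glyphs whose systematicity is then measured.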

7.4.3 Other Languages and Phonologies

Finally, nearly all of this work could be applied to other languages and phonologies. The approaches described and the code modules produced require only a phonological definition that allows phonemes to be represented as feature vectors. The glyph rendering libraries we have provided will work with any set of Unicode characters, and the distance and systematicity modules can be used on any set of bitmaps.
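As an example of the feature-vector requirement, a sound distance matrix can be built from any phonological feature inventory. The sketch below uses a tiny set of invented binary features for three phonemes; the features and values are illustrative only and are not the phonological representation used in this work.

```python
import numpy as np

# Hypothetical binary phonological features; a real inventory would come
# from the phonology being studied. Feature order: voiced, labial,
# alveolar, plosive, nasal.
PHONEMES = {
    "b": [1, 1, 0, 1, 0],  # /b/
    "t": [0, 0, 1, 1, 0],  # /t/
    "m": [1, 1, 0, 0, 1],  # /m/
}

def sound_distance_matrix(phonemes):
    """Pairwise Euclidean distances between phoneme feature vectors."""
    keys = sorted(phonemes)
    vecs = np.array([phonemes[k] for k in keys], dtype=float)
    diffs = vecs[:, None, :] - vecs[None, :, :]
    return keys, np.linalg.norm(diffs, axis=-1)

keys, dists = sound_distance_matrix(PHONEMES)
print(keys)
print(np.round(dists, 2))
```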

Chapter 8

Conclusions

Sound-shape systematicity exists in language, and it could be helpful for early readers, for those who experience dyslexia, and for language learners. We have demonstrated that simulated annealing can be used to increase sound-shape systematicity in parametric fonts by an average of 132%. Without unconventional orthographies, however, the achievable systematicity may be bounded. To our knowledge, this work is the first of its kind.

We have found evidence that thicker strokes, wider characters, and a smaller optical size are correlated with higher sound-shape systematicity. This has been quantified and is visually apparent in the resulting glyphs. We have also provided evidence that monospaced fonts show a negative correlation with sound distances while proportional fonts show a positive correlation. The number of monospaced fonts was small, and this should be explored further.

We have provided a software library1 to enable replication of our work and to support new experiments with new orthographies and phonologies. We have also published a data set2 containing our experimental results; it holds over two million unique glyph bitmaps. These findings and tools may help enable research into the benefits of sound-shape systematicity in language.

1https://github.com/adamdotdev/font-systematicity 2https://doi.org/10.5281/zenodo.3369478

Bibliography

Antosiewicz, M., Koloch, G., and Kamiński, B. (2013). Choice of best possible metaheuristic algorithm for the travelling salesman problem with limited computational time: quality, uncertainty and speed. Journal of Theoretical and Applied Computer Science, 7(1):46–55.

Baddeley, A. J. (1992). An error metric for binary images. Wichmann Verlag, Bonn.

Ballard, D. H. (1987). Modular Learning in Neural Networks. In AAAI-87 Proceedings, pages 279–284, Seattle.

Bhatnagar, S., Sutton, R. S., Ghavamzadeh, M., and Lee, M. (2009). Natural actor-critic algorithms. Automatica, 45:2471–2482.

Campbell, N. D. and Kautz, J. (2014). Learning a Manifold of Fonts. ACM Trans. Graph, 33.

Chehreghan, A. and Ali Abbaspour, R. (2017). An assessment of spatial similarity degree between polylines on multi-scale, multi-source maps. Geocarto International, 32(5):471–487.

Dwass, M. (1957). Modified randomization tests for nonparametric hypotheses. The Annals of Mathematical Statistics, pages 181–187.

Fodor, J. A. and Pylyshyn, Z. W. (1988). Connectionism and cognitive architecture: A critical analysis. Cognition, 28(1-2):3–71.

Gasser, M. (2004). The Origins of Arbitrariness in Language. In Proceedings of the Annual Meeting of the Cognitive Science Society, pages 434–439, Chicago, USA.

Gasser, M., Boeke, J., Haffernan, M., and Tan, R. (2005). The Influence of Font Type on Information Recall. North American Journal of Psychology, 7(2):181–188.


Hadley, R. F. (1994). Systematicity in Connectionist Language Learning. Mind & Language, 9(3):247–272.

Har-Peled, S. and Raichel, B. (2014). The Fréchet Distance Revisited and Extended. ACM Transactions on Algorithms, 10(1).

Harm, M. W. and Seidenberg, M. S. (1999). Phonology, Reading Acquisition, and Dyslexia: Insights from Connectionist Models. Psychological Review, 106(3):491–528.

Hinton, L., Nichols, J., and Ohala, J. J. (2006). Sound symbolism. Cambridge University Press.

Hipp, D. (2018). SQLite [Computer software]. https://sqlite.org/. version 3.26, accessed: 2019-08-10.

Hunter, J. D. (2007). Matplotlib: A 2d graphics environment. Computing in science & engineering, 9(3):90.

Huttenlocher, D., Klanderman, G., and Rucklidge, W. (1993). Comparing images using the Hausdorff distance. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(9):850–863.

Imai, M., Kita, S., Nagumo, M., and Okada, H. (2008). Sound symbolism facilitates early verb learning. Cognition, 109(1):54–65.

Jee, H., Tamariz, M., and Shillcock, R. (2018). The substructure of phonics: The visual form of letters and their paradigmatic English pronunciation are systematically related. PsyArXiv.

Jones, E., Oliphant, T., Peterson, P., et al. (2001–). SciPy: Open source scientific tools for Python [computer software]. version 1.3.0, accessed 2019-08-10.

Kantartzis, K., Imai, M., and Kita, S. (2011). Japanese Sound-Symbolism Facilitates Word Learning in English-Speaking Children. Cognitive Science, 35(3):575–586.

Kim-Renaud, Y.-K. (1997). The Korean alphabet: its history and structure. University of Hawai’i Press.

Kingma, D. P. and Lei Ba, J. (2014). Adam: A Method for Stochastic Optimization. arXiv preprint arXiv:1412.6980.

Kingma, D. P. and Welling, M. (2019). An Introduction to Variational Autoencoders. arXiv preprint arXiv:1906.02691.

Kirkpatrick, S., Gelatt, C. D., and Vecchi, M. P. (1983). Optimization by simulated annealing. Science (New York, N.Y.), 220(4598):671–80.

Knuth, D. E. (1986). The Metafont Book. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA.

Lloyd, S., Wernham, S., Jolly, C., and Stephen, L. (1998). The phonics handbook. Jolly Learning Chigwell, UK.

Loshchilov, I. and Hutter, F. (2017). Fixing Weight Decay Regularization in Adam. arXiv preprint arXiv:1711.05101.

McKinney, W. (2010). Data structures for statistical computing in python. In van der Walt, S. and Millman, J., editors, Proceedings of the 9th Python in Science Conference, pages 51–56.

McQueen III, C. D. and Beausoleil, R. G. (1993). Infinifont: a Parametric Font Generation System. Electronic Publishing, 6(3):117–132.

Mitchell, M. (1999). An introduction to genetic algorithms. MIT press.

Microsoft (2018). OpenType specification. https://docs.microsoft.com/en-us/typography/opentype/spec/. accessed 2019-08-10.

Monaghan, P., Shillcock, R. C., Christiansen, M. H., and Kirby, S. (2014). How arbitrary is language? Philosophical transactions of the Royal Society of London. Series B, Biological sciences, 369.

Nair, V. and Hinton, G. E. (2010). Rectified Linear Units Improve Restricted Boltzmann Machines. In Proceedings of the 27th International Conference on Machine Learning (ICML-10), pages 807–814, Haifa, Israel.

Ohzeki, M. and Nishimori, H. (2011). Quantum annealing: An introduction and new developments. Journal of Computational and Theoretical Nanoscience, 8(6):963–971.

Oliphant, T. (2006–). NumPy: A guide to NumPy. USA: Trelgol Publishing, http://www.numpy.org/. version 1.16.3, accessed 2019-08-10.

Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., Lin, Z., Desmaison, A., Antiga, L., and Lerer, A. (2017). Automatic differentiation in PyTorch. In NIPS Autodiff Workshop.

Rello, L. and Baeza-Yates, R. (2013). Good Fonts for Dyslexia. In Proceedings of the 15th International ACM SIGACCESS Conference on Computers and Accessibility, New York, New York, USA. ACM Press.

Rougier, N. P. (2019). freetype-py [Computer software]. https://github.com/rougier/freetype-py/. version 2.1.0.post1, accessed 2019-08-10.

Sassoon, R. (1993). Computers and Typography. Number 1. Intellect.

Taha, A. A. and Hanbury, A. (2015). An Efficient Algorithm for Calculating the Exact Hausdorff Distance. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(11):2153–2163.

Tamariz, M. (2008). Exploring systematicity between phonological and context cooccurrence representations of the mental lexicon. The Mental Lexicon, 3(2):259–278.

Turner, D., Wilhelm, R., Lemberg, W., Podtelezhnikov, A., and Toshiya, S. (2019). The FreeType Project [Computer software]. https://www.freetype.org/. accessed 2019-08-10.

Veltkamp, R. (2001). Shape matching: similarity measures and algorithms. In Proceedings International Conference on Shape Modeling and Applications, pages 188–197. IEEE Computer Society.

Wilkins, A., Cleave, R., Grayson, N., and Wilson, L. (2009). Typography for children may be inappropriately designed. Journal of Research in Reading, 32(4):402–412.

Wilkins, A. J., Smith, J., Willison, C. K., Beare, T., Boyd, A., Hardy, G., Mell, L., Peach, C., and Harper, S. (2007). Stripes within Words Affect Reading. Perception, 36(12):1788–1803.

Williams, R. J. (1992). Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning. Machine learning, 8(3-4):229–256.

Woo Geem, Z., Hoon Kim, J., and Loganathan, G. V. (2001). A New Heuristic Optimization Algorithm: Harmony Search. Simulation, 76(2):60–68.

Appendix A

List of Parametric Fonts

• Adobe Variable Font Prototype https://github.com/adobe-fonts/adobe-variable-font-prototype/releases

• Amstelvar Italic, Amstelvar Roman https://github.com/TypeNetwork/Amstelvar

• Avenir Next https://github.com/Monotype/Monotype_prototype_variable_fonts/tree/master/AvenirNext

• BP Dots https://backpacker.gr/fonts/7

• Bahnschrift https://docs.microsoft.com/en-us/typography/font-list/bahnschrift

• Bandeins Sans Variable, Bandeins Strange Variable http://type.bandeins.de/

• Buffalo Gals https://github.com/TrueTyper/BuffaloGals

• Cabin, Cabin Italic https://github.com/impallari/Cabin/tree/master/fonts/Variable

• Changa https://github.com/eliheuer/changa-vf/tree/master/fonts


• Crimson Pro Italic, Crimson Pro Roman https://github.com/Fonthausen/CrimsonPro/tree/master/fonts/variable

• Decovar, Decovar Subset https://github.com/TypeNetwork/Decovar/tree/master/fonts

• Fira Code https://github.com/tonsky/FiraCode

• Graduate https://github.com/etunni/Graduate-Variable-Font

• Hepta Slab https://github.com/mjlagattuta/Hepta-Slab/tree/master/fonts

• IBM Plex Sans Italic, IBM Plex Sans Roman https://github.com/IBM/plex/tree/master/IBM-Plex-Sans-Variable

• Inter, Inter Italic, Inter Upright https://github.com/rsms/inter/tree/master/docs/font-files

• Kairos Sans https://github.com/Monotype/Monotype_prototype_variable_fonts/tree/master/KairosSans

• Kayak https://backpacker.gr/fonts/29

• League Mono https://github.com/mjlagattuta/Hepta-Slab/tree/master/fonts

• Libre Franklin Italics, Libre Franklin Roman https://github.com/impallari/Libre-Franklin

• Movement http://www.nmtype.com/movement/#final

• Nu Alfabet https://github.com/tipotipos/nu-alfabet/tree/master/variable_ttf

• PT Root UI https://www.paratype.com/fonts/pt/pt-root-ui/vf

• Public Sans Italic, Public Sans Roman https://github.com/uswds/public-sans/tree/master/fonts/variable

• Secuela Italic, Secuela Regular https://github.com/defharo/secuela-variable

• Source Code Variable Italic, Source Code Variable Roman https://github.com/adobe-fonts/source-sans-pro/releases/tag/variable-fonts

• Source Sans Variable Italic, Source Sans Variable Roman https://github.com/adobe-fonts/source-sans-pro/releases/tag/variable-fonts

• Source Serif Variable Roman https://github.com/adobe-fonts/source-serif-pro/releases/tag/variable-fonts

• Titillium Web Italic, Titillium Web Roman https://github.com/eliheuer/titillium-web-vf/tree/master/fonts

• Voto Serif https://github.com/twardoch/varfonts-ofl/tree/master/VotoSerifGX-OFL/Fonts/VotoSerifGX-VarTTF

• Work Sans Italic, Work Sans Roman https://github.com/weiweihuanghuang/Work-Sans

• dT Jakob https://home.dootype.com/dt-jakob-variable-concept