The evolution of graphic complexity in writing systems

Helena Miton, Hans-Jörg Bibiko, Olivier Morin

Table of Contents

A. General summary
B. First registration (2018-02-22)
B.1. Rationale for the study – Background - Introduction
B.2. Materials and Sources
B.2.1. Inclusion criteria at the level of [scripts]
B.2.2. Inclusion criteria at the level of characters
B.3. Measures
B.3.1. Complexity
B.3.2. Other characteristics of scripts
B.4. Hypotheses and Tests
B.4.1. Relationship between scripts’ size and visual complexity
B.4.2. Impact of time / invention
B.4.3. Stylistic homogeneity
B.4.4. Directionality of Change – addressing Galton’s problem
B.5. Data Analysis Plan – additional details
C. Second registration — June 4th, 2019
C.1. Inventory constitution
C.1.1. Inventory of scripts
C.1.2. Inventory of characters: special cases
C.1.3. Changes in Unicode versions
C.1.4. Exclusions from the inventory
C.2. Generating and processing pictures
C.2.1. Generating the pictures
C.2.2. Image processing
C.3. Change to data collection on scripts’ characteristics and phylogeny
C.3.1. Sources
C.3.2. Recoding for families
C.3.3. Types of scripts
C.4. Registration of an additional analysis: The shape of complexity distributions
C.5. Referencing Chang et al. 2018
D. Report and results – April 2020
D.1. Erratum
D.2. Composition of the final dataset and correlation between complexity measures
D.3. Results
D.3.1. Size
D.3.2. Invention
D.3.3. Homogeneity
D.3.4. Descendants
D.3.5. Distribution
References

A. General summary

This is our research diary for a study that considers, in an evolutionary light, the visual complexity of the characters that compose the scripts of the world’s writing systems. A script (following the ISO 15924 definition) is "a set of graphic characters used for the written form of one or more languages". Some scripts are used by a variety of writing systems (e.g., the letters of the Latin script are used by the English, Vietnamese, Latin, etc. systems), but many scripts are used by one system only, and individual writing systems are typically compatible with one or a few scripts only. A script, then, is a collection of templates, used to form images which, when combined with a particular orthography, may encode the sound of a natural language. These standardised images go by various names: letters, glyphs, graphemes, signs… here they are called “characters”, following the term used by both ISO 15924 and Unicode. A script, i.e., a series of characters, does not determine what writing encodes, but it determines what writing looks like.

Like languages and other cultural traditions, scripts can be transmitted, and they can be transformed, forming lineages shaped by descent, modification, and borrowing. In the course of this evolution some of their visual characteristics may be lost, others strengthened. This study investigates the nature and evolution of visual complexity in the characters of the scripts used by most of the world’s major writing systems. Visual complexity, the amount of information present in an image, can be assessed automatically using two robustly correlated measures (see section B.3.1).

We use these two measures to put five hypotheses to the test. A script’s size should influence its letters’ complexity (H1); idiosyncratic scripts, emerging de novo with no clear influence from a single ancestor, should be more complex than scripts continuously derived from another script (H2); a letter’s complexity should be predicted first and foremost by the script that it belongs to (H3); descendant scripts branching out from an ancestor script should tend to have less complex letters, compared to their ancestor (H4); within a script, the distribution of complexity values should be skewed towards the low-complexity range: we expect a relatively high number of low-complexity letters, accompanied by fewer complex letters (H5).

This research diary will be registered, in its entirety, at each important step of the project, on the Open Science framework (https://osf.io/9dnj3/).

This section was created in June 2019, and entered into the second registration.

It was appended in April 2020 with a report of the data collection and the results of the pre-registered analyses.

B. First registration (2018-02-22)

This section constitutes the project’s original registration. It was registered on 2018-02-22, under the title “Evolution of writing systems and visual complexity”, and can be consulted at https://osf.io/dh4wg. It was authored by Helena Miton, Hans-Jörg Bibiko, and Olivier Morin. The changes made in June 2019, between the first and the second registration, are signaled in blue and between brackets, and the reason for the changes is given in the next section (C). Section numbers have been changed. We also harmonized the vocabulary to remove potential sources of confusion. The original text used “writing system” and “script” interchangeably: “script” was substituted for “writing system” whenever warranted. We also use the same word, “character”, to refer to what the original named letters, glyphs, characters, etc. Typos and incorrect references were corrected too.

B. 1. Rationale for the study – Background - Introduction

Human visual signs, including written communication, have evolved to satisfy evolutionary pressures for easy processing by the human visual system. Pelli et al. (2006) showed how easy it can be to learn to discriminate letters: only a few thousand trials are necessary for naive participants to become proficient at recognizing letters from scripts they had no previous exposure to. Changizi & Shimojo (2005) showed that written signs tend to remain simple, i.e., they are composed of relatively few strokes. They also mimic natural scene statistics (Changizi et al. 2006). More recently, Morin (2018) showed that written characters massively favour cardinally oriented strokes, and are organized in a way that tends to make letter recognition easier (e.g., cardinal and oblique strokes not mixing).

In Changizi et al.’s studies (Changizi et al. 2006; Changizi and Shimojo 2005), the complexity of symbols within scripts is measured through manual coding of the number of strokes required for each symbol. Since Changizi’s work, the experimental literature has developed better proxies for measuring visual complexity (Pelli et al. 2006; Tamariz and Kirby 2015; Watson 2012), in large part based on the cognitive psychology of perception. In contrast with Changizi’s original studies, we will thus be using automated and easily reproducible measures.

Additionally, Changizi et al.’s analyses face a weakness in their lack of control for Galton’s problem: the results might be biased by several scripts sharing the same ancestry, which would lead to an actual lower statistical power than assumed. This is all the more worrying since descendants of the Brahmi [Brah]1 script (scripts of India and South-East Asia) form a major class of scripts used for this study. The scripts of that family tend to inherit a large number of characters (since most of them are abugidas) and relatively complex shapes. Using phylogenetic data (already compiled for over a hundred scripts, Morin, 2018), we aim at replicating Changizi’s findings while controlling for Galton’s problem.

1 [Whenever possible, we identify every script by its ISO three-letter code. (Added June 2019.)]

Another source of relevant empirical results comes from various transmission and communication experiments. Such studies showed a general tendency for written symbols to evolve, through transmission (‘generations’), toward simpler, more compressed signs (Garrod et al. 2007; Caldwell and Smith 2012; Tamariz and Kirby 2015). This trend towards increasingly compressed signs is, at the moment, thought to be driven by reproducing symbols from memory rather than directly copying them (Tamariz and Kirby 2015).

The present study is a direct continuation of both those lines of study (research on the complexity of scripts, and research on the complexity of graphic symbols more generally). It specifically focuses on the graphic compression of the characters composing scripts. How complex an image is can be estimated in two complementary ways, perimetric and algorithmic (Tamariz and Kirby 2015). A previous study (Miton and Morin submitted) has successfully used such computerized measures on non-experimental (i.e., historical) material (coats of arms from Medieval heraldry corpora).

When referring to (graphic, or visual) complexity, we here refer to complexity measured on a per character basis: we measure the visual complexity of the character itself (we here use the terms sign, glyph, grapheme, and character interchangeably). It is operationalized in two distinct ways (as a measure of the ratio between inked surface and perimeter length, and as the smallest size of a picture file).

The present study aims at testing four hypotheses (to be detailed in section B.4.):

- [H1 “Size”] The size of a [script] would correlate positively with its complexity.
- [H2 “Invention”] Idiosyncratic scripts (scripts whose visual shape cannot be derived from the influence of one dominant ancestor, following criteria to be detailed below) would have higher complexity scores than non-idiosyncratic scripts.
- [H3 “Homogeneity”] The [script] that characters belong to explains most of the variance in complexity between characters.
- [H4 “Descendants”] Finally, in cases of branching out events, a “parent” [script’s] complexity would be higher than its offspring’s complexity.

B.2. Materials and Sources

B.2.1. Inclusion criteria at the level of [scripts]:

The necessary picture files will be generated using a bash script relying on ImageMagick, and the set of scripts will be compiled from the Unicode 10.0 (ISO 15924) (ISO 15924 Registration Authority, Mountain View, CA 2013, 15), and enriched with current proposals2. [The final study included material from Unicode 11.0 as well: see section C.]

Drawing on Morin (2018), this study will exclude the following:
- Secondary scripts [defined in Morin 2018 as scripts used by a writing system that encodes another system, e.g. stenographic scripts such as Duployan [Dupl]];
- Non-visual scripts [e.g. Braille [Brai]];
- Scripts that do not directly encode a spoken language;
- Undeciphered scripts [e.g., Linear A [LinA]], as determining what to include or exclude from such scripts (as well as their use) would be unreliable, to say the least.

In contrast with Morin (2018), this study will include logographic and logophonetic scripts. Subsets of both idiosyncratic and non-idiosyncratic scripts will be compiled from the Unicode 10.0 and based on the classification established by Morin (2018). [See sections C.1. and C.3. for further details on script inclusion and classification.]

B.2.2. Inclusion criteria at the level of characters

Drawing on Morin (2018), we include a character if it can stand alone as one sound (or, in the case of logographic systems, as one word or phrase). We thus exclude the following:
- Punctuation marks and ligatures;
- Diacritic marks;
- Number symbols, honorific marks, currency marks.
This criterion means that, to some extent, a script’s inventory size will be somewhat underestimated for abugidas and abjads, compared to syllabaries and alphabets. Should a statistical comparison reveal such a bias, our analyses would then control for the nature of the writing system that each script is most often associated with, provided that this variable proves informative in our models.

Phylogenetic information will use information compiled by Morin (2018), using Daniels & Bright (1996) as its main source. Given that this study will include a larger set of scripts than Morin (2018) and draws heavily on its methods, when necessary, phylogeny and ‘group' data will be collected from the same sources (Rogers 2005; Daniels and Bright 1996; the AncientScripts website3; documents from the Unicode Consortium, including encoding proposals; the Scriptsource website4; Wikipedia, the online encyclopedia; the Omniglot website5; and the Ethnologue database [in the second registration The Ethnologue was removed as a source, see section C]).

2 Available at https://www.unicode.org/pending/pending.html [accessed July 2018].

B.3. Measures

B.3.1. Complexity

Measures of complexity will be similar to the ones used by Tamariz & Kirby (2015), i.e.:
- Algorithmic complexity (AC): the smallest size of the picture file, after being maximally compressed.
- Perimetric complexity (PC): a ratio of inked surface to perimeter length, computed according to Watson (2012).
Both measures will be taken for every character of each script in our sample. Both measures tend to be particularly sensitive to, and thus biased by, some specific parameters. We will attempt to control for two such biases: line thickness (for perimetric complexity measures), and the characters’ size (for algorithmic complexity).
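For reference, Watson (2012) defines perimetric complexity as

$$ \mathrm{PC} = \frac{P^{2}}{4\pi A} $$

where $P$ is the summed length of the inside and outside perimeters of the inked shape and $A$ is its inked (foreground) area. With this normalization, a filled disk, the simplest possible shape, has a complexity of 1.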

[In the second registration, we decided against those controls after improvements on image processing described in section C made it redundant. See section C.2.]

Perimetric complexity and control for line-thickness. In some of our previous work (Miton and Morin submitted; Kelly et al. submitted), we observed that our measure of perimetric complexity can be biased by the thickness of the strokes used to form an image, such that thicker strokes tend to produce lower complexity measures. To control for this, we will try to get an automatic measure of line thickness. In case this is deemed either too complex to be manageable or too unreliable, we will derive an index of stroke thickness for every script included in our study. For each script, five characters will be randomly selected, and all their component strokes will be measured for thickness, at each stroke's thickest point, using the digital ruler PixelStick. This measure, averaged over all five characters and their strokes, will be entered as a fixed effect in our models, and retained if it results in a model with greater predictive power (comparing the model before and after adding the predictor, with the grouping variables already included). Otherwise it will be discarded.

3 ancientscripts.com
4 scriptsource.org
5 omniglot.com

Algorithmic complexity and control for character size. Our Unicode sources provide neatly standardised cells for all the characters we shall study, making sure that all the image files that will enter our study are identical in size. Even so, there is of course no straightforward way to make sure that the actual amount of ink inside the cells covers the same surface, and some scripts are systematically « smaller » than others. Given that algorithmic complexity would be impacted by variation in the size of characters, and that, for technical reasons, it is impossible (or nearly so) to ensure that all of the pictures created for all the different scripts are comparable in size, algorithmic complexity will be normalized by inked surface (and/or size of the characters, measured in a way akin to line thickness: five characters would be randomly selected for each script, and the height and width of all five would be measured using PixelStick).

B.3.2. Other characteristics of scripts

Scripts’ size will be the number of unique characters included in our sample for each script. When a letter or glyph exists in several possible versions depending on its position (e.g. capital letters in the Latin script), we count each version as one distinct character. Scripts’ family will use the classification established by Morin (2018) on phylogenetic and geographic grounds. The seven families we will consider here are thus the following:

- Middle Eastern family: direct descendants of the scripts of the Middle East: Egyptian, , South Arabic and Aramaic.
- Phoenician family: all the direct and indirect descendants of the Phoenician script, including Greek and its descendants.
- Indian Brahmic family: all the descendants of the Brahmic script in Modern India, Pakistan, Sri Lanka, Mongolia and Tibet.
- Mainland South-East Asian Brahmic family: all the direct and indirect descendants of the Brahmic script in mainland South-East Asia.
- Insular South-East Asian Brahmic family: all the direct and indirect descendants of the Brahmic script outside of mainland South-East Asia, in Indonesia and the Philippines.
- Recent inventions family: all the scripts created after 1800.
- East Asian family: Korean Hangul and Japanese Kanas. [The second registration slightly changes this definition, see section C.3.2]

B.4. Hypotheses and Tests

Changizi & Shimojo’s (2005) study suggested that (i) the number of strokes per character centered around approximately three, independent of the number of characters in the script, and (ii) that characters are ca. 50% redundant, independently of script size [i.e., the number of strokes they contain is 50% higher than what would be strictly required to make them distinctive]. In contrast to those results, we hypothesize that there should actually be differences in how compressible characters are across scripts, when using measures of complexity based on visual perception and recognition rather than on the number of strokes used to form them. We also hypothesize that such visual compressibility depends on specific characteristics of scripts, such as whether they are idiosyncratic (definition below), or the size of their inventory.

B.4.1. Relationship between scripts’ size and visual complexity

Is there a stable relationship between a script’s size, as measured by its number of characters, and the average complexity of the characters within that script? Considering that, as characters accumulate, it becomes difficult to use only very simple characters, and drawing on Changizi’s work, we predict that larger scripts (i.e. those including a larger number of characters) should exhibit a higher degree of complexity than smaller ones.

[H1] “Size” hypothesis: Larger scripts should exhibit a higher average degree of complexity than smaller ones.

Data analysis: This prediction will be tested by means of a nested regression, with characters’ complexity scores as basic data points, and script and family as grouping variables. The predictor will be the size of the script (system-complexity)6, measured as the number of characters we include from each script in our dataset. We will first build a null model including only the dependent variable and the grouping variables (complexity, script, and script’s family), then test the prediction by including the predictor (size of the script). If warranted (see section B.2.2.) we will include as a control the nature of the writing system most often associated with the script (alphabetic, syllabic, abjadic, or abugidic). Only if this fixed effect results in a lower AIC for the null model will we keep it as a control for the final model. We predict that introducing the predictor (the script size) will lower the model's AIC (i.e. will make it more informative), and that the size of the script will affect the visual complexity positively. While this study only nests the scripts by their broad family instead of their precise ancestor, our test of hypothesis 4 (section B.4.4.) will remedy this defect and complement this study.
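As an illustration, here is a minimal sketch of this model comparison in R, using the lme4 package (the data frame d and its column names are placeholders for exposition, not our actual analysis code):

```r
library(lme4)

# d: one row per character, with columns complexity (PC or AC), script,
# family, and size (the number of characters in the character's script).
d$log_complexity <- log(d$complexity)
d$log_size       <- log(d$size)

# Null model: dependent variable plus grouping variables only
# (script nested within family). Fitted with ML so that AICs are comparable.
m0 <- lmer(log_complexity ~ 1 + (1 | family/script), data = d, REML = FALSE)

# Test model: same grouping structure, plus script size as a fixed effect.
m1 <- lmer(log_complexity ~ log_size + (1 | family/script),
           data = d, REML = FALSE)

# H1 is supported if the test model has the lower AIC
# and the size coefficient is positive.
AIC(m0, m1)
fixef(m1)["log_size"]
```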

6 In subsequent work we do not use this confusing label, we simply talk about a script’s size. (Added in June 2019.)

B.4.2. Impact of time / invention

Experimental research has highlighted the role of cultural transmission in shaping languages and visual signs (Tamariz and Kirby 2015), notably towards more compressibility - in other words, visual signs tend to decrease in complexity as they are culturally transmitted. For instance, in Tamariz & Kirby (2015), signs that had undergone only one transmission episode were more complex, as measured by both perimetric and algorithmic complexity metrics, than signs produced at the end of a transmission chain, after 20 transmission episodes.

Idiosyncratic scripts, as they would have had minimal exposure (or at least, lower exposure than non-idiosyncratic scripts) to evolutionary pressures for visual recognition, should, on average, have a higher sign-complexity7 (i.e., visual complexity of individual items divided by the number of items) than non-idiosyncratic scripts. We also expect that such evolutionary pressures are responsible for some reduction in variance (cf. stylistic homogeneity hypothesis below). Hence, idiosyncratic scripts should have higher variance in complexity than non-idiosyncratic scripts.

We define idiosyncratic scripts as scripts fulfilling the following criteria:

(1) Precise information is known about their individual inventors (most often, their name);
(2) There is no scholarly consensus to the effect that they derive their shape from the influence of one single identified ancestor. (Most resemble no known script, others fuse many influences together so that no single dominant influence is discernible. They don’t have any identified ancestor, though they may still take inspiration from known scripts in form and principles.)
(3) There is information stating that the writing was invented (or scrapped altogether) after 1800.

Note that this definition excludes invented scripts such as Cherokee [Cher], which was invented de novo by an illiterate inventor but nevertheless bears the dominant influence of one script (the Latin script [Latn] in Cherokee’s case).

[H2] “Invention” hypothesis: Idiosyncratic scripts would, on average, have higher complexity than non-idiosyncratic scripts.

Data analysis. This prediction will be tested by means of a nested regression, with characters’ complexity scores as basic data points, and scripts, as well as family and type of scripts, as grouping variables. The predictor will be whether the script belongs to the subset of idiosyncratic scripts or not. We will first build a null model including only the dependent variable and the grouping variable (complexity and script), then test the prediction by including the predictor (invention). We predict that introducing the predictor will lower the model's AIC (i.e. will make it more informative), and that being an invented script will affect the script’s visual sign-level complexity positively.

7 In our subsequent work we call this simply “complexity”.

B.4.3. Stylistic homogeneity

We hypothesize that the fact of belonging to a particular script is the most important factor affecting character complexity, when compared to the factors that are relevant to the complexity of individual characters. We believe this for two reasons: First, inclusion in a script captures many important sources of variance in character complexity that should not be expressed at the level of individual characters. This includes the material that the script is usually written on; the shape of the basic strokes making up the script; general stylistic influences. Second, something like the principle of uniform information density (Jaeger 2010), which obtains for spoken language, may also obtain for written language, so that writers could be pushed to maintain a more or less constant complexity throughout the various letters that they write.

[H3] “Homogeneity” hypothesis: Most of the variance in character complexity can be accounted for by their belonging to a given script.

Thus, we predict that the ICC [i.e., intra-class correlation] for character complexity nested by script will show that more than half of the variance in character complexity is accounted for by their inclusion in a particular script. Without directly contradicting it, this prediction contrasts with Changizi et al.'s observation that characters in the world's scripts tended to cluster around a modal size of three strokes, regardless of which script they came from. We suggest that this finding may have to do with the way that Changizi et al.'s coders defined and distinguished strokes, rather than with the actual complexity of characters.

Data analysis. ICC for character complexity nested by scripts will show that more than half of the variance in character complexity is accounted for by their inclusion in a particular script.

B.4.4. Directionality of Change – addressing Galton’s problem

We will be testing whether there is a systematic directional change in ‘branching out’ events. [We call “branching-out events” what happens when a new script differentiates itself from a parent script, like the Latin script [Latn] did when it arose from its Greek [Grek] and Italic [Ital] ancestors.] This will allow us to control for Galton’s problem. Here, we predict that, when a script descends from an ancestor script, the descendant script will have a lower visual complexity compared to its ancestor. This allows us to control for Galton’s problem to the extent that such branching out events represent measures of cultural change.

[H4] “Descendants” hypothesis: When branching-out events occur, parent scripts would have a higher complexity than their offspring.

Data analysis. Branching out events, i.e., a descendant forming from an ancestor, will be identified [(based on Morin 2018’s data), and completed for this study based on the same set of sources]. For each pair, the ancestor’s value (i.e., its averaged visual complexity) will be subtracted from the descendant’s value. This hypothesis will be tested by means of a Bayesian t-test, the null hypothesis being that the mean of the differentials is equal to 0.
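As an illustration, a minimal sketch of this test in R, assuming the BayesFactor package and a placeholder data frame pairs with one row per branching-out event:

```r
library(BayesFactor)

# pairs: one row per branching-out event, with the mean complexity of the
# ancestor and of the descendant script (column names are placeholders).
differential <- pairs$descendant_mean - pairs$ancestor_mean

# Bayesian one-sample t-test of the differentials against 0.
# H4 predicts negative differentials (descendants simpler than ancestors).
ttestBF(x = differential, mu = 0)
```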

B.5. Data Analysis Plan – additional details

All analyses will be conducted both on all types of scripts and type by type (i.e., alphabets / abugidas / abjads / syllabaries / logographic scripts / mixed scripts). Arguably, the different types of writing systems already represent different levels of compression: either by compressing most vowels within the consonant character and expressing only some of them with diacritic marks (abugidas), or by not marking them at all (abjads). All analyses will also be performed both with and without ideographic scripts, as we expect such scripts to be outliers. For all predictions and models used to test them, the model's assumptions will be tested, and the data may be reformatted (e.g. log- or rank-transformed) if the need arises.

C. Second registration — June 4th, 2019

This project was registered a second time, after this section was added to it (along with the changes to section B signalled in blue and between brackets). This section is authored by Miton and Morin. The registration took place on June 4th, 2019 and can be consulted at https://osf.io/9dnj3/registrations. At the time of registration, we have collected all the pictures of characters that our analysis will be based on, but data analysis hasn’t started.

We here report (1) how the scripts’ inventory was constituted, including clarifications regarding cases that arose and were not foreseen in the initial pre-registration, and details on how the change from Unicode 10.0 to Unicode 11.0, which occurred in June 2018, affected the study (section C.1.); (2) how we are processing our pictures to avoid biases resulting from differences in size and line thickness, and how this affects our data analysis plans (section C.2.); (3) that we excluded The Ethnologue from our sources in completing phylogenetic information, and how we recoded the typology of scripts in our dataset (section C.3.). We also pre-register an additional hypothesis on the shape of complexity distributions (section C.4.), and address a bibliographic lacuna (C.5.).

This registration is accompanied by a data file, InventoryScripts.csv (with accompanying metadata file, InventoryScripts-metadata.txt) that lists all the scripts that were considered for this study, whether they were included or not, with reasons motivating their exclusion if relevant, and otherwise with information on each script’s type, family, and ancestry.

C.1. Inventory constitution

C.1.1. Inventory of scripts

The inventory is uploaded along with this interim report (InventoryScripts.csv). All mentioned webpages were consulted in late April and early May 2018.

The inventory was constituted following these five steps, detailed as columns in InventoryScripts.csv:
(1) We started with the full list of scripts included in the ISO 15924 and in Unicode (10.0 at the time the list was compiled).
(2) We listed current proposals for inclusion in the Unicode. These proposals are detailed plans to format a script for inclusion into the Unicode standard that had not yet been validated at the time we consulted them.
(3) We excluded all scripts that did not meet our inclusion criteria from both the list of scripts included in ISO 15924 and in the Unicode.
(4) All the remaining scripts that fulfilled our criteria thus constituted our inventory before generating the pictures.
(5) Whenever pictures for its characters could not be generated, the script was excluded from our final dataset. InventoryScripts.csv also reports all such cases.

C.1.2. Inventory of characters: special cases

Drawing on Morin (2018), we included a character if, and only if, it can stand alone as one phoneme (or, in the case of logographic scripts, as one word). We thus excluded the following: punctuation marks and ligatures, diacritic marks, number symbols, honorific marks, and currency marks. We included both upper and lowercase characters, which in our view and from the point of view of the Unicode standard constitute distinct characters, as well as final consonants or final letters, in the scripts where these assume a special shape. Vowel characters were excluded when they needed to be combined with other symbols and could not stand as phonemes on their own. They were included if they could be used without being combined with another symbol, in order to keep coherence with the « if it can stand alone » criterion. This criterion has the advantage of being robust across scripts. Characters that cannot stand on their own are easy to spot in the Unicode documentation, as they appear together with a dashed circle (in order to illustrate their placement around other characters). Extensions used for writing languages other than the main language(s) of each script were excluded. We are aware of the subjective and arbitrary nature of these criteria. Note, however, that any criterion allowing us to select and count the characters of a script is bound to be arbitrary. Also, this one is consistent with the Unicode standard as well as with our previous published work, and we preregister it here in advance of hypothesis testing.

A script’s character range, the list of characters to be included in our study, identified by their Unicode code point (a code of four to five characters), was re-checked and updated whenever necessary for every script in our dataset that was already in Morin (2018), as the exact range had, in a few cases, changed between the time the Unicode was consulted for that project and the time the current study was run.

C.1.3. Changes in Unicode versions

Unicode 11.0. The Unicode 11.0 was launched while data collection was in progress, in early June 2018. We stuck with the inventory of scripts as we had established it, based on Unicode version 10.0. The main changes between 10 and 11 are the official inclusion of 7 scripts that were at the proposal level and thus already included in our sample (Hanifi Rohingya [Rohg], Old Sogdian [Sogo], Sogdian [Sogd], Dogra [Dogr], Gunjala Gondi [Gong], Makasar [Maka] and Medefaidrin [Medf]). Thus the transition from 10 to 11 did not change which scripts were included or excluded in our study.

Otherwise, as the transition from Unicode 10.0 to Unicode 11.0 also includes changes within a few scripts, at the level of characters, every character range was re-checked, and the range from the Unicode 11.0 prevailed over the range derived from Unicode 10.0.

Unicode 12.0. Novelties from Unicode 12.0 were also introduced before our sample of pictures was completely collected. As changes introduced in Unicode 12.0 were still subject to revisions at the time of data collection, we did not take them into account.

C.1.4. Exclusions from the inventory

Unavailable fonts. We need to use a font (a set of printable or displayable text characters in a specific style and size) and a character range (list of characters fulfilling our inclusion criteria, with their Unicode identifiers) to generate our pictures. Whenever a given script was missing either one or both of those elements, it had to be excluded from our sample – see InventoryScripts.csv for the exhaustive list of scripts that had to be excluded for this reason.

Private Use Area fonts. Some scripts only had fonts in which the symbols were not mapped onto their designated Unicode range, but onto some other range, usually exploiting the Private Use Area. The Private Use Area (PUA) is a range of code points that, by definition, will not be assigned characters by the Unicode Consortium. They are intentionally left undefined so that third parties may define their own characters without conflicting with Unicode Consortium assignments. This means that, for some scripts in our dataset, the font has the right symbols, but they are not assigned to the range of code points used in Unicode. In such cases, pictures were generated for all the characters using their range in the Private Use Area, and then were manually re-named to match their original, assigned, Unicode ranges. This was the case for the following scripts: Afaka [Afak], Khutsuri [Geok], Nyiakeng Puachue Hmong [Hmnp], Kpelle [Kpel], Loma [Loma], Makasar [Maka], Hanifi Rohingya [Rohg], Siddham [Sidd], and Soyombo [Soyo]. In order to minimize how many scripts had to be re-mapped, priority was given to font files that were already mapped onto the appropriate Unicode range. Priority was then given to fonts that were freely available, for reproducibility purposes.

From this list, only Afaka [Afak], Nyiakeng Puachue Hmong [Hmnp], Hanifi Rohingya [Rohg], Siddham [Sidd] and Soyombo [Soyo] could be properly remapped. We thus had to exclude Loma [Loma] (more than half of the characters for syllables were missing), Khutsuri (the only font we found was actually another variant of Georgian, Mkhedruli [Geor]), and Makasar (the only font we found was actually for Lontara, another script used to write Makasar). Finally, Kpelle [Kpel] had more than 15 symbols missing from its inventory (17 in total), leading to its exclusion from our dataset.

Sumero-Akkadian Cuneiform [Xsux]. Sumero-Akkadian Cuneiform [Xsux] was also excluded, based on the fact that both available fonts had major problems: Assurbanipal.ttf, the first available font, had (at least) 240 characters missing, while the Google Noto font for this script was really hard to read, with parts of the characters being jammed together, thus altering the overall shape of a large number of the characters included in our sample (see Figure 1).

Figure 1. Examples of Sumero-Akkadian Cuneiform [Xsux] characters that were hard to process: from left to right, 121CD, 122B2, 121F6.

Inclusions with minor problems. Additional problems emerge from the fact that fonts aren’t developed at the same pace as scripts enter the Unicode standard (or have new characters added to it); hence, a few scripts had a few characters missing, or the only available font was quite different from the one used in Unicode. We first tried to replace the font, but this wasn’t possible for all scripts affected by this problem (see Table 1 for details).

IsoKeyA | Full name | Number of missing characters | Missing character(s)
Ahom | Ahom | 1 | 1171A
Hani | Han (Hanzi, Kanji, Hanja) | 5 | 09FEB-09FEF
Khar | Kharoshthi | 2 | 10A34, 10A35
Limb | Limbu | 2 | 191D, 191E
Telu | Telugu | 1 | 00C34
Afak | Afaka | 1 | 16C87
Rohg | Hanifi Rohingya | 1 | 10D14
Sidd | Siddham | 4 | 1159B-1159E
Soyo | Soyombo | 1 | 11A79

Table 1. Scripts included despite having a few characters missing. ISO 15924, the set of codes for the representation of names of scripts, defines two codes for each script: a four-letter code (IsoKeyA in our documents) and a numeric one (IsoKeyB). ISO 15924 defines a script as a "set of graphic characters used for the written form of one or more languages". The grey lines refer to scripts that had to be remapped from a Private Use Area font (Afaka, Hanifi Rohingya, Siddham, and Soyombo).

C.2. Generating and processing pictures

C.2.1. Generating the pictures

In order to generate the pictures, we used a bash script taking a Unicode range and a font. This script fixed the size of the canvas (i.e., picture) at 500 by 500 pixels, and the point size for drawing the symbols at 60. Whenever a script presented characters that would be too big to fully fit within the 500 by 500 pixels canvas, it was rerun at a smaller point size. In such cases, we decreased the point size in steps of 5, until reaching a size at which all characters would fit inside the canvas. We had to go through such a procedure for the following scripts:
- Egyptian hieroglyphs [Egyp] (final point size = 55)
- Balinese [Bali] (final point size = 45)
- Burmese - Myanmar [Mymr] (final point size = 55)
- Grantha [Gran] (final point size = 40)

C.2.2 Image processing

Image processing was optimized in order to standardize the pictures’ line thickness and character size (relative to the character’s background). All generated pictures go through the following treatment (see Table 2), in order to ensure standardization both in terms of size and in terms of line thickness.

Table 2. Overview of the processes that pictures of characters go through before data analysis.

Standardizing pictures for size. In order to standardize our pictures for size across scripts, we first trim all the pictures, and select in each script the character whose picture is the biggest. From this picture, we estimate by how much it has to be resized to fit a 490 by 490 pixels square (we maintain the aspect ratio, to avoid distortions), and use this ratio for resizing all pictures from the same script. We then place the resulting pictures back on a 500 by 500 pixels white canvas—see Figure 2. This procedure for resizing allows us to have a dataset that is homogeneous in size between different scripts, even when they use very different fonts, while maintaining the variation in size occurring within each script.

Figure 2. Procedure for resizing pictures, example from a few characters in Khojki. This process was applied within each script.
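A minimal sketch of this resizing procedure in R, using the magick bindings to ImageMagick (a sketch for exposition only; file handling and names are placeholders, and the exact tooling of our pipeline may differ):

```r
library(magick)

# files: paths to the generated pictures of one script's characters.
standardize_script <- function(files) {
  # Trim the white margins around every character.
  trimmed <- lapply(files, function(f) image_trim(image_read(f)))
  dims    <- do.call(rbind, lapply(trimmed, image_info))
  # Common ratio: the biggest trimmed picture must fit a 490x490 square.
  ratio <- 490 / max(dims$width, dims$height)
  # Resize every picture by that same ratio (aspect ratio preserved),
  # then recenter it on a 500x500 white canvas.
  lapply(trimmed, function(img) {
    d <- image_info(img)
    scaled <- image_resize(img, paste0(round(d$width * ratio), "x",
                                       round(d$height * ratio)))
    image_extent(scaled, "500x500", gravity = "center", color = "white")
  })
}
```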

Standardizing pictures for line thickness. In order to obtain a collection of characters that all have the same, constant, line thickness, we used a combination of functions in Mathematica (Wolfram language): first, thinning, then pruning, and finally dilation (see Table 2 and Figure 4). First, the Thinning function, with the argument “Method” set to “MedialAxis”, returns the approximate medial axis of the picture. Then, we apply the Pruning function (argument = 35) in order to get rid of some of the artefacts that emerge when obtaining the approximate medial axis: it removes the small segments that appear during that process but aren’t part of the optimal (i.e., representative) skeleton of the character. Pruning branches shorter than 35 pixels yielded satisfactory results (see Figure 3); this decision was based on visual inspection of the pictures. Finally, the Dilation function (argument = 2) makes the strokes thicker. This operation also makes visual inspection of the pictures easier.

Figure 3. Illustrations of the results of various settings for the pruning process, on the Unicode point 01981 (New Tai Lue [Talu]). The original resized character (a) is shown for comparison purposes, in black on a white background; (b) shows the character after extraction of the approximate medial axis, i.e., with no pruning; (c) shows the result of pruning too little (here, by 20 pixels); (d) shows the setting we chose (i.e., pruning by 35 pixels); and (e) shows the result of pruning too much (by 50 pixels).

Requirements for different types of measures and further treatment. Perimetric complexity measures will be taken directly on the output of the process described above (i.e., picture 4 in Figure 4), as Mathematica requires pictures to have a white foreground (i.e., the character itself should be white) and a black background. Algorithmic complexity metrics, on the other hand, are computed on pictures having black foreground (black character) over a white background. Each character’s picture will go through the potrace algorithm (Selinger 2003) in order to get rid of any superfluous pixels and to get a vectorized version before zip compression.

Figure 4. (1) the original resized character, (2) approximate medial axis, (3) pruning (by 35), (4) dilation by 2 – this is the picture on which perimetric complexity measures will be taken, (5) the vectorized (through the potrace algorithm) black-on-white background version, for algorithmic complexity metrics.
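For exposition, the algorithmic-complexity step could be sketched in R as follows (the potrace call and file names are placeholders, and gzip via memCompress stands in here for the zip compression mentioned above):

```r
# bitmap_pbm: path to a black-on-white bitmap of one character (PBM format,
# which potrace accepts). Returns the compressed byte size of its vectorization.
vectorize_and_compress <- function(bitmap_pbm) {
  svg <- sub("\\.pbm$", ".svg", bitmap_pbm)
  system2("potrace", c("-s", "-o", svg, bitmap_pbm))  # -s selects the SVG backend
  bytes <- readBin(svg, what = "raw", n = file.size(svg))
  length(memCompress(bytes, type = "gzip"))  # byte size after compression
}
```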

Changes to analysis plans. The initial analysis plan was to take measures on raw pictures, i.e., pictures that had not been treated to eliminate variation coming from the font used to generate them. The procedure generating our standardized collection of pictures now allows us to have a controlled version of our dataset, with no noise or bias resulting from stylistic aspects (i.e., the use of different fonts): it homogenizes both line thickness and size across scripts. The controls for line thickness and size that were announced in the first registration (section B.3.) are now unnecessary and won’t be run.

C.3. Change to data collection on scripts’ characteristics and phylogeny

C.3.1. Sources

One of the sources used in (Morin 2018), The Ethnologue (“Ethnologue: Languages of the World” n.d.), could not be used in our study, due to its shift to a for-pay model. Whenever we needed to complement our information (e.g., when new scripts were included that were not present in the original study), we used all the other sources mentioned in (Morin 2018).

C.3.2. Recoding for families

Additionally, we had to slightly redefine how families were recoded, as Morin (2018) did not include, for instance, any logosyllabary. As we included those scripts in the present study, we updated the definition of the East Asian family to include Chinese logosyllabaries and syllabaries as well. It was defined as “Korean Hangul and Japanese Kanas” and is now defined as “Korean Hangul, Japanese Kanas and Chinese scripts that were not related to the Brahmic family (Han [Hani], Yi [Yiii], Tangut [Tang])”.

C.3.3. Types of scripts

We also added to the information used in Morin (2018) concerning the classification of scripts (see Table 3 for a synthetic view of the definitions). Based on definitions from Daniels & Bright (p. 4), we recoded the information from our sources as:
- alphabet: “the characters denote consonants and vowels” – they are usually defined as systems using the smallest possible phonemic subunit.
- abjad (“consonantary”): “the characters denote consonants (only)” – in other words, such scripts leave readers to supply the appropriate vowel.
- abugida: “each character denotes a consonant accompanied by a specific vowel, and the other vowels are denoted by a consistent modification of the consonant symbols” – they are also referred to as syllabic alphabets or alphasyllabaries in other sources.
- syllabary: “the characters denote particular syllables, and there is no systematic graphic similarity between the characters for phonetically similar syllables”
- logosyllabary: “the characters of a script denote individual words (or morphemes) as well as particular syllables”
- featural: « the shapes of the characters correlate with distinctive features of the segments of the language ». The only such script in our sample is Hangul [Hang].

Type | Unit(s) represented by the characters
Alphabet | Consonants and vowels
Abjad | Consonants
Abugida | Consonant with a specific vowel; consistent modifications are used to represent a change from that vowel
Syllabary | Syllables
Logosyllabary | Individual words, morphemes or syllables
Featural | Shapes of characters correlate with language’s features (i.e., shape taken by the mouth to produce sounds)

Table 3. Types of scripts and linguistic units represented by their characters.

C.4. Registration of an additional analysis: The shape of complexity distributions

Previous work has remarked, in passing, on the complexity distributions of graphic symbols being positively (i.e., right-) skewed. McDougall, Curry, and de Bruijn (1999), for instance, noticed that complexity in their sample of iconic and non-iconic symbols was positively skewed. Such a positively skewed distribution was also observed for European heraldic motifs (Miton and Morin submitted), and in the different versions of the Vai syllabary (Kelly et al. submitted).

We expect constraints on the production, recognition, and reproduction of graphic symbols to weigh on the complexity distribution of a given script. While very simple symbols are easy to produce and to recognize, they quickly become ‘saturated’, i.e., it becomes harder to keep them easy to differentiate while keeping them simple. During the constitution of a given graphic code, the progressive introduction of increasingly complex symbols would reflect an effort/information trade-off. We expect this, in turn, to result in a positively (right-) skewed distribution. A script which follows a positively skewed distribution thus has most of its characters at a relatively low complexity, compared to the full range of complexity scores that it covers. In practical terms, the mean of a positively skewed distribution exceeds its mode. To some extent, such a distribution (i.e., right-skewed) could also be the result of having a lower bound on complexity.

Additionally, previous studies on the shape of word length distributions also support, to some extent, such a prediction. Word length shares characteristics with pictures’ complexity (see Miton and Morin submitted), and the distribution of word lengths has been extensively documented and investigated. Although average word length varies among different languages, the shape of the word length distribution has been found to be more consistent, following a tradition starting in the 1850s as a potential way to identify authorship (see Grzybek, 2007 for an extensive historical review of this field).

Nevertheless, two characteristics make this literature harder to exploit or directly relate to our current study: (1) it tends to focus on word tokens, rather than types, whereas our study takes place at the level of types; and (2) the basic unit for length in such studies is not consistent, and oscillates between syllable and letter/grapheme. Such works, although they are able to document precisely which distribution word lengths follow, tend to be rather silent as to why such distributions come about (i.e., which processes cause them). Consequently, we do not want to commit ourselves to predicting an optimal fit with any particular distribution.

[H5] “Distribution” hypothesis: We expect scripts’ distributions of visual complexity to be systematically right-skewed (positively skewed).

Analyses and visualization: This hypothesis will be tested mainly through visual inspection, with plots of the density of the distribution over complexity scores. In order to complement the visual assessment of skewness, we will report a skewness measure for each script in our sample. Our hypothesis predicts that most scripts should have positive skewness measures. This will be tested by means of a Bayesian t-test, with a skewness of zero as the null hypothesis.
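A minimal sketch of this analysis in R (skewness from the e1071 package; d and its columns are placeholders):

```r
library(e1071)        # provides a sample skewness estimator
library(BayesFactor)

# One skewness value per script, computed over its characters' complexity.
skew_by_script <- tapply(d$complexity, d$script, skewness)

# H5 predicts mostly positive values: test the per-script skewness
# scores against 0 with a Bayesian one-sample t-test.
ttestBF(x = as.numeric(skew_by_script), mu = 0)
```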

C.5. Referencing Chang et al. 2018

The references in our first registration suffered from a lacuna: we were unaware of (Chang, Chen, and Perfetti 2018). It replicates Changizi et al.’s finding that scripts with a greater number of characters tend to have simpler characters (Changizi and Shimojo 2005) (our hypothesis H1), but, unlike Changizi, they use automated measures of visual complexity similar to the ones we will use. They do not, however, control for Galton’s problem, nor do they address the other hypotheses considered in this project (H2–H5).

D. Report and results – April 2020

This report includes more details on the composition of our dataset, and the results of our pre-registered analyses bearing on all five of our hypotheses.

D.1. Erratum

- Sindhi / Khudawadi’s [Sind] family had mistakenly been coded as « Modern Inventions »; we changed it to the correct « Indian Brahmic » family.
- Phoenician [Phnx] and Aramaic [Armi] were coded as ancestors of one another. After additional checks, this was a mistake: Aramaic descends from Phoenician. Phoenician’s ancestor has thus been corrected to NA, as Proto-Canaanite, its actual ancestor, is not part of our dataset.

D.2. Composition of the final dataset and correlation between complexity measures

Our dataset included 47,880 characters from 133 different scripts. It included (see Figure 5): 5 East Asian scripts, 23 European scripts, 35 Indian scripts, 24 Middle Eastern scripts, 23 Modern Inventions, 11 Insular South East Asian scripts and 12 Mainland South East Asian scripts. By type, it includes: 17 abjads, 56 abugidas, 44 alphabets, 1 featural system, 4 logosyllabaries, and 11 syllabaries.

Figure 5. Composition of our dataset, by family and type of scripts.

In our dataset, scripts included on average 360 characters (SD = 2,112). The largest script, Chinese sinograms [Hani], included 20,971 characters. The smallest script was Tagbanwa [Tagb], with only 16 characters. Algorithmic and perimetric complexity measures were highly correlated, r(47878) = .85, p < .001, 95%CI[0.84, 0.85].

Variable | Definition
Script | “a set of graphic characters used for the written form of one or more languages” (ISO 15924)
Type | Defined by the linguistic unit represented by the characters (e.g., alphabet, syllabary, etc.)
Family | Classification based on phylogenetic (ancestor) and geographic information
Ancestor | Ancestor script, whenever there is a known ancestor to a script in our dataset
Size | Number of characters included in a script
Idiosyncratic | Binary variable indicating whether we considered the script idiosyncratic (1) or not (0). Idiosyncratic means that a script was created in the last two centuries, by identifiable individuals, and bears no sign of influence from another existing script.

Table 4. Reminder of predictors and their definitions.

D.3. Results

Unless stated otherwise, (1) analyses were run on our full dataset (i.e., on all 133 scripts), (2) both complexity measures (perimetric and algorithmic) and inventory size were log-transformed for all mixed effects regressions, (3) all models were fitted using Restricted Maximum Likelihood (REML), and (4) all reported p-values are two-tailed.

D.3.1. Size

We hypothesized that the inventory size of a script (the number of characters it includes), here called the Size variable, would correlate positively with its characters’ visual complexity. As complexity measures and sizes of scripts were on very different scales (which created convergence problems in mixed effects models), both measures were log-transformed before analyses.

The size variable (number of characters in script) was used as predictor in a nested regression analysis. The scripts were grouped into seven families (East Asian, European, Indian, Middle Eastern, Modern Inventions, Insular South East Asian, and Mainland South East Asian) based on the classifications most common in the literature, as a way of keeping Galton’s problem in check. These families were used as the grouping variable (Family variable) in linear mixed models with random intercept, using the lmer function of the lme4 package for R. A null model was built first, with a random intercept for script family (Family variable) and for script type (Type variable); a second model introduced the size variable as a fixed effect.

Main analysis. We here present the results of the analyses bearing on our complete dataset, total N = 47,880 characters from 133 scripts.

The best null model for characters’ perimetric complexity (df = 4.222, t = 28.39, p < 0.01) included both type and script nested by family as random effects. A model adding size as a fixed effect shows larger scripts to be more complex than smaller ones (β = 0.12, 95%CI[0.073, 0.175], df = 21.222, t = 4.78, p < 0.001). The two models were refitted using maximum likelihood for comparison purposes, which revealed that the test model was more informative (Akaike’s information criterion, AIC, of -8609.7 vs. -8597.2 for the previous model).

The best null model for characters’ algorithmic complexity (df = 4.79, t = 107.2, p < 0.001) included both type and script nested by family as random effects. A test model, adding size as a fixed effect, shows larger scripts to be more complex than smaller ones (β = 0.04, 95%CI[0.0245, 0.0747], df = 24.79, t = 3.873, p < 0.01). The two models were refitted using maximum likelihood, revealing that including the script’s size resulted in a more informative model (AIC of -39144 vs. -39098 for the previous model, which did not include size as a predictor).

Figure 6. Complexity (perimetric on top, algorithmic below) as a function of script’s size. Color shows phylogenetic family. Both complexity measures and the number of characters in scripts were log-transformed.

Analysis excluding logosyllabaries

We now present the results of the same analyses, excluding all logosyllabaries from our dataset. The following analyses are thus based on N = 19,125 characters from 129 scripts. The best null model for characters’ perimetric complexity (df = 1.11, t = 46.23, p < 0.01) included both type and script nested by family as random effects. The test model, including size as a fixed effect, shows larger scripts to be more complex than smaller ones (β = 0.069, 95%CI[0.006, 0.133], df = 45.12, t = 2.143, p < 0.05). For comparison purposes, both models were refitted using maximum likelihood. Adding size makes the model more informative (AIC of -3042.8 vs. -3040 for the previous model, which did not include size as a predictor).

The best null model for characters’ algorithmic complexity (df = 2.87, t = 286.1, p < 0.001) included both type and script nested by family as random effects. A test model (same random effects and an additional fixed effect for scripts’ size) did not show larger scripts to be significantly more complex than smaller ones (β = 0.024, 95%CI[-0.0018, 0.0493], df = 51.00, t = 1.822, p = 0.074). Adding the size of the script’s inventory results in a very slightly more informative model (AIC of -22326 vs. -22325 for the previous model).

Figure 7. Complexity (perimetric on the left, algorithmic on the right) as a function of script’s size. Color shows phylogenetic family. Both complexity measures and the number of characters in scripts were log-transformed.

Analysis on our dataset with the same restrictions as Changizi & Shimojo 2005 (script’s size < 200 characters)

We also present our results on a subset including only scripts that have fewer than 200 characters (N = 5,566 characters from 124 scripts), similarly to Changizi & Shimojo (2005). This effectively excludes logosyllabaries, as well as Hangul, the only featural system in our dataset. The effect of script’s size seems to disappear. Most of the effect of the size of the system seems to depend on the inclusion of a few very large systems (mostly East Asian), which also tend to have very complex characters. We thus partially replicate Changizi & Shimojo, in that characters’ complexity does not seem to be impacted by script’s size, as long as we restrict our analyses to scripts in the same size range.

The best null model for characters’ perimetric complexity (df = 2.78, t = 64.32, p < 0.001) included both type and script nested by family as random effects. A test model did not show larger scripts to be more complex than smaller ones (β = 0.06, 95%CI[-0.048, 0.168], df = 92.03, t = 1.086, p = 0.28). For comparison purposes, both models were refitted using maximum likelihood. Adding size did not make the model more informative (AIC of 3269.4 vs. 3268.7 without the size variable). The best null model for characters’ algorithmic complexity (df = 3.33, t = 285.3, p < 0.001) included both type and script nested by family as random effects. Adding size did not make the model more informative (AIC of -4394.7, against -4394.6 for the null model). The test model did not show larger scripts to be significantly more complex than smaller ones (β = 0.03, 95%CI[-0.012, 0.076], df = 100.58, t = 1.42, p = 0.159). For comparison purposes, both models were refitted using maximum likelihood.

Figure 8. Complexity (perimetric on the left, algorithmic on the right) as a function of script’s size. Color shows phylogenetic family. Both complexity measures and the number of characters in scripts were log-transformed.

D.3.2. Invention

We defined the Idiosyncratic variable as a binary variable: it was equal to 1 whenever a script fulfilled all our criteria (i.e., created in the last two centuries, by identifiable creators, with no main influence from any other existing script), and 0 whenever it did not. Because idiosyncratic scripts are thus a subset of the Recent Inventions family, the null model we used did not include phylogenetic family as a random effect. We hypothesized that idiosyncratic scripts would have more complex characters than non-idiosyncratic scripts. We tested this hypothesis by checking whether including Idiosyncratic as a predictor would lower the AIC of a null model of complexity. Contrary to our predictions, adding the Idiosyncratic variable had no significant effect: the AIC increased, rather than decreased, for both perimetric and algorithmic complexity measures.

The test model failed to show any effect of Idiosyncratic (β = 0.016, 95%CI[-0.128, 0.161], df = 126.08, t = 0.226, p = 0.822) when compared to the best null model for characters’ perimetric complexity (df = 4.22, t = 28.39, p < 0.01). Models were refitted using maximum likelihood in order to be compared; adding Idiosyncratic actually increased the AIC, indicating a less informative model (AIC of -8595.2, vs. -8597.2 for the null model).

The test model did not suggest an effect of Idiosyncratic on algorithmic complexity either (β = -0.003, 95%CI[-0.062, 0.056], df = 125.05, t = -0.108, p = 0.914), when compared with the best null model for algorithmic complexity (df = 4.795, t = 107.2, p < 0.001). Adding Idiosyncratic resulted in a less informative model (it increased the AIC to -39131, vs. -39133 for the previous model); both models were refitted using maximum likelihood to be compared.
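A sketch of this test in R, under the same assumptions as above (hypothetical data frame d; the three criterion columns stand in for our actual coding of the inclusion criteria):

# Hypothetical sketch: coding the Idiosyncratic variable and testing it
library(lme4)
d$idiosyncratic <- with(d, as.integer(recent & known_creators & no_main_influence))

# Null model without phylogenetic family as a random effect (idiosyncratic
# scripts are a subset of the Recent Inventions family, see text)
m0 <- lmer(complexity ~ 1 + (1 | type) + (1 | script), data = d)
m1 <- update(m0, . ~ . + idiosyncratic)

# After refitting with maximum likelihood, a higher AIC for the test model
# indicates that the predictor does not improve the model
AIC(update(m0, REML = FALSE), update(m1, REML = FALSE))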

Idiosyncratic scripts were also no more or less complex when compared to the other, non-idiosyncratic scripts from the Recent Inventions family (β = 0.07, 95%CI[-0.212, 0.228], df = 21.04, t = 0.71, p = 0.944 for perimetric complexity; β = 0.008, 95%CI[-0.082, 0.098], df = 20.24, t = 0.177, p = 0.861 for algorithmic complexity; characters nested by script as random effects in both models).

Figure 9. Complexity by type, comparing idiosyncratic (in blue) and non-idiosyncratic (in pink) scripts from the Recent Inventions family, for both perimetric (on the left) and algorithmic (on the right) complexity measures. Error bars represent 95% confidence intervals.

D.3.3. Homogeneity

We hypothesized that the script that characters belong to explains most of the variance in complexity between characters. The intraclass correlation (ICC) was calculated on raw values for perimetric complexity and on log-transformed values for algorithmic complexity, using the ICC1.lme function in the psychometric R package. An ICC for letter complexity nested by script showed that 38.57% of the variance in perimetric complexity and 38.49% of the variance in algorithmic complexity is accounted for by characters’ inclusion in a particular script. While this represents a relatively high percentage of the variance, it remains under the predicted value of 50%. By comparison, Family (i.e., a classification based on geographic and phylogenetic grounds) accounts for 29.74% (algorithmic complexity) to 45.00% (perimetric complexity) of the variance, and Type (e.g., alphabet, abugida, syllabary, etc., reflecting which linguistic unit is represented by the characters of the script) captures 68.26% of the variance in perimetric complexity and 55.43% of the variance in algorithmic complexity. Type was thus the variable that captured the most variance in characters’ complexity, contrary to our predictions.
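A sketch of the ICC computations (the data frame d and its column names are hypothetical; ICC1.lme is the psychometric function mentioned above):

# Sketch of the ICC computations (hypothetical names)
library(psychometric)

ICC1.lme(perimetric, script, data = d)        # raw perimetric complexity, by script
ICC1.lme(log_algorithmic, script, data = d)   # log algorithmic complexity, by script

# Replacing the grouping factor with family or type yields the variance
# captured by phylogenetic family or by script type, respectively
ICC1.lme(perimetric, type, data = d)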

Figure 10. Complexity by family and type (error bars represent 95%CI): top panel represents perimetric complexity, bottom panel represents algorithmic complexity.

Excluding logosyllabaries

According to the pre-registration, we also ran this analysis excluding logosyllabaries. An ICC for letter complexity nested by script showed that 48.82% of the variance in perimetric complexity and 37.91% of the variance in algorithmic complexity is accounted for by characters’ inclusion in a particular script. While this represents a relatively high percentage of the variance, it remains under the predicted value of 50%. This makes script the best predictor: by comparison, family accounts for 21.17% (algorithmic complexity) to 50.01% (perimetric complexity) of the variance, and type captures 64.62% of the variance in perimetric complexity and 27.92% of the variance in algorithmic complexity.

D.3.4. Descendants

Based on the mean character complexity

We hypothesized that, in the case of branching-out events, a “parent” script’s characters would be more complex than its offspring’s characters. For each pair, the ancestor’s average complexity (i.e., its averaged sign-complexity) was subtracted from the descendant’s average complexity, as pre-registered. Our dataset includes information on 102 branching-out events, from 29 different ancestor scripts. The most frequent parent script was Brahmi [Brah], with 25 offspring scripts. A parent script had, on average, 3.55 offspring scripts (SD = 4.98). When controlling for ancestor (i.e., including Ancestor as a random effect), algorithmic complexity did not seem subject to any systematic effect (neither an increase nor a decrease in complexity) occurring with branching-out events (β = 12.87, 95%CI[-34.98, 57.71], df = 29.69, t = 0.563, p = 0.577). Perimetric complexity tended to increase with branching-out events, but this trend failed to reach significance (β = 3.734, 95%CI[-0.65, 7.35], df = 21.44, t = 1.823, p = 0.082). These results suggest that the null hypothesis may be true (no tendency for descendants’ complexity to change from their ancestors’ in any particular direction). Nevertheless, the linear mixed effects model analyses presented so far do not test it directly. A Bayesian one-sample t-test was therefore conducted to see whether the data supported the hypothesis that descendants do not, on average, decrease or increase their complexity compared to their ancestor. It found moderate support for the null for both perimetric (BF = 4.02) and algorithmic complexity (BF = 5.06); see Figure 11. For this t-test, differentials were averaged for each ancestor, rather than for each descendant-ancestor pair: this avoided giving more weight to ancestors with numerous descendants.
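The Bayesian test can be sketched as follows; the data frame pairs_df and its columns (ancestor, desc_mean, anc_mean) are hypothetical names standing in for our actual variables.

# Hypothetical sketch of the Bayesian one-sample t-test on differentials
library(BayesFactor)

pairs_df$diff <- pairs_df$desc_mean - pairs_df$anc_mean   # descendant minus ancestor

# Average differentials per ancestor, so that prolific ancestors
# (e.g., Brahmi, with 25 descendants) do not dominate the test
by_anc <- aggregate(diff ~ ancestor, data = pairs_df, FUN = mean)

# ttestBF returns the Bayes factor for the alternative (BF10);
# evidence for the null (BF01) is its reciprocal
bf10 <- extractBF(ttestBF(by_anc$diff, mu = 0))$bf
1 / bf10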

Figure 11. Difference between means of descendant scripts and ancestor scripts, plotted for each documented script in alphabetical order of their IsoKeyA (error bars represent 95% confidence intervals), for perimetric complexity (top) and algorithmic complexity (bottom).

Based on the median of characters’ complexity

Given that most scripts’ complexity distributions were positively skewed, we also ran a version of the same analysis using the median of characters’ complexity, rather than their mean. The median is generally a better indicator of central tendency than the mean for skewed distributions. The results using the median were similar in all respects to those obtained using the mean. When controlling for ancestor (i.e., including Ancestor as a random effect), there does not seem to be any systematic effect (neither an increase nor a decrease in complexity) occurring with branching-out events (β = 13.76, 95%CI[-33.29, 61.59], df = 28.29, t = 0.601, p = 0.553 for algorithmic complexity; β = 3.254, 95%CI[-1.387, 7.792], df = 23, t = 1.471, p = 0.155 for perimetric complexity). These results again suggest that the null hypothesis may be true (no tendency for descendants’ complexity to change from their ancestors’ in any particular direction); since the mixed models do not test it directly, a Bayesian one-sample t-test was conducted as above, with differentials averaged for each ancestor rather than for each descendant-ancestor pair, to avoid giving more weight to ancestors with numerous descendants. It found moderate support for the null for both perimetric (BF = 4.098) and algorithmic complexity (BF = 5.038); see Figure 12.

Figure 12. Difference between medians of descendant scripts and ancestor scripts, plotted for each documented script in alphabetical order of their IsoKeyA (error bars represent 95% confidence intervals), for perimetric complexity (top) and algorithmic complexity (bottom).

D.3.5. Distribution

We expected that most scripts’ distributions of visual complexity would be right-skewed (positively skewed). This was indeed the case, as confirmed by a Bayesian one-sample t-test against the null hypothesis of skewness measures being equal to zero (BF = 44184058 for algorithmic complexity, BF = 19155462142 for perimetric complexity). This overall tendency for scripts’ complexity distributions to be positively skewed was relatively evenly distributed over both phylogenetic families and types of scripts (see Figures 13, 14, and 15).
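A sketch of this analysis, assuming the skewness function from the e1071 package and hypothetical column names:

# Hypothetical sketch of the skewness analysis
library(e1071)         # provides skewness()
library(BayesFactor)

# One skewness value per script, for a given complexity measure
sk <- tapply(d$perimetric, d$script, skewness)

# Bayesian one-sample t-test against the null of zero skewness;
# here ttestBF directly reports evidence for non-zero skewness (BF10)
ttestBF(as.numeric(sk), mu = 0)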

Figure 13. Density distributions for perimetric complexity of all 133 scripts in our dataset (in alphabetical order of the IsoKeyA, a 4-letter identifier for each script based on ISO 15924).

Figure 14. Density distributions for algorithmic complexity of all 133 scripts in our dataset (in alphabetical order of the IsoKeyA, a 4-letter identifier for each script based on ISO 15924).

Figure 15. Skewness measures for all 133 scripts in our dataset, in alphabetical order of the IsoKeyA. The bold line indicates skewness = 0. The top panel shows skewness measures for perimetric complexity, and the bottom panel for algorithmic complexity. Error bars represent the standard error.

References

Caldwell, Christine A., and Kenny Smith. 2012. “Cultural Evolution and Perpetuation of Arbitrary Communicative Conventions in Experimental Microsocieties.” Edited by Alex Mesoudi. PLoS ONE 7 (8): e43807. https://doi.org/10.1371/journal.pone.0043807.

Chang, Li-Yun, Yen-Chi Chen, and Charles A. Perfetti. 2018. “GraphCom: A Multidimensional Measure of Graphic Complexity Applied to 131 Written Languages.” Behavior Research Methods 50 (1): 427–49. https://doi.org/10.3758/s13428-017-0881-y.

Changizi, Mark, and Shinsuke Shimojo. 2005. “Character Complexity and Redundancy in Writing Systems over Human History.” Proceedings of the Royal Society B: Biological Sciences 272 (1560): 267–75. https://doi.org/10.1098/rspb.2004.2942.

Changizi, Mark, Qiong Zhang, Hao Ye, and Shinsuke Shimojo. 2006. “The Structures of Letters and Symbols throughout Human History Are Selected to Match Those Found in Objects in Natural Scenes.” The American Naturalist 167 (5): E117–139. https://doi.org/10.1086/502806.

Daniels, Peter T., and William Bright. 1996. The World’s Writing Systems. New York: Oxford University Press.

“Ethnologue: Languages of the World.” n.d. Accessed May 30, 2019. https://www.ethnologue.com/.

Garrod, Simon, Nicolas Fay, John Lee, Jon Oberlander, and Tracy MacLeod. 2007. “Foundations of Representation: Where Might Graphical Symbol Systems Come From?” Cognitive Science 31 (6): 961–87. https://doi.org/10.1080/03640210701703659.

Grzybek, Peter. 2007. “History and Methodology of Word Length Studies.” In Contributions to the Science of Text and Language: Word Length Studies and Related Issues, edited by Peter Grzybek, 15–90. Text, Speech and Language Technology. Dordrecht: Springer Netherlands. https://doi.org/10.1007/978-1-4020-4068-9_2.

ISO 15924 Registration Authority, Mountain View, CA. 2013. “ISO 15924.” http://www.unicode.org/iso15924/iso15924-codes.html.

Jaeger, T. Florian. 2010. “Redundancy and Reduction: Speakers Manage Syntactic Information Density.” Cognitive Psychology 61 (1): 23–62. https://doi.org/10.1016/j.cogpsych.2010.02.002.

Kelly, Piers, James Winters, Helena Miton, and Olivier Morin. Submitted. “The Predictable Evolution of Letter Shapes: An Emergent Script of West Africa Recapitulates Historical Change in Writing Systems.”

McDougall, Siné J. P., Martin B. Curry, and Oscar de Bruijn. 1999. “Measuring Symbol and Icon Characteristics: Norms for Concreteness, Complexity, Meaningfulness, Familiarity, and Semantic Distance for 239 Symbols.” Behavior Research Methods, Instruments, & Computers 31 (3): 487–519. https://doi.org/10.3758/BF03200730.

Miton, Helena, and Olivier Morin. Submitted. “When Iconicity Stands in the Way of Abbreviation: No Zipfian Effect for Figurative Signals.”

Morin, Olivier. 2018. “Spontaneous Emergence of Legibility in Writing Systems: The Case of Orientation Anisotropy.” Cognitive Science 42 (2): 664–77. https://doi.org/10.1111/cogs.12550.

Pelli, Denis G., Catherine W. Burns, Bart Farell, and Deborah C. Moore-Page. 2006. “Feature Detection and Letter Identification.” Vision Research 46 (28): 4646–74. https://doi.org/10.1016/j.visres.2006.04.023.

Rogers, Henry. 2005. Writing Systems: A Linguistic Approach. Wiley.

Selinger, Peter. 2003. Potrace: A Polygon-Based Tracing Algorithm.

Tamariz, Mónica, and Simon Kirby. 2015. “Culture: Copying, Compression, and Conventionality.” Cognitive Science 39 (1): 171–83. https://doi.org/10.1111/cogs.12144.

Watson, Andrew. 2012. “Perimetric Complexity of Binary Digital Images.” The Mathematica Journal 14 (March): 1–40. https://doi.org/10.3888/tmj.14-5.