Bioinformatics Evolutionary Tree Algorithms Reveal the History of the Cretan Script Family

INTERNATIONAL JOURNAL OF APPLIED MATHEMATICS AND INFORMATICS Volume 10, 2016 Bioinformatics Evolutionary Tree Algorithms Reveal the History of the Cretan Script Family Peter Z. Revesz syllabary, whose similarity with Linear A was noted by Evans. Abstract— This paper shows that Crete is the likely origin of a The Phoenician alphabet [28] was a major influence on the family of related scripts that includes the Cretan Hieroglyph, Linear development of many other alphabets due to the Phoenicians’ A, Linear B and Cypriot syllabaries and the Greek, Phoenician, Old widespread commercial influence in the Mediterranean area. Hungarian, South Arabic and Tifinagh alphabets. The paper develops The Phoenician and the South Arabic [30] alphabets are a novel similarity measure between pairs of script symbols. The similarity measure is used as an aid to develop a comparison table of assumed to derive from the Proto-Sinaitic alphabet, which the nine scripts. The paper presents a method to translate comparison originated in the Sinai Peninsula sometime between the th th tables into DNA encodings, thereby enabling the use of mid-19 and mid-16 century BC [29]. Phoenician represents bioinformatics algorithms that construct hypothetical evolutionary the northern branch, while South Arabic represents the trees. Applying the method to the nine scripts yields a script southern branch of Proto-Sinaitic. evolutionary tree with two main branches. The first branch is The classical Greek alphabet from about 800 BC had a composed of Cretan Hieroglyph, Cypriot, Linear A, Linear B, Old Hungarian and Tifinagh, while the second branch is composed of major influence for many other European alphabets. The Greek, Phoenician and South Arabic. It is also considered how classical Greek alphabet derives from the Phoenician alphabet Proto-Sinaitic and Ugaritic may belong to this script family. except for the letters Φ, Χ, Ψ and Ω [27]. Keywords—Cretan Hieroglyph, Linear A, Linear B, Evolution, The Old Hungarian alphabet is the alphabet used by Neighbor Joining, Old Hungarian, Phylogenetics, UPGMA, Tifinagh. Hungarians before the adoption of the Latin alphabet. Parallel with the Latin, it was used sporadically until the 20th century. I. INTRODUCTION The origin of Old Hungarian is still debated. Hosszú [11] RETE was the birthplace of several ancient writings that presents a detailed view of the development from Phoenician C were first categorized by Arthur Evans, the explorer of via Aramaic and Turkish and Proto-Rovas scripts. In contrast, Knossos Palace, as the Cretan Hieroglyph, the Linear A and Forrai [8] and Varga [25] claim that the Old Hungarian script the Linear B scripts [5]. Linear A, which dates back to about already existed in the Bronze Age and cite putative 2500 BC, was the main script used in the Minoan palaces of translations of engraved artifacts going back to 1000 BC. ancient Crete. The Cretan Hieroglyph script, which may Tifinagh is another ancient script that was used by Berber predate Linear A, was used for centuries simultaneously with language speakers in North Africa and on the Canary Islands. Linear A. Linear A was replaced around 1450 BC by Linear B, Tifinagh is attested in writing from at least the 3rd century BC. which was used in Mycenaean Greece and is the oldest known The origin of the Tifinagh alphabet is also unknown, although Greek writing [10]. As described in Chadwick [2], despite it is often assumed to derive from Phoenician [31]. Evan’s decades long attempt to decipher Linear B, it remained Using bioinformatics methods, we show in this paper that a mystery until 1952 when Michael Ventris gave a the above nine scripts are members in the same script family decipherment of Linear B showing that it is an archaic version that spans across language families. In computational biology, of Greek. However, the Cretan Hieroglyph and the Linear A the study of evolutionary relationships is greatly facilitated by scripts are still not deciphered. the use of phylogenetic tree construction algorithms, such as In order to understand better these three ancient Cretan Saitou and Nei’s neighbor-joining method [21] and Sokal and scripts, in this paper we study their relationship with six other Michener’s UPGMA method [23]. The books by Baum and scripts. The other scripts are the Cypriot syllabary, and the Smith [1], Hall [9] and Lerney et al. [12] review the maximum Phoenician, the South Arabic, the Greek, the Old Hungarian likelihood and several other methods. Recently, Revesz [15] and the Tifinagh alphabets. also proposed the Common Mutations Similarity Matrix or The Cypriot syllabary [26], which was used between the CMSM method for phylogenetic tree construction. The 11th and the 4th centuries BC, was deciphered by George Smith, CMSM method derives from a series of previous evolutionary who was aided by a bilingual Phoenician-Cypriot inscription. biology studies, including [14], [18]-[20], [22] and [24]. The Cypriot syllabary derives from the earlier Cypro-Minoan Some of the efficient phylogenetic tree algorithms are able to reconstruct hypothetical evolutionary trees in a few minutes Peter Z. Revesz is with the Department of Computer Science and of computational time. Moreover, they are based on statistical Engineering, University of Nebraska-Lincoln, Lincoln, NE 68588, USA techniques that are free of human bias, which sometimes ([email protected]). A preliminary version of this paper was presented at the AMCSE conference in Agios Nikolaos, Crete, Greece in October 2015 prevents the objective evaluation of linguistic artifacts. [16]. ISSN: 2074-1278 67 INTERNATIONAL JOURNAL OF APPLIED MATHEMATICS AND INFORMATICS Volume 10, 2016 Table I A script comparison with each symbol’s known sound value to the right (blue) except for Tifinagh, which is similar to South Arabic’s. Hieroglyph Linear Linear Cypriot Phoeni South Greek Old Tifin Hungarian and A B -cian Arabic -agh Phaistos A I ʔ ʔ Α A SE SE B B P B * KO G G Γ G DA TA, D D D, Δ TE T KU XE H H E Ε MO W W US Υ MU Z Z U Ζ PA PA Ħ ḍ ɬˤ ɟ Η KA E Tˤ Tˤ C Θ MA WO Y Y J Ι WE NU K K K K PU LI L L L Λ TWE MI M M M M NE U, N N NT Ν NA TE TO S S H QE JA ʕ J, L ʕ Ο PO PO P Π ZO TI ͡ ts ṣ ṣ QA KI Q Q K Ϙ RA R R *R Ρ * TI SA š š š RO LO T T D Τ A F F H X RE RI Ħ Ψ ʒ TA LU O ISSN: 2074-1278 68 INTERNATIONAL JOURNAL OF APPLIED MATHEMATICS AND INFORMATICS Volume 10, 2016 Human translation attempts are inherently prone to error. The rarity of curved lines makes the scripts symbols that For example, the Phaistos Disk, which contains some form of contain them stand out from the ones that contain only straight Cretan Hieroglyph writing, was translated in numerous lines. Hence we divide script symbols into two groups: contradictory ways by a large number of professional and amateur linguists. Faucounau [6] and Fisher [7] are examples Group 1. Symbols that contain some curved lines of decipherment attempts, and Duhoux [4] is a critique of Group 2. Symbols that contain only straight lines previous decipherment attempts. In spite of these problems, human evaluation of the similarity of script symbols is still For example, for the Cypriot script, Group 1 contains: common in linguistics and is a source of bias. In contrast, bioinformatics developed many sophisticated mathematical measures of the similarity of DNA and proteins [33]. In this paper we develop a mathematical measure for the similarity of All the symbols in the first group contain a significant part pairs of script symbols. Our similarity measure is particularly of an arc of a circle or oval as a deliberate feature. Only one applicable to the linear scripts studied in this paper. The other symbol seems somewhat difficult to classify, namely: source of bias is the human construction of linguistic evolutionary trees. Instead of that approach we use the UPGMA algorithm to construct script evolutionary trees. The idea of using bioinformatics tools is attractive, but the issue is Although the above symbol contains a little bit of curve to figure out how to translate the linguistic problem of scripts near the bottom, the curve here seems to connect two straight into a bioinformatics problem. In this paper we present a lines, namely the vertical line, which is the stem of the arrow, method of translating script syllabaries and alphabets into a and a small horizontal line at the bottom. Hence the curve DNA-like encoding. This DNA-like encoding of related connecting these two straight lines seems only a secondary scripts can be passed into the evolutionary tree algorithms feature that was probably introduced as a writing convenience used to reconstruct hypothetical evolutionary scripts. In this in a later stage of writing rather than an original deliberate translation, each script becomes like the DNA of a species feature of the character. In contrast, all the six Group 1 studied by bioinformatics. symbols listed above contain curves lines, which seem This paper is organized as follows. Section II presents a deliberate and original features of those symbols. Therefore, similarity measure for script symbols. Section III uses the we classified the above script symbols and all the remaining similarity measure to make a comparative table of the script symbols into the second group. symbols. Section IV describes the DNA encoding of the Our next major division of script symbols is according to scripts. Section V presents a computational reconstruction of whether they enclose some region.

Bioinformatics Evolutionary Tree Algorithms Reveal the History of the Cretan Script Family

Managing Order Relations in Mlbibtex∗

UAX #44: Unicode Character Database File:///D:/Uniweb-L2/Incoming/08249-Tr44-3D1.Html

Tungumál, Letur Og Einkenni Hópa

The Ogham-Runes and El-Mushajjar

Roman Numerals

ISO Basic Latin Alphabet

A Translation of the Malia Altar Stone

Assessment of Options for Handling Full Unicode Character Encodings in MARC21 a Study for the Library of Congress

Early-Alphabets-3.Pdf

The Origins of the Alphabet but Their Numbers Tally Into of Greeks and Later Through Their Manuscripts

Old Cyrillic in Unicode*

Hungarian Place Name Pronunciation for English-Speakers: the Hungarian Language Is Part of the Uralic Language Family, Which Also Includes Finnish and Estonian