GRAPHEME-COLOR ASSOCIATIONS IN ARABIC

Non-Random Associations of Graphemes with Colors in Arabic

Tessa M. van Leeuwen1,2,3, Mark Dingemanse1, Büşra Todil1,4, Amira Agameya5,6, and Asifa Majid1,2,7

1 Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands
2 Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, the Netherlands
3 Max Planck Institute for Brain Research, Frankfurt am Main, Germany
4 Jacobs University Bremen, Germany
5 American University in Cairo, Cairo, Egypt
6 Cairo University, Cairo, Egypt
7 Centre for Language Studies, Radboud University, Nijmegen, the Netherlands

Corresponding authors

Asifa Majid
Centre for Language Studies, Radboud University
P.O. Box 9103, 6500 HD Nijmegen, The Netherlands
Telephone: +31 24 361 1251
Email: [email protected]

Tessa M. van Leeuwen
Donders Institute for Brain, Cognition and Behaviour, Radboud University
P.O. Box 9104, 6500 HE, Nijmegen, The Netherlands
Telephone: +31 24 361 2758
Email: [email protected]

Abstract

Numerous studies demonstrate people associate colors to letters and numbers in systematic ways. But most of these studies rely on speakers of English, or closely related languages. This makes it difficult to know how generalizable these findings are, or what factors might underlie these associations. We investigated letter-color and number-color associations in Arabic speakers, who have a different writing system and unusual word structure compared to Standard Average European languages. We also aimed to identify grapheme-color synaesthetes (people who have conscious color experiences with letters and numbers).
Participants associated colors to 28 basic Arabic letters and ten digits by typing color names that best fit each grapheme. We found language-specific principles determining grapheme-color associations. For example, the word formation process in Arabic was relevant for color associations. In addition, psycholinguistic variables, such as letter frequency and the intrinsic order of graphemes, influenced associations. Contrary to previous studies, we found no evidence for sounds playing a role in letter-color associations for Arabic, and only a very limited role for shape influencing color associations. These findings highlight the importance of linguistic and psycholinguistic features in cross-modal correspondences, and illustrate why it is important to pay close attention to each language on its own terms in order to disentangle language-specific from universal effects.

Keywords

crossmodal, synaesthesia, color, association, letters, numbers, graphemes, Arabic, cross-linguistic, cross-cultural

Introduction

Our experience of the world is multi-sensory. A tomato is round, red, fragrant, sweet, and soft. An important aspect of our unconscious mental life consists of binding these distinct sensory experiences into unified perceptual experiences. We also map across senses in places where the relationships are not so obvious. For example, in many communities in Latin America there is a pervasive association of “hot” and “cold” with foods, drinks, natural objects, and even cultural categories (Foster, 1979). Another place we can get insight into multisensory relationships is language. For example, sounds varying in pitch are “high” or “low” in English, while those varying in loudness are “soft” (zacht) and “hard” (hard) in Dutch.
Such linguistic mappings are potentially of interest, as they provide insight into how linguistic behavior can shape cross-modal associations.

In this study, we focus on associations to language in its written form. Four out of five adults worldwide are literate today (International literacy data (2014)), and writing is one of humanity’s most momentous inventions (Campbell, 1997; Coulmas, 2013). Considerable research has been devoted to the cross-modal associations people have between graphemes (i.e., letters and digits) and colors, and multiple accounts of the origin and function of these associations exist. According to one account, intrinsic associations exist between shapes and color (Spector and Maurer, 2008; 2011). An alternative perspective is that the linguistic environment, such as knowledge of color terms and the frequency of graphemes within a language (Rich et al., 2005; Simner et al., 2005), or the phonetic properties of sounds (e.g., Marks, 1975), determines associations.

To date, most studies of grapheme-color associations have been in populations speaking English, using the Roman alphabet. This paper is a first investigation of grapheme-color associations for the Arabic alphabet among Arabic speakers in Egypt. Testing grapheme-color associations in a distinct script and an independent population allows us to test the robustness of previous findings with regard to the variables posited to underlie cross-modal associations (Rich et al., 2005; Simner et al., 2005; Rouw et al., 2014).

We additionally aimed at identifying synaesthetes, a special population for whom certain stimuli are intrinsically linked to automatic cross-modal experiences (Wollen and Ruggiero, 1983; Baron-Cohen et al., 1987; Hochel and Milán, 2008).
Often, synaesthetic associations follow patterns of regular cross-modal associations, such as high pitch with bright colors (e.g., Ward et al., 2006), but some forms of synaesthesia appear to be more arbitrary in nature (Deroy and Spence, 2013). It is therefore debated whether synaesthesia can be regarded as an extreme form of normal cross-modal perception (Cohen Kadosh et al., 2007; Kusnir and Thut, 2012), or not (Deroy and Spence, 2013). We explore cross-modal mappings for both synaesthetes and non-synaesthetes in order to understand more about the mechanisms driving cross-modal associations.

A closer look at cross-linguistic grapheme-color associations

Recently, grapheme-color associations have been investigated from a cross-linguistic perspective. Simner et al. (2005) studied grapheme-color associations for the Roman alphabet in English and German synaesthetes and non-synaesthetes, asking participants to free-list color associations for letters. Similar factors predicted associations across these related languages: letter-color associations were influenced by linguistic priming, i.e., letters tended to be paired with a color starting with the same initial letter (e.g., y -> yellow), and in synaesthete participants, high frequency letters were associated with high frequency color terms. The role of frequency has been explored in other work too (e.g., Rich et al., 2005). High frequency letters and digits tend to be associated with bright colors and high saturation in synaesthetes (Beeli et al., 2007; Smilek et al., 2007).
Digit frequency and numerical magnitude are negatively correlated, so in synaesthesia larger (less frequent) numbers are generally associated with darker colors (Cohen Kadosh et al., 2007; but see Cohen Kadosh et al., 2008 for an opposite effect in non-synaesthetes).

Simner et al. (2005) also found lexical properties of color words affected letter-color associations. When people are asked to free-list colors in English they are likely to begin with blue, red, green rather than mauve, olive, or magenta (referred to as “ease of generation”) (Battig and Montague, 1969). Simner et al. found non-synaesthetes were more likely to associate letters earlier in the alphabet with colors that were easier to generate, and those colors were also produced more often.

Another factor relevant for letter-color associations is the Berlin and Kay evolutionary sequence of colors (Berlin and Kay, 1969), which reflects the order in which color terms are introduced into languages (see Malt and Majid, 2013, for a review). Although no effect was found on the order of color responses, high frequency letters were more likely to be associated with colors earlier in the evolutionary sequence in synaesthetes (Simner et al., 2005).

In a similar study, Rouw et al. (2014) reported a strong effect of ease of generation on non-synaesthetes’ color associations for letters in English, Dutch, and Hindi, the latter employing the Devanāgarī script. The relationship between letter frequency and lexical frequency of associated color words was not tested, but overall color preferences in Hindi showed an effect of the Berlin and Kay evolutionary sequence. Rouw et al. concluded speakers across the three languages and two scripts have similar color associations for similar sounds.
This suggests associations may be driven by sound (rather than letter shape).

Evidence consistent with this comes from Japanese and Chinese, languages that combine syllabic or phonetic scripts with a semanto-phonetic script representing meaning and sound. Chinese (L2) synaesthetes showed linguistic priming, but phonemic tone, potentially relevant for Chinese, was not found to affect color associations (Simner et al., 2011). There was, however, a language-specific, or rather writing-specific, effect: most Chinese characters are compounds with two components (radicals), one semantic (in the left position) and one phonetic (on the right). Hung et al. (2014) reported that the synaesthetic hue and luminance of a compound were determined by the semantic and phonetic radical, respectively.

Japanese writing combines three different types of scripts: the phonetic Hiragana and Katakana, and the Chinese-derived Kanji. Asano and Yokosawa (2011) found synaesthetic color mapping was strongly dependent on phonology in the two phonetic scripts, Hiragana and Katakana. They proposed this was due to the strong one-to-one relationship between graphemes and phonetics (unlike English where a letter can have several pronunciations; e.g., ‘c’ can be pronounced as /k/ or /s/).

Taken together, these studies suggest sounds of letters are especially relevant for letter-color associations. Consistent with this, English speaking participants associate green with the letter /i/ (because of the shared sound between the letter and color word), but Arabic speaking participants did not display this association (Mompean Guillamon, 2013). However, across languages, including Arabic, front open vowels were often associated with red, and front mid vowels with green (Table 5 in Mompean Guillamon, 2013).
According to Marks (1975), the pitch of vowels predicts the lightness/darkness of the associated color (e.g., ‘o’ and ‘u’ are associated with darker colors, and sound lower in pitch, compared to ‘e’ and ‘i’).

Another factor that appears to be relevant for letter-color associations is letter shape. For example, Watson et al. (2012) found letter shape (measured by 11 letter-shape features) and ordinality (position in the alphabet) correlated with hue and color distance in synaesthetes, while letter frequency correlated with luminance. Some synaesthetic participants also associated similarly shaped graphemes with similar hues (Watson et al., 2010; Brang et al., 2011; Jurgens and Nikolic, 2012). Similarly, Spector and Maurer (2008, 2011) found pre-literate toddlers systematically associate, e.g., ‘o’ with white and ‘x’ with black, according to letter shape.

In sum, grapheme frequency, sound, and shape have been linked to color associations, at least in work based mostly on the Roman alphabet. Here we examine which factors are relevant for grapheme-color associations in Arabic.

Arabic and the Arabic script

Arabic is an Afro-Asiatic language closely related to other Semitic languages such as Hebrew and Aramaic. It is the official language of the 22 countries in the Arab World. Spoken by around 240 million people, it is the fifth most populous first-learned language worldwide (Lewis et al., 2014).

Spoken Arabic differs considerably depending on the region, but spoken varieties are considered mutually intelligible and form a dialect chain. Egyptian Arabic, which we study here, is probably the most widely understood spoken variety. Its written counterpart, Modern Standard Arabic, is the only official form of Arabic, and is used in most written documents.
The Quran is written in Classical Arabic. Literacy in the Arab States is around 77.5%.

Arabic is written using a right-to-left script consisting of 28 letters (Figure 1A), with digits written from left-to-right (Figure 1B; we refer to these as Eastern Arabic numbers to prevent confusion with the “Arabic numbers” 0-9). All Arabic letters are consonants, and short vowels can optionally be indicated using diacritics above or below letters (Figure 2A). Three of the letters used for consonants (ا ’alif, ي yaa’, and و waaw) are also used for long vowels. Diacritics are usually omitted in written Modern Standard Arabic (as in Figure 2B and 2D).

Letters change their appearance when they are written in words, as can be seen in Figure 2B. Twenty-two of the letters can be connected to the preceding and following letters, but six letters can only be joined to the preceding letter, not the following letter, e.g., ا ’alif and و waaw.

Arabic word structure is different to English and other related languages. The core of an Arabic word is the “root”, which usually comprises three (range 2-5) consonants and expresses the main conceptual content. For example, the root k-t-b (ك - ت - ب) expresses the meaning ‘to write’ (see Figure 2C). These consonantal roots serve as templates with slots for different vowel patterns to form different words, as illustrated in Figure 2C. So, the root k-t-b can become kataba ‘he wrote’, kitaab ‘book’, kutub ‘books’, kuttaab ‘writers’, or the command uktub ‘write!’. There are around 5,000-6,500 different lexical roots in Arabic (Ryding, 2005).

This word formation process (known as templatic morphology) has potentially interesting implications for color associations.
Prior studies have shown people associate Roman letters (e.g., y) with colors beginning with the same letter (e.g., yellow). In Arabic, however, the same color concept, e.g., ‘black’, can be expressed with word forms that begin with different letters (Figure 2D). So, ‘black’ can begin with ا ’aa or س s depending on the gender and number of the corresponding noun (the adjective has to agree with the noun). As well as the adjectival forms, there is also a verb سود ‘to blacken’. This raises the question whether Arabic speakers associate letters to colors on the basis of the surface word form or on the basis of the underlying root.

Many Arabic letters share a strong shape resemblance (Elbeheri & Everatt 2006), in part since several otherwise identical characters are distinguished only by the placement of a disambiguating dot (see Figure 1A). Previous studies have found English synaesthetes are influenced by shape similarities among letters in their color associations, i.e., similarly shaped letters are associated with similar colors (Watson et al., 2010; Brang et al., 2011; Jurgens and Nikolic, 2012). Given the number of shared shapes in Arabic letters, we predict shape plays an important role in determining color associations in Arabic as well.

Current study

In the current study, we assessed the influence of linguistic and other factors on grapheme-color mapping in the Arabic language. We address the influence of grapheme frequency, ease-of-generation order, the Berlin and Kay evolutionary sequence, matching between graphemes and the first letter onset of the chosen color word, grapheme shape and sound, as well as the order in which stimuli are presented to the participants. Following Simner et al.
(2005), we presented stimuli in written form and asked participants to write down the names of colors they thought matched each grapheme.

Methods

Participants

Participants were recruited to participate in a 15-minute online survey via university mailing lists in collaboration with the English Language Institute at the American University in Cairo, Egypt. Participants were not paid. All participants were informed of the content of the survey and formally asked to accept the conditions (informed consent). Ten demographic questions regarding participants’ age, gender, origin, language information, keyboard use (Arabic or English), handedness, and color-blindness were administered.

In total, 139 participants completed the study; however, the demographic questions were completed by only 121 participants (the questions were placed after the first session of the questionnaire and several people quit at this point). We can therefore only address the demographic characteristics of these 121 participants. Their mean age was 25.3 years (SD = 7.8; 25 males, 96 females, 10 of unknown gender). Native languages were Arabic and English (55%), Arabic only (25%), and Arabic in combination with one or more European languages of which at least one was not English (17%). Languages of daily use were Arabic and English (79%), Arabic only (11%), and Arabic in combination with one or more European languages of which at least one was other than English (8%). Eighty-eight participants were born in Egypt (73%), and eleven participants originated from non-Arabic speaking countries, while the remaining participants came from other Arabic countries. Sixteen participants reported themselves as left-handed or ambidextrous (13%), while two people reported color-blindness.
On a scale from 1 to 5 of keyboard use (1 being only English/Roman keyboards and 5 being only Arabic keyboards), the participants scored 1.73 on average (SD = 0.75).

Structure of the online survey of grapheme-color mappings

Participants were presented with the 28 letters of the Arabic alphabet (no diacritics) and ten Eastern Arabic digits (see Figure 1) and typed the name of the color that best fit each grapheme. Keyboards had both English and Arabic characters on the keys. This method allows us to investigate the psycholinguistic factors underlying grapheme-color associations. Immediately after completing the task, participants were asked to do the same association task again. The repetition was essential to test the consistency of color associations and possible identification of synaesthetes. The stimuli were presented either in fixed alphabetic/numerical order (in both repetitions) or random order (in both repetitions), as the presentation order of stimuli can influence association results, e.g., because of ease-of-generation effects (see Simner et al., 2005). The survey started either with digits or letters (in both repetitions). Hence there were four different versions of the survey: fixed stimulus order starting with letters, fixed stimulus order starting with digits, random stimulus order starting with letters, and random stimulus order starting with digits. Assignment of the survey version to participants at the start of the survey was done at random as programmed in the survey software (Unipark, http://www.unipark.com/). The survey instruction language was Modern Standard Arabic.

Identification of synaesthetes

After the color associations followed six questions with a Likert rating scale of 1 to 6 (do not agree – agree) to assess the extent to which participants’ experiences resembled synaesthesia. These questions were taken from the prevalence study of Simner et al. (2006). An example statement is: “When performing the experiment, I felt that I knew for certain what the color for a letter or number should be.” The minimum total score was 6 (not resembling synaesthesia) and the maximum score 36 (resembling synaesthesia). Across all participants, we calculated the average rating score on these six questions, and if a participant’s score exceeded a cut-off value of (mean + 1.96 SD), this was considered an indication for synaesthesia. Additionally, consistency in the color associations was judged from the immediate re-test of the color associations in the online survey.

For the letter and number surveys, we calculated the average number of color associations that were the same between test and re-test across all participants. If, for a particular participant, the number of identical color responses between test and re-test (“consistency score”) exceeded the cut-off value (mean + 1.96 SD), this was considered an indication for synaesthesia. Combined with the responses to the synaesthesia-related questions this allowed us to discern synaesthetes from non-synaesthete participants. In earlier studies aiming to identify synaesthetes, synaesthetes were taken to be those individuals falling within a range of scores previously established for synaesthetes (on both consistency and questionnaire scales) (Simner et al., 2006; Rothen and Meier, 2010).
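
As a rough illustration of the two criteria described above, the following Python sketch computes a test-retest consistency score and flags participants whose scores exceed mean + 1.96 SD. The function names, the case-insensitive string comparison, the use of the sample standard deviation, and the requirement that both cut-offs be exceeded are assumptions made for illustration; none of them is taken from the survey software or from the original analysis.

```python
from statistics import mean, stdev


def consistency_score(test, retest):
    """Count graphemes that received the same color word in test and re-test.

    Comparison ignores case and surrounding whitespace (an assumption).
    """
    return sum(1 for a, b in zip(test, retest)
               if a.strip().lower() == b.strip().lower())


def cutoff(scores, z=1.96):
    """Cut-off of mean + z * SD; the sample SD is used here (an assumption)."""
    return mean(scores) + z * stdev(scores)


def flag_candidates(questionnaire, consistency, z=1.96):
    """Return indices of participants exceeding both cut-offs.

    Requiring both criteria is an assumption about how the two
    indications were combined.
    """
    q_cut = cutoff(questionnaire, z)
    c_cut = cutoff(consistency, z)
    return [i for i, (q, c) in enumerate(zip(questionnaire, consistency))
            if q > q_cut and c > c_cut]
```

Because the cut-offs are derived from the sample itself, a group in which every participant has the same score yields no candidates at all.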
Since no known synaesthesia scores are available for individuals using the Arabic script, we instead used a conservative estimate by requiring potential synaesthetes to exhibit a score above the specified cut-off values.

Results

In total, 139 participants completed at least one session of the questionnaire. Even though the survey was presented in Modern Standard Arabic, only 92 participants responded in Arabic; 47 responded in English. This high percentage of English responders could be related to the high percentage of participants (79%) who used English alongside Arabic in their daily life. Arabic and English responses were separated for all analyses. Of the Arabic responders (N=84), 51 participants completed both repetitions of the stimulus list (61%); of the English responders (N=39), 19 completed both runs (49%).

Arabic responses were transliterated using AraMorph (Buckwalter, 2002), and Aralex (Boudelaa and Marslen-Wilson, 2010) was used to extract roots (e.g., برقل for the Arabic word برتقالي, orange color), and obtain transliterations of the surface forms and roots (e.g., brtqAly, brql), corpus frequency of the surface Arabic forms, and root token frequency. For Arabic words not listed in AraMorph or Aralex, a gloss was obtained from Google Translate (http://translate.google.com) and checked with author AA, who is a native Arabic speaker.

After translation but before further analyses, the data underwent pre-processing. Participants who answered only ‘black’ or ‘no color’ for each grapheme were removed from the analysis because there was no variation in their color responses (8 participants for Arabic, 8 for English). Synaesthetes were identified, and analyzed separately from the other participants (see below).
Arabic color responses for which no frequency information could be obtained from Aralex were not considered in the frequency analyses, but were used for identification of synaesthetes.

Identification of synaesthetes

Synaesthetes were identified separately for Arabic and English, because language and cultural factors may influence the choice of colors and responses on questionnaires. Also, consistency scores for letter and number responses were analyzed separately in order to independently identify letter-color and number-color synaesthetes. The synaesthesia questionnaire and consistency scores and cut-offs are summarized in Table 1. Two synaesthetes were identified (both female); additionally, one participant was close to the criteria for synaesthesia for Eastern Arabic numbers. Data from the two synaesthetes were removed from further analysis. The synaesthetes’ responses could not be analyzed in detail because of the small sample size (although see section “Richness of color word responses in relation to synaesthesia and gender”).

For Arabic responders, the synaesthesia questionnaire score and the consistency score correlated strongly and positively for letters (r = 0.562, p<0.001) as well as for Eastern Arabic numbers (r = 0.490, p<0.001), suggesting the two measures are related. For English responders, this relationship was not significant (letters: r = 0.164, p = 0.545; numbers: r = 0.352, p = 0.198).

Grapheme-color pairings are not random

In order to ascertain whether grapheme-color combinations were systematically associated, we first counted the total number of times a color was chosen as a response.
From this, we derived the probability a given grapheme would be associated with a color by chance, which served as a baseline for calculating binomial probabilities (for instance, 14% of all responses consisted of the color red). We then counted the actual number of times colors were chosen for each grapheme, and used the binomial probability to calculate how likely these grapheme-color pairings were if the distribution were random. The derived p-values were Bonferroni corrected for multiple comparisons because of the number of colors tested (see Ward and Simner (2003) and Simner et al. (2005)).

For Arabic and English responses to letters, color words outside of the eleven basic color terms were frequent enough to be considered as separate colors for analysis. For Eastern Arabic numerals, non-standard colors made up only 3.9% of the total number of responses and those non-standard color responses were collapsed onto one of the eleven basic color terms (black, white, gray, brown, red, orange, yellow, green, blue, purple, pink) (Berlin & Kay, 1969). Violet was collapsed onto purple, silver onto gray, golden onto yellow. For English responses to numbers, the non-standard responses made up 3.1% of the total and the same procedure was followed.

The significant grapheme-color pairings are summarized in Figures 4 and 5. Some regularities are particularly noteworthy. For example, the letter ’alif ا, the first letter in the Arabic alphabet, was associated with black and white significantly more often than chance (both colors p<0.001). This was true for both Arabic and English responders.
Intriguingly, a similar pattern holds for the numbers 0 ٠ and 1 ١ (0 is associated with black at p<0.001 and white at p<0.01; 1 is associated with black at p<0.01 and white at p<0.001). Earlier studies (Beeli et al., 2007; Smilek et al., 2007) found similar associations between 0 and 1 with black and white in English speaking synaesthetes, but not for the first letter of the alphabet. Rich et al. (2005) found that number 1 is significantly associated with black or white (0 was not included in their study) for non-synaesthetes and white for synaesthetes. There are two things to note. First, it is probably not a coincidence that there is a strong similarity in shape between the letter ’alif and the number 1. Second, these associations are present in the non-synaesthetes in our study, unlike most previous English studies, which found the effect only with synaesthetes (with the exception of Rich et al., 2005).

Other letter-color associations were similar across both groups of respondents. For example, ر r was associated with red by both Arabic and English respondents. This was also the case for ص S and yellow. The letter م m was associated with red by Arabic respondents, and with a subordinate of red, i.e., maroon, by English respondents. However, there were many differences between the groups too (see Figures 4 and 5). For example, three different letters had very strong associations with green amongst the Arabic respondents (all p<0.001), but no English respondent associated any letter with green above chance levels. It is tempting to speculate this prominent association reflects the cultural significance of green for Arabic respondents. Green is the traditional color of Islam.
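
The binomial procedure used in this section can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the nested-dictionary input format and the function names are hypothetical, the test is taken to be one-sided (probability of observing at least the actual count), and the Bonferroni correction multiplies each p-value by the number of colors tested. The original analysis may differ in these details.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p); one-sided test (an assumption)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


def pairing_p_values(counts):
    """Bonferroni-corrected p-values for every (grapheme, color) pairing.

    counts: hypothetical input of the form {grapheme: {color: n_responses}}.
    The chance baseline for a color is its share of all responses.
    """
    colors = sorted({c for row in counts.values() for c in row})
    total = sum(sum(row.values()) for row in counts.values())
    baseline = {c: sum(row.get(c, 0) for row in counts.values()) / total
                for c in colors}
    corrected = {}
    for grapheme, row in counts.items():
        n = sum(row.values())  # number of responses given to this grapheme
        for c in colors:
            p = binomial_upper_tail(row.get(c, 0), n, baseline[c])
            # Bonferroni: multiply by the number of colors, cap at 1.
            corrected[(grapheme, c)] = min(1.0, p * len(colors))
    return corrected
```

With a baseline of 0.14 for red, for example, the call `binomial_upper_tail(k, n, 0.14)` gives the chance probability that at least `k` of the `n` responses to a grapheme are red.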

Arabic letters match onsets of color words and their roots

We also investigated whether the onset letters of Arabic color word responses (and their roots) corresponded with particular Arabic graphemes more often than expected by chance. This analysis is only relevant for Arabic responses to letter stimuli. For each color word response in the Arabic letter condition, we noted the onset letter of the raw color word response produced, and the onset letter of the root of the response word. The onset letter of the raw color word matched the given stimulus letter in 8.6% of cases (when randomizing the data, onset matches occurred in 2.9% of cases). For the root of the response word, the percentage of matches was 12.0% (for random data, root matches occurred in 6.7% of cases). Overall, then, this analysis suggests there was a higher than chance tendency to choose color words whose raw onset or root onset matched the experimentally presented letter, although this was not quantified statistically.

In addition, we tested whether specific onset letters of the Arabic color word and the onset letter of the root corresponded to a given letter more often than expected by chance, using the same binomial probability method described above (e.g., did ب baa’ elicit the response بني bunni ‘brown’). For the raw Arabic color word responses in the letter condition, an onset match in the response occurred for the grapheme waaw (و) more often than chance (p<0.05), but for no other specific grapheme-onset combination.

Grapheme frequency influences color word choices in Arabic and English responders
To test the relationship between grapheme frequency and color word frequencies, linear regression analyses were performed with Arabic letter and digit frequency as the independent variable and frequencies of the (Arabic and English) color responses as a dependent variable. Participants were included as a random factor. Arabic letter frequencies were derived from Intellaren.com¹, a 1.3 million word Arabic corpus (approximately 5 million letters) and are listed in Table 2. Letter frequencies can be calculated lumping (or not) modified letter forms onto their primary forms. We only presented primary letter forms (N=28); therefore we used non-lumped letter frequencies. No explicit source was found for Arabic digit frequency, but based on data from other languages the ranked digit frequency was taken to be the inverse of the digit number (e.g. Beeli et al., 2007). As stated above, Arabic color word frequencies and properties were derived from Aralex and are listed in Table 3. Ranked English color word frequencies were derived from the British National Corpus, as listed in the Appendix of Simner et al. (2005).

¹ http://www.intellaren.com/articles/en/a-study-of-arabic-letter-frequency-analysis

When analyzing Arabic color word responses with respect to their orthographic word frequency, we noticed the word frequency for ‘brown’ (بني in Arabic) as derived from Aralex was uncharacteristically high compared to other color word frequencies (43.15 compared to values in the range of 0-10). The word بني is ambiguous and can be derived from the root ‘to build’ or ‘sons of’. Other color words are also ambiguous in their written form, but did not cause this magnitude of distortion. Aralex reflects orthographic frequency as encountered by Arabic readers. Still, given the discrepancy of orthographic frequency of بني in comparison to other words, and the fact the high frequency is most likely due to its additional senses, we excluded بني responses from the analyses reported next.

We found a significant positive relationship between the frequency of occurrence of Arabic letters and orthographic frequency of associated Arabic color words (F(1,2975) = 22.8, p<0.001, R² = 0.008, β = 0.087, t = 4.77). The same positive relationship was found between the frequency of the Arabic letter and the root token frequency of the chosen color word (F(1,2451) = 18.8, p<0.001, R² = 0.008, β = 0.087, t = 4.34)². For English responses, the color word frequency was taken to be the ranked frequency of color words in the English language. We found the same relationship: high frequency Arabic letters were associated with high frequency color words in English (F(1,1546) = 9.85, p<0.01, R² = 0.006, β = 0.080, t = 3.14).

² The degrees of freedom vary between these two analyses because Aralex did not have information for all root token frequencies.

Similar positive correlations were found for Eastern Arabic number frequency and associated Arabic color word frequency (F(1,1094) = 81.2, p<0.001, R² = 0.069, β = 0.263, t = 9.01), number frequency and root token frequency (F(1,871) = 38.2, p<0.001, R² = 0.042, β = 0.205, t = 6.18), and number frequency and rank order of the frequency of English color words (F(1,508) = 31.2, p<0.001, R² = 0.058, β = 0.272, t = 5.59).

Figure 6 illustrates this effect for responses in Arabic with a high frequency color word (white) and a relatively low frequency color word (violet). High frequency letters are associated with the high frequency color word, whereas for lower frequency letters, lower frequency color words are produced.
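
As an illustration of the kind of analysis reported above, the following Python sketch fits an ordinary least-squares regression with a single predictor. It is not the analysis code used in this study: it omits the participant random factor, the function names (simple_regression, t_from_beta) are hypothetical, and the color word frequencies in the example are invented for illustration; only the six letter frequencies are taken from Table 2. With a single predictor the standardized β equals the Pearson correlation, so R² = β², and t follows from β and the residual degrees of freedom.

```python
from math import sqrt


def simple_regression(x, y):
    """Ordinary least squares for y = intercept + slope * x (one predictor).

    Returns the slope, intercept, standardized beta (equal to Pearson r
    when there is a single predictor), R squared, and residual df.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equally long sequences with at least 3 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("both variables must vary")
    slope = sxy / sxx
    r = sxy / sqrt(sxx * syy)
    return {
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
        "beta": r,
        "r_squared": r * r,
        "df": n - 2,
    }


def t_from_beta(beta, df):
    """t statistic for a single-predictor regression; F(1, df) is t squared."""
    return beta * sqrt(df) / sqrt(1.0 - beta ** 2)


# Letter frequencies for six letters, taken from Table 2.
letter_freq = [12.5, 6.61, 4.2, 2.01, 0.73, 0.18]
# Hypothetical color word frequencies, invented for this illustration only.
color_word_freq = [9.0, 6.5, 5.0, 3.1, 2.2, 1.0]

fit = simple_regression(letter_freq, color_word_freq)
print(fit["beta"], fit["r_squared"], t_from_beta(fit["beta"], fit["df"]))
```

Under these simplifying assumptions, β = 0.087 with 2975 residual degrees of freedom gives a t of about 4.76 and F = t² of about 22.7, close to the values reported for the Arabic letter analysis.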

Similarly shaped or sounding letters are not systematically associated with similar colors

Earlier we saw that the letter ’alif and number one received similar color associations, i.e., black and white. This suggests similarity in shape might influence color associations. Figure 7 plots letters with similar shapes together with their color associations. The results do not support the hypothesis that color associations are driven by shape similarity. Only the letter pair ط T and ظ Z showed a potential similarity in color association, i.e., brown (although note only the association of brown with ظ Z was significantly above chance). This suggests shape has only a limited role to play in color associations, perhaps limited only to the first letter and digit of Arabic.

We also investigated whether letters that sound similar in Arabic tend to be associated with similar colors (Figure 8). The letters were plotted according to Ryding (2005, p.13), and the significant color associations shown by a colored square around the letter. As with letter shape, there was no clear pattern between sound features and color associations in this task.

Color responses are influenced by the Berlin and Kay evolutionary sequence

We assessed whether the order in which color word responses were generated by participants was random or not. According to the Berlin and Kay (1969) hierarchy color terms enter into languages in the following order: 1) black/white; 2) red; 3) green/yellow; 4) blue; 5) brown; 6) orange/purple/grey/pink. Because only six stages are distinguished in this hierarchy, analyses were only performed for the first 6 letters of the alphabet. We computed correlations between the order of the color word responses as given by participants and the Berlin and Kay hierarchy, and ease of generation order (Al-Rasheed et al., 2011).
The correlations were tested separately for the different versions of the survey. Recall that stimuli were presented either in random or fixed order, and starting with letter or number stimuli. These differences in stimulus order are relevant here because effects of ease-of-generation or the Berlin and Kay hierarchy could be triggered by any ordered sequence, or alternatively triggered by associations to intrinsic stimulus order (e.g., irrespective of when it is presented, the number 1 always becomes associated with the same color). Each version of the survey was completed by a different set of participants, and the Arabic and English respondents were also different from one another.

No effects were found of the ease of generation order of color words as reported by Al-Rasheed et al. (2011). We did, however, find effects of the Berlin and Kay (1969) hierarchy, summarized in Table 4. Color word responses to numbers followed the Berlin and Kay hierarchy in almost all cases (significant Pearson correlations, see Table 4), irrespective of the order in which stimuli were presented, irrespective of the response language, and irrespective of whether the survey started with numbers or letters. This means numbers positioned early in the intrinsic number sequence of 0-9 are associated with color words early in the Berlin and Kay hierarchy (e.g. black, white, red), and this relationship is relatively robust.

For letter stimuli, the relationship of color word responses with the Berlin and Kay hierarchy is also present, but reaches significance only when stimuli were presented in random order (Random 1, letters first, see Table 4). The relationship between the order of letter stimuli in the alphabet and the Berlin and Kay hierarchy is therefore perhaps less strong than the relationship for numbers.
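
The correlation described above can be sketched as follows, assuming each response is coded by the intrinsic position of its stimulus (1 to 6) and by the Berlin and Kay stage of the color word given. This coding is one plausible reading of the procedure, not a record of the scripts used in the study; the names BERLIN_KAY_STAGE, pearson_r, and stage_correlation are hypothetical, and the example responses are invented.

```python
from math import sqrt

# Stage of each basic color term in the Berlin and Kay (1969) hierarchy,
# following the six stages listed in the text.
BERLIN_KAY_STAGE = {
    "black": 1, "white": 1,
    "red": 2,
    "green": 3, "yellow": 3,
    "blue": 4,
    "brown": 5,
    "orange": 6, "purple": 6, "grey": 6, "pink": 6,
}


def pearson_r(x, y):
    """Pearson product-moment correlation between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("both variables must vary")
    return sxy / sqrt(sxx * syy)


def stage_correlation(responses):
    """Correlate intrinsic stimulus position with Berlin and Kay stage.

    `responses` is a list of (intrinsic_position, color_word) pairs.
    Color words outside the hierarchy (e.g. 'silver') are skipped.
    """
    pairs = [(pos, BERLIN_KAY_STAGE[word])
             for pos, word in responses if word in BERLIN_KAY_STAGE]
    positions = [p for p, _ in pairs]
    stages = [s for _, s in pairs]
    return pearson_r(positions, stages)


# Invented responses, listed in a shuffled presentation order. Only the
# intrinsic position of each stimulus enters the correlation.
example = [(4, "green"), (1, "white"), (6, "brown"), (2, "red"),
           (2, "silver"), (5, "blue"), (3, "yellow")]
print(stage_correlation(example))
```

Because the position code refers to the place of the grapheme in the alphabet or number sequence rather than to its place in the experiment, a positive correlation under random presentation reflects intrinsic order, not presentation order.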

Finally, it is interesting to note the correlation between graphemes and the Berlin and Kay evolutionary sequence follows the intrinsic order of the items, and not the experimental order. That is, when the items were presented in a randomized order, participants did not associate the first letter presented with the first color in the evolutionary sequence. Rather, ’alif ا was still more likely to be associated with ‘black’ or ‘white’, baa’ ب was more likely to be associated with ‘red’, etc., regardless of where in the experiment they occurred.

Richer color word responses in relation to synaesthesia, regardless of gender

Previous studies report synaesthetes describe their color associations with more words and more non-standard color words than non-synaesthetes (Simner et al., 2005). We found participants who displayed higher color consistency in their associations and therefore scored higher on the ‘synaesthesia-scale’ tended to use more multi-word responses (e.g. ‘light green’) or non-standard color responses (e.g. ‘silver’): Pearson correlation r = 0.438, p < 0.001. This analysis was collapsed across all participants, versions, and stimuli of the survey. The correlation is significant for both men and women (men r = 0.510, p < 0.05; women r = 0.409, p < 0.001). If we split all responses into ‘synaesthetic’ (matching between repetitions 1 and 2 of the questionnaire) and ‘non-synaesthetic’ (non-matching) responses, synaesthetic responses are roughly two times as likely as non-synaesthetic responses to be a non-standard (‘exotic’) color word or a multi-word response (19.2% versus 9.70% of all responses, see Table 5).

Discussion

We found significant non-random pairings of graphemes and colors for both letter and number stimuli, amongst both Arabic and English responders (Figures 4 and 5), consistent with previous studies. However, the pairings themselves differ from those reported earlier. Table 6 summarizes grapheme-color associations reported for non-synaesthetes in prior work, as a point of comparison. Despite the fact many studies have investigated letter-color associations in various languages, the primary associations are rarely reported in full. Therefore it is difficult to ascertain with certainty which letter-color associations are cross-culturally robust. Table 6 shows there are considerable divergences between previously reported studies.

For example, many studies find ‘a’ is mapped to red (e.g., Rich et al., 2005; Simner et al., 2005; Rouw et al., 2014; Asano & Yokosawa, 2011; Mompean Guillamon, 2013; Marks, 1975). This has been found in closely related languages using the Roman script, like English, Dutch, and German, but also in languages with a different script, such as Hindi and Japanese. Some researchers suggest these mappings could be due to the shared sound. Mompean Guillamon (2013) has reported Arabic speakers also associate ‘a’ to red when mapping sounds to colors. In our study, however, all participants, regardless of responding language (Arabic or English), strongly associated ‘a’ (which corresponds to the letter ’alif ا) with black or white instead. There was no reliable association between ‘a’ and red.

Note, however, what may seem like considerable uniformity in previous studies masks some critical differences. The letter ‘a’ in Dutch and German is pronounced as the open front vowel /ɑː/.
But in English ‘a’ is pronounced as a mid-front vowel /eɪ/ (as in ‘age’). So, when English speakers associate the letter ‘a’ with red (e.g., Simner et al., 2005) it is not entirely clear this is based on the specific sound, and more generally it is unclear whether previously reported associations between ‘a’ and red are really based on the sound per se.

Rouw et al. (2014) suggest ‘a’ is red because it is the first (ordinal) item in a sequence, but we see in Arabic a very strong preference for the first ordinal item to be black or white. This is also reflected in the color associations to numbers. Participants in this study associated ٠ 0 and ١ 1 with black and white. Here it seems relevant that the letter ’alif ا and the number 1 ١ share the same shape. Previously, Beeli et al. (2007) and Smilek et al. (2007) found English-speaking synaesthetes associated 0 and 1 with black and white too, while Day (2004) found the letters I and O were associated with white or black by the majority of synaesthetes (N=166). Baron-Cohen et al. (1993) found O to be associated with white, too, in 73% of synaesthetes. Rich et al. (2005) found I and O to be significantly associated with white in synaesthetes and I with white in non-synaesthetes; Spector and Maurer (2008, 2011) also report pre-literate children associate O and I with white. Together this might suggest a general shape-color association. The effect, however, appears to be strongest in synaesthetes. In contrast, Rouw et al. (2014) found Hindi non-synaesthetes associated 1 to red. So the posited universality of the mapping of O and I shapes to black and white also deserves future scrutiny.

Within our own study the effect of shape appears to be limited.
We hypothesized Arabic letter shape could play a role in determining associated colors, since many letters of the alphabet share similar shapes. However, letters with similar shapes were not assigned similar colors. In addition, no trends were detectable in color associations for letters with similar articulatory features.

We also tested whether the templatic morphology of Arabic has consequences for grapheme-color associations by checking whether color associations are driven by the onset letter of the raw color word, the onset letter of its root, or both. We found a higher than chance tendency to match the stimulus letter to the onset of the raw color word and its root. However, the specific letter-stimulus onset matches reached significance only in one case. This could be because there are two possible sources of association (raw letter onset; root letter onset), or perhaps because many Arabic color words start with the same letter, namely ا ’alif (al-Jehani, 1990).

We found a clear positive relationship between the frequency of occurrence of letter and number stimuli in Arabic and the frequency of color words chosen as associates. That is, high frequency graphemes were associated with high frequency color words. There was also a relationship between letter frequency and the root token frequency of color words, a linguistic feature not previously examined. The relationship between letter frequency and word frequency has previously been reported by Simner et al. (2005), but only for synaesthetes. Here we found the effect for non-synaesthetes too.
Simner et al.’s study features a large number of non-synaesthete participants (N=317 for English and N=58 for German speaking participants), so it is unlikely the discrepancy between the results is due to a lack of power in the Simner et al. study. In addition, Simner et al. did not correct for multiple comparisons (which we did), making it even more unlikely an effect was missed by Simner et al. So, why this difference? The discrepancy could be driven by language-specific factors. Perhaps the relative frequencies of letters in Arabic differ more than in English or German. However, this requires further in-depth studies to verify or refute.

In our data, the order in which color word responses were generated did not follow the ease of generation order for Arabic speakers as reported by Al-Rasheed et al. (2011). This ease of generation order effect was previously reported for non-synaesthetes by Simner et al. (2005). One reason we did not find this effect could be that our source of norms was not specific enough for the participants of our study, who originated mainly from Egypt. The Al-Rasheed et al. color word generation norms were from Arabic speakers from Saudi Arabia. Separate ease of generation norms are required from Egyptian Arabic speakers to further verify this. In our data, order of color word responses for number stimuli correlated strongly with the Berlin and Kay evolutionary sequence, even when the stimuli were presented in random order. For letter stimuli, this effect was also found, contra Simner et al. (2005) who reported no effect. This relationship may be stronger for number stimuli because of the stronger intrinsic ordering of number vs. letter stimuli (Shanon, 1982).

Synaesthetes in our study

By means of our cut-off method, we identified 2 synaesthetes in our study, out of 78 participants who completed both sessions of the experiment (2.6%). The use of a statistical cut-off naturally implies that a certain percentage of participants would fall above the specified value. As such it is not possible to make definitive claims about the prevalence of synaesthesia in our sample. Our purpose was merely to identify those participants exhibiting clear synaesthetic characteristics. Comparisons to earlier prevalence studies on populations using the Roman script (Simner et al., 2006; Rothen and Meier, 2010) suggest that our method yields a rather conservative estimate of synaesthesia prevalence.

Our results confirm the tendency of synaesthetes to describe their specific color associations in more detail than regular participants. For instance, Simner et al. (2005) found 70 synaesthete participants together generated 54 different color descriptions for the color green, compared to only 5 different descriptions of green from control participants. In our data, responses matching across the first and second session of the survey (and thus regarded as ‘synaesthetic’ in nature) consisted of multi-word utterances or ‘exotic’ color words more often than non-matching responses. This indicates that for certain participants color associations may have been rather vivid or akin to color imagery, allowing specific color associations to occur. Tentatively, the detail with which participants systematically describe their color associations might even be a possible way to identify synaesthetes.
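
To make the cut-off concrete, the sketch below assumes that each cut-off is the group mean plus 1.96 standard deviations, and that a participant is counted as a synaesthete only when exceeding both the questionnaire cut-off and the consistency cut-off. Both assumptions are inferred from the values in Table 1, which this rule reproduces to within rounding of the reported means and SDs (the letter cut-offs differ by about 0.1); they are not a statement of the exact procedure. The function names are hypothetical.

```python
def cutoff(mean, sd, z=1.96):
    """Assumed cut-off rule: group mean plus z standard deviations."""
    return mean + z * sd


def exceeds_both(quest_score, consistency_score, quest_cutoff, consistency_cutoff):
    """Assumed criterion: a participant must exceed both cut-offs."""
    return quest_score > quest_cutoff and consistency_score > consistency_cutoff


# Means and SDs as reported in Table 1 for the Arabic respondents.
quest_cutoff = cutoff(17.7, 4.81)    # questionnaire score
number_cutoff = cutoff(8.55, 5.61)   # consistency score for numbers

print(round(quest_cutoff, 1), round(number_cutoff, 2))

# A questionnaire score of 27 falls just below the questionnaire cut-off,
# so this participant is not flagged even with a maximal consistency score.
print(exceeds_both(27, 20, quest_cutoff, number_cutoff))
```

With any such rule a small proportion of participants will exceed the cut-off by construction, which is why the resulting count is not read as a prevalence estimate.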

Conclusion

In recent years there has been growing acknowledgement that psychology needs to expand its investigations to a more representative sample of humanity (Henrich et al., 2010; Majid & Levinson, 2010). We find color associations to letters and numbers in an Arabic speaking population differ in many respects from previous results for speakers of other languages, such as English. In fact, the results suggest grapheme-color associations are determined to a large extent by language-internal psycholinguistic factors such as letter frequency, the intrinsic ordering of letters, and the morphological structure of Arabic. Other features previously suggested as important, such as letter shape and sound, appear to play a negligible role. Grapheme-color associations appear to be the result of statistical associations derived from the particular language and cultural context speakers find themselves in. The findings together show linguistic and psycholinguistic features have to be carefully examined in each language individually in order to provide a pan-cultural account of grapheme-color associations.

Acknowledgments

Author statement: The initial idea of this study and its design were conceptualized by AM, TvL, MD, and BT. BT and AA collected the data. Data coding was done with the assistance of Peter Nijland and Dirk Hage under the supervision of AM. AA checked the Arabic responses where automated responses were not available. TvL conducted the statistical analyses under AM’s supervision. TvL and AM jointly wrote the paper, with MD contributing to revisions. Ludy Cilissen provided invaluable assistance with the figures. We would like to thank William Marslen-Wilson and Sami Boudelaa for help with Aralex. Thanks also to Kees Versteegh for answering Arabic questions. This research was funded by the Max Planck Gesellschaft. We would like to thank the members of the Language and Cognition department, and in particular Steve Levinson, for supporting this research.

References

al-Jehani N. M. (1990). Color terms in Mecca: A sociolinguistic perspective. Anthropological Linguistics 32, 163-174.
Al-Rasheed A. S., Al-Sharif . H., Thabit M. J., Al-Mohimeed N. S., Davies I. R. L. (2011). Basic Colour Terms of Arabic, in: New Directions in Colour Studies (Biggam CP, Hough CA, Kay CJ, Simmons DR, eds), pp 53-58: John Benjamins Publishing Company.
Asano M., Yokosawa K. (2011). Synesthetic colors are elicited by sound quality in Japanese synesthetes. Conscious. Cogn. 20, 1816-1823.
Asano M., Yokosawa K. (2012). Synesthetic colors for Japanese late acquired graphemes. Conscious. Cogn. 21, 983-993.
Asano M., Yokosawa K. (2013). Grapheme learning and grapheme-color synesthesia: Toward a comprehensive model of grapheme-color association. Front. Hum. Neurosci. 7.
Baron-Cohen S., Wyke M. A., Binnie C. (1987). Hearing words and seeing colours: an experimental investigation of a case of synaesthesia. Perception 16, 761-767.
Battig W. F., Montague W. E. (1969). Category norms for verbal items in 56 categories - A replication and extension of Connecticut category norms. J. Exp. Psychol. 80, 1-45.
Bazzi I., LaPre C., Makhoul J., Raphael C., Schwartz R. (1997). Omnifont and Unlimited-Vocabulary OCR for English and Arabic. Proceedings of the Fourth International Conference on Document Analysis and Recognition 2, 842-846.
Beeli G., Esslen M., Jancke L. (2007). Frequency correlates in grapheme-color synaesthesia. Psychol. Sci. 18, 788-792.
Berlin B., Kay P. (1969). Basic Color Terms: Their Universality and Evolution. Berkeley, USA: University of California Press.
Boudelaa S., Marslen-Wilson W. D. (2010). Aralex: a lexical database for Modern Standard Arabic. Behav. Res. Methods 42, 481-487.
Brang D., Rouw R., Ramachandran V. S., Coulson S. (2011). Similarly shaped letters evoke similar colors in grapheme-color synesthesia. Neuropsychologia 49, 1355-1358.
Buckwalter T. (2002). Buckwalter Arabic Morphological Analyzer Version 1.0 LDC2002L49. Philadelphia, USA: Linguistic Data Consortium.
Campbell G. L. (1997). Handbook of Scripts and Alphabets. New York: Routledge.
Cohen Kadosh R., Henik A., Walsh V. (2007). Small is bright and big is dark in synaesthesia. Curr. Biol. 17, R834-R835.
Coulmas F. (2013). Writing and Society: An introduction. Cambridge: Cambridge University Press.
Day S. A. (2004). Trends in Synesthetically Colored Graphemes and Phonemes -- 2004 revision. http://www.daysyn.com/Day2004Trends.pdf.
Deroy O., Spence C. (2013). Why we are not all synesthetes (not even weakly so). Psychon. Bull. Rev. 20, 643-664.
Elbeheri G., Everatt J. (2006). Literacy Ability and Phonological Processing Skills amongst Dyslexic and Non-Dyslexic Speakers of Arabic. Reading and Writing 20, 273–94.
Foster G. M. (1979). Methodological Problems in the Study of Intracultural Variation: The Hot/cold Dichotomy in Tzintzuntzan. Human Organization 38, 179–183.
Henrich J., Heine S. J., Norenzayan A. (2010). The weirdest people in the world. Behav. and Brain Sciences 33, 1–75.
Hochel M., Milán E. G. (2008). Synaesthesia: The existing state of affairs. Cogn. Neuropsychol. 25, 93-117.
Hung W.-Y., Simner J., Shillcock R., Eagleman D. M. (2014). Synaesthesia in Chinese characters: The role of radical function and position. Conscious. Cogn. 24, 38-48.
Jürgens U. M., Nikolić D. (2012). Ideaesthesia: Conceptual processes assign similar colours to similar shapes. Transl. Neurosci. 3, 22-27.
Cohen Kadosh R., Cohen Kadosh K., Henik A. (2008). When brightness counts: The neuronal correlate of numerical-luminance interference. Cereb. Cortex 18, 337-343.
Kusnir F., Thut G. (2012). Formation of automatic letter–colour associations in non-synaesthetes through likelihood manipulation of letter–colour pairings. Neuropsychologia 50, 3641-3652.
Lewis P. M., Simons G. F., Fennig C. D. (2014). Ethnologue: Languages of the World, Seventeenth edition. Dallas, Texas: SIL International.
Majid A., Levinson S. C. (2010). WEIRD languages have misled us, too. Behav. and Brain Sci. 33, 103.
Malt B. C., Majid A. (2013). How Thought Is Mapped into Words. Wiley Interdisciplinary Reviews: Cognitive Science 4, 583–597.
Marks L. E. (1975). On Colored-Hearing Synesthesia: Cross-Modal Translations of Sensory Dimensions. Psychol. Bull. 82, 303-331.
Mompean Guillamon P. (2013). Vowel-colour symbolism in English and Arabic: a comparative study. Miscelánea: A Journal of English and American Studies 47.
Rich A. N., Bradshaw J. L., Mattingley J. B. (2005). A systematic, large-scale study of synaesthesia: implications for the role of early experience in lexical-colour associations. Cognition 98, 53-84.
Rothen N., Meier B. (2010). Higher prevalence of synaesthesia in art students. Perception 39, 718-720.
Rouw R., Gosavi R., Case L., Ramachandran V. (2014). Color associations for days and letters across different languages. Front. Psychol. 5.
Ryding K. C. (2005). A Reference Grammar of Modern Standard Arabic. Cambridge: Cambridge University Press.
Shanon B. (1982). Colour Associates to Semantic Linear Orders. Psychological Research 44, 75-83.
Simner J., Hung W. Y., Shillcock R. (2011). Synaesthesia in a logographic language: The colouring of Chinese characters and Pinyin/Bopomofo. Conscious. Cogn. 20, 1376-1392.
Simner J., Ward J., Lanz M., Jansari A., Noonan K., Glover L., Oakley D. A. (2005). Non-random associations of graphemes to colours in synaesthetic and non-synaesthetic populations. Cogn. Neuropsychol. 22, 1069-1085.
Simner J., Mulvenna C., Sagiv N., Tsakanikos E., Witherby S. A., Fraser C., Scott K., Ward J. (2006). Synaesthesia: the prevalence of atypical cross-modal experiences. Perception 35, 1024-1033.
Smilek D., Carriere J. S. A., Dixon M. J., Merikle P. M. (2007). Grapheme Frequency and Color Luminance in Grapheme-Color Synaesthesia. Psychol. Sci. 18, 793-795.
Spector F., Maurer D. (2008). The colour of Os: Naturally biased associations between shape and colour. Perception 37, 841-847.
Spector F., Maurer D. (2011). The colors of the alphabet: Naturally-biased associations between shape and color. J. Exp. Psychol. Hum. Percept. Perform. 37, 484-495.
Ward J., Simner J. (2003). Lexical-gustatory synaesthesia: linguistic and conceptual factors. Cognition 89, 237-261.
Ward J., Huckstep B., Tsakanikos E. (2006). Sound-colour synaesthesia: to what extent does it use cross-modal mechanisms common to us all? Cortex 42, 264-280.
Watson M., Akins K., Enns J. (2012). Second-order mappings in grapheme-color synesthesia. Psychon. Bull. Rev. 19, 211-217.
Watson M. R., Akins K., Enns J. T. (2010). Similar letters, similar hues - Shape-Colour isomorphism in grapheme-colour synaesthesia, in: Meeting of the UK Synaesthesia Association. Brighton, UK.
Wollen K. A., Ruggiero F. T. (1983). Colored-letter synesthesia. J. Ment. Imagery 7, 83-86.

Figure legends

Figure 1. Arabic alphabet and digits
Panel A depicts the Arabic alphabet. The letters should be read right-to-left. ʼalif ا is the first letter of the alphabet. Corresponding sounds are given in the light grey panel, and the Arabic letter name is in the dark grey panel. Panel B depicts Arabic digits (also known as Eastern Arabic numbers). The numbers are read from left-to-right. The top line shows the Arabic digits used in the study. The corresponding English numbers follow in the light grey panel. The Arabic names for the numbers are given below in the dark grey panel. Adapted from Ryding (2005).

Figure 2. Details of Arabic script
Panel A shows how vowels can be represented in Arabic using diacritics, or in combination with three of the letters from the alphabet. Panel B shows how Arabic letters change shape when written in words. Three color words participants produced in the study are shown on the left side, with the corresponding fully written out letters on the right side. Panel C gives an illustrative example of the rich morphology of Arabic. The Arabic word is shown on the right, with the and word class (n = noun; v = verb), and translation. Panel D shows how three of the color words change their form as a function of grammatical gender and number.

Figure 3. Synaesthesia prevalence and scoring distributions
Distribution of scores on the synaesthesia questionnaire for Arabic and English respondents.

Figure 4. Letter-color associations for Arabic and English respondents
Left panel shows the above chance Arabic color responses to Arabic letters. English translations for the Arabic words are shown on the right-hand side, with their corresponding chance values. English color associations to Arabic letters are shown in the right panel.

Figure 5. Digit-color associations for Arabic and English respondents
Left panel shows the above chance Arabic color responses to Eastern Arabic numbers. English translations for the Arabic words are shown on the right-hand side, with their corresponding chance values. English color associations to Eastern Arabic numbers are shown in the right panel.

Figure 6. Relationship between letter frequency and color words associated
Higher frequency color words (e.g., white) were more likely to be associated with high frequency Arabic letters, whereas lower frequency words (e.g., violet) were more likely to be associated with lower frequency letters.

Figure 7. Letter shape and color associations
Letters are depicted in groups based on similarity of shape. Significant color associations are marked with an asterisk. Where there was no significant color association for a letter, the most frequently associated color is depicted with a cross. There is no apparent systematic relationship between letter shape and color association.

Figure 8. Letter sound and color associations
Letters are plotted according to their sound characteristics. Where there was no significant association for a letter, the most frequent color association is depicted. There are no apparent systematic relationships between letter sounds and color associations.

Tables

Table 1. Synaesthesia
score = 36) off off aesthe- 13 quest consist tes 14Arabic Completed: N= 53 Letters Mean: 24.2 ± SD = 12.6 N=2 N=1 N =1 15 (N=51)** 16 Mean: 17.7±SD = 4.81 Cut-off: 48.8 17 *** 18 Numbers Mean: 8.55 ± SD = 5.61 N=0 N=1 - 19 Cut-off: 27.1 (N=47) 20 Cut-off: 19.54 21English Completed: N= 23 Letters Mean: 17.0 ± SD =12.8 N=1 N=0 - 22 (N=19) 23 Mean: 16.3±SD = 5.53 Cut-off: 42.2 24 Numbers Mean: 8.67 ± SD = 5.44 N=1 N=1 N=1 25 Cut-off: 27.1 (N=18) 26 27 Cut-off: 19.3 28 * Letters: maximum score 56, numbers: maximum score 20 29 ** Number of participants that completed both sessions and for whom a consistency score 30 could be calculated 31 *** One participant reached a score of 27, just below the cut-off of 27.1 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 39 63 64 65 1 GRAPHEME-COLOR ASSOCIATIONS IN ARABIC 2 3 4 Table 2. Letter frequencies in Arabic 5 6 7 8 9 Letter 10 Arabic letter frequency1 12.5 ا 11 12.07 ل 12 13 6.61 ن 14 6.52 م 15 6.36 ي 16 17 5.8 و 18 5.08 ه 19 4.67 ب 20 21 4.2 ر 22 4.01 ع 23 2.84 ف 24 25 2.69 ق 26 2.67 د 27 2.61 ت 28 29 2.47 س 30 2.01 ك 31 1.86 ح 32 33 1.23 ج 34 1.04 ص 35 0.96 ذ 36 0.87 ث 37 38 0.79 خ 39 0.73 ش 40 0.52 ز 41 42 0.5 ط 43 0.44 ض 44 0.33 غ 45 46 0.18 ظ 47 1 48 Derived from Intellaren.com 49 50 51 52 53 54 55 56 57 58 59 60 61 62 40 63 64 65 1 GRAPHEME-COLOR ASSOCIATIONS IN ARABIC 2 3 4 Table 3. Color word frequencies in Arabic 5 6 7 8 Orthographic 9 frequency 10 Raw Arabic Arabic form 11 form Color Aralex

apricot x مشمشي 12 13 black 7.8 أﺳﻮد 14 black 4.19 اﺳﻮد 15 blue 1.92 أزرق 16 17 blue 1.09 ازرق 18 brown 43.15 بنى 19 crimson 0.23 قرمزى 20 21 cumin 0.08 كمﻮني 22 golden 1.85 دهبي 23 gray 0.83 رمادي 24 25 green 2.86 أخضر 26 green 1.35 اخضر 27 honey x عسلى 28 29 lemon 0.03 ليمﻮني 30 orange 0.39 برتقالى 31 peach 0.03 خﻮخي 32 pink 1.43 وردي 33 34 purple 0.03 أرجﻮاني 35

purple 0.13 بنافسجى 36 red 4.97 أحمر 37 38 red 2.18 احمر 39 silver 0.44 فضي 40 turquoise 0.26 فيروزي 41 42 violet 0.13 بنفسجي 43 white 6.94 أبيض 44 white 3.88 ابيض 45 46 yellow 1.17 أصفر 47 yellow 0.57 اصفر 48 49 50 51 x denotes color word for which no frequency information was listed in Aralex 52 53 54 55 56 57 58 59 60 61 62 41 63 64 65 1 GRAPHEME-COLOR ASSOCIATIONS IN ARABIC 2 3 4 Table 4. Order effects in color associations 5 6 7 Correlation Berlin & Correlation Berlin & 8 9 Kay hierarchy with Kay hierarchy with 10 intrinsic stimulus random presentation 11 order order of stimuli 12 13 Random 1, letters first 14 Letters Arabic (N=100) p<0.01, r=0.260 n.s. 15 Letters English (N=86) p<0.05, r=0.254 n.s. 16 17 Numbers E. Arabic (N=158) p<0.01, r=0.250 p<0.01, r=0.230 18 Numbers English (N=131) p<0.05, r=0.206 n.s. 19 20 Random 2, numbers first 21 Letters E.Arabic (N=120) n.s. n.s. 22 Letters English (N=46) p=0.068, r=0.271 n.s. 23 24 Numbers E. Arabic (N=212) p<0.0001, r=0.280 n.s. 25 Numbers English (N=87) n.s. n.s. 26 Fixed 1, letters first 27 28 Letters Arabic (N=117) n.s. NA 29 Letters English (N=48) n.s. NA 30 Numbers E. Arabic (N=186) p<0.001, r=0.350 NA 31 32 Numbers English (N=67) p<0.05, r=0.279 NA 33 Fixed 2, numbers first 34 Letters Arabic (N=66) n.s. NA 35 36 Letters English (N=36) n.s. NA 37 Numbers E. Arabic (N=106) p<0.001, r=0.482 NA 38 Numbers English (N=58) p<0.001, r=0.426 NA 39

For letter stimuli, the first 6 letters of the alphabet were used in the analyses. For numbers, all numbers were included (N=10). Only stimuli belonging to Session 1 were analyzed. Reporting Pearson correlations. n.s. = non-significant (p>0.10).

Table 5. Richness of color responses

                             Total responses   Rich responses   % Rich responses
Non-synaesthetic responses   4970              482              9.70
Synaesthetic responses       1948              374              19.20

Rich responses are non-standard color words and multi-word responses.

Table 6. Reported grapheme-color associations for non-synaesthetes

A:
Graphemes: a b c d e f g h i j k l m n o p q r s t u v w x y z
Dutch¹: ■■ ■■ ■■ ■ ■■ ■ □■ ■ ■■ ■■■■■ ■■■ ■■■■ □■■■
English¹⁻³: ■ ■■■ ■ ■■ ■■■ ■■ ■ ■□■ □ ■ ■■ ■□ ■■ ■ ■ ■■ ■ □ ■■ ■ ■
German⁴: ■ ■■ ■ ■ ■ ■■ ■■ ■ ■■ ■ ■ ■■ ■■ ■■ ■■ ■ □ ■ ■ ■■ ■ ■■ ■ □ ■ ■ ■■

B: Devanāgarī
Graphemes: अ /a/ उ /u/ क /k/ व /v,w/ ह /h/
Hindi¹: □ ■ ■ ■ ■ ■

C: Arabic⁵
Graphemes: S ص sh ش s س z ز r ر dh ذ x خ H ح j ج t ت b ب aa’ ا
Egyptian Arabic: □■ ■■ ■ ■■ ■■ ■ ■ ■■ ■■ ■■ ■ ■
Graphemes: y ي w و h ه m م l ل k ك q ق f ف gh غ [ʔ] ع Z ظ D ض
Egyptian Arabic: ■ ■■ ■ ■ ■ ■■ ■ ■■ ■ □■ ■■ □

D: Numerals
Graphemes: 0 1 2 3 4 5 6 7 8 9
English²: □ ■ ■ ■
Graphemes: ٩ ٨ ٧ ٦ ٥ ٤ ٣ ٢ ١ ٠
Eastern Arabic⁵: ■□ ■□ ■ ■ ■■ ■ ■■ ■ ■■

Rich et al. (2005) and Simner et al. (2005) do not mention whether graphemes were tested in lowercase or uppercase; the first reports results using uppercase, the second using lowercase. In Rouw et al. (2014), all letters except n,b,l were presented in uppercase.
Although non-synaesthetes’ grapheme-colour associations have also been collected for the Chinese Hanzi and Japanese Hiragana, Katakana and Kanji scripts, the associations have not been reported. In all of the writing systems above, the relation between graphemes and phonemes is complex and involves one-to-many or even many-to-many mappings. Thus, while transliterations for the non-Latin alphabets are included, there is no guarantee that Hindi व /w,v/ is identical to English /w/ or Arabic و /w/, just as there is no guarantee that English “w” and German “w” or English “a” and Dutch “a” refer to the same sound. To avoid such misinterpretations, we have not attempted to align the sounds across the writing systems here.

1. Rouw, R., Case, L., Gosavi, R. & Ramachandran, V. Color associations for days and letters across different languages. Cogn. Sci. 5, 369 (2014).
2. Rich, A. N., Bradshaw, J. L. & Mattingley, J. B. A systematic, large-scale study of synaesthesia: implications for the role of early experience in lexical-colour associations. Cognition 98, 53–84 (2005).
3. Spector, F. & Maurer, D. The colors of the alphabet: Naturally-biased associations between shape and color. J. Exp. Psychol. Hum. Percept. Perform. 37, 484–495 (2011).
4. Simner, J. et al. Non-random associations of graphemes to colours in synaesthetic and non-synaesthetic populations. Cogn. Neuropsychol. 22, 1069 (2005).
5. Current study
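
The significance levels in Table 4 can be roughly cross-checked from the reported r and N alone. The sketch below is illustrative and is not the authors' analysis. It assumes that N is the number of paired observations entering each correlation, and it uses the Fisher z approximation for convenience (the table states only that Pearson correlations are reported, not which test produced the p-values). The function name `approx_p_from_r` is hypothetical.

```python
import math
from statistics import NormalDist


def approx_p_from_r(r: float, n: int) -> float:
    """Approximate two-sided p-value for a Pearson r via the Fisher z transform.

    Assumption: n is the number of paired observations (n > 3).
    This is a normal approximation, not an exact t-test.
    """
    if n <= 3 or not -1.0 < r < 1.0:
        raise ValueError("need n > 3 and -1 < r < 1")
    # Fisher z of r, divided by its standard error 1 / sqrt(n - 3)
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Two rows copied from Table 4 (intrinsic stimulus order column).
rows = [
    ("Random 1, Letters Arabic", 0.260, 100),   # reported as p<0.01
    ("Random 2, Letters English", 0.271, 46),   # reported as p=0.068
]
for label, r, n in rows:
    p = approx_p_from_r(r, n)
    print(f"{label}: r={r:.3f}, N={n}, approximate p={p:.3f}")
```

Under these assumptions the approximate p-values agree with the significance levels reported for these two rows. Small differences from an exact test are expected, particularly for the smaller samples.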