A Network Science Approach to Bilingual Code-Switching
Total Page:16
File Type:pdf, Size:1020Kb
Proceedings of the Society for Computation in Linguistics Volume 4 Article 3 2021 A Network Science Approach to Bilingual Code-switching Qihui Xu Graduate Center, City University of New York, [email protected] Magdalena Markowska Stony Brook, State University of New York, [email protected] Martin Chodorow Hunter College, City University of New York, [email protected] Ping Li The Hong Kong Polytechnic University, [email protected] Follow this and additional works at: https://scholarworks.umass.edu/scil Part of the Computational Linguistics Commons Recommended Citation Xu, Qihui; Markowska, Magdalena; Chodorow, Martin; and Li, Ping (2021) "A Network Science Approach to Bilingual Code-switching," Proceedings of the Society for Computation in Linguistics: Vol. 4 , Article 3. DOI: https://doi.org/10.7275/raze-1b18 Available at: https://scholarworks.umass.edu/scil/vol4/iss1/3 This Paper is brought to you for free and open access by ScholarWorks@UMass Amherst. It has been accepted for inclusion in Proceedings of the Society for Computation in Linguistics by an authorized editor of ScholarWorks@UMass Amherst. For more information, please contact [email protected]. A Network Science Approach to Bilingual Code-switching Qihui Xu Magdalena Markowska Department of Psychology Department of Linguistics Graduate Center, CUNY Institute for Advanced Computational Science [email protected] Stony Brook University [email protected] Martin Chodorow Ping Li Department of Psychology Department of Chinese and Bilingual Studies Hunter College, CUNY The Hong Kong Polytechnic University [email protected] [email protected] Abstract introduces a new possible angle for code-switching research. Previous research has shown that the struc- While code-switching could be analyzed at the ture of the semantic network can influence lan- guage production, such that a word with low sentence-level from the syntactic (Poplack, 2000; clustering coefficient (C) is more easily re- Santorini and Mahootian, 1995) and pragmatic per- trieved than a word with high C. In this study, spectives (Beebe and Giles, 1984; Myers-Scotton, we used a network science approach to exam- 1993), or at the phrase-level (Couto and Gullberg, ine whether the network structure accounts for 2019), our study focuses on a word-level account. why bilinguals code-switch. We established Lexical properties of words, such as frequency, semantic networks for words in each language, length, concreteness, and imageability (Gross and then measured the C for each code-switched Kaushanskaya, 2015; Marian, 2009; Gollan et al., word and its translated equivalent. The results showed that words where language is switched 2014; Gollan and Ferreira, 2009) may account for have lower C than their translated equivalents code-switching speech. Among these, word fre- in the other language, suggesting that the struc- quency has been characterized as a classical in- tures of the lexicons in the two languages play dicator of lexical accessibility that influences lan- an important role in bilingual code-switching guage choice in code-switching (Gollan and Fer- speech. reira, 2009; Gross and Kaushanskaya, 2015; Gollan et al., 2014). For example, Gross and Kaushan- 1 Background skaya(2015) found that words with a higher fre- Code-switching is defined as “the alternation be- quency of use for bilingual children were more tween two (or more) languages within a single dis- likely to be produced than corresponding transla- course, sentence, or constituent” (Poplack, 2000). tions in the other language. The phenomenon has been widely observed in bilin- Strikingly, almost all previous studies of word- guals with a variety of different language pairs level code-switching have focused on a local rather (Poplack, 2000; Santorini and Mahootian, 1995; than a global context. A local context treats Barman et al., 2014). Understanding the mech- each word as being independent from other words, anism behind bilingual code-switching speech is whereas a global context emphasizes the inter- of particular importance to psycholinguistics and connections and the interactions between words natural language processing. In psycholinguistics, (Karuza et al., 2016). In addition to the attributes it provides key clues to bilinguals’ coordination of a word itself, such as its frequency and con- between the representations of two or more lan- creteness, the global system of words has also guages from structural, psychological and social been shown to affect many aspects of language pro- perspectives (Bullock and Toribio, 2009). In nat- cessing (Hills et al., 2009; Sizemore et al., 2018; ural language processing, on the other hand, its Steyvers and Tenenbaum, 2005; Chan and Vite- mechanisms could provide a strong foundation for vitch, 2009, 2010). In the domain of bilingual tasks, such as language identification (Lyu et al., code-switching, however, little has been done to 2006; Barman et al., 2014) and syntactic parsing understand how a word, in a global interconnected (Broersma, 2009), among others. This study pro- lexical system, can affect the process of lexical poses and tests an explanation of why bilingual selection. In this study, we address this question speakers switch their codes, and, in doing so, it through a network science approach. 18 Proceedings of the Society for Computation in Linguistics (SCiL) 2021, pages 18-27. Held on-line February 14-19, 2021 In the following sections, we review theoreti- number of studies have revealed the influence of cal background of how a word and its associations network structure on language acquisition (Size- with other words could potentially account for code- more et al., 2018; Hills et al., 2009), representa- switching speech. First, we introduce the funda- tion (Steyvers and Tenenbaum, 2005) and produc- mental theory of what drives bilinguals to switch tion (Goldrick and Rapp, 2007; Chan and Vitevitch, codes automatically, namely, the accessibility- 2010). driven account of (Kleinman and Gollan, 2016). Semantic networks, an important domain in net- Next, we review how the structure of the network work science, provide the means of organizing and of words moderates the accessibility of those words. representing the meanings of words. In a typical Finally, a bridge between the semantic network semantic network, words are represented as nodes, structure and code-switching speech is proposed and the semantic associations between them are rep- by introducing a particular structural measure (i.e. resented as edges. In a weighted semantic network, the clustering coefficient) which is used to examine the weight of an edge is determined by the strength the proposed account. of the shared semantic association between the two words. Figure1 illustrates a weighted semantic 1.1 Accessibility-driven switching network. Empirical evidence has shown that the process of code-switching can be effortless (Li, 1996; Gollan and Ferreira, 2009; Kleinman and Gollan, 2016). When bilinguals switch back and forth between languages in a free conversation, they can often do so as efficiently as communicating solely in one language, without any additional cognitive cost. Other studies have shown that the switching of the code is driven by prior information (Li, 1996), which was has been referred to as the ‘accessi- bility’ of the code (Kleinman and Gollan, 2016; de Bruin et al., 2018; Gollan et al., 2014; Gross and Figure 1: An illustration of a weighted semantic net- work of four semantically-related words. Each node Kaushanskaya, 2015). de Bruin et al.(2018) found represents a word, while the edge linking the nodes rep- that when doing voluntary code-switching, partic- resents the semantic association between those words. ipants would choose the word that could be pro- The weight of the edge is determined by the semantic duced faster in a previous picture naming task than similarity between the two linked words. The thicker the corresponding word in the other language. As lines in the graph indicate higher semantic similarity more accessible words are believed to be produced between the two words. faster when being cued, the authors argued that lan- guage choice during code-switching is driven by Semantic networks have provided a mechanistic lexical accessibility. account for conducting research in various domains Previous evidence for the accessibility account of language research, such as first language acqui- comes mainly from word-by-word switching in sition (Hills et al., 2009; Sizemore et al., 2018), experimental settings, which is ecologically dis- lexical memory (Vitevitch et al., 2012), word recog- similar to the intrasentential code-switching found nition (Yates, 2013), to name a few. By comparing in a spontaneous speech. As a result, the present the properties of a semantic network and human study examines code-switching speech recorded in behavior, researchers have also found that vari- natural conversations. ous properties of the network may be correlated with human language processing and representa- 1.2 Words in network science tion (Vitevitch et al., 2012; Goldrick and Rapp, Network science has been increasingly applied in 2007; Storkel et al., 2006). One such network prop- the research fields of linguistics and cognitive sci- erty that is examined in this study is the clustering ence. It aims to answer questions about how im- coefficient (Watts and Strogatz, 1998), which is a mensely complex pieces of information interact commonly-used standard measurement for describ- with each other and how they shape human cogni- ing network structure that has been shown to influ- tion and behaviors (Karuza et al., 2016). A large ence the accessibility of the spoken words (Siew, 19 2019; Chan and Vitevitch, 2009, 2010; Vitevitch Vitevitch(2009), in their word recognition task, re- et al., 2011). ported that participants responded more slowly and less accurately to words with high C (e.g. ‘dish’, The clustering coeffi- Clustering coefficients ‘full’, ‘lool’) than to words with low C (e.g. ‘bush’, cient (C) represents the probability that two neigh- ‘wide’, ‘lick’). bors of a node are themselves neighbors (Watts and Strogatz, 1998).