Accommodation of Conversational Code-Choice
Total Page:16
File Type:pdf, Size:1020Kb
Accommodation of Conversational Code-Choice Anshul Bawa Monojit Choudhury Kalika Bali Microsoft Research Bangalore, India ft-anbaw,monojitc,[email protected] Abstract code-choice may be unexpected and noticed by other speakers, and is likely to affect other speak- Bilingual speakers often freely mix lan- ers’ subsequent code-choice. In other words, guages. However, in such bilingual con- speakers may accommodate to each other’s code- versations, are the language choices of choice, positively or negatively (Genesee, 1982). the speakers coordinated? How much In this work, we propose a set of metrics to does one speaker’s choice of language study the social accommodation of code-choice affect other speakers? In this paper, as a sociolinguistic style marker. We build upon we formulate code-choice as a linguis- the existing framework on accommodation by tic style, and show that speakers are in- Danescu-Niculescu-Mizil et al.(2011) and adapt deed sensitive to and accommodating of that for code-choice by introducing relevant fea- each other’s code-choice. We find that tures for code-choice. We then motivate and illus- the salience or markedness of a language trate the effect of code markedness on the degree in context directly affects the degree of of accommodation - the more salient code is more accommodation observed. More impor- strongly accommodated for. We further generalize tantly, we discover that accommodation of the framework to also account for delayed accom- code-choices persists over several conver- modation, instead of only next-turn or immediate sational turns. We also propose an alter- accommodation. native interpretation of conversational ac- In addition, we introduce an alternative view of commodation as a retrieval problem, and accommodation as a query-response task, and em- show that the differences in accommo- ploy mean reciprocal rank, a well-understood met- dation characteristics of code-choices are ric from the domain of Information Retrieval, as a based on their markedness in context. metric for latency of accommodation. We mea- sure how quickly a style marker (code-choice in 1 Introduction our case) introduced by a speaker is retrieved by Code-switching (CS) refers to the fluid alter- the other speaker during the conversation. Our ation between two or more languages within a approach is developed for analyzing code-choice conversation, and is a common feature of all but is applicable to other dimensions of linguis- multilingual societies. (Auer, 2013). Multilin- tic style as well (Tausczik and Pennebaker, 2010). gual speakers are known to code-switch in spo- This presents an alternative view of conversational ken conversations for a variety of reasons, moti- style accommodation and offers a simple but ef- vated by information-theoretic and cognitive prin- fective way of measuring, characterizing and even ciples, and also as a result of numerous social, predicting elements of conversational style. communicative and pragmatic functions (Scotton We test this formulation on two CS conversa- and Ury, 1977;S oderberg¨ Arnfast and Jørgensen, tional datasets - dialog scripts of bilingual Indian 2003; Gumperz, 1982). movies (in English and Hindi) and a transcrip- Code-choice refers to a speaker’s decision of tion of real-world conversations between Spanish- which code to use in a given utterance, and in English bilinguals in Florida, US. In both the case of a CS utterance, to what extent the differ- corpora, we observe strong signals of interper- ent codes are to be used. Depending on the soci- sonal code-choice accommodation for the salient olinguistic and conversational context, a speaker’s or marked code. We also observe that on average, 82 Proceedings of The Third Workshop on Computational Approaches to Code-Switching, pages 82–91 Melbourne, Australia, July 19, 2018. c 2018 Association for Computational Linguistics the marked code is accommodated within the first their linguistic styles towards (or away from) each three to four conversational turns, beyond which other in a conversation for social effect. In the the effect of accommodation on code-choice de- CAT framework, the interlocutors’ desire for ‘so- cays gradually. Contextually-unmarked code is cial approval’ results in an attempt to match each less strongly accommodated for, even when it oc- other’s linguistic style. Accommodation has been curs relatively infrequently within a conversation. studied for many markers of linguistic style like As far as we know, this is the first computa- tense, negations, articles, prepositions, pronouns tional study of code-choice accommodation, and and sentiment (Taylor and Thomas, 2008; Nieder- first work that introduces and formalizes the con- hoffer and Pennebaker, 2002). cept of delayed accommodation, that can be ap- Since it is possible to convey the same semantic plied to other style dimensions as well. content while widely varying the extent of CS, we The rest of the paper is organized as follows. also consider code-choice as a linguistic style di- We describe the background and related work in mension. Therefore, we expect to observe accom- Section2, which motivates the first formulation of modation in terms of code-choice in similar man- code-choice accommodation in Section3. We im- ner to that of variables for other linguistic styles. prove this by formulation by modifying the fea- While there have been linguistic and small-scale tures in Section4. We generalize the formulation studies (Sachdev and Giles, 2004; Bourhis, 2008; to multiple turns and introduce the analogy to re- Bissoonauth and Offord, 2001; y Bourhis et al., trieval in Section5, along with the results. We 2007) that argue for prevalence of code-choice ac- wrap up with a discussion in Section6 that we commodation, there are no large-scale quantitative conclude in Section7. or computational studies that corroborate this and shed light on the various patterns of code-choice 2 Related Work accommodation. Further, these studies rely on simple correlation-based measures. CS is employed by speakers to signal a common multilingual identity(Auer, 2005), and can be ef- The first computational study of linguistic style fectively used to reduce (or increase) the perceived accommodation (Danescu-Niculescu-Mizil et al., social distance between the speakers (Camilleri, 2011) shows that it is highly prevalent in Twit- 1996). As a marker of informality, it has been ter conversations. They use binary features for shown to lower interpersonal distance (Myers- the presence of various psychologically meaning- Scotton, 1995; Genesee, 1982). ful word categories as described by the LIWC Common structural patterns in CS as well as method(Tausczik and Pennebaker, 2010) to iden- the choice to switch between languages have tify stylistic variations in tweets. They then de- been the focus of many linguistic studies(Poplack, fine a probabilistic framework that mathematically 1988)(Auer, 1995). As CS is typically used as a models style accommodation in terms of the like- conversation strategy by bilinguals who are profi- lihood of an addressee to respond in the same style cient in both languages (Auer, 2013), it is not sur- as the speaker. prising that certain pragmatic and socio-linguistic Though CS is similar enough to other kinds of factors, such as formality of context (Fishman, linguistic style to allow analysis using the same 1970), age (Ervin-Tripp and Reyes, 2005), expres- framework, it also differs from them in being sion of emotion (Dewaele, 2010) and sentiment a strong sociological indicator of identity (Auer, (Rudra et al., 2016), are found to signal language 2005) and in not being processed nonconsciously preference in CS conversations. A Twitter study (Levelt and Kelter, 1982). We demonstrate that a of CS patterns across several geographies (Rijh- model that does not account for these crucial dif- wani et al., 2017), also suggests that there might be ferences fails to capture the accommodative pat- complex sociolinguistic reasons for code-choice. terns of code-choice. Because of being processed Thus, CS, and the choice of language or code consciously, code-choice also exhibits accommo- in which one communicates during a multilingual dation over several conversational turns, an ef- conversation, could be considered a marker of lin- fect which is not observed as strongly for other guistic style. style dimensions (Danescu-Niculescu-Mizil et al., Communication accommodation theory (Giles 2011). Long-term effects in accommodation have et al., 1973; Giles, 2007) states that speakers shift received very little attention, and have mostly 83 studied based on crude conversation-level correla- (Es denotes an expectation over all speakers s) tion values (Niederhoffer and Pennebaker, 2002). ∗ Acm (F ) = Es Acms(F ) 3 Accommodation of Code-Choice as = Es P δ F δS(d)=s δ F di di−1 (2) Linguistic Style − P (δ F δS(d)=s) di As a first step, we adapt an existing framework (Danescu-Niculescu-Mizil et al., 2011) that quan- 3.2 Measuring Code-Choice tifies accommodation of a given linguistic style. Our general hypothesis is that code-choice is re- Any linguistic feature is said to exhibit accommo- ciprocated in a bilingual conversation. To mea- dation if it is more likely to be expressed in re- sure this, we introduce simple binary features for sponse to a dialog that also expresses it, than oth- presence of each code, along the lines of the bi- erwise. In other words, an accommodative feature nary features in (Danescu-Niculescu-Mizil et al., in a dialog begets the same feature in the next dia- 2011), with individual language expression substi- log. We use the term ‘dialog’ or ‘turn’ to refer to a tuting for the style dimensions. For each language single spoken utterance or dialog within a conver- L, we define a feature FL indicating, for a dialog sation, and the term ‘speaker’ to refer to conversa- d, if the dialog contains words in the language L. tion participants.