
Journal of Sociolinguistics 5/2, 2001: 180–213

Language variation on Internet Relay Chat: A social network approach¹

John C. Paolillo University of Texas at Arlington and Indiana University

This paper examines linguistic variation on an Internet Relay Chat (IRC) channel with respect to the hypothesis, based on the model of Milroy and Milroy (1992), that standard variants tend to be associated with weak social network ties, while vernacular variants are associated with strong network ties. An analysis of frequency of contact as a measure of tie strength reveals a structured relationship between tie strength and several linguistic variants. However, the variant features are associated with social positions in a way that does not correlate neatly with tie strength. An account of these results is proposed in terms of the social functions of the different variables and the larger social context of IRC affecting tie strength.

KEYWORDS: Social networks, language variation, computer-mediated communication (CMC), Internet Relay Chat

1. INTRODUCTION

The 1990s will be remembered as the decade in which the Internet and its protocols for communication (e-mail, Listserv discussions, news, the World-Wide Web, Internet Relay Chat, etc.) arrived in the lives of millions of people around the world. Almost from the outset of its popularization, digital technology enthusiasts promoted the Internet as a means for fostering social connection and community-building among geographically dispersed people in `virtual communities' (Rheingold 1993), where `virtual' denotes, as in computer science, something manifest only in a realm of electronic information. As social structures manifest largely through language, virtual communities have also attracted the interest of sociolinguists studying computer-mediated communication (CMC) (Baym 1996; Cherny 1999; Collot and Belmore 1996; Ferrara, Brunner and Whittemore 1991; Herring 1994, 1996, in press; Murray 1988; Paolillo 1996). This research has focused predominantly on ethnographic and interactional aspects of electronic discourse: in-group language, patterns of participation, turn-taking, and message schemata. In these studies, virtual communities are viewed as speech communities or discourse communities that may or may not have a prior off-line existence, where some

© Published by Blackwell Publishers Ltd. 2001, 108 Cowley Road, Oxford OX4 1JF, UK and 350 Main Street, Malden MA 02148, USA.


important part of that group's participation takes place through computer-mediated communication. Such on-line social groups often exhibit variation in micro-linguistic structures. For example, in the community studied by Cherny (1999), prepositions could be deleted in certain widely used expressions: `Henry nods (at) George' (Cherny 1999: 89). As yet, few empirical studies exist relating such micro-linguistic variation to the social mechanisms by which virtual communities are structured and maintained.

Sociolinguistic research on computer-mediated communication faces an immediate problem, namely, articulating a basis from which to discuss the social identities of participants. Oftentimes, information about participants' `real-life' social identities can be difficult to obtain, because they use on-line monikers that obfuscate or falsify their off-line identities, and because they can change the on-line cues to their identities by using different computer accounts. In circumstances such as this, where there are few reliable cues to social structure, a social network approach is often beneficial, since it permits social structure to be observed directly from patterns of interaction (Milroy and Milroy 1992: 4–6). This is appealing for the study of CMC, where social background information is scarce, and where the only observable social structure may be that which emerges through on-line interaction. A small but growing number of CMC studies has adopted a social network perspective (Garton, Haythornthwaite and Wellman 1999; Rice 1994; Wellman and Gulia 1999). However, these studies are directed at the analysis of social interaction itself, and not linguistic variation.

In the present study, I conduct a social network analysis of linguistic variation in an on-line community on a form of multi-participant CMC called `Internet Relay Chat' (IRC), in which users interact in real time by typing text messages from their computer terminals.
The participants are mostly expatriate South Asians or their children who interact on the channel #india, using Hindi codeswitching together with more widespread features of IRC language. The study tests a hypothesis derived from the model of social network and language variation of Milroy and Milroy (1992), according to which vernacular variables – those which are not legitimized by the dominant group – are expected to be used more frequently by members of the community with stronger social network ties. The results of the analysis reveal a highly-structured relationship in which the variables' interactional functions and social values influence their association with tie strength, at times calling into question their hypothesized vernacular status. I account for these observations in terms of factors in the larger social context of IRC that both weaken the dominance of powerful groups and cause weak ties to predominate. Variables arise which are `standard', yet non-legitimized, through the role of weak ties in enforcing macro-social cohesion (Granovetter 1973).

The remainder of this paper is laid out as follows. Section 2 introduces IRC, Milroy and Milroy's (1992) model of social network and language variation, and the channel #india. Section 3 explains the data collection and analytical procedures used to measure tie strength and their relation to linguistic


variation. Section 4 presents the results of the social network analysis, its interpretation in terms of tie strength, the correlation of participant groups and linguistic variables, and the relationship of the observed variation to tie strength. Section 5 integrates the interpretation of these findings into a more general model of tie strength and language variation. Section 6 presents the conclusions drawn from this study for the sociolinguistics of CMC.

2. THE SOCIAL AND LINGUISTIC ENVIRONMENT OF IRC

2.1 The IRC medium

On IRC, users engage in simultaneous, multi-party interactions, hosted on `channels' that are maintained by a network of servers (Werry 1996). When users connect to a server on a particular IRC network using a program and `join' a channel, they immediately start to see messages from other users appear on their screens; as messages are added, the text on the screen scrolls upward, allowing the user to see new messages as they arrive. To send messages to the channel, users type in a buffer window and hit the return key. A typical IRC session from the channel #india is given in Example 1 (line numbers have been added in this and all other examples for ease of reference), in which the researcher joined #india under the nickname `gora'.

Example 1. A typical IRC log from the channel #india (all typographical and spelling errors appear as they occur in the original).²

1. *** gora ([email protected]) has joined channel #india
2. *** HighChief has left channel #india
3. * Zainaa PRince-1!!
4. *** Sandhya has left channel #india
5. <Dr_K9> Gujju- hehehe muthafucka, i know khamoshi is a bot, but someone is on the bot ad talking thru it .. that's who i am tlaking to!! heheh ur a moron, muthafucka !!
6. <shareen> ANYONE HERE FROM BKK?
7. *** bhim ([email protected]) has joined channel #india
8. <shareen> ANYONE HERE FROM BKK?
9. * Zainaa H E L L O ! PRince-1 H E L L O ! H E L L O ! PRince-1 H E L L O ! H E L L O ! PRince-1 H E L L O !
10. *** True^Love is now known as Nicole
11. <lilly> hi guys
12. <lilly> hi guys
13. <Dr_K9> hi lilly.. would u like to have a quickie netsex with me?
14. <lilly> yup me
15. *** Signoff: TripleX (Operation timed out)
16. <Pavitra> Gujju: hiya:)
17. *** amesha ([email protected]) has joined channel #india
18. *** Signoff: Sheraz (Connection reset by peer)


19. * Gujju walks slowly . . . towards. . . . Lilly. . . . . and screams. . . . HIIIIIiiiii ya lilly . . . to kisko milli?
20. <Gujju> Hi pavitra..hows it going?
21. * flamenco wonders if gujju is ajit the villain
22. <Pavitra> Gujju: Its going great..and urself?
23. *** pyckle ([email protected]) has joined channel #india
24. * Gujju is no villan
25. * Gujju is a gujju
26. * Gujju laughts
27. * flamenco wonders if ajit the villain is gujju
28. *** Gujju has left channel #india
29. *** Gujju ([email protected]) has joined channel #india
30. <Gujju> tnt2 OPME
31. <lilly> no i don't dr-k9
32. <Dr_K9> lilly- why not ?
33. *** Mode change ``+ooo Gujju'' on channel #india by KhAm0sHi
34. *** Signoff: DESIBABU (Leaving)
35. <flamenco> scatman:) ) hows u?
36. *** Akshay ([email protected]) has joined channel #india
37. *** pyckle has left channel #india
38. *** k0oLaS|Ce is now known as SEXPL0S|V
39. *** umdhall1 ([email protected]) has joined channel #india
40. <lilly> cause i don't wanna
41. * Gujju tells Dr_K9 there is no one talking throught the bot hahaha ...u idiot.. Khamoshi is programmed to do that..once in a while..randomly..
42. *** umdhall1 is now known as Panchaud
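For readers who wish to process logs of this kind, the three line types visible in Example 1 can be told apart by their opening characters: three asterisks for system messages, a single asterisk for `actions', and a nick in angle brackets for ordinary messages. The following Python sketch is offered only as an illustration. The function name `classify_line` and the regular expressions are assumptions of the sketch, not part of the procedure used in this study, and the line numbers added to the examples are assumed to have been removed beforehand.

```python
import re

# Illustrative patterns for the three line types seen in Example 1.
# These regular expressions are assumptions of this sketch.
SYSTEM = re.compile(r"^\*\*\* (?P<text>.*)$")                # *** ... has joined channel ...
ACTION = re.compile(r"^\* (?P<nick>\S+) ?(?P<text>.*)$")     # * Gujju laughts
MESSAGE = re.compile(r"^<(?P<nick>[^>]+)> ?(?P<text>.*)$")   # <lilly> hi guys


def classify_line(line):
    """Return (kind, nick, text) for a single log line.

    kind is one of 'system', 'action', 'message' or 'unknown';
    nick is None where the line type carries no speaker.
    """
    line = line.strip()
    # System lines are tested first so that '***' is never read as an action.
    m = SYSTEM.match(line)
    if m:
        return ("system", None, m.group("text"))
    m = ACTION.match(line)
    if m:
        return ("action", m.group("nick"), m.group("text"))
    m = MESSAGE.match(line)
    if m:
        return ("message", m.group("nick"), m.group("text"))
    return ("unknown", None, line)
```

Only `action' and `message' lines carry a speaker, so under these assumptions they are the lines that would count as turns; system lines would be set aside or used to track nick changes.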

Prior to joining a channel, users select a nickname (`nick') by which to be known on the channel. In the default mode of communication on IRC, users type something to be broadcast to the channel, hit the return key, and the system displays it with the user's nick appended inside angle brackets (as in lines 5, 6, 8, 11, 12, 13, etc.). Lines beginning with a single asterisk (3, 9, 19, 21, 24, 25 and 26) represent `actions', which appear when a user types `/me' followed by some text. The system substitutes the user's name for `/me', and adds an asterisk, so that for the line `* Gujju laughts' (line 26), the user `Gujju' typed `/me laughts' (Werry 1996; cf. Cherny 1999 for similar features called `emote' commands in social MUDs (Multi-User Dimensions, see below)). Actions usually present a message in a third-person narrative voice, as in lines 24–26, although other uses are sometimes found, as in lines 3 and 9, where Zaina included special characters (not shown here) that create graphic effects on some terminals. Lines beginning with three asterisks (lines 1, 2, 4, 7, 10, 15, 17, etc.) are system messages representing changes in the state of the channel (which participants are on, who the operators are, etc.); they do not represent communication, but contain information about the identities of users and changes in their status.

The overall effect of IRC interaction is oftentimes cognitively demanding,


although it offers a peculiar potential for interaction, surprise, and play (Danet 1998; Danet et al. 1997; Herring 1999a). The medium is popular, especially among young adults; at the time the log in Example 1 was recorded (September 1997), the EFNet IRC network alone had tens of thousands of users connecting daily. Overall turn length tends to be short (typical averages in data I have collected are between three and six words per turn), turn-adjacency is not strictly observed (as with line 41, which logically follows from line 5), and interruptions by people joining the channel (lines 1, 7, 17, 23, 29, 36 and 39) and leaving it (lines 2, 4, 15, 28, 34 and 37) are common. Although many EFNet channels are occupied 24 hours a day, participant membership is fluid, undergoing constant rotation as some people leave IRC to tend to their daily lives while others return to resume and maintain relationships they have initiated there. Individual identities are fluid as well, since a user's nick can be changed at any time (lines 10 and 42). Additionally, participants sometimes mask their real-life identities by using servers that allow them to conceal or falsify identifying information. Despite this fluidity, participants tend to develop an investment in their on-line identities, and settle upon a small number of nicks that others recognize (Bechar-Israeli 1997).

The log in Example 1 was collected on the channel #india on EFNet IRC as part of an ongoing project on multilingualism on the Internet (Paolillo 1996, 1999, 2000, forthcoming). Prior observation of #india indicated that its participants formed a community composed primarily of Indian nationals and children of nationals living abroad in other countries (as of the date of the log studied here, only a few participants connected from India itself). The largest number of participants connect from the U.S., the U.K. and Canada, although some also connect from other countries such as Indonesia, Singapore and Thailand.
A few participants appear to have significant off-line relationships, such as a romantically involved couple who live in different countries, and a small group of female students who attend the same university in Malaysia. For the most part, however, participants on #india do not appear to know one another off-line. The geographic separation of many on-line participants disfavors off-line interaction, and certain relations, e.g. across gender among the mostly young and single IRC participants, may even be taboo in their traditional real-life communities. The channel #india thus constitutes a virtual community, in that its participants are widely distributed geographically and interact principally on-line.

At the same time, #india is situated at the intersection of two real-world cultures, each with its own social and linguistic norms (Paolillo 1996, forthcoming). On the one hand there is the bilingual culture of expatriate Indian nationals. Many Indian expatriates reside in English-speaking countries and are fluent speakers of English. Indian expatriates nonetheless place a high value on linguistic and cultural maintenance, and they retain ties with their home communities in India, especially via arranged marriages. Among the languages of India, Hindi holds a dominant position, both numerically and in social status,


as it is the language of the national capital and of the largest linguistic group in India. On the other hand, younger members of the Indian expatriate community are immersed in the local youth cultures. In Australia, the U.S. and the U.K., this means that they attend school with English-speaking youth, where they are exposed to popular culture in those countries, and are expected to integrate into the dominant culture. The effects of this integrationist pressure are reflected on #india by the predominance of English – many ethnically Indian participants claim to know no Hindi. Similarly, many participants express interest in rap and heavy metal music alongside the Indian Bhangra musical idiom.

The behaviors of IRC participants can fruitfully be examined from a social network perspective, since many of the characteristics of neighborhood-based real-life social networks have analogs in the virtual world of IRC. For example, regular channel participants often spend several hours per day using IRC. In social terms this behavior resembles `hanging out', an activity responsible for establishing and structuring social networks in many urban communities. As in real life, sustained social interaction among participants encourages the formation of social cliques. Those who do not have histories of sustained interaction are readily recognizable as peripheral participants on IRC. New users (`newbies') are treated as outsiders until they have stable relationships with regular participants, or learn enough of the regular users' discourse conventions to chat appropriately. And `channel-surfers' hop from channel to channel (sometimes returning a short time later), without making much of an impact or establishing lasting contacts.

Cliques on IRC can become locally powerful and assert control over a channel.
Each channel has operators (or `ops') typically drawn from its most frequent and regular participants who are empowered to exclude (`kick') other users from the channel for behaviors they dislike, including violations of discursive norms. Users become operators either through being the first to join a channel or by obtaining operator privileges (also called `ops') from an existing operator. Operators who leave a channel are also not assured of becoming operators again unless another operator will grant them ops upon their return. Alternatively, they may rely on computer programs known as `bots' (a shortened form of `robot') to maintain their desired social orders. These function as pre-programmed social proxies which exercise their owners' social powers in absentia, granting ops to, or even kicking and banning other participants (Werry 1996). Thus, on IRC, technological capital in the form of bots and programming expertise can be used to exercise social and discursive power. The result is a two-tiered social system with members of the operator-elite using their relationships of mutual trust and cooperation to maintain their special privileges, while leaving ordinary users and newbies at the mercy of the operators, whose actions range from benevolent to capricious to harassing (Herring 1999b).

Like real-life social networks, IRC also exhibits patterns of language use which distinguish IRC from other modes of CMC, and which at times can


distinguish one IRC channel from another. For example, in many chat modes of CMC, certain words are commonly abbreviated, such as `u' in lines 13, 35 and 41, and `urself' in line 22 of Example 1; similarly one finds acronyms such as `brb' (`be right back'), `btw' (`by the way'), `lol' (`laughing out loud'), etc. Many of these same forms are found on the multi-user chat environments known as MUDs (Multi-User Dimensions) and MOOs (MUD Object-Oriented), although among participants on MUDs and MOOs, the use of `u' and `urself' is sneered at as signaling that a user is an outsider (Cherny 1999: 92). Even within a particular chat channel, however, use of these forms is variable. What is the social meaning of these variant forms? Who uses them, when and with whom? Is there any relationship between the linguistic practices of participants and their positions in a channel's social network? These are the questions that I seek to examine here.

2.2 Social networks, language variation and tie strength

Social network studies of language variation and change (e.g. Cheshire 1982; Eckert 1988; Labov 1972; L. Milroy 1987; Milroy and Milroy 1985, 1992) typically define a network through contacts of friendship and kinship in a shared territory, in which a higher frequency of contact tends to mean a greater multiplexity of social ties (e.g. two people being both siblings and business partners). These studies find that vernacular linguistic variables may take on in-group, solidary values for network members, such that avoiding or merely failing to use them may be interpreted as flouting the values of the local network. The methods most commonly employed in social network studies, i.e. mutual naming (Cheshire 1982) and ethnographic observation (Gal 1978; L. Milroy 1987; Milroy and Milroy 1985), are effective in identifying relationships based in solidarity and locality, or `strong ties'. However, they are less effective at identifying more incidental relationships, or `weak ties', which tend to fall outside of most treatments of language variation as a consequence. Weak ties are characterized by less frequent, more transient and more incidental contact among people, unanchored to any specific territory (J. Milroy 1992). According to the sociologist Mark Granovetter (1973), social diffusion of innovations, social mobility and social cohesion more generally are all effected through weak, rather than strong network ties. Granovetter defines social network tie strength as follows:

The strength of a tie is a (probably linear) combination of the amount of time, the emotional intensity, the intimacy (mutual confiding), and the reciprocal services which characterize the tie. Each of these is somewhat independent of the other, though the set is obviously highly intracorrelated. (1973: 1361)

Milroy and Milroy (1992) extend Granovetter's model of tie strength to systems of social class, allowing them to synthesize social network and class-based approaches to language variation and change. Their synthesis recognizes


distinct roles for strong and weak network ties in language variation, and relates them in turn to different `life-modes' that characterize the economic activity of the social classes.³ Wage-earners, especially in lower socio-economic classes, tend to participate in different types of networks depending on current economic fortunes: when work is more scarce, they are less mobile and hence tend to develop denser, more multiplex, territorially-based strong-tie networks; when work is more plentiful, they are more mobile and consequently tend to develop looser, weak-tie networks, with characteristically more contact with members of outside groups, including those from other social classes.

Milroy and Milroy discuss language variation in terms of the vernacular, which for them is a theoretical primitive. Vernacular language is `roughly synonymous with ``real language in use'', and it is interpreted on a continuum of closeness to, or distance from, the idealized norm, or (in some cases) the idealized standard language' (J. Milroy 1992: 66, emphasis in original). Vernacular variables, as distinct from `legitimized' standard variables, are `non-legitimized' and often symbolize in-group identity and solidarity. Milroy and Milroy suggest that strong ties tend to reinforce the use of vernacular variables, whereas weak ties favor convergence toward the standard (1992: 22, Figure 3). These interpretations are supported by their observations of the (a) and (e) variables in Belfast English (J. Milroy 1992; Milroy and Milroy 1985; L. Milroy 1987), and patterns of African American Vernacular English variable use and inter-ethnic contact in Philadelphia as observed by Ash and Myhill (1986). The present study is guided by this theoretical perspective. Specifically, I seek to investigate empirically whether vernacular linguistic variables on #india IRC are associated with strong ties.
Among the challenges for an empirical investigation of this nature is the need for accurate measures of tie strength. Because most methods of studying social networks leave weak ties invisible, Granovetter (1973) needed to adduce indirect evidence for weak ties from strong-tie measures, by assuming that the two are complementary. In constructing their sociolinguistic model, Milroy and Milroy, following Granovetter's example, use a social network integration index which quantifies strong tie influences to draw inferences about the relationship of weak ties to language variation (they assume that the lower an individual's integration index, the more weak ties s/he has). These approximations of weak-tie strength, while heuristically useful and suggestive, are not completely accurate, because they do not actually count weak ties. Thus, in this study I seek to operationalize a measure of tie-strength that accurately measures both strong and weak ties.

The measurement of tie strength employed here first counts the frequency of any pair of participants' interaction, and then categorizes participants according to their overall patterns of interaction. In this way, participants may be reliably classified according to tie strength, where those categories take into account all of the ties, strong and weak, a participant has to others. While it is difficult to obtain sufficient data for such a measure of tie strength in face-to-face communication, relatively large amounts of data are collected more easily on


IRC, making the approach feasible. The rationale for using frequency of interaction, as measured in turns, as the operationalization of tie strength on IRC follows from Granovetter's (1973) assumption that time and frequency of interaction are both correlated with emotional intensity and mutual confiding. On IRC, because individual turns take little time to execute, counting turns provides a reliable measure of the time an interaction takes. Since a logfile comprises a comprehensive record of the public interaction, frequency of interaction between any two participants is readily quantifiable. Furthermore, greater emotional investment leads to more time spent in interaction, as measured by number of turns.

The categories of participants identified through frequency of interaction are structural positions, that is, classes of individuals with equivalent relations to others in the social network, as distinct from the notion of cliques used in most sociolinguistic studies employing social networks. The difference between structural position and clique analysis is in the range of available interpretations. The structural position analysis employed here favors interpretation in terms of tie strength, whereas clique analysis favors interpretations focused on the mutual interaction of clique members.

Once a set of structural positions has been identified based on the strength of participants' communicative ties to one another, the resulting categories can be correlated with distribution of linguistic variables, in order to reveal any relationship to tie strength. If the social processes at work in virtual community formation on IRC are like the social processes of geographically based communities, then we should expect to find patterns of variable distribution similar to those observed by Milroy and Milroy (1992).
Specifically, we should be able to observe evidence of the maintenance of vernacular linguistic norms among participants with stronger ties to (more frequent interaction with) other participants, while among those with weaker ties (less frequent interaction with other participants), we should find greater influence of the legitimized, standard norms. This is the hypothesis that this study seeks to test.
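To make the counting step concrete, the following Python sketch tallies turns between pairs of participants. It is an illustration only: it assumes that a turn links its speaker to another participant whenever that participant's nick appears as a word in the turn, which is a simplification introduced here and not necessarily the coding procedure used in this study. The function name `count_pair_turns` is likewise hypothetical.

```python
from collections import Counter


def count_pair_turns(turns, nicks):
    """Count the turns that link each unordered pair of participants.

    turns: iterable of (speaker, text) pairs, one per turn.
    nicks: the nicknames to look for in the text.

    Assumption of this sketch: a turn counts toward the pair (speaker, X)
    when X's nick occurs as a whitespace-delimited word in the text,
    ignoring case and surrounding punctuation.
    """
    lowered = {nick.lower(): nick for nick in nicks}
    counts = Counter()
    for speaker, text in turns:
        # Normalize each word: drop edge punctuation, fold case.
        words = {word.strip(".,:;!?-").lower() for word in text.split()}
        for low, nick in lowered.items():
            if nick != speaker and low in words:
                # Sort the pair so (A, B) and (B, A) share one key.
                counts[tuple(sorted((speaker, nick)))] += 1
    return counts
```

A matrix of such counts is one possible input to a grouping of participants by their overall pattern of interaction. The sketch undercounts wherever participants shorten one another's nicks (for example `Dr Pep' for `Dr-pepper'), so any real application would need a more careful treatment of addressivity.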

2.3 Language variation on #india

Language on the IRC channel #india, a channel I have observed since 1995, exhibits much socially significant linguistic variation, especially in vernacular functions such as building solidarity and indexing in-group status. A number of these variables are illustrated in the log given in Example 2 (the log has been edited to remove turns and system messages not relevant to the interaction being illustrated).

Example 2. Vernacular language use on #india.

1. *** Nagin ([email protected]. CA) has joined channel #india
2. <Nagin> Babylove:)
3. <Nagin> DRPEP!
4. <Dr-pepper> hi nagin
5. <Dr-pepper> nagin . . . leave ..letz talk on phone


6. <Nagin> Dr Pep im not at home
7. <Lamer-X> hi nagin
8. * Sahil waves to Nagin hi
9. <Dr-pepper> where r u ..nagin in dormz?
10. <Nagin> Lamerx Hey whats up?:)
11. <Nagin> Sahil :)
12. <Lamer-X> nagin: nuttin just feeling old now that i've hit 22 today :-/ ***
13. <Nagin> Lamer Happy Birthday:) well I know how u feel I felt like that when i turned 21 back in may
14. <Lamer-X> nagin: heh
15. <Nagin> DrPepper kiya dhoond ra ha hai?
    [trans.] what are you looking for?
16. <Dr-pepper> ur phone number . . . nagin
17. <Nagin> Dr Pepper u expect me to give it away just like that?
18. <Dr-pepper> what should i do..to get it..nagin..any testz for thiz ashiq?⁴
19. <Nagin> Dr Pep ashiq???? whoa i dont want any ashique
20. <Nagin> Dr Pep:)
21. <Dr-pepper> nagin..then what u want?
22. <Nagin> SYCLONE!!!!!!
23. <Dr-pepper> nagin..i fell in love with u..da first time i chatted with u
24. <Nagin> DrPep ooh yeah . . . how sweet
25. <Dr-pepper> nagin . . . it waz love at first chat
26. <Nagin> DrPep haha
27. <Dr-pepper> nagin..it seemz u have already lot of admirers
28. <Nagin> DrPep who me. . . . naaa ...idont think so
29. <Dr-pepper> nagin . . . tu kutte ayes?
    [trans.] where are you?
30. <Nagin> DRPRP school
31. <Dr-pepper> nagin..so r u givin da number..dear?
32. <Nagin> Dr Pep hmmm I dont know
33. *** Nagin has left channel #india
34. <Dr-pepper> need time to think nagin..how abt 2 minutes
35. <Creedence> NAGIN!!!!!!!!!!!!!!!!!!!!
36. *** Nagin ([email protected]. CA) has joined channel #india
37. <Dr-pepper> wb nagin..came back with da phone number
38. *** Nagin has left channel #india

Two variables evident in this exchange are the use of the letters `r' and `u' in place of the words `are' and `you', as in lines 9, 13, 16, 17, 21, 23, 27 and 31; other abbreviations include `abt' = `about' (line 34), `wb' = `welcome back' (line 37), as well as forms not illustrated in Example 2 like `be' (`b'), `see' (`c'), `for' (`4'), etc. (Werry 1996). Although these shortenings are clearly adaptations that allow participants to economize keystrokes (especially in routinized and commonly used expressions), they are not necessarily found in other synchronous CMC modes, nor are they always used by IRC participants. Thus, their use must be conditioned by other factors; we may ask whether social factors contribute to their distribution.
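One way to see how the distribution of such variants might be tabulated is to count, for each participant, how often the short form is chosen out of all occasions on which either form occurs. The Python sketch below does this for `r'/`are' and `u'/`you' only. It is a hypothetical illustration: the function name `variant_rates` and the token-matching rule are assumptions of the sketch, and a bare `r' or `u' is treated as the variant without checking its grammatical role.

```python
from collections import defaultdict

# Variant spelling -> standard spelling, limited to the two
# substitutions discussed in the text.
VARIANTS = {"r": "are", "u": "you"}


def variant_rates(turns):
    """Tally variant use per speaker.

    turns: iterable of (speaker, text) pairs.
    Returns {speaker: {standard_word: (variant_count, total_count)}},
    where total_count covers both the variant and the standard form.
    Speakers who use neither form do not appear in the result.
    """
    tallies = defaultdict(lambda: {std: [0, 0] for std in VARIANTS.values()})
    for speaker, text in turns:
        for raw in text.split():
            token = raw.strip(".,:;!?").lower()
            if token in VARIANTS:
                cell = tallies[speaker][VARIANTS[token]]
                cell[0] += 1  # one more use of the short form
                cell[1] += 1  # one more occasion overall
            elif token in VARIANTS.values():
                tallies[speaker][token][1] += 1  # standard form used
    return {
        speaker: {std: tuple(cell) for std, cell in words.items()}
        for speaker, words in tallies.items()
    }
```

Dividing the first number in each pair by the second gives a per-participant rate of variant use, which could then be compared across groups of participants.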


Another characteristic IRC usage is the substitution of `z' for orthographic `s' (in lines 5, 9, 18, and 25). Note that this substitution is not simply phonetic (lines 5 and 18), and that no economy of expression is achieved thereby. This usage appears to have been copied from the African American subculture of rap and hip-hop music, which is popular among the mostly young IRC users (rap is a common focus of interaction on many IRC channels, including #india). It is also common in nicks such as `Rulerz', `DevilZ', `D-zire', and `MaDsKiLZ', which are reminiscent of the names of rap artists (other nicks are direct references to rap artists, such as `PuffDaddy'). Other rap-based usages exist as well, such as `da' for `the' in lines 31 and 37, and variant spellings involving `x': `sux' = `sucks', etc. Yet, `z' does not necessarily invoke rap in any direct way, as can be seen from the log, and it is used only by certain participants (e.g. Dr-pepper, but not Nagin) and is variable (e.g. line 27, `seemz' and `admirers'). Furthermore, IRC `hackers' also appear to favor `z' and `x'. Hackers are people whose purpose is to obstruct other people's communication by invading a channel, obtaining operator privileges, and kicking and banning its participants. Hackers typically operate in small groups, and tend to use nicks with `x', `z', unconventional capitalization and special characters, such as `]MaD_HaX[' and `X11] ]11l1'. Such names are harder to type accurately, and hence more difficult for an op to kick and ban. We can thus ask whether use of stylized spellings such as `z' is influenced by a participant's proximity to such individuals.⁵

Another variable characteristic of #india chat is evident in lines 15 and 29, namely, codeswitching from English into Hindi. Codeswitching can be found on many IRC channels with a national or regional focus (e.g. French predominates on #france, with some codeswitching into English).
On #india one also ®nds examples of codeswitching into Bengali, Gujarati, Punjabi, Tamil and Malay.6 Codeswitching into Hindi is suciently frequent on #india that it is necessary to know some Hindi in order to appreciate fully what is being said. Paolillo (forthcoming) illustrates a number of communicative functions of codeswitch- ing on the channels #india and #punjab, such as summoning and holding a participant's attention, and holding parallel conversations with two di€erent interlocutors in di€erent languages. Some participants on #india are frequent codeswitchers, and others appear not to codeswitch at all, even in interactions with codeswitchers. Given the symbolic importance of Hindi and other Indian languages among members of the Indian diaspora, a participant's use of codeswitching on #india is therefore likely to be a socially signi®cant marker. Thus, these four variables ± `r', `u' and `z' substitutions, and codeswitching into Hindi ± have socially signi®cant uses on #india and IRC. Another characteristic of IRC language is a high incidence of obscenity, especially in direct threats to an interlocutor's face, a practice reminiscent of `¯aming' on Usenet and Listserv discussion groups (Herring 1994). In Example 3, chiter obscenely insults Soul4Real in lines 7, 13 and 16, which provokes Soul4Real to exercise his privileges as channel operator to kick and ban chiter from the channel (lines 14, 15, 22±25); lines 7 and 13 indicate that

even chiter's choice of nick might be intended as obscene. Chiter's banning follows his bald and unsuccessful request for operator privileges in which he uses obscenity as a form of discursive power to provoke Soul4Real.⁷ Since obscenity has been reported to be associated with social networks exhibiting frequent use of vernacular variants (Cheshire 1982; Labov 1972), we can ask if it is similarly distributed on #india.

Example 3. Obscenity used on #india to threaten an interlocutor's face.
1. <chiter> soul4real change your nick its tooo long
2. <Soul4Real> chiter: who are you?
3. <disguise> i think soul4real sounds kool
4. <disguise> :)
5. <disguise> keep it going soul
6. <Soul4Real> disguise : PP yeah i am gonna keep it : P
7. <chiter> im chiter, want to lick it?
8. <Soul4Real> chiter : : P
9. <chiter> give me ops
10. *** chiter has been kicked off channel #india by Soul4Real (here is ur ops : P)
11. *** chiter ([email protected]) has joined channel #india
12. <chiter> Soul4Real, that was not nice
13. <chiter> so lick my chiter you fucking cocksucker4Real!!!!
14. <Soul4Real> xb chiter : P
15. <Soul4Real> +b chuter
16. <chiter> ban me fucker!!1
17. <chiter> ban me for life!!
18. <Soul4Real> chiter : i will : P
19. <chiter> ok do it
20. <chiter> im waiting
21. <chiter> i have to go to class
22. *** chiter has been kicked off channel #india by Soul4Real (Soul4Real)
23. *** Mode change ``+b *!*@xxxxxx.xx.xxxxxx.ca'' on channel #india by Soul4Real
24. *** chiter ([email protected]) has joined channel #india
25. *** chiter has been kicked off channel #india by KhAm0sHi (Sorry, you are banned here.)

The distinctively IRC practices identified above can all be seen as `vernacular' variants, in Milroy and Milroy's (1992) terms. None of the variables reflects the norms of Standard English. Furthermore, each variable expresses characteristically vernacular values. The `r' and `u' substitutions suggest that a user is experienced at IRC. Similarly, the variant spellings in `z' suggest the fashionably defiant, individualistic postures assumed by rappers and hackers. Obscenity is also, albeit more traditionally, defiant. Use of Hindi and other Indian languages serves to express the ethnic identification and shared experiences of participants, much as it does in off-line interaction (see, e.g. Myers-Scotton 1993; Paolillo 1996, forthcoming). To the extent that this characterization is correct we should expect to find these variables used more frequently among members
of strong-tie networks. The remainder of this paper reports on an empirical investigation designed to test this prediction.

3. DATA AND METHODS

3.1 Data collection and coding

In order to carry out a social network analysis of variation on IRC and assign participants to social positions reliably, one must carefully track their identities, a matter complicated by the ability of participants to change their nicks. The system messages that appear when a participant joins a channel or changes her or his nick are helpful in sorting out these issues, and employing a continuous log of channel activity further facilitates tracking participants' identities. For this reason, I chose to record a single log of interaction on #india for a complete 24-hour period, by connecting with an IRC client program and capturing the session to a log file. The resulting 794K file was then imported into a relational database to enable coding of linguistic and interactional features.

Each line of the log file was coded as one of four message types: system messages, commands to bots, participant turns and actions. Since only participant turns and actions are the focus of the analysis, these lines were separated for subsequent analysis, while system messages and commands were collected in a separate database for tracking participant identities. The `speaker' of each participant turn or action was identified as the first word in the message, i.e. the nick in angle brackets, or the nick following the initial asterisk. These nicks were then located in the system message database and matched with a participant identity. Addressees were identified similarly. Often a speaker identifies the addressee directly in a turn or action (Werry 1996); if not, then often the addressee of a turn or action can be identified by a careful reading of the log. When an addressee was identified in either of these ways, the system message database was again consulted to establish the identity of the addressee by matching the appropriate nick with an email address. Messages addressed to bots, or not addressed to a specific addressee (e.g. `Hi all') were excluded from subsequent analysis. A total of 6317 lines were coded in this way.
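
The four-way coding of log lines described above can be sketched in code. The following is only an illustration under stated assumptions, not the procedure actually used: the line shapes are modelled on the excerpt in Example 3, and `BOT_PREFIX` is a hypothetical marker for commands addressed to bots, since the log format for such commands is not specified here.

```python
import re
from typing import Optional, Tuple

# Assumed line shapes, modelled on the log excerpt in Example 3:
#   "*** ..."        system message (joins, kicks, mode changes)
#   "<nick> text"    participant turn
#   "* nick text"    participant action
# BOT_PREFIX is a hypothetical marker for commands addressed to bots.
BOT_PREFIX = "!"

TURN = re.compile(r"^<([^>\s]+)>\s*(.*)$")
ACTION = re.compile(r"^\*\s+(\S+)\s*(.*)$")


def classify(line: str) -> Tuple[str, Optional[str], str]:
    """Return (message_type, speaker_nick, text) for one log line."""
    line = line.strip()
    if line.startswith("***"):
        return ("system", None, line[3:].strip())
    m = TURN.match(line)
    if m:
        nick, text = m.group(1), m.group(2)
        if text.startswith(BOT_PREFIX):
            return ("bot_command", nick, text)
        return ("turn", nick, text)
    m = ACTION.match(line)
    if m:
        return ("action", m.group(1), m.group(2))
    return ("unknown", None, line)


if __name__ == "__main__":
    print(classify("<chiter> give me ops"))
    print(classify("*** chiter has been kicked off channel #india by Soul4Real (Soul4Real)"))
```

Only lines classified as turns or actions would go forward to linguistic coding; system lines would feed the separate identity-tracking table, where nick changes can be resolved.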

3.2 Social network and tie strength

In order to relate network tie strength to the frequency of use of the different linguistic features, I undertook a positional analysis based on structural equivalence of the participants' interactional patterns (Degenne and Forsé 1999: 86–92; Wasserman and Faust 1994: 361ff). This was accomplished using Factor Analysis, a statistical technique for reducing the dimensionality of large data sets (Rietveld and van Hout 1993; Scott 1991) by identifying shared variation among a number of variables.⁸ The shared variation is expressed by grouping the variables into categories called `factors'. To categorize participants
according to tie strength, we use variables representing the frequency of individual participants' interaction with other individual participants. The resulting factors are social positions, or categories of participants, classified by having similar patterns of interaction. In this analysis, cliques, or mutually interacting groups of participants, show up as distinct factors, but not all factors necessarily represent cliques – participants who share patterns of interaction but do not interact with one another are also assigned to factors, because they are positionally similar to one another. In the remainder of this paper, I designate all categories of participants with the neutral term `groups', whether they happen to be cliques or non-cliques.

Factor Analysis is subject to certain practical and computing limitations, and for this reason, it was necessary to select a subset of the channel participants to study. I began with the 100 most frequent participants on the channel, since their interaction comprises the bulk of the observations made (5151 turns taken and 5243 turns received, out of 6317 total). From this set, six participants had to be excluded, since their data did not fit the assumptions of Factor Analysis.⁹ Since participants are both speakers and addressees, I constructed two tables, one in which the columns of the table (variables) were speakers and the rows were the 288 addressees on the channel, and another in which the columns were addressees and the rows were the 350 speakers. In this way, both aspects of the interaction of the 94 remaining participants were represented. The results of the two factor analyses, grouping the 94 participants as speakers and as addressees, were collated, categorizing the 94 participants into several distinct groups according to their complete interactional patterns.
Messages exchanged between each of the groups were counted and compared to a model of completely random interaction to reveal the strength of the ties characterizing each group. Observed ties were then plotted in a graphic representation of inter-group interaction known as a reduced sociogram (Wasserman and Faust 1994: 391), with the weight of the lines representing the strength of interaction between each pair of groups. Relationships that fell below a fixed threshold of interaction were not drawn.¹⁰ The resulting sociogram was interpreted in terms of tie strength, by noting the weight of the lines connecting each group and its relative inter-connection with other groups.¹¹
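
The comparison of observed inter-group message counts with a model of random interaction can be illustrated with Pearson residuals. The sketch below assumes a simple independence model, in which the expected count for a sender group and a receiver group is proportional to the product of their totals; the model actually used in this study may differ in detail, and the counts in the usage example are invented for illustration.

```python
import math
from typing import List


def pearson_residuals(observed: List[List[float]]) -> List[List[float]]:
    """Pearson residual (O - E) / sqrt(E) for each cell of a sender-by-receiver table.

    E is taken from an assumed independence model:
    E[i][j] = row_total[i] * col_total[j] / grand_total.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    residuals = []
    for i, row in enumerate(observed):
        out_row = []
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            # Cells with no expected traffic carry no information; report 0.
            out_row.append((obs - expected) / math.sqrt(expected) if expected > 0 else 0.0)
        residuals.append(out_row)
    return residuals


if __name__ == "__main__":
    # Invented counts: rows are sending groups, columns are receiving groups.
    toy = [[30, 10], [10, 30]]
    for row in pearson_residuals(toy):
        print([round(x, 3) for x in row])
```

Large positive residuals mark pairs of groups that interact more than the random model predicts; a drawing threshold on these values would then decide which ties appear in a reduced sociogram.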

3.3 Linguistic variables

The distributions of five linguistic variables were studied in the present corpus: `r', `u', `z',¹² obscenity and codeswitching into Hindi and other Indian languages.¹³ Each message in the database was given a code for each variable, indicating if it was present (if the variable appeared), or absent, if the variable could have appeared but did not. For example, for the variable `u', any turn containing `u' was coded as `present'; those containing `you' were coded as `absent', and those in which neither `you' nor `u' were found were coded as not relevant to `u'. When this variable was later analyzed, `present' and `absent' were
contrasted and the irrelevant turns were excluded from analysis, as in usual variationist practice (Sankoff 1988). For obscenity and Hindi codeswitching, the simplifying assumption was adopted that both could potentially occur in each and every turn. Also, since turns were the unit of analysis, if a given variable appeared only once in a turn or if it appeared multiple times, the turn was simply coded as having that feature present. After coding, the database records were exported as tokens for logistic regression (VARBRUL) analysis using GoldVarb 2.1. Each token contained information about all five linguistic variables (present, absent, does not apply), the classification of the turn's speaker according to speaker and addressee factors, the participant's gender (female, male, undetermined, as evidenced through choice of nick and through qualitative observation of the logfile), and the speaker's status on #india (operator, ordinary user, undetermined; also revealed in the logfile). In this way, the association of the five linguistic features with different social positions was tested, taking into account gender and operator-hood. The distribution of the linguistic features was plotted on the reduced sociogram in five separate figures.
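
The three-way coding of the variable `u' can be made concrete with a short sketch. It is an illustration only, under the assumption that `u' and `you' can be recognized as whole words by a regular expression; the coding in this study was carried out in a database, and the function name `code_u` is hypothetical.

```python
import re

# Whole-word matches only, so that `ur' or `sux' do not count as `u'.
YOU = re.compile(r"\byou\b", re.IGNORECASE)
U = re.compile(r"\bu\b", re.IGNORECASE)


def code_u(turn: str) -> str:
    """Code one turn for the `u' variable: 'present', 'absent' or 'n/a'."""
    if U.search(turn):
        return "present"  # at least one `u'; repeated uses are not counted
    if YOU.search(turn):
        return "absent"   # `you' appears where `u' could have been used
    return "n/a"          # neither form occurs: excluded from the analysis


if __name__ == "__main__":
    for turn in ["how r u", "chiter: who are you?", "here is ur ops"]:
        print(turn, "->", code_u(turn))
```

Because the turn is the unit of analysis, the function returns a single code per turn, and only turns coded `present` or `absent` would enter the regression.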

4. RESULTS: SOCIAL NETWORK AND VARIATION ON #INDIA

The results obtained from this investigation reveal a structured pattern of social interaction, wherein a particular group of participants with a large proportion of operators is disproportionately sought out for interaction by other participants. This group's language use is characterized by relative avoidance of most of the vernacular linguistic features under study, which were generally found to be localized in peripheral regions of the network, where tie strength is weaker. The only exception to this generalization is obscenity, which is avoided to some extent by four peripheral groups. Still, and surprisingly, none of the features studied characterizes the groups in the network with the strongest network ties.

4.1 The social network analysis

The two factor analyses produced four-factor solutions, as determined by a scree-test (Rietveld and van Hout 1993: 273–274). Participants were assigned to the factor group for which they had the highest factor loading, labeled s1 through s4 for the speaker groups and a1 through a4 for the addressee groups. Participants whose four factor loadings were less than 0.3 were considered not to load on any factor; these participants were placed in the groups labeled s0 and a0 in the speaker and addressee analyses, respectively. The factor loadings of the participants grouped into one of the factors s1 through s3 and a1 through a3 are all positive. Participants loading on the factors s4 and a4 had both positive and negative loadings – that is, some participants' patterns of interaction correlated positively with this factor, while others' correlated negatively. In social network terms, this means that the two types of participants loading on factor s4 (and likewise a4)
had ties to complementary groups of participants. Collating the two factor analyses resulted in 16 distinct participant groups, labeled A through P for convenient reference, as indicated in Table 1. Within each group, participants were counted according to their operator/non-operator status (op/non, indicated in the relevant rows of the table), and gender (m/f, indicated at the top of the relevant sub-columns), as determined from the database of participants' identities. The participant groupings obtained this way are socially significant with respect to interaction on IRC (see below), but not with respect to any readily ascertainable real-life characteristics. For example, male and female participants

Table 1: Classification of the 94 participants on #india, by speaker (s0–s4) and addressee factors (a0–a4), gender (male/female) and channel operator status (op/non). Factors a4 and s4 have both positive (+) and negative (−) loadings

Speaker factors (columns): s0 (48) +, s1 (13) +, s2 (7) +, s3 (15) +, s4 (11) + and −; each with sub-columns m and f

Addressee factor   Groups       Status   Counts (m/f cells, left to right)
a0 + (42)          A G J M N    op       5 2 1
                                non      17 11 2 3 1
a1 + (21)          B H K        op       4 1 5 2
                                non      2 3 1 2 1
a2 + (10)          C I          op       2 2 1
                                non      1 1 3
a3 + (7)           D O          op       1
                                non      1 3 2
a4 + (14)          E            op       1 1
                                non      3 2
a4 −               F L P        op
                                non      2 2 3

are distributed over the speaker and addressee types at roughly chance levels.¹⁴ Nor does geographic proximity appear to explain the observed groupings of participants: for example, in group K, the most socially significant group (see below), there is one participant each connected from San Diego, California; Stanford, California; San Jose, California; Rutgers, New Jersey; Amherst, Massachusetts; Dallas, Texas; and Australia, plus three from undetermined locations. Although Stanford and San Jose are close enough that face-to-face contact would be possible for this pair, it is unlikely that such a relationship would be responsible for the social similarity of the members of group K as a whole, which is defined through its members' interactions with all of the other participants, not just with those of group K. As for the roles of these groups on IRC, there is a slight preponderance of channel operators among the a1 and a2 addressee factors, suggesting that these factors group participants together who are sought out for interaction and possibly administrative favors (e.g. opping or kicking other people). None of the speaker factors shows a similar preponderance of operators.
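
The rule used in this section for assigning participants to factors (highest loading, with a cut-off of 0.3) can be sketched as follows. This is a hedged illustration: comparing loadings by absolute value and attaching a sign to the label are assumptions made here so that the mixed-sign factors s4 and a4 can be represented, and the function name `assign_factor` is hypothetical.

```python
from typing import Sequence


def assign_factor(loadings: Sequence[float], prefix: str = "s",
                  threshold: float = 0.3) -> str:
    """Label a participant by the factor with the largest absolute loading.

    Returns labels such as 's2+' or 'a4-', or '<prefix>0' when no loading
    reaches the threshold (the participant loads on no factor).
    """
    best = max(range(len(loadings)), key=lambda k: abs(loadings[k]))
    if abs(loadings[best]) < threshold:
        return f"{prefix}0"
    sign = "+" if loadings[best] > 0 else "-"
    return f"{prefix}{best + 1}{sign}"


if __name__ == "__main__":
    # Invented loadings for one participant in the two analyses.
    speaker_label = assign_factor([0.10, 0.70, 0.20, 0.00], prefix="s")
    addressee_label = assign_factor([0.10, 0.20, 0.10, -0.60], prefix="a")
    # Collating the two labels gives the participant's combined position.
    print((speaker_label, addressee_label))
```

Pairing each participant's speaker label with the addressee label is what yields the combined positions that are labeled A through P in Table 1.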

4.2 Tie strength among participant groups

Tie strength among the sixteen groups was examined by constructing a 16×16 table with groups A–P in both the rows and the columns, and analyzing it in the manner described in section 3.2, to arrive at the reduced sociogram in Figure 1. A grand total of 4296 turns was represented in the table (the 2148 turns not counted represent turns involving one of the 256 remaining participants on the channel). In Figure 1, the weakest relationships represented begin from 10 Pearson deviances below the level expected in the model of uniform interaction. More frequent interaction is represented by broader lines. Since not all interactions are symmetrical, the relative size of the arrowheads (and their absence in some cases) is used to represent asymmetries. In addition, since members of a group often interact with members of the same group (as is the case when that group represents a clique), `self-ties' are represented by loops directed back at the originating group. The resulting diagram iconically represents the nature and strength of social ties among the sixteen groups of participants, taking group size into account. Of the 4296 total turns exchanged among the members of the different groups, 2466 (57%) are represented as lines in the reduced sociogram. The remaining 1830 turns are from ties between groups with a low frequency of interaction (such as those between any of the groups and groups D, F, M and O); these are by definition weak ties. By inspecting the lines to and from a given group in the sociogram, it is possible to read the strength of its ties.

As Figure 1 shows, the majority of interaction among the groups involves members of K. Fully half of the 16 groups (B, C, G, H, I, J, K, and L) have a substantial number of turns directed toward K. Moreover, K tends to return

Figure 1: Interaction among the sixteen participant groups (A–P); line weight and arrowheads indicate intensity and direction of contact

proportionately fewer turns than are addressed to it. The one exception to this pattern is J, which receives more turns from K than vice versa. Finally, K has the highest rate of self-address of any of the groups (the deviance for K self-address is more than seven times greater than the nearest deviance). It appears from this evidence that K represents a clique, the most central group in the social network of the channel #india. Since K is also among the groups with the greatest proportion of operators (the other being H), it is likely that K's social position comes in part from its concentration of operators, and the need for other participants to interact with operators in order to obtain favors. The second operator-heavy group, group H, does not receive the same attention, however. Members of H interact frequently with members of K and also with other members of H, suggesting cooperation among the operators of the two groups, but others in the network (except perhaps for B) do not appear to seek out H nearly as much as K. H's social position then, although close to K's, is less central to the channel. Members of G, another group closely linked to K, also engage in self-address, as does E, to some extent, but neither of these groups displays the level of clique-hood that is evidenced
by K. Self-address is concentrated mostly around K, as are the other indicators of tie strength.

At the outer periphery of the network are the groups D, F, M and O. These four groups have only weak ties connecting them to other members of the network; no ties could be drawn to them on the reduced sociogram. Next most peripheral are N and P, both of which are connected to K by a minimum of three links, followed by A and E, at two links away from K. Although group A is the largest of the groups identified, it is also the most heterogeneous, and all ties connecting to and from A are weak. This is because A consists of all those participants whose patterns of interaction did not correlate with others on the channel, whether as speakers or as addressees. Consequently, A represents a social position of participants who are not associated with the main patterns of interaction on the channel. E's position is somewhat less peripheral than A's because the ties connecting E to K (through G) are stronger than those of A (which go through C or I). Finally, at one link away from K are the groups B, C, G, H, I, J and L, although the ties of G, H and J to K are stronger than those of C, I and L. While the position of I is superficially similar to K, in that there are a fair number of turns addressed to both I and K, I has no significant amount of self-address, and unlike K, the number of incoming ties is not greater than the number of outgoing ties.

On the basis of these observations we can propose the following hierarchy of tie strength among the sixteen groups of #india participants. Group K, the `central core', exhibits by far the strongest ties of any group on the channel. Second are groups G, H and J, who can be considered the `outer core'. Groups B, C, E, I and L are peripheral with predominantly weak ties, followed by A, N and finally D, M and O, the `outer periphery'.¹⁵

4.3 Correlations with linguistic variables

With this hierarchy in place, we turn now to test our working hypothesis about the relationship of language variation to tie strength. Specifically, the hypothesis predicts that all five of the linguistic features identified in section 2.3 should be used most by K, becoming less and less frequent as we move down the hierarchy, until we reach groups O, D and F, where they should be least frequent. In order to ascertain the relationship between participants' use of linguistic variants and their social positions on #india, the five linguistic features under study were correlated with the 16 groups of participants. This was done using GoldVarb 2.1. For this analysis, using the full set of 16 participant groups tended to disaggregate the participants in the study excessively, leading to several knockout factors with small numbers of tokens. For this reason, the speaker and addressee factors of the Factor Analysis were used instead, meaning that a small group such as N, with one participant, was treated together with group P by speaker factor, and with groups A, G, J and M by addressee factor. This procedure minimizes skewing caused by small amounts of data for smaller
groups. Complete logistic regression models for each of the five variables are reported in Table 2, in which all of the significant factor groups have been recoded to the minimum number of significant factors (for the speaker and addressee factor groups, `other' reports the factor weight for the aggregate of the remaining groups). Remaining knockout factors are also noted in Table 2. Operator-hood in the status factor turned out to be a knockout for the variable `r' (zero out of 10 tokens of `are' were shortened to `r'); since operator-hood is not central to the questions investigated here, excluding these 10 tokens produces an acceptable analysis. Speaker factor s2 and addressee factors a2 and a4 were knockouts for the variable `z' (with 0/110, 0/164 and 0/72 tokens respectively); these involve a sufficiently large number of tokens that they need to be considered in the interpretation of `z'. The weights in Table 2 for the speaker and addressee factors for each variable

Table 2: Logistic regression (VARBRUL) models for the five variables. Non-significant factor groups are omitted; knockout factors are given as ratios in parentheses, e.g. (0/110). For participant gender, `?' indicates indeterminate gender

Variable (tokens)   Input   Speaker          Addressee        Gender      Status
`r' (394)           0.216   s4−: 0.725       a0: 0.744        –           –
                            other: 0.484     a1: 0.166
                                             other: 0.510

`u' (1720)          0.623   s4+: 0.184       a0: 0.609        –           –
                            s4−: 0.636       a1: 0.237
                            other: 0.509     a2: 0.609
                                             other: 0.531

`z' (2107)          0.037   s2: (0/110)      a1: 0.356        –           op: 0.308
                            s4−: 0.931       a2: (0/164)                  non: 0.578
                            other: 0.467     a4+: 0.028
                                             a4−: (0/72)
                                             other: 0.637

obscenity (8199)    0.037   s2: 0.350        a0: 0.430        F: 0.330    –
                            s4−: 0.171       a3: 0.104        M: 0.576
                            other: 0.530     other: 0.552     ?: 0.534

Hindi (8199)        0.022   s0: 0.543        a0: 0.603        F: 0.321    –
                            s1: 0.700        a1: 0.272        M: 0.591
                            s3: 0.543        a4−: 0.113       ?: 0.490
                            s4: 0.140        other: 0.587
                            other: 0.418

were combined using the logit-additive method to give the probability weights for each of the participant groups, given in Table 3. Note that these values are not predicted frequencies of the linguistic variables, but weights relative to their input parameters, or average frequencies of use: favoring weights (those greater than 0.5) indicate that a variable is more frequent than the input parameter in a given group, and disfavoring weights (those less than 0.5) indicate that it is less frequent than the input parameter. The weights in Table 3 were then plotted on the reduced sociogram of Figure 1, to allow the ready comparison of tie strength with linguistic variation, in Figures 2 through 6. Isoglosses were drawn on each figure indicating favoring probabilities by shaded areas and disfavoring probabilities by areas with shaded borders; a key in each figure gives the probability weights associated with each level of shading. More extreme favoring or disfavoring weights are indicated by darker shading. Finally, the median probability level has been left unenclosed by isoglosses in each figure; in Figures 2 through 5 this is the level closest to 0.5. In Figure 6, it is 0.326 on account of the categorical behavior of groups C, F, I and L, excluded as knockouts in the logistic regression for `z' (all four groups disfavor `z' categorically).¹⁶ In a few cases, probability levels close to the median level were combined with it, for clarity.
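
The logit-additive combination of a speaker weight and an addressee weight can be illustrated with a short sketch: each weight is converted to log-odds, the log-odds are summed, and the sum is mapped back to a probability. The function names are hypothetical. The usage example takes the `r' weights from Table 2 and, assuming group N falls under speaker factor s4− and addressee factor a0 while group B falls under `other' and a1 (as Table 1 appears to indicate), reproduces the corresponding entries of Table 3.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability weight strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def combine(*weights: float) -> float:
    """Sum the weights on the logit scale and map the total back to (0, 1)."""
    total = sum(logit(w) for w in weights)
    return 1.0 / (1.0 + math.exp(-total))


if __name__ == "__main__":
    # `r' weights from Table 2: speaker s4- = 0.725, other = 0.484;
    # addressee a0 = 0.744, a1 = 0.166.
    print(round(combine(0.725, 0.744), 3))  # group N: 0.885
    print(round(combine(0.484, 0.744), 3))  # group A: 0.732
    print(round(combine(0.484, 0.166), 3))  # group B: 0.157
```

A neutral weight of 0.5 has log-odds of zero and so leaves the other weight unchanged, which is why groups whose speaker and addressee factors are both close to 0.5 end up near the median level.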

4.4 Linguistic variables and tie strength on #india

Figures 2 through 6 indicate that the five variables do not cluster around K, the group with the strongest network ties, contrary to prediction. With the exception of obscenity, group K uses the variables under study less frequently than the channel average. The variables `r', `u' and `z' are used most frequently by peripheral groups with predominantly weak ties, such as A, C, I, N and P, although the patterns differ substantially for each of the variables. Obscenity is not strongly favored by any particular group (K favors it only weakly), and is disfavored by the outer-periphery groups D, N, O and P. Hindi codeswitching is avoided slightly by group K, although it is used most by G, one of the non-central core groups.

How can these results be interpreted? Let us begin with Hindi codeswitching in Figure 2, whose use is most strongly favored by group G, one of the non-core central groups closely linked to K. Group K, however, avoids Hindi codeswitching, so that an isogloss falls between G and K, in spite of the closeness of their ties to one another. Of the remaining groups, the peripheral groups F and L show the most extreme avoidance of Hindi, followed by O and M. Peripheral groups A, C, D, and E, as well as group J all favor Hindi codeswitching to some extent. Thus, Hindi codeswitching is fairly widespread throughout the network, but it does not show the expected relationship to strong ties. We would expect that Hindi, being the dominant language of India, should serve a vernacular function among its speakers who constitute a minority group abroad, for example as a marker of ethnic in-group identification. Something is required to explain this apparent anomaly. Independent observations of codeswitching on #india and #punjab show

Table 3: Probabilities of use for the five variables in all sixteen groups. Groups contained in knockout factors are given as ratios in parentheses, e.g. (0/110)

Group   Hindi   obscenity   `r'     `u'     `z'
A       0.643   0.460       0.732   0.618   0.606
B       0.307   0.581       0.157   0.244   0.326
C       0.628   0.581       0.494   0.618   (0/54)
D       0.628   0.116       0.494   0.540   0.606
E       0.628   0.581       0.494   0.540   0.025
F       0.131   0.581       0.494   0.540   (0/14)
G       0.780   0.460       0.732   0.618   0.606
H       0.466   0.581       0.157   0.244   0.326
I       0.505   0.399       0.494   0.618   (0/110)
J       0.643   0.460       0.732   0.618   0.606
K       0.307   0.581       0.157   0.244   0.326
L       0.131   0.581       0.494   0.540   (0/58)
M       0.198   0.460       0.732   0.260   0.606
N       0.522   0.135       0.885   0.643   0.959
O       0.188   0.116       0.494   0.203   0.606
P       0.505   0.203       0.733   0.567   0.280

Figure 2: Distribution of Hindi codeswitching over the 16 participant groups. Key (weight): 0.780, 0.643, 0.628, median 0.466–0.522, 0.307, 0.198, 0.188, 0.131

similar patterns of distribution. Paolillo (forthcoming) found that operators, who tend to belong to a central clique, disfavor codeswitching. If the attention-getting functions are the channel norm for codeswitching, it follows that central participants like K would have much less need for those functions, and hence less need for codeswitching, because they are already well known to others who readily interact with them. Participants near but not at the network center, such as members of G, have social positions that are less secure; for them attention-getting strategies are presumably important for initiating successful interactions. Although codeswitching partly reflects a participant's self-identification as Indian, its other functions make its use more necessary for those whose social position is less secure. Another way to think of this is that the codeswitching variable exhibits a vernacular pattern of distribution, provided we acknowledge that its linguistic function in the interactive norms of the makes its use less necessary for clique insiders.

This picture is complicated by the fact that not all self-identifying ethnic Indians use Hindi, either because they are not proficient in it, or because it symbolizes traditional values with which they do not wish to identify. In the off-line Indian expatriate community, there is an established hierarchy of cultural authenticity based on language proficiency and acculturation to outside norms. At the periphery, lower-proficiency and acculturated expatriates may be derisively categorized as `ABCD', or `American-Born Confused Desis', by other Indian ethnics indicating that they are less culturally authentic (Paolillo 1996). At the same time, they are distinctively Indian in their self-identification, as expressed through choice of nicks, musical tastes and other ethnic markers besides Hindi codeswitching. Furthermore, ABCD youth tend to be technically savvy and highly sociable, which often makes them fervent adopters of IRC; thus they predominate on #india. As a consequence, participants need not necessarily be proficient in Hindi in order to assume a central social position on #india: as long as one fits in other ways, by knowing the right music, by sharing common experiences, by kicking the right other people when granted ops, etc., one can be accepted. This tendency must especially be considered for female participants, who are less likely than their male cohorts to use Hindi codeswitching, as indicated in Table 2. A similar preference for English among females has been observed elsewhere in the Indian diaspora, where it appears that females are leading in a shift to English (Paolillo 1996). Whether or not the gender pattern on #india should be explained this way, the lower frequency of codeswitching by females does not appear to prevent them from occupying central social positions on #india in groups such as K.

Obscenity, illustrated in Figure 3, is another linguistic variable whose expressive and social functions need to be taken into account. Peripheral participant groups D, N, O and P, and to a lesser extent I, all avoid the use of obscenity, while central groups K, H and B favor it to a small extent, along with the more peripheral groups C, E, F, and L. Again, groups G and J are distinct in their behavior from the core group K with which they are most closely

Figure 3: Distribution of obscenity over the 16 participant groups (legend weights: 0.581, median 0.460, 0.399, 0.203, 0.135, 0.116)

connected. Obscenity appears therefore to function partly as a network marker, albeit not in a prototypical way, since its distribution is not concentric around K. As illustrated in Example 3, the function of obscenity is something more akin to the exercise of discursive power than to expression of in-group identification. Moreover, it has a significantly skewed gender distribution, such that female participants tend to avoid it. Herring (1999b) notes both these aspects of obscenity in her study of sexual harassment on #india. Although a participant's use of obscenity on IRC often constitutes an excuse for operators to kick that participant, operators, who tend to be male, are partly immune from such actions and can generally get away with using more obscenity. As Herring comments, proscriptions against obscenity are selectively enforced and thus reify a pre-existing gender disparity, a practice which more generally discourages the use of obscenity by members of peripheral groups. These observations point out that, although obscenity weakly resembles a network

marker, its distribution must be carefully interpreted in terms of its social and communicative functions.

The three orthographic variables show a different distribution. Figure 4 illustrates the distribution of `u'. Group N leads in the use of `u', but since N is composed of a single participant weakly connected to others, it is probably not a genuine center from which `u' spreads. Groups A, C, P, J and G also favor `u'. As in the case of Hindi codeswitching, groups K, H and B avoid `u', although the peripheral group O shows greater avoidance. Once again, the use of `u' by G and J contrasts sharply with that of K, to which they are closely linked. In spite of this, and in spite of the absence of strong ties between G or J and any other group using `u', G and J's use of `u' is higher than the network average. The only direct link G has to any group using `u' is group C, and that link is directed from C to G; J has no such links. G's use of `u' is probably not accommodation to C, because members of G do not significantly interact with members of C. In short, `u' predominates in the network periphery without being strongly

Figure 4: Distribution of `u' over the 16 participant groups (legend weights: 0.643, 0.618, 0.567, median 0.540, 0.260, 0.244, 0.203)

Figure 5: The distribution of `r' over the 16 participant groups (legend weights: 0.885, 0.732, median 0.494, 0.157)

localized in it. Some peripheral participants disfavor it (M and O), and we also find it near the center of the social network, where it has perhaps spread through weak ties.

Figure 5 shows a similar pattern for `r'. The singleton group N again exhibits the highest probability of using `r', followed by A, P, G, J and M, most of which are peripheral, having only indirect contact with the central core group K. Groups K, H and B, all of which interact predominantly with K, exhibit the greatest avoidance of `r' (they prefer unshortened `are'). Groups C, D, E, F, I, L and O have neutral levels of `r' use. As with `u', there is a sharp difference between G and J, which favor `r', and K, which avoids it. Unlike the groups favoring `u', obscenity and Hindi codeswitching, the groups favoring `r' are not joined together by many ties. Thus, like `u', the variable `r' is associated with the network periphery, but is even less strongly localized. What both `r' and `u' share most is their avoidance by K, H and B. This is nearly the opposite of the distribution predicted for a vernacular variable.

An alternative interpretation of `r' and `u' can be framed in terms of their status as an `outside norm' for the ethnically-based channel #india. Seen in this way, `r' and `u' represent standard IRC usage. Because they are standard variables, the participants most exposed to them would be those who interact most on other channels, and who are therefore peripheral on #india. The apparent resistance to `r' and `u' by those with the strongest ties suggests a conservative, vernacular norm-enforcing function of strong ties. Thus, like the standard variables (e) in Belfast English (Milroy and Milroy 1992), `r' and `u' come to function as markers of network peripherality on #india. These variables differ from the standard variables discussed by Milroy and Milroy, however, in that they are not `legitimized' by the power structures of the larger society. At the same time, given the suggestive distribution of these variables, it might be worthwhile to consider the possibility that they are legitimized by the larger context of IRC, and are thus `standard'.

Finally, Figure 6 illustrates a distinct distribution for the variable `z'. This variable is most common in the outer-periphery, as are the variables `r' and `u', with the single-participant group N again showing the most frequent use of `z', followed by G, J, D, M and O. However, the peripheral groups C, I, F, and L show

Figure 6: Distribution of `z' over the 16 participant groups (legend weights: 0.959, 0.606, median 0.326-0.280, 0.025, 0.000)

rather extreme avoidance of `z': the speaker and addressee factors corresponding to these groups are knockout factors with large numbers of potential sites for `z' observed (see Table 2). The sharpest isogloss for `z' therefore falls between the periphery and the outer-periphery of the network (e.g. between C and A), although a weaker isogloss falls between G and J on the one hand and K on the other. The central groups K, H and B have only a small influence on the distribution of `z'.

The variable `z' therefore does not correlate with tie strength in any clear way. Since the central groups K, H and B all have a median use of `z', it is not a network marker on #india. At the same time its use differs among certain peripheral groups. For this reason, `z' appears to be an outside norm, but not the same sort as `r' and `u'; hence the label `standard' seems to fit as poorly as `vernacular'. It is possible that this distribution reflects the association of `z' with hackers, or perhaps hacker wannabes, who are mostly peripheral to a channel like #india. Channel operators, some of whom have genuine hacker skills that they use in programming bots, etc., disfavor `z', as indicated in Table 2, while G and J, who have less secure positions, favor `z'. Unlike Hindi codeswitching, however, `z' has no obvious interactional function. Thus, `z' has a particularly complex nature which is not readily explainable in the current framework on the basis of present knowledge; its behavior must be left to future study.

5. TIE STRENGTH, LANGUAGE VARIATION AND MEDIUM

In a general way, tie strength is reflected in the distribution of all five variables: core and peripheral groups tend to differ in their patterns of use. Yet the variables do not tend to be used more frequently among the groups with the strongest ties, contrary to our original hypothesis. Rather, group K favors only the use of obscenity, and disfavors or is neutral with respect to the other variables. The variable used closest to the central core is Hindi codeswitching, which is favored by group G, and consequently this variable appears to have a vernacular function as a network marker. However, K differs sharply from G, in strongly avoiding Hindi codeswitching. As discussed above, this apparent paradox is explainable if interactional function and participant abilities are taken into account. Likewise, obscenity needs to be interpreted in terms of its communicative function. In contrast, the variables `r' and `u' show a pattern of distribution more characteristic of standard variables, a situation not anticipated from their hypothesized vernacular status. Finally, the variable `z', sharply divided as it is in the network periphery, resists explanation in terms of tie strength.

Thus, our working hypothesis based on social network tie strength does not accurately predict the distribution of the variables overall. However, tie strength does assist in their interpretation: where a variable is found to be associated with strong network ties, its most natural interpretation is as a vernacular network marker. Likewise, where a variable is associated with weaker network ties, this suggests a relationship with an outside, standard

set of norms. One consequence of these findings is that we need to reconsider the classification of the linguistic variables as vernacular. The initial classification of the five variables as vernacular was made in part on the basis of their non-legitimized status. Legitimization is tied to the allocation of power in societies, which Milroy and Milroy (1992) define in relation to the `dominant group': standard variants are assumed to come from the dominant group, while vernacular variants are assumed to arise within non-dominant groups. While this dichotomy has been a useful heuristic in much sociolinguistic variation research, it does not always fully express what can be observed, and partly for this reason, Milroy and Milroy (1992) turn to a model of macro-level social processes pivoting on the economic orientations of people (life-modes) to explain sociolinguistic variation. To develop a more complete explanation of language variation on IRC, we may follow their example further by considering the relevant macro-level processes which both influence tie strength and apportion power to different networks. The macro-level social processes at work on IRC are distinct from those in neighborhood-based face-to-face interaction, and thus the model of economic life-modes unfortunately cannot help, although similar dynamic processes can be observed.

To begin with, in the analysis of #india, dominance is not easily factored into two poles. The central members of K, H and G are clearly dominant in the local sense, in that they concentrate the power of operator-hood on the channel. At the same time, during the period in which I collected data on #india, the channel was always under threat of takeover by marauding `hackers', who may or may not have been ethnically Indian. More powerful than either of these groups are the IRCops, the computer administrators responsible for running the IRC servers.
IRCops have their own social network relations through which they organize and maintain the services of IRC; their relations are subject to macro-level influences as well, in terms of which remote sites are accessible to the network, and how readily people from different locales are able to coordinate cooperation. IRCops also interfere in the day-to-day management of IRC channels by disconnecting bots, thereby potentially leaving the channel vulnerable to takeover by outsiders.17 Moreover, IRC is embedded within still larger spheres of influence, e.g. Internet access providers, educational institutions, corporate work settings, the media and entertainment industries, and the like. All of these influences have their own centers of power, with their own interests and agendas. All are also linked to IRC channels like #india primarily through weak ties.

The preponderance of weak ties is in part a property of the IRC medium and the social contexts in which it is embedded. The relative mobility of participants, by hopping from IRC channel to channel, by changing servers within an IRC network and by connecting to different networks, even at the same time, means that any given IRC participant will develop many more weak ties than strong ties. Moreover, weak ties may connect participants to

overlapping, potentially conflicting centers of power, each of which may contribute its own linguistic norms. Thus, an IRC channel such as #india exists at the nexus of a multiplicity of influences offering competing linguistic variants, most of which are associated with weak ties. It is in this context that the notion of a standard variable as belonging to the `legitimized' code becomes problematic. What is the power center that confers legitimacy to particular linguistic forms? In a sense, all power centers do; thus the notion of a standard is necessarily a relative one.

An alternative explanation is also provided by Granovetter's (1973) weak tie model, on which Milroy and Milroy's (1992) sociolinguistic model is based, but which does not require an appeal to legitimization or social dominance. In Granovetter's model, the numerous smaller groups of a society, while internally bound by strong ties, are bound to one another by weak, not strong, ties; the cohesiveness of the larger society is manifest through these inter-group weak ties. In this sense, weak ties enforce norms just as strong ties do, but at a level which represents an individual's belonging to a larger society, rather than a local neighborhood group. This perspective suggests a sense of `standard' meaning which is held in common through the cohesion of a larger social group. For such a standard, its legitimization is in its widespread use, that is, in its acceptance as currency by members of a larger social group, rather than by a power elite. The variables `r' and `u' may be considered standard in exactly this sense. Although not legitimized, they are widespread on IRC, being shared by the larger community of IRC participants. They are therefore associated with weak ties, and show the pattern of distribution characteristic of standard variables.
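
The structural claim in Granovetter's model, that groups are bound internally by strong ties and to one another by weak ties, can be illustrated with a minimal sketch. It assumes, as in this study, that tie strength is measured by frequency of contact. The participant names, group labels and contact counts below are hypothetical and are not taken from the #india corpus; the function name `split_ties` is likewise only illustrative.

```python
from statistics import mean


def split_ties(ties, group_of):
    """Partition tie strengths into within-group and between-group lists."""
    within, between = [], []
    for (a, b), strength in ties.items():
        if group_of[a] == group_of[b]:
            within.append(strength)
        else:
            between.append(strength)
    return within, between


# Hypothetical participants, groups and contact counts (illustration only).
group_of = {"p1": "X", "p2": "X", "p3": "Y", "p4": "Y"}
ties = {("p1", "p2"): 40, ("p3", "p4"): 35, ("p2", "p3"): 3}

within, between = split_ties(ties, group_of)

# Under the weak tie model, the mean within-group strength should exceed
# the mean strength of the ties that bridge groups.
print("mean within-group strength:", mean(within))
print("mean between-group strength:", mean(between))
```

In this toy network the single bridging tie between groups X and Y is much weaker than the ties inside either group, which is the configuration through which, on Granovetter's account, widely shared norms would travel.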

6. CONCLUSIONS

The results of this study suggest that there is a definite relationship between tie strength and linguistic variation, although one different from our initial expectations. Of the five variables studied, two, Hindi codeswitching and obscenity, appear to have distributions resembling vernacular variables, in that their use is more frequent among participants with stronger ties. At the same time, anomalies in the distribution of these variables needed to be explained by appealing to their interactive and social functions. Two others, the `r' and `u' variables, turned out to be distributed more like standard variables than like vernacular variables. For these two variables, we must consider the possibility that there are sources of legitimization other than the dominance of a power elite influencing their use. One possibility is that their use is sufficiently widespread to constitute a `standard' with respect to IRC. The fifth variable examined, orthographic `z', is associated with weak ties, but its distribution in the periphery appears to be contested, rather than `standard'. Further study is required to fully explain the distribution of this variable. In all of these cases there is a need to take into account the larger social context of the observed interactions in which the variables are used.

Nonetheless, the results above suggest that the social network approach facilitates sociolinguistic analysis of computer-mediated communication, much as it does for other methodologically challenging social contexts.

NOTES

1. Earlier versions of this research were presented at the twelfth Sociolinguistics Symposium, University of London, March 1998, the Department of English, Youngstown State University, May 1998, the 27th New Ways of Analyzing Variation conference, University of Georgia, October 1998, and the 32nd Hawaii International Conference on Systems Sciences, Wailea, Maui, January 1999. An earlier version of this research was published in the Journal of Computer-Mediated Communication, as Paolillo (1999). I wish to thank the following individuals for their helpful comments on earlier drafts and presentations: Salvatore Attardo, Allan Bell, Ron Breiger, Lynn Cherny, Tom Erickson, Caroline Haythornthwaite, Susan Herring, Lisa Ann Lane, Simeon Yates and five reviewers. All responsibility for errors of fact or interpretation rests with the author.
2. The following list glosses jargon and abbreviated forms used in the examples.
   abt: `about'
   ban: exclude users from later participation (an action performed by operators)
   bkk: `Bangkok' (as used by a single user and not widely current on IRC)
   bot: `robot' (a program that acts as an operator's proxy)
   kick: exclude a user from current participation on a channel (an action performed by operators)
   lamer: an individual lacking technical and/or social skills
   mode: any of several properties of the channel, but especially its operators and banned users
   netsex: sexually explicit dyadic IRC interaction
   op: as a noun, operator or operator privileges; as a verb, grant someone operator privileges (performed by operators)
   peer: (in system messages) an IRC server other than the one reporting the message
   r: `are'
   signoff: (in system messages) a user's connection was terminated by a server
   u: `you' (also ur `your', urself `yourself', etc.)
   wb: `welcome back'
3. Another aspect of the theory presented in Milroy and Milroy (1992) is its account of gendered patterns of social contact and their relationship to patterns of language change.
4. Ashiq (also spelled Aashiq) is the (nick) name of an operator on #india. Here, Dr-pepper appears to be summoning Aashiq, but it is not clear what Nagin's response means.
5. Variant spellings with `x' have complex resonances when used on #india, since `x' is often used to represent a voiceless velar consonant (e.g. in the names Xodiax, Xaos and Xan), as it is in romanized transcriptions of Hindi-Urdu. Hence, it may represent identification with hackers, Indian-ness, or both.
6. Malay is used principally among ethnic Indians who have lived in Malaysia or Singapore.

7. Chiter's and Soul4Real's opposed stances are underscored by their use of the taunting :P and :PP in lines 6, 8, 10, 14 and 18. The `P' in such a `sideways smiley face' is intended to represent someone's tongue sticking out. Doubling the `P' adds emphasis.
8. Factor analysis with varimax rotation was conducted using Minitab 11 Xtra for Power Macintosh.
9. Specifically, their data correlated strongly with one or more of the other participants, resulting in a matrix that is not positive definite (Rietveld and van Hout 1993: 275ff). In social terms, these six participants occupy social positions represented by other participants, so they were not studied further.
10. This threshold was 10 Pearson deviances below the level expected in a model of uniform interaction (Pearson deviances are $(\text{observed count} - \text{expected count})^2 / \text{expected count}$). See Breiger and Ennis (1997) for similar methods using log-linear models.
11. While the reduced sociogram accurately represents tie strength among the 94 participants, it does not exhaustively represent interaction on the channel, since it excludes the 256 remaining participants.
12. Other variables, such as the variant spellings in `x', shortenings like `abt', etc., were infrequent individually, and appeared to be distributed in ways that were distinct from the other variables; hence, they were left for future study.
13. No Malay occurred in the corpus under consideration. Hindi and Gujarati were the two predominant languages, after English, with far more Hindi than Gujarati.
14. As determined by VARBRUL analysis with gender selected as the dependent variable.
15. The romantically involved couple mentioned in the introduction are members of group K. The students of the Malaysian university also mentioned did not appear in the transcript examined.
16. The figures were drawn and re-drawn several times to optimize the arrangement of the groups and the distribution of the variables for interpretation. Where social and linguistic relatedness were in conflict, I have favored arrangements that highlight social relatedness.
17. The reason for the anti-bot policy of IRCops is that bots tend to increase the load on the IRC servers.
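
The Pearson deviance defined in note 10 can be computed with a short sketch such as the following. Only the formula itself comes from the note. The reading of the threshold (a cell is flagged when its observed count falls below the uniform expectation by at least 10 deviances), the function names, and the interaction counts are assumptions made for illustration, not the procedure or data of this study.

```python
def pearson_deviance(observed, expected):
    """Pearson deviance of one cell: (observed - expected)^2 / expected."""
    if expected <= 0:
        raise ValueError("expected count must be positive")
    return (observed - expected) ** 2 / expected


def uniform_expected(counts):
    """Expected count per cell if interaction were spread evenly over all cells."""
    return sum(counts.values()) / len(counts)


def cells_below(counts, threshold=10.0):
    """Cells observed less often than expected by at least `threshold` deviances.

    This reading of the threshold is an assumption, not the paper's stated rule.
    """
    expected = uniform_expected(counts)
    return sorted(
        pair
        for pair, observed in counts.items()
        if observed < expected
        and pearson_deviance(observed, expected) >= threshold
    )


# Hypothetical directed interaction counts (speaker, addressee); illustration only.
counts = {
    ("a", "b"): 30, ("b", "a"): 25,
    ("a", "c"): 2, ("c", "a"): 3,
    ("b", "c"): 0, ("c", "b"): 0,
}

print(cells_below(counts))
```

With these toy counts the uniform model expects 10 interactions per cell, so only the two cells with no observed interaction reach a deviance of 10 on the low side.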

REFERENCES

Ash, Sharon and John Myhill. 1986. Linguistic correlates of inter-ethnic contact. In David Sankoff (ed.) Diversity and Diachrony. Amsterdam: Benjamins. 33–44.
Baym, Nancy K. 1996. Agreement and disagreement in a computer-mediated group. Research on Language in Social Interaction 29: 315–346.
Bechar-Israeli, H. 1997. From <Bonehead> to <cLoNehEAd>: Nicknames, play and identity on Internet Relay Chat. Journal of Computer-Mediated Communication 1.2 (http://www.ascusc.org/jcmc/).
Breiger, Ron and James Ennis. 1997. Generalized exchange in social networks: Statistics and structure. L'Année sociologique 47: 73–88.
Cherny, Lynn. 1999. Conversation and Community: Chat in a Virtual World. Stanford, California: CSLI Publications.

Cheshire, Jenny. 1982. Variation in an English Dialect: A Sociolinguistic Study. Cambridge: Cambridge University Press.
Collot, Milena and Nancy Belmore. 1996. Electronic language: A new variety of English. In Susan C. Herring (ed.) Computer-Mediated Communication: Linguistic, Social and Cross-Cultural Perspectives. Amsterdam: Benjamins. 13–28.
Danet, Brenda. 1998. Text as mask: Gender, play and performance on the Internet. In Steven G. Jones (ed.) Cybersociety 2.0. Thousand Oaks, California: Sage Publications. 129–158.
Danet, Brenda, Tsameret Wachenhauser, Amos Cividalli, Haya Bechar-Israeli and Yehudit Rosenbaum-Tamari. 1997. Curtain time 20:00 GMT: Experiments in virtual theater on Internet Relay Chat. Journal of Computer-Mediated Communication 1.2 (http://www.ascusc.org/jcmc/).
Degenne, Alain and Michel Forsé. 1999. Introducing Social Networks. Thousand Oaks, California: Sage Publications.
Eckert, Penny. 1988. Sound change and adolescent social structure. Language in Society 17: 183–207.
Ferrara, Kathleen, Hans Brunner and Greg Whittemore. 1991. Interactive written discourse as an emergent register. Written Communication 8: 8–34.
Gal, Susan. 1978. Variation and change in patterns of speaking: Language shift in Austria. In David Sankoff (ed.) Linguistic Variation: Models and Methods. New York: Academic Press. 227–238.
Garton, Laura, Caroline Haythornthwaite and Barry Wellman. 1999. Studying on-line social networks. In Steve Jones (ed.) Doing Internet Research. Thousand Oaks, California: Sage Publications. 75–105.
Granovetter, Mark. 1973. The strength of weak ties. American Journal of Sociology 78: 1360–1380.
Herring, Susan C. 1994. Politeness in computer culture: Why women thank and men flame. In Mary Bucholtz, A. C. Liang, Laurel Sutton and Christine Hines (eds.) Cultural Performances: Proceedings of the Third Berkeley Women and Language Conference. Berkeley, California: Berkeley Women and Language Group. 278–294.
Herring, Susan C. 1996. Two variants of an electronic message schema. In Susan C. Herring (ed.) Computer-Mediated Communication: Linguistic, Social and Cross-Cultural Perspectives. Amsterdam: Benjamins. 81–106.
Herring, Susan C. 1999a. Interactional coherence in CMC. Journal of Computer-Mediated Communication 4.4 (http://www.ascusc.org/jcmc/).
Herring, Susan C. 1999b. The rhetorical dynamics of gender harassment on-line. The Information Society 15: 151–167.
Herring, Susan C. In press. Computer-mediated discourse. In Deborah Schiffrin, Deborah Tannen and Heidi Hamilton (eds.) Handbook of Discourse Analysis. Oxford: Blackwell.
Labov, William. 1972. Language in the Inner City: Studies in the Black English Vernacular. Philadelphia: University of Pennsylvania Press.
Milroy, James. 1992. Linguistic Variation and Change. Oxford: Blackwell.
Milroy, Lesley. 1987. Language and Social Networks. Oxford: Blackwell.
Milroy, James and Lesley Milroy. 1985. Linguistic change, social network, and speaker innovation. Journal of Linguistics 21: 339–384.
Milroy, James and Lesley Milroy. 1992. Social networks and social class: Toward an integrated sociolinguistic model. Language in Society 21: 1–26.
Murray, Denise. 1988. Computer-mediated communication: Implications for ESP. English for Special Purposes 7: 3–18.

Myers-Scotton, Carol. 1993. Social Motivations for Code-switching. Oxford: Oxford University Press.
Paolillo, John C. 1996. Language choice on soc.culture.punjab. Electronic Journal of Communication 6.3 (http://www.cios.org/).
Paolillo, John C. 1999. The virtual speech community: Social network and language variation on IRC. Journal of Computer-Mediated Communication 4.4 (http://www.ascusc.org/jcmc/).
Paolillo, John C. 2000. Visualizing Usenet: A factor analytic approach. Proceedings of the 33rd Hawaii International Conference on Systems Sciences. Los Alamitos, California: Institute of Electrical and Electronics Engineers (IEEE) Computer Society (CD-ROM).
Paolillo, John C. Forthcoming. `Conversational' Codeswitching on Usenet and Internet Relay Chat. In Susan Herring (ed.) Computer-Mediated Conversation. Oxford: Oxford University Press.
Rheingold, Howard. 1993. The Virtual Community: Homesteading on the Electronic Frontier. Reading, Massachusetts: Addison-Wesley.
Rice, Ronald E. 1994. Network analysis and computer-mediated communications systems. In Stanley Wasserman and Joseph Galaskiewicz (eds.) Advances in Social Network Analysis: Research in the Social and Behavioral Sciences. Thousand Oaks, California: Sage Publications. 167–203.
Rietveld, Toni and Roeland van Hout. 1993. Statistical Techniques for the Study of Language and Language Behavior. Berlin: Mouton de Gruyter.
Sankoff, David. 1988. Variable rules. In Ulrich Ammon, Norbert Dittmar and Klaus Mattheier (eds.) Sociolinguistics, an International Handbook of the Science of Language and Society. Berlin: Walter de Gruyter. 984–997.
Scott, John. 1991. Social Network Analysis: A Handbook. Thousand Oaks, California: Sage Publications.
Wasserman, Stanley and Katherine Faust. 1994. Social Network Analysis: Methods and Applications. Cambridge: Cambridge University Press.
Wellman, Barry and Milena Gulia. 1999. Net surfers don't ride alone: Virtual community as community. In Marc A. Smith and Peter Kollock (eds.) Communities in Cyberspace. London: Routledge. 167–194.
Werry, Christopher C. 1996. Linguistic and interactional features of Internet Relay Chat. In Susan C. Herring (ed.) Computer-Mediated Communication: Linguistic, Social and Cross-Cultural Perspectives. Amsterdam: Benjamins. 47–64.
Yates, Simeon. 1996. English in cyberspace. In Sharon Goodman and David Graddol (eds.) Redesigning English: New Texts, New Identities. London: Routledge. 106–133.

Address correspondence to:
John C. Paolillo
SLIS
Main Library 011
Indiana University
Bloomington
Indiana 47405
U.S.A.
[email protected]

© Blackwell Publishers Ltd. 2001