The Struggle of Small and Non-Western Wikipedia Editions
Heiko Wiggers
Wake Forest University

Abstract

The online encyclopedia Wikipedia has become one of the most influential Internet platforms on the World Wide Web and is currently the sixth-most visited website overall. For smaller languages, creating their own Wikipedia editions can constitute a tremendous boost to their general online presence. This paper investigates whether Wikipedia’s internal structure and culture are really inclusive in their treatment and representation of minority, endangered, regional, and non-Western languages. The paper argues that Wikipedia and, indeed, the Internet itself favor Western, mainstream languages and content and thus make it almost impossible for smaller languages to achieve a meaningful online presence.

1 Introduction - Digital Divide

The term "digital divide" dates back to the early days of the Internet in the 1990s and describes the unequal access of different sections of the population to new information and communication technologies (ICT) in international, national, and regional comparisons. The term refers not only to the acquisition and ownership of new technological devices (e.g. personal computers, laptops, smartphones, etc.), but also to the fact that, on the one hand, more than half of all people in the world have no access to the Internet and, on the other hand, navigating the Internet (use or handling) poses a significant problem for many people who do have access. From a sociological point of view, researchers (Dudenhöffer & Meyen 2012) worry that information technologies will create a new two-tier society between those who can afford ICT equipment and have the knowledge to operate these devices and those who do not have the necessary income to acquire such devices, or who have difficulty handling such technologies.
Furthermore, it is feared that existing inequalities, especially in terms of education, income and social skills, are being recreated or will even intensify in the new online world. Considering the rapid rise of the Internet as the largest communication system in human history, it becomes clear that people who cannot participate in this phenomenon are not only marginalized, but also have significantly fewer opportunities than the so-called habitual users of these technologies.

Critics point out that the digital divide cannot be substantiated empirically. In particular, they stress that problems of use are relatively easy to remedy, and that it is up to the users themselves to gain the necessary knowledge to navigate the virtual world. By international standards, the digital divide is also considered to be predominantly a sociological and demographic problem, with some blatant inequalities between developed and developing countries. For example, data from 2017 show that only about 10% of the 1.25 billion people living in Africa are Internet users and/or have access to the Internet. In Madagascar, for instance, out of an estimated 25.6 million people, just 1.3 million have access to the Internet, while the number of Internet users in Ethiopia is approximately 16 million out of an estimated 104.5 million inhabitants. The situation is similar in many countries of Southeast Asia.

Wiggers, H. 2018. The Struggle of Small and Non-Western Wikipedia Editions. Proceedings of the 4th Annual Linguistics Conference at UGA, The Linguistics Society at UGA: Athens, GA. 66–86.

2 Smaller Languages and the World Wide Web

In addition to sociological problems, there is much debate about whether and to what extent access to and use of the Internet poses a threat to minority, endangered, regional, and non-Western languages (henceforth referred to as MERnW-languages).
Linguists (such as Crystal 2002) point to a massive extinction of languages and estimate that approximately half of the estimated 6000-7000 languages currently spoken in the world will be extinct by the end of the 21st century. This process already existed before the spread of the Internet, but it has noticeably accelerated since the turn of the millennium. In linguistic research there are relatively large differences of opinion as to how the digitization of large parts of humanity contributes to the extinction of languages. Many linguists and language activists see the Internet as a chance to revive MERnW-languages or make them more accessible to a wider audience. Many others, however, fear that the increasing interconnectedness of the world only benefits the major dominant languages, such as English, Spanish, German, or French, and that smaller languages will inevitably fall by the wayside. This is particularly true for MERnW-languages whose speakers often have problems with accessing or using the Internet. The figures below show that the digital revolution of recent decades is by no means a reflection of global linguistic diversity:

i. Of the estimated 6000-7000 spoken languages in the world, fewer than 500 had a digital existence in 2017 (i.e. websites in their languages).
ii. Of the approximately 3.9 billion Internet users worldwide in 2017, around 3 billion were speakers of the so-called "top ten languages online". These are: English, Chinese, Spanish, Arabic, Portuguese, Indonesian / Malay, Japanese, Russian, French and German.
iii. This means that the remaining 900 million Internet users in 2017 were distributed among the approximately 470-480 remaining languages that are represented online.
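The arithmetic behind these figures can be retraced in a few lines. The following is only a minimal sketch: it uses the rounded 2017 estimates quoted above, the variable names are illustrative, and the per-language averages are a rough indicator of scale added here for illustration, not figures taken from the cited sources.

```python
# Rounded 2017 estimates as quoted in the text; variable names are illustrative.
total_users = 3_900_000_000        # Internet users worldwide
top_ten_users = 3_000_000_000      # speakers of the "top ten languages online"
remaining_languages = (470, 480)   # other languages represented online (approximate range)

# Users left over once the top ten languages are accounted for.
remaining_users = total_users - top_ten_users

# Share of all Internet users who speak one of the top ten languages.
top_ten_share = top_ten_users / total_users

# Average users per language (an illustrative assumption: users spread evenly).
avg_top_ten = top_ten_users // 10
avg_remaining = [remaining_users // n for n in remaining_languages]

print(f"Remaining users: {remaining_users:,}")
print(f"Top-ten share of all users: {top_ten_share:.1%}")
print(f"Average users per top-ten language: {avg_top_ten:,}")
print(f"Average users per remaining language: {avg_remaining[1]:,} to {avg_remaining[0]:,}")
```

Under these assumptions, roughly three quarters of all Internet users fall within the top ten languages, while each of the remaining languages averages fewer than two million users. In practice the distribution among the remaining languages is likely to be far from even.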
In 2007, Cunliffe launched an extensive study to investigate the online presence of smaller languages and came to the following conclusion:

“The linguistic diversity of the world is poorly reflected on the Internet. [...] 90% of the world’s languages are simply not represented.” (2007: 139)

The American media, however, seem to view the Internet as a rather positive medium for smaller languages. In fact, in recent years a slew of American media reports have appeared whose headlines alone seem to indicate that internet technology and/or globalization are a cure-all for MERnW-languages, such as: “Globalization helps prevent endangered languages” (Yale Global News, December 2013); “For rare languages, social media provide new hope” (NPR, July 2014); and “Technology to the endangered language rescue!” (Huffington Post, January 2015). This kind of trust in the Internet as a regenerative medium relies on a considerable body of research that views the World Wide Web and especially social media as a significant opportunity for MERnW-languages. In addition, these studies are not limited to African languages or threatened languages in South America’s Amazon region, but also extend to Europe’s endangered languages. Dolowy-Rybinska (2013), for example, investigated the use of Kashubian in social media, and came to the conclusion that this language, whose use was prohibited under Poland’s communist rule, benefits enormously from the Internet:

“Speaking most broadly, the rise of the Internet has been very advantageous for the Kashubian-speaking community, especially for the young. [...] It [using Kashubian online] has led to an increase in the prestige of the language: if Kashubian can be used online, it cannot be so inferior and unsuitable after all. [...]
Young people communicate, exchange remarks [on] Kashubian culture and its function in the modern world, and find other people to whom Kashubian language and culture are also important.” (2013: 127-128)

Susan Wright (2006) examined the use of five other smaller, regional European languages online (Occitan, Piedmontese, Ladin, Sardinian, and Frisian) and likewise concluded that the Internet was generally a positive development for these languages:

“[...] all five languages in the survey are present on the Internet. Without providing actual figures, which are [...] likely to be misleading and immediately out of date, we can nonetheless report that the Occitan researchers found over a thousand sites, the Sardinian and Frisian researchers hundreds, and the Piedmontese and Ladin researchers dozens. The numbers of websites in which the five languages are used is, therefore, not negligible, and their presence in this medium indisputable.” (2006: 192-193).

Despite these generally auspicious results, both authors point out that their research was ultimately inconclusive, as it is impossible to predict whether the Internet will really improve the situation of these languages in the long term. In addition, both researchers emphasize that the digital presence of a MERnW-language is by no means equivalent to a language revival:

“The fact remains that using certain pages in the minority language is unlikely to produce a major linguistic shift among young people; their main language is likely to remain the national language or – in international contacts – English.” (2013: 127).

In general, the influence of the Internet on MERnW-languages is viewed much more cautiously outside the U.S. The most widely respected and most recognized study in the non-English-speaking world on this subject comes from the Hungarian linguist András Kornai and his team from the Budapest Institute of Technology.
With meticulous research and the application of mathematical formulas and algorithms, Kornai’s team not only explored the current digital state of MERnW-languages, but also made predictions about their digital future, which are quite sobering. Based on his team’s calculations, Kornai predicts that fewer than three hundred languages will have an online presence by the 21st century:

“With only 250 digital survivors, all others must inevitably drift towards digital heritage status or digital extinction. [...] There could be another 20 spoken languages [...] that may make it, but every one of these will be an uphill battle. For 95% of the world’s languages there is very little hope of crossing the digital divide.” (2013: 10)

Furthermore, Kornai points out that it is very difficult for MERnW-languages to secure a so-called digital ascent, i.e.