ITALIAN INTERNET TERMINOLOGY: A CORPUS-BASED APPROACH TO BANALISED LANGUAGE

by

Wendy Marie Schrobilgen

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy
Department of Italian Studies
University of Toronto

© Copyright by Wendy Marie Schrobilgen 2010

ITALIAN INTERNET TERMINOLOGY: A CORPUS-BASED APPROACH TO BANALISED LANGUAGE

Abstract of Doctoral Thesis
Wendy Marie Schrobilgen
Department of Italian Studies, University of Toronto, 2010

The present study offers strong evidence that the World Wide Web is a unique domain of use that can be categorized as a banalised context based on certain defining criteria. Italian Internet terminology is worthy of investigation because of its unprecedented extent and rich context of use. The goal of chapter one is to make a case for the utility of a corpus-based study, to explain the primary theoretical underpinnings of the study, which I base on the concept of central meaning linked to the compositionality of elements, and to give important historical and sociological motivations for such a study. In chapter two, I explain how terms are selected and discarded, how my corpus is created, and how it proves suitable for and representative of this lexical domain. In the third chapter, I elucidate the classificatory system I employ, which allows me to view the lexical items in grammatical context. To gain a better understanding of the conceptual system of the terms studied, I introduce another important analytical framework: qualia structures. In chapter four, the analysis of the morphosyntactic and semantic character of the terminology allows for greater insight into the processes and patterning of denomination of Internet terminology. To conclude this study, I show that Italian Internet terminology is a banalised language governed by a systematic set of morphosyntactic rules in which Italian selects, uses, and lexicalizes terms based on core units of meaning.
Thesis Outline

Table of Contents
Glossary of Terms and Abbreviations
1. Introduction
2. Corpus and Tools
3. Neological Aspects of the Italian Lexicon
4. Presentation of Data and Analysis of Terms
5. Conclusions
Bibliography
Appendix: Corpus of Terms Online http://www.humanities.mcmaster.ca/~schrobw/

Table of Contents

0 Glossary, Symbols, Abbreviations and Conventions
0.1 Glossary
0.2 Conventions and Symbols
0.3 Abbreviations
0.3 Other Symbols

1 Introduction
1.1 Area of Study
1.1.2 The Internet: A Unique Domain of Language Use
1.1.3 The Language of the Internet
1.2 The Sociolinguistic Context
1.2.1 The Influence of English on Italian
1.2.2 Social Attitudes Towards the English Language in Italy
1.3 Literature Review
1.3.1 Examination of Previous Studies
1.3.2 Past Studies on Specialized Languages
1.3.2.1 Galisson
1.3.2.2 Wexler
1.3.3 Studies on Language Using Corpora
1.3.1a Summary of Previous Lexicological Studies
1.3.4 Recent Studies on English Borrowings on the Internet
1.3.5 Conclusion to Literature Review
1.4 Essential Notions
1.4.1 Banalisation
1.4.1.1 Diagram of Banalised Language
1.4.2 Mention vs. Use
1.4.2.1 Examples of Mention vs. Use: spam/spamming/spammare
1.4.3 Terms That Appear in Other Semiotic Systems
1.4.4 Areas of Use
1.4.5 Regional vs. Standard Italian
1.4.6 Terms and Terminology
1.4.6.1 Diagram: The Linguistic Sign
1.5 Plan of Study

2 Corpus and Tools
2.1 Introduction
2.2.1 How the Net Differs from World Wide Web
2.2.2 Internet Coverage and Use
Table 2.2.2.1 Internet Usage in Europe by Comparison
2.3 The Media
2.3.1 Studying the Media: Establishing a Community of Practice
2.3.2 Establishing Trust in the Community of Users
2.4 Materials
2.4.1 Media Sources: Italian Language Media Online
2.4.1.1 National News Websites in Italian Language
2.4.2 The Corpus
2.4.3 Source and Number of Terms Generated
2.4.4 Subcategories of Terms
2.4.5 Other Kinds of Terms Excluded from Corpus
2.4.6 Use of a Control Group to Test Terms
2.4.7 Attestations / Frequency of Terms
2.4.8 Establishing Central Terms
2.4.8.1 Table of Synonymous Terms
2.4.8.1 Results of Italian Language Sites Searched Through Google.it
2.4.8.1.1a Frequency of Synonymous Terms: ‘to click / click!’
2.4.8.1.1b Frequency of Synonymous Terms: ‘chat / chat!’
2.4.8.1.2a Frequency of Synonymous Terms: ‘streaming’
2.4.8.1.3a Frequency of Synonymous Terms: ‘link’
2.5 Tools
2.5.1 Google.it
2.5.1.1 Google Advanced Search Page
2.5.2 Concordance
2.6 Synchronic Approach
2.7 Conclusion

3 Neological Aspects of the Italian Lexicon
3.1 Introduction
3.2 Borrowing
3.2.1 Conditions of Borrowing
3.2.2 Integration of Loanwords: Phonological Considerations
3.2.3 Constraints on Borrowing
3.2.4 Derivation in English and Italian: A Comparison
3.2.5 Word Morphology and Borrowed Lexical Stock
Table 3.2.5.1 Noun Analysis
Table 3.2.5.2 Verb Analysis
3.3 Word Formation in Italian
3.3.1 How Words Are Formed
3.4 Categorization of New Words
3.4.1 Polysemy: Metaphorical Extension/Semantic Shift
3.4.2 Borrowings
3.4.3 Lexicalization of Borrowed Terms
3.4.4 Derivation
3.4.5 Compounding
3.4.6 Synonymous Terms
3.5 Semantic Approaches to Terms
3.5.1 Frames and Profiles
3.5.2 Frames and Selection Restrictions
3.5.2.1 Pustejovsky’s Generative Lexicon Theory
3.6 Conclusion

4 Presentation of Data and Analysis of Terms
4.1 Introduction
4.2 Borrowing
4.2.1 Semantic Borrowing
4.2.2 Formal Borrowing
4.2.2.1 background: 2 (0.042%)
4.2.2.2 streaming: 32 (0.677%)
4.2.2.3 cam: 8 (0.116%)
4.2.3 Lexicalization of Borrowed Terms
4.2.3.1 Most Frequent Lexicalized Borrowings as They Appear in the Corpus
4.2.3.1.1 Results from Search of Blog in the Corpus
4.2.3.1.2 Results from Search of Chat in the Corpus
4.2.3.1.3 Results from Search of Click/Clic in the Corpus
4.2.3.1.4 Results from Search of Log in the Corpus
4.2.3.1.5 Results from Search of Mail/Email in the Corpus
4.2.3.1.6 Results from Search of Network in the Corpus
4.2.3.1.7 Results from Search of Scroll in the Corpus
4.2.3.1.8 Results from Search of Spam in the Corpus
4.2.3.1.9 Results from Search of Surf in the Corpus
4.2.3.1.10 Results from Search of Tag in the Corpus
4.2.3.1.11 Results from Search of User in the Corpus
4.2.3.11 Highly Lexicalized Terms: Semantic Lattices
4.2.4 Derivation
4.2.4.1 Subjective Nominalization (Nominative of Subject)
4.2.4.2 Objective Nominalization
4.2.4.3 Lexicalization of Nouns > Verbs
4.3 Compounding, Syntagms and Ellipsis
4.3.1 Compounding
4.3.1.1 Compound Constructions in the Corpus
4.3.1.2 Results from Search of Web in Non-Compound Constructions in the Corpus
4.3.2 Syntagms
4.3.3 Ellipsis
4.4 Synonymous Terms
4.4.1 Synonymous Terms in the Corpus
4.4.1.1 Comparison of Scaricare and Downloadare in the Corpus
4.5 Sigmatic Plurals
4.6 Gender Attribution
4.7 Conclusion

5 Conclusions
5.1 Introduction
5.2 Morphosyntactic Aspects of Italian on the Internet
5.2.1 Compounds
5.2.2 Verb Phrases/Syntagms
5.2.3 General Patterns and Stages of Lexicalization
5.3 Semantic Aspects of Italian on the Internet
5.3.1 Competing Terms: Nouns and Noun Phrases
5.3.2 Competing Terms: Verbs and Verb Phrases
5.3.3 Distribution of Semantic Load
5.3.4 Paradigms: The Collocation of Terms
5.4 The Expression of Internet Italian on the Internet
5.4.1 Italian on the Internet as a Banalised Language
5.4.1.1 Comparison of Galisson’s Banalised Language and Italian on the Internet
5.5 Applications and Future Work
5.5.1 Phonological and Prosodic Aspects of Italian on the Internet and Constraints on Borrowings
5.5.2 English Suffixation in Italian
5.5.3 Diachronic Approach
5.5.4 Second Language Teaching and Learning

Bibliography

0 Glossary, Symbols, Abbreviations and Conventions

0.1 Glossary

Context: Refers to the domain of use.
Co-text: Refers to the written context in which we find the term. The term, as it appears in the data, gives it its meaning.