A Corpus-Based Investigation of Lexical Cohesion in En & It

A CORPUS-BASED INVESTIGATION OF LEXICAL COHESION IN EN & IT NON-TRANSLATED TEXTS AND IN IT TRANSLATED TEXTS

A thesis submitted to Kent State University in partial fulfillment of the requirements for the degree of Doctor of Philosophy

by Leonardo Giannossa
August, 2012

© Copyright by Leonardo Giannossa 2012
All Rights Reserved

Dissertation written by
Leonardo Giannossa
M.A., University of Bari, Italy, 2007
B.A., University of Bari, Italy, 2005

Approved by
Brian Baer, Chair, Doctoral Dissertation Committee
Richard K. Washbourne, Erik Angelone, Patricia Dunmire, Sarah Rilling, Members, Doctoral Dissertation Committee

Accepted by
Keiran J. Dunne, Interim Chair, Modern and Classical Language Studies
Timothy Moerland, Dean, College of Arts and Sciences

Table of Contents

LIST OF FIGURES
LIST OF TABLES
DEDICATION
ACKNOWLEDGEMENTS
ABSTRACT
INTRODUCTION
    Why Study Lexical Cohesion?
    Research Hypotheses
    Research Method
    Significance of my Research Hypotheses
    Summary of Chapters
CHAPTER I
    1.1 Coherence vs. Cohesion
    1.2 Lexical Cohesion
    1.3 Lexical Cohesion Studies in Discourse Analysis and Linguistics
    1.4 Lexical Chaining Sources
        1.4.1 The WordNet Project
    1.5 Cohesion and Lexical Cohesion in Translation Studies
CHAPTER II
    2.1 Methodological Approaches: Text Analysis and Corpus Linguistics
        2.1.1 Text Analysis
    2.2 Tools
        2.2.1 WordSmith Tools
        2.2.2 WordNet
    2.3 Preliminary Analysis
    2.4 Semantic Relation Analysis
    2.5 Statistical Analysis
CHAPTER III
    3.1 Parallel and Comparable Corpora
    3.2 Textual Analysis
        3.2.1 Standardized Type-Token Ratio
        3.2.2 Sentence number
        3.2.3 Lexical Density
        3.2.4 Readability
        3.2.5 Average Sentence Length
    3.3 Semantic Analysis
        3.3.1 Repetition and Modified Repetition
        3.3.2 Synonyms
        3.3.3 Antonyms, Meronyms and Holonyms
        3.3.4 Hypernyms
        3.3.5 Hyponyms
    3.4 SPSS Statistical Analysis
        3.4.1 Textual Features
        3.4.2 SPSS Statistical Analysis: Semantic Features
CHAPTER IV
    4.1 Introduction
    4.2 Textual Features
        Average Sentence length
        Sentence number
        STTR
        Lexical Density
    4.3 Semantic Features
        4.3.1 Repetition
        4.3.2 Synonyms
        4.3.3 Meronyms, Holonyms, Hypernyms, Hyponyms, Antonyms
        4.3.4 Semantic categories other than repetitions as a whole
CHAPTER V
    5.1 Introduction
    5.2 Pedagogical Implications
    5.3 Limitations and Future Directions
GLOSSARY OF ACRONYMS
References
Webography

LIST OF FIGURES

Figure 1 – WordList Frequency List
Figure 2 – Wordlist Statistics
Figure 3 – MultiWordNet Interface

LIST OF TABLES

Table 1 – Preliminary Textual Analysis Screenshot
Table 2 – Semantic Relation Analysis
Table 3.1 – Parallel Corpus STTRs
Table 3.2 – Parallel Corpus Types
Table 3.3 – Comparable Corpus STTRs
