Phraseology and Epistemology in Scientific Writing: a Corpus-Driven Approach
by Garry Lee Plappert

A thesis submitted to The University of Birmingham for the degree of Doctor of Philosophy

Department of English
School of English, Drama, Canadian and American Studies
July 2012

Abstract

This thesis uses the tools and methods of corpus linguistics to study the process of knowledge encoding in a corpus of texts from the scientific discipline of genetics. It is argued here that the approach taken fits into the tradition of corpus-driven approaches to linguistic questions in that no assumption is made about the linguistic form that this knowledge encoding will take. Instead the study proceeds by identifying a set of keywords using the concept of lexical chains to identify items of terminology. The investigation of these uses the cluster function of WordSmith Tools (Scott 2004) and is qualitative, following Sinclair (1991; 2004) in attempting to develop a picture of the typical linguistic nature of the patterns surrounding these clusters inductively through a process of studying collocation and colligation patterns and identifying phraseology. It is argued here that such an approach is required to discover linguistic aspects of epistemic encoding that have as yet not been identified by those working in the related fields of discourse analysis or corpus linguistics.

Table of Contents

Chapter 1: Introduction
  1.1 Introduction
  1.2 Stating the research problem(s)
  1.3 Justifying the research
  1.4 Outline of the study
  1.5 Some assumptions and limitations of this research

Chapter 2: Epistemology, social epistemology and the philosophy of science
  2.1 Traditional Epistemology
    2.1.1 Epistemology in analytic philosophy
    2.1.2 A crisis in the traditional view of knowledge: the Gettier case
  2.2 The move towards Social epistemology
  2.3 The analytic tradition of the philosophy of science
  2.4 Conclusion

Chapter 3: The contribution of linguistics
  3.1 The study of discourse
  3.2 Non-corpus study of scientific texts
  3.3 Corpus Linguistics: concordances, collocation and keywords
  3.4 Applications of Corpus Linguistics
  3.5 Corpus linguistics, scientific texts and academic writing
  3.6 The potential of corpus-based and corpus driven approaches
    3.6.1 The corpus-based/corpus-driven distinction
    3.6.2 Epistemic signalling and the corpus-driven approach
  3.7 Conclusion

Chapter 4: Methodology
  4.1 Introduction
  4.2 Pilot study
    4.2.1 The pilot corpus
    4.2.2 Extracting an item for further investigation
    4.2.3 Exploring an item using the techniques of corpus linguistics
    4.2.4 Investigating an item: The synchronic perspective
    4.2.5 The synchronic perspective continued: qualitative analysis of expanded contexts
    4.2.6 Summary
  4.3 Final corpus construction
    4.3.1 Purpose
    4.3.2 Genre
    4.3.3 Size
    4.3.4 Representativeness
    4.3.5 Corpus annotation
    4.3.6 Reference corpus
  4.4 Data collection and the corpus
  4.5 Extracting useful items for study
    4.5.1 whole corpus keywords
    4.5.2 Extracting discourse objects using lexical chains
  4.6 Conclusion

Chapter 5: Results
  5.1 The clusters
    5.1.1 Problems with the initial cluster list
    5.1.2 Final list of clusters for investigation in expanded contexts.
  5.2 Clusters containing cells
    5.2.1 wild type cells
    5.2.2 embryonic stem cells
  5.3 Clusters containing gene
    5.3.1 gene expression data
    5.3.2 gene expression patterns
    5.3.3 gene expression profiles
    5.3.4 changes in gene expression
  5.4 Clusters containing genes
    5.4.1 X linked genes
  5.5 Clusters containing expression
  5.6 Clusters containing cell