The Pennsylvania State University
The Graduate School
Department of Applied Linguistics

UNDERGRADUATE VS. GRADUATE ACADEMIC ENGLISH: A CORPUS-BASED ANALYSIS

A Dissertation in Applied Linguistics
by
Tracy S. Davis

© 2012 Tracy S. Davis

Submitted in Partial Fulfillment of the Requirements for the Degree of
Doctor of Philosophy

December 2012

The dissertation of Tracy S. Davis was reviewed and approved* by the following:

Xiaofei Lu
Gil Wats Early Career Professor in Language and Linguistics
Dissertation Advisor
Chair of Committee

Suresh Canagarajah
Edwin Erle Sparks Professor in Applied Linguistics and English

Xiaoye You
Associate Professor of English and Asian Studies

Robert Schrauf
Professor of Applied Linguistics

Karen Johnson
Professor of Applied Linguistics
Director of Graduate Studies

*Signatures are on file in the Graduate School

ABSTRACT

While English for Academic Purposes (EAP) has been the focus of much research, the differences between academic English at the undergraduate level and that at the graduate level have largely been ignored. The present study serves to fill that gap in the literature by a) creating two corpora of academic English at the two levels of study and b) comparing the vocabulary, lexical bundles, and grammar related to parts of speech (POS) found in each by emulating a series of corpus studies done on academic English in other contexts. Undergraduate academic English was found to have a more diverse vocabulary, a vocabulary that did not support teaching the Academic Word List (AWL) at the undergraduate level, less diversity in lexical bundles, a greater reliance on lexical bundles with different functions than those used at the graduate level, significantly different use of semantic types of nouns and verbs, significantly different use of personal pronouns, and significantly different use of voice and tense between disciplines at the two levels.
While the differences between the two types of academic English ranged from nominal to dramatic, the study taken as a whole clearly demonstrates that academic English cannot be assumed to be the same regardless of level, and that these differences need to be addressed in EAP courses and potentially in undergraduate writing courses as well.

TABLE OF CONTENTS

List of Figures
List of Tables
Acknowledgements

Chapter 1 Introduction
    1.1 Comparison of Undergraduate and Graduate Programs
    1.2 Research Questions
    1.3 Structure of the Dissertation

Chapter 2 The Corpus
    2.1 Defining Academic, Undergraduate, and Graduate English
        2.1.1 Academic English by Context
        2.1.2 Academic English by Approach
        2.1.3 Academic English in this Study
    2.2 Corpus Design
        2.2.1 Text Inclusion
        2.2.2 Discipline Selection
        2.2.3 Size
        2.2.4 Transcription and Coding of Text
        2.2.5 Tagging and Manipulating Text
    2.3 Summary

Chapter 3 Vocabulary
    3.1 What Types are Frequent and What is their Range?
        3.1.1 Methodology
        3.1.2 Results
    3.2 How Well Does the AWL Cover the AUGER Corpus?
        3.2.1 Methodology
        3.2.2 Results
    3.3 The America-centric Use of Language
    3.4 Summary

Chapter 4 Lexical Bundles
    4.1 Lexical Bundles in Academic English
    4.2 Analysis of Lexical Bundles
        4.2.1 What is the Frequency across Corpora?
        4.2.2 Lexical Bundles at the Two Levels
        4.2.3 Which Disciplines Use Lexical Bundles Frequently?
        4.2.4 What are the Frequent Lexical Bundles?
    4.3 Summary

Chapter 5 Grammatical Variation
    5.1 Analysis of Grammar
    5.2 Content Word Classes
    5.3 Nouns
    5.4 Personal Pronouns
    5.5 Verbs
        5.5.1 Semantic Classes of Verbs
        5.5.2 Verb Tense, Aspect, and Voice
    5.6 Summary

Chapter 6 Conclusions
    6.1 Vocabulary
    6.2 Lexical Bundles
    6.3 Grammar
    6.4 Implications for Research
    6.5 Implications for Pedagogy
    6.6 Limitations
    6.7 Future Research
    6.8 Concluding Remarks

References
Appendix The Penn Treebank Tag Set

LIST OF FIGURES

Figure 1.1 Requirements for Undergraduate Degrees
Figure 3.1 Frequency of Lemmatized Types
Figure 3.2 Types by Discipline and Level
Figure 3.3 Normalized Type Token Ratio
Figure 5.1 Comparison of Content Word Classes by Level
Figure 5.2 Nouns by Group
Figure 5.3 Personal Pronouns
Figure 5.4 Pronouns per Million Words by Discipline Group
Figure 5.5 Percent of Tokens for Semantic Classes of Verbs by Level
Figure 5.6 Past Tense and Present Tense in the Soft-Pure Fields
Figure 5.7 Past Tense and Present Tense in the Hard-Pure Fields
Figure 5.8 Percent of Passive Voice by Discipline and Level

LIST OF TABLES

Table 1.1 Comparison of Requirements at the Undergraduate and Graduate Levels
Table 2.1 Disciplines in Each Subcategory
Table 2.2 Breakdown of Texts by Discipline
Table 3.1 Y2k-SWAL Corpus Tokens
Table 3.2 Number of Lemmas that Appear 200 Times per Million or More
Table 3.3 Coverage of the GSL and AWL at the Graduate Level
Table 3.4 Word Families Found in all 10 Disciplines
Table 3.5 Use of Nation-centric Language
Table 4.1 Lexical Bundles in the AUGER Corpus
Table 4.2 Top Three-Word Lexical Bundles across Both Corpora
Table 4.3 Top Four-Word Lexical Bundles across Both Corpora
Table 4.4 Top Five-Word Lexical Bundles across Both Corpora
Table 4.5 Four-Word Bundle Frequency Information
Table 4.6 Shared Lexical Bundles Between Undergraduate and Graduate Levels
Table 4.7 Biology
Table 4.8 Chemistry
Table 4.9 Earth Science
Table 4.10 Mathematics
Table 4.11 Physics
Table 4.12 Geography
Table 4.13 History