Exemplary Analyses of the Philippine English Corpus
Ma. Lourdes S. Bautista
De La Salle University-Manila

The inspiration for this paper comes from Schneider’s “Corpus linguistics in the Asian context: Exemplary analyses of the Kolhapur corpus of Indian English” (2000), which illustrates how descriptive linguistics can be done using an electronic corpus. The general objective of his study was to characterize some features of Indian English and to determine whether this new variety was particularly conservative or relatively innovative compared to the colonizers’ variety, mainly British English, and peripherally American English. In other words, does Indian English manifest signs of a nativization process? His analysis of the Kolhapur corpus focuses on the use of the subjunctive, the case marking of wh- pronouns, the putatively British intransitive-do proform, and the indefinite pronouns ending in -body and -one. He compares his results with those of the (British) Lancaster-Oslo-Bergen, hereafter LOB, and the (American) Brown corpora. The comparison seems well motivated because the three corpora (Kolhapur, LOB, and Brown) each consist of one million words, each contain five hundred texts of two thousand words, and follow the same basic design of fifteen print genres or text types. All three are also synchronic, i.e., the texts in each corpus were printed within the same year (1961 for both Brown and LOB, but 1978 for Kolhapur).

I am replicating Schneider’s (2000) analysis of the subjunctive, the case marking of wh- pronouns, and the indefinite compound pronouns in -body and -one using two corpora: the components of the International Corpus of English (ICE) from the Philippines and Singapore, hereafter ICE-PHI and ICE-SIN.
I have included the Singapore component because, like the Philippines, Singapore has a colonial history but, unlike the Philippines, it was under the British rather than the Americans.¹ It should be pointed out that the dates of compilation of the various corpora differ: the Kolhapur corpus was compiled in 1978, whereas ICE-PHI and ICE-SIN were compiled throughout the 1990s, with some data in ICE-PHI obtained as late as 2004. More critically, the Kolhapur corpus of one million words is drawn from printed material (500 texts × 2,000 words per text), while the ICE-PHI and ICE-SIN corpora of one million words each come from three hundred spoken texts (≈ 600,000 words) and two hundred written texts (≈ 400,000 words). Since the difference between spoken and written corpora has been found to be wide-ranging and considerable, in order to give greater validity to the comparison with Schneider’s findings, I have removed all the spoken text types together with the nonprofessional and correspondence genres. Thus, the investigation here has been limited to the published data (150 texts or 300,000 words) from ICE-PHI and ICE-SIN.

To make the figures from Schneider’s paper comparable to the ones for ICE-PHI and ICE-SIN, I have used the normalization procedure described in Biber et al.’s Methodology Box 6, Norming Frequency Counts: normalization is “a way to adjust raw frequency counts from texts of different lengths so that they can be compared accurately” (1998:263–264). In this case, following Schneider’s advice (personal communication, April 2005), I have chosen one million words as the basis for norming. This means dividing the raw frequency count for each feature by three and then multiplying it by ten; the formula comes from the corpus size of 300,000 words normed to a million words ([300,000 ÷ 3] × 10 = 1,000,000). A hypothetical raw count of 30 in the 300,000-word dataset, for example, norms to 100 per million words.

* The availability in 2004 of the Philippine component of ICE has facilitated systematic studies of the lexical and grammatical features of Philippine English, in itself and in comparison with other corpora in ICE. ICE-PHI is an electronic lexical corpus of Philippine English consisting of three hundred spoken and two hundred written texts, with each text having approximately two thousand words, comprising a little over a million words. The Philippine corpus, together with corpora from six other countries or regions (Hong Kong, East Africa, Great Britain, India, New Zealand, Singapore), can be accessed from <http://www.ucl.ac.uk/English-usage/ice/>. With the use of a concordancing program such as WordSmith, electronic corpora present immense possibilities for interested students of Philippine English. It becomes easier, then, to implement a recommendation of Howard McKaughan, this volume’s honoree, from his article “Toward a standard Philippine English”:

While the Philippines stands high amongst its peers in Southeast Asia in research on its own variety of English, there still remains much to be done. Teachers of language and linguistics should strive for even more advances in the study of Philippine English and to develop more efficient ways to teach Philippine English so that, beside the widening distribution and use of Filipino, the National Language, they will help Philippine English continue to flourish. (McKaughan 1993:53)

I would also like to thank De La Salle University-Manila for the grant from the Research Faculty Program that enabled me to prepare this paper.

Loren Billings & Nelleke Goudswaard (eds.), Piakandatu ami Dr. Howard P. McKaughan, 5–23. Manila: Linguistic Society of the Philippines and SIL Philippines, 2010.

¹ The bulk of the analysis for this paper was completed in mid-2005. Later that year (November), I was surprised to discover that Edgar Schneider’s contribution to a festschrift in my honor was entitled “The subjunctive in Philippine English”: I became anxious that his article had made my paper obsolete. However, though he covered basically the same ground that I have here, there are enough differences in the analysis to make publishing this paper worthwhile. For one, the interested reader can note the methodological differences between the two papers. Schneider used the entire ICE-PHI (both spoken and written subcomponents), whereas this paper has included only the printed dataset and used norming to approximate the one million words in the other printed corpora. For another, this paper has included data from ICE-SIN and therefore adds a dimension to the comparative analysis. Finally, the reader can observe slight differences in detail by comparing, in the two papers, the frequencies for the suasive verbs, adjectives, and nouns triggering the subjunctive and the frequencies of the hypothetical subjunctive were with as if, as though, even if. The important thing, however, is that we have both arrived at the same conclusion for the subjunctive and the equivalent should form, i.e., that Philippine English is highly predisposed to using the subjunctive rather than should and that it adheres very closely to American English in this regard.

1. Use of the subjunctive

The SUBJUNCTIVE, as Schneider notes, is defined as a formally marked grammatical category: a distinct verb form whose meaning is broadly counterfactual. He writes that it is strongly recessive in the history of English; however, it has been retained in American more than in British English (2000:124). In British English, the subjunctive sounds formal and rather legalistic in style, but it appears to be re-establishing itself, perhaps because of American influence (Quirk et al. 1985:157). Thus, the subjunctive is an interesting structure for comparative studies because the two dominant first-language varieties (British and American) show different usage of the structure. As in Schneider, the analysis here focuses on the MANDATIVE subjunctive: the subjunctive used in that-clauses after verbs and, very occasionally, after SUASIVE expressions such as nouns and adjectives expressing a demand, order, recommendation, suggestion, or wish.

The formal characteristics of the subjunctive have been succinctly described by Schneider (2000:124), summarizing Quirk et al. (1985:156[–157]):

In many environments, a formal distinction between indicative (“base”) and subjunctive verb forms has been lost. The copula is the only verb which throughout its paradigm still has a distinct subjunctive form, viz. be. For all other verbs, the subjunctive equals the base form of a verb, i.e. it is distinctly recognizable (formally different from a non-subjunctive) only in certain environments: in the 3rd person singular in the present tense (because the subjunctive form lacks the verbal -s); in dependent clauses after past tense verbs in the main clause (because there is no tense concord, or “backshifting”); and, finally, in negatives (recognizable by the form not + infinitive without do-support). The most common functionally and semantically equivalent alternative of the subjunctive is the use of the modal should, said to be more common than the subjunctive especially in British English.

The search structure for the subjunctives in the ICE-PHI and ICE-SIN printed data follows that laid down by Schneider (2000:124–125), reproduced almost verbatim below.
As he notes, corpus linguistics is built on the need to identify a search structure so as not to have to read the whole corpus (an impossible undertaking); thus it is possible that this search structure missed a few tokens of the subjunctive, but they would be very few. To begin, define and formally identify the environments likely to trigger mandative subjunctives in dependent clauses—i.e., compile a list of suasive verbs and their derived nouns, such as demand, order, insist(ence), suggest(ion), and recommend(ation), as well as adjectives with a suasive meaning like imperative, important, and essential. The list used here is based on Schneider (2000) and Quirk et al. (1985:157, 1182). In addition, identify and clearly define the alternative formal realization categories (possible variants). Schneider identifies six of these, listed in (1) through (6)—corresponding to his types (a) through (f), respectively.
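As a rough illustration only, the first step above (locating candidate environments from a list of suasive triggers) and the norming to one million words described earlier might be sketched in Python as follows. This is not the procedure used for the paper. The trigger list is just the subset named in the text, and the regular expression, the function names `find_candidates` and `norm_per_million`, and the sample sentences are all hypothetical.

```python
import re

# Illustrative subset of suasive triggers named in the text; the full list
# (based on Schneider 2000 and Quirk et al. 1985) is longer.
TRIGGER_STEMS = [
    "demand", "order", "insist", "suggest", "recommend",  # verbs and derived nouns
    "imperative", "important", "essential",               # adjectives
]

# Assumption: a trigger stem with any suffix, followed by "that" either
# directly or after at most three intervening words.
TRIGGER_RE = re.compile(
    r"\b(?:" + "|".join(TRIGGER_STEMS) + r")\w*\b(?:\s+\w+){0,3}?\s+that\b",
    re.IGNORECASE,
)


def find_candidates(sentences):
    """Return the sentences that contain a suasive trigger followed by 'that'."""
    return [s for s in sentences if TRIGGER_RE.search(s)]


def norm_per_million(raw_count, corpus_size=300_000, basis=1_000_000):
    """Norm a raw frequency to a common basis (here 300,000 words to one million)."""
    return raw_count * basis / corpus_size


# Invented sample sentences, not corpus data.
SAMPLE = [
    "The committee recommended that he be reinstated.",
    "It is essential that every member attend.",
    "She ordered a coffee.",
]

hits = find_candidates(SAMPLE)        # the first two sentences
normed = norm_per_million(len(hits))  # raw count scaled to one million words
```

A search of this kind only retrieves candidate contexts. Each hit would still have to be read and classified by hand into one of the realization categories that Schneider identifies.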