Grammaticality Judgements, Intuitions and Corpora 175-015 Syntax

Catherine Lai

June 2004

1 Introduction

The syntactician’s toolkit relies heavily on grammaticality judgements. Tests for determining syntactic structure often involve transforming a language fragment and then checking the grammaticality of the result; distributional tests for constituency are one example. However, determining grammaticality is not always a straightforward task. Tests for grammaticality can be characterized by the two forms of evidence they use. Native speaker intuitions have been the preferred evidence over the past fifty years of syntactic inquiry. Frequency data obtained from corpora, on the other hand, has gained increasing support in recent times. In this essay, I review native speaker intuitions and corpus-derived frequencies and their appropriateness as data for grammaticality testing. Finally, I consider how they may be used together to form a stronger empirical base of evidence for grammaticality.

1.1 Competence and Performance

The divide between these two sets of evidence has been driven by the theory that language is autonomous within the human mind. The theory was propelled into the mainstream by Chomsky [1] in the 1950s (Chomsky, 1957). The corollary of this is that our use of language (performance) does not accurately reflect our internal knowledge of language (competence). Grammar is assumed to be part of competence. Corpus-derived data reflects performance and so is not relevant to questions of grammaticality. On the other hand, native speaker intuitions, unpolluted by performance factors, could be.

Not everyone accepts this as the truth. Corpus linguistic work on syntax continued through the peak of transformational grammar (McEnery & Wilson, 1996). Labov (1975) has argued the case for empiricism. More recently, Abney (1996) has argued eloquently that the competence-performance distinction is really just an ‘idealization of language for the sake of simplicity’. Similar thoughts have been expressed in the past, for example by Lakoff (1974). Despite this, mainstream linguistics has continued to accept intuitions as the primary data source for grammar and to reject corpus-based evidence (Schutze, 1996, pg. 35). It is with this in mind that we must consider the use of native speaker intuition.

[1] Note that this theory did not originate with Chomsky; see Derwing (1979).

2 Native Speaker Intuitions about Grammaticality

Using intuitions to test grammaticality is very simple on the surface. An informant’s intuition defines whether a sentence is grammatical. The linguist presents the sentence being tested and asks the appropriate question. This is consistent with a generative grammar that can take a string of words and output its grammaticality status (e.g. true or false). Since the real grammar of a language exists inside the native speaker, these intuitions should accurately reflect competence.

It is generally agreed that intuitions have enabled linguistics research to reach areas beyond those accessible to a purely corpus-driven approach (Labov, 1975; Newmeyer, 2003; Sampson, 1975). Moreover, the linguist can focus on relevant material with great ease and speed. In theory, this gives access to data from an infinitely large internal corpus. Questioning can also provide negative evidence about grammar, which is absent from corpora. However, the decisive reason to use intuitions is the link to competence. The strength of this link needs to be examined.

2.1 Sentence Judgements and Competence

The elicitation of intuitions is done via judgements of sentences, and these judgements are clearly affected by performance issues. Some sentences are clearly unacceptable while still grammatical. This can be caused by processing limitations, context and many other traditional performance factors. In these cases acceptability judgements are performance data. The line between what is ungrammatical and what is merely unacceptable is extremely blurred.

Reactions to this problem generally call for filtering of the affected data. Bever (see Schutze, 1996, pg. 31) suggests that ungrammaticality is only found when unacceptability cannot be explained by performance. According to Chomsky (1965), people are incapable of making judgements about grammaticality since they have no direct access to grammatical knowledge. However, intuitions and tests might shed light on the situation anyway. In any case, using sentence judgements to study competence means that performance factors need to be stripped away. This is not easy, especially since performance is not precisely defined. The linguist might appear highly qualified to provide judgements that take performance into account. However, allowing linguists to create the data used to test their hypotheses can result in more distortion than remedy.

2.2 Data Distortions

Native speaker intuitions are susceptible to bias. The source of intuitions has often been the linguist seeking to validate a theory. However, it has been shown that linguists have different ideas about language from non-linguists (Labov, 1975, pp. 14-18). In fact, Newmeyer (1983) claims that for the reasons above only linguists should produce intuitions. However, this ignores the fact that it can be easy to ignore counter-evidence (Sampson, 1975; Manning, 2003). As Schutze (1996, pp. 49-50) wisely quips: ‘Except in those cases where they fail to suit the linguist’s purpose, subjects’ intuitions are taken to reflect their true linguistic knowledge.’

When informants are non-linguists, elicitation of intuitions needs to be carefully controlled to make the data reliable and free of external factors (Botha, 1981, pg. 304). Cowart (1997) notes that linguistic background, knowledge of the experimenter’s intentions, overexposure to a sentence type and the length of sentences can all change the way informants behave. A myriad of methods have been suggested to limit the influence of the experimenter, for example asking the informant to transform sentences and seeing whether the part in question is changed (if it is grammatical it should not be) (Schutze, 1996, pg. 57). Indeed, Cowart (1997) shows that when experiments are designed to control variance, sentence judgements are reasonably reliable. Unfortunately, it does not appear that such strict methodologies are mainstream practice.

Evidence has been presented in Labov (1975) that informants may judge sentences as ungrammatical even if they frequently use them in everyday life. A corollary of this is that a person’s intuitions of grammaticality may differ from their actual internal grammar. The justification for using intuitions is highly dependent on their unique ties to internal grammar. If Labov’s evidence is accepted, then it is possible that intuitions are no closer to competence than corpus data.

2.3 Conflict Resolution

Resolving disputes over grammaticality is difficult when the deciding factor is an intuition. The generativist motto has been to avoid this altogether and deal only with clear cases when developing theories (Chomsky, 1957). Newmeyer (1983) claimed that most reported data disputes were actually the application of a grammar to an unclear case. However, Schutze (1996) presents cases where ignoring inter-speaker variation has indeed led to data disputes. Others, such as Abney (1996), have claimed that you can make almost anything grammatical if you try, so clear cases are few and far between.

To confuse matters more, evidence suggests that grammaticality is a continuous scale (Manning, 2003). Chomsky (1965, pp. 10-11) concedes that ‘grammaticalness is, no doubt, a matter of degree’. Breaking some rules is worse than breaking others. However, a scale for intuitions has not been agreed on, which makes it virtually impossible to compare results from different studies.

The key question is whether claims of grammaticality based on intuitions can be contradicted by other intuitions. Unless one set can be shown to be irrelevant or more contaminated by performance, the answer appears to be no. The danger is that there is a temptation to attribute any conflict to performance. The alternative is a stalemate that persists until the theory changes. This does not appear to be a satisfactory situation. Given this state of affairs, it seems well worth considering the usefulness of a corpus approach.

3 Frequency Data as Evidence of Grammaticality

Testing grammaticality with a corpus involves extracting frequencies of the sentence in question. Theoretically, a higher frequency means a higher probability of grammaticality, and a lower frequency a lower one. The situation where a sentence has not occurred in the corpus is discussed in section 3.2. The resulting distribution can easily be conditioned on context and meaning (Manning, 2003). More complicated methods (for example distributions over parts of speech) can also be used to determine grammaticality probabilistically.

The main advantage of this methodology is that the data is public and verifiable. Tests are repeatable, less dependent on the linguist, and can undergo greater scrutiny. This means that different approaches are easier to compare. The nature of corpora has changed since the objections of the 1950s. Machine-readable corpora are becoming larger and tools are being developed to make searching easier than ever. This increased usability has been one of the driving forces behind the resurgence of corpus-based linguistic methodologies (McEnery & Wilson, 1996).
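The idea of extracting frequencies and conditioning them on context can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three-sentence corpus is invented, the function names are hypothetical, and a real study would use a large corpus and a far richer notion of context than the immediately preceding word.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return all contiguous n-grams in a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def relative_frequency(corpus, pattern):
    """Share of same-length n-grams in the corpus that match `pattern`."""
    n = len(pattern)
    counts = Counter(g for sent in corpus for g in ngrams(sent, n))
    total = sum(counts.values())
    return counts[tuple(pattern)] / total if total else 0.0

def conditional_frequency(corpus, context, word):
    """Relative frequency of `word` immediately after `context` (a token sequence)."""
    n = len(context)
    following = Counter()
    for sent in corpus:
        for g in ngrams(sent, n + 1):
            if g[:n] == tuple(context):
                following[g[n]] += 1
    total = sum(following.values())
    return following[word] / total if total else 0.0

# Toy corpus, invented purely for illustration (an assumption, not real data).
toy_corpus = [
    "the dog chased the cat".split(),
    "the cat chased the dog".split(),
    "the dog slept".split(),
]

attested = relative_frequency(toy_corpus, ["the", "dog"])    # occurs in the corpus
unattested = relative_frequency(toy_corpus, ["dog", "the"])  # never occurs
after_the = conditional_frequency(toy_corpus, ["the"], "dog")
```

Because both the corpus and the counting procedure are explicit, a second researcher can rerun exactly the same test, which is the sense in which such data is public and verifiable.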

3.1 Frequencies of Performance

Accepting that frequency data reveals grammaticality requires a paradigm shift. While they contain some evidence of competence, corpora are definitely products of performance. Using frequencies to decide grammaticality amounts to agreeing that if a lot of people say it, then it must be grammatical. This narrows the gap between competence and performance considerably. Derwing (1979) is one of many who have suggested that this is exactly what needs to happen for linguistics to move forward, as ‘it is the language process - the language user’s competence to perform - which is the [...] of ultimate interest in language study’. Whether or not this is a correct view depends, as Derwing suggests, on what the ultimate goal is. What can be said is that frequency data is about performance. In this sense, the problem of external interfering factors is much less severe than for intuitions.

3.2 Data Distortion

If we ignore the battle over performance for a moment, we see that finite size is a serious drawback of corpora. Corpora can be skewed. This continues to be remedied by the collection of more data. However, Chomsky (1957) is right that perfectly grammatical sentences may occur with low or zero frequency. For example, swear words are very unlikely to occur in the Wall Street Journal component of the Penn Treebank (Marcus et al., 1994), though they are likely to be heard frequently on Wall Street. When a sentence does not occur in a corpus we must be careful about grammaticality judgements made about it. It must also be made clear that there is more to the statistical analysis of corpora than counting words.

The point of statistics is to extrapolate to the infinite from finite samples of data. Smoothing methods are designed to help decide whether a string has zero frequency because it is not valid or because it simply has not occurred yet (Jurafsky & Martin, 2000). As mentioned previously, statistical language modelling can be done at an abstract level, for example over parts of speech. Such a model will almost certainly accept Chomsky’s famous ‘green ideas’ (Chomsky, 1957).

As noted in section 2.2, the problem of skew also affects intuitions. Corpora have provided much evidence to contradict linguists’ intuitions (Manning, 2003; Sampson, 1996). It seems the skew associated with intuitions is more transient and less reliable.
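The claims about smoothing and abstract-level modelling in this section can be made concrete with a small sketch. It assumes a bigram model over part-of-speech tags with add-one (Laplace) smoothing, which is just one simple smoothing method and not necessarily the one any cited author would choose. The training sequences, the tag labels and the hand-assigned tags for the 'green ideas' sentence are all invented for illustration.

```python
from collections import Counter
import math

def train_tag_bigrams(tag_sequences):
    """Count tag bigrams and their histories, padding with boundary markers."""
    history, bigram, tagset = Counter(), Counter(), set()
    for tags in tag_sequences:
        padded = ["<s>"] + list(tags) + ["</s>"]
        tagset.update(padded)
        for prev, cur in zip(padded, padded[1:]):
            history[prev] += 1
            bigram[(prev, cur)] += 1
    return history, bigram, tagset

def log_prob(tags, history, bigram, tagset):
    """Add-one smoothed log probability of a tag sequence."""
    v = len(tagset)  # number of distinct tags, including boundary markers
    padded = ["<s>"] + list(tags) + ["</s>"]
    return sum(
        math.log((bigram[(p, c)] + 1) / (history[p] + v))
        for p, c in zip(padded, padded[1:])
    )

# Toy tagged "corpus": invented tag sequences, not drawn from any real treebank.
toy_tagged = [
    ["DET", "ADJ", "NOUN", "VERB", "ADV"],
    ["ADJ", "ADJ", "NOUN", "VERB"],
    ["DET", "NOUN", "VERB", "ADV"],
    ["ADJ", "NOUN", "VERB", "DET", "NOUN"],
]
model = train_tag_bigrams(toy_tagged)

# Hand-assigned tags (an assumption) for "colorless green ideas sleep furiously"
# and for the same words in reverse order.
green_ideas = ["ADJ", "ADJ", "NOUN", "VERB", "ADV"]
reversed_order = ["ADV", "VERB", "NOUN", "ADJ", "ADJ"]

score_green = log_prob(green_ideas, *model)
score_reversed = log_prob(reversed_order, *model)
```

Neither tag sequence occurs as a whole in the toy data, so raw sentence counts would treat them alike. The smoothed tag-level model still assigns both a non-zero probability, and it scores the 'green ideas' order higher because each of its tag transitions has been observed.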

3.2.1 Resolving Conflict

The type of data dispute deadlock mentioned in section 2.3 should not occur with frequency data. This is because the data collection method is much more open to scrutiny. If two researchers calculate different results on the same dataset, their methodologies must be different. Techniques can be compared and a decision can be made as to what the correct methodology should be. Moreover, statistical approaches are inherently graded and comparable. This approach is particularly useful for dealing with unclear cases. Instead of letting the grammar decide, we let the frequencies do the job. This does not mean that judgements built from frequency data will never be in dispute. It does mean that the primary data itself should not be the source of confusion.

4 Combining Frequencies and Intuitions

Frequency data and native speaker intuitions represent two fundamentally different ideologies. This does not mean they cannot be used together to test grammaticality. Each method can act as a sanity check for the other. If we cannot remove all performance factors from intuitions, it makes sense to compare that data to the results of a performance-only analysis. This is especially important as a check of whether informants’ sentence judgements really match their internal knowledge of language.

Consideration of corpus data may be our best chance of shedding light on unclear cases and data disputes. Consider the situation where a sentence’s grammaticality is under dispute but statistics show it is used by a large proportion of the language community. It seems likely that arguing for grammaticality is correct. Even if we do not believe this reflects true grammar, the fact that a particular sentence is uttered profusely should ring alarm bells. Conversely, McEnery & Wilson (1996, pg. 12) note that ‘Chomsky’s criticism that we would never find certain sentences or constructions in a corpus overlooked an important point. If we do not find them, this is an interesting and important comment on their frequency.’

The extent to which combination can take place probably still depends on what experimenters want to know about grammar. If the goal is to uncover a grammar divorced from its usage, the quote above will not mean that much. It will if the goal is to model and understand the entire process of production and comprehension. In any case, it is important to be able to evaluate judgements that conflict. A combined approach can be of assistance here.

5 Conclusion

Native speaker intuitions are often used as competence data. However, it appears that obtaining facts purely about competence is very hard, if not impossible. Elicitation methods are necessarily tainted by performance issues. Data based on sampling the population’s intuitions can be affected by misunderstandings about acceptability and grammaticality. If linguists try to avoid this by using only their own intuitions, they are at very real risk of introducing investigator bias into the results.

On the other hand, improvements in technology are remedying past complaints about corpus accessibility. Corpora are growing in size and so reflect language more accurately. The finiteness of corpora can be addressed with statistical techniques, and this extrapolation seems to be no more approximate than using intuitions. The open nature of the data makes conflict resolution much more objective. The remaining objection to using corpus data is fundamentally ideological.

Disputes about grammaticality cannot be resolved without reliable and comparable data. This is a major drawback of building theories on intuitions: results can be unfalsifiable. This is not such a problem with corpus-based data. However, an overriding question is whether the data is right for the question at hand. If we accept that performance always obscures our ability to determine grammaticality, corpus-based data will never be acceptable. If we accept that corpora do contain evidence about the structure of language, it seems worthwhile to consider using both approaches. Doing this provides a safeguard against the confounding and experimenter bias of intuitions. On the other hand, carefully collected intuitions can correct for the sparsity of corpus data. There seems to be no reason why this should not happen besides ideology.

References

S. Abney (1996). ‘Statistical Methods and Linguistics’. In J. Klavans & P. Resnik (eds.), The Balancing Act: Combining Symbolic and Statistical Approaches to Language. The MIT Press, Cambridge, MA.

R. P. Botha (1981). The Conduct of Linguistic Inquiry, chap. 9. Mouton Publishers, The Hague.

N. Chomsky (1957). Syntactic Structures. Mouton & Co., The Hague, 10th edn.

N. Chomsky (1965). Aspects of the Theory of Syntax. MIT Press.

W. Cowart (1997). Experimental Syntax: Applying Objective Methods to Sentence Judgements. SAGE Publications, London.

B. L. Derwing (1979). ‘Against Autonomous Linguistics’. In T. A. Perry (ed.), Evidence and Argumentation in Linguistics, pp. 163–189. Walter de Gruyter, Berlin.

D. Jurafsky & J. H. Martin (2000). Speech and Language Processing, chap. 6. Prentice Hall, New Jersey.

W. Labov (1975). What is a Linguistic Fact? Peter de Ridder Press, Lisse.

G. Lakoff (1974). ‘Interview’. In H. Parret (ed.), Discussing Language: Dialogues with Wallace L. Chafe, pp. 151–178. Mouton, The Hague.

C. D. Manning (2003). ‘Probabilistic Syntax’. In R. Bod, J. Hay, & S. Jannedy (eds.), Probabilistic Linguistics, chap. 8, pp. 289–342. Massachusetts Institute of Technology, Cambridge, Massachusetts.

M. P. Marcus, et al. (1994). ‘Building a Large Annotated Corpus of English: The Penn Treebank’. Computational Linguistics 19(2):313–330.

T. McEnery & A. Wilson (1996). Corpus Linguistics, chap. 1. Edinburgh Textbooks in Empirical Linguistics. Edinburgh University Press, Edinburgh.

F. J. Newmeyer (1983). Grammatical Theory: Its Limits and Its Possibilities. The University of Chicago Press, Chicago.

F. J. Newmeyer (2003). ‘Grammar is Grammar and Usage is Usage’. Language 79:682–707.

G. Sampson (1975). The Form of Language, chap. 4. Weidenfeld and Nicolson, London.

G. Sampson (1996). ‘From central embedding to corpus linguistics’. In J. Thomas & M. Short (eds.), Using Corpora for Language Research: Studies in Honour of Geoffrey Leech, chap. 2, pp. 14–26. Longman, London.

C. T. Schutze (1996). The empirical base of linguistics: grammaticality judgements and linguistic methodology. The University of Chicago Press, Chicago.
