Class 2: Corpus-Based Studies on Ellipsis and English Gapping
Empirical Approaches to Elliptical Constructions
Gabriela Bîlbîie (University of Bucharest) and Anne Abeillé (Université Paris Diderot-Paris 7)
LSA 2017 Linguistic Institute, 10 July 2017, University of Kentucky

Class 2: Content
1 Corpus-based studies motivation
2 Previous corpus studies
    Meyer (1995)
    Greenbaum & Nelson (1999)
    Harbusch & Kempen (2011)
3 PTB and ellipsis annotation
4 Gapping in PTB
    Missing material
    Remnants

Ellipsis: a challenge for grammar
Ellipsis: a form/meaning mismatch (significatio ex nihilo)
1 Part of the material necessary for the interpretation is missing in the syntactic structure ('incomplete' syntax);
2 the missing material is recovered from an antecedent in the context.
Descriptive problem: A wide range of elliptical constructions, classified on the basis of several criteria, e.g. syntactic function of the missing material (head or dependent), syntactic context (coordination, subordination; dialogue), ellipsis directionality (forward vs. backward ellipsis).
⇒ Sometimes, unstable terminology.
Theoretical problem: A plethora of competing analyses, which differ with respect to the level at which reconstruction of the missing material takes place: syntactic reconstruction vs. semantic reconstruction.
⇒ Unsolved theoretical problems.

Gathering data on ellipsis
In the literature on ellipsis, the large majority of examples are constructed data, based on introspective acceptability judgments. This very often leads to significant variation in acceptability judgments across speakers, and sometimes even to contradictory data.
The reliability of introspective acceptability judgments has recently been called into question (Sprouse et al. 2010, Sprouse & Almeida 2012, Gibson & Fedorenko 2013); cf. weak methodological standards in linguistics (Gibson & Fedorenko 2013):
    Confirmation bias on the part of the researcher (bias in favor of the predicted result).
    Confirmation bias on the part of the participants (if they are linguists, biased by their own hypotheses).

Problems related to introspective methods
Even if the acceptability judgment on a given example is correct, we do not always have clear intuitions about the source of the unacceptability. If only a few examples (sometimes just one) of a given type are used as the basis for the judgment, it is especially unclear whether low acceptability is due to one factor rather than another. E.g. specific lexical items can play a role, as can discourse constraints. Many subtle factors of usage influence ease of processing and consequently acceptability.
Factors influencing acceptability:
    Grammaticality
    Complexity
    Plausibility
    Lexical semantic properties of the lexical items chosen
    Frequency of lexical items and of sequences of lexical items
    Various usage preferences

Usage preferences impact
Usage Preferences (UPs) can affect any subdomain of linguistic competence: syntax, semantics, pragmatics, morphology, etc.
Violating UPs can reduce acceptability independently of any processing difficulties. UP violations can be cumulative, leading to strong unacceptability without any violation of linguistic constraints.

An illustration: the verbal anaphor do so
General assumptions:
    Do so does not allow stative antecedents (Lakoff & Ross 1976).
    Do so does not allow non-action event antecedents (Culicover & Jackendoff 2005).
(1) a. *Bill knew the answer, and Harry did so, too. (Lakoff & Ross 1976)
    b. *Robin dislikes Ozzie, but Leslie doesn't do so. [Stative, Culicover & Jackendoff 2005]
    c. ?*Robin fell out the window, but Leslie didn't do so. [Non-action event, Culicover & Jackendoff 2005]

Attested examples of do so with stative antecedents:
(2) a. The basic idea is that whenever the relation of complementary distribution holds between phones belonging to a common phoneme, it does so because the phonetic value of the phoneme depends upon the phonetic environment in which it occurs. [Stative, in Fodor, Bever and Garrett, The Psychology of Language, cited by Michiels 1978]
    b. [Lanchester brings] his singular narrative ease to a historical story that sniffs of a quiet, personalized epic, but does so beautifully, eschewing the dripping drama so often wrongly associated with books that trace more than a few decades. [Stative, NYT, cited by Houser 2010]
Paradox: Why do constructed examples with do so and a stative antecedent seem to be ungrammatical when such examples are attested in spontaneous language use?

Usage preferences for finite do so, as evidenced by corpus investigation:
UP1 Finite do so very strongly prefers non-stative antecedents (98% of cases according to Houser 2010).
UP2 Finite do so very strongly prefers to refer to the same state of affairs as its antecedent, and hence to have the same subject as its antecedent (98% of cases according to Miller 2011).
UP3 Finite do so prefers to occur with a non-contrastive adjunct (83% of cases according to Miller 2011).
It is easy to find corpus examples violating one UP. It is very difficult to find examples violating two or three UPs.

Acceptable examples respect two of the three UPs:
(3) a. The basic idea is that whenever the relation of complementary distribution holds between phones belonging to a common phoneme, it does so because the phonetic value of the phoneme depends upon the phonetic environment in which it occurs. [UP1–, UP2+, UP3+]
    b. [Lanchester brings] his singular narrative ease to a historical story that sniffs of a quiet, personalized epic, but does so beautifully, eschewing the dripping drama so often wrongly associated with books that trace more than a few decades. [UP1–, UP2+, UP3+]
    c. Thus, players were more likely to behave positively if the team's spectators and coaches did so as well. (COCA, Acad) [UP1+, UP2–, UP3+]

Unacceptable examples violate all three UPs:
(4) a. ↓Bill knew the answer, and Harry did so, too.
    b. ↓Robin dislikes Ozzie, but Leslie doesn't do so.
    c. ↓Robin fell out the window, but Leslie didn't do so.
These examples are unacceptable because they violate all three UPs, but they are grammatical.
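
As a rough illustration of why examples violating several UPs are so hard to find in corpora, the percentages reported above can be multiplied together. The sketch below assumes that the three preferences are statistically independent, which the cited studies do not claim, and it combines figures from two different corpus studies; the function name joint_rate and the dictionary layout are illustrative only.

```python
# Share of corpus occurrences of finite "do so" that RESPECT each usage
# preference, as reported above (UP1: Houser 2010; UP2, UP3: Miller 2011).
respect = {"UP1": 0.98, "UP2": 0.98, "UP3": 0.83}


def joint_rate(rates, violated):
    """Expected share of occurrences that violate exactly the UPs in
    `violated` and respect all the others.

    ASSUMPTION (not from the source): the preferences are independent,
    so the individual rates can simply be multiplied.
    """
    p = 1.0
    for name, rate in rates.items():
        p *= (1.0 - rate) if name in violated else rate
    return p


only_up1 = joint_rate(respect, {"UP1"})
all_three = joint_rate(respect, {"UP1", "UP2", "UP3"})

print(f"violating only UP1:  {only_up1:.4%}")
print(f"violating all three: {all_three:.4%}")
```

Under this independence assumption, an example violating only UP1 is expected in roughly 1.6% of occurrences, whereas one violating all three is expected in about 7 out of 100,000. The real figures may differ if the preferences are correlated, but the order of magnitude is consistent with the observation that such examples are very difficult to find.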
The down-arrow symbol (↓) in (4) indicates unacceptability due to usage-preference violation.
⇒ Gradience in acceptability and grammaticality.

Methodological importance of UPs:
Coming back to Lakoff & Ross (1976) on do so: they model their examples on typical VP ellipsis (VPE) examples. Specifically, out of 33 example sentences with do so, 27 have contrasting subjects. Among these are all of the sentences that they use to argue that do so cannot have stative antecedents.
(5) a. Mary likes apples and Jane does too.
    b. *Bill knew the answer, and Harry did so, too. (Lakoff & Ross 1976)
    c. Bill knew the answer. He did so because he had read an article on the subject in the paper the day before.
Since Lakoff & Ross (1976), this unnatural pattern of usage has made its way into many articles and textbooks, in arguments for VP constituency and for the complement/adjunct distinction:
    in textbooks, e.g. Radford (1988), Haegeman (1991), Haegeman & Guéron (1999);
    in articles, e.g. Sobin (2008) (out of 32 examples of do so, 26 have contrasting subjects).

The moral of the story
A better understanding of usage is crucial to the interpretation of acceptability judgments.
Corpus research is crucial to understanding usage preferences.
Working on a small number of invented examples can lead to serious misinterpretations as to the actual source of acceptability differences.

The importance of corpora for ellipsis studies
Data issues: Using empirically attested data avoids the problems related to introspective data and to the variability in acceptability judgments across speakers.