
Developing a new grammar checker for English as a second language

Cornelia Tschichold, Franck Bodmer, Etienne Cornu, François Grosjean, Lysiane Grosjean, Natalie Kübler, Nicolas Léwy & Corinne Tschumi

Laboratoire de traitement du langage et de la parole, Université de Neuchâtel, Avenue du Premier-Mars 26, CH-2000 Neuchâtel

Abstract

In this paper we describe the prototype of a new grammar checker specifically geared to the needs of French speakers writing in English. Most commercial grammar checkers on the market today are meant to be used by native speakers of a language who have good intuitions about their own language competence. Non-native speakers of a language, however, have different intuitions and are very easily confused by false alarms, i.e. error messages given by the grammar checker when there is in fact no error in the text. In our project, aimed at developing a complete writing tool for the non-native speaker, we concentrated on building a grammar checker that keeps the rate of over-flagging down and on developing a user-friendly writing environment which contains, among other things, a series of on-line helps. The grammar checking component, which is the focus of this paper, uses island processing (or chunking) rather than a full parse. This approach is both rapid and appropriate when a text contains many errors. We explain how we use automata to identify multi-word units, detect errors (which we first isolated in a corpus of errors) and interact with the user. We end with a short evaluation of our prototype and compare it to three currently available commercial grammar checkers.

1 Introduction

Many word-processing systems today include grammar checkers which can be used to locate various grammatical problems in a text. These tools are clearly aimed at native speakers, even if they can be of some help to non-native speakers as well. However, non-native speakers make more errors than native speakers and their errors are quite different (Corder 1981). Because of this, they require grammar checkers designed for their specific needs (Granger & Meunier 1994).

The prototype we have developed is aimed at French native speakers writing in English. From the very start, we worked on the idea that our prototype would be commercialized. In order to find out users' real needs, we first conducted a survey among potential users and experts in the field concerning such issues as coverage and the interface. In addition, we studied which errors needed to be dealt with. To do this, we integrated the information on errors found in published lists of typical learner errors, e.g. Fitikides (1963), with our own corpus of errors obtained from English texts written by French native speakers. Some 27'000 words of text produced over 2'800 errors, which were classified and sorted. We have used this corpus to decide which errors to concentrate on and to evaluate the correction procedures developed. The following two tables give the percentages of errors found in our corpus, broken down by the major categories, followed by the subcategories pertaining to the verb.

Error type                   Percentage
spelling & pronunciation     10.3 %
adjectives                    5.3 %
adverbs                       4.4 %
nouns                        19.6 %
verbs                        24.5 %
word combinations             8.3 %
sentence                     27.6 %

Table 1: Percentage of errors by major categories

Error type                   Percentage
morphology                    1.2 %
agreement                     1.4 %
tenses                        8.8 %
lexicon                       8.0 %
phrase following the verb     5.1 %

Table 2: Percentage of verb errors by subcategories

The prototype includes a set of writing aids, a problem word highlighter and a grammar checker. The writing aids include two monolingual dictionaries and a bilingual dictionary (simulated), a verb conjugator, a small translating module for certain fixed expressions, and a comprehensive on-line grammar. The problem word highlighter is used to show all the potential lexical errors in a text. While we hoped that the grammar checker would cover as many different types of errors as possible, it quickly became clear that certain errors could not be handled satisfactorily with the checker, e.g. using library (based on French librairie) instead of bookstore in a sentence like I need to go to the library in order to buy a present for my father. Instead of flagging every instance of library in a text, something other grammar checkers often do, we developed the problem word highlighter. It allows the user to view all the problematic words at one glance and it offers help, i.e. explanations and examples for each word, on request. Potential errors such as false friends, confusions, foreign loans, etc. are tackled by the highlighter. The heart of the prototype is the grammar checker, which we describe below. Further details about the writing aids can be found in Tschumi et al. (1996).
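
The highlighter behaviour described above can be illustrated with a small sketch. The following Python fragment is only an illustration of the idea: the word list, the explanations and the function names are invented for the example and are not taken from the prototype.

# Sketch of a problem word highlighter for French speakers writing in English.
# The entries below are invented examples of false friends; the prototype
# derives its list from a corpus of learner errors.

PROBLEM_WORDS = {
    "library": "Possible false friend: French 'librairie' means 'bookstore'; "
               "a 'library' is a 'bibliotheque'.",
    "actually": "Possible false friend: French 'actuellement' means 'currently'.",
}

def highlight(text):
    """Return (position, word) pairs for tokens that may be lexical problems.
    The words are only highlighted; nothing is flagged as an error."""
    hits = []
    for i, token in enumerate(text.split()):
        word = token.strip(".,;:!?").lower()
        if word in PROBLEM_WORDS:
            hits.append((i, word))
    return hits

def help_for(word):
    """Explanation shown only when the user asks for it."""
    return PROBLEM_WORDS.get(word.lower(), "")

sample = "I need to go to the library in order to buy a present for my father."
for position, word in highlight(sample):
    print(position, word, "->", help_for(word))
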
2 The grammar checker

Texts written by non-native speakers are more difficult to parse than texts written by native speakers because of the number and types of errors they contain. It is doubtful whether a complete parse of a sentence containing these kinds of errors can be achieved with today's technology (Thurmair 1990). To get around this, we chose an approach using island processing. Such a method, similar to the chunking approach described in Abney (1991), makes it possible to extract from the text most of the information needed to detect errors without wasting time and resources on trying to parse an ungrammatical sentence fully. Chunking can be seen as an intermediary step between tagging and parsing, but it can also be used to get around the problem of a full parse when dealing with ill-formed text.

Once the grammar checker has been activated, the text which is to be checked goes through a number of stages. It is first segmented into sentences and words. The individual words are then looked up in the dictionary, which is an extract of CELEX (see Burnage 1990). It includes all the words that occur in our corpus of texts, with all their possible readings. For example, the word the is listed in our dictionary as a determiner and as an adverb; table has an entry both as a noun and as a verb. In this sense, our dictionary is not a simplified and scaled-down version of a full dictionary, but simply a shorter version. In the next stage, an algorithm based on neural networks (see Bodmer 1994) disambiguates all words which belong to more than one syntactic category. Furthermore, some multi-word units are identified and labeled with a single syntactic category, e.g. a lot of (PRON)¹. After this stage, island parsing can begin. A first step consists in identifying simple noun phrases. On the basis of these NPs, preprocessing automata assemble complex noun phrases and assign features to them whenever possible. In a second step, other preprocessing automata identify the verb group and assign tense, voice and aspect features to it. Finally, in a third step, error detection automata are run, some of which involve interacting with the user. Each of these three steps will be described below.

¹ This part of speech corresponds to CELEX's use of 'PRON'.
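
The pipeline just described can be summarised in a short sketch. The code below is a deliberately simplified illustration, not the prototype's implementation: the tiny dictionary, the tag names and the take-the-first-reading disambiguation rule are all assumptions made for the example (the prototype uses a CELEX extract and a neural-network disambiguator).

import re

# Each word lists all its possible readings, as in the CELEX-derived dictionary.
TINY_DICTIONARY = {
    "the": ["DET", "ADV"],
    "table": ["N", "V"],
    "stands": ["V", "N"],
    "here": ["ADV"],
}

def segment(text):
    """Segment a text into sentences and each sentence into words."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [re.findall(r"[A-Za-z']+", s) for s in sentences if s]

def lookup(word):
    """Return all readings of a word; unknown words get a single generic tag."""
    return TINY_DICTIONARY.get(word.lower(), ["UNKNOWN"])

def disambiguate(words):
    """Keep one syntactic category per word. Here we naively take the first
    reading; the prototype uses a neural-network disambiguator (Bodmer 1994)."""
    return [(w, lookup(w)[0]) for w in words]

def check(text):
    for words in segment(text):
        tagged = disambiguate(words)
        # Island processing would follow here:
        #   1. identify simple NPs, then complex NPs, and assign features
        #   2. identify the verb group and assign tense, voice and aspect
        #   3. run the error detection automata, some of them interactive
        print(tagged)

check("The table stands here.")
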
3 The preprocessing stage

The noun phrase parser identifies simple non-recursive noun phrases such as Det+Adj+N or N+N. The method used for this process involves an algorithm of the type described in Church (1988) which was trained on a manually marked part of our corpus. The module is thus geared to the particular type of second language text the checker needs to deal with. The resulting information is passed on to a preprocessing module consisting of a number of automata groups. The automata used here (as well as in subsequent modules) are finite-state automata similar to those described in Allen (1987) or Silberztein (1993). This type of automaton is well known for its efficiency and versatility.

In the preprocessing module, a first set of automata scans the text for noun phrases, identifies the head of each NP and assigns the features for person and number to it. Other sets of automata then mark temporal noun phrases, e.g. this past week or six months ago. In a similar fashion, some prepositional phrases are given specific features if they denote time or place, e.g. during that period, at the office. Still within the same preprocessing module, some recursive NPs are then assembled into more complex NPs, e.g. the arrival of the first group. Finally, human NPs are identified and given a special feature. This is illustrated in the following automaton:

Automaton 1
  [figure not reproduced]

Every automaton starts by looking for its anchor, an arc marked by "@" (not necessarily the first arc in the automaton). The above automaton first looks for a noun (inside a noun phrase) which occurs in the list of nouns denoting human beings. If it finds one, it puts the value NP_HUMAN in the register called NP_TYPE, a register which is associated with the NP as a whole.

The second group of preprocessing automata deals with the verb group. Periphrastic verb forms are analyzed for tense, aspect, and phase, and the result is stored in registers. The content of these registers can later be called upon by the detection automata. After these two preprocessing stages, the error detection automata can begin their work.
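
As an illustration of the register mechanism used by these preprocessing automata, here is a small sketch of an automaton that marks human NPs, as described above. Only the names NP_TYPE and NP_HUMAN come from the text; the NP data structure, the list of human nouns and the function name are assumptions made for the example.

# Sketch of a preprocessing automaton that marks human NPs. Its anchor arc
# looks for a head noun that appears in a list of nouns denoting human beings;
# when it succeeds, it writes NP_HUMAN into the NP's NP_TYPE register.

HUMAN_NOUNS = {"teacher", "student", "woman", "man", "doctor"}  # invented list

def mark_human_nps(noun_phrases):
    """noun_phrases: dicts with 'words', 'head' and a 'registers' dictionary."""
    for np in noun_phrases:
        if np["head"].lower() in HUMAN_NOUNS:   # anchor arc matched
            np["registers"]["NP_TYPE"] = "NP_HUMAN"
    return noun_phrases

nps = [
    {"words": ["the", "old", "teacher"], "head": "teacher", "registers": {}},
    {"words": ["the", "first", "group"], "head": "group", "registers": {}},
]
for np in mark_human_nps(nps):
    print(" ".join(np["words"]), np["registers"])
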
4 Error detection

Once important islands in the sentence have been identified, it becomes much easier to write detection automata which look for specific types of errors. Because no overall parse is attempted, error detection has to rely on well-described contexts. Such an approach, which reduces overflagging, also has the advantage of describing errors precisely. Errors can thus not only be identified but can also be explained to the user. Suggestions can also be made as to how errors can be corrected.

One of the detection automata that make use of the NPs which have been previously identified is the automaton used for certain cases of subject-verb agreement (e.g. *He never eat cucumber sandwiches):

Automaton 2
  [first arc not recoverable: an NP with the features third person singular]
  [CAT = ADV]*
  @[CAT = V, TNS = PRES, MORPH ≠ P3]

This automaton (simplified here) first looks for the content of its last arc, a verb which is not in the third person singular form of the present tense. From there it proceeds leftwards. The arc before the anchoring arc optionally matches an adverb. The next arc, the second from the top, matches an NP which has been found to have the features third person singular. If the whole of this detection automaton succeeds, a message appears which suggests that the verb be changed to the third person singular form.
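
The agreement check of Automaton 2 can be rendered as follows. The control flow (find the anchor, then move leftwards over an optional adverb to a third person singular NP) follows the description above; the token format and the exact feature spellings are assumptions made for this sketch.

# Sketch of the subject-verb agreement detection automaton. Tokens are
# (text, features) pairs as they might come out of the preprocessing stage.

def detect_agreement_errors(tokens):
    """Yield a message for every sequence NP(3rd sg) (ADV)* V(present, not 3rd sg)."""
    for i, (text, feats) in enumerate(tokens):
        # anchor arc: a present tense verb that is not third person singular
        if (feats.get("CAT") == "V" and feats.get("TNS") == "PRES"
                and feats.get("MORPH") != "P3"):
            j = i - 1
            # optionally skip adverbs, proceeding leftwards from the anchor
            while j >= 0 and tokens[j][1].get("CAT") == "ADV":
                j -= 1
            # the remaining arc must match a third person singular NP
            if (j >= 0 and tokens[j][1].get("CAT") == "NP"
                    and tokens[j][1].get("PERS") == 3
                    and tokens[j][1].get("NUM") == "SING"):
                yield ("The verb '%s' should probably be in the third person "
                       "singular form." % text)

sentence = [                       # *He never eat cucumber sandwiches
    ("He", {"CAT": "NP", "PERS": 3, "NUM": "SING"}),
    ("never", {"CAT": "ADV"}),
    ("eat", {"CAT": "V", "TNS": "PRES", "MORPH": "INF"}),
    ("cucumber sandwiches", {"CAT": "NP", "PERS": 3, "NUM": "PLUR"}),
]
for message in detect_agreement_errors(sentence):
    print(message)
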

5 User interaction

It is sometimes possible to detect an error without being completely sure about its identity or how to correct it. To deal with some of these cases, we have developed an interactive mode where the user is asked for additional information on the problem in question. The first step is to detect the structure where there might be an error. Here again, the preprocessing automata are put to use. For example, the following automaton looks for the erroneous order of constituents in a sentence containing a transitive verb (as in *We saw on the planet all the little Martians).

Automaton 3
  @[CAT = V, +TRANS]
  [remaining arcs not recoverable]

If this automaton finds a sequence that contains a transitive verb followed by a prepositional phrase with either the feature PP_TIME or PP_PLACE and that ends in a noun phrase which does not have the feature NP_TIME, then the following question is put to the user: "La séquence "all the little Martians" est-elle l'objet direct du verbe "saw"?" ("Is the sequence "all the little Martians" the direct object of the verb "saw"?") If the user presses the Yes-button, a reordering of the predicate is suggested so as to give V - NP - PP (this can be done automatically). If the No-button is pressed, a thank-you message appears instead and the checker moves on to the next error.

Interaction is also used in cases where one of two possible corrections needs to be chosen. If a French native speaker writes *the more great, it is not clear whether s/he intended to use a comparative or a superlative. This can be determined by interacting with the user, and an appropriate correction can then be proposed if need be.

While developing the automata which include interaction, we took great care not to ask too much of the user. So far we have used fewer than ten different question patterns, and the vocabulary has been restricted to terminology familiar to potential users. In addition, we only ask one question per interaction.

Interacting with the user can thus be a valuable device during error detection if it is used with caution. Too many interactions, especially if they do not lead to actual error correction, can be annoying. They should therefore be restricted to cases where there is a fair chance of detecting an error and should not be used to flag problematic lexical items such as all the tokens of the word library in a text.
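
A sketch of the interactive check described above is given below. The matched structure, the question and the suggested V - NP - PP reordering follow the example in the text; the ask() helper, the data layout and the feature spellings are assumptions made for the sketch.

# Sketch of the interactive check for constituent order after a transitive
# verb, as in *We saw on the planet all the little Martians.

def ask(question):
    """Stand-in for the checker's Yes/No dialogue box."""
    return input(question + " (y/n) ").strip().lower() == "y"

def check_transitive_order(verb, pp, np):
    """Each argument is a (text, features) pair for one island of the sentence."""
    # the automaton matches: transitive verb, then a time or place PP,
    # then a final NP which is not itself a time expression
    if (verb[1].get("TRANS")
            and pp[1].get("PP_TYPE") in ("PP_TIME", "PP_PLACE")
            and np[1].get("NP_TYPE") != "NP_TIME"):
        if ask('Is the sequence "%s" the direct object of the verb "%s"?'
               % (np[0], verb[0])):
            # the reordering to V - NP - PP can be done automatically
            return "%s %s %s" % (verb[0], np[0], pp[0])
    # otherwise thank the user and move on to the next error
    return None

verb = ("saw", {"CAT": "V", "TRANS": True})
pp = ("on the planet", {"CAT": "PP", "PP_TYPE": "PP_PLACE"})
np = ("all the little Martians", {"CAT": "NP", "NP_TYPE": "NP_HUM"})
print(check_transitive_order(verb, pp, np))
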

6 Evaluation

In a first evaluation using the approach described in Tschichold (1994), we compared our prototype to three commercial grammar checkers: Correct Grammar for Macintosh (version 2.0 of the monolingual English grammar checker developed by Lifetree Software and Houghton Mifflin), Grammatik for Macintosh (the English for French users 4.2 version of the Reference Software grammar checker) and WinProof for PC (version 4.0 of Lexpertise Linguistic Software's English grammar checker for French speakers). While the overall percentage of correctly detected errors is still rather low for all of the tested checkers, Table 3 clearly shows that our prototype does the best at keeping the overflagging down while doing relatively well on real errors.² A more extensive evaluation can be found in Cornu et al. (forthcoming).

Measure              Prototype   Correct Grammar   Grammatik   WinProof
Error detection      14.5%       10.5%             12.5%       34%
Error overflagging   43.5%       74%               75%         76%

Table 3: Error detection and error overflagging scores for the prototype and three commercial grammar checkers

² The error detection scores shown here are rather low compared to those of other evaluations (e.g. Bolt 1992, Granger & Meunier 1994). This can be explained by the fact that most of the test items used came from real texts produced by non-native speakers and were not specifically written by the evaluator to test a particular system. Real texts contain many lexical and semantic errors which cannot be corrected with today's grammar checkers.
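
The two measures in Table 3 can be read as simple percentages. The sketch below shows one plausible way of computing them; these definitions are our assumption for illustration and may not match the exact procedure of Tschichold (1994).

# Assumed definitions, for illustration only:
#   error detection    = real errors correctly flagged / all real errors in the texts
#   error overflagging = false alarms / all flags raised by the checker

def detection_rate(correctly_flagged, total_errors):
    return 100.0 * correctly_flagged / total_errors

def overflagging_rate(false_alarms, total_flags):
    return 100.0 * false_alarms / total_flags

# invented counts, chosen only to show the calculation
print("error detection:    %.1f%%" % detection_rate(29, 200))
print("error overflagging: %.1f%%" % overflagging_rate(87, 200))
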

Another way of evaluating our prototype is to see how closely it follows the guidelines for an efficient EFL/ESL grammar checker proposed by Granger & Meunier (1994). They describe four different criteria which should apply to grammar checkers used by non-natives. According to them, a good second language grammar checker should:

• be based on authentic learner errors: This first criterion is met with the error corpus we collected to help us identify the types of errors that need to be dealt with.

• have a core program and bilingual 'add-on' components: The general island processing approach we have adopted can be employed both for monolingual and bilingual grammar checkers. Full parse approaches, however, are not as easy to use on non-native speaker texts which contain many errors. 'Add-on' components will include the kinds of writing aids we have integrated into the prototype.

• be customizable: The grammar checker can be customized by turning on or off certain groups of automata relating to grammatical areas such as "subject-verb agreement" or "tense".

• indicate what it can and cannot do: This is dealt with both in flyers sent out and in the checker's manual. As our prototype is not yet commercialized, this last criterion does not yet apply here.

7 Concluding remarks

In this paper we have presented the prototype of a second language writing tool for French speakers writing in English. It is made up of a set of writing aids, a problem word highlighter and a grammar checker. Two characteristics of the checker make it particularly interesting. First, overflagging is reduced. This is done by moving certain semantic problems to the problem word highlighter, gearing the automata to specific errors and interacting with the user from time to time. Second, the use of island processing has kept processing time down to a level that is acceptable to users.

Because we have designed the prototype with a view to commercializing it, scaling up can be achieved quite easily. The dictionary already contains all the word classes needed, and we do not expect the disambiguation stage to become less efficient when a larger dictionary is used. In addition, the error processing technology that we have implemented allows all components to be expanded without degrading quality. Commercial development will involve increasing the size of the dictionary, adding automata to cover a wider range of errors, finishing some of the writing aids and integrating the completed product, i.e. the writing tool, into existing word processors.
As for the reusability of various components of our grammar checker for other language pairs, we believe that many of the errors in English are also made by speakers of other languages, particularly other Romance languages. Thus many of our automata could probably be reused as such. However, the writing aids and the problem word highlighter would obviously need to be adapted to the user's first language.

Acknowledgements

The work presented here was part of a project funded by the Swiss Committee for the Encouragement of Scientific Research (CERS/KWF 20.54.2).

References

Steven Abney. 1991. Parsing by chunks. In R. Berwick, S. Abney & C. Tenny (eds.), Principle-Based Parsing. Dordrecht: Kluwer.

Franck Bodmer. 1994. DELENE: un désambiguïsateur lexical neuronal pour textes en langue seconde. TRANEL (Travaux neuchâtelois de linguistique), 21:247-263.

Philip Bolt. 1992. An evaluation of grammar-checking programs as self-helping learning aids for learners of English as a foreign language. CALL, 5:49-91.

Gavin Burnage. 1990. CELEX - A Guide for Users. Centre for Lexical Information, University of Nijmegen, The Netherlands.

Kenneth Church. 1988. A stochastic parts program and noun phrase parser for unrestricted text. Proceedings of the Second Conference on Applied Natural Language Processing, Austin, Texas.

Pit S. Corder. 1981. Error Analysis and Interlanguage. Oxford: Oxford University Press.

Etienne Cornu, N. Kübler, F. Bodmer, F. Grosjean, L. Grosjean, N. Léwy, C. Tschichold & C. Tschumi. (forthcoming). Prototype of a second language writing tool for French speakers writing in English. Natural Language Engineering.

T.J. Fitikides. 1963. Common Mistakes in English. Harlow: Longman.

Sylviane Granger & F. Meunier. 1994. Towards a grammar checker for learners of English. In U. Fries, G. Tottie & P. Schneider (eds.), Creating and Using English Language Corpora. Amsterdam: Rodopi.

Max Silberztein. 1993. Dictionnaires électroniques et analyse automatique de textes. Paris: Masson.

Georg Thurmair. 1990. Parsing for grammar and style checking. COLING 90:365-370.

Cornelia Tschichold. 1994. Evaluating second language grammar checkers. TRANEL (Travaux neuchâtelois de linguistique), 21:195-204.

Corinne Tschumi, F. Bodmer, E. Cornu, F. Grosjean, L. Grosjean, N. Kübler, N. Léwy & C. Tschichold. 1996. Un logiciel qui aide à la correction en anglais. Les langues modernes, 4:28-41.

Terry Winograd. 1983. Language as a Cognitive Process: Syntax. Reading, Mass.: Addison-Wesley.
