Multiple Admissibility in Language Learning: Judging Grammaticality Using Unlabeled Data

Anisia Katinskaia, Sardana Ivanova, Roman Yangarber
University of Helsinki, Department of Computer Science, Finland
Abstract

We present our work on the problem of detecting Multiple Admissibility (MA) in language learning. Multiple Admissibility occurs when more than one grammatical form of a word fits syntactically and semantically in a given context. In second-language education, in particular in intelligent tutoring systems and computer-aided language learning (ITS/CALL), systems generate exercises automatically. MA implies that multiple alternative answers are possible. We treat the problem as a grammaticality judgement task. We train a neural network to label sentences as grammatical or ungrammatical, using a "simulated learner corpus": a dataset with correct text and with artificial errors, generated automatically. While MA occurs commonly in many languages, this paper focuses on learning Russian.

because negative feedback for an acceptable answer misleads and discourages the learner.

We examine MA in the context of our language learning system, Revita (Katinskaia et al., 2018). Revita is available online² for second language (L2) learning beyond the beginner level. It is in use in official university-level curricula at several major universities. It covers several languages, many of which are highly inflectional, with rich morphology. Revita creates a variety of exercises based on input text materials, which are selected by the users. It generates exercises and assesses the users' answers automatically.

For example, consider the sentence in Finnish: "Ilmoitus vaaleista tulee kotiin postissa." ("Notice about elections comes to the house in the mail.")