Modeling verbal inflection for English to German SMT

Anita Ramm, IMS, University of Stuttgart, Germany
Alexander Fraser, CIS, University of Munich, Germany

Proceedings of the First Conference on Machine Translation, Volume 1: Research Papers, pages 21–31, Berlin, Germany, August 11-12, 2016.

Abstract

German verbal inflection is frequently wrong in standard statistical machine translation approaches. German verbs agree with subjects in person and number, and they bear information about mood and tense. For subject–verb agreement, we parse German MT output to identify subject–verb pairs and ensure that the verb agrees with the subject. We show that this approach improves subject–verb agreement. We model tense/mood translation from English to German by means of a statistical classification model. Although our model shows good results on well-formed data, it does not systematically improve tense and mood in MT output. Reasons include the need for discourse knowledge, dependency on the domain, and stylistic variety in how tense/mood is translated. We present a thorough analysis of these problems.

1 Introduction

Statistical machine translation of English into German faces two main problems involving verbs: (i) correct placement of the verbs, and (ii) generation of the appropriate inflection for the verb.

The position of verbs in German and English differs greatly, and long-range reorderings are often needed to place the German verbs in the correct positions. Gojun and Fraser (2012) showed that the preordering approach applied to English-to-German SMT overcomes large problems with both missing and misplaced verbs.

Fraser et al. (2012) proposed an approach for handling inflectional problems in English to German SMT, focusing on the problems of sparsity caused by nominal inflection. However, they do not handle the verbs, ensuring neither that verbs appear in the correct position (which is a problem due to the highly divergent word order of English and German), nor that verbs are correctly inflected (problematic due to the richer system of verbal inflection in German). In many cases, verbs do not match their subjects (in person and number), which makes the translations difficult to understand. In addition to person and number, German verbal inflection also includes information about tense and mood. If these are wrong (i.e. do not correspond to the tense/mood in the source), very important information, such as the point in time and the modality of the action/state expressed by the verb, is incorrect. This can lead to a false understanding of the overall sentence.

In this paper, we reimplement the nominal inflection modeling for translation to German presented by Fraser et al. (2012) and combine it with the reordering of the source data (Gojun and Fraser, 2012). In a novel extension, we present a method for correction of the agreement errors, and an approach for modeling the translation of tense and mood from English into German. While the subject–verb agreement problems are dealt with successfully, modeling of tense/mood translation is problematic for many reasons, which we will analyze in detail.

In Section 2, we give an overview of the processing pipeline for handling verbal inflection. The method for handling subject–verb agreement errors is described in Section 3, while modeling of tense/mood translation is presented in Section 4. The impact of the proposed methods for modeling verbal inflection on the quality of the MT output is shown in Section 5. An extensive discussion of the problems related to modeling tense/mood is given in Section 6. Finally, future work is presented in Section 7.
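As a rough illustration of the agreement idea outlined in the introduction, the following Python sketch copies person and number from a subject onto its verb, appends tense and mood, and looks up the inflected form. It is only a sketch under stated assumptions: the names `TOY_GENERATOR`, `Subject`, `verb_features` and `inflect` are hypothetical, and the small lookup table merely stands in for a morphological generator (it is not the SMOR interface). The feature string format imitates labels such as 3.Pl.Past.Subj used in the paper's pipeline figure.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Hypothetical toy table standing in for a morphological generator.
# Keys are (verb stem, feature string); values are surface forms.
TOY_GENERATOR: Dict[Tuple[str, str], str] = {
    ("können", "3.Sg.Pres.Ind"): "kann",
    ("können", "3.Pl.Pres.Ind"): "können",
    ("können", "3.Sg.Past.Subj"): "könnte",
    ("können", "3.Pl.Past.Subj"): "könnten",
}


@dataclass(frozen=True)
class Subject:
    """Subject of a clause with its agreement features (assumed given by a parser)."""
    form: str
    person: int   # 1, 2 or 3
    number: str   # "Sg" or "Pl"


def verb_features(subject: Subject, tense: str, mood: str) -> str:
    """Combine the subject's person/number with the verb's tense and mood."""
    return f"{subject.person}.{subject.number}.{tense}.{mood}"


def inflect(stem: str, subject: Subject, tense: str, mood: str) -> Optional[str]:
    """Return the inflected verb, or None if the toy table has no entry."""
    return TOY_GENERATOR.get((stem, verb_features(subject, tense, mood)))


if __name__ == "__main__":
    subj = Subject(form="Medikamente", person=3, number="Pl")
    print(inflect("können", subj, "Past", "Subj"))
```

In this toy setting, the plural subject "Medikamente" selects the plural verb form regardless of which form the decoder originally produced. Tense and mood are passed in separately because the paper predicts them with a different component than the agreement features.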
© 2016 Association for Computational Linguistics

2 Overall architecture

2.1 Ensuring correct German verb placement

Different positions of verbs in English and German often require word movements over a large distance. This leads to two problems concerning the verbs in German translations generated by SMT systems: either the verbs are not generated at all, or they are placed incorrectly.

To ensure that our MT output contains the maximum number of (correctly placed) finite verbs, we reorder English prior to training and translation using a small set of reordering rules originally described by Gojun and Fraser (2012). The verbs in the English part of the training, tuning and testing data are moved to the positions typical for German, which increases the syntactic similarity of English and German sentences. We train an SMT system on the reordered English and apply it to the reordered English test set.

This approach yields good results in terms of the position of the verbs in German translations. However, the problem of incorrect verbal inflection is unresolved. In fact, the reordering makes the agreement problems even worse due to movements of verbs away from their subjects (cf. Section 3.1).

2.2 Inflection of the German SMT output

Fraser et al. (2012) proposed a method for handling nominal inflection for English to German SMT. They work with a stemmed representation of the German words in which certain morphological features such as case, number, etc. are omitted. After the translation step, morphological features are predicted for nominal stemmed words in the MT output using a set of pre-trained classifiers, and finally surface forms are generated, resulting in fully-inflected German MT output.

In their approach, the verbs are neither stemmed nor inflected, but instead handled as normal words. Thus, in the translation step, the decoder (in interaction with the German language model) decides on the inflected verb forms in the final MT output.

2.3 Adding verbal inflection modeling

As a baseline SMT system, we use a system trained on the reordered English sentences (cf. Section 2.1) and stemmed German data with nominal inflection modeling as a post-processing step (cf. Section 2.2). In our system, we extend the baseline by identifying finite verbs in the baseline MT output, predicting their morphological features and finally producing the correct inflected output (see Figure 1).

Verbal morphological features include information about person/number, as well as tense and mood. The modeling of tense/mood translation is particularly interesting: in this paper, we present a method to model the translation of English tense and mood into German considering all German tenses/moods in a single model. In addition, we present a detailed discussion which is, to our knowledge, the first deep analysis of this topic.

The processing pipeline is given in Figure 1. After translation of the reordered English input to a German stem-like representation, the nominal feature prediction is performed, followed by our novel verbal feature prediction. Finally, the entire German MT output is inflected by combining the stems and the predicted features to produce surface forms (normal words).

[Figure 1 content:
MT (reordered EN + stemmed DE + DE inflection generation): "new drugs might lung , ovarian cancer slow" -> "neue Medikamente konnte Lungen- und Eierstockkrebs verlangsamen"
Agreement: parse; find SV pairs; map subject morph to verb (Medikamente + können -> 3.Pl)
Tense/Mood: derive features; predict tense/mood with CRF (-> Past.Subj)
Generate: SMOR; stem + morph features (können + 3.Pl.Past.Subj -> könnten)]

Figure 1: Processing pipeline. The verbal inflection modeling consists of two components: (i) a component for deriving the agreement features person and number, and (ii) a component for predicting tense and mood. The inflected verbs are generated with SMOR (Schmid et al., 2004), a morphology generation tool for German.

3 Correction of the subject–verb agreement

3.1 Problem description

In many languages, the subject is located near the corresponding finite verb. However, in languages such as German, the subject might be very far from the verb. We extracted subject–verb pairs from German corpora and computed their distances. The results are summarized in Table 1.

Data        avg dist in words    >5 words
News        3.9                  24%
Europarl    3.7                  22%
Crawled     2.9                  15%

Table 1: Subject–verb distances in German texts.

News and Europarl are composed of more complex sentences than the corpus crawled from the internet. While the crawled data contain more sentences with smaller subject–verb distances, News and Europarl exhibit larger distances between subjects and finite verbs.

Although the average distance in words is rather small, there is a fair number of subject–verb pairs with a distance larger than 5 words (in Europarl 22%, in News 25%), which are problematic for training the translation system. Even for small distances, it is not guaranteed that the agreement is generated correctly, because appropriate translation phrases may be missing. Moreover, the German language model trained on the same data would probably have difficulty extracting n-grams which ensure the correct subject–verb agreement for all possible subject–verb combinations.

EN (reordered): that I[1.Sg.] that yesterday to you said[1.Sg.]
DE:             dass ich[1.Sg.] dir das gestern gesagt habe[1.Sg.]

Figure 2: Example of a subject–verb distance caused by the reordering of the English clause 'that I said that yesterday to you'.

3.2 Parsing for detection of subject–verb pairs

Agreement correction depends on correct identification of subject–verb pairs. Although we work with English parses where the subjects can be correctly identified in many cases, this information source does not seem to be sufficient. Syntactic divergences where the English subject does not correspond to the German subject are problematic.

Initially, we aimed at predicting agreement features. However, we were not able to build a classifier with satisfying results due to the problems mentioned above. We thus applied a method implemented in Depfix (Rosa et al., 2012). They parse the MT output, extract subject–verb pairs