A Machine Learning Approach to German Pronoun Resolution

Beata Kouchnir
Department of Computational Linguistics
Tübingen University
72074 Tübingen, Germany
[email protected]

Abstract

This paper presents a novel ensemble learning approach to resolving German pronouns. Boosting, the method in question, combines the moderately accurate hypotheses of several classifiers to form a highly accurate one. Experiments show that this approach is superior to a single decision-tree classifier. Furthermore, we present a standalone system that resolves pronouns in unannotated text by using a fully automatic sequence of preprocessing modules that mimics the manual annotation process. Although the system performs well within a limited textual domain, further research is needed to make it effective for open-domain question answering and text summarisation.

The system presented in this paper resolves German pronouns in free text by imitating the manual annotation process with off-the-shelf language software. As the availability and reliability of such software are limited, the system can use only a small number of features. The fact that most German pronouns are morphologically ambiguous poses an additional challenge.

The choice of boosting as the underlying machine learning algorithm is motivated both by its theoretical concept and by its performance on other NLP tasks. The fact that boosting uses the method of ensemble learning, i.e. combining the decisions of several classifiers, suggests that the combined hypothesis will be more accurate than one learned by a single classifier. On the practical side, boosting has distinguished itself by achieving good results with small feature sets.
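To make the idea of combining moderately accurate classifiers concrete, the following is a minimal sketch of one common boosting variant, AdaBoost with one-feature threshold classifiers ("stumps"). The choice of AdaBoost, the stump learner, and the names `train_stump`, `stump_predict`, `adaboost` and `predict` are assumptions made for illustration; they are not taken from the system described here, and the numeric toy feature bears no relation to the pronoun features used in the paper.

```python
import math

def stump_predict(stump, x):
    """Predict +1 or -1 with a one-feature threshold classifier."""
    feature, threshold, polarity = stump
    return polarity if x[feature] >= threshold else -polarity

def train_stump(X, y, w):
    """Return the stump with the lowest weighted error, and that error."""
    best_stump, best_err = None, None
    for feature in range(len(X[0])):
        for threshold in sorted({x[feature] for x in X}):
            for polarity in (1, -1):
                stump = (feature, threshold, polarity)
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(stump, xi) != yi)
                if best_err is None or err < best_err:
                    best_stump, best_err = stump, err
    return best_stump, best_err

def adaboost(X, y, rounds):
    """Train a weighted ensemble of stumps (illustrative AdaBoost sketch)."""
    n = len(X)
    w = [1.0 / n] * n                  # start with uniform example weights
    ensemble = []
    for _ in range(rounds):
        stump, err = train_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)    # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)  # vote weight of this stump
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified examples, decrease the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    """Combine the stumps by a weighted vote."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1
```

On a hypothetical one-dimensional toy set with inputs 0 to 5 and labels `[1, 1, -1, -1, 1, 1]`, no single stump can separate the classes, so one round of boosting gets four of the six examples right, whereas three rounds fit all six. This illustrates, on a very small scale, why a combined hypothesis can be more accurate than any one of its members.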