Learning to Map Sentences to Logical Form: Structured Classification with Probabilistic Categorial Grammars
Luke S. Zettlemoyer and Michael Collins
MIT CSAIL
[email protected], [email protected]

Abstract

This paper addresses the problem of mapping natural language sentences to lambda-calculus encodings of their meaning. We describe a learning algorithm that takes as input a training set of sentences labeled with expressions in the lambda calculus. The algorithm induces a grammar for the problem, along with a log-linear model that represents a distribution over syntactic and semantic analyses conditioned on the input sentence. We apply the method to the task of learning natural language interfaces to databases and show that the learned parsers outperform previous methods in two benchmark database domains.

The training data in our approach consists of a set of sentences paired with logical forms, as in this example:

    Sentence:     what states border Texas
    Logical Form: λx.state(x) ∧ borders(x, texas)

This is a particularly challenging problem because the derivation from each sentence to its logical form is not annotated in the training data. For example, there is no direct evidence that the word states in the sentence corresponds to the predicate state in the logical form; in general, there is no direct evidence of the syntactic analysis of the sentence. Annotating entire derivations underlying the mapping from sentences to their semantics is highly labor-intensive. Rather than relying on full syntactic annotations, we have deliberately formulated the problem in a way that requires a relatively minimal level of annotation.

Our algorithm automatically induces a grammar that maps sentences to logical form.
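As a rough illustration of the setup described above, the sketch below pairs a sentence with a lambda-calculus logical form and scores candidate parses with a conditional log-linear model, P(parse | sentence) proportional to exp(theta . f(sentence, parse)). This is a minimal sketch, not the paper's implementation: the names TrainingPair, features, and log_linear_probs are hypothetical, and the toy feature function merely counts the rules used in a candidate parse, standing in for the lexical and derivation features a real model would define.

```python
import math
from dataclasses import dataclass

@dataclass
class TrainingPair:
    sentence: str       # natural-language input
    logical_form: str   # lambda-calculus annotation of its meaning

# The kind of supervision assumed: only the sentence and its final
# logical form are given; the derivation linking them is not annotated.
pair = TrainingPair(
    sentence="what states border texas",
    logical_form="lambda x. state(x) & borders(x, texas)",
)

def features(sentence, parse):
    """Toy feature function f(sentence, parse): counts of the rules or lexical
    entries used in a candidate parse (here a parse is just a list of rule
    names; the sentence argument is kept for generality but unused)."""
    counts = {}
    for rule in parse:
        counts[rule] = counts.get(rule, 0) + 1
    return counts

def log_linear_probs(sentence, candidate_parses, weights):
    """Conditional log-linear model:
    P(parse | sentence) = exp(theta . f(sentence, parse)) / Z(sentence)."""
    scores = [
        sum(weights.get(k, 0.0) * v for k, v in features(sentence, p).items())
        for p in candidate_parses
    ]
    z = sum(math.exp(s) for s in scores)   # normalization over all candidates
    return [math.exp(s) / z for s in scores]

# Example: two hypothetical candidate parses for the sentence above.
parses = [["states->state", "texas->texas", "border->borders"],
          ["states->texas", "texas->state", "border->borders"]]
weights = {"states->state": 1.5, "texas->texas": 1.2, "border->borders": 0.8,
           "states->texas": -0.5, "texas->state": -0.5}
print(log_linear_probs(pair.sentence, parses, weights))
```

During learning, the weights would be adjusted so that, for each training pair, the analyses yielding the annotated logical form receive high probability, even though the derivations themselves are never observed.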