Inducing probabilistic CCG grammars from logical form with higher-order unification

Tom Kwiatkowski
Luke Zettlemoyer
Sharon Goldwater
Mark Steedman
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

Abstract

This paper addresses the problem of learning to map sentences to logical form, given
training data consisting of natural language sentences paired with logical representations
of their meaning. Previous approaches have been designed for particular natural languages
or specific meaning representations; here we present a more general method. The approach
induces a probabilistic CCG grammar that represents the meaning of individual words
and defines how these meanings can be combined to analyze complete sentences. We
use higher-order unification to define a hypothesis space containing all grammars consistent
with the training data, and develop an online learning algorithm that efficiently
searches this space while simultaneously estimating the parameters of a log-linear parsing
model. Experiments demonstrate high accuracy on benchmark data sets in four languages
with two different meaning representations.
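The abstract's core idea, that a CCG lexicon pairs each word with a syntactic category and a logical-form fragment, and that a small set of combinators assembles these fragments into a sentence meaning, can be illustrated with a toy sketch. The following is a minimal, hypothetical example of forward application (X/Y Y => X) with lambda-calculus semantics; the lexicon, category strings, and function names are illustrative and are not the paper's implementation.

```python
# Toy sketch of CCG lexical entries and forward application.
# A lexical entry is a (category, semantics) pair; semantics are
# Python lambdas standing in for lambda-calculus terms.

def forward_apply(left, right):
    """Forward application: combine X/Y with Y to yield X,
    applying the functor's semantics to the argument's."""
    lcat, lsem = left
    rcat, rsem = right
    if "/" in lcat:
        res, arg = lcat.rsplit("/", 1)
        if arg == rcat:
            return (res, lsem(rsem))  # beta-reduce the lambda term
    return None  # categories do not combine

# Illustrative two-word lexicon (GeoQuery-style logical forms).
lexicon = {
    "Texas":   ("NP", "texas"),
    "borders": (r"(S\NP)/NP", lambda y: lambda x: f"borders({x},{y})"),
}

# "borders Texas" combines to a verb phrase (S\NP) whose semantics
# still awaits its subject argument.
cat, sem = forward_apply(lexicon["borders"], lexicon["Texas"])
```

Applying the resulting semantics to a subject such as `"oklahoma"` yields the complete logical form `borders(oklahoma,texas)`, mirroring how the induced grammar builds sentence meanings from word meanings.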