Inducing probabilistic CCG grammars from logical form with higher-order unification
Abstract
This paper addresses the problem of learning to map sentences to logical form, given
training data consisting of natural language sentences paired with logical representations
of their meaning. Previous approaches have been designed for particular natural languages
or specific meaning representations; here we present a more general method. The approach
induces a probabilistic CCG grammar that represents the meaning of individual words
and defines how these meanings can be combined to analyze complete sentences. We
use higher-order unification to define a hypothesis space containing all grammars consistent
with the training data, and develop an online learning algorithm that efficiently
searches this space while simultaneously estimating the parameters of a log-linear parsing
model. Experiments demonstrate high accuracy on benchmark data sets in four languages
with two different meaning representations.
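
As a rough illustration of the restricted higher-order unification step mentioned in the abstract, the sketch below enumerates, for a logical form h, candidate pairs (f, g) such that applying f to g beta-reduces back to h. The term encoding (strings for atoms, ('app', ...) and ('lam', ...) tuples) and all function names are assumptions made for this toy example; it is not the authors' implementation.

```python
from itertools import count

_fresh = count()


def subterms(t):
    """Yield every subterm of t, including t itself."""
    yield t
    if isinstance(t, tuple):
        if t[0] == 'app':
            yield from subterms(t[1])
            yield from subterms(t[2])
        elif t[0] == 'lam':
            yield from subterms(t[2])


def names_in(t):
    """All atom names (constants and variables) occurring in t."""
    if isinstance(t, tuple):
        if t[0] == 'lam':
            return {t[1]} | names_in(t[2])
        return names_in(t[1]) | names_in(t[2])
    return {t}


def replace(t, target, var):
    """Replace every occurrence of the subterm `target` in t by `var`."""
    if t == target:
        return var
    if isinstance(t, tuple):
        if t[0] == 'app':
            return ('app', replace(t[1], target, var), replace(t[2], target, var))
        if t[0] == 'lam':
            return ('lam', t[1], replace(t[2], target, var))
    return t


def splits(h):
    """Enumerate (f, g) with f = lam x. h[g := x], so that f(g) = h.

    Subterms mentioning variables bound inside h are skipped to avoid
    variable capture; a full treatment would rename bound variables.
    """
    bound = {s[1] for s in subterms(h) if isinstance(s, tuple) and s[0] == 'lam'}
    for g in subterms(h):
        if g == h or names_in(g) & bound:
            continue
        x = f'x{next(_fresh)}'
        yield ('lam', x, replace(h, g, x)), g


# Example: factor the meaning of "New York borders Vermont", borders(ny, vt),
# into candidate function/argument pairs for two adjacent spans.
h = ('app', ('app', 'borders', 'ny'), 'vt')
for f, g in splits(h):
    print(f, 'applied to', g)
```

Each (f, g) pair corresponds to one way of dividing the meaning of a phrase between two adjacent substrings, which is how such splits can define a space of candidate lexical entries consistent with the training data.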