Google Research

A Feature-Rich Constituent Context Model for Grammar Induction

Proceedings of the Association for Computational Linguistics (2012)

Abstract

We present LLCCM, a log-linear variant of the constituent context model (CCM) of grammar induction. LLCCM retains the simplicity of the original CCM but extends robustly to long sentences. On sentences of up to length 40, LLCCM outperforms CCM by 13.9% bracketing F1 and outperforms a right-branching baseline in regimes where CCM does not.
