Google Research

Cross-Lingual Discriminative Learning of Sequence Models with Posterior Regularization

Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing


We present a framework for cross-lingual transfer of sequence information from a resource-rich source language to a resource-impoverished target language that incorporates soft constraints via posterior regularization. To this end, we use automatically word aligned bitext between the source and target language pair, and learn a discriminative conditional random field model on the target side. Our posterior regularization constraints are derived from simple intuitions about the task at hand and from cross-lingual alignment information. We show improvements over strong baselines for two tasks: part-of-speech tagging and named-entity segmentation.

Learn more about how we do research

We maintain a portfolio of research projects, providing individuals and teams the freedom to emphasize specific types of work