Unsupervised Structure Prediction with Non-Parallel Multilingual Guidance

Shay B. Cohen
Noah A. Smith
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics (2011)

Abstract

We describe a method for prediction of linguistic structure in a language for which only unlabeled data is available, using annotated data from a set of one or more helper languages. Our approach is based on a model that locally mixes between supervised models from the helper languages. Parallel data is not used, allowing the technique to be applied even in domains where human-translated texts are unavailable. We obtain state-of-the-art performance for two tasks of structure prediction: unsupervised part-of-speech tagging and unsupervised dependency parsing.
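As a rough illustration of what "locally mixes" could mean, the sketch below builds each conditional distribution of a target-language model as its own convex combination of the corresponding distributions from the helper-language models. This is an assumption made for illustration, not the authors' implementation: the function name `mix_locally`, the array layout, and the toy numbers are all hypothetical, and in the unsupervised setting the mixture weights would be learned from unlabeled target-language data rather than fixed by hand.

```python
import numpy as np

def mix_locally(helper_dists, weights):
    """Combine helper-language distributions one context at a time.

    helper_dists: shape (L, K, V), L helper languages, K conditioning
                  contexts (e.g. previous tag), V outcomes; each
                  helper_dists[l, k] sums to 1.
    weights:      shape (K, L); row k holds the mixture weights used
                  for context k and sums to 1. (Hypothetical layout.)
    Returns an array of shape (K, V) whose rows are distributions.
    """
    helper_dists = np.asarray(helper_dists, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # mixed[k, v] = sum over l of weights[k, l] * helper_dists[l, k, v]
    return np.einsum("kl,lkv->kv", weights, helper_dists)

# Toy example: two helper languages, two contexts, two outcomes.
helpers = [
    [[0.8, 0.2], [0.5, 0.5]],  # helper language 0
    [[0.4, 0.6], [0.1, 0.9]],  # helper language 1
]
# Context 0 mixes both helpers equally; context 1 uses helper 0 only.
weights = [[0.5, 0.5], [1.0, 0.0]]

mixed = mix_locally(helpers, weights)
```

Because each context has its own weight vector, the target model can lean on different helper languages for different parts of the grammar, which is the sense of "local" assumed here.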
