Universal Dependencies v1: A Multilingual Treebank Collection

Joakim Nivre
Marie-Catherine de Marneffe
Filip Ginter
Yoav Goldberg
Jan Hajič
Christopher D. Manning
Ryan McDonald
Sampo Pyysalo
Natalia Silveira
Reut Tsarfaty
Daniel Zeman
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)

Abstract

Cross-linguistically consistent annotation is necessary for sound comparative evaluation and cross-lingual learning experiments. It is also useful for multilingual system development and comparative linguistic studies. Universal Dependencies is an open community effort to create cross-linguistically consistent treebank annotation for many languages within a dependency-based lexicalist framework. In this paper, we describe v1 of the universal guidelines, the underlying design principles, and the currently available treebanks for 33 languages.