Google Research

Multilingual Word Embeddings using Multigraphs

https://arxiv.org/abs/1612.04732 (2015)

Abstract

We present a family of neural-network–inspired models for computing continuous word representations, specifically designed to exploit monolingual and multilingual text, with or without annotations (syntactic dependencies, word alignments, etc.). We find that this framework allows us to train embeddings with significantly higher accuracy on syntactic and semantic compositionality, as well as on multilingual semantic similarity, compared to previous models. We also show that some of these embeddings can be used to improve how a state-of-the-art machine translation system handles words outside the vocabulary of the parallel training data.
