Federated learning of out-of-vocabulary words

Abstract

We demonstrate that a character-level LSTM neural network is able to learn out-of-vocabulary (OOV) words for the purpose of expanding the vocabulary of a virtual keyboard for smartphones. We train such a model using a distributed, on-device learning framework called federated learning. High-frequency words can then be sampled from the generative model by drawing from the joint posterior directly. We study the feasibility of the approach in three different settings: (1) using stochastic gradient descent, on an anonymized dataset of snippets of user content; (2) using simulated federated learning, on a publicly available non-IID per-user dataset from a popular social networking website; (3) using federated learning, on data hosted on user mobile devices. The model is shown to achieve good recall and precision when compared to ground-truth OOV words in settings (1) and (2). With (3) we demonstrate the practicality of this approach by showing that we can learn meaningful OOV words without exporting sensitive user data to servers.
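The sample-then-count idea described above can be illustrated with a minimal sketch. It makes several assumptions that are not from the paper: a toy character bigram model stands in for the character-level LSTM, training is a plain in-memory count rather than federated learning, and every function name (`train_char_bigram`, `sample_word`, `top_sampled_words`, `precision_recall`) is hypothetical. It only shows how words might be drawn from a character-level generative model, ranked by how often they are sampled, and scored against ground-truth OOV words.

```python
import random
from collections import Counter

START, END = "^", "$"  # assumed word-boundary markers, not from the paper


def train_char_bigram(words):
    """Count character-to-character transitions, including boundary markers.

    Toy stand-in for the character-level LSTM; no federated training here.
    """
    counts = {}
    for word in words:
        chars = [START] + list(word) + [END]
        for prev, nxt in zip(chars, chars[1:]):
            counts.setdefault(prev, Counter())[nxt] += 1
    return counts


def sample_word(counts, rng, max_len=20):
    """Draw one word character by character until the end marker is sampled."""
    out, cur = [], START
    while len(out) < max_len:
        next_chars, weights = zip(*counts[cur].items())
        cur = rng.choices(next_chars, weights=weights)[0]
        if cur == END:
            break
        out.append(cur)
    return "".join(out)


def top_sampled_words(counts, n_samples, k, seed=0):
    """Sample n_samples words and return the k most frequently drawn ones."""
    rng = random.Random(seed)
    freq = Counter(sample_word(counts, rng) for _ in range(n_samples))
    return [word for word, _ in freq.most_common(k)]


def precision_recall(predicted, truth):
    """Precision and recall of predicted words against ground-truth words."""
    predicted, truth = set(predicted), set(truth)
    hits = len(predicted & truth)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(truth) if truth else 0.0
    return precision, recall


if __name__ == "__main__":
    observed = ["lol", "lol", "omg", "lol", "omg", "tbh"]  # made-up toy data
    model = train_char_bigram(observed)
    learned = top_sampled_words(model, n_samples=1000, k=3)
    print(learned, precision_recall(learned, {"lol", "omg", "tbh"}))
```

Because a bigram model conditions on only one previous character, it will also produce strings that never occurred in the toy data. A model with longer context, such as the LSTM the abstract describes, would be expected to concentrate more probability mass on real words.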