Google Research

Restricted Transfer Learning for Text Categorization

  • Rajhans Samdani
  • Gideon Mann
NIPS Workshop (2013) (to appear)


In practice, machine learning systems deal with multiple datasets over time. When the feature spaces of these datasets overlap, it is possible to transfer information from one task to another. Typically in transfer learning, all labeled data from a source task is saved to be applied to a new target task, raising privacy, memory, and scaling concerns. To ameliorate such concerns, we present a semi-supervised algorithm for text categorization that transfers information across tasks without storing the data of the source task. In particular, our technique learns low-dimensional, sparse, word-cluster-based features from the source task data and a massive amount of additional unlabeled data. Our algorithm is efficient, highly parallelizable, and outperforms competitive baselines by up to 9% on several difficult benchmark text categorization tasks.
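To make the core idea concrete, the following is a minimal illustrative sketch, not the paper's actual algorithm: words are clustered using co-occurrence statistics from source/unlabeled text, and only the compact word-to-cluster mapping is retained for the target task, so the source documents themselves never need to be stored. All function names here (`cooccurrence_vectors`, `kmeans`, `cluster_features`) are hypothetical.

```python
# Hypothetical sketch of transfer via word clusters: the only artifact
# carried from the source task is a word -> cluster mapping, which is
# far smaller than the source corpus and reveals no individual documents.
import random


def cooccurrence_vectors(docs, vocab):
    # One vector per word: counts of co-occurring words within a document.
    idx = {w: i for i, w in enumerate(vocab)}
    vecs = {w: [0.0] * len(vocab) for w in vocab}
    for doc in docs:
        words = [w for w in doc.split() if w in idx]
        for w in words:
            for u in words:
                if u != w:
                    vecs[w][idx[u]] += 1.0
    return vecs


def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20, seed=0):
    # Plain k-means over the word vectors; real systems would use a
    # scalable clustering method as the paper's parallelism suggests.
    rng = random.Random(seed)
    keys = list(points)
    centers = [points[w][:] for w in rng.sample(keys, k)]
    assign = {}
    for _ in range(iters):
        for w in keys:
            assign[w] = min(range(k), key=lambda c: dist2(points[w], centers[c]))
        for c in range(k):
            members = [points[w] for w in keys if assign[w] == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def cluster_features(doc, word2cluster, k):
    # Project a document onto k cluster-count features (normalized).
    feats = [0.0] * k
    n = 0
    for w in doc.split():
        if w in word2cluster:
            feats[word2cluster[w]] += 1.0
            n += 1
    return [f / n for f in feats] if n else feats


# Toy source corpus: two themes (finance, sports).
source_docs = [
    "stock market shares trade",
    "market trade stock price",
    "goal match team player",
    "team player match score",
]
vocab = sorted({w for d in source_docs for w in d.split()})
vecs = cooccurrence_vectors(source_docs, vocab)
word2cluster = kmeans(vecs, k=2)

# A target-task document is featurized with only the stored mapping.
target_feats = cluster_features("stock price trade", word2cluster, 2)
```

Note the design point: `word2cluster` is the entire transferred state; a target-task classifier trained on these k-dimensional features never touches the source documents.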
