Jump to Content
Amarnag Subramanya

Amarnag Subramanya

Amarnag Subramanya is a senior research scientist in the machine learning & natural language processing group at Google Research. He received his PhD from University of Washington, Seattle in 2009. His interests include, semi-supervised learning, graphical models and their applications to various problems in natural language and speech.
Authored Publications
Google Publications
Other Publications
Sort By
  • Title
  • Title, descending
  • Year
  • Year, descending
    Preview abstract We describe an LSTM-based model which we call Byte-to-Span (BTS) that reads text as bytes and outputs span annotations of the form [start, length, label] where start positions, lengths, and labels are separate entries in our vocabulary. Because we operate directly on unicode bytes rather than language-specific words or characters, we can analyze text in many languages with a single model. Due to the small vocabulary size, these multilingual models are very compact, but produce results similar to or better than the state-of-the-art in Part-of-Speech tagging and Named Entity Recognition that use only the provided training datasets (no external data sources). Our models are learning “from scratch” in that they do not rely on any elements of the standard pipeline in Natural Language Processing (including tokenization), and thus can run in standalone fashion on raw text. View details
    Preview abstract Entity resolution is the task of linking each mention of an entity in text to the corresponding record in a knowledge base (KB). Coherence models for entity resolution encourage all referring expressions in a document to resolve to entities that are related in the KB. We explore attention-like mechanisms for coherence, where the evidence for each candidate is based on a small set of strong relations, rather than relations to all other entities in the document. The rationale is that document-wide support may simply not exist for non-salient entities, or entities not densely connected in the KB. Our proposed system outperforms state-of-the-art systems on the CoNLL 2003, TAC KBP 2010, 2011 and 2012 tasks. View details
    Plato: A Selective Context Model for Entity Resolution
    Michael Ringgaard
    Transactions of the Association for Computational Linguistics, vol. 3 (2015), pp. 503-515
    Preview abstract We present Plato, a probabilistic model for entity resolution that includes a novel approach for handling noisy or uninformative features,and supplements labeled training data derived from Wikipedia with a very large unlabeled text corpus. Training and inference in the proposed model can easily be distributed across many servers, allowing it to scale to over 10^7 entities. We evaluate Plato on three standard datasets for entity resolution. Our approach achieves the best results to-date on TAC KBP 2011 and is highly competitive on both the CoNLL 2003 and TAC KBP 2012 datasets. View details
    Preview abstract Cross-document coreference, the task of grouping all the mentions of each entity in a document collection, arises in information extraction and automated knowledge base construction. For large collections, it is clearly impractical to consider all possible groupings of mentions into distinct entities. To solve the problem we propose two ideas: (a) a distributed inference technique that uses parallelism to enable large scale processing, and (b) a hierarchical model of coreference that represents uncertainty over multiple granularities of entities to facilitate more effective approximate inference. To evaluate these ideas, we constructed a labeled corpus of 1:5 million disambiguated mentions in Web pages by selecting link anchors referring to Wikipedia entities. We show that the combination of the hierarchical model with distributed inference quickly obtains high accuracy (with error reduction of 38%) on this large dataset, demonstrating the scalability of our approach. View details
    Efficient Graph-Based Semi-Supervised Learning of Structured Tagging Models
    Proceedings of the 2010 Conference on Empirical Methods on Natural Language Processing (EMNLP '10)
    Distributed MAP Inference for Undirected Graphical Models
    Sameer Singh
    Andrew McCallum
    Workshop on Learning on Cores, Clusters and Clouds (LCCC), Neural Information Processing Society (NIPS) (2010)
    Large Scale Graph Transduction
    Jeff Bilmes
    NIPS 2009 Workshop on Large-Scale Machine Learning: Parallelism and Massive Datasets, NIPS
    Preview abstract We consider the issue of scalability of graph-based semi-supervised learning (SSL) algorithms. In this context, we propose a fast graph node ordering algorithm that improves parallel spatial locality by being cache cognizant. This approach allows for a linear speedup on a shared-memory parallel machine to be achievable, and thus means that graph-based SSL can scale to very large data sets. We use the above algorithm an a multi-threaded implementation to solve a SSL problem on a 120 million node graph in a reasonable amount of time. View details
    Semi-Supervised Learning with Measure Propagation
    Jeff Bilmes
    Journal of Machine Learning Research (JMLR), vol. 12 (2011), pp. 3311-3370