Amarnag Subramanya
Amarnag Subramanya is a senior research scientist in the machine learning & natural language processing group at Google Research. He received his PhD from the University of Washington, Seattle, in 2009. His interests include semi-supervised learning, graphical models, and their applications to problems in natural language and speech.
Authored Publications
Collective Entity Resolution with Multi-Focal Attention
Soumen Chakrabarti
Michael Ringgaard
ACL (2016)
Entity resolution is the task of linking each mention of an entity in text to the corresponding record in a knowledge base (KB). Coherence models for entity resolution encourage all referring expressions in a document to resolve to entities that are related in the KB. We explore attention-like mechanisms for coherence, where the evidence for each candidate is based on a small set of strong relations, rather than relations to all other entities in the document. The rationale is that document-wide support may simply not exist for non-salient entities, or entities not densely connected in the KB. Our proposed system outperforms state-of-the-art systems on the CoNLL 2003, TAC KBP 2010, 2011, and 2012 tasks.
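As a minimal sketch of the idea, assuming per-candidate relation scores have already been computed (the function name, the score array, and the choice of k below are hypothetical illustrations, not the paper's implementation):

```python
# Minimal sketch of attention over a few strong relations
# (hypothetical scores and names; not the paper's actual system).
import numpy as np

def multifocal_coherence(relation_scores: np.ndarray, k: int = 2) -> float:
    """Coherence evidence for one candidate entity.

    relation_scores[j] is the KB relatedness between this candidate and
    the current best candidate for the j-th other mention in the document.
    Instead of summing over all mentions, keep only the k strongest
    relations, so non-salient entities are not penalized for lacking
    document-wide support.
    """
    top_k = np.sort(relation_scores)[-k:]  # the k strongest relations
    return float(top_k.sum())

# Example: a candidate related strongly to two entities and weakly to the rest.
scores = np.array([0.9, 0.05, 0.8, 0.0, 0.1])
print(multifocal_coherence(scores, k=2))  # ~1.7: driven by the two strong links
```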
Multilingual Language Processing From Bytes
Dan Gillick
Cliff Brunk
Oriol Vinyals
NAACL (2016)
We describe an LSTM-based model which we call Byte-to-Span (BTS) that reads text as bytes and outputs span annotations of the form [start, length, label] where start positions, lengths, and labels are separate entries in our vocabulary. Because we operate directly on unicode bytes rather than language-specific words or characters, we can analyze text in many languages with a single model. Due to the small vocabulary size, these multilingual models are very compact, but produce results similar to or better than the state-of-the-art in Part-of-Speech tagging and Named Entity Recognition that use only the provided training datasets (no external data sources). Our models are learning “from scratch” in that they do not rely on any elements of the standard pipeline in Natural Language Processing (including tokenization), and thus can run in standalone fashion on raw text.
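The input/output representation is easy to illustrate. Below is a minimal sketch of the byte-level input and the [start, length, label] target encoding; the vocabulary item names are assumptions for illustration, and the actual BTS model is a sequence-to-sequence LSTM, not shown here:

```python
# Minimal sketch of the byte-level input and [start, length, label]
# output representation described above (illustrative only).
text = "George Washington lived here."
input_bytes = list(text.encode("utf-8"))  # vocabulary of at most 256 byte values

# A span annotation is emitted as three separate vocabulary items:
# a start position, a length, and a label.
span = {"start": 0, "length": 17, "label": "PER"}  # "George Washington"
target_sequence = [f"START_{span['start']}", f"LEN_{span['length']}", span["label"]]

print(input_bytes[:8])   # [71, 101, 111, 114, 103, 101, 32, 87]
print(target_sequence)   # ['START_0', 'LEN_17', 'PER']
```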
Plato: A Selective Context Model for Entity Resolution
Michael Ringgaard
Transactions of the Association for Computational Linguistics, 3 (2015), pp. 503-515
We present Plato, a probabilistic model for entity resolution that includes a novel approach for handling noisy or uninformative features, and supplements labeled training data derived from Wikipedia with a very large unlabeled text corpus. Training and inference in the proposed model can easily be distributed across many servers, allowing it to scale to over 10^7 entities. We evaluate Plato on three standard datasets for entity resolution. Our approach achieves the best results to date on TAC KBP 2011 and is highly competitive on both the CoNLL 2003 and TAC KBP 2012 datasets.
Large-Scale Cross-Document Coreference Using Distributed Inference and Hierarchical Models
Sameer Singh
Andrew McCallum
Association for Computational Linguistics (ACL) (2011)
Cross-document coreference, the task of grouping all the mentions of each entity in a document collection, arises in information extraction and automated knowledge base construction. For large collections, it is clearly impractical to consider all possible groupings of mentions into distinct entities. To solve the problem we propose two ideas: (a) a distributed inference technique that uses parallelism to enable large-scale processing, and (b) a hierarchical model of coreference that represents uncertainty over multiple granularities of entities to facilitate more effective approximate inference. To evaluate these ideas, we constructed a labeled corpus of 1.5 million disambiguated mentions in Web pages by selecting link anchors referring to Wikipedia entities. We show that the combination of the hierarchical model with distributed inference quickly obtains high accuracy (with an error reduction of 38%) on this large dataset, demonstrating the scalability of our approach.
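To make the hierarchical representation concrete, here is a minimal sketch in which mentions are grouped into sub-entities and sub-entities into entities, with a block proposal that moves whole sub-entities at once; all names and structures are illustrative assumptions, not the paper's code:

```python
# Minimal sketch of a hierarchical coreference representation
# (illustrative data structures; not the paper's implementation).
# Mentions of the same entity, split into sub-entities (e.g., one per page).
sub_entities = {
    "sub1": ["Barack Obama", "Obama"],
    "sub2": ["President Obama", "Obama"],
}
entities = {"e1": ["sub1"], "e2": ["sub2"]}  # an entity is a set of sub-entities

def propose_merge(entities, a, b):
    """One MCMC-style block proposal: merge entity b into entity a by
    moving all of b's sub-entities at once. Accepting or rejecting the
    proposal would depend on the model score, which is omitted here."""
    merged = dict(entities)
    merged[a] = merged[a] + merged.pop(b)
    return merged

print(propose_merge(entities, "e1", "e2"))  # {'e1': ['sub1', 'sub2']}
```

Moving sub-entities as blocks, rather than one mention at a time, is what lets approximate inference make large jumps through the space of groupings.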
Efficient Graph-Based Semi-Supervised Learning of Structured Tagging Models
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing (EMNLP '10)
Distributed MAP Inference for Undirected Graphical Models
Sameer Singh
Andrew McCallum
Workshop on Learning on Cores, Clusters and Clouds (LCCC), Neural Information Processing Systems (NIPS) (2010)
Large Scale Graph Transduction
Jeff Bilmes
NIPS 2009 Workshop on Large-Scale Machine Learning: Parallelism and Massive Datasets
We consider the issue of scalability of graph-based semi-supervised learning (SSL) algorithms. In this context, we propose a fast graph node ordering algorithm that improves parallel spatial locality by being cache cognizant. This approach makes a linear speedup achievable on a shared-memory parallel machine, and thus allows graph-based SSL to scale to very large data sets. We use the above algorithm and a multi-threaded implementation to solve an SSL problem on a 120 million node graph in a reasonable amount of time.
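As an illustration of why node ordering matters, here is a minimal sketch that relabels nodes with a BFS ordering so that neighbors end up close together in memory; the paper's cache-cognizant ordering algorithm is more sophisticated, and this is only an assumed simplification:

```python
# Minimal sketch: relabel graph nodes with a BFS ordering so that
# neighbors get nearby indices (an illustrative simplification of
# cache-cognizant node ordering).
from collections import deque

def bfs_order(adj):
    """Return a node ordering in which neighbors tend to be nearby."""
    order, seen = [], set()
    for start in adj:
        if start in seen:
            continue
        queue = deque([start])
        seen.add(start)
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
    return order

adj = {0: [2, 4], 1: [3], 2: [0, 4], 3: [1], 4: [0, 2]}
relabel = {old: new for new, old in enumerate(bfs_order(adj))}
# After relabeling, each node's neighbors have nearby indices, so the
# per-iteration sweep of label propagation touches contiguous memory.
print(relabel)  # {0: 0, 2: 1, 4: 2, 1: 3, 3: 4}
```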