Dipanjan Das

Dipanjan Das is a Research Scientist at Google working on learning semantic representations of language. He received a Ph.D. in 2012 from the Language Technologies Institute, School of Computer Science at Carnegie Mellon University. Before that, he completed an undergraduate degree in Computer Science and Engineering in 2005 from the Indian Institute of Technology, Kharagpur. His work on multilingual learning of sequence models received the best paper award at ACL 2011 and a best paper award honorable mention at EMNLP 2013.

See his personal webpage for more information.
Authored Publications
    Conditional Generation with a Question-Answering Blueprint
    Reinald Kim Amplayo
    Fantine Huot
    Mirella Lapata
    Transactions of the Association for Computational Linguistics (2023) (to appear)
The ability to convey relevant and faithful information is critical for many tasks in conditional generation and yet remains elusive for neural sequence-to-sequence models, whose outputs often reveal hallucinations and fail to correctly cover important details. In this work, we advocate planning as a useful intermediate representation for rendering conditional generation less opaque and more grounded. Our work proposes a new conceptualization of text plans as a sequence of question-answer (QA) pairs. We enhance existing datasets (e.g., for summarization) with a QA blueprint operating as a proxy for both content selection (i.e., what to say) and planning (i.e., in what order). We obtain blueprints automatically by exploiting state-of-the-art question generation technology and convert input-output pairs into input-blueprint-output tuples. We develop Transformer-based models, each varying in how they incorporate the blueprint in the generated output (e.g., as a global plan or iteratively). Evaluation across metrics and datasets demonstrates that blueprint models are more factual than alternatives which do not resort to planning and allow tighter control of the generation output.
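For concreteness, a minimal Python sketch of the blueprint idea: a plan is a sequence of question-answer pairs, serialized alongside the input and target to form an input-blueprint-output training tuple. The serialization format and names below are illustrative assumptions, not the paper's exact scheme.

```python
from dataclasses import dataclass

@dataclass
class QAPair:
    question: str
    answer: str

def to_training_tuple(source: str, blueprint: list[QAPair], target: str) -> dict:
    """Convert an (input, output) pair into an input-blueprint-output tuple."""
    plan = " ".join(f"Q: {qa.question} A: {qa.answer}" for qa in blueprint)
    return {"input": source, "blueprint": plan, "output": target}

example = to_training_tuple(
    source="The Nile is the longest river in Africa ...",
    blueprint=[QAPair("Which river is the longest in Africa?", "The Nile")],
    target="The Nile is Africa's longest river.",
)
print(example["blueprint"])
```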
With recent improvements in natural language generation (NLG) models for various applications, it has become imperative to have the means to identify and evaluate whether NLG output is only sharing verifiable information about the external world. In this work, we present a new evaluation framework entitled Attributable to Identified Sources (AIS) for assessing the output of natural language generation models, when such output pertains to the external world. We first define AIS and introduce a two-stage annotation pipeline for allowing annotators to appropriately evaluate model output according to AIS guidelines. We empirically validate this approach on generation datasets spanning three tasks (two conversational QA datasets, a summarization dataset, and a table-to-text dataset) via human evaluation studies that suggest that AIS could serve as a common framework for measuring whether model-generated statements are supported by underlying sources. We release guidelines for the human evaluation studies.
    Text-Blueprint: An Interactive Platform for Plan-based Conditional Generation
    Fantine Huot
    Reinald Kim Amplayo
    Mirella Lapata
    Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations (2023)
While conditional generation models can now generate natural language well enough to create fluent text, it is still difficult to control the generation process, leading to irrelevant, repetitive, and hallucinated content. Recent work shows that planning can be a useful intermediate step to render conditional generation less opaque and more grounded. We present a browser-based demonstration for query-focused summarization that uses a sequence of question-answer pairs as a blueprint plan for guiding text generation (i.e., what to say and in what order). We illustrate how users may interact with the generated text and associated plan visualizations, e.g., by editing and modifying the blueprint in order to improve or control the generated output.
We introduce Seahorse (SummariEs Annotated with Human Ratings in Six languagEs), a dataset of 96K summaries with ratings along 6 dimensions (comprehensibility, repetition, grammar, attribution, main idea(s), and conciseness). The summaries are generated from 8 different models, conditioned on source text from 4 datasets in 6 languages (German, English, Spanish, Russian, Turkish, and Vietnamese). We release the annotated summaries as a resource for developing better summarization models and automatic metrics. We present an analysis of the dataset's composition and quality, and we demonstrate the potential of this dataset for building better summarization metrics, showing that metrics finetuned with Seahorse data outperform baseline metrics.
Large language models (LLMs) have been shown to perform well in answering questions and in producing long-form texts such as stories and explanations, both in few-shot closed-book settings. While the former can be validated using well-known evaluation metrics, the latter is difficult to evaluate. We therefore investigate the ability of LLMs to do both tasks at once: question answering that requires long-form answers. Such questions tend to be multifaceted, i.e., they may have ambiguities and/or require information from multiple sources. To this end, we define query refinement prompts that encourage LLMs to explicitly express the multifacetedness in questions and generate long-form answers covering multiple facets of the question. Our experiments on two long-form question answering datasets, ASQA and AQuAMuSe, show that using our prompts allows us to outperform fully finetuned models in the closed-book setting, as well as achieve results comparable to retrieve-then-generate open-book models.
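A minimal sketch of what such a query refinement prompt might look like; the template wording and the `call_llm` placeholder are assumptions for illustration, not the prompts used in the paper.

```python
# Illustrative query refinement prompt: ask the model to surface the
# facets of a multifaceted question before answering it.
REFINE_TEMPLATE = (
    "Question: {question}\n"
    "Before answering, list the distinct facets this question asks about "
    "(ambiguities, sub-questions), then write a long-form answer covering "
    "each facet.\n"
    "Facets:"
)

def answer_multifaceted(question: str, call_llm) -> str:
    """call_llm: any function mapping a prompt string to generated text."""
    return call_llm(REFINE_TEMPLATE.format(question=question))

# e.g., answer_multifaceted("Who invented the telephone?", my_model_api)
```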
Experiments with pretrained models such as BERT are often based on a single checkpoint. While the conclusions drawn apply to the artifact (i.e., the particular instance of the model), it is not always clear whether they hold for the more general procedure (which includes the model architecture, training data, initialization scheme, and loss function). Recent work has shown that re-running pretraining can lead to substantially different conclusions about performance, suggesting that alternative evaluations are needed to make principled statements about procedures. To address this, we introduce MultiBERTs: a set of 25 BERT-base checkpoints, trained with similar hyper-parameters as the original BERT model but differing in random initialization and data shuffling. The aim is to enable researchers to draw robust and statistically justified conclusions about pretraining procedures. The full release includes 25 fully trained checkpoints, as well as statistical guidelines and a code library implementing our recommended hypothesis testing methods. Finally, for five of these models we release a set of 28 intermediate checkpoints in order to support research on learning dynamics.
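As an illustration of the kind of seed-aware analysis such a release supports, here is a minimal paired bootstrap over seeds for the mean score difference between two pretraining procedures; this is a generic sketch, not the exact procedure from the accompanying statistical guidelines.

```python
import random

def bootstrap_seed_test(scores_a, scores_b, n_boot=10_000, rng_seed=0):
    """Two-sided paired bootstrap over pretraining seeds."""
    assert len(scores_a) == len(scores_b)
    n = len(scores_a)
    rng = random.Random(rng_seed)
    observed = sum(scores_a) / n - sum(scores_b) / n
    flips = 0
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample seeds with replacement
        diff = sum(scores_a[i] for i in idx) / n - sum(scores_b[i] for i in idx) / n
        if diff * observed <= 0:  # resampled difference crosses zero
            flips += 1
    return observed, flips / n_boot

# e.g., with one evaluation score per checkpoint from each procedure:
# delta, p = bootstrap_seed_test(scores_procedure_a, scores_procedure_b)
```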
    A Well-Composed Text is Half Done! Composition Sampling for Diverse Conditional Generation
    Yao Zhao
    Mirella Lapata
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022), Association for Computational Linguistics
We propose Composition Sampling, a simple but effective method to generate diverse outputs for conditional generation of higher quality compared to previous stochastic decoding strategies. It builds on recently proposed plan-based neural generation models (Narayan et al., 2021) that are trained to first create a composition of the output and then generate by conditioning on it and the input. Our approach avoids text degeneration by first sampling a composition in the form of an entity chain and then using beam search to generate the best possible text grounded to this entity chain. Experiments on summarization (CNN/DailyMail and XSum) and question generation (SQuAD), using existing and newly proposed automatic metrics together with human-based evaluation, demonstrate that Composition Sampling is currently the best available decoding strategy for generating diverse meaningful outputs.
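A hedged sketch of the two-step decoding recipe using a generic Hugging Face-style `generate` API; the `[PLAN]` separator and sampling hyper-parameters are assumptions, and the model is presumed to have been trained to emit an entity-chain plan before the text.

```python
def composition_sample(model, tokenizer, source: str, num_outputs: int = 4):
    """Sample a plan stochastically, then beam-search text grounded to it."""
    inputs = tokenizer(source, return_tensors="pt")
    outputs = []
    for _ in range(num_outputs):
        # Step 1: stochastically sample a composition (the entity-chain plan).
        plan_ids = model.generate(**inputs, do_sample=True, top_p=0.95,
                                  max_new_tokens=32)
        plan = tokenizer.decode(plan_ids[0], skip_special_tokens=True)
        # Step 2: deterministic beam search conditioned on input + sampled plan.
        cond = tokenizer(source + " [PLAN] " + plan, return_tensors="pt")
        text_ids = model.generate(**cond, num_beams=4, do_sample=False,
                                  max_new_tokens=128)
        outputs.append(tokenizer.decode(text_ids[0], skip_special_tokens=True))
    return outputs
```

Diversity comes from the sampled plans, while beam search keeps each realization fluent and grounded to its plan.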
The availability of large, high-quality datasets has been one of the main drivers of recent progress in question answering (QA). Such annotated datasets however are difficult and costly to collect, and rarely exist in languages other than English, rendering QA technology inaccessible to underrepresented languages. An alternative to building large monolingual training datasets is to leverage pre-trained language models (PLMs) under a few-shot learning setting. Our approach, QAmeleon, uses a PLM to automatically generate multilingual data upon which QA models are trained, thus avoiding costly annotation. Prompt tuning the PLM for data synthesis with only five examples per language delivers accuracy superior to translation-based baselines, bridges nearly 60% of the gap between an English-only baseline and a fully supervised upper bound trained on almost 50,000 hand labeled examples, and always leads to substantial improvements compared to fine-tuning a QA model directly on labeled examples in low resource settings. Experiments on the TyDiQA-GoldP and MLQA benchmarks show that few-shot prompt tuning for data synthesis scales across languages and is a viable alternative to large-scale annotation.
Large language models (LLMs) have shown impressive results across a variety of tasks while requiring little or no direct supervision. Further, there is mounting evidence that LLMs may have potential in information-seeking scenarios. We believe the ability of an LLM to attribute the text that it generates is likely to be crucial for both system developers and users in this setting. We propose and study Attributed QA as a key first step in the development of attributed LLMs. We develop a reproducible evaluation framework for the task, using human annotations as a gold standard and a correlated automatic metric that we show is suitable for development settings. We describe and benchmark a broad set of architectures for the task. Our contributions give some concrete answers to two key questions (how to measure attribution, and how well current state-of-the-art methods perform on it), and give some hints as to how to address a third key question (how to build LLMs with attribution).
    Increasing Faithfulness in Knowledge-Grounded Dialogue with Controllable Features
    Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) (2021), pp. 704-718
Knowledge-grounded dialogue systems are intended to convey information that is based on evidence provided in a given source text. We discuss the challenges of training a generative neural dialogue model for such systems that is controlled to stay faithful to the evidence. Existing datasets contain a mix of conversational responses that are faithful to selected evidence as well as more subjective or chit-chat style responses. We propose different evaluation measures to disentangle these different styles of responses by quantifying the informativeness and objectivity. At training time, additional inputs based on these evaluation measures are given to the dialogue model. At generation time, these additional inputs act as stylistic controls that encourage the model to generate responses that are faithful to the provided evidence. We also investigate the usage of additional controls at decoding time using resampling techniques. In addition to automatic metrics, we perform a human evaluation study where raters judge the output of these controlled generation models to be generally more objective and faithful to the evidence compared to baseline dialogue systems.
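A toy sketch of the control-feature idea: measures computed on each training response are bucketed into discrete control tokens prepended to the model input. The token names, thresholds, and the crude first-person heuristic below are illustrative assumptions rather than the paper's exact measures.

```python
def add_controls(evidence: str, history: str, response: str) -> str:
    """Prefix a training example with bucketed style-control tokens."""
    # Crude proxy for faithfulness: lexical overlap with the evidence.
    overlap = len(set(response.lower().split()) & set(evidence.lower().split()))
    ratio = overlap / max(len(response.split()), 1)
    faithfulness = "<high-overlap>" if ratio > 0.5 else "<low-overlap>"
    # Crude proxy for objectivity: absence of first-person pronouns.
    objective = " i " not in f" {response.lower()} "
    objectivity = "<objective>" if objective else "<subjective>"
    return f"{faithfulness} {objectivity} {history} [EVIDENCE] {evidence}"
```

At generation time the desirable tokens (e.g., `<high-overlap> <objective>`) are supplied as fixed controls to steer decoding.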
We present ToTTo, an open-domain English table-to-text dataset with over 120,000 training examples that proposes a controlled generation task: given a Wikipedia table and a set of highlighted table cells, produce a one-sentence description. To obtain generated targets that are natural but also faithful to the source table, we introduce a dataset construction process where annotators directly revise existing candidate sentences from Wikipedia. We present systematic analyses of our dataset and annotation process as well as results achieved by several state-of-the-art baselines. While usually fluent, existing methods often hallucinate phrases that are not supported by the table, suggesting that this dataset can serve as a useful research benchmark for high-precision conditional text generation.
Despite significant advances in text generation in recent years, evaluation metrics have lagged behind, with n-gram overlap metrics such as BLEU or ROUGE still remaining popular. In this work, we introduce BLEURT, a learnt evaluation metric based on BERT that achieves state-of-the-art performance on the last three years of the WMT Metrics Shared Task and the WebNLG challenge. A key aspect of our approach is a novel pre-training scheme that uses millions of synthetically constructed examples to increase generalization. We show that in contrast to a vanilla BERT fine-tuning approach, BLEURT yields superior results even in the presence of scarce, skewed, or out-of-domain training data.
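BLEURT is available as open-source software (https://github.com/google-research/bleurt); a short usage sketch with the public `bleurt` package follows, where the checkpoint path is a placeholder for whichever checkpoint you have downloaded locally.

```python
from bleurt import score

# Point this at a locally downloaded BLEURT checkpoint directory.
scorer = score.BleurtScorer("path/to/bleurt/checkpoint")
scores = scorer.score(
    references=["The cat sat on the mat."],
    candidates=["A cat was sitting on the mat."],
)
print(scores)  # one quality score per candidate-reference pair
```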
We propose a novel conditioned text generation model. It draws inspiration from traditional template-based text generation techniques, where the source provides the content (i.e., what to say), and the template influences how to say it. Building on the successful encoder-decoder paradigm, it first encodes the content representation from the given input text; to produce the output, it retrieves exemplar text from the training data as "soft templates," which are then used to construct an exemplar-specific decoder. We evaluate the proposed model on abstractive text summarization and data-to-text generation. Empirical results show that this model achieves strong performance and outperforms comparable baselines.
    What do you learn from context? Probing for sentence structure in contextualized word representations
    Patrick Xia
    Berlin Chen
    Alex Wang
    Adam Poliak
    R. Thomas McCoy
    Najoung Kim
    Benjamin Van Durme
    Samuel R. Bowman
    International Conference on Learning Representations (2019)
Contextualized representation models such as CoVe (McCann et al., 2017) and ELMo (Peters et al., 2018a) have recently achieved state-of-the-art results on a broad suite of downstream NLP tasks. Building on recent token-level probing work (Peters et al., 2018a; Blevins et al., 2018; Belinkov et al., 2017b; Shi et al., 2016), we introduce a broad suite of sub-sentence probing tasks derived from the traditional structured-prediction pipeline, including parsing, semantic role labeling, and coreference, and covering a range of syntactic, semantic, local, and long-range phenomena. We use these tasks to examine the word-level contextual representations and investigate how they encode information about the structure of the sentence in which they appear. We probe three recently-released contextual encoder models, and find that ELMo better encodes linguistic structure at the word level than do other comparable models. We find that the existing models trained on language modeling and translation produce strong representations for syntactic phenomena, but only offer small improvements on semantic tasks over a non-contextual baseline.
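The probing recipe itself is simple: freeze the encoder, pool token vectors over a labeled span, and train only a light classifier on top. A minimal PyTorch sketch follows; mean pooling, dimensions, and the label inventory are illustrative simplifications of the paper's edge-probing architecture.

```python
import torch
import torch.nn as nn

class SpanProbe(nn.Module):
    """A light classifier over pooled span representations from a frozen encoder."""
    def __init__(self, hidden_dim: int, num_labels: int):
        super().__init__()
        self.classifier = nn.Linear(hidden_dim, num_labels)

    def forward(self, token_reprs: torch.Tensor, start: int, end: int):
        # token_reprs: [seq_len, hidden_dim], detached so the encoder is untouched.
        span = token_reprs[start:end].mean(dim=0)  # simple mean pooling
        return self.classifier(span)

probe = SpanProbe(hidden_dim=768, num_labels=45)  # e.g., a POS tag inventory
reprs = torch.randn(12, 768)                      # stand-in for encoder output
logits = probe(reprs.detach(), start=3, end=5)
```

Because only the probe is trained, any gain over a non-contextual baseline is attributable to what the frozen representations encode.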
    BERT Rediscovers the Classical NLP Pipeline
    Association for Computational Linguistics (2019) (to appear)
Pre-trained sentence encoders such as ELMo (Peters et al., 2018a) and BERT (Devlin et al., 2018) have rapidly advanced the state-of-the-art on many NLP tasks, and have been shown to encode contextual information that can resolve many aspects of language structure. We extend the edge probing suite of Tenney et al. (2019) to explore the computation performed at each layer of the BERT model, and find that tasks derived from the traditional NLP pipeline appear in a natural progression: part-of-speech tags are processed earliest, followed by constituents, dependencies, semantic roles, and coreference. We trace individual examples through the encoder and find that while this order holds on average, the encoder occasionally inverts the order, revising low-level decisions after deciding higher-level contextual relations.
Automatically constructed datasets for generating text from semi-structured data (tables), such as WikiBio, often contain reference texts that diverge from the information in the corresponding semi-structured data. We show that metrics which rely solely on the reference texts, such as BLEU and ROUGE, show poor correlation with human judgments when those references diverge. We propose a new metric, PARENT, which aligns n-grams from the reference and generated texts to the semi-structured data before computing their precision and recall. Through a large scale human evaluation study of table-to-text models for WikiBio, we show that PARENT correlates with human judgments better than existing text generation metrics. We also adapt and evaluate the information extraction based evaluation proposed by Wiseman et al. (2017), and show that PARENT has comparable correlation to it, while being easier to use. We show that PARENT is also applicable when the reference texts are elicited from humans using the data from the WebNLG challenge.
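A much-simplified sketch of PARENT's central idea: credit a generated n-gram if it matches the reference or is supported by the table. The real metric uses soft alignment and combines this precision with a recall term; this toy version uses exact lexical overlap only.

```python
def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def entailed_precision(generated, reference, table_tokens, n=2):
    """Fraction of generated n-grams found in the reference or entailed by the table."""
    ref_set = set(ngrams(reference, n))
    gen = ngrams(generated, n)
    if not gen:
        return 0.0
    hits = sum(1 for g in gen
               if g in ref_set or all(w in table_tokens for w in g))
    return hits / len(gen)

p = entailed_precision(
    generated="john smith born 1970".split(),
    reference="john smith was born in 1970".split(),
    table_tokens={"john", "smith", "born", "1970"},
)
```

The key contrast with BLEU is the second disjunct: an n-gram absent from a divergent reference still earns credit when the table supports it.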
Understanding natural language queries is fundamental to many practical NLP systems. Often, such systems comprise a brittle processing pipeline that is not robust to "word salad" text ubiquitously issued by users. However, if a query resembles a grammatical and well-formed question, such a pipeline is able to perform more accurate interpretation, thus reducing downstream compounding errors. Hence, identifying whether or not a query is well formed can enhance query understanding. Here, we introduce a new task of identifying a well-formed natural language question. We construct and release a dataset of 25,100 publicly available questions classified into well-formed and non-well-formed categories and report an accuracy of 70.7% on the test set. We also show that our classifier can be used to improve the performance of a neural sequence-to-sequence model for generating questions for reading comprehension.
Split and rephrase is the task of breaking down a sentence into shorter ones that together convey the same meaning. We extract a rich new dataset for this task by mining Wikipedia's edit history: WikiSplit contains one million naturally occurring sentence rewrites, providing sixty times more distinct split examples and a ninety times larger vocabulary than the WebSplit corpus introduced by Narayan et al. (2017) as a benchmark for this task. Incorporating WikiSplit as training data produces a model with qualitatively better predictions that score 32 BLEU points above the prior best result on the WebSplit benchmark.
We release a corpus of atomic insertion edits: instances in which a human editor has inserted a single contiguous span of text into an existing sentence. Our corpus is derived from Wikipedia edit history and contains 43 million sentences across 8 different languages. We argue that the signal contained in these edits is valuable for research in semantics and discourse, and that such signal differs from that found in conventional language modeling corpora. We provide experimental evidence from both a corpus linguistics and a language modeling perspective to support these claims.
The reading comprehension task, which asks questions about a given evidence document, is a central problem in natural language understanding. Recent formulations of this task have typically focused on answer selection from a set of candidates pre-defined manually or through the use of an external NLP pipeline. However, Rajpurkar et al. (2016) recently released the SQuAD dataset in which the answers can be arbitrary strings from the supplied text. In this paper, we focus on this answer extraction task, presenting a novel model architecture that efficiently builds fixed length representations of all spans in the evidence document with a recurrent network. We show that scoring explicit span representations significantly improves performance over other approaches that factor the prediction into separate predictions about words or start and end markers. Our approach improves upon the best published results of Wang & Jiang (2016) by 5% and decreases the error of Rajpurkar et al.'s baseline by over 50%.
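A simplified sketch of explicit span scoring: build a fixed-length vector for every candidate span and score all spans jointly. The endpoint-concatenation features below stand in for the paper's recurrent span representations.

```python
import torch
import torch.nn as nn

def score_all_spans(token_reprs: torch.Tensor, scorer: nn.Linear, max_len: int = 10):
    """Enumerate spans up to max_len tokens and score each one explicitly."""
    seq_len, dim = token_reprs.shape
    spans, feats = [], []
    for i in range(seq_len):
        for j in range(i, min(i + max_len, seq_len)):
            spans.append((i, j))
            # Fixed-length span feature: concatenated endpoint vectors.
            feats.append(torch.cat([token_reprs[i], token_reprs[j]]))
    scores = scorer(torch.stack(feats)).squeeze(-1)  # one score per span
    best = scores.argmax().item()
    return spans[best], scores

scorer = nn.Linear(2 * 128, 1)
best_span, scores = score_all_spans(torch.randn(20, 128), scorer)
```

Scoring whole spans, rather than independent start and end positions, lets the model capture interactions between a span's boundaries.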
    Neural Paraphrase Identification of Questions with Noisy Pretraining
    Thyago Duque
    Oscar Täckström
    Jakob Uszkoreit
    Proceedings of the First Workshop on Subword and Character Level Models in NLP (2017)
We present a solution to the problem of paraphrase identification of questions. We focus on a recent dataset of question pairs annotated with binary paraphrase labels and show that a variant of the decomposable attention model (Parikh et al., 2016) results in accurate performance on this task, while being far simpler than many competing neural architectures. Furthermore, when the model is pretrained on a noisy dataset of automatically collected question paraphrases, it obtains the best reported performance on the dataset.
We propose a simple neural architecture for natural language inference. Our approach uses attention to decompose the problem into subproblems that can be solved separately, thus making it trivially parallelizable. On the Stanford Natural Language Inference (SNLI) dataset, we obtain state-of-the-art results with almost an order of magnitude fewer parameters than previous work and without relying on any word-order information. Adding intra-sentence attention that takes a minimum amount of order into account yields further improvements.
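A minimal PyTorch sketch of the model's attend / compare / aggregate structure; the feed-forward networks F, G, and H from the paper are reduced to single linear layers here, so this is an illustration of the decomposition rather than a faithful reimplementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DecomposableAttention(nn.Module):
    def __init__(self, dim: int, num_classes: int = 3):
        super().__init__()
        self.attend = nn.Linear(dim, dim)                 # F in the paper
        self.compare = nn.Linear(2 * dim, dim)            # G
        self.aggregate = nn.Linear(2 * dim, num_classes)  # H

    def forward(self, a: torch.Tensor, b: torch.Tensor):
        # a: [len_a, dim], b: [len_b, dim] word embeddings (no word order used).
        e = self.attend(a) @ self.attend(b).T         # soft alignment scores
        beta = F.softmax(e, dim=1) @ b                # b aligned to each token of a
        alpha = F.softmax(e, dim=0).T @ a             # a aligned to each token of b
        v1 = self.compare(torch.cat([a, beta], dim=-1)).sum(dim=0)
        v2 = self.compare(torch.cat([b, alpha], dim=-1)).sum(dim=0)
        return self.aggregate(torch.cat([v1, v2]))

model = DecomposableAttention(dim=64)
logits = model(torch.randn(7, 64), torch.randn(9, 64))
```

Each token pair is compared independently, which is what makes the model both small and trivially parallelizable.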
    Transforming Dependency Structures to Logical Forms for Semantic Parsing
    Siva Reddy
    Oscar Täckström
    Mark Steedman
    Mirella Lapata
    Transactions of the Association for Computational Linguistics, vol. 4 (2016)
The strongly typed syntax of grammar formalisms such as CCG, TAG, LFG and HPSG offers a synchronous framework for deriving syntactic structures and semantic logical forms. In contrast, partly due to the lack of a strong type system, dependency structures are easy to annotate and have become a widely used form of syntactic analysis for many languages. However, the lack of a type system makes a formal mechanism for deriving logical forms from dependency structures challenging. We address this by introducing a robust system based on the lambda calculus for deriving neo-Davidsonian logical forms from dependency trees. These logical forms are then used for semantic parsing of natural language to Freebase. Experiments on the Free917 and WebQuestions datasets show that our representation is superior to the original dependency trees and that it outperforms a CCG-based representation on this task. Compared to prior work, we obtain the strongest result to date on Free917 and competitive results on WebQuestions.
    Efficient Inference and Structured Learning for Semantic Role Labeling
    Oscar Täckström
    Transactions of the Association for Computational Linguistics, vol. 3 (2015), pp. 29-41
We present a dynamic programming algorithm for efficient constrained inference in semantic role labeling. The algorithm tractably captures a majority of the structural constraints examined by prior work in this area, which has resorted to either approximate methods or off-the-shelf integer linear programming solvers. In addition, it allows training a globally-normalized log-linear model with respect to constrained conditional likelihood. We show that the dynamic program is several times faster than an off-the-shelf integer linear programming solver, while reaching the same solution. Furthermore, we show that our structured model results in significant improvements over its local counterpart, achieving state-of-the-art results on both PropBank- and FrameNet-annotated corpora.
    Semantic Role Labeling with Neural Network Factors
    Oscar Täckström
    Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP '15), Association for Computational Linguistics
We present a new method for semantic role labeling in which arguments and semantic roles are jointly embedded in a shared vector space for a given predicate. These embeddings belong to a neural network, whose output represents the potential functions of a graphical model designed for the SRL task. We consider both local and structured learning methods and obtain strong results on standard PropBank and FrameNet corpora with a straightforward product-of-experts model. We further show how the model can learn jointly from PropBank and FrameNet annotations to obtain additional improvements on the smaller FrameNet dataset.
    Enhanced Search with Wildcards and Morphological Inflections in the Google Books Ngram Viewer
    Jason Mann
    David Zhang
    Lu Yang
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Demonstrations), Association for Computational Linguistics (2014)
    Semantic Frame Identification with Distributed Word Representations
    Karl Moritz Hermann
    Jason Weston
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (2014)
We present a novel technique for semantic frame identification using distributed representations of predicates and their syntactic context; this technique leverages automatic syntactic parses and a generic set of word embeddings. Given labeled data annotated with frame-semantic parses, we learn a model that projects the set of word representations for the syntactic context around a predicate to a low dimensional representation. The latter is used for semantic frame identification; with a standard argument identification method inspired by prior work, we achieve state-of-the-art results on FrameNet-style frame-semantic analysis. Additionally, we report strong results on PropBank-style semantic role labeling in comparison to prior work.
    Frame-Semantic Parsing
    Desai Chen
    André F. T. Martins
    Nathan Schneider
    Noah A. Smith
    Computational Linguistics, vol. 40:1 (2014), pp. 9-56
Frame semantics (Fillmore 1982) is a linguistic theory that has been instantiated for English in the FrameNet lexicon (Fillmore, Johnson, and Petruck 2003). We solve the problem of frame-semantic parsing using a two-stage statistical model that takes lexical targets (i.e., content words and phrases) in their sentential contexts and predicts frame-semantic structures. Given a target in context, the first stage disambiguates it to a semantic frame. This model employs latent variables and semi-supervised learning to improve frame disambiguation for targets unseen at training time. The second stage finds the target's locally expressed semantic arguments. At inference time, a fast exact dual decomposition algorithm collectively predicts all the arguments of a frame at once in order to respect declaratively stated linguistic constraints, resulting in qualitatively better structures than naïve local predictors. Both components are feature-based and discriminatively trained on a small set of annotated frame-semantic parses. On the SemEval 2007 benchmark dataset, the approach, along with a heuristic identifier of frame-evoking targets, outperforms the prior state of the art by significant margins. Additionally, we present experiments on the much larger FrameNet 1.5 dataset. We have released our frame-semantic parser as open-source software.
    Learning Compact Lexicons for CCG Semantic Parsing
    Yoav Artzi
    Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP '14)
    Token and Type Constraints for Cross-Lingual Part-of-Speech Tagging
Oscar Täckström
    Ryan McDonald
    Joakim Nivre
Transactions of the Association for Computational Linguistics, vol. 1 (2013), pp. 1-12
    Universal Dependency Annotation for Multilingual Parsing
    Ryan McDonald
    Joakim Nivre
    Yoav Goldberg
    Yvonne Quirmbach-Brundage
    Keith Hall
Oscar Täckström
    Claudia Bedini
    Nuria Bertomeu Castello
    Jungmee Lee
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL 2013), Association for Computational Linguistics
    Cross-Lingual Discriminative Learning of Sequence Models with Posterior Regularization
    Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing
We present a framework for cross-lingual transfer of sequence information from a resource-rich source language to a resource-impoverished target language that incorporates soft constraints via posterior regularization. To this end, we use automatically word aligned bitext between the source and target language pair, and learn a discriminative conditional random field model on the target side. Our posterior regularization constraints are derived from simple intuitions about the task at hand and from cross-lingual alignment information. We show improvements over strong baselines for two tasks: part-of-speech tagging and named-entity segmentation.
    A Universal Part-of-Speech Tagset
    Ryan McDonald
    Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC '12) (2012)
    Unsupervised Part-of-Speech Tagging with Bilingual Graph-Based Projections
    Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics (ACL '11) (2011), Best Paper Award
We describe a novel approach for inducing unsupervised part-of-speech taggers for languages that have no labeled training data, but have translated text in a resource-rich language. Our method does not assume any knowledge about the target language (in particular no tagging dictionary is assumed), making it applicable for a wide array of resource-poor languages. We use graph-based label propagation for cross-lingual knowledge transfer and use the projected labels as constraints in an unsupervised model. Across six European languages, our approach results in an average absolute improvement of 9.7% over the state-of-the-art baseline, and 17.0% over vanilla hidden Markov models induced with EM.
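A toy sketch of the label propagation step at the heart of this approach: each node's tag distribution is iteratively averaged with its neighbors', while seed nodes (whose labels were projected across word alignments) stay clamped. The paper's graph construction and objective are more involved than this illustration.

```python
def propagate(graph, seed_labels, num_tags, iters=10):
    """graph: {node: [(neighbor, weight), ...]}; seed_labels: {node: tag_id}."""
    # Start every node at the uniform distribution, then clamp the seeds.
    q = {v: [1.0 / num_tags] * num_tags for v in graph}
    for v, t in seed_labels.items():
        q[v] = [1.0 if k == t else 0.0 for k in range(num_tags)]
    for _ in range(iters):
        new_q = {}
        for v, nbrs in graph.items():
            if v in seed_labels:
                new_q[v] = q[v]  # seeds stay fixed across iterations
                continue
            total = sum(w for _, w in nbrs) or 1.0
            new_q[v] = [sum(w * q[u][k] for u, w in nbrs) / total
                        for k in range(num_tags)]
        q = new_q
    return q
```

The smoothed distributions over unlabeled target-language nodes then serve as soft constraints when training the unsupervised tagger.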
    An Exact Dual Decomposition Algorithm for Shallow Semantic Parsing with Constraints
    André F. T. Martins
    Noah A. Smith
Proceedings of the First Joint Conference on Lexical and Computational Semantics (*SEM 2012), Association for Computational Linguistics
    Graph-Based Lexicon Expansion with Sparsity-Inducing Penalties
    Noah A. Smith
Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2012), Association for Computational Linguistics
    Semi-Supervised and Latent-Variable Models of Natural Language Semantics
    Ph.D. Thesis, Carnegie Mellon University (2012)
    Semi-Supervised Frame-Semantic Parsing for Unknown Predicates
    Noah A. Smith
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics (ACL 2011), Association for Computational Linguistics
    Part-of-Speech Tagging for Twitter: Annotation, Features, and Experiments
    Kevin Gimpel
    Nathan Schneider
    Brendan O'Connor
    Daniel Mills
    Jacob Eisenstein
    Michael Heilman
    Dani Yogatama
    Jeffrey Flanigan
    Noah A. Smith
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics (ACL 2011), Association for Computational Linguistics
    Unsupervised Structure Prediction with Non-Parallel Multilingual Guidance
    Shay B. Cohen
    Noah A. Smith
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics (2011)