Dana Alon
Previously: Dana Movshovitz-Attias
Research Areas
Authored Publications
Sort By
PaRaDe: Passage Ranking using Demonstrations with Large Language Models
Andrew Drozdov
Zhuyun Dai
Razieh Negin Rahimi
Andrew McCallum
Mohit Iyyer
EMNLP 2023 (Findings)
Preview abstract
Recent work has shown that Large Language Models (LLMs) can effectively re-rank the outputs of BM25 retrieval. This is achieved zero-shot by including task-specific instructions. However, for tasks that require scoring instead of generation, few-shot prompting remains underexplored. In this work, we improve LLM-based re-ranking performance by including demonstrations in the prompt. We show that adding even a single demonstration makes a significant impact. Our detailed analysis investigates under which conditions demonstrations are the most helpful. We propose a novel difficulty-based demonstration selection strategy instead of using the commonly used approach of semantic similarity. Furthermore, we show that demonstrations helpful for ranking are also effective at question generation. We hope our research will facilitate further studies into both question generation and passage re-ranking.
View details
GoEmotions: A Dataset of Fine-Grained Emotions
Dorottya Demszky
Alan Cowen
Gaurav Nemade
Sujith Ravi
ACL (2020) (to appear)
Preview abstract
Understanding emotion expressed in language has a wide range of applications, from building empathetic chatbots to detecting harmful online behavior. Advancement in this area can be improved using large-scale datasets with a fine-grained typology, adaptable to multiple downstream tasks. We introduce GoEmotions, the largest manually annotated dataset of 58k English Reddit comments, labeled for 27 emotion categories or Neutral. We demonstrate the high quality of the annotations via Principal Preserved Component Analysis. We conduct transfer learning experiments with existing emotion benchmarks to show that our dataset generalizes well to other domains and different emotion taxonomies. Our BERT-based model achieves an average F1-score of .46 across our proposed taxonomy, leaving much room for improvement.
View details
Graph Agreement Models for Semi-supervised Learning
Krishnamurthy Viswanathan
Anthony Platanios
Sujith Ravi
Proceedings of the Thirty-third Conference on Neural Information Processing Systems, Neurips 2019
Preview abstract
Graph-based algorithms are among the most successful paradigms for solving semi-supervised learning tasks. Recent work on graph convolutional networks and neural graph learning methods has successfully combined the expressiveness of neural networks with graph structures. We propose a technique that, when applied to these methods, achieves state-of-the-art results on semi-supervised learning datasets. Traditional graph-based algorithms, such as label propagation, were designed with the underlying assumption that the label of a node can be imputed from that of the neighboring nodes. However, real-world graphs are either noisy or have edges that do not correspond to label agreement. To address this, we propose Graph Agreement Models (GAM), which introduces an auxiliary model that predicts the probability of two nodes sharing the same label as a learned function of their features. The agreement model is used when training a node classification model by encouraging agreement only for the pairs of nodes it deems likely to have the same label, thus guiding its parameters to better local optima. The classification and agreement models are trained jointly in a co-training fashion. Moreover, GAM can also be applied to any semi-supervised classification problem, by inducing a graph whenever one is not provided. We demonstrate that our method achieves a relative improvement of up to 72% for various node classification models, and obtains state-of-the-art results on multiple established datasets.
View details
Discovering Subsumption Relationships for Web-Based Ontologies
Preview
Steven Euijong Whang
Alon Halevy
Proc. 18th International Workshop on the Web and Databases (WebDB) (2015)