Javad Hosseini

Javad Hosseini is a researcher at Google Research, UK, working on natural language inference, reasoning, and problems related to the factuality of large language models. Before joining Google, Javad earned his PhD at the Institute for Language, Cognition and Computation (ILCC), University of Edinburgh, under the supervision of Mark Steedman. He obtained his MSc in computer science from the University of Washington while working with Hanna Hajishirzi, Oren Etzioni, and Su-In Lee. He earned his MSc and BSc (1st rank) in Computer Software Engineering from Sharif University of Technology.
Authored Publications
    Sources of LLM Hallucination in Natural Language Inference
    Nick McKenna
    Tianyi Li
    Liang Cheng
    Mark Johnson
    Mark Steedman
    Findings of the Association for Computational Linguistics: EMNLP 2023
    Large Language Models (LLMs) are claimed to be capable of Natural Language Inference (NLI), necessary for applied tasks like question answering and summarization. We present a series of behavioral studies on several LLM families (LLaMA, GPT-3.5, and PaLM) which probe their behavior using controlled experiments. We establish two biases originating from pretraining which predict much of their behavior, and show that these are major sources of hallucination in generative LLMs. First, memorization at the level of sentences: we show that, regardless of the premise, models falsely label NLI test samples as entailing when the hypothesis is attested in training data, and that entities are used as “indices” to access the memorized data. Second, statistical patterns of usage learned at the level of corpora: we further show a similar effect when the premise predicate is less frequent than that of the hypothesis in the training data, a bias following from previous studies. We demonstrate that LLMs perform significantly worse on NLI test samples which do not conform to these biases than those which do, and we offer these as valuable controls for future LLM evaluation.
    Resolving Indirect Referring Expressions for Entity Selection
    Silvia Pareti
    Proceedings of the Annual Meetings of the Association for Computational Linguistics (ACL 2023)
    Recent advances in language modeling have enabled new conversational systems. In particular, it is often desirable for people to make choices among specified options when using such systems. We address the problem of reference resolution, when people use natural expressions to choose between real world entities. For example, given the choice "Should we make a Simnel cake or a Pandan cake?" a natural response from a non-expert may be indirect: "let's make the green one". Such natural expressions have been little studied for reference resolution. We argue that robustly understanding such language has large potential for improving naturalness in dialog, recommendation, and search systems. We create AltEntities (Alternative Entities), a new public dataset of 42K entity pairs and expressions (referring to one entity in the pair), and develop models for the disambiguation problem. Consisting of indirect referring expressions across three domains, our corpus enables for the first time the study of how language models can be adapted to this task. We find they achieve 82%-87% accuracy in realistic settings, which while reasonable also invites further advances.
    Transformer encoders contextualize token representations by attending to all other tokens at each layer, leading to quadratic increase in compute effort with the input length. In practice, however, the input text of many NLP tasks can be seen as a sequence of related segments (e.g., the sequence of sentences within a passage, or the hypothesis and premise in NLI). While attending across these segments is highly beneficial for many tasks, we hypothesize that this interaction can be delayed until later encoding stages. To this end, we introduce Layer-adjustable Interactions in Transformers (LAIT). Within LAIT, segmented inputs are first encoded independently, and then jointly. This partial two-tower architecture bridges the gap between a Dual Encoder's ability to pre-compute representations for segments and a fully self-attentive Transformer's capacity to model cross-segment attention. Also, LAIT can be introduced only when finetuning, effectively converting an existing pretrained Transformer into the hybrid of the two aforementioned architectures, and providing an intuitive control over the performance-efficiency tradeoff. Experimenting on a wide range of NLP tasks, we find LAIT to significantly improve efficiency while preserving accuracy.
    Complementary Roles of Inference and Language Models in Open-domain QA
    Liang Cheng
    Mark Steedman
    Proceedings of the 2nd Workshop on Pattern-based Approaches to NLP in the Age of Deep Learning (2023)
    Answering open-domain questions through unsupervised methods poses challenges for both machine-reading (MR) and language model (LM)-based approaches. The MR-based approach suffers from sparsity issues in extracted knowledge graphs (KGs), while the performance of the LM-based approach significantly depends on the quality of the retrieved context for questions. In this paper, we compare these approaches and propose a novel methodology that leverages directional predicate entailment (inference) to address these limitations. We use entailment graphs (EGs), with natural language predicates as nodes and entailment as edges, to enhance parsed KGs by inferring unseen assertions, effectively mitigating the sparsity problem in the MR-based approach. We also show EGs improve context retrieval for the LM-based approach. Additionally, we present a Boolean QA task, demonstrating that EGs exhibit comparable directional inference capabilities to large language models (LLMs). Our results highlight the importance of inference in open-domain QA and the improvements brought by leveraging EGs.
    Language models are poor learners of directional inference
    Tianyi Li
    Sabine Weber
    Mark Steedman
    Findings of the Association for Computational Linguistics: EMNLP 2022, pp. 903-921
    We examine the RoBERTa LMs' competence of directional predicate entailments with prompt fine-tuning. Through analysis, we find that contrary to previous evidence of success, they are showing limited capabilities of directional inference; moreover, existing datasets are either ignorant of directionality, or infested by spurious correlations, allowing models to overfit to dataset artefacts. In response, we present BoOQA (Boolean Open QA), an extrinsic, robust, multi-lingual evaluation benchmark for directional predicate entailments, independent of existing training sets. On BoOQA, we establish baselines and verify that existing LM-prompting models are not competent directional entailment learners, while entailment graphs are cursed by sparsity. We bring the open problem of directional predicate entailment to spotlight and advocate for research along this line.
    Cross-lingual Inference with A Chinese Entailment Graph
    Tianyi Li
    Sabine Weber
    Liane Guillou
    Mark Steedman
    Findings of the Association for Computational Linguistics: ACL 2022, pp. 1214-1233
    Predicate entailment detection is a crucial task for question-answering from text, where previous work has explored unsupervised learning of entailment graphs from typed open relation triples. In this paper, we present the first pipeline for building Chinese entailment graphs, which involves a novel high-recall open relation extraction (ORE) method and the first Chinese fine-grained entity typing dataset under the FIGER type ontology. Through experiments on the Levy-Holt dataset, we verify the strength of our Chinese entailment graph, and reveal the cross-lingual complementarity: on the parallel Levy-Holt dataset, an ensemble of Chinese and English entailment graphs beats both monolinguals, and raises unsupervised SOTA by 4.7 AUC points.
    Open-Domain Contextual Link Prediction and its Complementarity with Entailment Graphs
    Shay B. Cohen
    Mark Johnson
    Mark Steedman
    Findings of the Association for Computational Linguistics: EMNLP 2021, pp. 2790-2802
    An open-domain knowledge graph (KG) has entities as nodes and natural language relations as edges, and is constructed by extracting (subject, relation, object) triples from text. The task of open-domain link prediction is to infer missing relations in the KG. Previous work has used standard link prediction for the task. Since triples are extracted from text, we can ground them in the larger textual context in which they were originally found. However, standard link prediction methods only rely on the KG structure and ignore the textual context of the triples. In this paper, we introduce the new task of open-domain contextual link prediction which has access to both the textual context and the KG structure to perform link prediction. We build a dataset for the task and propose a model for it. Our experiments show that context is crucial in predicting missing relations. We also demonstrate the utility of contextual link prediction in discovering out-of-context entailments between relations, in the form of entailment graphs (EG), in which the nodes are the relations. The reverse holds too: out-of-context EGs assist in predicting relations in context.
    Multivalent Entailment Graphs for Question Answering
    Nick McKenna
    Liane Guillou
    Sander Bijl de Vroe
    Mark Johnson
    Mark Steedman
    Conference on Empirical Methods in Natural Language Processing (EMNLP, long papers) (2021), pp. 10758-10768
    Drawing inferences between open-domain natural language predicates is a necessity for true language understanding. There has been much progress in unsupervised learning of entailment graphs for this purpose. We make three contributions: (1) we reinterpret the Distributional Inclusion Hypothesis to model entailment between predicates of different valencies, like DEFEAT(Biden, Trump) |= WIN(Biden); (2) we actualize this theory by learning unsupervised Multivalent Entailment Graphs of open-domain predicates; and (3) we demonstrate the capabilities of these graphs on a novel question answering task. We show that directional entailment is more helpful for inference than non-directional similarity on questions of fine-grained semantics. We also show that drawing on evidence across valencies answers more questions than by using only the same valency evidence.
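
Several of the abstracts above describe entailment graphs: directed graphs with predicates as nodes and entailment as edges. The toy Python sketch below illustrates only that data structure and why direction matters, using the DEFEAT/WIN example from the last abstract. It is not the authors' method: the edges are written by hand rather than learned, the `EntailmentGraph` class and the `COMPETE(x)` predicate are hypothetical names introduced for illustration, and following chains of edges assumes entailment is treated as transitive.

```python
from collections import defaultdict, deque


class EntailmentGraph:
    """Toy directed graph: nodes are predicate strings, edges mean 'entails'."""

    def __init__(self):
        self._edges = defaultdict(set)

    def add_entailment(self, premise, hypothesis):
        # Store a single directed edge: premise |= hypothesis.
        self._edges[premise].add(hypothesis)

    def entails(self, premise, hypothesis):
        # Breadth-first search along directed edges (assumes transitivity).
        if premise == hypothesis:
            return True
        seen = {premise}
        queue = deque([premise])
        while queue:
            node = queue.popleft()
            for nxt in self._edges.get(node, ()):
                if nxt == hypothesis:
                    return True
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return False


graph = EntailmentGraph()
# Binary predicate entailing a unary one, as in the abstract's example.
graph.add_entailment("DEFEAT(x, y)", "WIN(x)")
# Hypothetical extra edge, added only to show a two-step inference.
graph.add_entailment("WIN(x)", "COMPETE(x)")

forward = graph.entails("DEFEAT(x, y)", "WIN(x)")
backward = graph.entails("WIN(x)", "DEFEAT(x, y)")
transitive = graph.entails("DEFEAT(x, y)", "COMPETE(x)")
print(forward, backward, transitive)
```

A symmetric similarity score would treat the `forward` and `backward` queries identically, whereas the directed graph distinguishes them; that contrast between directional entailment and non-directional similarity is what these abstracts emphasize.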