Abstract

Large language models (LLMs) have shown impressive results across a variety of tasks while requiring little or no direct supervision. Further, there is mounting evidence that LLMs may have potential in information-seeking scenarios. We believe the ability of an LLM to attribute the text that it generates is likely to be crucial for both system developers and users in this setting. We propose and study Attributed QA as a key first step in the development of attributed LLMs. We develop a reproducible evaluation framework for the task, using human annotations as a gold standard and a correlated automatic metric that we show is suitable for development settings. We describe and benchmark a broad set of architectures for the task. Our contributions give some concrete answers to two key questions (How to measure attribution? and How well do current state-of-the-art methods perform on attribution?), and give some hints as to how to address a third key question (How to build LLMs with attribution?).