It's All Relative! -- A Synthetic Query Generation Approach for Improving Zero-Shot Relevance Prediction

Karthik Raman

Michael Bendersky

Aditi Chaudhary

Findings of the Association for Computational Linguistics: NAACL 2024

Abstract

Recent large language models (LLMs) have shown promise at generating synthetic query-document pairs from prompts containing as few as 8 demonstrations (Dai et al., 2022).
This has enabled building better IR models, especially for tasks that have no readily available training data.
Typically, such synthetic query generation (QGen) approaches either condition on an input context (e.g., a document) and generate a query relevant to that context, or additionally condition the QGen model on a relevance label (e.g., relevant vs. irrelevant) to generate queries across relevance buckets.
However, we find that such QGen approaches are sub-optimal because they require the model to reason about the desired label and the input from only a handful of examples, which is non-trivial, especially when the relevance buckets are nuanced.
In this work, we propose to reduce this burden on LLMs by generating queries for different labels (e.g., relevance buckets) simultaneously.
We hypothesize that instead of asking the model to generate, say, an irrelevant query given an input context, asking it to generate an irrelevant query relative to a relevant query is a much simpler task for the model to reason about.
Extensive experimentation across seven IR datasets shows that synthetic queries generated in this fashion translate to better downstream performance, suggesting that the generated queries are indeed of higher quality.
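
To make the contrast with label-conditioned generation concrete, the following is a minimal sketch of how a few-shot prompt could request a relevant and an irrelevant query in one generation, so that the irrelevant query is written relative to the relevant one. The prompt wording, the field labels, and the names `Demo`, `build_pairwise_prompt`, and `parse_pair` are illustrative assumptions, not the prompts or code used in the paper, and no LLM call is included.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Demo:
    """One few-shot demonstration: a document plus one query per label (hypothetical schema)."""
    document: str
    relevant_query: str
    irrelevant_query: str


def build_pairwise_prompt(demos: list[Demo], target_document: str) -> str:
    """Build a prompt that asks for both queries in a single generation."""
    parts = []
    for d in demos:
        parts.append(
            f"Document: {d.document}\n"
            f"Relevant query: {d.relevant_query}\n"
            f"Irrelevant query: {d.irrelevant_query}\n"
        )
    # Leave the target open at "Relevant query:" so the model writes the
    # relevant query first and then the irrelevant one relative to it.
    parts.append(f"Document: {target_document}\nRelevant query:")
    return "\n".join(parts)


def parse_pair(completion: str) -> tuple[str, str]:
    """Split a model completion into (relevant, irrelevant) queries."""
    relevant, sep, irrelevant = completion.partition("Irrelevant query:")
    if not sep:
        raise ValueError("completion lacks an 'Irrelevant query:' line")
    return relevant.strip(), irrelevant.strip()
```

A label-conditioned baseline would instead end the prompt with a single requested label and parse one query per call. The pairwise form gives the model the relevant query as explicit context when it writes the irrelevant one.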
