Efficient, Property-Aligned Fan-Out Retrieval via RL-Compiled Diffusion

Patrick Jiang; Judith Li; Moonkyung Ryu; Lily Hu; Kun Su; Zhong Yi Wan; Liam Hebert; Hao Peng; Jiawei Han; Dima Kuzmin; Craig Boutilier

Proceedings of the 43rd International Conference on Machine Learning (ICML-26), Seoul, South Korea (2026)

Abstract

Many modern retrieval problems are set-valued: given a broad intent, the system must return a collection of results that optimizes higher-order properties (e.g., diversity, coverage, complementarity, coherence) while staying grounded to a fixed database. These objectives are inherently non-decomposable, creating a training bottleneck because property-aligned (query, content) supervision is scarce. Reinforcement learning (RL) can optimize set-level objectives via interaction, but deploying an RL-tuned LLM for fan-out retrieval is expensive at query time. Diffusion-based generative retrieval enables efficient single-pass fan-out in embedding space, but requires objective-aligned training targets. We propose R4T (Retrieve-for-Train), which uses RL once as an objective transducer: (i) train a fan-out LLM with composite set-level rewards, (ii) synthesize objective-consistent training pairs, and (iii) train a lightweight diffusion retriever to model the conditional distribution of set-valued outputs. Across Polyvore and a large-scale music playlist dataset, R4T improves retrieval quality over strong baselines while reducing query-time fan-out latency by an order of magnitude.
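
To make "non-decomposable" concrete, here is a minimal sketch of a set-level reward that cannot be written as a sum of per-item scores. It is an illustration only, not the composite reward used in R4T: the function names (`cosine`, `set_reward`), the `diversity_weight` parameter, and the particular mix of mean relevance plus mean pairwise dissimilarity are all assumptions made for this example.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def set_reward(query, items, diversity_weight=0.5):
    """Hypothetical composite set-level reward (an assumption, not the paper's).

    relevance: mean cosine similarity between the query and each item.
    diversity: mean pairwise (1 - cosine similarity) across items.
    The diversity term depends on pairs of items, so the total cannot be
    split into independent per-item scores.
    """
    if not items:
        return 0.0
    relevance = sum(cosine(query, x) for x in items) / len(items)
    pairs = list(combinations(items, 2))
    diversity = sum(1.0 - cosine(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0
    return relevance + diversity_weight * diversity


# Toy 2-D embeddings: `a` matches the query exactly; `b` and `c` are less
# relevant individually but point in different directions from each other.
query = [1.0, 0.0]
a = [1.0, 0.0]
b = [0.6, 0.8]
c = [0.6, -0.8]

redundant_set = set_reward(query, [a, a])
diverse_set = set_reward(query, [b, c])
print(redundant_set, diverse_set)
```

Under these toy values the diverse pair scores higher than two copies of the single most relevant item. A per-item ranker cannot express that preference, which is why objectives of this kind call for set-level optimization.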

Research Areas

Machine intelligence
