Dialogue to Questions Generation for Evidence-based Medical Guideline Agent Development

Zongliang Ji

Ziyang Zhang

Xincheng Tan

Matthew Thompson

Anna Goldenberg

Carl Yang

Rahul G. Krishnan

Fan Zhang

Machine Learning for Health 2025, PMLR

Abstract

Evidence-based medicine (EBM) is central to high-quality care, yet remains difficult to implement in fast-paced primary care visits. Physicians face short consultations, increasing patient loads, and lengthy guideline documents that are impractical to consult in real time. To address this gap, we investigate the feasibility of using large language models (LLMs) as ambient assistants that generate targeted, evidence-based questions during physician–patient encounters. Our study focuses on the question generation stage rather than answering, with the aim of scaffolding physician reasoning and integrating guideline-based practice into brief consultations. We implemented two prompting strategies, a zero-shot baseline and a multi-stage reasoning variant, using Gemini 2.5 as the backbone model. We evaluated outputs on a benchmark of 80 de-identified transcripts from real clinical encounters, with six experienced physicians conducting over 80 hours of structured evaluation. Results indicate that while general-purpose LLMs are not yet fully reliable, they can produce clinically meaningful and guideline-relevant questions, suggesting significant potential to reduce cognitive burden and make EBM more actionable in everyday practice.
