The Anatomy of a Personal Health Agent

Ahmed Metwally
Ken Gu
Jiening Zhan
Kumar Ayush
Hong Yu
Amy Lee
Qian He
Zhihan Zhang
Isaac Galatzer-Levy
Xavi Prieto
Andrew Barakat
Ben Graef
Yuzhe Yang
Daniel McDuff
Brent Winslow
Shwetak Patel
Girish Narayanswamy
Conor Heneghan
Max Xu
Jacqueline Shreibati
Mark Malhotra
Orson Xu
Tim Althoff
Tony Faranesh
Nova Hammerquist
Vidya Srinivas
arXiv (2025)

Abstract

Health is a fundamental pillar of human wellness, and rapid advances in large language models (LLMs) have driven the development of a new generation of health agents. However, how to meet the diverse needs of individuals in everyday non-clinical settings remains underexplored. In this work, we aim to build a comprehensive personal health assistant that can reason about multimodal data from everyday consumer devices and personal health records. To understand end users’ needs when interacting with such an assistant, we conducted an in-depth analysis of user query data, alongside qualitative insights from users and experts gathered through a user-centered design process. Based on these findings, we identified three major categories of consumer health needs, each of which is supported by a specialist subagent: (1) a data science agent that analyzes both personal and population-level time-series wearable and health record data to provide numerical health insights, (2) a health domain expert agent that integrates users’ health and contextual data to generate accurate, personalized insights based on medical and contextual user knowledge, and (3) a health coach agent that synthesizes data insights and drives multi-turn user interactions and interactive goal setting, guiding users with a specified psychological strategy and tracking their progress. Furthermore, we propose and develop a multi-agent framework, Personal Health Insight Agent Team (PHIAT), that enables dynamic, personalized interactions to address individual health needs. To evaluate these individual agents and the multi-agent system, we develop a set of N benchmark tasks and conduct both automated and human evaluations, involving hundreds of hours of evaluation from health experts and hundreds of hours of evaluation from end users.
Our work establishes a strong foundation towards the vision of a personal health assistant accessible to everyone in the future and represents the most comprehensive evaluation of a consumer AI health agent to date.
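
As an illustration of the three-specialist structure described in the abstract, the following is a minimal sketch of how a query might be dispatched to one of the three subagents. It is not the PHIAT implementation: the abstract does not specify how routing is done, so the keyword-based `route` function, the `ROUTING_KEYWORDS` table, and the stub agent functions are all hypothetical stand-ins chosen only to make the example runnable.

```python
from typing import Callable, Dict, Tuple

# Stub specialists (hypothetical): a real agent would call an LLM with tools;
# these only tag the query so the dispatch logic can be exercised.
def data_science_agent(query: str) -> str:
    return f"[data science] numerical analysis for: {query}"

def domain_expert_agent(query: str) -> str:
    return f"[domain expert] personalized health context for: {query}"

def health_coach_agent(query: str) -> str:
    return f"[health coach] goal-setting dialogue for: {query}"

SPECIALISTS: Dict[str, Callable[[str], str]] = {
    "data_science": data_science_agent,
    "domain_expert": domain_expert_agent,
    "coach": health_coach_agent,
}

# Assumption: simple keyword matching stands in for whatever routing
# mechanism the actual framework uses.
ROUTING_KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "data_science": ("average", "trend", "compare", "correlat"),
    "coach": ("goal", "habit", "motivat", "plan"),
}

def route(query: str) -> str:
    """Return the name of the specialist that should handle the query."""
    lowered = query.lower()
    for name, keywords in ROUTING_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return name
    # Fall back to the domain expert for general health questions.
    return "domain_expert"

def answer(query: str) -> str:
    """Dispatch the query to the selected specialist and return its reply."""
    return SPECIALISTS[route(query)](query)

if __name__ == "__main__":
    print(answer("What is my average resting heart rate trend?"))
    print(answer("Help me set a goal for better sleep"))
    print(answer("Is a resting heart rate of 55 normal?"))
```

In this sketch each query goes to exactly one specialist; a framework supporting the multi-turn, dynamic interactions described above would presumably combine several specialists per conversation.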