
Geza Kovacs
Geza's research focuses on using LLMs for machine translation and LLM multilinguality. Prior to Google, he was Principal Research Scientist at Lilt. He obtained his PhD in Computer Science from Stanford University.
Authored Publications
Mitigating Metric Bias in Minimum Bayes Risk Decoding
Proceedings of the Ninth Conference on Machine Translation (2024), pp. 1063-1094
Minimum Bayes risk (MBR) decoding has been shown to improve translation quality on both automated metrics and human evaluations. In this paper, we show that MBR decoding tends to yield larger improvements on the utility metric and on similar metrics than on unrelated metrics. To mitigate this metric bias, we explore MBR decoding with ensembles of multiple metrics as the utility function, as well as quality estimation (QE) filtering followed by MBR decoding. Human evaluations show that using an ensemble of metrics improves quality over MBR or QE decoding with a single metric.
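As a rough illustration of MBR decoding with an ensemble utility, the sketch below scores each candidate translation against the other candidates (used as pseudo-references) and picks the one with the highest average utility. It is not the paper's implementation: the two toy metrics (`unigram_f1`, `length_ratio`), the function name `mbr_decode`, and the plain unweighted averaging across metrics are all assumptions made for this example, standing in for the learned metrics and ensembling scheme a real system would use.

```python
from typing import Callable, Sequence

# A metric maps (hypothesis, pseudo_reference) to a score; higher is better.
Metric = Callable[[str, str], float]


def unigram_f1(hyp: str, ref: str) -> float:
    """Toy stand-in metric: F1 over the sets of whitespace tokens."""
    h, r = set(hyp.split()), set(ref.split())
    overlap = len(h & r)
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(h), overlap / len(r)
    return 2 * precision * recall / (precision + recall)


def length_ratio(hyp: str, ref: str) -> float:
    """Toy stand-in metric: ratio of shorter to longer token count."""
    lh, lr = len(hyp.split()), len(ref.split())
    if max(lh, lr) == 0:
        return 0.0
    return min(lh, lr) / max(lh, lr)


def mbr_decode(candidates: Sequence[str], metrics: Sequence[Metric]) -> str:
    """Return the candidate with the highest expected ensemble utility.

    Assumption: the ensemble utility is the unweighted mean of all metrics,
    averaged over every other candidate used as a pseudo-reference.
    """
    def expected_utility(i: int) -> float:
        others = [c for j, c in enumerate(candidates) if j != i]
        if not others:
            return 0.0
        total = sum(m(candidates[i], ref) for ref in others for m in metrics)
        return total / (len(others) * len(metrics))

    best = max(range(len(candidates)), key=expected_utility)
    return candidates[best]


candidates = [
    "the cat sat on the mat",
    "the cat sat on a mat",
    "a dog ran away",
]
choice = mbr_decode(candidates, [unigram_f1, length_ratio])
print(choice)
```

Passing a single metric in the list reduces this to ordinary single-metric MBR decoding, which is the setting the abstract identifies as prone to metric bias.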
Large Language Models are Few-Shot Health Learners
Daniel McDuff
Isaac Galatzer-Levy
Jake Sunshine
Jiening Zhan
Ming-Zher Poh
Shun Liao
Paolo Di Achille
Shwetak Patel
ArXiv (2023)
Large language models (LLMs) can capture rich representations of concepts that are useful for real-world tasks. However, language alone is limited. While existing LLMs excel at text-based inferences, health applications require that models be grounded in numerical data (e.g., vital signs and laboratory values in clinical domains; steps and movement in the wellness domain) that are not easily or readily expressed as text in existing training corpora. We demonstrate that with only few-shot tuning, a large language model is capable of grounding various physiological and behavioral time-series data and making meaningful inferences on numerous health tasks in both clinical and wellness contexts. Using data from wearable and medical sensor recordings, we evaluate these capabilities on the tasks of cardiac signal analysis, physical activity recognition, metabolic calculation (e.g., calories burned), and estimation of stress reports and mental health screeners.