Lora Aroyo

I am a research scientist at Google Research NYC, where I work on Data Excellence for AI. My team, DEER (Data Excellence for Evaluating Responsibly), is part of the Responsible AI (RAI) organization. Our work focuses on developing metrics and methodologies to measure the quality of human-labeled and machine-generated data, specifically for gathering and evaluating adversarial data used in the safety evaluation of generative AI systems. I received an MSc in Computer Science from Sofia University, Bulgaria, and a PhD from the University of Twente, The Netherlands.

I currently serve as co-chair of the steering committee for the AAAI HCOMP conference series, and I am a founding member of DataPerf and of the AI Safety Benchmarking working group, both at MLCommons, for benchmarking data-centric AI. Check out our data-centric challenge Adversarial Nibbler, supported by Kaggle, Hugging Face, and MLCommons. In 2023 I gave the opening keynote at the NeurIPS Conference, "The Many Faces of Responsible AI".

Prior to joining Google, I was a computer science professor heading the User-Centric Data Science research group at VU University Amsterdam. Jointly with the Watson team at IBM, our group invented the CrowdTruth crowdsourcing method, which has been applied in domains such as digital humanities, medicine, and online multimedia. I also guided human-in-the-loop strategies as Chief Scientist at Tagasauris, a NY-based startup.

My prior community contributions include serving as president of the User Modeling Society, program co-chair of The Web Conference 2023, and member of the ACM SIGCHI conferences board.

For a list of my publications, please see my profile on Google Scholar.

Authored Publications
AART: AI-Assisted Red-Teaming with Diverse Data Generation for New LLM-powered Applications
Bhaktipriya Radharapu
The 2023 Conference on Empirical Methods in Natural Language Processing (2023) (to appear)
Adversarial Nibbler: A DataPerf Challenge for Text-to-Image Models
Hannah Kirk
Jessica Quaye
Charvi Rastogi
Max Bartolo
Oana Inel
D. Sculley
Meg Risdal
Will Cukierski
Vijay Reddy
Online (2023)
LaMDA: Language Models for Dialog Applications
Aaron Daniel Cohen
Alena Butryna
Alicia Jin
Apoorv Kulshreshtha
Ben Zevenbergen
Chung-ching Chang
Cosmo Du
Daniel De Freitas Adiwardana
Dehao Chen
Dmitry (Dima) Lepikhin
Erin Hoffman-John
Igor Krivokon
James Qin
Jamie Hall
Joe Fenton
Johnny Soraker
Kathy Meier-Hellstern
Maarten Paul Bosma
Marc Joseph Pickett
Marcelo Amorim Menegali
Marian Croak
Maxim Krikun
Noam Shazeer
Rachel Bernstein
Ravi Rajakumar
Ray Kurzweil
Romal Thoppilan
Steven Zheng
Taylor Bos
Toju Duke
Tulsee Doshi
Vincent Y. Zhao
Will Rusch
Yuanzhong Xu
Zhifeng Chen
arXiv (2022)
The Reasonable Effectiveness of Diverse Evaluation Data
Christopher Homan
Alex Taylor
Human Evaluation for Generative Models (HEGM) Workshop at NeurIPS2022