Publications
Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.
Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.
Sort By
1 - 15 of 10132 publications
Preview abstract
Connected TV (CTV) devices blend characteristics of digital desktop and mobile devices--such as the option to log in and the ability to access a broad range of online content--and linear TV--such as a living room experience that can be shared by multiple members of a household. This blended viewing experience requires the development of measurement methods that are adapted to this novel environment. For other devices, ad measurement and planning have an established history of being guided by the ground truth of panels composed of people who share their device behavior. A CTV panel-only measurement solution for reach is not practical due to the panel size that would be needed to accurately measure smaller digital campaigns. Instead, we generalize the existing approach used to measure reach for other devices that combines panel data with other data sources (e.g., ad server logs, publisher-provided self-reported demographic data, survey data) to account for co-viewing. This paper describes data from a CTV panel and shows how this data can be used to effectively measure the aggregate co-viewing rate and fit demographic models that account for co-viewing behavior. Special considerations include data filtering, weighting at the panelist and household levels to ensure representativeness, and measurement uncertainty.
View details
SAC126 - DNSSEC Delegation Signer (DS) Record Automation
Internet Corporation for Assigned Names and Numbers (ICANN), ICANN Security and Stability Advisory Committee (SSAC) Reports and Advisories (2024), pp. 39
Preview abstract
The deployment of Domain Name System (DNS) Security Extensions (DNSSEC) has been
hindered by a number of obstacles. This report focuses on one: the management of Delegation
Signer (DS) records, which connect a child zone’s DNSSEC public key and signatures to the
chain of trust provided by its parent zone (e.g., a zone corresponding to a top-level domain).
DNSSEC is not simply enabled by signing a delegated domain’s DNS zone with DNSSEC
signatures. It is also necessary to configure (and later maintain) appropriate DS records, which
involves coordinated actions by the DNS operator, registrant, registrar, and registry.
In the case where the domain’s DNS service is operated by the registrar, this process can be
reduced to a simple internal operation by the registrar. If the functions are separated, this is not
possible. This report is therefore focused on when the domain’s DNS service is not operated by
the registrar, but by a third-party DNS operator.
In such a scenario, current practice holds the registrant responsible for coordinating DS
maintenance. The registrant (or someone appointed by them) needs to first obtain DNSSEC
public key parameters from the DNS operator, and convey these parameters to the registrar
(potentially via a reseller). The registrar will then need to relay these DNSSEC public key
parameters to the registry, who will use them to create and publish the DS record in the parent
zone. This process often involves idiosyncratic interfaces for each combination of DNS operator
and registrar, requiring a level of engagement and time investment, awareness, and
understanding that often do not match with what the registrant knows or expects. The complexity of the process further introduces opportunity for error.
This can be alleviated by employing automation for the data exchanges required for DS
maintenance so that, when the domain’s DNS service is operated by a third party, registries or
registrars can, without human involvement, obtain all information needed for keeping DS records up to date. Various approaches to achieve this are possible, such as a scheme where the registry or registrar actively contacts the Child DNS operator, or vice versa. The different approaches come with different challenges with respect to authentication, timing, and efficiency.
The IETF has standardized specifications around the first approach, where the parent pulls
information from the Child DNS operator, and operational experience has been gained over
recent years. However, some standardization gaps remain (such as to improve efficiency and
error handling). In addition, the industry could benefit from further development of best practices in deploying the technology.
The SSAC believes that automated DS maintenance should be a goal for the domain name
industry. To make this a reality, the SSAC makes several recommendations with the goal to spur
industry players and ICANN towards an industry best practice for DNSSEC DS automation.
View details
DOCCI: Descriptions of Connected and Contrasting Images
Garrett Tanzer
Jaemin Cho
Su Wang
Sunayana Rane
Zack Berger
Zarana Parekh
(2024)
Preview abstract
Despite recent advancements, text-to-image (T2I) models still exhibit critical limitations, such as errors in understanding spatial relationships, object counting, text rendering, and more. One challenge in overcoming these failure modes is the lack of resources; the majority of existing image-text datasets provide only brief captions that do not offer sufficient detail to discrepancies between images and their descriptions. To advance the development of T2I models further, we introduce \textbf{Descriptions of Connected and Contrasting Images (DOCCI)}, a dataset of 15k images taken by a single person with detailed human-annotated descriptions in English. We meticulously annotated detailed and coherent descriptions, averaging 136 words, which sufficiently differentiate images from related or similar ones. We intentionally curated images that showcase a diverse range of visual properties, including entities with their attributes, various orientations, and lighting effects, many of which are related to each other. We thoroughly analyze the quality and characteristics of the image-description pairs, and assess the performance of the latest T2I and I2T models. The experimental results indicate that the current state-of-the-art T2I models still struggle with the aforementioned challenges, and even the SOTA models have not fully addressed them. DOCCI is publicly available, and we believe that this dataset will be a valuable benchmark for vision-language research.
View details
Socio-spatial equity analysis of relative wealth index and emergency obstetric care accessibility in urban Nigeria
Kerry L. M. Wong
Aduragbemi Banke-Thomas
Tope Olubodun
Peter M. Macharia
Charlotte Stanton
Narayanan Sundararajan
Yash Shah
Mansi Kansal
Swapnil Vispute
Olakunmi Ogunyemi
Uchenna Gwacham-Anisiobi
Jia Wang
Ibukun-Oluwa Omolade Abejirinde
Prestige Tatenda Makanga
Bosede B. Afolabi
Lenka Beňová
Communications Medicine, 4 (2024), pp. 34
Preview abstract
Background
Better geographical accessibility to comprehensive emergency obstetric care (CEmOC) facilities can significantly improve pregnancy outcomes. However, with other factors, such as affordability critical for care access, it is important to explore accessibility across groups. We assessed CEmOC geographical accessibility by wealth status in the 15 most-populated Nigerian cities.
Methods
We mapped city boundaries, verified and geocoded functional CEmOC facilities, and assembled population distribution for women of childbearing age and Meta’s Relative Wealth Index (RWI). We used the Google Maps Platform’s internal Directions Application Programming Interface to obtain driving times to public and private facilities. City-level median travel time (MTT) and number of CEmOC facilities reachable within 60 min were summarised for peak and non-peak hours per wealth quintile. The correlation between RWI and MTT to the nearest public CEmOC was calculated.
Results
We show that MTT to the nearest public CEmOC facility is lowest in the wealthiest 20% in all cities, with the largest difference in MTT between the wealthiest 20% and least wealthy 20% seen in Onitsha (26 vs 81 min) and the smallest in Warri (20 vs 30 min). Similarly, the average number of public CEmOC facilities reachable within 60 min varies (11 among the wealthiest 20% and six among the least wealthy in Kano). In five cities, zero facilities are reachable under 60 min for the least wealthy 20%. Those who live in the suburbs particularly have poor accessibility to CEmOC facilities.
Conclusions
Our findings show that the least wealthy mostly have poor accessibility to care. Interventions addressing CEmOC geographical accessibility targeting poor people are needed to address inequities in urban settings.
View details
The Task-oriented Queries Benchmark (ToQB)
arXiv:2406.02943 (2024)
Preview abstract
Task-oriented queries (e.g., one-shot queries to play videos, order food, or call a taxi) are crucial for assessing the quality of virtual assistants, chatbots, and other large language model (LLM)-based services. However, a standard benchmark for task-oriented queries is not yet available, as existing benchmarks in the relevant NLP (Natural Language Processing) fields have primarily focused on task-oriented dialogues. Thus, we present a new methodology for efficiently generating the Task-oriented Queries Benchmark (ToQB) using existing task-oriented dialogue datasets and an LLM service. Our methodology involves formulating the underlying NLP task to summarize the original intent of a speaker in each dialogue, detailing the key steps to perform the devised NLP task using an LLM service, and outlining a framework for automating a major part of the benchmark generation process. Through a case study encompassing three domains (i.e., two single-task domains and one multi-task domain), we demonstrate how to customize the LLM prompts (e.g., omitting system utterances or speaker labels) for those three domains and characterize the generated task-oriented queries. The generated ToQB dataset is made available to the public.We further discuss new domains that can be added to ToQB by community contributors and its practical applications.
View details
Its All Relative! -- A Synthetic Query Generation Approach for Improving Zero-Shot Relevance Prediction
Findings of the Association for Computational Linguistics: NAACL 2024
Preview abstract
Recent developments in large language models (LLMs) have shown promise in their ability to generate synthetic query-document pairs by prompting LLMs with as few as 8 demonstrations \cite{dai2022promptagator}.
This has enabled building better IR models especially for tasks which have no training data readily available.
Typically, such synthetic query generation (QGen) approaches condition on an input context (e.g. document) and generate a query that is relevant to that context or condition the QGen model additionally on the relevance label (e.g. relevant vs irrelevant) to generate queries across relevance buckets.
However, we find that such QGen approaches are sub-optimal as it requires the model to reason about the desired label and the input from only a handful of examples, which is not trivial, especially when the relevance buckets are nuanced.
In this work, we propose to reduce this burden of LLMs by generating queries simultaneously for different labels (e.g. relevance buckets).
We hypothesize that instead of asking the model to generate, say, an irrelevant query given an input context, asking the model to generate an irrelevant query with respect to a relevant query is a much simpler task setup for the model to reason about.
Extensive experimentation across seven IR datasets shows that synthetic queries generated in such a fashion translates to a better downstream performance, suggesting that the generated queries are indeed of higher quality.
View details
Factual and Personalized Recommendation Language Modeling with Reinforcement Learning
Jihwan Jeong
Mohammad Ghavamzadeh
Proceedings of the First Conference on Language Modeling (COLM-24), Philadelphia (2024)
Preview abstract
Recommender systems (RSs) play a central role in connecting users to products, content and services by matching candidate items to users based on their preferences. While existing RSs often rely on implicit user feedback on recommended items (e.g., clicks, watches, ratings), conversational recommender systems are interacting with users to provide tailored recommendations in natural language. In this work, we aim to develop a recommender language model (LM) that is capable of generating compelling endorsement presentations of relevant items to users, to better explain the details of the items, to connect the items with users’ preferences, and to enhance the likelihood of users accepting recommendations. Specifically, such an LLM-based recommender can understand users’ preferences from users’ RS embeddings summarizing feedback history, output corresponding responses that not only are factually-grounded, but also explain whether these items satisfy users’ preferences in a convincing manner. The pivotal question is how one can gauge the performance of such a LLM recommender. Equipped with a joint reward function that measures factual consistency, convincingness, and personalization, not only can we evaluate the efficacies of different recommender LMs, but we can also utilize this metric as a form of AI feedback to fine-tune our LLM agent via reinforcement learning (RL). Building upon the MovieLens movie recommendation benchmark, we developed a novel conversational recommender delivering personalized movie narratives to users. This work lays the groundwork for recommendation systems that prioritize individualized user experiences without compromising on transparency and integrity.
View details
SkipWriter: LLM-Powered Abbreviated Writing on Tablets
Zheer Xu
Mukund Varma T
Proceedings of UIST 2024 (2024)
Preview abstract
Large Language Models (LLMs) may offer transformative opportunities for text input, especially for physically demanding modalities like handwriting. We studied a form of abbreviated handwriting by designing, developing and evaluating a prototype, named SkipWriter, that convert handwritten strokes of a variable-length, prefix- based abbreviation (e.g., “ho a y” as handwritten strokes) into the intended full phrase (e.g., “how are you” in the digital format) based
on preceding context. SkipWriter consists of an in-production hand-writing recognizer and a LLM fine-tuned on this skip-writing task. With flexible pen input, SkipWriter allows the user to add and revise prefix strokes when predictions don’t match the user’s intent. An user evaluation demonstrated a 60% reduction in motor movements with an average speed of 25.78 WPM. We also showed that this reduction is close to the ceiling of our model in an offline simulation.
View details
USM-SCD: USM-Based Multilingual Speaker Change Detection
Yongqiang Wang
Jason Pelecanos
Yu Zhang
Yiling Huang
Han Lu
ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 11801-11805
Preview abstract
We introduce a multilingual speaker change detection model (USM- SCD) that can simultaneously detect speaker turns and perform ASR for 96 languages. This model is adapted from a speech foundation model trained on a large quantity of supervised and unsupervised data, demonstrating the utility of fine-tuning from a large generic foundation model for a downstream task. We analyze the performance of this multilingual speaker change detection model through a series of ablation studies. We show that the USM-SCD model can achieve more than 75% average speaker change detection F1 score across a test set that consists of data from 96 languages. On American English, the USM-SCD model can achieve an 85.8% speaker change detection F1 score across various public and internal test sets, beating the previous monolingual baseline model by 21% relative. We also show that we only need to fine-tune one-quarter of the trainable model parameters to achieve the best model performance. The USM-SCD model exhibits state-of-the-art ASR quality compared with a strong public ASR baseline, making it suitable to handle both tasks with negligible additional computational cost.
View details
Preview abstract
This is the seventh installment of the Developer Productivity for Humans column. This installment focuses on software quality: what it means, how developers see it, how we break it down into 4 types of quality, and the impact these have on each other.
View details
Preview abstract
This paper explores the integration of model-based and data-driven approaches within the realm of neural speech and audio coding systems. It highlights the challenges posed by the subjective evaluation processes of speech and audio codecs and discusses the limitations of purely data-driven approaches, which often require inefficiently large architectures to match the performance of model-based methods. The study presents hybrid systems as a viable solution, offering significant improvements to the performance of conventional codecs through meticulously chosen design enhancements. Specifically, it introduces a neural network-based signal enhancer designed to post-process existing codecs’ output, along with the autoencoder-based end-to-end models and LPCNet—hybrid systems that combine linear predictive coding (LPC) with neural networks. Furthermore, the paper delves into predictive models operating within custom feature spaces (TF-Codec) or predefined transform domains (MDCTNet) and examines the use of psychoacoustically calibrated loss functions to train end-to-end neural audio codecs. Through these investigations, the paper demonstrates the potential of hybrid systems to advance the field of speech and audio coding by bridging the gap between traditional model-based approaches and modern data-driven techniques.
View details
Solidarity not Charity! Empowering Local Communities for Disaster Relief during COVID-19 through Grassroots Support
Jeongwon Jo
Oluwafunke Alliyu
John M. Carroll
Computer Supported Cooperative Work (2024) (2024)
Preview abstract
The COVID-19 pandemic brought wide-ranging, unanticipated societal changes as communities rushed to slow the spread of the novel coronavirus. In response, mutual aid groups bloomed online across the United States to fill in the gaps in social services and help local communities cope with infrastructural breakdowns. Unlike many previous disasters, the long-haul nature of COVID-19 necessitates sustained disaster relief efforts. In this paper, we conducted an interview study with online mutual aid group administrators to understand how groups facilitated disaster relief, and how disaster relief initiatives developed and maintained over the course of the first year of COVID-19. Our findings suggest that the groups were crucial sources of community-based support for immediate needs, innovated long-term solutions for chronic community issues and grew into a vehicle for justice-centered work. Our insights shed light on the strength of mutual aid as a community capacity that can support communities to collectively be more prepared for future long-haul disasters than they were with COVID-19.
View details
Augmentations vs Algorithms: What Works in Self-Supervised Learning
Warren Morningstar
Alex Bijamov
Chris Duvarney
Luke Friedman
Neha Kalibhat
Philip Mansfield
Renan Rojas-Gomez
Karan Singhal
Bradley Green
Sushant Prakash
Arxiv (2024) (to appear)
Preview abstract
We study the relative effects of data augmentations, pretraining algorithms, and model architectures in Self-Supervised Learning (SSL). While the recent literature in this space leaves the impression that the pretraining algorithm is of critical importance to performance, understanding its effect is complicated by the difficulty in making objective and direct comparisons between methods. We propose a new framework which unifies many seemingly disparate SSL methods into a single shared template. Using this framework, we identify aspects in which methods differ and observe that in addition to changing the pretraining algorithm, many works also use new data augmentations or more powerful model architectures. We compare several popular SSL methods using our framework and find that many algorithmic additions, such as prediction networks or new losses, have a minor impact on downstream task performance (often less than 1%), while enhanced augmentation techniques offer more significant performance improvements (2−4%). Our findings challenge the premise that SSL is being driven primarily by algorithmic improvements, and suggest instead a bitter lesson for SSL: that augmentation diversity and data / model scale are more critical contributors to recent advances in self-supervised learning.
View details
Preview abstract
We present shadow Hamiltonian simulation, a framework for simulating quantum dynamics
using a compressed quantum state that we call the “shadow state”. The amplitudes of this
shadow state are proportional to the expectations of a set of operators of interest. The shadow
state evolves according to its own Schrodinger equation, and under broad conditions can be
simulated on a quantum computer. We analyze a number of applications of this framework to quantum simulation problems. This includes simulating the dynamics of exponentially large systems of free fermions, or exponentially large systems of free bosons, the latter example recovering a recent algorithm for simulating exponentially many classical harmonic oscillators. Shadow Hamiltonian simulation can be extended to simulate expectations of more complex operators such as two-time correlators or Green’s functions, and to study the evolution of operators themselves in the Heisenberg picture
View details
Stable quantum-correlated many-body states through engineered dissipation
Xiao Mi
Alexios Michailidis
Sara Shabani
Jerome Lloyd
Rajeev Acharya
Igor Aleiner
Trond Andersen
Markus Ansmann
Frank Arute
Kunal Arya
Juan Atalaya
Gina Bortoli
Alexandre Bourassa
Leon Brill
Michael Broughton
Bob Buckley
Tim Burger
Nicholas Bushnell
Jimmy Chen
Benjamin Chiaro
Desmond Chik
Charina Chou
Josh Cogan
Roberto Collins
Paul Conner
William Courtney
Alex Crook
Ben Curtin
Alejo Grajales Dau
Dripto Debroy
Agustin Di Paolo
ILYA Drozdov
Andrew Dunsworth
Lara Faoro
Edward Farhi
Reza Fatemi
Vinicius Ferreira
Ebrahim Forati
Brooks Foxen
Élie Genois
William Giang
Dar Gilboa
Raja Gosula
Steve Habegger
Michael Hamilton
Monica Hansen
Sean Harrington
Paula Heu
Markus Hoffmann
Trent Huang
Ashley Huff
Bill Huggins
Sergei Isakov
Justin Iveland
Cody Jones
Pavol Juhas
Kostyantyn Kechedzhi
Marika Kieferova
Alexei Kitaev
Andrey Klots
Alexander Korotkov
Fedor Kostritsa
John Mark Kreikebaum
Dave Landhuis
Pavel Laptev
Kim Ming Lau
Lily Laws
Joonho Lee
Kenny Lee
Yuri Lensky
Alexander Lill
Wayne Liu
Orion Martin
Amanda Mieszala
Shirin Montazeri
Alexis Morvan
Ramis Movassagh
Wojtek Mruczkiewicz
Charles Neill
Ani Nersisyan
Michael Newman
JiunHow Ng
Murray Ich Nguyen
Tom O'Brien
Alex Opremcak
Andre Petukhov
Rebecca Potter
Leonid Pryadko
Charles Rocque
Negar Saei
Kannan Sankaragomathi
Henry Schurkus
Christopher Schuster
Mike Shearn
Aaron Shorter
Noah Shutty
Vladimir Shvarts
Jindra Skruzny
Clarke Smith
Rolando Somma
George Sterling
Doug Strain
Marco Szalay
Alfredo Torres
Guifre Vidal
Cheng Xing
Jamie Yao
Ping Yeh
Juhwan Yoo
Grayson Young
Yaxing Zhang
Ningfeng Zhu
Jeremy Hilton
Anthony Megrant
Yu Chen
Vadim Smelyanskiy
Dmitry Abanin
Science, 383 (2024), pp. 1332-1337
Preview abstract
Engineered dissipative reservoirs have the potential to steer many-body quantum systems toward correlated steady states useful for quantum simulation of high-temperature superconductivity or quantum magnetism. Using up to 49 superconducting qubits, we prepared low-energy states of the transverse-field Ising model through coupling to dissipative auxiliary qubits. In one dimension, we observed long-range quantum correlations and a ground-state fidelity of 0.86 for 18 qubits at the critical point. In two dimensions, we found mutual information that extends beyond nearest neighbors. Lastly, by coupling the system to auxiliaries emulating reservoirs with different chemical potentials, we explored transport in the quantum Heisenberg model. Our results establish engineered dissipation as a scalable alternative to unitary evolution for preparing entangled many-body states on noisy quantum processors.
View details