Authored Publications
Substance or Style: What Does Your Image Embedding Know?
Charles Herrmann
Chun-Sung Ferng
Dilip Krishnan
NeurIPS 2023 Workshop on Distribution Shifts (DistShift) New Frontiers with Foundation Models
Probes are small networks that predict properties of underlying data from embeddings, and they provide a targeted, effective way to illuminate the information contained in embeddings. While analysis through the use of probes has become standard in NLP, there has been much less exploration in vision. Image foundation models have primarily been evaluated for semantic content. Better understanding the non-semantic information in popular embeddings (e.g., MAE, SimCLR, or CLIP) will shed new light both on the training algorithms and on the uses for these foundation models. We design a systematic transformation prediction task and measure the visual content of embeddings along many axes, including image style, quality, and a range of natural and artificial transformations. Surprisingly, six embeddings (including SimCLR) encode enough non-semantic information to identify dozens of transformations. We also consider a generalization task, where we group similar transformations and hold out several for testing. We find that image-text models (CLIP and ALIGN) are better at recognizing new examples of style transfer than masking-based models (CAN and MAE). Overall, our results suggest that the choice of pre-training algorithm impacts the types of information in the embedding, and certain models are better than others for non-semantic downstream tasks.
An Adversarial Variational Inference Approach for Travel Demand Calibration of Urban Traffic Simulators
Martin Mladenov
Proceedings of the 30th ACM SIGSPATIAL Intl. Conf. on Advances in Geographic Information Systems (SIGSPATIAL-22), Seattle, WA (2022) (to appear)
This paper considers the calibration of travel demand inputs, defined as a set of origin-destination matrices (ODs), for stochastic microscopic urban traffic simulators. The goal of calibration is to find a (set of) travel demand input(s) that replicate sparse field count data statistics. While traditional approaches use only first-order moment information from the field data, it is well known that the OD calibration problem is underdetermined in realistic networks. We study the value of using higher-order statistics from spatially sparse field data to mitigate underdetermination, proposing a variational inference technique that identifies an OD distribution. We apply our approach to a high-dimensional setting in Salt Lake City, Utah. Our approach is flexible—it can be readily extended to account for arbitrary types of field data (e.g., road, path or trip data).
Large generative language models such as GPT-2 are well known not only for their ability to generate highly realistic text but also for their utility in common downstream tasks. However, how and in what settings one can best leverage these powerful language models is still a nascent research question. In this work, we explore their use in predicting "language quality", a notion of coherence and understandability of text. Our key finding is that, when trained in a self-discriminating fashion, large language models emerge as unsupervised predictors for such language quality. This enables fast bootstrapping of quality indicators in a low-resource setting. We conduct extensive qualitative and quantitative analysis over 500 million web articles, the largest-scale study conducted on this topic.
An Efficient Simulation-Based Travel Demand Calibration Algorithm for Large-Scale Metropolitan Traffic Models
Yechen Li
Yi-fan Chen
Ziheng Lin
(2021) (to appear)
Metropolitan-scale vehicular traffic modeling is used by a variety of private and public sector urban mobility stakeholders to inform the design and operations of road networks. High-resolution stochastic traffic simulators are increasingly used to describe detailed demand-supply interactions. The design of efficient calibration techniques remains a major challenge. This paper considers a class of high-dimensional calibration problems known as origin-destination (OD) calibration. We formulate the problem as a continuous simulation-based optimization problem. Our proposed algorithm builds upon recent metamodel methods that tackle the simulation-based problem by solving a sequence of approximate analytical optimization problems, which rely on the use of analytical network models. In this paper, we formulate a network model defined as a system of linear equations, the dimension of which scales linearly with the number of roads in the network and independently of the dimension of the route choice set. This makes the approach suitable for large-scale metropolitan networks. The approach has enhanced efficiency compared with past metamodel formulations that are based on systems of nonlinear, rather than linear, equations. It also has enhanced efficiency compared to traditional calibration methods that resort to simulation-based estimates of traffic assignment matrices, while the proposed approach uses analytical approximations of these matrices. We benchmark the approach considering a peak period Salt Lake City case study and calibrate based on field vehicular count data. The new formulation yields solutions with good performance, reduces the compute time needed, is suitable for large-scale road networks, and can be readily extended to account for other types of field data sources.
Quantifying the sustainability impact of Google Maps: A case study of Salt Lake City
Theophile Cabannes
Yechen Li
Marc Nunkesser
2021 (2021) (to appear)
Google Maps uses current and historical traffic trends to provide routes to drivers. In this paper, we use microscopic traffic simulation to quantify the improvements to both travel time and CO2 emissions from Google Maps real-time navigation. A case study in Salt Lake City shows that Google Maps users are, on average, saving 1.7% of CO2 emissions and 6.5% of travel time. If we restrict the analysis to users for whom Google Maps finds a different route than their original route, the average savings are 3.4% of CO2 emissions and 12.5% of travel time. These results are based on traffic conditions observed during the Covid-19 pandemic. As congestion gradually builds back up to pre-pandemic levels, it is expected to lead to even greater savings in emissions.
This paper seeks to develop a deeper understanding of the fundamental properties of neural text generation models. The study of artifacts that emerge in machine-generated text as a result of modeling choices is a nascent research area, and the extent to which these artifacts surface in generated text is still unclear. In the spirit of better understanding generative text models and their artifacts, we propose the new task of distinguishing which of several variants of a given model generated some piece of text. Specifically, we conduct an extensive suite of diagnostic tests to observe whether modeling choices (e.g., sampling methods, top-k probabilities, model architectures, etc.) leave detectable artifacts in the text they generate. Our key finding, which is backed by a rigorous set of experiments, is that such artifacts are present and that different modeling choices can be inferred by looking at generated text alone. This suggests that neural text generators may actually be more sensitive to various modeling choices than previously thought.
Graph-RISE: Graph-Regularized Image Semantic Embedding
Aleksei Timofeev
Futang Peng
Krishnamurthy Viswanathan
Lucy Gao
Sujith Ravi
Yi-ting Chen
Zhen Li
The 12th International Conference on Web Search and Data Mining (2020) (to appear)
Learning image representations that capture instance-based semantics has been a challenging and important task for enabling many applications such as image search and clustering. In this paper, we explore the limits of image embedding learning at unprecedented scale and granularity. We present Graph-RISE, an image embedding that captures very fine-grained, instance-level semantics. Graph-RISE is learned via a large-scale, neural graph learning framework that leverages graph structure to regularize the training of deep neural networks. To the best of our knowledge, this is the first work that can capture instance-level image semantics at million scale, O(40M). Experimental results show that Graph-RISE outperforms state-of-the-art image embedding algorithms on several evaluation tasks, including image classification and triplet ranking. We also provide case studies to demonstrate that, qualitatively, image retrieval based on Graph-RISE captures the semantics well and differentiates nuances at the instance level.
Work in information retrieval has traditionally been focused on ranking and relevance: for a user's query, fetch some number of results, ordered by relevance to the user. However, the problem of determining how many results to return, i.e. how to optimally truncate the ranked result list, has received far less attention despite being of critical importance in a range of applications. Such truncation is a balancing act between the overall relevance, or usefulness, of the results and the user cost of processing more results. In this work, we propose Choppy, an assumption-free model based on the widely successful Transformer architecture in NLP, for the ranked-list truncation problem. Needing nothing more than the relevance scores of the results, the model uses a powerful multi-head attention mechanism to directly optimize any user-defined target IR metric. We show Choppy improves upon recent, state-of-the-art baselines on Robust04.
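To make the truncation trade-off concrete, the following is a minimal sketch of how a target metric varies with the cutoff position. It is not the Choppy model: it assumes ground-truth binary relevance labels are available (an oracle setting) and uses F1 as an example target metric. The function names and the toy labels are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def f1_at_cutoff(relevant: List[int], k: int, total_relevant: int) -> float:
    """F1 of the top-k prefix of a ranked list with binary relevance labels."""
    if k == 0 or total_relevant == 0:
        return 0.0
    hits = sum(relevant[:k])
    if hits == 0:
        return 0.0
    precision = hits / k
    recall = hits / total_relevant
    return 2 * precision * recall / (precision + recall)


def best_truncation(relevant: List[int]) -> Tuple[int, float]:
    """Return the cutoff k (0..len) that maximizes F1, and the F1 it achieves."""
    total = sum(relevant)
    scores = [f1_at_cutoff(relevant, k, total) for k in range(len(relevant) + 1)]
    best_k = max(range(len(scores)), key=scores.__getitem__)
    return best_k, scores[best_k]


# Hypothetical ranked list: 1 = relevant, 0 = not relevant, best-ranked first.
labels = [1, 1, 0, 1, 0, 0]
best_k, best_f1 = best_truncation(labels)
print(best_k, round(best_f1, 3))
```

In this toy list, cutting after the second result gives perfect precision but misses a relevant result, while returning everything halves precision. A learned truncation model has to locate the best cutoff from relevance scores alone, without access to the labels used here.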
Graph Agreement Models for Semi-supervised Learning
Krishnamurthy Viswanathan
Anthony Platanios
Sujith Ravi
Proceedings of the Thirty-third Conference on Neural Information Processing Systems, NeurIPS 2019
Graph-based algorithms are among the most successful paradigms for solving semi-supervised learning tasks. Recent work on graph convolutional networks and neural graph learning methods has successfully combined the expressiveness of neural networks with graph structures. We propose a technique that, when applied to these methods, achieves state-of-the-art results on semi-supervised learning datasets. Traditional graph-based algorithms, such as label propagation, were designed with the underlying assumption that the label of a node can be imputed from that of the neighboring nodes. However, real-world graphs are either noisy or have edges that do not correspond to label agreement. To address this, we propose Graph Agreement Models (GAM), which introduces an auxiliary model that predicts the probability of two nodes sharing the same label as a learned function of their features. The agreement model is used when training a node classification model by encouraging agreement only for the pairs of nodes it deems likely to have the same label, thus guiding its parameters to better local optima. The classification and agreement models are trained jointly in a co-training fashion. Moreover, GAM can also be applied to any semi-supervised classification problem, by inducing a graph whenever one is not provided. We demonstrate that our method achieves a relative improvement of up to 72% for various node classification models, and obtains state-of-the-art results on multiple established datasets.
The classical Multinomial Logit (MNL) is a behavioral model for user choice. In this model, a user is offered a slate of choices (a subset of a finite universe of n items), and selects exactly one item from the slate, each with probability proportional to its (positive) weight. Given a set of observed slates and choices, the likelihood-maximizing item weights are easy to learn at scale, and easy to interpret. However, the model fails to represent common real-world behavior. As a result, researchers in user choice often turn to mixtures of MNLs, which are known to approximate a large class of models of rational user behavior. Unfortunately, the only known algorithms for this problem have been heuristic in nature. In this paper we give the first polynomial-time algorithms for exact learning of uniform mixtures of two MNLs. Interestingly, the parameters of the model can be learned for any n by sampling the behavior of random users only on slates of sizes 2 and 3; in contrast, we show that slates of size 2 are insufficient by themselves.
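The choice rule described above can be sketched in a few lines. This is only an illustration of the MNL definition and of a uniform mixture of two MNLs, not the learning algorithm from the paper; the function names and item weights are hypothetical.

```python
from typing import Dict, Iterable


def mnl_choice_probs(weights: Dict[str, float], slate: Iterable[str]) -> Dict[str, float]:
    """Single MNL: each slate item is chosen with probability proportional to its weight."""
    items = list(slate)
    total = sum(weights[item] for item in items)
    return {item: weights[item] / total for item in items}


def uniform_mixture_probs(
    weights_a: Dict[str, float], weights_b: Dict[str, float], slate: Iterable[str]
) -> Dict[str, float]:
    """Uniform mixture of two MNLs: a random user follows either model with probability 1/2."""
    items = list(slate)
    probs_a = mnl_choice_probs(weights_a, items)
    probs_b = mnl_choice_probs(weights_b, items)
    return {item: 0.5 * probs_a[item] + 0.5 * probs_b[item] for item in items}


# Hypothetical positive item weights for the two mixture components.
weights_a = {"x": 1.0, "y": 3.0, "z": 4.0}
weights_b = {"x": 3.0, "y": 1.0, "z": 4.0}

pair_probs = uniform_mixture_probs(weights_a, weights_b, ["x", "y"])
triple_probs = uniform_mixture_probs(weights_a, weights_b, ["x", "y", "z"])
print(pair_probs)
print(triple_probs)
```

With these assumed weights, the two components disagree strongly on the slate {x, y}, yet the mixture chooses each item half of the time, so the observed pairwise behavior hides the difference between the components.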