Chih-wei Hsu
Authored Publications
Minimizing Live Experiments in Recommender Systems: User Simulation to Evaluate Preference Elicitation Policies
Martin Mladenov
James Pine
Hubert Pham
Shane Li
Xujian Liang
Anton Polishko
Li Yang
Ben Scheetz
Proceedings of the 47th International ACM/SIGIR Conference on Research and Development in Information Retrieval (SIGIR-24), Washington, DC (2024), pp. 2925-2929
Evaluation of policies in recommender systems (RSs) typically involves A/B testing using live experiments on real users to assess a new policy's impact on relevant metrics. This "gold standard" comes at a high cost, however, in terms of cycle time, user cost, and potential user retention. In developing policies for onboarding new users, these costs can be especially problematic, since onboarding occurs only once. In this work, we describe a simulation methodology used to augment (and reduce) the use of live experiments. We illustrate its deployment for the evaluation of preference elicitation algorithms used to onboard new users of the YouTube Music platform. By developing counterfactually robust user behavior models, and a simulation service that couples such models with production infrastructure, we are able to test new algorithms in a way that reliably predicts their performance on key metrics when deployed live, sometimes more reliably than live experiments due to the scale at which simulation can be realized. We describe our domain, our simulation models and platform, results of experiments and deployment, and suggest future steps needed to further realistic simulation as a powerful complement to live experiments.
Factual and Personalized Recommendation Language Modeling with Reinforcement Learning
Jihwan Jeong
Mohammad Ghavamzadeh
Proceedings of the First Conference on Language Modeling (COLM-24), Philadelphia (2024)
Recommender systems (RSs) play a central role in connecting users to products, content and services by matching candidate items to users based on their preferences. While existing RSs often rely on implicit user feedback on recommended items (e.g., clicks, watches, ratings), conversational recommender systems interact with users to provide tailored recommendations in natural language. In this work, we aim to develop a recommender language model (LM) that is capable of generating compelling endorsement presentations of relevant items to users, to better explain the details of the items, to connect the items with users’ preferences, and to enhance the likelihood of users accepting recommendations. Specifically, such an LLM-based recommender can understand users’ preferences from users’ RS embeddings summarizing feedback history, and output corresponding responses that not only are factually-grounded, but also explain whether these items satisfy users’ preferences in a convincing manner. The pivotal question is how one can gauge the performance of such an LLM recommender. Equipped with a joint reward function that measures factual consistency, convincingness, and personalization, not only can we evaluate the efficacies of different recommender LMs, but we can also utilize this metric as a form of AI feedback to fine-tune our LLM agent via reinforcement learning (RL). Building upon the MovieLens movie recommendation benchmark, we developed a novel conversational recommender delivering personalized movie narratives to users. This work lays the groundwork for recommendation systems that prioritize individualized user experiences without compromising on transparency and integrity.
Discovering Personalized Semantics for Soft Attributes in Recommender Systems using Concept Activation Vectors
Christina Göpfert
Alex Haig
Ivan Vendrov
Tyler Lu
Hubert Pham
Mohammad Ghavamzadeh
ACM Transactions on Recommender Systems (2024)
Interactive recommender systems have emerged as a promising paradigm to overcome the limitations of the primitive user feedback used by traditional recommender systems (e.g., clicks, item consumption, ratings). They allow users to express intent, preferences, constraints, and contexts in a richer fashion, often using natural language (including faceted search and dialogue).
Yet more research is needed to find the most effective ways to use this feedback. One challenge is inferring a user's semantic intent from the open-ended terms or attributes often used to describe a desired item, and using it to refine recommendation results. Leveraging concept activation vectors (CAVs) (Kim et al., 2018), a recently developed approach for model interpretability in machine learning, we develop a framework to learn a representation that captures the semantics of such attributes and connects them to user preferences and behaviors in recommender systems. One novel feature of our approach is its ability to distinguish objective and subjective attributes (both subjectivity of degree and of sense), and associate different senses of subjective attributes with different users. We demonstrate on both synthetic and real-world data sets that our CAV representation not only accurately interprets users' subjective semantics, but can also be used to improve recommendations through interactive item critiquing.
Embedding-Aligned Language Models
Thirty-Eighth Annual Conference on Neural Information Processing Systems (NeurIPS-24), Vancouver (2024)
We propose a novel approach for training large language models (LLMs) to adhere to objectives imposed by a latent embedding space. Our method leverages reinforcement learning (RL), treating a pre-trained LLM as an environment. An Embedding-Aligned Guided LanguagE (EAGLE) agent is trained using a significantly smaller language model to iteratively steer the LLM's generation towards optimal regions of a latent embedding space, given some predefined criteria. We demonstrate the effectiveness of the EAGLE agent using the MovieLens 25M dataset, on extrapolation tasks for content gap to satisfy latent user demand, and multi-attribute satisfaction for generating creative variations of entities. Our work paves the way for controlled and grounded text generation using LLMs, ensuring consistency with domain-specific knowledge and data representations.
DynaMITE-RL: A Dynamic Model for Improved Temporal Meta Reinforcement Learning
Anthony Liang
Erdem Biyik
Thirty-Eighth Annual Conference on Neural Information Processing Systems (NeurIPS-24), Vancouver (2024)
We introduce a meta-reinforcement learning (meta-RL) approach, called DynaMITE-RL, to perform approximate inference in environments where the latent information evolves slowly between subtrajectories called sessions.
We identify three key modifications to contemporary meta-RL methods: consistency of latent information during sessions, session masking, and prior latent conditioning.
We demonstrate the necessity of these modifications on various downstream applications from discrete Gridworld environments to continuous control and simulated robot assistive tasks and find that our approach significantly outperforms contemporary baselines.
Demystifying Embedding Spaces using Large Language Models
Jihwan Jeong
Lior Shani
Martin Mladenov
The Twelfth International Conference on Learning Representations (2024)
Embeddings have become a pivotal means to represent complex, multi-faceted information about entities, concepts, and relationships in a condensed and useful format. Nevertheless, they often preclude direct interpretation. While downstream tasks make use of these compressed representations, meaningful interpretation usually requires visualization using dimensionality reduction or specialized machine learning interpretability methods. This paper addresses the challenge of making such embeddings more interpretable and broadly useful, by employing large language models (LLMs) to directly interact with embeddings -- transforming abstract vectors into understandable narratives. By injecting embeddings into LLMs, we enable querying and exploration of complex embedding data. We demonstrate our approach on a variety of diverse tasks, including: enhancing concept activation vectors (CAVs), communicating novel embedded entities, and decoding user preferences in recommender systems. Our work couples the immense information potential of embeddings with the interpretative power of LLMs.
Discovering Personalized Semantics for Soft Attributes in Recommender Systems using Concept Activation Vectors
Christina Göpfert
Ivan Vendrov
Tyler Lu
WWW22: The Web Conference 2022, Lyon, France, pp. 2411-2421
Interactive Recommender Systems (RSs) have emerged as a promising paradigm to overcome the limitations of the primitive user feedback used by traditional RSs (e.g., clicks, item consumption, ratings), allowing users to express intent, preferences, constraints, and contexts in a richer fashion using natural language. Still, more research is needed to find the most effective ways to use this feedback. One major challenge is inferring a user's semantic intent from the open-ended terms (say, attributes or tags) used to describe a desired item, and utilizing that to refine recommendation results.
Leveraging Concept Activation Vectors (CAVs) [13], we develop a framework to learn a representation that captures the semantics of such attributes and connects them to user preferences and behaviors in RSs. One novel feature of our approach is its ability to distinguish objective and subjective attributes (including subjectivity of degree and of sense) and associate different senses of subjective attributes with different users. We demonstrate on both synthetic and real-world datasets that our CAV representation not only accurately interprets users' subjective semantics, but can also be used to improve recommendations.
An Adversarial Variational Inference Approach for Travel Demand Calibration of Urban Traffic Simulators
Martin Mladenov
Proceedings of the 30th ACM SIGSPATIAL Intl. Conf. on Advances in Geographic Information Systems (SIGSPATIAL-22), Seattle, WA (2022) (to appear)
This paper considers the calibration of travel demand inputs, defined as a set of origin-destination matrices (ODs), for stochastic microscopic urban traffic simulators. The goal of calibration is to find a (set of) travel demand input(s) that replicate sparse field count data statistics. While traditional approaches use only first-order moment information from the field data, it is well known that the OD calibration problem is underdetermined in realistic networks. We study the value of using higher-order statistics from spatially sparse field data to mitigate underdetermination, proposing a variational inference technique that identifies an OD distribution. We apply our approach to a high-dimensional setting in Salt Lake City, Utah. Our approach is flexible—it can be readily extended to account for arbitrary types of field data (e.g., road, path or trip data).
Meta-Thompson Sampling
Branislav Kveton
Michael Konobeev
Martin Mladenov
Proceedings of the 38th International Conference on Machine Learning (ICML 2021), pp. 5884-5893
Efficient exploration in multi-armed bandits is a fundamental online learning problem. In this work, we propose a variant of Thompson sampling that learns to explore over time by interacting with problem instances sampled from an unknown prior distribution. This algorithm meta-learns the prior and therefore we call it Meta-TS. We propose efficient implementations of Meta-TS and analyze it in Gaussian bandits. Our analysis captures the improvement due to learning the prior and is of a broader interest, because we derive the first prior-dependent upper bound on the Bayes regret. Our regret bound is complemented by empirical evaluation, which shows that Meta-TS quickly adapts to the unknown prior.
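As a rough illustration of the idea in this abstract, the sketch below runs Gaussian Thompson sampling over a sequence of bandit tasks and reuses what it learned in earlier tasks as the prior for the next one. It is not the paper's Meta-TS algorithm: here the unknown prior means are replaced by a simple plug-in running average of per-task empirical arm means, which is an assumption made for brevity and which ignores the bias introduced by adaptive sampling. All function and parameter names (`gaussian_posterior`, `run_task`, `meta_loop`, `hidden_prior_means`) are hypothetical.

```python
import random

def gaussian_posterior(prior_mean, prior_var, noise_var, count, reward_sum):
    """Posterior mean and variance of one arm's mean reward (conjugate Gaussian update)."""
    precision = 1.0 / prior_var + count / noise_var
    mean = (prior_mean / prior_var + reward_sum / noise_var) / precision
    return mean, 1.0 / precision

def run_task(true_means, prior_means, prior_var, noise_var, horizon, rng):
    """Thompson sampling on a single task; returns per-arm pull counts and reward sums."""
    k = len(true_means)
    counts, sums = [0] * k, [0.0] * k
    for _ in range(horizon):
        # Draw one plausible mean per arm from its current posterior.
        draws = []
        for a in range(k):
            m, v = gaussian_posterior(prior_means[a], prior_var, noise_var, counts[a], sums[a])
            draws.append(rng.gauss(m, v ** 0.5))
        # Pull the arm whose sampled mean is largest and record the noisy reward.
        arm = max(range(k), key=draws.__getitem__)
        reward = rng.gauss(true_means[arm], noise_var ** 0.5)
        counts[arm] += 1
        sums[arm] += reward
    return counts, sums

def meta_loop(hidden_prior_means, prior_var, noise_var, num_tasks, horizon, seed=0):
    """Estimate the unknown prior means across tasks (simplified plug-in stand-in)."""
    rng = random.Random(seed)
    k = len(hidden_prior_means)
    est = [0.0] * k   # current guess of the prior means, used as the prior in each task
    seen = [0] * k    # number of tasks in which each arm was pulled at least once
    for _ in range(num_tasks):
        # Each task's arm means are sampled from the hidden prior.
        true_means = [rng.gauss(m, prior_var ** 0.5) for m in hidden_prior_means]
        counts, sums = run_task(true_means, est, prior_var, noise_var, horizon, rng)
        for a in range(k):
            if counts[a] > 0:
                seen[a] += 1
                est[a] += (sums[a] / counts[a] - est[a]) / seen[a]
    return est
```

For example, `meta_loop([0.0, 2.0], 0.25, 1.0, 20, 30)` returns one estimated prior mean per arm after 20 tasks of 30 rounds each. A faithful implementation would instead maintain uncertainty over the prior itself, as the abstract describes.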
RecSim NG: Toward Principled Uncertainty Modeling for Recommender Ecosystems
Martin Mladenov
Vihan Jain
Christopher Colby
Nicolas Mayoraz
Hubert Pham
Ivan Vendrov
ArXiv (2021)
The development of recommender systems that optimize multi-turn interaction with users, and model the interactions of different agents (e.g., users, content providers, vendors) in the recommender ecosystem, has drawn increasing attention in recent years. Developing and training models and algorithms for such recommenders can be especially difficult using static datasets, which often fail to offer the types of counterfactual predictions needed to evaluate policies over extended horizons. To address this, we develop RecSim NG, a probabilistic platform for the simulation of multi-agent recommender systems. RecSim NG is a scalable, modular, differentiable simulator implemented in Edward2 and TensorFlow. It offers: a powerful, general probabilistic programming language for agent-behavior specification; tools for probabilistic inference and latent-variable model learning, backed by automatic differentiation and tracing; and a TensorFlow-based runtime for running simulations on accelerated hardware. We describe RecSim NG and illustrate how it can be used to create transparent, configurable, end-to-end models of a recommender ecosystem, complemented by a small set of simple use cases that demonstrate how RecSim NG can help both researchers and practitioners easily develop and train novel algorithms for recommender systems.
A short version of this paper was published at RecSys 2020.