Chih-wei Hsu

Authored Publications
    Minimizing Live Experiments in Recommender Systems: User Simulation to Evaluate Preference Elicitation Policies
    Martin Mladenov
    James Pine
    Hubert Pham
    Shane Li
    Xujian Liang
    Anton Polishko
    Li Yang
    Ben Scheetz
    Proceedings of the 47th International ACM/SIGIR Conference on Research and Development in Information Retrieval (SIGIR-24), Washington, DC (2024), pp. 2925-2929
    Evaluation of policies in recommender systems (RSs) typically involves A/B testing using live experiments on real users to assess a new policy's impact on relevant metrics. This "gold standard" comes at a high cost, however, in terms of cycle time, user cost, and potential user retention. In developing policies for onboarding new users, these costs can be especially problematic, since onboarding occurs only once. In this work, we describe a simulation methodology used to augment (and reduce) the use of live experiments. We illustrate its deployment for the evaluation of preference elicitation algorithms used to onboard new users of the YouTube Music platform. By developing counterfactually robust user behavior models, and a simulation service that couples such models with production infrastructure, we are able to test new algorithms in a way that reliably predicts their performance on key metrics when deployed live, sometimes more reliably than live experiments due to the scale at which simulation can be realized. We describe our domain, our simulation models and platform, results of experiments and deployment, and suggest future steps needed to further realistic simulation as a powerful complement to live experiments.
    Factual and Personalized Recommendation Language Modeling with Reinforcement Learning
    Jihwan Jeong
    Mohammad Ghavamzadeh
    Proceedings of the First Conference on Language Modeling (COLM-24), Philadelphia (2024)
    Recommender systems (RSs) play a central role in connecting users to products, content and services by matching candidate items to users based on their preferences. While existing RSs often rely on implicit user feedback on recommended items (e.g., clicks, watches, ratings), conversational recommender systems interact with users to provide tailored recommendations in natural language. In this work, we aim to develop a recommender language model (LM) that is capable of generating compelling endorsement presentations of relevant items to users, to better explain the details of the items, to connect the items with users’ preferences, and to enhance the likelihood of users accepting recommendations. Specifically, such an LLM-based recommender can understand users’ preferences from users’ RS embeddings summarizing feedback history, and output corresponding responses that are not only factually grounded but also explain whether these items satisfy users’ preferences in a convincing manner. The pivotal question is how one can gauge the performance of such an LLM recommender. Equipped with a joint reward function that measures factual consistency, convincingness, and personalization, not only can we evaluate the efficacies of different recommender LMs, but we can also utilize this metric as a form of AI feedback to fine-tune our LLM agent via reinforcement learning (RL). Building upon the MovieLens movie recommendation benchmark, we developed a novel conversational recommender delivering personalized movie narratives to users. This work lays the groundwork for recommendation systems that prioritize individualized user experiences without compromising on transparency and integrity.
    Discovering Personalized Semantics for Soft Attributes in Recommender Systems using Concept Activation Vectors
    Christina Göpfert
    Alex Haig
    Ivan Vendrov
    Tyler Lu
    Hubert Pham
    Mohammad Ghavamzadeh
    ACM Transactions on Recommender Systems (2024)
    Interactive recommender systems have emerged as a promising paradigm to overcome the limitations of the primitive user feedback used by traditional recommender systems (e.g., clicks, item consumption, ratings). They allow users to express intent, preferences, constraints, and contexts in a richer fashion, often using natural language (including faceted search and dialogue). Yet more research is needed to find the most effective ways to use this feedback. One challenge is inferring a user's semantic intent from the open-ended terms or attributes often used to describe a desired item, and using it to refine recommendation results. Leveraging concept activation vectors (CAVs) (Kim et al., 2018), a recently developed approach for model interpretability in machine learning, we develop a framework to learn a representation that captures the semantics of such attributes and connects them to user preferences and behaviors in recommender systems. One novel feature of our approach is its ability to distinguish objective and subjective attributes (both subjectivity of degree and of sense), and associate different senses of subjective attributes with different users. We demonstrate on both synthetic and real-world data sets that our CAV representation not only accurately interprets users' subjective semantics, but can also be used to improve recommendations through interactive item critiquing.
    Embedding-Aligned Language Models
    Thirty-Eighth Annual Conference on Neural Information Processing Systems (NeurIPS-24), Vancouver (2024)
    We propose a novel approach for training large language models (LLMs) to adhere to objectives imposed by a latent embedding space. Our method leverages reinforcement learning (RL), treating a pre-trained LLM as an environment. An Embedding-Aligned Guided LanguagE (EAGLE) agent is trained using a significantly smaller language model to iteratively steer the LLM's generation towards optimal regions of a latent embedding space, given some predefined criteria. We demonstrate the effectiveness of the EAGLE agent using the MovieLens 25M dataset, on extrapolation tasks for content gap to satisfy latent user demand, and multi-attribute satisfaction for generating creative variations of entities. Our work paves the way for controlled and grounded text generation using LLMs, ensuring consistency with domain-specific knowledge and data representations.
    DynaMITE-RL: A Dynamic Model for Improved Temporal Meta Reinforcement Learning
    Anthony Liang
    Erdem Biyik
    Thirty-Eighth Annual Conference on Neural Information Processing Systems (NeurIPS-24), Vancouver (2024)
    We introduce a meta-reinforcement learning (meta-RL) approach, called DynaMITE-RL, to perform approximate inference in environments where the latent information evolves slowly between subtrajectories called sessions. We identify three key modifications to contemporary meta-RL methods: consistency of latent information during sessions, session masking, and prior latent conditioning. We demonstrate the necessity of these modifications on various downstream applications from discrete Gridworld environments to continuous control and simulated robot assistive tasks and find that our approach significantly outperforms contemporary baselines.
    Embeddings have become a pivotal means to represent complex, multi-faceted information about entities, concepts, and relationships in a condensed and useful format. Nevertheless, they often preclude direct interpretation. While downstream tasks make use of these compressed representations, meaningful interpretation usually requires visualization using dimensionality reduction or specialized machine learning interpretability methods. This paper addresses the challenge of making such embeddings more interpretable and broadly useful, by employing large language models (LLMs) to directly interact with embeddings -- transforming abstract vectors into understandable narratives. By injecting embeddings into LLMs, we enable querying and exploration of complex embedding data. We demonstrate our approach on a variety of diverse tasks, including: enhancing concept activation vectors (CAVs), communicating novel embedded entities, and decoding user preferences in recommender systems. Our work couples the immense information potential of embeddings with the interpretative power of LLMs.
    Interactive Recommender Systems (RSs) have emerged as a promising paradigm to overcome the limitations of the primitive user feedback used by traditional RSs (e.g., clicks, item consumption, ratings), allowing users to express intent, preferences, constraints, and contexts in a richer fashion using natural language. Still, more research is needed to find the most effective ways to use this feedback. One major challenge is inferring a user's semantic intent from the open-ended terms (say, attributes or tags) used to describe a desired item, and using it to refine recommendation results. Leveraging Concept Activation Vectors (CAVs) [13], we develop a framework to learn a representation that captures the semantics of such attributes and connects them to user preferences and behaviors in RSs. One novel feature of our approach is its ability to distinguish objective and subjective attributes (including subjectivity of degree and of sense) and associate different senses of subjective attributes with different users. We demonstrate on both synthetic and real-world datasets that our CAV representation not only accurately interprets users' subjective semantics, but can also be used to improve recommendations.
    An Adversarial Variational Inference Approach for Travel Demand Calibration of Urban Traffic Simulators
    Martin Mladenov
    Proceedings of the 30th ACM SIGSPATIAL Intl. Conf. on Advances in Geographic Information Systems (SIGSPATIAL-22), Seattle, WA (2022) (to appear)
    This paper considers the calibration of travel demand inputs, defined as a set of origin-destination matrices (ODs), for stochastic microscopic urban traffic simulators. The goal of calibration is to find a (set of) travel demand input(s) that replicate sparse field count data statistics. While traditional approaches use only first-order moment information from the field data, it is well known that the OD calibration problem is underdetermined in realistic networks. We study the value of using higher-order statistics from spatially sparse field data to mitigate underdetermination, proposing a variational inference technique that identifies an OD distribution. We apply our approach to a high-dimensional setting in Salt Lake City, Utah. Our approach is flexible: it can be readily extended to account for arbitrary types of field data (e.g., road, path or trip data).
    Meta-Thompson Sampling
    Branislav Kveton
    Michael Konobeev
    Martin Mladenov
    Proceedings of the 38th International Conference on Machine Learning (ICML 2021), pp. 5884-5893
    Efficient exploration in multi-armed bandits is a fundamental online learning problem. In this work, we propose a variant of Thompson sampling that learns to explore over time by interacting with problem instances sampled from an unknown prior distribution. This algorithm meta-learns the prior and therefore we call it Meta-TS. We propose efficient implementations of Meta-TS and analyze it in Gaussian bandits. Our analysis captures the improvement due to learning the prior and is of broader interest, because we derive the first prior-dependent upper bound on the Bayes regret. Our regret bound is complemented by empirical evaluation, which shows that Meta-TS quickly adapts to the unknown prior.
    RecSim NG: Toward Principled Uncertainty Modeling for Recommender Ecosystems
    Martin Mladenov
    Vihan Jain
    Christopher Colby
    Nicolas Mayoraz
    Hubert Pham
    Ivan Vendrov
    ArXiv (2021)
    The development of recommender systems that optimize multi-turn interaction with users and model the interactions of different agents (e.g., users, content providers, vendors) in the recommender ecosystem has drawn increasing attention in recent years. Developing and training models and algorithms for such recommenders can be especially difficult using static datasets, which often fail to offer the types of counterfactual predictions needed to evaluate policies over extended horizons. To address this, we develop RecSim NG, a probabilistic platform for the simulation of multi-agent recommender systems. RecSim NG is a scalable, modular, differentiable simulator implemented in Edward2 and TensorFlow. It offers: a powerful, general probabilistic programming language for agent-behavior specification; tools for probabilistic inference and latent-variable model learning, backed by automatic differentiation and tracing; and a TensorFlow-based runtime for running simulations on accelerated hardware. We describe RecSim NG and illustrate how it can be used to create transparent, configurable, end-to-end models of a recommender ecosystem, complemented by a small set of simple use cases that demonstrate how RecSim NG can help both researchers and practitioners easily develop and train novel algorithms for recommender systems. A short version of this paper was published at RecSys 2020.