Konstantina Christakopoulou

Authored Publications
    Reward Shaping for User Satisfaction in a REINFORCE Recommender
    Can Xu
    Sriraj Badam
    Trevor Potter
    Daniel Li
    Hao Wan
    Elaine Le
    Chris Berg
    Eric Bencomo Dixon
    (2021)
    How might we design Reinforcement Learning (RL)-based recommenders that encourage aligning user trajectories with the underlying user satisfaction? Three research questions are key: (1) measuring user satisfaction, (2) combatting sparsity of satisfaction signals, and (3) adapting the training of the recommender agent to maximize satisfaction. For measurement, it has been found that surveys explicitly asking users to rate their experience with consumed items can provide valuable orthogonal information to the engagement/interaction data, acting as a proxy for the underlying user satisfaction. For sparsity, i.e., only being able to observe how satisfied users are with a tiny fraction of user-item interactions, imputation models can be useful in predicting the satisfaction level for all items users have consumed. For learning satisfying recommender policies, we postulate that reward shaping in RL recommender agents is powerful for driving satisfying user experiences. Putting everything together, we propose to jointly learn a policy network and a satisfaction imputation network: the imputation network learns which actions are satisfying to the user, while the policy network, built on top of REINFORCE, decides which items to recommend, with the reward utilizing the imputed satisfaction. We use both offline analysis and live experiments in an industrial large-scale recommendation platform to demonstrate the promise of our approach for satisfying user experiences.
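To make the idea of reward shaping with imputed satisfaction concrete, here is a minimal sketch of a single REINFORCE update for a softmax policy over a handful of items. It is an illustration only, not the authors' implementation: the additive combination in `shaped_reward`, the `satisfaction_weight` parameter, and all function names are assumptions introduced here, and the imputed satisfaction is passed in as a plain number rather than produced by a learned imputation network.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def shaped_reward(engagement, imputed_satisfaction, satisfaction_weight=1.0):
    """Assumed shaping: engagement reward plus weighted imputed satisfaction."""
    return engagement + satisfaction_weight * imputed_satisfaction


def reinforce_step(logits, action, reward, learning_rate=0.1):
    """One REINFORCE update for a softmax policy parameterized by its logits.

    The gradient of log pi(action) with respect to logit k is
    1[k == action] - pi(k), so each logit moves by
    learning_rate * reward * that quantity.
    """
    probs = softmax(logits)
    return [
        logit + learning_rate * reward * ((1.0 if k == action else 0.0) - p)
        for k, (logit, p) in enumerate(zip(logits, probs))
    ]


# Hypothetical example: item 1 was recommended, the user engaged (1.0),
# and the (assumed) imputation model predicts satisfaction 0.5.
reward = shaped_reward(engagement=1.0, imputed_satisfaction=0.5, satisfaction_weight=2.0)
new_logits = reinforce_step([0.0, 0.0, 0.0], action=1, reward=reward)
```

With a positive shaped reward, the logit of the recommended item rises and the others fall, so items with higher imputed satisfaction are reinforced more strongly than engagement alone would suggest.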
    Most existing recommender systems primarily focus on the users (content consumers), matching users with the most relevant content, with the goal of maximizing user satisfaction on the platform. However, given that content providers are playing an increasingly critical role through content creation, largely determining the content pool available for recommendation, a natural question that arises is: Can we design recommenders taking into account utilities of both users and content providers? By doing so, we hope to help more content providers flourish and to sustain a diverse content pool for long-term user satisfaction. Understanding the full impact of recommendations on both user and content provider groups is challenging. This paper aims to serve as a research investigation on one approach toward building a content provider-aware recommender, and evaluating its impact under a simulated setup. To characterize the users-recommender-providers interdependence, we complement user modeling by formalizing provider dynamics as a parallel Markov Decision Process with partially observable states whose transitions are driven by recommender actions and user feedback. We then build a REINFORCE recommender agent, coined EcoAgent, to optimize a joint objective of user utility and the counterfactual utility lift of the content provider associated with the chosen content, which we show to be equivalent to maximizing overall user utility and utilities of all content providers on the platform. To evaluate our approach, we also introduce a simulation environment capturing the key interactions among users, providers, and the recommender. We offer a number of simulated experiments that shed light on both the benefits and the limitations of our approach. These results serve to understand how and when a content provider-aware recommender agent is of benefit in building multi-stakeholder recommender systems.
    Deconfounding User Satisfaction Estimation from Response Rate Bias
    Madeleine Traverse
    Trevor Potter
    Emma Marriott
    Daniel Li
    Chris Haulk
    Proceedings of the 14th ACM Conference on Recommender Systems (2020)
    Improving user satisfaction is at the forefront of industrial recommender systems. While significant progress in recommender systems has relied on utilizing logged implicit data of user-item interactions (i.e., clicks, dwell/watch time, and other user engagement signals), there has been a recent surge of interest in measuring and modeling user satisfaction, as provided by orthogonal data sources. Such data sources typically originate from responses to user satisfaction surveys, which explicitly ask users to rate their experience with the system and/or specific items they have consumed in the recent past. This data can be valuable for measuring and modeling the degree to which a user has had a satisfactory experience with the recommender, since what users do (engagement) does not always align with what users say they want (satisfaction as measured by surveys). We focus on a large-scale industrial system trained on user survey responses to predict user satisfaction. The predictions of the satisfaction model for each user-item pair, combined with the predictions of the other models (e.g., engagement-focused ones), are fed into the ranking component of a real-world recommender system in deciding items to present to the user. It is therefore imperative that the satisfaction model does an equally good job at imputing user satisfaction across slices of users and items, as it would directly impact which items a user is exposed to. However, the data used for training satisfaction models is biased in that users are more likely to respond to a survey when they are more satisfied. When the satisfaction survey responses in slices of data with high response rate follow a different distribution than those with low response rate, response rate becomes a confounding factor for user satisfaction estimation.
    We find a positive correlation between response rate and ratings in a large-scale survey dataset collected in our case study. To address this inherent response rate bias in the satisfaction data, we propose an inverse propensity weighting approach within a multi-task learning framework. We extend a simple feed-forward neural network architecture predicting user satisfaction to a shared-bottom multi-task learning architecture with two tasks: the user satisfaction estimation task, and the response rate estimation task. We concurrently train these two tasks, and use the inverse of the predictions of the response rate task as loss weights for the satisfaction task to address the response rate bias. We show that with this approach, (i) we can accurately model whether a user will respond to a survey, (ii) we reduce the user satisfaction estimation error for the data slices with lower propensity to respond while not hurting that of the slices with higher propensity to respond, and (iii) in live A/B experiments, applying the resulting satisfaction predictions to rank recommendations translates to higher user satisfaction.
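The weighting step described above can be sketched in a few lines. This is a hedged illustration, not the system from the paper: it assumes a squared-error satisfaction loss, takes the response propensities as given numbers instead of the output of a jointly trained response rate task, and adds two choices of its own (clipping via the hypothetical `min_propensity` parameter, and normalizing by the sum of the weights).

```python
def ipw_weighted_mse(ratings, predictions, response_propensities, min_propensity=0.05):
    """Squared error averaged with weights 1 / propensity-to-respond.

    Examples from slices that rarely respond (low propensity) get larger
    weights, so they are not drowned out by high-response slices.
    """
    weighted_error = 0.0
    weight_sum = 0.0
    for rating, prediction, propensity in zip(ratings, predictions, response_propensities):
        # Clip small propensities so a single example cannot dominate the loss
        # (an assumption of this sketch).
        weight = 1.0 / max(propensity, min_propensity)
        weighted_error += weight * (rating - prediction) ** 2
        weight_sum += weight
    return weighted_error / weight_sum


# Hypothetical data: the second survey response comes from a low-response slice.
loss = ipw_weighted_mse(
    ratings=[5.0, 1.0],
    predictions=[4.0, 3.0],
    response_propensities=[0.5, 0.1],
)
```

In this toy example the unweighted mean squared error would be 2.5, while the weighted loss is 3.5, because the poorly predicted example from the low-response slice counts five times as much as the other one.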
    Adversarial Attacks on an Oblivious Recommender
    Arindam Banerjee
    Proceedings of the 13th ACM Conference on Recommender Systems (2019)
    Can machine learning models be easily fooled? Despite the recent surge of interest in learned adversarial attacks in other domains, in the context of recommendation systems this question has mainly been answered using hand-engineered fake user profiles. This paper attempts to narrow this gap. We provide a formulation for learning to attack a recommender as a repeated general-sum game between two players, i.e., an adversary and a recommender oblivious to the adversary's existence. We consider the challenging case of poisoning attacks, which focus on the training phase of the recommender model. We generate adversarial user profiles targeting subsets of users or items, or generally the top-K recommendation quality. Moreover, we ensure that the adversarial user profiles remain unnoticeable by preserving proximity of the real user rating/interaction distribution to the adversarial fake user distribution. To cope with the challenge of the adversary not having access to the gradient of the recommender's objective with respect to the fake user profiles, we provide a non-trivial algorithm building upon zero-order optimization techniques. We offer a wide range of experiments, instantiating the proposed method for the classic, popular case of a low-rank recommender, and illustrating the extent of the recommender's vulnerability to a variety of adversarial intents. These results can serve as a motivating point for more research into recommender defense strategies against machine-learned attacks.
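As background on zero-order optimization in general, the sketch below estimates a gradient using only function evaluations, via central finite differences along each coordinate. It is not the algorithm from the paper, whose details are not given in the abstract; the function names, the step size `step`, and the toy objective are assumptions made for illustration.

```python
def zero_order_gradient(objective, point, step=1e-4):
    """Estimate the gradient of `objective` at `point` without derivatives.

    Each coordinate i is perturbed by +step and -step, and the slope
    (f(x + step * e_i) - f(x - step * e_i)) / (2 * step) approximates
    the partial derivative with respect to coordinate i.
    """
    gradient = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += step
        backward[i] -= step
        gradient.append((objective(forward) - objective(backward)) / (2.0 * step))
    return gradient


def toy_objective(x):
    """Hypothetical black-box objective standing in for the adversary's loss."""
    return x[0] ** 2 + 3.0 * x[1]


estimated = zero_order_gradient(toy_objective, [1.0, 2.0])
```

For this toy objective the true gradient at (1, 2) is (2, 3), and the estimate matches it up to floating-point error. Each estimate costs two objective evaluations per coordinate, which is why practical zero-order methods often sample random directions instead.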
    Recommendation systems, prevalent in many applications, aim to surface to users the right content at the right time. Recently, researchers have aspired to develop conversational systems that offer seamless interactions with users, more effectively eliciting user preferences and offering better recommendations. Taking a step towards this goal, this paper explores the two stages of a single round of conversation with a user: which question to ask the user, and how to use their feedback to respond with a more accurate recommendation. Following these two stages, first, we detail an RNN-based model for generating topics a user might be interested in, and then extend a state-of-the-art RNN-based video recommender to incorporate the user’s selected topic. We describe our proposed system Q&R, i.e., Question & Recommendation, and the surrogate tasks we utilize to bootstrap data for training our models. We evaluate different components of Q&R on live traffic in various applications within YouTube: User Onboarding, Homepage Recommendation, and Notifications. Our results demonstrate that our approach improves upon state-of-the-art recommendation models, including RNNs, and makes these applications more useful, for example with a > 1% increase in video notifications opened. Q&R has been deployed and is used in YouTube production. Further, our design choices can be useful to practitioners wanting to transition to more conversational recommendation systems.