Supervised Advantage Actor-Critic for Recommender Systems

Xin Xin

Alexandros Karatzoglou

Ioannis Arapakis

Joemon Jose

15th ACM International Conference on Web Search and Data Mining - WSDM 2022, Arizona, USA (to appear)


Abstract

Casting session-based or sequential recommendation as reinforcement learning (RL) through reward signals is a promising research direction towards recommender systems (RS) that maximize long-term user engagement. However, the direct use of RL algorithms in the RS setting is infeasible due to challenges such as off-policy training, huge action spaces, and a lack of sufficient reward signals. Recent RL approaches in the recommendation domain try to tackle these challenges, for example by combining RL with self-supervised learning. In this paper, we examine the approach of self-supervised reinforcement learning for recommendation and show that existing methods still have limitations. For example, the negative signals from the self-supervised component are not sufficient for the RL component to produce a good ranking. Moreover, the length of the sequence can also introduce bias into the training procedure.

To address the above problems, we first propose to introduce negative sampling into the RL training procedure and then combine it with self-supervised learning, namely Self-Supervised Negative Q-learning (SNQN). Based on the sampled negative actions (items), we can further calculate the "advantage" of a positive action, which can then be used as a weight for the self-supervised part. This leads to another learning framework: Self-Supervised Advantage Actor-Critic (SA2C). We integrate SNQN and SA2C with four state-of-the-art sequential recommendation models and conduct experiments on two real-world datasets. Experimental results show that the proposed approaches achieve better performance than existing self-supervised reinforcement learning methods. Code will be open-sourced.
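The abstract gives no formulas, so the following is only an illustrative sketch of how an advantage computed from sampled negatives could weight a self-supervised loss. It assumes uniform negative sampling, a cross-entropy loss over item logits as the self-supervised part, and an advantage defined as the positive item's Q-value minus the mean Q-value over the positive and the sampled negatives. All function names and the toy numbers are hypothetical and are not taken from the authors' code.

```python
import math
import random


def sample_negatives(num_items, positive, k, rng):
    """Uniformly sample k distinct item ids other than the positive one (assumed scheme)."""
    candidates = [i for i in range(num_items) if i != positive]
    return rng.sample(candidates, k)


def cross_entropy(logits, target):
    """Negative log-softmax of the target item; stands in for the self-supervised loss."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def advantage(q_values, positive, negatives):
    """Q of the positive item minus the mean Q over the positive and sampled negatives.

    This baseline is an assumption made for illustration.
    """
    pool = [positive] + list(negatives)
    baseline = sum(q_values[i] for i in pool) / len(pool)
    return q_values[positive] - baseline


def advantage_weighted_loss(logits, q_values, positive, negatives):
    """Scale the self-supervised loss of the positive item by its advantage."""
    return advantage(q_values, positive, negatives) * cross_entropy(logits, positive)


# Toy example: 5 items, item 1 is the observed (positive) next item.
rng = random.Random(0)
q_values = [0.2, 1.0, 0.1, 0.4, 0.3]  # hypothetical critic outputs Q(s, a)
logits = [0.5, 2.0, 0.1, 0.3, 0.2]    # hypothetical actor scores over items
positive = 1
negatives = sample_negatives(len(q_values), positive, k=2, rng=rng)
loss = advantage_weighted_loss(logits, q_values, positive, negatives)
```

In a gradient-based implementation the advantage would presumably be treated as a constant weight, so that gradients flow only through the self-supervised term; that detail is likewise an assumption here.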
