Jiaxi Tang
Jiaxi is a Software Engineer at Google DeepMind specializing in data mining and applied machine learning. His research centers on developing intelligent recommendation systems with cutting-edge technologies.
Authored Publications
Improving Training Stability for Multitask Ranking Models in Recommender Systems
Justin Gilmer
Li Wei
Lichan Hong
Mahesh Sathiamoorthy
KDD 2023 (2023)
Recommender systems play an important role in YouTube, one of the largest online video platforms across the world. In this paper, we focus on a real-world multitask ranking model for YouTube recommendations.
While most of the recommendation research is dedicated to designing better models to improve user engagement and satisfaction, we found that research on stabilizing the training for such models is severely under-explored.
As the recommendation models become larger and more sophisticated, they are more vulnerable to training instability issues, i.e., the loss diverges (instead of converging), which can make the model unusable, wasting significant resources and blocking model iterations.
In this paper, we share our understanding and best practices we learned for improving the training stability of a multitask ranking model used in production. We show some properties of the model that lead to unstable training and speculate on the cause. Furthermore, we propose an effective solution to improve training stability based on our observations of training dynamics when model training starts to become unstable. Our experiments on a proprietary dataset show the effectiveness of the proposed method over several commonly used baseline methods.
Towards Neural Mixture Recommender for Long Range Dependent User Sequences
Francois Belletti
Sagar Jain
Alex Beutel
Can Xu
Paul Covington
WWW (2019)
Understanding temporal dynamics has proved to be highly valuable for accurate recommendation. Sequential recommenders have been successful in modeling the dynamics of users and items over time. However, while different model architectures excel at capturing various temporal ranges or dynamics, distinct application contexts require adapting to diverse behaviors.
In this paper we examine how to build a model that can make use of different temporal ranges and dynamics depending on the request context. We begin with the analysis of an anonymized YouTube dataset comprising millions of user sequences. We quantify the degree of long-range dependence in these sequences and demonstrate that both short-term and long-term dependent behavioral patterns co-exist. We then propose a neural Multi-temporal-range Mixture Model (M3) as a tailored solution to deal with both short-term and long-term dependencies. Our approach employs a mixture of models, each with a different temporal range. These models are combined by a learned gating mechanism capable of exerting different model combinations given different contextual information. In empirical evaluations on a public dataset and our own anonymized YouTube dataset, M3 consistently outperforms state-of-the-art sequential recommendation methods.
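The idea of combining several sequence encoders through a context-dependent gate can be illustrated with a minimal sketch. This is not the M3 architecture from the paper: the softmax gate over linear scores, the function names, and all numeric values below are assumptions chosen for illustration only.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def gate_weights(context, gate_params):
    """Assumed gate: one linear score per encoder, normalized with softmax."""
    scores = [sum(w * x for w, x in zip(row, context)) for row in gate_params]
    return softmax(scores)

def mix_encoders(encoder_outputs, weights):
    """Weighted sum of the encoder output vectors."""
    dim = len(encoder_outputs[0])
    return [
        sum(w * out[i] for w, out in zip(weights, encoder_outputs))
        for i in range(dim)
    ]

# Hypothetical outputs of three encoders with different temporal ranges
# (e.g. short, medium, long); in a real model these would be learned.
encoder_outputs = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

# Hypothetical gate parameters: one row of weights per encoder.
gate_params = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]

# Hypothetical context features for one request.
context = [2.0, 0.0]

weights = gate_weights(context, gate_params)
mixed = mix_encoders(encoder_outputs, weights)
```

With this context the first encoder receives the largest weight, so the mixed representation leans toward its output; a different context vector would shift the weights toward another encoder.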