Yuan Wang
Authored Publications
Scaling Up LLM Reviews for Google Ads Content Moderation
Ariel Fuxman
Chih-Chun Chia
Dongjin Kwon
Enming Luo
Mehmet Tek
Ranjay Krishna
Tiantian Fang
Tushar Dogra
Yu-Han Lyu
(2024)
Large language models (LLMs) are powerful tools for content moderation, but LLM inference costs and latency on large volumes of data, such as the Google Ads repository, are prohibitive for their casual use. This study focuses on scaling up LLM reviews for content moderation in Google Ads. First, we use heuristics to select candidates via filtering and duplicate removal, and we create clusters of ads from which we select one representative ad per cluster. Then, LLMs are used to review only the representative ads. Finally, we propagate the LLM decisions for the representative ads back to their clusters. This method reduces the number of reviews by more than three orders of magnitude while achieving twice the recall of a non-LLM baseline model. Note that the success of this approach is a strong function of the representations used in clustering and label propagation; we observed that cross-modal similarity representations yield better results than uni-modal representations.
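The filter → cluster → review-one-representative → propagate flow described in the abstract can be sketched roughly as below. This is a minimal illustration only; all callables (heuristic_filter, embed, cluster, llm_review) are hypothetical stand-ins, not the production components.

```python
from collections import defaultdict

def moderate_ads(ads, heuristic_filter, embed, cluster, llm_review):
    """Sketch of cluster-then-review moderation (illustrative, not the real system).

    heuristic_filter(ad) -> bool    cheap candidate selection / duplicate removal
    embed(ad)            -> vector  similarity representation (e.g. cross-modal)
    cluster(vectors)     -> list of cluster ids, one per input vector
    llm_review(ad)       -> policy decision (the expensive LLM call)
    """
    candidates = [ad for ad in ads if heuristic_filter(ad)]
    cluster_ids = cluster([embed(ad) for ad in candidates])

    # Group candidates by cluster.
    groups = defaultdict(list)
    for ad, cid in zip(candidates, cluster_ids):
        groups[cid].append(ad)

    # Review only one representative per cluster, then propagate its decision
    # to every member, so the number of LLM calls equals the number of clusters.
    decisions = {}
    for members in groups.values():
        decision = llm_review(members[0])
        for ad in members:
            decisions[id(ad)] = decision
    return decisions
```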
Scalable Hierarchical Agglomerative Clustering
Nick Monath
Guru Prashanth Guruganesh
Andrew McCallum
Gokhan Mergen
Mert Terzihan
Bryon Tjanaka
Yuchen Wu
Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (2021), 1245–1255
The applicability of agglomerative clustering, for inferring both hierarchical and flat clusterings, is limited by its scalability. Existing scalable hierarchical clustering methods sacrifice quality for speed and often lead to over-merging of clusters. In this paper, we present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points. We perform a detailed theoretical analysis, showing that under mild separability conditions our algorithm can not only recover the optimal flat partition but also provide a 2-approximation to the non-parametric DP-Means objective [32]. This introduces a novel application of hierarchical clustering as an approximation algorithm for the non-parametric clustering objective. We additionally relate our algorithm to the classic hierarchical agglomerative clustering method. We perform extensive empirical experiments in both hierarchical and flat clustering settings and show that our proposed approach achieves state-of-the-art results on publicly available clustering benchmarks. Finally, we demonstrate our method's scalability by applying it to a dataset of 30 billion queries. Human evaluation of the discovered clusters shows that our method finds higher-quality clusters than the current state of the art.
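For contrast with the scalable algorithm above, here is a minimal example of the classic hierarchical agglomerative clustering baseline that the paper relates its method to, using SciPy on synthetic data. Its quadratic-or-worse cost is exactly what limits it to small datasets.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
# Two well-separated Gaussian blobs as a toy dataset.
points = np.vstack([rng.normal(0.0, 0.1, (50, 2)),
                    rng.normal(3.0, 0.1, (50, 2))])

# Classic agglomerative clustering with average linkage (quadratic memory/time).
tree = linkage(points, method="average")

# Cut the dendrogram to obtain a flat partition with two clusters.
flat = fcluster(tree, t=2, criterion="maxclust")
print(flat[:5], flat[-5:])  # points from the two blobs receive different labels
```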
Anchor & Transform: Learning Sparse Embeddings for Large Vocabularies
Learning continuous representations of discrete objects such as text, sentences, users, and movies lies at the heart of many applications, including text and user modeling. Unfortunately, traditional methods that embed all objects do not scale to large vocabulary sizes and embedding dimensions. In this paper, we propose a general method, Anchor & Transform (ANT), that learns sparse representations of discrete objects by jointly learning a small set of anchor embeddings and a sparse transformation from anchor objects to all objects. ANT is scalable, flexible, end-to-end trainable, and allows the user to easily incorporate domain knowledge about object relationships. ANT also recovers several task-specific baselines under certain structural assumptions on the anchor embeddings and transformation matrices. On several benchmarks involving text and user modeling, ANT demonstrates strong performance with respect to accuracy and sparsity.
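A rough sketch of the anchor-and-transform factorization as described in the abstract: embeddings for all objects are formed as a sparse, nonnegative combination of a small anchor table. The module below is an illustrative assumption (the sizes, the ReLU nonnegativity, and the L1 penalty are my choices), not the paper's implementation.

```python
import torch
import torch.nn as nn

class AnchorTransform(nn.Module):
    """E = T @ A: |V| x d embeddings built from k << |V| anchor embeddings."""
    def __init__(self, vocab_size, num_anchors, dim):
        super().__init__()
        self.anchors = nn.Parameter(torch.randn(num_anchors, dim) * 0.01)
        # Transformation from anchors to all objects; kept nonnegative via ReLU
        # and pushed toward sparsity with an L1 penalty during training.
        self.transform = nn.Parameter(torch.rand(vocab_size, num_anchors) * 0.01)

    def forward(self, ids):
        T = torch.relu(self.transform)   # nonnegative transformation weights
        return T[ids] @ self.anchors     # embeddings for a batch of object ids

    def l1_penalty(self):
        return torch.relu(self.transform).sum()

model = AnchorTransform(vocab_size=10_000, num_anchors=100, dim=64)
emb = model(torch.tensor([1, 42, 999]))
loss = emb.pow(2).mean() + 1e-5 * model.l1_penalty()  # placeholder task loss
loss.backward()
```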
Uncovering Hidden Structure in Sequence Data via Threading Recurrent Models
Daniel Silva
Yuchen Wu
Shibani Sanan
Surojit Chatterjee
Proceedings of the 12th ACM International Conference on Web Search and Data Mining (2019), pp. 186-194
Long Short-Term Memory (LSTM) is one of the most powerful sequence models for user browsing history [17, 22] or natural language text [19]. Despite its strong performance, it has not gained popularity for user-facing applications, mainly owing to its large number of parameters and lack of interpretability. Recently, Zaheer et al. [25] introduced Latent LSTM Allocation (LLA) to address these problems by incorporating topic models with LSTM, where the topic model maps observed words in each sequence to topics that evolve using an LSTM model. In our experiments, we found the resulting model, although powerful and interpretable, to show shortcomings when applied to sequence data that exhibit multiple modes of behavior with abrupt changes in dynamics. To address this problem we introduce thLLA: a threading LLA model. thLLA has the ability to break each sequence into a set of segments and then model the dynamics in each segment using an LSTM mixture. In that way, thLLA can model abrupt changes in sequence dynamics and provides a better fit for sequence data while still being interpretable and requiring fewer parameters. In addition, thLLA uncovers hidden themes in the data via its dynamic mixture components. However, such generalization and interpretability come at the cost of a complex dependence structure, for which inference is highly non-trivial. To remedy this, we present an efficient sampler based on particle MCMC for inference that can draw from the joint posterior directly. Experimental results confirm the superiority of thLLA and the stability of the new inference algorithm on a variety of domains.
State Space LSTM Models with Particle MCMC Inference
Xun Zheng
Alex J. Smola
Eric Xing
The Thirty-First Annual Conference on Neural Information Processing Systems (NIPS), Workshop on Bayesian Deep Learning (2017)
Long Short-Term Memory (LSTM) is one of the most powerful sequence models. Despite its strong performance, however, it lacks the interpretability of state space models. In this paper, we present a way to combine the best of both worlds by introducing State Space LSTM (SSL) models, which generalize the earlier work of Zaheer et al. (2017) on combining topic models with LSTM. Unlike Zaheer et al. (2017), however, we do not make any factorization assumptions in our inference algorithm. We present an efficient sampler based on the sequential Monte Carlo (SMC) method that draws from the joint posterior directly. Experimental results confirm the superiority and stability of this SMC inference algorithm on a variety of domains.
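As context for the SMC inference mentioned above, the sketch below is a generic bootstrap particle filter on a toy linear-Gaussian model. It illustrates only the propagate/weight/resample loop; it is not the paper's SSL sampler, whose transition and emission models come from the LSTM and the topic model.

```python
import numpy as np

def bootstrap_particle_filter(obs, num_particles=500, trans_std=1.0, obs_std=1.0):
    """Bootstrap SMC for the toy model x_t = x_{t-1} + noise, y_t = x_t + noise."""
    rng = np.random.default_rng(0)
    particles = rng.normal(0.0, 1.0, num_particles)
    filtered = [particles.copy()]
    for y in obs:
        # Propagate particles through the transition model.
        particles = particles + rng.normal(0.0, trans_std, num_particles)
        # Weight by the observation likelihood.
        log_w = -0.5 * ((y - particles) / obs_std) ** 2
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        # Resample to avoid weight degeneracy.
        idx = rng.choice(num_particles, size=num_particles, p=w)
        particles = particles[idx]
        filtered.append(particles.copy())
    return filtered

obs = np.cumsum(np.random.default_rng(1).normal(size=20))  # synthetic observations
paths = bootstrap_particle_filter(obs)
print(np.mean(paths[-1]))  # posterior mean of the final latent state
```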
A Practical Algorithm for Solving the Incoherence Problem of Topic Models In Industrial Applications
Topic models are often applied in industrial settings to discover user profiles from activity logs, where documents correspond to users and words to complex objects such as web sites and installed apps. Standard topic models ignore the content-based similarity structure between these objects, largely because of the inability of the Dirichlet prior to capture such side information about word-word correlation. Several approaches have been proposed to replace the Dirichlet prior with more expressive alternatives. However, this added expressivity comes with a heavy premium: inference becomes intractable and sparsity is lost, which renders these alternatives unsuitable for industrial-scale applications. In this paper we take a radically different approach to incorporating word-word correlation in topic models by applying this side information at the posterior level rather than at the prior level. We show that this choice preserves sparsity and results in a graph-based sampler for LDA whose computational complexity is asymptotically on par with the state-of-the-art alias-based sampler for LDA. We illustrate the efficacy of our approach on real industrial datasets that span up to a billion users, tens of millions of words, and thousands of topics. To the best of our knowledge, our approach provides the first practical and scalable solution to this important problem.
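For orientation, a minimal collapsed Gibbs sampler for plain LDA is sketched below. As I read the abstract, the paper's graph-based sampler modifies exactly this posterior sampling step with word-word similarity side information (plus alias-method tricks for speed), neither of which this toy sketch includes.

```python
import numpy as np

def lda_gibbs(docs, vocab_size, num_topics=10, alpha=0.1, beta=0.01, iters=50):
    """Minimal collapsed Gibbs sampler for standard LDA (dense, for clarity)."""
    rng = np.random.default_rng(0)
    ndk = np.zeros((len(docs), num_topics))   # doc-topic counts
    nkw = np.zeros((num_topics, vocab_size))  # topic-word counts
    nk = np.zeros(num_topics)                 # per-topic token counts
    z = [rng.integers(num_topics, size=len(doc)) for doc in docs]
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]
                ndk[d, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1
                # Collapsed conditional: p(k) ∝ (n_dk + α)(n_kw + β)/(n_k + Vβ)
                p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + vocab_size * beta)
                k = rng.choice(num_topics, p=p / p.sum())
                z[d][i] = k
                ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
    return ndk, nkw

docs = [[0, 1, 2, 1], [3, 4, 3, 4], [0, 2, 4]]  # toy corpus of word ids
doc_topic, topic_word = lda_gibbs(docs, vocab_size=5, num_topics=2)
```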