Tushar Deepak Chandra
Authored Publications
Mondegreen: A Post-Processing Solution to Speech Recognition Error Correction for Voice Search Queries
Ajit Apte
Ambarish Jash
Amol H Wankhede
Ankit Kumar
Ayooluwakunmi Jeje
Dima Kuzmin
Ellie Ka In Chio
Harry Fung
Jon Effrat
Nitin Jindal
Pei Cao
Senqiang Zhou
Sukhdeep S. Sodhi
Tameen Khan
Tarush Bali
KDD (2021)
As more and more online search queries come from voice, automatic speech recognition (ASR) becomes a key component in delivering relevant search results. Errors introduced by ASR lead to irrelevant search results being returned to the user, causing user dissatisfaction. In this paper, we introduce an approach, Mondegreen, to correct voice queries in text space without depending on audio signals, which may not always be available because of system constraints, privacy, or bandwidth considerations (for example, some ASR systems run on-device). We focus on voice queries transcribed via several proprietary commercial ASR systems. These queries come from users searching the internet or online services. We first present an analysis showing how much the language distribution of user voice queries differs from that of the traditional text corpora used to train off-the-shelf ASR systems. We then demonstrate that Mondegreen can significantly increase user interaction by correcting user voice queries in one of the largest search systems in Google. Finally, we see Mondegreen as complementing existing highly-optimized production ASR systems, which may not be frequently retrained and thus lag behind due to vocabulary drift.
Zero-Shot Transfer Learning for Query-Item Cold Start in Search Retrieval and Recommendations
Ankit Kumar
Cosmo Du
Dima Kuzmin
Ellie Chio
John Roberts Anderson
Li Zhang
Nitin Jindal
Pei Cao
Ritesh Agarwal
Tao Wu
Wen Li
CIKM (2020)
Most search retrieval and recommender systems predict top-K items given a query by learning directly from a large training set of (query, item) pairs, where a query can include natural language (NL), user, and context features. These approaches fall into the traditional supervised learning framework where the algorithm trains on labeled data from the target task. In this paper, we propose a new zero-shot transfer learning framework, which first learns representations of items and their NL features by predicting (item, item) correlation graphs as an auxiliary task, followed by transferring learned representations to solve the target task (query-to-item prediction), without having seen any (query, item) pairs in training. The advantages of applying this new framework include: (1) Cold-starting search and recommenders without abundant query-item data; (2) Generalizing to previously unseen or rare (query, item) pairs and alleviating the "rich get richer" problem; (3) Transferring knowledge of (item, item) correlation from domains outside of search. We show that the framework is effective on a large-scale search and recommender system.
Reinforcement Learning for Slate-based Recommender Systems: A Tractable Decomposition and Practical Methodology
Vihan Jain
Jing Wang
Sanmit Narvekar
Ritesh Agarwal
Rui Wu
Morgane Lustman
Vince Gatto
Paul Covington
Jim McFadden
arXiv (2019)
Most practical recommender systems focus on estimating immediate user engagement without considering the long-term effects of recommendations on user behavior. Reinforcement learning (RL) methods offer the potential to optimize recommendations for long-term user engagement. However, since users are often presented with slates of multiple items---which may have interacting effects on user choice---methods are required to deal with the combinatorics of the RL action space. In this work, we address the challenge of making slate-based recommendations to optimize long-term value using RL. Our contributions are three-fold. (i) We develop SlateQ, a decomposition of value-based temporal-difference and Q-learning that renders RL tractable with slates. Under mild assumptions on user-choice behavior, we show that the long-term value (LTV) of a slate can be decomposed into a tractable function of its component item-wise LTVs. (ii) We outline a methodology that leverages existing myopic learning-based recommenders to quickly develop a recommender that handles LTV. (iii) We demonstrate our methods in simulation, and validate the scalability of decomposed TD-learning using SlateQ in live experiments on YouTube.
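As a rough illustration of the decomposition idea in (i), the sketch below computes a slate's value as the choice-probability-weighted sum of item-wise LTVs. It assumes a conditional-logit user-choice model with a "no click" option, which is only one possible choice model; the function names, scores, and LTV numbers are hypothetical and are not taken from the paper's implementation.

```python
import math


def choice_probabilities(item_scores, null_score=0.0):
    """Probability that the user picks each slate item.

    Assumption: a conditional-logit choice model in which a "no click"
    option with score `null_score` competes with the slate items.
    """
    weights = [math.exp(s) for s in item_scores]
    total = sum(weights) + math.exp(null_score)
    return [w / total for w in weights]


def slate_value(item_scores, item_ltvs, null_score=0.0):
    """Slate value as a sum of item-wise LTVs weighted by choice probability.

    `item_ltvs[i]` stands for the long-term value of the user consuming
    item i; in a real system it would be learned, here it is just an input.
    """
    probs = choice_probabilities(item_scores, null_score)
    return sum(p * q for p, q in zip(probs, item_ltvs))


# Two equally attractive items plus the null option: each is chosen with
# probability 1/3, so the slate value is (3.0 + 6.0) / 3.
value = slate_value([0.0, 0.0], [3.0, 6.0])
```

Because the slate value is built from per-item quantities, a learner only needs item-wise value estimates rather than one estimate per possible slate, which is what keeps the combinatorial action space manageable in this sketch.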
SlateQ: A Tractable Decomposition for Reinforcement Learning with Recommendation Sets
Vihan Jain
Jing Wang
Sanmit Narvekar
Ritesh Agarwal
Rui Wu
Proceedings of the Twenty-eighth International Joint Conference on Artificial Intelligence (IJCAI-19), Macau, China (2019), pp. 2592-2599
Reinforcement learning (RL) methods for recommender systems optimize recommendations for long-term user engagement. However, since users are often presented with slates of multiple items---which may have interacting effects on user choice---methods are required to deal with the combinatorics of the RL action space. We develop SlateQ, a decomposition of value-based temporal-difference and Q-learning that renders RL tractable with slates. Under mild assumptions on user choice behavior, we show that the long-term value (LTV) of a slate can be decomposed into a tractable function of its component item-wise LTVs. We demonstrate our methods in simulation, and validate the scalability and effectiveness of decomposed TD-learning on YouTube.
Wide & Deep Learning for Recommender Systems
Levent Koc
Tal Shaked
Glen Anderson
Wei Chai
Mustafa Ispir
Rohan Anil
Zakaria Haque
Lichan Hong
Vihan Jain
Xiaobing Liu
Hemal Shah
arXiv:1606.07792 (2016)
Generalized linear models with nonlinear feature transformations are widely used for large-scale regression and classification problems with sparse inputs. Memorization of feature interactions through a wide set of cross-product feature transformations is effective and interpretable, while generalization requires more feature engineering effort. With less feature engineering, deep neural networks can generalize better to unseen feature combinations through low-dimensional dense embeddings learned for the sparse features. However, deep neural networks with embeddings can over-generalize and recommend less relevant items when the user-item interactions are sparse and high-rank. In this paper, we present Wide & Deep learning---jointly trained wide linear models and deep neural networks---to combine the benefits of memorization and generalization for recommender systems. We productionized and evaluated the system on a commercial mobile app store with over one billion active users and over one million apps. Online experiment results show that Wide & Deep significantly increased app acquisitions compared with wide-only and deep-only models.
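To make the wide-plus-deep combination concrete, here is a minimal forward-pass sketch in plain Python. It assumes the two components' logits are summed and passed through a sigmoid, with a single ReLU hidden layer on the deep side. All feature names, weights, and layer sizes are hypothetical toy values, and training (the "jointly trained" part) is not shown.

```python
import math


def dense(inputs, weights, biases):
    """Fully connected layer; `weights` holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def wide_logit(active_features, wide_weights):
    """Linear model over sparse features, including cross-product features."""
    return sum(wide_weights.get(f, 0.0) for f in active_features)


def deep_logit(feature_ids, embeddings, hidden_w, hidden_b, out_w):
    """Embed sparse ids, concatenate, apply one ReLU layer, then project."""
    concat = [x for fid in feature_ids for x in embeddings[fid]]
    hidden = [max(0.0, h) for h in dense(concat, hidden_w, hidden_b)]
    return sum(w * h for w, h in zip(out_w, hidden))


def wide_and_deep_probability(wide_features, deep_ids, params):
    """Sum the wide and deep logits plus a bias, then apply a sigmoid."""
    logit = (wide_logit(wide_features, params["wide"])
             + deep_logit(deep_ids, params["emb"], params["hidden_w"],
                          params["hidden_b"], params["out_w"])
             + params["bias"])
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical toy parameters, chosen only to exercise the code path.
params = {
    "wide": {"installed=app_a AND impression=app_b": 0.5},
    "emb": {"user_1": [1.0, 0.0], "app_b": [0.0, 1.0]},
    "hidden_w": [[1.0, 0.0, 0.0, 1.0], [-1.0, 0.0, 0.0, -1.0]],
    "hidden_b": [0.0, 0.0],
    "out_w": [-0.25, 1.0],
    "bias": 0.0,
}
p = wide_and_deep_probability(
    {"installed=app_a AND impression=app_b"}, ["user_1", "app_b"], params)
```

The wide side can only score feature combinations it has an explicit weight for (memorization), while the deep side scores any pair of ids through their embeddings (generalization); summing the logits lets each compensate for the other.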
Efficient projections onto the l1-ball for learning in high dimensions
John Duchi
Shai Shalev-Shwartz
Yoram Singer
ICML '08: Proceedings of the 25th international conference on Machine learning, ACM, New York, NY, USA (2008), pp. 272-279
Paxos Made Live - An Engineering Perspective (2006 Invited Talk)
Joshua Redstone
Proceedings of the 26th Annual ACM Symposium on Principles of Distributed Computing, ACM Press (2007)
We describe our experience in building a fault-tolerant database using the Paxos consensus algorithm. Despite the existing literature in the field, building such a database proved to be non-trivial. We describe selected algorithmic and engineering problems encountered, and the solutions we found for them. Our measurements indicate that we have built a competitive system.
Bigtable: A Distributed Storage System for Structured Data
Fay Chang
Deborah A. Wallach
Mike Burrows
Andrew Fikes
7th USENIX Symposium on Operating Systems Design and Implementation (OSDI), USENIX (2006), pp. 205-218
Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google Finance. These applications place very different demands on Bigtable, both in terms of data size (from URLs to web pages to satellite imagery) and latency requirements (from backend bulk processing to real-time data serving). Despite these varied demands, Bigtable has successfully provided a flexible, high-performance solution for all of these Google products. In this paper we describe the simple data model provided by Bigtable, which gives clients dynamic control over data layout and format, and we describe the design and implementation of Bigtable.