Shyam Upadhyay

Authored Publications
    A baseline method for running bidirectional models like BERT in a streaming NLU text setting is to rerun the model for each new (sub)token received. No previously computed features are reused, and the computation restarts from scratch at each timestep on the new prefix that includes the newly received token. This leads to computational inefficiency (measured as FLOP count, with a lower count being better). The proposed approach addresses this issue by reducing the FLOP count of computing bidirectional features in the streaming setting, and it also improves performance, or generalization, on incomplete inputs (partials). It has two components: a partially bidirectional encoder model and an adapter that guides the restarts of the bidirectional layer. Our evaluations on 4 sequence tagging datasets show that these gains are obtained while maintaining similar performance on the complete input.
    Understanding tables is an important aspect of natural language understanding. Existing models for table understanding require linearization of table contents in certain levels, where row or column orders are encoded as unwanted biases. Such spurious biases make the model vulnerable to row and column order perturbations. Also, prior work did not explicitly and thoroughly model structural biases, hindering the table-text modeling ability. In this work, we propose a robust table-text encoding architecture TableFormer, where tabular structural biases are incorporated completely through learnable attention biases. TableFormer is invariant to row and column orders, and could understand tables better due to its tabular inductive biases. Experiments showed that TableFormer outperforms strong baselines in all settings on SQA, WTQ and TabFact table reasoning datasets, and achieves state-of-the-art performance on SQA, especially when facing answer-invariant row and column perturbations (6% improvement over the best baseline), because previous SOTA models' performance drops by 4% - 6% when facing such perturbations while TableFormer is not affected.
    CST5: Code-Switched Semantic Parsing using T5
    Jigar Hasmukhbhai Gupta
    Pankaj Joshi
    Rahul Goel
    Rengarajan Aravamudhan
    arXiv (2022)
    Extending semantic parsers to code-mixed input has been a challenging problem, primarily due to the lack of labeled data for supervision. In this work, we introduce CST5, a new data augmentation technique that fine-tunes a T5 model using a small ($\approx$100 examples) seed set to generate code-mixed utterances from English utterances, allowing us to overcome the labeled data scarcity. We release over 10K annotated CS utterances alongside over 170K augmented CS utterances. Furthermore, we demonstrate the effectiveness of the augmentation technique by comparing baseline models trained without data augmentation to models trained with augmented data, for varying amounts of training data.
    TimeDial: Temporal Commonsense Reasoning in Dialog
    Lianhui Qin
    Aditya Gupta
    Yejin Choi
    Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics (2021)
    Everyday conversations require understanding everyday events, which in turn, requires understanding temporal commonsense concepts interwoven with those events. Despite recent progress with massive pre-trained language models (LMs) such as T5 and GPT-3, their capability of temporal reasoning in dialogs remains largely under-explored. In this paper, we present the first study to investigate pre-trained LMs for their temporal reasoning capabilities in dialogs by introducing a new task and a crowd-sourced English challenge set, TimeDial. We formulate TimeDial as a multiple choice cloze task with over 1.1K carefully curated dialogs. Empirical results demonstrate that even the best performing models struggle on this task compared to humans, with 23 absolute points of gap in accuracy. Furthermore, our analysis reveals that the models fail to reason about dialog context correctly; instead, they rely on shallow cues based on existing temporal patterns in context, motivating future research for modeling temporal concepts in text and robust contextual reasoning about them. The dataset is publicly available at https://github.com/google-research-datasets/timedial.
    Disfl-QA: A Benchmark Dataset for Understanding Disfluencies in Question Answering
    Aditya Gupta
    Jiacheng Xu
    Diyi Yang
    Findings of the Association for Computational Linguistics: ACL 2021, Association for Computational Linguistics
    Disfluency is an under-studied topic in NLP, even though it is ubiquitous in human conversation. This is largely due to the lack of datasets containing disfluencies. In this paper, we present a new challenge question answering dataset, Disfl-QA, a derivative of SQuAD, where humans introduce contextual disfluencies in previously fluent questions. Disfl-QA contains a variety of challenging disfluencies that require a more comprehensive understanding of the text than what was necessary in prior datasets. Experiments show that the performance of existing state-of-the-art question answering models degrades significantly when tested on Disfl-QA in a zero-shot setting. We show that data augmentation methods partially recover the loss in performance and also demonstrate the efficacy of using gold data for fine-tuning. We argue that we need large-scale disfluency datasets in order for NLP models to be robust to them. The dataset is publicly available at: https://github.com/google-research-datasets/disfl-qa.
    (Almost) Zero-Shot Cross-Lingual Spoken Language Understanding
    Gokhan Tur
    Dilek Hakkani-Tur
    Larry Heck
    Proceedings of the IEEE ICASSP (2018)
    Spoken language understanding (SLU) is a component of goal-oriented dialogue systems that aims to interpret users' natural language queries in the system's semantic representation format. While current state-of-the-art SLU approaches achieve high performance for English domains, the same is not true for other languages. Approaches in the literature for extending SLU models and grammars to new languages rely primarily on machine translation. This poses a challenge in scaling to new languages, as machine translation systems may not be reliable for several (especially low-resource) languages. In this work, we examine different approaches to train an SLU component with little supervision for two new languages, Hindi and Turkish, and show that with only a few hundred labeled examples we can surpass the approaches proposed in the literature. Our experiments show that training a model bilingually (i.e., jointly with English) enables faster learning, in that the model requires fewer labeled instances in the target language to generalize. Qualitative analysis shows that rare slot types benefit the most from the bilingual training.