Framework for Recasting Table-to-Text Generation Data for Tabular Inference

Aashna Jena; Manish Shrivastava; Vivek Gupta; Julian Martin Eisenschlos

Findings of EMNLP (2022)

Abstract

Prior work on constructing challenging tabular inference data has centered primarily on human annotation or automatic synthetic generation. Both techniques have drawbacks. Human annotation, despite its diversity and superior reasoning, is difficult to scale. Synthetic data, despite its scalability, lacks linguistic and reasoning diversity. In this paper, we address both concerns by presenting a recasting approach that semi-automatically generates tabular NLI instances. We transform the table-to-text dataset ToTTo (Parikh et al., 2020) into a tabular NLI dataset using our proposed framework. We demonstrate the use of our recast data both as an evaluation benchmark and as augmentation data to improve performance on TabFact (Chen et al., 2020b). Furthermore, we test the effectiveness of models trained on our data on the TabFact benchmark in the zero-shot scenario.
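The abstract does not spell out the recasting steps, so the following is only an illustrative sketch of what turning a table-to-text example into tabular NLI instances could look like, not the authors' framework. It assumes a simplified record with a table, a reference sentence, and highlighted cells; the names `NLIInstance` and `recast`, the toy table, and the value-swap perturbation are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Table = List[Dict[str, str]]  # one dict per row, keyed by column name


@dataclass
class NLIInstance:
    table: Table
    hypothesis: str
    label: str  # "entailed" or "refuted"


def recast(table: Table, sentence: str,
           highlighted: List[Tuple[int, str]]) -> List[NLIInstance]:
    """Turn one table-to-text example into NLI instances (hypothetical sketch).

    The reference sentence is kept as an entailed hypothesis. For each
    highlighted cell (row index, column name) whose value appears verbatim
    in the sentence, one refuted hypothesis is made by swapping in a
    different value from the same column.
    """
    instances = [NLIInstance(table, sentence, "entailed")]
    for row_idx, column in highlighted:
        true_value = table[row_idx][column]
        if true_value not in sentence:
            continue  # nothing to perturb for this cell
        for other_row in table:
            candidate = other_row[column]
            if candidate != true_value:
                perturbed = sentence.replace(true_value, candidate, 1)
                instances.append(NLIInstance(table, perturbed, "refuted"))
                break  # one perturbation per highlighted cell
    return instances


# Toy data, invented purely for illustration.
table = [
    {"Player": "Ann Lee", "Team": "Hawks", "Points": "31"},
    {"Player": "Bo Chen", "Team": "Owls", "Points": "18"},
]
sentence = "Ann Lee scored 31 points for the Hawks."
highlighted = [(0, "Points"), (0, "Team")]

instances = recast(table, sentence, highlighted)
for inst in instances:
    print(inst.label, "|", inst.hypothesis)
```

A naive swap like this can occasionally yield a sentence that is still true of the table, which is one reason a semi-automatic pipeline with filtering or human checks would be preferable to a fully automatic one.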

Research Areas

Natural language processing
