Sandeep Tata

FieldSwap: Data Augmentation for Effective Form-Like Document Extraction

Jing Xie

James Wendt

Yichao Zhou

Seth Ebner

Sandeep Tata

IEEE 40th International Conference on Data Engineering (ICDE) (2024), pp. 4722-4732

VRDU: A Benchmark for Visually-rich Document Understanding

Zilong Wang

Yichao Zhou

Wei Wei

Chen-Yu Lee

Sandeep Tata

2023 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Selective Labeling: How to Radically Lower Data-Labeling Costs for Document Extraction Models

Yichao Zhou

James Wendt

Navneet Potti

Jing Xie

Sandeep Tata

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, ACL, pp. 3847-3860

STRUM: Extractive Aspect-Based Contrastive Summarization

Beliz Gunel

Sandeep Tata

Marc Najork

Companion Proceedings of the ACM Web Conference 2023, 28–31

Learning Transferable Node Representations for Attribute Extraction from Web Documents

Yichao Zhou

Ying Sheng

Nguyen Ha Vo

Nick Edmonds

Sandeep Tata

Web Search and Data Mining (2022)

Glean: Structured Extractions from Templatic Documents

Sandeep Tata

Navneet Potti

James B. Wendt

Lauro Beltrao Costa

Marc Najork

Beliz Gunel

Proceedings of the VLDB Endowment (2021), pp. 997-1005

Data-Efficient Information Extraction from Form-Like Documents

Beliz Gunel

Navneet Potti

Sandeep Tata

James B. Wendt

Marc Najork

Jing Xie

Document Intelligence Workshop @ KDD 2021

Representation Learning for Information Extraction from Form-like Documents

Bodhisattwa Majumder

Navneet Potti

Sandeep Tata

James B. Wendt

Qi Zhao

Marc Najork

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020), pp. 6495-6504

Migrating a Privacy-Safe Information Extraction System to a Software 2.0 Design

Ying Sheng

Nguyen Ha Vo

James B. Wendt

Sandeep Tata

Marc Najork

Proceedings of the 10th Annual Conference on Innovative Data Systems Research (2020)

FreeDOM: A Transferable Neural Architecture for Structured Information Extraction on Web Documents

Yuchen Lin

Ying Sheng

Nguyen Ha Vo

Sandeep Tata

KDD 2020 (to appear)

