James B. Wendt

FieldSwap: Data Augmentation for Effective Form-Like Document Extraction

Jing Xie

James Wendt

Yichao Zhou

Seth Ebner

Sandeep Tata

IEEE 40th International Conference on Data Engineering (ICDE) (2024), pp. 4722-4732

Selective Labeling: How to Radically Lower Data-Labeling Costs for Document Extraction Models

Yichao Zhou

James Wendt

Navneet Potti

Jing Xie

Sandeep Tata

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, ACL, pp. 3847-3860

Data-Efficient Information Extraction from Form-Like Documents

Beliz Gunel

Navneet Potti

Sandeep Tata

James B. Wendt

Marc Najork

Jing Xie

Document Intelligence Workshop @ KDD 2021

Glean: Structured Extractions from Templatic Documents

Sandeep Tata

Navneet Potti

James B. Wendt

Lauro Beltrao Costa

Marc Najork

Beliz Gunel

Proceedings of the VLDB Endowment (2021), pp. 997-1005

Migrating a Privacy-Safe Information Extraction System to a Software 2.0 Design

Ying Sheng

Nguyen Ha Vo

James B. Wendt

Sandeep Tata

Marc Najork

Proceedings of the 10th Annual Conference on Innovative Data Systems Research (2020)

Representation Learning for Information Extraction from Form-like Documents

Bodhisattwa Majumder

Navneet Potti

Sandeep Tata

James B. Wendt

Qi Zhao

Marc Najork

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020), pp. 6495-6504

Online Template Induction for Machine-Generated Emails

Michael Whittaker

Nick Edmonds

Sandeep Tata

James B. Wendt

Marc Najork

PVLDB (2019), pp. 1235-1248

RiSER: Learning Better Representations for Richly Structured Emails

Furkan Kocayusufoğlu

Ying Sheng

Nguyen Ha Vo

James B. Wendt

Qi Zhao

Sandeep Tata

Marc Najork

Proceedings of the 2019 World Wide Web Conference, pp. 886-895

Anatomy of a Privacy-Safe Large-Scale Information Extraction System Over Email

Ying Sheng

Sandeep Tata

James B. Wendt

Jing Xie

Qi Zhao

Marc Najork

24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM (2018), pp. 734-743

Learning Effective Embeddings for Machine Generated Emails with Applications to Email Category Prediction

Yu Sun

Luis Garcia Pueyo

James B. Wendt

Marc Najork

Andrei Broder

Proceedings of the IEEE International Conference on Big Data (2018), pp. 1846-1855

