FieldSwap: Data Augmentation for Effective Form-Like Document Extraction

Seth Ebner
IEEE 40th International Conference on Data Engineering (ICDE)(2024), pp. 4722-4732

Abstract

Extracting structured data from visually rich documents like invoices, receipts, financial statements, and tax forms is key to automating many business workflows. However, building extraction models in this domain often demands a large collection of high-quality training examples. To address this challenge, we introduce FieldSwap, a novel data augmentation technique specifically designed for such extraction problems. FieldSwap generates synthetic training examples by replacing key phrases indicative of one field with those corresponding to another. Our experiments on five diverse datasets demonstrate that incorporating FieldSwap-augmented data into the training process can enhance model performance by 1-11 F1 points, particularly when dealing with limited training data (10--100 documents). Additionally, we propose algorithms for automatically inferring key phrases from the training data. Our findings indicate that FieldSwap is effective regardless of whether key phrases are manually provided by human experts or inferred automatically.

Research Areas