Sequence-to-Label Script Identification for Multilingual OCR

Ashok C Popat

Jonathan Michael Baccash

Karel Driesen

Patrick Michael Hurst

Yasuhisa Fujii

Proceedings of the 14th International Conference on Document Analysis and Recognition (ICDAR), IEEE (2017)

Abstract

We describe a novel line-level script identification method. In multilingual OCR, script identification is a crucial component because it automates the provision of a language hint. Previous work repurposed an OCR model that generates per-character script codes, which are then aggregated by a counting heuristic to obtain a line-level script ID. This baseline has two shortcomings. First, as a sequence-to-sequence model, it is more complex than necessary for the sequence-to-label problem of line script ID, which makes it hard to train and inefficient to run. Second, the counting heuristic may be suboptimal compared to a learned model. We therefore reframe line script identification as a sequence-to-label problem and solve it using two components trained end-to-end: an encoder and a summarizer. The encoder converts a line image into a sequence of features. The summarizer aggregates this sequence to classify the line. We test various summarizers while keeping identical Inception-style convolutional networks as encoders. Experiments on scanned books and photos containing 232 languages in 30 scripts show a 16% reduction in script ID error rate compared to the baseline. This improved script ID reduces the character error rate attributable to script misidentification by 33%.
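The contrast between the two aggregation strategies can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the mean-pooling summarizer, and the toy weights are all assumptions made for illustration, and the abstract does not say which summarizers were tested. The first function mimics a counting heuristic over per-character script codes; the second treats the encoder output as a feature sequence and maps it to a single label.

```python
from collections import Counter
import math


def counting_heuristic(char_scripts):
    """Baseline-style aggregation (illustrative): return the most frequent
    per-character script code in the line."""
    return Counter(char_scripts).most_common(1)[0][0]


def mean_pool_summarizer(features, weights, biases, labels):
    """Hypothetical summarizer: average the feature sequence over time,
    apply a linear layer, then softmax to get one label per line.

    features: list of T feature vectors (each of length D) from an encoder
    weights:  list of K rows (each of length D), one row per script label
    biases:   list of K floats
    labels:   list of K script names
    """
    dim = len(features[0])
    # Collapse the time axis: one pooled vector per line.
    pooled = [sum(f[d] for f in features) / len(features) for d in range(dim)]
    logits = [
        sum(w * x for w, x in zip(row, pooled)) + b
        for row, b in zip(weights, biases)
    ]
    # Numerically stable softmax.
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], probs


# Toy inputs (made up): three time steps, two feature dimensions.
toy_features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
toy_weights = [[1.0, 0.0], [0.0, 1.0]]  # untrained, for illustration only
toy_biases = [0.0, 0.0]
toy_labels = ["Latn", "Cyrl"]

baseline_label = counting_heuristic(["Latn", "Cyrl", "Latn"])
line_label, line_probs = mean_pool_summarizer(
    toy_features, toy_weights, toy_biases, toy_labels
)
```

In the sequence-to-label setting, the weights of both the encoder and the summarizer would be learned jointly from line-level labels, whereas the counting heuristic is a fixed rule applied to the output of a separately trained sequence-to-sequence model.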
