An Overview of the Tesseract OCR Engine

Ray Smith

An Overview of the Tesseract OCR Engine

Ray Smith

Proc. Ninth Int. Conference on Document Analysis and Recognition (ICDAR), IEEE Computer Society (2007), pp. 629-633

Google Scholar

Abstract

The Tesseract OCR engine, as was the HP Research
Prototype in the UNLV Fourth Annual Test of OCR
Accuracy[1], is described in a comprehensive
overview. Emphasis is placed on aspects that are novel
or at least unusual in an OCR engine, including in
particular the line finding, features/classification
methods, and the adaptive classifier.

Research Areas

Machine perception

Explore our many areas of focus

Building a collaborative ecosystem

Shaping the future together

Translating discovery into real-world impact

An Overview of the Tesseract OCR Engine

Abstract

Research Areas

Meet the teams driving innovation

Google AI

Google Cloud

Google DeepMind

Google Labs