Ray Smith

Ray developed the Tesseract OCR engine at HP Labs Bristol for 10 years, then spent 3 years developing the text and line-drawing pipelines for the HP PrecisionScan product in Greeley, Colorado. After a further 7 years developing a new architecture for the OmniPage OCR product at Caere/ScanSoft/Nuance, Ray is now at Google, working on Tesseract again.

Research Areas

Machine Perception

Authored Publications

Improving Book OCR by Adaptive Language and Image Models

Dar-Shyang Lee

Ray Smith

Proceedings of 2012 10th IAPR International Workshop on Document Analysis Systems, IEEE, pp. 115-119

Limits on the Application of Frequency-based Language Models to OCR

Ray Smith

ICDAR, IEEE (2011), pp. 538-542

Table Detection in Heterogeneous Documents

Faisal Shafait

Ray Smith

Document Analysis Systems 2010, ACM International Conference Proceedings series

Adapting the Tesseract Open Source OCR Engine for Multilingual OCR

Ray Smith

Daria Antonova

Dar-Shyang Lee

MOCR '09: Proceedings of the International Workshop on Multilingual OCR (2009)

Combined Orientation and Script Detection using the Tesseract OCR Engine

Ranjith Unnikrishnan

Ray Smith

Workshop on Multilingual OCR (MOCR), Proc. 10th Intl. Conf. on Document Analysis and Recognition (ICDAR), (2009)

Hybrid Page Layout Analysis via Tab-Stop Detection

Ray Smith

Proceedings of the 10th International Conference on Document Analysis and Recognition, IEEE (2009)

An Overview of the Tesseract OCR Engine

Ray Smith

Proc. Ninth Int. Conference on Document Analysis and Recognition (ICDAR), IEEE Computer Society (2007), pp. 629-633

A simple and efficient skew detection algorithm via text row accumulation

Ray Smith

Proceedings 3rd ICDAR'95, IEEE (1995), pp. 1145-1148

Computer processing of line images: a survey

R. W. Smith

Pattern Recogn., vol. 20 (1987), pp. 7-15

