Google Research

Bringing Vision To The Blind: From Coarse To Fine, One Dollar At A Time

WACV 2019: IEEE Winter Conference on Applications of Computer Vision (2019)

Abstract

While deep learning has achieved great success in building vision applications for mainstream users, relatively little work has addressed giving blind and visually impaired people a personal, on-device visual assistant for daily life. Unlike mainstream applications, a vision system for the blind must be robust, reliable and safe to use. In this paper, we propose a fine-grained currency recognizer based on CONGAS, which surpasses other popular local features by a large margin. In addition, we introduce an effective and lightweight coarse classifier that gates the fine-grained recognizer on resource-constrained mobile devices. The coarse-to-fine approach provides an extensible mobile-vision architecture and demonstrates how coordinating deep learning and local-feature-based methods can help resolve a challenging problem for the blind and visually impaired. The proposed system runs in real time with ~150 ms latency on a Pixel device and achieves 98% precision and 97% recall on a challenging evaluation set.
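
The gating idea in the abstract can be illustrated with a minimal sketch: a cheap coarse classifier scores each frame, and the expensive fine-grained recognizer runs only when that score clears a threshold. This is not the paper's implementation. The class name `CoarseToFinePipeline`, the `gate_threshold` value of 0.5, and the stub classifiers below are all hypothetical and stand in for the real coarse model and the CONGAS-based recognizer.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CoarseToFinePipeline:
    """Hypothetical coarse-to-fine gate; all names here are illustrative."""

    # Cheap model: returns an assumed probability that the frame shows currency.
    coarse_score: Callable[[dict], float]
    # Expensive model: returns a denomination label, or None if nothing matches.
    fine_recognize: Callable[[dict], Optional[str]]
    # Assumed gate threshold; the paper's actual operating point is not given here.
    gate_threshold: float = 0.5
    # Counts how often the expensive recognizer actually ran.
    fine_calls: int = 0

    def process(self, frame: dict) -> Optional[str]:
        # Skip the expensive recognizer when the coarse score is below the gate.
        if self.coarse_score(frame) < self.gate_threshold:
            return None
        self.fine_calls += 1
        return self.fine_recognize(frame)


if __name__ == "__main__":
    # Stub models: a "frame" is a dict carrying a precomputed score and label.
    pipeline = CoarseToFinePipeline(
        coarse_score=lambda frame: frame["p"],
        fine_recognize=lambda frame: frame["label"],
    )
    print(pipeline.process({"p": 0.1, "label": "USD 1"}))  # None (gated out)
    print(pipeline.process({"p": 0.9, "label": "USD 1"}))  # USD 1
    print(pipeline.fine_calls)  # 1
```

In this sketch the fine-grained recognizer runs for only one of the two frames, which is the kind of saving a gate is meant to provide on a resource-constrained device.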
