Henry A. Rowley
Henry A. Rowley received BS degrees in Electrical Engineering and Computer Science from the University of Minnesota in 1992, a Master's in Computer Science from Carnegie Mellon University in 1994, and a PhD in Computer Science from Carnegie Mellon University in 1999 for his thesis work on neural network-based face detection. After graduating, he worked at Zaxel Systems, Inc. on lossless video compression and multi-view stereo reconstruction, and at Microsoft on Chinese, Japanese, and Korean handwriting recognition. He is currently a member of the Google Research group, where he has worked on computer vision, machine learning, and most recently handwriting recognition.
Authored Publications
Fast Multi-language LSTM-based Online Handwriting Recognition
Thomas Deselaers
Alexander Daryin
Marcos Calvo
Li-Lun Wang
Sandro Feuz
Philippe Gervais
International Journal on Document Analysis and Recognition (IJDAR) (2020)
Handwriting is a natural input method for many people and we continuously invest in improving the recognition quality. Here we describe and motivate the modelling and design choices that lead to a significant improvement across the 100 supported languages, based on recurrent neural networks and a variety of language models. This new architecture has completely replaced our previous segment-and-decode system and reduced the error rate by 30%-40% relative for most languages. Further, we report new state-of-the-art results on IAM-OnDB for both the open and closed dataset setting. By using Bézier curves for shortening the input length of our sequences we obtain up to 10x faster recognition times. Through a series of experiments we determine what layers are needed and how wide and deep they should be. We evaluate the setup on a number of additional public datasets.
Multi-Language Online Handwriting Recognition
Thomas Deselaers
Li-Lun Wang
IEEE Transactions on Pattern Analysis and Machine Intelligence (2016)
We describe Google's online handwriting recognition system that currently supports 22 scripts and 97 languages. The system's focus is on fast, high-accuracy text entry for mobile, touch-enabled devices. We use a combination of state-of-the-art components and combine them with novel additions in a flexible framework. This architecture allows us to easily transfer improvements between languages and scripts. This made it possible to build recognizers for languages that, to the best of our knowledge, are not handled by any other online handwriting recognition system. The approach also enabled us to use the same architecture both on very powerful machines for recognition in the cloud as well as on mobile devices with more limited computational power by changing some of the settings of the system. In this paper we give a general overview of the system architecture and the novel components, such as unified time- and position-based input interpretation, trainable segmentation, minimum-error rate training for feature combination, and a cascade of pruning strategies. We present experimental results for different setups. The system is currently publicly available in several Google products, for example in Google Translate and as an input method for Android devices.
GyroPen: Gyroscopes for Pen-input with Mobile Phones
Thomas Deselaers
Jan Hosang
IEEE Transactions on Human-Machine Systems, 45 (2015), pp. 263-271
We present GyroPen, a method for text entry into mobile devices using pen-like
writing interaction reconstructed from standard built-in sensors. The key idea
is to reconstruct a representation of the trajectory of the phone's corner that
is touching a writing surface from the measurements obtained from the phone's
gyroscopes and accelerometers. We propose to directly use the angular
trajectory for this reconstruction, which removes the necessity for accurate
absolute 3D position estimation, a task that can be difficult using low-cost
accelerometers. Recognition is then performed using an off-the-shelf
handwriting recognition system, allowing easy extension to new languages and
scripts. In a small user study (n=10), the average novice participant was able
to write the first word only 37 seconds after starting to use GyroPen for
the first time. With some experience, users were able to write one English word
in 3-4 s with a character error rate of 18%.
Large-scale SVD and manifold learning
Ameet Talwalkar
Journal of Machine Learning Research, 14 (2013), pp. 3129-3152
This paper presents the algorithms which power Google Correlate, a tool which finds web search terms whose popularity over time best matches a user-provided time series. Correlate was developed to generalize the query-based modeling techniques pioneered by Google Flu Trends and make them available to end users.
Correlate searches across millions of candidate query time series to find the best matches, returning results in less than 200 milliseconds. Its feature set and requirements present unique challenges for Approximate Nearest Neighbor (ANN) search techniques. In this paper, we present Asymmetric Hashing (AH), the technique used by Correlate, and show how it can be adapted to the specific needs of the product.
We then develop experiments to test the throughput and recall of Asymmetric Hashing as compared to a brute-force search. For "full" search vectors, we achieve a 10x speedup over brute force search while maintaining 97% recall. For search vectors which contain holdout periods, we achieve a 4x speedup over brute force search, also with 97% recall.
Learning Binary Codes for High Dimensional Data Using Bilinear Projections
Yunchao Gong
Svetlana Lazebnik
IEEE Computer Vision and Pattern Recognition (2013)
Recent advances in visual recognition indicate that to achieve good retrieval and classification accuracy on large scale datasets like ImageNet, extremely high-dimensional visual descriptors, e.g., Fisher Vectors, are needed. We present a novel method for converting such descriptors to compact similarity-preserving binary codes that exploits their natural matrix structure to reduce their dimensionality using compact bilinear projections instead of a single large projection matrix. This method achieves comparable retrieval and classification accuracy to the original descriptors and to the state-of-the-art Product Quantization approach while having orders of magnitude faster code generation time and smaller memory footprint.
Google Image Swirl: A Large-Scale Content-Based Image Visualization System
Yushi Jing
Jingbin Wang
David Tsai
Chuck Rosenberg
Michele Covell
WWW (2012), pp. 539-540
Image Saliency: From Local to Global Context
Meng Wang
Janusz Konrad
Prakash Ishwar
Yushi Jing
Proc. Conference on Computer Vision and Pattern Recognition (CVPR) (2011)
Large-Scale Image Annotation using Visual Synset
David Tsai
Yushi Jing
Yi Liu
Sergey Ioffe
James Rehg
Proc. International Conference on Computer Vision (ICCV) (2011)