Henry A. Rowley

Henry A. Rowley received BS degrees in Electrical Engineering and Computer Science from the University of Minnesota in 1992, a Master's in Computer Science from Carnegie Mellon University in 1994, and a PhD in Computer Science from Carnegie Mellon University in 1999 for his thesis work on neural network-based face detection. After graduating he worked at Zaxel Systems, Inc. on lossless video compression and multi-view stereo reconstruction, and at Microsoft on Chinese, Japanese, and Korean handwriting recognition. He is currently a member of the Google Research group, where he has worked on computer vision, machine learning, and most recently handwriting recognition.
Authored Publications
    Fast Multi-language LSTM-based Online Handwriting Recognition
    Thomas Deselaers
    Alexander Daryin
    Marcos Calvo
    Li-Lun Wang
    Sandro Feuz
    Philippe Gervais
    International Journal on Document Analysis and Recognition (IJDAR) (2020)
    Handwriting is a natural input method for many people and we continuously invest in improving the recognition quality. Here we describe and motivate the modelling and design choices that lead to a significant improvement across the 100 supported languages, based on recurrent neural networks and a variety of language models. This new architecture has completely replaced our previous segment-and-decode system [Google:HWRPAMI] and reduced the error rate by 30%-40% relative for most languages. Further, we report new state-of-the-art results on IAM-OnDB for both the open and closed dataset setting. By using Bézier curves for shortening the input length of our sequences we obtain up to 10x faster recognition times. Through a series of experiments we determine what layers are needed and how wide and deep they should be. We evaluate the setup on a number of additional public datasets.
    Multi-Language Online Handwriting Recognition
    Thomas Deselaers
    Li-Lun Wang
    IEEE Transactions on Pattern Analysis and Machine Intelligence (2016)
    We describe Google's online handwriting recognition system that currently supports 22 scripts and 97 languages. The system's focus is on fast, high-accuracy text entry for mobile, touch-enabled devices. We use a combination of state-of-the-art components and combine them with novel additions in a flexible framework. This architecture allows us to easily transfer improvements between languages and scripts. This made it possible to build recognizers for languages that, to the best of our knowledge, are not handled by any other online handwriting recognition system. The approach also enabled us to use the same architecture both on very powerful machines for recognition in the cloud as well as on mobile devices with more limited computational power by changing some of the settings of the system. In this paper we give a general overview of the system architecture and the novel components, such as unified time- and position-based input interpretation, trainable segmentation, minimum-error rate training for feature combination, and a cascade of pruning strategies. We present experimental results for different setups. The system is currently publicly available in several Google products, for example in Google Translate and as an input method for Android devices.
    GyroPen: Gyroscopes for Pen-input with Mobile Phones
    Thomas Deselaers
    Jan Hosang
    IEEE Transactions on Human-Machine Systems, 45 (2015), pp. 263-271
    We present GyroPen, a method for text entry into mobile devices using pen-like writing interaction reconstructed from standard built-in sensors. The key idea is to reconstruct a representation of the trajectory of the phone's corner that is touching a writing surface from the measurements obtained from the phone's gyroscopes and accelerometers. We propose to directly use the angular trajectory for this reconstruction, which removes the necessity for accurate absolute 3D position estimation, a task that can be difficult using low-cost accelerometers. Recognition is then performed using an off-the-shelf handwriting recognition system, allowing easy extension to new languages and scripts. In a small user study (n=10), the average novice participant was able to write the first word only 37 seconds after starting to use GyroPen for the first time. With some experience, users were able to write at a speed of 3-4 s per English word with a character error rate of 18%.
    Large-scale SVD and manifold learning
    Ameet Talwalkar
    Journal of Machine Learning Research, 14 (2013), pp. 3129-3152
    This paper presents the algorithms which power Google Correlate, a tool which finds web search terms whose popularity over time best matches a user-provided time series. Correlate was developed to generalize the query-based modeling techniques pioneered by Google Flu Trends and make them available to end users. Correlate searches across millions of candidate query time series to find the best matches, returning results in less than 200 milliseconds. Its feature set and requirements present unique challenges for Approximate Nearest Neighbor (ANN) search techniques. In this paper, we present Asymmetric Hashing (AH), the technique used by Correlate, and show how it can be adapted to the specific needs of the product. We then develop experiments to test the throughput and recall of Asymmetric Hashing as compared to a brute-force search. For "full" search vectors, we achieve a 10x speedup over brute force search while maintaining 97% recall. For search vectors which contain holdout periods, we achieve a 4x speedup over brute force search, also with 97% recall.
    Learning Binary Codes for High Dimensional Data Using Bilinear Projections
    Yunchao Gong
    Svetlana Lazebnik
    IEEE Computer Vision and Pattern Recognition (2013)
    Recent advances in visual recognition indicate that to achieve good retrieval and classification accuracy on large scale datasets like ImageNet, extremely high-dimensional visual descriptors, e.g., Fisher Vectors, are needed. We present a novel method for converting such descriptors to compact similarity-preserving binary codes that exploits their natural matrix structure to reduce their dimensionality using compact bilinear projections instead of a single large projection matrix. This method achieves comparable retrieval and classification accuracy to the original descriptors and to the state-of-the-art Product Quantization approach while having orders of magnitude faster code generation time and smaller memory footprint.
    Google Image Swirl: A Large-Scale Content-Based Image Visualization System
    Yushi Jing
    Jingbin Wang
    David Tsai
    Chuck Rosenberg
    Michele Covell
    WWW (2012), pp. 539-540
    Image Saliency: From Local to Global Context
    Meng Wang
    Janusz Konrad
    Prakash Ishwar
    Yushi Jing
    Proc. Conference on Computer Vision and Pattern Recognition (CVPR) (2011)
    Large-Scale Image Annotation using Visual Synset
    David Tsai
    Yushi Jing
    Yi Liu
    Sergey Ioffe
    James Rehg
    Proc. International Conference on Computer Vision (ICCV) (2011)