Jump to Content
Rich Washington

Rich Washington

Rich Washington is a Software Engineer at Google in Paris, France. His current work is on Operations Research. Previously within Google he worked on YouTube video annotation, web search quality, spelling correction, and biologically-inspired vision algorithms. From 1997-2004 he worked at NASA Ames Research Center on Mars rover autonomy and plan execution. He holds a BA in Mathematics from Johns Hopkins University and a PhD in Computer Science (Artificial Intelligence) from Stanford University.
Authored Publications
Google Publications
Other Publications
Sort By
  • Title
  • Title, desc
  • Year
  • Year, desc
    Preview abstract We present a new approach to learning sparse, spatiotemporal features and demonstrate the utility of the approach by applying the resulting sparse codes to the problem of activity recognition. Learning features that discriminate among human activities in video is difficult in part because the stable space-time events that reliably characterize the relevant motions are rare. To overcome this problem, we adopt a multi-stage approach to activity recognition. In the initial preprocessing stage, we first whiten and apply local contrast normalization to each frame of the video. We then apply an additional set of filters to identify and extract salient space-time volumes that exhibit smooth periodic motion. We collect a large corpus of these space-time volumes as training data for the unsupervised learning of a sparse, over-complete basis using a variant of the two-phase analysis-synthesis algorithm of Olshausen and Field [1997]. We treat the synthesis phase, which consists of reconstructing the input as sparse a mostly coefficient zero and most importantly the time required for reconstruction in subsequent use production we adapted existing algorithms to exploit potential parallelism through the use of readily-available SIMD hardware. To obtain better codes, we developed a new approach to learning sparse, spatiotemporal codes in which the number of basis vectors, their orientations, velocities and the size of their receptive fields change over the duration of unsupervised training. The algorithm starts with a relatively small, initial basis with minimal temporal extent. This initial basis is obtained through conventional sparse coding techniques and is expanded over time by recursively constructing a new basis consisting of basis vectors with larger temporal extent that proportionally conserve regions of previously trained weights. These proportionally conserved weights are combined with the result of adjusting newly added weights to represent a greater range of primitive motion features. The size of the current basis is determined probabilistically by sampling from existing basis vectors according to their activation on the training set. The resulting algorithm produces bases consisting of filters that are bandpass, spatially oriented and temporally diverse in terms of their transformations and velocities. We demonstrate the utility of our approach by using it to recognize human activity in video. View details
    Recursive Sparse Spatiotemporal Coding
    Thomas Dean
    Proceedings of the Fifth IEEE International Workshop on Multimedia Information Processing and Retrieval, IEEE Computer Society (2009)
    Preview abstract We present a new approach to learning sparse, spatiotemporal codes in which the number of basis vectors, their orientations, velocities and the size of their receptive fields change over the duration of unsupervised training. The algorithm starts with a relatively small, initial basis with minimal temporal extent. This initial basis is obtained through conventional sparse coding techniques and is expanded over time by recursively constructing a new basis consisting of basis vectors with larger temporal extent that proportionally conserve regions of previously trained weights. These proportionally conserved weights are combined with the result of adjusting newly added weights to represent a greater range of primitive motion features. The size of the current basis is determined probabilistically by sampling from existing basis vectors according to their activation on the training set. The resulting algorithm produces bases consisting of filters that are bandpass, spatially oriented and temporally diverse in terms of their transformations and velocities. The basic methodology borrows inspiration from the layer-by-layer learning of multiple-layer restricted Boltzmann machines developed by Geoff Hinton and his students. Indeed, we can learn multiple-layer sparse codes by training a stack of denoising autoencoders, but we have had greater success using L1 regularized regression in a variation on Olshausen and Field's original SPARSENET. To accelerate learning and focus attention, we apply a space-time interest-point operator that selects for periodic motion. This attentional mechanism enables us to efficiently compute and compactly represent a broad range of interesting motion. We demonstrate the utility of our approach by using it to recognize human activity in video. Our algorithm meets or exceeds the performance of state-of-the-art activity-recognition methods. View details
    On the Prospects for Building a Working Model of the Visual Cortex
    Thomas Dean
    Glenn Carroll
    Proceedings of AAAI-07, MIT Press, Cambridge, Massachusetts (2007), pp. 1597-1600
    Preview abstract Human-level visual performance has remained largely beyond the reach of engineered systems despite decades of research and significant advances in problem formulation, algorithms and computing power. We posit that significant progress can be made by combining existing technologies from machine vision, insights from theoretical neuroscience and large-scale distributed computing. Such claims have been made before and so it is quite reasonable to ask what are the new ideas we bring to the table that might make a difference this time around. From a theoretical standpoint, our primary point of departure from current practice is our reliance on exploiting time in order to turn an otherwise intractable unsupervised problem into a locally semi-supervised, and plausibly tractable, learning problem. From a pragmatic perspective, our system architecture follows what we know of cortical neuroanatomy and provides a solid foundation for scalable hierarchical inference. This combination of features provides the framework for implementing a wide range of robust object-recognition capabilities. View details
    Dynamic Programming for Structured Continuous Markov Decision Problems
    Zhengzhu Feng
    Richard Dearden
    Nicolas Meuleau
    UAI (2004), pp. 154-161
    Planning Under Continuous Time and Resource Uncertainty: A Challenge for AI
    John L. Bresina
    Richard Dearden
    Nicolas Meuleau
    Sailesh Ramakrishnan
    David E. Smith
    Uncertainty in Artificial Intelligence (2002), pp. 77-84
    Plan Execution, Monitoring, and Adaptation for Planetary Rovers
    Keith Golden
    John L. Bresina
    Electron. Trans. Artif. Intell., vol. 4 (2000), pp. 3-21