Research in machine perception tackles the hard problems of understanding images, sounds, music, and video. In recent years, our computers have become much better at such tasks, enabling a variety of new applications, such as content-based search in Google Photos and Image Search, natural handwriting interfaces for Android, optical character recognition for Google Drive documents, and recommendation systems that understand music and YouTube videos. Our approach is driven by algorithms that benefit from processing very large, partially labeled datasets on parallel computing clusters. A good example is our recent work on object recognition using a novel deep convolutional neural network architecture known as Inception, which achieves state-of-the-art results on academic benchmarks and lets users easily search their large Google Photos collections. The ability to mine meaningful information from multimedia is applied broadly throughout Google.
Recent publications
Scaling Symbolic Methods using Gradients for Neural Model Explanation
International Conference on Learning Representations (2021)
Elf: Accelerate High-resolution Mobile Deep Vision with Content-aware Distributed Offloading
The 27th Annual International Conference on Mobile Computing and Networking (ACM MobiCom 2021) (2021) (to appear)
EdgeSharing: Edge Assisted Real-time Localization and Object Sharing in Urban Streets
The 40th IEEE International Conference on Computer Communications (IEEE INFOCOM 2021) (2021) (to appear)
When Ensembling Smaller Models is More Efficient than Single Large Models
Visual Understanding by Learning from Web Data 2020, CVPR (2020)