We build systems to interpret, reason about, and transform sensory data.
About the team
Research in Perception tackles the hard problems of understanding images, sounds, music and video, as well as providing more powerful tools for image capture, compression, processing, creative expression, and augmented reality.
Our technology powers products across Alphabet, including image understanding in Search and Google Photos, camera enhancements for the Pixel Phone, handwriting interfaces for Android, optical character recognition for Google Drive, video understanding and summarization for YouTube, Google Cloud, Google Photos and Nest, as well as mobile apps including Motion Stills, PhotoScan and Allo.
We actively contribute to the open source and research communities. Our pioneering deep learning advances, such as Inception and Batch Normalization, are available in TensorFlow. Further, we have released several large-scale datasets for machine learning, including: AudioSet (audio event detection); AVA (human action understanding in video); Open Images (image classification and object detection); and YouTube-8M (video labeling).
Creating accurate ML models capable of localizing and identifying multiple objects in a single image remains a core challenge in the field, and we invest a significant amount of time training and experimenting with these systems.
With Motion Stills on Android we built a new recording experience where everything you capture is instantly transformed into delightful short clips that are easy to watch and share.
Google Machine Perception researchers, in collaboration with Daydream Labs and YouTube Spaces, have been working on solutions to address this problem wherein we reveal the user’s face by virtually “removing” the headset and create a realistic see-through effect.
With “RAISR: Rapid and Accurate Image Super-Resolution”, we introduce a technique that incorporates machine learning in order to produce high-quality versions of low-resolution images.
We released an update to PhotoScan, an app for iOS and Android that allows you to digitize photo prints with just a smartphone.
Some of our people
We interpret the world around us through our senses. Our team helps make Google products more helpful and delightful by giving our systems the ability to understand and reason about images, video and sound.
What AI will enable us to do as a society is going to be very transformative. It's up to us to design the building blocks of tomorrow that can elevate every individual to do much more from what they are able to do today.