Google Computer Vision research at CVPR 2015

June 7, 2015

Posted by Vincent Vanhoucke, Google Research Scientist

Much of the world's data is in the form of visual media. In order to utilize meaningful information from multimedia and deliver innovative products, such as Google Photos, Google builds machine-learning systems that are designed to enable computer perception of visual input, in addition to pursuing image and video analysis techniques focused on image/scene reconstruction and understanding.

This week, Boston hosts the 2015 Conference on Computer Vision and Pattern Recognition (CVPR 2015), the premier annual computer vision event comprising the main CVPR conference and several co-located workshops and short courses. As a leader in computer vision research, Google will have a strong presence at CVPR 2015, with many Googlers presenting publications in addition to hosting workshops and tutorials on topics covering image/video annotation and enhancement, 3D analysis and processing, development of semantic similarity measures for visual objects, synthesis of meaningful composites for visualization/browsing of large image/video collections and more.

Learn more about some of our research in the list below (Googlers highlighted in blue). If you are attending CVPR this year, we hope you’ll stop by our booth and chat with our researchers about the projects and opportunities at Google that go into solving interesting problems for hundreds of millions of people. Members of the Jump team will also have a prototype of the camera on display and will be showing videos produced using the Jump system on Google Cardboard.

Applied Deep Learning for Computer Vision with Torch
Koray Kavukcuoglu, Ronan Collobert, Soumith Chintala

DIY Deep Learning: a Hands-On Tutorial with Caffe
Evan Shelhamer, Jeff Donahue, Yangqing Jia, Jonathan Long, Ross Girshick

ImageNet Large Scale Visual Recognition Challenge Tutorial
Olga Russakovsky, Jonathan Krause, Karen Simonyan, Yangqing Jia, Jia Deng, Alex Berg, Fei-Fei Li

Fast Image Processing With Halide
Jonathan Ragan-Kelley, Andrew Adams, Fredo Durand

Open Source Structure-from-Motion
Matt Leotta, Sameer Agarwal, Frank Dellaert, Pierre Moulon, Vincent Rabaud

Oral Sessions:
Modeling Local and Global Deformations in Deep Learning: Epitomic Convolution, Multiple Instance Learning, and Sliding Window Detection
George Papandreou, Iasonas Kokkinos, Pierre-André Savalle

Going Deeper with Convolutions
Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich

DynamicFusion: Reconstruction and Tracking of Non-Rigid Scenes in Real-Time
Richard A. Newcombe, Dieter Fox, Steven M. Seitz

Show and Tell: A Neural Image Caption Generator
Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan

Long-Term Recurrent Convolutional Networks for Visual Recognition and Description
Jeffrey Donahue, Lisa Anne Hendricks, Sergio Guadarrama, Marcus Rohrbach, Subhashini Venugopalan, Kate Saenko, Trevor Darrell

Visual Vibrometry: Estimating Material Properties from Small Motion in Video
Abe Davis, Katherine L. Bouman, Justin G. Chen, Michael Rubinstein, Frédo Durand, William T. Freeman

Fast Bilateral-Space Stereo for Synthetic Defocus
Jonathan T. Barron, Andrew Adams, YiChang Shih, Carlos Hernández

Poster Sessions:
Learning Semantic Relationships for Better Action Retrieval in Images
Vignesh Ramanathan, Congcong Li, Jia Deng, Wei Han, Zhen Li, Kunlong Gu, Yang Song, Samy Bengio, Charles Rosenberg, Li Fei-Fei

FaceNet: A Unified Embedding for Face Recognition and Clustering
Florian Schroff, Dmitry Kalenichenko, James Philbin

A Mixed Bag of Emotions: Model, Predict, and Transfer Emotion Distributions
Kuan-Chuan Peng, Tsuhan Chen, Amir Sadovnik, Andrew C. Gallagher

Best-Buddies Similarity for Robust Template Matching
Tali Dekel, Shaul Oron, Michael Rubinstein, Shai Avidan, William T. Freeman

Articulated Motion Discovery Using Pairs of Trajectories
Luca Del Pero, Susanna Ricco, Rahul Sukthankar, Vittorio Ferrari

Reflection Removal Using Ghosting Cues
YiChang Shih, Dilip Krishnan, Frédo Durand, William T. Freeman

P3.5P: Pose Estimation with Unknown Focal Length
Changchang Wu

MatchNet: Unifying Feature and Metric Learning for Patch-Based Matching
Xufeng Han, Thomas Leung, Yangqing Jia, Rahul Sukthankar, Alexander C. Berg

Inferring 3D Layout of Building Facades from a Single Image
Jiyan Pan, Martial Hebert, Takeo Kanade

The Aperture Problem for Refractive Motion
Tianfan Xue, Hossein Mobahei, Frédo Durand, William T. Freeman

Video Magnification in Presence of Large Motions
Mohamed Elgharib, Mohamed Hefeeda, Frédo Durand, William T. Freeman

Robust Video Segment Proposals with Painless Occlusion Handling
Zhengyang Wu, Fuxin Li, Rahul Sukthankar, James M. Rehg

Ontological Supervision for Fine Grained Classification of Street View Storefronts
Yair Movshovitz-Attias, Qian Yu, Martin C. Stumpe, Vinay Shet, Sacha Arnoud, Liron Yatziv

VIP: Finding Important People in Images
Clint Solomon Mathialagan, Andrew C. Gallagher, Dhruv Batra

Fusing Subcategory Probabilities for Texture Classification
Yang Song, Weidong Cai, Qing Li, Fan Zhang

Beyond Short Snippets: Deep Networks for Video Classification
Joe Yue-Hei Ng, Matthew Hausknecht, Sudheendra Vijayanarasimhan, Oriol Vinyals, Rajat Monga, George Toderici

THUMOS Challenge 2015
Program organizers include: Alexander Gorban, Rahul Sukthankar

DeepVision: Deep Learning in Computer Vision 2015
Invited Speaker: Rahul Sukthankar

Large Scale Visual Commerce (LSVisCom)
Panelist: Luc Vincent

Large-Scale Video Search and Mining (LSVSM)
Invited Speaker and Panelist: Rahul Sukthankar
Program Committee includes: Apostol Natsev

Vision meets Cognition: Functionality, Physics, Intentionality and Causality
Program Organizers include: Peter Battaglia

Big Data Meets Computer Vision: 3rd International Workshop on Large Scale Visual Recognition and Retrieval (BigVision 2015)
Program Organizers include: Samy Bengio
Includes speaker Christian Szegedy - “Scalable approaches for large scale vision”

Observing and Understanding Hands in Action (Hands 2015)
Program Committee includes: Murphy Stein

Fine-Grained Visual Categorization (FGVC3)
Program Organizers include: Anelia Angelova

Large-scale Scene Understanding Challenge (LSUN)
Winners of the Scene Classification Challenge: Julian Ibarz, Christian Szegedy and Vincent Vanhoucke
Winners of the Caption Generation Challenge: Oriol Vinyals, Alexander Toshev, Samy Bengio, and Dumitru Erhan

Looking from above: when Earth observation meets vision (EARTHVISION)
Technical Committee includes: Andreas Wendel

Computer Vision in Vehicle Technology: Assisted Driving, Exploration Rovers, Aerial and Underwater Vehicles
Invited Speaker: Andreas Wendel
Program Committee includes: Andreas Wendel

Women in Computer Vision (WiCV)
Invited Speaker: Mei Han

ChaLearn Looking at People (sponsor)

Fine-Grained Visual Categorization (FGVC3) (sponsor)