
Peyman Milanfar
I lead the Computational Imaging / Image Processing team in Google Research. My team develops core imaging technologies that are used in a number of products at Google.
One of these technologies is RAISR (Rapid and Accurate Image Super-Resolution): given an image, we wish to produce a larger image with significantly more pixels and higher image quality. Using pairs of example images, we train a set of filters (i.e., a mapping) that, when applied to an image not in the training set, produces a higher-resolution version of it. The work was highlighted in a Research Blog post. The technology was launched for G+ Photos worldwide, and also as part of the MotionStills app.
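To make the idea of training filters from example pairs concrete, here is a minimal sketch. It is a deliberate simplification and not the RAISR implementation: it learns one global filter by least squares, omits any grouping of patches by local image structure, and all function names (`extract_patches`, `learn_filter`, `apply_filter`) are hypothetical. It assumes the two images in a pair are float arrays of the same size, for example a cheaply upscaled image and its high-quality counterpart.

```python
import numpy as np

def extract_patches(img, k):
    """Return every k x k patch of a 2-D array as a row, in row-major pixel order."""
    h, w = img.shape
    r = k // 2
    rows = []
    for i in range(r, h - r):
        for j in range(r, w - r):
            rows.append(img[i - r:i + r + 1, j - r:j + r + 1].ravel())
    return np.asarray(rows)

def learn_filter(cheap, target, k=5):
    """Least-squares fit of one k x k filter mapping `cheap` patches to `target` pixels."""
    r = k // 2
    A = extract_patches(cheap, k)          # one patch per interior pixel
    b = target[r:-r, r:-r].ravel()         # the matching target pixels
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
    return coeffs.reshape(k, k)

def apply_filter(cheap, filt):
    """Apply a learned filter to the interior of a new image (borders are left as-is)."""
    k = filt.shape[0]
    r = k // 2
    h, w = cheap.shape
    out = cheap.copy()
    out[r:-r, r:-r] = (extract_patches(cheap, k) @ filt.ravel()).reshape(h - 2 * r, w - 2 * r)
    return out
```

Once learned, the filter can be applied to an image outside the training pair with `apply_filter(new_image, filt)`.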
Another is Turbo Denoising for camera pipelines and other imaging applications. We produced a single-frame denoiser that (1) is fast enough to be practical even for mobile devices, and (2) handles the content-dependent noise that is typical of real camera captures. For realistic camera noise, our results are competitive with BM3D at nearly 400 times the speed, a speedup of roughly two orders of magnitude with state-of-the-art quality. As a side benefit, less noisy images compress better and lead to smaller file sizes.
Another is Style Transfer, the process of migrating a style from a given image to the content of another, synthesizing a new image that is an artistic mixture of the two. Our algorithm extends earlier work on texture synthesis while aiming for stylized images that come closer in quality to those produced by Convolutional Neural Networks. The proposed algorithm is fast and flexible, and can process any pair of content and style images.
My team also works on more theoretical questions. For instance, in RED (Regularization by Denoising) we proposed a new way to use a denoising engine to define the regularization for any inverse problem. RED is an explicit, image-adaptive, Laplacian-based regularization functional, which makes the overall objective functional clear and well-defined. With complete flexibility in the choice of iterative optimization procedure for minimizing this functional, RED can incorporate any image denoising algorithm, treats general inverse problems very effectively, and is guaranteed to converge to the globally optimal result. As examples of its utility, we test this approach and demonstrate state-of-the-art results in image deblurring and super-resolution.
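As a rough illustration of the RED idea, the sketch below minimizes E(x) = ½‖x − y‖² + (λ/2)·xᵀ(x − f(x)) by steepest descent, using the gradient (x − y) + λ(x − f(x)). Several things here are assumptions made for the example rather than anything stated above: the forward operator is the identity (plain denoising instead of deblurring or super-resolution), the signal is 1-D, and the denoiser f is a toy symmetric linear smoother, for which that gradient expression is exact. The function names and parameter values are hypothetical.

```python
import numpy as np

def smoothing_denoiser(x, alpha=0.25):
    """Toy linear denoiser f(x) = (I - alpha * L) x, with L the path-graph Laplacian."""
    lap = np.zeros_like(x)
    lap[1:-1] = 2 * x[1:-1] - x[:-2] - x[2:]
    lap[0] = x[0] - x[1]
    lap[-1] = x[-1] - x[-2]
    return x - alpha * lap

def red_objective(x, y, lam, denoiser=smoothing_denoiser):
    """Data fidelity plus the RED regularizer (lam / 2) * x^T (x - f(x))."""
    return 0.5 * np.sum((x - y) ** 2) + 0.5 * lam * np.dot(x, x - denoiser(x))

def red_gradient(x, y, lam, denoiser=smoothing_denoiser):
    """Gradient of the objective; exact here because the toy denoiser is symmetric and linear."""
    return (x - y) + lam * (x - denoiser(x))

def red_steepest_descent(y, lam=1.0, step=0.2, iters=200, denoiser=smoothing_denoiser):
    """Plain gradient descent on the RED objective, started from the noisy input."""
    x = y.copy()
    for _ in range(iters):
        x = x - step * red_gradient(x, y, lam, denoiser)
    return x
```

With this particular denoiser the regularizer reduces to (αλ/2)·xᵀLx, a simple case of a Laplacian-based penalty. The objective is then a convex quadratic, so gradient descent with a small enough step reaches its unique minimizer.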
A bit about my background: Prior to joining Google, I was a Professor of Electrical Engineering at UC Santa Cruz from 1999 to 2014. I was also Associate Dean for Research at the School of Engineering from 2010 to 2012. From 2012 to 2014 I was on leave at Google-x, where I helped develop the imaging pipeline for Google Glass. I received my undergraduate education in electrical engineering and mathematics from the University of California, Berkeley, and the MS and PhD degrees in electrical engineering from MIT. I hold 11 US patents, several of which are commercially licensed. I founded MotionDSP in 2005. I've been a keynote speaker at numerous technical conferences, including the Picture Coding Symposium (PCS), SIAM Imaging Sciences, SPIE, and the IEEE International Conference on Multimedia and Expo (ICME). Along with my former students, I have won several best paper awards from the IEEE Signal Processing Society.
I am a Distinguished Lecturer of the IEEE Signal Processing Society, and a Fellow of the IEEE "for contributions to inverse problems and super-resolution in imaging."
Please visit my public website for the most up-to-date list of my publications, CV, etc.
Authored Publications
DVMark: A Deep Multiscale Network for Video Watermarking
Huiwen Chang
Ce Liu
IEEE Transactions on Image Processing (2023)
SVDiff: Compact Parameter Space for Diffusion Fine-Tuning
Ligong Han
Han Zhang
Dimitris Metaxas
IEEE/CVF International Conference on Computer Vision (ICCV) (2023)
Soft Diffusion: Score Matching with General Corruptions
Giannis Daras
Alexandros Dimakis
Transactions on Machine Learning Research (TMLR) (2023)
Interpretable Unsupervised Diversity Denoising and Artefact Removal
Mangal Prakash
Florian Jug
International Conference on Learning Representations (2022)
Deep 3D-to-2D Watermarking: Embedding Messages in 3D Meshes and Extracting Them from 2D Renderings
Ce Liu
Huiwen Chang
Innfarn Yoo
Ondrej Stava
Computer Vision and Pattern Recognition (2022)
Deblurring via Stochastic Refinement
Jay Whang
Chitwan Saharia
Alexandros Dimakis
CVPR (2022)
MAXIM: Multi-Axis MLP for Image Processing
Zhengzhong Tu
Han Zhang
Alan Bovik
IEEE/CVF Conference on Computer Vision and Pattern Recognition (2022)
MaxViT: Multi-Axis Vision Transformer
Zhengzhong Tu
Han Zhang
Alan Bovik
European Conference on Computer Vision (ECCV) (2022)