Sam Hasinoff
Sam Hasinoff is a software engineer at Google. Before joining Google in 2011, he was a Research Assistant Professor at the Toyota Technological Institute at Chicago (TTIC), a philanthropically endowed academic institute on the campus of the University of Chicago. From 2008 to 2010, he was a postdoctoral fellow at the Massachusetts Institute of Technology, supported in part by the Natural Sciences and Engineering Research Council of Canada. He received the BSc degree in computer science from the University of British Columbia in 2000, and the MSc and PhD degrees in computer science from the University of Toronto in 2002 and 2008, respectively. In 2006, he received an honorable mention for the Longuet-Higgins Best Paper Award at the European Conference on Computer Vision. He received the 2008 Alain Fournier Award for the top Canadian dissertation in computer graphics. http://people.csail.mit.edu/hasinoff/
Authored Publications
Handheld Mobile Photography in Very Low Light
Kiran Murthy
Yun-Ta Tsai
Tim Brooks
Tianfan Xue
Nikhil Karnad
Dillon Sharlet
Ryan Geiss
Marc Levoy
ACM Transactions on Graphics, 38 (2019), pp. 16
Taking photographs in low light using a mobile phone is challenging and rarely produces pleasing results. Aside from the physical limits imposed by read noise and photon shot noise, these cameras are typically handheld, have small apertures and sensors, use mass-produced analog electronics that cannot easily be cooled, and are commonly used to photograph subjects that move, like children and pets. In this paper we describe a system for capturing clean, sharp, colorful photographs in light as low as 0.3 lux, where human vision becomes monochromatic and indistinct. To permit handheld photography without flash illumination, we capture, align, and combine multiple frames. Our system employs “motion metering”, which uses an estimate of motion magnitudes (whether due to handshake or moving objects) to identify the number of frames and the per-frame exposure times that together minimize both noise and motion blur in a captured burst. We combine these frames using robust alignment and merging techniques that are specialized for high-noise imagery. To ensure accurate colors in such low light, we employ a learning-based auto white balancing algorithm. To prevent the photographs from looking like they were shot in daylight, we use tone mapping techniques inspired by illusionistic painting: increasing contrast, crushing shadows to black, and surrounding the scene with darkness. All of these processes are performed using the limited computational resources of a mobile device. Our system can be used by novice photographers to produce shareable pictures in a few seconds based on a single shutter press, even in environments so dim that humans cannot see clearly.
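As a rough illustration of the motion metering idea described above, the sketch below picks a per-frame exposure time and frame count from an estimated motion magnitude. The thresholds, limits, and function name are illustrative assumptions for the example, not the parameters used in the paper.

```python
import numpy as np

def motion_metering(motion_px_per_s, max_blur_px=2.0, target_total_exposure_s=0.5,
                    max_frames=15, min_exposure_s=1 / 200, max_exposure_s=1 / 3):
    """Pick a per-frame exposure time and frame count for a burst.
    Longer exposures gather more light but blur moving content, so the
    exposure is capped by the motion estimate; more frames then make up
    the light budget, capped for capture latency. All limits here are
    illustrative placeholders."""
    # Longest exposure whose expected blur (motion * exposure) stays acceptable.
    t_exp = min(max_exposure_s, max_blur_px / max(motion_px_per_s, 1e-6))
    t_exp = max(t_exp, min_exposure_s)
    # Enough frames to reach the desired total exposure, within the latency cap.
    n_frames = int(np.clip(np.ceil(target_total_exposure_s / t_exp), 1, max_frames))
    return t_exp, n_frames

# Fast subject motion forces short per-frame exposures and a longer burst.
print(motion_metering(motion_px_per_s=80.0))
```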
Deep Bilateral Learning for Real-Time Image Enhancement
ACM Transactions on Graphics (Proceedings of SIGGRAPH 2017)
Performance is a critical challenge in mobile image processing. Given a reference imaging pipeline, or even human-adjusted pairs of images, we seek to reproduce the enhancements and enable real-time evaluation. For this, we introduce a new neural network architecture inspired by bilateral grid processing and local affine color transforms. Using pairs of input/output images, we train a convolutional neural network to predict the coefficients of a locally-affine model in bilateral space. Our architecture learns to make local, global, and content-dependent decisions to approximate the desired image transformation. At runtime, the neural network consumes a low-resolution version of the input image, produces a set of affine transformations in bilateral space, upsamples those transformations in an edge-preserving fashion using a new slicing node, and then applies those upsampled transformations to the full-resolution image. Our algorithm processes high-resolution images on a smartphone in milliseconds, provides a real-time viewfinder at 1080p resolution, and matches the quality of state-of-the-art approximation techniques on a large class of image operators. Unlike previous work, our model is trained off-line from data and therefore does not require access to the original operator at runtime. This allows our model to learn complex, scene-dependent transformations for which no reference implementation is available, such as the photographic edits of a human retoucher.
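To illustrate the final step described above, the following sketch applies a low-resolution bilateral grid of per-cell affine color transforms to a full-resolution image. It uses nearest-cell lookup rather than the edge-preserving slicing node described in the paper, and all shapes and names are assumptions for the example.

```python
import numpy as np

def slice_and_apply(grid, image, guide):
    """Apply a low-resolution bilateral grid of 3x4 affine color transforms
    to a full-resolution image. `grid` has shape (gh, gw, gd, 3, 4), `image`
    is (h, w, 3) RGB, and `guide` is an (h, w) guidance map in [0, 1].
    Nearest-cell lookup stands in for trilinear, edge-preserving slicing."""
    gh, gw, gd = grid.shape[:3]
    h, w = guide.shape
    ys = np.clip(np.arange(h) * gh // h, 0, gh - 1)          # spatial grid rows
    xs = np.clip(np.arange(w) * gw // w, 0, gw - 1)          # spatial grid cols
    zs = np.clip((guide * gd).astype(int), 0, gd - 1)        # intensity bins
    A = grid[ys[:, None], xs[None, :], zs]                   # (h, w, 3, 4) per-pixel affine
    rgb1 = np.concatenate([image, np.ones((h, w, 1))], -1)   # homogeneous RGB
    return np.einsum('hwij,hwj->hwi', A, rgb1)               # apply per-pixel transform
```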
Bilateral Guided Upsampling
Jiawen Chen
Andrew Adams
Neal Wadhwa
ACM Transactions on Graphics (Proceedings of SIGGRAPH Asia 2016) (2016)
We present an algorithm to accelerate a large class of image processing operators. Given a low-resolution reference input and output pair, we model the operator by fitting local curves that map the input to the output. We can then produce a full-resolution output by evaluating these low-resolution curves on the full-resolution input. We demonstrate that this faithfully models state-of-the-art operators for tone mapping, style transfer, and recoloring. The curves are computed by lifting the input into a bilateral grid and then solving for the 3D array of affine matrices that best maps input color to output color per x, y, intensity bin. We enforce a smoothness term on the matrices which prevents false edges and noise amplification. We can either globally optimize this energy, or quickly approximate a solution by locally fitting matrices and then enforcing smoothness by blurring in grid space. This latter option reduces to joint bilateral upsampling or the guided filter depending on the choice of parameters. The cost of running the algorithm is reduced to the cost of running the original algorithm at greatly reduced resolution, as fitting the curves takes about 10 ms on mobile devices, and 1-2 ms on desktop CPUs, and evaluating the curves can be done with a simple GPU shader.
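A minimal sketch of the fast approximation described above, restricted to a grayscale input/output pair for brevity: it fits a gain and bias per bilateral grid cell and enforces smoothness by blurring in grid space. Here the per-cell sufficient statistics are blurred before solving, a common variant of the fit-then-blur scheme the abstract describes; the grid sizes, smoothing sigma, and names are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def fit_bilateral_curves(lowres_in, lowres_out, grid_xy=16, grid_z=8, sigma=1.0):
    """Fit a per-cell affine curve out = a * in + b in a bilateral grid that
    maps `lowres_in` to `lowres_out` (both (h, w) grayscale in [0, 1]), with
    smoothness enforced by Gaussian blurring in grid space."""
    h, w = lowres_in.shape
    ys = np.arange(h) * grid_xy // h
    xs = np.arange(w) * grid_xy // w
    zs = np.clip((lowres_in * grid_z).astype(int), 0, grid_z - 1)
    idx = (np.broadcast_to(ys[:, None], (h, w)),
           np.broadcast_to(xs[None, :], (h, w)), zs)
    # Per-cell sufficient statistics for the 1D affine fit.
    stats = {k: np.zeros((grid_xy, grid_xy, grid_z)) for k in ('n', 'x', 'y', 'xx', 'xy')}
    np.add.at(stats['n'], idx, 1.0)
    np.add.at(stats['x'], idx, lowres_in)
    np.add.at(stats['y'], idx, lowres_out)
    np.add.at(stats['xx'], idx, lowres_in ** 2)
    np.add.at(stats['xy'], idx, lowres_in * lowres_out)
    stats = {k: gaussian_filter(v, sigma) for k, v in stats.items()}  # grid-space smoothing
    n = np.maximum(stats['n'], 1e-6)
    var = stats['xx'] - stats['x'] ** 2 / n
    cov = stats['xy'] - stats['x'] * stats['y'] / n
    a = cov / (var + 1e-3)                 # per-cell gain (small ridge for stability)
    b = (stats['y'] - a * stats['x']) / n  # per-cell bias
    return a, b  # slice these at full resolution to produce the output
```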
Burst photography for high dynamic range and low-light imaging on mobile cameras
Dillon Sharlet
Ryan Geiss
Andrew Adams
Florian Kainz
Jiawen Chen
Marc Levoy
SIGGRAPH Asia (2016)
Cell phone cameras have small apertures, which limits the number of photons they can gather, leading to noisy images in low light. They also have small sensor pixels, which limits the number of electrons each pixel can store, leading to limited dynamic range. We describe a computational photography pipeline that captures, aligns, and merges a burst of frames to reduce noise and increase dynamic range. Our system has several key features that help make it robust and efficient. First, we do not use bracketed exposures. Instead, we capture frames of constant exposure, which makes alignment more robust, and we set this exposure low enough to avoid blowing out highlights. The resulting merged image has clean shadows and high bit depth, allowing us to apply standard HDR tone mapping methods. Second, we begin from Bayer raw frames rather than the demosaicked RGB (or YUV) frames produced by hardware Image Signal Processors (ISPs) common on mobile platforms. This gives us more bits per pixel and allows us to circumvent the ISP's unwanted tone mapping and spatial denoising. Third, we use a novel FFT-based alignment algorithm and a hybrid 2D/3D Wiener filter to denoise and merge the frames in a burst. Our implementation is built atop Android's Camera2 API, which provides per-frame camera control and access to raw imagery, and is written in the Halide domain-specific language (DSL). It runs in 4 seconds on device (for a 12 Mpix image), requires no user intervention, and ships on several mass-produced cell phones.
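The two sketches below illustrate, under simplifying assumptions, the kind of FFT-based tile alignment and frequency-domain merging the abstract describes. They use plain 2D cross-correlation and a simplified 2D shrinkage merge rather than the paper's coarse-to-fine alignment and hybrid 2D/3D Wiener filter, and the constants and names are placeholders.

```python
import numpy as np

def align_tile_fft(ref_tile, alt_tile):
    """Estimate the integer translation aligning alt_tile to ref_tile via
    FFT-based cross-correlation (a stand-in for coarse-to-fine tile alignment)."""
    corr = np.fft.ifft2(np.fft.fft2(ref_tile) * np.conj(np.fft.fft2(alt_tile))).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    h, w = ref_tile.shape
    return (dy + h // 2) % h - h // 2, (dx + w // 2) % w - w // 2  # wrap to signed shifts

def merge_tiles(ref_tile, aligned_tiles, noise_var=1e-3):
    """Merge aligned tiles in the frequency domain. Each alternate frame is
    pulled toward the reference in proportion to how badly it matches, a
    simplified 2D analogue of a Wiener-style merge; noise_var is an
    illustrative tuning constant."""
    ref_f = np.fft.fft2(ref_tile)
    acc = ref_f.copy()
    for tile in aligned_tiles:
        alt_f = np.fft.fft2(tile)
        mismatch = np.abs(ref_f - alt_f) ** 2
        shrink = mismatch / (mismatch + noise_var)   # 1 = keep reference, 0 = trust alternate
        acc += alt_f + shrink * (ref_f - alt_f)
    return np.fft.ifft2(acc / (1 + len(aligned_tiles))).real
```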