Chia-Kai Liang
I am a software engineer at Google, focusing on mobile computational photography. I obtained my BS degree in 2004 and defended my PhD dissertation in 2008, both at National Taiwan University. My research interests include computational photography, image/video processing, computer vision, and hardware architecture: basically, anything that records, reads, modifies, writes, or displays pixel, voxel, or ray-el values.
Authored Publications
Abstract
We present Steadiface, a new real-time face-centric video stabilization method that simultaneously removes hand shake and keeps the subject's head stable. We use a CNN to estimate face landmarks and use them to compute a stabilized head center. We then formulate an optimization problem to find a virtual camera pose that places the face at the stabilized head center while retaining smooth rotation and translation transitions across frames. We tested the proposed method on field-test videos and show that it stabilizes both the head motion and the background. It is robust to large head poses, occlusion, facial appearance variations, and different kinds of camera motion. We show that our method advances the state of the art in selfie video stabilization by comparing it against alternative methods. The whole process runs very efficiently on a modern mobile phone (8.1 ms/frame).
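The abstract describes computing a stabilized head center from CNN-estimated landmarks and smoothing it across frames. A minimal sketch of the temporal-smoothing idea, assuming landmarks are already available per frame (the function names and the exponential filter are illustrative stand-ins; the paper instead solves an optimization for a virtual camera pose):

```python
import numpy as np

def head_center(landmarks):
    """Mean of facial landmarks as a crude head-center estimate."""
    return np.mean(np.asarray(landmarks, dtype=float), axis=0)

def smooth_head_centers(per_frame_landmarks, alpha=0.8):
    """Exponentially smooth head centers across frames.

    alpha close to 1 gives heavier smoothing (more stable, more lag).
    This is only a stand-in for the paper's per-frame pose optimization.
    """
    smoothed = []
    state = None
    for lm in per_frame_landmarks:
        c = head_center(lm)
        state = c if state is None else alpha * state + (1 - alpha) * c
        smoothed.append(state.copy())
    return smoothed
```

A stabilizer would then warp each frame so the detected head center lands on the smoothed trajectory rather than the jittery raw one.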
Abstract
Photographers take wide-angle shots to enjoy expansive views, capture group portraits that never miss anyone, or gain extra freedom to compose subjects against spectacular scenery. Despite the rapid proliferation of wide-angle cameras on mobile phones, a wider field of view (FOV) introduces stronger perspective distortion.
Most notably, portrait subjects may look vastly different from real life: faces appear stretched, squished, or skewed. Correcting such distortions requires professional editing skills, as naive manipulations can introduce other kinds of distortion.
This paper introduces a new algorithm to recover undistorted faces without disturbing other parts of the photo. Motivated by the fact that the stereographic projection preserves local geometry on the camera viewing sphere, we formulate an optimization problem to create a warping mesh that locally adapts to the stereographic projection on facial regions and seamlessly evolves to the perspective projection over the background. We introduce a new energy function to minimize face distortion, which works reliably even for a large group selfie. We also use CNN-based portrait segmentation to assign the weights in the objective function. Our algorithm is fully automatic and runs at an interactive rate on mobile platforms. We demonstrate promising results across a wide range of camera FOVs, from 72° to 115°.
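The core geometric idea is blending two radial mappings of the viewing sphere: perspective projection (r = f tan θ) for the background and stereographic projection (r = 2f tan(θ/2)) on faces. A hedged sketch of that per-pixel blend, assuming a face-segmentation weight is available (the paper optimizes a full warping mesh with additional energy terms; this only illustrates the two projections being interpolated):

```python
import numpy as np

def blended_radius(theta, face_weight, f=1.0):
    """Blend perspective and stereographic radial mappings.

    theta: angle from the optical axis (radians)
    face_weight: 1 on faces (stereographic), 0 on background (perspective)
    f: focal length in the same units as the output radius
    """
    r_persp = f * np.tan(theta)            # standard perspective projection
    r_stereo = 2.0 * f * np.tan(theta / 2.0)  # locally shape-preserving
    return face_weight * r_stereo + (1.0 - face_weight) * r_persp
```

For small θ the two mappings agree, so distortion correction mainly matters near the image periphery where wide-angle faces get stretched.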
Handheld Multi-Frame Super-Resolution
Bartlomiej Wronski
Manfred Ernst
Michael Krainin
Marc Levoy
ACM Transactions on Graphics (TOG), 38 (2019), pp. 18
Abstract
Compared to DSLR cameras, smartphone cameras have smaller sensors, which limits their spatial resolution; smaller apertures, which limits their light-gathering ability; and smaller pixels, which reduces their signal-to-noise ratio. The use of color filter arrays (CFAs) requires demosaicing, which further degrades resolution.
In this paper, we supplant the use of traditional demosaicing in single-frame and burst photography pipelines with a multi-frame super-resolution algorithm that creates a complete RGB image directly from a burst of CFA raw images. We harness natural hand tremor, typical in handheld photography, to acquire a burst of raw frames with small offsets. These frames are then aligned and merged to form a single image with red, green, and blue values at every pixel site. This approach, which includes no explicit demosaicing step, serves to both increase image resolution and boost the signal-to-noise ratio.
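The align-and-merge idea can be illustrated in a toy setting: frames with known sub-pixel offsets are scattered onto a finer grid, so multiple noisy low-resolution samples jointly fill in high-resolution pixel sites. A minimal sketch assuming ideal half-pixel offsets (the real pipeline estimates offsets from hand tremor and merges with robust, anisotropic kernels; all names here are illustrative):

```python
import numpy as np

def merge_half_pixel_offsets(frames, offsets):
    """Merge low-res frames with known half-pixel offsets onto a 2x grid.

    frames:  list of HxW arrays
    offsets: list of (dy, dx) shifts in low-res pixel units, each in {0, 0.5}
    Returns a 2H x 2W merged image (simple averaging where samples overlap).
    """
    h, w = frames[0].shape
    hi = np.zeros((2 * h, 2 * w))
    cnt = np.zeros_like(hi)
    for frame, (dy, dx) in zip(frames, offsets):
        iy, ix = int(round(2 * dy)), int(round(2 * dx))
        hi[iy::2, ix::2] += frame   # scatter samples to their sub-pixel site
        cnt[iy::2, ix::2] += 1
    cnt[cnt == 0] = 1               # avoid division by zero at empty sites
    return hi / cnt
```

With four frames covering the four half-pixel phases, every high-resolution site receives a direct measurement, which is the intuition behind skipping demosaicing entirely.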
Our algorithm is robust to challenging scene conditions: local motion, occlusion, or scene changes. It runs at 100 milliseconds per 12-megapixel RAW input burst frame on mass-produced mobile phones.
Specifically, the algorithm is the basis of the Super-Res Zoom feature, as well as the default merge method in Night Sight mode (whether zooming or not) on Google's flagship phone.
Real-Time Video Denoising on Mobile Phones
Jana Ehmann
Lun-Cheng Chu
Sung-fang Tsai
International Conference on Image Processing, IEEE (2018) (to appear)
Abstract
We present an algorithm for real-time video denoising on mobile platforms. Based on Gaussian-Laplacian pyramid decomposition, our solution's main contributions are fast alignment and a new interpolation function that fuses noisy frames into a denoised result. The interpolation function is adaptive to local and global properties of the input frame, is robust to motion alignment errors, and can be computed efficiently. We show that the proposed algorithm has quality comparable to offline high-quality video denoising methods, but is orders of magnitude faster. On a modern mobile platform, our method takes less than 20 ms to process one HD frame, and it achieves the highest score on a public benchmark.
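The key property claimed for the interpolation function is robustness to motion alignment errors: where the aligned previous frame disagrees with the current one, temporal fusion should back off. A minimal sketch of that behavior (a Gaussian falloff on the pixel difference, applied here to whole frames rather than per pyramid level; the names and falloff shape are illustrative, not the paper's exact function):

```python
import numpy as np

def fuse_adaptive(current, previous, sigma=10.0, max_weight=0.8):
    """Motion-robust temporal fusion of a noisy frame with its predecessor.

    current:  noisy frame (float array)
    previous: aligned, previously denoised frame of the same shape
    Pixels where the frames disagree (likely motion or misalignment)
    get a small blending weight, so ghosting is suppressed.
    """
    diff = current.astype(float) - previous.astype(float)
    w = max_weight * np.exp(-(diff ** 2) / (2.0 * sigma ** 2))
    return current - w * diff   # equals (1 - w)*current + w*previous
```

In a pyramid-based pipeline this fusion would run per level, letting coarse levels absorb large-scale noise while fine levels preserve detail.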