Jingning Han
Jingning Han received the B.S. degree in Electrical Engineering from Tsinghua University in 2007, and the M.S. and Ph.D. degrees in Electrical and Computer Engineering from the University of California, Santa Barbara, in 2008 and 2012, respectively. His research interests include video coding and computer architecture.
Dr. Han was a recipient of the Outstanding Teaching Assistant Awards in 2010 and 2011 and the Dissertation Fellowship in 2012, all from the Department of Electrical and Computer Engineering at the University of California, Santa Barbara. He was a recipient of the Best Student Paper Award at the IEEE International Conference on Multimedia and Expo in 2012. Dr. Han received the IEEE Signal Processing Society Best Young Author Paper Award in 2015.
Dr. Han serves as an Associate Editor for IEEE Transactions on Image Processing.
Authored Publications
This paper proposes a novel bi-directional motion compensation framework that extracts existing motion information associated with the reference frames and interpolates an additional reference frame candidate that is co-located with the current frame. The approach generates a dense motion field by performing optical flow estimation, so as to capture complex motion between the reference frames without recourse to additional side information. The estimated optical flow is then complemented by transmission of offset motion vectors to correct for possible deviation from the linearity assumption in the interpolation. Various optimization schemes specifically tailored to the video coding framework are presented to further improve the performance. To accommodate applications where decoder complexity is a cardinal concern, a block-constrained speed-up algorithm is also proposed. Experimental results show that the main approach and optimization methods yield significant coding gains across a diverse set of video sequences. Further experiments focus on the trade-off between performance and complexity, and demonstrate that the proposed speed-up algorithm offers complexity reduction by a large factor while maintaining most of the performance gains.
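The linearity assumption behind the interpolated reference frame can be illustrated with a minimal sketch. It assumes a pixel moves at constant velocity between a past and a future reference frame; the function names, the normalized time `t`, and the distance-weighted blend are illustrative assumptions, not details taken from the paper, and the transmitted offset motion vectors are not modeled.

```python
def projected_positions(pos, flow, t):
    """Locate a current-frame pixel in the two reference frames.

    pos  : (x, y) position in the current frame
    flow : (dx, dy) displacement from the past to the future reference
    t    : temporal position of the current frame, with 0 < t < 1
           (0 is the past reference, 1 is the future reference)
    """
    x, y = pos
    dx, dy = flow
    # Under linear motion the pixel has covered a fraction t of the flow.
    in_past = (x - t * dx, y - t * dy)
    in_future = (x + (1.0 - t) * dx, y + (1.0 - t) * dy)
    return in_past, in_future


def interpolate_sample(past_value, future_value, t):
    """Blend the two reference samples (assumed weighting: closer frame counts more)."""
    return (1.0 - t) * past_value + t * future_value


past_xy, future_xy = projected_positions((10, 10), (4, -2), 0.5)
print(past_xy, future_xy)
```

When the true motion deviates from this constant-velocity model, the two projected positions no longer point at the same object, which is the case the abstract's offset motion vectors are meant to correct.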
Selecting among multiple transform kernels to code prediction residuals is widely used for better compression efficiency. Conventionally, the encoder performs trials of each transform to estimate the rate-distortion (R-D) cost. However, such an exhaustive approach suffers from a significant increase in complexity due to the excessive trials. In this paper, a novel rate estimation approach is proposed to bypass the entropy coding process for each transform type using the conditional Laplace distribution model. The proposed method estimates the Laplace distribution parameter from the context inferred by the quantization level and finds the expected rate of the coefficient for transform type selection. Furthermore, a greedy search algorithm for separable transforms is also presented to further accelerate the process. Experimental results show that transform type selection using the proposed rate estimation method achieves high accuracy at lower complexity.
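A model-based rate estimate of this kind can be sketched as follows. The sketch assumes a zero-mean Laplace distribution and a uniform quantizer with rounding, and it takes the per-coefficient Laplace scale as a given input; how the paper derives that parameter from context is not reproduced here, and all function names are hypothetical.

```python
import math


def level_probability(level, scale, step):
    """Probability that a zero-mean Laplace variable quantizes to `level`.

    scale : Laplace scale parameter b, pdf = exp(-|x| / b) / (2 b)
    step  : quantizer step size (uniform quantizer with rounding, assumed)
    """
    r = step / scale
    if level == 0:
        return 1.0 - math.exp(-0.5 * r)
    k = abs(level)
    return 0.5 * math.exp(-(k - 0.5) * r) * (1.0 - math.exp(-r))


def estimated_rate_bits(levels, scales, step):
    """Sum of ideal code lengths, -log2 P(level), over a coefficient block."""
    total = 0.0
    for level, scale in zip(levels, scales):
        # Floor the probability so extreme levels do not hit log2(0).
        p = max(level_probability(level, scale, step), 1e-12)
        total += -math.log2(p)
    return total


def pick_transform(candidates, scales, step):
    """Return the transform name whose quantized levels have the lowest estimate."""
    return min(
        candidates,
        key=lambda name: estimated_rate_bits(candidates[name], scales, step),
    )


# Hypothetical quantized levels produced by two transform types.
candidates = {"type_a": [1, 0, 0, 0], "type_b": [3, 2, 1, 1]}
print(pick_transform(candidates, [1.0, 1.0, 1.0, 1.0], 1.0))
```

The point of the sketch is that the rate comparison needs only closed-form probabilities, so no entropy coder has to be run per transform candidate.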
An overview of core coding tools in the AV1 video codec
Adrian Grange
Andrey Norkin
Ching-Han Chiang
Hui Su
Jean-Marc Valin
Luc Trudeau
Nathan Egge
Paul Wilkins
Peter de Rivaz
Sarah Parker
Steinar Midtskogen
Thomas Davies
Zoe Liu
The Picture Coding Symposium (PCS) (2018)
AV1 is an emerging open-source and royalty-free video compression format, which was jointly developed and finalized in early 2018 by the Alliance for Open Media (AOMedia) industry consortium. The main goal of AV1 development is to achieve substantial compression gain over state-of-the-art codecs while maintaining practical decoding complexity and hardware feasibility. This paper provides a brief technical overview of key coding techniques in AV1 along with a preliminary compression performance comparison against VP9 and HEVC.
Novel inter and intra prediction tools under consideration for the emerging AV1 video codec
Sarah Parker
Hui Su
Angie Chiang
Zoe Liu
Chen Wang
Emil Keyder
SPIE Optical Engineering + Applications, 10396 (2017), 10396 - 10396 - 13
Google started the WebM Project in 2010 to develop open-source, royalty-free video codecs designed specifically for media on the Web. The second-generation codec released by the WebM Project, VP9, is currently served by YouTube and enjoys billions of views per day. Realizing the need for even greater compression efficiency to cope with the growing demand for video on the web, the WebM team embarked on an ambitious project to develop a next-edition codec, AV1, within a consortium of major tech companies called the Alliance for Open Media, that achieves at least a generational improvement in coding efficiency over VP9. In this paper, we focus primarily on new tools in AV1 that improve the prediction of pixel blocks before transforms, quantization and entropy coding are invoked. Specifically, we describe tools and coding modes that improve intra, inter and combined inter-intra prediction. Results are presented on standard test sets.
Screen content videos, which typically contain computer-generated text and graphics, are in growing demand in today's online video services. They exhibit many features that are not commonly seen in natural videos, including sharp edge transitions and repetitive patterns, which make their statistical characteristics distinct from those of natural videos. This calls into question the efficacy of the conventional discrete cosine transform (DCT), which builds on the Gauss-Markov model assumption that leads to a base-band signal, for coding computer-generated graphics. This work exploits a class of staircase transforms. Unlike the DCT, whose bases are samplings of sinusoidal functions, the staircase transforms have their bases sampled from staircase functions, which naturally better approximate the sharp transitions often encountered in screen content. As an alternative transform kernel, the staircase transform is integrated into a hybrid transform coding scheme, in conjunction with the DCT. It is experimentally shown that the proposed approach provides an average compression performance gain of 2.9% in terms of BD-rate reduction. A perceptual comparison further demonstrates that the use of the staircase transform achieves a substantial reduction in ringing artifacts due to the Gibbs phenomenon.
Video codecs exploit the temporal redundancy of the video signal, in the form of motion compensated prediction, to achieve superior compression performance. The coding of motion vectors takes a large portion of the total rate cost. Prior research utilizes the spatial and temporal correlations of the motion field to improve the coding efficiency of the motion information. It typically constructs a candidate pool composed of a fixed number of reference motion vectors and allows the codec to select and reuse the one that best approximates the motion activity of the current block. This largely disconnects the entropy coding process from the true boundary conditions, since they are masked by the fixed-length candidate list, and hence could potentially cause sub-optimal coding performance. An alternative motion vector referencing scheme is proposed in this work to fully accommodate the dynamic nature of the boundary conditions for compression efficiency. It adaptively extends or shortens the candidate list according to the actual number of available reference motion vectors. The associated probability model accounts for the likelihood that an individual motion vector candidate is used. A complementary motion vector candidate ranking system is also presented here. It is experimentally shown that the proposed scheme achieves considerable compression performance gains across all the test sets.
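The idea of a candidate list whose length follows the available neighbors can be sketched as below. Ranking by how often a motion vector occurs among the neighbors, the cap of four candidates, and the truncated-unary index signaling are all assumptions made for illustration; the abstract does not specify the actual ranking rule or binarization.

```python
from collections import Counter


def build_candidate_list(neighbor_mvs, max_candidates=4):
    """Build a variable-length list of distinct reference motion vectors.

    neighbor_mvs : motion vectors of neighboring blocks as (dx, dy) tuples,
                   with None where a neighbor is unavailable or not inter coded
    """
    available = [mv for mv in neighbor_mvs if mv is not None]
    counts = Counter(available)
    first_seen = {}
    for position, mv in enumerate(available):
        first_seen.setdefault(mv, position)
    # Assumed ranking: more frequent first, ties broken by scan order.
    ranked = sorted(counts, key=lambda mv: (-counts[mv], first_seen[mv]))
    return ranked[:max_candidates]


def index_decisions(index, list_length):
    """Binary decisions needed to signal `index` in a list of `list_length`.

    Each decision answers "is it a later candidate?"; the final decision is
    dropped when only one possibility remains, so shorter lists cost less.
    """
    if not 0 <= index < list_length:
        raise ValueError("index outside candidate list")
    decisions = [1] * index
    if index < list_length - 1:
        decisions.append(0)
    return decisions


neighbors = [(1, 0), None, (1, 0), (0, 2), None]
candidates = build_candidate_list(neighbors)
print(candidates, index_decisions(1, len(candidates)))
```

Because the number of decisions depends on the list length, each decision can carry its own probability, which is one way to connect the entropy coder to the actual boundary conditions.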
An estimation-theoretic approach to video denoising
Timothy Kopp
2015 IEEE International Conference on Image Processing, IEEE, pp. 4273-4277
A novel denoising scheme is proposed to fully exploit the spatio-temporal correlations of the video signal for efficient enhancement. Unlike conventional pixel domain approaches that directly connect motion compensated reference pixels and spatially neighboring pixels to build statistical models for noise filtering, this work first removes spatial correlations by applying transformations to both pixel blocks and performs estimation in the frequency domain. It is premised on the realization that the precise nature of temporal dependencies, which is entirely masked in the pixel domain by the statistics of the dominant low frequency components, emerges after signal decomposition and varies considerably across the spectrum. We derive an optimal non-linear estimator that accounts for both the motion compensated reference and the noisy observations to approximate the original video signal per transform coefficient. It departs from other transform domain approaches that employ linear filters over a sizable reference set to reduce the uncertainty due to the random noise term. Instead, it jointly exploits this precise statistical property, which appears in the transform domain, and the noise probability model in an estimation-theoretic framework that works on a compact support region. Experimental results provide evidence for substantial denoising performance improvement.
Template matching prediction is an established approach to intra-frame coding that makes use of previously coded pixels in the same frame for reference. It compares the previously reconstructed upper and left boundaries to search the reference area for the best-matched block for prediction, and hence eliminates the need to send additional information to reproduce the same prediction at the decoder. Viewing the image signal as an auto-regressive model, this work is premised on the fact that pixels closer to the known block boundary are better predicted than those farther away. It significantly extends the scope of the template matching approach, which is typically followed by a conventional discrete cosine transform (DCT) for the prediction residuals, by employing an asymmetric discrete sine transform (ADST), whose basis functions vanish at the prediction boundary and reach maximum magnitude at the far end, to fully exploit the statistics of the residual signals. It was experimentally shown that the proposed scheme provides substantial coding performance gains on top of the conventional template matching method over the baseline.
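The boundary behavior of the ADST basis can be checked numerically with a short sketch. It uses the commonly cited sine kernel (equivalent to the DST-VII) with entries proportional to sin((2j - 1) i π / (2N + 1)); the helper names and the ramp-shaped test residual are illustrative assumptions rather than material from the paper.

```python
import math


def adst_basis(n):
    """Return the n x n ADST matrix; row j is a basis function over samples i = 1..n."""
    scale = 2.0 / math.sqrt(2 * n + 1)
    return [
        [scale * math.sin((2 * j - 1) * i * math.pi / (2 * n + 1))
         for i in range(1, n + 1)]
        for j in range(1, n + 1)
    ]


def forward_transform(basis, residual):
    """Project a 1-D residual onto each basis function."""
    return [sum(b * r for b, r in zip(row, residual)) for row in basis]


basis = adst_basis(4)
# The first basis function would be zero at i = 0 (the known boundary) and
# grows toward the far end of the block.
print([round(v, 3) for v in basis[0]])

# A residual that grows with distance from the boundary (assumed example).
coefficients = forward_transform(basis, [1.0, 2.0, 3.0, 4.0])
print([round(c, 3) for c in coefficients])
```

For a residual whose magnitude grows away from the predicted boundary, most of the energy lands in the first coefficient, which is the statistical match the abstract refers to.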
The latest open-source video codec VP9 - An overview and preliminary results
Adrian Grange
John Koleszar
Paul Wilkins
Ronald S Bultje
Picture Coding Symposium (2013)
The hybrid transform coding scheme that alternates between the asymmetric discrete sine transform (ADST) and the discrete cosine transform (DCT) depending on the boundary prediction conditions is an efficient tool for video and image compression. It optimally exploits the statistical characteristics of the prediction residual, thereby achieving significant coding performance gains over the conventional DCT-based approach. A practical concern lies in the intrinsic conflict between the transform kernels of the ADST and the DCT, which prevents a butterfly structured implementation for parallel computing. Hence the hybrid transform coding scheme has to rely on matrix multiplication, which presents a speed-up barrier due to under-utilization of the hardware, especially for larger block sizes. In this work, we devise a novel ADST-like transform whose kernel is consistent with that of the DCT, thereby enabling a butterfly structured computation flow, while largely retaining the performance advantages of the hybrid transform coding scheme in terms of compression efficiency. A prototype implementation of the proposed butterfly structured hybrid transform coding scheme is available in the VP9 codec repository.