James Bankoski

Jim Bankoski is an Engineering Director working on Google's WebM project. He is the former CTO of On2 Technologies and a technical contributor to all of On2's, and later Google's, video codecs from Tm2x through VP9, including video codecs widely used in Flash, Skype, and now WebM.
Authored Publications
    An overview of core coding tools in the AV1 video codec
    Adrian Grange
    Andrey Norkin
    Ching-Han Chiang
    Hui Su
    Jean-Marc Valin
    Luc Trudeau
    Nathan Egge
    Paul Wilkins
    Peter de Rivaz
    Sarah Parker
    Steinar Midtskogen
    Thomas Davies
    Zoe Liu
    The Picture Coding Symposium (PCS) (2018)
    AV1 is an emerging open-source and royalty-free video compression format, which was jointly developed and finalized in early 2018 by the Alliance for Open Media (AOMedia) industry consortium. The main goal of AV1 development is to achieve substantial compression gain over state-of-the-art codecs while maintaining practical decoding complexity and hardware feasibility. This paper provides a brief technical overview of key coding techniques in AV1, along with a preliminary comparison of compression performance against VP9 and HEVC.
    The internet needs a competitive, royalty-free video codec
    Adrian Grange
    Matthew Frost
    Cambridge University Press, Google (2018)
    In this paper, we present the argument in favor of an open-source, royalty-free video codec that will keep pace with the evolution of video traffic. Additionally, we argue that the availability of a state-of-the-art, royalty-free codec levels the playing field, allowing small content owners and application developers to compete with the larger companies that operate in this space.
    Novel inter and intra prediction tools under consideration for the emerging AV1 video codec
    Sarah Parker
    Hui Su
    Angie Chiang
    Zoe Liu
    Chen Wang
    Emil Keyder
    SPIE Optical Engineering + Applications, 10396 (2017), 10396 - 10396 - 13
    Google started the WebM Project in 2010 to develop open-source, royalty-free video codecs designed specifically for media on the Web. The second-generation codec released by the WebM project, VP9, is currently served by YouTube and enjoys billions of views per day. Realizing the need for even greater compression efficiency to cope with the growing demand for video on the web, the WebM team embarked on an ambitious project to develop a next edition codec, AV1, in a consortium of major tech companies called the Alliance for Open Media, that achieves at least a generational improvement in coding efficiency over VP9. In this paper, we focus primarily on new tools in AV1 that improve the prediction of pixel blocks before transforms, quantization and entropy coding are invoked. Specifically, we describe tools and coding modes that improve intra, inter and combined inter-intra prediction. Results are presented on standard test sets.
    Screen content videos, which typically contain computer-generated text and graphics, are in increasing demand in today's online video services. They exhibit many characteristics not commonly seen in natural videos, including sharp edge transitions and repetitive patterns, which make their statistical characteristics distinct from those of natural videos. This calls into question the efficacy of the conventional discrete cosine transform (DCT), which builds on a Gauss-Markov model assumption that leads to a base-band signal, for coding computer-generated graphics. This work exploits a class of staircase transforms. Unlike the DCT, whose bases are samplings of sinusoidal functions, the staircase transforms have their bases sampled from staircase functions, which naturally better approximate the sharp transitions often encountered in screen content. As an alternative transform kernel, the staircase transform is integrated into a hybrid transform coding scheme in conjunction with the DCT. Experiments show that the proposed approach provides an average compression gain of 2.9% in terms of BD-rate reduction. A perceptual comparison further demonstrates that the staircase transform substantially reduces ringing artifacts due to the Gibbs phenomenon.
    Video codecs exploit the temporal redundancy of the video signal, in the form of motion-compensated prediction, to achieve superior compression performance. The coding of motion vectors takes a large portion of the total rate cost. Prior research utilizes the spatial and temporal correlations of the motion field to improve the coding efficiency of the motion information. It typically constructs a candidate pool composed of a fixed number of reference motion vectors and allows the codec to select and reuse the one that best approximates the motion activity of the current block. This largely disconnects the entropy coding process from the true boundary conditions, since these are masked by the fixed-length candidate list, and hence could potentially cause sub-optimal coding performance. An alternative motion vector referencing scheme is proposed in this work to fully accommodate the dynamic nature of the boundary conditions for compression efficiency. It adaptively extends or shortens the candidate list according to the actual number of available reference motion vectors. The associated probability model accounts for the likelihood that an individual motion vector candidate is used. A complementary motion vector candidate ranking system is also presented. Experiments show that the proposed scheme achieves considerable compression performance gains across all the test sets.
    Technical Overview of VP8, an open source video codec for the web
    Paul Wilkins
    2011 International Workshop on Acoustics and Video Coding and Communication, IEEE, Barcelona, Spain (to appear)