Troy T. Chinen

Authored Publications
    An Unsupervised Information-Theoretic Perceptual Quality Metric
    Sangnie Bhardwaj
    Ian Fischer
    Advances in Neural Information Processing Systems 33 (2020)
    Abstract: Tractable models of human perception have proved challenging to build. Hand-designed models such as MS-SSIM remain popular predictors of human image quality judgements due to their simplicity and speed. Recent deep learning approaches can perform better, but they rely on supervised data that can be costly to gather: large sets of class labels such as ImageNet, image quality ratings, or both. We combine recent advances in information-theoretic objective functions with a computational architecture informed by the physiology of the human visual system and unsupervised training on pairs of video frames, yielding our Perceptual Information Metric (PIM). We show that PIM is competitive with supervised metrics on the recent and challenging BAPPS image quality assessment dataset and outperforms them in predicting the ranking of image compression methods in CLIC 2020. We also perform qualitative experiments using the ImageNet-C dataset, and establish that PIM is robust with respect to architectural details.
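
    The paper derives PIM from a specific information-theoretic objective; purely as a loose illustration of the general idea of unsupervised training on pairs of video frames, the sketch below uses a simpler InfoNCE-style contrastive bound (a stand-in, not the paper's actual objective), with random arrays in place of the model's frame embeddings.

        import numpy as np

        def infonce_loss(z_a, z_b, temperature=0.1):
            """Toy InfoNCE bound: rows of z_a and z_b are embeddings of
            co-occurring frame pairs (positives); every other row pairing
            serves as a negative."""
            z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
            z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
            logits = z_a @ z_b.T / temperature           # (n, n) similarities
            logits -= logits.max(axis=1, keepdims=True)  # numerical stability
            log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
            return -np.mean(np.diag(log_softmax))        # pull positives together

        rng = np.random.default_rng(0)
        z_t  = rng.normal(size=(8, 16))                  # stand-in embeddings at time t
        z_t1 = z_t + 0.1 * rng.normal(size=(8, 16))      # slightly perturbed next frames
        print(infonce_loss(z_t, z_t1))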
    Abstract: We propose a method for lossy image compression based on recurrent, convolutional neural networks that outperforms BPG (4:2:0), WebP, JPEG 2000, and JPEG as measured by MS-SSIM. We introduce three improvements over previous research that lead to this state-of-the-art result using a single model. First, we show that training with a pixel-wise loss weighted by SSIM increases reconstruction quality according to several metrics. Second, we modify the recurrent architecture to improve spatial diffusion, which allows the network to more effectively capture and propagate image information through its hidden state. Finally, in addition to lossless entropy coding, we use a spatially adaptive bit allocation algorithm to more efficiently use the limited number of bits to encode visually complex image regions. We evaluate our method on the Kodak and Tecnick image sets and compare against standard codecs as well as recently published methods based on deep neural networks.
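
    The first improvement, a pixel-wise loss weighted by SSIM, can be sketched as follows. The exact weighting scheme in the paper may differ; local_ssim_map and ssim_weighted_l1 are illustrative names, assuming a simple box-window SSIM map and a (1 - SSIM) weight.

        import numpy as np
        from scipy.ndimage import uniform_filter

        def local_ssim_map(x, y, win=7, c1=0.01**2, c2=0.03**2):
            """Per-pixel SSIM between images x, y in [0, 1], box window."""
            mu_x, mu_y = uniform_filter(x, win), uniform_filter(y, win)
            var_x = uniform_filter(x * x, win) - mu_x**2
            var_y = uniform_filter(y * y, win) - mu_y**2
            cov   = uniform_filter(x * y, win) - mu_x * mu_y
            return ((2*mu_x*mu_y + c1) * (2*cov + c2)) / (
                   (mu_x**2 + mu_y**2 + c1) * (var_x + var_y + c2))

        def ssim_weighted_l1(x, y):
            """Pixel-wise L1 loss, up-weighted where local SSIM is poor."""
            w = 1.0 - local_ssim_map(x, y)   # large where structure differs
            return np.mean(w * np.abs(x - y))

        rng = np.random.default_rng(0)
        x = rng.random((64, 64))
        y = np.clip(x + 0.05 * rng.normal(size=x.shape), 0.0, 1.0)
        print(ssim_weighted_l1(x, y))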
    Towards a Semantic Perceptual Image Metric
    Chunhui Gu
    Sung Jin Hwang
    Sergey Ioffe
    Sean O'Malley
    Charles Rosenberg
    2018 25th IEEE International Conference on Image Processing (ICIP)
    Abstract: We present a full-reference perceptual image metric based on VGG-16, an artificial neural network trained on object classification. We fit the metric to a new database of 140k unique images annotated with ground truth by human raters who received minimal instruction. The resulting metric shows competitive performance on TID2013, a database widely used to assess image quality assessment methods. More interestingly, it shows strong responses to objects potentially carrying semantic relevance, such as faces and text, which we demonstrate using a visualization technique and ablation experiments. In effect, the metric appears to model a higher influence of semantic context on judgements, which we observe particularly in untrained raters. As the vast majority of users of image processing systems are unfamiliar with Image Quality Assessment (IQA) tasks, these findings may have significant impact on real-world applications of perceptual metrics.
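
    The paper's contribution is fitting such a metric to human ratings on top of VGG-16 activations. As a generic sketch of the underlying idea of a full-reference deep-feature distance (not the paper's fitted metric), one might compute a weighted distance between channel-normalized feature maps; the feature arrays and weights below are stand-ins.

        import numpy as np

        def deep_feature_distance(feats_ref, feats_dist, layer_weights):
            """Generic deep perceptual distance: channel-normalize each
            layer's (H, W, C) feature map, take squared differences,
            average spatially, and combine layers with given weights."""
            d = 0.0
            for f_r, f_d, w in zip(feats_ref, feats_dist, layer_weights):
                f_r = f_r / (np.linalg.norm(f_r, axis=-1, keepdims=True) + 1e-8)
                f_d = f_d / (np.linalg.norm(f_d, axis=-1, keepdims=True) + 1e-8)
                d += w * np.mean(np.sum((f_r - f_d) ** 2, axis=-1))
            return d

        rng = np.random.default_rng(0)
        feats_a = [rng.random((32, 32, 64)), rng.random((16, 16, 128))]
        feats_b = [f + 0.01 * rng.normal(size=f.shape) for f in feats_a]
        print(deep_feature_distance(feats_a, feats_b, layer_weights=[1.0, 1.0]))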
    Spatially adaptive image compression using a tiled deep network
    Michele Covell
    Joel Shor
    Sung Jin Hwang
    Damien Vincent
    Proceedings of the International Conference on Image Processing (2017), pp. 2796-2800
    Abstract: Deep neural networks represent a powerful class of function approximators that can learn to compress and reconstruct images. Existing image compression algorithms based on neural networks learn quantized representations with a constant spatial bit rate across each image. While entropy coding introduces some spatial variation, traditional codecs have benefited significantly by explicitly adapting the bit rate based on local image complexity and visual saliency. This paper introduces an algorithm that combines deep neural networks with quality-sensitive bit rate adaptation using a tiled network. We demonstrate the importance of spatial context prediction and show improved quantitative (PSNR) and qualitative (subjective rater assessment) results compared to a non-adaptive baseline and a recently published image compression model based on fully-convolutional neural networks.
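
    A minimal sketch of quality-sensitive spatial bit allocation, assuming pixel variance as a crude stand-in for the learned complexity and saliency signals the paper's tiled network would provide; all names and parameters are illustrative.

        import numpy as np

        def allocate_tile_bits(image, tile=32, total_bits=200_000, floor_bits=256):
            """Toy allocation: each tile gets a share of the bit budget
            proportional to its pixel variance, with a minimum floor so
            flat tiles still receive some bits."""
            h, w = image.shape[0] // tile, image.shape[1] // tile
            var = np.empty((h, w))
            for i in range(h):
                for j in range(w):
                    var[i, j] = image[i*tile:(i+1)*tile, j*tile:(j+1)*tile].var()
            budget = total_bits - floor_bits * var.size   # bits left after the floor
            bits = floor_bits + budget * var / var.sum()  # assumes a non-constant image
            return bits.astype(int)

        img = np.random.default_rng(0).random((256, 256))
        print(allocate_tile_bits(img).sum())  # approximately the total budget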
    Abstract: Personal photo albums are heavily biased towards faces of people, but most state-of-the-art algorithms for image denoising and noise estimation do not exploit facial information. We propose a novel technique for jointly estimating noise levels of all face images in a photo collection. Photos in a personal album are likely to contain several faces of the same people. While some of these photos would be clean and high quality, others may be corrupted by noise. Our key idea is to estimate noise levels by comparing multiple images of the same content that differ predominantly in their noise content. Specifically, we compare geometrically and photometrically aligned face images of the same person. Our estimation algorithm is based on a probabilistic formulation that seeks to maximize the joint probability of estimated noise levels across all images. We propose an approximate solution that decomposes this joint maximization into a two-stage optimization. The first stage determines the relative noise between pairs of images by pooling estimates from corresponding patch pairs in a probabilistic fashion. The second stage then jointly optimizes for all absolute noise parameters by conditioning them upon relative noise levels, which allows for a pairwise factorization of the probability distribution. We evaluate our noise estimation method using quantitative experiments to measure accuracy on synthetic data. Additionally, we employ the estimated noise levels for automatic denoising using "BM3D", and evaluate the quality of denoising on real-world photos through a user study.
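
    The second stage can be illustrated with a simplified linear least-squares version: given relative (log-domain) noise estimates between image pairs, recover a per-image value for every photo, pinning one image to resolve the global offset. This is a stand-in for the paper's probabilistic joint maximization, and all names are hypothetical.

        import numpy as np

        def absolute_from_relative(n_images, pairs, rel, anchor=0, anchor_val=0.0):
            """Recover per-image values s from pairwise observations
            rel[k] ~ s[i] - s[j] for pairs[k] = (i, j), by linear least
            squares with one image fixed as the reference."""
            A = np.zeros((len(pairs) + 1, n_images))
            b = np.zeros(len(pairs) + 1)
            for k, (i, j) in enumerate(pairs):
                A[k, i], A[k, j], b[k] = 1.0, -1.0, rel[k]
            A[-1, anchor], b[-1] = 1.0, anchor_val   # remove the gauge freedom
            s, *_ = np.linalg.lstsq(A, b, rcond=None)
            return s

        pairs = [(0, 1), (1, 2), (0, 2)]
        print(absolute_from_relative(3, pairs, rel=[0.5, 0.2, 0.7]))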
    Visual Comparison Of JPEG 2000 Versus Conventional JPEG
    International Conference on Image Processing (2003), pp. 283-286
    Abstract: One difficulty in image compression research is designing meaningful performance metrics. Purely numerical measures such as PSNR are unsatisfactory because they do not correlate well with human assessment. We introduce a method of subjective image evaluation for image compression called calibrated rank ordering (CRO). CRO is attractive because it produces meaningful numerical results without placing an excessive burden on the observers. Using CRO we compare traditional JPEG with JPEG 2000 in a variety of modes. We also consider the effect of differing image sources, i.e., digital still camera vs. film scan. Finally, we compare and contrast the artifacts of both JPEG and JPEG 2000.
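
    The paper defines CRO precisely; purely as an assumed illustration of how a calibrated rank ordering could yield numerical scores, the sketch below interpolates test images' rank positions between anchor images of known quality. The interpolation scheme and names are hypothetical, not taken from the paper.

        import numpy as np

        def calibrated_scores(ranking, anchor_quality):
            """Toy calibrated rank ordering: observers rank test items
            together with anchors of known quality; each test item is
            scored by interpolating between the anchors around it.
            ranking lists item ids best-to-worst; anchor_quality maps
            anchor id -> known score."""
            pos = {item: p for p, item in enumerate(ranking)}
            a_pos = sorted((pos[a], q) for a, q in anchor_quality.items())
            xs = [p for p, _ in a_pos]
            ys = [q for _, q in a_pos]
            return {item: float(np.interp(pos[item], xs, ys))
                    for item in ranking if item not in anchor_quality}

        # anchors: 'A' (quality 9) ranked first, 'B' (quality 3) ranked last
        print(calibrated_scores(['A', 'x', 'y', 'B'], {'A': 9.0, 'B': 3.0}))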
    An Overview of Quantization in JPEG-2000
    Michael Marcellin
    Margaret Lepley
    Tom Flohr
    James Kasner
    Signal Processing: Image Communication, 17 (2001), pp. 73-84
    Abstract: Quantization is instrumental in enabling the rich feature set of JPEG 2000. Several quantization options are provided within JPEG 2000. Part I of the standard includes only uniform scalar dead-zone quantization, while Part II allows both generalized uniform scalar dead-zone quantization and trellis coded quantization (TCQ). In this paper, an overview of these quantization methods is provided, along with a discussion of the issues that arise when each method is employed.
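
    A minimal sketch of the Part I uniform scalar dead-zone quantizer, whose zero bin is twice as wide as the others; the reconstruction offset r below is a common midpoint-style choice, and Part II's generalized variant additionally parameterizes the dead-zone width.

        import numpy as np

        def deadzone_quantize(x, delta):
            """Uniform scalar dead-zone quantizer: q = sign(x) * floor(|x| / delta),
            so inputs in (-delta, delta) map to the (double-width) zero bin."""
            return np.sign(x) * np.floor(np.abs(x) / delta)

        def deadzone_dequantize(q, delta, r=0.5):
            """Reconstruct at |q| + r bin widths from zero; zero stays zero."""
            return np.sign(q) * (np.abs(q) + r) * delta * (q != 0)

        x = np.array([-3.7, -0.4, 0.2, 1.1, 2.9])
        q = deadzone_quantize(x, delta=1.0)
        print(q, deadzone_dequantize(q, delta=1.0))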
    TCQ in JPEG-2000
    Thomas J. Flohr
    Michael W. Marcellin
    Proc. SPIE (2000), pp. 552-560
    Abstract: A baseline mode of trellis coded quantization (TCQ) is described as used in JPEG 2000, along with the results of visual evaluations demonstrating the effectiveness of TCQ over scalar quantization (SQ). Furthermore, a reduced-complexity TCQ mode is developed and described in detail. Numerical and visual evaluations indicate that its compression performance is nearly identical to that of baseline TCQ, but with a greatly reduced memory footprint and improved support for progressive image decoding.
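
    The core of TCQ is a Viterbi search over a small trellis whose branches are labeled with scalar sub-codebooks. The toy sketch below uses an assumed 4-state shift-register trellis and subset labeling; the actual tables in JPEG 2000 differ in detail, so treat this as an illustration of the search, not the standard.

        import numpy as np

        # Assumed tables: branch (s, b) moves to state ((s << 1) | b) & 3 and
        # must use sub-codebook LABEL[s][b], where subset j holds the points
        # (4k + j) * delta for integer k.
        LABEL = [(0, 2), (1, 3), (2, 0), (3, 1)]

        def nearest_in_subset(v, j, delta):
            k = np.round((v / delta - j) / 4.0)
            return (4.0 * k + j) * delta

        def tcq_quantize(x, delta):
            n = len(x)
            cost = np.array([0.0, np.inf, np.inf, np.inf])  # start in state 0
            prev = np.zeros((n, 4), dtype=int)   # best predecessor per state
            recon = np.zeros((n, 4))             # codeword chosen entering state
            for t, v in enumerate(x):
                new_cost = np.full(4, np.inf)
                for s in range(4):
                    if not np.isfinite(cost[s]):
                        continue
                    for b in range(2):
                        ns = ((s << 1) | b) & 3
                        c = nearest_in_subset(v, LABEL[s][b], delta)
                        d = cost[s] + (v - c) ** 2
                        if d < new_cost[ns]:
                            new_cost[ns], recon[t, ns], prev[t, ns] = d, c, s
                cost = new_cost
            s = int(np.argmin(cost))             # backtrack the best path
            out = np.zeros(n)
            for t in range(n - 1, -1, -1):
                out[t] = recon[t, s]
                s = prev[t, s]
            return out

        x = np.random.default_rng(2).normal(scale=4.0, size=10)
        xq = tcq_quantize(x, delta=1.0)
        print(np.mean((x - xq) ** 2))            # distortion of the chosen path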
    Extension of Spatial Sharpening Techniques to Hyperspectral Data
    Timothy J. Patterson
    Lester Gong
    Robert Haxton
    Proc. SPIE (1998), pp. 114-122
    Abstract: This paper describes our approach to, and presents measured results of, the extension of multispectral sharpening techniques to hyperspectral imagery. Our approach produces high spatial resolution spectral imagery using a least squares estimator based on the underlying physics of the spectral imaging process. The intent of the process is to produce high spatial resolution with the best possible spectral fidelity. Results on multiple test cases demonstrate sharpened imagery within 5% of the true high resolution hyperspectral values.
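
    The least-squares structure of such a sharpening estimator can be sketched as follows, assuming co-registered low-resolution multispectral and hyperspectral training pixels; the paper's estimator is additionally grounded in the physics of the imaging process, which this toy regression omits, and the function name and shapes are illustrative.

        import numpy as np

        def sharpen_hyperspectral(ms_lo, hs_lo, ms_hi):
            """Fit, per hyperspectral band, an affine least-squares model
            from the low-resolution multispectral bands, then apply it at
            high resolution. Shapes: ms_lo (n_lo, B_ms), hs_lo (n_lo, B_hs),
            ms_hi (n_hi, B_ms); returns (n_hi, B_hs)."""
            A = np.hstack([ms_lo, np.ones((ms_lo.shape[0], 1))])
            W, *_ = np.linalg.lstsq(A, hs_lo, rcond=None)   # (B_ms + 1, B_hs)
            return np.hstack([ms_hi, np.ones((ms_hi.shape[0], 1))]) @ W

        rng = np.random.default_rng(3)
        ms_lo, hs_lo = rng.random((500, 4)), rng.random((500, 30))
        ms_hi = rng.random((2000, 4))
        print(sharpen_hyperspectral(ms_lo, hs_lo, ms_hi).shape)  # (2000, 30)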
    A performance analysis of fast Gabor transform methods
    Todd R. Reed
    Graphical Models and Image Processing, 59 (1997), pp. 117-127
    Abstract: Computation of the finite discrete Gabor transform can be accomplished in a variety of ways. Three representative methods (matrix inversion, Zak transform, and relaxation network) were evaluated in terms of execution speed, accuracy, and stability. The relaxation network was the slowest method tested. Its strength lies in the fact that it makes no explicit assumptions about the basis functions; in practice it was found that convergence did depend on basis choice. The matrix method requires a separable Gabor basis (i.e., one that can be generated by taking a Cartesian product of one-dimensional functions), but is faster than the relaxation network by several orders of magnitude. It proved to be a stable and highly accurate algorithm. The Zak–Gabor algorithm requires that all of the Gabor basis functions have exactly the same envelope and gives no freedom in choosing the modulating function. Its execution, however, is very stable, accurate, and by far the most rapid of the three methods tested.
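
    The matrix-inversion method can be illustrated in one dimension: build a synthesis matrix whose columns are shifted, modulated Gaussians and recover coefficients with a pseudoinverse. This is a least-squares sketch with assumed lattice parameters, not the paper's exact configuration.

        import numpy as np

        def gabor_synthesis_matrix(n, n_shift, n_freq, sigma):
            """Columns are time-shifted, frequency-modulated Gaussians."""
            t = np.arange(n)
            cols = []
            for m in range(n_shift):                       # time shifts
                g = np.exp(-0.5 * ((t - m * n / n_shift) / sigma) ** 2)
                for k in range(n_freq):                    # modulations
                    cols.append(g * np.exp(2j * np.pi * k * t / n_freq))
            return np.stack(cols, axis=1)                  # (n, n_shift * n_freq)

        rng = np.random.default_rng(1)
        n = 64
        G = gabor_synthesis_matrix(n, n_shift=8, n_freq=8, sigma=4.0)
        x = rng.normal(size=n)
        coeffs = np.linalg.pinv(G) @ x                     # analysis by least squares
        print(np.linalg.norm(G @ coeffs - x))              # near 0 if the 64 atoms
                                                           # are linearly independent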