David Marwood

Authored Publications
    Adding Non-linear Context to Deep Networks
    Michele Covell
    IEEE International Conference on Image Processing (2022)
    Not All Network Weights Need to Be Free
    Michele Covell
    21st IEEE International Conference on Machine Learning and Applications, ICMLA 2022, Bahamas, December 12-14, 2022
    As state of the art network models routinely grow to architectures with billions and even trillions of learnable parameters, the need to efficiently store and retrieve these models into working memory becomes a more pronounced bottleneck. This is felt most severely in efforts to port models to personal devices, such as consumer cell phones, which now commonly include GPU and TPU processors designed to handle the enormous computational burdens associated with deep networks. In this paper, we present novel techniques for dramatically reducing the number of free parameters in deep network models with the explicit goals of (1) model compression with little or no model decompression overhead at inference time and (2) reducing the number of free parameters in arbitrary models without requiring any modifications to the architecture. We examine four techniques that build on each other, and provide insight into when and how each technique operates. Accuracy as a function of free parameters is measured on two very different deep networks: ResNet and Vision Transformer. On the latter, we find that we can reduce the number of parameters by 20% with no loss in accuracy.
    Visualizing Semantic Walks
    NeurIPS-2022 Workshop on Machine Learning for Creativity and Design, https://neuripscreativityworkshop.github.io/2022/
    An embedding space trained from both a large language model and vision model contains semantic aspects of both and provides connections between words, images, concepts, and styles. This paper visualizes characteristics and relationships in this semantic space. We traverse multi-step paths in a derived semantic graph to reveal hidden connections created from the immense amount of data used to create these models. We specifically examine these relationships in the domain of painters, their styles, and their subjects. Additionally, we present a novel, non-linear sampling technique to create informative visualization of semantic graph transitions.
    Contextual Convolution Blocks
    Proceedings of the British Machine Vision Conference 2021 (2021)
    A fundamental processing layer of modern deep neural networks is the 2D convolution. It applies a filter uniformly across the input, effectively creating feature detectors that are translation invariant. In contrast, fully-connected layers are spatially selective, allowing unique detectors across the input. However, full connectivity comes at the expense of an enormous number of free parameters to be trained, the associated difficulty in learning without over-fitting, and the loss of spatial coherence. We introduce Contextual Convolution Blocks, a novel method to create spatially selective feature detectors that are locally translation invariant. This increases the expressive power of the network beyond standard convolutional layers and allows learning unique filters for distinct regions of the input. The filters no longer need to be discriminative in regions not likely to contain the target features. This is a generalization of the Squeeze-and-Excitation architecture that introduces minimal extra parameters. We provide experimental results on three datasets and a thorough exploration into how the increased expressiveness is instantiated.
    Interpretable Actions: Controlling Experts with Understandable Commands
    Michele Covell
    The Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21), AAAI (2021)
    Despite the prevalence of deep neural networks, their single most cited drawback is that, even when successful, their operations are inscrutable. For many applications, the desired outputs are the composition of externally-defined bases. For such decomposable domains, we present a two-stage learning procedure producing combinations of the external bases which are trivially extractable from the network. In the first stage, the set of external bases that will form the solution are modeled as differentiable generator modules, controlled by the same parameters as the external bases. In the second stage, a controller network is created that selects parameters for those generators, either successively or in parallel, to compose the final solution. Through three tasks, we concretely demonstrate how our system yields readily understandable commands. In one, we introduce a new form of artistic style transfer, learning to draw and color with crayons, in which the transformation of a photograph or painting occurs not as a single monolithic computation, but by the composition of thousands of individual, visualizable strokes. The other two tasks, single-pass function approximation with arbitrary bases and shape-based synthesis, show how our approach produces understandable and extractable actions in two disparate domains.
    A rapidly increasing portion of Internet traffic is dominated by requests from mobile devices with limited- and metered-bandwidth constraints. To satisfy these requests, it has become standard practice for websites to transmit small and extremely compressed image previews as part of the initial page-load process. Recent work, based on an adaptive triangulation of the target image, has shown the ability to generate thumbnails of full images at extreme compression rates: 200 bytes or less with impressive gains (in terms of PSNR and SSIM) over both JPEG and WebP standards. However, qualitative assessments and preservation of semantic content can be less favorable. We present a novel method to significantly improve the reconstruction quality of the original image with no changes to the encoded information. Our neural-based decoding not only achieves higher PSNR and SSIM scores than the original methods, but also yields a substantial increase in semantic-level content preservation. In addition, by keeping the same encoding stream, our solution is completely inter-operable with the original decoder. The end result is suitable for a range of small-device deployments, as it involves only a single forward-pass through a small, scalable network.
    Representing Images in 200 Bytes: Compression via Triangulation
    Pascal Massimino
    Michele Covell
    Proceedings of 2018 International Conference on Image Processing, IEEE
    A rapidly increasing portion of internet traffic is dominated by requests from mobile devices with limited and metered bandwidth constraints. To satisfy these requests, it has become standard practice for websites to transmit small and extremely compressed image previews as part of the initial page load process to improve responsiveness. Increasing thumbnail compression beyond the capabilities of existing codecs is therefore an active research direction. In this work, we concentrate on extreme compression rates, where the size of the image is typically 200 bytes or less. First, we propose a novel approach for image compression that, unlike commonly used methods, does not rely on block-based statistics. We use an approach based on an adaptive triangulation of the target image, devoting more triangles to high entropy regions of the image. Second, we present a novel algorithm for encoding the triangles. The results show favorable statistics, in terms of PSNR and SSIM, over both the JPEG and the WebP standards.