Imtiaz Humayun

Research scientist interested in understanding how data and objectives shape training, memorization, and learning dynamics in neural network-based models and agents.
Authored Publications
    Abstract: Deep Generative Models are frequently used to learn continuous representations of complex data distributions by training on a finite number of samples. For any generative model, including pre-trained foundation models with Diffusion or Transformer architectures, generation performance can vary significantly across the learned data manifold. In this paper, we study the local geometry of the learned manifold and its relationship to generation outcomes for a wide range of generative models, including DDPM, Diffusion Transformer (DiT), and Stable Diffusion 1.4. Building on the theory of continuous piecewise-linear (CPWL) generators, we characterize the local geometry in terms of three geometric descriptors: scaling (ψ), rank (ν), and complexity/un-smoothness (δ). We provide quantitative and qualitative evidence showing that, for a given latent vector, the local descriptors are indicative of post-generation aesthetics, generation diversity, and memorization by the generative model. Finally, we demonstrate that by training a reward model on the local scaling for Stable Diffusion, we can self-improve both generation aesthetics and diversity using geometry-sensitive guidance during denoising.
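    A minimal sketch of how descriptors of this kind can be probed from a trained generator, assuming the generator is differentiable and locally well approximated by its Jacobian at a latent vector: the `local_geometry` helper, the stand-in `decoder`, and the rank tolerance below are illustrative choices, not the paper's exact estimators.
```python
import torch

def local_geometry(decoder, z, rank_tol=1e-3):
    """Jacobian-based descriptors of a generator at latent vector z.

    Returns a local scaling score (sum of log singular values, a proxy for psi)
    and a numerical rank (a proxy for nu). `decoder` maps latents to outputs;
    both names are illustrative stand-ins, not the paper's exact estimators.
    """
    z = z.detach().requires_grad_(True)
    # Jacobian of the flattened decoder output w.r.t. the latent vector.
    jac = torch.autograd.functional.jacobian(lambda v: decoder(v).flatten(), z)
    jac = jac.reshape(-1, z.numel())
    # Singular values summarize how the generator locally stretches latent space.
    svals = torch.linalg.svdvals(jac)
    scaling = torch.log(svals.clamp_min(1e-12)).sum()    # proxy for psi
    rank = int((svals > rank_tol * svals.max()).sum())   # proxy for nu
    return scaling.item(), rank

# Toy usage with a stand-in decoder (a small MLP).
decoder = torch.nn.Sequential(
    torch.nn.Linear(8, 64), torch.nn.ReLU(), torch.nn.Linear(64, 32)
)
psi, nu = local_geometry(decoder, torch.randn(8))
print(f"local scaling ~ {psi:.2f}, local rank ~ {nu}")
```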
    Mitigating Over-Exploration in Latent Space Optimization using LES
    Omer Ronen
    Richard G. Baraniuk
    Randall Balestriero
    Bin Yu
    ICML 2025
    Abstract: We develop the Latent Exploration Score (LES) to mitigate over-exploration in Latent Space Optimization (LSO), a popular method for solving black-box discrete optimization problems. LSO performs continuous optimization within the latent space of a Variational Autoencoder (VAE) and is known to be susceptible to over-exploration, which manifests in unrealistic solutions that reduce its practicality. LES leverages the trained decoder's approximation of the data distribution and can be employed with any VAE decoder, including pretrained ones, without additional training, architectural changes, or access to the training data. Our evaluation across five LSO benchmark tasks and twenty-two VAE models demonstrates that LES always enhances the quality of the solutions while maintaining high objective values, leading to improvements over existing solutions in most cases. We believe that LES's ability to identify out-of-distribution areas, together with its differentiability and computational tractability, will open new avenues for LSO.
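    To illustrate where a decoder-derived score slots into LSO, here is a minimal sketch of gradient-based latent optimization with a plug-in penalty. The squared-norm prior penalty is only a hypothetical stand-in for LES (whose actual definition is given in the paper); the point is that any differentiable score computed from the trained decoder can be subtracted at the same place without retraining the VAE.
```python
import torch

def latent_space_optimize(decoder, objective, z_init, steps=200, lr=0.05, lam=0.1):
    """Gradient-based latent space optimization with a plug-in exploration penalty.

    The squared-norm prior penalty below is only an illustrative stand-in for a
    decoder-derived score such as LES; any differentiable score can be subtracted
    in the same place without retraining the VAE.
    """
    z = z_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        x = decoder(z)                    # decode the candidate solution
        score = objective(x)              # black-box surrogate to maximize
        penalty = lam * z.pow(2).sum()    # stand-in for an exploration score
        loss = -(score - penalty)         # maximize score, discourage drift
        loss.backward()
        opt.step()
    return z.detach(), decoder(z).detach()

# Toy usage with stand-in modules.
decoder = torch.nn.Sequential(
    torch.nn.Linear(16, 64), torch.nn.Tanh(), torch.nn.Linear(64, 32)
)
objective = lambda x: -(x - 1.0).pow(2).mean()  # pretend surrogate objective
z_best, x_best = latent_space_optimize(decoder, objective, torch.randn(16))
```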
    Grokking and the Geometry of Circuit Formation
    Randall Balestriero
    Richard G. Baraniuk
    ICML 2024 Workshop on Mechanistic Interpretability
    Abstract: Grokking, or delayed generalization, is a phenomenon where generalization in a deep neural network (DNN) emerges after achieving near-zero training error. Previous studies have reported the occurrence of grokking in specific controlled settings, such as DNNs initialized with large-norm parameters or transformers trained on algorithmic datasets. Recent studies have shown that grokking occurs for adversarial examples as well, in the form of delayed robustness. We connect the emergence of grokking to the geometric arrangement of circuits in the input space, along with their size and proximity to the training data. We also demonstrate that grokking manifests in Large Language Models on next-character prediction tasks. We provide evidence that the arrangement of circuits in a DNN undergoes a phase transition during training, migrating away from the training samples and thereby increasing both robustness and generalization.
    Deep Networks Always Grok and Here is Why
    Randall Balestriero
    Richard G. Baraniuk
    ICML 2024
    Abstract: Grokking, or delayed generalization, is a phenomenon where generalization in a deep neural network (DNN) occurs long after achieving near-zero training error. Previous studies have reported the occurrence of grokking in specific controlled settings, such as DNNs initialized with large-norm parameters or transformers trained on algorithmic datasets. We demonstrate that grokking is actually much more widespread and materializes in a wide range of practical settings, such as training a convolutional neural network (CNN) on CIFAR-10 or a ResNet on Imagenette. We introduce the new concept of delayed robustness, whereby a DNN groks adversarial examples and becomes robust long after interpolation and/or generalization. We develop an analytical explanation for the emergence of both delayed generalization and delayed robustness based on the local complexity of a DNN's input-output mapping. Our local complexity measures the density of so-called "linear regions" (also known as spline partition regions) that tile the DNN input space and serves as a useful progress measure for training. We provide the first evidence that, for classification problems, the linear regions undergo a phase transition during training, after which they migrate away from the training samples (making the DNN mapping smoother there) and towards the decision boundary (making the DNN mapping less smooth there). Grokking occurs after this phase transition, as a robust partition of the input space emerges thanks to the linearization of the DNN mapping around the training points. Web: bit.ly/grok-adversarial.
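    A minimal sketch of one way a local complexity measure of this kind can be approximated, assuming a plain ReLU MLP: perturb an input within a small neighborhood and count the distinct activation sign patterns (i.e., linear regions) the perturbed points fall into. The neighborhood radius, sample count, and `local_complexity` helper are illustrative choices rather than the paper's exact estimator.
```python
import torch

def local_complexity(layers, x, radius=0.1, n_samples=256):
    """Estimate the density of linear regions around x for a ReLU MLP.

    Counts the distinct activation sign patterns among Gaussian-perturbed copies
    of x; more patterns means more region boundaries nearby, i.e. a less smooth
    (more complex) local mapping. This is an illustrative proxy estimator.
    """
    pts = x + radius * torch.randn(n_samples, x.numel())
    patterns = set()
    with torch.no_grad():
        for p in pts:
            h, pattern = p, []
            for layer in layers[:-1]:               # hidden layers only
                pre = layer(h)                      # pre-activation
                pattern.append((pre > 0).tolist())  # ReLU on/off pattern
                h = torch.relu(pre)
            patterns.add(str(pattern))
    return len(patterns)

# Toy usage: a small ReLU MLP and a random input point.
layers = [torch.nn.Linear(10, 32), torch.nn.Linear(32, 32), torch.nn.Linear(32, 2)]
print("distinct local regions:", local_complexity(layers, torch.randn(10)))
```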
    Abstract: As the capabilities of Deep Generative Models improve, thorough evaluation of their generation performance and biases becomes crucial. Human evaluation remains the gold standard, but its cost makes assessing generative models, particularly text-to-image foundation models like Stable Diffusion, prohibitively expensive. In this paper, we explore the feasibility of using geometric descriptors of the data manifold for self-assessment, therefore requiring only the network architecture and its weights. We propose three theoretically inspired geometric descriptors, local scaling (ψ), local rank (ν), and local complexity (δ), that characterize the local properties of the learned data manifold for a given latent vector. Our proposed measures can be used to quantify (i) uncertainty, (ii) local dimensionality of the manifold, and (iii) smoothness of the learned generative model manifold. We demonstrate the relationship of our manifold descriptors with generation quality and diversity. Further, we present evidence of geometric bias between sub-populations under the generated distribution for Beta-VAE and Stable Diffusion. We believe that our proposed framework will allow future research into understanding how bias manifests through the learned data manifold of foundation models.
    Learning Transferable Features for Implicit Neural Representations
    Kushal Vyas
    Aniket Dashpute
    Richard G. Baraniuk
    Ashok Veeraraghavan
    Guha Balakrishnan
    NeurIPS 2024
    Abstract: Implicit neural representations (INRs) have demonstrated success in a variety of applications, including inverse problems and neural rendering. An INR is typically trained to capture one signal of interest, resulting in learned neural features that are highly attuned to that signal. Although such features are often assumed to be poorly generalizable, we explore their transferability for fitting similar signals. We introduce a new INR training framework, STRAINER, that learns transferable features for fitting INRs to new signals from a given distribution, faster and with better reconstruction quality. Owing to the sequential layer-wise affine operations in an INR, we propose to learn transferable representations by sharing initial encoder layers across multiple INRs, with independent decoder layers per signal. At test time, the learned encoder representations are transferred as the initialization for an otherwise randomly initialized INR. We find STRAINER to yield extremely powerful initializations for fitting images from the same domain, allowing a ≈ +10 dB gain in signal quality early on compared to an untrained INR. STRAINER also provides a simple way to encode data-driven priors in INRs. We evaluate STRAINER on multiple in-domain and out-of-domain signal fitting tasks and inverse problems, and provide a detailed analysis and discussion of the transferability of STRAINER's features. Our demo can be accessed here.
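    A minimal sketch of the shared-encoder idea, assuming plain ReLU coordinate MLPs rather than STRAINER's actual architecture and training details: several signals are fit with a shared encoder and independent decoder heads, and the learned encoder weights then initialize a fresh INR for a new signal. All module and function names are illustrative.
```python
import copy
import torch
import torch.nn as nn

class INR(nn.Module):
    """Coordinate MLP: maps 2D pixel coordinates to RGB values."""
    def __init__(self, hidden=128):
        super().__init__()
        self.encoder = nn.Sequential(                  # layers shared across signals
            nn.Linear(2, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.decoder = nn.Linear(hidden, 3)            # per-signal head
    def forward(self, coords):
        return self.decoder(self.encoder(coords))

def pretrain_shared_encoder(signals, coords, steps=500, lr=1e-3):
    """Fit several INRs that share one encoder but have independent decoders."""
    shared = INR()
    decoders = [nn.Linear(128, 3) for _ in signals]
    params = list(shared.encoder.parameters()) + [p for d in decoders for p in d.parameters()]
    opt = torch.optim.Adam(params, lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        feats = shared.encoder(coords)
        loss = sum(((dec(feats) - sig) ** 2).mean() for dec, sig in zip(decoders, signals))
        loss.backward()
        opt.step()
    return shared.encoder

def init_new_inr(pretrained_encoder):
    """Transfer the learned encoder weights as initialization for a fresh INR."""
    inr = INR()
    inr.encoder.load_state_dict(copy.deepcopy(pretrained_encoder.state_dict()))
    return inr

# Toy usage: 3 random "images" sampled at 1024 coordinates.
coords = torch.rand(1024, 2)
signals = [torch.rand(1024, 3) for _ in range(3)]
encoder = pretrain_shared_encoder(signals, coords, steps=50)
new_inr = init_new_inr(encoder)   # then fit new_inr to a new signal as usual
```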
    On The Local Geometry of Deep Generative Manifolds
    Ibtihel Amara
    Golnoosh Farnadi
    Mohammad Havaei
    ICML 2024 Workshop on Geometry-grounded Representation Learning and Generative Modeling
    Abstract: Is it possible to evaluate a pre-trained generative model, especially a large text-to-image generative model, without access to its original training data or human evaluators? We address this challenge by introducing a self-assessment framework based on the theory of continuous piecewise-affine spline generators. We investigate the use of three theoretically inspired geometric descriptors for neural networks, local scaling (ψ), local rank (ν), and local complexity (δ), to characterize the uncertainty, dimensionality, and smoothness of the learned manifold, using only the network weights and architecture. We demonstrate the relationship of these descriptors with generation quality, aesthetics, diversity, and bias, providing insights into how these aspects manifest for different sub-populations under the generated distribution. Moreover, we observe that the geometry of the data manifold is influenced by the training distribution, enabling us to perform out-of-distribution detection, model comparison, and reward modeling to control the output distribution. We believe our framework will help elucidate the relationship between the learned data manifold geometry, the training data, and the downstream behavior of pre-trained generative models.