Imtiaz Humayun
Research scientist interested in understanding how data and objectives shape training, memorization, and learning dynamics in neural-network-based models and agents.
Research Areas
Authored Publications
What Secrets Do Your Manifolds Hold? Understanding the Local Geometry of Generative Models
Ibtihel Amara
Golnoosh Farnadi
Mohammad Havaei
ICLR 2025
Abstract
Deep Generative Models are frequently used to learn continuous representations of complex data distributions by training on a finite number of samples. For any generative model, including pre-trained foundation models with Diffusion or Transformer architectures, generation performance can vary significantly across the learned data manifold. In this paper, we study the local geometry of the learned manifold and its relationship to generation outcomes for a wide range of generative models, including DDPM, Diffusion Transformer (DiT), and Stable Diffusion 1.4. Building on the theory of continuous piecewise-linear (CPWL) generators, we characterize the local geometry in terms of three geometric descriptors: scaling (ψ), rank (ν), and complexity/un-smoothness (δ). We provide quantitative and qualitative evidence showing that, for a given latent vector, the local descriptors are indicative of post-generation aesthetics, generation diversity, and memorization by the generative model. Finally, we demonstrate that by training a reward model on the local scaling for Stable Diffusion, we can self-improve both generation aesthetics and diversity using geometry-sensitive guidance during denoising.
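These descriptors can be approximated from the Jacobian of the generator at a latent vector. Below is a minimal PyTorch sketch, assuming a generator that maps a single (unbatched) latent vector to an output tensor; the function name, defaults, and the scaling/rank/complexity proxies are illustrative assumptions, not the paper's exact definitions.

```python
import torch

def local_descriptors(generator, z, eps=1e-6, n_probes=8, sigma=1e-2):
    """Return proxies for local scaling (psi), rank (nu), and complexity (delta) at latent z."""
    flat_gen = lambda v: generator(v).flatten()
    # Jacobian of the flattened output w.r.t. the latent vector: shape (out_dim, latent_dim).
    J = torch.autograd.functional.jacobian(flat_gen, z)
    S = torch.linalg.svdvals(J)
    psi = torch.log(S[S > eps]).sum()   # log-volume change of a small latent neighborhood
    nu = int((S > eps).sum())           # number of latent directions the generator does not collapse
    # Un-smoothness proxy: how much the Jacobian varies under small latent perturbations.
    diffs = [torch.linalg.norm(
                 torch.autograd.functional.jacobian(flat_gen, z + sigma * torch.randn_like(z)) - J)
             for _ in range(n_probes)]
    delta = torch.stack(diffs).mean()
    return psi.item(), nu, delta.item()
```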
Abstract
We develop the Latent Exploration Score (LES) to mitigate over-exploration in Latent Space Optimization (LSO), a popular method for solving black-box discrete optimization problems. LSO utilizes continuous optimization within the latent space of a Variational Autoencoder (VAE) and is known to be susceptible to over-exploration, which manifests in unrealistic solutions that reduce its practicality. LES leverages the trained decoder's approximation of the data distribution, and can be employed with any VAE decoder, including pretrained ones, without additional training, architectural changes, or access to the training data. Our evaluation across five LSO benchmark tasks and twenty-two VAE models demonstrates that LES always enhances the quality of the solutions while maintaining high objective values, leading to improvements over existing solutions in most cases. We believe that LES' ability to identify out-of-distribution areas, its differentiability, and its computational tractability will open new avenues for LSO.
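The setting can be summarized as optimizing in the latent space while penalizing latents the decoder models poorly. The sketch below is a hedged illustration, assuming a differentiable surrogate for the black-box objective; `prior_penalty` is only a placeholder for LES, whose actual score is derived from the trained decoder's approximation of the data distribution.

```python
import torch

def optimize_in_latent_space(decoder, objective, z_init, exploration_penalty,
                             lam=1.0, steps=200, lr=1e-2):
    """Maximize objective(decoder(z)) while discouraging poorly modeled latents."""
    z = z_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        x = decoder(z)                                       # decode the candidate solution
        loss = -objective(x) + lam * exploration_penalty(decoder, z)
        loss.backward()
        opt.step()
    return decoder(z).detach()

# Simplest possible stand-in penalty: distance of z from the VAE prior N(0, I).
# LES instead scores exploration through the trained decoder itself, which this
# placeholder does not capture.
def prior_penalty(decoder, z):
    return 0.5 * (z ** 2).sum()
```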
Grokking and the Geometry of Circuit Formation
Randall Balestriero
Richard G. Baraniuk
ICML 2024 Workshop on Mechanistic Interpretability
Abstract
Grokking, or delayed generalization, is a phenomenon where generalization in a deep neural network (DNN) emerges after achieving near-zero training error. Previous studies have reported the occurrence of grokking in specific controlled settings, such as DNNs initialized with large-norm parameters or transformers trained on algorithmic datasets. Recent studies have shown that grokking occurs for adversarial examples as well, in the form of delayed robustness. We connect the emergence of grokking with the geometric arrangement of circuits in the input space, as well as their size and proximity to the training data. We also demonstrate that grokking manifests in Large Language Models in next-character prediction tasks. We provide evidence that the arrangement of circuits in a DNN undergoes a phase transition during training, migrating away from the training samples and thereby increasing both robustness and generalization.
Abstract
Grokking, or delayed generalization, is a phenomenon where generalization in a deep neural network (DNN) occurs long after achieving near-zero training error. Previous studies have reported the occurrence of grokking in specific controlled settings, such as DNNs initialized with large-norm parameters or transformers trained on algorithmic datasets. We demonstrate that grokking is actually much more widespread and materializes in a wide range of practical settings, such as training a convolutional neural network (CNN) on CIFAR10 or a ResNet on Imagenette. We introduce the new concept of delayed robustness, whereby a DNN groks adversarial examples and becomes robust, long after interpolation and/or generalization. We develop an analytical explanation for the emergence of both delayed generalization and delayed robustness based on the local complexity of a DNN's input-output mapping. Our local complexity measures the density of so-called "linear regions" (aka spline partition regions) that tile the DNN input space and serves as a useful progress measure for training. We provide the first evidence that, for classification problems, the linear regions undergo a phase transition during training, after which they migrate away from the training samples (making the DNN mapping smoother there) and towards the decision boundary (making the DNN mapping less smooth there). Grokking occurs post phase transition, as a robust partition of the input space emerges thanks to the linearization of the DNN mapping around the training points. Web: bit.ly/grok-adversarial.
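A rough way to probe the "density of linear regions" mentioned above is to sample a small neighborhood of a point and count how many distinct ReLU activation patterns (i.e. spline regions) it touches. The sketch below assumes a plain nn.Sequential of Linear and ReLU layers and is only a crude proxy, not the local complexity measure developed in the paper.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def local_complexity(model, x, radius=0.05, n_samples=256):
    """Count distinct ReLU activation patterns (linear regions) hit near x."""
    patterns = set()
    for _ in range(n_samples):
        h = x + radius * torch.randn_like(x)
        bits = []
        for layer in model:                    # assumes model is an nn.Sequential of Linear/ReLU
            h = layer(h)
            if isinstance(layer, nn.ReLU):
                bits.append(h.flatten() > 0)   # the sign pattern identifies the spline region
        patterns.add(tuple(torch.cat(bits).tolist()))
    return len(patterns)
```

Fewer distinct patterns near a point means the mapping is closer to linear there, which is the sense in which training points "linearize" after the phase transition.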
Abstract
As the capabilities of Deep Generative Models improve, thorough evaluation of their generation performance and biases becomes crucial. Human evaluation remains the gold standard, but its cost makes assessing generative models, particularly text-to-image foundation models like Stable Diffusion, prohibitively expensive.
In this paper, we explore the feasibility of using geometric descriptors of the data manifold for self-assessment, therefore requiring only the network architecture and its weights. We propose three theoretically inspired geometric descriptors – local scaling (ψ), local rank (ν), and local complexity (δ) – that characterize the local properties of the learned data manifold at a given latent vector. Our proposed measures can be used to quantify (i) the uncertainty, (ii) the local dimensionality, and (iii) the smoothness of the learned generative model manifold. We demonstrate the relationship of our manifold descriptors with generation quality and diversity. Further, we present evidence of geometric bias between sub-populations under the generated distribution for Beta-VAE and Stable Diffusion. We believe that our proposed framework will allow future research into understanding how bias manifests through the learned data manifold of foundation models.
Learning Transferable Features for Implicit Neural Representations
Kushal Vyas
Aniket Dashpute
Richard G. Baraniuk
Ashok Veeraraghavan
Guha Balakrishnan
NeurIPS 2024
Abstract
Implicit neural representations (INRs) have demonstrated success in a variety of applications, including inverse problems and neural rendering. An INR is typically trained to capture one signal of interest, resulting in learned neural features that are highly attuned to that signal. Although such features are often assumed to be less generalizable, we explore their transferability for fitting similar signals. We introduce a new INR training framework, STRAINER, that learns transferable features for fitting INRs to new signals from a given distribution, faster and with better reconstruction quality. Owing to the sequential layer-wise affine operations in an INR, we propose to learn transferable representations by sharing initial encoder layers across multiple INRs with independent decoder layers. At test time, the learned encoder representations are transferred as initialization for an otherwise randomly initialized INR. We find STRAINER to yield extremely powerful initialization for fitting images from the same domain, allowing for a ≈ +10 dB gain in signal quality early on compared to an untrained INR. STRAINER also provides a simple way to encode data-driven priors in INRs. We evaluate STRAINER on multiple in-domain and out-of-domain signal fitting tasks and inverse problems, and further provide detailed analysis and discussion on the transferability of STRAINER's features. Our demo can be accessed here.
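The core idea, sharing encoder layers across several INRs during training and transferring them as an initialization for a new signal, can be written down compactly. The sketch below uses a plain ReLU MLP with hypothetical layer sizes as a stand-in for the sinusoidal INRs commonly used in practice; it illustrates the training scheme described above and is not the released STRAINER code.

```python
import copy
import torch
import torch.nn as nn

def make_encoder():
    # Shared encoder: 2D pixel coordinates -> feature vector.
    return nn.Sequential(nn.Linear(2, 256), nn.ReLU(),
                         nn.Linear(256, 256), nn.ReLU())

encoder = make_encoder()                           # shared across all training images
decoders = [nn.Linear(256, 3) for _ in range(5)]   # independent per-image decoder heads

def fit_shared_encoder(signals, coords, steps=2000, lr=1e-4):
    """signals: list of (N, 3) RGB targets sampled at the same (N, 2) coords."""
    params = list(encoder.parameters()) + [p for d in decoders for p in d.parameters()]
    opt = torch.optim.Adam(params, lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = sum(((d(encoder(coords)) - s) ** 2).mean()
                   for d, s in zip(decoders, signals))
        loss.backward()
        opt.step()

def init_inr_for_new_signal():
    # Transfer the learned encoder weights; only the decoder head starts at random.
    return nn.Sequential(copy.deepcopy(encoder), nn.Linear(256, 3))
```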
On The Local Geometry of Deep Generative Manifolds
Ibtihel Amara
Golnoosh Farnadi
Mohammad Havaei
ICML 2024 Workshop on Geometry-grounded Representation Learning and Generative Modeling
Abstract
Is it possible to evaluate a pre-trained generative model, especially a large text-to-image generative model, without access to its original training data or human evaluators? We propose a novel approach to addressing this challenge by introducing a self-assessment framework based on the theory of continuous piecewise affine spline generators. We investigate the use of three theoretically inspired geometric descriptors for neural networks – local scaling (ψ), local rank (ν), and local complexity (δ) – to characterize the uncertainty, dimensionality, and smoothness of the learned manifold, using only the network weights and architecture. We demonstrate the relationship of these descriptors with generation quality, aesthetics, diversity, and bias, providing insights into how these aspects manifest for different sub-populations under the generated distribution. Moreover, we observe that the geometry of the data manifold is influenced by the training distribution, enabling us to perform out-of-distribution detection, model comparison, and reward modeling to control the output distribution. We believe our framework will help elucidate the relationship between the learned data manifold geometry, the training data, and the downstream behavior of pre-trained generative models.
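One of the downstream uses mentioned above, out-of-distribution detection, can be illustrated with the same Jacobian-based local scaling as in the earlier sketch: latents whose ψ deviates strongly from the values seen on in-distribution latents are flagged. The threshold rule and helper below are hypothetical illustrations, not the paper's procedure.

```python
import torch

def local_scaling(generator, z, eps=1e-6):
    """Jacobian-based local scaling (psi) at latent z, as sketched earlier."""
    J = torch.autograd.functional.jacobian(lambda v: generator(v).flatten(), z)
    S = torch.linalg.svdvals(J)
    return torch.log(S[S > eps]).sum().item()

def flag_out_of_distribution(generator, test_latents, reference_latents, k=3.0):
    """Flag latents whose local scaling deviates strongly from an in-distribution reference set."""
    ref = torch.tensor([local_scaling(generator, z) for z in reference_latents])
    mu, sd = ref.mean().item(), ref.std().item()
    return [abs(local_scaling(generator, z) - mu) > k * sd for z in test_latents]
```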