SVDiff: Compact Parameter Space for Diffusion Fine-Tuning

Ligong Han; Yinxiao Li; Han Zhang; Peyman Milanfar; Dimitris Metaxas; Feng Yang

IEEE/CVF International Conference on Computer Vision (ICCV) (2023)

Abstract

Diffusion models have achieved remarkable success in text-to-image generation, enabling the creation of high-quality images from text prompts or other modalities. However, existing methods for customizing these models struggle to handle multiple personalized subjects and are prone to overfitting. Moreover, their large number of parameters makes model storage inefficient. In this paper, we propose a novel approach to address these limitations when personalizing existing text-to-image diffusion models. Our method fine-tunes the singular values of the weight matrices, leading to a compact and efficient parameter space that reduces the risk of overfitting and language drift. We also propose a Cut-Mix-Unmix data-augmentation technique to enhance the quality of multi-subject image generation, as well as a simple text-based image editing framework. Our proposed SVDiff method has a significantly smaller model size (1.7MB for StableDiffusion) compared to existing methods (vanilla DreamBooth 3.66GB, Custom Diffusion 73MB), making it more practical for real-world applications.
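To make the idea of fine-tuning only singular values concrete, the following is a minimal NumPy sketch, not the authors' implementation. It decomposes a single stand-in weight matrix, keeps the singular vectors frozen, and exposes one shift per singular value as the only trainable parameters. The function names (`decompose`, `apply_shift`), the 64x32 matrix shape, and the clamp at zero are assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

def decompose(weight):
    """Return U, the singular values, and V^T for a 2-D weight matrix."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    return u, s, vt

def apply_shift(u, s, vt, delta):
    """Rebuild the weight from frozen U and V^T with shifted singular values.

    Clamping at zero is an assumption of this sketch; it keeps the
    shifted singular values non-negative.
    """
    shifted = np.maximum(s + delta, 0.0)
    return (u * shifted) @ vt  # same as U @ diag(shifted) @ V^T

rng = np.random.default_rng(0)
weight = rng.standard_normal((64, 32))  # hypothetical stand-in for one pretrained layer
u, s, vt = decompose(weight)

delta = np.zeros_like(s)                # the only parameters that would be trained
restored = apply_shift(u, s, vt, delta)

trainable = delta.size                  # one value per singular value
full = weight.size                      # every entry of the matrix
print(f"trainable parameters: {trainable} of {full}")
```

With a zero shift the rebuilt matrix matches the original, so training would start from the pretrained model. For an m x n matrix only min(m, n) values are stored instead of m * n, which is the kind of saving that the compact parameter space refers to.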
