Mauricio Delbracio
Authored Publications
Inversion by Direct Iteration (InDI) is a new formulation for supervised image restoration that avoids the so-called "regression to the mean" effect and produces more realistic and detailed images than existing regression-based methods. It does this by gradually improving image quality in small steps, similar to generative denoising diffusion models.
Image restoration is an ill-posed problem where multiple high-quality images are plausible reconstructions of a given low-quality input. Consequently, the output of a single-step regression model is typically an aggregate of all possible explanations and thus lacks detail and realism. The main advantage of InDI is that it does not try to predict the clean target image in a single step but instead gradually improves the image in small steps, resulting in better perceptual quality.
While generative denoising diffusion models also work in small steps, our formulation is distinct in that it does not require knowledge of any analytic form of the degradation process. Instead, we directly learn an iterative restoration process from paired low-quality and high-quality examples. InDI can be applied to virtually any image degradation, given paired training data. In conditional denoising diffusion image restoration, the denoising network generates the restored image by repeatedly denoising an initial image of pure noise, conditioned on the degraded input. In contrast to conditional denoising formulations, InDI proceeds by iteratively restoring the low-quality input image directly, producing high-quality results on a variety of image restoration tasks, including motion and out-of-focus deblurring, super-resolution, compression artifact removal, and denoising.
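The small-step restoration described above can be illustrated with a short sketch. This is not the authors' implementation: it assumes an update of the form x_{t-δ} = (δ/t) F(x_t, t) + (1 - δ/t) x_t, where F estimates the clean image from the current intermediate image, and the names `indi_restore`, `restorer`, and `num_steps` are hypothetical. An oracle that always returns the clean image stands in for a trained network.

```python
import numpy as np

def indi_restore(degraded, restorer, num_steps=10):
    """Restore `degraded` in `num_steps` small steps (illustrative sketch).

    `restorer(x, t)` is a hypothetical stand-in for a trained network that
    estimates the clean image from the intermediate image x at time t in (0, 1].
    """
    x = np.asarray(degraded, dtype=np.float64)
    delta = 1.0 / num_steps
    for i in range(num_steps):
        t = 1.0 - i * delta   # t runs 1, 1 - delta, ..., delta
        step = delta / t      # fraction of the remaining path covered now
        # Move part of the way from the current image toward the estimate.
        x = step * restorer(x, t) + (1.0 - step) * x
    return x

# Toy check: with an oracle restorer, the iteration ends at the clean image.
rng = np.random.default_rng(0)
clean = rng.random((8, 8))
degraded = clean + 0.5 * rng.standard_normal((8, 8))
oracle = lambda x, t: clean  # assumption: perfect estimate at every step
restored = indi_restore(degraded, oracle, num_steps=10)
```

With `num_steps=1` the loop reduces to a single call to `restorer`, which corresponds to the single-step regression setting that the abstract contrasts with.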
Soft Diffusion: Score Matching with General Corruptions
Giannis Daras
Alexandros Dimakis
Transactions on Machine Learning Research (TMLR) (2023)
We define a broader family of corruption processes that generalizes previously known diffusion models. To reverse these general diffusions, we propose a new objective called Soft Score Matching. Soft Score Matching incorporates the degradation process in the network and provably learns the score function for any linear corruption process. Our new loss trains the model to predict a clean image that, after corruption, matches the diffused observation. This objective learns the gradient of the likelihood under suitable regularity conditions for the family of linear corruption processes. We further develop an algorithm to select the corruption levels for general diffusion processes and a novel sampling method that we call Momentum Sampler. We show experimentally that our framework works for general linear corruption processes, such as Gaussian blur and masking. Our method outperforms all linear diffusion models on CelebA-64, achieving an FID score of 1.85. We also show computational benefits compared to vanilla denoising diffusion.
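The idea of comparing the prediction with the target only after corruption can be sketched as below. This is a simplified illustration under stated assumptions, not the paper's training code: the loss is taken to be the mean squared error between the corrupted prediction and the corrupted clean image, any per-level weighting is omitted, and `soft_score_matching_loss`, `predict_clean`, and `corrupt` are hypothetical names. A fixed random mask plays the role of the linear corruption.

```python
import numpy as np

def soft_score_matching_loss(x0, predict_clean, corrupt, sigma, rng):
    """Mean squared error measured after applying the linear corruption.

    `corrupt` is a linear operator C; `predict_clean` is a hypothetical
    stand-in for a network that maps the noisy observation to a clean estimate.
    """
    corrupted = corrupt(x0)
    xt = corrupted + sigma * rng.standard_normal(x0.shape)  # diffused observation
    x0_hat = predict_clean(xt)
    residual = corrupt(x0_hat) - corrupted                  # C(x0_hat - x0)
    return float(np.mean(residual ** 2))

rng = np.random.default_rng(0)
x0 = rng.random((16, 16))
mask = rng.random((16, 16)) > 0.5
corrupt = lambda x: x * mask        # masking is a linear corruption

oracle = lambda xt: x0              # perfect clean prediction
hidden_wrong = lambda xt: x0 * mask # wrong only where the mask hides pixels
identity = lambda xt: xt            # returns the noisy observation unchanged

loss_oracle = soft_score_matching_loss(x0, oracle, corrupt, 0.1, rng)
loss_hidden = soft_score_matching_loss(x0, hidden_wrong, corrupt, 0.1, rng)
loss_identity = soft_score_matching_loss(x0, identity, corrupt, 0.1, rng)
```

In this toy setup the loss ignores errors on pixels that the mask removes, so `hidden_wrong` scores the same as the oracle, while the identity predictor is penalized for the noise it leaves on the visible pixels.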
Interpretable Unsupervised Diversity Denoising and Artefact Removal
Mangal Prakash
Florian Jug
International Conference on Learning Representations (2022)
Image denoising and artefact removal are complex inverse problems admitting multiple valid solutions. Unsupervised diversity restoration, that is, obtaining a diverse set of possible restorations given a corrupted image, is important for ambiguity removal in many applications such as microscopy, where paired data for supervised training are often unobtainable. In real-world applications, imaging noise and artefacts are typically hard to model, leading to unsatisfactory performance of existing unsupervised approaches. This work presents an interpretable approach for unsupervised and diverse image restoration. To this end, we introduce a capable architecture called Hierarchical DivNoising (HDN), based on a hierarchical Variational Autoencoder. We show that HDN learns an interpretable multi-scale representation of artefacts, and we leverage this interpretability to remove imaging artefacts commonly occurring in microscopy data. Our method achieves state-of-the-art results on twelve benchmark image denoising datasets while providing access to a whole distribution of sensibly restored solutions.
Additionally, we demonstrate on three real microscopy datasets that HDN removes artefacts without supervision, being the first method capable of doing so while generating multiple plausible restorations all consistent with the given corrupted image.
Deblurring via Stochastic Refinement
Jay Whang
Chitwan Saharia
Alexandros Dimakis
CVPR (2022)
Image deblurring is an ill-posed problem with multiple plausible solutions given a single input image. However, most existing methods produce a deterministic estimate of the clean image and are trained to minimize pixel-level distortion. Such distortion metrics are known to be poorly correlated with human perception and often lead to unrealistic reconstructions.
We present an alternative framework for single-image blind deblurring based on conditional diffusion models. Unlike existing techniques, we train a stochastic sampler that refines the output of a deterministic predictor and is capable of producing a diverse set of plausible reconstructions for a single input. This leads to a significant improvement in perceptual quality over existing state-of-the-art methods across multiple standard benchmarks. Our predict-and-refine approach also enables much more efficient sampling compared to the standard diffusion model. Combined with a carefully tuned network architecture and inference procedure, our method is shown to be competitive in terms of traditional quantitative distortion metrics such as PSNR. These results show clear benefits of stochastic diffusion-based methods for deblurring and challenge the widely used strategy of producing a single, deterministic reconstruction.
Projected Distribution Loss for Image Enhancement
2021 IEEE International Conference on Computational Photography (ICCP), pp. 1-12
Features obtained from object detection CNNs have been widely used for measuring perceptual similarities between images. Such differentiable metrics can be used as perceptual learning losses to train image enhancement models. However, the choice of the distance function between input and target features may have a consequential impact on the performance of the trained model. While using the norm of the difference between extracted features leads to limited hallucination of details, measuring the distance between distributions of features may generate more textures, but also more unrealistic details and artifacts. In this paper, we demonstrate that aggregating 1D-Wasserstein distances between CNN activations is more reliable than the existing approaches, and it can significantly improve the perceptual performance of enhancement models. More explicitly, we show that in imaging applications such as denoising, super-resolution, demosaicing, deblurring and JPEG artifact removal, the proposed learning loss outperforms the current state-of-the-art on reference-based perceptual losses. This means that the proposed learning loss can be plugged into different imaging frameworks and produce perceptually realistic results.
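Aggregating 1D-Wasserstein distances over feature channels can be sketched in a few lines. The sketch below is illustrative only: it assumes both feature maps have the same shape, uses the 1-Wasserstein distance (for two equal-sized samples this is the mean absolute difference of the sorted values), and averages over channels; the paper's exact aggregation may differ. Random arrays stand in for CNN activations, and `projected_distribution_loss` is a hypothetical name.

```python
import numpy as np

def projected_distribution_loss(feat_a, feat_b):
    """Average per-channel 1D Wasserstein-1 distance between two feature maps.

    Both inputs have shape (channels, height, width). Each channel is treated
    as a 1D sample of activations, so spatial arrangement is ignored.
    """
    a = np.sort(feat_a.reshape(feat_a.shape[0], -1), axis=1)
    b = np.sort(feat_b.reshape(feat_b.shape[0], -1), axis=1)
    # For equal-sized samples, W1 is the mean gap between sorted values.
    return float(np.mean(np.abs(a - b)))

rng = np.random.default_rng(0)
feats = rng.standard_normal((4, 8, 8))  # stand-in for CNN activations

# Shuffling spatial positions leaves each channel's distribution unchanged.
perm = rng.permutation(64)
shuffled = feats.reshape(4, -1)[:, perm].reshape(4, 8, 8)
loss_shuffled = projected_distribution_loss(feats, shuffled)

# Shifting every activation by 1 moves each distribution by exactly 1.
loss_shifted = projected_distribution_loss(feats, feats + 1.0)
```

The shuffled case shows how a distribution-based loss differs from a norm of the feature difference: the two feature maps differ position by position, yet their per-channel distributions, and hence this loss, are identical.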
Learning to Reduce Defocus Blur by Realistically Modeling Dual-Pixel Data
Abdullah Abuolaim
Michael S. Brown
International Conference on Computer Vision (ICCV) (2021)
Recent work has shown impressive results on data-driven defocus deblurring using the two-image views available on modern dual-pixel (DP) sensors. One significant challenge in this line of research is access to DP data. Despite many cameras having DP sensors, only a limited number provide access to the low-level DP sensor images. In addition, capturing training data for defocus deblurring involves a time-consuming and tedious setup requiring the camera's aperture to be adjusted. Some cameras with DP sensors (e.g., smartphones) do not have adjustable apertures, further limiting the ability to produce the necessary training data. We address the data capture bottleneck by proposing a procedure to generate realistic DP data synthetically. Our synthesis approach mimics the optical image formation found on DP sensors and can be applied to virtual scenes rendered with standard computer software. Leveraging these realistic synthetic DP images, we introduce a recurrent convolutional network (RCN) architecture that improves deblurring results and is suitable for use with single-frame and multi-frame data (e.g., video) captured by DP sensors. Finally, we show that our synthetic DP data is useful for training DNN models targeting video deblurring applications where access to DP data remains challenging.
Polyblur: Removing mild blur by polynomial reblurring
IEEE Transactions on Computational Imaging (2021)
We present a highly efficient blind image restoration method to remove mild blur in natural images. Contrary to the mainstream, we focus on removing the slight blur that is often present, degrades image quality, and is commonly caused by small out-of-focus, lens blur, or slight camera motion. The proposed algorithm first estimates image blur and then compensates for it by combining multiple applications of the estimated blur in a principled way. In this sense, we present a novel procedure to design the approximate inverse of a filter that uses only re-applications of the filter itself. To estimate image blur in natural images, we introduce a simple yet robust algorithm based on empirical observations about the distribution of the gradient in sharp images. Our experiments show that, in the context of mild blur, the proposed method outperforms traditional and modern blind deconvolution methods and runs in a fraction of the time. We finally show that the method can be used to blindly correct blur before applying an off-the-shelf deep super-resolution model, leading to results superior to those of other highly complex and computationally demanding methods. The proposed method can estimate and remove mild blur on a 12Mp image on a modern mobile phone device in a fraction of a second.
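The idea of building an approximate inverse from re-applications of the blur itself can be illustrated with a truncated Neumann series. This is a swapped-in textbook construction, not the polynomial designed in the paper: for a blur K whose frequency response lies in (0, 1], the sum I + (I - K) + (I - K)^2 = 3I - 3K + K^2 approximates the inverse of K. The sketch also assumes the blur is already known, whereas the method above estimates it from the image; `gaussian_blur` and `polynomial_deblur` are hypothetical names.

```python
import numpy as np

def gaussian_blur(image, sigma):
    """Circular Gaussian blur applied in the Fourier domain."""
    fy = np.fft.fftfreq(image.shape[0])[:, None]
    fx = np.fft.fftfreq(image.shape[1])[None, :]
    # Frequency response of a Gaussian with standard deviation sigma (pixels).
    transfer = np.exp(-2.0 * (np.pi * sigma) ** 2 * (fx ** 2 + fy ** 2))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * transfer))

def polynomial_deblur(blurry, blur):
    """Apply p(K) = 3I - 3K + K^2 using only re-applications of `blur`."""
    once = blur(blurry)   # K applied to the blurry image
    twice = blur(once)    # K applied twice
    return 3.0 * blurry - 3.0 * once + twice

rng = np.random.default_rng(0)
sharp = rng.random((32, 32))
blur = lambda img: gaussian_blur(img, sigma=1.0)  # assumption: blur is known
blurry = blur(sharp)
restored = polynomial_deblur(blurry, blur)

err_blurry = float(np.mean((blurry - sharp) ** 2))
err_restored = float(np.mean((restored - sharp) ** 2))
```

Because no division by the frequency response is involved, this kind of correction avoids the instability of a direct inverse filter, at the cost of only partially restoring strongly attenuated frequencies. The toy example is noise-free; noise amplification is not modeled here.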