- Beat Gfeller
- Marco Tagliasacchi
- Pen Li
- Zalán Borsos
Abstract
A critical aspect for the successful deployment of audio-based models ``in-the-wild'' is the robustness to the transformations introduced by heterogeneous microphones. In this work we propose a method that is able to perform \emph{one-shot microphone style} transfer. Given only a \emph{few seconds} of audio recorded by a target device, \emph{MicAugment} identifies the transformations associated to the microphone and uses the learned transformations to synthesize audio as if it were recorded by that device. We show that our method can successfully apply the style of a target microphone and that it significantly increases model robustness to microphone variability when used as \emph{data augmentation} in downstream tasks.
Research Areas
Learn more about how we do research
We maintain a portfolio of research projects, providing individuals and teams the freedom to emphasize specific types of work