
Scott Wisdom
I am a researcher in Google AI Perception in Cambridge, MA, working on speech, audio, and audio-visual machine perception, with a focus on audio source separation.
Authored Publications
TokenSplit: Using Discrete Speech Representations for Direct, Refined, and Transcript-Conditioned Speech Separation and Recognition
Xuankai Chang
Zalán Borsos
Marco Tagliasacchi
Neil Zeghidour
Interspeech 2023
Don’t Listen to What You Can’t See: The Importance of Negative Examples for Audio-Visual On-Screen Sound Separation
ECCV 2022 Workshop on AV4D: Visual Learning of Sounds in Spaces
AudioScopeV2: Audio-Visual Attention Architectures for Calibrated Open-Domain On-Screen Sound Separation
Tal Remez
European Conference on Computer Vision (ECCV) (2022)
Sparse, Efficient, and Semantic MixIT: Taming In-the-Wild Unsupervised Sound Separation
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) (2021)
Self-Supervised Learning from Automatically Separated Sound Scenes
Marco Tagliasacchi
Xavier Serra
WASPAA 2021
What's All the FUSS About Free Universal Sound Separation Data?
Romain Serizel
Nicolas Turpault
Eduardo Fonseca
Justin Salamon
Prem Seetharaman
ICASSP 2021
Into the Wild with AudioScope: Unsupervised Audio-Visual Separation of On-Screen Sounds
Tal Remez
International Conference on Learning Representations (ICLR) 2021