Publications

Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.

Multimodal Modeling for Spoken Language Identification

Shikhar Bharadwaj

Min Ma

Shikhar Vashishth

Ankur Bapna

Sriram (Sri) Ganapathy

Vera Axelrod

Sid Dalmia

Wei Han

Yu Zhang

Daan van Esch

Sandy Ritchie

Partha Talukdar

Jason Riesa

Proceedings of 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2024) (2024)

StreamVC: Real-Time Low-Latency Voice Conversion

Yang Yang

Yury Kartynnik

Pen Li

Jiuqiang Tang

Xing Li

George Sung

Matthias Grundmann

ICASSP 2024 (2024)

NOMAD: Unsupervised Learning of Perceptual Embeddings for Speech Enhancement and Non-matching Reference Audio Quality Assessment

Alessandro Ragano

Andrew Hines

Jan Skoglund

ICASSP 2024 (to appear)

USM-SCD: USM-Based Multilingual Speaker Change Detection

Guanlong Zhao

Yongqiang Wang

Jason Pelecanos

Yu Zhang

Hank Liao

Yiling Huang

Han Lu

Quan Wang

ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 11801-11805

Now You See Me, Now You Don't: 'Poverty of the Stimulus' Problems and Arbitrary Correspondences in End-to-End Speech Models

Daan van Esch

Proceedings of the Second Workshop on Computation and Written Language (CAWL) 2024

Automatic Speech Recognition of Conversational Speech in Individuals with Disordered Speech

Jimmy Tobin

Philip Q Nelson

Bob MacDonald

Rus Heywood

Richard Cave

Katie Seaver

Antoine Desjardins

Pan-Pan Jiang

Jordan Green

Journal of Speech, Language, and Hearing Research (2024) (to appear)

Spoken Question Answering and Speech Continuation Using Spectrogram-Powered LLM

Eliya Nachmani

Alon Levkovitch

Roy Hirsch

Julian Salazar

Chulayuth Asawaroengchai

Soroosh Mariooryad

Ehud Rivlin

RJ Skerry-Ryan

Michelle Tadmor Ramanovich

ICLR (2024)

A Study of Raters' Sensitivity to Inter-sentence Pause Durations in American English Speech

Paul Owoicho

Josh Camp

Tom Kenter

Speech Prosody 2024 (SP2024) (2024) (to appear)

Data processing for Japanese text-to-pronunciation models

Gleb Mazovetskiy

Taku Kudo

(2024)

Translatotron 3: Speech to Speech Translation with Monolingual Data

Eliya Nachmani

Alon Levkovitch

Yifan Ding

Chulayuth Asawaroengchai

Heiga Zen

Michelle Tadmor Ramanovich

Helpful Neighbors: Leveraging Neighbors in Geographic Feature Pronunciation

Lion Jones

Richard William Sproat

Haruko Ishikawa

Alexander Gutkin

Transactions of the Association for Computational Linguistics, 11 (2023), 85–101

Envisioning Equitable Speech Technologies for Black Older Adults

Courtney Heldreth

Robin Brewer

Christina Harrington

FAccT (2023)

LibriTTS-R: Restoration of a Large-Scale Multi-Speaker TTS Corpus

Yuma Koizumi

Heiga Zen

Shigeki Karita

Yifan Ding

Kohei Yatabe

Nobuyuki Morioka

Michiel Adriaan Unico Bacchiani

Yu Zhang

Wei Han

Ankur Bapna

Interspeech 2023 (2023)

Augmenting Transformer-Transducer Based Speaker Change Detection With Token-Level Training Loss

Guanlong Zhao

Quan Wang

Han Lu

Yiling Huang

Ignacio Lopez Moreno

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech Representation and Linguistic Features

Yuma Koizumi

Heiga Zen

Shigeki Karita

Yifan Ding

Kohei Yatabe

Nobuyuki Morioka

Yu Zhang

Wei Han

Ankur Bapna

Michiel Adriaan Unico Bacchiani

WASPAA 2023 (2023) (to appear)