Towards Learning a Universal Non-Semantic Representation of Speech

Joel Shor

Aren Jansen

Ronnie Zvi Maor

Oran Lang

Omry Tuval

Félix de Chaumont Quitry

Marco Tagliasacchi

Ira Shavitt

Dotan Emanuel

Proc. Interspeech 2020 (2020)

Abstract

The ultimate goal of transfer learning is to enable learning from a small amount of data by using a strong embedding. While significant progress has been made in the visual and language domains, the speech domain does not have such a universal method. This paper presents a new representation of speech signals based on an unsupervised triplet-loss objective, which outperforms both the existing state of the art and other representations on a number of transfer learning tasks in the non-semantic speech domain. The embedding is learned on a publicly available dataset and is tested on a variety of low-resource downstream tasks, including personalization tasks and tasks in the medical domain. The model will be publicly released.
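
The abstract names a triplet-loss objective without giving its form. As a rough illustration only, the sketch below shows the generic hinge-style triplet loss on a single (anchor, positive, negative) triple of embedding vectors. The function names, the squared Euclidean distance, and the margin value are assumptions made for this example, not details taken from the paper. How triplets are selected without labels is also not stated here; that step is left outside the sketch.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=0.5):
    """Generic hinge-style triplet loss for one triple of embeddings.

    The loss is zero once the anchor is closer to the positive than to the
    negative by at least `margin` (a hypothetical value chosen for this sketch).
    """
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# A well-separated triple incurs no loss; a violated one is penalized.
easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
hard = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
```

Minimizing this quantity over many triples pulls the anchor and positive embeddings together and pushes the negative away, which is the general behavior a triplet objective is designed to produce.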

Research Areas

Machine Intelligence
Speech Processing
