Fully Automatic Speaker Separation System, with Automatic Enrolling of Recurrent Speakers

Raphael Cohen
Jason Levy
Russell Levy
Micha Breakstone
Amit Ashkenazi
INTERSPEECH 2018, ISCA (2018), pp. 1964-1965

Abstract

We present a speaker separation and identification system designed to operate without any effort from the end user. Single-channel conversations are transformed into i-vectors, clustered by speaker, and matched against a database of known speakers. Enrollment is automatic: a voice print is constructed for the recording user, taking advantage of the metadata that identifies that user's conversations. When available, additional sources such as video and the ASR-transcribed content are also used to identify speakers. We describe the system architecture and a novel unsupervised enrollment algorithm, and discuss the difficulties encountered in solving this problem.
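
The cluster-then-match flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the greedy cosine-similarity clustering, the 0.7 threshold, the toy 3-dimensional vectors standing in for i-vectors, and the names `cluster_segments` and `match_to_known` are all assumptions made for illustration.

```python
import numpy as np

def normalize(x):
    """Scale vectors to unit length so dot products are cosine similarities."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def cluster_segments(embeddings, threshold=0.7):
    """Greedy clustering (an assumed stand-in for the paper's method):
    each segment joins the most similar existing cluster, or starts a new one.
    Assumes at least one segment."""
    embeddings = normalize(embeddings)
    sums = []    # running sum of member embeddings per cluster
    labels = []  # cluster index assigned to each segment
    for e in embeddings:
        if sums:
            sims = normalize(np.stack(sums)) @ e
            best = int(np.argmax(sims))
            if sims[best] >= threshold:
                sums[best] = sums[best] + e
                labels.append(best)
                continue
        sums.append(e.copy())
        labels.append(len(sums) - 1)
    return labels, normalize(np.stack(sums))

def match_to_known(centroids, voice_prints, threshold=0.7):
    """Map each cluster centroid to the closest enrolled voice print,
    or None when no print is similar enough (hypothetical threshold)."""
    names = list(voice_prints)
    if not names:
        return [None] * len(centroids)
    prints = normalize(np.stack([voice_prints[n] for n in names]))
    result = []
    for c in centroids:
        sims = prints @ c
        best = int(np.argmax(sims))
        result.append(names[best] if sims[best] >= threshold else None)
    return result

# Toy data: four segment embeddings from two speakers, one enrolled voice print.
segments = [[1, 0, 0], [0.9, 0.1, 0], [0, 1, 0], [0.1, 0.95, 0]]
voice_prints = {"recording_user": [1, 0.05, 0]}

labels, centroids = cluster_segments(segments)
speakers = match_to_known(centroids, voice_prints)
print(labels)    # [0, 0, 1, 1]
print(speakers)  # ['recording_user', None]
```

In this sketch an unmatched cluster is simply returned as `None`; a real system would presumably use such clusters as candidates for enrolling new recurrent speakers.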
