Voxel-based Viterbi Active Speaker Tracking (V-VAST) with best view selection for video lecture post-production

Damien Kelly

Anil C. Kokaram

Frank Boland

ICASSP(2011), pp. 2296-2299

Abstract

An automated system is presented for reducing a multi-view lecture recording into a single-view video containing a best-view summary of active speakers. The system uses skin color detection and voxel-based analysis to locate likely speaker positions. Using time-delay estimates from multiple microphones, speech activity is analyzed for each speaker position. The Viterbi algorithm is then used to estimate a track of the active speaker that maximizes the observed speech activity. This novel approach is termed Voxel-based Viterbi Active Speaker Tracking (V-VAST) and is shown to track speakers with an accuracy of 0.23 m. Using the tracking information, the system then extracts, from the available camera views, the most frontal face view of the active speaker for display.
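The tracking step can be illustrated with a minimal sketch of Viterbi decoding over candidate speaker positions. This is not the authors' implementation: the abstract does not give the scoring or transition model, so the per-frame activity scores, the distance-based movement penalty, and the names `viterbi_track` and `move_penalty` are all assumptions made for illustration.

```python
import math


def viterbi_track(activity, positions, move_penalty=1.0):
    """Return the index of the chosen candidate position for each frame.

    activity[t][k] is an assumed speech-activity score for candidate k at
    frame t; positions[k] is that candidate's (x, y, z) location in metres.
    The transition score (an assumption, not taken from the paper) is
    -move_penalty times the distance between consecutive positions.
    """
    n_frames = len(activity)
    n_pos = len(positions)
    score = list(activity[0])  # best cumulative score ending at each candidate
    back = []                  # back[t - 1][j]: best predecessor of j at frame t

    for t in range(1, n_frames):
        new_score, pointers = [], []
        for j in range(n_pos):
            # Best previous candidate i for landing on candidate j now.
            def arrive(i, j=j):
                return score[i] - move_penalty * math.dist(positions[i], positions[j])

            best_i = max(range(n_pos), key=arrive)
            pointers.append(best_i)
            new_score.append(arrive(best_i) + activity[t][j])
        score = new_score
        back.append(pointers)

    # Backtrack from the best final candidate to recover the whole track.
    path = [max(range(n_pos), key=lambda k: score[k])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Two hypothetical candidate positions one metre apart, three frames.
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
activity = [[1.0, 0.0], [0.4, 0.5], [1.0, 0.0]]
track = viterbi_track(activity, positions, move_penalty=1.0)
```

In this toy input the second candidate scores slightly higher in the middle frame, but with a movement penalty the decoded track stays on the first candidate, which is the smoothing behaviour a Viterbi tracker is meant to provide. Setting `move_penalty=0.0` reduces the decoder to a per-frame maximum.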
