Speech Recognition Techniques for a Sign Language Recognition System

Philippe Dreuw
Thomas Deselaers
Morteza Zahedi
Hermann Ney
Interspeech(2007), pp. 2513-2516

Abstract

One of the most significant differences between automatic sign language recognition (ASLR) and automatic speech recognition (ASR) is due to the computer vision problems, whereas the corresponding problems in speech signal processing have been solved due to intensive research in the last 30 years. We present our approach where we start from a large vocabulary speech recognition system to profit from the insights that have been obtained in ASR research. The system developed is able to recognize sentences of continuous sign language independent of the speaker. The features used are obtained from standard video cameras without any special data acquisition devices. In particular, we focus on feature and model combination techniques applied in ASR, and the usage of pronunciation and language models (LM) in sign language. These techniques can be used for all kind of sign language recognition systems, and for many video analysis problems where the temporal context is important, e.g. for action or gesture recognition. On a publicly available benchmark database consisting of 201 sentences and 3 signers, we can achieve a 17% WER.

Research Areas