Articulatory Feature Classification Using Nearest Neighbors
Abstract
Recognizing aspects of articulation from audio recordings of speech is an important problem, both as an end in itself and as part of an articulatory approach to automatic speech recognition. In this paper we study the frame-level classification of a set of articulatory features (AFs) inspired by the vocal tract variables of articulatory phonology. We compare k-nearest neighbor (k-NN) classifiers and multilayer perceptrons (MLPs), using different acoustic feature vectors, and classify the AFs either independently or jointly. We also consider using the MLP outputs for all of the AFs as inputs to k-NN classifiers for the individual AFs, effectively treating the MLPs as a form of nonlinear dimensionality reduction and allowing the decision for each AF to draw on the MLP outputs for the other AFs. We find that MLPs outperform k-NN classifiers on acoustic features, while k-NN classifiers using MLP outputs outperform both.
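As a concrete illustration of the two-stage approach summarized above, the following Python sketch (using scikit-learn; not the authors' implementation) trains one MLP per AF on acoustic feature frames, concatenates the per-AF MLP posteriors into a single low-dimensional representation, and classifies each AF with a k-NN classifier over that representation, so each AF's decision can draw on the MLPs for the other AFs. The AF names, data, dimensions, and hyperparameters are all illustrative assumptions.

# A minimal sketch of MLP outputs as k-NN inputs; all names and
# data below are placeholders, not the paper's actual setup.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.neighbors import KNeighborsClassifier

AFS = ["lip_opening", "tongue_tip_loc"]  # example AF names (assumed)
rng = np.random.default_rng(0)
X_train = rng.standard_normal((1000, 39))  # e.g. 39-dim acoustic frames (placeholder)
X_test = rng.standard_normal((200, 39))
y_train = {af: rng.integers(0, 3, 1000) for af in AFS}  # per-AF frame labels
y_test = {af: rng.integers(0, 3, 200) for af in AFS}

# Stage 1: train one MLP per AF on the acoustic features.
mlps = {af: MLPClassifier(hidden_layer_sizes=(100,), max_iter=200)
            .fit(X_train, y_train[af])
        for af in AFS}

# Stage 2: concatenate all MLPs' posteriors, a nonlinear
# dimensionality reduction of the acoustic input, so each AF's
# k-NN sees evidence from every AF's MLP.
def mlp_features(X):
    return np.hstack([mlps[af].predict_proba(X) for af in AFS])

Z_train, Z_test = mlp_features(X_train), mlp_features(X_test)
for af in AFS:
    knn = KNeighborsClassifier(n_neighbors=5).fit(Z_train, y_train[af])
    print(f"{af}: frame accuracy {knn.score(Z_test, y_test[af]):.3f}")

On this random placeholder data the accuracies are of course meaningless; the sketch only shows the structure of the pipeline, in which the MLP posteriors rather than the raw acoustic features serve as the k-NN distance space.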