Speech Sentiment Analysis via End-To-End ASR Features
Abstract
In this paper, we propose using end-to-end ASR features to address speech sentiment analysis as a downstream task. We show that end-to-end ASR
features integrate the benefits of both acoustic models and language models.
From the sequence of ASR features, we develop effective methods to recognize sentiment
and get promising results.
Our approach improves the state-of-the-art accuracy on IEMOCAP from 66.6% to 71.7% and achieves an accuracy of 70.10% on SWBD-sentiment, which contains more than 49,500 utterances.