Robust Acoustic Emotion Recognition Based on Cascaded Normalization and Extreme Learning Machines
Abstract
One challenge in speech emotion recognition is achieving robust, speaker-independent performance. In this paper, we take a cascaded normalization approach that combines linear speaker-level, non-linear value-level, and feature-vector-level normalization to minimize speaker-related effects and to maximize class separability with linear kernel classifiers. We use extreme learning machine classifiers on a four-class problem (joy, anger, sadness, neutral). We demonstrate the efficacy of the proposed method on the recently collected Turkish Emotional Speech Database.
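The pipeline described above can be illustrated with a minimal sketch. The abstract does not specify the individual normalizers, so the choices here are assumptions made only for illustration: per-speaker z-normalization for the linear speaker-level step, a sign-preserving power transform (hypothetical exponent `gamma`) for the non-linear value-level step, and L2 normalization for the feature-vector-level step. The classifier is a generic kernel extreme learning machine with a linear kernel and a hypothetical regularization constant `C`; all function names are illustrative and not taken from the paper.

```python
import numpy as np

def speaker_znorm(X, speakers):
    """Assumed linear speaker-level step: z-normalize each feature per speaker."""
    Xn = np.empty_like(X, dtype=float)
    for s in np.unique(speakers):
        idx = speakers == s
        mu = X[idx].mean(axis=0)
        sd = X[idx].std(axis=0)
        # Guard against constant features within a speaker.
        Xn[idx] = (X[idx] - mu) / np.where(sd > 0, sd, 1.0)
    return Xn

def power_norm(X, gamma=0.5):
    """Assumed non-linear value-level step: sign-preserving power transform."""
    return np.sign(X) * np.abs(X) ** gamma

def l2_norm(X):
    """Assumed feature-vector-level step: scale each row to unit Euclidean length."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.where(norms > 0, norms, 1.0)

def cascade(X, speakers, gamma=0.5):
    """Apply the three normalization steps in sequence."""
    return l2_norm(power_norm(speaker_znorm(X, speakers), gamma))

def kernel_elm_fit(X, y, n_classes, C=1.0):
    """Kernel ELM with a linear kernel: solve (I/C + K) beta = T."""
    T = -np.ones((len(y), n_classes))
    T[np.arange(len(y)), y] = 1.0      # +1 for the true class, -1 otherwise
    K = X @ X.T                        # linear kernel matrix
    return np.linalg.solve(np.eye(len(y)) / C + K, T)

def kernel_elm_predict(X_train, beta, X_test):
    """Predict the class with the highest output score."""
    return np.argmax((X_test @ X_train.T) @ beta, axis=1)
```

In a speaker-independent setting, the speaker-level statistics for test speakers would be computed from those speakers' own utterances without using labels; `cascade` is applied to the training and test sets before `kernel_elm_fit` and `kernel_elm_predict`.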