Abstract
Class-based emotion recognition from speech, as performed in most works up to now, entails many restrictions for practical applications. Human emotion is a continuum, and an automatic emotion recognition system must be able to recognise it as such. We present a novel approach for continuous emotion recognition based on Long Short-Term Memory Recurrent Neural Networks, which model long-range dependencies between observations and thus outperform techniques like Support-Vector Regression. Transferring the concept of additionally modelling emotional history to the classification of discrete levels for the emotional dimensions "valence" and "activation", we also apply Conditional Random Fields, which prevail over the commonly used Support-Vector Machines. Experiments conducted on data recorded while humans interacted with a Sensitive Artificial Listener show that, for activation, the derived classifiers perform as well as human annotators.
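The abstract does not give implementation details. Below is a minimal sketch, assuming PyTorch and frame-wise acoustic feature vectors (feature dimensionality, hidden size, and layer layout are hypothetical), of how an LSTM-RNN could regress a continuous valence/activation trajectory from a speech feature sequence, which is the kind of sequence-level modelling of long-range dependencies the paper contrasts with Support-Vector Regression.

```python
import torch
import torch.nn as nn

class EmotionLSTM(nn.Module):
    """Sketch of a frame-wise regressor for continuous valence/activation.
    Layer sizes and feature dimensionality are illustrative assumptions,
    not the architecture used in the paper."""
    def __init__(self, num_features=39, hidden_size=128, num_dims=2):
        super().__init__()
        # The LSTM carries context across the whole observation sequence,
        # so each prediction can depend on the emotional history so far.
        self.lstm = nn.LSTM(num_features, hidden_size, batch_first=True)
        # Linear read-out maps each hidden state to the two emotional dimensions.
        self.out = nn.Linear(hidden_size, num_dims)

    def forward(self, x):
        # x: (batch, time, num_features) sequence of acoustic feature frames
        h, _ = self.lstm(x)
        return self.out(h)  # (batch, time, num_dims) continuous predictions

# Usage: regress valence/activation for a 100-frame utterance
model = EmotionLSTM()
features = torch.randn(1, 100, 39)   # placeholder acoustic features
predictions = model(features)         # one (valence, activation) pair per frame
```

A static regressor such as Support-Vector Regression would map each frame (or a fixed-length window) to an output independently; the recurrent state here is what lets the model exploit the preceding emotional context.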
Original language | English |
---|---|
Pages (from-to) | 597-600 |
Number of pages | 4 |
Journal | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH |
State | Published - 2008 |
Event | INTERSPEECH 2008 - 9th Annual Conference of the International Speech Communication Association, Brisbane, QLD, Australia; duration: 22 Sep 2008 → 26 Sep 2008 |
Keywords
- Emotion recognition
- LSTM
- Sensitive artificial listener