TY - JOUR
T1 - Emotion recognition in the noise applying large acoustic feature sets
AU - Schuller, Björn
AU - Arsić, Dejan
AU - Wallhoff, Frank
AU - Rigoll, Gerhard
N1 - Publisher Copyright: © 2006 Proceedings of the International Conference on Speech Prosody.
PY - 2006
Y1 - 2006
AB - Speech emotion recognition is mostly studied under ideal acoustic conditions: acted and elicited samples in studio quality are used, alongside sparse work on spontaneous field data. However, the influence of noise is an important factor in speech processing and has so far received practically no specific analysis in this area. We therefore discuss affect estimation under noise conditions. On three well-known public databases (DES, EMO-DB, and SUSAS), we show the effects of post-recording noise addition at various dB levels and the performance under noise conditions during signal capture. To cope with this new challenge, we extend the generation of functionals by extracting a large 4k high-level feature set from more than 60 partially novel base contours. These comprise, among others, intonation, intensity, formants, HNR, MFCC, and VOC19. Fast Information-Gain-Ratio filter selection picks attributes according to the noise conditions. Results are presented using Support Vector Machines as the classifier.
UR - http://www.scopus.com/inward/record.url?scp=78149472083&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:78149472083
SN - 2333-2042
JO - Proceedings of the International Conference on Speech Prosody
JF - Proceedings of the International Conference on Speech Prosody
T2 - 3rd International Conference on Speech Prosody, SP 2006
Y2 - 2 May 2006 through 5 May 2006
ER -