TY - GEN
T1 - Incremental acoustic valence recognition
T2 - An inter-corpus perspective on features, matching, and performance in a gating paradigm
AU - Schuller, Björn
AU - Devillers, Laurence
PY - 2010
Y1 - 2010
N2 - It is not fully known how long it takes a human to reliably recognize emotion in speech from the beginning of a phrase. However, many technical applications demand very quick system responses, e.g. to prepare different feedback alternatives before the end of a speaker turn in a dialog system. We therefore investigate this 'gating paradigm' employing two spoken language resources in a cross-corpus and combined manner with a focus on valence: we determine how quickly a reliable estimate is obtainable and whether matching by models trained on the same length of speech prevails. In addition, we analyze how individual feature groups by type and derived functionals respond and find considerably different behavior. The language resources have been chosen to cover manually segmented and automatically segmented speech at the same time. As a result, one second of speech is sufficient on the datasets considered.
AB - It is not fully known how long it takes a human to reliably recognize emotion in speech from the beginning of a phrase. However, many technical applications demand very quick system responses, e.g. to prepare different feedback alternatives before the end of a speaker turn in a dialog system. We therefore investigate this 'gating paradigm' employing two spoken language resources in a cross-corpus and combined manner with a focus on valence: we determine how quickly a reliable estimate is obtainable and whether matching by models trained on the same length of speech prevails. In addition, we analyze how individual feature groups by type and derived functionals respond and find considerably different behavior. The language resources have been chosen to cover manually segmented and automatically segmented speech at the same time. As a result, one second of speech is sufficient on the datasets considered.
KW - Affective computing
KW - Automatic emotion recognition
KW - Gating paradigm
KW - Incremental speech processing
UR - https://www.scopus.com/pages/publications/79959856651
U2 - 10.21437/interspeech.2010-289
DO - 10.21437/interspeech.2010-289
M3 - Conference contribution
AN - SCOPUS:79959856651
T3 - Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010
SP - 801
EP - 804
BT - Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010
PB - International Speech Communication Association
ER -