TY - GEN
T1 - Improving recognition of speaker states and traits by cumulative evidence
T2 - 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
AU - Weninger, Felix
AU - Marchi, Erik
AU - Schuller, Björn
PY - 2012
Y1 - 2012
N2 - We address the fully automatic recognition of intoxication, sleepiness, age and gender from speech in medium-term observation intervals of up to several minutes. The nature of these speaker states and traits as being medium-term or long-term, as opposed to short-term states such as emotion, makes it possible to collect cumulative evidence in the form of utterance-level decisions; we show that by fusing these decisions along the time axis, more and more accurate decisions can be obtained. In extensive test runs on three official INTERSPEECH Challenge corpora, we show that the average recall can be improved by up to 5%, 6%, 10% and 11% absolute by longer-term observation of speaker sleepiness, gender, intoxication, and age, respectively, compared to the accuracy of a decision from a single utterance.
AB - We address the fully automatic recognition of intoxication, sleepiness, age and gender from speech in medium-term observation intervals of up to several minutes. The nature of these speaker states and traits as being medium-term or long-term, as opposed to short-term states such as emotion, makes it possible to collect cumulative evidence in the form of utterance-level decisions; we show that by fusing these decisions along the time axis, more and more accurate decisions can be obtained. In extensive test runs on three official INTERSPEECH Challenge corpora, we show that the average recall can be improved by up to 5%, 6%, 10% and 11% absolute by longer-term observation of speaker sleepiness, gender, intoxication, and age, respectively, compared to the accuracy of a decision from a single utterance.
UR - http://www.scopus.com/inward/record.url?scp=84878394851&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84878394851
SN - 9781622767595
T3 - 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
SP - 1158
EP - 1161
BT - 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Y2 - 9 September 2012 through 13 September 2012
ER -