TY - GEN
T1 - Hierarchical neural networks and enhanced class posteriors for social signal classification
AU - Brueckner, Raymond
AU - Schuller, Björn
PY - 2013
Y1 - 2013
AB - With the impressive advances of deep learning in recent years, interest in neural networks has resurged in the fields of automatic speech recognition and emotion recognition. In this paper, we apply neural networks to speaker-independent detection and classification of laughter and filler vocalizations in speech. We first explore modeling class posteriors with standard neural networks and deep stacked autoencoders. We then adopt a hierarchical neural architecture to compute enhanced class posteriors and demonstrate that this approach yields significant and consistent improvements on the Social Signals Sub-Challenge of the Interspeech 2013 Computational Paralinguistics Challenge (ComParE). On this task, we achieve an unweighted average area under the curve of 92.4% on the test set; this is the official competition measure. This constitutes an improvement of 9.1% over the baseline and is the best result obtained so far on this task.
KW - computational paralinguistics challenge
KW - deep autoencoder networks
KW - enhanced posteriors
KW - hierarchical neural networks
UR - http://www.scopus.com/inward/record.url?scp=84893704598&partnerID=8YFLogxK
U2 - 10.1109/ASRU.2013.6707757
DO - 10.1109/ASRU.2013.6707757
M3 - Conference contribution
AN - SCOPUS:84893704598
SN - 9781479927562
T3 - 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2013 - Proceedings
SP - 362
EP - 367
BT - 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2013 - Proceedings
T2 - 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2013
Y2 - 8 December 2013 through 13 December 2013
ER -