TY - GEN
T1 - End-to-end audio classification with small datasets - Making it work
AU - Schmitt, Maximilian
AU - Schuller, Björn
N1 - Publisher Copyright: © 2019 IEEE
PY - 2019/9
Y1 - 2019/9
N2 - Deep end-to-end learning is a promising approach for many types of audio classification tasks. However, in fields such as health care and medical diagnosis, training data can be scarce, which makes training a neural network from the raw waveform to the target a challenge. In this work, we focus on a public dataset of human snore sounds, categorised into four classes, where one particular class has only a few training samples. We emphasise the pitfalls that need to be taken into account when working with such data and propose an end-to-end model providing a performance similar to that of other deep and non-deep approaches. Furthermore, we show that a model using only convolutional layers outperforms a model employing also recurrent layers.
KW - Audio classification
KW - End-to-end learning
KW - Representation learning
KW - Scarce data
KW - Snore sounds
UR - http://www.scopus.com/inward/record.url?scp=85075597246&partnerID=8YFLogxK
U2 - 10.23919/EUSIPCO.2019.8902712
DO - 10.23919/EUSIPCO.2019.8902712
M3 - Conference contribution
AN - SCOPUS:85075597246
T3 - European Signal Processing Conference
BT - EUSIPCO 2019 - 27th European Signal Processing Conference
PB - European Signal Processing Conference, EUSIPCO
T2 - 27th European Signal Processing Conference, EUSIPCO 2019
Y2 - 2 September 2019 through 6 September 2019
ER -