Bag-of-Deep-Features: Noise-Robust Deep Feature Representations for Audio Analysis

Shahin Amiriparian, Maurice Gerczuk, Sandra Ottl, Nicholas Cummins, Sergey Pugachevskiy, Bjorn Schuller

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

25 Scopus citations

Abstract

In the era of deep learning, research into the classification of various components of the acoustic environment, especially in-the-wild recordings, is gaining in popularity. This is due in part to increasing computational capacities and the expanding amount of real-world data available on social multimedia. However, the noisy nature of this data can add further complexity to already complex deep learning systems. Herein, we tackle this issue by quantising deep feature representations of various in-the-wild audio data sets. The aim of this paper is twofold: 1) to assess the feasibility of the proposed feature quantisation task, and 2) to compare the efficacy of various feature spaces extracted from different fully connected deep neural networks to classify six real-world audio corpora. For the classification, we extract two feature sets: i) DEEP SPECTRUM features, which are derived by forwarding visual representations of the audio instances, in particular mel-spectrograms, through very deep task-independent pre-trained Convolutional Neural Networks (CNNs), and ii) Bag-of-Deep-Features (BODF), which is the quantisation of the DEEP SPECTRUM features. Using BODF, we show the suitability of quantising the deep representations for noisy in-the-wild audio data. Finally, we analyse the effect of early and late fusion of the CNN features and models on the classification results.
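
The quantisation idea behind BODF can be illustrated with a generic bag-of-words sketch. This is not the authors' implementation: the abstract does not state how the codebook is built or how vectors are assigned, so the code below assumes a codebook drawn by random sampling from training vectors, hard assignment of each vector to its nearest codeword, and optional log term-frequency weighting. It also assumes that each recording yields several deep feature vectors (for example, one per spectrogram chunk). All function names are hypothetical.

```python
import math
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_codebook(train_vectors, size, seed=0):
    """Assumption: pick `size` training vectors at random as codewords."""
    rng = random.Random(seed)
    return rng.sample(train_vectors, size)


def bag_of_deep_features(vectors, codebook, log_tf=False):
    """Quantise a recording's feature vectors into a codeword histogram.

    Each vector is assigned to its nearest codeword (hard assignment);
    the histogram counts how often each codeword was chosen.
    """
    counts = [0] * len(codebook)
    for v in vectors:
        nearest = min(
            range(len(codebook)),
            key=lambda i: squared_distance(v, codebook[i]),
        )
        counts[nearest] += 1
    if log_tf:
        # Optional log term-frequency weighting to compress large counts.
        return [math.log(1 + c) for c in counts]
    return counts


# Toy example: a fixed two-word codebook and three 2-D "deep feature" vectors.
codebook = [[0.0, 0.0], [10.0, 10.0]]
recording = [[0.1, 0.2], [9.0, 9.5], [10.2, 9.9]]
histogram = bag_of_deep_features(recording, codebook)
print(histogram)  # [1, 2]
```

The resulting fixed-length histogram replaces the raw feature vectors as classifier input. Small perturbations of a feature vector often leave its nearest codeword unchanged, which is one intuition for why quantisation may help with noisy data.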

Original language: English
Title of host publication: 2018 International Joint Conference on Neural Networks, IJCNN 2018 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781509060146
State: Published - 10 Oct 2018
Externally published: Yes
Event: 2018 International Joint Conference on Neural Networks, IJCNN 2018 - Rio de Janeiro, Brazil
Duration: 8 Jul 2018 - 13 Jul 2018

Publication series

Name: Proceedings of the International Joint Conference on Neural Networks
Volume: 2018-July

Conference

Conference: 2018 International Joint Conference on Neural Networks, IJCNN 2018
Country/Territory: Brazil
City: Rio de Janeiro
Period: 8/07/18 - 13/07/18
