Compact Bilinear Deep Features For Environmental Sound Recognition

Fatih Demir, Abdulkadir Sengur, Hao Lu, Shahin Amiriparian, Nicholas Cummins, Björn Schuller

Publication: Contribution to book/report/conference proceedings › Conference contribution › Peer-reviewed

6 citations (Scopus)

Abstract

Environmental sound recognition (ESR) has a wide range of civilian and military applications. Existing ESR methods generally tackle this problem by employing various signal processing and machine learning methods. Herein, we propose an ESR paradigm based on feature extraction from pre-trained deep convolutional neural networks (CNNs), the derivation of higher-order statistics by compact bilinear pooling, and normalisation. In particular, we consider two deep ImageNet architectures for deep feature extraction, and the Random Maclaurin (RM) method to produce the compact bilinear features. A support vector machine (SVM) with homogeneous mapping is used in the classification stage. Two publicly available environmental sound datasets, namely ESC-50 and ESC-10, are used to verify the efficacy of the approach. We compare the proposed method with various previous state-of-the-art methods. The presented results indicate the suitability of the higher-order statistics of Deep Spectrum representations for ESR classification tasks.
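
The pooling step named in the abstract can be illustrated with a minimal sketch. It assumes the usual Random Maclaurin construction, in which two fixed random sign matrices project each local descriptor and the projections are multiplied element-wise, so that the inner product of two mapped vectors approximates the squared inner product of the originals in expectation. The function names, the 512-dimensional output, the random stand-in for a CNN feature map, and the signed square root plus L2 normalisation are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np


def make_rm_projections(in_dim, out_dim, seed=0):
    """Draw the two fixed random sign (+1/-1) matrices used by Random Maclaurin."""
    rng = np.random.default_rng(seed)
    w1 = rng.choice([-1.0, 1.0], size=(out_dim, in_dim))
    w2 = rng.choice([-1.0, 1.0], size=(out_dim, in_dim))
    return w1, w2


def compact_bilinear_rm(features, w1, w2):
    """Pool local descriptors of shape (locations, channels) into one vector."""
    out_dim = w1.shape[0]
    # Per-location RM map: element-wise product of two random projections.
    phi = (features @ w1.T) * (features @ w2.T) / np.sqrt(out_dim)
    # Sum pooling over all spatial locations.
    pooled = phi.sum(axis=0)
    # Assumed normalisation: signed square root followed by L2 scaling.
    pooled = np.sign(pooled) * np.sqrt(np.abs(pooled))
    norm = np.linalg.norm(pooled)
    return pooled / norm if norm > 0 else pooled


# Hypothetical stand-in for a CNN feature map: 49 locations, 64 channels.
rng = np.random.default_rng(1)
feature_map = rng.standard_normal((49, 64))

w1, w2 = make_rm_projections(in_dim=64, out_dim=512)
descriptor = compact_bilinear_rm(feature_map, w1, w2)
```

In a pipeline of the kind described, `descriptor` would take the place of the full bilinear feature (here 64 × 64 = 4096 values) as the input to the classifier. The projection matrices are drawn once and reused for every recording so that all descriptors live in the same space.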

Original language: English
Title of host publication: 2018 International Conference on Artificial Intelligence and Data Processing, IDAP 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (electronic): 9781538668788
Publication status: Published - 21 Jan 2019
Externally published: Yes
Event: 2018 International Conference on Artificial Intelligence and Data Processing, IDAP 2018 - Malatya, Turkey
Duration: 28 Sept 2018 to 30 Sept 2018

Publication series

Name: 2018 International Conference on Artificial Intelligence and Data Processing, IDAP 2018

Conference

Conference: 2018 International Conference on Artificial Intelligence and Data Processing, IDAP 2018
Country/Territory: Turkey
City: Malatya
Period: 28/09/18 to 30/09/18
