Attention-based Atrous Convolutional Neural Networks: Visualisation and Understanding Perspectives of Acoustic Scenes

  • University Hospital Augsburg
  • University of Surrey

Publication: Contribution to book/report/conference proceedings, conference paper, peer-reviewed

71 citations (Scopus)

Abstract

The goal of Acoustic Scene Classification (ASC) is to recognise the environment in which an audio waveform has been recorded. Recently, deep neural networks have been applied to ASC and have achieved state-of-the-art performance. However, few works have investigated how to visualise and understand what a neural network has learnt from acoustic scenes. Previous work applied local pooling after each convolutional layer, thereby reducing the size of the feature maps. In this paper, we suggest that local pooling is not necessary, but that the size of the receptive field is important. We apply atrous Convolutional Neural Networks (CNNs) with global attention pooling as the classification model. The internal feature maps of the attention model can be visualised and explained. On the Detection and Classification of Acoustic Scenes and Events (DCASE) 2018 dataset, our proposed method achieves an accuracy of 72.7 %, significantly outperforming CNNs without dilation, which reach 60.4 %. Furthermore, our results demonstrate that the learnt feature maps contain rich information on acoustic scenes in the time-frequency domain.
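The abstract's two central ideas, growing the receptive field through dilation instead of local pooling and collapsing the feature map with global attention pooling, can be illustrated with a minimal sketch. The kernel sizes, dilation rates, and the softmax normalisation below are assumptions chosen for illustration only; they are not taken from the paper, and `receptive_field` and `attention_pool` are hypothetical helper names.

```python
import math


def receptive_field(kernel_sizes, dilations):
    """Receptive field along one axis of stacked stride-1 convolutions.

    Each layer with kernel size k and dilation d widens the field by (k - 1) * d.
    """
    if len(kernel_sizes) != len(dilations):
        raise ValueError("kernel_sizes and dilations must have the same length")
    field = 1
    for k, d in zip(kernel_sizes, dilations):
        field += (k - 1) * d
    return field


def attention_pool(class_scores, attention_logits):
    """Collapse per-position class scores into one value.

    Assumption: attention weights are a softmax over all positions.
    The paper may normalise its attention map differently.
    """
    peak = max(attention_logits)  # subtract the maximum for numerical stability
    exps = [math.exp(a - peak) for a in attention_logits]
    total = sum(exps)
    return sum(score * e / total for score, e in zip(class_scores, exps))


# Hypothetical four-layer stacks of 3-wide kernels, no pooling, stride 1.
plain = receptive_field([3, 3, 3, 3], [1, 1, 1, 1])
atrous = receptive_field([3, 3, 3, 3], [1, 2, 4, 8])
print(plain, atrous)

# With uniform attention the pooled score is the plain average of the scores.
print(attention_pool([0.2, 0.8], [0.0, 0.0]))
```

Under these assumed settings the dilated stack covers a much wider context than the undilated one while the feature map keeps its full resolution, which is what allows the attention weights to be viewed over the whole time-frequency plane.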

Original language: English
Title: 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 56-60
Number of pages: 5
ISBN (electronic): 9781479981311
DOIs
Publication status: Published - May 2019
Externally published: Yes
Event: 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Brighton, United Kingdom
Duration: 12 May 2019 to 17 May 2019

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2019-May
ISSN (Print): 1520-6149

Conference

Conference: 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019
Country/Territory: United Kingdom
City: Brighton
Period: 12/05/19 to 17/05/19
