Convolutional RNN: An enhanced model for extracting features from sequential data

Gil Keren, Björn Schuller

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

116 Scopus citations

Abstract

Traditional convolutional layers extract features from patches of data by applying a non-linearity to an affine function of the input. We propose a model that enhances this feature extraction process for the case of sequential data by feeding patches of the data into a recurrent neural network and using the outputs or hidden states of the recurrent units to compute the extracted features. By doing so, we exploit the fact that a window containing a few frames of the sequential data is itself a sequence, and this additional structure may encapsulate valuable information. In addition, we allow for more steps of computation in the feature extraction process, which is potentially beneficial, as an affine function followed by a non-linearity may yield features that are too simple. Using our convolutional recurrent layers, we obtain an improvement in performance on two audio classification tasks compared to traditional convolutional layers.
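The abstract describes replacing the affine-plus-nonlinearity feature extraction of a convolutional layer with a recurrent network applied to each window of frames. Below is a minimal sketch of that idea in PyTorch, assuming a GRU as the recurrent unit and a final-hidden-state readout; the class and parameter names are illustrative and do not reflect the authors' implementation.

```python
# Sketch of a convolutional-recurrent layer: sliding windows of the input
# sequence are each processed by a shared RNN, and the RNN's final hidden
# state is used as the extracted feature for that window.
# Hypothetical names; recurrent unit and readout are assumptions.
import torch
import torch.nn as nn


class ConvRNNLayer(nn.Module):
    def __init__(self, in_features, out_features, window_size, stride=1):
        super().__init__()
        self.window_size = window_size
        self.stride = stride
        # One recurrent cell shared across all windows (a GRU here;
        # the paper's choice of recurrent unit may differ).
        self.rnn = nn.GRU(in_features, out_features, batch_first=True)

    def forward(self, x):
        # x: (batch, time, in_features), e.g. a spectrogram clip.
        # Extract overlapping windows along the time axis and reorder to
        # (batch, num_windows, window_size, in_features).
        windows = x.unfold(1, self.window_size, self.stride).permute(0, 1, 3, 2)
        b, n, w, f = windows.shape
        # Run the shared RNN over the frames of every window.
        _, h_n = self.rnn(windows.reshape(b * n, w, f))
        # The final hidden state of each window is its feature vector:
        # (batch, num_windows, out_features).
        return h_n[-1].view(b, n, -1)


if __name__ == "__main__":
    layer = ConvRNNLayer(in_features=40, out_features=64, window_size=5)
    feats = layer(torch.randn(8, 100, 40))  # 8 clips, 100 frames, 40 bins
    print(feats.shape)                      # torch.Size([8, 96, 64])
```

For comparison, a standard 1-D convolutional layer would compute a single affine map of each flattened window followed by a non-linearity; the sketch above replaces that step with several steps of recurrent computation per window, which is the enhancement the abstract argues for.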

Original language: English
Title of host publication: 2016 International Joint Conference on Neural Networks, IJCNN 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 3412-3419
Number of pages: 8
ISBN (Electronic): 9781509006199
DOIs
State: Published - 31 Oct 2016
Externally published: Yes
Event: 2016 International Joint Conference on Neural Networks, IJCNN 2016 - Vancouver, Canada
Duration: 24 Jul 2016 - 29 Jul 2016

Publication series

Name: Proceedings of the International Joint Conference on Neural Networks
Volume: 2016-October

Conference

Conference: 2016 International Joint Conference on Neural Networks, IJCNN 2016
Country/Territory: Canada
City: Vancouver
Period: 24/07/16 - 29/07/16
