TY - GEN
T1 - Convolutional RNN
T2 - 2016 International Joint Conference on Neural Networks, IJCNN 2016
AU - Keren, Gil
AU - Schuller, Björn
N1 - Publisher Copyright: © 2016 IEEE.
PY - 2016/10/31
Y1 - 2016/10/31
AB - Traditional convolutional layers extract features from patches of data by applying a non-linearity to an affine function of the input. We propose a model that enhances this feature extraction process for the case of sequential data, by feeding patches of the data into a recurrent neural network and using the outputs or hidden states of the recurrent units to compute the extracted features. By doing so, we exploit the fact that a window containing a few frames of the sequential data is itself a sequence, and this additional structure might encapsulate valuable information. In addition, we allow for more steps of computation in the feature extraction process, which is potentially beneficial, as an affine function followed by a non-linearity can result in features that are too simple. Using our convolutional recurrent layers, we obtain an improvement in performance in two audio classification tasks, compared to traditional convolutional layers.
UR - http://www.scopus.com/inward/record.url?scp=85007199224&partnerID=8YFLogxK
U2 - 10.1109/IJCNN.2016.7727636
DO - 10.1109/IJCNN.2016.7727636
M3 - Conference contribution
AN - SCOPUS:85007199224
T3 - Proceedings of the International Joint Conference on Neural Networks
SP - 3412
EP - 3419
BT - 2016 International Joint Conference on Neural Networks, IJCNN 2016
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 24 July 2016 through 29 July 2016
ER -