Audio-based eating analysis and tracking utilising deep spectrum features

Shahin Amiriparian, Sandra Ottl, Maurice Gerczuk, Sergey Pugachevskiy, Bjorn Schuller

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

This paper proposes a deep learning system for audio-based eating analysis on the ICMI 2018 Eating Analysis and Tracking (EAT) challenge corpus. We utilise Deep Spectrum features, which are image classification convolutional neural network (CNN) descriptors. We extract the Deep Spectrum features by forwarding Mel-spectrograms of the input audio through deep task-independent pre-trained CNNs, including AlexNet and VGG16. We then use the activations of the first (fc6), second (fc7), and third (fc8) fully connected layers of these networks as feature vectors. We obtain the best classification result by using the first fully connected layer (fc6) of AlexNet to extract features from Mel-spectrograms with a window size of 160 ms, a hop size of 80 ms, and a viridis colour map. Finally, we build Bag-of-Deep-Features (BoDF), a quantisation of the Deep Spectrum features. In comparison to the best baseline results on the test partitions of the Food Type and the Likability sub-challenges, unweighted average recall is increased from 67.2 percent to 79.9 percent and from 54.2 percent to 56.1 percent, respectively. For the test partition of the Difficulty sub-challenge, the concordance correlation coefficient is increased from .506 to .509.
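
The results above are reported as unweighted average recall (UAR) for the Food Type and Likability sub-challenges and as the concordance correlation coefficient (CCC) for the Difficulty sub-challenge. As a reminder of what these two measures compute, the following is a minimal pure-Python sketch based on their standard definitions. The function names and toy inputs are illustrative assumptions and do not come from the paper, and the challenge's official scoring may differ in details (for example, this sketch uses population rather than sample statistics for the CCC).

```python
from collections import defaultdict
from statistics import fmean


def unweighted_average_recall(y_true, y_pred):
    """Mean of the per-class recalls, so each class counts equally
    regardless of how many instances it has."""
    total = defaultdict(int)    # instances per true class
    correct = defaultdict(int)  # correctly predicted instances per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return fmean(correct[c] / total[c] for c in total)


def concordance_correlation_coefficient(x, y):
    """Lin's CCC: 2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2),
    computed here with population (divide-by-n) statistics."""
    mx, my = fmean(x), fmean(y)
    var_x = fmean((a - mx) ** 2 for a in x)
    var_y = fmean((b - my) ** 2 for b in y)
    cov = fmean((a - mx) * (b - my) for a, b in zip(x, y))
    return 2 * cov / (var_x + var_y + (mx - my) ** 2)


if __name__ == "__main__":
    # Hypothetical toy labels and scores, for illustration only.
    labels = ["apple", "apple", "apple", "crisps"]
    predicted = ["apple", "apple", "crisps", "crisps"]
    print(unweighted_average_recall(labels, predicted))

    gold = [1.0, 2.0, 3.0]
    shifted = [2.0, 3.0, 4.0]
    print(concordance_correlation_coefficient(gold, shifted))
```

Unlike plain accuracy, UAR is not dominated by the majority class, and unlike Pearson correlation, the CCC penalises a constant offset between predictions and gold values, as the shifted toy example shows.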

Original language: English
Title of host publication: 2019 7th E-Health and Bioengineering Conference, EHB 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728126036
State: Published - Nov 2019
Externally published: Yes
Event: 7th IEEE International Conference on E-Health and Bioengineering, EHB 2019 - Iasi, Romania
Duration: 21 Nov 2019 to 23 Nov 2019

Publication series

Name: 2019 7th E-Health and Bioengineering Conference, EHB 2019

Conference

Conference: 7th IEEE International Conference on E-Health and Bioengineering, EHB 2019
Country/Territory: Romania
City: Iasi
Period: 21/11/19 to 23/11/19

Keywords

  • Audio processing
  • Deep Spectrum features
  • Eating analysis
  • Pre-trained convolutional neural networks
