TY - GEN
T1 - An Evolutionary-based Generative Approach for Audio Data Augmentation
AU - Mertes, Silvan
AU - Baird, Alice
AU - Schiller, Dominik
AU - Schuller, Björn W.
AU - André, Elisabeth
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/9/21
Y1 - 2020/9/21
AB - In this paper, we introduce a novel framework to augment raw audio data for machine learning classification tasks. For the first part of our framework, we employ a generative adversarial network (GAN) to create new variants of the audio samples that are already existing in our source dataset for the classification task. In the second step, we then utilize an evolutionary algorithm to search the input domain space of the previously trained GAN, with respect to predefined characteristics of the generated audio. This way we are able to generate audio in a controlled manner that contributes to an improvement in classification performance of the original task. To validate our approach, we chose to test it on the task of soundscape classification. We show that our approach leads to a substantial improvement in classification results when compared to a training routine without data augmentation and training with uncontrolled data augmentation with GANs.
KW - data augmentation
KW - evolutionary computing
KW - generative adversarial networks
KW - latent vector evolution
KW - sound generation
UR - http://www.scopus.com/inward/record.url?scp=85099185149&partnerID=8YFLogxK
U2 - 10.1109/MMSP48831.2020.9287156
DO - 10.1109/MMSP48831.2020.9287156
M3 - Conference contribution
AN - SCOPUS:85099185149
T3 - IEEE 22nd International Workshop on Multimedia Signal Processing, MMSP 2020
BT - IEEE 22nd International Workshop on Multimedia Signal Processing, MMSP 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 22nd IEEE International Workshop on Multimedia Signal Processing, MMSP 2020
Y2 - 21 September 2020 through 24 September 2020
ER -