TY - GEN
T1 - Enhancing transferability of black-box adversarial attacks via lifelong learning for speech emotion recognition models
AU - Ren, Zhao
AU - Han, Jing
AU - Cummins, Nicholas
AU - Schuller, Björn W.
N1 - Publisher Copyright:
Copyright © 2020 ISCA
PY - 2020
Y1 - 2020
AB - Well-designed adversarial examples can easily fool deep speech emotion recognition models into misclassifications. The transferability of adversarial attacks is a crucial evaluation indicator when generating adversarial examples to fool a new target model or multiple models. Herein, we propose a method to improve the transferability of black-box adversarial attacks using lifelong learning. First, black-box adversarial examples are generated by an atrous Convolutional Neural Network (CNN) model. This initial model is trained to attack a CNN target model. Then, we adapt the trained atrous CNN attacker to a new CNN target model using lifelong learning. We use this paradigm, as it enables multi-task sequential learning, which saves more memory space than conventional multi-task learning. We verify this property on an emotional speech database by demonstrating that the updated atrous CNN model can attack all target models that have been learnt, and can attack a new target model more effectively than an attack model trained on one target model only.
KW - Black-box Adversarial Attacks
KW - Lifelong Learning
KW - Speech Emotion Recognition
KW - Transferability
UR - http://www.scopus.com/inward/record.url?scp=85098108890&partnerID=8YFLogxK
U2 - 10.21437/Interspeech.2020-1869
DO - 10.21437/Interspeech.2020-1869
M3 - Conference contribution
AN - SCOPUS:85098108890
SN - 9781713820697
T3 - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
SP - 496
EP - 500
BT - Interspeech 2020
PB - International Speech Communication Association
T2 - 21st Annual Conference of the International Speech Communication Association, INTERSPEECH 2020
Y2 - 25 October 2020 through 29 October 2020
ER -