Is feature selection secure against training data poisoning?

Huang Xiao, Battista Biggio, Gavin Brown, Giorgio Fumera, Claudia Eckert, Fabio Roli

Publication: Chapter in book/report/conference proceedings › Conference contribution › Peer-reviewed

297 citations (Scopus)

Abstract

Learning in adversarial settings is becoming an important task for application domains where attackers may inject malicious data into the training set to subvert normal operation of data-driven technologies. Feature selection has been widely used in machine learning for security applications to improve generalization and computational efficiency, although it is not clear whether its use may be beneficial or even counterproductive when training data are poisoned by intelligent attackers. In this work, we shed light on this issue by providing a framework to investigate the robustness of popular feature selection methods, including LASSO, ridge regression and the elastic net. Our results on malware detection show that feature selection methods can be significantly compromised under attack (we can reduce LASSO to almost random choices of feature sets by careful insertion of less than 5% poisoned training samples), highlighting the need for specific countermeasures.

Original language: English
Title: 32nd International Conference on Machine Learning, ICML 2015
Editors: David Blei, Francis Bach
Publisher: International Machine Learning Society (IMLS)
Pages: 1689-1698
Number of pages: 10
ISBN (electronic): 9781510810587
Publication status: Published - 2015
Event: 32nd International Conference on Machine Learning, ICML 2015 - Lille, France
Duration: 6 July 2015 to 11 July 2015

Publication series

Name: 32nd International Conference on Machine Learning, ICML 2015
Volume: 2

Conference

Conference: 32nd International Conference on Machine Learning, ICML 2015
Country/Territory: France
City: Lille
Period: 6/07/15 to 11/07/15
