Abstract
Learning in adversarial settings is becoming an important task for application domains where attackers may inject malicious data into the training set to subvert normal operation of data-driven technologies. Feature selection has been widely used in machine learning for security applications to improve generalization and computational efficiency, although it is not clear whether its use may be beneficial or even counterproductive when training data are poisoned by intelligent attackers. In this work, we shed light on this issue by providing a framework to investigate the robustness of popular feature selection methods, including LASSO, ridge regression and the elastic net. Our results on malware detection show that feature selection methods can be significantly compromised under attack (we can reduce LASSO to almost random choices of feature sets by careful insertion of less than 5% poisoned training samples), highlighting the need for specific countermeasures.
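The effect described in the abstract can be illustrated with a toy sketch. This is not the paper's attack (which crafts poisoning points via a gradient-based optimization); it is a deliberately naive stand-in under assumed synthetic data: fit a minimal coordinate-descent LASSO on clean data, then refit after injecting five crafted points (under 5% of the training set) whose responses correlate strongly with an irrelevant feature, and compare the selected feature sets. All dimensions, values, and the `lasso_cd` helper are illustrative assumptions, not from the paper.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=300):
    """Minimal coordinate-descent LASSO: min_w 0.5*||y - Xw||^2 + lam*||w||_1."""
    n, d = X.shape
    w = np.zeros(d)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(d):
            # residual with feature j's own contribution removed
            r = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ r
            # soft-thresholding update for coordinate j
            w[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return w

rng = np.random.default_rng(0)
n, d, k = 100, 20, 5
X = rng.normal(size=(n, d))
w_true = np.zeros(d)
w_true[:k] = 3.0                          # only the first 5 features are relevant
y = X @ w_true + 0.1 * rng.normal(size=n)

lam = 10.0
clean_sel = set(np.flatnonzero(np.abs(lasso_cd(X, y, lam)) > 1e-6))

# Naive poisoning for illustration only: 5 injected points (under 5% of the
# training set) tie a large response to the irrelevant last feature.
n_poison = 5
X_p = np.zeros((n_poison, d))
X_p[:, d - 1] = 10.0
y_p = np.full(n_poison, 50.0)
X_pois = np.vstack([X, X_p])
y_pois = np.concatenate([y, y_p])
pois_sel = set(np.flatnonzero(np.abs(lasso_cd(X_pois, y_pois, lam)) > 1e-6))

print("clean selection:   ", sorted(clean_sel))
print("poisoned selection:", sorted(pois_sel))
```

On this synthetic setup, the clean fit recovers the relevant features, while the poisoned fit additionally selects the irrelevant feature the attacker targeted; the paper's optimized attack achieves far stronger distortion of the selected set at the same poisoning rate.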
Original language | English |
---|---|
Title of host publication | Proceedings of the 32nd International Conference on Machine Learning |
Publisher | Association for Computing Machinery |
Pages | 1689-1698 |
Publication status | Published - 11 Jul 2015 |
Event | International Conference on Machine Learning 2015, Lille, France (6 Jul 2015 → 11 Jul 2015) |
Conference
Conference | International Conference on Machine Learning 2015 |
---|---|
City | Lille, France |
Period | 6/07/15 → 11/07/15 |