Is Feature Selection Secure against Training Data Poisoning?

H Xiao, B Biggio, G Brown, G Fumera, C Eckert, F Roli

    Research output: Chapter in Book/Conference proceeding › Conference contribution › peer-review


    Abstract

    Learning in adversarial settings is becoming an important task for application domains where attackers may inject malicious data into the training set to subvert normal operation of data-driven technologies. Feature selection has been widely used in machine learning for security applications to improve generalization and computational efficiency, although it is not clear whether its use may be beneficial or even counterproductive when training data are poisoned by intelligent attackers. In this work, we shed light on this issue by providing a framework to investigate the robustness of popular feature selection methods, including LASSO, ridge regression and the elastic net. Our results on malware detection show that feature selection methods can be significantly compromised under attack (we can reduce LASSO to almost random choices of feature sets by careful insertion of less than 5% poisoned training samples), highlighting the need for specific countermeasures.
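
    To make the abstract's claim concrete, the sketch below illustrates the kind of experiment it describes: insert a small fraction (5%) of poisoned points into a training set and compare the feature set LASSO selects before and after. This is a minimal toy on synthetic data, not the paper's gradient-based bilevel attack or its malware dataset; the flipped-response poisoning heuristic and the Jaccard stability measure used here are illustrative assumptions.

    ```python
    # Minimal sketch: how a small fraction of poisoned training points can
    # shift the feature set selected by LASSO. The poisoning heuristic
    # (sign-flipped, amplified responses) is an illustrative stand-in for
    # the paper's optimized attack.
    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(0)
    X, y = make_regression(n_samples=400, n_features=50, n_informative=10,
                           noise=1.0, random_state=0)

    def selected_features(X, y, alpha=1.0):
        """Indices of features given nonzero weight by LASSO."""
        model = Lasso(alpha=alpha).fit(X, y)
        return set(np.flatnonzero(model.coef_))

    clean_set = selected_features(X, y)

    # Append 5% poisoned points: copies of training points whose responses
    # are sign-flipped and amplified, pulling the learned weights off course.
    n_poison = int(0.05 * len(y))
    idx = rng.choice(len(y), size=n_poison, replace=False)
    X_p = np.vstack([X, X[idx]])
    y_p = np.concatenate([y, -5.0 * y[idx]])

    poisoned_set = selected_features(X_p, y_p)

    # One simple stability measure: Jaccard overlap between the clean and
    # poisoned feature sets (1.0 = identical selection, 0.0 = disjoint).
    jaccard = len(clean_set & poisoned_set) / max(1, len(clean_set | poisoned_set))
    print(f"clean selection:    {sorted(clean_set)}")
    print(f"poisoned selection: {sorted(poisoned_set)}")
    print(f"Jaccard overlap:    {jaccard:.2f}")
    ```

    A carefully optimized attack of the kind studied in the paper degrades the selection far more than this heuristic; the point of the sketch is only the measurement setup: fit on clean data, fit on poisoned data, and quantify how much the selected feature set moved.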
    Original language: English
    Title of host publication: Proceedings of the 32nd International Conference on Machine Learning
    Publisher: Association for Computing Machinery
    Pages: 1689-1698
    Publication status: Published - 11 Jul 2015
    Event: International Conference on Machine Learning 2015 - Lille, France
    Duration: 6 Jul 2015 - 11 Jul 2015

    Conference

    Conference: International Conference on Machine Learning 2015
    City: Lille
    Country: France
    Period: 6/07/15 - 11/07/15
