Measuring the Stability of Feature Selection

Sarah Nogueira, Gavin Brown

    Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review


    Abstract

    In feature selection algorithms, “stability” is the sensitivity of the chosen feature set to variations in the supplied training data. As such, it can be seen as analogous to the statistical variance of a predictor. However, unlike variance, stability has no unique definition, and numerous measures have been proposed over 15 years of literature. In this paper, instead of defining a new measure, we start from an axiomatic point of view and identify what properties would be desirable. Somewhat surprisingly, we find that the simple Pearson’s correlation coefficient has all necessary properties, yet has somehow been overlooked in favour of more complex alternatives. Finally, we illustrate how the use of this measure in practice can provide better interpretability and more confidence in the model selection process. The data and software related to this paper are available at https://github.com/nogueirs/ECML2016.
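
    The abstract does not spell out how the correlation is applied, so the following is only a minimal sketch under stated assumptions: each run of a feature selector is encoded as a binary indicator vector (1 = feature selected), and stability is estimated as the mean Pearson correlation over all pairs of runs. The function names `pearson` and `stability` and the example data are hypothetical, and the paper and its repository remain the reference for the exact definition.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A run that selects all features or none has zero variance.
        raise ValueError("Pearson correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


def stability(selections):
    """Assumed estimator: mean pairwise Pearson correlation across runs."""
    pairs = list(combinations(selections, 2))
    if not pairs:
        raise ValueError("need at least two selection vectors")
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Hypothetical example: three runs over five candidate features.
runs = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
]
score = stability(runs)
```

    In this sketch, identical selections across all runs give a score of 1, and runs that disagree pull the score down. Constant vectors are rejected rather than assigned a value, which is a simplification made here and not a choice taken from the paper.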
    Original language: English
    Title of host publication: European Conference, ECML PKDD 2016, Riva del Garda, Italy, September 19-23, 2016, Proceedings, Part I
    Editors: Paolo Frasconi, Niels Landwehr, Giuseppe Manco, Jilles Vreeken
    Publisher: Springer Nature
    Pages: 442-457
    ISBN (Electronic): 978-3-319-46128-1
    ISBN (Print): 978-3-319-46127-4
    Publication status: Published - 19 Sept 2016
    Event: European Conference, ECML PKDD - Riva del Garda, Italy
    Duration: 19 Sept 2016 to 23 Sept 2016

    Publication series

    Name: Lecture Notes in Artificial Intelligence
    Publisher: Springer
    Volume: 9851
    Name: Lecture Notes in Computer Science
    Publisher: Springer
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Conference

    Conference: European Conference, ECML PKDD
    Country/Territory: Italy
    City: Riva del Garda
    Period: 19/09/16 to 23/09/16
