Leveraging Model Inherent Variable Importance for Stable Online Feature Selection

Johannes Haug, Martin Pawelczyk, Klaus Broelemann, Gjergji Kasneci

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

11 Scopus citations

Abstract

Feature selection can be a crucial factor in obtaining robust and accurate predictions. Online feature selection models, however, operate under considerable restrictions; they need to efficiently extract salient input features based on a bounded set of observations, while enabling robust and accurate predictions. In this work, we introduce FIRES, a novel framework for online feature selection. The proposed feature weighting mechanism leverages the importance information inherent in the parameters of a predictive model. By treating model parameters as random variables, we can penalize features with high uncertainty and thus generate more stable feature sets. Our framework is generic in that it leaves the choice of the underlying model to the user. Strikingly, experiments suggest that the model complexity has only a minor effect on the discriminative power and stability of the selected feature sets. In fact, using a simple linear model, FIRES obtains feature sets that compete with state-of-the-art methods, while dramatically reducing computation time. In addition, experiments show that the proposed framework is clearly superior in terms of feature selection stability.
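
The weighting idea in the abstract, treating each model parameter as a random variable and penalizing features whose parameter is uncertain, can be illustrated with a minimal sketch. The scoring rule below (squared mean minus a penalized variance) and the names `uncertainty_penalized_weights`, `select_top_k`, and `penalty` are assumptions made for illustration only; the abstract does not state the exact update or weighting rule used by FIRES.

```python
# Illustrative sketch only: the scoring rule is an assumption, not the
# published FIRES objective. Each feature j has a parameter modeled as a
# random variable with mean mu[j] and standard deviation sigma[j].

def uncertainty_penalized_weights(mu, sigma, penalty=1.0):
    """Score features by squared mean parameter minus penalized variance.

    A large |mu| suggests an important feature; a large sigma suggests
    the importance estimate is unreliable, so the score is reduced.
    """
    return [m * m - penalty * s * s for m, s in zip(mu, sigma)]


def select_top_k(weights, k):
    """Return the indices of the k highest-scoring features, best first."""
    order = sorted(range(len(weights)), key=lambda j: weights[j], reverse=True)
    return order[:k]


# Hypothetical parameter estimates for four features.
mu = [0.9, 1.0, 0.1, -0.8]
sigma = [0.1, 1.2, 0.1, 0.2]

weights = uncertainty_penalized_weights(mu, sigma)
selected = select_top_k(weights, k=2)
print(selected)
```

In this toy example the second feature has the largest mean parameter but also the largest uncertainty, so it falls out of the selected set. That is the kind of behavior the abstract credits with producing more stable feature sets.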

Original language: English
Title of host publication: KDD 2020 - Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Publisher: Association for Computing Machinery
Pages: 1478-1488
Number of pages: 11
ISBN (Electronic): 9781450379984
State: Published - 23 Aug 2020
Externally published: Yes
Event: 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2020 - Virtual, Online, United States
Duration: 23 Aug 2020 to 27 Aug 2020

Publication series

Name: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Conference

Conference: 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2020
Country/Territory: United States
City: Virtual, Online
Period: 23/08/20 to 27/08/20

Keywords

  • data streams
  • feature selection
  • stability
  • uncertainty
