Publication | Closed Access
Toward optimal feature selection
Year: 1996
Citations: 1.4K
References: 3
Venue: Unknown
Irrelevant and redundant features can degrade model performance, making feature selection essential. The study investigates an information‑theoretic approach to feature subset selection, aiming to discard features that add little or no new information. The authors first formalize an optimal but intractable selection criterion, then propose an efficient algorithm that approximates this criterion and analyze the conditions for its success. Experiments on multiple datasets demonstrate that the algorithm scales to very high‑dimensional data while effectively selecting informative features.
In this paper, we examine a method for feature subset selection based on information theory. We first present a framework for defining the theoretically optimal, but computationally intractable, method for feature subset selection. We show that our goal should be to eliminate a feature if it gives us little or no additional information beyond that subsumed by the remaining features. In particular, this will be the case for both irrelevant and redundant features. We then give an efficient algorithm for feature selection that computes an approximation to the optimal feature selection criterion, and we examine the conditions under which the approximate algorithm is successful. Empirical results on a number of datasets show that the algorithm effectively handles datasets with a very large number of features.
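The elimination criterion described above can be illustrated with a small sketch. It is not the paper's algorithm, which the abstract does not spell out. It is a brute-force illustration that assumes discrete features and estimates, from raw counts, the conditional mutual information I(C; F_i | remaining features) between the class C and a feature F_i. The function names, the threshold, and the toy data are all hypothetical. Count-based estimates like this are only workable for tiny datasets, which is presumably why an efficient approximation is needed.

```python
from collections import Counter
from math import log2


def conditional_mutual_information(rows, labels, i):
    """Empirical I(C; F_i | all other features) in bits, from raw counts."""
    n = len(rows)
    # "rest" is the joint value of every feature except feature i.
    rest = [tuple(v for j, v in enumerate(r) if j != i) for r in rows]
    n_rest = Counter(rest)
    n_rest_f = Counter((g, r[i]) for g, r in zip(rest, rows))
    n_rest_c = Counter(zip(rest, labels))
    n_all = Counter((g, r[i], c) for g, r, c in zip(rest, rows, labels))
    total = 0.0
    for (g, f, c), k in n_all.items():
        # p(g,f,c) * log2[ p(g,f,c) p(g) / (p(g,f) p(g,c)) ]
        ratio = k * n_rest[g] / (n_rest_f[(g, f)] * n_rest_c[(g, c)])
        total += (k / n) * log2(ratio)
    return total


def backward_eliminate(rows, labels, threshold=1e-9):
    """Greedily drop the feature that adds the least information about
    the class given the features still kept (hypothetical threshold)."""
    kept = list(range(len(rows[0])))
    while kept:
        proj = [tuple(r[j] for j in kept) for r in rows]
        scores = [
            conditional_mutual_information(proj, labels, p)
            for p in range(len(kept))
        ]
        worst = min(range(len(kept)), key=scores.__getitem__)
        if scores[worst] > threshold:
            break  # every remaining feature still carries information
        kept.pop(worst)
    return kept


# Toy data (assumed for illustration): feature 1 duplicates feature 0
# (redundant), feature 2 is unrelated to the class (irrelevant), and
# the class label equals feature 0.
rows = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]
labels = [0, 0, 1, 1]

kept = backward_eliminate(rows, labels)
```

On this toy data, each of the two duplicated features scores zero while the other is present, so one of them is dropped; the survivor then carries a full bit of information about the class and is kept, while the irrelevant feature is removed.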