Feature selection for classification

Citations: 2.6K

References: 30

Year: 1997

TLDR

Feature selection has long been a key focus in machine learning, and the growth of large databases has heightened the need for new approaches. This survey reviews feature‑selection methods from the 1970s to the time of publication, identifies the four steps of a typical method, and categorizes existing methods by generation procedure and evaluation function, highlighting combinations that had not yet been attempted. The authors explain representative methods from each category through detailed examples and compare them on benchmark datasets with differing characteristics. The survey describes the strengths and weaknesses of the methods, offers guidelines based on data type and domain characteristics, and outlines future research directions to help practitioners choose appropriate methods for real‑world problems.

Abstract

Feature selection has been a focus of interest for quite some time, and much work has been done. With the creation of huge databases and the consequent requirements for good machine learning techniques, new problems arise and novel approaches to feature selection are in demand. This survey is a comprehensive overview of many existing methods from the 1970s to the present. It identifies four steps of a typical feature selection method, categorizes the existing methods in terms of generation procedures and evaluation functions, and reveals hitherto unattempted combinations of the two. Representative methods are chosen from each category for detailed explanation and discussion via example. Benchmark datasets with different characteristics are used for a comparative study. The strengths and weaknesses of the different methods are explained. Guidelines for applying feature selection methods are given based on data types and domain characteristics. This survey identifies future research areas in feature selection, introduces newcomers to the field, and paves the way for practitioners who search for suitable methods for solving domain-specific real-world applications.
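To make the pairing of a generation procedure with an evaluation function concrete, the following is a minimal Python sketch, not the survey's own code. The abstract does not name the four steps; the sketch assumes the common reading of them as subset generation, subset evaluation, a stopping criterion, and a final validation step (left out here). The function names `inconsistency_rate` and `forward_select`, the greedy forward search, and the toy data are all illustrative assumptions.

```python
from collections import Counter, defaultdict


def inconsistency_rate(rows, labels, subset):
    """Evaluation function (assumed for illustration): the fraction of rows
    whose label differs from the majority label among rows that share the
    same values on the features in `subset`. Lower is better."""
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        key = tuple(row[i] for i in subset)
        groups[key][label] += 1
    inconsistent = sum(sum(c.values()) - max(c.values()) for c in groups.values())
    return inconsistent / len(rows)


def forward_select(rows, labels, evaluate):
    """Generation procedure (assumed for illustration): greedy forward search.
    Start from the empty subset and add the single feature that lowers the
    evaluation score the most; stop when no candidate improves it."""
    n_features = len(rows[0])
    selected = []
    best_score = evaluate(rows, labels, selected)
    while len(selected) < n_features:
        candidates = [f for f in range(n_features) if f not in selected]
        # Generate one candidate subset per remaining feature and score it.
        score, feature = min(
            (evaluate(rows, labels, selected + [f]), f) for f in candidates
        )
        if score >= best_score:  # stopping criterion: no improvement
            break
        selected.append(feature)
        best_score = score
    return selected, best_score


# Toy data (hypothetical): the label is a copy of feature 1.
rows = [[0, 0, 1], [0, 1, 0], [1, 0, 0], [1, 1, 1], [0, 0, 0], [1, 1, 0]]
labels = [row[1] for row in rows]
subset, score = forward_select(rows, labels, inconsistency_rate)
print(subset, score)
```

Swapping `inconsistency_rate` for another scoring function, or the greedy loop for an exhaustive or randomized search, yields a different combination of the two components, which is the kind of pairing the survey's categorization is organized around.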
