Concepedia

Abstract

In recent years, gene expression profiles have been used for cancer recognition. However, researchers are hampered by the large number of variables (genes) and the small number of observations (samples) in such data. In this paper, a novel feature selection method based on correlation-based feature selection (CFS) is proposed. First, the variable-to-variable and variable-to-observation measures are calculated. Then a heuristic search method is used to search the variable space for informative gene subsets, and the weight of each subset is computed from these measures. Through regression, a subset of distinguished genes is obtained. Finally, a stratified sampling strategy is presented to obtain the most informative genes, and classification performance is tested to evaluate the proposed method. Ten-fold cross-validation experiments were performed on three datasets: leukemia, colon cancer, and prostate tumor. The experimental results show that the proposed method can obtain a distinguished gene subset and that different classifiers achieve better classification performance with this subset.
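
To make the subset-weighting step concrete, the following is a minimal sketch of the standard CFS merit score combined with a greedy forward search. It is not the method proposed in the paper: the use of absolute Pearson correlation for both measures, the greedy search strategy, and the names `cfs_merit` and `greedy_forward_cfs` are all assumptions made for illustration. The regression and stratified sampling steps described above are not shown.

```python
import numpy as np


def cfs_merit(X, y, subset):
    """Standard CFS merit of the feature columns listed in `subset`.

    Assumption: absolute Pearson correlation is used for both the
    feature-to-class and the feature-to-feature measure.
    """
    k = len(subset)
    # Mean absolute correlation between each selected feature and the labels.
    r_cf = np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in subset])
    if k == 1:
        return r_cf
    # Mean absolute correlation over all ordered pairs of distinct features
    # (the diagonal of the correlation matrix contributes k ones).
    corr = np.abs(np.corrcoef(X[:, subset], rowvar=False))
    r_ff = (corr.sum() - k) / (k * (k - 1))
    return k * r_cf / np.sqrt(k + k * (k - 1) * r_ff)


def greedy_forward_cfs(X, y, max_features=10):
    """Hypothetical greedy forward search: add the feature that raises the
    merit most, and stop as soon as no candidate improves it."""
    selected, best = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining and len(selected) < max_features:
        score, j = max((cfs_merit(X, y, selected + [j]), j) for j in remaining)
        if score <= best:
            break
        selected.append(j)
        remaining.remove(j)
        best = score
    return selected, best
```

The merit rewards subsets whose features correlate strongly with the class label and weakly with each other, which is why a redundant or uninformative gene lowers the score instead of raising it. Constant columns would need to be filtered out beforehand, since their correlation is undefined.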
