Publication | Closed Access
Parallel GA-based wrapper feature selection for spectroscopic data mining
Citations: 10
References: 16
Year: 2002
Unknown Venue
Engineering, Machine Learning, Machine Learning Tool, Feature Selection, Spectroscopic Data Mining, Spectrochemical Analysis, Data Science, Data Mining, Pattern Recognition, Statistical Methods, Machine Learning Model, Feature Engineering, Predictive Analytics, Knowledge Discovery, Computer Engineering, Computer Science, Dense Databases, Feature Construction, Spectroscopy, Spectral Searching
Mining predictive models in dense databases is both CPU-intensive and I/O-intensive. In this paper, we propose a taxonomy of existing techniques for achieving high performance, and a hybrid approach that exploits four of them: feature selection, GA-based reduction of the exploration space, parallelism, and concurrency. The approach is evaluated on a near-infrared (NIR) spectroscopy application: predicting the concentration of a given component in a given product from its absorbances of NIR radiation. Statistical methods such as partial least squares (PLS) are well suited to, and efficient for, this kind of data mining task. The experimental results show that preceding these methods with feature selection removes a significant number of irrelevant features and, at the same time, significantly improves the accuracy of the discovered predictive model. They also show that, for the task considered, the GA-based approach builds more accurate models than neural networks. Moreover, the parallel multithreaded implementation of the approach achieves linear speed-up.
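The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of a GA-based wrapper: each individual is a bit mask over features, and its fitness is the validation error of a regression model trained on the selected columns. Several things here are assumptions rather than the authors' method: the data are synthetic, a lightly ridge-regularized least-squares fit stands in for PLS, and all names (`ga_select`, `validation_error`, `fit_ridge`, `make_data`) and parameter values are hypothetical. Fitness evaluations are dispatched to a thread pool to illustrate the structure of a parallel evaluation step.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def fit_ridge(X, y, lam=1e-6):
    """Least squares via normal equations; stand-in learner (assumption, not PLS)."""
    p = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) + (lam if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    b = [sum(r[i] * t for r, t in zip(X, y)) for i in range(p)]
    return solve(A, b)

def validation_error(mask, train, valid):
    """Wrapper fitness: mean squared error on held-out data for the masked features."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return float("inf")  # an empty feature set cannot predict anything
    w = fit_ridge([[row[j] for j in cols] for row in train[0]], train[1])
    sq = [(sum(w[k] * row[j] for k, j in enumerate(cols)) - t) ** 2
          for row, t in zip(*valid)]
    return sum(sq) / len(sq)

def ga_select(train, valid, n_features, pop_size=20, generations=15,
              p_mut=0.05, workers=4, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    pop[0] = [1] * n_features  # include the full feature set as a baseline individual
    best_mask, best_err = None, float("inf")
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(generations):
            # Evaluate the whole population concurrently.
            errs = list(pool.map(lambda m: validation_error(m, train, valid), pop))
            ranked = sorted(zip(errs, pop), key=lambda e: e[0])
            if ranked[0][0] < best_err:
                best_err, best_mask = ranked[0][0], ranked[0][1][:]
            parents = [m for _, m in ranked[:pop_size // 2]]  # truncation selection
            children = [parents[0][:]]                        # elitism
            while len(children) < pop_size:
                a, b = rng.sample(parents, 2)
                cut = rng.randrange(1, n_features)            # one-point crossover
                child = a[:cut] + b[cut:]
                children.append([1 - g if rng.random() < p_mut else g for g in child])
            pop = children
    return best_mask, best_err

def make_data(n, p, rng):
    """Synthetic zero-mean data: only features 0 and 3 carry signal (assumption)."""
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [2.0 * r[0] - 1.0 * r[3] + rng.gauss(0, 0.1) for r in X]
    return X, y

rng = random.Random(42)
train, valid = make_data(40, 12, rng), make_data(40, 12, rng)
mask, err = ga_select(train, valid, n_features=12)
full_err = validation_error([1] * 12, train, valid)
print("selected mask:", mask)
print("validation MSE (selected vs. all features):", err, full_err)
```

Because the full feature set is seeded into the initial population, the selected mask never does worse than using all features on the validation split. Note that in CPython a thread pool will not by itself speed up this pure-Python, CPU-bound fitness function; reproducing anything like the linear speed-up reported above would likely require processes or a runtime with truly parallel threads.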