Publication | Closed Access
Sequential Projection Pursuit Using Genetic Algorithms for Data Mining of Analytical Data
Citations: 62
References: 21
Year: 2000
Keywords: Engineering, Analytical Data, Feature Selection, Complexity Reduction, Optimization-based Data Mining, Image Analysis, Data Science, Data Mining, Pattern Recognition, Analytical Chemistry, Independent Component Analysis, Public Health, Principal Component Analysis, Statistics, Data Optimization, Predictive Analytics, Projection Pursuit, Knowledge Discovery, Sequential Projection Pursuit, Inverse Problems, Dimensionality Reduction, Nonlinear Dimensionality Reduction, Functional Data Analysis, Evolutionary Data Mining, Statistical Inference
Sequential projection pursuit (SPP) is proposed to detect inhomogeneities (clusters) in high-dimensional analytical data. Such inhomogeneities indicate that there are groups of objects (samples) with different chemical characteristics. The method is compared with principal component analysis (PCA). PCA is generally applied to visually explore structure in high-dimensional data, but is not specifically designed to find clustering tendency. Projection pursuit (PP) is specifically designed to find inhomogeneities, but the original method is computationally very intensive. SPP combines the advantages of both methods and overcomes most of their weak points. In this method, latent variables are obtained sequentially, in order of their importance as measured by the entropy index. This involves an optimization step, which is carried out with a genetic algorithm. The performance of the method is demonstrated and evaluated, first on simulated data sets, and then on near-infrared and gas chromatography data sets. It is shown that SPP does reveal information about inhomogeneities more readily than PCA.
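The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes the data are first whitened by PCA, that the entropy index can be approximated by the mean log of a Gaussian kernel density estimate of the projected scores, and that a small real-coded genetic algorithm (elitism, blend crossover, Gaussian mutation, seeded with the coordinate axes) searches for each direction in the orthogonal complement of those already found. All function names and parameter values (`whiten`, `entropy_index`, `ga_direction`, `spp`, `pop_size`, `mut_sd`, the bandwidth rule) are hypothetical choices made for illustration.

```python
import numpy as np

def whiten(X):
    """Center X and rescale its principal components to unit variance."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    keep = s > 1e-10 * s.max()            # drop numerically empty directions
    return U[:, keep] * np.sqrt(len(X) - 1)

def entropy_index(z):
    """Mean log kernel density of standardized scores (assumed index form).
    Larger values mean lower entropy, i.e. a less Gaussian projection."""
    z = (z - z.mean()) / (z.std() + 1e-12)
    n = z.size
    h = 1.06 * n ** (-0.2)                # rule-of-thumb bandwidth (assumption)
    diff = (z[:, None] - z[None, :]) / h
    dens = np.exp(-0.5 * diff**2).sum(axis=1) / (n * h * np.sqrt(2 * np.pi))
    return float(np.mean(np.log(dens)))

def ga_direction(Z, rng, pop_size=30, generations=30, mut_sd=0.2):
    """Search for a unit vector w that maximizes entropy_index(Z @ w)."""
    q = Z.shape[1]
    # Seed with the coordinate axes plus random directions.
    pop = np.vstack([np.eye(q), rng.normal(size=(pop_size, q))])
    pop /= np.linalg.norm(pop, axis=1, keepdims=True)
    n_elite = max(2, len(pop) // 4)

    def fitness(P):
        return np.array([entropy_index(Z @ w) for w in P])

    for _ in range(generations):
        # Elitism: the best candidates survive unchanged.
        elite = pop[np.argsort(fitness(pop))[::-1][:n_elite]]
        n_child = len(pop) - n_elite
        pa = elite[rng.integers(n_elite, size=n_child)]
        pb = elite[rng.integers(n_elite, size=n_child)]
        mix = rng.random((n_child, 1))
        # Blend crossover followed by Gaussian mutation.
        children = mix * pa + (1 - mix) * pb
        children += rng.normal(scale=mut_sd, size=children.shape)
        children /= np.linalg.norm(children, axis=1, keepdims=True)
        pop = np.vstack([elite, children])
    return pop[np.argmax(fitness(pop))]

def spp(X, n_components=2, seed=0):
    """Return (scores, directions); directions live in the whitened space."""
    rng = np.random.default_rng(seed)
    Z = whiten(X)
    q = Z.shape[1]
    W = np.empty((0, q))
    for k in range(n_components):
        # Orthonormal basis of the complement of the directions found so far.
        B = np.eye(q) if k == 0 else np.linalg.svd(W)[2][k:].T
        v = ga_direction(Z @ B, rng)
        W = np.vstack([W, B @ v])
    return Z @ W.T, W
```

Calling `scores, W = spp(X, n_components=2)` on a samples-by-variables array `X` returns one score column per latent variable, in the order the directions were found. Searching each new direction in the orthogonal complement is one simple way to make the extraction sequential; the paper may handle this step differently.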