Publication | Closed Access
Concentration based feature construction approach for spam detection
Citations: 32
References: 16
Year: 2009
Unknown Venue
Search Optimization, Engineering, Machine Learning, Human Immune System, Artificial Immune System, Feature Selection, Immunological Computing, Information Forensics, Feature Vector, Text Mining, Spam Filtering, Support Vector Machine, Information Retrieval, Data Science, Data Mining, Pattern Recognition, Knowledge Discovery, Computer Science, Feature Construction, Spam Detection, Classifier System
Inspired by the human immune system, this paper proposes a concentration-based feature construction (CFC) approach for spam detection that uses a two-element concentration vector as the feature vector. In the CFC approach, 'self' and 'non-self' concentrations are constructed using 'self' and 'non-self' gene libraries, respectively, and are then combined into a two-element concentration vector that characterizes the e-mail efficiently. As a result, designing the classifier amounts to establishing a mapping from two real-valued inputs to one binary output. Classifying an e-mail is treated as an optimization problem that minimizes a formulated cost function, and a clonal particle swarm optimization (CPSO) algorithm proposed by the leading author is employed for this purpose. Several classifiers, including a linear discriminant, multi-layer neural networks, and a support vector machine, are used to verify the effectiveness and robustness of the CFC approach. Experimental results show that the proposed CFC approach is not only very fast but also achieves 97% and 99% accuracy on the PU1 and Ling corpora, respectively, using only a two-element concentration feature vector.
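As a rough illustration of the idea in the abstract, the sketch below builds a two-element concentration vector and feeds it to a simple linear discriminant. The abstract does not give the concentration formula, so this sketch assumes that a concentration is the fraction of an e-mail's distinct words found in a gene library. The function names, the whitespace tokenizer, the toy libraries, and the fixed weights are all hypothetical; in the paper the classifier parameters are tuned by CPSO, which is not reproduced here.

```python
from typing import Iterable, List, Set, Tuple


def tokenize(text: str) -> List[str]:
    # Assumption: plain lowercase whitespace tokenization.
    return text.lower().split()


def concentration(tokens: Iterable[str], gene_library: Set[str]) -> float:
    # Assumed definition: share of distinct tokens that appear in the library.
    distinct = set(tokens)
    if not distinct:
        return 0.0
    return len(distinct & gene_library) / len(distinct)


def cfc_features(text: str, self_library: Set[str],
                 nonself_library: Set[str]) -> Tuple[float, float]:
    # Two-element feature vector: (self concentration, non-self concentration).
    tokens = tokenize(text)
    return (concentration(tokens, self_library),
            concentration(tokens, nonself_library))


def linear_classify(features: Tuple[float, float],
                    weights: Tuple[float, float] = (-1.0, 1.0),
                    bias: float = 0.0) -> int:
    # Hypothetical fixed weights; returns 1 for spam, 0 for legitimate mail.
    score = weights[0] * features[0] + weights[1] * features[1] + bias
    return 1 if score > 0 else 0


# Toy gene libraries, invented for illustration only.
SELF_LIBRARY = {"meeting", "report", "schedule"}
NONSELF_LIBRARY = {"free", "winner", "prize"}

spam_vec = cfc_features("free prize winner claim now",
                        SELF_LIBRARY, NONSELF_LIBRARY)
ham_vec = cfc_features("meeting schedule report attached",
                       SELF_LIBRARY, NONSELF_LIBRARY)
```

With only two real-valued inputs, any classifier reduces to a decision boundary in a plane, which is consistent with the abstract's claim that classifier design becomes a mapping from two inputs to one binary output.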