Publication | Closed Access
Semi-supervised support vector machines for unlabeled data classification
Citations: 204
References: 5
Year: 2001
Artificial Intelligence, Mathematical Programming, Engineering, Machine Learning, Test Set Correctness, Support Vector Machine, Classification Method, Image Analysis, Data Science, Data Mining, Pattern Recognition, Concave Minimization Approach, Computational Geometry, Semi-supervised Learning, Supervised Learning, Concave Minimization Problem, Knowledge Discovery, Unlabeled Data Classification, Computer Science, Convex Optimization
A concave minimization approach is proposed for classifying unlabeled data based on the following ideas: (i) a small representative percentage (5% to 10%) of the unlabeled data is chosen by a clustering algorithm and given to an expert or oracle to label; (ii) a linear support vector machine is trained using the small labeled sample while simultaneously assigning the remaining bulk of the unlabeled dataset to one of two classes so as to maximize the margin (distance) between the two bounding planes that determine the separating plane midway between them. This latter problem is formulated as a concave minimization problem on a polyhedral set, for which a stationary point is quickly obtained by solving a few (5 to 7) linear programs. Such stationary points turn out to be very effective, as evidenced by our computational results, which show that clustered concave minimization yields: (a) test set improvement as high as 20.4% over a linear support vector machine trained on a correspondingly small but randomly chosen subset that is labeled by an expert; (b) test set correctness averaged to within 5.1% of that of a completely supervised linear support vector machine trained on the entire dataset, which has been labeled by an expert.
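Step (i) of the abstract can be illustrated with a minimal sketch. The abstract says only that "a clustering algorithm" picks the representative points, so the choices here are assumptions made for illustration, not the authors' procedure: plain k-means with a deterministic farthest-first start, one cluster per point in the labeling budget, and the data point nearest each cluster center sent to the oracle. The function name `pick_representatives` and its parameters are hypothetical.

```python
import math


def pick_representatives(points, fraction=0.1, iters=20):
    """Return sorted indices of the points to hand to the oracle for labeling.

    Assumption: k-means stands in for the unspecified clustering algorithm,
    and the number of clusters equals the labeling budget.
    """
    n = len(points)
    k = max(1, round(fraction * n))  # labeling budget, e.g. 5% to 10% of n

    # Farthest-first initialization: deterministic and spreads centers out.
    centers = [points[0]]
    while len(centers) < k:
        far = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(far)

    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: math.dist(p, centers[c]))
            groups[j].append(p)
        # Update step: move each center to the mean of its group
        # (an empty group keeps its previous center).
        new_centers = [
            tuple(sum(x) / len(g) for x in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers

    # Representative of a cluster: the data point closest to its center.
    chosen = {
        min(range(n), key=lambda i: math.dist(points[i], c)) for c in centers
    }
    return sorted(chosen)
```

The returned indices would be labeled by the expert; the remaining points stay unlabeled and enter step (ii), where the abstract's concave minimization assigns them to classes while the linear separator is trained. That second step needs a linear programming solver and is not sketched here.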