Publication | Open Access
Tuning support vector machines for biomedical named entity recognition
Citations: 251
References: 20
Year: 2002
Unknown Venue
Engineering, Part-of-speech Tagging, Corpus Linguistics, Text Mining, Speech Recognition, Natural Language Processing, Support Vector Machine, Data Science, Pattern Recognition, Computational Linguistics, Entity Recognition, SVM Training, Biostatistics, Support Vector Machines, Language Studies, Biomedical Text Mining, Named-entity Recognition, Automatic Classification, Knowledge Discovery, Computer Science, Linguistics, Health Informatics, POS Tagging
We explore the use of Support Vector Machines (SVMs) for biomedical named entity recognition. To make SVM training tractable on the largest available corpus, the GENIA corpus, we propose splitting the non-entity class into sub-classes using part-of-speech information. In addition, we explore new features such as a word cache and the states of an HMM trained by unsupervised learning. Experiments on the GENIA corpus show that our class-splitting technique not only makes training on the GENIA corpus feasible but also improves accuracy. The proposed new features also contribute to improving accuracy. We compare our SVM-based recognition system with a system that uses a Maximum Entropy tagging method.
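The class-splitting idea can be illustrated with a minimal sketch. The abstract does not give the label scheme, so the BIO-style labels, the `O` non-entity label, the `O-<POS>` sub-class naming, and the helper names `split_non_entity` and `merge_non_entity` are all assumptions made for illustration, not the authors' implementation. The sketch only shows the relabeling step: each non-entity token receives a sub-class derived from its part-of-speech tag before training, and the sub-classes are merged back into a single non-entity class after prediction.

```python
from typing import List

NON_ENTITY = "O"  # assumed label for tokens outside any entity


def split_non_entity(labels: List[str], pos_tags: List[str]) -> List[str]:
    """Replace each non-entity label with a POS-specific sub-class, e.g. O-NN."""
    if len(labels) != len(pos_tags):
        raise ValueError("labels and pos_tags must have the same length")
    return [
        f"{NON_ENTITY}-{pos}" if label == NON_ENTITY else label
        for label, pos in zip(labels, pos_tags)
    ]


def merge_non_entity(labels: List[str]) -> List[str]:
    """Map POS-specific sub-classes back to the single non-entity label."""
    return [
        NON_ENTITY if label.startswith(NON_ENTITY + "-") else label
        for label in labels
    ]


# Hypothetical example sentence; tags and labels are illustrative only.
tokens = ["IL-2", "gene", "expression", "requires", "NF-kappa", "B"]
pos_tags = ["NN", "NN", "NN", "VBZ", "NN", "NN"]
labels = ["B-DNA", "I-DNA", "O", "O", "B-protein", "I-protein"]

split_labels = split_non_entity(labels, pos_tags)
print(split_labels)
# ['B-DNA', 'I-DNA', 'O-NN', 'O-VBZ', 'B-protein', 'I-protein']
```

A classifier would then be trained on `split_labels` instead of `labels`, and its predictions passed through `merge_non_entity` before evaluation, so that the reported classes stay the same as in the unsplit setting.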