Publication | Open Access
Disambiguating automatic semantic annotation based on a thesaurus structure
Citations: 19
References: 8
Year: 2007
CARROT Algorithm, Engineering, Semantics, Semantic Web, Automatic Semantic Annotation, Corpus Linguistics, Text Mining, Applied Linguistics, Natural Language Processing, Information Retrieval, Computational Linguistics, Information Extraction Pipeline, Language Studies, Automatic Annotation, Computational Lexicology, Entity Disambiguation, Thesaurus Management, Knowledge Discovery, Terminology Extraction, Lexical Resource, Keyword Extraction, Linguistics, Word-sense Disambiguation
The use/use for relationship in a thesaurus is usually more complex than the (para-)synonymy recommended in the ISO 2788 standard, which describes the content of these controlled vocabularies. Because a non-preferred term can refer to multiple preferred terms (only the latter are relevant in controlled indexing), this relationship is difficult to use in automatic annotation applications: it generates ambiguity. In this paper, we present the CARROT algorithm, designed to rank the output of our Information Extraction pipeline, and show how it can be used to select the relevant preferred term from several candidates. This selection is based on the structure of the annotators' thesaurus and is meant to suggest keywords to human annotators, in order to ease and speed up their daily work. We achieve a 95% success rate and discuss these results along with perspectives for this experiment.
Keywords: semantic disambiguation, ranking algorithm, automatic annotation.
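The ambiguity described in the abstract can be illustrated with a minimal sketch. This is not the CARROT algorithm, whose details are not given here; it is a simplified stand-in that assumes a toy thesaurus and a naive overlap score. The names `USE`, `LINKS`, `rank_candidates`, and `suggest`, as well as all the terms in the data, are hypothetical.

```python
from typing import Dict, List, Optional, Set

# Hypothetical "use" relation: one non-preferred term pointing to
# several preferred terms, which is the source of the ambiguity.
USE: Dict[str, List[str]] = {
    "mining": ["coal mining", "data mining"],
}

# Hypothetical thesaurus links (broader, narrower, or related terms)
# attached to each preferred term.
LINKS: Dict[str, Set[str]] = {
    "coal mining": {"energy", "industry"},
    "data mining": {"computer science", "statistics"},
}


def rank_candidates(term: str, context: Set[str]) -> List[str]:
    """Rank the preferred terms for `term` by the number of thesaurus
    links they share with the other terms found in the document."""
    candidates = USE.get(term, [])
    # Negated overlap gives descending order; sorted() is stable,
    # so tied candidates keep their original order.
    return sorted(candidates, key=lambda c: -len(LINKS.get(c, set()) & context))


def suggest(term: str, context: Set[str]) -> Optional[str]:
    """Return the top-ranked preferred term, or None if `term` is unknown."""
    ranked = rank_candidates(term, context)
    return ranked[0] if ranked else None
```

In this sketch, calling `suggest("mining", {"statistics"})` would favor "data mining" because that candidate is linked to a term already present in the document. A real system would presumably weight relation types and term frequencies rather than count raw overlaps.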