Publication | Open Access
An example-based mapping method for text categorization and retrieval
Citations: 404
References: 15
Year: 1994
LLSF Approach, Example-based Mapping Method, Engineering, Intelligent Information Retrieval, Semantic Web, Corpus Linguistics, Text Mining, Natural Language Processing, Text Retrieval, Information Retrieval, Data Science, Data Mining, Computational Linguistics, Document Classification, Language Studies, Automatic Classification, Knowledge Discovery, Terminology Extraction, Arbitrary Queries, Vector Space Model, Linguistics, Semantic Similarity
A unified model for text categorization and text retrieval is introduced. We use a training set of manually categorized documents to learn word-category associations, and use these associations to predict the categories of arbitrary documents. Similarly, we use a training set of queries and their related documents to obtain empirical associations between query words and indexing terms of documents, and use these associations to predict the related documents of arbitrary queries. A Linear Least Squares Fit (LLSF) technique is employed to estimate the likelihood of these associations. Document collections from the MEDLINE database and Mayo patient records are used for studies on the effectiveness of our approach, and on how much the effectiveness depends on the choices of training data, indexing language, word-weighting scheme, and morphological canonicalization. Alternative methods are also tested on these data collections for comparison. It is evident that the LLSF approach uses the relevance information effectively within human decisions of categorization and retrieval, and achieves a semantic mapping of free texts to their representations in an indexing language. Such a semantic mapping leads to a significant improvement in categorization and retrieval, compared to alternative approaches.
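The word-category mapping described in the abstract can be illustrated with a minimal sketch. It assumes the common least-squares formulation in which a matrix F is chosen to minimize ||FA - B||, where the columns of A are term vectors of training documents and the columns of B are their category indicators. The vocabulary, categories, training texts, and function names below are hypothetical toy values rather than data or code from the paper, and raw term counts stand in for whatever word-weighting scheme the authors used.

```python
import numpy as np

# Hypothetical toy vocabulary and categories (not taken from the paper).
VOCAB = ["heart", "attack", "lung", "cancer"]
CATEGORIES = ["cardiology", "oncology"]


def bag_of_words(text):
    """Return a raw term-count vector over VOCAB for whitespace-tokenized text."""
    tokens = text.lower().split()
    return np.array([tokens.count(word) for word in VOCAB], dtype=float)


def fit_llsf(A, B):
    """Solve min_F ||F A - B||_F for A (words x docs) and B (categories x docs)."""
    # lstsq solves A.T @ X = B.T in the least-squares sense; F is X transposed.
    X, *_ = np.linalg.lstsq(A.T, B.T, rcond=None)
    return X.T


def rank_categories(F, text):
    """Score every category for a new text and return them best first."""
    scores = F @ bag_of_words(text)
    order = np.argsort(-scores)
    return [(CATEGORIES[i], float(scores[i])) for i in order]


# Assumed training set: three short documents with manual category labels.
train_docs = ["heart attack", "lung cancer", "heart cancer"]
train_labels = [[1, 0], [0, 1], [1, 1]]  # one row per document

A = np.column_stack([bag_of_words(doc) for doc in train_docs])  # 4 words x 3 docs
B = np.array(train_labels, dtype=float).T                       # 2 categories x 3 docs

F = fit_llsf(A, B)
ranked = rank_categories(F, "attack heart")
print(ranked)
```

The same sketch would cover the retrieval case by letting the columns of A hold training queries and the columns of B hold the indexing terms of their related documents, so that F maps query words to indexing terms.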