Publication | Closed Access
Improved topic spotting through statistical modelling of keyword dependencies
Citations: 15
References: 6
Year: 2002
Venue: Unknown
Topics: Engineering, Spoken Language Processing, Corpus Linguistics, Text Mining, Word Embeddings, Natural Language Processing, Speech Recognition, Information Retrieval, Data Science, Computational Linguistics, Background Speech, Document Classification, Language Studies, Automatic Classification, Knowledge Discovery, Keyword Dependencies, Topic Model, Keyword Extraction, Speech Processing, Keyword-topic Interdependence, Broadcast Radio Database, Linguistics
Keywords are chosen on the basis of their usefulness for discriminating a topic from background speech. Good topic recognition can be achieved with a small set of well-chosen keywords, but particular combinations of keywords often achieve better discrimination than can be accounted for by regarding them as independent. This paper describes a higher-order statistical approach involving models of keyword-topic interdependence. A linear-logistic model brings some improvement in performance, but better results are obtained using log-linear contingency table models. Although the potential number of these is very large, good models tend to be simple and are suggested by heuristic measures inferred from the training data. The approach is tested using a broadcast radio database.
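The abstract's central claim, that keyword combinations can discriminate better than independent keywords would predict, can be illustrated with a small sketch. This is not the paper's method or data: the counts below are invented, the two keywords A and B are hypothetical, and the "joint" score uses the full (saturated) two-keyword contingency table per class, whereas the abstract notes that the models selected in practice tend to be simpler. The sketch only compares a topic-versus-background log-likelihood ratio computed under an independence assumption with one computed from the joint table.

```python
import math

# Hypothetical toy counts (not from the paper): number of training
# segments showing each (keyword A present, keyword B present) pattern.
COUNTS = {
    "topic":      {(0, 0): 10, (0, 1): 10, (1, 0): 10, (1, 1): 70},
    "background": {(0, 0): 40, (0, 1): 25, (1, 0): 25, (1, 1): 10},
}
ALPHA = 0.5  # additive smoothing constant, an assumption of this sketch


def joint_prob(label, pattern):
    """Smoothed probability of a full presence pattern within one class."""
    table = COUNTS[label]
    total = sum(table.values()) + ALPHA * len(table)
    return (table[pattern] + ALPHA) / total


def marginal_prob(label, index, value):
    """Smoothed probability that keyword `index` has presence `value`."""
    table = COUNTS[label]
    total = sum(table.values()) + ALPHA * len(table)
    hits = sum(c + ALPHA for p, c in table.items() if p[index] == value)
    return hits / total


def llr_independent(pattern):
    """Log-likelihood ratio treating the keywords as independent."""
    return sum(
        math.log(marginal_prob("topic", i, v) / marginal_prob("background", i, v))
        for i, v in enumerate(pattern)
    )


def llr_joint(pattern):
    """Log-likelihood ratio from the joint table, keeping the A-B interaction."""
    return math.log(joint_prob("topic", pattern) / joint_prob("background", pattern))


if __name__ == "__main__":
    for pattern in [(1, 1), (1, 0), (0, 1), (0, 0)]:
        print(pattern, round(llr_independent(pattern), 3), round(llr_joint(pattern), 3))
```

With these invented counts, the joint score for both keywords occurring together is larger than the independence score, because the two keywords co-occur in topic segments more often than their marginal rates would suggest. In log-linear terms, the independence score corresponds to dropping the A-B interaction term within each class, while the joint score keeps it.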