Publication | Open Access
Statistical models for topic segmentation
Citations: 89
References: 22
Year: 1999
Venue: Unknown
Keywords: Engineering, Part-of-speech Tagging, Corpus Linguistics, Topic Boundaries, Text Mining, Natural Language Processing, Information Retrieval, Data Science, Text Segmentation, Computational Linguistics, Document Classification, Good Segmentation, Topic Segmentation, Language Studies, Content Analysis, Statistics, Document Clustering, NLP Task, Knowledge Discovery, Topic Model, Topic Segmentation Performance, Linguistics
Most documents are about more than one subject, but many NLP and IR techniques implicitly assume documents have just one topic. We describe new clues that mark shifts to new topics, novel algorithms for identifying topic boundaries, and the uses of such boundaries once identified. We report topic segmentation performance on several corpora as well as improvement on an IR task that benefits from good segmentation.