Publication | Closed Access
Semantically-Enhanced Topic Modeling
Citations: 19
References: 24
Year: 2018
Venue: Unknown
Engineering, Topic Modeling, Semantic Web, Corpus Linguistics, Text Mining, Natural Language Processing, General Topic, Information Retrieval, Data Science, Semantically-enhanced Topic Modeling, Computational Linguistics, Document Classification, Language Studies, Content Analysis, Statistics, Automatic Text Classification, Semantic Learning, NLP Task, Knowledge Discovery, Computer Science, Information Extraction, Retrieval Augmented Generation, Topic Model, Semantic Representation
In this paper, we advance the state of the art in topic modeling by designing and developing a novel (semi-formal) general topic modeling framework. The novel contributions of our solution include: (i) new semantically-enhanced data representations for topic modeling based on pooling, and (ii) a novel topic extraction strategy, ASToC, that solves the difficulty of representing topics in our semantically-enhanced information space. In our extensive experimental evaluation, covering 12 datasets and 12 state-of-the-art baselines and totaling 108 tests, we outperform the baselines (with a few ties) in almost 100 cases, with gains of more than 50% against the best baselines (and up to 80% against some runners-up). We provide qualitative and quantitative statistical analyses of why our solutions work so well. Finally, we show that our method is able to improve document representation in automatic text classification.
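To give a rough sense of what a pooling-based document representation can look like, the sketch below averages word vectors into a single document vector. This is a generic mean-pooling illustration, not the representation or the ASToC strategy described in the paper; the names `TOY_EMBEDDINGS` and `mean_pool` and the two-dimensional vectors are hypothetical and chosen only for the example.

```python
from typing import Dict, List

# Hypothetical toy embeddings (assumption); a real system would load
# pretrained word vectors instead.
TOY_EMBEDDINGS: Dict[str, List[float]] = {
    "topic": [1.0, 0.0],
    "model": [0.0, 1.0],
    "text": [1.0, 1.0],
}


def mean_pool(tokens: List[str], embeddings: Dict[str, List[float]]) -> List[float]:
    """Average the vectors of known tokens; unknown tokens are skipped."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        # No known tokens: fall back to a zero vector of the embedding size.
        dim = len(next(iter(embeddings.values()), []))
        return [0.0] * dim
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


# "unseen" has no embedding, so only "topic" and "model" are pooled.
doc_vector = mean_pool(["topic", "model", "unseen"], TOY_EMBEDDINGS)
print(doc_vector)
```

Mean pooling is only one possible choice; weighted or max pooling would fit the same interface, and the paper may combine pooling with other signals not shown here.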