Publication | Closed Access
Scalable Parallel EM Algorithms for Latent Dirichlet Allocation in Multi-Core Systems
Citations: 15
References: 38
Year: 2015
Unknown Venue
Cluster Computing, Latent Dirichlet Allocation, Engineering, LDA Parameters, Text Mining, Natural Language Processing, Information Retrieval, Data Science, Data Mining, Parallel Complexity Theory, Parallel Computing, Content Analysis, Massively-parallel Computing, Document Clustering, Knowledge Discovery, Computer Science, Data-intensive Computing, Topic Model, Multi-core Systems, Parallel Processing, Parallel Programming, Data-level Parallelism
Latent Dirichlet allocation (LDA) is a widely used probabilistic topic modeling tool for content analysis tasks such as web mining. To handle web-scale content analysis on a single PC, we propose multi-core parallel expectation-maximization (PEM) algorithms that infer and estimate LDA parameters in shared-memory systems. PEM algorithms avoid memory access conflicts, reduce locking time among threads, and use residual-based dynamic scheduling. We show that they are more scalable and accurate than current state-of-the-art parallel LDA algorithms on a commodity PC. The parallel LDA toolbox is publicly available as open-source software at mloss.org.
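The abstract does not spell out how residual-based dynamic scheduling works, so the following is only a rough, single-threaded sketch of the general idea, not the toolbox's implementation. After each EM update, each document's residual (how much its topic proportions changed) is recorded, and the next sweep visits only the documents with the largest residuals. The function names `update_doc` and `select_by_residual`, the L1 residual, and the choice to hold the topic-word matrix `phi` fixed are all assumptions made for illustration.

```python
import math

def update_doc(counts_d, theta_d, phi):
    """EM update of one document's topic proportions; phi is held fixed.

    counts_d: word counts for the document (length W, assumed non-empty)
    theta_d:  current topic proportions (length K, sums to 1)
    phi:      K x W topic-word probabilities
    Returns (new_theta, residual), where residual is the L1 change in theta.
    """
    K = len(theta_d)
    new_theta = [0.0] * K
    for w, n in enumerate(counts_d):
        if n == 0:
            continue
        # E-step: unnormalized responsibility of each topic for word w
        resp = [theta_d[k] * phi[k][w] for k in range(K)]
        z = sum(resp)
        # M-step accumulation: expected topic counts for this document
        for k in range(K):
            new_theta[k] += n * resp[k] / z
    total = sum(new_theta)
    new_theta = [v / total for v in new_theta]
    residual = sum(abs(a - b) for a, b in zip(new_theta, theta_d))
    return new_theta, residual

def select_by_residual(residuals, fraction):
    """Indices of the top `fraction` of documents, largest residual first."""
    k = max(1, math.ceil(fraction * len(residuals)))
    order = sorted(range(len(residuals)), key=lambda d: residuals[d], reverse=True)
    return order[:k]
```

In a multi-threaded setting, one way to avoid write conflicts would be to give each thread a disjoint subset of the selected documents, since each update writes only its own document's proportions. Whether the PEM algorithms partition work this way is not stated in the abstract.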