Publication | Closed Access
Efficient Distributed Topic Modeling with Provable Guarantees
Citations: 16
References: 19
Year: 2014
Unknown Venue
Topic modeling for large-scale distributed web collections requires distributed techniques that account for both computational and communication costs. We consider topic modeling under the separability assumption and develop novel computationally efficient methods that provably achieve the statistical performance of the state-of-the-art centralized approaches while requiring insignificant communication between the distributed document collections. We achieve trade-offs between communication and computation without actually transmitting the documents. Our scheme is based on exploiting the geometry of the normalized word-word co-occurrence matrix and viewing each row of this matrix as a vector in a high-dimensional space. We relate the solid angle subtended by extreme points of the convex hull of these vectors to topic identities and construct distributed schemes to identify topics.
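The geometric idea in the abstract can be illustrated with a minimal single-machine sketch. It is not the authors' algorithm and it does not model the distributed or communication aspects. It assumes one common way of estimating solid angles: draw random isotropic directions and record which row has the largest projection in each. A row's share of wins estimates the fraction of directions in which it is the farthest point, and that share is positive only for extreme points of the convex hull. The function name `solid_angle_estimates`, the toy matrix, and the 0.05 threshold are all hypothetical choices made for illustration.

```python
import numpy as np


def solid_angle_estimates(points, n_directions=2000, seed=0):
    """Estimate, for each row, the fraction of random directions it maximizes."""
    rng = np.random.default_rng(seed)
    # Isotropic Gaussian directions, one per column.
    directions = rng.standard_normal((points.shape[1], n_directions))
    # For every direction, find the row with the largest projection.
    winners = np.argmax(points @ directions, axis=0)
    votes = np.bincount(winners, minlength=points.shape[0])
    return votes / n_directions


# Hypothetical toy data standing in for a row-normalized co-occurrence matrix.
# Rows 0-2 are the extreme points; each row sums to 1.
extreme = np.array([
    [0.60, 0.10, 0.10, 0.10, 0.05, 0.05],
    [0.10, 0.60, 0.10, 0.05, 0.10, 0.05],
    [0.10, 0.10, 0.60, 0.05, 0.05, 0.10],
])
# Rows 3-5 are convex combinations of rows 0-2, so they lie inside the hull.
mixing = np.array([
    [0.5, 0.5, 0.0],
    [0.2, 0.3, 0.5],
    [0.3, 0.3, 0.4],
])
rows = np.vstack([extreme, mixing @ extreme])

angles = solid_angle_estimates(rows)
# Assumed threshold: keep rows that win a non-negligible share of directions.
extreme_rows = np.flatnonzero(angles > 0.05)
print(angles.round(3))
print(extreme_rows)
```

On this noise-free toy matrix only rows 0, 1 and 2 win any direction, so thresholding the estimates recovers exactly the extreme rows. With co-occurrence statistics estimated from real documents, sampling noise would give some non-extreme rows small positive shares, which is why a threshold is used here rather than a test for nonzero values.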