Concepedia

Publication | Open Access

Finding community structure in very large networks

7.3K

Citations

33

References

2004

Year

TLDR

Community structure analysis is a hot topic in physics, yet existing methods are too costly for very large networks. We propose a hierarchical agglomeration algorithm that detects community structure faster than many competing methods. The algorithm runs in O(md log n) time, becoming essentially linear (O(n log² n)) for sparse, hierarchical networks, and we applied it to a 400 k‑node, 2 million‑edge retailer network. It successfully extracts meaningful communities, uncovering large‑scale purchasing patterns among customers.

Abstract

The discovery and analysis of community structure in networks is a topic of considerable recent interest within the physics community, but most methods proposed so far are unsuitable for very large networks because of their computational cost. Here we present a hierarchical agglomeration algorithm for detecting community structure which is faster than many competing algorithms: its running time on a network with $n$ vertices and $m$ edges is $O(md\phantom{\rule{0.2em}{0ex}}\mathrm{log}\phantom{\rule{0.2em}{0ex}}n)$ where $d$ is the depth of the dendrogram describing the community structure. Many real-world networks are sparse and hierarchical, with $m\ensuremath{\sim}n$ and $d\ensuremath{\sim}\mathrm{log}\phantom{\rule{0.2em}{0ex}}n$, in which case our algorithm runs in essentially linear time, $O(n\phantom{\rule{0.2em}{0ex}}{\mathrm{log}}^{2}\phantom{\rule{0.2em}{0ex}}n)$. As an example of the application of this algorithm we use it to analyze a network of items for sale on the web site of a large on-line retailer, items in the network being linked if they are frequently purchased by the same buyer. The network has more than 400 000 vertices and $2\ifmmode\times\else\texttimes\fi{}{10}^{6}$ edges. We show that our algorithm can extract meaningful communities from this network, revealing large-scale patterns present in the purchasing habits of customers.

References

YearCitations

Page 1