Publication | Closed Access
Making Search Efficient on Gnutella-Like P2P Systems
Citations: 41
References: 23
Year: 2005
Unknown Venue
Cluster Computing, Engineering, Semantic Web, Information Retrieval, Data Science, Data Mining, Graph Query Language, Management, Relevance Ranking Algorithm, Data Integration, Query Expansion, Combinatorial Optimization, Data Management, Search Efficient, Knowledge Discovery, Computer Science, Distributed Query Processing, Query Optimization, Cloud Computing, Peer-to-peer Database, Node Vector Size, Node Vector, Trusted P2p, Distributed Search Engine
Leveraging state-of-the-art information retrieval (IR) techniques such as the vector space model (VSM) and relevance ranking, we present GES, an efficient IR system built on top of Gnutella-like P2P networks. The key idea is that GES employs a distributed, content-based, and capacity-aware topology adaptation algorithm to organize nodes (each of which is represented by a node vector) into semantic groups. The intuition behind this design is that semantically associated nodes within a semantic group tend to be relevant to the same queries. Given a query, GES uses a capacity-aware search protocol, based on semantic groups and selective one-hop node vector replication, to direct the query to the nodes most relevant to it, thereby achieving high recall while probing only a small fraction of nodes. Moreover, GES adopts automatic query expansion techniques to improve the quality of search results, and it is the first work to show that node vector size plays a very important role in system performance. The experimental results show that GES is very efficient and even outperforms centralized node clustering systems such as SETS.
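The role of node vectors and VSM-style relevance ranking described in the abstract can be illustrated with a minimal sketch. It is not the GES implementation: the function names (`node_vector`, `cosine`, `rank_nodes`), the use of raw term counts as weights, and the truncation of a node vector to its `size` heaviest terms are all assumptions made for illustration only.

```python
import math
from collections import Counter


def node_vector(documents, size):
    """Summarize a node by the `size` most frequent terms across its documents.

    Assumption: raw term counts stand in for whatever term weighting
    the real system uses.
    """
    totals = Counter()
    for doc in documents:
        totals.update(doc.lower().split())
    return dict(totals.most_common(size))


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_nodes(query, node_vectors):
    """Return (score, node_name) pairs, most relevant node first."""
    q = Counter(query.lower().split())
    scored = [(cosine(q, vec), name) for name, vec in node_vectors.items()]
    # Sort by descending score; break ties by name for determinism.
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


# Hypothetical two-node network used only to exercise the functions.
nodes = {
    "A": node_vector(["jazz piano trio", "jazz saxophone"], size=3),
    "B": node_vector(["football goal", "football league"], size=3),
}
ranked = rank_nodes("jazz piano", nodes)
```

In this sketch a query would be forwarded first to the nodes at the head of `ranked`. Shrinking `size` makes each node vector cheaper to store and replicate but drops terms that a query might match, which is one way to see why node vector size can affect search quality.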