Publication | Closed Access
Efficient search for the top-k probable nearest neighbors in uncertain databases
149
Citations
22
References
2008
Year
Threshold QueriesEngineeringMachine LearningUncertain DatabaseRange SearchingUncertain DataLocalizationInformation RetrievalData ScienceData MiningPattern RecognitionUncertainty QuantificationManagementQuery UncertaintyCombinatorial OptimizationData ManagementSimilarity SearchKnowledge DiscoveryComputer ScienceNearest Neighbor ObjectsQuery OptimizationUncertain DatabasesEfficient SearchApproximate Query AnsweringLocation Information
Uncertainty pervades many domains such as location tracking, multimedia feature extraction, and sensor data management, where finding nearest neighbors is a key query type. The study aims to identify objects with the highest marginal probability of being the nearest neighbors to a query object. The authors adopt a general uncertainty model for data and query uncertainty, define new query semantics, and develop efficient evaluation algorithms that analyze cost factors and trade‑offs, validated by extensive experiments on real and synthetic data. They extend the techniques to handle dependencies among data objects and support threshold queries.
Uncertainty pervades many domains in our lives. Current real-life applications, e.g., location tracking using GPS devices or cell phones, multimedia feature extraction, and sensor data management, deal with different kinds of uncertainty. Finding the nearest neighbor objects to a given query point is an important query type in these applications. In this paper, we study the problem of finding objects with the highest marginal probability of being the nearest neighbors to a query object. We adopt a general uncertainty model allowing for data and query uncertainty. Under this model, we define new query semantics, and provide several efficient evaluation algorithms. We analyze the cost factors involved in query evaluation, and present novel techniques to address the trade-offs among these factors. We give multiple extensions to our techniques including handling dependencies among data objects, and answering threshold queries. We conduct an extensive experimental study to evaluate our techniques on both real and synthetic data.
| Year | Citations | |
|---|---|---|
Page 1
Page 1