Publication | Closed Access
STING: A Statistical Information Grid Approach to Spatial Data Mining
Citations: 1.2K
References: 11
Year: 1997
Unknown Venue
Engineering, Spatial Data Mining, Spatiotemporal Database, Social Sciences, Data Science, Data Mining, Pattern Recognition, Spatial Data Management, Statistics, Data Modeling, Spatial Databases, Spatial Statistical Analysis, Geography, Knowledge Discovery, Computer Science, Spatial Data, Quantitative Spatial Model, Spatial Statistics, Common Problems, Big Data
Spatial data mining is challenging because of the large volume of data and because the problems must account for spatial distance. Existing methods for clustering and region queries scan every object at least once, so the cost of each query is at least linear in the number of objects. We propose a hierarchical statistical information grid approach to reduce this cost. The approach stores statistical information for each spatial cell so that whole classes of queries and clustering problems can be answered without accessing individual objects. In theory, and as confirmed by empirical studies, the method outperforms the best previous approach by at least an order of magnitude, especially on very large data sets.
Spatial data mining, i.e., the discovery of interesting characteristics and patterns that may implicitly exist in spatial databases, is a challenging task because of the huge amounts of spatial data and because of the new conceptual nature of the problems, which must account for spatial distance. Clustering and region-oriented queries are common problems in this domain. Several approaches have been presented in recent years, all of which require at least one scan of all individual objects (points). Consequently, the computational cost of answering each query is at least linear in the number of objects. In this paper, we propose a hierarchical statistical information grid-based approach for spatial data mining to reduce the cost further. The idea is to capture statistical information associated with spatial cells in such a manner that whole classes of queries and clustering problems can be answered without recourse to the individual objects. In theory, and as confirmed by empirical studies, this approach outperforms the best previous method by at least an order of magnitude, especially when the data set is very large.
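The core idea of answering a region query from per-cell statistics rather than from individual points can be illustrated with a minimal sketch. This is not the paper's algorithm or data layout. It assumes points in the unit square that each carry one numeric attribute, a quadtree-style hierarchy in which every cell has four children, and a hypothetical set of stored statistics (count, sum, sum of squares, minimum, maximum). The names `CellStats`, `build`, and `region_stats` are invented for illustration, as is the rule for partially covered bottom-level cells.

```python
import math
from dataclasses import dataclass


@dataclass
class CellStats:
    """Summary of the attribute values of all points in one cell (assumed fields)."""
    n: int = 0
    total: float = 0.0
    total_sq: float = 0.0
    lo: float = math.inf
    hi: float = -math.inf

    def add(self, v: float) -> None:
        self.n += 1
        self.total += v
        self.total_sq += v * v
        self.lo = min(self.lo, v)
        self.hi = max(self.hi, v)

    def merge(self, other: "CellStats") -> "CellStats":
        # Parent statistics are derived from child statistics alone.
        return CellStats(self.n + other.n, self.total + other.total,
                         self.total_sq + other.total_sq,
                         min(self.lo, other.lo), max(self.hi, other.hi))

    @property
    def mean(self) -> float:
        return self.total / self.n if self.n else math.nan


def build(points, depth):
    """One scan fills the finest grid; coarser levels are merged bottom-up.

    Returns levels[0] (a 1x1 root) through levels[depth] (2**depth x 2**depth).
    """
    side = 2 ** depth
    bottom = [[CellStats() for _ in range(side)] for _ in range(side)]
    for x, y, v in points:
        i = min(int(y * side), side - 1)
        j = min(int(x * side), side - 1)
        bottom[i][j].add(v)
    levels = [bottom]
    while len(levels[0]) > 1:
        child = levels[0]
        half = len(child) // 2
        parent = [[child[2 * i][2 * j]
                   .merge(child[2 * i][2 * j + 1])
                   .merge(child[2 * i + 1][2 * j])
                   .merge(child[2 * i + 1][2 * j + 1])
                   for j in range(half)] for i in range(half)]
        levels.insert(0, parent)
    return levels


def region_stats(levels, x0, y0, x1, y1, level=0, i=0, j=0):
    """Top-down query over the rectangle [x0, x1] x [y0, y1] using cell statistics only."""
    size = 1.0 / len(levels[level])
    cx0, cy0 = j * size, i * size
    cx1, cy1 = cx0 + size, cy0 + size
    if cx1 <= x0 or cx0 >= x1 or cy1 <= y0 or cy0 >= y1:
        return CellStats()                      # cell lies outside the region
    cell = levels[level][i][j]
    if x0 <= cx0 and cx1 <= x1 and y0 <= cy0 and cy1 <= y1:
        return cell                             # fully covered: no need to descend
    if level == len(levels) - 1:
        # Assumed approximation: keep a partially covered bottom cell
        # only if its center falls inside the region.
        mx, my = (cx0 + cx1) / 2, (cy0 + cy1) / 2
        return cell if (x0 <= mx < x1 and y0 <= my < y1) else CellStats()
    out = CellStats()
    for di in (0, 1):
        for dj in (0, 1):
            out = out.merge(region_stats(levels, x0, y0, x1, y1,
                                         level + 1, 2 * i + di, 2 * j + dj))
    return out


if __name__ == "__main__":
    pts = [(0.1, 0.1, 1.0), (0.3, 0.4, 3.0), (0.8, 0.2, 5.0), (0.6, 0.9, 7.0)]
    grid = build(pts, depth=2)
    s = region_stats(grid, 0.0, 0.0, 0.5, 0.5)
    print(s.n, s.mean)
```

In this sketch the points are scanned once, when the grid is built. Each later query touches only cells, so its cost depends on the number of cells visited rather than on the number of points, which is the property the abstract emphasizes.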