Publication | Closed Access
Data Canopy
Citations: 44
References: 67
Year: 2017
Unknown Venue
Statistical Learning, Engineering, Data Science, Data Mining, Data Scientists, Predictive Analytics, Interactive Data Exploration, Knowledge Discovery, Management, Exploratory Data Analysis, Data Exploration, Data Integration, Computer Science, Data Pre-processing, Data Management, Statistics, Exploratory Statistical Analysis, Data Modeling
During exploratory statistical analysis, data scientists repeatedly compute statistics on data sets to infer knowledge. Moreover, statistics form the building blocks of core machine learning classification and filtering algorithms. Modern data systems, software libraries, and domain-specific tools provide support to compute statistics but lack a cohesive framework for storing, organizing, and reusing them. This creates a significant problem for exploratory statistical analysis as data grows: Despite existing overlap in exploratory workloads (which are repetitive in nature), statistics are always computed from scratch. This leads to repeated data movement and recomputation, hindering interactive data exploration.
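To make the recomputation problem concrete, here is a minimal sketch of reusing cached intermediate results across overlapping statistics. It is an illustration only and is not taken from the abstract: the class `ChunkedStatsCache`, its `chunk_size` parameter, and the choice to cache per-chunk counts, sums, and sums of squares are all assumptions made for this example.

```python
class ChunkedStatsCache:
    """Hypothetical cache of per-chunk aggregates (illustrative assumption)."""

    def __init__(self, data, chunk_size=4):
        self.data = list(data)
        self.chunk_size = chunk_size
        self._cache = {}          # chunk index -> (count, sum, sum of squares)
        self.chunks_scanned = 0   # how many chunks were read from the raw data

    def _chunk_aggregates(self, idx):
        # Scan the raw data only the first time a chunk is requested.
        if idx not in self._cache:
            start = idx * self.chunk_size
            chunk = self.data[start:start + self.chunk_size]
            self._cache[idx] = (len(chunk), sum(chunk), sum(x * x for x in chunk))
            self.chunks_scanned += 1
        return self._cache[idx]

    def _range_aggregates(self, first_chunk, last_chunk):
        # Combine cached aggregates for chunks in [first_chunk, last_chunk).
        n = s = ss = 0
        for idx in range(first_chunk, last_chunk):
            count, total, total_sq = self._chunk_aggregates(idx)
            n += count
            s += total
            ss += total_sq
        return n, s, ss

    def mean(self, first_chunk, last_chunk):
        n, s, _ = self._range_aggregates(first_chunk, last_chunk)
        return s / n

    def variance(self, first_chunk, last_chunk):
        # Population variance from the same aggregates: E[x^2] - (E[x])^2.
        n, s, ss = self._range_aggregates(first_chunk, last_chunk)
        return ss / n - (s / n) ** 2


cache = ChunkedStatsCache([1, 2, 3, 4, 5, 6, 7, 8], chunk_size=4)
mean_value = cache.mean(0, 2)          # scans both chunks once
variance_value = cache.variance(0, 2)  # reuses the cached aggregates
```

In this sketch the variance request touches no raw data, because the mean request already populated the cache for the same chunks. The sum-of-squares formula can lose precision on large or badly scaled values, so it is used here for brevity rather than as a recommendation.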