Publication | Closed Access
Parallel sorting and data partitioning by sampling
71
Citations
0
References
1983
Year
Cluster ComputingEngineeringComputer ArchitectureParallel SortingParallel StorageData ScienceData MiningData PartitioningParallel ComputingCombinatorial OptimizationData ManagementParallel DatabaseParallel Sorting MethodSorting AlgorithmComputer EngineeringComputer SciencePartition (Database)Parallel ProgrammingData-level ParallelismBig Data
A parallel sorting method which requires data partitioning is presented. The ability to partition the data into equal-size ordered subsets is essential in the sorting process. A data partitioning method by sampling is proposed. The complexity and the performance of the sorting and partitioning algorithm are analyzed. Storage bounds and the choice of parameters which determine the sampling size are also discussed. The analysis is developed for parallel sorting in a local network environment, with distributed data sets in secondary storage devices. 9 references.