Publication | Closed Access
Parallel optimization of large join queries with set operators and aggregates in a parallel environment supporting pipeline
Citations: 19
References: 26
Year: 1996
Cluster Computing, Relational Algebra Operators, Engineering, Parallel Environment, Computer Architecture, Computational Complexity, Parallel Optimization, Map-reduce, Large Join Queries, Data Science, Management, Data Integration, Parallel Computing, Data Management, Parallel Database, Computer Engineering, Parallel Optimizer, Computer Science, Distributed Query Processing, Bushy Parallelism, Query Optimization, Parallel Processing, Parallel Programming, Data-level Parallelism, Big Data
This paper proposes a parallel optimizer for queries that contain a large number of joins as well as set operators and aggregate functions. The execution platform is a shared-disk multiprocessor machine that supports bushy parallelism and pipelined processing. The model partitions the query into almost independent subtrees that can be optimized simultaneously, and it applies an enhanced variation of the iterative improvement technique to those subtrees that contain a large number of joins; this technique is itself parallelized. To estimate the cost of the states constructed while optimizing join subtrees, cost formulae are developed for relational algebra operators executed across coalescing pipes.
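
For readers unfamiliar with iterative improvement, the following is a minimal sketch of the plain (not the paper's enhanced or parallelized) technique applied to join ordering. Everything in it is an assumption made for illustration: the names `plan_cost` and `iterative_improvement`, the restriction to left-deep join orders, the swap-two-relations neighborhood, the toy cost model (sum of intermediate result sizes), and the example cardinalities and selectivities. It does not reproduce the cost formulae for pipelined execution described in the abstract.

```python
import random


def plan_cost(order, card, sel):
    """Toy cost (an assumption): sum of intermediate result sizes
    produced by joining the relations in `order` left to right."""
    size = card[order[0]]
    joined = [order[0]]
    total = 0.0
    for rel in order[1:]:
        size *= card[rel]
        # Apply the selectivity of every join predicate linking `rel`
        # to a relation already in the plan (1.0 means cross product).
        for prev in joined:
            size *= sel.get(frozenset((prev, rel)), 1.0)
        joined.append(rel)
        total += size
    return total


def iterative_improvement(relations, card, sel, restarts=10, seed=0):
    """Start from several random join orders, descend each one to a
    local minimum by accepting only cost-reducing swaps, and keep the
    cheapest local minimum found."""
    rng = random.Random(seed)
    best_order, best_cost = None, float("inf")
    for _ in range(restarts):
        order = list(relations)
        rng.shuffle(order)
        cost = plan_cost(order, card, sel)
        improved = True
        while improved:
            improved = False
            for i in range(len(order)):
                for j in range(i + 1, len(order)):
                    order[i], order[j] = order[j], order[i]
                    new_cost = plan_cost(order, card, sel)
                    if new_cost < cost:
                        cost = new_cost  # keep the improving move
                        improved = True
                    else:
                        order[i], order[j] = order[j], order[i]  # undo
        if cost < best_cost:
            best_order, best_cost = list(order), cost
    return best_order, best_cost


# Hypothetical four-relation chain query A-B-C-D.
relations = ["A", "B", "C", "D"]
card = {"A": 1000, "B": 50, "C": 2000, "D": 10}
sel = {
    frozenset(("A", "B")): 0.01,
    frozenset(("B", "C")): 0.001,
    frozenset(("C", "D")): 0.05,
}
best_order, best_cost = iterative_improvement(relations, card, sel)
```

Because each restart is independent of the others, the restarts are a natural unit of parallel work; how the paper actually parallelizes the search is not stated in this abstract.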