Publication | Closed Access
AutoIndex: An Incremental Index Management System for Dynamic Workloads
Citations: 31 | References: 34 | Year: 2022
Topics: Cluster Computing, Engineering, Candidate Indexes, Big Data Indexing, Computer Architecture, Information Retrieval, Data Science, Dynamic Workloads, Query Templates, Management, Data Integration, Parallel Computing, Data Management, New Indexes, Computer Engineering, Computer Science, Query Optimization, Data Indexing, Parallel Programming, Search Engine Indexing, Indexing Technique
Indexes are vital for speeding up lookups on single or multiple columns, and building proper indexes can significantly improve database performance. Existing works focus on adding new indexes that benefit read queries, but they have several limitations. First, real-world workloads may contain numerous queries, and it is difficult to analyze their index requirements and find the most beneficial indexes within resource limits. Second, they fail to consider updating existing indexes, which may be redundant or may even harm the current workload. Third, they cannot estimate index maintenance costs, which are affected by multiple index utilization factors and can significantly affect index benefits, especially for high-write-ratio workloads. To address these challenges, we propose AutoIndex, an incremental index management system for dynamic workloads. First, to support incremental index management, we map incoming queries to query templates and efficiently generate promising candidate indexes from the matched templates. We then use Monte Carlo Tree Search to incrementally add indexes from the candidate set or remove existing indexes, so as to ensure high workload performance. In addition, we propose a deep index estimation model, which integrates practical experience to extract critical cost features and applies deep regression to estimate index benefits from historical index management data. We have implemented modules such as candidate index generation and the index estimator in openGauss, an open-source database system. Experimental results showed that our method outperformed existing approaches on both test and real-world workloads.
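To make the query-template step more concrete, the following is a minimal Python sketch of one way incoming queries could be mapped to templates and scanned for candidate index columns. It is not the AutoIndex or openGauss implementation. The function names (`to_template`, `template_counts`, `candidate_columns`) and the regex-based normalization are assumptions made for illustration only; a real system would rely on the database's SQL parser.

```python
import re
from collections import Counter

# Hypothetical normalizer: all names here are illustrative, not from the paper.
_STRING = re.compile(r"'(?:[^']|'')*'")      # single-quoted string literals
_NUMBER = re.compile(r"\b\d+(?:\.\d+)?\b")   # standalone numeric literals
_SPACE = re.compile(r"\s+")
_PREDICATE = re.compile(r"(\w+)\s*(?:<=|>=|=|<|>)\s*\?")


def to_template(sql: str) -> str:
    """Replace literals with '?' so queries differing only in constants match."""
    sql = _STRING.sub("?", sql)
    sql = _NUMBER.sub("?", sql)
    return _SPACE.sub(" ", sql).strip().lower()


def template_counts(queries):
    """Count how often each template occurs in the workload."""
    return Counter(to_template(q) for q in queries)


def candidate_columns(template: str):
    """Naively collect columns compared against a placeholder in WHERE."""
    parts = template.split(" where ", 1)
    if len(parts) < 2:
        return []
    return sorted(set(_PREDICATE.findall(parts[1])))


if __name__ == "__main__":
    workload = [
        "SELECT * FROM orders WHERE id = 42",
        "select * from orders where id = 7",
        "SELECT name FROM users WHERE city = 'Paris' AND age > 30",
    ]
    for template, count in template_counts(workload).items():
        print(count, template, candidate_columns(template))
```

In this sketch the first two queries collapse into a single template, so the analysis cost grows with the number of distinct templates rather than the number of queries. Choosing which candidates to add or remove, and estimating their benefit against maintenance cost, are separate steps that the sketch does not attempt.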