Publication | Closed Access

Survey of data partitioning algorithms for big data stores

Year: 2016

Citations: 14

References: 17

Abstract

Data partitioning plays an important role in improving application performance in distributed systems. Big data applications place heightened demands on performance metrics such as responsiveness, availability, scalability, and throughput. Partitioning data across multiple nodes improves an application's scalability and throughput. Although there is a rich literature on data partitioning algorithms for big data applications in distributed systems, these algorithms still need to be classified and regrouped according to their strategies. This paper presents an exhaustive classification of data partitioning algorithms based on their strategies and operational units. A survey of this kind gives the reader not only a comparative analysis of the performance of these approaches but also an indication of their suitability for candidate big data applications.
