Publication | Closed Access
A Genetic Algorithm Based Data Replica Placement Strategy for Scientific Applications in Clouds
Citations: 86
References: 29
Year: 2015
Distributed File System, Cluster Computing, Engineering, Cloud Computing Architecture, Cloud Resource Management, Data Science, Management, Genetic Algorithm, Data Integration, Distributed Cloud, Cloud Data Management, Parallel Computing, Tripartite Graph, Data Management, Computer Science, Data Replication, Scientific Applications, Edge Computing, Cloud Computing, Distributed Data Store, Big Data
Cloud computing is a promising distributed computing platform for big data applications such as scientific applications, because abundant resources can be obtained from cloud services to process and store both existing and generated application datasets. However, when tasks process big data stored in distributed data centers, the unavoidable data movement incurs high bandwidth cost and execution delay. In this paper, we construct a tripartite-graph-based model to formulate the data replica placement problem and propose a genetic-algorithm-based data replica placement strategy for scientific applications to reduce data transmission in the cloud. Our approach can reduce 1) the size of moved data, 2) the time spent on data movement, and 3) the number of movements. We conduct experiments comparing the proposed strategy with the random placement strategy used in the Hadoop Distributed File System (HDFS); the results show that our strategy performs better for scientific applications in clouds.
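The general idea of searching for a placement with a genetic algorithm can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give their encoding, operators, or fitness function. The sketch assumes a toy instance (the dataset names, sizes, and task list are hypothetical), keeps exactly one copy of each dataset rather than multiple replicas, and minimizes only the total size of moved data, ignoring movement time and movement count. The functions `moved_size` and `genetic_placement` are illustrative names.

```python
import random

# Hypothetical toy instance; names and sizes are illustrative, not from the paper.
DATASET_SIZE = {"d0": 40, "d1": 10, "d2": 25, "d3": 5}  # dataset -> size in GB
TASKS = [  # (data center that runs the task, datasets the task reads)
    (0, ["d0", "d1"]),
    (1, ["d1", "d2"]),
    (2, ["d0", "d3"]),
    (0, ["d0", "d2"]),
]
NUM_CENTERS = 3
DATASETS = sorted(DATASET_SIZE)


def moved_size(placement):
    """Total size of data that tasks must fetch from a remote data center.

    placement[i] is the index of the data center holding DATASETS[i].
    """
    where = dict(zip(DATASETS, placement))
    return sum(
        DATASET_SIZE[d]
        for center, needed in TASKS
        for d in needed
        if where[d] != center
    )


def genetic_placement(generations=50, pop_size=20, mutation_rate=0.1, seed=0):
    """Search for a low-cost placement with a simple genetic algorithm."""
    rng = random.Random(seed)
    n = len(DATASETS)
    # Each chromosome assigns one data center to every dataset.
    population = [
        [rng.randrange(NUM_CENTERS) for _ in range(n)] for _ in range(pop_size)
    ]
    for _ in range(generations):
        population.sort(key=moved_size)
        next_gen = [population[0][:]]  # elitism: carry over the best placement
        while len(next_gen) < pop_size:
            # Binary tournament selection for each parent.
            p1 = min(rng.sample(population, 2), key=moved_size)
            p2 = min(rng.sample(population, 2), key=moved_size)
            # One-point crossover.
            cut = rng.randrange(1, n)
            child = p1[:cut] + p2[cut:]
            # Mutation: occasionally reassign a dataset to a random data center.
            child = [
                rng.randrange(NUM_CENTERS) if rng.random() < mutation_rate else g
                for g in child
            ]
            next_gen.append(child)
        population = next_gen
    return min(population, key=moved_size)


if __name__ == "__main__":
    best = genetic_placement()
    print(dict(zip(DATASETS, best)), moved_size(best))
```

A random placement, the baseline the abstract attributes to HDFS, corresponds here to drawing a single chromosome without any search. A fuller model would need replica counts, storage capacities, and the time and count objectives named in the abstract.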