Publication | Closed Access
An Internet traffic analysis method with MapReduce
Citations: 82
References: 7
Year: 2010
Unknown Venue
Cluster Computing, Internet Traffic Analysis, Engineering, Data Science, Edge Computing, Distributed Data Analytics, Cloud Computing, Flow Analysis Job, Internet Traffic Measurement, Large Volume, Parallel Programming, Computer Science, Map-reduce, Parallel Computing, Network Traffic Measurement, Data Streaming Architecture, Data Management, Big Data
Internet traffic measurement and analysis have usually been performed on a high-performance server that collects and examines packet or flow traces. However, when we monitor a large volume of traffic data, whether for detailed statistics, over a long period, or across a large-scale network, it is not easy to handle terabytes or petabytes of traffic data with a single server. Common ways to reduce a large volume of continuously monitored traffic data are packet sampling and flow aggregation, both of which yield only coarse traffic statistics. Distributed parallel processing schemes have recently been developed on top of cloud computing platforms and cluster filesystems, and they can be usefully applied to analyzing big traffic data. Thus, in this paper, we propose an Internet flow analysis method for a large-scale network based on the MapReduce software framework of the cloud computing platform. From experiments with an open-source MapReduce system, Hadoop, we have verified that the MapReduce-based flow analysis method improves the flow statistics computation time by 72% when compared with the popular flow data processing tool, flow-tools, on a single host. In addition, we show that MapReduce-based programs complete the flow analysis job even when a single node fails.
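The general shape of a MapReduce-style flow statistics job can be illustrated with a minimal sketch in plain Python. This is not the paper's implementation and does not use the Hadoop API: the record layout, the choice of destination port as the grouping key, and the names `map_flow`, `reduce_flows`, and `run_job` are all assumptions made for illustration, and the shuffle phase is simulated in memory on a single process.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical flow record layout (an assumption, not the paper's format):
# (src_ip, dst_ip, dst_port, packets, octets)
FlowRecord = Tuple[str, str, int, int, int]
# Per-key statistics: (flow count, packet count, byte count)
Stats = Tuple[int, int, int]


def map_flow(record: FlowRecord) -> Iterator[Tuple[int, Stats]]:
    """Map step: emit one (key, value) pair per flow record.

    The key here is the destination port; any flow field could be used.
    """
    _src_ip, _dst_ip, dst_port, packets, octets = record
    yield dst_port, (1, packets, octets)


def reduce_flows(key: int, values: Iterable[Stats]) -> Tuple[int, Stats]:
    """Reduce step: sum the flow, packet, and byte counts for one key."""
    flows = packets = octets = 0
    for flow_count, packet_count, octet_count in values:
        flows += flow_count
        packets += packet_count
        octets += octet_count
    return key, (flows, packets, octets)


def run_job(records: Iterable[FlowRecord]) -> Dict[int, Stats]:
    """Simulate map, shuffle (group by key), and reduce in one process."""
    grouped: Dict[int, List[Stats]] = defaultdict(list)
    for record in records:
        for key, value in map_flow(record):
            grouped[key].append(value)
    return dict(reduce_flows(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = [
        ("10.0.0.1", "10.0.0.9", 80, 10, 1500),
        ("10.0.0.2", "10.0.0.9", 80, 5, 700),
        ("10.0.0.1", "10.0.0.8", 53, 1, 80),
    ]
    for port, stats in sorted(run_job(sample).items()):
        print(port, stats)
```

In a real cluster, the framework would run the map step in parallel over blocks of flow files and route each key to a reducer. Because map tasks are independent and can be re-executed on another node, a job of this shape can still complete after a single node failure.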