Publication | Closed Access

An Internet traffic analysis method with MapReduce

Citations: 82

References: 7

Year: 2010

Abstract

Internet traffic measurement and analysis have usually been performed on a high-performance server that collects and examines packet or flow traces. However, when we monitor a large volume of traffic data for detailed statistics, over a long period, or across a large-scale network, it is not easy to handle terabytes or petabytes of traffic data with a single server. Common ways to reduce a large volume of continuously monitored traffic data are packet sampling and flow aggregation, both of which result in coarse traffic statistics. Because distributed parallel processing schemes have recently been developed thanks to cloud computing platforms and cluster file systems, they could be usefully applied to analyzing big traffic data. Thus, in this paper, we propose an Internet flow analysis method based on the MapReduce software framework of the cloud computing platform for a large-scale network. From experiments with an open-source MapReduce system, Hadoop, we have verified that the MapReduce-based flow analysis method improves the flow statistics computation time by 72% compared with the popular flow data processing tool flow-tools running on a single host. In addition, we show that MapReduce-based programs complete the flow analysis job despite a single node failure.
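
To illustrate the general idea of computing flow statistics in the MapReduce style, the following is a minimal single-process sketch in Python. It is not the paper's implementation and does not use the Hadoop API. The record layout (`FlowRecord`), the choice of destination port as the grouping key, and the function names `map_flow`, `shuffle`, `reduce_flows`, and `run_job` are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical flow record layout (an assumption, not the paper's format):
# (src_ip, dst_ip, dst_port, packets, bytes)
FlowRecord = Tuple[str, str, int, int, int]
Stats = Tuple[int, int, int]  # (flow count, packet count, byte count)


def map_flow(record: FlowRecord) -> Iterator[Tuple[int, Stats]]:
    """Map step: emit (dst_port, (1, packets, bytes)) for one flow record."""
    _src, _dst, dst_port, packets, octets = record
    yield dst_port, (1, packets, octets)


def shuffle(pairs: Iterable[Tuple[int, Stats]]) -> Dict[int, List[Stats]]:
    """Group mapped values by key, as a MapReduce framework does between phases."""
    grouped: Dict[int, List[Stats]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_flows(key: int, values: List[Stats]) -> Tuple[int, Stats]:
    """Reduce step: sum flow, packet, and byte counts for one key."""
    flows = sum(v[0] for v in values)
    packets = sum(v[1] for v in values)
    octets = sum(v[2] for v in values)
    return key, (flows, packets, octets)


def run_job(records: Iterable[FlowRecord]) -> Dict[int, Stats]:
    """Run map, shuffle, and reduce in a single process."""
    pairs = (pair for record in records for pair in map_flow(record))
    return dict(reduce_flows(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    sample = [
        ("10.0.0.1", "10.0.0.9", 80, 10, 1000),
        ("10.0.0.2", "10.0.0.9", 80, 5, 500),
        ("10.0.0.3", "10.0.0.8", 53, 1, 80),
    ]
    for port, stats in sorted(run_job(sample).items()):
        print(port, stats)
```

In a real cluster the map and reduce functions would run in parallel on different nodes over blocks of the flow files, and the framework would handle the grouping step and re-run tasks from a failed node; this sketch only shows the shape of the per-key aggregation.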