Concepedia

Publication | Closed Access

Identifying trends in enterprise data protection systems

Citations: 25

References: 40

Year: 2015

TLDR

Enterprises rely on data protection systems for business continuity, and these systems must keep evolving as data volumes surge. Yet little research explains why they often miss their backup and recovery goals. This study examines 40,000 Symantec NetBackup deployments, analyzing over one million weekly reports collected over three years, to identify configuration, job scheduling, and data growth trends that could guide the design of automated, self-healing data protection systems. The analysis finds that misconfigurations are the main cause of inefficiency, that systems grow in bursts that leave clients unprotected at times, and that they are often configured with default parameter values.

Abstract

Enterprises routinely use data protection techniques to achieve business continuity in the event of failures. To ensure that backup and recovery goals are met in the face of the steep data growth rates of modern workloads, data protection systems need to constantly evolve. Recent studies show that these systems routinely miss their goals today. However, there is little work in the literature to understand why this is the case. In this paper, we present a study of 40,000 enterprise data protection systems deploying Symantec NetBackup, a commercial backup product. In total, we analyze over a million weekly reports collected over a period of three years. We discover that the main reason behind inefficiencies in data protection systems is misconfigurations. Furthermore, our analysis shows that these systems grow in bursts, leaving clients unprotected at times, and are often configured using the default parameter values. As a result, we believe there is potential in developing automated, self-healing data protection systems that achieve higher efficiency standards. To aid researchers in the development of such systems, we use our dataset to identify trends characterizing data protection systems with regard to configuration, job scheduling, and data growth.
