Publication | Closed Access
Watermain breaks and data: the intricate relationship between data availability and accuracy of predictions
Citations: 37
References: 42
Year: 2020
Hydrological Prediction, Engineering, Machine Learning, Intricate Relationship, Mining Methods, Many Water Utilities, Decision Analytics, Xgboost Machine, Data Science, Data Mining, Water Problem, Management, Decision Tree Learning, Data Management, Prediction Modelling, Real World Data, Data Lake, Risk Analytics, Predictive Analytics, Geography, Computer Science, Forecasting, Watermain Breaks, Pipe Replacement, Data Modeling, Data Availability
Many water utilities are facing a crisis of aging infrastructure. Aging pipes are deteriorating, and pipe breaks are increasing. A variety of pipe break prediction models have been developed to identify which pipes are most likely to break next, in order to assist utilities in prioritizing pipe replacement. This paper investigates the role of data in pipe break prediction model accuracy. A gradient boosting decision tree machine learning model, a Weibull proportional hazard probabilistic model, and two ranking models (based on ‘age of pipe’ and ‘previous-break’) were calibrated using varying numbers of pipes, years of break records, and input variables. The results indicate how the different model types are impacted by data limitations. Overall, this study finds the Age-based approach to be inaccurate, and the XGBoost machine learning model demonstrates superior predictive capability when the training dataset contains more than 5 years of break records and 2,000 or more pipes.
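The two ranking models named in the abstract can be pictured with a minimal Python sketch. This is an illustration only, not the authors' implementation: the `Pipe` record, its field names, the function names, and the sample values are all hypothetical, and the abstract does not say how ties or missing data were handled.

```python
from dataclasses import dataclass


@dataclass
class Pipe:
    # Hypothetical record layout; the paper's actual input variables are not listed here.
    pipe_id: str
    install_year: int
    previous_breaks: int


def rank_by_age(pipes, current_year):
    """Order pipes oldest first (an assumed reading of the 'age of pipe' ranking)."""
    return sorted(pipes, key=lambda p: current_year - p.install_year, reverse=True)


def rank_by_previous_breaks(pipes):
    """Order pipes by recorded break count, most breaks first
    (an assumed reading of the 'previous-break' ranking)."""
    return sorted(pipes, key=lambda p: p.previous_breaks, reverse=True)


# Made-up sample data, for illustration only.
pipes = [
    Pipe("A", 1950, 0),
    Pipe("B", 1985, 3),
    Pipe("C", 1970, 1),
]

age_order = [p.pipe_id for p in rank_by_age(pipes, 2020)]
break_order = [p.pipe_id for p in rank_by_previous_breaks(pipes)]
```

With this sample, the two rankings disagree on which pipe comes first, which is the kind of difference a comparison of ranking approaches would need to evaluate against observed breaks.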