Publication | Closed Access
Change Distilling: Tree Differencing for Fine-Grained Source Code Change Extraction
Citations: 587
References: 49
Year: 2007
Software Maintenance, Engineering, Software Engineering, Source Code Analysis, Software Analysis, Data Science, Data Mining, Minimum Edit Script, Software Mining, Knowledge Discovery, Programming Style, Change Distilling, Computer Science, Static Program Analysis, Software Design, Source Code Changes, Software Evolution, Code Refactoring, Program Analysis, Software Testing, Formal Methods
Software evolution analysis must identify specific changes across multiple program versions. The study introduces change distilling, a tree differencing algorithm for fine-grained source code change extraction, and reports its evaluation. The algorithm improves on Chawathe et al.'s method: it matches nodes of the two compared abstract syntax trees, derives a minimum edit script, and classifies the resulting changes according to a taxonomy of source code changes. It is evaluated on a benchmark of 1,064 manually classified changes in 219 revisions of three open-source projects. The algorithm outperforms the original approach: it approximates the minimum edit script 45% better and reduces the mean absolute percentage error from 79% to 34%.
A key issue in software evolution analysis is the identification of particular changes that occur across several versions of a program. We present change distilling, a tree differencing algorithm for fine-grained source code change extraction. To that end, we have improved the existing algorithm of Chawathe et al. for extracting changes in hierarchically structured data. Our algorithm detects changes by finding both a match between the nodes of the two compared abstract syntax trees and a minimum edit script. We can identify change types between program versions according to our taxonomy of source code changes. We evaluated our change distilling algorithm with a benchmark we developed, which consists of 1,064 manually classified changes in 219 revisions from three different open source projects. We achieved significant improvements in extracting types of source code changes: our algorithm approximates the minimum edit script 45% better than the original change extraction approach by Chawathe et al. We are able to find all occurring changes and almost reach the minimum conforming edit script, i.e., we reach a mean absolute percentage error of 34%, compared to 79% reached by the original algorithm. The paper describes both the change distilling algorithm and the results of our evaluation.
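The general idea of matching tree nodes and then reading off an edit script can be illustrated with a minimal sketch. This is not the paper's algorithm: the `Node` class, the greedy `match` function, the `difflib`-based string similarity, and the 0.6 threshold are all assumptions chosen for illustration. The sketch ignores move operations and makes no attempt to produce a minimum edit script.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher


@dataclass
class Node:
    """Hypothetical, simplified AST node: a kind label plus its source text."""
    label: str
    value: str = ""
    children: list = field(default_factory=list)


def walk(node):
    """Yield the node and all of its descendants in pre-order."""
    yield node
    for child in node.children:
        yield from walk(child)


def similarity(a, b):
    """String similarity in [0, 1]; a stand-in for whatever measure a real tool uses."""
    return SequenceMatcher(None, a, b).ratio()


def match(old, new, threshold=0.6):
    """Greedily pair each old node with the first unmatched new node that has
    the same label and a sufficiently similar value (illustrative only)."""
    pairs, used = [], set()
    for o in walk(old):
        for n in walk(new):
            if id(n) in used or o.label != n.label:
                continue
            if similarity(o.value, n.value) >= threshold:
                pairs.append((o, n))
                used.add(id(n))
                break
    return pairs


def edit_script(old, new, threshold=0.6):
    """Derive update, delete, and insert operations from the matching."""
    pairs = match(old, new, threshold)
    matched_old = {id(o) for o, _ in pairs}
    matched_new = {id(n) for _, n in pairs}
    ops = [("update", o.value, n.value) for o, n in pairs if o.value != n.value]
    ops += [("delete", o.value) for o in walk(old) if id(o) not in matched_old]
    ops += [("insert", n.value) for n in walk(new) if id(n) not in matched_new]
    return ops


# Two versions of a tiny method body.
old_tree = Node("method", "foo", [
    Node("statement", "int x = 1;"),
    Node("statement", "print(x);"),
])
new_tree = Node("method", "foo", [
    Node("statement", "int x = 2;"),
    Node("statement", "return x;"),
])

for op in edit_script(old_tree, new_tree):
    print(op)
```

In this sketch the quality of the edit script depends entirely on the quality of the matching: a node that fails to match is reported as a delete plus an insert rather than a single update, which lengthens the script. This is one way to see why better node matching yields a closer approximation of the minimum edit script.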