Publication | Closed Access
Mining source code descriptions from developer communications
Citations: 90
References: 20
Year: 2012
Unknown Venue
Software Maintenance, Engineering, Software Engineering, Source Code Analysis, Semantic Web, Software Analysis, Text Mining, Natural Language Processing, Empirical Software Engineering Research, Code Descriptions, Information Retrieval, Data Science, Software Mining, Source Code, Knowledge Discovery, Computer Science, Code Representation, Source Code Descriptions, Static Program Analysis, Software Design, Program Analysis, Software Testing
Very often, source code lacks comments that adequately describe its behavior. In such situations developers need to infer knowledge from the source code itself or to search for source code descriptions in external artifacts. We argue that messages exchanged among contributors/developers, in the form of bug reports and emails, are a useful source of information to help understand source code. However, such communications are unstructured and usually not explicitly meant to describe specific parts of the source code. Developers searching for code descriptions within communications face the challenge of filtering large amounts of data to extract the pieces of information that are important to them. We propose an approach to automatically extract method descriptions from communications in bug tracking systems and mailing lists. We have evaluated the approach on bug reports and mailing lists from two open source systems (Lucene and Eclipse). The results indicate that mailing lists and bug reports contain relevant descriptions of about 36% of the methods from Lucene and 7% from Eclipse, and that the proposed approach is able to extract such descriptions with a precision of up to 79% for Eclipse and 87% for Lucene. The extracted method descriptions can help developers in understanding the code and could also be used as a starting point for source code re-documentation.
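The abstract does not spell out the extraction steps, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a naive heuristic: a sentence is a candidate description of a method if it mentions the method name followed by an opening parenthesis. The function names (`split_sentences`, `candidate_descriptions`, `precision`) and the sample messages are hypothetical. The `precision` helper shows how a figure such as the reported 79% or 87% would be computed from a manually validated set of relevant sentences.

```python
import re


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def candidate_descriptions(messages, method_names):
    # Map each method name to the sentences that mention it as a call,
    # i.e. the name followed by "(" (hypothetical heuristic).
    found = {name: [] for name in method_names}
    for message in messages:
        for sentence in split_sentences(message):
            for name in method_names:
                if re.search(r"\b" + re.escape(name) + r"\s*\(", sentence):
                    found[name].append(sentence)
    return found


def precision(extracted, relevant):
    # Fraction of extracted sentences that a human judged relevant.
    if not extracted:
        return 0.0
    return sum(1 for s in extracted if s in relevant) / len(extracted)


# Hypothetical mailing-list message, for illustration only.
messages = [
    "The method addDocument() adds a document to the index. "
    "I will look at this tomorrow. "
    "Calling close() flushes pending changes."
]
result = candidate_descriptions(messages, ["addDocument", "close", "optimize"])
```

In this sketch, methods that are never mentioned (here `optimize`) map to an empty list, which mirrors the observation that only a fraction of methods have descriptions in the communications at all.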