Publication | Closed Access
Towards Accurate Duplicate Bug Retrieval Using Deep Learning Techniques
Citations: 100
References: 34
Year: 2017
Unknown Venue
Software Maintenance, Engineering, Machine Learning, Software Engineering, Source Code Analysis, Software Analysis, Natural Language Processing, Similar Bug Detection, Information Retrieval, Data Science, Data Mining, Duplicate Bug Detection, Fuzzing, Knowledge Discovery, Computer Science, Duplicate Bugs, Deep Learning, Automated Repair, Retrieval Augmented Generation, Software Testing
Duplicate bug detection is the problem of identifying whether a newly reported bug duplicates an existing bug in the system, and of retrieving the original or similar bugs from the past. Solving it avoids costly rediscovery and redundant work. In typical software projects, duplicate bug reports can number in the thousands, which makes manual triage expensive in both cost and time. This makes duplicate and similar bug detection an important problem in the Software Engineering domain. However, automated solutions are not yet accurate enough in practice, despite many reported approaches using various machine learning techniques. In this work, we propose a retrieval and classification model using Siamese Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks for accurate detection and retrieval of duplicate and similar bugs. We report an accuracy close to 90% and a recall rate close to 80%, which makes practical use of such a system possible. We describe our model in detail, along with related discussions from the Deep Learning domain. Through detailed experimental results, we illustrate the effectiveness of the model in practical systems, including for repositories for which supervised training data is not available.
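The retrieval side of such a system, and one common way to score it, can be illustrated with a small sketch. This is not the paper's model: the vectors below are hand-made placeholders standing in for whatever embeddings a trained Siamese encoder would produce, the bug IDs are hypothetical, and `retrieve` and `recall_rate_at_k` are illustrative names. The sketch assumes that candidates are ranked by cosine similarity and that recall rate at k is the fraction of duplicate reports whose original bug appears among the top k retrieved candidates.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_vec, index, k):
    """Return the IDs of the k indexed bugs most similar to the query vector."""
    ranked = sorted(
        index.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [bug_id for bug_id, _ in ranked[:k]]


def recall_rate_at_k(queries, index, k):
    """Fraction of (vector, original_id) queries whose original is in the top k."""
    if not queries:
        return 0.0
    hits = sum(1 for vec, original in queries if original in retrieve(vec, index, k))
    return hits / len(queries)


# Placeholder embeddings (assumption): a real system would obtain these
# from a trained encoder applied to the bug report text.
index = {
    "BUG-1": [0.9, 0.1, 0.0],
    "BUG-2": [0.0, 1.0, 0.2],
    "BUG-3": [0.1, 0.0, 1.0],
}

# Each query pairs a new report's embedding with the ID of its known original.
queries = [
    ([0.8, 0.2, 0.1], "BUG-1"),
    ([0.1, 0.9, 0.3], "BUG-2"),
    ([0.6, 0.7, 0.0], "BUG-1"),
]

for k in (1, 2):
    print(f"recall rate at {k}: {recall_rate_at_k(queries, index, k):.2f}")
```

The third query is deliberately ambiguous: its original is not the nearest neighbour but does appear in the top two, which shows why retrieval quality is usually reported at several values of k rather than at k = 1 alone.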