The weighted combined algorithm: a linkage algorithm for software clustering

TLDR

Software systems evolve with changing requirements, which degrades their structure and leaves documentation outdated, making them harder to understand. Researchers have therefore turned to clustering techniques for architecture recovery. The study aims to adapt clustering algorithms and similarity measures specifically for software systems. It introduces a novel algorithm for computing inter-cluster distance. Evaluating popular similarity measures on two test systems, the authors identify variations that improve clustering performance for software.

Abstract

Software systems need to evolve as business requirements, technology and environment change. As software is modified to accommodate the required changes, its structure deteriorates and deviates increasingly from the actual design and architecture. Very often, documentation is not updated to reflect these changes, making these systems more and more difficult to understand, manage and maintain. Researchers have applied various techniques to recover the components and architecture of such software systems. The use of clustering techniques has recently been explored for reverse engineering and software architecture recovery. There is a need to tailor clustering algorithms and similarity measures to software. We present a new algorithm for finding inter-cluster distance. We compare the performance of some popular similarity measures for this algorithm using two test systems, and suggest variations of the similarity measures that show better results for software clustering.
