A language independent approach for detecting duplicated code

TLDR

Code duplication hampers maintenance of large software systems, and existing detection techniques depend on brittle parsers that struggle across languages. This paper demonstrates that a language‑independent visual approach can overcome these limitations. The authors present a parser‑free tool that detects substantial duplication and validate it on case studies spanning four languages and source sizes from 256 KB to 13 MB.

Abstract

Code duplication is one of the factors that severely complicate the maintenance and evolution of large software systems. Techniques for detecting duplicated code exist but rely mostly on parsers, a technology that has proven to be brittle in the face of different languages and dialects. In this paper we show that it is possible to circumvent this hindrance by applying a language-independent and visual approach, i.e. a tool that requires no parsing, yet is able to detect a significant amount of code duplication. We validate our approach on a number of case studies, involving four different implementation languages and ranging from 256 KB up to 13 MB of source code size.
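
The abstract does not describe the detection mechanism itself. As an illustration only, the following minimal sketch shows one way a parser-free detector could work, assuming plain line-by-line string comparison after whitespace removal. The function names (`normalize`, `matching_line_pairs`, `duplicated_runs`) and the `min_len` threshold are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def normalize(line: str) -> str:
    """Remove all whitespace so that layout differences do not hide a match."""
    return "".join(line.split())


def matching_line_pairs(a: list[str], b: list[str]) -> list[tuple[int, int]]:
    """Return (i, j) pairs where line i of `a` equals line j of `b` after normalization."""
    index = defaultdict(list)
    for j, line in enumerate(b):
        key = normalize(line)
        if key:  # blank lines are ignored
            index[key].append(j)
    pairs = []
    for i, line in enumerate(a):
        for j in index.get(normalize(line), []):
            pairs.append((i, j))
    return pairs


def duplicated_runs(pairs, min_len=3):
    """Group matches into diagonal runs (i, j), (i+1, j+1), ... of at least `min_len` lines.

    Each result is (start_in_a, start_in_b, length).
    """
    remaining = set(pairs)
    runs = []
    for i, j in sorted(remaining):
        if (i - 1, j - 1) in remaining:
            continue  # this match continues a run that started earlier
        length = 1
        while (i + length, j + length) in remaining:
            length += 1
        if length >= min_len:
            runs.append((i, j, length))
    return runs


if __name__ == "__main__":
    a = ["int x = 0;", "x += 1;", "print(x);", "return;"]
    b = ["// other", "int x=0;", "x  += 1;", "print(x);", "done"]
    print(duplicated_runs(matching_line_pairs(a, b)))
```

Because the comparison works on raw text lines, nothing in the sketch depends on the grammar of a particular language. The matched pairs could also be plotted as points in a scatter plot, where a diagonal run of points would correspond to a duplicated sequence of lines; whether this matches the visualization used in the paper is an assumption.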
