Publication | Closed Access

Binary Code Clone Detection across Architectures and Compiling Configurations

Citations: 83

References: 30

Year: 2017

Abstract

Binary code clone (or similarity) detection is a fundamental technique for many important applications, such as plagiarism detection, malware analysis, software vulnerability assessment, and program comprehension. With the prevalence of smart and IoT (Internet of Things) devices, more and more programs are being ported from traditional desktop platforms (e.g., IA-32) to the ARM and MIPS architectures, so it is imperative to detect cloned binary code across architectures. However, because different architectures have incomparable instruction sets and binaries may be built with different compiling configurations, it is difficult to detect binary code clones with traditional syntax- or structure-based methods. To address this problem, we propose a semantics-based approach. We recognize the arguments and indirect jump targets of each binary function, and emulate executions of those functions to extract semantic signatures that help measure the similarity of functions. The approach has been implemented in a prototype system named CACompare, which detects cloned binary functions across architectures and compiling configurations. It supports comparisons between mainstream architectures (IA-32, ARM, and MIPS) and is able to analyze binaries on the Linux platform. The experimental results show that CACompare is not only effective in dealing with binaries of different architectures and varying compiling configurations, but also improves the accuracy of binary code clone detection compared to state-of-the-art solutions.
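
The core idea of comparing functions by observed behavior rather than by instruction syntax can be illustrated with a toy sketch. This is not CACompare's implementation: the abstract does not specify what its semantic signatures contain or how similarity is scored. The sketch assumes that a signature is simply the tuple of 32-bit return values produced on a fixed set of inputs, and that similarity is the fraction of inputs on which two signatures agree. Plain Python callables stand in for emulated binary functions, and all names (semantic_signature, similarity, sum_loop, sum_unrolled, multiply) are hypothetical.

```python
from typing import Callable, Sequence, Tuple

Signature = Tuple[int, ...]


def semantic_signature(func: Callable[..., int],
                       inputs: Sequence[Tuple[int, ...]]) -> Signature:
    """Run func on every input tuple and record its 32-bit return value.

    Assumption: return values alone serve as the signature here; a real
    system would likely observe more behavior during emulation.
    """
    return tuple(func(*args) & 0xFFFFFFFF for args in inputs)


def similarity(sig_a: Signature, sig_b: Signature) -> float:
    """Fraction of test inputs on which the two signatures agree."""
    if not sig_a or len(sig_a) != len(sig_b):
        raise ValueError("signatures must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


# Hypothetical stand-ins for one source function built two different ways.
def sum_loop(n: int, step: int) -> int:
    total = 0
    for i in range(0, n, step):
        total += i
    return total


def sum_unrolled(n: int, step: int) -> int:
    # Same semantics as sum_loop, but structured like a 2x unrolled loop.
    total, i = 0, 0
    while i + step < n:
        total += i + (i + step)
        i += 2 * step
    if i < n:
        total += i
    return total


# Hypothetical unrelated function with the same argument count.
def multiply(n: int, step: int) -> int:
    return n * step


# The same inputs are fed to every function so signatures are comparable.
TEST_INPUTS = [(10, 3), (7, 1), (0, 5), (16, 4), (5, 2)]

sig_loop = semantic_signature(sum_loop, TEST_INPUTS)
sig_unrolled = semantic_signature(sum_unrolled, TEST_INPUTS)
sig_multiply = semantic_signature(multiply, TEST_INPUTS)

clone_score = similarity(sig_loop, sig_unrolled)
other_score = similarity(sig_loop, sig_multiply)
print(clone_score, other_score)
```

In this sketch the two structurally different variants of the same computation receive a higher score than the unrelated function, which is the property a behavior-based comparison relies on when syntax and control-flow structure differ between builds.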
