
TLDR

Binary code search is crucial for plagiarism detection, malware analysis, and vulnerability auditing, yet its effectiveness is hampered by vast syntactic and structural differences across compilers, architectures, and operating systems. This work introduces BINGO, a scalable and robust binary search engine that supports multiple architectures and operating systems. BINGO achieves this by selectively inlining relevant library and user-defined functions to capture full function semantics, applying architecture- and OS-neutral filtering to eliminate irrelevant target functions, and using length-variant partial traces to model functions independently of program structure. Experiments demonstrate that BINGO can efficiently locate semantically similar functions across architecture and OS boundaries despite structural distortions, and it uncovered a zero-day vulnerability in Adobe PDF Reader.

Abstract

Binary code search has received much attention recently due to its impactful applications, e.g., plagiarism detection, malware detection, and software vulnerability auditing. However, developing an effective binary code search tool is challenging because of the large syntactic and structural differences in binaries that result from different compilers, architectures, and OSs. In this paper, we propose BINGO, a scalable and robust binary search engine supporting various architectures and OSs. The key contribution is a selective inlining technique that captures the complete function semantics by inlining relevant library and user-defined functions. In addition, architecture- and OS-neutral function filtering is proposed to dramatically reduce the number of irrelevant target functions. We also introduce length-variant partial traces to model binary functions in a program-structure-agnostic fashion. The experimental results show that BINGO can find semantically similar functions across architecture and OS boundaries, even in the presence of program structure distortion, in a scalable manner. Using BINGO, we also discovered a zero-day vulnerability in Adobe PDF Reader, a COTS binary.
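
To make the idea of length-variant partial traces more concrete, the following is a minimal sketch and not BINGO's actual implementation. It assumes a function's control-flow graph is given as a plain dictionary from basic-block identifier to successor identifiers, enumerates every path of 1 to max_len consecutive blocks, and compares two functions by Jaccard similarity over those paths. The names partial_traces and jaccard, the dictionary representation, and the use of Jaccard similarity are all illustrative assumptions; the abstract does not specify how traces are extracted or compared.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical CFG representation: basic-block id -> list of successor ids.
CFG = Dict[str, List[str]]
Trace = Tuple[str, ...]


def partial_traces(cfg: CFG, max_len: int) -> Set[Trace]:
    """Return every path of 1..max_len consecutive basic blocks in the CFG."""
    traces: Set[Trace] = set()
    frontier: List[Trace] = [(block,) for block in cfg]  # all length-1 traces
    for _ in range(max_len):
        traces.update(frontier)
        # Extend each trace by one successor; skip revisits so loops terminate.
        frontier = [
            trace + (succ,)
            for trace in frontier
            for succ in cfg.get(trace[-1], [])
            if succ not in trace
        ]
    return traces


def jaccard(a: Set[Trace], b: Set[Trace]) -> float:
    """Jaccard similarity of two trace sets (assumed comparison metric)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Two toy functions: g is f with one branch removed.
f_cfg: CFG = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
g_cfg: CFG = {"A": ["B"], "B": ["D"], "D": []}

f_traces = partial_traces(f_cfg, max_len=3)
g_traces = partial_traces(g_cfg, max_len=3)
similarity = jaccard(f_traces, g_traces)
```

In this toy setup the block identifiers stand in for whatever semantic summary a real system would attach to each block; because traces of several lengths are collected, two functions can still share many traces when one of them has an extra or missing branch.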
