Peculiar: Smart Contract Vulnerability Detection Based on Crucial Data Flow Graph and Pre-training Techniques

TLDR

Smart contracts are being adopted rapidly across industries, but their immutability and frequent bugs cause significant economic losses and have prompted closer security scrutiny. Current detection methods, both static analysis based on hand-crafted heuristics and deep learning, remain inadequate. This study introduces Peculiar, a vulnerability detection framework that combines pre-training with a crucial data flow graph to identify smart contract flaws. Peculiar replaces the traditional, more complex data flow graph with a streamlined crucial data flow graph, which helps the model focus on essential features, and it adopts pre-training to benefit from advances in NLP. On 40,932 smart contract files, Peculiar achieves 91.80% precision and 92.40% recall for reentrancy detection, compared with 79.37% precision and 70.50% recall for Smartcheck. An ablation experiment shows that both the crucial data flow graph and the pre-trained model contribute significantly to this performance.

Abstract

Smart contracts, with their natural economic attributes, have been widely and rapidly adopted in various fields. However, bugs and vulnerabilities in smart contracts have caused huge economic losses, which has drawn greater attention to smart contract security. Because smart contracts are immutable once deployed, developers are more willing to conduct security checks before deployment. Nonetheless, existing smart contract vulnerability detection techniques are far from sufficient: static analysis approaches rely heavily on manually crafted heuristics, which are difficult to reuse across different types of vulnerabilities, while deep learning based approaches have limitations of their own. In this study, we propose a novel approach, Peculiar, which uses pre-training techniques to detect smart contract vulnerabilities based on a crucial data flow graph. Compared with the traditional data flow graph already used in existing approaches, the crucial data flow graph is less complex and does not introduce an unnecessarily deep hierarchy, which makes it easier for the model to focus on the critical features. Moreover, we incorporate pre-training into our model because of the dramatic improvements it has achieved on a variety of NLP tasks. Our empirical results show that Peculiar achieves 91.80% precision and 92.40% recall in detecting the reentrancy vulnerability, one of the most severe and common smart contract vulnerabilities, on 40,932 smart contract files, which is significantly better than state-of-the-art methods (e.g., Smartcheck achieves 79.37% precision and 70.50% recall). Another experiment shows that Peculiar is more discerning than existing approaches in identifying reentrancy vulnerabilities. The ablation experiment reveals that both the crucial data flow graph and the pre-trained model contribute significantly to the performance of Peculiar.
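
The precision and recall figures above can be folded into a single number for easier comparison. The following is a minimal sketch, assuming the standard F1 definition (the harmonic mean of precision and recall). The abstract itself does not report F1, and the names `f1_score`, `reported`, and `f1_by_tool` are illustrative choices, not part of Peculiar.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall quoted in the abstract, converted from percentages
# to fractions. The F1 values derived from them are our own arithmetic.
reported = {
    "Peculiar": (0.9180, 0.9240),
    "Smartcheck": (0.7937, 0.7050),
}

f1_by_tool = {name: f1_score(p, r) for name, (p, r) in reported.items()}

for name, f1 in f1_by_tool.items():
    print(f"{name}: F1 = {f1:.2%}")
```

Under this definition the quoted figures correspond to an F1 of roughly 92.1% for Peculiar and roughly 74.7% for Smartcheck, so the gap in the combined score is about as large as the gap in recall.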
