Phishing Websites Detection based on Phishing Characteristics in the Webpage Source Code

TLDR

The World Wide Web Consortium (W3C) develops web standards, and phishing sites often violate these standards in their attempts to steal user credentials. The study proposes a source‑code‑based phishing detection method that extracts W3C‑violating characteristics to assess website security. The method scans webpage source code for such characteristics, deducts a weight for each one found, and computes a security percentage, where higher values indicate legitimate sites. In a test on one legitimate site and one phishing site, the phishing site had the lower security percentage, which suggests the approach can detect phishing sites.

Abstract

The World Wide Web Consortium (W3C) is the international standards organization for the World Wide Web (WWW). It develops standards, specifications, and recommendations that enhance interoperability, maximize consensus about the content of the web, and define major parts of what makes the World Wide Web work. Phishing is a type of Internet scam that uses fraudulent websites to obtain a user's credentials, such as passwords, credit card numbers, bank account details, and other sensitive information. Some characteristics of webpage source code distinguish phishing websites from legitimate ones and violate the W3C standards, so phishing attacks can be detected by searching the source code of a webpage for these characteristics. In this paper, we propose a phishing detection approach based on checking the webpage source code. We derive a set of phishing characteristics from the W3C standards to evaluate the security of websites and check the webpage source code for each characteristic; whenever a phishing characteristic is found, we deduct from the initial secure weight. Finally, we calculate a security percentage from the final weight: a high percentage indicates a secure website, while a lower percentage indicates that the website is most likely a phishing website. We check the webpage source code of one legitimate website and one phishing website and compare their security percentages; the phishing website has a lower security percentage than the legitimate website. Our approach can therefore detect phishing websites by checking for phishing characteristics in the webpage source code.
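
The scoring scheme described in the abstract (start from an initial secure weight, deduct for each phishing characteristic found, then convert the final weight to a percentage) can be illustrated with a minimal Python sketch. The abstract does not list the actual characteristics, their weights, or the initial weight, so the three checks, the penalty values, and `INITIAL_WEIGHT` below are hypothetical placeholders chosen only to show the mechanics.

```python
import re

# Assumed starting "secure" weight; the abstract does not give the real value.
INITIAL_WEIGHT = 10.0


def has_insecure_form(src: str) -> bool:
    """Hypothetical check: a form submits to a plain http:// URL."""
    return re.search(r'<form[^>]*\baction\s*=\s*["\']http://', src, re.I) is not None


def has_ip_url(src: str) -> bool:
    """Hypothetical check: a link or form target uses a raw IPv4 address."""
    pattern = r'\b(?:href|src|action)\s*=\s*["\']https?://\d{1,3}(?:\.\d{1,3}){3}'
    return re.search(pattern, src, re.I) is not None


def lacks_doctype(src: str) -> bool:
    """Hypothetical check: the document does not begin with a DOCTYPE."""
    return not src.lstrip().lower().startswith("<!doctype")


# (check function, weight deducted when the check fires); values are assumptions.
CHECKS = [
    (has_insecure_form, 4.0),
    (has_ip_url, 4.0),
    (lacks_doctype, 2.0),
]


def security_percentage(src: str) -> float:
    """Deduct a penalty per characteristic found and scale the result to 0-100."""
    weight = INITIAL_WEIGHT
    for check, penalty in CHECKS:
        if check(src):
            weight -= penalty
    weight = max(weight, 0.0)  # never report a negative score
    return 100.0 * weight / INITIAL_WEIGHT
```

Calling `security_percentage` on the source of two pages and comparing the results mirrors the comparison in the abstract: the page that triggers more checks receives the lower percentage. A real implementation would replace the placeholder checks with the characteristics the paper derives from the W3C standards and would likely parse the HTML rather than rely on regular expressions.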
