computers
Article
Detecting Website Defacements Based on Machine Learning Techniques and Attack Signatures
Xuan Dau Hoang * and Ngoc Tuong Nguyen
Posts and Telecommunications Institute of Technology, Hanoi 100000, Vietnam; [email protected]
* Correspondence:
[email protected]; Tel.: +84-904-534-390
Received: 23 February 2019; Accepted: 7 May 2019; Published: 8 May 2019

Abstract: Defacement attacks have long been considered one of the prime threats to the websites and web applications of companies, enterprises, and government organizations. They can have serious consequences for website owners, including immediate interruption of website operations and damage to the owner's reputation, which may result in huge financial losses. Many solutions have been researched and deployed for monitoring and detecting website defacement attacks, such as those based on checksum comparison, diff comparison, DOM tree analysis, and complicated algorithms. However, some solutions only work on static websites, and others demand extensive computing resources. This paper proposes a hybrid defacement detection model that combines machine learning-based detection with signature-based detection. The machine learning-based component first constructs a detection profile from training data of both normal and defaced web pages. It then uses the profile to classify monitored web pages as either normal or attacked. This component can effectively detect defacements on both static and dynamic pages. The signature-based component, on the other hand, is used to boost the model's processing performance for common types of defacements. Extensive experiments show that our model produces an overall accuracy of more than 99.26% and a false positive rate of about 0.27%.
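The hybrid flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the classifier, the features, or the signatures, so the word-token features, the small Naive Bayes classifier, the `SIGNATURES` list, the `DetectionProfile` class, and the `detect` function are all assumptions made for this example. Checking signatures before the classifier is likewise one plausible reading of how signatures speed up processing.

```python
import math
import re
from collections import Counter

# Hypothetical signatures for common defacement banners (illustrative only).
SIGNATURES = [
    re.compile(pattern, re.IGNORECASE)
    for pattern in (r"hacked\s+by", r"defaced\s+by", r"owned\s+by")
]

LABELS = ("normal", "attacked")


def tokenize(html):
    """Split page text into lowercase alphanumeric tokens (assumed feature set)."""
    return re.findall(r"[a-z0-9]+", html.lower())


class DetectionProfile:
    """Tiny multinomial Naive Bayes profile; the classifier choice is an assumption."""

    def __init__(self):
        self.token_counts = {label: Counter() for label in LABELS}
        self.page_counts = Counter()
        self.vocabulary = set()

    def train(self, labelled_pages):
        """Build the profile from (html, label) pairs of normal and defaced pages."""
        for html, label in labelled_pages:
            tokens = tokenize(html)
            self.token_counts[label].update(tokens)
            self.page_counts[label] += 1
            self.vocabulary.update(tokens)

    def classify(self, html):
        """Return the label with the highest smoothed log-probability."""
        tokens = tokenize(html)
        total_pages = sum(self.page_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for label in LABELS:
            counts = self.token_counts[label]
            total_tokens = sum(counts.values())
            # Add-one smoothing on the class prior and on token likelihoods;
            # the extra +1 in the denominator reserves mass for unseen tokens.
            score = math.log((self.page_counts[label] + 1) / (total_pages + len(LABELS)))
            for token in tokens:
                score += math.log((counts[token] + 1) / (total_tokens + vocab_size + 1))
            scores[label] = score
        return max(scores, key=scores.get)


def detect(html, profile):
    """Try the cheap signature check first, then fall back to the learned profile."""
    for signature in SIGNATURES:
        if signature.search(html):
            return "attacked", "signature"
    return profile.classify(html), "ml"


# Made-up training pages, standing in for a real labelled data set.
training_pages = [
    ("<html><body><h1>Welcome to our store</h1><p>Latest products and offers</p></body></html>", "normal"),
    ("<html><body><h1>About our company</h1><p>Contact our support team</p></body></html>", "normal"),
    ("<html><body><h1>Security is an illusion</h1><p>greetz to the crew, your system was pwned</p></body></html>", "attacked"),
    ("<html><body><h1>pwned</h1><p>greetz from the crew, no security here</p></body></html>", "attacked"),
]

profile = DetectionProfile()
profile.train(training_pages)

print(detect("<h1>Hacked by someone</h1>", profile))
print(detect("<p>Welcome to our company store</p>", profile))
```

In this sketch a page that matches a signature never reaches the classifier, which is where the processing gain for common defacements would come from. Pages without a known signature, including dynamic pages whose content changes between visits, are judged by the learned profile instead.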