Abstract—With the growth of the Internet, the volume of online information is expanding rapidly, and search engines have become the backbone of information management. However, the large number of malicious websites appearing in search results poses a serious threat to users. Most existing systems for detecting malicious websites focus on a specific type of attack, while available blacklist-based browser extensions cannot keep up with the countless websites in existence. In this paper, we present a lightweight approach that uses static analysis techniques to quickly identify malicious sites, including malware, drive-by-download, and phishing sites. We extract a comprehensive set of features and classify a labeled dataset using various machine learning algorithms. Large-scale evaluation on our dataset shows that the classification accuracy reaches 97.5% with low overhead. Furthermore, we implemented a Chrome plugin that detects malicious websites in search results based on our classification model.
Index Terms—Malicious websites, feature extraction, machine learning.
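As a rough illustration of what static, lightweight feature extraction can look like, the following sketch derives a few lexical features from a URL without fetching or executing the page. The feature names, the token list, and the function `extract_url_features` are hypothetical and chosen for illustration only; the abstract does not enumerate the features used in the paper, and the classifiers applied to them are likewise not specified here.

```python
import re
from urllib.parse import urlparse

# Hypothetical token list for illustration; the paper's actual
# feature set is not given in the abstract.
SUSPICIOUS_TOKENS = ("login", "verify", "update", "secure", "account")

# Matches a dotted-quad host such as 192.168.0.1 (no range checking).
IPV4_RE = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def extract_url_features(url: str) -> dict:
    """Return a small dictionary of lexical features for one URL.

    Only the URL string is inspected, so the page itself is never
    downloaded or executed (static analysis).
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    lowered = url.lower()
    return {
        "url_length": len(url),
        "num_dots_in_host": host.count("."),
        "host_is_ipv4": bool(IPV4_RE.match(host)),
        "has_at_symbol": "@" in url,
        # Number of distinct suspicious tokens present in the URL.
        "num_suspicious_tokens": sum(tok in lowered for tok in SUSPICIOUS_TOKENS),
        "uses_https": parsed.scheme == "https",
    }


if __name__ == "__main__":
    print(extract_url_features("http://192.168.0.1/login/verify.php"))
```

Feature dictionaries of this kind could then be converted to numeric vectors and passed to any standard classifier trained on labeled benign and malicious examples.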
The authors were with Hunan University, Changsha, China (e-mail: zhouhao6278@yahoo.com.cn; jhsun@aimlab.org; haochen@aimlab.org).
Cite: Hao Zhou, Jianhua Sun, and Hao Chen, "Malicious Websites Detection and Search Engine Protection," Journal of Advances in Computer Networks, vol. 1, no. 3, pp. 260-264, 2013.