Abstract—Spammers are constantly creating sophisticated new weapons in their arms race with anti-spam technology, the latest of which is image-based spam. In general terms, image spam is a type of email in which the text message is presented as a picture in an image file. This prevents text-based spam filters from detecting and blocking such messages. Several techniques are available for detecting image spam (DNSBL, greylisting, spamtraps, etc.). Each has its own advantages and disadvantages, and because of their weaknesses the techniques remain controversial relative to one another. This paper presents a general study of image spam detection using file properties, histograms, and the Hough transform, which are explained in the following sections. The proposed methods are tested on a spam archive dataset and are found to be effective in identifying both types of image spam: (1) messages containing only images and (2) messages containing both text and images. The goal is to automatically classify an image directly as spam or ham. The proposed method is able to identify a large number of malicious images while being computationally inexpensive.
Index Terms—File properties, histogram, Hough transform, spam archive dataset.
The authors are with the University of Computer Studies, Mandalay, Myanmar (e-mail: zinmarwinn19@gmail.com).
Cite: Zin Mar Win and Nyein Aye, "Detecting Image Spam Based on File Properties, Histogram and Hough Transform," Journal of Advances in Computer Networks, vol. 2, no. 4, pp. 287-292, 2014.