Thwarting E-mail Spam Laundering

MENGJUN XIE, HENG YIN, and HAINING WANG, College of William and Mary

Laundering e-mail spam through open proxies or compromised PCs is a widely used trick to conceal real spam sources and reduce cost in the underground e-mail spam industry. Spammers have plagued the Internet by exploiting a large number of spam proxies. The ability to break spam laundering and deter spamming activities close to their sources, which would greatly benefit not only e-mail users but also victim ISPs, is in great demand but still missing. In this article, we reveal one salient characteristic of proxy-based spamming activities, namely packet symmetry, by analyzing protocol semantics and timing causality. Based on the packet symmetry exhibited in spam laundering, we propose a simple and effective technique, DBSpam, to detect and break spam laundering activities inside a customer network online. Monitoring the bidirectional traffic passing through a network gateway, DBSpam utilizes a simple statistical method, the Sequential Probability Ratio Test, to detect the occurrence of spam laundering in a timely manner. To balance the goals of promptness and accuracy, we introduce a noise-reduction technique in DBSpam, after which the laundering path can be identified more accurately. Then DBSpam activates its spam suppressing mechanism to break the spam laundering. We implement a prototype of DBSpam based on libpcap, and validate its efficacy on spam detection and suppression through both theoretical analyses and trace-based experiments.

Categories and Subject Descriptors: C.2.0 [Computer Communication Networks]: Security and protection

General Terms: Security

Additional Key Words and Phrases: Spam, proxy, SPRT

ACM Reference Format: Xie, M., Yin, H., and Wang, H. 2008. Thwarting e-mail spam laundering. ACM Trans. Inf. Syst. Secur. 12, 2, Article 13 (December 2008), 32 pages. DOI = 10.1145/1455518.1455525. http://doi.acm.org/10.1145/1455518.1455525.

1. INTRODUCTION

As a side-product of free e-mail services, spam has become a serious problem that afflicts every Internet user in recent years. According to MessageLabs [2006], approximately 86% of e-mail traffic was spam in 2006. Although a number of anti-spam mechanisms have been proposed and deployed to foil spammers, spam messages continue swarming into Internet users' mailboxes.


A more effective spam detection and suppression mechanism close to spam sources is critical to dampen the dramatically growing spam volume.

At present, proxies such as off-the-shelf SOCKS [Leech et al. 1996] and HTTP proxies play an important role in the spam epidemic. Spammers launder e-mail spam through spam proxies to conceal their real identities and reduce spamming cost. The popularity of proxy-based spamming is mainly due to the anonymity provided by a proxy and the availability of a large number of spam proxies. The IP address of a spammer is obfuscated by a spam proxy during the protocol transformation, which hinders the tracking of real spam origins. According to the Composite Blocking List (CBL) [CBL 2007], a highly trusted spam blacklist, the number of available spam proxies and bots in January 2007 was more than 3,200,000. These numerous spam proxies facilitate e-mail spam laundering, by which a spammer has great flexibility to change spam paths and bypass anti-spam barriers. However, very little research has been done on detecting spam proxies. Probing is a common method used in practice to verify the existence of spam proxies: it works by scanning open ports on suspected hosts and examining whether or not e-mail can be sent through those ports. Because probing relies on scanning and firewalls are now widely deployed, both its accuracy and its efficiency are poor.

In this article, we propose a simple and effective mechanism, called DBSpam, which detects and blocks spam proxies' activities inside a customer network in a timely manner, and further traces the corresponding spam sources outside the network. DBSpam is designed to be placed at a network vantage point, such as the edge router or gateway that connects a customer network to the Internet. The customer network could be a regional broadband (cable or DSL) customer network, a regional dialup network, or a campus network. DBSpam detects ongoing proxy-based spamming by monitoring bidirectional traffic. Due to the protocol semantics of SMTP (Simple Mail Transfer Protocol) [Klensin 2001] and timing causality, the behavior of proxy-based spamming demonstrates the unique characteristics of connection correlation and packet symmetry. Utilizing this distinctive spam laundering behavior, we can easily identify the suspicious TCP connections involved in spam laundering. Then, we can single out the spam proxies, trace the spam sources behind them, and block the spam traffic. Based on libpcap, we implement a prototype of DBSpam and evaluate its effectiveness on both detecting and suppressing spam laundering through theoretical analyses and trace-based experiments. In general, DBSpam is distinctive from previous anti-spam approaches in the following two aspects:

—DBSpam pushes the defense line towards spam sources without the recipient's cooperation. DBSpam enables an ISP (Internet Service Provider) to detect spam laundering activities and spam proxies online inside its customer networks. The quick responsiveness of DBSpam offers the ISP an opportunity to suppress laundering activities and quarantine the identified spam proxies.


—DBSpam has no need to scan message contents and has very few assumptions about the connections between a spammer and its proxies. DBSpam works even if (1) these connections are encrypted and the message contents are compressed; and (2) a spammer uses proxy chains inside the monitored network.

One additional benefit of DBSpam is that once spam laundering is detected, fingerprinting spam messages at the sender side is viable, and spam signatures may be distributed to accelerate spam detection at other places. In addition to all these advantages, DBSpam is complementary to existing anti-spam techniques and can be incrementally deployed over the Internet.

The remainder of the article is organized as follows. Section 2 briefly presents spamming mechanisms. Section 3 surveys commonly used anti-spam techniques. Section 4 describes the unique behavior of proxy-based spamming. Section 5 details the working mechanism of DBSpam. Section 6 evaluates the effectiveness of DBSpam through trace-based experiments. Section 7 discusses the robustness of DBSpam against potential evasions. Finally, we conclude the article in Section 8.

2. SPAMMING MECHANISMS

In this section, we first present the spam laundering mechanisms, and then briefly describe other commonly used spamming approaches.

2.1 Spam Laundering Mechanisms

Spam laundering, as studied in this article, refers to the spamming process in which only proxies are involved in origin disguise. A proxy here refers to an application, such as SOCKS, that simply performs "protocol translation" (i.e., rewrites IP addresses and port numbers) and forwards packets. Different from an e-mail relay, which first receives the whole message and then forwards it to the next mail server, an e-mail proxy requires that the connections on both sides of the proxy stay synchronized during the message transfer. More importantly, unlike an e-mail relay, which prepends a "Received" trace line recording the sender's IP address and the timestamp when the message was received to the message header before relaying the message, an e-mail proxy records no such trace information during protocol transformation. Thus, from a recipient's perspective, the e-mail proxy, instead of the original sender, becomes the source of the message. It is this identity replacement that makes the e-mail proxy a favorite choice for spammers.
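To make the difference concrete, the short sketch below (ours, with made-up host names and addresses) parses a hypothetical message that passed through a relay; the relay's "Received" line records the previous hop, whereas a message laundered through a proxy carries no comparable line pointing back to the spammer.

```python
from email import message_from_string

# A hypothetical message as delivered by a relay (host names and addresses are made up).
raw = (
    "Received: from sender.example.net ([192.0.2.10])\r\n"
    "        by relay.example.org with SMTP; Mon, 01 Jan 2007 10:00:00 -0500\r\n"
    "From: [email protected]\r\n"
    "To: [email protected]\r\n"
    "Subject: test\r\n"
    "\r\n"
    "message body\r\n"
)

msg = message_from_string(raw)
# The relay's trace information: the previous hop's IP address and a timestamp.
print(msg.get_all("Received"))
```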

Initially, spammers just seek open proxies on the Internet, which usually are misconfigured proxies that allow anyone to access their services; many Web sites and free software packages provide open-proxy search functions. However, once such misconfigurations are corrected by system administrators, spammers have to find other available "open" proxies. It is ideal for a spammer to own many "private" and stable proxies, and unsecured home PCs with broadband connections are good candidates for this purpose. Malicious software, including specially designed worms and viruses such as SoBig and Bagle, has been used to hijack home PCs. Equipped with Trojan horse or backdoor programs, these compromised machines become available zombies. After proxy programs such as SOCKS or Wingate are installed, these zombies are ready to be used as spam proxies to pump out e-mail spam. Because spamming causes no serious performance degradation, most nonprofessional Windows users are not aware of it. Recent research on the network-level behavior of spammers [Ramachandran and Feamster 2006] also confirms that most captured spam originates from compromised Windows hosts.

To counter the soaring growth of spam volume, many ISPs have adopted the policy of blocking port 25 (the SMTP port), under which outbound e-mail from a subscriber must be relayed by the ISP-designated e-mail server. In other words, the ISP's edge routers only forward SMTP traffic from designated IP addresses to the outside. However, spammers have easily evaded such simple SMTP port blocking mechanisms. The workaround is simple: have zombies send spam messages to their ISPs' e-mail servers first. In February 2005, Spamhaus [2005] reported that over the previous few months a number of major ISPs had witnessed far more spam messages coming directly from the e-mail servers of other ISPs. This change in proxy-based spamming activity is mainly caused by the use of new stealth spamware, which instructs the hijacked proxy (i.e., zombie) to send spam messages via the legitimate e-mail server of the proxy's ISP.

2.2 Other Spamming Approaches

The other commonly used spamming approaches vary from dummy ISP spamming to more recent botnet spamming. We briefly summarize them as follows.

2.2.1 Act as a dummy ISP. Some professional spammers play this trick with ISPs to extend the duration of their spamming business. By purchasing a large amount of bandwidth from commercial ISPs and setting up a dummy ISP, these professional spammers pretend to have "users" which seemingly need Internet access but in fact are used for spamming. If they are tracked down for spamming, those spammers claim to their ISPs that the spam was sent by their nonexistent "customers." A spammer achieves an extended spamming time by lying to one ISP and later moving to another. To evade anti-spam tracking and lawsuits, many professional spammers operate offshore by using servers in Asia and South America.

2.2.2 Spam through open relay. To provide high reliability for e-mail delivery, SMTP was designed to allow relaying: some MTAs (Mail Transfer Agents) may help the originator MTA transmit e-mail messages to the destination MTA when the direct transmission from the originator to the destination is broken. Such a relaying service is unnecessary in the current Internet environment, and most MTAs have disabled the relay service for untrusted sources. However, due to misconfiguration or lack of experience, there are still many open relays available on the Internet [SORBS 2006].


2.2.3 Exploit CGI security flaws. Some insecure Web CGI services, such as the notorious FormMail.pl [SecurityTracker 2001] that allows Internet users to send e-mail feedback from an HTML form, have been exploited by spammers to redirect e-mail to arbitrary addresses. This CGI-based e-mail redirection is appealing to spammers, since it conceals the spam origin.

2.2.4 Hijack BGP routes and steal IP blocks. Some spammers are also Internet hackers. They hijack insecure BGP routers, pirate or fraudulently obtain IP address allocations from an IP address assignment agency such as ARIN, and use routing tricks to simulate fake networks, deceiving real ISPs into providing them connectivity for spamming. This spamming trick is also called "BGP spectrum agility" [Ramachandran and Feamster 2006].

2.2.5 Spam through botnet. Recent studies have witnessed the wide use of botnets in spamming [Bächer et al. 2005; Ramachandran and Feamster 2006; Watson et al. 2005]. Using IRC channels or other communication protocols, a bot controller (also a spammer) first distributes the spam address list and message content to all controlled bots. Then he sends a single command to the bots, triggering the mailing engines installed on the bots to pump out spam. A bot controller that is not directly involved in spamming may instead install spam proxies on the bots and lease his botnet to spammers for spam laundering.

3. ANTI-SPAM TECHNIQUES

Many anti-spam techniques have been proposed and deployed to counter e-mail spam from different perspectives. Based on the placement of anti-spam mechanisms, these techniques can be divided into two categories: recipient-based and sender-based. In terms of fighting spam at the source, HoneySpam [Andreolini et al. 2005] might be the closest work to ours. In the following, we first briefly describe recipient-based and sender-based techniques, respectively, and then compare our work with HoneySpam.

3.1 Recipient-Based Techniques

This class of techniques either (1) blocks/delays e-mail spam from reaching the recipient's mailbox or (2) removes/marks e-mail spam in the recipient's mailbox. Based on the classification of responses to spam given by Twining et al. [2004], we further divide the recipient-based anti-spam techniques into pre-acceptance and post-acceptance subcategories. The pre-acceptance techniques mainly focus on blocking or delaying spam before the recipient's MTA accepts it into its mailbox, while post-acceptance techniques attempt to weed spam out of received messages.

3.1.1 Pre-acceptance Techniques. The pre-acceptance techniques usually utilize noncontent spam characteristics, such as source IP address, message sending rate, and violation of SMTP standards, to detect e-mail spam.


Because these techniques are applied during SMTP transactions, they need to be deployed on the recipient's MTA.

DNSBLs: DNSBLs are DNS-based Blackhole Lists, which record the IP addresses of spam sources and are accessed via DNS queries. When an SMTP connection is being established, the receiving MTA can verify the sending machine's IP address by querying its subscribed DNSBLs (a minimal lookup sketch appears at the end of this subsection). Even though DNSBLs have been widely used, their effectiveness [Jung and Sit 2004; Ramachandran and Feamster 2006] and responsiveness [Ramachandran et al. 2006] are still under study.

MARID: MARID (MTA Authorization Records In DNS) [2004] is a class of techniques to counter forged e-mail addresses, which are commonly used in spam, by enforcing sender authentication. MARID is also based on DNS and can be regarded as a distributed whitelist of authorized MTAs. Multiple MARID drafts have been proposed, of which SPF [Wong and Schlitt 2006], Sender ID [Lyon and Wong 2004], and DomainKeys [Delany 2006] have been deployed in some places.

Tempfailing: Tempfailing [Twining et al. 2004] is based on the fact that legitimate SMTP servers implement the retry mechanism required by SMTP, whereas a spammer seldom retries if sending fails. It usually works with a greylist that records the failed messages and the MTAs that failed on their first tries.

Delaying: As a variation of rate limiting, delaying is triggered by an unusually high sending rate. Most delaying mechanisms, such as tarpitting [Hunter et al. 2003], throttling [Williamson 2003; Woolridge et al. 2004], and TCP Damping [Li et al. 2004], are applied at receiving MTAs.

Sender Behavior Analysis: This technique distinguishes spam from normal e-mail by examining the behavior of incoming SMTP connections. Messages from machines exhibiting characteristics of malicious behavior, such as directory harvesting, are blocked before reaching the mailbox [Postini 2006].
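As an illustration of the DNSBL mechanism mentioned above, the sketch below (ours) performs the conventional reversed-octet lookup; the zone name is a placeholder, not a specific blacklist service.

```python
import socket

def dnsbl_listed(ip, zone="dnsbl.example.org"):
    """Check whether an IPv4 address is listed in a DNSBL zone.

    The query name is the sender's address with its octets reversed, followed
    by the blacklist zone; any A-record answer means the address is listed.
    """
    query = ".".join(reversed(ip.split("."))) + "." + zone
    try:
        socket.gethostbyname(query)      # an answer of any kind means "listed"
        return True
    except socket.gaierror:              # NXDOMAIN (or lookup failure): not listed
        return False

# Example: dnsbl_listed("192.0.2.10") queries 10.2.0.192.dnsbl.example.org
```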

3.1.2 Post-acceptance Techniques. The post-acceptance techniques detect and filter spam by analyzing the content of received messages, including both the message header and the message body. This kind of technique can be deployed either at the MUA (Mail User Agent) level, in favor of individual preferences, or at the MTA level, for unified management.

E-mail address-based filters: There is a variety of e-mail address-based filters of different complexity. Among them, the traditional whitelists and blacklists are the simplest: whitelists consist of all acceptable e-mail addresses, and blacklists are the opposite. Blacklists can be easily broken when spammers forge new e-mail addresses, but using whitelists alone shuts out mail from unknown correspondents. Garriss et al. [2006] developed a new whitelisting system, which can automatically populate whitelists by exploiting friend-of-friend relationships among e-mail correspondents. Ioannidis [2003] proposed a new spam filter based on Single-Purpose Addresses (SPA), which encode a security policy that describes the acceptable use of the address; any e-mail that violates the policy can be marked, bounced, or discarded. Gburzynski and Maitan [2004] developed a re-mailer system, which maps a user's private permanent address to multiple public, restrictive (e.g., limited-duration) aliases for different correspondents and manages those aliases according to a user-defined policy.

Challenge-Response (C-R): C-R [SpamLinks 2006] is used to keep the merits of whitelisting without losing important messages. Incoming messages whose sender addresses are not in the recipient's whitelist are bounced back with a challenge that needs to be solved by a human being. After a proper response is received, the sender's address can be added to the whitelist.

Heuristic filters: Features that are rare in normal messages but appear frequently in spam, such as nonexistent domain names and spam-related keywords, can be used to distinguish spam from normal e-mail. SpamAssassin [2006] is such an example: each received message is scored against the heuristic filtering rules, and the score is compared with a predefined threshold to decide whether the message is spam.

Machine learning-based filters: Since spam detection can be cast as a text classification problem, many content-based filters utilize machine-learning algorithms for filtering spam. Among them, Bayesian approaches [Graham 2002; Yerazunis 2003; Blosser and Josephsen 2004; Li and Zhong 2006] have achieved outstanding accuracy and have been widely used. Hershkop and Stolfo [2005] studied the effect of combining multiple machine learning models on reducing the false positives of spam detection. Because these filters can adapt their classification engines to changes in message content, they outperform heuristic filters.

Signature-based filters: Similar to the concept of a virus signature, a spam signature is the identity of a spam message and is usually derived from certain computations on the spam message. For each incoming message, a signature-based filter first derives its signature, then queries the registered server to test the signature, and takes proper actions based on the response. To be effective, signature-based filters usually collaborate and contribute signatures through peer-to-peer networks [Rhyolite 2000; Prakash 2007; Zhou et al. 2003].

3.2 Sender-Based Techniques

3.2.1 Usage Regulation. To effectively throttle spam at the source, ISPs and ESPs (E-mail Service Providers) have taken various measures, such as blocking port 25 and requiring SMTP authentication, to regulate the usage of e-mail services. The message submission protocol [Gellens and Klensin 1998] has been proposed to replace SMTP when a message is submitted from an MUA to its MTA.

3.2.2 Cost-based approaches. Borrowing the idea of postage from regular mail systems, many cost-based anti-spam proposals [Microsoft 2003; Back 1997; Krishnamurthy and Blackmond 2004; Walfish et al. 2006] attempt to shift the cost of thwarting spam from the receiver side to the sender side. All these techniques assume that the average e-mail cost for a normal user is negligible, but the accumulated charge for a spammer will be high enough to drive him out of business. The cost concept may take different forms in different proposals.

SHRED [Krishnamurthy and Blackmond 2004] proposes to affix each mail with an electronic stamp and to punish spammers by reducing their stamp quotas and charging them real money, while the Penny Black Project [2003] forces a sender to pay e-mail postage by associating a CPU- or memory-intensive computation with the e-mail sending process. The computation result, called a proof-of-work, is attached to the message and can be easily validated by the recipient.
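To illustrate the proof-of-work idea, the sketch below (ours) implements a generic hashcash-style puzzle, not the Penny Black scheme itself: the sender searches for a nonce whose hash over the message has a required number of leading zero bits, and the recipient validates the attached result with a single hash.

```python
import hashlib
from itertools import count

def leading_zero_bits(digest: bytes) -> int:
    """Number of leading zero bits in a hash digest."""
    bits = bin(int.from_bytes(digest, "big"))[2:].zfill(len(digest) * 8)
    return len(bits) - len(bits.lstrip("0"))

def mint(message: bytes, difficulty: int = 20) -> int:
    """Sender side: find a nonce whose SHA-256 over (message, nonce) has
    `difficulty` leading zero bits; the expected cost grows as 2**difficulty."""
    for nonce in count():
        digest = hashlib.sha256(message + str(nonce).encode()).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce

def verify(message: bytes, nonce: int, difficulty: int = 20) -> bool:
    """Recipient side: a single hash validates the attached proof-of-work."""
    digest = hashlib.sha256(message + str(nonce).encode()).digest()
    return leading_zero_bits(digest) >= difficulty
```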

3.3 HoneySpam

HoneySpam [Andreolini et al. 2005] is a specialized honeypot framework based on honeyd [Provos 2004] that deters e-mail address harvesters, poisons spam address databases, and intercepts or blocks spam traffic that goes through the open relay/proxy decoys set up by HoneySpam. With the network virtualization offered by honeyd, HoneySpam can set up multiple fake Web servers, open proxies, and open relays. The fake Web servers provide specially crafted Web pages to trap e-mail address harvesting bots. The fake open proxies and open relays are used to track the spammers exploiting them and to block spam going through them.

HoneySpam shares the same motivation of countering spam at the source as DBSpam, and both deal with spam proxies. However, the role of the proxy and the anti-spam approach in HoneySpam are quite different from those in DBSpam. The proxies of HoneySpam are intentionally set up on end hosts, and spam sources are logged by HoneySpam, so spam tracking is very easy. In contrast, detecting spam proxies is the major task of DBSpam, and proxy identification and spam tracking can only be accomplished through traffic analysis. On the other hand, the two tracing and blocking systems are complementary to each other, and both of them can be used for spam signature generation, spam forensics, and law enforcement.

4. PROXY-BASED SPAM BEHAVIOR

In this section, we delineate the distinct behavior of proxy-based spamming, which directly inspires the design of our detection algorithm. Figure 1 depicts a typical scenario of proxy-based spamming in a customer network such as a Cox regional residential network. Although spammers can conceal their real identities from destination MTAs by exploiting spam proxies, they cannot make the connection between a spam source and its proxy invisible to the edge router or gateway that sits in between. Here we assume that there is a network vantage point where we can monitor all the bidirectional traffic passing through the customer network; the gateway (or firewall) of the customer network that connects it to the Internet (e.g., edge router R in Figure 1) is such a point.

4.1 Laundry Path of Proxy-Based Spamming

As shown in Figure 1, there is a customer network N in which spam proxies reside. Both spammer S and receiving MTA M are connected to customer network N via edge router R. S may be the original spam source or just another spam proxy (but it must be closer to the real spam source).


Fig. 1. Scenario of proxy-based spamming.

M is the outside MTA. Note that for a customer network that has its own mail server(s), such as a campus (or enterprise) network, the monitored network N may not be the whole network but one of its protected subnetworks. Usually such campus/enterprise networks are divided into multiple subnetworks for security and management reasons. Their mail servers are placed in a DMZ (DeMilitarized Zone) or a special subnetwork that is separated from other subnetworks such as wireless, dormitory, or employee subnetworks. It is one of these loosely managed subnetworks that becomes the monitored network N, and the router/gateway connecting subnetwork N becomes the vantage point R. Thus, the assumption of an exterior MTA M is valid even when the MTA is under the same administrative domain as network N.

Inside monitored network N, S may use a single proxy or multiple spam proxies. If multiple proxies are employed, they may either launder spam messages individually or be organized into one or multiple proxy chains, depending on the spammer's strategy. Without loss of generality, only one chain is shown in Figure 1. Spammer S usually communicates with spam proxies through SOCKS or HTTP. The spam message sent from S to a may even be encrypted. If a proxy chain is used, the spam message can be conveyed by different proxy protocols at different hops; for instance, SOCKS 4 is used between S and a, while HTTP is employed between a and z. However, none of these protocol variations and message content encryptions can change one fact: it is the last-hop proxy z¹ that performs the protocol transformation and forwards the spam message to the MTA via SMTP.

1Proxy z and proxy a are the same in the single-proxy scenario.


We define the connection between spammer S and first-hop proxy a as the upstream connection, and define the connection between last-hop proxy z and MTA M as the downstream connection. The upstream and downstream connections plus the proxy chain form the spam laundry path, which is shown in Figure 1.

4.2 Connection Correlation

There is a one-to-one mapping between the upstream and downstream connections along the spam laundry path. While this kind of connection mapping is common in proxy-based spamming, it is very unusual for normal e-mail transmission: in normal e-mail delivery, there is only one connection, that between the sender and the receiving MTA. The existence of such connection correlation is a strong indication of spam laundering and provides a valuable clue for spammer tracking. Here we assume that the downstream connection is an SMTP connection. For the upstream connection we have no restriction except that it should be a TCP connection; its packets may be encrypted and even compressed.

The detection of such spam-proxy-related connection correlation is challenging for three reasons. First, content-based approaches could be ineffective, as spammers may use encryption to evade content examination. Second, because such a detection mechanism is usually deployed at network vantage points, the induced overhead must be affordable, which is critical to the success of its deployment. Third, since spam traffic is machine-driven and could be delayed by a proxy at will, timing-based correlation detection algorithms such as [Zhang and Paxson 2000] may not work well in this environment.

4.3 Packet Symmetry

Figure 2 illustrates the detailed communication processes of spam laundering for both the single-proxy and proxy-chain cases at the application layer, in which the message format is "PROTOCOL [content]". For simplicity, P/P1/P2 stands for different application protocols, including SOCKS (v4 or v5), HTTP, etc. For SMTP, the packet content is in plain text; for the application protocols P/P1/P2, the packet contents may be encrypted. Since the small delays induced by message processing at end hosts and intermediate proxies have little effect upon the communication processes, for ease of presentation we ignore them in Figure 2. The initial proxy handshaking process is also omitted, as it has no effect on e-mail transactions. Without losing any generality, here we only show the shortest SMTP transaction process for the single-proxy case and parts of the SMTP transaction process for the proxy-chain case.

Due to protocol semantics, the process of proxy-based spamming is similar to that of an interactive communication. The appearance of one inbound SOCKS-encapsulated (or HTTP-encapsulated)² SMTP command message on the upstream connection will trigger the occurrence of one outbound SMTP command message on the downstream connection later.

2For ease of presentation, we only use SOCKS in the rest of the article, although HTTP can be used as well.


Fig. 2. Timeline of spamming processes for single proxy (left) and proxy chain (right).

Similarly, for each inbound SMTP reply message on the downstream connection, there will later be one corresponding outbound SOCKS-encapsulated reply message carried by TCP on the upstream connection. We term this communication pattern message symmetry.

This message symmetry leads to packet symmetry at the network layer, with a few exceptions in which the one-to-one packet³ mapping between the upstream and downstream connections may be violated. The exceptions can be caused by (1) packet fragmentation, (2) packet compression, or (3) packet retransmission occurring along the laundry path. However, because SMTP reply messages are very short (usually less than 300 bytes including the packet header) and the Path MTUs of most customer networks are above 500 bytes, the occurrence of (1) and (2) is very rare. Moreover, the packet retransmission problem can easily be resolved by checking TCP sequence numbers. In general, the packet symmetry between the inbound and outbound reply packets holds most of the time.

Such packet symmetry is exemplified in Figure 3, where the arrow with the long solid line stands for the arrival of an inbound SMTP reply packet of the suspicious SMTP connection. In addition to the inbound SMTP connection, there are three outbound TCP connections X, Y, and Z, as shown in Figure 3.

3TCP control packets such as SYN, ACK are not counted here.


Fig. 3. Example of reply round and TCP correlation.

Three kinds of arrows with different dotted lines stand for the arrivals of outbound TCP packets belonging to these outbound TCP connections, respectively. An upward arrow indicates that the packet is leaving the monitored network, while a downward arrow indicates that the packet is entering the network. All of the inbound SMTP reply packets shown in Figure 3 belong to the same suspicious SMTP connection.

We define a reply round as the time interval between the arrivals of two consecutive reply packets on an SMTP connection. Thus, the nth reply round is the time interval between the arrival of the nth reply packet and that of the (n + 1)th reply packet. Even the simplified SMTP transaction has six reply rounds, as shown in Figure 3. Within one reply round, the number of arrows with a specific dotted line indicates the number of outbound TCP packets of the corresponding TCP connection. According to the one-to-one mapping of packet symmetry, each SMTP reply packet observed on the downstream SMTP connection should cause one and only one TCP packet to appear on the upstream connection. As Figure 3 shows, if one connection among X, Y, and Z is the suspicious upstream connection, one and only one outbound TCP packet must be observed on that connection in every reply round. Based on this rule, only TCP connection X meets this one-and-only-one requirement and can be classified as the suspicious upstream connection with high probability. In the second reply round, more than one packet appears on connection Z; and in the fourth round, no packet appears on connection Y. Thus, we can easily filter out TCP connections Y and Z as normal background traffic. Note that the order of packet arrivals within a reply round does not affect the result of the packet symmetry check.

This packet symmetry is the key to distinguishing the suspicious upstream and downstream connections along the spam laundry path from normal background traffic. It simply captures the fundamental feature of chained interactive communications and does not assume any specific time distribution of packet arrivals. We use this simple rule to detect the laundry path of proxy-based spamming, and the detection scheme is robust against any possible time perturbation induced by spammers. Note that the one-and-only-one mapping of packet symmetry can be relaxed, as we will discuss in Section 7.
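A minimal sketch of this filtering rule (ours, not the DBSpam prototype itself): given the arrival times of the inbound reply packets of a suspicious SMTP connection and the outbound data-packet times of candidate TCP connections, it keeps only the candidates that send exactly one packet in every reply round.

```python
from bisect import bisect_left

def candidate_upstreams(reply_times, outbound):
    """reply_times: sorted arrival times of inbound SMTP reply packets on one
    suspicious downstream connection; outbound: dict mapping a candidate
    upstream connection id to the sorted times of its outbound,
    non-retransmitted, nonzero-payload TCP packets.  Returns the candidates
    that satisfy the one-and-only-one rule in every reply round."""
    survivors = []
    for conn, times in outbound.items():
        rounds = zip(reply_times, reply_times[1:])   # consecutive reply packets
        if all(bisect_left(times, end) - bisect_left(times, start) == 1
               for start, end in rounds):
            survivors.append(conn)
    return survivors

# In the Figure 3 example, connection Z (two packets in one round) and
# connection Y (an empty round) are filtered out, while X survives.
```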


5. WORKING MECHANISM OF DBSPAM

DBSpam consists of two major components, a spam detection module and a spam suppression module, of which the detection module is the core. To the best of our knowledge, so far there is no effective technique that can detect online both spam proxies and the corresponding spammers behind them. We envisage that DBSpam may achieve the following goals: (1) fast detection of spam laundering with high accuracy, (2) breaking spam laundering via throttling or blocking after detection, (3) support for spammer tracking and law enforcement, (4) support for spam message fingerprinting, and (5) support for global forensic analysis.

In essence, the detection module of DBSpam is a simple and efficient connection correlation detection algorithm that identifies the laundry path of spam messages (i.e., the suspicious downstream and upstream connections) and the spam source⁴ that drives spamming behind the proxies.

5.1 Deployment of DBSpam

Like other network intrusion detection systems, DBSpam needs to be placed at a network vantage point that connects a customer network to the Internet, where it can monitor the bidirectional traffic of the customer network. For a single-homed network, it is easy to locate such a network vantage point (an edge router or a firewall) and deploy DBSpam on it. For a multi-homed network, it may not be possible to locate a single network vantage point that can monitor all the bidirectional traffic passing through the customer network. However, many customer networks use multi-homing not for load balancing but for reliability and fault tolerance; in that backup multi-homing case, DBSpam works well if deployed at the primary ISP edge router. Even in the load-balancing multi-homing scenario, as long as the packets that belong to the same proxy chain go through the same ISP edge router or firewall, DBSpam can still work at different ISP edge routers or firewalls without coordination. Moreover, there are special network devices (e.g., TopLayer [2006]) that can passively aggregate traffic from multiple network segments. By hooking up to such devices, DBSpam can still have a complete view of the network traffic.

5.2 Design Choices and Overview

Our goal is to detect the spam laundry path promptly and accurately once a proxy-based spamming activity occurs on the monitored network. We showed in the previous section that packet symmetry is an inherent characteristic of proxy-based spamming behavior. Since legitimate messages are rarely delivered along the path illustrated in Figure 1, the probability that a normal SMTP connection is consistently correlated, in terms of packet symmetry, with an unrelated TCP connection is very small. Hence, frequent observations of connection correlation are a strong indication that spam laundering is occurring.

4Or just another spam proxy that is outside the customer network but at least one more step closer to the real source.


According to the packet symmetry rule, for the upstream TCP connection along a spam laundry path, the number of its outbound packets⁵ in each reply round of the downstream SMTP connection is always one. For a normal TCP connection, however, this rule can only be satisfied with a very small probability. Thus, a simple and intuitive correlation detection method is to count the number of outbound packets observed on suspicious TCP connections in sequential reply rounds of an SMTP connection. Given the characteristic of successive arrival of observations, this correlation detection problem is well suited for the statistical method of the Sequential Probability Ratio Test (SPRT) developed by Wald [2004].

As a simple and powerful mathematical tool, SPRT has been used in many areas such as portscan detection [Jung et al. 2004] and wireless MAC protocol misbehavior detection [Radosavac et al. 2005]. Basically, an SPRT can be viewed as a one-dimensional random walk. The walk starts from a point between two boundaries and can go either upward or downward with different probabilities. With each arrival of an observation, the walk makes one step in the direction determined by the result of the observation. Once the walk first hits or crosses either the upper boundary or the lower boundary, it terminates and the corresponding hypothesis is selected. For SPRT, the actual false positive probability and false negative probability are bounded by predefined values. It has been proved that, among all sequential and nonsequential tests whose error probabilities are no larger than those of SPRT, SPRT minimizes the average number of observations required to reach a decision.

We utilize the packet symmetry of SMTP reply packets to detect proxy-based spamming activity. Basically, we monitor the inbound SMTP traffic first and then apply the rule of packet symmetry to detect the spam laundry path inside the customer network. In other words, DBSpam focuses on the clockwise reply packet flow shown in Figure 1, instead of the counter-clockwise command packet flow, for connection correlation detection. The arrivals of inbound SMTP reply packets, which delimit the reply rounds and drive the progress of connection correlation detection, become a self-setting clock of the detection algorithm. SPRT terminates by either selecting the hypothesis that upstream connection Ctcp is correlated with downstream connection Csmtp or choosing the opposite hypothesis.

There are two benefits of using SMTP reply messages to drive SPRT. First, as mentioned earlier, SMTP reply messages are very small, which minimizes the occurrence of packet fragmentation, and we can significantly increase the processing capacity of DBSpam by monitoring small packets only. Second, being either the spam target or the relay, the remote SMTP servers are usually very reliable, and the implementation and listening port of these servers strictly follow the SMTP protocol semantics. Thus, the packet symmetry rule always holds, and SMTP packets can be easily identified based on the port number in the TCP header.

In the rest of this section, we first briefly describe the basic concept of SPRT and then present the detection module of DBSpam, which includes two phases: SPRT detection and noise reduction.

5Here packets refer to nonretransmitted, nonzero-payload TCP packets.


5.3 Sequential Probability Ratio Testing

Let Xi, i = 1, 2, ..., be random variables representing the events observed sequentially. The SPRT for a simple hypothesis H0 against a simple alternative H1 has the following form:

\[
\begin{aligned}
\Lambda_n \ge B &\;\Longrightarrow\; \text{accept } H_1 \text{ and terminate the test},\\
\Lambda_n \le A &\;\Longrightarrow\; \text{accept } H_0 \text{ and terminate the test},\\
A < \Lambda_n < B &\;\Longrightarrow\; \text{conduct another observation},
\end{aligned}
\tag{1}
\]

where the two constants or boundaries A and B satisfy 0 < A < B < ∞, and Λn is the log-likelihood ratio defined as follows:

\[
\Lambda_n = \lambda(X_1, \ldots, X_n) = \ln \frac{\Pr(X_1, \ldots, X_n \mid H_1)}{\Pr(X_1, \ldots, X_n \mid H_0)}.
\tag{2}
\]

Assume X1, ..., Xn are independent and identically distributed (i.i.d.) Bernoulli random variables with

\[
\Pr(X_i = 1 \mid \theta) = 1 - \Pr(X_i = 0 \mid \theta) = \theta, \quad i = 1, \ldots, n.
\tag{3}
\]

Then

\[
\Lambda_n = \ln \frac{\prod_{i=1}^{n} \Pr(X_i \mid H_1)}{\prod_{i=1}^{n} \Pr(X_i \mid H_0)}
          = \sum_{i=1}^{n} \ln \frac{\Pr(X_i \mid H_1)}{\Pr(X_i \mid H_0)}
          = \sum_{i=1}^{n} Z_i,
\tag{4}
\]

where Zi = ln(Pr(Xi|H1)/Pr(Xi|H0)). Λn can be viewed as a random walk (or, more properly, a family of random walks⁶) with steps Zi, which proceeds until it first hits or crosses boundary A or B. Suppose the distributions for H1 and H0 are θ1 and θ0, respectively. Then Λn moves up with step length ln(θ1/θ0) when Xi = 1, and goes down with step length ln((1−θ1)/(1−θ0)) when Xi = 0.

In SPRT, we define two types of error:

\[
\alpha = \Pr(S_1 \mid H_0), \qquad \beta = \Pr(S_0 \mid H_1),
\]

where Pr(Si|Hj) denotes the probability of selecting Hi when in fact Hj is true. If we call the selection of H1 detection and the selection of H0 normality, the event S1|H0 can be viewed as a false positive, so α represents the false positive probability. Likewise, the event S0|H1 can be termed a false negative, and β represents the false negative probability. Let α* and β* be the user-desired false positive and false negative probabilities, respectively. According to (1), we can derive⁷ the Wald boundaries as follows:

\[
A = \ln \frac{\beta^*}{1 - \alpha^*}, \qquad B = \ln \frac{1 - \beta^*}{\alpha^*},
\tag{5}
\]

and the derived relationships between the actual error probabilities and the user-desired error probabilities are

\[
\alpha \le \frac{\alpha^*}{1 - \beta^*}, \qquad \beta \le \frac{\beta^*}{1 - \alpha^*},
\tag{6}
\]

\[
\alpha + \beta \le \alpha^* + \beta^*.
\tag{7}
\]

Inequality (6) suggests that the actual error probabilities α and β can only be slightly larger than their desired values α* and β*. For example, if the desired α* and β* are both 0.01, then the actual values α and β will be no greater than 0.0101. Inequality (7) states that the sum of the actual error probabilities is bounded by the sum of their desired values.

According to Wald's theory, E[N] = E[ΛN]/E[Zi], where N denotes the number of observations when the SPRT terminates. Suppose hypothesis H1 is true and the Bernoulli variable Xi has distribution θ1, which implies that Λn steps up with probability θ1 and goes down with probability 1 − θ1; then

\[
E[Z_i \mid H_1] = \theta_1 \ln \frac{\theta_1}{\theta_0} + (1 - \theta_1) \ln \frac{1 - \theta_1}{1 - \theta_0}.
\tag{8}
\]

If the user-desired false negative probability of the test is β*, then the true positive probability is 1 − β* and

\[
E[\Lambda_N \mid H_1] = \beta^* A + (1 - \beta^*) B
                      = \beta^* \ln \frac{\beta^*}{1 - \alpha^*} + (1 - \beta^*) \ln \frac{1 - \beta^*}{\alpha^*}.
\tag{9}
\]

With Equations (8) and (9), we have

\[
E[N \mid H_1] = \frac{\beta^* \ln \dfrac{\beta^*}{1 - \alpha^*} + (1 - \beta^*) \ln \dfrac{1 - \beta^*}{\alpha^*}}
                     {\theta_1 \ln \dfrac{\theta_1}{\theta_0} + (1 - \theta_1) \ln \dfrac{1 - \theta_1}{1 - \theta_0}}.
\tag{10}
\]

Likewise, we can derive

\[
E[N \mid H_0] = \frac{(1 - \alpha^*) \ln \dfrac{\beta^*}{1 - \alpha^*} + \alpha^* \ln \dfrac{1 - \beta^*}{\alpha^*}}
                     {\theta_0 \ln \dfrac{\theta_1}{\theta_0} + (1 - \theta_0) \ln \dfrac{1 - \theta_1}{1 - \theta_0}}.
\tag{11}
\]

Apparently the average observation number E[N] of the SPRT is determined by four parameters: the predefined error probabilities α* and β* and the distribution parameters θ0 and θ1. The determination of these values and their effects on E[N] will be discussed together with our correlation detection algorithm in the following.

6It is a family of random walks, since the distribution of the steps depends on which hypothesis is true.
7The derivations of Equations (5), (6), and (7) are omitted here. See Jung et al. [2004] and Wald [2004] for details.
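To make these formulas concrete, the following sketch (ours) evaluates the Wald boundaries of Equation (5) and the expected observation numbers of Equations (10) and (11); with the parameter values chosen in Section 5.4 (α* = 0.005, β* = 0.01, θ0 = e⁻¹, θ1 = 0.99), it yields E[N|H1] ≈ 5.5 and E[N|H0] ≈ 2.0.

```python
import math

def wald_boundaries(alpha_star, beta_star):
    """Thresholds A and B of Equation (5), from the desired error probabilities."""
    A = math.log(beta_star / (1.0 - alpha_star))
    B = math.log((1.0 - beta_star) / alpha_star)
    return A, B

def expected_observations(alpha_star, beta_star, theta0, theta1):
    """Expected numbers of observations E[N|H1] and E[N|H0], Equations (10)-(11)."""
    A, B = wald_boundaries(alpha_star, beta_star)
    up = math.log(theta1 / theta0)                     # step when Xi = 1
    down = math.log((1.0 - theta1) / (1.0 - theta0))   # step when Xi = 0
    e_n_h1 = (beta_star * A + (1.0 - beta_star) * B) / (theta1 * up + (1.0 - theta1) * down)
    e_n_h0 = ((1.0 - alpha_star) * A + alpha_star * B) / (theta0 * up + (1.0 - theta0) * down)
    return e_n_h1, e_n_h0

print(expected_observations(0.005, 0.01, math.exp(-1), 0.99))  # about (5.5, 2.0)
```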

5.4 SPRT Detection Algorithm

According to the principle of packet symmetry, within each reply round there must be one and only one outbound TCP packet appearing on the corresponding upstream connection. By contrast, connections that carry none or more than one TCP packet can be classified as innocent connections. Within the framework of SPRT, this correlation detection problem can easily be transformed into an SPRT in which we test the hypothesis H1 that Ctcp is correlated with Csmtp against the hypothesis H0 that the two connections are uncorrelated, by counting the number of TCP packets appearing on Ctcp in each reply round of Csmtp.

Algorithm 1 Detect-Correlation

1: Input: Ctcp, Csmtp
2: Parameters: A, B, θ0, θ1
3: Output: Ctcp is correlated with Csmtp or not
4: repeat
5:   for each reply round of Csmtp do
6:     if # of outbound packets on Ctcp is 1 then
7:       Λn ← Λn−1 + ln(θ1/θ0)
8:     else
9:       Λn ← Λn−1 + ln((1−θ1)/(1−θ0))
10:    end if
11:    if Λn ≥ B then
12:      Ctcp is correlated with Csmtp and the test stops
13:    else if Λn ≤ A then
14:      Ctcp is not correlated with Csmtp and the test stops
15:    else
16:      wait for observation in next reply round
17:    end if
18:  end for
19: until either Ctcp or Csmtp is closed

If we use a Bernoulli random variable Xi to represent the observation result on Ctcp in the i-th reply round of Csmtp and assume that these variables in different rounds are i.i.d., we have the following distribution:

\[
\Pr(X_i \mid H_1) =
\begin{cases}
\theta_1 & \text{if one outbound TCP packet is observed}\\
1 - \theta_1 & \text{otherwise}
\end{cases}
\]

\[
\Pr(X_i \mid H_0) =
\begin{cases}
\theta_0 & \text{if one outbound TCP packet is observed}\\
1 - \theta_0 & \text{otherwise}
\end{cases}
\]

Algorithm 1 describes the procedure of detecting connection correlation based on SPRT. The values of the four parameters A, B, θ0, and θ1 are specified beforehand. To identify whether Ctcp and Csmtp are correlated, at the end of each reply round of Csmtp, the number of outbound packets observed on Ctcp is counted. If the number is 1, Λ is incremented by ln(θ1/θ0); otherwise, it is incremented by ln((1−θ1)/(1−θ0)). Then, the updated Λ is compared with A and B. If Λ is either no greater than A or no smaller than B, the detection terminates and the corresponding hypothesis is selected. Otherwise, the test continues. However, the detection still terminates if either Ctcp or Csmtp is closed before a hypothesis is derived; in this case, Ctcp and Csmtp are deemed uncorrelated.
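For concreteness, the sketch below (ours; the function name and input format are not part of the DBSpam prototype) renders Algorithm 1 in Python: it consumes the per-reply-round packet counts observed on a candidate upstream connection Ctcp and walks the log-likelihood ratio between the boundaries A and B.

```python
import math

def detect_correlation(round_counts, theta0, theta1, A, B):
    """round_counts: iterable of outbound packet counts on Ctcp, one per reply
    round of Csmtp.  Returns True (correlated), False (uncorrelated), or None
    if either connection closes before a boundary is reached (treated as
    uncorrelated by DBSpam)."""
    up = math.log(theta1 / theta0)
    down = math.log((1.0 - theta1) / (1.0 - theta0))
    llr = 0.0                                  # the log-likelihood ratio Lambda_n
    for count in round_counts:
        llr += up if count == 1 else down      # one-and-only-one rule per round
        if llr >= B:
            return True                        # accept H1: Ctcp correlated with Csmtp
        if llr <= A:
            return False                       # accept H0: not correlated
    return None                                # connection closed before a decision
```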

For proxy-based spamming, given that packet symmetry holds most of the time, the main reason that correlation may go undetected is packet misses by the monitoring system. For example, when the traffic volume exceeds the capacity that the monitoring system can handle, packets may be dropped. If the packet conveying an SMTP reply message is dropped on either the downstream connection or the upstream connection, the correlation detection will fail in that reply round. We can therefore use the packet miss rate to estimate θ1, the probability of a proxy connection being observed as correlated when spamming occurs. To be conservative, we take 0.01 as the packet miss rate, which in fact is fairly high⁸ considering that only small packets (say, less than 300 bytes) need attention and only packet-header information is required by the detection algorithm. So θ1 is 0.99 in this case.

To estimate θ0, we employ the mathematical model given in Blum et al. [2004]. We assume that the unidirectional packet arrivals of a normal TCP connection can be modeled as a nonhomogeneous Poisson process, which can be approximated by a sequence of Poisson processes with varying rates over varying time periods that could be arbitrarily small. For example, let M(t) denote the number of packets sent on an outbound TCP connection during time interval t. The process {M(t), t ≥ 0} can be represented by a sequence of Poisson processes (λ1, Δt1), (λ2, Δt2), ..., where t = Δt1 + Δt2 + ⋯. The advantage of this model is that it can approximate almost any distribution. More importantly, the number of packets observed during any given time interval T can be represented by a Poisson process M with a single rate λ̂T, where λ̂T is the weighted mean of the rates of all the Poisson processes during T. With this model, we can easily compute the probability that one and only one packet is sent in a reply round. If T denotes the duration of a reply round, then from

\[
\Pr(M = i) = e^{-\hat{\lambda}_T T} \frac{(\hat{\lambda}_T T)^i}{i!},
\tag{12}
\]

we have

\[
\Pr(M = 1) = e^{-\hat{\lambda}_T T} (\hat{\lambda}_T T) \le e^{-1}.
\tag{13}
\]

In Equation (13), Pr(M = 1) reaches its maximum value e⁻¹ when λ̂T·T = 1, since x·e⁻ˣ is maximized at x = 1. Although this bound is a theoretical derivation, we find that it holds on almost all of the evaluated traces. Thus, we set θ0 = e⁻¹.

If we choose 0.005 for the false positive probability α* and 0.01 for the false negative probability β*, then with θ0 = e⁻¹ and θ1 = 0.99, E[N|H1] is 5.5 and E[N|H0] is 2.02, respectively. Figure 4 shows how E[N|H1] varies with changes of α* and θ0 when β* and θ1 are fixed.

In general, E[N|H1] increases when θ0 gets bigger or α* gets smaller. Intuitively, this prolonged random walk is a natural result of a smaller step length ln(θ1/θ0) or an enlarged distance ln((1−β*)/α*) for the walk toward the upper threshold. From the perspective of anomaly detection, it is desirable that the error probabilities, especially the false positive probability, be as low as possible. In the SPRT framework, this implies that E[N|H1] goes up, that is, the average detection time is prolonged. However, given that not all SMTP transactions (the shortest one has only six reply rounds) are long enough for the SPRT to reach a decision when α* is too small, a tradeoff between lowering the false positive and the false negative probabilities has to be made. In DBSpam, we set α* = 0.005 so that even the shortest spam transactions can be captured.

8In practice, the miss rate is usually below 0.005 in our campus network.


Fig. 4. E[N|H1] vs. θ0 and α* (θ1 = 0.99, β* = 0.01).

5.5 Noise Reduction

To further lower the false positives of SPRT, we introduce a simple and effective noise reduction technique in DBSpam. In a series of correlation tests, we define the active spam sources and proxies, which are prone to be identified many times, as signals, and define those innocent IP addresses that may be accidentally captured as noise. We utilize the dichotomy between signal and noise to distinguish spam sources and proxies from innocent end hosts. We call this procedure noise reduction. Noise reduction is executed in two steps: first, we maintain a set Si of the external IP addresses that appear in the correlation results within each time window Δ; second, over M consecutive time windows, we single out the external IP addresses that appear no fewer than K times as the spam sources, and the corresponding proxy addresses as the spam proxies.

The time window Δ is determined by the lower bound of the spamming rate υ (in replies/s) and the number of reply rounds N:

\[
\Delta \ge N / \upsilon.
\tag{14}
\]

Hence, a spammer sending spam faster than υ must appear in Si at least once in each time window Δ. Assume that the appearance of an IP address in Si is independent across windows, with a constant probability p. Then the number of occurrences of the IP address among M time windows follows the binomial distribution

\[
\Pr(X = i) = \binom{M}{i} p^i (1 - p)^{M - i}.
\tag{15}
\]

Fig. 5. Pr(X ≥ K) vs. p and (M, K).

The probability of having no fewer than K occurrences under this binomial distribution is

\[
\Pr(X \ge K) = \sum_{i=K}^{M} \binom{M}{i} p^i (1 - p)^{M - i}.
\tag{16}
\]

Figure 5 illustrates the dynamics of Pr(X ≥ K) with the variation of probability p for several predetermined tuples (M, K). The diagonal line shows the case of tuple (M = 1, K = 1), in which Pr(X ≥ K) is equal to p. Clearly, if p is smaller than 0.2, all other curves are below this diagonal line, indicating that their values of Pr(X ≥ K) are smaller than that of tuple (M = 1, K = 1). In contrast, if p is larger than 0.8, these curves are above the diagonal line, indicating that their values of Pr(X ≥ K) are larger than that of tuple (M = 1, K = 1).

The value of p for an innocent address depends on the false positive rate of the correlation detection, which should be closer to zero than to one. The left part of Figure 5 illustrates that noise reduction can further lower the chance of an innocent address being misclassified as a spam source. On the other hand, the value of p for a spam source is related to the complement of the false negative rate of the correlation detection, which should be closer to one than to zero, as shown in the right part of Figure 5. This indicates that noise reduction increases the probability of a spam source being identified as well. Therefore, both false positives and false negatives are reduced after noise reduction. Figure 5 also shows that when M is fixed, the probability Pr(X ≥ K) becomes smaller as K grows; for example, Pr(X ≥ 3 | M = 4) is much smaller than Pr(X ≥ 2 | M = 4). Moreover, the noise reduction algorithm works very well even with very small M and K. For example, with (M = 4, K = 3), a pre-noise-reduction false positive rate of 0.1 is lowered to 0.0037 after noise reduction. These two rules of thumb may guide the selection of (M, K) in practice. We will further discuss the parameter setup of Δ, M, and K, and demonstrate the effectiveness of the noise reduction technique, in Section 6.3.2.
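The effect of the (M, K) rule can be checked directly from Equation (16); the sketch below (ours) reproduces the example above: a per-window probability of 0.1 drops to about 0.0037 with (M = 4, K = 3).

```python
from math import comb

def prob_at_least(p, M, K):
    """Pr(X >= K) for X ~ Binomial(M, p), Equation (16)."""
    return sum(comb(M, i) * p**i * (1 - p)**(M - i) for i in range(K, M + 1))

# A falsely flagged address appearing with probability 0.1 per window survives
# (M = 4, K = 3) noise reduction with probability ~0.0037.
print(round(prob_at_least(0.1, 4, 3), 4))
```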

6. SYSTEM EVALUATION

We implemented a prototype of DBSpam using libpcap on Linux. Due to access limitations, we could not deploy our prototype in an ISP network environment to evaluate its online performance. Instead, we collected traces from a middle-sized campus network and conducted a series of trace-based experiments to validate the efficacy of DBSpam. By replaying the collected traces with our prototype, we attempt to answer the following questions: (1) how fast can DBSpam detect spam laundering, (2) how accurate is the detection result of DBSpam, and (3) how many system resources does DBSpam consume?

6.1 Data Collection

The campus network is connected to the Internet via an OC-3 data link. A Snort-based NIDS [Roesch 1999] is deployed on the edge router of the campus network to block any suspicious proxy traffic (e.g., SOCKS and HTTP) via signature checking. All outgoing e-mail messages must go through the main e-mail server, and secure authentication is enforced. This well-protected campus network provides an ideal platform to assess the false positive ratio of DBSpam on normal network traffic; according to the IT department, proxy-based spamming activities on this campus network are very rare.

To evaluate the detection time and accuracy of DBSpam on spam laundering, we generate "spam" traffic, including both plain-text and encrypted proxy traffic, with the cooperation of the IT department. Although the IT monitoring systems can detect plain-text proxy traffic by checking content, our encrypted proxy traffic successfully evades their detection. The generated spamming scenario is similar to the one shown in Figure 1. The campus network plays the role of network N. We use two home PCs outside the campus network, located in two different ISP broadband networks, to emulate two spam sources. The spam sink (MTA M in Figure 1) is located in the dark net of the campus network; the dark net is a special subnet that directly links to the edge router and is used to dump all malicious traffic. One SOCKS proxy and one HTTP proxy running in two different subnets of the campus network form a proxy chain. We use a common spamware tool and sockschain⁹ to emulate proxy-chain spamming. The spam messages are sent from the two home PCs through the proxy chain and destined to the spam sink. The data collection point is just before the edge router and can see all the traffic passing through the edge router. We use tcpdump to capture all small bidirectional TCP packets, with the snaplen set to 75 bytes.

We collected multiple traces of normal and spam traffic in two different months. The detailed information of the traces is listed in Table I, and additional explanations are given below.

9Both are binary Windows programs, so we cannot modify their code.


Table I. Trace Information

trace   duration (sec)   packets       average pkt/sec   size (MB)   pkt miss rate   threads/spammer
S-1-A   770              3,872,550     5,029             295         <0.001          1
S-1-B   674              4,178,567     6,200             318         <0.001          3
S-1-C   756              4,509,336     5,965             343         <0.001          1
S-2-A   654              12,036,413    18,404            931         0.008           1
S-2-B   1,385            26,422,563    19,078            2,044       0.005           3
S-2-C   1,398            26,172,898    18,722            2,018       0.005           1
N-1     5,116            24,434,518    4,776             1,851       <0.001          -
N-2     14,944           297,733,228   19,923            22,950      0.006           -

with packet length less than 300 bytes, as DBSpam only utilizes SMTP reply messages for detection, which are usually conveyed by TCP packets shorter than 300 bytes. Second, we collected two kinds of traces to evaluate the performance of DBSpam, one with generated spam traffic and the other without. All traces include the normal background SMTP traffic passing through the campus network. The name of a trace follows the format "{S|N}-{1|2}-{A|B|C}": S (N) indicates that the trace has Spam (No spam) traffic, 1 (2) refers to the month of trace collection, and A (B, C) is used only for spam traces and stands for a different spam scenario. Third, in order to validate DBSpam for detecting both plain-text and encrypted spam traffic, we injected encrypted and compressed spam traffic through SSH tunneling into traces S-*-C (* is either 1 or 2), and injected plain-text spam traffic into S-*-A and S-*-B. Fourth, a multi-threaded spamming technique was used in S-*-B to validate the efficacy of DBSpam in a multi-threaded spamming scenario. N-threaded spamming means that up to N upstream connections may be issued simultaneously from the spam source to a proxy for spam laundering.

6.2 Detection Time

The overall detection time of DBSpam is determined by the SPRT detection time, the noise-reduction time window Δ, and the number of consecutive windows M. Among these three factors, the SPRT detection time is the fundamental one, as it bounds the value of the time window Δ. In the following, we focus on estimating the SPRT detection time.

6.2.1 SPRT Detection Time. We evaluate SPRT detection time from two perspectives: the number of observations needed to reach a decision and the actual time spent by SPRT.

Number of Observations N: The theoretical average numbers of observations under the spam hypothesis (E[N|H1]) and the nonspam hypothesis (E[N|H0]) can be easily computed based on Equations (10) and (11). In our evaluation, they are rounded to 6 and 3, respectively, with α* = 0.005, β* = 0.01, θ0 = e^{-1}, and θ1 = 0.99. Table II shows the distribution of N|H1 in the six spam traces. The results clearly demonstrate the dominance of (N = 6) in all traces. The comparatively low percentage of (N = 6) in trace S-1-C is mainly caused by the abnormally high packet miss rate of the spam traffic (not of the whole trace). Note that, due to the characteristics of SPRT, the detection of connection correlation (H1) can only be reached after a certain number of observations, such as 6 and 11.
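Equations (10) and (11) are not reproduced in this excerpt; assuming they are the standard Wald approximations for the average sample number of a Bernoulli SPRT, the sketch below yields about 5.5 under H1 (consistent with the 5.5 reported later in Table VI and the rounded value of 6 above) and about 2 under H0. The function name is ours.

    from math import e, log

    def expected_observations(theta0, theta1, alpha, beta):
        # Standard Wald approximations for the average number of observations
        # of a Bernoulli SPRT with error targets alpha (false positive) and
        # beta (false negative); assumed to correspond to Equations (10)/(11).
        a = log((1 - beta) / alpha)      # log-threshold for accepting H1
        b = log(beta / (1 - alpha))      # log-threshold for accepting H0
        def step(theta):                 # expected per-observation log-likelihood ratio
            return (theta * log(theta1 / theta0)
                    + (1 - theta) * log((1 - theta1) / (1 - theta0)))
        e_n_h1 = ((1 - beta) * a + beta * b) / step(theta1)
        e_n_h0 = (alpha * a + (1 - alpha) * b) / step(theta0)
        return e_n_h1, e_n_h0

    print(expected_observations(1 / e, 0.99, 0.005, 0.01))   # ~ (5.5, 2.0)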


Table II. Distribution of N|H1

Trace   N = 6          N = 11       N ≥ 16
S-1-A   970 (100%)     0            0
S-1-B   5019 (96.9%)   139 (2.7%)   21 (0.4%)
S-1-C   2245 (92.8%)   169 (7.0%)   6 (0.2%)
S-2-A   433 (99.1%)    3 (0.7%)     1 (0.2%)
S-2-B   4298 (94.7%)   198 (4.4%)   40 (0.9%)
S-2-C   1758 (98.9%)   16 (1.0%)    3 (0.1%)

Fig. 6. Distribution of N|H0.

Figure 6 shows the distribution of N|H0 for the nonspam traces N-1 and N-2. The curves indicate that SPRT can filter out at least 95% of normal connections within four observations. The distributions of N|H0 for the spam traces are similar to those for the nonspam traces.

Actual Detection Time of SPRT: After recording the start and end points of each SPRT on the six spam traces, we derive all detection times in these traces and plot their CDFs (cumulative distribution functions) in Figure 7. For CDF plotting, each detection time is rounded up to the next whole second (e.g., 1.2s is counted as 2s). We classify the results from the six traces into two groups, "S-1" and "S-2", since the results within each group are very similar. As shown in Figure 7, 95% of detections are made within five seconds. Note that the actual detection time is roughly the duration of six reply rounds of an SMTP connection, since the computation overhead of SPRT is negligible. The difference between the "S-1" and "S-2" curves is due to the inferior link quality in the "S-2" experiments.

6.3 Detection Accuracy

Since the detection module of DBSpam has two phases, SPRT detection and noise reduction, we first evaluate the false positives and false negatives of SPRT


Fig. 7. CDF of detection time for SPRT.

Table III. False Positives and False Negatives of SPRT

Trace   Detection   TPs     FPs     TNs          FPs/(FPs+TNs)   Spam Conns   Missed Conns   Miss Ratio
S-1-A   970         966     4       290,889      1.4e-5          958          8              0.008
S-1-B   5,179       5,108   71      1,156,085    6.1e-5          570          2              0.004
S-1-C   2,420       2,369   51      596,979      8.5e-5          324          0              0
S-2-A   437         320     117     1,634,307    7.2e-5          329          6              0.018
S-2-B   4,536       3,510   1,026   8,895,993    1.2e-4          1,351        27             0.020
S-2-C   1,777       1,558   219     4,266,100    5.1e-5          969          13             0.013
N-1     66          -       66      687,390      9.6e-5          -            -              -
N-2     2,368       -       2,368   15,941,150   1.5e-4          -            -              -
*TP: True Positive, FP: False Positive, TN: True Negative

detection, and then present the overall detection accuracy of DBSpam after noise reduction.

6.3.1 Accuracy of SPRT. False Positives: The left part of Table III shows the false positives of SPRT in the different traces. The Detection column is the total number of correlations reported by SPRT, and the True Positives (TP) and False Positives (FP) columns break down the outcomes of those detections. The True Negatives (TN) column lists the number of tests on normal connections that are correctly identified. According to the definition of the false positive probability, α = FPs/(FPs + TNs), the probabilities in all traces are well below 0.0002, indicating that the false positive probability of SPRT is fairly small in practice.

False Negatives: We estimate the false negatives by counting the number of proxy connections that are missed by SPRT, and compute the ratio of missed spam connections, as shown in the right part of Table III. The false negatives of SPRT are attributed to missed packets in the spam traces. The three spam traces S-2-A/B/C contain both long SMTP connections (no fewer than 10 reply rounds) and short SMTP connections (six reply rounds). More than half of the total connections are short SMTP connections.


Table IV. Overall False Positives of DBSpam (Δ = 2s)

                  (M, K)
Trace   (3, 2)    (4, 3)   (5, 3)   (5, 4)
S-1-A   0/188     0/138    0/124    0/110
S-1-B   0/162     0/126    0/103    0/103
S-1-C   0/194     0/150    0/124    0/123
S-2-A   0/65      0/36     0/52     0/27
S-2-B   13/335    3/243    4/216    0/186
S-2-C   0/193     0/124    0/135    0/94
N-1     0/0       0/0      0/0      0/0
N-2     7/7       1/1      2/2      0/0
*Data Format: # of false positives / # of total detections

For those short spam connections with only six reply rounds, if any packet on either the upstream connection or the downstream connection is missed in the trace, SPRT cannot reach a decision, leading to a false negative. A simple estimation shows that the observed miss ratios are plausible. For simplicity, we assume that the packet miss rate p is constant throughout a trace. Then the probability of exactly one packet missing in six reply rounds is approximated by \binom{12}{1} p (1 − p)^{11}. If p = 0.005 (the packet miss rate of traces S-2-B/C), this probability is around 0.057, which is larger than the miss ratios shown in Table III.
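As a quick sanity check of that estimate, the sketch below evaluates the single-missing-packet approximation, assuming one upstream and one downstream packet per reply round (12 packets over six rounds); the function name is ours.

    from math import comb

    def single_miss_prob(p, reply_rounds=6):
        # Probability that exactly one of the 2 * reply_rounds packets is
        # missing, given a constant per-packet miss rate p.
        n = 2 * reply_rounds
        return comb(n, 1) * p * (1 - p) ** (n - 1)

    print(single_miss_prob(0.005))   # ~0.057, as quoted for traces S-2-B/C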

6.3.2 DBSpam Accuracy after Noise Reduction. To investigate the efficacy of noise reduction, we first need to determine the value of the time window Δ. Figure 7 shows that over 80% of all SPRTs on spam traces terminate within 2 seconds, so we set the time window Δ to 2 seconds. For (M, K), we test several combinations; the final detection results are shown in Table IV, where the data format is "number of FPs / number of overall detections". From the table, we can see that noise reduction eliminates the majority of SPRT's false positives, because most wrongly classified correlations occur only sporadically. The number of false positives of DBSpam approaches zero when (1) M and K are relatively large and (2) the gap between M and K is small. This dynamics of false positive reduction fits well with the analysis in Section 5.5. For our traces, any combination with M of 4 or 5 and K of 3 or 4 achieves fairly high accuracy. Of course, the high detection accuracy is achieved at the cost of lower detection sensitivity; a tradeoff always exists between accuracy and sensitivity in network anomaly detection. However, even when the time window Δ is set to 2 seconds and M is set to 5, the overall delay of DBSpam detection is just 10 seconds, with much higher accuracy.

Currently, most false positives of DBSpam are induced by P2P applications. Their ability to spawn thousands of connections per second and their periodic PING/PONG communication pattern give P2P applications a much higher probability of being correlated than any other applications. Because P2P traffic accounts for an overwhelming share of bandwidth consumption, many ISPs and university networks in the U.S. have restricted the maximum number of connections that P2P applications can establish, which helps reduce the false positives of DBSpam.
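The (M, K) window-voting rule evaluated above can be sketched as follows; this is our own simplified illustration, not the authors' implementation, and the class and method names are hypothetical.

    from collections import defaultdict, deque

    class NoiseReducer:
        # Confirm a candidate spam source only if SPRT reports it in at least
        # k of the last m time windows of length Delta (e.g., m=4, k=3, Delta=2s).
        def __init__(self, m=4, k=3):
            self.m, self.k = m, k
            self.history = defaultdict(lambda: deque(maxlen=m))

        def end_of_window(self, reported_sources, candidates):
            # Called once per Delta-window with the sources SPRT flagged in it.
            confirmed = []
            for src in candidates:
                votes = self.history[src]
                votes.append(1 if src in reported_sources else 0)
                if len(votes) == self.m and sum(votes) >= self.k:
                    confirmed.append(src)
            return confirmed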


Table V. Resource Consumption

Trace   CPU Util   CPU Time   pps       Peak Mem
S-1-A   36.3%      9.0s       430,283   2.2MB
S-1-B   37.7%      9.8s       426,384   1.6MB
S-1-C   24.0%      9.3s       484,875   1.2MB
S-2-A   58.0%      36.8s      327,076   11.9MB
S-2-B   84.3%      109.2s     241,965   10.5MB
S-2-C   57.1%      78.6s      332,989   2.8MB
N-1     21.7%      51.1s      478,171   5.6MB
N-2     32.1%      789.9s     376,925   8.4MB

6.4 Resource Consumption

According to Table I, the arrival rate of small TCP packets at the edge router can reach around 20,000 packets per second (pps), which DBSpam must be able to handle. Current high-end PCs can meet this requirement without much difficulty. Using a Dell Precision 360 machine with a Pentium-4 3GHz CPU and 512MB of memory, we run the prototype of DBSpam on each trace multiple times. We use time and ps to measure CPU and memory usage. The results are listed in Table V. The average packet processing rate of DBSpam is computed by dividing the total number of packets in a trace by the processing time (CPU Time). The processing rates clearly demonstrate that DBSpam is capable of working in high-speed networks. Even in the worst case, DBSpam can still handle 241,965 pps, which is over 10 times the required processing speed. Memory consumption of DBSpam is mainly determined by two factors: the number of active SMTP connections and the number of outbound TCP connections during each SMTP reply round. Thus, the peak memory consumption is not necessarily determined by the network traffic volume. As DBSpam only needs to maintain very little state, and only a very small portion of connections (on the order of the false positive probability) need to maintain state for a relatively long time (the lifespan of an SMTP connection), the overall memory consumption should not be a problem. Also note that the memory management of our prototype is quite naive, since our focus is mainly on correctness, not performance.
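The per-trace processing rates in Table V can be cross-checked against the packet counts in Table I; for example, for the worst case, S-2-B:

    # Average packet-processing rate = packets in trace / CPU time,
    # using the S-2-B figures from Tables I and V.
    packets, cpu_seconds = 26_422_563, 109.2
    print(round(packets / cpu_seconds))   # ~241,965 pps, matching Table V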

6.5 Suppressing Spam Activities

Once the spam proxies and the spam sources behind them are identified, it is straightforward to suppress the spam activities inside the customer network. Two commonly used approaches to suppressing proxy-based spam activities are rate-limit throttling and blocking.

We suggest blocking the inbound TCP traffic from the spam source to its abused proxies. In general, the spam source is most likely a compromised machine or the end host where a spammer resides, and frequent innocent communication between a spam source and its proxies is rare. Therefore, the collateral damage of blocking traffic from these identified spam sources should be minor.

On the other hand, there may exist legitimate e-mail traffic between a spam proxy and the MTA, as a legitimate user residing on the proxy machine may


Fig. 8. Comparison of number of messages sent out before and after throttling.

also send e-mail. To minimize the collateral damage, we conduct rate-limit throttling on the outbound SMTP traffic from spam proxies instead of simply blocking it. The rate limit is based on the normal e-mail traffic behavior between a nonspam client and the MTA, and can be tuned by network administrators.

To evaluate the efficacy of DBSpam on spam suppression, we activate the suppression module of DBSpam and simulate spam suppression on the collected traces. We use two machines, one as the traffic generator and the other as the traffic sink. We use tcpreplay [Turner 2006] to inject traffic onto the wire by replaying the traces on the traffic generator, and then have DBSpam detect and suppress spam activities on the traffic sink. The traffic sink simulates the edge gateway in a real environment.

We first examine DBSpam in spam throttling. We set the maximal mail sending rate to one message per second and the throttling duration to 30 seconds. The suppression module silently drops the excess messages. Here we record message numbers by counting the "RCPT" commands that appear between the "MAIL" and "DATA" commands of a transfer transaction. Mail transactions with multiple "RCPT" commands are delayed to meet the maximal sending rate. The parameters Δ, M, and K of the detection module are set to 2s, 4, and 3, respectively. After the detection module fires an alarm, the suppression module is activated to throttle the spam proxy on the downstream connection of the laundering path, and the throttling lasts for the predefined time (i.e., 30s). Figure 8 shows the experimental results of DBSpam in throttling spam activities. Figure 8(a) shows an excerpt (from 100s to 300s in trace time) of the throttling result for trace S-1-B, and Figure 8(b) shows the corresponding result for trace S-2-C. The dynamics of the spam message rate with and without throttling are shown as the solid line and the dashed line, respectively. It is evident that once suppression is turned on, the spam sending rate immediately drops and is limited to 1 message/second for the next 30 seconds. The alternation of the detection phase and the suppression phase is also clearly shown in Figure 8.
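A minimal sketch of this throttling bookkeeping is given below, assuming the thresholds used in the experiment (1 message/second, 30-second throttling window); the class and method names are ours, and the SMTP parsing that counts RCPT commands is abstracted away.

    import time

    MAX_MSGS_PER_SEC = 1.0     # maximal mail sending rate used in the experiment
    THROTTLE_DURATION = 30.0   # seconds a flagged proxy stays throttled

    class SmtpThrottle:
        # Decide, per message (i.e., per RCPT counted between MAIL and DATA),
        # whether a throttled proxy may send it now.
        def __init__(self):
            self.throttled_until = {}   # proxy IP -> throttle expiry time
            self.last_sent = {}         # proxy IP -> time of last allowed message

        def throttle(self, proxy_ip, now=None):
            now = time.time() if now is None else now
            self.throttled_until[proxy_ip] = now + THROTTLE_DURATION

        def allow_message(self, proxy_ip, now=None):
            now = time.time() if now is None else now
            if now >= self.throttled_until.get(proxy_ip, 0.0):
                return True                       # not (or no longer) throttled
            if now - self.last_sent.get(proxy_ip, 0.0) >= 1.0 / MAX_MSGS_PER_SEC:
                self.last_sent[proxy_ip] = now    # within the rate limit
                return True
            return False                          # excess message: drop or delay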


Fig. 9. Comparison of TCP packet numbers before and after blocking.

Then we test DBSpam in blocking TCP traffic from detected spam sources. The blocking technique is quite simple: TCP packets from the flagged IP addresses are dropped. We use the same experimental setup for the blocking test as for the throttling test, that is, the same parameter settings for the detection module and a 30-second blocking duration. Figure 9 illustrates the dynamics of the TCP traffic from a specific spam source with and without blocking. Figure 9(a) is for trace S-1-B, and Figure 9(b) is for trace S-2-C. Again, the dynamics of the observed TCP packets with and without blocking are shown as the solid line and the dashed line, respectively. From Figure 9, we can see that the TCP traffic from the spam source is completely blocked during the suppression phase in both cases.
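The blocking side is even simpler and could be sketched as follows, again using our own hypothetical names and the 30-second window from the experiment.

    import time

    BLOCK_DURATION = 30.0   # seconds, matching the blocking experiment

    blocked_until = {}      # spam-source IP -> blocking expiry time

    def block(src_ip, now=None):
        blocked_until[src_ip] = (time.time() if now is None else now) + BLOCK_DURATION

    def should_drop(src_ip, now=None):
        # Drop inbound TCP packets from a flagged spam source while its
        # blocking window is still active.
        now = time.time() if now is None else now
        return now < blocked_until.get(src_ip, 0.0)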

7. POTENTIAL EVASIONS

In this ongoing arms race between spammers and anti-spammers, we envision that sufficiently aggressive spammers will seek sophisticated techniques to evade DBSpam. This is especially true for a spammer who fully controls remote spam proxy machines and can deploy arbitrarily customized software. Such a spammer may use non-off-the-shelf proxy programs, which manipulate the traffic between the spam source and the first-hop proxy, to break packet symmetry. One possible way is to split a single reply packet from the SMTP server into n fragmented packets on the first-hop proxy and then transfer them back to the spam source.

However, as long as enough observations are collected, DBSpam can still capture such potential evasions. Recall that the effect of this packet splitting on the SPRT model is simply a change in the value of θ0, which now measures the probability of observing 1 to n outbound TCP packets in a reply round. So, instead of θ0 = Pr(M = 1), now θ0 = Pr(M = 1) + ... + Pr(M = n). According to Equation (10), with other parameters unchanged, the larger value of θ0 increases the average number of observations needed to detect a spam proxy. On the other hand, not all SMTP transactions have enough reply rounds for detection; because more observations are required, short-lived spamming activities may go undetected.


Table VI. False Positive Comparisons (M = 5, K = 4, Δ = 2s)

θ0     α*      E[N|H1]   S-1-A   S-1-B   S-1-C   S-2-A   S-2-B   S-2-C   N-1   N-2
e^-1   0.005   5.5       0/110   0/103   0/123   0/27    0/186   0/94    0/0   0/0
0.5    0.005   8.1       0/0     0/103   0/120   0/0     0/97    0/32    0/0   8/8
0.5    0.01    7.1       0/110   0/103   0/121   0/21    2/159   0/89    0/0   12/12
0.5    0.02    6.0       0/110   2/105   0/121   0/27    7/194   1/94    0/0   21/21

To demonstrate the capability of DBSpam in capturing such evasions, we relax the definition of packet symmetry so that one or two data packets may appear in one reply round, and adjust θ0 to 0.5.10 We then estimate the overall false positives of DBSpam, which are listed in Table VI under the parameter setting M = 5, K = 4, and Δ = 2s. For comparison, the results without relaxation are listed in the first row, and the results with relaxation in the second row. Clearly, short-lived spamming activities are missed by DBSpam, with zero detections for the S-*-A traces and far fewer detections for the S-2-B and S-2-C traces. However, spamming activities with more reply rounds can still be accurately detected. Since the parameter α*, the expected false positive probability, has an inverse effect on E[N|H1] according to Equation (10), we increase its value from 0.005 to 0.01 and 0.02 to accommodate short SMTP transactions in DBSpam detection. The third and fourth rows of Table VI list the results after this adjustment, showing that DBSpam can capture short-lived spamming activities by appropriately tuning α*. When α* is set to 0.02, DBSpam detects almost all spamming activities, as before. These additional captures come at the cost of only slightly more false positives, a necessary tradeoff in capturing evasive spam proxy traffic. Moreover, any advanced evasion technique that forgoes off-the-shelf proxy software inevitably requires modifications to current spamming methods and degrades spam laundering efficiency; the customized proxy software also increases the cost of spamming. Overall, in the anti-spam-vs-spam arms race, DBSpam significantly raises the bar against e-mail spam by breaking the laundering and tracing out the real spam sources.
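The E[N|H1] column of Table VI follows directly from the Wald approximation sketched in Section 6.2.1; for convenience, a self-contained version of that calculation (our naming, assumed to match Equation (10)) is shown below.

    from math import e, log

    def expected_n_h1(theta0, theta1, alpha, beta):
        # Wald approximation of E[N|H1] for a Bernoulli SPRT.
        a, b = log((1 - beta) / alpha), log(beta / (1 - alpha))
        step = (theta1 * log(theta1 / theta0)
                + (1 - theta1) * log((1 - theta1) / (1 - theta0)))
        return ((1 - beta) * a + beta * b) / step

    for theta0, alpha in [(1 / e, 0.005), (0.5, 0.005), (0.5, 0.01), (0.5, 0.02)]:
        print(round(expected_n_h1(theta0, 0.99, alpha, 0.01), 1))
    # 5.5, 8.1, 7.1, 6.0 -- the E[N|H1] column of Table VI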

8. CONCLUSION

In this article, we present a simple and effective system, DBSpam, to detect and break proxy-based e-mail spam laundering activities inside a customer network and to trace the corresponding spam sources outside the network. Instead of content checking, DBSpam leverages the protocol semantics and timing causality of proxy-based spamming to identify spam proxies and the real spam sources behind them. Based on the principles of connection correlation and packet symmetry, DBSpam monitors the bidirectional traffic passing through a network gateway and utilizes a simple statistical method, Sequential Probability Ratio Test, to quickly filter out innocent connections and identify the spam laundering path with high probability. To further reduce false positives and false negatives, we propose a noise reduction technique that makes spammer tracking more accurate after gathering consecutive correlation detection results.

10 Note that θ0 never exceeds 0.5 in all our traces with various packet lengths from 150 to 300 bytes.

We implement a prototype of DBSpam using libpcap on Linux, and conduct trace-based experiments to evaluate its effectiveness. Our experimental results reveal that DBSpam can be tuned to detect spam proxies and sources with low false positives and false negatives within seconds. After detecting spam proxies and the related spam sources, DBSpam can effectively throttle or block the spam traffic.

REFERENCES

ANDREOLINI, M., BULGARELLI, A., COLAJANNI, M., AND MAZZONI, F. 2005. Honeyspam: Honeypots fighting spam at the source. In Proceedings of the 1st USENIX Workshop on Steps to Reducing Unwanted Traffic on the Internet (SRUTI'05). Cambridge, MA, 77–83.
BÄCHER, P., HOLZ, T., KÖTTER, M., AND WICHERSKI, G. 2005. Know your enemy: Tracking botnets. http://www.honeynet.org/papers/bots/.
BACK, A. 1997. Hashcash: A denial of service counter-measure. http://www.hashcash.org/papers/hashcash.pdf.
BLOSSER, J. AND JOSEPHSEN, D. 2004. Scalable centralized Bayesian spam mitigation with Bogofilter. In Proceedings of the 18th USENIX Large Installation Systems Administration Conference (LISA'04). Atlanta, GA, 1–20.
BLUM, A., SONG, D. X., AND VENKATARAMAN, S. 2004. Detection of interactive stepping stones: Algorithms and confidence bounds. In Proceedings of the 7th International Symposium on Recent Advances in Intrusion Detection (RAID'04). Sophia Antipolis, France.
CBL. 2007. Composite blocking list. http://cbl.abuseat.org.
DELANY, M. 2006. Domain-based e-mail authentication using public keys advertised in the DNS (DomainKeys). RFC 4870.
GARRISS, S., KAMINSKY, M., FREEDMAN, M. J., KARP, B., MAZIÈRES, D., AND YU, H. 2006. Re: Reliable e-mail. In Proceedings of the 3rd USENIX Symposium on Networked Systems Design and Implementation (NSDI'06). San Jose, CA, 297–310.
GBURZYNSKI, P. AND MAITAN, J. 2004. Fighting the spam wars: A re-mailer approach with restrictive aliasing. ACM Trans. Intern. Techn. 4, 1, 1–30.
GELLENS, R. AND KLENSIN, J. C. 1998. Message submission. RFC 2476.
GRAHAM, P. 2002. A plan for spam. http://www.paulgraham.com/spam.html.
HERSHKOP, S. AND STOLFO, S. J. 2005. Combining e-mail models for false positive reduction. In Proceedings of the 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD'05). Chicago, IL, 98–107.
HUNTER, T., TERRY, P., AND JUDGE, A. 2003. Distributed tarpitting: Impeding spam across multiple servers. In Proceedings of the 17th USENIX Systems Administration Conference (LISA'03). San Diego, CA, 223–236.
IOANNIDIS, J. 2003. Fighting spam by encapsulating policy in e-mail addresses. In Proceedings of the 10th Annual Network and Distributed System Security Symposium (NDSS'03). San Diego, CA, 1–8.
JUNG, J., PAXSON, V., BERGER, A. W., AND BALAKRISHNAN, H. 2004. Fast portscan detection using sequential hypothesis testing. In Proceedings of the 25th IEEE Symposium on Security and Privacy (SSP'04). Oakland, CA, 211–225.
JUNG, J. AND SIT, E. 2004. An empirical study of spam traffic and the use of DNS black lists. In Proceedings of the ACM SIGCOMM Internet Measurement Conference (IMC'04). Taormina, Italy, 370–375.
KLENSIN, J. 2001. Simple mail transfer protocol. RFC 2821.
KRISHNAMURTHY, B. AND BLACKMOND, E. 2004. SHRED: Spam harassment reduction via economic disincentives. http://www.research.att.com/ bala/papers/shred-ext.pdf.
LEECH, M., GANIS, M., LEE, Y., KURIS, R., KOBLAS, D., AND JONES, L. 1996. SOCKS protocol version 5. RFC 1928.


LI, K., PU, C., AND AHAMAD, M. 2004. Resisting spam delivery by TCP damping. In Proceedings of the 1st Conference on E-mail and Anti-Spam. Mountain View, CA, 191–198.
LI, K. AND ZHONG, Z. 2006. Fast statistical spam filter by approximate classifications. In Proceedings of the ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS'06). St. Malo, France, 347–358.
LYON, J. AND WONG, M. W. 2004. Sender ID: Authenticating e-mail. RFC 4406.
MARID. 2004. MTA authorization records in DNS. http://www.ietf.org/html.charters/OLD/marid-charter.html.
MESSAGELABS. 2006. MessageLabs intelligence annual e-mail security report 2006. http://www.messagelabs.com/Threat Watch/.
MICROSOFT. 2003. The Penny Black project. http://research.microsoft.com/research/sv/PennyBlack/.
POSTINI. 2006. Sender behavior analysis. http://www.postini.com.
PRAKASH, V. V. 2007. Vipul's Razor. http://razor.sourceforge.net/.
PROVOS, N. 2004. A virtual honeypot framework. In Proceedings of the 13th USENIX Security Symposium (SECURITY'04). San Diego, CA, 1–14.
RADOSAVAC, S., BARAS, J. S., AND KOUTSOPOULOS, I. 2005. A framework for MAC protocol misbehavior detection in wireless networks. In Proceedings of the 4th ACM Workshop on Wireless Security (WiSe'05). Cologne, Germany, 33–42.
RAMACHANDRAN, A., DAGON, D., AND FEAMSTER, N. 2006. Can DNS-based blacklists keep up with bots? In Proceedings of the 3rd Conference on E-mail and Anti-Spam (CEAS'06). Mountain View, CA, 55–56.
RAMACHANDRAN, A. AND FEAMSTER, N. 2006. Understanding the network-level behavior of spammers. In Proceedings of the ACM Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications (SIGCOMM'06). Pisa, Italy, 291–302.
RHYOLITE. 2000. Distributed checksum clearinghouse (DCC). http://www.rhyolite.com/anti-spam/dcc/.
ROESCH, M. 1999. Snort: Lightweight intrusion detection for networks. In Proceedings of the 13th USENIX Systems Administration Conference (LISA'99). Seattle, WA, 229–238.
SECURITYTRACKER. 2001. Formmail.pl web-to-e-mail CGI script allows unauthorized users to send mail anonymously. http://www.securitytracker.com/alerts/2001/Mar/1001108.html.
SORBS. 2006. Spam and open relay blocking system (SORBS). http://www.sorbs.net/.
SPAMASSASSIN. 2006. The Apache SpamAssassin project. http://spamassassin.apache.org/.
SPAMHAUS. 2005. Increasing spam threat from proxy hijackers. http://www.spamhaus.org/news.lasso?article=156.
SPAMLINKS. 2006. Challenge/response spam filters. http://spamlinks.net/filter-cr.htm.
TOPLAYER. 2006. http://www.toplayer.com.
TURNER, A. 2006. Tcpreplay. http://tcpreplay.synfin.net/trac/.
TWINING, R. D., WILLIAMSON, M. M., MOWBRAY, M., AND RAHMOUNI, M. 2004. E-mail prioritization: Reducing delays on legitimate mail caused by junk mail. In Proceedings of the USENIX Annual Technical Conference (USENIX'04). Boston, MA, 45–58.
WALD, A. 2004. Sequential Analysis. Dover Publications.
WALFISH, M., ZAMFIRESCU, J., BALAKRISHNAN, H., KARGER, D., AND SHENKER, S. 2006. Distributed quota enforcement for spam control. In Proceedings of the 3rd USENIX Symposium on Networked Systems Design and Implementation (NSDI'06). San Jose, CA, 281–296.
WATSON, D., HOLZ, T., AND MUELLER, S. 2005. Know your enemy: Phishing. http://www.honeynet.org/papers/phishing/.
WILLIAMSON, M. M. 2003. Design, implementation and test of an e-mail virus throttle. In Proceedings of the 19th Annual Computer Security Applications Conference (ACSAC'03). Las Vegas, NV, 76–85.
WONG, M. W. AND SCHLITT, W. 2006. Sender policy framework (SPF) for authorizing use of domains in e-mail, version 1. RFC 4408.
WOOLRIDGE, D., LAW, J., AND KAWASAKI, M. 2004. The qmail spam throttle mechanism. http://spamthrottle.qmail.ca/man/qmail-spamthrottle.5.html.


YERAZUNIS, B. 2003. CRM114 - the controllable regex mutilator. http://crm114.sourceforge.net.
ZHANG, Y. AND PAXSON, V. 2000. Detecting stepping stones. In Proceedings of the 9th USENIX Security Symposium (SECURITY'00). Denver, CO, 171–184.
ZHOU, F., ZHUANG, L., ZHAO, B. Y., HUANG, L., JOSEPH, A. D., AND KUBIATOWICZ, J. 2003. Approximate object location and spam filtering on peer-to-peer systems. In Proceedings of the 4th ACM/IFIP/USENIX International Middleware Conference (MIDDLEWARE'03), Rio de Janeiro, Brazil. M. Endler and D. Schmidt, eds. Lecture Notes in Computer Science, vol. 2672. Springer, Berlin, Germany, 1–20.

Received February 2007; revised July 2007; accepted August 2007
