Gone Rogue: An Analysis of Rogue Security Software Campaigns (Invited Paper)

Marco Cova∗, Corrado Leita†, Olivier Thonnard‡, Angelos Keromytis† and Marc Dacier†
∗University of California, Santa Barbara, [email protected]
†Symantec Research Labs, {Corrado Leita,Angelos Keromytis,Marc Dacier}@symantec.com
‡Royal Military Academy, Belgium, [email protected]

Abstract

In the past few years, Internet miscreants have developed a number of techniques to defraud and make a hefty profit out of their unsuspecting victims. A troubling, recent example of this trend is cyber-criminals distributing rogue security software, that is, malicious programs that, by pretending to be legitimate security tools (e.g., anti-virus or anti-spyware), deceive users into paying a substantial amount of money in exchange for little or no protection.

While the technical and economic aspects of rogue security software (e.g., its distribution and monetization mechanisms) are relatively well understood, much less is known about the campaigns through which this type of malware is distributed, that is, the underlying techniques and coordinated efforts employed by cyber-criminals to spread their malware.

In this paper, we present the techniques we used to analyze rogue security software campaigns, with an emphasis on the infrastructure employed in the campaign and the life-cycle of the clients that they infect.

1. Introduction

A rogue security software program is a type of misleading application that pretends to be legitimate security software, such as an anti-virus scanner, but which actually provides the user with little or no protection. In some cases, rogue security software (in the following, more compactly, rogue AV) actually facilitates the installation of the very malicious code that it purports to protect against.

Rogue AV makes its way onto victim machines in two prevalent ways. First, social engineering techniques can be used to convince inexperienced users that a rogue tool is legitimate and that its use is necessary to remediate often nonexistent or exaggerated threats found on the victim's computer. A second, more stealthy technique consists of attracting victims to malicious web sites that exploit vulnerabilities in the client software (typically, the browser or one of its plugins) to download and install the rogue programs without any user intervention.

Rogue AV programs are distributed by cyber-criminals to generate a financial profit. In fact, after the initial infection, victims are typically tricked into paying for additional tools or services (e.g., to upgrade to the full version of the program or to subscribe to an update mechanism), which are most often fictitious or completely ineffective. The cost of these scams ranges from $30 to $100.

Despite its reliance on traditional and relatively unsophisticated techniques, rogue AV has emerged as a major security threat, in terms of the size of the affected population (Symantec's sensors alone reported 43 million installation attempts over a one-year period [1]), the number of different variants unleashed in the wild (over 250 distinct families of rogue tools have been classified [1]), and the volume of profits generated for cyber-criminals (upward of $300,000 a month in affiliate commissions alone [2]).

The prevalence and effectiveness of this threat have spurred considerable research by the security community [1], [3], [4]. These studies have led to a better understanding of the technical characteristics of this malware (e.g., its advertising and installation techniques) and of the quantitative aspects of the overall threat (e.g., the number and geolocation of the web sites involved in the distribution of rogue programs and of their victims).

However, a number of areas have not been fully explored. In particular, malware code, the infrastructure used to distribute it, and the victims that encounter it do not exist in isolation, but are different aspects of the coordinated effort by cyber-criminals to spread rogue AV. We refer to such a coordinated activity as a campaign. In our work, rather than examining or measuring single aspects of a rogue AV campaign, we analyzed the campaign as a whole, focusing, in particular, on understanding its infrastructure (e.g., the web servers, DNS servers, and web sites it uses) and the way it is created and managed; and on how it affects the clients that interact with it. More precisely, the main contributions of our work are:

• We developed a methodology to identify the server components used in a rogue security software campaign and to learn any emerging patterns in the ways these are set up and organized.
• We leveraged the results of our analysis to provide insights into the tools, techniques, and strategies followed by current campaigns.
• Finally, we have performed an initial analysis of the behavior and life-cycle of infected clients, as observed from within the infrastructure responsible for the infection (as opposed to the client end-point).

2. Our Approach

Server-side analysis. A primary goal of our analysis is to identify the server-side components involved in a campaign. While extensive lists of individual rogue AV-hosting sites are easily obtainable from telemetry data of legitimate anti-virus tools or publicly available blacklists, this data per se does not provide information on the campaigns themselves, for example, whether two sites are part of the same campaign.

One assumption that can reasonably be made is that a campaign is managed by a group of people, who are likely to reuse, at various stages of the campaign, the same techniques, strategies, and tools. Consequently, our approach to studying the infrastructure of rogue AV campaigns (e.g., rogue AV-hosting sites, DNS servers) is based on identifying commonalities in the sites employed in the campaigns. More precisely, we apply multi-criteria clustering to determine groupings of server components with similar characteristics (the interested reader can refer to [5] for the details of our clustering techniques). The features we used for the clustering consisted of a number of "network observables" including IP address(es), DNS names, other DNS entries pointing to the same IP, geolocation information, server identification string and version number, ISP identity, AS number, DNS registrar, DNS registrant, and server uptime. Values for each of these features were collected over a period of approximately two months. The rogue AV-hosting servers that we analyzed using these techniques were identified through a variety of means, including automated and manual feeds. In total, we considered 4,305 rogue AV-hosting IP addresses with 6,500 related domain names.

Our analysis identified 39 distinct clusters comprising 10 or more domains, which we interpret as different campaigns. Figure 1 shows one such cluster, which comprises 15 distinct rogue AV-hosting sites (blue nodes). A number of details support the hypothesis that these sites are indeed part of the same campaign. First, registration data shows that all sites were registered on the same day using only 3 email addresses from .ru domains (red nodes). Second, all sites have been hosted at one point in time in the same AS and, even more strikingly, on consecutive IP addresses (yellow nodes). Furthermore, the site names follow the same naming scheme, which consists of different combinations of basic tokens such as home, av, anti-virus. Finally, even more conclusively, we found that the content of each site was identical (notice that the site content is not a feature of our clustering system).

Other clusters represent more sophisticated campaigns. For example, our analysis isolated a cluster (not discussed here in detail for the sake of space) describing a campaign involving 750 web sites, registered over a period of 8 months, and hosted in several distinct ASes and countries.

Client-side analysis. During our study we found that 6 of the rogue AV-hosting servers we monitored were leaking information about the clients accessing them and their requests. The data available to us during this monitoring was limited to the access time, the IP address of the client, and the specific URL on the server that was accessed. In particular, we did not have access to the content of the client-server communication; for example, we could not be certain whether client requests were successful or not. Despite these limitations, this data provides an interesting view of the (potential and actual) victims of a campaign, as seen from inside the rogue AV infrastructure. We report here some of the results from our 45-day monitoring, during which we observed 372,096 distinct client IP addresses.

Rogue AV victims can issue several different types of requests, depending on their current stage in the infection. For example, scan requests cause a fake scan of the victim's computer to be displayed. These requests are typical of the before-the-infection phase, when the victim should be scared into downloading and installing the rogue AV. Other requests are related to the installation and use of the rogue AV: download requests retrieve the actual binary, and update requests check if new versions of the rogue AV or of its (perfunctory) virus definitions are available. Other classes of requests we observed include accesses to pages that present a payment form (where victims pay for the rogue AV) and payment confirmation pages. Finally, [...]
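The client-side data described above (access time, client IP address, requested URL) lends itself to a simple tally. The following is a minimal sketch, not the tooling used in the study: the URL markers in `REQUEST_MARKERS` are hypothetical placeholders, since the paper does not list the actual paths served by the monitored sites, and the record layout is an assumption. It labels each request with one of the classes named in the text, counts distinct client IPs, and measures how long each /24 subnet kept interacting with a server, which is the grouping used in Figure 2.

```python
from collections import defaultdict
from datetime import datetime
from ipaddress import ip_network

# Hypothetical URL markers (assumption): the real paths are not given in the paper.
REQUEST_MARKERS = {
    "scan": "/scan",
    "download": "/download",
    "update": "/update",
    "payment": "/buy",
    "payment_confirmation": "/thankyou",
}


def classify_request(url_path):
    """Return the first request class whose marker occurs in the path, else 'other'."""
    for label, marker in REQUEST_MARKERS.items():
        if marker in url_path:
            return label
    return "other"


def subnet_of(ip):
    """Map an IPv4 address to its enclosing /24 network, e.g. '192.0.2.0/24'."""
    return str(ip_network(f"{ip}/24", strict=False))


def summarize(records):
    """Summarize (iso_timestamp, client_ip, url_path) records.

    Returns (requests per class, number of distinct client IPs,
    time between first and last request for each /24 subnet).
    """
    counts = defaultdict(int)
    clients = set()
    first_seen, last_seen = {}, {}
    for timestamp, ip, path in records:
        when = datetime.fromisoformat(timestamp)
        counts[classify_request(path)] += 1
        clients.add(ip)
        net = subnet_of(ip)
        first_seen[net] = min(first_seen.get(net, when), when)
        last_seen[net] = max(last_seen.get(net, when), when)
    durations = {net: last_seen[net] - first_seen[net] for net in first_seen}
    return dict(counts), len(clients), durations
```

Grouping by /24 rather than by single address is a rough way to tolerate clients whose address changes within the same network between visits; it will also merge unrelated clients that happen to share a subnet.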

Figure 1. A relatively simple rogue AV campaign.

Figure 2. Duration of client interactions (by /24 subnet).

Given an average price for rogue AV software of between $30 and $50, our analysis indicates that these 6 servers (which [...] distinct entities) [...]
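The cluster in Figure 1 ties sites together through shared registrant addresses and neighbouring IP addresses. As a rough illustration of that idea only, the sketch below links any two sites that share a registrant or sit on the same or consecutive IPv4 addresses, and reports the resulting connected groups. This is a deliberately simplified stand-in, not the multi-criteria clustering of [5], which combines many more features and does not rely on hard "shares a value" links. The input layout and the field names `registrant` and `ip` are assumptions made for the example.

```python
from collections import defaultdict
from ipaddress import ip_address


def find(parent, x):
    """Return the representative of x's group, compressing the path as we go."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def union(parent, a, b):
    """Merge the groups containing a and b."""
    root_a, root_b = find(parent, a), find(parent, b)
    if root_a != root_b:
        parent[root_b] = root_a


def cluster_sites(sites):
    """Group domains that share a registrant or use the same/consecutive IPv4 addresses.

    sites: dict mapping domain -> {"registrant": str, "ip": str} (assumed layout).
    Returns a list of sorted domain lists, largest group first.
    """
    parent = {domain: domain for domain in sites}

    # Link every pair of domains registered with the same address.
    by_registrant = defaultdict(list)
    for domain, info in sites.items():
        by_registrant[info["registrant"]].append(domain)
    for group in by_registrant.values():
        for other in group[1:]:
            union(parent, group[0], other)

    # Link domains whose IP addresses are identical or directly adjacent.
    ordered = sorted(sites, key=lambda d: int(ip_address(sites[d]["ip"])))
    for prev, cur in zip(ordered, ordered[1:]):
        gap = int(ip_address(sites[cur]["ip"])) - int(ip_address(sites[prev]["ip"]))
        if gap <= 1:
            union(parent, prev, cur)

    clusters = defaultdict(set)
    for domain in sites:
        clusters[find(parent, domain)].add(domain)
    return sorted((sorted(c) for c in clusters.values()), key=len, reverse=True)
```

Because links are transitive, one shared value is enough to pull a site into a group, so a single reused hosting address can merge unrelated sites. Weighting several features against each other, as a multi-criteria approach does, is one way to reduce that effect.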

We studied rogue AV campaigns by analyzing server-side and client-side data. Our clustering technique used various server features to identify 39 campaigns out of 6,500 involved sites. These results can be leveraged in several ways. First, they give a more explanatory description of the rogue AV threat, in which, for example, individual, disconnected sites are substituted by sets of related sites and time relationships (e.g., dates of domain registrations) are explicit. Second, campaign-level information reveals the modus operandi of the criminals orchestrating the campaign, for example, their registration and hosting partners, the duration of their efforts, the sophistication of the tools available to them (e.g., to automate the registration of domain names), and the countermeasures they employ

References

[1] Symantec, "Rogue Security Software," http://www.symantec.com/business/theme.jsp?themeid=threatreport, Tech. Rep., 2009.
[2] B. Krebs, "Massive Profits Fueling Rogue Antivirus Market," in Washington Post, 2009.
[3] Microsoft, "Security Intelligence Report (SIR)," http://www.microsoft.com/security/portal/Threat/SIR.aspx, Tech. Rep., 2008.
[4] R. Sherstobitoff and S.-P. Correll, "Rogue Security Software in 2008," ISSA Journal, Jan 2009.
[5] O. Thonnard, W. Mees, and M. Dacier, "Addressing the Attack Attribution Problem Using Knowledge Discovery and Multi-Criteria Fuzzy Decision-Making," in Proceedings of CSI-KDD, 2009.