The Dark Side of Trusting Web Searches: From Blackhat SEO to System Infection

Trend Micro, Incorporated

Marco Dela Vega and Norman Ingal, Threat Response Engineers

A Trend Micro Research Paper | November 2010

CONTENTS

Introduction

Building Doorway Pages

Redirection and Stealth Tactics

Malicious Landing Pages and Damaging Payloads

Conclusion

References


INTRODUCTION

With the endless stream of information available on the Internet, site owners now find it increasingly difficult to get their sites noticed even if their content provides very useful and interesting information on popular subjects. To gain and improve site traffic or to attract visitors, a site now needs to reach the top ranks in search engines via search engine optimization (SEO). SEO’s popularity, however, has also piqued cybercriminals’ attention. In fact, it has become a widely used cybercriminal technique to deliver malware to unsuspecting users’ systems while earning huge amounts of profit, giving rise to what we now know as blackhat SEO.

From the outset, blackhat SEO attacks are relatively simple, as discussed in more detail in a previously published Trend Micro research paper, “How Blackhat SEO Became Big.” Clicking poisoned search results directs unwitting users to malware-hosting sites. What users do not know is that before they end up on the final landing pages, the cybercriminals had to compromise several sites and to instigate a series of redirections, taking users through several compromised sites, in order to deliver the final malware payloads.

Figure 1. Typical blackhat SEO infection diagram

This research paper will explain how cybercriminals leverage blackhat SEO to compromise systems. It will share our observations regarding various sites that have been compromised and on doorway pages that have been specially crafted for use in blackhat SEO attacks. It will also identify the techniques that cybercriminals use to mask infected pages and the different payloads that the said compromised sites deliver. This paper focuses on the overall blackhat SEO-instigated infection chain and provides data on the latest SEO tool kit versions cybercriminals use today. Finally, it provides best practices that anyone who uses a search engine can adhere to in order to prevent system infections as a result of SEO poisoning and to rid infected systems of malware payloads.


Figure 2. How a blackhat SEO attack occurs


BUILDING DOORWAY PAGES

A blackhat SEO infection chain always starts with doorway pages, the landing pages that serve malware. Doorway pages, also known as portals, jumps, gateways, or entry pages, are primarily designed to trick search engines into treating them as legitimate pages.

Cybercriminals have found a way to automate SEO poisoning in such a way that, as a certain topic becomes more popular, related doorway pages instantly appear among the top search results. These pages are usually hosted on specially crafted or on compromised legitimate sites.

Legitimate sites can be compromised either by exploiting improperly configured Web servers or by using known vulnerabilities in server and other Web applications. Most of the compromised sites that host doorway pages ran on Apache servers with Hypertext Preprocessor (PHP) functionality. In several cases, these also used common Web applications such as Joomla! and WordPress as content management systems (CMSs).

We also found several exploit codes in some compromised sites that strongly suggest that cybercriminals also used the said sites to find and exploit other vulnerable sites. These exploit codes varied from site vulnerability scanners to proof-of-concept (POC) codes that target specific vulnerabilities, making both users and site owners potential victims of this threat.

Once a page has been compromised, cybercriminals then set up its SEO components using a tool kit that performs poisoning routines.


Figure 3. Compromised site with an SEO tool kit installed

One of the most interesting components of the SEO tool kits we found in compromised sites is a log file that contains a list of strings and keywords similar to those used as search strings in Google Trends or Yahoo!, which feature trending topics. This clearly shows that cybercriminals harvest the said information as an important part of the infection process, as this will dictate their success in delivering threats to unsuspecting victims.


The list of search strings is managed and controlled by a central command-and-control (C&C) server and is distributed to different compromised sites using a variety of methods. The C&C server also distributes links to other compromised sites, which are appended to doorway pages that have been constructed to improve their ranking among search results.


Figure 4. Search strings and links found in compromised sites

Another component is a record of all kinds of information requests from unwary page visitors. This information may include HTTP requests (i.e., query parameters), visitors’ IP addresses, and user-agents’ HTTP headers. Information about HTTP referrers is also recorded since this is used to verify if a visitor found the doorway page as a search engine result or not.

Figure 5. Log file containing information on a site’s visitors


The blackhat SEO tool kit’s main component is a single PHP script that handles an attack’s overall operation starting from obtaining HTTP requests to generating content for the compromised sites based on the responses. The latest script we obtained had several encryption layers, making it more difficult to analyze.

Figure 6. First encryption layer

Figure 7. Second encryption layer


Figure 8. Decoded part of the script

To avoid detection, when a compromised site receives an HTTP request, the main script checks whether the request came from any of the following:

• Search engine crawler

• User via a search engine

• Direct site access

The main script identifies the above-mentioned sources by checking different HTTP header fields such as $_SERVER['HTTP_USER_AGENT'] and $_SERVER['HTTP_REFERER'] as well as the HTTP request itself. The PHP tool kit at hand checks if the $_SERVER['HTTP_USER_AGENT'] value is googlebot, slurp, or msnbot, which are common user-agent strings that search engine crawlers use. It also checks for specific strings used as part of request parameters such as q and page as well as their corresponding values. To determine if a user request arrived via a search engine, the script checks the $_SERVER['HTTP_REFERER'] header field.
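For analysts who triage visitor logs recovered from compromised sites, the three-way decision described above can be approximated offline. The following is a minimal Python sketch and not the tool kit's actual PHP code. The crawler tokens come from the text above, while the function name `classify_request` and the set of search engine domain labels are assumptions made for illustration. The check on the q and page request parameters is left out.

```python
from urllib.parse import urlparse

# Crawler user-agent substrings named in the paper.
CRAWLER_TOKENS = ("googlebot", "slurp", "msnbot")

# Assumed search engine domain labels; the paper does not list the
# referrers that the tool kit accepts.
SEARCH_ENGINE_LABELS = {"google", "yahoo", "bing"}


def classify_request(user_agent: str, referer: str) -> str:
    """Label a logged request as 'crawler', 'search_user', or 'direct'."""
    ua = (user_agent or "").lower()
    if any(token in ua for token in CRAWLER_TOKENS):
        return "crawler"
    # Compare whole host name labels so that "www.google.ru" matches
    # but "notgoogle.example" does not.
    host = (urlparse(referer or "").hostname or "").lower()
    if SEARCH_ENGINE_LABELS & set(host.split(".")):
        return "search_user"
    return "direct"
```

Running this over the logged user-agent and referrer values separates the requests that would have received spamdexing content, a redirection, or nothing suspicious.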


If a request was found to have come from a search engine crawler, the main script generates doorway pages stuffed with content it harvested. Using the search string parameters, content is harvested by lifting off relevant text and images from the results presented by any single search engine. The SEO tool kit that we analyzed, for instance, obtains the top 100 search results from Google Russia.

Figure 9. SEO tool kit uses Google Russia for

The contents of doorway pages are mainly created for spamdexing purposes. These pages increase a linked page’s ranking among search engine results. In some cases, however, a dormant doorway page may contain links to compromised sites to further increase its ranking.


Figure 10. Dormant doorway pages with links to a malicious site


Malicious scripts are embedded in doorway pages in such a way that users who access the said pages are redirected to several malicious sites. This is done by referencing another PHP component from the tool kit that contains the URL to which the doorway page should redirect users.

Note, however, that this URL frequently changes, as it is updated from a master C&C server every 10 minutes. The payload or malware that the product ID points to can also be modified to identify what threat the tool kit should deliver. We can also assume that these tool kits are being sold to cybercriminals so they can more easily distribute their malicious creations.


Figure 11. SEO tool kit can be configured to provide different malware as payloads


REDIRECTION AND STEALTH TACTICS

Users who access doorway pages via search engines are directed to either fake scanning or video-streaming pages that then lead to the download of different malware binaries. Before the users reach the final destination pages, however, a series of link hops or redirections first takes place. These redirections help hide the actual URLs of the final landing pages and of the pages that host the fake scanning results.


Figure 12. Two-week diagram of a blackhat SEO infection chain from the initial


More than simple redirections, however, cybercriminals also use other techniques to redirect users to their specially crafted malicious pages. These include a combination of the following stealth tactics:

• Geo-targeting or IP delivery: This utilizes users’ IP addresses to determine their geographic locations in order to deliver location-specific content to their systems.

• Blog scraping: This refers to regularly scanning blogs to search for and to copy content using automated software.

• Referrer page checking: This ensures that only users arriving via search engines will be included in the infection chain and prevents security analysts or system administrators from seeing anything malicious when they directly access a doorway page.

• User-agent filtering: This refers to distinguishing between browsers to enable OS-specific download of payloads.
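Because referrer page checking and user-agent filtering hide malicious behavior from direct visitors, one way an analyst might check a suspected doorway page is to request it under several header profiles and compare the responses. The sketch below is illustrative only. The `PROFILES` values, the `cloaking_profiles` name, and the injected `fetch` callable are assumptions, and no real HTTP client is implied. Legitimate dynamic pages can also differ between requests, so a difference is a hint rather than proof.

```python
from typing import Callable, Dict

# A fetch function takes (url, headers) and returns the response body.
# It is injected so the comparison logic can be exercised without a network.
Fetch = Callable[[str, Dict[str, str]], str]

# Hypothetical header profiles that mimic the three request sources.
PROFILES: Dict[str, Dict[str, str]] = {
    "direct": {"User-Agent": "Mozilla/5.0"},
    "search_user": {
        "User-Agent": "Mozilla/5.0",
        "Referer": "http://www.google.com/search?q=example",
    },
    "crawler": {"User-Agent": "Googlebot/2.1"},
}


def cloaking_profiles(url: str, fetch: Fetch) -> Dict[str, bool]:
    """Report, per profile, whether the body differs from a direct visit."""
    baseline = fetch(url, PROFILES["direct"])
    return {
        name: fetch(url, headers) != baseline
        for name, headers in PROFILES.items()
        if name != "direct"
    }
```

A result such as `{"search_user": True, "crawler": True}` suggests that the page serves different content depending on how the visitor arrived, which is consistent with the stealth tactics listed above.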

Since we started monitoring recent blackhat SEO attacks, we observed several variations as to how cybercriminals implemented the above-mentioned techniques. The foremost tactic we found was the use of server-side redirections, specifically HTTP 3xx redirections. Using this method, however, requires cybercriminals to gain administrative privileges on Web servers.


Figure 13. How an HTTP 3xx server redirection takes place
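To document a chain of HTTP 3xx hops such as the one described above, an analyst can replay recorded responses and list every URL visited. The following Python sketch assumes the responses have already been captured as a mapping from URL to a status code and Location header. The `trace_redirects` name, the data layout, and the hop limit are illustrative assumptions.

```python
from typing import Dict, List, Optional, Tuple
from urllib.parse import urljoin

# Recorded response: (HTTP status code, Location header or None).
Response = Tuple[int, Optional[str]]


def trace_redirects(
    start: str, responses: Dict[str, Response], max_hops: int = 10
) -> List[str]:
    """Follow recorded 3xx responses and return every URL in the chain."""
    chain = [start]
    current = start
    for _ in range(max_hops):
        # URLs with no recorded response are treated as the end of the chain.
        status, location = responses.get(current, (200, None))
        if not (300 <= status < 400) or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        current = urljoin(current, location)
        looped = current in chain
        chain.append(current)
        if looped:
            break  # Stop once the chain revisits a URL.
    return chain
```

The returned list gives each intermediate hop, which can then be checked individually instead of only the first and last URLs.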


Cybercriminals who have limited privileges on Web servers inject server-side scripts into sites to compromise them. The following redirection techniques can lead users to sites with malicious payloads:

• Use of JavaScript codes

Figure 14. JavaScript redirection code

• Use of meta refresh tags, HTML features that refresh a displayed page after a certain amount of time


Figure 15. Meta refresh tag redirection code

• Use of iframe tags, sometimes with the help of user-agent filtering to prevent access using specific browsers

Figure 16. Iframe tag redirection code with a browser-specific payload
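Site owners who want to check their own pages for the three injected redirection types listed above could start with a simple scan of the served HTML. The sketch below uses only the Python standard library. The class and function names and the JavaScript pattern are assumptions for illustration, and the pattern will not catch obfuscated scripts. Iframes and scripts are also common on legitimate pages, so every finding needs manual review.

```python
import re
from html.parser import HTMLParser

# Rough pattern for script text that assigns or replaces the page location.
JS_REDIRECT = re.compile(
    r"\blocation(\.href)?\s*=|\blocation\.(replace|assign)\s*\(", re.I
)


class RedirectMarkerScanner(HTMLParser):
    """Collect meta refresh tags, iframes, and location-changing scripts."""

    def __init__(self):
        super().__init__()
        self.findings = []
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        attr = {name: (value or "") for name, value in attrs}
        if tag == "meta" and attr.get("http-equiv", "").lower() == "refresh":
            self.findings.append(("meta_refresh", attr.get("content", "")))
        elif tag == "iframe":
            self.findings.append(("iframe", attr.get("src", "")))
        elif tag == "script":
            self._in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        if self._in_script and JS_REDIRECT.search(data):
            self.findings.append(("js_redirect", data.strip()))


def find_redirect_markers(html: str):
    """Return (kind, detail) tuples for each redirection marker found."""
    scanner = RedirectMarkerScanner()
    scanner.feed(html)
    scanner.close()
    return scanner.findings
```

Each finding pairs the marker type with its target or script text, so unfamiliar destinations can be reviewed by hand.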

Note, however, that to make a blackhat SEO attack successful, several redirection methods are employed as stealth mechanisms in order to evade the common URL-filtering technologies different security vendors come up with.


MALICIOUS LANDING PAGES AND DAMAGING PAYLOADS

After successfully employing any of the techniques mentioned earlier, cybercriminals then lead users to a page that hosts spoofed content. These include bogus message prompts; scareware pages that urge users to check fake scanning results, which have been designed to scare them into downloading fake antivirus software; and fake video-streaming pages urging users to download fake codecs in order to view fake videos.


Figure 17. Samples of scareware pages

Figure 18. Fake video-streaming page that lures users into downloading a fake codec


Some spoofed content comes in the form of prompts to download fake Adobe Flash Player installers. The said pages trick users into clicking a link that supposedly leads to a video, for which they need to install Adobe Flash Player to view. The cybercriminals behind this kind of attack have a keen eye for detail, as they not only craft convincing interfaces but also use URLs that strongly suggest that the sites are indeed Adobe related.

Most blackhat SEO attacks result in FAKEAV malware payloads but we have also seen attacks resulting in the download of MONDER, TDSS, and ZBOT variants. Most of these are related to botnets that either steal user information or deliver their final payloads.


Figure 19. Botnet business model


CONCLUSION

SEO plays an important role in getting the greatest number of Internet users to access relevant information on popular subjects. Unfortunately, however, it has also been playing an important role in spreading malware to as many unsuspecting user systems as possible. Knowing how SEO works and how blackhat SEO has become a favorite infection vector will help security experts come up with effective countermeasures to protect users from related threats.

The following are some of the tried-and-tested best practices that users can keep in mind to protect their systems from blackhat SEO attacks:

• Practice safe browsing habits. Avoid visiting suspicious-looking sites. Do not download and install software from untrustworthy sources.

• Stay abreast of the latest threats and threat trends. Familiarizing oneself with the current threat landscape is a great way to stay informed about the latest scams. The most popular malware today tend to prey on unwary users. It is also worthwhile to familiarize oneself with the available security solutions in the market. To know more about the latest threats and threat trends, read the articles on TrendWatch and the latest posts by security experts in the TrendLabs Malware Blog.

• Download and install the latest patches. Unpatched machines are more prone to malicious attacks. It is a good computing habit to regularly patch systems. Enabling the automatic update feature is also recommended. Trend Micro also posts the latest vulnerability information on the new Threat Encyclopedia.

• Install an effective security suite. Blackhat SEO is now one of the most common threat infection vectors. As such, installing an effective security solution will mitigate the risks malware pose. Trend Micro products and solutions incorporate the Trend Micro™ Smart Protection Network™ infrastructure to stop threats before they can even reach your system.

Backed by the Smart Protection Network, Trend Micro security products and services use smarter approaches than conventional solutions. Smart Protection Network is a cloud-client content security infrastructure that automatically blocks threats before they reach systems. It utilizes a global network of threat intelligence sensors that correlates with email, Web, and file reputation technologies 24 x 7 to provide comprehensive protection against threats. As threats become more sophisticated, the volume of attacks increases, and the number of endpoints rapidly grows, the need for lightweight, comprehensive, and immediate threat intelligence in the cloud will become critical to protect businesses against data breaches, damage to reputations, and loss of productivity.


REFERENCES

• Loucif Kharouni. (April 8, 2010). TrendLabs Malware Blog. “Spotlighting the Botnet Business Model.” http://blog.trendmicro.com/spotlighting-the-botnet-business-model/ (Retrieved September 2010).

• Ryan Flores. (November 2010). TrendWatch. “How Blackhat SEO Became Big.” http://us.trendmicro.com/imperia/md/content/us/trendwatch/researchandanalysis/how_blackhat_seo_became_big__november_2010_.pdf (Retrieved November 2010).

• Trend Micro Incorporated. (2010). Threat Encyclopedia. “BKDR_TDSS.” http://threatinfo.trendmicro.com/vinfo/virusencyclo/default2.asp?m=q&virus=tdss&alt=tdss&Sect=SA (Retrieved September 2010).

• Trend Micro Incorporated. (2010). Threat Encyclopedia. “TROJ_MONDER.” http://threatinfo.trendmicro.com/vinfo/virusencyclo/default2.asp?m=q&virus=monder&alt=monder&Sect=SA (Retrieved September 2010).

• Trend Micro Incorporated. (2010). Threat Encyclopedia. “ZBOT.” http://threatinfo.trendmicro.com/vinfo/virusencyclo/default2.asp?m=q&virus=zbot&alt=zbot&Sect=SA (Retrieved September 2010).

TREND MICRO™

Trend Micro Incorporated is a pioneer in secure content and threat management. Founded in 1988, Trend Micro provides individuals and organizations of all sizes with award-winning security software, hardware and services. With headquarters in Tokyo and operations in more than 30 countries, Trend Micro solutions are sold through corporate and value-added resellers and service providers worldwide. For additional information and evaluation copies of Trend Micro products and services, visit our Web site at www.trendmicro.com.

TREND MICRO INC.
10101 N. De Anza Blvd.
Cupertino, CA 95014
US toll free: 1 +800.228.5651
Phone: 1 +408.257.1500
Fax: 1 +408.257.2003
www.trendmicro.com

©2010 by Trend Micro, Incorporated. All rights reserved. Trend Micro and the Trend Micro t-ball logo are trademarks or registered trademarks of Trend Micro, Incorporated. All other product or company names may be trademarks or registered trademarks of their owners.