Towards Measuring and Mitigating Social Engineering Software
Total Page:16
File Type:pdf, Size:1020Kb
Towards Measuring and Mitigating Social Engineering Software Download Attacks Terry Nelms, Georgia Institute of Technology and Damballa; Roberto Perdisci, University of Georgia and Georgia Institute of Technology; Manos Antonakakis, Georgia Institute of Technology; Mustaque Ahamad, Georgia Institute of Technology and New York University Abu Dhabi https://www.usenix.org/conference/usenixsecurity16/technical-sessions/presentation/nelms This paper is included in the Proceedings of the 25th USENIX Security Symposium August 10–12, 2016 • Austin, TX ISBN 978-1-931971-32-4 Open access to the Proceedings of the 25th USENIX Security Symposium is sponsored by USENIX Towards Measuring and Mitigating Social Engineering Software Download Attacks Terry Nelms1,2, Roberto Perdisci3,1, Manos Antonakakis1, and Mustaque Ahamad1,4 1Georgia Institute of Technology 2Damballa, Inc. 3University of Georgia 4New York University Abu Dhabi [email protected], [email protected], [email protected], [email protected] Abstract namely the user, by leveraging sophisticated social en- Most modern malware infections happen through the gineering tactics [27]. Because social engineering (SE) browser, typically as the result of a drive-by or social en- attacks target users, rather than systems, current defense gineering attack. While there have been numerous stud- solutions are often unable to accurately detect them. ies on measuring and defending against drive-by down- Thus, there is a pressing need for a comprehensive study loads, little attention has been dedicated to studying so- of social engineering downloads that can shed light on cial engineering attacks. the tactics used in modern attacks. This is important not In this paper, we present the first systematic study only to inform better technical defenses, but may also al- of web-based social engineering (SE) attacks that suc- low us to gather precious information that may be used cessfully lure users into downloading malicious and un- to better train users against future SE attacks. wanted software. To conduct this study, we collect and In this paper, we present a study of real-world SE reconstruct more than two thousand examples of in-the- download attacks. Specifically, we focus on studying wild SE download attacks from live network traffic. Via a web-based SE attacks that unfold exclusively via the web detailed analysis of these attacks, we attain the following and that do not require “external” triggers such as email results: (i) we develop a categorization system to identify spam/phishing, etc. An example of such attacks is de- and organize the tactics typically employed by attackers scribed in [9]: a user is simply browsing the web, vis- to gain the user’s attention and deceive or persuade them iting an apparently innocuous blog, when his attention into downloading malicious and unwanted applications is drawn to an online ad that is subtly crafted to mimic (ii) we reconstruct the web path followed by the victims a warning about a missing browser plugin. Clicking on and observe that a large fraction of SE download attacks the ad takes him to a page that reports a missing codec, are delivered via online advertisement, typically served which is required to watch a video. The user clicks on from “low tier” ad networks (iii) we measure the char- the related codec link, which results in the download of acteristics of the network infrastructure used to deliver malicious software. such attacks and uncover a number of features that can To conduct our study, we collect and analyze hundreds be leveraged to distinguish between SE and benign (or of successful in-the-wild SE download attacks, namely non-SE) software downloads. SE attacks that actually result in a victim downloading malicious or unwanted software. We harvest these at- 1 Introduction tacks by monitoring live web traffic on a large academic Most modern malware infections happen via the network. Via a detailed analysis of our SE attack dataset, browser, typically triggered by social engineering [9] or we attain the following main results: (i) we develop a drive-by download attacks [33]. While numerous studies categorization system to identify and organize the tactics have focused on measuring and defending against drive- typically employed by attackers to gain a user’s atten- by downloads [14,17,28,38], malware infections enabled tion and deceive or persuade them into downloading ma- by social engineering attacks remain notably understud- licious and unwanted applications (ii) we reconstruct the ied [31]. web path (i.e., sequence of pages/URLs) followed by SE Moreover, as recent defenses against drive-by down- victims and observe that a large fraction of SE attacks are loads and other browser-based attacks are becoming delivered via online advertisement (e.g., served via “low harder to circumvent [18, 24, 32, 36, 40], cyber-criminals tier” ad networks) (iii) we measure the characteristics of increasingly aim their attacks against the weakest link, the network infrastructure (e.g., domain names) used to USENIX Association 25th USENIX Security Symposium 773 deliver such attacks, and uncover a number of features Summary of Contributions: that can be leveraged to distinguish between SE and be- We present the first systematic study of modern nign (i.e., “non-SE”) software downloads. • web-based SE download attacks. For instance, our Our findings show that a large fraction of SE attacks analysis of hundreds of SE download attack in- (almost 50) are accomplished by repackaging existing stances reveals that most such attacks are enabled benign applications. For instance, users often download by online advertisements served through a handful free software that comes as a bundle including the soft- of “low tier” ad networks. ware actually desired by the user plus some Adware or other Potentially Unwanted Programs (PUPs). This con- To assist the process of understanding the origin • firms that websites serving free software are often in- of SE download attacks, we develop a categoriza- volved (willingly or not) in distributing malicious or un- tion system that expresses how attackers typically wanted software [4, 7]. gain a user’s attention, and what are the most com- The second most popular category of attacks is related mon types of deception and persuasion tactics used to alerting or urging the user to install an application that to trick victims into downloading malicious or un- is supposedly needed to complete a task. For instance, wanted applications. This makes it easier to track the user may be warned that they are running an outdated what type of attacks are most prevalent and may or insecure version of Adobe Flash or Java, and are of- help to focus user training programs on specific user fered to download a software update. Unfortunately, by weaknesses and particularly successful deception downloading these supposed updates, users are infected. and persuasion tactics currently used in the wild. Similarly, users may stumble upon a page that suppos- Via extensive measurements, we find that the most edly hosts a video of interest. This page may then inform • the user that a specific video codec is needed to play the common types of SE download attacks include fake desired video. The user complies by downloading the updates for Adobe Flash and Java, and that fake suggested software, thus causing an infection (see Sec- anti-viruses (FakeAVs), which have been a popu- tion 3 for details). lar and effective infection vector in the recent past, represent less than 1 of all SE downloads we ob- Another example of an SE download attack is rep- served in the wild. Furthermore, we find that ex- resented by fake anti-viruses (FakeAVs) [35]. In this isting defenses, such as traditional anti-virus (AV) case, a web page alerts the user that their machine is in- scanners, are largely ineffective against SE down- fected and that AV software is needed to clean up the loads. machine. In a way similar to the SE attack examples reported above, the user may be persuaded to down- Based on our measurements, we then identify a set • load (in some cases after a payment) the promoted soft- of features that allow for building a statistical clas- ware, which will infect the user’s machine. However, sifier that is able to accurately detect ad-driven SE while FakeAVs have been highly popular among attack- download attacks with 91% true positives and only ers in the recent past, our study of in-the-wild SE mal- 0.5 false positives. ware downloads finds that they represent less than 1 of modern SE attacks. This sharp decline in the num- 2 Study Overview ber of FakeAV attacks within the last few years is con- Our study of SE download attacks is divided in mul- sistent with the recent development of technical counter- tiple parts. To better follow the results discussed in the measures against this class of attacks [5] and increased following sections, we now present a brief summary of user awareness [6]. their content. As mentioned earlier, a large fraction of SE download In Section 3, we analyze the range of deception and attacks (more than 80) are initiated via advertisements, persuasion tactics employed by the attackers to victimize and that the “entry point” to these attacks is represented users, and propose a categorization system to systematize by only a few low-tier advertisement networks. For in- the in-the-wild SE tactics we observed. stance, we found that a large fraction of the web-based In Section 4, we discuss how we collect software SE attacks described above are served primarily via two downloads (including malicious ones) from live network ad networks: onclickads.net and adcash.com. traffic and reconstruct their download path. Namely, we By studying the details of SE download attacks, we trace back the sequence of pages/URLs visited by a user also discover a number of features that aid in the detec- before arriving to a URL that triggers the download of tion of SE download attacks on live web traffic.