That Ain't You: Blocking Spearphishing Emails Before They Are Sent

Gianluca Stringhini§ and Olivier Thonnard‡
§University College London  ‡Amadeus

arXiv:1410.6629v1 [cs.CR] 24 Oct 2014

Abstract

One of the ways in which attackers try to steal sensitive information from corporations is by sending spearphishing emails. Emails of this type typically appear to be sent by one of the victim's coworkers, but have instead been crafted by an attacker. A particularly insidious type of spearphishing email not only claims to come from a trusted party, but is actually sent from that party's legitimate email account, which was compromised beforehand. In this paper, we propose a radical change of focus in the techniques used for detecting such malicious emails: instead of looking for particular features that are indicative of attack emails, we look for possible indicators of impersonation of the legitimate owners. We present IdentityMailer, a system that validates the authorship of emails by learning the typical email-sending behavior of users over time, and comparing any subsequent email sent from their accounts against this model. Our experiments on real-world email datasets demonstrate that our system can effectively block advanced email attacks sent from genuine email accounts, which traditional protection systems are unable to detect. Moreover, we show that it is resilient to an attacker attempting to evade the system. To the best of our knowledge, IdentityMailer is the first system able to identify spearphishing emails that are sent from within an organization, by a skilled attacker having access to a compromised email account.

1 Introduction

Companies and organizations are constantly under attack by cybercriminals trying to infiltrate corporate networks with the ultimate goal of stealing sensitive information from the company. Such an attack is often started by sending a spearphishing email. Attackers can breach a company's network in many ways, for example by leveraging advanced malware schemes [21]. After entering the network, attackers perform additional activities aimed at gaining access to more computers in the network, until they are able to reach the sensitive information that they are looking for. This process is called lateral movement. Attackers typically infiltrate a corporate network, gain access to internal machines within a company, and acquire sensitive information by sending spearphishing emails. In a spearphishing attack, an email is crafted and sent to a specific person within a company, with the goal of infecting her machine with an unknown piece of malware, or luring her into handing out access credentials or providing sensitive information. Recent research showed that spearphishing is a real threat, and that companies are constantly targeted by this type of attack [38].

Spearphishing is not spam. While the two may share a few common characteristics, it is important to note that spearphishing is still very different from traditional email spam. In most cases, spearphishing emails appear to come from accounts within the same company or from a trusted party, to avoid raising the victim's suspicion [40]. This can be done in a trivial way, by forging the From: field in the attack email. However, in more sophisticated attacks, the malicious emails are actually sent from the email account of a legitimate employee whose machine has been compromised, or whose credentials have been previously stolen by the attacker [30]. From the attacker's perspective, this modus operandi presents two key advantages. First, it leverages a user's social connections: previous research showed that users are more likely to fall for scams if the malicious message is sent by somebody they trust [14]. Second, it circumvents existing detection systems, which are typically based on anti-spam techniques. This happens for two reasons. First, the content of spearphishing emails looks in many cases completely legitimate and does not contain any words that are indicative of spam, since the goal is to make it resemble typical business emails. Second, if an email impersonating one of the company's employees comes from that person's computer, which has been compromised, then origin-based detection techniques, such as IP reputation, become useless. The same holds for IP- and origin-based blacklisting systems, as well as for email sender or domain verification systems such as Sender Policy Framework (SPF) and DomainKeys Identified Mail (DKIM) [19, 43]: all of them are circumvented, since the email is sent from a genuine email account.

A new paradigm for fighting targeted attack emails. Given how different spearphishing emails are from traditional spam and phishing emails, we propose a paradigm shift in detection approaches to fight this threat, and present IdentityMailer, a system to detect and block spearphishing emails sent from compromised accounts. Instead of looking for signs of maliciousness in emails (such as words that are indicative of illicit content, phishy-looking content, or suspicious origin), IdentityMailer determines whether an email was actually written by the author that it claims to come from. In other words, we try to automatically validate the genuineness of the email authorship. Our approach is based on a simple, yet effective observation: most users develop habits when sending emails. These habits include frequent interactions with specific people, sending emails at specific hours of the day, and using certain greetings, closing statements, and modal words in their emails. The core of IdentityMailer consists in building a user profile reflecting her email-sending behavior. When a user's account gets compromised, the attack emails that are sent from this account are likely to show differences from the behavioral profile of the genuine user.

Behavioral anomalies can be very evident or more subtle. An example of a "noisy" attack is a worm that sends an email to the entire address book of a user [39], which is a behavior that typical users do not show. In more realistic scenarios, attackers might try to mimic the typical behavior of the person they are impersonating in their emails. They could send emails only at hours in which the user typically sends them, and only to people she frequently interacts with, or even imitate the user's writing style.

To make it more difficult for attackers to successfully evade our system, IdentityMailer builds the email-sending behavioral profile for a particular user by leveraging both the emails previously sent by that specific user and a set of emails that other users in the organization authored. In a nutshell, IdentityMailer compares the emails written by the user to the ones written by everybody else and extracts those characteristics that are the most representative of the user's behavior. If common characteristics are shared by multiple users, however, these will be deemphasized because they are not specific to a particular user. For example, certain functional words only used by a given user (and rarely by others) would model her behavior well.

When an attacker tries to learn a victim's sending behavior to mimic it in his attack emails, he only has access to that user's emails (since he compromised her account or personal machine). It is unlikely, however, that he has access to the ones authored by all other users within the company, besides the few emails exchanged by the victim and the coworkers she interacts with. Therefore, all the attacker can do is learn the most common habits of the user (such as the email address that is most frequently contacted, and at what time the user generally sends emails), but he has no guarantee that those traits are actually representative of the victim's behavior.

Working on the sending end. Traditional anti-spam systems work on the receiving end of the email process. This means that they analyze incoming emails, and establish whether they are legitimate or malicious. This approach is very effective in general, but it has many drawbacks in our specific case. First of all, the analysis that can be performed on incoming emails has to be lightweight, due to the large number of emails, mostly malicious, that mail servers receive [36]. As a second drawback, a mail server that learns the typical behavior of a user on the receiving end only has visibility of the emails that are exchanged between that user and people whose mailboxes are hosted on the server. Therefore, a behavioral profile built from these emails might not be representative enough to correctly model the sending habits of users.

Because of the aforementioned problems, we pro-

[...] the anomaly as a false positive, and the email is forwarded. In addition, we update the user's behavioral profile to include this particular email, so that we avoid similar false positives in the future. If the user fails at solving the challenge, however, we consider the email as a possible attack, and we discard it. We acknowledge that having to go through an identity-verification process can be annoying for users. However, we think that having users confirm their identity once in a while is a fair price to pay to protect a company against advanced email attacks, as long as the verifications are rare enough (for example, one in every 30 emails on average).

We tested IdentityMailer on a large set of publicly-available emails, and on real-world data sets made of malicious targeted emails sent to the customers of a large security company.
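
To make the profile-building intuition from the introduction concrete, the following is a minimal sketch, not IdentityMailer's actual model. It assumes each email has already been reduced to a set of feature strings (the `to:`, `hour:` and `word:` prefixes are hypothetical), and it swaps in a smoothed log-odds ratio as a simple stand-in for whatever learning technique the system uses. The names `feature_weights` and `score` are illustrative only.

```python
import math
from collections import Counter


def feature_weights(user_emails, other_emails, smoothing=1.0):
    """Weight each feature by how much more often the user shows it than others.

    Each email is an iterable of feature strings. Features shared by everyone
    get a weight near zero; features specific to the user get a large positive
    weight; features typical of other users get a negative weight.
    """
    # Count in how many emails each feature appears (presence, not frequency).
    user_counts = Counter(f for email in user_emails for f in set(email))
    other_counts = Counter(f for email in other_emails for f in set(email))
    n_user, n_other = len(user_emails), len(other_emails)

    weights = {}
    for feature in set(user_counts) | set(other_counts):
        # Additive smoothing keeps both probabilities strictly between 0 and 1.
        p_user = (user_counts[feature] + smoothing) / (n_user + 2 * smoothing)
        p_other = (other_counts[feature] + smoothing) / (n_other + 2 * smoothing)
        weights[feature] = math.log(p_user / p_other)
    return weights


def score(email, weights):
    """Sum the weights of the features present; unseen features contribute 0."""
    return sum(weights.get(f, 0.0) for f in set(email))


# Toy data (invented for illustration): the user and her coworkers all write
# to bob in the morning, but only the user signs off with "cheers".
user_emails = [
    {"to:bob", "hour:09", "word:cheers"},
    {"to:bob", "hour:10", "word:cheers"},
    {"to:carol", "hour:09", "word:cheers"},
]
other_emails = [
    {"to:bob", "hour:09", "word:regards"},
    {"to:bob", "hour:10", "word:regards"},
    {"to:dave", "hour:09", "word:regards"},
    {"to:bob", "hour:14", "word:thanks"},
]

weights = feature_weights(user_emails, other_emails)
genuine = {"to:bob", "hour:09", "word:cheers"}
# An attacker who copies only the most common recipient and sending hour.
mimic = {"to:bob", "hour:09", "word:regards"}

print(score(genuine, weights), score(mimic, weights))
```

In this toy setup the user's most frequent recipient and sending hour are also common among her coworkers, so they receive weights close to zero, while the greeting only she uses dominates the score. This mirrors the argument above: an attacker who sees only the victim's mailbox can copy her most common habits, but cannot tell which of them actually distinguish her from everybody else.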