Detecting the Unknown: Protecting Against DGA-Based Malware

By:
Lotem Guy, Head of Security Research
Uri Sternfeld, Senior Security Researcher

Contents

  Introduction
  What is DGA?
  Current Detection and Protection Techniques
  Sinkholing
  Connection Detection
  Communication Patterns
  Endpoints
  About Cybereason

Introduction

Contemporary malicious attacks usually employ a dedicated Command & Control (C&C) server, which serves a dual purpose: it allows the attacker to enhance and escalate the attack by issuing commands and updating the malicious code, and it acts as a data exfiltration channel. To find its master after the initial penetration, the malware needs to connect to a specific IP address or, more commonly, a specific domain.

In the past, most malware contained a hardcoded list of possible C&C addresses, which could be easily blocked by firewalls. This led to the creation of multiple websites dedicated to maintaining lists of known malicious domains, so that any attack leveraging a known malicious domain or IP could be nullified as soon as any instance of the attack was revealed. Never ones to sit idle, hackers responded to this challenge by developing domain generation algorithms.

What is DGA?

A domain generation algorithm (DGA) is a technique that many malware variants implement to coordinate C&C communication between the malware and the attacker without hardcoding a single IP address or domain. Instead, the malware contains an algorithm that generates multiple domains from a pseudo-random seed, such as the current date. The attackers may choose any of these generated domains as a rendezvous point and register it in advance. The malware tries to connect to each of the generated domains in turn until it finds the one registered by the attackers. When such a C&C domain is revealed and blocked, the attacker can easily establish a new line of communication by registering another domain.
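The idea of deriving domains from a date seed can be illustrated with a minimal sketch. It does not reproduce the algorithm of any malware family named in this paper. The function name `generate_domains`, the use of SHA-256, the 16-letter label length and the reserved `.example` suffix are all assumptions made for illustration.

```python
import hashlib
from datetime import date
from typing import List


def generate_domains(day: date, count: int = 5, tld: str = ".example") -> List[str]:
    """Derive `count` pseudo-random domain names from a calendar date.

    Hypothetical scheme for illustration only: hash "<date>-<index>"
    and map the first 16 hex digits of the digest to the letters a-p.
    """
    domains = []
    for index in range(count):
        seed = f"{day.isoformat()}-{index}".encode("ascii")
        digest = hashlib.sha256(seed).hexdigest()
        # Each hex digit (0-15) becomes one letter, so the label is letters only.
        label = "".join(chr(ord("a") + int(ch, 16)) for ch in digest[:16])
        domains.append(label + tld)
    return domains


if __name__ == "__main__":
    # The same date always yields the same list, so attacker and malware
    # agree on the candidate domains without ever exchanging them.
    for name in generate_domains(date(2015, 1, 1)):
        print(name)
```

Because the output depends only on the date, a defender who knows the algorithm can compute the same list, which is what makes a date-seeded DGA predictable in the sense discussed in the next section.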
Even if the algorithm itself is known, it is often infeasible to register all possible combinations, which may number in the millions, in advance. There is no lack of examples of DGAs used by some of the most notorious malware in recent history: Kraken, CryptoLocker, GameOver Zeus, Conficker, Torpig and many more.

Current Detection and Protection Techniques

Two general properties define each DGA: its time dependence (some variants generate new domains every few days or even hours) and its predictability (the ability to correctly predict its results ahead of time). For example, while most recorded DGAs simply use the current date as a pseudo-random seed (CryptoLocker, GameOver Zeus, Conficker), one variant (Torpig) instead uses the currently trending Twitter hashtag, which makes early registration of the generated domains virtually impossible.

There are several ways of defending against DGA-based malware. If the algorithm is both known and predictable, it’s sometimes possible to generate all possible domains in advance and statically block them. If the algorithm isn’t predictable, a more dynamic approach is needed. For example, some advanced firewall solutions are able to tell in real time whether a given domain belongs to a known DGA.

Sinkholing

Another way to disrupt known DGA-based malware is to register the generated domains, which both prevents the attackers from registering them and reveals already-infected machines that communicate with them. This technique is called “sinkholing.” Some security companies, such as Damballa, Anubis and Kaspersky, perform sinkholing en masse, but most organizations cannot register a large number of domains by themselves. Another possibility is for a government agency to take judicial action and appropriate the suspected domains in advance to prevent their registration.
The downsides of sinkholing are clear: it is costly in both time and money, since it involves bidding and paying for domain registrations. In addition, when not every domain can be registered, the malware operators may find a way to bypass the sinkhole, as happened in Operation Tovar when the FBI attempted to take down the GameOver Zeus network.

Connection Detection

The real challenge of protecting against DGAs, however, is detecting the connection attempts of as-yet-unknown algorithms. Two main heuristic methods may be used in this case: detecting the generated domains themselves, and detecting the communication patterns.

Many DGAs create random domain names that can be detected by their high entropy and character histograms (for example, GameOver Zeus domains are 15 to 30 characters long, all of them randomly distributed English letters). While this technique has good potential to identify malware that uses similar algorithms, many DGAs generate domains with very different characteristics. Torpig and Kraken, for example, use conjoined, randomly chosen English words (Torpig) or English-like words (Kraken) that are much harder to differentiate statistically from legitimate domains.

Communication Patterns

It is also possible to detect a DGA-like communication pattern. As mentioned before, the basic concept of DGA involves creating multiple domains and registering only a small subset. Since the malware cannot know which domains in each batch were actually registered, it must keep trying until it succeeds. The malware sends DNS queries one by one to resolve the generated domains. If a domain is not registered, the malware receives a non-existent domain response (NXDOMAIN) and tries to resolve the next generated domain. This often results in a sequence of NXDOMAIN responses, which can sometimes be detected by analyzing the network activity.
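The two heuristics described above, name entropy and runs of NXDOMAIN responses, can be sketched as follows. This is a minimal illustration, not a production detector. The function names, the thresholds (`min_length=12`, `min_entropy=3.5`) and the `(domain, rcode)` input format are assumptions chosen for the example and would need tuning against real traffic.

```python
import math
from collections import Counter
from typing import Iterable, Tuple


def shannon_entropy(label: str) -> float:
    """Shannon entropy, in bits per character, of the label's characters."""
    if not label:
        return 0.0
    total = len(label)
    counts = Counter(label)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_random(domain: str, min_length: int = 12, min_entropy: float = 3.5) -> bool:
    """Flag a domain whose leftmost label is long and high in entropy.

    The thresholds are illustrative assumptions. Word-based DGAs such as
    those attributed to Torpig and Kraken above would not be caught.
    """
    label = domain.split(".")[0].lower()
    return len(label) >= min_length and shannon_entropy(label) >= min_entropy


def longest_nxdomain_run(responses: Iterable[Tuple[str, str]]) -> int:
    """Length of the longest consecutive NXDOMAIN streak.

    `responses` holds (domain, rcode) pairs in query order for a single
    source, for example one host or one process (assumed input format).
    """
    longest = current = 0
    for _domain, rcode in responses:
        if rcode == "NXDOMAIN":
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest
```

A caller might flag a source when `longest_nxdomain_run` exceeds some threshold and most of the failed names also satisfy `looks_random`. Requiring both signals is one way to reduce noise from typos or resolver outages.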
This is harder than it sounds, however, as these patterns are often lost in the general noise of modern networks and are prone to false positives with any number of legitimate causes, from browsing typos to DNS server failures. Given the advantages and pitfalls of the methods listed above, the best way to identify DGA communication patterns is to monitor the communications of each process separately, and this can only be done on the endpoints themselves.

Endpoints

Endpoint monitoring at the process level allows process behavior to be analyzed across the organization, and the accumulated information to be applied to improve detection accuracy. For example, a process that is common in the organization, is signed by a legitimate vendor and produces a regular pattern is less suspicious than an unsigned process that is rare in the organization but produces the same pattern.

An effective cyber defense system continuously collects data from your endpoints and correlates it intelligently to provide an accurate, low-noise analysis of your current environment. Endpoint monitoring provides the visibility needed to match a sequence of unresolved DNS queries with a single suspicious process.

Cyber attacks will continue to evolve, employing techniques like DGA to stay ahead of defenders. Keep one step ahead yourself and broaden your approach by adding dynamic techniques like behavioral analysis to your cyber strategy.

About Cybereason

Cybereason was founded in 2012 by a team of ex-military cybersecurity experts to revolutionize detection and response to cyber attacks. The Cybereason Detection and Response platform uniquely identifies both known and unknown threats in real time using big data, behavioral analytics and machine learning, and puts them in context to form a complete attack story.
The console presents the TRACE elements of every malicious operation: Timeline, Root cause, Attacker activity, Communication, and affected Endpoints and users, eliminating the need for manual investigation and radically reducing response time. The platform is available as an on-premises solution or a cloud-based service.

Schedule a demo to see how Cybereason can help detect DGA-based threats: http://bit.ly/1KLxKJ2
