The Ecology of Malware
Total Page:16
File Type:pdf, Size:1020Kb
The Ecology of Malware Jedidiah R. Crandall Roya Ensafi Stephanie Forrest University of New Mexico University of New Mexico University of New Mexico Dept. of Computer Science Dept. of Computer Science Dept. of Computer Science Mail stop: MSC01 1130 Mail stop: MSC01 1130 Mail stop: MSC01 1130 1 University of New Mexico 1 University of New Mexico 1 University of New Mexico Albuquerque, NM 87131-0001 Albuquerque, NM 87131-0001 Albuquerque, NM 87131-0001 [email protected] [email protected] [email protected] Joshua Ladau Bilal Shebaro Santa Fe Institute University of New Mexico 1399 Hyde Park Road Dept. of Computer Science Santa Fe, New Mexico 87501 Mail stop: MSC01 1130 [email protected] 1 University of New Mexico Albuquerque, NM 87131-0001 [email protected] ABSTRACT General Terms The fight against malicious software (or malware, which includes Security everything from worms to viruses to botnets) is often viewed as an “arms race.” Conventional wisdom is that we must continu- Keywords ally “raise the bar” for the malware creators. However, the mul- titude of malware has itself evolved into a complex environment, malware ecology, malware analysis, worms, viruses, botnets and properties not unlike those of ecological systems have begun to emerge. This may include competition between malware, fa- 1. INTRODUCTION cilitation, parasitism, predation, and density-dependent population regulation. Ecological principles will likely be useful for under- Modern malware defense involves a variety of activities and is standing the effects of these ecological interactions, for example, quickly becoming unsustainable. So many malware samples are carrying capacity, species-time and species-area relationships, the collected from the wild each day that triaging is necessary to deter- unified neutral theory of biodiversity, and the theory of island bio- mine which samples warrant further analysis. Much of the analysis geography. The emerging malware ecology can be viewed as a crit- framework is consumed by creating new signatures for new vari- ical challenge to all aspects of malware defense, including collec- ants of well-known attacks, but simply releasing a signature for ev- tion, triage, analysis, intelligence estimates, detection, mitigation, ery threat that is currently in the wild is only part of what defenders and forensics. It can also be viewed as an opportunity. must do. They must also aggregate the results of multiple analyses In this position paper, we argue that taking an ecological ap- to mitigate threats that have already spread, develop intelligence proach to malware defense will suggest new defenses. In particular, estimates of what the attackers are doing and are likely to do next, we can exploit the fact that interactions of malware with its envi- and conduct forensic analysis of attacks that have occurred. ronment, and with other malware, are neither fully predictable nor The study of how ecological principles can be applied to these fully controllable by the malware author—yet the emergent behav- defense activities is important for two reasons. First, the emerg- ior will follow general ecological principles that can be exploited ing malware ecology is straining defenses at every stage. Exactly for malware defense. identifying every distinct instance of malicious code that is in the wild, something that most malware defense relies on, is no longer possible. For example, it is now common for targeted attacks, e.g., Categories and Subject Descriptors phishing attacks against local credit unions where e-mails are sent D.4.6 [Operating Systems]: Security and Protection–invasive soft- only to e-mail accounts in a particular city, to be constructed us- ware ing tools for building advanced Trojan horses using a point-and- click interface, such as Shark 2 [14]. As a second example, Song et al. [32] documented both experimentally and with information theory arguments that the diversity of a single component (the de- Permission to make digital or hard copies of all or part of this work for coder) of a botnet greatly exceeds the capacity of any signature- personal or classroom use is granted without fee provided that copies are based detection algorithm. not made or distributed for profit or commercial advantage and that copies The second reason to study malware ecology is that it has the bear this notice and the full citation on the first page. To copy otherwise, to potential of leading to defenses that give fundamental advantages republish, to post on servers or to redistribute to lists, requires prior specific to the defender. Current approaches are aimed primarily at known permission and/or a fee. NSPW’08, September 22–25, 2008, Lake Tahoe, California, USA. attacks or known types of attack. In the current landscape, these Copyright 2008 ACM 978-1-60558-341-9/08/09 ...$5.00. approaches at best consume some time of the attacker, who must create new malware to evade the defenses. Before the malware 1.2 The emerging malware ecology landscape became so ecologically diverse, it might have taken 30 In addition to interactions with each other, today’s malware in- minutes to roll out a new defense in response to the latest attack, teracts with its environment in unprecedented ways. The Trojan and two days for a new attack to appear in response to that defense programs installed by advanced botnets, dubbed “malware 2.0” by (these amounts of time are estimates for the purpose of example some researchers [4], use techniques such as combining public-key only). However, with the advent of “malware 2.0” [4], malware is cryptography with existing peer-to-peer networks to create custom supported on the back end by large amounts of resources and coor- name resolution mechanisms. There is also a growing trend of cor- dination, and it might take a couple of hours to roll out a new de- rupting the existing Domain Name Service (DNS) system for com- fense but only minutes for a new attack to appear. With the possible mand and control [12], creating the rise of a “second secret author- exception of automated diversity techniques, anti-malware tech- ity.” Additionally, botnets such as Storm cannot be described as a niques are so predictable that there is little defenders can to do to single instance of malicious code, they are really a system of differ- prevent sophisticated attackers from gaining this advantage. This is ent Trojans, amorphous and heterogeneous command and control, because the game between attacker and defender is circumscribed drive-by downloaders, droppers, waves of malicious e-mails that within the limits of the controlled environments in which analy- recruit new victims, and any number of other components that are sis and defense are carried out. We posit that there is an impor- nothing new in isolation but taken together represent a new level tant difference between the richly diverse interactions that malware of threat on the Internet. Obtaining a “sample” of the Storm botnet has with its real environment in the wild, and the more restricted and analyzing it in a laboratory is thus complicated in fundamental behaviors of malware in clean and controlled laboratory environ- ways. Unless the laboratory environment contains unwitting peer- ments. Studying malware ecology has the potential to bring the to-peer users and corrupted DNS resolution paths, it is incomplete. complexity of the environment, something that the attacker cannot Even the relatively simple malware of the past was notoriously predict or control, within the scope of novel analysis and defense fragile, and it was difficult to execute in a virtual machine be- techniques. cause of the many environmental dependencies. Most simple e- mail worms, for example, exit abnormally when executed in an iso- 1.1 Today’s analysis environments lated virtual machine because, e.g., the system date is not set within Malware analysis is typically conducted in carefully controlled the right range, DNS queries go unanswered, or no SMTP server is environments. A typical malware defense pipeline must sift configured. As early as 2003 researchers reported that running vul- through tens of thousands of unique malware samples every day to nerable programs under an emulator was sufficient to disrupt some find the few that are truly novel and warrant further analysis. As a attacks, even without patching the vulnerability [3, 5, 21]. This is rule, these selected samples are traded among experts and analyzed a serious problem because it requires malware analyzers to fully thoroughly only if their behavior can be demonstrated in a repro- recreate every detail of the malware’s natural environment before ducible way on a clean, freshly installed copy of the vulnerable analysis can be performed, and the complexity of malware envi- system. This prevents wasted effort, when one malware analyzer ronments continues to rise. passes malware off to another. If a sample “misbehaves” in some way, the malware analyzer will sift through all similar samples to 1.3 Opportunities find one that behaves more predictably (the behavior can be repro- The emerging malware ecology is also troublesome from the at- duced). Subsequent efforts at forensics and threat estimation take tacker’s point of view. In fact, it always has been. Suppose an these manual analyses as their starting point. At every stage valu- attacker wants to write a worm to install a backdoor on every ma- able information is lost because we do not know how to use this chine it infects. They develop a zero-day exploit, use their own information effectively. In the wild, malware that behaves exactly custom polymorphic engine, and throw all of their newest and best as its author intended is the exception, not the rule [20]. tricks into the code to slow the antivirus analysis down. They also Suppose that ecologists studied only healthy organisms in iso- use stealth techniques that they know will allow it to spread unfet- lation, in pristine caged environments tailored to those organisms.