Windows GUI Context Extraction
CS2-AA4X

Windows GUI Context Extraction

A Major Qualifying Project Report submitted to the Faculty of the WORCESTER POLYTECHNIC INSTITUTE in partial fulfillment of the requirements for the Degree of Bachelor of Science

by Austin T. Rose

March 21, 2017

Professor Craig A. Shue

1. ABSTRACT

In any computer system an intelligent policy for allowing or disallowing low-level actions is critical to security. Such low-level actions may include opening up new connections to the Internet, installing new drivers, or executing downloaded files. In determining whether to allow a given action, it is necessary to collect some context regarding how the action was triggered. Is this connection to an address we have never seen before? Where was this file downloaded from? An important part of that context is whether or not a human user actually requested the action in some way, through their interactions with the Graphical User Interface (GUI). That is an abstract question, which is not as straightforward to answer as others. We seek to determine a user's high-level intentions by extracting and relating properties of the GUI as a user interacts with it.

We have created a system that automatically generates information about user activity in a programmatic way, monitoring a Windows computer in real time with a low performance overhead. The information generated is well structured for consumption by security tools, and to inform policy. Deployed across an organization, this system has the potential to effectively white-list broad categories of user work-flows, in order to easily alert about any concerning anomalous behavior which would warrant further investigation.

2. INTRODUCTION

Most low-level system action, especially network activity, is directly triggered by some user interaction with the GUI. A user's interaction with a program is a sort of endorsement for its actions.
That interaction by no means guarantees the security of an application, but if an unrecognized application on some computer starts making connections to the Internet without any user interaction, it is ostensibly more suspicious than a program which only makes connections in response to the local user clicking buttons in its interface. Certainly there are many exceptions to this heuristic. To give an obvious example, modern applications often have the ability to autonomously check for updates, regardless of whether a user explicitly requested it. Nonetheless, from the perspective of a security operator, any information providing context to a computer's actions is valuable. If we can whitelist low-level actions that result from legitimate user interaction, then actions not triggered by a user are easier to identify and scrutinize. Was it a periodic and expected update check, or the beacon of some malware which otherwise stays hidden from the user?

In its ideal form, this security system would force malware to masquerade as a Trojan Horse with a functional GUI in order to be successful. This is a much more difficult attack vector to apply than attacks like installing a hidden remote access tool by exploiting a PDF reader bug. If a piece of malware does not expose any sort of GUI to the user, then any activity it generates becomes suspicious. This is an advantage for security operators, given that advanced malware often stays under the radar by not interacting with the local user at all.

By building up a white-list of work-flows that should result in network activity, we can very effectively block and detect entire categories of malicious network activity. We are not exploring an anti-virus approach, since such an approach offers no solution for dealing with a computer suspected of being compromised. This is a detection and prevention solution: it detects suspicious activity and flags it so that a higher-level controller can block it.
This improves on many existing approaches, because it does not rely on any prior knowledge about the malware. Existing approaches rely on blacklisted executable signatures, domain names, IP addresses, or other indicators. By monitoring and interpreting the core operating system libraries used to facilitate GUIs, we can automatically gain useful insight about any application. Whereas a higher-level monitoring tool could require special tailoring for each unique user application, this system can effectively monitor both present and future applications. In this way, we are able to extract important security context at a minimal cost.

The rest of this paper is organized as follows. Section 3 contrasts this project with previous work that is broadly similar in terms of its role as a network security tool, as well as some work that is more specifically similar in terms of its niche of leveraging information in the GUI for security context. Section 4 discusses some of the background concepts foundational to understanding our approach, and the technology involved. Section 5 describes the specifics of our implementation, and the rationale behind different design decisions. Section 6 presents the results of evaluating the tool in a few categories of effectiveness. Section 7 explores our after-the-fact reflections on the project, and describes some improvements that could be made in future work. Section 8 offers our concluding thoughts.

3. RELATED WORK

Here we discuss how this project fits in with existing research and projects in the field of computer security. We look at similar works and how our approach contrasts with them. We also consider dissimilar tools, which may be improved through our efforts.

3.1. Intrusion Detection Systems

The system we have built is understood in the security community as one form of Intrusion Detection System (IDS), which detects actions that attempt to compromise the confidentiality, integrity, or availability of a resource.
More specifically, we would describe our system as a host-based IDS, since it involves some sort of agent running on each host in the network.

Bro [Paxson 1999] is one particularly well known IDS, which works entirely at the network level. Bro does not secure networks magically; it requires instruction regarding the security policy of the site it is meant to protect. In fact, one of the key contributions of Bro was to define a precise and understandable language for unambiguously defining a network's security policy. Another widely used IDS, Snort [Roesch 1999], fills a niche as a free, lightweight, and cross-platform IDS. But, like Bro and like most IDS solutions, Snort relies on predefined network patterns to look for, which is essentially a site security policy.

However, the question of what the specific security policy should be in each case is often quite difficult to answer. We seek to supplement existing systems which require clear and predefined policy. By operating below the network level and directly with users, our system can provide deep insights to policy decisions that may otherwise effectively have to be made in isolation.

3.2. Firewalls

Firewalls are one of the most basic lines of defense for a network. Their classic operation is to block unsolicited incoming network traffic so that only specifically requested data is allowed to reach the host. Firewalls may be host-based or centralized, and ‘dumb’ or application-aware. Typically, though, a firewall will have to be host-based to be application-aware. In a host-based and application-aware firewall, protection may be extended by denying any outgoing connections from applications, unless they have been specifically white-listed. Little Snitch¹ is a popular Mac OS X product which does exactly that. The built-in firewall for Windows supports similar features, but its default configuration is limited to blocking incoming requests.
Comodo Firewall² is a highly-regarded host-based Windows firewall, for those that need more advanced and precise features than the built-in Windows Firewall offers. Centralized firewalls can be useful to network operators by offering similar protections to a host-based firewall, while additionally providing visibility into activity on the network as a whole.

Our tool, which must be installed on each protected host, is not directly involved with monitoring or preventing network activity. However, the information that it generates could be used to determine connection allowances, as a host-based firewall needs to, while additionally supplying contextual information that would supplement the broader understanding gained from a centralized firewall.

¹ https://www.obdev.at/products/littlesnitch/index.html
² https://www.comodo.com/home/internet-security/firewall.php

3.3. GUI Context Systems

A number of approaches have been implemented to understand a user's high-level actions. BINDER [Cui et al. 2005] is one recent example that takes a similar approach to our system by correlating network connections with user activities (keystrokes and clicks). Our work proposes to categorize user intent more precisely than BINDER could, not only by detecting input events, but also by correlating those events with the text present in the GUI components being used. Furthermore, while the BINDER work focuses heavily on determining indicators of malicious activity, our work seeks only to produce high quality data for whatever policy may use it.

Another approach, coming from Virginia Tech [Zhang et al. 2012], is focused specifically on user actions in web browsers, which represent a disproportionately wide attack surface, as a casual user's main tool for interacting with the Internet. With web browsers, our approach has limited success.
This is mainly because modern web browsers do not use many, if any, standard Windows GUI components that are easy to detect and categorize. This is understandable, given the highly dynamic nature of rendering a web page. Nor do they use many of the same Window Message types and conventions found in other desktop applications.
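
The correlation idea running through this section, relating each outgoing connection to recent user input and to the text of the GUI component involved, can be illustrated with a minimal sketch. It is not the implementation described in this report, nor BINDER's. The record types, field names, the rule of matching on process ID, and the 2-second window are all hypothetical choices made only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class InputEvent:
    """A user input event (hypothetical record format)."""
    timestamp: float      # seconds since some common epoch
    pid: int              # process that owns the GUI component
    component_text: str   # text of the component, e.g. a button label


@dataclass
class Connection:
    """An outgoing network connection (hypothetical record format)."""
    timestamp: float
    pid: int
    remote: str


def find_trigger(conn: Connection, events: List[InputEvent],
                 window: float = 2.0) -> Optional[InputEvent]:
    """Return the most recent input event in the same process that precedes
    the connection by at most `window` seconds, or None if there is none.
    The 2-second default is an assumed value, not one taken from the report."""
    candidates = [
        e for e in events
        if e.pid == conn.pid and 0.0 <= conn.timestamp - e.timestamp <= window
    ]
    return max(candidates, key=lambda e: e.timestamp, default=None)


def flag_unattended(connections: List[Connection], events: List[InputEvent],
                    window: float = 2.0) -> List[Connection]:
    """Return the connections that no recent user input can account for."""
    return [c for c in connections if find_trigger(c, events, window) is None]


# Example data (invented for illustration).
events = [InputEvent(10.0, 100, "Check for updates")]
connections = [
    Connection(10.5, 100, "203.0.113.5:443"),  # shortly after a click in the same process
    Connection(42.0, 200, "198.51.100.7:80"),  # no related user input
]
suspicious = flag_unattended(connections, events)
```

In this sketch the first connection is attributed to the "Check for updates" click, while the second is flagged for further scrutiny. A real policy would also need to account for the exceptions noted in the introduction, such as autonomous update checks.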