You Can Type, but You Can't Hide: a Stealthy GPU-Based Keylogger

You Can Type, but You Can't Hide: a Stealthy GPU-Based Keylogger

You Can Type, but You Can’t Hide: A Stealthy GPU-based Keylogger Evangelos Ladakis,* Lazaros Koromilas,* Giorgos Vasiliadis,* Michalis Polychronakis,† Sotiris Ioannidis* *Institute of Computer Science, Foundation for Research and Technology—Hellas, Greece †Columbia University, USA {ladakis, koromil, gvasil, sotiris}@ics.forth.gr, [email protected] ABSTRACT keystrokes passed through the hooked function call. Although Keyloggers are a prominent class of malware that harvests kernel-level keyloggers are more sophisticated and stealthy sensitive data by recording any typed in information. Key- than user-level keyloggers, they heavily rely on kernel code logger implementations strive to hide their presence using modifications, and thus can be detected by kernel integrity rootkit-like techniques to evade detection by antivirus and and code attestation tools [25, 26]. other system protections. In this paper, we present a new In this paper, we present how malware can tap the general- approach for implementing a stealthy keylogger: we explore purpose computation capability of modern graphics proces- the possibility of leveraging the graphics card as an alterna- sors to increase the stealthiness of keylogging. By instructing tive environment for hosting the operation of a keylogger. the GPU to carefully monitor via DMA the physical page The key idea behind our approach is to monitor the system’s where the keyboard buffer resides, a GPU-based keylogger keyboard buffer directly from the GPU via DMA, without can record all user keystrokes and store them in the memory any hooks or modifications in the kernel’s code and data space of the GPU. An important property of our keylogger is structures besides the page table. The evaluation of our pro- that it does not rely on any kernel modifications besides alter- totype implementation shows that a GPU-based keylogger ing the page table, and uses a small code snippet that needs can effectively record all user keystrokes, store them in the to run just once from kernel context to acquire the physical memory space of the GPU, and even analyze the recorded address of the keyboard buffer. This code is completely stan- data in-place, with negligible runtime overhead. dalone, does not require any hooks or other modifications, and is completely removed after it accomplishes its task. The physical address of the keyboard buffer is then used by 1. INTRODUCTION the GPU to monitor all user keystrokes directly via DMA, Keyloggers are one of the most serious types of malware through the direction of a user-level controller process. that surreptitiously log keyboard activity, and typically exfil- We have implemented and evaluated our prototype GPU- trate the recorded data to third parties [12]. Despite signifi- based keylogger for Linux and the 32-bit x86 architecture. cant research and commercial efforts [16,17,21,22], keyloggers The results of our evaluation demonstrate the efficiency and still pose an important threat of stealing personal and finan- effectiveness of GPUs for capturing short-lived data from the cial information [1]. Keyloggers can be implemented as tiny host memory, while keeping the CPU and GPU utilization hardware devices [2,3], or more conveniently, in software [4]. at minimal levels, in our case about 0.1%. Software keyloggers can be implemented either at the user The main contributions of this work are: or kernel level. User-level keyloggers generally use high-level • We present the first (to the best of our knowledge) APIs to monitor keystrokes. For example, Windows provides GPU keylogger, capable of monitoring all keystroke the GetAsyncKeyState function to determine whether a events and storing them in the GPU memory. key is pressed or not at the time the function is called, and whether the key was pressed after a previous call. User-space • We have experimentally evaluated our GPU-based key- keyloggers, while easy to write, are also relatively easy to logger and demonstrate that it can be used effectively detect, using hook-based techniques [28]. In contrast, kernel and efficiently for capturing all keystroke events. By level keyloggers run inside the OS kernel and record all data carefully scheduling the GPU kernel invocations, the originating from the keyboard. Typically, a kernel level keylogger can capture all keystroke events without af- keylogger hooks specific system calls or driver functions. The fecting the proper rendering of the graphics and with injected malicious code is programmed to capture all user minimal overhead. 2. BACKGROUND Permission to make digital or hard copies of all or part of this work for General-purpose computing on graphics processing units personal or classroom use is granted without fee provided that copies are has drastically evolved during the last few years. Historically, not made or distributed for profit or commercial advantage and that copies the GPU has been used for handling 2D and 3D graphics bear this notice and the full citation on the first page. To copy otherwise, to rendering, effectively offloading the CPU from these compu- republish, to post on servers or to redistribute to lists, requires prior specific tationally intensive operations. Driven to a large extent by permission and/or a fee. EuroSec’13 April 14 2013, Prague, Czech Republic the ever-growing video game industry, graphics processors Copyright 2013 ACM 978-1-4503-2120-4/13/04 ...$15.00. have been constantly evolving, increasing both in compu- tational power and in the range of supported operations and functionality. Meanwhile, programmers began exploring GPU kernel module ways for enabling their applications to take advantage of the code massively parallel architecture of modern GPUs. Standard graphics APIs such as OpenGL and DirectX, do 4 start 3 manipulate 2 keylogger page table entries not expose much of the underlying computational capabilities scan pages that graphics hardware can provide. Data and variables have locate to be mapped to graphics objects, while algorithms must be controller buffer memory expressed as pixel or vertex shaders. The Compute Unified process 1 scanner Device Architecture (CUDA) introduced by NVIDIA [20] is a significant advance, exposing several hardware features that Figure 1: Temporary and permanent components 1 are not available via the graphics API. CUDA consists of a of the keylogger. Gray denotes bootstrapping oper- minimal set of extensions to the C language and a runtime ations, while black denotes monitoring functions. library that provides functions to control the GPU from the host, as well as device-specific functions and data types. trator privileges. In other words, depending on its purpose, At the top level, an application written for CUDA consists GPU-assisted malware can run successfully even under user of a serial program running on the CPU, and a parallel part, privileges, making it more robust and deployable. called a kernel, that runs on the GPU. A kernel, however, can only be invoked by a parent process running on the CPU. As 3. GPU-BASED KEYLOGGING a consequence, a kernel cannot be initiated as a stand-alone In this section we present in detail the design of a proof- application, and it strongly depends on the CPU process of-concept keylogger implemented on the GPU. Instead of that invokes it. Each kernel is executed on the device as relying on rootkit-like techniques, such as hooking system many different threads organized in thread blocks. Thread functions and manipulating critical data structures, our key- blocks are executed by the multiprocessors of the GPU in logger monitors the contents of the system’s keyboard buffer parallel. In addition to program execution, CUDA also directly from the GPU. provides appropriate functions for data exchange between One of the primary challenges of this design is how to the host and the device. All I/O transactions are performed locate the memory address of the keyboard buffer, as (i) via DMA over the PCI Express bus. DMA enables the GPU the keyboard buffer is not exported in the kernel’s symbol to trasfer data directly—without any CPU involvement—to table, making it not accessible directly by loadable modules, and from the host memory, using a dedicated DMA engine. and (ii) the memory space allocated for data structures is Typically, the GPU can only access specific memory regions, different after every system boot or after unplugging and allocated by the operating system. plugging back in the device. Typically, loadable modules Given the great potential of general-purpose computing allocate memory dynamically, hence object addresses are not on graphics processors, it is only natural to expect that necessarily the same after a system reboot. In addition, the malware authors would attempt to tap the powerful features OS can apply certain randomization algorithms to hinder an of modern GPUs to their benefit [24, 32]. The ability to attacker that tries to predict an object’s address. execute general purpose code on the GPU opens a whole new To overcome the randomized placement of the keyboard window of opportunity for malware authors to significantly buffer, the attacker has to scan the whole memory. As a raise the bar against existing defenses. Existing malicious consequence, our GPU-based keystroke logger consists of code analysis systems primarily support x86 code, while two main components: (i) a CPU-based component that current virus scanning tools cannot detect malicious code is executed once, at the bootstrap phase, with the task of stored in separate device memory and executed on a processor locating the address of the keyboard buffer in main memory, other than the CPU. In addition, the majority of security and (ii) a GPU-based component that monitors, via DMA, researchers are not familiar with the execution environment the keyboard buffer and records all keystroke events. Figure1 and the instruction set of graphics processors. displays the bootstrapping (gray) and monitoring (black) A GPU-assisted malware binary contains code destined to components of the system, along with the sequence of their run on different processors.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    6 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us