Device Fingerprinting with Peripheral Timestamps

Device Fingerprinting with Peripheral Timestamps John V. Monaco Naval Postgraduate School, Monterey, CA Abstract—Sensing and processing peripheral input is a ubiq- clock (presumably controlled by the attacker) to compare uitous capability of personal computers. Text entry on physical measurements against. Recent work however has explored and virtual keyboards, mouse pointer motion, and touchscreen the use of function execution time within a web browser to gestures are the primary ways in which users interact with websites viewed on desktop and mobile devices. Peripheral input obtain a device fingerprint [9]. This method relies instead on events must pass through a pipeline of hardware and software differences in machine performance without the need for a processes before they reach the web browser. This pipeline reference clock. is device-dependent and often contains low-frequency polling We introduce a new method to fingerprint devices based on components, such as USB polling and process scheduling, that the timing of peripheral input events, leveraging differences in influence event timing within the web page. We show that a relatively unique device fingerprint is formed the way various hardware and software components process by the timing of peripheral input events. No special permissions keyboard, mouse, and touchscreen input. User input must are required to register callbacks to keyboard, mouse, and pass through a chain of processes that sense and deliver the touch events from within a web browser, and the technique events to an application, such as a web browser [10]. This works on both desktop and mobile devices. We propose a dual chain typically includes a microcontroller on the peripheral clock model in which both a fast reference clock and slow subject clock are compared through a single time source. With device, communication protocol between the peripheral and this model, the instantaneous phase of the subject clock is host, operating system (OS) scheduler, and browser event loop. measured and used to construct a phase image. The phase image Many of these include low-frequency polling that is driven by is then embedded in a low dimensional feature space using an independent clock from system time. FPNET, a convolutional neural network designed to extract a Device fingerprinting is possible due to unique charac- device fingerprint from the instantaneous phase. Performance is evaluated using a dataset containing 300M keyboard events teristics in the periodic behavior of hardware and software collected from over 228k devices observed in the wild. After about components responsible for processing peripheral input. This two minutes of typing, a device fingerprint is formed that enables includes, for example, the timing of USB polls which originate 87.35% verification accuracy among a population of 100k devices. from the USB host controller operating off of an independent Combined with features that measure user behavior in addition clock from the host. The USB host controller regularly queries to device behavior, verification accuracy increases to 97.36%. The methods described have potential as a second authentication USB devices for new events. Low-speed polling occurs at factor, but could also be used to track Internet users. around 125Hz, although depending on the device it could Index Terms—clock skew; triplet network; Internet privacy; be slightly faster or slower. In addition, frequency drift and variations in the period between polls, i.e., timing jitter, are I. INTRODUCTION observed. Besides USB polling, we find a number of other Browser fingerprinting is the practice of measuring device- low-frequency processes that exhibit similar characteristics. specific attributes from within a web browser to perform We propose a dual clock model that captures this scenario in stateless tracking of Internet users [1]. A browser fingerprint which two clocks are compared through a single time source. may include many different entropy sources ranging from In the dual clock model, a reference clock measures the times software properties, such as User-Agent, installed plugins, at which a relatively slower subject clock ticks. Robust device and timezone, to hardware side effects, such as pixel-level fingerprints are formed from the instantaneous phase of the differences in the way images are rendered [2]. Fingerprint- subject clock, which reflects the precise time at which the ticks ing techniques have been extended to mobile device sensors occur relative to the reference. The instantaneous phase seems which have repeatedly been shown to enable device identifi- to be both device and software dependent, largely capturing cation [3], [4], [5]. As of 2021, at least one quarter of the the complete hardware+software stack on a device. top-10k visited websites implement some form of browser Our approach consists of using estimates from the dual fingerprinting [6]. clock model to form a phase image from the peripheral Time-based device fingerprinting techniques have evolved timestamps. We show that the phase image simultaneously alongside browser fingerprinting [7]. Timekeeping differences captures clock frequency, skew, drift, phase, and jitter (i.e., among devices may result from manufacturing inconsisten- changes in instantaneous phase), all of which contribute to cies as well as environmental conditions. Crystal oscillator the device fingerprint. Like face images, phase images are not circuits are prone to changes in frequency based on ambient directly comparable. A convolutional neural network, FPNET, temperature, which, if under an attacker’s control, could enable embeds the images in a low dimensional feature space that targeted alias resolution [8]. This approach of device finger- enables device identification and verification as well as system printing based on clock skew generally requires a reference profiling, for example predicting device model and OS family. Unlike prior work ([7], [8], and [11]), fine grained clock Keyboard circuit Subject behavior is captured by our model. The dual clock model clock tick closes opens t˙ t˙ allows clock properties (e.g., skew, drift, phase, jitter) to 1 2 be measured from a single time source as a result of the USB polling various processes that handle peripheral input being driven S S t t by independent timers. Instantaneous phase, rather than skew 1 2 in prior work, reveals idiosyncratic device behaviors in the Date.now() t R t R presence of low-frequency polling. This approach has wide Reference 1 2 applicability: it requires only the ability to measure the time clock tick DOM keydown DOM keyup of peripheral input events. This can be performed from within a web browser using, e.g., the JavaScript Date API, as no Fig. 1. Time quantization of DOM keydown and keyup events measured in a web browser using the JavaScript Date API. Gray arrows indicate when special permissions are required to register callbacks to DOM the events are sensed by the keyboard; purple arrows indicate times at which events induced by keyboard, touchscreen, and mouse input. USB polls return event data; green arrows indicate the times reported by Our method of fingerprinting scales up to many thousands Date.now() inside callbacks registered to the events. of devices. We validate the approach using a corpus with over 300M DOM events captured from over 228k Internet users in II. BACKGROUND AND RELATED WORK the wild. Relatively unique device fingerprints are formed from 300 keystrokes (600 events), which take about two minutes to A. Input event processing capture. With 10k devices, rank-1 identification accuracy is Document object model (DOM) events form the basis 56.2%, and as population size scales up to 100k it drops to of interactive and dynamic web pages. User input events just 29.7%. Combined with features that capture user behavior (keydown, keyup, mousemove, etc.) typically originate from the same timestamps, rank-1 identification rates increase in a peripheral device attached to the host. In this section, to 84.6% and 63.1% for 10k and 100k devices, respectively. we review the main components responsible for processing A near perfect 1000-fold reduction in population size (98.2% peripheral input on personal computers, including desktop, rank-100 accuracy with 100k devices) is achieved. laptop, and mobile devices. These include: the peripheral The main contributions of this paper include: itself (e.g., keyboard) that polls a sensor for physical changes; the communication protocol between the peripheral and the • The dual clock model, which enables comparison of two host; the OS scheduler; and the browser event loop. Each different clocks through a single time source. The model component introduces a delay to the event and in many cases assumes a high resolution reference clock that reports the exhibits periodic behavior driven by some independent clock times at which some lower frequency subject clock ticks. running at a lower frequency than the browser timestamps. We show that frequency, skew, drift, phase, and jitter can This effectively quantizes the event timestamps as measured all be measured within this model. in the web browser. This model is shown in Figure 1. • The concept of a phase image that captures fine-grained 1) Sensor polling: A keydown event begins with the clock behaviors and enables device fingerprinting. The user physically closing a circuit on the keyboard. Circuits phase image is formed by modular residues of the ob- are arranged in a crossbar switch called a matrix, and the served timestamps. A method to calculate the modular keyboard microcontroller periodically scans the matrix by residue with minimal precision loss is introduced. pulsing each column. Matrix scanning occurs at 100-200Hz • FPNET, a convolutional neural network that embeds on most keyboards [12]. When a closed circuit is detected, the phase images in a low dimensional feature space key debouncing is applied which consists of a short timeout to extract device fingerprints. The model architecture (~5ms) to avoid spurious keystrokes as the switch contacts is inspired by face recognition systems but designed bounce open and closed [13].

Load more