Digital Image Authentication from JPEG Headers Eric Kee, Micah K
Total Page:16
File Type:pdf, Size:1020Kb
1 Digital Image Authentication from JPEG Headers Eric Kee, Micah K. Johnson and Hany Farid Abstract— It is often desirable to determine if an image has several features of the JPEG format not previously considered, been modified in any way from its original recording. The namely properties of the run-length encoding employed by the JPEG format affords engineers many implementation trade-offs JPEG standard, and aspects of the EXIF header format. This which give rise to widely varying JPEG headers. We exploit these variations for image authentication. A camera signature analysis is validated on over 1:3 million images spanning 33 is extracted from a JPEG image consisting of information different camera manufacturers and 773 different camera and about quantization tables, Huffman codes, thumbnails, and EXIF cellphone models. format. We show that this signature is highly distinct across 1:3 million images spanning 773 different cameras and cellphones. Specifically, 62% of images have a signature that is unique to II. METHODS a single camera, 80% of images have a signature that is shared The JPEG file format has emerged as a universal image by three or fewer cameras, and 99% of images have a signature standard employed by nearly all commercial digital cam- that is unique to a single manufacturer. The signature of Adobe Photoshop is also shown to be unique relative to all 773 cameras. eras [30], [31]. As such, we consider the details of this These signatures are simple to extract and offer an efficient encoding scheme, and how these details vary among cameras method to establish the authenticity of a digital image. of different make, model, resolution, and quality. Cameras often support multiple resolutions and qualities, each of which yield images with different JPEG compression parameters. We I. INTRODUCTION will, therefore, explore the JPEG parameters used by cameras Digital images are now routinely introduced as evidence of different make and model, and by each camera under into a court of law. It has, therefore, become critical to different resolution and quality settings. verify the integrity of this digital evidence. Digital forensic We begin by describing the JPEG compression standard and techniques have been developed to detect various traces of file format. Given a three channel color image (RGB), com- tampering: region duplication [1], [2], [3], [4]; resampling [5], pression proceeds as follows. An image is first transformed [6]; color filter array artifacts [7], [8]; inconsistencies in from RGB into luminance/chrominance space (YCbCr). The camera response function [9]; inconsistencies in lighting and two chrominance channels (CbCr) are typically subsampled shadows [10], [11], [12], [13]; inconsistencies in chromatic by a factor of two relative to the luminance channel (Y). Each aberrations [14], [15]; inconsistencies in sensor noise [16], channel is then partitioned into 8×8 pixel blocks. These values [17]; and inconsistencies in statistical features [18]. See [19] are converted from unsigned to signed integers (e.g., from for a general survey. However, relatively benign modifications [0;255] to [−128;127]). Each block, fc(·), is converted to either cannot be detected by these techniques, or render these frequency space, Fc(·), using a 2-D discrete cosine transform techniques ineffective. (DCT): It is often desirable to determine if a digital image has 7 7 X X (2x + 1)uπ (2y + 1)vπ been altered in any way from the time of its recording, Fc(u;v) = αu;v fc(x;y) cos cos ; (1) 16 16 including manipulations as simple as cropping. Previous work x=0 y=0 in detecting double JPEG compression holds some promise to where c denotes a specific image channel, α is a normal- detect any form of image manipulation [20], [21], [22], [23], u;v izing scale factor, and f (·) is the underlying pixel values. [24]. These techniques require fairly sophisticated models c Depending on the specific frequency u;v and channel c, each and analysis schemes, can be vulnerable to simple counter- DCT coefficient, F (·), is quantized by an amount q (·): measures such as additive noise or down-sampling, and can c c be computationally intensive. In contrast, it has been previ- Fc(u;v) F^c(u;v) = round : (2) ously shown that EXIF headers [25] and JPEG quantization qc(u;v) tables [26], [27], [28], [29] used by cameras and software This stage is the primary source of data reduction and infor- manufacturers are somewhat distinct, and can therefore be mation loss. used to determine if an image has been altered from its With some variations, the above sequence of steps is em- original recording. Building on this earlier work, we describe ployed by all JPEG encoders. The primary source of variation how various aspects of the JPEG format can be used for in these encoders is the choice of quantization values q (·). authentication. Unlike previous work, this approach considers c The quantization is specified as a set of three 8 × 8 tables E. Kee ([email protected]) and H. Farid ([email protected]) associated with each frequency and image channel (YCbCr). are with the Department of Computer Science at Dartmouth College, 6211 For low compression rates, the values in these tables tend Sudikoff Lab, Hanover NH 03755. M.K. Johnson ([email protected]) is towards 1, and increase for higher compression rates. Shown with the Computer Science and Artificial Intelligence Laboratory at the Massachusetts Institute of Technology, 32 Vassar Street, 32-D462 Cambridge, in top portion of Fig. 2, for example, are the quantization MA 02139. tables employed by a Canon and Nikon digital camera. Shown, 2 from top to bottom, are the tables for the luminance and B. Thumbnail Parameters chrominance channels. As is typical, the quantization for A thumbnail version of the full resolution image is often the luminance channel is less than for the two chrominance embedded in the JPEG header. The next three components channels, and the quantization is less for the lower frequency of our camera signature are extracted from this thumbnail components (the frequency in each table is specified in stan- image. A thumbnail is typically no larger in size than a few dard zig-zag order with the lowest frequency in the top left hundred square pixels, and is created by cropping, filtering corner, and the highest frequency in the bottom right corner). and down-sampling the full-resolution image. The thumbnail After quantization, the DCT coefficients are subjected to is then typically compressed and stored in the header as a entropy encoding (typically Huffman coding). Huffman coding JPEG image. As such, we can extract the same components is a variable-length encoding scheme that encodes frequently from the thumbnail as from the full resolution image described occurring values with shorter codes, and less frequently oc- in the previous section. As with the full resolution image, we curring values with longer codes. This lossless compression have found that the chrominance channels are encoded with scheme exploits the fact that the quantization of DCT coef- the same parameters. ficients yields many zero coefficients, which can in turn be Some camera manufacturers do not create a thumbnail efficiently encoded. Motivated by the fact that the statistics image, or do not encode them as a JPEG image. In such 1 of the DC and AC DCT coefficients are different , the JPEG cases, we simply assign a value of zero to all of the thumbnail standard allows for different Huffman codes for the DC and parameters. Rather than being a limitation, we consider the AC coefficients. This entropy encoding is applied separately lack of a thumbnail as a characteristic property of a camera. to each YCbCr channel, employing separate codes for each In total, we extract 284 values from the thumbnail image: channel. 2 thumbnail dimensions, 192 quantization values, and 90 The JPEG standard does not enforce any specific quantiza- Huffman codes. tion table or Huffman code. Camera and software engineers are therefore free to balance compression and quality to their own needs and tastes. The specific quantization tables and Huffman C. EXIF Metadata Parameters codes needed to decode a JPEG file are embedded into the The final component of our camera signature is extracted JPEG header. In the following sections we describe how the from an image’s EXIF metadata [32]. The metadata, found JPEG quantization table and Huffman codes along with other in the JPEG header, stores a variety of information about the data extracted from the JPEG header form a distinct camera camera and image. According to the EXIF standard, there are signature which can be used for authentication. five main image file directories (IFDs) into which the metadata is organized: (1) Primary; (2) Exif; (3) Interoperability; (4) A. Image Parameters Thumbnail; and (5) GPS. Camera manufacturers are free to embed any (or no) information into each IFD. We extract a The first three components of our camera signature are the compact representation of their choice by counting the number image dimensions, quantization table, and Huffman code. The of entries in each of these five IFDs. image dimensions are used to distinguish between cameras Because the EXIF standard allows for the creation of addi- with different sensor resolution. In order to compensate for tional IFDs, we also count the total number of any additional landscape and portrait images, the image dimensions are spec- IFDs, and the total number of entries in each of these. Some ified as the minimum dimension followed by the maximum camera manufacturers customize their metadata in ways that dimension. The set of three 8 × 8 quantization tables are do not conform to the EXIF standard. These customizations specified as a one dimensional array of 192 values: each yield errors when parsing the metadata.