Digital Image Ballistics from JPEG Quantization: A Followup Study

Hany Farid
Department of Computer Science
Dartmouth College
Hanover NH 03755

TR2008-638, Dartmouth College, Computer Science

Abstract

The lossy JPEG compression scheme employs a quantization table that controls the amount of compression achieved. Because different cameras typically employ different tables, a comparison of an image’s quantization scheme to a database of known cameras affords a simple technique for confirming or denying an image’s source. This report describes the analysis of quantization tables extracted from 1,000,000 images downloaded from Flickr.

1 Introduction

Since the JPEG image format has emerged as a virtual standard, most cameras encode images in this format. This lossy compression scheme allows camera manufacturers to balance the amount of compression and the quality of their images. This tradeoff is embodied in the JPEG quantization table. In earlier work [1], we showed that the quantization table is fairly distinct across various cameras, and can therefore be used to confirm or deny the source of an image (see also [2]). This initial study examined one image from each of 204 cameras, but suffered from not considering the different quantization tables that are employed with varying camera quality and resolution settings. In this report, we expand on our earlier work by considering a much larger and more diverse collection of images.

2 Methods

Using the Flickr API, 1,000,000 images were downloaded from Flickr, a popular photo-sharing website. Since we are interested in the JPEG quantization table employed by different cameras, it was necessary to eliminate any images that had been edited or altered by photo-editing software. To this end, only images tagged as “original” by Flickr were downloaded. Images were then eliminated if they were not 3-channel color images, and further eliminated if they had incomplete metadata, inconsistent “modification” and “original” dates, or a “software” metadata tag suggesting that the image had been edited by photo-editing software. This filtering eliminated 557,045 images, leaving 442,955 images.

The camera make, model, resolution, and JPEG quantization table were extracted from each of the 442,955 images’ metadata. In order to further eliminate possibly edited or altered images, only those entries with five or more images having the same paired make, model, resolution, and quantization table were retained. This filtering eliminated 105,329 images, leaving 337,626 images for the final analysis.

The remaining 337,626 images spanned 10,153 distinct pairings of camera make, model, resolution, and quantization table (accounting for 48 different camera manufacturers and 859 different camera models). It is from these 10,153 entries that the distinctness of the paired resolution and quantization table was analyzed. The resolution and quantization table were combined in order to provide a more refined criterion for narrowing the search for the source camera.

3 Results

The 10,153 entries described in the previous section fall into equivalence classes of size 1 to 178, where each class has identical resolution and JPEG quantization table (a class of size 1 means that the paired values are unique). The average class size is 20.13, with a standard deviation of 35.7. The median class size is 5. Of the 10,153 entries, 2,704 (26.6%) have a unique paired resolution and quantization table, 37.2% have at most two matches, and 44.1% have at most three matches.
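The grouping into equivalence classes can be illustrated with a short Python sketch. This is not the code used for the report: the record layout, the function names, and the toy records (with placeholder makes and models and 4-value stand-ins for 64-value tables) are all assumptions made for illustration. In particular, the report does not state whether its statistics are taken per entry or per class; the sketch assumes per entry.

```python
from collections import Counter
from statistics import mean, median

# Assumed record layout (hypothetical): (make, model, resolution, qtable),
# with resolution and qtable stored as tuples so records are hashable.

def distinct_entries(records, min_images=5):
    """Keep distinct (make, model, resolution, qtable) entries that are
    backed by at least `min_images` images."""
    counts = Counter(records)
    return [entry for entry, n in counts.items() if n >= min_images]

def class_sizes(entries, key):
    """For each entry, return the size of the equivalence class it falls in,
    where two entries are equivalent if `key` maps them to the same value."""
    sizes = Counter(key(e) for e in entries)
    return [sizes[key(e)] for e in entries]

def by_resolution_and_table(entry):
    _make, _model, resolution, qtable = entry
    return (resolution, qtable)

def by_table_only(entry):
    return entry[3]

def fraction_at_most(sizes, k):
    """Fraction of entries whose equivalence class has at most k members."""
    return sum(1 for s in sizes if s <= k) / len(sizes)

# Invented toy data: placeholder cameras and shortened quantization tables.
Q_A = (2, 1, 1, 2)
Q_B = (3, 2, 2, 3)
records = (
    [("MakeA", "Model1", (2336, 3504), Q_A)] * 6
    + [("MakeA", "Model2", (2336, 3504), Q_A)] * 5
    + [("MakeB", "Model3", (1200, 1600), Q_B)] * 7
    + [("MakeA", "Model1", (400, 600), Q_A)] * 5
    + [("MakeC", "Model4", (480, 640), Q_B)] * 2  # fewer than 5 images: dropped
)

entries = distinct_entries(records)
paired = class_sizes(entries, by_resolution_and_table)
table_only = class_sizes(entries, by_table_only)

print("entries:", len(entries))
print("paired: mean", mean(paired), "median", median(paired),
      "unique fraction", fraction_at_most(paired, 1))
print("table only: mean", mean(table_only), "median", median(table_only),
      "unique fraction", fraction_at_most(table_only, 1))
```

Passing a different key function is all that separates partitioning on the paired resolution and quantization table from partitioning on the quantization table alone.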
Figure 1: The distribution of equivalence class sizes for paired resolution and JPEG quantization (a class size of 1 denotes a unique pairing).

Shown in Figure 1 is the distribution of the size of these equivalence classes, and shown in Appendix A are the cameras belonging to the same equivalence class. In the majority of cases, it is cameras from the same manufacturer that share the same quantization table, although in several situations there is a significant amount of overlap between cameras of different manufacturers.

If we ignore resolution and partition only on the quantization tables, then the 10,153 entries fall into equivalence classes of size 1 to 899, where each class has an identical quantization table. The average class size is 273.6, with a standard deviation of 276.1. The median class size is 254. Of the 10,153 entries, 517 (5.1%) have a unique quantization table, 8.3% have at most two matches, and 10.4% have at most three matches.

4 Discussion

While the JPEG quantization table is clearly not unique, it (paired with the image resolution) is reasonably effective at narrowing the source of an image to a single camera make and model or to a small set of possible cameras.

Acknowledgments

This work was supported by a gift from Adobe Systems, Inc., a gift from Microsoft, Inc., a grant from the National Science Foundation (CNS-0708209), and by the Institute for Security Technology Studies at Dartmouth College under grants from the Bureau of Justice Assistance (2005-DD-BX-1091) and the U.S. Department of Homeland Security (2006-CS-001-000001). Points of view or opinions in this document are those of the author and do not represent the official position or policies of the U.S. Department of Justice, the U.S. Department of Homeland Security, or any other sponsor.

References

[1] H. Farid. Digital image ballistics from JPEG quantization. Technical Report TR2006-583, Department of Computer Science, Dartmouth College, 2006.

[2] J.D. Kornblum. Using JPEG quantization tables to identify imagery processed by software. Digital Investigation, 5:S21–S25, 2008.

Appendix A

Each entry in the first table corresponds to a camera with unique resolution and quantization table. Each entry in the second table corresponds to an equivalence class of identical resolution and quantization table. The size of the equivalence class is in the first column, and the camera make and model and resolution are in the second column.

Apple iPhone (1200 1600)
Apple iPhone (599 800)
Apple iPhone (600 800)
BenQ DC E43 (1704 2272)
BenQ DC E43 (480 640)
Canon DIGITAL IXUS (1200 1600)
Canon DIGITAL IXUS 40 (852 1136)
Canon DIGITAL IXUS 50 (961 1280)
Canon DIGITAL IXUS 50 (972 1296)
Canon DIGITAL IXUS 55 (1536 2048)
Canon DIGITAL IXUS 700 (563 750)
Canon DIGITAL IXUS 75 (375 500)
Canon DIGITAL IXUS 850 IS (1262 1600)
Canon DIGITAL IXUS 860 IS (718 1279)
Canon DIGITAL IXUS 960 IS (1350 1800)
Canon DIGITAL IXUS 970 IS (852 1136)
Canon DIGITAL IXUS v2 (768 1024)
Canon ELURA100 (864 1152)
Canon EOS 1000D (1296 1944)
Canon EOS 1000D (2592 3888)
Canon EOS-1D (1070 1600)
Canon EOS-1D Mark II (2336 3504)
Canon EOS-1D Mark III (1107 1772)
Canon EOS-1D Mark III (2304 3456)
Canon EOS-1D Mark III (2584 3888)
Canon EOS-1D Mark III (2592 3888)
Canon EOS-1D Mark III (335 504)
Canon EOS-1D Mark III (684 1024)
Canon EOS-1D Mark III (738 1181)
Canon EOS-1D Mark II N (2336 3504)
Canon EOS-1D Mark II N (2346 3510)
Canon EOS-1DS (1352 2032)
Canon EOS-1DS (2704 4064)
Canon EOS-1Ds Mark II (2666 4000)
Canon EOS-1Ds Mark II (2667 4000)
Canon EOS-1Ds Mark III (3744 5616)
Canon EOS 20D (1166 1752)
Canon EOS 20D (1168 1752)
Canon EOS 20D (1280 1920)
Canon EOS 20D (2035 3057)
Canon EOS 20D (2332 3504)
Canon EOS 20D (2336 3504)
Canon EOS 20D (320 480)
Canon EOS 20D (400 600)
Canon EOS 20D (408 600)
Canon EOS 20D (426 640)
Canon EOS 20D (496 730)
Canon EOS 20D (853 1280)
Canon EOS 30D (1166 1752)
Canon EOS 30D (1364 2048)
Canon EOS 30D (1365 2047)
Canon EOS 30D (1479 2048)
Canon EOS 30D (1696 2544)
Canon EOS 30D (2336 3504)
Canon EOS 30D (2346 3519)
Canon EOS 30D (299 448)
Canon EOS 30D (586 1044)
Canon EOS 30D (594 890)
Canon EOS 30D (639 960)
Canon EOS 30D (682 1024)
Canon EOS 30D (684 1024)
Canon EOS 30D (689 1030)
Canon EOS 30D (695 1044)
Canon EOS 30D (696 1044)
Canon EOS 30D (744 1024)
Canon EOS 30D (800 1600)
Canon EOS 30D (817 1226)
Canon EOS 350D DIGITAL (1152 1728)
Canon EOS 350D DIGITAL (1664 2496)
Canon EOS 350D DIGITAL (400 600)
Canon EOS 350D DIGITAL (427 640)
Canon EOS 350D DIGITAL (484 726)
Canon EOS 350D DIGITAL (534 800)
Canon EOS 350D DIGITAL (535 802)
Canon EOS 350D DIGITAL (576 864)
Canon EOS 350D DIGITAL (582 874)
Canon EOS 350D DIGITAL (682 1024)
Canon EOS 350D DIGITAL (683 1024)
Canon EOS 350D DIGITAL (691 1037)
Canon EOS 350D DIGITAL (726 1040)
Canon EOS 350D DIGITAL (787 1180)
Canon EOS 350D DIGITAL (800 1200)
Canon EOS 350D DIGITAL (851 1280)
Canon EOS 400D DIGITAL (1065 1600)
Canon EOS 400D DIGITAL (1066 1600)
Canon EOS 400D DIGITAL (1069 1600)
Canon EOS 400D DIGITAL (1200 1800)
Canon EOS 400D DIGITAL (1288 1936)
Canon EOS 400D DIGITAL (1296 1944)
Canon EOS 400D DIGITAL (1600 2400)
Canon EOS 400D DIGITAL (1880 2816)
Canon EOS 400D DIGITAL (1910 2846)
Canon EOS 400D DIGITAL (2329 3499)
Canon EOS 400D DIGITAL (2592 3888)
Canon EOS 400D DIGITAL (400 600)
Canon EOS 400D DIGITAL (427 640)
Canon EOS 400D DIGITAL (511 768)
Canon EOS 400D DIGITAL (512 768)
Canon EOS 400D DIGITAL (567 850)
Canon EOS 400D DIGITAL (600 600)
Canon EOS 400D DIGITAL (600 900)
Canon EOS 400D DIGITAL (681 1023)
Canon EOS 400D DIGITAL (683 1024)
Canon EOS 400D DIGITAL (725 1068)
Canon EOS 400D DIGITAL (768 1152)
Canon EOS 400D DIGITAL (778 1166)
Canon EOS 400D DIGITAL (778 1167)
Canon EOS 400D DIGITAL (800 1280)
Canon EOS 400D DIGITAL (892 1280)
Canon EOS 40D (1200 1920)
Canon EOS 40D (1288 1936)
Canon EOS 40D (1293 1946)
Canon EOS 40D (1296 1944)
Canon EOS 40D (1363 2048)
Canon EOS 40D (1364 2047)
Canon EOS 40D (1364
