
CREATING A ROBUST FORM OF STEGANOGRAPHY

By

Joshua Michael Buchanan

A Thesis Submitted to the Graduate Faculty of

WAKE FOREST UNIVERSITY

In Partial Fulfillment of the Requirements

For the Degree of

MASTER OF SCIENCE

in the Department of Computer Science

May 2004

Winston-Salem, North Carolina

Approved By:

Dr. David John, Advisor ______

Examining Committee:

Dr. Stan Thomas, Chairperson ______

Dr. Daniel Cañas ______

Acknowledgements

This thesis would not be possible without the encouragement and counsel from my thesis advisor, Dr. David John. From my first day on the Wake Forest campus, he provided guidance both inside and outside the classroom. It was a pleasure to discover and explore a recent field in computer science with him, and for this I am thankful.


Table of Contents

Acknowledgements
Illustrations
Abstract

Chapter 1: Introduction
1.1 What is steganography?
1.2 For good or evil?
1.3 History
1.4 Purpose/Goals of Stego
1.5 Existing Stego techniques
1.6 Robustness
1.7 The Paradigm
1.8 The Goal

Chapter 2: The Details
2.1 Ample Space in Color Models
2.2 A new domain
2.3 Transformation
2.4 Example
2.5 Quantization
2.6 Summary

Chapter 3: The Embedding Algorithm
3.1 Getting Started
3.2 Hashing
3.3 Potential Locations
3.4 Embedding Data
3.5 Redundancy
3.6 Extraction
3.7 Embedding Example
3.8 Extraction Example
3.9 Errors

Chapter 4: Results
4.1 LSB Results
4.2 Visual Comparison: STEM vs. LSB
4.3 Robustness Testing
4.4 Original Images
4.5 Robustness Testing: Level 90 Results
4.6 Robustness Testing: Level 80 Results
4.7 Robustness Testing: Level 70 Results
4.8 Robustness Testing: Level 60 Results
4.9 Robustness Testing: Photoshop Quantization Tables
4.10 Robustness Testing: Blurring
4.11 Conclusion

Chapter 5: Future Work

Appendix A:
A.1 LSB Embedding Code
A.2 MSB Embedding Test
A.3 Comparison of Transform Embedding to LSB
A.4 General Code
A.5 Test Code

Appendix B: Quantization Tables
B.1 Photoshop Quantization Tables
B.2 JPEG Recommended Quantization Tables

Appendix C: Initial Coefficient Testing Results

Bibliography


Illustrations

Figure 1.1: Spam message example
Figure 1.2: LSB embedding example
Figure 1.3: MSB embedding example

Figure 2.1: RGB to YUV conversion formulas
Figure 2.2: Original color image
Figure 2.3: Tiger image, separated into Y, U, V components
Figure 2.4: Formula for the Discrete Cosine Transform
Figure 2.5: DCT transform on an 8x8 image block
Figure 2.6: DCT energy concentration in a flower image
Figure 2.7: Sample JPEG quantization table
Figure 2.8: Results of quantizing an 8x8 tiger eye block

Figure 3.1: Visual importance of DCT coefficients in an image
Figure 3.2: Possible DCT coefficient embedding locations
Figure 3.3: Table of possible embedding locations
Figure 3.4: Patterns of DCT relationships representing embedding bits
Figure 3.5: Sample coefficient block
Figure 3.6: Manipulating coefficients through two matrices

Figure 4.1: Orwellian LSB embedding example
Figure 4.2: LSB robustness results
Figure 4.3: Orwellian STEM embedding example
Figure 4.4: STEM robustness results
Figures 4.5-4.6: Original images used in testing
Figures 4.7-4.10: STEM embedding at strength 90
Figures 4.11-4.14: STEM embedding at strength 80
Figures 4.15-4.18: STEM embedding at strength 70
Figures 4.19-4.22: STEM embedding at strength 60
Figure 4.23: Cumulative Photoshop compression results
Figure 4.24: Selective blurring example


Abbreviations

DCT - Discrete Cosine Transform

DFT - Discrete Fourier Transform

IDCT - Inverse Discrete Cosine Transform

JPEG - Joint Photographic Experts Group

LSB - least-significant bit

MSB - most-significant bit

stego - steganography

pixel - picture element

RGB - red, green, blue color model

STEM - Steganographic Transform Embedding Method


Abstract

Steganography is a relatively new and exciting field in the world of computer science. It involves embedding data into a medium in such a manner that it cannot be easily detected. This thesis provides the reader with a basic overview of steganography.

This overview includes the purpose, goals, and pre-digital history of steganography. The emphasis is on hiding data within images, with a focus on the robustness of the hidden message. Current spatial embedding techniques are not robust to basic compression algorithms. The thesis examines the 'least-significant bit' paradigm which drives the spatial technique, and formulates a 'more-significant bit' paradigm in order to produce a more robust method. Following this paradigm leads to the idea of transform embedding: hiding data in the frequency domain. This thesis proposes a transform embedding method (STEM) which uses existing ideas from watermarking, improving upon them with empirical data to make them suitable for steganography. This method is more robust than existing spatial embedding methods. Experimental results are presented which support this claim.


Chapter 1: Introduction

The proliferation of the Internet has enabled virtually anyone to send and receive information from almost anywhere in the world. However, with this increase in communication comes an increased need for security. The ability to send and receive data securely is important to businesses, individuals, and governments alike.

While encryption provides a means of protecting data, nearly any practical encryption scheme can eventually be broken by brute force. Moreover, the mere existence of encrypted data in a message may be cause for suspicion, and is even illegal in certain political climates of the world. Information hiding has become increasingly important as a means of conveying information to others without fear of discovery.

1.1 What is steganography?

Steganography, also known as stego, is derived from the Greek words steganos and graphein, meaning "covered writing." It involves embedding information into a medium in such a way that it is not easily detected. That medium can be images, sound files, video, or anything else. The effectiveness of a stego algorithm is a direct result of the cleverness of its definition of "ample space." Ample space is the location where an algorithm chooses to hide its data, in hopes that it remains unnoticed while not greatly affecting the host file. This choice affects things like the amount of data that can be inserted, how likely the insertion is to be detected, and how robust the embedded file is to basic file manipulations. The data to be concealed depends somewhat upon the medium in which it is hidden. For example, generally only text can be embedded in a text file such as HTML. Binary files such as images can embed binary messages, allowing everything from text messages to other binary images.

1.2 For good or evil?

Steganography carries with it a certain stigma commonly associated with actions that are immoral or criminal. Unfortunately, this correlation is not entirely unfounded.

Terrorists such as Bin Laden have utilized the technology to coordinate and disperse information. Maps are hidden in sports chat rooms, on eBay auctions, and on pornographic web sites [4,5]. Web sites function as drop-off points, unknowingly harboring information hidden within their images. Criminals then visit the web site and download all images of a certain type (dogs, for example). They can then open an image using the proper software, enter a password or phrase, and extract the hidden data. Criminals relish the anonymity provided by this scenario.

However, steganography also has its own productive, non-malicious applications.

One such application can be found in the medical field. Suppose that we wanted to store patient information on x-rays, without spending millions of dollars on new equipment. The data could be embedded using steganographic techniques, so that the original image would not be noticeably affected. The data can then be extracted from the image using a small, simple system rather than requiring all new equipment [6]. Information hiding is not only for criminals. In the high-stakes world of business and corporate espionage, people need secure ways of conveying information. Steganography provides a means of communication that will not be intercepted by the competition. This idea can be extended to people who live under oppressive governments, giving an anonymous voice to those who might be punished for speaking freely.

1.3 History

While its application in the field of computer science is relatively new, steganography has been around for thousands of years. David Kahn opened the first international Information Hiding workshop by providing a definitive history of steganography [1]. In ancient Persia (circa 559 B.C.), secret messages were placed inside a dead hare and delivered by a man disguised as a hunter. Histaieus employed stego tactics when he needed to deliver a delicate message involving a revolt. He shaved the hair off of one of his slaves and tattooed the message on his head. When the hair grew back, he sent the slave off to Miletus, where his head was shaved and the message delivered.

The revolt was successful. While technology has progressed, some information hiding techniques have remained the same. Aeneas the Tactician utilized a method of putting small pin pricks above the letters of a cover text, representing the letters of the secret message. This tactic continued through World War I, when Germans used the same approach by pricking holes in magazines [8]. Although current methods differ from those of the pre-digital years, the intentions remain the same.

1.4 Purpose/Goals of Stego

The driving force behind steganography is that a message can't be intercepted unless its existence is discovered. Steganography techniques focus on making changes to the host file in a manner that is both invisible and undetectable. An embedding algorithm is said to be invisible if humans can't visually perceive whether or not an image contains hidden information. In psycho-visual experiments, invisibility is commonly judged by presenting a large number of images and asking the person to determine which images contain hidden data. A success ratio of 50% indicates that the person cannot determine which images contain hidden data [13]. The detectability of an embedding algorithm relates to whether a computer can detect these changes and confirm the existence of embedded data. Typically computers compare the image to a statistical model of an image in order to determine any large variances. For example, a popular stego tool called S-Tools embeds data by dividing the number of colors in the color table by 8 (from 256 to 32), duplicating them, and embedding information in the least-significant bit of each color entry [16]. This results in images that have a large number of duplicate or near-duplicate entries. However, naturally-occurring color images have very few duplicate colors [14]. Thus, by scanning color tables for duplicate entries and comparing them with the statistical norm for color tables, a computer can detect whether an image contains embedded data.
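The duplicate-entry test described above can be sketched in a few lines. This is an illustrative sketch, not S-Tools or any published detector; the function name `near_duplicate_ratio`, the palette values, and the tolerance parameter `tol` are invented for the example.

```python
def near_duplicate_ratio(palette, tol=1):
    """Estimate the fraction of palette entries that have a near-duplicate.

    `palette` is a list of (r, g, b) tuples; two entries are near-duplicates
    if each channel differs by at most `tol`. Naturally occurring palettes
    tend to score low, while LSB embedding in a duplicated color table
    drives the score toward 1.
    """
    dup = 0
    for i, c in enumerate(palette):
        for j, d in enumerate(palette):
            if i != j and all(abs(a - b) <= tol for a, b in zip(c, d)):
                dup += 1
                break
    return dup / len(palette)

# A palette where every color was duplicated, as an S-Tools-style embedder would do
suspicious = [(10, 20, 30), (10, 20, 31), (200, 0, 0), (200, 0, 1)]
natural = [(10, 20, 30), (120, 80, 5), (200, 0, 0), (33, 99, 180)]
```

In practice a detector would compare the measured ratio against the statistical norm for natural color tables, flagging images that score far above it.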

Cryptography and watermarking are commonly included in discussions of steganography, but each differs greatly in its intentions and purpose. Cryptography deals with taking a message and making it appear to be random data, unreadable to an outside observer. It does nothing to hide the presence of the message itself. Often steganography is used in conjunction with cryptography, so that a message will remain unreadable even if it is detected. It is important to note that these are two independent fields of study.

Watermarking and its techniques are much more closely related to steganography, differing only in the intended purpose of the embedded data. Watermarks are used to provide proof of ownership of digital media. Thus, unlike steganography, watermark embedding algorithms are not concerned with detection of the watermark. Rather, they are interested in the robustness of the message. The algorithms need to be able to withstand various attacks meant to remove or corrupt the watermark, and focus on making it difficult to do so without compromising the quality of the data. Also, watermarking techniques have the advantage of knowing exactly what data is being embedded, and can extract a watermark based upon thresholds and probabilities. Steganography algorithms have to embed variable data, and so their extraction techniques have to be more exact.

1.5 Existing Stego techniques

There are several clever ways of hiding information. Simple techniques include hiding data in an unused portion of a file, such as the header of a Microsoft Word document. This technique allows the file to function as expected, without hinting that it contains any extra data. Techniques can get much more complicated, embedding data into text via complex grammar rules. A creative example of this is Spam Mimic [17]. Spam Mimic takes in a message as input and produces a message resembling spam, using grammar rules derived from analysis of thousands of spam e-mails. Figure 1.1 shows an example e-mail generated by Spam Mimic, with the message "pier 9" embedded in the grammar. As an interesting side note, this message was automatically discarded by Hotmail's spam filter. There are numerous other methods which embed data into text, music, or network traffic, but this thesis will focus on embedding data into images.

Figure 1.1: Spam message, with the message "pier 9" encoded in it using Spam Mimic

A common steganography technique used today operates by inserting data into the least-significant bits of an image. This method was first proposed by Kurak and McHugh in 1992 [7]. Although image formats vary, all images are represented by a color value for each pixel. That value usually ranges from 0 to 255, represented as an 8-bit binary number. Binary numbers express a number as a sum of powers of two, where an n-digit binary number is represented by the following formula:

value = Σ_{i=0}^{n-1} b_i × 2^i, where each bit b_i is 0 or 1

The variable i represents the position of the bit in the binary string, with position 0 indicating the rightmost position in the string. Thus binary numbers are calculated right-to-left. For example, the decimal number 133 is equal to the following:

1×2^7 + 0×2^6 + 0×2^5 + 0×2^4 + 0×2^3 + 1×2^2 + 0×2^1 + 1×2^0 = 128 + 4 + 1 = 133

This is represented as 10000101 in binary. Since the power of two decreases in importance from left to right, the rightmost bits are known as the least-significant bits.

The 3 least-significant bits in the binary representation are 101, representing the decimal value 5. These are called the least-significant bits because changes to them do not affect the overall value of the number as much as changes to bits farther to the left. If you changed the least-significant bit of our example binary string to a 0, the resulting binary output would be 10000100, which is 132. In contrast, changing the most-significant bit in our example binary string to a 0 would result in the binary string 00000101, which has a decimal value of 5.
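The asymmetry between the two bit positions can be checked with a few lines of code. This is a small illustrative sketch; the helper name `set_bit` is invented for the example.

```python
def set_bit(value, position, bit):
    """Return `value` with the bit at `position` (0 = rightmost) set to `bit`."""
    if bit:
        return value | (1 << position)
    return value & ~(1 << position)

original = 0b10000101                  # decimal 133
lsb_cleared = set_bit(original, 0, 0)  # 10000100 -> 132, barely changed
msb_cleared = set_bit(original, 7, 0)  # 00000101 -> 5, drastically changed
```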

Thus existing stego techniques attempt to manipulate the least-significant bits of the color data in an image, taking advantage of the fact that the human eye can't perceive the difference between very small changes in the color of a pixel. For example, if a pixel were represented by the gray-scale value 187, it would be difficult for a human to detect the difference between that pixel and a pixel with gray-scale value 186. Figure 1.2 shows a picture of Wait Chapel, with a map of Wake Forest embedded in the least-significant bits using S-Tools. Note that S-Tools' embedding process works much better on black-and-white images than on color ones.

Figure 1.2: An example of embedding data into the LSB using S-Tools. The original image is at top center. The image to be embedded, a map of Wake Forest, is in the middle. The bottom left image shows embedding in a color image. The bottom right image shows embedding in black and white. (photo by Josh Buchanan)

Typical least-significant bit embedding schemes discard the least-significant bits, and replace them with the data to be embedded. For example, if we wanted to embed the sequence 01101, and we had the following image data:

01100000  10101111  00011110  01011101  11111100
  (96)      (175)      (30)      (93)     (252)

Using the least-significant bit embedding technique, the resulting binary sequence would be:

01100000  10101111  00011111  01011100  11111101
  (96)      (175)     (31)*     (92)*    (253)*

(* indicates a value changed by the embedding)

Note that there must be 8 bits in the host file for each bit you wish to embed using the least-significant bit. This can be countered by embedding data into several least-significant bits, at the cost of visibly altering the image. Experimental data using S-Tools suggests that one can alter the 3 least-significant bits of image data without noticeably altering an image file [14].
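The replacement scheme above can be sketched as follows. This is an illustrative sketch rather than any particular tool's code; the function names `embed_lsb` and `extract_lsb` are invented for the example.

```python
def embed_lsb(pixels, bits):
    """Replace the least-significant bit of each pixel value with a message bit.

    `pixels` is a list of 0-255 values, `bits` a list of 0/1 message bits;
    one host byte is consumed per embedded bit.
    """
    if len(bits) > len(pixels):
        raise ValueError("message too long for cover data")
    return [(p & ~1) | b for p, b in zip(pixels, bits)] + pixels[len(bits):]

def extract_lsb(pixels, n):
    """Read back the first `n` embedded bits."""
    return [p & 1 for p in pixels[:n]]

cover = [96, 175, 30, 93, 252]
stego = embed_lsb(cover, [0, 1, 1, 0, 1])   # the example from the text
```

Running this on the cover data from the text reproduces the sequence 96, 175, 31, 92, 253 shown above.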

Although the least-significant bit embedding (LSB) method has visually pleasing results, the embedded message is fragile. This is because image compression algorithms operate on the same principles as the embedding algorithm, manipulating the least-significant bits in order to save space. Hence the compression algorithm conflicts with the embedding algorithm, overwriting the embedded message in the process [14]. In order for an embedding algorithm to survive compression, the covert data will have to be hidden elsewhere.

1.6 Robustness

In short, we are looking for a robust embedding method. In order to find one, a measure of robustness must be defined. An embedding algorithm will be considered robust if the embedded message survives manipulation of the image and can still be extracted. For the purpose of this thesis, image manipulation will mean lossy JPEG compression. The embedding algorithm will be tested against different degrees of compression in order to determine how much an image can be manipulated before the message is destroyed.

1.7 The Paradigm

If compression algorithms manipulate the least-significant bits of an image, then it would be ideal if we could hide the data in the most-significant bits. Compression algorithms are least likely to manipulate these bits, thus increasing the chance of survival for the message. However, there is an important reason why compression algorithms do not affect the most-significant bits: changing these bits greatly alters the visual perception of the image. Simply applying the ideas of the LSB embedding process to the most-significant bits achieves results that are unsatisfactory and, visually speaking, plain ugly (see Figure 1.3). However, this line of thinking leads us in the proper direction.

Developing an embedding process that is both robust and stealthy is a delicate compromise, attempting to make sure that the message is an intricate part of the image yet remains undetectable. Since embedding data in the least-significant bits produces fragile, non-robust results, and embedding data in the most-significant bits produces detectable results, a balance of robustness and stealth might be achieved through hiding data in the more-significant bits of an image. This terminology, while ambiguous, reflects the fact that the combination of stealth and robustness is an arbitrary one, as both goals cannot be entirely attained.

Figure 1.3: This image contains data embedded in the most-significant bit, in a region around Lena's face. Code for this example can be found in Appendix A.2.

Up until now, the embedding algorithms have focused on manipulating the image data in the spatial domain. A color image in the spatial domain represents each pixel by a triplet of values, corresponding to the quantities of red, green, and blue values which combine to form its color. An image can then be viewed as an array of pixels, with the index of the pixel in the array corresponding to the location of the pixel in the image.

Thus an image with dimensions m x n can be represented by an array of size m x n x 3.

Images are also represented as m x n x 3 arrays in the frequency domain, but the triplet values are different. There are many different technical definitions of what a value represents in the frequency domain, but intuitively speaking, each triplet value represents how much the color value in the spatial domain changes from its neighbors. This property has several unique characteristics which will be exploited in the STEM embedding process.


1.8 The Goal

This thesis asks the following question: can we take existing robust watermarking techniques and improve upon them so that they apply to steganography? The seemingly slight differences between watermarking and steganography make this task far from trivial. Watermarking detection algorithms already know what they are looking for, and can declare with a given certainty that an image contains that watermark. On the other hand, steganography algorithms embed variable data, and need more precise methods of extraction. In order to use watermarking methods for this purpose, they will have to be improved to achieve more accurate results.


Chapter 2: The Details

In order to combat the effects of image compression on our embedded message, we must gain an understanding of several key steps in the JPEG algorithm. These steps include converting from the RGB color model to the YUV color model, and transforming this information from the spatial domain to the frequency domain. The transformation to the frequency domain is done through the Discrete Cosine Transform (DCT), which has several key properties that can be used to further increase the robustness of our embedding method.

Once these steps are understood, we can increase the robustness of our embedding method by taking advantage of specific properties of each step.

2.1 Ample Space in Color Models

Images are represented as a collection of individual elements called pixels. A pixel is a small dot on the screen, and represents a single color. Computer monitors display images by using three different phosphors (red, green, blue) for each pixel. A computer displays images in an additive manner, combining the three phosphors in order to generate a certain shade. Thus if all three phosphors are minimized, the pixel appears to be black. Likewise, if all three phosphors are maximized, the pixel is white. Different combinations of the three phosphors produce most of the colors in between. Most images are represented by the RGB (red-green-blue) color model, which mimics the way computer monitors work. Each pixel in the RGB color model is represented as a combination of these three components. The images dealt with in this thesis were 24-bit JPEGs, 8 bits for each component. The values of each component range from 0 to 255.


The JPEG file format converts from RGB to the YUV (luminance-chrominance-chrominance) color model. Specifically, it converts an image to a form of the YUV color model known as YCbCr. The YUV color model was developed as a way to allow television stations to broadcast in both black-and-white and color. Black-and-white television sets use only the luminance (Y) portion of the broadcast, ignoring the chrominance portions of the signal. The Y component specifies the intensity of the image. Color television sets also decode the two chrominance components (Cb, Cr) in order to form a color picture [3]. The Cb component specifies how much blue is in the image, while the Cr component specifies how much red is in the image. The RGB and YUV color models are related via the following formulas:

Y  =  0.299R + 0.587G + 0.114B
Cb = -0.1687R - 0.3313G + 0.5B + 2^(SamplePrecision-1)
Cr =  0.5R - 0.4187G - 0.0813B + 2^(SamplePrecision-1)

R = Y + 1.402(Cr - 2^(SamplePrecision-1))
G = Y - 0.34414(Cb - 2^(SamplePrecision-1)) - 0.71414(Cr - 2^(SamplePrecision-1))
B = Y + 1.772(Cb - 2^(SamplePrecision-1))

Figure 2.1: RGB to YUV (YCbCr) conversion formulas [2]. Sample precision is equal to 8 for JPEG images, so the offset 2^(SamplePrecision-1) is 128. (See footnote 1.)

The luminance component (Y) contains a large part of the information in the RGB color model, as can be seen in the formulas which convert YUV back to RGB. This is made obvious in Figure 2.3, which shows an image of a tiger broken up into its Y, U, and V components. This is an important property of the YUV color model: it allows compression to be performed on the chrominance (U, V) components of an image without greatly affecting our visual perception of it. This is very different from the RGB color model, which assigns all three components equal importance. The JPEG image compression algorithm exploits this fact, performing greater compression on the chrominance components than on the luminance component. Data hidden in the luminance component of an image will therefore reside in the visually most-significant part of the data, and will be less likely to be damaged during compression. By exploiting this characteristic of the YUV color model, we attempt to make our stego embedding process more robust.

1 The given formula from this book states the formula for R as R = Y + 1.402V. This is wrong; it is not an accurate inverse for R. The formula has been corrected here to be an accurate inverse.
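The conversion formulas of Figure 2.1 translate directly into code. A minimal sketch, assuming 8-bit sample precision (offset 128) and the rounded coefficients shown in the figure; a real codec would also clamp and round the results to integers.

```python
def rgb_to_ycbcr(r, g, b):
    """JPEG-style RGB -> YCbCr conversion, sample precision 8 (offset 128)."""
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.1687 * r - 0.3313 * g + 0.5 * b + 128
    cr =  0.5 * r - 0.4187 * g - 0.0813 * b + 128
    return y, cb, cr

def ycbcr_to_rgb(y, cb, cr):
    """The corrected inverse (see footnote 1): the 128 offset must be removed."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.34414 * (cb - 128) - 0.71414 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return r, g, b
```

A mid-gray pixel (128, 128, 128) maps to Y = 128 with both chrominance components at the neutral value 128, and a round trip through both functions reproduces the original RGB values to within rounding error.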

2.2 A new domain

Another step we can take to make the embedding process more robust is to transform the data from the YUV color space into the frequency domain. This allows us to look at images as a continuous entity, rather than a discrete set of 1's and 0's. Bits can now be embedded according to thresholds, rather than embedding the exact bit. In doing so, we are given greater freedom in embedding data, and a greater margin for error caused by compression. The frequency domain represents the amount of energy in the luminance and chrominance light waves. The energy distribution varies greatly from image to image, and so it is difficult to compress an image in the spatial domain. In the frequency domain, the energy in an image tends to be more compact, grouped closer together. Thus the JPEG compression algorithm transforms image data into the frequency domain, which allows compression to be performed on the outer components of the energy spectrum.

Figure 2.2: Original color image of a tiger (photo by Josh Buchanan)

Figure 2.3: Image of a tiger, separated into Y, U, and V components. (photo by Josh Buchanan)

2.3 Transformation

The operation that JPEG uses to convert between the spatial domain and the frequency domain is the two-dimensional Discrete Cosine Transform (DCT). The JPEG compression algorithm partitions an image into 8x8 blocks of pixels, applying the DCT separately to each block [18]. The DCT is an invertible transform which expresses data as a sum of cosine functions. The two-dimensional DCT is defined as follows:

F(u,v) = C(u) C(v) Σ_{x=0}^{N-1} Σ_{y=0}^{N-1} f(x,y) cos[(2x+1)uπ / 2N] cos[(2y+1)vπ / 2N]

where C(k) = √(1/N) if k = 0, and C(k) = √(2/N) otherwise.

Figure 2.4: Formula for the 2-D Discrete Cosine Transform (DCT). In JPEG compression, N = 8.
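The formula in Figure 2.4 can be transcribed directly into code. This naive O(N^4) version is only meant to make the formula concrete; a real JPEG codec uses fast factored algorithms instead.

```python
import math

def dct2(block):
    """Naive 2-D DCT of an NxN block, following the formula in Figure 2.4."""
    n = len(block)
    c = lambda k: math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = c(u) * c(v) * s
    return out
```

For a constant 8x8 block all the energy lands in the DC component F(0,0) = (1/8) Σ f(x,y), with every AC component essentially zero, illustrating the energy compaction discussed in the next section.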

So why use the DCT rather than the Discrete Fourier Transform (DFT)? The DCT is a special derivation of the Fourier transform. The Fourier transform treats a finite signal as if it were periodic. Thus a difference between the edges of two neighboring blocks in an image results in high-frequency components. During the quantization step (see Section 2.5), these high frequencies are eliminated, resulting in blocky artifacts in the image. Thus the DFT needs to be modified in order to work well with images.

The Fourier series of any continuous, real-valued symmetric function is composed entirely of coefficients relating to the cosine terms of the series [19]. By forcing symmetry on the DFT, we are able to utilize this property to produce results which are real-valued and can be expressed entirely in terms of cosine waves [12]. This has an important consequence: the symmetry of the function removes the sharp change between neighboring image blocks, eliminating the high-frequency components that cause blocky artifacts in an image [9].

2.4 Example

Let's observe the basic behavior of the DCT by viewing a sample calculation from an 8x8 block of the eye of the tiger in the previous section. Figure 2.5 shows the Y values for the image, and the corresponding DCT coefficients. The upper left-hand value in the DCT matrix (921) is a constant value, known as the DC component. The other values in the matrix are known as AC components. The DCT separates an image into parts according to their visual importance. That is, it transforms most of the wave energy into the upper left corner of the block of coefficients, and the least energy into the bottom right corner of the block. Thus the DC component represents the most-significant part of the data. You can see this in the example calculation, as the DC component is much larger than any of the other coefficients. As you move towards the lower right-hand corner of the matrix, the magnitude of the values gets increasingly smaller. The DCT is excellent at energy compaction, and it is important to note that this property is independent of the image data. As seen in Figure 2.6, most of the energy of the flower image is also concentrated in the upper left-hand corner of the coefficient matrix. This is an important part of the JPEG compression scheme. The algorithm is able to compress the lower right-hand corner (or eliminate it entirely) without affecting the visual perception of the image.

Figure 2.5: An example 8x8 block from the tiger image and its DCT coefficients. Coefficients were calculated using Matlab.

It is important to apply the notions of significant bits in the spatial domain, introduced in the first chapter, to the DCT coefficients in the frequency domain. Under the DCT, the DC coefficient is analogous to the most-significant bit in the spatial domain. Similarly, the least-significant bits are akin to the coefficients in the lower right-hand corner. In our search for the more-significant bits of data, we look to the middle-frequency DCT coefficients. The choice of these coefficients is somewhat arbitrary, and will be made through research and experimental data, which will be discussed further in Chapter 3.

Figure 2.6: A) Flower from Reynolda Gardens (photo by Josh Buchanan) B) Corresponding DCT energy concentration


2.5 Quantization

Up until now, the steps of the JPEG compression algorithm have been lossless. The lossy part of the algorithm takes place during the quantization of each 8x8 image block, and is the part our information hiding process attempts to withstand. Quantization on a coefficient matrix F is performed as follows:

F^Q(u,v) = round( F(u,v) / Q(u,v) )

Here F^Q represents the quantized matrix, and Q is the matrix representing the quantization table. The quantization table is an 8x8 matrix, and is stored in the header of the JPEG file. Each value in the original matrix F has a corresponding value in Q. Higher values in the quantization table result in more round-off in the quantized matrix F^Q. This fact is demonstrated in the restoration process, which occurs as follows:

F'(u,v) = F^Q(u,v) × Q(u,v)

Again F^Q represents the quantized matrix, Q is the quantization table, and F' is the matrix of restored coefficients.
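The two formulas can be sketched as a pair of functions; applying one after the other shows exactly where the loss occurs. This is an illustrative sketch (it uses Python's built-in round, whose tie-breaking at exact halves may differ from a particular codec's rounding), shown here on a 2x2 fragment rather than a full 8x8 block.

```python
def quantize(coeff, qtable):
    """F_Q(u,v) = round(F(u,v) / Q(u,v)), applied element-wise to a block."""
    return [[round(f / q) for f, q in zip(frow, qrow)]
            for frow, qrow in zip(coeff, qtable)]

def dequantize(quant, qtable):
    """F'(u,v) = F_Q(u,v) * Q(u,v): restore coefficients, round-off loss included."""
    return [[fq * q for fq, q in zip(frow, qrow)]
            for frow, qrow in zip(quant, qtable)]

coeff = [[921, -15],
         [4, 2]]          # a small made-up coefficient fragment
qtable = [[16, 11],
          [10, 16]]       # corresponding quantization values
```

Quantizing and restoring this fragment turns the small coefficients 4 and 2 into 0 permanently, while the large DC value 921 comes back as 928: close, but no longer exact. This irreversible round-off is the loss the embedding method must survive.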

Initially, this inverse operation may seem trivial. However, compression is achieved through the initial rounding of the coefficient, the structure of the quantization table, and the nature of the DCT. The JPEG committee does not specify an exact

                                                                                                                           Figure 2.7 Quantization Table for the Y Component [2]

19

standard for quantization tables, but it does provide a table ( Figure 2.7 ) that has been tested with good results [2]. Notice how the quantization values corresponding to the lower right-hand corner of the DCT coefficients are larger than the others. This takes advantage of the fact that the DCT concentrates most of its energy into the upper left hand corner of the matrix. Figure 2.8 shows the effects of quantizing on the 8x8 block representing the tiger eye. Most of the visually less-significant coefficients are rounded down to 0. Thus when these coefficients are restored, they will be restored as 0. The

JPEG algorithm then applies Huffman encoding to the matrix in a zigzag manner in order to achieve maximum compression on these 0's [18].

So how does quantization affect the stego algorithm? The method must be robust enough to survive the quantization stage of JPEG compression. Since there is no set standard for quantization tables, it must be able to survive a variety of tables. There are limits, however, to which sort of tables the method should be able to survive. In line with the notion of robustness presented in Section 1.6, a given quantization table should not visually destroy the original image. Thus our embedding algorithm must be able to survive quantization tables which preserve the high-energy DCT coefficients.

Figure 2.8: Results of Quantizing the DCT coefficients from Figure 2.5


2.6 Summary

This chapter introduced steps of the JPEG algorithm which will affect the robustness of the proposed stego embedding process. The luminance (Y) component has been identified as the visually most-significant region of the YUV color space, and thus the component that is least likely to be compressed. This component is where data will be embedded. We examined the DCT and its basic properties, and observed how it distributed most of the energy from the spatial domain into the upper left-hand corner of the coefficient matrix. Finally, we looked at how data is compressed using quantization tables, and how the layout of these tables takes advantage of the compaction properties of the DCT. Since there is no standard quantization table, the embedding method must be prepared to handle a variety of tables which behave differently. Chapter 3 will examine the relationship between varying quantization tables and their DCT coefficients, and form a notion of robust coefficients based upon experimental data. It will also lay out a transform-based steganography method (STEM) which uses the ideas presented in this chapter to produce a method which is robust to varying degrees of JPEG compression.


Chapter 3: The Embedding Algorithm

This chapter outlines the ideas behind our embedding system STEM (Steganographic Transform Embedding Method). STEM is based on a watermarking method proposed by Jian Zhao and Eckhard Koch [15]. This process is similar to the spread-spectrum approach first utilized in World War II [6]. It attempts to hide the signal in the noise of the frequency domain. By manipulating frequency domain coefficients, it spreads changes throughout an entire image block, rather than concentrating them at a single location. Modifications have been made to the coefficient location table, quantization table, and redundancy features of the existing process in order to increase the robustness of the watermarking method.

3.1 Getting Started

An image of size m x n is read in and stored as an m x n x 3 array, with the third dimension corresponding to the RGB color model. The first step is to convert this array into the YUV color model. As discussed in Chapter 2, most of the important information is stored in the Y component of the color model, which is least likely to be compressed. By hiding data in the Y component, the data has the best chance of surviving compression.

The image matrix is then partitioned into 8x8 blocks, which are to be processed separately. Incomplete blocks are discarded by the partitioning algorithm. Since robustness is the focus of this algorithm, the embedding order is currently sequential.

However, an embedding order could be generated from a user-defined key, similar to those used in PGP [22].
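The partitioning step can be sketched as follows; the helper name is hypothetical, and the sketch simply enumerates complete 8x8 blocks in sequential (row-major) order, discarding incomplete edge blocks:

```python
BLOCK = 8

def block_corners(height, width):
    # Upper-left (row, col) of every complete 8x8 block, in sequential order;
    # incomplete blocks at the right and bottom edges are discarded.
    return [(r, c)
            for r in range(0, height - height % BLOCK, BLOCK)
            for c in range(0, width - width % BLOCK, BLOCK)]

corners = block_corners(20, 17)  # only a 16x16 area holds complete blocks
# -> [(0, 0), (0, 8), (8, 0), (8, 8)]
```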


The 2-dimensional DCT is performed on each block in the sequence. These DCT coefficients are quantized via a quantization table which corresponds to the level of robustness specified by user input.

3.2 Hashing

The program uses a pass phrase to determine where to embed the data in the coefficients. Hash functions can be very complex, but this one is used as a proof-of-concept.

for i = 1 to length(passPhrase):
    sequence[i] = ASCII(passPhrase[i]) mod sizeof(locationTable)

Each character in the pass phrase is hashed according to its numeric ASCII value, modulo the size of the location table introduced in Section 3.3. The resulting hash value corresponds to an entry in the location table. The pass phrase generates a coefficient embedding sequence. When the end of the coefficient embedding sequence is reached, the sequence starts over again, embedding the next bit at the first location in the sequence.
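A minimal sketch of this hashing rule, assuming a location table of 7 triplets as described in Section 3.3 (the exact hash in the thesis may differ; its example in Section 3.7 maps 'fire' to (7,10,7,6), which this simplified version does not reproduce):

```python
TABLE_SIZE = 7  # Section 3.3 chooses a prime number of coefficient triplets

def phrase_to_sequence(phrase):
    # Each character's ASCII value, reduced modulo the location-table size,
    # selects one triplet; the resulting sequence repeats once exhausted.
    return [ord(ch) % TABLE_SIZE for ch in phrase]

phrase_to_sequence('fire')  # [4, 0, 2, 3] under these assumptions
```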

3.3 Potential Locations

The location table contains triplets of DCT coefficients which represent possible embedding locations. It is important to choose only coefficients which will be robust to compression. However, modifying these coefficients should not greatly alter the appearance of the image. Figure 3.1 shows the visual importance of the DCT coefficients. Note how much information is contained in only the DC coefficient. As more DCT frequencies are added, the image becomes more detailed and crisp. It is these


Figure 3.1: A visual display of the importance of DCT coefficients. The gray shaded boxes represent coefficients that were used from the original image to produce the lettered image. Photograph A shows the image reconstructed with only its DC coefficient. For each successive photograph, a DCT coefficient is added in a zigzag manner. The original image, reconstructed from all its DCT coefficients, is shown in photograph K. (photo by Josh Buchanan)


first several coefficients which will serve as the more-significant regions of the image, the area where data will be embedded.

In order to find the coefficients least likely to be compressed, the quantization tables for JPEG images from the JPEG committee and Adobe Photoshop™ were observed, and "good" spots for embedding data were obtained and included with the embedding locations from [15].

Figure 3.2: Possible DCT coefficient embedding locations

Good spots were judged by calculating the average of the quantization tables in a given range and choosing the positions corresponding to the 9 lowest coefficients in the average quantization table. For the JPEG standard, the compression level ranged from 50 to 100. The Photoshop compression level ranged from 7 to 12. In order to avoid skewing the image too much, the DC coefficient and its immediate neighbors were excluded from consideration. The lowest coefficients were chosen because these coefficients are least likely to be affected by quantization. Embedding tests were performed at these locations to eliminate the triplets which were not resistant to compression, and the results are shown in Appendix C. The resulting "best" spots for embedding data are shown in Figure 3.3. Although there are more than 7 possible combinations of these three locations, a prime number of coefficient triplets was chosen in order to perform best with the given hash modular function [23].

Figure 3.3: Table of possible coefficient embedding triplets (columns: Hash, k1, k2, k3)

These locations are organized into a table of triplets (k1, k2, k3), shown in Figure 3.3. Data is embedded through manipulating the relationships between these three coefficients. These relationships are shown in Figure 3.4. The bit represented by these relationships was arbitrarily assigned. The labels "High," "Middle," and "Low" are assigned relative to the three coefficients. In other words, the pattern MLH corresponds to the instance where k2 < k1 < k3. This pattern represents an embedded '0' bit. Similarly, the pattern HML corresponds to the instance where k1 > k2 > k3. This pattern represents a '1' bit. There are also patterns which represent an invalid bit.

Figure 3.4: Patterns for embedded bits. H=High, M=Middle, L=Low

This invalid pattern allows the extraction function to identify which blocks are either corrupted by compression or unable to be used by the algorithm without noticeably modifying the visual perception of the image.
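The mapping from a coefficient triplet back to a bit can be sketched directly from these patterns (an illustration, using the comparison rule restated in Section 3.6):

```python
def triplet_to_bit(k1, k2, k3):
    # '1' when k1 and k2 both dominate k3 (e.g. the HML pattern),
    # '0' when k3 dominates both (e.g. MLH); anything else is invalid.
    if k1 > k3 and k2 > k3:
        return '1'
    if k1 < k3 and k2 < k3:
        return '0'
    return 'B'  # invalid block marker

triplet_to_bit(9, 7, 2)  # HML -> '1'
triplet_to_bit(5, 3, 8)  # MLH -> '0'
triplet_to_bit(5, 5, 5)  # near-equal values -> 'B'
```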

3.4 Embedding Data

An 8x8 block B is chosen from the embedding sequence. Before data is embedded into block B, a threshold check is performed to make sure that modifying the coefficients will not greatly alter the visual appearance of the image. This threshold check is dependent upon the distance between frequency coefficients (D). The value for D is obtained experimentally, partially dependent on the user-entered degree of robustness. A larger value for D allows more robustness in the embedding method, but results in more visual distortion of the image.

The DCT is then performed on block B, and the block is quantized according to a given quantization table Q. The values for table Q are also experimental and depend upon a number of factors. These factors include the visual characteristics of the image, the desired robustness, and how much one is willing to alter the image to obtain the desired


robustness. Chapter 4 provides results which show the importance of each of these factors.

A triplet of DCT coefficients (k1, k2, k3) is obtained through the coefficient embedding sequence generated by the pass phrase introduced in Section 3.2 and the location table from Section 3.3. For a given bit b, the threshold check is performed as follows:

for bit b = 1:
    if kq3 > min(kq1, kq2) + D, the desired pattern cannot be reached safely:
        set (kq1, kq2, kq3) to an invalid pattern of nearly equal values
        and continue with the next block in the embedding sequence

for bit b = 0:
    if kq3 < max(kq1, kq2) - D, the desired pattern cannot be reached safely:
        set (kq1, kq2, kq3) to an invalid pattern of nearly equal values
        and continue with the next block in the embedding sequence

otherwise, block B passes the threshold check and bit b is embedded in it

By modifying the block to an invalid pattern which contains frequencies with similar values, we avoid changing the areas of the image susceptible to visual damage.

If the given block B passes the threshold check, the coefficients will be modified according to the pattern in Figure 3.4 corresponding to the bit b that is embedded.

Modifications will occur as follows:


for bit b = 1:
    adjust the coefficients so that kq1 >= kq3 + D and kq2 >= kq3 + D
    (raising kq1 and kq2 and/or lowering kq3)

for bit b = 0:
    adjust the coefficients so that kq3 >= kq1 + D and kq3 >= kq2 + D
    (raising kq3 and/or lowering kq1 and kq2)


An additional step is required to correct this averaging effect.

Since the DCT is both linear and separable, we can manipulate the values of the three DCT coefficients via a second matrix. This second matrix will consist of all zeros, except at the positions of the three coefficients. Figure 3.6 shows how this matrix will be set up. Those three positions will contain the values we wish to manipulate the DCT coefficients by. The IDCT is performed on each matrix separately, transforming the values back into the spatial domain. These two matrices are added together, producing a spatial matrix which has the desired relationships between the coefficients (k 1, k2, k3) in the frequency domain. We will observe how changes in the frequency domain affect our image in the spatial domain in an example at the end of this chapter.
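The linearity claim is easy to verify numerically. The sketch below implements a small separable inverse DCT from scratch (4x4 rather than 8x8 to keep it short; the matrices are made-up illustrations) and confirms that adding the two spatial matrices is equivalent to adding the changes in the frequency domain:

```python
import math

N = 4  # illustration size; the thesis works with 8x8 blocks

def idct1(v):
    # 1-D inverse DCT (DCT-III) with orthonormal scaling.
    return [v[0] / math.sqrt(N)
            + sum(math.sqrt(2.0 / N) * v[k]
                  * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                  for k in range(1, N))
            for n in range(N)]

def idct2(m):
    # Separable: apply the 1-D inverse DCT to the rows, then to the columns.
    rows = [idct1(r) for r in m]
    cols = [idct1([rows[i][j] for i in range(N)]) for j in range(N)]
    return [[cols[j][i] for j in range(N)] for i in range(N)]

coeffs = [[80, 3, 0, 0], [5, -2, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
changes = [[0, 0, 0, 0], [0, 6, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
combined = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(coeffs, changes)]

spatial_sum = [[a + b for a, b in zip(ra, rb)]
               for ra, rb in zip(idct2(coeffs), idct2(changes))]
# spatial_sum equals idct2(combined) up to floating-point error, so adding
# the two spatial matrices realizes the coefficient changes exactly.
```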

3.5 Redundancy

Although taking in the statistics of different quantization tables makes the embedding process more robust, it cannot handle them all. In order to combat the effects of many different quantization tables, the algorithm must have some form of redundancy built into it. This results in an algorithm with a reduced message capacity, but an increased chance of message survival.

The embedding sequence will be divided evenly into r parts. The covert message

Figure 3.6: Shows how to produce a spatial matrix that will produce coefficients k1, k2, k3 which are altered by their corresponding values of C.


will be embedded in these locations. Before embedding the data sequence, a 10-bit number representing the length of the covert message will be embedded at the start of each of these locations. The extraction method will handle damaged bits using the information stored at these locations and a heuristic scoring function.
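A sketch of this payload layout (the helper names are hypothetical; the convention that the stored length counts the length field plus the message follows the worked example of Section 3.7, where 3 + 4 is encoded as '111'):

```python
def build_payload(message_bits, length_bits=10):
    # Prefix the message with a fixed-width binary count; following the
    # worked example of Section 3.7, the count includes the length field.
    total = length_bits + len(message_bits)
    return format(total, '0{}b'.format(length_bits)) + message_bits

def group_starts(num_blocks, r):
    # Divide the embedding sequence evenly into r parts; each part
    # receives its own copy of the payload.
    return [g * (num_blocks // r) for g in range(r)]

build_payload('0101', length_bits=3)  # '1110101', as in Section 3.7
group_starts(12, 3)                   # [0, 4, 8]
```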

3.6 Extraction

Similar to the embedding process, the image is partitioned into 8x8 blocks. The basic extraction method uses the coefficient relationships from the table in Figure 3.4 to represent embedded bits. The coefficient sequence is determined by hashing a user-entered pass phrase. Each triplet of DCT coefficients (k1, k2, k3) is determined by this coefficient sequence. For each block B in the embedding sequence and bit b in the embedded message, data is extracted from these coefficients as follows:

if kq1 > kq3 and kq2 > kq3, then bit = 1
if kq1 < kq3 and kq2 < kq3, then bit = 0
otherwise, the block is marked invalid (B)

The message is first reconstructed by extracting binary strings of length 10 from the fixed locations in the embedding sequence to determine the length of the embedded message. Each of the strings will be assigned a score, and passed to a heuristic function. The scoring function is as follows:

score(s_j) = sum over all extracted strings s_k (including s_j itself) and all bit positions i of [ s_k[i] = s_j[i] ]

That is, a string earns one point for every (string, position) pair that agrees with it; an error bit matches nothing.


The heuristic function assigns weights to each bit of the string based upon the score function and uses this to reconstruct the embedded string. The formula for this is as follows:

ExtractedBit_i = round( ( sum_j w_j * s_j[i] ) / ( sum_j w_j ) )

Here s_j represents an individual string from the set of extracted strings, and w_j is the weight derived from its score. Each non-error bit is multiplied by the string's weight and added to a sum for each bit position. This way, strings which are mostly correct (according to a poll of the other strings) will be given greater priority than ones which don't match the other strings.

After the length string is extracted, the algorithm extracts the binary strings of the indicated length from their respective locations. These strings will be scored and reconstructed in the exact same manner as the length string. This entire process will be best understood through a simple example, which follows in the next section.
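Assuming extracted strings free of error bits (an error position would simply be skipped in the sums), the scoring and weighted reconstruction can be sketched and checked against the length strings of the example in Section 3.8:

```python
def score(strings, j):
    # One point for every (string, position) pair that agrees with string j,
    # including the string's agreement with itself.
    return sum(1 for t in strings
               for a, b in zip(strings[j], t) if a == b)

def reconstruct(strings):
    # Each output bit is the score-weighted majority over the copies;
    # acc * 2 >= total is round-half-up of the weighted average.
    weights = [score(strings, j) for j in range(len(strings))]
    total = sum(weights)
    bits = []
    for i in range(len(strings[0])):
        acc = sum(w * int(s[i]) for w, s in zip(weights, strings))
        bits.append('1' if acc * 2 >= total else '0')
    return ''.join(bits)

length_strings = ['111', '111', '110']  # the three copies from Section 3.8
[score(length_strings, j) for j in range(3)]  # [8, 8, 7]
reconstruct(length_strings)                   # '111'
```

Run on the three data strings of Section 3.8 ('1110101', '1110100', '1101001'), the same functions yield the scores 17, 16, and 14 reported there and reconstruct '1110101'.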

3.7 Embedding Example

The mathematics and consequences of this process can be best understood through a simple example. For this example, we will embed the length string using 3 bits (instead of 10), and the message to be embedded is the sequence '0101'. The redundancy factor is 3, and the size of the length string is 3 bits. This bit sequence will be embedded into the tiger image from Chapter 2, and the pass phrase is 'fire'. The quantization table used corresponds to JPEG standard compression level 90 (see Appendix B). Visual results for this small example will be omitted, because the change would be so slight that it would result in squinting and unnecessary headaches. The results of embedding more data in an image are shown in Chapter 4.


The tiger image is read in, converted to the YUV color space, and partitioned into

8x8 blocks. The tiger image is 600 x 425 pixels, which contains 3,922 (53*74) complete blocks. These blocks are divided into 3 groups, according to our redundancy factor r.

The first group starts at the (x,y) coordinates (1,1) in the image, the second at (137,385), and the third at (281,185).

Next, the length portion of the string has to be calculated and added to the data to be embedded. In this example, the length string is 3 bits. Combined with the length of the embedded string '0101' (4 bits), the total length of the data to be embedded is 7 bits.

In binary, this value is represented as '111'. The length string is concatenated with the message string. The entire string to be embedded is '1110101'.

The length string will be embedded in all 3 subsections of the image. The first step of the embedding process will be presented to demonstrate the concept, and the results of the remaining embedding steps will be summarized. The pass phrase is hashed to the sequence (7,10,7,6). The algorithm looks at an 8x8 block B of the image, with upper left hand coordinates (1,1) in the tiger image. The Y component of the data is as follows:

[8x8 matrix of Y-component pixel values]

A threshold check is performed on this block. The hash sequence currently points to the (k1, k2, k3) triplet values (16, 9, 17). The DCT is applied to block B, which results in the following 8x8 block of coefficients:


[8x8 matrix of DCT coefficients]

The values for (k1, k2, k3) are -43.9, 1.8, and -6.6 respectively. These values will be quantized according to their corresponding values (3,2,3) in the JPEG level 90 compression table. The resulting quantized values (kq1, kq2, kq3) are (-15, 1, -2). Next we perform the threshold check from Section 3.4. Since the bit to be embedded is '1', we check to see if kq3 <= min(kq1, kq2) + D. This inequality does not hold, so this is an invalid block and needs to be changed to an invalid pattern.

The algorithm modifies (kq1, kq2, kq3) such that the resulting triplet will be (kq1 - e, kq2 + e, 0.5*(kq1 + kq2)). The experimental value for e is ±3*D. For the given block, this is obtained via the following changes matrix:

[8x8 changes matrix, zero except at the three coefficient positions]

Both the original block B and the changes matrix are dequantized, and the IDCT is performed on each separately to bring them back to the spatial domain. The resulting matrices are added together, and this block is written back to the image. The changes made to the original block B in the spatial domain are as follows:


[8x8 matrix of pixel-value changes in the spatial domain]

Note how a change to a value in the frequency domain is spread out over all the pixel values in the spatial domain. This averaging effect is the reason that the method is robust to compression – the data is hidden in the more-significant region of an entire 8x8 block, rather than in a single bit. Also, spreading the information across an entire block dissipates the change and makes it less visible.

The embedding algorithm then checks the next block in the embedding sequence

(starting at location (1,9)) and continues with the hash sequence now pointing to the coefficient triplet (10,3,17). The process continues in this manner, manipulating the hashed coefficient values according to whether they are valid or invalid. This process is executed for each of the 3 groups. Here are the results:

Group 1: B1B1BB1BBB0B1BB0BB1
Group 2: B1BB1BBBBBBBB10BB1BBB0B1
Group 3: 1BBB1BBB101BBB0BB1

Here "bad" blocks are signified by a B. In this example, there are 28 bad blocks used to embed 3 strings of length 7. This is a high ratio, but one of the compromises made in order to avoid perceptual detection. Now we will step through the extraction of this data, after the image has been compressed to JPEG level 38.


3.8 Extraction example

Assuming one has the correct pass phrase, data extraction is fairly straightforward. The most interesting phase of data extraction is the string reconstruction, and this will be the focus of the discussion. We will step through the extraction of the data from the previous example, after the modified image has been compressed to JPEG level 38.

The first step in extraction is to convert the image data into YUV format. The image is then divided into blocks, and the three different starting positions are calculated according to the preset redundancy value r. Next the algorithm extracts the string representing the length of the embedded data. The size of this field is known, and is 3 bits for this example. The DCT is performed on each 8x8 block, the coefficients are quantized, and their relationships examined according to the table in Figure 3.4. Recall, if quantized coefficients kq1, kq2 > kq3, a '1' has been embedded in the block. Similarly, if kq1, kq2 < kq3, then a '0' has been embedded in the block. Otherwise, the block is invalid.

Here are the results of extracting the length string:

Group 1: B1B1BB1        extracted: 111
Group 2: B1BB1BBBBBBBB1 extracted: 111
Group 3: 1BBBBBBB10     extracted: 110

The three extracted strings differ – the algorithm will attempt to reconcile them. Each string will be assigned a score, according to the scoring algorithm presented in Section

3.6. The scoring for the Group 1 string (111) is as follows:

String 1 = 111
  vs. String 1 (111): 3 matches
  vs. String 2 (111): 3 matches
  vs. String 3 (110): 2 matches
  Score: 3 + 3 + 2 = 8


Similarly, the scores for String 2 and String 3 are 8 and 7, respectively. These scores are used to assign a weight to a given string's answer, according to a poll of its peers. The string is then reconstructed using this weight and the formula from Section 3.6.

ExtractedBit_1 = round( (8*1 + 8*1 + 7*1) / (8 + 8 + 7) ) = round(23/23) = 1
ExtractedBit_2 = round( (8*1 + 8*1 + 7*1) / 23 ) = 1
ExtractedBit_3 = round( (8*1 + 8*1 + 7*0) / 23 ) = round(16/23) = 1

Thus the extracted length string is 111, which is equal to 7 in decimal. The algorithm goes back to the three starting positions, and extracts 7 valid bits from these locations.

Here are the results of the extraction:

Group 1: B1B1BB1BBB0B1BB0BB1       extracted: 1110101
Group 2: B1BB1BBBBBBBB10BB1BBB0BB0 extracted: 1110100
Group 3: 1BBBBBBB101BBB0BBBB01     extracted: 1101001

The scores for Group 1, Group 2, and Group 3 are 17, 16, and 14 respectively.

Reconstructing the string using the method presented above yields 1110101 (the first 3 bits represent the length string), which matches the embedded data before compression.

3.9 Errors

Not all errors are created equal. For example, if the relationship within a coefficient triplet is flipped, then a 0 bit becomes a 1 bit, and this counts as a single error.

However, there is a much more serious type of error – a bad block changing to represent a 0 or 1 bit. Suppose we had the original sequence (with a length string of 3 bits) when embedding data:

B1BB1BBBBBBBB10BB1BBB0BB1 = 1110101

If a single bad block marker bit is flipped, an entirely different sequence is extracted.


B1BB1BBBBBBBB10B0B1BBB0 = 1110010
                ^

Moreover, it gets worse if a single bad block marker is flipped in the length string. This error affects the amount of data extracted, wrecking the results in most cases.

B1BB1BBBBBBBB10BB1BBB0BB1 = 1110101
01BB1 = 011
^

Since the length string is now 3 (instead of 7), the extraction algorithm stops. This problem is multiplied if the string is of significant length (over 50). For each bad block, the off-by-one errors cascade down the string and result in even more errors. The scoring system and weight heuristic were implemented to reward strings which were more correct than others, in an attempt to negate the effects of the cascading errors. Although this helps resolve some situations, the cascading error remains the most serious threat to the embedded data's integrity. This highlights one of the differences between watermarking and steganography. In watermarking, the desired data sequence is known, and cascading errors can be reconciled by comparing correct substrings. This is not the case in steganography. Keep this in mind when looking at robustness results in the next chapter – an especially unfortunate single cascading error in the 12th bit of a 200-bit string can result in 188 errors.
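The cascade can be reproduced in a few lines: extraction collects valid bits in order, so a single misread marker shifts every subsequent bit. A sketch using the sequences above (the helper name is hypothetical):

```python
def extract_bits(block_symbols, n):
    # Collect the first n non-'B' symbols from an extracted block sequence.
    bits = [c for c in block_symbols if c != 'B']
    return ''.join(bits[:n])

good = 'B1BB1BBBBBBBB10BB1BBB0BB1'
extract_bits(good, 7)            # '1110101'

# Misread the bad-block marker at index 15 as a '0' bit:
bad = good[:15] + '0' + good[16:]
extract_bits(bad, 7)             # '1110010' -- every later bit shifts
```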


Chapter 4 – Results

On paper, the methods and ideas presented behind STEM seem to work out fine.

But in steganography, perception is everything. This chapter attempts to resolve several lingering questions. How does the transform embedding method compare to the existing

LSB method in terms of visual perception and robustness? How does the STEM embedding method affect its host image? What is the tradeoff between invisibility and robustness? What kinds of image characteristics are necessary to decrease the chance of detection? In this chapter, both visual and robustness results from STEM will be compared with the existing LSB method. The embedding strength of STEM will then be increased and both the visual and robustness consequences will be observed. Finally, the separate results of cumulative Photoshop compression and blurring will be presented and the findings summarized.

4.1 LSB results

As a way to measure the performance of the transform embedding method, first we will compare the results to the existing LSB embedding method. Code for the LSB tests follows the ideas presented in Chapter 1, and its implementation is provided in

Appendix A.1. For testing purposes, a quote from Orwell's book 1984 was selected:

"Thoughtcrime does not entail death: thoughtcrime IS death." This sentence was


Figure 4.1: a) Original Lena image. b) Image with Orwellian text embedded using a LSB method.

converted into a binary string, and embedded into the famous Lena 2 photograph. The visual results of LSB embedding are shown in Figure 4.1. There is visually no difference between the original image and the one containing hidden data. But how robust is this embedding method?
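For comparison, the LSB idea from Chapter 1 can be sketched in a few lines (a generic illustration, not the Appendix A.1 implementation): each message bit overwrites the least significant bit of one pixel byte, which is why any lossy recompression that perturbs pixel values destroys the message.

```python
def lsb_embed(pixels, bits):
    # Overwrite the least significant bit of each pixel byte with a message bit.
    out = list(pixels)
    for i, b in enumerate(bits):
        out[i] = (out[i] & ~1) | int(b)
    return out

def lsb_extract(pixels, n):
    # Read back the low bit of the first n pixel bytes.
    return ''.join(str(p & 1) for p in pixels[:n])

stego = lsb_embed([10, 11, 12, 13], '1011')  # [11, 10, 13, 13]
lsb_extract(stego, 4)                        # '1011'
```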

To test the robustness of the embedding method, a random binary string of length

300 was embedded into the Lena image. The image containing hidden data was compressed using JPEG compression at levels ranging from 100 (least compression) to

45 (more compression) in increments of 5. Data was then extracted from the compressed image, and compared to the original embedded string. This experiment was repeated 200 times. Final results for this experiment are listed in Figure 4.2. Note that after compression at any level, approximately 50% of the correct bits are extracted from the message. These results are worse than they initially appear. Since the strings are binary, the correct bit at each location will be chosen with probability ½. Since this probability is independent of position and any other of the bits in the embedded message, on a random

2 See [link] for the full story behind this image processing photograph


[Table: LSB robustness totals at JPEG compression levels 40 through 100 – total bits: 780000, % correct: 0.4995]

Figure 4.2: Results of LSB extraction tests.

extraction of bits results in extracting 50% of the bits in the correct order. Thus the LSB embedding method does no better than random extraction. Since LSB embedding performs no better than random guessing, it can be concluded that this method is not robust to any form of compression.

4.2 Visual Comparison – STEM vs. LSB

It has been stated before, but cannot be emphasized enough – obtaining robust embedding results is a compromise between visual detectability and robustness. By increasing the robustness of the message, we also increase the changes made to the image and thus the chance of these changes being visually detected. The visual results of LSB embedding are quite impressive, but they offer no robustness whatsoever. One question to be answered is this: ignoring robustness issues, can we achieve the same type of visual results using transform embedding? Comparable results were achieved using experimental parameters and are shown in Figure 4.3. For this visual test, the quantization table was set equal to the JPEG quantization table with compression level 90 (see Appendix B), redundancy was removed (i.e., set to 1), and the coefficient pass phrase mapped to the coefficient triplet (3,10,11). The same Orwellian sentence was


Figure 4.3: a) Original Lena image. b) Image with Orwellian text embedded using STEM.

embedded into the Lena photograph using the transform method. Code for this experiment can be found in Appendix A.3.

Compromises were made in order to achieve visual quality comparable to the

LSB embedding method. These changes affect the robustness of the message to compression algorithms. But by how much? The image was compressed in JPEG format from levels 80 (more compression) to 100 (least compression), and a message was extracted from each compressed image. Figure 4.4 shows the results of this extraction.

Even with the concessions made to achieve imperceptible changes, the message still survived unscathed through JPEG compression up through level 87. Such results are impossible using the LSB method. These results indicate that the more we are able to concede visually (through quantization tables and redundancy levels), the more robust the transform method will be.


[Table: transform robustness totals at JPEG compression levels 100 down to 80, with embedding parameters]

Figure 4.4: Results of transform extraction tests on the image in Figure 4.3

4.3 Robustness Testing

We've established that the existing LSB method is not robust to any sort of compression, and that the transform method does offer resistance to compression. This leaves several questions to be answered. How much compression can we perform on an image before the embedded data is corrupted? As the robustness of the embedded data increases, how does this affect the image? The first range of tests will be performed on four different images, at embedding strengths 90, 80, 70, and 60. Data will be embedded into these different images, the resulting images will be compressed at increasing levels, and then a message will be extracted and compared to the original. The data to be embedded is a quote from the film Pirates of the Caribbean: "You can always trust a dishonest man to be dishonest. Honestly, it's the honest ones you want to watch out for."

Converted to binary, this string is 888 bits long. It will be embedded into the images with redundancy factor 3, using the pass phrase 'clandestine'.


4.4 Original Images

Here are the original, untouched images that will be used for testing. These images have been resized to fit two per page.

Figure 4.5: Original images used in testing. Top – tiger1.jpg. Bottom – lena.jpg

Figure 4.6: Original images used in testing. Top – shed.jpg. Bottom – duke2.jpg


4.5 Robustness Testing -- Level 90 Results

[Table: extracted-bit results at JPEG compression levels 100 down to 72, with embedding parameters]

Figure 4.7: Robustness results – tiger1.jpg, level 90

Note that this image contains a small percentage of errors in the extracted stream. This is due to a combination of rounding errors introduced in the conversion from RGB to

YUV format, the DCT, and the quantization process. As the embedding strength increases, these errors will become less significant and will be eliminated entirely at the higher JPEG levels (least compression).


[Table: extracted-bit results at JPEG compression levels 100 down to 72, with embedding parameters]

Figure 4.8: Robustness results – duke2.jpg, level 90

Note that embedding data at level 90 produces an image with no visual difference between it and the original.


[Table: extracted-bit results at JPEG compression levels 100 down to 72, with embedding parameters]

Figure 4.9: Robustness results – shed.jpg, level 90

Again, there is no visual difference between this image and the original. There is a small fraction of errors in the extracted data, due to the error caused by round-off. These errors will go away as the strength of the embedding algorithm increases.


[Table: extracted-bit results at JPEG compression levels 100 down to 72, with embedding parameters]

Figure 4.10: Robustness results – lena.jpg, level 90

At level 90 embedding, there is no perceptual difference between the modified image and the original.


4.6 Robustness Testing -- Level 80 Results

[Table: extracted-bit results at JPEG compression levels 100 down to 60, with embedding parameters]

Figure 4.11: Robustness results – tiger1.jpg, level 80

As the embedding strength increases, artifacts are slowly emerging. Observe the water in the left-hand side of the image. Although not instantly recognizable, fuzzy blocks are beginning to form in this area.


[Table: extracted-bit results at JPEG compression levels 100 down to 60, with embedding parameters]

Figure 4.12: Robustness results – duke2.jpg, level 80

This is a busy image, and it hides the changes quite well. Small artifacts are beginning to form on the tree trunk on the left-hand side, as well as in the TP-covered sky. The entire message survives untouched through JPEG compression level 72.

[Chart: extraction error counts at JPEG compression levels 100 down to 60, for each set of embedding parameters]

Figure 4.13: Robustness results – shed.jpg, level 80

The transformed image survives JPEG compression up to level 80, while still maintaining reasonably good visual stealth. If you look at the sky to the right of the shed, you will notice blocky artifacts starting to form. These artifacts become more pronounced as the embedding strength increases.

[Chart: extraction error counts at JPEG compression levels 100 down to 60, for each set of embedding parameters]

Figure 4.14: Robustness results – lena.jpg, level 80

Three slightly fuzzy bars are starting to form in this image. These will probably go unnoticed, but will be much more evident as the embedding strength increases.


4.7 Robustness Testing – Level 70 Results

[Chart: extraction error counts at JPEG compression levels 100 down to 52, for each set of embedding parameters]

Figure 4.15: Robustness results – tiger1.jpg, level 70

Changes to this image are starting to cross the threshold from imperceptible to perceptible. The blocky artifacts in the water have become more noticeable, and artifacts are also starting to appear on the tiger itself.

[Chart: extraction error counts at JPEG compression levels 100 down to 52, for each set of embedding parameters]

Figure 4.16: Robustness results – duke2.jpg, level 70

Because this image is so busy, changes to this photograph are less noticeable than in the others. Artifacts are more prominent on the left tree trunk, but these may still go unnoticed. At strength 70, the embedded message survives compression past JPEG level 52.

[Chart: extraction error counts at JPEG compression levels 100 down to 52, for each set of embedding parameters]

Figure 4.17: Robustness results – shed.jpg, level 70

Definite marks are now forming in the sky above the barn. This exposes one weakness of the embedding algorithm: areas of similar, unchanging color. When choosing a candidate image for embedding, one should avoid images with such characteristics.

[Chart: extraction error counts at JPEG compression levels 100 down to 52, for each set of embedding parameters]

Figure 4.18: Robustness results – lena.jpg, level 70

Stripes are becoming detectable, especially the one across Lena's chin. Although the embedded image survives compression past JPEG level 52, the visual changes should be noticed rather easily.


4.8 Robustness Testing – Level 60 Results

[Chart: extraction error counts at JPEG compression levels 100 down to 32, for each set of embedding parameters]

Figure 4.19: Robustness results – tiger1.jpg, level 60

At level 60, image modifications are clearly visible in the water and across the tiger.

[Chart: extraction error counts at JPEG compression levels 100 down to 32, for each set of embedding parameters]

Figure 4.20: Robustness results – duke2.jpg, level 60

This is the only image that has a chance of avoiding detection at level 60, and it is a good example of what an embedding candidate should look like. There is so much happening in the image that the viewer is distracted from the changes. This is highly desirable, because this image survives compression up to JPEG level 36.

[Chart: extraction error counts at JPEG compression levels 100 down to 32, for each set of embedding parameters]

Figure 4.21: Robustness results – shed.jpg, level 60

Obvious modifications to the sky, as well as discernible modifications to the grass make this image a poor choice for level 60 embedding.

[Chart: extraction error counts at JPEG compression levels 100 down to 32, for each set of embedding parameters]

Figure 4.22: Robustness results – lena.jpg, level 60

The three stripes across Lena are now blatant. This image is also a poor choice for level 60 embedding.


4.9 Robustness Testing – Photoshop Quantization Tables

[Table: extraction error counts for each test image after cumulative Photoshop JPEG recompression at levels 12 down to 4, at embedding strengths 60, 70, 80, and 90]

Figure 4.23: Cumulative Photoshop compression results

This set of tests was devised to see how the STEM system performs under cumulative Photoshop JPEG compression. For each embedding strength, the representative image at JPEG compression level 100 was compressed at Photoshop level 12, and data was extracted from this image. The compressed image was then compressed again at level 11, and data was extracted from the newly compressed image. This process was repeated down to Photoshop compression level 4 (low quality). Note that the rounding errors from the original experiment are still present after Photoshop compression. The results indicate that STEM is also robust under Photoshop compression.

(Footnote: an E in the table indicates that the extraction algorithm ran out of blocks before extracting all 888 bits; in other words, the data was corrupted.)

4.10 Robustness Testing – Blurring

This set of tests was more experimental, designed to test the resistance of STEM to another type of distortion: blurring. Blurring was performed on the tiger image shown in Figure 4.19, using the Photoshop blur filters and tools.

Blur filter once: 0 errors
Blur filter twice: 175 errors
Gaussian Blur, 1 pixel: 414 errors
Selective blurring: 0 errors

Figure 4.24: Selective blurring of tiger, performed in the water and on the tiger itself

Selective blurring, shown in Figure 4.24, was performed to test the redundancy features of the STEM system. Select parts of the water and the tiger were blurred in order to lessen the visual impact of the embedded data. These areas could be changed without destroying the embedded message, because the message was also embedded in two other locations. Applying blurring to overlapping areas of the image negates the redundancy feature and corrupts the message, as seen in the results of the Gaussian blur test.
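The redundancy feature can be illustrated with a simplified majority vote. The thesis code (reconstruct_string in Appendix A.4) weights each extracted copy by a score; the Python sketch below drops the weighting and takes a plain per-bit majority across the redundant copies:

```python
def reconstruct(copies):
    # Majority vote across redundant copies of an extracted bit string.
    # All copies are assumed to be the same length.
    n = len(copies[0])
    return ''.join(
        '1' if sum(c[i] == '1' for c in copies) * 2 > len(copies) else '0'
        for i in range(n)
    )

original = '1011001'
copies = ['1011001', '1111001', '1011011']  # two copies locally corrupted
print(reconstruct(copies))  # recovers '1011001'
```

As long as corruption (such as localized blurring) hits different bit positions in different copies, the vote recovers the original message; corrupting the same positions in a majority of copies defeats it.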

4.11 Conclusion

Results from these tests indicate that the STEM system is robust to JPEG compression. The level of robustness, and the visual cost associated with it, is a delicate combination that also depends upon the host image and the strength of the embedding method. In contrast, the existing LSB embedding method has been shown to be extremely fragile, with its embedded data destroyed by any sort of compression.


Chapter 5 – Future Work

The main purpose of STEM is to produce results which are robust to image compression while invisible to human visual perception. It does not address the issue of computer-based detection, known as steganalysis. Steganalysis is the process of analyzing the statistics of an image in order to determine whether it contains hidden data. By modifying the DCT coefficients of an image, we alter its natural statistical properties. If the properties are altered enough, these changes can be detected by a computer.

There is a transform-based steganography system called Outguess which attempts to make the changes invisible to computer analysis [24]. It does this by restoring the statistics of the original image. An image is divided in half, with one half reserved for embedding data and the other half reserved for correcting statistics. If embedding data in a block moves a DCT coefficient from a certain range A to another range B, a coefficient in range B in the correcting portion of the image is altered to be in range A. Although this reduces the capacity of the embedded data by half, the histogram of the image remains the same. The correction ideas from the Outguess system could be integrated with our transform embedding method. This would further reduce the message capacity of an image. However, it would also produce a robust embedding method that is both statistically and visually invisible. In steganography, this is the ultimate goal.
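This correction idea can be sketched with a toy example. The following Python fragment is illustrative only (the thesis code is MATLAB, and the real Outguess operates on quantized JPEG DCT coefficients): it embeds bits into the LSBs of the first half of a coefficient stream, then cancels each value change with an opposite change in the second half, leaving the value histogram intact:

```python
from collections import Counter

def embed_with_correction(coeffs, bits):
    # Toy sketch of Outguess-style statistical correction: embed into the
    # LSBs of the first half, then repair the value histogram using the
    # second half. A hypothetical simplification, not the real Outguess code.
    half = len(coeffs) // 2
    out = coeffs[:]
    changes = []                          # (old_value, new_value) pairs
    for i, bit in enumerate(bits):
        new = (out[i] & ~1) | bit         # set LSB to the message bit
        if new != out[i]:
            changes.append((out[i], new))
            out[i] = new
    for old, new in changes:              # each old->new change is balanced
        for j in range(half, len(out)):   # by one new->old change elsewhere
            if out[j] == new:
                out[j] = old
                break
    return out

coeffs = list(range(8)) * 25              # synthetic "coefficient" stream
bits = [1, 0, 1, 1, 0, 0, 1, 0] * 4       # 32 message bits
stego = embed_with_correction(coeffs, bits)
print(Counter(stego) == Counter(coeffs))            # histogram unchanged
print([c & 1 for c in stego[:len(bits)]] == bits)   # message still readable
```

The message capacity is halved, since the second half of the stream is spent undoing the statistical footprint of the first.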


Appendix A: Code

All the code presented in the Appendix is written in Matlab version 6.5.

A.1 LSB Embedding Code

This section contains the code used to generate the LSB robustness results presented in Section 4.1.

Note: the functions rand_seq, getBlock, writeBlock, remove_spaces, and error_count can be found in Appendix A.4 (General code).

%%% TEST_ALL
%
% TEST_ALL: This automated test script runs 200 LSB robustness experiments,
% records the data in the file lsb_results.txt, along with the totals of
% the experiment.
%
numtests = 200;
h = zeros(1, 14, numtests+1);
for qual_headers = 40:5:100
    h(1, (qual_headers/5 - 7), 1) = qual_headers;
end
for i = 1:numtests
    h(:,:,i+1) = lsb_test(false);
    g = h(:,:,i);
end
fprintf('results stored in array h, and can be referenced by h(:,:,x)\n');
diary lsb_results.txt
diary on
h
fprintf('totals for all experiments\n');
sum(h,3) - h(:,:,1)
diary off

function error_results = lsb_test(printString)
% LSB_TEST(printString)
%
% LSB_TEST tests the robustness of the LSB embedding method.
% It generates random strings of length 300 and embeds them into
% the least-significant bit of the color data for red in the color
% spectrum. It then extracts the data from the image to make sure
% it was embedded correctly. Then, it saves this image in JPG format
% at compression levels 40 to 100, in increments of 5. It attempts to
% remove the embedded data from each image, comparing it to the original
% embedded string, and records the errors.
%
% Parameter: printString - a bool that specifies whether or not to print
%            the extracted/embedded string combo, or just the error count
% Returns:   a vector containing the number of errors after JPG compression
%            at each level
%%%%
close all;


% insert globals here
global blocksize;
global A;

% least_sig is the modified image; A is the original image
blockX = 1;
blockY = 1;
error_results = zeros(11,1);
A = imread('lena.jpg');
original = A;
blocksize = 300;
% CURRENT CONSTRAINTS ON C -- MUST BE AN EVEN NUMBER OF BITS
c = rand_seq(blocksize); % generates a random binary sequence
b = getBlock(blockX, blockY, A);

% embed the data
embedded_block = lsb_embed(double(b), c);
least_sig = writeBlock(A, embedded_block, blockX, blockY);

% now extract the data
bex = getBlock(blockX, blockY, least_sig);
bex = double(bex);
extracted = lsb_extract(bex, size(c,2));
embedded = c;

% clean up extracted: convert from double array to a properly formatted
% string without spaces
clean_extract = remove_spaces(int2str(extracted));
if (printString == true)
    clean_extract
    embedded
end
errcount = error_count(clean_extract, embedded);
if (errcount ~= 0)
    error('original message not embedded correctly');
end
imwrite(least_sig, 'lsb_test_modified100.jpg', 'jpg', 'Quality', 100);

% write to files at different quality intervals,
% from 40 to 100 in increments of 5
for embed_loop = 40:5:100
    fname = 'lsb_test';
    fname = strcat(fname, num2str(embed_loop));
    fname = strcat(fname, '.jpg');
    imwrite(least_sig, fname, 'jpg', 'Quality', embed_loop);
end

% now read in these files
err_index = 1;
total_bits = 0;
total_errors = 0;
for extract_loop = 40:5:100
    fname = 'lsb_test';
    fname = strcat(fname, num2str(extract_loop));
    fname = strcat(fname, '.jpg');
    B = imread(fname);

    total_bits = total_bits + size(c,2);

    % extract the data
    bex = getBlock(blockX, blockY, B);
    bex = double(bex);
    extracted = lsb_extract(bex, size(c,2));
    embedded = c; % represents the original embedded message

    if (printString == true)
        outline = strcat('Experiment at compression level ***', num2str(extract_loop));
        outline = strcat(outline, '***\n');
        fprintf(outline);
    end

    % convert extracted from double array to string
    clean_extract = remove_spaces(int2str(extracted));

    if (printString == true)
        embedded
        clean_extract
    end

    % see how the extracted string compares to the original
    errcount = error_count(clean_extract, embedded);
    total_errors = total_errors + errcount;

    if (printString == true)
        errcount
    end

    error_results(err_index) = errcount;
    err_index = err_index + 1;
end % end of extract_loop
error_results(14) = total_bits;
if (printString == true)
    total_errors/total_bits
end


function embedded = lsb_embed(b, c)
% LSB_EMBED(b,c)
%
% Takes in a block and a string, and embeds the string into the
% least-significant bit (LSB) of the green & blue (GB) color spectrum
% in an image
%
% Parameters: b -- the block to embed data in
%             c -- the string to be embedded
% Returns:    a block with embedded data
%%%
b = double(b);
for i = 2:2:size(c,2)
    b(i/2,i/2,2) = round(b(i/2,i/2,2)) - mod(round(b(i/2,i/2,2)), 2); % mask the least sig bit
    b(i/2,i/2,2) = b(i/2,i/2,2) + bin2dec(c(i-1));

    b(i/2,i/2,3) = round(b(i/2,i/2,3)) - mod(round(b(i/2,i/2,3)), 2); % mask the least sig bit
    b(i/2,i/2,3) = b(i/2,i/2,3) + bin2dec(c(i));
end
embedded = b;

function extracted = lsb_extract(B, msglength)
% LSB_EXTRACT(B, msglength)
%
% Extracts a string from a block B using the least-significant bit of the
% green and blue color spectrum
%
% Parameters: B         - the block to extract from
%             msglength - the length of the embedded string
%
% Returns: extracted data string
for i = 2:2:msglength
    extracted(i-1) = mod(B(i/2,i/2,2), 2);
    extracted(i)   = mod(B(i/2,i/2,3), 2);
end


A.2 MSB Embedding Test

This section contains the code used to generate the example MSB embedding image (Figure 1.3)

% MSB_TEST(printString)
%
% MSB_TEST shows the visual results of embedding in the most-significant
% bit of the G color spectrum in RGB.
%%%%
close all;
clear all;

A = imread('lena.jpg');
b = double(A);
stringsize = 40000;
blocksize = stringsize;
% CURRENT CONSTRAINTS ON C -- MUST BE AN EVEN NUMBER OF BITS

% faster to do it this way
c1 = rand_seq(4000);
c2 = strcat(c1, c1);  % 8k
c3 = strcat(c2, c2);  % 16k
c3 = strcat(c3, c3);  % 32k
c  = strcat(c3, c2);  % 40k
clear c1; clear c3; clear c2;
fprintf('Embedding data in the MSB\n');
curBit = 1;
for j = 200:399
    for i = 200:399
        temp = dec2bin(b(i,j,2), 8);
        temp(1) = c(curBit);
        temp = bin2dec(temp);
        b(i,j,2) = temp;
        curBit = curBit + 1;
    end
end
fprintf('extracting data to make sure it was embedded correctly\n');
curBit = 1;
for j = 200:399
    for i = 200:399
        temp = dec2bin(b(i,j,2), 8);
        extracted(curBit) = temp(1);
        curBit = curBit + 1;
    end
end
embedded = c;


% check to make sure the string was embedded correctly
errcount = error_count(extracted, embedded);
if (errcount ~= 0)
    error('original message not embedded correctly');
end
imwrite(uint8(b), 'msb_test_modified100.jpg', 'jpg', 'Quality', 100);
imshow(A);
title('original image');
figure;
imshow(uint8(b));
title('MSB embedding example');


A.3 Comparison of Transform Embedding to LSB

This is the code used to generate the example from Section 4.2. The input file was lena.jpg, and the passphrase used was "Thoughtcrime does not entail death: thoughtcrime is death."

function coeff_test_lsbcomp(inFile, phrase)
global Q_table;
c = phrase2bits(phrase);
passString = 'n';  % uses coefficients corresponding to k1=3, k2=10, k3=11
table_qual = 90;
redundancy = 1;
startQual = 80;
endQual = 100.0;
increments = -1;
resultSize = ceil((endQual - startQual)/increments);
total_errs = zeros(size(passString,2)+1, resultSize);
topSpot = 1;
for jQuality = endQual:increments:startQual
    total_errs(1, topSpot) = jQuality;
    topSpot = topSpot + 1;
end
for jb = 1:size(passString,2)
    jb
    passphrase = passString(jb);

    curPlace = 1;
    Q_table = jpeg_qtable(table_qual);
    mod_image = zmbed(inFile, c, redundancy, passphrase);

    for jQuality = endQual:increments:startQual
        jQuality
        outFile = './tempPhotos/';
        outFile = strcat(outFile, inFile);
        outFile = strcat(outFile, num2str(jQuality));
        outFile = strcat(outFile, '.jpg');

        imwrite(mod_image, outFile, 'jpg', 'quality', jQuality);
        c_out = zextract(outFile, passphrase);

        total_errs(jb+1, curPlace) = error_count(c_out, c);
        curPlace = curPlace + 1;
    end
end
total_errs
end


A.4 General code

This section contains common code used by most experiments.

function t = num2Block(blockNum)
% num2Block(blockNum)
% returns the coordinates of a block based on its block number
global blocksize;
% +1 accounts for 1-based indexing of matlab
r = floor(blockNum/(blocksize)) + 1;
c = mod(blockNum, (blocksize)) + 1;
t = [r c];

function c = phrase2bits(phrase)
%%% PHRASE2BITS(PHRASE)
%%%% Takes in a string and converts it to a binary string
zeros(1);
mySentence = double(phrase)
embed_string = '';
for x = 1:size(mySentence,2)
    embed_string = strcat(embed_string, dec2bin(mySentence(x), 8));
end
sent_dec = '';
for x = 1:8:size(embed_string,2)-7
    myCharBin = embed_string(x:x+7);
    myCharDec = bin2dec(myCharBin);
    myChar = char(myCharDec);
    sent_dec = strcat(sent_dec, myChar);
end
sent_dec
c = embed_string;

function d = decodeSentence(embed_string)
%%% DECODESENTENCE(binary_string)
%%%% Takes in a binary string and converts it to a character string
sent_dec = '';
spaceflag = 0;
for x = 1:8:size(embed_string,2)-7
    myCharBin = embed_string(x:x+7);
    myCharDec = bin2dec(myCharBin);
    if myCharDec == 32
        sent_dec = strcat(sent_dec, '*');
        % spaceflag = 1;
        % sent_dec(size(sent_dec,2)+1) = ' ';
    else
        myChar = char(myCharDec);
        sent_dec = strcat(sent_dec, myChar);
    end
end
for x = 1:size(sent_dec,2)
    if sent_dec(x) == '*'
        sent_dec(x) = ' ';
    end
end
d = sent_dec;

function a = rgb2yuv(A)
%%% rgb2yuv(A)
%%%% Takes in an m-by-n-by-3 matrix in RGB format
%%%% and converts it to the YUV color scale

% convert to double so we can operate on it
A = double(A);

% check for proper format
thirdD = size(A,3);
if thirdD ~= 3
    error('Must be in RGB format');
end

R = A(:,:,1);
G = A(:,:,2);
B = A(:,:,3);

Y = R *  0.299  + G *  0.587  + B *  0.114;
U = R * -0.1687 + G * -0.3313 + B *  0.500  + 2^4;
V = R *  0.500  + G * -0.4187 + B * -0.0813 + 2^4;
a(:,:,1) = Y;
a(:,:,2) = U;
a(:,:,3) = V;

function a = yuv2rgb(B)
%%% yuv2rgb(B)
%%%% Takes in an m-by-n-by-3 matrix in YUV format
%%%% and converts it to the RGB color scale

% check for proper format
thirdD = size(B,3);
if thirdD ~= 3
    error('Must be in YUV format');
end

Y = B(:,:,1);
U = B(:,:,2);
V = B(:,:,3);

R = Y + 1.402   * (V - 2^4);
G = Y - 0.34414 * (U - 2^4) - 0.71414 * (V - 2^4);
B = Y + 1.722   * (U - 2^4);
a(:,:,1) = R;
a(:,:,2) = G;
a(:,:,3) = B;

% round off the values
a = round(a);

function seq = Ts(A, key)
%%% Ts(A,key)
%%%% Takes in an m-by-n-by-3 matrix and generates
%%%% a sequential embedding/extraction sequence.
%%%% The key variable is not currently used, but can be,
%%%% in order to generate a non-sequential embedding sequence
global blocksize;
[dimX dimY] = size(A(:,:,1));
dimX = dimX - blocksize;
dimY = dimY - blocksize;
seq = ones(1,2);
index = 1;
for i = 1:blocksize:dimX
    for j = 1:blocksize:dimY
        seq(index,:) = [i j];
        index = index + 1;
    end
end

function b = getBlock(x, y, A)
% getBlock(x,y,A)
% x - x coordinate of the block
% y - y coordinate of the block
% A - the image you wish to extract a block from
global blocksize;
b = A(x:x+blocksize-1, y:y+blocksize-1, :);

function sp = get_startpos(n, seq)
% returns an array indicating the starting positions of the
% embedding data, taking the redundancy feature n into account
sp = zeros(n,1);
sp(1) = 1;
for i = 2:n
    sp(i) = (i-1)*(floor(size(seq,1)/n));


end

function coeff_seq = gen_hash_seq(passphrase)
%%% gen_hash_seq
%%%% Generates a hash sequence based upon the pass phrase.
%%%% Uses a simple hash based upon the ASCII value.
coeff_seq = zeros(1, size(passphrase,2));
for i = 1:size(passphrase,2)
    char_value = double(passphrase(i));
    coeff_seq(i) = mod(char_value, size(patternTable,1));
end
coeff_seq = coeff_seq + 1;

function e = error_count(c, c_out)
%%% ERROR_COUNT: Given two strings, it counts the number of differences between them
e = 0;
if size(c_out,2) ~= size(c,2)
    stringSize = min(size(c_out,2), size(c,2));
    e = e + abs(size(c_out,2) - size(c,2));
else
    stringSize = size(c_out,2);
end
for j = 1:stringSize
    if c_out(j) ~= c(j)
        e = e + 1;
    end
end

function isOK = check_write(b, c, hash_num)
%%% check_write
%%%% Sees if the DCT coefficients of an 8x8 yuv block are within a given threshold.
%%%% If they are, then return OK. Else modify the block to an invalid pattern and return.
global blocksize;
global patternTable;
global D;        % default is 1 via paper
global Q_table;
k  = patternTable(hash_num,:);
k1 = num2Block(k(1));
k2 = num2Block(k(2));
k3 = num2Block(k(3));
dct_b = dct2(b(:,:,1)); % just take xform of y component
rw = [dct_b(k1(1),k1(2)) dct_b(k2(1),k2(2)) dct_b(k3(1),k3(2))];
dk(1) = round(rw(1)/Q_table(k1(1),k1(2)));
dk(2) = round(rw(2)/Q_table(k2(1),k2(2)));
dk(3) = round(rw(3)/Q_table(k3(1),k3(2)));
if c == '1'
    if min(dk(1), dk(2)) + D < dk(3)
        isOK = false;  % then modify to invalid patterns on table
                       % IDCT the block and write to A
    else
        isOK = true;
    end
elseif c == '0'
    if max(dk(1), dk(2)) > dk(3) + D
        isOK = false;  % then modify to invalid patterns on table
                       % IDCT the block and write to A
    else
        isOK = true;
    end
else
    fprintf('ERROR HERE IN CHECK_WRITE!\n');
end

function isOK = check_read(b, hash_num)
%%% check_read
%%%% Sees if the DCT coefficients of an 8x8 yuv block match an HML valid pattern
%%%% from the pattern table. If so, then return OK. Else return false.
global blocksize;
global patternTable;
global D;        % default is 1 via paper
global Q_table;
k  = patternTable(hash_num,:);
k1 = num2Block(k(1));
k2 = num2Block(k(2));
k3 = num2Block(k(3));
dct_b = dct2(b(:,:,1)); % just take xform of y component
rw = [dct_b(k1(1),k1(2)) dct_b(k2(1),k2(2)) dct_b(k3(1),k3(2))];
dk(1) = round(rw(1)/Q_table(k1(1),k1(2)));
dk(2) = round(rw(2)/Q_table(k2(1),k2(2)));
dk(3) = round(rw(3)/Q_table(k3(1),k3(2)));
if dk(1) == dk(2) && dk(2) == dk(3)
    isOK = false;
elseif dk(1) <= dk(3) && dk(2) >= dk(3)
    isOK = false;
elseif dk(1) >= dk(3) && dk(3) >= dk(2)
    isOK = false;
else
    isOK = true;
end

function b = steg_write4(b, c, hash_num)
%%% steg_write4
%%%% Given a block b and a hash number, it embeds the bit c into the block
%%%% according to the given HML relationship table
global blocksize;
global patternTable;
global D;        % default is 1 via paper
global Q_table;
changeValue = 3;
k  = patternTable(hash_num,:);
k1 = num2Block(k(1));


k2 = num2Block(k(2));
k3 = num2Block(k(3));
dct_b = dct2(b(:,:,1)); % just take xform of y component
rw = [dct_b(k1(1),k1(2)) dct_b(k2(1),k2(2)) dct_b(k3(1),k3(2))];
dk(1) = round(rw(1)/Q_table(k1(1),k1(2)));
dk(2) = round(rw(2)/Q_table(k2(1),k2(2)));
dk(3) = round(rw(3)/Q_table(k3(1),k3(2)));
changes = zeros(8);
if c == '1'
    if dk(1) <= dk(3) + D
        changes(k1(1),k1(2)) = -dk(1) + dk(3) + D + changeValue;
        %changes(k2(1),k2(2)) = +changeValue;
        changes(k3(1),k3(2)) = -dk(3) + dk(1) - D - changeValue;
    end
    if dk(2) <= dk(3) + D
        %changes(k1(1),k1(2)) = +changeValue;
        changes(k2(1),k2(2)) = -dk(2) + dk(3) + D + changeValue;
        changes(k3(1),k3(2)) = -dk(3) + dk(2) - D - changeValue;
    end
elseif c == '0'
    if dk(1) + D >= dk(3)
        changes(k1(1),k1(2)) = -dk(1) + dk(3) - D - changeValue;
        changes(k3(1),k3(2)) = -dk(3) + dk(1) + D + changeValue;
    end
    if dk(2) + D >= dk(3)
        changes(k2(1),k2(2)) = -dk(2) + dk(3) - D - changeValue;
        changes(k3(1),k3(2)) = -dk(3) + dk(2) + D + changeValue;
    end
else
    fprintf('\nERROR -- this should not be reached\n\n');
end
% dequantize the coefficients
dk(1) = dk(1)*Q_table(k1(1),k1(2));
dk(2) = dk(2)*Q_table(k2(1),k2(2));
dk(3) = dk(3)*Q_table(k3(1),k3(2));

% dequantize the changes
changes(k1(1),k1(2)) = changes(k1(1),k1(2)) * Q_table(k1(1),k1(2));
changes(k2(1),k2(2)) = changes(k2(1),k2(2)) * Q_table(k2(1),k2(2));
changes(k3(1),k3(2)) = changes(k3(1),k3(2)) * Q_table(k3(1),k3(2));
dct_b(k1(1),k1(2)) = dk(1);
dct_b(k2(1),k2(2)) = dk(2);
dct_b(k3(1),k3(2)) = dk(3);
b(:,:,1) = real(idct2(dct_b(:,:,1)));
changes_spatial = idct2(changes);
b(:,:,1) = b(:,:,1) + changes_spatial;

function b = writeBlock(B, smallB, x, y)
%%% writeBlock: inserts matrix smallB into matrix B at position x,y
global blocksize;


B(x:x+blocksize-1, y:y+blocksize-1, :) = smallB;
b = B;

function clean_extract = remove_spaces(orig_string)
% remove_spaces: removes spaces from a character string
clean_extract = blanks(1);
j = 1;
for i = 1:size(orig_string,2)
    if (~isspace(orig_string(i)))
        clean_extract(j) = orig_string(i);
        j = j + 1;
    end
end

function c = rand_seq(seq_size)
% rand_seq: Generates a random binary sequence of size seq_size. Used for testing.
c = '';
for i = 1:seq_size
    if (round(rand) == 1)
        c = strcat(c, '1');
    else
        c = strcat(c, '0');
    end
end

function tb = jb_getTable()
% patternTable
tb = [25 18 11; 18 25 11; 3 10 11; 10 3 11; 10 3 17; 3 10 17; 16 9 17];

function b = mark_bad(b, hash_num)
% marks a block to an invalid/bad DCT pattern
% at the coefficient corresponding with the hash number
global blocksize;
global patternTable;
global D;        % default is 1 via paper
global Q_table;
%fprintf('mark bad\n');
% change this to some random thing later
k  = patternTable(hash_num,:);
k1 = num2Block(k(1));
k2 = num2Block(k(2));
k3 = num2Block(k(3));
dct_b = dct2(b(:,:,1)); % just take xform of y component
rw = [dct_b(k1(1),k1(2)) dct_b(k2(1),k2(2)) dct_b(k3(1),k3(2))];
dk(1) = round(rw(1)/Q_table(k1(1),k1(2)));
dk(2) = round(rw(2)/Q_table(k2(1),k2(2)));
dk(3) = round(rw(3)/Q_table(k3(1),k3(2)));


changes = zeros(8);
changes(k3(1),k3(2)) = -dk(3) + round(.5*(dk(1) + dk(2)));
tempVal = zeros(1,3);
tempVal(3) = dk(3) + changes(k3(1),k3(2));
%%% now push the third coefficient further into the middle
if tempVal(3) >= dk(1) && tempVal(3) <= dk(2)
    changes(k1(1),k1(2)) = -3*D;
    changes(k2(1),k2(2)) =  3*D;
else
    changes(k1(1),k1(2)) =  3*D;
    changes(k2(1),k2(2)) = -3*D;
end
tempVal(2) = dk(2) + changes(k2(1),k2(2));
tempVal(1) = dk(1) + changes(k1(1),k1(2));
if tempVal(1) == tempVal(2)
    changes(k1(1),k1(2)) =  3*D;
    changes(k2(1),k2(2)) = -3*D;
end
% dequantize the coefficients
dk(1) = dk(1)*Q_table(k1(1),k1(2));
dk(2) = dk(2)*Q_table(k2(1),k2(2));
dk(3) = dk(3)*Q_table(k3(1),k3(2));
% dequantize the changes
changes(k1(1),k1(2)) = changes(k1(1),k1(2)) * Q_table(k1(1),k1(2));
changes(k2(1),k2(2)) = changes(k2(1),k2(2)) * Q_table(k2(1),k2(2));
changes(k3(1),k3(2)) = changes(k3(1),k3(2)) * Q_table(k3(1),k3(2));
dct_b(k1(1),k1(2)) = dk(1);
dct_b(k2(1),k2(2)) = dk(2);
dct_b(k3(1),k3(2)) = dk(3);
b(:,:,1) = real(idct2(dct_b(:,:,1)));
changes_spatial = idct2(changes);
b(:,:,1) = b(:,:,1) + changes_spatial;

function s = steg_read(b, hash_num)
%%% steg_read(b,hash_num)
%% given a block and a hash number, it compares the triplet of
%% DCT coefficients and returns the value corresponding with their relationship
global blocksize;
global patternTable;
global D;        % default is 1 via paper
global Q_table;
k  = patternTable(hash_num,:);
k1 = num2Block(k(1));
k2 = num2Block(k(2));
k3 = num2Block(k(3));
dct_b = dct2(b(:,:,1)); % just take xform of y component
rw = [dct_b(k1(1),k1(2)) dct_b(k2(1),k2(2)) dct_b(k3(1),k3(2))];
dk = rw;
dk(1) = round(rw(1)/Q_table(k1(1),k1(2)));


dk(2) = round(rw(2)/Q_table(k2(1),k2(2)));
dk(3) = round(rw(3)/Q_table(k3(1),k3(2)));
if dk(1) > dk(3) && dk(2) > dk(3)
    s = 1;
elseif dk(1) < dk(3) && dk(2) < dk(3)
    s = 0;
else
    s = -1;
end

function coeff_test2(inFile)
%%% used to test all possible coefficient locations
% this script was used to get the results seen in Appendix C
global Q_table;
Q_table = jpeg_qtable(90);
c = rand_seq(100);
passString = 'lmnopqrstuvwxyzijk';  % generates for 1-18
table_qual = 70;
redundancy = 5;
startQual = 15;
endQual = 100.0;
increments = -4;
resultSize = ceil((endQual - startQual)/increments);
total_errs = zeros(size(passString,2)+1, resultSize);
topSpot = 1;
for jQuality = endQual:increments:startQual
    total_errs(1, topSpot) = jQuality;
    topSpot = topSpot + 1;
end
for jb = 1:size(passString,2)
    jb
    passphrase = passString(jb);
    curPlace = 1;
    Q_table = jpeg_qtable(table_qual);
    mod_image = zmbed(inFile, c, redundancy, passphrase);

    for jQuality = endQual:increments:startQual
        outFile = './tempPhotos/';
        outFile = strcat(outFile, inFile);
        outFile = strcat(outFile, num2str(jQuality));
        outFile = strcat(outFile, '.jpg');
        imwrite(mod_image, outFile, 'jpg', 'quality', jQuality);
        c_out = zextract(outFile, passphrase);
        total_errs(jb+1, curPlace) = error_count(c_out, c);
        curPlace = curPlace + 1;
    end
end
total_errs


A.5 Test code

This section contains the scripts used to generate and run most of the tests presented in the paper. A test would be executed as follows:

test_embed('lena.jpg', 70, c, 'clandestine', 3)

function test_embed(inFile, table_qual, c, passphrase, redundancy)
%%% test_embed(inFile,table_qual,c,passphrase,redundancy)
%% takes in a filename, embedding strength, binary string, passphrase, and
%% redundancy level, and uses lower-level scripts to embed, extract, and show
%% results.
stringSize = size(c,2);
global Q_table;
startQual = table_qual - 30;
endQual = 100.0;
increments = -4;
resultSize = ceil((endQual - startQual)/increments);
total_errs = zeros(2, resultSize);
topSpot = 1;
for jQuality = endQual:increments:startQual
    total_errs(1, topSpot) = jQuality;
    topSpot = topSpot + 1;
end
curPlace = 1;
Q_table = jpeg_qtable(table_qual);
mod_image = zmbed(inFile, c, redundancy, passphrase);

for jQuality = endQual:increments:startQual
    jQuality

    outFile = './tempPhotos/';
    outFile = strcat(outFile, num2str(table_qual));
    outFile = strcat(outFile, inFile);
    outFile = strcat(outFile, num2str(jQuality));
    outFile = strcat(outFile, '.jpg');

    imwrite(mod_image, outFile, 'jpg', 'quality', jQuality);
    c_out = zextract(outFile, passphrase);

    total_errs(2, curPlace) = error_count(c_out, c);
    curPlace = curPlace + 1;
end
total_errs

function A = zmbed(inFile, c, red, passphrase)
% automates the insertion of a binary string c into a file
global blocksize;


global A;
global patternTable;
global D;           % max mod distance
global Q_table;
global redund;
global LENGTH_SIZE;

LENGTH_SIZE = 10;
patternTable = jb_getTable;
redund = red;       % sets redundancy here

% init globals here
D = 1;
ldmsg = 'Loading image: ';
ldmsg = strcat(ldmsg, inFile);
fprintf(ldmsg, '\n');

A = imread(inFile);
original = A;
blocksize = 8;      % effectively gives block size of 8, x:x+7
mystring = strcat('Embedding bit sequence ', c);
fprintf(mystring, '\n');
fprintf('\n');
A = rgb2yuv(A);
A = A - 128;
seq = Ts(A);
startPos = get_startpos(redund, seq);
seqSize = size(seq,1);
cSize = size(seq,1) + LENGTH_SIZE;  % include length of embedded data
binSize = size(c,2) + LENGTH_SIZE;
binSize = dec2bin(binSize, LENGTH_SIZE);
c = strcat(binSize, c);
indice = 1;
for i = 1:redund
    [A, end_pos] = embed_sequence(c, seq, startPos(i), A, passphrase);
    if (i < redund)
        if (end_pos > startPos(i+1))
            error('Flooded into next redundancy zone. Either reduce # of bits or redundancy');
        end
    end
end

A = A + 128;
A = yuv2rgb(A);
A = uint8(A);

function [A, startPos] = embed_sequence(c, seq, startPos, A, passphrase)
% takes in a string and an image, and embeds data into it
% string debugger is used for examples generated in thesis


hash_seq = gen_hash_seq(passphrase);
hash_tablesize = size(hash_seq,2);
curBit = 1;
n = size(c,2);
totalBits = 0;
i = startPos;
% debugger = '';
while (curBit <= n)
    hash_index = mod(i, hash_tablesize) + 1;
    curIndex = i;
    curX1 = seq(curIndex,1);
    curY1 = seq(curIndex,2);
    b = getBlock(curX1, curY1, A);

    cTemp = c(curBit);
    checkFlag = check_write(b, cTemp, hash_seq(hash_index));

    if checkFlag == true
        b = steg_write4(b, cTemp, hash_seq(hash_index));
        A = writeBlock(A, b, curX1, curY1);
        curBit = curBit + 1;  % this only runs if write is successful
    else
        b = mark_bad(b, hash_seq(hash_index));
        % debugger = strcat(debugger,'B');
        if check_read(b, hash_seq(hash_index)) ~= false
            error('boo on mark_bad!');
        end
        A = writeBlock(A, b, curX1, curY1);
    end
    i = i + 1;
end
% debugger

function extracted = zextract(inFile, passphrase)
% zextract - high-level control of extraction from a file,
% given a pass phrase
global redund;
global LENGTH_SIZE;
global patternTable;
global D;           % max mod distance
patternTable = jb_getTable;
global blocksize;
global Q_table;
blocksize = 8;
LENGTH_SIZE = 10;
C = imread(inFile);
seq = Ts(C);
startPos = get_startpos(redund, seq);
err_count = 0;
mod_image = rgb2yuv(C);


mod_image=mod_image-128;

%%%%% EXTRACT STRING LENGTH
lengthArray = java_array('java.lang.String',redund);
for i=1:redund
    [unusedTemp, tempString]=extract_sequence(startPos(i),LENGTH_SIZE,seq,mod_image,passphrase);
    lengthArray(i)=java.lang.String(tempString);
end
lengthStrings=cell(lengthArray);
lengthStrings=char(lengthStrings);
mysize = reconstruct_string(lengthStrings);
e=0;   % counts the number of errors in string
mysize = remove_spaces(int2str(mysize));
mysize = bin2dec(mysize);
%%%% END EXTRACT STRING LENGTH
curbit=1;
strArray = java_array('java.lang.String',redund);
for i=1:redund
    [err_count(i), tempString]=extract_sequence(startPos(i),mysize,seq,mod_image,passphrase);
    strArray(i)=java.lang.String(tempString);
end
outStrings=cell(strArray);
outStrings=char(outStrings);
c_out = reconstruct_string(outStrings);
e=0;   % counts the number of errors in string
c_out = remove_spaces(int2str(c_out));   % convert extracted from double array to string
extracted=c_out(LENGTH_SIZE+1:size(c_out,2));

function st = reconstruct_string(myStrings)
%RECONSTRUCT_STRING(A) by Josh Buchanan
% IMPORTANT NOTE -- THIS WORKS ON BINARY STRINGS ONLY
% reconstructs a *binary* string given an array of strings
% Given an array of strings of equal size, it compares
% all strings to each other, scoring based upon how many
% 'correct' entries each string contains. Correct entries
% are judged by how many of the other strings contain the same
% characters.
% st = reconstruct_string(A) returns the reconstructed binary string.
%
% Example:
%   a = ['11110'
%        '10001'
%        '01110'];
%   reconstruct_string(a)
%   ----OUTPUT
%   ans =
%        1  1  1  1  0
%
% Josh Buchanan 03-March-2004.


% Last Revision: 03-March-2004
% $Revision: 1.0 $

sc = score_strings(myStrings);
w=sc;
stringSize=size(myStrings,2);   % the size of each individual string (all are the same)
mySums=zeros(1,stringSize);
final_string='';
for i=1:stringSize
    curBitSum=0;
    weightsum=0;
    for j=1:size(myStrings,1)
        weight=w(j);
        if (myStrings(j,i)~='e')
            curBitSum=curBitSum+weight*(str2num(myStrings(j,i)));
            weightsum=weightsum+w(j);
        end
    end   % j
    if weightsum==0
        curBitSum=0;
    else
        curBitSum=curBitSum/weightsum;
    end
    mySums(i)=round(curBitSum);
end   % end of string loop
st=mySums;

function sc = score_strings(toCompare)
%SCORE_STRINGS(A) by Josh Buchanan
% little heuristic for reconstructing a string
% from a set of extracted strings
%
% Given an array of strings of equal size, it compares
% all strings to each other, scoring based upon how many
% 'correct' entries each string contains. Correct entries
% are judged by how many of the other strings contain the same
% characters.
% sc = score_strings(A) returns an array of scores for each string.
%
% Example:
%   a = ['12356'
%        '32510'
%        '12356'];
%
%   sc = score_strings(a)
%   -- OUTPUT --
%   sc =
%        6  2  6
%
% this will be used to create a weight matrix (sc/sum(sc)) by
% the RECONSTRUCT_STRING method


%
% SEE ALSO: reconstruct_string
%
% Josh Buchanan 03-March-2004.
% Last Revision: 03-March-2004
% $Revision: 1.0 $

sc=zeros(1,size(toCompare,1));
for i=1:size(toCompare,1)
    curString = toCompare(i,:);      % this is how you reference the string
    for j=1:size(curString,2)        % loop through string
        if (curString(j)~='e')       % don't score if error exists in curString
            for k=1:size(toCompare,1)   % loop through all entries (including current entry)
                if curString(j)==toCompare(k,j)
                    sc(i)=sc(i)+1;
                end
            end
        end   % end of error check in curString
    end
end

function [err_count, myseq] = extract_sequence(startPos,n,seq,mod_image,passphrase)
% the low-level sequence extractor, called by zextract
% debugger2 is used to generate examples in Ch 4.

hash_seq=gen_hash_seq(passphrase);
hash_tablesize=size(hash_seq,2);
myseq='';
err_count=0;
% debugger2='';
i=startPos;
num_embedded=0;
%for i=startPos:startPos+n
while (num_embedded < n)
    curIndex=i;
    hash_index=mod(i,hash_tablesize)+1;

    if curIndex > size(seq,1)
        my_errmsg = strcat('Image ran out of space at bit ', num2str(num_embedded));
        error(my_errmsg);
    end
    curX=seq(curIndex,1);
    curY=seq(curIndex,2);
    b=getBlock(curX,curY,mod_image);
    checkFlag=check_read(b,hash_seq(hash_index));

    if checkFlag==true
        num_embedded=num_embedded+1;
        s=steg_read(b,hash_seq(hash_index));
        if s==1
            myseq=strcat(myseq,'1');
            % debugger2=strcat(debugger2,'1');


        elseif s==0
            myseq=strcat(myseq,'0');
            % debugger2=strcat(debugger2,'0');

        else
            myseq=strcat(myseq,'e');
            err_count=err_count+1;
        end   % end s== check
    else
        ;   % block was marked bad -- do nothing, and move on
        % debugger2=strcat(debugger2,'B');
    end   % end checkFlag conditional
    i=i+1;
end   % end while loop
% debugger2
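For readers who prefer a compact statement of the redundancy-voting scheme, the `score_strings`/`reconstruct_string` pair above can be expressed in Python. This is an illustrative re-implementation, not part of the thesis code; the function names mirror the MATLAB originals.

```python
def score_strings(strings):
    """Score each extracted string by its agreement with all strings
    (including itself), skipping positions marked 'e' (extraction errors).
    Mirrors the MATLAB score_strings."""
    scores = [0] * len(strings)
    for i, cur in enumerate(strings):
        for j, ch in enumerate(cur):
            if ch != 'e':
                scores[i] += sum(1 for other in strings if other[j] == ch)
    return scores


def reconstruct_string(strings):
    """Weighted per-bit majority vote across the redundant extractions.
    Mirrors the MATLAB reconstruct_string; returns a list of bits.
    Note: Python's round() breaks .5 ties to even, while MATLAB rounds
    away from zero; the difference only matters on exact ties."""
    weights = score_strings(strings)
    bits = []
    for j in range(len(strings[0])):
        num = den = 0
        for s, w in zip(strings, weights):
            if s[j] != 'e':
                num += w * int(s[j])
                den += w
        bits.append(round(num / den) if den else 0)
    return bits
```

Running it on the docstring example, `reconstruct_string(['11110', '10001', '01110'])` yields `[1, 1, 1, 1, 0]`, agreeing with the MATLAB output.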


Appendix B: Quantization Tables

B.1 Photoshop Quantization Tables

Here are the JPEG luminance (Y) quantization tables used by Adobe Photoshop 6.0. They were obtained by saving images at different JPEG quality levels and extracting the quantization tables with a JPEG toolkit [21]. The tables range from quality level 0 to level 12, with 0 producing the most compression (worst visual quality) and 12 the least compression (best visual quality). Note that even the quantization table representing level 12 results in some loss of information.
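During JPEG compression, each 8x8 block of DCT coefficients is divided elementwise by the quantization table and rounded; only the rounded values are stored. The following Python fragment (illustrative only, not toolkit code) shows the per-coefficient operation and why any table entry greater than 1 can discard information:

```python
def quantize(coef, step):
    """Quantize a DCT coefficient: divide by the table entry and round."""
    return round(coef / step)


def dequantize(qcoef, step):
    """Invert quantization; the rounding error is permanently lost."""
    return qcoef * step


# e.g. a coefficient of 103 quantized with step 16 comes back as 96:
restored = dequantize(quantize(103, 16), 16)   # 103 -> 6 -> 96
```

A step of 1 is the only lossless case, which is why even the highest-quality tables remain (mildly) lossy wherever their entries exceed 1.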


B.2 JPEG Recommended Quantization Tables

Here are the luminance (Y) quantization tables corresponding to the recommended JPEG standard. They were obtained using the JPEG toolkit [21]. Compression levels range from 0 to 100, with 0 being the most compressed and 100 the least.
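These tables are conventionally derived from the base luminance table in Annex K of the JPEG standard via the IJG (libjpeg) quality-scaling rule. The following Python sketch reproduces that rule; it assumes the standard IJG behavior rather than anything specific to the toolkit [21], and is illustrative only.

```python
# Base luminance quantization table from Annex K of the JPEG standard,
# stored row-major as a flat list of 64 entries.
BASE_LUMINANCE = [
    16, 11, 10, 16, 24, 40, 51, 61,
    12, 12, 14, 19, 26, 58, 60, 55,
    14, 13, 16, 24, 40, 57, 69, 56,
    14, 17, 22, 29, 51, 87, 80, 62,
    18, 22, 37, 56, 68, 109, 103, 77,
    24, 35, 55, 64, 81, 104, 113, 92,
    49, 64, 78, 87, 103, 121, 120, 101,
    72, 92, 95, 98, 112, 100, 103, 99,
]


def scaled_table(quality):
    """Return the luminance table for quality in [1, 100], following the
    IJG jpeg_quality_scaling rule (integer arithmetic throughout)."""
    quality = min(max(quality, 1), 100)
    scale = 5000 // quality if quality < 50 else 200 - quality * 2
    # Scale each base entry and clamp to the legal baseline range [1, 255].
    return [min(max((q * scale + 50) // 100, 1), 255) for q in BASE_LUMINANCE]
```

At quality 50 the base table is returned unchanged; at quality 100 every entry collapses to 1 (nearly lossless); at very low qualities the entries saturate at 255.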


Appendix C: Initial Coefficient Testing Results

In order to identify viable coefficient embedding locations, data was embedded into each potential coefficient triplet and then extracted from the image. A random 200-bit binary string was generated and embedded separately at each of the potential coefficient locations. Standard JPEG compression was performed on the image at increasing levels, the extracted string was compared with the original embedded string, and the number of errors in the extracted string was recorded. This experiment was repeated 5 times for each embedding strength level (90, 80, 70). The combined results of all 5 trials are recorded below. The test was performed on four different images, as seen in Figure 4.5 (tiger1.jpg, lena.jpg) and Figure 4.6 (shed.jpg, duke2.jpg).
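The error counting in this experiment reduces to a positionwise comparison of two bit strings. The sketch below illustrates that step in Python; the embed/compress/extract stages themselves are the MATLAB routines of this appendix and are not reproduced here.

```python
import random


def random_bits(n, seed=None):
    """Generate a random n-bit binary string (the experiment used n = 200)."""
    rng = random.Random(seed)
    return ''.join(rng.choice('01') for _ in range(n))


def bit_errors(original, extracted):
    """Count positions where the extracted string disagrees with the
    original. An 'e' (a block whose pattern check failed on read) never
    matches '0' or '1', so it always counts as an error."""
    return sum(1 for a, b in zip(original, extracted) if a != b)
```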

[Charts of extraction-error counts for the candidate coefficient triplets appeared here, one set per embedding strength level, for duke2.jpg, shed.jpg, tiger1.jpg, and lena.jpg. The horizontal axes cover JPEG compression levels from 100 down to 68, from 100 down to 56, and from 100 down to 48. The chart values themselves were not recoverable from the source.]

Based on this data, the table of coefficient locations was trimmed down to the following:

[Table of retained coefficient triplets, with columns Hash, k1, k2, k3; the entries were not recoverable from the source.]

Coefficient triplets (10,17,18), (17,10,18), (10,17,3), and (17,10,3) were eliminated for their poor performance. Coefficient triplet (9,16,17) was arbitrarily eliminated to give the table a prime number of entries.
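The thesis does not elaborate on why a prime number of entries is desirable, but a standard argument applies to modular indexing of the form `hash_index = mod(i, tablesize) + 1` used by `embed_sequence` and `extract_sequence`: if the effective stride through index values shared a common factor with the table size, only a fraction of the patterns would ever be selected. The Python sketch below (with hypothetical table sizes and strides) illustrates this property; it is a side observation, not part of the thesis code.

```python
def patterns_visited(table_size, stride, steps=1000):
    """Count how many distinct table entries are touched when the index
    advances by `stride` each step and is reduced modulo `table_size`."""
    return len({(i * stride) % table_size for i in range(steps)})
```

With a prime table size such as 13, any nonzero stride cycles through every entry; a composite size such as 12 with stride 4 touches only a third of them.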


Bibliography

[1] David Kahn. "The History of Steganography," Proceedings of the First International Workshop on Information Hiding, Cambridge, U.K., May 30-June 1, 1996.

[2] John Miano. Compressed Image File Formats: JPEG, PNG, GIF, XBM, BMP, Addison-Wesley, New York, 1999.

[3] Keith Jack. Video Demystified, Elsevier Science & Technology, 2001.

[4] Jack Kelley. "Terrorist instructions hidden online," http://www.usatoday.com/tech/news/2001-02-05-binladen-side.htm , February 5, 2001.

[5] Declan McCullagh. "Bin Laden: Steganography Master?" http://www.wired.com/news/politics/0,1283,41658,00.html , February 7, 2001.

[6] Peter Wayner. Disappearing Cryptography: Information Hiding: Steganography & Watermarking, Morgan Kaufmann Publishers, New York, 2002.

[7] Charles Kurak and John McHugh. "A Cautionary Note on Image Downgrading," Proceedings of the Eighth Annual Computer Security Applications Conference, San Antonio, TX, December 1992.

[8] Neil Johnson. "History of Steganography," http://www.jjtc.com/stegdoc/bks202.html , verified Jan 2004.

[9] "Issues in Information Hiding Transform Techniques," NRL Memorandum Report NRL/MR/5540-02-8621, http://chacs.nrl.navy.mil/publications/CHACS/2002/index2002.html , verified Dec 2003.

[10] Ingemar Cox, Joe Kilian, Tom Leighton, and Talal Shamoon. "Secure Spread Spectrum Watermarking for Multimedia," NEC Research Institute Technical Report 95-10, 1995.

[11] Eckhard Koch, Jochen Rindfrey, and Jian Zhao. "Copyright Protection for Multimedia Data," in Digital Media and Electronic Publishing, Academic Press, San Diego, 1996, pp. 203-213.

[12] William Pratt. Digital Image Processing, John Wiley & Sons, New York, 1991.

[13] Jiri Fridrich. "Applications of Data Hiding in Digital Images," http://citeseer.nj.nec.com/cache/papers/cs/500/http:zSzzSzssie.binghamton.eduzSz~jirifzSzResearchzSzispacs98.pdf/fridrich98application. , verified Feb 2004.

[14] Eric Cole. Hiding in Plain Sight: Steganography and the Art of Covert Communication, Wiley Publishing, Indiana, 2003.

[15] Jian Zhao and Eckhard Koch. "Embedding Robust Labels into Images for Copyright Protection," Proceedings of the International Congress on Intellectual Property Rights for Specialized Information, Knowledge and New Technologies, Vienna, August 1995.

[16] Andy Brown. Documentation from S-Tools software, available at http://www.stegoarchive.com/ , verified January 2004.

[17] http://www.spammimic.com , verified December 2003.

[18] Michael Marcellin and David Taubman. JPEG2000: Image Compression Fundamentals, Standards and Practice, Kluwer Academic Publishers, Boston, 2002.

[19] John Mathews and Kurtis Fink. Numerical Methods Using MATLAB, Prentice Hall, New Jersey, 2004, p. 301.

[20] Gregory Wallace. "The JPEG Still Picture Compression Standard," Communications of the ACM, April 1991.

[21] Phil Sallee. Matlab JPEG Toolbox, available at http://redwood.ucdavis.edu/phil/demos/jpegtbx/jpegtbx.htm , verified Feb 2004.

[22] http://www.pgpi.org/ , verified January 2004.

[23] Thomas Cormen, Charles Leiserson, Ronald Rivest, and Clifford Stein. Introduction to Algorithms, MIT Press, Cambridge, Massachusetts, 2001.

[24] Niels Provos. "Defending Against Statistical Steganalysis," Proceedings of the 10th USENIX Security Symposium, Washington D.C., 2001.
