University of Konstanz Department of Computer and Information Science Distributed Systems Group
Master Thesis
Exploiting Facebook, Flickr, and Picasa: Utilizing Photo Sharing Websites as Cloud Storage Backends
Scientific Thesis in Fulfillment of the Requirements for the Degree of Master of Science (M.Sc.)
Author: Wolfgang Miller (01/612437)
First Assessor: Professor Dr. Marcel Waldvogel Second Assessor: Professor Dr. Dietmar Saupe Advisor: Sebastian Graf, M.Sc.
July 2013 Wolfgang Miller
Exploiting Facebook, Flickr, and Picasa: Utilizing Photo Sharing Websites as Cloud Storage Backends, © July 2013
Abstract
A major application of cloud computing is to supply storage, which providers often advertise as reliable, available, functional, and cost efficient because it relies on the massive infrastructure available in the cloud. Photo sharing websites rely on similar infrastructures to handle the amount of uploaded image data; however, in contrast to conventional cloud storage providers, they offer free storage capacities for images. Because images can be used as data containers to store arbitrary information, it is possible to exploit these services as free cloud storage. In this thesis we describe our approach for the development of a fully functional photosharing website driven cloud storage system based on twelve encoding approaches. These encoding approaches allow us to adjust our system to the requirements of a specific photo sharing website and to reach the best trade-off between the actual data ratio of an image and its robustness against image compression. The system’s performance has been tested with the three major photo sharing websites Facebook, Flickr, and Picasa Web Albums, and has been compared with Amazon’s professional cloud storage service AWS S3. The results show that the system can be an alternative in scenarios in which overall performance is not critical, such as backup systems. Our photosharing website driven cloud storage system is available as open source and is implemented as part of the widely used jClouds framework.
Acknowledgements
My special thanks go to my family, my parents Willi and Anneliese Miller, my brother Sebastian Miller, and my grandmother Adelheid Rothmund, with whom I lived during the first years of studying in Konstanz. They made it possible for me to study by always giving me great support.
I would like to thank Professor Dr. Marcel Waldvogel and the members of the DiSy working group for the excellent support. I would also like to thank Professor Dr. Dietmar Saupe for taking the time to assess this thesis.
Special thanks go to Sebastian Graf. Throughout my studies I came to know and appreciate Sebastian in several roles: as a fellow student, a lecturer, a colleague, the supervisor of this thesis, and most importantly as a valuable friend. Thanks Sebastian for everything and good luck for your little family!
During my time in Konstanz, I met many nice people whom I studied with, and with whom I spent very good times. A special thanks to all my friends, who made me feel at home in Konstanz. In particular I want to mention my master clique: Anja Fauth, Laura Lorenz, Volker Rehberg, Markus Hankh, Alexander Walter, Benno Geißelmann, Jochen Budzinski, Lin Shao, and Roland Jungnickel.
I also want to give special thanks to my very good friends Johanna Harde and Andreas Kraft, who gave me great support during the writing of this thesis.
Last but not least, I would like to especially thank Anna Dowden-Williams for her more than valuable input in writing this thesis in English. Thank you Anna for all the time you invested in helping me and for your friendship.
Contents
1. Introduction
    1.1. Motivation
    1.2. Contribution of the Thesis
    1.3. Key Points of the Thesis
    1.4. Related Work
        1.4.1. Image Steganography
        1.4.2. 2D-Codes
        1.4.3. Projects with similar Orientation
        1.4.4. Conclusion and Comparison
2. Encoding Approaches
    2.1. Intention
    2.2. First Encoding
        2.2.1. Encoding Workflow
        2.2.2. Decoding Workflow
        2.2.3. Results
    2.3. Advanced Encoding
        2.3.1. Basics on Numeral Systems
        2.3.2. Relationship between Radix γ and Data Ratio
        2.3.3. Results
    2.4. Single-Layer-Approaches
        2.4.1. Encoding
        2.4.2. Decoding
        2.4.3. Results
    2.5. Multi-Layer-Approaches
        2.5.1. Encoding
        2.5.2. Decoding
        2.5.3. Results
    2.6. Images and Colors
        2.6.1. Types of Images
        2.6.2. Predefined Colors Single-Layer-Approaches
        2.6.3. Generated Colors Multi-Layer-Approaches
    2.7. Summary
        2.7.1. Benchmarks
        2.7.2. Results
3. Implementation of our Photosharing Website driven Cloud Storage System
    3.1. Photosharing Website driven Cloud Storage System
        3.1.1. Error Correcting Code (Reed Solomon)
        3.1.2. Areas and Sectors of generated Images
        3.1.3. Integration in jClouds and System Workflows
    3.2. Facebook
        3.2.1. Features
        3.2.2. Implementation
        3.2.3. Benchmarks
        3.2.4. Results
    3.3. Flickr
        3.3.1. Features
        3.3.2. Implementation
        3.3.3. Benchmarks
        3.3.4. Results
    3.4. Picasa Web Albums
        3.4.1. Features
        3.4.2. Implementation
        3.4.3. Benchmarks
        3.4.4. Results
    3.5. Comparison Facebook, Flickr, and Picasa Web Albums
        3.5.1. Feature Comparison
        3.5.2. Overhead Comparison
        3.5.3. Performance Comparison
        3.5.4. Results
4. Conclusion
    4.1. Results
    4.2. Future Work
        4.2.1. Benchmarks with larger Data and Flickr’s new Service
        4.2.2. Enhance Images with Visual Information
        4.2.3. Automatic Determination of Best Encoding Parameters
        4.2.4. Step from Images to Videos
Appendices
A. Random Input Bytes
B. Encoding Approaches Naming Convention
Bibliography
1. Introduction
1.1. Motivation
In recent years, cloud computing has become one of the most influential and widely discussed trends in the IT sector. Big players like Apple, Amazon, Google, and Microsoft developed new cloud services and pushed the idea of outsourcing to third party providers, launching marketing campaigns to gain the attention of the end consumer market. Consequently, the term “cloud computing” is today well known beyond the circle of computer enthusiasts. The pull effect of the cloud trend also brought up new business concepts and start-up companies with fresh ideas. The credo seems to be “everything is better with the cloud”. Against this background a huge variety of cloud computing services has mushroomed: from services that let companies outsource much of their internal IT infrastructure to trusted third parties, e.g. the local provider Adlon Datenverarbeitung Systems GmbH [Adl13], to collaborative file storage for end users, e.g. Dropbox [Dro13], and even gaming, where the complex graphics of video games are rendered in the cloud and users stream the precalculated scenes via a client. A fitting example of the latter is Sony’s cloud-based gaming service Gaikai [com13].
Besides additional computing power, the major application of cloud services is to supply storage. This storage is often advertised as being reliable, available, functional, and cost efficient. Provider offers range from end-user services like the aforementioned Dropbox or Apple’s iCloud [App13] for external storage, data backup, synchronization, and sharing of general data, to professional big data NoSQL storage like Amazon’s AWS S3 [Ama13] or Microsoft’s Azure [Mic13]. Alongside cloud storage, another type of service has exploited the availability of large storage capacities, namely photo sharing websites.
Photo sharing websites are storage providers specialized in storing and sharing photos. Although these sites rely on massive server infrastructures that utilize the same reliable storage techniques as other cloud storage services, they do not charge fees. Examples of such services are Facebook [Fac13a], Yahoo’s Flickr [Fli13a], and Picasa Web Albums [Goo13a] from Google.
Taking these facts about photo sharing websites into account, we asked whether it is possible to exploit these services and use them as free cloud storage for any kind of data. The major issue was finding a way around the constraint of only being able to upload photos. This thesis presents the development of a fully functional photosharing website driven cloud storage system. The last part of the thesis takes a closer look at the providers implemented in our system, namely Facebook, Flickr, and Picasa Web Albums, and analyzes how well they perform in our cloud storage scenario.
Flickr has shown that it is worth thinking about utilizing photo sharing websites as cloud storage. On May 20th, 2013, Flickr completely renewed its service offer [Fli13c]. Since that day Flickr allows users to upload and store up to 1TB of image data for free, instead of the previous monthly limit of 300MB. Another important change is that Flickr now provides access to the originally uploaded images and no longer applies JPEG-compression, which, as we will see, makes it a lot easier to store data. Since Flickr’s restart took place while this thesis was being written, we only refer to the features available before the restart. Nevertheless, our system provides all tools needed to use the new Flickr service as well.
1.2. Contribution of the Thesis
In a nutshell, this thesis introduces a way of utilizing photo sharing websites as cloud storage backends for NoSQL stores only. It answers the question of how the offers of certain image providers can be exploited to store any kind of data without charge. The contribution covers the following points:
• Twelve different encoding approaches that allow our system to be easily adapted to arbitrary photo sharing website providers and their handling of uploaded images (e.g. JPEG-compression).
• Details about the implementation of our fully functional photosharing website driven cloud storage system, e.g. the system workflows.
• Analysis of the photo sharing website providers available in this system, Facebook, Flickr, and Picasa Web Albums, and their performance as cloud storage providers.
1.3. Key Points of the Thesis
The goal of this thesis is to show how photo sharing websites can be made usable as cloud storage backends imitating a blobstore. This requires answering the question of how to store arbitrary buckets of data on a specific provider’s storage, and also how to work the other way around and regain access to the data. Since we are dealing with photo sharing websites, this data has to be exchanged through images. The following figure summarizes how data should flow in such a system:
Figure 1.1.: Data Cycle of a Photo Sharing Website driven Cloud Storage System (data is encoded into an image and uploaded to the photo sharing website; the image is downloaded and decoded back into data)
On the left hand side we have buckets filled with data on a local machine, symbolized by binary strings, whereas on the right hand side we see the cloud storage of a photo sharing website, represented by a cylinder. Free data interchange between these two major points is crucial to support a fully functional storage. Let us take a closer look at the cycle and how the tasks of storing and restoring data work:
Store Data: We start on the left hand side with data that should be stored. This data first has to be encoded into an image. After that step the image with the embedded data is uploaded and hosted on the photo sharing website.
Restore Data: We start on the right hand side. First the image must be downloaded from the photo sharing website. Next the data that is embedded in the image has to be decoded. At the end the data previously stored is regained.
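The store and restore tasks can be sketched as a round trip through the four traversal steps. The following Python fragment is only an illustration under stated assumptions: `FakePhotoSite` is a hypothetical in-memory stand-in for a photo sharing website, and `encode`/`decode` use a placeholder mapping of one grey value per byte rather than any of the encoding approaches of this thesis.

```python
class FakePhotoSite:
    """Hypothetical in-memory stand-in for a photo sharing website."""

    def __init__(self):
        self._images = {}

    def upload(self, name, image):
        self._images[name] = list(image)

    def download(self, name):
        return list(self._images[name])


def encode(data):
    """Placeholder encoding (assumption): one grey value 0..255 per byte."""
    return list(data)


def decode(image):
    """Inverse of the placeholder encoding above."""
    return bytes(image)


def store(site, name, data):
    # Store Data: encode the bytes into an image, then upload the image.
    site.upload(name, encode(data))


def restore(site, name):
    # Restore Data: download the image, then decode the embedded bytes.
    return decode(site.download(name))


site = FakePhotoSite()
payload = bytes([69, 65, 218, 186, 184])
store(site, "bucket-0", payload)
restored = restore(site, "bucket-0")
print(restored == payload)
```

A real provider may alter the uploaded image (e.g. by JPEG-compression), which is why the placeholder encoding would not be sufficient in practice.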
In the presented cycle we see four traversal steps that are crucial for the data flow between local host and photo sharing website: encoding & decoding as well as uploading & downloading. The following two chapters of this thesis can be roughly divided by these steps:
⇒ Encoding & Decoding: In Chapter 2 we will introduce twelve different encoding approaches and take a detailed look at how the encoding of bytes into images works and vice versa. Each of these approaches has its own set of features like data capacity and utilized image colors. This number of approaches is needed because, as we will see, every photo sharing website treats uploaded images differently. Some of the vendors apply image compression and force the use of encodings with a lower data ratio to preserve the data of the images. As a result not every approach is applicable for every photo sharing website.
⇒ Uploading & Downloading: Chapter 3 focuses on the application of the previously discussed encoding approaches. It will show details about the implementation of our photosharing website driven cloud storage system, e.g. what the general upload and download workflows look like. Afterwards we will introduce the three photo sharing websites available in our system: Facebook, Flickr, and Picasa Web Albums. We will furthermore discuss every vendor in detail and draw a comparison between the three main market players of photo sharing websites.
1.4. Related Work
There are three main fields the related work can be assigned to: image-steganography, 2D-Codes, and projects with similar orientation.
1.4.1. Image Steganography
A topic related to the idea of storing data in images is image-steganography - the art of hiding additional data in an image without being visually noticed.
JPEG Compression Immune Steganography Using Wavelet Transform
A work from the field of image-steganography that especially attracted our attention is [XSSL04] and its JPEG-compression immune approach to storing data in images. Since some photo sharing websites apply JPEG-compression to uploaded images, compression is a serious threat to the stored data. [XSSL04] introduces a way of storing bits in an image such that the hidden data cannot be destroyed by JPEG-compression, not even with the highest compression settings. The price of such a compression immune image is a very poor data ratio: an image container using this approach can only store 0.00025 Bpp (bytes per pixel). That is the reason why this approach is not applicable in our photo sharing website driven cloud storage scenario, where we want to deal with data of several megabytes. We also do not need to hide our data behind an image. As we will see later, our encoding approaches have a data ratio between 0.125 Bpp and 3 Bpp, which is an increase of the data ratio by a factor of 500 up to a factor of 12,000.
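The factors quoted above follow directly from dividing the data ratios. The short Python calculation below reproduces them with exact fractions; the one-megabyte payload (10^6 bytes) is an illustrative assumption and not a figure from the cited work.

```python
from fractions import Fraction

# Data ratios quoted in the text, in bytes per pixel (Bpp).
STEGO_BPP = Fraction("0.00025")   # JPEG-immune steganography [XSSL04]
LOWEST_BPP = Fraction("0.125")    # lowest ratio of our encoding approaches
HIGHEST_BPP = Fraction("3")       # highest ratio of our encoding approaches

# Improvement factors relative to the steganography approach.
factor_low = LOWEST_BPP / STEGO_BPP
factor_high = HIGHEST_BPP / STEGO_BPP

# Pixels needed to hold one megabyte (10**6 bytes, illustrative assumption).
pixels_stego = Fraction(10**6) / STEGO_BPP
pixels_lowest = Fraction(10**6) / LOWEST_BPP

print(factor_low, factor_high)
print(pixels_stego, pixels_lowest)
```

Under this assumption the steganography approach would need four billion pixels per megabyte, compared with eight million pixels for an encoding at 0.125 Bpp.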
A Web based Covert File System
An idea for a covert file system called “CovertFS” is introduced in [BKI07]. This file system is meant to be built on publicly available media hosting and sharing services and allows its users to store their files hidden inside media without being noticed by the providers. As an example of such a media service the authors refer to the photo sharing website Flickr. Their goal is to utilize image-steganography to add hidden file system information to images and upload these images to Flickr. The data ratio of the utilized image-steganography technique is only specified vaguely: about 10% of an image’s actual size should be usable as storage, so that, for example, a 40KB image can contain 4KB of actual data.
Even though [BKI07] wants to exploit photo sharing websites as storage, their purpose is completely different. They want to hide information and secure it as much as possible, even if this comes at a great cost in performance, e.g. they want to map every 4KB disk block of their file system to a separate image. This is very different from our cloud storage scenario, which aims at providing the best possible data ratio and performance.
It has to be kept in mind that this work is based on theoretical thoughts. A practical application of the approach would possibly enhance some of the theories.
1.4.2. 2D-Codes
A set of techniques that share the similar use-case of expressing data with blocks of pixels, respectively images, are 2D-Codes. These codes are designed to embed data in images in such a way that it can be recognized and processed by appropriate reading equipment, e.g. mobile phones with cameras.
QR-Codes and DataMatrix-Codes
Two popular 2D-Code techniques are QR-Codes [QRc13b] (“Quick Response” Codes), which are nowadays widely used in the context of mobile phones, and DataMatrix-Codes [fS06]. Figure 1.2 shows an example output for both codes, storing the same array of 18 random bytes (this array is listed in Appendix A and will be used throughout the thesis to make the output images of different approaches visually comparable):
(a) QR-Code (b) DataMatrix-Code
Figure 1.2.: Example Output 2D-Codes: QR-Code and DataMatrix-Code
QR-Codes exist in several predefined versions with respect to their resolution / used modules and capacity. At maximum a QR-Code can have a resolution of 177 × 177 pixels (QR-Code version 40). Combined with the minimal available error-correction level “L”, a QR-Code can store up to 23,648 bits [QRc13a]. This corresponds to a best-case data ratio of about 0.09 Bpp, but only if the QR-Code is used to its full capacity. That is an important point since, if the input data exceeds a defined threshold, the next larger QR-Code version has to be utilized to store it. This can result in an extremely unfavorable data ratio, as the QR-Code shown in Figure 1.2a demonstrates: 17 bytes would have fit into a version 1 QR-Code with a resolution of 21 × 21 pixels, but with 18 bytes the threshold is exceeded and a version 2 QR-Code with a resolution of 25 × 25 pixels has to be used. This code would be capable of storing 32 bytes, but is used to store only 18 bytes, meaning that the data ratio in this case is poor at about 0.03 Bpp, only one third of the best ratio.
The situation is similar for DataMatrix-Codes: there are also predefined DataMatrix-Code versions with respect to capacity and resolution. They can have a maximum size of 144 × 144 pixels and, in combination with ECC 200, store up to 12,448 bits. That equals 1,556 bytes and a data ratio of about 0.075 Bpp.
Summing up, the 2D-Codes have been a good inspiration for our own encoding with respect to utilizing pixels to embed data in images, but the inconstant data ratio and the limitation to predefined image container sizes prompted us to search for a better solution of our own. Moreover, in our photo sharing website scenario we do not need to take a detour over reading equipment like cameras, and can therefore spare a lot of error-correction code.
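The data ratios given for the 2D-Codes can be checked with a few lines of Python. This is a minimal sketch that assumes one code module corresponds to exactly one pixel, as the resolutions in the text suggest; the helper name `bytes_per_pixel` is introduced here for illustration only.

```python
def bytes_per_pixel(payload_bits, width, height):
    """Data ratio in Bpp: stored bytes divided by the number of pixels."""
    return (payload_bits / 8) / (width * height)


# QR-Code version 40, error-correction level "L", used to full capacity.
qr_v40_full = bytes_per_pixel(23_648, 177, 177)

# QR-Code version 2 (25 x 25) that stores only the 18 example bytes.
qr_v2_example = bytes_per_pixel(18 * 8, 25, 25)

# Largest DataMatrix-Code (144 x 144) with ECC 200, used to full capacity.
datamatrix_full = bytes_per_pixel(12_448, 144, 144)

for label, ratio in [
    ("QR v40 (full)", qr_v40_full),
    ("QR v2 (18 bytes)", qr_v2_example),
    ("DataMatrix (full)", datamatrix_full),
]:
    print(f"{label}: {ratio:.4f} Bpp")
```

The values round to the figures quoted above: about 0.09 Bpp, about 0.03 Bpp, and about 0.075 Bpp.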
Quick Layered Response (QLR) Codes
An enhanced 3D-Code version of QR-Codes is introduced in [DD12]. The so-called “QLR-Codes” improve the capacity of QR-Codes by a factor of 3 by utilizing RGB images and their layers as an additional dimension. To achieve this capacity gain, the input data is split across three equally sized QR-Codes, each of which represents one RGB layer: the first QR-Code encodes its data in red pixels, the second in green pixels, and the third in blue pixels. Finally, the three layers are merged into one RGB image that represents the resulting QLR-Code and contains all three sub-QR-Codes. This method of gaining additional data capacity is also applied by some of our encoding approaches and will be discussed in detail in Chapter 2.5.
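The layering idea can be illustrated independently of the QR-Code format itself. In the Python sketch below, small 0/1 matrices stand in for the three sub-codes, and a set module is written as the channel value 255; both choices, as well as the function names, are assumptions made for this illustration and are not taken from [DD12].

```python
def merge_layers(red, green, blue):
    """Merge three equally sized 0/1 matrices into one RGB image.

    Each pixel becomes an (R, G, B) tuple; a set module is stored as 255
    in the channel of its layer (assumed mapping).
    """
    return [
        [(255 * r, 255 * g, 255 * b) for r, g, b in zip(r_row, g_row, b_row)]
        for r_row, g_row, b_row in zip(red, green, blue)
    ]


def split_layers(image):
    """Recover the three 0/1 matrices from an RGB image by thresholding."""
    layers = []
    for channel in range(3):
        layers.append(
            [[1 if pixel[channel] > 127 else 0 for pixel in row] for row in image]
        )
    return tuple(layers)


red = [[1, 0], [0, 1]]
green = [[0, 0], [1, 1]]
blue = [[1, 1], [0, 0]]

merged = merge_layers(red, green, blue)
recovered = split_layers(merged)
print(merged[0][0], recovered == (red, green, blue))
```

Because every pixel now carries one module of each of the three layers, the same image area holds three times the data of a single-layer code.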
Unsynchronized 4D Barcodes (Coding and Decoding Time-Multiplexed 2D Colorcodes)
Another evolution of 2D-Codes is presented in [LB07]. The main idea behind this approach is to improve the capacity of regular cellphone-readable 2D-Codes by adding the dimensions color and time to the regularly used dimensions width and height. The DataMatrix-Code serves as the basis. The authors argue that consumer cameras have clear limitations, which means that increasing the resolution of the barcodes to gain more data volume is not an option. Instead, they propose animated GIF-images [W3C13a] that hold an array of 3D DataMatrix-Codes. The GIF-images display the contained barcodes one after another in an endless loop. To access the information stored in those GIF-images, they have to be captured via a mobile camera (e.g. of a cellphone) and afterwards decoded by the device. Every single barcode consists of three layers of DataMatrix-Codes, each layer represented by one of the three primary colors of the RGB color space (red, green, and blue). In contrast to the previously introduced QLR-Codes, these layers are not used to gain additional storage; instead they are utilized to improve the robustness of the codes: every DataMatrix-Code frame has two redundancies.
1.4.3. Projects with similar Orientation
PNG-Store
[Sei13] introduces the idea of storing JavaScript and CSS files in PNG-images [W3C13b] to utilize their lossless compression as an alternative to the commonly used GZip compression. To achieve this goal, 8 bit PNG-images are used as data containers, whereby the encoder has a data ratio of 1 Bpp. A continuation of this idea is the PNG Store [Hen13]. This work additionally uses 24 bit PNG-images with a data ratio of 3 Bpp.
Both encoding methods are similar to two of our encoders introduced later in this thesis; nevertheless the goals of the projects differ vastly. Their aim is to gain compression by storing data in images, whereas we want to store data on photo sharing websites even at the cost of overhead. Depending on the provider used, we may have to deal with lossy JPEG-compression and therefore also need encoders in our portfolio that have a lower bytes per pixel ratio but are more robust against compression.
Filr
Simultaneously with the previously mentioned restart of Flickr and its new service offering 1TB of photo storage for free, a new project called “Filr” [Tom13] appeared. It shares our project’s goal of exploiting Flickr as free storage. Filr is a command line tool that allows users to encode files and upload them to Flickr. Its encoder corresponds to the one used in the PNG Store project, with a data ratio of 3 Bpp.
In contrast to our photosharing website driven cloud storage system presented in this thesis, this is a very limited tool. It only allows the encoding and uploading of single files with a size limit of 15MB. The files then have to be downloaded manually and decoded again. In contrast, our system works as a blob store and handles all of that automatically. Moreover, our system is adaptable to any photo sharing website and can deal with JPEG-compression if necessary. Filr only works with Flickr and its service allowing access to the original images.
1.4.4. Conclusion and Comparison
A comparison between the eight discussed approaches is shown in Table 1.1:
Approach | Use Case | JPGc | Bpp
JPEG Comp. Immune Steg. Using Wavelet Transform [XSSL04] | Hide data in images, compression immune | yes | $2.5 \times 10^{-4}$
A Web based Covert File System [BKI07] | Hidden file system based on public media hosting (e.g. Flickr) | unclear | $\sim 1 \times 10^{-1}$
QR-Codes [QRc13b] | Embed data in images in such a way that it can be processed by appropriate reading equipment | yes | $9 \times 10^{-2}$
DataMatrix-Codes [fS06] | Same as QR-Codes | yes | $7.5 \times 10^{-2}$
Quick Layered Response (QLR) Codes [DD12] | Enhance QR-Codes capacity with RGB colors | yes | $2.7 \times 10^{-1}$
Unsynchronized 4D Barcodes [LB07] | Enhance DataMatrix-Codes capacity by utilizing the additional dimensions color and time | yes | $7.5 \times 10^{-2} \times \#fr.$
PNG-Store [Hen13] | Use PNG-images as GZip alternative | no | 3
Filr [Tom13] | Store single files on Flickr | no | 3
Table 1.1.: Overview Approaches Related Work
The first column shows the name of the approach and its reference, the second column briefly describes the use case of the specific approach, the third column shows whether the approach is capable of dealing with JPEG-compression, and the fourth column shows the best available data ratio in bytes per pixel.
We see that the approaches that have the best data ratios are [Hen13] and [Tom13]. The disadvantage of both is that they are not capable of working with JPEG-compressed images.
On the other hand, [XSSL04], the JPEG-compression immune approach in this portfolio of approaches, is worlds apart from being usable with respect to its data ratio.
The approaches [QRc13b], [fS06], [DD12], and [LB07] utilize 2D-Code techniques and have an acceptable data ratio. At the same time, they are usable in a scenario in which a certain degree of JPEG-compression is applied to images.
The encoding specifications of [BKI07] are described very vaguely and are therefore difficult to assess.
2. Encoding Approaches
This chapter describes how to encode bytes into the pixels of an image, as well as how to regain such encoded bytes. Twelve different encoding approaches are presented that serve this purpose. These encoding approaches can be divided into two main categories: the approaches using a single layer of colors (SL-approaches) and the approaches using multiple layers of colors (ML-approaches). Figure 2.1 gives an overview of the output images of all encoding approaches, generated with the same input data (the bytes used to generate this output can be found in Appendix A):
SL2 SL3 SL4 SL7 SL16 SL256
ML2 ML3 ML4 ML7 ML16 ML256
Figure 2.1.: Sample Output of all Encoding Approaches
As can be seen, the resulting images differ in their size and the number of colors utilized. Every encoding approach has its unique features and is therefore useful with respect to the application discussed later and the treatment of uploaded images by specific photo sharing websites (e.g. JPEG-compression).
On the following pages, we present all important steps taken to achieve the resulting portfolio of encoding approaches. This is followed by a summary of benchmark comparisons between the different approaches.
2.1. Intention
The first test subject to exploit was Yahoo’s photo sharing website Flickr [Fli13a]. For the implementation, we used the programming language Java in combination with the flickrj-android API [Yu13] as middleware to get access to Flickr’s web API. First we looked at how Flickr treats images: we uploaded some test images stored in the lossless image file format PNG [W3C13b] and then downloaded the hosted versions again. This showed that Flickr applies JPEG-compression [UNI13] to uploaded images (this has changed since Flickr restarted its service, see Chapter 3.3). The downloaded images differed from the originals with respect to the color fidelity of their single pixels as well as their overall disk space usage.
Therefore we did some research into making images more robust against JPEG-compression and found papers related to image-steganography. In particular, [KTC09]’s JPEG-compression immune approach seemed interesting for our purpose of handling Flickr’s post-processing. With this method it is possible to regain information embedded in images without bit errors, even if the highest level of JPEG-compression has been applied to these images. The problem of this approach with respect to our goal of utilizing Flickr’s storage is, as previously mentioned, its poor data ratio.
Further investigations led us to 2D-Codes. Commonly available 2D-Code approaches solve the task of embedding data in images by utilizing blocks of pixels. We focused on the nowadays widely used QR-Codes [QRc13b]. Tests in which we uploaded QR-Code images with random input bytes to Flickr showed that QR-Codes can still be interpreted by their decoders after Flickr’s post-processing. This means that the data stored in these images survives the applied JPEG-compression. As a result, QR-Codes could be used to host data on Flickr. Nevertheless, a big drawback of QR-Codes, and of 2D-Codes in general, is that they are optimized to be camera-readable, meaning that they spend a lot of effort on readability and error correction. In our photo sharing website scenario images are interpreted directly by computers and do not have to take a detour over cameras. Therefore, 2D-Codes waste a lot of image information on features that are unnecessary in our case, which renders them impractical with respect to efficiency, although they are more efficient than [KTC09]’s approach.
At this point, we realized that the available approaches do not really satisfy the requirements of our scenario and decided to develop our own encoding approaches optimized for our purposes. The work with 2D-Codes provided some good inspiration of how data can be embedded in pixels, and consequently we focused on pixels as atomic storing units for our own approaches.
2.2. First Encoding
The main idea for the first encoding was that, if we have an image that consists of the colors black and white, every pixel in such an image has two possible states: the state black and the state white. A bit also has two states: the state 0 and the state 1. Therefore each can be used as a substitute for the other.
We define:
1. There are two color classes: the class black and the class white. Black stands for the binary digit 0, whereas white is represented by the binary digit 1.
2. Every 8 pixels in a row represent one coherent data unit (byte) in the image.
3. Every pixel that has a color value smaller than or equal to $\frac{white}{2}$ (white represents the highest color value available in a specific image) is interpreted as a pixel of color class black, and every pixel with a greater color value is interpreted as a pixel of color class white:
$c_b \leq \frac{white}{2} < c_w$
where $c_b$ denotes a color value of class black and $c_w$ a color value of class white.
This encoding will hereafter be denoted as SL2 (details about naming conventions are discussed later in this chapter).
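The three definitions can be illustrated with a short Python sketch. It assumes 8 bit greyscale values, so that white equals 255; the function names and the sample row of deviating grey values are hypothetical and chosen only for this illustration.

```python
WHITE = 255  # assumption: 8 bit greyscale, white is the highest color value


def classify(value, white=WHITE):
    """Definition 3: values <= white/2 are class black (0), others white (1)."""
    return 0 if value <= white / 2 else 1


def row_to_byte(row):
    """Definition 2: interpret a row of 8 pixels as one unsigned byte."""
    if len(row) != 8:
        raise ValueError("a data unit consists of exactly 8 pixels")
    byte = 0
    for value in row:
        # Append the bit of each pixel, most significant bit first.
        byte = (byte << 1) | classify(value)
    return byte


# Hypothetical row whose grey values deviate from pure black and white.
noisy_row = [12, 210, 30, 5, 0, 240, 40, 199]
bits = [classify(value) for value in noisy_row]
print(bits, row_to_byte(noisy_row))
```

The threshold makes the mapping tolerant against moderate color deviations: as long as a pixel stays on the correct side of white/2, its bit is recovered unchanged.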
2.2.1. Encoding Workflow
Figure 2.2 shows the encoding workflow corresponding to this first encoding approach (the random bytes listed in Appendix A were used as input, cf. the resulting image of SL2 in Figure 2.1):
Figure 2.2.: Encoding Workflow - From an Array of Bytes to the final Image (input bytes $69_{10}$, $65_{10}$, $-38_{10}$, $-70_{10}$, $-72_{10}$; Step 1: convert to binary; Step 2: map bits to pixels; Step 3: add parts together)
To generate an image from a given array of input bytes, we go sequentially through it and then encode one byte after another into a pixel row. The resulting rows are then fused together to the final image. In detailed steps:
(Step 1) At first the given input bytes are converted from the decimal system to their corresponding binary representation (negative two’s complement values are interpreted unsigned). As an example we take the first input byte value 6910. This value is converted to the binary system, resulting in the 7 bit long string 10001012. We specified that every encoded byte row consists of 8 digits, respectively pixels, thus omissible leading bits with value 0 are appended, so that the resulting bit string fits the length of 8 digits. In this example we get 010001012 as a result.
(Step 2) The received binary representation of the byte then can be mapped to pix-
14 2.2. First Encoding 15 els. Like we defined: 0 will be encoded as black pixel (b) and 1 as white pixel (w). As a result we, derive from the binary number 01001012 the pixel row bwbbbwbw.
(Step 3) Finally, we add all pixel rows together and get an image that contains all input bytes encoded in its pixels.
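The three encoding steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the thesis implementation: the function name `sl2_encode`, the grey values 0 and 255 for black and white, and the representation of the image as a list of pixel rows are all chosen here for clarity.

```python
BLACK, WHITE = 0, 255  # assumed 8-bit grey values for the two color classes


def sl2_encode(data):
    """Encode each byte as one row of 8 pixels (0 -> black, 1 -> white)."""
    rows = []
    for value in data:
        unsigned = value & 0xFF         # interpret two's complement bytes as unsigned
        bits = format(unsigned, "08b")  # Step 1: binary string padded with leading zeros
        rows.append([WHITE if bit == "1" else BLACK for bit in bits])  # Step 2
    return rows                         # Step 3: the stacked rows form the image


image = sl2_encode([69, 65, -38, -70, -72])
print("".join("w" if p == WHITE else "b" for p in image[0]))  # prints "bwbbbwbw"
```

The first row reproduces the pixel row derived for the byte 69 in Step 2.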
2.2.2. Decoding Workflow
Figure 2.3 shows the general decoding workflow for regaining the bytes stored in an image:
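[Figure 2.3 content: each pixel row of the input image is inspected and its bits are regained (Step 1), the bit strings $01000101_2$, $01000001_2$, $11011010_2$, and $10111010_2$ are converted to the decimal values $69_{10}$, $65_{10}$, $-38_{10}$, and $-70_{10}$ (Step 2), and the bytes are merged (Step 3).]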
Figure 2.3.: Decoding Steps - From Image to the regained Array of Bytes
The starting point in the decoding process is an image that contains embedded bytes. In the example image these bytes equal the bytes that were used in the encoding workflow, but the actual image is different. It is a JPEG-compressed version of the original output image (e.g. downloaded from a photo sharing website like Flickr) and therefore contains deviating pixel colors. To decode the embedded data from this image, the following steps are required:
(Step 1) Every data unit is stored in a row of 8 pixels, so the image is traversed sequentially from top to bottom and each such row is processed. Because the color values of pixels can deviate from the given color classes (in this case black and white), every pixel color in a row is inspected and assigned to its nearest color class via a threshold. In the workflow this inspection is symbolized by a magnifier. Consider the first eight pixels of the input image: the magnified pixel has a grey color value of 210. Since this value is greater than $\frac{white}{2}$, i.e. greater than 127, the pixel is interpreted as a pixel of color class white. Next, the determined color class is substituted by its corresponding bit, and each received bit is appended to the bits extracted in previous iterations. As a result, we obtain the binary representation of the byte stored in a specific row; for example, from the image's first row we obtain $01000101_2$.
(Step 2) The received binary string that represents the embedded byte is then converted back to the decimal system. In this case we receive the number $69_{10}$ from the binary string $01000101_2$.
(Step 3) All extracted bytes are merged together. As a result, we receive the original array of input bytes that has been stored before.
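The decoding steps can be sketched as follows. The function name `sl2_decode`, the list-of-rows image representation, and the distorted grey values in `noisy_rows` are assumptions made for illustration (the values are invented, not measured from a real JPEG); only the threshold at half of the white value is taken from the text.

```python
def sl2_decode(rows, white=255):
    """Recover signed bytes from rows of 8 grey values using the white/2 threshold."""
    threshold = white // 2  # 127 for 8-bit images
    data = []
    for row in rows:
        # Step 1: assign every pixel to its color class and substitute the bit
        bits = "".join("1" if pixel > threshold else "0" for pixel in row)
        unsigned = int(bits, 2)  # Step 2: binary -> decimal
        # map the unsigned value back to a signed two's complement byte
        data.append(unsigned - 256 if unsigned > 127 else unsigned)
    return data  # Step 3: the merged array of bytes


# Hypothetical grey values as they might look after lossy compression.
noisy_rows = [
    [12, 210, 30, 5, 18, 240, 9, 201],     # close to bwbbbwbw
    [230, 199, 40, 250, 222, 3, 180, 60],  # close to wwbwwbwb
]
print(sl2_decode(noisy_rows))  # prints [69, -38]
```

As long as compression does not push a pixel across the threshold, the original bytes are regained unchanged.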
2.2.3. Results
Analyses with several test cases and random test data showed that the encoding introduced here (SL2) makes it possible to store arbitrary bytes on the photo sharing website Flickr without the data being harmed by its post-processing, i.e. its JPEG compression. The data ratio of such a generated image is 1 byte to 8 pixels. In comparison with commonly available approaches such as 2D codes, this is a considerable capacity gain with respect to the generated image dimensions. For example, the QR-Code and DataMatrix-Code shown in Figure 1.2 hold the same array of bytes as we used in the SL2 workflow examples, but both need many more pixels to store it. Figure 2.4 shows a visual comparison of the three outputs (the array of input bytes can be found in Appendix A):
[Figure 2.4 panels: (a) QR-Code, 25 px × 25 px; (b) DataMatrix, 20 px × 20 px; (c) SL2, 8 px × 18 px]
Figure 2.4.: Output Comparison QR-Code vs. DataMatrix-Code vs. Own Encoding (SL2)
The comparison in detail:
• An array of 18 input bytes needs an image with pixel dimensions of 25 × 25 to be represented in a QR-Code using the lowest error-correction level available, which is a consumption of 625 pixels or a pixel ratio of 1 byte to ∼35 pixels.
• The same data can be stored in a DataMatrix-Code with 20 × 20 pixels. That is 400 pixels in sum, and the ratio is 1 byte to ∼22 pixels.
• The SL2, in contrast, needs just 144 pixels in sum and has a data ratio of 1 byte to 8 pixels. These are 481 pixels fewer, or only ∼23% of the pixels needed for the encoding with QR-Code, and 256 pixels fewer than the DataMatrix-Code, i.e. only ∼36% of the DataMatrix-Code’s pixel usage.
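The arithmetic behind this comparison can be rechecked with a short script. The variable names are chosen here for illustration; the pixel dimensions and the payload size of 18 bytes are the ones stated in the comparison.

```python
payload_bytes = 18
pixels = {"QR-Code": 25 * 25, "DataMatrix": 20 * 20, "SL2": 8 * payload_bytes}

for name, count in pixels.items():
    ratio = count / payload_bytes            # pixels consumed per stored byte
    sl2_share = 100 * pixels["SL2"] / count  # SL2 pixel usage relative to this code
    saved = count - pixels["SL2"]            # pixels saved by switching to SL2
    print(f"{name}: {count} px, {ratio:.1f} px/byte, "
          f"SL2 needs {sl2_share:.0f}% of that ({saved} px fewer)")
```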
These results make the SL2 a good solution for embedding data in images and hosting it on photo sharing websites. It is therefore used as the basis for the other encodings introduced later in this thesis.
2.3. Advanced Encoding
The next logical step to enhance the data capacity of the previously described SL2 is to utilize images with more colors instead of only black and white. For example, a greyscale image could serve as data container, and an additional grey color could be utilized for the encoding. As a result, three different states are available: the state black, the state grey, and the state white. This allows the switch from the binary numeral system, which has two states, to the less familiar ternary system with three states: the state 0, the state 1, and the state 2.
The number of utilized colors, i.e. digits, and the data ratio depend on each other. To determine this dependence and how the choice of the utilized numeral system influences the data ratio, some basics on numeral systems follow.
2.3.1. Basics on Numeral Systems
A natural number $z \in \mathbb{N}_0$ consisting of $n$ place-values in a numeral system of base $\gamma$ can be expressed by the following polynomial (information derived from [KW05], p.33ff):

\[ z = \sum_{i=0}^{n-1} z_i \gamma^i \tag{2.1} \]

where $0 \le z_i < \gamma$. The unique base $\gamma$ of a numeral system is the so-called radix. It determines the number of digits available in a specific system, which range from 0 to $\gamma - 1$. Because in our context the digits of a number represented in a specific numeral system are mapped to the colors of an image, $\gamma$ also determines the number of colors that are utilized to encode a number of such a numeral system.
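Equation 2.1 and its inverse can be illustrated with a small sketch. The helper names `from_digits` and `to_digits` are hypothetical, and digits are listed most significant first, so the digit at list position `pos` corresponds to $z_{n-1-pos}$.

```python
def from_digits(digits, radix):
    """Evaluate z = sum(z_i * radix**i); digits are given most significant first."""
    n = len(digits)
    return sum(d * radix ** (n - 1 - pos) for pos, d in enumerate(digits))


def to_digits(z, radix, n):
    """Return the n place-values of z in the given radix, most significant first."""
    digits = []
    for _ in range(n):
        z, remainder = divmod(z, radix)  # peel off the least significant digit
        digits.append(remainder)
    return digits[::-1]


print(from_digits([0, 1, 0, 0, 0, 1, 0, 1], 2))  # prints 69
print(to_digits(248, 3, 6))                      # prints [1, 0, 0, 0, 1, 2]
```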
2.3.2. Relationship between Radix γ and Data Ratio
The choice of radix $\gamma$ influences the data ratio of an encoding by defining the length of the pixel chunk that is able to represent a byte. In the following, we determine how many place-values, i.e. pixels, are needed for this representation.
To determine the maximum number $z_{max}$ that can be represented with a specific radix $\gamma$ and $n$ available place-values, we set $z_i = \gamma - 1$ in Equation 2.1 for each $i \in \{0, \ldots, n-1\}$, because $(\gamma - 1)$ equals the highest digit available in a specific numeral system:

\begin{align}
z_{max} &= ((\gamma - 1) \times \gamma^{n-1}) + ((\gamma - 1) \times \gamma^{n-2}) + \ldots + (\gamma - 1) \tag{2.2} \\
&= (\gamma^{n} - \gamma^{n-1}) + (\gamma^{n-1} - \gamma^{n-2}) + \ldots + (\gamma - 1) \tag{2.3} \\
&= \gamma^{n} - 1 \tag{2.4}
\end{align}

As a result, we find that with $n$ place-values it is possible to represent the numbers from 0 to $\gamma^{n} - 1$.

In the next step, we want to determine the lowest number of place-values $n$ needed to represent the values of one byte. Since 255 is the greatest possible value of a byte, it must hold that $z_{max} = \gamma^{n} - 1 \ge 255$. We derive:

\begin{align}
\gamma^{n} - 1 &\ge 255 \tag{2.5} \\
\gamma^{n} &\ge 256 \tag{2.6} \\
n &\ge \log_{\gamma}(256) \tag{2.7}
\end{align}

The smallest natural number $n$ that satisfies Inequality 2.7 is $n = \lceil \log_{\gamma}(256) \rceil$. From now on we denote by

\[ p_{SL} = \lceil \log_{\gamma}(256) \rceil \tag{2.8} \]

the lowest number of pixels needed to represent one byte in a specific numeral system with radix $\gamma$.
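Equation 2.8 can be evaluated without floating-point logarithms by searching for the smallest $n$ with $\gamma^n \ge 256$, which is exactly the condition in Inequality 2.6. The function name `pixels_per_byte` is a hypothetical choice for this sketch.

```python
def pixels_per_byte(radix):
    """Smallest n with radix**n >= 256, i.e. ceil(log_radix(256))."""
    if radix < 2:
        raise ValueError("radix must be at least 2")
    n, capacity = 0, 1
    while capacity < 256:  # capacity equals radix**n
        capacity *= radix
        n += 1
    return n


for radix in (2, 3, 4, 16, 256):
    print(radix, pixels_per_byte(radix))
```

Using integer multiplication avoids rounding problems that a floating-point logarithm can cause when 256 is an exact power of the radix.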
We now consider the ternary system (radix $\gamma = 3$) as an example. In this case, $p_{SL} = \lceil \log_3(256) \rceil = 6$, meaning that 6 pixels are needed to embed a byte in an image utilizing the ternary system. This example also shows that the full number range of a numeral system with the minimal number of place-values needed to represent one byte is not always exploited. With the ternary system and 6 place-values it would be possible to represent a range between 0 and $z_{max} = 3^6 - 1 = 728$, but a byte only uses the range from 0 to 255. Such unexploited ranges occur whenever 256 is not an integer power of the radix $\gamma$, in particular for every $\gamma$ by which 256 is not divisible without a remainder.
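The size of the unexploited range can be made concrete with a short sketch. The helper `place_values` is hypothetical; it returns the minimal number of place-values together with the number of representable values $\gamma^n$, so that a radix exploits its range fully exactly when this number equals 256.

```python
def place_values(radix):
    """Return (n, radix**n) for the smallest n with radix**n >= 256."""
    n, capacity = 0, 1
    while capacity < 256:
        capacity *= radix
        n += 1
    return n, capacity


n, capacity = place_values(3)
print(n, capacity - 1, capacity - 256)  # prints 6 728 473

# Radices whose minimal number of place-values covers exactly the range 0..255.
exact = [radix for radix in range(2, 257) if place_values(radix)[1] == 256]
print(exact)  # prints [2, 4, 16, 256]
```

For the ternary system, 473 of the 729 representable values therefore remain unused.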
2.3.3. Results
We now return to the idea of improving the data ratio of the introduced encoding technique by utilizing greyscale images with the three colors black, grey, and white as data containers instead of the formerly used black-and-white images. This enables the switch from the binary system ($\gamma = 2$) to the ternary system ($\gamma = 3$).
For this new encoding we define that:
1. There are three color classes: the color class black stands for the digit 0, the color class grey for the digit 1, and the color class white for the digit 2.
2. Every group of 6 pixels in a row represents one coherent data unit (byte) in the image.
3. The pixel color values are interpreted as follows:

\[ c_b \le \frac{white}{3} < c_g \le 2 \times \frac{white}{3} < c_w \]

where $c_b$ denotes color class black, $c_g$ color class grey, and $c_w$ color class white.
This encoding approach is from now on denoted as SL3. Figure 2.5 shows a comparison between the SL2 and the SL3:
[Figure 2.5 panels: (a) Pixel-mapping: binary system with the digits 0, 1 vs. ternary system with the digits 0, 1, 2; (b) Example Encoding: the byte $-8_{10}$ as $11111000_2$ (8 pixels) and as $100012_3$ (6 pixels)]
Figure 2.5.: Comp. Encoding Approaches: Binary System (SL2) and Ternary System (SL3)
Figure 2.5a displays how the mapping using three colors and the ternary system works in comparison to the previously discussed encoding using two colors and the binary system. Figure 2.5b shows the gain of this switch. Instead of the 8 pixels that are needed to represent one byte, it is now possible to encode a byte with 6 pixels. Storing one byte with the SL3 thus takes one quarter fewer pixels than with the SL2 (cf. also Figure 2.1, SL2 vs. SL3).
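A minimal sketch of the SL3 mapping for a single byte follows. The grey value 128 for the digit 1 is an assumption made here (the text only fixes the thresholds at one third and two thirds of the white value), and the function names are hypothetical.

```python
GREY_LEVELS = (0, 128, 255)  # assumed grey values for the digits 0, 1, 2


def sl3_encode_byte(value):
    """Return the 6 pixel values that encode one byte in the ternary system."""
    z = value & 0xFF  # interpret the byte as unsigned
    digits = []
    for _ in range(6):
        z, digit = divmod(z, 3)
        digits.append(digit)
    return [GREY_LEVELS[d] for d in reversed(digits)]


def sl3_classify(pixel, white=255):
    """Map a grey value to a ternary digit via the thresholds white/3 and 2*white/3."""
    if pixel <= white / 3:
        return 0
    if pixel <= 2 * white / 3:
        return 1
    return 2


def sl3_decode_row(row):
    """Rebuild the unsigned byte value from a row of 6 (possibly distorted) pixels."""
    value = 0
    for pixel in row:
        value = value * 3 + sl3_classify(pixel)
    return value


row = sl3_encode_byte(-8)
print(row)  # prints [128, 0, 0, 0, 128, 255]
distorted = [p + 10 if p < 200 else p - 10 for p in row]  # made-up distortion
print(sl3_decode_row(distorted))  # prints 248, the unsigned value of -8
```

With three color classes the distance between a color and its nearest threshold shrinks compared to the SL2, so smaller distortions suffice to flip a digit.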
2.4. Single-Layer-Approaches
We have previously discussed the basic encoding approach SL2 and how to improve the data ratio of this technique by utilizing more colors, e.g. a third color in combination with the less familiar ternary system, leading to the SL3. Both of these encoding approaches are part of our set of six single-layer-approaches (SL-approaches). Four other encoding approaches with different $\gamma$ complement this portfolio.
Since $\gamma$ is unique to every approach and represents its utilized colors and digits simultaneously (e.g. as we see in Figure 2.5a), it can be used to distinguish between particular encoding approaches. The approaches themselves are named by their features: whether it is a single-layer (SL) or a multi-layer (ML) approach (multi-layer-approaches are discussed in Chapter 2.5), followed by its radix $\gamma$. Therefore the black-and-white approach using the binary system is denoted as SL2, whereas the approach using the colors black, white, and grey and the ternary system is denoted as SL3. Details on the naming conventions can be found in Appendix B.
The data ratio of a specific SL-approach can be determined with Equation 2.8, where $p_{SL}$ equals the number of pixels that are needed to store one byte with a specific SL-approach. The set of SL-approaches covers only those encodings that improve the bytes-per-pixel ratio compared to all encodings that use a smaller radix $\gamma$ (equivalent to fewer digits, i.e. fewer colors) to embed bytes. Encoding approaches that utilize more colors but need the same number of pixels to store a byte as encoding approaches with fewer colors are uninteresting in our context, since more colors mean less robustness against JPEG compression and output images that require more disk space. For example, the SL4 approach, which uses the quaternary system, is able to represent a byte within 4 pixels and thus has a better data ratio than the SL3, which needs 6 pixels for a byte. The SL4 is therefore part of the SL-approaches. On the other hand, an encoding approach utilizing $\gamma = 5$ would also need 4 pixels to embed a byte in an image, like the SL4, but with the disadvantage that more colors have to be utilized than with the SL4. This encoding is therefore not covered in our set of SL-approaches.
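The selection rule can be checked numerically by keeping, for every achievable pixel count, only the smallest radix that reaches it. This is a sketch under the assumption that radices from 2 to 256 are considered (a radix above 256 cannot improve on one pixel per byte); the helper name `pixels_per_byte` is hypothetical.

```python
def pixels_per_byte(radix):
    """Smallest n with radix**n >= 256 (Equation 2.8)."""
    n, capacity = 0, 1
    while capacity < 256:
        capacity *= radix
        n += 1
    return n


portfolio = []
best = None
for radix in range(2, 257):
    needed = pixels_per_byte(radix)
    if best is None or needed < best:  # strictly fewer pixels than every smaller radix
        portfolio.append(radix)
        best = needed

print(portfolio)  # prints [2, 3, 4, 7, 16, 256]
```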
Taking these requirements into account, the resulting portfolio of SL-approaches contains the following six encoding approaches: SL2, SL3, SL4, SL7, SL16, and SL256. A summary of these encoding approaches and their features is shown in Figure 2.6: