Secure Digital Documents Using Steganography and QR Code

Secure Digital Documents Using Steganography and QR Code A thesis submitted for the degree of Doctor of Philosophy By Mohamed Sameh Hassanein Department of Computer Science, Brunel University November 2014 ABSTRACT With the increasing use of the Internet several problems have arisen regarding the processing of electronic documents. These include content filtering, content retrieval/search. Moreover, document security has taken a centre stage including copyright protection, broadcast monitoring etc. There is an acute need of an effective tool which can find the identity, location and the time when the document was created so that it can be determined whether or not the contents of the document were tampered with after creation. Owing the sensitivity of the large amounts of data which is processed on a daily basis, verifying the authenticity and integrity of a document is more important now than it ever was. Unsurprisingly document authenticity verification has become the centre of attention in the world of research. Consequently, this research is concerned with creating a tool which deals with the above problem. This research proposes the use of a Quick Response Code as a message carrier for Text Key-print. The Text Key-print is a novel method which employs the basic element of the language (i.e. Characters of the alphabet) in order to achieve authenticity of electronic documents through the transformation of its physical structure into a logical structured relationship. The resultant dimensional matrix is then converted into a binary stream and encapsulated with a serial number or URL inside a Quick response Code (QR code) to form a digital fingerprint mark. For hiding a QR code, two image steganography techniques were developed based upon the spatial and the transform domains. In the spatial domain, three methods were proposed and implemented based on the least significant bit insertion technique and the use of pseudorandom number generator to scatter the message into a set of arbitrary pixels. These methods utilise the three colour channels in the images based on the RGB model based in order to embed one, two or three bits per the eight bit channel which results in three different hiding capacities. The second technique is an adaptive approach in transforming domain where a threshold value is calculated under a predefined location for embedding in order to identify the embedding strength of the embedding technique. i The quality of the generated stego images was evaluated using both objective (PSNR) and Subjective (DSCQS) methods to ensure the reliability of our proposed methods. The experimental results revealed that PSNR is not a strong indicator of the perceived stego image quality, but not a bad interpreter also of the actual quality of stego images. Since the visual difference between the cover and the stego image must be absolutely imperceptible to the human visual system, it was logically convenient to ask human observers with different qualifications and experience in the field of image processing to evaluate the perceived quality of the cover and the stego image. Thus, the subjective responses were analysed using statistical measurements to describe the distribution of the scores given by the assessors. Thus, the proposed scheme presents an alternative approach to protect digital documents rather than the traditional techniques of digital signature and watermarking. ii ACKNOWLEDGEMENTS I would like to take this opportunity extend my hearty gratitude to the aid and support of the kind people around me to accomplish this thesis in time. First and foremost, I am grateful to my supervisor, Dr George Ghinea, who has offered me invaluable support and guidance throughout my Ph.D. with his knowledge and patience. This thesis would not have been possible without the support of my family. I owe my sincerest gratitude to my family for their endless support, for which my mere expression of thanks does not suffice. Last, by no means least, I wish to offer my heartfelt thanks and gratitude to all my fellow colleagues, the academic and support staff in the Department of Computer Science at Brunel University. I am grateful to all of those with whom I have had the pleasure to work during the course of my PH.D. iii DECLARATION The following papers have been published (or submitted for publication) as a direct result of the research discussed in this thesis: Hassanein, M. S., & Ghinea, G., 2012. Text fingerprint key generation. In Internet Technology And Secured Transactions, IEEE 2012 International Conference, pp. 603-609. iv ABBREVIATIONS 1D One Dimensional 2D Bi-dimensional AC Alternate Current ANOVA Analysis of Variance BMP Bitmap Format DC Direct Current DCT Discrete Cosine Transforms DFT Discrete Fourier Transforms DSCQS Double Stimulus Continuous Quality Scale DSIS Double Stimulus Impairment Scale DSR Design Science Research DWT Discrete Wavelet Transforms EAN International Article Number ENMPP Expected Number of Modifications per Pixel FH High Frequency FL Low Frequency FM Middle Frequency GIF Graphics Interchange Format HVS Human Visual System JPEG Joint Photographic Experts Group L.C.A Letter Count After L.C.B Letter Count Before LSB Least Significant Bits MOS Mean Opinion Scores MPD Multi-Pixel Differencing MSE Mean Squared Error NLP Natural Language Processing OCR Optical Character Recognition v OPAP Optimal Pixel Adjustment Process PNG Portable Network Graphics PRNG Pseudorandom Number Generator PSNR Peak Signal-to-Noise Ratio PVD Pixel Value Differencing QR Code Quick Response Code QT Quantisation Table RGB Red, Green and Blue SSCQE Single Stimulus Continuous Quality Evaluation T.L.I Total letter Index UID Unique Identification UPC Universal Product Code vi TABLE OF CONTENTS ABSTRACT ............................................................................................................. i ACKNOWLEDGEMENTS .................................................................................... iii DECLARATION .................................................................................................... iv ABBREVIATIONS ................................................................................................. v TABLE OF CONTENTS ...................................................................................... vii LIST OF TABLES ................................................................................................ xiii LIST OF FIGURES ............................................................................................... xv Chapter 1: Information Hiding and Steganography ......................................... 1 1.1 OVERVIEW ............................................................................................... 1 1.2 RESEARCH BACKGROUND ........................................................................ 2 1.2.1 Steganography and Cryptography ......................................................... 3 1.2.2 Steganography and Watermarking ........................................................ 4 1.2.3 Steganography and Fingerprinting ........................................................ 4 1.3 RESEARCH MOTIVATIONS ........................................................................ 6 1.4 RESEARCH AIM AND OBJECTIVES ............................................................ 8 1.5 RESEARCH APPROACH ............................................................................. 9 1.6 THESIS STRUCTURE ................................................................................ 10 Chapter 2: Nuts and Bolts of Steganography ................................................... 11 2.1 OVERVIEW ............................................................................................. 11 2.2 STEGANOGRAPHY DEFINED.................................................................... 13 2.3 TAXONOMY OF INFORMATION HIDING TECHNIQUES ............................. 15 2.3.1 Hiding Method-Based Classification ................................................... 15 2.3.1.1 Insertion-Based Method: ........................................................................... 15 2.3.1.2 Substitution-Based Method: ...................................................................... 15 vii 2.3.1.3 Generation-Based Method ......................................................................... 16 2.3.1.4 Transform Domain Technique ................................................................... 16 2.3.1.5 Spread Spectrum Technique ...................................................................... 16 2.3.1.6 Statistical Technique .................................................................................. 17 2.3.1.7 Distortion Technique ................................................................................. 17 2.3.2 Cover-Type Based Classification ......................................................... 18 2.4 TEXT WATERMARKING .......................................................................... 19 2.4.1 Format Based Methods ........................................................................ 20 2.4.1.1 Line-shift encoding .................................................................................... 20 2.4.1.2 Word-shift encoding .................................................................................. 20 2.4.1.3 Feature encoding ........................................................................................ 21 2.4.1.4 White-space encoding ................................................................................ 24

Load more