Proceedings of the 4th National Conference; INDIACom-2010 Computing For Nation Development, February 25 – 26, 2010 Bharati Vidyapeeth’s Institute of Computer Applications and Management, New Delhi

Robust Steganography using secure password based protection scheme

Pradeepta Bhattacharya, B.E. Computers, Padre Conceicao College Of Engineering, Goa
Dhiraj S. Rajani, B.E. Computers, Padre Conceicao College Of Engineering, Goa
Faculty: Mr. Terence Johnson (Lecturer, IT Dept., Padre Conceicao College Of Engineering, Goa)

ABSTRACT
This study deals with constructing and implementing a new algorithm for hiding a large amount of data (text) in a color image. We have designed a cosine based transformation to create the appropriate pixels where data can be hidden. The data is not hidden sequentially; its placement is randomized by a password algorithm. This concept addresses both visual and statistical aspects. According to our design we have defined two layers of security. Higher layers of security can be achieved by encrypting the input data to confuse steganalysis.

KEYWORDS
Hiding with high capacity, high security, Cosine transform, password protection scheme.

INTRODUCTION
Steganography is the art and science of information hiding and invisible communication. Unlike cryptography, where the goal is to secure communications from an observer by making the data unreadable, steganography techniques strive to hide the very presence of a message from an observer, so that there is no knowledge of the presence of any data at all. In some situations, sending encrypted information will arouse suspicion but invisible data will not. Both sciences can be combined to produce better protection of the information. If the steganographic technique fails and the presence of data is detected, a complex embedding algorithm can make sure the data is not retrieved; if this is also compromised, encryption will make the data unreadable. Hiding data inside images is very popular these days, as an image can be spread around the World Wide Web easily.

To hide a message inside an image without changing its visual properties, the cover source can be altered in "noisy" areas with many color variations, so less attention will be drawn to the modifications. Common methods used are LSB replacement, masking, coding methods or some transformations [9, 5].

Because of the continual changes at the cutting edge of steganography, and the large amount of data involved, steganalysts have suggested using machine learning techniques to characterize images as suspicious or non-suspicious.

When using a 24-bit color image, one bit of each of the red, green and blue components can be used, so a total of 3 bits can be stored in each pixel. Thus an 800 x 600 pixel image can contain 1,440,000 bits (180,000 bytes) of secret data. But using just 3 bits out of this large number of bytes wastes space. So the main objective of the present work is to hide more than 1 bit per byte and still get results like LSB replacement (an imperceptible message). This objective is met by building a new steganographic algorithm that hides a large amount of text in an image. It is designed to work against visual attacks by making it hard for humans to discern between noise and visual patterns.

NEW STEGANOGRAPHIC ALGORITHM WITH HIGH SECURITY AND ROBUSTNESS
The new algorithm is constructed so as to have 2 layers of security (and 1 extra if required: encryption of the input text). These layers function independently to provide a strong security wall.
The first is the transform applied to the image pixels, which makes sure data is not directly embedded in the color values.
The second is the password based algorithm that decides where to put the data in the matrix of transformed color values.

DESCRIPTION OF THE STEGANOGRAPHIC ALGORITHM
In the present steganographic algorithm, two parts are considered (embedding data at the sender's end and extracting data at the receiver's end). These parts are implemented so as to satisfy the following requirements [8].
I. The algorithm must reduce the chances of statistical detection.
II. The stego image must not have any distortion artefacts.
III. The algorithm must provide robustness against a variety of image manipulation attacks.
IV. The algorithm must not sacrifice the embedding capacity in order to achieve the above mentioned requirements.

The technique we have devised centres on a transformation technique similar to DCT [1, 3, 4], with a change in the formula. During our course of study we realised DCT would not be the appropriate formula for our purpose. The prime reason is that the DCT values of each color component (R, G, B) for each pixel depend on the other color values in its corresponding row and column, so any change in those values would result in a change to our desired value.

The following are the steps involved in our encoding process.
1. Take each 8 x 8 block of pixel values, one color at a time, and perform vectoring on this block, so that the 2-D matrix is converted to a 1-D array. This is done so that finding locations to hide data becomes easier.
2. Find the locations where the data bits are to be hidden (based on the password scheme to be explained further), which generates a unique key for each red, blue and green value.
3. Determine if data is hidden in a particular block (this step confuses an intruder to the limit). This is done using the following procedure.
• Determine the values of the color stored in each of the locations selected before.
• If any one of the values is greater than 250, make the last bit of the value in the 8th location 1 (this marks that data is not hidden in the block).
• If all values are less than 250, use the same 8th location and make the LSB of the value 0.
4. If data can be hidden in that block, then take each selected value, one at a time, from the vector and apply the forward transform formula to get a frequency value. Convert the value to binary and insert 2 bits into the 1st and 2nd LSB of the value.
5. Apply the output transformation formula to get back the color values.
6. De-vector the modified vector of color values obtained from step 5. Once the R, G, B values are obtained, store them back in the image.

Repeat the above mentioned steps till all the text is successfully hidden in the image.

Once we have the password values for the red, green and blue blocks in hand, and we have found that data can be hidden in the block, we proceed as follows.

The text is hidden in binary format. We choose to hide in the 1st and 2nd LSB positions because from our previous study and knowledge we concluded that manipulating these bits does not bring about a large change in the value under consideration [4]. Therefore LSB substitution in the desired values generated by the forward transform would not affect the color value at that pixel, and thus makes the original and stego image look the same.

The following are the steps involved in the decoding process.
1. Vectoring is performed on the 8 x 8 block of color values.
2. The resulting vector is checked for the existence of data in it. As mentioned before, the 8th location in the vector serves as a marker. Check the LSB of this location. If the LSB is equal to 1, move on to the next block, as data is not hidden in this block. If the LSB is equal to 0, data exists in this block.
3. To retrieve the data, the locations in the vector are selected (based on the password scheme to be explained further). Apply the forward transform on the selected value. Convert the value to binary, extract the 1st and 2nd LSB values and store them in a Boolean array.
4. Once 7 bits are obtained, convert them to a character.

There are 2 ways to mark the end of text.
1. Either concatenate successive special characters (maybe 3), add them at the end of the text, and encode and decode them too. These characters, once decoded, mark the end of text.
2. Or use the 17th location as the mark.
• If it is the last block, the LSB of the 17th location is made 1.
• If it is not, the LSB is made 0.

We don't incorporate image validation or password validation because an intruder could exploit them in two ways. With image validation, he could find out whether an image is a stego image by testing it on our application. With password validation, he could brute-force the password without the extra overhead of running the whole algorithm each time.

DETAIL DESCRIPTION OF EACH OF THE PROCESSES
1. The forward cosine transformation:
Freq(i) = color(i) x cos(i x (pi/16)) x cos(j x (pi/16))    (eq. 1)
2. The backward cosine transformation:
Color(i) = freq(i) / (cos(i x (pi/16)) x cos(j x (pi/16))) + 1    (eq. 2)
3. Vectoring
4. De-vectoring
(Both to be explained with diagram)
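The two transforms and the 2-bit substitution can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: i and j are taken to be the row and column (0 to 7) of the value inside its 8 x 8 block, the result of eq. 1 is truncated to an integer before its LSBs are touched, and the "+1" of eq. 2 is applied after truncating the division. The function names are hypothetical. The sketch does not clamp results to the 0 to 255 range.

```python
import math


def coeff(i: int, j: int) -> float:
    """cos(i*pi/16) * cos(j*pi/16) for row i and column j (assumed 0..7)."""
    return math.cos(i * math.pi / 16) * math.cos(j * math.pi / 16)


def forward(color: int, i: int, j: int) -> int:
    """Eq. 1, truncated to an integer (assumption) so its LSBs are defined."""
    return math.floor(color * coeff(i, j))


def backward(freq: int, i: int, j: int) -> int:
    """Eq. 2 as read here: divide by the coefficient, truncate, add 1."""
    return math.floor(freq / coeff(i, j)) + 1


def embed(color: int, two_bits: int, i: int, j: int) -> int:
    """Write two data bits into the 1st and 2nd LSB of the frequency value."""
    freq = forward(color, i, j)
    freq = (freq & ~0b11) | (two_bits & 0b11)
    return backward(freq, i, j)


def extract(stego_color: int, i: int, j: int) -> int:
    """Re-apply the forward transform and read back the two LSBs."""
    return forward(stego_color, i, j) & 0b11


if __name__ == "__main__":
    stego = embed(200, 0b10, 1, 2)
    print(extract(stego, 1, 2) == 0b10)
```

Under this reading the two bits survive the round trip whenever the coefficient is strictly between 0 and 1. At position (0, 0) the coefficient is exactly 1, so the "+1" changes the LSBs and the bits are not recovered; an implementation would presumably have to avoid or special-case that position.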

Copy Right © INDIACom – 2010 2 Robust Steganography using secure password based protection scheme
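Vectoring and de-vectoring (items 3 and 4 above) are zigzag scans of an 8 x 8 block. The sketch below assumes the JPEG-style zigzag direction, since the original diagram is not available; the function names are hypothetical.

```python
def zigzag_order(n: int = 8) -> list[tuple[int, int]]:
    """(row, col) pairs of an n x n block in zigzag scan order (JPEG-style, assumed)."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    # Cells on the same anti-diagonal share r + c; odd diagonals run
    # top-right to bottom-left, even diagonals run the other way.
    return sorted(
        cells,
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def vectorize(block: list[list[int]]) -> list[int]:
    """Convert a square 2-D block into a 1-D vector by zigzag scan."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


def devectorize(vector: list[int], n: int = 8) -> list[list[int]]:
    """Inverse of vectorize: put the vector back into an n x n block."""
    block = [[0] * n for _ in range(n)]
    for value, (r, c) in zip(vector, zigzag_order(n)):
        block[r][c] = value
    return block


if __name__ == "__main__":
    block = [[r * 8 + c for c in range(8)] for r in range(8)]
    vector = vectorize(block)
    print(vector[:6])
    print(devectorize(vector) == block)
```

Note that the paper counts vector locations from 1 (the "8th location"), while Python lists are indexed from 0, so the 8th location corresponds to `vector[7]`.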

The next is the most important process, the password acceptance and usage scheme. This process is the one that provides that 1 extra level of security.

THE SECURE PASSWORD BASED EMBEDDING ALGORITHM
1. Accept the password from the user. It has to be a minimum of 7 characters; the maximum length depends on the user.
2. Check if the password is a palindrome. If it is, perform the following on it. Take the first half and insert it at the end of the first half. This is done because a palindrome is considered a bad password by the algorithm. For all further purposes in the algorithm this transformed password is taken as the password accepted.
3. Reverse the password and store it. Now the password is broken down into 2 halves (taking 1 element less in the 1st half in case of an odd length password). Each half is then treated separately. Digrams are taken 1 at a time and their elements are exchanged. The next part is important, as it gives a unique password for each of the 3 different blocks: each half is left shifted once for the red block, twice for the green block and thrice for the blue block. The halves are then concatenated to produce the intermediate password.
4. This intermediate password and its reverse are converted to integers, one letter at a time, using the ASCII representation of the characters.
5. These two are then added to produce the final password, which is used as the logic to embed data randomly. This is converted to binary, one integer at a time.
6. Each integer is then read 1 bit at a time, and the positions that hold a 1 are used as references to the locations where data is hidden, i.e. if there are three 1's in the integer's binary form then 6 bits are stored in the corresponding locations (2 bits at each location).
7. The number of blocks that can be processed using the password is equal to the length of the password. Once all the values in the password are used, the same password is repeated for the next n blocks, where n is the length of the password.

Example:
• Assume data is to be hidden in the red block. Consider that the final password generated is "gedfcab".
• Its ASCII equivalent is 103 101 100 102 99 97 98, stored in an array.
• The first value, for the first red block, is 103.
• The binary form of 103 is 1100111.
• Let i number the bits starting from 1; if bit i is 1, location 9+i holds 2 bits of the data.
• In this example bit 1 has value 1, therefore the 10th location will hold data.

VECTORING
This is the zigzag scan process used to convert the 2-D matrix to a 1-D matrix. This process is used instead of a sequential scan as it brings similar frequency component values together. This is performed before hiding data. [2, 6]

DE-VECTORING
This process is the exact opposite of the above mentioned process and is used after hiding so that the pixels can be put back in the image. The zigzag pattern is reconverted to sequential 2-D matrix form and stored back into the color block. This is performed after the data is hidden in the values.

IMPLEMENTING THE PRESENT ALGORITHM
1. Embedding the secret data into the image:
Assume that the user selects a cover image and enters the data to be hidden.
Then the user selects a password.
Then the data is converted to integers (ASCII) and then to a long binary array.
The image is taken 1 block of 8 x 8 values of each color (R, G, B) at a time.
Then, according to the password, the locations to store the data are found.
At these locations the frequency transform (eq. 1) is applied. Then at each of these positions the 2 LSB locations are used to hide 2 bits of the data.
This process is repeated for the other 2 colors and for each 8 x 8 block till the data bits are exhausted.

2. Extracting the secret data from the stego image:


The user selects the image. Then the user enters a password. The same algorithm is again applied to this password and, depending on it, the final password is found. The image is again taken 8 x 8 blocks at a time and, using this password, the algorithm is applied and the locations are found again. Then at these locations the reverse of the initial transform (eq. 2) is applied and from that particular value the 2 LSBs are extracted. This is done for every color of every block till the end of file is found (which, as mentioned before, can be marked by the 17th location in the array or by 3 or more successive special characters). These bits are converted to integers and then to characters, again using their ASCII representation. This character string is given as output.

Example 1:
The data to be embedded is: This is a new steganographic algorithm.
The password is: steganography
The image chosen is: [cover image not reproduced]
The output image is: [stego image not reproduced]

Example 2:
The data to be embedded is: this is a secrete file. Please don’t tamper with it.
The password is: stegosys
The image chosen is: [cover image not reproduced]
The output image is: [stego image not reproduced]

We make sure the visual properties of the cover and output images are exactly the same. When the output image is given to our algorithm with the same password, the exact data is given back as output.
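Steps 3 to 5 of the password scheme can be sketched as below. This is one possible reading only: "left shifted" is taken to mean a left rotation, a trailing unpaired character in a half is left in place when digrams are swapped, and the palindrome handling of step 2 is omitted because its wording is ambiguous. All function names are hypothetical.

```python
def swap_digrams(s: str) -> str:
    """Exchange the two characters of each consecutive pair; an odd last character stays."""
    chars = list(s)
    for k in range(0, len(chars) - 1, 2):
        chars[k], chars[k + 1] = chars[k + 1], chars[k]
    return "".join(chars)


def rotate_left(s: str, n: int) -> str:
    """Left rotation by n positions (assumed meaning of 'left shifted')."""
    n %= len(s)
    return s[n:] + s[:n]


def intermediate_password(password: str, shift: int) -> str:
    """Step 3 for one colour; shift is 1 (red), 2 (green) or 3 (blue)."""
    rev = password[::-1]
    mid = len(rev) // 2  # odd length: the first half gets one character less
    first, second = rev[:mid], rev[mid:]
    return rotate_left(swap_digrams(first), shift) + rotate_left(swap_digrams(second), shift)


def final_password(password: str, shift: int) -> list[int]:
    """Steps 4 and 5: add the ASCII codes of the intermediate password and its reverse."""
    inter = intermediate_password(password, shift)
    return [ord(a) + ord(b) for a, b in zip(inter, reversed(inter))]


if __name__ == "__main__":
    # Password taken from Example 2; shift 1 corresponds to the red block.
    print(intermediate_password("stegosys", 1))
    print(final_password("stegosys", 1))
```

Under this reading the final values are symmetric (a sequence added to its own reverse) and can exceed 7 bits, which differs from the 7-bit values in the paper's "gedfcab" example, so the authors' exact combination rule may be different.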

CONCLUSION
This idea has been put into practical use by developing an application that performs the functions exactly as mentioned above.

The embedding capacity is not fixed and depends on the password. Studies have been conducted on a set of 480 x 320 resolution color images. On average the hiding capacity is 2286 characters; the maximum is 4572 characters and the minimum is 571 characters. Using larger images will of course allow more space to hide data. This algorithm also takes care of the visual aspects of the image by making sure that the image looks unchanged.
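The dependence of capacity on the password can be illustrated with the "gedfcab" example from the password scheme. The sketch below assumes the bits of each password value are read most significant bit first over 7 bits, that every block is usable (no block is skipped by the 250 threshold), and that 7 bits make one character; it therefore gives an upper bound and is not expected to reproduce the measured figures above. Function names are hypothetical.

```python
def locations(value: int, base: int = 9, width: int = 7) -> list[int]:
    """Vector locations (counted from 1) selected by the 1-bits of a password value."""
    bits = format(value, "b").zfill(width)  # MSB first (assumption)
    return [base + i for i, bit in enumerate(bits, start=1) if bit == "1"]


def capacity_chars(password_values: list[int], n_blocks: int, bits_per_char: int = 7) -> int:
    """Upper bound on characters hidden in n_blocks blocks, 2 bits per selected location."""
    total_bits = 0
    for b in range(n_blocks):
        value = password_values[b % len(password_values)]  # password repeats
        total_bits += 2 * len(locations(value))
    return total_bits // bits_per_char


if __name__ == "__main__":
    values = [ord(ch) for ch in "gedfcab"]
    print(locations(values[0]))        # locations chosen by 103 = 1100111
    print(capacity_chars(values, 7))   # characters in one pass over the password
```

A password whose characters have more 1-bits selects more locations per block, which is consistent with the statement that capacity depends on the password.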

FUTURE SCOPE
This algorithm can be used with more layers of security in many applications, such as defence, for sharing secret information.

In such strategic applications, exchanging images would be considered less vulnerable than exchanging scrambled text, as it would not arouse suspicion that secret information is being shared. The code is therefore less likely to be attacked, which shields the application too.

The higher layers could be, as mentioned, encryption of the input text and/or the use of asymmetric encryption or any secure key exchange algorithm for exchanging the password.

REFERENCES
1. Mark Leddy. Information hiding techniques.
2. Ken Cabeen, Peter Gent. Image Compression and the Discrete Cosine Transform.
3. Nameer N. El Emam. Hiding large amount of data using segmentation of pixel blocks.
4. Vidyasagar M. Potdar, Muhammad A. Khan, Elizabeth Chang, Mihaela Ulieru, Paul R. Worthington. e-Forensics steganography systems for secret information retrieval.
5. Niels Provos, Peter Honeyman. Hide and Seek: An Introduction to Steganography.
6. Fred Halsall. Multimedia Communications: Applications, Networks, Protocols and Standards.
7. Max Weiss. Principles of Steganography.
8. Information and Computer Security Architecture (ICSA) Research Group. An overview of Image Steganography.
9. Dr. Jennifer Davidson. Information Hiding: The New Digital Age. Electrical & Computer Engineering, Department of Mathematics.
