Revista Informatica Economică, nr. 4 (44)/2007 37

Steganographically Encoded Data

Adrian VASILESCU, Bucharest, Romania

Steganography is the art of hiding information in ways that prevent its detection. Though steganography is an ancient craft, the onset of computer technology has given it new life. Computer-based steganographic techniques introduce changes to digital covers to embed in- formation foreign to the native covers. Such information may be communicated in the form of text, binary files, or provide additional information about the cover and its owner such as digital watermarks or fingerprints. This paper explains steganography, provides a brief his- tory and describes how steganography is applied in hiding information in images. Keywords: Steganography, information hiding, digital image, digital watermarking

teganography is the art and science of matter how unbreakable it is, will arouse sus- S writing hidden in such a way picion and may in itself be incriminating, as in that no one apart from the intended recipient some countries encryption is illegal knows of the existence of the message; this is Steganographic techniques in contrast to cryptography, where the exis- Modern steganographic techniques tence of the message itself is not disguised, • Concealing messages within the low- but the content is obscured. est bits of noisy images or sound files. The word "Steganography" is of Greek ori- • Concealing data within encrypted gin and means "covered, or hidden writing". data. The data to be concealed is first en- Its ancient origins can be traced back to 440 crypted before being used to overwrite part BC. Herodotus mentions two examples of of a much larger block of encrypted data. Steganography in The Histories of Herodo- This technique works most effectively where tus. Demeratus sent a warning about a forth- the decrypted version of data being overwrit- coming attack to Greece by writing it on a ten has no special meaning or use: some wooden panel and covering it in wax. Wax cryptosystems, especially those designed for tablets were in common use then as re-usable filesystems, add random looking padding writing surface, sometimes used for short- bytes at the end of a ciphertext so that its size hand. Another ancient example is that of His- can't be used to know what was the plaintext tiaeus, who shaved the head of his most size. Examples of software that use this tech- trusted slave and tattooed a message on it. nique include FreeOTFE and TrueCrypt. After his hair had grown the message was • Chaffing and winnowing hidden. The purpose was to instigate a revolt • Invisible ink against the Persians. Later, Johannes • Null ciphers Trithemius's book Steganographia is a trea- • Concealed messages in tampered ex- tise on cryptography and steganography dis- ecutable files, exploiting redundancy in the guised as a book on black magic. Generally, i386 instruction set. a steganographic message will appear to be • Embedded pictures in video material something else: a picture, an article, a shop- (optionally played at slower or faster speed). ping list, or some other message. This appar- • A new steganographic technique in- ent message is the covertext. For instance, a volves injecting imperceptible delays to message may be hidden by using invisible packets sent over the network from the key- ink between the visible lines of innocuous board. Delays in keypresses in some applica- documents. tions (telnet or remote desktop) can mean a The advantage of steganography over cryp- delay in packets, and the delays in the pack- tography alone is that messages do not attract ets can be used to encode data. There is no attention to themselves, to messengers, or to extra processor or network activity, so the recipients. An unhidden coded message, no steganographic technique is "invisible" to the

38 Revista Informatica Economică, nr. 4 (44)/2007 user. This kind of steganography could be in- valee Dickinson, sent information to accom- cluded in the firmware of keyboards, thus modation addresses in neutral South Amer- making it invisible to the system. The firm- ica. She was a dealer in dolls, and her letters ware could then be included in all keyboards, discussed how many of this or that doll to allowing someone to distribute a keylogger ship. The stegotext in this case was the doll program to thousands without their knowl- orders; the 'plaintext' being concealed was it- edge. self a codetext giving information about ship • Content-Aware Steganography hides movements, etc. Her case became somewhat information in the semantics a human user famous and she became known as the Doll assigns a datagram; these systems offer secu- Woman.Counter-propaganda: During the rity against a non-human adversary/warden. Pueblo Incident, US crew members of the Historical steganographic techniques USS Pueblo (AGER-2) research ship held as Steganography has been widely used in his- prisoners by North Korea communicated in torical times, especially before cryptographic sign language during staged photo ops to in- systems were developed. Examples of his- form the United States that they had not de- torical usage include: fected, but were instead captured by North • Hidden messages in wax tablets: in an- Korea. In other photos presented to the US, cient Greece, people wrote messages on the the crew members gave "the finger" to the wood, then covered it with wax so that it unsuspecting North Koreans, in an attempt to looked like an ordinary, unused tablet. discredit the pictures that showed them smil- • Hidden messages on messenger's body: ing and comfortable.The one-time pad is a also in ancient Greece. Herodotus tells the theoretically unbreakable cipher that pro- story of a message tattooed on a slave's duces ciphertexts indistinguishable from ran- shaved head, hidden by the growth of his dom texts: only those who have the private hair, and exposed by shaving his head again. key can distinguish these ciphertexts from The message, if the story is true, carried a any other perfectly random texts. Thus, any warning to Greece about Persian invasion perfectly random data can be used as a cover- plans . text for a theoretically unbreakable steg- • Hidden messages on paper written in se- anography. A modern example of OTP: in cret inks under other messages or on the most cryptosystems, private symmetric ses- blank parts of other messages . sion keys are supposed to be perfectly ran- • During and after World War II, espionage dom (that is, generated by a good Random agents used photographically produced Number Generator), even very weak ones microdots to send information back and (for example, shorter than 128 bits). This forth. Since the dots were typically extremely means that users of weak crypto (in countries small -- the size of a period produced by a where strong crypto is forbidden) can safely typewriter or even smaller -- the stegotext hide OTP messages in their session keys. was whatever the dot was hidden within. If a Additional terminology letter or an address, it was some alphabetic In general, terminology analogous to (and characters. If under a postage stamp, it was consistent with) more conventional radio and the presence of the stamp. The problem with communications technology is used; how- the WWII microdots was that they needed to ever, a brief description of some terms which be embedded in the paper, and covered with show up in software specifically, and are eas- an adhesive (such as collodion), which could ily confused, is appropriate. These are most be detected by holding a suspected paper up relevant to digital steganographic systems. to a light and viewing it almost edge on. The The payload is the data it is desirable embedded microdot would reflect light dif- to transport (and, therefore, to hide). The car- ferently than the paper. rier is the , stream, or data file into • More obscurely, during World War II, a which the payload is hidden; contrast "chan- spy for the Japanese in New York City, Vel- nel" (typically used to refer to the type of in-

Revista Informatica Economică, nr. 4 (44)/2007 39

put, such as "a JPEG image"). The resulting latter. For this reason, digital pictures (which signal, stream, or data file which has the pay- contain large amounts of data) are used to load encoded into it is sometimes referred to hide messages on the Internet and on other as the package. The percentage of bytes, sam- communication media. It is not clear how ples, or other signal elements which are modi- commonly this is actually done. For example: fied to encode the payload is referred to as the a 24-bit bitmap will have 8 bits representing encoding density and is typically expressed as each of the three color values (red, green, and a floating-point number between 0 and 1. In a blue) at each pixel. If we consider just the set of files, those files considered likely to blue there will be 28 different values of blue. contain a payload are called suspects. If the The difference between say 11111111 and suspect was identified through some type of 11111110 in the value for blue intensity is statistical analysis, it may be referred to as a likely to be undetectable by the human eye. candidate. Therefore, the least significant bit can be Countermeasures used (more or less undetectably) for some- The detection of steganographically encoded thing else other than color information. If we packages is called steganalysis. The simplest do it with the green and the red as well we method to detect modified files, however, is can get one letter of ASCII text for every to compare them to the originals. To detect three pixels. information being moved through the graph- Stated somewhat more formally, the objec- ics on a website, for example, an analyst can tive for making steganographic encoding dif- maintain known-clean copies of these mate- ficult to detect is to ensure that the changes to rials and compare them against the current the carrier (the original signal) due to the in- contents of the site. The differences (assum- jection of the payload (the signal to covertly ing the carrier is the same) will compose the embed) are visually (and ideally, statistically) payload. In general, using an extremely high negligible; that is to say, the changes are in- compression rate makes steganography diffi- distinguishable from the noise floor of the cult, but not impossible; while compression carrier. errors provide a good place to hide data, high From an information theoretical point of compression reduces the amount of data view, this means that the channel must have available to hide the payload in, raising the more capacity than the 'surface' signal re- encoding density and facilitating easier de- quires, that is, there must be redundancy. For tection (in the extreme case, even by casual a digital image, this may be noise from the observation). imaging element; for digital audio, it may be Applications noise from recording techniques or amplifi- An example from modern practice cation equipment. In general, electronics that digitize an analog signal suffer from several Image of an artic hare. By noise sources such as thermal noise, flicker removing all but the last 2 noise, and shot noise. This noise provides bits of each color compo- enough variation in the captured digital in- nent, an almost com- formation that it can be exploited as a noise pletely black image re- cover for hidden data. In addition, lossy sults. Making the result- compression schemes (such as JPEG) always ing image 85 times introduce some error into the decompressed brighter results in the im- data; it is possible to exploit this for steg- age below. anographic use as well. Steganography can be used for digital wa- Image extracted from above image. termarking, where a message (being simply The larger the cover message is (in data con- an identifier) is hidden in an image so that its tent terms — number of bits) relative to the source can be tracked or verified. hidden message, the easier it is to hide the In the era of Digital video recorder and de-

40 Revista Informatica Economică, nr. 4 (44)/2007 vices like TiVo, TV commercials authors rumors were cited many times - without ever have figured out how to make use of such showing any actual proof - by other media devices as well - by putting a hidden message worldwide, especially after the terrorist at- which becomes visible when played at tack of 9/11. frame-by-frame speed Usage in modern printer References Steganography is used by some modern printers, including HP and Xerox brand color [1] Peter Wayner, Disappearing Criptogra- laser printers. Tiny yellow dots are added to phy – Information Hiding, Steganography each page. The dots are barely visible and and Watermarking, Morgan Kaufmann, New contain encoded printer serial numbers, as York, 2004. well as date and time stamps. This is another [2] Andreas Westfeld, Steganographie: way of watermarking. Grundlagen, Analyse, Verfahrensentwick- Rumored usage in terrorism lung, Springer, March 2006. The rumors about terrorists using steg- [3] Ross Anderson, Information Hiding, anography started first in the daily newspaper Cambridge 1996 USA Today on February 5, 2001. The arti- [4] Peter Wayner, Translucent Databases, cles are still available online, and were titled Barnes&Noble, NY 2005 "Terrorist instructions hidden online", and [5] S. Katzenbeisser and F. Petitcolas, In- the same day, "Terror groups hide behind formation Hiding Techniques for Steg- Web encryption". In July of the same year, anography and digital Watermarking, Artech the information looked even more precise: House, Boston, 2000. "Militants Web with links to jihad". [6] N.F. Johnson, Z. Duric, Information Hid- A citation from the USA Today article: ing: Steganography and Watermarking - At- "Lately, al-Qaeda operatives have been send- tacks and Countermeasures, Kluwer Aca- ing hundreds of encrypted messages that demic Publishers, 2000. have been hidden in files on digital photo- graphs on the auction site eBay.com". These