IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 4, NO. 3, SEPTEMBER 2009 359 Temporal Derivative-Based Spectrum and Mel- Audio Steganalysis Qingzhong Liu, Andrew H. Sung, and Mengyu Qiao

Abstract—To improve a recently developed mel-cepstrum audio the interbands of the discrete cosine transform (DCT) domains steganalysis method, we present in this paper a method based on and combined the expanded features and the polynomial fitting Fourier spectrum statistics and mel-cepstrum coefficients, derived of the histogram of the DCT coefficients, and successfully im- from the second-order derivative of the audio signal. Specifically, the statistics of the high- spectrum and the mel-cepstrum proved the steganalysis performance in multiple JPEG images coefficients of the second-order derivative are extracted for use [4]. Other works on image steganalysis have been done by in detecting audio steganography. We also design a wavelet-based Fridrich [5], Pevny and Fridrich [6], Lyu and Farid [7], Liu and spectrum and mel-cepstrum audio steganalysis. By applying sup- Sung [8], and Liu et al. [9]–[11]. port vector machines to these features, unadulterated carrier sig- Due to different characteristics of audio signals and images, nals (without hidden data) and the steganograms (carrying covert data) are successfully discriminated. Experimental results show methods developed for image steganalysis are not directly that proposed derivative-based and wavelet-based approaches re- suitable for detecting information hiding in audio streams, and markably improve the detection accuracy. Between the two new many research groups have investigated audio steganalysis. methods, the derivative-based approach generally delivers a better Ru et al. presented a method by measuring the features be- performance. tween the signal under detection and a self-generated reference Index Terms—Audio, mel-cepstrum, second-order derivative, signal via linear predictive coding [12], [13], but the detection spectrum, steganalysis, support vector machine (SVM), wavelet. performance is poor. Avcibas designed a feature set of con- tent-independent distortion measures for classifier design [14]. Ozer et al. constructed a detector based on the characteristics I. INTRODUCTION of the denoised residuals of the audio file [15]. Johnson et al. set up a statistical model by building a linear basis that captures TEGANOGRAPHY is the art and science of hiding data in certain statistical properties of audio signals [16]. Craver et S digital media such as image, audio, and video files, etc. To al. employed cepstral analysis to estimate a stego-signal’s the contrary, steganalysis is the art and science of detecting the probability density function in audio signals [17]. Kraetzer and information-hiding behaviors in digital media. Dittmann recently proposed a mel-cepstrum-based analysis to In recent years, many steganalysis methods have been de- perform detection of embedded hidden messages [18], [19]. signed for detecting information-hiding in multiple steganog- By expanding the Markov approach proposed by Shi et al. for raphy systems. Most of these methods are focused on de- image steganalysis [3], Liu et al. designed expanded Markov tecting digital image steganography. For example, one of the features for audio steganalysis [20]. Additionally, Zeng et al. well-known detectors, histogram characteristic function center presented new algorithms to detect phase coding steganog- of mass (HCFCOM), was successful in detecting noise-adding raphy based on analysis of the phase discontinuities [21] and steganography [1]. Another well-known method is to construct to detect steganography based on statistical moments the high-order moment statistical model in the multiscale of peak frequency [22]. In all these methods, Kraetzer and decomposition using wavelet-like transform and then to apply Dittmann’s proposed mel-cepstrum audio analysis is particu- a learning classifier to the high-order feature set [2]. Shi et larly noticeable, because it is the first time that mel-frequency al. proposed a Markov-process-based approach to detect the cepstral coefficients (MFCCs), which are widely used in speech information-hiding behaviors in JPEG images [3]. Based on the recognition, are utilized for audio steganalysis. Markov approach, Liu et al. expanded the Markov features to In this paper, we propose an audio steganalysis method based on spectrum analysis and mel-cepstrum analysis of the second- Manuscript received December 04, 2008; revised May 04, 2009. First order derivative of audio signal. In spectrum analysis, the statis- published June 10, 2009; current version published August 14, 2009. This tics of the high-frequency spectrum of the second-order deriva- work was supported by Institute for Complex Additive Systems Analysis tive are extracted as spectrum features. To improve Kraetzer (ICASA), a research division of New Mexico Tech. The associate editor coordinating the review of this manuscript and approving it for publication was and Dittmann’s work [18], we design the features of mel-cep- Dr. Jessica J. Fridrich. strum coefficients that are derived from the second-order deriva- Q. Liu and A. H. Sung are with the Department of Computer Science and tive. Additionally, in comparison to the second-order derivative- Engineering and Institute for Complex Additive Systems Analysis, New Mexico Tech, Socorro, NM 87801 USA (e-mail: [email protected]; [email protected]). based approach, a wavelet-based spectrum and mel-cepstrum M. Qiao is with the Department of Computer Science and Engineering, New method is also designed. Support vector machines (SVMs) with Mexico Tech, Socorro, NM 87801 USA (e-mail: [email protected]). radial basis function (RBF) kernels [35] are employed to detect Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. and differentiate steganograms from innocent signals. Results Digital Object Identifier 10.1109/TIFS.2009.2024718 show that our derivative-based and wavelet-based methods are

1556-6013/$26.00 © 2009 IEEE

Authorized licensed use limited to: NEW MEXICO INST OF MINING & TECH. Downloaded on August 31, 2009 at 18:45 from IEEE Xplore. Restrictions apply. 360 IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 4, NO. 3, SEPTEMBER 2009

(5)

(6)

where and is the number of samples of the derivatives. We have

(7)

Assume that is the angle between the vectors and , then

(8)

Since is arbitrary, the expected value of is calculated as follows:

Fig. 1. Example of edge detection using derivatives [23]. very promising and possess remarkable advantage over Kraetzer (9) and Dittmann’s work. The rest of the paper is organized as follows: Section II presents the second-order derivative for audio steganalysis Divide both sides by and the ; Section III introduces Kraetzer and Dittmann’s mel-cepstrum analysis and describes improved (10) mel-cepstrum methods; Section IV presents experiments, fol- lowed by discussion in Section V and conclusion in Section VI. Generally speaking, is far smaller than at II. TEMPORAL DERIVATIVE AND SPECTRUM ANALYSIS low-frequency and middle-frequency components, where In image processing, second-order derivative is widely em- the modification caused by the addition of hidden data is ployed for detecting isolated points, edges, etc. [23]. Fig. 1 negligible. However, the situation changes at high-frequency shows an example of edge detection by using second-order components. Digital audio signals are generally band-limited, derivative. With this in mind, we developed a scheme based on the power is zero or very close to zero above the second-order derivative for audio steganalysis, details of a certain finite frequency. On the other side, the error term which are described as follows. is assumed to be broadband; in such cases, the modification A digital audio signal is denoted as caused by the addition of hidden data is not negligible in . The second derivative of is , defined as high-frequency components. Assume an error to be a random signal with the expected value of zero. The spectrum is approximately depicted by a (1) Gaussian-like distribution [24]. The power is zero at the lowest frequency; as the frequency increases, the spectrum increases. The stego-signal is denoted , which is modeled by adding Fig. 2(a) shows a simulated error signal, consisting of 25% for a noise or error signal into the original signal 1s, 50% for 0 s, and 25% for 1 s. In this example, we assume the sampling rate is 1000 Hz. Fig. 2(b) is the spectrum distribu- (2) tion of second-order derivatives (only half the values are plotted due to data symmetry). It demonstrates that the energy of the The second-order derivatives of error term and signal derivatives is concentrated in high frequency. are denoted as and , respectively. Thus, Regarding the second-order derivative, at the low and middle (3) frequency components, the power spectrum of an audio signal is normally much stronger than the power spectrum of the error The discrete Fourier transforms of , , and term caused by data hiding, in other words, is al- are denoted as , , and , respectively, most equal to zero, based on (10); the difference of the spectrum between a cover and the stego-signal is suppressed at low and (4) 1Available: http://mathworld.wolfram.com/FourierTransformGaussian.html

Authorized licensed use limited to: NEW MEXICO INST OF MINING & TECH. Downloaded on August 31, 2009 at 18:45 from IEEE Xplore. Restrictions apply. LIU et al.: TEMPORAL DERIVATIVE-BASED SPECTRUM AND MEL-CEPSTRUM AUDIO STEGANALYSIS 361

Fig. 2. (a) Random error signal consisting of 25% for 1s, 50% for 0 s, and 25% for 1 s; (b) spectrum of the second-order derivative.

Fig. 3. Spectra of the second derivatives of a cover signal (left) and the stego-signal (right). middle frequency components. However, the situation is very In comparing signal spectrum to derivative spectrum, we also different at the high-frequency components. As frequency in- observe that Fig. 4 demonstrates the spectra of the same cover creases, increases, and is limited above a certain fre- and stego-signals without the extraction of the second deriva- quency, the increase of the spectrum resulted from hidden data tives. Similarly, the addition of hidden data increases the mag- is not negligible anymore; hence, the statistics extracted from nitude in high frequency although the energy is dominated in the high-frequency components give a clue to detect the infor- low frequency. mation-hiding behavior. It is worth noting that a comparison between Figs. 3 and 4 Fig. 3 shows the spectrum distribution of the second deriva- shows that the second derivative amplifies the energy in high tive of a 44.1-kHz audio cover and the distribution of the frequency; especially, it amplifies the energy contributed by the second derivative of the stego-signal that is generated by addition of the hidden signal. Therefore, a preprocessing step to embedding some data into the cover. It clearly shows that the extract the second-order derivative could be more effective for high-frequency spectrum of the second-order derivative of the detecting the hidden signal. stego-signal has higher magnitude values, in comparison with Next, we present the following procedure to extract the sta- the cover. tistical characteristics of the spectrum.

Authorized licensed use limited to: NEW MEXICO INST OF MINING & TECH. Downloaded on August 31, 2009 at 18:45 from IEEE Xplore. Restrictions apply. 362 IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 4, NO. 3, SEPTEMBER 2009

Fig. 4. Spectra of the same cover and stego-signals that are used in Fig. 3.

1) Obtain the Fourier spectrum of the second-order derivative [27]. Recently, Kraetzer and Dittmann proposed a signal-based of the audio signal under test. mel-cepstrum audio steganalysis [18]. Based on (11), we de- 2) Calculate statistics, including mean value, standard devia- sign a second derivative-based mel-cepstrum audio steganalysis tion, and skewness, of the different frequency zones over to improve Kraetzer and Dittmann’s work. The details are the spectrum. In our experiments, we equally divide the en- described in Section III. tire frequency zone into ( is set to ) zones or parts, from the lowest to the highest frequency. The mean III. IMPROVED MEL-CEPSTRUM AUDIO STEGANALYSIS value, standard deviation, and skewness of the th zone are denoted , , and , respectively. In speech processing, mel-frequency cepstrum (MFC) is a 3) Choose the values , , and that are extracted from representation of the short-term power spectrum of a , the high-frequency spectrum as the features. based on a linear cosine transform of a log power spectrum on The expected value of is given in (9). The expected a nonlinear of frequency. To convert Hz into mel value of the variance is obtained by using the following: use the following:

(12)

MFCCs are coefficients that collectively make up an MFC. MFCCs are commonly derived from the following processes [28]: 1) take the of (a windowed excerpt of) a signal; (11) 2) map the powers of the spectrum obtained above onto the mel scale, using triangular overlapping windows; According to (11), the rate of power change in different 3) take the logs of the powers at each of the mel ; spectrum bands of the stego-audio is different from the original 4) take the DCT of the list of mel log powers, as if it were a cover. Generally, the cepstrum may be interpreted as informa- signal; tion for the power change; it was defined by Bogert, Healy, and 5) the MFCCs are the amplitudes of the resulting spectrum. Tukey in [25]. Reynolds and McEachern showed a modified Fig. 5 (available at http://www.ee.bilkent.edu.tr/ cepstrum called mel-cepstrum for [26], ~onaran/SP-4.pdf) shows a fast Fourier transform (FFT)-based

Authorized licensed use limited to: NEW MEXICO INST OF MINING & TECH. Downloaded on August 31, 2009 at 18:45 from IEEE Xplore. Restrictions apply. LIU et al.: TEMPORAL DERIVATIVE-BASED SPECTRUM AND MEL-CEPSTRUM AUDIO STEGANALYSIS 363

and

(16)

Fig. 5. FFT-based mel-cepstrum procedure (available at http://www.ee. Temporal derivative-based high-frequency spectrum statis- bilkent.edu.tr/~onaran/SP-4.pdf). tics, described in Section II, and derivative-based mel-cepstrum coefficients, derived from (15) and (16), form the feature vector for detecting the information hiding in digital audio signals. mel-cepstrum computation; the more technical details are To improve the original mel-cepstrum audio steganalysis, given in [29]. a wavelet-based mel-cepstrum approach is also designed. We Mel-cepstrum is commonly used for representing the apply a wavelet transform to signal and get an approximation human voice and musical signals. Inspired by the success sub-band and a detail sub-band. Let denote the detail co- in speech recognition, Kraetzer and Dittmann proposed efficient sub-band; we replace in (13) and (14) with and mel-cepstrum-based speech steganalysis, including the fol- obtaine the MFCCs and FMFCCs as follows: lowing two types of mel-cepstrum coefficients [18]: 1) MFCCs, . is the number of MFCCs, for a signal with a sampling rate of 44.1 kHz , calculated by the following equation, where MT (17) indicates the mel-scale transformation:

and (13)

2) Filtered mel-frequency cepstral coefficients (FMFCCs), . is the number of FMFCCs, (18) calculated by the following equation:

IV. EXPERIMENTS

A. Setup (14) We obtained 6000 mono and 6000 stereo 44.1-kHz 16-bit quantization in uncompressed, PCM coded WAV audio signal files, covering different types such as digital speech, on-line In (14), the role of speech band filtering is to remove the broadcast in different languages, for instance, English, Chi- speech relevant bands (the spectrum components between 200 nese, Japanese, Korean, and Spanish, and music (jazz, rock, and 6819.59 Hz) [18]. blues). Each audio has the duration of 19 s. We produced the To improve mel-cepstrum-based audio steganalysis, same amount of the stego-audio signals by hiding different following Kraetzer and Dittmann’s work, we design the message in these audio signals. The hiding tools/algorithms second-order derivative-based MFCCs and FMFCCs, obtained include Hide4PGP V4.0,2 Invisible Secrets,3 LSB matching by replacing the signal in (13) and (14) with the second-order [30], and Steghide [31]. The hidden data include voice, video, derivative ; the calculation is listed as follows: image, text, and executable codes, which were encrypted before embedding by using different keys. We also produced audio steganograms by hiding random bits. The covert data in any two audio streams are different. All the covers and steganograms are available at http://www.cs.nmt.edu/~IA/steganalysis.html. (15) 2Available: http://www.heinz-repp.onlinehome.de/Hide4PGP.htm 3Available: http://www.invisiblesecrets.com/

Authorized licensed use limited to: NEW MEXICO INST OF MINING & TECH. Downloaded on August 31, 2009 at 18:45 from IEEE Xplore. Restrictions apply. 364 IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 4, NO. 3, SEPTEMBER 2009

Fig. 6. P-values in one-way ANOVA analysis for the spectrum features over the whole frequency band. The x-axis corresponds to the low- to high-frequency component, from left to right. The y-axis shows the p-value.

B. Statistical Significance of Spectrum Features mel-cepstrum method. In comparison with the wavelet-based To extract the spectrum features, we set 80 to and ex- approach, the derivative-based mel-cepstrum audio steganalysis tract the mean values, standard deviations, and skewness statis- generally delivers a better performance. tics, for a total of 240 features over the whole frequency of the Since one-time cross-validation is not statistically significant, derivatives. Fig. 6 lists the p-values in the one-way analysis of to compare these three methods, we perform 100 runs for each variances (ANOVA) [32], [33] of these features, extracted from method on each type of audio steganograms with a certain in- Hide4PGP, Invisible Secrets, LSB matching, and Steghide, as formation-hiding ratio. In each run, 50% audio samples are ran- well as original covers. It clearly indicates the features extracted domly assigned to the training set; the remaining 50% audio from the high-frequency components, which correspond to the samples are used for testing. The mean testing accuracy values small p-values, have better statistical significances than the fea- and the standard errors are listed in Table I. Hiding ratio, the tures from the other frequency components. The ANOVAresults ratio of the size of hidden data to the maximum capacity, is used are consistent with the analysis of Section II. to measure the embedding strength. As seen in Table I, wavelet-based and derivative-based C. Signal-Based, Derivative-Based, and Wavelet-Based mel-cepstrum methods are superior to the original signal-based Mel-Cepstrum Audio Steganalysis method. Most remarkably, derivative-based mel-cepstrum We compare the signal-based mel-cepstrum audio steganal- audio steganalysis delivers the best result by greatly improving ysis to derivative-based and wavelet-based mel-cepstrum ap- the detection performance over the mel-cepstrum method. For proaches. Since Daubechies wavelets are widely used for signal instance, it improves the detection accuracy by about 17% for processing and decomposition [34], we apply a Daubechies the detection of invisible steganograms with maximum hiding, wavelet, “db8,” to signal for decomposition. Let MC, D-MC, and about 18% for the detection of LSB matching steganograms and W-MC stand for signal-based, derivative-based, and with maximum hiding. wavelet-based mel-cepstrum steganalysis methods, respec- D. Wavelet-Based and Wavelet-Based Spectrum and tively. Fig. 7 shows the receiver operating characteristic (ROC) Mel-Cepstrum Audio Steganalysis versus AMSL Audio curves by performing a cross validation in detecting the audio Steganalysis Toolset (AAST) steganograms by using SVMs with RBF kernels [35]. The experimental results show that wavelet-based and The feature sets of wavelet-based and derivative-based derivative-based mel-cepstrum audio steganalysis methods methods include two types: spectrum statistics and mel-cep- prominently improve the detection performance of the original strum coefficients. The mel-cepstrum coefficients consist of

Authorized licensed use limited to: NEW MEXICO INST OF MINING & TECH. Downloaded on August 31, 2009 at 18:45 from IEEE Xplore. Restrictions apply. LIU et al.: TEMPORAL DERIVATIVE-BASED SPECTRUM AND MEL-CEPSTRUM AUDIO STEGANALYSIS 365

Fig. 7. ROC curves by using signal-based, derivative-based, and wavelet-based mel-cepstrum audio steganalysis, with the legends MC, D-MC, and W-MC, re- spectively.

TABLE I constitute a detector for steganalysis of speech audio signals, AVERAGE TESTING ACCURACY VALUES AND STANDARD ERRORS OF ORIGINAL called AAST [18]. Fig. 8 shows ROC curves by performing a MEL-CEPSTRUM (MC), WAVELET-BASED MEL-CEPSTRUM (W-MC) BY USING “dB8” AND DERIVATIVE-BASED MEL-CEPSTRUM (D-MC) METHODS cross validation with the use of these three methods. Table II lists mean testing accuracy values and standard errors of 100 runs. Experimental results show that DSMC and WSMC outperform ASTT. On average, DSMC is superior to WSMC.

V. D ISCUSSION Derivative-based and wavelet-based methods are superior to the original solution proposed by Kraetzer and Dittmann for audio steganalysis. Our explanation is that audio signals are generally band-limited while, on the other hand, the embedded MFCCs and FMFCCs, given by (15) and (16) for deriva- hidden data is likely broadband. Consequently, both derivative- tive-based approach, and by (17) and (18) for wavelet-based based and wavelet-based methods are more accurate since they approach. In wavelet-based method, we adopt the same that first obtain the high-frequency information from audio signals. is used in the derivative-based approach for spectrum feature Our experimental results show that the derivative-based extraction, except that we first replace the second derivative method outperforms the wavelet-based method, especially with the wavelet detail subband. In both cases, spectrum fea- when only mel-cepstrum features are employed for detection. tures consist of mean values, standard deviations, and skewness We believe this resulted from different spectrum characteristics values from high-frequency components, given by between the second-order derivative and wavelet filtering. Fig. 9(a) shows the spectrum of the second derivative of the hidden data in an audio steganogram; Fig. 9(b) plots the spec- (19) trum of the detail wavelet sub-band of the same hidden data, filtered by using “db8.” There is a large difference between We abbreviate the methods as DSMC (derivative-based spec- these two spectra. The filtered by the wavelet may be treated trum and mel-cepstrum) and WSMC (wavelet-based spectrum as a white noise signal; the spectrum is almost equally dis- and mel-cepstrum). In Kraetzer and Dittmann’s work, signal- tributed over the whole frequency band, as shown in Fig. 9(b). based mel-cepstrum coefficients and other statistical features However, the second derivative suppresses the energy in low

Authorized licensed use limited to: NEW MEXICO INST OF MINING & TECH. Downloaded on August 31, 2009 at 18:45 from IEEE Xplore. Restrictions apply. 366 IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 4, NO. 3, SEPTEMBER 2009

Fig. 8. ROC curves by using DSMC, WSMC, and AAST, respectively.

TABLE II information-hiding; in other words, mel-cepstrum features play AVERAGE TESTING ACCURACY VALUES AND STANDARD ERRORS BY USING key roles in audio steganalysis. AAST, WSMC, AND DSMC We note that different information-hiding systems/algorithms are sensitive to different features, which can be observed by comparing the detection results by using MC and AAST fea- ture sets, shown in Tables I and II, respectively. The testing re- sults by using the AAST feature set are with much higher stan- dard errors. We surmise that some statistical features in AAST feature may not be very significant, as a statistical analysis for each individual feature would indicate. To address this problem, we can discard some insignificant features and choose an op- timal feature set from all AAST features. The feature selection frequency and amplifies the energy in high frequency; the spec- problem in steganalysis has been studied in our previous work trum shape is approximately simulated by a half of Gaussian [10]. The feature optimization for audio steganalysis is currently distribution, not equally distributed over the whole frequency being conducted. band, as shown in Fig. 9(a). Based on (11) in Section II, as the As described and explained in our previous work on image spectrum increases, the expected value of the variance of steganalysis [11], besides the information-hiding ratio, image the audio steganogram will prominently increase, that is, the complexity is an important parameter in evaluating steganalysis rate of power change in different spectrum bands will dramat- performance, as the detection performance is necessarily lower ically change. Since the mel-cepstrum coefficients are used for steganalysis of images with high complexity. Similarly, in to capture the information for power change, the advantage audio steganalysis, all methods tested in our experiments will of derivative-based mel-cepstrum approach becomes easily not be very effective for steganalysis of audio signals with high noticeable. complexity, which generally have high magnitude in high fre- With the addition of spectrum features, the wavelet-based quency. This issue is also currently under our study. method gains a greater improvement than the derivative-based method and thereby narrows the performance gap between the VI. CONCLUSION two methods. However, in both methods, mel-cepstrum features In this paper, we proposed spectrum analysis and improved make more contributions than spectrum features for detecting mel-cepstrum methods for audio steganalysis, derived from the

Authorized licensed use limited to: NEW MEXICO INST OF MINING & TECH. Downloaded on August 31, 2009 at 18:45 from IEEE Xplore. Restrictions apply. LIU et al.: TEMPORAL DERIVATIVE-BASED SPECTRUM AND MEL-CEPSTRUM AUDIO STEGANALYSIS 367

Fig. 9. (a) Spectrum of the second-order derivative of a hidden data in a 44.1-kHz audio steganogram, and (b) spectrum of the detail wavelet sub-band of the same hidden data, filtered by using “db8.” The frequency shown in the x-axis of (b) is reduced due to down-sampling of wavelet decomposition to obtain the detail sub-band.

second-order derivative and from the detail wavelet sub-band. [9] Q. Liu, A. Sung, J. Xu, and B. Ribeiro, “Image complexity and fea- Experimental results show that, in steganalysis of audio files ture extraction for steganalysis of LSB matching steganography,” in Proc. 18th Int. Conf. Pattern Recognition (ICPR), 2006, no. 1, pp. produced by Hide4PGP, Invisible Secrets, Steghide, and LSB 1208–1211. matching algorithms/tools, our proposed methods deliver good [10] Q. Liu, A. Sung, Z. Chen, and J. Xu, “Feature mining and pattern clas- performance and gain significant advantage over a recently de- sification for steganalysis of LSB matching steganography in grayscale images,” Pattern Recognit., vol. 41, no. 1, pp. 56–66, 2008. signed signal-based mel-cepstrum method. In a comparison of [11] Q. Liu, A. Sung, B. Ribeiro, M. Wei, Z. Chen, and J. Xu, “Image the two new methods, on average, the derivative-based solution complexity and feature mining for steganalysis of least significant bit is superior to the wavelet-based method. matching steganography,” Inf. Sci., vol. 178, no. 1, pp. 21–36, 2008. [12] X. Ru, H. Zhang, and X. Huang, “Steganalysis of audio: Attacking the steghide,” in Proc. 4th Int. Conf. Machine Learning and Cybernetics, ACKNOWLEDGMENT 2005, pp. 3937–3942. [13] X. Ru, Y. Zhang, and F. Wu, “Audio steganalysis based on negative The authors would like to thank Dr. D. Ellis of Columbia resonance phenomenon caused by steganographic tools,” J. Zhejiang University for his insightful comments and great suggestions, Univ. SCIENCE A, vol. 7, no. 4, pp. 577–583, 2006. Dr. M. Slaney of Yahoo research and Prof. H. Ai for their helpful [14] I. Avcibas, “Audio steganalysis with content-independent distortion measures,” IEEE Signal Process. Lett., vol. 13, no. 2, pp. 92–95, Feb. discussions, and Dr. J. Dittmann and C. Kraetzer for kindly pro- 2006. viding them with the AAST document. Special thanks go to the [15] H. Ozer, B. Sankur, N. Memon, and I. Avcibas, “Detection of audio anonymous reviewers for their insightful comments that helped covert channels using statstical footprints of hidden messages,” Digital Signal Process., vol. 16, no. 4, pp. 389–401, 2006. improve the presentation. [16] M. Johnson, S. Lyu, and H. Farid, “Steganalysis of recorded speech,” Proc. SPIE, vol. 5681, pp. 664–672, 2005. [17] S. Craver, B. Liu, and W. Wolf, “Histo-cepstral analysis for reverse- EFERENCES R engineering watermarks,” in Proc. 38th Conf. Information Sciences and [1] J. Harmsen and W. Pearlman, “Steganalysis of additive noise mode- Systems (CISS’04), Princeton, NJ, Mar. 2004, pp. 824–826. lable information hiding,” Proc. SPIE, Electronic Imaging, Security, [18] C. Kraetzer and J. Dittmann, “Mel-cepstrum based steganalysis for Steganography, and Watermarking of Multimedia Contents, vol. 5020, voip-steganography,” Proc. SPIE, vol. 6505, p. 650505, 2007. pp. 131–142, 2003. [19] C. Kraetzer and J. Dittmann, “Pros and cons of mel-cepstrum based [2] S. Lyu and H. Farid, “How realistic is photorealistic?,” IEEE Trans. audio steganalysis using SVM classification,” in Lecture Notes Signal Process., vol. 53, no. 2, pt. 2, pp. 845–850, Feb. 2005. Comput. Sci., 2008, vol. 4567, pp. 359–377. [3] Y. Shi, C. Chen, and W. Chen, “A Markov process based approach [20] Q. Liu, A. Sung, and M. Qiao, “Detecting information-hiding in WAV to effective attacking JPEG steganography,” in Lecture Notes Comput. audios,” in Proc 19th Int. Conf. Pattern Recognition, 2008, pp. 1–4. Sci., 2007, vol. 437, pp. 249–264. [21] W. Zeng, H. Ai, and R. Hu, “A novel steganalysis algorithm of phase [4] Q. Liu, A. Sung, B. Ribeiro, and R. Ferreira, “Steganalysis of multi- coding in audio signal,” in Proc. 6th Int. Conf. Advanced Language class JPEG images based on expanded Markov features and polyno- Processing and Web Information Technology (ALPIT), 2007, pp. mial fitting,” in Proc. 21st Int. Joint Conf. Neural Networks, 2008, pp. 261–264. 3351–3356. [22] W. Zeng, H. Ai, and R. Hu, “An algorithm of echo steganalysis based [5] J. Fridrich, “Feature-based steganalysis for JPEG images and its impli- on power cepstrum and pattern classification,” in Proc. Int. Conf. In- cations for future design of steganographic schemes,” in Lecture Notes formation and Automation (ICIA), 2008, pp. 1667–1670. Comput. Sci., 2004, vol. 3200, pp. 67–81. [23] R. Gonzalez and R. Woods, Digital Image Processing, 3rd ed. Engle- [6] T. Pevny and J. Fridrich, “Merging Markov and DCT features for multi- wood Cliffs, NJ: Prentice-Hall, 2008, ISBN: 9780131687288. class JPEG steganalysis,” Proc. SPIE Electronic Imaging, pp. 03–04, [24] A. Oppenheim, R. Schafer, and J. Buck, Discrete-Time Signal Pro- Jan. 2007. cessing. Englewood Cliffs, NJ: Prentice-Hall, 1999. [7] S. Lyu and H. Farid, “Steganalysis using high-order image statistics,” [25] B. Bogert, M. J. R. Healy, and J. W. Tukey, “The frequency analysis IEEE Trans. Inf. Forensics Security, vol. 1, no. 1, pp. 111–119, Mar. of times series for echoes: Cepstrum, pseudo-autocovariance, cross- 2006. cepstrum, and saphe cracking,” in Proc. Symp. Time Series Analysis, [8] Q. Liu and A. Sung, “Feature mining and nuero-fuzzy inference M. Rosenblatte, Ed. New York: Wiley, 1963, ch. 15, pp. 209–243. system for steganalysis of LSB matching steganography in grayscale [26] D. Reynolds, “A Gaussian Mixture Modeling Approaching to Text- images,” in Proc. of 20th Int. Joint Conf. Artificial Intelligence, 2007, Independent Speaker Indentification,” Ph.D. thesis, Department Elec- pp. 2808–2813. trical Engineering, Georgia Institute of Technology, Atlanta, 1992.

Authorized licensed use limited to: NEW MEXICO INST OF MINING & TECH. Downloaded on August 31, 2009 at 18:45 from IEEE Xplore. Restrictions apply. 368 IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 4, NO. 3, SEPTEMBER 2009

[27] R. H. McEachern, “Hearing it like it is: Audio signal processing the Andrew H. Sung received the Ph.D. degree from way the ear does it,” DSP Applications, vol. 3, no. 2, pp. 35–47, Feb. the State University of New York at Stony Brook, in 1994. 1984. [28] M. Xu, L. Duan, J. Cai, L. Chia, C. Xu, and Q. Tian, “HMM-based Dr. Sung is a Professor and the Chairman of the audio keyword generation,” in Proc. 5th Pacific Rim Conf. Multimedia, Computer Science and Engineering Department Part III, Lecture Notes in Computer Science, Nov. 30–Dec. 3, 2004, of New Mexico Tech, Socorro, NM. His current vol. 3333, pp. 566–574. research interests include information security and [29] S. Molau, M. Pitz, R. Schlüter, and H. Ney, “Computing mel-frequency digital forensic analysis, bioinformatics, application cepstral coefficients on the power spectrum,” in Proc. IEEE Int. Conf. algorithms, and soft computing and its engineering , Speech and Signal Processing, Salt Lake City, UT, May applications. 2001, vol. I, pp. 73–76. [30] T. Sharp, “An implementation of key-based digital signal steganog- raphy,” in Proc. Information Hiding Workshop (LNCS), 2001, vol. 2137, pp. 13–26. [31] S. Hetzl and P. Mutzel, “A graph-theoretic approach to steganography,” Mengyu Qiao is currently working toward the Ph.D. in LNCS, 2005, vol. 3677, pp. 119–128. degree in the Computer Science and Engineering [32] T. Hill and P. Lewicki, Statistics: Methods and Applications. Tulsa, Department, New Mexico Tech, Socorro, NM. He OK: StatSoft, Inc., 2005, ISBN: 1884233597. received the B.Eng. degree in software engineering [33] R. Agostino, L. Sullivan, and A. Beiser, Introductory Applied Biostatis- from Beijing University of Posts and Telecommu- tics. Pacific Grove, CA: Brooks/Cole, 2005. nications, China, in 2006, and the M.S. degree in [34] I. Daubechies, Ten Lectures on Wavelets. Philadelphia, PA: SIAM, computer science from New Mexico Tech, in 2009. 1992. His research interests include information se- [35] V. Vapnik, Statistical Learning Theory. Hoboken, NJ: Wiley, 1998. curity, bioinformatics, data mining, image/signal processing, and software engineering. Qingzhong Liu received the B.Eng. degree from Northwestern Polytechnical University and the M.Eng. degree from Sichuan University in China, and the Ph.D. degree in Computer Science from New Mexico Tech, in 2007. Dr. Liu is currently a Senior Research Scientist and Adjunct Faculty of New Mexico Tech, Socorro, NM. His research interests include data mining, pattern recognition, bioinformatics, multimedia computing, information security, and digital forensic analysis.

Authorized licensed use limited to: NEW MEXICO INST OF MINING & TECH. Downloaded on August 31, 2009 at 18:45 from IEEE Xplore. Restrictions apply.