Temporal Derivative-Based Spectrum and Mel-Cepstrum Audio Steganalysis — Qingzhong Liu, Andrew H. Sung, and Mengyu Qiao

Total Pages: 16

File Type: pdf, Size: 1020 KB

Temporal Derivative-Based Spectrum and Mel-Cepstrum Audio Steganalysis
Qingzhong Liu, Andrew H. Sung, and Mengyu Qiao
IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 4, NO. 3, SEPTEMBER 2009, p. 359

Abstract—To improve a recently developed mel-cepstrum audio steganalysis method, we present in this paper a method based on Fourier spectrum statistics and mel-cepstrum coefficients, derived from the second-order derivative of the audio signal. Specifically, the statistics of the high-frequency spectrum and the mel-cepstrum coefficients of the second-order derivative are extracted for use in detecting audio steganography. We also design a wavelet-based spectrum and mel-cepstrum audio steganalysis. By applying support vector machines to these features, unadulterated carrier signals (without hidden data) and steganograms (carrying covert data) are successfully discriminated. Experimental results show that the proposed derivative-based and wavelet-based approaches remarkably improve the detection accuracy. Between the two new methods, the derivative-based approach generally delivers a better performance.

Index Terms—Audio, mel-cepstrum, second-order derivative, spectrum, steganalysis, support vector machine (SVM), wavelet.

Manuscript received December 04, 2008; revised May 04, 2009. First published June 10, 2009; current version published August 14, 2009. This work was supported by the Institute for Complex Additive Systems Analysis (ICASA), a research division of New Mexico Tech. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Jessica J. Fridrich. Q. Liu and A. H. Sung are with the Department of Computer Science and Engineering and the Institute for Complex Additive Systems Analysis, New Mexico Tech, Socorro, NM 87801 USA (e-mail: [email protected]; [email protected]). M. Qiao is with the Department of Computer Science and Engineering, New Mexico Tech, Socorro, NM 87801 USA (e-mail: [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TIFS.2009.2024718

I. INTRODUCTION

STEGANOGRAPHY is the art and science of hiding data in digital media such as image, audio, and video files. To the contrary, steganalysis is the art and science of detecting information-hiding behaviors in digital media.

In recent years, many steganalysis methods have been designed for detecting information hiding in multiple steganography systems. Most of these methods are focused on detecting digital image steganography. For example, one of the well-known detectors, histogram characteristic function center of mass (HCFCOM), was successful in detecting noise-adding steganography [1]. Another well-known method is to construct a high-order moment statistical model in the multiscale decomposition using a wavelet-like transform and then to apply a learning classifier to the high-order feature set [2]. Shi et al. proposed a Markov-process-based approach to detect information-hiding behaviors in JPEG images [3]. Based on the Markov approach, Liu et al. expanded the Markov features to the interbands of the discrete cosine transform (DCT) domains, combined the expanded features with a polynomial fitting of the histogram of the DCT coefficients, and successfully improved the steganalysis performance on multiple JPEG images [4]. Other works on image steganalysis have been done by Fridrich [5], Pevny and Fridrich [6], Lyu and Farid [7], Liu and Sung [8], and Liu et al. [9]–[11].

Due to the different characteristics of audio signals and images, methods developed for image steganalysis are not directly suitable for detecting information hiding in audio streams, and many research groups have investigated audio steganalysis. Ru et al. presented a method that measures features between the signal under detection and a self-generated reference signal obtained via linear predictive coding [12], [13], but the detection performance is poor. Avcibas designed a feature set of content-independent distortion measures for classifier design [14]. Ozer et al. constructed a detector based on the characteristics of the denoised residuals of the audio file [15]. Johnson et al. set up a statistical model by building a linear basis that captures certain statistical properties of audio signals [16]. Craver et al. employed cepstral analysis to estimate a stego-signal's probability density function in audio signals [17]. Kraetzer and Dittmann recently proposed a mel-cepstrum-based analysis to detect embedded hidden messages [18], [19]. By expanding the Markov approach proposed by Shi et al. for image steganalysis [3], Liu et al. designed expanded Markov features for audio steganalysis [20]. Additionally, Zeng et al. presented new algorithms to detect phase-coding steganography based on analysis of the phase discontinuities [21] and to detect echo steganography based on statistical moments of peak frequency [22]. Among all these methods, Kraetzer and Dittmann's mel-cepstrum audio analysis is particularly noticeable, because it is the first time that mel-frequency cepstral coefficients (MFCCs), which are widely used in speech recognition, were utilized for audio steganalysis.

In this paper, we propose an audio steganalysis method based on spectrum analysis and mel-cepstrum analysis of the second-order derivative of the audio signal. In the spectrum analysis, the statistics of the high-frequency spectrum of the second-order derivative are extracted as spectrum features. To improve Kraetzer and Dittmann's work [18], we design mel-cepstrum coefficient features that are derived from the second-order derivative. Additionally, for comparison with the second-order derivative-based approach, a wavelet-based spectrum and mel-cepstrum method is also designed. Support vector machines (SVMs) with radial basis function (RBF) kernels [35] are employed to detect and differentiate steganograms from innocent signals. Results show that our derivative-based and wavelet-based methods are very promising and possess a remarkable advantage over Kraetzer and Dittmann's work.

The rest of the paper is organized as follows: Section II presents the second-order derivative for audio steganalysis and the Fourier analysis; Section III introduces Kraetzer and Dittmann's mel-cepstrum analysis and describes the improved mel-cepstrum methods; Section IV presents experiments, followed by discussion in Section V and the conclusion in Section VI.
II. TEMPORAL DERIVATIVE AND SPECTRUM ANALYSIS

In image processing, the second-order derivative is widely employed for detecting isolated points, edges, etc. [23]. Fig. 1 shows an example of edge detection using the second-order derivative. With this in mind, we developed a scheme based on the second-order derivative for audio steganalysis, the details of which are described as follows.

[Fig. 1. Example of edge detection using derivatives [23].]

A digital audio signal is denoted as x(t), t = 1, 2, \ldots, N. The second derivative of x(t) is D^2 x(t), defined as

    D^2 x(t) = x(t+1) - 2x(t) + x(t-1).                                  (1)

The stego-signal is denoted s(t), which is modeled by adding a noise or error signal e(t) into the original signal x(t):

    s(t) = x(t) + e(t).                                                  (2)

The second-order derivatives of the error term e(t) and the signal x(t) are denoted as D^2 e(t) and D^2 x(t), respectively. Thus,

    D^2 s(t) = D^2 x(t) + D^2 e(t).                                      (3)

The discrete Fourier transforms of D^2 s(t), D^2 x(t), and D^2 e(t) are denoted as S(f), X(f), and E(f), respectively, so that

    S(f) = X(f) + E(f).                                                  (4)
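As a quick numerical check (a sketch with synthetic signals, not the paper's test data), the linearity used in (1)-(4) can be verified directly:

import numpy as np

def d2(x):
    # Eq. (1): D^2 x(t) = x(t+1) - 2 x(t) + x(t-1)
    return x[2:] - 2 * x[1:-1] + x[:-2]

rng = np.random.default_rng(1)
x = np.cumsum(rng.standard_normal(1024))      # smooth-ish "audio" cover
e = rng.choice([-1.0, 0.0, 1.0], size=x.size, p=[0.25, 0.5, 0.25])
s = x + e                                     # Eq. (2): stego = cover + error

assert np.allclose(d2(s), d2(x) + d2(e))      # Eq. (3): second difference is linear
S, X, E = (np.fft.fft(d2(v)) for v in (s, x, e))
assert np.allclose(S, X + E)                  # Eq. (4): so is the DFT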
The DFT of each second derivative is taken over its M samples, and the power spectrum follows:

    S(f) = \sum_{t} D^2 s(t)\, e^{-j 2\pi f t / M},                      (5)

    |S(f)|^2 = S(f)\, \overline{S(f)},                                   (6)

where f = 0, 1, \ldots, M-1 and M is the number of samples of the derivatives. We have

    |S(f)|^2 = |X(f) + E(f)|^2.                                          (7)

Assume that \theta is the angle between the vectors X(f) and E(f); then

    |S(f)|^2 = |X(f)|^2 + |E(f)|^2 + 2\,|X(f)|\,|E(f)| \cos\theta.       (8)

Since \theta is arbitrary, the expected value of |S(f)|^2 is calculated as follows:

    \mathbb{E}\big[|S(f)|^2\big] = |X(f)|^2 + |E(f)|^2.                  (9)

Dividing both sides by |X(f)|^2,

    \mathbb{E}\big[|S(f)|^2\big] / |X(f)|^2 = 1 + |E(f)|^2 / |X(f)|^2.   (10)

Generally speaking, |E(f)|^2 is far smaller than |X(f)|^2 at the low-frequency and middle-frequency components, where the modification caused by the addition of hidden data is negligible. However, the situation changes at the high-frequency components. Digital audio signals are generally band-limited: the power spectral density is zero or very close to zero above a certain finite frequency. The error term, on the other hand, is assumed to be broadband; in such cases the modification caused by the addition of hidden data is not negligible in the high-frequency components.

Assume the error to be a random signal with an expected value of zero. Its spectrum is approximately depicted by a Gaussian-like distribution [24] (see http://mathworld.wolfram.com/FourierTransformGaussian.html). The power is zero at the lowest frequency; as the frequency increases, the spectrum increases. Fig. 2(a) shows a simulated error signal consisting of 25% −1s, 50% 0s, and 25% 1s. In this example, we assume the sampling rate is 1000 Hz. Fig. 2(b) is the spectrum distribution of the second-order derivative (only half the values are plotted due to data symmetry). It demonstrates that the energy of the derivative is concentrated at high frequency.

[Fig. 2. (a) Random error signal consisting of 25% −1s, 50% 0s, and 25% 1s; (b) spectrum of the second-order derivative.]
[Fig. 3. Spectra of the second derivatives of a cover signal (left) and the stego-signal (right).]

Regarding the second-order derivative: at the low- and middle-frequency components, the power spectrum of an audio signal is normally much stronger than the power spectrum of the error term caused by data hiding; in other words, |E(f)|^2 / |X(f)|^2 is almost equal to zero, and, based on (10), the difference of the spectrum between a cover and the stego-signal is suppressed at the low- and middle-frequency components. However, the situation is very different at the high-frequency components. In comparing the signal spectrum to the derivative spectrum, we also observe that Fig.
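The behavior shown in Fig. 2 is easy to reproduce. The sketch below follows the example above (±1 error values with probabilities 0.25/0.5/0.25, 1000 Hz sampling, synthetic data) and confirms that the second derivative of the error concentrates its energy at high frequencies:

import numpy as np

rng = np.random.default_rng(2)
fs = 1000                                        # sampling rate from the example above
e = rng.choice([-1.0, 0.0, 1.0], size=fs, p=[0.25, 0.5, 0.25])
d2e = e[2:] - 2 * e[1:-1] + e[:-2]               # second-order derivative, eq. (1)
spec = np.abs(np.fft.rfft(d2e)) ** 2             # half-spectrum (symmetry)
freqs = np.fft.rfftfreq(d2e.size, d=1 / fs)
lo = spec[freqs < 250].sum()
hi = spec[freqs >= 250].sum()
print(f"low-band energy {lo:.1f} vs high-band energy {hi:.1f}")  # hi >> lo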
Recommended publications
  • Cepstrum Analysis and Gearbox Fault Diagnosis
Cepstrum Analysis and Gearbox Fault Diagnosis (13-150), by R.B. Randall, B.Tech., B.A. INTRODUCTION Cepstrum analysis is a tool for the detection of periodicity in a frequency spectrum, and seems so far to have been used mainly in speech analysis for voice pitch determination and related questions (Refs. 1, 2). In that case the periodicity in the spectrum is given by the many harmonics of the fundamental voice frequency, but another form of periodicity which can also be detected by cepstrum analysis is the presence of sidebands spaced at equal intervals around one or a number of carrier frequencies. The presence of such sidebands is of interest in the analysis of gearbox vibration signals, since a number of faults tend to cause modulation of the vibration pattern resulting from tooth meshing, and this modulation (either amplitude or frequency modulation) gives rise to sidebands in the frequency spectrum. The sidebands are grouped around the tooth-meshing frequency and its harmonics, spaced at multiples of the modulating frequencies (Refs. 3, 4, 5), and determination of these modulation frequencies can be very useful in diagnosing the fault. The cepstrum is defined (Refs. 6, 1) as the power spectrum of the logarithmic power spectrum (i.e., in dB amplitude form), and is thus related to the autocorrelation function, which can be obtained by inverse Fourier transformation of the power spectrum with linear ordinates. The advantages of using the cepstrum instead of the autocorrelation function in certain circumstances can be interpreted in two different ways.
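The definition above (the power spectrum of the dB power spectrum) is easy to try on a synthetic gearbox-like signal. In this sketch (illustrative parameters, not from the paper), 10 Hz amplitude modulation of a 200 Hz "tooth-meshing" tone produces sidebands whose spacing shows up as a cepstral peak near 0.1 s quefrency:

import numpy as np

fs, T = 2000, 4.0
t = np.arange(int(fs * T)) / fs
mesh = np.sin(2 * np.pi * 200 * t)                  # tooth-meshing carrier
x = (1 + 0.5 * np.sin(2 * np.pi * 10 * t)) * mesh   # 10 Hz amplitude modulation

log_ps = 10 * np.log10(np.abs(np.fft.rfft(x)) ** 2 + 1e-12)  # dB power spectrum
cep = np.abs(np.fft.rfft(log_ps - log_ps.mean())) ** 2       # cepstrum of the dB spectrum
quef = np.fft.rfftfreq(log_ps.size, d=fs / x.size)           # quefrency axis (seconds)
mask = quef > 0.02                                           # skip the low-quefrency region
print(quef[mask][np.argmax(cep[mask])])                      # ~0.1 s -> 10 Hz sideband spacing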
  • A Thesis Entitled Nocturnal Bird Call Recognition System for Wind Farm
A Thesis entitled Nocturnal Bird Call Recognition System for Wind Farm Applications, by Selin A. Bastas, submitted to the Graduate Faculty as partial fulfillment of the requirements for the Master of Science Degree in Electrical Engineering. Committee: Dr. Mohsin M. Jamali (Chair), Dr. Junghwan Kim, Dr. Sonmez Sahutoglu; Dr. Patricia R. Komuniecki, Dean, College of Graduate Studies. The University of Toledo, December 2011. Copyright 2011, Selin A. Bastas. This document is copyrighted material. Under copyright law, no parts of this document may be reproduced without the expressed permission of the author. Abstract: Interaction of birds with wind turbines has become an important public policy issue. Acoustic monitoring of birds in the vicinity of wind turbines can address this important public policy issue. The identification of nocturnal bird flight calls is also important for various applications such as ornithological studies and acoustic monitoring to prevent the negative effects of wind farms, human-made structures, and devices on birds. Wind turbines may have a negative impact on bird populations; therefore, the development of an acoustic monitoring system is critical for the study of bird behavior. This work can be employed by wildlife biologists for developing mitigation techniques for both on-shore/off-shore wind farm applications and to address bird-strike issues at airports.
  • Cepstral Peak Prominence: a Comprehensive Analysis
Cepstral peak prominence: A comprehensive analysis. Ruben Fraile*, Juan Ignacio Godino-Llorente. Circuits & Systems Engineering Department, ETSIS Telecomunicación, Universidad Politécnica de Madrid, Campus Sur, Carretera de Valencia Km. 7, 28031 Madrid, Spain. ABSTRACT An analytical study of cepstral peak prominence (CPP) is presented, intended to provide an insight into its meaning and relation with voice perturbation parameters. To carry out this analysis, a parametric approach is adopted in which voice production is modelled using the traditional source-filter model and the first cepstral peak is assumed to have Gaussian shape. It is concluded that the meaning of CPP is very similar to that of the first rahmonic, and some insights are provided on its dependence on fundamental frequency and vocal tract resonances. It is further shown that CPP integrates measures of voice waveform and periodicity perturbations, be they amplitude, frequency, or noise. 1. Introduction Cepstral peak prominence (CPP) is an acoustic measure of voice quality that has been qualified as the most promising and perhaps most robust acoustic measure of dysphonia severity [1]. Such a definite statement, made by Maryn et al., is based on a meta-analysis that considered previous results by Wolfe and Martin [2], Wolfe et al. ... Merk et al. proposed the variability in CPP to be used as a cue to detect neurogenic voice disorders [16]; Rosa et al. reported that the combination of CPP with other acoustic cues is relevant for the detection of laryngeal disorders [17]; Hartl et al. [18], [19] and Balasubramanium et al. [20] concluded that patients suffering from unilateral vocal fold paralysis exhibited significantly lower values
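For orientation, the quantity being analyzed can be computed along these lines. This is a minimal sketch under common assumptions (window choice, dB scaling, and the peak-search range vary across CPP implementations; it is not the authors' parametric model): CPP is the height of the cepstral peak in the expected pitch range above a straight-line fit to the cepstrum.

import numpy as np

def cpp(frame, fs, f0_lo=60.0, f0_hi=330.0):
    # Log-magnitude spectrum, then inverse FFT -> real cepstrum.
    spec_db = 20 * np.log10(np.abs(np.fft.rfft(frame * np.hanning(frame.size))) + 1e-12)
    cep = np.fft.irfft(spec_db)
    quef = np.arange(cep.size) / fs               # quefrency in seconds
    lo, hi = int(fs / f0_hi), int(fs / f0_lo)     # search band for the first rahmonic
    peak = lo + np.argmax(cep[lo:hi])
    a, b = np.polyfit(quef[lo:hi], cep[lo:hi], 1) # straight-line trend (simple choice)
    return cep[peak] - (a * quef[peak] + b)       # prominence above the trend

fs = 16000
t = np.arange(2048) / fs
voiced = np.sign(np.sin(2 * np.pi * 120 * t))     # crude 120 Hz pulse-train stand-in
print(cpp(voiced, fs))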
  • Exploiting the Symmetry of Integral Transforms for Featuring Anuran Calls
Symmetry Article: Exploiting the Symmetry of Integral Transforms for Featuring Anuran Calls. Amalia Luque 1,*, Jesús Gómez-Bellido 1, Alejandro Carrasco 2 and Julio Barbancho 3. 1 Ingeniería del Diseño, Escuela Politécnica Superior, Universidad de Sevilla, 41004 Sevilla, Spain; [email protected] 2 Tecnología Electrónica, Escuela Ingeniería Informática, Universidad de Sevilla, 41004 Sevilla, Spain; [email protected] 3 Tecnología Electrónica, Escuela Politécnica Superior, Universidad de Sevilla, 41004 Sevilla, Spain; [email protected] * Correspondence: [email protected]; Tel.: +34-955-420-187. Received: 23 February 2019; Accepted: 18 March 2019; Published: 20 March 2019. Abstract: The application of machine learning techniques to sound signals requires the previous characterization of said signals. In many cases, their description is made using cepstral coefficients that represent the sound spectra. In this paper, the performance in obtaining cepstral coefficients by two integral transforms, the Discrete Fourier Transform (DFT) and the Discrete Cosine Transform (DCT), is compared in the context of processing anuran calls. Due to the symmetry of sound spectra, it is shown that DCT clearly outperforms DFT, and decreases the error representing the spectrum by more than 30%. Additionally, it is demonstrated that DCT-based cepstral coefficients are less correlated than their DFT-based counterparts, which leads to a significant advantage for DCT-based cepstral coefficients if these features are later used in classification algorithms. Since the DCT superiority is based on the symmetry of sound spectra and not on any intrinsic advantage of the algorithm, the conclusions of this research can definitely be extrapolated to include any sound signal. Keywords: spectrum symmetry; DCT; MFCC; audio features; anuran calls 1.
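The core comparison can be reproduced in a few lines. This sketch uses a synthetic signal, an arbitrary coefficient count K, and scipy's dct/fft (assumptions on our part, not the paper's anuran data or exact protocol): each transform of a log power spectrum is truncated to its first K coefficients, and the reconstruction error is compared.

import numpy as np
from scipy.fft import dct, idct, fft, ifft

rng = np.random.default_rng(3)
x = np.cumsum(rng.standard_normal(4096))                 # signal with a smooth spectrum
log_spec = np.log(np.abs(np.fft.rfft(x)) ** 2 + 1e-12)
K = 20                                                   # coefficients kept

c_dct = dct(log_spec, norm="ortho")
rec_dct = idct(np.where(np.arange(log_spec.size) < K, c_dct, 0.0), norm="ortho")

c_dft = fft(log_spec)
trunc = np.zeros_like(c_dft)
trunc[:K] = c_dft[:K]
trunc[-(K - 1):] = c_dft[-(K - 1):]                      # conjugate partners of bins 1..K-1
rec_dft = ifft(trunc).real

err = lambda r: np.mean((log_spec - r) ** 2)
print(err(rec_dct), err(rec_dft))                        # DCT error is markedly lower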
  • The Articulatory and Acoustic Characteristics of Polish Sibilants and Their Consequences for Diachronic Change
The articulatory and acoustic characteristics of Polish sibilants and their consequences for diachronic change. Véronique Bukmaier & Jonathan Harrington, Institute of Phonetics and Speech Processing, University of Munich, Germany. [email protected] [email protected] The study is concerned with the relative synchronic stability of three contrastive sibilant fricatives /s ʂ ɕ/ in Polish. Tongue movement data were collected from nine first-language Polish speakers producing symmetrical real and non-word CVCV sequences in three vowel contexts. A Gaussian model was used to classify the sibilants from spectral information in the noise and from formant frequencies at vowel onset. The physiological analysis showed an almost complete separation between /s ʂ ɕ/ on tongue-tip parameters. The acoustic analysis showed that the greater energy at higher frequencies distinguished /s/ in the fricative noise from the other two sibilant categories. The most salient information at vowel onset was for /ɕ/, which also had a strong palatalizing effect on the following vowel. Whereas either the noise or vowel onset was largely sufficient for the identification of /s ɕ/ respectively, both sets of cues were necessary to separate /ʂ/ from /s ɕ/. The greater synchronic instability of /ʂ/ may derive from its high articulatory complexity coupled with its comparatively low acoustic salience. The data also suggest that the relatively late stage of /ʂ/ acquisition by children may come about because of the weak acoustic information in the vowel for its distinction from /s/. 1 Introduction While there have been many studies in the last 30 years on the acoustic (Evers, Reetz & Lahiri 1998, Jongman, Wayland & Wong 2000, Nowak 2006, Shadle 2006, Cheon & Anderson 2008, Maniwa, Jongman & Wade 2009), perceptual (McGuire 2007, Cheon & Anderson 2008, Li et al.
  • A Processing Method for Pitch Smoothing Based on Autocorrelation and Cepstral F0 Detection Approaches
A Processing Method for Pitch Smoothing Based on Autocorrelation and Cepstral F0 Detection Approaches. *Xufang Zhao, Douglas O'Shaughnessy, Nguyen Minh-Quang. Institut National de la Recherche Scientifique, Université du Québec. Abstract—Chinese is known as a syllabic and tonal language, and tone recognition plays an important role and provides very strong discriminative information for Chinese speech recognition [1]. Usually, tone classification is based on F0 (fundamental frequency) contours [2]. It is possible to infer a speaker's gender, age, and emotion from his/her pitch, regardless of what is said. Meanwhile, the same sequence of words can convey very different meanings with variations in intonation. However, accurate pitch detection is difficult, partly because tracked pitch contours are not ideal smooth curves. In this paper, we present a new smoothing algorithm for detected pitch contours. The effectiveness of the proposed method was shown using the HUB4-NE [3] natural speech corpus. Index Terms—Acoustic signal analysis, acoustic signal detection. I. INTRODUCTION OF PREVIOUS WORKS Traditionally, detailed tone features such as the entire pitch curve are used for tone recognition. Various pitch tracking algorithms based on different approaches have been developed for fast and reliable estimation of fundamental frequency. Generally, pitch analyses are performed in the time domain, frequency domain, or a combination of both domains [4]. [5] compared the performance of several pitch detection [Fig. 1: An illustration of the triangular smoothing and proposed pitch smoothing approaches; the left column (a) uses the triangular smoothing method, and the right column (b) is an example using the proposed pitch smoothing method in this paper.]
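For contrast with the proposed method, a baseline smoother of the kind Fig. 1 mentions can be sketched as follows. This is an illustrative median-plus-triangular smoother under our own assumptions, not the authors' algorithm:

import numpy as np
from scipy.signal import medfilt

def smooth_f0(f0, tri_len=5):
    f0 = medfilt(f0, kernel_size=5)              # knock out isolated (e.g., octave-error) spikes
    w = np.bartlett(tri_len + 2)[1:-1]           # triangular weights
    w /= w.sum()
    return np.convolve(f0, w, mode="same")       # triangular (weighted moving-average) pass

f0 = 200 + 5 * np.sin(np.linspace(0, 6, 100))    # synthetic pitch contour (Hz)
f0[40] = 400                                      # an octave-doubling error
print(smooth_f0(f0)[38:43].round(1))              # spike removed, contour smooth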
  • Detection and Classification of Whale Acoustic Signals
Detection and Classification of Whale Acoustic Signals, by Yin Xian. Department of Electrical and Computer Engineering, Duke University, 2016. Committee: Loren Nolte (Supervisor), Douglas Nowacek (Co-supervisor), Robert Calderbank (Co-supervisor), Xiaobai Sun, Ingrid Daubechies, Galen Reeves. Dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Electrical and Computer Engineering in the Graduate School of Duke University. Copyright © 2016 by Yin Xian. All rights reserved except the rights granted by the Creative Commons Attribution-Noncommercial Licence. Abstract This dissertation focuses on two vital challenges in relation to whale acoustic signals: detection and classification. In detection, we evaluated the influence of the uncertain ocean environment on the spectrogram-based detector, and derived the likelihood ratio of the proposed Short Time Fourier Transform detector. Experimental results showed that the proposed detector outperforms detectors based on the spectrogram. The proposed detector is more sensitive to environmental changes because it includes phase information. In classification, our focus is on finding a robust and sparse representation of whale vocalizations. Because whale vocalizations can be modeled as polynomial phase signals, we can represent the whale calls by their polynomial phase coefficients.
  • Speech Signal Representation
Speech Signal Representation
• Fourier Analysis – Discrete-time Fourier transform – Short-time Fourier transform – Discrete Fourier transform
• Cepstral Analysis – The complex cepstrum and the cepstrum – Computational considerations – Cepstral analysis of speech – Applications to speech recognition – Mel-frequency cepstral representation
• Performance Comparison of Various Representations

Discrete-Time Fourier Transform

    X(e^{j\omega}) = \sum_{n=-\infty}^{+\infty} x[n]\, e^{-j\omega n}, \qquad x[n] = \frac{1}{2\pi} \int_{-\pi}^{\pi} X(e^{j\omega})\, e^{j\omega n}\, d\omega

• Sufficient condition for convergence: \sum_{n=-\infty}^{+\infty} |x[n]| < +\infty.
• Although x[n] is discrete, X(e^{j\omega}) is continuous and periodic with period 2\pi.
• Convolution/multiplication duality:

    y[n] = x[n] * h[n] \;\Leftrightarrow\; Y(e^{j\omega}) = X(e^{j\omega}) H(e^{j\omega})

    y[n] = x[n]\, w[n] \;\Leftrightarrow\; Y(e^{j\omega}) = \frac{1}{2\pi} \int_{-\pi}^{\pi} W(e^{j\theta})\, X(e^{j(\omega-\theta)})\, d\theta

Short-Time Fourier Analysis (Time-Dependent Fourier Transform)

[Figure: windows w[50−m], w[100−m], w[200−m] sliding along x[m], for analysis instants n = 50, 100, 200.]

    X_n(e^{j\omega}) = \sum_{m=-\infty}^{+\infty} w[n-m]\, x[m]\, e^{-j\omega m}

• If n is fixed, then it can be shown that:

    X_n(e^{j\omega}) = \frac{1}{2\pi} \int_{-\pi}^{\pi} W(e^{j\theta})\, e^{j\theta n}\, X(e^{j(\omega+\theta)})\, d\theta

• The above equation is meaningful only if we assume that X(e^{j\omega}) represents the Fourier transform of a signal whose properties continue outside the window, or simply that the signal is zero outside the window.
• In order for X_n(e^{j\omega}) to correspond to X(e^{j\omega}), W(e^{j\omega}) must resemble an impulse with respect to X(e^{j\omega}).
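Numerically, X_n(e^{j\omega}) is just the DFT of a windowed segment. A short sketch with assumed parameters:

import numpy as np

fs, N = 8000, 256
x = np.sin(2 * np.pi * 440 * np.arange(2048) / fs)   # a 440 Hz tone
n = 1000                                             # analysis instant
w = np.hamming(N)
frame = x[n:n + N] * w                               # w[n-m] x[m] over the window support
X_n = np.fft.rfft(frame)                             # samples of X_n(e^{jw}), up to a linear
                                                     # phase from shifting the time origin
print(np.fft.rfftfreq(N, 1 / fs)[np.argmax(np.abs(X_n))])  # ~440 Hz (within one bin)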
  • 'Specmurt Anasylis' of Multi-Pitch Signals
‘Specmurt Anasylis’ of Multi-Pitch Signals. Shigeki Sagayama, Hirokazu Kameoka, Shoichiro Saito and Takuya Nishimoto. The University of Tokyo, Hongo, Bunkyo-ku, Tokyo 113-8656 Japan. E-mail: {sagayama, kameoka, saito, nishimoto}[email protected]. Abstract—In this paper, we discuss a new concept of specmurt to analyze multi-pitch signals. It is applied to polyphonic music signals to extract fundamental frequencies, to produce a piano-roll-like display, and to convert the sound into MIDI data. In contrast with the cepstrum, which is the inverse Fourier transform of the log-scaled power spectrum with linear frequency, specmurt is defined as the inverse Fourier transform of the linear power spectrum with log-scaled frequency. If all tones in a polyphonic sound have a common harmonic pattern, the sound can be regarded as a sum of frequency-stretched common harmonic structures. In the log-frequency domain, it is formulated as the convolution of the distribution density of the fundamental frequencies of multiple tones and the common harmonic structure. The fundamental frequency [...] [8], [9], [10], [11], [12], [13]. Reliable determination of the number of sound sources is discussed only recently [14]. As for spectral analysis, the wavelet transform using the Gabor function is one of the popular approaches to derive the short-time power spectrum of music signals along a logarithmically-scaled frequency axis that appropriately suits musical pitch scaling. The spectrogram, i.e., a 2-dimensional time-frequency display of the sequence of short-time spectra, however, is apparently messed up because of the existence of many overtones (i.e., harmonic components of multiple fundamental frequencies), which often prevents us from discovering musical notes.
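The convolution model is easy to demonstrate in one dimension. In this toy sketch (bin counts and harmonic weights are made-up assumptions, not the paper's data), a log-frequency spectrum is generated as the convolution of a fundamental-frequency distribution with a common harmonic structure, and the fundamentals are recovered by inverse filtering in the transform domain:

import numpy as np

bins = 600                                        # log-frequency axis, 100 bins/octave (assumed)
h = np.zeros(bins)
for k, a in zip([1, 2, 3, 4], [1.0, 0.6, 0.4, 0.3]):
    h[round(100 * np.log2(k))] = a                # common harmonic structure on the log-f axis

u = np.zeros(bins)
u[[120, 155]] = 1.0                               # two fundamental frequencies
v = np.fft.ifft(np.fft.fft(u) * np.fft.fft(h)).real   # observed multi-pitch log-f spectrum

U = np.fft.fft(v) / (np.fft.fft(h) + 1e-8)        # deconvolution by division (regularized)
est = np.fft.ifft(U).real
print(sorted(np.argsort(est)[-2:]))               # [120, 155]: fundamentals recovered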
  • The Exit-Wave Power-Cepstrum Transform for Scanning Nanobeam Electron Diffraction: Robust Strain Mapping at Subnanometer Resolution and Subpicometer Precision
    The Exit-Wave Power-Cepstrum Transform for Scanning Nanobeam Electron Diffraction: Robust Strain Mapping at Subnanometer Resolution and Subpicometer Precision Elliot Padgett1, Megan E. Holtz1,2, Paul Cueva1, Yu-Tsun Shao1, Eric Langenberg2, Darrell G. Schlom2,3, David A. Muller1,3 1 School of Applied and Engineering Physics, Cornell University, Ithaca, New York 14853, United States 2 Department of Materials Science and Engineering, Cornell University, Ithaca, New York 14853, United States 3 Kavli Institute at Cornell for Nanoscale Science, Ithaca, New York 14853, United States Abstract Scanning nanobeam electron diffraction (NBED) with fast pixelated detectors is a valuable technique for rapid, spatially resolved mapping of lattice structure over a wide range of length scales. However, intensity variations caused by dynamical diffraction and sample mistilts can hinder the measurement of diffracted disk centers as necessary for quantification. Robust data processing techniques are needed to provide accurate and precise measurements for complex samples and non-ideal conditions. Here we present an approach to address these challenges using a transform, called the exit wave power cepstrum (EWPC), inspired by cepstral analysis in audio signal processing. The EWPC transforms NBED patterns into real-space patterns with sharp peaks corresponding to inter-atomic spacings. We describe a simple analytical model for interpretation of these patterns that cleanly decouples lattice information from the intensity variations in NBED patterns caused by tilt and thickness. By tracking the inter-atomic spacing peaks in EWPC patterns, strain mapping is demonstrated for two practical applications: mapping of ferroelectric domains in epitaxially strained PbTiO3 films and mapping of strain profiles in arbitrarily oriented core-shell Pt-Co nanoparticle fuel-cell catalysts.
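In one dimension the idea reduces to a familiar cepstral trick: the Fourier transform of the log of a diffraction-like intensity turns a set of reciprocal-space Bragg peaks into a sharp peak at the lattice period, insensitive to the per-peak intensity variations that hinder direct disk tracking. A toy sketch with synthetic peaks and assumed geometry, not the authors' 2-D EWPC pipeline:

import numpy as np

n = 512
intensity = np.full(n, 1e-3)                      # diffuse background
rng = np.random.default_rng(4)
for k in range(1, 6):
    intensity[64 * k] = rng.uniform(0.2, 1.0)     # Bragg peaks, "dynamical" intensities
ewpc = np.abs(np.fft.rfft(np.log(intensity)))     # power-cepstrum-like transform
print(np.argmax(ewpc[4:]) + 4)                    # ~8 = n/64: the lattice period survives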
  • Music Information Retrieval Using Social Tags and Audio
A Survey of Audio-Based Music Classification and Annotation. Zhouyu Fu, Guojun Lu, Kai Ming Ting, and Dengsheng Zhang. IEEE Trans. on Multimedia, vol. 13, no. 2, April 2011. Presenter: Yin-Tzu Lin (阿孜孜^.^), 2011/08.

Types of Music Representation
• Music Notation (symbolic) – Scores – Like text with formatting
• Time-stamped events – E.g., MIDI – Like unformatted text
• Audio – E.g., CD, MP3 – Like speech
[Image from: http://en.wikipedia.org/wiki/Graphic_notation. Slides inspired by Prof. Shigeki Sagayama's talk and Donald Byrd's slide.]

[Diagram: intra-song information retrieval — tasks connecting the composition, score, MIDI, and audio levels, including transcription, melody extraction, structural segmentation, key/chord detection, rhythm pattern and tempo/beat extraction, onset detection, and source separation.]

Inter-Song Info Retrieval
• Generic-similar – Music Classification (genre, artist, mood, emotion, ...) – Tag Classification (Music Annotation) – Recommendation
• Specific-similar – Query by Singing/Humming – Cover Song Identification – Score Following

Classification Tasks
• Genre Classification
• Mood Classification
• Artist Identification
• Instrument Recognition
• Music Annotation

Paper Outline
• Audio Features – Low-level features – Middle-level features – Song-level feature representations
• Classifiers Learning
• Classification Task
• Future Research Issues

Audio Features
Low-level Features
  • Study Paper for Timbre Identification in Sound
International Journal of Engineering Research & Technology (IJERT), ISSN: 2278-0181, Vol. 2, Issue 10, October 2013. Study Paper for Timbre Identification in Sound. Abhilasha, Preety Goswami, Prof. Makarand Velankar — Cummins College of Engineering, Pune.

Abstract Acoustically, all kinds of sounds are similar, but they are fundamentally different. This is partly because they encode information in fundamentally different ways. Timbre is the quality of sound which differentiates different sounds. These differences are related to some interesting but difficult questions about timbre. Studies have been done to distinguish between the sounds of different musical instruments and other sounds; this requires analysis of the fundamental properties of sound. We have explored the timbre identification methods used by researchers in music information retrieval and observed that the method most widely used is MFCC analysis. We cover two methods, MFCC and formant analysis, in more detail and compare them.

... movement of the tongue along the horizontal and vertical axes. Well-specified positions of the mouth and the tongue can produce the standardized sounds of the speech vowels, and are also capable of imitating many natural sounds. This paper attempts to analyze different methods for timbre identification, and a detailed study of two methods is done. MFCC (Mel Frequency Cepstrum Coefficients) and formant analysis are the two main methods focused on in this paper. A comparative study of these methods throws more light on the possible outcomes of using them. We have also explored formant analysis using the tool PRAAT and its possible use for timbre identification.

2. Properties of sounds for timbre
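As a pointer for the MFCC route discussed above, here is a minimal sketch using librosa (our assumption; the paper does not prescribe a toolkit, and any framing + mel filterbank + DCT pipeline is equivalent):

import librosa

sr = 22050
y = librosa.tone(440, sr=sr, duration=1.0)         # stand-in for an instrument note
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13) # mel-scaled log spectra, then a DCT
print(mfcc.shape)                                  # (13, n_frames): one vector per frame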