IEEE JOURNAL OF OCEANIC ENGINEERING, VOL. OE-9, NO. 2, APRIL 1984

Use of the Kurtosis Statistic in the Frequency Domain as an Aid in Detecting Random Signals

Abstract: Power spectral density estimation is often employed as a method for signal detection. For signals which occur randomly, a frequency domain kurtosis estimate supplements the power spectral density estimate and, in some cases, can be employed to detect their presence. This has been verified from experiments with real data of randomly occurring signals. In order to better understand the detection of randomly occurring signals, sinusoidal and narrow-band Gaussian signals are considered, which, when modeled to represent a fading or multipath environment, are received as non-Gaussian in terms of a frequency domain kurtosis estimate. Several fading and multipath propagation probability density distributions of practical interest are considered, including Rayleigh and log-normal. The model is generalized to handle transient and frequency modulated signals by taking into account the probability of the signal being in a specific frequency range over the total data interval. It is shown that this model produces kurtosis values consistent with real data measurements. The ability of the power spectral density estimate and the frequency domain kurtosis estimate to detect randomly occurring signals, generated from the model, is compared using the deflection criterion. It is shown, for the cases considered, that over a large range of conditions, the power spectral density estimate is a better statistic based on the deflection criterion. However, there is a small range of conditions over which it appears that the frequency domain kurtosis estimate has an advantage. The real data that initiated this analytical investigation are also presented.

I. INTRODUCTION

In many important signal processing applications, including underwater acoustics, an estimate of the power spectral density (PSD) of the received data is often employed for signal detection. The data are first transformed into the frequency domain by utilizing the discrete Fourier transform (DFT), which can be efficiently executed by an algorithm called the fast Fourier transform (FFT). At this point, the data are considered to be in the frequency domain and an estimate of the PSD can be easily obtained. Often, this estimate consists of averaging together a sufficient number of individual FFT spectra or periodograms to ensure consistent results [1].

The PSD is essentially a sum of the estimates of the second-order moments for both the real and imaginary parts of each frequency component in the frequency domain. If the frequency domain signals are randomly occurring and not Gaussian distributed, then higher order moments of the complex frequency components may contain additional information that could be utilized in signal processing. The objective of this paper is to compare the PSD technique for signal processing with a new method which computes the frequency domain kurtosis (FDK) [2] for the real and imaginary parts of the complex frequency components. Kurtosis is defined as a ratio of a fourth-order central moment to the square of a second-order central moment.

Using the Neyman-Pearson theory in the time domain, Ferguson [3] has shown that kurtosis is a locally optimum detection statistic under certain conditions. The reader is referred to Ferguson's work for the details; however, it can be simply said that it is concerned with detecting outliers from an otherwise Gaussian sample. The outliers are equivalent to the randomly occurring signal that is to be detected. By extending this idea to the frequency domain and based on analyses of real underwater acoustics data, we have found conditions under which the FDK indicates the presence of randomly occurring signals [2], [4]. Both time and frequency domain analyses of the real data have been performed. By setting the frequency parameter equal to zero in a DFT, it can be shown that the time domain is a special case of the frequency domain. Analogous results should also hold in the spatial domain; however, we will only consider the frequency domain here. In addition, the results are applicable to both active and passive sonar, although we will concentrate on the latter application rather than the former.

The objective of this paper is to analytically determine the potential for exploiting kurtosis estimation in the frequency domain to indicate the presence of randomly occurring signals. To accomplish this, we introduce a model for the received data which contains the effects of amplitude and phase fluctuation of the signal. In addition, to be more realistic, transient and frequency modulation effects of the signal are also incorporated into the model. References which support this model will be cited in the text. To justify the results presented here, the PSD and FDK estimates will be compared in the last section using the real underwater acoustic data that initiated this work. However, subsequent data have also supported the analytical work presented here.

These results should also apply in other fields where the detection of a randomly occurring signal is important. For example, the detection of variable stars in astronomy may benefit from this approach.

Manuscript received July 18, 1983; revised February 2, 1984. This work was supported by the Office of Naval Research (Probability and Statistics Program).
The author is with the Naval Underwater Systems Center, New London, CT 06320.
U.S. Government work not protected by U.S. copyright.

II. FREQUENCY DOMAIN KURTOSIS

Let $x(i, q)$, $i = 0, 1, \ldots, M-1$; $q = 1, 2, \ldots, n$ represent the real discrete data, where $h$ is the interval between successive observations of the process. We will use the same definition for the DFT as given in [5]. The DFT is defined as

$$X(q, F_p) = \sum_{i=0}^{M-1} W_i \, x(i, q) \exp(-jF_p i) \tag{1}$$

where the symbols are identified as $j = \sqrt{-1}$; $F_p = 2\pi f_p h$ is the $p$th radian frequency component, $p = 0, 1, \ldots, M-1$; and $f_p = p/(Mh)$ Hz. For simplicity, we shall assume the window weights equal one, i.e., $W_i = 1$ for all $i$, and $h = 1$. The power spectrum estimate is defined as

$$P(F_p) = \frac{1}{n} \sum_{q=1}^{n} X(q, F_p) X^*(q, F_p) \tag{2}$$

where the asterisk represents complex conjugate. The variance of the periodogram does not go to zero as $M \to \infty$, and therefore the periodogram is not a consistent estimate of the PSD [5]. In the PSD estimate considered here, $n$ nonoverlapped DFT segments are averaged to ensure that each frequency component represents a consistent PSD estimate [5]. Thus (2) is an asymptotically unbiased estimate of the power spectral density [5].

The FDK estimate was discussed in terms of a detection problem in [6]. The main concern here is to determine the sensitivity of the PSD and FDK estimates to randomly occurring signals, assuming a sufficient number of DFT segments is available. The FDK represents a measure for the probability distribution over a time interval consisting of many DFT segments. Many of the arctic segments, which will be discussed in the last section, showed highly dynamic frequency components. This was due to the highly dynamic nature of ice sounds. These dynamic components were easily discerned using a spectrogram. The advantages of the FDK over the spectrogram for estimating the statistical behavior of ice sounds are that an operator is not needed and that it produces a quantitative measure of the probability distribution. This measure can also be used to distinguish stationary sinusoids and Gaussian signals from ice sounds.

It should also be pointed out that overlapped segments have also been studied to reduce the variance in the PSD estimate for another application [7], but this technique will not be treated here.

The FDK is defined separately for the real and imaginary parts of (1). ... in the estimate [8]. For randomly occurring signals that produce non-Gaussian distributions, the kurtosis estimate can be less than 3 or it can have a value much greater. Several cases are examined in the paper to demonstrate the range of kurtosis values for various situations.

Techniques for optimally processing signals contaminated by under-ice ambient noise as well as other noise environments are presented in [9] and [10].

The FDK is defined by taking the expected value of the fourth-order central moment and the square of the expected value of the second-order central moment separately, and then forming the ratio. The result of this operation is

$$K(F_p) = \frac{E\{[X(q, F_p)]^4\}}{\{E[(X(q, F_p))^2]\}^2}. \tag{3}$$

Before proceeding further, we need to define a model for the received data. Our goal is to compare the FDK and PSD estimates under some conditions which are known to occur in underwater acoustic detection problems, but have not been explicitly evaluated in this way before. The model we employ assumes that the transmitted or radiated signal is acted upon multiplicatively by the medium, which causes amplitude modulation and frequency spreading to occur. This is obviously not the most general model possible, but it is adequate to answer some important sonar design questions.

The input, $x(i, q)$, will be a zero-mean process which is composed of an additive mixture of signal and noise of the form

$$x(i, q) = N(i, q) + m(i, q)\, s(i, q) \tag{4}$$

where $m(i, q)$ modulates the signal and will either represent the effects of the propagation medium or reflect a physical characteristic of the transmitted or radiated signal $s(i, q)$. The components $N(i, q)$ and $s(i, q)$ are zero-mean stationary processes, and $N(i, q)$, $m(i, q)$, and $s(i, q)$ will be assumed to be mutually independent. For the particular choices of $m(i, q)$ and $s(i, q)$ given in the text, $x(i, q)$ will be stationary in the wide sense.

Our model for the fading received signal $m(i, q)s(i, q)$ assumes that the total effects of the amplitude fluctuations due to multipath interference or to nonstationarities of the source, receiver, or of the medium can be simply included in the multiplicative function $m(i, q)$. On the other hand, the phase fluctuations of the signal will be contained in the function $s(i, q)$ itself. This approach applies to sound propagating in the ocean [11] and to electromagnetic communication systems [12]. Later, we will generalize this model to include ...