Gaussian Processes

The class of Gaussian processes is one of the most widely used families of stochastic processes for modeling dependent data observed over time, or space, or time and space. The popularity of such processes stems primarily from two essential properties. First, a Gaussian process is completely determined by its mean and covariance functions. This property facilitates model fitting as only the first- and second-order moments of the process require specification. Second, solving the prediction problem is relatively straightforward. The best predictor of a Gaussian process at an unobserved location is a linear function of the observed values and, in many cases, these functions can be computed rather quickly using recursive formulas.

The fundamental characterization, as described below, of a Gaussian process is that all the finite-dimensional distributions have a multivariate normal (or Gaussian) distribution. In particular, the distribution of each observation must be normally distributed. There are many applications, however, where this assumption is not appropriate. For example, consider observations x1, ..., xn, where xt denotes a 1 or 0, depending on whether or not the air pollution on the tth day at a certain site exceeds a government standard. A model for these data should only allow the values 0 and 1 for each daily observation, thereby precluding the normality assumption imposed by a Gaussian model. Nevertheless, Gaussian processes can still be used as building blocks to construct more complex models that are appropriate for non-Gaussian data. See [3–5] for more on modeling non-Gaussian data.

Basic Properties

A real-valued stochastic process {Xt, t ∈ T}, where T is an index set, is a Gaussian process if all the finite-dimensional distributions have a multivariate normal distribution. That is, for any choice of distinct values t1, ..., tk ∈ T, the random vector X = (Xt1, ..., Xtk)' has a multivariate normal distribution with mean vector m = EX and covariance matrix Σ = cov(X, X), which will be denoted by

    X ~ N(m, Σ)

Provided the covariance matrix Σ is nonsingular, the random vector X has a Gaussian probability density function given by

    f_X(x) = (2π)^(-n/2) (det Σ)^(-1/2) exp{ -(1/2) (x - m)' Σ^(-1) (x - m) }    (1)

In environmental applications, the subscript t will typically denote a point in time, or space, or space and time. For simplicity, we shall restrict attention to the case of time series for which t represents time. In such cases, the index set T is usually [0, ∞) for time series recorded continuously or {0, 1, ...} for time series recorded at equally spaced time units. The mean and covariance functions of a Gaussian process are defined by

    μ(t) = E Xt    (2)

and

    γ(s, t) = cov(Xs, Xt)    (3)

respectively. While Gaussian processes depend only on these two quantities, modeling can be difficult without introducing further simplifications on the form of the mean and covariance functions. The assumption of stationarity frequently provides the proper level of simplification without sacrificing much generalization. Moreover, after applying elementary transformations to the data, the assumption of stationarity of the transformed data is often quite plausible.

A Gaussian time series {Xt} is said to be stationary if

1. μ(t) = E Xt = μ is independent of t, and
2. γ(t + h, t) = cov(Xt+h, Xt) is independent of t for all h.

For stationary processes, it is conventional to express the covariance function as a function on T instead of on T × T. That is, we define γ(h) = cov(Xt+h, Xt) and call it the autocovariance function of the process. For stationary Gaussian processes {Xt}, we have

3. Xt ~ N(μ, γ(0)) for all t, and
4. (Xt+h, Xt)' has a bivariate normal distribution with covariance matrix

       [ γ(0)  γ(h) ]
       [ γ(h)  γ(0) ]

   for all t and h.
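Because the finite-dimensional distributions are multivariate normal with constant mean μ and covariances γ(|i − j|), a stretch X1, ..., Xn of a stationary Gaussian series can be simulated directly by forming the n × n covariance matrix and drawing one multivariate normal vector. The sketch below does this in Python/NumPy; the exponentially decaying autocovariance γ(h) = σ²φ^|h| is only an illustrative choice, not a formula from the article.

```python
import numpy as np

def simulate_stationary_gaussian(n, mean, acvf, rng=None):
    """Draw one realization X_1, ..., X_n of a stationary Gaussian series
    with constant mean `mean` and autocovariance function `acvf(h)`."""
    rng = np.random.default_rng(rng)
    lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    gamma = acvf(lags)                      # covariance matrix Gamma_ij = gamma(|i - j|)
    return rng.multivariate_normal(np.full(n, mean), gamma)

# Illustrative (assumed) autocovariance: gamma(h) = sigma^2 * phi^|h|
sigma2, phi = 1.0, 0.6
acvf = lambda h: sigma2 * phi ** np.abs(h)

x = simulate_stationary_gaussian(200, mean=0.0, acvf=acvf, rng=0)
print(x[:5])
```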
A general stochastic process {Xt} satisfying conditions 1 and 2 is said to be weakly or second-order stationary. The first- and second-order moments of weakly stationary processes are invariant with respect to time translations. A stochastic process {Xt} is strictly stationary if the distribution of (Xt1, ..., Xtn) is the same as that of (Xt1+s, ..., Xtn+s) for any s. In other words, the distributional properties of the time series are the same under any time translation. For Gaussian time series, the concepts of weak and strict stationarity coalesce. This result follows immediately from the fact that for weakly stationary processes, (Xt1, ..., Xtn) and (Xt1+s, ..., Xtn+s) have the same mean vector and covariance matrix. Since each of the two vectors has a multivariate normal distribution, they must be identically distributed.

Properties of the Autocovariance Function

An autocovariance function γ(·) has the properties:

1. γ(0) ≥ 0,
2. |γ(h)| ≤ γ(0) for all h,
3. γ(h) = γ(−h), i.e. γ(·) is an even function.

Autocovariances have another fundamental property, namely that of non-negative definiteness,

    Σ_{i,j=1}^{n} a_i γ(t_i − t_j) a_j ≥ 0    (4)

for all positive integers n, real numbers a1, ..., an, and t1, ..., tn ∈ T. Note that the expression on the left of (4) is merely the variance of a1 Xt1 + ··· + an Xtn and hence must be non-negative. Conversely, if a function γ(·) is non-negative definite and even, then it must be an autocovariance function of some stationary Gaussian process.

Gaussian Linear Processes

If {Xt, t = 0, ±1, ±2, ...} is a stationary Gaussian process with mean 0, then the Wold decomposition allows Xt to be expressed as a sum of two independent components,

    Xt = Σ_{j=0}^{∞} ψ_j Z_{t−j} + Vt    (5)

where {Zt} is a sequence of independent and identically distributed (iid) normal random variables with mean 0 and variance σ², {ψj} is a sequence of square summable coefficients with ψ0 = 1, and {Vt} is a deterministic process that is independent of {Zt}. The Zt are referred to as innovations and are defined by Zt = Xt − E(Xt | Xt−1, Xt−2, ...). A process {Vt} is deterministic if Vt is completely determined by its past history {Vs, s < t}. An example of such a process is the random sinusoid, Vt = A cos(ωt + Θ), where A and Θ are independent random variables with A ≥ 0 and Θ distributed uniformly on [0, 2π). In this case, V2 is completely determined by the values of V0 and V1. In most time series modeling applications, the deterministic component of a time series is either not present or easily removed.

Purely nondeterministic Gaussian processes do not possess a deterministic component and can be represented as a Gaussian linear process,

    Xt = Σ_{j=0}^{∞} ψ_j Z_{t−j}    (6)

The autocovariance of {Xt} has the form

    γ(h) = σ² Σ_{j=0}^{∞} ψ_j ψ_{j+h}    (7)

The class of autoregressive (AR) processes, and its extensions, autoregressive moving-average (ARMA) processes, are dense in the class of Gaussian linear processes. A Gaussian AR(p) process satisfies the recursions

    Xt = φ1 Xt−1 + ··· + φp Xt−p + Zt    (8)

where {Zt} is an iid sequence of N(0, σ²) random variables, and the polynomial φ(z) = 1 − φ1 z − ··· − φp z^p has no zeros inside or on the unit circle. The AR(p) process has a linear representation (6) where the coefficients are found as functions of the φj (see [2]).
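As a concrete illustration of the recursion (8), the sketch below simulates a Gaussian AR(2) series after first checking that φ(z) = 1 − φ1 z − φ2 z² has no zeros inside or on the unit circle. The parameter values φ1 = 0.5, φ2 = −0.3, σ = 1 are illustrative choices only, not values from the article.

```python
import numpy as np

def simulate_gaussian_ar(phi, sigma, n, burn=500, rng=None):
    """Simulate X_t = phi_1 X_{t-1} + ... + phi_p X_{t-p} + Z_t, Z_t iid N(0, sigma^2)."""
    phi = np.asarray(phi, dtype=float)
    p = len(phi)
    # check that phi(z) = 1 - phi_1 z - ... - phi_p z^p has no roots with |z| <= 1
    roots = np.roots(np.concatenate(([1.0], -phi))[::-1])   # np.roots wants highest power first
    if np.any(np.abs(roots) <= 1.0):
        raise ValueError("phi(z) has a zero inside or on the unit circle")
    rng = np.random.default_rng(rng)
    z = rng.normal(0.0, sigma, size=n + burn)
    x = np.zeros(n + burn)
    for t in range(p, n + burn):
        x[t] = np.dot(phi, x[t - p:t][::-1]) + z[t]          # recursion (8)
    return x[burn:]                                          # discard burn-in

x = simulate_gaussian_ar([0.5, -0.3], sigma=1.0, n=1000, rng=1)   # illustrative AR(2)
print(x.mean(), x.var())
```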
Now for any Gaussian linear process, there exists an AR(p) process such that the difference in the two autocovariance functions can be made arbitrarily small for all lags. In fact, the autocovariances can be matched up perfectly for the first p lags.

Prediction

Recall that if two random vectors X1 and X2 have a joint normal distribution, i.e.

    (X1', X2')' ~ N( (m1', m2')', [ Σ11  Σ12 ; Σ21  Σ22 ] )

and Σ22 is nonsingular, then the conditional distribution of X1 given X2 has a multivariate normal distribution with mean

    m_{X1|X2} = m1 + Σ12 Σ22^(-1) (X2 − m2)    (9)

and covariance matrix

    Σ_{X1|X2} = Σ11 − Σ12 Σ22^(-1) Σ21    (10)

The key observation here is that the best mean square error predictor of X1 in terms of X2 (i.e. the multivariate function g(X2) that minimizes E||X1 − g(X2)||², where ||·|| is Euclidean distance) is E(X1 | X2) = m_{X1|X2}, which is a linear function of X2. Also, the covariance matrix of prediction error, Σ_{X1|X2}, does not depend on the value of X2. These results extend directly to the prediction problem for Gaussian processes. Suppose {Xt, t = 1, 2, ...} is a stationary Gaussian …

… subset consisting of linearly independent variables. The covariance matrix of this prediction subset will be nonsingular. A mild and easily verifiable condition for ensuring nonsingularity of Γn for all n is that γ(h) → 0 as h → ∞ with γ(0) > 0 (see [1]). While (12) and (13) completely solve the prediction problem, these equations require the inversion of an n × n covariance matrix which may be difficult and time consuming for large n. The Durbin–Levinson algorithm (see [1]) allows one to compute the coefficient vector φn = (φn1, ..., φnn)' and the one-step prediction errors vn recursively from φn−1, vn−1, and the autocovariance function.

The Durbin–Levinson Algorithm

The coefficients φn in the calculation of the one-step prediction error (11) and the mean square error of prediction (13) can be computed recursively from the equations

    φ_nn = [ γ(n) − Σ_{j=1}^{n−1} φ_{n−1,j} γ(n − j) ] v_{n−1}^(-1)

    (φ_{n,1}, ..., φ_{n,n−1})' = (φ_{n−1,1}, ..., φ_{n−1,n−1})' − φ_nn (φ_{n−1,n−1}, ..., φ_{n−1,1})'
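The recursion above translates directly into a few lines of code. The sketch below computes φn and vn from a supplied sequence of autocovariances; the variance update vn = vn−1(1 − φnn²), which falls outside the visible portion of this excerpt, is the standard form of the Durbin–Levinson step and is included here as an assumption.

```python
import numpy as np

def durbin_levinson(gamma):
    """Durbin-Levinson recursion.

    gamma : autocovariances [gamma(0), gamma(1), ..., gamma(n_max)].
    Returns (phi, v), where phi[n] = (phi_{n,1}, ..., phi_{n,n}) are the coefficients
    of the best linear one-step predictor based on n past values and v[n] is the
    corresponding one-step mean square prediction error.
    """
    gamma = np.asarray(gamma, dtype=float)
    n_max = len(gamma) - 1
    phi = [np.array([])]                      # phi_0 is empty
    v = [gamma[0]]                            # v_0 = gamma(0)
    for n in range(1, n_max + 1):
        prev = phi[n - 1]
        # phi_nn = [gamma(n) - sum_{j=1}^{n-1} phi_{n-1,j} gamma(n-j)] / v_{n-1}
        phi_nn = (gamma[n] - np.dot(prev, gamma[n - 1:0:-1])) / v[n - 1]
        # (phi_{n,1}, ..., phi_{n,n-1}) = phi_{n-1} - phi_nn * reversed(phi_{n-1})
        phi.append(np.append(prev - phi_nn * prev[::-1], phi_nn))
        # standard variance update (assumed; not shown in the excerpt above)
        v.append(v[n - 1] * (1.0 - phi_nn ** 2))
    return phi, v

# Example with the AR(1)-type autocovariance gamma(h) = sigma^2 * rho^|h|
sigma2, rho = 1.0, 0.6
gamma = sigma2 * rho ** np.arange(6)
phi, v = durbin_levinson(gamma)
print(phi[3], v[3])    # for this acvf, phi_{3,1} equals rho and the other coefficients vanish
```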
Recommended publications
  • Stationary Processes and Their Statistical Properties
    Stationary Processes and Their Statistical Properties Brian Borchers March 29, 2001

    1 Stationary processes

    A discrete time stochastic process is a sequence of random variables Z1, Z2, .... In practice we will typically analyze a single realization z1, z2, ..., zn of the stochastic process and attempt to estimate the statistical properties of the stochastic process from the realization. We will also consider the problem of predicting zn+1 from the previous elements of the sequence. We will begin by focusing on the very important class of stationary stochastic processes. A stochastic process is strictly stationary if its statistical properties are unaffected by shifting the stochastic process in time. In particular, this means that if we take a subsequence Zk+1, ..., Zk+m, then the joint distribution of the m random variables will be the same no matter what k is. Stationarity requires that the mean of the stochastic process be a constant, E[Zk] = μ, and that the variance is constant, Var[Zk] = σZ². Also, stationarity requires that the covariance of two elements separated by a distance m is constant. That is, Cov(Zk, Zk+m) is constant. This covariance is called the autocovariance at lag m, and we will use the notation γm. Since Cov(Zk, Zk+m) = Cov(Zk+m, Zk), we need only find γm for m ≥ 0. The correlation of Zk and Zk+m is the autocorrelation at lag m. We will use the notation ρm for the autocorrelation. It is easy to show that ρk = γk / γ0.

    2 The autocovariance and autocorrelation matrices

    The covariance matrix for the random variables Z1, ..., Zn is called an autocovariance matrix.
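A quick way to connect these definitions to data is to estimate the sample autocovariance γ̂m and autocorrelation ρ̂m = γ̂m/γ̂0 from a single realization z1, ..., zn. The sketch below (Python/NumPy) uses the common convention of dividing by n; the white-noise series is purely illustrative.

```python
import numpy as np

def autocovariance(z, max_lag):
    """Sample autocovariances gamma_hat_m = (1/n) * sum (z_k - zbar)(z_{k+m} - zbar)."""
    z = np.asarray(z, dtype=float)
    n = len(z)
    d = z - z.mean()
    return np.array([np.dot(d[: n - m], d[m:]) / n for m in range(max_lag + 1)])

def autocorrelation(z, max_lag):
    """Sample autocorrelations rho_hat_m = gamma_hat_m / gamma_hat_0."""
    g = autocovariance(z, max_lag)
    return g / g[0]

z = np.random.default_rng(0).normal(size=500)   # illustrative realization (white noise)
print(np.round(autocorrelation(z, 5), 3))       # near 0 for all lags m >= 1
```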
  • Scalable Nonparametric Bayesian Inference on Point Processes with Gaussian Processes
    Scalable Nonparametric Bayesian Inference on Point Processes with Gaussian Processes Yves-Laurent Kom Samo [email protected] Stephen Roberts [email protected] Department of Engineering Science and Oxford-Man Institute, University of Oxford

    Abstract In this paper we propose an efficient, scalable non-parametric Gaussian process model for inference on Poisson point processes. Our model does not resort to gridding the domain or to introducing latent thinning points. Unlike competing models that scale as O(n³) over n data points, our model has a complexity O(nk²) where k ≪ n. We propose a MCMC sampler and show that the model obtained is faster, more accurate and generates less correlated samples than competing approaches on both synthetic and real-life data. Finally, we show that our model easily handles data sizes not considered thus far by alternate approaches.

    2. Related Work Non-parametric inference on point processes has been extensively studied in the literature. Rathbun & Cressie (1994) and Moeller et al. (1998) used a finite-dimensional piecewise constant log-Gaussian for the intensity function. Such approximations are limited in that the choice of the grid on which to represent the intensity function is arbitrary and one has to trade off precision with computational complexity and numerical accuracy, with the complexity being cubic in the precision and exponential in the dimension of the input space. Kottas (2006) and Kottas & Sanso (2007) used a Dirichlet process mixture of Beta distributions as prior for the normalised intensity function of a Poisson process. Cunningham et al.
  • Stochastic Process - Introduction
    Stochastic Process - Introduction
    • Stochastic processes are processes that proceed randomly in time.
    • Rather than consider fixed random variables X, Y, etc. or even sequences of i.i.d random variables, we consider sequences X0, X1, X2, ..., where Xt represents some random quantity at time t.
    • In general, the value Xt might depend on the quantity Xt-1 at time t-1, or even the value Xs for other times s < t.
    • Example: simple random walk.

    Stochastic Process - Definition
    • A stochastic process is a family of time indexed random variables Xt where t belongs to an index set. Formal notation, {Xt : t ∈ I}, where I is an index set that is a subset of R.
    • Examples of index sets: 1) I = (-∞, ∞) or I = [0, ∞]. In this case Xt is a continuous time stochastic process. 2) I = {0, ±1, ±2, ...} or I = {0, 1, 2, ...}. In this case Xt is a discrete time stochastic process.
    • We use uppercase letters {Xt} to describe the process. A time series, {xt}, is a realization or sample function from a certain process.
    • We use information from a time series to estimate parameters and properties of the process {Xt}.

    Probability Distribution of a Process
    • For any stochastic process with index set I, its probability distribution function is uniquely determined by its finite dimensional distributions.
    • The k dimensional distribution function of a process is defined by FXt1,...,Xtk(x1, ..., xk) = P(Xt1 ≤ x1, ..., Xtk ≤ xk) for any t1, ..., tk ∈ I and any real numbers x1, ..., xk.
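The simple random walk mentioned in the excerpt above takes only a few lines to simulate; here is a minimal sketch, with the ±1 step distribution assumed as the usual textbook choice rather than taken from the notes.

```python
import numpy as np

def simple_random_walk(n, rng=None):
    """X_0 = 0 and X_t = X_{t-1} + S_t, where the steps S_t are iid +1 or -1 with probability 1/2."""
    rng = np.random.default_rng(rng)
    steps = rng.choice([-1, 1], size=n)
    return np.concatenate(([0], np.cumsum(steps)))

x = simple_random_walk(10, rng=0)
print(x)    # one realization x_0, ..., x_10 of the process {X_t}
```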
  • Deep Neural Networks As Gaussian Processes
    Published as a conference paper at ICLR 2018 DEEP NEURAL NETWORKS AS GAUSSIAN PROCESSES Jaehoon Lee, Yasaman Bahri, Roman Novak, Samuel S. Schoenholz, Jeffrey Pennington, Jascha Sohl-Dickstein Google Brain fjaehlee, yasamanb, romann, schsam, jpennin, [email protected] ABSTRACT It has long been known that a single-layer fully-connected neural network with an i.i.d. prior over its parameters is equivalent to a Gaussian process (GP), in the limit of infinite network width. This correspondence enables exact Bayesian inference for infinite width neural networks on regression tasks by means of evaluating the corresponding GP. Recently, kernel functions which mimic multi-layer random neural networks have been developed, but only outside of a Bayesian framework. As such, previous work has not identified that these kernels can be used as covariance functions for GPs and allow fully Bayesian prediction with a deep neural network. In this work, we derive the exact equivalence between infinitely wide deep networks and GPs. We further develop a computationally efficient pipeline to compute the covariance function for these GPs. We then use the resulting GPs to perform Bayesian inference for wide deep neural networks on MNIST and CIFAR-10. We observe that trained neural network accuracy approaches that of the corresponding GP with increasing layer width, and that the GP uncertainty is strongly correlated with trained network prediction error. We further find that test performance increases as finite-width trained networks are made wider and more similar to a GP, and thus that GP predictions typically outperform those of finite-width networks.
  • Power Comparisons of Five Most Commonly Used Autocorrelation Tests
    Pak.j.stat.oper.res. Vol.16 No. 1 2020 pp119-130 DOI: http://dx.doi.org/10.18187/pjsor.v16i1.2691 Pakistan Journal of Statistics and Operation Research Power Comparisons of Five Most Commonly Used Autocorrelation Tests Stanislaus S. Uyanto School of Economics and Business, Atma Jaya Catholic University of Indonesia, [email protected] Abstract In regression analysis, autocorrelation of the error terms violates the ordinary least squares assumption that the error terms are uncorrelated. The consequence is that the estimates of coefficients and their standard errors will be wrong if the autocorrelation is ignored. There are many tests for autocorrelation; we want to know which test is more powerful. We use Monte Carlo methods to compare the power of the five most commonly used tests for autocorrelation, namely the Durbin-Watson, Breusch-Godfrey, Box-Pierce, Ljung-Box, and Runs tests, in two different linear regression models. The results indicate the Durbin-Watson test performs better in the regression model without lagged dependent variable, although the advantage over the other tests reduces with increasing autocorrelation and sample sizes. For the model with lagged dependent variable, the Breusch-Godfrey test is generally superior to the other tests. Key Words: Correlated error terms; Ordinary least squares assumption; Residuals; Regression diagnostic; Lagged dependent variable. Mathematical Subject Classification: 62J05; 62J20. 1. Introduction An important assumption of the classical linear regression model states that there is no autocorrelation in the error terms. Autocorrelation of the error terms violates the ordinary least squares assumption that the error terms are uncorrelated, meaning that the Gauss-Markov theorem (Plackett, 1949, 1950; Greene, 2018) does not apply, and that ordinary least squares estimators are no longer the best linear unbiased estimators.
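For reference, the Durbin-Watson statistic named above is straightforward to compute from OLS residuals, DW = Σ_{t=2}^{n}(e_t − e_{t−1})² / Σ_{t=1}^{n} e_t²; values near 2 indicate little first-order autocorrelation, while values well below 2 suggest positive autocorrelation. The sketch below is illustrative only (the simulated regression and AR(1) error coefficient are assumptions, not data from the paper).

```python
import numpy as np

def durbin_watson(residuals):
    """DW = sum_{t=2}^{n} (e_t - e_{t-1})^2 / sum_{t=1}^{n} e_t^2."""
    e = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# Illustrative example: OLS residuals from y = 1 + 2x + u with AR(1) errors
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.7 * u[t - 1] + rng.normal()        # positively autocorrelated errors
y = 1.0 + 2.0 * x + u
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
print(round(durbin_watson(resid), 3))           # well below 2 for these errors
```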
  • Autocovariance Function Estimation Via Penalized Regression
    Autocovariance Function Estimation via Penalized Regression Lina Liao, Cheolwoo Park Department of Statistics, University of Georgia, Athens, GA 30602, USA Jan Hannig Department of Statistics and Operations Research, University of North Carolina, Chapel Hill, NC 27599, USA Kee-Hoon Kang Department of Statistics, Hankuk University of Foreign Studies, Yongin, 449-791, Korea

    Abstract The work revisits the autocovariance function estimation, a fundamental problem in statistical inference for time series. We convert the function estimation problem into constrained penalized regression with a generalized penalty that provides us with flexible and accurate estimation, and study the asymptotic properties of the proposed estimator. In case of a nonzero mean time series, we apply a penalized regression technique to a differenced time series, which does not require a separate detrending procedure. In penalized regression, selection of tuning parameters is critical and we propose four different data-driven criteria to determine them. A simulation study shows effectiveness of the tuning parameter selection and that the proposed approach is superior to three existing methods. We also briefly discuss the extension of the proposed approach to interval-valued time series. Some key words: Autocovariance function; Differenced time series; Regularization; Time series; Tuning parameter selection.

    1 Introduction Let {Yt, t ∈ T} be a stochastic process such that Var(Yt) < ∞ for all t ∈ T. The autocovariance function of {Yt} is given as γ(s, t) = Cov(Ys, Yt) for all s, t ∈ T. In this work, we consider a regularly sampled time series {(i, Yi), i = 1, ..., n}. Its model can be written as Yi = g(i) + εi, i = 1, ..., n, (1.1) where g is a smooth deterministic function and the error εi is assumed to be a weakly stationary process with E(εi) = 0, Var(εi) = σ² and Cov(εi, εj) = γ(|i − j|) for all i, j = 1, ..., n. The estimation of the autocovariance function γ (or autocorrelation function) is crucial to determine the degree of serial correlation in a time series.
  • LECTURES 2 - 3 : Stochastic Processes, Autocorrelation Function
    LECTURES 2 - 3 : Stochastic Processes, Autocorrelation function. Stationarity. Important points of Lecture 1: A time series {Xt} is a series of observations taken sequentially over time: xt is an observation recorded at a specific time t. Characteristics of time series data: observations are dependent, become available at equally spaced time points and are time-ordered. This is a discrete time series. The purposes of time series analysis are to model and to predict or forecast future values of a series based on the history of that series.

    2.2 Some descriptive techniques. (Based on [BD] §1.3 and §1.4)

    Take a step backwards: how do we describe a r.v. or a random vector?
    • for a r.v. X: d.f. FX(x) := P(X ≤ x), mean μ = EX and variance σ² = Var(X).
    • for a r.vector (X1, X2): joint d.f. FX1,X2(x1, x2) := P(X1 ≤ x1, X2 ≤ x2), marginal d.f. FX1(x1) := P(X1 ≤ x1) ≡ FX1,X2(x1, ∞), mean vector (μ1, μ2) = (EX1, EX2), variances σ1² = Var(X1), σ2² = Var(X2), and covariance Cov(X1, X2) = E(X1 − μ1)(X2 − μ2) ≡ E(X1X2) − μ1μ2. Often we use correlation = normalized covariance: Cor(X1, X2) = Cov(X1, X2)/(σ1σ2).

    To describe a process X1, X2, ... we define
    (i) Def. Distribution function: (fi-di) d.f. Ft1...tn(x1, ..., xn) = P(Xt1 ≤ x1, ..., Xtn ≤ xn); i.e. this is the joint d.f. for the vector (Xt1, ..., Xtn).
    (ii) First- and Second-order moments.
    • Mean: μX(t) = EXt
    • Variance: σX²(t) = E(Xt − μX(t))² ≡ EXt² − μX²(t)
    • Autocovariance function: γX(t, s) = Cov(Xt, Xs) = E[(Xt − μX(t))(Xs − μX(s))] ≡ E(XtXs) − μX(t)μX(s) (Note: this is an infinite matrix).
  • Alternative Tests for Time Series Dependence Based on Autocorrelation Coefficients
    Alternative Tests for Time Series Dependence Based on Autocorrelation Coefficients Richard M. Levich and Rosario C. Rizzo * Current Draft: December 1998 Abstract: When autocorrelation is small, existing statistical techniques may not be powerful enough to reject the hypothesis that a series is free of autocorrelation. We propose two new and simple statistical tests (RHO and PHI) based on the unweighted sum of autocorrelation and partial autocorrelation coefficients. We analyze a set of simulated data to show the higher power of RHO and PHI in comparison to conventional tests for autocorrelation, especially in the presence of small but persistent autocorrelation. We show an application of our tests to data on currency futures to demonstrate their practical use. Finally, we indicate how our methodology could be used for a new class of time series models (the Generalized Autoregressive, or GAR models) that take into account the presence of small but persistent autocorrelation. _______________________________________________________________ An earlier version of this paper was presented at the Symposium on Global Integration and Competition, sponsored by the Center for Japan-U.S. Business and Economic Studies, Stern School of Business, New York University, March 27-28, 1997. We thank Paul Samuelson and the other participants at that conference for useful comments. * Stern School of Business, New York University and Research Department, Bank of Italy, respectively. 1. Introduction Economic time series are often characterized by positive
  • 3 Autocorrelation
    3 Autocorrelation Autocorrelation refers to the correlation of a time series with its own past and future values. Autocorrelation is also sometimes called “lagged correlation” or “serial correlation”, which refers to the correlation between members of a series of numbers arranged in time. Positive autocorrelation might be considered a specific form of “persistence”, a tendency for a system to remain in the same state from one observation to the next. For example, the likelihood of tomorrow being rainy is greater if today is rainy than if today is dry. Geophysical time series are frequently autocorrelated because of inertia or carryover processes in the physical system. For example, the slowly evolving and moving low pressure systems in the atmosphere might impart persistence to daily rainfall. Or the slow drainage of groundwater reserves might impart correlation to successive annual flows of a river. Or stored photosynthates might impart correlation to successive annual values of tree-ring indices. Autocorrelation complicates the application of statistical tests by reducing the effective sample size. Autocorrelation can also complicate the identification of significant covariance or correlation between time series (e.g., precipitation with a tree-ring series). Autocorrelation implies that a time series is predictable, probabilistically, as future values are correlated with current and past values. Three tools for assessing the autocorrelation of a time series are (1) the time series plot, (2) the lagged scatterplot, and (3) the autocorrelation function. 3.1 Time series plot Positively autocorrelated series are sometimes referred to as persistent because positive departures from the mean tend to be followed by positive departures from the mean, and negative departures from the mean tend to be followed by negative departures (Figure 3.1).
  • Spatial Autocorrelation: Covariance and Semivariance
    Spatial Autocorrelation: Covariance and Semivariance Lily House-Peters GEOG 593 November 10, 2009

    Quantitative Terrain Descriptors Covariance and Semivariogram are numeric methods used to describe the character of the terrain (ex. terrain roughness, irregularity). Terrain character has important implications for: 1. the type of sampling strategy chosen 2. estimating DTM accuracy (after sampling and reconstruction)

    Spatial Autocorrelation The First Law of Geography: “Everything is related to everything else, but near things are more related than distant things.” (Waldo Tobler) The value of a variable at one point in space is related to the value of that same variable in a nearby location. Ex. Moran's I, Geary's C, LISA. Positive Spatial Autocorrelation (Neighbors are similar). Negative Spatial Autocorrelation (Neighbors are dissimilar). R(d) = correlation coefficient of all the points with horizontal interval (d).

    Covariance The degree of similarity between pairs of surface points. The value of similarity is an indicator of the complexity of the terrain surface. Smaller the similarity = more complex the terrain surface. V = Variance calculated from all N points. Cov(d) = Covariance of all points with horizontal interval d. Zi = Height of point i. M = average height of all points. Zi+d = elevation of the point with an interval of d from i.

    Semivariance Expresses the degree of relationship between points on a surface. Equal to half the variance of the differences between all possible points spaced a constant distance apart.
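Both descriptors can be computed from a one-dimensional transect of elevations in a few lines. The sketch below follows the definitions given above (Cov(d) centers on the overall mean M; the semivariance is half the mean squared difference of points spaced d apart, the usual estimator); the elevation profile and the exact averaging conventions are illustrative assumptions.

```python
import numpy as np

def covariance_at_lag(z, d):
    """Cov(d): average of (Z_i - M)(Z_{i+d} - M) over pairs at horizontal interval d."""
    z = np.asarray(z, dtype=float)
    m = z.mean()                                   # M = average height of all points
    return np.mean((z[:-d] - m) * (z[d:] - m)) if d > 0 else np.mean((z - m) ** 2)

def semivariance_at_lag(z, d):
    """gamma(d): half the mean squared difference of points spaced d apart."""
    z = np.asarray(z, dtype=float)
    return 0.5 * np.mean((z[:-d] - z[d:]) ** 2) if d > 0 else 0.0

elevation = np.array([102.0, 103.5, 104.1, 103.0, 101.2, 100.8, 102.6, 104.4])  # illustrative profile
for d in (1, 2, 3):
    print(d, round(covariance_at_lag(elevation, d), 3), round(semivariance_at_lag(elevation, d), 3))
```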
  • Gaussian Process Dynamical Models for Human Motion
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 30, NO. 2, FEBRUARY 2008 283 Gaussian Process Dynamical Models for Human Motion Jack M. Wang, David J. Fleet, Senior Member, IEEE, and Aaron Hertzmann, Member, IEEE Abstract—We introduce Gaussian process dynamical models (GPDMs) for nonlinear time series analysis, with applications to learning models of human pose and motion from high-dimensional motion capture data. A GPDM is a latent variable model. It comprises a low-dimensional latent space with associated dynamics, as well as a map from the latent space to an observation space. We marginalize out the model parameters in closed form by using Gaussian process priors for both the dynamical and the observation mappings. This results in a nonparametric model for dynamical systems that accounts for uncertainty in the model. We demonstrate the approach and compare four learning algorithms on human motion capture data, in which each pose is 50-dimensional. Despite the use of small data sets, the GPDM learns an effective representation of the nonlinear dynamics in these spaces. Index Terms—Machine learning, motion, tracking, animation, stochastic processes, time series analysis.

    1 INTRODUCTION Good statistical models for human motion are important for many applications in vision and graphics, notably visual tracking, activity recognition, and computer animation. It is well known in computer vision that the estimation of 3D human motion from a monocular video sequence is highly ambiguous. […] models such as hidden Markov model (HMM) and linear dynamical systems (LDS) are efficient and easily learned but limited in their expressiveness for complex motions. More expressive models such as switching linear dynamical systems (SLDS) and nonlinear dynamical systems (NLDS),
  • Modeling and Generating Multivariate Time Series with Arbitrary Marginals Using a Vector Autoregressive Technique
    Modeling and Generating Multivariate Time Series with Arbitrary Marginals Using a Vector Autoregressive Technique Bahar Deler Barry L. Nelson Dept. of IE & MS, Northwestern University September 2000

    Abstract We present a model for representing stationary multivariate time series with arbitrary marginal distributions and autocorrelation structures and describe how to generate data quickly and accurately to drive computer simulations. The central idea is to transform a Gaussian vector autoregressive process into the desired multivariate time-series input process that we presume as having a VARTA (Vector-Autoregressive-To-Anything) distribution. We manipulate the correlation structure of the Gaussian vector autoregressive process so that we achieve the desired correlation structure for the simulation input process. For the purpose of computational efficiency, we provide a numerical method, which incorporates a numerical-search procedure and a numerical-integration technique, for solving this correlation-matching problem. Keywords: Computer simulation, vector autoregressive process, vector time series, multivariate input modeling, numerical integration.

    1 Introduction Representing the uncertainty or randomness in a simulated system by an input model is one of the challenging problems in the practical application of computer simulation. There are an abundance of examples, from manufacturing to service applications, where input modeling is critical; e.g., modeling the time to failure for a machining process, the demand per unit time for inventory of a product, or the times between arrivals of calls to a call center. Further, building a large-scale discrete-event stochastic simulation model may require the development of a large number of, possibly multivariate, input models. Development of these models is facilitated by accurate and automated (or nearly automated) input modeling support.
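A stripped-down version of the transform-a-Gaussian-process idea can be sketched as follows: simulate a Gaussian vector autoregressive process, push each component through the standard normal CDF, then through the inverse CDF of the desired marginal. This is not the paper's full VARTA procedure (the correlation-matching step is omitted), and the coefficient matrix, the empirical standardization, and the exponential marginals are illustrative assumptions only.

```python
import numpy as np
from scipy.stats import norm, expon

def gaussian_var1(A, n, rng=None):
    """Simulate a bivariate Gaussian VAR(1): Y_t = A @ Y_{t-1} + e_t, e_t iid N(0, I)."""
    rng = np.random.default_rng(rng)
    y = np.zeros((n, 2))
    for t in range(1, n):
        y[t] = A @ y[t - 1] + rng.normal(size=2)
    return y

A = np.array([[0.5, 0.1],
              [0.2, 0.4]])                  # illustrative, stable coefficient matrix
y = gaussian_var1(A, n=2000, rng=0)
u = norm.cdf(y / y.std(axis=0))             # standardize empirically, map each marginal to (0, 1)
x = expon.ppf(u)                            # impose the desired (here: unit-rate exponential) marginals
print(x.mean(axis=0))                       # roughly 1 for each component
```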