Asymptotic goodness-of-fit tests for the Palm mark distribution of stationary point processes with correlated marks

LOTHAR HEINRICH1, SEBASTIAN LÜCK2,* and VOLKER SCHMIDT2,** 1Institute of Mathematics, University of Augsburg, D-86135 Augsburg, Germany. E-mail: [email protected] 2Institute of Stochastics, Ulm University, D-89069 Ulm, Germany. E-mail: *[email protected]; **[email protected]

We consider spatially homogeneous marked point patterns in an unboundedly expanding convex sampling window. Our main objective is to identify the distribution of the typical mark by constructing an asymptotic χ²-goodness-of-fit test. The corresponding test statistic is based on a natural empirical version of the Palm mark distribution and a smoothed covariance estimator which turns out to be mean square consistent. Our approach does not require independent marks and allows dependences between the mark field and the point pattern. Instead we impose a suitable β-mixing condition on the underlying stationary marked point process which can be checked for a number of Poisson-based models and, in particular, in the case of geostatistical marking. In order to study test performance, our test approach is applied to detect anisotropy of specific Boolean models.

Keywords: β-mixing point process; empirical Palm mark distribution; reduced factorial moment measures; smoothed covariance estimation; χ²-goodness-of-fit test


Convergence rate and concentration inequalities for Gibbs sampling in high dimension

NENG-YI WANG1 and LIMING WU2 1Institute of Applied Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, 100190, Beijing, China. E-mail: [email protected] 2Institute of Applied Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, 100190, Beijing, China and Laboratoire de Math. CNRS-UMR 6620, Université Blaise Pascal, 63177 Aubière, France. E-mail: [email protected]

The objective of this paper is to study Gibbs sampling, a powerful Monte Carlo method, for computing the mean of an observable in very high dimension. Under Dobrushin’s uniqueness condition, we establish an explicit and sharp estimate of the exponential convergence rate and prove Gaussian concentration inequalities for the empirical mean.

Keywords: concentration inequality; coupling method; Dobrushin’s uniqueness condition; Markov chain Monte Carlo
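
As a rough illustration of the empirical mean studied above, the following minimal sketch runs a systematic-scan Gibbs sampler on a toy target, a standard bivariate normal with correlation rho. The target, the observable f(x, y) = x, and the function name are assumptions chosen for illustration; the sketch neither works in high dimension nor verifies Dobrushin’s uniqueness condition.

```python
import random


def gibbs_bivariate_normal(rho, n_sweeps, seed=0):
    """Empirical mean of f(x, y) = x along a systematic-scan Gibbs chain.

    Toy target (an assumption for illustration): standard bivariate normal
    with correlation rho, whose full conditionals are
    x | y ~ N(rho * y, 1 - rho^2) and y | x ~ N(rho * x, 1 - rho^2).
    """
    rng = random.Random(seed)
    cond_sd = (1.0 - rho * rho) ** 0.5
    x, y = 0.0, 0.0
    total = 0.0
    for _ in range(n_sweeps):
        # One sweep: update each coordinate from its full conditional.
        x = rng.gauss(rho * y, cond_sd)
        y = rng.gauss(rho * x, cond_sd)
        total += x
    return total / n_sweeps


if __name__ == "__main__":
    # The true mean of the observable is 0.
    print(gibbs_bivariate_normal(rho=0.5, n_sweeps=20000))
```

The closer rho is to ±1, the more slowly the chain mixes, which is the kind of dependence on the interaction strength that a convergence rate estimate quantifies.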


The generalized Pareto process; with a view towards application and simulation

ANA FERREIRA1,3 and LAURENS DE HAAN2,3 1ISA, Univ Tecn Lisboa, Tapada da Ajuda 1349-017 Lisboa, Portugal. E-mail: [email protected] 2Erasmus University Rotterdam, P.O. Box 1738, 3000 DR Rotterdam, The Netherlands. E-mail: [email protected] 3CEAUL, FCUL, Bloco C6 – Piso 4 Campo Grande, 749-016 Lisboa, Portugal

In extreme value statistics, the peaks-over-threshold method is widely used. The method is based on the generalized Pareto distribution characterizing probabilities of exceedances over high thresholds in $\mathbb{R}^d$. We present a generalization of this concept in the space of continuous functions. We call this the generalized Pareto process. In contrast to earlier papers, our definition is not based on a distribution function but on functional properties, and does not need a reference to a related max-stable process. As an application, we use the theory to simulate wind fields connected to disastrous storms on the basis of observed extreme but not disastrous storms. We also establish the peaks-over-threshold approach in function space.

Keywords: domain of attraction; extreme value theory; functional regular variation; generalized Pareto process; max-stable processes; peaks-over-threshold
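
For orientation, the one-dimensional generalized Pareto distribution that underlies the peaks-over-threshold method can be simulated by inverting its distribution function. The sketch below is only this scalar building block, not the function-space process of the paper; the function names and the parametrization by shape gamma and scale sigma are assumptions made for illustration.

```python
import math
import random


def gpd_quantile(u, gamma, sigma=1.0):
    """Quantile function of the generalized Pareto distribution.

    Inverts F(x) = 1 - (1 + gamma * x / sigma) ** (-1 / gamma),
    with the exponential limit F(x) = 1 - exp(-x / sigma) for gamma = 0.
    """
    if not 0.0 <= u < 1.0:
        raise ValueError("u must lie in [0, 1)")
    if gamma == 0.0:
        return -sigma * math.log(1.0 - u)
    return sigma * ((1.0 - u) ** (-gamma) - 1.0) / gamma


def sample_exceedances(n, gamma, sigma=1.0, seed=0):
    """Draw n threshold exceedances by inverse transform sampling."""
    rng = random.Random(seed)
    return [gpd_quantile(rng.random(), gamma, sigma) for _ in range(n)]


if __name__ == "__main__":
    print(sample_exceedances(5, gamma=0.2))
```

A negative shape parameter gives a bounded upper tail with endpoint sigma / |gamma|, while a positive one gives a heavy tail.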


Model comparison with composite likelihood information criteria

CHI TIM NG1,2 and HARRY JOE3 1Department of Statistics, Seoul National University, Room 430, Building 25, Seoul, South Korea. E-mail: *[email protected] 2Department of Statistics, Chonnam National University, Gwangju, 500-757, South Korea 3Department of Statistics, University of British Columbia, Room ESB 3138, Earth Sciences Building, Vancouver, Canada. E-mail: [email protected]

Comparisons are made for the amount of agreement of the composite likelihood information criteria and their full likelihood counterparts when making decisions among the fits of different models, and some properties of the penalty term for composite likelihood information criteria are obtained. Asymptotic theory is given for the case when a simpler model is nested within a bigger model, and the bigger model approaches the simpler model under a sequence of local alternatives. Composite likelihood can more or less frequently choose the bigger model, depending on the direction of local alternatives; in the former case, composite likelihood has more “power” to choose the bigger model. The behaviors of the information criteria are illustrated via theory and simulation examples of the Gaussian linear mixed-effects model.

Keywords: Akaike information criterion; Bayesian information criterion; local alternatives; mixed-effects model; model comparison


On the form of the large deviation rate function for the empirical measures of weakly interacting systems

MARKUS FISCHER Department of Mathematics, University of Padua, via Trieste 63, 35121 Padova, Italy. E-mail: fi[email protected]

A basic result of large deviations theory is Sanov’s theorem, which states that the sequence of empirical measures of independent and identically distributed samples satisfies the large deviation principle with rate function given by relative entropy with respect to the common distribution. Large deviation principles for the empirical measures are also known to hold for broad classes of weakly interacting systems. When the interaction through the empirical measure corresponds to an absolutely continuous change of measure, the rate function can be expressed as relative entropy of a distribution with respect to the law of the McKean–Vlasov limit with measure-variable frozen at that distribution. We discuss situations, beyond that of tilted distributions, in which a large deviation principle holds with rate function in relative entropy form.

Keywords: empirical measure; Laplace principle; large deviations; mean field interaction; particle system; relative entropy; Wiener measure
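
To make the rate function in Sanov’s theorem concrete, the following sketch evaluates the relative entropy of one probability vector with respect to another on a finite state space. The restriction to finite state spaces and the function name are assumptions made for illustration; the interacting systems treated in the paper are not covered.

```python
import math


def relative_entropy(nu, mu):
    """Relative entropy R(nu || mu) = sum_i nu_i * log(nu_i / mu_i).

    nu and mu are probability vectors of equal length. The value is
    infinite when nu is not absolutely continuous with respect to mu.
    """
    total = 0.0
    for p, q in zip(nu, mu):
        if p == 0.0:
            continue  # convention: 0 * log(0 / q) = 0
        if q == 0.0:
            return math.inf
        total += p * math.log(p / q)
    return total


if __name__ == "__main__":
    # Cost of observing an empirical measure (0.8, 0.2) under a fair coin.
    print(relative_entropy([0.8, 0.2], [0.5, 0.5]))
```

The rate vanishes exactly when the two distributions coincide, which matches the law of large numbers for the empirical measure.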


Minimax bounds for estimation of normal mixtures

ARLENE K.H. KIM Statistical Laboratory, Center for Mathematical Sciences, University of Cambridge, Wilberforce Road, Cambridge, CB3 0WB, UK. E-mail: [email protected]

This paper deals with minimax rates of convergence for estimation of density functions on the real line. The densities are assumed to be location mixtures of normals, a global regularity requirement that creates subtle difficulties for the application of standard minimax lower bound methods. Using novel Fourier and Hermite polynomial techniques, we determine the minimax optimal rate – slightly larger than the parametric rate – under squared error loss. For Hellinger loss, we provide a minimax lower bound using ideas modified from the squared error loss case.

Keywords: Assouad’s lemma; Hermite polynomials; minimax lower bound; normal location mixture


Local extinction in continuous-state branching processes with immigration

CLÉMENT FOUCART1 and GERÓNIMO URIBE BRAVO2 1Institut für Mathematik, Technische Universität Berlin, RTG 1845, D-10623, Berlin, Germany. E-mail: [email protected] 2Instituto de Matemáticas, Universidad Nacional Autónoma de México, Área de la Investigación Científica, Circuito Exterior, Ciudad Universitaria, Coyoacán, 04510, México, D.F. E-mail: [email protected]

The purpose of this article is to observe that the zero sets of continuous-state branching processes with immigration (CBI) are infinitely divisible regenerative sets. Indeed, they can be constructed by the procedure of random cutouts introduced by Mandelbrot in 1972. We then show how very precise information about the zero sets of CBI can be obtained in terms of the branching and immigrating mechanism.

Keywords: continuous-state branching process; polarity; random cutout; zero set


Large deviations for bootstrapped empirical measures

JOSÉ TRASHORRAS* and OLIVIER WINTENBERGER** Université Paris–Dauphine, Ceremade, Place du Maréchal de Lattre de Tassigny, 75775 Paris Cedex 16, France. E-mail: *[email protected]; **[email protected]

We investigate the Large Deviations (LD) properties of bootstrapped empirical measures with exchangeable weights. Our main results show in great generality how the resulting rate functions combine the LD properties of both the sample weights and the observations. As an application, we obtain new LD results and discuss both conditional and unconditional LD-efficiency for many classical choices of entries such as Efron’s, leave-p-out, i.i.d. weighted, k-blocks bootstraps, etc.

Keywords: exchangeable bootstrap; large deviations


Particle-kernel estimation of the filter density in state-space models

DAN CRISAN1 and JOAQUÍN MÍGUEZ2 1Department of Mathematics, Imperial College London, Huxley Building, 180 Queen’s Gate, London SW7 2BZ, UK. E-mail: [email protected] 2Department of Signal Theory & Communications, Universidad Carlos III de Madrid, Avenida de la Universidad 30, 28911 Leganés (Madrid), Spain. E-mail: [email protected]

Sequential Monte Carlo (SMC) methods, also known as particle filters, are simulation-based recursive algorithms for the approximation of the a posteriori probability measures generated by state-space dynamical models. At any given time t, an SMC method produces a set of samples over the state space of the system of interest (often termed “particles”) that is used to build a discrete and random approximation of the posterior probability distribution of the state variables, conditional on a sequence of available observations. One potential application of the methodology is the estimation of the densities associated with the sequence of a posteriori distributions. While practitioners have rather freely applied such density approximations in the past, the issue has received less attention from a theoretical perspective. In this paper, we address the problem of constructing kernel-based estimates of the posterior probability density function and its derivatives, and obtain asymptotic convergence results for the estimation errors. In particular, we find convergence rates for the approximation errors that hold uniformly on the state space and guarantee that the error vanishes almost surely as the number of particles in the filter grows. Based on this uniform convergence result, we first show how to build continuous measures that converge almost surely (with known rate) toward the posterior measure and then address a few applications. The latter include maximum a posteriori estimation of the system state using the approximate derivatives of the posterior density and the approximation of functionals of it, for example, Shannon’s entropy.

Keywords: density estimation; Markov systems; particle filtering; sequential Monte Carlo; state-space models; stochastic filtering


Optimal scaling for the transient phase of Metropolis Hastings algorithms: The longtime behavior

BENJAMIN JOURDAIN*, TONY LELIÈVRE** and BŁAŻEJ MIASOJEDOW† Université Paris-Est, CERMICS, 6 & 8, avenue Blaise Pascal, 77455 Marne-La-Vallée, France. E-mail: *[email protected]; **[email protected]; †[email protected]

We consider the Metropolis algorithm on $\mathbb{R}^n$ with Gaussian proposals, when the target probability measure is the n-fold product of a one-dimensional law. It is well known (see Roberts et al. (Ann. Appl. Probab. 7 (1997) 110–120)) that, in the limit n → ∞, starting at equilibrium and for an appropriate scaling of the variance and of the timescale as a function of the dimension n, a diffusive limit is obtained for each component of the Markov chain. In Jourdain et al. (Optimal scaling for the transient phase of the random walk Metropolis algorithm: The mean-field limit (2012) Preprint), we generalize this result to the case where the initial distribution is not the target probability measure. The obtained diffusive limit is the solution to a stochastic differential equation nonlinear in the sense of McKean. In the present paper, we prove convergence to equilibrium for this equation. We discuss practical counterparts in order to optimize the variance of the proposal distribution to accelerate convergence to equilibrium. Our analysis confirms the interest of the constant acceptance rate strategy (with acceptance rate between 1/4 and 1/3) first suggested in Roberts et al. (Ann. Appl. Probab. 7 (1997) 110–120). We also address scaling of the Metropolis-Adjusted Langevin Algorithm. When starting at equilibrium, a diffusive limit for an optimal scaling of the variance is obtained in Roberts and Rosenthal (J. R. Stat. Soc. Ser. B. Stat. Methodol. 60 (1998) 255–268). In the transient case, we obtain formally that the optimal variance scales very differently in n depending on the sign of a moment of the distribution, which vanishes at equilibrium. This suggests that it is difficult to derive practical recommendations for MALA from such asymptotic results.

Keywords: diffusion limits; MALA; optimal scaling; propagation of chaos; random walk Metropolis
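
The role of the acceptance rate can be explored numerically with a minimal sketch of the random walk Metropolis algorithm on a product target. The standard normal target, the start at equilibrium, the proposal standard deviation ell / sqrt(n), and the function name are all assumptions made for illustration; the transient phase analysed in the paper is not simulated here.

```python
import math
import random


def rwm_acceptance_rate(n, ell, n_steps, seed=0):
    """Observed acceptance rate of random walk Metropolis in dimension n.

    Target (an assumption): n-fold product of standard normal laws.
    Proposal: current state plus N(0, (ell^2 / n) * I) noise.
    """
    rng = random.Random(seed)
    sigma = ell / math.sqrt(n)
    # Start at equilibrium by drawing the initial state from the target.
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    log_pi = -0.5 * sum(v * v for v in x)
    accepted = 0
    for _ in range(n_steps):
        proposal = [v + sigma * rng.gauss(0.0, 1.0) for v in x]
        log_pi_prop = -0.5 * sum(v * v for v in proposal)
        # Metropolis rule: accept with probability min(1, pi(y) / pi(x)).
        if rng.random() < math.exp(min(0.0, log_pi_prop - log_pi)):
            x, log_pi = proposal, log_pi_prop
            accepted += 1
    return accepted / n_steps


if __name__ == "__main__":
    for ell in (0.5, 2.38, 5.0):
        print(ell, rwm_acceptance_rate(n=50, ell=ell, n_steps=2000))
```

Tuning ell until the observed rate falls in a chosen band is one way to implement a constant acceptance rate strategy in practice.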


Restricted likelihood representation and decision-theoretic aspects of meta-analysis

ANDREW L. RUKHIN National Institute of Standards and Technology, 100 Bureau Dr., Gaithersburg, MD 20899, USA. E-mail: [email protected]

In the random-effects model of meta-analysis a canonical representation of the restricted likelihood function is obtained. This representation relates the mean effect and the heterogeneity variance estimation problems. An explicit form of the variance of weighted means statistics determined by means of a quadratic form is found. The behavior of the mean squared error for large heterogeneity variance is elucidated. It is noted that the sample mean is neither admissible nor minimax under a natural risk function when the number of studies exceeds three.

Keywords: DerSimonian–Laird estimator; Hedges estimator; Mandel–Paule procedure; minimaxity; quadratic forms; random-effects model; Stein phenomenon


Optimal filtering and the dual process

OMIROS PAPASPILIOPOULOS1 and MATTEO RUGGIERO2 1ICREA & Department of Economics and Business, Universitat Pompeu Fabra, Ramón Trias Fargas 25-27, 08005, Barcelona, Spain. E-mail: [email protected] 2Collegio Carlo Alberto & Department of Economics and Statistics, University of Torino, Unione Sovietica 218/bis, 10134, Torino, Italy. E-mail: [email protected]

We link optimal filtering for hidden Markov models to the notion of duality for Markov processes. We show that when the signal is dual to a process that has two components, one deterministic and one a pure death process, and with respect to functions that define changes of measure conjugate to the emission density, the filtering distributions evolve in the family of finite mixtures of such measures and the filter can be computed at a cost that is polynomial in the number of observations. Special cases of our framework include the Kalman filter, and computable filters for the Cox–Ingersoll–Ross process and the one-dimensional Wright–Fisher process, which have been investigated before. The dual we obtain for the Cox–Ingersoll–Ross process appears to be new in the literature.

Keywords: Bayesian conjugacy; Cox–Ingersoll–Ross process; finite mixture models; hidden Markov model; Kalman filter
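
The Kalman filter mentioned among the special cases is the most familiar example of a filter whose distributions stay inside a parametric family. The sketch below performs one predict and update step for a scalar linear Gaussian model; the model x_t = a x_{t-1} + N(0, q), y_t = x_t + N(0, r) and the function name are assumptions made for illustration, and the dual-process construction of the paper is not reproduced.

```python
def kalman_step(mean, var, y, a, q, r):
    """One predict-update step of a scalar Kalman filter.

    Assumed model (for illustration only):
        signal:      x_t = a * x_{t-1} + N(0, q)
        observation: y_t = x_t + N(0, r)
    (mean, var) describe the Gaussian filtering law at time t - 1.
    """
    # Prediction: push the Gaussian law through the signal dynamics.
    pred_mean = a * mean
    pred_var = a * a * var + q
    # Update: condition on the observation y (Gaussian conjugacy).
    gain = pred_var / (pred_var + r)
    new_mean = pred_mean + gain * (y - pred_mean)
    new_var = (1.0 - gain) * pred_var
    return new_mean, new_var


if __name__ == "__main__":
    m, v = 0.0, 1.0
    for obs in (0.9, 1.1, 1.0):
        m, v = kalman_step(m, v, obs, a=1.0, q=0.1, r=0.5)
    print(m, v)
```

Each step costs a fixed number of operations because the filtering law is summarized by two numbers, the simplest instance of a filter that remains computable as observations accumulate.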


New concentration inequalities for suprema of empirical processes

JOHANNES LEDERER* and SARA VAN DE GEER** Seminar für Statistik, ETH Zürich, Rämistrasse 101, 8092 Zürich, Switzerland. E-mail: *[email protected]; **[email protected]

While effective concentration inequalities for suprema of empirical processes exist under boundedness or strict tail assumptions, no comparable results have been available under considerably weaker assumptions. In this paper, we derive concentration inequalities assuming only low moments for an envelope of the empirical process. These concentration inequalities are beneficial even when the envelope is much larger than the single functions under consideration.

Keywords: chaining; concentration inequalities; deviation inequalities; empirical processes; rate of convergence


About the posterior distribution in hidden Markov models with unknown number of states

ELISABETH GASSIAT1 and JUDITH ROUSSEAU2 1Laboratoire de Mathématiques d’Orsay UMR 8628, Université Paris-Sud, Bâtiment 425, 91405 Orsay Cédex, France. E-mail: [email protected] 2CREST-ENSAE, 3 avenue Pierre Larousse, 92245 Malakoff Cedex, France. E-mail: [email protected]

We consider finite state space stationary hidden Markov models (HMMs) in the situation where the number of hidden states is unknown. We provide a frequentist asymptotic evaluation of Bayesian analysis methods. Our main result gives posterior concentration rates for the marginal densities, that is for the density of a fixed number of consecutive observations. Using conditions on the prior, we are then able to define a consistent Bayesian estimator of the number of hidden states. It is known that the likelihood ratio test statistic for overfitted HMMs has a nonstandard behaviour and is unbounded. Our conditions on the prior may be seen as a way to penalize parameters to avoid this phenomenon. Although inference of parameters is a much more difficult task than inference of marginal densities, we still provide a precise description of the situation when the observations are i.i.d. and we allow for 2 possible hidden states.

Keywords: Bayesian statistics; hidden Markov models; number of components; order selection; posterior distribution


Stochastic monotonicity and continuity properties of functions defined on Crump–Mode–Jagers branching processes, with application to vaccination in epidemic modelling

FRANK BALL1, MIGUEL GONZÁLEZ2,*, RODRIGO MARTÍNEZ2,** and MAROUSSIA SLAVTCHOVA-BOJKOVA3 1School of Mathematical Sciences, The University of Nottingham, Nottingham NG7 2RD, United Kingdom. E-mail: [email protected] 2Department of Mathematics, University of Extremadura, Avda, Elvas s/n, 06071-Badajoz, Spain. E-mail: *[email protected]; **[email protected] 3Faculty of Mathematics and Informatics, Sofia University and Institute of Mathematics and Informatics, Bulgarian Academy of Sciences, Bulgaria. E-mail: [email protected]

This paper is concerned with Crump–Mode–Jagers branching processes, describing spread of an epidemic depending on the proportion of the population that is vaccinated. Births in the branching process are aborted independently with a time-dependent probability given by the fraction of the population vaccinated. Stochastic monotonicity and continuity results for a wide class of functions (e.g., extinction time and total number of births over all time) defined on such a branching process are proved using coupling arguments, leading to optimal vaccination schemes to control corresponding functions (e.g., duration and final size) of epidemic outbreaks. The theory is illustrated by applications to the control of the duration of mumps outbreaks in Bulgaria.

Keywords: coupling; general branching process; Monte-Carlo method; mumps in Bulgaria; SIR epidemic model; time to extinction; vaccination policies


Tail approximations for the Student t-, F-, and Welch statistics for non-normal and not necessarily i.i.d. random variables

DMITRII ZHOLUD Department of Mathematical Statistics, Chalmers University of Technology and University of Göteborg, SE-412 96 Gothenburg, Sweden. E-mail: [email protected]

Let T be the Student one- or two-sample t-, F-, or Welch statistic. Now relax the underlying assumptions of normality, independence and identical distribution and consider a more general case where one only assumes that the vector of data has a continuous joint density. We determine asymptotic expressions for P(T > u) as u → ∞ for this case. The approximations are particularly accurate for small sample sizes and may be used, for example, in the analysis of High-Throughput Screening experiments, where the number of replicates can be as low as two to five and often extreme significance levels are used. We give numerous examples and complement our results by an investigation of the convergence speed, both theoretically, by deriving exact bounds for absolute and relative errors, and by means of a simulation study.

Keywords: dependent random variables; F-test; high-throughput screening; non-homogeneous data; non-normal population distribution; outliers; small sample size; Student’s one- and two-sample t-statistics; systematic effects; test power; Welch statistic


Goodness-of-fit test for noisy directional data

CLAIRE LACOUR* and THANH MAI PHAM NGOC** Laboratoire de Mathématique, UMR 8628, Université Paris Sud, 91405 Orsay Cedex, France. E-mail: *[email protected]; **[email protected]

We consider spherical data $X_i$ noised by a random rotation $\varepsilon_i \in SO(3)$ so that only the sample $Z_i = \varepsilon_i X_i$, $i = 1, \ldots, N$, is observed. We define a nonparametric test procedure to distinguish $H_0$: “the density $f$ of $X_i$ is the uniform density $f_0$ on the sphere” and $H_1$: “$\|f - f_0\|_2^2 \geq C\psi_N$ and $f$ is in a Sobolev space with smoothness $s$”. For a noise density $f_\varepsilon$ with smoothness index $\nu$, we show that an adaptive procedure (i.e., $s$ is not assumed to be known) cannot have a faster rate of separation than $\psi_N^{ad}(s) = (N/\log\log(N))^{-2s/(2s+2\nu+1)}$ and we provide a procedure which reaches this rate. We also deal with the case of super smooth noise. We illustrate the theory by implementing our test procedure for various kinds of noise on $SO(3)$ and by comparing it to other procedures. Applications to real data in astrophysics and paleomagnetism are provided.

Keywords: adaptive testing; minimax hypothesis testing; nonparametric alternatives; spherical deconvolution; spherical harmonics


Approximation of a stochastic wave equation in dimension three, with application to a support theorem in Hölder norm

FRANCISCO J. DELGADO-VENCES* and MARTA SANZ-SOLÉ** Facultat de Matemàtiques, Universitat de Barcelona, Gran Via, 585 E-08007 Barcelona, Spain. E-mail: *[email protected]; **[email protected]

A characterization of the support in Hölder norm of the law of the solution to a stochastic wave equation with three-dimensional space variable is proved. The result is a consequence of an approximation theorem, in the sense of convergence in probability, for a sequence of evolution equations driven by a family of regularizations of the driving noise.

Keywords: approximating schemes; stochastic wave equation; support theorem


Adaptive sensing performance lower bounds for sparse signal detection and support estimation

RUI M. CASTRO Eindhoven University of Technology, The Netherlands. E-mail: [email protected]

In memory of Yuri Ingster

This paper gives a precise characterization of the fundamental limits of adaptive sensing for diverse estimation and testing problems concerning sparse signals. We consider in particular the setting introduced in (IEEE Trans. Inform. Theory 57 (2011) 6222–6235) and show necessary conditions on the minimum signal magnitude for both detection and estimation: if $x \in \mathbb{R}^n$ is a sparse vector with $s$ non-zero components then it can be reliably detected in noise provided the magnitude of the non-zero components exceeds $\sqrt{2/s}$. Furthermore, the signal support can be exactly identified provided the minimum magnitude exceeds $\sqrt{2 \log s}$. Notably there is no dependence on $n$, the extrinsic signal dimension. These results show that the adaptive sensing methodologies proposed previously in the literature are essentially optimal, and cannot be substantially improved. In addition, these results provide further insights on the limits of adaptive compressive sensing.

Keywords: adaptive sensing; minimax lower bounds; sequential experimental design; sparsity-based models


Asymptotic behavior of CLS estimators for 2-type doubly symmetric critical Galton–Watson processes with immigration

MÁRTON ISPÁNY1, KRISTÓF KÖRMENDI2,* and GYULA PAP2,** 1University of Debrecen, Faculty of Informatics, Department of Information Technology, Pf. 12, H-4010 Debrecen, Hungary. E-mail: [email protected] 2University of Szeged, Faculty of Science, Bolyai Institute, Department of Stochastics, Aradi vértanúk tere 1, H-6720 Szeged, Hungary. E-mail: *[email protected]; **[email protected]

In this paper, the asymptotic behavior of the conditional least squares (CLS) estimators of the offspring means (α, β) and of the criticality parameter α + β for a 2-type critical doubly symmetric positively regular Galton–Watson branching process with immigration is described.

Keywords: conditional least squares estimator; Galton–Watson branching process with immigration


Affine invariant divergences associated with proper composite scoring rules and their applications

TAKAFUMI KANAMORI1 and HIRONORI FUJISAWA2 1Department of Computer Science and Mathematical Informatics, Nagoya University, Furocho Chikusaku, Nagoya 464-8601, Japan. E-mail: [email protected] 2The Institute of Statistical Mathematics, 10-3 Midori-cho, Tachikawa, Tokyo 190-8562, Japan. E-mail: [email protected]

In statistical analysis, measuring a score of predictive performance is an important task. In many scientific fields, appropriate scoring rules were tailored to tackle the problems at hand. A proper scoring rule is a popular tool to obtain statistically consistent forecasts. Furthermore, a mathematical characterization of the proper scoring rule was studied. As a result, it was revealed that the proper scoring rule corresponds to a Bregman divergence, which is an extension of the squared distance over the set of probability distributions. In the present paper, we introduce composite scoring rules as an extension of the typical scoring rules in order to obtain a wider class of probabilistic forecasting. Then, we propose a class of composite scoring rules, named Hölder scores, that induce equivariant estimators. The equivariant estimators have a favorable property, implying that the estimator is transformed in a consistent way, when the data is transformed. In particular, we deal with the affine transformation of the data. By using the equivariant estimators under the affine transformation, one can obtain estimators that do not essentially depend on the choice of the system of units in the measurement. Conversely, we prove that the Hölder score is characterized by the invariance property under the affine transformations. Furthermore, we investigate statistical properties of the estimators using Hölder scores for the statistical problems including estimation of regression functions and robust parameter estimation, and illustrate the usefulness of the newly introduced scoring rules for statistical forecasting.

Keywords: affine invariance; Bregman score; composite scoring rule; divergence; Hölder score


The affinely invariant distance correlation

JOHANNES DUECK1, DOMINIC EDELMANN1, TILMANN GNEITING2 and DONALD RICHARDS3 1Institut für Angewandte Mathematik, Universität Heidelberg, Im Neuenheimer Feld 294, 69120 Heidelberg, Germany 2Heidelberg Institute for Theoretical Studies and Karlsruhe Institute of Technology, HITS gGmbH, Schloss-Wolfsbrunnenweg 35, 69118 Heidelberg, Germany 3Department of Statistics, Pennsylvania State University, University Park, PA 16802, USA. E-mail: [email protected]

Székely, Rizzo and Bakirov (Ann. Statist. 35 (2007) 2769–2794) and Székely and Rizzo (Ann. Appl. Statist. 3 (2009) 1236–1265), in two seminal papers, introduced the powerful concept of distance correlation as a measure of dependence between sets of random variables. We study in this paper an affinely invariant version of the distance correlation and an empirical version of that distance correlation, and we establish the consistency of the empirical quantity. In the case of subvectors of a multivariate normally distributed random vector, we provide exact expressions for the affinely invariant distance correlation in both finite-dimensional and asymptotic settings, and in the finite-dimensional case we find that the affinely invariant distance correlation is a function of the canonical correlation coefficients. To illustrate our results, we consider time series of wind vectors at the Stateline wind energy center in Oregon and Washington, and we derive the empirical auto and cross distance correlation functions between wind vectors at distinct meteorological stations.

Keywords: affine invariance; distance correlation; distance covariance; hypergeometric function of matrix argument; multivariate independence; multivariate normal distribution; vector time series; wind forecasting; zonal polynomial
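
The empirical distance correlation can be computed directly from double-centered pairwise distance matrices. The sketch below handles scalar samples only, where the ordinary distance correlation is already invariant under maps x ↦ ax + b with a ≠ 0; the restriction to scalars and the function names are assumptions made for illustration, and the multivariate standardization that defines the affinely invariant version in the paper is not implemented.

```python
import math


def _centered_distances(v):
    """Double-centered matrix of pairwise distances |v_i - v_j|."""
    n = len(v)
    d = [[abs(v[i] - v[j]) for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in d]  # row means (equal to column means)
    grand = sum(row) / n
    return [[d[i][j] - row[i] - row[j] + grand for j in range(n)]
            for i in range(n)]


def distance_correlation(x, y):
    """Empirical distance correlation of two scalar samples of equal length."""
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)

    def inner(p, q):
        return sum(p[i][j] * q[i][j] for i in range(n) for j in range(n)) / (n * n)

    dcov2 = inner(a, b)
    denom = math.sqrt(inner(a, a) * inner(b, b))
    if denom == 0.0:
        return 0.0  # a constant sample carries no dependence information
    return math.sqrt(max(dcov2, 0.0) / denom)


if __name__ == "__main__":
    x = [0.1, 0.5, 0.9, 1.7, 2.0, 3.1]
    y = [v * v for v in x]
    print(distance_correlation(x, y))
```

Rescaling or shifting either sample leaves the value unchanged, because every pairwise distance is multiplied by the same factor in numerator and denominator.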


