Symbolic Data Analysis: Definitions and Examples
Recommended publications
-
User Guide April 2004 EPA/600/R04/079 April 2004
ProUCL Version 3.0 User Guide, April 2004, EPA/600/R04/079, by Anita Singh (Lockheed Martin Environmental Services, 1050 E. Flamingo Road, Suite E120, Las Vegas, NV 89119), Ashok K. Singh (Department of Mathematical Sciences, University of Nevada, Las Vegas, NV 89154), and Robert W. Maichle (Lockheed Martin Environmental Services, 1050 E. Flamingo Road, Suite E120, Las Vegas, NV 89119). Table of Contents: Authors; Table of Contents; Disclaimer; Executive Summary; Introduction; Installation Instructions; Minimum Hardware Requirements; A. ProUCL Menu Structure: 1. File; 2. View; 3. Help; B. ProUCL Components: 1. File: a. Input File Format; b. Result of Opening an Input Data File -
Iterated Importance Sampling in Missing Data Problems
Iterated importance sampling in missing data problems Gilles Celeux INRIA, FUTURS, Orsay, France Jean-Michel Marin ∗ INRIA, FUTURS, Orsay, France and CEREMADE, University Paris Dauphine, Paris, France Christian P. Robert CEREMADE, University Paris Dauphine and CREST, INSEE, Paris, France Abstract Missing variable models are typical benchmarks for new computational techniques in that the ill-posed nature of missing variable models offers a challenging testing ground for these techniques. This was the case for the EM algorithm and the Gibbs sampler, and this is also true for importance sampling schemes. A population Monte Carlo scheme taking advantage of the latent structure of the problem is proposed. The potential of this approach and its specifics in missing data problems are illustrated in settings of increasing difficulty, in comparison with existing approaches. The improvement brought by a general Rao–Blackwellisation technique is also discussed. Key words: Adaptive algorithms, Bayesian inference, latent variable models, population Monte Carlo, Rao–Blackwellisation, stochastic volatility model ∗ Corresponding author: CEREMADE, Place du Maréchal De Lattre de Tassigny, 75775 Paris Cedex 16, France, [email protected] Preprint submitted to Computational Statistics and Data Analysis 25 July 2005 1 Introduction 1.1 Missing data models Missing data models, that is, structures such that the distribution of the data y can be represented via a marginal density $f(y \mid \theta) = \int_{\mathcal{Z}} g(y, z \mid \theta)\, dz$, where $z \in \mathcal{Z}$ denotes the so-called "missing data", have often been at the forefront of computational Statistics, both as a challenge to existing techniques and as a benchmark for incoming techniques. This is for instance the case with the EM algorithm (Dempster et al., 1977), which was purposely designed for missing data problems although it has since then been applied in a much wider setting. -
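As a rough illustration of the marginal density in the excerpt above, the sketch below treats the simplest missing-data case, a two-component Gaussian mixture in which the missing variable z is the unobserved component label, so the integral over z reduces to a sum. The mixture itself, its parameter values, and the function names are assumptions chosen for illustration; they are not taken from the excerpted paper.

```python
import math


def normal_pdf(y, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((y - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def joint_density(y, z, weights, means, sds):
    """g(y, z | theta): probability of label z times the density of y given z."""
    return weights[z] * normal_pdf(y, means[z], sds[z])


def marginal_density(y, weights, means, sds):
    """f(y | theta): the joint density summed over the missing label z."""
    return sum(joint_density(y, z, weights, means, sds) for z in range(len(weights)))


# Hypothetical parameter values theta = (weights, means, sds).
weights = [0.3, 0.7]
means = [-1.0, 2.0]
sds = [1.0, 0.5]

value_at_zero = marginal_density(0.0, weights, means, sds)

# Sanity check: a density should integrate to about 1 (Riemann sum on a grid).
step = 0.01
total_mass = step * sum(
    marginal_density(-10.0 + i * step, weights, means, sds) for i in range(2001)
)
print(value_at_zero, total_mass)
```

When z is continuous or high-dimensional the sum becomes an integral that usually has no closed form, which is the situation simulation methods such as importance sampling are meant to handle.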
Basic Statistical Concepts Statistical Population
Basic Statistical Concepts Statistical Population • The entire underlying set of observations from which samples are drawn. – Philosophical meaning: all observations that could ever be taken for range of inference • e.g. all barnacle populations that have ever existed, that exist or that will exist – Practical meaning: all observations within a reasonable range of inference • e.g. barnacle populations on that stretch of coast Statistical Sample • A representative subset of a population. – What counts as being representative • Unbiased and hopefully precise Strategies • Define survey objectives: what is the goal of survey or experiment? What are your hypotheses? • Define population parameters to estimate (e.g. number of individuals, growth, color etc). • Implement sampling strategy – measure every individual (think of implications in terms of cost, time, practicality especially if destructive) – measure a representative portion of the population (a sample) Sampling • Goal: – Every unit and combination of units in the population (of interest) has an equal chance of selection. • This is a fundamental assumption in all estimation procedures • How: – Many ways if underlying distribution is not uniform » In the absence of information about underlying distribution the only safe strategy is random sampling » Costs: sometimes difficult, and may lead to its own source of bias (if sample size is low). Much more about this later Sampling Objectives • To obtain an unbiased estimate of a population mean • To assess the precision of the estimate (i.e. -
Nearest Neighbor Methods Philip M
Statistics Preprints, Statistics, 12-2001. Nearest Neighbor Methods. Philip M. Dixon, Iowa State University, [email protected]. Recommended Citation: Dixon, Philip M., "Nearest Neighbor Methods" (2001). Statistics Preprints. 51. http://lib.dr.iastate.edu/stat_las_preprints/51 Abstract Nearest neighbor methods are a diverse group of statistical methods united by the idea that the similarity between a point and its nearest neighbor can be used for statistical inference. This review article summarizes two common environmetric applications: nearest neighbor methods for spatial point processes and nearest neighbor designs and analyses for field experiments. In spatial point processes, the appropriate similarity is the distance between a point and its nearest neighbor. Given a realization of a spatial point process, the mean nearest neighbor distance or the distribution of distances can be used for inference about the spatial process. One common application is to test whether the process is a homogeneous Poisson process. These methods can be extended to describe relationships between two or more spatial point processes. These methods are illustrated using data on the locations of trees in a swamp hardwood forest. In field experiments, neighboring plots often have similar characteristics before treatments are imposed. -
Testing Adverse Selection and Moral Hazard on French Car Insurance Data
TESTING ADVERSE SELECTION AND MORAL HAZARD ON FRENCH CAR INSURANCE DATA Guillaume CARLIER1 Université Paris Dauphine Michel GRUN-REHOMME2 Université Paris Panthéon Olga VASYECHKO3 Université Nationale d'Economie de Kyiv This paper is a modest contribution to the stream of research devoted to finding empirical evidence of asymmetric information. Building upon Chiappori and Salanié's (2000) work, we propose two specific tests, one for adverse selection and one for moral hazard. We implement these tests on French car insurance data, circumventing the lack of dynamic data in our database by a proxy of claim history. The first test suggests presence of adverse selection whereas the second one seems to contradict presence of pure moral hazard. Keywords: empirical evidence of moral hazard, adverse selection, car insurance. JEL Classification: C35, D82. 1 Université Paris Dauphine, CEREMADE, Pl. de Lattre de Tassigny, 75775 Paris Cedex 16, FRANCE, [email protected] 2 Université Paris Panthéon, ERMES, 12 Pl. du Panthéon, 75005 Paris, FRANCE, [email protected] 3 Université Nationale d'Economie de Kyiv, UKRAINE, [email protected] BULLETIN FRANÇAIS D’ACTUARIAT, Vol. 13, n° 25, janvier – juin 2013, pp. 117-130 1. INTRODUCTION There has been an intensive line of research in the last decades devoted to finding empirical evidence of asymmetric information on insurance data. Under adverse selection, the insured has some private information about his type of risk, which the insurer cannot observe before the subscription of an insurance contract. The adverse selection assumption stipulates that the high-risks tend to choose more coverage than the low-risks (Rothschild and Stiglitz (1976), Wilson (1977), Spence (1978)). -
Dirk Hundertmark Department of Mathematics, MC-382 University of Illinois at Urbana-Champaign Altgeld Hall 1409 W
Dirk Hundertmark Department of Mathematics, MC-382 University of Illinois at Urbana-Champaign Altgeld Hall 1409 W. Green Street Urbana, IL 61801 +1 (217) 333-3350 (217) 333-9516 (fax) [email protected] 1401 W. Charles, Champaign, IL 61821 +1 (217) 419-1088 (home) http://www.math.uiuc.edu/~dirk Personal Data German and American citizen, married, one daughter. Research interests Partial differential equations, analysis, variational methods, functional analysis, spectral theory, motivated by problems from Physics, especially quantum mechanics, and Engineering. Spectral theory of random operators and its connection to statistical mechanics and some probabilistic problems from solid state physics. More recently, mathematical problems in non-linear fiber-optics, especially properties of dispersion managed solitons. Education 5/2003 Habilitation in Mathematics, Ludwig-Maximilians-Universität München. 7/1992 – 11/1996 Ph.D. (Dr. rer. nat., summa cum laude) in Mathematics, Ruhr-Universität Bochum, Germany. Thesis: "On the theory of the magnetic Schrödinger semigroup." Advisor: Werner Kirsch 11/1985 – 2/1992 Study of Physics, Friedrich-Alexander-Universität Erlangen, Germany. Graduated with Diplom. Advisor: Hajo Leschke. Employment Since 9/2007 Member of the Institute of Condensed Matter Theory at UIUC. Since 8/2006 Associate Professor (tenured), Department of Mathematics, University of Illinois at Urbana-Champaign (on leave 8/2006 – 8/2007). 8/2006 – 8/2007 Senior Lecturer, School of Mathematics, University of Birmingham, England. 1/2003 – 7/2006 Assistant Professor, Department of Mathematics, University of Illinois at Urbana-Champaign. 9 – 12/2002 Research fellow at the Institut Mittag-Leffler during the special program "Partial Differential Equations and Spectral Theory." 9/1999 – 8/2002 Olga Taussky-John Todd Instructor of Mathematics, Caltech. -
Experimental Investigation of the Tension and Compression Creep Behavior of Alumina-Spinel Refractories at High Temperatures
ceramics Article Experimental Investigation of the Tension and Compression Creep Behavior of Alumina-Spinel Refractories at High Temperatures Lucas Teixeira 1,* , Soheil Samadi 2 , Jean Gillibert 1 , Shengli Jin 2 , Thomas Sayet 1 , Dietmar Gruber 2 and Eric Blond 1 1 Université d’Orléans, Université de Tours, INSA-CVL, LaMé, 45072 Orléans, France; [email protected] (J.G.); [email protected] (T.S.); [email protected] (E.B.) 2 Chair of Ceramics, Montanuniversität Leoben, 8700 Leoben, Austria; [email protected] (S.S.); [email protected] (S.J.); [email protected] (D.G.) * Correspondence: [email protected] Received: 3 September 2020; Accepted: 18 September 2020; Published: 22 September 2020 Abstract: Refractory materials are subjected to thermomechanical loads during their working life, and consequent creep strain and stress relaxation are often observed. In this work, the asymmetric high temperature primary and secondary creep behavior of a material used in the working lining of steel ladles is characterized, using uniaxial tension and compression creep tests and an inverse identification procedure to calculate the parameters of a Norton-Bailey based law. The experimental creep curves are presented, as well as the curves resulting from the identified parameters, and a statistical analysis is made to evaluate the confidence of the results. Keywords: refractories; creep; parameters identification 1. Introduction Refractory materials, known for their physical and chemical stability, are used in high temperature processes in different industries, such as iron and steel making, cement and aerospace. These materials are exposed to thermomechanical loads, corrosion/erosion from solids, liquids and gases, gas diffusion, and mechanical abrasion [1]. -
Mckean–Vlasov Optimal Control: the Dynamic Programming Principle
McKean–Vlasov optimal control: the dynamic programming principle∗ Mao Fabrice Djete† Dylan Possamaï‡ Xiaolu Tan§ March 25, 2020 Abstract We study the McKean–Vlasov optimal control problem with common noise in various formulations, namely the strong and weak formulation, as well as the Markovian and non–Markovian formulations, and allowing for the law of the control process to appear in the state dynamics. By interpreting the controls as probability measures on an appropriate canonical space with two filtrations, we then develop the classical measurable selection, conditioning and concatenation arguments in this new context, and establish the dynamic programming principle under general conditions. 1 Introduction We propose in this paper to study the problem of optimal control of mean–field stochastic differential equations, also called McKean–Vlasov stochastic differential equations in the literature. This problem is a stochastic control problem where the state process is governed by a stochastic differential equation (SDE for short), which has coefficients depending on the current time, the paths of the state process, but also its distribution (or conditional distribution in the case with common noise). Similarly, the reward functionals are allowed to be impacted by the distribution of the state process. The pioneering work on McKean–Vlasov equations is due to McKean [60] and Kac [49], who were interested in studying uncontrolled SDEs, and in establishing general propagation of chaos results. Let us also mention the illuminating notes of Sznitman [75], which give a precise and pedagogical insight into this specific equation. 
Though many authors have worked on this equation following these initial papers, there has been a drastic surge of interest in the topic in the past decade, due to the connection that it shares with the so–called mean–field game (MFG for short) theory, introduced independently and simultaneously on the one hand by Lasry and Lions in [53; 54; 55] and on the other hand by Huang, Caines, and Malhamé [44; 45; 46; 47; 48]. -
Introduction to Basic Concepts
Introduction to Basic Concepts • There are 5 basic steps to resource management: – 1st - Assess current situation (gather and analyze info) – 2nd - Develop a plan – 3rd - Implement the plan – 4th - Monitor results (effects) of plan – 5th - Replan • This class will help you do the 1st and 4th steps in the process • For any study of vegetation there are basically 8 steps: – 1st Set objectives - What do you want to know? You can’t study everything, you must have a goal – 2nd Choose variable you are going to measure - Make sure you are meeting your objectives – 3rd Read published research & ask around - Find out what is known about your topic and research area – 4th Choose a sampling method, number of samples, analysis technique, and statistical hypothesis – 5th Go collect data - Details, details, details.... Pay attention to detail - Practice and train with selected methods before you start - Be able to adjust once you get to the field – 6th Analyze the data - Summarize data - Use a statistical comparison – 7th Interpret the results - What did you find? Why is it important? - What can your results tell you about the ecosystem? - Remember the limitations of your technique – 8th Write the report - Your results are worthless if they stay in your files - Clearly document your methods - Use tables and figures to explain your data - Be correct and concise - No matter how smart you are, you will sound stupid if you can’t clearly communicate your results A good reference for several aspects of rangeland inventory and monitoring is: http://rangelandswest.org/az/monitoringtechnical.html -
Time Averages for Kinetic Fokker-Planck Equations
Time averages for kinetic Fokker-Planck equations Giovanni M. Brigati a,b,1 a CEREMADE, CNRS, UMR 7534, Université Paris-Dauphine, PSL University, Place du Marechal de Lattre de Tassigny, 75016 Paris, France b Dipartimento di Matematica “F. Casorati”, Università degli Studi di Pavia, Via Ferrata 5, 27100 Pavia, Italia arXiv:2106.12801v1 [math.AP] 24 Jun 2021 Abstract We consider kinetic Fokker-Planck (or Vlasov-Fokker-Planck) equations on the torus with Maxwellian or fat tail local equilibria. Results based on weak norms have recently been achieved by S. Armstrong and J.-C. Mourrat in case of Maxwellian local equilibria. Using adapted Poincaré and Lions-type inequalities, we develop an explicit and constructive method for estimating the decay rate of time averages of norms of the solutions, which covers various regimes corresponding to subexponential, exponential and superexponential (including Maxwellian) local equilibria. As a consequence, we also derive hypocoercivity estimates, which are compared to similar results obtained by other techniques. Keywords: Kinetic Fokker-Planck equation, Ornstein-Uhlenbeck equation, time average, local equilibria, Lions’ lemma, Poincaré inequalities, hypocoercivity. 2020 MSC: Primary: 82C40; Secondary: 35B40, 35H10, 47D06, 35K65. 1. Introduction Let us consider the kinetic Fokker-Planck equation $\partial_t f + v \cdot \nabla_x f = \nabla_v \cdot \left( \nabla_v f + \alpha \langle v \rangle^{\alpha - 2} v f \right), \quad f(0, \cdot, \cdot) = f_0, \quad (1)$ where f is a function of time $t \ge 0$, position x, velocity v, and α is a positive parameter. Here we use the notation $\langle v \rangle = \sqrt{1 + |v|^2}$, $\forall v \in \mathbb{R}^d$. We consider the spatial domain $Q := (0, L)^d \ni x$, with periodic boundary conditions, and define $\Omega_t := (t, t + \tau) \times Q$, for some $\tau > 0$, $t \ge 0$ and $\Omega = \Omega_0$. -
Shapiro, A., “Statistical Inference of Moment Structures”
Statistical Inference of Moment Structures Alexander Shapiro School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0205, USA e-mail: [email protected] Abstract This chapter is focused on statistical inference of moment structures models. Although the theory is presented in terms of general moment structures, the main emphasis is on the analysis of covariance structures. We discuss identifiability and the minimum discrepancy function (MDF) approach to statistical analysis (estimation) of such models. We address such topics of the large samples theory as consistency, asymptotic normality of MDF estimators and asymptotic chi-squaredness of MDF test statistics. Finally, we discuss asymptotic robustness of the normal theory based MDF statistical inference in the analysis of covariance structures. Contents: 1 Introduction; 2 Moment Structures Models; 3 Discrepancy Functions Estimation Approach; 4 Consistency of MDF Estimators; 5 Asymptotic Analysis of the MDF Estimation Procedure (5.1 Asymptotics of MDF Test Statistics; 5.2 Nested Models; 5.3 Asymptotics of MDF Estimators; 5.4 The Case of Boundary Population Value); 6 Asymptotic Robustness of the MDF Statistical Inference (6.1 Elliptical distributions; 6.2 Linear latent variate models). 1 Introduction In this paper we discuss statistical inference of moment structures where first and/or second population moments are hypothesized to have a parametric structure. Classical examples of such models are multinomial and covariance structures models. The presented theory is sufficiently general to handle various situations, however the main focus is on covariance structures. 
Theory and applications of covariance structures were motivated first by the factor analysis model and its various generalizations and later by a development of the LISREL models (Jöreskog [15, 16]) (see also Browne [8] for a thorough discussion of covariance structures modeling). -
Standard Deviation - Wikipedia Visited on 9/25/2018
Standard deviation - Wikipedia Visited on 9/25/2018 From Wikipedia, the free encyclopedia. For other uses, see Standard deviation (disambiguation). In statistics, the standard deviation (SD, also represented by the Greek letter sigma σ or the Latin letter s) is a measure that is used to quantify the amount of variation or dispersion of a set of data values.[1] A low standard deviation indicates that the data points tend to be close to the mean (also called the expected value) of the set, while a high standard deviation indicates that the data points are spread out over a wider range of values. [Figure: A plot of normal distribution (or bell-shaped curve) where each band has a width of 1 standard deviation. See also: 68–95–99.7 rule.] The standard deviation of a random variable, statistical population, data set, or probability distribution is the square root of its variance. It is algebraically simpler, though in practice less robust, than the average absolute deviation.[2][3] A useful property of the standard deviation is that, unlike the variance, it is expressed in the same units as the data.
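The relationship described above, that the standard deviation is the square root of the variance, can be checked on a small data set. This is a minimal sketch using Python's standard `statistics` module; the eight data values are an assumed illustration, not taken from the excerpt, and the population (rather than sample) formulas are used.

```python
import math
import statistics

# Hypothetical data values, treated as a complete population.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = statistics.fmean(data)
variance = statistics.pvariance(data)  # mean squared deviation from the mean
std_dev = statistics.pstdev(data)      # square root of the population variance

# The standard deviation is in the same units as the data; the variance is in squared units.
print(mean, variance, std_dev)
```

For a sample drawn from a larger population, `statistics.variance` and `statistics.stdev` apply the n - 1 divisor instead.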