The Alpha-Beta-Gamma Skew Normal Distribution and Its Application

Total Page:16

File Type:pdf, Size:1020Kb

The Alpha-Beta-Gamma Skew Normal Distribution and Its Application Open Journal of Statistics, 2020, 10, 1057-1071 https://www.scirp.org/journal/ojs ISSN Online: 2161-7198 ISSN Print: 2161-718X The Alpha-Beta-Gamma Skew Normal Distribution and Its Application Zhengyuan Wei, Tiankui Peng*, Xiaoya Zhou College of Science, Chongqing University of Technology, Chongqing, China How to cite this paper: Wei, Z.Y., Peng, Abstract T.K. and Zhou, X.Y. (2020) The Alpha- Beta-Gamma Skew Normal Distribution In this paper, a new class of skew multimodal distributions with more flexible and Its Application. Open Journal of Statis- than alpha skew normal distribution and alpha-beta skew normal distribution tics, 10, 1057-1071. is proposed, which makes some important distributions become its special https://doi.org/10.4236/ojs.2020.106060 cases. The statistical properties of the new distribution are studied in detail, Received: November 20, 2020 its moment generating function, skewness coefficient, kurtosis coefficient, Accepted: December 27, 2020 Fisher information matrix, maximum likelihood estimators are derived. Published: December 30, 2020 Moreover, a random simulation study is carried out for test the performance Copyright © 2020 by author(s) and of the estimators, the simulation results show that with the increase of sample Scientific Research Publishing Inc. size, the mean value of maximum likelihood estimators tends to the true val- This work is licensed under the Creative ue. The new distribution family provides a better fit compared with other Commons Attribution International known skew distributions through the analysis of a real data set. License (CC BY 4.0). http://creativecommons.org/licenses/by/4.0/ Keywords Open Access Skew Distribution, Multimodal Distribution, Alpha-Beta-Gamma Skew Normal Distribution, Fisher Information Matrix 1. Introduction Over the years, the family of skew-symmetric distributions has received consi- derable attention by many researchers. Azzalini [1] first proposed the skew normal distribution (SN) with the density T 2 fz( ) =2Φ∈(αφ z) ( z) ,,( z α) , (1) where φ (⋅) and Φ⋅( ) are probability density function (pdf) and cumulative distribution function (cdf) of standard normal distribution, respectively. Bala- krishnan [2] and Gupta [3] researched a generalization of the skew normal dis- tribution (GSN) and further discussed its properties, Yadegaria et al. [4] intro- duced a generalization of the Balakrishnan skew normal distribution which in- DOI: 10.4236/ojs.2020.106060 Dec. 30, 2020 1057 Open Journal of Statistics Z. Y. Wei et al. cludes GSN as special case, an epsilon skew normal distribution and a new class of skew-Cauchy distribution is studied in [5] and [6], respectively. Another skew normal distribution named power normal distribution is proposed by [7] which solved the estimation of the skewness parameter of SN when the sample size is not very large. Recently, Elal-Olivero [8] proposed a new form of skew distribution with bi- modal behavior named alpha skew normal distribution (ASN) with the density 2 (11−+α z) T 2 fz( ;α) = φα( z),,( z ) ∈ , (2) α 2 + 2 where α controls the skewness and kurtosis, Equation (2) has at most two modes. Alpha-skew-Laplace distribution (ASL) and a generalization of alpha skew normal distribution (GASN) are studied in [9] and [10], respectively. Moreover, Chakraborty et al. [11] proposed the Balakrishnan alpha skew normal distribution and discussed a generalized bimodal normal distribution denoted by BN(n) with the density zn fz( ) = φ ( z),, z∈ R (3) M where n is positive even integers and M is normalizing constant. Furthermore, Shafiei et al. [12] proposed a new class of skew normal distributions called al- pha-beta skew normal distribution (ABSN) that is more flexible than SN and ASN, with the density 2 11−−αβzz3 + ( ) T 3 fz( ;,αβ) = φ( z),( z ,, αβ) ∈ , (4) 6αβ+ 15 β22 ++ α 2 where α , β control the skewness and kurtosis, Equation (4) has at most four modes. The motivations for considering this new family of distributions are as follows: Firstly, the new distribution family contains some classical distributions, such as normal distribution, ASN, ABSN, etc. Secondly, the admissible intervals for the skewness and the kurtosis parameters are wider than ASN and ABSN. Lastly, the new distribution family is more flexible than some other known skew distribu- tions through the analysis of a real data set. Consequently, the reminder of the paper is organized as follows: In Section 2, we propose the new class of skew normal distributions and discussed its properties. In Section 3, we provide the maximum likelihood estimates (MLEs) of the parameters, the performance of the estimates are verified by random simulation. The real application of the new distribution family is considered in Section 4. Section 5 presents some conclu- sions. Lastly, some proofs and elements of Fisher information (FI) matrix are given in the appendix. 2. Alpha-Beta-Gamma Skew Normal Distribution In this section, we define a class of skew distributions that allow the fitting of multimodal data sets. DOI: 10.4236/ojs.2020.106060 1058 Open Journal of Statistics Z. Y. Wei et al. Definition 1 A random variable X follows alpha-beta-gamma skew normal distribution, denoted by ABGSN (αβγ, , ) , if it has the pdf 2 11−−αβxx35 − γ x + ( ) T 4 fx( ;,,αβγ) = φ( x),( x ,,, αβγ) ∈ , (5) C where C =945γ2 + 210 γβ + 30 αγ + 6 αβ + 15 β22 ++ α 2 . Figure 1 and Figure 2 show the pdf of the ABGSN (αβγ, , ) for different parameter values. As can be seen that the proposed density is very general, the parameters α , β as well as γ have substantial impact on the skewness and the number of extreme points of the distribution. Remark 1 Derivative of pdf (5) with respect to x is given by ∂fx( ;,,αβγ) ∂x φ ( x) = −γ2x 11 +(10 γ 2 − 2 βγ) x9 +( 16 βγ −− β2 2 αγ) xx7 + 2 γ 6 C (6) +(12αγ +− 6 β2 2 αβ) xx 5 +−( 2 β 10 γ) 4 +( 8 αβ − α 23) x αβ22 α α +−(2 6) xx +( 2 −− 2) 2, Figure 1. Alpha-beta-gamma skew normal pdf. ABGSN(20,− 10,0.8) has six modes (black solid line), ABGSN(4.5,−− 8, 10) has two modes (red dashed line). Figure 2. Alpha-beta-gamma skew normal pdf. ABGSN(0.4,− 0.3,0.1) has three modes (black solid line), ABGSN(−−0.4,2, 0.2) has five modes (red dashed line). DOI: 10.4236/ojs.2020.106060 1059 Open Journal of Statistics Z. Y. Wei et al. Equation (6) has at most eleven zeros, therefore the pdf up to six modes, see Figure 1 and Figure 2. T Proposition 1 The cdf of X ~ ABGSN(αβγ , , ) is FxX ( ;,,αβγ) =−− γ29 βγ + γ2 7 −+ β 2 αγ + γ2 + βγ5 + γ 4 x(2 9 ) x( 2 63 14) xx 2 −(5β2 + 10 αγ + 315 γ 2 + 2 αβ + 70 βγ) xx 32 ++( 2 β 8 γ ) −C −2 x + 2α + 4 β + 16 γ φ xC+Φ x,,,,. xαβγ ∈ 4 ( ) ( ) ( ) ( ) Remark 2 Some properties of the ABGSN (αβγ, , ) are as follows 1) ABGSN (0,0,0) , ABGSN (αβ,0, ) and ABGSN (α,0,0) degenerate to N(0,1) , ABSN (αβ, ) and ASN (α ) , respectively. 2) If α = 0 , pdf (5) reduces to 2 11−−βγxx35 + ( ) T 3 fx( ;,βγ) = φ( x),( x ,, βγ) ∈ , (7) 945γ22+ 210 γβ ++ 15 β 2 in sequel, we name Equation (7) the beta-gamma skew normal distribution, de- noted by BGSN (βγ, ) . 3) If β = 0 , pdf (5) reduces to 2 11−−αγxx5 + ( ) T 3 fx( ;,αγ) = φ( x),( x ,, αγ) ∈ , (8) 945γ22+ 30 αγ ++ α 2 in sequel, we name Equation (8) the alpha-gamma skew normal distribution, denoted by AGSN (αγ, ) . 4) If αγ= = 0 , pdf (5) reduces to 2 11−+β x3 ( ) T 2 fx( ;β) = φβ( x),,( x ) ∈ , (9) 15β 2 + 2 Equation (9) is known as the beta skew normal distribution and denoted by BSN (β ) , see Shafiei [12]. 5) If αβ= = 0 , pdf (5) reduces to 2 11−+γ x5 ( ) T 2 fx( ;γ) = φγ( x),,( x ) ∈ , (10) 945γ 2 + 2 in sequel, we name Equation (10) the gamma skew normal distribution, denoted by GSN(γ ) . 6) If X ~ABGSN (αβγ ,,) , when α → ±∞ , β → ±∞ and γ → ±∞ , then X follows BN(2), BN(6) and BN(10), respectively. 7) If X ~ABGSN (αβγ ,,) , then −X ~ABGSN( −−−αβγ ,,) . Proposition 2 Let X ~ABGSN (αβγ ,,) , then for k ∈ , we have 1 2k = +α 22 ++ αγ β X22 hh2kk 22++( ) h26k C ++γ2h2 αβ hh + 2, βγ 2k+ 10 2 kk ++ 4 2 8 DOI: 10.4236/ojs.2020.106060 1060 Open Journal of Statistics Z. Y. Wei et al. 21k − 1 X=−++(22αβ hh2kk 22++ 2 γ h 24 k) , C where h2ki+ =××13 ×( 2ki +− 1) , i = 0, 2, 4,6, . Remark 3 In particularly, we have 1 [ X ] =−++(2αβ 6 30 γ) , C 1 2 22 Var[ X ] =(2 ++ 3α 210 αγ + 105 β + 10395 γ + 30 αβ + 1890 βγ ) C −αβ ++ γ2 (2 6 30) C , We use R software to optimizing [ X ] and Var[ X ] with respect to αβγ,,, we can get the bounds [ X ]∈−( 0.71,0.71) , Var[ X ]∈(0.81,14.07) . Remark 4 Let ξ1 and ξ2 stand for its skewness and kurtosis coefficients, then we have 3 23 X−+32 [ XX] [ X] ξ = 1 32 22 (XX− [ ]) =−++ α β γ3 + α3 + β 3 + γ 32 + αγ 16( 3 15) 12C ( 120 61425 95 2 22 2 2 +++14α β 75 αβ 2835 β γ + 5775 αγ + 23730 βγ 22 +1200αβγ −− 2 β 20 γ)( 2 + 3 α + 210 αγ + 105 β 32 +γ2 + αβ + βγ − α ++ β γ 2 10395 30 1890)C ( 2 6 30) , 4 32 24 X−+46 [ XX] [ XX] − 3 [ X] ξ = − 2 2 3 22 (XX− [ ]) =−+− α + β + γ4 +32 α + β 2 + γ 2 + αβ 3 48( 3 15) C ( 15 945 135135 210 +1890αγ + 20790 βγ ++ 6) 24( α + 3 β + 15 γ)C ( α33 + 165 β +89775γ322 + 125 α γ ++ 17 α β 105 αβ 2 + 4095 β 2 γ + 9555 αγ 2 +βγ2 + αβγ −− α β− γ ++ α2 αγ 35385 1830 2 14 110)( 2 3 210 2 +β22 + γ + αβ + βγ − α ++ β γ 2 105 10395 30 1890)C ( 2 6 30) , calculated by R software with the package DEoptim we can get the bounds ξξ12∈−( 1.29,1.29) ,∈−( 1.88,5.60) .
Recommended publications
  • Impact of Multimodality of Distributions on Var and ES Calculations Dominique Guegan, Bertrand Hassani, Kehan Li
    Impact of multimodality of distributions on VaR and ES calculations Dominique Guegan, Bertrand Hassani, Kehan Li To cite this version: Dominique Guegan, Bertrand Hassani, Kehan Li. Impact of multimodality of distributions on VaR and ES calculations. 2017. halshs-01491990 HAL Id: halshs-01491990 https://halshs.archives-ouvertes.fr/halshs-01491990 Submitted on 17 Mar 2017 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Documents de Travail du Centre d’Economie de la Sorbonne Impact of multimodality of distributions on VaR and ES calculations Dominique GUEGAN, Bertrand HASSANI, Kehan LI 2017.19 Maison des Sciences Économiques, 106-112 boulevard de L'Hôpital, 75647 Paris Cedex 13 http://centredeconomiesorbonne.univ-paris1.fr/ ISSN : 1955-611X Impact of multimodality of distributions on VaR and ES calculations Dominique Gu´egana, Bertrand Hassanib, Kehan Lic aUniversit´eParis 1 Panth´eon-Sorbonne, CES UMR 8174. 106 bd l'Hopital 75013, Paris, France. Labex ReFi, Paris France. IPAG, Paris, France. bGrupo Santander and Universit´eParis 1 Panth´eon-Sorbonne, CES UMR 8174. Labex ReFi. cUniversit´eParis 1 Panth´eon-Sorbonne, CES UMR 8174. 106 bd l'Hopital 75013, Paris, France. Labex ReFi.
    [Show full text]
  • Methods to Analyze Likert-Type Data in Educational Technology Research
    Journal of Educational Technology Development and Exchange (JETDE) Volume 13 Issue 2 3-25-2021 Methods to Analyze Likert-Type Data in Educational Technology Research Li-Ting Chen University of Nevada, Reno, [email protected] Leping Liu University of Nevada, Reno, [email protected] Follow this and additional works at: https://aquila.usm.edu/jetde Part of the Educational Technology Commons Recommended Citation Chen, Li-Ting and Liu, Leping (2021) "Methods to Analyze Likert-Type Data in Educational Technology Research," Journal of Educational Technology Development and Exchange (JETDE): Vol. 13 : Iss. 2 , Article 3. DOI: 10.18785/jetde.1302.04 Available at: https://aquila.usm.edu/jetde/vol13/iss2/3 This Article is brought to you for free and open access by The Aquila Digital Community. It has been accepted for inclusion in Journal of Educational Technology Development and Exchange (JETDE) by an authorized editor of The Aquila Digital Community. For more information, please contact [email protected]. Chen, L.T., & Liu, L.(2020). Methods to Analyze Likert-Type Data in Educational Technology Research. Journal of Educational Technology Development and Exchange, 13(2), 39-60 Methods to Analyze Likert-Type Data in Educational Technology Research Li-Ting Chen University of Nevada, Reno Leping Liu University of Nevada, Reno Abstract: Likert-type items are commonly used in education and related fields to measure attitudes and opinions. Yet there is no consensus on how to analyze data collected from these items. In this paper, we first provided a synthesis of the existing literature on methods to analyze Likert-type data and computing tools for these methods.
    [Show full text]
  • Resonance-Induced Multimodal Body-Size Distributions in Ecosystems
    Resonance-induced multimodal body-size distributions in ecosystems Adam Lamperta,1,2 and Tsvi Tlustya,b aDepartment of Physics of Complex Systems, Weizmann Institute of Science, Rehovot 76100, Israel; and bSimons Center for Systems Biology, Institute for Advanced Study, Princeton, NJ 08540 Edited by Andrea Rinaldo, Ecole Polytechnique Federale de Lausanne, Lausanne, Switzerland, and approved November 12, 2012 (received for review July 31, 2012) The size of an organism reflects its metabolic rate, growth rate, classic metapopulation framework, we assume each patch is mortality, and other important characteristics; therefore, the distri- either empty or occupied by a single species (15–17). In our bution of body size is a major determinant of ecosystem structure model, each species is specified by its maximal growth rate q, and function. Body-size distributions often are multimodal, with which is negatively correlated with its body size. The higher several peaks of abundant sizes, and previous studies suggest that growth rate of smaller individuals results in more migrants sent this is the outcome of niche separation: species from distinct peaks to other patches. Larger individuals, on the other hand, are ad- avoid competition by consuming different resources, which results vantageous in interference and exclude the smaller individuals in selection of different sizes in each niche. However, this cannot during within-patch competitions. explain many ecosystems with several peaks competing over the Specifically, we considered the following processes (Fig. 1): same niche. Here, we suggest an alternative, generic mechanism First, occupied patches are emptied at a rate m. This may occur underlying multimodal size distributions, by showing that the size- because of mortality or because of path destruction and creation dependent tradeoff between reproduction and resource utilization in which occupied patches vanish and new patches enter the entails an inherent resonance that may induce multiple peaks, all system.
    [Show full text]
  • A Unified View on Skewed Distributions Arising from Selections
    The Canadian Journal of Statistics 581 Vol. 34, No. 4, 2006, Pages 581–601 La revue canadienne de statistique A unified view on skewed distributions arising from selections Reinaldo B. ARELLANO-VALLE, Marcia´ D. BRANCO and Marc G. GENTON Key words and phrases: Kurtosis; multimodal distribution; multivariate distribution; nonnormal distribu- tion; selection mechanism; skew-symmetric distribution; skewness; transformation. MSC 2000: Primary 60E05; secondary 62E10. Abstract: Parametric families of multivariate nonnormal distributions have received considerable attention in the past few decades. The authors propose a new definition of a selection distribution that encompasses many existing families of multivariate skewed distributions. Their work is motivated by examples that involve various forms of selection mechanisms and lead to skewed distributions. They give the main prop- erties of selection distributions and show how various families of multivariate skewed distributions, such as the skew-normal and skew-elliptical distributions, arise as special cases. The authors further introduce several methods of constructing selection distributions based on linear and nonlinear selection mechanisms. Une perspective integr´ ee´ des lois asymetriques´ issues de processus de selection´ Resum´ e´ : Les familles parametriques´ de lois multivariees´ non gaussiennes ont suscite´ beaucoup d’inter´ etˆ depuis quelques decennies.´ Les auteurs proposent une nouvelle definition´ du concept de loi de selection´ qui englobe plusieurs familles connues de lois asymetriques´ multivariees.´ Leurs travaux sont motives´ par diverses situations faisant intervenir des mecanismes´ de selection´ et conduisant a` des lois asymetriques.´ Ils mentionnent les principales propriet´ es´ des lois de selection´ et montrent comment diverses familles de lois asymetriques´ multivariees´ telles que les lois asymetriques´ normales ou elliptiques emergent´ comme cas particuliers.
    [Show full text]
  • General Solution of the Chemical Master Equation and Modality of Marginal Distributions for Hierarchic first-Order Reaction Networks
    J. Math. Biol. (2018) 77:377–419 https://doi.org/10.1007/s00285-018-1205-2 Mathematical Biology General solution of the chemical master equation and modality of marginal distributions for hierarchic first-order reaction networks Matthias Reis1 · Justus A. Kromer2 · Edda Klipp1 Received: 7 April 2017 / Revised: 10 October 2017 / Published online: 20 January 2018 © The Author(s) 2018. This article is an open access publication Abstract Multimodality is a phenomenon which complicates the analysis of sta- tistical data based exclusively on mean and variance. Here, we present criteria for multimodality in hierarchic first-order reaction networks, consisting of catalytic and splitting reactions. Those networks are characterized by independent and dependent subnetworks. First, we prove the general solvability of the Chemical Master Equa- tion (CME) for this type of reaction network and thereby extend the class of solvable CME’s. Our general solution is analytical in the sense that it allows for a detailed analysis of its statistical properties. Given Poisson/deterministic initial conditions, we then prove the independent species to be Poisson/binomially distributed, while the dependent species exhibit generalized Poisson/Khatri Type B distributions. General- ized Poisson/Khatri Type B distributions are multimodal for an appropriate choice of parameters. We illustrate our criteria for multimodality by several basic models, as well as the well-known two-stage transcription–translation network and Bateman’s model from nuclear physics. For both examples, multimodality was previously not reported. Electronic supplementary material The online version of this article (https://doi.org/10.1007/s00285- 018-1205-2) contains supplementary material, which is available to authorized users.
    [Show full text]
  • Multimodal Nested Sampling: an Efficient and Robust Alternative to Markov Chain Monte Carlo Methods for Astronomical Data Analyses
    Mon. Not. R. Astron. Soc. 384, 449–463 (2008) doi:10.1111/j.1365-2966.2007.12353.x Multimodal nested sampling: an efficient and robust alternative to Markov Chain Monte Carlo methods for astronomical data analyses F. Feroz⋆ and M. P. Hobson Astrophysics Group, Cavendish Laboratory, JJ Thomson Avenue, Cambridge CB3 0HE Accepted 2007 August 13. Received 2007 July 23; in original form 2007 April 27 ABSTRACT In performing a Bayesian analysis of astronomical data, two difficult problems often emerge. First, in estimating the parameters of some model for the data, the resulting posterior distribution may be multimodal or exhibit pronounced (curving) degeneracies, which can cause problems for traditional Markov Chain Monte Carlo (MCMC) sampling methods. Secondly, in selecting between a set of competing models, calculation of the Bayesian evidence for each model is computationally expensive using existing methods such as thermodynamic integration. The nested sampling method introduced by Skilling, has greatly reduced the computational expense of calculating evidence and also produces posterior inferences as a by-product. This method has been applied successfully in cosmological applications by Mukherjee, Parkinson & Liddle, but their implementation was efficient only for unimodal distributions without pronounced de- generacies. Shaw, Bridges & Hobson recently introduced a clustered nested sampling method which is significantly more efficient in sampling from multimodal posteriors and also deter- mines the expectation and variance of the final evidence from a single run of the algorithm, hence providing a further increase in efficiency. In this paper, we build on the work of Shaw et al. and present three new methods for sampling and evidence evaluation from distributions that may contain multiple modes and significant degeneracies in very high dimensions; we also present an even more efficient technique for estimating the uncertainty on the evaluated evidence.
    [Show full text]
  • Spatially Heterogeneous Estimates of Fire Frequency in Ponderosa Pine Forests of Washington, Usa
    Fire Ecology Volume 6, Issue 3, 2010 Kernan and Hessl: Spatially Heterogeneous Fire Frequency doi: 10.4996/fireecology.0603117 Page 117 PRACTICES AND APPLICATIONS IN FIRE ECOLOGY SPATIALLY HETEROGENEOUS ESTIMATES OF FIRE FREQUENCY IN PONDEROSA PINE FORESTS OF WASHINGTON, USA James T. Kernan1* and Amy E. Hessl2 1Department of Geography, State University of New York at Geneseo, 1 College Circle, Geneseo, New York 14454, USA 2Department of Geology and Geography, West Virginia University, G49 Brooks Hall, Morgantown, West Virginia 26506, USA *Corresponding author: Tel.: 001-585-245-5463; e-mail: [email protected] ABSTRACT Many fire history studies have evaluated the temporal nature of fire regimes using fire in- terval statistics calculated from fire scars. More recently, researchers have begun to eval- uate the spatial properties of past fires as well. In this paper, we describe a technique for investigating spatio-temporal variability using a geographic information system (GIS). We used a dataset of fire-scarred trees collected from four sites in eastern Washington, USA, ponderosa pine (Pinus ponderosa C. Lawson) forests. The patterns of past fires re- corded by individual trees (points) were converted to two-dimensional representations of fire with inverse distance weighting (IDW) in a GIS. A map overlay approach was then used to extract a fine-grained, spatially explicit reconstruction of fire frequency at the four sites. The resulting classified maps can supplement traditional fire interval statistics and fire atlas data to provide detailed, spatially heterogeneous estimates of fire frequency. Such information can reveal ecological relationships between fire and the landscape, and provide managers with an improved spatial perspective on fire frequency that can inform risk evaluations, fuels reduction efforts, and the allocation of fire-fighting resources.
    [Show full text]
  • Empirical Measures of Mutational Effects Define Neutral Models of Regulatory Evolution in Saccharomyces Cerevisiae
    bioRxiv preprint doi: https://doi.org/10.1101/551804; this version posted February 17, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license. Empirical measures of mutational effects define neutral models of regulatory evolution in Saccharomyces cerevisiae Andrea Hodgins-Davisa, Fabien Duveaua, Elizabeth Walkera, Patricia J Wittkoppa,b* aUniversity of Michigan, Department of Ecology and Evolutionary Biology, Ann Arbor, MI 48109, USA bDepartment of Molecular, Cellular, and Developmental Biology, Ann Arbor, MI 48109, USA Classification: BIOLOGICAL SCIENCES: Evolution Corresponding author: Dr. Patricia J Wittkopp 1105 N University Ave Ann Arbor, MI, 48109 USA [email protected] Running title: Empirical measures of mutational effects for gene expression in yeast bioRxiv preprint doi: https://doi.org/10.1101/551804; this version posted February 17, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license. Abstract Understanding how phenotypes evolve requires disentangling the effects of mutation generating new variation from the effects of selection filtering it. Evolutionary models frequently assume that mutation introduces phenotypic variation symmetrically around the population mean, yet few studies have tested this assumption by deeply sampling the distributions of mutational effects for particular traits. Here, we examine distributions of mutational effects for gene expression in the budding yeast Saccharomyces cerevisiae by measuring the effects of thousands of point mutations introduced randomly throughout the genome.
    [Show full text]
  • Evidence That Computer Science Grades Are Not Bimodal
    Evidence That Computer Science Grades Are Not Bimodal Elizabeth Patitsas, Jesse Berlin, Michelle Craig, and Steve Easterbrook Department of Computer Science University of Toronto Toronto, Ontario, Canada patitsas,mcraig,[email protected] and [email protected] ABSTRACT inform their practice [13], and these beliefs may or may not Although it has never been rigourously demonstrated, there be based on empirical evidence. is a common belief that CS grades are bimodal. We statisti- cally analyzed 778 distributions of final course grades from 1.1 Explanations of Bimodality a large research university, and found only 5.8% of the dis- A number of explanations have been presented for why CS tributions passed tests of multimodality. We then devised grades are bimodal, all of which begin with the assumption a psychology experiment to understand why CS educators that this is the case. believe their grades to be bimodal. We showed 53 CS pro- fessors a series of histograms displaying ambiguous distri- 1.1.1 Prior Experience butions and asked them to categorize the distributions. A A bimodal distribution generally indicates that two dis- random half of participants were primed to think about the tinct populations have been sampled together [5]. One ex- fact that CS grades are commonly thought to be bimodal; planation for bimodal grades is that CS1 classes have two these participants were more likely to label ambiguous dis- populations of students: those with experience, and those tributions as \bimodal". Participants were also more likely without it [1]. to label distributions as bimodal if they believed that some High school CS is not common in many countries, and students are innately predisposed to do better at CS.
    [Show full text]
  • Multi-Modal Albedo Distributions in the Ablation Area of the Southwestern Greenland Ice Sheet
    The Cryosphere, 9, 905–923, 2015 www.the-cryosphere.net/9/905/2015/ doi:10.5194/tc-9-905-2015 © Author(s) 2015. CC Attribution 3.0 License. Multi-modal albedo distributions in the ablation area of the southwestern Greenland Ice Sheet S. E. Moustafa1, A. K. Rennermalm1, L. C. Smith2, M. A. Miller3, J. R. Mioduszewski1, L. S. Koenig4,*, M. G. Hom5,6, and C. A. Shuman4,7 1Department of Geography, Rutgers, The State University of New Jersey, 54 Joyce Kilmer Avenue, Piscataway, NJ 08854-8045, USA 2Department of Geography, University of California, Los Angeles, 1255 Bunche Hall, P. O. Box 951524, Los Angeles, CA 90095-1524, USA 3Department of Environmental Sciences, Rutgers, The State University of New Jersey, 14 College Farm Rd, New Brunswick, NJ 08901-8551, USA 4Cryospheric Sciences Laboratory, NASA Goddard Space Flight Center, 8800 Greenbelt Road, Greenbelt, MD, 20771, USA 5Biospheric Sciences Laboratory, NASA Goddard Space Flight Center, 8800 Greenbelt Road, Greenbelt, MD, 20771, USA 6Science Systems and Applications, Inc., 10210 Greenbelt Rd, Lanham, MD, 20706, USA 7Joint Center for Earth Systems Technology (JCET), University of Maryland, Baltimore County, 1000 Hilltop Circle, Baltimore, MD, 21250, USA *now at: National Snow and Ice Data Center, University of Colorado, 1540 30th Ave., Boulder CO 80303, USA Correspondence to: S. E. Moustafa ([email protected]) Received: 4 August 2014 – Published in The Cryosphere Discuss.: 5 September 2014 Revised: 1 March 2015 – Accepted: 26 March 2015 – Published: 7 May 2015 Abstract. Surface albedo is a key variable controlling solar and ice facies characterizing the two peaks. Our results show radiation absorbed at the Greenland Ice Sheet (GrIS) surface that a shift from a distribution dominated by high to low albe- and, thus, meltwater production.
    [Show full text]
  • Multimodal Data Representations with Parameterized Local Structures
    Multimodal Data Representations with Parameterized Local Structures Ying Zhu1, Dorin Comaniciu2, Stuart Schwartz1, and Visvanathan Ramesh2 1 Department of Electrical Engineering, Princeton University Princeton, NJ 08544, USA {yingzhu, stuart}@EE.princeton.edu 2 Imaging & Visualization Department, Siemens Corporate Research 755 College Road East, Princeton, NJ 08540, USA {Dorin.Comaniciu, Visvanathan.Ramesh}@scr.siemens.com Abstract. In many vision problems, the observed data lies in a nonlin- ear manifold in a high-dimensional space. This paper presents a generic modelling scheme to characterize the nonlinear structure of the manifold and to learn its multimodal distribution. Our approach represents the data as a linear combination of parameterized local components, where the statistics of the component parameterization describe the nonlinear structure of the manifold. The components are adaptively selected from the training data through a progressive density approximation proce- dure, which leads to the maximum likelihood estimate of the underlying density. We show results on both synthetic and real training sets, and demonstrate that the proposed scheme has the ability to reveal impor- tant structures of the data. 1 Introduction In this paper we address the problem of learning the statistical representation of multivariate visual data through parametric modelling. In many pattern recog- nition and vision applications, an interesting pattern is measured or visualized through multivariate data such as time signals and images. Its random occur- rences are described as scattered data points in a high-dimensional space. To better understand and use the critical information, it is important to explore the intrinsic low dimensionality of the scattered data and to characterize the data distribution through statistical modelling.
    [Show full text]
  • Development of a Severe Fire Potential Map for the Contiguous United States
    Development of a Severe Fire Potential Map for the Contiguous United States Gregory K. Dillon, Matthew H. Panunto, Brett Davis, Penelope Morgan, Donovan S. Birch, William M. Jolly Forest Rocky Mountain General Technical Report August Service Research Station RMRS-GTR-415 2020 Dillon, Gregory K.; Panunto, Matthew H.; Davis, Brett; Morgan, Penelope; Birch, Donovan S.; Jolly, William M. 2020. Development of a Severe Fire Potential map for the contiguous United States. Gen. Tech. Rep. RMRS- GTR-415. Fort Collins, CO: U.S. Department of Agriculture, Forest Service, Rocky Mountain Research Station. 107 p. Abstract Burn severity is the ecological change resulting from wildland fires. It is often mapped by using prefire and postfire satellite imagery and classified as low, moderate, or high. Areas burned with high severity are of particular concern to land managers and others because postfire vegetation, soil, and other important ecosystem components can be highly altered. In this study, we developed Random Forest statistical models describing the occurrence of high burn severity across the contiguous United States. We divided our work into 17 regions in the western United States and 8 regions in the eastern United States, and further subdivided them by forest and nonforest vegetation settings, resulting in 50 separate models. Our predictor variables represented prefire vegetation, topography, and fuel moisture, and our response variable was satellite-derived burn severity from the U.S. interagency Monitoring Trends in Burn Severity program. We created statistical models from 6,663 fires in the western United States (1984–2007) and 5,295 fires in the eastern United States (2000–2013).
    [Show full text]