Joint Genome-Wide Prediction in Several Populations Accounting for Randomness of Genotypes

Joint genome-wide prediction in several populations accounting for randomness of genotypes: A hierarchical Bayes approach. II: Multivariate spike and slab priors for marker effects and derivation of approximate Bayes and fractional Bayes factors for the complete family of models

Carlos Alberto Martínez (a,b), Kshitij Khare (b), Arunava Banerjee (c), Mauricio A. Elzo (a)

(a) Department of Animal Sciences
(b) Department of Statistics
(c) Department of Computer and Information Science and Engineering
University of Florida, Gainesville, FL, USA

Correspondence: Carlos Alberto Martínez, Department of Animal Sciences, University of Florida, Gainesville, FL 32611, USA.
Tel: 352-328-1624. Fax: 352-392-7851.
E-mail: [email protected]

Abstract

This study is the second part of a companion paper devoted to the development of Bayesian multiple regression models accounting for randomness of genotypes in across-population genome-wide prediction. This family of models considers heterogeneous and correlated marker effects and allelic frequencies across populations, and it can incorporate records from non-genotyped individuals and from individuals with missing genotypes at any subset of loci without previous imputation, while taking into account uncertainty about imputed genotypes. This paper extends the family by considering multivariate spike and slab conditional priors for marker allele substitution effects, and it contains derivations of approximate Bayes factors and fractional Bayes factors to compare the models from part I and those developed here with their null versions. These null versions are simpler models that ignore heterogeneity of populations but still account for randomness of genotypes. For each marker locus, the spike component of the prior was a point mass at $\boldsymbol{0} \in \mathbb{R}^{\mathcal{S}}$, where $\mathcal{S}$ is the number of populations, and the slab component was an $\mathcal{S}$-variate Gaussian distribution; independent conditional priors were assumed across loci. For the Gaussian components, covariance matrices were assumed to be either the same for all markers or different for each marker. For null models, the priors were simply univariate versions of these finite mixture distributions. Approximate algebraic expressions for Bayes factors and fractional Bayes factors were obtained using the Laplace approximation. Using the simulated datasets described in part I, these models were implemented and compared with the models derived in part I using measures of predictive performance based on squared Pearson correlations, the Deviance Information Criterion, Bayes factors, and fractional Bayes factors. The extensions presented here enlarge our family of genome-wide prediction models, making it more flexible by offering additional modeling options.

Key words: Bayesian whole-genome regressions; finite mixture priors; genetic heterogeneity; Laplace approximation; multi-population genome-enabled prediction.

1. Introduction

The scenario of across-population genome-wide prediction accounting for randomness of genotypes was addressed in part I of our series of studies. There, we adopted a hierarchical Bayesian modeling strategy to accommodate heterogeneous and correlated marker effects across subpopulations and random genotypes.
In that companion paper we provided a detailed derivation of the joint pmf of the genotypes conditional on pedigree information and allelic frequencies, and we also discussed some of its properties. Furthermore, the flexibility of hierarchical Bayesian modeling allowed us to account for heterogeneous and correlated allelic frequencies. The "MG-GBLUP" model proposed by Lehermeier et al. (2015) is similar to the models developed in part I of this study, except that it does not consider randomness of genotypes, nor does it consider models with different (heterogeneous) covariance matrices of marker effects. One of the main properties of our models is that individuals with phenotypic records and missing genotypes at any subset of loci (including non-genotyped individuals) can be included in the analysis without previous imputation. Furthermore, because a Bayesian approach is used, uncertainty about imputed genotypes is automatically taken into account.

The so-called "spike and slab" priors are finite mixtures of a continuous distribution (the slab) and a point mass at some constant (the spike) (Mitchell and Beauchamp, 1988). A particular case is the zero-inflated prior, which places the point mass at zero. These priors have been used in high-dimensional problems to induce stronger shrinkage and to perform variable selection. In single-population analyses, it has been reported that Bayesian variable selection models tend to perform better when genes with major effects control the trait under study or when the number of genes controlling the trait is small (Daetwyler et al., 2012; Heslot et al., 2012; Gianola and Rosa, 2015). For multiple-population analyses, van den Berg et al. (2015) studied scenarios under which Bayesian variable selection models outperformed genomic BLUP (GBLUP). They found that GBLUP was outperformed when the number of QTL was smaller than the number of independent chromosome segments, and that the difference in accuracy between these models was larger than in the single-population case.

In a Bayesian framework, model comparison can be performed via Bayes factors and modifications of them known as non-subjective Bayes factors (Ghosh et al., 2006). Bayes factors measure the change in the odds favoring a model once data are observed (Lavine and Schervish, 1999). O'Hagan (1994; 1995) proposed a non-subjective Bayes factor known as the fractional Bayes factor, which uses a fractional part of the likelihood, resulting in a "partial" Bayes factor. Analytical forms of Bayes factors involve integrating the joint distribution of data and parameters over the parameter space of a given model to obtain the marginal likelihood, and even for some simple models these integrals have no closed-form solution. One option for obtaining algebraic approximations is to use the Laplace approximation after arranging the integrand in an appropriate form (Ghosh et al., 2006); a minimal numerical sketch of this idea is given below. Another criterion for comparing models is the Deviance Information Criterion (DIC; Spiegelhalter et al., 2002; 2014), which combines measures of model fit and model complexity and, despite some limitations, has been used in several research areas (Spiegelhalter et al., 2014).
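The following Python sketch illustrates the general idea behind the Laplace approximation to a marginal likelihood and the resulting Bayes factor. It is not the algebraic derivation used in this paper; the function names, the numerical finite-difference Hessian, and the optimizer settings are illustrative assumptions.

```python
import numpy as np
from scipy import optimize

def log_marginal_laplace(log_joint, theta0, eps=1e-4):
    """Laplace approximation to log of the integral of exp(log_joint(theta)).

    log_joint(theta) should return log f(y | theta) + log pi(theta) for a
    d-dimensional parameter vector theta; theta0 is a starting value for the
    mode search.
    """
    neg = lambda th: -log_joint(np.asarray(th, dtype=float))
    res = optimize.minimize(neg, theta0, method="BFGS")
    mode = res.x
    d = mode.size
    # Crude finite-difference Hessian of -log_joint at the mode
    H = np.empty((d, d))
    for i in range(d):
        for j in range(d):
            ei = np.zeros(d)
            ej = np.zeros(d)
            ei[i] = eps
            ej[j] = eps
            H[i, j] = (neg(mode + ei + ej) - neg(mode + ei)
                       - neg(mode + ej) + neg(mode)) / eps**2
    _, logdet = np.linalg.slogdet(H)
    # log m(y) ~ log_joint(mode) + (d/2) log(2 pi) - (1/2) log |H|
    return log_joint(mode) + 0.5 * d * np.log(2.0 * np.pi) - 0.5 * logdet

def approx_bayes_factor(log_joint_1, theta0_1, log_joint_0, theta0_0):
    """Approximate Bayes factor BF_10 of a model against its null version."""
    return np.exp(log_marginal_laplace(log_joint_1, theta0_1)
                  - log_marginal_laplace(log_joint_0, theta0_0))
```

In the paper itself, the Laplace approximation is applied algebraically, after arranging each model's integrand in an appropriate form, to obtain approximate expressions for Bayes factors and fractional Bayes factors rather than numerical estimates.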
Thus, the objectives of this study were to extend the family of models developed in the companion paper (part I) by considering the so-called spike and slab priors for marker effects, and to derive approximate expressions for Bayes factors and fractional Bayes factors to compare the proposed models with their corresponding null versions that ignore population structure.

2. Methods

2.1 The models

The complete population, or simply the population, is defined as the set of individuals with phenotypes considered in the analysis; it comprises a set of $\mathcal{S}$ subpopulations defined by some criterion such as environment, race, breed, or line. The following assumptions are also made: linkage equilibrium, Hardy-Weinberg equilibrium, no mutation, and, starting from the oldest individuals with phenotypes, a fully known pedigree.

The linear model describing the relationship between records and mappings of marker genotypes is $\boldsymbol{y} = W\boldsymbol{g} + \boldsymbol{e}$, where $\boldsymbol{y} \in \mathbb{R}^{n}$ is a vector of response variables (e.g., records corrected for non-genetic factors), $W \in \mathbb{R}^{n \times m}$ is an observable random matrix whose entries correspond to a one-to-one mapping from the set of individual marker genotypes to a subset of the integers (defined later), $\boldsymbol{g} \in \mathbb{R}^{m}$ is an unknown random vector of average marker allele substitution effects for every population, and $\boldsymbol{e} \in \mathbb{R}^{n}$ is a random vector of residuals. If records are sorted by subpopulation, as are the columns of $W$ and the elements of $\boldsymbol{g}$, then for every $l = 1, 2, \ldots, \mathcal{S}$, $\boldsymbol{y}_l = W_l \boldsymbol{g}_l + \boldsymbol{e}_l$, with dimensions $(\boldsymbol{y}_l)_{n_l \times 1}$, $(W_l)_{n_l \times m}$, $(\boldsymbol{g}_l)_{m \times 1}$ and $(\boldsymbol{e}_l)_{n_l \times 1}$, where $n_l$ is the sample size of subpopulation $l$ and $m$ is the number of marker loci; therefore, $n = \sum_{l=1}^{\mathcal{S}} n_l$.

In our models, the mapping from the set of genotypes at each locus and each individual into a subset of the integers is defined as follows; only biallelic loci are considered. If A and B are the marker alleles at each locus and B is taken as the reference allele, then

$$
W_l = \{w_{ij}^{l}\}_{n_l \times m}, \qquad
w_{ij}^{l} =
\begin{cases}
1, & \text{if genotype} = BB\\
0, & \text{if genotype} = AB\\
-1, & \text{if genotype} = AA.
\end{cases}
$$

The hierarchical representation of our models is as follows. Let $R = (\sigma_{e_1}^{2}, \ldots, \sigma_{e_\mathcal{S}}^{2})$ and $V = \text{Block Diag.}\{\sigma_{e_l}^{2} I_{n_l}\}_{l=1}^{\mathcal{S}}$; then

$$
\begin{aligned}
\boldsymbol{y} \mid W, \boldsymbol{g}, R &\sim MVN(W\boldsymbol{g}, V)\\
W \mid \boldsymbol{p}_1^{*}, \boldsymbol{p}_2^{*}, \ldots, \boldsymbol{p}_m^{*} &\sim \pi(\cdot \mid \boldsymbol{p}_1^{*}, \boldsymbol{p}_2^{*}, \ldots, \boldsymbol{p}_m^{*})\\
\boldsymbol{p}_j^{*} &\overset{iid}{\sim} \pi(\boldsymbol{p}^{*}), \quad j = 1, 2, \ldots, m\\
\sigma_{e_1}^{2}, \ldots, \sigma_{e_\mathcal{S}}^{2} &\overset{iid}{\sim} \text{Inverse Gamma}\left(\tfrac{\tau^{2}}{2}, \tfrac{v}{2}\right) := IG\left(\tfrac{\tau^{2}}{2}, \tfrac{v}{2}\right)\\
\boldsymbol{g}_j \mid G_j, \pi_0 &\overset{ind}{\sim}
\begin{cases}
\text{point mass at } \boldsymbol{0} & \text{with probability } \pi_0\\
MVN(\boldsymbol{0}, G_j) & \text{with probability } 1 - \pi_0
\end{cases}\\
G_j &\overset{iid}{\sim} \text{Inverse Wishart}(a, \boldsymbol{\Sigma}) := IW(a, \boldsymbol{\Sigma})
\end{aligned}
$$

$$
G_j =
\begin{bmatrix}
\sigma_{j_1}^{2} & \sigma_{j_{1,2}} & \cdots & \sigma_{j_{1,\mathcal{S}}}\\
 & \sigma_{j_2}^{2} & \cdots & \sigma_{j_{2,\mathcal{S}}}\\
 & & \ddots & \vdots\\
\text{sym} & & & \sigma_{j_\mathcal{S}}^{2}
\end{bmatrix}
$$

where $\sigma_{e_l}^{2}$ is the residual variance in subpopulation $l$, $\sigma_{j_l}^{2}$ is the variance of the effect of the $j$th marker in the $l$th subpopulation, $\sigma_{j_{l,l'}}$ is the covariance between the effects of marker $j$ in subpopulations $l$ and $l'$, $\boldsymbol{p}_j^{*}$ is a parameter related to the allelic frequencies of the $j$th marker in each subpopulation, and $\pi(\boldsymbol{p}^{*})$ is its probability density function (pdf).
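To make the prior structure concrete, the following Python sketch draws marker effect vectors from the multivariate spike and slab prior with marker-specific covariance matrices (the heterogeneous-covariance variant described above) and applies the genotype coding used for $W$. It is a minimal illustration, not the samplers used in this study; the hyperparameter values and function names are hypothetical.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def code_genotypes(genotypes):
    """Map biallelic genotypes to integers: BB -> 1, AB -> 0, AA -> -1 (B is the reference allele)."""
    coding = {"BB": 1, "AB": 0, "AA": -1}
    return np.vectorize(coding.get)(genotypes)

def draw_marker_effects(m, S, pi0, a, Sigma):
    """Draw marker effect vectors g_j (j = 1..m) from the multivariate spike and slab prior:
    g_j = 0 in R^S with probability pi0 (spike); otherwise g_j ~ MVN(0, G_j) (slab),
    where the marker-specific covariance G_j ~ Inverse Wishart(a, Sigma)."""
    g = np.zeros((m, S))
    for j in range(m):
        if rng.random() >= pi0:  # slab component, probability 1 - pi0
            G_j = stats.invwishart.rvs(df=a, scale=Sigma, random_state=rng)
            g[j, :] = rng.multivariate_normal(np.zeros(S), G_j)
    return g

# Illustrative (hypothetical) hyperparameter values:
S, m, pi0 = 3, 500, 0.95          # populations, markers, prior mass on the spike
a, Sigma = S + 2, np.eye(S)       # inverse Wishart degrees of freedom and scale
g = draw_marker_effects(m, S, pi0, a, Sigma)

# Genotype coding for two individuals at four loci:
W_l = code_genotypes(np.array([["BB", "AB", "AA", "AB"],
                               ["AA", "BB", "AB", "BB"]]))
```

Under the homogeneous-covariance variant, a single covariance matrix drawn once from the inverse Wishart prior would be shared by all markers instead of drawing a separate $G_j$ for each locus.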