The JASP Data Library Version 2
Total Page:16
File Type:pdf, Size:1020Kb
The JASP Team The JASP Data Library Version 2 JASP Publishing Copyright © 2019 E.-J. Wagenmakers published by jasp publishing tufte-latex.googlecode.com Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in com- pliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/ LICENSE-2.0. Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “as is” basis, without warranties or conditions of any kind, either express or implied. See the License for the specific language governing permissions and limitations under the License. This printing, November 2019 Contents Contents 3 Preface 7 PART I DESCRIPTIVE STATISTICS 1 Sleep: Descriptive Statistics 19 2 Fear of Statistics: Reliability Analysis 27 PART II T-TESTS 3 Weight Gain: One Sample T-Test 37 4 Moon & Aggression: Paired Samples T-Test 43 5 Directed Reading Activities: Independent Samples T-Test 49 6 Stereograms: Bayesian Independent Samples T-Test 55 7 Kitchen Rolls: Independent Samples T-Test 63 8 Horizontal Eye Movements: Bayesian Independent Samples T-Test 73 9 Invisibility Cloak: Independent Samples T-Test 81 PART III ANOVA 10 Pain Thresholds: One-Way ANOVA 89 11 Facebook Friends: One-Way ANOVA 95 12 Response to Eye Color: One-Way ANOVA 103 Bibliography 111 List of External Figures 115 5 “You don’t understand anything until you learn it more than one way.” Marvin Minsky Preface This is the documentation for the examples contained in the JASP Data Library. The data sets are available from JASP both in .jasp format and in .csv format. This documentation will grow over time, and therefore we have added a version number to the book title. Structure and Contents The structure of this documentation follows the layout of the JASP ribbon: the first parts deal with Descriptives, T-tests, ANOVA, Regres- sion, Frequencies, and Factor Analysis, whereas the remaining parts concern various JASP modules, namely Meta-Analysis, Network Figure 1: JASP: A fresh way Analysis, and Structural Equation Modeling. to do statistics. Download The example data sets come from various sources. For educational JASP for free at http://jasp- purposes we have included the material from two popular statistical stats.org. textbooks, Field( 2017) and Moore et al.( 2012). This means it is now easy for teachers and students to use JASP and analyze the classic ‘Looks or Personality’ example from Andy Field, or the ‘College Success’ example from Moore, McCaib, and Craig. We also added the original example data sets that JASP users may be familiar with (ever wonder what the ‘Kitchen Rolls’ data set was all about?). A number of publicly available data sets have been added as well. Copyright Most data sets contained in the JASP Data Library are already pub- licly available because they have been presented in academic articles and books. The legal consensus is that such data are facts and there- fore do not fall under copyright protection. An exception concerns data that have not been collected but constructed in order to serve an educational purpose, as is the case for most data sets from Andy Field’s popular course books. For these data sets, the source is ac- knowledged as follows: “This data set comes from Field, A. P. (2017). Discovering Statis- tics Using IBM SPSS Statistics (5th ed.). London: Sage. The data 8 set was constructed by Andy Field who therefore owns the copy- right. Andy Field generously agreed that we can include the data set in the JASP data library. This data set is also publicly avail- able on the website that accompanies Andy Field’s book: https: //edge.sagepub.com/field5e. Without Andy Field‘s explicit consent, this data set may not be distributed for commercial purposes, this data set may not be edited, and this data set may not be presented without acknowledging its source (i.e., the terms of a CC BY-NC-ND license; https://creativecommons.org/licenses/by-nc-nd/3.0/). ” The Need for Documentation Guidelines Our experience with the analysis of publicly available data suggests that there is an urgent need for standard guidelines that encourage systematic and comprehensive documentation. Without such guide- lines, the analysis often becomes a frustrating exercise in encryption.1 1 For empirical support, see for instance Oftentimes, the data could not have been properly analyzed without Hardwicke et al.( 2018). assistance from the original authors. This undesirable state of affairs is why we have developed the JASP Data Documentation Format described below. This format provides a list of minimal requirements for how data ought to be documented, one that may need to be ex- panded for particular data sets (such as those from neuroscience). The proposed format is concrete and focuses purely on documenta- tion (see the FAIR guidelines2 for complementary advice concerning 2 Wilkinson et al.( 2016). data storage). The JASP Data Documentation Format Each chapter in this book documents a single data set using the JASP Data Documentation Format (JDDF).3 The JDDF contains the 3 JDDF templates are available online at following elements: https://osf.io/bjmrg/ I. Description. A brief summary of the data set in order to provide context. Example: This data set, Glasgow Norms, provides a set of normative rat- ings of 5, 553 English words on nine psycholinguistic dimensions. Specifically: “The Glasgow Norms are a set of normative ratings for 5,553 English words on 9 psycholinguistic dimensions: arousal, valence, dominance, concreteness, imageability, familiarity, age of acquisition, semantic size, and gender association (...) for any given subset of words, the same participants provided ratings across all 9 dimensions (32 participants/word, on average).”(Scott et al. 2019, p. 2). II. Purpose. A brief statement of the purpose that the data may serve. Example: 9 The purpose of the original study was to develop a substantial set of standardized, freely available, psycholinguistic materials. Scott et al. (2019) state that “Overall, the Glasgow Norms provide a valuable resource, in particular, for researchers investigating the role of word recognition in language comprehension”(Scott et al. 2019, p. 2). Here we use this data set to demonstrate Bayesian Pearson’s correlation. Specifically, we will estimate how the psycholinguistic dimensions relate to each other. For example, are words that are rated higher on the Dominance dimension rated as more masculine or feminine? Are words associated with higher arousal rated more positive or negative? III. Data Screenshot. A visual impression of the data structure, show- ing at least some of the columns and rows in which the data have been organized. An example screenshot is shown in Figure 2. An example of the text that could accompany the screenshot: The Glasgow Norms data set(Scott et al. 2019) consists of 5, 553 rows and 29 columns. Each row corresponds to a specific word, and the columns relate to various lexical properties. IV. Variables. A description for each of the variables in the data set. At a minimum, the description features (a) the variable name; (b) the variable’s measurement level and the values that the variable can take on; (c) a verbal summary of the variable. An example for the first five variables in the Glasgow Norms data set: • word – Type: Categorical In this book, we use three types of variables: Categorical, Ordinal, and – Description: The word whose lexical properties are listed. Continuous. Even discrete variables, • length although not only strictly continuous, are categorized as ’Continuous’ - this – Type: Continuous is because the analyses options in JASP are the same for both types of variables. – Description: Word length in number of letters. • M AROU – Type: Continuous – Description: A word’s mean arousal value (averaged across raters). Each rater judged the extent to which a word has emotional impact in terms of internal activation on a 9-point scale (‘1’ = very unarousing, ‘9’ = very arousing). • SD AROU – Type: Continuous 10 Figure 2: Screenshot of the first – Description: The standard deviation (across raters) of a word’s 18 rows and 8 columns of the arousal value. Each rater judged the extent to which a word Glasgow Norms data(Scott has emotional impact in terms of internal activation on a et al. 2019). 9-point scale (‘1’ = very unarousing, ‘9’ = very arousing). • N AROU – Type: Continuous – Description: The number of raters who judged a word’s arousal value. Each rater judged the extent to which a word has emotional impact in terms of internal activation on a 9-point scale (‘1’ = very unarousing, ‘9’ = very arousing). V. Source. A description of where the data were obtained, and where the data can be accessed online. Example: Scott, G. G., Keitel, A., Becirspahic, M., O’Donnell, P. J., and Sereno, S. C. (2019). The Glasgow norms: Ratings of 5,500 words on 9 scales. Behavior Research Methods, 51:1258–1270, https://psyarxiv.com/ akzyx/. The data set is available as Supplementary Materials. 11 VI. Analysis code. Anything that allows a third party to reproduce a result. This can be R code, Stan code, or a .jasp file, for instance. We strongly recommend against using commercial software for analysis, as the interested third party might be unwilling or unable to buy an expensive license (e.g., to Matlab, JMP, SAS, Stata, or SPSS) in order to reproduce the result. Example: An annotated JASP file is available in the JASP Data Library and on the OSF at https://osf.io/ypwk5/. VII. Example analysis. A demonstration that the provided code, when applied to the provided data, actually produces a result. Example: As a first step, we will inspect the correlations between the nine psycholinguistic dimensions.