Anal´Yza Rozptylu - Jednoduch´Etˇr´Idˇen´I 14

Total Page:16

File Type:pdf, Size:1020Kb

Anal´Yza Rozptylu - Jednoduch´Etˇr´Idˇen´I 14 UCˇEBN´I TEXTY OSTRAVSKE´ UNIVERZITY Pˇr´ırodovˇedeck´a fakulta ANALYZA´ DAT Petr Bujok Ostravsk´a univerzita 2019 ANALYZA´ DAT KIP/ANDAT texty pro distanˇcn´ıstudium Autor: Petr Bujok Ostravsk´auniverzita, Pˇr´ırodovˇedeck´afakulta Katedra Informatiky a poˇc´ıtaˇc˚u Jazykov´akorektura nebyla provedena, za jazykovou str´anku odpov´ıd´aautor. c Petr Bujok, 2019 OBSAH i Obsah 1 Parametrick´etesty o shodˇestˇredn´ıch hodnot 4 1.1 Jednov´ybˇerov´yt-test ........................... 4 1.2 Dvouv´ybˇerov´yt-test ........................... 6 1.3 P´arov´yt-test ............................... 11 2 Anal´yza rozptylu - jednoduch´etˇr´ıdˇen´ı 14 3 Z´aklady line´arn´ıregrese 21 4 Neparametrick´emetody 33 4.1 Test dobr´eshody ............................. 34 4.2 Kontingenˇcn´ıtabulky - test nez´avislosti ................. 36 4.3 Znam´enkov´ytest ............................. 41 4.4 Jednov´ybˇerov´yWilcoxon˚uv test ..................... 43 4.5 Dvouv´ybˇerov´yWilcoxon˚uv test ..................... 46 4.6 Kruskal˚uv-Wallis˚uv test ......................... 50 4.7 Spearman˚uv koeficient poˇradov´ekorelace ................ 52 5 Programov´eprostˇredky pro statistick´ev´ypoˇcty 57 5.1 Tabulkov´yprocesor MS Excel ...................... 57 5.2 Statistick´eprogramov´esyst´emy ..................... 62 5.3 Volnˇedostupn´yprogram JASP ..................... 63 6 Prezentace v´ysledk˚uanal´yzy dat 69 6.1 Prezentace tabulek a uˇzit´ıvhodn´ych graf˚u ............... 69 6.2 Jak´ym chyb´am se vyhnout? ....................... 74 7 Literatura - komentovan´yseznam 78 8 Statistick´etabulky 81 OBSAH 1 8.1 Distribuˇcn´ıfunkce normovan´eho norm´aln´ıho rozdˇelen´ı ......... 82 8.2 Vybran´ekvantily rozdˇelen´ıCh´ı-kvadr´at ................. 83 8.3 Vybran´ekvantily Studentova t-rozdˇelen´ı ................ 84 8.4 Vybran´ekvantily Fisherova Snedecorova F-rozdˇelen´ı .......... 85 8.5 Kritick´ehodnoty pro jednov´ybˇerov´yWilcoxon˚uv test ......... 86 8.6 Kritick´e hodnoty pro dvouv´ybˇerov´y Wilcoxon˚uv (Mann˚uv- Whitney˚uv) test .............................. 87 8.7 Kritick´ehodnoty Spearmanova korelaˇcn´ıho koeficientu ........ 88 OBSAH 3 Pˇredmluva Tento text slouˇz´ıjako opora pro pˇredmˇet Anal´yza dat. Navazuje na kurs Z´aklady matematick´estatistiky. C´ılem kursu je aplikovat z´akladn´ıstatistick´eznalosti v rela- tivnˇejednoduch´ych ´uloh´ach, s nimiˇzse velmi ˇcasto setk´av´ame pˇri anal´yze dat. Casovouˇ n´aroˇcnost zvl´adnut´ıtohoto textu a vyˇreˇsen´ı zadan´ych pˇr´ıklad˚ulze odhad- nout na pˇribliˇznˇe80 aˇz100 hodin. V nˇekter´ych pˇr´ıkladech, jejichˇzˇreˇsen´ıje uvedeno v uˇcebn´ımtextu, se uˇz´ıvaj´ıdata ze soubor˚uBI97.txt. Pokud si chcete uveden´aˇreˇsen´ısami ovˇeˇrit a zopakovat, tato data si m˚uˇzete st´ahnout z webov´ych str´anek autora textu, http://www1.osu.cz/ bujok/. ∼ Kaˇzd´akapitola zaˇc´ın´apokyny pro jej´ıstudium. Tato ˇc´ast je vˇzdy oznaˇcena jako Pr˚uvodce studiem s ikonou na okraji str´anky. Pojmy a d˚uleˇzit´esouvislosti k zapamatov´an´ıjsou vyznaˇceny na okraji str´anky textu ikonou. V rozsahu cel´eho textu jsou um´ıstˇeny Pˇr´ıklady, jejichˇzpodrobn´eˇreˇsen´ıumoˇzˇnuje porozumˇet prob´ıran´eproblematice do vˇetˇs´ıhloubky a tak si sn´aze osvojit praktiky pro dalˇs´ıaplikace. V z´avˇeru kaˇzd´ekapitoly je rekapitulace nejd˚uleˇzitˇejˇs´ıch pojm˚u. Tato rekapitulace je oznaˇcena textem Shrnut´ı a ikonou na okraji. Odd´ıl Kontroln´ı ot´azky oznaˇcen´yikonou by v´am mˇel pomoci zjistit, zda jste prostudovanou kapitolu pochopili a snad vyprovokuje i vaˇse dalˇs´ıot´azky, na kter´e budete hledat odpovˇed’. U nˇekter´ych kapitol je pˇripomenuta Korespondeˇcn´ı ´uloha. Pro kombinovan´ea distanˇcn´ıstudium jsou korespondenˇcn´ı ´ulohy zad´av´any v r´amci kurzu dan´eho se- mestru. Uspˇeˇsn´evyˇreˇsen´ıkorespondenˇcn´ıch´ ´uloh je souˇc´ast´ıpodm´ınek pro celkov´e hodnocen´ıpˇredmˇetu. Hlavn´ı´ulohou, kterou byste mˇeli osvˇedˇcit poznatky z´ıskan´ev tomto kursu, je ana- l´yza v´ami vybran´eho souboru dat z vaˇseho okol´ı.Proto se poohl´ednˇete po datech, kter´ebyste chtˇeli statisticky zpracovat, a kde jste zvˇedavi na v´ysledky t´eto anal´yzy. Pˇr´ıpadn´enejasnosti vˇcas konzultujte s vyuˇcuj´ıc´ım. V´ysledky anal´yzy bude pak po- tˇreba pˇredloˇzit formou vytiˇstˇen´estruˇcn´ea pˇrehledn´ezpr´avy, pokud moˇzno v rozsahu max. 3 strany. Pˇred pˇr´ıpravou zpr´avy si prostudujte kap. 6 o prezentaci v´ysledk˚u. 4 1 PARAMETRICKE´ TESTY O SHODEˇ STREDNˇ ´ICH HODNOT 1 Parametrick´etesty o shodˇestˇredn´ıch hodnot Pr˚uvodce studiem: Tato kapitola shrnuje tˇri statistick´eparametrick´etesty. Text je rozdˇelen do nˇekolika logicky ucelen´ych ˇc´ast´ı.K prostudov´an´ıcel´et´eto kapitoly budete potˇrebovat asi 12 hodin. Studium v´am ulehˇc´ıˇcetn´eilustrativn´ıpˇr´ıklady. V pˇr´ıpadˇenejasnost´ıje moˇzn´e se obr´atit na oporu pˇredchoz´ıho kursu Z´aklady pravdˇepodobnosti a statistiky. C´ıl: Po prostudov´an´ıt´eto ˇc´asti kapitoly byste mˇeli: zn´at detaily testov´an´ıstatistick´ych hypot´ez, • urˇcit a interpretovat testov´ekrit´erium jednotliv´ych test˚u, • ch´apat rozd´ılymezi z´akladn´ımi parametrick´ymi testy, • rozliˇsovat mezi jednostrannou a oboustrannou alternativn´ı hypot´ezou. • 1.1 Jednov´ybˇerov´yt-test Jednov´ybˇerov´yoboustrann´y t-test byl podrobnˇevysvˇetlen v uˇcebn´ımtextu Z´aklady pravdˇepodobnosti a statistiky. Doporuˇcujeme se k tomu vr´atit a z´aklady testov´an´ı hypot´ez si znovu pˇripomenout. M´ame n´ahodn´yv´ybˇer (X1,X2,...,Xn) nez´avisl´ych n´ahodn´ych veliˇcin norm´alnˇeroz- dˇelen´ych, tj. X N(µ, σ2), i = 1, 2,...,n. Testujeme hypot´ezu, ˇze stˇredn´ıhod- i ∼ nota rozdˇelen´ıpopulace, z n´ıˇzm´ame v´ybˇer, tj. µ je rovna nˇejak´edan´ehodnotˇe µ0. proti alternativˇe, ˇze µ = µ0. Za platnosti nulov´ehypot´ezy m´astatistika T rozdˇelen´ı 6 X µ0 podle n´asleduj´ıc´ıho vztahu T = − tn 1 a pˇri oboustrann´ealternativˇe µ = µ0 je s/√n ∼ − 6 kritick´yobor W ( , tn 1(α/2)] [tn 1(1 α/2), + ). Pokud vypoˇcten´ahod- ≡ −∞ − ∪ − − ∞ nota T leˇz´ıv kritick´em oboru, tak nulovou hypot´ezu µ = µ0 pro dan´e α zam´ıt´ame (kvantily t-rozdˇelen´ıjsou tabelov´any 8.3). Oboustrann´aalternativa H : µ = µ vˇsak nen´ıjedin´amoˇzn´aformulace alternativn´ı 1 6 0 hypot´ezy. M´ame-li k dispozici nˇejakou apriorn´ıinformaci o stˇredn´ıhodnotˇepopu- lace, ze kter´eje realizov´an v´ybˇer, m˚uˇzeme zformulovat alternativu jednostrannˇe: H0 : µ = µ0 H1 : µ>µ0 (tzv. pravostrann´a alternativa) Dalˇs´ıpostup testu bude zcela analogick´yjako u oboustrann´eho testu, pouze kritick´y obor bude jin´y, totiˇz W [tn 1(1 α), + ). Nulovou hypot´ezu m˚uˇzeme zam´ıtnout ≡ − − ∞ 1.1 Jednov´ybˇerov´yt-test 5 ve prospˇech t´eto alternativy tehdy, kdyˇzv´ybˇerov´ypr˚umˇer X je o hodnˇevˇetˇs´ıneˇz µ0. Pˇresnˇeji vyj´adˇreno, kdyˇzpro hodnotu testov´eho krit´eria plat´ı X µ0 − tn 1(1 α). s/√n ≥ − − Vid´ıme, ˇze pravdˇepodobnost neopr´avnˇen´eho zam´ıtnut´ı nulov´ehypot´ezy je opˇet rovna hladinˇev´yznamnosti α. T´ım, ˇze jsme alternativu formulovali s vyuˇzit´ımnˇejak´eapri- orn´ıinformace, staˇc´ık zam´ıtnut´ınulov´ehypot´ezy, aby hodnota testov´eho kriteria T byla alespoˇn tn 1(1 α). U oboustrann´ealternativy by to bylo tn 1(1 α/2). − − − − Zcela analogicky, pokud bychom mˇeli k tomu d˚uvod, m˚uˇzeme formulovat i levostran- nou alternativu H1 : µ<µ0. Pak kritick´yobor je W ( , tn 1(α)]. ≡ −∞ − Obecnˇepˇri uˇz´ıv´an´ıtest˚u, zejm´ena jednostrann´ych, je vhodn´enejdˇr´ıve formulovat alternativu ve tvaru obsahuj´ıc´ımtvrzen´ı,kter´ebychom chtˇeli prok´azat“ – tzv. v´y- ” zkumnou hypot´ezu. Pak pokud nulovou hypot´ezu zam´ıtneme, m´ame t´emˇeˇrjistotu (s rizikem rovn´ym α), ˇze tvrzen´ıvyj´adˇren´ealternativn´ıhypot´ezou je pravdiv´e. Pˇr´ıklad 1.1 Na vybranou optimalizaˇcn´ı´ulohu byl aplikov´an deterministick´y algo- ritmus, kter´ydos´ahl jej´ıho ˇreˇsen´ıza 2400 v´ypoˇcetn´ıch krok˚u. Dlouhodob´yv´yzkum ukazuje, ˇze nedeterministick´y(stochastick´y) pˇr´ıstup m˚uˇze b´yt efektivnˇejˇs´ı.Proto byl na stejnou ´ulohu nasazen rovnˇeˇzstochastick´yalgoritmus. Z d˚uvodu stochastiˇcnosti byl experiment opakov´an 60 kr´at, a v´ysledn´epoˇcty krok˚u k dosaˇzen´ıˇreˇsen´ı t´eˇze ´ulohy jsou v tabulce: 2622 2906 3816 1371 2812 1058 1749 2262 3992 1841 1200 2895 1675 4512 2530 2182 3044 2510 3087 3852 1220 1285 3984 2588 2232 2727 1741 2742 1932 3790 1548 1085 899 2269 2804 1336 3457 2525 1933 2307 2821 2333 2523 1122 1405 2661 1828 788 2018 1400 2130 635 1419 3275 2767 957 2594 1413 3124 3923 Zjistˇete, zda je stochastick´yalgoritmus efektivnˇejˇs´ıneˇz deterministick´y. Pokud m´ame zjistit, zda jsou v´ysledky opakov´an´ıstejn´eho pokusu stochastick´eho algoritmu lepˇs´ıneˇzv´ysledek deterministick´eho etalonu“, pouˇzijeme jednov´ybˇerov´y ” t-test. Nulovou hypot´ezu m˚uˇzeme formulovat H0 : µstoch = 2400. Dosad´ıme-li do testov´eho krit´eria, obdrˇz´ıme T = 0.909, a pokud v´ysledek porovn´ame s tabul- − kou 8.3 (α =0.05, proto Tkrit=2), zjist´ıme, ˇze T < Tkrit. Na z´akladˇetoho nem˚uˇzeme | | zam´ıtnout H0, tud´ıˇznovˇeaplikovan´ystochastick´yalgoritmus dosahuje obdobn´ych v´ysledk˚ujako deterministick´ypˇr´ıstup (a to i pˇres to, ˇze je pr˚umˇern´ypoˇcet krok˚u stochastick´eho algoritmu roven 2291). Uk´azka v´ystupu programu JASP: 6 1 PARAMETRICKE´ TESTY O SHODEˇ STREDNˇ ´ICH HODNOT 95% CI for Mean Difference t df p Mean Difference Lower Upper kroky -0.909 59 0.367 -109.067 -349.254 131.121 N Mean SD SE kroky 60.000 2290.933 929.780 120.034 1.2 Dvouv´ybˇerov´yt-test Pˇredpokl´ad´ame, ˇze m´ame dva nez´avisl´ev´ybˇery o rozsahu n1, resp. n2, ze dvou 2 norm´alnˇe rozdˇelen´ych populac´ı. Prvn´ı populace m´a rozdˇelen´ı N(µ1, σ1), druh´a 2 N(µ2, σ2).
Recommended publications
  • Anomalous Perception in a Ganzfeld Condition - a Meta-Analysis of More Than 40 Years Investigation [Version 1; Peer Review: Awaiting Peer Review]
    F1000Research 2021, 10:234 Last updated: 08 SEP 2021 RESEARCH ARTICLE Stage 2 Registered Report: Anomalous perception in a Ganzfeld condition - A meta-analysis of more than 40 years investigation [version 1; peer review: awaiting peer review] Patrizio E. Tressoldi 1, Lance Storm2 1Studium Patavinum, University of Padua, Padova, Italy 2School of Psychology, University of Adelaide, Adelaide, Australia v1 First published: 24 Mar 2021, 10:234 Open Peer Review https://doi.org/10.12688/f1000research.51746.1 Latest published: 24 Mar 2021, 10:234 https://doi.org/10.12688/f1000research.51746.1 Reviewer Status AWAITING PEER REVIEW Any reports and responses or comments on the Abstract article can be found at the end of the article. This meta-analysis is an investigation into anomalous perception (i.e., conscious identification of information without any conventional sensorial means). The technique used for eliciting an effect is the ganzfeld condition (a form of sensory homogenization that eliminates distracting peripheral noise). The database consists of studies published between January 1974 and December 2020 inclusive. The overall effect size estimated both with a frequentist and a Bayesian random-effect model, were in close agreement yielding an effect size of .088 (.04-.13). This result passed four publication bias tests and seems not contaminated by questionable research practices. Trend analysis carried out with a cumulative meta-analysis and a meta-regression model with Year of publication as covariate, did not indicate sign of decline of this effect size. The moderators analyses show that selected participants outcomes were almost three-times those obtained by non-selected participants and that tasks that simulate telepathic communication show a two- fold effect size with respect to tasks requiring the participants to guess a target.
    [Show full text]
  • Dentist's Knowledge of Evidence-Based Dentistry and Digital Applications Resources in Saudi Arabia
    PTB Reports Research Article Dentist’s Knowledge of Evidence-based Dentistry and Digital Applications Resources in Saudi Arabia Yousef Ahmed Alomi*, BSc. Pharm, MSc. Clin Pharm, BCPS, BCNSP, DiBA, CDE ABSTRACT Critical Care Clinical Pharmacists, TPN Objectives: Drug information resources provide clinicians with safer use of medications and play a Clinical Pharmacist, Freelancer Business vital role in improving drug safety. Evidence-based medicine (EBM) has become essential to medical Planner, Content Editor, and Data Analyst, Riyadh, SAUDI ARABIA. practice; however, EBM is still an emerging dentistry concept. Therefore, in this study, we aimed to Anwar Mouslim Alshammari, B.D.S explore dentists’ knowledge about evidence-based dentistry resources in Saudi Arabia. Methods: College of Destiney, Hail University, SAUDI This is a 4-month cross-sectional study conducted to analyze dentists’ knowledge about evidence- ARABIA. based dentistry resources in Saudi Arabia. We included dentists from interns to consultants and Hanin Sumaydan Saleam Aljohani, those across all dentistry specialties and located in Saudi Arabia. The survey collected demographic Ministry of Health, Riyadh, SAUDI ARABIA. information and knowledge of resources on dental drugs. The knowledge of evidence-based dental care and knowledge of dental drug information applications. The survey was validated through the Correspondence: revision of expert reviewers and pilot testing. Moreover, various reliability tests had been done with Dr. Yousef Ahmed Alomi, BSc. Pharm, the study. The data were collected through the Survey Monkey system and analyzed using Statistical MSc. Clin Pharm, BCPS, BCNSP, DiBA, CDE Critical Care Clinical Pharmacists, TPN Package of Social Sciences (SPSS) and Jeffery’s Amazing Statistics Program (JASP).
    [Show full text]
  • 11. Correlation and Linear Regression
    11. Correlation and linear regression The goal in this chapter is to introduce correlation and linear regression. These are the standard tools that statisticians rely on when analysing the relationship between continuous predictors and continuous outcomes. 11.1 Correlations In this section we’ll talk about how to describe the relationships between variables in the data. To do that, we want to talk mostly about the correlation between variables. But first, we need some data. 11.1.1 The data Table 11.1: Descriptive statistics for the parenthood data. variable min max mean median std. dev IQR Dan’s grumpiness 41 91 63.71 62 10.05 14 Dan’s hours slept 4.84 9.00 6.97 7.03 1.02 1.45 Dan’s son’s hours slept 3.25 12.07 8.05 7.95 2.07 3.21 ............................................................................................ Let’s turn to a topic close to every parent’s heart: sleep. The data set we’ll use is fictitious, but based on real events. Suppose I’m curious to find out how much my infant son’s sleeping habits affect my mood. Let’s say that I can rate my grumpiness very precisely, on a scale from 0 (not at all grumpy) to 100 (grumpy as a very, very grumpy old man or woman). And lets also assume that I’ve been measuring my grumpiness, my sleeping patterns and my son’s sleeping patterns for - 251 - quite some time now. Let’s say, for 100 days. And, being a nerd, I’ve saved the data as a file called parenthood.csv.
    [Show full text]
  • Statistical Analysis in JASP
    Copyright © 2018 by Mark A Goss-Sampson. All rights reserved. This book or any portion thereof may not be reproduced or used in any manner whatsoever without the express written permission of the author except for the purposes of research, education or private study. CONTENTS PREFACE .................................................................................................................................................. 1 USING THE JASP INTERFACE .................................................................................................................... 2 DESCRIPTIVE STATISTICS ......................................................................................................................... 8 EXPLORING DATA INTEGRITY ................................................................................................................ 15 ONE SAMPLE T-TEST ............................................................................................................................. 22 BINOMIAL TEST ..................................................................................................................................... 25 MULTINOMIAL TEST .............................................................................................................................. 28 CHI-SQUARE ‘GOODNESS-OF-FIT’ TEST............................................................................................. 30 MULTINOMIAL AND Χ2 ‘GOODNESS-OF-FIT’ TEST. ..........................................................................
    [Show full text]
  • Resources for Mini Meta-Analysis and Related Things Updated on June 25, 2018 Here Are Some Resources That I and Others Find
    Resources for Mini Meta-Analysis and Related Things Updated on June 25, 2018 Here are some resources that I and others find useful. I know my mini meta SPPC article and Excel were not exhaustive and I didn’t cover many things, so I hope this resource page will give you some answers. I’ll update this page whenever I find something new or receive suggestions. Meta-Analysis Guides & Workshops • David Wilson, who wrote Practical Meta-Analysis, has a website containing PowerPoints and macros (for SPSS, Stata, and SAS) as well as some useful links on meta-analysis o I created an Excel template based on his slides (read the Excel page to see how this template differs from my previous SPPC template) • McShane and Böckenholt (2017) wrote an article in the Journal of Consumer Research similar to my mini meta paper. It comes with a very cool website that calculates meta- analysis for you: https://blakemcshane.shinyapps.io/spmetacase1/ • Courtney Soderberg from the Center for Open Science gave a mini meta-analysis workshop at SIPS 2017. All materials are publicly available: https://osf.io/nmdtq/ • Adam Pegler wrote a tutorial blog on using R to calculate mini meta based on my article • The founders of MetaLab at Stanford made video tutorials on meta-analysis in general • Geoff Cumming has a video tutorial on meta-analysis and other “new statistics” or read his Psychological Science article on the “new statistics” • Patrick Forscher’s meta-analysis syllabus and class materials are available online • A blog post containing 5 tips to understand and carefully
    [Show full text]
  • Sequence Ambiguity Determined from Routine Pol Sequencing Is a Reliable Tool for Real-Time Surveillance of HIV Incidence Trends
    Infection, Genetics and Evolution 69 (2019) 146–152 Contents lists available at ScienceDirect Infection, Genetics and Evolution journal homepage: www.elsevier.com/locate/meegid Research paper Sequence ambiguity determined from routine pol sequencing is a reliable T tool for real-time surveillance of HIV incidence trends ⁎ Maja M. Lunara, Snježana Židovec Lepejb, Mario Poljaka, a Institute of Microbiology and Immunology, Faculty of Medicine, University of Ljubljana, Zaloška 4, Ljubljana 1105, Slovenia b Dr. Fran Mihaljevič University Hospital for Infectious Diseases, Mirogojska 8, Zagreb 10000, Croatia ARTICLE INFO ABSTRACT Keywords: Identifying individuals recently infected with HIV has been of great significance for close monitoring of HIV HIV-1 epidemic dynamics. Low HIV sequence ambiguity (SA) has been described as a promising marker of recent Incidence infection in previous studies. This study explores the utility of SA defined as a proportion of ambiguous nu- Sequence ambiguity cleotides detected in baseline pol sequences as a tool for routine real-time surveillance of HIV incidence trends at Mixed base calls a national level. Surveillance A total of 353 partial HIV-1 pol sequences obtained from persons diagnosed with HIV infection in Slovenia from 2000 to 2012 were studied, and SA was reported as a percentage of ambiguous base calls. Patients were characterized as recently infected by examining anti-HIV serological patterns and/or using commercial HIV-1 incidence assays (BED and/or LAg-Avidity assay). A mean SA of 0.29%, 0.14%, and 0.19% was observed for infections classified as recent by BED, LAg, or anti-HIV serological results, respectively. Welch's t-test showed a significant difference in the SA of recent versus long-standing infections (p < 0.001).
    [Show full text]
  • 9. Categorical Data Analysis
    9. Categorical data analysis Now that we’ve covered the basic theory behind hypothesis testing it’s time to start looking at specific tests that are commonly used in psychology. So where should we start? Not every textbook agrees on where to start, but I’m going to start with “χ2 tests” (this chapter, pronounced “chi-square”1)and “t-tests” (Chapter 10). Both of these tools are very frequently used in scientific practice, and whilst they’re not as powerful as “regression” (Chapter 11) and “analysis of variance” (Chapter 12)they’re much easier to understand. The term “categorical data” is just another name for “nominal scale data”. It’s nothing that we haven’t already discussed, it’s just that in the context of data analysis people tend to use the term “categorical data” rather than “nominal scale data”. I don’t know why. In any case, categorical data analysis refers to a collection of tools that you can use when your data are nominal scale. However, there are a lot of different tools that can be used for categorical data analysis, and this chapter covers only a few of the more common ones. 9.1 The χ2 (chi-square) goodness-of-fit test The χ2 goodness-of-fit test is one of the oldest hypothesis tests around. It was invented by Karl Pearson around the turn of the century (Pearson 1900), with some corrections made later by Sir Ronald Fisher (Fisher 1922a). It tests whether an observed frequency distribution of a nominal variable matches an expected frequency distribution.
    [Show full text]
  • JASP: Graphical Statistical Software for Common Statistical Designs
    JSS Journal of Statistical Software January 2019, Volume 88, Issue 2. doi: 10.18637/jss.v088.i02 JASP: Graphical Statistical Software for Common Statistical Designs Jonathon Love Ravi Selker Maarten Marsman University of Newcastle University of Amsterdam University of Amsterdam Tahira Jamil Damian Dropmann Josine Verhagen University of Amsterdam University of Amsterdam University of Amsterdam Alexander Ly Quentin F. Gronau Martin Šmíra University of Amsterdam University of Amsterdam Masaryk University Sacha Epskamp Dora Matzke Anneliese Wild University of Amsterdam University of Amsterdam University of Amsterdam Patrick Knight Jeffrey N. Rouder Richard D. Morey University of Amsterdam University of California, Irvine Cardiff University University of Missouri Eric-Jan Wagenmakers University of Amsterdam Abstract This paper introduces JASP, a free graphical software package for basic statistical pro- cedures such as t tests, ANOVAs, linear regression models, and analyses of contingency tables. JASP is open-source and differentiates itself from existing open-source solutions in two ways. First, JASP provides several innovations in user interface design; specifically, results are provided immediately as the user makes changes to options, output is attrac- tive, minimalist, and designed around the principle of progressive disclosure, and analyses can be peer reviewed without requiring a “syntax”. Second, JASP provides some of the recent developments in Bayesian hypothesis testing and Bayesian parameter estimation. The ease with which these relatively complex Bayesian techniques are available in JASP encourages their broader adoption and furthers a more inclusive statistical reporting prac- tice. The JASP analyses are implemented in R and a series of R packages. Keywords: JASP, statistical software, Bayesian inference, graphical user interface, basic statis- tics.
    [Show full text]
  • Statistical Analysis in JASP V0.10.0- a Students Guide .Pdf
    2nd Edition JASP v0.10.0 June 2019 Copyright © 2019 by Mark A Goss-Sampson. All rights reserved. This book or any portion thereof may not be reproduced or used in any manner whatsoever without the express written permission of the author except for the purposes of research, education or private study. CONTENTS PREFACE .................................................................................................................................................. 1 USING THE JASP ENVIRONMENT ............................................................................................................ 2 DATA HANDLING IN JASP ........................................................................................................................ 7 JASP ANALYSIS MENU ........................................................................................................................... 10 DESCRIPTIVE STATISTICS ....................................................................................................................... 12 EXPLORING DATA INTEGRITY ................................................................................................................ 21 DATA TRANSFORMATION ..................................................................................................................... 29 ONE SAMPLE T-TEST ............................................................................................................................. 33 BINOMIAL TEST ....................................................................................................................................
    [Show full text]
  • 12. Comparing Several Means (One-Way ANOVA)
    12. Comparing several means (one-way ANOVA) This chapter introduces one of the most widely used tools in psychological statistics, known as “the analysis of variance”, but usually referred to as ANOVA. The basic technique was developed by Sir Ronald Fisher in the early 20th century and it is to him that we owe the rather unfortunate terminology. The term ANOVA is a little misleading, in two respects. Firstly, although the name of the technique refers to variances, ANOVA is concerned with investigating differences in means. Secondly, there are several different things out there that are all referred to as ANOVAs, some of which have only a very tenuous connection to one another. Later on in the book we’ll encounter a range of different ANOVA methods that apply in quite different situations, but for the purposes of this chapter we’ll only consider the simplest form of ANOVA, in which we have several different groups of observations, and we’re interested in finding out whether those groups differ in terms of some outcome variable of interest. This is the question that is addressed by a one-way ANOVA. The structure of this chapter is as follows: in Section 12.1 I’ll introduce a fictitious data set that we’ll use as a running example throughout the chapter. After introducing the data, I’ll describe the mechanics of how a one-way ANOVA actually works (Section 12.2) and then focus on how you can run one in JASP (Section 12.3). These two sections are the core of the chapter.
    [Show full text]
  • A Tutorial on Bayesian Multi-Model Linear Regression with BAS and JASP
    Behavior Research Methods https://doi.org/10.3758/s13428-021-01552-2 A tutorial on Bayesian multi-model linear regression with BAS and JASP Don van den Bergh1 · Merlise A. Clyde2 · Akash R. Komarlu Narendra Gupta1 · Tim de Jong1 · Quentin F. Gronau1 · Maarten Marsman1 · Alexander Ly1,3 · Eric-Jan Wagenmakers1 Accepted: 21 January 2021 © The Author(s) 2021 Abstract Linear regression analyses commonly involve two consecutive stages of statistical inquiry. In the first stage, a single ‘best’ model is defined by a specific selection of relevant predictors; in the second stage, the regression coefficients of the winning model are used for prediction and for inference concerning the importance of the predictors. However, such second-stage inference ignores the model uncertainty from the first stage, resulting in overconfident parameter estimates that generalize poorly. These drawbacks can be overcome by model averaging, a technique that retains all models for inference, weighting each model’s contribution by its posterior probability. Although conceptually straightforward, model averaging is rarely used in applied research, possibly due to the lack of easily accessible software. To bridge the gap between theory and practice, we provide a tutorial on linear regression using Bayesian model averaging in JASP, based on the BAS package in R. Firstly, we provide theoretical background on linear regression, Bayesian inference, and Bayesian model averaging. Secondly, we demonstrate the method on an example data set from the World Happiness Report. Lastly,
    [Show full text]
  • A Tutorial on Conducting and Interpreting a Bayesian ANOVA in JASP
    A Tutorial on Conducting and Interpreting a Bayesian ANOVA in JASP Don van den Bergh∗1, Johnny van Doorn1, Maarten Marsman1, Tim Draws1, Erik-Jan van Kesteren4, Koen Derks1,2, Fabian Dablander1, Quentin F. Gronau1, Simonˇ Kucharsk´y1, Akash R. Komarlu Narendra Gupta1, Alexandra Sarafoglou1, Jan G. Voelkel5, Angelika Stefan1, Alexander Ly1,3, Max Hinne1, Dora Matzke1, and Eric-Jan Wagenmakers1 1University of Amsterdam 2Nyenrode Business University 3Centrum Wiskunde & Informatica 4Utrecht University 5Stanford University Abstract Analysis of variance (ANOVA) is the standard procedure for statistical inference in factorial designs. Typically, ANOVAs are executed using fre- quentist statistics, where p-values determine statistical significance in an all-or-none fashion. In recent years, the Bayesian approach to statistics is increasingly viewed as a legitimate alternative to the p-value. How- ever, the broad adoption of Bayesian statistics {and Bayesian ANOVA in particular{ is frustrated by the fact that Bayesian concepts are rarely taught in applied statistics courses. Consequently, practitioners may be unsure how to conduct a Bayesian ANOVA and interpret the results. Here we provide a guide for executing and interpreting a Bayesian ANOVA with JASP, an open-source statistical software program with a graphical user interface. We explain the key concepts of the Bayesian ANOVA using two empirical examples. ∗Correspondence concerning this article should be addressed to: Don van den Bergh University of Amsterdam, Department of Psychological Methods Postbus 15906, 1001 NK Amsterdam, The Netherlands E-Mail should be sent to: [email protected]. 1 Ubiquitous across the empirical sciences, analysis of variance (ANOVA) al- lows researchers to assess the effects of categorical predictors on a continuous outcome variable.
    [Show full text]