An Introduction to JASP: A Free and User-Friendly Package

James Bartlett

1 Introduction

1.1 What is JASP?

JASP is a free, open-source alternative to SPSS that allows you to perform both simple and complex analyses in a user-friendly package. The aim is to allow you to conduct complex Classical (stuff with p values) and Bayesian analyses ("What the hell is Bayesian?" you might ask; look in appendix 9.1 if you are interested), while having the advantage of a drag-and-drop interface that is intuitive to use.

1.2 Why JASP?

Although many universities predominantly use SPSS, it is expensive and can be a frustrating experience to use. JASP is a free alternative that aims to give you exactly what you want when you are analysing data.

1.2.1 Effect sizes

Effect sizes are among the most important values to report when analysing data. However, despite many papers and an APA task force explaining their importance, SPSS only offers a limited number of effect size options, and many simple effect sizes must be calculated manually. On the other hand, JASP allows you to simply tick a box to provide an effect size for each test.

1.2.2 Continuously updated output

Imagine you have gone through all of the menus in SPSS, only to realise you forgot to click one option that you wanted included in the output. You would have to go back through the menus, select that one option, and rerun the whole analysis, printing it below the first output. This looks incredibly messy and takes a lot of time. In JASP, all of the options and results are presented on the same screen. If you want another option to be presented, all you have to do is tick a box and the results are updated in seconds.

1.2.3 Minimalist design

For each statistical test, SPSS provides every value you will ever need and more. This can be very confusing when you are getting to grips with statistics, and you can easily report the wrong value. In JASP, minimalism is the aim. You start off with a basic result, and you have the option to select additional information if and when you need it.

1.3 Using JASP

1.3.1 How to download JASP

Today, you will be opening JASP through the software portal, but you can download it yourself for free from their website for either Windows, OSX, or Linux (if that’s your thing). After installing it and opening the program, you will find the "Welcome to JASP" window shown in Figure 1.

Figure 1: JASP startup window

1.3.2 Entering data in JASP

The first difference you will find between SPSS and JASP is how you enter data. In SPSS, you enter data manually through the data view screen. In JASP, there is currently no facility to enter data directly (although this may change in future), and you have to load the data from a comma-separated values (.csv) file. This is like an Excel spreadsheet, but you cannot directly load a normal Excel Workbook (.xlsx); you first have to convert it to a .csv file. The data you will use today is already in a .csv file, so you will be able to load it directly. However, if you decide to use JASP in your own time and need to create a .csv file, here is a useful link that explains what it is and how to create one.

To start any data analysis, we need to load some data to use. At the top of the "welcome" window, there are two tabs: File and Common. Click on the File tab and you will see the window in Figure 2. Here you will have your recently opened files on the right if you have used JASP before, but yours should be blank the first time you use it. To open a file, click on Computer > Browse, then select the data file from wherever you saved it on your computer. After it has loaded, the "welcome" window will look slightly different, as your data will be visible on the left side of the window, like in Figure 3.
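As an aside, if you ever need to do the .xlsx-to-.csv conversion outside of Excel itself, a couple of lines of Python will do it. This is just a minimal sketch; the file names are placeholders for your own files.

```python
# Convert an Excel workbook to a .csv file that JASP can load.
# File names here are hypothetical; substitute your own.
import pandas as pd

df = pd.read_excel("my-data.xlsx")     # reading .xlsx requires the openpyxl package
df.to_csv("my-data.csv", index=False)  # write without pandas' row index column
```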

1.3.3 Changing data types

The next step is to make sure JASP knows what type of data your variables are (e.g. nominal, ordinal). Unlike SPSS, JASP does its best to guess the data type for you. The symbols at the top of the columns in Figure 3 look like three circles for nominal data, three lines like a podium for ordinal data, or a ruler for scale data. Most of the time JASP gets it right and the columns have the right symbol. However, sometimes it is not quite right, and if you click on the symbol you can change the column from one data type to another. Another difference between JASP and SPSS is how you define nominal factor levels. In SPSS, you might remember that to define a factor such as nationality, you need to assign each level a value and a label. For example, you could list the first participant as German and label them with a 1, and the second participant as Chinese and assign them a 2. Every German participant would then be identified by the number 1, and every Chinese participant by the number 2. However, in JASP all you need to do is list the labels themselves (no values) and make sure the variable type is nominal (three little circles). An important thing to note is that all the labels need to be exactly the same to be considered the same condition. For example, German could not be spelled as german or GERMAN, or JASP would think these are three different conditions. It has to be written exactly the same: capitals, spaces, and everything.

Figure 2: JASP file tab

2 Today’s session

The session is organised like the process you would normally go through when you perform data analysis. You want to get a feel for the data through descriptive statistics and some plots to visualise it. The next step is to make sure the type of analysis you want to perform is appropriate for the data you have, so we will look at data screening. Finally, we will go ahead and look at the inferential statistics. The data for all of the examples are from real published research and were made available on the Open Stats Lab (McIntyre 2016). All the analyses you are going to do are the same as those performed in the original research.

3 Example One: T-Test

3.1 Study background

The first example that we are going to look at is from a study by Schroeder and Epley (2015). The aim of the study was to investigate whether delivering a short speech to a potential employer would be more effective at landing you a job than writing the speech down and having the employer read it. Thirty-nine professional recruiters were randomly assigned to receive a job application speech as either a transcript for them to read, or an audio recording for them to listen to. The recruiters then rated the applicants on intellect, their impression of the application, and whether they would recommend hiring the candidate. All ratings were on a Likert scale ranging from 0 (low intellect, impression, etc.) to 10 (high intellect, impression, etc.).

3.1.1 Task

1. What would your predictions be? As they are exactly the same words, do you think the applicants would be rated similarly, or do you think the audio recordings would result in higher ratings due to additional indicators of intelligence? It is important to think about this first, so write your prediction down briefly.

3.2 Descriptive statistics

3.2.1 Loading the data

Firstly, we need to open the data file for this example. Look back at section 1.3 on how to open a .csv file and open Schroeder-Epley-data.csv from the folder you downloaded at the start of the session. Your window should now look like Figure 3.

Figure 3: JASP window with data

The next thing to do is to make sure JASP has correctly labeled each column. The variables we are interested in for the first example are Condition, Intellect_Rating, Impression_Rating, and Hire_Rating. Condition is our independent variable and indicates whether the participant was provided with a transcript (value 0) or an audio recording (value 1). It should be labeled as nominal data and have the three little circles explained in section 1.3.3. The other three variables are our dependent variables and each should be labeled as scale (a little ruler). Intellect_Rating and Impression_Rating are both identified correctly. However, Hire_Rating may have been labeled as a nominal variable and needs changing to scale. Click on the three circles and change it to a ruler.

3.2.2 Getting a first look at the data

From the window in Figure 3, click on the Descriptives tab (below the File tab) > Descriptive Statistics to find the new window in Figure 4. From here, we can take a look at the data by ticking the "display boxplots" box and dragging all three of our dependent variables into the white box to the right of the full list of variables. This will fill the table on the far right of the screen with the data for the three dependent variables and provide you with three boxplots. However, this only provides you with the descriptive statistics for the whole sample. This is not very informative, as we had two independent groups: one for those provided with the transcripts, and one for those provided with the audio recordings. We can split the boxplots up into our two groups by dragging the Condition variable to the "split by (optional)" box. This divides each of the ratings into our two groups to provide the plots shown in Figure 5. Taking a look at each boxplot, we can see that when the participants were provided with audio recordings, they gave higher ratings of intellect, impression, and recommendation than those provided with just transcripts. Take a look and think about whether this is consistent with the predictions you made in section 3.1. Somewhat annoyingly, you do not get the descriptive statistics in the table divided by Condition (we will return to this in section 3.3.2). However, given that JASP is free and still in development, we will cut them a bit of slack; this is apparently to be included in the next version release, along with data editing capabilities. Follow them on Twitter (@JASPStats) or Facebook (JASPStats) to stay up to date with their version releases.
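If you ever want to reproduce these split boxplots outside of JASP, the pandas library can do something very similar in a few lines. This is only a rough sketch of the same idea, using the file and column names from this example:

```python
# Boxplots of the three ratings, split by Condition (0 = transcript, 1 = audio),
# roughly mirroring JASP's "split by" option.
import pandas as pd
import matplotlib.pyplot as plt

data = pd.read_csv("Schroeder-Epley-data.csv")
data.boxplot(column=["Intellect_Rating", "Impression_Rating", "Hire_Rating"],
             by="Condition")
plt.show()
```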

3.3 Data screening

From the boxplots, we can see that the participants who listened to the audio recordings provided higher ratings than those who just read the transcript. This is only a visual inspection, and we would ideally like to perform some inferential tests. However, before we go rushing to click some buttons, we need to make sure the data are suitable for parametric tests. We want to compare two independent groups (remember what the independent variable was), so we want there to be little influence from outliers, we want to assume normality, and we want the variances in both groups to be approximately equal.

Figure 4: JASP window for descriptive statistics

Figure 5: JASP window for boxplots and descriptive statistics

3.3.1 Assessing outliers

The first thing we can do is look back at the boxplots we created in section 3.2.2. These boxplots present the median as the central line, the first and third quartiles as the edges of the box, and lines extending 1.5 times the interquartile range above and below the box. Values outside of these lines can be considered outliers. In this example, Intellect_Rating has a few outliers above and below, but they do not seem to be particularly extreme (you could follow this up more formally by assessing the scores after they have been converted to standardised values, but you cannot do this in JASP; you would have to do it in SPSS).
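As JASP cannot standardise the scores for you, here is a minimal sketch of how you could do it in Python instead of SPSS. Scores beyond roughly ±3 standard deviations are commonly flagged as potential outliers, although that cut-off is a convention rather than a rule:

```python
# Convert Intellect_Rating to standardised (z) values and flag extreme scores.
import pandas as pd

data = pd.read_csv("Schroeder-Epley-data.csv")
ratings = data["Intellect_Rating"]

z = (ratings - ratings.mean()) / ratings.std()  # z = (x - mean) / SD
print(data[z.abs() > 3])                        # rows with |z| > 3, if any
```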

3.3.2 Assessing normal distribution and homogeneity of variance

The next thing to do is to check whether the data appear to have been sampled from a normal distribution, and whether the variances are approximately equal. This involves clicking on T-Tests > Independent Samples T-Test to open a new menu below the list of variables and a blank analysis table on the right of the screen. Note that one of the nice features of JASP is that you do not have to go back through the menus and click a range of options. The new analysis just appears below the old one (which is now greyed out but still visible if you scroll up). We can drag all of the dependent variables into the dependent variable box (again, a user-friendly design, in contrast to SPSS, where DVs are not simply called DVs) and drag Condition into Grouping Variable. We now have a range of options to click, and the temptation to start looking at T-Tests is almost irresistible. All we have to do is look at a few more tables and then we are ready to go. On the menu below the list of variables, click on Normality and Equality of Variances under the Assumption Checks heading, and also click Descriptives under Additional Statistics to get the descriptive statistics grouped by condition that we wanted in section 3.2.2. You should now see something that looks like Figure 6. Another useful design feature (yes, another) in JASP is that the tables are designed in APA style, so you can easily copy and paste them, providing the variables have appropriate names.

Figure 6: JASP window for normality and equal variances

First, we will look at the Shapiro-Wilk test, which assesses the assumption of normality. The idea behind this is that the assumption of normality is the null hypothesis of the test. Therefore, if you get a p value that is smaller than .05, the data do not come from a normal distribution and the assumption of normality is violated. This test, and a similar one called the Kolmogorov-Smirnov test, can also be found in SPSS. However, although the Shapiro-Wilk test is generally considered the better of the two, there are issues with using both, and assessing normality visually, such as with a Q-Q plot (not available in JASP but can be produced in SPSS), is recommended (if you are interested in learning more, consider reading this). As you selected all three DVs, the Shapiro-Wilk table reports the test for each one, divided by condition as we have independent groups. As we can see in each row, none of the tests are significant, so we can assume the assumption of normality has not been violated.

Secondly, we will look at Levene's test for the assumption of equal variances (homogeneity of variance). You should have come across this test previously; it uses a similar logic to the Shapiro-Wilk test. The null hypothesis is that the variances are equal between the groups, so a sufficiently large difference in variances between the groups will be classified as significant, as indicated by the p value of the test. This test is also heavily criticised, for reasons similar to those raised about the Shapiro-Wilk test above, so any conclusion you make should be in conjunction with a careful look at the data using plots (scroll up to the boxplots: are the boxes and lines roughly symmetrical around the median line?). However, Levene's test suggests that the assumption of equal variances has not been violated, and we are fine to continue.
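For comparison, both assumption checks are also available in Python's scipy library, and running them there should agree with JASP's table. A minimal sketch for one DV (the other two work the same way):

```python
# Shapiro-Wilk per group and Levene's test across groups for Intellect_Rating.
import pandas as pd
from scipy import stats

data = pd.read_csv("Schroeder-Epley-data.csv")
transcript = data[data["Condition"] == 0]["Intellect_Rating"]
audio = data[data["Condition"] == 1]["Intellect_Rating"]

# Shapiro-Wilk: p < .05 would suggest the normality assumption is violated
print(stats.shapiro(transcript))
print(stats.shapiro(audio))

# Levene's test: p < .05 would suggest the group variances are not equal
print(stats.levene(transcript, audio))
```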

3.4 Data analysis

This is the moment you have been waiting for. After all the visualising and checking, you finally want to look at some inferential statistics. We can stay on the analysis page similar to Figure 6, as most of the results are already here; we were just ignoring them temporarily, and we only require a few more options. In the menu section, Student should be selected under Tests by default, but we also want to select Welch. Under Additional Statistics, we also want to select Mean difference and Effect size. If you really want to tidy things up, you could always untick both of the boxes under Assumption Checks. Remember, JASP automatically updates, so you can select the information when and if you need it, in keeping with the minimalist approach. You should have a window that looks like Figure 7.

Figure 7: JASP window for the T-Test results

Looking at the Independent Samples T-Test table, we have all the information we want (and, in contrast to SPSS, only the information we need). We have both a Student and a Welch T-Test (Welch's T-Test should arguably be the default option; see appendix section 9.2 for more information). Remember what the boxplots and descriptive statistics showed us: participants who were provided with audio recordings gave higher ratings than those provided with transcripts. We can now support this using the T-Test results for intellect, impression, and hiring recommendation. Firstly, we will go through the intellect result, then you can do the others independently. In published articles and your practical reports, T-Tests should be reported in the standard format of: t(df) = t statistic, p value, effect size. For intellect, we would write the result for the Student T-Test up as t(37) = −3.53, p = .001, Cohen's d = −1.13. As we selected the mean difference between the two conditions, this simply tells us what the difference was between the means for intellect in our two conditions. It shows that those in the transcript condition rated the applicant 1.99 points on our scale lower, on average, than those in the audio recording condition. This makes sense in our example, but if another study were performed using a different scale, the mean differences between the studies could not be compared. This is where Cohen's d comes in. It is a standardised effect size that expresses the difference between two conditions in terms of standard deviations. In our example, those in the transcript condition rated the applicant 1.13 standard deviations lower, on average, than those in the audio recording condition. As this is a standardised unit, we would be able to compare it to other studies that used a different scale. To interpret this result, we can look at the guidelines Cohen (1977) originally suggested. He suggested results can be considered small (±0.2), medium (±0.5), and large (±0.8) effects. However, it is important that you compare your effects to those found in the published literature when you perform your own research; these guidelines should be a last resort. Following these rough guidelines, there appears to be a large effect between the two conditions. Putting this all together, we could conclude something like this: "Participants in the audio recording condition rated the applicant significantly higher on a rating of intellect than participants in the transcript condition, t(37) = −3.53, p = .001, Cohen's d = −1.13. This shows that there was a large effect between the two conditions." This is an example for the first DV, but there are two remaining. Now it is your turn to think about the results and complete a few tasks.
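Before moving on to the tasks, if you want to see where JASP's Cohen's d comes from, it can be computed by hand: the mean difference divided by the pooled standard deviation. A minimal sketch, assuming the file and column names used in this example:

```python
# Cohen's d for two independent groups: mean difference / pooled SD.
import pandas as pd

data = pd.read_csv("Schroeder-Epley-data.csv")
transcript = data[data["Condition"] == 0]["Intellect_Rating"]
audio = data[data["Condition"] == 1]["Intellect_Rating"]

n1, n2 = len(transcript), len(audio)
pooled_var = ((n1 - 1) * transcript.var() + (n2 - 1) * audio.var()) / (n1 + n2 - 2)
d = (transcript.mean() - audio.mean()) / pooled_var ** 0.5
print(round(d, 2))  # should land near the -1.13 reported above
```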

3.4.1 Tasks

1. How would you write up the result for the impression rating?

2. What would you write up for the hiring recommendation rating?

3. Write this whole example up like you would for a results section in your coursework. Think about the order we have done the analysis in. You want to include descriptive statistics, any data screening, and finally inferential statistics. Include any plots or tables, and remember effect sizes!

4 Example Two: T-Test

4.1 Study background

The next study we are going to look at is by Mehr, Song and Spelke (2016). They were interested in whether singing to infants conveys important information about social affiliation. Infants become familiar with melodies that are repeated in their specific culture. The authors were interested in whether a novel person (someone the infant had never seen before) could signal to the child that they are a member of the same social group, and attract their attention, by singing a familiar song to them. Mehr et al. (2016) invited 32 infants and their parents to participate in a repeated measures experiment. Firstly, the parents were asked to repeatedly sing a previously unfamiliar song to the infants for two weeks. When they returned to the lab, the researchers measured the baseline gaze (where they were looking) of the infants towards two unfamiliar people on a screen who were just silently smiling at them. This was measured as the proportion of time spent looking at the individual who would later sing the familiar song (0.5 would indicate half the time was spent looking at the familiar singer; values closer to one indicate looking at them for longer, etc.). The two silent people on the screen then took it in turns to sing a lullaby. One of the people sang the song that the infant's parents had been told to sing for the previous two weeks, and the other sang a song with the same lyrics and rhythm, but a different melody. During this period, the amount of time spent looking at each singer was measured (as total gaze time in ms, as opposed to a proportion). Mehr et al. (2016) then repeated the gaze procedure with the two people from the start of the experiment to provide a second measure of gaze as a proportion of time spent looking at the familiar singer.

4.1.1 Tasks

1. Identify the design, IV, and DVs in this experiment. You will need to know these for when you perform the analysis.

2. What would your predictions be? Do you think the infants will look longer at the person who sang the familiar song, or do you think there will be little difference?

4.2 Data analysis

The first thing we need to do is load a new data file. Go back to the folder you downloaded at the beginning of the session and open Mehr-study1-data.csv in JASP. Think about the process we went through in the first example to explore the data, and then think about how you are going to analyse it.

4.2.1 Tasks

1. Create boxplots for each DV and compare them.

2. Assess the parametric assumptions. We do not need to worry about homogeneity of variance in this example, as it is a within-subjects (repeated measures) design.

3. Analyse the data and write it all up in a short results section. Think about the design of this study: you will need to click on T-Tests > Paired Samples T-Test this time. Do not forget effect sizes!

5 Example Three: Correlation

5.1 Study background

Now that we have seen how to run T-Tests, the next test to perform is a simple correlation between two continuous variables. Correlations allow us to assess the degree of relationship between two variables. The first correlation example we are going to work with is from Beall, Hofer and Shaller (2016), who investigated how the outbreak of infectious diseases can influence voting behaviour. They were specifically interested in the emergence of the Ebola virus and whether it was associated with support for a more conservative candidate over a liberal candidate. There are two variables of interest that we are going to investigate: the frequency of Google searches for the Ebola virus, and political support for either a conservative or liberal candidate in the 2014 US Federal elections. The first variable is called 'Daily.Ebola.Search.Volume' and is the search volume for particular topics in a geographical region. The topic with the highest search volume on a particular day is scored 100, and all other topics are expressed as a percentage of that value. Therefore, the closer the value is to 100, the more people Googled the Ebola virus. The second variable is called 'Voter.Intention.Index'. This was calculated by subtracting the percentage of voters who intended to support a liberal candidate in the election from the percentage of voters who intended to support a conservative candidate. Therefore, positive values indicate greater support for conservative candidates, and negative values indicate greater support for liberal candidates.
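To make the voter intention index concrete, here is a toy calculation with made-up polling percentages (the numbers are purely illustrative, not from the dataset):

```python
# Voter intention index = conservative support (%) minus liberal support (%).
conservative_support = 48.0  # hypothetical polling figures
liberal_support = 45.0

voter_intention_index = conservative_support - liberal_support
print(voter_intention_index)  # 3.0: a slight lean towards conservative candidates
```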

5.1.1 Task

1. What will your prediction be? Do you think greater concern for infectious diseases will be associated with greater support for conservative or liberal candidates?

5.2 Descriptive statistics

Start by loading Beall-Hofer-Shaller-data.csv into JASP. We are again going to start by looking at the descriptive statistics. Enter both variables listed above into the empty white box seen in Figure 4 and used in the previous examples.

5.2.1 Tasks

1. What are the mean and SD for both variables?

2. Create boxplots for both variables to have an initial look at the data.

5.3 Data screening

We can now start to think about whether a parametric correlation is appropriate for our data. We want both variables to be measured on a continuous scale, we want the measurements to come in pairs, and we want there to be no outliers. We will think about each of these in turn. Both of the variables are measured on a continuous scale. Remember to check that JASP has correctly worked out what type of data each variable is by looking at the top of the columns in the data view screen, as seen in Figure 3. We want both variables to have a little ruler on top, which they both should have. Next, we want both variables to be in pairs. Here we start to have a small problem. If you looked closely when you opened the data, or when you were exploring the descriptive statistics, you might have seen that we have 65 rows of data for the Ebola search index, but only 24 rows for the voter intention index. This makes sense, as the data are split into days and the voter intention index is based on polling data. We do not have polling data for every day, so the correlation will be based on the rows where we have both a voter intention index and an Ebola Google search volume. This leaves us with 24 complete pairs of data to base the analysis on. Finally, we want there to be no outliers. Outliers are particularly problematic for correlations, as they can bias the correlation value. Look back at the boxplots you created during the tasks in section 5.2.1. Can you see any problematic outliers? The data look fine, so we are all set to go ahead and calculate some correlations.

5.4 Data analysis

We will be using a different analysis tab than the one we used for the T-Tests. Firstly, click on Regression > Correlation Matrix to open a new analysis window. The next thing we want to do is drag both of the variables into the empty white box, like we did for the descriptive statistics. This will fill the table with two numbers: one for Pearson's r (the correlation value) and one for the p value. We also want to tick the box for a correlation matrix plot to visualise the correlation. This will produce the screen you can see in Figure 8. One of the first things you might notice in comparison to SPSS is that you only get two numbers in the correlation table. SPSS gives you the full matrix of every combination of your variables (including each variable correlated with itself). JASP just provides you with what you need to know for two variables: what is the correlation coefficient, and what is the p value. Simple. We could write the result up like this: "The correlation between daily Ebola search volume and voter intention was small and not significant, Pearson's r(22) = .17, p = .430". We have the (22) after r as that is the degrees of freedom for the correlation. JASP does not provide it directly, but it is the number of observations in the analysis minus two (24 was the sample size, as we only have 24 matching pairs of data, so 24 − 2 = 22).
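The same analysis can be sketched in Python, which also makes the pairwise deletion from section 5.3 explicit: drop the days without polling data, correlate the 24 complete pairs, and the degrees of freedom fall out as 24 − 2 = 22. The result should match the r = .17, p = .430 reported above.

```python
# Pearson's correlation on the complete pairs only.
import pandas as pd
from scipy import stats

data = pd.read_csv("Beall-Hofer-Shaller-data.csv")
pairs = data[["Daily.Ebola.Search.Volume", "Voter.Intention.Index"]].dropna()

r, p = stats.pearsonr(pairs["Daily.Ebola.Search.Volume"],
                      pairs["Voter.Intention.Index"])
print(len(pairs) - 2, round(r, 2), round(p, 3))  # df, r, p
```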

5.4.1 Task

1. Write this example up like you would for a results section in your coursework. Remember to do it in a logical order, and you can copy and paste the scatter plot if you want (although do not do this for an assignment, as the axis labels are not very formal).

Figure 8: JASP window for the correlation results

6 Example Four: Correlation

6.1 Study background

The final study we are going to look at is by Dawtry, Sutton and Sibley (2015). They were interested in how people differ in their assessment of income inequality. Most people report wanting a fair society where there is little inequality. However, previous research indicated that wealthy people perceive society as already wealthy, and are less likely to support wealth being redistributed, as they are happy with the current state of affairs. Dawtry et al. (2015) investigated whether this was due to wealthier people having a wealthier social circle: in their eyes, the general population is wealthier than it really is, because everyone around them is also wealthy. They recruited 305 participants in an online survey to complete a series of questions about: their own wealth, the wealth of their social circle, inequality in their social circle, the average wealth of the population, and inequality in the population.

6.2 Data analysis

The first thing we need to do is open the data file for this example. Open Dawtry-Sutton-Sibley-data.csv in JASP. You can do this example on your own by following example three (Section 5). You will be looking at the correlations between each of the variables in the data file, so the output will be a bit more crowded than in the first correlation example.

6.2.1 Tasks

1. What are the mean and standard deviation for each of the variables?

2. Screen the data. Are there any outliers?

3. Look at the correlations between each of the variables. Are there any significant positive or negative correlations?

4. Write this up like you would for a coursework assignment.

7 Summary

That is the end of the workshop! Hopefully you have found it interesting, and you can now see that there are alternatives to SPSS. JASP is a lightweight alternative that helpfully provides you with information that you just do not get in SPSS. Although it is not perfect (it is still in development and not even at version 1.0 yet!), there are many advantages it can offer you as a psychology student. For example, it provides you with a simple output containing just the values you need, with the option to add more information as and when required. Think back to when you first came across an SPSS output: you are bombarded with information before you even know what to do with it. On the other hand, JASP is more intuitive, and hopefully a little easier to understand if you are still getting to grips with statistics in psychology.

Furthermore, one of the most important contributions of JASP is the ease with which you can produce effect sizes. Jacob Cohen (1990: 12), a famous statistician in psychology, said "the primary product of a research inquiry is one or more measures of effect size, not p values". Despite its popularity, SPSS provides only a handful of effect sizes, and you are required to calculate most of them separately. JASP allows you to simply tick a box to produce them, which saves you time and reduces the likelihood of making a mistake when calculating by hand.

However, SPSS is still useful at times, and there are things that JASP simply cannot do at the moment. Therefore, it is unlikely that you will use JASP alone to analyse your data. When analysing data myself, I often switch between several statistical packages to play to each of their strengths. Hopefully you will see that JASP can offer several benefits, and statistics might appear a little less frightening. If you have any further questions about any of the topics covered in this session, feel free to email me at [email protected].

8 Additional Resources

• A free online textbook by Craig Wendorf can be downloaded from the Open Science Framework. This has sections on SPSS and JASP, and provides some additional content that was not covered in this session.

• JASP has its own YouTube channel with a few tutorials on how to perform simple Bayesian analyses (read appendix 9.1 if you still have no idea what this is). However, if you search for JASP on YouTube, there are also a few user-made tutorials on importing data and on statistical designs you may be more familiar with, which you might want to look at for some guidance in your own time.

9 Appendix

9.1 Bayesian statistics

There is not really enough room here to provide a thorough overview but, to cut a long story short, Bayesian statistics is an entirely different approach to statistical inference. Whereas classical (or frequentist) statistics is based on controlling long-term error rates through p values, Bayesian statistics allows the probabilities of different hypotheses to be calculated based on the data available. One of the limitations of p values is that all the information a p value provides is that the observed data were unlikely assuming the null hypothesis is true: nothing more, nothing less. The aim is to control your long-term error rates, but it provides no information about how likely the null or alternative hypothesis is. However, a lot of published research and textbooks make out that it does, so it is useful to be aware of the concepts of Bayesian statistics. There are a few good introductory sources on the basics, such as the blog post by Dablander (2015), and an annotated reading list by Etz et al. (2016) that can be downloaded from the Open Science Framework. More and more published research is using Bayesian statistics, but it is currently not taught on many courses, so it would be worthwhile to read up on it yourself.
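To give a flavour of the underlying machinery, here is a toy application of Bayes' theorem with made-up numbers: starting from a 50/50 prior, observing data that are more likely under the hypothesis than under its alternative shifts belief towards the hypothesis.

```python
# Bayes' theorem: P(H | D) = P(D | H) * P(H) / P(D). All values are illustrative.
prior_h = 0.5         # P(H): both hypotheses equally likely beforehand
likelihood_h = 0.8    # P(D | H): probability of the data if H is true
likelihood_not_h = 0.3  # P(D | not H)

evidence = likelihood_h * prior_h + likelihood_not_h * (1 - prior_h)  # P(D)
posterior_h = likelihood_h * prior_h / evidence
print(round(posterior_h, 3))  # 0.727: the data shift belief towards H
```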

9.2 Student or Welch T-Test?

When you perform an independent samples T-Test, one of the parametric assumptions you test for is homogeneity of variance. The issues surrounding the use of Levene's test have already been mentioned in section 3.3.2, but a further issue is that the assumption of exactly equal variances will not be strictly true in most examples. Unequal variances can be problematic and lead to an increase in the Type I error rate (Lakens 2015). An alternative approach is to use Welch's T-Test by default, which corrects for unequal variances and sample sizes. It essentially applies a correction that decreases the degrees of freedom, and consequently the power of the test, in an attempt to ensure the error rate does not increase above 5%. For example, in section 3.4 you might have noticed that if you selected Welch's T-Test, the degrees of freedom changed from 37 to 33.43. It is generally recommended to use the Welch version by default, as you do not need to conduct Levene's test first (which saves you time), and it provides essentially the same result when there are equal variances and sample sizes in each group. For a more thorough explanation, read Lakens' (2015) blog post or, if you are interested, a more technical paper by Derrick et al. (2016).
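You can see the two tests side by side outside of JASP as well; in scipy, Welch's T-Test is just the equal_var=False variant of the independent samples T-Test. A minimal sketch on the Example One data:

```python
# Student vs Welch on the intellect ratings from Example One.
import pandas as pd
from scipy import stats

data = pd.read_csv("Schroeder-Epley-data.csv")
transcript = data[data["Condition"] == 0]["Intellect_Rating"]
audio = data[data["Condition"] == 1]["Intellect_Rating"]

print(stats.ttest_ind(transcript, audio))                   # Student (equal variances assumed)
print(stats.ttest_ind(transcript, audio, equal_var=False))  # Welch (corrected df)
```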
