
A Light Introduction to Statistics

Pavol Jancura

ESCMID Workshop

8-10-2012

Sources & references

Articles and books
- P Driscoll et al. An introduction to statistics. J Accid Emerg Med 2000;17:4-6 and 18:1-4. [a]
- H Motulsky. Intuitive Biostatistics, 2010 (2nd edition).
- R A Donnelly. The Complete Idiot's Guide to Statistics, 2007 (2nd edition). [b]
- B Illowsky, S Dean. Collaborative Statistics, freely available. [c]

[a] First chapter at http://emj.bmj.com/content/17/3/205.2.full
[b] Errata at http://www.stat-guide.com
[c] At http://cnx.org/content/col10522/latest/ or http://cnx.org/content/col10522/1.40/pdf

Web
- http://en.wikipedia.org/wiki/Portal:Statistics
- http://en.wikibooks.org/wiki/Statistics
- http://en.wikiversity.org/wiki/Statistics
- http://www.graphpad.com/guides/prism/6/statistics/
- http://www.sjsu.edu/faculty/gerstman/StatPrimer/
- http://www.statsoft.com/textbook/
- http://stattrek.com/tutorials/statistics-tutorial.aspx

1 Primer
2 Hypothesis: research hypothesis, statistical hypothesis, hypothesis test
3 Prior to testing: Student's t-distribution, 95% CI, test validity
4 Comparative tests: one sample, two samples
5 Correlation and regression: correlation, linear regression, non-linear regression


Primer

Data collection

When we design an experiment, then:
- the population represents all possible subjects (individuals) relevant for the experiment (e.g. people).
- a sample represents a set of subjects selected from the entire population (e.g. a group of people).
- data are the collection of measurements (values) taken from/on the sample.
- a variable is one type of measurement taken for all or a subset of subjects within the sample (age, gender, ...).
- an observation is a set of measurements (values of some variables) for a single subject of our sample; e.g. when you visit a physician, he makes an observation on you (temperature, heart beat, blood pressure, ...).
- an outlier is an (outlying) observation that is numerically distant (deviates markedly) from the rest of the data.

Outlier

[Figure: an outlier in a data set]

We have a collection of measurements (data) on a set of subjects (sample) selected from all possible subjects (population).
Sampling is the process of selecting subjects from a population for investigation:
- a proper sample is critical to the accuracy of the statistical analysis; it should be representative of the population from which it was taken.
- Sampling bias: a sample is collected in such a way that some members of the intended population are less likely to be included than others ⇒ a biased sample.
- Sampling error: the sample measurement differs from the population measurement. It is the result of selecting a (biased) sample that is not a perfect match to the entire population.

Displaying data I

Scatter plot

a type of mathematical diagram using Cartesian coordinates to display values for two variables for a set of data.

Histogram

a graphical representation showing a visual impression of the distribution of data.

an estimate of the distribution of a continuous variable.

Distribution

Probability distribution

[Figures: probability density function (PDF) and cumulative distribution function (CDF)]

A probability distribution assigns a probability to each of the possible outcomes of a random experiment (on your sample or population).

Summarizing data

How do we summarize our data into a few numbers?

Summary statistics: centrality (location) and spread (dispersion, variation).

Centrality measures

Measures of centrality describe the centre point of a data set with a single value.
- Median: the (middle) value in the data set for which half the observations are higher and half the observations are lower; when there is an even number of data points, the median is half of the sum of the two centre points.
- Mode: the most frequent value in the data set.
When the data on a sample are considered, we talk about the sample median or mode. When the data on the whole population are considered, we talk about the population median or mode.

Let N be the number of all possible subjects (the size of the population) and n be the number of selected subjects (n ≤ N) for an experiment (the size of the sample). Let x1, x2, ... be numeric values of one variable X (e.g. salary) measured on the subjects (people) of a population.
Arithmetic mean:

sample mean
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
population mean
$$\mu = \frac{x_1 + x_2 + \cdots + x_N}{N} = \frac{1}{N}\sum_{i=1}^{N} x_i$$
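As a concrete illustration, a minimal Python sketch of the sample mean (assuming NumPy is available; the salary values are made up):

```python
import numpy as np

# Hypothetical sample of n = 6 salaries (illustrative values only)
x = np.array([1800, 2100, 2400, 2000, 3200, 2500])

n = len(x)
sample_mean = x.sum() / n   # (x1 + x2 + ... + xn) / n
print(sample_mean)          # identical to np.mean(x)
```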


Geometric mean:
sample
$$\sqrt[n]{x_1 x_2 \cdots x_n} = \left(\prod_{i=1}^{n} x_i\right)^{1/n}$$
population
$$\sqrt[N]{x_1 x_2 \cdots x_N} = \left(\prod_{i=1}^{N} x_i\right)^{1/N}$$
Mostly, you will always work with a sample from a population (a group of mice, a group of people, ...).


Geometric mean vs arithmetic mean
- Geometric mean ≤ arithmetic mean.
- The geometric mean changes with changes in the proportions among the values even when their overall sum does not change.
- The geometric mean works only with positive numbers (> 0).
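A minimal sketch of these properties in Python (assuming NumPy and SciPy are available; the counts are invented for illustration):

```python
import numpy as np
from scipy import stats

# Hypothetical counts spanning several orders of magnitude (illustrative)
x = np.array([10, 100, 1000, 10000])

am = np.mean(x)       # arithmetic mean: 2777.5
gm = stats.gmean(x)   # geometric mean: (10*100*1000*10000)**(1/4) ~= 316.2

print(gm <= am)       # True: the geometric mean never exceeds the arithmetic mean
```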


Geometric mean vs arithmetic mean
- The arithmetic mean is good for representing data with no significant outliers.
- The arithmetic mean is used for numbers whose values are meant to be added together (to get a total gain).
- The geometric mean is used for numbers whose values are meant to be multiplied together (to get a total gain).
- The geometric mean is often used to evaluate data covering several orders of magnitude. If your data cover a narrow range, the geometric mean may not be appropriate.


Geometric mean vs arithmetic mean
- The geometric mean is used in cases where the differences among data points are logarithmic or exponential in nature (e.g. population growth, interest rates).
- The geometric mean is more appropriate than the arithmetic mean for describing proportional growth, both exponential growth (constant proportional growth) and varying growth. The geometric mean of growth over periods yields the equivalent constant growth rate that would yield the same final amount.
- Do not use the geometric mean on log-transformed data: the log of the geometric mean already equals the arithmetic mean of the log-transformed data, $\log\left(\prod_i x_i\right)^{1/n} = \frac{1}{n}\sum_i \log x_i$.

Spread measures (Dispersion)

Measures of dispersion describe how far the individual data values have strayed from the centre point (mean).
Let X = {x1, x2, ...} be sample or population data. Let N be the size of a population and n < N the size of a sample from it.
Range:
$$\text{Range} = \max\{X\} - \min\{X\}$$

Quartiles are three values (Q1, Q2, Q3) that divide the data set, after it has been arranged in ascending order, into four segments covering approximately 25% of the data each.

$$Q_2 = \text{median}\{X\},\quad Q_1 = \text{median}\{x_i \in (\min\{X\}, Q_2)\},\quad Q_3 = \text{median}\{x_i \in (Q_2, \max\{X\})\}$$

Then each of the intervals (min{X}, Q1), (Q1, Q2), (Q2, Q3) and (Q3, max{X}) covers 25% of the data.

The interquartile range (IQR) measures the spread of the centre half of the data (covers 50% of the data).

$$IQR = Q_3 - Q_1$$
The interquartile range is used to identify outliers, as outliers' accuracy may be questioned and they can cause unwanted distortions in statistical results. Outliers are identified as data points outside the interval
$$(Q_1 - 1.5 \times IQR,\; Q_3 + 1.5 \times IQR)$$
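The outlier rule translates directly into code; a sketch with made-up values (note that NumPy's percentile interpolation may differ slightly from the textbook quartile definition above):

```python
import numpy as np

# Hypothetical measurements with one suspicious value (illustrative)
x = np.array([4.1, 4.5, 4.7, 5.0, 5.2, 5.4, 5.9, 12.0])

q1, q3 = np.percentile(x, [25, 75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

outliers = x[(x < lower) | (x > upper)]
print(outliers)   # [12.] falls outside the interval
```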

Displaying data II

Box plot

Box-and-whisker diagram (plot):
- the box represents the data within the IQR.
- the whiskers usually correspond to the Range, ±1.5 × IQR, or other dispersion measures.

[Figure: identifying an outlier on a box plot]

Spread measures (Dispersion) II

Measures of dispersion, continued.
Variance is a measure of dispersion describing the relative (squared) distance between the data points and the mean of the data.
sample variance
$$s^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}$$
population variance

$$\sigma^2 = \frac{\sum_{i=1}^{N}(x_i - \mu)^2}{N}$$
Using squared distances (areas) gives certain mathematical advantages, e.g. the Pythagorean theorem: $a^2 + b^2 = c^2$. You may see variance as the mean area of dispersion.

The sample variance is an estimate of the population variance; dividing by n would likely underestimate the real population variance. Dividing by n − 1 gives a more conservative (larger) 'guess' of the population variance.
Standard deviation is the square root of the variance.
sample standard deviation
$$s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}}$$
population standard deviation
$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum_{i=1}^{N}(x_i - \mu)^2}{N}}$$
You may see standard deviation as the side length of the mean area of dispersion.
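The n vs n − 1 distinction maps onto NumPy's ddof argument; a minimal sketch with illustrative data:

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])  # illustrative data

s2 = np.var(x, ddof=1)       # sample variance, divides by n-1: ~4.57
sigma2 = np.var(x, ddof=0)   # population variance, divides by N: 4.0

s = np.std(x, ddof=1)        # sample standard deviation
sigma = np.std(x, ddof=0)    # population standard deviation
print(s2 > sigma2)           # True: the n-1 estimate is the larger, conservative one
```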

Normal (Gaussian) distribution

Normal distribution is a probability distribution having a bell shape symmetric around the mean. It is defined by two parameters: mean and standard deviation.

$$N(\mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
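The PDF and CDF of a normal distribution are available in SciPy; a small sketch (the standard normal parameters are an illustrative choice):

```python
from scipy import stats

mu, sigma = 0.0, 1.0                    # standard normal, illustrative choice
dist = stats.norm(loc=mu, scale=sigma)

print(dist.pdf(0.0))     # density at the mean: 1/sqrt(2*pi) ~= 0.3989
print(dist.cdf(1.96))    # P(X <= 1.96) ~= 0.975
print(dist.ppf(0.975))   # inverse CDF: ~1.96, the familiar 95% critical value
```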

[Figures: probability density function (PDF) and cumulative distribution function (CDF) of the normal distribution]


Mean, median and mode of a normal distribution are equal.

Log-normal distribution

The data follow a normal distribution after the log transformation (applying the log function to the data measurements).


Hypothesis

Hypothesis building

Research problem/questions
We usually formulate our research problem as a research question such that:
- everyone understands it: clarity.
- we can (dis)prove it: testability.

A hypothesis:
- is a testable statement we can accept (fail to reject) or reject (testability).
- generally follows the "If/then" format (clarity).
- failing to reject a hypothesis does not prove it true.

Questions ⇒ Hypotheses

Research questions
- Does cigarette smoking cause lung cancer?
- Does alcohol affect the reaction time of a person driving a car?
- What is the effect of temperature on seed germination?

Research hypotheses
- If you smoke cigarettes, then you are more likely to get lung cancer.
- If you drink alcohol, then your reaction time when driving a car should be affected (get worse).
- If you change (increase) the temperature, then seed germination should change (increase).

Hypothesis & variables

Our research question or hypothesis should contain two types of variables: independent (x) and dependent (y = f (x)) variables.

If (independent variable) then (dependent variable).
- If (cigarettes) then (lung cancer).
- If (alcohol) then (the reaction time).
- If (temperature) then (seed germination).

Statistical hypotheses

Having a research hypothesis, we can state statistical hypotheses:
Research hypothesis ⇒ statistical hypotheses: the null hypothesis and the alternative hypothesis.

Null hypothesis

Null hypothesis (H0):
- a general or default position, a baseline.
- usually a statement about what we should observe if our research hypothesis is wrong.
- there is no relationship between two measured phenomena, or a potential treatment has no effect.
- in other words, we observe no difference in or no effect on the dependent variable as the independent variable changes.
- e.g.: There is no (statistical) difference in the occurrence of lung cancer between smokers and non-smokers.

Alternative hypothesis

Alternative hypothesis (H1):
- usually the statement we expect to be true.
- the complement of the null hypothesis (H0).
- a change in the independent variable causes changes in the dependent variable.
- the alternative hypothesis H1 is accepted by rejecting the null hypothesis H0.
- its acceptance is statistical evidence to support (accept) the research hypothesis.
- e.g.: There is a (statistical) difference in the occurrence of lung cancer between smokers and non-smokers.

Statistical hypothesis testing

A test statistic (T):
- a mathematical function on the sample data that reduces the data to one or a small number of values used to perform a hypothesis test, e.g. correlation coefficients, or the t-statistic for the one-sample t-test, $T = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$.
- a "null value" T0 is the value of the test statistic T under the null hypothesis.
Then:

- values of T close to T0 present the strongest evidence in favour of the null hypothesis,
- values of T far from T0 present the strongest evidence against the null hypothesis.

A statistical hypothesis test answers the question: assuming that the null hypothesis is true, what is the probability (p-value) of obtaining by chance a test statistic T equal to or more extreme than the one computed from the data?

In other words, let T1 be the value of T computed from the data; then what is the probability, if H0 is true, that values of T lie as far from T0 as T1, or even further?
A p-value < 0.05 means that, given the null hypothesis, we have less than a 5% chance to observe the computed test statistic T1 or values of T more extreme (smaller or greater) than T1.

A test statistic T can be computed using an assumption about the general distribution of the data. The values of a test statistic T may follow a different distribution than the data themselves. The p-value is computed using the distribution of the values of T.

A significance level (α):
- a threshold for the p-value to reject the null hypothesis.
- α = 0.05 means: if the p-value < 0.05, then we reject the null hypothesis (and accept the alternative one), because if the null hypothesis were true, it would be very unlikely to observe the data of our experiment (in fewer than 5 out of 100 repeated trials).
- Notice: by saying two groups (measurements) are significantly different, we usually reject a (null) hypothesis that the measurements or characteristics of the groups are statistically the same/similar.

Types of statistical tests

Statistical tests differ in the test statistic T used and in the assumptions made about the sample (distribution):
- parametric vs non-parametric: assume vs do not assume a priori statistical properties of the population distribution.
- one-sided vs two-sided: consider extreme values in one tail vs in both tails of the distribution.

               | one-sided                   | two-sided
parametric     | less general / less general | less general / more general
non-parametric | more general / less general | more general / more general

The testing process
1 Formulate an initial research hypothesis.
2 State the relevant null and alternative hypotheses.
3 Choose an appropriate statistical test and significance level.
4 Calculate the probability of the observation under the null hypothesis (the p-value) using the chosen test.
5 Reject the null hypothesis if and only if the p-value is less than the significance level threshold.
A failure to reject (an acceptance of) a null hypothesis does not prove it true!

Errors in hypothesis testing

Type I, II and III errors

                  | H0 is true                    | H0 is false
Reject H0         | Type I error (false positive) | Correct (true positive)
Fail to reject H0 | Correct (true negative)       | Type II error (false negative)

Type I error - the null hypothesis (H0) is true, but is rejected.

Type II error - the null hypothesis (H0) is false, but it is not rejected.
Type III error - you get the right answer, but asked the wrong question!

The power of a test:
- the probability that the test will reject the null hypothesis when the null hypothesis is false.
- it depends on the sample size and the significance level α.
- if the power is less than 50%, the study is really not helpful.
- conventionally, a test with a power of 80% is considered good.
- ideally, your choice of acceptable power should depend on the consequence of making a Type II error.


Prior to testing

Sample and population

A sample gives us an estimate of the whole population:
- a sample does not ideally correspond to the population ⇒ there is an error between sample statistics and population statistics.
- the greater the sample, the better the estimate.
- the better the estimate, the better the confidence about the generality of findings for the population.
"Can we compute the error and the confidence of a sample with respect to the population?"

Standard error of the mean (SEM)

SEM:
- quantifies the precision of the mean.
- is a measure of how far your sample mean is likely to be from the true population mean.
- is expressed in the same units as the data.
- is computed as the sample standard deviation divided by the square root of the sample size (the error depends on the sample size):
$$SEM = \frac{s}{\sqrt{n}}$$

- gets smaller as samples get larger.
If we know the standard error of the mean, can we determine the spread (dispersion) of the mean error (confidence intervals)? The mean error spread describes what other values the sample mean could possibly take if we adjust it by (±) its standard error.
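A minimal sketch of the SEM computation (illustrative data; scipy.stats.sem performs the same calculation):

```python
import numpy as np
from scipy import stats

x = np.array([5.1, 4.9, 5.6, 4.7, 5.3, 5.0, 5.4])   # illustrative sample

sem_manual = np.std(x, ddof=1) / np.sqrt(len(x))    # s / sqrt(n)
sem_scipy = stats.sem(x)                            # the same computation
print(sem_manual, sem_scipy)
```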

Student's t-distribution

The length of the adjustment determines the range (spread) of values we can consider to be the true mean. To determine the spread, we need to know (assume) the distribution of the sample mean. Again, a sample is an estimate from the population:
- the sample distribution is an approximation of the population distribution.
- the sample distribution will be closer to the population distribution with increasing sample size.
- the sample distribution depends on the sample size.
- if the population distribution is a normal distribution, then the approximate distribution of the sample mean is called Student's t-distribution.

- bell-shaped and symmetrical around the mean.
- the shape of the curve depends on the sample size (degrees of freedom = n − 1).

Degrees of freedom are the number of values that are free to vary given that some information, such as the sample mean, is known.

Confidence Intervals

The sample mean comes with its standard error (SEM). Hence, it is not precise in terms of the population mean, and we can only talk about a certain confidence that the estimated mean value really represents the population mean. By confidence we mean: the population mean is the sample mean ± some error. The length of the tolerated error determines the confidence interval (CI) of a mean.
A paradox
The greater the error I tolerate, the more confidence I can have (it is more likely) that my error interval (CI) captures the true population mean ⇒ the greater the confidence you want to have, the greater the CI you get.

The confidence interval (CI) of a mean tells you how precisely you have determined the mean.
Definition
$$CI = (\bar{x} - t \times SEM,\ \bar{x} + t \times SEM)$$
- $\bar{x}$ is the sample mean.
- SEM is the standard error of the mean.
- t is a critical t-value taken from the t-distribution, if we assume the population to follow a normal distribution.
- as the t-distribution changes with the sample size, the confidence interval changes accordingly (the greater the sample, the narrower the CI).
Confidence intervals are usually computed by assuming a normal distribution of the population values.
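A sketch of the CI computation under the normality assumption (same illustrative sample as before):

```python
import numpy as np
from scipy import stats

x = np.array([5.1, 4.9, 5.6, 4.7, 5.3, 5.0, 5.4])   # illustrative sample
n, xbar, sem = len(x), np.mean(x), stats.sem(x)

t = stats.t.ppf(0.975, df=n - 1)      # critical t-value for a 95% CI, n-1 df
ci = (xbar - t * sem, xbar + t * sem)

# equivalently, in one call:
ci_again = stats.t.interval(0.95, df=n - 1, loc=xbar, scale=sem)
print(ci, ci_again)
```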


The 'confidence range' of a CI is determined by the t-value.
95% CI
A critical t-value is selected such that there is a 95% chance that the CI contains the true population mean. In other words, if you generate many 95% CIs from many samples, you can expect the CIs to include the true population mean in 95% of the cases.

Ref.: http://www.comfsm.fm/~dleeling/statistics/notes009.html

Parametric & non-parametric tests

The most basic difference between various tests is whether they make assumptions about the data:
- parametric tests assume a specific population distribution from which the data were drawn, usually a normal distribution, which is parametrized by mean and standard deviation.
- non-parametric tests do not assume any population distribution and use non-parametric statistics, such as the median or a ranking of the data (e.g. from least to greatest).
Confidence intervals are usually computed for parametric tests, as you need the assumption of a (normal) distribution.

Determining sample size

Proper sample size is very important to correctly prove your research hypothesis; it:
- affects the power of the statistical tests.
- affects standard errors and confidence intervals (if a normal distribution is assumed).
How to determine the sample size:
- it is a trade-off between sample size, power, and the effect size you can detect.
- it should be decided as part of the experimental design (before doing the experiment).
- ask the question: "if I use N subjects, what information can I learn?".
- use available software (statistical calculators) to determine the sample size before doing the experiment (e.g. StatMate); see the sketch below.
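The lecture points to StatMate; as a rough sketch of the same kind of calculation in Python, statsmodels can solve for the sample size of a two-sample t-test (the effect size of 0.5 is a made-up planning assumption, not a value from the lecture):

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.80)
print(n_per_group)   # ~64 subjects per group under these assumptions
```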


Important ethical consideration
It may happen that it is impossible to find out what you want to know with the number of subjects available to you. Then it is far better to cancel such an experiment in the planning stage than to waste time and money on an experiment that won't have sufficient power. If the experiment involves any clinical risk or expenditure of public money, performing such a study can even be considered unethical.

Testing for the normality of data

Parametric tests assume the population distribution to be known; in general, they assume a normal distribution. How can I know whether my data are normally distributed? We test the hypothesis: "If our data were sampled from a normal distribution, what is the (random) chance of the observed deviation of our data from the ideal normal distribution?"
- If P < 0.05, the data do not pass the normality test.
- If P > 0.05, the data do pass the normality test.

Normality tests in Prism

D'Agostino-Pearson omnibus test (recommended)
- you need at least 8 subjects (values).
Shapiro-Wilk test
- you need at least 7 subjects (values).
Kolmogorov-Smirnov test (obsolete)
- do not use it any more as a normality test; however, you may use it for other purposes.
Failing to reject the normality hypothesis does not prove the hypothesis! But it gives you more solid ground for choosing a parametric test.
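Outside Prism, the same tests exist in SciPy (normaltest is the D'Agostino-Pearson omnibus test); a sketch on deliberately non-normal, simulated data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.lognormal(size=50)            # deliberately non-normal, simulated data

stat_dp, p_dp = stats.normaltest(x)   # D'Agostino-Pearson (needs n >= 8)
stat_sw, p_sw = stats.shapiro(x)      # Shapiro-Wilk

print(p_dp, p_sw)   # small p-values: the data fail both normality tests
```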

Failure of normality tests

Do nothing
- if the data are approximately normally distributed (the deviation is not big), parametric tests should work anyway.
- mostly the case for large sample sizes.
Transform the data
- e.g. a log-transformation can make the data look normally distributed.
- run the test on the transformed data again.
Remove outliers
- the presence of outliers might cause the normality test to fail.
Use a non-parametric test
- do not, however, choose a non-parametric test solely based on a normality test!

Parametric or non-parametric?

- in the case of small sample sizes, normality tests don't have much power to properly assess deviations from the ideal normal distribution.
- in the case of large sample sizes, normality tests are too sensitive, and a small deviation from a normal distribution will cause them to fail.
⇓
You are likely to commit a Type I error.

If a normality test rejects the hypothesis of a normal distribution, do not panic and immediately choose a non-parametric test!

Proving your research hypothesis with non-parametric tests is always good (more general); however, they are stricter (lower power), so again you may commit a Type I error (rejection of a true hypothesis), especially with very small samples. Always first try the approaches above (transformation, outlier removal, ...).

⇓
You should choose your statistical test as part of the experimental design. Changing tests until you prove what you want is likely to be misleading and not very scientifically sound. [1]

[1] http://www.graphpad.com/guides/prism/6/statistics/index.htm?when_to_choose_a_nonparametric.htm

Comparative tests

One sample tests

You conducted an experiment with one sample and you would like to check whether the mean or median values are different from the hypothetical (expected) mean or median (known from previous experience or scientific judgement).

Research question: Is the mean (or median) the same as expected by theory (or by the null hypothesis)? Is the observed difference due to chance, or are the values really (statistically) different from the hypothetical value?

You know beforehand the hypothetical value of the population mean or population median; it is the input to a one-sample test. Statistical hypotheses:

H0: There is no significant difference between the sample mean (median) and the population mean (median).

H1: There is a significant difference between the sample mean (median) and the population mean (median).

One-sample t test (parametric):
- assesses the significance of the mean difference.
- assumes the data are drawn from a normal distribution.
Wilcoxon signed rank test (non-parametric):
- assesses the significance of the median difference.
- assumes a symmetric distribution around the median.
If P is small (P < 0.05 or 0.01), then the difference between the means (medians) is not due to coincidence. If P is large, there is no statistical evidence that the population mean (median) differs from the hypothetical mean (median).
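A sketch of both tests in SciPy (the sample and the hypothetical mean are invented for illustration):

```python
import numpy as np
from scipy import stats

x = np.array([5.1, 4.9, 5.6, 4.7, 5.3, 5.0, 5.4])   # illustrative sample
mu0 = 5.0                     # hypothetical population mean/median (test input)

t, p_t = stats.ttest_1samp(x, popmean=mu0)   # one-sample t test
w, p_w = stats.wilcoxon(x - mu0)             # Wilcoxon signed rank test
print(p_t, p_w)
```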

Two samples' tests

You conducted experiments on two samples and you would like to compare if there is a real difference between the two groups (in the mean or median values).

Research question: Is the observed difference between the means or medians of the measured variables in the two samples due to chance, or are the two samples really (statistically) different?

Hypothesis testing with two samples

General statistical hypotheses

H0: There is no difference between two samples.

H1: There is a difference between two samples.

Paired or unpaired data

When you have measurements on subjects in two different samples, the subjects from these samples may create pairs (or they are the same). Then we talk about paired data.

For example, you select one group of people and perform two experiments on this group. Each experiment represents one sample of data, and naturally for each person you have two measurements (e.g. on different parts of the body, at different time points), one in each sample. Then the pairing is given by the name of the person.

If you cannot find a pairing between the measurements (of the same unit) in the two samples because it simply does not exist, the data are unpaired.

Pairing of data should be part of the experimental design. The selection of a paired or unpaired test is made before running an experiment. The feature that decides how the pairing is done is called the pairing variable (name, age, family relationship, nationality, ...).

Unpaired two samples' data

Unpaired t test (parametric):
- compares the means of two unmatched groups.
- assumes the data are drawn from a normal distribution.
- assumes the two populations have the same variance.
Mann-Whitney test (non-parametric):
- compares the distributions (medians) of two unmatched groups.
- an alternative is the Kolmogorov-Smirnov test; Mann-Whitney is usually preferred over K-S.
Prefer a two-tailed test if, before collecting data, you have no assumption about which sample will have the larger mean (median) (the same holds for paired tests).
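A sketch of both unpaired tests in SciPy (two invented groups; ttest_ind assumes equal variances by default, matching the slide's assumption):

```python
import numpy as np
from scipy import stats

a = np.array([5.1, 4.9, 5.6, 4.7, 5.3, 5.0])   # group A, illustrative
b = np.array([5.8, 6.0, 5.4, 6.2, 5.9, 5.7])   # group B, illustrative

t, p_t = stats.ttest_ind(a, b)        # unpaired t test (equal variances)
u, p_u = stats.mannwhitneyu(a, b)     # Mann-Whitney test
print(p_t, p_u)
```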

Paired two samples' data

Paired t test (parametric):
- compares the means of two matched groups.
- assumes the differences between the data of the two samples follow a normal distribution.
Wilcoxon matched pairs test (non-parametric):
- compares (the medians of) two matched groups.
- computes and ranks the differences between each set of pairs.
- assumes the distribution of the differences to be symmetrical.
Report the 95% CI for t tests: you have a 95% chance that it contains the true difference between the means (does it contain the value 0?). The 95% CI for the Wilcoxon test is meaningful if the assumption of a symmetric distribution of differences is valid.
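The paired versions differ only in that the two arrays must be aligned pair by pair; a sketch with invented before/after measurements:

```python
import numpy as np
from scipy import stats

before = np.array([5.1, 4.9, 5.6, 4.7, 5.3, 5.0])   # illustrative pairs
after = np.array([5.4, 5.0, 5.9, 4.9, 5.6, 5.3])    # same subjects, re-measured

t, p_t = stats.ttest_rel(before, after)    # paired t test
w, p_w = stats.wilcoxon(before, after)     # Wilcoxon matched pairs test
print(p_t, p_w)
```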

Simple contingency tables

Special case: the data are measured (understood) as a yes-no category, e.g. survived - not survived, cancer - no cancer, ...

How do we represent/collect such data? How do we compare such measurements between two groups?
Contingency tables summarize the results of two (or more) groups where the outcome is a yes-no category:
- the groups are usually also defined on the basis of a yes-no answer.

2x2 contingency table

A contingency table summarizes counts for every possible combination of outcomes.

           | Cancer | No cancer | Row total
Smoker     | a      | b         | a+b
Non-smoker | c      | d         | c+d
Col total  | a+c    | b+d       | a+b+c+d (= n)

We compare proportions between the groups. For example, is there a link between smoking and cancer? (Does smoking cause cancer?) ⇒ Is there a proportional difference in the occurrence of cancer between smokers and non-smokers?

Contingency table

The basic idea behind using contingency tables:

If the proportions of individuals in the different columns vary significantly between rows (or vice versa), we say that there is a contingency between the two variables; in other words, the two variables are dependent. If there is no contingency, we say that the two variables are independent.

Comparing proportions

The significance of the difference between two proportions can be assessed with a variety of statistical tests:
- contingency tables are used to examine the significance of the association (contingency) between the two kinds of classification.
- how the counts are used to compute p-values (and CIs) depends on the specific test.
- the specific interpretation of the p-values (statistical hypotheses) also depends on the specific test.

Statistical tests on 2x2 contingency tables

Fisher's exact / Chi-square test
- If there really is no association between the variable defining the rows and the variable defining the columns in the overall population, what is the probability (p-value) of observing an association in our contingency table as strong (or stronger) by random chance?
- For unmatched data; do not use for a case-control study where individual cases are matched with individual controls (given age, gender, ...).
- The chi-square test approximates the p-value of Fisher's exact test and is used for very large sample sizes.
- For a matched case-control study, use McNemar's test. It uses a table which, however, is not the same as a contingency table. [2]

[2] http://www.graphpad.com/guides/prism/6/statistics/index.htm?stat_how_to_mcnemars_test.htm
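A sketch of both tests on an invented 2x2 table (the counts are illustrative, not data from the lecture):

```python
import numpy as np
from scipy import stats

# Hypothetical 2x2 table [[a, b], [c, d]] (illustrative counts)
table = np.array([[20, 80],    # smokers:     cancer / no cancer
                  [10, 190]])  # non-smokers: cancer / no cancer

odds_ratio, p_fisher = stats.fisher_exact(table)            # Fisher's exact test
chi2, p_chi2, dof, expected = stats.chi2_contingency(table) # chi-square test
print(p_fisher, p_chi2)
```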

Correlation and regression

Correlation

Until now we were testing for differences between distributions or proportions (sample mean vs population mean, sample means or medians, ...). However, what if we are interested in testing a possible relationship (dependency) between the data?

- Is there a correlation (dependency) between the measured data?
- How strong is the dependency?
- Are the variables indeed (statistically) dependent?

Hypothesis testing with correlation

The strength of the correlation is assessed by a correlation coefficient (ρ). Statistical hypotheses

H0: There is no correlation (ρ = 0).

H1: There is a correlation (ρ ≠ 0).

Correlation does not tell us about the direction of the dependency (causality). That is, it does not answer the following question:

Is x dependent on y, or is y dependent on x? Mostly, you assume the direction from your experimental design.

Correlation coefficients

Pearson correlation coefficient:
- assesses the strength of a linear correlation (how well the relationship between two variables can be described with a line (a linear function)).
- is sensitive to outliers.
- is less general (only linear).
- varies between −1 (perfect negative (decreasing) correlation) and 1 (perfect positive (increasing) correlation).

[Figure: examples of the Pearson correlation coefficient]

Spearman's rank correlation coefficient:
- assesses the strength of a monotonic correlation (how well the relationship between two variables can be described using a monotonic function).
- is less sensitive to outliers.
- is more general than the Pearson correlation.
- varies between −1 (perfect negative (decreasing) correlation) and 1 (perfect positive (increasing) correlation).
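Both coefficients, with the p-value for H0: ρ = 0, are one-liners in SciPy; a sketch on simulated data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(size=30)                        # simulated, illustrative data
y = 2 * x + rng.normal(scale=0.5, size=30)     # roughly linear relationship

r, p_r = stats.pearsonr(x, y)        # linear correlation + p-value for rho = 0
rho, p_s = stats.spearmanr(x, y)     # monotonic (rank) correlation
print(r, rho)
```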

[Figure: examples of Spearman's rank correlation coefficient]

Sample size dependence of correlation coefficients

A correlation coefficient alone is not yet sufficient to answer our statistical hypothesis. The strength of the correlation depends on the range of values, which is usually limited by our sample size.

Assessing statistical significance of correlation

Basic idea
Assess the statistical difference between the sample correlation coefficient and a zero correlation coefficient, given our sample.
An approach
- the coefficient is transformed such that it would follow a normal (or Student's t) distribution with mean 0 under H0 (the no-correlation hypothesis).
- the p-value is the probability of observing, by chance, a correlation as far from 0 as the sample correlation, or further.
- the values of the 95% CI represent other possible values of the correlation, with 95% confidence of covering the true correlation.

An important note
- if P ≥ 0.05, then the 95% CI would contain the value 0 (no correlation).
- if P ≥ 0.01, then the 99% CI would contain the value 0 (no correlation).

Statistical significance of correlation given the sample size

The minimum value of Pearson's sample correlation coefficient that would be significant at the 0.05 level for a given sample size.

Failures of a correlation coefficient

A correlation coefficient may be misleading. It is just a one-number statistic (like the mean, median, deviation, ...), giving only one narrow view of the data. It should be only a part of a larger analysis.

Each example has the same Pearson correlation (ρ = 0.816).

Residuals

[Figures: residuals, linear regression, and non-linear regression]

E-max model
