
Quantitative approaches

Plan Lesson 3:
1. Introduction to quantitative sampling
2. Sampling error, sampling bias
3. Response rate
4. Types of probability samples
5. The size of the sample
6. Types of non-probability samples


1. Introduction to quantitative sampling

Sampling: Definition

Sampling = choosing the units (e.g. individuals, families, countries, texts, activities) to be investigated.

Sampling: quantitative and qualitative

"First, the term "sampling" is problematic for qualitative research, because it implies the purpose of "representing" the population sampled. Quantitative methods texts typically recognize only two main types of sampling: probability sampling (such as random sampling) and convenience sampling. (...) any strategy is seen as "convenience sampling" and is strongly discouraged. This view ignores the fact that, in qualitative research, the typical way of selecting settings and individuals is neither probability sampling nor convenience sampling. It falls into a third category, which I will call purposeful selection; other terms are purposeful sampling and criterion-based selection. This is a strategy in which particular settings, persons, or activities are selected deliberately in order to provide information that can't be gotten as well from other choices." (Maxwell, Joseph A., Qualitative ..., 2005, 88)

Population and Sample

[Figure: a population, from which a sample (= «miniature population») is drawn by sampling.]


Population, sample, representative sample, probability sample

Population = ensemble of units from which the sample is taken.

Sample = part of the population that is chosen for investigation. The choice may be random or not.

Sampling frame = list of all the units from which the choice is made.

Representative sample = sample that reflects the population in a reliable way: the sample is a «miniature population».

Probability sample = sample that has been randomly chosen. Therefore, every unit has a known probability of being chosen.

Representativeness: an empirical question

The representativeness of the sample cannot be guaranteed by following a given method. If we use the correct methods (random selection, stratification, etc.), we can only maximize the probability of producing a representative sample.

Whether the sample is really representative of the population is an empirical question (and should be tested).

For example: we would check whether the percentage of women in the sample differs significantly from the percentage in the population (if it does not ==> the sample is representative with respect to gender).
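The gender check described above could be sketched in R as follows. The counts and the population share are hypothetical values chosen for illustration, and `binom.test` is only one of several tests that could be used here.

```r
# Hypothetical figures: 1000 respondents, 520 of them women;
# assumed share of women in the population: 51%.
n_sample <- 1000
n_women <- 520
p_population <- 0.51

# Exact binomial test of H0: the sample share equals the population share
test <- binom.test(n_women, n_sample, p = p_population)
print(test$p.value)

# A p-value above 0.05 gives no evidence against representativeness
# with respect to gender (it does not prove it).
representative_gender <- test$p.value > 0.05
print(representative_gender)
```

The same check would have to be repeated for every characteristic of interest (age, region, etc.); a sample can be representative for one characteristic and not for another.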

2. Sampling error, sampling bias

Errors: different types

1. Sampling error: due to chance and the size of the sample
2. Sampling bias: not due to chance or the size of the sample, e.g. non-response linked to the specific theme of the research
3. Data collection error: e.g. bad question wording; bad interviewing
4. Data processing error: e.g. wrong coding
5. Data analysis error: e.g. wrong [...]; erroneous data analysis
6. Data interpretation error: e.g. wrong interpretation of results

Sampling error, sampling bias

Sampling error = differences between the sample and the population that are due to the sampling (the randomness). Sampling error can be diminished by increasing the size of the sample.

Sampling bias = differences between the sample and the population that are not due to the sampling (the randomness); sampling bias does not diminish with increased sample size.

Sampling error/bias: example (I) and (II)

[Figure: four panels, each showing a population of N = 200 made up of smokers and non-smokers, with P(s) = 0.5. The first panel shows the population only. The other three show a sample of N = 32 drawn from it: no error/bias (p(s) = 0.5), a bit of error/bias (p(s) = 0.47), a lot of error/bias (p(s) = 0.33).]

Sampling error: decreases with increasing sample size

Experiment with a coin: what is the probability of throwing «heads»?

P «in reality» = 0.5

We do 5 tries each with N = 1, 2, 5, 20. With growing N, p approaches P.

N = 1 -> p = 0, 1, 0, 1, 1
N = 2 -> p = 0, 0.5, 0.5, 1, 0
N = 5 -> p = 0.6, 0.2, 0.4, 0.8, 0.1
N = 20 -> p = 0.4, 0.35, 0.45, 0.35, 0.55

Possible reasons for sampling bias

• The sampling frame does not include all the elements of the population (example: telephone directory)
• The choice is not really random (example: open the telephone directory at a random page and choose the next 600 names)
• Certain groups of respondents have a higher (or lower) response rate (example: the very poor, the very rich, the very active, people with an active interest in the question, people critical of surveys)
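The coin experiment can be repeated on a computer. The following is a minimal sketch in R; the seed and the object names are arbitrary choices, and the simulated values will differ from those listed above because each run is random.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

P <- 0.5                  # true probability of heads
sizes <- c(1, 2, 5, 20)   # number of tosses per try
tries <- 5                # tries per sample size

# One row per sample size, one column per try;
# each cell is the observed share of heads p in that try
p_hat <- t(sapply(sizes, function(n) {
  replicate(tries, mean(rbinom(n, size = 1, prob = P)))
}))
rownames(p_hat) <- paste0("N=", sizes)
print(p_hat)
```

With N = 1 the observed p can only be 0 or 1; with N = 20 the values typically lie much closer to P = 0.5.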

Sampling error vs. sampling bias: Citation

"Sampling error is random. Every time you select an individual, a text, a situation, or any "unit of observation," that unit of observation will be different from the population of such units. Hence you always have an error (we hope a small one) in generalizing to the population of units. Unlike sampling error, "sampling bias" is systematic (nonrandom). For example, if for a focus group study you "randomly" select one of every five students who happen to be in the library on a Friday afternoon, you might have a biased sample that does not represent the views of "average" college students. Unlike sampling error, increasing the size of the sample does not decrease the degree of bias in your sample. Obviously, the results of a biased sample cannot be considered to be representative of the population (i.e., the findings have low transferability or external validity)." (Tashakkori / Teddlie, Mixed Methodology. Combining Qualitative and Quantitative ..., pp. 72-73)
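The difference between the two kinds of deviation can be made visible with a small simulation. This is a sketch under invented assumptions: a hypothetical population of 100,000 students, of whom 20% are library visitors, with an opinion that is more common among library visitors (60%) than among the others (22.5%). None of these numbers come from the cited text.

```r
set.seed(42)  # arbitrary seed

# Hypothetical population: library visitors hold the opinion more often
N <- 100000
in_library <- runif(N) < 0.2
opinion_a <- ifelse(in_library, runif(N) < 0.6, runif(N) < 0.225)
P <- mean(opinion_a)  # share in the whole population (about 0.30)

# Average absolute deviation of the sample share from P over many samples
mean_abs_dev <- function(pool, n, reps = 200) {
  mean(replicate(reps, abs(mean(sample(pool, n)) - P)))
}

sizes <- c(50, 500, 5000)
# Random sampling from the whole population: only sampling error
random_dev <- sapply(sizes, function(n) mean_abs_dev(opinion_a, n))
# Sampling only among library visitors: sampling bias
biased_dev <- sapply(sizes, function(n) mean_abs_dev(opinion_a[in_library], n))
print(rbind(sizes, random_dev, biased_dev))
```

In the random samples the deviation shrinks as the sample grows; in the library-only samples it stays at roughly the same level however many students are drawn.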

3. Response rate

Response rate

Response rate = percentage of individuals of the sample who have responded.

$$\text{Response rate} = \frac{N_{\text{returned interviews}} - N_{\text{returned interviews, not usable}}}{\text{Sample} - N_{\text{individuals who were not able to answer or could not be reached}}}$$

Example:

$$\frac{652 - 8}{1212 - 66} = 0.56$$

Response rate: example RLS

Table 1: Response rate and number of interviews used in this study

                                                          N        %
Gross sample                                           4800
Sample-neutral losses                                  1712
  of which stage 1                                     1291
           stage 2                                      141
           stage 3                                      280
Net sample (gross sample - sample-neutral losses)      3088   100.0%
Refusals                                               1424    46.1%
  of which stage 1                                     1062
           stage 2                                      183
           stage 3                                      179
Completed interviews (= net response rate)             1664    53.9%
  of which German-speaking Switzerland                 1054
           (of which Canton of Zurich)                  330
           French-speaking Switzerland                  409
           Italian-speaking Switzerland                 201
Adherents of non-Christian religions                     28
Interviews used in this study                          1636
  of which Canton of Zurich                             325

Response rate: example Christliches Zeugnis

• The actual return was better than expected. Of 942 people contacted, 469 responded to the first letter (49.8%); after a reminder, a further 125 people (13.3%) sent in valid questionnaires. The overall response rate thus amounts to about 63% (594 people).
• This is after deducting the invalid responses and the respondents who could no longer be found, were ill, or had died.
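The response-rate formula and the three examples above can be checked with a few lines of R. The function name `response_rate` and its argument names are illustrative, and treating the RLS and Christliches Zeugnis figures as having zero unusable returns is an assumption made to fit them into the same formula.

```r
# Response rate = usable returns / (sample - units unable to answer or not reached)
response_rate <- function(returned, unusable, sample_size, unreachable) {
  (returned - unusable) / (sample_size - unreachable)
}

# Generic example: (652 - 8) / (1212 - 66)
rr_example <- response_rate(652, 8, 1212, 66)
print(round(rr_example, 2))

# RLS: 1664 completed interviews, gross sample 4800, 1712 sample-neutral losses
rr_rls <- response_rate(1664, 0, 4800, 1712)
print(round(100 * rr_rls, 1))

# Christliches Zeugnis: 469 + 125 valid questionnaires out of 942 valid addresses
rr_cz <- response_rate(469 + 125, 0, 942, 0)
print(round(100 * rr_cz, 1))
```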


4. Types of probability sample

Types of probability sample

4.1 Simple random sample
4.2 Systematic random sample
4.3 Stratified random sampling
4.4 Multi-stage cluster sampling

4.1 Simple random sample

Simple random sample = randomly choose a predetermined number of units from the population (sampling frame).

1. Decide what population to use
2. Choose the sampling frame
3. Decide the sample size
4. Use random numbers (e.g. with the help of a computer) to choose the units
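Step 4 of the simple random sample (using random numbers to choose the units) might look like this in R. The sampling frame of 5000 unit IDs and the sample size of 100 are hypothetical.

```r
set.seed(123)  # arbitrary seed, only for reproducibility

# Hypothetical sampling frame: a list of 5000 unit IDs
sampling_frame <- sprintf("unit_%04d", 1:5000)

# Predetermined sample size
n <- 100

# Draw n units without replacement; every unit has the same chance n / 5000
srs <- sample(sampling_frame, size = n, replace = FALSE)
print(head(srs))
```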

4.2 Systematic random sample

Systematic sample = choose randomly/systematically a predetermined number of units from the population (sampling frame).

1. Decide what population to use
2. Choose the sampling frame
3. Decide the sample size
4. Begin with a random number between 1 and i; then choose every i-th unit in the sampling frame, where i = population / sample

Systematic random sample: Christliches Zeugnis (I)

The aim was to conduct a study representative of evangelicalism in German-speaking Switzerland. A written survey was chosen as the method. In a next step, a suitable address file of all evangelicals had to be found in order to draw the representative sample. No such file exists, and it is difficult, indeed almost impossible, to construct a meaningful sample oneself. (...) In searching for a way out of this difficulty, we came across Campus für Christus, an evangelical organization.


Systematic random sample: Christliches Zeugnis (II)

It publishes a magazine, the "Christliches Zeugnis", which is fairly widely distributed within evangelicalism and has a circulation of about 20,000. One can hope that the address file of the recipients of this magazine gives an undistorted picture of evangelicalism in German-speaking Switzerland. The random sample was drawn as follows: the first address was chosen at random by a number between 1 and 20; from there, the further addresses were picked out in steps of 20. 942 addresses proved to be valid.

Systematic random sample: Study on islamophobia

The data used for this study stem from closed-question face-to-face interviews, each taking 45 to 60 minutes. The population consisted of inhabitants of the city of Zurich in the age range 18 to 65 with Swiss nationality. The survey was conducted between October 1994 and March 1995 by the Sociological Institute of the University of Zurich. The people were chosen randomly from the official files of the state (Einwohnerkontrolle). In all, 1,138 interviews were conducted. The response rate was 72%. The survey can be regarded as representative of the Swiss population of the city of Zurich (Stolz, 2000, 226).
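The drawing procedure of the Christliches Zeugnis study (random start between 1 and 20, then steps of 20) can be sketched as follows. The exact file size of 20,000 and the target of 1000 addresses are assumptions based on the approximate circulation mentioned in the text.

```r
set.seed(7)  # arbitrary seed

N <- 20000               # assumed size of the address file
n <- 1000                # assumed target sample size
k <- N %/% n             # sampling interval: population / sample = 20

start <- sample(1:k, 1)  # random start between 1 and k
selected <- seq(from = start, to = N, by = k)  # every k-th address

print(start)
print(length(selected))
```

Only the starting point is random; once it is fixed, all other addresses are determined. This works well as long as the order of the file is unrelated to the characteristics under study.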

4.3 Stratified random sampling

Stratified random sampling: create strata in your sampling frame corresponding to central cleavages in your population. Inside every stratum, randomly choose a predetermined number of units.

Stratified random sampling: example (1)

We know that in our population of 7,000,000 we have 72% German speakers, 20% French speakers and 8% Italian speakers. Our sample size is 1000. We therefore decide to choose randomly:

from the population of German speakers: 720
from the population of French speakers: 200
from the population of Italian speakers: 80

-> With respect to language, our sample is perfectly representative.
-> If we had drawn a simple random sample, sampling error might have produced, for example, a sample with: German: 742, French: 195, Italian: 63
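The language example can be sketched in R. To keep the example light, the hypothetical sampling frame is scaled down to 70,000 units with the same language shares (72% / 20% / 8%); all object names are illustrative.

```r
set.seed(2024)  # arbitrary seed

# Hypothetical frame with the language shares of the example
frame <- data.frame(
  id = 1:70000,
  language = rep(c("German", "French", "Italian"), times = c(50400, 14000, 5600))
)

# Predetermined number of units per stratum
allocation <- c(German = 720, French = 200, Italian = 80)

# Draw a simple random sample of the allocated size inside each stratum
picked <- unlist(lapply(names(allocation), function(lang) {
  ids <- frame$id[frame$language == lang]
  sample(ids, allocation[[lang]])
}))
stratified <- frame[frame$id %in% picked, ]
print(table(stratified$language))

# For comparison: a simple random sample leaves the language split to chance
srs <- frame[sample(nrow(frame), 1000), ]
print(table(srs$language))
```

The stratified sample reproduces the language shares exactly by construction, while the counts in the simple random sample vary from draw to draw.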


Stratified random sampling: example (2)

In the NCS-CH study, we stratified for religious tradition. Furthermore, we oversampled smaller religious traditions.

4.4 Multi-stage cluster sampling

Multi-stage cluster sampling = first, groups of units (clusters) are chosen randomly; then units are chosen randomly within these groups.

-> Often cheaper

Multi-stage cluster sampling: study on evangelicals (Milieu) (I)

• Our data stem from two representative surveys, one conducted in 1999 covering the whole population of Switzerland, and a second survey from 2003 among the members of the evangelical free churches in Switzerland. The first data set (1999) was produced by conducting 1,562 computer-aided telephone interviews (CATI), based on a random sample of the inhabitants of Switzerland within the age range of 16 to 75. The response rate was 54%.

Multi-stage cluster sampling: study on evangelicals (Milieu) (II)

Some 1,850 questionnaires were given out and 1,100 were returned, giving a response rate of 59.4%. The response rate was 57.9% (N = 359) for the charismatic group, 54.6% (N = 377) for the moderate group and 66.9% (N = 361) for the fundamentalist group. For a mail survey, these response rates can be seen as very satisfactory. The data were collected between June 2003 and September 2003. This sample can be said to be representative of the members of evangelical free churches in Switzerland. For a number of analyses we aggregated the data sets from 1999 and 2003. One of the central features of the design of our study on evangelical free churches was to include a large number of questions that had already been used in the 1999 survey of the Swiss population, in order to be able to compare the evangelical milieu to the "societal environment".

Multi-stage cluster sampling: study on evangelicals (Milieu) (III)

The second data set (2003) was produced by a mail survey of 1,100 evangelicals from evangelical free churches in Switzerland, based on a stratified cluster sample. Cluster sampling was carried out by randomly choosing evangelical free churches from a list and then randomly selecting members from these churches. Stratification was achieved by dividing the sample into three groups: charismatic, moderate and fundamentalist. Since the fundamentalist group in our population only amounts to about 11%, the fundamentalist stratum was overrepresented in the sample, in order to be able to make a better comparison between the three groups.
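The two-stage logic (first churches, then members within churches) can be illustrated with a short R sketch. The list of 200 churches, their membership sizes, and the numbers of churches and members drawn are all invented; the stratification into three groups used in the actual study is left out for brevity.

```r
set.seed(99)  # arbitrary seed

# Hypothetical list of 200 churches with made-up membership sizes
churches <- data.frame(
  church = 1:200,
  members = sample(40:300, 200, replace = TRUE)
)

# Stage 1: randomly choose clusters (churches) from the list
chosen <- sample(churches$church, 20)

# Stage 2: randomly choose members within each chosen church
per_church <- 15
cluster_sample <- do.call(rbind, lapply(chosen, function(ch) {
  size <- churches$members[churches$church == ch]
  data.frame(church = ch, member = sample(seq_len(size), per_church))
}))

print(nrow(cluster_sample))
print(head(cluster_sample))
```

Fieldwork is concentrated in 20 churches instead of being spread over all 200, which is why this design is often cheaper.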

5. The size of the sample

Size matters!

The larger the sample, the better you fare! With larger samples,
- your estimates of the parameters gain in precision (confidence intervals get smaller)
- the differences you find become significant more easily
- you will be able to make analyses at a more detailed level (comparing various subgroups etc.)

Size: absolute and relative

It is not the relative but the absolute size that matters.
-> A random sample of 1000 has the same «value» whether the population is Switzerland or China
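One way to see why the absolute size is what counts is to compute the standard error of a proportion with the finite population correction for populations of very different sizes. This is a sketch; the population figures (8 million, 1.4 billion) are rough assumptions, and p = 0.5 is used as the least favourable case.

```r
# Standard error of a sample proportion with finite population correction
se_prop <- function(p, n, N) {
  sqrt(p * (1 - p) / n) * sqrt((N - n) / (N - 1))
}

n <- 1000
p <- 0.5

# Rough, assumed population sizes
se_switzerland <- se_prop(p, n, N = 8e6)
se_china <- se_prop(p, n, N = 1.4e9)
print(c(switzerland = se_switzerland, china = se_china))

# The relative size only matters when the sample is a large share
# of the population, e.g. 1000 out of 2000
se_small_pop <- se_prop(p, n, N = 2000)
print(se_small_pop)
```

For both large populations the standard error is practically identical, because the correction factor is almost exactly 1.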


Example: increasing the sample size n decreases the confidence interval

What is the true mean in the population?

Mean in the sample (n = 105): 4.8
standard deviation (sample) = 1.2
standard error (mean) = 1.2/√105 = 0.117
confidence interval: true mean = 4.8 ± 1.96 * 0.117
-> between 4.571 and 5.029

Mean in the sample (n = 1000): 4.8
standard deviation (sample) = 1.2
standard error (mean) = 1.2/√1000 = 0.0379
confidence interval: true mean = 4.8 ± 1.96 * 0.0379
-> between 4.726 and 4.874

Formula

Arithmetic mean:
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

Variance:
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

Standard deviation:
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

Standard error:
$$s_{\bar{x}} = \frac{s}{\sqrt{n}}$$

95% confidence interval:
$$\bar{x} \pm z_{0.025}\, s_{\bar{x}}, \qquad z_{0.025} = 1.96$$
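The two confidence intervals can be recomputed directly from the standard-error formula. The helper `ci_mean` is an illustrative name; it uses `qnorm(0.975)` (about 1.96) as the z value, so the last digit can differ slightly from a hand calculation with rounded intermediate values.

```r
# 95% confidence interval for a mean from sample mean, sample sd and n
ci_mean <- function(xbar, s, n, z = qnorm(0.975)) {
  se <- s / sqrt(n)  # standard error of the mean
  c(se = se, lower = xbar - z * se, upper = xbar + z * se)
}

ci_105 <- ci_mean(4.8, 1.2, 105)
ci_1000 <- ci_mean(4.8, 1.2, 1000)

print(round(ci_105, 4))
print(round(ci_1000, 4))
```

Going from n = 105 to n = 1000 shrinks the standard error by a factor of √(1000/105), roughly 3, and the interval narrows accordingly.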

Factors influencing the size of the sample

Costs: from about n = 1000 onwards, the gains in precision from enlarging the sample diminish

Non-response: a certain percentage of individuals will refuse to participate; we therefore have to start out with a larger sample

Heterogeneity: If the population is very heterogeneous, we need a larger sample.

Type of analysis: If we want to analyze the relationship between many variables at the same time (multivariate analysis), we have to have a larger sample (e.g. sex * age * political preference)


The example of the dwarfs

In this example, we imagine an infinite population of dwarfs. We would like to know their mean height and the variance of their height in the population.

The question: how many dwarfs do we have to draw randomly from the population in order to measure them and then estimate the population height and variance?

In the following simulation we draw 30 samples for each of several N's (N = 5, 10, 15, 20, ..., 100).

The «real» mean in the population is 10 cm. The «real» variance in the population is 4 (standard deviation = 2).

The simulation shows that for samples smaller than N = 40, the estimates of the mean and variance are very unreliable.

Sampling error: decreases with growing N

Simulation with R

    # Empty plot: sample size on the x-axis, estimated variance on the y-axis
    plot(c(0, 100), c(0, 15), type = "n",
         xlab = "Sample size", ylab = "Variance", cex.lab = 1.2)
    for (n in seq(5, 100, 5)) {           # sample sizes 5, 10, ..., 100
      for (i in 1:30) {                   # 30 samples per sample size
        x <- rnorm(n, mean = 10, sd = 2)  # draw n dwarfs from the population
        points(n, var(x))                 # plot the sample variance
      }
    }

How the estimate of variance becomes more reliable with growing N

[Figure: sample variances from the simulation plotted against sample size. Variance in population: 4]

Weighting, change in sample size and their effect on standard errors: example NCS-CH

6. Types of non-probability samples

6.1 Convenience sampling
6.2 Snowball sampling
6.3 Quota sampling

6.1 Convenience sampling

Convenience sampling = we choose the people who are most easily available / approachable.

Problem: we do not know for what population these people are representative / whom they stand for.

Convenience sampling: example

"We placed in the mailboxes of the teaching staff (which exist at most universities) the questionnaire, a note explaining the content of our research, and an envelope with our address so that they could send us the duly completed questionnaire. Most Parisian universities, as well as a good number of the most important research centres, are included in our survey. We distributed questionnaires at Paris I, Paris II, Paris III, Paris V, Paris VI, Paris VII, Dauphine, Paris X-Nanterre, Paris VIII, the Institut de Sciences Politiques, the Maison des Sciences de l'Homme, and the Ecole Normale Supérieure. 271 teachers sent us their responses to the questionnaire. However, the 271 responses do not constitute a representative sample that would allow us to describe the general characteristics of the population of teachers. For example, it does not allow us to determine the percentage of individuals who are attracted to left-wing positions. The sample is therefore constructed only to provide a test and not to describe the population of Parisian teachers." (Magniberton/Rios, 2003)

6.2 Snowball sampling

Snowball sampling = we ask the first participants for addresses of other individuals who have the same characteristics. Every participant is again asked for still other participants.

Problem: no representativeness

Snowball sampling: example

"I conducted fifty interviews with marijuana users. I had been a professional dance musician for some years when I conducted this study and my first interviews were with people I had met in the music business. I asked them to put me in contact with other users who would be willing to discuss their experiences with me... Although in the end half of the fifty interviews were conducted with musicians, the other half covered a wide range of people, including laborers, machinists, and people in the professions" (Becker 1963: 45-6)

6.3 Quota sampling

Quota sampling = starting with a knowledge of the population (e.g. 50% women, 20% between 18 and 30, etc.), we decide how many individuals in certain groups (quotas) the sample should contain. Example: we need 3 elderly women living in a rural area in the canton of Appenzell Innerrhoden. Now, the interviewers have the responsibility of finding individuals with these characteristics.

Problems:
- Not really representative; bias because of the choice and the networks of the interviewers
- We cannot calculate the standard errors. Generalizing from the sample to the population is not permitted.

Advantages:
- faster
- cheaper

Often used in
