Questionnaire Validity and Reliability
Department of Social and Preventive Medicine, Faculty of Medicine

Outline
• Introduction
• What is validity and reliability?
• Types of validity and reliability
• How do you measure them?
• Types of sampling methods
• Sample size calculation
• G*Power (power analysis)

Research
• The systematic investigation into and study of materials and sources in order to establish facts and reach new conclusions.
• In the broadest sense of the word, research includes gathering data in order to generate information and establish facts for the advancement of knowledge.

The research process
● Step 1: Define the research problem
● Step 2: Develop a research plan and research design
● Step 3: Define the variables and the instrument (validity and reliability)
● Step 4: Sampling and data collection
● Step 5: Analyse the data
● Step 6: Present the findings

A questionnaire is
• A technique for collecting data in which a respondent provides answers to a series of questions.
• The vehicle used to pose the questions that the researcher wants respondents to answer.
• The validity of the results depends on the quality of these instruments.
• Good questionnaires are difficult to construct; bad questionnaires are difficult to analyze.

Identify the goal of your questionnaire
• What kind of information do you want to gather with your questionnaire?
• What is your main objective?
• Is a questionnaire the best way to go about collecting this information?

How to obtain valid information
• Ask purposeful questions
• Ask concrete questions
• Use time periods based on the importance of the questions
• Use conventional language
• Use complete sentences
• Avoid abbreviations
• Use shorter questions

Validity and reliability
• Validity: How well does the measure or design do what it purports to do?
• Reliability: How consistent or stable is the instrument? Is the instrument dependable?

Which device can measure body temperature?

Reading   Device A (°C)   Device B (°C)
1         37.2            37.2
2         37.2            39.3
3         37.1            34.4
4         37.2            28.2
5         37.2            43.3
6         37.2            37.0
7         37.2            39.0

Device A returns an almost identical reading every time (reliable); Device B fluctuates widely (unreliable).

Types of validity and reliability
• Validity
  – Logical: face validity, content validity
  – Statistical: construct validity (convergent, divergent/discriminant) and criterion validity (concurrent, predictive)
• Reliability: consistency, objectivity

The major decisions in questionnaire design
1. Content – What should be asked?
2. Wording – How should each question be phrased?
3. Sequence – In what order should the questions be presented?
4. Layout – What layout will best serve the research objectives?

Face validity
– Face validity is the extent to which a test is subjectively viewed as covering the concept it purports to measure.
– As a check on face validity, test/survey items are sent to experts to obtain suggestions for modification.
– Because of its vagueness and subjectivity, psychometricians abandoned this concept long ago.

Content validity
– In psychometrics, content validity (also known as logical validity) refers to the extent to which a measure represents all facets of a given construct.
– Face validity vs. content validity:
  • Face validity can be established by one person.
  • Content validity should be checked by a panel, and thus usually goes hand in hand with inter-rater reliability (kappa!).

The content validity index (CVI)
Content validity has been defined as follows:
• (1) "...the degree to which an instrument has an appropriate sample of items for the construct being measured" (Polit & Beck, 2004, p. 423);
• (2) "...whether or not the items sampled for inclusion on the tool adequately represent the domain of content addressed by the instrument" (Waltz, Strickland, & Lenz, 2005, p. 155);
• (3) "...the extent to which an instrument adequately samples the research domain of interest when attempting to measure phenomena" (Wynd, Schmidt, & Schaefer, 2003, p. 509).

Two types of CVIs
• I-CVI: content validity of individual items.
• S-CVI: content validity of the overall scale.
  – S-CVI/UA: the proportion of items on the scale that achieve a relevance rating of 3 or 4 by all the experts (universal agreement).
  – S-CVI/Ave: the average of the I-CVIs.
• Researchers use I-CVI information to guide them in revising, deleting, or substituting items.
• I-CVIs tend only to be reported in methodological studies that focus on descriptions of the content validation process; the CVI most often reported in scale development studies is the S-CVI.

Expert rating form. Each expert rates every item from 1 to 4 on four criteria, with space for comments on each item:
• Relevance: Are the items relevant to the concepts related to the dissertation topic?
• Clarity: Is each question clear in terms of wording?
• Consistency: Are the items in the instrument consistent?
• Representativeness: Are the items representative of the concepts related to the dissertation topic?

Item   Relevance   Clarity   Consistency   Representativeness
Q1     1 2 3 4     1 2 3 4   1 2 3 4       1 2 3 4
Q2     1 2 3 4     1 2 3 4   1 2 3 4       1 2 3 4
Q3     1 2 3 4     1 2 3 4   1 2 3 4       1 2 3 4
Q4     1 2 3 4     1 2 3 4   1 2 3 4       1 2 3 4
Q5     1 2 3 4     1 2 3 4   1 2 3 4       1 2 3 4

Ratings: 1 = not relevant, 2 = somewhat relevant, 3 = quite relevant, 4 = highly relevant.
I-CVI, item-level content validity index; S-CVI, content validity index for the scale.

Acceptable standard for the S-CVI: a minimum S-CVI of .80 is recommended.

Content validity index
• The I-CVI expresses the proportion of agreement on the relevancy of each item.
• If the I-CVI is higher than 79%, the item is appropriate; if it is between 70% and 79%, it needs revision; if it is less than 70%, it is eliminated.

Content validity ratio (CVR)

CVR = (ne - N/2) / (N/2)

In this formula:
• ne = the number of specialists who have chosen the "Necessary" option
• N = the total number of assessors.

The kappa statistic is a consensus index of inter-rater agreement that adjusts for chance agreement; it is an important supplement to the CVI because kappa provides information about the degree of agreement beyond chance.

Evaluation criteria for kappa:
• above 0.74 = excellent
• 0.60 to 0.74 = good
• 0.40 to 0.59 = fair

To calculate the modified kappa statistic, the probability of chance agreement (PC) is first calculated for each item by the following formula:

PC = [N! / (A! (N - A)!)] * 0.5^N

where
• N = the number of experts in the panel and
• A = the number of panelists who agree that the item is relevant.

After calculating the I-CVI for all instrument items, kappa is computed by entering the probability of chance agreement (PC) and the content validity index of each item (I-CVI) into the following formula:

K = (I-CVI - PC) / (1 - PC)
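All of these indices are simple proportions and adjustments, so they are easy to compute directly. Below is a minimal Python sketch using a hypothetical five-expert panel; following the scale above, a rating of 3 or 4 counts as "relevant", and for the CVR line a relevant rating is treated as a "Necessary" choice (an assumption made here for illustration; item names and ratings are invented):

from math import comb

# Hypothetical panel: 5 experts rate each item 1-4; 3 or 4 = "relevant".
ratings = {
    "Q1": [4, 4, 3, 4, 4],
    "Q2": [3, 4, 4, 2, 4],
    "Q3": [2, 3, 1, 2, 3],
}
N = 5  # number of experts

def modified_kappa(icvi, n_experts, n_agree):
    # PC = [N! / (A! (N - A)!)] * 0.5**N ; K = (I-CVI - PC) / (1 - PC)
    pc = comb(n_experts, n_agree) * 0.5 ** n_experts
    return (icvi - pc) / (1 - pc)

icvis = {}
for item, rs in ratings.items():
    agree = sum(r >= 3 for r in rs)      # experts judging the item relevant
    icvi = agree / N                     # I-CVI: proportion of agreement
    icvis[item] = icvi
    verdict = ("appropriate" if icvi > 0.79
               else "needs revision" if icvi >= 0.70
               else "eliminate")
    cvr = (agree - N / 2) / (N / 2)      # CVR = (ne - N/2) / (N/2)
    k = modified_kappa(icvi, N, agree)
    print(f"{item}: I-CVI={icvi:.2f} ({verdict})  kappa={k:.2f}  CVR={cvr:+.2f}")

# Scale-level indices.
s_cvi_ave = sum(icvis.values()) / len(icvis)                   # average of the I-CVIs
s_cvi_ua = sum(v == 1.0 for v in icvis.values()) / len(icvis)  # all-expert agreement
print(f"S-CVI/Ave={s_cvi_ave:.2f}  S-CVI/UA={s_cvi_ua:.2f}  (recommended minimum .80)")

In this toy data, Q1 is rated relevant by all five experts (I-CVI = 1.00), Q2 needs no change (I-CVI = 0.80), and Q3 would be eliminated (I-CVI = 0.40).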
Criterion validity
• This type of validity is used to measure the ability of an instrument to predict future outcomes.
• Validity is usually determined by comparing two instruments' ability to predict a similar outcome, with a single variable being measured.
• There are two major types of criterion validity:
  – predictive
  – concurrent

A test has high criterion validity if:
– it correlates highly with some external benchmark (concurrent);
– it correlates well with outcome criteria (predictive).

Concurrent validity
• Concurrent validity is used when two instruments are used to measure the same event at the same time.
• The new instrument should correlate highly with the external benchmark.
• Example: DASS → measuring depression; new instrument → measuring depression. Both are administered at the same time and their scores are correlated.

Predictive validity
• Predictive validity is used when the instrument is administered, time is allowed to pass, and the instrument is then measured against another outcome.
• Regression analysis can be applied to establish criterion validity: an independent variable is used as the predictor variable, and the dependent variable is the criterion variable.
• The correlation coefficient between them is called the validity coefficient.

How is criterion validity measured?
• The correlation coefficient tells the degree to which the instrument is valid based on the measured criterion.
• The symbol "r" denotes the correlation coefficient.
• A high positive "r" value shows a strong positive relationship between the instruments; a negative "r" value indicates an inverse relationship.
• As a rule of thumb, for the absolute value of r:
  – 0.00-0.19: very weak
  – 0.20-0.39: weak
  – 0.40-0.59: moderate
  – 0.60-0.79: strong
  – 0.80-1.00: very strong
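As a minimal sketch of a concurrent-validity check in Python, using scipy.stats.pearsonr and the DASS example from the slide above (all scores are hypothetical):

from scipy.stats import pearsonr

# Same respondents, same time: benchmark vs. new instrument.
dass = [12, 25, 7, 31, 18, 22, 9, 27]   # established benchmark (DASS)
new  = [14, 23, 9, 29, 20, 24, 8, 30]   # new depression instrument

r, p = pearsonr(dass, new)

bands = [(0.80, "very strong"), (0.60, "strong"), (0.40, "moderate"),
         (0.20, "weak"), (0.00, "very weak")]
strength = next(label for cutoff, label in bands if abs(r) >= cutoff)
print(f"validity coefficient r = {r:.2f} ({strength}), p = {p:.4f}")

A validity coefficient in the 0.80-1.00 band would support concurrent validity of the new instrument against the benchmark.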
Construct validity
• Measuring things that are in our theory of a domain.
• The construct is sometimes called a latent variable:
  – you can't directly observe the construct;
  – you can only measure its surface manifestations.
• Because it is concerned with abstract, theoretical constructs, construct validity is also known as theoretical validity.

What are latent variables?
• Most, if not all, variables in the social world are not directly observable; this makes them "latent" or hypothetical constructs.
• We measure latent variables with observable indicators, e.g. questionnaire items.
• We can think of the variance of an observable indicator as being partially caused by:
  – the latent construct in question
  – other factors (error)

Example: items tapping the latent construct "math anxiety":
• I cringe when I have to go to math class.
• I am uneasy about going to the board in math class.
• I am afraid to ask questions in math class.
• I am always worried about being called on in math class.
• I understand math now, but I worry that it's going to get really difficult soon.

Specifying formative versus reflective
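In a reflective specification, the latent construct causes the indicators, which is exactly the variance decomposition described above. A minimal simulation sketch of that idea in Python follows; the construct name comes from the example above, while the loading, error level, and sample size are arbitrary illustrative choices:

import numpy as np

rng = np.random.default_rng(0)
n = 1000
anxiety = rng.normal(0, 1, n)            # unobserved latent construct

loading, noise_sd = 0.8, 0.6
items = np.column_stack([
    loading * anxiety + rng.normal(0, noise_sd, n)  # indicator = construct + error
    for _ in range(5)                               # five reflective indicators
])

# Each indicator's variance splits into a construct part and an error part:
# loading**2 * 1.0 + noise_sd**2 = 0.64 + 0.36 = 1.0 (approximately).
print("observed item variances:", items.var(axis=0).round(2))
print("inter-item correlations (the construct is their common cause):")
print(np.corrcoef(items, rowvar=False).round(2))

Because the construct is the common cause of all five items, the simulated items correlate around 0.64 with each other even though the construct itself is never observed.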