Part 7: Glossary and References Overview

In this Part

This Part covers the following topics:

Section 1: Glossary of Terms Used in STEPS
Section 2: References

Section 1: Glossary of Terms Used in STEPS

Introduction

This section provides an alphabetical list of the terms used in STEPS surveillance, with definitions that are appropriate for STEPS.

Age-standardisation: A process of statistically adjusting rates from two or more populations with different age-sex structures to a third, hypothetical population with a fixed age-sex structure, in order to facilitate comparisons or understand differences between the populations.

Archive: A depository containing records or documents.

Average: See Mean.

Bias: Distortion of a population estimate away from the true value. Bias can arise for many reasons, such as measurement error or non-response.

Cluster: A (usually geographically defined) group of individuals.

Cluster sampling: A sampling method where the target population is divided into clusters/groups and a subset of the clusters is selected, rather than sampling from the entire population. Cluster sampling often uses enumeration areas as the primary clusters.

Confidence interval (CI): A confidence interval is a measure of the precision of the data of interest. All sample-based surveys lack some amount of precision due to non-sampling error and sampling error. To improve on point estimates, statisticians usually report an interval of values in which they believe the parameter is highly likely to lie. Usually the point estimate is the middle point of the interval, and the endpoints of the interval communicate the size of the error associated with the estimate and how "confident" we are that the population parameter is in the interval.

Cross-sectional design: A study design based on observations at a single point in time.
STEPS surveys will be cross-sectional unless they are specifically extended to follow the sample over time. See also Longitudinal study.

Database: A large amount of information stored in a form that is easily searched by a computer. STEPS uses Microsoft Access.

Dataset: An electronic file with columns representing the variables being stored and rows that contain individual participant data.

Demographic characteristics: The characteristics of a human population, for example age, sex, ethnicity and place of residence.

Distribution: The set of frequencies observed, or theoretical probabilities, in a set of events or values. Many estimates and tests used in statistics rely on assumptions about the data having a normal or other specific distribution.

Enumeration Area: A small to medium sized geographic area that has been defined in a census.

WHO STEPS Surveillance

EpiData: A software package designed to facilitate entry of survey data. Functions include immediate checking of ranges and legal values, and the ability to export data to a range of analysis packages. It is available for free.

Epi Info™: A statistical software package capable of complex analysis and available for free.

Estimate: A calculated guess of the true value of a population characteristic. If based on sample data, a confidence interval can be described which gives some indication of where the true value lies.

Estimation: Obtaining an approximation of a value for a population, based on sample data.

Exploratory data analysis (EDA): The process of looking at raw data to find its important features. Various tools are used depending on the type of data. Examples include simple frequency tables and histograms for categorical variables, and ranges, distributions, means and box plots for continuous variables. This is a fundamental and essential aspect of data analysis.

Head of household: This can differ across countries.
The definition used in most countries is the following: the person who has the decision-making power, which is not necessarily the person who earns the most.

Household composition: The age and sex of all the residents in the household who are within the age range of the survey.

Imputation: Methods for estimating values for questions where responses have been left out inadvertently, in order to provide a more complete dataset for analysis. Imputation is not done for STEPS.

Instrument: This refers to the STEPS Instrument, which includes a questionnaire (Step 1), physical measurements (Step 2), and biochemical measurements (Step 3).

Inter-quartile range: The difference between the upper and lower quartiles in a set of values.

Interval: A pair of numbers describing the largest and the smallest in a set of values or estimates. The most commonly used interval in STEPS is the 95% confidence interval of the population estimates.

Kish method: A sampling method for selecting an individual randomly from a household. It uses a pre-determined table to select an individual based on the number of individuals living in the household.

Mean: The arithmetic mean is the average of a set of values, that is, the sum of all the values divided by the number of values. Because of its simplicity and its statistical properties, it is used more than any of the other measures of central tendency (e.g. median).

Measurement device: A tool used for measurement purposes, for example a blood pressure monitor.

Median: The middle value in a distribution of values.

METs: A method for characterizing physical activities at different levels of effort, based on the standard of a metabolic equivalent (MET). This unit is used to estimate the amount of oxygen used by the body during physical activity.
For example, 1 MET = the energy (oxygen) used by the body as you sit quietly, perhaps while talking on the phone or reading a book.

Moderate intensity physical activity: Refers to activities which take moderate physical effort and that make you breathe somewhat harder than normal. Examples include cleaning, vacuuming, polishing, gardening, cycling at a regular pace or horse-riding. Moderate intensity activities require an energy expenditure of 3-6 METs.

Multi-stage probability sampling: A study design with multiple stages of sampling, where each stage of sampling is based on probability. Multi-stage survey designs may save sampling cost and avoid having to compile exhaustive lists of every person in the population.

Non-probability sampling: Methods of sampling a population in which the probability of selection of each individual is not known, and from which reliable population estimates therefore cannot be calculated. See Sample design.

Non-response: In a sample survey, the failure, for any reason, to obtain information from a designated participant.

Non-response bias: Also known as coverage bias, the error introduced by non-response.

Outlier: An observation so far removed from others in the set that it may influence the results considerably, and which should therefore be examined carefully before being accepted.

Participant: An individual who responds to the STEPS Instrument. Participant is preferred to respondent.

Pilot survey: A small trial run or "dress rehearsal" of the entire survey process, completed before the survey itself begins.

Population: The target population is the entire group of individuals of interest to the STEPS survey. The survey population is the group of individuals who have a chance of being selected for the survey, defined by upper and lower age limits, and perhaps also by residency or geographical location.
The survey population should ideally be the same as the target population, but may not be exactly the same in practice; for example, border-dwellers or social outcasts may be impossible to find and include in samples.

Post-stratification: A method of improving the accuracy of population estimates after a sample survey has been conducted. Where information obtained for the sample is already known for the whole population, the sample individuals may be stratified according to the data collected, and population estimates adjusted accordingly.

Precision: Refers to the likely spread of estimates made from a sample of data or from a statistical model. It is measured by the standard error of the estimator. The standard error can be decreased (and precision therefore increased) by increasing the number of observations.

Prevalence: "The number of instances of a given disease or other condition in a given population at a designated time. When used without qualification the term usually refers to the situation at a specified point in time (point prevalence)" [Last 1988]. Prevalence is similar to, and often analysed as, a probability, though multiplied by 100 and represented as a percentage.

Primary sampling unit (PSU): The sampling units for the first stage of sampling.

Probability: A number between 0 and 1 which represents how likely some event is to occur. A probability of 0 means an event will never occur, while a probability of 1 means the event will always occur. In STEPS, prevalence data is often derived from estimates of probability.

Probability proportional to size (PPS): A probability sample selection method where the sampling units are given a chance of selection according to their size. It is often used in multi-stage sampling, where each primary sampling unit is selected with PPS.

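
As a rough illustration of the PPS entry, the following Python sketch selects primary sampling units by systematic PPS sampling, which is one common way of carrying out PPS selection. The enumeration area names, the household counts and the function name pps_systematic are hypothetical and do not come from STEPS; this is a sketch under those assumptions, not the STEPS sampling procedure.

```python
import random


def pps_systematic(sizes, n, start=None, rng=None):
    """Pick n units with chance of selection proportional to size.

    sizes: dict mapping unit name -> size measure (e.g. number of households).
    start: optional fixed random start in [0, interval), for reproducibility.
    A unit larger than the sampling interval can be picked more than once.
    """
    total = sum(sizes.values())
    interval = total / n  # sampling interval on the cumulative size scale
    if start is None:
        rng = rng or random.Random()
        start = rng.random() * interval  # random start within the first interval
    targets = [start + k * interval for k in range(n)]

    selected = []
    cumulative = 0
    idx = 0
    for unit, size in sizes.items():
        cumulative += size
        # Select this unit once for every target that falls in its size range.
        while idx < n and targets[idx] < cumulative:
            selected.append(unit)
            idx += 1
    return selected


# Hypothetical enumeration areas with made-up household counts.
enumeration_areas = {"EA1": 1200, "EA2": 300, "EA3": 4500, "EA4": 800, "EA5": 2200}
print(pps_systematic(enumeration_areas, n=3, start=1000))
# prints ['EA1', 'EA3', 'EA5']
```

With a total of 9000 households and three units to select, the sampling interval is 3000, so a start of 1000 gives selection points at 1000, 4000 and 7000 on the cumulative scale. Larger enumeration areas cover more of that scale and are therefore more likely to contain a selection point.
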
Random sample: A sample of a population (or sub-population) that has the property that each individual has an equal or known chance of being selected, and in which the chance of one individual being selected does not alter or affect the selection of any other individual. Examples of random sampling include simple random sampling, cluster sampling and stratified sampling.
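
The Confidence interval, Precision and Prevalence entries above can be tied together with a small worked example. The Python sketch below computes a prevalence estimate and an approximate 95% confidence interval using the normal approximation for a simple random sample. The function name prevalence_ci and the counts are hypothetical, and the calculation assumes simple random sampling: it ignores the effects of clustering, stratification and weighting, so it should not be read as the method used to analyse STEPS data.

```python
import math


def prevalence_ci(cases, n, z=1.96):
    """Return (prevalence, lower, upper) as proportions between 0 and 1.

    Assumes a simple random sample and uses the normal approximation;
    z=1.96 corresponds to an approximate 95% confidence interval.
    """
    p = cases / n                      # point estimate of prevalence
    se = math.sqrt(p * (1 - p) / n)    # standard error of the estimate
    lower = max(0.0, p - z * se)       # keep the interval within [0, 1]
    upper = min(1.0, p + z * se)
    return p, lower, upper


# Hypothetical example: 250 of 1000 participants have the condition.
p, lower, upper = prevalence_ci(250, 1000)
print(f"{p:.1%} ({lower:.1%} to {upper:.1%})")
# prints 25.0% (22.3% to 27.7%)
```

Because the standard error shrinks with the square root of the number of observations, quadrupling the sample size at the same prevalence halves the width of the interval.
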
