Health Interview Survey 2013

Research Protocol

Public Health and Surveillance | June 2013

© Scientific Institute of Public Health, Brussels 2013

This report may not be reproduced, published or distributed without the consent of the ISP | WIV.

The authors of this protocol are (alphabetic order):

Charafeddine, Rana

Demarest, Stefaan

Drieskens, Sabine

Gisle, Lydia

Tafforeau, Jean

Van der Heyden, Johan

Research Protocol – HIS, Belgium 2013 p. 3/91

Table of contents

TABLE OF CONTENTS ...... 3
ANNEXES ...... 4
DEFINITIONS AND ABBREVIATIONS ...... 4

GENERAL INFORMATION ...... 5
1.1. COORDINATES OF THE SPONSORS ...... 5
1.2. HIS PROJECT RESPONSIBLE (IPH) ...... 6
1.3. HIS TEAM MEMBERS (IPH) ...... 6
1.4. PARTNERS (ICE) ...... 7
1.5. SUBCONTRACTORS (GDS) ...... 7

OBJECTIVE OF THE STUDY ...... 8
2.1. DESCRIPTION OF OVERALL OBJECTIVE ...... 8
2.2. DESCRIPTION OF SPECIFIC OBJECTIVES ...... 9

SCIENTIFIC RELEVANCE ...... 9
3.1. SCIENTIFIC BACKGROUND ...... 9
3.2. CONCEPTUAL FRAMEWORK ...... 10
3.3. PUBLIC HEALTH RELEVANCE ...... 13

METHODS ...... 14
4.1. STUDY DESIGN ...... 14
4.2. SAMPLING METHODOLOGY ...... 16
4.3. INSTRUMENTS ...... 19
4.4. FIELDWORK PROCEDURES ...... 25
4.5. DATA MANAGEMENT AND FLOW ...... 30
4.6. DATA ANALYSIS ...... 36
4.7. DATA SECURITY MEASURES ...... 44
4.8. SOFTWARE DEVELOPMENT ...... 46
4.9. WEBSITE ...... 49
4.10. PROCEDURES FOR EXTERNAL USERS ...... 49

SCIENTIFIC REVIEW ...... 50
5.1. SCIENTIFIC STEERING COMMITTEE ...... 50
5.2. WORKING GROUPS ...... 50
5.3. ON-DEMAND SCIENTIFIC REVIEW ...... 50

ORGANISATION OF THE RESEARCH PROJECT ...... 51
6.1. STARTING AND COMPLETION DATE ...... 51
6.2. TIMETABLE ...... 52
6.3. SUBCONTRACTING ...... 54
6.4. RESOURCES ...... 55

RISK AND BENEFITS FOR PARTICIPANTS ...... 56
7.1. PARTICIPATION RISKS AND BENEFITS ...... 56
7.2. LEGAL INSTANCES ...... 58

PROPRIETY RIGHTS OF STUDY MATERIAL AND RESULTS ...... 59

CLIENT SATISFACTION ...... 59
9.1. DEFINITION OF THE CLIENTS ...... 59
9.2. CONTACTS WITH THE CLIENTS ...... 60
9.3. FEED-BACK SYSTEMS ...... 60
9.4. TREATMENT OF COMPLAINTS ...... 60

COMMUNICATION OF RESULTS AND REPORTING ...... 61
10.1. REPORTING MECHANISM ...... 61
10.2. PUBLICATION PLAN: PEER-REVIEWED PUBLICATIONS AND OTHERS ...... 61
10.3. OTHER FORMS OF COMMUNICATION OF RESULTS ...... 61

ARCHIVING PROCESS ...... 63
11.1. DATA MANAGEMENT ...... 63
11.2. DOCUMENTS ...... 63

REFERENCE LIST ...... 64

Annexes

1. Inter-ministerial Agreement Protocol (upon motivated request)
2. Outsourcing contract IPH-GDS (upon motivated request)
3. Selection of Municipalities, Households and Respondents in the HIS 2013 – p.65
4. Information on the rules of using a proxy – p.83
5. Calculation of survey weights – p.85
6. Implementation of HISIA – p.90

Definitions and abbreviations

BE      Belgium
CAPI    Computer Assisted Personal Interview
EHIS    European Health Interview Survey
FPS     Federal Public Service
GDS     General Directorate - Statistics (Ex National Institute of Statistics)
HIS     Health Interview Survey
ICE     Interuniversity Cell of Epidemiology
IPH     Institute of Public Health - Belgium


General information

1.1. Coordinates of the Sponsors

The national Health Interview Survey (HIS) is commissioned and co-financed by the different Belgian authorities competent in the field of public health. The assignment to carry out the Health Interview Survey (HIS) in 2013 was determined in the framework of an Interministerial Agreement between the Belgian Federal State and the Authorities defined by art. 128, 130 and 135 of the Constitution (Regions and Communities).

The Interministerial Agreement Protocol (Annex 1, upon request) was concluded on April 10th, 2012. The signatories are:

The Federal Government: FOD Volksgezondheid, Veiligheid van de Voedselketen en Leefmilieu SPF Santé publique, Sécurité de la Chaîne alimentaire et Environnement Eurostation II Place Victor Horta, 40 bte 10 1060 Bruxelles

The Flemish Community and Region: Vlaams Ministerie van Welzijn, Volksgezondheid en Gezin Koning Albert II-laan 35 bus 30 Ellipsgebouw 1030 Brussel

The Walloon Region: Ministère de la Santé, de l’Egalité des chances et de l’Action sociale Avenue Gouverneur Bovesse, 100 5100 NAMUR (Jambes)

The French Community: Ministère de Culture, de l’Audiovisuel, de la Santé et de l’Egalité des Chances Place Surlet de Chokier, 15-17 1000 Bruxelles

The German Community: Ministerium für Familie, Gesundheit und Soziales Klötzerbahn 32 Eupen


Brussels-Capital Region: Verenigd College van de Gemeenschappelijke Gemeenschapscommissie van het Brussels-Hoofdstedelijk Gewest Collège réuni de la Commission communautaire commune de la Région de Bruxelles-Capitale Avenue Louise, 183 1050 Bruxelles

All cabinets and administrations of the Ministries involved in the organisation of the HIS 2013 are represented in a formal “Commission of Commissioners”. The Commission meets periodically – at least twice a year or upon request – with the HIS team to monitor the progress of the survey.

The Commission of Commissioners is chaired by:

• Mr Paul De Bock, Ministry of Public Health Eurostation – Victor Horta Place, 40/10 1060 Brussels

1.2. HIS Project responsible (IPH)

• Dr Jean Tafforeau, Head of the division “Surveys, lifestyle and chronic disease”

Scientific Institute of Public Health Juliette Wytsman Street, 14 1050 Brussels

1.3. HIS Team members (IPH)

Scientific:

• Rana Charafeddine

• Stefaan Demarest

• Sabine Drieskens

• Lydia Gisle

• Jean Tafforeau

• Johan Van der Heyden

Administrative:

• Monique Schoonenburg

• Hubert de Krahe


Contact information: Email: [email protected] Tel.: +32 (0) 2 642 57 71

1.4. Partners (ICE)

The HIS 2013 is subject to a partnership agreement with the Interuniversity Cell of Epidemiology (ICE) for a dental examination follow-up in a subsample of willing HIS participants. The linking of the HIS data collection with the supplementary optional survey on oral health and dental examination has been duly subsidised.

Interuniversity Cell of Epidemiology

• Prof. Jacques Vanobbergen, ICE project leader

Universiteit Gent Maatschappelijke Tandheelkunde – P8 De Pintelaan 185 – 9000 Gent

1.5. Subcontractors (GDS)

The HIS 2013 data collection phase is carried out by the General Directorate Statistics (GDS) under conditions stipulated in an outsourcing contract (Annex 2, upon request).

General Directorate Statistics

WTCIII - Bd Simon Bolivar, 30 1000 Bruxelles


Objective of the study

Health information and research was defined during the 43rd World Health Assembly (1) as “a process for obtaining systematic knowledge and technology that can be used for improvement of the health of the individuals or groups of population”. Health information can thus be considered as one of the tools to be used for health promotion and disease prevention.

Due to the lack of high quality and timely health data in Belgium, it was decided in the nineties to develop a new tool aiming at gathering useful information for the decision makers when designing public health programs. Several countries facing the same problem had successfully answered this by developing Health Interview Surveys (HIS). The pioneering countries in this domain were Canada, Denmark, The Netherlands and the United Kingdom where the HIS progressively became the necessary supplement to routine information systems in order to develop consistent public health policies.

The health interview survey is thus a powerful framework for a rational policy decision-making process (2).

2.1. Description of overall objective

The main objective of the HIS is to measure the health status of the population in Belgium, accounting also for three sub-regional populations (Flemish, Walloon and Brussels). The HIS is designed to obtain information on people’s health experience, their attitudes and health-related behaviours, the extent to which they use health care facilities and their use of preventive health and social services.

Health surveys provide one possible channel through which health-related information can be obtained. The added value here is the horizontal approach of the data collection: several types of information (health status and determinants, personal characteristics, health consumption ...) are collected simultaneously for the same individuals. The outcome is a global picture of the population’s health that allows identifying priority domains for strategic interventions. In addition, because the data are gathered periodically over time, changes in health status as well as effects of health policies and interventions can be monitored. Last but not least, health surveys allow obtaining health information from a representative sample of the population, including people that cannot be reached through the health services.


The ultimate goal of the HIS is thus to inform health authorities and stakeholders on various aspects of health in the population, but also to influence policy and health programmes with surveillance data and to provide a rich database to the scientific community for in-depth research activities. The information collected via the HIS is useful not only at regional, community and national level but also for international bodies such as Eurostat, WHO, UN and OECD.

2.2. Description of specific objectives

More specifically, the health survey pursues the following aims:

• identification of health problems;
• description of the health status and health needs of the population;
• estimation of prevalence and distribution of health indicators;
• analysis of social (in)equality in health and access to the health services;
• study of health consumption and its determinants;
• study of possible trends in the health status of the population;
• contribution to the evaluation of specific public health programmes.

Scientific relevance

3.1. Scientific background

Health surveys are population-based studies that collect data on health-related issues by means of standard structured interviews and/or examinations in a representative sample of the population. To gather this type of information, most European countries conduct health surveys that are exclusively oriented towards health-related items. Some countries, such as Germany and the UK, instead carry out multipurpose surveys that include a module on health among many other topics. The advantage of an exclusive health survey over a multipurpose survey is that it allows an in-depth investigation of the issues at stake, as there is no competition with other subject matters under investigation (3).


In Belgium, several multipurpose surveys address specific health questions:

• The Socio-Economic Survey (Census, 2001) includes questions on health perception, chronic morbidity, functional status and informal care;
• The Community Household Panel (2000) addresses health perception and depression;
• The Labour Force Survey (2002) has questions around chronic morbidity and handicap;
• The Survey on Income and Living Conditions (SILC: yearly survey starting from 2004) includes topics like health perception, chronic morbidity, functional status and unmet health care needs.

The number and scope of health items in those surveys are nevertheless quite limited. Consequently, the Health Interview Survey – exclusively focused on health and health-related topics – is regarded as the principal reference for health statistics in the Belgian landscape.

The Belgian health survey is entirely interview-based, which means that the information gathered is self-perceived and “subjective”. Another way of obtaining health data is through health examination surveys. Performing examinations (e.g. collecting blood or urine samples) provides a more “objective” picture of some specific health indicators. The possibility of combining the Belgian HIS with an examination module in the future was discussed during a conference in December 2011 (see footnote 1): the interest of the stakeholders and the scientific community in health examination data was obvious, but unfortunately no decision was taken at the level of the authorities due to the very high cost of this type of investigation.

3.2. Conceptual framework

Five areas of investigation are considered in the conceptual framework of the Health Interview Survey:

• Health status
• Health determinants
• Medical prevention
• Health consumption
• Health and society

1 The report of the conference is available upon request at [email protected]


Health status

Measuring the health status of the population is necessary in accordance with the WHO definition of health and the global approach of health problems. An instrument such as the HIS is essential to complement the information collected by health care providers, registries and vital statistics.

The HIS allows measuring the health status of the population in general, and not only in relation with specific health problems. Such a difference is described in the literature as the distinction between ‘health status’ and ‘state of the health’ (4).

Even if health is the main subject matter of the survey and despite the positive approach of health recommended by the WHO, most of the domains investigated in the HIS have to do with ill-health and diseases. A positive conceptual framework was indeed considered when elaborating the HIS, but unfortunately it has not been possible to put these concepts into practice, notably due to the lack of available instruments (5;6). One element related to quality of life is, however, tentatively introduced in the 2013 survey.

As stated earlier, one of the main characteristics of the HIS is that most of the information is provided by the participants themselves, with all their potential subjectivity, driven by their own experience and sensitivity in relation to their health. For some topics, however, one might be tempted to distinguish relatively objective questions (reporting height and weight, for example) from purely subjective ones (reporting self-perceived health). Most of the topics investigated in the HIS lie between those two extremes.

Another basic concept of the HIS is the differential approach of the health status of the population related to the medically diagnosed diseases on one side and their consequences on the functional status of the individual on the other side.

The health status measurement within the HIS is mainly focused on chronic diseases and conditions; due to their long duration these conditions have a greater impact on health expenditures and represent a higher burden at the population level than acute conditions.

Health determinants

Life style habits (e.g. physical exercise or food, alcohol and tobacco consumption) are closely linked to the values and priorities of each person, and to the opportunities and constraints inherent to the culture and the socio-economic position of the individual. Life styles are shaped by social learning and interpersonal interactions. It is thus misleading to believe that adopting or not adopting certain health-related life styles is determined only by an individual decision (deterministic approach).

Life styles are however health determinants: some behaviours or habits contribute to preserving a good state of health, preventing specific conditions or improving psychological well-being. Conversely, other behaviours are harmful to health, especially if they are excessive or chronic.

Lessons from the past have shown that changing people’s life style is the main source of improvement in population health, before and above the progress made in the field of medicine. This is why today health promotion programmes are still one of the most important components of public health policy.

It is thus essential for decision makers and institutions in charge of implementing health promotion to regularly measure the prevalence of health related behaviours at the population level, their distribution in specific subgroups and the trends over time. Such measurement is imperative for the evaluation of programmes and policies. Though it is not possible through the health surveys to provide evidence of a causal relationship between a specific programme and change in behaviour over time, they remain useful tools for monitoring health related behaviours.

Medical prevention

Clinicians have long understood that preventive medicine plays a major role in maintaining good health. The advantage of preventive medicine has become increasingly apparent in the past 30 to 40 years. This approach deeply modified the way to solve medical problems such as infectious diseases (with the immunisation programmes for example). More recently, early disease detection has also become an essential component of preventive medicine, with striking results as far as morbidity and mortality are concerned (7).

As a result, public health policy has progressively been enlarged from the mere management of health care expenditures to the development of strategies that aim at improving the health of the population on the whole. Such an approach involves specific actions at the levels of biological factors, physical and social environment, individual behaviour, but also at the level of the health services in their curative and preventive components (8).

The WHO “Health for All” targets, published in 1985 and later, explicitly mention health promotion and disease prevention as priority programmes. From a conceptual point of view, three spheres can be distinguished in the field of preventive medicine (8):

1. Primary prevention: actions aiming at eliminating the cause of a disease in order to avoid the emergence of new cases.
2. Secondary prevention: early detection and treatment of a specific disease before the appearance of clinical symptoms and complications.
3. Tertiary prevention: strictly speaking this is not prevention of disease, but rather an attempt to limit its consequences.


Some HIS modules investigate specific actions in the primary and secondary prevention spheres. Priority actions in the domain of preventive medicine are chosen for the HIS on the basis of several criteria: the frequency of the disease, the importance of the problem at the individual and societal level and the efficacy of the preventive methods.

Health consumption

Information regarding health consumption is an essential part of the health information system in order to adapt available resources to the needs of the population. Health consumption covers three main domains: ambulatory care, institutional care and medicines. Two different methods are usually available to measure health consumption at population level: health service statistics and health surveys. Several data sources are accessible in Belgium for health services registers: reimbursement of expenses for medical care (INAMI–RIZIV, insurance funds), reimbursement of expenses for prescribed medical drugs (Pharmanet) and hospital discharge records (RCM, RPM, RIM).

It is generally considered that health service statistics are more reliable (objective) than information gathered through health surveys. Health information from surveys is seen to suffer from recall bias as well as bias due to the participants’ lack of medical knowledge (health literacy). Nonetheless, health surveys represent the only source of polyvalent information, where data are collected on different health-related aspects. This offers the unique opportunity to analyse, for instance, the level of health consumption in relation to several potential determinants such as health status, life styles or socio-demographic characteristics.

Health and society

The concept of health has broadened over time to progressively include non-medical components; in this perspective, health has become a social phenomenon. Health and ill-health are also regarded as being closely related to the environment and to family and professional integration. It is now well documented that health status is linked to the social and economic situation of individuals. This can be assessed through concepts such as the accessibility of health care, the analysis of health inequalities, social support, environmental nuisances, accidents and violence, etc. The HIS takes into account specific societal and environmental health determinants.

3.3. Public health relevance

Public health is the science and art of preventing disease, prolonging life and promoting health through the organised efforts and informed choices of society, organizations, public and private, communities and individuals (9).


Today, the HIS constitutes an important source of information for people in charge of setting up public health policies. It provides a solid ground on which those policies can be constructed. The HIS can also prove to be an interesting tool in the evaluation process of the public health programmes.

Methods

4.1. Study design

4.1.1. Study type

The Health Interview Survey is a cross-sectional epidemiologic study in the general population.

It is repeated periodically over time.

4.1.2. Study period

The HIS 2013 Protocol Agreement covers 3 calendar years.

• The preparatory phase (conceptual, questionnaire development,...) is accomplished during 2012.

• The field work or data collection phase is carried out from 1 January 2013 to 31 December 2013.

• The data management, data analysis and reporting phases are completed during 2014.

4.1.3. Target population

The main objective of the Health Interview Survey – i.e. to give a description of the health status of the population in Belgium – leads to the broad definition of the target population as “all people residing in Belgium, regardless of their place of birth, nationality or any other characteristic”.

4.1.4. Sampling frame

The sampling frame consists of all households listed in the National Register.


In practice, a household is defined as the people living at the address of a reference person. However, some people are registered in the National Register as living in collective households (e.g. homes for the elderly, prisons, religious convents,...) and these households have no reference person. In order to include them in the sampling frame, each individual belonging to a collective household (field code “20”) is considered in the sampling frame as a one-person household (i.e. as a reference person or field code “1”).

However, it should be noted that people living in:

• an institution (including psychiatric institutions), with the exception of elderly people living in old people’s homes, nursing homes and psychiatric nursing homes,

• a religious community or cloister with more than 8 persons,

• a prison,

are excluded from the survey for practical reasons. The exclusion process takes place a posteriori, when the interviewer identifies that the selected household (the one-person household in this case) belongs to one of the indicated categories.

People who are listed in the National Register without any physical address (mostly people living abroad, hence not even included in the target population) are excluded a priori from the sampling frame. They are identified in the National Register through the field of the statistical sector (starting with “Z” or “9”).
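The frame construction rules described above can be sketched as a small filter. This is only an illustration: the record layout (`RegisterRecord` and its field names) and the generated household identifier are hypothetical and are not taken from the National Register specification. Only the field codes “1” and “20” and the “Z”/“9” prefixes come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegisterRecord:
    # Hypothetical record layout, for illustration only.
    person_id: str
    household_id: str
    field_code: str          # "1" = reference person, "20" = member of a collective household
    statistical_sector: str  # starts with "Z" or "9" when there is no physical address


def build_frame(records):
    """Return one record per household that enters the sampling frame."""
    frame = []
    for rec in records:
        # A priori exclusion: no physical address in Belgium.
        if rec.statistical_sector[:1] in ("Z", "9"):
            continue
        if rec.field_code == "20":
            # Collective household member: treated as a one-person household,
            # i.e. as a reference person (field code "1").
            frame.append(
                RegisterRecord(rec.person_id, f"single-{rec.person_id}", "1",
                               rec.statistical_sector)
            )
        elif rec.field_code == "1":
            # Ordinary household, represented by its reference person.
            frame.append(rec)
    return frame
```

The a posteriori exclusions (institutions, large religious communities, prisons) are not modelled here, since the text states that they are applied by the interviewer in the field.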

The sampling is carried out every quarter. Hence, the procedure is followed 4 times, each time in the months preceding the start of the quarter. Households selected in a previous quarter are excluded from the sampling frame for the next quarters.

All the members of a selected household are part of the sampling frame. Nevertheless, even though the most recent version of the National Register is used and the National Register is updated on a weekly basis, the administrative composition of a household may differ from its real composition. The general rule is that the list of household members is established on the basis of information from the household itself. It should also include household members who are temporarily absent (e.g. students who spend most of their time in student accommodation) and older people (65 years and older) who have their official residence at the household address, but reside in fact in a home for older people, a nursing home for older people or a psychiatric nursing home. If – according to the administrative information – there are other household members than those mentioned by the reference person of the household, the interviewer should ask about them without mentioning that there is a discrepancy between the situation as reported by the household and the National Register.


4.1.5. Study population

The study population – the population that is reached by the study via the sampling frame – does not cover the target population completely. The following categories of persons are included in the target population, as defined above, but are not included in the study population:

1. People living in Belgium, but not listed in the National Register: homeless, illegal immigrants, etc.
2. People belonging to households listed in the National Register with no physical address in Belgium (Codes “Z” and “9”).
3. People residing in:
   - an institution (including psychiatric institutions), with the exception of older people living in old people’s homes, nursing homes and psychiatric nursing homes
   - a religious community or cloister with more than 8 persons
   - a prison
4. People in newly created households that are not yet included as such in the National Register.

4.2. Sampling methodology

4.2.1. Sample Size

The total number of successful interviews for the sample is set to 10,000 (3,500 for the Flemish Region, 3,500 for the Walloon Region, 3,000 for the Brussels Region). This sample size is based on sample size calculations performed during pre-analyses for the HIS in 1997, taking into account specific budget constraints and the available logistic means.

On the basis of the previous surveys, the precision obtained in estimation at the national and regional level appeared to be sufficient. The sample is however too small for estimation purposes at the provincial level, especially for small provinces. Therefore the Protocol Agreement of the commissioners stipulates that provinces that wish additional sample units are entitled to ask for an oversampling of their province, provided they are prepared to supply the financial means for this. For the HIS 2013, only one province opted for an oversampling and supplied a financial contribution for the increase in the number of interviews, namely 600 supplementary units for the province of Luxembourg. As a result, the final sample size for the HIS 2013 is 10,600. This includes the basic sample (10,000) and the provincial oversampling (600).
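As a quick check of the arithmetic above, the final sample size can be recomputed from the regional targets and the oversample. The dictionary and function names are illustrative only; the figures are those stated in the text.

```python
# Target number of successful interviews per region (base sample).
BASE_SAMPLE = {
    "Flemish Region": 3500,
    "Walloon Region": 3500,
    "Brussels Region": 3000,
}

# Provincial oversampling financed by the province itself.
OVERSAMPLE = {"Luxembourg": 600}


def final_sample_size(base, oversample):
    """Base sample plus all provincial oversamples."""
    return sum(base.values()) + sum(oversample.values())


base_total = sum(BASE_SAMPLE.values())
total = final_sample_size(BASE_SAMPLE, OVERSAMPLE)
print(base_total, total)  # 10000 10600
```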


4.2.2. Stratified Clustered Multi-stage design

• MOTIVATION

In the design of the sampling scheme, both the coverage of the population in Belgium and the logistic feasibility of the fieldwork are important concerns. Even when a relatively exhaustive list is available (such as the National Register) a simple random selection from this list would be expensive from a practical point of view because the spread of households to be interviewed would be too wide and scattered. By using a more complex sampling method it is possible to obtain a larger sample size than would be obtained by simple random sampling at the same cost. Therefore, a multiple stage clustered sampling design is preferable. In this design, municipalities serve as primary selection units, while households within the municipalities and individuals within households are respectively second and third-stage units.

Choosing a stratified sample instead of a simple random sample is motivated as follows. Sample surveys displaying small variability among the measurements will produce small bounds on the errors of estimation. In other words, stratification may produce a smaller bound on the error of estimation than would be produced by a simple random sample of the same size.

This result is reinforced if the strata are largely homogeneous. For the HIS 2013, there are two stratification levels, one at the regional and one at the provincial level. Within a region, a proportional representation per province in the base sample of 10,000 is sought. A simple random sample of municipalities within a region would satisfy this condition on average, from the sampling framework point of view, and the resulting differences would be regarded as purely random. However, stratifying proportionally over provinces controls this random variation further.

Municipalities are established administrative units. They are stable (in general those units do not change during the time the survey is conducted), and they are easy to use in comparison with other specialised sources of data related to the survey. Municipalities are preferred to regions or provinces, because the latter are too large and too few. The great variation in the size of the municipalities is controlled for by systematically sampling within a province with a selection probability proportional to their size.
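Systematic sampling with probability proportional to size can be sketched as follows. This is a generic textbook version, not the procedure documented in Annex 3: the municipality names and sizes are hypothetical, and the assumption that a large municipality can receive more than one group is ours.

```python
import random


def systematic_pps(sizes, n_groups, rng):
    """Systematic selection with probability proportional to size.

    sizes    -- list of (municipality, population) pairs in list order
    n_groups -- number of selection points (groups) to place in the stratum
    rng      -- a random.Random instance
    Returns a dict mapping municipality -> number of groups allocated.
    """
    total = sum(size for _, size in sizes)
    # Selection point k lies at (offset + k * total) / n_groups on the
    # cumulative size scale; integer arithmetic avoids rounding problems.
    offset = rng.randrange(total)
    counts = {}
    cumulative = 0
    k = 0
    for name, size in sizes:
        cumulative += size
        while k < n_groups and offset + k * total < cumulative * n_groups:
            counts[name] = counts.get(name, 0) + 1
            k += 1
    return counts


# Hypothetical stratum with three municipalities.
stratum = [("Municipality A", 100), ("Municipality B", 300), ("Municipality C", 600)]
print(systematic_pps(stratum, 10, random.Random(2013)))
```

With a sampling interval of total / n_groups, a municipality whose size is a multiple of the interval always receives the same number of groups, while smaller municipalities are selected with probability proportional to their size.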

Within each municipality, a sample of households is drawn so that groups of 50 individuals can be interviewed in total. Clustering also takes place at the household level since members of the same household are more alike than persons not belonging to the same household.

Whereas the stratification and the systematic sampling according to municipalities usually increase the precision, the clustering effect (selecting groups of 50, selecting households instead of individuals) will slightly reduce the precision, since units will resemble each other more than in a simple random sample. In addition, since stratification is based on unequal probabilities (to guarantee a meaningful sample size per stratum), a slight decrease in overall efficiency is to be expected. The effects due to clustering and stratification observed in the HIS 1997 are very mild and do not outweigh the advantages. This design choice indeed enables persons to be sampled from abbreviated listings and, hence, reduces the survey field worker’s travel distance significantly.

In summary, in the light of the previous remarks, multistage sampling is the appropriate way to get access to individuals. An overview of the steps in the selection procedures is given in the next section.

• OVERVIEW OF THE DESIGN

In summary, the final sampling scheme, i.e. the mechanism to get a probabilistic sample of households and respondents, is a combination of several sampling techniques: stratification, multistage sampling and clustering. The selection process consists of the following steps:

1. Regional stratification. Belgium is divided into 3 regions, the Flemish Region, the Walloon Region and the Brussels Region, for which the number of interviews has been predetermined. The reason for this stratification is to ensure that inference is possible for each region with nearly the same precision. The number of interviews to be carried out is fixed to 3500 for the Flemish and Walloon regions and 3000 for the Brussels region. These figures do not include the oversampling.

2. Stratification at the level of the provinces. This second level of stratification is done to improve the quality of the sample over a simple random sample. In particular, a balanced geographical spread is achieved. For the base sample, the sample size within the provincial stratification is proportional to the population size of the province. For the provinces that have requested an oversampling, the number of interviews obtained via the proportional stratification is increased by the number of interviews the province is willing to subsidise. In addition, there is the special case of stratification for the province of Liège: as the sample size of the German Community (which is geographically located in the province of Liège) is predetermined by convention, the province of Liège has been split into two strata: the German Community and the rest of the province.

3. Then, within the strata, units are accessed in two (for the households (HH)) or three (for the individuals) stages:

(1) Municipalities are selected with a selection probability proportional to their size, within each stratum. These municipalities are called the Primary Sampling Units (PSU). To facilitate the fieldwork, for each PSU selected, a group of 50 individuals residing in that municipality has to be interviewed successfully during the year 2013.

(2) Within each municipality, a sample of households - the Secondary Sampling Units (SSU) - is drawn in such a way that 50 individuals per PSU can be interviewed in total.

(3) Finally, at most four individuals - the Tertiary Sampling Units (TSU) - are chosen for the interviews within each household. Only questioning the reference person within a household would not enable us to give a good picture of a household's health status. For households with four members or less, all members are selected. For households with at least five members, the reference person and his/her partner (if any) are selected. Among the remaining household members a random selection is made, so as to yield four selected household members. Interviewing more than four persons within a household is inefficient because of the familial correlation and because the burden on the household would be too large.

4. To further assure representativity over time, interviews are spread over the whole calendar year so that each quarter is comparable in terms of number of successful interviews. The quarters are defined as follows: Q1: January-March; Q2: April-June; Q3: July-September and Q4: October-December.
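Stage (1) above, selection of municipalities with probability proportional to size, can be illustrated with a minimal Python sketch. It assumes a systematic PPS scheme within one stratum; the function name, the municipality sizes and the number of draws are hypothetical and are not taken from the HIS selection programs (see Annex 3 for the actual procedure).

```python
import random


def pps_systematic(sizes, n_draws, rng=None):
    """Return, for each draw, the index of the unit that is hit.

    Assumption: systematic sampling along the cumulated sizes with a random
    start. A unit larger than the sampling interval can be hit several
    times, i.e. it would receive several groups of 50 interviews.
    """
    rng = rng or random.Random()
    step = sum(sizes) / n_draws          # sampling interval
    start = rng.random() * step          # random start within the first interval
    selected = []
    cumulative = 0                       # total size of the units already passed
    idx = 0
    for k in range(n_draws):
        point = start + k * step
        # Move forward until the unit whose size range contains the point.
        while idx < len(sizes) - 1 and cumulative + sizes[idx] <= point:
            cumulative += sizes[idx]
            idx += 1
        selected.append(idx)
    return selected


# Hypothetical stratum with three municipalities and four groups to allocate.
groups = pps_systematic([100, 300, 600], n_draws=4, rng=random.Random(0))
```

In this toy stratum the largest municipality holds 60% of the population and is therefore always hit at least twice.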

More detailed information about the sampling can be found in the document “Selection of Municipalities, Households and Respondents in the HIS 2013” (see Annex 3).
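The within-household selection rule of stage (3) can be sketched as follows. This is an illustration only: the record layout (`id`, `role`) and the function name are assumptions, and the sketch supposes that each household has one reference person and at most one partner.

```python
import random

MAX_SELECTED = 4  # at most four individuals are interviewed per household


def select_members(members, rng=None):
    """Select the household members to be interviewed.

    members: list of dicts with the hypothetical keys 'id' and 'role',
    where role is 'reference', 'partner' or 'other'.
    """
    rng = rng or random.Random()
    # Households with four members or fewer: everyone is selected.
    if len(members) <= MAX_SELECTED:
        return list(members)
    # Larger households: reference person and partner (if any) are always
    # selected; the remaining places are filled by a random draw.
    fixed = [m for m in members if m["role"] in ("reference", "partner")]
    others = [m for m in members if m["role"] == "other"]
    return fixed + rng.sample(others, MAX_SELECTED - len(fixed))


household = [
    {"id": 1, "role": "reference"},
    {"id": 2, "role": "partner"},
    {"id": 3, "role": "other"},
    {"id": 4, "role": "other"},
    {"id": 5, "role": "other"},
    {"id": 6, "role": "other"},
]
selected = select_members(household, rng=random.Random(1))
```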

4.3. Instruments

4.3.1. Mode of data collection

Data collection in the HIS 2013 takes place using two standardized questionnaires: a questionnaire administered in a face to face interview and a self-administered questionnaire handed out to the participants.

The face-to-face questionnaire is addressed to all the selected persons in the household. The data are collected via computer assisted personal interviewing (CAPI), where the interviewer reads the questions to each respondent and enters the answers directly into a portable computer. Under certain specific conditions, one person (whether a member of the household or not) is allowed to respond on behalf of the selected person. This is called “proxy interviewing” (information on the rules of using a proxy is found in Annex 4).

The self-administered questionnaire collects information from all respondents aged 15 years or older. The interviewees fill in the questionnaire themselves, in writing and without the intervention of the interviewer, using the paper and pencil approach (PAPI). The decision to use this mode of data collection is based on the idea that some topics, such as mental health and alcohol consumption, are sensitive and are therefore not suited for a face-to-face interview. The self-administered questionnaire cannot be completed by a proxy.

The questionnaires are available in French, Dutch, German and English, so the interviews can be conducted in these 4 languages. If a selected individual cannot speak any of these languages, an interpreter may be used. This interpreter is generally a member of the household who speaks both the language of the selected person and the language of the interview. This is not considered proxy interviewing, as the interpreter only translates the questions and answers and does not respond on behalf of the selected individual.

4.3.2. Content of the questionnaires

The questionnaires collect data on 5 health domains (described in section 3.2) and on a number of background social variables. They are organized by topic, also called “module”. The list of the HIS 2013 modules is found in the table below. In addition to the modules, the questionnaires include a number of questions to evaluate the questionnaire and the interview themselves.

A number of modules collect information from the reference person only and the results are relevant for all household members. These modules are: household composition, income, accessibility to health care, passive smoking and lodging. All the other modules collect information directly from all the selected individuals in the household, though some are specific to certain age or gender groups.

The questions of the HIS 2013 and of the previous surveys are found in an Excel sheet that is developed for each module. This Excel sheet allows the comparison of the questions across the years in the four languages and is archived on the IPH server as: Overview_questions_XX (XX stands for the two letters of identification of each specific module) to be found in \HIS\HIS2013\modules\XX.

The questionnaires are available on the HIS website “www.healthsurvey.be”.

MODULES ABBREVIATION FACE SELF

Background information

1. Information on the selected individuals NR x

2. Household composition (+ nationality) HC x

3. Education ET x

4. Income IN x

5. Employment EM x

6. Lodging LO x

Health status

7. Subjective health SH x

8. Chronic conditions (diseases) MA x

9. Chronic conditions (general) MB x

10. Longstanding incapacities IL x

11. Bodily pain PI x

12. Mental health SL, WB, EB x

13. Oral health DE x

14. Nutritional status NS x

15. Quality of life QA x

Health behaviours

16. Physical activity PA x

17. Nutritional habits NH x

18. Alcohol consumption AL x

19. Smoking TA x

20. Illegal drugs use ID x

Prevention

21. Knowledge and attitudes about HIV/AIDS HI x

22. Immunisation VA x

23. Prevention of cardiovascular diseases and diabetes PR x

24. Cancer screening SC x

25. Sexual health SE x

Health care consumption

26. Contacts with the general practitioner GP x

27. Contacts with the specialist SP x

28. Contacts with a dentist DE x

29. Contacts with other health providers and services OH x

30. Contacts with an emergency unit ED x

31. Hospitalisation HO x

32. Use of medicines DR x

33. Patient experiences PE x

Health and society

34. Accessibility to health care AC x

35. Health and environment HE x

36. Passive smoking HE & TA x x

37. Trauma: accidents and violence TR x x

38. Social contacts and social support SO x

39. Social & preventive services OH x

40. Informal care IC x

4.3.3. Conceptual development of the HIS questionnaires

The guiding principle of the questionnaires’ development is to keep questions comparable across surveys to allow the study of time trends. This does not mean that the questionnaires are static. In fact, the content of the questionnaires evolves over time to address emerging concerns. When new needs are identified, the conceptual development of any new module starts with an extensive literature review to identify the indicators that describe the issue to be monitored. The identification of these indicators depends on the availability of instruments that are valid and reliable or instruments that have been used in other high quality surveys.

For the HIS 2013, the questions have been developed based on the following:

• PREVIOUS HIS QUESTIONS: The questions proposed in the HIS 2013 are based on the questions available in previous HIS cycles (1997, 2001, 2004 and 2008) to allow comparison across surveys.

• WORKING GROUPS AND EXPERT CONSULTATION: The questions proposed for some modules are discussed in working groups with scientific and academic experts, members of health agencies and administrations, and fieldwork experts. Other modules are reviewed by experts in the context of an e-mail consultation.

• REQUESTS FROM COMMISSIONERS OR OTHER USERS OF HIS DATA: New modules are requested as new health concerns emerge. Such requests may be issued by the commissioners of the HIS and by other users of HIS data such as health agencies or universities. Requests by the commissioners are given higher priority.

• REQUESTS FOR THE EUROPEAN HEALTH INTERVIEW SURVEY (EHIS): Eurostat aims at generating an EHIS database that allows health data to be compared between European countries. For this purpose, a European regulation lists the variables that have to be included in the next national HIS in EU member states, which should take place in 2013/2014. Belgium has to ensure that the HIS questions provide the information needed to construct the EHIS variables.

• REQUESTS FOR INTERNATIONAL REPORTING: International organisations such as the WHO or UNAIDS request the periodic delivery of a set of indicators. These indicators need to be included in the HIS in order for Belgium to fulfil such international requirements.

In addition, the development of the HIS 2013 questions is guided by the need to keep the length of the questionnaire as is or even to reduce it. The questionnaire used for the previous HIS (2008) was long as it took on average one hour per person to complete. Therefore, requests for any new questions to be included in the HIS 2013 are analysed in the context of the necessity to drop other questions. This approach must be taken in order not to jeopardize the validity of the results.

Finally, the decision process that led to the development of the HIS 2013 questions differs according to the module. A conceptual paper is developed to document this process for each module. These papers are archived as: XX_2013_cpt.doc (XX standing for the two letters of identification of each specific module) to be found in the directory \HIS\HIS2013\modules\XX.

All the conceptual papers follow the same structure and generally include the following items:

1. Place of the module in the conceptual framework of the health survey

2. Introduction
− Definition and concepts
− Relation with health
− Implication at the level of the society
− Added value of this module in the HIS

3. Information about the module
− Importance of the information for the decision makers
− Data available in Belgium

4. Objectives of the module

5. Review of available instruments
− Overview of instruments
− Selection for the HIS 1997
− Selection for the HIS 2001
− Selection for the HIS 2004
− Selection for the pretest 2007
− Selection for the HIS 2008
− Selection for the HIS 2013

6. Questions
− Overview of questions HIS 2013
− Detailed list of questions for HIS 2013 (in 4 languages)

7. List of variables HIS 2013

8. Construction of the indicators HIS 2013

9. References

4.4. Fieldwork procedures

To collect the data for the health interview survey, approximately 200 interviewers are needed. The basic role of these interviewers is twofold: (1) to establish contact with all households selected for interview in the group assigned to them, and (2) to conduct the interviews with the selected members of the participating households. The procedures foreseen for these tasks are the following:

• ESTABLISHING CONTACT WITH THE HOUSEHOLDS

As described further on, interviewers only receive the names and addresses of the households they have to contact. No telephone number, mobile number or e-mail address is provided. It is up to the interviewers to try to contact the “activated” (invited) households either by telephone or at the doorstep.

- If a household cannot be contacted despite 5 (duly documented) attempts, the household can be labelled as ‘non-contactable’ and will be replaced.

- If a contact with the household takes place, the interviewers have to ask the household (reference person or his/her partner) if the household is willing to participate in the survey.

o If the household does not agree to participate, it receives a “refusal” status.

o If the household agrees to participate, an appointment (date and time) should be made to conduct the interview(s) in that particular household.

• CONDUCTING THE INTERVIEWS

The structured interviews are conducted in a face-to-face setting using the CAPI mode of data collection with all selected household members (or by proxy interview for those under 15 years of age). The data collection also consists of handing out a self-administered questionnaire to the selected members aged 15 years or older (PAPI mode). The procedures related to the HIS data collection are documented in the “interviewers’ manual” and explained during the interviewers’ training.

4.4.1. Fieldwork data

Launching a large-scale survey such as the HIS implies dealing with numerous data at various levels, namely, data stemming from fieldwork operations (also referred to as “para-data”) and data collected from the HIS participants (resulting from the HIS questionnaires). The fieldwork procedures foreseen for collecting the survey data require tight management by means of specific follow-up indicators.

Basically, the goal of the data collection phase is to obtain the participation of as many households as needed to reach a total of 10.750 interviews (baseline sample of 10.000 and supplementary provincial sample of 750 interviews) during the calendar year 2013. Because the HIS is not a compulsory survey, a certain number of selected households will not participate. Non-participation (including non-eligible households, non-contactable households and refusing households) is an important issue in the data collection phase, especially since a sophisticated substitution process for non-participating households is applied in the HIS.

The data collection phase starts with the “activation” of the households, that is, when they are notified that they were randomly selected to take part in the HIS 2013. In order to inform a household that it is selected for participation in the survey, the GDS sends an invitation letter and an introductory leaflet addressed to the reference person of the household. The leaflet explains, among other things, the goal and the content of the HIS as well as the contact practicalities, i.e., that an interviewer will contact them shortly after they receive the letter. Households are explicitly informed that participation is not compulsory, and different means are offered for them to refuse further contact (e-mail address, toll-free telephone number, contact details of the HIS team members). In the letter, reference is made to the possibility to participate in the ICE-survey regarding oral health.

As soon as these information documents are sent to the household, the interviewer responsible for the given (group of) households receives access to the contact data of the household (full name of the reference person, full address, first names of the other household members) and can start attempting to establish contact to find out whether the household is willing to participate, and if so, to set an appointment for the interview phase. This so-called contact information is retrieved from the National Register, which is used as the sampling frame for the selection of the households.

It is important, for both the interviewer’s work as well as for a close monitoring of the fieldwork, that all attempts to contact the household are duly documented. For this purpose, the interviewer has to complete a (computerised) “contact sheet” for every activated household in his/her contact list. This contact sheet is the standard sheet used by GDS in their own surveys.

For every attempt and/or successful contact with an activated household, the contact sheet must be updated with the following data:

• The date

• The day

• The moment of the day of the contact

• The mode (at doorstep / by telephone / did not succeed in contacting the household)

• The outcome of the contact:

o Interview conducted

o Interview scheduled (+ date scheduled for the interview)

o Problems encountered to contact the address (*)

o Refusal (*)

o Interview impossible (*)

o Not available on the moment of the contact

o No-one opened the door (*)

o No-one at home (*)

• For results marked with (*) the interviewer has to describe the situation in detail.

Because of the substitution process implemented for non-participating households in the HIS 2013, the contact sheets must be completed on a regular basis, preferably day by day. Since the communication system between the interviewer and GDS is web-based, daily updates will enable the GDS to monitor the data-collection proceedings ‘in real time’. That way, special attention can be paid to households ‘on hold’ (meaning that their final outcome status is not yet known). If a household is still ‘on hold’ 6 weeks after activation, the responsible interviewer will be notified that he/she must do everything possible to ensure that this household gets its final status.

As stated previously, a substitution process is applied to the non-participating households: each non-participating household is replaced with a predefined household (the next one in the sample cluster). In practice, when the GDS is informed that an invited household will not participate, an introductory leaflet and invitation letter are sent to the substitute household. The interviewer is then notified that contact attempts can start with the replacement household, and receives access to its contact data.
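The contact rules described in this section (5 documented attempts before a household is labelled non-contactable, substitution of non-participating households, a reminder for households still ‘on hold’ after 6 weeks) can be summarised in a small Python sketch. The outcome codes and function names are hypothetical and do not reproduce the GDS contact sheet; non-eligible households are left out for brevity.

```python
from datetime import date, timedelta

MAX_ATTEMPTS = 5                     # documented attempts before 'non-contactable'
ON_HOLD_LIMIT = timedelta(weeks=6)   # delay after which the interviewer is reminded


def household_status(outcomes):
    """Derive a household status from the chronological list of contact outcomes.

    Hypothetical outcome codes: 'no_contact', 'scheduled', 'refusal', 'interviewed'.
    """
    if "interviewed" in outcomes:
        return "participating"
    if "refusal" in outcomes:
        return "refusal"
    contacted = any(o != "no_contact" for o in outcomes)
    if not contacted and len(outcomes) >= MAX_ATTEMPTS:
        return "non_contactable"
    return "on_hold"                 # final outcome not yet known


def needs_substitute(status):
    """Non-participating households are replaced by the next one in the cluster."""
    return status in ("refusal", "non_contactable")


def needs_reminder(status, activated_on, today):
    """Flag households still on hold more than 6 weeks after activation."""
    return status == "on_hold" and today - activated_on > ON_HOLD_LIMIT
```

For example, five unsuccessful attempts give the status `non_contactable`, which triggers the activation of the substitute household.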

If the household consents to participate, the individual data collection through CAPI can take place. The CAPI application indicates how many (max. 4) and which household members have participated. This information is not readily available for the self-administered questionnaire (PAPI): the interviewer must record the number of household members that completed the self-administered questionnaire. These completed questionnaires should be sent by the interviewer to GDS. Until GDS receives these questionnaires, they fall under the responsibility of the interviewer.

4.4.2. Interviewers

To reach the target of 10.600 interviews (divided into 212 groups of 50 interviews) in one calendar year, the HIS requires the services of about 200 interviewers, depending on the number of groups some of them take on.

• RECRUITMENT

Interviewers that carry out the HIS 2013 are selected from a pool of interviewers already active in other surveys of the GDS. Given the specificity of the HIS, which is conducted in selected target municipalities, extra interviewers will have to be recruited. The recruitment of candidate interviewers is a joint IPH – GDS initiative: if no interviewers are directly available in the pool, local administrations and state-run institutions (post office, schools, hospitals …) will be contacted to search for candidates.

• TRAINING

All interviewers active in the HIS 2013 have to follow a collective one-day training in order to ensure the standardisation of all data collection and fieldwork procedures. Three major themes are addressed during the training session: (1) the overall GDS approach applied in data collection (the use of CAPI, communication with GDS, contact sheets, …); (2) specificities of the HIS 2013 survey (content of the questionnaire, conceptual background, selection of household members, …); (3) supplementary tasks in the context of the ICE-survey on oral health (introducing the survey, intent form, …).

Interviewers’ training is not only organised in the preparation phase of the HIS 2013 (i.e., late 2012), but also on a regular basis during the fieldwork in 2013 because of interviewer drop-out and turnover throughout the data-collection phase. All possible efforts are made to limit drop-out. To this end, candidate interviewers are fully informed beforehand of the actual workload and efforts expected of them, the kind of problems they will face during data collection (e.g. when contacting households) and how much they are paid.

Interviewers active in the HIS 2013 work under the responsibility of GDS; their tasks and their wage (expressed in terms of conducted interviews) will be published as a Royal Decree in the Official State Bulletin.

• GUIDELINE MANUALS

The procedural guidelines for the interviewers are provided in three manuals: (1) contact procedures, (2) an overview of the important features of the (CAPI and PAPI) questionnaires and (3) instructions for using the CAPI application to carry out the structured interviews.

Manual 1: CONTACT PROCEDURES. An important feature of the survey is that the work of the interviewer is standardised as much as possible. This applies not only to the interview itself, but also to the contact procedures (see data management procedure).

Manual 2: CONTENT OF THE QUESTIONNAIRES (CAPI & SELF-ADMINISTERED). Although the bulk of the work of interviewers consists in running through a CAPI questionnaire, it is important that they have an insight into the aim of the questions, the meaning of the concepts, etc. This information is of use to the interviewer if the respondent wants extra information on a specific question or topic. In this context, an overview is also provided of the content of the self-administered questionnaire; although interviewers are not expected to run through this questionnaire (respondents complete the questionnaire themselves), it is useful that interviewers have at least some notion of its content. The last part of the interviewers’ task is to introduce the ICE-survey. In the manual, a short overview is provided of the content of this survey and of what respondents are asked to do (give consent to be re-contacted in the context of the ICE-survey).

Manual 3: INSTRUCTIONS FOR THE CAPI APPLICATION. Some interviewers (working for other surveys of GDS) already have some knowledge and experience of working with CAPI, but for others, the HIS 2013 will be the first survey in which this tool is used. It is therefore important that clear and thorough instructions exist concerning the use of CAPI. While this theme is addressed during the training session, it remains important that interviewers have a manual at hand in case they need clarification or are confronted with a technical problem.

4.4.3. Fieldwork control

Using an online contact sheet enables the GDS to closely monitor the fieldwork, with specific SAS© - programs and fieldwork indicators developed for this purpose:

• Given the overall target of the data-collection phase, namely 10.600 completed interviews by the end of 2013 (with predetermined numbers of interviews in each of the regions), a follow-up of the accrual rate is necessary. Hypothetical accrual projections, compared with the actual accrual rates (in each of the regions and provinces), make it possible to forecast whether the target(s) will be reached on time. If the forecasts are negative, appropriate measures can be taken to ensure that the target(s) are met.

• A follow-up of the accrual rate at the level of each particular group (of 50 interviews assigned to an interviewer) makes it possible to identify outlier interviewers (in both positive and negative terms). For the positive outliers (interviewers with an accrual rate far quicker than that of others), a detailed assessment of their activities, based on information derived from the contact sheets, should be undertaken to estimate whether their work complies with the expected procedure. Negative outliers (interviewers with a substantially lower accrual rate) are contacted by phone to verify why they have a distinct profile. If necessary, IPH and GDS will jointly decide to replace the interviewers that do not work in line with the prescribed procedures.

• Data collected in the context of the contract between IPH and ICE (that is, the households that have agreed to be re-contacted in the context of the ICE survey) need close follow-up.

• Next to this rather quantitative data monitoring, the information present in the contacts database should give an insight into the quality of the interviewers’ work. Again, the basic question here is to check to what extent the procedure interviewers apply is in line with the procedure they should apply. Specific indicators are developed to monitor the contact attempts and to trace interviewers who do not perform the fieldwork as prescribed.

• A supplementary quality control is carried out by means of a short standard postal survey addressed to all participating households. In this survey, specific questions deal with the work of the interviewer: did the interviewer respect the protocol for data collection (home based face to face interview), did he/she explain the objectives of the study and were they clear enough, was he/she pleasant, polite, clear, patient, professional…? If deviations are reported in various households, it can be decided to reject all the completed interviews and dismiss that interviewer.
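The accrual follow-up mentioned in the first item of this list can be illustrated as follows. The protocol does not specify how the hypothetical accrual projection is computed (the monitoring itself is done in SAS), so this Python sketch assumes a simple linear accrual over the calendar year 2013; the function and parameter names are illustrative.

```python
from datetime import date


def accrual_forecast(completed, today, target,
                     start=date(2013, 1, 1), end=date(2013, 12, 31)):
    """Compare the actual accrual with a linear (hypothetical) projection.

    Returns a pair (expected, projected):
    - expected: interviews that should be completed by 'today' if the
      target were reached at a constant daily rate;
    - projected: year-end total if the current average rate is maintained.
    """
    total_days = (end - start).days + 1
    elapsed_days = (today - start).days + 1
    expected = target * elapsed_days / total_days
    projected = completed * total_days / elapsed_days
    return expected, projected


# 14 March 2013 is day 73 of 365, i.e. one fifth of the fieldwork year.
expected, projected = accrual_forecast(2000, date(2013, 3, 14), target=10600)
```

With 2000 completed interviews on 14 March, about 2120 would be expected under linear accrual and the projected year-end total is about 10000, which would signal that the target of 10.600 is at risk. The same comparison can be made per region, province or group of 50.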

4.5. Data management and flow

4.5.1. Data entry procedures

There are two modes of data entry for the HIS 2013 data:

• Data from the face-to-face questionnaire that are collected using the computer assisted personal interviewing (CAPI) approach do not need a specific data entry procedure, as the data are entered during the interview. After the interviews, the data collected are directly uploaded to the central GDS HIS 2013 database without further manipulation.

• Data from the self-administered questionnaire (PAPI data collection mode) require manual data entry. For this, the interviewers are required to send the completed self-administered questionnaires (together with a signed ‘remittance document’) to GDS, where a team of professional typists is responsible for the data entry. The IPH has developed a specific data input programme in the Blaise® application.

The team of typists has to follow a short training session during which the questionnaire and the application are presented. A manual for data-entry operations is provided by the IPH to ensure standard transcription. The data entry process in Blaise® must start shortly after the beginning of data collection; it should end at the latest 1 month after the last questionnaires are received (that is, at the latest on February 1st 2014).

4.5.2. Data transfer to IPH

Data for every participating and non-participating household are transferred from GDS to IPH when all the expected information for a given household has been uploaded to the database, both directly from the CAPI application and from the PAPI application after data entry. Based on the information on the contact form and in the CAPI application, the GDS can verify whether data entry has been carried out for the declared number of self-completed questionnaires.

The data transferred from GDS to IPH include:

• The case identifier (i.e., HIS-specific code for the household and the household members);

• The follow-up para-data (derived from the contact-form) in order to study the quality of the data-collection and to study non-response;

• The survey data from the CAPI and PAPI interviews.

Data transfer from GDS to IPH is enabled by connecting to the FOD-SPF Economy with secured FTP. A first data set is transferred to the IPH at the latest on March 1st 2013 in order to check its quality and consistency. The final dataset is transferred to the IPH at the latest on March 1st 2014.

4.5.3. Record identifier

To identify respondents in the context of IPH-GDS communication, a unique HIS case identifier is devised for each respondent. The case identifier consists of the identification code of the household plus a 2-digit code that refers to the position of the individual within the household. The household code is based on the municipality where the household resides (3-digit number), the quarter in which the household was selected (1-digit number from 1 to 4) and the cluster place (4 digits). A key data set comprising both the unique HIS case identifier at the individual level and the individual code number, which was produced by GDS with an algorithm applied to the National Number of selected households, is kept by IPH.
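As an illustration of this structure, the Python sketch below builds and parses a 10-digit case identifier. The order in which the components are concatenated and the function names are assumptions made for the example; the protocol only specifies the components and their lengths.

```python
def household_code(municipality, quarter, cluster_place):
    """Build the 8-digit household code (component order is an assumption)."""
    if not 0 <= municipality <= 999:
        raise ValueError("municipality must fit in 3 digits")
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3 or 4")
    if not 0 <= cluster_place <= 9999:
        raise ValueError("cluster place must fit in 4 digits")
    return f"{municipality:03d}{quarter}{cluster_place:04d}"


def case_identifier(hh_code, member_position):
    """Append the 2-digit position of the individual within the household."""
    if not 0 <= member_position <= 99:
        raise ValueError("member position must fit in 2 digits")
    return f"{hh_code}{member_position:02d}"


def parse_case_identifier(case_id):
    """Split a 10-digit case identifier back into its components."""
    if len(case_id) != 10 or not case_id.isdigit():
        raise ValueError("case identifier must consist of 10 digits")
    return {
        "municipality": int(case_id[0:3]),
        "quarter": int(case_id[3]),
        "cluster_place": int(case_id[4:8]),
        "member_position": int(case_id[8:10]),
    }


case_id = case_identifier(household_code(12, 3, 45), 2)  # "0123004502"
```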

4.5.4. Creation of a working database

After the data is exported to IPH, program files are created in SAS to generate an initial working database. The programmes deal with:

• input of the data (CAPI – PAPI – contact information)

• monitoring of the field work

• checks of consistency

• creation of 7 SAS data files:

o hh.sas7bdat: information at the household level

o ind.sas7bdat: information at the individual level

o gpcons.sas7bdat: information for each contact with a general practitioner

o spcons.sas7bdat: information for each contact with a specialist

o edcons.sas7bdat: information for each contact with an emergency department

o hosp.sas7bdat: information for each hospital admission

o medicat.sas7bdat: information for each medicine that was taken in the 24 hrs. before the interview

• allocation of labels in SAS in 3 languages (Dutch, French, English); labels in English are also available in a separate excel file

• creation and/or cleaning of general background variables:

o age, gender;

o demographic information: nationality, country of birth, region and province of residence, urbanisation level of municipality of residence, household type;

o socioeconomic information: income, education, occupation;

o housing situation.

• creation of survey weights at the level of the individual, household and contacts with health professionals

All the procedures performed are documented within the programme itself, following specific IPH’s standard operational procedures (SOP).

4.5.5. Controls of coherence

• CHECKS PERFORMED DURING DATA ENTRY

The Blaise® data-entry programme (used for entering the self-completed questionnaire) and the CAPI programme used during data collection (face-to-face questionnaire) contain several mechanisms aimed at the production of quality data. These controls include the following:

• The system follows the logic of the questionnaire. When a question is not applicable, the data entry programme jumps to the next relevant question.

• Some questions only need one answer (only one field for the question in the table); others allow several answers (one field per answer in the table).

• Some answers are accompanied by a free text; data entry in this field is only allowed if answer to previous question meets specific criteria.

• The system only offers a limited number of possible answers. If one tries to insert another value, an alert shows up.

• Several answers use a look-up table, so that the right item is chosen from a list instead of being typed out: countries, professions, activities and medicines.

Prior to the actual data entry, the programme is tested and adapted by members of the HIS team by using mock-up questionnaires.
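The kind of entry-time control listed above can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the Blaise® programme; the question names, permitted values and the routing rule are hypothetical.

```python
# Hypothetical question definitions; names, codes and ranges are illustrative only.
QUESTIONS = {
    "sex": {"allowed": {1, 2}},            # closed question with a limited set of answers
    "age": {"min": 0, "max": 120},         # numeric field with lower and upper limits
    "pregnant": {
        "allowed": {1, 2},
        "ask_if": lambda record: record.get("sex") == 2,   # routing rule (filter)
    },
}


def validate_entry(record, name, value):
    """Return None if the value is accepted, otherwise an alert message."""
    spec = QUESTIONS[name]
    ask_if = spec.get("ask_if")
    # Routing: a question that is not applicable is skipped, so no value is accepted.
    if ask_if is not None and not ask_if(record):
        return f"{name}: not applicable, question is skipped"
    # Limited number of possible answers.
    if "allowed" in spec and value not in spec["allowed"]:
        return f"{name}: {value!r} is not one of the permitted answers"
    # Range check for numeric fields.
    if "min" in spec and not (spec["min"] <= value <= spec["max"]):
        return f"{name}: {value!r} is outside the permitted range"
    return None
```

Calling `validate_entry({"sex": 1}, "pregnant", 1)` returns an alert because the routing rule excludes the question, whereas `validate_entry({}, "age", 45)` returns `None`.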

• CHECKS PERFORMED ON THE WORKING DATASET

After the creation of a working database, two types of data checks are performed to create a set of final clean databases:

1. Vertical checks

Vertical controls are intended to verify, for each household and for each individual within the household, whether the information and records conform to what is expected. The main check is whether each person who participated in the survey according to the contact form and the face-to-face questionnaire has also completed a self-completed questionnaire.


If inconsistencies are noticed, they are verified (e.g. by physical checks by the secretariat of the GDS) and if needed, corrections are made in the input programme. The corrections are documented in the programme.

2. Horizontal checks

In addition to the vertical checks, it is necessary to check the internal consistency of the data. Horizontal controls are intended to verify whether the answer to a specific question is coherent with the rest of the questionnaire. Thanks to the use of CAPI, routing errors and inconsistencies can be avoided, as the system is programmed to guide the interviewer through the questionnaire. Therefore, the data collected via CAPI will not undergo intensive data checks. Depending on the module, some conceptual editing may be needed.

Data collected using the PAPI questionnaires require a specific phase of data checks because in this case errors may have occurred during data entry. In this case, checks for routing errors as well as for inconsistent replies will be undertaken. These controls are performed separately for each module of the PAPI questionnaire.

The procedures for data checks are documented in the SAS programmes archived as:

XX2013.sas (XX for the two letters of identification of each specific module).

The errors identified through these controls are pooled together in one document and sent back to SB in order to have verifications performed (i.e. going back to the paper questionnaires to check whether the reply was correctly encoded). Corrections that are applied are sent back to IPH in a new version of the data file, exported from Blaise®. When corrections cannot be made (meaning that the error cannot be corrected without re-interviewing the individual), the incoherencies are taken into account in the programmes developed for the creation of the indicators.
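The two types of checks described in this section can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the SAS programmes used for the HIS; the identifiers and the smoking-related variables are hypothetical.

```python
def vertical_check(contact_form_ids, face_to_face_ids, self_completed_ids):
    """List participants for whom no self-completed questionnaire was found."""
    # A participant is someone present on the contact form and in the face-to-face data.
    expected = set(contact_form_ids) & set(face_to_face_ids)
    return sorted(expected - set(self_completed_ids))


def horizontal_check(record):
    """Return descriptions of routing errors or inconsistent replies in one record."""
    errors = []
    # Routing error: follow-up question answered although the filter question says "no".
    if record.get("smokes") == "no" and record.get("cigarettes_per_day") is not None:
        errors.append("cigarettes_per_day filled in although smokes == 'no'")
    # Inconsistent reply: reported starting age is later than current age.
    start, age = record.get("age_started_smoking"), record.get("age")
    if start is not None and age is not None and start > age:
        errors.append("age_started_smoking exceeds age")
    return errors
```

The list returned by each function corresponds to the pooled error document: each entry would be verified against the paper questionnaire before any correction is made.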

4.5.6. Storing and archiving

A number of databases are to be created for different purposes. A description of these databases is provided below. These databases are stored on a database server (see point 11). A backup of the database is made every day on a separate hard disk (.bak file), and this file is backed up and archived as another file (see SOP 31/E/007).


• DATABASES FOR FINAL REPORT

A final common database is to be created in which the variables and indicators for all the modules are available and in which data from the previous surveys are included, as far as they are also available in the HIS 2013. The procedures to be followed for the creation of this database are included in an automated programme “OUTPUT2013.sas” in the directory \HIS\HIS2013\database. The procedures are documented within the programme.

As a result, 7 final databases are created:

• hisext2013.sas7bdat: information at the household level

• hh_ext2013.sas7bdat: information at the individual level

• his_gp2013.sas7bdat: information for each contact with a general practitioner

• his_sp2013.sas7bdat: information for each contact with a specialist

• his_ed2013.sas7bdat: information for each contact with an emergency department

• his_hosp2013.sas7bdat: information for each hospital admission

• his_medicat2013.sas7bdat: information for each medicine that was taken in the 24 hrs. before the interview

• DATABASES FOR EXTERNAL USERS

From the final databases mentioned above, datasets are created for external users. In these datasets some variables are removed (e.g. statistical sector, date of birth) or aggregated (e.g. age).

The procedures to be followed for the creation of these datasets are included in an automated programme “EXT2013.sas” in the directory \HIS\HIS2013\database\external. The procedures are also documented within the programme.

The programme will create CSV data files that can be read by all statistical software packages. A codebook in Excel includes all variable and value labels in English. A manual for external users provides all the information needed to understand and use the data.
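As an illustration of the steps described above (removing variables, aggregating age and exporting to CSV), a minimal Python sketch is given here. It is not the content of EXT2013.sas; the variable names, the 10-year age bands and the helper functions are assumptions made for the example.

```python
import csv
import io

# Hypothetical variable names; the actual list of removed variables is not given here.
REMOVED = {"statistical_sector", "date_of_birth"}


def age_group(age, width=10):
    """Aggregate an exact age into a band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def to_external(records):
    """Drop the removed variables and replace exact age by an age band."""
    external = []
    for rec in records:
        out = {k: v for k, v in rec.items() if k not in REMOVED and k != "age"}
        out["age_group"] = age_group(rec["age"])
        external.append(out)
    return external


def to_csv_text(records):
    """Serialise the external records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```

In practice the CSV text would be written to a file in the external directory, and the codebook would document the resulting `age_group` categories.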

All files and documents are saved in the directory \HIS\HIS2013\database\external.


The modalities of the transfer of the data to external parties are described in a document stored in the directory \HIS\HIS2013\database\external\

• DATABASE FOR INTERNET BASED ANALYSES

From the final databases mentioned above, datasets are created to serve as the source dataset for the Internet-based application that allows interactive analyses to be carried out via the HIS website.

The procedures to be followed for the creation of this database are included in an automated programme in the directory \HIS\HIS2013\database\interactive database: hisia.sas. The procedures are documented within the programme itself.

4.6. Data analysis

4.6.1. Indicator development

Data analysis begins with data cleaning and the construction of new variables called “indicators”. These are defined as “variables that indicate certain conditions of interest” and will be used for the final analysis of the data. In some cases indicators are just copies of existing variables (after cleaning), while in other cases they result from the recoding of one or several variables.

The creation of the indicators is performed in a separate SAS programme for each module of the questionnaire. The SAS programme is called:

XX2013.sas (XX for the two letters of identification of each specific module) to be found in \HIS\HIS2013\modules\XX.

All the procedures performed are documented within the SAS programme itself. In addition, when the same indicator is available from previous HISs, these data are added in order to be able to compare the results between the surveyed years. The procedures are included in the same SAS programme and documented within the programme itself. The source variables, a number of socio-demographic background indicators and the specific indicators for that specific module are saved in a data file called: XX2013.sas7bdat to be found in \HIS\HIS2013\modules\XX.
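To make the notion of an indicator derived from several source variables concrete, here is a minimal Python sketch. It is illustrative only and not taken from the HIS SAS programmes; the function and variable names are hypothetical, and body mass index with the conventional cut-off of 25 is used simply as a familiar example of recoding.

```python
def bmi(weight_kg, height_cm):
    """Body mass index from two source variables; None if either is missing."""
    if weight_kg is None or height_cm is None:
        return None
    return weight_kg / (height_cm / 100) ** 2


def overweight_indicator(weight_kg, height_cm):
    """Binary indicator: 1 if BMI >= 25, 0 otherwise, None if BMI is unknown."""
    value = bmi(weight_kg, height_cm)
    if value is None:
        return None
    return 1 if value >= 25 else 0
```

The indicator keeps a missing value when a source variable is missing, so that the record can later be excluded from the analysis of that indicator.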


4.6.2. Plan of analysis

For each outcome indicator, crude and age- and/or gender-standardised rates are calculated. The crude results reflect the true prevalence within a population group. This information is purely descriptive and should be interpreted with caution when comparing sub-populations. The standardised results are based on a mathematical standardisation through a linear or logistic regression and make it possible to compare the indicators across selected background variables while adjusting for age and/or gender.

For both the crude and the standardised results, 95% confidence intervals (CI) are calculated. If the CIs of two groups do not overlap, the corresponding proportions or means can be considered significantly different from each other.
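The overlap rule can be sketched as follows. This Python example is a simplification: it uses the Wald interval for a proportion under simple random sampling, whereas the HIS intervals take the survey design into account (see 4.6.7). The function names are hypothetical.

```python
import math


def proportion_ci(successes, n, z=1.96):
    """Wald 95% confidence interval for a proportion (simple random sampling assumed)."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


def intervals_overlap(ci_a, ci_b):
    """True if two (lower, upper) intervals share at least one point."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]
```

Note that non-overlap is a conservative criterion: two estimates whose intervals overlap slightly may still differ significantly in a formal test.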

The crude and standardised rates are presented by a selected number of background indicators: gender, age group, education level, urbanisation level, region of residence, year of survey. Depending on the outcome, results are expressed as a proportion, a distribution, a mean, a median or other percentiles. The results are presented in a final report including tables, graphs and an explanatory text. They are reported for the whole of Belgium, but also separately for the three regions of the country.

Further exploration of the data varies from one module to another and is described in the concept paper of the concerned module. When relevant, extra analyses are performed including also other variables.

4.6.3. Software selection

Analyses are carried out using SAS version 9.3.

4.6.4. Treatment of missing values

In the database three types of missing values are distinguished: 1) not applicable (if the question is not supposed to be answered); 2) no answer (no information available); and 3) “does not know” (if this option is actively indicated in the questionnaire). For each variable and indicator, details about the codes of the missing values are found in the codebook.

Missing values are not considered in the analysis. A complete case analysis is carried out.
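A complete case analysis with the three types of missing values can be sketched as below. The numeric codes are hypothetical placeholders (the actual codes are given per variable in the codebook), and the Python function is an illustration rather than the HIS implementation.

```python
# Hypothetical codes; the actual codes per variable are listed in the codebook.
NOT_APPLICABLE, NO_ANSWER, DONT_KNOW = -1, -2, -3
MISSING_CODES = {NOT_APPLICABLE, NO_ANSWER, DONT_KNOW}


def complete_cases(records, variables):
    """Keep only records with a valid (non-missing) value for every listed variable."""
    return [
        rec
        for rec in records
        if all(
            rec.get(var) is not None and rec.get(var) not in MISSING_CODES
            for var in variables
        )
    ]
```

Because the selection is made per analysis, the number of complete cases can differ from one indicator to another.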


4.6.5. Programming

A number of programs are developed for the data management and the analyses:

1. the program to construct a cleaned global dataset (the raw dataset is received from GDS); this program includes also the construction of the survey weights;

2. the programs dealing with the analyses of the methodological aspects of the HIS (description of population, participation rate, etc.);

3. the programs dealing with data cleaning and construction of indicators for the individual modules;

4. the programs to produce crude and adjusted results, trend analyses and more advanced analyses

5. the programs constructing the final databases used for the interactive analyses via the website and the final databases for external users

1. PROGRAM TO CONSTRUCT THE CLEANED GLOBAL DATASET (INCLUDING WEIGHTS)

A SAS program takes as input the data from the Blaise® data entry programme (received from the GDS) and produces the cleaned global HIS 2013 dataset, which consists of 8 databases: one database with data at the individual level, one database with data at the household level, one database with data on the medicines consumed in the past 24 hours and 5 databases with contacts with health services (GP, specialist, emergency department, day patient and inpatient hospital admissions).

The program consists of the following steps:

• Data input from GDS files

• Construction of new variables at household level (e.g. region, province, degree of urbanization, district, status of household, education at household level, equivalent income of household)

• Construction of new variables at individual level (e.g. age groups, living condition of elderly, nationality)

• Correction of inconsistencies

• Calculation of survey weights

A technical document describing the calculation of the survey weights in the HIS 2013 is presented in Annex 5.


2. PROGRAMS DEALING WITH ANALYSES RELATED TO THE METHODOLOGICAL ASPECTS

A set of programs deals with methodological aspects of the HIS such as:

• Calculation of time needed to complete the oral questionnaire

• Reason for use of a proxy

• Homogeneity of the households belonging to the same cluster

• Description of households by participation status

• Reasons for non-participation

• Description of the sample by background characteristics

3. PROGRAMS DEALING WITH DATA CLEANING AND CONSTRUCTION OF INDICATORS FOR THE INDIVIDUAL MODULES

These analyses are done by module. For each module the program consists of the same steps:

• Input of relevant variables of HIS 2013 database

• Input of comparable data of HIS 1997, 2001, 2004 and 2008

• Data cleaning (correction of inconsistent data), allocation of missing values, formats

• Calculation of the indicators as described in the concept paper

• Computation of database to be used as input file for the basic tables (see next point)

• Computation of basic tables with the crude and adjusted results by background characteristics

• Computation of crude and adjusted results by background characteristics for the province of (oversampling)

• Computation of data files to be used as input for the program constructing the interactive database

• Computation of data files to be used as input for the program constructing the database for external users


4. PROGRAMS TO PRODUCE CRUDE AND ADJUSTED RESULTS, TREND ANALYSES AND MORE ADVANCED ANALYSES

Macros are developed for the following types of analyses or outputs:

• Basic tables for the final report with crude and adjusted results (+ 95% confidence intervals) for percentages and means by background characteristics

• Graphs presenting percentages or means by age and sex with 95% confidence intervals for the final report

• Trend analyses for Belgium and each of the regions

• Analysis of social inequalities in health

5. PROGRAMS CONSTRUCTING THE FINAL DATABASES USED FOR THE INTERACTIVE ANALYSES VIA THE WEBSITE AND THE FINAL DATABASES FOR EXTERNAL USERS

Input files for those programs include both the cleaned global database and the databases with indicators that are produced for each module.

4.6.6. Non-response analysis

Non-response analysis is performed on the variables received from the National Register, which are therefore available for all contacted households. These variables include age, sex, household composition, nationality, and place of residence.

4.6.7. Inference methods

Standard estimation of the population parameters mentioned above and their associated variances is based on assumptions about the distribution of the observations in the sample: that the observations were selected independently and that each observation had the same probability of being selected. However, the HIS violates both assumptions, as it uses a stratified multistage clustered sampling procedure. For logistical reasons the selected households are clustered geographically (per municipality), and within each household a sub-sample is also taken. As a result, sample units are not selected independently, nor are their responses likely to be independently distributed. Additionally, selection probabilities are unequal because of the regional stratification.


To obtain representative results, both at the national and the regional levels, it is necessary to account for the complex sampling design. Correct estimates and valid inferences can be obtained by re-weighting the data, inversely proportional to the selection probability. A weighting factor is calculated that combines different aspects of the study design into a single, compound weight, thereby simplifying the computations (A detailed description of the construction of the survey weights is found in Annex 5).
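The principle of re-weighting inversely proportional to the selection probability can be sketched in a few lines of Python. This is a simplified illustration with hypothetical function names and made-up probabilities; the actual HIS weight combines several design aspects and is described in Annex 5.

```python
def design_weight(selection_probability):
    """Weight inversely proportional to the probability of being selected."""
    return 1.0 / selection_probability


def weighted_proportion(outcomes, weights):
    """Weighted estimate of a proportion from 0/1 outcomes."""
    total_weight = sum(weights)
    return sum(w * y for y, w in zip(outcomes, weights)) / total_weight
```

For example, a respondent selected with probability 0.001 receives a weight of 1000 and a respondent from an oversampled area selected with probability 0.002 receives a weight of 500, so the first respondent counts twice as much in the estimate.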

In addition to the weights, the inference procedures should also account for the clustering and stratification of the sampling procedure in order to obtain correct variances and standard errors (and thus also confidence intervals). Clustering decreases the precision of the estimates and hence yields wider confidence intervals. Stratification has in most cases the opposite effect: standard errors and confidence intervals usually become somewhat smaller when the stratification of the sample is taken into account.
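The loss of precision due to clustering is often summarised by a design effect. The Python sketch below uses the common Kish approximation, deff = 1 + (m - 1) × ρ, where m is the average cluster size and ρ the intra-cluster correlation. This approximation is an illustration and not necessarily the method applied in the HIS; the function names and the example values are hypothetical.

```python
import math


def design_effect(avg_cluster_size, intra_cluster_correlation):
    """Kish approximation of the variance inflation caused by clustering."""
    return 1 + (avg_cluster_size - 1) * intra_cluster_correlation


def adjusted_standard_error(srs_standard_error, deff):
    """Standard error after inflating the simple-random-sampling value by sqrt(deff)."""
    return srs_standard_error * math.sqrt(deff)
```

With clusters of 50 respondents and an intra-cluster correlation of 0.02, the variance is inflated by a factor of about 1.98, so the confidence interval is roughly 1.4 times wider than under simple random sampling.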

• INTERNAL AND EXTERNAL VALIDATION

Statistical experts are consulted to ensure the validity of the methods, calculation of weights, etc. The programmes for both the data cleaning and the analysis are created by the HIS researcher responsible for the given module and systematically verified through internal peer review. Internal and external quality checks are also performed for the macros that are produced for standard analyses.

• PRESENTATION OF RESULTS

The final report, with tables presenting crude and adjusted results for all indicators, is available as a PDF on the HIS website. Results are presented for Belgium and for each of the three Belgian regions separately.

Via HISIA, the interactive web-based analysis tool, it is possible to generate tables with crude results (and 95% CI) by background variables selected by the users themselves (see section 12).

4.6.8. Methodology used to improve the quality control of data management

As stated throughout this protocol, various safeguards are implemented at different stages of the HIS 2013. The quality assurance procedures include “preventive actions” such as guideline manuals, training and testing (e.g. for data collection and data entry, for data analysis by external users, etc.), as well as “control actions”, such as consistency checks implemented in the computer-based programs used for the data entry (CAPI), metadata analysis for the fieldwork management, etc. At the level of data collection and management, the following quality assurance procedures are carried out:


QUESTIONNAIRES

• CONTENT: The questions and questionnaires intended for each particular module of the HIS are discussed with scientific and academic experts, members of the health agencies and administrations and fieldwork experts.

• TRANSLATIONS: Questionnaires and their translations (4 languages) are double-checked by the native-language researchers of the HIS team.

• PRE-TEST: The HIS 2013 questionnaires (CAPI and PAPI) are pre-tested in a small (N=65) but diversified sample of people (gender, social background, age) by the HIS team members themselves. Different features are evaluated, such as the length of the questionnaire and the time to fill it in, the comprehension and the readability of questions, the completeness of response categories, the pathway and skips through the questionnaire, etc. Adaptations are made according to results.

INTERVIEWERS

• TRAINING: the candidate-interviewers from the GDS join a full-day training session that includes both a theoretical part (background objectives, fieldwork procedures, content of the questionnaires) and a practical part (on their personal computers).

• MANUAL: the interviewers are provided detailed guidelines containing all aspects necessary to execute the survey correctly. The manuals contain instructions regarding: (1) the contact and fieldwork procedures; (2) the content of the questionnaires; (3) the use of the CAPI application.

• FIELDWORK MONITORING: First, the secretariat established at the GDS supervises the work of the interviewers in terms of schedule and rules for the "contact forms", number of households contacted or interviewed, etc. If a problem is detected, the secretariat contacts the interviewer to see how it can be solved or whether the interviewer needs to be replaced. The GDS also provides a helpdesk for the interviewers. Second, weekly counts of the number of completed interviews and refusals per region, stratum, etc. are sent to the IPH to make sure the progress of the survey meets the objective of 10,600 interviews by the end of a 12-month period. Action is taken whenever problems are detected.

• QUESTIONNAIRE CHECKS: The interviewers send the completed questionnaires (CAPI and PAPI) of participating households to the GDS where a first check is performed at reception. The employees verify the number of questionnaires received (regarding the expected, as noted on the contact form) and the code numbers of the household members.

• CONTROL OF INTERVIEWERS: a quality control questionnaire regarding the work of the interviewers is sent to all participating households together with the €10 voucher for participation. The questionnaire is returned to the GDS in a prepaid envelope. Each returned questionnaire is checked for complaints or non-conformity and encoded in an Excel database. If a problem is identified, the interviewer is contacted to talk it over. Depending on the problem, either the right procedure is re-explained or, if justified, s/he is dismissed from the HIS.
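The weekly progress check mentioned under fieldwork monitoring can be illustrated with a small Python sketch. It assumes, purely for illustration, that interviews accrue linearly over 52 weeks and that a shortfall of more than 5% counts as a problem; neither assumption comes from the protocol, and the function names are hypothetical.

```python
TARGET_INTERVIEWS = 10600   # objective stated in the protocol
FIELDWORK_WEEKS = 52        # assumption: the 12-month period is treated as 52 weeks


def expected_by_week(week):
    """Cumulative number of interviews expected under linear accrual."""
    return TARGET_INTERVIEWS * min(week, FIELDWORK_WEEKS) / FIELDWORK_WEEKS


def behind_schedule(completed, week, tolerance=0.05):
    """True if completed interviews fall more than `tolerance` below expectation."""
    return completed < (1 - tolerance) * expected_by_week(week)
```

The same comparison could be made per region or stratum by replacing the overall target with the target of that subgroup.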

DATA ENTRY

• MANUAL: Regarding the PAPI data entry, guidelines are available for the data entry operators.

• PROGRAM CONTROLS: Control procedures are built into both the CAPI application for data collection and the data entry program developed for the PAPI questionnaire. They prevent a series of inconsistencies. For instance, the program is constructed in such a way that the data entry is ‘guided’: every question has a variable field, and every variable for which a value should be entered has an upper and a lower limit, making it impossible to enter values outside the specified range. The data entry program blocks or masks entry fields on the basis of information obtained from linked variables (e.g. jumps due to gender: it is impossible to enter data for men in question fields designed for women).

• PRE-TEST: The data entry programs (the CAPI and the PAPI data entry) are pre-tested by HIS team members; they are then set on the GDS network for a try-out period. Errors or inconsistencies found in the programs are notified to the programmer at the IPH by e-mail and are corrected.

DATA MANAGEMENT

See point 4.5.5 for details.

• VERTICAL CONTROL: Once the data are entered in the database, a series of quality controls are performed. During the vertical control, it is checked whether all information is available for all activated households.

• HORIZONTAL CONTROL: Next to the vertical controls, it is necessary to control for internal consistency of the data (through SAS programs developed by HIS researchers per module). Inconsistencies in the data entail going back to the paper-questionnaires for verification. Inconsistencies in data may arise from errors due to the respondent or errors introduced during data entry. Inconsistencies due to the respondents are treated during the analysis (statistical programs for data cleaning). Inconsistencies due to data entry (discrepancy between questionnaires and encoded data) are corrected in the database.

• DATA ANALYSIS: As stated above, respondent's contradictions or interviewers' mistakes are dealt with in the data cleaning process before the statistical analysis. The programs for both the data cleaning and the analysis are created by the HIS researcher responsible for the given module and systematically verified by another designated HIS researcher.


• RESULTS: The tables of results and the explanatory texts produced for a module (= draft version) undergo internal as well as external review. Internal review is done within the HIS team, while external experts in a given domain carry out the external review of the module(s).

4.7. Data security measures

Both institutes involved in the processing of the HIS 2013 data, namely the IPH and GDS, have a strong policy regarding data protection. This is of particular concern for health and health-related information, which is regarded as highly sensitive. The institutes take additional precautionary measures, such as designating the persons who have access to the data and specifying their function with respect to the data processing.

4.7.1. GDS

The IT and computer system at the GDS is well secured, and improvements are still on-going. Moreover, the following means (among others) have been implemented at different levels to safeguard the privacy of the HIS data:

• SAMPLING: Identification parameters that come from the national register (national number, names, and addresses) are not transferred to the IPH or to any third party. These data are exclusively used within the GDS for sampling matters and for carrying out the home-based interviews, and are destroyed right after the data collection phase.

• ANONYMIZATION: The national number of the sampled individuals is immediately changed into arbitrary codes through an algorithm called “secret key”. The table of correspondence between the national number and the arbitrary codes (“logical key”) is kept at the GDS only for the duration of the data collection phase and is destroyed right after. The secret key is kept on a secured record at the GDS (as “trusted third party”, under the responsibility of the data protection delegate) in order to be able to reconstruct the national number in case a linkage is performed between the HIS database (kept at IPH) and specific administrative registers (e.g. mortality register, social security database) in the framework of a scientific research program. The IPH is the holder of the full HIS database that can be linked with the arbitrary codes, while the GDS is to erase all trace of the data as soon as they are transferred to IPH.


• ENCRYPTION: The confidential data collected through the HIS are stored at the GDS in an encrypted dataset (with the GNU Privacy Guard program) and are transferred to IPH via the Secure File Transfer Protocol (SFTP).

• CONFIDENTIALITY: All personnel working with private data at the GDS are bound by a confidentiality clause in their work contract and are properly informed about the protection of personal information. Internal GDS workers who are directly involved with private data fall under the act of statistical secrecy. By law, they are prohibited from disclosing any information that could identify individuals without their knowledge and consent. All employees must take an oath of secrecy.
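The protocol does not specify the algorithm behind the “secret key” mentioned under ANONYMIZATION. As a generic illustration of key-based pseudonymisation, the Python sketch below derives an arbitrary code from an identifier with a keyed hash (HMAC-SHA256) and builds a temporary table of correspondence. The technique, the function names and the 16-character code length are assumptions made for the example, not the GDS procedure.

```python
import hashlib
import hmac


def pseudonym(national_number, secret_key):
    """Derive a stable arbitrary code from an identifier using a keyed hash."""
    digest = hmac.new(secret_key, national_number.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def correspondence_table(national_numbers, secret_key):
    """Temporary lookup from arbitrary code back to the original identifier."""
    return {pseudonym(number, secret_key): number for number in national_numbers}
```

In this sketch, the holder of the key can recompute the code for any identifier found in an administrative register and so link records, even after the table of correspondence has been destroyed; anyone without the key cannot.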

4.7.2. IPH

The regulation of the ICT service and system conforms to the ISO 9001 and ISO 17025 norms for which IPH has obtained certificates. The general organisation of the computer service at the IPH is described in the SOP 70/FN/01. A number of physical and electronic security measures to prevent unauthorized access to confidential information are implemented at the IPH. These include:

• Physical security measures control the access to IPH offices.

• Only employees who need to work with the HIS data files have access to them. These individuals are identified and listed.

• Data can only be accessed via a login and password personally dedicated to authorized employees. Access to data can be traced and controlled.

• The researchers who are authorized to access and analyse the HIS data are bound by an oath of secrecy and have signed a confidentiality contract for all information that falls under the law of December 8th 1992 regarding the protection of personal data.

• When publishing the HIS results, only statistical aggregates are released to the public in a way that the participants can never be identified.


4.8. Software development

4.8.1. CAPI for the household and the face-to-face questionnaires

This application is developed to interview the respondents and directly encode the answers of the household and the face-to-face questionnaires of the HIS 2013.

• TECHNICAL CHOICE

The CAPI application was developed with the software package Blaise® because it is considered a powerful and flexible system. In addition, this software package was already used for the development of the data entry programs in the HIS 2008 and, importantly, the GDS also uses it for CAPI interviews, so expertise with this package exists in both institutes. A trial with the open source survey software tool Limesurvey carried out in 2010 revealed its limits for a large-scale survey: it was slow, less flexible (no programming, only ‘predefined menus’) and better suited to smaller surveys.

The specific Blaise® 4.8.2.1700 version is used because it is already installed on the interviewers’ netbooks (with touchscreen) working for other surveys carried out at the GDS.

• DESCRIPTION: The Blaise Developer software is installed on the remote computer ‘BlaiseDev’ (Remote Desktop Connection). Through its high-speed data-entry and management capacities, Blaise® is suitable for large surveys such as the HIS. The tool is user-friendly for both the developer and the end user (interviewer).

Data entry can be easily managed by applying specific rules for filters (e.g. some HIS questions are not allowed for proxy interviewees, some modules target a specific sub-group of respondents), jumps (depending on the response to a defined question, certain subsequent questions can be skipped), type of questions (open, semi-open, closed questions, multiple response), limits on the number of values, language switching during the interview (through a menu or with a keystroke), and mouse, pen and touch-screen support. These rules are very useful to limit the number of errors when entering the data during the interviews or when entering the PAPI data, thereby improving the quality of the database.

The interfaces can be modified according to the house style. The fonts and font sizes for questions, response text, and entry cells can be adapted. Guidelines and instructions for developing a CAPI questionnaire are provided by GDS.

The CAPI application supporting the household and the face-to-face questionnaires is developed at the IPH. After testing, it is integrated to the CAPI application of the contact form at the level of GDS, on the interviewers’ netbooks.


The interviewers are to send the data collected by means of the questionnaires to the GDS on a regular basis via an online connection. The GDS then exports these data in an adequate format. Preliminary databases are useful for consistency checks and the preparation of the final analysis programs. The database generated in Blaise® has a specific format, but extraction in other formats is possible (e.g. TXT). The generation is done in two steps: 1) extraction of the data and 2) extraction of the variables in a SAS, SPSS or other format. The advantage of extracting the data in a SAS format is that not only the variables and their values are recovered, but also the questions.

• SCHEDULE:

1. May 2012: testing the CAPI program based on the HIS 2008 by the HIS team members and optimizing it afterwards by GDS;

2. August 2012: adapting the CAPI test version with the questionnaire of the HIS 2013 by IPH, if possible in the four languages (French, Dutch, German and English) and testing by the HIS team members;

3. September 2012: integrating the final version in the GDS system and testing;

4. November 2012: uploading the application on the netbooks by GDS and executing ‘Real life’ tests;

5. December 2012: CAPI programme fully operational.

4.8.2. Data entry program for auto-questionnaire

This data entry program is developed in Blaise® based on the same principle as the CAPI application (see previous point). A Dutch and a French version are available.

4.8.3. Web application HISIA

After the reporting of the HIS results, the IPH offers external users the possibility to explore the data in individualised analyses. To this end, a dynamic website tool is made available that computes a wide range of indicators by different background variables. The website tool is called “Health Interview Survey Interactive Analysis (HISIA)”:

https://www.wiv-isp.be/epidemio/hisia


• OBJECTIVE: The objective is to develop a flexible web-based application in order to facilitate access to and enhance the use of the statistics generated from the HIS performed in Belgium, for scientific research and public health policy, but also for a broad range of potential users.

• PRINCIPLE: The principle is to rationalise the statistical programs for calculating health indicators and run them automatically through parameterisation on the Internet in an interactive way.

• GENERAL DESCRIPTION: SAS® IntrNet (on a SAS BI (Business Intelligence) server; later on with Stored Processes) was used to create this web-based application. The technical implementation procedures are described in Annex 6.

From a practical point of view, the user first chooses the necessary parameters (background variables, such as year of survey, region, gender, age,…) via the menu system on the Internet browser. The selected parameters pass from the html-form to the SAS  system, where the corresponding databases and statistical procedures are invoked via macros. Finally, the browser displays the generated results ( dynamic reports ).

[Figure: HISIA workflow. Menu → Databases and statistical procedures → Dynamic reports]
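The parameter flow described above can be illustrated with a minimal sketch. HISIA itself runs SAS macros; the Python below is only an illustration of the principle, and the record layout, the field names (`year`, `region`, `gender`, `smoker`, `weight`) and the function `indicator_table` are hypothetical, not taken from the HISIA implementation.

```python
from collections import defaultdict

# Hypothetical micro-data: one dict per respondent (illustrative values only).
RECORDS = [
    {"year": 2013, "region": "Brussels", "gender": "M", "smoker": 1, "weight": 1.0},
    {"year": 2013, "region": "Brussels", "gender": "M", "smoker": 0, "weight": 3.0},
    {"year": 2013, "region": "Flanders", "gender": "F", "smoker": 1, "weight": 2.0},
    {"year": 2008, "region": "Flanders", "gender": "F", "smoker": 0, "weight": 2.0},
]


def indicator_table(records, indicator, by, filters=None):
    """Weighted percentage of a 0/1 indicator, broken down by background variables.

    `filters` plays the role of the menu choices (e.g. year of survey);
    `by` lists the background variables that define the rows of the report.
    """
    filters = filters or {}
    numerators = defaultdict(float)
    denominators = defaultdict(float)
    for rec in records:
        # Skip records that do not match the parameters chosen in the menu.
        if any(rec[name] != value for name, value in filters.items()):
            continue
        key = tuple(rec[name] for name in by)
        denominators[key] += rec["weight"]
        numerators[key] += rec["weight"] * rec[indicator]
    return {key: 100.0 * numerators[key] / denominators[key] for key in denominators}


if __name__ == "__main__":
    report = indicator_table(RECORDS, "smoker", by=["region"], filters={"year": 2013})
    for key, percentage in sorted(report.items()):
        print(key, f"{percentage:.1f}%")
```

The design point is that one generic routine, driven only by the selected parameters, replaces a separate statistical program per indicator and breakdown.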


4.9. Website

The main page of the IPH website ( https://www.wiv-isp.be ) does not directly refer to the HIS. One can access the general description of the HIS via the hierarchical way, by selecting the Operational Direction (OD) ‘Public Health and Surveillance’ and then selecting the program ‘Surveys, lifestyle and chronic diseases’, or via the direct link of the HIS:

In English: http://www.healthsurvey.be/

In French: http://www.enquetesante.be/

In Dutch: http://www.gezondheidsenquete.be/

The complete HIS website was updated in May 2013 and drawn up in the house style of IPH, using the software package Sharepoint and the templates proposed by ICT. The HIS team has obtained the rights to manage the HIS website so that regular updates can be made without the intervention of ICT.

On this HIS website, one can consult the background information regarding the Belgian HIS (aim, funding and collaborations), the methods applied (sampling, fieldwork, content, questionnaires and protocol), the outcomes (reports, publications, interactive analysis and access procedures to micro data), specific information for the selected households or persons (folder, poster and FAQs), and finally, information for the press and contact information.

4.10. Procedures for external users

One of the aims of the HIS is to deliver a database which can be used for further research, either by the commissioners of the survey, or by the scientific community. The data of the HIS 2013 are coded data and subject to the privacy legislation (Law of 8 December 1992, amended by the Law of 11 December 1998). Hence, the data can only be transferred to third parties via a procedure through the Sectorial Committee Health and Social Security of the Privacy Commission. More detailed information about the health survey, the data (use and overview of the variables) and the procedure can be found on the website of the HIS (outcomes – access to micro data): http://www.healthsurvey.be/


Scientific Review

5.1. Scientific Steering Committee

In the context of the HIS 2013, a Scientific Steering Committee (SSC) is set up to assure the scientific follow-up of the survey. The role and tasks of the SSC are detailed in the Agreement Protocol signed by the Commissioners of the HIS. Its main task is to advise the Commission of Commissioners on scientific issues related to the survey. To ensure its independence, neither members of the Commission of Commissioners nor IPH collaborators can be members of the SSC. Meetings of the SSC should be organised twice a year during the three years of the project. After each meeting, the minutes are sent to all Commissioners and members of the HIS team.

5.2. Working groups

In defining the content of the HIS 2013 questionnaires, working groups can be created to discuss the content of modules that require extended revision. Experts on specific domains are invited to ‘brainstorm’ and to give their scientific input in defining the content of the module. The composition of such working groups, the items that are discussed and the regularity of their meetings depend entirely on the specificities of the module (some modules will be kept unchanged, some modules need an in-depth revision, some new modules will be added to the questionnaires).

5.3. On-demand scientific review

The HIS is a complex survey involving various scientific disciplines. Despite detailed planning, unexpected problems can emerge during its execution, and new research orientations or unpredicted findings arising from the analysis of the data can require the input of the scientific community. Thanks to the many formal and informal contacts the HIS team has built since the start of the survey, scientific advice can easily be solicited.


Organisation of the research project

6.1. Starting and completion date

The project consists of three different phases:

• Preparatory phase: 01.01.2012 – 31.12.2012.

• Fieldwork phase: 01.01.2013 – 31.12.2013. If the sample of 10.750 interviews is not reached in this period of time, the fieldwork phase is extended beyond this date. However, thanks to a close monitoring of the fieldwork (weekly analysis of para-data), strategies are foreseen to ensure that the data collection is limited to this 1-year time span.

• Analysis and reporting phase: 01.01.2014 – 31.12.2014.


6.2. Timetable


6.3. Subcontracting

6.3.1. Subcontract GDS

Activities related to the HIS 2013 data collection are subcontracted to the GDS. In this context, a contract has been established and signed – after approval by the juridical department of IPH – between the general directors of IPH and of GDS (see Annex 2). In this contract, the tasks and responsibilities of GDS regarding the HIS data collection are specified.

The on-going ‘state of the art’ regarding the fieldwork progression, based on the results obtained from the “data collection control procedures” (para-data), is discussed during weekly meetings between the GDS and IPH delegates. In case a problem occurs, a solution is sought in mutual agreement. Meetings between the GDS and IPH project leaders are scheduled on a monthly basis, where a more general picture of the fieldwork progress is discussed. Both kinds of meetings assure that IPH has a clear view of the data collection phase and is able to verify whether all GDS subcontracted activities are being carried out according to the contract.

6.3.2. Collaboration with ICE

The “Interuniversitaire Cel Epidemiologie” (ICE) plans to re-contact a subsample of households that have participated in the HIS 2013 survey, in order to conduct a survey regarding dental health. This survey includes both the completion of an auto-questionnaire and a clinical exam of the mouth. The goal is to reach a subsample of 3,000 HIS 2013 participants to take part in the ICE survey. In this respect, a contract has been signed – after approval by the juridical department of IPH – between the director of IPH and the director of the ICE-group. In this contract, the involvement of IPH in the context of the ICE-survey is detailed.

Practically, after having carried out the HIS data collection, the interviewers ask the selected members of the participating households whether they agree to be re-contacted in the context of the ICE-survey. If so, a CAPI-based ‘information form’ is completed. In this form, the household members indicate that they agree to be re-contacted in the context of the ICE survey, and give their phone number, e-mail address and best time for being reached. It is important to stress that this form is not an informed consent. It only reflects the agreement of households to be re-contacted by an ICE operator. This information form is sent out together with the HIS CAPI questionnaires to GDS on a regular basis. GDS uploads this info on a server accessible for the ICE-research team on a weekly basis. ICE can consult and use the contact-information for the ICE-survey. ICE has no access to the HIS 2013 data.


6.4. Resources

6.4.1. Team

| First name | Last name | Qualification | Full-time equivalent (%) | Function |
|---|---|---|---|---|
| Rana | Charafeddine | Scientist | 100 | Scientific team member |
| Sabine | Drieskens | Technician | 100 | Scientific team member |
| Stefaan | Demarest | Sociologist | 100 | Scientific team member |
| Hubert | De Krahe | Secretariat | 10 | Administrative |
| Lydia | Gisle | Psychologist | 100 | Scientific team member |
| Monique | Schoonenburg | Secretariat | 80 | Logistic team member |
| Jean | Tafforeau | Medical doctor | 20 | Study director |
| Johan | Van der Heyden | Medical doctor | 100 | Scientific team member |

Training of team members is foreseen in the IPH evaluation schemes on a permanent basis.

6.4.2. Availability of space, funds and material

• SPACE Besides the offices needed to host the team members (at IPH), it is imperative to have enough space for storing the documents used for the survey (information folders, paper questionnaires, envelopes, etc.) and for storing the survey questionnaires when they are sent back by the interviewers (at GDS).

• FUNDING The Belgian HIS is funded in the framework of the interministerial conference, where the health ministers of the different authorities at federal and regional level are represented. The survey is funded on the basis of an Interministerial Agreement protocol (Annex 1) published in the Monitor (Official Journal).

The funding plan covers a period of three years: one year for the preparation of the survey, one year for the data collection and the third year for the data analysis.


• MATERIAL As the survey is mainly based on interviews (rather than medical examination), little material is needed. However, starting from 2013, it has been decided to switch from paper and pencil to computer-assisted face-to-face interviews to collect most of the HIS data. It is thus necessary to provide a portable computer to each of the 200 interviewers in charge of the fieldwork.

Here also, due to the subcontract with the GDS, it is not necessary for IPH to take care of all those portable computers.

6.4.3. Budget plan

A detailed budget plan was prepared and is available upon request.

Risk and benefits for participants

7.1. Participation risks and benefits

It is essential to ensure that research participants are not harmed physically or psychologically during the conduct of research.

The risk of harm to those who take part in an epidemiological investigation like the HIS is very limited. The greatest risk to individuals could be caused through the disclosure of personal data.

Likewise, participants gain no direct personal benefit from participating in the HIS. Selected individuals are correctly informed about the research purpose, subject and procedures before their consent to participate is obtained or their refusal is respected. In recognition of their participation, households are rewarded with a 10-euro voucher that they can exchange in an extensive range of shops, and this information is disclosed in the invitation letter.

• INFORMATION TO PARTICIPANTS AND CONSENT A formal written and signed consent is unnecessary if the research is carried out in settings that pose no threat to the potential participants, when it is stated that taking part is voluntary and it is obvious that no benefits are at risk of being lost if potential participants refuse to take part. This situation arises in studies such as the HIS, based on questionnaires or interviews where providing the data involves giving de facto consent.

- Basically, households that are selected to participate in the HIS receive notification about the survey, its practical organization, the institution in charge, the commissioners of the survey and the contents via a letter and an information leaflet personally addressed to them².

- It is clearly stipulated that participation is voluntary. An e-mail address, Internet website and free telephone number are clearly indicated on the leaflet if selected households need further information or want to withdraw. Potential participants can also ask more about the survey or refuse to participate at the moment the interviewer contacts them to ask their consent and make an appointment.

- As stated in the leaflet, a summary of results is communicated to participating households to express gratitude for having taken part in the survey.

• RESPECT OF CONFIDENTIALITY OF THE COLLECTED DATA AND THEIR MANAGEMENT Measures are taken in the HIS to ensure that participants cannot be identified and that their responses remain confidential. These measures are described in point 4.7 “data security measures” of the present document. Other protection measures include the following:

- Interviewers sign a confidentiality clause inserted in their contract. Interviewers are forbidden to disclose any information gathered during the interview or make any general inference on the basis of the collected information.

- The employees responsible for encoding the HIS data from the self-administered questionnaires at the GDS are bound by Decree (M.B. 20.07.62 §2) to an obligation of professional secrecy concerning personal information.

- IPH researchers do not have access to the name, address or national register number of selected households, which are kept at the secretariat of GDS for fieldwork management. Only coded data are transferred to the IPH researchers.

- Researchers have an obligation of confidentiality by contract or by status. Besides, only the researchers involved in the HIS team have access to the data (protected network location).

- External researchers who purchase the coded HIS database (in a restricted or aggregated form) to carry out scientific analysis sign a bilateral contract that stipulates a confidentiality clause as well as the obligation to refrain from identifying individuals or transferring the dataset to a third party.

• COMMUNICATION OF RESULTS

All results are divulged in a format (tables of statistics, graphs) that impedes any recognition of individuals (See reports on website: www.healthsurvey.be).

2 HIS\HIS2013\Promotion - Press\Materiel de promotion


7.2. Legal instances

7.2.1. Ethical Committee

The HIS 2013 research protocol was submitted to the Ethical Committee of the University of Ghent, which gave a positive advice on 1/10/2012.

7.2.2. Privacy Commission

The Privacy Act is the Act of 8 December 1992 on the protection of privacy in relation to the processing of personal data. This Act aims to protect individuals against abuse of their personal data. The rights and obligations of the individuals, whose data are processed, as well as the rights and obligations of those processing the data have been established by the Privacy Act. With the Privacy Act, an independent supervisory authority was also established: the “Privacy Commission”, which is the Belgian Data Protection Authority. On the basis of the Privacy Act, this Commission, as an independent body, ensures that personal data are used and protected with due care, so that individuals’ privacy remains safeguarded.

Two separate requests for carrying out the HIS 2013 were addressed to the Privacy Commission, more specifically to the:

1. National Register Sectorial Committee: request regarding the use of the National Population Register as sampling frame for the HIS and of the individual National Numbers for further possible linkage of the HIS data with other sources of data (kept by a trusted third party). The authorisation to use the National Population Register for sampling in fieldwork practicalities is found on the Commission’s website³ under the reference Nr 88/2012 on 7 October 2012.

2. Sectorial Committee Social Security and Health: request regarding collection and treatment of personal data for scientific and statistical purposes. The recommendation of the Sectorial Committee Health regarding the HIS data management bears the reference Nr 12/03 dated 20/11/2012.

3 http://www.privacycommission.be/sites/privacycommission/files/documents/d%C3%A9lib%C3%A9ration_RN_88_2012_0.pdf


Property rights of study material and results

All data collected by means of the HIS 2013 is the property of the Commission of Commissioners, as mentioned in the Inter-Ministerial Agreement Protocol (Annex 1).

The results that stem from the analysis of the HIS data produced at the IPH are first communicated to the Commissioners (owners of the data) by means of reports.

Results are then communicated to the public through press conferences and the media, and by placing the reports on the HIS Website. Participants personally receive a summary of main results.

Finally, results are presented to the scientific community through peer reviewed publications and conferences.

Client satisfaction

9.1. Definition of the Clients

The clients of the survey are:

1) THE COMMISSION OF COMMISSIONERS

The Commission of Commissioners is the sponsor of the HIS. All aspects of the survey (methodology, content, analysis, report) are discussed and approved by the Commission. The tasks of the Commission are described in the Agreement Protocol (Annex 1).

2) THE PROVINCES THAT HAVE SPONSORED AN OVERSAMPLING

Some provinces can sponsor an oversampling, but they are not considered official sponsors and are not represented in the Commission of Commissioners. In the HIS 2013, this is the case only for the province of Luxemburg.

3) THE ICE

In the context of the HIS 2013, HIS interviewers ask participating households if they agree to be re-contacted in the context of the ICE survey. In the IPH – ICE contract no end-terms are defined (that is, no target is set for the number of HIS participants expected to agree to be re-contacted in the context of the ICE-survey).


9.2. Contacts with the Clients

1) THE COMMISSION OF COMMISSIONERS

Contacts with the Commission are made through meetings. There is no set schedule for these meetings, but there should be a minimum of two per year. In between meetings, the members of the Commission receive via e-mail the information they need to keep an up-to-date view of all aspects related to the survey. If necessary, any member of the Commission can ask for supplementary information.

2) THE PROVINCES THAT HAVE SPONSORED AN OVERSAMPLING

Every trimester, contacts with the province of Luxemburg are made via e-mail in order to inform them on the number of interviews completed in the province.

3) THE ICE

Under the responsibility of the IPH, GDS will transfer the updated information to ICE via e-mail on a weekly basis. If found necessary, IPH and ICE can meet to discuss the ‘state of the art’ and decide upon supplementary measures to motivate households to be re-contacted.

9.3. Feed-back systems

During meetings, based on specific enquiries.

9.4. Treatment of complaints

See SOP 30/FN/006


Communication of results and reporting

10.1. Reporting mechanism

To present the results of the HIS 2013 to the Commissioners and to other stakeholders, descriptive reports are compiled. The reports are organized along the 5 health domains of the HIS: health status, health behaviours, prevention, health care consumption, health and society.

The reports are entirely written and translated (from Dutch to French or vice versa) by the HIS team members. Executive report summaries are also foreseen in the two languages.

For ecological and financial reasons, printouts of the reports are limited (one copy for each commissioner), but the full electronic reports (in PDF form) are made available on the HIS website in French (www.enquetesante.be) and Dutch (www.gezondheidsenquete.be).

10.2. Publication plan: peer-reviewed publications and others

In addition to the descriptive reports prepared for the Commissioners and the public, further in depth analysis of the data remains an important ambition of the HIS stakeholders. The HIS database is further analysed by the research teams at IPH and it is also made available for external users (academic research teams) for them to pursue in depth analyses.

The database can be delivered in 4 formats (SAS, SPSS, STATA and ASCII). Each format counts 4 different files, which correspond to the level of data registration: household level, individual level, contacts level with health care professionals and health care consumption level. These files include the indicators constructed using the 2013 data, and when available, the same indicators from previous HIS surveys to ensure trend analysis.

The results of those additional analyses should be presented during scientific conferences, symposiums, national and international meetings. They should also be published in scientific journals. An up-dated list of publications can be found on the HIS website.

10.3. Other forms of communication of results

To disseminate widely the results of the HIS, a number of communication tools are developed and published in different formats (i.e., reports, summaries, folders, press release, power point presentations, interactive database, etc.). The format of the presentation is adapted to the target group:


• FOR POLICY MAKERS AND OTHER STAKEHOLDERS

A series of power point presentations are developed to present the survey results.

• MEDIA AND THE GENERAL PUBLIC

For a wider dissemination of the HIS results among the general public, a number of communication instruments are developed: Press releases, conferences and/or interviews.

• WEBSITE

Information about the HIS (methodology, results) is available to the public on the HIS website at the following address: in French = www.enquetesante.be in Dutch = www.gezondheidsenquete.be

Beside this "static" presentation of results, an interactive and user-friendly data analysis tool is available for more specific data analysis:

https://www.wiv-isp.be/epidemio/hisia/index.htm

Using this instrument, it is possible to produce information on the distribution of any HIS indicator in the population without any preliminary statistical or programming knowledge. The output is displayed in tables and the indicators can be produced as a function of at most 3 of the following background variables: age, gender, household composition, education, income, urbanisation, and province.

• PARTICIPATING HOUSEHOLDS

A leaflet describing the results in lay terms is sent out to the households that participated in the survey, together with a thank-you note for their participation.


Archiving process

11.1. Data management

All the HIS documents and data files are kept on the FS Volume of the secured internal network server. At the start of each new HIS project a new subdirectory is created. Each HIS is classified with reference to the year it is accomplished (1997 – 2000 – 2004 – 2008 - 2013). More specifically, the HIS 2013 subdirectory is placed in:

\\iph.local\FS\Services\33_SLCD\DATA\HIS\HIS 2013

The structure of the computer files’ storage system used for the HIS is documented in the SOP 31/E/HIS-002: “Management on electronic documents”.

Archiving of all the files is done via the general archiving process of the department of Public Health and Surveillance of the IPH: a daily backup on a separate Blade Server (Backup.iph.local), and this is described in the SOP 70/03/N.

11.2. Documents

The original contracts signed by the commissioners are stored in the central office of the RP/PJ.

The completed HIS paper questionnaires are stored at the central office of the GDS for at least 5 years (depending on the available space).




Annex 3

Selection of Municipalities, Households and Respondents in the HIS 2013

1. Introduction

In this document we describe the procedures to select the municipalities, the households and the individuals of the HIS 2013 and refer to all documents, spreadsheets and statistical programs that were developed to produce and verify the sample. Both the method and the implementation of the selection procedures are described.

Overview

The sampling scheme of the households and respondents in the Belgian HIS 2013 is a combination of different sampling techniques: stratification, systematic sampling and clustering.

First there is a regional stratification. Belgium is divided into 3 regions, the Flemish Region, the Walloon Region and the Brussels’ Region, for which the number of interviews has been predetermined. The reason for this stratification is to be able to produce results by regions.

The second stratification is at the level of the provinces. This second level of stratification is done to improve the quality of the sample over a simple random sample. In particular a geographical spread is achieved. The sample size within the provincial stratification is proportional to the population size of the province. Within the province of Liège the sample size of the German Community (which is geographically located in the province of Liège) has been predetermined. Therefore the province of Liège is divided into 2 strata: 1) the 9 municipalities belonging to the German Community 2) the other municipalities in Liège.

Then within the strata units are accessed in two (for the households (HH)) or three (for the individuals) stages:

• First, within each stratum municipalities are selected with a selection chance proportional to their size. These municipalities are called the Primary Sampling Units (PSU). Each time a PSU is selected, a group of 50 individuals has to be interviewed successfully.

• Then within each municipality a sample of households, the Secondary Sampling Units (SSU), is drawn such that around 50 individuals per group can be interviewed in total.

• Finally within each household at most four individuals, the Tertiary Sampling Units (TSU), are chosen.


The following elements are specified in advance:

• a basic sample size of 10.000 (3500 in Flanders, 3500 in Wallonia, 3000 in Brussels), equally spread over the year 2013;

• the number of interviews by province in the basic sample is proportional to their size (to prevent problems related to the over- or under-representation of some provinces by chance in the random sample);

• within the province of Liège the number of interviews in the German Community (in the district Eupen-Malmédy) is fixed to 300 interviews;

• there is an oversampling of 600 in the province of Luxemburg, resulting in a final sample size of 10.600 (3500 in Flanders, 4100 in Wallonia, 3000 in Brussels)⁴;

• in order to keep the fieldwork manageable the number of interviews to be done in each municipality should be at least 50; hence it was decided to work with multiples of 50 within each stratum.

Stratification at level of regions and provinces

Method

Belgium is subdivided into 12 strata:

• 5 strata in the Flemish region: the 5 provinces

• 1 stratum in the Brussels region

• 6 strata in the Walloon region (4 provinces + the province of Liège divided in 2 strata: Liège without the German Community and the German Community).

The numbers allocated to each stratum are calculated as follows.

1. For the Brussels region the number is fixed to be 3000.

2. For the 10 provinces the number of interviews is distributed proportional to the size of the provinces within each region, but such that the numbers are multiples of 50 and add up to 3500 in both the Flemish and Walloon Region.

3. Within the Province of Liège a number of 300 is allocated to the German Community; hence, the number of persons allocated to the province of Liège without the German Community = the total for Liège – 300.

To achieve this some rounding is necessary and as a consequence the probability for selection within each region was not perfectly equal.
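The allocation rule in step 2 can be sketched as follows. The protocol does not state which rounding rule was applied; the largest-remainder rule used here is an assumption, the stratum names and populations are hypothetical, and the result will not necessarily coincide with the allocations reported in Table 1.

```python
def allocate_in_groups(populations, total_interviews, group_size=50):
    """Distribute interviews proportionally to population, in multiples of group_size.

    Assumed rule (not taken from the protocol): largest-remainder rounding
    on the number of groups, using integer arithmetic only.
    """
    if total_interviews % group_size != 0:
        raise ValueError("total_interviews must be a multiple of group_size")
    n_groups = total_interviews // group_size
    total_population = sum(populations.values())

    groups = {}
    remainders = {}
    for name, population in populations.items():
        # Whole number of groups and what is left over of the exact quota.
        groups[name], remainders[name] = divmod(n_groups * population, total_population)

    # Hand out the groups still unassigned to the largest remainders.
    leftover = n_groups - sum(groups.values())
    for name in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        groups[name] += 1

    return {name: count * group_size for name, count in groups.items()}


if __name__ == "__main__":
    # Hypothetical region with three provinces.
    example = {"Province A": 520_000, "Province B": 310_000, "Province C": 170_000}
    print(allocate_in_groups(example, 3500))
```

Because each stratum receives a whole number of groups, the ratio of interviews to population differs slightly between strata, which is the inequality in selection probability mentioned above.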

4 Initially the province of Namur had asked for an oversampling of 150 individuals; later on they came back on that decision.

Procedure implementation

The outcome of the procedure is presented in table 1. For each stratum the following items are specified: the population, the fraction relative to the total number of inhabitants of the region, the theoretical number of individuals to be interviewed, the effective number of individuals to be interviewed (a multiple of 50), the corresponding number of groups of 50 individuals to be interviewed and the probability of being selected (the sampling rate/1000).

Note

From the table it can be derived that the chance of being selected differs appreciably from region to region. In total 10750 interviews are scheduled. Hence, as the total population of Belgium is about 10 000 000, the overall chance of being selected is about 1/1000. In the Flemish region the relative chance is 0,56, in the Walloon region it is 1,21. In the Brussels region this factor is 2,75; i.e. the chance of an individual in the Brussels region to be selected is approximately 3 times the overall probability of being selected. Within each region the selection probabilities by province differ too because of the rounding process described above. This variation is very small, with the exception of the German Community and the province of Luxemburg where the chance of being selected is augmented because of the oversampling.

It should be stressed that this difference in selection probability does not affect the representativeness of the samples. Instead, by taking nearly the same number of respondents in each region, the precision and hence the quality of the resulting inferences at regional level is made about equal. For the combination of the results at the national level some precision is lost, but a valid estimate and inference can be obtained by reweighting each region by stratum with weights inversely proportional to the selection probability.
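The reweighting mentioned above can be illustrated with a minimal sketch. It is a simplification: weights are computed per region rather than per stratum, non-response and household-level corrections are ignored, and the regional prevalences are hypothetical values chosen for illustration. The populations and interview totals are the regional figures of Table 1.

```python
# Regional figures from Table 1: population (A) and total interviews (F).
REGIONS = {
    "Flemish Region": {"population": 6_251_983, "interviews": 3_500},
    "Walloon Region": {"population": 3_498_384, "interviews": 4_150},
    "Brussels Region": {"population": 1_089_538, "interviews": 3_000},
}


def design_weight(population, interviews):
    """Weight inversely proportional to the selection probability interviews/population."""
    return population / interviews


def national_estimate(regional_means):
    """Combine regional means into a national mean using the design weights."""
    weighted_sum = 0.0
    weight_total = 0.0
    for region, mean in regional_means.items():
        info = REGIONS[region]
        weight = design_weight(info["population"], info["interviews"])
        # Every respondent of the region carries the same weight in this sketch.
        weighted_sum += weight * info["interviews"] * mean
        weight_total += weight * info["interviews"]
    return weighted_sum / weight_total


if __name__ == "__main__":
    # Hypothetical prevalences of some indicator, by region.
    prevalences = {"Flemish Region": 0.20, "Walloon Region": 0.25, "Brussels Region": 0.30}
    print(f"weighted national estimate: {national_estimate(prevalences):.4f}")
```

In this sketch a respondent in the Brussels region, where the sampling rate is highest, receives the smallest weight, so the oversampled regions do not dominate the national figure.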


Table 1. The distribution of the sample size by provinces

| Province | (A) Population* | (B) Fraction (%) | (C) Theoretical number of individuals to be interviewed | (D) Effective number of individuals to be interviewed (multiple of 50) | (E) Provincial oversampling | (F) Total | (G) Number of groups of 50 individuals | (H = (F/A) * 10³) The probability for an individual to be selected |
|---|---|---|---|---|---|---|---|---|
| Antwerpen | 1.744.862 | 27.9 | 977 | 950 | | 950 | 19 | 0,54 |
| Vlaams Brabant | 1.076.924 | 17.2 | 603 | 600 | | 600 | 12 | 0,56 |
| Limburg | 838.505 | 13.4 | 649 | 500 | | 500 | 10 | 0,60 |
| Oost-Vlaanderen | 1.432.326 | 22.9 | 802 | 800 | | 800 | 16 | 0,56 |
| West-Vlaanderen | 1.159.366 | 18.5 | 649 | 650 | | 650 | 13 | 0,56 |
| Flemish Region (5 strata) | 6.251.983 | 100.0 | 3.500 | 3.500 | | 3.500 | 70 | 0,56 |
| Brabant Wallon | 379.515 | 10.8 | 380 | 400 | | 400 | 8 | 1,05 |
| Hainaut | 1.309.880 | 37.4 | 1.310 | 1.300 | | 1.300 | 26 | 0,99 |
| Liège (including German Com.) | 1.067.685 | 30.5 | 1.068 | 1.050 | | 1.050 | 21 | 0,98 |
| Liège (excluding GC) | 992.463 | | | 750 | | 750 | 15 | 0,76 |
| German Community | 75.222 | | | 300 | | 300 | 6 | 3,99 |
| Luxembourg | 269.023 | 7.7 | 269 | 300 | 600 | 900 | 18 | 3,35 |
| Namur | 472.281 | 13.5 | 472 | 450 | | 450 | 12** | 0,95 |
| Walloon Region (6 strata) | 3.498.384 | 100 | 3.500 | 3.500 | 600 | 4.150 | 85** | 1,86 |
| Brussels region (1 stratum) | 1.089.538 | 100.0 | 3.000 | 3.000 | | 3.000 | 60 | 2,75 |
| BELGIUM (12 strata) | 10.839.905 | 100.0 | 10.000 | 10.000 | 600 | 10.750 | 215 | 0,99 |

* GDS (population 01.01.2010)

(C) = (3500 * (B))/100 within the Flemish region; (C) = (3500 * (B))/100 within the Walloon Region

** The province of Namur had initially asked for an oversampling of 150 persons (or 3 groups) and was therefore allocated 12 groups in total. Only after the fieldwork had started did the province reverse that decision. Rather than reducing the number of groups, it was decided to reduce the number of individuals per group. Therefore, in Namur there are 12 groups of 37,5 individuals instead of 9 groups of 50 individuals.


2. Selection of the municipalities (PSU)

Method

In this step the municipalities are selected within each stratum. The selection is made for the whole year (2013) at once. In this way time is less confounded with place and it also facilitates the planning of the fieldwork (e.g. the recruitment of the interviewers).

To guarantee the efficiency of the sample, some additional rules are built into the random selection:

• The chance of selection of a municipality should be proportional to its population size.

• The larger cities and metropoles should be included at least once with certainty in the sample. The number of times a city or a metropolis has to be included is determined by its actual population size.

• A similar requirement holds for the smaller towns and villages: elements from this group should also be present. Ordering the whole set of communities by size groups the smaller communities together, which ensures that the pool of smaller communities is represented in the sample. The assumption made is that smaller communities of about the same size are exchangeable with respect to the items of interest.

The above requirements are met by weighted systematic sampling, in which the municipalities are ordered (from large to small) and expanded in proportion to their size (area probability sampling). As a consequence the chance for a municipality to be selected is proportional to its number of inhabitants.

Ordering combined with systematic sampling also implicitly stratifies the municipalities into blocks of a certain size, and from each block just one municipality is chosen. As a result the sample contains municipalities ranging from small to large. Furthermore, the systematic sampling guarantees that the larger cities are selected with certainty. In fact some large cities are selected more than once because their size is a multiple of the step size by which the systematic sample is taken.

The implementation of this scheme is done as follows:

1. For each stratum all municipalities are listed by population size (from large to small).

2. A list is made with the cumulative population size (C) equaling the population size of the municipality + the population size of the municipalities listed before.

3. An interval (I) is calculated being equal to the total population size of the stratum (T) divided by the number of groups to be selected (N); I = T/N

4. An integer random number (R) is selected where 0 < R < I

5. A list of N values ($x_n$) is generated, starting with R and each time adding a multiple of I:

$x_n = nI + R$, where n is an integer with $0 \le n \le N-1$

6. Each value of this list is compared with the cumulative population size of the municipalities. A group of 50 persons is selected in the municipality with the smallest C for which:

$nI + R \le C$ (where $0 \le n \le N-1$)

Through this procedure it is possible that several groups are selected within the same municipality.
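The six steps above can be sketched in a few lines of Python. This is a minimal illustration and not the munselect.sas programme: the function name `select_groups`, the dictionary input format and the `seed` argument are assumptions made for the example.

```python
import math
import random
from bisect import bisect_left
from collections import Counter
from itertools import accumulate


def select_groups(populations, n_groups, seed=None):
    """Return a Counter {municipality: number of groups} for one stratum."""
    # Step 1: list the municipalities by population size, from large to small.
    ordered = sorted(populations.items(), key=lambda item: item[1], reverse=True)
    names = [name for name, _ in ordered]
    # Step 2: cumulative population sizes C.
    cumulative = list(accumulate(size for _, size in ordered))
    # Step 3: interval I = T / N.
    interval = cumulative[-1] / n_groups
    # Step 4: integer random start R with 0 < R < I.
    rng = random.Random(seed)
    start = rng.randint(1, max(1, math.ceil(interval) - 1))
    groups = Counter()
    for n in range(n_groups):
        # Step 5: selection point n * I + R.
        point = n * interval + start
        # Step 6: municipality with the smallest C that is >= the point.
        index = min(bisect_left(cumulative, point), len(names) - 1)
        groups[names[index]] += 1
    return groups


# Hypothetical stratum with three municipalities and five groups to allocate.
example = select_groups({"A": 500, "B": 300, "C": 200}, n_groups=5, seed=1)
print(dict(example))
```

In this hypothetical stratum the interval is 200, so municipality A (500 inhabitants) always receives at least two groups, which illustrates how large cities are included with certainty and can be selected more than once.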


Procedure implementation

The implementation of the selection procedure is done via a SAS programme to be found on the CROSP volume: munselect.sas

Of the 589 municipalities 158 were selected. A list of all selected municipalities + the number of selected groups per municipalities is found in Municipalities HIS2013.doc

The number of municipalities (PSU) selected is smaller than the number of groups (225) of 50 individuals because several municipalities are selected more than once.

In the map below the geographical spread of the selected municipalities is presented, indicating also the municipalities in which several groups were selected.


3. Selection of the Households (SSU)

Method

At this stage the municipalities in which the households have to be sampled are known and the corresponding number of groups (or the number of respondents to be interviewed, a multiple of 50) is given. The next step is the selection of the households.

For municipalities with one group only, the sampling frame consists of all households of that municipality. If there are several groups in a municipality, the municipality is divided into groups of adjacent statistical sectors and the selection of the households is applied for each group of statistical sectors.

To take non-response into account, replacement households need to be selected. To tackle, at least partially, systematic trends in drop-out, it was decided not to replace the households in a simple random fashion, but to look for a-priori matches within the stratum based on:

- the statistical sector within the municipality (and hence also the municipality);
- the size of the household;
- the age of the reference person.

Per selected household three matches (i.e. “reserves”) are generated. The so-resulting group of matched quadruples is called a cluster of households.

The clusters of households are selected through a systematic sampling procedure. This is done in 4 steps.

1. Ordering of clusters

Households are ordered hierarchically (per municipality and per stratum) by:

- statistical sector
- the size of the household in 5 categories: size 1, 2, 3, 4, and 4+
- the age of the reference person
  ° for stratum 1: in 7 categories: < 15 yrs, 15-24, 25-34,…, 65-74
  ° for stratum 2: by age in years
  ° for stratum 3: by age in years

The first level of ranking is thus the statistical sector. Statistical sectors are ranked from north to south (based on the Lambert coordinates of the centre of the statistical sector).

The order of the variables “household size” and “age of reference person” is alternated when the level of the preceding variable changes from the current to the next level. For example, in statistical sector 1, the household sizes are given in increasing order, while they decrease in statistical sector 2. For household size 1 in statistical sector 1, the reference persons are listed according to increasing age, while the ages decrease for the households of size 2 in statistical sector 1. In this way, households comparable to each other according to the first variable are located close to each other on the list. A schematic presentation is given in Table 2.


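Table 2: The ordered sampling frame for the selection of the households (stratum within a municipality)

| Statistical sector | Household size | Age of reference person |
|---|---|---|
| 1 | 1 | youngest ... oldest |
| 1 | 2 | oldest ... youngest |
| 1 | 3 | youngest ... oldest |
| 1 | 4 | ... |
| 1 | 4+ | ... |
| 2 | 4+ | ... |
| 2 | 4 | ... |
| 2 | 3 | ... |
| 2 | 2 | ... |
| 2 | 1 | ... |
| 3 | 1 | ... |
| ... | ... | ... |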

2. Calculation of the Step size

The step size for the systematic sample is calculated by dividing the total number of households by the required number of households. The required number of households is obtained by dividing the required number of individuals by an “adjusted mean household size” in the municipality. The “adjusted mean” implies that household sizes > 4 are set to 4. This is done because at most 4 persons are interviewed per household. In each group 50 individuals are interviewed. As the sampling is spread over 4 quarters this means that (on average) 12,5 individuals are interviewed per quarter.

The number of personal interviews can only be estimated from the number of households, not determined with certainty (variable household membership, within-household refusal). To account for this additional uncertainty, a larger number of matched quadruples has to be selected so that additional households can be included. By setting the step size to half its original value the number of clusters is doubled.

Hence, the step size (y) is given by:

$$y = \frac{N - 3}{2n}$$

where N = the number of households within the group

n = the number of households to be sampled in the group


The table below gives an example of the calculation of the step size in a hypothetical municipality of 785 households (one group).

| Number of households | Household size | Household size to be considered for calculation of adjusted mean household size | Number of individuals needed for calculation of adjusted mean |
|---|---|---|---|
| 250 | 1 | 1 | 250 |
| 200 | 2 | 2 | 400 |
| 150 | 3 | 3 | 450 |
| 100 | 4 | 4 | 400 |
| 50 | 5 | 4 | 200 |
| 25 | 6 | 4 | 100 |
| 10 | 7 | 4 | 40 |

Total number of households = 785

Total number of individuals needed to calculate adjusted mean = 1840

Adjusted mean household size = 1840/785 = 2,343949

Required number of households to obtain 12,5 interviews = 12,5/2,343949 = 5,332880

Step size y = (785 − 3) / (2 × 5,332880) = 73,31873
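The step-size calculation can be expressed as a short Python sketch. The function names and the dictionary input (household size mapped to number of households) are assumptions made for illustration, and the example uses a different, smaller hypothetical municipality so that the arithmetic is easy to follow by hand.

```python
def adjusted_mean_household_size(households_by_size, cap=4):
    """Mean household size with sizes above `cap` set to `cap`."""
    total_households = sum(households_by_size.values())
    capped_individuals = sum(
        min(size, cap) * count for size, count in households_by_size.items()
    )
    return capped_individuals / total_households


def step_size(households_by_size, interviews_needed=12.5):
    """Step size y = (N - 3) / (2n) for one group and one quarter."""
    n_total = sum(households_by_size.values())  # N
    n_required = interviews_needed / adjusted_mean_household_size(
        households_by_size
    )  # n
    return (n_total - 3) / (2 * n_required)


# Hypothetical municipality: 100 households of size 1, 100 of size 2, 50 of size 5.
frame = {1: 100, 2: 100, 5: 50}
print(adjusted_mean_household_size(frame))  # (100 + 200 + 200) / 250
print(step_size(frame))  # (250 - 3) / (2 * 6.25)
```

Halving the step size is built into the factor 2 in the denominator, which is what doubles the number of clusters.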

3. Generate a random number

A random number x is generated between 1 and y.

4. Selection of clusters of households

Instead of taking one household at each step of the sampling, the selected household and three consecutive households are taken. Such a group of households is called a cluster.

The first cluster consists of the households x, x+1, x+2 and x+3.

The second cluster contains the households x+y, x+y+1, x+y+2 and x+y+3.

More generally, the n-th cluster contains the households x + (n-1)y, x + (n-1)y +1, x + (n-1)y +2 and x + (n-1)y +3.

For most groups the step size is not an integer. Therefore the first household of the n-th cluster is household number ROUND(x + (n-1)y) on the list.

The outcome of this procedure is a number of clusters consisting of 4 households, similar in terms of statistical sector, household size and age of the reference person. The number of clusters is twice the number of required households. Hence the total sample size is 8 times as large as the number of required households.
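A minimal Python sketch of steps 3 and 4 is given below. It assumes that households are numbered from 1 on the ordered list, that ROUND rounds halves upwards, and that sampling stops once a full cluster of four no longer fits on the list; the function name and these conventions are illustrative assumptions, not a description of the actual sampling programme.

```python
import math
import random


def cluster_positions(n_households, step, start=None, cluster_size=4, seed=None):
    """Return clusters as lists of 1-based positions on the ordered household list."""
    if start is None:
        # Step 3: random number x between 1 and y.
        start = random.Random(seed).uniform(1, step)
    clusters = []
    n = 1
    while True:
        # First household of the n-th cluster: ROUND(x + (n-1)y), halves rounded up.
        first = math.floor(start + (n - 1) * step + 0.5)
        if first + cluster_size - 1 > n_households:
            break
        # Step 4: the selected household and the three consecutive ones.
        clusters.append(list(range(first, first + cluster_size)))
        n += 1
    return clusters


# Hypothetical list of 100 households, step size 10.5, start fixed at 3 for clarity.
clusters = cluster_positions(100, step=10.5, start=3.0)
print(clusters[0], clusters[1])
```

With these hypothetical inputs the first cluster starts at household 3 and the second at household 14, because 13.5 is rounded up.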


5. Selection of households

After the implementation of the sampling procedure a four-column table is formed in which each row represents a cluster of four households. There are as many rows as there are clusters selected. The first column contains the first household to contact. If this household is not eligible or if it does not result in an interview, the next household (in the second column) is contacted, and so on. When all the households of the cluster have been used and further replacement is necessary, the first eligible household of the next available row is selected. To prevent any order effect the households within each row are randomized (= horizontal scrambling). The rows themselves are also randomized (= vertical scrambling). As a result there are no row effects and it is possible to work from top to bottom until a sufficient number of interviews is realised.

The tables below give a schematic presentation of the sample before and after scrambling.

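BEFORE SCRAMBLING

| Row | Replacement households: 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| 1 | nr 011 | nr 012 | nr 013 | nr 014 |
| 2 | nr 021 | nr 022 | nr 023 | nr 024 |
| 3 | nr ij * | | | |
| 4 | | | | |
| ... | | | | |
| 13 | | | | |
| 14 | nr 141 | nr 142 | nr 143 | nr 144 |

AFTER SCRAMBLING

| Row | Replacement households: 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| 1 | nr 034 | nr 033 | nr 031 | nr 032 |
| 2 | nr 102 | nr 103 | nr 104 | nr 101 |
| 3 | | | | |
| 4 | | | | |
| ... | | | | |
| 13 | | | | |
| 14 | nr 123 | nr 121 | nr 124 | nr 122 |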
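The horizontal and vertical scrambling can be illustrated with the following Python sketch. The function name `scramble` and the `seed` argument are hypothetical; the actual procedure was carried out in HISIS and may differ in detail.

```python
import random


def scramble(clusters, seed=None):
    """Randomize households within each row, then randomize the rows."""
    rng = random.Random(seed)
    # Horizontal scrambling: shuffle the four households of every cluster.
    rows = [rng.sample(row, k=len(row)) for row in clusters]
    # Vertical scrambling: shuffle the order of the clusters themselves.
    rng.shuffle(rows)
    return rows


# 14 clusters of 4 households, labelled as in the schematic tables.
before = [[f"nr {row:02d}{col}" for col in range(1, 5)] for row in range(1, 15)]
after = scramble(before, seed=2013)
print(after[0])
```

Each row of the result still contains the four households of one original cluster, so replacements remain matched on statistical sector, household size and age of the reference person.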

The above sampling scheme solves many issues at once:

• By taking a systematic sample from an ordered list, it is ensured that the characteristics of the sample are close to those of the municipality with respect to the variables statistical sector, household size and age of the reference person.

• By taking in each step a cluster of four households, one forms in a natural way homogeneous groups of households which can be used to replace each other in case of non-response. If none of the four households in a row results in an interview, then a new row can be started. The latter is only necessary if the other rows already started are not sufficient.

• By taking twice as many clusters as needed the variability of the household size is anticipated. One starts with a first random group of the selected clusters. If with this group the required number of interviews is not achieved, then the next row (cluster) is started.


• By making a list in advance, the organisation of the fieldwork is facilitated because no algorithm is necessary to decide about the next replacement and all information about contacting is present.

Procedure implementation

1. Division of municipalities with several groups in groups of statistical sectors

The municipalities including several groups were divided into groups of adjacent statistical sectors with more or less equal population size. Statistical sectors in all groups were ranked according to a north – south gradient. SAS programs dealing with those procedures are in the subdirectory \sampling\statisticalsectors_belgium:

− inputlambert.sas
− ssr.sas
− liststatsect.sas
− makegraph.sas

The main output of these programs is statgroup.xls : a list with all statistical sectors within the selected municipalities including

− code + name of municipality
− HIS group number
− code + name of statistical sector
− ranking number of statistical sector within the group (according to north-south gradient)

Maps of the municipalities with several groups, indicating the geographical location of each group, are added in annex 1.

2. Practical implementation

The sampling was carried out by the GDSEI 5. The sampling frame was a copy of the National Register. Both individual households and collective households were included in the sampling frame. Persons in collective households were considered as individual households of 1 person 6.

As the information in the National Register changes continuously (people are born, die, move, ...) the sampling of the households was done per quarter: Q1: January-March 2013, Q2: April-June 2013, Q3: July-September 2013, Q4: October-December 2013. For the first quarter a copy of the National Register from 1st December 2012 was used. For the next quarters copies from 1st March 2013, 1st June 2013 and 1st August 2013 were used.

5 See technical note sent to GDS : Sampling of households\Procedure sampling HIS2008.doc

6 For practical and logistical reasons households residing in a prison, a psychiatric institution or communities of more than 8 persons (e.g. convents) (BUT NOT THOSE RESIDING IN AN INSTITUTION FOR ELDERLY !) were not included in the study population, but they were included in the sampling frame as there are no identifiers to remove such types of households from the national register. This problem should be handled during the field work. If an interviewer notices that such a household is selected, it is given the status “non eligible” and has to be replaced.


3. Output

The output file of the sampling procedure carried out by GDSEI is a file listing all individuals of the selected households. The following information was extracted:

1. Identification number of the household (NUM_MENA) 7
2. Rank number of person within the household (01 = reference person)
3. Name of reference person
4. Address
5. NIS code of municipality
6. Statistical sector
7. Age
8. Household size

IPH received from GDS a copy of the file (without name and address of course, file name = Sampling of households\B93y2008sampling_3.xls ).

A list of checks 8 was performed to ensure that the sample was in line with the requirements specified in the study protocol.

The data file with all selected households was imported in HISIS and the scrambling procedure was carried out in order to define initially selected households and replacement households. For the households in stratum 3 an algorithm continued to select the next eligible cluster until the sum of the individuals in the households to be interviewed was closest to 11. In case of equal distance to 11, a random process decided whether to remain below 11 or to exceed it.

Replacement households were identified as follows:

• In case of a household non-response, the replacement was the next eligible household within the cluster. Once a cluster was initiated, the algorithm continued to select the next eligible household within the cluster independently of the number of successful interviews attained. So once a cluster was started, all efforts were taken to have a successful contact with a household within the cluster.

• In case a cluster was exhausted in stratum 1 or 2 a new cluster was always activated.

• In case a cluster was exhausted in stratum 3 a new cluster was only activated if adding this cluster to the number of households to be yielded from the other activated clusters would generate a number of respondents closer to 11 than in the situation where no new cluster would be added.

A copy of the data file after scrambling was provided to IPH and it was verified whether the scrambling had been correctly applied 9.

7 The identification number of the household consisted of the following parts:
Digits 1 to 3: identification number of the group (see list with groups and municipalities in annex)
Digit 4: number that identifies the quarter (1 to 4)
Digits 5 to 8: number that identifies the cluster (01 to …)
Digits 9 to 10: number that identifies the household; digit 10 corresponds with the order of the activation of the households (1 to 8)

8 See program file \HIS\HI2008\Sampling\Sampling municipalities\check sample.sas

9 See program file \HIS\HI2008\Sampling\Sampling municipalities\check scrambling.sas


4. Selection of the Household Members (TSU)

At most four members of the household were interviewed because:

• interviewing more persons is inefficient because of the familial correlation: members of the same family tend to resemble each other more closely than members from different households. By increasing the number of interviews nearly no new information is obtained for the global sample;

• the burden on the household would be too large.

Hence if a family contains more than four members a selection rule is necessary. This selection should in principle be random. Always selecting the reference person of the household might lead to bias since the reference person is not a random member of the household. He or she might have special characteristics. In the literature this person is sometimes denoted as the gatekeeper. Even including the partner might not totally compensate for this. However there are some practical limitations:

• it may be difficult to explain that the reference person will not be interviewed, while other members are and

• there is a general household questionnaire. This information should come from the reference person (or the partner).

Therefore following selection rules were used within a household to select the individuals to be interviewed.

1. In a household of no more than 4 members all individuals are interviewed.

2. In a household with 5 or more members only 4 members will be interviewed :

• in a household with a reference person and a partner both the reference person and the partner are interviewed and only two additional individuals will be selected using the birthday rule ;

• in a household with a reference person without a partner, the reference person and 3 additional members, selected using the birthday rule, are interviewed.

Using the birthday rule the individuals having their birthday first (month, day) from the date of the first contact onwards, are selected.
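The selection rules and the birthday rule can be sketched as follows. This is an illustrative Python example only: the member records, the role labels and the treatment of 29 February (counted as 1 March in non-leap years) are assumptions, and ties between identical birthdays are not handled.

```python
from datetime import date


def next_birthday(month, day, contact):
    """First birthday falling on or after the date of first contact."""
    for year in (contact.year, contact.year + 1):
        try:
            candidate = date(year, month, day)
        except ValueError:
            # 29 February in a non-leap year: treated as 1 March (assumption).
            candidate = date(year, 3, 1)
        if candidate >= contact:
            return candidate
    raise ValueError("no upcoming birthday found")


def select_members(members, contact, max_interviews=4):
    """Apply the within-household selection rules to a list of member records."""
    if len(members) <= max_interviews:
        return list(members)  # rule 1: everybody is interviewed
    fixed = [m for m in members if m["role"] in ("reference", "partner")]
    others = [m for m in members if m["role"] == "other"]
    # Birthday rule: earliest upcoming birthdays first.
    others.sort(key=lambda m: next_birthday(*m["birthday"], contact))
    return fixed + others[: max_interviews - len(fixed)]


household = [
    {"name": "ref", "role": "reference", "birthday": (2, 3)},
    {"name": "partner", "role": "partner", "birthday": (9, 9)},
    {"name": "c1", "role": "other", "birthday": (1, 15)},
    {"name": "c2", "role": "other", "birthday": (6, 1)},
    {"name": "c3", "role": "other", "birthday": (12, 24)},
    {"name": "c4", "role": "other", "birthday": (3, 10)},
]
selected = select_members(household, contact=date(2013, 5, 20))
print([m["name"] for m in selected])
```

For a first contact on 20 May 2013, the two additional members are those with birthdays on 1 June and 24 December, since these come first after the contact date.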

Within a household (SSU) that agreed to take part in the survey, non-response at personal level (TSU) is still possible. Possible reasons are:

• refusal

• unable to participate (children, mentally disabled persons, ...)

• one of the members is not at home for an extended period (e.g. students, persons in a hospital, outside the country, ...).


It was specified for which cases a proxy is allowed:

1. Obligatory: person younger than 15 years; person too sick or with mental disabilities.
2. Person cannot be reached for an extended period (more than 1 month).
3. Person refuses but agrees that a proxy answers for him/her.

Sampling Scheme

If selection is necessary, the reference person and his/her partner are selected automatically. A randomisation will be done for the two or three remaining persons only. The selection itself is based on the birthday-rule. The two or three persons having their birthday first from the date of the first contact onwards, are included in the sample.

When the reference person (and the partner) is always selected but the other members have a probability of less than 1 to be selected, then the selection probability between members of the same household with a size of at least 5 is variable. This difference needs to be taken into account using the appropriate weights in order to avoid the potential bias as mentioned above:

• For a household of size k = 1,2,3,4 there is no additional weight required as everybody will be taken and the probability of selection for all members is 1.

• For a household of size k ≥ 5 with reference person and partner the selection probabilities once the household is selected are:

1. for the reference person: p = 1;
2. for the partner: p = 1;
3. for the k - 2 remaining persons: p = 2/(k - 2).

The weight already calculated for this household should be multiplied by the inverse of these quantities.

• For a household of size k ≥ 5 with reference person but without partner the selection probabilities once the household is selected are:

1. for the reference person: p = 1;
2. for the k - 1 remaining persons: p = 3/(k - 1).

Again, the weight already calculated for this household should be multiplied by the inverse of these quantities.
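The within-household selection probabilities and their inverses can be written down in a short Python sketch. It assumes that four persons are interviewed in every household of five or more members, with the reference person (and the partner, if present) always included; the function names and role labels are hypothetical.

```python
from fractions import Fraction


def selection_probability(k, role, has_partner):
    """Probability that a member is selected once the household is selected."""
    if k <= 4 or role in ("reference", "partner"):
        return Fraction(1)
    n_fixed = 2 if has_partner else 1  # members selected with certainty
    # Remaining places divided by remaining members: 2/(k-2) or 3/(k-1).
    return Fraction(4 - n_fixed, k - n_fixed)


def within_household_weight(k, role, has_partner):
    """Inverse of the selection probability."""
    return 1 / selection_probability(k, role, has_partner)


# Household of 6 with a partner: reference + partner + 2 of the 4 others.
weights = [
    within_household_weight(6, "reference", True),
    within_household_weight(6, "partner", True),
    within_household_weight(6, "other", True),
    within_household_weight(6, "other", True),
]
print(sum(weights))
```

The weights of the four selected members sum to the household size, so the selected individuals together represent the complete household.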

Concerning non-response, the rule is that replacements are never allowed because bias is very probable here (e.g. household members having less time will be more reluctant to answer and would be replaced by members having more time).

Note

Household membership was based on the information provided by the reference person (or the partner) and not on the information from the National Register.


Annex 1. Municipalities with several groups in the HIS 2013


Annex 4

Use of a proxy

Definition of a proxy

The core of the HIS is a face-to-face interview with the selected members of a participating household. In specific circumstances it is allowed that a proxy answers the questions on behalf of one or more selected members of the household. A proxy does not replace the selected member(-s). In theory, this proxy does not have to be a member of the household, but in most cases it will be. It is presumed that the proxy is well aware of the (health) characteristics of the selected member(-s) for which she/he gives the answers. Since no instrument is used to check this, it is up to the interviewer to estimate whether this is the case.

Use of a proxy-interview

An interview by proxy is by default applied for all selected members of the household who are younger than 15 years. Although this age limit is arbitrary, it is felt that youngsters are not capable of responding to the questions of the survey. Consequently youngsters cannot serve as a proxy.

For selected members of the household who are 15 years or older, an interview by proxy is only allowed if:

1. The selected member cannot be reached

2. The selected member is unable to complete the interview

3. The selected member refuses to complete the interview but agrees that a proxy does it

1. The selected member cannot be reached

If possible the interviewer has to have direct contact with all selected members of the household. If necessary the interviewer has to return several times to the household in order to have this direct contact. When the selected person cannot be reached for an extended period (more than one month) the interviewer may decide to use a proxy. This situation can occur if the selected person is away from home (for a long time), is hospitalised or institutionalised.

2. The selected member is unable to complete the interview

If for the selected person it is impossible to complete the questionnaire due to physical or mental problems, a proxy interview is allowed. This situation will frequently occur in case the person to be interviewed lives in an institution for the elderly.


3. The selected member refuses to complete the interview

In some cases a selected member of the household refuses to participate, while the other(-s) do participate. In such a case the interviewer has to ask whether he/she agrees that someone else completes the interview as a proxy. If he/she does not agree with this, no proxy interview can be applied and the case must be considered as an intra-household refusal.
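The proxy rules described above can be summarised as a small decision function. The Python sketch below is only an illustration of the rules in this annex; the function name, argument names and returned labels are hypothetical and do not correspond to fields of the questionnaire.

```python
def proxy_status(age, reachable_within_month=True, able_to_respond=True,
                 refuses=False, consents_to_proxy=False):
    """Classify a selected household member according to the proxy rules."""
    if age < 15:
        return "proxy required"  # default for members younger than 15
    if not able_to_respond:
        return "proxy allowed"  # physical or mental problems
    if not reachable_within_month:
        return "proxy allowed"  # cannot be reached for more than one month
    if refuses:
        # A refusing member must agree before a proxy may answer.
        return "proxy allowed" if consents_to_proxy else "intra-household refusal"
    return "direct interview"


print(proxy_status(10))
print(proxy_status(40, refuses=True, consents_to_proxy=True))
```

In practice the interviewer still has to judge whether the proposed proxy is sufficiently aware of the health characteristics of the selected member.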

In most cases a proxy himself is selected for interview. He/she has to complete the interviews two (or even more) times: once as a selected member of the household, once (or more times) as a proxy for one or more other selected members of the household.

All questions in the questionnaires are written in the direct form (‘are you’, ‘have you’, …). In case of a proxy interview the interviewer has to transform these questions on the spot into the indirect form (‘is [first name of the selected member of the household for whom a proxy is used] …’, ‘does …’).

When using a proxy the questions on health and environment will not be completed. An interview by proxy for the household level questions is sometimes used for institutionalised elderly who want to participate, but are not capable (due to mental or physical disabilities) to answer the questions themselves. The auto-questionnaire is never completed in case of a proxy.

Identification of a proxy-interview

Interviewers indicate whether a proxy interview was applied by completing the module NR (Information on the selected member and the respondent), which is part of the face-to-face questionnaire.

Proxy-interview for (institutionalised) older people

No oversampling for the older population will be undertaken in the HIS 2013. Still, we will probably have people who live in institutions. For institutionalised respondents who are not capable of responding themselves, relatives or a staff member of the institute can serve as a proxy, under the condition that he/she is well aware of the (health) characteristics of the respondent.


Annex 5

Weighting Scheme HIS 2013

Introduction

Since the design of the HIS follows a complex multistage probability sampling scheme, it is necessary to reflect these complex procedures in the calculation of the estimates. The principle behind estimation in a probability sample is that each person in the sample represents, besides himself or herself, an entire slice of the population. For example, in a simple random sample (2 % of the population) each person in the sample represents 50 persons in the population. In the terminology to be used here it can be said that each person has a weight of 50.

In the weighting phase, one calculates for each person his or her associated weight. This weight appears in the data file, and must be used to derive correct estimates of population quantities. The weight of each person in the sample is different, depending on his/her characteristics. For the HIS 2013, the following characteristics have been taken into account: residence (province), age, gender, size of the household the person belongs to and the period (quarter) of the year in which the interview took place.

Before going into detail in the construction of the weights, we need to introduce some notation and conventions:

• We artificially consider 12 ‘provinces’; the German Community (Eupen) and the Brussels Region each form a province by themselves. By creating a twelfth province (German Community), the oversampling of Eupen can be taken into account in an easy way. The oversampling of the province of Luxemburg does not require an adaptation of the construction of the weights proposed in previous HISs.

• Age which is a continuous variable has been categorized into 9 age-groups 10 .

• The households are classified into four size groups (households of size 4 and larger are taken together).

10 < 15 year, [15,25[, [25,35[, [35,45[,…[75,85[, 85 and older.


Individual Respondents Within Households

The fact that some persons have been chosen at random within families has led to different weights for members of the same household. We need this correction factor in the construction of the weights and we therefore introduce it here. The correction factor depends on the household size and the type of person:

| Household size k | Type of member | Weight $w_i$ |
|---|---|---|
| ≤ 4 | Any | 1 |
| > 4 | Reference | 1 |
| > 4 | Partner | 1 |
| > 4 | Other | (k-2)/2 if ref. and part. selected; (k-1)/3 if ref. selected |

Note that for families with more than four members and no partner the weight equals (k-1)/3, because three members have to be selected in addition to the reference person.

Within a household, these weights sum to the household size. Thus, the selected individuals represent the complete household.

The weight

The weight for each person in the sample is basically a post stratification factor. Within each province the individuals can be classified into age-sex-household size strata. Since the proportions of the strata in the population are known, this information can be used to improve the sample estimators. This is simply done by calculating a weighted estimator, with weights equal to the inverse sampling rates.

A stratum p, a, s, h, q, is defined as a combination of the levels of the following variables:

• Province p,

• Age a,

• Sex s,

• Household size h,

• Quarter (in which the interview took place) q.


This means that there are 12x9x2x4x4=3456 strata in total. The following quantities are necessary in the construction of the weights:

• $n_{pashq}$: the number of interviews in stratum pashq,

• $w_i$: the correction factor for an individual within a household. This factor is defined in the table in Section 2. Because of the interpretation of this factor (its sum within a household is equal to the household size) we also call it the number of information sources the subject represents within the household.

• $n^{ie}_{pashq}$: the number of information sources in stratum pashq:

$$n^{ie}_{pashq} = \sum_{i=1}^{n_{pashq}} w_i$$

• $N_{pash}$: the population size of province p, with age a, sex s, living in a household of size h.

Every information source ($w_i$) represents, besides himself or herself, an entire slice of the population of a stratum. This is reflected in the weight every information entity has. It is easily seen that, according to the terminology used here, every sampled subject in stratum pashq has weight:

$$w_{i,fin} = \frac{w_i \left( N_{pash} \cdot \frac{1}{4} \right)}{n^{ie}_{pashq}} \qquad (1)$$

Because of the interpretation of the weights, their sum equals the total Belgian population size. When at least one stratum is not present in the sample, however, this result no longer holds.
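The final weight can be illustrated with a minimal Python sketch. It reads Equation (1) as: final weight = w_i multiplied by a quarter of N_pash and divided by the number of information sources in stratum pashq. The record layout (keys `stratum`, `quarter`, `w`) and the function name are assumptions made for the example.

```python
from collections import defaultdict


def final_weights(respondents, population):
    """Compute the final weight of every respondent.

    respondents: list of dicts with keys 'stratum' (p, a, s, h), 'quarter', 'w'.
    population:  dict mapping (p, a, s, h) to the population size N_pash.
    """
    # Number of information sources per stratum pashq: sum of the w_i.
    n_ie = defaultdict(float)
    for r in respondents:
        n_ie[(r["stratum"], r["quarter"])] += r["w"]
    # Each quarter represents one quarter of the population of the stratum.
    return [
        r["w"] * (population[r["stratum"]] / 4) / n_ie[(r["stratum"], r["quarter"])]
        for r in respondents
    ]


# One hypothetical pash stratum with 4000 inhabitants, observed in all four quarters.
stratum = ("Brussels", "[25,35[", "F", 2)
sample = [
    {"stratum": stratum, "quarter": 1, "w": 1.0},
    {"stratum": stratum, "quarter": 1, "w": 1.0},
    {"stratum": stratum, "quarter": 2, "w": 2.0},
    {"stratum": stratum, "quarter": 3, "w": 1.0},
    {"stratum": stratum, "quarter": 3, "w": 1.5},
    {"stratum": stratum, "quarter": 4, "w": 1.0},
]
weights = final_weights(sample, {stratum: 4000})
print(sum(weights))
```

Because every quarter is present in this hypothetical stratum, the final weights add up to the stratum population of 4000; if a quarter were missing, the sum would fall short by one quarter of the population.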

In the following section we will discuss how the complex design is reflected by this weighting scheme.

Complex Design

Regional Factor

Because of the disproportionate stratification at the regional level, the selection probability for an individual differs across regions. Based on a sample size of 12,000 individuals for a population of 10 million, the overall chance of being selected is 1.2 per 1,000. In the Flemish Region the selection probability is about 0.7 per 1,000, but in Brussels it is 3.1 per 1,000.

In order to obtain results for the whole country, we need to assign a weight to each region.

Furthermore, as a result of the oversampling of some provinces, the provincial sampling rates within a region are not constant. Weights at provincial level therefore also need to be used when results at regional level are produced. To obtain results at provincial level there is no need to use weights, except for the province of Liège, because of the oversampling of the German community. The probability of selection into the survey is about four times larger for a German-speaking than for a French-speaking individual.

These problems could be approached with seemingly different weighting schemes, but the only relevant feature is the relative weight.
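The point that only the relative weight matters can be checked numerically. The sketch below uses the two selection rates quoted above (0.7 and 3.1 per 1,000); the respondents and their outcome values are hypothetical, as are the function and variable names.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w * y) / sum(w)."""
    return sum(w * y for w, y in zip(weights, values)) / sum(weights)

# Selection rates per 1,000 inhabitants, as quoted in the text
rates = {"Flanders": 0.7, "Brussels": 3.1}

# Hypothetical respondents: (region, binary outcome)
respondents = [
    ("Flanders", 1), ("Flanders", 0),
    ("Brussels", 1), ("Brussels", 1), ("Brussels", 0),
]
values = [y for _, y in respondents]

# Scheme A: inverse selection rate (thousands of inhabitants per respondent)
scheme_a = [1 / rates[region] for region, _ in respondents]

# Scheme B: the same weights rescaled so that a Brussels respondent has weight 1
scheme_b = [rates["Brussels"] / rates[region] for region, _ in respondents]

mean_a = weighted_mean(values, scheme_a)
mean_b = weighted_mean(values, scheme_b)
```

Both schemes give the same estimate because they differ only by a constant factor, which cancels between the numerator and the denominator of the weighted mean.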

All these considerations are taken into account in the weights as defined in Equation (1).

Selection of the Household within a Municipality

Since the selection probability of a household is the same for all the households of the same municipality (within a province) it is not necessary to define a weight at this level.

The following example illustrates this:

Suppose that 1/50 of the population has to be included in a sample and that persons are to be selected in groups of 50. This means that 50 per 2500 households have to be selected. Selection of the municipalities is done via systematic sampling. The probability for a municipality to be selected is proportional to its size. A municipality with 3000 households is included with certainty.

Now, if $p_1$ indicates the probability that a municipality is selected and $p_2$ the probability that a household is selected within that municipality, then the total selection probability for a household is:

\[ p_{tot} = p_1 \cdot p_2 = \frac{3000}{2500} \cdot \frac{50}{3000} = \frac{1}{50} \]
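The same calculation can be repeated for municipalities of other sizes to see that the municipality size cancels out. This is a minimal sketch using exact fractions; the function name and the sizes 500 and 1200 are hypothetical. The first-stage term is kept exactly as in the formula above, so for the municipality of 3000 households it exceeds 1, in line with that municipality being included with certainty.

```python
from fractions import Fraction

SAMPLING_FRACTION = Fraction(1, 50)   # overall fraction to be sampled
GROUP_SIZE = 50                       # households selected per group
# One group of 50 is drawn per 2500 households
HOUSEHOLDS_PER_GROUP = GROUP_SIZE / SAMPLING_FRACTION

def household_selection_probability(municipality_size):
    """Total selection probability p_tot = p_1 * p_2 for one household."""
    # First stage: proportional to the size of the municipality
    p1 = Fraction(municipality_size) / HOUSEHOLDS_PER_GROUP
    # Second stage: 50 households drawn within the selected municipality
    p2 = Fraction(GROUP_SIZE, municipality_size)
    return p1 * p2

probabilities = {
    size: household_selection_probability(size) for size in (500, 1200, 3000)
}
```

Every entry of `probabilities` equals 1/50, which is why no weight is needed at this level.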

Spread in Time

Since the number of interviews within a given quarter q might not be equal to exactly one fourth of the number of pre-specified interviews, a reweighting scheme is necessary. It is the factor 1/4 in Equation (1) that ensures that the individuals in an oversampled quarter are down-weighted, and vice versa.
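As a worked illustration with hypothetical numbers, take a stratum with $N_{pash} = 1000$ in which every respondent has $w_i = 1$. Suppose quarter 1 is oversampled with $n^{ie}_{pash1} = 10$ information sources, while quarter 2 has only $n^{ie}_{pash2} = 5$. Equation (1) then gives:

\[ w_{i,fin}^{(q=1)} = \frac{1 \cdot 1000 \cdot \frac{1}{4}}{10} = 25, \qquad w_{i,fin}^{(q=2)} = \frac{1 \cdot 1000 \cdot \frac{1}{4}}{5} = 50 \]

In both quarters the weights sum to 250, one fourth of the stratum population, so a respondent in the oversampled quarter counts half as much as a respondent in the other quarter.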

Post-stratification for the age, sex and household size status

For each province the individuals can be classified into age-sex-household size strata, and since the proportions of these strata in the population are known (or can be estimated in some cases), we used this information to improve the sample estimators.


How was this post-stratification of each province according to age, sex and household size introduced in the weights?

It can be shown that for a stratified sample ($H$ strata) the estimator for a mean is given by:

\[ \bar{y} = \sum_{h=1}^{H} \frac{N_h}{N} \, \bar{y}_h \]

A nice property is that this stratified estimator can be rewritten as a weighted estimator. When we define the following quantity (the inverse of the sampling rate):

\[ W_{hi} = \frac{N_h}{n_h} \]

this average can be rewritten as

\[ \bar{y} = \frac{\sum_{h=1}^{H} \sum_{i=1}^{n_h} W_{hi} \, y_{hi}}{\sum_{h=1}^{H} \sum_{i=1}^{n_h} W_{hi}} = \frac{\sum_{i=1}^{n} W_{hi} \, y_{hi}}{\sum_{i=1}^{n} W_{hi}} \]

where $N_h$ is the population size of stratum h and $n_h$ stands for the number of interviews in stratum h.
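The equivalence between the stratified estimator and the weighted estimator can be verified on a small example. The sketch below assumes three hypothetical strata with made-up population sizes and observations; the function names are illustrative only.

```python
def stratified_mean(strata):
    """Stratified estimator: sum over h of (N_h / N) * mean of stratum h.

    strata: list of (N_h, [y_h1, ..., y_hn]) pairs.
    """
    N = sum(N_h for N_h, _ in strata)
    return sum(N_h / N * (sum(ys) / len(ys)) for N_h, ys in strata)

def weighted_estimator(strata):
    """Weighted estimator with W_hi = N_h / n_h for every unit in stratum h."""
    numerator = 0.0
    denominator = 0.0
    for N_h, ys in strata:
        W = N_h / len(ys)              # inverse of the sampling rate
        numerator += sum(W * y for y in ys)
        denominator += W * len(ys)
    return numerator / denominator

# Hypothetical strata: (population size, observed binary outcomes)
strata = [
    (6000, [1, 0, 1, 1]),
    (3000, [0, 0, 1]),
    (1000, [1, 1]),
]

by_strata = stratified_mean(strata)
by_weights = weighted_estimator(strata)

# Unweighted mean of the pooled sample, for comparison
pooled = [y for _, ys in strata for y in ys]
unweighted = sum(pooled) / len(pooled)
```

The two estimators agree (0.65 in this toy case), while the unweighted sample mean differs because the strata are sampled at different rates.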

The individual weights we propose for the HIS2004 are computed in this way, using the National Registry of 1 January 2004. Although formula (1) does not have the standard form, it naturally leads to the same results.

\[ W_{hi} = \frac{N_h}{n_h} \;\Rightarrow\; w_{i,fin} = \frac{w_i \cdot \left( N_{pash} \cdot 0.25 \right)}{n^{ie}_{pashq}} \qquad (1) \]

The advantages of these individual weights are:

• They can be used for all purposes. They are suitable for obtaining results for the federation, for a region, at provincial level, etc.

• Several aspects of the design are combined into a single, compound weight, thereby simplifying the computations and easing understanding.


Annex 6 Practical implementation of HISIA (HIS Interactive Analysis program)

This application is interactive. Pre-defined procedures accessible through menus make it very user-friendly, as it does not require any prior knowledge of the statistical package. The results are readily available.

1. HOMEPAGE

The software package Dreamweaver® is used to create and update the homepage of HISIA (index.htm) and its related pages (in the future this package will probably be replaced by SharePoint). These related pages include a registration form (registration.htm), instructions for the use of the procedures (use.htm) and a description of the background variables (backvar.htm). These files can be found on the network drive SLCD_Data; for HISIA 2013 they will be saved in the directory X:\HIS\HIS2013\WEBSITE\HISIA. These pages link to the different interactive webpages, which are structured according to the main chapters (health status, life style, prevention, medical consumption and health & society), followed by the related modules.

2. INTERACTIVE WEBPAGES

An interactive webpage is developed for each module (mental health, nutritional habits, smoking, hospitalization…). The menu system of each webpage has the same structure: first select an indicator, then a year, a geographical level, and no, one or two parameters. Only the structure of the module “Chronic conditions (specific)” (chrondis.sas) differs slightly from the others, because selecting an indicator makes a sub-choice appear. These files, with the menu system and the underlying structure, are drawn up with HTML in SAS (for the HIS 2008: access08.sas to trauma08.sas) and are stored in K:\INTERNET\PROG\HISIA (see footnote 11).

3. DATABASES

Different types of data are stored in the directory K:\INTERNET\DATA\HISIA. First, there are two databases on which the analyses are run: the HIS database at individual level (datahis08.sas7bdat for the HIS 2008) and the one at household level (datahh08.sas7bdat for the HIS 2008). This directory also contains the Excel files used to create the lists of the menu system: indicat.xls, year.xls, geograph.xls, param1.xls and param2.xls. These Excel files must be imported into SAS to create the necessary SAS files.

Indicat.xls lists all the indicators by module. It is a special file because it stores a lot of information: it is used not only to draw up the first menu list, but also contains a field with the title and the footnote that are put in the output table. It also contains fields that link each indicator to the years in which it is available, the age from which it is available (0, 15 or 65 years), its type (ordinal, binary or continuous) and its level (individual or household). Depending on the indicator chosen in the menu system, the corresponding years and background variables (concerning age) are shown in the subsequent selection choices, and the corresponding database and analysis are invoked in the programs. A field with a link to the format of the indicator is also provided.
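The role of the indicator file can be pictured as a lookup from the chosen indicator to the remaining menu choices and the database to use. The sketch below is not the SAS implementation: the field names, the `Indicator` class, the example values and the function `menu_options` are all hypothetical; only the two database names come from the description above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Indicator:
    """One row of the indicator list (field names are hypothetical)."""
    code: str        # identifier of the indicator
    module: str      # module the indicator belongs to
    title: str       # title put in the output table
    footnote: str    # footnote put in the output table
    years: tuple     # survey years in which the indicator is available
    min_age: int     # age from which the indicator is available (0, 15 or 65)
    kind: str        # "ordinal", "binary" or "continuous"
    level: str       # "individual" or "household"

# Database names for the HIS 2008, as given in the text
DATABASES = {"individual": "datahis08", "household": "datahh08"}

def menu_options(indicator):
    """Derive the follow-up menu choices and the database from the indicator."""
    return {
        "years": list(indicator.years),
        "min_age": indicator.min_age,
        "database": DATABASES[indicator.level],
    }

# Hypothetical example row
example = Indicator(
    code="XX01", module="smoking", title="Example title",
    footnote="Example footnote", years=(2004, 2008),
    min_age=15, kind="binary", level="individual",
)
options = menu_options(example)
```

Keeping all of this metadata in one row per indicator means the menus and the analysis programs read from a single source, which matches the central role described for indicat.xls.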

11 ‘K’ refers to: userdata on ‘SAS Business Intelligence Server (sasbi)’


4. STATISTICAL PROCEDURES

All the necessary programs are stored in the directory K:\INTERNET\PROG\HISIA\PRIMARY.

The file import.sas is needed to convert the Excel files into SAS files, which can then be used for the menu lists. Beware: this program is closely tied to the development of the interactive webpages.

The file start.sas needs to be run first because it defines the libnames and refers to the formats, which are defined in format.sas. The latter file contains formats for both the background variables and the indicators.

For the HIS 2008, the webpages link to the program “paramet08.sas”, which is the first program to run. It invokes two other macro programs (m_year08 and m_param08). Depending on the choices made in the menu system (geographical level and no, one or two parameters), further programs are then invoked; these contain the related statistical procedures and calculate the results.

5. URL

https://www.wiv-isp.be/epidemio/hisia
