The European Social Survey1: One in Two Dozen Countries
Ineke Stoop2, Roger Jowell3, Peter Mohler4

Paper to be presented at the International Conference on Improving Surveys, Copenhagen, 25-28 August 2002

Summary
In September 2002 the fieldwork of the first round of the new European Social Survey (ESS) will start. This survey is jointly funded by the European Commission, the European Science Foundation and the National Science Foundations within 24 European countries. The study will focus on changes in attitudes, values and behavioural patterns in the context of a changing Europe. It aims at achieving the highest standards of cross-national survey research, attempting to match the quality standards of the best national surveys. With the help of experts from different fields, a Central Coordinating Team has developed the survey design, the questionnaire and strict guidelines on issues such as random sampling, translation, response rates and fieldwork documentation. In each participating country a national coordinator has been appointed, as well as a survey organisation that will conduct the fieldwork. The first half of 2002 has been largely devoted to the construction of the questionnaire and the methodological preparation of the survey, ensuring high quality and optimal comparability despite local differences. The paper presents the methodological issues that have come up during the preparation of the ESS.

1 Introduction
The European Social Survey is a new, conceptually well-anchored and methodologically rigorous survey that aims to pioneer and ‘prove’ a standard of methodology for cross-national attitude surveys that only the best national studies usually aspire to. The preparation of the survey has involved a wide range of international experts on methodological and substantive issues. Funding by the EU 5th Framework Programme and the European Science Foundation covers the preparation, co-ordination, consultation and the exchange of information. The number of European countries that decided to participate in this new venture has far exceeded the original expectations and now extends to twenty-four. National funders have committed themselves to following a centrally designed specification for the fieldwork and have appointed survey organisations that they believe can achieve the required exacting standards. The study will subscribe to the Declaration on Ethics of the International Statistical Institute, to which all national teams will also be asked to adhere (see Jowell, 1986). While all teams are committed to the highest quality standards for the ESS, these standards are, of course, by no means easy to achieve.

Obviously there is never enough money and never enough time, even for a well-founded project such as the ESS. More important than monetary and time constraints, however, is the fact that survey quality is a multi-faceted phenomenon (see, for instance, Lyberg, 2001; Lyberg et al. 2001; Fellegi, 2001). Various quality criteria matter, such as relevance or content, accuracy, timeliness, accessibility, interpretability and coherence. But the problem with multi-faceted criteria such as these is that they always involve a degree of trade-off. This paper discusses these quality criteria and their implementation in the European Social Survey with particular reference to such trade-offs. In particular, the need for optimal cross-national comparability - discussed here under the heading of coherence - is often in direct conflict with the need to adhere closely to the other criteria (Jowell, 1998).

2 Content
The ESS intends to measure changing social attitudes and values in (now) 24 European nations. Around one half of the hour-long questionnaire is a core element comprising key repeat questions to measure change and persistence in a range of social and demographic characteristics, attitudes and behaviour patterns. This core contains questions on occupation and social structure, social exclusion, religious affiliation and identity, ethnic and national identity, political trust, party affiliation, multilevel governance and voting behaviour, media consumption and value orientations. The other half of the questionnaire – the rotating element – consists of two topic-specific modules per round to measure particular academic and policy concerns and debates that require examination in depth. These modules are selected via an international competition. In 2002, the selected modules are on ‘citizenship, involvement and democracy’ and ‘immigration’. In addition to the hour-long face-to-face questionnaire, a short self-completion questionnaire (though in some countries it will be an extension of the face-to-face interview) will provide room for a scale on ‘basic human values’, plus a number of methodological test questions designed to quantify the reliability and validity of certain measures in the interview.

1 Information on all aspects of the survey is available at: www.europeansocialsurvey.org. This site will be regularly updated.
2 Social and Cultural Planning Office, The Hague, The Netherlands, [email protected], www.scp.nl
3 National Centre for Social Research, London, UK
4 Zentrum für Umfragen, Methoden und Analysen, Mannheim, Germany

There were, of course, substantial trade-offs involved in constructing a core 30-minute questionnaire that is designed to measure changes in attitudes, norms, values and social structure over time. Considerations such as sustainability, scope and how to deal with diversity continually arose. Such questions arise in national surveys too, where certain topics or phraseologies may be more relevant for certain parts of the population than for others, but they are greatly magnified when contemplating a diverse 24-nation survey. Can the same question on, say, social exclusion be asked of a small farmer in a relatively poor country and of a stockbroker in a rich country? Do the same questions about confidence in multi-level governance have the same meaning to inhabitants of EU countries as to people from Switzerland, Hungary and Turkey?

The principle is, of course, all too clear. The task of our survey measurements is to discover and calibrate cross-cultural and cross-national differences in people’s responses, and to achieve that we must try to keep the stimulus of the questions as constant as possible between respondents. The problem, however, is that there is no easy way of guaranteeing such equivalence of meaning, especially in cross-national surveys where between-country variance is large. But the word “especially” in that sentence is important, because - as we have noted - these problems are by no means confined to cross-national surveys. Almost all surveys are, to varying degrees, cross-cultural; cross-national surveys tend merely to be more so.

The conflict arises acutely when the ‘best’ cross-national measure of a particular concept seems to be different from the ‘best’ national measure of the same concept. We are not here referring to the added problem of lexical equivalence between different languages, just to the problem of different cultural constructions of the same basic concept. Suffice it to say that our approach in the ESS is, wherever possible, to rely in such circumstances on the best cross-national measure we can come up with, for fear otherwise of creating measurement anarchy.

Another variation, both between countries and over time, that we will be trying to mitigate within the ESS, is the influence of context on responses. Part of the ESS data set will thus consist of an ‘event data bank’ which briefly documents the major political, social and economic factors just before and during the fieldwork period that are likely to have a substantial bearing on a particular country’s - or a group of countries’ - response patterns (see also section 5.3).

3 Accuracy
The accuracy of surveys is generally affected by:
• coverage errors, which occur because not every member of the target population has a known, non-zero chance of inclusion in the sampling frame;
• sampling errors, which occur because only a subset of the target population is selected;
• measurement errors, which occur because the survey mechanisms (such as the questionnaire, interviewers or coders) introduce faulty answers;
• nonresponse errors, which occur because respondents and non-respondents turn out to have different characteristics that relate to the survey’s purpose.

3.1 Population and coverage errors
The ESS aims to be representative of the residential population of each participating nation aged 15 years and above (with no upper age limit), regardless of their nationality, citizenship or legal status. A ‘resident’ is defined as anyone who has been living in the country for at least a year and who has no immediate concrete plans to return to his or her country of origin. Although, of course, the population of a country also includes the ‘homeless’ and people living in institutions (such as hospitals and prisons), these groups will be excluded from the target population, at least in Round 1 of the ESS. In any country where a minority language is spoken as a first language by at least 5% of the resident population, the questionnaire will be translated into that language too and appropriate interviewers will be deployed to administer the interviews.

Given the small sizes of the groups above, the costs of interviewing them did not seem justified by their expected yield. Nonetheless, their exclusion introduces coverage errors in a survey designed to represent the resident populations of all participating countries. And not only are these errors greater in some countries than in others, but they are also often difficult to quantify.

3.2 Sampling and sampling errors
Strict probability methods (systematic random sampling) will be deployed at every stage in all countries, such that the relative selection probabilities of every sample member will be known and recorded in the data set. Neither quota sampling nor any form of substitution of non-responding households or individuals (whether ‘refusals’ or ‘non-contacts’) will be allowed at any stage. In some countries, registers of persons or voters exist, from which national samples are routinely drawn, and they will be supplemented where necessary to satisfy the ESS’s more inclusive definition of the universe. A key function in the study’s co-ordination is to ensure that these rules are followed. To this end, sampling specialists have visited a number of countries to help ensure that the exacting procedures are not only followed but also meticulously recorded.
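Because the relative selection probability of every sample member is recorded in the data set, analysts can later derive design weights as the inverses of those probabilities. A minimal sketch of that derivation (the function name and the normalisation to a mean of 1 are illustrative choices, not part of the ESS specification):

```python
# Illustrative sketch: deriving design weights from recorded selection
# probabilities. Nothing here is taken from the ESS specification itself.

def design_weights(selection_probs):
    """Inverse-probability design weights, normalised to a mean of 1."""
    if any(p <= 0 for p in selection_probs):
        raise ValueError("every sample member needs a known, non-zero probability")
    raw = [1.0 / p for p in selection_probs]
    mean = sum(raw) / len(raw)
    return [w / mean for w in raw]

# Four respondents drawn with unequal (hypothetical) probabilities:
weights = design_weights([0.001, 0.002, 0.001, 0.004])
```

Respondents drawn with lower probability receive proportionally higher weights, which is precisely why the design insists that no selection probability be unknown or zero.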

More important than each country’s nominal sample size is its effective sample size, since different sample designs generate different standard errors. In particular, the clustering of a sample into a number of discrete geographical areas (highly desirable on budgetary grounds in most surveys, including the ESS) produces ‘design effects’ which reduce the sample’s ‘effective’ size (see Kish, 1965). Different response rates have a similar impact. To counteract this problem in the ESS, the specified sample size for each country is the effective sample size - that is, the number of observations required for a simple random sample to produce the same precision as the design actually used. The result is that while nominal sample sizes will of course all be larger and will vary between countries, each country is attempting to produce an effective sample size of around 1500, generally comprising around 2000 or more interviews.
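The arithmetic behind this specification can be illustrated with Kish's well-known approximation for the design effect of a clustered sample, deff = 1 + (m - 1) * rho, where m is the mean cluster size and rho the intra-cluster correlation. The sketch below uses purely illustrative values, not actual ESS design parameters:

```python
# Sketch of the effective-vs-nominal sample size trade-off under clustering,
# using Kish's approximation. The numeric values are illustrative only.

def design_effect(cluster_size, rho):
    """Kish design effect for equal-sized clusters: 1 + (m - 1) * rho."""
    return 1.0 + (cluster_size - 1) * rho

def nominal_sample_size(n_effective, cluster_size, rho):
    """Interviews needed so that precision matches a simple random sample
    of size n_effective; in practice this would be rounded up."""
    return n_effective * design_effect(cluster_size, rho)

# A target of 1500 effective interviews, clusters of 15 addresses, and a
# modest (hypothetical) intra-cluster correlation of 0.02:
needed = nominal_sample_size(1500, 15, 0.02)  # about 1920 interviews
```

Even this modest intra-cluster correlation pushes the nominal requirement well above the effective target, before any further allowance for nonresponse, which is consistent with the 2000-plus interviews mentioned above.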

As it turns out, better sampling frames may perversely lead to higher coverage errors. While population registers, for instance, are thought to be ideal sampling frames containing as they do pertinent information on potential respondents, they are sometimes more out of date than address samples which lack such information. Address sampling frames also tend to be more inclusive, embracing students who do not live at their ‘official’ address, illegal aliens, etc.

The conflict between random and quota sampling was not hard to resolve in a study with the aims of the ESS, but for some countries it represents a change from the normal service of fieldwork agencies, which might well prove sub-optimal at first.

3.3 Questionnaire construction and measurement error
The prescribed mode of data collection in the ESS in all countries is a face-to-face interview, since for the moment this method seems best fitted to reduce measurement error in a diverse 24-nation survey. Face-to-face surveys generally have the important advantage of achieving higher response rates than most telephone or postal (or probably Internet) surveys, thus making them likely to be more representative. Another advantage of face-to-face surveys – from an ESS perspective especially – is that they can generally be far longer in duration than other modes without falling foul of low response rates or incomplete interviews. In addition, unlike telephone surveys, but in common with postal and Internet surveys, face-to-face interviews are able to employ helpful visual aids such as show cards. Finally, whereas postal and Internet household surveys are always at risk of becoming ‘fun for the whole family’, face-to-face surveys (in common with telephone surveys to a lesser extent) ensure a measure of control over who completes the interview and in what circumstances.

Even so, for all their persuasive advantages, face-to-face surveys are not only more expensive and labour-intensive than all other modes, but are also the most likely to encourage social desirability bias on the part of respondents. They are also becoming less and less the norm in certain countries, such that major survey agencies in those countries have disposed of their face-to-face interviewing capacity. This is a slippery slope for the ESS and other time series and may necessitate a long-term re-think about mode effects and how to mitigate them, possibly contemplating a mixed-mode future.

The painstaking construction of both core and rotating modules has gained from the input of several groups of substantive experts, each of which was asked to produce papers that discussed and argued the theoretical and practical case for certain measures at the expense of others in their field. These long lists then had to be honed down during a year of questionnaire development into a more parsimonious set of questions which were eventually tested in two large pilot surveys of around 600 respondents each – one in the UK and one in the Netherlands. Extensive statistical analyses of both the draft questions and then the pilot results (using MTMM and other analytical tools) were undertaken with a view to maximising their reliability and validity (see Saris and Gallhofer, 1998; Krosnick and Fabrigar, 1997; Scherpenzeel and Saris, 1997). Further refinements and simplification of the draft questions resulted in the final English ‘source questionnaire’, which is the basis for detailed translations into the languages of all participating nations.

To ensure that the best translation protocols were adopted, the source questions were annotated in detail for the benefit of translators, who need to know not only the words used in a question but also what they denote in order to achieve a functionally equivalent translation. In other respects too, the translation protocols were specified in detail to guard against the introduction of unwanted variance (see Harkness, 1998; Harkness, 1999; Carey, 2000). The questionnaire design process ends with small pre-tests of the translated questionnaires in each participating country - in effect a multinational dress rehearsal - to enable final adjustments to be made in response to residual translation problems. The difficulty at this point of course is that any single-country problem that requires more than just a straightforward adjustment of translation to ensure it works in that country will automatically have multi-country consequences. To prevent such a domino effect, the reaction to late minor problems of this sort thus tends to be less assiduous than it ought perhaps to be from the narrow vantage point of measurement quality alone. In the conflict between late perfection and timely imperfection, the latter alternative often has to win.

3.4 Fieldwork and nonresponse errors
As noted, the mode of ESS data collection in all countries is face-to-face interviewing, since it is best placed to boost response rates and achieve consistent data quality across diverse nations. Maximum assignment sizes for interviewers are pre-specified, as are minimum proportions of back-checks (by telephone or in person). To reduce possible bias caused by nonresponse, a wide range of measures is specified, as follows:
• A minimum target national response rate of 70 per cent is laid down, recognising that the target will not universally be achieved. But though this target is high (or very high) for certain countries, it is relatively modest for others, and the aim of the ESS is to attempt to raise rather than bow to current norms. Since response rates cannot be legislated for, however, we insist instead on certain fieldwork procedures in all countries that help to maximise the chances of recruiting or converting elusive sample members.
• To reduce refusals, for instance, we insist on personal briefings of all interviewers on doorstep interactions and the consideration of a range of response-maximisation techniques, such as advance letters, a brochure, re-issuing non-contacts and ‘soft’ refusals, and incentives (see Carton and Loosveldt, 1999; Campanelli et al., 1997).
• Data will also be collected about non-productive addresses and individuals (whether non-contacts or refusals) for possible subsequent use in weighting or post-stratification of the data.
• Procedures for minimising non-contacts are specified, together with a maximum target non-contact rate of 3%.
• At least four visits have to be made before abandoning a sampling unit as non-productive, with calls spread across times of day and days of week. Similarly, to ensure that difficult-to-contact or busy people are conscientiously pursued, a minimum fieldwork duration of 30 days has been specified, but - to reduce seasonal effects - within a maximum time window of four months (September to December 2002 for the first round).
• Most importantly, outcome codes are completed for each call at each address on standardized contact forms. These include interviewer-recorded observations of the dwelling and its neighbourhood and features that may hamper contactability (Groves and Couper, pp. 88-89). This will enable response patterns to be identified and subsequently response rates and nonresponse to be consistently documented and calculated across countries (see Laiho et al., 2000).
• Even in certain countries (notably Scandinavian countries) where access to national registers means that first contact with individual respondents may be by telephone, all addresses will always be visited in person, whether to conduct the interview face-to-face, or to try to convert a ‘soft’ telephone refusal or non-contact, or to collect contextual household and neighbourhood information from nonresponding households.
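The value of standardised outcome codes is that indicators such as response and non-contact rates can then be computed identically in every country. A simplified sketch of such a calculation (the code labels and the one-line eligibility rule are hypothetical illustrations; the real ESS contact forms distinguish many more outcomes):

```python
from collections import Counter

# Illustrative sketch: computing fieldwork quality indicators from
# per-address final outcome codes. The labels below are invented for
# this example and are not the actual ESS contact-form categories.

def fieldwork_rates(outcomes):
    counts = Counter(outcomes)
    eligible = sum(counts.values()) - counts["ineligible"]
    return {
        "response_rate": counts["interview"] / eligible,     # ESS target: 70%
        "noncontact_rate": counts["noncontact"] / eligible,  # ESS target: max 3%
    }

# A hypothetical national outcome distribution over 110 issued addresses:
codes = (["interview"] * 70 + ["refusal"] * 20 + ["noncontact"] * 3
         + ["other_nonresponse"] * 7 + ["ineligible"] * 10)
rates = fieldwork_rates(codes)
```

Because every country completes the same contact forms, the same function can be run unchanged on each national data set, which is what makes the cross-country documentation of nonresponse consistent.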

In combination, these procedures are exacting almost everywhere in one respect or another, and by no means correspond to the norm. Does this mean that non-compliance will be high and that it might have been better, on consideration, for the ESS to aim for a consistent level of mediocrity? Our view is still emphatically not.

3.5 Rigorous quality requirements: an overview
Within specific time and budget constraints, reducing one set of potential errors may well increase another set. Even so, the ESS has specified a combined programme of error avoidance, including:
• insisting on explicit theoretical foundations for all questions;
• assessing the quality of questions through a priori tests and MTMM pilots;
• enforcing strictly random sampling and setting up an expert sampling panel to advise on and sign off each national sampling scheme;
• developing extensive translation protocols, procedures and guidelines and closely supporting national teams;
• ensuring as far as possible that high response rates will be aimed for and achieved;
• recording outcomes and observational data on standardised contact forms;
• building in methodological experiments and evaluation procedures to ensure that lessons are learnt;
• and especially, requiring every actor in the process - from the central team to national coordinators to fieldwork agencies - painstakingly to document every phase of every procedure for future reference and methodological scholarship.

4 Timeliness
As noted, strict adherence to timetables generally conflicts with the attainment of perfection. In the case of the ESS, however, the timetable constraints are tyrannical. So, at each stage of the process, conflicts arise – for instance between improving the questionnaire and meeting the pilot deadline, between improving response rates and meeting field deadlines (and budgets), between catering for national variations in the scheduling of fieldwork and releasing the archived data on time. In some ways, the ESS is fortunate in effectively being able to ignore the possible trade-off between small increments in survey quality and small delays in delivery. The reality within which we are working is that the delivery dates must reign supreme. This is, of course, usually the case in a time series, because delays in one round tend to have devastating effects on the timetable of the subsequent round, and so on. But the ESS is also bound by contractual considerations which are probably even more unforgiving.

5 Accessibility and interpretability

5.1 Archiving and dissemination
ESS data will be distributed at no charge to the scientific community. The Norwegian Social Science Data Services (NSD), the ESS Archive, will assemble all datasets, then merge, document and disseminate them in a user-friendly format within 6 months. In accordance with data protection regulations in various countries, only anonymised data will of course be made available to users. The Archive will:
• evaluate and assess the content, structure and format of datasets and documentation;
• check, correct and validate errors;
• standardise or harmonise national variables, wherever possible via standard international classifications;
• merge all national data into an integrated multinational dataset;
• document the datasets according to the ‘DDI meta data standard’;
• convert the datasets, meta data and accompanying technical report into the DDI standard;
• generate portable and system files for all common statistical packages (SPSS, SAS, SYSTAT, etc.), based on the generic DDI format;
• generate a single comprehensive codebook for integration into the technical report;
• incorporate the questionnaires into a universal document exchange format, such as PDF;
• produce a full range of indexes and associated meta data to enable searches and retrieval functions to be employed, including a multilingual thesaurus;
• develop a meta database containing all textual elements (such as questionnaires, technical report, data definitions, etc.);
• ensure permanent availability of the data, taking into account contemporary and future media, formats and analysis software.

Through the state-of-the-art software system NESSTAR (which has been produced under an EU Framework Programme), remote analysis and data delivery will both be achieved over the Internet. This system enables users to locate multiple data sources across national boundaries, browse detailed meta data about the dataset, analyse and visualise data online, and download the appropriate subsets of data in one of a number of formats. The data will also be available within newly developed Internet software systems, such as ILSES (Integrated Library and Survey-data Extraction Service). In addition, a website will contain full details of the data content, access arrangements, codebooks and other documentation. Information about the project will be distributed to mail groups of users. Over time, all articles, books and papers based on the project – whether substantive or methodological – will ideally be documented and catalogued by the Archive.

5.2 Training resource
In Round 2 of the ESS, for which European Commission funding has now been secured, an internet-based training resource will also be provided for students of comparative research, based on the data collected at each round of the project. This training programme will be accessible via standard web-browsers connected to the Internet and will consist of two or more learning packages, each containing a well-documented data set of variables from one or more of the topics in the survey plus background and relevant contextual data. Examples of possible topics are attitudes towards immigrants, political participation and political interest, and moral values.

For each learning package, on-line study material, to be developed by leading European scholars in the field, will be available, plus examples and guidelines on the limits of the data. The intention of this resource is twofold – on the one hand to inspire and stimulate interest in the data per se, and on the other to foster a collaborative spirit for comparative research. Thus the learning package must aim at a wide audience and, to ensure this, it will range from basic to advanced multivariate and multilevel analysis. While descriptive analysis and graphics training will be fully integrated into the web site, more advanced statistical analysis will require datasets to be downloaded into all standard statistical packages – an option designed to attract more sophisticated users too.

The main users of the training resource are likely to be European postgraduate students with limited training in social science methodology, an important community to introduce both to the dataset itself and to quantitative data more generally. But the packages may also appeal to students in secondary education as part of their empirical coursework.

5.3 Context and event data
All survey responses can be affected by timing and context. But certain types of attitudinal data are particularly prone to such effects. With that in mind, the ESS feels the need to integrate national and European-level contextual data into the datasets to increase their analytic power. In particular, this should help to identify any national variations uncovered that turn out to owe more to exogenous factors than to underlying attitudinal differences.

So, country-based demographic and socio-economic macro statistics will be collected from available sources from a wide range of international organisations and websites. But equally important may be contextual socio-political factors in each country before and during fieldwork which may have an impact on responses. Examples are the proximity of an election, industrial or political unrest, or a natural disaster. If a major event is recorded which occurs in the middle of fieldwork, then responses before and after the event can be compared. For instance, consider the impact of the Pim Fortuyn assassination in the Netherlands, the subsequent outbreak of discontentment and unrest, and the change of government, all of which occurred within a very short period of time. Moreover, such events may influence not only one country’s public attitudes, but have wider implications for politics and social attitudes elsewhere. An example of this phenomenon is the 11 September 2001 attack in the USA which had, and continues to have, an emphatic impact on perceptions worldwide. But nearer to home, events like Chernobyl on the one hand or the removal of the Berlin wall on the other, can also have far-reaching (and in some cases measurable) impacts.

Other events of contextual relevance to a survey such as the ESS are less dramatic but not necessarily less influential. For instance, as we write, there has been a general swing to the right in European national election outcomes which may or may not be part of a continuing trend. And there has been a series of controversies or rows relating to immigration and asylum – again across a range of European countries. These less discrete but no less important events or processes also need to be monitored and recorded.

In order to prepare for this process, we conducted a trial in which 13 national co-ordinators collected and recorded events of this nature for 6 weeks. The results were encouraging and revealing, demonstrating the promise of this approach. So we are confident that the final event data bank covering all ESS participating countries will prove to be a useful asset and may influence protocols in other cross-national research.

6 Coherence
“Coherence”, according to Fellegi (2001), is “an aspect of quality that refers to several, as opposed to a single statistical series: it reflects the extent to which series which ought to adhere to a certain expected relationship deviate from such a behaviour”. But coherence is multi-faceted in other ways too, such as in the achievement of optimal comparability between related data sources, across participating countries, and over time. We deal briefly with each of these facets of coherence in turn.

6.1 Optimal comparability with other measures and classifications
In addition to its ambitious methodological aims, the substantive purpose of the ESS is to chart and interpret the speed and direction of change in underlying public attitudes across Europe. Even in this increasingly well-documented age, such data are still scarce in Europe at a national level, but especially at a European level. There are exceptions, such as the Eurobarometer, the European Values Surveys (EVS) and the International Social Survey Programme (ISSP), but each of these time series has its own issues of coherence to contend with. Meanwhile, the excellent series of Eurostat surveys properly confine themselves almost exclusively to demographic and behavioural data of one sort or another. Even so, neither the design nor the content of the ESS was forged within a vacuum. There were many distinguished footsteps to guide our path, though in truth, much of the best attitudinal work had been achieved at a national rather than at a European level.

This created both a problem and an opportunity. In terms of questionnaire development, for instance, we had to decide between two opposing strategies: whether on the one hand to try wherever possible to use pre-existing, well-proven national measures of certain phenomena, or on the other to attempt to construct an optimal cross-national measure de novo. Although we debated these two possible strategies at length, we came to no hard and fast conclusion, leaving it in the end to the old devices of expert judgement and piloting to determine which strategy ultimately prevailed for each measure. The result is that the questionnaire contains some entirely new measures, some pre-existing national measures that we have adopted wholesale, and some hybrids. It is true, however, that whenever we came across an unsolvable conflict between coherence over time within the new ESS versus coherence with a prior measure, we chose future over past comparability.

As far as independent variables are concerned, we naturally stuck wherever possible to the wide array of harmonised and standardised international measures that are already available.

6.2 Optimal comparability cross-nationally
To be defensible, a coherent multi-national survey requires strict adherence to common or equivalent multi-national procedures and instruments. Otherwise, as Carey (2000) observes, local (or even contractor) preferences will always tend to prevail over the common good. Although, of course, local constraints need to be properly taken into account and compromises always need to be struck, participants in cross-national surveys nonetheless have to adapt to a quite different set of ‘rules’ that may at first appear to trample cultural and academic values, but which are in fact designed to suppress avoidable variance. While quite different sampling frames and procedures can, under certain specified circumstances, generate random samples with identical properties, this is just not the case as far as most other features of a survey are concerned. Fortunately, all participants in the ESS have appreciated this necessity from the outset, and have adapted heroically to the inconvenient local adjustments sometimes involved.

So, for instance, whereas mixed-mode (a combination of telephone, face-to-face and possibly self-completion or web-based) data collection might well lead to a higher response rate in certain countries, it had to be ruled out of the first round of the ESS because of potential mode effects in the results. These would be exacerbated in a cross-national survey by very different levels of telephone penetration in different countries.

As noted, some countries have nonetheless been permitted under specified conditions to make the first approach to respondents by telephone, but the interview itself is always conducted face-to-face – thus allowing for what we considered to be benign local variations. In a similar vein, we provided no centrally fixed format for advance letters or brochures, or for precise response enhancement procedures, restricting the documentation on these issues to guidelines, information and advice.

On translation matters, the conflicts were once again more severe. Neither the central team nor even the special translation workforce possessed the skills or resources to ‘sign off’ each translation from the source questionnaire into the nearly 20 different languages involved. Instead, as for sampling, the procedures were closely specified, leaving room for variations at a national level according to circumstances, e.g. when one country within the ESS shares a common language with another.

6.3 Optimal comparability longitudinally

A proposed time series such as the ESS is in a unique position to test its own methods at each round and between rounds, and then to incorporate improvements over time. For instance, while accepting that exclusively face-to-face interviewing is optimal for Round 1, this choice might well need to be re-visited in future rounds. On this sort of issue there is in any case likely to be increasing pressure over the years to consider alternatives in order to keep costs down. So mixed-mode surveys are bound to come into future reckoning, almost certainly including telephone and internet-based self-completion methods alongside face-to-face interviewing and paper self-completion methods. But what impact might such major modifications have on the long-term coherence of the time series? True, other time series might well have to address the same question, but since the ESS is being set up at a time when these sorts of developments are already clearly in prospect, it can - at least to a degree - plan for this eventuality.

In particular, the ESS has a methods budget which might enable it to contribute to debates about mode-switching and mode-choosing across countries and over time. It might thus begin to do more than merely record and measure mode effects, and try to address more difficult questions such as:

• Under what conditions might telephone interviewing be capable of producing equivalent answers to face-to-face interviewing?
• What damage might be done if telephone interviewing were to become the exclusive mode in certain countries, or one of the modes in several countries, and is it repairable?

• What precautions or adjustments might be necessary before individual respondents can be permitted to choose for themselves which of several modes they prefer from among all the alternatives - face-to-face interviewing, telephone, postal, or internet?
• What might the effects of such alternatives be on response rates, equivalence and overall data quality?

In addressing questions such as these, the ESS team will not, of course, be alone. But we hope that we might be able to work with others already addressing such questions and possibly to use the ESS as a vehicle for serious small-scale experimentation. Its particular value is that it will offer the prospect of cross-national experimentation. For the moment, the ESS’s precise methodological agenda has yet to be determined. This will be one of our tasks over the next few months. We are fortunate to have in place an expert international Methods Group to advise and guide us on this agenda. But we will also welcome advice from colleagues worldwide who are interested in such work.

References

Campanelli et al., 1997
P. Campanelli, P. Sturgis and S. Purdon (eds.) (1997) Can you hear me knocking: An investigation into the impact of interviewers on survey response rates. London: Survey Methods Centre at SCPR.

Carey, 2000
Siobhán Carey (ed.) (2000) Measuring Adult Literacy. The International Adult Literacy Survey in the European context. London: Office for National Statistics.

Carton and Loosveldt, 1999
Ann Carton and Geert Loosveldt (1999) How the initial contact can be a determinant for the final response rate in face-to-face surveys. Paper presented at the International Conference on Survey Nonresponse, Portland, Oregon, October 28-31 1999.

Fellegi, 2001
Ivan P. Fellegi (2001) Comment (on: Can a Statistician Deliver?, same issue), Journal of Official Statistics 17:1, pp 43-50.

Groves and Couper, 1998
Robert M. Groves & Mick P. Couper (1998) Nonresponse in Household Interview Surveys. New York: John Wiley & Sons.

Harkness, 1998
J. Harkness (ed.) (1998) Cross-Cultural Survey Equivalence. ZUMA-Nachrichten Spezial Band 3. Mannheim: ZUMA.

Harkness, 1999
J. Harkness (1999) In pursuit of quality: Issues for cross-national survey research. International Journal of Social Research Methodology, Vol. 2, No. 2, pp. 125-140.

Jowell, 1986
R. Jowell (1986) The codification of statistical ethics, Journal of Official Statistics 2:3.

Jowell, 1998
R. Jowell (1998) How comparative is comparative research?, American Behavioral Scientist, 42:2, pp 168-177.

Kish, 1965
L. Kish (1965) Survey Sampling. New York: John Wiley & Sons.

Koch & Porst, 1998
A. Koch & R. Porst (eds.) (1998) Nonresponse in Survey Research. ZUMA-Nachrichten Spezial Band 4. Mannheim: ZUMA.

Krosnick and Fabrigar, 1997
J.A. Krosnick & L.R. Fabrigar (1997) Designing rating scales for effective measurement in surveys. In: Lars Lyberg et al. (eds.) Survey Measurement and Process Quality (pp. 141-164), New York: John Wiley & Sons.

Laiho et al., 2000
J. Laiho & Peter Lynn (NCSR), R. Beerten & J. Martin (ONS) (2000) Alternative Definitions and Multiple Uses of Response Rates. Paper presented at the 11th International Workshop on Survey Nonresponse, Budapest, Hungary, 27-29 September 2000.

Loosveldt, 1995
G. Loosveldt (1995) The profile of the difficult-to-interview respondent. Bulletin de Méthodologie sociologique, 48 (September), pp. 68-81.

Lyberg, 2001
Lars Lyberg (ed.) (2001) ‘Can a Statistician Deliver?’ and Comments, Journal of Official Statistics 17:1, pp 1-127.

Lyberg et al., 2001
Lars Lyberg et al. (2001) Summary Report from the Leadership Group (LEG) on Quality, SPC.

Saris and Gallhofer, 1998
W.E. Saris and I.N. Gallhofer (1998) Classificatie van survey-vragen [Classification of survey questions], Tijdschrift voor Communicatiewetenschap, Vol. 26, pp. 96-122.

Scherpenzeel and Saris, 1997
A. Scherpenzeel and W.E. Saris (1997) Effects of data collection technique on the quality of survey data: An evaluation of interviewer- and self-administered computer-assisted data collection techniques, Sociological Methods and Research, Vol. 25, No. 3.

Stoop and Louwen, 2000
I.A.L. Stoop & F. Louwen (2000) Do nonrespondents differ? Paper presented at the 11th International Workshop on Survey Nonresponse, Budapest, Hungary, 27-29 September 2000.
