CARIBBEAN COMMUNITY SECRETARIAT

SIXTH MEETING OF THE CARICOM ADVISORY GROUP: CARICOM PROGRAMME ON STRENGTHENING CAPACITY IN THE COMPILATION OF SOCIAL/GENDER AND ENVIRONMENT STATISTICS AND INDICATORS IN THE CARICOM REGION ST/IND/AG/2005/6/15

Paramaribo, Suriname, 13-17 June 2005

14 June 2005

Measuring Performance of Statistical Offices

INTRODUCTION

According to Willem de Vries1, The Economist was the first to produce a ranking of National Statistical Offices, in 1991. That first round of rankings concentrated on the timeliness and accuracy of major Statistical Series. For the second round of rankings, in 1993, The Economist asked a panel of experts in various countries to rank the statistical agencies in the 13 largest industrialized countries2 on:

1- Objectivity (i.e. absence of political interference)
2- Reliability of the figures
3- Statistical Methodology
4- Relevance of the Published Figures

The Economist looked further at:

5- Average size of revisions to GDP growth
6- Timeliness (Average speed of publication of GDP, industrial production, CPI and Trade, based on the latest three published figures)
7- Resources available (Statisticians per 10,000 population and Government Statistics budget per capita, using PPP figures)

1 Netherlands Official Statistics, Spring 1998 (will further be referred to as NOS-1998)

2 The Economist 1997, pp 37-38


Canada (1), Australia (2) and the Netherlands (3), came out on top and Belgium (13), Spain (12) and Italy (11) were at the bottom of the rankings!

Clearly, there is room for improving the criteria and one can certainly wonder why some countries were included (why only the 13 largest Industrialized countries?) and some were not.

In what follows we shall try to propose a more balanced list of criteria. In doing so we shall borrow freely from de Vries, but we shall also amend his proposal. His proposal consists of two parts: Part 1, called Principles, is based on the Fundamental Principles of Official Statistics and includes 25 criteria; Part 2, called Practice, covers the Quality of ten Major Statistical Series.

QUALITY AND PERFORMANCE INDICATORS

If we may recall the Preamble of the Fundamental Principles of Official Statistics: “Official Statistical Information is an essential basis for development in the economic, demographic, social and environmental fields and for mutual knowledge and trade among the States and peoples of the world”, then it is clear that the Quality of Data and Information is a preponderant issue for every Statistical Office.

EUROSTAT has developed the following dimensions of data quality3 for the European Statistical System4.

1- Relevance (Statistics are relevant when they meet users' needs. Most likely, relevance will have to be decided upon by using some sort of Customer Satisfaction Survey)
2- Accuracy (Closeness between the estimated value and the unknown, true value of a population. It requires reporting on Sampling and Non-sampling errors)
3- Timeliness and Punctuality (Timeliness refers to the lapse of time between the end of a reference period or reference date and the actual dissemination of data. Punctuality has to do with the difference between the actual date of availability and the date included in the publications calendar for a series)
4- Accessibility and Clarity (These refer to the number and types of media used in disseminating statistics, as well as the explanatory texts, i.e. meta-data)
5- Comparability (Geographic, Temporal and across fields: gauged by the concepts and definitions used)
6- Coherence (Statistical data are coherent when they can be combined reliably in different ways and for different purposes, regardless of whether they originate from a single source or from different sources. One should look at e.g. coherence between provisional and final results, conjunctural and annual results, etc.)
7- Completeness (The difference between the statistics that are available and those that should be available to meet e.g. CARICOM requirements)

We shall come back to these seven Dimensions of Data Quality later, after we have presented a proposal of our own.

Just like de Vries and The Economist we think that the performance indicators should be a combination of Actual Statistical Series and Other criteria. As regards the actual statistical series the most popular choices are:

3 It should be noted that the General Data Dissemination System of the IMF distinguishes between Quality of Data and Quality of the System. Data Quality has everything to do with EUROSTAT’s criteria 1, 2, 3, 5, 6 and 7, while the quality of the System is rooted in the availability of meta-data (explanatory or supporting notes on data) for the users of those data, i.e. EUROSTAT’s criterion 4.

4 See Carmen Arribas et al (UN-ECLAC, 2003) or Gordon Brackstone (UN-ECLAC, 2003)

1- Annual and Quarterly GDP (and GNP) estimates
2- Selected Labour Statistics (Activity Rates, Unemployment Rates)
3- Basic Demographic Statistics (Population Size, Spatial and Age-Sex distribution)
4- Annual and Quarterly External Trade Statistics
5- Monthly Consumer Price Index

Presently we would also have to include:

6- MDG Indicators (48 in total)

Even if, for the other criteria, we only look at Resources, it is clear that producing comprehensive League Tables for National Statistical Offices is a formidable task. The actual Statistical Series represent 60 individual items, and it must be stressed that the Dimensions of Quality also need to be defined operationally, i.e. broken down into measurable indicators. Producing the information for one single country would require setting up a large matrix with 60 rows (the actual series) and 17 columns5, plus information on resources. This is certainly unwieldy!

One wonders if it is not possible to strike a proper balance between being fair and comprehensive on the one side and being parsimonious on the other side. Why couldn’t we turn to some type of Composite Index, such as the Human Development Index (HDI)? Clearly, we are looking for proper indicators to represent aspects of Population, Living Standards and the like. We propose to use:

1- The Quality of the Life Table data, used to compute e0 (life expectancy at birth). This implies availability of Age-Sex population data, Age-Sex Mortality data and some Statistical capacity to obtain a proper coverage of both numerator and denominator data. It is also possible from the life table to obtain information on the (life table) infant and under-five mortality rates, which are two important MDG indicators.
2- The Quality of Annual Real GDP per capita data. This also implies availability of Population data, and (if we go with the UNDP point of view) it is a good proxy for Standard of Living. If we do not only look at the Production (or Output) approach to GDP, but also demand an Expenditure Approach, the Quality of Annual Real GDP data also reflects upon the Quality of External Trade Statistics.
3- The Quality of the CPI. We think inclusion of the CPI needs no justification.
4- The Quality of the Labour Force Statistics (Labour Force Participation rates and unemployment rates). Again, these have Population inputs.
5- Resources available to the Statistical Office. We slightly adjust The Economist’s indicators: Statisticians in the NSO per 10,000 population 12 years and older6 and NSO-statistics budget per head of population 12 years and older (or a modification of de Vries, who wants to use the square root of population instead of the population data itself: the square root of the Population 12 years and older).

5 If we adopt all Quality Indicators proposed by Carmen Arribas et al, (UN-ECLAC, 2003)

6 Pupils from primary schools hardly ever visit the Statistical Office in Suriname.

If we recall EUROSTAT’s seven Dimensions of Data Quality, it is time to take a closer look at them.

1- As regards Relevance, periodic Customer Satisfaction Surveys may be conducted, but personally we think that a proper registration, over an extended period, of the data requests made to the NSO (in person, by phone, by letter or via the internet) already gives a proper indication of relevance. Relevance is therefore assumed and will not be taken into consideration for rankings, whether annual or every other year.
2- As regards the remaining six dimensions, we think a little regrouping and some prioritizing may prove rewarding: (A) Timeliness and Punctuality, (B) Accuracy, Accessibility and Clarity, and (C) Comparability, Coherence and Completeness. The advantage of regrouping is that fewer indicators7 may suffice.

For group (A):
(i) The difference between the actual date of availability and the date included in the publications calendar for a series
(ii) The difference between the end of the reference period and the date on which the provisional results become available
(iii) The difference between the end of the reference period and the date on which the final results become available

For group (B):
(iv) Presentation of sampling errors (standard errors or CVs)
(v) Presentation of non-sampling errors-1: Coverage errors
(vi) Presentation of non-sampling errors-2: Non-response errors
(vii) Presentation/availability of meta-data
(viii) Number and Types of media used for dissemination
(ix) Objectivity (Absence of political interference)

For group (C):
(x) Proportion or percentage of statistical products with differing concepts and/or measurement units, and proportion or percentage of statistical series that report a “marked break in series”
(xi) Difference between provisional results and subsequent revisions
(xii) Differences between “conjunctural” and annual statistics
(xiii) Ratio of the statistical series supplied to CARICOM to the Statistical Series demanded by CARICOM

Our proposal will result in a 4 (or 7) by 13 matrix, plus information on resources. Further reduction of the dimension of the scoring matrix is not impossible, but, as with all statistical activities, there will be a trade-off between parsimony and “good fit”.

We have to pay attention to three other issues:
(a) What scoring system to use?
(b) Who should do the scoring?
(c) How to judge the outcome of Resource availability?

7 Following and slightly modifying Arribas et al (UN-ECLAC, 2003)


As regards the scoring system, we propose to adopt the simple system advocated by de Vries: blank = 0, Very Poor = 1, Poor = 2, Fair = 3, Good = 4 and Very Good = 5.

As regards who should do the scoring, we think that this task should fall to Senior CARICOM Statisticians.

The issue of judging the outcome of Resource availability may be a bit tricky. That may be the reason why this information was only used as background information by The Economist8. However, we think it is very valuable and should be included directly in the rankings. We propose to compute the indicators (P = Statisticians in the NSO per 10,000 population 12 years and older, and Q = NSO-statistics budget per head of population 12 years and older) separately and then use some HDI-type mechanism to convert the results into a score, also on a 5-point scale. The example below (based on data from The Economist’s 1993 rankings) may suffice to illustrate our proposal.

NSO-Country    P     Q     P-score  Q-score  P + Q
Canada         1.6   8.2   4.00     4.56     8.56
Australia      2.0   9.0   5.00     5.00     10.00
Netherlands    2.0   7.6   5.00     4.22     9.22
France         1.7   6.0   4.25     3.33     7.58
UK             0.9   4.2   2.25     2.33     4.58
USA            0.6   8.8   1.50     4.89     6.39
Germany        1.9   8.0   4.75     4.44     9.19
Max            2.0   9.0   5.00     5.00     10.00
Min            0.0   0.0   0.00     0.00     0.00
Max-Min        2.0   9.0   5.00     5.00     10.00

The formula used is: Score = 5 * (Actual - Minimum) / (Maximum - Minimum)

The maximum used is the maximum of all observed outcomes and the minimum is arbitrarily put at 0. If we were to use the minimum of observed outcomes it would be impossible to distinguish between blanks and the observed minimum.

It would also be possible to set a priori a figure which should be attained, but that opens up the possibility of an NSO scoring higher than 5.
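To make the rescaling concrete, the following minimal sketch (in Python, purely for illustration; the function name rescale is ours) reproduces the P-scores and Q-scores shown in the table above, using the observed maximum and a minimum fixed at 0:

# Illustrative sketch of the HDI-style rescaling described above.
# Raw figures are the P and Q values from the 1993 Economist data in the table;
# the minimum is fixed at 0 so that blanks remain distinguishable from the lowest observation.
P = {"Canada": 1.6, "Australia": 2.0, "Netherlands": 2.0, "France": 1.7,
     "UK": 0.9, "USA": 0.6, "Germany": 1.9}
Q = {"Canada": 8.2, "Australia": 9.0, "Netherlands": 7.6, "France": 6.0,
     "UK": 4.2, "USA": 8.8, "Germany": 8.0}

def rescale(values, minimum=0.0):
    # Convert raw indicator values to a score on the 0-5 scale.
    maximum = max(values.values())
    return {country: 5 * (value - minimum) / (maximum - minimum)
            for country, value in values.items()}

p_scores = rescale(P)   # e.g. Canada -> 4.00, USA -> 1.50
q_scores = rescale(Q)   # e.g. Canada -> 4.56, UK -> 2.33
for country in P:
    print(country, round(p_scores[country], 2), round(q_scores[country], 2),
          round(p_scores[country] + q_scores[country], 2))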

8 NOS-1998, p 13


Let us take a look at the hypothetical Scoring card below. Qk: Quality of indicator k, with k running from i to xiii (see the indicators listed for groups (A), (B) and (C) above).

               Qi  Qii  Qiii  Qiv  Qv  Qvi  Qvii  Qviii  Qix  Qx  Qxi  Qxii  Qxiii      Q
Lifetable                                                                             4.04
  Population    5    5     5    5   4    4     4      5    4   4    5     5      4    4.54
  Mortality     4    3     3    5   3    4     5      5    4   4    3     0      3    3.54
GDP-data                                                                              3.46
  Value Added   5    5     4    0   4    4     5      3    4   3    3     3      4    3.62
  Exp on GDP    4    4     3    0   4    4     4      3    4   3    3     3      4    3.31
Monthly CPI     5    5     5    0   5    0     5      5    5   4    5     5      5    4.15
Labour Force                                                                          4.46
  LFP-rates     4    5     5    4   5    4     5      5    4   5    4     4      4    4.46
  U-rates       4    5     5    4   5    4     5      5    4   5    4     4      4    4.46
Total                                                                                16.12

If this were the scoring card for, say, Canada, then Statistics Canada would end up with a score of 16.12 for the Statistical Series (allowing for rounding errors) and an overall score of 16.12 + 8.56 = 24.68, i.e. an average of 4.11 on the 5-point scale.
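The aggregation behind the scoring card can be sketched as follows (Python, purely illustrative; the layout of the scores dictionary is our own assumption about how the card might be stored): each series takes the mean of its thirteen indicator scores, each group averages its series, the four group scores are summed, and the resource scores P and Q are added.

# Illustrative sketch reproducing the totals of the hypothetical scoring card above.
scores = {
    "Lifetable": {
        "Population":  [5, 5, 5, 5, 4, 4, 4, 5, 4, 4, 5, 5, 4],
        "Mortality":   [4, 3, 3, 5, 3, 4, 5, 5, 4, 4, 3, 0, 3],
    },
    "GDP-data": {
        "Value Added": [5, 5, 4, 0, 4, 4, 5, 3, 4, 3, 3, 3, 4],
        "Exp on GDP":  [4, 4, 3, 0, 4, 4, 4, 3, 4, 3, 3, 3, 4],
    },
    "Monthly CPI": {
        "Monthly CPI": [5, 5, 5, 0, 5, 0, 5, 5, 5, 4, 5, 5, 5],
    },
    "Labour Force": {
        "LFP-rates":   [4, 5, 5, 4, 5, 4, 5, 5, 4, 5, 4, 4, 4],
        "U-rates":     [4, 5, 5, 4, 5, 4, 5, 5, 4, 5, 4, 4, 4],
    },
}

def mean(xs):
    return sum(xs) / len(xs)

# Group score = average of the series means within the group.
group_scores = {group: mean([mean(marks) for marks in series.values()])
                for group, series in scores.items()}

series_total = sum(group_scores.values())   # approx. 16.12
overall = series_total + 4.00 + 4.56        # add the P-score and Q-score for Canada
# Average over the 4 series groups plus P and Q, i.e. 6 components.
print(round(series_total, 2), round(overall, 2), round(overall / 6, 2))
# prints 16.12 24.68 4.11 (subject to rounding)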

CLOSING REMARKS

We are convinced of the feasibility of implementing a Ranking system for CARICOM member countries and are sure that such a system will be conducive to improving statistics in member countries. CARICOM could either use individual rankings or a grouping system, i.e. “Low” for the bottom 5 countries, “Medium” for the middle 5 countries and “High” for the top 4-5 countries. Alternatively, CARICOM could follow in the footsteps of UNDP and use a priori cut-offs: below 2.5 is Low, 4 and higher is High, and the remainder is Medium.
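A minimal sketch of the a priori cut-off rule just mentioned (Python, illustrative only; the function name classify is ours):

def classify(average_score):
    # A priori cut-offs suggested above: below 2.5 = Low, 4 and higher = High,
    # everything in between = Medium.
    if average_score < 2.5:
        return "Low"
    if average_score >= 4.0:
        return "High"
    return "Medium"

print(classify(4.11))   # the hypothetical Canadian average from the scoring card -> High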

We would also like to see UNSD adopting a Ranking system, instead of leaving such a system in the hands of the Private Sector (The Economist). However, we think that on a global scale UNSD should not report individual ranks (as is customary in UNDP’s HDI reports), but should report deciles or another grouping (say 20 classes of 5%), which leaves sufficient room for mobility.

SELECTED REFERENCES

Arribas, Carmen; J. Casado & A. Martínez (UN-ECLAC, 2003): “Data Quality in National Statistical Institutes”

Brackstone, G. (UN-ECLAC 2003): “Managing Data Quality in a Statistical Agency”

IMF 2002: Guide to the General Data Dissemination System

The Economist 1997: Guide to Economic Indicators

UNDP 1997: Human Development Report 1997

United Nations 2003: Indicators for Monitoring the Millennium Development Goals

United Nations 1994: “Report on the Special Session (11-15 April 1994) Economic and Social Council Official Records, Supplement No. 9”

Vries, Willem de (Netherlands Official Statistics, Spring 1998, pp 5-13): “How are we doing? Performance Indicators for National statistical systems”
