<<

DISCUSSION PAPER SERIES

No. 6272

CORRUPTION PERCEPTIONS VS. CORRUPTION REALITY

Benjamin Olken

DEVELOPMENT ECONOMICS


www.cepr.org

Available online at: www.cepr.org/pubs/dps/DP6272.asp

ISSN 0265-8003

CORRUPTION PERCEPTIONS VS. CORRUPTION REALITY

Benjamin Olken, and CEPR

Discussion Paper No. 6272 May 2007

Centre for Economic Policy Research 90–98 Goswell Rd, London EC1V 7RR, UK Tel: (44 20) 7878 2900, Fax: (44 20) 7878 2999 Email: [email protected], Website: www.cepr.org

This Discussion Paper is issued under the auspices of the Centre’s research programme in Development Economics. Any opinions expressed here are those of the author(s) and not those of the Centre for Economic Policy Research. Research disseminated by CEPR may include views on policy, but the Centre itself takes no institutional policy positions.

The Centre for Economic Policy Research was established in 1983 as a private educational charity, to promote independent analysis and public discussion of open economies and the relations among them. It is pluralist and non-partisan, bringing economic research to bear on the analysis of medium- and long-run policy questions. Institutional (core) finance for the Centre has been provided through major grants from the Economic and Social Research Council, under which an ESRC Resource Centre operates within CEPR; the Esmée Fairbairn Charitable Trust; and the Bank of England. These organizations do not give prior review to the Centre’s publications, nor do they necessarily endorse the views expressed therein.

These Discussion Papers often represent preliminary or incomplete work, circulated to encourage discussion and comment. Citation and use of such a paper should take account of its provisional character.

Copyright: Benjamin Olken CEPR Discussion Paper No. 6272

May 2007

ABSTRACT

Corruption Perceptions vs. Corruption Reality*

Accurate citizen perceptions of corruption are crucial for the political process to effectively restrain corrupt activity. This paper examines the accuracy of these perceptions by comparing Indonesian villagers' stated beliefs about corruption in a road-building project in their village with a more objective measure of 'missing expenditures' in the project. I find that villagers' beliefs do contain real information, and that villagers are sophisticated enough to distinguish between corruption in a particular road project and general corruption in the village. The magnitude of their information, however, is small, in part because officials hide corruption where it is hardest for villagers to detect. I also find that there are biases in beliefs that may affect citizens' monitoring behaviour. For example, ethnically heterogeneous villages have higher perceived corruption levels and greater citizen monitoring, but lower actual levels of missing expenditures. The findings illustrate the limitations of relying solely on corruption perceptions, whether in designing anti-corruption policies or in conducting empirical research on corruption.

JEL Classification: D73 Keywords: beliefs, corruption and perception

Benjamin Olken Harvard Society of Fellows 78 Mt. Auburn St. Cambridge, MA 02138 USA Email: [email protected]

For further Discussion Papers by this author see: www.cepr.org/pubs/new-dps/dplist.asp?authorid=164318

* I wish to thank Abhijit Banerjee, Esther Duflo, Ray Fisman, Seema Jayachandran, Larry Katz, Aart Kraay, Michael Kremer, Susan Rose-Ackerman, and Monica Singhal for helpful comments. Special thanks are due to Victor Bottini, Richard Gnagey, Susan Wong, and especially Scott Guggenheim for their support and assistance throughout the project. The field work and engineering survey would have been impossible without the dedication of Faray Muhammad and Suroso Yoso Oetomo, as well as the entire P4 field staff. This project was supported by a grant from the DfID-Strategic Poverty Partnership Trust Fund. All views expressed are those of the author, and do not necessarily reflect the opinions of DfID or the World Bank.

Submitted 08 April 2007

1 Introduction

Corruption is thought to be a significant problem in much of the developing world. Corruption not only imposes a tax on public services and private sector activity; it also creates potentially severe efficiency consequences (Krueger 1974, Shleifer and Vishny 1993, Bertrand et al. 2006). Yet despite the importance of the problem, eliminating corruption has proved difficult in all but a few developing countries.

One potential reason why corruption is so persistent is that citizens may not have accurate information about corruption. After all, since corruption is illegal, regularly and directly observing corrupt activity is almost always impossible. If citizens’ perceptions about corruption are accurate, then the democratic process and grass-roots monitoring can potentially provide incentives for politicians to limit corruption. If, on the other hand, citizens have little in the way of accurate information about corrupt activity – or even if citizens know about average levels of corruption but do not know who is corrupt and who is honest – then the political process may not provide sufficient incentives to restrain corruption.

The accuracy of corruption perceptions is also important because of their ubiquitous use by international institutions and academics to measure corrupt activity. For example, corruption perceptions form the basis of the much-cited cross-country Transparency International Corruption Index (Lambsdorff 2004) and World Bank Governance Indicators (Kaufmann et al. 2005), and are used extensively within countries as well to assess governance at the sub-national level. Perceptions have also been widely used in academic research on the determinants of corruption.1 Measuring beliefs about corruption rather than corruption itself skirts the inherent difficulties involved in measuring corruption directly, but raises the question of how those being surveyed form their beliefs in the first place, and how accurate those beliefs actually are.

This paper examines the empirical relationship between beliefs about corruption and a more objective measure of corruption, in the context of a road-building program in rural Indonesia. To construct an objective measure of corruption, I assembled a team of engineers and surveyors who, after the roads built by the project were completed, dug core samples in each road to estimate the quantity of materials used, surveyed local suppliers to estimate prices, and interviewed villagers to determine the wages paid on the project. From these data, I construct an independent estimate of the amount each road actually cost to build, and then compare this estimate to what the village reported it spent on the project on a line-item by line-item basis. The difference between what the village claimed the road cost to build and what the engineers estimated it actually cost to build forms my objective measure of corruption, which I label ‘missing expenditures.’ To obtain data on villagers’ beliefs about corruption, in the same set of villages I also conducted a household survey, in which villagers were asked their beliefs about the likelihood of corruption in the road project.

1 Prominent papers in this literature include Mauro (1995), Knack and Keefer (1995), LaPorta et al. (1999), and Treisman (2000). This literature is surveyed in detail in Rose-Ackerman (2004).

Using these data, I find that villagers’ beliefs about the likelihood of corruption in the road project do contain information about the level of missing expenditures in the project. Moreover, villagers are sophisticated enough in their beliefs to distinguish between general levels of corruption in the village and corruption in the particular road project I examine. However, perceptions of corruption contain only a limited amount of information about actual corruption: increasing missing expenditures by 10 percent is associated with just a 0.8 percent increase in the probability a villager believes that there is any corruption in the project.

One reason villagers’ information about corruption may be limited is that officials have multiple methods of hiding corruption, and choose to hide corruption in the places where it is hardest for villagers to detect. In particular, my analysis suggests that villagers are able to detect marked-up prices, but appear unable to detect inflated quantities of materials used in the road project. Consistent with this, the vast majority of corruption in the project occurs by inflating quantities, with almost no markup of prices on average. The inability of villagers to detect inflated quantities, combined with the fact that officials can substitute between hiding corruption as inflated prices or inflated quantities, suggests that officials may be strategic in how they hide corruption, and that effective monitoring requires specialist auditors who can detect multiple types of corruption.

The fact that the overall correlation between beliefs and missing expenditures is positive, however, is not sufficient to show that the two variables can be used interchangeably as measures of corruption. In particular, beliefs may be systematically biased. I first show that, even controlling for village fixed effects (and therefore controlling completely flexibly for the actual level of corruption in the road) and benchmarking for how respondents answer the corruption question in other contexts, individual characteristics such as education and gender systematically predict respondents’ perceptions of corruption in the road project. The fact that characteristics other than actual corruption systematically predict corruption perceptions suggests that villagers’ perceptions may be biased.

Just because individual beliefs are biased does not necessarily mean that, in aggregate, corruption perceptions will give misleading results when investigating the determinants of corruption. To test for aggregate biases that would affect inference about the determinants of corruption, I examine the relationship between the two different measures of corruption and a host of village characteristics. Consistent with other studies, I find, for example, that increased ethnic heterogeneity is associated with higher levels of perceived corruption (e.g., Mauro 1995, LaPorta et al. 1999), and that increased levels of participation in social activities are associated with lower levels of perceived corruption (e.g., Putnam 1993). But when I examine the relationship between these variables and the missing expenditures variable, I find very different results – ethnic heterogeneity is associated with lower levels of missing expenditures, and participation in social activities is not correlated with missing expenditures levels at all.

One explanation for these differences is that these village characteristics affect levels of interpersonal trust in the village. Low levels of trust may increase people’s perceived levels of corruption, which may cause them to monitor more aggressively, in turn reducing the actual level of corruption. In fact, I find suggestive evidence that this is the case – villagers in more ethnically heterogeneous villages are less likely to report trusting their fellow villagers, and more likely to attend project monitoring meetings, than those in homogeneous villages. These results suggest a feedback mechanism between biased beliefs, monitoring behavior, and actual corruption levels. They also suggest that when examining the correlates of corruption, examining perceptions of corruption may lead to misleading conclusions, particularly when investigating factors related to trust. Instead, more objective methods of measuring corruption, such as the approach used here (or the related approaches used by Di Tella and Schargrodsky 2003, Reinikka and Svensson 2004, Fisman and Wei 2004, Yang 2004, Hsieh and Moretti 2006, and Olken 2006a), may produce more reliable results.

This paper is related to several literatures in economics that seek to characterize the relationship between beliefs and reality more generally. Bertrand and Mullainathan (2001) discuss the psychological underpinnings of biases in answers to subjective survey questions, and there is a large literature examining the accuracy and potential biases in individuals’ forecasts of their own future retirement decisions, mortality, and income.2 In the public sphere, several authors have also found that perceptions are positively correlated with more objective measures of performance, in the very different contexts of international perceptions of bribery (Mocan 2004), prices paid by Bolivian hospitals for medical supplies (Gray-Molina et al. 2001), and principals evaluating teachers (Jacob and Lefgren 2005). In the setting closest to that examined here, however, Duflo and Topalova (2004) document that women leaders in Indian villages deliver better public services than male leaders, yet score worse on measures of citizen satisfaction. Their results, consistent with the results presented here, suggest that there may be political market failures caused by inaccuracies in public perceptions about the performance of government officials.

The remainder of this paper is organized as follows. Section 2 discusses the empirical setting and the data used in the paper. Section 3 examines the degree to which individual villagers have information about actual corruption levels. Section 4 examines the degree to which villagers’ beliefs about corruption are biased, and discusses potential feedback mechanisms between biased beliefs and actual corruption levels. Section 5 concludes.

2 Setting and Data

2.1 Empirical Setting

The data in this paper come from 477 villages in two of Indonesia’s most populous provinces, East Java and Central Java. The villages in this study were selected because they were about to begin building small-scale road projects under the auspices of the Kecamatan (Subdistrict) Development Project, or KDP. KDP is a national government program, funded through a loan from the World Bank, which finances projects in approximately 15,000 villages throughout Indonesia each year. The data in this paper were collected between September 2003 and August 2004.

2 For example, Bernheim (1989) discusses systematic variability in individual accuracy in forecasting their retirement dates, Hurd and McGarry (1995) document that individuals with certain observable characteristics are systematically more likely to over- or under-predict their own mortality, Dominitz and Manski (1997) document that individuals can forecast their expected income, and Bassett and Lumsdaine (1999, 2001) discuss how, even controlling for observable characteristics, some individuals are likely to be over-optimistic across a wide variety of beliefs whereas others are systematically over-pessimistic.

The roads I examine are built of a mixture of rock, sand, and gravel, range in length from 0.5 to 3 km, and may either run within the village or run from the village to the fields. A typical road project costs on the order of Rp. 80 million (US$8,800 at the then-current exchange rate).

Under KDP, a village committee receives the funds from the central government, and then procures materials and hires labor directly, rather than using a contractor as an intermediary. The allocation to the village is lump-sum, so that the village is the residual claimant. In particular, surplus funds can be used, with the approval of a village meeting, for additional development projects, rather than having to be returned to the KDP program. These funds are often supplemented by voluntary contributions from village residents, primarily in the form of unpaid labor. A series of three village-level meetings is conducted to monitor the use of funds by the village committee implementing the project.

Corruption in the village projects can occur in several ways. First, village implementation teams, potentially working with the village head, may collude with suppliers to inflate either the prices or the quantities listed on the official receipts. Second, members of the implementation team may manipulate wage payments by inflating the wage rate or the number of workers paid by the project.

The villages in this study were part of a randomized experiment on reducing corruption, described in more detail in Olken (forthcoming). Three experimental treatments were conducted in randomly selected subsets of villages: an increase in the probability of an external government audit of the project, an increase in the number of invitations distributed to the village meetings regularly held to oversee use of project funds, and the distribution of anonymous comment forms. All of the empirical specifications reported below include dummy variables for each of these experimental treatments to ensure that the effects reported here are not being driven by these experiments, though the results below are essentially similar if the experimental dummies are not included. (I discuss the effects of the experiments on beliefs about corruption in Section 4.2.)

The data used here come from three surveys designed by the author: a household survey, containing data on household beliefs about corruption in the project; a field survey, used to measure missing expenditures in the road project; and a key-informant survey with the village head and the head of each hamlet, used to measure village characteristics. In the subsequent subsections, I describe the two aspects of the data that are the focus of this study – the household survey on corruption perceptions and the field survey to measure missing expenditures in the road project. Additional details about the data collected can be found in the Appendix.

2.2 Beliefs about Corruption

Data on beliefs about corruption were obtained from a survey of a stratified random sample of adults in the village. The survey was conducted between February and April 2004, when construction of the road projects was between 80% and 100% complete. The sample includes 3,691 respondents.

The key corruption question I examine is the following: “Generally speaking, what is your opinion of the likelihood of diversions of money / KKN (corruption, collusion, and nepotism) involving [...],” where [...] is 1) the President of Indonesia (at the time, Megawati Sukarnoputri), 2) the staff of the subdistrict office (the administrative level above the village), 3) the village head, 4) the village parliament, and 5) the road project. KKN is the Indonesian acronym for corruption, collusion, and nepotism – the catch-all phrase for corruption in Indonesian. Respondents were given 5 possible choices in response – none, low, medium, high, and very high. The first four questions (from the President to the village parliament) were asked, in that order, in the middle of the 1.5 hour survey; the question about the road project was asked towards the end of the survey.

The tabulations of the responses to this question for corruption involving the road project and, by way of comparison, the President of Indonesia and the village head, are given in Table 1. Several things are worth noting about the responses. First, the more ‘local’ the subject being asked about, the less corruption respondents report – i.e., respondents report the highest corruption levels for the President, followed by the village head, followed by the road project.

Second, 8.9% of respondents do not answer the question about corruption in the road project, claiming either they do not know or they do not want to answer. In interviews it appeared that many people who refused to answer did so because they felt uncomfortable saying that there was corruption. Although respondents were assured that responses would remain anonymous, this reluctance to state opinions about corruption is common to many surveys of corruption (Azfar and Murrell 2005). It is particularly understandable in this context, given that free speech was restricted in Indonesia until the end of the Soeharto government in 1998, and that even now village heads still wield considerable local authority.

I therefore examine two versions of the corruption beliefs variable that deal with these non-responses in different ways. The first version is simply the five ordered categorical responses shown in Table 1, where “refused to answer” is treated as missing. I use ordered probit models to investigate the determinants of this categorical response variable. The disadvantage of this approach is that it disregards the potentially useful information contained in “refused to answer,” namely that those who refuse to answer often believe there is corruption but are unwilling to say so. I therefore create a second version of the beliefs variable called “any likelihood of corruption” that groups all positive likelihood of corruption answers together with non-responses. This variable is equal to 1 if the respondent reports any positive probability of corruption (low, medium, high, or very high) or refused to answer the corruption question, and 0 otherwise.3 I use probit models to investigate the determinants of this dummy variable. As will be discussed in more detail below, the two variables produce broadly similar results.
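The coding of the two belief variables can be illustrated with a minimal sketch. The integer codes, the function names, and the use of `None` for a non-response are assumptions made for illustration; they are not taken from the survey instrument or the paper's data files.

```python
from typing import Optional

# Ordered answer categories from the survey question.
# The integer codes are an illustrative assumption.
LEVELS = {"none": 0, "low": 1, "medium": 2, "high": 3, "very high": 4}


def categorical_belief(answer: Optional[str]) -> Optional[int]:
    """First version: ordered code, with a non-response treated as missing."""
    if answer is None:
        return None
    return LEVELS[answer]


def any_likelihood_of_corruption(answer: Optional[str]) -> int:
    """Second version: 1 for any positive likelihood or a non-response, else 0."""
    if answer is None:
        return 1
    return 1 if LEVELS[answer] > 0 else 0


# Hypothetical responses; None stands for "refused to answer".
responses = ["none", "low", None, "very high", "none"]
categorical = [categorical_belief(r) for r in responses]
dummy = [any_likelihood_of_corruption(r) for r in responses]
```

The only difference between the two codings is the treatment of the non-response: it drops out of the categorical variable but counts as a positive perception in the dummy.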

2.3 Missing Expenditures

The objective measure of corruption I use is “missing expenditures” in the road project. Missing expenditures are the difference in logs between what the village claimed it spent on the project and an independent estimate of what it actually spent. This measure is approximately equal to the percent of the expenditures on the road project that cannot be accounted for by the independent estimate of expenditures.

3 Alternatively, if I use a dummy variable for any positive perceptions of corruption, but drop missings rather than count them as a positive perception, the results are slightly weaker than the results presented. This is consistent with the idea that a non-response is associated with a positive perceived corruption probability.

Obtaining data on what villages claim they spent is relatively straightforward. At the end of the project, all village implementation teams were required by KDP to file an accountability report with the project subdistrict office, in which they reported the prices, quantities, and total expenditure on each type of material and each type of labor (skilled, unskilled, and foreman) used in the project. The total amount reported must match the total amount allocated to the village. These reports were obtained from the village by the survey team.

Obtaining an independent estimate of what was actually spent was substantially more difficult, and involved three main activities: an engineering survey to determine quantities of materials used, a worker survey to determine wages paid by the project, and a supplier survey to determine prices for materials. In the engineering survey, an engineer and an assistant conducted a detailed physical assessment of all physical infrastructure built by the project in order to obtain an estimate of the quantity of main materials (rocks, sand, and gravel) used. In particular, to estimate the quantity of each of these materials used in the road, the engineers dug ten 40cm × 40cm core samples at randomly selected locations on the road and measured the quantities of each material in each core sample. By combining the measurements of the volume of each material per square meter of road with measurements of the total length and average width of the road, I can estimate the total quantity of materials used in the road. I also conducted calibration exercises to estimate a “loss ratio,” i.e., the fraction of materials that are typically lost as part of the normal construction process.4
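The scaling from core samples to a total quantity can be sketched as follows. This is a simplified illustration, not the survey team's actual procedure: it assumes a simple average across cores, treats the loss ratio as a single multiplicative factor applied to the measured quantity, and uses made-up numbers. The function and variable names are hypothetical.

```python
from statistics import mean

# Surface area of one core sample: 40cm x 40cm, in square meters.
CORE_AREA_M2 = 0.40 * 0.40


def estimate_quantity_m3(core_volumes_m3, road_length_m, avg_width_m, loss_ratio=1.0):
    """Estimate the total volume of one material used in a road.

    core_volumes_m3: volume of the material found in each core sample.
    loss_ratio: multiplicative adjustment for normal construction losses
                (assumed here to be a single constant per material).
    """
    # Average volume of material per square meter of road surface.
    volume_per_m2 = mean(core_volumes_m3) / CORE_AREA_M2
    # Scale up to the whole road surface.
    measured_total = volume_per_m2 * road_length_m * avg_width_m
    return measured_total * loss_ratio


# Made-up example: ten cores with 0.016 m^3 of sand each (10 cm of depth),
# on a road 1,000 m long and 3 m wide on average.
sand_m3 = estimate_quantity_m3([0.016] * 10, road_length_m=1000.0, avg_width_m=3.0)
```

With these illustrative inputs the estimate is roughly 300 cubic meters of sand before any loss adjustment.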

To measure the quantity of labor, workers were asked which of the many activities involved in building the road were done with paid labor, voluntary labor, or some combination, what the daily wage and number of hours worked was, and to describe any piece rate arrangements that may have been part of the building of the project. To estimate the quantity of person-days actually paid out by the project, I combine information from the worker survey about the percentage of each task done with paid labor, information from the engineering survey about the quantity of each task, and assumptions of worker capacity derived both from the experience of field engineers and the experience from building the calibration roads.

4 For example, some amount of sand may blow away off the top of a truck, or may not be totally scooped out of the hole dug by the engineers conducting the core sample. I estimated the ratio between actual materials used and the amount of materials measured by the engineering survey by constructing four test roads, where the quantities of materials were measured both before and after construction. In calculating missing expenditures, I multiply the estimated actual quantities based on the core samples of the road by this loss ratio to generate the actual estimated level of expenditures on the road project.

To measure prices, a price survey was conducted in each subdistrict. Since there can be substantial differences in transportation costs within a subdistrict, surveyors obtained prices for each material that included transportation costs to each survey village. The price survey included several types of suppliers – supply contractors, construction supply stores, truck drivers (who typically transport the materials used in the project), and workers at quarries – as well as recent buyers of material (primarily workers at construction sites). For each type of material used by the project, between three and five independent prices were obtained; I use the median price from the survey for the analysis.

From these data – reported and actual quantities and prices for each of the major items used in the project – I construct the missing expenditures variable. Specifically, I define the missing expenditures variable to be the difference between the log of the reported amount and the log of the actual amount. As shown in Table 1, on average, after adjusting for the normal loss ratios derived from the calibration exercise, the mean of the missing expenditures variable is 0.24. Note, however, that while the levels of the missing expenditures variable depend on the loss ratios, the differences in missing expenditures across different villages do not.5 As a result, I focus primarily on the differences in missing expenditures across villages rather than on the absolute level of missing expenditures.
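A minimal sketch of the construction may help fix ideas. It assumes the log difference is taken on project totals, values each item at the median surveyed price, and applies one common loss ratio to the whole estimate (the single-material case). All numbers and function names are hypothetical and chosen only for illustration.

```python
import math
from statistics import median


def actual_expenditure(items):
    """Independent cost estimate: estimated quantity times median surveyed price.

    items: list of (estimated_quantity, [surveyed prices]) pairs.
    """
    return sum(qty * median(prices) for qty, prices in items)


def missing_expenditures(reported_total, actual_total):
    """Difference in logs between reported and independently estimated spending."""
    return math.log(reported_total) - math.log(actual_total)


def missing_with_loss_ratio(reported_total, measured_total, loss_ratio):
    """Scale the measured estimate by a common loss ratio before taking logs."""
    return missing_expenditures(reported_total, measured_total * loss_ratio)


# Hypothetical village: quantities in m^3, prices in Rp. thousand per m^3.
items = [
    (300.0, [95.0, 100.0, 110.0]),      # rock, three price quotes
    (150.0, [60.0, 58.0, 65.0, 62.0]),  # sand, four price quotes
]
actual = actual_expenditure(items)
missing = missing_expenditures(50000.0, actual)

# With a single common loss ratio, changing it shifts every village by the
# same constant, so the gap between two villages is unchanged.
gap_a = missing_with_loss_ratio(50000.0, 39150.0, 1.0) - missing_with_loss_ratio(50000.0, 45000.0, 1.0)
gap_b = missing_with_loss_ratio(50000.0, 39150.0, 1.2) - missing_with_loss_ratio(50000.0, 45000.0, 1.2)
```

Because the loss ratio enters as a multiplicative factor inside a logarithm, it only adds a constant to every village's value in this simplified case, which is why cross-village comparisons are the more robust object.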

3 How much do villagers know about corruption?

3.1 The information content of villagers’ beliefs

5 To see this, note that the loss ratio is a multiplicative constant for each component of the road. If there was only one type of material used in the project, then since missing expenditures is expressed as the difference in logs, the loss ratio is simply an additive constant. With multiple components (e.g., rocks, sand, gravel, etc.), the additive constant varies slightly from village to village, depending on the relative weights of the different components in different villages. These differences are small, however, so that changes in the loss ratios do not substantively affect the results.

I begin by estimating whether villagers’ beliefs about corruption contain any information about actual corruption levels. To measure beliefs, I consider both versions of the corruption perceptions variable described above – the categorical response variable and a dummy variable for any positive probability of corruption in the road project (including missings as positive responses). To measure corruption objectively, I use the missing expenditures variable. I then estimate an ordered probit model of the following form:

$$P(B_{vh} = j) = \Phi\left(\alpha_j - \beta c_v - X_{vh}'\gamma\right) - \Phi\left(\alpha_{j-1} - \beta c_v - X_{vh}'\gamma\right) \qquad (1)$$

where $B$ is the respondent’s answer to the question about perceptions of corruption in the road project, $c$ is the estimate of missing expenditures in the road project, $v$ represents a village, $h$ represents a household, $j$ is one of the $J$ categorical answers to the beliefs question, $\alpha_j$ is a cutoff point estimated by the model (with $\alpha_0 = -\infty$ and $\alpha_J = \infty$), $X_{vh}$ are dummies for how the household was sampled, which version of the form the respondent received, and the experimental treatments, and $\Phi$ is the Normal CDF. The test of whether individuals’ beliefs have information about corruption is a test of whether the coefficient $\beta > 0$. For the dummy variable version of the perceptions variable, I estimate the equivalent probit equation (i.e., with only one threshold level $\alpha_j$). Standard errors are adjusted for clustering at the subdistrict level, to take into account the fact that there are multiple respondents $h$ in a single village $v$ and that the missing expenditures variable may be correlated across villages in a given subdistrict.6
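To make the structure of the ordered probit concrete, the following sketch computes the category probabilities implied by a set of cutoffs and a linear index. It is not an estimation routine. The cutoff values and the coefficient on missing expenditures (called `beta` here, a name chosen for this sketch) are hypothetical, and the control variables are omitted for brevity.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard Normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordered_probit_probs(cutoffs, index):
    """Probability of each ordered category given a linear index.

    cutoffs: increasing interior cutoffs (J - 1 values for J categories);
             the outermost cutoffs are taken to be -inf and +inf.
    index:   the linear index, e.g. beta * c_v plus any control terms.
    """
    bounds = [-math.inf] + list(cutoffs) + [math.inf]
    cdfs = [normal_cdf(b - index) for b in bounds]
    # Each category's probability is the CDF mass between adjacent cutoffs.
    return [upper - lower for lower, upper in zip(cdfs, cdfs[1:])]


# Hypothetical parameters for the five answers: none, low, medium, high, very high.
cutoffs = [0.3, 1.0, 1.6, 2.2]
beta = 0.5

probs_low = ordered_probit_probs(cutoffs, beta * 0.24)   # missing expenditures of 0.24
probs_high = ordered_probit_probs(cutoffs, beta * 0.34)  # missing expenditures of 0.34
```

With a positive coefficient, raising missing expenditures lowers the probability of answering "none" and shifts mass toward the higher categories, which is the sense in which a positive coefficient indicates that beliefs carry information.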

The results are presented in columns (1) and (4) of Table 2 for the categorical and dummy variables, respectively. Note that to facilitate interpretation, for the probit specification in column (4) I present marginal effects. Both results show a positive coefficient on the missing expenditures variable, though neither coefficient is statistically significant.

A respondent’s beliefs about a particular type of corruption may be colored by the respondent’s attitudes about corruption in general. The responses to the beliefs question may also differ if individuals perceive the levels of the scale (i.e., ‘none,’ ‘low,’ etc.) differently. To correct for these factors, I benchmark the respondent’s attitudes about corruption in general by using the respondent’s answer to the question about the likelihood that there is corruption involving the President of Indonesia. As discussed above, the phrasing of the corruption question is the same as the question about the road project, but in this case all respondents are evaluating the same individual – the President of Indonesia. Since the person being evaluated is the same for all respondents, the different answers to this question capture general differences in the way the respondents evaluate corruption and answer the perceptions question.7 This is analogous to the approach taken by Bassett and Lumsdaine (1999), who use responses to a question about the probability of the weather being sunny tomorrow to benchmark the overall optimism or pessimism of the respondents when interpreting questions about the respondent’s beliefs about future events.8

6 There are 143 subdistricts in the sample. One subdistrict therefore includes an average of 3.3 villages, so clustering at the subdistrict level is more conservative than clustering at the village level. Clustering at the village level reduces the standard errors from those presented in the table.

The results, controlling for dummies corresponding to the different possible answers to the question about how corrupt the President is, are presented in columns (2) and (5). The responses to the corruption question on the road project and the corruption question about the President are positively correlated (the dummy versions of these variables have correlation coefficient 0.16, p < 0.001). Controlling for perceptions of how corrupt the President is substantially strengthens the results, increasing both the magnitudes and the statistical significance in both specifications.9

However, even controlling for the individuals’ beliefs about how corrupt the President is, it is possible that the correlation between missing expenditures in the road project and perceptions of corruption in the road project reflects only villagers’ perceptions of the average levels of corruption in their village, rather than specific information about the road project per se.

7 One might be concerned that corruption perceptions of the President may also capture heterogeneity in overall attitudes towards the President of Indonesia rather than just benchmarking for how the respondent answers the corruption question. However, controlling for the respondent’s overall approval of the President’s job performance, rather than how corrupt they think the President is, has no effect on the correlation between perceptions of corruption in the road project and the missing expenditures variable. Conversely, controlling for any of the respondent’s other answers to the corruption question – i.e., perceptions of subdistrict officials, village head, or village parliament – has a similar effect to controlling for the corruption of the President, although slightly smaller in magnitude. This suggests that the effect of controlling for perceptions of the President’s corruption is due to it capturing differential interpretations of the corruption question, rather than individual opinions of the President.

8 This benchmarking exercise is also related to the anchoring vignettes literature in political science, discussed by King et al. (2004). The advantage of the approach used here relative to benchmarking against a hypothetical vignette is that the approach here captures differences in the respondents’ reluctance to report corruption (due, for example, to fear of retaliation), which would not be captured in a hypothetical question.

9 A natural question is why controlling for beliefs about the President changes the point estimates on the correlation, rather than just reducing the standard errors. Returning to the logic of the model in Section ??, if all people in a certain area believe there is more corruption, they may monitor more, reducing actual corruption levels. This will attenuate the raw correlation between beliefs and actual corruption unless one also controls for the overall average beliefs about corruption.

To examine whether villagers have specific information about the road project per se, I estimate an alternative version of equation (1) that also controls as flexibly as possible for villagers’ beliefs about the general level of corruption in the village, denoted by q:

$$P(B_{vh} = j) = \Phi(\mu_j - \gamma c_v - X_{vh}'\delta - q'\theta) - \Phi(\mu_{j-1} - \gamma c_v - X_{vh}'\delta - q'\theta) \tag{2}$$

To capture as flexibly as possible the respondents’ general beliefs q, I include in q the respondents’ answers to the corruption questions about subdistrict officials, the village head, and the village parliament (none of whom have any official role in the road project), as well as a variety of respondent-level control variables – age, gender, per-capita expenditure (predicted from assets), participation in social activities, and family relationships to government and project officials. (The role of these respondent-level variables in predicting perceptions will be discussed in more detail in Section 4.1 below.) As can be seen in columns (3) and (6) of Table 2, adding these many additional control variables reduces the standard errors but does not change the point estimates. This is despite the fact that, to take just one example, the correlation of respondents’ perceptions of corruption involving the village head and corruption involving the road project is 0.4. Thus, despite the relatively high correlation of these perceptions of different types of corruption, the results suggest that villagers are actually able to distinguish between general levels of corruption in the village and corruption in the road project in particular.
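As a minimal illustration of the ordered probit form in equation (2), the sketch below computes the probability of each response category as the difference of two standard normal CDFs evaluated at adjacent cut points. The function names, the cut points, and the value of the linear index are hypothetical and chosen only for illustration; they are not estimates from the paper.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordered_probit_probs(index: float, cutpoints: list[float]) -> list[float]:
    """Return P(B = j) for every response category j.

    `index` stands for the linear combination of missing expenditures,
    respondent controls, and general corruption beliefs (hypothetical value).
    `cutpoints` must be increasing; k cut points define k + 1 categories.
    """
    bounds = [-math.inf] + list(cutpoints) + [math.inf]
    cdf = [normal_cdf(b - index) for b in bounds]
    return [cdf[j + 1] - cdf[j] for j in range(len(bounds) - 1)]


# Hypothetical example: three cut points, hence four response categories.
probs = ordered_probit_probs(index=0.3, cutpoints=[-0.5, 0.4, 1.2])
for j, p in enumerate(probs):
    print(f"P(B = {j}) = {p:.3f}")
```

Raising the index shifts probability mass toward the higher response categories, which is how a positive coefficient on missing expenditures translates into more reported corruption.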

To interpret the magnitudes of the estimated coefficients, consider the probit specification. The point estimate in column (6) suggests that a 10 percent increase in missing expenditures above the mean level – i.e., an increase of 0.024 from the mean level of 0.24 – would be associated with an increase of 0.0030 in the probability that the respondent reports any corruption in the project, or an increase of about 0.8 percent over the mean level of 0.36. Put another way, the “elasticity” of a respondent reporting any likelihood of corruption with respect to the missing expenditures variable is about 0.08. Calculating the marginal effects from the ordered probit specifications gives results of similar magnitudes. While perceptions do contain information about actual corruption levels, the amount of this information is small.
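The arithmetic behind this “elasticity” can be reproduced directly from the figures quoted in the text. The short sketch below uses only those figures; the variable names are illustrative.

```python
# Figures quoted in the text
mean_missing = 0.24   # mean level of missing expenditures
mean_report = 0.36    # mean probability of reporting any corruption
delta_prob = 0.0030   # change in reporting probability for a 10 percent increase

delta_missing = 0.10 * mean_missing                 # 10 percent of the mean: 0.024
pct_change_missing = delta_missing / mean_missing   # 0.10
pct_change_prob = delta_prob / mean_report          # roughly 0.008, i.e. 0.8 percent

# Elasticity: percent change in probability per percent change in missing expenditures
elasticity = pct_change_prob / pct_change_missing
print(f"{elasticity:.3f}")
```

The ratio comes out slightly above 0.08, matching the rounded figure reported above.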

An important question is whether this weak correlation is merely the result of measurement error in the missing expenditures measure, or actually reflects the fact that households have little information. Recall that to construct the missing expenditures measure, I used data from 10 core samples of each road, and between 3 and 5 price quotations for each type of material used. To investigate the role of measurement error, for each road I randomly split these 10 core samples and 3-5 price quotations into two groups of 5 core samples and 1-3 price quotations each, and use these subsamples of measurements to construct two different estimates of missing expenditures for each village. I then repeat the regressions in Table 2, instrumenting for the measure of missing expenditures constructed using the first set of measurements with the measure constructed using the second set of measurements. When I do this, I find that the implied “elasticity” increases only slightly, to about 0.085.¹⁰ Thus, at least to the extent I can detect it here, measurement error alone does not seem to explain the low correlation between perceptions and reality.
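The logic of this split-sample check can be illustrated with a small simulation: when two measurements of the same quantity have independent errors, using one as an instrument for the other removes the attenuation that afflicts OLS. The sketch below is a stylized example with simulated data, a made-up true slope, and deliberately large measurement error; it is not the paper's data or its linear probability specification.

```python
import random

random.seed(0)
n = 20000
true_slope = 0.5  # hypothetical true effect

# True (unobserved) regressor and two independently mismeasured versions of it
x_true = [random.gauss(0, 1) for _ in range(n)]
x1 = [x + random.gauss(0, 1) for x in x_true]  # first half of the measurements
x2 = [x + random.gauss(0, 1) for x in x_true]  # second half of the measurements
y = [true_slope * x + random.gauss(0, 1) for x in x_true]


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (len(a) - 1)


# OLS on one noisy measure is attenuated toward zero
ols_slope = cov(y, x1) / cov(x1, x1)

# IV: instrument the first noisy measure with the second one
iv_slope = cov(y, x2) / cov(x1, x2)

print(f"OLS: {ols_slope:.3f}  IV: {iv_slope:.3f}  true: {true_slope}")
```

In this simulation the measurement error is as large as the signal, so OLS recovers only about half of the true slope while the IV estimate is close to it. If the IV estimate had instead barely moved, as in the text, that would indicate that this kind of measurement error is small.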

3.2 Differential accuracy: prices vs. quantities

There are multiple methods village officials can use to hide corruption, and some of these methods may be easier for villagers to detect than others. In particular, village officials who steal a given amount have two options for how to account for this missing money in the accounts – they can either inflate the price paid for the materials procured, or they can inflate the quantities of the materials procured (or both). To examine how perceptions of corruption are formed, I re-estimate equation (1) with the missing expenditures variable separated into variables representing its constituent parts – “missing prices” and “missing quantities.” Specifically, I define “missing prices” as the difference in logs between the prices reported by the village and the prices measured by the independent survey team, weighted by the quantities reported by the village; similarly, I define “missing quantities” as the difference in logs between the quantities reported by the village and the quantities measured by the independent survey team, weighted by the prices reported. Missing prices therefore captures markups in prices, while missing quantities captures markups in quantities.

10 In order to do this, I use an instrumental variables linear probability model. For comparison purposes, I first re-estimate equation (2) using an OLS linear probability model rather than the probit model used in the text, yielding an implied “elasticity” of 0.072 rather than the 0.08 from column (6) of Table 2. With the instrumental variables strategy described in the text, the “elasticity” increases to 0.085 from 0.072, suggesting that the impact of measurement error in the quantity and price samples is relatively slight. Of course, there may be other sources of measurement error, such as in reported expenditures, that are not captured in this methodology.
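One way to read the definitions of “missing prices” and “missing quantities” is sketched below. The text specifies only a difference in logs with weights, so the exact aggregation used here (the log of a weighted total) is an assumption, and the function names and the two-material example are hypothetical.

```python
import math


def missing_prices(p_reported, p_measured, q_reported):
    """Log gap between reported and surveyed prices, weighted by reported quantities.

    Assumed aggregation: compare total cost at reported prices with total cost
    at surveyed prices, holding quantities at their reported values.
    """
    reported = sum(p * q for p, q in zip(p_reported, q_reported))
    measured = sum(p * q for p, q in zip(p_measured, q_reported))
    return math.log(reported) - math.log(measured)


def missing_quantities(q_reported, q_measured, p_reported):
    """Log gap between reported and surveyed quantities, weighted by reported prices."""
    reported = sum(q * p for q, p in zip(q_reported, p_reported))
    measured = sum(q * p for q, p in zip(q_measured, p_reported))
    return math.log(reported) - math.log(measured)


# Hypothetical village with two materials: prices are honest, quantities are inflated
p_rep, p_meas = [100.0, 50.0], [100.0, 50.0]
q_rep, q_meas = [10.0, 20.0], [8.0, 16.0]

mp = missing_prices(p_rep, p_meas, q_rep)
mq = missing_quantities(q_rep, q_meas, p_rep)
print(f"missing prices = {mp:.3f}, missing quantities = {mq:.3f}")
```

In this made-up case all of the gap shows up in quantities and none in prices, which is the pattern the decomposition is designed to distinguish.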

The results are presented in Table 3. All specifications confirm that villagers’ perceptions of corruption in the project are strongly positively correlated with price markups, and only very weakly (and statistically insignificantly) correlated with markups in quantities. The estimated magnitudes for missing prices are approximately double the magnitudes for missing expenditures overall. Market prices for commodities are commonly known to villagers, but quantities of commodities delivered are very difficult to estimate without careful measurement, even for trained engineers; therefore, it is not surprising that villagers are better at detecting marked-up prices than inflated quantities.

Given this result, it is interesting to compare the overall average levels of the missing prices and missing quantities variables. After all, if villagers can detect marked-up prices but cannot detect marked-up quantities, village officials would in general choose to hide their corruption by inflating quantities rather than marking up prices. As discussed above, one needs to interpret the levels of the missing expenditures variables with caution, because the levels of these variables depend on assumptions about the loss ratios and on the ability of surveyors to obtain exactly the same prices as the villages procuring the material for the project. Nevertheless, the levels of the missing prices and quantities variables are precisely what one would expect given the perceptions results: all of the missing expenditures are hidden by inflating quantities, not by inflating prices. Specifically, as shown in Table 1, the mean level of the missing quantities variable is 0.24, while the mean level of the missing prices variable is -0.014, very close to zero.¹¹ Thus, on average the vast majority of the missing expenditures appears to be occurring exactly where villagers cannot detect it. This raises the possibility that the relatively low correlation between beliefs and missing expenditures may in part reflect the strategic behavior of savvy corrupt officials who deliberately choose the types of corruption that are hardest to detect.¹² It also suggests that there may be limits in the degree to which villagers can effectively monitor corruption, at least in the absence of external help detecting it.

11 Missing prices could be less than 0 if, for example, villages purchasing materials received bulk discounts on purchase prices that were not offered to the independent survey team.

12 A natural question is how to reconcile the facts that 1) there appear to be no price markups on average and 2) villagers are able to detect price markups. The answer is that the average price markup of 0 masks the fact that some villages had higher-than-market prices, and others had lower-than-market prices. Villagers appear to detect these differences, and they are correlated with corruption perceptions. Perhaps the village officials in those villages where prices were marked up did not realize that prices would be easier to detect than quantities.

4 Biases and feedback

4.1 Are individuals’ beliefs systematically biased?

This section examines whether certain types of individuals are systematically biased in their beliefs. If individuals’ beliefs about corruption influence their decisions about how actively to monitor public officials, biases in their beliefs about corruption could affect their monitoring behavior. If these biases are idiosyncratic among villagers, they will offset each other and have little effect on total corruption levels. If, however, many individuals in the same village share similar biases and respond accordingly in their monitoring behavior, these biases could, in theory, affect aggregate corruption levels.¹³

To test for the presence of systematic biases in individuals’ beliefs about corruption, I re-estimate a version of equation (2) that includes village fixed effects in addition to respondent-level variables. Since the actual level of corruption in the road project does not vary within the village – after all, there is only one road project in each village – if there are no individual biases, then once village fixed effects are included and once I benchmark for how respondents perceive the different possible answers to the corruption question, none of the individual characteristics in the regression should systematically predict corruption beliefs. If they do, then we know that those types of individuals described by the variable in question are systematically biased either towards finding or not finding corruption in the project.¹⁴

13 A model developing this point explicitly can be found in the working paper version of this paper (Olken 2006b).

14 In interpreting these results, it is important to note that while I can estimate whether bias exists, I do not know which individuals are ‘biased’ and which are ‘unbiased’. The reason is that the dependent variable, perceptions of corruption, does not have a numeric scale that we know should be comparable to the missing expenditures variable. Thus, unlike the literature evaluating subjective probabilities (e.g., Dominitz and Manski 1997, Hurd and McGarry 2002), I cannot say which individuals are right and which are wrong or whether the perceived level of corruption is “right on average”; rather, I can only say that conditional on the actual level of corruption, those with high education are more likely to report higher levels of corruption than those with low levels of education.

Given the incidental parameters problem, rather than estimate an ordered probit or probit model with a large number of dummy variables, I instead estimate a conditional logit model, which conditions out the village fixed effect. I therefore focus on the dummy variable version of the perceptions variable and estimate the conditional fixed effects logit equivalent of equation (2), with fixed effects for each village. In addition, to make the coefficients more easily interpretable, I also report the results from estimating an equivalent linear probability model with village fixed effects.¹⁵
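The linear probability model with village fixed effects can be sketched with the within transformation: subtracting village means from both the outcome and the regressor removes anything common to the village, including the actual level of corruption in its single road project. The example below uses simulated data with a made-up education effect and covers only the linear probability variant, not the conditional logit; all names and numbers are hypothetical.

```python
import random
from collections import defaultdict

random.seed(1)
true_bias = 0.008  # hypothetical effect of one year of schooling on P(report corruption)

rows = []  # (village id, years of education, 1.0 if respondent reports corruption)
for village in range(600):
    # Village-level component: the same for every respondent in the village
    village_effect = random.uniform(0.2, 0.5)
    for _ in range(10):
        educ = random.randint(0, 16)
        p_report = village_effect + true_bias * educ
        rows.append((village, educ, 1.0 if random.random() < p_report else 0.0))

# Group observations by village
by_village = defaultdict(list)
for village, educ, report in rows:
    by_village[village].append((educ, report))

# Within estimator: demean by village, then pool the demeaned cross-products
sxy = sxx = 0.0
for obs in by_village.values():
    mean_x = sum(x for x, _ in obs) / len(obs)
    mean_y = sum(y for _, y in obs) / len(obs)
    for x, y in obs:
        sxy += (x - mean_x) * (y - mean_y)
        sxx += (x - mean_x) ** 2

fe_slope = sxy / sxx
print(f"fixed-effects estimate: {fe_slope:.4f} (simulated true value {true_bias})")
```

Because the village component is differenced out, any remaining relationship between an individual characteristic and reported corruption reflects differences in reporting among people facing the same project, which is the sense of “bias” used in this section.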

The results are presented in Table 4. For each specification (conditional logit or linear probability model with fixed effects), I present two sets of results – one with no additional controls, and one controlling for perceptions about the President, to control for the fact that some respondents may have interpreted the multiple response categories differently from others.

Individual-level biases appear quite significant. Conditional on village fixed effects, better educated respondents and male respondents tend to report more corruption; those who participate in the types of social activity where the project was likely to be discussed, those who live near the project, and (naturally) those who are related to the head of the project all tend to report less corruption.¹⁶ Taken together, these individual-level biases are highly significant – the p-value from a joint test of these characteristics is less than 0.01 in all specifications.

Not only are these biases statistically significant, they are large in magnitude as well. For example, the results show that each year of education makes an individual between 0.7 and 0.9 percentage points more likely to report corruption in the project. This implies that, holding actual corruption levels constant, the “elasticity” of the probability of reporting any likelihood of corruption with respect to the respondent’s education is between 0.21 and 0.27, considerably larger than the impact of the actual missing expenditures variable discussed above.

The main conclusion from these results is that these individual biases are very substantial, especially when compared to the magnitude of the correlation between missing expenditures and beliefs about corruption found above. This suggests that the signal-to-noise ratio in reported beliefs is quite low, which may also help explain the low overall correlation between beliefs and missing expenditures.

15 I have also estimated a fixed-effects model on a linearized version of the categorical variable, where I assign a value of 0 to a response of ‘none’, 1 to a response of ‘low’, 2 to a response of ‘medium’, etc. The results are qualitatively similar to the results presented in Table 4.

16 An interesting question is whether these individual characteristics lead to respondents being more or less accurate at detecting corruption, not just biased. To examine this, I also interacted each of these individual characteristics with the missing expenditures variable. Across a wide range of specifications, I found no evidence of such interactions (results not reported).

4.2 Are aggregate biases substantial?

The previous section showed that certain types of individuals are systematically biased in their perceptions of corruption. For these biases to feed back to affect monitoring and, ultimately, corruption levels, these individual biases would have to be both large and correlated with village characteristics.

This section examines empirically whether aggregate biases are substantial enough to affect qualitative conclusions about the correlates of corruption. In doing this, it is important to note that I do not necessarily claim a causal interpretation of the relationship between these variables and corruption; rather, the main question of interest is the consistency of the partial correlations between these variables and corruption across the various ways of measuring corruption.

To examine aggregate biases, I estimate the following two regressions via OLS:

$$\tilde{c}_v = \alpha_1 + Z_v'\alpha_2 + \varepsilon_v \tag{3}$$

$$\tilde{B}_{vh} = \beta_1 + Z_v'\beta_2 + X_{vh}'\beta_3 + \varepsilon_v' \tag{4}$$

and examine the similarity or difference between the coefficients $\alpha_2$ and $\beta_2$, which capture the impact of village characteristics Z on missing expenditures and perceived corruption, respectively. To obtain the most comparable possible coefficients across these very different measures, I normalize all of the corruption measures to have mean 0 and standard deviation 1, so that all coefficients can be interpreted in terms of standard deviation changes in the corruption measure. I denote the normalized version of missing expenditures by $\tilde{c}_v$ and the normalized version of beliefs by $\tilde{B}_{vh}$.¹⁷ That being said, I will focus primarily on those results for which the estimated coefficients $\alpha_2$ and $\beta_2$ are of different sign, not just of different magnitude, so as not to rely too heavily on this normalization.¹⁸

17 For the categorical-response perceptions variable, I impose a linear scale on the variable, and then normalize this linearized variable to have mean 0 and standard deviation 1. Although this imposes a linearized form on the categorical response variable, as discussed in footnote 15 above, in other specifications OLS regressions using this linearized variable produce qualitatively similar results to the ordered probit specifications, which suggests that the linear assumptions are not substantially affecting the results. I have also considered ordered probit and probit versions of equation (4), and they produce qualitatively similar results to those in Table 5 below. Similarly, for the binary dependent variable, I normalize the variable to have mean 0 and standard deviation 1.
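The normalization and the sign comparison described here can be sketched as follows. Each corruption measure is rescaled to mean 0 and standard deviation 1, regressed on a village characteristic, and the signs of the two slopes are compared. The data are invented, and the example simplifies to a single regressor at the village level, whereas the perceptions regression in the text is at the respondent level with additional controls.

```python
import statistics


def standardize(values):
    """Rescale a sequence to mean 0 and (population) standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with intercept)."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Hypothetical village-level data
z = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]                # a village characteristic
missing = [0.30, 0.28, 0.25, 0.24, 0.20, 0.18]    # missing expenditures
perceived = [0, 0, 1, 1, 2, 2]                    # linearized perception score

slope_missing = ols_slope(z, standardize(missing))
slope_perceived = ols_slope(z, standardize(perceived))

# Slopes are in standard deviations of each measure per unit of z
opposite_signs = slope_missing * slope_perceived < 0
print(slope_missing, slope_perceived, opposite_signs)
```

Focusing on whether the two slopes differ in sign, rather than in size, keeps the comparison from depending on the particular scaling chosen for the perceptions variable.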

The results are presented in Table 5. Column (1) presents the results when missing expenditures is the dependent variable, columns (2) and (3) present the results when the scaled linear version of perceptions is the dependent variable, and columns (4) and (5) present the results when the scaled dummy version of perceptions is the dependent variable. Columns (2) and (4) do not include the controls for corruption perceptions of the President, the village head, etc. or the respondent-level characteristics included in Table 4; columns (3) and (5) do.

The results suggest that for identifying the effects of basic demographic characteristics, such as population and education, perceptions (columns 2-5) give results similar to those from the more objective missing expenditures measure (column 1). But when considering characteristics related to trust – such as ethnic heterogeneity and social participation – examining the impact on corruption perceptions rather than actual corruption may lead to biased conclusions.

Of particular note are the estimates on ethnic heterogeneity. The cross-country corruption literature has found that heterogeneity is positively associated with corruption perceptions (e.g., Mauro 1995, LaPorta et al 1999). Following the standard approach in the literature, I construct as a measure of ethnic and religious heterogeneity the probability that two randomly drawn individuals are from different ethnic or religious groups, respectively.¹⁹ Consistent with the literature, I find that ethnic heterogeneity is associated with greater perceived levels of corruption. Moving from a village with no ethnic heterogeneity to a village with the maximum ethnic heterogeneity in the sample (0.51) is associated with an increase of between 0.65 and 1 standard deviations in the perceived corruption measure, equivalent to an increase of about 50 percentage points in the probability of reporting positive corruption in the project. Moreover, controlling for the overall heterogeneity in the village, those respondents whose ethnic group differed from that of the village head were 12 percentage points more likely to report positive probability of corruption in the project (results not shown).

18 An alternative, equivalent approach which does not rely on these normalizations is as follows. Suppose the true model of the world is:

$$c_v = \alpha_1 + Z_v'\alpha_2 + \varepsilon_v \tag{5}$$

$$B_{vh} = \tilde{\beta}_1 + Z_v'\tilde{\beta}_2 + X_{vh}'\tilde{\beta}_3 + \tilde{\beta}_4 c_v + \tilde{\varepsilon}_v' \tag{6}$$

The coefficient of interest is $\alpha_2$, the effect of village characteristics Z on real corruption levels. In most circumstances, c is unobserved. If we assume that $\tilde{\beta}_2 = 0$ (i.e., $Z_v$ affects $B_{vh}$ only through its effect on $c_v$), then the estimated coefficient $\beta_2$ in equation (4) in the text will be equal to $\alpha_2$, the coefficient of interest. In the setting here, we also observe c, so we can estimate (6) directly and test whether $\tilde{\beta}_2 = 0$, i.e., we can test directly whether Z has no direct effect on perceptions other than through its effect on the percent missing. This is equivalent to testing whether $\alpha_2 = \beta_2$ in equations (3) and (4) in the text. Estimating (6) and testing whether $\tilde{\beta}_2 = 0$ yields similar results to those presented in Table 5 and discussed below. In particular, the coefficients $\tilde{\beta}_2$ in equation (6) are statistically significantly different from 0 for the mean village education level, ordinances from village parliament, social participation, and village ethnic fragmentation.

19 Overall, the sample is relatively homogeneous – the probability that two individuals in the same village are from different ethnic groups is greater than 0.05 in only 9 percent of villages, and the probability that two individuals in the same village are from different religious groups is greater than 0.05 in only 10 percent of villages.
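The heterogeneity measure used here, the probability that two randomly drawn individuals belong to different groups, is commonly computed as one minus the sum of squared group shares. The sketch below assumes draws with replacement and uses made-up group labels; whether the paper draws with or without replacement is not stated in this section.

```python
from collections import Counter


def fragmentation(groups):
    """Probability that two individuals drawn at random (with replacement)
    belong to different groups: 1 minus the sum of squared group shares."""
    counts = Counter(groups)
    n = len(groups)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


# Hypothetical villages
homogeneous = ["A"] * 10                # everyone in one group
evenly_split = ["A"] * 5 + ["B"] * 5    # two groups of equal size

print(fragmentation(homogeneous))   # no heterogeneity
print(fragmentation(evenly_split))  # one half
```

On this scale a village split evenly between two groups scores 0.5, so the sample maximum of 0.51 quoted above corresponds to roughly that degree of mixing.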

However, when I examine the relationship between ethnic heterogeneity and the missing expenditures measure, I get the opposite result – moving from a village with no ethnic heterogeneity to a village with the maximum ethnic heterogeneity in the sample is associated with a decrease in the percent missing variable of about 0.73 standard deviations.²⁰ The coefficients on religious fragmentation show a similar pattern – a large negative coefficient when missing expenditures is the dependent variable, and coefficients much closer to zero (and in some cases positive) when perceptions are the dependent variable – though the results on religious heterogeneity are not statistically significant.

A similar effect – though in the opposite direction – can be seen by looking at the social participation variables. I define the intensity of social participation as the average number of times an adult in the village participated in a social group of any kind in the past 3 months. This measure is constructed from a census of social groups obtained from the head of each hamlet. As can be seen in Table 5, increased participation in social groups in the village is associated with a decrease in perceived corruption levels. This is consistent with the results reported by Putnam (1993). But when we look at the actual corruption level, we find, if anything, that increased social participation is associated with higher measured corruption levels, though the point estimate is statistically insignificant. Similar, though weaker, differences between the perceptions variable and missing expenditures appear when we consider a measure of whether there is a political opposition in the village that could potentially monitor the project – the degree of activeness of the village parliament, or BPD.²¹

20 It is worth noting that Glaeser and Saks (2006) also find that greater racial heterogeneity is associated with more corruption, even though they examine federal corruption convictions rather than perceptions as the measure of corruption. However, it is possible that federal corruption prosecutions could be confounded by the same types of biases, if racial heterogeneity leads to less trust and a greater propensity to demand federal government prosecutions.

4.3 Feedback between biased beliefs and corruption levels: the case of ethnic heterogeneity

One possible explanation for these differences between perceptions and actual corruption is that they all appear on characteristics related to trust. For example, suppose that ethnic heterogeneity lowers the level of trust in the village. Greater suspicions could result in higher perceived levels of corruption, which could in turn result in greater monitoring and lower actual corruption.

In fact, there is suggestive evidence consistent with this explanation. The household survey asked respondents a version of the World Values Survey trust question, in which respondents were asked about the degree to which they trust other residents of the village.²² In results not reported in the table, I find that villagers in ethnically heterogeneous villages are less trusting of their fellow villagers than those in homogeneous villages; on average, 52% of residents in villages with ethnic heterogeneity less than 0.05 reported trusting their fellow villagers, whereas only 36% of residents in villages with ethnic heterogeneity greater than 0.05 did so.²³

Moreover, there is also evidence that ethnic heterogeneity is correlated with higher monitoring levels. As discussed above, the prime mechanism for local-level monitoring is the village accountability meeting, held three times during the course of the project. In high ethnic heterogeneity villages (defined similarly using the 0.05 threshold), the number of people who attended these accountability meetings was 22% higher than in villages with low ethnic heterogeneity.²⁴ This supports the story discussed above – lower levels of trust correlated with ethnic heterogeneity lead to more negative corruption perceptions, which in turn lead to higher levels of monitoring, lowering actual corruption levels.

21 To measure how active the BPD was, I examine the number of ordinances the BPD had issued in the previous year. Though the BPD coefficients are not separately significant in either the missing expenditures or perceptions regressions, the difference between them appears to be statistically significant using the method discussed in footnote 18. I also consider another measure of transparency in the project – whether a written report of project expenses was provided at a village accountability meeting – though the results are not conclusive in either direction.

22 Specifically, they were asked: “In general, do you think that other residents of the village can be trusted, or you have to be careful in dealing with them?” The variable is coded 1 if the respondents say they can trust other residents of the village, and 0 if they say they have to be careful in dealing with them.

23 This difference is statistically significant at the 5% level if no other village covariates are included. If I re-estimate a probit version of equation (3) with this trust variable as the dependent variable and include other covariates, the coefficient attenuates slightly, and the p-value increases to 0.11.

24 This difference is statistically significant at the 1% level. The point estimates are virtually identical if the village-level characteristics included in Table 5 are included as well (except, obviously, ethnic heterogeneity). Using a linear ethnic heterogeneity measure, rather than the discrete cutoff for ethnic heterogeneity greater than 0.05, gives very similar results.

These results suggest the possibility of a silver lining from ethnic differences. While in general the literature on ethnic differences has focussed on the idea that ethnic divisions diminish public good provision (e.g., Alesina, Baqir, and Easterly 1999, Miguel and Gugerty 2005), in this case, the mutual distrust from ethnic divisions may actually encourage villagers to monitor one another more, decreasing corruption and increasing the quality of public goods.

4.4 Using perceptions to detect experimental impacts

Given the difficulties in measuring corruption directly, an important question is the degree to which perceptions data can substitute for more direct measures of corruption in cases where the latter are difficult or infeasible to collect. To examine this, Table 5 also examines how the experimental results reported in Olken (forthcoming) would have differed had the perceptions-based measure been used to evaluate corruption instead of the missing expenditures measure. As described above, there were three experimental interventions in these villages – an audit treatment, in which villages were told in advance that they would be audited by the central government audit agency with probability 1; an invitations treatment, in which hundreds of written invitations were passed out to villagers to attend accountability meetings; and an anonymous comment form treatment, in which villagers were able to give comments about the project without fear of retaliation.

As can be seen in column (1) of Table 5, the audit intervention was associated with a statistically significant reduction in missing expenditures of about 0.3 standard deviations, whereas the invitations and invitations plus comment forms treatments were associated with a very small, and statistically insignificant, reduction in the missing expenditures variable. By contrast, when examining the perceptions variable, the audit treatment has a much smaller (and in most specifications not statistically significant) effect, and the invitations and invitations plus comment form treatments are associated with increases in the perceptions of corruption, in some cases statistically significantly so. The conclusions from the study would therefore have been quite different had perceptions been the main measure of corruption, rather than missing expenditures.

In this particular case, we can speculate as to some of the specific reasons why perceptions and actual corruption would respond differently to the experimental treatments. For example, as reported in Olken (forthcoming), the audits primarily resulted in a reduction in missing quantities, whereas the results in Table 3 show that villagers are much better at detecting missing prices. Villagers’ perceptions of corruption thus fail to detect precisely the type of corruption where the experiments had the greatest impact. One can also easily imagine that anonymous comment forms would increase people’s perceptions of corruption by providing information about corruption, while in fact having the opposite effect on actual corruption levels. More generally, the difference in the results between the two types of measures highlights the importance of obtaining unbiased, direct measures of corruption.

5 Conclusion

This paper has examined the relationship between perceptions of corruption and a more objective measure of corruption, in the context of a road building program in rural Indonesia. The paper shows empirically that villagers’ perceptions of corruption do appear to be positively (though weakly) correlated with the more objective missing expenditures measure. Moreover, villagers appear to be able to distinguish between the overall probability of corruption in the village and corruption specific to the road project.

Despite this, the magnitude of the correlation between beliefs and missing expenditures is small. In part, this may be because, on average, almost all of the corruption in the project was hidden by inflating quantities, which are hard for villagers to detect, rather than marking up prices, which are easier for villagers to detect. This suggests an important feedback mechanism between transparency – which increases the ability of citizens to detect corruption – and corruption levels. It also suggests that, at least in this case, villagers do not currently possess enough capability to detect corruption to effectively monitor local officials, at least without additional external help.

I then examine the extent of biases in corruption perceptions. The results show that there are significant individual-level biases in how respondents answer the corruption question. Moreover, there is evidence that for some village-level characteristics, particularly those associated with levels of trust, such as ethnic heterogeneity and social participation, using perceptions to measure corruption can produce very different answers from the results obtained using a more objective measure of corruption. I present suggestive evidence in favor of the idea that biases in individuals’ views about corruption can lead to increased monitoring behavior, which in turn reduces corruption. These results suggest that perceptions data should be used for empirical research on the determinants of corruption with considerable caution, and that there is little alternative to continuing to collect more objective measures of corruption, difficult though that may be.

References

Alesina, Alberto, Reza Baqir, and William Easterly, “Public Goods and Ethnic Divisions,”Quar- terly Journal of Economics 114 (4), pp. 1243-1284, 1999. Azfar, Omar and Peter Murrell, “Identifying reticent respondents: Assessing the quality of survey data on corruption and values,”mimeo, University of Maryland, 2005. Bassett, William F. and Robin L. Lumsdaine, “Outlook, Outcomes, and Optimism,” mimeo, Brown University, 1999. Bassett, William F. and Robin L. Lumsdaine, “Probability Limits: Are Subjective Assessments Adequately Accurate?”Journal of Human Resources 36 (2), pp. 327-363, Spring 2001. Bernheim, B. Douglas, “The Timing of Retirement: a Comparison of Expectations and Realiza- tions,”in David Wise (ed.), The Economics of Aging, Chicago: Press, pp. 259-283, 1989. Bertrand, Marianne and Sendhil Mullainathan, “Do People Mean What They Say? Implications for Subjective Survey Data,”American Economic Review 91 (2), pp. 67-72, May 2001. Bertrand, Marianne, Simeon Djankov, , and Sendhil Mullainathan, “Does Corruption Produce Unsafe Drivers?”NBER Working Paper #12274, 2006. Deaton, Angus, “Data and Econometric Tools for Development Analysis,” in Jere Behrman and T.N. Srinivasan (eds.), Handbook of Development Economics Volume 3, New York: North Holland, pp. 1785 - 1882, 1995. Di Tella, Rafael and Ernesto Schargrodsky, “The Role of Wages and Auditing During a Crackdown on Corruption in the City of Buenos Aires,”Journal of Law and Economics 46, April 2003.

Dominitz, Jeffrey and Charles Manski, “Using expectations data to study subjective income expectations,” Journal of the American Statistical Association 92 (439), pp. 855-862, 1997.

Duflo, Esther and Petia Topalova, “Unappreciated Service: Performance, Perceptions, and Women Leaders in India,” mimeo, MIT, October 2004.

Fisman, Raymond and Shang-Jin Wei, “Tax Rates and Tax Evasion: Evidence from ‘Missing Imports’ in China,” Journal of Political Economy 112 (2), pp. 471-500, 2004.

Glaeser, Edward L. and Raven E. Saks, “Corruption in America,” Journal of Public Economics 90 (6-7), pp. 1053-1072, August 2006.

Gray-Molina, George, Ernesto Pérez de Rada, and Ernesto Yeñez, “Does Voice Matter? Participation and Controlling Corruption in Bolivian Hospitals,” in Rafael Di Tella and William D. Savedoff (eds.), Diagnosis Corruption, Washington: IADB Press, 2001.

Kaufmann, Daniel, Aart Kraay, and Massimo Mastruzzi, “Governance Matters IV: Governance Indicators for 1996-2004,” mimeo, World Bank, May 2005.

King, Gary, Christopher J.L. Murray, Joshua A. Salomon, and Ajay Tandon, “Enhancing the Validity and Cross-cultural Comparability of Survey Research,” American Political Science Review 98 (1), pp. 191-207, February 2004.

Knack, Stephen and Philip Keefer, “Institutions and Economic Performance: Cross-Country Tests Using Alternative Institutional Measures,” Economics & Politics 7 (3), pp. 207-227, 1995.

Krueger, Anne O., “The Political Economy of the Rent-Seeking Society,” American Economic Review 64 (3), pp. 291-303, June 1974.

Hsieh, Chang-Tai and Enrico Moretti, “Did Iraq Cheat the United Nations? Underpricing, Bribes, and the Oil for Food Program,” Quarterly Journal of Economics 121 (4), pp. 1211-1248, December 2006.

Hurd, Michael D. and Kathleen McGarry, “Evaluation of the Subjective Probabilities of Survival in the Health and Retirement Study,” Journal of Human Resources 30, pp. S268-S292, 1995.

Hurd, Michael D. and Kathleen McGarry, “The Predictive Validity of Subjective Probabilities of Survival,” Economic Journal 112, pp. 966-985, October 2002.

Jacob, Brian and Lars Lefgren, “Principals as Agents: Subjective Performance Assessment in Education,” NBER Working Paper #11463, 2005.

LaPorta, Raphael, Florencio Lopez-de-Silanes, Andrei Shleifer, and Robert Vishny, “The Quality of Government,” Journal of Law, Economics, and Organizations 15 (1), pp. 222-279, 1999.

Lambsdorff, Johann G., “Background Paper to the 2003 Corruption Perceptions Index,” mimeo, Transparency International, September 2003.

Mauro, Paolo, “Corruption and Growth,” Quarterly Journal of Economics 110 (3), pp. 681-712, August 1995.

Miguel, Edward, and Mary Kay Gugerty, “Ethnic Diversity, Social Sanctions, and Public Goods in Kenya,” Journal of Public Economics 89 (11-12), pp. 2325-2368, 2005.

Mocan, Naci, “What Determines Corruption? International Evidence from Micro Data,” NBER Working Paper #10460, May 2004.

Olken, Benjamin A., “Corruption and the Cost of Redistribution,” Journal of Public Economics 90 (4-5), pp. 853-870, May 2006(a).

Olken, Benjamin A., “Corruption Perceptions vs. Corruption Reality,” NBER Working Paper #12428, July 2006(b).

Olken, Benjamin A., “Monitoring Corruption: Evidence from a Field Experiment in Indonesia,” Journal of Political Economy, forthcoming.

Putnam, Robert D., with Robert Leonardi and Raffaella Y. Nanetti, Making Democracy Work: Civic Traditions in Modern Italy, Princeton: Princeton University Press, 1993.

Reinikka, Ritva and Jakob Svensson, “Local Capture: Evidence from a Central Government Transfer Program in Uganda,” Quarterly Journal of Economics 119 (2), pp. 679-706, May 2004.

Rose-Ackerman, Susan, “The Challenge of Poor Governance and Corruption,” mimeo, Copenhagen Consensus and , 2004.

Shleifer, Andrei and Robert W. Vishny, “Corruption,” Quarterly Journal of Economics 108 (3), pp. 599-617, August 1993.

Treisman, Daniel, “The causes of corruption: a cross-national study,” Journal of Public Economics 76, pp. 399-457, 2000.

Yang, Dean, “Can Enforcement Backfire? Crime Displacement in the Context of a Common Customs Reform,” University of Michigan, mimeo, 2004.

Appendix: Data Details

The original data were collected in 608 villages. The sample in this paper, however, is limited to the 477 villages where the missing expenditures variable, described above, could be constructed. The missing expenditures variable could not be calculated in some villages for one of four reasons: (1) surveyor error in locating the road, (2) the project consisted largely of a partial rehabilitation of an existing road, (3) agglomerated expenditure reports (i.e., the village expenditure report combined expenditures in the road project with other projects that could not be independently measured, such as a school), or (4) villages that had asphalted the road and refused to let the engineers break the asphalt to conduct the engineering survey.

The household survey was designed as a stratified random sample, containing between six and thirteen respondents per village, selected as follows. Two respondents were selected from the hamlets in which the road was located by first randomly selecting a hamlet, and then randomly selecting a neighborhood (RT) in that hamlet. A complete list of households in the RT was obtained from the neighborhood head, and two households were drawn randomly from that list. Individual respondents were drawn from a list of all adults age 18 or over in the selected households. Two additional respondents were selected from the hamlets in which the road was not being built using the same procedure. As men in the village tend to participate much more in road construction activities, the randomization was designed such that, of the four respondents selected in this manner, three were men and one was a woman. In villages receiving the Comment Form treatment, an additional four respondents were drawn using the same procedure, two from hamlets with the project and two from hamlets that did not contain the project. Two respondents were drawn randomly from the attendance list of Village Meeting II, which was held before the randomization was announced, and is therefore exogenous with respect to the experiments. Finally, in some Comment Form villages an additional 3 respondents were added, randomly selected from the two neighborhoods above (the reasons for this will be discussed below). Each respondent received compensation of Rp. 10,000 ($1.20), equal to slightly more than half of the typical daily agricultural wage in the study area.

Given this sample selection, a natural question is whether the sample should be re-weighted to reflect the fact that different respondents had different probabilities of being sampled. As is apparent from the description of the sampling, women were systematically undersampled, and those who attended a pre-randomization village meeting were systematically oversampled.
In all specifications, I control for how the respondent was sampled (i.e., whether the respondent was from a hamlet with or without the road project, whether the respondent was selected from the attendance list at Village Meeting II, and whether the household was one of the 3 additional households added in the Comment Form villages). The question is whether, given these controls for level differences among these samples, one needs to reweight to account for treatment effect heterogeneity in the relationship between missing expenditures and perceptions. As discussed by Deaton (1995), weighting the sample makes the point estimates invariant to survey design, but reduces the effective power and, in the presence of treatment heterogeneity, does not necessarily obtain consistent estimates for the true average population effect. Accordingly, the results presented below in the text are unweighted. Using sample weights in the regressions that account for the differential probability of sampling and that weight each village equally (i.e., so that Comment Form villages are not over-weighted in the regression) does not substantively change the results, but not surprisingly it does reduce the statistical significance of the missing expenditures variable in columns (2) and (3) of Table 2.

Beyond simply measuring perceptions, an additional goal of the survey was to measure how stated beliefs about corruption change when respondents know that their answers will be used for monitoring. To examine this, after all corruption questions except for questions involving corruption in the road project had been asked, a randomly selected subset of respondents in the Comment Form villages were told that their responses to the questions about corruption in the project would be used, anonymously, as part of the overall report on the comment forms presented at the accountability meeting.
To simplify exposition, I will refer to this variant of the questionnaire as Form B, and to the normal questionnaire, in which all questions were explained to be anonymous and to be used for research purposes only, as Form A. Due to a training error, approximately 60% of enumerators appear to have given Form B surveys to all households in Comment Form villages. Therefore, in approximately half of all Comment Form villages, three additional households were surveyed, drawn randomly from the same neighborhoods as before, two of whom received Form A and one of whom received Form B. Although responses under Form B are similar to those under Form A (the coefficient on receiving Form B is always very close to 0 and statistically insignificant), I nevertheless include a dummy variable for which version of the form each household received in all specifications, as well as dummies for whether the household was sampled as part of these additional three households per village, in addition to dummies for which experimental treatment the village was assigned (i.e., comment forms, invitations, or audits). Although I include these dummies in all specifications in this paper, doing so does not substantially affect the results.

In addition to the corruption question, the household survey included a wide variety of other covariates, such as a household roster, education levels, participation in social activities and in the road project, assets, and family relationships to various village officials. To estimate household expenditure of respondents, I used the 1999 SSD (Hundred Villages Survey), an Indonesian statistics bureau dataset containing 3,193 rural Javanese households. The SSD asked both a detailed expenditure questionnaire and the same set of asset questions used in my household survey.
In the SSD, I used OLS to estimate the relationship between log household expenditure and the following variables, all of which I observe in my survey: log household size, whether the household was headed by a woman, the percentage of household members consisting of children ages 0-3, 4-6, 7-9, 10-12, and 13-16, dummies for whether the household has a stove, refrigerator, radio, television, satellite dish, motorbike, car, and electricity, dummies for floor type, wall type, and ceiling type, the total amount of land held by the household, whether the household consumes meat at least once a week, whether each household member has at least two sets of clothes, and whether the household uses modern medicine when a child is sick. I then used the estimated coefficients from the SSD to predict household expenditure in my survey. Combined, these 34 variables have an R-squared of 0.58 predicting log household expenditure in the SSD, which suggests that predicted expenditure is a reasonable approximation for actual expenditure, at least for the purposes used here.
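The two-step imputation described above (fit OLS in an auxiliary survey, then apply the fitted coefficients to the main survey) can be sketched as follows. This is only an illustrative sketch, not the code used for the paper: the two regressors, the data values, and all function names are hypothetical, and the auxiliary data are constructed to follow an exact linear relation so the result is easy to check by hand.

```python
# Illustrative sketch of two-sample imputation. All names and numbers are
# hypothetical; the paper's actual regression uses 34 variables from the SSD.

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_ols(rows, y):
    """Return OLS coefficients (intercept first) from the normal equations."""
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    xtx = [[sum(xi[p] * xi[q] for xi in x) for q in range(k)] for p in range(k)]
    xty = [sum(xi[p] * yi for xi, yi in zip(x, y)) for p in range(k)]
    return solve_linear_system(xtx, xty)

def predict(coefs, row):
    """Apply fitted coefficients to one household's covariates."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))

# Step 1: auxiliary survey (stand-in for the SSD), where expenditure is observed.
# Covariates: (log household size, owns television); values are made up.
aux_x = [(0.7, 0), (1.1, 0), (1.4, 1), (1.6, 1), (1.8, 0), (1.9, 1)]
aux_y = [12.0 + 0.5 * size + 0.3 * tv for size, tv in aux_x]
coefs = fit_ols(aux_x, aux_y)

# Step 2: main survey, where only the covariates are observed.
main_survey_household = (1.2, 1)
predicted_log_expenditure = predict(coefs, main_survey_household)
```

The design point is that the same covariates must be measured identically in both surveys; the fitted coefficients carry no information beyond what those shared covariates explain.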

Table 1: Summary Statistics

Perceived corruption involving:   Road Project   President      Village Head
None                              64.1%          13.9%          47.1%
Low                               21.1%          12.8%          18.0%
Medium                            5.3%           22.9%          9.5%
High                              0.4%           9.2%           1.8%
Very high                         0.2%           3.4%           0.3%
Refused to answer                 8.9%           37.7%          23.3%

                                  Mean (Std. dev.)
Missing expenditures              0.236 (0.343)
Missing prices                    -0.014 (0.210)
Missing quantities                0.243 (0.320)

Notes: For perceived corruption, the figures given are percentage responses to the question “In general, what is your opinion of the likelihood of corruption / KKN (corruption, collusion, nepotism) involving […]?” where […] is the President of Indonesia (Megawati Sukarnoputri), the village head in the respondent’s village, or the road project, as indicated in the columns. Standard deviations in parentheses. Sample is limited to those villages where the missing expenditures variable is not missing. Total number of observations: 3,691. For missing expenditures, missing prices, and missing quantities, total number of observations: 477.

Table 2: Relationship Between Perceptions and Missing Expenditures

                                    Likelihood of Corruption         Any Likelihood of Corruption
                                    in Road Project                  in Road Project
                                    (Ordered Probit)                 (Dummy variable 0-1, Probit Marginal Effects)
                                    (1)        (2)        (3)        (4)        (5)        (6)
Missing expenditures                0.186      0.280*     0.307**    0.097      0.119**    0.123***
                                    (0.175)    (0.167)    (0.135)    (0.060)    (0.057)    (0.047)
Corruption perceptions of:
  President                         No         Yes        Yes        No         Yes        Yes
  Subdistrict official              No         No         Yes        No         No         Yes
  Village head                      No         No         Yes        No         No         Yes
  Village parliament                No         No         Yes        No         No         Yes
Respondent Covariates               No         No         Yes        No         No         Yes
Sample Controls                     Yes        Yes        Yes        Yes        Yes        Yes
Observations                        3314       3314       2931       3639       3639       3226
Mean dep. var                                                        0.36       0.36       0.35

Notes: Robust standard errors in parentheses, clustered at the subdistrict level. In columns (1) – (3), dependent variable is the categorical responses to the perceptions question, i.e., ‘none’, ‘low’, ‘medium’, ‘high’ and ‘very high’ (in that order). In columns (4) – (6), dependent variable is a dummy that takes value 0 if answer was ‘none’ and 1 if answer was ‘low’, ‘medium’, ‘high’, ‘very high’, or if the respondent refused to answer. Corruption perceptions of President, Subdistrict Official, Village head, and Village Parliament are dummies for respondent’s perceived corruption levels of the respective officials. Respondent covariates are age, education, gender, predicted per-capita expenditure, participation in social activities, relationship to government and project officials. Sample controls are dummies for the three experimental interventions (audit, invitations, and invitations + comment forms), dummies for the different strata of respondents sampled, and a dummy for which version of the form the respondent received. * significant at 10%; ** significant at 5%; *** significant at 1%
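To make the two dependent variables in these notes concrete, the following sketch codes a few hypothetical survey responses both ways. The function names and the response strings are illustrative. Treating a refusal as missing in the ordered outcome is an assumption made here (the notes list only the five ordered categories for columns (1) to (3)), whereas counting a refusal as 1 in the dummy follows the notes directly.

```python
# Illustrative coding of the perceptions question; names are hypothetical.
ORDERED_LEVELS = ["none", "low", "medium", "high", "very high"]

def ordered_outcome(response):
    """Code the five ordered categories as 0..4.

    Assumption: refusals are returned as None (excluded from the ordered model).
    """
    if response == "refused":
        return None
    return ORDERED_LEVELS.index(response)

def any_corruption_dummy(response):
    """Return 0 for 'none' and 1 for any other answer, including a refusal."""
    return 0 if response == "none" else 1

responses = ["none", "low", "refused", "very high", "none"]
ordered = [ordered_outcome(r) for r in responses]
dummies = [any_corruption_dummy(r) for r in responses]
```

Under this coding the dummy specification retains respondents who refused to answer, which is consistent with the larger number of observations reported in columns (4) to (6).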

Table 3: Accuracy – Prices vs. Quantities

                                    Likelihood of Corruption         Any Likelihood of Corruption
                                    in Road Project                  in Road Project
                                    (Ordered Probit)                 (Dummy variable 0-1, Probit Marginal Effects)
                                    (1)        (2)        (3)        (4)        (5)        (6)
Missing expenditures – prices       0.433      0.627**    0.669***   0.177*     0.205**    0.204**
                                    (0.277)    (0.270)    (0.251)    (0.096)    (0.091)    (0.081)
Missing expenditures – quantities   0.057      0.118      0.112      0.049      0.069      0.070
                                    (0.183)    (0.177)    (0.155)    (0.062)    (0.060)    (0.053)
Corruption perceptions of:
  President                         No         Yes        Yes        No         Yes        Yes
  Subdistrict official              No         No         Yes        No         No         Yes
  Village head                      No         No         Yes        No         No         Yes
  Village parliament                No         No         Yes        No         No         Yes
Respondent Covariates               No         No         Yes        No         No         Yes
Sample Controls                     Yes        Yes        Yes        Yes        Yes        Yes
Observations                        3314       3314       2931       3639       3639       3226
Mean dep. var                                                        0.36       0.36       0.35

Notes: See Notes to Table 2. * significant at 10%; ** significant at 5%; *** significant at 1%

Table 4: Are beliefs systematically biased?

                                        Any Likelihood of Corruption    Any Likelihood of Corruption
                                        in Road Project                 in Road Project
                                        (Conditional Logit)             (Dummy variable 0-1, OLS with fixed effects)
                                        (1)             (2)             (3)             (4)
Education (years)                       0.065***        0.051***        0.009***        0.007***
                                        (0.018)         (0.019)         (0.003)         (0.002)
Age                                     -0.003          -0.001          -0.000          -0.000
                                        (0.005)         (0.005)         (0.001)         (0.001)
Female                                  -0.183*         -0.160          -0.026*         -0.022
                                        (0.105)         (0.108)         (0.015)         (0.015)
Predicted per-capita consumption        0.217           0.148           0.028           0.021
                                        (0.193)         (0.199)         (0.025)         (0.025)
Participation in social activities      0.013**         0.011*          0.002**         0.002**
                                        (0.006)         (0.006)         (0.001)         (0.001)
Participation in social activities      -0.075***       -0.069***       -0.011***       -0.010***
  where road project likely discussed   (0.021)         (0.021)         (0.003)         (0.003)
Lives in project hamlet                 -0.781***       -0.764***       -0.110***       -0.106***
                                        (0.108)         (0.110)         (0.014)         (0.013)
Attended development meeting            -0.312***       -0.320***       -0.042***       -0.042***
                                        (0.110)         (0.112)         (0.014)         (0.014)
Family member of village government     0.043           0.021           0.003           0.001
                                        (0.112)         (0.116)         (0.016)         (0.016)
Family member of project leader         -0.399**        -0.402**        -0.051**        -0.051**
                                        (0.203)         (0.205)         (0.026)         (0.025)
Sample controls                         Yes             Yes             Yes             Yes
President corruption perception         No              Yes             No              Yes
Observations                            2675            2675            4084            4084
R-squared                                                               0.47            0.48
Mean dep. var                           0.40            0.40            0.34            0.34
Fixed effects                           Village         Village         Village         Village
P-value of joint F-test                 <0.01           <0.01           <0.01           <0.01

Notes: See Notes to Table 2. Village head corruption perception and president corruption perception refer to dummies for the respondent’s response to the corruption question about the village head and the President of Indonesia, respectively, as in Table 2. Robust standard errors in parentheses. All specifications include village fixed effects. Note that the sample size is lower in the conditional logit specification since all villages where there is no variation in the dependent variable are automatically dropped from the conditional logit model. * significant at 10%; ** significant at 5%; *** significant at 1%
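The sample-size difference between the conditional logit and OLS columns arises because a village in which every respondent gives the same answer contributes nothing to a fixed-effects logit likelihood. A minimal sketch of that filtering step, using made-up village labels and outcomes and hypothetical function names, is:

```python
from collections import defaultdict

def villages_with_variation(observations):
    """Return the villages whose 0/1 outcome takes more than one value."""
    outcomes = defaultdict(set)
    for village, y in observations:
        outcomes[village].add(y)
    return {village for village, ys in outcomes.items() if len(ys) > 1}

def informative_sample(observations):
    """Keep only observations from villages with outcome variation."""
    keep = villages_with_variation(observations)
    return [(village, y) for village, y in observations if village in keep]

# Made-up data: (village, any-likelihood-of-corruption dummy).
observations = [
    ("A", 0), ("A", 1), ("A", 0),   # varies: kept
    ("B", 0), ("B", 0),             # all zeros: dropped
    ("C", 1), ("C", 1),             # all ones: dropped
    ("D", 1), ("D", 0),             # varies: kept
]
kept = informative_sample(observations)
kept_villages = {village for village, _ in kept}
```

In this toy example four of the nine observations are dropped, while an OLS regression with village fixed effects would retain all nine.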

Table 5: Village Level Differences

                                        Missing        Likelihood of Corruption      Any Likelihood of Corruption
                                        Expenditures   in Road Project               in Road Project
                                                       (Linear scale, Std Dev 1)     (Dummy variable, Std Dev 1)
                                        (1)            (2)            (3)            (4)            (5)
Demographics:
  Log population                        0.262**        0.174***       0.128**        0.172***       0.118**
                                        (0.111)        (0.061)        (0.055)        (0.058)        (0.051)
  Mean village education level (years)  -0.039         -0.050         -0.020         -0.047         -0.006
                                        (0.046)        (0.032)        (0.031)        (0.034)        (0.033)
  Share of population poor              -0.334         -0.146         -0.124         -0.118         -0.065
                                        (0.252)        (0.164)        (0.133)        (0.159)        (0.140)
Social characteristics:
  Ethnic fragmentation                  -1.443**       1.733***       1.303***       1.942***       1.477***
                                        (0.567)        (0.322)        (0.292)        (0.340)        (0.332)
  Religious fragmentation               -1.359         0.069          -0.324         0.076          -0.309
                                        (1.088)        (0.732)        (0.704)        (0.720)        (0.700)
  Intensity of social participation     0.024          -0.053*        -0.041         -0.072**       -0.062**
                                        (0.064)        (0.029)        (0.025)        (0.029)        (0.026)
Transparency:
  Meetings with written accountability  -0.243         -0.017         -0.005         -0.072         -0.079
    report                              (0.154)        (0.093)        (0.074)        (0.088)        (0.076)
  Number of ordinances from             -0.019         0.008          0.015          0.012          0.015
    village parliament                  (0.017)        (0.012)        (0.009)        (0.011)        (0.010)
Experimental interventions:
  Audit treatment                       -0.303**       -0.057         -0.126*        -0.033         -0.068
                                        (0.130)        (0.087)        (0.069)        (0.087)        (0.071)
  Invitations treatment                 -0.032         0.020          0.024          -0.012         0.019
                                        (0.106)        (0.077)        (0.068)        (0.069)        (0.064)
  Invitations + comment treatment       -0.021         0.161*         0.120          0.107          0.095
                                        (0.094)        (0.088)        (0.073)        (0.087)        (0.077)
Corruption perceptions of:
  President                             N/A            No             Yes            No             Yes
  Subdistrict official                  N/A            No             Yes            No             Yes
  Village head                          N/A            No             Yes            No             Yes
  Village parliament                    N/A            No             Yes            No             Yes
Respondent Covariates                   N/A            No             Yes            No             Yes
Sample Controls                         N/A            Yes            Yes            Yes            Yes
Observations                            445            3074           2731           3384           3011
R-squared                               0.06           0.06           0.22           0.06           0.16

Notes: Dependent variable in column (1) is missing expenditures; dependent variable in columns (2) and (3) is the linearized variable of corruption perceptions described in the text; and dependent variables in columns (4) and (5) are dummy variables of corruption perceptions. Note that all dependent variables have been rescaled to have mean 0 and standard deviation 1. Observations in columns (2) – (5) are weighted by the inverse of the number of observations in each village, to ensure that each village receives the same weight as in column (1). Estimation is by OLS, though as discussed in the text, estimation of columns (2) and (3) by ordered probit and columns (4) and (5) by probit produces qualitatively similar results. Sample controls are as defined in the notes to Table 2; household controls are all of the individual respondent-level variables considered in Table 4; village head and President corruption perception refer to dummies for how the respondent answered the corruption questions about the village head and President of Indonesia, respectively, as included in column (3) of Table 2. Robust standard errors in parentheses, adjusted for clustering at subdistrict level. * significant at 10%; ** significant at 5%; *** significant at 1%
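The weighting described in these notes, in which each respondent is weighted by the inverse of the number of respondents in his or her village, can be illustrated with a small sketch. The village labels, outcome values, and function names are hypothetical, and a weighted mean stands in for the weighted regression only to show how the weights equalize villages.

```python
from collections import Counter

def inverse_village_weights(village_ids):
    """Weight each respondent by 1 / (number of respondents in the village)."""
    counts = Counter(village_ids)
    return [1.0 / counts[v] for v in village_ids]

def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)

# Made-up data: village A has four respondents, village B has two.
villages = ["A", "A", "A", "A", "B", "B"]
perceptions = [1, 1, 1, 1, 0, 0]

weights = inverse_village_weights(villages)
unweighted = sum(perceptions) / len(perceptions)
weighted = weighted_mean(perceptions, weights)

# Total weight carried by each village after weighting.
village_totals = Counter()
for v, w in zip(villages, weights):
    village_totals[v] += w
```

Without weights, village A dominates because it has more respondents; with the inverse-count weights each village carries a total weight of one, matching the one-observation-per-village structure of column (1).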
