.

Essays on Mechanism Design and Positive Political Theory:

Voting Rules and Behavior

Dissertation

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University

By

Semin Kim, M.A.

Graduate Program in Economics

The Ohio State University

2014

Dissertation Committee:

Yaron Azrieli, Advisor

James Peck

Katherine Baldiga Coffman .

Copyright by

Semin Kim

2014 .

Abstract

My dissertation focuses on the theory of mechanism design in environments, i.e., the design of voting rules that aggregate individual preferences into a group decision. The vast majority of the social choice literature on this topic uses purely ordinal models: agents can rank the candidates but the intensity of preferences is discarded. Furthermore, this literature restricts attention to dominant strategy rules- rules in which truth-telling is best for each agent regardless of other agents’ votes. My approach to the problem differs from the past literature in that, in my model, agents assign cardinal valuations to the alternatives, expressing their relative desirability. Instead of requiring dominant strategy, I consider the weaker concept of Incentive Compatibility (IC). IC only requires agents to be truth-telling if other agents do the same. My model is, therefore, close to the classic mechanism design problem, but with the crucial difference that monetary transfers cannot be used to remedy incentive problems. In Chapter 1, “Pareto efficiency and weighted majority rules” (joint with Yaron Azrieli), we study environments with two alternatives and independent private values, the simplest non-trivial environments. First, we show that the utilitarian rule subject to IC cannot utilize information about preference intensity, and that it is a weighted where the weights assigned to agents are environment-specific. Second, we characterize the set of Pareto efficient rules subject to IC and show that it is equal to the set of all weighted majority rules. We also study the stability of voting rules when a new rule is offered as an alternative to the incumbent. ii In Chapter 2, “Ordinal versus Cardinal Voting Rules: A Mechanism Design Ap- proach,” I consider neutral environments with at least three alternatives. Neutrality means that there is symmetry between the alternatives in the ex-ante stage, before preferences are realized. First, I show that within the class of ordinal rules (rules that do not use information about preference intensity) any Pareto efficient rule is incen- tive compatible. This result is quite interesting because unlike many of the classic results of mechanism design, it does not have tension between efficiency and incen- tive compatibility. Second, I construct a cardinal voting rule subject to IC, which achieves higher utilitarian social welfare than any ordinal rule. Thus, as opposed to environments with two alternatives, it is possible to increase social welfare by taking into account the intensity of preferences as reported by the agents when there are at least three alternatives. When there are exactly three alternatives my rule is closely related to Myerson’s (2002) (A,B) scoring rules, where each voter must choose a score vector that is a permutation of either (1,A,0) or (1,B,0). This is a simple rule that can practically be used in real-life institutions.

iii .

Dedication

This work is dedicated to my parents and my wife, Hye Lin.

Thanks for your love and support.

iv .

Acknowledgments

I would like to express my special appreciation and thanks to my advisor, Yaron Azrieli. He has been a great mentor for me. I would like to thank him for encouraging my research and for his thoughtful guidance and comments. I would also like to thank my committee members, James Peck, Paul Healy, Katherine Baldiga Coffman for serving as my committee members and for their helpful comments and suggestions. A special thanks to my family and friends in my church. Words cannot express how grateful I am to them. Many thanks to Dan Levin, Caleb Cox, Daeyoung Jeong, Jack Rouzer, Robert Munk, Dimitry Mezhvinsky, Ian Krajbich, and seminar participants at The Ohio State University Theory and Experiments Group Meetings for their valuable comments.

v .

Vita

2009 ...... M.A. Economics, The Ohio State University 2009 to present ...... Graduate Teaching Associate, Department

of Economics, The Ohio State University

Fields of Study

Major Field: Economics

vi .

Table of Contents

Abstract ...... ii

Dedication ...... iv

Acknowledgments ...... v

Vita...... vi

List of Tables...... viii

List of Figures ...... ix

Chapter 1: Pareto Efficiency and Weighted Majority Rules ...... 1

Chapter 2: Ordinal versus Cardinal Voting Rules:A Mechanism Design Approach.40

Bibliography ...... 80

Appendix A: Proofs in Chapter 1 ...... 82

Appendix B: Proofs in Chapter 2 ...... 85

vii .

List of Tables

abc bac Table 1. Plurality and voting rules when v1 ∈ V1 and v1 ∈ V1 . . .52

∗ abcH 0 abcL Table 2. fB and f when v1 ∈ V1 and v1 ∈ V1 ...... 54 Table 3. The rule g in Example 3 ...... 74

Table 4. The rule g in Example 4 ...... 76

viii .

List of Figures

Figure 1. An example of discontinuous h (β) at βˆ ...... 66

ix CHAPTER 1

Pareto Efficiency and Weighted Majority Rules

1.1. Introduction

Which mechanisms generate economically efficient outcomes when agents hold relevant private information? This is one of the most basic and important questions analyzed by the extensive mechanism design literature of the last several decades. In this paper we address this question in what is probably the simplest non-trivial environment: Society needs to choose between two alternatives, say reform or status- quo. Each agent is privately informed about his utility for each of the alternatives. The uncertainty of agents about the types of their opponents is captured by a common prior distribution, which we assume to be independent across agents. Monetary transfers are prohibited. The simplicity of the environment enables us to get a clear and precise answer to the above basic question, by characterizing the class of efficient mechanisms subject to incentive compatibility. The characterization depends on the particular notion of efficiency considered, but in all cases we find that the family of weighted majority rules plays an important role. These voting mechanisms, which include the simple majority rule as a special case, are widely used in practice and have been extensively studied in the political and economic literature.1 We think of our results as providing a normative rationale for the use of weighted majority rules, one that is based on the classic notions of efficiency and incentive compatibility.

1See the related literature section below for references. 1 Let us be more explicit regarding what we call a weighted majority rule. A Social Choice Function (SCF) f is a mapping from type profiles to lotteries over {reform, status-quo}. We say that f is a weighted majority rule if we can find positive weights

(w1, . . . , wn) for the agents and a positive quota q, such that under f the reform is implemented if the sum of weights of agents that prefer the reform exceeds q, and the status-quo prevails if this sum is less than q. Ties are allowed to be broken in an arbitrary way. Our first result (Theorem 1) characterizes SCFs that maximize (ex-ante, utili- tarian) social welfare subject to incentive compatibility.2 We show that a SCF is a solution to this optimization problem if and only if it is a weighted majority rule with specific weights. In particular, incentive compatibility constraints prevent effec- tive use of any information about realized intensity of preferences; only the ordinal ranking of the two alternatives as reported by the agents matters for the outcome. However, the distribution of intensities is relevant for determining the optimal rule, since, roughly speaking, the weight assigned to agent i in the optimal rule reflects the expected utility gain of i if his favorite alternative is chosen. We then move on to characterize the classes of interim and ex-ante incentive ef- ficient SCFs (Theorems 2 and 3). First, we show that a SCF is interim incentive efficient if and only if it is a weighted majority rule. In other words, the Pareto frontier (in the interim stage) of the set of incentive compatible decision rules coin- cides with the class of weighted majority rules. Second, we characterize the subset of weighted majority rules which are also ex-ante incentive efficient by imposing re- strictions on their weights and quota. Thus, in environments that are likely to satisfy our assumptions, the only kind of decision schemes that should be used are weighted

2By the revelation principle, and assuming that agents play Bayes-Nash equilibrium, incentive com- patibility constraints exactly characterize the class of feasible SCFs. 2 majority rules. Indeed, any other incentive compatible rule is Pareto inferior to some (incentive compatible) weighted majority rule. These results are proved by showing that a SCF is (interim or ex-ante) incentive efficient in a given environment if and only if it is a maximizer of social welfare in an auxiliary “equivalent” environment, and then applying Theorem 1. While the environment we analyze is simple, we emphasize that no symmetry be- tween the agents or the alternatives is assumed. Even at the ex-ante stage, different agents may have different utility distributions, and these distributions may be biased in favor of one of the alternatives. We believe that this is an important feature of our analysis, since many real-life environments are inherently asymmetric. Examples in- clude representative democracies with heterogenous district sizes, publicly held firms with institutional and private shareholders, and faculty hiring decisions in which the job candidate has closer research interests to some faculty members than others. We discuss the implications of our results for some of these environments in Section 1.8. Even if there is asymmetry between the agents, fairness considerations may force the designer to consider only anonymous mechanisms. Anonymity means that the chosen alternative depends only on the numbers of agents who prefer each of the alternatives and not on their names. In Section 1.6 we characterize the utilitarian solution and efficient rules under the additional constraint that only anonymous SCFs are feasible. As can be expected, qualified majority rules – which are weighted major- ity rules in which all the agents have the same weight, become central to the analysis. Specifically, we show that the utilitarian rule is a qualified majority rule with a certain quota (Theorem 4), that qualified majority rules are exactly those anonymous SCFs which are interim incentive efficient (Theorem 5), and that qualified majority rules with certain range of quotas are ex-ante incentive efficient (Theorem 6).

3 Our last theoretical contribution in this paper is a preliminary analysis of the stability of weighted majority rules (Section 1.7). Stability refers to a rule being suf- ficiently popular, so that no other rule will get a strong enough support to replace the existing rule. Several formulations of this idea appear in the literature, the dif- ferences between them being the timing at which the agents need to choose between the rules (interim or ex-ante), and the majority that is needed in order to replace the incumbent rule by the challenger (unanimity or the majority implied by the in- cumbent rule itself). We show that the concept of ex-ante self-stability (decision at the ex-ante stage by the current rule) is very restrictive, namely, that in symmetric environments no qualified majority rule is ex-ante self-stable. On the other hand, we show that durability (decision at the interim stage by unanimity) is very permissive in our environment, since every weighted majority rule is durable. We leave a more comprehensive analysis of the stability of weighted majority rules for future research. The environment we study has very unique features (two alternatives, independent private values, no transfers), and as we demonstrate in Section 1.9 our results do not extend to more general environments. However, from a practical point of view, the two alternatives case is perhaps the most interesting one to study, since binary decision problems are frequent. The no transfers assumption is also realistic, since in many cases they are infeasible or excluded for ethical reasons. Type independence is more restrictive in our context, but given the pervasiveness of this assumption in the literature we think that it is an interesting benchmark to study.

4 1.2. Related literature

The evaluation of (anonymous) majority rules based on the ex-ante expected wel- fare they generate first appears in Rae (1969), and the analysis has been later ex- tended in Badger (1972) and Curtis (1972). A more recent treatment appears in Schmitz and Tröger (2011) (see also the working paper Schmitz and Tröger (2006)), which analyzes efficiency of anonymous SCFs in a symmetric (between the agents and the alternatives) environment. Under incentive compatibility, the authors show that simple majority is the utilitarian rule (Proposition 2) and that it interim Pareto dominates every other incentive compatible, anonymous and neutral rule (Proposition 3).3 Our results complement these findings by extending the analysis to asymmetric environments, and by considering both cases of anonymous and non-anonymous rules. The theoretical study of weighted majority rules (with heterogenous weights) dates back at least to von-Neumann and Morgenstern’s book (1944, Section 50). Much of the literature deals with the measurement of the power of players in the cooperative game generated by the rule (e.g., Shapley and Shubik, 1954). In Barberà and Jackson (2006) a model of indirect democracy is studied, where each country in a union has a single representative that votes on its behalf. The authors find the optimal weights (according to a utilitarian criterion) that should be assigned to the representatives. These weights are essentially the same as in our Theorem 1. We discuss in greater detail the connection between the papers in Section 1.8. A related analysis of the weights of representatives of heterogenous districts is done in Beisbart and Bovens (2007).

3The analysis in Schmitz and Tröger (2011) also covers the case where incentive compatibility is replaced by the stronger strategy-proofness requirement. The results in this case are stronger, and in particular allow for correlated types. 5 Another result closely related to our Theorem 1 is Proposition 2 in Fleurbaey (2008). The main difference is that we derive the optimality of the weighted majority rule among all incentive compatible SCFs, while Fleurbaey (2008) considers only ordinal SCFs. On the other hand, Fleurbaey (2008) allows for correlated types. The voting example in Nehring (2004, Section 2) is also related to our analysis. In the same environment as in the current paper the author observes that a specific weighted majority rule maximizes ex-ante social welfare, and that there are many weighted majority rules that are interim incentive efficient. The analysis in Nehring (2004) presumes that incentive compatibility is equivalent to strategy proofness, but these two concepts differ even in this simple environment, and it is easy to construct examples of SCFs which are incentive compatible but not strategy-proof.4 When there are more than two alternatives the analysis of efficient SCFs becomes much more involved. In Apesteguia, et al. (2011), the authors characterize the utili- tarian rule among the SCFs that use only ordinal information, and show that it has the form of a scoring rule. They are not concerned with incentive compatibility issues, but Kim (2012) proves that the rule characterized in Apesteguia et al. (2011) is in fact incentive compatible in neutral environments. However, Kim (2012) also shows that there is an incentive compatible SCF that uses cardinal information and yields higher expected welfare than any ordinal SCF. Our analysis of stability in Section 1.7 borrows concepts from earlier works on this topic. Namely, ex-ante self-stability for qualified majority rules is introduced and analyzed in Barberà and Jackson (2004). The durability concept is due to Holmström and Myerson (1983).

4With two alternatives incentive compatibility is equivalent to ordinal incentive compatibility in- troduced in Majumdar and Sen (2004). With at least three alternatives, however, ordinal incentive compatibility is much stronger than incentive compatibility. 6 There are several papers that consider the design of mechanisms in other environ- ments when monetary payoffs are not allowed. In Börgers and Postl (2009), the au- thors consider a two-agents model with privately known intensities of preferences over three alternatives, and derive results regarding the form of the second-best decision rule. Another example of Bayesian mechanism design without transfers is Miralles (2012), who considers the allocation of two objects among n agents. In Gershkov, et al. (2013), optimal (utilitarian) dominant-strategy mechanisms in an environment with linear utilities but without transfers are characterized. Finally, our interest in the current question was initiated by Example 1 in Jackson and Sonnenschein (2007, pp. 243-244), which shows that incentive constraints can be overcome by linking many decision problems.

1.3. Environment

We consider a standard Bayesian environment à la Harsanyi. The set of agents

is N = {1, 2, . . . , n} with n ≥ 1. For each i ∈ N, Ti is a finite set of possible types

of agent i, and ti denotes a typical element of Ti. The type of agent i is a random

5 variable tˆi with values in Ti. The distribution of tˆi is µi ∈ ∆(Ti), which we assume has full support. Let T = T1 × · · · × Tn be the set of type profiles. We assume that types are independent across agents, so the distribution of tˆ= (tˆ1,..., tˆn) is the product distribution µ = µ1 × · · · × µn ∈ ∆(T ). As usual, a subscript −i means that the ith coordinate of a vector is excluded. Let A = {reform, status-quo} = {r, s} be the set of alternatives. The utility of each agent depends on the chosen alternative and on his own type only (private

values). Specifically, the utility of agent i is given by the function ui : Ti × A →

5For every finite set X, ∆(X) denotes the set of probability measures on X. 7 r s R. For ease of notation we write ui (ti) = ui(ti, r), ui (ti) = ui(ti, s) and ui(ti) = r s (ui (ti), ui (ti)). For expositional purposes, we assume that no agent is ever indifferent r s between the two alternatives, that is ui (ti) 6= ui (ti) for every ti ∈ Ti and every i ∈ N. 0 r s In addition, we assume that for every i there are ti, ti ∈ Ti such that ui (ti) > ui (ti) r 0 s 0 and ui (ti) < ui (ti). Since randomization over alternatives will be considered, we need to extend each ui(ti, ·) to ∆(A). We identify ∆(A) with the interval {(p, 1 − p) : 0 ≤ p ≤ 1} ⊆

R2, where the first coordinate corresponds to the probability of r and the second coordinate to the probability of s. With abuse of notation we write ui(ti, (p, 1 − p)) =

r s pui (ti) + (1 − p)ui (ti) for 0 ≤ p ≤ 1. A Social Choice Function (SCF) is a mapping f : T → ∆(A). The set of all SCFs is denoted F . It will be useful to think about F as a (convex, compact) subset of the linear space R2|T |. Thus, if f, g ∈ F and α ∈ [0, 1] then αf + (1 − α)g ∈ F is defined by (αf + (1 − α)g)(t) = αf(t) + (1 − α)g(t) ∈ ∆(A).

For every agent i, type ti ∈ Ti and SCF f we denote by Ui(f|ti) the interim expected utility of agent i under f conditional on him being of type ti:

 ˆ ˆ ˆ   ˆ  Ui(f|ti) = E ui ti, f t | ti = ti = ui(ti) · E f(ti, t−i) , where x · y denotes the inner product of the vectors x and y. The ex-ante utility of agent i under f is

 ˆ ˆ X Ui(f) = E ui ti, f t = µi(ti)Ui(f|ti). ti∈Ti

Definition 1. A SCF f is Incentive Compatible (IC) if truth-telling is a Bayesian equilibrium of the direct revelation mechanism associated with f. Namely, if for all

8 0 i ∈ N and all ti, ti ∈ Ti, we have

  ˆ   0 ˆ  (1.3.1) ui(ti) · E f(ti, t−i) − E f(ti, t−i) ≥ 0.

The set of all IC SCFs is denoted F IC .

1.3.1. Ordinal rules and weighted majority rules. We now define two classes of SCFs that will have an important part in the analysis below. For each agent i, let

Pi be the partition of Ti into the two (non-empty) sets

r r s Ti = {ti ∈ Ti : ui (ti) > ui (ti)},

s r s Ti = {ti ∈ Ti : ui (ti) < ui (ti)}.

Recall that agents are never indifferent, so every type ti is in exactly one of these sets.

The partition Pi reflects the ordinal preferences of agent i over the alternatives. Let

0 P be the partition of T which is the product of all the Pi’s: t and t are in the same

0 element of P if and only if ti and ti are in the same element of Pi for every i ∈ N. As usual, let P (t) be the element of the partition P that contains the type profile t.

Definition 2. A SCF f is ordinal if it is P -measurable, i.e., if f(t) = f(t0) whenever P (t) = P (t0). The set of all ordinal SCFs is denoted F ORD.

Thus, an ordinal SCF depends only on the ordinal information in the reported type profile, and is not affected by changes in the expressed intensity of preference.

Definition 3. A SCF f is a weighted majority rule if there are strictly positive P numbers (w1, . . . , wn; q) such that i∈N wi > q and such that   P  (1, 0) if r wi > q  {i:ti∈T } f(t) = i  P  (0, 1) if r wi < q. {i:ti∈Ti } 9 We refer to wi as the weight of agent i and to q as the quota.

Remark 1. Definition 3 does not specify how a weighted majority rule deals P with ties, i.e., with type profiles for which r wi = q (when such type profiles {i:ti∈Ti } exist). In particular, a weighted majority rule need not be ordinal, since it may assign different alternatives to two ordinally equivalent type profiles in which there is a tie. Furthermore, for the very same reason a weighted majority rule need not be IC. To see this, consider an environment with n = 2 agents and where the type of each agent is randomly chosen from the set {x, y, z, w} according to a uniform distribution. Both agents have the same utility function: They get utility of 0 if s is chosen no matter what is their type. If r is chosen then their utilities (as a function of their type) are 2 for type x, 1 for type y, -1 for type z, and -2 for type w. Let f be the ‘first-best’ SCF that chooses the alternative that maximizes the sum of utilities at each type profile and flips a fair coin if there is a tie. First, note that f is a weighted majority rule with weights and quota given by (1, 1; 1). Indeed, the only restriction that this weighted majority rule imposes is that if both agents prefer the same alternative then that alternative is chosen, which is clearly satisfied by f. Second, f is not ordinal since it assigns different outcomes to the ordinally equivalent type profiles (x, z) and (y, w). Third, f is not IC since an agent of type y has an incentive to report that his type is x. Note, however, that if a SCF is both ordinal and a weighted majority rule then it is IC. Indeed, in a weighted majority rule there is never an incentive to report a type with opposite ranking of alternatives than the actual one, while ordinality implies that an agent cannot gain by pretending to be a different type with the same ordinal preferences. 10 The following simple lemma provides an alternative representation of weighted majority rules, which will be useful in some of the proofs later on. The proof is straightforward and therefore omitted.

Lemma 1. f is a weighted majority rule if and only if there are 2n strictly positive

r s r s numbers ((w1, w1),..., (wn, wn)) such that   P r P s  (1, 0) if r w > s w  {i:ti∈T } i {i:ti∈T } i f(t) = i i  P r P s  (0, 1) if r wi < s wi . {i:ti∈Ti } {i:ti∈Ti }

r s r s Moreover, (w1, . . . , wn; q) and ((w1, w1),..., (wn, wn)) represent the same weighted r s P s majority rule whenever wi = wi + wi for every i and q = i wi .

1.4. The utilitarian rule

We start the analysis by considering the problem of maximizing social welfare, i.e., the sum of ex-ante expected utilities of all the agents. It will be convenient to

r P r s P s r s denote v (t) = i∈N ui (ti), v (t) = i∈N ui (ti) and v(t) = (v (t), v (t)) for every t = (t1, . . . , tn) ∈ T . These are the welfare totals for each of the two alternatives when t is the realized type profile.

P Definition 4. The (ex-ante) social welfare of a SCF f is V (f) = i∈N Ui(f) =      E v tˆ · f tˆ .

Without requiring incentive compatibility, a maximizer of social welfare simply chooses r whenever vr(t) > vs(t) and chooses s if the other inequality holds (anything, including randomization, can be chosen when vr(t) = vs(t)). However, such SCFs will typically not be IC, since agents will have an incentive to exaggerate the intensity of their preference. The following theorem characterizes maximizers of social welfare subject to incentive compatibility. 11 Theorem 1. A SCF f ∈ F IC is a maximizer of V in F IC if and only if it satisfies   P r P s  (1, 0) if r w˜ > s w˜  {i:ti∈T } i {i:ti∈T } i (1.4.1) f(t) = i i  P r P s  (0, 1) if r w˜i < s w˜i , {i:ti∈Ti } {i:ti∈Ti } where

r  r ˆ s ˆ ˆ r s  s ˆ r ˆ ˆ s (1.4.2) w˜i = E ui (ti) − ui (ti) | ti ∈ Ti , w˜i = E ui (ti) − ui (ti) | ti ∈ Ti .

In particular, by Lemma 1, any maximizer of V in F IC is a weighted majority rule

r s r s P s with weights and quota given by (w ˜1 +w ˜1,..., w˜n +w ˜n; i w˜i ).

According to the theorem, any maximizer of social welfare is a weighted major- ity rule with specific weights that reflect the expected intensity of preference of one alternative over the other. Specifically, given an announced type profile, the agents are partitioned according to whether the type they announced prefers r or s. For each agent who prefers r, the expected utility gain if r is indeed chosen is computed, and this is the weight assigned to this agent. Similarly, agents who prefer s are as- signed weights corresponding to their expected gain if indeed s is chosen. The chosen alternative is the one with the larger total weight.

Proof of Theorem 1: The proof is based on the following lemma which shows that, in the class of IC SCFs, restricting attention to ordinal SCFs does not change the interim utility possibility set. A similar result (for the agent-symmetric case) appears in Schmitz and Tröger (2006, Lemma 14).

12 IC 6 IC ORD Lemma 2. If f ∈ F then E(f|P ) ∈ F ∩ F and satisfies Ui(E(f|P )|ti) =

Ui(f|ti) for every i ∈ N and every ti ∈ Ti.

The proof of the lemma is in the Appendix. Note that the lemma implies that

the same relation holds also in the ex-ante stage, i.e., Ui(E(f|P )) = Ui(f) for every i ∈ N. In particular, V (E(f|P )) = V (f). We turn now to the proof of the theorem. First, note that for every g ∈ F ORD we have

     h      i h      i ˆ ˆ ˆ ˆ ˆ ˆ V (g) = E v t · g t = E E v t · g t P = E g t · E v t P .

Thus, a SCF g is a maximizer of V in F ORD if and only if it satisfies

  r     s     (1, 0) if E v tˆ | P (t) > E v tˆ | P (t) g(t) =          (0, 1) if E vr tˆ | P (t) < E vs tˆ | P (t) .

Using type independence, this is equivalent to

 P  r ˆ  P  s ˆ   (1, 0) if i∈N E ui (ti) | Pi(ti) > i∈N E ui (ti) | Pi(ti) g(t) =  P  r ˆ  P  s ˆ   (0, 1) if i∈N E ui (ti) | Pi(ti) < i∈N E ui (ti) | Pi(ti) , which is precisely condition (1.4.1) in the theorem.

Now, assume f ∈ F IC satisfies (1.4.1). Then E(f|P ) also satisfies (1.4.1), so by the previous argument E(f|P ) is a maximizer of V in F ORD. By Lemma 2, E(f|P ) ∈ F IC ∩ F ORD and V (f) = V (E(f|P )). Assume by contradiction that there is f 0 ∈ F IC with V (f 0) > V (f). Then again by Lemma 2, E(f 0|P ) ∈ F IC ∩ F ORD and V (E(f 0|P )) = V (f 0). It follows that V (E(f|P )) = V (f) < V (f 0) = V (E(f 0|P )), contradicting the fact that E(f|P ) is a maximizer of V in F ORD. 6 E(f|P ) is the conditional expectation of f given the partition P , i.e., it is the SCF defined by 1 P 0 0 E(f|P )(t) = µ(P (t)) t0∈P (t) µ(t )f(t ) 13 Conversely, assume that f is a maximizer of V in F IC . Then by Lemma 2 E(f|P ) is also a maximizer of V in F IC . From the first paragraph of this proof it follows that the maximum of V in F ORD is attained (perhaps not exclusively) by some function in F IC ∩ F ORD. It follows that E(f|P ) is a maximizer of V in F ORD, and so it must satisfy condition (1.4.1). However, if E(f|P ) satisfies (1.4.1) then f satisfies (1.4.1) as well.

1.5. Pareto efficiency

The utilitarian criterion for the evaluation of social choice functions is somewhat controversial, as it involves interpersonal comparison of cardinal utilities (see, e.g., the discussion in Weymark, 2005). In contrast, it is hard to argue against Pareto efficiency as a minimal normative criterion that a social planner should take into account. In this section we will characterize the Pareto frontier of F IC , and show that it is closely related to the class of weighted majority rules. As is well-known (see Holmström and Myerson, 1983), in environments with in- complete information the concept of Pareto efficiency is not as simple as in complete information environments. The timing (ex-ante, interim, ex-post) at which a given SCF is evaluated can affect the conclusion of whether it is efficient or not. Another aspect is whether incentive constraints are taken into account or not. Holmström and Myerson (1983) convincingly argue that, out of the six possible combinations (three possibilities for the timing of evaluation times two possibilities for the set of feasible rules), three are reasonable as notions of efficiency.

Definition 5. (Holmström and Myerson, 1983)

(1) A SCF f is ex-ante incentive efficient if f ∈ F IC and there is no g ∈ F IC

such that Ui(g) ≥ Ui(f) for every agent i, with at least one strict inequality. 14 (2) A SCF f is interim incentive efficient if f ∈ F IC and there is no g ∈ F IC

such that Ui(g|ti) ≥ Ui(f|ti) for every agent i and every ti ∈ Ti, with at least one strict inequality. (3) A SCF f ∈ F is ex-post classically efficient if there is no g ∈ F such that

ui(ti, g(t)) ≥ ui(ti, f(t)) for every agent i and every t = (ti, t−i) ∈ T , with at least one strict inequality.

The class of ex-post classically efficient SCFs is very large. The only restriction that this notion of efficiency imposes in our environment is that of unanimity: If a given type profile is such that all the agents prefer one alternative over the other, then this alternative should be chosen. For type profiles in which some agents prefer r and some prefer s any choice (including randomization) is allowed. It follows that this notion of efficiency is too weak to narrow F in a significant way. We therefore focus on the other two criteria. The following theorems characterize interim and ex-ante incentive efficiency.

Theorem 2. A SCF f is interim incentive efficient if and only if f is an IC weighted majority rule.7

.

Theorem 3. A SCF f is ex-ante incentive efficient if and only if there are strictly

positive numbers λ1, . . . , λn such that f is an IC weighted majority rule with weights

r s r s P s r s r s and quota given by (λ1 (w ˜1 +w ˜1) , . . . , λn (w ˜n +w ˜n); i λiw˜i ), where ((w ˜1, w˜1),..., (w ˜n, w˜n)) are the weights used to maximize social welfare as in (1.4.2).

Before the proofs, a couple of comments are in order. First, the weights and quota in an ex-ante incentive efficient rule are based on those of the utilitarian rule, but

7We must explicitly require that f be IC, see Remark 1. 15 the power of agents may be arbitrarily distorted by the coefficients λi. This may be viewed as the result of a social planner maximizing a different welfare functional than the utilitarian. Note however that a certain connection between the weights and the quota must hold: If the weight of an agent increases by a certain amount, then the quota should also increase by a certain smaller (environment-specific) amount. Thus, if agent i has greater influence on the outcome when he supports r (his weight increased more than the increase in quota) then he also has greater influence when he supports s (the quota has increased). One may ask whether, despite the argument of the previous paragraph, interim and ex-ante incentive efficiency are in fact equivalent in our simple environment. To see that in general they are not, consider the case where there is symmetry between

r s the alternatives in the sense that w˜i =w ˜i for every i. In this case, every ex-ante incentive efficient rule is neutral (except maybe in case of a tie), i.e., alternative r is chosen when a certain coalition of agents supports it if and only if alternative s is chosen when the same coalition supports s. Obviously, there are many weighted majority rules that do not satisfy this property, and thus the class of interim incentive efficient rules is strictly larger than the class of ex-ante incentive efficient rules in this case.

Proof of Theorem 2: Assume first that f ∈ F IC is a weighted majority rule with corresponding weights

r s r s r s ((w1, w1), (w2, w2) ..., (wn, wn)) (here we use the alternative representation of a weighted majority rule from Lemma 1). We consider an auxiliary environment with the same set of agents, types and distribution over types as in the original environment. The 16 utilities in the new environment are given by

 r wi r  r ui(ti) if ti ∈ Ti 0 w˜i ui(ti) = s  wi s  s ui(ti) if ti ∈ Ti , w˜i

r s where w˜i , w˜i are as in equation (1.4.2). Since the set of IC SCFs is not affected by this transformation of utilities, we can apply Theorem 1 to conclude that f is a maximizer of V in the auxiliary environment (among IC functions). In other words, f is a maximizer of

X X 0 µi(ti)Ui (g|ti) i ti IC 0 among all functions g ∈ F , where Ui (g|ti) is the interim utility of agent i of type ti r 0 wi in the auxiliary environment when g is implemented. Since Ui (g|ti) = r Ui(g|ti) for w˜i s r 0 wi s ti ∈ Ti and Ui (g|ti) = s Ui(g|ti) for ti ∈ Ti , we get that f is a maximizer of w˜i

 r s  X X wi X wi  µi(ti) r Ui(g|ti) + µi(ti) s Ui(g|ti) . r w˜i s w˜i i ti∈Ti ti∈Ti

Thus, f maximizes a linear combination with strictly positive coefficients of the in- terim utilities of the players (in the class F IC ). This proves that f is interim incentive efficient.

Conversely, let f be interim incentive efficient. We claim first that there are

strictly positive numbers {λi(ti)}i∈N, ti∈Ti such that f is a maximizer of

X X λi(ti)Ui(g|ti) i ti among all functions g ∈ F IC . Indeed, the set F IC , viewed as a subset of R2|T |, is polyhedral and convex. Also, the mapping from SCFs to interim utility vectors is affine: Ui(αf + (1 − α)g|ti) = αUi(f|ti) + (1 − α)Ui(g|ti) for any f, g ∈ F and any 17 n o α ∈ [0, 1]. It follows that the set (U (f|t )) : f ∈ F IC is polyhedral and i i i∈N, ti∈Ti convex (Rockafellar, 1970, page 174). For convex polyhedral utility-possibility sets, Pareto efficiency is completely characterized by maximization of linear combinations of utilities with strictly positive coefficients (Gale, 1960, page 308). Fix a vector λ as above and consider the auxiliary environment with utilities

0 λi(ti) given by u (ti) = ui(ti) (all other ingredients are the same as in the original i µi(ti) environment). As before, incentive compatibility is not affected by this change, and

0 λi(ti) the interim utilities in the new environment are given by U (g|ti) = Ui(g|ti). It i µi(ti) follows that f maximizes (in F IC ) the expression

X X 0 X 0 µi(ti)Ui (g|ti) = Ui (g), i ti i which means that f is a maximizer of (ex-ante) social welfare in the auxiliary envi- ronment. By Theorem 1 f must satisfy equation (1.4.1) (for the new environment), which means that f is a weighted majority rule.

Proof of Theorem 3: The proof follows the footsteps of the previous proof. Assume first that f ∈ F IC is a weighted majority rule with weights and quota as in the theorem (for some vector

(λ1, . . . , λn) of positive numbers). Consider the auxiliary environment with utilities

0 given by ui(ti) = λiui(ti) for every i and ti. It follows from Theorem 1 that f is a welfare maximizer (subject to IC) in the new environment, that is f maximizes the sum

X 0 X Ui (g) = λiUi(g) i i over all IC functions g. Since all the λ’s are strictly positive we conclude that f is ex-ante incentive efficient. 18 In the other direction, assume that f is ex-ante incentive efficient. By a similar argument to the one in the proof of Theorem 2, there are strictly positive numbers

(λ1, . . . , λn) such that f maximizes the sum

X λiUi(g) i

over all IC functions g. It follows that f is socially optimal subject to IC in an

0 auxiliary environment with utilities given by ui(ti) = λiui(ti). By Theorem 1, f satisfies the condition in the Theorem.

1.6. Anonymous rules

Voting rules that discriminate between voters, in the sense of giving more power to one group of voters over another, are often excluded as violating the basic fairness criterion of “one person, one vote”. In this section we revisit the questions addressed in the previous sections, under the additional restriction that only SCFs that treat the agents in a symmetric way are feasible. Such SCFs are typically called anonymous in the literature, and we will use this terminology as well. We emphasize that we do not assume that the environment is symmetric between the agents – different agents may have different utility distributions. In this section we restrict attention to ordinal SCFs. This makes the definition of anonymity simpler, and allows us to focus on the more interesting aspects of the

8 r problem. We denote by k(t) = # {i ∈ N : ti ∈ Ti } the number of agents that prefer r over s given the type profile t = (t1, . . . , tn) ∈ T .

8Alternatively, we could have required that all agents have the same set of type labels and define anonymity by requiring that a permutation of agents’ reports does not change the outcome. A mod- ified version of Lemma 2 would then imply that essentially only ordinal SCFs should be considered. 19 Definition 6. A SCF f is anonymous if there exists a function δ : {0, 1, . . . , n} → ∆(A) such that f(t) = δ(k(t)) for every t ∈ T . The class of anonymous SCFs is denoted F ANO.

The intersection of the classes of weighted majority rules and anonymous rules consists of weighted majority rules in which all agents have the same weight, i.e. rules in which the weights and quota are given by (1, 1,..., 1 ; q) for some 0 < q < n. We refer to such SCFs as qualified majority rules. Note that this class contains both super- and sub-majority rules, so our definition of a qualified majority rule is somewhat different than the usual meaning of this term. Every qualified majority rule f thus has a corresponding function δ of the form

  (0, 1) if k < q δ(k) =  (1, 0) if k > q for some 0 < q < n. When k = q (in cases where q is integer) every outcome, including randomization, is allowed. A qualified majority rule is IC since it is an ordinal weighted majority rule (see Remark 1). There are however IC anonymous SCFs which are not qualified majority rules, as demonstrated by the following example.

Example 1. Let n = 3 and consider a symmetric environment in which T1 =

r T2 = T3 = {tr, ts}, µi(tr) = µi(ts) = 1/2 for every i, and utilities given by ui (tr) = s r s ui (ts) = 1 and ui (ts) = ui (tr) = 0 for every i. Let δ be given by δ(0) = δ(2) = (0, 1) and δ(1) = δ(3) = (1, 0). That is, if the number of agents that support r is odd then r is chosen, and if it is even then s is chosen. Then (the anonymous SCF generated by) δ is IC even though it is not a qualified majority rule. Indeed, the probability 20 that each of the alternatives is chosen is 1/2 independently of the report of an agent (assuming others are truthful).

Our strategy in this section is similar to the one of the previous sections: We start by characterizing the utilitarian solution within the class of anonymous and IC functions, and we will then use this result to analyze Pareto efficiency in this class. The following theorem generalizes early results in Badger (1972) and Curtis (1972) (see also Lemma 4 in Barberà and Jackson, 2004).

Theorem 4. A SCF f is a maximizer of V in the class F IC ∩ F ANO if and only if f is a qualified majority rule with a quota q = βk + (1 − β)k for some 0 < β < 1, where

( ) X r r X s s k = min k ∈ {0, 1, . . . , n} : µ (Ti | k (t) = k)w ˜i ≥ µ (Ti | k (t) = k)w ˜i i∈N i∈N ( ) X r r X s s k = max k ∈ {0, 1, . . . , n} : µ (Ti | k (t) = k)w ˜i ≤ µ (Ti | k (t) = k)w ˜i i∈N i∈N

The proof of this result is quite simple, since f is a welfare maximizer if and only if for every k, conditional on the event {k (t) = k}, f chooses the alternative with higher conditional expected welfare. We will show that the difference of the conditional expected welfare between alternative r and alternative s is a strictly increasing function of k, and that k (k) is the minimal (maximal) k at which this function is non-negative (non-positive). It follows that an optimal SCF is a qualified majority rule with a quota between these two numbers.9 Proof of Theorem 4: We start with the following lemma, which is closely related to Lemma 1 in Barberà

9It is possible that k = k, in which case the quota q is equal to their common value. In this case an optimal SCF may assign any outcome (including randomization) if exactly q agents prefer r. Otherwise, k = k + 1 and there is a unique deterministic utilitarian rule. 21 and Jackson (2004), and to an earlier work of Badger (1972). The proof can be found in the Appendix.

r Lemma 3. For each i ∈ N, the conditional probability µ (Ti | k (t) = k) is a strictly increasing function of k.

Moving to the proof of the theorem, let f ∈ F ANO and let δ be its associated function. Then

n X h      i V (f) = µ (k (t) = k) δ(k) · E v tˆ | k tˆ = k . k=0

Therefore, f is a maximizer of V if and only if

  r   s       (1, 0) if E v tˆ − v tˆ | k tˆ = k > 0 (1.6.1) δ(k) =          (0, 1) if E vr tˆ − vs tˆ | k tˆ = k < 0.

Now,

r ˆ s ˆ ˆ  X r ˆ  s ˆ  ˆ  E v t − v t | k t = k = E ui ti − ui ti | k t = k = i∈N X r r ˆ  s ˆ  ˆ ˆ r µ (Ti | k (t) = k) E ui ti − ui ti | k t = k, ti ∈ Ti + i∈N X s r ˆ  s ˆ  ˆ ˆ s µ (Ti | k (t) = k) E ui ti − ui ti | k t = k, ti ∈ Ti = i∈N X r r X s s µ (Ti | k (t) = k)w ˜i + µ (Ti | k (t) = k)(−w˜i ), i∈N i∈N where in the last equality we used type independence to remove the conditioning event {k (t) = k}. Thus, (1.6.1) becomes

 P r r P s s  (1, 0) if i∈N µ (T | k (t) = k)w ˜ > i∈N µ (T | k (t) = k)w ˜ δ(k) = i i i i  P r r P s s  (0, 1) if i∈N µ (Ti | k (t) = k)w ˜i < i∈N µ (Ti | k (t) = k)w ˜i .

22 Consider the difference

X r r X s s α(k) = µ (Ti | k (t) = k)w ˜i − µ (Ti | k (t) = k)w ˜i . i∈N i∈N

P s P r We have α(0) = − i∈N w˜i < 0 and α(n) = i∈N w˜i > 0. The proof will be complete if we can show that α(k) is strictly increasing in k. A sufficient condition for this is

r that µ (Ti | k (t) = k) is strictly increasing in k for each i, so we are done by Lemma 3.

We now move on to discuss Pareto efficiency (subject to IC) in F ANO. The definitions of ex-ante and interim efficiency are almost the same as those in Definition 5, the only difference being that F IC should be replaced by F IC ∩ F ANO everywhere.

Theorem 5. A SCF f is interim efficient in F IC ∩ F ANO if and only if f is a qualified majority rule.

.

Theorem 6. A SCF f is ex-ante efficient in F IC ∩ F ANO if and only if f is a qualified majority rule with a quota q = β mini∈N ki + (1 − β) maxi∈N ki for some 0 < β < 1, where for each i ∈ N

r r s s ki = min {k ∈ {0, 1, . . . , n} : µ (Ti | k (t) = k)w ˜i ≥ µ (Ti | k (t) = k)w ˜i }

r r s s ki = max {k ∈ {0, 1, . . . , n} : µ (Ti | k (t) = k)w ˜i ≤ µ (Ti | k (t) = k)w ˜i } .

These theorems follow from similar arguments to those of the previous results, so we only give sketches of the proofs. For Theorem 5, notice first that since a qualified majority rule is an IC anonymous weighted majority rule, it follows immediately from Theorem 2 that every qualified majority rule is interim efficient in F IC ∩ F ANO. 23 Conversely, assume that f is interim efficient in F IC ∩ F ANO. Then, by an identical argument to the one in the proof of Theorem 2, f maximizes social welfare among all IC functions (not just anonymous) in an auxiliary environment obtained by a monotone transformation of utilities. It follows from Theorem 4 that f is a qualified majority rule. For Theorem 6, we can use the same argument as in the proof of Theorem 3 to conclude that f is ex-ante efficient in F IC ∩ F ANO if and only if it maximizes welfare in this set in an environment in which the utility of each agent i is multiplied by some constant λi > 0. Denote by kλ, kλ the cutoffs of Theorem 4 when agents’ utilities

are multiplied by λ = (λ1, . . . , λn). Notice that the cutoffs ki, ki in the statement of Theorem 6 are not affected by this transformation of utilities.

Now, for every λ we have that mini ki ≤ kλ ≤ kλ ≤ maxi ki. Thus, if q =

0 0 βkλ + (1 − β)kλ for some β ∈ (0, 1) then also q = β mini ki + (1 − β ) maxi ki for some β0 ∈ (0, 1). This proves that every ex-ante efficient SCF has the form as in the

0 0 0 theorem. Conversely, given q = β mini ki + (1 − β ) maxi ki with β ∈ (0, 1), there

exists λ such that q = βkλ +(1−β)kλ for some β ∈ (0, 1). The latter can be achieved for example by assigning appropriate weights to the two agents for which mini ki and

10 maxi ki are achieved, and assigning weights close enough to 0 to all other agents.

1.6.1. When are anonymous rules optimal? We conclude this section by considering the question of what environments have the property that an anonymity constraint does not reduce welfare. That is, in what kind of environments the optimal rule among all IC rules is anonymous?

10Theorem 6 can also be proved using the ‘single peakedness’ property of the agents’ preferences over qualified majority rules, see Badger (1972) and Barberà and Jackson (2004). 24 It is an easy guess that if the environment is symmetric between the agents, in the sense that all agents have the same utility distribution, then the optimal rule is anonymous. The following immediate corollary of Theorem 1 shows that in fact much less symmetry is needed for this result to hold.

r s Corollary 1. In an environment in which w˜i +w ˜i is constant in i there is a qualified majority rule which maximizes welfare among all IC SCFs. This rule is

s P w˜i characterized by the quota q = i r s . w˜i +w ˜i

The condition in the corollary is far from being necessary. There are very asym- metric environments in which anonymous rules are optimal. Consider for example an

r s r s r s environment with three agents in which w˜1 =w ˜1 =w ˜2 =w ˜2 = 100 and w˜3 =w ˜3 = 1. Even though agent 3 is very different from agents 1 and 2, the utilitarian rule (in F IC ) here is the simple majority rule, so in particular it is anonymous. We can ask a similar question regarding Pareto efficiency rather than welfare optimality. By Theorem 2, in any environment, every qualified majority rule is interim incentive efficient. What perhaps is less obvious is that in every environment there is at least one ex-ante incentive efficient SCF which is anonymous.

Corollary 2. In every environment, the qualified majority rule with quota q = s P w˜i i r s is ex-ante incentive efficient. w˜i +w ˜i

1 The corollary follows immediately from Theorem 3 by putting λi = r s . w˜i +w ˜i

1.7. Self-stability and durability

1.7.1. Ex-ante self-stability. Barberà and Jackson (2004) study the self-stability of qualified majority rules. A given status-quo rule f is self-stable if it beats any alter- native rule g, where the decision between f and g is made using the rule f. Self-stable 25 rules correspond to steady states of the evolutionary process of a constitution. Over time, a rule that is not self-stable is likely to be replaced by another rule, even if the former is excellent from a welfare point of view. An important question is thus whether any of the weighted majority rules that we study here are self-stable and, if so, which ones. To formally address this question we need to extend the definition of self-stability to weighted majority rules.

Definition 7. A weighted majority rule f = (w1, . . . , wn; q) is self-stable if for any other weighted majority rule g it holds that

X wi < q, i∈R(g,f) where R(g, f) = {i ∈ N : Ui(g) > Ui(f)} is the set of agents whose ex-ante expected utility is higher under g than under f.

We emphasize that this definition is based on an ex-ante comparison of the two voting rules. That is, when agents need to decide whether to support the current rule f or the alternative g, they do not know their preferences over the issues that these rules will be used to resolve. Also, notice that implicit in this definition is the P assumption that in case of a tie (i.e., when i∈R(g,f) wi = q) the new rule g prevails, but none of our results depend on this tie-breaking rule. The definition in Barberà and Jackson is the same as above, except that both f and g are assumed to be anonymous (i.e., qualified majority rules). Also, they only

r s consider environments in which w˜i =w ˜i = 1 for all i. They prove (Theorem 1) that 26 r 11 if µ (Ti ) is the same for all i then simple majority is self-stable. In contrast, when weighted majority rules are allowed we get the following result.

r s Proposition 1. Consider an environment in which w˜i =w ˜i = 1 for all i, and r such that µ (Ti ) = p ∈ (0, 1) for all i. If the number of agents is n ≥ 4 then there is no self-stable qualified majority rule. In particular, the utilitarian rule in such environments is not self-stable.12

Proof: First, any qualified majority rule different from simple majority is not self-stable since it is defeated by simple majority: It follows from our Theorem 1 that simple majority is the utilitarian rule in this environment, and that it gives strictly higher welfare than any other qualified majority rule. From the symmetry between the agents it follows that every agent gets strictly higher expected utility under simple majority than under any other qualified majority rule. Thus, all the agents will support a shift from a status-quo qualified majority rule to simple majority, and so the qualified majority is not self-stable.

Lemma 4. Thus, to complete the proof we only need to check that simple majority is not self-stable. For this we need the following lemma, whose proof is in the Ap- pendix. Let X1,...,Xn (n ≥ 4) be independent and identically distributed random variables with P r(X1 = 1) = 1 − P r(X1 = 0) = p ∈ (0, 1), and for each 1 ≤ k ≤ n

11We call simple majority to the qualified majority rule in which the quota is half of the total number of agents n. If n is odd then there is a unique simple majority rule, while if n is even there are two such (deterministic) rules, depending on how a tie is broken. We refer to both of them as simple majority. 12With n = 3 agents it is easy to check that simple majority is self-stable. With n = 2 agents the qualified majority rule that requires unanimous support for a reform is self-stable. 27 Pk denote Sk = i=1 Xi. Then

 n  n (1.7.1) P r X = 1,S ≥ + P r X = 0,S < < 1 n 2 1 n 2  n − 2  n − 2 P r X = 1,S ≥ + P r X = 0,S < . 1 n−2 2 1 n−2 2

The left-hand side of (1.7.1) is the ex-ante expected utility of agent 1 under simple majority rule when there are n agents and the tie-breaking rule favors alternative r (when n is even). Consider the weighted majority rule in which agents 1 through n − 2 each has a weight of 1, agents n − 1 and n each has a weight of  < 1/4, the quota is (n − 2)/2, and the tie-breaking rule still favors r. The ex-ante expected utility of agent 1 is then given by the right-hand side of (1.7.1). It follows from the lemma that agent 1 prefers the latter weighted majority rule over simple majority with the tie-breaking rule which favors r. By symmetry, the same is true for agents 2, . . . , n − 2. This proves that simple majority in which the tie-breaking rule favors r is not self-stable. But notice that the tie-breaking rule of simple majority does not affect the ex-ante utility of agents in our environment, so the same is true if the tie-breaking rule favors s. This completes the proof.

The proof of the proposition makes it clear that self-stability becomes a very restrictive notion once weighted majority rules can be used to challenge the status- quo rule. Keeping the assumptions of Proposition 1, even in the larger class of non- anonymous rules only few are self-stable. In other environments there may be more self-stable rules. A precise characterization of self-stability is an interesting direction for future research.

1.7.2. Durability and interim self-stability. A different notion of stability of SCFs, called durability, was introduced by Holmström and Myerson (1983). Roughly 28 speaking, a SCF f is durable if f would never (at any information state) be unani- mously rejected in favor of another SCF g. As opposed to self-stability of the previous subsection, the decision between the status-quo f and the alternative g takes place in the interim stage, after each agent knows his type. This is a crucial difference, since when an agent decides between the two rules he needs to take into account not only his own information, but also the fact that other agents’ voting behavior depends on their information. Indeed, learning that the alternative rule g defeated f (or that the complementary event happened) contains information about other agents’ types and, therefore, about how they are going to behave in each of these mechanisms. The precise definition of durability is quite involved, and we will not give the details here. Basically, given the current rule f ∈ F IC and some alternative g ∈ F , the play of the direct mechanism f is augmented by a preliminary voting game in which each agent can vote for either f or g, and g replaces f only if g gets a unanimous support. f is called durable if, for every alternative g, there is a sequential equilibrium (Kreps and Wilson, 1982) of the augmented game in which g is never chosen. The following is a corollary of Theorem 2.

Corollary 3. Every ordinal weighted majority rule is durable.

Proof: It is proved in Holmström and Myerson (1983, Theorem 2) that any interim incentive efficient and strategy-proof SCF is durable. Strategy-proofness means that truth- telling is not just a Bayesian equilibrium of the direct revelation mechanism, it is also a dominant strategy for each agent. Formally,

0 ui(ti) · (f(ti, t−i) − f(ti, t−i)) ≥ 0 29 0 for all i, ti, ti and t−i. It is straightforward to check that any ordinal weighted majority rule is strategy-proof. By Theorem 2, any such SCF is interim incentive efficient, so the proof is complete.

A natural modification of durability in our framework would be interim self- stability, which replaces the unanimity rule in the preliminary voting stage by the cur- rent status-quo rule f. That is, if f is the weighted majority rule f = (w1, . . . , wn; q), P then g defeats f if and only if wi ≥ q, where the sum is taken over all agents who voted for g. This modification makes it easier for g to beat the incumbent rule f, so interim self-stability is a more demanding concept than durability. An analysis of this stability concept is beyond the scope of this paper and we leave it for future research.

1.8. Applications

1.8.1. Representatives of groups with heterogenous sizes. The design of voting rules in a representative democracy with heterogenous country sizes has been thoroughly analyzed in Barberà and Jackson (2006). The variance in country size leads very naturally to an asymmetric environment at the representatives level. There- fore, asymmetric voting rules are appealing from a theoretical point of view in such institutions, and there are several examples of such asymmetric rules being used (e.g., the Council of the European Union). Here, we explain the relation between the model and results in Barberà and Jackson (2006) and those of the current paper, and discuss some of the implications of our results for such environments. In Barberà and Jackson (2006), a representative of each country votes for one of two alternatives. Her vote is a function of the realized (cardinal) preference profile of the citizens in the country she represents. The voting rule aggregates the votes of 30 the representatives and chooses one of the alternatives (possibly at random). Agents in our model thus correspond to the representatives, and the type of an agent corre- sponds to a realization of a preference profile in that representative’s country. A main difference between the models is that Barberà and Jackson restrict the message space of agents in the mechanism to consist of only two messages – each agent can only announce the alternative she supports. They find the optimal (utilitarian) rule for this class of mechanisms. Our approach is to consider the direct mechanism in which agents simply report their types. The advantage of this latter approach is that, by the revelation principle, there is no mechanism that can implement a SCF that yields a higher welfare than in the optimal solution of the direct mechanism. Therefore, our Theorem 1 strengthen the conclusion of Theorem 1 in Barberà and Jackson (2006), by showing that the weighted majority rule they find is optimal not just in the class of mechanisms they consider, but universally among all possible mechanisms.13 Our Theorems 2 and 3 provide further insight regarding the class of rules that should be considered by a social planner in such environments. Theorem 2 implies that only weighted majority rules should be considered, while Theorem 3 gives some guidance as to what weights and quota to choose. For concreteness, let us consider the ‘common bias’ model analyzed in Barberà and Jackson (2006). This is the special

s r case in which w˜i = γw˜i for every i ∈ N, where γ > 0 is a parameter. The utili- tarian rule in this case is a weighted majority rule with weights and quota given by

 r r γ P r w˜1,..., w˜n; 1+γ i w˜i . It follows from Theorem 3 that ex-ante incentive efficiency

13For the analogy between the models to be accurate, we need to assume that the preferences of the representatives coincide with the aggregate preference in their countries. Barberà and Jackson are not concerned with incentive compatibility of their rule, and they allow representatives to vote arbitrarily. They do however impose that representatives vote in accordance with their country’s preference when they analyze the more specific ‘block model’. 31 γ is exactly characterized by weighted majority rules in which the quota is a 1+γ frac- tion of the total weight. Thus, while there may be disagreement about whether the utilitarian rule with its specific weights should be used, it seems quite reasonable to focus on rules in which the quota is as implied by Pareto efficiency considerations. Finally, assume that despite the size heterogeneity, only anonymous rules can be used. This is the practice in the General Assembly of the United Nations (UN), for example. Under the above ‘common bias’ assumption, it follows from Theorem 4 that the utilitarian rule is then a qualified majority rule with a quota anywhere between

( ) X r r γ X r k = min k : µ (Ti | k (t) = k)w ˜i ≥ w˜i i∈N 1 + γ i∈N and ( ) X r r γ X r k = max k : µ (Ti | k (t) = k)w ˜i ≤ w˜i . i∈N 1 + γ i∈N r Note that the probabilities µ (Ti ) that country i prefers the reform, who do not play any role when non-anonymous rules are considered, become important to determine

r the optimal quota. If we add the assumption that µ (Ti ) does not depend on i, r γ then µ (Ti | k (t) = k) = k/n and the optimal quota becomes 1+γ n. In the General Assembly of the UN, passing a resolution on some of the more significant issues requires a 2/3 majority, which is optimal (under our assumptions) if γ = 2. For other issues a simple majority is used, which is optimal for γ = 1.

1.8.2. Dichotomous populations. Consider, for the sake of concreteness, an academic department that needs to decide whether to give a job offer to a certain faculty candidate (alternative r) or not (alternative s). The current faculty of the department (the voters) has members that work in the same field as the candidate

(group G1) and members that work in other fields (group G2). Assume that members 32 of G1 have more to gain (or lose) from the decision than members of G2, in the sense

r s that there are two numbers h > l > 0 such that w˜i =w ˜i = h for i ∈ G1 and r s w˜i =w ˜i = l for i ∈ G2.

The utilitarian rule here is to assign a weight of h to each member of G1, a weight

of l to each member of G2, and set the quota to be half of the total weight, i.e.

|G1|h+|G2|l q = 2 . Note that if h and l are sufficiently close then the optimal rule is a simple majority rule, while if h is much larger than l then the candidate’s own field members make the decision by themselves (by a simple majority). Consider now the more realistic case where only anonymous rules can be used for

r the decision. As above, the probabilities µ (Ti ) become relevant for determining the optimal quota. Assume that these probabilities are symmetric within each group,

r r such that there are 0 < pl ≤ ph < 1 with µ (Ti ) = ph for i ∈ G1 and µ (Ti ) = pl for r i ∈ G2. For i ∈ G1 denote ph(k) = µ (Ti | k (t) = k) and for i ∈ G2 denote pl(k) = r µ (Ti | k (t) = k). Then by Theorem 4, the optimal quota is, roughly speaking, the

|G1|h+|G2|l number k for which |G1|hph(k) + |G2|lpl(k) is closest to 2 . Solving for the optimal quota explicitly is very complicated, but we can say some- thing in the two extreme cases when (1) ph is close to pl and (2) ph is close to 1 and pl is close to 0. In the first case we are back in the situation analyzed in the previous subsection with γ = 1, so the optimal anonymous rule is a simple majority. In the second case, we can approximate ph(k) and pl(k) as follows. If k ≤ |G1| then condi- tional on k (t) = k it is almost certain that the entire group of k supporters comes

k from G1. It follows that ph(k) ≈ and pl(k) ≈ 0. If k > |G1| then conditional on |G1|

k (t) = k it is almost certain that the group of k supporters is composed of the G1

k−|G1| group plus k − |G1| members of G2, so ph(k) ≈ 1 and pl(k) ≈ . The optimal |G2| quota then depends on which of the two groups has a larger total expected preference

33 intensity, i.e., which of the two numbers |G1|h or |G2|l is bigger: If |G1|h > |G2|l

|G1| |G2|l then the optimal quota is (roughly) q = 2 + 2h ≤ |G1|, so if the field members unanimously agree to hire the candidate then a job offer should be given, even if all members of |G2| object. If on the other hand |G1|h < |G2|l then the optimal quota is

|G2| |G1|h q = |G1| + 2 − 2l > |G1|, and at least some support from G2 is required to pass the threshold.

1.8.3. Veto players. There are 15 members of the UN Security Council, of which five are permanent members and ten non-permanent members that are elected to serve two-year terms. To approve any substantial proposal, the affirmative votes of at least nine of the 15 members are required. In addition, if any one of the five permanent members cast a negative vote (veto), then the proposal is not adopted. This voting rule is in fact a weighted majority rule:14 Denote by P the set of per- manent members and by NP the set of non-permanent members. Set wi = 100 for i ∈ P , wi = 2 for i ∈ NP , and q = 507. Thus, in order to pass, a proposal needs the support of the five permanent members and at least four of the non-permanent members. In what kind of environments this weighted majority rule is welfare maximizing? To apply Theorem 1, we need to represent this rule as in Lemma 1. This can be done

r s r s for example by defining wi = 1, wi = 99 for i ∈ P , and wi = 0.8, wi = 1.2 for r r s s i ∈ NP . By Theorem 1, in environments in which w˜i = wi and w˜i = wi for each i, r s where wi , wi have the above values, the voting rule used in the Security Council is optimal.

14Members of the Security Council can also cast an ‘abstain’ vote. We ignore here this possibility since it complicates the analysis. Note however that, by the revelation principle, any SCF that can be implemented by a mechanism that allows for abstention can also be implemented by the direct mechanism we consider. 34 Of course that this representation of the voting rule is not unique – there are infinitely many different representations. But in any representation it must be the case

r s that for the permanent members wi is much smaller than wi . Thus, for this voting rule to be optimal the environment must have the property that for any permanent member the expected utility gain from adopting a proposal conditional on supporting the proposal is much smaller than the expected utility loss from adopting a proposal conditional on objecting to it.

1.9. Discussion

1.9.1. More than two alternatives. In this paper we restricted attention to the case of a binary alternative set. With three or more alternatives, the analysis becomes significantly more complicated. With two alternatives, Lemma 2 tells us that we can essentially consider only ordinal SCFs. When there are three or more alternatives, Lemma 2 is no longer true: The utility possibility set generated by IC ordinal functions is a strict subset of that generated by all IC functions and, furthermore, the Pareto frontier of these two sets may be different. Below we give a simple example to illustrate this point, and we refer the reader to Kim (2012) for more general results on the possibility to use cardinal information to improve welfare when there are more than two alternatives.15

Let N = {1, 2}, T1 = T2 = {x, y, z, w}, and assume that the type of each agent i is uniformly (and independently) drawn from Ti. There are three alternatives A =

{a, b, c}. For i = 1, 2, utilities are given by ui(x) = (ui(x, a), ui(x, b), ui(x, c)) =

(5, 4, 1), ui(y) = (5, 2, 1), ui(z) = (−5, −4, −1) and ui(w) = (−5, −2, −1). Notice that types x, y and types z, w have the same ordinal preferences over alternatives.

15The example below shares some similarities with the problem analyzed in Börgers and Postl (2009). 35 Let f ∗ be the ex-post first-best SCF, i.e. the function that chooses an alternative that maximizes the sum of utilities at every type profile, and assume that in case of a tie the winning alternative is randomly chosen from the set of maximizers. It is tedious but straightforward to check that f ∗ is IC. Thus, f ∗ is the utilitarian solution and, in particular, it is both ex-ante and interim incentive efficient.

∗ However, we claim that there is no IC ordinal SCF f such that Ui(f|ti) = Ui(f |ti)

for every ti and every i. Indeed, if f is such a function then it must also be a maximizer of ex-ante social welfare. Thus, f must choose an alternative that maximizes social welfare at every type profile. However, in type profile (x, w) the unique maximizer is alternative b, while in type profile (y, z) the maximizers are alternatives a and c. If f is ordinal then it must choose the same alternative in these two type profiles, a contradiction.

1.9.2. Correlated types. The following example shows the importance of the type-independence assumption for our results.16 Consider a three agents environment

with T1 = T2 = T3 = {x, y, z, w}, and utilities given by ui(x) = (3, 0), ui(y) = (1, 0),

ui(z) = (0, 1) and ui(w) = (0, 3) for i = 1, 2, 3. The distribution of type profiles is

1 given by µ(x, z, z) = µ(z, x, z) = µ(z, z, x) = µ(w, y, y) = µ(y, w, y) = µ(y, y, w) = 6 . It will be convenient to define a metric d over the set of type profiles T by letting

0 0 d(t, t ) = #{1 ≤ i ≤ 3 : ti 6= ti}. Denote E1 = {(x, z, z), (z, x, z), (z, z, x)} and ∗ 0 E2 = {(w, y, y), (y, w, y), (y, y, w)}. Define a SCF f as follows. If t ∈ E1 or d(t, t ) = 1

0 ∗ 0 0 for some t ∈ E1 then f (t) = r. If t ∈ E2 or d(t, t ) = 1 for some t ∈ E2 then f ∗(t) = s. Otherwise, f ∗(t) is defined in an arbitrary way. Notice that f ∗ is well-

0 0 defined since d(t, t ) = 3 for every t ∈ E1 and t ∈ E2.

16For a different example demonstrating the effect of type correlation on the optimal SCF see Example 3 in Schmitz and Tröger (2011). 36 First, f ∗ is IC since no player can influence the outcome by reporting untruthfully (assuming other agents are truthful). Second, f ∗ coincides with the ex-post first-best

∗ SCF on E1 ∪ E2. Since µ(E1 ∪ E2) = 1 this implies that f is a maximizer of social welfare (and so incentive Pareto efficient). However, it is not hard to see that f ∗ is not a weighted majority rule. Indeed, assume by contradiction that (w1, w2, w3; q)

∗ ∗ ∗ represent f . Since f (x, z, z) = r we must have w1 ≥ q, but since f (y, y, w) = s we must have w1 + w2 ≤ q. Since all weights must be strictly positive, these two inequalities cannot hold simultaneously. The above example relies on the fact that most type profiles can never occur. It is thus quite fragile. But it is possible to modify this example to get a similar result with a distribution µ that has a full support. Indeed, fix two small numbers δ,  > 0.

1− Assume that the probability of each type profile in E1 ∪ E2 is 6 , and that the rest of the probability () is evenly distributed among the rest 58 type profiles. For t ∈ E1 define f(t) = (1 − δ, δ), and for t ∈ E2 let f(t) = (δ, 1 − δ). If t contains at least two agents of type z (but t∈ / E1) then f(t) = (1 − 2δ, 2δ), and if t contains at least two agents of type y (but t∈ / E2) then f(t) = (2δ, 1−2δ). If t contains a type x agent and a type z agent (but t∈ / E1) then f(t) = (1, 0), and if t contains a type y agent and a type w agent (but t∈ / E2) then f(t) = (0, 1). Finally, f is defined in an arbitrary way for other type profiles. It is not hard to check that if δ and  are sufficiently small then f yields higher expected welfare than any ordinal rule (in particular higher than any weighted majority rule). It is tedious but straightforward to check that f is IC when  is much smaller than δ.

1.9.3. Weighted majority rules with zero weights. In the definition of a weighted majority rule (Definition 3) we require that the weights of all players are strictly positive. It turns out that this is essential for the result to hold: If some of 37 the players have zero weights then the SCF may not be incentive interim efficient. The reason is that we may have more type profiles in which there is exactly a tie, and the tie-breaking rule may introduce inefficiencies. To see this, consider an environment with six agents, each of which is equally likely

to be of type tr or type ts. Utilities are given by ui(tr) = (1, 0) and ui(ts) = (0, 1), so tr prefers r while ts prefers s. Consider the weighted majority rule f given by the weights and quota (1, 1, 1, 1, 0, 0 ; 2). Thus, r is chosen if at least three agents out of the group {1, 2, 3, 4} prefer r, and s is chosen if at most one agent out of {1, 2, 3, 4} prefers r. To complete the description of the rule f we need to specify which alternative is chosen if exactly two agents out of {1, 2, 3, 4} prefer r. Let us introduce the following tie-breaking rule: If the coalitions that prefer r are {1, 2}, {1, 3} or {2, 4} then r is chosen, otherwise s is chosen. Notice that this rule ignores completely the preferences of agents 5 and 6. We claim that we can introduce a different tie-breaking rule such that the result- ing SCF g interim Pareto dominates f. Indeed, let the minimal winning coalitions in g (in case of a tie) be {1, 2}, {1, 3, 5}, {1, 3, 6}, {1, 4, 5, 6}, {2, 4, 5}, {2, 4, 6}, and

{2, 3, 5, 6}. The change from f to g occurs in four type profiles: (tr, ts, tr, ts, ts, ts),

(ts, tr, ts, tr, ts, ts), (tr, ts, ts, tr, tr, tr), and (ts, tr, tr, ts, tr, tr). In the first two the out- come is switched from r to s, while in the last two from s to r. Because of the symmetry of the problem it is easy to see that the interim utilities of agents 1,2,3,4 are not affected by this change. However, since the new tie-breaking rule is responsive to the preferences of agents 5 and 6, their interim utilities increase.

1.9.4. Implementation. Reporting truthfully is an equilibrium of the direct revelation mechanism associated with a weighted majority rule, but there are also 38 other equilibria and, furthermore, in some of these equilibria the chosen alternative may be different than under truthfulness. In fact, in many cases there will be no mechanism that implements a given weighted majority rule as the unique equilibrium. The problem is that in many cases a weighted majority rule will fail to satisfy Bayesian monotonicity which is a necessary condition for implementation. See the closely related Example 2 in Palfrey and Srivastava (1989) for details. Note however that if one strengthen the solution concept to require players to play a Bayesian equilibrium using strategies that are not weakly dominated then the problem of equilibrium multiplicity disappears (here we assume that the tie-breaking rule is ordinal). Furthermore, truth-telling is a weakly dominant strategy, which makes this equilibrium particularly plausible. See Jackson (1991) and Palfrey and Srivastava (1989) for more on implementation in environments with incomplete information.

39 CHAPTER 2

Ordinal versus Cardinal Voting Rules: A Mechanism Design

Approach

2.1. Introduction

For the problem of aggregation of individual preferences into a group decision, the vast majority of the social choice and strategic voting literature uses a purely ordinal model: agents can rank the candidates but intensity of preferences is discarded. Thus, ordinal voting rules which depend only on ranking information in agents’ preferences have been mainly studied. Furthermore, this literature usually restricts attention to strategy proof rules,1 i.e., rules in which truth-telling is best for each agent regardless of other agents’ reports. These two assumptions greatly restrict the types of aggregation procedures that can be explored. Thus, relaxing these assumptions may prove valuable. First, pref- erence intensity information may need to be considered for the problem. A simple example is when a slight majority of people weakly prefers an alternative and a slight minority strongly prefers another alternative. The more concrete example is a rep- resentative democracy. Representatives need to choose one of the alternatives. A representative knows how much his own district is willing to pay for the alternatives to be chosen, which may represent his district’s preference intensities over the alter- natives. He not only needs to consider the ordinal preference of his district, but also the preference intensity of his district. Thus if a voting rule that aggregates the votes

1These are sometimes called dominant incentive compatible rules in the mechanism design literature. 40 of the representatives is an ordinal rule, then it cannot sufficiently satisfy the wants of people in the whole society. Second, requiring strategy proofness may be too de- manding. It is well known from the Gibbard-Satterthwaite theorem (Gibbard (1973) and Satterthwaite(1975)) that any deterministic and strategy proof rule is dictatorial under mild assumptions. Our approach to the problem differs from the past literature in that, in our model, agents assign cardinal valuations to the alternatives, which express their relative de- sirability. Rather than requiring strategy proofness, we consider the weaker concept of incentive compatibility, which only requires agents to be truth-telling if other agents are also truth-telling. Our model is, therefore, closely resembles the classic mecha- nism design problem with the crucial difference that monetary transfers cannot be used to remedy incentive problems. To the best of my knowledge, only a few other papers investigate incentive compatible cardinal voting rules with preference intensity information.2 Our objective is to apply a Bayesian mechanism design approach, without mone- tary transfers, to examine the efficiency and incentive compatibility of voting rules. We consider a neutral environment, i.e., when alternatives are symmetric ex-ante, and show that essentially any ex-ante Pareto efficient ordinal rule is Incentive Compatible (IC). Furthermore, we prove that there exists an IC cardinal rule which outperforms any ordinal rule under a symmetry assumption across agents. Lastly, we consider the non-neutral environment introduced in Majumdar and Sen (2004), and we show that every IC ordinal rule is a peak-only rule on the domain of single-peaked preferences.

2See the related literature section. 41 Let us be more explicit regarding our environment. We consider a Bayesian envi- ronment where an agent’s preference over at least three alternatives is private infor- mation, and monetary transfers are infeasible. To capture the intensity of preferences, each agent has cardinal valuations for the alternatives. These valuations are random variables which, we assume, are drawn independently across agents, not necessarily between alternatives. We also assume that values are neutral between alternatives, which implies that the value distributions of each alternative is symmetric. A vot- ing rule f is a mapping from the reported value profiles to lotteries over the set of alternatives. Social welfare is measured in terms of expected utilities induced by the rule. We first consider (weak) ex-ante Pareto efficiency in the class of ordinal rules.3 A rule f is Ordinally Pareto Efficient (OPE) if f is ordinal, and there is no ordinal rule that strictly increases the expected utilities of all agents under f. We characterize OPE rules such that f is OPE if and only if it is a scoring rule with specific scores, which are the expected values of the ranked alternatives given ordinal preferences

(Proposition 1). A rule is called a scoring rule if there exists a score vector si for each agent i over the alternatives such that the alternative with the greatest sum of scores is chosen. Then, we show that for any OPE rule f, there exists an IC ordinal rule g that delivers the same expected utility for every agent as f (Theorem 1). This is somewhat surprising, because even in the class of ordinal rules, incentive compatibility has often proven problematic in the strategic voting and mechanism design literature. In other words, an agent often has an incentive to misrepresent her preference in order to prevent an undesirable outcome. Theorem 1, however, guarantees that there is no incentive compatibility problem in our set-up as long as

3There are two well-known notions of Pareto efficiency: weak Pareto efficiency and strong Pareto efficiency. Since we mainly consider the weak concept, we omit the term, “weak” from now. 42 the rule satisfies the appealing criterion of ex-ante Pareto efficiency. Note, this result implies that there is also no conflict between strong Pareto efficiency and incentive compatibility.4 Next, we ask whether it is possible to design rules that are superior to ordinal rules. We show that one can design an IC cardinal rule which achieves higher utili- tarian social welfare than any ordinal rule, under a symmetry assumption on agents (Theorem 2).5 If we consider cardinal rules, there exists a trade-off between Pareto efficiency and incentive compatibility. For example, Börgers and Postl (2009) show that the first best cardinal rule in their set-up is not IC.6 However, Theorem 2 states that it is possible to design an IC cardinal rule superior to any ordinal rule. With three alternatives, this rule is closely related to well-known voting rules: plurality, negative, Borda count, and rules. As in Myerson (2002), the general form of these simple voting rules is an (A, B)-scoring rule, which is similar to a scor- ing rule, but where there are two possible score vectors, (1, A, 0) or (1,B, 0) where (0 ≤ A ≤ B ≤ 1). Proposition 2 shows that our rule is an IC (A, B)-scoring rule. Because of the simple structure of (A, B)-scoring rules, this theoretical result could inform voting rules in a variety of institutional settings. Lastly, we divert our attention to the non-neutral environment introduced in Ma- jumdar and Sen (2004). On the domain of unrestricted preferences, they show that any incentive compatible ordinal rule is dictatorial.7 But, on a domain of restricted

4A strong ordinally Pareto efficient rule f can be defined as an ordinal rule such that there is no ordinal rule that weakly increases the expected utilities of all agents, at least one strictly under f. 5We focus on symmetric agents in the sense that the value distribution is identical across agents. One advantage of this assumption is that the social welfare can be normalized to the utility of one agent - no interpersonal utility comparisons are needed to compare rules. 6They consider two agent who have the opposite ordinal preferences and three alternatives. 7In Section 7, we slightly modify our set-up similarly to their set-up and show the equivalence between our incentive compatibility and their concept of incentive compatibility. Also the specific non-neutral environment will be defined in the section. 43 preferences, particularly single-peaked preferences, this result does not hold. Moulin (1980) shows that even a strategy proof ordinal rule is not necessarily dictatorial. We build on these result by deriving a new implication for incentive compatible or- dinal rules on the domain of single-peaked preferences. Theorem 3 states that any IC ordinal rule is a peak-only rule, i.e., the rule depends only on the peaks in agents’ preferences, not their full rankings. The paper is organized as follows. The next section reviews related literature. In Section 3, we introduce the environment and define ordinal rules and scoring rules. Section 4 shows a motivating example. Section 5 discusses the notion of ordinal Pareto efficiency and incentive compatibility of voting rules. In Section 6, we construct an IC cardinal rule superior to any ordinal rule, and show the connection between our rule and the (A, B)-scoring rules. Section 7 discusses implications of incentive compatibility in the domain of single-peaked preferences. The Appendix contains the more technical portions of our proofs.

2.2. Related literature

There is an extensive literature which examines voting (decision) rules subject to incentive constraints. The seminal Gibbard-Satterthwaite theorem (Gibbard (1973) and Satterthwaite(1975)) investigates strategy proof voting rules, showing that under mild assumptions, a deterministic rule that is strategy proof is dictatorial. Much of the subsequent literature (Gibbard (1977), Moulin (1980), Freixas (1984), etc) has either confirmed the robustness of the theorem, or established a positive result by relaxing its assumptions (e.g., allowing randomized rules, allowing cardinal rules, or restricting the domain of preferences). Unfortunately, using preference intensity information does not aid in designing strategy proof voting rules because strategy 44 proofness is a too demanding constraint. Because of these negative results, we con- sider (Bayesian) incentive compatibility which is weaker than strategy proofness, but still requires agents to be truthful. Some recent papers adopted a Bayesian mechanism design approach. This ap- proach’s usefulness results from the availability of information about an agent’s value distribution. Jackson and Sonnenschein (2007) show that incentive constraints can be overcome by linking decision problems and identifying the cumulative value reports of agents with the value distribution. Barberà and Jackson (2006) derive the optimal weights assigned to representatives based on their value distributions in a model of indirect democracy. We also adopt this approach. Apesteguia et al. (2011) derive the form of the utilitarian, maximin, and maximax ordinal rules.8 These rules are the optimal ordinal rules based on different social welfare functions, such as the utilitarian, maximin, and maximax welfare functions. They do not consider incentive compatibility, assuming that agents reveal their true preferences. A natural question, then, is whether their rules are IC. Our Proposition 1 addresses the issue of incentive compatibility of the set of efficient rules, which includes all of their rules. The paper most related to our work is Majumdar and Sen (2004). They ana- lyze the implications of weakening incentive constraints from strategy proofness to Ordinal Bayesian Incentive Compatibility (OBIC). They show that under uniform priors a wide class of rules satisfies OBIC, which is related to our Theorem 1. But, unlike their work, we address the relationship between incentive compatibility and Pareto efficiency. We, also, allow for randomized rules, and our Theorem 1 does not require their assumptions of neutrality or elementary monotonicity. Furthermore, we

8We discuss these rules in Section 5. 45 use cardinal value information and characterize the ordinally Pareto efficient rules (Proposition 1). Our Theorem 3 is related to another result in Majumdar and Sen (2004) and Ger- shkov et al (2013). Majumdar and Sen (2004) prove that any ordinary Bayesian in- centive compatible rule is dictatorial in their non-neutral environment on the domain of unrestricted preferences. Theorem 3 is a new implication of incentive compatibil- ity on the domain of single-peaked preferences. On this popular domain, Gershkov et al. (2013) show that every dominant incentive compatible cardinal rule has an equivalent peak-only rule. Even though we do not cover cardinal rules on the domain of single-peaked preferences, Theorem 3 can be an important intermediate step for further research regarding IC cardinal rules on this domain. This paper is also related to voluminous literature on scoring rules. Myerson (2002) introduced (A, B)-scoring rules in a model of three-candidate elections. Giles and Postl (2012) study symmetric Bayes Nash equilibria under (A, B)-scoring rules and the welfare implication of the rules. They show that a good (A, B)-scoring rule increases welfare relative to more common voting rules such as the plurality, negative, Borda count, and approval voting rules. Their result is closely related to Theorem 2, but they use numerical methods for the welfare comparison and the indirect mecha- nism approach, while we use an analytical method and a direct revelation mechanism approach. There are several papers that investigate voting rules with two alternatives. Azrieli and Kim (2013) show that there is no incentive compatible rule which outperforms optimal ordinal rules, and they characterize the set of interim incentive efficient rules as the set of weighted majority rules. Schmitz and Tröger (2012) prove that the optimal rule among all anonymous and neutral rules in both symmetric and neutral

46 environments is majority rule, which is indeed ordinal. We show that the considera- tion of at least 3 alternatives leads to very different results regarding efficiency and incentive compatibility. Lastly, many papers do not allow monetary transfers, despite their application of the mechanism design approach. For example, Gershkov et al. (2013) deal with a voting mechanism, Miralles (2012) examines an allocation mechanism, and Carrasco and Fuchs (2009)9 study a communication protocol. The main reason for the exclusion of monetary transfers is that, in many cases monetary transfers are infeasible or excluded for ethical reasons (e.g., child placement in public schools, organ transplants, collusion in markets, etc). For the same reason, we also exclude monetary transfers.

2.3. The Model

2.3.1. Environment. Consider a standard Bayesian environment with private values. The set of agents is N = {1, 2, ..., n}, and the set of alternatives is L with | L |= m.10 The typical elements of L will be denoted by l, l0, l00, etc. Each agent i ∈ N has a von-Neumann Morgenstein value (utility) function vi : L → R. We denote by  l m vi = v ∈ the vector of values that agent i assigns to the alternatives. We i l∈L R l l0 assume that no agent is ever indifferent between alternatives, that is vi 6= vi for each 0 11 m 0 agent i ∈ N and for any alternatives l, l ∈ L. Let Vi ⊆ R denote agent i s value space, and V = V1 × · · · × Vn be the set of value profiles. The set of orderings of alternatives is T ORD, and let t denote an element of T ORD. We define an ordinal type

ORD l l0 l00 0 00 function ti : Vi → T such that if vi > vi > ··· > vi , then ti (vi) = ll ··· l

9Their procedure to derive the optimal communication protocol has the similar flavor with our Theorem 2 in the sense that each type is divided into two types. But, unlike our paper, they cover two players, infinite action spaces and, more importantly, develop a dynamic mechanism. 10For every finite set X, | X | denotes the number of elements in X . 11This assumption is not essential for results but simplifies definitions and notation. 47 , which shows the ordinal preferences of agent i. We allow any possible ordinal

ORD preferences over the alternatives, so that| {t : ti (vi) = t} = T |= m!. As usual, a subscript −i on a vector means that the ith coordinate is excluded. To capture the intensity of preferences, it will be useful to normalize as follows .

For a vector x ∈ Rm and an integer k ∈ {1, .., m}, we denote by x[k] the k th-highest [1] [2] [m] value among the coordinates of x. Now, let 1 = vi > vi > ... > vi = 0 for every i ∈ N. 12The value of the first ranked alternative is normalized to 1 and the value of the lowest to 0, which can be interpreted as the base of the preference intensities. And the preference intensities of the other intermediate alternatives are measured between 0 and 1. The information structure is the following. We assume that the vector of val- ues of agent i is a random variable vˆi with values in Vi. The assumption of strict

0 preference and the normalization establish agent i s value space such that Vi =

m n l l0 0 o [0, 1] vˆi :v ˆi 6=v ˆi for l, l ∈ L . Since the values of ranked alternatives are im- [k] ORD portant, we consider the conditional distribution of vi given type t ∈ T which

k,t t [k] is denoted by µi . Let µi be the joint distribution of vˆi for k = {1, ..., m} given type t ∈ T ORD. Another reason why we consider this distribution is that we focus on a neutral environment where the type distribution and the joint distribution are symmetric across ordinal types, that is for any t, t0 ∈ T ORD,

0 t t0 P r [ti (ˆvi) = t] = P r [ti (ˆvi) = t ] and µi=µi

12Although our results except Proposition 2 hold without this normalization, we use this normal- ization because it simplifies many arguments and provides Proposition 2. The similar normalization appears in Börgers and Postl (2009). But, it is arguable that this normalization can capture the preference intensity and can validate the interpersonal comparisons of utility. In fact, the issue of cardinal utility and interpersonal comparisons of utility has been controversial in the literature (Strotz (1953), Harsanyi (1955), Roberts (1980), etc). It is beyond the scope of our paper. 48 t 1 ORD As a consequence, µi = µi and P r [ti (ˆvi) = t] = m! for every t ∈ T and for each agent i ∈ N. Note that from the ex-ante perspective, there is no special

l alternative, i.e., the distribution of vˆi is the same for any l ∈ L. We also assume that values are independent across agents, so the distribution of    [k] vˆi is the product distribution µ = µ1 × · · · × µn. The above features k={1,...,m} i∈N of the environment are common knowledge among agents. But, the realized value vi is assumed to be only observed by agent i. A voting rule is a measurable mapping f : V → 4(L),13 which means that we allow randomized rules. Let F be the set of all rules. For every agent i and rule f,

14 Ui(f) = E (ˆvi · f (ˆv)) denotes the expected utility of agent i under the rule f. To illustrate the features of our environment, we present the following example.15

Example 1. Consider three alternatives a, b, and c, implying that L = {a, b, c},

a b c ORD vˆi = (ˆvi , vˆi , vˆi ), and T = {abc, acb, bac, bca, cab, cba}. Suppose that ti (vi) = abc, a c b then the normalization implies that vˆi = 1, vˆi = 0 and 0 < vˆi < 1. In a neutral 1 1 1 environment, P r [ti (ˆvi) = t] = 6 . For example, if f (v) = ( 2 , 2 , 0), it means that at 1 value profile v, alternatives a and b are each chosen with probability 2 and c is never chosen. In a neutral environment, it is convenient to define a permutation function and a neutral rule. Let σ : L → L be a permutation of L, and let φ be the set of all permu-    l m  l m×n tations with respect to L. Then, for x = x ∈ R and y = yi ∈ R l∈L l∈L i∈N   σ  σ(l) m σ  σ(l) , we denote by x the vector x ∈ R and by y the vector yi ∈ l∈L l∈L i∈N Rm×n.

13For every finite set X , 4(X) denotes the set of probability measures on X. 14x · y denotes the inner product of the vector x and y. 15The simplest case with three alternatives will be frequently used to help readers to understand the structure and intuitions of the model. 49 Definition 1. A rule f is neutral if for any profile v ∈ V and any permutation σ ∈ φ, we have f (vσ) = f (v)σ. A neutral rule treats all alternatives symmetrically. That is, the “labels” of the alternatives do not affect the rule. Finally, we use the standard definition of incentive compatibility from the mechanism design literature:

Definition 2. A rule f is Incentive Compatible (IC) if truth-telling is a Bayesian equilibrium of the direct revelation mechanism associated with f. In our environment,

0 this means that for all i ∈ N and all vi, vi ∈ Vi,

0 vi · (E (f(vi, vˆ−i)) − E (f(vi, vˆ−i))) ≥ 0.

2.3.2. Ordinal Rules and Scoring Rules. An ordinal rule acts only on the ordinal preference information in the reported value profile. To define an ordinal rule,

ORD ORD let Pi be the ordinal partition of Vi into | T | sets. That is, for t ∈ T ORD

t Vi = {vi ∈ Vi | ti (vi) = t}

ORD ORD ORD The partition Pi reflects the ordinal types of agent i. Let P = P1 × ORD 0 · · · × Pn be the corresponding product partition of V : v and v are in the same ORD 0 ORD element of P if and only if vi and vi are in the same element of Pi for every agent i ∈ N. As usual, let P ORD (v) be the element of P ORD that contains the value profile v.

Definition 3. A rule f is ordinal if it is P ORD-measurable, i.e., if f (v) = f (v0) whenever P ORD (v) = P ORD (v0). The set of all ordinal rules is denoted by F ORD. Thus, an ordinal rule depends only on the ranking information in the reported value profile, and is unaffected by changes in the expressed intensity of preferences. 50 We define a useful function related to the ranking of alternatives, for i ∈ N, let

rl : Vi → {1, ..., m} denote the ranking of alternative l in vi.

Next, we consider an important class of ordinal rules: scoring rules. Let si ≡

1 m m 1 m (si , ..., si ) be a score vector in R where si ≥ ... ≥ si for i ∈ N.

m Definition 4. A rule f is a scoring rule if there exists a score vector si ∈ R for i ∈ N such that

X rl(vi) Supp (f(v)) ⊆ argmax si , l∈L i∈N n o where Supp (f (v)) = l ∈ L : f (v)l > 0 .

rl(vi) 0 Note that si can be seen as agent i s score assigned to alternative l ∈ L when

he announces vi. Additionally, it depends only on the ranking of the alternative, not

the ordinal type ti (vi). We allow asymmetric score vectors across agents. If si is identical for all agents then we get a symmetric scoring rule, which is standard in the voting literature.

2.4. A Motivating Example

The objective of this section is to give a simple example with two agents and three alternatives, which helps to understand our motivation, main intuition and notation.

[2] Suppose that N = {1, 2}, L = {a, b, c} and vi is uniformly distributed in (0, 1) for i = 1, 2. We begin with a non-neutral environment where alternatives b and c are

popular: For i ∈ N, P r [ti (ˆvi) = t] =  for t = abc, acb, bac, cab and P r [ti (ˆvi) = t] =

1−4 2 for t = bca, cba. We look at rules widely used in practice, the plurality and Borda 16 count voting rules. Denote the rule by fP and denote the Borda

16The plurality voting rule allows each agent to vote for one alternative and chooses the alternative with the most votes. The Borda count voting rule allows each agent to give the alternatives a certain 51 count voting rule by fB. In our setting, fP is a scoring rule with si,P = (1, 0, 0) and

 1  fB with si,B = 1, 2 , 0 for i = 1, 2. Ties are broken by the uniform distribution over the set of alternatives with the greatest sum of scores. The following table shows the comparison of the two rules and the usual trade off between efficiency and incentive compatibility even in the class of ordinal rules.

abc bac v1 ∈ V1 v1 ∈ V1 fP (v1, v2) fB (v1, v2) fP (v1, v2) fB (v1, v2) abc  1 1   1 1  v2 ∈ V2 (1, 0, 0) (1, 0, 0) 2 , 2 , 0 2 , 2 , 0 acb  1 1  v2 ∈ V2 (1, 0, 0) (1, 0, 0) 2 , 2 , 0 (1, 0, 0) bac  1 1   1 1  v2 ∈ V2 2 , 2 , 0 2 , 2 , 0 (0, 1, 0) (0, 1, 0) bca  1 1  v2 ∈ V2 2 , 2 , 0 (0, 1, 0) (0, 1, 0) (0, 1, 0) cab  1 1   1 1   1 1 1  v2 ∈ V2 2 , 0, 2 (1, 0, 0) 0, 2 , 2 3 , 3 , 3 cba  1 1   1 1 1   1 1  v2 ∈ V2 2 , 0, 2 3 , 3 , 3 0, 2 , 2 (0, 1, 0) abc bac Table 1. Plurality and Borda count voting rules when v1 ∈ V1 and v1 ∈ V1

In the array, agent 1’s value announcements appear along the columns and agent

2’s along the rows. Inside, fP (v1, v2) and fB (v1, v2) are laid out according to agents’ announcements.

The voting rule fP is IC because an agent’s misrepresentations cannot increase the probability that the top alternative is chosen and cannot decrease the probability that the bottom one is chosen. We can improve fP in terms of the total expected utilities by using fB, i.e., U1 (fP ) + U2 (fP ) < U1 (fB) + U2 (fB). In fact, we will show in the next section that fB is the optimal ordinal rule based upon maximizing the total expected utilities. The result is due to the score vector of the Borda count number of points corresponding to their ranking of the alternatives and determines the alternative with the most points. 52 voting rule reflecting the voters’ expected utilities given their ordinal types, i.e., si,B =  [2]  E 1, vˆi , 0 | ti (vi) for i = 1, 2. However, the problem is that fB is not IC because an agent’s misrepresentation can decrease the probability that the bottom alternative

abc cba is selected (see the case when v1 ∈ V1 and v2 ∈ V2 ). If v1 = (1, 1 − ε, 0), then sufficiently small ε gives agent 1 an incentive to lie that his type is bac. Thus, we see the usual trade off between efficiency and incentive compatibility in the class of ordinal rules. The first question is in what kind of environment this trade off does not exist. We observe, from this example, that there is an incentive to lie when some alternatives are popular ex-ante. So, we consider a neutral environment in which no alternative is special from an ex-ante perspective. Only one condition is changed such that

1 ORD P r [ti (ˆvi) = t] = 6 for t ∈ T and i = 1, 2. Then, before agent 1’s report, he expects that the probability of each alternative getting the score 1 is the same if agent 2 reports truthfully because our environment is neutral. Agent 1’s truthful report, thus, is better than any other report. Since the same argument can be applied to

agent 2, the rule fB is IC.

The second question is whether we can design an IC rule superior to fB. We will show the exact construction of an IC rule superior to the Borda Count rule, using preference intensity information. To use preference intensity information, we consider a rule based on a finer partition than the ordinal partition. The partition divides each

t tH tL ordinal type set Vi into the two sets Vi and Vi using a threshold β ∈ (0, 1):

tH t [2] tL t [2] Vi (β) = {vi ∈ V | vi ≥ β} and Vi (β) = {vi ∈ V | vi < β}.

H type and L type are divided according to the value of the second-ranked alter- native. For each β ∈ (0, 1), we find the rule fβ which maximizes the sum of expected utilities among this partition-measurable rules. In fact, fβ resembles a 53 H   [2] tH   scoring rule but with the two score vectors, s (β) = 1, E vˆi | vi ∈ V , 0 and L   [2] tH   s (β) = 1, E vˆi | vi ∈ V , 0 . Since we use the ranking information as well as the preference intensity information with this partition, each fβ is more efficient than any ordinal rules. But, we need to find a specific β for incentive compatibility.

∗ ∗ 1 Given the environment of this example, we find the rule f = f ∗ with β = √ , which β 2  √    implies that sH = 1, 1+√ 2 , 0 and sL = 1, √1 , 0 . The following table shows f 2 2 2 2 B and f ∗ when agent 1 announces an H type and an L type.

abc abcH 0 abcL v1 ∈ V1 v1 ∈ V1 v1 ∈ V1

∗ ∗ 0 fB (v1, v2) f (v1, v2) f (v1, v2)

abcH v2 ∈ V2 (1, 0, 0) (1, 0, 0) (1, 0, 0)

abcL v2 ∈ V2 (1, 0, 0) (1, 0, 0) (1, 0, 0)

acbH v2 ∈ V2 (1, 0, 0) (1, 0, 0) (1, 0, 0)

acbL v2 ∈ V2 (1, 0, 0) (1, 0, 0) (1, 0, 0)

bacH 1 1 1 1 v2 ∈ V2 ( 2 , 2 , 0) ( 2 , 2 , 0) (1, 0, 0) bacL 1 1 1 1 v2 ∈ V2 ( 2 , 2 , 0) (0, 1, 0) ( 2 , 2 , 0) bcaH v2 ∈ V2 (0, 1, 0) (0, 1, 0) (0, 1, 0)

bcaL v2 ∈ V2 (0, 1, 0) (0, 1, 0) (0, 1, 0)

cabH v2 ∈ V2 (1, 0, 0) (1, 0, 0) (1, 0, 0)

cabL v2 ∈ V2 (1, 0, 0) (1, 0, 0) (1, 0, 0)

cbaH 1 1 1 v2 ∈ V2 ( 3 , 3 , 3 ) (0, 1, 0) (0, 1, 0) cbaL 1 1 1 1 1 v2 ∈ V2 ( 3 , 3 , 3 ) (0, 1, 0) ( 2 , 0, 2 )

∗ abcH 0 abcL Table 2. fB and f when v1 ∈ V1 and v1 ∈ V1

54 bac cba For the part of efficiency over fB, note the case when v2 ∈ V2 or v2 ∈ V2 , which corresponds to the case when the top two alternatives are shared or the case when the two agents have the opposite ordinal preferences. While f ∗ can distinguish these cases based on preference intensity information, the ordinal rule fB cannot. We

∗ ∗ 117 114 verify the efficiency by U1 (f ) + U2 (f ) = 72 >U1 (fB) + U2 (fB) = 72 . For the part of incentive compatibility, note that an H type announcement of an agent gives the alternatives sH , which provides him a relatively high proba- bility of getting middle alternative and an L type announcement has the oppo- site effect.17 Thus, the most of H and L type agents like to report their true types. But, for incentive compatibility an agent with any second-ranked alterna- tive’s value should be truth-telling. We look at the agent who has the second- ranked alternative’s value as β, called the cut-off agent. We find that under the

∗ rule f = f √1 this agent is indifferent between an H type and an L type announce- 2   ment, i.e., 1, √1 , 0 · ( (f ∗(v , vˆ )) − (f ∗(v0 , vˆ ))) = 0. Then, all H and L type 2 E i −i E i −i agents are willing to be truthful.18

2.5. Ordinal Pareto Efficiency and Incentive compatibility

In this section, we restrict our attention to ordinal rules for three reasons. First, it is consistent with previous work in this area. Second, ordinal rules are widely used in practice. Third and most importantly, deriving results within the class of ordinal rules will serve as a useful benchmark as we explore the relationship between efficiency and incentive compatibility.

17Precisely, an H type announcement gives him a low probability of getting the top alternative (a loss) but gives him a low probability of getting the bottom alternative (a gain). 18The details of the arguments are in Section 6. 55 Our primary objective for this section is to show that there is no conflict between ex-ante Pareto efficiency and incentive compatibility in the class of ordinal rules. We consider ex-ante Pareto efficiency as our notion of efficiency because, we believe, it is the minimal normative criterion that a social planner should take into account. For any Pareto dominated rule, agents would prefer a Pareto efficient rule. Let us define Pareto efficient rules in the class of ordinal rules.

Definition 5. A rule f ∈ F ORD is Ordinally Pareto Efficient (OPE) if there is

ORD no rule g ∈ F such that Ui(g) > Ui(f) for every agent i ∈ N. The following proposition characterizes OPE rules and shows the relationship between OPE rules and scoring rules.

Proposition 1. A rule f ∈ F ORD is OPE if and only if there are numbers

{λi}i∈N , where λi ≥ 0 and λ = (λ1, ..., λn) 6= 0 such that f is a scoring rule with    [k] ORD 19 si = λi vˆ | P for i ∈ N. E i k={1,...,m} i Proof. We start with the convexity of the utility possibility set of F ORD. The

n set F ORD, viewed as a subset of linear space Rm×(m!) , is convex because we allow randomized rules. In addition, the mapping from rules to the utility vectors is affine:

ORD Ui (αf + (1 − α) g) = αUi (f) + (1 − α) Ui (g) for any f, g ∈ F and any α ∈ [0, 1].

n  ORDo Hence, Ui (f)i∈N : f ∈ F is convex. For convex utility possibility sets, it is well known that Pareto efficiency can equivalently be represented by the maximization of linear combinations of utilities with {λi}i∈N , λi ≥ 0 and λ = (λ1, ..., λn) 6= 0 [Holmström and Myerson (1983)].20 Thus, a rule f is OPE if and only if there are numbers {λi}i∈N , where λi ≥ 0 and λ = (λ1, ..., λn) 6= 0 such that f maximizes the   19  [k] ORD  [k] E vˆi | Pi is the conditional expectation of vˆi given the ordinal k={1,...,m} k={1,...,m}      [k] ORD  [k] partition of agent i, i.e., E vˆi | Pi (vi)=E vˆi | ti (vi) k={1,...,m} k={1,...,m} 20For the detail of the argument, see page 560 in Mas-colell et al. (1995) . 56 P ORD social welfare i∈N λiUi(g) among all functions g ∈ F . Fix {λi}i∈N and consider P the generalized utilitarian social welfare, Wλ(g) = i∈N λiUi (g). It is the weighted sum of the expected utilities of all the agents. Given λ and P ORD, then for every g ∈ F ORD we have

! " !#

X X ORD Wλ (g) = E λivˆi · g (ˆv) = E E λivˆi · g (ˆv) P i∈N i∈N " !#

X ORD = E g (ˆv) · λiE vˆi P i∈N " # X  ORD = E g (ˆv) · λiE vˆi| Pi , i∈N

where the third equality follows from the fact that g ∈ F ORD and the fourth from our assumption of independent values across agents. Thus, a rule g is a maximizer of

ORD Wλ in F if and only if it satisfies

    l ORD [rl] ORD Supp (g(v)) ⊆ argmax λi vˆi Pi (vi) = argmax λi vˆi Pi (vi) . l∈L E l∈L E

    l ORD [rl] ORD In a neutral environment, E vˆi Pi (vi) is equal to E vˆi Pi (vi) , and depends only on the ranking of alternative l. It does not depend on the ordinal   t ORD [rl] ORD type ti (vi) because µi = µi for any t ∈ T . Hence, λiE vˆi Pi (vi) can

function as the score assigned to alternative l when agent i announces vi, meaning   rl(vi) [rl] ORD si = λiE vˆi Pi (vi) . Therefore, from the definition of a scoring rule the    [k] ORD 21 maximizer f is a scoring rule with si = λi vˆ | P for i ∈ N. E i k={1,...,m} i 

Proposition 1 shows that OPE rules are scoring rules with a particular structure

of scores. Notice that the optimal score si depends on the welfare weight λi and

21Some parts of this proof are similarly found in the proof of Theorem 1 and 2 in Azrieli and Kim (2013). 57 the value distribution conditional on the ordinal partition µi. If λi = 1 for every agent i, then Wλ(g) becomes the utilitarian social welfare. The following corollary characterizes the optimal ordinal rules based on the utilitarian social welfare.

Corollary 1. A rule f ∈ F ORD is an Ordinal Utilitarian Rule if and only if f    [k] ORD is a scoring rule with si = vˆ | P for i ∈ N. E i k={1,...,m} i

The Ordinal Utilitarian Rule is denoted by fOUR. The following example is helpful for understanding the rule and the relationship between the scores and agents’ value announcements.

[2] Example 2. Consider L = {a, b, c}, N = {1, 2}, and fOUR. Suppose that vˆ1 [2] has a uniform distribution on [0, 1], and vˆ2 has a normal distribution on [0, 1] with

 [2] ORD 2  1  E vˆi | Pi = 3 . From the definition of fOUR, we find that s1 = 1, 2 , 0  2  abc bca and s2 = 1, 3 , 0 . If agents 1 and 2 announce v1 ∈ V1 and v2 ∈ V2 , then

 ra(v1) rb(v1) rc(v1)  1   ra(v2) rb(v2) rc(v2)  2  s1 , s1 , s1 = 1, 2 , 0 and s2 , s2 , s2 = 0, 1, 3 . This implies

 ra(v1) rb(v1) rc(v1)  ra(v2) rb(v2) rc(v2) that fOUR (v1, v2) = (0, 1, 0) because s1 , s1 , s1 + s2 , s2 , s2 = 3 2 (1, 2 , 3 ).

This example also shows that in an environment with asymmetric value distribu- tions across agents, the ordinal utilitarian rule may not be a symmetric scoring rule

even with equal welfare weights (i.e., λi = 1 for all i). We now discuss the main result of this section, which pertains to the relationship between Pareto efficiency and incentive compatibility in the class of ordinal rules.

Theorem 1. If a rule f is OPE, then there exists an IC g ∈ F ORD such that

Ui (g) = Ui (f) for every agent i. 58 The following lemma is useful for the proof of Theorem 1 and shows that for any rule in a neutral environment, we can construct a neutral rule that delivers the same expected utility for each agent.

ORD 1 P σ σ−1 ORD Lemma 1. For any rule f ∈ F , a rule g (v) = m! f (v ) ∈ F is σ∈φ neutral. Moreover, Ui (g) = Ui (f) for every agent i. The proof of Lemma 1 is in the Appendix. The neutral rule is constructed by

1 assigning m! (the probability of an ordinal type) to every inversely permutated original rule where the value profile is permutated.22

Proof of Theorem 1: Assume a rule f is OPE, by Proposition 1 f is a scoring rule. The rule f may not be neutral because of a possible non-neutral tie breaking rule. By Lemma 1, however, we can construct a neutral rule g from f. Fix t ∈ T . It is sufficient to consider

t only vi ∈ Vi instead of all ordinal types because of the neutrality of the rule and  r v0  σ t 0 l( i)  rl(vi) environment. Pick vi ∈ Vi and vi ∈ Vi, note that si = si for some l∈L l∈L σ. To check incentive compatibility of g, we look at probabilities that alternatives are

0 chosen given i’s announcement under g (i.e., E (g(vi, vˆ−i)) and E (g(vi, vˆ−i))). First,

P  rl(vj ) consider the aggregated scores of the other agents, sj . The probabilities j6=i l∈L 1 that each alternative obtains the maximum aggregated scores are the same as m   because the environment is neutral. Second, adding srl(vi) to the previous scores i l∈L in every event generates E (g(vi, vˆ−i)) such that the probability that the first-ranked alternative is chosen becomes the highest and the probability that the second-ranked alternative is chosen the second-highest and so on. Also, the order of the probabilities

22This lemma is similar to Lemma 3 in Schmitz and Tröger (2012). But it shows that this con- struction that preserves several properties from the original rule is still possible with more than 2 alternatives. 59 0 σ 23 resembles the order of the value announcement, i.e., E (g(vi, vˆ−i)) = E (g(vi, vˆ−i)) .

Thus, the probability distribution E (g(vi, vˆ−i)) first order stochastically dominates σ 0 E (g(vi, vˆ−i)) for any σ ∈ φ, so that vi · E (g(vi, vˆ−i)) ≥ vi · E (g(vi, vˆ−i)) . We can conclude that g is IC.

Theorem 1 implies that there is non-existent the usual trade off between ex- ante Pareto efficiency and incentive compatibility within the class of ordinal rules, at least in a neutral environment. It also has an implication for the voting rules based upon maximizing different welfare functions. Apesteguia et al. (2011) describe the optimal ordinal rules based on the utilitarian, maximax, and maximin welfare functions. They use similar but stronger assumptions than the current paper.24 The maximin welfare function evaluates an alternative in terms of the expected utility of the worst-off agent, disregarding the other agents’ expected utilities. In contrast to the maximin welfare function, the maximax welfare function concentrates on the most well-off agent. Obviously, all of their rules are ordinally Pareto efficient. Since their focus is not on the incentive compatibility, they assume that voters announce their preferences truthfully. However, Theorem 1 shows that all of their rules are IC without the assumption. Thus, we can use their rules without any concern for incentive compatibility. In addition, we can find an important feature of the set of OPE rules in Proposition 1. The set of OPE rules is in the class of scoring rules that allows asymmetric scores (even zero score vectors) across agents. Apesteguia et al. (2011) show that an optimal

23In Example 2 with the standard tie breaking rule such that ties are broken by uniform distribution 2 1  over the set of maximizers, we can calculate that E (f(v1, vˆ2)) = 3 , 3 , 0 . If agent 1 announces 0 bca 0 2 1  v1 ∈ V1 , then E (f(v1, vˆ2)) = 0, 3 , 3 . Note the value and order of coordinates in E (f(v1, vˆ2)) 0 and E (f(v1, vˆ2)). 24Specifically, they assume identical value distributions across agents. 60 rule based on a maximax or maximin welfare function in their environment may not be a symmetric scoring rule, but the approximation of the rule is a symmetric scoring rule. Proposition 1 complements their analysis because the optimal rule can be a scoring rule with si = 0 for some i ∈ N.

2.6. A superior incentive compatible cardinal rule

In this section, we move beyond ordinal rules by utilizing preference intensity information. It is straightforward to design a cardinal rule superior to any ordinal rule, 25 but the non-trivial question is whether or not we can design an incentive compatible cardinal rule superior to any ordinal rule. We show that we can under the additional symmetry assumption on agents.

2.6.1. A Superior IC Cardinal Rule. The following theorem shows the main result of this section.

Theorem 2. Assume n ≥ 5 and that µi is an identical distribution for all i ∈ N, then there exists an IC cardinal rule that achieves higher utilitarian social welfare than any ordinal rule. Proof of Theorem 2: The proof consists of six steps. Step 1 shows Lemma 2 which derives a utilitarian rule based on a general partition P . In Step 2, we introduce a special family of n o partitions P β and a utilitarian rule based on P β, called a P β-Utilitarian Rule. β∈(0,1) Step 3 considers three alternatives and shows a necessary and sufficient condition for incentive compatibility of a P β-Utilitarian rule. In Step 4, we prove the existence of a rule f ∗ satisfying the condition. Step 5 proves that f ∗ achieves a higher utilitarian

25For example, the first best cardinal rule is a rule where a score assigned to an alternative is the realized value of the alternative and the alternative with the greatest sum of scores is chosen. 61 social welfare than any ordinal rule. Finally in step 6, we modify the P β-Utilitarian rule and show the extension with more than three alternatives.

Step 1) Consider any finite measurable partition Pi that divides Vi. Let P =

P (P1 × · · · × Pn) be the corresponding partition product of V . Let F denote the set of P -measurable rules. Given a partition P , we say that a rule f ∈ F P is a

P -Utilitarian Rule if f ∈ argmax Wλ (g) where λi = 1 for every i ∈ N. The following g∈F P lemma generalizes Proposition 1 from the previous section with a general partition, but with the restriction of λ.26

Lemma 2. A rule f ∈ F P is a P -Utilitarian Rule if and only if it satisfies

r X l(vi) Supp (f(v)) ⊆ argmax si l∈L i∈N where

rl(vi)  l  si = E vˆi | P for i ∈ N and l ∈ L

The proof of this lemma is omitted because it is nearly identical to the proof of Proposition 1. Notice that the scores assigned to the alternatives are the expected values conditional on the general partition P , not the ordinal partition P ORD.

Step 2) We consider a special family of partitions which provides the preference

β intensity information as well as the ranking information. Let Pi be a partition which

ORD divides each ordinal type set in Pi into two sets. It follows that the new type set n H L ORDo ORD t consists of 2m! types, T = t , t : t ∈ T . For every t ∈ T , the set Vi is tH tL partitioned into the two sets Vi (β) and Vi (β) according to the partition coefficient β ∈ (0, 1).

26The restriction of λ is not necessary for the generalization of Proposition 1, but is useful for Theorem 1. 62 tH t [2] [1] [3] t [2] [3] V (β) = {vi ∈ V | vi ≥ βvi + (1 − β) vi } = {vi ∈ V | vi ≥ β + (1 − β) vi }

tL t [2] [1] [3] t [2] [3] Vi (β) = {vi ∈ V | vi < βvi + (1 − β) vi } = {vi ∈ V | vi < β + (1 − β) vi }

where β ∈ (0, 1) is a partition coefficient.

Each ordinal type is divided into H and L types, and an agent i’s type is determined

[2] tH by the relative value of the second-ranked alternative vi . An agent i with vi ∈ V , called an H type agent, values the second-ranked alternative relatively closely to the

tL first-ranked alternative. An agent i with vi ∈ V , called an L type agent, values the second-ranked alternative relatively closely to the third-ranked alternative. Roughly speaking, the H type agents hate the third-ranked alternative and L type agents love the first-ranked alternative.

β Let fβ be a P -Utilitarian Rule and fix the standard tie breaking rule such that ties are broken by uniform distribution over the set of the maximizers. We mainly consider P β-measurable rules, so the following type-based notations are

β β convenient. Consider a type function associated with P , ti : Vi → T which maps a

tH value vector of agent i to the corresponding type t ∈ T . For example, if vi ∈ Vi (β), β H n then ti (vi) = t . Then, we can identify each rule fβ with gβ: T → 4(L) by

 β β  gβ(t1, ..., tn) = fβ t1 (vi), ..., tn(vn) . There are essentially two score vectors and two type probabilities because the value distributions across agents are identical and the environment is neutral.

H   [k] β H  L   [k] β L s (β) = vˆ | t (vi) = t , s (β) = vˆ | t (vi) = t E i i k={1,...,m} E i i k={1,...,m}

H n β H o L n β Lo p (β) = P r vi : t (vi) = t , p (β) = P r vi : t (vi) = t .

63 Step 3) We first consider the three alternative case, which clearly and simply

shows how to use preference intensity information. Since gβ is neutral and we as- sume a neutral environment, it is sufficient to consider only one ordinal type in our

ORD H 0 L examination of incentive compatibility. We fix t ∈ T , ti = t , and ti = t . To   ˆ    0 ˆ  0 simplify notation, let E gβ t, t−i = P (β) and E gβ t , t−i = P (β) . These are probability vectors in which the coordinates are the probabilities that the alternatives

0 are chosen under the rule gβ given t and t respectively. Note the following property of P (β) and P (β)0,

P (β)[2] ≥ P (β)0[2], P (β)[1] ≤ P (β)0[1]and P (β)[3] ≤ P (β)0[3] for any β ∈ (0, 1).

In other words, the change of one’s announcement from L to H type given the type profile of others weakly increases the probability that the second-ranked alternative is chosen and weakly decreases the probabilities that other alternatives are chosen. This results from a similar process of scoring rules, but with the two score vectors defined such that sH (β)[1] = sL (β)[1] = 1, sH (β)[3] = sL (β)[3] = 0 and sH (β)[2] > sL (β)[2].     We define the function h(β) = P (β)[1] − P (β)0[1] + β P (β)[2] − P (β)0[2] from the incentive constraints (see the proof of Lemma 3 in the Appendix). The following lemma identifies the necessary and sufficient condition for incentive compatibility.

Lemma 3. h(β) = 0 if and only if gβ is IC. The proof of the lemma is in the Appendix. We call h (β) the balance func-

tion for the rule gβ. That is, because h (β) can be arranged such that h (β) =         P (β)[1] − P (β)0[1] +β P (β)[2] − P (β)0[2] =(1 − β) P (β)[1] − P (β)0[1] +β P (β)0[3] − P (β)[3] ,     it shows a weighted average of a loss P (β)[1] ≤ P (β)0[1] and a gain P (β)0[3] ≥ P (β)[3] from the change of the announcement from L to H type. If the gain and loss are bal- anced, h (β) = 0, then the rule g (β) is IC. 64 Step 4) We proceed with three claims regarding the balance function h(β) to show the existence of an IC cardinal rule, and their proofs are in the Appendix. For ease of notation, we use the vector of scores assigned to the alternatives which depends

 β  on β and the announced types, si(β, ti) = E vˆi | ti (vi) = ti .

Claim 1 . limβ→0 h(β) < 0 and limβ→1 h(β) > 0

With Claim 1, if h(β) is continuous on (0, 1), then we can easily find a β∗ such that h (β∗) = 0 by the intermediate value theorem. Then, the new rule f ∗ is an IC

∗ cardinal rule where f (v) = gβ∗ (ti, t−i). However, there may not exist a β∗ such that h (β∗) = 0 because h(β) may be discontinuous. The following example explains the potential discontinuity.

H Example 3. Given L = {a, b, c} and ti = abc , then gβ (ti,t−i) is determined by

H the aggregated scores of the alternatives which depends on β and t−i. Let S (β, t−i) =

 H  P  s (β) + j6=i sj (β, tj) be the vector of aggregated scores assigned to the alter- natives, then we can express P (β)[1] with this function

[1] n H a H b H co P (β) = P r t−i : S (β, t−i) > S (β, t−i) and S (β, t−i) +

1 n o P r t : SH (β, t )a = SH (β, t )b > SH (β, t )c + 2 −i −i −i −i 1 n o P r t : SH (β, t )a = SH (β, t )c > SH (β, t )b + 2 −i −i −i −i 1 n o P r t : SH (β, t )a = SH (β, t )b = SH (β, t )c . 3 −i −i −i −i

P (β)[2] and P (β)[3] can be similarly expressed. P (β)0[1],P (β)0[2], and P (β)0[3] can

L be expressed using the corresponding S (β, t−i). 65 H a H b Note that gβ is determined by the value order of S (β, t−i) , S (β, t−i) , and

H c [1] S (β, t−i) at each t−i. If gβ does not change as β changes given any t−i, then P (β)

H L is continuous in β because p (β), p (β), and si(β, ti) are continuous in β. However,

H gβ could change with β because the value order of the coordinates in S (β, t−i) could change as well. This feature could cause a jump in P (β)[1] at some β0s, which means that P (β)[1] is possibly discontinuous. The analogous argument can be applied to P (β) and P (β)0. Therefore, h(β) may also be discontinuous.

However, even in this case we can construct a slightly different IC rule. Define the set D = {β ∈ (0, 1) : h (β) is discontinuous at β}.

Claim 2 . If D 6= φ, then there exits a βˆ ∈ D such that h(β) · h(β) ≤ 0. β→βˆ− β→βˆ+

The following figure is helpful in understanding the remaining argument.

Figure 1. An example of discontinuous h (β) at βˆ

Claim 2 provides a way to design an IC cardinal rule with such a βˆ by using an appropriate convex combination of two rules. One rule is g+ with the balance 66     function at βˆ, h+ βˆ = h(β) . The other rule is g− with h− βˆ = h(β) . The next β→βˆ+ β→βˆ− claim concludes Step 4 by showing exactly what the appropriate combination is.

Claim 3 . There exists an IC cardinal rule f ∗ (v) such that f ∗ (v) = αg+ + (1 − −h−(βˆ) α)g− where α = ∈ [0, 1]. h+(βˆ)−h−(βˆ)

Step 5) We prove that f ∗ has a higher utilitarian social welfare than the ordinal

∗ utilitarian rule, fOUR. The construction of f is based on a finer partition than fOUR,

∗ ∗ which implies that W (f ) ≥ W (fOUR) (the equality holds when f (v) = fOUR(v) for almost every v ∈ V ). However, we can always find a set of v with non-zero

∗ measure such that f (v) 6= fOUR(v) , as shown in the proof of Claim 1. In these

tH 0 tL ∗ ∗ 0 cases where vi ∈ Vi and vi ∈ Vi given the others agents’ types, f (v) 6= f (vi, v−i) 0 0 while fOUR(v) = fOUR(vi, v−i) because v and (vi, v−i) are in the same ordinal type set ORD ∗ in P . These cases guarantee a difference in social welfare between f and fOUR,

∗ implying that W (f ) > W (fOUR).

Step 6) The above steps prove the theorem with three alternatives. To extend the result to more than three alternatives, we need to change Step 3 by constructing a

rule with gβ and two other rules. First, recall the ordinal utilitarian rule gOUR which

  [k]  ORD has one score vector sOUR = E vˆi | ti(vi) = t for t ∈ T . Second, we consider a modified rule gˆ ∈ F P β with slightly different score vectors than sH (β) , sL (β). The

score vectors only assign the third-ranked alternative the same score as the gOUR. That is,

H  H 1 H 2 3 H 4 H m sˆ (β) = s (β) , s (β) , sOUR, s (β) , ..., s (β) ,

L  L 1 L 2 3 L 4 L m sˆ (β) = s (β) , s (β) , sOUR, s (β) , ..., s (β) .

Now, we construct a new rule in the following manner.

67    gβ (ti, t−i) if {l ∈ L : rl (vi) ≤ 3} = {l ∈ L : rl (vj) ≤ 3}   gˆβ (ti, t−i) = for all i, j ∈ N and gβ =g ˆ     gOUR (ti, t−i) otherwise

In words, gˆβ is the same as gβ when the two conditions are satisfied. First, every agent has the same top three alternatives. Second, gβ (ti, t−i) equals gˆ (ti, t−i).

Otherwise, gˆβ is the same as gOUR.

We construct gˆβ to apply the main intuition for three alternatives and directly compare gˆβ with gOUR. Our rule may be meaningful when we are concerned with the preference intensities and competition of the top three alternatives. After the construction of gˆβ, we use almost the same argument as the argument with three alternatives, replacing gβ with gˆβ in the remaining steps.

In fact, gˆβ is neutral because gβ, gˆ, and gOUR are neutral. We also need to   ˆ    0 ˆ  0 examine another property of E gˆβ t, t−i = P (β) and E gˆβ t , t−i = P (β) with the replacement,

P (β)[k] = P (β)0[k] for k ≥ 4.

This property results from thatgˆβ chooses the one among the top three alternatives shared for all agents regardless of agent i’s announcement of H or L type. Unlike the three alternative case, sH (β)3 and sL (β)3 are non-zero. The following lemma characterizes sH (β)3 and sL (β)3.

Lemma 4. For any β ∈ (0, 1), sH (β)3 ≤ sL (β)3. The proof of Lemma 4 is in the Appendix. Lemma 4 implies that we may lose the property that P (β)[2] ≥ P (β)0[2], P (β)[1] ≤ P (β)0[1]and P (β)[3] ≤ P (β)0[3] for any 68 β ∈ (0, 1) without the replacement. However, we can recover the property by mixing

H 3 L 3 3 gβ and gˆ because sˆ (β) =s ˆ (β) = sOUR.

The other steps hold with the replacement. 

The next corollary covers the n = 2 or n ≥ 4 case with a restriction of the value distribution µi. The proof of the corollary is in Appendix.

Corollary 2. Assume n = 2 or n ≥ 4 and that for all i ∈ N µi is an identical  [1] ORD  [3] ORD E vˆ |P +E vˆ |P  [2] ORD i i i i distribution such that E vˆi | Pi ≤ 2 , then there exists an IC cardinal rule which achieves a higher utilitarian social welfare than any ordinal rule. When agents are symmetric ex-ante, it is often interesting to focus on anonymous rules. An anonymous rule treats every agent equally. That is, the “name” of an agent does not affect the rule.

Definition 6. A rule f is anonymous if, for all profiles v and all permutations π over the set of agents,   f vπ(1), ..., vπ(n) = f (v) The following corollary demonstrates an important implication of Theorem 1 within the set of anonymous rules.

Corollary 3. Under the same assumptions as Theorem 2 or Corollary 2, there exists an anonymous and IC cardinal rule that strictly Pareto dominates any anony- mous ordinal rule Proof. The restriction of anonymous rules and the identical value distribution across agents implies that W (f) = NUi (f) and Ui (f) is the same for all i ∈ N.

∗ ∗ By Theorem 1, there exists a rule f such that Ui (f ) > Ui (fOUR) for all i ∈ N.

∗ Additionally, f and fOUR are anonymous.  69 2.6.2. (A, B)-scoring rules with three alternatives. In this subsection, we focus on three alternatives, connecting our rule f ∗ to several well-known rules: the plurality, negative, Borda count, and approval voting rules. As in Myerson (2002), the general form of these voting rules for three candidates is an (A, B)-scoring rule, where each voter must choose a score vector that is a permutation of either (1,B, 0) or (1, A, 0). That is, the voter can give a maximum of 1 point to one candidate, A or B (0 ≤ A ≤ B ≤ 1) to some other candidate, and a minimum of 0 to the remaining candidate. In our environment, we define the rule in the following way.

Definition 7. A rule f is an (A, B)-scoring rule if there exists two score vectors si ∈ {(1, A, 0), (1,B, 0)} for each i ∈ N such that

X rl(vi) Supp (f(v)) ⊆ argmax si l∈L i∈N Specific (A, B)-scoring rules are widely used in practice and theory. The case (A, B)=(0, 0) is the plurality voting rule, where each voter can support a single can- didate. The case (A, B)=(1, 1) is the negative voting rule, where each voter can oppose a single candidate. The case (A, B)=(0.5, 0.5) is the Borda count voting rule, where each voter can give candidates a completely ranked score. These rules are classified as ordinal rules because information about ordinal preference is sufficient to implement the rules. However, (A, B)=(0, 1) - the approval voting rule where each voter can support or oppose a group of candidates - requires more than information about ordinal preferences. Similar to P β-Utilitarian Rules, this property makes the approval voting rules more efficient than other ordinal rules in some cases. The fol- lowing proposition clearly shows the relationship between P β-Utilitarian Rules, f ∗ and (A, B)-scoring rules. 70 Proposition 2. Assume that µi is an identical distribution for all i ∈ N. Then, a rule is a P β-Utilitarian Rule if and only if it is an (A, B)-scoring rule with (A, B) =   sL (β)2 , sH (β)2 . Furthermore, f ∗ is an IC (A, B)-scoring rule with (A, B) =   sL (β∗)2 , sH (β∗)2

Proof. The first part follows from Theorem 2. Note that with three alternatives     sL (β) = 1, sL (β)2 , 0 and sH (β) = 1, sH (β)2 , 0 . For the second part, consider Step 4 in the proof of Theorem 1. When h (β) is continuous, f ∗ is obviously an IC   (A, B)-scoring rule with (A, B) = sL (β∗)2 , sH (β∗)2 and the standard tie breaking rule. But, in the other case f ∗ does not look like an (A, B)-scoring rule because of the convex combination of the two rules. However, it is shown in the proof of Claim 3

∗ that f is the same as gβˆ except for some ties. By the first part of Proposition 2, gβˆ is   2  2 an IC (A, B)-scoring rule with sL βˆ , sH βˆ and the standard tie breaking rule . Since the definition of an (A, B)-scoring rule makes no restriction on tie breaking   2  2 rules, the new rule f ∗ is an IC (A, B)-scoring rule with sL βˆ , sH βˆ , but with

∗ ˆ a different tie breaking rule from gβˆ. Then, setting β = β completes the proof. 

Proposition 2 shows that P β-Utilitarian Rules are not purely theoretical rules. Also, if we focus on (A, B)-scoring rules, Proposition 2 can be used to find an IC (A, B)-scoring rule superior to any ordinal rule.

71 2.7. Incentive Compatible Rules On Single-Peaked Preferences Domain

In the previous sections, we investigated IC voting rules in a neutral environment on the domain of unrestricted preferences. Now we study IC voting rules in a non- neutral environment. In a non-neutral environment, Majumdar and Sen (2004) show that any IC ordinal rule is dictatorial on the domain of unrestricted preferences.27 The next step could be to fix a non-neutral environment and look at a domain of restricted preferences. In particular, we consider the most popular domain of single- peaked preferences. Single-peaked preferences can be explained as follows. Suppose that the alternatives in L can be ordered in some way (left to right, smallest to biggest, etc). We say that agents have single-peaked preferences over the set of alternatives

0 ∗ if each agent i has a most preferred alternative (agent i s peak), li ∈ L and agent i 0 00 0 ∗ prefers l to l if and only if l is closer to li The primary objective of this section is to show a new implication of incentive compatibility in the non-neutral environment in Majumdar and Sen (2004) on the domain of single-peaked preferences. Specifically, we show that an IC ordinal rule is not necessarily dictatorial in this restricted domain. Furthermore, IC ordinal rules in this setting require information only on agents’ peaks, not their full rankings.

2.7.1. Model On Single-Peaked Preferences Domain. Consider single-peaked preferences where alternatives are linearly ordered, L = {1, 2, 3, ..., m}. The typical elements of L are now denoted by integers. The value set Vi over alternatives has the following property.

27Precisely, they assume unanimous rules which will be formally defined later. And their concept of incentive compatibility is ordinal Bayesian incentive compatibility. But we show the equivalence between the incentive compatibility in our environment and their concept in Appendix. Also, their non-neutral environment refers to the environment where the value distribution violates the condition of neutrality that the ordinal type distribution is uniform. 72 ∗ There exists an alternative li ∈ L , the peak of vi, such that for every vi ∈ Vi and l, l0 ∈ A,   0 ∗ l0 l l < l ≤ li ⇒ vi < vi

 ∗ 0 l0 l li ≤ l < l ⇒ vi < vi

The set of value profiles satisfying the above property is denoted by V SP . We assume that the distribution over V SP has full support. On the domain of single- peaked preferences, the associated type set is T SP , a subset of the ordinal type set, i.e., T SP ⊂ T ORD. The full support assumption implies the full single-peaked prefer- ences domain.28 In other words, it allows all admissible ordinal preferences satisfying the above property. For example, with 3 alternatives, T SP = {123, 213, 231, 321}⊂ T ORD = {123, 132, 213, 231, 312, 321}.  n We deal only with ordinal and deterministic rules, g : T SP → L. In addition, we impose that rules satisfy a unanimity condition, which states that in any case where all agents agree on a particular alternative as the best, then the rule must choose that alternative. This is a commonly-used axiom in the social choice literature. It is useful to define another function related to ranking of alternatives. For i ∈ N,

SP let rk : T → L denote the kth-ranked alternative in ti.

Definition 8. A rule g is unanimous if g (ti, t−i) = l whenever l = r1 (ti) for every agent i ∈ N

The information structure is the following. We frequently use the induced distribu-

n SP o tion over type profiles, ν (t1, ..., tn) = pr v ∈ V : ti (vi) = ti for every i ∈ N n  SP  P because we focus on ordinal rules. For any Q ⊆ T , let ν (Q) = (t1,...,tn)∈Q ν (ti, ..., tn).

28Gershkov et al. (2013) assume the linear environment of utilities, which implies a strict subset of full single-peaked preferences domain. 73 We have two assumptions on ν. First, types are assumed to be independent but not necessarily symmetric across agents. Second, ν is assumed to satisfy the following  n property: for all Q, R ⊂ T SP ,

[ν (Q) = ν (R)] ⇒ [Q = R].

Majumdar and Sen (2004) introduce the set of distributions which satisfy the above conditions. We denote this set by C and denote the set of independent distri- butions by ∆I . They show that C is large and generic in ∆I with the two properties: (1) C is open and dense in ∆I and (2) ∆I − C has Lebesgue measure zero. They prove that given ν ∈ C, if a rule g is unanimous and IC then g is dictatorial. But, the following example shows that there exists a unanimous, IC, and non-dictatorial rule even with ν ∈ C on the domain of single-peaked preferences. Example 3. Consider the rule g defined below with 3 alternatives and 2 agents.

t2 t1 123 213 231 321 123 1 1 1 1 213 1 2 2 2 231 1 2 2 2 321 1 2 2 3 Table 3. The rule g in Example 3

In the array, agent 1’s types appear along the columns and agent 2’s along the rows. The inside entries indicate g (t1, t2). It is obvious that the rule is unanimous and non-dictatorial. In fact, the rule is IC with respect to any distribution, because the probabilities that each alternative is chosen under the rule conditional on a truthful 74 type announcement first-order stochastically dominate the probabilities conditional on other type announcements regardless of any distribution.

2.7.2. Peak-only rules and Incentive Compatibility. A peak-only rule is a rule that depends on peaks in agents’ preferences. That is, we do not need full ranking information in agents’ preferences to implement the rule.

ORD 0 0 Definition 9. A rule g ∈ F is a peak-only rule if g (t1, ..., tn) = g (t1, ..., tn) 0 0 0 whenever (t1, ..., tn) and (t1, ..., tn) are such that r1 (ti) = r1 (ti) for all i ∈ N.

The following theorem shows the main result of this section.

Theorem 3. Every unanimous and IC rule g ∈ F ORD is a peak-only rule. Proof of Theorem 3: The first step is to show a useful lemma that given ν ∈ C, an IC rule g ∈ F ORD should satisfy a certain property called Property M.29 The second step is to prove that if g satisfies the unanimity and Property M, then g is a peak-only rule.

Step 1) For the lemma, we define the set of alternatives that are weakly preferred to alternative l of type ti (the upper contour set) such that

n 0 l0 l o B (l, ti) = l ∈ L : vi ≥ vi for t (vi) = ti . Also, we define Property M below.

Definition 10. The rule g satisfies Property M, if for every agent i ∈ N, for all

0 0 0 integers k = 1, ..., m , and for every ti, ti such that B (rk (ti) , ti) = B (rk (ti) , ti), we have

0 0 0 [g (ti, t−i) ∈ B (rk (ti) , ti)] ⇒ [g (ti, t−i) ∈ B (rk (ti) , ti)]

29We borrow the name and definition of this property from Majumdar and Sen (2004). 75 0 It implies that when two ordinal types ti and ti share the top k alternatives and 0 g (ti, t−i) is one of the top k alternatives, Property M demands that g (ti, t−i) must be one of the top k alternatives.

Lemma 5. Given ν ∈ C, if a rule g ∈ F ORD is IC then g satisfies Property M.

The proof of Lemma 5 is in the Appendix.

Step 2) Since the formal proof has some technical parts, we place it in the Appendix. Instead, we provide a simple example that shows the crucial idea of the second step. Example 4. Let A = {1, 2, 3}, N = {1, 2}. Given ν ∈ C, consider an incomplete rule g below.

t2 t1 123 213 231 321 123 1 l l0 1 213 2 2 231 2 2 321 1 3 Table 4. The rule g in Example 4

First, g is unanimous. Assume that g is IC, but not a peak-only rule. Then,

0 0 0 there exists (t1, t2) and (t1, t2) with r1 (ti) = r1 (ti) for i = 1 or 2 such that g (t1, t2) 6= 0 0 0 0 g (t1, t2). Consider (t1, t2) = (213, 123) and (t1, t2) = (231, 123) and let g (213, 123) = l and g (213, 123) = l0. If l or l0 is equal to 2 (the shared first-ranked alternative of type 213 and 231), Lemma 5 requires that l = l0 = 2, which contradicts that l 6= l0. Hence, l or l0 should be 3 because l, l0 6= 2 and l 6= l0. But, it also leads a 76 contradiction by Lemma 5 because B (r2 (123) , 123 (= t2)) = B (r2 (213) , 213) and g (213, 213) = g (231, 213) = 2 from the unanimity of the rule. The formal proof extends this idea with an arbitrary number of agents and alternatives.

2.8. Concluding Comments

We have investigated the efficiency and incentive compatibility of voting rules in a Bayesian environment with independent private values and at least three alterna- tives. First, we characterize the ex-ante Pareto frontier of the set of ordinal rules. Furthermore, we prove that in a neutral environment, essentially any ex-ante Pareto efficient ordinal rule is IC, which implies that the usual conflict between efficiency and incentive compatibility does not exist in the class of ordinal rules. This conflict, how- ever, arises if we consider cardinal rules. That is, it is not straightforward to design a cardinal rule which is more efficient than ordinal rules as well as IC. However, we successfully construct an IC cardinal rule superior to any ordinal rule when agents are symmetric ex-ante. With three alternatives, this cardinal rule turns out to be an IC (A,B)-scoring rule in Myerson (2002). Lastly, the condition of incentive compatibility becomes a strong constraint in a non-neutral environment. Any IC ordinal rule is a peak-only rule on the domain of single-peaked preferences. We do not attempt to derive an optimal cardinal rule subject to incentive com- patibility because an analytical characterization of this second best rule is difficult to obtain. But we believe that our paper addresses some of the most basic questions regarding the design of IC voting rules in a Bayesian environment, and that our re- sults are building steps to find this second-best rule. In addition, the method of using a finer partition and finding a condition for incentive compatibility in the proof of Theorem 2 is novel and worthy of attention. This method may be applicable to a 77 more general distribution of agents’ values for designing a superior voting rule or even to different fields such as allocation or matching rules. We hope to deal with these directions in future study.

78 Bibliography

[1] Apsteguia, J., Ballester, M.A. and Ferrer, R. (2011), On the justice of decision rules, Review of economic studies, 78, 1-16. [2] Azrieli, Y. and Kim, S. (2013), Pareto efficiency and weighted majority rules, The Ohio State University working paper. [3] Badger, W. W., “Political individualism, positional preferences, and optimal decision rules,” in R.G. Niemi and H.F. Weisberg, eds. (1972), Probability Models of Collective Decision Making (Columbus, Ohio: Merrill). [4] Barberà, S. and Jackson, M. O. (2004), “Choosing how to choose: Self-stable majority rules and constitutions,” Quarterly Journal of Economics, 119, 1011-1048. [5] Barberà, S. and Jackson, M. O. (2006), On the weights of nations: Assigning voting weights in a heterogeneous union, The Journal of Political Economy, 114, 317-339. [6] Beisbart, C. and Bovens, L. (2007), “Welfarist evaluations of decision rules for boards of repre- sentatives,” Social Choice and Welfare, 29, 581-608. [7] Börgers, T. and Postl, P. (2009), Efficient compromising, Journal of Economic Theory, 144, 2057-2076. [8] Carrasco, V., Fuchs, W. (2009), A Procedure for Taking Decisions with Non-transferable Utility, working paper. [9] Curtis, R. B., “Decision rules and collective values in constitutional choice,” in R.G. Niemi and H.F. Weisberg, eds. (1972), Probability Models of Collective Decision Making (Columbus, Ohio: Merrill). [10] Fleurbaey, M., “One stake one vote,” mimeo, University Paris-Descartes, 2008. [11] Freixas, X. (1984), A Cardinal Approach to Straightforward Probabilistic Mechanisms, Journal of Economic Theory, 34, 227-251. [12] Gale, D. (1960), The theory of linear economic models (New York: McGraw-Hill).

79 [13] Gershkov, A., Moldovanu, B., Shi, X. (2013), Optimal Mechanism Design without Money, working paper. [14] Gibbard, A. (1973), Manipulation of Voting Schemes: A General Result, Econometrica, 41, 587-601. [15] Gibbard, A. (1978), Straightforwardness of Game forms with Lotteries as Outcomes, Econo- metrica, 46, 595-613. [16] Giles, A. and Postl, P. (2012), Equilibrium and Effectiveness of Two-Parameter Scoring Rules, University of Birmingham working paper. [17] Harsanyi, J. C. (1955), Cardinal Welfare, Individualistic Ethics, and Interpersonal Comparisons of Utility, The Journal of Political Economy, 63, 309-321 [18] Holmström, B., and Myerson, R. B. (1983), Efficient and durable decision rules with incomplete information, Econometrica, 51, 1799-1819. [19] Jackson, M. O. (1991), “Bayesian implementation,” Econometrica, 59, 461-477. [20] Jackson, M. O. and Sonnenschein, H. F. (2007), Overcoming incentive constraints by linking decisions, Econometrica, 75, 241-257. [21] Kim, S., “Ordinal versus cardinal voting rules: A mechanism design approach,” mimeo, The Ohio State University, 2012. [22] Kreps, D. M. and Wilson, R. (1982), “Sequential equilibria,“ Econometrica, 50 , 863-894. [23] Majumdar, D. and Sen, A. (2004), Ordinally Bayesian incentive compatible voting rules. Econo- metrica, 72, 523-540. [24] Mas-colell, A., Whinston, M. D. and Green, J. R. (1995), Microeconomic Theory, Oxford Uni- versity Press. [25] Miralles, A. (2012), Cardinal Bayesian allocation mechanisms without transfers, Journal of Economic Theory, 147, 179-206. [26] Moulin, H. (1980), On strategy-proofness and single peakedness, , 35, 437-455. [27] Myerson, R. B. (2002), Comparison of Scoring Rules in Poisson Voting Games, Journal of Economic Theory, 103, 219-251. [28] Nehring, K. (2004), “The veil of public ignorance,” Journal of Economic Theory, 119, 247-270.

80 [29] von-Neumann, J. and Morgenstern, O. (1944), Theory of games and economic behavior (New Jersey: Princeton University Press). [30] Palfrey, T. R. and Srivastava, S. (1989), “Mechanism design with incomplete information: A solution to the implementation problem,” The Journal of Political Economy, 97, 668-691. [31] Rae, D. W. (1969), “Decision-rules and individual values in constitutional choice,” The Ameri- can Political Science Review, 63, 40-56. [32] Roberts K. W. S. (1980), Interpersonal Compatibility and Social Choice Theory, Review of Economic Studies, 47, 421-439 [33] Rockafellar, R. T. (1970), Convex analysis (New Jersey: Princeton University Press). [34] Samuels, S. M. (1965), “On the number of successes in independent trials,” Annals of Mathe- matical Statistics, 36, 1272-1278. [35] Satterthwaite, M. A. (1975), Strategy-proofness and arrow’s conditions: Existence and corre- spondence theorems for voting procedures and social welfare functions, Journal of Economic Theory, 10, 187-217. [36] Schmitz, P. W. and T. Tröger, “Garbled elections,” CEPR discussion paper No. 195, 2006. [37] Schmitz, P. W. and Tröger, T. (2012), The (sub-)optimality of the majority rule, Games and Economic Behavior, 74, 651-665. [38] Shapley, L. S. and Shubik, M. (1954), “A method for evaluating the distribution of power in a committee system,” American Political Science Review, 48, 787-792. [39] Strotz R. H. (1953), Cardinal Utility, The American Economic Review, 43, 384-397 [40] Weymark, J. A. (2005), “Measurment theory and the foundations of utilitarianism,” Social Choice and Wefare, 25, 527-555.

81 .

Appendix A: Proofs in Chapter 1

Proof of Lemma 2: Let f ∈ F IC . From (1.3.1) we have that

  ˆ   0 ˆ  ui(ti) · E f(ti, t−i) − E f(ti, t−i) ≥ 0

and 0   ˆ   0 ˆ  ui(ti) · E f(ti, t−i) − E f(ti, t−i) ≤ 0

0  ˆ  for every i and every ti, ti ∈ Ti. Denote (p, 1 − p) = E f(ti, t−i) and (q, 1 − q) =  0 ˆ  E f(ti, t−i) . Then the above inequalities become

r s (p − q)(ui (ti) − ui (ti)) ≥ 0 and

r 0 s 0 (q − p)(ui (ti) − ui (ti)) ≥ 0,

0 r s respectively. Now, if both ti, ti are in Ti or both are in Ti then the above inequalities  ˆ   0 ˆ   ˆ  imply that p = q, that is E f(ti, t−i) = E f(ti, t−i) . In other words, E f(ti, t−i) r s viewed as a function of ti is constant on Ti and on Ti (Pi-measurable). ORD r Denote g = E(f|P ). Obviously, g ∈ F . For any ti ∈ Ti we have

 ˆ   ˆ ˆ r   ˆ ˆ r   ˆ  (2.8.1) E f(ti, t−i) = E f t |ti ∈ Ti ) = E g t |ti ∈ Ti ) = E g(ti, t−i) ,

 ˆ  r where the first equality follows from the fact that E f(ti, t−i) is constant on Ti and ˆ r type independence, the second from the definition of g and the fact that {ti ∈ Ti } is 82 in (the algebra generated by) P , and the third from the fact that g(·, t−i) is constant

r s on Ti for any fixed t−i and type independence. Using the same argument for ti ∈ Ti

we get that (2.8.1) is valid for every ti ∈ Ti. Therefore,

 ˆ   ˆ  Ui(f|ti) = ui(ti) · E f(ti, t−i) = ui(ti) · E g(ti, t−i) = Ui(g|ti), which establishes that every type of every agent gets the same interim utility under f and under g. Finally, the fact that g is IC immediately follows from (1.3.1) and (2.8.1).

Proof of Lemma 3:

n ro Fix i ∈ N and denote k−i(t) = # j ∈ N \{i} : tj ∈ Tj the number of agents other than i that prefer alternative r. Then

µ (T r) µ (k (t) = k − 1) 1 µ (T r | k (t) = k) = i −i = . i r s µ T s µ(k (t)=k) µ (Ti ) µ (k−i (t) = k − 1) + µ (Ti ) µ (k−i (t) = k) ( i ) −i 1 + r µ(Ti )µ(k−i(t)=k−1)

It follows immediately from Inequality (6) in Samuels (1965, page 1272) that this ratio is strictly increasing in k, so the proof is complete.

Proof of Lemma 4: We have

 n  n − 1 P r X = 1,S ≥ − P r X = 1,S ≥ = 1 n 2 1 n−1 2   n − 2  n − 3 p P r S ≥ − P r S ≥ = n−1 2 n−2 2   n − 4  n − 2  n − 3 p P r (X = 1) P r S ≥ + P r (X = 0) P r S ≥ − P r S ≥ = n−1 n−2 2 n−1 n−2 2 n−2 2   n − 4  n − 2   n − 2  n − 3 p2 P r S ≥ − P r S ≥ + p P r S ≥ − P r S ≥ = n−2 2 n−2 2 n−2 2 n−2 2 n − 4 n − 2 n − 3 n − 2 p2P r ≤ S < − pP r ≤ S < . 2 n−2 2 2 n−2 2 83 A similar computation gives

 n  n − 1 P r X = 0,S < − P r X = 0,S < = 1 n 2 1 n−1 2 n − 2 n n − 1 n −p(1 − p)P r ≤ S < + (1 − p)P r ≤ S < . 2 n−2 2 2 n−2 2

Assume first that n is even. Then the sum of these two differences becomes

    ! ! 2 n − 4 n − 2 n − 2 n n n − 2 n n p P r Sn−2 = −p(1−p)P r Sn−2 = = n−4 p 2 (1−p) 2 − n−2 p 2 (1−p) 2 < 0. 2 2 2 2

If n is odd then after some simple manipulations the sum of the two differences becomes

 n − 1  n − 3 (1 − p)2P r S = − p(1 − p)P r S = = n−2 2 n−2 2 ! ! n − 2 n−1 n+1 n − 2 n−1 n+1 2 2 2 2 n−1 p (1 − p) − n−3 p (1 − p) = 0. 2 2

To conclude, we have shown that

 n  n P r X = 1,S ≥ + P r X = 0,S < ≤ 1 n 2 1 n 2  n − 1  n − 1 P r X = 1,S ≥ + P r X = 0,S < , 1 n−1 2 1 n−1 2

with a strict inequality when n is even and equality when n is odd. Applying twice this inequality we get (1.7.1).

84 .

Appendix B: Proofs in Chapter 2

Proof in Section 5

Proof of Lemma 1:

1 P σ σ−1 ∗ Consider g (v) = m! f (v ) . First, g is neutral. For any σ ∈ φ, σ∈φ

∗−1  σ ∗−1 ∗−1 σ−1  σ−1 σ  σ∗ σ 1 X  σ∗ σ 1 X  σ∗ σ g v =  f v  = f v m! σ∈φ m! σ∈φ

−1 ∗−1 1  ∗ σ (σ ) 1 −1 = Xf vσ (σ) = Xf (vσ)σ = g (v) , m! σ∈φ m! σ∈φ

where the second equality follows from the fact that the permutation of an aggre- gated vector is the same as an aggregation of the individually permutated vectors, and the fourth from the fact that {σ∗ (σ): σ ∈ φ} = φ.

1 P t t Moreover, Ui (f) = E (ˆvi · f (ˆv)) = m! E (ˆvi | vi ∈ Vi ) · E (f (ˆv) | vi ∈ Vi ), where t∈T the second equality follows from the neutral environment which yields the same prob-

1 ability of any ordinal type m! . To see the connection between f and g, we change the ORD expression of Ui (f) with permutations. Fix t ∈ T ,

1 X  σ t  σ t Ui (f) = E vˆi | vi ∈ Vi · E f (ˆv ) | vi ∈ Vi m! σ∈φ

−1 −1 1 X  σ tσ  σ tσ = E vˆi | vi ∈ Vi · E f (ˆv ) | vi ∈ Vi m! σ∈φ

−1  t 1 X  σ tσ = E vˆi | vi ∈ Vi · E f (ˆv ) | vi ∈ Vi m! σ∈φ 85 −1 where the second equality follows from the fact that for x, y ∈ Rm and σ−1, xσ · yσ−1 = x · y, and the third from the neutral environment. Since this formula holds for any t and any i,

 σ−1 1 X  t 1 X  σ t Ui (f) = E vˆi | vi ∈ Vi · E f (ˆv ) | vi ∈ Vi  m! t∈T m! σ∈φ    1 X  t 1 X σ σ−1 t = E vˆi | vi ∈ Vi · E  f (ˆv ) | vi ∈ Vi  m! t∈T m! σ∈φ

= Ui (g) .

Proofs in Section 6

Proof of Lemma 3: Assume h (β) = 0. First, check the incentive constraint between t and t0.

t 0 t0 For vi ∈ Vi (β) and vi ∈ Vi (β),    ˆ    0 ˆ  vi · E gβ t, t−i − E gβ t , t−i [1]  [1] 0[1] [2]  [2] 0[2] [3]  [3] 0[3] = vi P (β) − P (β) + vi P (β) − P (β) + vi P (β) − P (β)

 [3]  [1] 0[1]  [2] [3]  [2] 0[2] = 1 − vi P (β) − P (β) + vi − vi P (β) − P (β)

 [3]  [1] 0[1]  [2] 0[2] ≥ 1 − vi P (β) − P (β) + β P (β) − P (β)  [3] = 1 − vi h(β) = 0.

t [2] 0[2] The inequality comes from the fact that vi ∈ Vi (β) and P (β) − P (β) ≥ 0.

Similarly, 0    0 ˆ    ˆ  vi · E gβ t , t−i − E gβ t, t−i  [3]  0[1] [1]  0[2] [2] ≥ 1 − vi P (β) − P (β) + β P (β) − P (β)  [3] = − 1 − vi h(β) = 0. 86 The condition, h(β) = 0 can be interpreted with the cut-off agent argument as well. The cut-off agent is the agent with the value of the second-ranked alternative

[2] [3] vi = β + (1 − β) vi which divides the H type and L type sets. If this agent is indifferent between the announcement of an H type and an L type, then every agent is willing to announce his or her true type.

Second, consider the remaining incentive constraints regarding other type an- nouncements.

00 H 000 L ORD For ti = t , ti = t where t ∈ T ,

σ   00 ˆ    ˆ  E gβ t , t−i = E gβ t, t−i for some σ. similarly, σ   000 ˆ    0 ˆ  E gβ t , t−i = E gβ t , t−i for some σ.

Note, that the value order of coordinates in P (β) and P (β)0 still follows the order of value announcements from the similar argument in the proof of Theorem 1.   ˆ    00 ˆ  By the first order stochastic dominance of E gβ t, t−i over E gβ t , t−i and   0 ˆ    000 ˆ  0 E gβ t , t−i over E gβ t , t−i , and the above argument between t and t ,

  ˆ    00 ˆ  vi · E gβ t, t−i ≥ vi · E gβ t , t−i ,   ˆ    0 ˆ    000 ˆ  vi · E gβ t, t−i ≥ vi · E gβ t , t−i ≥ vi · E gβ t , t−i , and 0   0 ˆ  0   000 ˆ  vi · E gβ t , t−i ≥ vi · E gβ t , t−i , 0   0 ˆ  0   ˆ  0   00 ˆ  vi · E gβ t , t−i ≥ vi · E gβ t, t−i ≥ vi · E gβ t , t−i . The other direction in the claim is obvious from the first part of this proof, so it is omitted.

Proof of Lemma 4: 87 H 3 L 3 According to the definition of the score vectors, limβ→0 s (β) = limβ→1 s (β) . Thus, it is sufficient to show that sH (β)3 and sL(β)3 are (weakly) decreasing in β.

Given 0 < β2 < β1 < 1, [2] v −β i 1 H 3 R 1R 1 R 1 R 1 R 1−β1 [3]  [1] [m] [3] [2] [m] H s (β ) = [m] ... [4] [3] v µ v , ..., v dv dv ...dv /p (β ), 1 0 v v v [4] i i i i i i i 1 i i vi [2] v −β i 2 H 3 R 1R 1 R 1 R 1 R 1−β2 [3]  [1] [m] [3] [2] [m] H s (β ) = [m] ... [4] [3] v µ v , ..., v dv dv ...dv /p (β ). 2 0 v v v [4] i i i i i i i 2 i i vi

[2] [2] vi −β1 vi −β2 H H Since ≤ and p (β2) > p (β1), for any α ≥ 0, 1−β1 1−β2 h [3] tH i h [3] tH i P r vi ≤ α | vi ∈ Vi (β2) ≤ P r vi ≤ α | vi ∈ Vi (β1)

H 3 H 3 H 3 By the first order stochastic dominance, s (β2) ≥ s (β1) . It means that s (β) are (weakly) decreasing in β. Similarly, [2] L 3 R 1R 1 R 1 R 1 R vi [3]  [1] [m] [3] [2] [m] L s (β1) = [m] ... [4] [3] [2] v µi v , ..., v dv dv ...dv /p (β1) 0 v v v v −β i i i i i i i i i 1 1−β1 [2] L 3 R 1R 1 R 1 R 1 R vi [3]  [1] [m] [3] [2] [m] L s (β2) = [m] ... [4] [3] [2] v µi v , ..., v dv dv ...dv /p (β2). 0 v v v v −β i i i i i i i i i 2 1−β2

[2] [2] vi −β1 vi −β2 L L Since ≤ and P (β2) < P (β1), for any α ≥ 0, 1−β1 1−β2 h [3] tL i h [3] tL i P r vi ≥ α | vi ∈ Vi (β2) ≥ P r vi ≥ α | vi ∈ Vi (β1) .

L 3 L 3 By the first order stochastic dominance, s (β2) ≥ s (β1) .

Proof of Claim 1:

[1] 0[1] For limβ→0 h(β) = limβ→0 P (β) − P (β) < 0, we know that at every case the change of announcement from H type to L type weakly increases the probability that the first-ranked alternative is chosen. Thus, it is sufficient to find the case where the change strictly increases the probability as β is close to 0. We can always find the case

1 1 0 where gβ (ti, t−i) = ( 2 , 2 , 0) when every one’s type is H type and gβ (ti, t−i) = (1, 0, 0). H H H For example, for n = 2, t1 = abc and t2 = bac and for n = 3, t1 = abc , 88 H H t2 = bca , and t3 = cab . Generally, for any n = 2 or n ≥ 4, we can combine above

30 H 1 two profiles to find the case. Since all other types are H types (limβ→0 p (β) = m! ), [1] 0[1] limβ→0 P (β) − P (β) < 0

0[3] [3] Next, for limβ→1 h(β) = limβ→1 P (β) − P (β) > 0, it is sufficient to find the case where the change of an agent’s announcement from L type to H type strictly decreases the probability that the third-ranked alternative is chosen as β is close to 1.31

0 1 1 1 1 1 We can always find the case where gβ (ti, t−i) = (0, 2 , 2 ) or ( 3 , 3 , 3 ) when everyone’s 0 L L type is L type and gβ (ti, t−i) = (0, 1, 0). For example, for n = 3, t1 = abc , t2 = bca L 0 L L and t3 = cab and for n = 5, the same t1,t2,t3, t4 = bca and t3 = cba . Generally, for any n = 3 or n ≥ 5, we can use the above two profiles to find the case. However, we may not find the case when n = 2 and n = 4. For n = 2, the possible case where

0 c 0 L L gβ (ti, t−i) > 0 is that t1 = abc and t2 = cba . But , in some distributions such that L 2 1+sL(β) 0 s (β) > 2 for any β ∈ (0, 1), gβ (ti, t−i) = (0, 1, 0)= gβ (ti, t−i). The argument L 1 for n = 4 is analogous. Since all other types are L types (limβ→1 p (β) = m! ), 0[3] [3] limβ→1 P (β) − P (β) > 0. Note that Claim 1 still holds with more than 3 alternatives, considering that the

shared top 3 alternatives are replaced with the 3 alternates in the above cases and gβ

is replaced with gˆβ constructed in Step 6.

Proof of Claim 2: Suppose not (for all βˆ ∈ D, h(β) · h(β) > 0). We have two cases. β→βˆ− β→βˆ+

30 H H H H H For example, for n = 4, t1 = abc , t2 = bac ,t3 = abc and t4 = bac , and for n = 5, t1 = abc H H H H , t2 = bca t3 = cab , t4 = abc and t5 = bac , and so on. 31Recall, h (β) = P (β)[1] − P (β)0[1] + β P (β)[2] − P (β)0[2]=(1 − β) P (β)[1] − P (β)0[1] + β P (β)0[3] − P (β)0[3]. 89 ˆ ˆ h(β) h(β) Case I: There exists β1 < β2 ∈ D such · < 0 and h (β) is continuous β→βˆ1+ β→βˆ2−  ˆ ˆ  ˆ ˆ on β1, β2 . By the intermediate value theorem, there is a β ∈ (β1, β2) such that h(β) = 0, which contradicts that h(β) · h(β) > 0. β→βˆ− β→βˆ+

ˆ ˆ h(β) h(β) Case II: For any β1 < β2 ∈ D , we have · > 0. By Claim 1, h(β) β→βˆ1+ β→βˆ2−  ˆ  h(β)  ˆ  h(β) is continuous on some 0, β1 with > 0 or on some β2, 1 with < 0. β→βˆ1− β→βˆ2+ ˆ ˆ By Claim 1 and the intermediate value theorem, there is a β ∈ (0, β1) or (β2, 1) such

that h(β) = 0, contradiction again.

Proof of Claim 3: From Claim 2, fix βˆ and consider a rule, g+ based on the fixed partition P βˆ but  ˆ  with a different score vector, si β + , ti . With sufficiently small  > 0 such that ˆ + β +∈ / D, g are different from gβˆ only in some of the tie cases of gβˆ. That is because l l0 l ˆ H  ˆ  H  ˆ  L  ˆ  any β ∈ D involves tie cases where S β, t−i = S β, t−i or S β, t−i = l0 L  ˆ  0 βˆ S β, t−i for l 6= l ∈ L as seen in Example 2. With P , we obtain the balance   function of g+, h+ βˆ = h(β) . Because of the same partition P βˆ and the different β→βˆ+ + decisions only in tie cases of gβˆ, g is still a maximizer of the utilitarian social welfare P βˆ − βˆ  ˆ  in F . Similarly, we design a rule, g based on P but with si β − , ti and the   balance function h− βˆ = h(β) . From this perspective, we can say that g+and g− β→βˆ− are scoring rules with the same score vectors of gβˆ, but with different tie breaking rules. −h−(βˆ) Finally, consider a rule g˜ = αg+ + (1 − α)g−, where32 α = ∈ [0, 1] α h+(βˆ)−h−(βˆ) ˜  ˆ such that the balance function of g˜α, hα β is equal to 0. Hence, g˜α is an IC cardinal rule. In order to compare with ordinal rules, we will use the identical function,

∗ f (v) =g ˜α(t1, ..., tn) in the next step.     32Recall h+ βˆ · h− βˆ ≤ 0 from Claim 2

90 Proof of Corollary 2: Every step is the same as the proof of Theorem 2 except the second part in

0 L L the proof of Claim 1. For n = 2, consider t1 = abc and t2 = cba . Since  [1] ORD  [3] ORD L 3 E vˆ |P +E vˆ |P  [2] ORD L 2 1+s (1) i i i i L 2 E vˆi | Pi = s (1) ≤ 2 = 2 , s (β) increases and L 3 0 1 1 s (β) (weakly) decreases in β, we have gβ (ti, t−i) = ( 2 , 0, 2 ) and gβ (ti, t−i) = (0, 1, 0) 0 L L L as β closes to 1. For n = 4, we have the case where t1 = abc ,t2 = cba ,t3 = abc and L t4 = cba .

Proofs in Section 7

Proof of Lemma 5: The first part of the proof is to show the equivalence between the incentive com- patibility in this paper and the ordinal Bayesian incentive compatibility in Majumdar and Sen (2004). We can define the ordinal Bayesian incentive compatibility with our notations.

Definition 11. (Majumdar and Sen (2004)) A rule g ∈ F ORD is Ordinally Bayesian Incentive Compatible (OBIC) with respect

0 ORD ti to the belief ν if for all i ∈ N , for all ti, ti ∈ T , for all vi ∈ Vi , we have

  ˆ   0 ˆ  vi · E g(ti, t−i) − E g(ti, t−i) ≥ 0 If g is OBIC, then g is IC, which is obvious. The other direction holds because of the assumption of full support of V SP . Then, the second part is to prove that if g is OBIC, then g satisfies Property M. This part is omitted because it is shown in the proof of Theorem 4.1 in Majumdar and Sen (2004).

Formal Step 2 in the proof of Theorem 3: 91 Assume that a rule g ∈ F ORD is unanimous and IC, but not a peak-only rule. It

0 0 implies that there exists two type profiles (ti, t−i) and (ti, t−i) with r1 (ti) = r1 (ti) 0 such that g (ti, t−i) 6= g (ti, t−i). Without loss of generality, fix the agent i as agent 0 0 ∗ 0 1 and consider t1, t1 such that r1 (t1) = r1 (t1) = l and rk (t1) = rk (t1) for k =

{1, ..., l − 1, l + 2, ..., m}. In other words, all the ranked alternatives of two types t1

0 and t1 are the same except the l th and l + 1 th-ranked alternatives. Note that ∗ ∗ rl (t1) > l > rl+1 (t1) or rl (t1) < l < rl+1 (t1) because of the domain of single peaked

preferences. We can think of three cases according to g (t1, t−1).

∗ 0 ∗ Case I: g (t1, t−1) = l . By Lemma 5, g (t1, t−1) = l , which contradicts that 0 ∗ g (t1, t−1) 6= g (t1, t−1). Thus, g (t1, t−1) 6= l .

∗ 0 ∗ 0 ∗ Case II: g (t1, t−1) > l . Then, g (t1, t−1) < l . Suppose not, g (t1, t−1) ≥ l . If 0 ∗ ∗ 0 ∗ g (t1, t−1) = l , similarly to Case I, g (t1, t−1) = l , contradiction. If g (t1, t−1) > l 0 and without loss of generality g (t1, t−1) < g (t1, t−1), then the single-peaked prefer- 0 0 ences domain guarantees that there exists k such that B (rk (t1) , t1) = B (rk (t1) , t1) 0 0 0 and g (t1, t−1) ∈ B (rk (t1) , t1), but g (t1, t−1) ∈/ B (rk (t1) , t1). It contradicts that g satisfies Property M.

∗ 0 ∗ Case III: g (t1, t−1) < l . Then, g (t1, t−1) > l because of the analogous argument in Case II.

∗ 0 ∗ Thus, we can start with that g (t1, t−1) > l > g (t1, t−1) or g (t1, t−1) < l < 0 g (t1, t−1). After the following argument with agent 2, we will apply the same argu- ment to agent 3,...n and show the contradiction to the unanimity of the rule.

∗ 0 Consider the case that g (t1, t−1) > l > g (t1, t−1) (In the other case, the con- tradiction can be similarly shown). Take another agent, say agent 2. We have three

∗ cases according to l (t2) 92 ∗ ∗ Case A: l (t2) < l , then the full single-peaked preferences domain guarantees that

0 ∗ 0 ∗ 0 0 there exists t2 such that l (t2) = l and, for some k, B (rk (t2) , t2) = B (rk (t2) , t2), ∗ 0 g (t1, t2, ..., tn) ∈/ B (rk (t2) , t2) but l ∈ B (rk (t2) , t2). By Lemma 5, g (t1, t2, ..., tn) 6= l∗.

∗ ∗ 0 Case B: l (t2) = l , set t2 = t2.

∗ ∗ Case C: l (t2) > l , then the full single-peaked preferences domain guarantees that

0 ∗ 0 ∗ 0 0 there exists t2 such that l (t2) = l and, for some k, B (rk (t2) , t2) = B (rk (t2) , t2), 0 ∗ 0 0 g (t1, t2, ..., tn) ∈/ B (rk (t2) , t2) but l ∈ B (rk (t2) , t2). By Lemma 5, g (t1, t2, ..., tm) 6= l∗.

0 0 0 ∗ Therefore, at least one of g (t1, t2, ..., tn) or g (t1, t2, ..., tn) cannot be l . From the 0 ∗ 0 0 arguments of Case I,II, and III we have that g (t1, t2, ..., tm) > l > g (t1, t2, ..., tm) or 0 ∗ 0 0 g (t1, t2, ..., tn) < l < g (t1, t2, ..., tn). 0 Again, we can fix t2 and apply the same argument to agent 3 and repeat the same 0 0 0 ∗ 0 0 0 procedure to agent 4,...n. Then we reach that g (t1, t2, t3, ..., tn) > l > g (t1, t2, ..., tn) 0 0 0 ∗ 0 0 0 0 ∗ or g (t1, t2, t3, ..., tn) < l < g (t1, t2, t3, ..., tn) with r1 (ti) = l for every i ∈ N, which contradicts the unanimity of the rule.

93