<<

ARTICLE IN PRESS

Journal of Mathematical Psychology 48 (2004) 107–134

Binary choice, subset choice, random utility, and ranking: A unified perspective using the permutahedron Jun Zhang* Department of Psychology, University of Michigan, Ann Arbor, MI 48109, USA Received 3 December 2002; revised 8 December 2003

Abstract

d The d-permutahedron Pd1CR is defined as the of all d-dimensional permutation vectors, namely, vectors whose components are distinct values of a d-element set of integers ½dŠf1; 2; y; dg: By construction, Pd1 is a with d ! vertices, each representing a linear order (ranking) on ½dŠ; and has dimension dimðPd1Þ¼d 1: This paper provides a review of some well-known properties of a permutahedron, applies the geometric-combinatoric insights to the investigation of the various popular choice paradigms and models by emphasizing their inter-connections, and presents a few new results along this line. Permutahedron provides a natural representation of ranking probability; in fact it is shown here to be the space of all Borda scores on ranking probabilities (also called ‘‘voters profiles’’ in the social choice literature). The following relations are immediate consequences of this identification. First, as all d ! vertices of Pd1 are equidistant to its barycenter, Pd1 is circumscribed by a sphere Sd2 in a ðd 1Þ-dimensional space, with each spherical point representing an equivalent class of vectors whose components are defined on an interval scale. This property provides a natural expression of the random utility model of ranking probabilities, including the condition of Block and Marschak. Second, Pd1 can be realized as the image of an affine projection from the unit cube Cdðd1Þ=2 of dimension dðd 1Þ=2: As the latter is the space of all binary choice vectors describing probabilities of pairwise comparisons within d objects, Borda scores can be defined on binary choice probabilities through this projective mapping. The result is the Young’s formula, now applicable to any arbitrary binary choice vector. Third, d Pd1 can be realized as a ‘‘monotone path polytope’’ as induced from the lift-up of the projection of the cube Cd CR onto the line 1 d segment ½0; dŠCR : As the 2 vertices of the d-cube Cd are in one-to-one correspondence to all subsets of ½dŠ; a connection between the subset choice paradigm and ranking probability is established. Specifically, it is shown here that, in the case of approval voting (AV) with the standard tally procedure (Amer. Pol. Sci. Rev. 72 (1978) 831), under the assumption that the choice of a subset indicates an approval (with equal probability) of all linear orders consistent with that chosen subset, the Brams–Fishburn score is then equivalent to the Borda score on the induced profile. Requiring this induced profile (ranking probability) to be also consistent with the size-independent model of subset choice (J. Math. Psychol. 40 (1996) 15) defines the ‘‘core’’ of the AV Polytope. Finally, Pd1 can be realized as a canonical projection from the so-called Birkhoff polytope, the space of rank-position probabilities arising out of the rank- paradigm; thus Borda scores can be defined on rank-position probabilities. To summarize, the many realizations of a permutahedron afford a unified framework for describing and relating various ranking and choice paradigms. r 2004 Elsevier Inc. All rights reserved.

1. Background Common choice paradigms which arise from a wide variety of political, social–economical and psychological In the general context of decision-making through situations include: choice over d distinct objects (political candidates, consumer products, joboptions, etc.), denote the master 1. Linear ordering/ranking, in which all d objects are choice set ½dŠ :¼f1; 2; y; dg; with objects conveniently rank-ordered1 according to their levels of desirability, labelled as natural numbers 1 through d (d42is assuming no ties. assumed in this paper unless explicitly noted otherwise). 1 For simplicity, only linear/total order and sometimes weak order of *Tel.: +1-734-763-6161; fax: +1-734-763-7480. objects are considered; no considerations for semi-order or interval E-mail address: [email protected]. order will be given in this paper.

0022-2496/$ - see front matter r 2004 Elsevier Inc. All rights reserved. doi:10.1016/j.jmp.2003.12.002 ARTICLE IN PRESS 108 J. Zhang / Journal of Mathematical Psychology 48 (2004) 107–134

2. Binary choice, in which two of the d objects (called a paradigm. Denote a total (linear) ranking of all ‘‘binary subset’’) are selected at a time to be candidates in ½dŠ by p ¼ /pð1Þpð2Þ?pðdÞS; where compared against each other for their desirability, pðiÞ is the rank (also called ‘‘rank-position’’) given to and all such pairwise comparisons of objects in ½dŠ are candidate i; iA½dŠ: As a convention adopted in this performed. paper, the rank-positions are natural numbers 1 through 3. Subset choice, in which only one distinct subset of ½dŠ d; forming a set denoted by ½dŠ as well, with larger values of any number of elements (including the null-set |)is indicating higher desirability. For convenience, p1 selected or approved of, while the desirability of all denotes the mapping from rank-position to candidate the selected elements is considered to be ‘‘equal’’ or identity, so p1ðkÞ represents the candidate who comparable (their precise meanings to be clarified occupies rank-position k according to ranking p; while later). p1ðdÞ returns the most-desirable candidate. In the 4. Rank assignment, in which each rank-position (1 literature as well as in this paper, /pð1Þpð2Þ?pðdÞS is through d; in ascending order of desirability) is to be referred to as ‘‘ranking’’, while ðp1ð1Þp1ð2Þ?p1ðdÞÞ assigned, as a partition of unity, to objects in ½dŠ: is referred to as ‘‘ordering’’ (in increasing desirability). Denote the set of all rankings of d candidates as Ld ; When there are more than one individuals (‘‘voters’’) with set size d !: The ‘‘ranking probability’’ Pp can be involved in performing such a binary choice, subset defined on the set of linear orders Ld : choice, or linear ranking task, or when such choice is X non-deterministic, a probability distribution will be PpX0; Pp ¼ 1: ð1Þ induced over the set of, respectively, all linear orders pALd over ½dŠ; all binary subsets of ½dŠ; or all subsets of ½dŠ: For instance, the probability distribution over the d ! Block and Marschak (1960) showed that under certain linear orders is termed the ‘‘voters profile’’ in the social conditions, the ranking probability Pp is naturally y choice literature or ‘‘ranking probability’’ elsewhere, the connected to the density f ðv1; v2; ; vd Þ of jointly distributed RU variable v v ; v ; y; v T through the probability distribution over the 2d subsets of ½dŠ is ¼½ 1 2 d Š relationship called ‘‘subset choice probability’’, while the more Z common ‘‘binary choice probability’’ refers to a vector y ? with dðd 1Þ=2 components, each representing the Pp ¼ f ðv1; v2; ; vd Þ dv1 dv2 dvd ; ð2Þ D probability of choosing one alternative over another in p a two-element subset of ½dŠ: where the region of integration Dp is the connected There have been attempts in the past to link the point-set preference structure revealed by these various choice d paradigms using some unifying constructs. One com- Dp ¼fvAR : vp1ðdÞ4vp1ðd1Þ4?4vp1ð1Þg: ð3Þ mon approach is to assume the existence of an under- lying interval-scaled utility associated with each choice The necessary and sufficient condition for (2) to hold is option (a candidate in ½dŠ). These utilities, represented the ‘‘non-coincidence’’ condition, roughly stated as T byad-dimensional vector v ¼½v1; v2; y; vd Š indeed, f ðv1; v2; y; vd Þ¼0 whenever vi ¼ vj ðiajÞ: ð4Þ may vary randomly, though often not necessarily independently. The distribution of such random utility This is to say that the set of points where any two (RU) values is denoted by a non-negative function random variables assume the same value has measure f ðv1; v2; y; vd Þ that satisfies zero; probability density is not concentrated on the Z boundaries of Dp: Block and Marschak (1960) demon- y ? f ðv1; v2; ; vd Þ dv1 dv2 dvd ¼ 1; strated that any such density function f ðv1; v2; y; vd Þ induced a ranking probability Pp according to (2); with f yet to be interpreted (see next paragraph). conversely a density function could always be con- This so-called (non-parametric) ‘‘Random Utility (RU) structed to satisfy (2) given an arbitrary ranking model’’, as a probabilistic generalization of deterministic probability Pp: utility theory, has been argued to provide the much- This result is intuitively important: assuming the non- needed tool for combining algebraic and probabilistic coincidence condition, a characterization of RU repre- representations of choice and, therefore, to serve sentation is equivalent to a characterization of ranking as a unifying framework to reconcile normative and probability Pp: As much as non-coincident RU repre- descriptive approaches to modelling and measurement sentation may become a unifying description for choice of choice in social sciences (Regenwetter & Marley, behavior (argued by M. Regenwetter), ranking prob- 2001). ability would play a central role in relating choice One important consequence of RU representation of probabilities associated with the various paradigms choice is its immediate connection to the ranking mentioned earlier. ARTICLE IN PRESS J. Zhang / Journal of Mathematical Psychology 48 (2004) 107–134 109

1.1. Ranking probability and binary choice and ranking is X aij ¼ Pp; ð5Þ The binary choice paradigm is perhaps one of the fpALd : pðiÞ4pð jÞg most studied and best axiomatized paradigm of choice. where the summation is over all p’s that rank candidate i Let aij be the relative frequency (or probability) of choosing object ‘‘i’’ in a two-element set fi; jg: There can as more desirable than candidate j: Given Pp; one be many axiomatizations and representations of the obtains aij through marginalization, so each ranking probability of binary choice. For instance, the ‘‘Weak probability gives rise to a unique binary choice vector. But the inference in the opposite direction, i.e., finding a Utility Model’’ postulates that aij4aji if and only if there exists a real-valued function gðÁÞ such that ranking probability that is consistent with (in the sense gðiÞ4gð jÞ: The ‘‘Strong Utility Model’’ postulates that of (5)) an arbitrarily given binary choice vector, is far less straightforward2—this problem has been referred to aij4akl if and only if there exists a real-valued function g such that gðiÞgð jÞ4gðkÞgðlÞ for all i; j; k; l: The as the characterization problem (see Marley, 1992). To ‘‘Strict Utility Model’’ (Bradley–Terry–Luce or BTL begin with, not all dðd 1Þ=2-dimensional vectors Model) further requires the existence of a positive- confined to the cube Cdðd1Þ=2 are realizable by Pp via valued function g such that a gðiÞ : The dðd1Þ ðÁÞ ij ¼ gðiÞþgð jÞ (5); this is because, among the 2 2 vertices of ‘‘Fechnerian Utility Model’’ assumes the existence of a Cdðd1Þ=2; only d ! of those are ‘‘valid’’ ones, namely, real-valued function g and a strictly monotone increas- ones that map one-to-one to the set of linear orders Ld : ing function f; such that the binary choice probability The number of valid vertices represents only a small takes the form aij ¼ f½gðiÞgð jފ; where f is often dðd1Þ taken as the cumulative distribution function of a proportion; this fraction d !=2 2 equals 0.75, 0.375, normal random variable, or at other times the logistic 0.117, 0.022 for d ¼ 3; 4; 5; 6; respectively, and ap- function. Finally, the ‘‘RU model’’ (see (2), with d ¼ 2) proaches zero rapidly as d increases. Any dðd 1Þ=2- assumes the existence of random variables vi and vj dimensional vector which cannot be represented as a associated with i and j such that aij ¼ Prob½vi4vjŠ: convex combination of valid vertices necessarily cannot When more than two candidates are involved, be represented as a convex combination of those binary choice can be performed on any two-element representing linear orders; therefore no ranking prob- subset of ½dŠ; d42: In this case, the choice probabilities ability Pp exists that will give rise to such a binary choice become a dðd 1Þ=2-dimensional vector a ¼ vector via (5). Hence, the necessary (and sufficient) y T ½a12; a13; ; aðd1Þd Š ; called the ‘‘binary choice vector’’, condition for the existence of Pp is that a ¼ whose range is the unit cube T ½a12; a13; y; aðd1Þd Š can be expressed as a convex T dðd1Þ=2 combination of the valid vertices of Cd d 1 =2: The Cdðd1Þ=2 :¼f½a12; a13; y; aðd1Þd Š AR : ð Þ convex hull of these d ! vertices, that is, the fractional 0 a 1; i; jA½dŠ; i jg p ijp o mixture of the corresponding vectors, forms a geometric object contained within Cdðd1Þ=2: It is called the ‘‘Binary representing all possible pairwise comparisons of candi- 3 dates in the master set ½dŠ: Note that any constraint Choice Polytope’’ or ‘‘Linear Ordering Polytope’’, placed on the binary-choice probability (such as transi- which normatively describes the solution to the char- tivity) will necessarily limit the feasible region in the cube acterization problem of binary choice (Cohen & Falmagne, 1990, Fishburn, 1992, Koppen, 1991, 1995, Cdðd1Þ=2: For example, for any triplets i; j; k which are elements of ½dŠ; weak stochastic transitivity (WST) Suck, 1992). See Appendix A for the mathematical X1 X1 X1 background on polytopes. assumes that if aij 2 and ajk 2 then aik 2; moderate X1 The Binary Choice Polytope belongs to the general stochastic transitivity (MST) assumes that if aij 2 and X1 X class of 0/1-polytopes (see Ziegler, 1999 for a recent ajk 2 then aik minðaij; ajkÞ; while strong stochastic X1 X1 introduction). Its facets (faces of maximal dimension) transitivity (SST) assumes that if aij 2 and ajk 2 then are defined by a system of inequalities/equalities on aij’s, aikXmaxðaij; ajkÞ; see Luce and Suppes (1965).These increasingly stronger assumptions about the underlying the components of a binary choice vector a: However, it process of binary comparison will constrain the realizable turns out that the problem of finding a complete region to some increasingly smaller core in the cube 2 In Babington Smith’s (1950) ranking model Cdðd1Þ=2: While important for various axiomatizations, Y in this paper no such constraints will be placed on the Pp ¼ const aij ; binary choice vector a unless explicitly stated. fði;jÞ : pðiÞ4pð jÞg The binary choice vector a can be related to the one readily obtains a ranking probability from the binary choice A probability, but it does not satisfy Eq. (5). ranking probability Pp defined for ranking p Ld of 3 Actually, the Binary Choice Polytope is defined in the space Rdðd1Þ dðd1Þ=2 elements of ½dŠ; where Ld denotes the set of all linear (instead of R ) where the relations aij þ aji ¼ 1 are treated as one orders. A sensible relationship between binary choice of the facet-defining equalities/inequalities. ARTICLE IN PRESS 110 J. Zhang / Journal of Mathematical Psychology 48 (2004) 107–134 description of these facets is a deceptively simple but Random utility challenging one, as not only all of its facet-defining representation inequalities are not yet found (and new inequalities are constantly emerging, including the more recent ones by Bolotashvili, Kovalev, & Girlich, 1999), but there is Binary choice Ranking Subset choice some indication that the effort of exhausting all such probability a probability b probability inequalities to obtain a complete linear description is perhaps futile because such a description may not itself c be ‘‘described’’ following arguments in computation Rank position complexity theory (Pekec˘ , 2000). probability

1.2. Ranking probability and subset choice By Block & Marschak (1960) Through Marginalization Characterization Problem An alternative to the binary choice paradigm is the a: yields Linear Ordering Polytope b: yields AV Polytope (under SI Model) ‘‘subset choice’’ paradigm (Brams & Fishburn, 1978, c: yields Birkhoff Polytope 1983, Falmagne & Regenwetter, 1996, Doignon & Fig. 1. Connections between common choice paradigms (ranking, Regenwetter, 1997, Regenwetter, Marley, & Joe, 1998, binary choice, subset choice) and the random utility representation. Doignon & Regenwetter, 2002, Doignon & Fiorini,to Note the central role of ranking probability (the direction of an arrow appear. From the d-element master set ½dŠ; consider the indicates an inference or specification from one representation/ paradigm to another). set formed by all subsets of ½dŠ; i.e., Sd ¼fS : SD½dŠg: P The set Sd is commonly referred to as the power set of d the original master set ½dŠ and denoted as 2½dŠ; it has 2d with f ð0Þ¼P|; f ðdÞ¼P½dŠ and that k¼0f ðkÞ¼1: elements, including the null-set | and ½dŠ itself. Under Given Pp and f ðkÞ; one readily obtains PS: As with the binary choice case, the inference in the opposite the subset choice paradigm, any element SASd of this power set, i.e., any subset of ½dŠ; becomes the basic direction is much harder. Given PS (and hence f ðjSjÞ), choice primitive. A probability distribution defined on conditions for inducing a probability distribution Pp on d all rankings in accordance with (6) have been called the the 2 alternative elements of Sd ; called the ‘‘subset choice probability’’, can be introduced in the same way characterization problem for AV. Its solution defines a polytope called the AV Polytope (Doignon & Regen- that the ranking probability over the d ! elements of Ld has been introduced in (1) wetter, 1997), and all of its facets have been character- X ized recently (Doignon & Fiorini, to appear). PSX0; PS ¼ 1: SD½dŠ 1.3. Imposing structures on linear orders Subset choice paradigms are natural extensions to a popular voting mechanism introduced in the social The above discussion shows that the probability choice and voting literature, namely approval voting distribution on linear orders Pp plays a pivotal role in (AV) (Brams & Fishburn, 1978, 1983). In recent years, formulating the characterization problems for binary the size-independent (SI) model of AV is proposed by choice and for subset choice paradigms. This is Falmagne and Regenwetter (1996). The SI model schematically summarized as Fig 1. The conditions for envisions a two-stage process in voters’ choices of the existence of Pp under (5) for a binary choice vector, subsets, a first stage concerning the size of the subset, or under (6) for a subset choice probability amount to followed by a second stage concerning the particular linear descriptions of the Binary Choice Polytope or the subset (among all subsets of equal size) being chosen. AV Polytope, respectively. These polytopes are, un- The subset choice probability PS is then related to the fortunately, quite complex. One hopes that further ranking probability Pp through restrictions on how to combine and compare linear X orders may give rise to a simpler geometric–combina- PS ¼ f ðjSjÞ Á Pp: ð6Þ toric structure. A 1 1 fp Ld : fp ðdÞ;y;p ðdjSjþ1Þg¼Sg There have been at least two approaches in the Here, for a fixed S with set-size jSj¼k; the summation literature for endowing structures on linear orders is over all rankings p whose top k entries (which turn out to be much related). The first approach p1ðdÞ; y; p1ðd k þ 1Þ are exactly those elements of relies on the introduction of a metric on Ld to describe S; there are a total of k! ðd kÞ! rankings for each S: the difference between two linear orders. Familiar The set-size probability f ðkÞ; k ¼ 1; y; d 1 is given by metrics include Spearman’s Dr (arising from r; the X correlation coefficient of differences in ranks), Kendall’s f ðkÞ¼ PS; ð7Þ Dt (arising from t; the correlation coefficient of signs fSD½dŠ : jSj¼kg of rank comparisons), Cayley’s distance (minimum ARTICLE IN PRESS J. Zhang / Journal of Mathematical Psychology 48 (2004) 107–134 111 number of transpositions between two rankings), among i.e., many others (see Kendall, 1962). These metrics are then 2 3 pð1Þ used for parametric models of ranking probability 6 7 X 6 pð2Þ 7 whereby Pp is inversely related to the distance Dðp; p0Þ Bd v ¼ Pp6 7: ð10Þ from an arbitrary ranking p to a modal ranking p 4 ^ 5 0 pALd (Fligner & Verducci, 1986). A prominent metric-based pðdÞ ranking model is Mallows’ (1957) f model, an exponential family with the assumption of unimodality The notions of the Borda rule and the Borda score are in the distribution of rankings. When endowed with the well-known in the social choice literature; in fact they Spearman’s r for measuring the distance between two have been systematically axiomatized (Young, 1974, rankings p1 and p2 1975, Nitzan & Rubinstein, 1981). Among its axioms (Young, 1974) is a cancellation condition effectively Xd stating that two rankings p1 and p2 will cancel each 2 Drðp1; p2Þ¼ ðp1ðiÞp2ðiÞÞ ; ð8Þ other’s effect if one is the ‘‘reverse’’ of the other, defined i¼1 as p1ðiÞ¼d p2ðiÞ for any iA½dŠ: The cancellation in the net consequence of reverse rankings is a crucial Mallow’s f model yields a particular probability axiom; it is closely related to the construction of ‘‘net distribution on the sphere Sd2; known as the von- ranking probability’’ (Regenwetter & Grofman, 1998a). Mises Fisher distribution in the modelling of directional The above two approaches of imposing structures on data (McCullagh, 1993). Other types of distance-based linear orders, one involving a metric on rankings and the models may incorporate the dynamics of the ranking other involving a scoring function on ranking probabil- process, such as a multi-stage process where a voter ities, turn out to be closely related; indeed they often chooses the most preferred candidate first, and then the parallel one another. Of special interest here is a result next preferred one from the remaining pool, and so by Cook and Seiford (1982)5 showing that the Borda forth, to produce a linear order (see Critchlow, Fligner, vector vBd in (10) actually achieves the minimum over T d & Verducci, 1991). the distance from a point v ¼½v1; v2; y; vd Š AR to all An alternative approach to endow structures on Ld is rank vectors when weighted by Pp and using Spearman’s based on the notion of a ‘‘scoring function’’ (Young, r metric (8) 1975) in the voting literature. Given Pp (called a ‘‘voters X Xd 2 profile’’ there for obvious reasons), the idea is to Drðv; pÞ¼ Ppðvi pðiÞÞ : ð11Þ A construct a score for each candidate i ½dŠ to reflect pALd i¼1 the collective or aggregated preference of the voting population. Each candidate’s score thus obtained will be The Borda vector is thus construed as an ‘‘average’’ compared against others’ for their desirability. One such rank vector, in the least-mean-square sense and con- scoring method is the well-acclaimed ‘‘Borda rule’’, sistent with the generalized average on ordered sets namely, candidate i accrues some point(s) (or ‘‘marks’’) (Ovchinnikov, 1996), obtained from the d rankings equal to4 the rank-position pðiÞ in a particular ranking associated with a voters profile Pp: Note that the geometries arising out of Borda scores, and more p; the total points vi accrued by each candidate from the voting population is then properly weighted by the generally the position voting schemes, were discussed in Saari (1990, 1992, 1993); however, these papers fell voters profile Pp to yield an ‘‘order of merit’’. Formally, short of identifying the space of all Borda vectors as the Borda score of a ranking probability Pp is defined as a d-dimensional vector vBd: forming a well-defined polytope under investigation here. P 2 3 As we shall see, construction (10) of the Borda score pAL Pp Á pð1Þ 6 P d 7 (vector) gives rise to a combinatoric–geometric object 6 A Pp Á pð2Þ 7 Bd 6 p Ld 7 known as the permutahedron (alternatively spelt as v ¼ 4 ^ 5; ð9Þ P ). The permutahedron is a very simple P p d and special kind of polytope that has been well studied pALd p Á ð Þ by mathematicians, see the recent treatise by Ziegler

5 These authors apparently mis-attributed Spearman’s r to Kendall 4 Given a linear order p; the points assigned to a candidate i; in the in this and a number of subsequent publications. It is clear from common Borda rule, is the number of other candidates that are Kendall (1962) that the metric associated with his t is ordered below the focal candidate i: The points assigned to a candidate X with rank-position pðiÞ is thus pðiÞ1; and ranges between 0 (i is least Dtðp1; p2Þ¼ 1 ðsgnðp1ðiÞp1ð jÞÞ Á sgnðp2ðiÞp2ð jÞÞÞ preferred) to d 1(i is most preferred). Here, for convenience, we add all ioj 1 to this mark, so the Borda score ranges between 1 (least preferred) to in which sgnðtÞ denotes the sign of t; whereas the metric associated d (most preferred). with Spearman’s r is given by (8). ARTICLE IN PRESS 112 J. Zhang / Journal of Mathematical Psychology 48 (2004) 107–134

(1995) and its excellent review by Suck (1997). As its Section 2.2, as a geometric object circumscribing a vertices represent the set of permutations of d objects, a hypersphere in Section 2.3, as a monotone path permutahedron is a kind of permutation polyhedra polytope (which arises from the lift-up of the projective (Bowman, 1972) that geometrically represent permuta- map from the cube onto a line segment) in Section 2.4, tions. It can be realized in various ways, as projections and through the canonical projection from the Birkhoff or lift-ups of projections of other polytopes. Its nice Polytope in Section 2.5. Each of these constructions geometric properties allow a unified treatment of provides a natural link to the binary choice paradigm, various choice paradigms (binary choice, subset choice, the RU representation, the subset choice paradigm, and etc.) and the related RU theory. the rank-assignment paradigm, respectively. Section 3 Although permutahedron is a familiar object to summarizes the central role of permutahedron in combinatorial mathematicians, it has so far rarely been connecting these choice paradigms in view of its rich used in models of choice (binary, subset, RU, Borda geometric–combinatorial properties, and discusses a scoring) where the polytope theory has demonstrated to connection to the topological approach (Chichilnisky be relevant. In particular, the potential power of & Heal, 1983) to preference and choice. permutahedron in connecting the aforementioned To help readers who might be unfamiliar with the choice paradigms has not been systematically investi- polytope theory, a review of the mathematical back- gated. The paper provides a comprehensive review of ground is provided as Appendix A. This includes some well-known mathematical properties of a permu- dualistic characterizations of a polytope either by its tahedron, points out their meanings within the context vertices or by its facets, the face lattice of polytopes, the of, as well as their applications to various choice projection of polytopes (including the projection of paradigms and models, along with a few new results. cubes), and the fiber polytope as lift-ups of polytope The goal of this paper is to bring to the awareness of projection (including the monotone path polytope). mathematical psychologists such a geometric-combina- These materials provide the necessary mathematical torial object and the intuitions it provides. The main knowledge that our current exposition draws upon. new results include the demonstration of: (i) the surjective mapping of the space of binary choice vectors to the permutahedron, the space of Borda scores 2. Choice paradigms and the permutahedron (Theorem 2.3); though this was presented and proven as an exercise in Ziegler’s book, the projection will be This section will review various choice paradigms and explicated in a matrix form allowing direct appreciation discuss their inter-connections under the common of the connection between the permutahedron and any framework of the permutahedron. Special attention will binary choice vector (not only the ones that lie within be paid to the geometric and the combinatorial view of the Binary Choice Polytope); (ii) the monotonicity and the permutahedron, the various realizations of a homogenizing property associated with this projective permutahedron as they relate to the various choice map if one further assumes BTL representation of the paradigms, and the projections as well as the lift-ups of choice probability (Corollary 2.4); (iii) the embedding of projections as the inter-connections among the choice the permutahedron in a hypersphere (Proposition 2.5); paradigms. this enables a geometric description of the RU First, from its construction in Section 2.1, a permu- representation of ranking probabilities and the Block tahedron over d objects, or d-permutahedron Pd1 for and Marschak (1960) condition; (iv) the connection short,6 will be shown to be a natural candidate for between the Brams–Fishburn score of the AV and the representing the probability distribution Pp of linear Borda score under the equal-probability interpretation orders p over a d-element set ½dŠ; with each of the d ! of subset choice (Theorem 2.7) and the latent voters vertices of Pd1 representing a possible ranking pALd : profile consistent with this interpretation (Corollary Since a point within Pd1 represents the Borda score 2.8); and (v) the core that defines the AV Polytope under associated with Pp; the various geometric properties of a the SI model of the underlying preference structure permutahedron reviewed here will naturally link the (Theorem 2.9). ranking probability and its Borda score to other choice The remaining of the paper is organized as follows: paradigms, including the binary choice paradigm, the Section 2.1 introduces the formal definition of permu- subset choice paradigm, and the rank-assignment tahedron, both as a convex combination of permutation paradigm, and to other representations of choice vectors and as an intersection of half-planes with a 6 system of inequalities defining its facets, the dual Strictly speaking, one should refer to Pd1 as a ðd 1Þ- definitions of a convex polytope that are of fundamental permutahedron, since the prefix usually refers to the dimensionality of a polytope in the polytope literature. In this paper, we call it a d- importance in the polytope theory. We then elaborate permutahedron to emphasize the fact that it arises as permutations various alternative constructions of a permutahedron, over d objects. We retain the notation of Pd1 to indicate that it is a i.e., as a zonotope (which is a projection of the cube) in ðd 1Þ-dimensional object. ARTICLE IN PRESS J. Zhang / Journal of Mathematical Psychology 48 (2004) 107–134 113 probability, including the BTL representation and the lowest rank or is least preferred ðpð2Þ¼1Þ; etc. For d RU representation. Then, in Section 2.2, the d- candidates, there are a total of d ! ¼ d Áðd 1Þ?2 Á 1 permutahedron Pd1 will be realized as the affine different permutations, denoted as pkALd ðk ¼ projection of a dðd 1Þ=2-dimensional cube Cdðd1Þ=2; 1; 2; y; d!Þ; where Ld is the set of all permutations the latter is the space that defines binary choice over d elements. probabilities. Explicating the projection matrix yields a To understand how a permutahedron arises, intro- formula, Young’s formula, to associate an induced score duce the notion of a permutation vector, namely, a d- for any binary choice vector (probability), not necessa- dimensional vector whose components are rank-posi- rily the ones that conform to a given ranking probability tions of a linear order p: through (5), i.e., the ones that lie within the Binary x ¼½pð1Þ; pð2Þ; y; pðdފT ; ð12Þ Choice Polytope. It will be shown that this induced score p is consistent with the representation of the binary choice with T denoting vector transpose. For instance, when probability, along with certain desired properties. Next, p ¼ /4132S; the corresponding permutation vector is T in Section 2.3, the d-permutahedron Pd1; as a ðd 1Þ- xp ¼½4; 1; 3; 2Š : There are d! permutation vectors dimensional object, is shown to be circumscribed by a A d y d xpk R ; k ¼ 1; ; d!; all defined in R : d2 7 unit sphere S ; this property will be used to construct A d-permutahedron, denoted Pd1; is the convex a geometric representation of the RU model and to hull of all of the d! permutation vectors P ¼ derive its relation to ranking probabilities (including the y fxp1 ; xp2 ; ; xpd! g Block–Marschak condition). There, it will be shown that ( any interval-scale vector (a vector whose components Pd1 ¼ convðPÞ¼ l1xp þ l2xp þ ? þ ld!xp : are subject to the affine freedom in the choice of a 1 2 d! ) reference zero and a scaling factor) can be mapped to a Xd! d2 point in the ðd 2Þ-dimensional unit sphere S pkALd ; lkX0; lk ¼ 1 : ð13Þ osculating Pd1: Next, in Section 2.4, Pd1 will be k¼1 realized as a special fiber polytope, namely the mono- Its barycenter b; a d-dimensional vector, is obtained by tone path polytope, arising from projecting a d-cube ? d 1 setting l1 ¼ l2 ¼ ¼ ld! ¼ 1=d ! in (13): Cd CR to a line segment ½0; dŠCR ; this property will be shown to relate subset choice to ranking. There, all Xd! 1 ðd 1Þ!ð1 þ 2 þ ? þ dÞ d þ 1 subsets of the d-element set ½dŠ are naturally mapped to b ¼ xpk ¼ 1 ¼ 1; d d ! d ! 2 the 2 vertices of Cd ; and AV under the Brams– k¼1 Fishburn tally procedure is shown to amount to an where 1 ¼½1; 1; y; 1ŠT : equal-probabilistic assignment of all linear orders This construction of a permutahedron is closely that are compatible with the chosen subset. The SI related to the calculation of Borda scores in the voting model of approval voting (Falmagne & Regenwetter, and the social choice literature (cf. Section 1.3). There, 1996) is also analyzed using this framework. Finally, the collection of lk (denoted Pp) is rightfully called in Section 2.5, Pd1 will be related to the rank- the voters profile, and formula (10) for calculating position probability that arises from rank-matching Borda scores vBd gives exactly the same form of (13), paradigm. That a Birkhoff Polytope Bd for d  d once the definition of a permutation vector (12) is bistochastic matrices has a canonical projection to understood. Though not widely picked up by mathe- Pd1 will be utilized to derive an induced score for the matical psychologist until recently, permutahedra d candidates. have been explicitly applied to consensus ranking (Cook & Seiford, 1982, 1990) and to the description 2.1. Permutahedron of statistical ranking models in general (McCullagh, 1993). 2.1.1. Construction of a permutahedron The permutahedron Pd1 characterizes concisely both Permutahedron is a convex polytope associated with combinatorial and geometric properties of permutations permutations on a set of given objects labelled by over d objects, apparently first investigated by Schoute y natural numbers ½dŠ¼f1; ; dg: To fix the notation, let (1911). The actual shapes of some low-dimensional / ? S p ¼ pð1Þpð2Þ pðdÞ denote as a mapping from ½dŠ as permutahedra are graphically illustrated in Fig. 2 for the set of candidates to ½dŠ as the set of ranks, with pðiÞ d ¼ 3; 4: Instead of using the permutation vectors denoting the rank (also called ‘‘rank-position’’) asso- ½pð1Þ; pð2Þ; y; pðdފT ; define the p-corresponding vertex ciated with candidate i; and p1ð jÞ denoting the T as xp ¼½cpð1Þ; cpð2Þ; y; cpðdފ for any base vector candidate occupying the rank j: Our convention is that T ½c1; c2; y; cd Š (for later convenience c14c24?4cd is higher ranks correspond to larger integer values, so p ¼ / S 7 4132 means that candidate ‘‘1’’ has the highest rank The reason that a d-permutahedron is subscripted d 1 rather or is most preferred ðpð1Þ¼4Þ; that candidate ‘‘2’’ has than d will become clear in Proposition 2.5. See also footnote 6. ARTICLE IN PRESS 114 J. Zhang / Journal of Mathematical Psychology 48 (2004) 107–134

v is called a permutation polytope, which generalizes 3 1 permutahedron (13) in a trivial way. Such mathematical 2 generalization is in obvious correspondence with the 3 general position methods for rank aggregation (Young, 2 1975, Cook & Kress, 1992). Another related combina- 1 toric–geometric object is the permuto-associahedron 1 3 constructed for signed, bracketed permutations, see 3 Ziegler (1995) for further details. 2 2.1.2. Facet-defining inequalities of a permutahedron Permutahedra are known to have relatively simple 3 configurations in space. They are simple polytopes. In 1 2 fact, although a d-permutahedron has d ! vertices defined in Rd ; it is actually a geometric object of v2 dimension d 1 (a proof of this observation will be given in Proposition 2.5 in the next subsection). For this reason, the d-permutahedron is denoted as Pd1: 2 As an alternative to defining a permutahedron as the 3 convex combination of permutation vectors, we may 3 1 also define it as the bounded intersection of half-spaces v 2 1 given by a system of hyperplanes, in accordance with the (a) 1 basic, dualistic view of a convex polytope (see Appendix (4123) A). As any facet of a permutahedron is passed through (4132) (1423) by one and exactly one of the hyperplanes (which assumes an equality sign in the system of facet-defining (4213) (1432) inequalities/equalities), the geometric arrangement of

(4312) (1243) these hyperplanes in space defines the same Pd1: (4231) Fortunately, this system of facet-defining inequalities/ (2413) (4321) (1342) equalities has been worked out and is to be reproduced (2143) (3412) below—they followed a strong result on vector major- (2431) (1234) ization by the combinatoric mathematician Richard (3142) (3421) (1324) Rado in a paper published in 1952, and were given in its (2134) explicit form in a book on polytope and optimization by (2341) Yemelichev, Kovalev, and Kravtsov (1984). The facet- (3124) defining equality/inequalities were also independently (3241) (2314) derived by Gaiha and Gupta (1977). (b) (3214) Fig. 2. Permutahedra in two and three dimensions, in the case of (a) Theorem 2.1 (Rado, 1952; Gaiha & Gupta, 1977, d ¼ 3 (note its construction as the convex hull of Theorem 2; Yemelichev, Kovalev, & Kravtsov, ½1; 2; 3ŠT ; ½1; 3; 2ŠT ; y; ½3; 2; 1ŠT ); and in the case of (b) d ¼ 4: Note 1984). The d-permutahedron Pd1 is given by the that in (b), consistent with our convention of interpreting p i j as ð Þ¼ following system of constraints on its coordinates v ¼ ‘‘item i’s rank-position is j’’, the vertices are labelled as ordering T ðp1ð1Þp1ð2Þp1ð3Þp1ð4ÞÞ; such that the neighboring vertices differ ½v1; v2; y; vd Š X only in the transposition of adjacent ranks between two objects (for jMjð2d jMjþ1Þ instance the exchange of the lowest ranks between the first and the vip for all MC½dŠ; ð14Þ 2 third object ð1342Þ2ð3142Þ). This ensures that the permutation iAM vectors associated with adjacent vertices differ with only minimum X distance. This is the labelling scheme adopted in Ziegler (1995, p. 18) dðd þ 1Þ vi ¼ ; ð15Þ and in Yemelichev (1984, p. 227). Note that the permutation vector 2 iA½dŠ associated with, say ð1342Þ; is ½1; 4; 2; 3ŠT : where M is a non-trivial (proper and non-empty) subset of required without loss of generality). The resulting ½dŠ having integer set-size jMj: geometric object 2 3 cpkð1Þ Proof. See Yemelichev et al. (1984, pp. 228–230), using 6 7 Xd! 6 c 7 Rado’s result regarding necessary and sufficient condi- y T A 6 pkð2Þ 7 convf½cpð1Þ; cpð2Þ; ; cpðdފ : p Ld g¼ lk4 ^ 5 tions for majorization of one vector by another. An k¼1 alternative proof is given in Gaiha and Gupta (1977), c pkðdÞ using a well-known inequality from Hardy, Litteg, and ARTICLE IN PRESS J. Zhang / Journal of Mathematical Psychology 48 (2004) 107–134 115

Polya (1934). A third proof is given in Ziegler (1995, p. Remark. The faces of Pd1 have interesting properties: 304). & each of its j-dimensional faces or j-face ð j ¼ 0; 1; y; d 2Þ corresponds to an ordered partition of the set Remark 1. Recall the generalization of a permutahe- f1; 2; y; dg into d j non-empty parts. Thus vertices dron to a permutation polytope, in which the base ð j ¼ 0Þ are permutations, and edges ð j ¼ 1Þ connect T vector becomes ½c1; c2; y; cd Š (with c14c24?4cd ). pairs of rankings that differ only by a single transposi- The corresponding system of facet-defining inequalities/ tion of adjacent rank-positions for two elements of ½dŠ equalities are (cf., Fig. 2). It can be shown that every vertex X XjMj of Pd1 belongs exactly to d 1 edges (Gaiha & vip ci for all MC½dŠ; Gupta, 1977). Each facet (j ¼ d 2), on the other iAM i¼1 hand, is a partition of the set ½dŠ into two subsets Xd Xd with non-overlapping elements ðS; ½dŠ\SÞ; where S vi ¼ ci: is any non-empty and proper subset of ½dŠ: The i¼1 i¼1 number of j-faces of a permutahedron fjðPd1Þ has the The case for the d-permutahedron is obtained by simply expression setting c ¼ d i þ 1: i X d! fj ¼ ð16Þ i1!i2!?idj! Remark 2. The proof of this proposition, as well as fi1;i2;y;idj ? Corollary 2.2 below involves the notion of vector i1þi2þ þidj ¼dg majorization and the Birkhoff/von Neumann Theorem (the sum is taken across all positive integral solutions of on bistochastic matrices (see Section 2.5). Ziegler’s i þ i þ ? þ i ¼ d). The number of facets can be (1995) proof was based on the notion of fiber polytopes. 1 2 dj calculated as 2d 2: These proofs are omitted here due to their high technicality. Besides being characterized as the convex hull of permutation vectors (13), or as the intersection of a Remark 3. Because of equality (15), the system of system of half-spaces (14) and (15), a permutahedron inequalities (14) can be equivalently cast as X can also arise as a projection, or a lift-up of a certain jMjðjMjþ1Þ projection, between polytopes. More specifically, a d- viX for all MC½dŠ: iAM 2 permutahedron can be realized as a zonotope, the projection of a unit cube of dimension d d 1 =2; as a Since a permutahedron can be described as the ð Þ monotone path polytope associated with projecting a intersection of a collection of half-spaces (each given cube of dimension d onto a line segment 0; d ; or as by a facet-defining hyperplane), its individual faces are ½ Š the canonical projection of the Birkhoff Polytope characterized by letting certain inequality to assume which is associated with d d bistochastic matrices; equal sign. Yemelichev et al. (1984) provided a  see below. particularly powerful result in elucidating all faces of a permutahedron, as reproduced below without 2.2. Connection to binary choice vectors via zonotope proof. projection Corollary2.2 (Yemelichev et al., 1984, Theorem 3.4). A 2.2.1. Projection of a binary choice vector set of solutions of system (14), (15) is a j-face ð0pjpd Permutahedron is a special kind of polytope, namely 2Þ of the d-permutahedron if and only if for each such a zonotope, defined as the image of a cube under affine solution, inequalities (14) are satisfied as equalities only projection. In other words, a permutahedron can be for a nested sequence of subsets M ; M ; y; M ; 1 2 dj1 realized as the projection of a cube of an appropriate where (and higher) dimension. This observation was made in C C?C C M1 M2 Mdj1 f1; 2; y; dg: Ziegler (1995, p. 200), along with a sketch of proof. Since this fact plays a major role in linking binary In particular, each of the hyperplanes that define the choice probabilities and ranking probabilities, such boundary of the permutahedron (i.e. that pass through the zonotope projection is explicated here, along with a facets) ‘‘converts’’ only one of the inequalities in (14) into constructive proof. A graphic illustration is shown as an equality (since d j 1 equals 1 for j ¼ d 2). Each Fig. 3, for d ¼ 3: of the 2d 2 ways of selecting a non-trivial subset MC½dŠ corresponds to a distinct facet. Theorem 2.3. The d-permutahedron Pd1 is realizable as an affine projection of a cube Cdðd1Þ=2; the domain Proof. See Yemelichev et al. (1984, p. 231). & on which the binary choice probability vector ARTICLE IN PRESS 116 J. Zhang / Journal of Mathematical Psychology 48 (2004) 107–134

Proof. Recall (see Appendix A) that a p-dimensional a unitP cube Cp; centered at origin, can be expressed as 23 p 1 1 x ¼ tl el with el’s as base vectors and ptlp as a12 l¼1 2 2 coordinates ðl ¼ 1; 2; y; pÞ; see (A.1). Its projection a q 13 x/z ¼ Jx þ b into a lower-dimensionalP space R {z Cd(d-1)/2 p ðqopÞ can be expressed as z ¼ l¼1tljl þ b where jl is γ the lth column vector of the q  p matrix J; and b is the x image of the p-cube’s center, see (A.2). In the present case, p ¼ dðd 1Þ=2 and tl represents the pairwise binary choice probability t ¼ a 1; 0 a 1: For γ γ l ij 2 p ijp convenience, the ij subscript (i ¼ 1; 2; y; d; j ¼ i þ 1; i þ 2; y; d) is used in place of l ðl ¼ 1; 2; y; d γ P ðd 1Þ=2Þ; so long as it is understood that dðd1Þ=2 ¼ P P l¼1 d d i¼1 j¼iþ1; with iði þ 1Þ l ¼ði 1Þd þ j: ð18Þ J x + b 2

Π To ensure that the center of the cube is properly mapped d-1 to the center of the d-permutahedron, we choose b ¼ dþ1 dþ1 y T 2 1 ¼ 2 ½1; 1; ; 1Š : We now proceed to construct the particular matrix Fig. 3. The d-permutahedron as a zonotope, i.e., resulting from a y suitably chosen projection of a cube of dimension dðd 1Þ=2 (here J ¼½j1; j2; ; jdðd1Þ=2Š by setting d ¼ 3). Note that projecting to the center of the hexagon (the 3- T T j e e ; i 1; 2; y; d; j i 1; i 2; y; d; permutahedron) are two vertices ½a12; a13; a23Š ¼½1; 0; 1Š and l ¼ i j ¼ ¼ þ þ T T ½a12; a13; a23Š ¼½0; 1; 0Š ; which are ‘‘invalid’’ vertices that lie outside of the Binary Choice Polytope for d ¼ 3 (cf. Section 1.1). T where ei ¼½0|fflfflffl{zfflfflffl}; y; 0; 1; 0|fflfflffl{zfflfflffl}; y; 0Š is just a base vector of i1 di Rd (with 1 in its ith coordinate and 0 elsewhere). More T explicitly, the ðk; lÞth entry of the matrix J is dik djk; a ¼½a11; a12; y; aðd1Þd Š is defined. In other words, { / with l linked to i; j through (18), and the Kronecker d- there exists a surjective map: Cdðd1Þ=2 a v ¼ T function given by ½v1; v2; y; vd Š APd 1 given by  8 1 when k ¼ i; Pd > 1 dki ¼ > v1 ¼ a11 þ a12 þ ? þ a1d þ ¼ a1j þ 1 0 otherwise: > 2 a > j¼1;j 1 > < 1 Pd So the ðk; lÞth entry of the projection matrix is 1 (if ? v2 ¼ a21 þ a22 þ þ a2d þ ¼ a2j þ 1 : ð17Þ k ¼ i), 1 (if k ¼ j), or 0 (otherwise). When we line up > 2 a 2 > j¼1;j 2 the components of the binary choice vector (a column > ?? > vector) in the following order a ¼½a ; a ; y; a ; a ; > 1 Pd 12 13 1d 23 :> v a a ? a a 1 T d ¼ d1 þ d2 þ þ dd þ ¼ dj þ y; a2d ; y; aðd1Þd Š ; the projection matrix has the 2 j 1;jad ¼ following explicit form

2 3 6 7 6 111? 1 00? 0 ? 000 7 6 7 6 ? ? ? 7 6 10 0 0 11 1 000 7 6 7 6 0 10? 0 10? 0 ? 000 7 6 7 6 ? ? ? ^^^ 7 J ¼ 6 001 0 0 1 0 7: ð19Þ 6 7 6 ^^^? ^ ^^? ^ ? 110 7 6 7 6 ^^^? ^ ? ? 7 6 0 0 0 101 7 4 ? ? ? 5 |fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl}000 1 |fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl}00 1 |fflfflfflfflffl{zfflfflfflfflffl}0 1 |{z}1 d1 d2 2 1 ARTICLE IN PRESS J. Zhang / Journal of Mathematical Psychology 48 (2004) 107–134 117

With this J; the resulting zonotope is, according to (41), which is the right-hand side of (15). Second, for any  Xd Xd subset MCf1; 2; y; dg; denote its complementary as d þ 1 1 z ¼ 1 þ a ðe e Þ: M% ; and the number of elements in M and M% as jMj and 2 ij 2 i j i¼1 j¼iþ1 d jMj; ! The terms for summation evaluate to X X Xd   Xd Xd 1 Xd Xd 1 vi ¼ 1 þ aij a e a e iAM iAM j¼1;jai ij 2 i ij 2 j 0 1 i¼1 j¼iþ1 i¼1 j¼iþ1  X X X Xd Xd @ A 1 ¼jMjþ aij þ aij ¼ a e ij 2 i iAM jAM;jai jAM% i¼1 j¼iþ1 X X  jMjðjMj1Þ Xd Xd 1 ¼jMjþ þ aij a e ðexchange of indices i; jÞ 2 A % ji 2 i i M jAM j¼1 i¼jþ1  jMjðjMj1Þ Xd Xd p jMjþ þjMjðd jMjÞ 1 2 ¼ a e ij 2 i % i¼1 j¼iþ1 since ðaijp1andjMj¼d jMjÞ  ! Xd Xi1 Xd Xd Xd Xi1 jMjð2d jMjþ1Þ 1 ¼ ; þ aji ei ¼ 2 2 i¼1 j¼1 j¼1 i¼jþ1 i¼1 j¼1  which is the right-hand side of (14). Third, we show that Xd Xd 1 for any pALd ; there is a one-to-one mapping between a ¼ aij ei 2 p-representing vertex of Pd1 and the corresponding i¼1 j¼iþ1 vertex of Cdðd1Þ=2 consistent with p: For ranking p; the Xd Xi1 1 1 1 inequality pðiÞ4pð jÞ means that candidate i ranks þ a e a ¼ a ij 2 i ij 2 2 ji higher than candidate j according to p: The particular i¼1 j¼1  vertex of Cd d 1 =2 consistent with this p is represented Xd Xd ð Þ 1 by the binary choice vector ap whose components are ¼ a e ij 2 i  i¼1 j¼1;jai p 1ifpðiÞ4pð jÞ; Xd Xd Xd Xd aij ¼ d 1 0ifpðiÞopð jÞ: ¼ aij ei 1 ei a 2 a i¼1 j¼1;j i ! i¼1 j¼1;j i After projection, Xd Xd ¼ ðd 1Þei ¼ðd 1Þ1 : p vi ¼ 1 þ a i¼1 ij j¼1;jaXi Therefore, p p ! !¼ 1 þ aij ðsince aij ¼ 0ifpðiÞopð jÞÞ Xd Xd Xd Xd f j : pð jÞopðiÞg z ¼ aij ei þ 1 ¼ aij þ 1 ei: p i¼1 j¼1;jai i¼1 j¼1;jai ¼ 1 þðpðiÞ1Þðsince aij ¼ 1 for each j : pð jÞopðiÞÞ To verify that the zonotope described by the above z ¼ pðiÞ: vector is indeed the d-permutahedron, we proceed to This is to say, the coordinates of the projected point T T show that its components, hereafter denoted vi with an ½v1; v2; y; vd Š ¼½pð1Þ; pð2Þ; y; pðdފ is the p-repre- abuse of notation senting vertex in the d-permutahedron. & Xd 1 Xd vi ¼ 1 þ aij ¼ þ aij; Remark 1. Mnemonically, projection (17) can be 2 j¼1;jai j¼1 written in fact satisfy the system of constraints defining a 2 3 2 32 3 2 3 v a a ? a 1 1 permutahedron. First, consider equality (15): 1 12 22 1d ! 6 7 6 76 7 6 7 6 v2 7 6 a21 a22 ? a2d 76 1 7 16 1 7 Xd Xd Xd 6 7 ¼ 6 76 7 þ 6 7 4 ^ 5 4 ^^^^54 ^ 5 4 ^ 5 vi ¼ 1 þ aij 2 i¼1 i¼1 j¼1;jai vd ad1 ad2 ? add 1 1 dðd 1Þ ¼ d þ ðsince aij þ aji ¼ 1Þ or 2 dðd þ 1Þ 1 ¼ ; v ¼ A1 þ 1: 2 2 ARTICLE IN PRESS 118 J. Zhang / Journal of Mathematical Psychology 48 (2004) 107–134

It conforms8 to the formula in Young (1974) for that map to the same point in the d-permutahedron calculating Borda scores given a binary choice prob- Pd1; this is understandable since the former is of much ability (see also Regenwetter & Grofman, 1998a), but higher dimension than the latter. The center of the cube 1 y there is an important difference. Young’s formula Cdðd1Þ=2; with coordinates aij ¼ 2; ði ¼ 1; 2; d; j ¼ y dþ1 applies to the case where (i) the ranking probability i þ 1; i þ 2; ; dÞ projects to the barycenter b ¼ 2 1 of (voters profile) Pp is known; and (ii) the binary choice Pd1: The ‘‘strength’’ of this induced ranking can be probability is calculated from Pp according to (5). By defined as X applying Young’s formula, the resulting scores are 1 2 consistent with Borda’s idea of assigning a weight of k ðvi vjÞ : d A ð0pkpd 1Þ in accordance with the number of objects i;j ½dŠ k ranked below the focal object. Here, no constraint dþ1 A ð Þ When vi ¼ 2 ; 8i ½dŠ; there is null strength in the has been placed on the binary choice probability vector. induced ranking. For d ¼ 3; null ranking strength Even when it may not be compatible with any ranking occurs if and only if the following tri-cyclic condition probability/voters profile, i.e., when it lies outside of the is satisfied—this is when the binary choice vector Binary Choice Polytope, a binary choice vector still coincides with the axis of projection (see Fig. 3): maps to a unique point in the permutahedron and a12 ¼ a23 ¼ a31: thereby defines a Borda score. Formula (17) can also be viewed as constructing the graph-theoretic ‘‘out-degree’’ For d43; cyclic preference is neither a necessary nor a of each candidate, and reflects the essence of the ‘‘net sufficient condition for null ranking strength. Statistical probability’’ constructed by Regenwetter, Marley, and tests on the strength of a rank vector have been devised Grofman (2002) for arbitrary binary relations. (Feigin & Cohen, 1978, Alvo, Cabilio, & Feigin, 1982; Feigin & Alvo, 1986, Alvo & Cabilio, 1993). Remark 2. Formula (17) provides some useful insights into the property of the so-called ‘‘Condorcet winner’’ in the voting literature. A Condorcet winner is a candidate 2.2.2. Compatibility with the BTL choice model who is preferred to any other candidate in pairwise The previous subsection shows that any binary choice contests. Translated into the current notation, a Con- vector induces a (generalized Borda) score for each X1 A candidate in ½dŠ: the induced score equals the sum of dorcet winner k satisfies akj 2 for all j ½dŠ (and for at least one such j; strict inequality holds). Therefore, probabilities in which a candidate is (pairwise) chosen the associated Borda score satisfies vk4ðd þ 1Þ=2: over other candidates. Here we further investigate This is to say, a Condorcet winner k cannot have the whether this induced score is compatible with other a representations for binary choice, most notably the BTL lowestP BordaP score, for otherwise vkpvi; ðiA½dŠ; i kÞ; d v X d v ¼ v Á d4dðd þ 1Þ=2; contradicting representation (Bradley & Terry, 1952, Luce, 1959). If i¼1 i i¼1 k k T (15). This is a well-known result in the voting commu- the score v ¼½v1; v2; y; vd Š ðvi40; i ¼ 1; 2; y; dÞ is nity. compatible with the BTL representation of the binary choice probability, then Remark 3. Formula (17) provides a connection to vi aij ¼ : another important notion in voting, namely, that of vi þ vj Copeland score (Copeland, 1951, Saari & Merlin, 1996, If our induced score (17) is compatibility with the BTL Merlin & Saari, 1997). A Copeland scoring rule first representation, then a mapping arises G : Pd1-Pd1: assigns, in pairwise contests between a focal candidate i In vector components G : v/½g ; g ; y; g ŠT through A a 1 1 2 d and each other candidates k ½dŠ; k i; a mark of 1; 2; 0if Xd v 1 candidate i beats, ties with, or loses to candidate k: The g ðv ; v ; y; v Þ¼ k þ ; kA½dŠ: ð20Þ CP k 1 2 d v v 2 marks are then summed to yield a Copeland score vi j¼1 k þ j for each candidate i; 8iA½dŠ: Translated into the current 1 The properties of such a map are given in the next notation, aik is allowed to only take the value of 1; 2; 0— formula (17) in this case exactly yields the Copeland corollary. score. Corollary2.4. (i) The map G is ‘‘rank-preserving’’, Remark 4. Note that mapping (17) is surjective (onto): namely, it preserves the ranks of the components of the y T in general there are uncountably many points of vector v ¼½v1; v2; ; vd Š ; in that gipgj if and only if vipvj: (ii) The map G is ‘‘homogenizing’’, namely, it Cd d 1 =2 (i.e., points in the Binary Choice Polytope) ð Þ makes pairwise utility ratios closer to unity:

8 The additional term 1 in (17) is the result of our assigning a mark of gi vi pðiÞ; as opposed to the more conventional pðiÞ1; for candidate i 1 p 1 : under p: See footnote 4. gj vj ARTICLE IN PRESS J. Zhang / Journal of Mathematical Psychology 48 (2004) 107–134 119

(iii) The only fixed point of the map G is the barycenter of in ½dŠ: Part (iii) of Corollary 2.4 implies that the gi’s may 0 the Pd1: not be such viS in general, but are only compatible with the BTL representation, in the sense of (i) and (ii). Proof. From (20), we have

Xd Xd v v 2.3. Connection to the RU representation via spherical g g ¼ i k i k v v v v embedding j¼1 i þ j j¼1 k þ j Xd ðvi vkÞvj 2.3.1. Affine equivalence and the representation of RU on ¼ v v v v the unit sphere j¼1 ð i þ jÞð k þ jÞ As briefly discussed in Section 1.3, the non-parametric Xd v ¼ðv v Þ j : RU representation assumes that each candidate in the i k v v v v j¼1 ð i þ jÞð k þ jÞ choice set ½dŠf1; 2; y; dg is associated with a real- valued random variable (‘‘utility’’) v ; i ¼ 1; 2; y; d: Since all the terms under summation are strictly positive, i The joint distribution of the d random variables v ¼ we conclude that g g and v v have the same sign, T i k i k ½v ; v ; y; v Š is given by the probability density and that v v iff g g : Therefore the transforma- 1 2 d i ¼ k i ¼ k f ðv ; v ; y; v Þ: Under (4), the non-coincidence condi- X X 1 2 d tion G is ‘‘rank preserving’’: gi gk iff vi vk: tion of Block and Marschak (1960), the ranking To prove the homogenizing property of the map G; probability and the density function of the jointly we calculate  distributed RU variables are related through Eq. (2). gi vi gi gk vi Let us now investigate the properties of the ¼ T g v v v g y 1 k k i k k ! region of integration Dp ¼fv ¼½v1; v2; ; vd Š : vp ðdÞ Xd 4vp1ðd1Þ4?4vp1ð1Þg associated with a given p: vk vi vi ¼ Clearly, if vAD is a point in this region, then after a ðv þ v Þðv þ v Þ g p j¼1 i j k j k positive affine transformation v0 ¼ av þ b1; with any  ! Xd a40 and b; the new point also belongs to this region vi vivk 1 0 ¼ 1 : ð21Þ v AD ; this is because v 1 4v 1 4?4v 1 - v g v v v v p p ðdÞ p ð1Þ p ð1Þ k k j¼1 ð i þ jÞð k þ jÞ avp1ðdÞ þ b4avp1ðd1Þ þ b4?4avp1ð1Þ þ b for all Therefore, gi=gk vi=vk and 1 vi=vk have the same a40 and b: This observation motivates us to introduce B sign. Suppose vipvk; or vi=vkp1; then gi=gkXvi=vk: On the notion of ‘‘affine equivalency’’ over the point-set d the other hand, since G has just been proven to be rank- R in which the utility vectors v are defined. Specifically, B 0 preserving, the assumption vipvk also leads to gipgk or v v if and only it there exists a unique a40 and b such 0 9 gi=gkp1: That is to say that v ¼ av þ b1: The affine parameters a; b are defined vi gi on A ¼ð0; NÞÂðN; NÞ: We can thus define a p p1 quotient space,10 Rd = ; to represent all equivalence vk gk A classes of utility vectors vARd : holds whenever v v : Similarly we can show that when ip k The idea of affine equivalency of utility vectors is v Xv ; i k actually rooted in the interval-scale nature of utility v g i X i X1: measurement. The affine freedom ða; bÞ is a natural vk gk consequence of the arbitrariness in choosing the origin Therefore and the unit in assigning utility values (here all utility dimensions are interchangeable therefore necessarily gi vi 1 p 1 : have the same reference zero and measuring unit). So gk vk we may impose, as a normative requirement of the RU The mapping G is making the scores more and more model, the following representational constraint on the homogenous. dþ1 A 9 Finally, vi ¼ 2 ; 8i ½dŠ is easily verified to be a fixed The affine equivalence relationship is easily seen to be: (i) reflexive: point of the map G: To show that it is the only fixed vBv; (ii) symmetric: vBv0-v0Bv; this is because v0 ¼ av þ b1-v ¼ dþ1 ð1=aÞv0 þðb=aÞ1; and (iii) transitive: vBv0; v0Bv00-vBv00; this is point, i.e., to show that gi ¼ vi; 8iA½dŠ leads to vi ¼ ; 2 0 00 0 0 0 - 00 0 0 0 we note from (21) that requiring g ¼ v and g ¼ v for because v ¼ av þ b1; v ¼ a v þ b 1 v ¼ðaa Þv þða b þ b Þ1: i i k k 10 Let X be a set and let B be an equivalence relation on X: The two distinct i; k leads to vi ¼ vk: Therefore all vi’s must equivalence classes due to B form a partition of X into pairwise dþ1 & be equal (and equal to 2 ) for any fixed point of G: disjoint subsets whose union is X: The set of all equivalence classes will be denoted X=B; and is called the quotient set of X modulo the Remark. Given the arbitrary binary choice probability equivalence relation. There is a natural projection aij’s, one often seeks to recover a scale, namely the vi’s g : X-X=B under the BTL representation, on individual candidates such that gðxÞ is the equivalence class of x: ARTICLE IN PRESS 120 J. Zhang / Journal of Mathematical Psychology 48 (2004) 107–134

A d probability density function: permutation vector xpk R as given by (12), its projection along the axis 1 ½1; 1; y; 1ŠT equals f ðv1; v2; y; vd Þ¼gða; bÞ hðu1; u2; y; ud Þ dðd þ 1Þ with ½1; 1; y; 1ŠÁx ¼ p ð1Þþp ð2Þþ? þ p ðdÞ¼ ; pk k k k 2 vi b ui ¼ ; which is a constant (i.e., independent of the particular a k). Therefore, for any point v in the permutahedron, its along with the restrictions projection along 1; 1; y; 1 T ½ Š ! Xd Xd Xd! 2 ui ¼ 0; u ¼ 1 y y i ½1; 1; ; 1ŠÁv ¼½1; 1; ; 1ŠÁ lkxpk i¼1 i¼1 k¼1 so that Xd! P rffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi dðd þ 1Þ dðd þ 1Þ d X ¼ lk ¼ v d 2 2 i¼1 i 2 k 1 b ¼ ; a ¼ ðvi bÞ : ¼ P d i¼1 d! is a constant, since k¼1 lk ¼ 1: This proves that d- Note that the probability density hðu1; u2; y; ud Þ is permutahedron lies in a ðd 1Þ-dimensional subspace, T d2 defined on the unit sphere u ¼½u1; u2; y; ud Š AS ; and the distance of any vertex to the barycenter which can be divided into d ! patches dþ1 y T ( 2 ½1; 1; ; 1Š is  T Xd 2 Xd 2 dp ¼ u ¼½u1; u2; y; ud Š : up1ðdÞ4up1ðd1Þ d þ 1 d þ 1 pð jÞ ¼ j ) 2 2 Xd Xd j¼1 j¼1 2 4?4u 1 ; u ¼ 0; u ¼ 1 : ðd þ 1Þdðd 1Þ p ð1Þ i i ¼ ; i¼1 i¼1 12

Each point uAdp defines an affine-equivalent class of a constant. To prove that it is of dimensionality d 1 A pointsZv Dp: Therefore exactly (and not less), we only need to show that if its projection along a vector c ¼½c ; c ; y; c ŠT is a ? 1 2 d Pp ¼ f ðv1; v2; y; vd Þ dv1 dv2 dvd constant, then that vector must be proportional to 1; ZDp i.e., c1 ¼ c2 ¼ ? ¼ cd : To show, for instance, c1 ¼ c2; ¼ hðu1; u2; y; ud Þ do; consider the projection of two vertices x1 ¼ d T T p ½1; 2; 3; y; dŠ and x2 ¼½2; 1; 3; y; dŠ : That x1 Á c ¼ where the a; b variables have been integrated out, with x2 Á c immediately leads to c1 ¼ c2: Therefore, all ci’s the remaining spherical variables to be integrated on the must be equal. & unit sphere (here do denotes the surface element of d2 d2 S ). Parametric models of hðu1; u2; y; ud Þ on S can Remark 1. Proposition 2.5 shows that the d ! vertices of be prescribed—a prominent one being the von-Mises– a permutahedron are equal-distant to its barycenter. Fisher distribution, which belongs to an exponential This suggests that a properly scaled d-permutehedron family. can be circumscribed by the unit sphere Sd2: For To summarize, based on considerations of the example, the hexagon (in Fig. 2a) is circumscribed by a interval-scale nature of utility vectors, we converted circle ðS1Þ; whereas the 4-permutahedron (in Fig. 2b)is the RU model into an affine-equivalent representation circumscribed by a sphere (S2). Therefore, the d ! on the unit sphere. vertices of the permutehedron serve as landmarks of d2 S on which the probability density hðu1; u2; y; ud Þ is 2.3.2. Vertices of a permutahedron as spherical defined. The vertex p ¼ /pð1Þpð2Þ?pðdÞS corresponds landmarks to equally spaced u-values:

As the next step, we follow the approach of up1ðdÞ up1ðd1Þ ¼ up1ðd1Þ up1ðd2Þ McCullagh (1993) to relate the unit sphere Sd2 to the ¼ ? ¼ up1ð2Þ up1ð1Þ: ð22Þ permutahedron Pd1: To do so, the following property of a permutahedron is reviewed. Solving for the ui’s yields pffiffiffi j d Adpermutahedron is a d pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi3ð2 ð þ 1ÞÞ Proposition 2.5. - Pd1 ð 1Þ- up1ð jÞ ¼ : dimensional object. All of its d ! vertices are equidistant to dðd þ 1Þðd 1Þ T dþ1 y d2 its barycenter 2 ½1; 1; ; 1Š : With this set of vertices, most points on S can be classified, using the spherical distance metric, as Proof. Recall the definition of a permutahedron (13), belonging to a certain closest vertex—each vertex p ¼ using permutation vectors as its vertices. For any /pð1Þpð2Þ?pðdÞS has a region of ‘‘attraction’’ dp; i.e., ARTICLE IN PRESS J. Zhang / Journal of Mathematical Psychology 48 (2004) 107–134 121 a neighborhood within which all points are closer to this ability. First informally suggested by Robert T. Weber, vertex than to any other vertex. The only exceptions are it was popularized as a viable alternative voting the points on the boundaries of such regions—they mechanism in a number of articles by Brams and occur when at least two (but not all d) candidates have Fishburn (e.g., 1978, 1983). To date, AV mechanism same utility values, i.e., are tied in a weak order. Each was or has been adopted by several professional region of attraction turns out to be the intersection of an societies in their electoral process, and its effectiveness unbounded, polyhedral cone (whose apex is at the center was analyzed using the outcomes of 10 real-life elections of the sphere) with Sd2; the sphere Sd2 itself is then (Regenwetter & Grofman, 1998b). Once AV ballots are divided, by these d ! polyhedral cones, into d ! patches collected, a probability distribution PS (in terms of dp; pALd : relative frequencies) over all subsets SD½dŠ is specified. In the common tally procedure of AV (Brams & Remark 2. Three observations of this spherical repre- Fishburn, 1978), the accumulation of votes (across the sentation can be made. First, the Block and Marschak voting population) in favor of each candidate is (1960) condition (4) is equivalent to stating that the according to the total number of times a candidate has boundaries of d’s have measure zero in probability. appeared in the subsets of which he is a member/ Second, ranking probabilities can be viewed as a element. The votes (AV score) for candidate i is counted special kind of RU model in which the probability by summing across the probability distribution PS over measure is concentrated on the d ! isolated points all subsets S X on the sphere Sd2; this is exactly Block and Marschak’s v ¼ P m ðSÞ; (1960) construction of RU density function from a i S i SD½dŠ given ranking probability. Third, for any RU distribu- tion, the ranking probability, induced according where miðSÞ is the membership function  to (2), can be viewed as individual vertices (linear 1ifiAS; orders) of the permutahedron ‘‘pulling’’ the probability miðSÞ¼ 0 otherwise: measure that are ‘‘spread’’ in respective neighborhoods towards their centers; the boundary of attraction is In other words, each subset S (approved by a voter) given by setting up1ðkþ1Þ ¼ up1ðkÞ for one of the k ¼ contributes to the score vector by an amount T 1; 2; y; d 1; this is in accord with the geometric ½m1ðSÞ; m2ðSÞ; y; md ðSފ : We call the total score description of Borda’s rule of converting Borda accumulated according to this ‘‘vanilla’’ AV method BF T scores to a ranking assuming no ties (Cook & Seiford, the Brams–Fishburn score, v ¼½v ; v ; y; vd Š : 2 3 1 2 1982). m1ðSÞ X 6 7 6 m2ðSÞ 7 Remark 3. It is worth mentioning that the spherical vBF ¼ P 6 7: ð23Þ S4 ^ 5 representation of the preference space derived here is in SD½dŠ the same spirit as the topological approach to preference md ðSÞ aggregation in the social choice and welfare context One could, of course, implement other tally procedures. (Chichilnisky, 1980, Chichilnisky & Heal, 1983, Bar- An obvious alternative is to inversely weigh the yshnikov, 1993, Heal, 1997). However, since the utility importance of each subset by the percentage of the vectors associated with the null-preference v ¼ v ¼ 1 2 total voters choosing subsets equal to its size: ? ¼ vd map to the barycenter of the d-permutahedron, 2 3 the space of utility vectors (modulo the affine equiva- m1ðSÞ d2 X 6 7 lence) should be, strictly speaking, S ,f0g: The P 6 m2ðSÞ 7 vSI ¼ S 6 7; ð24Þ inclusion of the null-preference point introduces non- f ðjSjÞ4 ^ 5 trivial topological consequences for the existence of a SD½dŠ m ðSÞ proper social choice function. An elaboration on this d topic will be given by another paper (Jones, Zhang, & where f ðkÞ is given by (7). As we shall see (Section Simpson, 2003). 2.4.6), this tally procedure is related to the SI model of AV, and hence hereby referred to as the SI score.In essence, it encourages a voter to choose an unpopular 2.4. Connection to AV via monotone path polytope set-size in order to increase the importance of his/her vote. 2.4.1. Approval voting Apart from these different tally procedures, research- AV is a mechanism of social choice through which ers have also been interested in underlying models of each voter selects or approves of, from a master set of voters’ preferences that generate the AV ballots. For candidates ½dŠ; a subset of individuals SD½dŠ who instance, one may imagine that approving a particular presumably are above a voter’s ‘‘threshold’’ of accept- subset S amounts to equal-probabilistically choosing all ARTICLE IN PRESS 122 J. Zhang / Journal of Mathematical Psychology 48 (2004) 107–134 linear orders whose top jSj elements are precisely the nested subsets elements of S (hereafter referred to as the ‘‘equal- |Cfp1ðdÞgCfp1ðdÞ; p1ðd 1Þg probability model’’). On the other hand, one may 1 1 1 hypothesize a two-stage process, as in the SI model, C?Cfp ðdÞ; p ðd 1Þ; y; p ð1Þg  ½dŠ: and seek to uncover, from the given data PS; an We denote Lðp; kÞ; namely the kth ‘‘leading set’’ of p; as underlying probability distribution Pp over all linear the set of candidates at the k top ranks associated with a orders pALd that is ‘‘consistent’’ with PS; in the sense given p of (6). While in the former case there is a unique Lðp; kÞ¼fp1ðdÞ; p1ðd 1Þ; y; p1ðd k þ 1Þg probability distribution Pp over linear orders that can be inferred from ballots, in the latter case, there is no ð25Þ guarantee that such a Pp will ever be found for an for 1pkpd with jLðp; kÞj ¼ k; and arbitrary P : The condition under which this is possible S | is called the characterization problem for AV, whose Lðp; 0Þ¼ : solutions, in the space of all subset choice probabilities These nested subsets induced by p simply obey P ; form a polytope, the so-called AV Polytope S Lðp; 0ÞCLðp; 1Þ?CLðp; dÞ (Doignon & Regenwetter, 1997; Doignon & Regenwet- ter, 2002; Doignon & Fiorini, to appear). as more and more candidates are included, based on One interesting question arises: how are AV scores their desirability as determined by p: Geometrically, i.e., (the Brams–Fishburn score versus the SI score) related using vertices to represent subsets, this sequence can be to the AV models (the ‘‘equal-probability’’ model versus seen as defining a ‘‘path’’ associated with p; denoted sp; the SI model)? Though the former is a matter of tally that starts from the lower-left vertex of the cube Cd and procedures while the latter latent preference models, it travels along its edges all the way up to its upper-right will be shown (Sections 2.4.4 and 2.4.6) that AV scores vertex: exactly equal Borda scores of the latent probability sp : 0  spð0Þ-spð1Þ-spð2Þ-?-spðdÞ1; ð26Þ distribution of linear orders under the corresponding p models. This again demonstrates the important role where each s ðkÞ is the vertex corresponding to the k- played by Borda scoring method (and the permutahe- element set Lðp; kÞ dron) in linking a tally procedure with a voter preference Xk p y model in the case of AV. s ðkÞ¼ ep1ðdiþ1Þ; k ¼ 1; 2; ; d: i¼1 To formalize this intuition under the framework of monotone path polytope, construct the following special 2.4.2. Permutahedron as a monotone path polytope projection g from C to R1 along the diagonal axis It is known (Ziegler, 1995) that permutahedron can be d 1; 1; y; 1 T : realized as a ‘‘monotone path polytope’’ arising from the ½ Š lift-up of the projection from a cube to a line segment. Xd d - / y The mathematical details of monotone path polytopes g : ½0; 1Š ½0; dŠ; x ½1; 1; ; 1ŠÁx ¼ xi: ð27Þ (and fiber polytopes in general) are given in Appendix i¼1 A. Here we give the motivation of considering such a Here the range ½0; dŠCR1 of the projection is actually a projection, and then reproduce this conclusion (Theo- line segment ½0; dŠ (see Fig. 4)11. When the aforemen- rem 2.6) and its proof due to Ziegler. Since the tioned path sp is projected via g onto line segment construction is crucial in linking ranking probability to (see Fig. 4), points that are increasingly farther a subset choice model, the necessary mathematical facts away from the origin map onto increasingly larger are reviewed here using a language familiar to research- values between 0 and d—the vertex corresponding to ers of probabilistic choice. the set Lðp; kÞ maps to the point kA ½dŠ,f0g: Fig. 4 d d Let Cd ¼½0; 1Š CR be the d-dimensional unit shows an example for d ¼ 3; where the darker line cube with the lower left corner aligned with the indicates the path corresponding to p ¼ /213S; so that origin. There is a natural correspondence between the Lðp; 1Þ¼f3g; Lðp; 2Þ¼f1; 3g; Lðp; 3Þ¼f1; 2; 3g; and p p p vertices of Cd and the subsets of ½dŠ to be constructed s ð1Þ¼e3; s ð2Þ¼e1 þ e3; s ð3Þ¼e1 þ e2 þ e3: d The projection g : Rd {C -½0; dŠAR1 induces a so- as follows.2 Denote3 the base vectors of R as d T called fiber polytope (see Appendix A), which necessa- 4 5 rily is of dimension d 1; and which lives in the fiber ei ¼ 0; y; 0; 1; 0; y; 0 for iA½dŠ: A vertex of the |fflfflffl{zfflfflffl} |fflfflffl{zfflfflffl} orthogonal to the projected dimension and originating i1 di P cube Cd ; represented as a vector of the form iAS ei; 11 Note that in the graph, the cube’sp sidesffiffiffi have lengths 1=d and the is identified with the subset SD½dŠ itself. Recall that projection is along the vector c ¼ 1= d½1; 1; y; 1Š; but the resulting any ranking (ordering) p induces a sequence of algebraic relationship (27) still holds. ARTICLE IN PRESS J. Zhang / Journal of Mathematical Psychology 48 (2004) 107–134 123

1 þ spðdÞ 2 γ (−1) (x) ¼ d ep1ðdÞ þðd 1Þep1ðd1Þ þ ? þ 2 ep1ð2Þ e 2 1 e . 3 þ e 1 1 p ð1Þ 2 e1 Xd e2 1 ¼ ke 1 1 p ðkÞ 2 k¼1 Xd e e + e + e = 1 1 3 1 2 3 ¼ pðkÞ ek 1 0 2 2k¼1 3 2 3 pð1Þ 1 P = C 6 7 6 7 d 6 pð2Þ 7 16 1 7 e1 + e3 ¼ 6 7 6 7: 4 ^ 5 24 ^ 5 e1 γ p d 1 γ γ ð Þ This established a one-to-one correspondence of a p- Q = [0,d] 0123 defined path in Cd to a p-representing vertex of Pd1— x the fiber polytope associated with projection (27) is & Fig. 4. The d-permutahedron as a fiber polytope, i.e., resulting from indeed a d-permutahedron. the construction of monotone paths associated with the projection of a cube of dimension d to a line segment ½0; dŠ: Here, for d ¼ 3; the thicker line represents a particular linear-order path (corresponding to 2.4.3. Subset choice probability and facets of a /213S; represented as a dot in the inset), while the vertical plane cutting across the cube in the middle is shown in the inset to contain permutahedron the proper fiber polytope—a hexagon. The fact (Theorem 2.6) that the fiber polytope induced by projection (27) turns out to be the d- permutahedron is of special significance in linking the from gðr0Þ where r0 is the barycenter of Cd : It turns out subset choice paradigm to ranking probability. First, as d that this resulting fiber polytope is in fact a d- mentioned earlier, the 2 vertices of a d-cube Cd permutahedron. are in one-to-one correspondence with all subsets of ½dŠ; it is easy to envision a d-dimensional vector, T Theorem 2.6 (Ziegler, 1995, p.303). The fiber polytope ½m1ðSÞ; m2ðSÞ; y; md ðSފ ; whose binary-valued vector associated with the diagonal projection (27) from a component indicates whether or not an element of ½dŠ C d C 1 d-cube Cd R to a line segment ½0; dŠ R is a has been included in the subset SD½dŠ: Therefore Cd is a d-permutahedron. natural representation for the subset choice paradigm, where the subset choice probability PS can be defined on Proof. According to its definition, the fiber polytope is the subsets SD½dŠ; which are vertices of Cd : Second, the the collection of points whose coordinates are given by surjective projection (27) of Cd onto the line segment integral (A3) for all possible (horizontal) sections s of ½0; dŠ collapses subsets according to their set-sizes—all Cd with respect to the projected line segment. To prove subsets of a given size jSj¼k project to the same that the monotone path polytope resulting from the (integer-valued) point kA½dŠ,f0g; so the probability specific projection (27) is indeed a permutahedron, one distribution over set-size f ðkÞ; k ¼ 0; 1; y; d; can be only needs to calculate the coordinates corresponding to defined, on the integer support along the line-segment the monotone paths that consist of extreme points of the ½0; dŠ: Third, a monotone path sp on the boundary of the T cube, i.e., its edges. Given a ranking p; an explicit cube Cd ; which starts from the vertex ½0; 0; y; 0Š and calculation of the coordinates associated with the path ending at the vertex ½1; 1; y; 1ŠT ; generates a sequence p s (26) is given below, which is adapted from Ziegler of leading sets Lðp; kÞ of p: (1995, p. 303): Z 1 d Lðp; kÞ¼Lðp; k 1Þ,fp ðd k þ 1Þg: spðxÞ dx 0 The gradual inclusion of elements into Lðp; kÞ as k ¼ 1 Xd 1 jLðp; kÞj increases depends on the ordering p ; it always ¼ ðspðk 1ÞþspðkÞÞ starts with the most desired object p1ðdÞ; followed by 2 k¼1 the second most desired one p1ðd 1Þ; so on. Since p p p 1 p p p p each such path s ð1Þ-s ð2Þ-?-s ðdÞ uniquely ¼ s ð0Þþs ð1Þþs ð2Þþ? þ s ðd 1Þ 2 represents a linear order, it is called a linear-order path. ARTICLE IN PRESS 124 J. Zhang / Journal of Mathematical Psychology 48 (2004) 107–134

Ranking probability Pp can be defined on the d! linear- which conforms to the formula for f -numbers (Eq. (16), order paths. for j ¼ 1). These considerations provide all necessary ingredients For example, when d ¼ 4 (see Fig. 2b), the for a concise description and characterization of subset subset S ¼f1; 3; 4gCf1; 2; 3; 4g is associated with a choice models (in connection with ranking models). particular vertex e1 þ e3 þ e4 of C4: Each ranking Consider a particular vertex VSACd corresponding to a that is compatible with S in placing candidates ‘‘1’’, given subset SC½dŠ with set-size jSj¼k; ð1pkpd 1Þ: ‘‘3’’, ‘‘4’’ in front of ‘‘2’’ (i.e., pð2Þ¼1) has the 1 The vertex VS is en route of many linear-order paths of general form p ¼ / 1 S or p ¼ð2Þ; the set Cd ; in fact those paths that correspond to all such linear of permutations PS ¼f/4132S; /4123S; /3124S; orders that rank elements of S ahead of elements of /2134S; /2143S; /3142Sg¼fð1342Þ; ð1432Þ; ð4132Þ; ½dŠ\S; i.e., those p’s whose kth leading set Lðp; kÞ equals ð4312Þ; ð3412Þ; ð3142ÞgCLd corresponds to all paths of T S: This vertex (of the cube) defines a bipartition of the Cd passing through e1 þ e3 þ e4 ¼½1; 0; 1; 1Š with set ½dŠ into two disjoint subsets that are set-wise ordered monotonically increasing path-lengths. These paths for desirability; hence, from Section 2.1.2, it corresponds map onto the vertices (of Pd1) that define the facet to a unique facet of the permutahedron Pd1; hereby Ff1;3;4g: Unfortunately, it is not possible to visualize the denoted FS: There are d!=ððd kÞ!k!Þ vertices (of Pd1) projection g for d ¼ 4 in the same way we did for d ¼ 3 on this facet, each corresponding to one of the linear- in Fig. 4. order paths on the cube Cd constrained to pass through VS: All such linear orders pALd satisfy S ¼ Lðp; kÞ; or 2.4.4. Brams–Fishburn score of AV ballots and the equal- probability model 1 1 y 1 S ¼fp ðdÞ; p ðd 1Þ; ; p ðd k þ 1Þg: The above combinatorial–geometric considerations Following Doignon and Regenwetter (1997), denote the hint at some intimate connections between the AV set of all rankings/permutations whose leading set (as paradigm and the ranking paradigm, along with their scoring rules (Bram–Fishburn score for the former and defined by (25)) equals a given S as PS: Borda score for the latter). Specifically, the event of PS ¼fpALd : Lðp; kÞ¼S; k ¼jSjg: choosing a particular subset (a vertex of the d-cube, a facet of the d-permutahedron) and the event of ranking In other words, PS; which is a subset of Ld ; contains all candidates in a particular linear order (a monotone path those p’s that rank any candidate belonging to S more of the d-cube, a vertex of the d-permutahedron) can be favorably than any one not belonging to S: Each made equivalent in terms of the scores they generate element of PS; which is a ranking and which at the same over the set of candidates. time represents a linear-order path, corresponds to a certain vertex of the facet FS of the permutahedron. Theorem 2.7 (Connection of the Brams–Fishburn score Note that a vertex of the permutahedron Pd1 and the Borda score). In an AV paradigm, if the corresponds to a ranking p; whereas a vertex of the approval of a subset SD½dŠ of candidates amounts to cube Cd corresponds to a subset S of ½dŠ; one should not assigning equal probability to all rankings PSDLd confuse the two. On the other hand, p corresponds to a consistent with S; i.e., to those rankings which place the path of the d-cube, while S corresponds to a facet of approved candidates in front of the non-approved ones, Pd1—the facet in fact represents the convex hull of all then Brams–Fishburn score (23) is equivalent, up to an monotone paths of Cd constrained to pass one of its (the affine transform, to computing the Borda score for each cube’s) vertex. All vertices of Cd ; organized in terms of candidate that is consistent with the latent ranking the number of steps away from the origin, give a total probabilities of the voters. count of    Proof. For any subset SD½dŠ with 0pjSj¼kpd; the d d d d þ þ ? þ ¼ 2 2: corresponding set of consistent rankings is P C : 1 2 d 1 S Ld Assuming equal probability among the total of k!ðd d The 2 2 facets of the permutahedron are organized in kÞ! rankings in PS; the Borda score due to S is computed the following way: d facets each with ðd 1Þ! edges, as dðd 1Þ=2 facets each with 2ðd 2Þ! edges; y; 2 3 pð1Þ d!=ðk!ðd kÞ!Þ facets each with k!ðd kÞ! X 6 7 edges; y; and d facets each with d 1 ! edges. The 1 6 pð2Þ 7 ð Þ vBdðSÞ¼ 6 7: total number of edges is k!ðd kÞ! 4 ^ 5 pAPS 1 pðdÞ ðd Áðd 1Þ! þ dðd 1Þ=2 Á 2ðd 2Þ! 2 Denote the ith component of vBdðSÞ as vBdðSÞ: If iAS; ? i þ þ d Áðd 1Þ!Þ¼d=2 Á d!; then pðiÞAfd; d 1; y; d k þ 1g: The sum of the ARTICLE IN PRESS J. Zhang / Journal of Mathematical Psychology 48 (2004) 107–134 125 k! ðd kÞ! terms of pðiÞ; as p runs through PS; equals rankings (linear-order paths on the d-cube) that are ! compatible with the chosen subset. The subset choice Xd 2d k þ 1 probability P will induce a ranking probability P ; as ðk 1Þ! j Áðd kÞ! ¼ k!ðd kÞ! ; S p 2 stated in the next corollary (note that with an abuse of j¼dkþ1 notation, we use the same symbol P for ranking A for i S: probability and for subset choice probability). If ieS; then pðiÞAf1; 2; y; d kg; and the sum of the k! ðd kÞ! terms of pðiÞ equals Corollary2.8 (Latent profile under equal-probabilistic ! model). The probability distribution over subsets PS in Xdk AV, under equal-probability model, induces a prob- d k þ 1 ðd k 1Þ! j Á k! ¼ k!ðd kÞ! ; ability distribution over rankings P ; under Borda 2 p j¼1 scoring, through: for ieS: Xd Pp ¼ PLðp;kÞ Therefore k¼0 Xd 2d k þ 1 d k þ 1 1 1 vBdðSÞ¼ m ðSÞþ ð1 m ðSÞÞ ¼ P| þ i i i d! k!ðd kÞ! 2 2 k¼1 d d k þ 1 ¼ m ðSÞþ :  Pfp1ðdÞ;p1ðd1Þ;y;p1ðdkþ1Þg: ð29Þ 2 i 2 Note that the coefficient before m ðSÞ is independent of i Proof. The weighting factor for each eligible subset, k: Summing over SD½dŠ X 1=ðk!ðd kÞ!Þ; is according to the multiplicity of all Bd Bd permutations consistent with a given subset of size k: vi ¼ PS vi ðSÞ SD½dŠ Summing up all contributing Lðp; kÞ gives (29). & yields Remark. Intuitively, the ‘‘support’’ for a ranking p is 0 1 !contributed to by all subsets S with which p is consistent X Xd d d þ 1 1 in the sense that all S’s are leading sets of p: The support vBd ¼ @ P m ðSÞA þ kf ðkÞ ; i 2 S i 2 2 for p is accumulated along the path of its leading sets SD½dŠ k¼1 with increasing set-size ð28Þ |  Lðp; 0ÞCLðp; 1ÞC?CLðp; dÞ½dŠ; where f k denotes the probability distribution over ð Þ P where Lðp; kÞ is defined in (25). From Corollary 2.2, subset size k as given by (7), with d f ðkÞ¼1: This k¼0 each Lðp; kÞ (for any pALd and 1pkpd) defines a facet shows that the Borda score vBd for candidate i is i of Pd that contain k!ðd kÞ! vertices. Since the p- equivalent to the Brams–Fishburn score vBF in (23), 1 i representing vertex of Pd is the intersection of d 1 apart from an affine transform. & 1 such facets (corresponding to k ¼ 1; 2; y; d 1 respec- P P tively), each term in summation of (29) is indeed a d Remark 1. The value k¼1 kf ðkÞ¼ SD½dŠ PS ÁjSj is contribution from those facets to their common vertex. the average subset size chosen by the voting population. It is uniformly subtracted from the scores of all 2.4.5. SI model and the core of the AV polytope candidates, and therefore will not affect their Borda Under the SI model of subset choice (Falmagne & ordering. Regenwetter, 1996, Doignon & Regenwetter, 1997), voters are assumed to have (i) an explicit profile or Remark 2. Theorem 2.7 says that approving a particular probability distribution Pp of all rankings p over ½dŠ; subset (under Brams–Fishburn scoring) is equivalent to and independently (ii) a probability distribution f ðkÞ several possible rankings (using Borda score) chosen in a over the size of the subset (k ¼ 0; 1; y; d) that is to be way that is consistent with the approved subset and approved of. The subset choice probability PS is related without bias; here ‘‘consistency’’ is taken to mean that to the ranking probability Pp and the size probability the chosen k candidates rank higher than the non- f ðkÞ through (6). chosen d–k candidates, and ‘‘without bias’’ means The SI model proposes to group all subsets according equal-probabilistically. This gives the geometric inter- to their sizes; geometrically, this amounts to grouping pretation of the Brams–Fishburn tally procedure, i.e., the vertices of Cd according to their image of projec- each choice of a subset (vertex of the d-cube) SD½dŠ tion on the line-segment ½0; dŠ: For each k; any linear- p amounts to an equal-probabilistic assignment of all order path s (for pALd ) has to include one of the ARTICLE IN PRESS 126 J. Zhang / Journal of Mathematical Psychology 48 (2004) 107–134 d!=ðk!ðd kÞ!Þ vertices that project to kA½0; dŠ: The of the SI model (6) yields fundamental assumption of the SI model (6) is X Xd 1 that the sum of the probabilities of all rankings PS ¼ f ðjSjÞ PLðp;lÞ A l!ðd lÞ! compatible with a chosen subset equals, after normal- fp Ld : Lðp;jSjÞ¼0Sg l¼0 1 ization by the set-size probability f ðkÞ; the subset Xd X 1 @ A probability PS: ¼ f ðjSjÞ P : l!ðd lÞ! Lðp;lÞ The SI model for subset choice differs from the equal- l¼0 fpALd : Lðp;jSjÞ¼Sg probability model of AV in two important aspects: (i) While the latter can always induce a probability With S fixed, the summation over p is performed distribution of ranking from an empirically observed for those rankings having their top k jSj elements probability distribution over subsets (namely AV as those in S; and the summation over set-size l ballots), the former assumes the existence of a is performed for variable-size leading sets of a given, C probability distribution of ranking from which the eligible p: When lok; since Lðp; lÞ Lðp; kÞ holds A empirically observed probability distribution of for any p Ld ; the summation over p in the paren- subsets (AV ballots) are related to; (ii) While the theses above only includes terms PS0 for which 0C latter attributes equal probability to all paths S Lðp; kÞ¼S X X (rankings) compatible with a given subset, the former P ¼ w 0 P 0 ; does not require (and indeed often violates) the Lðp;lÞ S S fpAL : Lðp;jSjÞ¼Sg fS0 : S0CS;jS0j¼lg equal-probability interpretation of compatible paths d

(rankings). where wS0 denotes the multiplicity of an l-element subset The existence of a latent ranking probability distribu- S0CS that can be produced by an eligible p satisfying (i) tion is by no means automatically guaranteed for any Lðp; kÞ¼S and (ii) Lðp; lÞ¼S0: A straightforward AV ballots—the necessary conditions are characterized combinatoric consideration gives wS0 ¼ l!ðk lÞ!ðd by the AV polytope (Doignon & Regenwetter 1997, kÞ!: Therefore 2002, Doignon & Fiorini, to appear). Here we further X investigate necessary conditions on the subset choice PLðp;lÞ A probability PS such that the SI model would yield the fp Ld : Lðp;jXSjÞ¼Sg same ranking probability as the equal-probability ¼ l!ðd lÞ!ðd kÞ!PS0 : model. The region within the AV Polytope that is fS0 : S0CS;jS0j¼lg compatible with the equal-probability model is called the core of the AV Polytope. Similarly, for l4k; one may derive X Theorem 2.9 (Core of the AV polytope). The subset PLðp;lÞ fpALd : Lðp;jSjÞ¼Sg probability PS consistent with both the SI and the equal- X probability interpretations of AV satisfies the following ¼ k!ðl kÞ!ðd lÞ!PS0 : simultaneous equations: fS0 : S0*S;jS0j¼lg 0 X Finally, for l ¼ k; the only term that survives the @ 0 0 PS ¼ f ðjSjÞ PS þ r1ðS ; SÞPS0 summation is S ¼ S: Putting these results together fS0:S0CSg yields the desired formula (30). & 1 X 0 A þ r2ðS ; SÞ PS0 ; ð30Þ Remark 1. Note that the system of simultaneous fS0 : S0*Sg equations (30) is linear in PS: The total number of independent constraints on P ; according to (30), is where S; S0 are subsets of ½dŠ; S 2d ðd þ 1Þþ1 ¼ 2d d; ðjSjjS0jÞ!ðd jSjÞ! 0 0C P r1ðS; S Þ¼ for S S; y ðd jS0jÞ! this is becauseP at each k ¼ 0; 1; ; d; fS : jSj¼kg PS ¼ f k ; while d f k 1 (resulting in over-counting of ðjS0jjSjÞ! ðjSjÞ! ð Þ k¼0 ð Þ¼ r ðS; S0Þ¼ for SCS0; the number of constraints). The total number of 2 jS0j! independent components of PS is and f ðkÞ is the probability over set-size given by (7). 2d 1 ð2d dÞ¼d 1; Proof. According to Corollary 2.8, under the equal- which is the dimension of the d-permutahedron! There- probability model, the induced ranking probability Pp is fore, the region within the AV polytope satisfying (30) related to the probability distribution over subsets PS forms the core of the probability space associated with through (29). Requiring PS to also fulfill the condition the subset choice paradigm. ARTICLE IN PRESS J. Zhang / Journal of Mathematical Psychology 48 (2004) 107–134 127

0 Remark 2. Note that the reciprocal of r1ðS; S Þ and of size (since a voter may be contented with the idea that r ðS; S0Þ the same amount of total effective votes will be cast 2   d jS0j jS0j regardless how many candidate he/she chooses): r1 ¼ ; r1 ¼ 2 3 2 3 1 d jSj 2 jSj m1ðSÞ 1 6 7 6 7 d6 m2ðSÞ 7 d jSjþ16 1 7 equal the number of ways of choosing d jS0j items out vModðSÞ¼ 6 7 þ 6 7: ð32Þ 4 ^ 5 4 ^ 5 of d jSj items (jS0jojSj), or choosing jSj items out of 2 2 0 0 0 jS j items (jS j4jSj), respectively. Therefore, r2ðS; S Þ md ðSÞ 1 can be interpreted as the probability that those voters This rule, in comparison to the Brams–Fishburn score choosing S0 would also have chosen SCS0 if they were 2 3 0 m S restricted to choose a smaller subset, while r1ðS; S Þ can 1ð Þ be interpreted as the probability that those voters not 6 7 BF 6 m2ðSÞ 7 %0 0 v ðSÞ¼6 7; ð33Þ choosing S ¼½dŠ\S would not have chosen ½dŠ\S ¼ 4 ^ 5 S%CS%0 either even if they were to enlarge the approved md ðSÞ subset. Under this interpretation, the constraints (30), P after a recast d Mod makes i¼1 vi ¼ dðd þ 1Þ=2; regardless of the num- ber of candidates (jSj¼k) being approved of. Under P PSP 0 0 this situation, the Brams–Fishburn score on the AV 0C r1ðS ; SÞPS0 þ 0* r2ðS ; SÞPS0 S S S S ballots is identical to the Borda score on the latent voters f ðjSjÞ P profile (under equal-probability interpretation of each ¼ d k¼0;kajSj f ðkÞ chosen subset). f jSj The SI model, on the other hand, may also induce a ¼ ; ð31Þ 1 f ðjSjÞ ranking probability distribution under very stringent conditions (stated as conditions for AV Polytope). can be viewed as some kind of consistency requirement When such a ranking probability does exist, the Borda on subset probability if the set-size is a truly irrelevant score on each candidate can be calculated using a factor in determining the aggregated preference (in formula developed in Regenwetter and Grofman (1998a, terms of Borda score over the candidates, see Section Theorem 5), which is reproduced here 2.4.1). It gives the condition under which the Brams– X P Fishburn score and the SI score become equivalent. vSI ¼ S m ðSÞ: ð34Þ i f ðjSjÞ i SD½dŠ The contribution to the Borda scores by SD½dŠ is (for 2.4.6. Comparing the SI score and the Brams–Fishburn f ðjSjÞa0) score 2 3 In calculating the Brams–Fishburn score (23), total m1ðSÞ 6 7 votes for each candidate are tallied according to how 1 6 m2ðSÞ 7 vSIðSÞ¼ 6 7; ð35Þ many times a candidate is included among all approved f ðjSjÞ4 ^ 5 subsets. As shown in Section 2.4.4, this vote counting m ðSÞ rule is equivalent, up to an affine transform, to the rule d of calculating Borda scores, provided that the choice of and vSIðSÞ¼0iff ðjSjÞ ¼ 0:12 Compared with (23), it is a subset is interpreted as an equal-probabilistic approval obvious that the only (and significant) difference of all linear orders consistent with the chosen subset. In between the Brams–Fishburn score (equal-probability this sense, AV scores (arising from subset choice model) and the SI model (SI model) is with respect to paradigm) are said to be equivalent to Borda scores the weights individual subsets carry. In the former case (arising from ranking paradigm). (33), a candidate’s appearing (or non-appearing) in a Under the Brams–Fishburn tally procedure, any subset constitutes a vote of 1 count (or 0 count), whereas chosen subset S contributesP towards the entireP candidate in the latter case (35) a candidate’s appearing (or non- d BF d 1 pool a total vote count of i¼1 vi ðSÞ¼ i¼1 miðSÞ¼ appearing) constitutes f ðjSjÞ count (or 0 count). The SI jSj: In Borda scoring, any chosen linear order con- score appears to have given too much weight to those Ptributes towards the candidate pool a total count of voters who happen to choose an unpopular set-size d i¼1 i ¼ dðd þ 1Þ=2: To truly equate vote distribution under the approval voting procedure with that under 12 In Theorem 5 of Regenwetter and Grofman (1998a), the summation does not include the S ¼½dŠ term. However, they define Borda scoring (of the induced voters profile), one T Mod Borda scores using ½d 1; d 2; y; 0Š as weighting function, rather would use a modified scoring rule v ðSÞ¼ than ½d; d 1; y; 1ŠT used in this article, see footnote 4. As a result, Mod Mod T ½v1 ðSÞ; y; vd ðSފ ; which may give some sense of (35) is a valid formula for calculating Borda scores under the current balancing the contributions from subsets of different set- definition so long as the summation over S includes the master set ½dŠ: ARTICLE IN PRESS 128 J. Zhang / Journal of Mathematical Psychology 48 (2004) 107–134 2 3 2 3 (small f ðjSjÞ), whereas the Brams–Fishburn score gives 001 001 equitable contribution for each voter. 6 7 6 7 O/312S ¼ 4 1005; O/321S ¼ 4 0105: When the subset probability PS falls within the core of the AV polytope, then the latent rank probability 010 100 distribution Pp (voters profile) is consistent with both Since permutation matrices ðOpÞ are bistochastic ma- the SI interpretation and the equal-probability inter- trices Bd in their extreme forms, one might ask whether pretation. In this case, the Brams–Fishburn tally every bistochastic matrix can be represented in this way procedure and the SI tally procedure yield the same (36), i.e., as a convex combination of permutation score—the Borda score over one and the same Pp: This matrices (and therefore corresponds to a point in the occurs when the subset size is truly irrelevant, see Birkhoff Polytope). Since the ranking probability Pp Eq. (31) under Remark 2 (Section 2.4.5). itself can be viewed as the coefficients for this convex combination, an equivalent statement of this character- 2.5. Connection to the rank-position probability izationX problem, in the languageX of choice paradigms, is b ¼ P ðO Þ ¼ P : ð37Þ Finally, for completeness of this exposition, the ij p p ij p pALd fpAL : pðiÞ¼jg connection between the permutahedron and the rank- d matching paradigm will be discussed. A rank-matching Given Pp; one readily obtains bij through margin- (also called rank-assignment) paradigm involves the alization. Given bij’s, are there any constraints required assignment of rank-positions 1; 2; y; d to the d candi- of bij for such Pp to exist (as in the characterization dates. Let bij be the amount (fraction) of jth rank- problems for binary choice and for subset choice)? position assigned to candidate i; Pwhich satisfies (i) bijX0 It turns out that the answer is ‘‘no’’: any bistochastic matrix (rank-position probability b ) can be represented (positivity of assignment);P (ii) i bij ¼ 1 (each rank- ij as a ranking probability P where the rankings are position fully spent); (iii) j bij ¼ 1 (each candidate p fully assigned). The bij’s (for i; jA½dŠ) form elements of a represented by permutation matrices; this is commonly d  d bistochastic matrix Bd : known as the Birkhoff/von Neumann Theorem, proven Among all bistochastic matrices, of special interest is independently by Birkhoff (1946) and von Neumann the class of permutation matrices Op; whose elements (1953) under different contexts. Therefore, all bistochas- assume the value of either 0 or 1, with exactly one 1 in tic matrices (rank-position probabilities) are well ‘‘char- each row and in each column. The acterized’’—the set of all bistochastic matrices is the Op is in one-to-one correspondence with the linear order Birkhoff Polytope, with permutation matrices (linear 13 p; with its ijth entry orders) as its vertices. Also with d! vertices, the  dimensionality of equals d 1 2; which is deter- 1ifpðiÞ¼j; Bd ð Þ ðOpÞij ¼ mined as follows: 0 otherwise: 2 |{z}d |{z}2d The convex hull of all permutation matrices (i.e., when p degree of freedom for a constraints on each exhausts the set of linear orders Ld ) forms a polytope, dÂd square matrix row and each column known as the Birkhoff Polytope (alternatively called Assignment Polytope or Polytope): þ |{z}1 : overcounting constraints since ¼ convfO g each row=column sums to 1 Bd ( p Properties of Bd have been studied and documented in ¼ l O þ l O þ ? þ l O : p A ; l X0; 1 p1 2 p2 d! pd! k Ld k great mathematical detail (Brualdi & Gibson, 1977a, b, ) c, 1976); in particular, all of its d2 facets have been Xd! characterized (Balinski & Russakoff, 1974). The Birkh- l ¼ 1 : ð36Þ k off Polytope has been introduced into the choice k¼1 probability literature by Suck (1992) in connection with Examples of permutation matrices are shown as follows the study of the Binary Choice Polytope. (for d ¼ 3): 2 3 2 3 Permutation matrices and permutation vectors (in- 100 100 troduced in Section 2.1.1) are two representations of the 6 7 6 7 set of linear orders ; with the former resulting in the O/123S ¼ 4 0105; O/132S ¼ 4 0015; Ld Birkhoff Polytope B and the latter the permutahedron 001 010 d 2 3 2 3 13 Thus, the situation with rank-position probabilities b here is quite 010 010 ij 6 7 6 7 different from the binary choice case and from the subset choice case, O/213S ¼ 4 1005; O/231S ¼ 4 0015; where constraints on the binary choice probability or on the subset choice probability exist for each to induce a ranking probability 001 100 distribution. ARTICLE IN PRESS J. Zhang / Journal of Mathematical Psychology 48 (2004) 107–134 129 ! Pd1: Naturally, the two polytopes have close connec- Xd M tions. In fact, it is well known that there exists a ¼jMjÁ hj jMj 2 canonical projection g : Rd -Rd from the Birkhoff j¼1 Polytope Bd to the d-permutahedron Pd1: Denote b ¼ ¼ 0: b y b b y b T ARd2 canonical ½ 11; ; 1d ; 21; ; dd Š : The so-called Therefore projection X XjMj jMjðjMjþ1Þ v ¼ Jb vBkX j ¼ ; i 2 2 iAM j¼1 with JARd Âd given by 2 3 12? d 00? 0 ? 00? 0 which is the system of inequalities (14) defining the 6 7 permutahedron. & 6 00? 012? d ? 00? 0 7 6 7 4 ^^^^^^^^^^^^^5 Remark 1. This connection between the Birkhoff Poly- 00? 000? 0 ? 12? d tope Bd and the permutahedron Pd1 indicates that Borda scoring of a ranking probability Pp can be gives rise to a point vAPd1: In other words, conducted in two separable steps: converting Pp into a rank-position probability b (while still preserving the Proposition 2.10 (Permutahedron as canonical projec- ij structure of linear orders), and then using Borda’s point- tion of the Birkhoff Polytope). Any d  d bistochastic assignment system, i.e., assigning j points to rank- matrix B can be surjectively mapped onto the d- d position j; to construct Borda scores via (38). This is permutahedron via: seen in the following identities T Bk X Bd Á½1; 2; y; dŠ ¼ v APd1: ð38Þ Bd vi ¼ Pp pðiÞ from ð10Þ pALd Proof. The following proof is adapted from Yemelichev Xd X et al. (1984, p. 229). All one needs to show is that the ¼ Pp Á j coordinates j¼1 fp 0: pðiÞ¼jg 1 Xd Xd X Bk @ A vi ¼ jbij; i ¼ 1; 2; y; d; ð39Þ ¼ j Á Pp j¼1 j¼1 fp : pðiÞ¼jg satisfy the system of constraints defining the d-permu- Xd tahedron (in Theorem 2.1). Clearly, ¼ j Á bij from ð37Þ j 1 Xd Xd Xd ¼ Bk Bk vi ¼ jbij ¼ vi : i¼1 i¼1 j¼1 ! In other words, the contribution of the jth rank-position Xd Xd is counted as having strength j; while the induced voting ¼ j Á 1 since bij ¼ 1 j¼1 i¼1 score for each candidate i is the sum of strengths of all rank-positions weighted by the fraction assigned for dðd þ 1Þ ¼ ; each rank-position. See Regenwetter and Grofman 2 Bd Bd Bd Bd T (1998a, p. 43). Just as v ¼½v1 ; v2 ; y; vd Š mini- hence proving (15). Now, for any subset MC d and any T P ½ Š mizes (11), vBk ¼½vBk; vBk; y; vBkŠ would, as pointed A M 1 2 d integer j ½dŠ; denote hj ¼ iAM bij as a partial sum of M out in Cook and Seiford (1982), minimize the expression theP elements inP the jthP column. Clearly 0phj p1 and d M d Xd Xd 0p j¼1 hj ¼ iAM j¼1 bij ¼jMj: Therefore, 2 bijðvi pð jÞÞ X XjMj i¼1 j¼1 vBk j i y T A d iAM j¼1 among all v ¼½v1; v2; ; vd Š R : Xd XjMj M ¼ jhj j j¼1 j¼1 3. Conclusions and discussions Xd XjMj M M ¼ jhj jð1 hj Þ Permutahedron, as the space of Borda scores of a j¼jMjþ1 j¼1 probability distribution on the set of rankings (linear Xd XjMj orders) over d candidates, is a useful tool in linking M M X jMjÁhj jMjÁð1 hj Þ various choice paradigms: (i) It can be realized as a j¼jMjþ1 j¼1 ‘‘zonotope’’, i.e., a projection from a cube, of dimension ARTICLE IN PRESS 130 J. Zhang / Journal of Mathematical Psychology 48 (2004) 107–134

Random Utility The permutahedron-based approach may also Hypersphere Sd-2 prove to be useful in aggregation of preference or

osculation social choice. The pivotal result in social choice and welfare is Arrow’s impossibility theorem, which sets Binary Choice zonotope Permutahedron fiber Subset Choice projection polytope fundamental limits on any well-behaved social welfare Cube C Π for {1,…,d} Cube C d(d-1)/2 d-1 d ordering. In recent years, the essence of this impossi- canonical projection bility of ‘‘proper’’ (i.e., unanimous, anonymous, and Birkhoff continuous) aggregation of preference, as well as the homotopic equivalence from the Pareto rule to a Polytope Bd dictator rule, has been understood from a deep, Fig. 5. Connection of a d-permutahedron, the space of Borda scores topological perspective (Chichilnisky, 1980, Chichilnis- of candidates given a ranking probability, to the d d 1 =2-cube, the ð Þ ky & Heal, 1983, Baryshnikov, 1993, Heal, 1997, d-cube, and the ðd 2Þ-hypersphere, representing, respectively, the binary choice probability, the subset choice probability, and the Baryshnikov, 2000, Lauwers, 2000). Central to these random distribution of utility values of the d candidates (cf. Fig. 1). arguments is the topological notion of non-contract- ibility of the space of preference, an example being the space of non-coincidental random utilities under the spherical representation (see Section 2.3.1). When dðd 1Þ=2; that describes the binary choice probability; null preference is allowed, the space of affinely (ii) It can be realized as a ‘‘monotone path polytope’’ equivalent utility vectors also includes the origin (the associated with projecting a cube, of dimension d; barycenter of the permutahedron). Jones, Zhang, & that describes subset choice probability, onto a line Simpson (2003) have investigated the natural topo- segment that describes the subset size; (iii) It can be logy (which turns out to be non-Hausdorff) that realized as a canonical projection from the Birkhoff includes this null-preference point as a connected Polytope that describes bistochastic matrices, with d  d component in the space of preference. There, it is elements, that relates to the rank-position probability; shown that proper aggregation is possible if and (iv) Finally, all of its vertices are contained in a only if the null-preference is allowed for the society hypersphere, with dimension d 2; which is the (output of the aggregation map) but not for the quotient space of all equivalent classes of interval- individual voters (input to the aggregation map). A scaled d-dimensional utility vectors. See Fig. 5 for a natural question is that whether this conclusion summary. can be generalized to other preference spaces (of binary These properties of a permutahedron supply us with a choice, subset choice, etc). It is known (Baryshnikov, wealth of connections to various choice paradigms, 1993) that the nerve of the covering of the set of namely, those of binary choice, subset choice, ranking, linear orders on ½dŠ by pairwise (binary) comparisons, and rank-matching, that are commonly employed in which is a simplicial complex of dimension ðd þ 1Þ psychology, economics, political science, consumer ðd 2Þ=2; is homotopically equivalent to the sphere behavior, etc. Important insights gained from this Sd2: It is unclear whether a covering by variable-sized approach include: the extension of Young’s formula of subsets enjoys similar properties. Future research will Borda scores to all cases of binary choice probabilities illuminate the mathematical structure of and connec- (from only those induced from a ranking probability), as tions between the various spaces of preference and well as its compatibility with the BTL representation of choice. choice probability; the connection between the AV tally procedure a` la Brams and Fishburn (1983) and the equal-probability model of the induced voters profile; Appendix A. Mathematical background on polytopes the condition for the compatibility of the Brams– Fishburn tally procedure with the size-independent This appendix contains basic materials about the model a` la Falmagne and Regenwetter (1996); and the convex polytope, including its dualistic characteriza- spherical representation of the random utility model of tions either using the set of its vertices or by the set choice, in a way that illuminates the intimate connection of its facets. Special attention will be paid to the between interval-valued utility vectors and linear orders. projection and to the lift-up of projections of polytopes, Most importantly, the notion of Borda scores can be where permutahedra may arise. For details, see Ziegler extended from the scoring of a ranking probability (1995). (Eq. (10)) to the scoring of a binary choice probability (Eq. (17)), the scoring of a subset choice probability A.1. Polytopes (Eq. (28)) under the equal-probability model under the SI model (Eq. (34)) and the scoring of a rank-position A convex polytope (‘‘polytope’’) is the convex hull of a probability (Eq. (39)). finite set P of points (vectors) P ¼fx1; x2; y; xng in ARTICLE IN PRESS J. Zhang / Journal of Mathematical Psychology 48 (2004) 107–134 131

Several polytopes have received considerable interest in recent years in the choice literature. They are (i) the Binary Choice Polytope, resulting from the character- ization of binary choice vectors compatible with a (a) ranking probability; (ii) the AV polytope, resulting from the characterization of the subset choice probability (of the SI model) as compatible with a ranking probability; and (iii) the Birkhoff Polytope, arising from the rank- assignment paradigm and bistochastic matrices. (b) A.2. Faces of a polytope Fig. 6. A polytope defined as (a) the convex hull of a finite set of points; or equivalently as (b) the enclosure of finite number of intersecting half-spaces. The dimension, dim P; of a polytope P is the dimension of its affine span. A polytope contains faces d which form its boundaries. The face of a d-dimensional some real vector space R {xiði ¼ 1; 2; y; nÞ: polytope can take the form of: (i) a vertex, which is of ¼ convðPÞ zero-dimension (i.e., a point); (ii) an edge, which is one- P ( dimensional (i.e., a line segment); (iii) a facet, which has ¼ l1x1 þ l2x2 þ ? þ lnxn : all liX0; dimension d 1 (i.e., the maximal proper face); or (iv) ) any dimension in between. In fact, any proper face Xn FCP can be expressed as the convex combination of a li ¼ 1 : subset F of the original set P of vectors making up P i¼1 F ¼ convfxi : xiAFCPg: Assuming that none of the x ’s can be written as a i The collection of all faces of a polytope, ordered by convex combination of the other points (otherwise, they inclusion, forms a lattice, called the face lattice. The can always be excluded from P to yield a reduced set of length of the maximal chain of this lattice plus one points), such a set of n distinct points or vectors form the equals the dimension of the polytope. For a d- vertices of the polytope : When all vectors in P are in P dimensional simple polytope, every vertex belongs to d general positions in Rd (i.e., no d of them lie on a facets, whereas for a d-dimensional simplicial polytope, common affine hyperplane), then the polytope is called a every facet has d vertices and hence are all simplices. simplicial polytope. Simple and simplicial polytopes are dual to each other. On the other hand, one can also define a polytope as Any d-dimensional polytope (dX3) that is both simple an enclosure (i.e., intersection that is bounded in space) and simplicial must be a simplex. of finitely many closed half-spaces in Rd : From the combinatoric point of view, a polytope is P ¼fxARd : Jxpb; JARmÂd ; bARmg: completely characterized by its face-lattice—two poly- topes are said to be combinatorially equivalent if their Each half-space is governed by one linear inequality (with face lattices are isomorphic. For a d-dimensional equality sign corresponding to a hyperplane), and a finite polytope P; denote the total number of its j-dimensional collection of them (indicated by the matrix J above), if faces as fjðPÞ; ð j ¼ d 1; d 2; y; 1; 0Þ; and define unbounded, would form a polyhedron. When all the fd ðPÞ¼1 for convenience. These numbers collectively defining hyperplanes are in general positions, the poly- form a ðd þ 1Þ-dimensional vector called the f -vector tope is called a simple polytope. These two definitions of (some authors also define f1ðPÞ¼1; thus making the a polytope, as a linear combination of points and as the f -vector ðd þ 2Þ-dimensional). The components of the f - bounded intersection of half-spaces (see Fig. 6), reflect the vector of a polytope satisfy some numerical relation- fundamental duality between a linear space and its dual ships. The most prominent one is the Euler–Poincare´ space induced by the inner-product operation. relation Familiar examples of polytopes include: (i) polygons, Xd j which are polytopes with dimension d ¼ 2 (e.g., ð1Þ fjðPÞ¼1; trapezoid, pallellogram); (ii) the d-dimensional hyper- j¼0 cube (‘‘d-cube’’), which is the d-dimensional extension which reduces, as a special case for d ¼ 3; to the famous of a cube; (iii) the d-dimensional simplex, with d þ 1 T Euler formula between the number of vertices (f0), the vertices; here one of the vertices xdþ1 ¼½0; 0; y; 0Š is number of edges (f1), and the number of facets (f2) for the origin, and the other d vertices are xk ¼ T any convex polytope in three dimensions: ½0|fflfflffl{zfflfflffl}; y; 0; 1; 0|fflfflffl{zfflfflffl}; y; 0Š ; k ¼ 1; 2; y; d: f0 f1 þ f2 ¼ 2: k1 dk ARTICLE IN PRESS 132 J. Zhang / Journal of Mathematical Psychology 48 (2004) 107–134

Interestingly, for simplicial polytopes, the Euler–Poin- care´ relation itself turns out to be a special case (k ¼ d) γ(−1) (F) of the so-called Dehn–Sommerville relations:  P Xk j d j ð1Þ fjðPÞ¼fkðPÞ; j¼0 d k γ F which holds for k ¼ 0; 1; 2; y; d (only half of these Q relations are linearly independent). Moreover, compo- nents of the f -vector may be constrained within certain Fig. 7. Affine projection between polytopes (note the tracing of a face upper and lower bounds (e.g., the Upper Bound of a lower-dimensional polytope to that of a higher-dimensional one). Theorem, McMullen, 1971). More details can be found elsewhere (see Bayer & Lee, 1993, Ziegler, 1995). T ½0|fflfflffl{zfflfflffl}; y; 0; 1; 0|fflfflffl{zfflfflffl}; y; 0Š has a 1 only in its lth position and A.3. Projection of polytopes l1 pl

has 0 elsewhere, J el ¼ jl; so An interesting property of a polytope that follows ( Xp immediately from its definition concerns its affine Z ¼ z : z ¼ b þ t j ; - l l projection. A projection g : P Q of a polytope is l¼1 p- q defined as an affine map g : R R ; 1 1 ptlp ; l ¼ 1; 2; y; p : ðA:2Þ x/Jx þ b; 2 2 C p C q where P R is a p-dimensional polytope, Q R is a q- Being a zonotope is a geometric property rather than a A qÂp dimensional polytope, and gðPÞ¼Q: Here J R combinatorial one. Zonotopes are centrally symmetric A q specifies a q  p matrix, b R is a vector of dimension with respect to its barycenter. Affine projection of a A p q; and x R is a vector of dimension p: Clearly, zonotope is still a zonotope; faces of a zonotope is again dim ðQÞpdim ðPÞ¼p; with equality holding when the a zonotope. Further properties of zonotopes are rank of J is p: From the linear algebra behind the affine reviewed in McMullen (1971). map, for any face FCQ; its pre-image 1 g ðFÞ¼fyAP : gðyÞAFg A.4. Fiber polytopes is a face of P: Furthermore, if F1; F2 are faces of Q; then F DF holds if and only if g1ðF ÞDg1ðF Þ: Two concepts are intimately associated with projec- 1 2 1 2 - That is, the inclusion relation between faces in the tions, namely, ‘‘sections’’ and ‘‘fibers’’. Let g : P Q bea p q projected polytope (of a lower dimension) can be projection of polytopes with PCR ; QCR ; such that ‘‘traced’’ back to their inclusion relation in the original gðPÞ¼Q; with dimðQÞodimðPÞ: A section is a (con- - polytope (of a higher dimension). See Fig. 7. tinuous) map s : Q P that satisfies gðsðxÞÞ ¼ x for all An interesting case is the projection of a cube, which xAQ: Informally, since any projection maps many p yields a special kind of polytope called a zonotope.A points of a higher dimensional space yAR to one single A q zonotope Z is the image of a cube under an affine point of a lower dimensional space x R (p4q), the set p q of pre-image points (i.e., y’s) that project to the same projection g : R *Cp-ZCR point xARq form a fiber g1ðxÞ¼fyARp : gðyÞ¼xg z : z Jx b; xA : Z ¼f ¼ þ Cpg that is attached, or growing out of x on Rq; the base An example is given by Fig. 3 where the two- manifold. A (horizontal) section s is then an inverse dimensional zonotope arises as the projection of C3: map (not unique, of course) obtained by selecting one q Expressing the p-cube Cp in 0-centered coordinates, one point in each fiber on R : The collection of fibers on a has base manifold is called the fiber bundle; it allows various ( horizontal sectioning. Xp 1 1 Cp ¼ x : x ¼ tlel; ptlp ; For any given section s; one can construct the 2 2 l¼1 integral ) Z l ¼ 1; 2; y; p ðA:1Þ sðxÞ dx; Q where el’s ðl ¼ 1; 2; y; pÞ are a set of p Cartesian base where the vector-valued integral is with respect to points vectors in Rp: The matrix J can be expressed in terms of of the base manifold xAQ; thus the integral itself is a pq its column vectors: J ¼½j1; j2; y; jpŠ where each of the point in R : The fiber polytope SðP; QÞ is defined as jl ðl ¼ 1; 2; y; pÞ is a q-dimensional vector. Since el ¼ the set of all such points, each obtained through a ARTICLE IN PRESS J. Zhang / Journal of Mathematical Psychology 48 (2004) 107–134 133 sectioning s: Bayer, M. M., & Lee, C. W. (1993). Combinatorial aspects of convex  Z 1 polytopes. In P. M. Gruber, & J. M. Wills (Eds.), Handbook of SðP; QÞ¼ sðxÞ dx : s convex geometry. Amsterdam: Elsevier. volðQÞ Q Billera, L. J., & Sturmfels, B. (1992). Fiber polytopes. Annals of Mathematics, 135, 527–549. is a section associated with g ; ðA:3Þ Birkhoff, G. (1946). Tres observaciones sobre el algebra lineal, Revista Facultad de Ciencias Exactas. Puras y Applicadas Universidad where the volume of Q is defined as Nacional de Tucuman, Serie A (Matematicas y Fisica Teoretica), 5, Z 147–151. volðQÞ¼ dx: Block, H. D., & Marschak, J. (1960). Random ordering and Q probabilistic theories of responses. In I. Olkin, S. Ghurye, W. Hoeffding, W. Madow, & H. Mann (Eds.), Contributions to It can be shown that SðP; QÞ indeed form a convex set. probability and statistics: Essays in honor of Harold Hostelling. p Though an object defined in R ; its dimensionality is Stanford, CA: Stanford University Press. only p q; it is contained in the fiber growing out of the Bolotashvili, G., Kovalev, M., & Girlich, E. (1999). New facets of the linear ordering polytope. SIAM Journal of Discrete Mathematics, point r0AQ: 12, 326–336. 1 SðP; QÞDg ðr0Þ-P; Bowman, V. J. (1972). Permutation polyhedra. SIAM Journal of Applied Mathematics, 22, 580–589. where the point r is the barycenter of : Z 0 Q Bradley, R. A., & Terry, M. E. (1952). Rank analysis of incomplete 1 block designs. I. Biometrika, 39, 324–345. r0 ¼ x dx: volðQÞ Brams, S. J., & Fishburn, P. C. (1978). Approval voting. American Q Political Science Review, 72, 831–847. More materials on fiber polytopes can be found in Brams, S. J., & Fishburn, P. C. (1983). Approval voting. Boston, MA: Billera and Sturmfels (1992). Birkha¨ user. A special kind of fiber polytope is the so-called Brualdi, R. A., & Gibson, P. M. (1976). Convex polyhedra of doubly stochastic matrices. IV. Linear Algebra and Its Applications, 15, monotone path polytope, constructed through an 153–172. interesting projection map. Here the projection g is Brualdi, R. A., & Gibson, P. M. (1977a). Convex polyhedra given as follows: choose a fixed but otherwise arbitrary of doubly stochastic matrices. I. Applications of the p-dimensional row vector c; and form a dot product permanent function. Journal of Combinatorial Theory (A), 22, with any column vector xAPCRp: The set of points 194–230. Brualdi, R. A., & Gibson, P. M. (1977b). Convex polyhedra of doubly Q ¼ fgc Á x : xAP stochastic matrices. II. Graph of Sn: Journal of Combinatorial Theory (B), 22, 175–198. 1 define a one-dimensional polytope Q ¼½cmin; cmaxŠDR Brualdi, R. A., & Gibson, P. M. (1977c). Convex polyhedra of doubly such that stochastic matrices. III. Affine and combinatorial properties of On: Journal of Combinatorial Theory (A), 22, 338–351. cmin ¼ minxAPc Á x; cmax ¼ max c Á x: xAP Chichilnisky, G. (1980). Social choice and the topology of spaces of preferences. Advances in Mathematics, 37, 165–176. With such Q; the corresponding fiber polytope SðP; QÞ Chichilnisky, G., & Heal, G. (1983). Necessary and sufficient is the monotone path polytope. The coordinates for a conditions for a resolution of the social choice paradox. Journal particular section s : Q{x-sðxÞAP is of Economic Theory, 31, 68–87. Z Cohen, M., & Falmagne, J. C. (1990). Random utility representation 1 cmax sðxÞ dx: of choice probability: a new class of necessary conditions. Journal cmax cmin cmin of Mathematical Psychology, 34, 88–94. Cook, W. D., & Kress, M. (1992). Ordinal information and preference structures: Decision models and applications. Englewood Cliffs, NJ: Prentice Hall. References Cook, W. D., & Seiford, L. M. (1982). On the Borda-Kendall consensus method for priority ranking problems. Management Alvo, M., & Cabilio, P. (1993). Rank correlations and the analysis of Science, 28, 621–637. rank-based experimental designs. In M. Fligner, & J. Verducci Cook, W. D., & Seiford, L. M. (1990). A data envelopment model (Eds.), Probability models and statistical analyses for ranking data for aggregating preference ranking. Management Science, 36, (pp. 141–156). New York: Springer. 1302–1310. Alvo, M., Cabilio, P., & Feigin, P. D. (1982). Asymptotic theory for Copeland, A.H. (1951). A ‘‘reasonable’’ social welfare function. measures of concordance with special reference to average Kendall Mimeograph, University of Michigan. tau. Annals of Statistics, 10, 1269–1276. Critchlow, D. E., Fligner, M. A., & Verducci, J. S. (1991). Probability Babington-Smith, B. (1950). Discussion of Professor Ross’s paper. models on ranking. Journal of Mathematical Psychology, 35, Journal of the Royal Statistical Society, Series B, 12, 53–56. 294–318. Balinski, M. L., & Russakoff, A. (1974). On the assignment polytope. Doignon, J. P., & Fiorini, S. (to appear). The facets and the SIAM Review, 16, 516–525. symmetries of the approval-voting polytope, Journal of Combina- Baryshnikov, Y. (1993). Unifying impossibility theorems: A topologi- torial Theory (Series B), to appear. cal approach. Advances in Applied Mathematics, 14, 404–415. Doignon, J. P., & Regenwetter, M. (1997). An approval-voting Baryshnikov, Y. (2000). On isotopic dictators and homological polytope for linear orders. Journal of Mathematical Psychology, manipulators. Journal of Mathematical Economics, 33, 123–134. 41, 171–188. ARTICLE IN PRESS 134 J. Zhang / Journal of Mathematical Psychology 48 (2004) 107–134

Doignon, J. P., & Regenwetter, M. (2002). On the combinatorial Ovchinnikov, S. (1996). Means on ordered sets. Mathematical Social structure of AV polytope. Journal of Mathematical Psychology, 46, Sciences, 32, 39–56. 554–563. Pekec˘ , A. (2000). Computation complexity issues in random utility Falmagne, J.-Cl., & Regenwetter, M. (1996). A random utility modeling. Paper presented at the Random Utility 2000 Conference, model for approval voting. Journal of Mathematical Psychology, Fuqua School of Business, Duke University. 40, 152–159. Rado, R. (1952). An inequality. Journal of the London Mathematical Feigin, P. D., & Alvo, M. (1986). Intergroup diversity and Society, 27, 1–6. concordance for ranking data: An approach via metrics for Regenwetter, M., & Grofman, B. (1998a). Choosing subsets: A size- permutations. Annals of Statistics, 14, 691–707. independent probabilistic model and the quest for a social welfare Feigin, P. D., & Cohen, A. (1978). On a model for concordance ordering. Social Choice Welfare, 15, 423–443. between judges. Journal of the Royal Statistical Society, Series B, Regenwetter, M., & Grofman, B. (1998b). Approval voting, Borda 40, 203–213. winners, and Condorcet winners: Evidence from seven elections. Fishburn, P. C. (1992). Induced binary probabilities and the linear Management Science, 44, 520–533. ordering polytope—A status-report. Mathematical Social Sciences, Regenwetter, M., Marley, A. A. J., & Joe, H. (1998). Random utility 12, 67–80. threshold models of subset choice. Australian Journal of Psychol- Fligner, M. A., & Verducci, J. S. (1986). Distance based ranking ogy, 50, 165–174. models. Journal of the Royal Statistical Society, Series B, 48, Regenwetter, M., & Marley, A. A. J. (2001). Random relations, 359–369. random utilities and random functions. Journal of Mathematical Gaiha, P., & Gupta, S. K. (1977). Adjacent vertices on a permutahe- Psychology, 45, 864–912. dron. SIAM Journal of Applied Mathematics, 32, 323–327. Regenwetter, M., Marley, A. A. J., & Grofman, B. (2002). A general Hardy, G., Littlewood, J. E., & Po´ lya, G. (1934). Inequality. concept of majority rule. Mathematical Social Sciences, 43, Cambridge, UK: Cambridge University Press. 405–428. Heal, G. (1997). Topological social choice. New York: Springer. Saari, D. G. (1990). The Borda dictionary. Social Choice and Welfare, Jones, M., Zhang, J., & Simpson, G. (2003). Aggregation of utility and 7, 279–317. social choice: A topological characterization. Journal of Mathema- Saari, D. G. (1992). Millions of election outcomes from a single profile. tical Psychology, 47, 545–556. Social Choice and Welfare, 9, 277–306. Kendall, M. G. (1962). Rank correlation methods (3rd ed.). London: Saari, D. G. (1993). Geometry of voting. New York: Springer. Griffin. Saari, D. G., & Merlin, V. R. (1996). The Copeland Koppen, M. (1991). Random utility representations of binary choice method I: Relationships and the dictionary. Economic Theory, 8, probabilities. In J. P. Doignon, & J.-C. Falmagne, (Eds.), 51–76. Mathematical psychology: Current developments (pp. 181–201). Schoute, P.H. (1911). Analytic treatment of the polytopes regularly New York: Springer. derived from the regular polytopes. Cited in Ziegler, G. M. (1995). Koppen, M. (1995). Random utility representation of binary choice Lectures on Polytope. New York: Springer (corrected second probabilities—Critical graphs yielding critical necessary conditions. printing, 1998). Journal of Mathematical Psychology, 39, 21–39. Suck, R. (1992). Geometric and combinatorial properties of the Lauwers, L. (2000). Topological social choice. Mathematical Social polytope of binary choice-probabilities. Mathematical Social Sciences, 40, 1–39. Sciences, 23, 81–102. Luce, R. D. (1959). Individual choice behavior. New York: Wiley. Suck, R. (1997). Polytopes in measurement and data analysis. Review Luce, R. D., & Suppes, P. (1965). Preference, utility, and subjective of Ziegler’s (1995) Lectures on Polytope. Journal of Mathematical probability. In R. D. Luce, R. R. Bush, & E. Galanter, (Eds). Psychology, 41, 299–303. Handbook of mathematical psychology, Vol. III (pp. 249–410). New von Neumann, J. (1953). A certain zero-sum two-person game York: Wiley. equivalent to the optimal assignment problem. In: H. W. Kuhn, Mallows, C. L. (1957). Non-null ranking models. I. Biometrika, 44, & A. W. Tucker (Eds.), Contributions to the theory of games, Vol. II 114–130. (pp. 5–12). Princeton, NJ: Princeton University Press. Annals of Marley, A. A. J. (1992). A selective review of recent characterizations Mathematical Studies, no. 28. of stochastic choice models using distribution and functional Yemelichev, V. A., Kovalev, M. M., & Kravtsov, M. K. (1984). equation techniques. Mathematical Social Sciences, 23, 5–29. Polytopes, graphs and optimization. (translated by G.H. Lawden). McCullagh, P. (1993). Models on spheres and models for permuta- Cambridge: Cambridge University Press. tions. In M. Fligner, & J. Verducci (Eds.), Probability models and Young, H. P. (1974). An axiomatization of Borda’s rule. Journal of statistical analyses for ranking data (pp. 278–283). New York: Economic Theory, 9, 43–52. Springer. Young, H. P. (1975). Social choice scoring functions. SIAM Journal of McMullen, P. (1971). On zonotopes. Transactions of the American the Applied Mathematics, 28, 824–838. Mathematical Society, 159, 91–109. Ziegler, G. M. (1995). Lectures on polytope. New York: Springer Merlin, V. R., & Saari, D. G. (1997). The Copeland method II: (corrected second printing, 1998). manipulation, monotonicity, and paradoxes. Journal of Economic Ziegler, G.M. (1999). Lectures on 0/1-polytopes. In: G. Kalai, & G. M. Theory, 72, 148–172. Ziegler (Eds.), Polytopes— and computation,DMV- Nitzan, S., & Rubinstein, A. (1981). A further characterization of Seminars Series. Basel: Birkha¨ user-Verlag. Available: http:// Borda ranking method. Public Choice, 36, 153–158. www.math.tu-berlin.de/~ziegler/.