Combinatorial Laplacian and Rank Aggregation

Yuan Yao

Stanford University

ICIAM, Zürich, July 16–20, 2007

Joint work with Lek-Heng Lim

Outline

1 Two Motivating Examples

2 Reflections on Ranking
   • Ordinal vs. Cardinal
   • Global, Local, vs. Pairwise

3 Discrete Exterior Calculus and Combinatorial Laplacian
   • Discrete Exterior Calculus
   • Combinatorial Laplacian Operator

4 Hodge Theory
   • Cyclicity of Pairwise Rankings
   • Consistency of Pairwise Rankings

5 Conclusions and Future Work

Example I: Customer-Product Rating

Example (Customer-Product Rating)

An m-by-n customer-product rating matrix $X \in \mathbb{R}^{m \times n}$. X typically contains many missing values (say ≥ 90%).

The first-order statistic, the mean score for each product, might suffer because
• most customers rate only a very small portion of the products
• different products might have different raters, whence mean scores involve noise due to arbitrary individual rating scales

From 1st Order to 2nd Order: Pairwise Rankings

The arithmetic mean of the score difference between products i and j over all customers who have rated both of them,

$$g_{ij} = \frac{\sum_k (X_{kj} - X_{ki})}{\#\{k : X_{ki}, X_{kj} \text{ exist}\}},$$

is translation invariant. If all the scores are positive, the geometric mean of the score ratio over all customers who have rated both i and j,

$$g_{ij} = \left( \prod_k \frac{X_{kj}}{X_{ki}} \right)^{1/\#\{k : X_{ki}, X_{kj} \text{ exist}\}},$$

is scale invariant.
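The arithmetic-mean pairwise ranking can be sketched in a few lines of NumPy. This is only an illustration: it assumes the rating matrix is a dense array with `np.nan` marking missing ratings, and the names `pairwise_mean_difference` and `X_demo` are hypothetical, not from the talk.

```python
import numpy as np

def pairwise_mean_difference(X):
    """Return (g, counts) where g[i, j] is the mean of X[k, j] - X[k, i]
    over raters k who scored both i and j (np.nan marks a missing score)."""
    X = np.asarray(X, dtype=float)
    observed = ~np.isnan(X)
    M = observed.astype(float)         # M[k, i] = 1 if rater k scored product i
    Xz = np.where(observed, X, 0.0)    # missing scores contribute 0 to the sums
    counts = M.T @ M                   # counts[i, j] = #{k : X_ki, X_kj exist}
    S = M.T @ Xz                       # S[i, j] = sum of X_kj over raters of both i and j
    with np.errstate(divide="ignore", invalid="ignore"):
        g = np.where(counts > 0, (S - S.T) / counts, np.nan)
    return g, counts

# Hypothetical 3-customer, 3-product rating matrix (assumed data).
X_demo = np.array([[5.0, 3.0, np.nan],
                   [4.0, np.nan, 2.0],
                   [3.0, 1.0, 1.0]])
g_demo, counts_demo = pairwise_mean_difference(X_demo)

# Shifting each customer's rating scale by a constant leaves g unchanged.
shift = np.array([[10.0], [-2.0], [0.5]])
g_shifted, _ = pairwise_mean_difference(X_demo + shift)
```

The comparison of `g_demo` with `g_shifted` illustrates the translation invariance claimed above; the geometric-mean variant would follow by applying the same function to `np.log(X)` when all scores are positive.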

More invariant

Define the pairwise ranking $g_{ij}$ as the probability that product j is preferred to i in excess of a purely random choice,

$$g_{ij} = \Pr\{k : X_{kj} > X_{ki}\} - \frac{1}{2}.$$

This is invariant up to a monotone transformation.
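A rough sketch of this preference-probability ranking follows, with the probability estimated by the empirical fraction of common raters. Counting a tie as half a preference is an assumption added here so that the result stays skew-symmetric; the slide does not say how ties are handled. The names `pairwise_preference` and `X_demo` are hypothetical.

```python
import numpy as np

def pairwise_preference(X):
    """g[i, j] = fraction of common raters preferring j to i, minus 1/2.
    Ties count as half a preference (an assumption, see the lead-in)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[1]
    g = np.full((n, n), np.nan)        # nan where no rater scored both products
    for i in range(n):
        for j in range(n):
            both = ~np.isnan(X[:, i]) & ~np.isnan(X[:, j])
            if both.any():
                wins = np.mean(X[both, j] > X[both, i])
                ties = np.mean(X[both, j] == X[both, i])
                g[i, j] = wins + 0.5 * ties - 0.5
    return g

# Hypothetical rating matrix with missing entries (assumed data).
X_demo = np.array([[5.0, 3.0, np.nan],
                   [4.0, np.nan, 2.0],
                   [3.0, 1.0, 1.0]])
g_demo = pairwise_preference(X_demo)

# A strictly increasing transformation of the scores gives the same g.
g_monotone = pairwise_preference(np.exp(X_demo))
```

Only the order of each customer's scores enters the computation, which is why `g_monotone` agrees with `g_demo`.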

Example II: Pure Exchange Economics

Example (Pairwise ranking in exchange market)

n goods V = {1,..., n} in an exchange market, with an exchange rate matrix A = (aij) such that

1 unit i = aij unit j, aij > 0,

which is a reciprocal matrix, i.e. aij = 1/aji.
• Ideally, a product triple (i, j, k) is called triangular arbitrage-free if aij ajk = aik.
• Money (universal equivalent): does there exist a universal equivalent with pricing function p : V → R+ such that aij = pj / pi?

From Pairwise to Global

Under the logarithmic map, gij = log aij, we have an equivalent theory:
• being triangular arbitrage-free is equivalent to

gij + gjk + gki = 0

• universal equivalent is a global ranking f : V → R (fi = log pi ) such that

gij = fj − fi =: (δ0f )(i, j)

Here
• Global ranking ⇔ universal equivalent (price)
• Pairwise ranking ⇔ exchange rates
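The correspondence between prices and exchange rates can be checked numerically. The sketch below builds a reciprocal matrix from hypothetical prices `p` (assumed values), tests the multiplicative condition aij ajk = aik on all triples, and recovers a global ranking from one row of the log matrix. The function name `is_triangular_arbitrage_free` is illustrative.

```python
import numpy as np

p = np.array([1.0, 2.0, 4.0, 0.5])     # hypothetical prices p_i > 0
A = p[None, :] / p[:, None]            # a_ij = p_j / p_i, a reciprocal matrix
G = np.log(A)                          # g_ij = log a_ij

def is_triangular_arbitrage_free(A, tol=1e-9):
    """Check a_ij * a_jk == a_ik (up to relative tolerance) for every triple."""
    lhs = A[:, :, None] * A[None, :, :]    # lhs[i, j, k] = a_ij * a_jk
    rhs = A[:, None, :]                    # rhs[i, ., k] = a_ik, broadcast over j
    return bool(np.all(np.abs(lhs - rhs) <= tol * np.abs(rhs)))

# Global ranking normalized so that f_0 = 0: f_j = g_0j.
f = G[0, :]

# Perturb one exchange rate (keeping reciprocity) to create an arbitrage.
A_bad = A.copy()
A_bad[0, 1] *= 1.1
A_bad[1, 0] = 1.0 / A_bad[0, 1]
```

Here `np.exp(f)` reproduces the prices up to the normalization `p[0]`, while `A_bad` is still reciprocal but no longer comes from any price vector.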

Observations

Both examples
• contain cardinal information
• involve pairwise comparisons

How important are they?

Ordinal Rank Aggregation

Problem: given a set of partial/total orders $\{\preceq_i : i = 1, \ldots, n\}$ on a common set V, find

$$(\preceq_1, \ldots, \preceq_n) \mapsto \preceq^*,$$

as a partial order on V, satisfying a certain optimality condition.

Examples:
• voting

Notes:
• Impossibility Theorems (Arrow et al.)
• Hardness of solving (NP-hard for Kemeny optimality etc.)

Cardinal Rank Aggregation

Problem: given a set of functions $f_i : V \to \mathbb{R}$ (i = 1,..., n), find

$$(f_1, \ldots, f_n) \mapsto f^*,$$

as a function on V, satisfying a certain optimality condition.

Examples:
• customer-product rating, e.g. Amazon, Netflix
• stochastic choice with f as probability distributions on V, e.g. Google search, cardinal utility in Economics

Notes:
• relaxations leave room for 'possibility'
• ordinal rankings are induced from cardinal rankings, but with information loss

Global, Local, and Pairwise Rankings

• Global ranking: a function on V, f : V → R
• Local (partial) ranking: the restriction of a global ranking to a subset U, f′ : U → R
• Pairwise ranking: g : V × V → R (with gij = −gji)

Note: pairwise rankings are simply skew-symmetric matrices so(n), or certain equivalence classes in sl(n). We may also view pairwise rankings as weighted digraphs.

Why Pairwise Ranking?

• The human mind cannot make preference judgements on moderately large sets (e.g. no more than 7 ± 2 items, according to psychology studies)
• But humans can make pairwise comparisons more easily and accurately
• Pairwise ranking naturally arises in tournaments, exchange Economics, etc.
• Pairwise ranking may reduce the bias caused by the arbitrariness of rating scales
• Pairwise ranking may contain more information than global ranking (to be seen soon)!

Our Main Theme

Below we outline an approach to analyzing cardinal and pairwise rankings from the perspective of discrete exterior calculus. Briefly, Hodge Theory yields an orthogonal decomposition of pairwise rankings,

Pairwise = Global + Consistent Cyclic + Inconsistent Cyclic

Simplicial Complex of Products

Let V = {1,..., n} be the set of products or alternatives to be ranked. Construct a simplicial complex K:

0-simplices K0: V

1-simplices K1: edges {i, j} such that comparison (i.e. pairwise ranking) between i and j exists

2-simplices K2: triangles {i, j, k} such that
• every edge exists in K1
• possibly further conditions on consistency hold, such as being triangular arbitrage-free

Note: it suffices here to construct K up to dimension 2!

Cochains

k-cochains $C^k(K, \mathbb{R})$: the vector space of alternating (k+1)-tensors associated with $K_k$,

$$\{u : V^{k+1} \to \mathbb{R},\ u_{i_{\sigma(0)}, \ldots, i_{\sigma(k)}} = \operatorname{sign}(\sigma)\, u_{i_0, \ldots, i_k}\}$$

for $(i_0, \ldots, i_k) \in K_k$, where $\sigma \in S_{k+1}$ is a permutation on (0,..., k).

Inner product in $C^k(K, \mathbb{R})$: standard Euclidean.

In particular,
• global ranking: 0-cochains $f \in C^0(K, \mathbb{R}) \cong \mathbb{R}^n$
• pairwise ranking: 1-cochains $g \in C^1(K, \mathbb{R})$, $g_{ij} = -g_{ji}$

Coboundary Maps

k-dimensional coboundary maps $\delta_k : C^k(V, \mathbb{R}) \to C^{k+1}(V, \mathbb{R})$ are defined as the alternating difference operator

$$(\delta_k u)(i_0, \ldots, i_{k+1}) = \sum_{j=0}^{k+1} (-1)^{j}\, u(i_0, \ldots, i_{j-1}, i_{j+1}, \ldots, i_{k+1})$$

• $\delta_k$ plays the role of differentiation
• $\delta_{k+1} \circ \delta_k = 0$

In particular,
• (δ0 f)(i, j) = fj − fi is the gradient of the global ranking f
• (δ1 g)(i, j, k) = gij + gjk + gki is the curl of the pairwise ranking g

A View from Discrete Exterior Calculus

We have the following cochain complex

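$$C^0(K, \mathbb{R}) \xrightarrow{\ \delta_0\ } C^1(K, \mathbb{R}) \xrightarrow{\ \delta_1\ } C^2(K, \mathbb{R}),$$

in other words,

$$\text{Global} \xrightarrow{\ \mathrm{grad}\ } \text{Pairwise} \xrightarrow{\ \mathrm{curl}\ } \text{Triplewise},$$

and curl ∘ grad(Global Rankings) = 0.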
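The identity curl ∘ grad = 0 can be verified numerically by writing δ0 and δ1 as matrices. The sketch below is one possible encoding, not taken from the talk: vertices are numbered from 0, each edge {i, j} and triangle {i, j, k} is stored with sorted indices, and the helper name `coboundary_matrices` is hypothetical.

```python
import numpy as np
from itertools import combinations

def coboundary_matrices(n, edges, triangles):
    """Return (d0, d1): d0 maps vertex values to edges, d1 maps edge values to triangles."""
    edges = [tuple(sorted(e)) for e in edges]
    edge_index = {e: r for r, e in enumerate(edges)}
    d0 = np.zeros((len(edges), n))
    for r, (i, j) in enumerate(edges):
        d0[r, i], d0[r, j] = -1.0, 1.0        # (d0 f)(i, j) = f_j - f_i
    d1 = np.zeros((len(triangles), len(edges)))
    for r, t in enumerate(triangles):
        i, j, k = sorted(t)
        d1[r, edge_index[(i, j)]] = 1.0       # + g_ij
        d1[r, edge_index[(j, k)]] = 1.0       # + g_jk
        d1[r, edge_index[(i, k)]] = -1.0      # + g_ki = - g_ik
    return d0, d1

# Complete complex on 4 vertices: all 6 edges and all 4 triangles.
edges = list(combinations(range(4), 2))
triangles = list(combinations(range(4), 3))
d0, d1 = coboundary_matrices(4, edges, triangles)

f = np.array([0.3, -1.2, 2.0, 0.7])           # an arbitrary global ranking (assumed values)
grad_f = d0 @ f                               # induced pairwise ranking
curl_grad_f = d1 @ grad_f                     # vanishes on every triangle
```

With this encoding `d1 @ d0` is the zero matrix, which is the matrix form of δ1 ∘ δ0 = 0.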

• Pairwise rankings = alternating 2-tensors = skew-symmetric matrices = log of Saaty's reciprocal matrices
• Triplewise rankings = alternating 3-tensors

See also: Douglas Arnold's talk on Tuesday

What does it tell us?

$$\text{Global} \xrightarrow{\ \mathrm{grad}\ } \text{Pairwise} \xrightarrow{\ \mathrm{curl}\ } \text{Triplewise}$$

• grad(Global) (i.e. im(δ0)): the proper subset of pairwise rankings that are induced from global rankings

• curl(Pairwise) (i.e. im(δ1)): measures the consistency/triangular arbitrage on each triangle {i, j, k},

(δ1 g)(i, j, k) = gij + gjk + gki

• ker(curl) (i.e. ker(δ1)): consistent, curl-free, triangular arbitrage-free pairwise rankings; in particular,
• curl ∘ grad(Global) = 0 (i.e. δ1 ∘ δ0 = 0) says global rankings are consistent/curl-free

Reverse direction: conjugate operators

$$\text{Global} \xleftarrow{\ \mathrm{grad}^* (= -\mathrm{div})\ } \text{Pairwise} \xleftarrow{\ \mathrm{curl}^*\ } \text{Triplewise}$$

• grad∗: $\delta_0^T$ under the Euclidean inner product, gives the total inflow-outflow difference at each vertex (negative divergence),

$$(\delta_0^T g)(i) = \sum g_{*i} - \sum g_{i*}$$

• $\ker(\delta_0^T)$, being divergence-free, is cyclic (interior/boundary)
• curl∗: $\delta_1^T$, gives interior cyclic pairwise rankings along triangles in K2, which are inconsistent

Combinatorial Laplacian

Define the k-dimensional combinatorial Laplacian $\Delta_k : C^k \to C^k$ by

$$\Delta_k = \delta_{k-1} \delta_{k-1}^T + \delta_k^T \delta_k, \quad k > 0$$

• k = 0: $\Delta_0 = \delta_0^T \delta_0$ is the well-known graph Laplacian
• k = 1: $\Delta_1 = \mathrm{curl}^* \circ \mathrm{curl} - \mathrm{grad} \circ \mathrm{div}$

Important properties:
• $\Delta_k$ is positive semi-definite
• $\ker(\Delta_k) = \ker(\delta_{k-1}^T) \cap \ker(\delta_k)$: the k-harmonics, whose dimension equals the k-th Betti number
• Hodge Decomposition Theorem
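As a small illustration of these properties, the sketch below assembles Δ0 and Δ1 for two toy complexes and counts zero eigenvalues to read off Betti numbers. The complexes (a hollow square and a filled triangle), the 0-based vertex numbering, and the helper names `coboundaries`, `laplacians`, and `nullity` are all assumptions made for this example.

```python
import numpy as np

def coboundaries(n, edges, triangles):
    """Matrices of d0 (vertices -> edges) and d1 (edges -> triangles); edges are (i, j) with i < j."""
    idx = {e: r for r, e in enumerate(edges)}
    d0 = np.zeros((len(edges), n))
    for r, (i, j) in enumerate(edges):
        d0[r, i], d0[r, j] = -1.0, 1.0                   # f_j - f_i
    d1 = np.zeros((len(triangles), len(edges)))
    for r, (i, j, k) in enumerate(triangles):
        d1[r, idx[(i, j)]], d1[r, idx[(j, k)]], d1[r, idx[(i, k)]] = 1.0, 1.0, -1.0
    return d0, d1

def laplacians(d0, d1):
    """Delta_0 = d0^T d0 and Delta_1 = d0 d0^T + d1^T d1."""
    return d0.T @ d0, d0 @ d0.T + d1.T @ d1

def nullity(L, tol=1e-9):
    """Dimension of ker(L) for a symmetric positive semi-definite L."""
    return int(np.sum(np.linalg.eigvalsh(L) < tol))

# Hollow square: a 4-cycle with no 2-simplices, so it has one "hole".
d0_sq, d1_sq = coboundaries(4, [(0, 1), (1, 2), (2, 3), (0, 3)], [])
L0_sq, L1_sq = laplacians(d0_sq, d1_sq)

# Filled triangle: the single cycle is bounded by a 2-simplex, so no hole.
d0_tri, d1_tri = coboundaries(3, [(0, 1), (0, 2), (1, 2)], [(0, 1, 2)])
L0_tri, L1_tri = laplacians(d0_tri, d1_tri)
```

For the hollow square, `nullity(L0_sq)` counts connected components and `nullity(L1_sq)` counts the single hole; filling the triangle removes the harmonic 1-cochain.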

Hodge Decomposition Theorem

Theorem

The space of pairwise rankings, $C^1(V, \mathbb{R})$, admits an orthogonal decomposition into three components,

$$C^1(V, \mathbb{R}) = \operatorname{im}(\delta_0) \oplus H_1 \oplus \operatorname{im}(\delta_1^T),$$

where

$$H_1 = \ker(\delta_1) \cap \ker(\delta_0^T) = \ker(\Delta_1).$$
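One way to compute the three components numerically is by two least-squares projections, sketched below. This is an illustrative approach under stated assumptions, not necessarily the computation used for the talk's figures: the toy complex (one filled triangle plus one unfilled cycle), the edge values in `g`, and the names `coboundaries` and `hodge_decompose` are hypothetical.

```python
import numpy as np

def coboundaries(n, edges, triangles):
    """Matrices of d0 (vertices -> edges) and d1 (edges -> triangles); edges are (i, j) with i < j."""
    idx = {e: r for r, e in enumerate(edges)}
    d0 = np.zeros((len(edges), n))
    for r, (i, j) in enumerate(edges):
        d0[r, i], d0[r, j] = -1.0, 1.0                   # f_j - f_i
    d1 = np.zeros((len(triangles), len(edges)))
    for r, (i, j, k) in enumerate(triangles):
        d1[r, idx[(i, j)]], d1[r, idx[(j, k)]], d1[r, idx[(i, k)]] = 1.0, 1.0, -1.0
    return d0, d1

def hodge_decompose(g, d0, d1):
    """Split g into its im(d0), harmonic, and im(d1^T) parts by orthogonal projection."""
    f, *_ = np.linalg.lstsq(d0, g, rcond=None)       # least-squares global ranking
    g_grad = d0 @ f                                  # projection onto im(d0)
    h, *_ = np.linalg.lstsq(d1.T, g, rcond=None)
    g_curl = d1.T @ h                                # projection onto im(d1^T)
    g_harm = g - g_grad - g_curl                     # remainder lies in H_1
    return g_grad, g_harm, g_curl

# Triangle {0,1,2} is filled; the cycle 0-2-3 is not, so K has one hole.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (0, 3)]
triangles = [(0, 1, 2)]
d0, d1 = coboundaries(4, edges, triangles)

g = np.array([1.0, 3.0, 1.0, 2.0, -1.0])             # hypothetical g_ij, one value per edge
g_grad, g_harm, g_curl = hodge_decompose(g, d0, d1)
```

In the language of the main theme, `g_grad` is the global part, `g_harm` the consistent cyclic part, and `g_curl` the inconsistent cyclic part; the two projections can be done independently because im(δ0) and im(δ1ᵀ) are orthogonal.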

Hodge Decomposition Illustration

Figure: Hodge Decomposition for Pairwise Rankings

An Example from Jester Dataset

Figure: Hodge Decomposition for a pairwise ranking on four Jester jokes (No. 1–4): $\hat{g}_1$ gives a global ranking (order: 1 > 2 > 3 > 4) which accounts for 90% of the total norm; $\hat{g}_2$ is the consistent cyclic part on triangles {{123}, {124}} with 7% of the norm; and $\hat{g}_3$ is the inconsistent cyclic part.

Acyclic-Cyclic Decomposition

Corollary

Every pairwise ranking admits a unique orthogonal decomposition,

$$g = \operatorname{proj}_{\operatorname{im}(\delta_0)} g + \operatorname{proj}_{\ker(\delta_0^T)} g,$$

i.e.

Pairwise = grad(Global) + Cyclic

Note: pairwise rankings induced from global rankings are exactly the acyclic component, i.e. the orthogonal complement of the cyclic pairwise rankings.

Consistency

Definition A pairwise ranking g is consistent on a triangle (2-simplex) (i, j, k) if gij + gjk + gki = 0, in other words, (δ1g)(i, j, k) = 0.
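The definition translates directly into a check over a list of triangles. In the sketch below a pairwise ranking is stored as a skew-symmetric matrix `G` with 0-based indices; the function name `triangle_curls` and the example values are hypothetical.

```python
import numpy as np

def triangle_curls(G, triangles):
    """Return g_ij + g_jk + g_ki for each triangle (i, j, k) in the list."""
    return np.array([G[i, j] + G[j, k] + G[k, i] for i, j, k in triangles])

# A pairwise ranking induced by a global ranking f: g_ij = f_j - f_i.
f = np.array([0.0, 1.0, 3.0, 2.0])
G_global = f[None, :] - f[:, None]

# A purely cyclic flow 0 -> 1 -> 2 -> 0 of unit strength (skew-symmetric).
C = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (2, 0)]:
    C[i, j], C[j, i] = 1.0, -1.0

triangles = [(0, 1, 2), (0, 1, 3)]
curl_global = triangle_curls(G_global, triangles)      # consistent on both triangles
curl_mixed = triangle_curls(G_global + C, triangles)   # the cycle shows up as nonzero curl
```

Collecting the absolute values of such curls over all triangles in K2 gives the kind of curl distribution discussed in the note that follows.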

Note: Consistency depends on the triangles (2-simplices), so for a pairwise ranking g, |curl(g)(i, j, k)| measures the curl distribution over triangles (2-simplices) in K2.

Figure: Curl distribution of Jester dataset (histogram; horizontal axis from −0.2 to 0.15, vertical axis in units of 10^4).

Consistent Decomposition

Corollary

1 A consistent pairwise ranking g associated with K has a unique orthogonal decomposition

$$g = \operatorname{proj}_{\operatorname{im}(\delta_0)} g + \operatorname{proj}_{H_1} g,$$

i.e.

g = grad(Global) + Harmonic

where the harmonic part is cyclic on the "holes" of the complex K.

2 Every consistent pairwise ranking on a contractible K is induced from a global ranking.

Note: (2) rephrases the famous theorem in exchange Economics: triangular arbitrage-free implies arbitrage-free and the existence of a universal equivalent.
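Part (2) can be illustrated by integrating a pairwise ranking along a spanning tree and then checking every edge. The sketch below assumes a connected graph with vertices numbered from 0 and a ranking stored as a dictionary of oriented edges; the function name `integrate_ranking` and both toy examples are hypothetical.

```python
from collections import deque

def integrate_ranking(n, g):
    """g maps an oriented edge (i, j) to g_ij (so g_ji = -g_ij); the graph is assumed connected.
    Returns (f, ok): f with f[0] = 0 obtained by summing g along a breadth-first
    spanning tree, and whether g_ij == f_j - f_i holds on every edge."""
    adj = {v: [] for v in range(n)}
    for (i, j), val in g.items():
        adj[i].append((j, val))
        adj[j].append((i, -val))
    f = {0: 0.0}
    queue = deque([0])
    while queue:
        i = queue.popleft()
        for j, val in adj[i]:
            if j not in f:
                f[j] = f[i] + val          # extend the potential along a tree edge
                queue.append(j)
    ok = all(abs((f[j] - f[i]) - val) < 1e-9 for (i, j), val in g.items())
    return [f[v] for v in range(n)], ok

# Two filled triangles {0,1,2} and {1,2,3} glued along an edge (contractible);
# g sums to zero around each triangle, so it is consistent.
g_disk = {(0, 1): 1.0, (1, 2): 2.0, (0, 2): 3.0, (2, 3): -1.0, (1, 3): 1.0}
f_disk, ok_disk = integrate_ranking(4, g_disk)

# Hollow square: no triangles, so g is vacuously consistent, but it circulates around the hole.
g_hole = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0, (3, 0): 1.0}
f_hole, ok_hole = integrate_ranking(4, g_hole)
```

On the contractible example the integrated `f_disk` reproduces every edge value, whereas on the hollow square no global ranking can, which is the harmonic component living on the hole.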

Conclusions and Future Work

Conclusions
• Hodge Theory provides an orthogonal decomposition for pairwise rankings
• Such a decomposition helps characterize the cyclicity and (triangular) consistency of pairwise rankings

Future
• Comparisons with other spectral methods
   • on symmetry groups (Diaconis)
   • Markov Chain based methods (PageRank, etc.) as graph Laplacians
• Design new applications on large-scale data sets, e.g. the Netflix dataset

Acknowledgements

Gunnar Carlsson (Stanford)
Persi Diaconis (Stanford)
Nick Eriksson (Stanford)
Fei Han (UCB)
Susan Holmes (Stanford)
Xiaoye Jiang (Stanford)
Ming Ma (UCB and Beijing Institute of Technology)
Michael Mahoney (Yahoo! Research)
Steve Smale (TTI-U Chicago and UCB)
Shmuel Weinberger (U Chicago)