Combinatorial Laplacian and Rank Aggregation
Yuan Yao
Stanford University
ICIAM, Zürich, July 16–20, 2007
Joint work with Lek-Heng Lim

Outline
1 Two Motivating Examples
2 Reflections on Ranking
  • Ordinal vs. Cardinal
  • Global, Local, vs. Pairwise
3 Discrete Exterior Calculus and Combinatorial Laplacian
  • Discrete Exterior Calculus
  • Combinatorial Laplacian Operator
4 Hodge Theory
  • Cyclicity of Pairwise Rankings
  • Consistency of Pairwise Rankings
5 Conclusions and Future Work
Example I: Customer-Product Rating
Example (Customer-Product Rating)
• m-by-n customer-product rating matrix $X \in \mathbb{R}^{m \times n}$
• X typically contains lots of missing values (say ≥ 90%).
• The first-order statistic, the mean score for each product, might suffer because
  • most customers rate only a very small portion of the products
  • different products might have different raters, so mean scores involve noise due to arbitrary individual rating scales
From 1st Order to 2nd Order: Pairwise Rankings
The arithmetic mean of the score difference between products i and j over all customers who have rated both of them,
\[ g_{ij} = \frac{\sum_k (X_{kj} - X_{ki})}{\#\{k : X_{ki}, X_{kj} \text{ exist}\}}, \]
is translation invariant. If all the scores are positive, the geometric mean of the score ratio over all customers who have rated both i and j,
\[ g_{ij} = \Big( \prod_k \frac{X_{kj}}{X_{ki}} \Big)^{1/\#\{k : X_{ki}, X_{kj} \text{ exist}\}}, \]
is scale invariant. In both formulas k runs over the customers who have rated both i and j.
More invariant
Define the pairwise ranking $g_{ij}$ as the probability that product j is preferred to i in excess of a purely random choice,
\[ g_{ij} = \Pr\{k : X_{kj} > X_{ki}\} - \frac{1}{2}. \]
This is invariant up to a monotone transformation.
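The three pairwise statistics above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: missing ratings are encoded as `np.nan`, and the function name `pairwise_rankings` and the toy matrix `X` are hypothetical, not taken from the talk.

```python
import numpy as np

def pairwise_rankings(X):
    """Pairwise rankings from an m-by-n rating matrix X (np.nan = missing).

    Returns three n-by-n arrays; entry [i, j] compares product j against i:
      mean_diff : arithmetic mean of X[k, j] - X[k, i]
      geo_ratio : geometric mean of X[k, j] / X[k, i] (scores must be positive)
      pref      : fraction of common raters with X[k, j] > X[k, i], minus 1/2
    The diagonal and pairs with no common rater are left as np.nan.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[1]
    mean_diff = np.full((n, n), np.nan)
    geo_ratio = np.full((n, n), np.nan)
    pref = np.full((n, n), np.nan)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # customers k who rated both product i and product j
            both = ~np.isnan(X[:, i]) & ~np.isnan(X[:, j])
            if not both.any():
                continue
            xi, xj = X[both, i], X[both, j]
            mean_diff[i, j] = np.mean(xj - xi)
            geo_ratio[i, j] = np.exp(np.mean(np.log(xj / xi)))
            pref[i, j] = np.mean(xj > xi) - 0.5
    return mean_diff, geo_ratio, pref

# Hypothetical toy data: 3 customers, 3 products, two missing ratings.
X = np.array([[1.0, 2.0, np.nan],
              [2.0, 4.0, 3.0],
              [np.nan, 1.0, 2.0]])
mean_diff, geo_ratio, pref = pairwise_rankings(X)
```

Note that with tied scores the third statistic is no longer exactly skew-symmetric; the sketch follows the definition on the slide and does not correct for ties.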
Example II: Purely Exchange Economics
Example (Pairwise ranking in exchange market)
n goods V = {1,..., n} in an exchange market, with an exchange rate matrix A such that
\[ 1 \text{ unit of } i = a_{ij} \text{ units of } j, \quad a_{ij} > 0, \]
which is a reciprocal matrix, i.e. $a_{ij} = 1/a_{ji}$.
• Ideally, a product triple (i, j, k) is called triangular arbitrage-free if $a_{ij} a_{jk} = a_{ik}$.
• Money (universal equivalent): does there exist a universal equivalent with pricing function $p : V \to \mathbb{R}_+$ such that $a_{ij} = p_j / p_i$?
From Pairwise to Global
Under the logarithmic map, $g_{ij} = \log a_{ij}$, we have an equivalent theory:
• triangular arbitrage-freeness is equivalent to
\[ g_{ij} + g_{jk} + g_{ki} = 0 \]
• a universal equivalent is a global ranking $f : V \to \mathbb{R}$ ($f_i = \log p_i$) such that
\[ g_{ij} = f_j - f_i =: (\delta_0 f)(i, j) \]
Here
• Global ranking ⇔ universal equivalent (price)
• Pairwise ranking ⇔ exchange rates
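As a small numerical check of this correspondence, the sketch below builds log exchange rates from a hypothetical price vector, verifies that every triangle sum vanishes, and recovers the global ranking up to an additive constant. The function names and the prices are illustrative assumptions; averaging the columns of g is one simple way to recover f when g is arbitrage-free on all pairs.

```python
import numpy as np

def log_rates_from_prices(p):
    """g[i, j] = log a_ij with a_ij = p_j / p_i, i.e. g[i, j] = f_j - f_i."""
    f = np.log(np.asarray(p, dtype=float))
    return f[None, :] - f[:, None]

def max_triangle_curl(g):
    """Largest |g_ij + g_jk + g_ki| over all index triples (i, j, k)."""
    curl = g[:, :, None] + g[None, :, :] + g.T[:, None, :]
    return np.max(np.abs(curl))

def global_ranking(g):
    """For arbitrage-free g, the column means equal f_j - mean(f)."""
    return g.mean(axis=0)

p = [1.0, 2.0, 8.0]                     # hypothetical prices
g = log_rates_from_prices(p)
f = global_ranking(g)

# Perturb one exchange rate (keeping g skew-symmetric) to create arbitrage.
g_bad = g.copy()
g_bad[0, 1] += 0.1
g_bad[1, 0] -= 0.1
```

Here `max_triangle_curl(g)` is zero up to rounding, while for `g_bad` it equals the size of the perturbation.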
Observations
Both examples
• contain cardinal information
• involve pairwise comparisons
How important are they?

Ordinal Rank Aggregation
Problem: given a set of partial/total orders $\{\preceq_i : i = 1, \ldots, n\}$ on a common set V, find
\[ (\preceq_1, \ldots, \preceq_n) \mapsto \preceq^* \]
as a partial order on V, satisfying a certain optimality condition.
Examples:
• voting
• Social Choice Theory
Notes:
• Impossibility Theorems (Arrow et al.)
• Hardness in solving (NP-hard for Kemeny optimality etc.)

Cardinal Rank Aggregation
Problem: given a set of functions $f_i : V \to \mathbb{R}$ (i = 1,..., n), find
\[ (f_1, \ldots, f_n) \mapsto f^* \]
as a function on V, satisfying a certain optimality condition.
Examples:
• customer-product rating, e.g. Amazon, Netflix
• stochastic choice with f as probability distributions on V, e.g. Google search, cardinal utility in Economics
Notes:
• relaxations leave room for ‘possibility’
• ordinal rankings are induced from cardinal rankings, but with information loss

Global, Local, and Pairwise Rankings
• Global ranking is a function on V, $f : V \to \mathbb{R}$
• Local (partial) ranking: restriction of a global ranking to a subset U, $f' : U \to \mathbb{R}$
• Pairwise ranking: $g : V \times V \to \mathbb{R}$ (with $g_{ij} = -g_{ji}$)
Note: pairwise rankings are simply skew-symmetric matrices sl(n) or certain equivalence classes in sl(n). Also we may view pairwise rankings as weighted digraphs.

Why Pairwise Ranking?
• The human mind can’t make preference judgements on moderately large sets (e.g. no more than 7 ± 2 items in psychology studies)
• But humans can do pairwise comparisons more easily and accurately
• Pairwise ranking naturally arises in tournaments, exchange Economics, etc.
• Pairwise ranking may reduce the bias caused by the arbitrariness of rating scales
• Pairwise ranking may contain more information than global ranking (to be seen soon)!
Our Main Theme
Below we’ll outline an approach to analyzing cardinal and pairwise rankings from the perspective of discrete exterior calculus. Briefly, Hodge Theory yields an orthogonal decomposition of pairwise rankings:
Pairwise = Global + Consistent Cyclic + Inconsistent Cyclic

Simplicial Complex of Products
Let V = {1,..., n} be the set of products or alternatives to be ranked. Construct a simplicial complex K:
0-simplices K0: V
1-simplices K1: edges {i, j} such that comparison (i.e. pairwise ranking) between i and j exists
2-simplices K2: triangles {i, j, k} such that
• every edge exists in K1
• more considerations on consistency, like triangular arbitrage-free
Note: it suffices here to construct K up to dimension 2!

Cochains
k-cochains $C^k(K, \mathbb{R})$: vector space of alternating (k+1)-tensors associated with $K_k$,
\[ \{ u : V^{k+1} \to \mathbb{R}, \; u_{i_{\sigma(0)}, \ldots, i_{\sigma(k)}} = \operatorname{sign}(\sigma)\, u_{i_0, \ldots, i_k} \} \]
for $(i_0, \ldots, i_k) \in K_k$, where $\sigma \in S_{k+1}$ is a permutation on (0,..., k).
Inner product in $C^k(K, \mathbb{R})$: standard Euclidean
In particular,
• global ranking: 0-cochains $f \in C^0(K, \mathbb{R}) \cong \mathbb{R}^n$
• pairwise ranking: 1-cochains $g \in C^1(K, \mathbb{R})$, $g_{ij} = -g_{ji}$

Coboundary Maps
k-dimensional coboundary maps $\delta_k : C^k(V, \mathbb{R}) \to C^{k+1}(V, \mathbb{R})$ are defined as the alternating difference operator
\[ (\delta_k u)(i_0, \ldots, i_{k+1}) = \sum_{j=0}^{k+1} (-1)^{j}\, u(i_0, \ldots, i_{j-1}, i_{j+1}, \ldots, i_{k+1}) \]
• $\delta_k$ plays the role of differentiation
• $\delta_{k+1} \circ \delta_k = 0$
In particular,
• $(\delta_0 f)(i, j) = f_j - f_i$ is the gradient of global ranking f
• $(\delta_1 g)(i, j, k) = g_{ij} + g_{jk} + g_{ki}$ is the curl of pairwise ranking g

A View from Discrete Exterior Calculus
We have the following cochain complex
\[ C^0(K, \mathbb{R}) \xrightarrow{\delta_0} C^1(K, \mathbb{R}) \xrightarrow{\delta_1} C^2(K, \mathbb{R}), \]
in other words,
\[ \text{Global} \xrightarrow{\text{grad}} \text{Pairwise} \xrightarrow{\text{curl}} \text{Triplewise} \]
and curl ◦ grad(Global Rankings) = 0.
• Pairwise rankings = alternating 2-tensors = skew-symmetric matrices = log of Saaty’s reciprocal matrices
• Triplewise rankings = alternating 3-tensors
See also: Douglas Arnold’s talk on Tuesday

What does it tell us?
\[ \text{Global} \xrightarrow{\text{grad}} \text{Pairwise} \xrightarrow{\text{curl}} \text{Triplewise} \]
• grad(Global) (i.e. $\operatorname{im}(\delta_0)$): a proper subset of pairwise rankings, namely those induced from global rankings
• curl(Pairwise) (i.e. $\operatorname{im}(\delta_1)$): measures the consistency/triangular arbitrage on triangle {i, j, k}
\[ (\delta_1 g)(i, j, k) = g_{ij} + g_{jk} + g_{ki} \]
• ker(curl) (i.e. $\ker(\delta_1)$): consistent, curl-free, triangular arbitrage-free
• in particular, curl ◦ grad(Global) = 0 (i.e. $\delta_1 \circ \delta_0 = 0$) says global rankings are consistent/curl-free

Reverse direction: conjugate operators
\[ \text{Global} \xleftarrow{\text{grad}^* \, (= -\text{div})} \text{Pairwise} \xleftarrow{\text{curl}^*} \text{Triplewise} \]
• $\text{grad}^*$: $\delta_0^T$ under the Euclidean inner product, gives the total inflow-outflow difference at each vertex (negative divergence)
\[ (\delta_0^T g)(i) = \sum_{*} g_{*i} - \sum_{*} g_{i*} \]
• $\ker(\delta_0^T)$, as divergence-free, is cyclic (interior/boundary)
• $\text{curl}^*$: $\delta_1^T$, gives interior cyclic pairwise rankings along triangles in K2, which are inconsistent

Combinatorial Laplacian
Define the k-dimensional combinatorial Laplacian, $\Delta_k : C^k \to C^k$, by
\[ \Delta_k = \delta_{k-1} \delta_{k-1}^T + \delta_k^T \delta_k, \quad k > 0 \]
• k = 0: $\Delta_0 = \delta_0^T \delta_0$ is the well-known graph Laplacian
• k = 1: $\Delta_1 = \text{curl}^* \circ \text{curl} - \text{grad} \circ \text{div}$
Important Properties:
• $\Delta_k$ is positive semi-definite
• $\ker(\Delta_k) = \ker(\delta_{k-1}^T) \cap \ker(\delta_k)$: k-harmonics, whose dimension equals the k-th Betti number
• Hodge Decomposition Theorem
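The operators on the preceding slides can be written down as small matrices. The sketch below is a minimal illustration, assuming that every triangle whose three edges are present is taken as a 2-simplex (the slides allow further consistency conditions); `coboundary_matrices` and `kernel_dim` are hypothetical helper names.

```python
import itertools
import numpy as np

def coboundary_matrices(n, edges):
    """Build delta_0 and delta_1 for the complex on vertices 0..n-1.

    Edges are stored as (i, j) with i < j. Assumption: K2 consists of all
    triangles whose three edges are in K1.
    """
    edges = sorted(set(tuple(sorted(e)) for e in edges))
    eidx = {e: r for r, e in enumerate(edges)}
    tris = [t for t in itertools.combinations(range(n), 3)
            if all(e in eidx for e in itertools.combinations(t, 2))]
    d0 = np.zeros((len(edges), n))
    for r, (i, j) in enumerate(edges):
        d0[r, i], d0[r, j] = -1.0, 1.0        # (d0 f)(i, j) = f_j - f_i
    d1 = np.zeros((len(tris), len(edges)))
    for r, (i, j, k) in enumerate(tris):
        d1[r, eidx[(i, j)]] = 1.0             # + g_ij
        d1[r, eidx[(j, k)]] = 1.0             # + g_jk
        d1[r, eidx[(i, k)]] = -1.0            # + g_ki = - g_ik
    return edges, tris, d0, d1

def kernel_dim(L):
    """Dimension of ker(L), i.e. the number of harmonic cochains."""
    return L.shape[0] - np.linalg.matrix_rank(L)

# Two triangles glued along the edge (0, 2): no hole.
edges, tris, d0, d1 = coboundary_matrices(4, [(0, 1), (1, 2), (0, 2), (2, 3), (0, 3)])
L0 = d0.T @ d0                    # graph Laplacian
L1 = d0 @ d0.T + d1.T @ d1        # 1-dimensional combinatorial Laplacian

# A bare 4-cycle: no triangles, one hole.
_, tris_c, d0_c, d1_c = coboundary_matrices(4, [(0, 1), (1, 2), (2, 3), (0, 3)])
L1_c = d0_c @ d0_c.T + d1_c.T @ d1_c
```

In this example `kernel_dim(L1)` is 0 for the glued triangles and `kernel_dim(L1_c)` is 1 for the bare cycle, matching the first Betti numbers of the two complexes.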
Hodge Decomposition Theorem
Theorem
The space of pairwise rankings, $C^1(V, \mathbb{R})$, admits an orthogonal decomposition into three components,
\[ C^1(V, \mathbb{R}) = \operatorname{im}(\delta_0) \oplus H_1 \oplus \operatorname{im}(\delta_1^T) \]
where
\[ H_1 = \ker(\delta_1) \cap \ker(\delta_0^T) = \ker(\Delta_1). \]
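One way to compute the three components numerically is by two least-squares projections, which is valid because im(δ0) and im(δ1^T) are orthogonal. The sketch below is an illustration only: the small complex (four vertices, five edges, one filled triangle and one unfilled cycle) and the vector `g` are hypothetical, and `hodge_decompose` is not a function from the talk.

```python
import numpy as np

def hodge_decompose(d0, d1, g):
    """Split g into gradient, harmonic and curl components.

    g_grad is the projection onto im(d0), g_curl the projection onto
    im(d1^T), and g_harm the remainder, which lies in ker(d1) and ker(d0^T).
    """
    f, *_ = np.linalg.lstsq(d0, g, rcond=None)
    g_grad = d0 @ f
    t, *_ = np.linalg.lstsq(d1.T, g, rcond=None)
    g_curl = d1.T @ t
    g_harm = g - g_grad - g_curl
    return g_grad, g_harm, g_curl

# Edges in order: (0,1), (1,2), (0,2), (2,3), (0,3).
d0 = np.array([[-1, 1, 0, 0],
               [0, -1, 1, 0],
               [-1, 0, 1, 0],
               [0, 0, -1, 1],
               [-1, 0, 0, 1]], dtype=float)
# Only triangle (0,1,2) is filled; the cycle 0-2-3 is left as a hole.
d1 = np.array([[1, 1, -1, 0, 0]], dtype=float)

g = np.array([1.0, 2.0, 0.0, 1.0, -1.0])   # hypothetical pairwise ranking
g_grad, g_harm, g_curl = hodge_decompose(d0, d1, g)
```

For this `g` the curl part is supported on the filled triangle, and the harmonic part circulates around the unfilled cycle 0-2-3.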
Hodge Decomposition Illustration
Figure: Hodge Decomposition for Pairwise Rankings
An Example from Jester Dataset
Figure: Hodge Decomposition for a pairwise ranking on four Jester jokes (No. 1-4): $\hat{g}_1$ gives a global ranking (order: 1 > 2 > 3 > 4) which accounts for 90% of the total norm; $\hat{g}_2$ is the consistent cyclic part on triangles {{123}, {124}} with 7% norm; and $\hat{g}_3$ is the inconsistent cyclic part.

Acyclic-Cyclic Decomposition
Corollary
Every pairwise ranking admits a unique orthogonal decomposition,
\[ g = \operatorname{proj}_{\operatorname{im}(\delta_0)} g + \operatorname{proj}_{\ker(\delta_0^T)} g, \]
i.e.
Pairwise = grad(Global) + Cyclic
Note: Pairwise rankings induced from global rankings are exactly the acyclic component, the orthogonal complement of the cyclic pairwise rankings.

Consistency
Definition A pairwise ranking g is consistent on a triangle (2-simplex) (i, j, k) if gij + gjk + gki = 0, in other words, (δ1g)(i, j, k) = 0.
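The definition can be evaluated triangle by triangle even when some comparisons are missing. The following sketch assumes the pairwise ranking is stored as a skew-symmetric array with `np.nan` for absent comparisons; `triangle_curls` and the toy matrix `G` are hypothetical.

```python
import itertools
import numpy as np

def triangle_curls(G):
    """Curl g_ij + g_jk + g_ki on every triangle i < j < k whose three
    pairwise comparisons all exist (np.nan marks a missing comparison)."""
    n = G.shape[0]
    curls = {}
    for i, j, k in itertools.combinations(range(n), 3):
        c = G[i, j] + G[j, k] + G[k, i]
        if not np.isnan(c):
            curls[(i, j, k)] = float(c)
    return curls

# Hypothetical data on 4 items; the comparison between 1 and 3 is missing.
G = np.full((4, 4), np.nan)
np.fill_diagonal(G, 0.0)
for i, j, v in [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 2.0), (2, 3, 1.0), (0, 3, 1.0)]:
    G[i, j], G[j, i] = v, -v

curls = triangle_curls(G)
```

The absolute values of the returned curls are the quantities whose distribution over triangles is discussed in the note that follows.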
Note: Consistency depends on the triangles (2-simplices), so for a pairwise ranking g, |curl(g)(i, j, k)| measures the curl distribution over triangles (2-simplices) in K2.
Figure: Curl distribution of Jester dataset

Consistent Decomposition
Corollary
1 A consistent pairwise ranking g associated with K has a unique orthogonal decomposition
\[ g = \operatorname{proj}_{\operatorname{im}(\delta_0)} g + \operatorname{proj}_{H_1} g, \]
i.e. g = grad(Global) + Harmonic, where the harmonic part is cyclic on the “holes” of the complex K.
2 Every consistent pairwise ranking on a contractible K is induced from a global ranking.
Note: (2) rephrases the famous theorem in exchange Economics: triangular arbitrage-free implies arbitrage-free and the existence of a universal equivalent.
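The role of the "holes" can be seen on two hypothetical toy complexes: a bare 4-cycle, where consistency is vacuous and a circulating flow is purely harmonic, and the same cycle with a chord and two filled triangles, where every consistent pairwise ranking is a gradient. The sketch measures the distance from g to im(δ0) by least squares; all names and numbers are illustrative assumptions.

```python
import numpy as np

def gradient_residual(d0, g):
    """Distance from g to im(d0); zero iff g is induced from a global ranking."""
    f, *_ = np.linalg.lstsq(d0, g, rcond=None)
    return float(np.linalg.norm(g - d0 @ f))

# Case A: 4-cycle with edges (0,1), (1,2), (2,3), (0,3) and no triangles.
d0_cycle = np.array([[-1, 1, 0, 0],
                     [0, -1, 1, 0],
                     [0, 0, -1, 1],
                     [-1, 0, 0, 1]], dtype=float)
g_cycle = np.array([1.0, 1.0, 1.0, -1.0])    # unit flow 0 -> 1 -> 2 -> 3 -> 0
res_cycle = gradient_residual(d0_cycle, g_cycle)

# Case B: add the chord (0,2) and fill triangles (0,1,2) and (0,2,3).
d0_disk = np.vstack([d0_cycle, [-1, 0, 1, 0]])
d1_disk = np.array([[1, 1, 0, 0, -1],        # g_01 + g_12 - g_02
                    [0, 0, 1, -1, 1]],       # g_02 + g_23 - g_03
                   dtype=float)
r = np.array([1.0, -2.0, 0.5, 3.0, 1.0])     # arbitrary starting vector
t, *_ = np.linalg.lstsq(d1_disk.T, r, rcond=None)
g_disk = r - d1_disk.T @ t                   # projection of r onto ker(d1): consistent
res_disk = gradient_residual(d0_disk, g_disk)
```

In case A the residual equals the full norm of the flow, so no global ranking explains it; in case B the residual vanishes up to rounding.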
Conclusions and Future Work
Conclusions
• Hodge Theory provides an orthogonal decomposition for pairwise rankings
• Such a decomposition helps characterize the cyclicity and (triangular) consistency of pairwise rankings
Future
• Comparisons with other spectral methods
  • Fourier Analysis on symmetry groups (Diaconis)
  • Markov Chain based methods (PageRank, etc.) as graph Laplacians
• Design new algorithms
• Applications on large scale data sets, e.g. Netflix dataset.
Acknowledgements
• Gunnar Carlsson (Stanford)
• Persi Diaconis (Stanford)
• Nick Eriksson (Stanford)
• Fei Han (UCB)
• Susan Holmes (Stanford)
• Xiaoye Jiang (Stanford)
• Ming Ma (UCB and Beijing Institute of Technology)
• Michael Mahoney (Yahoo! Research)
• Steve Smale (TTI-U Chicago and UCB)
• Shmuel Weinberger (U Chicago)