On Ranking and Choice Models

Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI-16) On Ranking and Choice Models Shivani Agarwal Radcliffe Institute for Advanced Study, Harvard University, Cambridge, MA, USA Indian Institute of Science, Bangalore, India [email protected] Abstract how to design new algorithms for ranking form pairwise comparisons that achieve desirable goals under various conditions In today’s big data era, huge amounts of ranking [Rajkumar and Agarwal, 2014; 2016a; Rajkumar et al., 2015; and choice data are generated on a daily basis, and Rajkumar and Agarwal, 2016b]. consequently, many powerful new computational In marketing, when presenting products to users in a store tools for dealing with ranking and choice data have or on a website, it is important to categorize related products emerged in recent years. This paper highlights together so that users can quickly find what they are looking recent developments in two areas of ranking and for. These categories should ideally be based on how users choice modeling that cross traditional boundaries themselves make choices, so that products that tend to be and are of multidisciplinary interest: ranking from liked or disliked together are grouped together. In the second pairwise comparisons, and automatic discovery of part of the paper (Section 3), we describe a new approach for latent categories from choice survey data. automatic discovery of categories from choice data, which brings together ideas from random utility choice models and 1 Introduction topic models in order to automatically discover latent cate- In today’s big data era, huge amounts of data are generated in gories from choice survey data [Agarwal and Saha, 2016]. the form of rankings and choices on a daily basis: restaurant ratings, product choices, employer ratings, hospital rankings, 2 Ranking from Pairwise Comparisons and so on. Given the increasing universality of such ranking and choice data, many powerful new computational tools for Ranking from pairwise comparisons is a ubiquitous problem dealing with such data have emerged over the last few years that arises in a variety of applications. The basic types of in areas related to AI, including in particular machine learn- questions of interest here are the following: Say there are n items, denoted [n]= 1,...,n , and we observe the out- ing, statistics, operations research, and computational social { } choice. Here we briefly highlight such developments in two comes of a number of pairwise comparisons among them broad areas: ranking from pairwise comparisons, and auto- (such as pairwise preferences among movies, pairwise judg- matic discovery of latent categories from choice data. ments among job candidates, or pairwise game outcomes As is well known, humans generally find it easier to ex- among sports teams). Based on these pairwise comparisons, press preferences in the form of comparisons between two can we find a good ranking of the n items, or identify a ‘best’ items, rather than directly ranking a large number of items. item or a ‘good’ set of items among them? How many com- Indeed, in many domains, one is given outcomes of pairwise parisons do we need? What sorts of algorithms can we use? comparisons among some set of items (such as movies, can- Under what conditions do these algorithms succeed? didates in an election, or sports teams), and must estimate a A natural statistical framework for analyzing such ques- ranking of all the items from these observed pairwise compar- tions assumes that for each pair of items (i, j), there is an underlying pairwise preference probability P [0, 1] such isons. Due to the ubiquitous nature of the problem, several ij 2 different algorithms have been developed for ranking from that whenever items i and j are compared, item i beats item pairwise comparisons in different communities, including j with probability Pij and j beats i with probability Pji = e.g. maximum likelihood estimation methods [Bradley and 1 Pij. Collectively, these pairwise preference probabilities − n n Terry, 1952; Luce, 1959], spectral ranking methods [Kendall, form an underlying pairwise preference matrix P [0, 1] ⇥ (with P = 1 i). Different statistical models for2 pairwise 1955; Keener, 1993; Negahban et al., 2012], noisy sorting ii 2 8 methods [Braverman and Mossel, 2008], and many others; comparisons lead to different conditions on P. For exam- however, little has been understood about when one algo- ple, if the pairwise preference probabilities follow the well- known Bradley-Terry-Luce (BTL) statistical model for pair- rithm should be preferred over another. In the first part of the n paper (Section 2), we discuss recent developments in under- wise comparisons, then there is a score vector w R++ wi 2 such that Pij = i, j [Bradley and Terry, 1952; standing these issues, including understanding the conditions wi+wj 8 under which different ranking algorithms succeed or fail, and Luce, 1959]; if they follow the ‘noisy permutation’ (NP) 4050 Condition on P Property satisfied by P n wi Bradley-Terry-Luce (BTL) w : Pij = R++ wi+wj 9 2 1 Low-noise (LN) Pij > 2 k Pik > k Pjk 1 ) Pik Pjk Logarithmic LN (LogLN) Pij > ln( ) > ln( ) 2 Pk Pki P k Pkj 1 ) Markov consistency (MC) Pij > ⇡i >⇡j 2 P P 1 ) 1 1 Stochastic transitivity (ST) Pij > 2 ,Pjk > 2 Pik > 2 1 ) Condorcet winner (CW) i : Pij > 2 j = i 9 81 6 1 p if i σ j Noisy permutation (NP) σ n,p< : Pij = − 9 2S 2 p otherwise Low rank (LR( , r)) rank( (P)) r ( :[0,n1] R,r [n]) ! 2 Figure 1: Conditions on the matrix of underlying pairwise preference probabilities P and relationships among them. These conditions play an important role in determining the success of different algorithms for ranking from pairwise comparisons. model, then there is a permutation σ n and noise param- Thus the input pairwise comparison data here is of the form eter p [0, 1 ) such that i = j, P =12S p if σ(i) <σ(j) y1 ,...,yK , with yk =1denoting that the k-th com- 2 2 8 6 ij − { ij ij }i<j ij (which we also denote as i σ j), and Pij = p otherwise k parison between i and j resulted in i beating j, and yij =0 [Braverman and Mossel, 2008; Wauthier et al., 2013]. Sev- denoting the reverse; under the statistical model discussed eral other conditions on P are also of interest: see Figure 1. k above, each comparison outcome yij is a random draw from One of our goals in recent work has been to understand a Bernoulli random variable with parameter Pij. Given this how the matrix of underlying pairwise probabilities P affects pairwise comparison data, consider the goal of finding a good the ranking goals that can be achieved, the success of differ- ranking or permutation of the n items, σ n. ent algorithms in achieving those goals, and the number of A natural measure of the quality of2 aS permutation σ is pairwise comparisons that are needed. We focus here mostly its pairwise disagreement error w.r.t. theb underlying pairwise on settings where item pairs to be compared are selected ran- preference probabilities P: domly (or fixed in advance), but similar concerns also apply when pairs to be compared are selected in an active fashion.1 1 1 dis(σ, P)= 1 (Pij )(σ(j) σ(i)) < 0 . As it turns out, the matrix P plays a huge role in deter- n − 2 − 2 i<j mining the success of different algorithms for ranking from X pairwise comparisons. For example, spectral ranking algo- An optimal ranking or permutation σ⇤ is then one that mini- rithms such as Rank Centrality perform well when P satis- mizes this pairwise disagreement error: fies the BTL condition or the slightly more general Markov σ⇤ argminσ dis(σ, P) . consistency (MC) condition, but can fail miserably under 2 2Sn more general settings of P; indeed, if P is known only Under what conditions on P can we find an optimal ranking to satisfy stochastic transitivity (ST), then using an SVM- from the observed pairwise comparison data? Clearly, as the based ranking algorithm or a topological sort based algo- number of comparisons K per pair increases, we expect to be rithm can be a better choice [Rajkumar and Agarwal, 2014; able to construct increasingly accurate estimates of P. Which 2016a]. If P does not satisfy ST, then all these algorithms ranking algorithms have the property that as K increases, the fail to even recover a good set of items at the top, but one can rankings they produce approach an optimal ranking? use other algorithms for this purpose [Rajkumar et al., 2015]. In recent work [Rajkumar and Agarwal, 2014; 2016b], we When comparisons can be made among only O(n log n) non- show that if the underlying pairwise probabilities P satisfy actively sampled pairs, then under suitable conditions on P, the LN condition, then both the simple matrix Borda algo- one can use algorithms based on low-rank matrix completion rithm and the popular method of maximum likelihood esti- [Rajkumar and Agarwal, 2016b]. Below we summarize some mation under a BTL model succeed in recovering (with high of these findings and give pointers for further investigation. probability) an optimal ranking (for sufficiently large K); if P satisfies the LogLN condition, then the least squares rank- 2.1 Finding an Optimal Ranking ing algorithm succeeds in recovering such an optimal rank- n Let us start by considering the setting where all 2 pairs ing; and if P satisfies the MC condition, then the Rank Cen- are compared a fixed number of times, say K times each.2 trality algorithm recovers an optimal ranking.

Load more