Submitted to Bernoulli arXiv: arXiv:0000.0000 Mallows and Generalized Mallows Model for Matchings EKHINE IRUROZKI1,* BORJA CALVO1,** and JOSE A. LOZANO1,† 1University of the Basque Country E-mail: *
[email protected]; **
[email protected]; †
[email protected] The Mallows and Generalized Mallows Models are two of the most popular probability models for distributions on permutations. In this paper we consider both models under the Hamming distance. This models can be seen as models for matchings instead of models for rankings. These models can not be factorized, which contrasts with the popular MM and GMM under Kendall’s-τ and Cayley distances. In order to overcome the computational issues that the models involve, we introduce a novel method for computing the partition function. By adapting this method we can compute the expectation, joint and conditional probabilities. All these methods are the basis for three sampling algorithms, which we propose and analyze. Moreover, we also propose a learning algorithm. All the algorithms are analyzed both theoretically and empirically, using synthetic and real data from the context of e-learning and Massive Open Online Courses (MOOC). Keywords: Mallows Model, Generalized Mallows Model, Hamming, Matching, Sampling, Learn- ing. 1. Introduction Permutations appear naturally in a wide variety of domains, from Social Sciences [47] to Machine Learning [8]. Probability models over permutations are an active research topic in areas such as preference elicitation [7], information retrieval [17], classification [9], etc. Some prominent examples of these are models based on pairwise preferences [33], Placket-Luce [34], [42], and Mallows Models [35]. One can find in the recent literature on distributions on permutations both theoretical discussions and practical applications, as well as extensions of all the aforementioned models.