Understanding the Generalized Median Stable Matchings∗

Christine T. Cheng†
Department of Computer Science
University of Wisconsin–Milwaukee, Milwaukee, WI 53211, USA.
[email protected]

March 31, 2009

Abstract

Let $I$ be a stable matching instance with $N$ stable matchings. For each man $m$, order his (not necessarily distinct) $N$ partners from his most preferred to his least preferred. Denote the $i$th woman in his sorted list as $p_i(m)$. Let $\alpha_i$ consist of the man-woman pairs where each man $m$ is matched to $p_i(m)$. Teo and Sethuraman proved this surprising result: for $i = 1$ to $N$, not only is $\alpha_i$ a matching, it is also stable. The $\alpha_i$'s are called the generalized median stable matchings of $I$. Determining if these stable matchings can be computed efficiently is an open problem. In this paper, we present a new characterization of the generalized median stable matchings that provides interesting insights. It implies that the generalized median stable matchings in the middle ($\alpha_{(N+1)/2}$ when $N$ is odd, $\alpha_{N/2}$ and $\alpha_{N/2+1}$ when $N$ is even) are fair not only in a local sense but also in a global sense because they are also medians of the lattice of stable matchings. We then show that there are some families of SM instances for which computing an $\alpha_i$ is easy but that the task is NP-hard in general. Finally, we also consider what it means to approximate a median stable matching and present results for this problem.

1 Introduction

In the stable marriage problem (SM), there are $n$ men and $n$ women each of whom has a list that ranks all individuals of the opposite sex. A matching is a set of man-woman pairs where each individual appears in at most one pair. The objective of the problem is to find a matching $\mu$ that has $n$ pairs and has no blocking pairs, i.e., a man and a woman who prefer each other over their partners in $\mu$.
The rationale behind the stability condition is that if a blocking pair exists, the man and the woman will likely leave their partners and thereby compromise the integrity of the matching $\mu$. A celebrated result by Gale and Shapley states that every instance of SM has a stable matching that can be found in $O(n^2)$ time [13]. Today, centralized stable matching algorithms match medical residents to hospitals [24], students to schools [1, 2], etc.

∗A preliminary version [8] of this paper was presented at LATIN 2008.
†Supported by NSF Award No. CCF-0830678.

In this paper, we focus on a set of stable matchings that is due to Teo and Sethuraman [27]. Let $I$ be an SM instance, and $M(I)$ its set of stable matchings. Let $p_\mu(a)$ denote the partner of $a$ in stable matching $\mu$. By formulating the stable matchings problem as a linear program, they discovered the following surprising result:

Theorem 1 [27] Suppose that $Z \subseteq M(I)$ and $z = |Z|$. For each man $m$, sort the multiset of women $\{p_\mu(m), \mu \in Z\}$ from $m$'s most preferred to least preferred woman. Let $p_{i,Z}(m)$ denote the $i$th woman in this sorted list. Do the same for each woman $w$. Let $\alpha_{i,Z}$ consist of the man-woman pairs where each man $m$ is matched to $p_{i,Z}(m)$. Similarly, let $\beta_{i,Z}$ consist of the man-woman pairs where each woman $w$ is matched to $p_{i,Z}(w)$. For $i = 1, \ldots, z$, $\alpha_{i,Z}$ and $\beta_{i,Z}$ are stable matchings; moreover, $\alpha_{i,Z} = \beta_{z-i+1,Z}$.

When $Z = M(I)$ and $N = |M(I)|$, we denote $\alpha_{i,Z}$ and $\beta_{i,Z}$ as $\alpha_i$ and $\beta_i$ respectively. We also follow Klaus and Klijn's [18] terminology and call $\alpha_i = \beta_{N-i+1}$ the $i$th generalized median stable matching of $I$. The most interesting of the generalized median stable matchings are the ones in the middle ($\alpha_{(N+1)/2}$ when $N$ is odd, and $\alpha_{N/2}$ and $\alpha_{N/2+1}$ when $N$ is even) because they match every participant to his/her (lower or upper) median stable partner and, thus, are fair in a very strong sense. Since Teo and Sethuraman's work, several authors [22, 12, 18] have proven the existence of the generalized median stable matchings.
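
To make Theorem 1 concrete, the following is a minimal Python sketch that applies the definition directly: it enumerates all stable matchings of a tiny instance by brute force, then builds each $\alpha_{i,Z}$ and $\beta_{i,Z}$ by sorting partners. The function names (`is_stable`, `all_stable_matchings`, `alpha`, `beta`) and the three-couple instance are illustrative assumptions, not taken from the paper, and the enumeration over all $n!$ perfect matchings is only feasible for very small $n$.

```python
from itertools import permutations

def is_stable(matching, men_prefs, women_prefs):
    """Return True if `matching` (a dict man -> woman) has no blocking pair."""
    husband = {w: m for m, w in matching.items()}
    for m, prefs in men_prefs.items():
        for w in prefs:
            if w == matching[m]:
                break  # the remaining women are ranked below m's partner
            # m prefers w to his partner; the pair blocks if w prefers m too
            ranking = women_prefs[w]
            if ranking.index(m) < ranking.index(husband[w]):
                return False
    return True

def all_stable_matchings(men_prefs, women_prefs):
    """Brute force over all n! perfect matchings; only sensible for tiny n."""
    men, women = sorted(men_prefs), sorted(women_prefs)
    candidates = (dict(zip(men, perm)) for perm in permutations(women))
    return [mu for mu in candidates if is_stable(mu, men_prefs, women_prefs)]

def alpha(i, Z, men_prefs):
    """alpha_{i,Z}: each man gets the i-th best (1-indexed) of his partners in Z."""
    return {m: sorted((mu[m] for mu in Z), key=prefs.index)[i - 1]
            for m, prefs in men_prefs.items()}

def beta(i, Z, women_prefs):
    """beta_{i,Z}, returned as a dict man -> woman for easy comparison."""
    result = {}
    for w, prefs in women_prefs.items():
        partners = [m for mu in Z for m, w2 in mu.items() if w2 == w]
        result[sorted(partners, key=prefs.index)[i - 1]] = w
    return result

# Hypothetical instance with cyclic preferences (an assumption for illustration).
men_prefs = {"m1": ["w1", "w2", "w3"],
             "m2": ["w2", "w3", "w1"],
             "m3": ["w3", "w1", "w2"]}
women_prefs = {"w1": ["m2", "m3", "m1"],
               "w2": ["m3", "m1", "m2"],
               "w3": ["m1", "m2", "m3"]}

Z = all_stable_matchings(men_prefs, women_prefs)
z = len(Z)
medians = [alpha(i, Z, men_prefs) for i in range(1, z + 1)]
```

On this instance the sketch finds three stable matchings, so `medians[1]` is the middle one, and one can check that every `alpha(i, ...)` is stable and coincides with `beta(z - i + 1, ...)`, as the theorem states.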
However, determining if these stable matchings can be computed in polynomial time remains open. Simply using the definition can be inefficient because there are instances whose number of stable matchings is exponential in the input size. Our goal is to fill this gap.

It is well known that $M(I)$ forms a distributive lattice under a natural dominance relation $\preceq$. In the early 1960s, Barbut [5] initiated the study of medians for distributive lattices. Recall that in statistics, the median of a set of test scores is a number that summarizes the entire set. It reflects how good or how bad the scores are as a whole. Specifically, it is one of the numbers whose total distance (or average distance) from all the test scores is the smallest. Extending this idea to distributive lattices, Barbut would define a median of $M(I) = (M(I), \preceq)$ as follows. Let $H(I)$ denote the lattice's Hasse diagram. For any two stable matchings $\mu$ and $\mu'$, define the distance between $\mu$ and $\mu'$, $d(\mu, \mu')$, as the length of a shortest path between $\mu$ and $\mu'$ in the undirected version of $H(I)$. Let $D(\mu) = \sum_{\mu' \in M(I)} d(\mu, \mu')$. A median of $M(I)$ is a stable matching $\mu^*$ such that $D(\mu^*) = \min\{D(\mu), \mu \in M(I)\}$. Just as in statistics, one can think of the medians of $M(I)$ as the stable matchings that best describe the set $M(I)$. An intriguing question is: what is the relationship of the generalized median stable matchings of $I$ to the medians of $M(I)$?

To better understand the generalized median stable matchings of $I$, we make use of $I$'s rotation poset, a structure related to $M(I)$. Algorithmically, the rotation poset is very useful because it encodes all the stable matchings of $I$, and, yet, its size is always polynomial in the input size. Our main results are as follows:

• We present a new characterization of the generalized median stable matchings of $I$ that provides interesting insights into these stable matchings not apparent from their definition.
It implies that when $N$ is odd, $\alpha_{(N+1)/2}$ is the unique median of $M(I)$, and when $N$ is even, the stable matchings $\mu$ such that $\alpha_{N/2} \preceq \mu \preceq \alpha_{N/2+1}$ are exactly the medians of $M(I)$. Thus, quite remarkably, a stable matching is "locally median" if and only if it is "globally median"!
• We consider the problem of computing a specific $\alpha_i$. We prove that when $i = O(\log n)$, $\alpha_i$ and $\alpha_{N-i+1}$ can be computed efficiently. We also note that when $I$'s rotation poset is a series-parallel poset, an interval order, or a two-dimensional poset in general, computing any of its generalized median stable matchings is easy; and when $I$'s rotation poset has at most two layers, computing a median of $M(I)$ can be done efficiently.
• In spite of these results, however, we show that the task of computing $\alpha_i$ is NP-hard in general. In particular, we prove that finding a median of $M(I)$ is NP-hard.
• Finally, we consider what it means to approximate a median stable matching of $M(I)$, and present results for this problem.

The outline for the rest of the paper is as follows: in Section 2, we define distributive lattices, present important results on the medians of a distributive lattice, and describe rotation posets and their properties. We present the new characterization for the $\alpha_i$'s in Section 3 and complexity results for computing $\alpha_i$ in Section 4. We consider the problem of approximating a median stable matching of $M(I)$ in Section 5 and conclude in Section 6.

2 Distributive lattices, medians, and rotation posets

Distributive lattices and their medians. Recall that a (finite) lattice $L = (L, \le)$ is a partially ordered set (or poset) such that every pair of elements $x$ and $y$ of $L$ has a greatest lower bound, $x \wedge y$, called the meet of $x$ and $y$, and a least upper bound, $x \vee y$, called the join of $x$ and $y$. This also implies that any subset $L'$ of $L$ has a well-defined greatest lower bound, $\wedge L'$, and a well-defined least upper bound, $\vee L'$. When $x \le y$, let the interval $[x, y] = \{z : x \le z \le y\}$.
We say that $y$ covers $x$ when $[x, y] = \{x, y\}$; that is, no element of $L$ lies between $x$ and $y$. The Hasse diagram of $L$, $H(L)$, is the directed graph formed by letting every element of $L$ be a vertex and every pair $(x, y)$ of vertices be an edge whenever $y$ covers $x$.

Below, we define a median of $L$ and present important results related to it. Our discussion follows the paper of LeClerc [19]. Define the distance between any two elements $x$ and $y$ of $L$, $d(x, y)$, as the length of the shortest path between $x$ and $y$ in the undirected version of $H(L)$. Let $L^k$ be the $k$-fold product of $L$. An element $X = (x_1, x_2, \ldots, x_k) \in L^k$ is called a profile. Let $D_X(x) = \sum_{i=1}^{k} d(x, x_i)$. An $X$-median is an element $m$ of $L$ such that $D_X(m) = \min\{D_X(x), x \in L\}$.
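
The definitions of the Hasse diagram, the distance $d(x, y)$, and an $X$-median can be illustrated with a short Python sketch. It builds the cover relation of a small poset, computes shortest-path distances in the undirected Hasse diagram by breadth-first search, and returns the elements minimizing $D_X$. The choice of lattice (the divisors of 12 ordered by divisibility) and the names `hasse_edges`, `distances_from`, and `x_medians` are assumptions made for illustration only; they do not come from the paper.

```python
from collections import deque

def hasse_edges(elements, leq):
    """Return the pairs (x, y) such that y covers x under the order `leq`."""
    edges = []
    for x in elements:
        for y in elements:
            if x != y and leq(x, y):
                # y covers x when no third element lies between x and y
                between = any(z not in (x, y) and leq(x, z) and leq(z, y)
                              for z in elements)
                if not between:
                    edges.append((x, y))
    return edges

def distances_from(source, elements, edges):
    """Breadth-first search in the undirected version of the Hasse diagram."""
    adj = {e: [] for e in elements}
    for x, y in edges:
        adj[x].append(y)
        adj[y].append(x)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def x_medians(profile, elements, edges):
    """Return (list of X-medians, dict of D_X values) for the given profile."""
    dist = {x: distances_from(x, elements, edges) for x in elements}
    cost = {x: sum(dist[x][xi] for xi in profile) for x in elements}
    best = min(cost.values())
    return [x for x in elements if cost[x] == best], cost

# Illustrative lattice (an assumption): divisors of 12 ordered by divisibility.
elements = [1, 2, 3, 4, 6, 12]
edges = hasse_edges(elements, lambda a, b: b % a == 0)

medians_of_profile, costs = x_medians((1, 4, 6), elements, edges)
medians_of_lattice, _ = x_medians(tuple(elements), elements, edges)
```

For the profile $(1, 4, 6)$ the only $X$-median in this example is 2, with $D_X(2) = 3$. Taking the profile to be all six elements gives the medians 2 and 6, which mirrors the definition of $D(\mu)$ used for $M(I)$ in the introduction.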
