
Stable Matchings with Flexible Quotas

Girija Limaye and Meghana Nasre

Indian Institute of Technology Madras, India
{girija,meghana}@cse.iitm.ac.in

arXiv:2101.04425v2 [cs.DS] 13 Mar 2021

Abstract. We consider the problem of assigning agents to programs in the presence of two-sided preferences, commonly known as the Hospital Residents problem. In the standard setting each program has a rigid upper quota which cannot be violated. Motivated by applications where quotas may be flexible, we propose and study the problem of computing stable matchings under flexible quotas – denoted as the SMFQ setting. In the SMFQ setting we have a cost associated with every program and these costs control the quotas. Our goal is to compute a stable matching that matches all agents and is optimal with respect to the cost criteria. We study two optimization problems with respect to the costs and show that there is a sharp contrast in the complexity status of the two optimization criteria. We also establish the connection of our model with the stable extension problem in an apparently different two-round setting of the stable matching problem. We show that our results in the SMFQ setting generalize the stable extension problem.

Keywords: Matchings under preferences, Stable matchings, Stable extensions, Flexible quotas, Hardness of approximation, Approximation algorithms, Fast exponential algorithms

1 Introduction

The stable matching problem in the many-to-one setting, also known as the Hospital Residents problem, has been extensively investigated since it is applicable in several real-world settings like assigning students to schools [1] or elective courses [18], under-graduate students to university programs [3] and many more. A stable matching problem is modeled as a bipartite graph G = (A ∪ P, E) where A and P denote the set of agents and programs respectively and (a,p) ∈ E if and only if a and p are mutually acceptable to each other. Each agent and program ranks a subset from the other side in a strict order and this ranking is called the preference list of the agent or program. If agent a prefers x over y, we denote it by x >a y. In the standard setting a program p has an upper-quota denoted by q(p). We denote this as the SM setting throughout the paper.

A matching M ⊆ E(G) in G is an assignment of agents to programs such that each agent is matched to at most one program and a program p is matched to at most q(p) many agents. Let M(a) denote the program that agent a is matched to in M and M(p) denote the set of agents matched to program p in matching M. Every agent prefers being matched to one of its acceptable partners over remaining unmatched. A program p is called under-subscribed in M if |M(p)| < q(p) and fully-subscribed if |M(p)| = q(p). Stability is a well-accepted notion of optimality in this setting and is defined as follows.

Definition 1 (Stable matchings in SM setting). A pair (a,p) ∈ E \ M is a blocking pair w.r.t. the matching M if p >a M(a) and p is either under-subscribed in M or there exists at least one agent a′ ∈ M(p) such that a >p a′. A matching M is stable if there is no blocking pair w.r.t. M.

It is well known that every instance of the stable matching problem admits a stable matching and it can be computed in linear time [11]. However, a stable matching can be half the size of a maximum cardinality matching (see the instance in Fig. 1, where agents a3 and a5 are left unassigned in any stable matching). Size of a matching plays a very important role in real-world applications where leaving agents unmatched is undesirable and sometimes even unacceptable. Relaxing stability in order to enable larger size matchings has been extensively investigated in literature [14,5]. However, in all such works the quotas are considered rigid, which may not be the case in certain applications.

a1 : p1, p2                 p1 : a2, a4, a1, a3
a2, a3, a4 : p2, p1         p2 : a1, a2, a5, a3, a4
a5 : p2

Fig. 1: Instance with five agents, two programs and the preferences of agents and programs. For the SM setting let q(p1) = 2 and q(p2) = 1 and denote the instance as G. The matching N = {(a1,p1), (a2,p2), (a4,p1)} is stable in G. For the SMFQ setting let c(p1) = 1 and c(p2) = 2 and denote the instance as H.

In real-world applications of the SM problem, the assignment of agents to programs operates as follows: agents and programs submit their preferences and quotas to the central administrative authority, which then outputs a stable assignment. This task is typically complicated by the presence of high-demand programs with limited quotas. In practice, quotas are determined by logistic considerations like class size in case of course allocation and resource availability in case of school choice and may be flexible. For instance, every semester elective allocation for under-graduate students at IIT Madras happens via an automated procedure and once the preferences of students are available to the academic office, course instructors are consulted to adjust class capacities if appropriate. A recent work by Gajulapalli et al. [10] studies the school-choice problem in a two-round setting. Here, in the first round the quotas given as input are considered rigid. In the second round though, the quotas of some schools are violated in order to match additional students in a stability preserving manner.

Motivated by such applications, we introduce and study the problem of computing stable matchings with flexible quotas – denoted as the SMFQ setting. Having unrestricted quotas is unreasonable, hence we let costs control quotas in our setting. An instance H in the SMFQ setting is similar to the SM setting except that programs do not have quotas associated; instead a program p has a non-negative integer c(p) denoting the cost of matching an agent to p. Since there are no quotas in the SMFQ setting, in an output matching some programs may have no agents assigned to them – we denote such programs as closed. We modify the definition of stability in the SMFQ setting and use this throughout the paper.

Definition 2 (Stable matchings in SMFQ setting). A pair (a,p) ∈ E \ M is a blocking pair w.r.t. the matching M if p >a M(a) and there exists at least one agent a′ ∈ M(p) such that a >p a′. A matching M is stable if there is no blocking pair w.r.t. M.
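Definition 2 translates directly into a small checker. The following is a minimal sketch, assuming preference lists are stored as Python lists (most-preferred first) and a matching as a dictionary from agents to programs; the function name `blocking_pairs` and the dictionary layout are illustrative choices, not notation from the paper. The instance is the one in Fig. 1.

```python
# Preference lists of Fig. 1, most-preferred first.
AGENT_PREFS = {
    "a1": ["p1", "p2"],
    "a2": ["p2", "p1"],
    "a3": ["p2", "p1"],
    "a4": ["p2", "p1"],
    "a5": ["p2"],
}
PROGRAM_PREFS = {
    "p1": ["a2", "a4", "a1", "a3"],
    "p2": ["a1", "a2", "a5", "a3", "a4"],
}


def blocking_pairs(matching, agent_prefs, program_prefs):
    """Return all SMFQ blocking pairs (a, p) of `matching` (dict agent -> program)."""
    rank = {p: {a: i for i, a in enumerate(lst)} for p, lst in program_prefs.items()}
    assigned = {p: [a for a, q in matching.items() if q == p] for p in program_prefs}
    pairs = []
    for a, prefs in agent_prefs.items():
        current = matching.get(a)
        # Programs that a strictly prefers to its current assignment
        # (every listed program if a is unmatched).
        better = prefs if current is None else prefs[: prefs.index(current)]
        for p in better:
            # (a, p) blocks only if p currently holds an agent it ranks below a;
            # a closed program (no agents) can never be part of a blocking pair.
            if any(rank[p][a] < rank[p][other] for other in assigned[p]):
                pairs.append((a, p))
    return pairs
```

A matching is stable in the SMFQ sense exactly when the returned list is empty. Note that, unlike Definition 1, there is no under-subscription test, since programs carry no quotas.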

In literature, such a blocking pair is also considered as an envy pair (a,a′) and the matching M is called envy-free [24]. In the SM setting, an envy-free matching need not be stable but in the SMFQ setting, envy-free matchings are stable. In the SMFQ setting, our goal is to compute an A-perfect matching (one that matches all agents) that is stable with respect to the preferences subject to the following two optimization criteria with respect to costs.

1. Minimize the total cost: The total cost of a matching M is defined as $\sum_{p \in P} (|M(p)| \cdot c(p))$. Our goal is to compute an A-perfect stable matching that minimizes the total cost – we denote this as the MINSUM problem.
2. Minimize the max cost: The maximum cost (spent at a program) for a matching M is defined as $\max_{p \in P} \{|M(p)| \cdot c(p)\}$. Our goal is to compute an A-perfect stable matching that minimizes the maximum cost – we denote this as the MINMAX problem.

In the SMFQ instance H in Fig. 1, the matching N′ = {(a1,p1), (a2,p2), (a3,p1), (a4,p1), (a5,p2)} is A-perfect as well as stable with respect to the preferences. The total cost of N′ is 7 (three agents at p1 with cost 1 each and two agents at p2 with cost 2 each) and the max-cost of N′ is 4 (spent at p2). It is easy to verify that in the instance H, the matching N′ is an optimal solution for both the MINSUM and the MINMAX problem. This need not be true in general; in fact we prove a sharp contrast in the complexity status of the two problems.
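The two objectives are easy to evaluate for a given matching. The sketch below assumes a matching is a dictionary from agents to programs and costs are a dictionary keyed by program; the helper names `program_costs`, `total_cost` and `max_cost` are illustrative, not from the paper.

```python
from collections import Counter

COST = {"p1": 1, "p2": 2}  # costs of instance H in Fig. 1
N_PRIME = {"a1": "p1", "a2": "p2", "a3": "p1", "a4": "p1", "a5": "p2"}


def program_costs(matching, cost):
    """Cost spent at each program: |M(p)| * c(p)."""
    load = Counter(matching.values())
    return {p: load[p] * c for p, c in cost.items()}


def total_cost(matching, cost):
    """MINSUM objective: sum over programs of |M(p)| * c(p)."""
    return sum(program_costs(matching, cost).values())


def max_cost(matching, cost):
    """MINMAX objective: largest cost spent at a single program."""
    return max(program_costs(matching, cost).values())
```

For `N_PRIME`, `total_cost` gives 3·1 + 2·2 = 7 and `max_cost` gives max(3, 4) = 4, matching the values stated for N′.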

1.1 Our Results

We show the following new results for the SMFQ setting in this paper.

Theorem 1. The MINMAX problem is solvable in polynomial time.

Theorem 2. The MINSUM problem is NP-hard even when every agent has a preference list of length exactly f ≥ 2, there is a master list ordering on agents and programs and the instance has 2 distinct costs.

In the SMFQ setting since A-perfectness is guaranteed, it is natural for agents to submit short preference lists. However, since there is a guarantee that every agent is matched, the central authority is likely to impose a minimum requirement on the length of the preference list [15]. We further note that under the two extreme scenarios – that is, all preference lists are unit length or all preference lists are complete – the MINSUM problem admits a simple polynomial time algorithm. Theorem 2 shows that the general case is NP-hard. We also show that MINSUM is hard to approximate within a constant factor, unless P = NP.

Theorem 3. The MINSUM problem cannot be approximated within a factor $\frac{7}{6} - \epsilon$, ε > 0, even when the instance has 3 distinct costs, unless P = NP.

On the positive side, we present the following approximation algorithms.

Theorem 4. The MINSUM problem admits the following approximation algorithms. Let ℓp denote the maximum length of preference list of any program.

(I) a linear time ℓp-approximation algorithm.
(II) a |P|-approximation algorithm.

We present a fast exponential algorithm for the instances where the number of distinct costs is small. The number of distinct costs appearing in an agent’s preference list (denoted as k) is upper-bounded by the number of distinct costs in the instance as well as by the maximum length of an agent’s preference list.

Theorem 5. The MINSUM problem admits an $O(k^{|A|})$ time algorithm.

Size and A-perfectness of the output: As stated earlier, size of a matching plays a crucial role in real-world scenarios. In [6] the authors mention that the administrators of the Scottish Foundation Allocation Scheme were interested in larger size matchings at the expense of allowing blocking pairs. In applications like school-choice [1] every child must find a school. In case of matching sailors to billets in the US Navy [23,20], every sailor must be assigned to some billet, apart from some additional constraints. The A-perfectness requirement is imposed in the two-round school choice problem studied in [10]. Round-1 is the standard stable matching problem; whereas in round-2, their goal is to match all agents in a particular set derived from the matching of round-1. We formalize this connection in Section 2 where we generalize the stable extensions from [10]. In [10] they consider a variant of the MINSUM problem (Problem 33, Section 7) and state that their problem is NP-hard. However, they do not investigate the problem in detail.

1.2 Related Work

Apart from [10], flexible quotas in the college admission setting are studied in [19]. In their setting, no costs are involved but colleges may have ties in the preference lists and flexible quotas are used for tie-breaking at the last matched rank. In the student-project allocation setting, the problem of minimizing the maximum and total deviation from the initial target is studied in [8]. Flexibility in quotas also comes up when colleges have upper and lower-quotas [4,24,12]. In the work by Biro et al. [4], colleges either satisfy the lower-quotas or are closed. A setting where courses make monetary transfers to students and have budget constraints is studied in [16]. Funding constraints are studied in [2] in the context of allocating student interns to the projects funded by supervisors. A setting involving high-demand courses and scheduling constraints is studied in [18] which assumes a fixed quota at courses. Course allocation involving efficient assignment of students to highly-popular courses is treated with the AI approach in [13]. Course bidding mechanisms for allocating seats efficiently at high-demand courses are investigated in [22].

Organization of the paper: In Section 2, we establish the connection between the SMFQ setting and the generalized stable extension setting. We present our algorithmic results for the MINMAX and MINSUM problems in Section 3. In Section 4, we present NP-hardness and inapproximability results for the MINSUM problem.

2 Stable extensions and SMFQ setting

In this section, we formally define the notion of stable extensions as defined by Gajulapalli et al. [10] in a two round setting. Given an SM instance G and a stable matching M in G, a stable extension M ′ of M is defined as follows.

Deﬁnition 3. M ′ is a stable extension of M if M ⊆ M ′ and M ′ is stable w.r.t. preferences in G and M ′ may violate quotas of programs in G.

In the Type A1 setting [10] they consider the following problem: let M be a stable matching in G computed in round-1. In round-2, the goal is to compute a stable extension M ′ of M which matches the maximum number of unmatched agents in M – they denote this as the largest stable extension problem. Let Au denote the set of agents unmatched in M. It is well known by the Rural Hospital Theorem [21] that the set Au is independent of the stable matching. We note the following about the stable extensions considered in [10].

– If agent a ∈ Au is matched to program p in round-2, then p must be fully-subscribed in M, otherwise M is not stable. Thus, the total number of agents matched to p at the end of round-2 exceeds q(p). In [10] no restriction is imposed on the number of agents matched to programs in round-2.
– The stable matching M in round-1 determines a unique subset of agents Au(M) ⊆ Au all of which can be matched in a stable extension of M. Since increase in quotas is unrestricted in [10], they match all the agents in Au(M) to their top preferred programs in a suitably modified graph. We generalize this by introducing costs for programs in round-2. This allows quotas to be controlled in the second round.
– Since Au(M) depends on the stable matching selected in round-1, it is natural to ask: can we select a stable matching M in round-1 which allows the maximum number of agents to be matched in an extension of M in round-2? We answer this question by proving in Lemma 1 that the A-optimal stable matching in G achieves this guarantee. The A-optimal stable matching matches every agent to its most-preferred stable partner [11].

For the sake of completeness, we present below the algorithm from [10] to compute the set Au(M). Their overall idea is to delete certain edges from E that cannot be matched in any stable extension of M. We use the notion of barrier1 as deﬁned in [10]. For program p, Barrier(p) is the most-preferred matched agent a such that a prefers p over M(a). For every program p, we prune the preference list of p by deleting the edges of the form (a′,p) such that a′ is lower-preferred than Barrier(p) in p’s list. We denote this pruned graph after the for loop in line 3 as GM . The set Au(M) is precisely the set of agents which are not isolated in GM .

Algorithm 1 Algorithm to compute Au(M) (adapted from [10])
Input: An SM instance G and a stable matching M in G.
Output: A set of agents Au(M).
1: Let Au be the set of agents unmatched in M
2: Let EM be the set of edges incident on an agent in Au
3: for every p ∈ P do
4:   for every a ∈ Au do
5:     if a is lower-preferred than Barrier(p) in p’s list then
6:       delete (a,p) from EM
7: Let Au(M) ⊆ Au such that every agent in Au(M) has at least one edge in EM incident on it.
8: Let GM = (Au(M) ∪ P, EM)
9: Return Au(M)
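The pruning step can be sketched in a few lines. This is an illustrative reading of Algorithm 1 based on the definition of Barrier(p) given in the text, not the implementation of [10]; the function name `extendable_agents` and the dictionary-based data layout are assumptions. The example runs on instance G of Fig. 1 with the stable matching N.

```python
AGENT_PREFS = {
    "a1": ["p1", "p2"], "a2": ["p2", "p1"], "a3": ["p2", "p1"],
    "a4": ["p2", "p1"], "a5": ["p2"],
}
PROGRAM_PREFS = {
    "p1": ["a2", "a4", "a1", "a3"],
    "p2": ["a1", "a2", "a5", "a3", "a4"],
}
N = {"a1": "p1", "a2": "p2", "a4": "p1"}  # stable matching of Fig. 1


def extendable_agents(agent_prefs, program_prefs, matching):
    """Return (Au(M), E_M): the non-isolated unmatched agents and the pruned edges."""
    rank = {p: {a: i for i, a in enumerate(lst)} for p, lst in program_prefs.items()}
    unmatched = [a for a in agent_prefs if a not in matching]
    edges = {(a, p) for a in unmatched for p in agent_prefs[a]}

    for p, lst in program_prefs.items():
        # Barrier(p): most-preferred matched agent that prefers p over its partner.
        barrier = None
        for a in lst:
            if a in matching:
                prefs = agent_prefs[a]
                if prefs.index(p) < prefs.index(matching[a]):
                    barrier = a
                    break
        if barrier is None:
            continue  # nothing to prune at p
        # Delete edges to unmatched agents that p ranks below its barrier.
        for a in unmatched:
            if (a, p) in edges and rank[p][a] > rank[p][barrier]:
                edges.discard((a, p))

    survivors = sorted({a for a, _ in edges})
    return survivors, edges
```

On Fig. 1, p1 has no barrier and Barrier(p2) = a4, which is last in p2's list, so no edge is deleted and both unmatched agents a3 and a5 remain extendable.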

1 The notion of a barrier is similar to the notion of threshold resident deﬁned in the context of envy-free matchings [17].

Lemma 1. Let G be the SM instance in round-1. Let M be the A-optimal stable matching and let M′ be any other stable matching in G. Then, any agent that can be matched in a stable extension of M′ can also be matched in a stable extension of M. Thus, Au(M′) ⊆ Au(M).

Proof. Let GM and GM ′ denote the pruned graphs after for loop in line 3 in Algorithm 1 w.r.t. M and M ′ respectively. Suppose for the contradiction that, |Au(M ′)| > |Au(M)|. Then, there exists an agent a ∈ Au∗ (M ′)\Au∗ (M); that is, agent a has degree at least one in GM ′ but degree 0 in GM . Let (a,p) ∈ GM ′ \GM . Thus, there exists agent a′ >p a such that M(a′)

In order to match the maximum number of agents in round-2, by Lemma 1 we need to select the A-optimal matching in round-1. A stable matching M selected in round-1 may admit multiple stable extensions. In Fig. 1, the A-optimal stable matching N in G admits two stable extensions, namely NE1 = N ∪ {(a3,p2), (a5,p2)} and NE2 = N ∪ {(a3,p1), (a5,p2)}, and the algorithm in [10] outputs NE1. It is natural to control the number of agents matched to a program in round-2. We achieve this by defining the following two problems.

1. For program p, let d(p) denote the deviation of p, that is, the additional number of agents matched to p beyond q(p) in round-2. Let $d^* = \max_{p \in P} d(p)$. The goal is to compute a stable extension of M that minimizes d∗. This can be modeled as the MINMAX problem on GM (defined above) where every program has unit cost.
2. Let c(p) denote the cost of matching an agent to a program in round-2. The goal is to compute a stable extension of M where the total cost of matching agents in round-2 is minimized. This is exactly the MINSUM problem on GM with the costs c(p).

Thus, our results in the SMFQ setting generalize the stable extensions problem.

3 Algorithmic Results in the SMFQ setting

In this section, we present our algorithmic results for the SMFQ setting. In Section 3.1, we present an exact exponential algorithm for MINSUM when the number of distinct costs in the instance is small. In Section 3.2, we deﬁne a natural lower-bound on the optimal cost and present two ℓp-approximation algorithms for MINSUM, where ℓp denotes the length of the longest preference list of any program. In Section 3.3, we present a polynomial time algorithm for the MINMAX problem. We use this result to derive a lower-bound for the MINSUM problem that leads to a practically better approximation guarantee of |P|.

3.1 Exact exponential algorithm for MINSUM

In this section we present an exact exponential algorithm for MINSUM. Let H be an SMFQ instance. Let t be the number of distinct costs in H and k be the number of distinct costs appearing in an agent’s preference list in H. Then k ≤ t. Also k ≤ ℓa where ℓa is the maximum length of an agent’s preference list. Our algorithm (Algorithm 2) considers every possible |A|-tuple of costs ⟨c^1, ..., c^{|A|}⟩ such that each c^i is one of the distinct costs appearing for agent ai. Thus, there are at most k^{|A|} tuples. For each tuple, the algorithm constructs a sub-graph H′ such that every agent ai has edges incident to programs with cost exactly c^i. If p is the highest preferred program neighbouring agent a in H′ then any program p′ >a p in the graph H cannot be matched with any agent a′

Algorithm 2 O(k^{|A|}) algorithm for MINSUM
1: M = ∅, cost = ∞
2: for every tuple ⟨c^1, ..., c^{|A|}⟩ do
3:   H′ = {(ai,p) | ai ∈ A, p has cost c^i}
4:   let change = 1
5:   while every agent has degree ≥ 1 and change = 1 do
6:     let change = 0
7:     for every ai ∈ A do
8:       let p be the top-preferred program such that (ai,p) ∈ H′
9:       for every p′ >ai p do
10:        for every a

Lemma 2. The matching M computed by Algorithm 2 is stable.

Proof. The matching M is actually the matching M′ computed for some tuple. Thus, it is enough to show that M′ computed for an arbitrary tuple is stable. Suppose M′ is not stable, then there exist agents a,a′ such that M′(a) p′ a that triggered this deletion. But then we have that a′′ >p′ a′, implying that (a′,p′) is also deleted. Thus, in both the cases, (a′,p′) ∉ H′ and hence (a′,p′) ∉ M′. This contradicts the assumption that (a′,p′) ∈ M′ and hence completes the proof. ⊓⊔

Thus the algorithm computes at least one A-perfect, stable matching. We now show that the algorithm computes an optimal matching.

Lemma 3. Let M be an A-perfect stable matching and T be the tuple corresponding to M. Then no edge in M is deleted when Algorithm 2 processes T .

Proof. Suppose for the sake of contradiction that an edge in M is deleted when Algorithm 2 processes the tuple T . Let (a1,p1) be the ﬁrst edge in M that gets deleted during the course of the algorithm. The edge (a1,p1) is in H′ after line 3 since p1 has the same cost as given by the tuple T . This implies that the edge (a1,p1) is deleted while pruning the instance. Suppose agent a2 caused the deletion of edge (a1,p1) at time t. Let M(a2) = p2. Then it is clear that either p2 = p1 or p2 >a2 p1, otherwise M is not stable. But since a2 triggered the deletion of (a1,p1), the top choice program adjacent to a2 in H′ at time t is less preferred than p1. Again note that (a2,p2) ∈ H′ after line 3 hence (a2,p2) must have been deleted at time earlier than t. This contradicts the assumption that (a1,p1) is the ﬁrst edge in M that gets deleted. This completes the proof. ⊓⊔

Let OPT be an optimal matching and let TOPT be the tuple corresponding to OPT. When the algorithm processes TOPT, by Lemma 3, no edge in OPT is deleted. Thus, no agent gets isolated and the algorithm computes M′. Since Algorithm 2 returns a matching with cost at most the cost of M′, it returns an A-perfect stable matching with minimum total cost.

Running Time. For each of the O(k^{|A|}) tuples, the algorithm computes the graph H′, deletes O(m) edges where m is the number of edges in H and may compute a matching M′. Thus Algorithm 2 runs in time O(k^{|A|} · poly(m)). This establishes Theorem 5.
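For very small instances, the output of any MINSUM algorithm can be cross-checked against plain exhaustive search. The sketch below is not Algorithm 2: it is a naive baseline that enumerates every A-perfect assignment (one listed program per agent), keeps the stable ones according to Definition 2, and returns a cheapest one. The names `is_stable` and `brute_force_minsum` are illustrative. Its running time is the product of the preference-list lengths, so it is only meant as a reference oracle.

```python
from itertools import product

AGENT_PREFS = {
    "a1": ["p1", "p2"], "a2": ["p2", "p1"], "a3": ["p2", "p1"],
    "a4": ["p2", "p1"], "a5": ["p2"],
}
PROGRAM_PREFS = {
    "p1": ["a2", "a4", "a1", "a3"],
    "p2": ["a1", "a2", "a5", "a3", "a4"],
}
COST = {"p1": 1, "p2": 2}  # instance H of Fig. 1


def is_stable(matching, agent_prefs, program_prefs):
    """SMFQ stability of an A-perfect matching (dict agent -> program)."""
    rank = {p: {a: i for i, a in enumerate(lst)} for p, lst in program_prefs.items()}
    for a, prefs in agent_prefs.items():
        for p in prefs[: prefs.index(matching[a])]:  # programs a prefers to M(a)
            if any(rank[p][a] < rank[p][b] for b, q in matching.items() if q == p):
                return False
    return True


def brute_force_minsum(agent_prefs, program_prefs, cost):
    """Exhaustive search over all A-perfect assignments; returns (matching, cost)."""
    agents = list(agent_prefs)
    best, best_cost = None, float("inf")
    for choice in product(*(agent_prefs[a] for a in agents)):
        matching = dict(zip(agents, choice))
        # Summing c(M(a)) over agents equals summing |M(p)| * c(p) over programs.
        c = sum(cost[p] for p in choice)
        if c < best_cost and is_stable(matching, agent_prefs, program_prefs):
            best, best_cost = matching, c
    return best, best_cost
```

On instance H of Fig. 1 the search returns the matching N′ with total cost 7.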

3.2 ℓp-approximation for MINSUM

In this section we present two linear time algorithms for the MINSUM problem with approximation guarantee of ℓp. Let pa∗ denote the minimum cost program in the preference list of agent a. If there is more than one program with the same minimum cost, we let pa∗ be the most-preferred amongst these programs.

Description of the first algorithm: Given an SMFQ instance H, our algorithm (Algorithm 3) starts by matching every agent a to pa∗. Note that such a matching is A-perfect and min-cost but not necessarily stable. Now the algorithm considers programs in an arbitrary order. For program p, we consider agents in the reverse preference list ordering of p. If there exists agent a ∉ M(p) such that p >a M(a) and there exists a′ ∈ M(p) such that a′

Algorithm 3 First ℓp-approximation algorithm for MINSUM
1: let M = {(a,p) | a ∈ A and p = pa∗}
2: for every program p do
3:   for a in reverse preference list ordering of p do
4:     if there exists a′ ∈ M(p) such that a >p a′ and p >a M(a) then
5:       M = M \ {(a, M(a))} ∪ {(a,p)}
6: return M

Note that in Algorithm 3 an agent can only get promoted in the loop (line 5) of the algorithm. Further, program p is assigned agents only when it is being considered in the for loop (line 2). Finally, if program p is assigned at least one agent in the ﬁnal output matching, then p = pa∗ for some agent a ∈ A. Analysis: It is clear that the matching computed by Algorithm 3 is A-perfect. We show the correctness and the approximation guarantee below.
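A direct transcription of Algorithm 3 is short. The sketch below assumes the same dictionary-based representation as the rest of these examples (the name `algorithm3` and the layout are illustrative), and processes programs in dictionary order as one instance of the "arbitrary order" allowed by the algorithm. It is written for clarity rather than for the linear running time claimed in the text.

```python
AGENT_PREFS = {
    "a1": ["p1", "p2"], "a2": ["p2", "p1"], "a3": ["p2", "p1"],
    "a4": ["p2", "p1"], "a5": ["p2"],
}
PROGRAM_PREFS = {
    "p1": ["a2", "a4", "a1", "a3"],
    "p2": ["a1", "a2", "a5", "a3", "a4"],
}
COST = {"p1": 1, "p2": 2}  # instance H of Fig. 1


def algorithm3(agent_prefs, program_prefs, cost):
    """Start from cheapest programs, then promote agents to remove envy."""
    rank = {p: {a: i for i, a in enumerate(lst)} for p, lst in program_prefs.items()}
    # Line 1: min() returns the first minimum, i.e. the most-preferred
    # program among those of least cost in the agent's list.
    matching = {a: min(prefs, key=lambda p: cost[p]) for a, prefs in agent_prefs.items()}
    for p, lst in program_prefs.items():           # line 2
        for a in reversed(lst):                    # line 3: least-preferred first
            prefs = agent_prefs[a]
            prefers_p = prefs.index(p) < prefs.index(matching[a])
            envied = any(rank[p][a] < rank[p][b]
                         for b, q in matching.items() if q == p)
            if prefers_p and envied:               # line 4
                matching[a] = p                    # line 5: promote a to p
    return matching
```

On instance H of Fig. 1, every agent except a5 starts at p1; while processing p2 only a2 is promoted (it is ranked above a5 by p2 and prefers p2), which yields N′ with total cost 7.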

Lemma 4. The matching M output by Algorithm 3 is stable.

Proof. We show that no agent participates in a blocking pair w.r.t. M. Assume for contradiction, that (a,p) blocks M. Then p>a M(a) and there exists an agent a′ ∈ M(p) such that a′

Lemma 5. The matching M output by Algorithm 3 is an ℓp-approximation.

Proof. Let c(OPT) and c(M) be the cost of matching OPT and M respectively. It is easy to see that

$$c(\mathrm{OPT}) \ge \sum_{a \in A} c(p_a^*) \qquad (1)$$

In the matching M, some agents are matched to their least cost program (call them A1), whereas some agents get promoted (call them A2). However, as noted earlier, if a program p is assigned agents in M then p = pa∗ for some agent a. Thus for agent a ∈ A2, we charge the cost of some other least cost program pa′∗. Since a program can be charged at most ℓp − 1 times by agents in A2, we have

$$c(M) = \sum_{a \in A_1} c(p_a^*) + \sum_{a \in A_2} c(M(a)) \le \sum_{a \in A} c(p_a^*) + \sum_{a \in A} (\ell_p - 1) \cdot c(p_a^*) \le \ell_p \cdot c(\mathrm{OPT})$$ ⊓⊔

This establishes Theorem 4 (I). Next, we present another algorithm using the lower-bound in Eq. 1 with approximation guarantee of ℓp.

Description of the second algorithm (ALG): Given an SMFQ instance H, we construct a subset P′ of P such that p ∈ P′ iff p = pa∗ for some agent a. Our algorithm now matches every agent a to its most-preferred program in P′.

Analysis of ALG: It is clear that the matching computed by ALG is A-perfect. Let M be the output of ALG and OPT be an optimal matching. Let c(OPT) and c(M) be the cost of matching OPT and M respectively. The lower-bound on c(OPT) in Eq. 1 is exactly the same. We show the correctness and the approximation guarantee of ALG via Lemma 6 and Lemma 7.

Lemma 6. The output M of ALG is stable.

Proof. We show that no agent participates in a blocking pair w.r.t. M. Suppose (a,p) blocks M. Then it implies that p >a M(a) and there exists an agent a′ ∈ M(p) such that a′

Lemma 7. The output M of ALG is an ℓp-approximation.

Proof. In the matching M, agent a is either matched to pa∗ or pa∗′ for some other agent a′. This is determined by the relative ordering of pa∗ and pa∗′ in the preference list of a. We partition the agents as A = A1 ∪ A2, where A1 is the set of agents matched to their own least cost program, that is, a ∈ A1 iﬀ M(a) = pa∗. We deﬁne A2 = A \ A1. We can write the cost of M as follows:

$$c(M) = \sum_{a \in A_1} c(p_a^*) + \sum_{a \in A_2} c(M(a))$$

By similar arguments as in Lemma 5, we get the ℓp-approximation guarantee. ⊓⊔
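ALG needs only the agents' lists and the costs. A minimal sketch, assuming the same dictionary representation as before (the function name `alg` is illustrative): first collect P′, the set of least-cost programs pa∗ over all agents, then send every agent to its most-preferred program in P′.

```python
AGENT_PREFS = {
    "a1": ["p1", "p2"], "a2": ["p2", "p1"], "a3": ["p2", "p1"],
    "a4": ["p2", "p1"], "a5": ["p2"],
}
COST = {"p1": 1, "p2": 2}  # instance H of Fig. 1


def alg(agent_prefs, cost):
    """Second l_p-approximation: match each agent to its best program in P'."""
    # P': for each agent, its least-cost program (most-preferred on ties,
    # because min() returns the first minimum in list order).
    p_prime = {min(prefs, key=lambda p: cost[p]) for prefs in agent_prefs.values()}
    # Every agent has at least its own least-cost program in P',
    # so next() always finds a match.
    return {a: next(p for p in prefs if p in p_prime)
            for a, prefs in agent_prefs.items()}
```

On instance H of Fig. 1, P′ = {p1, p2}, so a1 goes to p1 and all other agents go to p2, for a total cost of 1 + 4·2 = 9. This is within the ℓp guarantee but worse than the optimum of 7 on this instance.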

Comparing the two ℓp-approximation algorithms for MINSUM. We present the following instances, which illustrate that neither of the two algorithms is strictly better than the other.

Example 1. Let A = {a1, a2, ..., an}, P = {p1, p2}, c(p1) = 1, c(p2) = α where α is some large positive constant. The agents a1, ..., an−1 have the same preference list: p2 followed by p1. Agent an has only p2 in its preference list. The preferences of the programs are as given below.

p1 : a1, a2, ..., an−1
p2 : an, an−1, an−2, ..., a1

Here, ALG outputs M1 of cost n · α where M1 = {(a1,p2), ..., (an,p2)}. In contrast, Algorithm 3 outputs M2 = {(a1,p1), ..., (an−1,p1), (an,p2)} whose cost is n − 1 + α. Clearly, Algorithm 3 outperforms ALG in this case and in fact M2 is optimal for the instance.

Example 2. Let A = {a1, a2, ..., an}, P = {p1, p2, p3}, c(p1) = 1, c(p2) = 2, c(p3) = α where α is some large positive constant. The preferences of agents a1, ..., an−2 are p2 followed by p3 followed by p1. The preference list of an−1 contains only p2 and the preference list of an contains only p3. The preferences of programs are as shown below.

p1 : a1, a2, ..., an−2
p2 : an−1, a1, a2, ..., an−2
p3 : a1, ..., an−2, an

Here, ALG outputs M1 = {(a1,p2), ..., (an−1,p2), (an,p3)} whose cost is 2 · (n − 1) + α. In contrast, Algorithm 3 outputs M2 of cost 2 + (n − 1) · α where M2 = {(a1,p3), ..., (an−2,p3), (an−1,p2), (an,p3)}. In this instance ALG outperforms Algorithm 3 and it can be verified that M1 is the optimal matching.

Remark on the lower-bound. We note that although ℓp seems a weak approximation guarantee, this is the best bound one can obtain via the lower-bound in Eq. 1. We show a family of instances where the lower-bound is 1 and the optimal cost is ℓp. In Fig. 2 we have n agents and 3 programs with master list ordering on agents and programs. We have ℓp = n and c(p0) = 0, c(p1) = 1 and c(p2) = n (denoted in brackets). Then the lower-bound on the optimal solution of MINSUM is 1 and the optimal cost is n.

Agents                             Programs (cost in brackets)
ai : p1, p0   (1 ≤ i ≤ n − 1)      (0) p0 : a1, a2, ..., an−1
an : p1, p2                        (1) p1 : a1, a2, ..., an−1, an
                                   (n) p2 : an

Fig. 2: A family of instances with optimal cost exactly ℓp times the lower-bound. There are two optimal matchings of cost ℓp = n: OPT1 = {(a1,p0), ..., (an−1,p0), (an,p2)} and OPT2 = {(a1,p1), ..., (an,p1)}.

3.3 MINMAX problem and a |P|-approximation for MINSUM

In this section we present a simple polynomial time algorithm for the MINMAX problem. We then prove that the optimal solution to the MINMAX problem serves as a lower-bound on the optimal solution for the MINSUM problem. Using this lower-bound we design a |P|-approximation for the MINSUM problem.

Polynomial time algorithm for the MINMAX problem. Our algorithm for the MINMAX problem is based on the following observations:

– Let H be an SMFQ instance and M∗ be the optimal solution for the MINMAX problem on H. Let $t^* = \max_{p \in P} \{c(p) \cdot |M^*(p)|\}$. Then there exists an instance Gt∗ of the SM problem with quotas of programs as follows: for each p ∈ P, $q(p) = \lfloor t^*/c(p) \rfloor \ge |M^*(p)|$. The instance Gt∗ admits an A-perfect stable matching. This follows from the fact that M∗ is an A-perfect matching in which every program is assigned at most q(p) many agents.
– For any t < t∗, the SM instance Gt does not admit an A-perfect stable matching, and for any t′ > t∗, Gt′ admits an A-perfect stable matching.

Using the above observations the algorithm for the MINMAX problem is immediate. We binary search for the optimal value of t in the range [0, |A| · c∗]. For a particular value of t we construct the SM instance Gt by setting appropriate quotas. If the stable matching in Gt is not A-perfect, then we search in the upper-range. Otherwise, we check if Gt−1 admits an A-perfect stable matching. If not, we return t; otherwise, we search for the optimal in the lower-range. The algorithm requires O(log (|A| · c∗)) many iterations where each iteration computes at most two stable matchings using the linear time Gale and Shapley algorithm [11]. Thus the overall running time is polynomial in the input size. This establishes Theorem 1.
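The search can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: `gale_shapley` is a simple (not linear-time) agent-proposing deferred acceptance routine, c∗ is taken to be the maximum program cost, a zero-cost program is treated as uncapped to avoid division by zero, and the loop is a standard lower-bound binary search rather than the Gt / Gt−1 check described above. Both find the smallest t for which Gt has an A-perfect stable matching.

```python
AGENT_PREFS = {
    "a1": ["p1", "p2"], "a2": ["p2", "p1"], "a3": ["p2", "p1"],
    "a4": ["p2", "p1"], "a5": ["p2"],
}
PROGRAM_PREFS = {
    "p1": ["a2", "a4", "a1", "a3"],
    "p2": ["a1", "a2", "a5", "a3", "a4"],
}
COST = {"p1": 1, "p2": 2}  # instance H of Fig. 1


def gale_shapley(agent_prefs, program_prefs, quota):
    """Agent-proposing deferred acceptance; returns dict agent -> program."""
    rank = {p: {a: i for i, a in enumerate(lst)} for p, lst in program_prefs.items()}
    next_choice = {a: 0 for a in agent_prefs}
    held = {p: [] for p in program_prefs}
    free = list(agent_prefs)
    while free:
        a = free.pop()
        if next_choice[a] == len(agent_prefs[a]):
            continue  # a exhausted its list and stays unmatched
        p = agent_prefs[a][next_choice[a]]
        next_choice[a] += 1
        held[p].append(a)
        if len(held[p]) > quota[p]:
            worst = max(held[p], key=lambda b: rank[p][b])  # least-preferred held agent
            held[p].remove(worst)
            free.append(worst)
    return {a: p for p, agents in held.items() for a in agents}


def minmax(agent_prefs, program_prefs, cost):
    """Smallest t such that G_t has an A-perfect stable matching, plus that matching."""
    n = len(agent_prefs)

    def stable_matching_for(t):
        # q(p) = floor(t / c(p)); zero-cost programs are uncapped (assumption).
        quota = {p: (t // c if c > 0 else n) for p, c in cost.items()}
        return gale_shapley(agent_prefs, program_prefs, quota)

    lo, hi = 0, n * max(cost.values())  # hi always admits an A-perfect matching
    while lo < hi:
        mid = (lo + hi) // 2
        if len(stable_matching_for(mid)) == n:
            hi = mid
        else:
            lo = mid + 1
    return lo, stable_matching_for(lo)
```

On instance H of Fig. 1 the search returns t = 4 together with the matching N′: with t = 3 the quota of p2 is 1, so a5 (whose only choice is p2) is rejected in favour of a2 and stays unmatched.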

|P|-approximation algorithm for MINSUM. We now prove that the optimal matching M ∗ for the MINMAX problem is a |P|-approximation for the MINSUM problem.

Lemma 8. The optimal solution for the MINMAX problem is a |P|-approximation for the MINSUM problem.

Proof. Let H be an SMFQ instance and let M∗ be the optimal matching for the MINMAX problem on H. For the same instance H, let N∗ be the optimal matching for the MINSUM problem. Let us define t∗ and y∗ as follows:

$$t^* = \max_{p \in P} \{c(p) \cdot |M^*(p)|\} \qquad y^* = \sum_{p \in P} (c(p) \cdot |N^*(p)|)$$

We first observe that y∗ ≥ t∗. This is true because N∗ is an A-perfect stable matching in H. Furthermore, since costs of all programs are non-negative, $\max_{p \in P} \{c(p) \cdot |N^*(p)|\} \le y^*$. Therefore if y∗

This establishes Theorem 4 (II).
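The chain of inequalities behind Lemma 8 can be written out explicitly. This is a sketch that relies only on the definitions of t∗ and y∗ and on the inequality y∗ ≥ t∗ observed in the proof; c(M∗) denotes the total cost of M∗, following the c(M) notation of Section 3.2.

```latex
\[
c(M^*) \;=\; \sum_{p \in P} c(p)\cdot|M^*(p)|
\;\le\; |P| \cdot \max_{p \in P}\{c(p)\cdot|M^*(p)|\}
\;=\; |P| \cdot t^*
\;\le\; |P| \cdot y^* .
\]
```

The first inequality bounds each of the |P| summands by the largest one, and the last step uses y∗ ≥ t∗.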

4 Hardness results

In this section, we present the NP-hardness result (Theorem 2) and the inapproximability result (Theorem 3) for the MINSUM problem.

4.1 NP-hardness of MINSUM

We show that the MINSUM problem is NP-hard even under severe restrictions on the instance. In particular we show that the hardness holds even when all agents have a preference list of a constant length f and there is a master list ordering on agents and programs. To show the NP-hardness of MINSUM we use the Set Cover (SC) instance where every element occurs in exactly f sets. Minimum vertex cover on f-uniform hypergraphs is known to be NP-complete and SC problem where every element occurs in exactly f sets is equivalent to it [7]. Reduction: Let hS,E,ki be an instance of SC such that every element in E occurs in exactly f sets. Let m = |S|,n = |E|. We construct an instance H of SMFQ as follows. For every set si ∈ S, we have a set-agent ai and a set-program pi. For every element eh ∈ E, we have an element-agent ah′ . We also have a program p 1 f 2 and f − 2 programs p ,...,p − . Thus in the instance H we have m + n agents and m + f − 1 programs. Let Ei denote the set of elements in the si. The element-agents corresponding to Ei in H are denoted by Ai′ . Preferences: The preference lists of agents and programs are shown in Fig. 3. Every set-agent ai has f programs in its preference list - the set-program pi followed by the program p followed by the programs 1 f 2 p ,...,p − in that order. Every element-agent aj′ has the f set-programs corresponding to the sets that contain it in an arbitrary fixed order. Every set-program pi has its set-agent ai as its top-preferred agent 1 f 2 followed by the agents in Ai in an arbitrary fixed order. The program p and each program p ,...,p − has the set-agents as a1,...,an in an arbitrary fixed order. It is clear that every agent has a preference list of length f. j Costs: The costs for program pi for 1 ≤ i ≤ n and p for 1 ≤ j ≤ f − 2 is 1 and that for the program p is 0. Thus, the instance has two distinct costs.

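Agents (A):
  $a_i : p_i, p, p^1, \ldots, p^{f-2}$    for $1 \le i \le m$
  $a'_j : p_{j_1}, \ldots, p_{j_f}$    for $1 \le j \le n$

Programs (P):
  $p_i : a_i, A'_i$    for $1 \le i \le m$
  $p : a_1, \ldots, a_m$
  $p^t : a_1, \ldots, a_m$    for $1 \le t \le f - 2$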

Fig. 3: Reduced instance $H$ of SMFQ from instance $\langle S, E, k \rangle$ of SC
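The construction of Fig. 3 can be sketched in Python as follows. This is only an illustrative sketch, not code from the paper: the function name `build_minsum_instance` and the string labels (`a0` for set-agents, `e0` for element-agents, `q1` for the programs $p^t$) are hypothetical, indices are 0-based, and $f \ge 2$ is assumed so that every set-agent's list has length exactly $f$.

```python
from typing import Dict, List, Tuple

Prefs = Dict[str, List[str]]


def build_minsum_instance(sets: List[List[int]], num_elements: int) -> Tuple[Prefs, Prefs, Dict[str, int]]:
    """Build the reduced SMFQ instance H from a Set Cover instance.

    sets[i] lists the (0-based) elements of set s_i. Every element is
    assumed to occur in exactly f sets, with f >= 2.
    Returns (agent_prefs, program_prefs, cost).
    """
    m = len(sets)
    # occurrences[h] = indices of the sets containing element e_h
    occurrences = {h: [i for i in range(m) if h in sets[i]] for h in range(num_elements)}
    f_values = {len(v) for v in occurrences.values()}
    if len(f_values) != 1:
        raise ValueError("every element must occur in exactly f sets")
    f = f_values.pop()

    extra = [f"q{t}" for t in range(1, f - 1)]  # stands for p^1, ..., p^{f-2}
    set_agents = [f"a{i}" for i in range(m)]
    agent_prefs: Prefs = {}
    program_prefs: Prefs = {}
    cost: Dict[str, int] = {}

    for i in range(m):
        # set-agent a_i: p_i, then p, then p^1, ..., p^{f-2}
        agent_prefs[f"a{i}"] = [f"p{i}", "p"] + extra
        # set-program p_i: a_i first, then the element-agents A'_i
        program_prefs[f"p{i}"] = [f"a{i}"] + [f"e{h}" for h in sorted(sets[i])]
        cost[f"p{i}"] = 1

    for h in range(num_elements):
        # element-agent a'_h: the f set-programs of the sets containing e_h
        agent_prefs[f"e{h}"] = [f"p{i}" for i in occurrences[h]]

    program_prefs["p"] = list(set_agents)
    cost["p"] = 0
    for q in extra:
        program_prefs[q] = list(set_agents)
        cost[q] = 1
    return agent_prefs, program_prefs, cost
```

For example, with three sets $\{e_0, e_1\}$, $\{e_1, e_2\}$, $\{e_0, e_2\}$ (so $f = 2$), the call `build_minsum_instance([[0, 1], [1, 2], [0, 2]], 3)` yields $m + n = 6$ agents and $m + f - 1 = 4$ programs.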

Claim. If $\langle S, E, k \rangle$ is a yes instance, then $H$ admits an A-perfect stable matching of cost at most $n + k$.

Proof. Let $X \subseteq S$ be a set cover of size at most $k$. Using $X$, we construct an A-perfect matching $M$ in $H$ and show that $M$ is stable. We then show that the cost of $M$ is bounded. For every set $s_i \notin X$, we match the corresponding set-agent $a_i$ to the program $p$, that is, $M(a_i) = p$. For every set $s_i \in X$, we match the corresponding set-agent $a_i$ to the program $p_i$, that is, $M(a_i) = p_i$. Since $X$ is a set cover, for every element $e_j \in E$, at least one of the sets it occurs in is in $X$. Thus, we match every element-agent $a'_j$ to a program $p_{j_t}$ corresponding to a set in $X$ (if more than one set containing $e_j$ is in $X$, we match $a'_j$ to the program it prefers most amongst them). It is clear that $M$ is A-perfect.

To prove that $M$ is stable, we show that no agent participates in a blocking pair. First observe that an agent $a_i$ corresponding to $s_i \in X$ is matched to its top-choice program. Hence such agents do not participate in blocking pairs. Next, every agent $a_i$ such that $s_i \notin X$ is matched to $p$, and the only program it prefers to $p$, namely $p_i$, remains closed. Finally, for every element-agent that is not matched to its top-choice program, all the programs it prefers to its matched program are closed. Thus the matching is stable.

We compute the cost of the matching $M$ from the agent side. Each element-agent is matched to some program $p_i$ and costs 1. At most $k$ set-agents are matched to their corresponding set-programs $p_i$, each at cost 1, and the remaining set-agents are matched to the program $p$ at cost 0 each. Hence the cost of the matching is at most $n + k$. □

Claim. If $H$ admits an A-perfect stable matching with cost at most $n + k$, then $\langle S, E, k \rangle$ is a yes instance.

Proof. First we prove that program $p$ must take at least $m - k$ set-agents. Since the matching is A-perfect, every element-agent is matched and contributes a cost of 1, and every set-agent not matched to $p$ contributes a cost of 1. Thus, if program $p$ takes fewer than $m - k$ set-agents, then the cost of any such A-perfect stable matching is at least $n + k + 1$, which is a contradiction. Thus, at least $m - k$ set-agents must be matched to program $p$.

Let $X$ be the set of sets $s_i$ such that $a_i$ is matched to $p_i$. Then $|X| \le k$. Since the matching is stable, every program $p_i$ such that $s_i \notin X$ must be closed (since $a_i$ is matched to either $p$ or some $p^j$). We now prove that $X$ is a set cover. Suppose not. Then there exists at least one element, say $e_j$, such that no set containing $e_j$ is in $X$, implying that all the programs $p_{j_t}$ are closed. It means that the element-agent $a'_j$ is unmatched. This implies that the matching is not A-perfect, which is a contradiction. Hence $X$ must be a set cover. Since $|X| \le k$, $\langle S, E, k \rangle$ is a yes instance. □

We remark that the NP-hardness result holds even when there is a master list ordering on agents and programs as follows:

$(a_1, \ldots, a_m, a'_1, \ldots, a'_n)$, $(p_1, \ldots, p_m, p, p^1, \ldots, p^{f-2})$. This establishes Theorem 2.
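The forward direction of the reduction can be checked on a small example. The sketch below is illustrative only and rests on an assumption about the blocking-pair notion, inferred from how the proofs use it: a pair $(a, p)$ blocks $M$ if $a$ prefers $p$ to $M(a)$ and $p$ is matched to some agent it ranks below $a$, so a closed program never blocks. The helper names and the hard-coded instance (sets $\{e_0, e_1\}$, $\{e_1, e_2\}$, $\{e_0, e_2\}$ with $f = 2$) are hypothetical.

```python
def blocking_pairs(agent_prefs, program_prefs, matching):
    """List pairs (a, p) where a prefers p to matching[a] and p is matched
    to some agent it ranks below a (assumed blocking-pair notion)."""
    assigned = {}
    for agent, program in matching.items():
        assigned.setdefault(program, []).append(agent)
    pairs = []
    for agent, prefs in agent_prefs.items():
        current = matching.get(agent)
        # programs the agent strictly prefers to its current match
        better = prefs if current is None else prefs[:prefs.index(current)]
        for program in better:
            rank = program_prefs[program].index
            if any(rank(agent) < rank(other) for other in assigned.get(program, [])):
                pairs.append((agent, program))
    return pairs


def matching_from_cover(agent_prefs, cover, m, n):
    """Build the matching of the first claim from a set cover (set indices)."""
    matching = {f"a{i}": (f"p{i}" if i in cover else "p") for i in range(m)}
    open_programs = {f"p{i}" for i in cover}
    for h in range(n):
        # most-preferred open set-program; one exists because cover is a set cover
        matching[f"e{h}"] = next(p for p in agent_prefs[f"e{h}"] if p in open_programs)
    return matching


# Hypothetical instance H for s_0 = {e_0, e_1}, s_1 = {e_1, e_2}, s_2 = {e_0, e_2}.
agent_prefs = {
    "a0": ["p0", "p"], "a1": ["p1", "p"], "a2": ["p2", "p"],
    "e0": ["p0", "p2"], "e1": ["p0", "p1"], "e2": ["p1", "p2"],
}
program_prefs = {
    "p0": ["a0", "e0", "e1"], "p1": ["a1", "e1", "e2"],
    "p2": ["a2", "e0", "e2"], "p": ["a0", "a1", "a2"],
}
cost = {"p0": 1, "p1": 1, "p2": 1, "p": 0}

cover = {0, 1}  # X = {s_0, s_1}, so k = 2
M = matching_from_cover(agent_prefs, cover, m=3, n=3)
total_cost = sum(cost[p] for p in M.values())
```

On this instance the matching has no blocking pair under the assumed notion, and its cost equals $n + k = 5$. Moving $a'_0$ (here `e0`) to the otherwise closed program $p_2$ opens it and creates the blocking pair $(a_2, p_2)$, which illustrates why programs of sets outside $X$ must stay closed.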

4.2 Inapproximability of MINSUM

Let $G = (V, E)$ be an instance of the minimum vertex cover (MVC) problem. Let $n = |V|$ and $m = |E|$. We construct an instance of SMFQ as follows.

Reduction: For every vertex $u_i$, we have $m$ vertex-agents $a_i^1, \ldots, a_i^m$ and two vertex-programs $p_i$ and $p'_i$. For every edge $e_j$, we have one edge-agent $a'_j$. We have an additional program $p$. Thus, we have $m + mn$ agents and $2n + 1$ programs.

Preferences: The preference lists of programs and agents are shown in Fig. 4. The set of vertex-programs $p'_{j_1}$ and $p'_{j_2}$ corresponding to the end-points $u_{j_1}$ and $u_{j_2}$ of edge $e_j$ is denoted by $P'_j$. The set of edge-agents $a'_j$ corresponding to the edges $e_j$ incident on vertex $u_i$ is denoted by $A'_i$. Each vertex-agent $a_i^t$, $1 \le t \le m$, has the program $p_i$ followed by $p'_i$ followed by $p$. Each edge-agent $a'_j$ has the two programs in $P'_j$ in an arbitrary order. Each vertex-program $p_i$ prefers the $m$ vertex-agents $a_i^1, \ldots, a_i^m$ in that order. Each vertex-program $p'_i$ prefers the $m$ vertex-agents $a_i^1, \ldots, a_i^m$ in that order, followed by the edge-agents in $A'_i$. Program $p$ prefers the $mn$ vertex-agents in an arbitrary but fixed order.

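Agents (A):
  $a_i^j : p_i, p'_i, p$    for all $1 \le i \le n$, $1 \le j \le m$
  $a'_j : P'_j$    for $1 \le j \le m$

Programs (P):
  $p_i : a_i^1, \ldots, a_i^m$    for all $1 \le i \le n$
  $p'_i : a_i^1, \ldots, a_i^m, A'_i$    for all $1 \le i \le n$
  $p : a_i^t$    for $1 \le i \le n$, $1 \le t \le m$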

Fig. 4: Reduced instance H of SMFQ from instance G of MVC

Costs: The cost of each vertex-program $p_i$ is 3, that of each vertex-program $p'_i$ is $2n$, and that of the program $p$ is 0.
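A minimal sketch of this second construction, under the same caveats as before: the function name `build_inapprox_instance` and the labels (`a0_1` for $a_0^1$, `p0'` for $p'_0$, `e0` for the edge-agent $a'_0$) are hypothetical, and vertices are numbered from 0.

```python
from typing import Dict, List, Tuple


def build_inapprox_instance(n: int, edges: List[Tuple[int, int]]):
    """Build the SMFQ instance of Fig. 4 from a graph with vertices 0..n-1.

    Returns (agent_prefs, program_prefs, cost).
    """
    m = len(edges)
    # incident[i] = indices of the edges incident on vertex u_i
    incident = {i: [j for j, (u, v) in enumerate(edges) if i in (u, v)] for i in range(n)}
    agent_prefs: Dict[str, List[str]] = {}
    program_prefs: Dict[str, List[str]] = {}
    cost: Dict[str, int] = {}
    all_vertex_agents: List[str] = []

    for i in range(n):
        copies = [f"a{i}_{t}" for t in range(1, m + 1)]  # a_i^1, ..., a_i^m
        all_vertex_agents += copies
        for agent in copies:
            agent_prefs[agent] = [f"p{i}", f"p{i}'", "p"]
        program_prefs[f"p{i}"] = list(copies)
        # p'_i ranks the vertex-agents first, then the edge-agents A'_i
        program_prefs[f"p{i}'"] = copies + [f"e{j}" for j in incident[i]]
        cost[f"p{i}"] = 3
        cost[f"p{i}'"] = 2 * n

    for j, (u, v) in enumerate(edges):
        agent_prefs[f"e{j}"] = [f"p{u}'", f"p{v}'"]  # the two programs in P'_j

    program_prefs["p"] = all_vertex_agents
    cost["p"] = 0
    return agent_prefs, program_prefs, cost
```

For a triangle ($n = 3$, $m = 3$), this produces $m + mn = 12$ agents and $2n + 1 = 7$ programs.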

Lemma 9. If an optimal vertex cover in $G$ is of size at most $(\frac{2}{3} + \epsilon) \cdot n$, then in $H$ there exists an A-perfect stable matching $M$ with cost at most $(4 + 3\epsilon)mn$.

Proof. Let $V'$ be a vertex cover of $G$ of size at most $(\frac{2}{3} + \epsilon) \cdot n$. If $u_i \notin V'$, we match all the $m$ vertex-agents $a_i^t$ to the program $p$ at cost 0. If $u_i \in V'$, we match all the $m$ vertex-agents $a_i^t$ to the program $p_i$. This contributes a cost of at most $m \cdot (\frac{2}{3} + \epsilon) \cdot n \cdot 3 = mn(2 + 3\epsilon)$. We then prune the preference list of every edge-agent $a'_j$ by deleting the programs $p'_i$ corresponding to the end-points $u_i$ such that $u_i \notin V'$. Since $V'$ is a vertex cover, it is guaranteed that every edge is covered, and hence every edge-agent $a'_j$ has a non-empty list $P'_j$ after pruning. Every edge-agent $a'_j$ is then matched to the top-preferred program $p'_i$ in the pruned list $P'_j$. Each of the $m$ edge-agents incurs a cost of $2n$, which contributes a cost of $2mn$. Thus, the total cost of this matching is at most $(2 + 2 + 3\epsilon)mn = (4 + 3\epsilon)mn$. It is easy to see that $M$ is A-perfect and is stable since no agent participates in a blocking pair. □
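The matching built in this proof, and its cost, can be traced on a small graph. The following is a rough sketch only: the function names and agent labels (`a0_1`, `e0`, `p0'`) are hypothetical, and the cost is summed from the agent side using the costs 3, $2n$ and 0 stated above.

```python
def lemma9_matching(n, edges, cover):
    """Match vertex-agents and edge-agents as in the proof of Lemma 9.

    Vertices are 0..n-1 and cover is a set of vertex indices.
    """
    m = len(edges)
    matching = {}
    for i in range(n):
        # all m copies of u_i go to p_i if u_i is in the cover, else to p
        target = f"p{i}" if i in cover else "p"
        for t in range(1, m + 1):
            matching[f"a{i}_{t}"] = target
    for j, (u, v) in enumerate(edges):
        pruned = [w for w in (u, v) if w in cover]  # pruned list P'_j
        if not pruned:
            raise ValueError("cover is not a vertex cover")
        matching[f"e{j}"] = f"p{pruned[0]}'"  # top-preferred surviving program
    return matching


def matching_cost(matching, n):
    """Sum program costs over agents: p costs 0, p_i costs 3, p'_i costs 2n."""
    def program_cost(program):
        if program == "p":
            return 0
        return 2 * n if program.endswith("'") else 3
    return sum(program_cost(program) for program in matching.values())


# Path u_0 - u_1 - u_2 with the (non-minimum) cover {u_0, u_1}.
n, edges = 3, [(0, 1), (1, 2)]
M9 = lemma9_matching(n, edges, {0, 1})
cost9 = matching_cost(M9, n)
```

Here $m = 2$ and $|V'| = 2 = \frac{2}{3} n$, so the vertex-agents contribute $3 \cdot 2 \cdot 2 = 12$, the edge-agents contribute $2mn = 12$, and the total of 24 meets the bound $(4 + 3\epsilon)mn$ with $\epsilon = 0$.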

Lemma 10. If the optimal vertex cover in $G$ has size greater than $(\frac{8}{9} - \epsilon) \cdot n$, then in $H$ any A-perfect stable matching $M$ has cost greater than $(\frac{14}{3} - 3\epsilon)mn$.

Proof. We prove the contrapositive, i.e., if there exists an A-perfect stable matching with cost at most $mn(\frac{14}{3} - 3\epsilon)$, then the optimal vertex cover in $G$ has size at most $(\frac{8}{9} - \epsilon) \cdot n$. Given that the matching is A-perfect, all the edge-agents $a'_j$ ($m$ in total) must be matched to some program in $P'_j$. Note that each such agent contributes a cost of $2n$, and hence together they contribute a cost of $2mn$.

Suppose the edge-agent $a'_j$ is matched to a program $p'_i$ such that $u_i$ is one of the end-points of edge $e_j$. Then, all $m$ vertex-agents $a_i^t$ for the vertex $u_i$ must be matched to either $p_i$ or $p'_i$; otherwise they form a blocking pair with the program $p'_i$. Let $V'$ be the set of vertices $u_i$ such that at least one $a_i^t$ is matched to $p_i$ or $p'_i$. If $u_i \in V'$ is such that fewer than $m$ vertex-agents $a_i^t$ are matched to $p_i$ or $p'_i$, then no edge-agent corresponding to an edge incident on $u_i$ can be matched to $p'_i$; each such edge-agent is instead matched to $p'_k$, where $u_k$ is the other end-point of its edge. Thus, we can remove such $u_i$ from $V'$. It is clear that $V'$ is a vertex cover of $G$ since the matching is A-perfect.

For every $u_i \in V'$, all $m$ vertex-agents $a_i^t$ are matched to $p_i$ or $p'_i$. After subtracting the cost $2mn$ of the edge-agents, the vertex-agents together contribute a cost of at most $mn(\frac{8}{3} - 3\epsilon)$. Since each $u_i \in V'$ has $m$ copies matched, each contributing a cost of at least 3, it follows that $3m|V'| \le mn(\frac{8}{3} - 3\epsilon)$, that is, $|V'| \le (\frac{8}{9} - \epsilon) \cdot n$. Hence an optimal vertex cover in $G$ has size at most $(\frac{8}{9} - \epsilon) \cdot n$. □

Lemma 11. The MINSUM problem is NP-hard to approximate within a factor of $\frac{7}{6} - \delta$ for any constant $\delta > 0$.

Proof. We use the following proposition and the results proved in Lemma 9 and Lemma 10.

Proposition [9]. For any $\epsilon > 0$ and $p < \frac{3 - \sqrt{5}}{2}$, the following holds: if there is a polynomial-time algorithm that, given a graph $G = (V, E)$, distinguishes between the following two cases, then P = NP.
(1) $|VC(G)| \le (1 - p + \epsilon)|V|$
(2) $|VC(G)| > (1 - \max\{p^2, 4p^3 - 3p^4\} - \epsilon)|V|$

By letting $p = \frac{1}{3}$ in the Proposition above, we know that the existence of a polynomial-time algorithm that distinguishes between the following two cases implies P = NP for an arbitrarily small positive constant $\epsilon$:
(1) $|VC(G)| \le (\frac{2}{3} + \epsilon)|V|$, i.e., $cost(M) \le (4 + 3\epsilon)mn$
(2) $|VC(G)| > (\frac{8}{9} - \epsilon)|V|$, i.e., $cost(M) > (\frac{14}{3} - 3\epsilon)mn$

Now, suppose that there is a polynomial-time approximation algorithm A for MINSUM whose approximation ratio is at most $\frac{7}{6} - \delta$ for some $\delta$. Then, consider a fixed constant $\epsilon$ such that $\epsilon < \frac{8\delta}{13 - 6\delta}$. If an instance of case (1) is given to A, it outputs a solution with cost at most $(4 + 3\epsilon)mn(\frac{7}{6} - \delta) < mn \cdot \frac{26(7 - 6\delta)}{3(13 - 6\delta)}$. If an instance of case (2) is given to A, it outputs a solution with cost greater than $mn(\frac{14}{3} - 3\epsilon) > mn \cdot \frac{26(7 - 6\delta)}{3(13 - 6\delta)}$. Hence, using A, we can distinguish between cases (1) and (2), which implies that P = NP. □

This establishes Theorem 3.
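The constants in this argument can be re-derived with exact rational arithmetic. The sketch below is a sanity check, not part of the proof; the names `gap_holds`, `yes_vc`, `no_vc` and the sample value $\delta = \frac{1}{10}$ are illustrative choices. It recomputes the thresholds $\frac{2}{3}$ and $\frac{8}{9}$ from $p = \frac{1}{3}$, the cost coefficients 4 and $\frac{14}{3}$, and tests the two strict inequalities for a given $\delta$ and $\epsilon$.

```python
from fractions import Fraction
import math

p = Fraction(1, 3)
p_is_admissible = float(p) < (3 - math.sqrt(5)) / 2  # condition of the Proposition

# Vertex-cover thresholds of cases (1) and (2), as coefficients of |V| (eps omitted).
yes_vc = 1 - p
no_vc = 1 - max(p**2, 4 * p**3 - 3 * p**4)

# Cost coefficients of mn: 3 per vertex-agent copy plus 2 from the edge-agents.
yes_cost = 3 * yes_vc + 2
no_cost = 3 * no_vc + 2
ratio = no_cost / yes_cost


def gap_holds(delta: Fraction, eps: Fraction) -> bool:
    """Check (4 + 3 eps)(7/6 - delta) < T < 14/3 - 3 eps, where
    T = 26 (7 - 6 delta) / (3 (13 - 6 delta)). Pass Fractions for exactness."""
    threshold = Fraction(26) * (7 - 6 * delta) / (3 * (13 - 6 * delta))
    upper_yes = (4 + 3 * eps) * (Fraction(7, 6) - delta)
    lower_no = Fraction(14, 3) - 3 * eps
    return upper_yes < threshold < lower_no


delta = Fraction(1, 10)
eps_bound = 8 * delta / (13 - 6 * delta)  # the proof requires eps < eps_bound
```

With these values, `ratio` is exactly $\frac{7}{6}$, which is where the factor in Lemma 11 comes from. Any $\epsilon$ strictly below `eps_bound` separates the two cases, while at $\epsilon$ equal to `eps_bound` both inequalities become equalities, so the strictness of the bound on $\epsilon$ matters.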

References

1. Abdulkadiroğlu, A., Sönmez, T.: School choice: A mechanism design approach. American Economic Review 93(3), 729–747 (June 2003). https://doi.org/10.1257/000282803322157061
2. Aziz, H., Baychkov, A., Biró, P.: Summer internship matching with funding constraints. In: Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems, AAMAS '20. pp. 97–104 (2020), https://dl.acm.org/doi/abs/10.5555/3398761.3398778
3. Baswana, S., Chakrabarti, P.P., Chandran, S., Kanoria, Y., Patange, U.: Centralized admissions for engineering colleges in India. Interfaces 49(5), 338–354 (2019). https://doi.org/10.1287/inte.2019.1007
4. Biró, P., Fleiner, T., Irving, R.W., Manlove, D.: The college admissions problem with lower and common quotas. Theor. Comput. Sci. 411(34-36), 3136–3153 (2010). https://doi.org/10.1016/j.tcs.2010.05.005
5. Biró, P., Manlove, D., Mittal, S.: Size versus stability in the marriage problem. Theor. Comput. Sci. 411(16-18), 1828–1841 (2010). https://doi.org/10.1016/j.tcs.2010.02.003
6. Biró, P., Manlove, D.F., Mittal, S.: Size versus stability in the marriage problem. Theor. Comput. Sci. 411(16-18), 1828–1841 (2010). https://doi.org/10.1016/j.tcs.2010.02.003
7. Cardinal, J., Karpinski, M., Schmied, R., Viehmann, C.: Approximating vertex cover in dense hypergraphs. Journal of Discrete Algorithms 13, 67–77 (2012). https://doi.org/10.1016/j.jda.2012.01.003
8. Cooper, F.: Fair and large stable matchings in the stable marriage and student-project allocation problems. Ph.D. thesis, University of Glasgow, UK (2020), http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.815027
9. Dinur, I., Safra, S.: The importance of being biased. In: Proceedings on 34th Annual ACM Symposium on Theory of Computing. pp. 33–42 (2002), https://doi.org/10.1145/509907.509915
10. Gajulapalli, K., Liu, J.A., Mai, T., Vazirani, V.V.: Stability-preserving, time-efficient mechanisms for school choice in two rounds. In: 40th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2020). LIPIcs, vol. 182, pp. 21:1–21:15 (2020). https://doi.org/10.4230/LIPIcs.FSTTCS.2020.21
11. Gale, D., Shapley, L.S.: College admissions and the stability of marriage. The American Mathematical Monthly 69(1), 9–15 (1962), http://www.jstor.org/stable/2312726
12. Hamada, K., Iwama, K., Miyazaki, S.: The hospitals/residents problem with lower quotas. Algorithmica 74(1), 440–465 (2016). https://doi.org/10.1007/s00453-014-9951-z
13. Hoshino, R., Raible-Clark, C.: The quest draft: An automated course allocation algorithm. In: Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence. pp. 2906–2913 (2014), http://www.aaai.org/ocs/index.php/IAAI/IAAI14/paper/view/8341
14. Huang, C.C., Kavitha, T.: Popular matchings in the stable marriage problem. In: International Colloquium on Automata, Languages, and Programming. pp. 666–677. Springer (2011). https://doi.org/10.1007/978-3-642-22006-7_56
15. Irving, R.: Matching practices for entry-labor markets – Scotland, MiP country profile 3. (2011), https://www.matching-in-practice.eu/the-scottish-foundation-allocation-scheme-sfas/
16. Kawase, Y., Iwasaki, A.: Approximately stable matchings with budget constraints. In: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), the 30th Innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI-18) (2018), https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/17032
17. Krishnapriya, A.M., Nasre, M., Nimbhorkar, P., Rawat, A.: How good are popular matchings? In: 17th International Symposium on Experimental Algorithms, SEA 2018. pp. 9:1–9:14 (2018). https://doi.org/10.4230/LIPIcs.SEA.2018.9
18. Othman, A., Sandholm, T., Budish, E.: Finding approximate competitive equilibria: efficient and fair course allocation. In: 9th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2010). pp. 873–880 (2010), https://dl.acm.org/citation.cfm?id=1838323
19. Rios, I., Larroucau, T., Parra, G., Cominetti, R.: College admissions problem with ties and flexible quotas (01 2014). https://doi.org/10.2139/ssrn.2478998
20. Robards, P.A.: Applying two-sided matching processes to the United States Navy enlisted assignment process. Tech. rep., Naval Postgraduate School, Monterey, CA (2001)
21. Roth, A.E.: On the allocation of residents to rural hospitals: A general property of two-sided matching markets. Econometrica 54(2), 425–427 (1986), http://www.jstor.org/stable/1913160
22. Sönmez, T., Ünver, M.U.: Course bidding at business schools. International Economic Review 51(1), 99–123 (2010). https://doi.org/10.1111/j.1468-2354.2009.00572.x
23. Yang, W., Giampapa, J., Sycara, K.: Two-sided matching for the US Navy detailing process with market complication. Tech. rep., Technical Report CMU-RI-TR-03-49, Robotics Institute, Carnegie-Mellon University (2003)
24. Yokoi, Y.: Envy-free matchings with lower quotas. Algorithmica 82(2), 188–211 (2020). https://doi.org/10.1007/s00453-018-0493-7