A Kernel Method for Market Clearing

Sebastien´ Lahaie Yahoo! Research New York, NY 10018 [email protected]

Abstract agents would correspond to representative buyers and sellers that enact the aggregate demand and supply behavior of each The problem of market clearing in an economy is side of the economy. Thus there is little loss of generality in that of finding prices such that supply meets de- considering just a buyer and a seller in this setting, and we mand. In this work, we propose a kernel method opt for this simpler two-agent model for this initial work. to compute nonlinear clearing prices for instances The problem at hand is that of finding prices such that where linear prices do not suffice. We first present the bundle that maximizes the buyer’s utility simultaneously a procedure that, given a sample of values and costs maximizes the seller’s profit; it is in this sense that supply for a set of bundles, implicitly computes nonlinear meets demand. We make no convexity assumptions on the clearing prices by solving an appropriately formu- value and cost functions of the buyer and seller. Now, no lated quadratic program. We then use this as a sub- single clearing method can hope to perform universally well routine in an elicitation procedure that queries de- under such general conditions. A kernel method has the mod- mand and supply incrementally over rounds, only ularity to adopt an appropriate kernel function to clear any as much as needed to reach clearing prices. An em- particular instance of value and cost functions. pirical evaluation demonstrates that, with a proper choice of kernel function, the method is able to We first provide a quadratic program that, given a sample find sparse nonlinear clearing prices with much less of values and costs for a set of bundles, computes nonlinear than full revelation of values and costs. When the clearing prices using only inner-products of high-dimensional kernel function is not suitable to clear the market, bundle representations; this is the usual notion of a kernel the method can be tuned to achieve approximate method. Unlike typical machine learning scenarios, however, clearing. a sample of values and costs is not just given to us, it must be elicited. Thus we embed our quadratic program as a sub- routine in an elicitation procedure that queries demand and 1 Introduction supply incrementally over rounds. It is standard in microeconomic modeling to assume that util- Kernel methods have been applied to single-agent prefer- ities are concave and costs convex. These assumptions ensure ence elicitation, where the goal is to recover or approximate that linear prices exist that balance demand and supply, where an agent’s preferences in order to act on its behalf or assist one price is assigned to each item and the price of a bundle it on some tasks (i.e., for decision support). Domshlak and is simply the sum of item prices. But there are many realistic Joachims [2006] provide a method for generating an ordi- conditions under which such assumptions may fail to hold, nal utility function given qualitative preference statements. such as complementary items or economies of scale. In such Chappelle and Harchaoui [2005] and Evgeniou et al. [2005] cases nonlinear prices might be needed to clear the market. apply support vector machines to conjoint analysis (prefer- The problem of finding nonlinear clearing prices is chal- ence elicitation for marketing purposes). lenging because such prices may be difficult to succinctly de- We are not aware of any applications of kernel methods in scribe, let alone efficiently compute. In this work, we propose the context of multi-agent preference elicitation, where the a kernel method for market clearing to address both of these goal is not to recover preferences in any complete way, but issues. We view nonlinear prices as linear prices with respect rather to solve some ulterior problem involving all the agents to some high-dimensional representation of bundles. This (such as market clearing). Our elicitation procedure is similar perspective is drawn directly from the field of kernel methods to the work of Lahaie and Parkes [2004], who give an auction- in machine learning, in particular support vector machines, style protocol that leads to clearing prices. They fit valuation where nonlinear classifiers are construed as linear classifiers functions to the agent’s reported values using techniques from in high-dimensional feature space [Shawe-Taylor and Cris- computational learning theory, and ultimately reach clearing tianini, 2004]. prices on the basis of these valuations. Our method bypasses We consider a simple model with one buyer and one seller. the step of fitting functions and directly computes clearing In an environment with multiple buyers and sellers, our two prices based on sample information of values and costs.

208 2 The Model of cost query and -supply query are defined analogously for There is one buyer, one seller, and m items. A bundle is a the seller. Because prices are communicated to the agents in subset of the items. We associate each bundle with its indi- demand and supply queries, we must ensure that they have cator vector, and denote the set of bundles by X = {0, 1}m. succinct encodings for such queries to be practical. This is We write x ≤ x to denote that bundle x is contained in bun- another goal of our work, for which we draw on techniques dle x (the inequality is understood component-wise). The from kernel methods. buyer has a value function v : X → R+ denoting how much it is willing to pay for each bundle. The seller has a cost 3Kernels function c : X → R+ denoting how much it would cost to To compute nonlinear clearing prices, we view these as linear produce each bundle. We assume that each function is mono- prices in a higher-dimensional space to which we map the tone: v(x) ≤ v(x) whenever x ≤ x, and the same for c.We bundles via a mapping φ : X → RM ,whereM ≥ m.Entry also assume v(0)=c(0)=0. i in the vector φ(x) defines the value of the ith “feature” of A bundle is efficient if it maximizes the gains from trade: bundle x,fori =1,...,M.Now,givenp ∈ RM , the price value created minus cost of production. Formally, x ∈ X is of bundle x is p, φ(x) . The mapping φ therefore indirectly efficient if x ∈ arg maxx∈X v(x ) − c(x ). Together with an describes how each bundle is priced. efficient bundle, we wish to find prices p : X → R such that We might need M m to ensure that clearing prices exist, at these prices, supply meets demand. Let but with large M we cannot explicitly exhibit vectors φ(x) or prices p drawn from RM . The key insight of kernel meth- D(p) = arg max{v(x) − p(x)} x∈X ods is known as the “kernel trick”. Instead of working with RM S(p) = arg max{p(x) − c(x)} vectors in , which may be impractical or even infeasible, x∈X one formulates the relevant problem (e.g., classification, re- In words, D(p) is the set of bundles that maximizes the gression) as a mathematical program that relies only on the φ(xi),φ(xj ) buyer’s utility—value minus price—and similarly S(p) are inner products . What makes this practical is those bundles that maximize the seller’s profit—revenue mi- that, for many kinds of mappings, the inner products can be p D p ∩ S p ∅ efficiently evaluated in time that does not depend on M. nus cost. Prices are clearing prices if ( ) ( ) = . κ At clearing prices, there is some bundle that simultaneously A kernel function computes the inner product of the im- φ : Rm → RM maximizes the buyer’s utility and the seller’s profit; in this ages of two bundles under a mapping ,so κ(x1,x2)= φ(x1),φ(x2) sense, supply meets demand and the market clears. that . A fundamental result on κ : X ×X → R Clearing prices are important because they decentralize the kernels states that if a function with a count- x∗ ∈ D p ∩ S p able domain is the kernel function of some map φ, then for problem of efficient trade. If ( ) ( ), then for any {x ,...,x } X k × k other bundle x we have any finite sample 1 k from ,the kernel matrix K with entries Kij = κ(xi,xj) is positive semi-definite v(x∗) − p(x∗) ≥ v(x) − p(x) (see, e.g., [Shawe-Taylor and Cristianini, 2004]). p(x∗) − c(x∗) ≥ p(x) − c(x) Instances of useful kernel functions for various classification and regression tasks abound in the machine learning lit- and adding these two inequalities shows that v(x∗)−c(x∗) ≥ ∗ erature. By way of example, we list here those kernels that v(x) − c(x). Thus x is efficient. In our model nonlinear we use later for the empirical evaluation of our method. clearing prices always exist because observe that we can sim- Linear At one extreme we have the kernel κ(xi,xj )= ply take p = v or p = c. As mentioned in the introduc- xi,xj which simply corresponds to the identity map tion, convexity assumptions would imply that linear clear- φ(x)=x. We would use this kernel in the method devel- ing prices exist; formally, these can be described by a vector m oped later if we were to try to find linear clearing prices. p ∈ R so that the price of a bundle x ∈ X is the usual inner m Identity At the other extreme, we have the kernel function product p, x = j=1 pjxj. κ(xi,xj )=1if xi = xj,and0 otherwise. List all the pos- We need to explain how value and cost information will be 2m x ,x ,...,x m e ∈ R provided to our algorithms. Because the domain of bundles is sible bundles as 1 2 2 .Ifwedefine i to v be the unit vector that has a 1 in entry i (and zeroes in the exponentially large, passing an encoded description of and φ x e c would be infeasible in general for even moderate m (such as remaining), then the implied map is simply ( i)= i.This m =30). In fact, one main goal of our work is to try to find corresponds to the case were each bundle is priced explicitly, clearing prices by eliciting as little value and cost information and therefore clearing prices are guaranteed to exist. as possible. Thus we will just assume that our algorithms All-subsetsm The all-subsets kernel maps bundles into vectors in R2 . The range has a dimension for each bundle have oracle access to the value and cost functions through two kinds of queries. We assume that agents respond truthfully to x ∈ X.Wehaveφx(x )=1if x ≤ x and 0 other- queries, and leave the question of incentive-compatibility to wise. Thus the price of a bundle is the aggregate of the prices future work. On a value query, the buyer is presented a bundle on all its subsets. The actual kernel function is defined as κ x, x m x x x and replies with v(x).Onan-demand query, the buyer ( )= i=1(1 + i i), which can be evaluated in linear is presented a bundle x, prices p,andan ≥ 0; the buyer time. With the ability to implicitly price subsets of commodi- returns any bundle that maximizes its utility at prices p within ties, this kernel seems well suited to market clearing. an additive error of , breaking ties however it wishes, with Gaussian This is one of the most widely used ker- the exception that if x applies then it is returned. The notions nels in machine learning. It is defined as κ(x, x)=

209 2 exp(− x − x /2σ2). (Here and everywhere else, · Observe that this quadratic program solves the problem of refers to the Euclidean norm.) The parameter σ controls the finding an efficient bundle with respect to v¯ and c¯, except that flexibility of the kernel. With smaller σ, it is better able to we have written v(xi) instead of v(yi) and φ(xi) explicitly fit arbitrary functions, and the kernel matrix approaches the instead of yi. The dual of this program is as follows. identity matrix (i.e., the kernel matrix of the identity kernel). πb πs μ b2 μ s 2 It is difficult to glean any economic motivation for this ker- min + + 2 + 2 πs,πb,p, nel, but its success with classification and regression suggests b b it should be worth trying for market clearing as well. s.t.π≥ v(xi) − p, φ(xi) −i i =1,...,r s s π ≥ p, φ(xi) −c(xi) − i i =1,...,r 4 Formulation μ /λ πb In this section, we formulate the market clearing problem as Here =1 . At an optimal solution, we have = v x − p, φ x −b πb a quadratic program. To do this, we need to work with con- maxi ( i) ( i) i . Thus reflects the maximum utility that the buyer can derive from any bundle in X tinuous approximations of the value and cost functions. Let s r = |X|.LetY = φ(X),whereφ is the mapping under at prices p, to within a certain slack. The variable π has an consideration, and let yi = φ(xi) for each xi ∈ X. With a analogous interpretation in terms of profit. slight abuse of notation, the value function v can be written as Ideally, the program would find a discrete solution, mean- a function over Y rather than X (assuming φ is one-to-one, ing that the coefficients α and β would generate a yi ∈ Y , but this can be dispensed with in the developments that fol- thus identifying a specific bundle xi ∈ X. However, it is pos- low). Let Y¯ be the convex closure of Y , namely the set of all sible (in fact typical) that the optimal convex combinations convex combinations of elements in Y . We introduce a func- do not result in a discrete solution. The following proposition v¯ over Y¯ , parametrized by λ ≥ 0, with v¯(y) for y ∈ Y¯ tion gives a way to extract an (approximately) efficient bundle defined as from a solution to the primal. Its proof consists of a straight- r r forward appeal to complementary slackness. λ 2 α v y − α α y y . b s P max i ( i) i i = Proposition 1 Let (α, β) and (π ,π ,p,) be optimal pri- r α =1 2 i=1 i i=1 i=1 i r mal and dual solutions. Assume there is an index such that α∈R+ b b b αi > 0 and βi > 0. Then, letting δ =maxj j − i , We will refer to v¯ as the buyer’s extended value function. s s s b s δ =maxj j − i , and δ = δ + δ , we have that When λ =0, v¯ is the concave extension of v over Y¯ :the xi δ smallest concave function such that v¯(yi) ≥ v(yi) for each (a) Bundle is efficient to within an additive error of . yi ∈ Y .Whenλ is large, the second term in the objective (b) Bundle xi maximizes the buyer’s utility (seller’s profit) b s dominates. It is maximized when α is uniform: αi =1/k for at prices p to within an additive error of δ (δ ). each i. Thus it induces the function to consider more points in λ y This result also clarifies the purpose of the parameter . Sup- the neighborhood of when imputing a value. The extended pose that we set λ =0(equivalently, μ = ∞) and solve the cost function c¯ of c is defined similarly: above, min replaces α∗,β∗ c y v y quadratic program, but that at the solution ( ) there is max, ( i) replaces ( i), and the second term in the objec- no index i such that α∗ > 0 and β∗ > 0. This indicates tive is added rather than subtracted. When λ =0, c¯ is the i i c Y¯ that clearing prices have not been found, and necessarily oc- convex extension of over . curs if they do not exist for our choice of kernel. In this case Lemma 1 The extended value (cost) function is concave we can increase λ—equivalently, decrease μ—to allow for (convex) for all λ ≥ 0. approximate clearing prices in the dual. As λ →∞the opti- ∗ ∗ We stress that for any given λ,evenλ =0, it is not necessar- mal solution tends to αi = βi =1/k for all i, and thus we ily the case that v¯(yi)=v(yi) for all yi ∈ Y ,andthesame will reach a point where the conditions of Proposition 1 are for c¯ and c. The point of v¯ and c¯ is to approximate v and c by satisfied. Therefore, increasing λ increases the chances that concave and convex functions over Y¯ , because we know that the quadratic program will identify a discrete solution for any with such functions linear clearing prices exist in Y¯ . given kernel function, with the caveat that the discrete solu- We now formulate the clearing problem with respect to the tion might only be approximately efficient; correspondingly, extended value and cost functions. Explicitly, our quadratic the prices obtained in the dual will only approximately clear program only tries to identify an efficient bundle, but we will the market. see that the dual of the program gives the desired clearing prices. Note that (1) below corresponds to M different con- 5 Algorithms straints. r r As formulated, our quadratic program is problematic in two λ 2 λ 2 M max αiv(xi) − βic(xi) − α − β respects. First, it has too many constraints. Recall that , α,β≥0 i=1 i=1 2 2 the dimension of the range of φ, is potentially very large r r because clearing prices may only exist in high-dimensional s.t. αiφ(xi)= βiφ(xi) (1) space. The number of variables is also too large: there is a α β x ∈ X X i=1 i=1 variable i and a variable i for each i ,and is of m r r size 2 . We deal with the first concern by using a kernel αi =1 , βi =1 (2) method, and then propose an elicitation procedure to address i=1 i=1 the second concern.

210 5.1 Method of Multipliers leads to a kernel method when applied to our quadratic pro- k The usual approach in kernel methods is to formulate the gram. At iteration , the objective of our quadratic program given problem as a mathematical program that only uses becomes inner-product information (i.e., the kernel matrix), and then r r k k λ k2 λ k2 solve the optimization problem using general or special pur- max αi v(xi) − βi c(xi) − α − β αk,βk≥0 2 2 pose algorithms. Our quadratic program is not formulated in i=1 i=1 terms of inner products. Instead, it is the procedure we use −(¯αk − β¯k)K(αk − βk) to solve the program that only uses information in the kernel ν − (αk − βk)K(αk − βk) matrix. 2 The procedure is known as the method of multipliers.(For k k−1 ¯k k−1 an extensive treatment of this method, see [Bertsekas, 1996].) where α¯ = =1 ν αi and β = =1 ν βi . The con- Despite its name, it was not originally conceived as a kernel straints remain (2). One finds that the Hessian of the objective method, but simply as a way to solve equality-constrained is negative-semidefinite from the fact that K is positive semi- quadratic programs. That it only requires kernel matrix in- definite, so we now have a straightforward quadratic maxi- formation when applied to our quadratic program was a cru- mization problem with concave objective function, for which cial insight in this work. The procedure eliminates the prob- there is a wide selection of commercial and non-commercial lematic constraints (1) and replaces them with multiplier and solvers. penalty terms in the objective, which becomes At the final round k, we use the multiplier as the clearing r r prices given the method of multipliers’ convergence proper- λ 2 λ 2 max αiv(xi) − βc(xi) − α − β ties. Assuming clearing prices have been found (i.e., the con- α,β≥0 2 2 ditions of Proposition 1 hold), the price of a bundle x ∈ X is i=1 i=1 r r given by − p, αiφ(xi) − βiφ(xi) (3) r k k k i=1 i=1 p ,x = (¯α − β¯ )κ(xi,x). (5) i i r r 2 i=1 ν − αiφ(xi) − βiφ(xi) (4) If the vectors α¯k and β¯k are sparse, this gives us a sparse 2 i=1 i=1 representation of nonlinear clearing prices. Let us ignore the multiplier term (3) for the moment. Evaluat- ing the penalty term, one finds that it only involves the kernel 5.2 Elicitation Procedure matrix K.Asν →∞, the penalty term ensures that at an The kernel method of the previous section addresses the ques- optimal solution, the constraints (1) are satisfied. In practice, tion of computing nonlinear clearing prices given a kernel and ν one chooses a sufficiently large so that the constraints are full information of the value and cost functions, namely v(xi) satisfied to within an acceptable tolerance. There is a draw- and c(xi) for all xi ∈ X. If, instead of this full information, back, however: as ν grows large the problem becomes in- we only know the values and costs of bundles in some lim- creasingly ill-conditioned, making it inherently more difficult ited sample S, it is still possible apply the algorithm of the to solve with standard quadratic programming algorithms. S X 1 previous section with respect to rather than . To check Thus one typically starts with a small initial ν and increases whether the resulting bundle y is efficient, and whether the it over several iterations until the constraints are sufficiently resulting prices p are clearing, we can perform demand and satisfied, in an attempt to avoid ill-conditioning. A typical y k+1 k supply queries with these as inputs. If is returned in both update rule is ν = τν for some τ ∈ [4, 10], and this is cases, then it maximizes both the buyer’s utility and seller’s the rule we used in our experiments [Bertsekas, 1996]. profit at prices p, and we are done, following the arguments To speed up convergence and further alleviate ill- of Section 2. Otherwise, we can grow our sample S by in- conditioning, one can introduce a multiplier term as in (3). cluding the buyer and seller’s replies. h α, β r α φ x − r β φ x For simplicity, let ( )= i=1 i ( i) i=1 i ( i). The elicitation procedure based on these ideas is given The original method of multipliers (there are several vari- k+1 k formally as Algorithm 1. It begins with an empty sample. ants) uses at each iteration k the update rule p = p + Throughout, prices are represented by the multiplier (¯α, β¯), νkh(αk,βk),where(αk,βk) is the optimal solution at the it- k which is zero in the first round, leading to zero prices. The eration. Under this update rule, the iterates p converge to buyer’s demand query reply at each round is recorded in xb, the optimal dual solution [Bertsekas, 1996]—in this case the initialized to 1, the bundle containing all items; the seller’s clearing prices, which is why we used the suggestive notation reply at each round is xs, initialized to 0, the empty bundle. p for the multiplier. In theory and practice, using a multiplier The bundle y refers to the optimal bundle at each round, and speeds up convergence. This was borne out in our experi- δb and δs are the slacks at each round. ments. The multiplier is also essential for us because we seek The elicitation proceeds as long as no bundle common to to compute clearing prices, not just an efficient bundle. the demand and supply sets is found. At each round, the sam- p1 0 pk+1 b If we use = as the initial multiplier, then = ple is updated with the latest replies. We write value(x ) k b =1 ν h(α ,β ), and substituting this into (3) yields a term rather than v(x ) to indicate that an explicit value query is that uses only inner-products. Thus the method of multipliers performed, and similarly for cost queries; when we write

211 Input: κ, λ, query access to v and c. 6 Empirical Evaluation Output: whether clearing prices were found. S = ∅, p =(0, 0), y = 0; In this section we report on experiments run to evaluate xb = 1, xs = 0; the clearing, elicitation, and efficiency performance of our δb =0, δs =0; method. We used the CATS suite of distributions to gener- while xb = y or xs = y do ate value and cost functions for the buyer and seller [Leyton- S ← S ∪{xb,xs}; Brown et al., 2000]. CATS generates instances in the XOR record value(xb),cost(xb),value(xs),cost(xs); language [Nisan, 2000]. An XOR representation of a value (α, β, α,¯ β¯) ← multipliers-method(S, κ, λ); function consists of a subset of bundles Z ⊆ X together with if there is an index i where αi > 0 and βi > 0 then their values, in the form of atomic bids (z,v(z)) for all z ∈ Z. y ← xi; x ∈ X p ← (¯α, β¯); The value of a bundle according to an XOR represen- b tation is v(x)=max{z≤x|z∈Z} v(z). For cost functions, we δ =maxxj ∈S[v(xj ) − p(xj )] − [v(xi) − p(xi)]; s used the semantics c(x)=min{z≥x|z∈Z} c(z) to interpret δ =maxxj ∈S[p(xj ) − c(xj )] − [p(xi) − c(xi)]; else the XOR instance. CATS allows one to specify the number of return false; atomic bids desired in an XOR representation, as well as the end number of items; in all our experiments, we used m =30. xb = demand(y, p,κ, δb); CATS was originally designed to generate valuation func- s s x = supply(y, p,κ,δ ); tions for the purpose of testing winner-determination algo- end rithms for combinatorial auctions, not to generate pairs of return true; value and cost functions, so ours is an unconventional use Algorithm 1: Elicitation procedure to compute an efficient of the test suite. Nevertheless, this scheme generated non- bundle and sparse clearing prices. trivial test instances. In fact, the relative difficulty of clearing the market under different CATS distributions (in terms of runtime) agreed with the relative difficulty of the distributions for winner determination in combinatorial auctions, as v(xi) or c(xi), it is because the algorithm has already ob- reported by Leyton-Brown and Shoham [2006]. tained the value or cost of bundle xi. Algorithm 1 was coded in C. To solve the quadratic pro- At each round, the method of multipliers is run with re- grams within the method of multipliers, we used a non- spect to S to find a candidate efficient bundle and clearing commercial C implementation of the LOQO interior-point prices using the kernel κ. Specifically, the method returns the code [Vanderbei, 1999].1 We used default parameters for primal solution (α, β), from which we can deduce whether LOQO, except for a margin of 0.25 rather than 0.95 (this an efficient bundle was found, and the final multiplier (¯α, β¯), amounts to a more aggressive stepsize). We used τ =6for which represents the prices according to (5). If the criterion the updates in the method of multipliers. of Proposition 1 does not hold at this point, then the procedure halts with a failure flag. CATS provides five different distributions: arbitrary, paths, matching, regions, and scheduling. The matching distribution In the demand and supply queries, the kernel κ is passed p α, β¯ generated exceedingly easy instances where linear clearing along with the coefficients =(¯ ) because it is needed prices almost always existed, so we do not report on this dis- to evaluate the price of bundles—recall (5). To ensure that tribution. the procedure makes progress at each round, we must have xb ∈ S or xs ∈ S for at least one of the replies. This is Clearing. To evaluate the clearing ability of our method with δ = δb + δs different kernels, we ran it on each of the four CATS distribu- the essence of the following result. (Below, as b before.) tions, varying the total number of atomic bids. Letting Z and Zs be the bundles in the XOR representations of the value and b s Proposition 2 Algorithm 1 converges in a finite number of cost functions, the total number of bids refers to |Z | + |Z |. rounds. Upon convergence, y is a δ-optimal bundle, and it The average number of instances cleared are reported in Ta- maximizes the buyer’s utility (seller’s profit) at prices p to ble 1. As expected, the identity kernel cleared 100% of the within an additive error of δb (δs). instances for all distributions. The fact that the linear kernel regularly failed to clear the market indicates that the CATS If v = c, then observe that the only possible clearing prices distributions often generate interesting instances for which are p = v = c. In this case, any procedure that finds clearing linear clearing prices do not exist. For the paths and schedul- prices must elicit the complete information of one agent. In ing distributions, the all-subsets kernel generated a kernel ma- the worst-case, this requires an exponential number of queries trix with such large entries that the penalty term (4) swamped (in m), so there is no hope for our procedure or any other to the linear terms in the objective, leading LOQO to ignore always run in polynomial-time. (For formal communication the latter and reach suboptimal solutions. This revealed an complexity results to this effect, see [Nisan and Segal, 2006].) unanticipated limitation of the all-subsets kernel and of our Nevertheless, none of this discounts the possibility that our method. We believe this could be remedied to an extent with method could perform well in practice, especially given that proper scaling and more conservative LOQO settings. the modularity of the method allows one to choose a kernel function appropriate for the clearing task at hand. Therefore, 1The implementation, due to Alex Smola, was retrieved at we now turn to an empirical evaluation. www.kernel-machines.org/code/prloqo.tar.gz

212 arbitrary paths regions scheduling 100 100 κ bids clr. elc. clr. elc. clr. elc. clr. elc. 100 61 16 8 32 61 17 18 15 80 99 linear 300 17 6 6 14 33 6 16 8 60 98 500 5 4 6 10 38 4 20 9 100 100 25 59 45 95 24 61 31

% clearing 40 97 Gaussian % efficiency 300 89 18 66 26 78 13 43 19

500 78 15 79 22 73 8 50 15 20 96 linear, eff. linear, clr. 100 100 33 n/a n/a 100 39 n/a n/a gaussian, eff. gaussian, clr. all-subsets 0 95 300 100 43 n/a n/a 100 28 n/a n/a 0 0.5 1 1.5 2 λ 500 100 43 n/a n/a 100 21 n/a n/a Figure 1: Clearing and efﬁciency performance of the linear 100 100 43 100 32 100 45 100 46 and Gaussian kernels (σ =30) on the paths distribution with identity 250 bids, varying the approximation parameter λ. Each data 300 100 44 100 22 100 41 100 43 point is averaged over 200 instances. 500 100 44 100 20 100 38 100 43

Table 1: Average clearing and elicitation performance of our References method over 200 instances using λ =0and σ =30. Elicita- [Bertsekas, 1996] Dimitri P. Bertsekas. Constrained Optimization tion figures are averages over those instances that cleared. and Lagrange Multiplier Methods. Athena Scientific, 1996. [Chapelle and Harchaoui, 2005] Olivier Chapelle and Za¨ıd Har- b s Elicitation. The elicitation metric is defined as |S|/|Z ∪Z |, chaoui. A machine learning approach to conjoint analysis. In where S is the sample upon termination of Algorithm 1. The Advances in Neural Information Processing Systems, 17. MIT motivation for this definition is that at most |Zb ∪Zs| bundles Press, 2005. can be elicited when the value and cost functions are repre- [Domshlak and Joachims, 2006] Carmel Domshlak and Thorsten sented with XOR. Table1 shows that with all choices of ker- Joachims. Unstructuring user preferences: Efficient non- nels, the method is able to reach clearing prices with much parametric utility revelation. In Proc. of the 21st Conference less than full revelation on average. Importantly, elicitation on Uncertainty in Artificial Intelligence (UAI), pages 169–177, often decreases as bids are scaled up; this is particularly the 2006. case for the Gaussian kernel. That the identity kernel is able [Evgeniou et al., 2005] Theodoros Evgeniou, Constantinos Bous- to clear the market with less than 50% elicitation on average sios, and Giorgos Zacharia. Generalized robust conjoint estima- was an unexpected but welcome finding. The reason this oc- tion. Marketing Science, 24(3):415–429, 2005. curs is that the kernel explicitly prices the empty bundle; this [Lahaie and Parkes, 2004] Sébastien Lahaie and David C. Parkes. amounts to adding a constant term to the price of every other Applying learning algorithms to preference elicitation. In Proc. bundle, and this can lead to a much sparser way of represent- of the 5th ACM Conference on Electronic Commerce (EC), pages ing clearing prices. 180–188, 2004. Sparsity. The sparsity metric we use is the number of [Leyton-Brown and Shoham, 2006] Kevin Leyton-Brown and Yoav nonzero coefficients in the final prices (5) divided by the term Shoham. A test suite for combinatorial auctions. In Combinato- |Zb ∪Zs|, which is the maximum possible price size. By def- rial Auctions, chapter 18, pages 451–478. MIT Press, 2006. inition, sparsity is never more than elicitation, because prices [Leyton-Brown et al., 2000] Kevin Leyton-Brown, Mark Pearson, contain one coefficient for each element of S, and some coef- and Yoav Shoham. Towards a universal test suite for combinato- ficients may be zero. We do not report on any sparsity statis- rial auction algorithms. In Proc. of the second ACM Conference tics because we found that sparsity was always extremely on Electronic Commerce (EC), pages 66–76, 2000. close to elicitation. This indicates that the method elicits very [Nisan and Segal, 2006] Noam Nisan and Ilya Segal. The com- few superfluous bundles for the purpose of clearing. munication requirements of efficient allocations and supporting Approximation. As explained, the λ parameter can be ad- prices. Journal of Economic Theory, 129(1):192–224, 2006. justed to increase the chances that clearing prices are found, [Nisan, 2000] Noam Nisan. Bidding and allocation in combinato- at the expense of approximate rather than exact clearing. Fig- rial auctions. In Proc. second ACM Conference on Electronic ure 1 illustrates its effect on the paths distribution. We see that Commerce (EC), pages 1–12, 2000. with a modest efficiency sacrifice of around 3%, the Gaussian [Shawe-Taylor and Cristianini, 2004] John Shawe-Taylor and kernel can be made to clear almost all instances, where it was Nello Cristianini. Kernel Methods for Pattern Analysis. able to clear less than 70% when exact clearing was imposed. Cambridge University Press, 2004. The improvement is even more dramatic for the linear kernel: [Vanderbei, 1999] Robert Vanderbei. LOQO user’s manual–version a clearing improvement of almost 75% with less than a 5% 3.10. Optimization Methods and Software, 12:485–514, 1999. sacrifice in efficiency. The trade-off was comparable on the other distributions.

213