On Bisubmodular Maximization
Total Page:16
File Type:pdf, Size:1020Kb
On Bisubmodular Maximization Ajit P. Singh Andrew Guillory Jeff Bilmes University of Washington, Seattle University of Washington, Seattle University of Washington, Seattle [email protected] [email protected] [email protected] Abstract polynomial-time approximation algorithms exist when the matroid constraints consist only of a cardinality con- Bisubmodularity extends the concept of sub- straint [22], or under multiple matroid constraints [6]. modularity to set functions with two argu- Our interest lies in enriching the scope of selection ments. We show how bisubmodular maxi- problems which can be efficiently solved. We discuss mization leads to richer value-of-information a richer class of properties on the objective, known as problems, using examples in sensor placement bisubmodularity [24, 1] to describe biset optimizations: and feature selection. We present the first constant-factor approximation algorithm for max f(A, B) (2) A,B V a wide class of bisubmodular maximizations. ⊆ subject to (A, B) , i 1 . r . ∈ Ii ∀ ∈ { } 1 Introduction Bisubmodular function maximization allows for richer problem structures than submodular max: i.e., simul- Central to decision making is the problem of selecting taneously selecting and partitioning elements into two informative observations, subject to constraints on the groups, where the value of adding an element to a set cost of data acquisition. For example: can depend on interactions between the two sets. Sensor Placement: One is given a set of n locations, To illustrate the potential of bisubmodular maximiza- V , and a set of k n sensors, each of which can cover tion, we consider two distinct problems: a fixed area around it. Which subset X V should be ⊂ Coupled Sensor Placement (Section5): One is instrumented, to maximize the total area covered? given a set of n locations, V . There are two different Feature Selection: Given a regression model on n kinds of sensors, which differ in cost and area covered. features, V , and an objective which measures feature Given a total sensor budget, k, select sites to instru- quality, f : 2V R, select the best k features. ment with sensors of each type (A, B), to maximize → the total area covered. Both examples are selection problems, for which a near-optimal approximation can be found efficiently Coupled Feature Selection (Section6): Given a when the objective is submodular (see Definition1, regression model on n features, V , select a set of k below). Many selection problems involve a submodular features and partition them into disjoint sets A and objective: e.g., sensor placement [16, 15], document B, such that a joint measure of feature quality f : A B R is maximized. summarization [20], influence maximization in social × → networks [12], and feature selection [13]. All can be This paper is the first to (i) propose bisubmodular framed as submodular function maximization: maximization to describe a richer class of value-of- max f(S) (1) information problems; (ii) define sufficient conditions S V ⊆ under which bisubmodular max is efficiently solvable; subject to S i, i 1 . r . (iii) provide an algorithm for such instances (algorithms ∈ I ∀ ∈ { } for bisubmodular min exist [7]); (iv) prove, by con- The constraint sets are independent sets of matroids, Ii struction, the existence of an embedding of a directed which include knapsack or cardinality constraints. Fully bisubmodular function into a submodular one. Ancil- th lary to our study of bisubmodular maximization is a Appearing in Proceedings of the 15 International Con- new result on the extension of submodular functions ference on Artificial Intelligence and Statistics (AISTATS) 2012, La Palma, Canary Islands. Volume 22 of JMLR: defined over matroids (Theorem2). W&CP 22. Copyright 2012 by the authors. Our key observation is that many bisubmodular 1055 On Bisubmodular Maximization maximization problems can be reduced to matroid- 3 Simple Bisubmodularity constrained submodular maximization, for which ef- ficient, constant-factor approximation algorithms al- A natural analogue to submodularity for biset functions ready exist. We first introduce the concept of simple is simple bisubmodularity: bisubmodularity (Section3), which describes the biset Definition 4 (Simple Bisubmodularity). f : 22V analogue of submodularity. Directed bisubmodular- 2→V R is simple bisubmodular iff for each (A, B) 2 , ity [23] allows for more complex interaction between 2V ∈ (A0,B0) 2 with A A0, B B0 we have for the set arguments. The question of whether or not ∈ ⊆ ⊆ s / A0 and s / B0: directed bisubmodular functions can be efficiently max- ∈ ∈ imized was has been an open problem for over twenty f(A + s, B) f(A, B) f(A0 + s, B0) f(A0,B0), years. Our reduction approach proves that a wide range − ≥ − f(A, B + s) f(A, B) f(A0,B0 + s) f(A0,B0). of directed bisubmodular objectives can be efficiently − ≥ − maximized (Section4). 2V Equivalently, (A, B), (A0,B0) 2 , ∀ ∈ 2 Preliminaries f(A, B)+f(A0,B0) f(A A0,B B0)+f(A A0,B B0) ≥ ∪ ∪ ∩ ∩ We first introduce basic concepts and notation. Definition 1 (Submodularity). A set-valued function Fixing one of the coordinates of f(A, B) yields a sub- over ground set V , f : 2V R is submodular if for modular function in the free coordinate. → any A A0 V v, ⊆ ⊆ \ 3.1 Reduction to Submodular Maximization f(A + v) f(A) f(A0 + v) f(A0). (3) − ≥ − The reduction uses a duplication of the ground set Equivalently, A, A0 V , ¯ ∀ ⊆ (c.f., [10, 2]). Define V to be an extended ground set formed by taking the disjoint union of 2 copies of the f(A) + f(A0) f(A A0) + f(A A0). (4) ≥ ∪ ∩ original ground set V . For an element s V , we use ∈ ¯ + is used to denote adding an element to a set. si where i 1, 2 to refer to the ith copy in V . Then ¯ ∈ { } ¯ V = V1 V2 where Vi , i si. For an element s V A set function f(A, B) with two arguments A V and we use abs∪ (s) to refer to the corresponding original∈ ⊆ B V is a biset function. In some cases we assume element in V (i.e. abs(s S) s). For a set S V¯ we ⊆ i , the domain of f to be all ordered pairs of subsets of V : similarly use abs(S) abs(s): s S . ⊆ , { ∈ } 22V (A, B): A V, B V . , { ⊆ ⊆ } Given any simple bisubmodular function f, define a single-argument set function g for S V¯ : In other cases we assume the domain of f to be ordered ⊆ pairs of disjoint subsets of V : g(S) , f(abs(S V1), abs(S V2)). 3V (A, B): A V, B V, A B = . ∩ ∩ , { ⊆ ⊆ ∩ ∅} Note that there is a one-to-one mapping between A biset function is normalized if f( , ) = 0. In some f(A, B) for (A, B) 22V and g(S) for S V¯ and ∅ ∅ cases we will also assume f(A, B) is monotone. therefore maximizing∈g(S) is equivalent to maximizing⊆ Definition 2 (Monotonicity). A biset function f over f(A, B). Furthermore, 2V 2 is monotone (monotone non-decreasing) if for any Lemma 1. g(S) is submodular if f(A, B) is a simple s V , (A, B) 22V : ∈ ∈ bisubmodular function. f(A + s, B) f(A, B) and f(A, B + s) f(A, B). ≥ ≥ Note also that g is monotone when f is monotone. V A biset function f over 3 is monotone (monotone non- Based on this connection, maximizing a simple bisub- decreasing) if for any s V (A B), (A, B) 3V : ∈ \ ∪ ∈ modular function f reduces to maximizing a normal f(A + s, B) f(A, B) and f(A, B + s) f(A, B). submodular function g. We can then exploit constant- ≥ ≥ factor approximation algorithms for submodular func- Monotone submodular maximization is nontrivial only tion maximization. when constrained. We focus on matroid constraints: Corollary 1. If f(A, B) is non-negative and simple Definition 3 (Matroid). A matroid is a set of sets bisubmodular, then there exists a constant-factor ap- I with three properties: (i) , (ii) if A and proximation algorithm for solving Equation2 when the ∅ ∈ I ∈ I B A then B (i.e., is an independence system), constraints are either: (i) non-existent; (ii) A k , ⊆ ∈ I I 1 and (iii) if A and B and B > A then there B k ; (iii) A + B k; (iv) A B = | (v)| ≤ any ∈ I ∈ I | | | | 2 is an s B such that s / A and (A + s) . combination| | ≤ of| the| above.| | ≤ ∩ ∅ ∈ ∈ ∈ I 1056 Singh, Guillory, Bilmes Proof. With no constraints, maximizing f(A, B) corre- The previous section answers this question for simple sponds to maximizing non-negative, submodular g(S), bisubmodular functions. solvable using Feige et al. [5]. The only question is In many situations, including those described in Sec- how to represent the other constraints on f as ma- tions5 and6, simple bisubmodularity is sufficient. The troid constraints on g. Cases (ii) and (iii) reduce contributions of this section are primarily theoretical: to uniform matroids. In case (ii), S V k , 1 1 we (i) connect simple to directed bisubmodularity, (ii) S V k ; in case (iii) S V +| S∩ V| ≤ k. 2 2 1 2 give a method for embedding a directed bisubmodular For| ∩ case| (iv) ≤ the constraint is| a∩ partition| | matroid∩ | ≤ con- function into a submodular one; (iii) provide sufficient straint v V, S v , v 1. For any intersection 1 2 conditions under which directed bisubmodular maxi- of a constant∀ ∈ number| ∩ { of matroid}| ≤ constraints we can use mization can be reduced to submodular maximization. the algorithms of Lee et al. [18]. For the special case of monotone f (and therefore g) we can use the simple Definition 5 (Directed Bisubmodularity [24]).