Approximation Algorithms for Connected Maximum Coverage Problem for the Discovery of Mutated Driver Pathways in Cancer


Dorit S. Hochbaum, Xu Rao
Department of IEOR, Etcheverry Hall, University of California, Berkeley, CA 94709, USA
Information Processing Letters 158 (2020) 105940, https://doi.org/10.1016/j.ipl.2020.105940

Article history: Received 28 October 2019; received in revised form 14 February 2020; accepted 24 February 2020; available online 26 February 2020. Communicated by R. Lazic.
Keywords: Approximation algorithms; Connected maximum coverage; Bounded radius subgraph
Research supported in part by NSF award No. CMMI-1760102. E-mail addresses: [email protected] (D.S. Hochbaum), [email protected] (X. Rao, corresponding author). © 2020 Elsevier B.V. All rights reserved.

Abstract. This paper addresses the connected maximum coverage problem, motivated by the detection of mutated driver pathways in cancer. The connected maximum coverage problem is NP-hard and therefore approximation algorithms are of interest. We provide here an approximation algorithm for the problem with an approximation bound that strictly improves on previous results. A second approximation algorithm with faster run time, though worse approximation factor, is presented as well. The two algorithms are then applied to submodular maximization over a connected subgraph, with a monotone submodular set function, delivering the same approximation bounds as for the coverage maximization case.

1. Introduction

We address here the connected maximum coverage problem. Given a collection of subsets, each associated with a node in a graph, and an integer k, the goal is to find up to k subsets the union of which is of largest cardinality, such that the respective nodes in the graph form a connected subgraph. This problem generalizes the maximum coverage problem, which is to find k subsets that jointly cover the most elements.

The motivation for the connected maximum coverage problem is the detection of mutations associated with cancer. The search for driver mutations in cancer has accelerated with recent advances in sequencing technologies. These technologies enable measurements in large cohorts of cancer patients and identify their individual lists of gene mutations. Each mutation in the list is thus associated with a subset of patients whose profiles include this mutation. The goal is to "explain" the cancer with up to k mutations, where explanation means that these mutations jointly cover the largest possible collection of patients. These selected mutations are considered to be the driver mutations that are most associated with the specific type of cancer. The connectivity requirement is motivated by the belief that cancer is a disease of pathways [1]. Pathways are connected subnetworks in a large gene interaction network. Therefore, the goal is to identify mutations that not only deliver large coverage of the set of patients, but also form a connected subnetwork in the interaction network.

In the graph representation of the driver mutation detection problem, each node of the graph corresponds to a protein and its associated gene (mutation), and is thus also associated with a subset of patients whose profiles include the mutation. The edges of the graph represent the pairwise protein-to-protein interactions. The detection of k driver mutations is then the maximum coverage problem on this graph.
Formally, the input to an instance of the connected maximum coverage problem is a graph G = (V, E), where each node v ∈ V is associated with a set S_v ⊂ P (possibly empty), for P a universal set, and an integer k. The goal is to find a subset V' ⊆ V of cardinality less than or equal to k that induces a connected subgraph in G and such that |⋃_{v∈V'} S_v| is maximized.

The connected maximum coverage problem is easily seen to be NP-hard, as it contains the maximum coverage problem [2] as a special case (on a complete graph the connectivity requirement is satisfied by any subset of nodes). The maximum coverage problem was studied by Hochbaum and Pathria [3], who showed that the problem is NP-hard and devised a simple greedy algorithm with an approximation factor of 1 − 1/e = 0.632121.... This approximation factor is the best possible attained in polynomial time for the maximum coverage problem under the assumption that P ≠ NP [4].
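The greedy algorithm of [3] referenced above is simple to state. Below is a minimal Python sketch of that classic greedy for the (unconnected) maximum coverage problem; it ignores the connectivity requirement studied in this paper, and the function name and toy instance are illustrative only.

```python
def greedy_max_coverage(sets, k):
    """Classic greedy for (unconnected) maximum coverage: repeatedly pick
    the set that covers the most yet-uncovered elements."""
    covered, chosen = set(), []
    for _ in range(min(k, len(sets))):
        gains = {v: len(set(elems) - covered) for v, elems in sets.items()}
        best = max(gains, key=gains.get)
        if gains[best] == 0:              # nothing left to gain
            break
        chosen.append(best)
        covered |= set(sets[best])
    return chosen, covered

# Toy instance: 4 candidate sets, budget k = 2.
S = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6}, "d": {1, 6}}
print(greedy_max_coverage(S, 2))          # (['a', 'c'], {1, 2, 3, 4, 5, 6})
```

Selecting the largest marginal gain at every step is what yields the 1 − 1/e factor; the difficulty addressed in this paper is making such a selection while keeping the chosen nodes connected in G.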
Problems closely related to the connected maximum coverage problem include the problem of finding a k-node connected subgraph that maximizes the sum of the weights of the nodes. This problem was studied by Hochbaum and Pathria in [5], who showed that it is in general NP-hard, but solvable in polynomial time for a tree graph. Seufert et al. (2010) [6] addressed this NP-hard connected maximum sum of weights problem (on general graphs) and provided a 1/(5(1 + ε))-approximation algorithm. Vandin et al. [7] studied the connected maximum coverage problem and provided an approximation algorithm. Bomersbach et al. [8] formulated the connected maximum coverage problem as an integer programming problem and used a branch and cut algorithm to get exact solutions. Both of these papers that address the connected maximum coverage problem are motivated by the search for driver mutations in cancer.

Our main contribution here is the improvement of the approximation factor for the connected maximum coverage problem, with an algorithm we call the Bounded Radius Algorithm. The approximation factor achieved by Vandin et al. [7] is 1/(c·r_OPT), where c = (2e − 1)/(e − 1) = 2.581976... and r_OPT is the radius of the optimal solution subnetwork. The approximation factor of our Bounded Radius Algorithm is max{(1 − 1/e)(1/r_OPT − 1/k), 1/k}, which is demonstrated to always improve on the factor of 1/(c·r_OPT). We also provide an alternative approximation algorithm, the Greedy Path Algorithm, with approximation factor max{(1 − 1/e)(1/R − 1/k), 1/k}, where R is the radius of the input graph G. The Greedy Path Algorithm is always worse than the Bounded Radius Algorithm in terms of approximation bound, but always better in terms of running time.

Another contribution here is the generalization of the connected maximum coverage problem to a connected submodular maximization problem. It is shown here that our two approximation algorithms apply also to connected submodular maximization, for monotone submodular set functions, delivering the same approximation bounds as for the connected maximum coverage problem.

We use m = |E| for the number of edges and n = |V| for the number of nodes in the graph. We let the distance between two nodes v1 ∈ V and v2 ∈ V be the length of the shortest path between the two nodes, where all edges are of unit length. We denote this distance by d(v1, v2). It is easy to find the distance between node v1 and all other nodes in V in O(m) time using breadth first search (BFS). The level of each node in the BFS tree rooted at v1 is its distance from v1. Additional details about BFS are provided in the next subsection.

Definition 1. The eccentricity of a node v ∈ V is the longest distance between v and any other node in V.

The eccentricity of node v in G can be found using BFS from v: it is equal to the depth of the BFS tree (the deepest level in the tree).

Definition 2. The radius of a graph is the minimum eccentricity among all nodes in the graph. The node v* attaining this minimum is called the center of the graph, and its eccentricity is then the radius of the graph.

Let the radius of graph G be denoted by R(G). Note that the center of the graph is not necessarily unique. It is easy to see that R(G) ≤ n/2, and that this inequality is tight when the graph is an n-node path and the number of nodes n is even. We therefore conclude that,

Lemma 1. The radius of the optimal connected subgraph on k nodes, r_OPT, satisfies r_OPT ≤ k/2.

In order to find the radius and center of a graph we use breadth-first search (BFS). The procedure creates the BFS tree rooted at the node called the "root". The longest distance from the root to another node is attained for a leaf node. Hence, to find the eccentricity of node v it suffices to evaluate the distance from the root v to each one of the leaves (the level of each leaf), and to take the maximum distance. To find the radius of the graph it is sufficient to initiate BFS from each node of the graph, compute the respective eccentricity, and take the minimum. The complexity of BFS is known to be O(m), and therefore finding the radius and the center of a graph takes at most O(nm) steps for a graph on n nodes and m edges.

The parent of a node v in a rooted tree T is denoted by p(v).

The algorithms presented in this paper make calls to a subroutine that merges two sets and measures the size of the union. There are various algorithms of different complexities for finding set unions and the sizes of the unions, which depend on different assumptions on how the input is given. Here, we assume that for any v ∈ V, the subset S_v is represented by an array of ordered indexes of length |S_v|, …
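To make the BFS-based radius computation in the excerpt above concrete, here is a minimal Python sketch under the unit-length-edge assumption; the adjacency-list representation and function names are my own choices, not the paper's.

```python
from collections import deque

def eccentricity(adj, root):
    """One BFS from `root` (O(m) on a connected graph); the eccentricity
    is the depth of the BFS tree. `adj` maps each node to its neighbors."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return max(dist.values())

def radius_and_center(adj):
    """Run BFS from every node and take the minimum eccentricity:
    O(nm) total for a graph on n nodes and m edges."""
    ecc = {v: eccentricity(adj, v) for v in adj}
    center = min(ecc, key=ecc.get)
    return ecc[center], center

# A 4-node path 0-1-2-3: radius 2 = n/2, attained at node 1 (or node 2).
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(radius_and_center(path))   # (2, 1)
```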
Recommended publications
  • Fast Semidifferential-Based Submodular Function Optimization
    Fast Semidifferential-based Submodular Function Optimization: Extended Version. Rishabh Iyer ([email protected], University of Washington, Seattle, WA 98195, USA), Stefanie Jegelka ([email protected], University of California, Berkeley, CA 94720, USA), Jeff Bilmes ([email protected], University of Washington, Seattle, WA 98195, USA).
    Abstract: We present a practical and powerful new framework for both unconstrained and constrained submodular function optimization based on discrete semidifferentials (sub- and super-differentials). The resulting algorithms, which repeatedly compute and then efficiently optimize submodular semigradients, offer new and generalize many old methods for submodular optimization. Our approach, moreover, takes steps towards providing a unifying paradigm applicable to both submodular minimization and maximization, problems that historically have been treated quite distinctly. The practicality of our algorithms is important since interest in submodularity, owing to …
    … family of feasible solution sets. The set C could express, for example, that solutions must be an independent set in a matroid, a limited budget knapsack, or a cut (or spanning tree, path, or matching) in a graph. Without making any further assumptions about f, the above problems are trivially worst-case exponential time and moreover inapproximable. If we assume that f is submodular, however, then in many cases the above problems can be approximated and in some cases solved exactly in polynomial time. A function f : 2^V → R is said to be submodular (Fujishige, 2005) if for all subsets S, T ⊆ V, it holds that f(S) + f(T) ≥ f(S ∪ T) + f(S ∩ T). Defining f(j|S) := f(S ∪ j) − f(S) as the gain of j ∈ V with respect to S ⊆ V, then f is submodular if and only if f(j|S) ≥ f(j|T) for all S ⊆ T and j ∉ T.
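    As a quick illustration of the diminishing-returns characterization quoted above, the sketch below (toy sets and names of my own choosing, not from this paper) verifies f(j|S) ≥ f(j|T) for a small coverage function, a standard monotone submodular example.

```python
from itertools import combinations

# Coverage function f(S) = size of the union of the chosen sets --
# a standard example of a monotone submodular function (toy data).
SETS = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d", "e"}}

def f(S):
    return len(set().union(*(SETS[j] for j in S))) if S else 0

def gain(j, S):                      # f(j | S) in the notation above
    return f(S | {j}) - f(S)

# Diminishing returns: f(j|S) >= f(j|T) whenever S is a subset of T and j is not in T.
ok = all(
    gain(j, S) >= gain(j, T)
    for r in range(len(SETS) + 1)
    for T in (set(c) for c in combinations(SETS, r))
    for s in range(len(T) + 1)
    for S in (set(c) for c in combinations(T, s))
    for j in SETS
    if j not in T
)
print(ok)   # True
```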
  • On Interval and Circular-Arc Covering Problems
    On Interval and Circular-Arc Covering Problems. Reuven Cohen, Mira Gonen.
    Abstract: In this paper we study several related problems of finding optimal interval and circular-arc covering. We present solutions to the maximum k-interval (k-circular-arc) coverage problems, in which we want to cover maximum weight by selecting k intervals (circular-arcs) out of a given set of intervals (circular-arcs), respectively; the weighted interval covering problem, in which we want to cover maximum weight by placing k intervals with a given length; and the k-centers problem. The general sets versions of the discussed problems, namely the general measure k-centers problem and the maximum covering problem for sets, are known to be NP-hard. However, for the one-dimensional restrictions studied here, and even for circular-arc graphs, we present efficient, polynomial time algorithms that solve these problems. Our results for the maximum k-interval and k-circular-arc covering problems hold for any right continuous positive measure on R.
    1 Introduction. This paper presents several efficient polynomial time algorithms for optimal interval and circular-arc covering. Given a set system consisting of a ground set of points with integer weights, subsets of points, and an integer k, the maximum coverage problem selects up to k subsets such that the sum of weights of the covered points is maximized. We consider the case that the ground set is a set of points on a circle and the subsets are circular-arcs, and also the case in which the ground set is a set of points on the real line and the subsets are intervals.
  • Approximate Submodularity and Its Applications: Subset Selection, Sparse Approximation and Dictionary Selection∗
    Journal of Machine Learning Research 19 (2018) 1–34. Submitted 10/16; Revised 8/18; Published 8/18. Approximate Submodularity and its Applications: Subset Selection, Sparse Approximation and Dictionary Selection. Abhimanyu Das ([email protected], Google), David Kempe ([email protected], Department of Computer Science, University of Southern California). Editor: Jeff Bilmes.
    Abstract: We introduce the submodularity ratio as a measure of how "close" to submodular a set function f is. We show that when f has submodularity ratio γ, the greedy algorithm for maximizing f provides a (1 − e^{−γ})-approximation. Furthermore, when γ is bounded away from 0, the greedy algorithm for minimum submodular cover also provides essentially an O(log n) approximation for a universe of n elements. As a main application of this framework, we study the problem of selecting a subset of k random variables from a large set, in order to obtain the best linear prediction of another variable of interest. We analyze the performance of widely used greedy heuristics; in particular, by showing that the submodularity ratio is lower-bounded by the smallest 2k-sparse eigenvalue of the covariance matrix, we obtain the strongest known approximation guarantees for the Forward Regression and Orthogonal Matching Pursuit algorithms. As a second application, we analyze greedy algorithms for the dictionary selection problem, and significantly improve the previously known guarantees. Our theoretical analysis is complemented by experiments on real-world and synthetic data sets; in particular, we focus on an analysis of how tight various spectral parameters and the submodularity ratio are in terms of predicting the performance of the greedy algorithms.
  • Fast and Private Submodular and k-Submodular Functions Maximization with Matroid Constraints
    Fast and Private Submodular and k-Submodular Functions Maximization with Matroid Constraints. Akbar Rafiey, Yuichi Yoshida.
    Abstract: The problem of maximizing nonnegative monotone submodular functions under a certain constraint has been intensively studied in the last decade, and a wide range of efficient approximation algorithms have been developed for this problem. Many machine learning problems, including data summarization and influence maximization, can be naturally modeled as the problem of maximizing monotone submodular functions. However, when such applications involve sensitive data about individuals, their privacy concerns should be addressed. In this paper, we study the problem of maximizing monotone submodular functions subject to matroid constraints in the framework of differential privacy. We provide a (1 − 1/e)-approximation algorithm which improves upon the previous results in terms of approximation guarantee. This is done with an almost cubic number of function evaluations in our algorithm. Moreover, we study k-submodularity, a natural generalization of submodularity. We give the first 1/2-approximation algorithm that preserves differential privacy for maximizing monotone k-submodular functions subject to matroid constraints. The approximation ratio is asymptotically tight and is obtained with an almost linear number of function evaluations.
    1. Introduction. A set function F : 2^E → R is submodular if for any S ⊆ T ⊆ E and e ∈ E \ T it holds that F(S ∪ {e}) − F(S) ≥ F(T ∪ {e}) − F(T). The theory of submodular maximization provides …
  • Maximizing Non-Monotone Submodular Functions
    Maximizing non-monotone submodular functions. Uriel Feige, Vahab S. Mirrokni, Jan Vondrák. April 19, 2007.
    Abstract: Submodular maximization is a central problem in combinatorial optimization, generalizing many important problems including Max Cut in directed/undirected graphs and in hypergraphs, certain constraint satisfaction problems and maximum facility location problems. Unlike the problem of minimizing submodular functions, the problem of maximizing submodular functions is NP-hard. In this paper, we design the first constant-factor approximation algorithms for maximizing nonnegative submodular functions. In particular, we give a deterministic local search 1/3-approximation and a randomized 2/5-approximation algorithm for maximizing nonnegative submodular functions. We also show that a uniformly random set gives a 1/4-approximation. For symmetric submodular functions, we show that a random set gives a 1/2-approximation, which can also be achieved by deterministic local search. These algorithms work in the value oracle model where the submodular function is accessible through a black box returning f(S) for a given set S. We show that in this model, our 1/2-approximation for symmetric submodular functions is the best one can achieve with a subexponential number of queries. For the case that the submodular function is given explicitly (specifically, as a sum of polynomially many nonnegative submodular functions, each on a constant number of elements) we prove that for any fixed ε > 0, it is NP-hard to achieve a (5/6 + ε)-approximation for symmetric nonnegative submodular functions, or a (3/4 + ε)-approximation for general nonnegative submodular functions.
  • Submodular Function Maximization Andreas Krause (ETH Zurich) Daniel Golovin (Google)
    Submodular Function Maximization. Andreas Krause (ETH Zurich), Daniel Golovin (Google).
    Submodularity is a property of set functions with deep theoretical consequences and far-reaching applications. At first glance it appears very similar to concavity; in other ways it resembles convexity. It appears in a wide variety of applications: in Computer Science it has recently been identified and utilized in domains such as viral marketing (Kempe et al., 2003), information gathering (Krause and Guestrin, 2007), image segmentation (Boykov and Jolly, 2001; Kohli et al., 2009; Jegelka and Bilmes, 2011a), document summarization (Lin and Bilmes, 2011), and speeding up satisfiability solvers (Streeter and Golovin, 2008). In this survey we will introduce submodularity and some of its generalizations, illustrate how it arises in various applications, and discuss algorithms for optimizing submodular functions. Our emphasis here is on maximization; there are many important results and applications related to minimizing submodular functions that we do not cover. As a concrete running example, we will consider the problem of deploying sensors in a drinking water distribution network (see Figure 1) in order to detect contamination. In this domain, we may have a model of how contaminants, accidentally or maliciously introduced into the network, spread over time. Such a model then allows us to quantify the benefit f(A) of deploying sensors at a particular set A of locations (junctions or pipes in the network) in terms of the detection performance (such as average time to detection). Based on this notion of utility, we then wish to find an optimal subset A ⊆ V of locations maximizing the utility, max_A f(A), subject to some constraints (such as bounded cost).
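    The sensor-placement objective described above is exactly the setting where the (lazy) greedy algorithm for monotone submodular maximization is typically applied. The following is a generic lazy-greedy sketch of my own, not code from the survey; the objective and data in the usage example are toy placeholders.

```python
import heapq

def lazy_greedy(f, ground, k):
    """Lazy (accelerated) greedy for maximizing a monotone submodular f
    under a cardinality constraint |S| <= k: cached marginal gains are
    only re-evaluated when an element reaches the top of the heap."""
    S, fS = [], 0.0
    heap = [(-f({e}), e, 0) for e in ground]      # (negated cached gain, element, round)
    heapq.heapify(heap)
    for t in range(1, min(k, len(ground)) + 1):
        while True:
            neg_gain, e, stamp = heapq.heappop(heap)
            if stamp == t:                        # cached gain is current for this round
                S.append(e)
                fS += -neg_gain
                break
            fresh = f(set(S) | {e}) - fS          # re-evaluate the marginal gain
            heapq.heappush(heap, (-fresh, e, t))
    return S, fS

# Toy coverage-style utility standing in for the detection benefit f(A).
COVER = {"s1": {1, 2}, "s2": {2, 3, 4}, "s3": {4, 5}}
f = lambda A: float(len(set().union(*(COVER[a] for a in A)))) if A else 0.0
print(lazy_greedy(f, list(COVER), 2))   # (['s2', 's1'], 4.0)
```

    By submodularity, a stale cached gain can only overestimate the current gain, so whenever the popped element's cached gain is up to date it is guaranteed to be the best choice; this is what makes the lazy evaluation safe.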
  • Maximization of Non-Monotone Submodular Functions
    Maximization of Non-Monotone Submodular Functions. Jennifer Gillenwater ([email protected]). University of Pennsylvania, Technical Reports (CIS), MS-CIS-14-01, January 2014. https://repository.upenn.edu/cis_reports/988
    Abstract: A litany of questions from a wide variety of scientific disciplines can be cast as non-monotone submodular maximization problems. Since this class of problems includes max-cut, it is NP-hard. Thus, general purpose algorithms for the class tend to be approximation algorithms. For unconstrained problem instances, one recent innovation in this vein includes an algorithm of Buchbinder et al. (2012) that guarantees a ½-approximation to the maximum. Building on this, for problems subject to cardinality constraints, Buchbinder et al. (2014) offer guarantees in the range [0.356, ½ + o(1)]. Earlier work has the best approximation factors for more complex constraints and settings. For constraints that can be characterized as a solvable polytope, Chekuri et al. (2011) provide guarantees. For the online secretary setting, Gupta et al. (2010) provide guarantees.
  • 1 Introduction to Submodular Set Functions and Polymatroids
    CS 598CSC: Combinatorial Optimization. Lecture date: 4/8/2010. Instructor: Chandra Chekuri. Scribe: Bolin Ding.
    1 Introduction to Submodular Set Functions and Polymatroids. Submodularity plays an important role in combinatorial optimization. Given a finite ground set S, a set function f : 2^S → R is submodular if f(A) + f(B) ≥ f(A ∩ B) + f(A ∪ B) for all A, B ⊆ S; or equivalently, f(A + e) − f(A) ≥ f(B + e) − f(B) for all A ⊆ B and e ∈ S \ B. Another equivalent definition is that f(A + e1) + f(A + e2) ≥ f(A) + f(A + e1 + e2) for all A ⊆ S and distinct e1, e2 ∈ S \ A. Exercise: Prove the equivalence of the above three definitions. A set function f : 2^S → R is non-negative if f(A) ≥ 0 for all A ⊆ S. f is symmetric if f(A) = f(S \ A) for all A ⊆ S. f is monotone (non-decreasing) if f(A) ≤ f(B) for all A ⊆ B. f is integer-valued if f(A) ∈ Z for all A ⊆ S.
    1.1 Examples of submodular functions. Cut functions. Given an undirected graph G = (V, E) and a 'capacity' function c : E → R+ on edges, the cut function f : 2^V → R+ is defined as f(U) = c(δ(U)), i.e., the sum of capacities of edges between U and V \ U. f is submodular (also non-negative and symmetric, but not monotone). In an undirected hypergraph G = (V, E) with capacity function c : E → R+, the cut function is defined as f(U) = c(δ_E(U)), where δ_E(U) = {e ∈ E : e ∩ U ≠ ∅ and e ∩ (V \ U) ≠ ∅}.
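    A quick numerical check of the cut-function example above (toy graph and capacities of my own choosing, not from the notes): the cut function is submodular and symmetric but not monotone.

```python
from itertools import combinations

# Cut function of a small capacitated undirected graph:
# f(U) = total capacity of edges with exactly one endpoint in U.
EDGES = {(1, 2): 3.0, (2, 3): 1.0, (3, 4): 2.0, (1, 4): 4.0, (2, 4): 5.0}
V = {1, 2, 3, 4}

def f(U):
    return sum(c for (u, w), c in EDGES.items() if (u in U) != (w in U))

subsets = [set(c) for r in range(len(V) + 1) for c in combinations(V, r)]

# Submodularity: f(A) + f(B) >= f(A | B) + f(A & B) for all A, B.
print(all(f(A) + f(B) >= f(A | B) + f(A & B) for A in subsets for B in subsets))  # True
# Symmetry: f(U) = f(V \ U); the empty set and V both have zero cut.
print(f({1, 3}) == f({2, 4}), f(set()) == f(V) == 0.0)                            # True True
# Not monotone: adding nodes to U can decrease the cut value.
print(f({1}) > f(V))                                                              # True
```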
  • Partial Sublinear Time Approximation and Inapproximation for Maximum Coverage
    Partial Sublinear Time Approximation and Inapproximation for Maximum Coverage. Bin Fu, Department of Computer Science, University of Texas – Rio Grande Valley, Edinburg, TX 78539, USA ([email protected]).
    Abstract: We develop a randomized approximation algorithm for the classical maximum coverage problem, which, given a list of sets A_1, A_2, ..., A_m and an integer parameter k, selects k sets A_{i_1}, A_{i_2}, ..., A_{i_k} for maximum union A_{i_1} ∪ A_{i_2} ∪ ... ∪ A_{i_k}. In our algorithm, each input set A_i is a black box that can provide its size |A_i|, generate a random element of A_i, and answer the membership query (x ∈ A_i?) in O(1) time. Our algorithm gives a (1 − 1/e)-approximation for the maximum coverage problem in O(poly(k)m · log m) time, which is independent of the sizes of the input sets. No existing O(p(m)n^{1−ε})-time (1 − 1/e)-approximation algorithm for the maximum coverage has been found for any function p(m) that only depends on the number of sets, where n = max(|A_1|, ..., |A_m|) (the largest size of the input sets). The notion of a partial sublinear time algorithm is introduced. For a computational problem with input size controlled by two parameters n and m, a partial sublinear time algorithm for it runs in O(p(m)n^{1−ε}) time or O(q(n)m^{1−ε}) time. The maximum coverage problem has a partial sublinear time O(poly(m)) constant factor approximation since k ≤ m. On the other hand, we show that the maximum coverage problem has no partial sublinear O(q(n)m^{1−ε})-time constant factor approximation algorithm.
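    The black-box access model described in this abstract can be captured by a thin wrapper; the sketch below (an illustrative class of my own, backed by an in-memory set) exposes exactly the three O(1) operations listed: size, random element, and membership.

```python
import random

class BlackBoxSet:
    """Oracle access to an input set A_i: size, a uniformly random
    element, and membership queries, each in O(1) expected time."""
    def __init__(self, elements):
        self._items = list(elements)       # supports O(1) uniform sampling
        self._lookup = set(self._items)    # supports O(1) membership tests

    def size(self):
        return len(self._items)

    def random_element(self):
        return random.choice(self._items)

    def contains(self, x):
        return x in self._lookup

A1 = BlackBoxSet({"p1", "p2", "p3"})
print(A1.size(), A1.contains("p2"), A1.contains("q7"))   # 3 True False
```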
  • Concave Aspects of Submodular Functions
    Concave Aspects of Submodular Functions. Rishabh Iyer (University of Texas at Dallas, CSE Department, 2601 N. Floyd Rd. MS EC31, Richardson, TX 75083, email: [email protected]) and Jeff Bilmes (University of Washington, ECE Department, 185 E Stevens Way NE, Seattle, WA 98195, email: [email protected]).
    Abstract: Submodular functions are a special class of set functions, which generalize several information-theoretic quantities such as entropy and mutual information [1]. Submodular functions have subgradients and subdifferentials [2] and admit polynomial time algorithms for minimization, both of which are fundamental characteristics of convex functions. Submodular functions also show signs similar to concavity. Submodular function maximization, though NP hard, admits constant factor approximation guarantees, and concave functions composed with modular functions are submodular. In this paper, we try to provide a more complete picture of the relationship of submodularity with concavity. We characterize the superdifferentials and polyhedra associated with upper bounds and provide optimality conditions for submodular maximization using the …
    … This relationship is evident by the fact that submodular function minimization is polynomial time (akin to convex minimization). A number of recent results, however, make this relationship much more formal. For example, similar to convex functions, submodular functions have tight modular lower bounds and admit a sub-differential characterization [2]. Moreover, it is possible [11] to provide optimality conditions for submodular minimization, in a manner analogous to the Karush-Kuhn-Tucker (KKT) conditions from convex programming. In addition, submodular functions also admit a natural convex extension, known as the Lovász extension, which is easy to evaluate [12] and optimize.
  • (05) Polymatroids, Part 1
    An integer polymatroid P with ground set E is a bounded set of non-negative integer-valued vectors coordinatized by E such that P includes the origin; any non-negative integer vector is a member of P if it is ≤ some member of P; and for any non-negative integer vector y, every maximal vector x in P which is ≤ y (a "P-basis of y") has the same sum |x| ≡ x(E) ≡ ∑(x_j : j ∈ E), called the rank r(y) of vector y in P. Clearly, by definition, the 0–1 valued vectors in an integral polymatroid are the "incidence vectors" of the independent sets J (i.e., J ∈ F) of a matroid M = (E, F). For any vectors x and y in Z_+, x ∩ y denotes the maximal integer vector which is ≤ x and ≤ y; x ∪ y denotes the minimal integer vector which is ≥ x and ≥ y. Clearly 0 ≤ r(0) ≤ r(x) ≤ r(y) for x ≤ y ("non-neg & non-decreasing").
    Theorem 2: For any x and y in Z_+, r(x ∪ y) + r(x ∩ y) ≤ r(x) + r(y) ("submodular").
    Proof: Let v be a P-basis of x ∩ y. Extend it to a P-basis w of x ∪ y. By the definition of integer polymatroid, r(x ∩ y) = |v| and r(x ∪ y) = |w|. Then r(x) + r(y) ≥ |w ∩ x| + |w ∩ y| = |w ∩ (x ∩ y)| + |w ∩ (x ∪ y)| = |v| + |w| = r(x ∪ y) + r(x ∩ y). □
    For subsets S of E, define f_P(S) to be r(x), where x ∈ Z_+ is very large for coordinates in S and 0 for coordinates not in S.
  • On Bisubmodular Maximization
    On Bisubmodular Maximization. Ajit P. Singh, Andrew Guillory, Jeff Bilmes (University of Washington, Seattle; [email protected], [email protected], [email protected]).
    Abstract: Bisubmodularity extends the concept of submodularity to set functions with two arguments. We show how bisubmodular maximization leads to richer value-of-information problems, using examples in sensor placement and feature selection. We present the first constant-factor approximation algorithm for a wide class of bisubmodular maximizations.
    1 Introduction. Central to decision making is the problem of selecting informative observations, subject to constraints on the cost of data acquisition. For example: Sensor Placement: One is given a set of n locations, V, and a set of k ≤ n sensors, each of which can cover a fixed area around it. …
    … polynomial-time approximation algorithms exist when the matroid constraints consist only of a cardinality constraint [22], or under multiple matroid constraints [6]. Our interest lies in enriching the scope of selection problems which can be efficiently solved. We discuss a richer class of properties on the objective, known as bisubmodularity [24, 1], to describe biset optimizations:
        max_{A,B ⊆ V} f(A, B)    (2)
        subject to (A, B) ∈ I_i,  ∀ i ∈ {1, ..., r}.
    Bisubmodular function maximization allows for richer problem structures than submodular max: i.e., simultaneously selecting and partitioning elements into two groups, where the value of adding an element to a set can depend on interactions between the two sets. To illustrate the potential of bisubmodular maximization, we consider two distinct problems: …