Faster Sublinear Approximation of the Number of K-Cliques in Low-Arboricity Graphs

Talya Eden∗   Dana Ron†   C. Seshadhri‡

Abstract

Given query access to an undirected graph $G$, we consider the problem of computing a $(1 \pm \varepsilon)$-approximation of the number of $k$-cliques in $G$. The standard query model for general graphs allows for degree queries, neighbor queries, and pair queries. Let $n$ be the number of vertices, $m$ be the number of edges, and $n_k$ be the number of $k$-cliques. Previous work by Eden, Ron and Seshadhri (STOC 2018) gives an $O^*\left(\frac{n}{n_k^{1/k}} + \frac{m^{k/2}}{n_k}\right)$-time algorithm for this problem (we use $O^*(\cdot)$ to suppress $\mathrm{poly}(\log n, 1/\varepsilon, k^k)$ dependencies). Moreover, this bound is nearly optimal when the expression is sublinear in the size of the graph.

Our motivation is to circumvent this lower bound, by parameterizing the complexity in terms of graph arboricity. The arboricity of $G$ is a measure for the graph density "everywhere". There is a very rich family of graphs with bounded arboricity, including all minor-closed graph classes (such as planar graphs and graphs with bounded treewidth), bounded degree graphs, preferential attachment graphs and more.

We design an algorithm for the class of graphs with arboricity at most $\alpha$, whose running time is $O^*\left(\min\left\{\frac{n\alpha^{k-1}}{n_k},\ \frac{n}{n_k^{1/k}} + \frac{m\alpha^{k-2}}{n_k}\right\}\right)$. We also prove a nearly matching lower bound. For all graphs, the arboricity is $O(\sqrt{m})$, so this bound subsumes all previous results on sublinear clique approximation.

As a special case of interest, consider minor-closed families of graphs, which have constant arboricity. Our result implies that for any minor-closed family of graphs, there is a $(1 \pm \varepsilon)$-approximation algorithm for $n_k$ that has running time $O^*\left(\frac{n}{n_k}\right)$. Such a bound was not known even for the special (classic) case of triangle counting in planar graphs.

1 Introduction

The problem of counting the number of $k$-cliques in a graph is a fundamental problem in theoretical computer science [10, 49, 40, 58, 7], with a wide variety of applications [37, 13, 53, 18, 44, 8, 6, 31, 54, 38, 27, 57, 30, 39]. This problem has seen a resurgence of interest because of its importance in analyzing massive real-world graphs (like social networks and biological networks). There are a number of clever algorithms for exactly counting $k$-cliques using matrix multiplications [49, 26] or combinatorial methods [58]. However, the complexity of these algorithms grows with $m^{\Theta(k)}$, where $m$ is the number of edges in the graph.

A line of recent work has considered this question from a sublinear approximation perspective [20, 24]. Letting $n$ denote the number of vertices, $m$ the number of edges, and $n_k$ the number of $k$-cliques, the complexity of approximating the number of $k$-cliques up to a $(1 \pm \varepsilon)$-multiplicative factor is $O^*\left(\frac{n}{n_k^{1/k}} + \frac{m^{k/2}}{n_k}\right)$ with a nearly matching lower bound [24].¹

We study the problem of approximating the number of $k$-cliques in bounded arboricity graphs, with the hope of circumventing the above lower bound.² A graph of arboricity at most $\alpha$ has the property that the average degree in any subgraph is at most $2\alpha$ [46, 47]. One of our motivations is to understand when it is possible to get a running time of $O^*(n/n_k)$. This is an obvious lower bound, since a graph can simply contain $n_k$ disjoint $k$-cliques, and, e.g., a cycle on the remaining vertices. It requires $\Omega(n/n_k)$ uniform vertex samples just to land in a $k$-clique. Are there classes of graphs for which one can accurately estimate the number of $k$-cliques in this time?

A consequence of our main theorem is an affirmative answer to this question, for the class of constant-arboricity graphs. The class of graphs with constant arboricity is an immensely rich class, containing, among others, all minor-closed graph families. The concept of constant arboricity plays a significant role in the theory of bounded expansion graphs, which has applications in logic, descriptive complexity, and fixed parameter tractability [48]. In the context of real-world graphs, the classic Barabási-Albert preferential attachment graphs as well as additional models generate constant arboricity graphs [3, 5, 4]. In most real-world graphs, the arboricity is at most an order of magnitude larger than the average degree, while the maximum degree is three to four orders of magnitude larger [32, 39, 55]. In practical applications, low arboricity is often exploited for faster algorithms for clique and dense subgraph counting [28, 30, 45, 39, 16].

A classic result of Chiba and Nishizeki gives an $O(n + m\alpha^{k-2})$ algorithm for exact counting of $k$-cliques in graphs of arboricity at most $\alpha$ [10]. Our primary motivation is to get a sublinear-time algorithm for approximating the number of $k$-cliques on such graphs. We assume the standard query model for general graphs (refer to Chapter 10 of Goldreich's book [33]), so that the algorithm can perform degree, neighbor and pair queries. Let us exactly specify each query. (1) Degree queries: given $v \in V$, get the degree $d(v)$. (2) Neighbor queries: given $v \in V$ and $i \le d(v)$, get the $i$-th neighbor of $v$. (3) Pair queries: given vertices $u, v$, determine if $(u, v)$ is an edge.

1.1 Results. Our main result is an algorithm for approximating the number of $k$-cliques, whose complexity depends on the arboricity. The algorithm is sublinear for $n_k = \omega(\alpha^{k-2})$ (and we subsequently show that for smaller $n_k$, sublinear complexity cannot be obtained).

Theorem 1.1. There exists an algorithm that, given $n$, $k$, an approximation parameter $0 < \varepsilon < 1$, query access to a graph $G$, and an upper bound $\alpha$ on the arboricity of $G$, outputs an estimate $\hat{n}_k$, such that with high constant probability (over the randomness of the algorithm),

$$(1 - \varepsilon) \cdot n_k \le \hat{n}_k \le (1 + \varepsilon) \cdot n_k.$$

The expected running time of the algorithm is

$$\min\left\{\frac{n\alpha^{k-1}}{n_k},\ \frac{n}{n_k^{1/k}} + \frac{m\alpha^{k-2}}{n_k}\right\} \cdot \mathrm{poly}(\log n, 1/\varepsilon, k^k),$$

and the expected query complexity is the minimum [...]

Recall that $\alpha$ is always upper bounded by $\sqrt{m}$, so that the bound in Theorem 1.1 subsumes the result for approximating the number of $k$-cliques in general graphs [24]. As we discuss in more detail in Section 2, our algorithm starts similarly to the algorithm of [24] but departs quickly since it relies on a different, iterative, approach so as to achieve the dependence on $\alpha$.

Comparing our bound of $O^*\left(\frac{n}{n_k^{1/k}} + \frac{m\alpha^{k-2}}{n_k}\right)$ for approximate counting with the Chiba and Nishizeki bound of $O(n + m\alpha^{k-2})$ for exact counting, we get that when $n_k \gg \mathrm{poly}\left(\frac{\log n \cdot k^k}{\varepsilon}\right)$, our bound is smaller, and as $n_k$ increases the gap becomes more significant. Note that Chiba and Nishizeki read the entire graph, so that they have full knowledge of the graph, and their challenge is to count the number of $k$-cliques (by enumerating them), as efficiently (in terms of running time) as possible. On the other hand, our algorithm may obtain only a partial view of the graph. Hence, our challenge is to compute an estimate of the number of $k$-cliques based on such partial knowledge, by devising a careful sampling procedure (that in particular, exploits the bounded arboricity).

An application of Theorem 1.1 for the family of minor-closed graphs gives the following corollary. We note that even for the special case of triangle counting in planar graphs, such a result was not previously known.

Corollary 1.2. Let $\mathcal{G}$ be a minor-closed family of graphs. There is an algorithm that, given $n$, $k$, $\varepsilon$, and query access to $G \in \mathcal{G}$, outputs a $(1 \pm \varepsilon)$-approximation of $n_k$ with high constant probability. The expected running time of the algorithm is

$$(n/n_k) \cdot \mathrm{poly}(\log n, 1/\varepsilon, k^k).$$

In general, we prove that the bound of Theorem 1.1 is nearly optimal.

Theorem 1.3. Consider the set $\mathcal{G}$ of graphs of arboricity at most $\alpha$. Any multiplicative approximation algorithm that succeeds with constant probability on all graphs in $\mathcal{G}$ must make

$$\Omega\left(\min\left\{\frac{n\alpha^{k-1}}{k^k \cdot n_k},\ \frac{n}{k \cdot n_k^{1/k}} + \min\left\{\frac{m(\alpha/k)^{k-2}}{n_k},\ m\right\}\right\}\right)$$

queries in expectation.

∗ CSAIL at MIT, [email protected]. The majority of this work was done while the author was affiliated with Tel Aviv University. This research was partially supported by a grant from the Blavatnik fund, and Schmidt and Rothschild Fellowships. The author is grateful to the Azrieli Foundation for the award of an Azrieli Fellowship.
† Tel Aviv University, [email protected]. This research was partially supported by the Israel Science Foundation grants No. 671/13 and 1146/18.
‡ University of California, Santa Cruz, [email protected]. This research was funded by NSF CCF-1740850, NSF CCF-1813165, and ARO Award W911NF191029.
¹ As stated in the abstract, we use the $O^*(\cdot)$ notation to suppress $\mathrm{poly}(\log n, 1/\varepsilon, k^k)$ dependencies.
² The arboricity of a graph is the minimal number of forests required to cover the edges of the graph.

Copyright © 2020 by SIAM. Unauthorized reproduction of this article is prohibited.
