
Existence and Size of the Giant Component in Inhomogeneous Random K-out Graphs

Mansi Sood and Osman Yağan

Abstract—Random K-out graphs are receiving attention as a model to construct sparse yet well-connected topologies in distributed systems including sensor networks, federated learning, and cryptocurrency networks. In response to the growing heterogeneity in emerging real-world networks, where nodes differ in resources and requirements, inhomogeneous random K-out graphs were proposed recently. In this model, first, each of the $n$ nodes is classified as type-1 (respectively, type-2) with probability $\mu$ (respectively, $1-\mu$) independently from the others, where $0 < \mu < 1$. Next, each type-1 (respectively, type-2) node draws 1 arc towards a node (respectively, $K_n$ arcs towards $K_n$ distinct nodes) selected uniformly at random. The orientation of the arcs is ignored, yielding the inhomogeneous random K-out graph, denoted by $\mathbb{H}(n;\mu,K_n)$. It was recently established that $\mathbb{H}(n;\mu,K_n)$ is connected with high probability (whp) if and only if $K_n = \omega(1)$. Motivated by practical settings where establishing links is costly and only a bounded choice of $K_n$ is feasible ($K_n = O(1)$), we study the size of the largest connected sub-network of $\mathbb{H}(n;\mu,K_n)$. We first show that the trivial condition of $K_n \geq 2$ for all $n$ is sufficient to ensure that $\mathbb{H}(n;\mu,K_n)$ contains a giant component of size $n - O(1)$ whp. Next, to model settings where nodes can fail or get compromised, we investigate the size of the largest connected sub-network in $\mathbb{H}(n;\mu,K_n)$ when $d_n$ nodes are selected uniformly at random and removed from the network. We show that if $d_n = O(1)$, a giant component of size $n - O(1)$ persists for all $K_n \geq 2$ whp. Further, when $d_n = o(n)$ nodes are removed from $\mathbb{H}(n;\mu,K_n)$, the remaining nodes contain a giant component of size $n(1 - o(1))$ whp for all $K_n \geq 2$. We present numerical results to demonstrate the size of the largest connected component when the number of nodes is finite.

Index Terms—Network analysis and control, random graphs, connectivity, ad-hoc networks

arXiv:2009.01610v2 [cs.IT] 29 Jul 2021

This work has been supported in part by the National Science Foundation through grant CCF #1617934, David H. Barakat and LaVerne Owen-Barakat Carnegie Institute of Technology Dean's Fellowship, Pennsylvania Infrastructure Technology Alliance (PITA), CyLab@IoT, and CyLab Presidential Fellowship at Carnegie Mellon University.

M. Sood and O. Yağan are with the Department of Electrical and Computer Engineering and CyLab, Carnegie Mellon University, Pittsburgh, PA, 15213 USA. Email: {[email protected], [email protected]}

A preliminary version of this work was presented at the 59th IEEE Conference on Decision and Control (CDC) 2020 [1].

I. INTRODUCTION

A. Background

The field of random graphs lays the mathematical foundations for modeling and analyzing the complex patterns of interconnections typical to real-world networks including communication networks [2], social networks [3], and biological networks [4]. A class of random graphs called the random K-out graph is one of the earliest known models in the literature [5], [6]. The random K-out graph comprising $n$ nodes, denoted by $\mathbb{H}(n;K)$, is constructed as follows. Each of the $n$ nodes draws $K$ edges towards $K$ distinct nodes chosen uniformly at random from among all other nodes. The orientation of the edges is then ignored, yielding an undirected graph. Random K-out graphs achieve connectivity with probability approaching one in the limit of large $n$ for all $K \geq 2$ [5], [7], with the probability of connectivity growing as $1 - \Theta(1/n^{K^2-1})$ for all $K \geq 2$ [8]. Consequently, random K-out graphs get connected very easily, i.e., with far fewer edges ($O(n)$) as compared to classical models including Erdős–Rényi (ER) random graphs ($O(n \log n)$). Owing to the simplicity of graph construction and their unique connectivity properties, random K-out graphs are receiving increasing attention as a model to construct sparse, yet well-connected topologies in a fully distributed fashion in ad-hoc networks. Random K-out graphs have received renewed interest for designing securely connected sensor networks [7], anonymity preserving cryptocurrency networks [9], and fully decentralized learning networks with differential privacy guarantees [10].

In the context of wireless sensor networks (WSNs), random K-out graphs have been studied [7], [11]–[13] to model the random pairwise key predistribution scheme [14]. Along with the original key predistribution scheme proposed by Eschenauer and Gligor [15], the random pairwise scheme is one of the most widely recognized security protocols for sensor networks. The random pairwise scheme is implemented in two phases. In the first phase, each sensor node is paired offline with $K$ distinct nodes chosen uniformly at random among all other sensor nodes. Next, a unique pairwise key is inserted in the memory of each of the paired sensors. After deployment, two sensor nodes can communicate securely only if they have at least one key in common. The random pairwise key predistribution scheme induces a random K-out graph on the set of the participating sensor nodes [7], and therefore the connectivity properties of random K-out graphs are of interest in key management schemes for establishing secure WSNs.

In recent years, the rapid proliferation of affordable sensing and mobile devices has led to advances in applications such as remote-sensing, environmental surveillance [16], and decentralized learning [17]. Such emerging applications increasingly rely on integrating heterogeneous agents that have different capabilities and (security and connectivity) requirements. Consequently, the analysis of heterogeneous variants of classical random graph models has emerged as an active area of research [18]–[28]. A heterogeneous variant of the random K-out graph known as the inhomogeneous random K-out graph was proposed recently to model heterogeneous

networks secured by random pairwise key predistribution schemes [19].

The inhomogeneous random K-out graph is constructed on a set of $n$ nodes as follows. First, each node is independently classified as type-1 with probability $\mu$ and type-2 with probability $1-\mu$, where $0 < \mu < 1$. The type of the node determines the number of selections made by it, viz., each type-1 node selects one node uniformly at random from all other nodes and each type-2 node selects $K_n \geq 2$ distinct nodes uniformly at random from all other nodes; see Figure 1. The notation $K_n$ indicates that the number of selections made by type-2 nodes scales as a function of the number of nodes ($n$). Since homogeneous random K-out graphs, where all nodes make the same number of selections, achieve connectivity whp for any $K \geq 2$ [5], [7], the interesting case arises when we allow type-1 nodes to select just 1 edge; see Section II. Further, the analysis in this paper also generalizes to inhomogeneous random K-out graphs with an arbitrary number of node types; see Appendix. Throughout our discussion, we let $\mathbb{H}(n;\mu,K_n)$ denote the inhomogeneous random K-out graph on $n$ nodes with parameters $\mu$ and $K_n$.

Fig. 1. An inhomogeneous random K-out graph with 6 nodes. Nodes A, C and E are type-2 and the rest (B, D, F) are type-1. Each type-1 node selects 1 node uniformly at random and each type-2 node selects $K_n = 3$ nodes uniformly at random. An undirected edge is drawn between two nodes if at least one selects the other.

B. Properties of interest and related results

Connectivity is a fundamental driver of system performance in communication networks since it enables nodes to securely communicate with one another [29]. It is known [5], [7] that the random K-out graph $\mathbb{H}(n;K)$ is connected when $K \geq 2$ and not connected for $K = 1$ with probability tending to one as $n \to \infty$. In particular, the following zero-one law holds:

$$\lim_{n \to \infty} \mathbb{P}\left[\mathbb{H}(n;K) \text{ is connected}\right] = \begin{cases} 1 & \text{if } K \geq 2, \\ 0 & \text{if } K = 1. \end{cases} \tag{1}$$

In [19], it was shown that for any $\mu \in (0,1)$, the inhomogeneous random K-out graph $\mathbb{H}(n;\mu,K_n)$ is connected with probability tending to one if and only if $K_n = \omega(1)$; i.e.,

$$\lim_{n \to \infty} \mathbb{P}\left[\mathbb{H}(n;\mu,K_n) \text{ is connected}\right] \begin{cases} = 1 & \text{if } K_n \to \infty, \\ < 1 & \text{otherwise.} \end{cases} \tag{2}$$

As seen from (2), ensuring connectivity of $\mathbb{H}(n;\mu,K_n)$ requires the selections made by type-2 nodes ($K_n$) to grow unboundedly large with $n$. Although it is desirable to have a connected network, in practice, resource constraints can limit the number of links that can be successfully established. For instance, if the power available for transmission is limited, the underlying physical network may not be dense enough to guarantee global connectivity with key predistribution schemes [30]. In such resource-constrained environments where it is infeasible to establish a connected network, the goal is to establish a large connected sub-network spanning almost the entire network as efficiently as possible, i.e., by using the least amount of resources (links) [31]. For example, if a sensor network is designed to monitor the temperature of a field, it may suffice to aggregate readings from a majority of sensors in the field [32]. Formally, the notion of connected sub-networks is described by connected components; see Definition 3.1 for a precise characterization. With this in mind, the first question which we address is when $K_n$ is bounded (i.e., $K_n = O(1)$), how many nodes are contained in the largest connected component of $\mathbb{H}(n;\mu,K_n)$? In the literature on random graphs, this is often studied in terms of the existence and size of the giant component, defined as a connected sub-network comprising $\Omega(n)$ nodes; see [33] for a classical study on the size of the giant component of Erdős–Rényi graphs.

Not only is it important to design networks that bear large connected components, but it is also crucial to ensure these large components are robust to node failures. Sensor networks are often deployed in hostile environments, e.g., battlefield monitoring and environmental surveillance, which makes the nodes susceptible to operational failures and adversarial captures. Therefore, it is of interest to analyze the properties of networks induced by random K-out graphs when some nodes are dishonest, have failed, or have been captured from the network. From a modeling perspective, this translates to analyzing the connectivity properties of inhomogeneous random K-out graphs when a subset of nodes are selected uniformly at random and deleted. Based on the analytical framework proposed in this paper, a recent work [34] investigates the size of the giant component for homogeneous random K-out graphs ($\mathbb{H}(n;K)$) under random node deletions. However, the analysis for $\mathbb{H}(n;\mu,K_n)$ has been limited to stronger notions of connectivity that require $K_n \to \infty$ [19], [35]. With this in mind, the second question that we analyze is when $K_n$ is bounded, what is the size of the largest connected component in $\mathbb{H}(n;\mu,K_n)$ after a subset of nodes are selected at random and removed from the network?

C. Main contributions

In this work, we prove that the inhomogeneous random K-out graph $\mathbb{H}(n;\mu,K_n)$ contains a giant component as long as the trivial conditions $\mu \in (0,1)$ and $K_n \geq 2$ (for all $n$) hold. In fact, we show that under the same conditions, $\mathbb{H}(n;\mu,K_n)$ contains a connected sub-network of size $n - O(1)$ whp. Put differently, all but finitely many nodes are contained in the giant component of $\mathbb{H}(n;\mu,K_n)$ whp as $n$ goes to infinity. This result follows from an upper bound on the probability that more than $M$ nodes are outside of the giant component. We show that this probability decays at least as fast as $O(1)\exp\{-M(1-\mu)(K_n-1)\} + o(1)$, providing a clear trade-off between $K_n$ and the fraction $(1-\mu)$ of nodes that make $K_n$ selections.
Next, we generalize this result to study the size of the largest connected component in $\mathbb{H}(n;\mu,K_n)$ after $d_n$ nodes are selected at random and deleted from the network. We show that after the deletion of any randomly chosen finite subset of nodes ($d_n = O(1)$), the graph induced by the remaining nodes still contains a giant component of size $n - O(1)$ for all $K_n \geq 2$ whp. Furthermore, when $d_n = o(n)$ nodes are deleted uniformly at random from $\mathbb{H}(n;\mu,K_n)$, the remaining nodes contain a giant component of size $n(1 - o(1))$ for all $K_n \geq 2$ whp. We also provide an upper bound on the probability of observing $x_n$ or more nodes outside the giant component in terms of the parameters $d_n$, $K_n$ and $\mu$. For the finite node setting, we provide simulation results investigating the average size and the minimum observed size of the giant component across 100,000 independent experiments. Our simulation results are well-aligned with our theoretical prediction on the existence of a large connected component in the network. For example, with $n = 5000$, $\mu = 0.9$, $K_n = 2$, at most 45 nodes were observed to be outside the largest connected component across 100,000 experiments; see Section III for details.

D. Further applications of Random K-out Graphs

Random K-out graphs are a useful tool for efficiently constructing distributed networks that are securely connected and resilient to node failures, i.e., by using the least number of edges. Further, the inhomogeneous random K-out graph allows for more practical deployment scenarios where not all nodes have the same resources or mission requirements. Below, we describe some additional use-cases for random K-out graphs.

There has been a growing interest in fully decentralized learning applications [17] where users communicate over a peer-to-peer network to train a machine learning model. In such a decentralized setting, in the absence of a trusted aggregator, maintaining the privacy of the local data of participating nodes is a key challenge. A recent line of work [10], [36] analyzes mechanisms to perform differentially-private averaging of private values of users over a communication graph in a fully-decentralized fashion without disclosing the private values. In the proposed GOPA (GOssip Noise for Private Averaging) protocol [10, Algorithm 1], each node pair $(u, v)$ in the communication graph collaboratively generates a pairwise noise term $\Delta(u, v)$. The authors of [10], [36] use the random K-out graph as a practical randomized procedure to construct communication graphs that are connected (and thus require a lower pairwise noise variance) and sparse (and thus incur lower communication costs) in a distributed fashion. A precise characterization of the differential privacy guarantees achieved while using random K-out graphs to construct the communication graph can be found in [10, Theorem 3]. The achieved differential-privacy guarantees depend on the connectivity of the subgraph of honest users. Further, in cases where the subgraph on honest nodes is not connected, the performance of the proposed protocol depends on the size of connected components of honest users.

In addition to the applications in WSNs and distributed learning, a structure similar to the random K-out graph was proposed [9, Algorithm 1] in the Dandelion framework for generating anonymity graphs to facilitate the diffusion of transaction information to protect cryptocurrency networks from mass de-anonymization attacks. Random K-out graphs have also received interest in the ongoing development of the next-generation SCION Internet architecture to provide end hosts more control over their traffic using path-aware routing [37]. In this context, gossip networks closely resembling random K-out graphs have been used to develop distributed systems to efficiently translate between old IP addresses and new SCION addresses [38]. Given their ability to get connected with a relatively smaller number of edges than classical models such as ER graphs, random K-out graphs offer a promising potential for informing the topological properties of emerging peer-to-peer networks. For instance, random K-out graphs may be of interest in Payment Channel Networks (PCNs) [39] where there is a trade-off between the number of edges in the network (which is constrained since each edge corresponds to funds escrowed in the PCN) and connectivity (which is desirable to facilitate transactions between participating nodes). Our results on the impact of heterogeneity on the size of the largest connected component of random K-out graphs are of interest for these additional applications.

E. Notation

We use the following notational conventions throughout the paper. All limits are understood with the number of nodes $n$ going to infinity. While comparing asymptotic behavior of a pair of sequences $\{a_n\}$, $\{b_n\}$, we use $a_n = o(b_n)$, $a_n = \omega(b_n)$, $a_n = O(b_n)$, $a_n = \Theta(b_n)$, and $a_n = \Omega(b_n)$ with their meaning in the standard Landau notation. All random variables are defined on the same probability triple $(\Omega, \mathcal{F}, \mathbb{P})$. Probabilistic statements are made with respect to this probability measure $\mathbb{P}$, and we denote the corresponding expectation operator by $\mathbb{E}$. For an event $A$, its complement is denoted by $A^c$. We say that an event occurs with high probability (whp) if it holds with probability tending to one as $n \to \infty$. We denote the cardinality of a discrete set $A$ by $|A|$ and the set of all positive integers by $\mathbb{N}_0$. For events $A$ and $B$, we use $A \implies B$ with the meaning that $A \subseteq B$.

F. Organization

In Section II we formally define the inhomogeneous random K-out graphs and a model for random node deletions. In Section III we present our main results on characterizing the size of the largest connected component in $\mathbb{H}(n;\mu,K_n)$. Here, we provide a discussion of the main results and present simulations for the finite node regime. We provide a proof outline for the existence of a giant component spanning $n - O(1)$ nodes in Section IV-A, presenting its comprehensive proof in Section V. Similarly, we outline the proof for the case of node deletions in Section VI and provide full details in Section VII. Lastly, we extend our results to the inhomogeneous random K-out graphs with an arbitrary number of node types in the Appendix.

II. NETWORK MODEL

A. The Inhomogeneous Random K-out Graph $\mathbb{H}(n;\mu,K_n)$

Let $\mathcal{N} := \{v_1, v_2, \ldots, v_n\}$ denote the set of nodes. The inhomogeneous random K-out graph is constructed on the node set $\mathcal{N}$ as follows. First, each node is assigned as type-1 (respectively, type-2) with probability $\mu$ (respectively, $1-\mu$) independently from other nodes, where $0 < \mu < 1$. Next, each type-1 (respectively, type-2) node selects $K_1$ (respectively, $K_2$) distinct nodes uniformly at random among all other nodes. For each $v_i \in \mathcal{N}$, let $\Gamma_n(v_i) \subseteq \mathcal{N} \setminus \{v_i\}$ denote the subset of nodes selected by $v_i$. Under the aforementioned assumptions, $\Gamma_n(v_1), \ldots, \Gamma_n(v_n)$ are mutually independent given the types of nodes. We say that distinct nodes $v_i$ and $v_j$ are adjacent, denoted by $v_i \sim v_j$, if at least one of them picks the other. Namely,

$$v_i \sim v_j \quad \text{if} \quad v_j \in \Gamma_n(v_i) \ \vee \ v_i \in \Gamma_n(v_j). \tag{3}$$

The inhomogeneous random K-out graph is defined on the node set $\mathcal{N}$ through the adjacency condition (3). More general constructions with an arbitrary number of node types are also possible [19], and the implications of our results for such cases are presented in the Appendix.

Without loss of generality, we assume that $1 \leq K_1 < K_2$. From (1), it can be seen that the inhomogeneous random K-out graph will be connected whp if $K_1 \geq 2$. Therefore, interesting cases arise for the analysis of the largest connected component only when $K_1 = 1$; i.e., when each node has a positive probability $\mu$ of selecting only one other node. As in [19], we thus assume that $K_1 = 1$, which in turn implies that $K_2 \geq 2$. For generality, we allow $K_2$ to scale with (i.e., to be a function of) $n$ and simplify the notation by denoting the corresponding mapping as $K_n$. Put differently, we consider the inhomogeneous random K-out graph, denoted as $\mathbb{H}(n;\mu,K_n)$, where each of the $n$ nodes selects one other node with probability $\mu$ ($0 < \mu < 1$) and $K_n$ other nodes with probability $1-\mu$; the edges are then constructed according to (3). Throughout, we assume that $K_n \geq 2$ for all $n$ in line with the assumption that $K_2 > K_1 = 1$. We denote the average number of selections made by each node in $\mathbb{H}(n;\mu,K_n)$ by $\langle K_n \rangle$. Observe that

$$\langle K_n \rangle = \mu + (1-\mu)K_n. \tag{4}$$

B. Modeling random node failures in $\mathbb{H}(n;\mu,K_n)$

Next, we describe a way to model nodes that have failed and/or been compromised and are therefore removed from the resulting network. Given a set of nodes $\mathcal{N} := \{v_1, \ldots, v_n\}$, we first construct an inhomogeneous random K-out graph $\mathbb{H}(n;\mu,K_n)$ with parameters $\mu$ and $K_n$ as outlined above in Section II-A. The graph $\mathbb{H}(n;\mu,K_n)$ constitutes the initial network before any nodes fail or get compromised. Next, we model a random attack on the network which results in the failure/capture of $d_n$ out of $n$ nodes. A subset $D$ of the node set $\mathcal{N}$ of size $d_n$ is selected uniformly at random ($D \subset \mathcal{N}$ and $|D| = d_n$) from all possible subsets of $\mathcal{N}$ containing exactly $d_n$ nodes. We define $\mathbb{G}(n;\mu,K_n,d_n)$ as the subgraph of $\mathbb{H}(n;\mu,K_n)$ induced by the subset of nodes contained in $\mathcal{N} \setminus D$. A pair of nodes $(u_1, u_2)$ contained in $\mathbb{G}(n;\mu,K_n,d_n)$ are adjacent if and only if $(u_1, u_2)$ are adjacent in $\mathbb{H}(n;\mu,K_n)$ and $u_1, u_2 \notin D$. The graph $\mathbb{G}(n;\mu,K_n,d_n)$ represents the resulting sub-network of reliable/honest nodes obtained after removing the failed/compromised nodes in $\mathbb{H}(n;\mu,K_n)$.

III. MAIN RESULTS AND DISCUSSION

A. Main results

It is known [19] that $\mathbb{H}(n;\mu,K_n)$ is connected whp only if $K_n = \omega(1)$. A natural question is then to ask what would happen if $K_n$ is bounded, i.e., when $K_n = O(1)$. It was shown [19] that $\mathbb{H}(n;\mu,K_n)$ has a positive probability of being not connected in that case. Thus, it is of interest to analyze whether the network has a connected sub-network containing a large number of nodes, or whether it consists merely of small sub-networks isolated from each other. To answer this question, we formally define connected components and then state our main result characterizing the size of the largest connected component of $\mathbb{H}(n;\mu,K_n)$ when $K_n = O(1)$.

Definition 3.1 (Connected Components): Nodes $v_1$ and $v_2 \in \mathcal{N}$ are said to be connected if there exists a path of edges connecting them. The connectivity of a pair of nodes forms an equivalence relation on the set of nodes. Consequently, there is a partition of the set of nodes $\mathcal{N}$ into non-empty sets $C_1, C_2, \ldots, C_m$ (referred to as connected components) such that two vertices $v_1$ and $v_2$ are connected if and only if there exists $i \in \{1, \ldots, m\}$ for which $v_1, v_2 \in C_i$; see [40, p. 13].

In light of the above definition, a graph is connected if it consists of only one connected component. In all other cases, the graph is not connected and has at least two connected components that have no edges in between. It is of interest to analyze the fraction of the nodes contained in the largest connected component as the number of nodes grows. In particular, a graph with $n$ nodes is said to have a giant component if its largest connected component is of size $\Omega(n)$.

Let $C_{\max}(n;\mu,K_n)$ denote the set of nodes in the largest connected component of $\mathbb{H}(n;\mu,K_n)$. Our first main result, presented below, shows that $|C_{\max}(n;\mu,K_n)| = n - O(1)$ whp. Namely, $\mathbb{H}(n;\mu,K_n)$ has a giant component that contains all but finitely many of the nodes whp. First, we show that the probability of at least $M$ nodes being outside of $C_{\max}(n;\mu,K_n)$ decays exponentially fast with $M$.

Theorem 3.2: For the inhomogeneous random K-out graph $\mathbb{H}(n;\mu,K_n)$ with $K_n \geq 2 \ \forall n$ and $K_n = O(1)$, for each $M = 1, 2, \ldots$, it holds that

$$\mathbb{P}\left[|C_{\max}(n;\mu,K_n)| \leq n - M\right] \leq \frac{\exp\{-M(\langle K_n \rangle - 1)(1 - o(1))\}}{1 - \exp\{-(\langle K_n \rangle - 1)(1 - o(1))\}} + o(1). \tag{5}$$

The proof of Theorem 3.2 relies on the connection between the non-existence of sub-graphs with size exceeding $M$ that are isolated from the rest of the graph, and the size of the largest component being at least $n - M$. This approach is inspired by [31] and differs from the branching process technique typically employed in the random graph literature, e.g., in the case of Erdős–Rényi graphs [41, Ch. 4]. The

proof of Theorem 3.2 is presented in Section IV-A. The next corollary uses Theorem 3.2 to show that $\mathbb{H}(n;\mu,K_n)$ has a giant component that contains all but finitely many of the nodes whp.

Corollary 3.3: For the inhomogeneous random K-out graph $\mathbb{H}(n;\mu,K_n)$ with $K_n \geq 2 \ \forall n$ and $K_n = O(1)$, we have

$$|C_{\max}(n;\mu,K_n)| = n - O(1) \quad \text{whp.} \tag{6}$$

The proof of Corollary 3.3 is presented in Section IV-B. We extend Corollary 3.3 to inhomogeneous random K-out graphs with an arbitrary number of node types; see Corollary 8.1 in the Appendix.

So far, we have discussed that $\mathbb{H}(n;\mu,K_n)$ contains a giant component of size $n - O(1)$ whp. The next question which we investigate is whether there exists a giant component, i.e., a component of size $\Omega(n)$, in the event of node failures or adversarial capture of nodes in the network. Recall that $\mathbb{G}(n;\mu,K_n,d_n)$ denotes the random graph obtained by deleting $d_n$ nodes selected at random from $\mathbb{H}(n;\mu,K_n)$. Let $C_{\max}(n;\mu,K_n,d_n)$ denote the set of nodes in the largest connected component of $\mathbb{G}(n;\mu,K_n,d_n)$. Our next result characterizes the existence and size of the giant component in $\mathbb{G}(n;\mu,K_n,d_n)$ obtained by deleting $d_n = o(n)$ nodes uniformly at random from $\mathbb{H}(n;\mu,K_n)$.

Theorem 3.4: Consider the graph $\mathbb{G}(n;\mu,K_n,d_n)$ with $d_n = o(n)$, $K_n \geq 2 \ \forall n$ and $K_n = O(1)$. For any sequence $x : \mathbb{N}_0 \to \mathbb{N}_0$ and $\forall \varepsilon > 0$, if $x_n > \frac{(1+\varepsilon)d_n}{\langle K_n \rangle - 1} \ \forall n$, then

$$\mathbb{P}\left[|C_{\max}(n;\mu,K_n,d_n)| \leq n - d_n - x_n\right] \leq \frac{e^{-x_n(\langle K_n \rangle - 1)\frac{\varepsilon}{1+\varepsilon}(1 - o(1))}}{1 - e^{-(\langle K_n \rangle - 1)\frac{\varepsilon}{1+\varepsilon}(1 - o(1))}} + \frac{e^{-x_n(1-\mu)\frac{\varepsilon}{1+\varepsilon}(1 - o(1))}}{1 - e^{-(1-\mu)\frac{\varepsilon}{1+\varepsilon}(1 - o(1))}}. \tag{7}$$

Theorem 3.4 provides an upper bound on the probability that more than $x_n$ nodes lie outside the largest component $C_{\max}(n;\mu,K_n,d_n)$, which decays to zero whenever $x_n = \omega(1)$ and $x_n > \frac{(1+\varepsilon)d_n}{\langle K_n \rangle - 1} \ \forall n$. Here, the constant $\varepsilon$ provides a way to trade off between the lower bound imposed on $x_n$ ($x_n > (1+\varepsilon)d_n/(\langle K_n \rangle - 1)$) and the coefficient $\varepsilon/(1+\varepsilon)$ that governs the decay rate of the exponent in the upper bound (7) on $\mathbb{P}[|C_{\max}(n;\mu,K_n,d_n)| \leq n - d_n - x_n]$; see Section VI for more details. The following corollary to Theorem 3.4 provides a succinct characterization of the size of $C_{\max}(n;\mu,K_n,d_n)$.

Corollary 3.5: For the graph $\mathbb{G}(n;\mu,K_n,d_n)$ with $K_n \geq 2 \ \forall n$ and $K_n = O(1)$, it holds that

i) if $d_n = O(1)$, then

$$|C_{\max}(n;\mu,K_n,d_n)| = n - O(1) \quad \text{whp,} \tag{8}$$

ii) if $d_n = \omega(1)$ and $d_n = o(n)$, then

$$|C_{\max}(n;\mu,K_n,d_n)| = n(1 - o(1)) \quad \text{whp.} \tag{9}$$

A consequence of Corollary 3.5 is that despite the deletion of a randomly chosen finite subset of nodes ($d_n = O(1)$) in $\mathbb{H}(n;\mu,K_n)$, a giant component of size $n - O(1)$ persists for all $K_n \geq 2$ whp. Furthermore, when $d_n = o(n)$ nodes are randomly deleted from $\mathbb{H}(n;\mu,K_n)$, the remaining nodes contain a giant component of size $n(1 - o(1))$ for all $K_n \geq 2$ whp. The proofs for Theorem 3.4 and Corollary 3.5 are presented in Section VI.

B. Discussion

Theorem 3.2 shows that for all $K_n \geq 2$ and $\mu \in (0,1)$, the largest connected component in $\mathbb{H}(n;\mu,K_n)$ spans $n - O(1)$ nodes whp. In contrast to the requirement of $K_n = \omega(1)$ for connectivity [19], merely setting $K_n = 2$ ensures that only finitely many nodes of $\mathbb{H}(n;\mu,K_n)$ lie outside the giant component $C_{\max}(n;\mu,K_n)$. We expect that in resource-constrained environments (e.g., IoT type settings), it will be advantageous to have a large connected component, reinforcing the usefulness of inhomogeneous random K-out graphs as a topology design tool in distributed systems. Moreover, for settings where a finite number of nodes fail and/or get captured, modeled by random node deletions, Theorem 3.4 implies that the honest nodes still contain a connected component spanning $n - O(1)$ nodes whp. Furthermore, during large-scale failures and attacks on the network where as many as $d_n = \omega(1)$, $d_n = o(n)$ (e.g., $d_n = n^{0.99}$) nodes are removed from the network, the remaining nodes contain a giant component of size $n(1 - o(1))$ whp.

It is worth emphasizing that the largest connected component of $\mathbb{H}(n;\mu,K_n)$, whose size is given in (6), is much larger than what is strictly required to qualify it as a giant component, i.e., the condition that $|C_{\max}(n;\mu,K_n)| = \Omega(n)$. In fact, for most random graph models, including Erdős–Rényi graphs [33] and random key graphs [42, Theorem 2], studies on the size of the largest connected component are focused on characterizing the behavior of $|C_{\max}|/n$ as $n$ gets large; this amounts to studying the fractional size of the largest connected component. Our results (6) and (8) go beyond looking at the fractional size of the largest component, for which they yield $\frac{|C_{\max}(n;\mu,K_n)|}{n} \to_p 1$. This is equivalent to having $|C_{\max}(n;\mu,K_n)| = n - o(n)$. However, even having $|C_{\max}(n;\mu,K_n)| = n - o(n)$ leaves the possibility that as many as $n^{0.99}$ nodes are not part of the largest connected component. Thus, our result in Theorem 3.2, showing that at most $O(1)$ nodes are outside the largest connected component whp, is sharper than existing results on the fractional size of the largest connected component. Further, even when $d_n = o(n)$ nodes are randomly deleted from $\mathbb{H}(n;\mu,K_n)$, (9) shows that the fractional size of the giant component of $\mathbb{G}(n;\mu,K_n,d_n)$ is $1 - o(1)$ whp.

Our results highlight a major difference between inhomogeneous random K-out graphs and classical models such as Erdős–Rényi (ER) graphs [6], [33]. We provide an example to compare the size of the giant component in $\mathbb{H}(n;\mu,K_n)$ and ER graphs with the same mean degree. For $\mathbb{H}(n;\mu,K_n)$, we set $K_n = 2$ and $\mu = 0.9$, which yields a mean node degree of $(1 - o(1))2\langle K_n \rangle \approx 2(0.9 + 0.1 \times 2) = 2.2$; see Appendix VIII-B. Let $G(n;p_n)$ denote the ER graph on $n$ nodes with edge probability $p_n \in [0,1]$. We set $p_n = 2.2/n$ to get a mean degree of 2.2. Thus, the mean number of edges in both these models match. It is known [33] that for $p_n = c/n$ and $c > 1$, the ER graph has a giant component

of size $\beta n(1 + o(1))$ whp, where $\beta \in (0,1]$ is the solution of $\beta + e^{-\beta c} = 1$. Substituting $p_n = 2.2/n$, the largest connected component of the ER graph $G(n; 2.2/n)$ is of size $\approx 0.8437n + o(n)$ whp. For an ER graph over 5000 nodes, this corresponds to over 700 nodes being isolated from the largest component. In contrast, Theorem 3.2 shows that the largest connected component of $\mathbb{H}(n;\mu,K_n)$ would be much larger.
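The ER value quoted above can be checked numerically. The following minimal sketch, which is not part of the paper's analysis, solves $\beta + e^{-\beta c} = 1$ for the nonzero root by bisection; the function name `er_giant_fraction` and the tolerance are illustrative assumptions. It uses the fact that $f(\beta) = \beta + e^{-\beta c} - 1$ attains its minimum at $\beta = \ln(c)/c$, where it is negative for $c > 1$, while $f(1) = e^{-c} > 0$.

```python
import math


def er_giant_fraction(c, tol=1e-12):
    """Nonzero root in (0, 1] of beta + exp(-beta * c) = 1, for c > 1.

    Hypothetical helper for illustration; plain bisection on
    f(beta) = beta + exp(-beta * c) - 1.
    """
    if c <= 1:
        raise ValueError("a nonzero root exists only for c > 1")

    def f(beta):
        return beta + math.exp(-beta * c) - 1.0

    lo = math.log(c) / c   # minimizer of f, where f < 0
    hi = 1.0               # f(1) = exp(-c) > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    beta = er_giant_fraction(2.2)
    print(f"beta = {beta:.4f}")
    print(f"expected nodes outside the giant component for n = 5000: "
          f"{5000 * (1 - beta):.0f}")
```

For $c = 2.2$ the root agrees with the value 0.8437 used in the text, so roughly $5000 \times (1 - 0.8437) \approx 780$ nodes are expected outside the giant component of the ER graph, in line with the "over 700" figure.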

Namely, $|C_{\max}(n;\mu,K_n)| = n - O(1)$ whp. This is verified in our experiments in Figure 2, where for a network of 5000 nodes, with $\mu \leq 0.9$, at most 45 nodes are outside the largest connected component in 100,000 experiments.

A related property of interest is the connectivity of $\mathbb{H}(n;\mu,K_n)$ under targeted attacks when any $k - 1$ nodes are deleted, where $k \geq 2$. For $\mathbb{H}(n;\mu,K_n)$, it was shown [35] that k-connectivity, i.e., the property that the network remains connected despite the failure of any $k - 1$ nodes or edges, requires $K_n = \Omega(\log n)$ where $k \geq 2$. The requirement for k-connectivity ($K_n = \Omega(\log n)$) [35] is significantly larger than what is required for connectivity ($K_n = \omega(1)$) [19]. These existing studies [19], [35] focus on stronger notions of connectivity of $\mathbb{H}(n;\mu,K_n)$ that require $K_n = \omega(1)$. Therefore, our analysis is the first to guarantee a useful property, viz., a giant component, when the number of selections is finite ($K_n = O(1)$), paving the way for practical deployment of inhomogeneous random K-out graphs as a topology design tool in real-world networks.

C. Simulation results

Through simulations, we examine the size of $C_{\max}(n;\mu,K_n)$ and $C_{\max}(n;\mu,K_n,d_n)$ when the number of nodes is finite. We first explore the impact of varying the probability $\mu$ of a node being type-1 on the size of the largest connected component. We generate 100,000 independent realizations of $\mathbb{H}(n;\mu,K_n)$ with $K_n = 2$ for $n = 1000$ and $n = 5000$, varying $\mu$ between 0.1 and 0.9 in increments of 0.1. Since Theorem 3.2 states that the size of the largest connected component is $n - O(1)$ whp, we focus on the minimum size of the largest component observed in 100,000 experiments. The average size of the largest connected component is also shown for comparison in Figure 2. We see that even when the probability of a node being type-1 is as high as 0.9, setting $K_n = 2$ suffices to have a connected component spanning almost all of the nodes. For $n = 1000$ and $5000$, at most 60 and 45 nodes, respectively, are found to be outside the largest connected component (Figure 2). The observation that the number of nodes outside the largest connected component does not scale with $n$ is consistent with Theorem 3.2 and Corollary 3.3.

Fig. 2. Average and minimum number of nodes contained in the largest connected component of $\mathbb{H}(n;\mu,K_n)$ with $K_n = 2$, $n = 1000, 5000$ and $\mu \in \{0.1, \ldots, 0.9\}$. Even when $\mu = 0.9$, setting $K_n = 2$ is enough to ensure that almost all of the nodes form a connected component; at most 45 out of 5000 nodes (or, 60 out of 1000 nodes) are seen to be isolated from the giant component across 100,000 independent experiments.

The next set of experiments probes the impact of varying the number $K_n$ of edges pushed by type-2 nodes while $\mu$ is fixed. We generate 100,000 independent realizations of $\mathbb{H}(n;\mu,K_n)$ for $n = 5000$ while keeping $\mu$ fixed at 0.9 and varying $K_n$ between 2 and 10 in increments of 1. Increasing $K_n$ has an impact similar to decreasing $\mu$, and we see in Figure 3 that both the average and the minimum size of the largest connected component increase nearly monotonically. Given that increasing $K_n$ (or, decreasing $\mu$) increases $\langle K_n \rangle$ in view of (4), this observation is consistent with our main result given in Theorem 3.2; i.e., with the fact that $\mathbb{P}[n - |C_{\max}(n;\mu,K_n)| > M]$ decays to zero exponentially with $(\langle K_n \rangle - 1)M$.

Fig. 3. Average and minimum number of nodes contained in the largest connected component of $\mathbb{H}(n;\mu,K_n)$ across 100,000 independent experiments with $n = 5000$, $\mu = 0.9$, and $K_n \in \{2, \ldots, 10\}$.

Next, we probe the size of the largest connected component of $\mathbb{G}(n;\mu,K_n,d_n)$, i.e., the graph obtained after the deletion of $d_n$ nodes uniformly at random from $\mathbb{H}(n;\mu,K_n)$. We first examine how the size of the largest connected component of $\mathbb{G}(n;\mu,K_n,d_n)$ varies as we change $\mu$ or $K_n$. We generate 100,000 independent realizations of $\mathbb{G}(n;\mu,K_n,d_n)$ with (i) $n = 1000$, $d_n = 20$, $K_n = 2$, and vary $\mu \in \{0.1, \ldots, 0.9\}$; and (ii) $n = 5000$, $d_n = 70$, $\mu = 0.9$, and vary $K_n \in$

{2, . . . , 10}. For these parameters, we plot the minimum and average size of the largest component observed across the 100,000 realizations in Figure 4. By comparing Figures 2 and 4 for Kn = 2 and µ = 0.9, we observe that for n = 1000 (respectively, n = 5000), after deleting dn = 20 (respectively, dn = 70) nodes, the maximum number of nodes observed outside Cmax(n; µ, Kn) increases from 60 to 93 (respectively, 45 to 162). In Theorem 3.4, we characterized the asymptotic size of Cmax(n; µ, Kn, dn), namely, |Cmax(n; µ, Kn, dn)| ≥ n − dn − xn whp for any xn = ω(1) satisfying xn > (1+ε)dn/(⟨Kn⟩−1) ∀ε > 0. Motivated by this, we propose a heuristic lower bound on the size of the giant component given by |Cmax(n; µ, Kn, dn)| ≥ n − dn − dn/(⟨Kn⟩−1) for finite n and we plot this in Figure 4. We observe that this lower bound becomes tighter as µ is decreased or Kn is increased. Note that the minimum size of the largest connected component observed across 100,000 experiments stays above the corresponding lower bound for all values of µ and Kn, reinforcing the usefulness of our results for finite node regimes.

Fig. 4. Average and minimum number of nodes contained in the largest connected component (Cmax) of G(n; µ, Kn, dn) across 100,000 independent experiments with (i) n = 1000, dn = 20, Kn = 2, and µ ∈ {0.1, . . . , 0.9} and (ii) n = 5000, dn = 70, µ = 0.9, and Kn ∈ {2, . . . , 10}. We also plot the heuristic lower bound on the size of the largest connected component, as inferred from Theorem 3.4.

IV. PROOF OF THEOREM 3.2 AND COROLLARY 3.3

A. Proof of Theorem 3.2

In this section, we provide a proof sketch for Theorem 3.2. We start by defining a cut.

Definition 4.1 (Cut): [31, Definition 6.3] Consider a graph G with the node set N. A cut is defined as a non-empty subset S ⊂ N of nodes that is isolated from the rest of the graph. Namely, S ⊂ N is a cut if there is no edge between S and S^c = N \ S.

It is clear from Definition 4.1 that if S is a cut, then so is S^c. It is important to note the distinction between a cut as defined above and the notion of a connected component given in Definition 3.1. A connected component is isolated from the rest of the nodes by Definition 3.1 and therefore it is also a cut. However, nodes within a cut may not be connected, meaning that not every cut is a connected component. Let En(µ, Kn; S) denote the event that S ⊂ N is a cut in H(n; µ, Kn) as per Definition 4.1. The event En(µ, Kn; S) occurs if no nodes in S pick neighbors in S^c, and no nodes in S^c pick neighbors in S. Thus, we have

\[
E_n(\mu, K_n; S) = \bigcap_{i \in S} \bigcap_{j \in S^c} \left( \{ i \notin \Gamma_{n,j} \} \cap \{ j \notin \Gamma_{n,i} \} \right).
\]

Let Zn(xn; µ, Kn) denote the event that H(n; µ, Kn) has no cut S ⊂ N with size xn ≤ |S| ≤ n − xn where x : N0 → N0 is a sequence such that xn ≤ n/2 ∀n. In other words, Zn(xn; µ, Kn) is the event that there are no cuts in H(n; µ, Kn) whose size falls in the range [xn, n − xn]. If S is a cut, then so is S^c (i.e., if there is a cut of size m then there must be a cut of size n − m), therefore we see that

\[
Z_n(x_n; \mu, K_n) = \bigcap_{S \in \mathcal{P}_n :\; x_n \le |S| \le \lfloor n/2 \rfloor} \left( E_n(\mu, K_n; S) \right)^c, \tag{10}
\]

where Pn is the collection of all non-empty subsets of N. Taking the complement of both sides in (10) and then using a union bound we get

\[
\mathbb{P}\left[ (Z_n(x_n; \mu, K_n))^c \right]
\le \sum_{S \in \mathcal{P}_n :\; x_n \le |S| \le \lfloor n/2 \rfloor} \mathbb{P}[E_n(\mu, K_n; S)]
= \sum_{r = x_n}^{\lfloor n/2 \rfloor} \left( \sum_{S \in \mathcal{P}_{n,r}} \mathbb{P}[E_n(\mu, K_n; S)] \right), \tag{11}
\]

where Pn,r denotes the collection of all subsets of N with exactly r elements. For each r = 1, . . . , n, we simplify the notation by writing En,r(µ, Kn) = En(µ, Kn; {v1, . . . , vr}). From the exchangeability of the node labels and associated random variables, we get

\[
\mathbb{P}[E_n(\mu, K_n; S)] = \mathbb{P}[E_{n,r}(\mu, K_n)], \quad S \in \mathcal{P}_{n,r}.
\]

Noting that |Pn,r| = \binom{n}{r}, we obtain

\[
\sum_{S \in \mathcal{P}_{n,r}} \mathbb{P}[E_n(\mu, K_n; S)] = \binom{n}{r}\, \mathbb{P}[E_{n,r}(\mu, K_n)].
\]

Substituting into (11) we obtain

\[
\mathbb{P}\left[ (Z_n(x_n; \mu, K_n))^c \right] \le \sum_{r = x_n}^{\lfloor n/2 \rfloor} \binom{n}{r}\, \mathbb{P}[E_{n,r}(\mu, K_n)]. \tag{12}
\]

Our next result presents an upper bound on P[(Zn(M; µ, Kn))^c], i.e., the probability that there exists a cut with size in the range [M, n − M] for H(n; µ, Kn).
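As an illustration of the graph model, and of the kind of experiment described in Section III-C, the following minimal Python sketch samples one realization of H(n; µ, Kn), optionally deletes dn nodes chosen uniformly at random, and returns the size of the largest connected component using a union-find structure. The function names (`sample_picks`, `largest_component_size`), the seed, and the small trial count are illustrative assumptions; this is not the authors' experimental code.

```python
import random
from collections import Counter


def sample_picks(n, mu, k, rng):
    """Return, for each node, the list of nodes it picks.

    A node is type-1 with probability mu (1 pick) and type-2 otherwise
    (k picks); picks are distinct and uniform over the other n - 1 nodes.
    """
    picks = []
    for i in range(n):
        m = 1 if rng.random() < mu else k
        # Sample from 0..n-2 and shift indices >= i to skip node i itself.
        raw = rng.sample(range(n - 1), m)
        picks.append([j if j < i else j + 1 for j in raw])
    return picks


def largest_component_size(n, mu, k, rng, deleted=0):
    """Size of the largest connected component after deleting `deleted`
    uniformly chosen nodes (deleted < n); edge orientation is ignored."""
    picks = sample_picks(n, mu, k, rng)
    removed = set(rng.sample(range(n), deleted))
    parent = list(range(n))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    for i, targets in enumerate(picks):
        if i in removed:
            continue
        for j in targets:
            if j in removed:
                continue
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[ri] = rj

    sizes = Counter(find(i) for i in range(n) if i not in removed)
    return max(sizes.values())


if __name__ == "__main__":
    rng = random.Random(2024)  # arbitrary seed, for repeatability only
    n, mu, k, trials = 1000, 0.9, 2, 100  # far fewer trials than the paper
    worst = min(largest_component_size(n, mu, k, rng) for _ in range(trials))
    print("max nodes outside the largest component:", n - worst)
```

Passing `deleted=20` with `n=1000` gives the corresponding quantity for G(n; µ, Kn, dn); the output depends on the seed and trial count, so no particular value is claimed here.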

Proposition 4.2: Consider a scaling K : N0 → N0 (see footnote 1) such that Kn ≥ 2 ∀n, Kn = O(1), and µ ∈ (0, 1). It holds that

\[
\mathbb{P}\left[ (Z_n(M; \mu, K_n))^c \right]
\le \frac{\exp\{-M(\langle K_n \rangle - 1)(1 - o(1))\}}{1 - \exp\{-(\langle K_n \rangle - 1)(1 - o(1))\}} + o(1). \tag{13}
\]

Footnote 1: We refer to any mapping K : N → N as a scaling if it satisfies the condition 2 ≤ Kn < n, n = 2, 3, . . ..

The proof of Proposition 4.2 is presented in Section V. The succeeding Lemma establishes the relevance of the event Zn(xn; µ, Kn) in obtaining a lower bound for the size of the largest connected component.

Lemma 4.3: For any sequence x : N0 → N0 such that xn ≤ ⌊n/3⌋ for all n, we have

\[
Z_n(x_n; \mu, K_n) \implies |C_{\max}(n; \mu, K_n)| > n - x_n.
\]

Proof of Lemma 4.3: Assume that Zn(xn; µ, Kn) takes place, i.e., there is no cut in H(n; µ, Kn) of size in the range [xn, n − xn]. Since a connected component is also a cut, this also means that there is no connected component of size in the range [xn, n − xn]. Since every graph has at least one connected component, it either holds that the largest one has size |Cmax(n; µ, Kn)| > n − xn, or that |Cmax(n; µ, Kn)| < xn. We now show that it must be the case that |Cmax(n; µ, Kn)| > n − xn under the assumption that xn ≤ n/3. Assume towards a contradiction that |Cmax(n; µ, Kn)| < xn, meaning that the size of each connected component is less than xn. Note that the union of any set of connected components is either a cut, or it spans the entire network. If no cut exists with size in the range [xn, n − xn], then the union of any set of connected components should also have a size outside of [xn, n − xn]. Also, the union of all connected components has size n. Let C1, C2, . . . , Cmax denote the set of connected components in increasing size order. Let m ≥ 1 be the largest integer such that \sum_{i=1}^{m} |C_i| < x_n. Since |Cm+1| < xn, we have

\[
x_n \le \sum_{i=1}^{m+1} |C_i| < x_n + x_n \le 2n/3 \le n - x_n.
\]

This means that \cup_{i=1}^{m+1} C_i constitutes a cut with size in the range [xn, n − xn], contradicting the event Zn(xn; µ, Kn). We thus conclude that if Zn(xn; µ, Kn) takes place with xn ≤ n/3, then we must have |Cmax(n; µ, Kn)| > n − xn.

We now have all the requisite ingredients for establishing Theorem 3.2. Substituting xn = M, ∀n in Lemma 4.3 for any positive integer M satisfying M ≤ n/3, we get

\[
Z_n(M; \mu, K_n) \implies |C_{\max}(n; \mu, K_n)| > n - M.
\]

Equivalently, we have |Cmax(n; µ, Kn)| ≤ n − M ⟹ Zn(M; µ, Kn)^c. This gives for all M ≤ n/3,

\[
\mathbb{P}\left[ |C_{\max}(n; \mu, K_n)| \le n - M \right] \le \mathbb{P}\left[ Z_n(M; \mu, K_n)^c \right]. \tag{14}
\]

Combining (14) with Proposition 4.2, we get for all M ≤ n/3,

\[
\mathbb{P}\left[ |C_{\max}(n; \mu, K_n)| \le n - M \right]
\le \frac{\exp\{-M(\langle K_n \rangle - 1)(1 - o(1))\}}{1 - \exp\{-(\langle K_n \rangle - 1)(1 - o(1))\}} + o(1). \tag{15}
\]

Note that for all M, N ∈ N0 such that M ≤ N, we have |Cmax(n; µ, Kn)| ≤ n − N ⟹ |Cmax(n; µ, Kn)| ≤ n − M. This implies that for each M = 1, 2, . . ., the upper bound (5) holds true, completing the proof of Theorem 3.2.

B. Proof of Corollary 3.3

Consider an arbitrary sequence xn = ω(1). Substituting M with xn in (5), we readily see that

\[
\lim_{n \to \infty} \mathbb{P}\left[ n - |C_{\max}(n; \mu, K_n)| \le x_n \right] = 1. \tag{16}
\]

Namely, we have

\[
n - |C_{\max}(n; \mu, K_n)| \le x_n \ \text{whp for any } x_n = \omega(1). \tag{17}
\]

This is equivalent to the number of nodes (n − |Cmax(n; µ, Kn)|) outside the largest connected component being bounded, i.e., O(1), with high probability. This fact is sometimes stated using the probabilistic big-O notation, Op. A random sequence fn = Op(1) if for any ε > 0 there exist finite integers M(ε) and n(ε) such that P[fn > M(ε)] < ε for all n ≥ n(ε). In fact, we see from [43, Lemma 3] that (17) is equivalent to having n − |Cmax(n; µ, Kn)| = Op(1). Here, we equivalently state this as

\[
n - |C_{\max}(n; \mu, K_n)| = O(1) \ \text{whp},
\]

giving readily (6).

V. PROOF OF PROPOSITION 4.2

A. Useful facts

For 0 ≤ x < 1 and for a sequence y = 0, 1, 2, . . ., it is known [44, Fact 2] that

\[
1 - xy \le (1 - x)^y \le 1 - xy + \tfrac{1}{2} x^2 y^2. \tag{18}
\]

For all x ∈ R, we have

\[
1 \pm x \le e^{\pm x}. \tag{19}
\]

For 0 ≤ m ≤ n1 ≤ n2, m, n1, n2 ∈ N0,

\[
\frac{\binom{n_1}{m}}{\binom{n_2}{m}} = \prod_{i=0}^{m-1} \frac{n_1 - i}{n_2 - i} \le \left( \frac{n_1}{n_2} \right)^m. \tag{20}
\]

From [18, Fact 4.1], we have that for r = 1, 2, . . . , ⌊n/2⌋,

\[
\binom{n}{r} \le \left( \frac{n}{r} \right)^r \left( \frac{n}{n - r} \right)^{n - r}. \tag{21}
\]

For dn ≤ n, this gives

\[
\binom{n - d_n}{r} \le \left( \frac{n - d_n}{r} \right)^r \left( \frac{n - d_n}{n - d_n - r} \right)^{n - d_n - r}. \tag{22}
\]
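The facts above are elementary, and a quick numerical spot check can help guard against transcription slips when reusing them. The sketch below evaluates both sides of (18), (20) and (21) over small parameter grids; the helper names and the floating-point tolerances are assumptions made for this illustration, and the check is of course no substitute for the cited proofs.

```python
import math


def fact18_holds(x, y, tol=1e-12):
    """Check 1 - x*y <= (1 - x)**y <= 1 - x*y + (x*y)**2 / 2 for 0 <= x < 1."""
    lower = 1 - x * y
    middle = (1 - x) ** y
    upper = 1 - x * y + 0.5 * (x * y) ** 2
    return lower <= middle + tol and middle <= upper + tol


def fact20_holds(m, n1, n2, tol=1e-12):
    """Check C(n1, m) / C(n2, m) <= (n1 / n2)**m for 0 <= m <= n1 <= n2."""
    ratio = math.comb(n1, m) / math.comb(n2, m)
    return ratio <= (n1 / n2) ** m + tol


def fact21_holds(n, r, tol=1e-9):
    """Check C(n, r) <= (n/r)**r * (n/(n-r))**(n-r), compared in the log domain."""
    log_lhs = math.log(math.comb(n, r))
    log_rhs = r * math.log(n / r) + (n - r) * math.log(n / (n - r))
    return log_lhs <= log_rhs + tol


if __name__ == "__main__":
    ok18 = all(fact18_holds(x / 10, y) for x in range(10) for y in range(30))
    ok20 = all(
        fact20_holds(m, n1, n2)
        for n2 in range(1, 25)
        for n1 in range(n2 + 1)
        for m in range(n1 + 1)
    )
    ok21 = all(fact21_holds(n, r) for n in range(2, 200) for r in range(1, n // 2 + 1))
    print(ok18, ok20, ok21)
```

Fact (22) is (21) with n replaced by n − dn, so `fact21_holds(n - d, r)` covers it whenever r ≤ (n − dn)/2.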

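B. Proof of Proposition 4.2

In view of (12), the proof for Proposition 4.2 will follow upon showing

\[
\sum_{r=M}^{\lfloor n/2 \rfloor} \binom{n}{r}\, \mathbb{P}[E_{n,r}(\mu, K_n)]
\le \frac{\exp\{-M(\langle K_n \rangle - 1)(1 - o(1))\}}{1 - \exp\{-(\langle K_n \rangle - 1)(1 - o(1))\}} + o(1). \tag{23}
\]

We have

\begin{align}
\binom{n}{r}\, \mathbb{P}[E_{n,r}(\mu, K_n)]
&= \binom{n}{r} \left( \mu\, \frac{n-r-1}{n-1} + (1-\mu)\, \frac{\binom{n-r-1}{K_n}}{\binom{n-1}{K_n}} \right)^{n-r}
   \left( \mu\, \frac{r-1}{n-1} + (1-\mu)\, \frac{\binom{r-1}{K_n}}{\binom{n-1}{K_n}} \right)^{r} \notag \\
&\le \binom{n}{r} \left( \mu \left( 1 - \frac{r}{n-1} \right) + (1-\mu) \left( 1 - \frac{r}{n-1} \right)^{K_n} \right)^{n-r}
   \left( \mu\, \frac{r-1}{n-1} + (1-\mu) \left( \frac{r-1}{n-1} \right)^{K_n} \right)^{r} \tag{24} \\
&\le \binom{n}{r} \left( \mu \left( 1 - \frac{r}{n} \right) + (1-\mu) \left( 1 - \frac{r}{n} \right)^{K_n} \right)^{n-r}
   \left( \mu\, \frac{r}{n} + (1-\mu) \left( \frac{r}{n} \right)^{K_n} \right)^{r} \notag \\
&\le \left( \frac{n}{r} \right)^{r} \left( \frac{n}{n-r} \right)^{n-r} \left( 1 - \frac{r}{n} \right)^{n-r} \left( \frac{r}{n} \right)^{r}
   \left( \mu + (1-\mu) \left( 1 - \frac{r}{n} \right)^{K_n - 1} \right)^{n-r}
   \left( \mu + (1-\mu) \left( \frac{r}{n} \right)^{K_n - 1} \right)^{r} \tag{25} \\
&= \left( \mu + (1-\mu) \left( 1 - \frac{r}{n} \right)^{K_n - 1} \right)^{n}
   \left( \frac{\mu + (1-\mu) \left( \frac{r}{n} \right)^{K_n - 1}}{\mu + (1-\mu) \left( 1 - \frac{r}{n} \right)^{K_n - 1}} \right)^{r} \notag \\
&\le \left( \mu + (1-\mu) \left( 1 - \frac{r}{n} \right)^{K_n - 1} \right)^{n}, \tag{26}
\end{align}

where (24) uses (20), (25) follows from (21) and (26) is plain from the observation that r/n ≤ 1/2.

We divide the summation in (23) into two parts depending on whether r exceeds n/log n. The steps outlined below can be used to upper bound the summation in (23) for an arbitrary splitting of the summation indices.

\[
\sum_{r=M}^{\lfloor n/2 \rfloor} \binom{n}{r}\, \mathbb{P}[E_{n,r}(\mu, K_n)]
= \sum_{r=M}^{\lfloor n/\log n \rfloor} \binom{n}{r}\, \mathbb{P}[E_{n,r}(\mu, K_n)]
+ \sum_{r=\lfloor n/\log n \rfloor + 1}^{\lfloor n/2 \rfloor} \binom{n}{r}\, \mathbb{P}[E_{n,r}(\mu, K_n)]. \tag{27}
\]

We first upper bound each term in the summation with indices in the range M ≤ r ≤ ⌊n/log n⌋.

Range 1: M ≤ r ≤ ⌊n/log n⌋

\begin{align}
\binom{n}{r}\, \mathbb{P}[E_{n,r}(\mu, K_n)]
&\le \left( \mu + (1-\mu) \left( 1 - \frac{r}{n} \right)^{K_n - 1} \right)^{n} \notag \\
&= \left( 1 - (1-\mu) \left( 1 - \left( 1 - \frac{r}{n} \right)^{K_n - 1} \right) \right)^{n}. \tag{28}
\end{align}

For r ≤ ⌊n/log n⌋, we have r/n = o(1). Using Fact (18) with x = r/n we get

\begin{align}
\binom{n}{r}\, \mathbb{P}[E_{n,r}(\mu, K_n)]
&\le \left( 1 - (1-\mu) \left( 1 - \left( 1 - \frac{r(K_n - 1)}{n} + \frac{r^2 (K_n - 1)^2}{2 n^2} \right) \right) \right)^{n} \notag \\
&= \left( 1 - (1-\mu)\, \frac{r(K_n - 1)}{n} \left( 1 - \frac{r(K_n - 1)}{2n} \right) \right)^{n}. \tag{29}
\end{align}

Using r ≤ n/log n, (19) and that Kn = O(1), we obtain

\begin{align}
\binom{n}{r}\, \mathbb{P}[E_{n,r}(\mu, K_n)]
&\le \left( 1 - (1-\mu)\, \frac{r(K_n - 1)}{n} \left( 1 - \frac{K_n - 1}{2 \log n} \right) \right)^{n} \notag \\
&\le \exp\left\{ -(1-\mu)\, r (K_n - 1) \left( 1 - \frac{K_n - 1}{2 \log n} \right) \right\} \notag \\
&= \exp\{ -r (1-\mu)(K_n - 1)(1 - o(1)) \} \notag \\
&= \exp\{ -r (\langle K_n \rangle - 1)(1 - o(1)) \}. \tag{30}
\end{align}

Next, we upper bound the second term in the summation (27) with indices in the range ⌊n/log n⌋ + 1 ≤ r ≤ ⌊n/2⌋.

Range 2: ⌊n/log n⌋ + 1 ≤ r ≤ ⌊n/2⌋

Observe that

\begin{align}
\binom{n}{r}\, \mathbb{P}[E_{n,r}(\mu, K_n)]
&\le \left( \mu + (1-\mu) \left( 1 - \frac{r}{n} \right)^{K_n - 1} \right)^{n} \notag \\
&\le \left( \mu + (1-\mu) \left( 1 - \frac{r}{n} \right) \right)^{n} \tag{31} \\
&= \left( 1 - (1-\mu)\, \frac{r}{n} \right)^{n} \notag \\
&\le \exp\{ -r(1-\mu) \}, \tag{32}
\end{align}

where (31) follows from noting that Kn ≥ 2 and (32) is a consequence of (19). Finally, we use (30) and (32) in (27) as follows.

\begin{align}
\sum_{r=M}^{\lfloor n/2 \rfloor} \binom{n}{r}\, \mathbb{P}[E_{n,r}(\mu, K_n)]
&= \sum_{r=M}^{\lfloor n/\log n \rfloor} \binom{n}{r}\, \mathbb{P}[E_{n,r}(\mu, K_n)]
 + \sum_{r=\lfloor n/\log n \rfloor + 1}^{\lfloor n/2 \rfloor} \binom{n}{r}\, \mathbb{P}[E_{n,r}(\mu, K_n)] \notag \\
&\le \sum_{r=M}^{\lfloor n/\log n \rfloor} e^{-r(\langle K_n \rangle - 1)(1 - o(1))}
 + \sum_{r=\lfloor n/\log n \rfloor + 1}^{\lfloor n/2 \rfloor} e^{-r(1-\mu)} \notag \\
&\le \sum_{r=M}^{\infty} e^{-r(\langle K_n \rangle - 1)(1 - o(1))}
 + \sum_{r=\lfloor n/\log n \rfloor + 1}^{\infty} e^{-r(1-\mu)}. \notag
\end{align}

Observe that both of the above geometric series have all terms strictly less than one, and thus they are summable. This gives

\begin{align}
\sum_{r=M}^{\lfloor n/2 \rfloor} \binom{n}{r}\, \mathbb{P}[E_{n,r}(\mu, K_n)]
&\le \frac{e^{-M(\langle K_n \rangle - 1)(1 - o(1))}}{1 - e^{-(\langle K_n \rangle - 1)(1 - o(1))}}
 + \frac{e^{-(1-\mu)\, n/\log n}}{1 - e^{-(1-\mu)}} \notag \\
&= \frac{e^{-M(\langle K_n \rangle - 1)(1 - o(1))}}{1 - e^{-(\langle K_n \rangle - 1)(1 - o(1))}} + o(1). \notag
\end{align}

VI. PROOF OF THEOREM 3.4 AND COROLLARY 3.5

A. Proof of Theorem 3.4

We extend the framework developed in Section IV-A to analyze the giant component in G(n; µ, Kn, dn). Recall that in Section II, we defined G(n; µ, Kn, dn) as the subgraph of H(n; µ, Kn) induced by the subset of nodes contained in N \ D, where D ⊂ N, |D| = dn and D is selected uniformly at random from all possible subsets of N with cardinality dn. Let En(µ, Kn, dn; S) denote the event that S ⊂ N \ D is a cut in G(n; µ, Kn, dn) as per Definition 4.1. The event En(µ, Kn, dn; S) occurs if nodes in S pick neighbors only in S ∪ D, and nodes in S^c pick neighbors only in S^c ∪ D. Let Zn(xn; µ, Kn, dn) denote the event that G(n; µ, Kn, dn) contains no cut of size in the range [xn, n − dn − xn]. From Lemma 4.3, it follows that if Zn(xn; µ, Kn, dn) occurs, then the size of the largest connected component of G(n; µ, Kn, dn) is greater than n − dn − xn whp. We have

\[
Z_n(x_n; \mu, K_n, d_n) = \bigcap_{S \in \mathcal{P}_n :\; x_n \le |S| \le \lfloor (n - d_n)/2 \rfloor} \left( E_n(\mu, K_n, d_n; S) \right)^c,
\]

where Pn is the collection of all non-empty subsets of N \ D. Using a union bound, we get

\begin{align}
\mathbb{P}\left[ Z_n(x_n; \mu, K_n, d_n)^c \right]
&\le \sum_{S \in \mathcal{P}_n :\; x_n \le |S| \le \lfloor (n - d_n)/2 \rfloor} \mathbb{P}[E_n(\mu, K_n, d_n; S)] \notag \\
&= \sum_{r = x_n}^{\lfloor (n - d_n)/2 \rfloor} \left( \sum_{S \in \mathcal{P}_{n,r}} \mathbb{P}[E_n(\mu, K_n, d_n; S)] \right), \tag{33}
\end{align}

where Pn,r denotes the collection of all subsets of N \ D with exactly r elements. For each r = 1, . . . , ⌊(n − dn)/2⌋, let En,r(µ, Kn, dn) = En(µ, Kn, dn; {v1, . . . , vr}). From the exchangeability of the associated random variables, we have ∀S ∈ Pn,r,

\[
\mathbb{P}[E_n(\mu, K_n, d_n; S)] = \mathbb{P}[E_{n,r}(\mu, K_n, d_n)].
\]

Substituting in (33), we get

\begin{align}
\mathbb{P}\left[ Z_n(x_n; \mu, K_n, d_n)^c \right]
&\le \sum_{r = x_n}^{\lfloor (n - d_n)/2 \rfloor} \left( \sum_{S \in \mathcal{P}_{n,r}} \mathbb{P}[E_{n,r}(\mu, K_n, d_n)] \right) \notag \\
&= \sum_{r = x_n}^{\lfloor (n - d_n)/2 \rfloor} \binom{n - d_n}{r}\, \mathbb{P}[E_{n,r}(\mu, K_n, d_n)]. \tag{34}
\end{align}

Next, we find conditions on xn under which the summation (34) decays to zero. Through the summation (34) we seek an upper bound on the probability of the event (Zn(xn; µ, Kn, dn))^c, which through Lemma 4.3 gives an upper bound on the probability that xn or more nodes lie outside the largest connected component of G(n; µ, Kn, dn). As specified in Theorem 3.4, throughout, we assume that xn > (1+ε)dn/(⟨Kn⟩−1) and xn = o(n). Therefore, for a given dn such that dn = o(n), our goal is to find the smallest xn that drives the summation in (34) to o(1). However, note that there is a trade-off while choosing xn. If we choose a smaller xn that drives the summation in (34) to o(1), although we are guaranteed a larger giant component whp through Lemma 4.3, we are including more terms in the summation and the resulting upper bound on the summation can decay at a slower rate. The Proposition below gives an upper bound on P[(Zn(xn; µ, Kn, dn))^c].

Proposition 6.1: Consider a scaling K : N0 → N0 such that Kn ≥ 2 ∀n, Kn = O(1), and µ ∈ (0, 1) with dn = o(n). For any x : N0 → N0 such that xn > (1+ε)dn/(⟨Kn⟩−1) ∀n and xn = o(n), we have

\[
\mathbb{P}\left[ (Z_n(x_n; \mu, K_n, d_n))^c \right]
\le \frac{e^{-x_n (\langle K_n \rangle - 1) \frac{\epsilon}{1+\epsilon} (1 - o(1))}}{1 - e^{-(\langle K_n \rangle - 1) \frac{\epsilon}{1+\epsilon} (1 - o(1))}}
 + \frac{e^{-x_n (1-\mu) \frac{\epsilon}{1+\epsilon} (1 - o(1))}}{1 - e^{-(1-\mu) \frac{\epsilon}{1+\epsilon} (1 - o(1))}}.
\]

The proof of Proposition 6.1 is presented in Section VII. We present an alternate upper bound under the stronger condition that xn > (1+ε)dn/(1−µ) ∀n in the Appendix.

Invoking Lemma 4.3 for the graph G(n; µ, Kn, dn), for any sequence x : N0 → N0 such that xn ≤ ⌊(n − dn)/3⌋ for all n, we have

\[
Z_n(x_n; \mu, K_n, d_n) \implies |C_{\max}(n; \mu, K_n, d_n)| > n - d_n - x_n, \tag{35}
\]

which implies

\[
\mathbb{P}\left[ |C_{\max}(n; \mu, K_n, d_n)| \le n - d_n - x_n \right] \le \mathbb{P}\left[ (Z_n(x_n; \mu, K_n, d_n))^c \right]. \tag{36}
\]

Combining (36) with Proposition 6.1, we get for all x : N0 → N0 such that (1+ε)dn/(⟨Kn⟩−1) < xn ≤ ⌊(n − dn)/3⌋ ∀n and xn = o(n),

\[
\mathbb{P}\left[ |C_{\max}(n; \mu, K_n, d_n)| \le n - d_n - x_n \right]
\le \frac{e^{-x_n (\langle K_n \rangle - 1) \frac{\epsilon}{1+\epsilon} (1 - o(1))}}{1 - e^{-(\langle K_n \rangle - 1) \frac{\epsilon}{1+\epsilon} (1 - o(1))}}
 + \frac{e^{-x_n (1-\mu) \frac{\epsilon}{1+\epsilon} (1 - o(1))}}{1 - e^{-(1-\mu) \frac{\epsilon}{1+\epsilon} (1 - o(1))}}.
\]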
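To get a feel for the threshold (1+ε)dn/(⟨Kn⟩−1) on xn used above, the short sketch below evaluates it, together with the heuristic finite-n bound n − dn − dn/(⟨Kn⟩−1) from Section III-C, for the simulation parameters quoted there. It assumes ⟨Kn⟩ = µ + (1 − µ)Kn, i.e., the mean number of selections per node, which is consistent with the identity (1 − µ)(Kn − 1) = ⟨Kn⟩ − 1 used in (30); the function names are hypothetical.

```python
def mean_selections(mu, k):
    """Assumed form of <K_n>: type-1 nodes pick 1 node, type-2 nodes pick k."""
    return mu * 1 + (1 - mu) * k


def deletion_threshold(d, mu, k, eps=0.0):
    """(1 + eps) * d_n / (<K_n> - 1); requires k >= 2 and mu < 1."""
    return (1 + eps) * d / (mean_selections(mu, k) - 1)


def heuristic_lower_bound(n, d, mu, k):
    """Heuristic bound n - d_n - d_n / (<K_n> - 1) on the giant component size."""
    return n - d - deletion_threshold(d, mu, k)


if __name__ == "__main__":
    # Parameter pairs taken from the simulation setup of Section III-C.
    for n, d in [(1000, 20), (5000, 70)]:
        print(n, d, round(heuristic_lower_bound(n, d, mu=0.9, k=2), 3))
```

For µ = 0.9 and Kn = 2 the gap ⟨Kn⟩ − 1 is only 0.1, so the bound evaluates to about 780 for (n, dn) = (1000, 20) and about 4230 for (5000, 70); increasing Kn or decreasing µ enlarges the gap and raises the bound.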

For arbitrary sequences y : N0 → N0 such that xn ≤ yn ∀n, we have |Cmax(n; µ, Kn, dn)| ≤ n − dn − yn ⟹ |Cmax(n; µ, Kn, dn)| ≤ n − dn − xn, and therefore for all sequences xn such that xn > (1+ε)dn/(⟨Kn⟩−1) ∀n, the upper bound (7) holds true; this completes the proof of Theorem 3.4.

B. Proof of Corollary 3.5

i) When dn = O(1), the condition xn > (1+ε)dn/(⟨Kn⟩−1) is satisfied for sufficiently large n for any xn = ω(1). Substituting xn = ω(1) in (7), we get for dn = O(1),

\[
\lim_{n \to \infty} \mathbb{P}\left[ n - d_n - |C_{\max}(n; \mu, K_n, d_n)| \le x_n \right] = 1.
\]

Thus, for dn = O(1), we have for any xn = ω(1) that n − dn − |Cmax(n; µ, Kn, dn)| ≤ xn whp, which is equivalent to the number of nodes outside the largest connected component in G(n; µ, Kn, dn) being bounded, i.e., n − dn − |Cmax(n; µ, Kn, dn)| = O(1) whp. Since dn itself is bounded, we get |Cmax(n; µ, Kn, dn)| = n − O(1) whp.

ii) We have already analyzed the case dn = O(1) above. Now, we consider the case dn = o(n) and dn = ω(1). Using (7), for any xn = ω(1) satisfying xn > (1+ε)dn/(⟨Kn⟩−1), we have

\[
\lim_{n \to \infty} \mathbb{P}\left[ n - d_n - |C_{\max}(n; \mu, K_n, d_n)| \le x_n \right] = 1.
\]

Upon selecting xn = (1+ε)dn/(⟨Kn⟩−1) + δ ∀n, where δ > 0, note that xn satisfies both xn = ω(1) and xn > (1+ε)dn/(⟨Kn⟩−1) for dn = o(n) and dn = ω(1), yielding

\[
|C_{\max}(n; \mu, K_n, d_n)| \ge n - d_n - \frac{(1+\epsilon) d_n}{\langle K_n \rangle - 1} - \delta \ \text{whp}.
\]

Therefore, for dn = o(n), it holds that |Cmax(n; µ, Kn, dn)| = n(1 − o(1)) whp.

VII. PROOF OF PROPOSITION 6.1

A. Preliminaries

In order to find a vanishing upper bound on summation (34), we first upper bound the individual terms as follows.

\begin{align}
\binom{n - d_n}{r}\, \mathbb{P}[E_{n,r}(\mu, K_n, d_n)]
&= \binom{n - d_n}{r} \left( \mu\, \frac{r + d_n - 1}{n-1} + (1-\mu)\, \frac{\binom{r + d_n - 1}{K_n}}{\binom{n-1}{K_n}} \right)^{r}
   \left( \mu\, \frac{n-r-1}{n-1} + (1-\mu)\, \frac{\binom{n-r-1}{K_n}}{\binom{n-1}{K_n}} \right)^{n - d_n - r} \notag \\
&\le \binom{n - d_n}{r} \left( \mu\, \frac{r + d_n - 1}{n-1} + (1-\mu) \left( \frac{r + d_n - 1}{n-1} \right)^{K_n} \right)^{r}
   \left( \mu\, \frac{n-r-1}{n-1} + (1-\mu) \left( \frac{n-r-1}{n-1} \right)^{K_n} \right)^{n - d_n - r} \tag{37} \\
&\le \binom{n - d_n}{r} \left( \frac{r + d_n}{n} \right)^{r} \left( \mu + (1-\mu) \left( \frac{r + d_n}{n} \right)^{K_n - 1} \right)^{r}
   \left( \frac{n-r}{n} \right)^{n - d_n - r} \left( \mu + (1-\mu) \left( \frac{n-r}{n} \right)^{K_n - 1} \right)^{n - d_n - r} \tag{38} \\
&= \binom{n - d_n}{r} \left( \frac{r + d_n}{n} \right)^{r} \left( \frac{n-r}{n} \right)^{n - d_n - r}
   \left( \mu + (1-\mu) \left( \frac{n-r}{n} \right)^{K_n - 1} \right)^{n - d_n}
   \left( \frac{\mu + (1-\mu) \left( \frac{r + d_n}{n} \right)^{K_n - 1}}{\mu + (1-\mu) \left( \frac{n-r}{n} \right)^{K_n - 1}} \right)^{r} \notag \\
&\le \binom{n - d_n}{r} \left( \frac{r + d_n}{n} \right)^{r} \left( \frac{n-r}{n} \right)^{n - d_n - r}
   \left( \mu + (1-\mu) \left( 1 - \frac{r}{n} \right)^{K_n - 1} \right)^{n - d_n}, \tag{39}
\end{align}

where (37) follows from (20), (38) follows from noting that r ≤ (n − dn)/2 ⟹ r + dn ≤ n − r < n, and (39) follows from the observation that µ + (1 − µ)x^{Kn−1} is an increasing function in x where x > 0 combined with the fact that r + dn ≤ n − r. We now prove a bound to simplify (39).

\begin{align}
\binom{n - d_n}{r} \left( \frac{r + d_n}{n} \right)^{r} \left( \frac{n-r}{n} \right)^{n - d_n - r}
&\le \left( \frac{n - d_n}{r} \right)^{r} \left( \frac{n - d_n}{n - d_n - r} \right)^{n - d_n - r}
   \left( \frac{r + d_n}{n} \right)^{r} \left( \frac{n-r}{n} \right)^{n - d_n - r} \tag{40} \\
&= \left( \frac{n - d_n}{n} \right)^{r} \left( \frac{r + d_n}{r} \right)^{r}
   \left( \frac{n - d_n}{n} \right)^{n - d_n - r} \left( \frac{n - r}{n - d_n - r} \right)^{n - d_n - r} \notag \\
&= \left( \frac{r + d_n}{r} \right)^{r} \left( \frac{n - d_n}{n} \right)^{n - d_n} \left( \frac{n - r}{n - d_n - r} \right)^{n - d_n - r} \notag \\
&= \left( 1 + \frac{d_n}{r} \right)^{r} \left( 1 - \frac{d_n}{n} \right)^{n - d_n} \left( 1 + \frac{d_n}{n - d_n - r} \right)^{n - d_n - r} \notag \\
&\le \exp\left\{ d_n - d_n \left( 1 - \frac{d_n}{n} \right) + d_n \right\} \tag{41} \\
&= \exp\left\{ d_n \left( 1 + \frac{d_n}{n} \right) \right\}, \tag{42}
\end{align}

where (40) and (41) follow from (22) and (19) respectively. Combining (39) and (42), we get

\[
\binom{n - d_n}{r}\, \mathbb{P}[E_{n,r}(\mu, K_n, d_n)]
\le \exp\left\{ d_n \left( 1 + \frac{d_n}{n} \right) \right\}
\left( \mu + (1-\mu) \left( 1 - \frac{r}{n} \right)^{K_n - 1} \right)^{n - d_n}. \tag{43}
\]

Recall that Theorem 3.4 deals with the case dn = o(n) and throughout we assume xn > (1+ε)dn/(⟨Kn⟩−1) ∀n and xn = o(n), for a fixed ε > 0. We define f : N0 → N0 to be

\[
f_n = \max\left\{ x_n, \frac{(1+\epsilon) d_n}{1 - \mu} \right\} \quad \forall n. \tag{44}
\]

Note that under the enforced assumptions dn = o(n) and xn = o(n), by definition (44), fn = o(n).

Remark: While defining fn through (44), we can pick any δ > 0 such that fn = max{xn, (1+δ)dn/(1−µ)} ∀n and the same sequence of steps can be used to complete the proof for Theorem 3.4. For concreteness we choose δ = ε yielding (44).

Next, we split the summation (34) as follows:

\[
\sum_{r = x_n}^{\lfloor (n - d_n)/2 \rfloor} \binom{n - d_n}{r}\, \mathbb{P}[E_{n,r}(\mu, K_n, d_n)]
= \sum_{r = x_n}^{\lfloor f_n \rfloor} \binom{n - d_n}{r}\, \mathbb{P}[E_{n,r}(\mu, K_n, d_n)]
+ \sum_{r = \lfloor f_n \rfloor + 1}^{\lfloor (n - d_n)/2 \rfloor} \binom{n - d_n}{r}\, \mathbb{P}[E_{n,r}(\mu, K_n, d_n)]. \tag{45}
\]

B. Upper bound for xn ≤ r ≤ ⌊fn⌋

For xn ≤ r ≤ ⌊fn⌋, we first simplify (43) using Fact (18) with x = r/n to get

\begin{align}
\left( \mu + (1-\mu) \left( 1 - \frac{r}{n} \right)^{K_n - 1} \right)^{n - d_n}
&= \left( 1 - (1-\mu) \left( 1 - \left( 1 - \frac{r}{n} \right)^{K_n - 1} \right) \right)^{n \left( 1 - \frac{d_n}{n} \right)} \notag \\
&\le \left( 1 - (1-\mu) \left( 1 - \left( 1 - \frac{r(K_n - 1)}{n} + \frac{r^2 (K_n - 1)^2}{2 n^2} \right) \right) \right)^{n \left( 1 - \frac{d_n}{n} \right)} \tag{46} \\
&= \left( 1 - (1-\mu)\, \frac{r(K_n - 1)}{n} \left( 1 - \frac{r(K_n - 1)}{2n} \right) \right)^{n \left( 1 - \frac{d_n}{n} \right)} \notag \\
&\le \exp\left\{ -(1-\mu)\, r (K_n - 1) \left( 1 - \frac{r(K_n - 1)}{2n} \right) \left( 1 - \frac{d_n}{n} \right) \right\} \notag \\
&= \exp\left\{ -r (\langle K_n \rangle - 1) \left( 1 - \frac{r(K_n - 1)}{2n} \right) \left( 1 - \frac{d_n}{n} \right) \right\}, \tag{47}
\end{align}

where (46) and (47) follow from Fact (18) and (19) respectively. Combining (43) and (47), for xn ≤ r ≤ ⌊fn⌋, we have

\[
\binom{n - d_n}{r}\, \mathbb{P}[E_{n,r}(\mu, K_n, d_n)]
\le \exp\left\{ d_n \left( 1 + \frac{d_n}{n} \right) - r (\langle K_n \rangle - 1) \left( 1 - \frac{r(K_n - 1)}{2n} \right) \left( 1 - \frac{d_n}{n} \right) \right\}. \tag{48}
\]

Using (1+ε)dn/(⟨Kn⟩−1) < xn ≤ r ≤ ⌊fn⌋ together with fn = o(n), dn = o(n) and Kn = O(1), for xn ≤ r ≤ ⌊fn⌋ we have

\begin{align}
\binom{n - d_n}{r}\, \mathbb{P}[E_{n,r}(\mu, K_n, d_n)]
&\le \exp\left\{ d_n \left( 1 + \frac{d_n}{n} \right) - r (\langle K_n \rangle - 1)(1 - o(1)) \right\} \notag \\
&= \exp\left\{ -(\langle K_n \rangle - 1)\, r \left[ 1 - o(1) - \frac{d_n}{(\langle K_n \rangle - 1)\, r} \left( 1 + \frac{d_n}{n} \right) \right] \right\} \notag \\
&\le \exp\left\{ -(\langle K_n \rangle - 1)\, r \left[ 1 - o(1) - \frac{1}{1+\epsilon} \left( 1 + \frac{d_n}{n} \right) \right] \right\} \notag \\
&= \exp\left\{ -r (\langle K_n \rangle - 1)\, \frac{\epsilon}{1+\epsilon}\, (1 - o(1)) \right\}. \tag{49}
\end{align}

C. Upper bound for ⌊fn⌋ + 1 ≤ r ≤ (n − dn)/2

From (43), for ⌊fn⌋ + 1 ≤ r ≤ ⌊(n − dn)/2⌋ we have

\begin{align}
\binom{n - d_n}{r}\, \mathbb{P}[E_{n,r}(\mu, K_n, d_n)]
&\le \exp\left\{ d_n \left( 1 + \frac{d_n}{n} \right) \right\} \left( \mu + (1-\mu) \left( 1 - \frac{r}{n} \right)^{K_n - 1} \right)^{n - d_n} \notag \\
&\le \exp\left\{ d_n \left( 1 + \frac{d_n}{n} \right) \right\} \left( \mu + (1-\mu) \left( 1 - \frac{r}{n} \right) \right)^{n - d_n} \tag{50} \\
&= \exp\left\{ d_n \left( 1 + \frac{d_n}{n} \right) \right\} \left( 1 - (1-\mu)\, \frac{r}{n} \right)^{n \left( 1 - \frac{d_n}{n} \right)} \notag \\
&\le \exp\left\{ d_n \left( 1 + \frac{d_n}{n} \right) - (1-\mu)\, r \left( 1 - \frac{d_n}{n} \right) \right\} \tag{51} \\
&= \exp\left\{ -(1-\mu)\, r \left[ 1 - \frac{d_n}{n} - \frac{d_n}{(1-\mu)\, r} \left( 1 + \frac{d_n}{n} \right) \right] \right\} \notag \\
&\le \exp\left\{ -(1-\mu)\, r \left[ 1 - \frac{d_n}{n} - \frac{1}{1+\epsilon} \left( 1 + \frac{d_n}{n} \right) \right] \right\} \notag \\
&= \exp\left\{ -r (1-\mu) \left[ \frac{\epsilon}{1+\epsilon} - \frac{2+\epsilon}{1+\epsilon} \cdot \frac{d_n}{n} \right] \right\}, \tag{52}
\end{align}

where (50) is a consequence of Kn ≥ 2 ∀n, (51) follows from (19) and (52) is immediate from r ≥ ⌊fn⌋ + 1 > (1+ε)dn/(1−µ). Using dn = o(n), for ⌊fn⌋ + 1 ≤ r ≤ ⌊(n − dn)/2⌋ we have

\[
\binom{n - d_n}{r}\, \mathbb{P}[E_{n,r}(\mu, K_n, d_n)]
\le \exp\left\{ -r (1-\mu)\, \frac{\epsilon}{1+\epsilon}\, (1 - o(1)) \right\}. \tag{53}
\]

D. Proof of Proposition 6.1

Now, using (49) and (53), we can upper bound summation (45) as follows

\begin{align}
\sum_{r = x_n}^{\lfloor (n - d_n)/2 \rfloor} \binom{n - d_n}{r}\, \mathbb{P}[E_{n,r}(\mu, K_n, d_n)]
&= \sum_{r = x_n}^{\lfloor f_n \rfloor} \binom{n - d_n}{r}\, \mathbb{P}[E_{n,r}(\mu, K_n, d_n)]
 + \sum_{r = \lfloor f_n \rfloor + 1}^{\lfloor (n - d_n)/2 \rfloor} \binom{n - d_n}{r}\, \mathbb{P}[E_{n,r}(\mu, K_n, d_n)] \notag \\
&\le \sum_{r = x_n}^{\lfloor f_n \rfloor} e^{-r (\langle K_n \rangle - 1) \frac{\epsilon}{1+\epsilon} (1 - o(1))}
 + \sum_{r = \lfloor f_n \rfloor + 1}^{\lfloor (n - d_n)/2 \rfloor} e^{-r (1-\mu) \frac{\epsilon}{1+\epsilon} (1 - o(1))} \notag \\
&\le \sum_{r = x_n}^{\infty} e^{-r (\langle K_n \rangle - 1) \frac{\epsilon}{1+\epsilon} (1 - o(1))}
 + \sum_{r = \lfloor f_n \rfloor + 1}^{\infty} e^{-r (1-\mu) \frac{\epsilon}{1+\epsilon} (1 - o(1))}. \notag
\end{align}

Both of the above geometric series have all terms strictly less than one, and thus they are summable. This gives

\begin{align}
\sum_{r = x_n}^{\lfloor (n - d_n)/2 \rfloor} \binom{n - d_n}{r}\, \mathbb{P}[E_{n,r}(\mu, K_n, d_n)]
&\le \frac{e^{-x_n (\langle K_n \rangle - 1) \frac{\epsilon}{1+\epsilon} (1 - o(1))}}{1 - e^{-(\langle K_n \rangle - 1) \frac{\epsilon}{1+\epsilon} (1 - o(1))}}
 + \frac{e^{-(\lfloor f_n \rfloor + 1)(1-\mu) \frac{\epsilon}{1+\epsilon} (1 - o(1))}}{1 - e^{-(1-\mu) \frac{\epsilon}{1+\epsilon} (1 - o(1))}} \notag \\
&\le \frac{e^{-x_n (\langle K_n \rangle - 1) \frac{\epsilon}{1+\epsilon} (1 - o(1))}}{1 - e^{-(\langle K_n \rangle - 1) \frac{\epsilon}{1+\epsilon} (1 - o(1))}}
 + \frac{e^{-x_n (1-\mu) \frac{\epsilon}{1+\epsilon} (1 - o(1))}}{1 - e^{-(1-\mu) \frac{\epsilon}{1+\epsilon} (1 - o(1))}}, \tag{54}
\end{align}

where (54) is a consequence of fn ≥ xn by (44).

VIII. CONCLUSIONS

This work analyzes the existence and size of the giant component for inhomogeneous random K-out graphs. In particular, we prove that whenever Kn ≥ 2, there exists a giant component of size n − O(1) whp. Moreover, even when a randomly chosen finite subset of nodes is deleted from the network, our results show that a giant component of size n − O(1) persists for all Kn ≥ 2 whp. We also establish the existence of a giant component of size n(1 − o(1)) whp when dn = o(n) nodes are randomly deleted from the network. This work goes beyond existing results on connectivity in inhomogeneous random K-out graphs that require Kn = ω(1) by showing that a bounded choice of Kn is sufficient to obtain a giant component of size n − O(1). In particular, our results further strengthen the applicability of random K-out graphs as an efficient way to construct a connected network topology in situations where resources are scarce and some nodes may fail or get compromised. An open direction of research is to characterize the asymptotic size of the largest connected component of the homogeneous random K-out graph when K = 1, which is not known to the best of our knowledge. It is also of interest to pursue further applications of random K-out graphs in decentralized learning algorithms and cryptographic payment channel networks.

REFERENCES

[1] M. Sood and O. Yağan, “On the size of the giant component in inhomogeneous random k-out graphs,” in 2020 59th IEEE Conference on Decision and Control (CDC), 2020, pp. 5592–5597.
[2] A.-L. Barabási, . Cambridge university press, 2016.
[3] M. E. Newman, D. J. Watts, and S. H. Strogatz, “Random graph models of social networks,” Proceedings of the National Academy of Sciences, vol. 99, no. suppl 1, pp. 2566–2572, 2002.
[4] M. E. J. Newman, “Spread of epidemic disease on networks,” Phys. Rev. E, vol. 66, p. 016128, Jul 2002.
[5] T. I. Fenner and A. M. Frieze, “On the connectivity of random m-orientable graphs and digraphs,” Combinatorica, vol. 2, no. 4, pp. 347–359, Dec 1982.
[6] B. Bollobás, Random graphs. Cambridge university press, 2001, vol. 73.
[7] O. Yağan and A. M. Makowski, “On the connectivity of sensor networks under random pairwise key predistribution,” IEEE Transactions on Information Theory, vol. 59, no. 9, pp. 5754–5762, Sept 2013.
[8] M. Sood and O. Yağan, “Tight bounds for the probability of connectivity in random k-out graphs,” in 2021 IEEE International Conference on Communications (ICC), 2021.
[9] G. Fanti, S. B. Venkatakrishnan, S. Bakshi, B. Denby, S. Bhargava, A. Miller, and P. Viswanath, “Dandelion++: Lightweight cryptocurrency networking with formal anonymity guarantees,” Proc. ACM Meas. Anal. Comput. Syst., vol. 2, no. 2, pp. 29:1–29:35, Jun. 2018.
[10] C. Sabater, A. Bellet, and J. Ramon, “Distributed differentially private averaging with improved utility and robustness to malicious parties,” arXiv preprint arXiv:2006.07218, 2020.
[11] O. Yağan and A. M. Makowski, “Modeling the pairwise key predistribution scheme in the presence of unreliable links,” Information Theory, IEEE Transactions on, vol. 59, no. 3, pp. 1740–1760, March 2013.
[12] F. Yavuz, J. Zhao, O. Yağan, and V. Gligor, “k-connectivity in random k-out graphs intersecting Erdős-Rényi graphs,” IEEE Transactions on Information Theory, vol. 63, no. 3, pp. 1677–1692, 2017.
[13] F. Yavuz, J. Zhao, O. Yağan, and V. Gligor, “Toward k-connectivity of the random graph induced by a pairwise key predistribution scheme with unreliable links,” IEEE Transactions on Information Theory, vol. 61, no. 11, pp. 6251–6271, 2015.
[14] H. Chan, A. Perrig, and D. Song, “Random key predistribution schemes for sensor networks,” in 2003 Symposium on Security and Privacy, 2003. IEEE, 2003, pp. 197–213.
[15] L. Eschenauer and V. D. Gligor, “A key-management scheme for distributed sensor networks,” in Proceedings of the 9th ACM Conference on Computer and Communications Security, ser. CCS ’02. New York, NY, USA: ACM, 2002, pp. 41–47. [Online]. Available: http://doi.acm.org/10.1145/586110.586117
[16] S. R. Jino Ramson and D. J. Moni, “Applications of wireless sensor networks — a survey,” in 2017 International Conference on Innovations in Electrical, Electronics, Instrumentation and Media Technology (ICEEIMT), 2017, pp. 325–329.
[17] P. Kairouz, H. B. McMahan, B. Avent, A. Bellet, M. Bennis, A. N. Bhagoji, K. Bonawitz, Z. Charles, G. Cormode, R. Cummings et al., “Advances and open problems in federated learning,” arXiv preprint arXiv:1912.04977, 2019.
[18] R. Eletreby and O. Yağan, “Connectivity of wireless sensor networks secured by the heterogeneous random pairwise key predistribution scheme,” in 2018 IEEE Conference on Decision and Control (CDC), 2018, pp. 2085–2092.
[19] R. Eletreby and O. Yağan, “Connectivity of inhomogeneous random k-out graphs,” IEEE Transactions on Information Theory, vol. 66, no. 11, pp. 7067–7080, 2020.
[20] X. Du, Y. Xiao, M. Guizani, and H.-H. Chen, “An effective key management scheme for heterogeneous sensor networks,” Ad Hoc Networks, vol. 5, no. 1, pp. 24–34, 2007.
[21] R. Eletreby and O. Yağan, “k-connectivity of inhomogeneous random key graphs with unreliable links,” IEEE Transactions on Information Theory, vol. 65, no. 6, pp. 3922–3949, June 2019.
[22] R. Eletreby and O. Yağan, “Connectivity of wireless sensor networks secured by heterogeneous key predistribution under an on/off channel model,” IEEE Transactions on Control of Network Systems, 2018.
[23] O. Yağan, “Zero-one laws for connectivity in inhomogeneous random key graphs,” IEEE Transactions on Information Theory, vol. 62, no. 8, pp. 4559–4574, Aug 2016.
[24] A.-L. Barabási and R. Albert, “Emergence of scaling in random networks,” Science, vol. 286, no. 5439, pp. 509–512, 1999.
[25] S. Boccaletti, V. Latora, Y. Moreno, M. Chavez, and D.-U. Hwang, “Complex networks: Structure and dynamics,” Physics reports, vol. 424, no. 4-5, pp. 175–308, 2006.
[26] K. Lu, Y. Qian, M. Guizani, and H.-H. Chen, “A framework for a distributed key management scheme in heterogeneous wireless sensor networks,” IEEE Transactions on Wireless Communications, vol. 7, no. 2, pp. 639–647, February 2008.
[27] C.-H. Wu and Y.-C. Chung, “Heterogeneous wireless sensor network deployment and topology control based on irregular sensor model,” in Advances in Grid and Pervasive Computing, 2007, pp. 78–88.
[28] M. Yarvis, N. Kushalnagar, H. Singh, A. Rangarajan, Y. Liu, and S. Singh, “Exploiting heterogeneity in sensor networks,” in Proceedings IEEE 24th Annual Joint Conference of the IEEE Computer and Communications Societies., vol. 2, March 2005, pp. 878–890 vol. 2.
[29] G. Mao, Connectivity of communication networks. Springer, 2017.
[30] J. Hwang and Y. Kim, “Revisiting random key pre-distribution schemes for wireless sensor networks,” in Proceedings of the 2nd ACM workshop on Security of ad hoc and sensor networks. ACM, 2004, pp. 43–52.
[31] A. Mei, A. Panconesi, and J. Radhakrishnan, “Unassailable sensor networks,” in Proc. of the 4th International Conference on Security and Privacy in Communication Networks, ser. SecureComm ’08. New York, NY, USA: ACM, 2008.
[32] X. Liu, “Coverage with connectivity in wireless sensor networks,” in 2006 3rd International Conference on Broadband Communications, Networks and Systems. IEEE, 2006, pp. 1–8.
[33] P. Erdős and A. Rényi, “On the evolution of random graphs,” Publ. Math. Inst. Hung. Acad. Sci, vol. 5, no. 1, pp. 17–60, 1960.
[34] E. C. Elumar, M. Sood, and O. Yağan, “On the connectivity and giant component size of random k-out graphs under randomly deleted nodes,” arXiv preprint arXiv:2103.01471, 2021.
[35] M. Sood and O. Yağan, “On the minimum node degree and k-connectivity in inhomogeneous random k-out graphs,” IEEE Transactions on Information Theory, pp. 1–1, 2021.
[36] P. Dellenbach, A. Bellet, and J. Ramon, “Hiding in the crowd: A massively distributed algorithm for private averaging with malicious adversaries,” arXiv preprint arXiv:1803.09984, 2018.
[37] A. Perrig, P. Szalachowski, R. M. Reischuk, and L. Chuat, SCION: a secure Internet architecture. Springer, 2017.
[38] S. Sridhara, F. Wirz, J. de Ruiter, C. Schutijser, M. Legner, and A. Perrig, “Global distributed secure mapping of network addresses,” in Proceedings of the ACM SIGCOMM Workshop on Technologies, Applications, and Uses of a Responsible Internet (TAURIN), 2021.
[39] J. Poon and T. Dryja, “The bitcoin lightning network: Scalable off-chain instant payments,” 2016.
[40] J. A. Bondy, U. S. R. Murty et al., with applications. Macmillan London, 1976, vol. 290.
[41] R. Van Der Hofstad, Random graphs and complex networks. Cambridge university press, 2016, vol. 1.
[42] K. Rybarczyk, “Diameter, connectivity, and of the uniform random intersection graph,” Discrete Mathematics, vol. 311, no. 17, pp. 1998–2019, 2011.
[43] S. Janson, “Probability asymptotics: notes on notation,” arXiv preprint arXiv:1108.3924, 2011.
[44] J. Zhao, O. Yağan, and V. Gligor, “k-connectivity in random key graphs with unreliable links,” IEEE Transactions on Information Theory, vol. 61, no. 7, pp. 3810–3836, July 2015.
[45] K. Rybarczyk, “Sharp threshold functions for random intersection graphs via a coupling method,” The Electronic Journal of Combinatorics, vol. 18, no. 1, p. 36, 2011.
[46] J. Zhao, O. Yağan, and V. Gligor, “On connectivity and robustness in random intersection graphs,” IEEE Transactions on Automatic Control, vol. 62, no. 5, pp. 2121–2136, May 2017.

APPENDIX

A. Inhomogeneous random K-out graph with r classes

We now discuss more general constructions of the inhomogeneous random K-out graph with an arbitrary number of node types, denoted by r [19]. We let each node belong to type-i independently with probability µi > 0 for i = 1, . . . , r with \sum_{i=1}^{r} µi = 1. Each type-i node gets paired with Ki,n other nodes, chosen uniformly at random from among all other nodes where 1 ≤ K1,n < K2,n < . . . < Kr,n. An undirected edge is drawn between a pair of nodes if either of them chooses the other. We let the resulting inhomogeneous random graph with r classes be denoted by I(n; µ^r, K^r_n) where K^r_n = [K1,n, K2,n, . . . , Kr,n] and µ^r = [µ1, µ2, . . . , µr] with µi > 0 for i = 1, . . . , r. Let Cmax(n; µ^r, K^r_n) denote the largest connected component of I(n; µ^r, K^r_n).

Corollary 8.1: If Kr,n ≥ 2 ∀n then for the inhomogeneous random K-out graph I(n; µ^r, K^r_n) with r node types, we have

\[
|C_{\max}(n; \mu^r, K^r_n)| = n - O(1) \ \text{whp}.
\]

For a given r, throughout, we let H(n; µ̃, Kr,n) denote the inhomogeneous random graph with 2 node types and parameters µ̃ = \sum_{i=1}^{r-1} µi and Kn = Kr,n. For proving Corollary 8.1, we use the following proposition.

Proposition 8.2: Let K^r_n = [K1,n, K2,n, . . . , Kr,n] and µ^r = [µ1, µ2, . . . , µr] with µi > 0 for i = 1, . . . , r. For any monotone-increasing property P, i.e., a property which holds upon addition of edges to the graph (see [45, p. 13], [46]), we have

\[
\mathbb{P}\left[ \mathbb{I}(n; \mu^r, K^r_n) \text{ has property } \mathcal{P} \right]
\ge \mathbb{P}\left[ \mathbb{H}(n; \tilde{\mu}, K_{r,n}) \text{ has property } \mathcal{P} \right], \tag{55}
\]

where µ̃ = \sum_{i=1}^{r-1} µi.

Proof of Proposition 8.2: The proof relies on proving the existence of a coupling under which H(n; µ̃, Kr,n) is a spanning subgraph of I(n; µ^r, K^r_n); i.e., we can generate an instantiation of I(n; µ^r, K^r_n) by adding edges to an instantiation of H(n; µ̃, Kr,n). Define µ̃ = \sum_{i=1}^{r-1} µi. Consider an instantiation of an inhomogeneous random graph H(n; µ̃, Kr,n) with two classes such that each of the n nodes is independently assigned as type-1 (respectively, type-2) with probability µ̃ (respectively, 1 − µ̃) and then type-1 (respectively, type-2) nodes draw edges to 1 (respectively, Kr,n) nodes chosen uniformly at random. From this instantiation, we can generate an instantiation of I(n; µ^r, K^r_n) as follows. First, let each type-1 node be independently reassigned as type-i with probability µi/µ̃ for i = 1, 2, . . . , r − 1. Next, for i = 2, . . . , r − 1, let each type-i node pick Ki,n − 1 additional neighbors that were not chosen by it initially. Next, we draw an undirected edge between each pair of nodes where at least one picked the other. Clearly, this process creates a graph whose edge set is a superset of the edge set of the realization of H(n; µ̃, Kr,n) that we started with. In addition, in the new graph, the probability of a node picking Ki,n other nodes (i.e., being type-i) is given by µ̃ · (µi/µ̃) = µi for i = 1, 2, . . . , r − 1, while the nodes that pick Kr,n other nodes are exactly the original type-2 nodes, which occur with probability 1 − µ̃ = µr. We thus conclude that the new graph obtained constitutes a realization of I(n; µ^r, K^r_n). Since the initial realization of H(n; µ̃, Kr,n) was arbitrary, this establishes the desired coupling between the graphs H(n; µ̃, Kr,n) and I(n; µ^r, K^r_n), as the edge set of H(n; µ̃, Kr,n) is contained in the edge set of I(n; µ^r, K^r_n). Therefore (55) holds for any property that is monotone increasing upon edge addition.

Proof of Corollary 8.1: The proof of Corollary 8.1 uses Proposition 8.2 applied to the property |Cmax| ≥ n − xn, which is monotone increasing upon edge addition since adding more edges to the network cannot decrease the size of the giant component. Combining (55) with (16), applied to H(n; µ̃, Kr,n), we get for any xn = ω(1),

\[
\lim_{n \to \infty} \mathbb{P}\left[ |C_{\max}(n; \mu^r, K^r_n)| \ge n - x_n \right]
\ge \lim_{n \to \infty} \mathbb{P}\left[ |C_{\max}(n; \tilde{\mu}, K_{r,n})| \ge n - x_n \right] = 1,
\]

or equivalently, n − |Cmax(n; µ^r, K^r_n)| ≤ xn whp for any xn = ω(1),

(56) B. Mean node degree in H(n; µ, Kn)

r r The probability that node vi picks node vj where vi, vj ∈ N and thus |Cmax(n;µ ,KK n)| = n − O(1) whp. depends on the type of node vi and is given by Using Proposition 8.2 and Theorem 3.2, we can also derive 1 K hK i [v ∈ Γ (v )] = µ + (1 − µ) n = n . (62) an upper bound on the probability that more than M nodes P j n i n − 1 n − 1 n − 1 r r are outside of the largest component for I(n;µ ,KK n). Corollary 8.3: For the inhomogeneous random graph Let vi ∼ vj denote the event that node vi is adjacent to r r node j in H(n; µ, Kn). I(n;µ ,KK n) with r node types, if Kr,n ≥ 2 ∀n and Kr,n = O(1) then for each M = 1, 2,..., it holds that P[i ∼ j] = 1 − (1 − P[vi ∈ Γn(vj)])(1 − P[j ∈ Γn(vi)]), r r  2 P [|Cmax(n;µ ,KK n)| ≤ n − M] hKni = 1 − 1 − exp{−M(Kr,n − 1)(µr)(1 − o(1))} n − 1 ≤ + o(1). (57) 2 1 − exp{−(Kr,n − 1)(µr)(1 − o(1))} 2hK i  hK i  = n − n . (63) n − 1 n − 1 Proof of Corollary 8.3: The proof of Corollary 8.3 uses the coupling argument described in the proof of Proposition 8.2. Consequently, the mean degree of node i, when Kn = o(n) r r is (1 − o(1))2hK i. For a given r, I(n;µ ,KK n) denotes the inhomogeneous ran- n dom graph with Kn = [K1,n,K2,n,...,Kr,n] and µ = [µ , µ , . . . , µ ]. Consider the inhomogeneous random graph 1 2 r C. Upper bound on P[Cmax(n; µ, Kn, dn) ≤ n − dn − xn] Pr−1 (1+)d H(n;µ, ˜ Kr,n) with 2 node types and parameters µ˜ = i=1 µi n with xn > (1−µ) ∀n and Kn = Kr,n and let Cmax(n;µ, ˜ Kr,n) denote its largest connected component. Using (55) for the property that the Observe that hKni−1 = (Kn−1)(1−µ) ≥ (1−µ)∀Kn ≥ 2. largest connected component is of size greater than n − M In this discussion, we derive a sharper upper bound than (1+)dn (7), but it requires the stronger condition of xn > together with (5), and using hKni − 1 = (Kn − 1)(1 − µ), we (1−µ) (1+)dn as opposed to xn > . 
Note that here we have have (hKni−1) (1+)dn r r < xn ≤ r and therefore following the sequence P [|Cmax(n;µ ,KK n)| > n − M] (hKni−1) of steps from (50) to (52), for dn = o(n) and xn ≤ r ≤ ≥ P [|Cmax(n;µ, ˜ Kr,n)| > n − M] (n − dn)/2, we get exp{−M(K − 1)(1 − µ˜)(1 − o(1))} ≥ 1 − r,n − o(1)   1 − exp{−(K − 1)(1 − µ˜)(1 − o(1))} n − dn r,n P[En,r(µ, Kn, dn)] exp{−M(K − 1)(µ )(1 − o(1))} r = 1 − r,n r − o(1), (58) ( ) 1 − exp{−(K − 1)(µ )(1 − o(1))}  r,n r ≤ exp − r(1 − µ) (1 − o(1)) . and subtracting both sides of (58) from 1 yields (57). 1 +  Using the above argument, we can also characterize the size of Thus, we can upper bound the summation (34) as follows. r r the giant component of Cmax(n;µ ,KK n) under random node P∗ b(n−dn)/2c   deletions. Let denote the property that after a subset of X n − dn [E (µ, K )] the node set N of size dn is selected uniformly at random r P n,r n and deleted, the largest connected component in the resulting r=xn r r b(n−dn)/2c graph, denoted by Cmax(n;µ ,KK , dn), is of size greater than n X −r(1−µ)  (1−o(1)) ∗ ≤ e 1+ or equal to n−xn. It is easy to see that P , as defined above, is a monotone-increasing property in edge addition and therefore r=xn ∞ we can invoke Proposition 8.2 together with Corollary 3.5 to X −r(1−µ)  (1−o(1)) ≤ e 1+ show that with dn = o(n), if Kr,n ≥ 2 ∀n and Kr,n = O(1), r=xn i) if dn = O(1), then −x (1−µ)  (1−o(1)) e n 1+ = . r r −(1−µ)  (1−o(1)) (64) |Cmax(n;µ ,KK n, dn)| = n − O(1) whp, (59) 1 − e 1+

ii) and if dn = ω(1) and dn = o(n), then (1+)dn Note that with the condition xn > (1−µ) , we do not benefit r r from splitting the summation at the threshold fn as defined in |Cmax(n;µ ,KK n, dn)| = n(1 − o(1)) whp.r (60) (44). Combining (64) with (35) we get Next, following the arguments used for proving Corollary 8.3, together with Theorem 3.4, we get for dn = o(n), if Kr,n ≥ P[|Cmax(n; µ, Kn, dn)| ≤ n − dn − xn] −x (1−µ)  (1−o(1)) 2 ∀n and Kr,n = O(1), we have e n 1+ ≤  . (65) r r −(1−µ) 1+ (1−o(1)) P [|Cmax(n;µ ,KK n, dn)| ≤ n − dn − xn] 1 − e −x (K −1)(µ )  (1−o(1)) −x (µ )  (1−o(1)) e n r,n r 1+ e n r 1+ Although, (65) presents an improvement over the upper bound ≤ + . −(K −1)(µ )  (1−o(1)) −(µ )  (1−o(1)) on [|C (n; µ, K , d )| ≤ n − d − x ] presented in 1 − e r,n r 1+ 1 − e r 1+ P max n n n n (61) Theorem 3.4, it comes at a cost, namely, requiring a larger xn 16

 (1+)dn  xn > (1−µ) for the summation to decay to 0. We present

 (1+)dn  the version of the bound with the condition xn > (hKni−1) since it captures the dependence on both Kn annd µ, requires a smaller value for xn as compared to (65) and yet offers an upper bound on P[|Cmax(n; µ, Kn, dn)| ≤ n − dn − xn] that decays to 0 exponentially fast with xn.
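The coupling used in the proof of Proposition 8.2 (Appendix A) can be made concrete with a small simulation. The following is a minimal sketch and not the authors' code: the function names, the seed, and the parameter values (n = 200, three node types with K = [1, 2, 3]) are illustrative assumptions. It samples the picks of a two-type graph, extends them to an r-type graph by reassigning type-1 nodes and adding picks, and exposes both edge sets so that the subgraph relation can be checked. As a small generalization of the proof, extra picks are added for every type i ≤ r − 1, which adds nothing for type-1 nodes when K_1 = 1.

```python
import random


def sample_two_type_picks(n, mu_tilde, k_r, rng):
    """Two-type graph: type-1 nodes pick 1 node, type-2 nodes pick k_r nodes."""
    types, picks = [], []
    for v in range(n):
        t = 1 if rng.random() < mu_tilde else 2
        others = [u for u in range(n) if u != v]
        picks.append(set(rng.sample(others, 1 if t == 1 else k_r)))
        types.append(t)
    return types, picks


def couple_to_r_types(n, types, picks, mus, ks, rng):
    """Extend two-type picks to r-type picks, keeping every original pick."""
    r = len(mus)
    new_types, new_picks = [], []
    for v in range(n):
        chosen = set(picks[v])
        if types[v] == 2:
            t = r  # type-2 nodes already hold K_r picks
        else:
            # Reassign to type i with probability mu_i / mu_tilde, i = 1..r-1
            # (random.choices normalizes the weights).
            t = rng.choices(range(1, r), weights=mus[:-1])[0]
            remaining = [u for u in range(n) if u != v and u not in chosen]
            # Add K_i - 1 fresh picks (zero extra picks when K_i = 1).
            chosen |= set(rng.sample(remaining, ks[t - 1] - 1))
        new_types.append(t)
        new_picks.append(chosen)
    return new_types, new_picks


def edge_set(picks):
    """Undirected edges: {u, v} is present if either endpoint picked the other."""
    return {frozenset((v, u)) for v, chosen in enumerate(picks) for u in chosen}


def largest_component(n, edges):
    """Size of the largest connected component, via union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for e in edges:
        a, b = tuple(e)
        parent[find(a)] = find(b)
    sizes = {}
    for v in range(n):
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values())


# Illustrative parameters (assumptions, not taken from the paper).
rng = random.Random(0)
n, mus, ks = 200, [0.5, 0.3, 0.2], [1, 2, 3]
types, picks = sample_two_type_picks(n, sum(mus[:-1]), ks[-1], rng)
new_types, new_picks = couple_to_r_types(n, types, picks, mus, ks, rng)
h_edges, i_edges = edge_set(picks), edge_set(new_picks)
h_giant = largest_component(n, h_edges)
i_giant = largest_component(n, i_edges)
```

Because every original pick is retained, `h_edges` is a subset of `i_edges` for any seed, so `i_giant` can never be smaller than `h_giant`; this is the monotonicity that (55) relies on.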