15 Netsci Configuration Model.Key

15 Netsci Configuration Model.Key

Configuring random graph models with fixed degree sequences Daniel B. Larremore Santa Fe Institute June 21, 2017 NetSci [email protected] @danlarremore Brief note on references: This talk does not include references to literature, which are numerous and important. Most (but not all) references are included in the arXiv paper: arxiv.org/abs/1608.00607 Stochastic models, sets, and distributions • a generative model is just a recipe: choose parameters→make the network • a stochastic generative model is also just a recipe: choose parameters→draw a network • since a single stochastic generative model can generate many networks, the model itself corresponds to a set of networks. • and since the generative model itself is some combination or composition of random variables, a random graph model is a set of possible networks, each with an associated probability, i.e., a distribution. this talk: configuration models: uniform distributions over networks w/ fixed deg. seq. Why care about random graphs w/ fixed degree sequence? Since many networks have broad or peculiar degree sequences, these random graph distributions are commonly used for: Hypothesis testing: Can a particular network’s properties be explained by the degree sequence alone? Modeling: How does the degree distribution affect the epidemic threshold for disease transmission? Null model for Modularity, Stochastic Block Model: Compare an empirical graph with (possibly) community structure to the ensemble of random graphs with the same vertex degrees. e a loopy multigraphs c d simple simple loopy graphs multigraphs graphs Stub Matching tono multiedgesdraw from theno config. self-loops model ~k = 1, 2, 2, 1 b { } 3 3 4 3 4 3 4 4 multigraphs 2 5 2 5 2 5 2 5 1 1 6 6 1 6 1 6 3 3 the 4standard algorithm:4 3 4 4 loopy and/or 5 5 draw2 from5 the distribution2 5 by sequential2 “Stub Matching”2 1 1 6 6 1 6 1 6 1. initialize each node n with kn half-edges or stubs. 2. choose two stubs uniformly at random and join to form an edge. graph isomorph. vertex-labeled stub-labeled e a loopy multigraphs c d simple simple loopy graphs multigraphs graphs no multiedges no self-loops Stub Matching to draw from the config. model b 3 3 4 3 4 3 4 4 multigraphs 2 5 2 5 2 5 2 5 1 1 draw #1 6 6 1 6 1 6 3 3 4 4 3 4 4 loopy and/or 5 5 2 5 2 5 2 2 1 1 6 6 1 6 1 6 draw #2 graph isomorph. vertex-labeled stub-labeled a b c d loopy multigraphs simple a b c d Are these two different networks? or the same network? loopy multigraphs simple Are stubs distinguishable or not? simple The rest of this talk: the answer matters. loopy graphs graphs multigraphs multigraphs simple loopy graphs graphs multigraphs no multiedges no self-loops multigraphs loopy and/or no multiedges no self-loops graph isomorph. vertex-labeled stub-labeled loopy and/or graph isomorph. vertex-labeled stub-labeled The distribution according to stub-matching a b c d loopy multigraphs When we draw a graph usingsimple stub matching, this is the set of graphs that we uniformly sample. 8 of the graphs are simple, while the other 7 simple loopy graphs have self-loops or multiedges. graphs multigraphs multigraphs We therefore say that stub matching uniformly samples space of stub-labeled loopy multigraphs. no multiedges no self-loops Note, however, that this is notloopy and/or a uniform sample over adjacency matrices (rows). graph isomorph. vertex-labeled stub-labeled The importance of uniform distributions remove vertex labels remove stub labels a b c d loopy multigraphs simple e a loopy multigraphs c d simple simple loopy graphs multigraphs graphs simple loopy graphs graphs multigraphs no multiedges no self-loops multigraphs b no multiedges no self-loops 3 3 4 3 4 3 4 4 multigraphs 2 5 2 5 2 5 2 5 loopy and/or 1 1 6 6 1 6 1 6 3 3 4 4 3 4 4 loopy and/or 5 5 graph isomorph. vertex-labeled stub-labeled 2 5 2 5 2 2 1 1 6 6 1 6 1 6 goal: provably uniform samplinggraph isomorph. for vertex-labeledall eight spaces: stub-labeled loopy{0,1} x multigraph{0,1} x {stub-,vertex-} Choosing a space for your configuration model Question 1: loops? Question 3: vertex- or stub-labeled? stub-labeled These configurations are . • two graphs • one graph, drawn two ways • one valid; one nonsensical simple loopy (skip Q3) • three graphs loopy multigraph • one graph, drawn three ways multigraph • one valid; two nonsensical Question 2: multiedges? vertex-labeled example: Are loops reasonable? Would a loop make sense? [tennis matches: no | author citations: yes] Sampling from configuration models stub matching samples uniformly from stub-labeled loopy multigraphs for other spaces, define a Markov chain over the “graph of graphs” G →each vertex is a graph, and directed edges are “double-edge swaps” swap this way or the other way NB: Sampling is easy. Provably uniform sampling is not! Markov chains for uniform sampling Prove that: • the transition matrix is doubly stochastic (G is regular) • the chain is irreducible (G is strongly connected) • the chain is aperiodic (G is aperiodic; gcd of all cycles is one) Straightforward for stub-labeled loopy multigraphs. Choose two edges uniformly at random and swap them. Accept all swaps and treat each resulting graph as a sample from the U distribution. (Each node in G has degree m-choose-2.) Easy for stub-labeled multigraphs. Choose two edges uniformly at random and swap them. Reject swaps that create a self-loop and resample the current graph. (Think of any “rejected swap” as a self-loop in G.) Easy for simple graphs. Choose two edges uniformly at random and swap them. Reject swaps that create a self-loop or multiedge and resample the current graph. (Again, treat “rejected swaps” as a self loops in G.) Markov chains for uniform sampling For vertex-labeled graphs, we inherit the strong connectedness of G as well as its aperiodicity. However, ensuring that the Markov chain has a uniform distribution as its stationary distribution requires that we adjust transition probabilities. a b unadjusted adjusted transitions transitions P = 1/3, 2/3 P = 1/2,1/2 These asymmetric modifications to transition probabilities depend on the number of self-loops and multiedges in the current state. decrease outflow (and increase resampling) Intuition: of graphs with multiedges or self-loops. Stub-labeled loopy graphs: not connected counterexample: no double-edge swap connects these two graphs! but see Nishimura 2017 (arxiv:1701.04888) - The connectivity of graphs of graphs with self-loops and a given degree sequence a b c d Do {stub labels, self-loops, multiedges} matter for how we sample CMs? yes loopy multigraphs simple showed that these spaces are far from introduced (and just outlined) equivalent, even in thermodynamic lim. provably uniform sampling methods. a b c d Do {stub labels, self-loops, multiedges} matter in applications of CMs? next… loopy multigraphssimple loopy graphs graphs multigraphs simple →hypothesis testing →null model for modularity multigraphs nosimple multiedges no self-loops loopy graphs graphs multigraphs loopy and/or multigraphs graph isomorph. vertex-labeled stub-labeled no multiedges no self-loops loopy and/or graph isomorph. vertex-labeled stub-labeled Hypothesis testing Do barn swallows tend to associate with other swallows of similar color? Data: bird interactions, bird colors. Compute color assortativity [correlation over edges] Choose a graph space for barn swallows QuestionQuestion 1: loops? 1: loops? QuestionQuestion 3: vertex- 3: vertex- or stub-labeled? or stub-labeled? stub-labeledstub-labeled Nonsensical TheseThese configurations configurations are . .are . vertex-labeled • two •graphs two graphs • one •graph, one graph, drawn drawn two ways two ways • one •valid; one valid; one nonsensical one nonsensical simplesimple loopyloopy (skip Q3)(skip Q3) [Why? If we interacted• three today• threegraphs and graphs yesterday, a randomization in loopyloopy multigraphmultigraph which my today interacts• one •graph, onewith graph, yourdrawn drawnyesterday three threeways waysis nonsensical!] Reasonable multigraphmultigraph • one •valid one ;valid two ;nonsensical two nonsensical [inQuestion 2: multiedges? our data,Question 2: multiedges? in fact!] vertex-labeledvertex-labeled This should be modeled as a vertex-labeled multigraph. Assortative pairing of barn swallows Stub-labeled Vertex-labeled 5 5 note: for simple graphs and statistics based on aphs the graph adjacency matrix, r 34 stub-labeled ≡ vert34 ex-labeled Sanity check: 2 2 Density p = 0.001 p = 0.001 should be = for simple 1 1 Simple g 0 0 −0.6 −0.4 −0.2 0.0 0.2 0.4 0.6 −0.6 −0.4 −0.2 0.0 0.2 0.4 0.6 r r 5 5 aphs r NONE of these is centered at zero. 234 234 Density p = 0.608 p = 0.852 Correct space is meaningfully different. Multig 01 01 −0.6 −0.4 −0.2 0.0 0.2 0.4 0.6 −0.6 −0.4 −0.2 0.0 0.2 0.4 0.6 r r Uniform sampling means we can compare empirical value to null distribution to draw scientific conclusions. The choice of graph space matters—careful choice & sampling can flip conclusions! Community Detection Are there groups of vertices that tend to associate more than Data: Maximize collaborationswe expect among by geometers chance with each other modularity calculate alignment scores A , e.g. C 1 ? 0.5 0 convert to alignment indicators . 1 a(t) 0 remove short aligned regions 1 0 234 extract highly variable regions 1 1 56789 D 0 0 200 400 600 alignment position t B NGDYKEKVSNNLRAIFNKIYENLNDPKLKKHYQKDAPNY 6 16 NGDYKKKVSNNLKTIFKKIYDALKDTVKETYKDDPNY 13 NGDYKEKVSNNLRAIFKKIYDALEDTVKETYKDDPNY 16 6 13 Coauthorship communities (vertex-labeled multigraph) expected number of edges in a

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    24 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us