Inference in networks
Cristopher Moore, Santa Fe Institute; joint work with Aaron Clauset, Mark Newman, Xiaoran Yan, Yaojia Zhu, Lenka Zdeborová, Florent Krzakala, Aurelien Decelle, Pan Zhang, Jean-Baptiste Rouquier, and Tiffany Pierce
Saturday, May 19, 2012

Learning statistics

What is structure?
Structure is that which...
- makes data different from noise: makes a network different from a random graph
- helps us compress the data: describe the network succinctly
- helps us generalize from data we've seen to data we haven't seen: from one part of the network to another, or from one network to others generated by the same process
- helps us coarse-grain the dynamics (?)
Statistical inference
- imagine that our data G is drawn from an ensemble, or "generative model": some probability distribution P(G|θ) with parameters θ
- θ can be continuous or discrete: it represents the structure of the data
- given G, find the θ that maximizes P(G|θ), or (Bayes) compute the posterior distribution P(θ|G)

The Erdős–Rényi model
- every pair of vertices i, j is connected independently with probability p
- average degree d = np
- degree distribution is Poisson with mean d
- if d < 1, almost all components are trees, and the largest component has size O(log n)
- if d > 1, a unique giant component appears
- at d = ln n, the graph becomes completely connected
- ring + Erdős–Rényi = Watts–Strogatz
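The phase transition described above (tiny tree-like components for d < 1, a unique giant component for d > 1) is easy to check numerically. The sketch below is illustrative code, not from the talk; the function names are made up.

```python
import random

def erdos_renyi(n, d, seed=0):
    """G(n, p) with p = d/n, so the average degree is d."""
    rng = random.Random(seed)
    p = d / n
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:       # each pair connected independently
                adj[i].add(j)
                adj[j].add(i)
    return adj

def largest_component(adj):
    """Size of the largest connected component, by depth-first search."""
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        stack, size = [start], 0
        seen.add(start)
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best

n = 2000
below = largest_component(erdos_renyi(n, d=0.5))  # subcritical: small components
above = largest_component(erdos_renyi(n, d=2.0))  # supercritical: giant component
print(below, above)
```

With d = 2 the giant component should contain roughly 80% of the vertices (the positive solution of x = 1 - e^(-dx)), while at d = 0.5 the largest component stays logarithmic in n.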
The stochastic block model
- take a discrete attribute into account: each vertex i has a type ti ∈ {1,...,k}, with prior distribution q1,...,qk
- a k×k matrix p: if ti = r and tj = s, there is an edge i→j with probability prs
- p is not necessarily symmetric, and we don't assume that prr > prs
- given a graph G, we want to simultaneously label the nodes, i.e., infer the type assignment t : V→{1,...,k}, and learn how nodes affect connections, i.e., infer the matrix p
- how do we get off the ground?

Assortative and disassortative

The likelihood
- the probability of G given the types t and parameters θ = (p,q)
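That probability factorizes over vertices and ordered pairs: log P(G, t | p, q) = Σ_i log q_{t_i} + Σ_{(i,j) ∈ E} log p_{t_i t_j} + Σ_{(i,j) ∉ E} log(1 - p_{t_i t_j}). Here is a minimal sketch of the generative model and this complete-data likelihood; it is illustrative code, not the authors' implementation, and all names are made up.

```python
import math
import random

def sbm_sample(n, q, p, seed=0):
    """Draw (types, edges) from the stochastic block model:
    vertex i gets type t_i = r with prior probability q[r]; then each
    ordered pair i -> j is an edge independently with probability p[t_i][t_j]."""
    rng = random.Random(seed)
    types = rng.choices(range(len(q)), weights=q, k=n)
    edges = {(i, j) for i in range(n) for j in range(n)
             if i != j and rng.random() < p[types[i]][types[j]]}
    return types, edges

def sbm_log_likelihood(n, types, edges, q, p):
    """log P(G, t | p, q) = sum_i log q_{t_i}
                          + sum_{(i,j) in E} log p_{t_i t_j}
                          + sum_{(i,j) not in E} log(1 - p_{t_i t_j})."""
    ll = sum(math.log(q[t]) for t in types)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            pij = p[types[i]][types[j]]
            ll += math.log(pij) if (i, j) in edges else math.log(1.0 - pij)
    return ll

# Two groups, assortative case: dense within groups, sparse between them
# (the model itself does not require this, as the slides note).
q = [0.5, 0.5]
p = [[0.5, 0.05],
     [0.05, 0.5]]
types, edges = sbm_sample(40, q, p, seed=1)
print(sbm_log_likelihood(40, types, edges, q, p))
```

Since p here is a full k×k matrix over ordered pairs, the same code handles disassortative and asymmetric (directed) structure by changing the entries of p.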