Computational Analysis of Network Data

Computational Analysis of Network Data

Computational Analysis of Network Data T ESTING NETWORK - LEVEL MEASURES THROUGH SIMULATION Austin van Loon SICSS 2019 Flash Talk WARNING! Lots to Cover JUPYTER NOTEBOOK WITH ANNOTATED CODE AND EXPLANATION Outline ▪ The why/when of networks ▪ Testing network-level measures › Define a measure • Homophily • Average clustering coefficient › Pick a null model • Erdös–Rényi model • Configuration model Outline ▪ The why/when of networks ▪ Testing network-level measures › Define a measure • Homophily • Average clustering coefficient › Pick a null model • Erdös–Rényi model • Configuration model When/Why Networks? ▪ Standard econometrics assumes i.i.d ▪ In many cases, we’re interested in interdependence ▪ Examples: › Relationships › Communication › Language › Computer networks › Protein interactions › Hyper-links ▪ Generally anything in which there’s measurable interdependence! Outline ▪ The why/when of networks ▪ Testing network-level measures › Define a measure • Homophily • Average clustering coefficient › Pick a null model • Erdös–Rényi model • Configuration model Outline ▪ The why/when of networks ▪ Testing network-level measures › Define a measure • Homophily • Average clustering coefficient › Pick a null model • Erdös–Rényi model • Configuration model The assortativity coefficient of a graph is the degree to which "birds of feather flock together". Where 푒 is a matrix for which entry 푒푖푗 is the fraction of edges in a graph that connect actors of type 푖 and 푗, it is defined as: 푇푟 푒 − | 푒2 | 푟 = 1 − | 푒2 | The average clustering coefficient for a graph describes how often there are "triangles" in the graph. Where 퐴푖, is the set of nodes 푖 is connected to and 푘푖 is the size of that set, it is defined as: 1 2| 푒푗푘: 푣푗, 푣푘 ∈ 퐴푖; 푒푗푘 ∈ 퐸 | 퐶ҧ = ෍ 푛 푘푖(푘푖 − 1) 푖∈푉 The average clustering coefficient for a graph describes how often there are "triangles" in the graph. Where 퐴푖, is the set of nodes 푖 is connected to and 푘푖 is the size of that set, it is defined as: 1 2(푛푢푚푏푒푟 표푓 triangles) 퐶ҧ = ෍ 푛 푛푢푚푏푒푟 표푓 possible triangles 푒푎푐ℎ 푛표푑푒 Outline ▪ The why/when of networks ▪ Testing network-level measures › Define a measure • Homophily • Average clustering coefficient › Pick a null model • Erdös–Rényi model • Configuration model More than by chance? ▪ Erdös–Rényi model holds constant › Number of nodes › Average degree ▪ Configuration model holds constant › Number of nodes › Exact degree distribution › Correlation between attributes and degree Testing network-level measures ▪ Simulate the network under the null model many times ▪ For each simulation, measure the statistic of interest ▪ Compare your observed measure to this distribution ▪ How likely of a draw is the observed measure from the distribution of the measures from the networks generated under the null hypothesis? That’s your p-value! What do we observe in our network? Erdös–Rényi: 푝 ≈ 0 Erdös–Rényi: 푝 ≈ 0.054 Configuration: 푝 ≈ 0 Configuration: 푝 ≈ 0.038 Thank you! P ERSONAL WEBSITE : HTTPS :// AN K E N YAV. WIXSITE . COM / AUSTINVANLOON E MAIL : AVA N L O O N @ STANFORD . EDU Austin van Loon.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    20 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us