Social Network Analysis of Affiliation Networks

Analysis of Affiliation Networks to Promote Online Communities of Practice for Science Education
Kathleen Perez-Lopez, Ph.D.
Darren Cambridge, Ph.D.
Al Byers, Ph.D.
Sherry Booth, Ph.D.
Sunbelt XXXII, San Diego, CA, 3/17/2012

Outline
• Issue: organizing forums of topics in an online community
• Current context
• Our approach
• Initial stages: Implementation and progress
• Next Steps

• The NSTA Learning Center (NSTA LC) supports science teachers in increasing their knowledge of science and of pedagogy
• Hosts an online community through its Community Forums
• Members can initiate topics within a number of forums or post to existing topics
http://learningcenter.nsta.org/

Year of NSTA LC Posts, 9/24/2010 - 9/28/2011
• 6792 posts
• 20 forums
• 307 members
• 556 topics
SNA using NodeXL: http://nodexl.codeplex.com/

Current NSTA Approach to Decomposing Overly Large Forums
Heuristics
• 25 moderators make recommendations
• Review topics for thematic coherence
• Manually move topics and aggregate under forum header (e.g., STEM)
http://learningcenter.nsta.org/

Approach to Repartitioning Topics
• Goal: Repartition topics without creating member islands
  – Alternative topic partitions: Fn, n = 0, 1, 2, …
  – F0 = the original NSTA forums
• Bimodal networks
  – TM: Topic-Member network of posts to topics by members
  – MFn: Member-Forum networks for Fn
• Consider derived unimodal networks
  – Topic-Topic: $T = TM \cdot MT$, where $MT = (TM)^{\top}$
  – Member-Member: $M_n = MF_n \cdot F_nM$, where $F_nM = (MF_n)^{\top}$
• Goal restated: Find a natural partition Fn of T such that Clustering(Mn) is LOW

Methodology
• Annual postings data
  – Ignored topics and members with < 2 posts
    o 6792 posts by 307 members to 556 topics in 20 forums
  – On 2nd pass, omitted 26 “online advisors”
    o 2815 posts by 281 members to 474 topics in 20 forums
• Unimodal networks derived from bimodal affiliation networks are often very dense
• Partition topic networks using alternative grouping approaches
• Measure resulting clustering on member network

Repartitioning Topics
Find Fn, a partition of topics, that yields:
1. VERY segregated topic network, Tn
   Topic-Member (474 x 281) X Member-Topic (281 x 474) = Tn (474 x 474)
2. UN-segregated member network, Mn
   Member-Fn (281 x 20+) X Fn-Member (20+ x 281) = Mn (281 x 281)

Related Work
• Rodríguez, Sicilia, Sanchez-Alonso, Lezcano (2011), “Exploring affiliation network models as a collaborative filtering mechanism in e-learning”
  – Very similar concept to what we do here
  – Create a 1-mode topic network from a learner-topic affiliation network, using m-slices to cluster topics
  – Smaller data set
  – More pre-filtering of topics
  – Blockmodeling to describe learner clusters (our intent also)
• Recommender systems
  – Conceptually similar; could provide some insights to the current task

Clustering Algorithms
• F0: Original NSTA LC forums
• F1: Clauset-Newman-Moore groups (NodeXL)
• F2: Wakita-Tsurumi groups (NodeXL)
• F3: m-slices and k-cores (Pajek)
• F4: Wakita-Tsurumi on a reduced dataset
• M4: Wakita-Tsurumi on member network from F4

Member Network from NSTA Forums
• Derived from members posting to NSTA forums
• 20,576 edges
• 20,573 in connected component
• Density 0.44
• Ave deg 134

Member Network from NSTA Forums: Wakita-Tsurumi Groups
• 12 groups
• Strong inter-group ties
• Want the derived partition to also be very dense

Topic Network
• Derived from topic postings from members
• 74,709 edges
• 74,707 in connected component
• Density 0.49
• Ave deg 271

F1: Topic Network, Clauset-Newman-Moore (CNM) Groups
[Figures: CNM groups; CNM groups without lines; CNM groups boxed]
• Only 6 groups
• 2 dominant
• Split the graph in half
• Halves tightly connected
• Very POOR partition

F2: Topic Clustering, Wakita-Tsurumi
[Figure: Wakita-Tsurumi groups, with group G14 labeled]
• Somewhat better
• 8 + 6 groups
• 2-3 dominant
• G14 has 42% of nodes, 38% of edges
• Group tightly connected
• POOR partition

F3: Topic Clustering with m-Slices
• Derive cohesive subgroups using line multiplicity m
  – Topics linked by ≥ m mutually posting members
• m-slice
  – Largest subnet of lines with multiplicity ≥ m, and their incident vertices
  – Usually leaves disjoint, highly connected clusters
  – But not in this case!
• Used Pajek, following de Nooy et al. (2005)
  – 1 to 20 slices output, created partition from the 8-slice
  – Considered m = 4, tried to cluster with k-cores, still had one dominant cluster of 212 out of 556 nodes

F3: Topic Clustering Using m-Slices
[Figures, using Pajek: 1-, 3-, 5-, 6-, 7-, and 8-slices, colored with the original NSTA forum definitions; 8-slice groups]
• 8-slice groups don't align with forums

F3-1: 4-Slice Followed by k-Cores
[Figures: 4-slice; 4-slice with nodes removed; 4-slice components; 4-slice largest component and its k-cores]
• 20-59-core of the 4-slice largest component: 212 nodes
• 20-59-core: 1 component

Repartitioning Topics: Poor Results
The partitions Fn tried so far yield:
1. VERY UNsegregated topic network, Tn
   Topic-Member (474 x 281) X Member-Topic (281 x 474) = Tn (474 x 474)
2. UN-segregated member network, Mn
   Member-Fn (307 x 20+) X Fn-Member (20+ x 307) = Mn (307 x 307)

F4: Wakita-Tsurumi on Refined Dataset
Ignoring posts from 26 online advisors:
• 2815 posts
• 20 forums
• 281 members
• 474 topics
[Figure: Wakita-Tsurumi groups, also shown with inter-group lines removed]
SNA using NodeXL: http://nodexl.codeplex.com/

M4: Member Network from F4
• More clustered than desired
• 7 groups
  – 1 dominates, but
  – 5 are significant

Next Steps
• Consider realistic restrictions on forum definitions
• Find different data to represent the natural clustering of topics
  – Textual content analysis
• Filtering out non-contextual content
  – Friendly banter
  – Might be useful for other purposes, but interference here
• More iterative approach
• Consider time
  – Not a static phenomenon; analyze over time

References
• de Nooy, W., Mrvar, A., & Batagelj, V. (2005). Exploratory Social Network Analysis with Pajek. New York: Cambridge University Press.
• Wasserman, S., & Faust, K. (1994). Social Network Analysis: Methods and Applications. New York and Cambridge, Eng.: Cambridge University Press.
• Borgatti, S. 2-Mode Concepts in Social Network Analysis. Forthcoming in Encyclopedia of Complexity and System Science.
• Rodríguez, D., Sicilia, M., Sanchez-Alonso, S., Lezcano, L., & García-Barriocanal, E. (2011, in press). Exploring affiliation network models as a collaborative filtering mechanism in e-learning. Interactive Learning Environments.
• Hansen, D., Shneiderman, B., & Smith, M. (2011). Analyzing Social Media Networks with NodeXL: Insights from a Connected World. Burlington, MA: Morgan Kaufmann.
• Su, X., & Khoshgoftaar, T. (2009). A Survey of Collaborative Filtering Techniques. Advances in Artificial Intelligence, Volume.
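Illustrative sketch

The one-mode projections on the "Approach to Repartitioning Topics" slide and the m-slice filter on the F3 slide can be sketched in a few lines of Python. This is a minimal illustration, not the NodeXL or Pajek workflow used in the study. It assumes binary incidence (a member either posted to a topic or did not), so line weights count shared members or shared forums. The toy `posts` data, the `forum_of` partition, and the helper names `project`, `m_slice`, and `density` are all hypothetical. Density stands in here for the unspecified Clustering(Mn) measure.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: one (member, topic) pair per post.
posts = [
    ("ann", "t1"), ("ann", "t2"), ("bob", "t1"), ("bob", "t2"),
    ("bob", "t3"), ("cy", "t3"), ("cy", "t4"), ("dee", "t4"),
]

def project(pairs):
    """One-mode projection of a two-mode edge list of (actor, event) pairs.

    Returns {(event_a, event_b): number of shared actors}, i.e. the
    off-diagonal entries of X * X^T for a binary event-by-actor matrix X.
    """
    actors_of = defaultdict(set)
    for actor, event in pairs:
        actors_of[event].add(actor)
    weights = {}
    for a, b in combinations(sorted(actors_of), 2):
        shared = len(actors_of[a] & actors_of[b])
        if shared:
            weights[(a, b)] = shared
    return weights

def m_slice(weights, m):
    """Keep only the lines whose multiplicity is at least m."""
    return {edge: w for edge, w in weights.items() if w >= m}

def density(weights, n_nodes):
    """Fraction of possible undirected lines that are present."""
    return len(weights) / (n_nodes * (n_nodes - 1) / 2)

# Topic-topic network T: topics linked by shared posting members.
T = project(posts)

# Hypothetical candidate partition Fn: topic -> forum.
forum_of = {"t1": "A", "t2": "A", "t3": "B", "t4": "B"}

# Member-forum affiliations, then the member-member network Mn:
# members are linked by the number of forums they share.
forum_member = {(forum_of[topic], member) for member, topic in posts}
Mn = project(forum_member)

members = {member for member, _ in posts}
print("T:", T)
print("2-slice of T:", m_slice(T, 2))
print("Mn:", Mn)
print("density(Mn):", round(density(Mn, len(members)), 3))
```

Comparing `density(Mn, ...)` across candidate partitions gives one rough way to check whether a repartition leaves the member network connected rather than split into islands. A weighted projection built from post counts would change the line values but not the overall structure of the sketch.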
