
Analysis of Affiliation Networks to Promote Online Communities of Practice for Science Education

Kathleen Perez-Lopez, Ph.D. Darren Cambridge, Ph.D. Al Byers, Ph.D. Sherry Booth, Ph.D.

Sunbelt XXXII, San Diego, CA, 3/17/2012

Outline

• Issue: organizing forums of topics in an online community
• Current context
• Our approach
• Initial stages: Implementation and progress
• Next Steps

• NSTA LC supports science teachers increasing their knowledge of science and of pedagogy
• Hosts an online community through its Community Forums
• Members can initiate topics within a number of forums or post to existing topics

http://learningcenter.nsta.org/

Year of NSTA LC Posts 9/24/2010 - 9/28/2011

6792 posts

20 forums

307 members

556 topics

SNA using NodeXL http://nodexl.codeplex.com/

Current NSTA Approach to Decomposing Overly Large Forums

Heuristics

• 25 moderators make recommendations
• Review topics for thematic coherence
• Manually move topics and aggregate under forum header (e.g., STEM)

Approach to Repartitioning Topics

• Goal: Repartition topics without creating member islands

• Alternative topic partitions: Fn, n = 0, 1, 2, …
  – F0 = the original NSTA Forums
• Bimodal networks
  – TM: Topic-Member network of posts to topics by members
  – MFn: Member-Forum networks for Fn
• Consider derived unimodal networks
  – Topic-Topic: T = TM * MT, where MT = (TM)^T, the transpose of TM
  – Member-Member: Mn = MFn * FnM, where FnM = (MFn)^T, the transpose of MFn
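The two matrix products above can be sketched on toy data. This is a minimal illustration, not the authors' NodeXL or Pajek workflow: the post list, the member and topic names, and the partition `forum_of` are all hypothetical, and the affiliation matrices are assumed to be binary (1 if a member posted at least once).

```python
# Hypothetical toy data: one (member, topic) pair per post.
posts = [("ann", "t1"), ("ann", "t2"), ("bob", "t2"),
         ("bob", "t3"), ("cat", "t3"), ("cat", "t1"), ("ann", "t1")]
# Hypothetical partition Fn: topic -> forum label.
forum_of = {"t1": "f1", "t2": "f1", "t3": "f2"}

members = sorted({m for m, _ in posts})
topics = sorted({t for _, t in posts})
forums = sorted(set(forum_of.values()))


def transpose(a):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


# TM[i][j] = 1 if member j posted at least once to topic i (binary: an assumption).
TM = [[0] * len(members) for _ in topics]
# MF[j][k] = 1 if member j posted to any topic placed in forum k.
MF = [[0] * len(forums) for _ in members]
for m, t in posts:
    TM[topics.index(t)][members.index(m)] = 1
    MF[members.index(m)][forums.index(forum_of[t])] = 1

T = matmul(TM, transpose(TM))  # T[i][j] = number of members shared by topics i and j
M = matmul(MF, transpose(MF))  # M[i][j] = number of forums shared by members i and j
```

With binary affiliations, each off-diagonal entry of T counts the members two topics have in common, which is the line multiplicity used later for m-slices. Changing `forum_of` changes M but leaves T untouched.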

• Goal restated: Find a natural partition Fn of T such that Clustering(Mn) is LOW

Methodology

• Annual postings data
  – Ignored topics and members with < 2 posts: 6792 posts by 307 members to 556 topics in 20 forums
  – On 2nd pass, omitted 26 “online advisors”: 2815 posts by 281 members to 474 topics in 20 forums
• Unimodal networks derived from bimodal affiliation networks are often very dense
• Partition topic networks using alternative grouping approaches
• Measure resulting clustering on member network

Repartitioning Topics

Find Fn, a partition of topics, that yields:

1. VERY segregated Topic network, Tn

   Tn (474 x 474) = Topic-Member (474 x 281) X Member-Topic (281 x 474)

2. UN-segregated member network, Mn

   Mn (281 x 281) = Member-Fn (281 x 20+) X Fn-Member (20+ x 281)

Related Work

• Rodríguez, Sicilia, Sanchez-Alonso, Lezcano (2011), “Exploring affiliation network models as a collaborative filtering mechanism in e-learning”
  – Very similar concept to what we do here
  – Create a 1-mode topic network from a learner-topic affiliation network, using m-slices to cluster topics
  – Smaller data set
  – More pre-filtering of topics
  – Blockmodeling to describe learner clusters (our intent also)
• Recommender systems
  – Conceptually similar; could provide some insights to the current task

Clustering Algorithms

• F0 Original NSTA LC forums

• F1 Clauset-Newman-Moore groups (NodeXL)

• F2 Wakita-Tsurumi groups (NodeXL)

• F3 m-slices and k-cores (Pajek)

• F4 Wakita-Tsurumi on a reduced dataset

• M4 Wakita-Tsurumi on member network from F4
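For readers unfamiliar with the grouping used for F1, the idea behind Clauset-Newman-Moore is greedy agglomeration on modularity: start with every vertex in its own group and keep merging the pair of groups whose merge raises modularity the most. The sketch below is a deliberately naive version of that idea on a hypothetical toy graph. It is not NodeXL's implementation, it has none of the data structures that make CNM fast, and it does not include the size-balancing heuristic of the Wakita-Tsurumi variant.

```python
from itertools import combinations


def greedy_modularity_groups(edges):
    """Naive greedy agglomeration: repeatedly merge the two groups whose merge
    gives the largest modularity gain; stop when no merge improves modularity."""
    total = sum(w for _, _, w in edges)           # total edge weight W
    degree = {}
    for u, v, w in edges:
        degree[u] = degree.get(u, 0) + w
        degree[v] = degree.get(v, 0) + w
    groups = {node: {node} for node in degree}    # one group per vertex to start

    def between(a, b):
        """Total weight of edges running between groups a and b."""
        return sum(w for u, v, w in edges
                   if (u in groups[a] and v in groups[b])
                   or (u in groups[b] and v in groups[a]))

    def gain(a, b):
        """Modularity change from merging a and b: w_ab/W - d_a*d_b/(2*W^2)."""
        deg_a = sum(degree[n] for n in groups[a])
        deg_b = sum(degree[n] for n in groups[b])
        return between(a, b) / total - deg_a * deg_b / (2 * total ** 2)

    while len(groups) > 1:
        best, a, b = max((gain(a, b), a, b)
                         for a, b in combinations(sorted(groups), 2))
        if best <= 0:
            break
        merged = groups.pop(b)
        groups[a] |= merged
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Hypothetical toy graph: two triangles joined by a single edge.
toy_edges = [("a", "b", 1), ("b", "c", 1), ("a", "c", 1),
             ("d", "e", 1), ("e", "f", 1), ("d", "f", 1),
             ("c", "d", 1)]
toy_groups = greedy_modularity_groups(toy_edges)
```

On the toy graph the procedure recovers the two triangles. On a very dense network, such as the derived topic network in this study, most merges look similarly attractive, which is one plausible reason a modularity-based method returns only a few large groups.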

Member Network from NSTA Forums

• Derived from Members posting to NSTA forums
• 20,576 edges
• 20,573 in connected component
• Density 0.44
• Ave deg 134

Member Network from NSTA Forums: Wakita-Tsurumi Groups

• 12 Groups
• Strong inter-group ties
• Want the derived partition to also be very dense
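The density and average degree reported for the member network can be checked with the standard formulas for a simple undirected graph. The sketch assumes all 307 members appear as vertices and that there are no self-loops or duplicate lines; the slides do not state either.

```python
def density(n_nodes, n_edges):
    """Density of a simple undirected graph: edges present / edges possible."""
    return 2 * n_edges / (n_nodes * (n_nodes - 1))


def average_degree(n_nodes, n_edges):
    """Each edge contributes to the degree of two vertices."""
    return 2 * n_edges / n_nodes


# 307 members and 20,576 edges, as reported for the member network.
member_density = density(307, 20576)
member_avg_degree = average_degree(307, 20576)
print(round(member_density, 2), round(member_avg_degree))  # 0.44 134
```

Under those assumptions the computed values agree with the reported density of 0.44 and average degree of 134.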

Topic Network

• Derived from topic postings from members
• 74,709 edges
• 74,707 in connected component
• Density 0.49
• Ave deg 271

F1: Topic Network, Clauset-Newman-Moore (CNM) Groups

[Figures: F1 topic CNM groups shown with lines, without lines, and boxed]

• Only 6 groups
• 2 dominant
• Split the graph in half
• Halves tightly connected
• Very POOR partition

F2: Topic Clustering, Wakita-Tsurumi

• Somewhat better
• 8 + 6 groups
• 2-3 dominant
• G14 has 42% of nodes, 38% of edges
• Group tightly connected
• POOR partition

F3: Topic Clustering with m-Slices

• Derive cohesive subgroups using line multiplicity m
  – Topics linked by ≥ m mutually posting members
• m-slice
  – Largest subnet of lines with multiplicity ≥ m, and their incident vertices
  – Usually leaves disjoint, highly connected clusters
  – But not in this case!
• Used Pajek, following de Nooy et al.
  – 1 to 20 slices output, created partition from the 8-slice
  – Considered m=4, tried to cluster with k-cores, still had one dominant cluster of 212 out of 556 nodes

F3: Topic Clustering Using m-Slices

[Figures, using Pajek: 1-, 3-, 5-, 6-, 7-, and 8-slices, colored with NSTA Forums (6-slice labeled "Original Forum Defs"); 8-slice groups; 8-slice colored with NSTA Forums, where the group doesn't align with forums]
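The m-slice and k-core steps can be sketched in a few lines. This is an illustration of the definitions given on the slides (keep lines with multiplicity ≥ m plus their incident vertices, then look at components and cores), not a reproduction of the Pajek procedure; the topic names and multiplicities are hypothetical.

```python
def m_slice(lines, m):
    """Keep lines with multiplicity >= m, plus the vertices incident to them."""
    kept = [(u, v, w) for u, v, w in lines if w >= m]
    vertices = {x for u, v, _ in kept for x in (u, v)}
    return vertices, kept


def adjacency(vertices, lines):
    """Neighbour sets for an undirected network."""
    adj = {v: set() for v in vertices}
    for u, v, _ in lines:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def components(vertices, lines):
    """Connected components, found by depth-first search."""
    adj, seen, found = adjacency(vertices, lines), set(), []
    for start in sorted(vertices):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node not in comp:
                comp.add(node)
                stack.extend(adj[node] - comp)
        seen |= comp
        found.append(sorted(comp))
    return found


def k_core(vertices, lines, k):
    """Repeatedly drop vertices with fewer than k remaining neighbours."""
    adj = adjacency(vertices, lines)
    while True:
        low = [v for v, nbrs in adj.items() if len(nbrs) < k]
        if not low:
            return set(adj)
        for v in low:
            for n in adj.pop(v):
                if n in adj:
                    adj[n].discard(v)


# Hypothetical topic-topic lines: (topic, topic, number of shared members).
toy_lines = [("t1", "t2", 5), ("t2", "t3", 4), ("t1", "t3", 4),
             ("t3", "t4", 1), ("t4", "t5", 3), ("t5", "t6", 3)]
slice_vertices, slice_lines = m_slice(toy_lines, 3)
slice_components = components(slice_vertices, slice_lines)
core = k_core(slice_vertices, slice_lines, 2)
```

In the toy network the 3-slice falls apart into two components and the 2-core keeps only the triangle. The slides report the opposite behaviour on the NSTA data: even after slicing, one dominant cluster remained.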

F3-1: 4-slice Followed by k-cores

[Figures, using Pajek: 4-slice; 4-slice with nodes removed; 4-slice components; 4-slice largest component; its k-cores; its 20-59-core (212 nodes); the 20-59-core component (=1); the component expanded]

Repartitioning Topics: Poor Results

Find Fn, a partition of topics, that yields:

1. VERY UNsegregated Topic network, Tn

   Tn (474 x 474) = Topic-Member (474 x 281) X Member-Topic (281 x 474)

2. UN-segregated member network, Mn

   Mn (307 x 307) = Member-Fn (307 x 20+) X Fn-Member (20+ x 307)

F4: Wakita-Tsurumi on Refined Dataset

Ignoring posts from 26 online advisors: 2815 posts

20 forums

281 members

474 topics
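A refinement of this kind can be sketched as a filter over post records. The record layout, the names, and the order of operations are assumptions: the slides do not say whether the "< 2 posts" threshold was re-applied after the advisors were removed, and this sketch applies it once, after removal.

```python
from collections import Counter


def refine(posts, advisors, min_posts=2):
    """Drop advisor posts, then drop members and topics with fewer than
    min_posts remaining posts (single pass; an assumed order of operations)."""
    kept = [(m, t) for m, t in posts if m not in advisors]
    by_member = Counter(m for m, _ in kept)
    by_topic = Counter(t for _, t in kept)
    return [(m, t) for m, t in kept
            if by_member[m] >= min_posts and by_topic[t] >= min_posts]


# Hypothetical post records: (member, topic).
toy_posts = [("adv", "t1"), ("adv", "t2"), ("ann", "t1"), ("ann", "t2"),
             ("bob", "t1"), ("bob", "t2"), ("cat", "t3")]
refined = refine(toy_posts, advisors={"adv"})
n_posts = len(refined)
n_members = len({m for m, _ in refined})
n_topics = len({t for _, t in refined})
```

Counting posts, members, and topics after the filter gives the kind of summary shown on this slide.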

F4: Wakita-Tsurumi on Refined Dataset

• Removing inter-group lines

M4: Member Network from F4

• More clustered than desired
• 7 groups
  – 1 dominates, but
  – 5 are significant
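The slides do not say how "more clustered than desired" was quantified. One simple proxy, offered here only as an assumption, is to report the group sizes together with the share of tie weight that crosses group boundaries: a low crossing share means the member network is segregated into its groups. The names, weights, and grouping below are hypothetical.

```python
from collections import Counter


def group_summary(edges, group_of):
    """Return group sizes and the share of tie weight that crosses groups."""
    sizes = Counter(group_of.values())
    total = sum(w for _, _, w in edges)
    crossing = sum(w for u, v, w in edges if group_of[u] != group_of[v])
    return dict(sizes), crossing / total


# Hypothetical member-member ties and a hypothetical grouping of members.
toy_ties = [("ann", "bob", 2), ("bob", "cat", 1),
            ("cat", "dan", 3), ("ann", "cat", 2)]
toy_group_of = {"ann": "g1", "bob": "g1", "cat": "g2", "dan": "g2"}
sizes, crossing_share = group_summary(toy_ties, toy_group_of)
```

For the stated goal, a good topic partition would give a member network with a high crossing share and no single dominant group.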

Next Steps

• Consider realistic restrictions on forum definitions
• Find different data to represent the natural clustering of topics
  – Textual content analysis
• Filtering out non-contextual content
  – Friendly banter
  – Might be useful for other purposes, but interference here
• More iterative approach
• Consider time
  – Not a static phenomenon; analyze over time

References

• de Nooy, W., Mrvar, A., & Batagelj, V. (2005). Exploratory Social Network Analysis with Pajek. New York: Cambridge University Press.
• Wasserman, S., & Faust, K. (1994). Social Network Analysis: Methods and Applications. New York and Cambridge, Eng.: Cambridge University Press.
• Borgatti, S. 2-Mode Concepts in Social Network Analysis. Forthcoming in Encyclopedia of Complexity and System Science.
• Rodríguez, D., Sicilia, M., Sanchez-Alonso, S., Lezcano, & García-Barriocanal, E. (2011, in press). Exploring affiliation network models as a collaborative filtering mechanism in e-learning. Interactive Learning Environments.
• Hansen, D., Shneiderman, B., & Smith, M. (2011). Analyzing Social Media Networks with NodeXL: Insights from a Connected World. Burlington, MA: Morgan Kaufmann.
• Su, X., & Khoshgoftaar, T. (2009). A Survey of Collaborative Filtering Techniques. Advances in Artificial Intelligence.