
Online Social Networks

Academic Year: 2014-15, Hilary Term
Day and Time: Weeks 1-9, day and time to be determined
Location: TBC

Course Providers:
Dr Bernie Hogan, Oxford Internet Institute, [email protected]
Dr Taha Yasseri, Oxford Internet Institute, [email protected]

Background

The paradigm of network analysis is one that directly speaks to the web. With its hyperlink structure, relational tables, friend lists and constant stream of information diffusion, network analysis appears to be an obvious route to the analysis and understanding of the Internet’s dynamics. Such analyses are not mere passive reflections of data: the algorithms that power Google, Amazon, Facebook and Twitter are based in network science. Beyond the use of formal algorithms for network analysis are questions of societal import, such as the consequence of the number and structure of Facebook friends; the overlap of members on many media; the cascading behaviour of political activism; and the salience of identity in threaded conversations, to name a few relevant topics.

In this course we introduce social network analysis and the more recent notion of ‘network science’, with particular emphasis on research design, data collection and analysis. We take a comparative approach to network topics, such as evaluating different measures of centrality, multiple approaches to clustering and variations on visualization. In doing so, our goal is not merely to familiarize the student with the basics of network data capture and analysis, but to enable the student to make informed choices for analysis based on research questions rather than default tools or outmoded conventions.

Key Themes

● What differentiates social networks as analytical objects from the reality they seek to represent?
● How do the descriptive measures of networks inform us about macro social structures as well as micro social behaviours?
● How do the affordances and constraints of online technologies help facilitate certain kinds of network structures (and indeed, even the notion of networks as analytical tools in the first instance)?
● Why do networks as visual objects persist in having a rhetorical power? Is it that they are merely ‘sciency’ and complex looking or should we consider the visual presentation of networks as a meaningful scholarly practice?

Course Objectives

The course will familiarise students with the state of network science as a paradigm comprising multidisciplinary approaches to the analysis of relational data. Students will be able to read introductory network metrics and understand how these measures speak to theories of human behaviour, as well as put together an original piece of analysis using network data. Students will gain a modest understanding, via the ‘sociology of science’, as to why network analysis is a highly distributed field where no single software application, journal or conference covers all of the active research on social networks. Students will also learn basic data capture and analysis techniques that can enable them to begin, if not complete, a full study.

Learning Outcomes

Upon successful completion of this course students should:
● Have a familiarity with the basic terms and concepts of social network analysis.
● Understand how differing network analysis metrics relate both to each other and to academic research questions.
● Be able to describe how a network can be constructed from an online phenomenon.
● Have a clear understanding of some of the various analytical tools used in network science.
● Be able to construct and theorise a research question that employs social network analysis in order to address a specific topic related to human behaviour and collective dynamics.

Teaching Arrangements

The course will consist of eight classes taught in weeks 1-4 and 6-9 of Hilary term. The date, time and venue will be communicated to students during Michaelmas Term.

Each class will begin with an hour-long lecture. The second half of the class is typically a guided walkthrough of network analysis techniques. The techniques draw upon a variety of software packages and data sources. Every effort will be made to ensure cross-platform and open source software is used whenever possible, but this cannot always be guaranteed.

Assessment

Students will be assessed through a final essay of no more than 5000 words, which must be submitted to the Examinations School by 12 noon on Monday of Week 1 of Trinity Term. The essay should consolidate a review of current literature, a theoretically-informed research question about online social networks and a network-oriented methodology that was featured in the course. The essay topic should be agreed upon by the student and the course instructor prior to submission (see the following section).

Formative Assessment

Each student will also be required to write one short essay (length: 1500-3000 words) stating the research question and methodological approach that will be developed in the final essay. Students are expected to submit a research question paragraph in Week 3 of Hilary Term, and the short essay in Week 7 of Hilary Term. Both are due at 5pm on the day prior to class. This essay will provide a means for students to obtain feedback on the progress they have achieved prior to the final submission of their work.

Submission of Assignments

All coursework should be submitted in person to the Examinations School by the stated deadline. All coursework should be put in an envelope and must be addressed to ‘The Chairman of Examiners for the MSc in Social Science of the Internet C/o The Clerk of Examination Schools, High Street’. Students should also ensure they add the OII coversheet at the top of the coursework and that two copies of the coursework are submitted. Please note that all work must be single sided. An electronic copy will also need to be submitted to the department. Please note that all coursework will be marked anonymously and therefore only your candidate number is required on the coversheet.

Please note that work submitted after the deadline will be processed in the standard manner and, in addition, the late submission will be reported to the Proctors' Office. If a student is concerned that they will not meet the deadline they must contact their college office or examinations school for advice.


For further information on submission of assessments to the examinations school please refer to http://www.admin.ox.ac.uk/schools/oxonly/submissions/index.shtml. For details on the regulations for late and non-submissions please refer to the Proctors website at http://www.admin.ox.ac.uk/proctors/info/pam/section9.shtml.

Any student failing this assessment will need to follow the rules set out in the OII Examining Conventions regarding re-submitting failed work.

Topics

1. Introduction and Research Design
2. Generating and representing networks
3. Basic network metrics I: Centrality and position
4. Basic network metrics II: Clustering
5. Modeling networks I: Triads and dependency models
6. Modeling networks II: Diffusion and generative models
7. Network visualization techniques
8. Theorizing networks

General Readings

The general readings are divided into essential and optional. Those items marked with an asterisk (*) are essential reading and MUST be read by all students in preparation for the class. If a reading is essential, it is because we use multiple chapters from the book and consider it a key reference for this course. The optional readings are resources that we return to on a regular basis, but that do not directly shape the course. We expect students to have purchased a (paper or digital) copy of the essential readings.

* Borgatti, Stephen P., Martin G. Everett and Jeffrey C. Johnson. Analyzing Social Networks. 2013. Thousand Oaks, CA: Sage. [ASN]

Hennig, Marina et al. Studying Social Networks. 2013. Berlin: Springer-Verlag. [SSN]

Hansen, Derek, Ben Shneiderman and Marc A. Smith. Analyzing Social Networks with NodeXL. New York: Morgan Kaufmann. [ANXL]

Newman, Mark Networks: An Introduction. 2010. Oxford, UK: Oxford University Press. [NAI]

Week 1. Introduction and research design.

Substantive: This week we present an overview of the network-as-object and give a brief tour of key network science research. We discuss the journals, books and conferences currently relevant for advances in network science. We show how the network-as-object is based on some very simple mathematical assumptions but nevertheless appears to have a great degree of flexibility and analytical power.


Practical: We introduce the three tiers of network analysis software: end-user applications, analysis packages and coding environments. We provide a walk-through for students to open and describe a network using all three.

* Borgatti, Stephen et al. Analyzing Social Networks. 2013. Thousand Oaks, CA: Sage. [ASN] • Chapter 1: Introduction, pp. 1-10

* Hennig, Marina et al. Studying Social Networks. 2013. Berlin: Springer-Verlag. [SSN] • Chapter 1: Introduction, pp. 13-26

Hansen, Derek et al. Analyzing Social Networks with NodeXL. New York: Morgan Kaufmann. • Chapter 3: Social Network Analysis: Measuring, Mapping, and Modeling Collections of Connections, pp. 31-51

Wellman, Barry “Structural analysis: From method and metaphor to theory and substance”. In Wellman, B., S. D. Berkowitz. (Eds). Social Structures: A Network Approach. 1988. Cambridge University Press, Cambridge, UK. pp. 19–61

Marsden, Peter V. “Recent Developments in Network Measurement”. In Carrington, P., J. Scott, S. Wasserman. 2006. Models and Methods in Social Network Analysis. Cambridge, UK: Cambridge University Press. pp. 8-30.

Hogan, Bernie and Barry Wellman “The conceptual foundations of Social Network Sites and the emergence of the relational self-portrait”. In Dutton, W., M. Graham. Forthcoming. Society and the Internet. Oxford, UK: Oxford University Press.

Week 2. Generating and representing networks

Substantive: Networks are representations of relationships between entities. In some cases these relationships are well defined, whereas in others they are inferred, assumed, or surveyed. We first describe the multiple ways data can be represented, as adjacency matrices, lists, sociograms and data structures. We then talk about a variety of ways in which such structures are captured or derived from online sources. We argue that simply because a relationship can be specified does not mean it is the most direct route to analysis thereby beginning a critical approach to network science.

Practical: We take a ‘raw’ directed, weighted set of relationships coded in a network file and transform this data through a series of typical operations such as filtering and symmetrizing. We explore what differences this filtering makes to simple network metrics.
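As a rough illustration of the operations described in this practical, the sketch below filters and symmetrizes a small directed, weighted edge list and compares the resulting density. It is a minimal standard-library Python sketch, not the course's own materials: the edge list, the function names (`filter_edges`, `symmetrize`, `density`) and the two symmetrizing rules are assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical directed, weighted edge list: (source, target, weight).
edges = [
    ("a", "b", 3), ("b", "a", 1), ("a", "c", 1),
    ("c", "d", 4), ("d", "c", 2), ("b", "d", 1),
]
nodes = {n for u, v, _ in edges for n in (u, v)}


def filter_edges(edge_list, min_weight):
    """Keep only the directed edges whose weight is at least min_weight."""
    return [(u, v, w) for u, v, w in edge_list if w >= min_weight]


def symmetrize(edge_list, rule="max"):
    """Collapse directed edges into undirected ties.

    rule="max": keep a tie if either direction exists (strongest weight).
    rule="min": keep a tie only if both directions exist (weakest weight).
    """
    if rule not in ("max", "min"):
        raise ValueError("rule must be 'max' or 'min'")
    by_pair = defaultdict(dict)
    for u, v, w in edge_list:
        by_pair[frozenset((u, v))][(u, v)] = w
    undirected = {}
    for pair, directed in by_pair.items():
        if rule == "max":
            undirected[pair] = max(directed.values())
        elif len(directed) == 2:  # rule == "min": reciprocated ties only
            undirected[pair] = min(directed.values())
    return undirected


def density(undirected, node_set):
    """Share of possible undirected ties that are present."""
    n = len(node_set)
    return len(undirected) / (n * (n - 1) / 2)


weak = symmetrize(edges, "max")                     # any tie counts
strong = symmetrize(edges, "min")                   # reciprocated ties only
filtered = symmetrize(filter_edges(edges, 2), "max")

print(density(weak, nodes), density(strong, nodes), density(filtered, nodes))
```

On this toy data the ‘either direction’ rule keeps four ties while the ‘reciprocated only’ rule keeps two, so the same raw data yields a density of two thirds or one third depending on a choice that is easy to leave at a software default.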

* Hansen, Derek et al. Analyzing Social Networks with NodeXL. New York: Morgan Kaufmann. • Chapter 6: Preparing Data and Filtering, pp. 81-93

* Borgatti, Stephen et al. Analyzing Social Networks. 2013. Thousand Oaks, CA: Sage. • Chapter 2: Mathematical Foundations, pp. 11-23 • Chapter 5: Data Management, pp. 62-88

Hogan, Bernie “Chapter 7. Analyzing Social Networks Via the Internet”. In Fielding, N, R. Lee, G. Blank. 2008. Sage Handbook of Online Research Methods. Thousand Oaks, CA: Sage. pp. 141-151.

Hanneman, Robert A. and Mark Riddle “Chapter 23. A Brief Introduction to Analyzing Social Network Data”. In Scott, J and P. Carrington. 2011. Sage Handbook of Social Network Analysis. pp. 331-339.

Huisman, Mark and Marijtje A. J. Van Duijn “Software for statistical analysis of social networks”. 2005. Models and methods in social network analysis. Cambridge: Cambridge University Press. pp. 270-316.

Week 3. Basic network metrics I: Centrality and position

Substantive: One of the foundational concepts in social network analysis is the relative position of the nodes. Even a question as simple as ‘who is the center’ can have a multitude of complex answers depending on how we define central. In this lecture we cover a smorgasbord of metrics that compare the position of nodes in a graph, starting with the basic metrics of degree, betweenness and closeness and moving towards more recent metrics such as PageRank, Hub scores, Structural Holes and Power Centrality.

Practical: We apply a set of network centrality and position metrics to a series of networks in order to indicate the relative prominence of different nodes within the network.
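To make concrete how the answer to ‘who is the center’ depends on the metric, the following sketch computes two of the basic metrics named above, degree and closeness, on a small invented graph. It is a standard-library Python sketch under stated assumptions: the graph and the function names are hypothetical, the graph is assumed to be undirected and connected, and closeness is taken as (n - 1) divided by the sum of shortest-path distances.

```python
from collections import deque

# Hypothetical undirected graph as an adjacency dictionary:
# two small hubs ("h" and "c") joined by a single bridging node ("b").
graph = {
    "h": {"l1", "l2", "b"},
    "l1": {"h"},
    "l2": {"h"},
    "b": {"h", "c"},
    "c": {"b", "d1", "d2"},
    "d1": {"c"},
    "d2": {"c"},
}


def degree_centrality(adj):
    """Number of direct ties for each node."""
    return {node: len(neighbours) for node, neighbours in adj.items()}


def closeness_centrality(adj):
    """(n - 1) / sum of shortest-path distances; assumes a connected graph."""
    n = len(adj)
    scores = {}
    for source in adj:
        # Breadth-first search gives shortest paths in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        scores[source] = (n - 1) / sum(dist.values())
    return scores


degree = degree_centrality(graph)
closeness = closeness_centrality(graph)

print(max(degree, key=degree.get))        # a hub: "h" or "c" (tied on degree)
print(max(closeness, key=closeness.get))  # the bridge: "b"
```

Here the two hubs have the most ties, yet the bridging node with only two ties has the highest closeness, which is the kind of disagreement between metrics the lecture discusses.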

* Borgatti, Stephen “Centrality and Network Flow”. 2006. Social Networks. 27(1):55-71.

* Borgatti, Stephen et al. Analyzing Social Networks. 2013. Thousand Oaks, CA: Sage. • Chapter 10: Centrality, pp. 163-180

Borgatti, Stephen and Martin Everett “Notions of Position in Social Network Analysis”. 1992. Sociological Methodology. 22:1-35.

Newman, Mark Networks: An Introduction. 2010. Oxford, UK: Oxford University Press. • Chapter 7: Measures and Metrics. Section 7.1-7.7, pp. 168-192

Burt, Ronald “Network-Related Personality and the Agency Question: Multirole Evidence from a Virtual World” 2012. American Journal of Sociology. 118(3):543-591.

Cha, Meeyoung et al. “Measuring user influence in twitter: The million follower fallacy”. 2010. ICWSM ‘10.

Week 4. Basic network metrics II: Clustering

Substantive: Certainly from visual inspection, network clustering appears to be an obvious and important aspect of network analysis. Indeed, macro-level clustering has proven to be very useful in identifying and classifying parts of a large network structure. Techniques for proposing and optimizing clusters have evolved rapidly in the last decade, but this work is running up against the challenge of ‘ground truth’.


Practical: We will apply a variety of community detection and clustering algorithms to a series of networks, including personal Facebook networks, to explore how different methods arrive at different results and why.

* Hansen, Derek et al. Analyzing Social Networks with NodeXL. New York: Morgan Kaufmann. • Chapter 7: Clustering and Grouping, pp. 93-102

* Porter, Mason A. et al. “Communities in Networks”. 2009. Notices of the AMS. 56(9):1082-1097.

Tyler, Joshua et al. “Email as Spectroscopy: Automated Discovery of Community Structure within Organizations”. 2003. Proc. Communities & Technologies ‘03.

Traud, Amanda L. et al. “Comparing Community Structure to Characteristics in Online Collegiate Social Networks” 2011. SIAM Review. 53(3): 526-543.

Adamic, Lada A. and Natalie Glance “Divided they Blog”. WWW ‘05.

Friggeri, Adrien et al. “Triangles to capture social cohesion” 2011. 3rd International Conference on Social Computing [SOCCOM ‘11]. IEEE.

Leskovec, Jure et al. “Statistical Properties of Community Structure in Large Social and Information Networks”. WWW ‘08. ACM Press.

Week 5: BREAK

Week 6. Modeling networks I: Triads and dependency models

Substantive: Triads are the basis of a huge amount of network study. In many respects it is triads, not dyads, that are the basic building blocks of sociality. They come in a huge range of forms, but their conceptual foundations in balance theory mark many network processes. Unfortunately, the analysis of triads is statistically formidable and computationally complex. Consequently, present work on triads tends to focus on smaller data sets and whole groups rather than large swaths of the web. Granovetter’s article is a watershed for thinking about the emergent properties of triads.

Practical: We perform a simple triad census of an online network and fit a simple ERGM that models this network using triadic properties.
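A triad census can be sketched in a few lines for the undirected case. The example below is a simplification and an assumption on our part: the standard census for directed networks distinguishes sixteen triad types, whereas this standard-library Python sketch only counts, for every set of three nodes, how many of the three possible ties are present. The toy network and the function names are hypothetical.

```python
from itertools import combinations

# Hypothetical undirected network: one closed triangle (a, b, c) plus a pendant node d.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]


def undirected_triad_census(node_list, edge_list):
    """Count triads by number of ties present (0, 1, 2 or 3)."""
    edge_set = {frozenset(e) for e in edge_list}
    census = {0: 0, 1: 0, 2: 0, 3: 0}
    for a, b, c in combinations(sorted(node_list), 3):
        ties = sum(frozenset(p) in edge_set for p in ((a, b), (a, c), (b, c)))
        census[ties] += 1
    return census


def transitivity(census):
    """Share of connected triples that are closed: 3T / (3T + open two-paths)."""
    closed = 3 * census[3]
    total = closed + census[2]
    return closed / total if total else 0.0


census = undirected_triad_census(nodes, edges)
print(census)                # {0: 0, 1: 1, 2: 2, 3: 1}
print(transitivity(census))  # 0.6
```

Enumerating every triple grows with the cube of the number of nodes, which gives a small-scale sense of why the text above calls triad analysis computationally complex.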

* Granovetter, Mark S. “The Strength of Weak Ties”. 1973. American Journal of Sociology. 78:1360-1380.

* Wimmer, Andreas and Kevin Lewis “Beyond and Below Racial Homophily: ERG Models of a Friendship Network Documented on Facebook”. 2010. American Journal of Sociology. 116(2):583-642.

Kossinets, Gueorgi and Duncan J. Watts “Empirical Analysis of an Evolving Social Network”. 2006. Science. 311(5757):88-90.

Leskovec, Jure, Daniel Huttenlocher and Jon Kleinberg “Signed networks in social media”. 2010. SIGCHI ‘10. ACM Press. pp. 1361-1370.


Robins, Garry “Chapter 32. Exponential Random Graph Models for Social Networks”. In Scott, J and P. Carrington. 2011. Sage Handbook of Social Network Analysis. pp. 484-500.

Snijders, Tom A. B. “Chapter 33. Network Dynamics”. In Scott, J and P. Carrington. 2011. Sage Handbook of Social Network Analysis. pp. 501-513.

Simmel, Georg “The Triad”. 1950. In Wolff, K. The Sociology of Georg Simmel. pp. 145-169.

Week 7. Modeling Networks II: Diffusion and generative models

Substantive: At this point in the course we know that networks have large-scale structural properties, and that microprocesses such as triadic closure can lead to such properties. Researchers have sought to put these ideas together in generative models to explore how simple rules at the micro-level lead to specific outcomes at the macro-level. Much of this work happened first outside of sociology and has been more associated with physics and “network science”. Nevertheless, such work helps inform models of diffusion as well as topology.

Practical: We generate a series of artificial networks under a variety of statistical conditions and evaluate the structure and properties of these networks.
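As one possible illustration of generating artificial networks under different conditions, the sketch below builds a uniform random graph and a preferential-attachment graph and compares their degree sequences. The choice of these two models is our assumption (the second is in the spirit of the Barabási and Albert reading listed for this week), and the function names, parameters and seed are hypothetical. It uses only the Python standard library.

```python
import random


def erdos_renyi(n, p, rng):
    """Uniform random graph: each of the n*(n-1)/2 possible ties exists with probability p."""
    return {(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p}


def preferential_attachment(n, m, rng):
    """Growth model: each new node ties to m distinct existing nodes,
    chosen with probability proportional to their current degree."""
    # Seed network: a complete graph on the first m + 1 nodes.
    edges = {(i, j) for i in range(m + 1) for j in range(i + 1, m + 1)}
    # Each node appears in `stubs` once per tie, so a uniform draw is degree-weighted.
    stubs = [node for edge in sorted(edges) for node in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for target in sorted(targets):
            edges.add((target, new))
            stubs.extend((target, new))
    return edges


def degrees(n, edges):
    """Degree of every node, indexed by node id."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


rng = random.Random(42)  # fixed seed so that a run can be repeated
n, m = 200, 2
pa_edges = preferential_attachment(n, m, rng)
# Match the expected number of ties so the two models are comparable.
p = len(pa_edges) / (n * (n - 1) / 2)
er_edges = erdos_renyi(n, p, rng)

print("max degree, preferential attachment:", max(degrees(n, pa_edges)))
print("max degree, uniform random:", max(degrees(n, er_edges)))
```

With a similar number of ties in both graphs, the preferential-attachment graph will typically show a few very high-degree nodes that the uniform random graph lacks; the exact figures depend on the seed, so none are quoted here.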

* Aral, Sinan and Marshall Van Alstyne “The Diversity-Bandwidth Trade-off”. 2010. American Journal of Sociology. 117(1):90-171.

* Watts, Duncan “The ‘New’ Science of Networks”. 2004. Annual Review of Sociology, 30: 243-270.

Bauckhage, Christian, Kristian Kersting and Fabian Hadiji “Mathematical Models of Fads Explain the Temporal Dynamics of Internet Memes”. 2013. ICWSM ‘13. AAAI.

Barabási, Albert-László and Réka Albert “Emergence of scaling in random networks”. 1999. Science. 286:509-512.

Clauset, Aaron, Cosma R. Shalizi and M. E. J. Newman “Power-law distributions in empirical data”. 2009. SIAM Review. 51:661-703.

Adamic, Lada and Eytan Adar “How to search a social network”. 2006. Social Networks. 27:187-203.

Uzzi, Brian and Jarrett Spiro “Collaboration and creativity: the small world problem”. 2005. American Journal of Sociology. 111:447-504.


Week 8. Network visualization techniques

Substantive: Networks are highly visual research artifacts. See any network analysis book cover and you are virtually guaranteed to find some dots and lines. Freeman defines visual presentation of data as one of the core elements of the network analysis field. Yet, there is both an art and a science to visualization that is rarely appreciated or considered explicitly. Given the centrality [sic] of visualization to the analysis and presentation of network science, it is crucial to consider what makes for a good visualization or a poor one and why.

Practical: This week we play with a number of visualization packages. A set of networks will be provided to demonstrate how differing visualization algorithms can reveal different salient facts about the underlying network as well as show how no single picture communicates all facets of a network.

* Freeman, Linton C. “Visualizing Social Networks”. 2000. Journal of Social Structure. 1(1):0.

* Hennig, Marina et al. Studying Social Networks. 2013. Berlin: Springer-Verlag. • Chapter 4: Visualization, pp. 149-182

Borgatti, Stephen et al. Analyzing Social Networks. 2013. Thousand Oaks, CA: Sage. • Chapter 7: Visualization, pp. 100-125

Welser, Howard T. et al. “Visualizing Social Networks”. 2008. Journal of Social Structure. 8(1). Online: http://www.cmu.edu/joss/content/articles/volume8/Welser/

Lima, Manuel “Chapter 5: The Syntax of a New Language”. 2011. Visual Complexity. Pp. 159-220.

Noack, Andreas “Modularity clustering is force-directed layout”. 2009. Physical Review E. 79(2):026102.

Week 9. Theorizing networks

Substantive: The class concludes the lectures with a retrospective discussion on the power of networks and network analysis. However, rather than read self-congratulatory work on the potential of network science, we review work that has challenged some of its basic assumptions, often from its own practitioners.

Practical: The first half of the class will be dedicated to five-minute “lightning” presentations showcasing the work-in-progress based on the formative assessment. Each presentation should briefly cover the research question, the research design and either preliminary or expected findings.

* Abbott, Andrew “Transcending General Linear Reality”. 1988. Sociological Theory. 6(2):169-186.

* Martin, John L. “Life’s a beach, but you’re an ant and other unwelcome news for the sociology of culture”. 2010. Poetics. 38:228-243.

* Emirbayer, Mustafa and Jeff Goodwin “Network Analysis, Culture, and the Problem of Agency”. 1994. American Journal of Sociology. 99(6):1411-1454.


Latour, Bruno “The Whole is Always Smaller Than Its Parts: A Digital Test of Gabriel Tarde’s Monads”. 2012. British Journal of Sociology. 63(4):590-615.

Manzo, Gianluca “Symposium Review: The Whole is Greater than the Sum of its Parts: Some Remarks on the Oxford Handbook of Analytical Sociology”. 2011. European Sociological Review. 27(6): 824-835.

Gross, Neil “The Mechanistas”. 2013. Contemporary Sociology. 42: 368.

Emirbayer, Mustafa “Manifesto for a Relational Sociology”. 1997. American Journal of Sociology. 103(2): 281-317.

Please note: Option papers will only run if selected by at least four students.
