Social Network Analysis Hansen and Smith
Heather Huynh

What is Social Network Analysis?
• Social network analysis (SNA) is the systematic study of collections of social relationships, which consist of social actors implicitly or explicitly connected to one another
• Entities joined together by relationships
• Relationships are used to measure changes in patterns of relationships and workflow that are not visible in more common metrics (count of users, rates of resource usage)
• This perspective distinguishes between simple population growth and the development of important social structures within that population
• Social networks existed long before the internet, but social networking services like Facebook and LinkedIn support the creation of large, distributed, real-time social networks

History of Social Network Analysis: Foundational phase (18th century – 1970s)
• Focus on defining terms and establishing the necessary mathematical graph theory foundation
• Erdős and Rényi: formal mechanisms for generating random graphs that made statistical tests of network properties viable
• Moreno, Warner, and Mayo: applied formal mathematical methods to describe, analyze, and visualize networks (“psychological geography”, “sociometrics”, and “sociograms”)
• Milgram: six degrees of separation
• Granovetter: showed that “weak ties” are a much better source of new jobs than “strong ties”, demonstrating the value of the social network approach

History of Social Network Analysis: Computational phase (1970s – mid-1990s)
• Creation and systematic use of computational tools and methods
• SNA as a methodological approach that leveraged the new capabilities of computers to analyze and visualize networks in novel ways
• By the mid-1990s, SNA was a well-respected approach in numerous fields (organizational behavior, social psychology, communication networks, epidemiology, etc.)
• “SNA Bible”: Social Network Analysis: Methods and Applications by Stanley Wasserman and Katherine Faust
• Summarizes decades of research into a coherent mathematical framework, identifying core metrics and techniques used by SNA tools and researchers today

History of Social Network Analysis: Network Deluge phase (current)
• People outside academia, such as corporations, governments, and nonprofits, now use SNA techniques
• Many tools have been created: Pajek, SNAP, NodeXL, and Gephi
• Mining of data from Facebook, IM services, and other social media channels
• Techniques pioneered for inferring friendship networks from data captured via mobile devices

Goals of Social Network Analysis for HCI Researchers

Goal 1: Inform the design and implementation of new CSCW systems
• SNA can characterize the social structure of a population of intended users of a new CSCW system before the system is put in place
• Research has shown that mapping the social network of members of a large organization can help design social and technical strategies to facilitate more effective information flow
• Use SNA to identify, educate, and leverage the people who will most influence the spread of adoption through the network, either to ensure the rapid, effective use of a new technology or to help others learn how to use it
• Data for these analyses may come from network surveys or from existing data sources such as communication exchanges
• Individuals with unique and important network positions can be identified and interviewed or observed as part of a comprehensive contextual inquiry process

Goal 2: Understand and improve current CSCW systems
• SNA of data from existing CSCW systems can illustrate the ways current features are utilized by users in different locations in the network
• SNA may help community managers understand what is happening in large-scale communities where reading through even a meaningful sample of the content is not feasible
• Example: knowing about the “Theorist” subgroup on Lostpedia allowed designers to develop tools, such as page templates, to meet that subgroup's needs
• Several studies have developed recommendations for improving virtual reality games based on network analysis of guild networks and social interaction patterns
• Network methods that identify subpopulations can offer customized interfaces and services to different groups of users, using the history of other users in the same group as a guide
• Education researchers have shown how students use different social features to interact within small groups and class-wide, with implications for system design and instructional strategies

Goal 3: Evaluate the impact of a CSCW system on social relationships
• Evaluate the impact of a CSCW system on the existing social structure of a population
• Measuring the changes in aggregate and person-specific network metrics can help systematically evaluate the effectiveness of such systems
• Evaluation can also be performed to assess the impact of a specific feature or social intervention (e.g., the effect of an online “icebreaker”)
• Education researchers are also using network data to identify students using online course management systems who may be in need of extra support
• Data for evaluation assessments may come from offline network surveys, existing communications captured over time, or system usage data
• For large-scale evaluations, SNA can be used as part of a mixed-method approach (e.g., to identify whom to interview in a network)

Goal 4: Design novel CSCW systems and features using SNA methods
• SNA can be used as input to new CSCW systems and features
• A growing number of research prototypes and innovative products leverage SNA metrics and methods to provide enhanced functionality
• Example: work on identifying the political tendencies of followers of different news agencies on Twitter, which could be used in tools that personalize news
• Recent work has explored the theoretical and practical design implications of promoting “social translucence” within directed social network systems, like Twitter, where users can only see a portion of the social space (unlike chatrooms and discussion forums)

Goal 5: Answer fundamental social science questions
• “Computational social science”: a set of computational techniques used to address core social science questions in novel ways
• The large volume of data automatically captured via social media provides new opportunities to test hypotheses and theories at a much larger scale than previously possible
• Predicting the strength of ties from social media interactions or mobile phone usage patterns can support further large-scale studies of social networks by reducing the need for raw data collection from users
• Work done by professors and students here at UIUC!

Performing Social Network Analysis

Identify Goals and Research Questions
• It is essential that analysts focus on a few critical goals and turn them into specific research questions
• Within HCI, SNA is often exploratory in nature, and analysts may only recognize what they are looking for once they see it
• Questions are often refined after preliminary analysis of initial data

Types of Questions SNA Answers
• Questions about individual social actors
  • Find prominent individuals; use “centrality metrics” or “equivalence metrics”
• Questions about overall network structure
  • Focus on the overall distribution instead of the position of individuals; use “community detection algorithms” (network clustering) and a variety of “aggregate network metrics”
• Questions about network dynamics and flows
  • How networks change over time and how information and other resources flow through networks (information diffusion)

Collect Data
• Sources of data (relative difficulty in brackets):
  • Raw data from system usage (e.g., database or XML files) [Medium-high]
  • Network survey [High]
  • Application Programming Interfaces (APIs) [Medium-high]
  • Screen scraping [Medium-high]
  • Network analysis importer tools (can import from 3rd party sites) [Easy]
  • Existing datasets, like the Enron email network and Amazon related items (more at http://snap.stanford.edu/data/) [Easy]
• The type of social network determines how to appropriately analyze, visualize, and interpret the data
• The type is determined by the underlying phenomenon the network represents (e.g., Facebook vs. Twitter relationships)

Networks can be…
• Directed vs. undirected: directed = ties are not necessarily reciprocated; undirected = ties are always mutual
• Weighted vs. unweighted: weighted = edges have values associated with them; unweighted = edges either exist or do not
• Multiplex: the network includes multiple types of edges (it can be analyzed as one multiplex network or as multiple distinct networks)
• Unimodal vs. multimodal: unimodal networks include only one type of node (e.g., all nodes represent people); multimodal networks include more than one type of node. Networks with exactly two types of node are called bimodal or bipartite networks, and these can be transformed into unimodal networks
• Partial: an “egocentric network” includes a single node called an “ego” and all nodes that the ego is directly connected to (called “alters”); adding further connections extends the network by additional degrees. A large network can also be sampled, using some network boundary, to create partial networks
• A single socio-technical system contains many types of networks; the choice of which to focus on depends on the goals of your study

Representing Network Data
• Network data can be represented in three primary ways:
  • Edge lists – adjacency lists
  • Matrices – adjacency matrix
  • Graphs – visually show nodes as vertices and edges as lines connecting them
• Representations usually include additional attribute data to describe nodes and/or edges
• In practice, there are several common network file formats: .graphml, .net, .gml, .dot,