What Is Special About Social Network Analysis?
Total Page:16
File Type:pdf, Size:1020Kb
What Is Special About Social Network Analysis? Marijtje A. J. van Duijn1 and Jeroen K. Vermunt2 1University of Groningen, the Netherlands 2Tilburg University, the Netherlands Abstract. In a short introduction on social network analysis, the main characteristics of social network data as well as the main goals of social network analysis are described. An overview of statistical models for social network data is given, pointing at differences and similarities between the various model classes and introducing the most recent developments in social network modeling. Keywords: complete network data, exponential random graph models, personal network data, random effect models, social network analysis Several types of social network data are distinguished. Introduction The two main types, which are both present in this special Social network analysis is an interdisciplinary field of re- issue, are ego-centered or personal networks, and complete search with a long history of input from sociology, anthro- or one-mode networks. Ego-centered network data are usu- pology, statistics, mathematics, information sciences, edu- ally collected from a sample of actors (egos) reporting on cation, psychology, and other disciplines. It still is a very the ties with and between other people (alters). The rela- active research area, as is evident from the many recent tional system is then assumed to be composed of the sam- publications on social network analysis. pled egos and reported alters and their ties, as well as pos- The large interest in social networks can be understood sible additional actor and tie information. Complete in view of the important theoretical and intuitively appeal- network data, on the other hand, concern a well-defined ing research questions connected with social networks and group of actors who report on their ties with all other actors the challenging methodological problems associated with in the group. The ties reported by actors can usually not be the collection and analysis of social network data. This assumed to be independent, which makes personal and fruitful combination of content and methodology has stim- complete network sampling nonstandard. Rather than dis- ulated lots of research in the past, described, for example, cussing network sampling in more detail here, we refer to by Wasserman and Faust (1994, chap. 1). Both aspects of Wasserman and Faust (1994) and Frank (2005) and the social network analysis involve theoretical as well as em- references therein. pirical problems, which makes the challenge even greater Social network data can be collected in various ways. and the research more rewarding. The most common approach is by means of questionnaires, Wasserman, Scott, and Carrington (2005) note the rapid but interviews, observations, and secondary sources are recent developments in the analysis of social network data also frequently used network data collection methods (see but do not offer an explanation. We think that the increase also Marsden, 2005). In research utilizing egocentered net- in computer and computing facilities is an important factor. work data, it is important to obtain as complete a picture We expect the advances in social network analysis to con- of the respondents’ networks as possible, which requires tinue and present examples of some of the most recent special tools for helping respondents to delineate their net- developments in this special issue of Methodology. works. A commonly used tool for this purpose is the so- called name generator, which provides a clear definition of which persons known by ego qualify as a network member Social Network Analysis (or alter) of ego. In a recent study, Van der Gaag and Sni- jders (2005) developed a new instrument for the measure- Social network analysis aims at understanding the network ment of ego’s social capital that was named the resource structure by description, visualization, and (statistical) generator. In this special issue, Gerich and Lehner (2006) modeling. Social network data consist of various elements. show how computer-assisted self-administered interviews Following the definition by Wasserman and Faust (1994, (CASI) with name generators can provide an improvement p. 89), social network data can be viewed as a social re- in personal network data collection. lational system characterized by a set of actors and their Tie variables are often, though not necessarily, dichot- social ties. Additional information in the form of actor at- omous, indicating the presence or absence of a relationship. tribute variables or multiple relations can be part of the This facilitates a nice depiction of the network in a graph social relational system. or sociogram. The accompanying mathematical represen- Methodology 2006; Vol. 2(1):2–6 ᭧ 2006 Hogrefe & Huber Publishers DOI 10.1027/1614-1881.2.1.2 M. A. J. van Duijn & J. K. Vermunt: What Is Special About Social Network Analysis? 3 tation is an adjacency matrix with 0s and 1s, where the diagonal is usually not defined (actors do not indicate ties Statistical Models for Social Network with themselves). If the graph is undirected (for instance Analysis when relations between actors are observed instead of self- reported) the adjacency matrix is symmetric. The statistical models applied in social network analysis From mathematical graph theory a variety of concepts are typically nonstandard because the common assumption describing the properties of the network are available, such of independent observations does not hold: The multiple as reciprocity (when two actors indicate the existence of a ties to and from the same actor are related. Moreover, the tie between them), stars (when one central actor is con- popular assumption of continuous normally distributed nected to a number of other, unconnected actors—personal variables does not hold when tie variables are binary, nom- network data can be viewed as a collection of stars), and inal, ordinal, or count variables. Below, three sometimes cliques (when there is a group of at least three actors that interrelated classes of statistical models for complete net- are all connected to each other). Many more concepts are work data are discussed: (a) dyadic interaction models, available, and the reader is referred to Wasserman and (b) exponential random graph models, and (c) stochastic Faust (1994) for a complete overview. actor-oriented models. But first we introduce several sta- In the analysis of complete networks, a distinction can tistical models for the analysis of ego-centered network be made between (a) descriptive methods, also through data. graphical representations (see Freeman, 2005); (b) analysis The dependence structure of personal network data is procedures, often based on a decomposition of the adja- least complex if there is no information on ties between cency matrix; and (c) statistical models based on probabil- alters, and if at the same time egos have neither overlapping ity distributions. Visualization by displaying a sociogram networks nor are members of other egos’ networks. Then, as well as a summary of graph theoretical concepts pro- the data have a neat hierarchical structure, where alters are vides a first description of social network data. For a small ordered (or nested) in egos. If the tie variable can be treated graph this may suffice, but usually the data and/or research as continuous, standard linear two-level models (also known questions are too complex for this relatively simple ap- as hierarchical, random effects, or mixed effects models) proach. for personal network data can be defined in a straight- Often it is of interest to compare actors on the basis of forward way (Van Duijn, Van Busschbach, & Snijders, their tie variables, possibly also taking into account other 1999), distinguishing an ego-specific error (or random) actor characteristics. The identification of subgroups is an term in addition to the usual (dyadic) error term. For binary important area in social network analysis, for which visu- data, a multilevel logistic model can be used, as was done alization tools can be very helpful. In this special issue, by Wellman and Frank (2001). The advantages of the mul- Brandes, Kenis, and Raab (2006) make a strong case for tilevel modeling framework are that it is extremely flexible the explanatory power of network visualization, based on and that it does not require balanced data, that is, the same complex mathematical algorithms implemented in special- number of alters for each ego. ized network visualization software called visone (Brandes In this special issue, Vermunt and Kalmijn (2006) pro- & Wagner, 2003). pose a related conditional logit modeling approach for the Another important goal of social network analysis is the modeling of categorical (nominal) personal network data modeling of ties between actors, so as to explain and/or in the presence of categorical covariates. It deals with the predict the observed network. Several modeling ap- dependence between alters within egos using either a para- proaches exist that are discussed in more detail in the next metric or nonparametric random effects approach. The section. As is shown by four articles in this special issue, parametric formulation leads to a model with a high-di- the modeling becomes more complex for richer social net- mensional random structure, which Vermunt and Kalmijn work data—that is, when actor attributes, multiple net- (2006) restrict by superimposing a factor-analytic model. works, and/or multiple observations of the same network In the nonparametric formulation, egos are assigned to la- are available. tent classes. The availability of software is an important condition Models for complete directed network data can be ob- for the feasibility of social network analysis, especially for tained by extending the multilevel approach. In that case, applied researchers. Some specialized software has been two observations instead of one are available for each pair available for a long time. As noted earlier, the development of actors, the dyad, and we no longer have a convenient of new or improved software for social network analysis hierarchical structure but a cross-nested structure where the seems continuous.