From psychology to phylogeny: bridging levels of analysis in cultural evolution

by

Mason Patrick Youngblood

A dissertation submitted to the Graduate Faculty in Psychology in partial fulfillment of the requirements for the degree of Doctor of Philosophy, The City University of New York

2021 ii

© 2021

Mason Patrick Youngblood All Rights Reserved iii

This manuscript has been read and accepted by the Graduate Faculty in Psychology in satisfaction of the dissertation requirement for the degree of Doctor of Philosophy.

Professor David Lahti

Date Chair of Examining Committee

Professor Richard Bodnar

Date Executive Officer

Professor Carolyn Pytte Professor Stefano Ghirlanda Supervisory Committee

The City University of New York iv

Abstract

From psychology to phylogeny: bridging levels of analysis in cultural evolution by

Mason Patrick Youngblood

Advisor: Professor David Lahti

Cultural evolution, or change in the socially learned behavior of a population over time, is a fascinating phenomenon that is widespread in humans and present in some non-human animals. In this dissertation, I present an array of cultural evolutionary studies that bridge pattern and process in a wide range of research models including music, extremism, and birdsong. The first chapter is an introduction to the field of cultural evolution, including a bibliometric analysis of its structure. The second and third chapters are studies on the cul- tural dynamics of music sampling traditions in hip-hop and electronic music communities and far-right extremism in the United States, using social network analysis and epidemiological modeling, respectively. The fourth and fifth chapters are studies on how cultural transmis- sion biases influence population-level changes in music sampling traditions and house finch song, using a combination of agent-based modeling and machine learning. The sixth chapter is a technical report on computerized birdfeeders that were used to remotely collect data on the social network structure of a wild house finch population. Lastly, the seventh chapter applies a novel phylogenetic method based on dynamic community detection to reconstruct the cultural evolution of electronic music.

Chapter 1 The science of cultural evolution is unified in its application of evolution- ary logic to socially transmitted behavior, but diverse in methodologies and assumptions. v

Qualitative reviews have encouraged integration by illuminating points of divergence and fostering interaction. This effort would be greatly enhanced by quantitative data on pat- terns of collaboration and idea sharing in the literature. In the present study, we apply a novel combination of network, cluster, and bibliometric analyses to an extensive dataset of publications on cultural evolution, in order to represent the structure of the field and evaluate the level of disciplinary integration. We first construct a co-authorship network and identify subdisciplines. We then use bibliometric analyses to describe each subdiscipline and investigate trends in collaboration and productivity. Lastly, we assess the topographical distance and degree of citation sharing between subdisciplines, as well as the diversity of subject categories within subdisciplines. Our results reveal an increase in productivity and collaboration over time, albeit a higher inequality in author productivity than expected. Our structural approach reveals research subcommunities with differential levels of integration, citation sharing, and subject diversity. These findings confirm the emergence of a vigor- ous interdisciplinary field, and indicate ways to foster integration and synthesis in cultural evolution.

Chapter 2 Music sampling is a common practice among hip-hop and electronic producers that has played a critical role in the development of particular subgenres. Artists preferen- tially sample drum breaks, and previous studies have suggested that these may be culturally transmitted. With the advent of digital sampling technologies and social media the modes of cultural transmission may have shifted, and music communities may have become de- coupled from geography. The aim of the current study was to determine whether drum breaks are culturally transmitted through musical collaboration networks, and to identify the factors driving the evolution of these networks. Using network-based diffusion analy- sis we found strong evidence for the cultural transmission of drum breaks via collaboration between artists, and identified several demographic variables that bias transmission. Ad- vi ditionally, using network evolution methods we found evidence that the structure of the collaboration network is no longer biased by geographic proximity after the year 2000, and that gender disparity has relaxed over the same period. Despite the delocalization of com- munities by the internet, collaboration remains a key transmission mode of music sampling traditions. The results of this study provide valuable insight into how demographic biases shape cultural transmission in complex networks, and how the evolution of these networks has shifted in the digital age.

Chapter 3 Increasing levels of far-right extremist violence have generated public concern about the spread of radicalization in the United States. Previous research suggests that rad- icalized individuals are destabilized by various environmental (or endemic) factors, exposed to extremist ideology, and subsequently reinforced by members of their community. As such, the spread of radicalization may proceed through a social contagion process, in which extrem- ist ideologies behave like complex contagions that require multiple exposures for adoption. In this study, I applied an epidemiological method called two-component spatio-temporal in- tensity modeling to data from 416 far-right extremists exposed in the United States between 2005 and 2017. The results indicate that patterns of far-right radicalization in the United States are consistent with a complex contagion process, in which reinforcement is required for transmission. Both social media usage and group membership enhance the spread of ex- tremist ideology, suggesting that online and physical organizing remain primary recruitment tools of the far-right movement. In addition, I identified several endemic factors, such as poverty, that increase the probability of radicalization in particular regions. Future research should investigate how specific interventions, such as online counter-narratives to battle pro- paganda, may be effectively implemented to mitigate the spread of far-right extremism in the United States. vii

Chapter 4 One of the fundamental questions of cultural evolutionary research is how individual-level processes scale up to generate population-level patterns. Previous studies in music have revealed that frequency-based bias (e.g. conformity and novelty) drives large-scale cultural diversity in different ways across domains and levels of analysis. Music sampling is an ideal research model for this process because samples are known to be culturally trans- mitted between collaborating artists, and sampling events are reliably documented in online databases. The aim of the current study was to determine whether frequency-based bias has played a role in the cultural transmission of music sampling traditions, using a longitudi- nal dataset of sampling events across three decades. Firstly, we assessed whether turn-over rates of popular samples differ from those expected under neutral evolution. Next, we used agent-based simulations in an approximate Bayesian computation framework to infer what level of frequency-based bias likely generated the observed data. Despite anecdotal evidence of novelty bias, we found that sampling patterns at the population-level are most consistent with conformity bias. We conclude with a discussion of how counter-dominance signaling may reconcile individual cases of novelty bias with population-level conformity.

Chapter 5 In this study, we used a longitudinal dataset of house finch (Haemorhous mexicanus) song recordings spanning four decades in the introduced eastern range to as- sess how individual-level cultural transmission mechanisms drive population-level changes in birdsong. First, we developed an agent-based model (available as a new R package called TransmissionBias) that simulates the cultural transmission of house finch song given differ- ent parameters related to transmission biases, or biases in social learning that modify the probability of adoption of particular cultural variants. Next, we used approximate Bayesian computation and machine learning to estimate what parameter values likely generated the temporal changes in diversity in our observed data. We found evidence that strong content bias, likely targeted towards syllable complexity, plays a central role in the cultural evolution viii of house finch song in western Long Island. Frequency and demonstrator biases appear to be neutral or absent. Additionally, we estimated that house finch song is transmitted with extremely high fidelity. Future studies should use our simulation framework to better un- derstand how cultural transmission and population declines influence song diversity in wild populations.

Chapter 6 Radio-frequency identification (RFID) is an increasingly popular wireless tech- nology that allows researchers to monitor wild bird populations from fixed locations in the field. I have developed an RFID-equipped birdfeeder based on the Raspberry Pi Zero W, a low-cost single-board computer, that collects continuous visitation data from birds marked with passive integrated transponder (PIT) tags. Each birdfeeder has a perch antenna con- nected to an RFID-reader board on a Raspberry Pi powered by a portable battery. When a tagged bird lands on the perch to eat from the feeder, its unique code is stored with the date and time on the Raspberry Pi. These birdfeeders require only basic soldering and coding skills to assemble, and can easily be outfitted with additional hardware like video cameras and microphones. I outline the process of assembling the hardware and setting up the operating system for the birdfeeders. Then, I describe an example implementation of the birdfeeders to track house finches (Haemorhous mexicanus) on the campus of Queens College in New York City.

Chapter 7 Cultural phylogenies, or “trees” of culture, are typically built using methods from biology that use similarities and differences in artifacts to infer the historical rela- tionships between the populations that produced them. While these methods have yielded important insights, particularly in linguistics, researchers continue to debate the extent to which cultural phylogenies are tree-like or reticulated due to high levels of horizontal trans- mission. In this study, we propose a novel method for phylogenetic reconstruction using ix dynamic community detection that explicitly accounts for transmission between lineages. We used data from 1,498,483 collaborative relationships between electronic music artists to construct a cultural phylogeny based on observed population structure. The results suggest that, although vertical transmission appears to be dominant, horizontal transmission is com- mon and populations never become fully isolated from one another. In addition, we found evidence that electronic music diversity has increased between 1975 and 1999. The method used in this study is available as a new R package called DynCommPhylo. Future studies should apply this method to other cultural systems such as academic publishing and film, as well as biological systems where high resolution reproductive data is available, to assess how levels of reticulation in evolution vary across domains. Acknowledgments

Thank you to Mom, Dad, Ben, and Brian for the lifelong love, encouragement, and support that have allowed me to dream big in all of my endeavors, scientific and otherwise. Thank you to Emily, Ryan, and all of my friends for constantly challenging and inspiring me to be a more creative, driven, and empathetic human. Thank you to my advisor, David Lahti, whose openness, diverse intellectual interests, and deep theoretical background are directly responsible for the scientist that I have become. The critical evolutionary perspective that you have instilled in me will act as the foundation for my research and worldview for the rest of my life. Thank you to my committee members, Carolyn Pytte and Stefano Ghirlanda, whose expertise, curiosity, and patience have made me feel confident in pursuing interdisciplinary research. Thank you to Andy Richards, Bobby Habig, Danielle Wasserman, Aaron Owen, Franny Geller, and the entire Lahti Lab for all of your friendship and feedback over the last six years. Our discussions in and out of the lab have set an impossibly high standard for future colleagues. Thank you to my brilliant coauthors who contributed so much to two of the studies in this dissertation: David Lahti in the first chapter, and Karim Baraghith and Patrick Savage in the final chapter. Finally, this research was supported, in part, by the John Templeton Foundation, the National Science Foundation (CNS-0958379, CNS-0855217, ACI-1126113), and the City University of New York High Performance Computing Center at the College of Staten Island.

x Contents

Contents xi

List of Tables xiv

List of Figures xvii

1 A bibliometric analysis of the interdisciplinary field of cultural evolution 1 1.1 Introduction ...... 1 1.2 Methods ...... 6 1.3 Results ...... 9 1.4 Discussion ...... 16

2 Cultural transmission modes of music sampling traditions remain stable despite delocalization in the digital age 23 2.1 Introduction ...... 23 2.2 Methods ...... 26 2.3 Results ...... 29 2.4 Discussion ...... 33

3 Extremist ideology as a complex contagion: the spread of far-right radi- calization in the United States between 2005-2017 37

xi CONTENTS xii

3.1 Introduction ...... 37 3.2 Methods ...... 42 3.3 Results ...... 47 3.4 Discussion ...... 49

4 Conformity bias in the cultural transmission of music sampling traditions 57 4.1 Introduction ...... 57 4.2 Methods ...... 60 4.3 Results ...... 65 4.4 Discussion ...... 67

5 Content bias in the cultural evolution of house finch song 71 5.1 Introduction ...... 71 5.2 Methods ...... 76 5.3 Results ...... 83 5.4 Discussion ...... 86

6 A Raspberry Pi-based, RFID-equipped birdfeeder for the remote moni- toring of wild bird populations 90 6.1 Introduction ...... 90 6.2 Hardware ...... 93 6.3 Software ...... 95 6.4 Example ...... 99

7 Phylogenetic reconstruction of the cultural evolution of electronic music via dynamic community detection (1975–1999) 103 7.1 Introduction ...... 103 7.2 Methods ...... 108 CONTENTS xiii

7.3 Results ...... 113 7.4 Discussion ...... 120

Appendix A 126

Appendix B 131

Appendix C 144

Appendix D 147

Appendix E 151

Appendix F 156

Bibliography 161 List of Tables

2.1 The results of the multiplicative model for the OADA including all individual- level variables. The top panel shows the model estimate, effect size, and p-

value for each individual-level variable. The bottom panel shows the AICc for the model with and without social transmission and the p-value from the likelihood ratio test...... 30 2.2 The results of the STERGM analyses for before and after 2000. The table shows the effect size and p-value for gender, popularity, followers, and mean distance during each time period...... 32

3.1 The results of the twinstim modeling, with estimated rate ratios (RR), Wald confidence intervals, and p-values. Endemic predictors (including the overall time trend) are in the top portion of the table, whereas epidemic predictors are in the bottom...... 48

4.1 Notable sampling events for the five most sampled drum breaks used in the current study. The number of times each drum break has been sampled was collected from WhoSampled on June 27th, 2019...... 61 4.2 The number of votes cast by the trained random forest for each model after being provided with the observed summary statistics, as well as the posterior probability of the selected model (conformity)...... 67 xiv LIST OF TABLES xv

5.1 The description and prior distribution for each dynamic parameter in the agent-based model...... 83 5.2 The median, 95% credible interval, and out-of-bag normalized mean absolute error of the posterior distribution for each dynamic parameter in the agent- based model...... 85 5.3 The median, 95% credible interval, and standard deviation for each parame- ter included in the Bayesian logistic regression. Concavity is the number of changes in the sign of the slope of the mean frequency trace per ms, and ex- cursion is the cumulative absolute change in Hz per ms. 95% credible intervals that do not overlap with 0 indicate a statistically significant effect...... 86

7.1 The size, cohesion index, top styles and top artists (calculated by weighted log-odds) from the eight largest populations in the phylogeny...... 117 7.2 The level of horizontal transmission between the eight largest populations in the phylogeny. Each value represents the percentage of links in the sub- network that flow between the two populations. In other words, we divided the total number of weighted links between populations by the sum of the number of internal weighted links from each population...... 118

A.1 Descriptive network statistics ...... 126 A.2 Names and primary departmental affiliations of core authors...... 127 A.3 Statistically over-represented country affiliations of first authors, author-chosen keywords, and subject categories in each cluster...... 128

C.1 The average rate ratios and p-values for all epidemic predictors after 100 rounds of imputation and estimation using the full model...... 144 LIST OF TABLES xvi

E.1 The number of syllable types and cluster validity indices for each value of deep split, calculated using the clValid package (Brock et al., 2008). For all three indices higher values correspond to higher quality clustering. We used a deep split value of 3 because it yielded results more consistent with previous research and had only slightly lower values of the silhouette and connectivity indices and an identical Dunn index, which appears to outperform the silhouette index when hierarchical clustering is used with real datasets (Arbelaitz et al., 2013)...... 152 E.2 The observed summary statistics from 1975, 2012, and 2019...... 154 E.3 The 10 most important summary statistics used by the random forests to estimate each parameter, identified using the Gini impurity method. The summary statistics are numbered as they appear in the methods: (1) the proportion of syllables that only appear once, (2) the proportion of the most common syllable type, (3) the number of syllable types, (4) Simpson’s diversity index, (5) Shannon’s diversity index, (6) Pielou’s evenness index, and (7) the exponent of the fitted power-law function to the progeny distribution. . . . . 155

F.1 The results of the eight clustering algorithms included in igraph ranked by modularity...... 158 List of Figures

1.1 Scatterplots showing the number of articles published on the topic of cultural evolution (top), as well as the collaboration index of the field (bottom), each year. Exponential fitting of the data revealed an increase in the number of articles written between 1990 and 2017 (r2 = 0.9443). Linear fitting of the data, after the removal of the outlier indicated by the star, revealed an increase in the collaboration index as well (r2 = 0.4333)...... 10 1.2 The largest connected co-authorship network in the dataset, analyzed using VOS clustering (n = 629). Red corresponds to group 1 (“biological anthropol-

ogy and archeology”; n1 = 183), green corresponds to group 2 (“mathematical

modeling and dual-inheritance theory”; n2 = 146), blue corresponds to group

3 (“cognitive linguistics and experimental cultural evolution”; n3 = 134), yel-

low corresponds to group 4 (“cross-cultural and phylogenetic studies”; n4 = 75), purple corresponds to group 5 (“computational biology and cultural niche

construction”; n5 = 56), cyan corresponds to group 6 (“evolutionary psychol-

ogy”; n6 = 20), and orange corresponds to group 7 (“behavioral ecology and

birdsong”; n7 = 15). Name size indicates total link strength. Many authors were arbitrarily excluded from the figure by the visualization algorithm in VOSviewer to maximize legibility...... 12

xvii LIST OF FIGURES xviii

1.3 The distribution of eigenvector centrality values for the authors in each group. The three most extreme outliers in group 2 were excluded from the figure for higher resolution. Pairwise comparisons were conducted using t-tests with pooled standard deviations, and clusters with significantly different means are denoted with different letters...... 14 1.4 The distribution of closeness centrality values for the authors in each clus- ter. Pairwise comparisons were conducted using t-tests with pooled standard deviations, and clusters with significantly different means are denoted with different letters...... 15 1.5 The degree of citation overlap between groups. The groups included in each pairwise comparison are designated by the black dots below each bar. For each pairwise comparison, the y-axis is the number of articles that are shared between the two collective reference lists. The total size of each collective reference list appears to the right of each row of dots. Only pairwise compar- isons with statistically significant intersections are included, and the results are sorted by increasing p-value...... 16

2.1 The diffusion of all three drum breaks through the combined collaboration network. At each time point individuals who have not yet used one of the drum breaks (informed) are shown in white, individuals who first used one of the drum breaks in a previous time step (previously informed) are shown in blue, and individuals who first used one of the drum breaks in the current time step (newly informed) are shown in red...... 31

3.1 The number of far-right extremists exposed in the PIRUS database, both per-year (left) and cumulative (right), between 2005 and 2017...... 43 LIST OF FIGURES xix

3.2 The untied locations of far-right extremists exposed in the PIRUS database between 2005 and 2017. The color of each county corresponds to its log- transformed population density...... 44 3.3 The total intensity (in black), as well as the isolated endemic component (grey), over time. Total intensity can be interpreted as the proportion of rad- icalization probability that is explained by the endemic and epidemic compo- nents...... 49 3.4 The results of the Monte Carlo permutation test. The grey bars show the null

distribution of R0 from the 739 permutations that converged, whereas the red

dashed line shows the observed R0 (0.31) calculated from the twinstim model. 50 3.5 The cumulative number of exposures during the last six months of the study period (red), as well as the results of 1,000 simulations (grey)...... 51 3.6 The exposure locations during the last six months of the study period. The color gradient, ranging from blue (low) to yellow (high), represents a Gaussian kernel density for the results of 1,000 simulations with a bandwidth of 200 km. The contour lines segment the kernel density into 10 levels...... 51

4.1 Violin plots showing the frequencies of samples, ranked by overall use, from 1980 to 2019. The x-axis is the rank of each sample, and the y-axis is the year. To the left of the dotted line are samples 1-10, while to the right are samples 501-510. More common samples (on the left) appear to be much more stable over time than rarer ones. The high popularity of the more common samples in the late 80s and early 90s is likely due to the rapid expansion of sample-based hip-hop and dance music triggered by increased access to digital samplers and more relaxed copyright enforcement during that period. . . . . 62 LIST OF FIGURES xx

4.2 The observed turn-over rates (z) for top-lists up to size 142, compared to those expected under neutral conditions according to Bentley (Bentley et al., 2007) (in blue) and Evans and Giometto (Evans and Giometto, 2011) (in orange). The x-axis is the size of the top lists for which z, on the y-axis, was calculated. 66 4.3 The posterior probability distribution of the level of frequency-based bias (b), with the median shaded in dark grey and the 95% HDPI shaded in light grey. 66

5.1 The first 10 syllables from a song recorded in 1975 and analyzed in Luscinia. Each red bar corresponds to a syllable, and each green number corresponds to an element within that syllable. The blue traces represent mean frequency. In this song, syllable three and syllable eight were classified as the same syllable type during dynamic time warping, and all other syllables are unique types. . 76 5.2 Locations of recordings from 1975, 2012, and 2019 in western Long Island (Brooklyn, Queens, and Nassau County)...... 77 5.3 The prior (dotted lines) and posterior (solid lines) distributions for the three dynamic parameters in the agent-based model related to transmission biases.

patt, α, and v correspond to content, frequency, and demonstrator bias, re- spectively...... 84

6.1 One of the RFID-equipped birdfeeders hanging in a tree on the Queens College campus...... 92 6.2 The RFID reader attached to the Raspberry Pi...... 93 6.3 The completed antenna and power supply connected to the computer and RFID reader...... 95 LIST OF FIGURES xxi

6.4 The locations of the five birdfeeders on the Queens College campus, the total number of minutes birds spent at each location (size and color of each point), and the total number of times birds moved between locations (thickness and color of each line). Plotted using feedr (Lazerte et al., 2017)...... 100 6.5 (A) The number of feeder visits (y-axis) that occurred on each day of the study period. (B) The total number of feeder visits (y-axis) that occurred during each hour across the entire study period...... 100 6.6 (A) The estimated social network of the 25 birds for which we had association data. (B) The dominance ranks (y-axis) of the 12 birds for which displacement interactions were available. Plotted with aniDom (Farine and Sanchez-Tojar, 2019)...... 102 LIST OF FIGURES xxii

7.1 A simplified visual summary of community detection using the TILES algo- rithm (A: top), and network simplification using the fast greedy modularity optimization algorithm (B: bottom). In A, arrows depict the movement of nodes (artists), dotted lines depict newly formed links (collaborations), and different colors represent different communities. Each link in the network cor- responds to a collaborative release like the one shown in the bottom right

(https://bit.ly/3jvofD8). Both core and peripheral artists are included, and artists in multiple communities are assigned to the largest one. Between

t1 and t2, artist 1 breaks all but one of its previous ties. Since artist 1 is no longer in a triangle with those other artists, the network splits into two com-

munities (green and pink). Between t2 and t3, artist 2 breaks its tie with the green community and establishes a collaboration with a peripheral member of the pink community and an unassigned artist. Since these three artists form a new triangle they are classified as a new community (yellow). In the left side of B, each node represents a community from A in the correspond- ing time point, and the links between communities represent the number of individuals moving between them. The right side of B is the same visualiza- tion after network simplification, where the pink and yellow communities have been compressed into a single population...... 110 LIST OF FIGURES xxiii

7.2 A phylogeny based on the dynamic community composition of electronic music over time. Each node is a set of communities, and each link corresponds to the number of artists moving between nodes. Nodes with the same color belong to the same population, and the height of each node corresponds to the number of artists. The visualization algorithm maximizes both the clustering of nodes and the continuity of links. The population names in the legend are simplified versions of the more detailed classifications in the results, and the overall pop-

ulation size over time can be seen in Figure F.2. Interactive version: (https: //masonyoungblood.github.io/electronic_music_phylogeny.html)... 114 7.3 The number of populations identified in each year (left y-axis), as well as the percentage of between-population links between each year and the previous year (right y-axis). The percentage of between-population links should be treated as a global measure of horizontal transmission...... 116 7.4 The effective number of styles from each year calculated using Simpson’s (left y-axis) and Shannon’s (right y-axis) diversity indices...... 119

A.1 The observed distribution of author productivity (solid line) alongside the theoretical inverse square function multiplied by the total number of authors in the dataset (dashed line, y = 3451/x2). A Kolmogorov-Smirnov test found the observed distribution to be significantly different from the theoretical inverse square function (p < 0.05)...... 126 A.2 Wordcloud showing the most commonly used author-chosen (as opposed to WoS-assigned) keywords associated with the articles in each group. “Cul- tural evolution” has been excluded from the wordcloud for clarity. Text size indicates use frequency...... 129 LIST OF FIGURES xxiv

A.3 A science overlay map of the entire dataset (above), as well as for each cluster of co-authors identified by VOSviewer (below). The links and nodes are based on a co-citation analysis of an exhaustive dataset of ISI subject categories from 2007, and represent a global map of science over which our own dataset is superimposed. Each node indicates a subject category represented in the dataset. The size of each node corresponds to the frequency of use, and the color corresponds to the larger disciplinary category it belongs to. Empty nodes represent subject categories not present in the dataset...... 130

B.1 The combined diffusion curve for all three drum breaks included in the primary OADA. The proportion of informed individuals is on the y-axis, and the year is on the x-axis. Although recent research suggests that inferring acquisition modes from diffusion curves is unreliable, it appears that the curve may have the S-shape indicative of social transmission prior to the early-2000s. . . . . 131 B.2 The relationship between diffusion years and transmission strengths for all seven diffusions included in the additional OADA. The mean (left) and me- dian (right) years of diffusion are on the x-axis, and the social transmission estimates from the additive model are on the y-axis. Linear regression found no significant relationships between either mean year of diffusion and social transmission estimate (R2 = 0.20, p = 0.31) or median year of diffusion and social transmission estimate (R2 = 0.17, p = 0.36)...... 134 LIST OF FIGURES xxv

B.3 The relationship between popularity and followers and the number of collab- orations for each artist in the dataset. Popularity and followers are on the x-axis, and number of collaborations is on the y-axis. Linear regression found significant positive relationships between both popularity and number of col- laborations (R2 = 0.048, p < 0.001) and followers and number of collaborations (R2 = 0.090, p < 0.001)...... 143

C.1 The pair correlation function at different pairwise distances in km (x-axis). The black line is the observed function for the data, the red line is the theoret- ical function assuming spatial randomness, and the grey envelope shows the upper and lower bounds of the functions from 100 simulated point patterns demonstrating spatial randomness...... 145 C.2 Estimates of the scaled spatial (left panel) and temporal (right) step functions. The 95% Monte Carlo confidence intervals were each calculated from 100 samples...... 146

C.3 (A) The empirical cumulative density function of Ui, or the standardized residuals according to Ogata Ogata (1988), with 95% Kolmogorov-Smirnov

confidence bands. (B) A scatterplot of Ui and Ui+1 to look for serial correlation.146

D.1 The measured diversity (x-axis) resulting from 1,000 iterations of a neutral model with the observed innovation rate, calculated across 500 timepoints (y-axis). The system appears to reach equilibrium at around 100 timepoints. We used 200 timepoints as the warm-up time to be more conservative. . . . . 147 LIST OF FIGURES xxvi

D.2 A goodness-of-fit test for the rejection form of ABC (n = 1000; ε = 0.01) indicates that the model is a good fit for the data (p = 0.47). The x-axis corresponds to the Euclidean distance between the simulated and observed summary statistics for each iteration, and the y-axis is the number of iterations with each range of distances...... 148 D.3 Leave-one-out cross validation indicates that the results are robust across tolerance levels (n = 10; ε: 0.005, 0.01, 0.05). Prediction error with ε = 0.005 is 0.0035, ε = 0.01 is 0.0042, and ε = 0.05 is 0.0099. The x-axis is the true parameter values, the y-axis is the estimated parameter values, and the colors correspond to the three tolerance levels...... 148 D.4 The out-of-bag error computed from an increasing number of trees from the random forest. The x-axis is the number of trees for which the out-of-bag error, shown on the y-axis, was computed...... 149 D.5 The ten most important variables for the classification ability of the random forest. The x-axis is variable importance calculated using the Gini impurity method, and the y-axis is the top ten variables in descending order. The top variable was mean diversity (D¯), followed by the first LDA axis (LD1), the exponent of the turn-over function (x), and the second LDA axis (LD2). The following six variables correspond to diversity in particular years or turn-over rates for specific top-lists...... 149 LIST OF FIGURES xxvii

D.6 The reference table of the random forest projected onto the two LDA axes. The x-axis is LD1, and the y-axis is LD2. Each colored point corresponds to a single simulation of one of the three cultural transmission models, while the black star corresponds to the observed data. The observed data fall neatly within the range of the output of the conformity model on LD1, but outside of the range on LD2. This is not surprising given that LD1 has over twice as much importance in the random forest...... 150

E.1 Average number of house finches encountered per party hour in western Long Island (Brooklyn, Queens, and Nassau County). The dotted line indicates 1994, the year in which conjunctivitis was first detected in the eastern range (Dhondt et al., 1998)...... 151 E.2 A dendrogram showing the results of the hierarchical clustering. The clus- ters could not be labelled because the number of syllable types exceeds the maximum color limit of R...... 152 E.3 The cumulative proportion of syllable types (y-axis) detected with each ad- ditional song sampled (x-axis) for the 106 birds from whom we have at least 10 recorded songs. The red line is the average value, which cross 75% around eight songs...... 153 E.4 The Simpson’s diversity index for 500 burn-in iterations. The red line is the mean Simpson’s diversity index across all iterations in that year...... 153 E.5 The prior (dotted lines) and posterior (solid lines) distributions for the four dynamic parameters in the agent-based model unrelated to transmission biases.154 LIST OF FIGURES xxviii

F.1 The log-transformed geographic distribution of releases in the phylogeny, ex- cluding the 3.6% of releases with either missing data or a region instead of a country. A total of 72 countries are represented, with 50.3% of releases coming from the US and UK...... 157 F.2 The overall population size of the phylogeny in each year...... 159 F.3 The top 10 terms (ranked by β) from each topic assigned to the eight largest populations in the phylogeny. β is the probability of a style coming from each topic...... 160 Chapter 1

A bibliometric analysis of the interdisciplinary field of cultural evolution

1.1 Introduction

The study of cultural evolution is an interdisciplinary endeavor that has captivated re- searchers from a diverse array of fields including evolutionary biology, anthropology, psy- chology, sociology, and computer science. This diversity of research communities and back- grounds has understandably been accompanied by a similarly broad range of approaches and methods (Richerson and Boyd, 2005; Driscoll, 2017; Mesoudi, 2015; Lewens, 2015). Evolu- tionary biology is often seen by cultural evolutionists as a reference point in terms of how an interdisciplinary science can be integrated upon a unified theoretical footing (Mesoudi et al., 2006). The history of evolutionary biology also illustrates, however, the diversity of working hypotheses or assumptions that can underlie research and reporting in an in- terdisciplinary field, often along disciplinary lines (Mayr, 1982). Classic typical differences

1 CHAPTER 1. BIBLIOMETRICS OF CULTURAL EVOLUTION 2 of opinion between evolutionary paleontologists and neontologists, for instance, or between those who study change at the levels of molecules versus traits, have been covered in detail, and many of these differences are now reduced or even largely resolved (Adler et al., 2007; Hahn, 2008), although new or revised points of divergence are always arising (Laland et al., 2014; Pigliucci, 2006). Similarly, the various approaches of cultural evolutionists, especially in different disciplines, can carry with them different working hypotheses, or even explicit or implicit assumptions that are not shared across research communities. If such communities have little interaction or cooperation, these differences of opinion will become entrenched and hinder progress in the field. Understanding and synthesis among the various commu- nities that perform cultural evolutionary research is currently underway, as represented for instance in the founding of the Cultural Evolution Society (CES). On the CES website, the participants of the 2015 workshop that inspired the founding of the society are described as coming from a “melting pot of disciplines that need to become integrated to create a science of cultural change informed by evolutionary theory” (http://bit.ly/2tmeJZ6). The inte- gration of this “melting pot” could help to recontextualize studies from different disciplines to strengthen the field’s empirical base (Mesoudi et al., 2006; Henrich et al., 2008), and to increase the application of cultural evolutionary knowledge to policy matters (Gibson and Lawson, 2015). Additionally, as the application of evolutionary logic becomes more popular in the public sphere, the field of cultural evolution will benefit by fostering a stronger, more unified empirical and methodological foundation for evolutionary approaches to sociocultural questions (Smith et al., 2001; Mace, 2014). Proposals to integrate knowledge from diverse disciplines often involve the encourage- ment of interdisciplinary research. Interdisciplinary research is becoming more common (Porter and Rafols, 2009) and increasingly en vogue, as evidenced by the five-fold increase in publications tagged as {“interdisciplinary”} on Web of Science between 2002 and 2016, which is twice the increase in Web of Science entries as a whole over that period. In fact, CHAPTER 1. BIBLIOMETRICS OF CULTURAL EVOLUTION 3 both the U. S. National Science Foundation and National Institutes of Health have made the support of interdisciplinary research an explicit goal of their respective strategic initia- tives (National Science Foundation, 2014; National Institutes of Health, 2016). Although interdisciplinarity is often used as a rallying cry among scientists, some have advised cau- tion (Jacobs and Frickel, 2009). A recent study found that researchers collaborating across disciplinary boundaries tend to publish fewer, more highly cited papers, but, interestingly, this “productivity penalty” was significantly lower for authors in more interdisciplinary fields (Leahey et al., 2017). This indicates that, for authors in an already interdisciplinary field such as cultural evolution, the benefits of increased citation probably outweigh any costs to productivity. Additionally, the critique that interdisciplinarity could divert focus from more targeted, mechanistic questions assumes that fields have already developed the methodolog- ical underpinnings to investigate those questions. For a less developed field like cultural evolution, in which researchers have been working with different assumptions for decades (Acerbi and Mesoudi, 2015; Claidi`ereand Andr´e,2011), theoretical unification could lead to an increase in empirical investigation and “more dynamic applications of emerging knowl- edge” (Aboelela et al., 2007). In order to understand the interdisciplinarity of cultural evolution, either with respect to the diversity of thought and practice or to the degree to which research communities communicate with each other, we have to know who those research communities are. Previ- ous suggestions have depended upon broad, qualitative and subjective descriptions of field structure (Wimsatt, 2013; Mesoudi et al., 2006; Henrich et al., 2008). These are valuable, especially as they highlight frequent points of divergence in ideas, and promote simple con- ceptualization of possible challenges. However, only quantitative data on the structure of the field of cultural evolution can delineate actual communities, determine whether they fall along disciplinary lines, assess degrees of communication and cross-fertilization, and inform specific goals about how to encourage integration and synthesis. Quantitative methods can CHAPTER 1. BIBLIOMETRICS OF CULTURAL EVOLUTION 4 also provide a baseline for future studies to determine how well such synthesis is proceeding. Bibliometrics, or the statistical analysis of published works, was first described by Pritchard (1969), and bibliometric approaches have since been used to analyze scientific literature in fields as diverse as neuroscience (Yeung et al., 2017), water security (Huai and Chai, 2016), and bariatric surgery (Dabi et al., 2016). Additionally, bibliometric methods have been used to support historical assumptions about the development of fields (Raina and Gupta, 1998), identify connections between scientific growth and policy changes (das Neves Machado et al., 2015), and investigate the collaborative structure of an interdisciplinary field (Liu and Xia, 2015). Evidence suggests that collaboration may positively increase productivity (Landry et al., 1996; Lee and Bozeman, 2005), impact (Larivi`ereet al., 2014), and citation rate (Figg et al., 2006), making it a key feature of bibliometric studies. Historically, bibliometric analyses have been limited by data accessibility, but the establishment of web-based citation indexing databases has increased their scope. Web of Science (WoS), a database with records from more than 33,000 journals (Clarivate Analytics, 2017), is a particularly useful resource for bibliometrics research because each record is tagged with key information such as author name, source, cited references, keywords, and research area. Additionally, users are able to download the complete metadata for every publication resulting from a given search term. Access to these nearly exhaustive bibliometric datasets, in combination with social net- work analysis, has allowed researchers to investigate previously inaccessible questions about how scientific fields are structured. The construction of co-authorship networks, in which each node represents an author and each link represents a co-authorship relationship, can provide particularly valuable insight into the topography of a given field. Co-authorship represents a direct working relationship between authors, and is often used as a proxy mea- sure for scientific collaboration in bibliometric studies. Unfortunately, large networks are challenging to visualize with traditional mapping methods. VOS (visualization of similarity) mapping is a recently developed technique that allows representative and intelligible visu- CHAPTER 1. BIBLIOMETRICS OF CULTURAL EVOLUTION 5 alization of more complex networks (van Eck and Waltman, 2007), and has been combined with modularity-based clustering to parse networks into discrete clusters based on the asso- ciation strength among nodes (Waltman et al., 2010). We utilize VOS mapping in this study because of its ability to simultaneously map and identify clusters within a network, and its widespread use in recent bibliometric analyses (Ta¸skınand Aydinoglu, 2015; Sweileh et al., 2016a,b). One challenge in using network analysis to evaluate the structure of a field is the accurate identification and labeling of subdisciplines. Previous studies have largely depended upon the subject categories developed by the Institute for Scientific Information (ISI) and assigned by WoS, but these categories are general, retrospective, and do not adequately describe the structures of emerging fields such as cultural evolution (Pudovkin and Garfield, 2002; Wagner et al., 2011). Rather than sorting authors into existing scientific categories, we chose to take a “structural approach” by classifying subdisciplines a posteriori, based on the discrete clustering patterns identified in the co-authorship network during the mapping process (Wagner et al., 2011). One of the more recently developed methods for classifying subdisciplines involves the identification and description of core documents in co-citation and bibliographic coupling networks (Gl¨anzeland Thijs, 2011, 2017). Gl¨anzeland Thijs (2011) found this method to be particularly effective for identifying topics in emerging fields, where text-based classification (e.g. keyword usage) might overestimate the the degree of overlap between subdisciplines. We chose to apply this method to our co-authorship network by identifying a set of core authors for each cluster (defined by the number of co-authorship links they have with other nodes) and evaluating their disciplinary affiliations, methodological approaches, and intellectual contributions. Although this approach increases the difficulty of placing authors into discrete disciplinary categories, it allows for more realistic and nuanced classification of clusters. By employing a novel combination of network, cluster, and bibliometric analyses to an CHAPTER 1. BIBLIOMETRICS OF CULTURAL EVOLUTION 6 extensive dataset of journal article metadata, we aim to determine the degree of intellectual integration among subdisciplines in the field of cultural evolution, in order to inform more targeted efforts towards future synthesis.

1.2 Methods

All articles used in this study were retrieved from the WoS Core Collection. The search term {“cultural evolution”} was used in the topic field, and results were filtered by publication year through 2017. The complete metadata for each resulting publication, as well as a subject category frequency list, was compiled and manually exported on July 10, 2018. VOSviewer (version 1.6.8), a network analysis software tool used to analyze bibliometric data, was used to construct a collaboration network based on co-authorship (van Eck and Waltman, 2009). In order to prevent articles with many co-authors from biasing network structure, we utilized fractional counting and excluded articles with more than 10 co-authors (Perianes-Rodriguez et al., 2016). The resolution was set to 0.03, the highest setting that resulted in a manageable number of clusters with suitable sample sizes, with no minimum cluster size. Any redundant names were eliminated using a thesaurus file, so that each author with multiple entries was designated by their last name and first initial (i.e. RICHERSON, PJ to RICHERSON, P). In order to avoid author name disambiguation, which is known to influence the structure of co-authorship networks, individuals with the same last name and first initial were removed from the thesaurus file (Barbastefano et al., 2015). To qualitatively describe each cluster we designated the five authors with the most co-authorship links as “core authors”, and evaluated their departmental affiliations and experimental approaches. For additional support, we identified country affiliations of first authors, author-chosen key- words, and subject categories that were statistically over-represented in each cluster using GeneMerge, a software package for identifying categorical variables that occur at higher CHAPTER 1. BIBLIOMETRICS OF CULTURAL EVOLUTION 7 frequencies in data subsets (Castillo-Davis and Hartl, 2003). The subject category frequency list, which lists of the number of publications in the dataset that fall within each subject category assigned by WoS (i.e. “anthropology”, “be- havioral sciences”, “evolutionary biology”), was used to generate a science overlay map in Pajek (Rafols et al., 2010). Science overlay maps superimpose data from a local network over a “global map of science” based on a co-citation matrix of ISI subject categories from 2007, providing an overview of how the focal field is situated in the rest of science (Rafols et al., 2010). Additional science overlay maps were constructed from the subject categories used within each cluster of co-authors identified in VOSviewer. The subject category frequency list was also used to calculate the Stirling-Rao diversity measure, an indicator of interdisci- plinarity that takes into account the distances between subject categories in the network, for both the entire network and each cluster (Rao, 1980; Stirling, 2007; Leydesdorff and Rafols, 2011). Further details about science overlay maps and the Stirling-Rao diversity measure are available online: https://bit.ly/2mmJTOF. CINNA (version 1.1.14), an R package for network analysis, was used to identify and calculate the two traditional centrality measures that explained the most variance in network structure: eigenvector and closeness centrality (Ashtiani et al., 2018). Eigenvector centrality is a weighted measure of the number of links that a given node in the network has with other nodes, while closeness centrality indicates the length of the most efficient paths to all other nodes in the network. These measures act as proxies for influence and distance from the center, respectively (Landherr et al., 2010). Both measures were rescaled to fall between 0 and 1. By comparing the centrality values for nodes in each cluster, we can quantitatively characterize the topography of the co-authorship network. The R packages qgraph (version 1.4.4) and igraph (version 1.1.2) were used to calcu- late the small-worldness index and network density, respectively. The small-worldness in- dex indicates whether or not a network exhibits the small-world structure shown by many CHAPTER 1. BIBLIOMETRICS OF CULTURAL EVOLUTION 8 collaboration networks with high clustering and short average path lengths (Humphries and Gurney, 2008; Ebadi and Schiffauerova, 2015a,c). Values greater than one reflect small-world structure. Network density is the ratio of observed edges to possible edges in a network. Bibliometrix (version 1.7), an R package for bibliometric analysis, was used to import and manage the metadata from WoS (http://bit.ly/2t21Dkk). Articles published by individuals in each group were sorted accordingly. Any article co-authored with a member of another group was excluded to allow for more parsimonious subdisciplinary classification, and to avoid exaggerating the degree of citation overlap between groups. The exclusion of overlapping articles occurred downstream of the network analysis, and only impacted the group-level descriptive data and citation analyses. Basic descriptive data was generated for each group, such as authors, sources, keywords, and subjects. Each author’s h-index, a common proxy measure for individual scientific output, was calculated using only the records in the dataset to determine individual impact within the field (Hirsch, 2005). The annual collaboration index, a ratio of the number of authors of co-authored articles to the total number of co-authored articles, was calculated for the entire dataset and each cluster. In addition, the Lotka’s law coefficients were estimated to determine the relative frequency of authors ranked by productivity (Lotka, 1926). Lotka’s law predicts that the relative frequency distribution of author productivity for any given field will be a hyperbolic, inverse square function (y = 1/x2), such that a minority of the authors in the dataset are publishing the majority of the articles. The deviation of the observed function from the predicted inverse square function acts as a metric for the inequality in productivity of the field. Lotka’s law breaks down when fractional counting is used and/or the number of collaborators on any article in the dataset is extremely high (Rousseau, 1992; Kretschmer and Rousseau, 2001), so we ensured that full counting was used and none of the articles in the dataset exceeded 100 co-authors. In order to determine how much information was shared among the co-authorship groups, CHAPTER 1. BIBLIOMETRICS OF CULTURAL EVOLUTION 9 the collective references cited by authors in each group were compiled and trimmed to eliminate duplicates. The overlaps between the groups’ collective reference lists were an- alyzed using SuperExactTest (version 0.99.4), an R package developed to statistically eval- uate multi-set intersections by calculating the probability of set overlaps given the size of the entire dataset (Wang et al., 2015). Lastly, a wordcloud was constructed in R to vi- sually compare author-chosen keyword usage among the five groups. All analyses in R were conducted using version 3.3.3. The data and R code used in the current study, as well as an interactive version of Figure 1.2, are available in the Dataverse repository: https://doi.org/10.7910/DVN/LBIDEL.

1.3 Results

The search criteria yielded 2,091 records published by 3,451 authors. Sixty-seven countries are represented in the dataset, but the majority of articles were published by first authors in the United States (34.78%), England (12.31%), Canada (5.71%), Germany (5.06%), and Scotland (4.75%). Although 685 journals are represented in the dataset, the five most pro- ductive were PNAS (3.35%), Behavioral and Brain Sciences (3.29%), Philosophical Trans- actions of the Royal Society B (2.93%), PLoS ONE (2.46%), and Evolution and Human Behavior (2.25%). The overall collaboration index (CI), or the average number of authors on multi-authored papers, was 2.43. The small-worldness index, calculated against 1000 random networks, was 22.42, indicating that the network exhibits small-world structure. Other descriptive network statistics can be found in Table A.1. Temporal analyses indicate that the field as a whole has become more productive over time, with an annual percentage growth rate of 13.61%. Exponential fitting of the data revealed an increase in the number of articles written between 1990 and 2017 (r2 = 0.9443) (top panel of Figure 1). Upon first analysis, a linear increase in the CI across the same time period was not evident, but the CHAPTER 1. BIBLIOMETRICS OF CULTURAL EVOLUTION 10

250

200

150

100

50 Number of Publications

0 1990 1995 2000 2005 2010 2015

6

5

4

3 Collaboration Index

2

1990 1995 2000 2005 2010 2015 Year Figure 1.1: Scatterplots showing the number of articles published on the topic of cultural evolution (top), as well as the collaboration index of the field (bottom), each year. Exponen- tial fitting of the data revealed an increase in the number of articles written between 1990 and 2017 (r2 = 0.9443). Linear fitting of the data, after the removal of the outlier indicated by the star, revealed an increase in the collaboration index as well (r2 = 0.4333). removal of an outlier identified by its studentized residual (year = 1996, CI = 6.5; p < 0.001) revealed a positive relationship (r2 = 0.4333) (bottom panel of Figure 1.1). Linear fitting also revealed an increase in the mean number of unique countries associated with each article written between 1990 and 2017 (r2 = 0.5966). The Lotka’s law estimation yielded a beta coefficient (β) of 2.43 and a constant coefficient (c) of 0.40 (r2 = 0.9430). In the entire dataset, 2,907 authors (84.24%) published a single paper, 312 authors (9.04%) published two papers, and 232 authors (6.72%) published three or more papers. The most productive author, Alex Mesoudi, published 35 (1.67%) of the papers CHAPTER 1. BIBLIOMETRICS OF CULTURAL EVOLUTION 11

included in the dataset. The Kolmogorov-Smirnov test found the Lotka’s law estimation to be significantly different from the theoretically predicted inverse square function (p < 0.05), indicating that the relative frequency distribution of the authors’ number of publications in the dataset is more skewed right than predicted (see Figure A.1). This suggests that the field of cultural evolution does not conform to Lotka’s law, and has fewer highly productive authors than expected. The largest set of authors linked by co-authorship (n = 629) was used for the network

analysis, resulting in the seven clusters portrayed in Figure 1.2 (n1 = 183; n2 = 146; n3 =

134; n4 = 75; n5 = 56; n6 = 20; n7 = 15). Articles published by individuals in each group,

excluding those co-authored with members of other groups, were compiled in R (n1 = 177;

n2 = 131; n3 = 137; n4 = 65; n5 = 57; n6 = 6; n7 = 9). The mean h-indices of the authors in each group were calculated using only articles in the dataset across all years to examine

collective output and impact (M1 = 2.04; M2 = 2.30; M3 = 1.82; M4 = 2.06; M5 = 2.14;

M6 = 1.1; M7 = 1.93). A one-way ANOVA found that the differences between the mean h-indices of each group were non-significant but trending (p = 0.069). Pairwise comparisons using t-tests with pooled standard deviations indicated that the mean h-index of group 6 is significantly lower than the mean h-indices of all groups except for group 3 (p < 0.05). The

CI for each group was calculated (CI1 = 1.79; CI2 = 1.78; CI3 = 1.48; CI4 = 1.81; CI5 =

1.4; CI6 = 10.3; CI7 = 2.11). Even after the removal of an extreme outlier (42 authors) the

CI of group 6 (CI6 = 4) was nearly twice that of group 7. This indicates that group 6 is the most internally collaborative in the dataset. The descriptive statistics of the core authors, and the statistically over-represented coun- try affiliations of first authors, author-chosen keywords, and subject categories in each cluster, can be found in Table A.2 and Table A.3, respectively. A wordcloud of author-chosen key- words can be found in Figure A.2. Core authors in group 1 included Alex Mesoudi, Alberto Acerbi, Kevin Laland, Mark Collard, and Andrew Whiten. The keywords “social learning”, CHAPTER 1. BIBLIOMETRICS OF CULTURAL EVOLUTION 12

eryilmaz, k de boer, b bruno, j fay, n henshilwood, c feher, o wadley, l tamariz, m 1 brighton, h conard, n molleman, l de oliveira, e galantucci, b kirby, s tomasello, m caldwell, c 3 claidiere, n steels, l stadler, k whiten, a mcgrew, w griffiths, t loreto, v blythe, r van schaik, c scott-phillips, t collard, m christiansen, m hruschka, d franz, m lehmann, l lampos, v acerbi, a tehrani, j dale, r kandler, a reader, s belinsky, k sear, r perreault, c wakano, j blasi, d hahn, m laland, k hill, k fort, j aoki, k roberts, s moya, c atkisson, c kendal, j shennan, s 5 mesoudi, a reyes-garcia, v fogarty, l feldman, m levinson, s ehrlich, p ghirlanda, s garland, e ruhlen, m enquist, m nicholls, j cavalli-sforza, l gervais, w boyd, r gardner, b gray, r liden, k bettinger, r strimling, p 7 mace, r richerson, p paciotti, b ross, c baral, s puleston, c atkinson, q newson, l lubell, m lachlan, r murray, k whitehouse, h efferson, c 2 ince, s 4 chang, l 6 gelfand, m fehr, e buss, d

VOSviewer

Figure 1.2: The largest connected co-authorship network in the dataset, analyzed using VOS clustering (n = 629). Red corresponds to group 1 (“biological anthropology and archeology”; n1 = 183), green corresponds to group 2 (“mathematical modeling and dual-inheritance the- ory”; n2 = 146), blue corresponds to group 3 (“cognitive linguistics and experimental cultural evolution”; n3 = 134), yellow corresponds to group 4 (“cross-cultural and phylogenetic stud- ies”; n4 = 75), purple corresponds to group 5 (“computational biology and cultural niche construction”; n5 = 56), cyan corresponds to group 6 (“evolutionary psychology”; n6 = 20), and orange corresponds to group 7 (“behavioral ecology and birdsong”; n7 = 15). Name size indicates total link strength. Many authors were arbitrarily excluded from the figure by the visualization algorithm in VOSviewer to maximize legibility.

“conformity”, and “chimpanzees”, as well as the subject categories “anthropology” and “bi- ological psychology” are over-represented. We have chosen to label this group (1) biological anthropology and archaeology. Core authors in group 2 included Peter Richerson, Robert Boyd, Joseph Henrich, Magnus Enquist, and Stefano Ghirlanda. The keywords “coopera- CHAPTER 1. BIBLIOMETRICS OF CULTURAL EVOLUTION 13 tion”, “religion”, and “cultural transmission”, as well as the subject category “behavioral sciences” are over-represented. We have chosen to label this group (2) mathematical mod- eling and dual-inheritance theory. Core authors in group 3 included Simon Kirby, Kenny Smith, Alex Bentley, Morten Christiansen, and Christine Caldwell. The keywords “language evolution”, “iterated learning”, and “compositionality”, as well as the subject categories “ex- perimental psychology” and “linguistics” are over-represented. We have chosen to label this group (3) cognitive linguistics and experimental cultural evolution. Core authors in group 4 included Ruth Mace, Russell Gray, Simon Greenhill, Thomas Currie, and Quentin Atkinson. The keywords “cultural phylogenetics”, “phylogenetic comparative methods”, and “cross- cultural research”, as well as the subject category “biology” are over-represented. We have chosen to label this group (4) cross-cultural and phylogenetic studies. Core authors in group 5 included Marcus Feldman, Nicole Creanza, Kenichi Aoki, Laurel Fogarty, and Joe Wakano. The keywords “demography”, “cultural accumulation”, and “gene-culture coevolution”, as well as the subject categories “mathematical and computational biology” and “genetics and heredity” are over-represented. We have chosen to label this group (5) computational biology and cultural niche construction. Core authors in group 6 included David Buss, Lei Chang, Hui Jing Lu, Todd Shackelford, and Laith Al-Shawaf. The keywords “mate preferences”, “evolutionary psychology”, and “sex”, as well as the subject category “social psychology” are over-represented. We have chosen to label this group (6) evolutionary psychology. Core authors in group 7 included Robert Lachlan, Peter Slater, Maria Servedio, Carel ten Cate, and Machteld Verzijden. The keywords “song learning”, “bird song”, and “speciation”, as well as the subject categories “ecology” and “evolutionary biology” are over-represented. We have chosen to label this group (7) behavioral ecology and birdsong. The subdisciplinary labels that we have applied to each group of co-authors are ultimately subjective and should be treated as conceptual guides rather than discrete categories. Science overlay maps were constructed using the subject category frequency lists for the CHAPTER 1. BIBLIOMETRICS OF CULTURAL EVOLUTION 14

entire network, as well as for each cluster of co-authors identified in VOSviewer (see Figure A.3). All 19 of the broad categories identified as explaining the majority of the variance in the global science map were present in the full network (Leydesdorff et al., 2012), but they were represented to different extents in each of the five groups. The Stirling-Rao diversity

measure for each group was calculated as well (D1 = 0.7196; D2 = 0.7467; D3 = 0.7357; D4

= 0.6660; D5 = 0.5354; D6 = 0.5768; D7 = 0.4502). This indicates that group 2 is the most interdisciplinary in the dataset. Based on the eigenvector centrality values, group 2 has the highest weighted connectivity with other nodes in the network (M1 = 0.0029; M2 = 0.0376; M3 = 0.0011; M4 = 0.0026;

M5 = 0.0052; M6 = 0.0004; M7 = 0.0006) (see Figure 1.3). A one-way ANOVA found significant differences between the mean eigenvector centrality values of each group (p < 0.001). Pairwise comparisons using t-tests with pooled standard deviations indicated that the mean of group 2 is significantly higher than that of all other groups (p < 0.05), while the means of all other groups are not significantly different from one another (p > 0.05).

Figure 1.3: The distribution of eigenvector centrality values for the authors in each group. The three most extreme outliers in group 2 were excluded from the figure for higher reso- lution. Pairwise comparisons were conducted using t-tests with pooled standard deviations, and clusters with significantly different means are denoted with different letters. CHAPTER 1. BIBLIOMETRICS OF CULTURAL EVOLUTION 15

Based on the closeness centrality values, group 2 is the closest to the center of the network

(M1 = 0.5870; M2 = 0.6554; M3 = 0.3876; M4 = 0.5488; M5 = 0.5134; M6 = 0.5562; M7 = 0.3299) (see Figure 1.4). A one-way ANOVA found significant differences between the mean closeness centrality values of each group (p < 0.001). Pairwise comparisons using t-tests with pooled standard deviations indicated that the mean of group 2 is significantly higher than those of all other groups (p < 0.05). Additional pairwise comparisons indicated that the means of groups 3 and 7 are significantly lower than those of all other groups (p < 0.001), but are not significantly different from one another (p > 0.05).

Figure 1.4: The distribution of closeness centrality values for the authors in each cluster. Pairwise comparisons were conducted using t-tests with pooled standard deviations, and clusters with significantly different means are denoted with different letters.

The degree of citation overlap between groups is shown in Figure 1.5. Due to inconsisten- cies in citation formatting and completeness, only references with a digital object identifier (DOI) were included in the analysis (41.5%). Groups 1 and 2, and groups 1 and 5, have the highest degree of citation overlap, and the results of SuperExactTest indicate that 71.4% of intersections between pairs of groups are significantly higher than chance given the size of the entire dataset (p < 0.05). CHAPTER 1. BIBLIOMETRICS OF CULTURAL EVOLUTION 16

500

400

300

200

100

0 7 214 6 117 5 1232 4 1175 3 1883 2 2100 1 3100 Figure 1.5: The degree of citation overlap between groups. The groups included in each pairwise comparison are designated by the black dots below each bar. For each pairwise comparison, the y-axis is the number of articles that are shared between the two collective reference lists. The total size of each collective reference list appears to the right of each row of dots. Only pairwise comparisons with statistically significant intersections are included, and the results are sorted by increasing p-value.

1.4 Discussion

The study of cultural evolution is of burgeoning interest, with a publication rate rising twice as quickly as science in general. Over the last 25 years, the number of professional journal articles covering cultural evolution has increased dramatically and more countries are regu- larly involved in publishing. The increase in the collaboration index is particularly promising, given the evidence that larger research teams have higher scientific impact (Larivi`ereet al., 2014) and productivity (Landry et al., 1996; Lee and Bozeman, 2005). As cultural evolution draws researchers and ideas from a broad spectrum of academic fields, the present review has focused especially on assessing interdisciplinary interaction and integration. Despite vig- orous research activity, the field’s youth is indicated by the absence of a professional society until 2015, and the continued absence of a dedicated academic journal. Cultural evolutionary research is published in an unusually thinly distributed manner across publishing outlets. CHAPTER 1. BIBLIOMETRICS OF CULTURAL EVOLUTION 17

Authorship is disparate, with most authors publishing only a single study, and fewer highly productive authors in the field than expected. Collaborations coalesce within seven topical clusters that differ in their level of interaction within and between groups, although the clusters overlap substantially in the references they cite. Vocabularies likewise have common elements, but each cluster employs distinctive terms as well. In general the analysis paints a picture of an active and energetic field that is increasing in integration, but that as yet has only partially established the communicative and collaborative networks typical of a cohesive field of study. The estimation of Lotka’s law for this dataset indicated significantly more disparity in author productivity than expected. A recent productivity analysis of 30 scientific fields found the average proportion of authors with one publication to be 69% (Ruiz-Castillo and Costas, 2014), which is approximately 15% lower than our result for cultural evolution. This high inequality of productivity in cultural evolution could be a result of a tendency for most researchers to be only occasionally interdisciplinary. As researchers from various fields begin to publish on the same topic, probably only a small minority of them will be consistently interdisciplinary enough to have high productivity related to the emerging topic, whereas the majority of researchers will continue to publish primarily in their home disciplines. Major publishing outlets for the field have not yet emerged. According to data from ISI Web of Science, the journal with the highest use by cultural evolutionists (PNAS) accounts for only 3.35% of the publications, and the top five together (the only journals with >2%) sum to only 14.28%. By contrast, the single journals publishing the most studies in several other active topic areas, for instance molecular evolution, the evolution of cooperation, and sexual selection, account for over 7% of each subfield’s produce by the same metrics, and the top 5 journals in these areas publish over 21-27%, with 28-40% of studies accounted for by 8-10 journals that each publish >2% of the field. The reason why dozens of journals are required to find a similar proportion of cultural evolution’s output is not clear. A dedicated CHAPTER 1. BIBLIOMETRICS OF CULTURAL EVOLUTION 18 journal would probably help this situation, as they cover 7-8% of the field in many cases, but this does not always happen. Neither evolutionary psychology nor invasive species have a robust core group of journals, for instance, despite having dedicated journals and much research activity overall. Coevolution and niche construction are two additional topic areas with focused research activity but a lack of consistent publication outlets, although cultural evolution is the most thinly distributed of all of these topics. Our network analysis yielded seven clusters based on co-authorship patterns, meaning that researchers tend to collaborate with others within particular subgroups in the field. Our structural approach to mapping the field means that instead of determining group member- ship a priori (e.g. by departmental affiliations), we allowed the behavior of researchers (coauthorships) to speak for itself. However, this approach raises the challenge of labeling the resulting subdisciplines. The full diversity of approaches within any cluster cannot be exhaustively described without overestimating the degree of overlap between clusters. To avoid this, we chose to classify subdisciplines by characterizing the contributions of the core authors within them. Thus the labels that we proposed in the results should be taken as subjective, flexible indicators of researcher communities rather than strict subdisciplinary definitions. The seven co-authorship clusters differ in their activity and level of integration. “Math- ematical modeling and dual-inheritance theory” has the greatest weighted connectivity to other nodes, closeness to the center of the network, and subject category diversity, indicating that it is the most influential, central, and intellectually broad cluster. This is unsurprising given the fact that several of its members played integral roles in founding the field, alongside members of “computational biology and cultural niche construction” (Mesoudi, 2015). “Cog- nitive linguistics and experimental cultural evolution” and “behavioral ecology and birdsong” are furthest from the center of the network. “Biological anthropology and archaeology” and “mathematical modeling and dual-inheritance theory”, as well as “biological anthropology CHAPTER 1. BIBLIOMETRICS OF CULTURAL EVOLUTION 19 and archaeology” and “computational biology and cultural niche construction”, overlap most in cited references, but over 70% of group pairs have more citation overlap than expected. All insignificant citation overlaps are between either “evolutionary psychology” or “behav- ioral ecology and birdsong”, the two smallest groups identified in the field, and non-adjacent groups. This indicates that intellectual similarity is related to structural distance in the collaboration network. “Evolutionary psychology” has the lowest mean h-index and is the most internally collaborative, which could be a result of its small size compared to other subdisciplines. Recent research has demonstrated that the relationship between interdisciplinary collab- oration and impact scales up with disciplinary distance (Larivi`ereet al., 2015), indicating that researchers in the field of cultural evolution could reap even more benefits from reaching out to authors further away in the network. Based on the results of this study, increased collaboration between the two most peripheral groups, “cognitive linguistics and experimen- tal cultural evolution” and “behavioral ecology and birdsong”, and the rest of the network would have the greatest impact on the integration of the field. A recent review of interdis- ciplinary research in the field actually highlighted the value of both cognitive and animal behavior approaches in connecting micro-level processes to macro-level patterns in cultural evolution (Kolodny et al., 2018). “Mathematical modeling and dual-inheritance theory” may be the best candidate for collaborations with “cognitive linguistics and experimental cultural evolution”, as it is highly connected to the rest of the network and there is already a great deal of intellectual overlap between these two groups. “Computational biology and cultural niche construction” is probably the best candidate for collaborations with “behavioral ecol- ogy and birdsong” as they already share co-authorship connections, but “behavioral ecology and birdsong” actually has a much older average publication year. More recent researchers who utilize approaches from behavioral ecology (i.e. Lucy Aplin and Johan Lind) already appear to be more collaborative with other groups, so the apparent distance between this CHAPTER 1. BIBLIOMETRICS OF CULTURAL EVOLUTION 20 subdiscipline and the rest of the network could be due to historical lag. Recent studies in cultural evolution have shown that partial connectivity may enhance cultural complexity over full connectivity, using small experimental groups (Derex and Boyd, 2016) and models incorporating population size and fragmentation (Derex et al., 2018). Other studies in the innovation and management literature have found that small-world networks, characterized by higher clustering and shorter path lengths, have higher innovation rates and reach solutions more quickly (Schilling and Phelps, 2007; Mason et al., 2008). The apparent discrepancy in these results could be because collaboration networks often exhibit small-world structure (Ebadi and Schiffauerova, 2015a,c) and are thus below the threshold of partial connectivity used in the cultural evolution studies. Given the co-authorship network’s small-world structure and low density, efforts to integrate the field are more likely to enhance innovation through increased information access than to impair it through full connectivity. Several limitations of this study are worth mentioning. First, the search term {“cultural evolution”} excludes work by researchers who investigate changes in cultural traits but use different terminology. Although including other terms such as “cultural change” and “lan- guage evolution” would have expanded our results, we chose to limit our search to one term to avoid over-representing particular subdisciplines. Second, many of the most influential books and book chapters on cultural evolution are not indexed on WoS. As a result, fields with more emphasis on long-form publishing, such as cultural anthropology, may be un- derrepresented in this analysis. Third, our analysis excludes single-author papers, as we focused on coauthorship as an indicator of collaboration and field integration. Fourth, ear- lier articles are less likely than more recent articles to be represented in the coauthorship network because the authors of older papers are less likely to be collaborating recently. This bias actually aided us in avoiding another bias, however: the term “cultural evolution” was occasionally used, especially before 1975, in a way inconsistent with contemporary usage, specifically reflecting a now-outmoded unilineal approach to cultural change and develop- CHAPTER 1. BIBLIOMETRICS OF CULTURAL EVOLUTION 21 ment. Few if any of the papers in our dataset used the term in this context. Lastly, our group-level results depend upon the resolution settings of the cluster analysis. Although larger resolutions break down the clusters into smaller subcommunities, they also reduce sample sizes and make classification less manageable. Increasing our resolution by 0.01, the smallest possible unit, resulted in 10 clusters with sample sizes down to 10 authors, so we used 0.03 to maximize our statistical power and classification ability. Cultural evolution as a whole draws together theory and data from a remarkable variety of sources, and does so by an equally broad array of methods (Kolodny et al., 2018). The field has already demonstrated that this intellectual diversity, bound together with a broad evolutionary perspective, will enable us to understand and explain change in socially learned traits over time in a new and powerful way (Mesoudi, 2017a). However, wherever there are groups that have differences of approach, differences in vocabulary, and differences in researchers with more collaboration within than between groups, we can expect to find differences of assumptions and possibly conclusions as well. Thus, the study of cultural evolution needs more integration if it is to function as a scientific community. Bridging efforts could take several forms. Workshops geared towards interdisciplinary research have been shown to improve integration in other fields (Piso et al., 2016; Barrows et al., 2008), and could be easily integrated into the annual meetings of the Cultural Evolution Society. Another potential strategy is to implement a database for researchers to find collaborators with complementary interests and skill sets (Novak et al., 2014), to ensure that co-authors are enhancing creativity by bringing new knowledge to the team (Lee et al., 2015). In addition, strategic grant funding targeted at integrative research could begin to reduce the distance between subdisciplines. Recent evidence that impact per dollar is actually higher for smaller grants (Fortin and Currie, 2013) suggests that a more distributed funding format might be more effective for increasing the visibility of the field as a whole. Since younger scientists tend to be more collaborative and act as intellectual bridges between disciplines (Ebadi and CHAPTER 1. BIBLIOMETRICS OF CULTURAL EVOLUTION 22

Schiffauerova, 2015b), providing more opportunities for early-career grant funding could also increase the connectivity of the entire network. Lastly, fostering a culture of open science, in which researchers publish and exchange their data and scripts, could play a key role in promoting interdisciplinarity and trust in a field with such an emphasis on mathematical modeling and phylogenetic methods (Mesoudi, 2017a; Easterbrook, 2014). By achieving conceptual synthesis through coordinated, interdisciplinary collaboration, the field of cultural evolution will be better equipped to tackle the “grand challenges” it faces (Brewer et al., 2017). Chapter 2

Cultural transmission modes of music sampling traditions remain stable despite delocalization in the digital age

2.1 Introduction

Music sampling, or the use of previously-recorded material in a new composition, is a nearly ubiquitous practice among hip-hop and electronic producers. The usage of drum breaks, or percussion-heavy sequences, ripped from soul and funk records has played a particularly critical role in the development of certain subgenres. For example, “Amen, Brother”, released by The Winstons in 1969, is widely regarded as the most sampled song of all time. Its iconic 4-bar drum break has been described as “genre-constitutive” (Whelan, 2009), and can be prominently heard in classic hip-hop and jungle releases by N.W.A and Shy FX (Collins, 2007). Due to the consistent usage of drum breaks in particular music communities and

23 CHAPTER 2. TRANSMISSION OF MUSIC SAMPLING 24 subgenres (Whelan, 2009; Collins, 2007; Frane, 2017; V¨akev¨a,2010; Rodgers, 2003) some scholars have suggested that they may be culturally transmitted (Bown et al., 2009), which could occur as a direct result of collaboration between artists or as an indirect effect of community membership. Before the digital age, artists may have depended upon collaborators for access to the physical source materials and expensive hardware required for sampling (Lloyd, 2014). In the 1990s, new technologies like compressed digital audio formats and digital audio worksta- tions made sampling more accessible to a broader audience (Bodiford, 2017). Furthermore, the widespread availability of the internet and social media have delocalized communities (DiMaggio et al., 2001), and allowed global music “scenes” to form around shared interests beyond peer-to-peer file sharing (Ebare, 2003; Alexandraki and Valsamakis, 2009). Indi- viduals in online music communities now have access to the collective knowledge of other members (Salavuo, 2006; Lazar and Preece, 2002), and there is evidence that online com- munities play a key role in music discovery (Garg et al., 2011). Although musicians remain concentrated in historically important music cities (i.e. New York City and Los Angeles in the United States) (Florida and Jackson, 2010; Graham, 2016), online music communi- ties also make it possible for artists to establish collaborative relationships independently of geographic location (Kruse, 2010). If more accessible sampling technologies and access to collective knowledge have allowed artists to discover sample sources independently of col- laboration (Makelberge, 2012), then the strength of cultural transmission via collaboration may have decreased over the last couple of decades. Similarly, if online music communities have created opportunities for interactions between potential collaborators, then geographic proximity may no longer structure musical collaboration networks. Studies of the cultural evolution of music have primarily investigated diversity in musical performances (Ellis et al., 2018) and traditions (Savage and Brown, 2014), macro-scale pat- terns and selective pressures in musical evolution (Mauch et al., 2015; Percino et al., 2014; CHAPTER 2. TRANSMISSION OF MUSIC SAMPLING 25

Savage et al., 2015; Rodriguez Zivic et al., 2013), and the structure and evolution of consumer networks (Garg et al., 2011; Schlitter and Falkowski, 2009; Monechi et al., 2017). Although several diffusion chain experiments have addressed how cognitive biases shape musical traits during transmission (Verhoef, 2012; Ravignani et al., 2017; Lumaca and Baggio, 2017a), few studies have investigated the mechanisms of cultural transmission at the population level (Rossman et al., 2008; Nakamura and Kaneko, 2018). The practice of sampling drum breaks in hip-hop and electronic music is an ideal research model for cultural transmission because of (1) the remarkably high copy fidelity of sampled material, (2) the reliable documentation of sampling events, and (3) the availability of high-resolution collaboration and demographic data for the artists involved. Exhaustive online datasets of sample usage and collaboration make it possible to reconstruct networks of artists and track the diffusion of particular drum breaks from the early 1980s to today. Furthermore, the technological changes that have occurred over the same time period provide a natural experiment for how the digital age has impacted cultural transmission more broadly (Acerbi, 2016). The aim of the current study was to determine whether drum breaks are culturally transmitted through musical collaboration networks, and to identify the factors driving the evolution of these networks. We hypothesized that (1) drum breaks are culturally transmitted through musical collaboration networks, and that (2) the strength of cultural transmission via collaboration would decrease after the year 2000. For clarification, the alternative to the first hypothesis is cultural transmission occurring outside of collaborative relationships (i.e. independent sample discovery via “crate-digging” in record stores or online). Previous studies have investigated similar questions using diffusion curve analysis (Rossman et al., 2008), but the validity of inferring transmission mechanisms from cumulative acquisition data has been called into question (Laland and Kendal, 2003). Instead, we applied network-based diffusion analysis (NBDA), a recently developed statistical method for determining whether network structure biases the emergence of a novel behavior in a population (Franz and Nunn, CHAPTER 2. TRANSMISSION OF MUSIC SAMPLING 26

2009). As NBDA is most useful in identifying social learning, an ability that is assumed to be present in humans, it has been primarily applied to non-human animal models such as birds, whales, and primates (Aplin et al., 2014; Allen et al., 2013; Hobaiter et al., 2014), but the ability to incorporate individual-level variables to nodes makes it uniquely suited to determining what factors bias diffusion more generally. Additionally, we hypothesized that (3) collaboration probability would be decoupled from geographic proximity after the year 2000. To investigate this we applied separable temporal exponential random graph modeling (STERGM), a dynamic extension of ERGM for determining the variables that bias network evolution (Krivitsky and Handcock, 2010).

2.2 Methods

All data used in the current study were collected in September of 2018, in compliance with the terms and conditions of each database. For the primary analysis, the three most heavily sampled drum breaks of all time, “Amen, Brother” by The Winstons, “Think (About It)” by Lyn Collins, and “Funky Drummer” by James Brown, were identified using WhoSam- pled (https://www.whosampled.com/). The release year and credits for each song listed as having sampled each break were collected using data scraping. In order to avoid name dis- ambiguation, only artists, producers, and remixers with active Discogs links and associated IDs were included in the dataset. In order to investigate potential shifts in transmission strength around 2000, the same method was used to collect data for the eight songs in the “Most Sampled Tracks” on WhoSampled that were released after 1990 (see Appendix B). One of these, “I’m Good” by YG, was excluded from the analysis because the sample is primarily used by a single artist. Each set of sampling events collected from WhoSam- pled was treated as a separate diffusion. All analyses were conducted in R (v 3.3.3). All R scripts and data used in the study are available in the Harvard Dataverse repository: CHAPTER 2. TRANSMISSION OF MUSIC SAMPLING 27

https://doi.org/10.7910/DVN/Q02JJQ. Collaboration data were retrieved from Discogs (https://www.discogs.com/), a crowd- sourced database of music releases. All collaborative releases in the database were extracted and converted to a master list of pairwise collaborations. For each diffusion, pairwise collab- orations including two artists in the dataset were used to construct collaboration networks, in which nodes correspond to artists and weighted links correspond to collaboration number. Although some indirect connections between artists were missing from these subnetworks, conducting the analysis with the full dataset was computationally prohibitive and incomplete networks have been routinely used for NBDA in the past (Aplin et al., 2012; Allen et al., 2013; Aplin et al., 2014). Individual-level variables for artists included in each collaboration network were collected from MusicBrainz (https://musicbrainz.org/), a crowdsourced database with more com- plete artist information than Discogs, and Spotify (https://www.spotify.com/), one of the most popular music streaming services. Gender and geographic location were retrieved from the Musicbrainz API. Whenever it was available, the “begin area” of the artist, or the city in which they began their career, was used instead of their “area”, or country of affiliation, to maximize geographic resolution. Longitudes and latitudes for each location, retrieved using the Data Science Toolkit and Google Maps, were used to calculate each artist’s mean geo- graphic distance from other individuals. Albunack (http://www.albunack.net/), an online tool which draws from both Musicbrainz and Discogs, was used to convert IDs between the two databases. Popularity and followers were retrieved using the Spotify API. An artist’s popularity, a proprietary index of streaming count that ranges between 0 and 100, is a better indicator of their long-term success because it is calculated across their entire discography. Followers is a better indicator of current success because it reflects user engagement with artists who are currently more active on the platform. Discogs IDs are incompatible with the Spotify API, so artist names were URL-encoded and used as text search terms. CHAPTER 2. TRANSMISSION OF MUSIC SAMPLING 28

In order to identify whether social transmission between collaborators played a role in sample acquisition, order of acquisition diffusion analysis (OADA) was conducted using the

R script for NBDA (v 1.2.13) provided on the Laland lab’s website (https://lalandlab. st-andrews.ac.uk/freeware/). OADA uses the order in which individuals acquire a trait to determine whether its diffusion is biased by the structure of the social network(Hoppitt et al., 2010). OADA was utilized instead of time of acquisition diffusion analysis (TADA) because it makes no assumptions about the baseline rate of acquisition (Franz and Nunn, 2009). For each artist, order of acquisition was determined by the year that they first used the sample in their music. Sampling events from the same year were given the same order. Gender, popularity, followers, and mean distance were included as predictor variables. For gender, females were coded as -1, males were coded as 1, and individuals with other identities or missing values were coded as 0. For popularity, followers, and mean distance each variable was centered around zero. Asocial, additive, and multiplicative models were fit to all three diffusions collectively with every possible combination of individual-level variables. Standard information theoretic approaches were used to rank the models according

to Akaike’s Information Criterion corrected for sample size (AICc). Models with a ∆AICc < 2 were considered to have the best fit (Burnham and Anderson, 2002). The best fitting model with the most individual-level variables was run separately to assess the effects of each variable on social transmission. Effect sizes were calculated according to Allen et al. (2013). An additional OADA was conducted using the seven diffusions from after 1990. Individual- level variables were excluded due to insufficient demographic data. An additive model was fit to the OADA, and separate social transmission parameters were calculated for each diffusion to identify differences in transmission strength. Additive and multiplicative models give identical results in the absence of individual-level variables, so no model comparison was necessary. In order to assess the effects of individual-level variables on network evolution, STERGM CHAPTER 2. TRANSMISSION OF MUSIC SAMPLING 29

was conducted using statnet (v 2016.9), an R package for network analysis and simulation. STERGM is a dynamic social network method that models the formation and dissolution of links over time(Krivitsky and Handcock, 2010). Collaboration events involving artists from each diffusion were combined to construct static collaboration subnetworks for each year between 1984 and 2017, which were then converted into an undirected, unweighted dynamic network. Early years not continuous with the rest of the event data (i.e. 1978 and 1981) were excluded from the dynamic network. In order to determine whether the variables biasing network structure have changed over time, the analysis was conducted separately with the data from 1984-1999 and 2000-2017. For each time period a set of STERGM models with every possible combination of individual-level variables were fit to the dynamic network using conditional maximum likelihood estimation (CMLE). Although STERGM can be used to separately model both the formation and dissolution of links, this analysis was restricted to the former. Gender, popularity, and followers were included to investigate homophily, while mean distance was included to assess its effect on link formation. As STERGM cannot be run with missing covariates, NA values in popularity (6.39%), followers (6.39%), and mean distance (38.49%) were imputed using the random forest method. The models from each period were ranked according to AIC, and the best fitting models (∆AIC < 2) with the most individual-level variables were run separately to assess the effects of each variable on network evolution.

2.3 Results

The three most heavily sampled drum breaks of all time were collectively sampled 6530 times

(n1 = 2966, n2 = 2099, n3 = 1465). 4462 (68.33%) of these sampling events were associated with valid Discogs IDs, corresponding to 2432 unique artists (F: n = 143, 5.88%; M: n = 1342, 55.18%; Other or NA: n = 947, 38.94%), and included in the primary OADA and CHAPTER 2. TRANSMISSION OF MUSIC SAMPLING 30

STERGM. The eight samples released after 1990 were collectively sampled 1752 times (n1

= 284, n2 = 260, n3 = 248, n4 = 198, n5 = 194, n6 = 193, n7 = 192, n8 = 182). 1305 (74.53%) of these sampling events were associated with valid Discogs IDs, corresponding to 1270 unique artists, and included in the additional OADA. The best fitting model from the primary OADA, which was multiplicative and included all four individual-level variables, can be seen in Table 2.1. In support of our first hypothesis, a likelihood ratio test found strong evidence for social transmission over asocial learning

(∆AICc = 141; p < 0.001). Based on the effect sizes, transmission appears to be more likely among females (p < 0.01) and less likely among artists who are more popular (p < 0.001) and have more followers (p < 0.001). Mean distance is not a significant predictor of transmission (p = 0.89). The diffusion network and diffusion curve for all three drum breaks included in the primary OADA are shown in Figure 2.1 and B.1, respectively. All other models fit to the primary OADA can be found in Appendix B.

Multiplicative Model - Order of Acquisition Estimate Effect size p Gender -0.11 0.81 < 0.01 Popularity -0.013 0.86 < 0.001 Followers -9.6E-8 0.92 < 0.001 Mean distance -1.8E-9 1 0.89 Likelihood Ratio Test

AICc p With social transmission 14719 < 0.001 Without social transmission 14860

Table 2.1: The results of the multiplicative model for the OADA including all individual- level variables. The top panel shows the model estimate, effect size, and p-value for each individual-level variable. The bottom panel shows the AICc for the model with and without social transmission and the p-value from the likelihood ratio test.

The results of the additional OADA, conducted using the seven diffusions from after 1990, can be found in Appendix B. A likelihood ratio test found strong evidence for social CHAPTER 2. TRANSMISSION OF MUSIC SAMPLING 31

●● ●● ● ● 1986 1996 ● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ● ● ● ●● ● ● ●● ●● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ●●●● ● ● ● ● ● ● ●● ● ● ● ● ●● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ●● ● ●● ●● ●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●● ● ● ● ● ● ● ● ●● ● ●● ● ●● ● ● ● ● ● ● ● ●● ●● ● ● ● ● ● ● ● ●●● ● ● ● ●● ● ● ● ● ● ●●●● ●● ● ● ● ● ● ● ● ●●●●● ● ● ● ● ● ● ● ● ● ● ● ● ●●●●●●●●● ● ●●● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ●● ● ●● ● ● ●● ●● ● ●● ● ● ●●● ●●●●●● ● ● ● ● ● ● ● ●● ● ●● ● ●● ● ●●● ● ● ● ● ● ●● ● ● ● ●●● ● ● ● ● ●●● ● ● ● ● ● ● ●● ● ● ● ● ● ● ●●● ● ●● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ● ● ●● ●● ● ● ● ●● ● ●● ● ●● ● ● ● ●●●●● ●● ● ● ● ● ● ● ●●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ●● ●● ● ●● ● ●● ● ● ● ● ● ● ●● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ●●● ● ● ● ●● ● ● ●● ● ●● ● ●● ●● ●● ●

● ● ● ● ●● ● ●● ● ● ●● ● ●● ● ● 2007 ● ●● ●● ● ●● 2018 ● ●● ●● ● ●● ● ●● ●● ● ● ●● ● ●● ●● ● ●● ● ● ● ● ● ●● ● ● ● ●● ● ● ● ● ●● ● ● ●● ●● ● ● ● ● ●● ●● ● ● ● ●● ●● ● ● ● ● ● ● ●● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●●● ● ● ● ●●●● ● ● ● ● ●● ● ●● ● ●● ● ● ● ●● ● ● ●● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ●●● ● ● ●● ●● ● ● ●● ● ● ● ●● ●● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ●●● ● ● ● ● ● ● ● ● ●●● ● ● ●● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ●● ●● ● ● ● ●● ●● ● ●●● ● ● ● ● ● ●● ●● ● ●●● ● ● ● ● ● ● ●●● ● ● ● ●●● ● ● ● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ●● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ●●● ●● ● ● ●● ●● ● ● ● ● ● ● ●● ● ● ●●●●● ●● ● ● ● ● ● ● ● ●●● ● ●●●●●●●● ●● ● ●● ● ● ● ● ● ● ● ● ● ●● ● ●● ● ● ● ● ● ●●● ●● ● ●● ● ●● ● ● ● ● ●● ● ●●●● ● ● ●●● ●● ● ● ●●●●● ● ● ●●● ● ● ●●●●● ● ● ● ● ● ● ● ● ●● ● ●●●●●●●●●●●●●● ● ●● ● ● ● ● ● ● ● ● ● ● ●●●●●● ● ●● ● ● ●● ● ● ● ●● ●● ●●●●●●● ●● ● ●● ● ● ●● ● ●● ● ● ●● ●● ●●● ● ● ● ● ●● ● ●●●●●●●●●●● ●●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ●● ● ● ●●●● ● ● ●● ● ● ●● ● ● ● ● ● ●● ● ●●● ● ● ●● ●● ● ●● ●● ● ●●● ● ● ● ●● ●● ● ● ●●● ●●●●●● ● ● ● ●● ● ● ●●●●● ●●●●●● ● ● ● ● ● ●● ●● ● ●● ● ●● ●● ● ● ● ●●● ●●● ●●● ● ●● ● ●● ●● ● ●●● ● ● ● ●● ● ● ● ●● ● ●●● ● ● ● ●● ● ● ● ● ●● ● ● ●● ● ● ● ● ● ●● ● ●● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ●● ● ● ● ● ● ●● ●● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●● ● ● ●● ●● ● ●● ● ● ● ● ●●● ● ● ● ●● ● ●● ●● ● ● ● ●● ●● ● ● ● ●●● ● ● ● ● ●● ●●● ● ● ● ● ● ●● ●● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ●● ● ● ●● ●● ● ● ● ● ●● ● ● ●● ●● ● ● ● ● ● ● ●● ● ● ● ●● ● ● ● ● ●● ●● ● ● ●● ● ● ● ●●●● ● ● ● ● ● ● ● ●●●●●● ● ● ● ● ●● ●● ●● ●● ●●●●● ●● ● ● ●● ●●● ●●●●● ●● ● ● ● ●● ● ● ● ●● ● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ●●●● ●● ● ● ●● ● ● ● ●● ●●● ● ● ●●●● ●● ● ● ●● ● ● ●●●●● ● ●●●● ●● ●● ● ● ● ● ● ●●●●● ● ●●●●● ●● ●● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ●● ● ●● ●● ●●● ● ● ●● ● ● ●● ● ●● ● ● ●●●● ● ● ● ● ● ● ● ●●●● ● ● ● ● ● ● ● ● ●● ● ● ● ●● ● ● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ●●● ● ●● ● ● ● ● ● ● ●● ●●● ●●● ●● ● ● ● ● ● ● ● ●●● ● ● ● ● ● ● ●● ● ●●● ●●●● ● ● ● ●● ● ● ● ● ●● ● ●● ● ● ● ● ● ● ●● ● ● ● ●● ●● ● ● ●● ●● ● ●● ●● ● ● ● ● ● ● ● ●● ● ●● ● ● ● ●● ●● ● ● ●● ● ● ● ●● ●● ● ● ●● ●● ● ●● ●● ●● ●● ● ● ● ● ●● ● ● ● ●

Uninformed● Newly informed● Previously informed Figure 2.1: The diffusion of all three drum breaks through the combined collaboration net- work. At each time point individuals who have not yet used one of the drum breaks (in- formed) are shown in white, individuals who first used one of the drum breaks in a previous time step (previously informed) are shown in blue, and individuals who first used one of the drum breaks in the current time step (newly informed) are shown in red.

transmission overall (∆AICc = 88; p < 0.001). Contrary to our second hypothesis, lin- ear regression found no significant relationships between either mean year of diffusion and social transmission estimate (R2 = 0.20, p = 0.31) or median year of diffusion and social transmission estimate (R2 = 0.17, p = 0.36) (see B.2). For both time periods the second best fitting STERGM models (∆AIC < 2) included all four individual-level variables, the results of which can be seen in Table 2.2. All other models can be found in Appendix B. Across both periods there appears to be homophily based on popularity (p < 0.001) and gender (M: p < 0.001; F: ps < 0.05). In support of our third hypothesis, mean distance negatively predicts link formation only before 2000 (p < 0.001). CHAPTER 2. TRANSMISSION OF MUSIC SAMPLING 32

Additionally, there is a heterophilic effect of followers only after 2000 (p < 0.001). Based on the effect sizes, there has been a nearly three-fold decrease in the strength of homophily among females. Conversely, the strengh of homophily by popularity has actually increased since 2000. The results of the STERGM analysis assuming different transition years (e.g. 1994, 1996, 1998, 2002, 2004, 2006) can be found in Appendix B. Linear regression found significant positive relationships between both popularity and number of collaborations (R2 = 0.048, p < 0.001) and followers and number of collaborations (R2 = 0.090, p < 0.001) (see B.3).

STERGM 1984-1999 2000-2017 Effect size p Effect size p Gender (F) 6.86 < 0.001 2.23 < 0.05 Gender (M) 1.70 < 0.001 2.43 < 0.001 Popularity 0.84 < 0.001 0.54 < 0.001 Followers 1.02 0.64 1.96 < 0.001 Mean distance 0.87 < 0.001 0.99 0.82

Table 2.2: The results of the STERGM analyses for before and after 2000. The table shows the effect size and p-value for gender, popularity, followers, and mean distance during each time period.

A goodness-of-fit analysis was conducted by generating simulated networks (n = 100) from the parameters of the best fitting model and comparing them to the observed network statistics (Hunter et al., 2008). For both time periods, the global statistics (i.e. gender, pop- ularity, followers, mean distance) from the simulated networks were not significantly different from those observed, indicating that both models are good fits for the variables in question. Structural statistics (i.e. degree, edgewise shared partner, minimum geodesic distance) from the simulated networks were significantly different from those observed, indicating that both models are not good fits for the structural properties of the network. The results of this analysis can be found in Appendix B. CHAPTER 2. TRANSMISSION OF MUSIC SAMPLING 33 2.4 Discussion

Using high-resolution collaboration and longitudinal diffusion data, we have provided the first quantitative evidence that music samples are culturally transmitted via collaboration between artists. Additionally, in support of the widespread assertion that the internet has delocalized artist communities, we have found evidence that geographic proximity no longer biases the structure of musical collaboration networks after the year 2000. Given that the strength of transmission has not weakened over the same time period, this finding indicates that collaboration remains a key cultural transmission mode for music sampling traditions. This result supports the idea that the internet has enhanced rather than disrupted existing social interactions (DiMaggio et al., 2001). Gender appears to play a key role in both network structure and cultural transmission. Across the entire time period, collaborations were more likely to occur between individuals of the same gender. Additionally, the probability of cultural transmission appears to be much higher for female artists. This effect could be a result of the much higher levels of homophily among women before 2000. Previous work has suggested that high levels of gender homophily are associated with gender disparity (Glass and Cook, 2017; Crewe and Wang, 2018; Jadidi et al., 2018), which is consistent with the historic marginalization of women in music production communities (Ebare, 2003; Baker and Cohen, 2008; Whelan, 2009). Although the proportion of female artists in the entire dataset is extremely low (∼6%), the reduction in homophily among female artists after 2000 could be reflective of increasing inclusivity (Smith et al., 2014). Artists with similar levels of popularity were also more likely to collaborate with each other. The increase in homophily by popularity after 2000 could be the result of an increase in skew, whereby fewer artists take up a greater proportion of the music charts (Ordanini and Nunes, 2016). In addition, the probability of cultural transmission appears to be higher CHAPTER 2. TRANSMISSION OF MUSIC SAMPLING 34 among less popular artists, even though they are slightly less collaborative. This effect could be linked to cultural norms within “underground” music production communities. In these communities, collective cultural production is sometimes prioritized over individual recognition (Thornton, 1995; Hesmondhalgh, 1998). This principle is best demonstrated by the historic popularity of the white-label release format, where singles are pressed to blank vinyl and distributed without artist information (Thornton, 1995; Hesmondhalgh, 1998). In more extreme cases, individual artists who experience some level of mainstream success or press coverage risk losing credibility, and may even be perceived as undermining the integrity of their music community (Hesmondhalgh, 1998; Noys, 1995). Concerns about credibility could cause individuals to selectively copy less popular artists or utilize more rare samples (i.e. De La Soul’s refusal to sample James Brown and George Clinton because of their use by other popular groups (Lena, 2004)). Future research should investigate whether the “high prestige attached to obscurity” (Hesmondhalgh, 1998) in these communities may be driving a model-based bias for samples used by less popular artists or a frequency-based bias for samples that are more rare in the population (Boyd and Richerson, 1985). A frequency-based novelty bias was recently identified in Western classical music using agent-based modeling (Nakamura and Kaneko, 2018), and similar methods could be utilized for sampling. Similarly to popularity, the number of followers an individual has negatively predicts transmission probability. However, artists with similar numbers of followers were actually less likely to collaborate with each other after 2000. This result could be due to the fact that followers is a better indicator of current popularity, but has lower resolution further back in time. Newer artists with inflated follower counts who collaborate with older, historically- important artists with lower follower counts may still be expressing homophily based on overall popularity. There are several limitations to this study that should be highlighted. Firstly, the time lag inherent in the user editing of WhoSampled means that older transmission records are CHAPTER 2. TRANSMISSION OF MUSIC SAMPLING 35 more complete. Algorithms for sample-detection (Hockman and Davies, 2015) may allow researchers to reconstruct full transmission records in the future, but these approaches are not yet publicly available. Secondly, MusicBrainz and Spotify had incomplete demographic data for some artists (i.e. gender and geographic location), which may have introduced bias into our model estimates. Lastly, Discogs primarily documents official releases, which means that more recent releases on streaming sites like Soundcloud are not well-represented. In combination with the exclusion of artists without Discogs IDs, this indicates that less promi- nent artists may be underrepresented in the dataset. Fortunately, social networks are fairly robust to missing data, especially when networks are large, centralized, and disproportion- ately include central nodes (Smith and Moody, 2013; Smith et al., 2017; Silk et al., 2015). Additionally, simulation studies evaluating the robustness of NBDA indicate that it performs well under fairly high levels of sampling error and bias (Franz and Nunn, 2009; Whalen and Hoppitt, 2016; Hoppitt, 2017; Wild and Hoppitt, 2018), risks that are mitigated by the fact that the network was reconstructed from published collaborations rather than temporal co- occurrence data. Data on collaborative relationships between artists are less likely to suffer from significant observation error because (1) they do not have to be statistically transformed or filtered before network construction and (2) they result in cultural artifacts that are part of the public record. The crowd-sourced nature of these data is unlikely to impact the results, given that comparative studies of other crowd-sourced, quantitative datasets indicate that they have high accuracy (Rajagopalan et al., 2010; Brown, 2011; Kr¨aenbring et al., 2014). The results of this study provide valuable insight into how demographic variables, par- ticularly gender and popularity, have biased both cultural transmission and the evolution of collaboration networks going into the digital age. In addition, we provide evidence that collaboration remains a key transmission mode of music sampling traditions despite the delocalization of communities by the internet. Future research should investigate whether decreased homophily among females is actually linked to greater inclusivity in the music in- CHAPTER 2. TRANSMISSION OF MUSIC SAMPLING 36 dustry (e.g. booking rates, financial compensation, media coverage), as well as whether the inverse effect of popularity on cultural transmission probability is a result of a model-based bias for obscurity or a frequency-based bias for novelty. Chapter 3

Extremist ideology as a complex contagion: the spread of far-right radicalization in the United States between 2005-2017

3.1 Introduction

The far-right movement, which includes white supremacists, neo-Nazis, and sovereign citi- zens, is the oldest and most deadly form of domestic extremism in the United States (Simi and Bubolz, 2017; Piazza, 2017). Despite some ideological diversity, members of the far- right often advocate for the use of violence to bring about an “idealized future favoring a particular group, whether this group identity is racial, pseudo-national, or characterized by individualistic traits” (National Consortium for the Study of Terrorism and Responses to Terrorism (START), University of Maryland, 2017). Over the last decade, the far-right movement was responsible for 73.3% of all extremist murders in the United States. In 2018,

37 CHAPTER 3. SPREAD OF FAR-RIGHT EXTREMISM 38 this statistic rose to 98% (The Anti-Defamation League Center on Extremism, 2019). The increasing severity of far-right extremist violence, as well as the associated rhetoric on so- cial media (Winter, 2019; Davey and Ebner, 2019), has generated public concern about the spread of radicalization in the United States. Former extremists have referred to it as a public health issue (Allam, 2019; Bonn, 2019), an idea advocated for by some policy experts as well (Weine and Eisenman, 2016; Sanir et al., 2017). There is little evidence that radicalization is primarily driven by psychopathology (Post, 2015; Webber and Kruglanski, 2017; Misiak et al., 2019). Rather, radicalization appears to be a process in which individuals are destabilized by various environmental factors, exposed to extremist ideology, and subsequently reinforced by members of their community (Webber and Kruglanski, 2017; Jasko et al., 2017; Jensen et al., 2018; Mills et al., 2019; Becker, 2019). Even “lone wolves”, or solo actors, often interact with extremist communities online (Kaplan et al., 2014; Post, 2015; Holt et al., 2019). As such, radicalization may spread through a social contagion process, in which extremist ideologies behave like complex contagions that require multiple exposures for adoption (Guilbeault et al., 2018), which has been observed for political movements more broadly (Gonz´alez-Bail´onet al., 2011). Previous research suggests that extremist propaganda (Ferrara, 2017), hate crimes (Braun and Koopmans, 2010; Braun, 2011), intergroup conflict (Buhaug and Gleditsch, 2008; Gelfand et al., 2012), and terrorism (Midlarsky et al., 1980; Cherif et al., 2009; LaFree et al., 2012; White et al., 2016) exhibit similar dynamics. The environmental factors favoring radicalization, referred to here as endemic factors, include variables like poverty rate that may influence an individual’s risk of adoption in particular regions. As such, endemic factors have the potential to enhance or constrain the spread of contagions through populations across geographic space. Although significant research has been done on how endemic factors predict radicalization (and resulting violence) (McVeigh, 2004; Goetz et al., 2012; LaFree and Bersani, 2014; Piazza, 2017; Medina et al., CHAPTER 3. SPREAD OF FAR-RIGHT EXTREMISM 39

2018), few studies have investigated how these factors interact with contagion processes. The aim of the current study is to determine whether patterns of far-right radicalization in the United States are consistent with a contagion process, and to assess the influence of critical endemic factors. After controlling for population density, I assessed the following endemic factors that have been implicated in previous research on radicalization, extremism, and mass shootings: poverty rate (Gale et al., 2002; Durso and Jacobs, 2013; Suttmoeller et al., 2015, 2016; Piazza, 2017; Lin et al., 2018; Suttmoeller et al., 2018; Medina et al., 2018; Kwon and Cabrera, 2019b), unemployment rate (Green et al., 1998; Jefferson and Pryor, 1999; Gale et al., 2002; Espiritu, 2004; Goetz et al., 2012; Piazza, 2017; Majumder, 2017; Pah et al., 2017), income inequality (McVeigh, 2004; Goetz et al., 2012; McVeigh and Cunningham, 2012; Majumder, 2017; Kwon and Cabrera, 2017, 2019b,a), education levels (Espiritu, 2004; Florida, 2011; Durso and Jacobs, 2013; McVeigh et al., 2014; Gladfelter et al., 2017; Kwon and Cabrera, 2017, 2019a), non-white population size (McVeigh, 2004; LaFree and Bersani, 2014; Gladfelter et al., 2017; Medina et al., 2018), violent crime rate (McVeigh and Cunningham, 2012; Gladfelter et al., 2017; Sweeney and Perliger, 2018), gun ownership (Pah et al., 2017; Anisin, 2018; Lin et al., 2018; Reeping et al., 2019), hate groups per capita (Adamczyk et al., 2014), and Republican voting (McVeigh et al., 2014; Medina et al., 2018). Furthermore, I aim to determine whether individual-level variables, such as social media use, enhance the spread of far-right radicalization over space and time. Social media platforms increasingly appear to play a role in radicalization, both as formal recruitment tools (Wu, 2015; Bertram, 2016; Aly et al., 2017; Awan, 2017) and spaces for extremist communities to interact (Amble, 2012; Dean et al., 2012; Winter, 2019; Pauwels et al., 2014). If social media platforms augment physical organizing (Bowman-Grieve, 2009; Holt et al., 2016), then they may also enhance the spread of radicalization. Although social media platforms relax geographic constraints on communication, evi- dence suggests that social media networks still exhibit spatial clustering. For example, the CHAPTER 3. SPREAD OF FAR-RIGHT EXTREMISM 40 majority of an individual’s Facebook friends live within 100 miles of them (Bailey et al., 2018), the probability of information diffusion on social media decays with increasing dis- tance (Liu et al., 2018), and online echo chambers map onto particular locations (Bastos et al., 2018). Since complex contagions require reinforcement, and the majority of online friendship ties are within a close radius, the diffusion of extremist ideologies online should still exhibit some level of geographic bias. This idea is supported by evidence that social media enhances physical organizing among extremists (von Behr et al., 2013; Gill et al., 2017; Bastug et al., 2018), and anecdotes of “self-radicalized” individuals using social media to contact other extremists in their area (Holt et al., 2019). In order to model the spread of far-right radicalization I used a two-component spatio- temporal intensity (twinstim) model (Meyer et al., 2017), an epidemiological method that treats events in space and time as resulting from self-exciting point processes (Reinhart, 2018). In this framework, future events depend on the history of past events within a certain geographic range. Event probabilities are determined by a conditional intensity function, which is separated into endemic and epidemic components. This allows researchers to assess the combined effects of both spatio-temporal covariates and epidemic predictors. Epidemic, in this framework, refers to any level of contagion effect and does not necessarily imply uncontrollable spread. With a couple of notable exceptions (Zammit-Mangion et al., 2012; Clark and Dixon, 2018), previous applications of self-exciting point process models in terrorism and mass shooting research have not simultaneously modeled diffusion over both time and space (Porter and White, 2010; Lewis et al., 2012; White et al., 2013; Towers et al., 2015; Garcia-Bernardo et al., 2015; Tench et al., 2016; Johnson and Braithwaite, 2017; Collins et al., 2020). The radicalization events in this study, which correspond to where and when a radicalized individual’s extremist activity or plot was exposed, came from the Profiles of Individual Rad- icalization in the United States (PIRUS), an anonymized database compiled by the National CHAPTER 3. SPREAD OF FAR-RIGHT EXTREMISM 41

Consortium for the Study of Terrorism and Responses to Terrorism (START) (National Consortium for the Study of Terrorism and Responses to Terrorism (START), University of Maryland, 2017). PIRUS is compiled from sources in the public record, and only includes individuals radicalized in the United States who were either arrested, indicted, or killed as a result of ideologically-motivated crimes, or were directly associated with a violent extremist organization. I chose to use PIRUS instead of the Terrorism and Extremist Violence in the United States (TEVUS) database because events in PIRUS are disambiguated by individual and include social variables that may influence the diffusion process. A contagion effect in this modeling framework could result from one of two forces. The first is a copy-cat effect, in which individuals copy behaviors observed directly or in the media. Although this effect has been proposed in terrorism and mass shooting research in the past (Nacos, 2009; Towers et al., 2015), it seems to be a more plausible contagion mechanism for specific methods of violence (Helfgott, 2015) (e.g. suicide bombings (Tominaga, 2018)) rather than radicalization more broadly. The second is linkage triggered by activism and organizing, or ideologically-charged events (e.g. elections, demonstrations, policies), in that region. To differentiate between these two forces, I included two sets of epidemic predictors in the modeling. The first two event-level variables, plot success and anticipated fatalities, might be expected to increase epidemic probability if a copy-cat effect is present. This is because successful large-scale events are probably more contagious due to increased media coverage (Towers et al., 2015). Alternatively, the two individual-level variables, group membership and social media use, might be expected to increase epidemic probability if activism and organizing drive the linkage between events. CHAPTER 3. SPREAD OF FAR-RIGHT EXTREMISM 42 3.2 Methods

All individual-level data came from PIRUS. Only individuals with far-right ideology who were exposed during or after 2005 (the earliest year with social media data) with location data at the city-level or lower (n = 416; F: 6.0%, M: 94.0%) were included (see Figures 3.1 and 3.2). For each individual, the date and location of their exposure (usually when their activity/plot occurred), whether their plot was successful (34.9%), the anticipated fatalities of their plot (0: 69.5%, 1-20: 26.0%, >20: 2.6%, >100: 1.9%), whether they were a member of a formal or informal group of extremists (58.4%), and whether social media played a role in their radicalization (31.2%), were included. Unknown or missing values for each predictor (plot success: 0.5%, anticipated fatalities: 13.5%, group membership: 0%, social media: 54.8%) were coded as 0. To ensure that the coding procedure for missing predictor values did not introduce bias, I checked whether the results of the full model were consistent after multiple imputation with chained equations and random forest machine learning (see Table C.1). The location of each exposure was geocoded from the nearest city or town using the R package ggmap (Kahle and Wickham, 2013). Since domestic terrorists tend to commit acts in their local area (Smith et al., 2008; Klein et al., 2017; Marchment et al., 2018), I assumed that exposure locations reflect where individuals were radicalized. CHAPTER 3. SPREAD OF FAR-RIGHT EXTREMISM 43

Figure 3.1: The number of far-right extremists exposed in the PIRUS database, both per-year (left) and cumulative (right), between 2005 and 2017.

State-level gun ownership was estimated using a proxy measure based on suicide rates and hunting licenses (Seigel et al., 2014). Using data from 2001, 2002, and 2004 (the only three years for which state-level gun ownership data is available), Seigel et al. found that the following proxy correlates with gun ownership with an R2 of 0.95:

FS (0.62 · ) + (0.88 · HL) − 0.0448 (3.1) S where F S/S is the proportion of suicides that involve firearms (from the Centers for Disease Control and Prevention, or CDC), and HL is hunting licenses per capita (from the United States Fish and Wildlife Service) (Seigel et al., 2014). Missing suicide rates (five years for DC, two years for Rhode Island) were replaced with the mean values for that state. State-level hate group data was collected from the Southern Poverty Law Center, while violent crime data was collected from the Federal Bureau of Investigation’s Uniform Crime Reporting CHAPTER 3. SPREAD OF FAR-RIGHT EXTREMISM 44

Program.

Figure 3.2: The untied locations of far-right extremists exposed in the PIRUS database be- tween 2005 and 2017. The color of each county corresponds to its log-transformed population density.

County-level demographic data was collected from the US Census using the R package censusapi (Recht, 2019). This included population density, poverty rate, Gini index of income inequality, percentage of the population that is non-white, percentage of the population that has at least a high-school diploma, and unemployment rate. County-level income, race, education, and unemployment data is only available after 2009, so the 2010 data was used for 2005-2009. County-level presidential election voting records were collected from the Massachusetts Institute of Technology Election Lab, and non-election years were assigned the data from the most recent election year. Geographic data was collected from the US Census using the R package tigris (Walker, CHAPTER 3. SPREAD OF FAR-RIGHT EXTREMISM 45

2019). Twinstim modeling was conducted using the R package surveillance (Meyer et al., 2017). To convert the data to a continuous spatio-temporal point process, all tied locations and dates were shifted in a random direction up to half of the minimum spatial and temporal distance between events (1.52 km and 0.5 days, respectively) (Meyer and Held, 2014). Step functions were used to model both spatial and temporal interactions. Visual in- spection of the pair correlation function for the point pattern indicates that the data is significantly clustered up to 400 km (see Figure C.1). As such, the spatial step function was split into four 100 km intervals with 400 km as the maximum interaction radius (Nightin- gale et al., 2015). The temporal step function was split into four six-month intervals up to two years (based on the the high degree of variation in radicalization and attack planning times among domestic extremists (Smith and Damphousse, 2009; Silkoset, 2016; Bouhana et al., 2018)). I attempted the analysis with different combinations of power-law, Gaussian, and Student spatial functions, and exponential temporal functions, but these variations con- verged to unrealistically steep spatial and temporal interaction functions that approached zero around two km and two days, and appeared to be significantly influenced by the tie- breaking procedure (Meyer and Held, 2014). Population density (county-level) was log-transformed and used as an offset endemic term. A centered time trend was also included to determine whether the strength of the endemic component has shifted over time. Poverty rate (county-level), Gini index of in- come inequality (county-level), gun ownership (state-level), percentage of the population that is non-white (county-level), percentage of the population that has at least a high-school diploma (county-level), unemployment rate (county-level), percentage of voters that vote Republican in presidential elections (county-level), violent crime rate per thousand residents (state-level), and number of hate groups per million residents (state-level) were included as dynamic endemic predictors that change annually. Plot success, anticipated fatalities, group CHAPTER 3. SPREAD OF FAR-RIGHT EXTREMISM 46

membership, and social media radicalization were included as epidemic predictors. All possible models with all possible combinations of predictors were run and ranked by Akaike’s Information Criterion (AIC) (Burnham and Anderson, 2002). The best fitting model with the lowest AIC was used to assess the effects of each variable on event probability. Rate ratios were calculated by applying exponential transformation to the model estimates. To determine whether the spatio-temporal interaction of the epidemic component was statistically significant, I used the Monte Carlo permutation approach developed by Meyer et al. (Meyer et al., 2016). Using this approach, a twinstim model with all endemic predictors from the best fitting model and no epidemic predictors was compared to 1,000 permuted null models with randomly shuffled event times. For each permutation I estimated the

reproduction number (R0), or the expected number of future events that an event triggers on average, which represents “infectivity”. A p-value was calculated by comparing the observed R0 with the null distribution of the subset of permutations that converged. For additional support, I also ran a likelihood ratio test and a standard Knox test of spatio-temporal clustering. The Knox test was conducted with spatial and temporal radii of 100 km and six months (the upper bounds of the first steps in the step functions), respectively (Knox and Bartlett, 1964). To further assess the quality of the model, I conducted simulations from the cumulative intensity function using Ogata’s modified thinning algorithm according to Meyer et al. (2012). Using the parameters of the best fitting model, I conducted 1,000 simulations of the last six months of the study period and compared the results to the observed data. All analyses were conducted in R using version 3.6.1, and the R code used in the study is available on

Harvard Dataverse: https://doi.org/10.7910/DVN/WPYCKJ CHAPTER 3. SPREAD OF FAR-RIGHT EXTREMISM 47 3.3 Results

The results of the best fitting model (∆AIC < 2), which included seven endemic and two epidemic predictor variables, are shown in Table 3.1. Firstly, there is a statistically signifi- cant time trend whereby the endemic rate decreases by 4.6% each year, indicating that the strength of the epidemic component has increased over time. There appears to be a base- line increase in the endemic component between 2008-2012 which likely corresponds to the financial crisis (Funke et al., 2016), as well as a significant spike in the epidemic component around 2016 which likely corresponds to the presidential election (Rushin and Edwards, 2018; Giani and Me´on,2019) (Figure 3.3). There are also significant positive effects of poverty rates (p < 0.01) and the presence of hate groups (p < 0.0001) on radicalization probability. Interestingly, the percentage of voters that vote Republican in presidential elections (p < 0.0001), the percentage of the population that is non-white (p < 0.05), and unemployment rates (p < 0.0001) appear to have significant negative effects on radicalization probability. Gun ownership, education level, and violent crime all have no significant effect on radicaliza- tion probability. When Republican voting was replaced with the absolute percent difference between Republican and Democratic voting, a proxy measure for the competitiveness of elections, it was no longer significant. A variance inflation factor test identified no multi- collinearity problems among the time-averaged endemic predictors (VIF < 3) (Zuur et al., 2010). Both group membership and radicalization via social media have strong and significant positive effects on epidemic probability. Exposures of individuals who belong to formal or informal extremist groups are over four times more likely to be followed by future exposures in close spatial or temporal proximity (p < 0.01). Similarly, exposures of individuals radicalized on social media are almost three times as likely to be followed by future exposures (p < 0.01). Anticipated fatalities and plot success did not appear in the best fitting model. Estimates CHAPTER 3. SPREAD OF FAR-RIGHT EXTREMISM 48

RR 95% CI p-value Time trend 0.946 0.91–0.98 0.0015 Poverty 1.052 1.02–1.09 0.0025 Unemployment 0.871 0.82–0.93 <0.0001 Republican voting 0.969 0.96–0.98 <0.0001 Non-white population 0.989 0.98–1.00 0.038 Education level 0.980 0.96–1.00 0.075 Hate groups 1.191 1.12–1.27 <0.0001 Violent crime 0.930 0.84–1.03 0.17 Group membership 4.563 1.57–13.28 0.0054 Social media 2.722 1.55–4.76 0.0005

Table 3.1: The results of the twinstim modeling, with estimated rate ratios (RR), Wald confidence intervals, and p-values. Endemic predictors (including the overall time trend) are in the top portion of the table, whereas epidemic predictors are in the bottom.

of the decaying spatial and temporal interaction functions, as well as model diagnostics, can be seen in Figures C.2 and C.3, respectively. A variance inflation factor test identified no multicollinearity problems among the epidemic predictors (VIF < 3) (Zuur et al., 2010).

Based on the permutation test, the observed R0 (0.31) is significantly higher than the

null distribution of the converged permutations (Nconv = 739, p < 0.01) (Figure 3.4). This indicates that the spatio-temporal interaction in the epidemic model is significant. Both the likelihood ratio test of the epidemic against the endemic model (p < 0.0001) and the Knox test (p < 0.0001) support this result. The results of the simulations can be seen in Figures 3.5 and 3.6. On average the sim- ulations neatly match the observed cumulative number of exposures between June, 2017 and January, 2018 (Figure 3.5), indicating that the model accurately captures the temporal dynamics in the data. Similarly, the model appears to do a good job of capturing the spatial dynamics in the data, although it is clearly weighted towards high population density areas (Figure 3.6). CHAPTER 3. SPREAD OF FAR-RIGHT EXTREMISM 49

Figure 3.3: The total intensity (in black), as well as the isolated endemic component (grey), over time. Total intensity can be interpreted as the proportion of radicalization probability that is explained by the endemic and epidemic components.

3.4 Discussion

By applying novel epidemiological methods to data on 416 extremists exposed between 2005 and 2017, this study provides evidence that patterns of far-right radicalization in the United States are consistent with a contagion process. Firstly, the estimated reproduction number is significantly higher than those from simulated null models, indicating that endemic causes alone are not sufficient to explain the spatio-temporal clustering observed in the data. The

reproduction number for radicalization (R0 = 0.31) is also lower than one, suggesting that extremist ideologies behave like complex contagions that require reinforcement for transmis- sion. Fortunately, this means that extremist ideologies are unlikely to spread uncontrollably through populations like seasonal influenza (R0 = 1.28) (Biggerstaff et al., 2014), but out- CHAPTER 3. SPREAD OF FAR-RIGHT EXTREMISM 50

Figure 3.4: The results of the Monte Carlo permutation test. The grey bars show the null distribution of R0 from the 739 permutations that converged, whereas the red dashed line shows the observed R0 (0.31) calculated from the twinstim model. breaks can occur under the right endemic and epidemic conditions. For example, regions with higher rates of poverty and hate group activity are more likely to experience far-right extremism, whereas regions with a larger non-white population, more Republican voting, and higher rates of unemployment are less likely to experience far-right extremism. Most importantly, radicalizations involving extremist groups or social media significantly increase the epidemic probability of future radicalizations in the same location. This suggests that clusters of radicalizations in space and time are driven by activism and organizing rather than a copy-cat effect. The fact that group membership significantly increases the epidemic strength of events, and the presence of hate groups significantly increases radicalization probability, suggests that local organizing remains a potent recruitment tool of the far-right movement. This CHAPTER 3. SPREAD OF FAR-RIGHT EXTREMISM 51

Figure 3.5: The cumulative number of exposures during the last six months of the study period (red), as well as the results of 1,000 simulations (grey).

Figure 3.6: The exposure locations during the last six months of the study period. The color gradient, ranging from blue (low) to yellow (high), represents a Gaussian kernel density for the results of 1,000 simulations with a bandwidth of 200 km. The contour lines segment the kernel density into 10 levels. idea is reflected in recent increases in rallies across the country, such as “Unite the Right” in Charlottesville, VA in August of 2017, that have been attended by regional chapters of white CHAPTER 3. SPREAD OF FAR-RIGHT EXTREMISM 52 nationalist and militia organizations. It also suggests that concerns about typological “lone wolves” radicalized over social media should not overshadow the persistent and expanding far-right movement in the United States. Only 10.8% of people in this study were radicalized on social media independently of an extremist group, indicating that solo actors are still the minority in the far-right movement. That being said, solo actors radicalized on social media, such as Omar Mateen (Pulse nightclub shooting in 2016) and Dylann Roof (Charleston church shooting in 2015) (Holt et al., 2019), are typically deadlier than group members in the United States (Phillips, 2017), and should thus be the subject of much future research. Radicalization on social media also significantly increases the epidemic strength of events, indicating that social media platforms augment physical organizing and that the diffusion of extremist ideologies online is likely geographically biased. The increasing role of social media in far-right extremism and radicalization is well established (Holt et al., 2019; Ottoni et al., 2018; Costello and Hawdon, 2018; Lowe, 2019; Winter, 2019). Social media platforms like Twitter provide extremist communities with low cost access to large audiences that might not otherwise engage with far-right content (Wu, 2015; Bertram, 2016). For example, one report found that only 44% of people who follow high-profile white nationalists on Twitter overtly express similar ideologies (Berger and Strathearn, 2013). As mainstream platforms clamp down on hate speech, extremist users have just shifted their traffic to alternative sites such as 8chan and Gab (Blackbourn et al., 2019; Hodge and Hallgrimsdottir, 2019). Given the centrality of social media in far-right organizing, future research should explore how counter-narratives (van Eerten et al., 2017; Voogt, 2017) and other strategies could be used to fight the spread of extremist ideologies online. The results indicate that county-level poverty rates increase the probability of far-right radicalization. While there is little to no evidence that poverty predicts extremism at the state-level (Piazza, 2017; Gale et al., 2002; Lin et al., 2018; Durso and Jacobs, 2013), studies at the county-level have found that poverty predicts both mass shooting rate (Kwon and CHAPTER 3. SPREAD OF FAR-RIGHT EXTREMISM 53

Cabrera, 2019b) and hate groups (presence (Medina et al., 2018) not longevity (Suttmoeller et al., 2015, 2016, 2018)). This discrepancy between geographic resolutions indicates that using state-level poverty data obscures local variation. The results of this study also reveal a negative effect of unemployment rate on radicalization, adding to the remarkably con- tradictory evidence for links between unemployment and extremism in the United States (Jefferson and Pryor, 1999; Goetz et al., 2012; Piazza, 2017; Majumder, 2017; Green et al., 1998; Gale et al., 2002; Espiritu, 2004). Although this result appears to be counter-intuitive, I hypothesize that poverty and unemployment may interact in driving radicalization. For ex- ample, individuals from regions where jobs are plentiful but poverty remains high may be the most disillusioned and susceptible to extremist ideologies. Interestingly, income inequality did not appear in the best fitting model, and had no significant effect when included. This suggests that overall deprivation, as measured by poverty rate, is more important in driving radicalization than inequality. Previous studies that have found a positive impact of income inequality on hate groups or crime either used state-level data (Majumder, 2017), did not account for poverty rate (McVeigh, 2004; Goetz et al., 2012), or combined income inequality with poverty rate into a single index (McVeigh and Cunningham, 2012). Interestingly, both unemployment (Pah et al., 2017) and income inequality (Kwon and Cabrera, 2017, 2019b,a) appear to be strong predictors of mass shootings. Although this seems paradoxical, the ma- jority of mass shootings are not ideologically driven (Capellan, 2015), so the socioeconomic drivers may be different than for far-right radicalization. Violent crime appears to have no influence on radicalization. Although one study of the Ku Klux Klan found that high levels of far-right activity can increase homicide rates in the long-term (McVeigh and Cunningham, 2012), there is little evidence that violent crime rates drive increases in extremist violence or radicalization (Sweeney and Perliger, 2018). Hate crime is only very weakly correlated with violent crime (Gladfelter et al., 2017), and extremist violence is even more rare (LaFree and Dugan, 2009), so they are likely driven by CHAPTER 3. SPREAD OF FAR-RIGHT EXTREMISM 54 different factors. Previous studies have found strong evidence for a negative relationship between education and hate crime rates (Espiritu, 2004; Gladfelter et al., 2017), a positive relationship between education and mass shooting rates (Kwon and Cabrera, 2017, 2019a), and no evidence for a relationship between education and hate group organizing (Durso and Jacobs, 2013; McVeigh et al., 2014; Florida, 2011). The results of this study are consistent with the latter category, which makes sense given that the majority of the plots in the dataset were non-violent. The negative effect of Republican voting on event probability could be because individuals on the far-right of the political spectrum who live in counties with more Democratic voters may feel more partisan hostility (Miller and Conover, 2015). Interestingly, this effect does not appear to be the result of more competitive elections (Suttmoeller et al., 2015), as the absolute difference between Republican and Democratic voting did not significantly influence event probability. Alternatively, the negative effect of Republican voting may be due to the fact that many of the rural counties that lean heavily Republican have low population densities and no recent history of extremist violence. A previous study that found mixed evidence for a positive influence of Republican voting on the presence of hate groups excluded counties without hate groups from the modeling, which may have eliminated this skew effect (Medina et al., 2018). The fact that the percentage of the population that is non-white negatively predicts far-right extremist violence is consistent with the intergroup contact hypothesis, which sug- gests that prolonged contact between racial groups reduces conflict under certain conditions (Allport, 1954). Although other researchers have suggested that population heterogeneity increases far-right radicalization (McVeigh, 2004), the only study to find evidence of this in the United States did not explicitly account for population density (LaFree and Bersani, 2014). Other studies controlling for population density have found that both anti-black hate crimes and hate groups appear to be more common in white dominated, racially homoge- CHAPTER 3. SPREAD OF FAR-RIGHT EXTREMISM 55 neous areas (Gladfelter et al., 2017; Medina et al., 2018). Despite mixed evidence for the intergroup contact hypothesis, it is widely accepted that community diversity and tolerance is key to fighting radicalization and extremist violence globally (Gunaratna et al., 2013; Ellis and Abdi, 2017; Ercan, 2017; Hoffman et al., 2018; United Nations Development Programme, 2016; Southern Poverty Law Center, 2017). Gun ownership does not predict radicalization in this model, which is unsurprising since only 30.5% of people in this study planned on committing fatal attacks but interesting given the centrality of gun control in debates following mass shootings in the United States (Pierre, 2019; Joslyn and Haider-Markel, 2017; Luca et al., 2020). Despite strong evidence that gun ownership is linked to mass shooting rates at the national-level (Reeping et al., 2019), evidence for same pattern at the state-level remains mixed. Previous studies have found that it either positively predicts mass shootings overall (Reeping et al., 2019), when combined with particular gun control laws (Anisin, 2018), or not at all (Lin et al., 2018; Pah et al., 2017). Unfortunately, CDC funding for research on gun ownership was restricted by Congress in 1996 after lobbying by the National Rifle Association, so potential links between extremist violence and gun ownership remain understudied (Lemieux, 2014; Winker et al., 2016; Morall, 2018; DeFoster and Swalve, 2018). Several limitations of this study should be highlighted. Firstly, the PIRUS database only represents a subset of radicalized individuals in the United States. The creators of the database used random sampling to maximize its representativeness over different time periods, but there remains a possibility of spatial or temporal bias in the original data due to underreporting by victims and law enforcement effort (DiIulio, 1996). Instances of hate crime are notoriously underreported relative to other forms of crime (Pezzella et al., 2019), because victims often fear retaliation or mistrust the police (Wong and Christmann, 2016; Weiss et al., 2016; Pezzella, 2017). There is also a great deal of variation in hate crime training among police departments, and the personal beliefs of individual officers can influence whether or CHAPTER 3. SPREAD OF FAR-RIGHT EXTREMISM 56 not instances are reported (Boyd et al., 1996; Pezzella, 2017). Both of these factors are likely to be more pronounced in areas with legacies of far-right extremist violence, and historical crossover between far-right groups and the police (Rowe, 1976; Barnes, 1996; Johnson, 2019). In addition, the geographic locations of events are only geocoded to the city-level, potentially enhancing the spatial clustering of the data. Furthermore, social media data were missing for a significant number of individuals (54.8%). The significance level of the estimate for social media usage is extremely low and robust to imputation, indicating that it likely reflects a real effect, but researchers should exercise caution when interpreting this result (Safer- Lichtenstein et al., 2017). Lastly, the spatial resolution of three of the endemic predictors was limited to the state-level, which may have flattened some important local variation. One of these variables, gun ownership, was also a proxy measure. Policymakers should release historical restrictions on research funding for gun violence and hate crime research to improve data resolution for future studies. In conclusion, far-right radicalization in the United States appears to spread through populations like a complex contagion. Both social media usage and group membership enhance the contagion process, indicating that online and physical organizing remain primary recruitment tools of the far-right movement. In addition, far-right radicalization is more likely in Democrat-majority regions with high poverty and low unemployment, fewer non- white people, and more hate group activity. While the federal government has acknowledged the threat of far-right extremism (The Department of Homeland Security, 2019), funding for organizations researching or fighting the movement has decreased in recent years (O’Toole, 2019). Based on the results of this study, I recommend that policymakers reconsider their funding priorities to address the expanding far-right extremist movement in the United States. Future research should investigate how specific interventions, such as online counter- narratives to battle propaganda, may be effectively implemented to mitigate the spread of extremist ideology. Chapter 4

Conformity bias in the cultural transmission of music sampling traditions

4.1 Introduction

As Darwinian approaches are increasingly incorporated into modern musicology (Savage, 2019), researchers have begun to investigate how transmission biases shape the cultural evolution of music (MacCallum et al., 2012; Savage et al., 2015; Lumaca and Baggio, 2017b; Ravignani et al., 2018; Brand et al., 2019). Transmission biases, or biases in social learning that predispose individuals to favor particular cultural variants, are important selective forces (Cavalli-Sforza and Feldman, 1981a) that can result in significant changes at the population- level (Leonard and Jones, 1987; Durham, 1990; Boyd and Richerson, 1992; Ormrod, 1992a). For example, a recent study found evidence that the presence of positive and negative lyrics in popular music has been driven by prestige, success, and content biases (Brand et al., 2019).

57 CHAPTER 4. CONFORMITY BIAS IN MUSIC SAMPLING 58

In the last several decades, researchers have begun to explore how these kinds of trans- mission processes can be inferred from large-scale cultural datasets. This “meme’s eye view” approach (Shennan, 2011), originally pioneered by archaeologists studying ceramics (Neiman, 1995; Bentley and Shennan, 2003), has since been applied to dog breeds (Bentley et al., 2007), cooking ingredients (Isaksson et al., 2015), and baby names (O’Dwyer and Kandler, 2017). In music, this approach has revealed that frequency-based biases like conformity and novelty, in which the probability of adopting a variant disproportionately depends on its commonness or rarity (Boyd and Richerson, 1985), vary across domains and levels of analysis. For exam- ple, there is some evidence that dissonant intervals in Western classical music are subject to novelty bias (Nakamura and Kaneko, 2018), rhythms in Japanese enka music are subject to conformity bias (Nakamura and Kaneko, 2018), and popular music at the level of albums (Bentley et al., 2007) and artists (Acerbi and Bentley, 2014) is subject to random copying1. Music sampling, or the use of previously-recorded material in a new composition, is an ideal model for investigating frequency-based bias in the cultural evolution of music because (1) samples are known to be culturally transmitted between collaborating artists, and (2) sampling events are reliably documented in online databases (Youngblood, 2019b). For researchers, music sampling is a rare case where process is understood and pattern is accessible. In the current study, we aim to use longitudinal sampling data to determine whether frequency-based bias has played a role in the cultural transmission of music sampling traditions. Earlier manifestations of the “meme’s eye view” approach, based on diversity and progeny distributions, are time-averaged and more susceptible to type I and II error, respectively (Premo, 2014; Crema et al., 2014; O’Dwyer and Kandler, 2017). In the current study we utilize two more recent methods, turn-over rates and generative inference, that better capture the temporal dynamics that result from transmission processes (Wilder and

1Under certain conditions. The transmission of popular artists on Last.fm is consistent with random copying in generalist groups of users and conformity in more niche groups of users (Acerbi and Bentley, 2014). CHAPTER 4. CONFORMITY BIAS IN MUSIC SAMPLING 59

Kandler, 2015). The turn-over rate of a top list of cultural variants, ranked by descending frequency, is simply the number of new variants that appear at each timepoint (Bentley et al., 2007). Examples of top lists in popular culture include the Billboard Hot 100 music chart and the IMDb Top 250 movies chart. By comparing the turn-over rates (z) of top lists of different lengths (y), we can gain insight into whether or not the data are consistent with neutral evolution (i.e. random copying). The turn-over profile for a particular cultural system can be described with the following function:

x zy = A · y (4.1)

where A is a coefficient depending on population size and x indicates the level of frequency- based bias (Evans and Giometto, 2011; Acerbi and Bentley, 2014; Kandler and Crema, 2018). Simulation studies indicate that at neutrality x ≈ 0.86 (Evans and Giometto, 2011; Acerbi and Bentley, 2014). Under conformity bias turn-over rates are relatively slower for shorter top-lists, leading to a convex turn-over profile (x > 0.86). Likewise, under novelty bias turn- over rates are relatively faster for shorter top-lists, leading to a concave turn-over profile (x < 0.86) (Acerbi and Bentley, 2014). Generative inference is a powerful simulation-based method that uses agent-based mod- eling and approximate Bayesian computation (ABC) to infer underlying processes from ob- served data (Kandler and Powell, 2018a). Agent-based modeling allows researchers to simu- late a population of interacting “agents” that culturally transmit information under certain parameters. With a single cultural transmission model, this method can be used to infer the parameter values that likely generated the observed data (Crema et al., 2014, 2016; Kandler and Shennan, 2015a; Kandler and Crema, 2018). With competing models assuming different forms of bias, this method can be used to choose the model that is most consistent with the CHAPTER 4. CONFORMITY BIAS IN MUSIC SAMPLING 60 observed data (Crema et al., 2014, 2016; Thouzeau et al., 2018; Pagel et al., 2019). In the current study, we use the basic rejection form of ABC for parameter inference and a random forest machine learning form of ABC for model choice.

4.2 Methods

Sampling data were collected from WhoSampled (https://www.whosampled.com/) on Febru- ary 18th, 2019. The analysis was restricted to drum breaks because artists typically only use one drum break per composition, whereas vocal and instrumental samples are combined more flexibly. For each sample source tagged as a “drum break”, we compiled the release years and artist names for every sampling event that occurred between 1987-2018. Previous years had fewer than 82 cultural variants and were excluded from the analysis. Collectively, this yielded 1,463 sample sources used 38,500 times by 14,387 unique artists. The release years were used to construct a frequency table in which each row is a year, each column is a sample, and each cell contains the number of times that particular sample was used in that year. Notable sampling events for the five most sampled drum breaks are shown in Table 4.1, and the frequencies of 10 common and 10 rare samples through time are shown in Figure 4.1. Turn-over rates were calculated using the HERAChp.KandlerCrema package in R (Kan- dler and Crema, 2018). x was calculated from top-lists up to size 142 (the minimum number of cultural variants present in a given year) across all years. The observed distribution of turn-over rates was compared to those expected under neutral conditions according to Bentley (Bentley et al., 2007) and Evans and Giometto (Evans and Giometto, 2011). Simulations were conducted using the agent-based model of cultural transmission avail- able in the HERAChp.KandlerCrema package in R (Kandler and Crema, 2018). This trans- mission model generates a population of N individuals with different cultural variants, and CHAPTER 4. CONFORMITY BIAS IN MUSIC SAMPLING 61 “Straight Outta Compton” by“King N.W.A of (1988) the Beats”“I by Want Mantronix You (Forever)” (1988) by Carl Cox“It (1991) Takes Two” by“Alright” Rob by Base Janet & Jackson DJ“Come (1989) E-Z on Rock My (1988) Selector” by Squarepusher“Fight (1997) the Power” by“I Public Am Enemy Stretched (1989) on“Pop Your Corn” Grave” by by Caustic Sin´eadO’Connor (1990) Window (1992) “Eric B. Is President”“Hip by Hop Eric Hooray” B. by“Wontime” & Naughty by Rakim by Smif-N-Wessun (1986) Nature (1995) (1993) “The Bridge” by MC“Mr. Shan Loverman” (1986) by Shabba“The Ranks Flute (1992) Tune” by Hidden Agenda (1995) Original Sample“Amen, Brother” by The Winstons (1969)“Think (About It)” by Lyn Collins (1972)“Funky Drummer” by James Brown (1970)“Funky President (People It’s Bad)” by 3,225 James Brown (1974)“Impeach the President” by Times The Sampled Honey Drippers 2,251 (1973) 865 Notable Sampling Events 1,517 785 Table 4.1: Notabletimes sampling each events drum for break the has five been most sampled was sampled collected drum from breaks WhoSampled used on in June 27th, the 2019. current study. The number of CHAPTER 4. CONFORMITY BIAS IN MUSIC SAMPLING 62

Figure 4.1: Violin plots showing the frequencies of samples, ranked by overall use, from 1980 to 2019. The x-axis is the rank of each sample, and the y-axis is the year. To the left of the dotted line are samples 1-10, while to the right are samples 501-510. More common samples (on the left) appear to be much more stable over time than rarer ones. The high popularity of the more common samples in the late 80s and early 90s is likely due to the rapid expansion of sample-based hip-hop and dance music triggered by increased access to digital samplers and more relaxed copyright enforcement during that period. simulates the transmission of those variants between timepoints given a particular innova- tion rate (µ) and level of frequency-based bias (b). As departures from neutrality can only be reliably detected after equilibrium has been reached, this model incorporates a warm-up period that is excluded from the rest of the analysis. Negative values of b correspond to conformity bias, while positive values correspond to novelty bias. The output of this model includes turn-over rates and the Simpson’s diversity index at each timepoint. Simpson’s diversity index (D) is the probability that any two randomly selected cultural variants are of the same type, where values closer to 0 indicate high diversity and values closer to 1 indicate low diversity (Simpson, 1949). Parameter inference was conducted with the rejection algorithm of ABC, using the EasyABC (Jabot et al., 2013) and abc (Csillery et al., 2012) packages in R, in three ba- CHAPTER 4. CONFORMITY BIAS IN MUSIC SAMPLING 63 sic steps:

1. 100,000 iterations of the model were run to generate simulated summary statistics for different values of b within the prior distribution.

2. The Euclidean distance between the simulated and observed summary statistics was calculated for each iteration.

3. The 1,000 iterations with the smallest distances from the observed data, determined by the tolerance level (ε = 0.01), were used to construct the posterior distribution of b.

The exponent of the turn-over function (x) and the mean Simpson’s diversity index (D¯) were used as summary statistics for parameter inference. Population size (N = 729), innovation rate (µ = 0.037), and warm-up time (t = 200) were kept constant for all models, and a uniform prior distribution was used for b (-0.2—0.2). Population size was calculated from the mean number of unique artists involved in a sampling event at each timepoint in the observed dataset. Innovation rate was calculated from the mean number of new sample types per total number of samples at each timepoint in the observed dataset, according to Shennan and Wilkinson (Shennan and Wilkinson, 2001). The warm-up time was determined by running 1,000 iterations of a neutral model with the observed innovation rate over 500 timepoints (Crema et al., 2014) and estimating when observed diversity reaches equilibrium (see Figure D.1). The bounds of the uniform prior distribution for b, adapted from Crema et al. (Crema et al., 2014), were reduced based on observed levels of frequency-based bias in other cultural systems (Kandler and Shennan, 2015a; Kandler and Crema, 2018; Kandler and Powell, 2018a). Each model was run for 32 timepoints, which corresponds to the number of years in the observed dataset. Model choice was conducted with the random forest algorithm of ABC, using the abcrf (Pudlo et al., 2015) package in R. Random forest is a form of machine learning in which a CHAPTER 4. CONFORMITY BIAS IN MUSIC SAMPLING 64 set of decision trees are trained on bootstrap samples of variables, and used to predict an outcome given certain predictors (Cutler et al., 2012). Traditional ABC methods function optimally with fewer summary statistics (Blum et al., 2013), requiring researchers to reduce the dimensionality of their data. We chose to use random forest for model choice because it appears to be robust to the number of summary statistics (Pudlo et al., 2015), and does not require the exclusion of potentially informative variables. The random forest algorithm of ABC was conducted with the following steps:

1. 50,000 iterations of each model (conformity, novelty, and neutrality) were run to gener- ate simulated summary statistics for different values of b within the prior distributions.

2. The results of these three models were combined into a reference table with the sim- ulated summary statistics (and calculated LDA2 axes) as predictor variables, and the model index as the outcome variable.

3. A random forest of 1,000 decision trees was trained with bootstrap samples from the reference table (150,000 rows each).

4. The trained forest was provided with the observed summary statistics, and each deci- sion tree voted for the model that the data were likely generated by.

5. The posterior probability of the model with the majority of the votes was calculated using the out-of-bag data that did not make it into the bootstrap training samples.

The details of this process are outlined by Pudlo et al. (Pudlo et al., 2015). The following 178 summary statistics were used for model choice: the exponent of the turn-over function

(x), the mean turn-over rate (z ¯y) for each list size (up to 142), the Simpson’s diversity index for each timepoint (D) (up to 32), the mean Simpson’s diversity index (D¯), and the two

2Linear discriminant analysis (LDA) is a method of dimensionality reduction, similar to PCA, that compresses multiple variables onto two axes while maximizing the separation between classes. CHAPTER 4. CONFORMITY BIAS IN MUSIC SAMPLING 65

LDA axes. Population size (N = 729), innovation rate (µ = 0.037), and warm-up time (t = 200) were kept constant for all models. Uniform prior distributions were used for b in both the conformity (-0.2—0) and novelty (0—0.2) models, whereas b was kept constant at 0 for neutrality. All R scripts and data used in the study are available in the Harvard Dataverse repository: https://doi.org/10.7910/DVN/TWADX4.

4.3 Results

The observed turn-over rates, as well as those expected under neutral conditions, can be seen in Figure 4.2. Kolmogorov-Smirnov tests found that the observed distribution of turn-over rates is significantly different from the neutral expectations of both Bentley (Bentley et al., 2007) (p < 0.001) and Evans and Giometto (Evans and Giometto, 2011) (p < 0.001). The value of the exponent x (see Equation 4.1) for the observed data is 1.13, which is indicative of conformity bias. The posterior probability distribution of the level of frequency-based bias (b), constructed with the basic rejection algorithm of ABC, is shown in Figure 4.3. Based on the parameter estimation of b, the observed data are most consistent with weak but significant conformity bias (median = -0.012; 95% HDPI: [-0.019, -0.0020]). A goodness-of-fit test (n = 1000; ε = 0.01) indicates that the model is a good fit for the data (p = 0.47) (see Figure D.2) (Lemaire et al., 2016), and leave-one-out cross validation indicates that the results are robust across tolerance levels (n = 10; ε: 0.005, 0.01, 0.05) (see Figure D.3) (Csillery et al., 2012). The results of the model choice using the random forest algorithm of ABC can be seen in Table 4.2. The conformity model has the strongest support (436 votes) with a posterior probability of 0.89. The out-of-bag error, calculated by running the out-of-bag data through the random forest, was 0.046 (see Figure D.4), indicating that the forest is a good classifier CHAPTER 4. CONFORMITY BIAS IN MUSIC SAMPLING 66

Figure 4.2: The observed turn-over rates (z) for top-lists up to size 142, compared to those expected under neutral conditions according to Bentley (Bentley et al., 2007) (in blue) and Evans and Giometto (Evans and Giometto, 2011) (in orange). The x-axis is the size of the top lists for which z, on the y-axis, was calculated.

Figure 4.3: The posterior probability distribution of the level of frequency-based bias (b), with the median shaded in dark grey and the 95% HDPI shaded in light grey. CHAPTER 4. CONFORMITY BIAS IN MUSIC SAMPLING 67 for the data. The most important variable for the classification ability of the random forest, identified using the Gini impurity method, was mean diversity (D¯), followed by the first LDA axis (LD1), the exponent of the turn-over function (x), and the second LDA axis (LD2). The importance of the top ten variables, as well as the results of the LDA, can be seen in Figures D.5 and D.6.

Conformity Novelty Neutrality Post. Prob. 436 174 390 0.89

Table 4.2: The number of votes cast by the trained random forest for each model after being provided with the observed summary statistics, as well as the posterior probability of the selected model (conformity).

4.4 Discussion

By applying simulation-based methods to three decades of sampling events, we have pro- vided evidence that conformity bias plays an important role in the cultural transmission of music sampling traditions. Firstly, turn-over rates for longer list sizes are higher than expected under neutral evolution, indicating that artists may be selectively using more pop- ular samples. In addition, the rejection algorithm of ABC found that transmission models assuming low but significant levels of conformity bias best match the observed data. Lastly, a random forest trained on simulated data from three transmission models classified the observed data as coming from the conformity model. Taken together, these results indicate that music producers tend to conform to the sampling patterns of others, which is consistent with reports of artists using particular samples as signals of community membership (e.g. the Amen break) (Whelan, 2009). Although our results are concordant with evidence of conformity bias in Japanese enka music (Nakamura and Kaneko, 2018), they conflict with evidence of novelty bias in West- CHAPTER 4. CONFORMITY BIAS IN MUSIC SAMPLING 68 ern classical music (Nakamura and Kaneko, 2018) and neutral evolution in popular music (Bentley et al., 2007; Acerbi and Bentley, 2014). In the study of Western classical music, frequency-based bias was identified by looking at changes in the means and standard de- viations of the frequencies of particular cultural variants (Nakamura and Kaneko, 2018). Despite the fact that these measures appear to be intuitive indicators of frequency-based bias, they do not account for competition between cultural variants as frequencies change. For example, novelty bias would be expected to favor rare variants only until they become relatively common and are supplanted by rarer variants. These kinds of dynamic processes are better captured by turn-over rates and simulation-based methods. In the two studies of popular music, researchers looked at the transmission of albums and artists between listeners using data from the Billboard charts (Bentley et al., 2007) and Last.fm (Acerbi and Bentley, 2014). It is possible that the low cost of listening relative to producing allows individuals to explore new music at random rather than relying on frequency-based bias. That being said, we suspect that turn-over rates on the Billboard charts (Bentley et al., 2007) may not accurately reflect the behavior of listeners, given that the charts have historically been ma- nipulated by record labels and distributors (Knopper, 2009). Last.fm, on the other hand, is a more reliable source of transmission data as users can directly share music with one another in groups (Acerbi and Bentley, 2014). Interestingly, while turn-over rates of top artists in generalist groups of users were consistent with neutral evolution, rates in more niche groups (e.g. “female fronted metal”) were consistent with conformity bias (Acerbi and Bentley, 2014). It is possible that individuals in more niche groups feel a greater sense of community and are influenced more by other listeners, although this idea has yet to be tested. Overall, the discrepancies between the current and previous studies indicate that frequency-based bias in music may vary depending on the level of analysis (e.g. samples between artists vs. artists between listeners) and cost of adoption (e.g. work-intensive production process vs. clicking a streaming link), so the results of the current study may not be generalizable across CHAPTER 4. CONFORMITY BIAS IN MUSIC SAMPLING 69 all musical domains. A recent study on music sampling found that less popular artists culturally transmit sam- ples with one another at higher rates (Youngblood, 2019b). In combination with anecdotes of artists selectively avoiding popular samples (e.g. De La Soul refusing to sample mainstream artists) (Lena, 2004), this suggests that novelty bias may be present. Counter-dominance sig- naling is a recently developed hypothesis (Klimek et al., 2019) that may reconcile the strong conformist signal in our data with the indications of novelty bias in the literature (Lena, 2004; Youngblood, 2019b). This hypothesis posits that low popularity “outsiders” develop new styles in opposition to those expressed by high popularity “elites”. Over time, these new styles become widespread enough to be adopted by elites, allowing space for new counter-elite styles to emerge in response (Klimek et al., 2019). In other words, novelty bias may cause new styles to be adopted by outsiders, and conformity bias may allow those new styles to spread and eventually be expressed by elites. If less popular artists are much more common and tend to favor samples used within their community over those used by more popular artists, then population-level sample frequencies are likely to reflect conformity bias over novelty bias. This hypothesis is consistent with the emphasis that many artist communities place on collective cultural production in opposition to the “mainstream” (Hesmondhalgh, 1998). There are several limitations to this study that need to be highlighted. Firstly, the turn-over rate results should be interpreted with caution, as recent work indicates that the exponent of the turn-over function (x) may be overestimated when fewer than 40 timepoints are analyzed (Kandler and Crema, 2018). Additionally, traditional ABC requires researchers to choose a subset of summary statistics, which can have a significant effect on parameter estimation. Luckily, the two statistics we used for parameter estimation ended up being the most important variables for classification by the random forest (excluding the LDA axes). Lastly, recent work indicates that the inclusion of rare variants is important for inferring CHAPTER 4. CONFORMITY BIAS IN MUSIC SAMPLING 70 underlying cultural transmission biases from population-level data (O’Dwyer and Kandler, 2017). As WhoSampled is a crowd-sourced database, its coverage of popular variants is presumably much more complete than its coverage of rare variants. Algorithms for sample- detection may allow researchers to reconstruct full transmission records in the future, but these approaches are not yet publicly available (Hockman and Davies, 2015; L´opez-Serrano et al., 2017). The results of the current study add to an expanding body of literature addressing how frequency-based bias influences cultural diversity at the population-level. In addition, we have provided further validation of generative inference methods that allow researchers to bridge pattern and process in cultural evolution. Future studies should employ more complex agent-based models that incorporate social status to determine whether counter-dominance signaling influences cultural transmission within music production communities, as well as other forms of transmission bias (e.g. content and prestige) to control for equifinality. Chapter 5

Content bias in the cultural evolution of house finch song

5.1 Introduction

Learned birdsong, which is socially transmitted and accumulates changes across generations, is a powerful model system for the study of cultural evolution (Freeberg, 2000; Lipkind and Tchernichovski, 2011; Mundinger and Lahti, 2014; Aplin, 2019). Over the course of transmis- sion, changes can accumulate through a variety of means, resulting in spatial and historical diversity in song structure (Slater, 1989; Parker et al., 2012). Transmission biases, or bi- ases in social learning that predispose individuals to favor particular cultural variants, are important selective forces (Cavalli-Sforza and Feldman, 1981b) that can result in significant changes at the population-level (Leonard and Jones, 1987; Durham, 1990; Boyd and Richer- son, 1992; Ormrod, 1992b). Conformity bias (a bias for more common variants), for example, appears to play a key role in the cultural transmission of foraging techniques in great tits, and birdsong in swamp sparrows (Aplin et al., 2014; Lachlan et al., 2018). In the last several decades, researchers have begun to explore how these transmission

71 CHAPTER 5. CONTENT BIAS IN HOUSE FINCH SONG 72 processes can be extracted from large-scale cultural datasets. This top-down approach, in which mechanisms of cultural transmission are inferred by looking at population-level changes in the variants themselves, was originally pioneered by archaeologists (Neiman, 1995; Bentley and Shennan, 2003) and has since been applied to various cultural domains including dog breeds (Bentley et al., 2007), music (Youngblood, 2019a), and birdsong (Byers et al., 2010; Lachlan et al., 2018). For example, chestnut-sided warblers have accented and unaccented songs that exhibit slightly different patterns of geographic variation (Byers et al., 2010). By comparing their rates of turnover and mutation, researchers were able to conclude that only one of the two song types is subject to stabilizing selection, likely caused by some form of transmission bias (Byers et al., 2010). In the current study we utilize a more recent version of this approach called generative inference, that allows researchers to test hypotheses related to specific forms of transmission bias (Kandler and Powell, 2018b). Generative inference is a powerful simulation-based method that uses agent-based mod- eling and approximate Bayesian computation to infer underlying processes from observed data (Kandler and Powell, 2018b). Agent-based modeling allows researchers to simulate generations of interacting individuals that culturally transmit information according to cer- tain parameters, some of which are variable and some of which are constant, based on what is known about the system. By running many iterations of these simulations with different parameter values and comparing the output to the observed data, it is possible to infer the parameter values that likely generated the observed data (Crema et al., 2014; Kandler and Crema, 2019; Crema et al., 2016; Kandler and Shennan, 2015b). Lachlan et al. (2018) re- cently investigated the cultural transmission of swamp sparrow song using an agent-based model that incorporated three forms of transmission bias. They found that conformity bias best explained the cultural composition of their population (Lachlan et al., 2018). For the current study, we modified this model to allow for dynamic population size. We used this model with a new version of approximate Bayesian computation that incorporates random CHAPTER 5. CONTENT BIAS IN HOUSE FINCH SONG 73 forest machine learning (Raynal et al., 2019) to infer whether transmission bias has played a role in the cultural evolution of house finch song. The house finch is a socially monogamous, non-territorial passerine (Thompson, 1960a) with learned song (Mann et al., 2020) that culturally evolves (Mundinger, 1980) and is likely subject to sexual selection (Nolan and Hill, 2004). It is native to the western United States and Mexico, but was introduced from California to Long Island around 1940 (Elliott and Arbib, 1953; Aldrich and Weske, 1978) and has since expanded throughout the eastern and central United States (Belthoff and Gauthreaux, 1991). The house finch is an ideal model for such a study because we have a collection of songs from over a thousand individuals from an extensive geographical area in the introduced eastern range recorded between 1971–1975 (Mundinger, 1975), 2010–2016 (Ju et al., 2019), and 2018–2019. Our recent comparative study of house finch songs recorded in 1975 and 2012 in Long Island showed that local di- alects have become less distinct and both song and syllable diversity have increased at the population-level (Ju et al., 2019). Additionally, although none of the songs from 1975 have persisted, around half of the syllable types recur in 2012 (Ju et al., 2019). We proposed three possible drivers for the increase in diversity: (1) higher innovation rates, (2) more demon- strators (as a result of increased density), and (3) more dispersal. Although the average dispersal distance and number of demonstrators probably increased during the population expansion of the 1980s, these variables would also have been affected by the significant bot- tleneck that occurred in the 1990s as a result of mycoplasmal conjunctivitis (Hawley et al., 2006). In addition, it is unclear, with the methods used, whether or not the increase in diversity is the result of neutral or biased cultural transmission. The aim of the this study was to infer whether transmission bias played a role in the cultural evolution of house finch song during a rapid demographic expansion and crash in the introduced eastern range. We investigated the three forms of cultural transmission bias, first described by Boyd and Richerson (1985), that likely play a role in birdsong (Lachlan et al., CHAPTER 5. CONTENT BIAS IN HOUSE FINCH SONG 74

2018): content bias, frequency bias, and demonstrator bias. Content bias occurs when some cultural variants are more likely to be learned because of the content of those variants (e.g. frequency bandwidth and complexity) (Rendell et al., 2011). The most widespread form of content bias in birds is a preference for species-typical song (Nelson et al., 2004; Aplin, 2019). In white-crowned sparrows, for example, birds are more likely to learn phrases from songs that contain a species-specific introductory whistle (Soha and Marler, 2000). Content biases can be adaptive, given that females choose mates based on particular acoustic features (Mennill et al., 2006; Hernandez et al., 2008). Since female house finches prefer elaborate song (Nolan and Hill, 2004; Mennill et al., 2006), and complex syllables are overrepresented in our more recent recordings (Ju et al., 2019), content bias, specifically for syllable complexity, may be present in male house finches. Frequency biases (e.g. conformity bias and novelty bias) occur when the commonness or rarity of cultural variants affects their adoption (Rendell et al., 2011). In birds, confor- mity is the most well-documented form of frequency bias (Jenkins, 1977; Slater and Ince, 1979; Lachlan et al., 2018; Aplin, 2019). White-crowned sparrows, for example, selectively retain the songs of their neighbors after establishing territories, resulting in stable local di- alects (Nelson and Poesel, 2009). Under certain conditions, conformity bias is thought to be broadly adaptive because it allows individuals to quickly adopt common variants (Boyd and Richerson, 1985; Henrich and Boyd, 1998), and is particularly important in birds because females of some species prefer males who sing local songs (O’Loghlen and Rothstein, 1995; Searcy et al., 1997). That being said, the adaptive nature of conformity bias depends on several assumptions that may not hold for all systems (Eriksson et al., 2008), and weak conformity is likely to be more adaptive than strong conformity in many contexts because it allows potentially advantageous innovations to spread (Kandler and Laland, 2009). To our knowledge, the only clear evidence of novelty bias in birds is in medium ground finches (Gibbs, 1990). In this species, males appear to preferentially learn rarer songs, and those CHAPTER 5. CONTENT BIAS IN HOUSE FINCH SONG 75 that sing rarer songs survive longer and produce more offspring. Gibbs (1990) attributes this novelty bias to the high population density of territorial males, a situation in which birds with more distinct songs may spend less time displaying and more time foraging and avoid- ing predation. Other possible examples of novelty bias, such as song “revolutions” in corn buntings and continent-level shifts in white-throated sparrow songs, (Garland and McGre- gor, 2020; Otter et al., 2020), could also result from content bias and need to be investigated further. In house finches, a non-territorial species in which females prefer males who sing local songs (Hernandez and MacDougall-Shackleton, 2004), conformity bias is more likely to be adaptive than novelty bias. Demonstrator bias occurs when cultural variants produced by particular demonstrators are preferred (Rendell et al., 2011). In indigo buntings, for example, young birds preferen- tially learn songs produced by males that they socially interact with (Payne, 1981). In other species, birds are more likely to learn from males that are older (Jenkins, 1977) or more aggressive (Clayton, 1987). In the house finch, a species in which both song and redness are used in mate choice (Mennill et al., 2006), it is possible that young birds pay more attention to redder demonstrators. Given that social context influences song production (Ciaburri and Williams, 2019), demonstrator bias in house finches could also be the result of observational learning (Hoppitt and Laland, 2008), in which young birds pay attention to which males spend the most time with females and choose demonstrators accordingly. One of the biggest challenges in extracting transmission biases from population-level data is dealing with equifinality, or the fact that different transmission mechanisms can yield similar results (Barrett, 2018). This is particularly relevant for this study, as content, frequency, and demonstrator biases can yield similar population-level patterns (Acerbi et al., 2016). For example, a content bias for songs containing particular syllables, such as the introductory whistle in white-crowned sparrows (Soha and Marler, 2000), could generate stable dialects that are consistent with conformity bias. Alternatively, a content bias for CHAPTER 5. CONTENT BIAS IN HOUSE FINCH SONG 76

Figure 5.1: The first 10 syllables from a song recorded in 1975 and analyzed in Luscinia. Each red bar corresponds to a syllable, and each green number corresponds to an element within that syllable. The blue traces represent mean frequency. In this song, syllable three and syllable eight were classified as the same syllable type during dynamic time warping, and all other syllables are unique types. complex syllables that only the highest-quality males can produce could manifest as novelty bias. However, despite potential similarities in emergent pattern, simulations indicate that content, frequency, and demonstrator biases have discriminable effects at the population- level (Lachlan et al., 2018). Content bias, for example, tends to reduce the number of rare variants in the population by increasing the turnover of unattractive new variants. Demonstrator bias, on the other hand, tends to cause a decrease in cultural diversity by reducing the number of potential demonstrators (Lachlan et al., 2018). By simulating all three processes simultaneously, we aim to infer which is dominant in the cultural evolution of house finch song.

5.2 Methods

We used recordings from 1975 (Mundinger, 1975), 2012 (Ju et al., 2019), and 2019 that were collected in western Long Island (Brooklyn, Queens, and Nassau County) (Figure 5.2). The 1975 and 2012 recordings were identical to those recently analyzed by Ju et al. (2019), whereas the 2019 recordings were specifically collected for this study. Recordings from 1975 CHAPTER 5. CONTENT BIAS IN HOUSE FINCH SONG 77 were collected with a Nagra III reel-to-reel tape recorder and a Sennheiser 804 shotgun microphone, and converted to digital files (32-bit, 96 kHz) by the Cornell Lab of Ornithology in 2013 (Ju et al., 2019). Recordings from 2012 (16-bit, 44.1 kHz) and 2019 (32-bit, 48 kHz) were collected with a Marantz PD661 solid state recorder and a Sennheiser ME66 shotgun microphone. The recordings from 1975 were downsampled to 48 kHz prior to analysis (Luscinia processes both 44.1 kHz and 48 kHz files). In all three years special precautions were taken to avoid recording the same bird twice (Mundinger, 1975; Ju et al., 2019). Each site was visited only once. Within a site, only one individual was recorded within a 160 m radius until they stopped singing or flew away.

Figure 5.2: Locations of recordings from 1975, 2012, and 2019 in western Long Island (Brook- lyn, Queens, and Nassau County).

Song analysis was conducted in Luscinia (http://rflachlan.github.io/Luscinia/), a database and analysis program developed specifically for birdsong. Songs were analyzed with a high pass threshold of 2,000 Hz, a maximum frequency of 9,000 Hz, and 5 dB of noise removal. We excluded 965 songs (26.2%) from the analysis due to high levels of noise CHAPTER 5. CONTENT BIAS IN HOUSE FINCH SONG 78 or overlap with other bird species. Continuous traces with more than 20 ms between them were classified as syllables (Mundinger, 1975). Some syllables were further separable into elements, or traces with less than 20 ms between them. See Figure 5.1 for an example of an analyzed song. Cluster analysis could not be done in Luscinia due to computational limitations, so we exported the raw mean frequency measurements and analyzed them in R. Mean frequency here refers to the mean frequency in each 1 ms bin, so it is a continuous trace across the length of syllables. Mean frequency was log-transformed prior to analysis (Lachlan et al., 2018). First, the normalized distances between all of the syllables were calculated via dynamic time warping (DTW) with a window size of 10 (10% of the average signal length) using the dtwclust package (Sard´a-Espinosa,2019). A window size of 10% of the signal length is commonly used in speech processing research and seems to be a practical upper limit for many applications (Ratanamahatana and Keogh, 2005). Infinite distances (0.19%) caused by comparisons of syllables with extreme signal length differences were assigned the maximum observed distance value. Next, hierarchical clustering and dynamic tree cut were used to cluster the syllables into types (Ju et al., 2019; Roginek, 2018). Hierarchical clustering was conducted with the UPGMA method implemented in the fastcluster package (M¨ullner, 2013), and dynamic tree cut was conducted with the dynamicTreeCut package (Langfelder et al., 2016). For dynamic tree cut, we ran the hybrid algorithm with a minimum cluster size of 1 to maximize the representation of rare syllable types. We chose a deep split value of 3 because it resulted in relatively good cluster validity indices (see Table E.1) and a similar number of syllable types compared with previous studies of house finch song (Ju et al., 2019; Roginek, 2018). In order to control for variation in repertoire coverage, we only included birds from whom we recorded enough songs to cover at least 75% of their syllable repertoire (Ju et al., 2019). To do this, we plotted the cumulative proportion of syllables types captured with each additional song for the 106 individuals with at least 10 recorded songs as per Ju et al. CHAPTER 5. CONTENT BIAS IN HOUSE FINCH SONG 79

(2019) (see Figure E.3). On average, 78.0% of an individual’s repertoire was captured in the first eight songs, so we only calculated repertoire sizes and observed summary statistics from birds with at least eight recorded songs. The agent-based model was adapted from Lachlan et al. (2018) and written in R and C++. For computational efficiency, we only included males in our model. Female house finches rarely sing in the wild (Bitterbaum and Baptista, 1979), and thus are unlikely to act as demonstrators to juvenile birds. The model of cultural transmission is initialized with

NB birds and NS possible syllables. Repertoire sizes are drawn randomly from a normal distribution based on the observed data. Each bird is assigned a demonstrator attractive- ness index (tm), calculated by raising Euler’s number to a random number drawn from a normal distribution with a mean of 0 and a standard deviation of v (so that the indices are non-negative and centered around 1). Each bird is also assigned a random geographic index (g) between 1 and 100, which is used to simulate geographic location. Each syllable

is assigned an attractiveness index (M) of 1 (attractive) or 0.05 (unattractive), where patt is the proportion of all syllables that are attractive. At the start of each year a new generation learns syllables from a pseudorandom set of D demonstrators, where the probability of a bird learning from a demonstrator is the inverse of the absolute difference between their geographic indices (i.e. more geographically distant adults are less likely to act as demon- strators). Dispersal rates in western Long Island can be as high as 91.3% (Youngblood, 2020) and males appear to learn most of their repertoire after dispersal (Bitterbaum and Baptista, 1979), indicating that choosing demonstrators pseudorandomly by geographic location after dispersal is appropriate.

The probability of a bird learning a syllable (Px) depends upon the frequency of that

syllable among the demonstrators (Fx), the attractiveness of the syllable (Mx), and the mean

attractiveness of the demonstrators singing the syllable (Tx). To simulate frequency-based bias, Fx is raised to the exponent α, where α > 1 corresponds to conformity bias and α < CHAPTER 5. CONTENT BIAS IN HOUSE FINCH SONG 80

α 1 corresponds to novelty bias. As such, Px = Fx · Mx · Tx. During the learning process, new syllable types are innovated with probability µ. We use the term innovation to describe the generation of new variants, as is common in both animal and human cultural evolution research (Caldwell et al., 2016; Aplin, 2019), but our model is agnostic about the exact processes involved (copy error, guided variation, etc.). Innovated syllables are drawn from

the initial pool of possible syllables (NS), to reflect the production limitations imposed by species-specific physiological and cognitive constraints (Burnell, 1998; Mann et al., 2020). Mortality is simulated at the end of each year by randomly selecting 50% of the agents to remove (overwinter survival varies from 39-63% in different populations (Badyaev and Martin, 2000; Badyaev et al., 2000, 2008)), and at the beginning of each year new individuals are added to bring the population up to size.

In summary, patt, α, and v correspond to content, frequency, and demonstrator bias, respectively. Smaller values of patt are indicative of stronger content bias, as fewer syllables are attractive to learners. Values of α > 1 are indicative of conformity bias, while values of α < 1 are indicative of novelty bias. And lastly, higher values of v are indicative of of stronger demonstrator bias, as there is more variation in demonstrator attractiveness.

To model changes in population size, we extrapolated from NB using data from the Christmas Bird Count (CBC)1. Annual trends were estimated by calculating the percent change in the average number of house finches encountered per party hour in Brooklyn, Queens, and Nassau County for each year. For example, if the simulated population size in year x was 100 and the CBC indicated a 25% increase between x and x + 1, then the population size in year x + 1 would be 125. These annual trends from the CBC can be seen in Figure E.1. The resulting population estimates were halved in the model, since only males were included. Following an initial burn-in phase, each iteration of the model was run from 1970 to 2019.

1https://www.audubon.org/conservation/science/christmas-bird-count CHAPTER 5. CONTENT BIAS IN HOUSE FINCH SONG 81

A 100 year burn-in phase appears to be sufficient to reach an equilibrium-level of diversity (Figure E.4). In 1975, 2012, and 2019 the following summary statistics were calculated from a random sample of the same size as the corresponding observed sample: (1) the proportion of syllables that only appear once, (2) the proportion of the most common syllable type, (3) the number of syllable types, (4) Simpson’s diversity index, (5) Shannon’s diversity index, (6) Pielou’s evenness index, and (7) the exponent of the fitted power-law function to the progeny distribution (Lachlan et al., 2018; Kandler and Crema, 2019). We chose these summary statistics because they capture diversity with an emphasis on both common and rare variants, as well as the general shape of the frequency distribution. The three sets of seven summary statistics from each simulation were compared with the same summary statistics from the same years in the observed dataset during parameter estimation. Parameter estimation was conducted with the random forest algorithm of approximate Bayesian computation, using the abcrf package in R (Raynal et al., 2019). Random forest is a form of machine learning in which a set of decision trees are trained on bootstrap samples of variables, and used to predict an outcome given certain predictors (Cutler et al., 2012). Traditional approximate Bayesian computation methods function optimally with fewer sum- mary statistics, requiring researchers to reduce the dimensionality of their data (Blum et al., 2013). We chose to use random forest for parameter estimation because it appears to be ro- bust to the number of summary statistics, and does not require the exclusion of potentially informative variables (Pudlo et al., 2015). In addition, it requires fewer iterations of the model compared to other methods, making it much less computationally intensive (Pudlo et al., 2015). Random forest approximate Bayesian computation was conducted with the following steps:

1. 200,000 iterations of the model were run to generate simulated summary statistics for

different values of the parameters: NB, NS, µ, D, patt, α, and v. CHAPTER 5. CONTENT BIAS IN HOUSE FINCH SONG 82

2. The output of these simulations was combined into a reference table with the simu- lated summary statistics as predictor variables, and the parameter values as outcome variables.

3. A random forest of 1,000 regression trees was constructed for each of the seven param- eters using bootstrap samples from the reference table. A maximum tree depth of five was used to avoid overfitting.

4. Each trained forest was provided with the observed summary statistics, and each re- gression tree was used to predict the parameter values that likely generated the data.

The following parameters were assigned log-uniform prior distributions, so that more sampling occurred near the lower bounds: NS = {596–800}, µ = {0.001–0.3}, v = {0.01–6}, and D = {2–10} (modified from Lachlan et al. (2018)). Increased sampling near the lower bounds improves the resolution of posterior distributions for parameters that are more likely to have low or moderate values (Lachlan et al., 2018). The lower bound of NS is set to 596 which is the number of syllables identified in our dataset. The following parameters were assigned uniform prior distributions, so that sampling was unbiased across the entire

range: NB = {2,000–10,000} patt = {0.01–1} and α = {0.25–3} (modified from Lachlan et al. (2018)). Assuming an average population density of 3.8 birds/km2 (Veit and Lewis, 1996; Lewis, 1997), our estimate for the average population size of western Long Island (1,203.4 km2) is 4,573. We chose a wide uniform prior distribution for initial population size that includes this estimate but allows for significant variation over time. To assess whether syllables with particular characteristics had higher persistence, we con- ducted Bayesian logistic regression with all of the syllable types detected in 1975 using the rstanarm package (Goodrich et al., 2020). We started out with the following predictor vari- ables used by Ju et al. (2019): average frequency (Hz), minimum frequency (Hz), maximum frequency (Hz), bandwidth (Hz), duration (ms), concavity (changes in the sign of the slope CHAPTER 5. CONTENT BIAS IN HOUSE FINCH SONG 83

Description Prior Type NB Initial population size {2,000–10,000} Uniform NS Total number of syllables {596–800} Log-uniform µ Innovation rate {0.001–0.3} Log-uniform D Number of demonstrators {2–10} Log-uniform patt Proportion of attractive syllables {0.01–1} Uniform α Level of conformity bias {0.25–3} Uniform v Variation in demonstrator attractiveness {0.01–6} Log-uniform

Table 5.1: The description and prior distribution for each dynamic parameter in the agent- based model.

of the mean frequency trace per ms), and excursion (cumulative absolute change in Hz per ms). Concavity and excursion are both indicators of syllable complexity (Ju et al., 2019; Podos et al., 2016). Concavity was calculated after smoothing the mean frequency trace using a polynomial spline with a smoothing parameter of 5 (chosen by visual inspection). The predictor variables were averaged across all of the observations of each syllable type. Maximum frequency, average frequency, and bandwidth were left out of the modeling due to multicollinearity issues (V IF > 4). Whether or not the syllable type persisted to 2019 (logical: T/F) was used as the outcome variable. We used Student’s t-distributions with a scale of 2.5 as priors to allow for a relatively wide range of parameter estimates. The agent-based model used in this study is available as a new R package called Transmis- sionBias on Github (https://github.com/masonyoungblood/TransmissionBias). The “example” subfolder of the R package includes our processed syllable data and analysis scripts. The unprocessed song recordings are available upon request.

5.3 Results

The dynamic tree cut identified 596 syllables types across the 2,724 songs from 331 individ- uals that we analyzed. After applying the eight song threshold to ensure that we had at least 75% of individuals’ syllable repertoires, we calculated the observed summary statistics CHAPTER 5. CONTENT BIAS IN HOUSE FINCH SONG 84

from 36, 75, and 29 individuals from 1975, 2012, and 2019, respectively (Table E.2). The observed mean repertoire size of 61.16 (SD = 17.55) was used for all of our simulations.

Figure 5.3: The prior (dotted lines) and posterior (solid lines) distributions for the three dynamic parameters in the agent-based model related to transmission biases. patt, α, and v correspond to content, frequency, and demonstrator bias, respectively.

The posterior distributions for patt, α, and v can be seen in Figure 5.3 and Table 5.2.

The median estimate for patt was 0.21, with a 95% CI that falls far short of 1. This indicates that the cultural changes in our dataset are consistent with strong content bias. The median estimate for α was 0.80 which is consistent with moderate novelty bias, but as the posterior distribution peaks at 1 and the 95% CI overlaps 1 we cannot reject the null hypothesis. The 95% CI for v encompasses almost all of the prior distribution, and the out-of-bag error for the random forest is orders of magnitude higher than the other parameters. This indicates that demonstrator bias is either not present or does not influence cultural changes at the population-level. The 10 most important summary statistics used by the random forests to estimate each parameter, identified using the Gini impurity method, can be seen in Table E.3. The posterior distributions for the other parameters can be seen in Figure E.5 and Table

5.2. The 95% CIs for NB and D are nearly equivalent to the prior distributions, indicating that neither has a significant effect on the cultural changes in our dataset. NS had a median estimate of 635 and a broad 95% CI that excludes values above 777. This means that, at the CHAPTER 5. CONTENT BIAS IN HOUSE FINCH SONG 85

M 95% CI NMAE NB 5830 [2174, 9789] 0.46 NS 635 [598, 777] 0.045 µ 0.0067 [0.0011, 0.17] 1.55 D 4 [2, 10] 0.51 patt 0.21 [0.023, 0.89] 0.64 α 0.80 [0.28, 1.06] 0.35 v 0.73 [0.012, 5.64] 16.55

Table 5.2: The median, 95% credible interval, and out-of-bag normalized mean absolute error of the posterior distribution for each dynamic parameter in the agent-based model.

current resolution of the hierarchical clustering, we captured at least 76.71% of the possible syllable types in the population. Lastly, the median estimate for µ is 0.0067 with a 95% CI that excludes values above 0.17, suggesting that house finches culturally transmit song with high fidelity like swamp sparrows (Lachlan et al., 2018). This result should be interpreted with caution, as µ may be underestimated by the log-uniform prior distribution, and the out-of-bag error is relatively high compared to other parameters. When both α and v are near neutrality (0.99 < α < 1.01, v < 0.02), there are significant

positive correlations between patt and the proportion of syllables that only appear once (R = 0.17, p < 0.05), the number of syllable types (R = 0.84, p < 0.0001), Simpson’s diversity index (R = 0.69, p < 0.0001), Shannon’s diversity index (R = 0.86, p < 0.0001), Pielou’s evenness index (R = 0.52, p < 0.0001), and the exponent of the fitted power-law

function (R = 0.22, p < 0.05), and there is a significant negative correlation between patt and the proportion of the most common syllable type (R = −0.69, p < 0.001). This indicates

that stronger content bias (lower value of patt) reduces cultural diversity in favor of attractive syllable types independently of other factors, which may explain why we were unable to get strong estimates for parameters like population size and the number of demonstrators. The results of the Bayesian logistic regression are shown in Table 5.3. The persistence of syllable types from 1975 to 2019 is positively predicted by both minimum frequency and CHAPTER 5. CONTENT BIAS IN HOUSE FINCH SONG 86

M 95% CI SD Min frequency (Hz) 0.54 [0.20, 0.93] 0.18 Duration (ms) -1.34 [-1.75, -0.97] 0.21 Concavity 0.56 [0.27, 0.89] 0.16 Excursion -0.26 [-0.53, 0.02] 0.14

Table 5.3: The median, 95% credible interval, and standard deviation for each parameter included in the Bayesian logistic regression. Concavity is the number of changes in the sign of the slope of the mean frequency trace per ms, and excursion is the cumulative absolute change in Hz per ms. 95% credible intervals that do not overlap with 0 indicate a statistically significant effect. concavity, and is negatively predicted by duration. Excursion does not significantly predict syllable persistence.

5.4 Discussion

By applying simulations and machine learning to three years of birdsongs spanning four decades, we have provided evidence that content bias plays a central role in the cultural evolution of house finch song in western Long Island. Syllables with higher concavity, or syllables with more transitions in slope, were more likely to persist between 1975 and 2019. As such, we hypothesize that the strong content bias observed in this study is targeted towards syllable complexity. This is roughly consistent with previous research that found that syllables with higher excursion (another indicator of complexity) were more prevalent in individuals’ repertoires in 2012 (Ju et al., 2019), although this variable was not significant in our model. In house finches, courtship songs are longer and contain longer syllables than solo songs (Ciaburri and Williams, 2019), and males that sing longer songs at a faster rate are more attractive (Nolan and Hill, 2004) and have higher reproductive performance (Mennill et al., 2006). If female preferences for elaborate and energetically-costly songs extend to the syllable level, then males with a content bias for syllable complexity will have higher reproductive success. CHAPTER 5. CONTENT BIAS IN HOUSE FINCH SONG 87

Additionally, we estimated that house finch song is transmitted with a median fidelity of 99.99% and a lower bound (5% quantile) of 83%. This is consistent with the high fidelity of swamp sparrow song learning estimated by Lachlan et al. (2018). Future studies should investigate whether departures from perfect fidelity in house finches are directional and maximize vocal performance (e.g. increased complexity). For example, swamp sparrows accurately copy attractive tutors and speed up the trill rates of slower tutors (Lahti et al., 2011). House finches shorten and increase the frequency of canary trills learned in captivity (Mann et al., 2020) and may also modify conspecific syllables in non-random ways. Interestingly, we did not find evidence for conformity bias, which appears to be relatively common in birds (Jenkins, 1977; Slater and Ince, 1979; Aplin, 2019) and was recently identi- fied in swamp sparrows using similar methods (Lachlan et al., 2018). A previous study that found that female house finches prefer local over foreign songs, which is often associated with selection for conformity bias in males (O’Loghlen and Rothstein, 1995), tested birds from the introduced range (Ontario) with songs from the native range (Arizona) (Hernandez and MacDougall-Shackleton, 2004). Preference differences between such distant and ecologically distinctive locations may be more reflective of morphological divergence (e.g. traits like bill depth (Badyaev et al., 2008; Giraudeau et al., 2014)) than regional song variation relevant for mating. Additionally, female preferences for local song in the same study appeared to be genetically inherited rather than learned (Hernandez and MacDougall-Shackleton, 2004). If song preferences in both females and males are independent of early experience then frequency and demonstrator biases are less plausible in house finches. We also found that syllables with a higher minimum frequency and a lower duration were more likely to persist between 1975 and 2019. A similar effect of minimum frequency on syllable persistence was observed by Ju et al. (2019), who attributed it to the effect that low-frequency urban noise has on birdsong. If syllables with lower minimum frequencies are more likely to be masked by urban noise or less likely to be sung by males in noisy conditions, CHAPTER 5. CONTENT BIAS IN HOUSE FINCH SONG 88 then they will be less likely to persist into the next generation (Luther and Baptista, 2010). Urban noise has been associated with increased minimum frequency in other house finch populations (Fern´andez-Juricicet al., 2005), and appears to be a plastic behavioral change on the part of males (Berm´udez-Cuamatzinet al., 2009, 2011). The negative effect of duration on syllable persistence is more puzzling, as male house finches in noisy environments shorten some song types and extend some syllable types (Berm´udez-Cuamatzinet al., 2011). It is also unlikely to be due to content bias, as males appear to sing longer syllables in courtship songs (Ciaburri and Williams, 2019). One possibility is that short syllables resulting from homoplasy are less distinctive and more likely to be lumped together during hierarchical clustering, causing their persistence to be overestimated by the logistic regression. There are several limitations to this study that should be highlighted. Firstly, our model is largely agnostic about guided variation and other internal transformative processes. Setting a maximum number of possible syllables to reflect physiological constraints may have indirectly captured some of these processes, but researchers applying these methods in more complex and domain-general cultural systems should carefully consider this issue. Secondly, our clustering process had a subjective component, in that we chose parameter values that yielded results consistent with previous studies. Since hierarchical clustering leads to a nested structure different parameter values for the tree cut just change the scale of the data, and the transmission biases we modeled should lead to similar population-level patterns regardless of scale. Finally, physiological and cognitive constraints likely limit the phonological space in which house finches can innovate, leading to widespread homoplasy over long periods of time (Roginek, 2018) that makes it difficult to reconstruct the evolutionary history of syllable types (Ju et al., 2019). While we explicitly accounted for homoplasy in our simulations by pulling innovated syllables from a limited number of possible types, we cannot rule out the fact that it may have influenced the results of our logistic regression. The importance of cultural traits in conservation is increasingly recognized among sci- CHAPTER 5. CONTENT BIAS IN HOUSE FINCH SONG 89 entists (Brakes et al., 2019; Caro and Sherman, 2012; Keith and Bull, 2017; Laiolo and Jovani, 2007; Whitehead, 2010) and policy-makers (United Nations Convention on Migra- tory Species, 2017). For songbirds that culturally transmit mating signals, researchers are concerned about the negative effects of population decline and habitat fragmentation on song diversity (Holland et al., 1996; Briefer et al., 2010; Hill and Pawley, 2019; Laiolo and Tella, 2005, 2007; Ortega et al., 2014; Pang-Ching et al., 2018; Paxton et al., 2019; Petruskov´a et al., 2010; Sebasti´an-Gonz´alezand Hart, 2017). Reduced song diversity has been linked to decreased population growth (Laiolo et al., 2008), leading some researchers to suggest that song diversity may be a useful assessment tool for threatened populations of songbirds and cetaceans (Laiolo, 2008; P´erez-Granadoset al., 2016; Garland et al., 2015; Rivera-Gutierrez et al., 2010). In order to use song diversity as a metric for population viability, researchers need to better understand how population diversity results from individual-level processes such as innovation and biased transmission (Greggor et al., 2017). An over-reliance on conformity bias, for example, could slow recovery by reducing effective variation, whereas novelty bias could quickly spread innovated syllables through the population (Barrett et al., 2019). If researchers using song diversity for population assessment assume that cultural transmission is neutral, which is unlikely to be realistic in most cases (Barrett et al., 2019), they may underestimate long-term population viability. Future studies should use our agent- based modeling framework to better understand how cultural transmission and population declines influence song diversity in wild populations. Chapter 6

A Raspberry Pi-based, RFID-equipped birdfeeder for the remote monitoring of wild bird populations

6.1 Introduction

Biologging, or the use of small electronic devices to track changes in an animal’s physiology, environment, or behaviour, allows researchers to passively collect large amounts of data without direct observation (Rutz and Hays, 2009; Chmura et al., 2018). One of the most popular forms of biologging in ornithological research is radio frequency identification (RFID) (Williams et al., 2019), a wireless technology that allows researchers to monitor wild bird populations at fixed locations in the field (Bonter and Bridge, 2011; Ponchon et al., 2013). Birds are tagged with leg bands containing passive integrated transponder (PIT) tags that transmit a unique code when activated by close proximity to an antenna connected to an

90 CHAPTER 6. RFID-EQUIPPED BIRDFEEDER 91

RFID reader mounted inside a birdfeeder. The lack of an internal power source means that PIT tags are small enough to have negligible effects on survival (Adelman et al., 2015). In addition, birds with PIT tags do not have to be recaptured and continuously handled to collect data, making this method much less invasive than traditional capture-recapture. Based on visitation patterns at RFID-equipped birdfeeders researchers can collect basic information on movement and foraging ecology (Williams et al., 2019), and even infer social interactions to reconstruct population-level behavioural patterns such as social networks and dominance hierarchies (Adelman et al., 2015). In the past, the high cost of commercial RFID readers has been a financial barrier to their use in research. Previous DIY readers are significantly cheaper, but typically require a background in circuit-building (Bridge and Bonter, 2011; Ibarra et al., 2015; Zarybnicka et al., 2016; Bridge et al., 2019). I have developed a more user-friendly RFID-equipped birdfeeder 6.1 based on the Raspberry Pi Zero W that requires only basic soldering and coding skills to assemble. Each birdfeeder has a perch antenna connected to an RFID reader board on a Raspberry Pi powered by a portable battery. When a tagged bird lands on the perch to eat from the feeder, its unique code is stored with the date and time on the Raspberry Pi. All collected data are then backed-up to the cloud using a personal hotspot. The Raspberry Pi Zero W1 ($10) is an ideal single-board computer for remote monitoring setups because of its low power consumption and built-in Wi-Fi connectivity. The total cost of the electrical components for this setup, including a battery and antenna, is $85. Although this is slightly more expensive than the most affordable DIY models, which are about $30 (excluding the battery) (Bridge and Bonter, 2011; Bridge et al., 2019), it provides more flexibility in data management and can be easily outfitted with additional hardware like cameras (Jolles et al., 2018; Alarc´on-Nietoet al., 2018), microphones (Myers et al., 2019; Whytock and Christie, 2017), and environmental sensors (Grindstaff et al., 2019; Sethi et al.,

1https://www.raspberrypi.org/products/raspberry-pi-zero-w/ CHAPTER 6. RFID-EQUIPPED BIRDFEEDER 92

2018). To my knowledge, the only other applications of the Raspberry Pi in RFID-based research used more expensive models with significantly higher power consumption (Lendvai et al., 2015; Meniri et al., 2019). Below, I outline the process of assembling the hardware and setting up the operating system for the birdfeeders. Then, I describe an example implementation of the birdfeeders to track house finches (Haemorhous mexicanus) on the campus of Queens College in New York City, and highlight several minor technical issues that I encountered using this system.

Figure 6.1: One of the RFID-equipped birdfeeders hanging in a tree on the Queens College campus. CHAPTER 6. RFID-EQUIPPED BIRDFEEDER 93 6.2 Hardware

Each Raspberry Pi will require a microSD card for the operating system and internal storage. I also recommend that you purchase a mini-HDMI to HDMI adaptor and micro-USB to USB adaptor to setup the system with a monitor and keyboard. Before the computer can be used with the RFID reader, a standard 40-pin GPIO header needs to be soldered to the board. Alternatively, you can spend slightly more for the Raspberry Pi Zero WH ($14), which comes with the GPIO header pre-soldered. This system utilizes the 125 kHz RFID reader available from CognIoT2 ($46). This reader connects to the GPIO pins next to the microSD port (1-36) with the chip overlapping the body of the computer (Figure 6.2). The reader requires input from an antenna tuned to 770 µH. Please note that the production line for these readers is limited. Orders of more than 20 units may require negotiation with seller.

Figure 6.2: The RFID reader attached to the Raspberry Pi.

For power I utilized the Poweradd Pilot X73 ($26), a 20,000 mAh power bank that can run the setup continuously in the field for four days. These take around 10 hours to charge, so if you require constant data collection it may be a good idea to have two batteries for

2https://www.tindie.com/products/CognIoT/125khz-rfid-reader-for-raspberry-pi/ 3https://amzn.to/2OndvHj CHAPTER 6. RFID-EQUIPPED BIRDFEEDER 94

each birdfeeder that you can cycle through. I recommend ordering EM4102 passive integrated transponder (PIT) tags from Eccel Technology, as they come pre-embedded in leg bands of various sizes4. Please ensure that you know the band size required for your species, and have the appropriate permits and ethical approval, before placing an order. I use the 2.6 mm bands for house finches. Building the perch antennas is the most technically challenging aspect of this setup (Figure 6.3). Each antenna consists of 30 American wire gauge (8.25 mm diameter) enameled magnet wire5 wrapped around a 40 x 6 mm ferrite rod6. Wrapping the wire around the rod is done by placing one end of the rod in a power drill, anchoring the wire on the other end of the rod with tape, and slowly rotating the rod with the drill. In order to achieve the required inductance of 770 µH (+/- 50 µH), I have found that doing four overlapping wraps between 5 mm and 15 mm from one end (range of 10 mm) works well. I recommend having an inductance meter handy for troubleshooting7. Once wrapping is complete, cover each end with a 6 mm rubber tip8 and the entire rod in 8 mm polyolefin heat shrink tubing9. Then, the two loose wires can be lightly stripped and soldered to 2 ft long leads10. These leads should be attached to the three position 1.5 mm connector11 by crimping each wire to the 24-26 AWG terminals12 and inserting them into the left-most and right-most positions. The last step is to waterproof the entire antenna using rubber coating spray (i.e. Plasti Dip). If you have already completed the software and firmware steps below, then you can test the antenna by plugging it into the white connector on the CognIoT board. If it is correctly

4https://eccel.co.uk/product-category/avian-products/bird-tags/ 5http://bit.ly/2AQ8xzc 6http://bit.ly/2QmIaGw 7http://bit.ly/2DqgT2S 8https://amzn.to/2DotDHl 9https://amzn.to/2D7Fh8Q 10http://bit.ly/2D83z2b 11http://bit.ly/2SRmcNj 12http://bit.ly/2PFT1ya CHAPTER 6. RFID-EQUIPPED BIRDFEEDER 95

tuned then the red light should turn green when you present the antenna with a PIT tag. I have found that this antenna design has the highest read range (∼2 cm) on the unwrapped end of the antenna. In total, the materials required for 50 antennas (including an inductance meter and rubber coating spray) cost $105 before shipping. All hardware components of the setup can be seen in Figure 6.3.

Figure 6.3: The completed antenna and power supply connected to the computer and RFID reader.

6.3 Software

The computers in each birdfeeder run a modified Linux operating system with scripts allow- ing them to continuously collect and store RFID data, and upload it to cloud storage after connecting to a personal hotspot generated by a smartphone. These scripts run automati- cally when the feeder is connected to a power supply, so no commands need to be run in the field. Following are the steps needed to set up the operating system. First, download the most recent version of Raspbian Stretch Lite13 and follow the in-

13https://www.raspberrypi.org/downloads/raspbian/ CHAPTER 6. RFID-EQUIPPED BIRDFEEDER 96 structions to write the image to a microSD card14. Once the operating system is installed, boot up the computer and log in using the default username (pi) and password (raspberry). Open the configuration menu:

sudo raspi −c o n f i g

Enter “Network Options” → “Wi-fi” to setup the internet connection. Then enter “Inter- facing Options” → “Serial”, and disable the login shell from using the serial port. Make sure that the serial port hardware is left enabled. Lastly, enable automatic login to the console by entering “Boot Options”. The computer may require you to reboot before continuing. Next, make sure that the prerequisite software for the RFID reader is installed:

sudo apt−get install python−dev python−setuptools python−pip

sudo apt−get install powertop

sudo pip install wiringpi sudo pip install numpy

Install the example software for the RFID reader:

git clone https://github.com/CognIot/RFID 125kHz

Import “rfid reader.py” and “schedule.sh” from the data repository (https://doi.org/ 10.7910/DVN/XAIRNM) into /home/pi, using secure copy or file transfer protocol. “rfid reader.py” is a modified version of a CognIot script15 that interfaces with and controls the RFID reader. If you need to change the default polling delay of the RFID reader (262 ms), edit the Set- PollingDelay function in “rfid reader.py” according to CognIot’s documentation. “sched- ule.sh” stores the collected data into files compatible with feedr, an R package for managing and visualizing data from RFID-equipped birdfeeders (Lazerte et al., 2017). Make both files executable: 14https://www.raspberrypi.org/documentation/installation/installing-images/ 15https://github.com/CognIot/RFID_125kHz CHAPTER 6. RFID-EQUIPPED BIRDFEEDER 97

sudo chmod +x rfid r e a d e r . py sudo chmod +x schedule.sh

Remember to modify the “schedule.sh” script to reflect the name, latitude, and longitude of the site where the birdfeeder will be deployed. To edit the file, you can either open it in the command line with sudo nano ‘schedule.sh’ or use a text editor in the GUI. Next, “schedule.sh” needs to run automatically on startup so that a new data file is generated every time the battery is changed. It is also a good idea to run powertop to save power. Open the “rc.local” file located in /etc. Erase the lines for printing the IP address, and add the following lines above “exit 0” so that they run during startup:

#use powertop auto−tune to save power sudo powertop −−auto−tune &

#run schedule script /home/pi/schedule.sh &

Make sure that ‘rc.local’ is executable:

sudo chmod +x /etc/rc. local

Create a folder in /home/pi for the RFID data:

mkrdir r f i d l o g s

If you plan on backing up collected RFID data, install rclone and use the standard configuration for your cloud storage service of choice16. If you plan on using a cloud storage service that requires web authentication then you need to run the configuration via SSH from a computer with a GUI:

curl https://rclone.org/install.sh | sudo bash rclone config

16https://rclone.org/docs/ CHAPTER 6. RFID-EQUIPPED BIRDFEEDER 98

There are several ways to handle backups. If the birdfeeders will be in locations with dependable Wi-Fi connections, then you can set up hourly backups via cron. Alternatively, you can use software such as Linux Reader or extFS to copy the files directly from the SD card of the Raspberry Pi. For my situation, I decided to have the birdfeeder automatically upload the RFID data to cloud storage whenever it connects to a personal hotspot generated by a smartphone. The first step is to install and configure an appropriate network manager:

sudo apt−get install wicd wicd−c u r s e s

sudo systemctl stop dhcpcd sudo systemctl disable dhcpcd sudo systemctl start wicd.service sudo systemctl enable wicd. service sudo gpasswd −a pi netdev

Next, open up the text-based interface for the manager to prioritize and automatically connect to the personal hotspot, and disable automatic connection to any networks that might interfere (e.g. institutional connections requiring login):

wicd−c u r s e s

Import “drive backup.sh” from the data repository into /etc/wicd/scripts/postconnect using secure copy or file transfer protocol, remove the “.sh”file extension, and make it exe- cutable:

sudo chmod +x /etc/wicd/scripts/postconnect/drive backup

“drive backup” uploads everything in /home/pi/rfid logs to /rfid logs/site name, a di- rectory that you need to create in your cloud storage. Remember to change “site name” to the name of the site where the birdfeeder will be deployed. Lastly, before the RFID reader can be used with the computer it must be set up to read EM4102 tags. Until this step is completed the reader will not register correctly constructed CHAPTER 6. RFID-EQUIPPED BIRDFEEDER 99

antennas. Run the setup script and follow the relevant menu options:

python RFID 125kHz/python/RFIDReader.py

6.4 Example

In the spring and summer of 2018, we banded 138 house finches with PIT tags (immature: 92.0%; adult female: 3.6%; adult male: 4.4%). Due to high levels of dispersal (∼91.3% based on return rate), we banded an additional 47 house finches in the spring and summer of 2019 (immature: 63.8%; adult female: 8.5%; adult male: 27.7%). All birds were captured using perch traps17, and adult female and immature birds were distinguished based on the color of their secondary covert feathers (Hill, 2002). The capture and banding protocol was approved by the Institutional Animal Care and Use Committee of Queens College (Protocol #179), and conducted with a federal permit from the United States Department of the Interior (Permit #23708). Five RFID-equipped birdfeeders were deployed on the Queens College campus from Oc- tober 31, 2018 to August 1, 2019. Every four days, we visited each birdfeeder to replace the food and batteries and reboot them with a personal hotspot in range. Over the course of the field season we collected 6,878 visits from 28 individuals (12 banded in 2018, and 16 banded in 2019), after accounting for sequential reads within visits. The total time in minutes that individuals spent at each of the five feeders, as well as the number of movements between feeders, can be seen in Figure 6.4. Birds began to visit the feeders regularly towards the end of March, and activity peaked in early May (left panel of Figure 6.5). Foraging activity was relatively constant throughout the day with the highest levels occurring around dusk (right panel of Figure 6.5), consistent with the findings of Bonter et al. (Bonter et al., 2013). Once the raw visitation data has been imported into R using the package feedr (Lazerte

17http://www.thirdwheel.biz/perch-traps.html CHAPTER 6. RFID-EQUIPPED BIRDFEEDER 100

Figure 6.4: The locations of the five birdfeeders on the Queens College campus, the total number of minutes birds spent at each location (size and color of each point), and the total number of times birds moved between locations (thickness and color of each line). Plotted using feedr (Lazerte et al., 2017).

Figure 6.5: (A) The number of feeder visits (y-axis) that occurred on each day of the study period. (B) The total number of feeder visits (y-axis) that occurred during each hour across the entire study period. et al., 2017), it can be used for several different analyses. Here I demonstrate two examples: (1) the estimation of the social network of the population, and (2) the estimation of the CHAPTER 6. RFID-EQUIPPED BIRDFEEDER 101 dominance hierarchy of the population. For species such as house finches that congregate around birdfeeders, co-occurrences of individuals during foraging bouts can be used to estimate the social network structure of a population (Adelman et al., 2015). Firstly, adjacent feeder visits by different birds are transformed into association data using machine learning with a Gaussian mixture model (Psorakis et al., 2015), implemented in the R package asnipe (Farine, 2013). This association data is then used to reconstruct a weighted social network of the population, where nodes are individual birds and links represent interaction rates between them. The estimated social network of the population can be seen in the left panel of Figure 6.6. In this case, the 25 birds represent a subsample of the entire population. Although partial networks often closely reflect the structure of full networks (Silk et al., 2015), any social network constructed from a subsample should be interpreted with caution. House finches have linear dominance hierarchies in which females are typically domi- nant to males (Belthoff and Gauthreaux, 1991; Thompson, 1960c), and aggressive interac- tions between individuals often occur at food sources when one individual displaces another (Thompson, 1960b). As such, investigating dominance using displacement data is a well- established method in house finches (McGraw and Hill, 2002; Moyers et al., 2018) and other species (Miller et al., 2017; Evans et al., 2018). Firstly, displacement events, in which one bird displaces another at the feeder within two seconds, are extracted from the raw visita- tion data. This pairwise interaction data is then analyzed using the aniDom package in R (Farine and Sanchez-Tojar, 2019). Birds who consistently displace other individuals at the feeders are coded as dominant to those individuals. The estimated dominance hierarchy of the population can be seen in the right panel of Figure 6.6. During the field season we periodically experienced three minor technical issues. Firstly, cold weather and high humidity can cause the batteries to shut down prematurely. I highly recommend storing the components in a waterproof box with silica gel packets to avoid CHAPTER 6. RFID-EQUIPPED BIRDFEEDER 102

Figure 6.6: (A) The estimated social network of the 25 birds for which we had association data. (B) The dominance ranks (y-axis) of the 12 birds for which displacement interactions were available. Plotted with aniDom (Farine and Sanchez-Tojar, 2019). moisture build-up. If temperatures are approaching freezing or rain is expected, it is a good idea to check the batteries daily rather than every four days. Next, sometimes the configuration file for rclone is cleared when the backup script runs without the personal hotspot. If files are not appearing in the cloud backup, a quick reconfiguration of rclone should solve the issue. Lastly, if the computers reboot without connecting to a personal hotspot the system time will not be accurate. Ensure that the hotspot is on and within range whenever the computers are rebooted. For connection confirmation, I recommend creating a blank text file called ‘test’ in /home/pi/rfid logs on each unit. If you delete this file from the online backup and it reappears upon reboot then the computer successfully connected to the hotspot. Alternatively, you can use a smartphone app that allows you to view which devices are currently connected to your hotspot. Chapter 7

Phylogenetic reconstruction of the cultural evolution of electronic music via dynamic community detection (1975–1999)

7.1 Introduction

Historically, researchers have relied on phylogenetic comparative methods from biology to create cultural phylogenies, or “trees” of culture. These methods use differences and similari- ties in the cultural products of different populations (analogous to differences and similarities in DNA) to reconstruct the historical relationships between them. Traditional phylogenetic methods, which assume the tree-like structure typical of genetic evolution, have yielded critical insights, particularly in linguistics (Pagel et al., 2007; Levinson and Gray, 2012; Bouckaert et al., 2012). However, researchers have debated whether cultural phylogenies are fundamentally tree-like, or whether high levels of horizontal transmission lead to a more

103 CHAPTER 7. ELECTRONIC MUSIC PHYLOGENY 104 reticulated structure (Boyd et al., 1997; Borgerhoff Mulder et al., 2006; Gray et al., 2010; Rivero, 2016; Cabrera, 2017). For example, biologist Stephen Jay Gould wrote that:

“Biological evolution is a bad analogue for cultural change [...] Biological evolu- tion is a system of constant divergence without subsequent joining of branches. Lineages, once distinct, are separate forever. In human history, transmission across lineages is, perhaps, the major source of cultural change.” – Gould (1991)

Certainly, this is an oversimplification. Horizontal transmission also frequently occurs in biology and sometimes provides problems for the practice of classification (Doolittle, 2009). Horizontal transmission is for instance the primary mechanism for the spread of antibiotic resistance in bacteria (Gyles and Boerlin, 2014), but it also occurs in vertebrates, inverte- brates, and plants (Crisp et al., 2015; Bock, 2010). Nevertheless, Gould had a fair point. Although cultural traits evolve through a process of variation, selection and reproduction that results in observable fissions of cultural lineages, branches of the “tree of culture” can reunify later on and frequently do so, for instance in the case of shared practices or customs. That being said, there is substantial variation in reticulation across cultural domains (Gray et al., 2010; Cabrera, 2017), and depending on the conditions (e.g. co-inheritance of traits) horizontal transmission may or may not interfere with traditional phylogenetic reconstruction (Nunn et al., 2006, 2010; Currie et al., 2010). In language evolution, for example, horizontal transmission is lower and phylogenetic relationships can be reliably reconstructed (Greenhill et al., 2009). In other domains, such as material culture, rates of horizontal transmission can be higher and more variable (Tehrani and Collard, 2002; Jordan and Shennan, 2003; Cochrane and Lipo, 2010), leading to phylogenies that clearly contradict the historical record (T¨emkinand Eldredge, 2007). For contemporary culture in the digital age (Acerbi, 2020), where rapid within-generational changes are the norm (Youngblood, 2019b,a), the negative effects of horizontal transmission on phylogenetic signal are likely CHAPTER 7. ELECTRONIC MUSIC PHYLOGENY 105 to be even more extreme. More recent advancements in network-based phylogenetics allow researchers to estimate reticulation (Gray et al., 2010; Heggarty et al., 2010; Howe and Windram, 2011; Rivero, 2016; Mesoudi, 2017b; Bouckaert et al., 2019), but these methods are typically unrooted (Tehrani, 2013; Morrison, 2014b,a) (i.e. cannot be used to infer chronology (Dunn, 2004)) and thus remain complementary to traditional phylogenetic reconstruction (Tehrani and D’Huy, 2017). Additionally, outside of linguistics it can be extremely challenging to characterize com- plex cultural traits in a manner suitable for phylogenetic analysis (Howe and Windram, 2011; T¨emkin,2016). In practice, this means that cultural phylogenies are often limited to very specific domains with variation that can be more easily characterized. For example, re- searchers have begun applying evolutionary methods to the domain of music (Savage, 2019), but application of phylogenetic methods has been restricted to traditional rhythmic patterns (Toussaint, 2003; D´ıaz-B´a˜nezet al., 2004), individual instruments (T¨emkinand Eldredge, 2007), the works of a single composer (Liebman et al., 2012; Windram et al., 2014), specific variants of folk song melodies (Savage et al., 2020a), or folk music within a single region (Le Bomin et al., 2016). Given these limitations, it would be valuable to be able to construct large-scale phylo- genies for complex cultural traits while explicitly accounting for and measuring horizontal transmission (Nunn et al., 2006). If phylogenies represent changes in population structure over time (Tehrani et al., 2010; Velasco, 2013; Duda and Zrzav´y,2016), then one way for- ward might be to assess population structure from the bottom-up. This is, of course, a complicated proposition. Population structure in biology is typically determined by genetic variation driven by the combined effects of evolutionary processes such as recombination, mutation, genetic drift, demographic history, and natural selection. Unfortunately, popula- tion concepts are hardly discussed in the cultural evolutionary literature (for an exception see Scapoli et al. (2005)). CHAPTER 7. ELECTRONIC MUSIC PHYLOGENY 106

According to philosopher of biology Roberta Millstein’s definition of a (biological) pop- ulation, the “boundaries of a population are those groupings where the rates of interactions are much higher within than without” (Millstein, 2010). She calls this definition the “causal interactionist population concept”, or CIPC for short. If agents are represented as nodes in a network graph, where links represent interactions relevant for reproduction (e.g. mating, transferring information), then populations are groups of agents that interact significantly more with one another than with other agents. In general, the rates of interactions between agents are lower between populations than within them, and this feature is precisely what gives them their (sometimes fuzzy) boundaries. According to this interpretation, populations represent a specific kind of “nearly decomposable system” (Simon, 1962) in that they are emergent properties of the behavior of semi-independent agents. Naturally, agents in these kinds of nearly decomposable systems will be hierarchically organized on different levels, for example into communities, populations, and metapopulations. It has recently been suggested to use this “inner interactive connectivity”, i.e. cohesion in cultural population structure, as the population defining criterion in cultural evolution (Baraghith, 2020). Specifically, Baraghith (2020) formalized the CIPC by using graph the- ory to calculate a cohesion index (CI), or the ratio of a population candidate’s internal to external connections, that can be maximized to identify populations in a larger network of agents. The CI is a straightforward and effective indicator for populations in static networks, but identifying evolving populations requires a community detection method designed for dynamic networks. The TILES algorithm, designed by Rossetti et al. (2017), is a good can- didate for dynamic community detection in cultural evolution. TILES is an online algorithm in that it works with an “interaction stream” of nodes and links. In other words, individuals enter and exit communities as they form and break relationships with other individuals. A node is considered to be a “core” community member if it forms a triangle with other community members, and a “peripheral” member if it is one link away from a core node. CHAPTER 7. ELECTRONIC MUSIC PHYLOGENY 107

Community composition is recomputed throughout this process, each time that a new link enters or exits the network. Snapshots of the community composition are then collected at a regular time interval. Tests with simulated networks indicate that TILES outperforms other community detection algorithms in both processing time and accuracy, and is able to identify events that are critical for evolutionary research (e.g. splitting, death) (Rossetti et al., 2017). Most importantly, the TILES algorithm closely resembles the formalization of the CIPC (Baraghith, 2020) in that it is based on the relative density of links within and between communities and allows for overlap on the periphery. In the following, we propose a new method for phylogenetic reconstruction based on the CIPC that uses the TILES dynamic community detection algorithm to identify distinct populations and track how they change over time. We will treat different groups under investigation as populations or metapopulations, rather then as species. Deciding whether a member belongs to a specific community in this framework is not a question of whether she shares as many relevant hereditary traits as possible (like “genes” in biological evolution) with the other members of the grouping. Instead, what counts is the relative number of social interactions within (and outside) the community. More specifically, we are using genres of electronic music as a test case. Although mu- sicologists recognize that genres are generated by evolving communities of artists (Lena, 2012; Klement and Strambach, 2019), previous attempts to quantitatively map genres have depended on listener habits (Lambiotte and Ausloos, 2005), instrument similarity (Percino et al., 2014), or sub-genre tags on streaming platforms (Mauch et al., 2015). By using artist co-release data (who collaborates with who) we can explicitly track how populations of artists, and the genres that they correspond to, evolve over time. This approach is sim- ilar to qualitative attempts at reconstructing music trees that rely on historical accounts of how artist communities grow, diverge, and influence each other over time (Ishkur, 2000; Crauwels, 2016). We chose to study electronic music because it is known for its rapid differ- CHAPTER 7. ELECTRONIC MUSIC PHYLOGENY 108 entiation into competing genres and subgenres (van Venrooij, 2015), particularly during the 1990s (McLeod, 2001). In addition, collaboration links between electronic music producers are already known to be important for cultural transmission (Youngblood, 2019b,a) and community structure (Janosov et al., 2020). The aim of this study is to assess the degree to which the cultural phylogeny of electronic music is tree-like. As the tree-likeness of cultural phylogenies exists on a continuum (Gray et al., 2010; O’Brien et al., 2013) and there is not enough reliable comparative data to gen- erate neutral predictions about typical levels of horizontal transmission, it seems unwise to attempt to do hypothesis-testing about whether or not a given phylogeny is fundamentally tree-like or reticulated. Instead, we aim to simply characterize tree-likeness by using the ratio of vertical to horizontal transmission. Here, we calculate this as the proportion of links in the network that are within rather than between populations. Since collaborative links between artists are meaningful for cultural transmission (Youngblood, 2019b), the level of collabora- tion between two populations likely reflects the level of cultural transmission between them. This simple metric is consistent with the common definition of horizontal transmission as the level of cultural transmission between extant populations (Borgerhoff Mulder et al., 2006; Greenhill et al., 2009; Currie et al., 2010), and reflects the conceptualization of horizontal transmission in biological phylogenetics. The original definition of horizontal transmission proposed by Cavalli-Sforza and Feldman (1981a), which was simply cultural transmission oc- curring within the same generation, is insufficient here because it does not take population structure into account.

7.2 Methods

All data used in the current study was collected from Discogs in April 2020. Discogs is a large, user-generated database of music releases that has better coverage of electronic music CHAPTER 7. ELECTRONIC MUSIC PHYLOGENY 109 than other sources (van Venrooij, 2015; Bogdanov and Serra, 2017). We chose to use the releases data rather than the masters data because it has better coverage of EPs, singles, remixes, and other smaller releases. First, we extracted all collaborative releases with the “Electronic” genre tag. Then, we retrieved the release IDs, release years, style tags, and artists from these releases, including any featured artists, remixers, and producers from the tracklists. Due to computational limitations, we only included releases from between 1970 and 1999. Collaborations were converted into an unweighted dynamic edgelist for input into TILES. More details about data collection and computational limitations can be found in the appendix. Community detection was conducted using the Python implementation of TILES (Ros- setti et al., 2017). Each link persisted for 365 days and community composition was recorded at the beginning of each year. Links from collaborative releases were assigned random dates within the year they were released, so that changes in community membership were contin- uous. We chose to set link persistence to 365 days so that the resolution of the phylogeny matched the resolution of the collaboration data. Preliminary analyses indicated that lower values fragment the overall network structure, whereas higher values bias the results towards vertical transmission. The output of TILES is a set of community memberships for each year, where each individual may be part of multiple overlapping communities. In order to visualize the results, we required a single community membership for each individual. We assigned each individual in multiple communities to the community with highest network density, or the proportion of actual to possible connections. Ties were broken by group size. Both core and peripheral community members were included in the visualization and analysis. The dynamic community composition for all of the electronic releases, in which each node is a community of artists and each directed link represents artists moving between communities from year to year, was too large to visualize in its original form. To simplify CHAPTER 7. ELECTRONIC MUSIC PHYLOGENY 110

Figure 7.1: A simplified visual summary of community detection using the TILES algorithm (A: top), and network simplification using the fast greedy modularity optimization algorithm (B: bottom). In A, arrows depict the movement of nodes (artists), dotted lines depict newly formed links (collaborations), and different colors represent different communities. Each link in the network corresponds to a collaborative release like the one shown in the bottom right (https://bit.ly/3jvofD8). Both core and peripheral artists are included, and artists in multiple communities are assigned to the largest one. Between t1 and t2, artist 1 breaks all but one of its previous ties. Since artist 1 is no longer in a triangle with those other artists, the network splits into two communities (green and pink). Between t2 and t3, artist 2 breaks its tie with the green community and establishes a collaboration with a peripheral member of the pink community and an unassigned artist. Since these three artists form a new triangle they are classified as a new community (yellow). In the left side of B, each node represents a community from A in the corresponding time point, and the links between communities represent the number of individuals moving between them. The right side of B is the same visualization after network simplification, where the pink and yellow communities have been compressed into a single population. CHAPTER 7. ELECTRONIC MUSIC PHYLOGENY 111 the network we conducted a cluster analysis using the fast greedy modularity optimization algorithm (Clauset et al., 2004) and merged nodes assigned to the same cluster within each year using the igraph package in R (Csardi and Nepusz, 2006). Links were combined to rep- resent the total number of artists moving between each set of merged nodes. The simplified network can be thought of as a hierarchical network with two layers. Each node represents a population of communities (identified by TILES) which are further separable into individual artists. The fast greedy algorithm is a hierarchical agglomeration clustering method that opti- mizes the modularity, or the observed minus expected proportion of within-cluster links, of the entire network. In other words, the algorithm clusters groups of communities that interact significantly more with one another than with other communities into populations, as consistent with the CIPC (Millstein, 2010; Baraghith, 2020). We chose to use the fast greedy algorithm because it yielded the highest modularity relative to other clustering meth- ods (see Table F.1), as well as a manageable number of populations for interpretation. In addition, it results in accurate estimates of the proportion of links between clusters in simu- lated networks of various population sizes (Yang et al., 2016), which is directly relevant for our measurements of horizontal transmission. To measure horizontal transmission between two populations, we divided the total num- ber of weighted links between populations by the sum of the number of internal weighted links from each population. In other words, we calculated the percentage of links in the sub-network that flow between the two populations. In addition, we calculated the cohesion index (CI), or the ratio of internal to external links, for each individual population (Baraghith, 2020). Although this application of the CI differs from its original formulation, as populations in the dynamic network have a temporal dimension, it is still an intuitive metric for the degree to which a population is distinct from other populations in the phylogeny. Since the weighted links are summed during both CHAPTER 7. ELECTRONIC MUSIC PHYLOGENY 112 community detection and network simplification, the CI can be calculated at any level of the hierarchical network. The phylogeny was visualized with a sankey plot using the sankeyD3 package in R (Bre- itwieser et al., 2017). This method expands on recent work by Mall et al. (2015) and others (Rosvall and Bergstrom, 2010; Wu et al., 2016; Liechti and Bonhoeffer, 2020). As you go from left to right in the visualization, each node is a population of communities identified by TILES in the corresponding year on the x-axis. The links depict movement between popula- tions over time, where thickness corresponds to population size. Only the largest connected component of the network was plotted to maximize the clarity of the visualization. The lay- out of the nodes in the sankey plot is algorithmically generated by (1) minimizing the overlap between links and (2) maximizing the horizontal alignment of nodes. In other words, the visualization algorithm maximizes both the clustering of nodes and the continuity of links. The distances between and the positions of populations do not convey information like they do in a standard phylogeny, so the specific y-axis location of each node is not meaningful. To determine whether or not populations of artists correspond to distinct subgenres, we compiled the style tags for all releases from each population and conducted Latent Dirichlet Allocation (LDA) topic modelling. LDA models are Bayesian mixture models that take large corpora of words from different sources and identify clusters of words, or “topics” (Gr¨unand Hornik, 2011). Each source is assigned a distribution of topics, one of which is identified as the most distinctive to that source. In this case, we looked for clusters of style tags that were distinctive to each population. Style tags from Discogs have been successfully used to quantitatively identify genres in the past (Mauch et al., 2015), which is unsurprising given that their purpose is subjectively denoting genres. Releases by artists from multiple populations (42.5%) were excluded from the topic modelling so that the samples were independent from one another. We used all style tags for each group rather than unique style tags to account for frequency of use. The LDA model was fitted using variational CHAPTER 7. ELECTRONIC MUSIC PHYLOGENY 113 expectation maximization using the R package topicmodels (Gr¨unand Hornik, 2011). k, or the target number of topics, was set to the number of populations. The method used in this study is available as a new R package called DynCommPhylo on Github (https://github.com/masonyoungblood/DynCommPhylo). The “example” sub- folder of the R package includes our processed data and analysis script. The interactive version of Figure 7.2 is also available on Github (https://masonyoungblood.github.io/ electronic_music_phylogeny.html). The raw data used in this study came from the April 2020 XML archive of releases on Discogs (http://bit.ly/DiscogsApril2020).

7.3 Results

We analyzed 1,498,483 collaborative relationships between 93,831 artists from 53,581 differ- ent electronic music releases between 1970 and 1999. TILES identified 8,354 communities, of which we visualized the largest connected component. This component contained 90.2% of the communities, appearing between 1975 and 1999. Network simplification reduced the dynamic community composition to 72 populations with a modularity score of 0.65. The geographic distribution of releases from these populations can be seen in Figure F.1. The resulting phylogeny can be seen in Figure 7.2. Each node is a set of communities identified by TILES, which are further separable into individual artists. Nodes with the same color belong to the same population. Each link corresponds to the number of artists moving between nodes. In the interactive version of Figure 7.2 (https://masonyoungblood. github.io/electronic_music_phylogeny.html) you can zoom, navigate, and hover over nodes for the three most distinctive style tags and 10 most distinctive artists from each population (identified by weighted log-odds, calculated using uninformative Dirichlet priors), and hover over links for the number of individuals moving between nodes. Details about the eight largest populations in the phylogeny can be seen in Table 7.1. CHAPTER 7. ELECTRONIC MUSIC PHYLOGENY 114 ) https://masonyoungblood.github.io/electronic_music_phylogeny.html Figure 7.2: Aset phylogeny of based communities, oncolor and the belong dynamic each to link community the composition correspondsalgorithm same population, of maximizes to and electronic both the the music thesimplified number height over clustering versions of of of time. each of artists the node nodes morein Each moving corresponds and detailed Figure to node between classifications the F.2. the nodes. is in continuity number Interactive the a of of version: results, Nodes artists. links. and ( with The the The visualization overall the population population same size names over in time the can be legend seen are CHAPTER 7. ELECTRONIC MUSIC PHYLOGENY 115

Population #5 (yellow) includes a diverse array of four-on-the-floor dance music but leans towards trance. Most of the top artists are from Germany. Population #3 (pink) also includes a diverse array of dance music but leans towards house. The top artists are from America, England, and all over Europe. “Euro House”, the second top style in this population, is associated with the radio-friendly popular dance music that rose to prominence in Europe in the mid 1990s. Population #8 (purple) includes artists like Madonna, Pet Shop Boys, and Technotronic that were incorporating electro, hip-hop, and techno into pop and rock music in the late 1980s and early 1990s. Population #1 (orange) includes a diverse array of artists from American and Europe making techno, hard house, and experimental music. Population #6 (blue) is the branch of the “hardcore continuum” of high-BPM rave genres that includes jungle and drum n bass. Most of the top artists are from London and the surrounding suburbs. Population #4 (green), on the other hand, is the heavier branch of the “hardcore continuum” that includes happy hardcore, gabber, and noise. Most of the top artists are from rural England and Scotland. Population #2 (light grey) includes both disco and synth-pop throughout the 1980s, including Giorgio Moroder and Donna Summers who transformed dance music with their 1977 hit “I Feel Love”. Population #9 (dark grey) corresponds to Goa trance and psy-trance. Goa trance first emerged in India in the early 1990s and became known as psy-trance once it reached Europe and the rest of the world. Most of the top artists are from Europe and the Middle East. The most distinctive artists from each population do not necessarily represent the most innovative or influential artists from the subgenres that they correspond to. For example, the originators of Detroit techno music (e.g. Juan Atkins, Derrick May, and Kevin Saunderson (Sicko, 2010)) do not appear in the most distinctive artists from Population #1 (orange), and the originators of Chicago and New York house music (e.g. Frankie Knuckles and Larry Levan (Reynolds, 2012)) do not appear in the most distinctive artists from Population #3 (pink). The rise and fall of each of these lineages, as well as the interactions between them, are intuitive and broadly CHAPTER 7. ELECTRONIC MUSIC PHYLOGENY 116

Figure 7.3: The number of populations identified in each year (left y-axis), as well as the percentage of between-population links between each year and the previous year (right y- axis). The percentage of between-population links should be treated as a global measure of horizontal transmission. consistent with the historical record (Sicko, 2010; Reynolds, 2012; Collins et al., 2013). The level of horizontal transmission between the eight largest populations in the phy- logeny can be seen in Table 7.2. 20.93% of links in the phylogeny are between rather than within populations (mean of 25.74% with different network simplification methods, as per Table F.1), indicating that while horizontal transmission is common the phylogeny still has a fundamentally branching structure. Interestingly, both the number of populations and the percentage of between-population links has increased over time (Figure 7.3). The large spike in the percentage of between-population links between 1984 and 1986 is likely due to the large movement of artists from Population #2 (light grey) to Population #8 (purple), which appears to be caused by an influx of new artists collaborating with existing artists and disrupting the original population structure. The LDA topic model identified 48 unique topics across 69 of the populations in the simplified network. Three populations (#53, #57, #62) were excluded from the topic model CHAPTER 7. ELECTRONIC MUSIC PHYLOGENY 117 The Mackenzie, ATB, Fiocco, Da Hool,Cappella, Nalin Jon & Of Kane The Pleased2 Wimmin, Reel Unlimited, 2 Corona, Real, Sasha, Dave DJ Seaman Duke, DJ BoBo, Jeremy Healy, Jackson, Technotronic, PaulaNew Abdul, Order, Sandra Whitney Houston, Mike Flores, DJ Spooky,teno, AM/FM Alexander, George Bill Cen- Broom, Laswell, Nemesis, Angel Mad Alanis Professor,Grooverider, Poogie DJ SS, Bear, DJBukem, Hype, Ellis Mark Micky Dee, Finn, Randall, DJ DJ Ratty, LTJ Phantasy,Force Kenny & Ken Styles, Hixxy, Scott’Daniel’ Brown, Bunter, Bass Generator, BuzzDNA, Billy DJ Fuzz, Fade Eruption, Sy & Unknown, Cowley, Sylvester, Boney M.,zoo, Kraftwerk, The Gino Human Soccio, League Ya- Astral Projection, Prana,Project, GMS, Chakra, Pleiadians, X-Dream, Quirk, ManMadeMan MFG, S.U.N. dance mental core Noise Progressive Trance N CI Top Styles Top Artists 5 15,413 1.733 Trance, House, UK 10,891 Garage 1.45 House, Sash!, Vengaboys,8 Euro 666, Paul House, van 8,228 Dyk, Italo- Brooklyn Bounce, 4.61 Synth-pop,1 Italo-Disco, Disco 8,152 Madonna, Depeche Mode, Pet Shop 1.05 Boys, Black Box, Janet Techno,6 Hard House, Experi- 5,622 1.574 Jungle, 5,243 Drum n Bass, Hard- 1.98 Hardcore,2 Happy Hardcore, 3,885 4.87 Disco,9 Synth-pop, New Wave 2,283 Commodores, Donna 1.05 Summer, Giorgio Moroder, Goa Patrick Trance, Psy-Trance, Table 7.1: The size,populations in cohesion the index, phylogeny. top styles and top artists (calculated by weighted log-odds) from the eight largest CHAPTER 7. ELECTRONIC MUSIC PHYLOGENY 118

5 3 8 1 6 4 2 9

5 12.3% 0.0% 12.9% 2.4% 4.0% 0.0% 3.9%

3 4.3% 7.0% 7.4% 2.7% 0.0% 3.0%

8 0.0% 2.7% 0.1% 5.3% 0.0%

1 5.2% 5.1% 0.0% 3.6%

6 6.4% 0.0% 1.8%

4 0.0% 2.3%

2 0.0%

9

Table 7.2: The level of horizontal transmission between the eight largest populations in the phylogeny. Each value represents the percentage of links in the sub-network that flow between the two populations. In other words, we divided the total number of weighted links between populations by the sum of the number of internal weighted links from each population. CHAPTER 7. ELECTRONIC MUSIC PHYLOGENY 119

Figure 7.4: The effective number of styles from each year calculated using Simpson’s (left y-axis) and Shannon’s (right y-axis) diversity indices. because they did not have any unique releases. The α value of the fitted model, which gets smaller as more topics correspond one-to-one with groups, was 0.019. 64.6% of topics corresponded to a single population, 27.1% corresponded to two populations, and 8.3% corresponded to three populations. This indicates that populations of artists have distinctive style tags associated with their releases, and likely represent distinct subgenres of electronic music. The top 10 terms from each topic assigned to the eight largest populations in the phylogeny can be seen in Figure F.3. The style tag diversity from each year, calculated with all releases in the phylogeny using the Simpson and Shannon diversity indices, can be seen in Figure 7.4. We used the effective number of styles, or the number of equally-abundant styles required to get the same diversity index (Jost, 2006), because it scales linearly and has been used in previous work on musical diversity (Mauch et al., 2015). Between 1975 and 1999 the diversity of styles tags in the phylogeny has significantly increased. CHAPTER 7. ELECTRONIC MUSIC PHYLOGENY 120 7.4 Discussion

By applying dynamic community detection methods to an exhaustive dataset of electronic music releases spanning three decades, we have constructed a cultural phylogeny built ex- plicitly from population structure. Most importantly, we found that although vertical trans- mission appears to be dominant, as the majority of links in the phylogeny are within lineages (79.1%), horizontal transmission is common and populations never become fully isolated from another. Additionally, the level of horizontal transmission between populations increased be- tween 1975 and 1999. This is consistent with anecdotal accounts (Lindop, 2011) and another recent quantitative study that found that hybridization between genres increased over the same period, using genre tags from music databases (Gagen, 2019). Taken together these results indicate that although communication technologies like the internet have increased rates of horizontal transmission between human populations (Carrignon et al., 2019), allow- ing music producers to collaborate and influence one another despite significant geographical distance (Youngblood, 2019b), the cultural evolution of music is still relatively tree-like. We also observed a significant increase in style tag diversity. In combination with popula- tion diversification, this indicates that electronic music diversity has increased between 1975 and 1999. This result contradicts two recent studies on the cultural evolution of popular mu- sic, which concluded that musical diversity has either remained relatively constant (Mauch et al., 2015) or declined (Serr`aet al., 2012) in recent decades. Although it is possible that di- versification patterns in popular music are different than in contemporary music as a whole, both studies have some important limitations that reduce their generalizability. Mauch et al. (2015) only used data from the Billboard Top 100. Estimates of diversity based on the Bill- board charts alone will inevitably underestimate the diversity in music being produced at all levels of an ever-expanding industry. Although Serr`aet al. (2012) used a more extensive dataset their conclusions are based solely on pitch transitions, timbre, and loudness, mea- CHAPTER 7. ELECTRONIC MUSIC PHYLOGENY 121 sures that only capture a fraction of musical variation and are less relevant for recent musical innovations (e.g. experimental, noise, drum-based, and lyric-based music). That being said, the quantitative audio analysis methods used by both Mauch et al. (2015) and Serr`aet al. (2012) better capture patterns of musical diversity than style tags. Future studies should combine quantitative audio analysis with our process-based phylogenetic approach to better understand the diversification of contemporary music. The fact that the structure of the phylogeny is relatively tree-like may be surprising to some, given the prevalence of horizontal transmission in contemporary culture (Carrignon et al., 2019). We suspect that our observed level of vertical transmission is due to a combi- nation of factors. Geography, for example, plays an important role in structuring cultural variation (Gray et al., 2009; Bouckaert et al., 2012; Currie et al., 2013; Schillinger and Lycett, 2019; Kauhanen et al., 2021), and the top artists from certain populations in the phylogeny tend to come from particular countries. Although the widespread adoption of the internet has reduced the influence of geography on musical collaboration (Youngblood, 2019b), most of our data is from the 1980s and 1990s when electronic music subcultures were often associated with particular locations like Detroit or London (Sicko, 2010; Reynolds, 2012; Collins et al., 2013). Transmission isolating mechanisms (TRIMS), or cultural norms that reduce levels of transmission between lineages, may be present as well (Durham, 1992). TRIMS are com- parable to intrinsic reproduction barriers that exist between members of different biological species. Individuals in “underground” electronic music communities often define themselves in opposition to mainstream music culture, and are deeply invested in the shared identity of their community (Thornton, 1995; Lindop, 2011). The common underlying concern, that too much mainstream recognition could reduce the longevity and undermine the integrity of music communities, is actually supported by previous research on electronic music in the UK (van Venrooij, 2015). Artists who receive mainstream recognition are sometimes subject to criticism (Thornton, 1995; Noys, 1995; Hesmondhalgh, 1998), and may even adopt aliases to CHAPTER 7. ELECTRONIC MUSIC PHYLOGENY 122 reinforce their commitment to the “scene” (Formilan and Stark, 2020). Artists also appear to intentionally adopt styles that sound distinctive relative to more popular artists (Klimek et al., 2019), which could be further reinforced by conformity bias within groups (Young- blood, 2019a). Communal experiences in clubs and raves, sometimes supplemented with drug use, also enhance social bonding (Hutson, 2000; St John, 2006; Savage et al., 2020b) and reinforce community boundaries (Kavanaugh and Anderson, 2008). Some scholars have even posited that obscure subgenre names and other forms of jargon function in “maintaining clear boundaries that define in-group/out-group relations” (McLeod, 2001). We hypothesize that these cultural norms may act as TRIMS by enhancing the longevity and cohesion of music communities, and thus reducing the likelihood that they are integrated into larger and more popular genres (Lena, 2012). Importantly, many populations evolve independently and only later connect to the larger phylogeny. In some cases this is likely due to the fact that external connections to other genres are not shown, but in other cases it could represent independent evolution via shared technology. For example, the emergence of electronic music in India in the 1970s, most notably in Bollywood (#15 in Figure 7.2), was driven by the introduction of novel synthesizer technologies rather than the influence of American and European artists (Pandey, 2019; Purgas and Morgan, 2020). Additionally, the level of reticulation that we observed for electronic music has likely been enhanced by communication technologies and a globalized music industry, and is not necessarily reflective of traditional music. Diversity in Gabonese folk music, for example, appears to show signatures of vertical transmission (Le Bomin et al., 2016), and phylogenetic studies indicate that folk music variation can be relatively conserved within genetic lineages (Pamjav et al., 2012; Brown et al., 2013). Unfortunately, our method can only be used in cases where detailed collaboration data is available, which is unlikely to be the case for any form of traditional music. Some might argue that the framework we put forward is more closely related to population CHAPTER 7. ELECTRONIC MUSIC PHYLOGENY 123 genetics, as it deals with patterns of connectivity within and between populations. While our approach is informed by population genetics, we feel that reconstructing how populations evolve and diversify over time is also clearly within the realm of phylogenetics. Biologists have investigated the interface of population genetics and phylogenetics for decades (Hey, 1994; Davis, 1996), especially when reconstructing population histories within species complexes (Dean and Ballard, 2004; Russell et al., 2007; Graves and Schrader, 2008; Polihronakis, 2010). The field of phylogeography, for example, integrates population genetics and phylogenetics by studying reticulation, migration, and other processes at the level of populations and species (Edwards et al., 2016). In cultural evolution, where phylogenies are always constructed at the level of populations, sharp distinctions between population genetics and phylogenetics are unlikely to be empirically useful. In other words, we agree with Avise and Wollenberg’s assessment that “concepts of phylogeny cannot be divorced from those of population genetics and demography” (Avise and Wollenberg, 1997). Several limitations of this study need to be highlighted. Firstly, the phylogeny was exclusively constructed from collaboration links, which do not account for all cultural trans- mission. For example, contemporary artists routinely draw inspiration from recorded music, live events, new technologies, etc. These alternate modes of cultural transmission are more likely to be horizontal, especially towards the end of our study period when internet-use became more widespread. Thus, the phylogeny really only captures the “core” of these cultural lineages (Boyd et al., 1997), and may under-represent reticulation across all trans- mission modes. Secondly, the data was limited to releases tagged as “electronic” on Discogs, so external connections to other genres (e.g. rock and pop music) are not shown. Finally, our analysis focuses on the evolution of interactions among artists, and does not analyze the evolution of the sound of the musical works themselves (Serr`aet al., 2012; Savage et al., 2020a). Future studies should investigate the ways in which musical sounds and the artists making them may or may not coevolve. CHAPTER 7. ELECTRONIC MUSIC PHYLOGENY 124

Our framework also has the potential to improve music recommendations on streaming platforms like Spotify. Currently, Spotify’s recommendation algorithms are primarily based on listener habits (Johnston, 2019). In other words, related artists are identified based on what their fans are also listening to. Spotify’s algorithms have been observed to reduce the diversity of listeners’ music consumption (Anderson et al., 2020), and have the potential to exacerbate existing inequalities in the music industry based on popularity (Abdollahpouri et al., 2020; Abdollahpouri and Mansoury, 2020), gender (Werner, 2020), and other factors (O’Dair and Fry, 2020). These issues may be further compounded by a recently announced policy allowing artists to “pay” (via reduced royalties) for an algorithmic boost (Hern, 2020). Supplementing recommendation algorithms with data on collaborative population structure would allow for the identification of artists from the same community independently of listeners’ biases. Future research should apply this method to high resolution data from other cultural domains (e.g. co-citation patterns in academic fields (Youngblood and Lahti, 2018), institu- tional membership in the arts (Fraiberger et al., 2018), or the composition of Hollywood film crews (Tinits and Sobchuk, 2020)) to determine whether the observed level of reticulation in this study is typical of other contemporary cultural systems. Additionally, this method is theoretically generalizable to biological evolution as well. For example, researchers that study the evolution of insects in the lab could use motion tracking (Crall et al., 2015) to estimate population structure from mating events, and then measure how genetic changes map onto lineages over time. In some long-term study systems, such as Darwin’s finches, researchers have collected enough detailed mating data to reconstruct population structure in the wild. Applying this method where interbreeding between closely related species has led to hybrid speciation and clear genetic changes (Lamichhaney et al., 2018) could provide new insight into the process of reproductive isolation. Finally, previous studies have used cases where the historical relationships between cultural lineages are known (e.g. transmis- CHAPTER 7. ELECTRONIC MUSIC PHYLOGENY 125 sion chains (Spencer et al., 2004; Schillinger et al., 2016) and simulations (Greenhill et al., 2009; Currie et al., 2010)) to assess the efficacy of traditional phylogenetic methods under particular conditions. By using an agent-based model of cultural transmission that outputs both population structure and trait features and allows for movement between lineages, fu- ture studies could provide ground truth estimates of horizontal transmission from traditional phylogenetic methods against those from dynamic community detection. Appendix A

Descriptive Network Statistics Collaboration Index 2.43 Small-Worldness Index 22.42 Network Density 0.0083 Stirling-Rao Diversity 0.8188

Table A.1: Descriptive network statistics

Figure A.1: The observed distribution of author productivity (solid line) alongside the the- oretical inverse square function multiplied by the total number of authors in the dataset (dashed line, y = 3451/x2). A Kolmogorov-Smirnov test found the observed distribution to be significantly different from the theoretical inverse square function (p < 0.05).

126 APPENDIX A. 127

Core Authors Primary Departmental Affiliations Alex Mesoudi Biosciences Alberto Acerbi Industrial Engineering & Innovation Science 1 Kevin Laland Biology Mark Collard Archaeology Andrew Whiten Psychology & Neuroscience Peter Richerson Environmental Science & Policy Robert Boyd Human Evolution & Social Change 2 Joseph Henrich Human Evolutionary Biology Magnus Enquist Zoology Stefano Ghirlanda Psychology Simon Kirby Philosophy, Psychology, & Language Science Kenny Smith Philosophy, Psychology, & Language Science 3 Alex Bentley Anthropology Morten Christiansen Psychology Christine Caldwell Psychology Ruth Mace Anthropology Russell Gray Linguistic & Cultural Evolution 4 Simon Greenhill Linguistic & Cultural Evolution Thomas Currie Biosciences Quentin Atkinson Cognitive & Evolutionary Anthropology Marcus Feldman Biology Nicole Creanza Biological Sciences 5 Kenichi Aoki Strategic Coordination of Research & Intellectual Properties Laurel Fogarty Comp, Math & Phys in the Life Sciences and Experimental Biol Joe Wakano Interdisciplinary Mathematical Sciences David Buss Psychology Lei Chang Psychology 6 Hui Jing Lu Applied Social Sciences Todd Shackelford Psychology Laith Al-Shawaf Psychology Robert Lachlan Biological & Experimental Psychology Peter Slater Biology 7 Maria Servedio Biology Carel ten Cate Biology Machteld Verzijden Molecular Biology and Genetics

Table A.2: Names and primary departmental affiliations of core authors. APPENDIX A. 128 language facultyiconicityevolution of speech cultural evolution Interdisc Apps MathLanguage & LinguisticsBiology complexity Multidisciplinary Sci Mathematical Psychology Multidisciplinary Ecology Sciences 1 2 3 4 5 6 7 United KingdomNetherlandsFrance Swedensocial learningcultural transmissionconformity Canadaimitation cultural evolutionemulationcultural evolution cooperationchimpanzeescultural iterated attraction learning United Kingdom religion cultural transmission neolithic Netherlands europe euroevol language change language evolutionAnthropology phylo compBiological New methods Psychology ZealandEvolutionary Biology language Biological Psychology cultural accumulation bayesian cultural Behavioral inference phylogenetics Sciences fertility decline Linguistics Biomedical United Social Kingdom random Sci demography genetic drift Biophysics Japan evolutionary Experimental psych Psychology compositionality cross-cultural bird research song Biology Switzerland experimental phylogenetics gene-culture phylogeny coevolution differences mate preferences austronesian song learning pastoralism Math & Comp Biol speciation Social Psychology Genetics & Heredity Ecology Evolutionary Biology sex Evolutionary Biol Zoology Countries Keywords Subjects Table A.3: Statistically over-representedgories country in affiliations each of cluster. first authors, author-chosen keywords, and subject cate- APPENDIX A. 129

Figure A.2: Wordcloud showing the most commonly used author-chosen (as opposed to WoS-assigned) keywords associated with the articles in each group. “Cultural evolution” has been excluded from the wordcloud for clarity. Text size indicates use frequency.

The science overlay map in Figure A.3 illustrates the diversity of disciplines involved in the study of cultural evolution. The social, psychological, biomedical, and ecological sciences appear to be dominant. Although the clusters of authors share many features, some groups draw from a more varied set of disciplines than others. Group 1 is distinctive in the high degree to which it taps both the geosciences and the social sciences, in these respects contrasting especially with group 2. Infectious disease research is not very important to cultural evolution research, although groups 2 and 4 have made some use of it. Groups 1 and 4 have taken advantage of computer science, mathematics, and the physical sciences to a greater degree than the other groups. APPENDIX A. 130

Geosciences Ecol Sci Agri Sci

All Environ Sci & Tech Biomed Sci Infectious Diseases Materials Sci Chemistry Mech Eng

Clinical Med Health & Social Issues Physics

Clinical Psychology Health & Social Issues Computer Sci Math Methods

Business & MGT

Social Studies Econ Polit & Geography

Pajek

1 2 3 4

Pajek Pajek Pajek Pajek

5 6 7

Pajek Pajek Pajek

Figure A.3: A science overlay map of the entire dataset (above), as well as for each cluster of co-authors identified by VOSviewer (below). The links and nodes are based on a co-citation analysis of an exhaustive dataset of ISI subject categories from 2007, and represent a global map of science over which our own dataset is superimposed. Each node indicates a subject category represented in the dataset. The size of each node corresponds to the frequency of use, and the color corresponds to the larger disciplinary category it belongs to. Empty nodes represent subject categories not present in the dataset. Appendix B

1

0.8

0.6

0.4

Proportion Informed 0.2

0 1984 1992 2000 2008 2016 Year Figure B.1: The combined diffusion curve for all three drum breaks included in the primary OADA. The proportion of informed individuals is on the y-axis, and the year is on the x-axis. Although recent research suggests that inferring acquisition modes from diffusion curves is unreliable, it appears that the curve may have the S-shape indicative of social transmission prior to the early-2000s.

Primary OADA

The results of the multiplicative NBDA model fit to the primary OADA with all four individual-level variables.

Summary of Multiplicative Social Transmission Model Order of acquisition data 131 APPENDIX B. 132

Unbounded parameterisation .

Coefficients: Estimate Bounded se z p Social transmission 1 1.334607e−010.1177462 NA NA NA gender −1.050416 e−01 NA4.017186e−02 −2.6148064 8.927804e−03 p o p u l a r i t y −1.335417 e−02 NA1.465003e−03 −9.1154550 0.000000e+00 f o l l o w e r s −9.611204 e−08 NA1.956053e−08 −4.9135695 8.943303e−07 meandist −1.880816 e−09 NA1.361273e−08 −0.1381659 8.901093e−01

Likelihood Ratio Test for Social Transmission:

Null model includes all other specified variables Social transmission and asocial learning assumed to combine multiplicatively

Df LogLik AIC AICc LR p With Social Transmission 5 7354.4 14719 14719 143.09 0 Without Social Transmission 4 7425.9 14860 14860

The results of all NBDA models fit to the primary OADA. In the “Additive?” column TRUE means the model was additive, FALSE means the model was multiplicative, and NA means the model was asocial. In the “ILVs”, or individual-level variables, column the numbers correspond to the variables included in the model (1: gender; 2: popularity; 3: followers; 4: mean distance).

Additive? ILVs Social?AICc deltaAICc FALSE 1234 social 14718.7305939155 0 FALSE 234 social 14723.2581626229 4.53 FALSE 124 social 14749.7210576547 30.99 FALSE 24 social 14755.0075144196 36.28 FALSE 134 social 14795.6248094134 76.89 FALSE 34 social 14796.5052668663 77.77 FALSE 4 social 15956.6213244064 1237.89 FALSE 14 social 15957.1346280933 1238.4 FALSE 123 social 32229.4897515171 17510.76 FALSE 12 social 32260.0747802103 17541.34 FALSE 23 social 32267.3112186711 17548.58 APPENDIX B. 133

FALSE 2 social 32298.2544538564 17579.52 FALSE 13 social 32356.8941850595 17638.16 FALSE 3 social 32389.9528866174 17671.22 FALSE 1 social 43609.4787479712 28890.75 TRUE 1 social 43612.45988987 28893.73 NA 0 social 43662.7306996825 28944 NA 1 asocial43779.4569488198 29060.73 NA 0 asocial43815.27922266 29096.55 TRUE 2 socialInf Inf NA 2 asocialInf Inf TRUE 12 socialInf Inf NA 12 asocialInf Inf TRUE 3 socialInf Inf NA 3 asocialInf Inf TRUE 13 socialInf Inf NA 13 asocialInf Inf TRUE 23 socialInf Inf NA 23 asocialInf Inf TRUE 123 socialInf Inf NA 123 asocialInf Inf TRUE 4 socialInf Inf NA 4 asocialInf Inf TRUE 14 socialInf Inf NA 14 asocialInf Inf TRUE 24 socialInf Inf NA 24 asocialInf Inf TRUE 124 socialInf Inf NA 124 asocialInf Inf TRUE 34 socialInf Inf NA 34 asocialInf Inf TRUE 134 socialInf Inf NA 134 asocialInf Inf TRUE 234 socialInf Inf NA 234 asocialInf Inf TRUE 1234socialInf Inf NA 1234asocialInf Inf APPENDIX B. 134

0.6

0.5

0.4

0.3

0.2

0.1 Social Transmission Estimate 0 2005 2010 2015 2005 2010 2015 Mean Year Median Year Figure B.2: The relationship between diffusion years and transmission strengths for all seven diffusions included in the additional OADA. The mean (left) and median (right) years of diffusion are on the x-axis, and the social transmission estimates from the additive model are on the y-axis. Linear regression found no significant relationships between either mean year of diffusion and social transmission estimate (R2 = 0.20, p = 0.31) or median year of diffusion and social transmission estimate (R2 = 0.17, p = 0.36).

Additional OADA

The eight songs in the “Most Sampled Tracks” on WhoSampled that were released after 1990. The fifth song, “I’m Good” by YG, was excluded from the additional OADA because it is a producer tag used by a single artist.

1. “Crash Goes Love (Yell Apella)” by Loleatta Holloway (1992)

2. “Shook Ones Part II” by (1994)

3. “C.R.E.A.M.” by Wu-Tang Clan (1993)

4. “Sound of Da Police” by KRS-One (1993)

5. “I’m Good” by YG (2011) [excluded producer tag]

6. “Juicy” by The Notorious B.I.G. (1994) APPENDIX B. 135

7. “Sniper” by DJ Trace and Pete Parsons (1999)

8. “Who U Wit?” by Lil Jon and The East Side Boyz (1997)

The results of the additive NBDA model fit to the additional OADA. Remember that the fifth song was excluded, so the transmission estimates for five, six, and seven here are actually for six, seven, and eight.

Summary of Additive Social Transmission Model Order of acquisition data Unbounded parameterisation

Coefficients Estimate Bounded Social transmission 1 0.13558602 0.11939740 Social transmission 2 0.28805974 0.22363849 Social transmission 3 0.05184340 0.04928814 Social transmission 4 0.60154771 0.37560399 Social transmission 5 0.06816578 0.06381573 Social transmission 6 0.07600555 0.07063677 Social transmission 7 0.01547410 0.01523830

Likelihood Ratio Test for Social Transmission:

Null model includes all other specified variables Social transmission and asocial learning assumed to combine additively

Df LogLik AIC AICc LR p With Social Transmission 7 6450 12914 12914 101.99 0 Without Social Transmission 0 6501 13002 13002

STERGM

The results of all formation models of the STERGM fit to the data from 1984-1999. In the “ILVs”, or individual-level variables, column the numbers correspond to the variables APPENDIX B. 136 included in the model (1: gender; 2: popularity; 3: followers; 4: mean distance).

ILVs AIC deltaAIC 1 2 4 7676.380 0.00000 1 2 3 4 7678.174 1.79348 1 2 7693.425 17.04473 1 2 3 7695.203 18.82240 1 4 7702.003 25.62300 1 3 4 7702.160 25.77921 2 4 7710.403 34.02256 2 3 4 7711.909 35.52898 1 7718.147 41.76695 1 3 7718.390 42.00953 2 7727.194 50.81323 2 3 7728.701 52.32017 4 7737.373 60.99249 3 4 7738.094 61.71319 0 7753.228 76.84742 3 7753.985 77.60457

The results of the best-fitting formation model of the STERGM with the most individual- level variables fit to the data from 1984-1999.

======Summary o f model f i t ======

Formula: y.form ˜ edges + nodecov(”meandist”) + absdiff(”popularity”) + absdiff(”followers”) + nodematch(”gender”, diff = TRUE)

Iterations: 11 out of 20

Monte Carlo MLE Results: Estimate Std. Error MCMC% z value Pr( >| z | ) edges −9.382e+00 1.089e−01 0 −86.175 < 1e−04 *** nodecov. meandist −1.292e−07 3.371 e−08 0 −3.833 0.000126 *** absdiff .popularity −1.618e−02 3.284 e−03 0 −4.927 < 1e−04 *** absdiff.followers 1.194e−08 2.579 e−08 0 0.463 0.643396 nodematch. gender.−1 1.926e+00 3.293e−01 0 5.851 < 1e−04 *** APPENDIX B. 137

nodematch.gender.0 3.478e−01 1.965 e−01 0 1.770 0.076773 . nodematch.gender.1 5.313e−01 1.117 e−01 0 4.757 < 1e−04 *** −−− Signif. codes: 0 ’*** ’ 0.001 ’** ’ 0 . 0 1 ’ * ’ 0.05 ’.’ 0.1 ’ ’ 1

Null Deviance: 6349494 on 4580192 degrees of freedom Residual Deviance: 7664 on 4580185 degrees of freedom

AIC: 7678 BIC: 7772 (Smaller is better.)

The results of the goodness-of-fit analysis of the formation model of the STERGM with the most individual-level variables fit to the data from 1984-1999.

Goodness−of−fit for degree

obs min mean maxMCp−value 0 10698 10388 10484.50 10553 0.00 1 801 1032 1101.97 1200 0.00 2 148 89 104.41 126 0.00 3 43 16 20.71 29 0.00 4 15 5 8.01 11 0.00 5 0 0 0.38 2 1.00 6 6 0 0.95 2 0.00 7 1 0 0.07 1 0.14 8 12 4 6.55 7 0.00 9 1 0 0.42 3 0.62 10 1 0 0.03 1 0.06 11 2 0 0.91 1 0.00 12 2 0 1.00 2 0.14 13 0 0 0.08 1 1.00 14 0 0 0.01 1 1.00

Goodness−of−fit for edgewise shared partner

obs min mean maxMC p−value esp0 533608 645.34 700 0 esp1 92 43 43.01 44 0 esp2 42 19 19.01 20 0 esp3 6 3 3.00 3 0 APPENDIX B. 138

esp7 71 36 36.00 36 0 esp8 1 0 0.00 0 0

Goodness−of−fit for minimum geodesic distance

obs min mean maxMCp−value 1 745 709 746.36 801 0.92 2 421 186 212.11 249 0.00 3 243 46 67.21 102 0.00 4 138 14 22.30 45 0.00 5 60 6 8.67 22 0.00 6 19 0 1.07 7 0.00 7 5 0 0.06 3 0.00 Inf 68788954 68789399 68789527.22 68789605 0.00

Goodness−of−fit for model statistics

obs min mean maxMCp−value edges 745.00 709.00 746.36 801.00 0.92 nodecov. meandist −298388294.47 −366967354.28 −300622250.67 −216526380.89 0.94 absdiff.popularity 16092.87 15023.48 16140.46 17764.91 0.94 absdiff.followers 631702714.34 538012952.78 636527414.86 770413121.94 1.00 nodematch. gender.−1 20.00 13.00 20.16 28.00 1.00 nodematch.gender.0 68.00 56.00 68.37 80.00 1.00 nodematch.gender.1 384.00 361.00 385.54 424.00 1.00

The results of all formation models of the STERGM fit to the data from 2000-2017. In the “ILVs”, or individual-level variables, column the numbers correspond to the variables included in the model (1: gender; 2: popularity; 3: followers; 4: mean distance).

ILVs AIC deltaAIC 1 2 3 14194.16 0.00000 1 2 3 4 14196.11 1.94688 2 3 14365.84 171.67787 2 3 4 14367.30 173.13943 1 2 14409.22 215.06168 1 2 4 14410.98 216.82145 1 3 14563.71 369.55351 1 3 4 14565.71 371.55212 APPENDIX B. 139

2 14599.55 405.39275 2 4 14600.59 406.43284 1 14636.97 442.80822 1 4 14638.95 444.78763 3 14748.97 554.80763 3 4 14750.71 556.55346 0 14836.01 641.84666 4 14837.57 643.40921

The results of the best-fitting formation model of the STERGM with the most individual- level variables fit to the data from 2000-2017.

======Summary o f model f i t ======

Formula: y.form ˜ edges + nodecov(”meandist”) + absdiff(”popularity”) + absdiff(”followers”) + nodematch(”gender”, diff = TRUE)

Iterations: 10 out of 20

Monte Carlo MLE Results: Estimate Std. Error MCMC% z value Pr( >| z | ) edges −8.530e+00 7.639e−02 0 −111.660 < 1e−04 *** nodecov. meandist −3.907e−09 1.702 e−08 0 −0.230 0.81847 absdiff .popularity −4.764e−02 2.799 e−03 0 −17.022 < 1e−04 *** absdiff.followers 1.726e−07 9.188 e−09 0 18.786 < 1e−04 *** nodematch. gender.−1 8.030 e−01 3.605 e−01 0 2.227 0.02592 * nodematch. gender.0 −7.953e−01 2.582 e−01 0 −3.080 0.00207 ** nodematch.gender.1 8.877e−01 7.878 e−02 0 11.269 < 1e−04 *** −−− Signif. codes: 0 ’*** ’ 0.001 ’** ’ 0 . 0 1 ’ * ’ 0.05 ’.’ 0.1 ’ ’ 1

Null Deviance: 7195553 on 5190494 degrees of freedom Residual Deviance: 14182 on 5190487 degrees of freedom

AIC: 14196 BIC: 14290 (Smaller is better.) APPENDIX B. 140

The results of the goodness-of-fit analysis of the formation model of the STERGM with the most individual-level variables fit to the data from 2000-2017 are below.

Goodness−of−fit for degree

obs min mean maxMCp−value 0 11179 10679 10811.81 10930 0.00 1 1507 1878 2005.54 2115 0.00 2 371 313 351.05 398 0.24 3 130 76 88.47 104 0.00 4 68 17 25.10 31 0.00 5 26 3 7.85 13 0.00 6 5 0 1.94 6 0.06 7 5 0 1.94 3 0.00 8 0 0 0.26 2 1.00 9 2 0 0.04 1 0.00 10 1 0 0.00 0 0.00

Goodness−of−fit for edgewise shared partner

obs min mean maxMCp−value esp0 1176 1324 1379.71 1458 0 esp1 294 144 144.42 147 0 esp2 75 35 36.03 37 0 esp3 22 10 10.02 11 0

Goodness−of−fit for minimum geodesic distance

obs min mean maxMCp−value 1 1567 1514 1570.18 1648 0.90 2 1189 607 677.37 735 0.00 3 1107 320 373.55 436 0.00 4 950 177 225.67 286 0.00 5 658 83 123.00 177 0.00 6 380 37 69.22 125 0.00 7 182 15 39.00 82 0.00 8 79 3 19.09 47 0.00 9 16 1 8.26 34 0.22 10 1 0 2.59 14 1.00 APPENDIX B. 141

11 0 0 0.58 7 1.00 12 0 0 0.12 4 1.00 13 0 0 0.02 1 1.00 Inf 88352442 88355141 88355462.35 88355763 0.00

Goodness−of−fit for model statistics

obs min mean maxMCp−value edges 1567.00 1514.00 1570.181.648000e+03 0.90 nodecov. meandist −52499302.28 −196776194.24 −55329463.86 1.010890e+08 0.96 absdiff.popularity 25911.58 24355.87 25912.42 2.714093e+04 0.98 absdiff.followers 2958736083.69 2685506866.45 2938426320.54 3.215706e+09 0.86 nodematch. gender.−1 16.00 10.00 16.042.400000e+01 1.00 nodematch.gender.0 31.00 23.00 31.33 4.200000e+01 1.00 nodematch.gender.1 1012.00 956.00 1014.21 1.069000e+03 0.94

The results of the formation models of the STERGM with all individual-level variables assuming different transition years. The top and bottom tables show the results for before and after each transition year. The number of unique artists in each time period is included in the second row. Regardless of the transition year, mean distance and gender (F and M) had the same significance pattern and direction of effect observed in the main analysis. The results for popularity only varied from the main analysis in the first time period when the transition year was 1994 or 1996, which could be the result of lower sample sizes. The results for followers were consistent with the main analysis in the second time period, but fluctuated dramatically across transition years in the first time period.

Pre −1994 Pre −1996 Pre −1998 Pre −2000 n=205 n=286 n=370 n=450 Estimate p−value Estimate p−value Estimate p−value Estimate p−value Mean Distance −2.0e−07 9 . 0 e−04 −1.0e−07 1 . 1 e−02 −1.1e−07 3 . 2 e−03 −1.3e−07 1 . 3 e−04 P o p u l a r i t y −5.1e−03 2 . 9 e−01 −5.9e−03 1 . 6 e−01 −1.2e−02 1 . 5 e−03 −1.6e−02 8 . 4 e−07 F o l l o w e r s −2.3e−07 8 . 0 e−03 −2.4e−07 1 . 9 e−03 −8.3e−08 5 . 5 e−02 1 . 2 e−08 6 . 4 e−01 Gender (F) 1.5e+00 1.0e−02 1.3e+00 3.1e−02 1.3e+00 1.3e−02 1.9e+00 4.9e−09 Gender (M) 5.8e−02 7 . 3 e−01 2 . 6 e−01 7 . 4 e−02 4 . 7 e−01 1 . 4 e−04 5 . 3 e−01 2 . 0 e−06

Pre −2002 Pre −2004 Pre −2006 APPENDIX B. 142

n=520 n=593 n=659 Estimate p−value Estimate p−value Estimate p−value −1.1e−07 1 . 6 e−04 −1.2e−07 1 . 2 e−05 −1.1e−07 2 . 0 e−05 −1.8e−02 1 . 7 e−09 −2.0e−02 9 . 9 e−13 −2.2e−02 4 . 4 e−17 2 . 8 e−08 2 . 0 e−01 3 . 9 e−08 4 . 3 e−02 4 . 6 e−08 8 . 0 e−03 1.8e+00 5.1e−09 1.7e+00 6.1e−08 1.5e+00 1.6e−06 5 . 7 e−01 2 . 0 e−08 6 . 7 e−01 5 . 8 e−13 6 . 7 e−01 7 . 5 e−15

Post −1994 Post −1996 Post −1998 Post −2000 n=876 n=836 n=781 n=725 Estimate p−value Estimate p−value Estimate p−value Estimate p−value Mean Distance −2.0e−08 2 . 1 e−01 −2.2e−08 1 . 7 e−01 −9.3e−09 5 . 8 e−01 −3.9e−09 8 . 2 e−01 P o p u l a r i t y −4.3e−02 9 . 3 e−72 −4.4e−02 2 . 4 e−70 −4.7e−02 1 . 7 e−68 −4.8e−02 5 . 7 e−65 Followers 1.6e−07 2 . 6 e−74 1 . 6 e−07 1 . 6 e−77 1 . 7 e−07 1 . 1 e−75 1 . 7 e−07 9 . 9 e−79 Gender (F) 1.3e+00 3.7e−07 1.3e+00 9.2e−07 8 . 3 e−01 1 . 4 e−02 8 . 0 e−01 2 . 6 e−02 Gender (M) 8.9e−01 7 . 3 e−37 9 . 0 e−01 2 . 0 e−35 8 . 6 e−01 2 . 8 e−30 8 . 9 e−01 1 . 9 e−29

Post −2002 Post −2004 Post −2006 n=667 n=607 n=544 Estimate p−value Estimate p−value Estimate p−value 1 . 2 e−09 9 . 4 e−01 1 . 7 e−08 3 . 4 e−01 2 . 2 e−08 2 . 6 e−01 −4.9e−02 3 . 4 e−61 −5.3e−02 2 . 4 e−58 −5.5e−02 3 . 1 e−52 1 . 8 e−07 3 . 2 e−77 1 . 9 e−07 1 . 9 e−81 2 . 0 e−07 1 . 2 e−83 8 . 7 e−01 1 . 5 e−02 9 . 7 e−01 7 . 6 e−03 1.2e+00 1.4e−03 8 . 6 e−01 1 . 5 e−25 8 . 4 e−01 4 . 7 e−21 9 . 2 e−01 5 . 6 e−21 APPENDIX B. 143

60

50

40

30

20

10 Number of Collaborations

0 −50 0 50 0 5M 10M 15M 20M Popularity Followers Figure B.3: The relationship between popularity and followers and the number of collabora- tions for each artist in the dataset. Popularity and followers are on the x-axis, and number of collaborations is on the y-axis. Linear regression found significant positive relationships between both popularity and number of collaborations (R2 = 0.048, p < 0.001) and followers and number of collaborations (R2 = 0.090, p < 0.001). Appendix C

Imputation Check

To ensure that the coding procedure did not bias the estimation of the epidemic predictors, the full twinstim model was re-run after multiple imputation with chained equations using the R package mice (van Buuren and Groothuis-Oudshoorn, 2011) and random forest machine learning using the R package missForest (Stekhoven and B¨uhlmann,2012). All four epidemic predictors (plot success, anticipated fatalities, and group membership) were used in fitting and training. The maximum iterations was set to 10 and number of trees was set to 100. The results of 100 rounds of both imputation methods can be seen in Table C.1.

Chained equations Random forest RR p-value RR p-value Group membership 5.014 0.0040 6.61 0.00040 Social media 2.29 0.065 4.092 0.00013 Anticipated fatalities 1.15 0.43 0.90 0.53 Plot success 0.93 0.78 1.11 0.55

Table C.1: The average rate ratios and p-values for all epidemic predictors after 100 rounds of imputation and estimation using the full model.

Since the observed estimate of social media, the only epidemic predictor with missing data in the best fitting model, is between those from the two imputation methods, and as

144 APPENDIX C. 145

random forest in missForest outcompetes chained equations in mice in most (Stekhoven and B¨uhlmann,2012; Waljee et al., 2013; Liao et al., 2014; Muharemi et al., 2018; Misztal, 2019; Cui and Wang, 2019) (but not all (Shah et al., 2014; Penone et al., 2014)) direct comparisons, I assume that the coding method did not significantly influence the results.

Spatial interaction

Figure C.1: The pair correlation function at different pairwise distances in km (x-axis). The black line is the observed function for the data, the red line is the theoretical function assuming spatial randomness, and the grey envelope shows the upper and lower bounds of the functions from 100 simulated point patterns demonstrating spatial randomness.

Diagnostics

The residuals, or the fitted cumulative intensities over time, were calculated and transformed to fit a uniform distribution according to Ogata (Ogata, 1988). The cumulative density function diverges from expectations for Ui < 0.58, which appears to be the result of tie- breaking with small temporal distances (0.5 days) (Meyer et al., 2012). Increasing the tie-breaking distance to > 20 days to improve the cumulative density function and reduce APPENDIX C. 146

Figure C.2: Estimates of the scaled spatial (left panel) and temporal (right) step functions. The 95% Monte Carlo confidence intervals were each calculated from 100 samples.

Figure C.3: (A) The empirical cumulative density function of Ui, or the standardized residu- als according to Ogata Ogata (1988), with 95% Kolmogorov-Smirnov confidence bands. (B) A scatterplot of Ui and Ui+1 to look for serial correlation. serial correlation did not significantly change the predictor estimates, so I chose to use the original model. Appendix D

Warm-Up Time

Figure D.1: The measured diversity (x-axis) resulting from 1,000 iterations of a neutral model with the observed innovation rate, calculated across 500 timepoints (y-axis). The system appears to reach equilibrium at around 100 timepoints. We used 200 timepoints as the warm-up time to be more conservative.

147 APPENDIX D. 148 Parameter Inference Diagnostics

Figure D.2: A goodness-of-fit test for the rejection form of ABC (n = 1000; ε = 0.01) indicates that the model is a good fit for the data (p = 0.47). The x-axis corresponds to the Euclidean distance between the simulated and observed summary statistics for each iteration, and the y-axis is the number of iterations with each range of distances.

Figure D.3: Leave-one-out cross validation indicates that the results are robust across tol- erance levels (n = 10; ε: 0.005, 0.01, 0.05). Prediction error with ε = 0.005 is 0.0035, ε = 0.01 is 0.0042, and ε = 0.05 is 0.0099. The x-axis is the true parameter values, the y-axis is the estimated parameter values, and the colors correspond to the three tolerance levels. APPENDIX D. 149 Model Choice Diagnostics

Figure D.4: The out-of-bag error computed from an increasing number of trees from the random forest. The x-axis is the number of trees for which the out-of-bag error, shown on the y-axis, was computed.

Figure D.5: The ten most important variables for the classification ability of the random forest. The x-axis is variable importance calculated using the Gini impurity method, and the y-axis is the top ten variables in descending order. The top variable was mean diversity (D¯), followed by the first LDA axis (LD1), the exponent of the turn-over function (x), and the second LDA axis (LD2). The following six variables correspond to diversity in particular years or turn-over rates for specific top-lists. APPENDIX D. 150

Figure D.6: The reference table of the random forest projected onto the two LDA axes. The x-axis is LD1, and the y-axis is LD2. Each colored point corresponds to a single simulation of one of the three cultural transmission models, while the black star corresponds to the observed data. The observed data fall neatly within the range of the output of the conformity model on LD1, but outside of the range on LD2. This is not surprising given that LD1 has over twice as much importance in the random forest. Appendix E

Figure E.1: Average number of house finches encountered per party hour in western Long Island (Brooklyn, Queens, and Nassau County). The dotted line indicates 1994, the year in which conjunctivitis was first detected in the eastern range (Dhondt et al., 1998).

151 APPENDIX E. 152

Figure E.2: A dendrogram showing the results of the hierarchical clustering. The clusters could not be labelled because the number of syllable types exceeds the maximum color limit of R.

Deep split N Silhouette Dunn Connectivity

0 13 0.093 0.0027 7,168 1 36 -0.030 0.0023 12,380 2 114 0.018 0.0020 20,797 3 596 0.071 0.0020 34,602 4 1,646 0.084 0.0020 44,967

Table E.1: The number of syllable types and cluster validity indices for each value of deep split, calculated using the clValid package (Brock et al., 2008). For all three indices higher values correspond to higher quality clustering. We used a deep split value of 3 because it yielded results more consistent with previous research and had only slightly lower val- ues of the silhouette and connectivity indices and an identical Dunn index, which appears to outperform the silhouette index when hierarchical clustering is used with real datasets (Arbelaitz et al., 2013). APPENDIX E. 153

Figure E.3: The cumulative proportion of syllable types (y-axis) detected with each addi- tional song sampled (x-axis) for the 106 birds from whom we have at least 10 recorded songs. The red line is the average value, which cross 75% around eight songs.

Figure E.4: The Simpson’s diversity index for 500 burn-in iterations. The red line is the mean Simpson’s diversity index across all iterations in that year. APPENDIX E. 154

1975 2012 2019

Proportion of syllables that appear once 0.0094 0.0054 0.010 Proportion of most common syllable type 0.059 0.052 0.045 Number of syllable types 410 517 384 Simpson’s diversity index 0.98 0.99 0.98 Shannon’s diversity index 4.86 4.98 4.68 Pielou’s evenness index 0.81 0.80 0.79 Exponent of fitted power-law function 9.17 108.81 1.96

Table E.2: The observed summary statistics from 1975, 2012, and 2019.

Figure E.5: The prior (dotted lines) and posterior (solid lines) distributions for the four dynamic parameters in the agent-based model unrelated to transmission biases. APPENDIX E. 155 1 2 3 4 5 6 7 8 9 10 2012 (6)2012 2012 (1) (2)2019 2012 2019 (1) (7) (6)2019 1975 1975 (6) 1975 (1) (7) (6)1975 1975 2012 (5) 2012 (6) 2012 (3) (3) (4)1975 2012 1975 (2) 2012 (5) 2019 (2) 1975 (1) (7) (4)2019 2012 1975 (6) 2019 (2) 2019 (4) 2019 (2) 1975 (3) (3) (5) 1975 1975 2012 (2) 1975 (4) 1975 (4) 2019 (4) 2019 (3) (5) (2) 1975 2019 2019 (6) 2012 (2) 2012 (2) 1975 (6) 2012 (5) (3) (1) 1975 2019 2012 (4) 2012 (4) 2019 (2) 2012 (2) 2019 (5) (5) (4) 2019 2012 2019 (2) 2012 (4) 2012 (5) 2012 (1) (6) (6) 2012 2019 1975 (6) 1975 (5) 1975 (3) (5) (6) 2012 1975 2012 (2) 2019 (5) (1) (4) 2012 2019 2019 (4) (6) (4) 2012 2012 (1) (5) 1975 (5) B S att N N µ D p α v Table E.3: The 10using most the important Gini summary impurity method. statisticsof used The syllables by summary that the statistics only random are appear(4) numbered forests once, as Simpson’s to (2) they estimate diversity the appear each proportion index, infitted parameter, the of power-law (5) methods: identified the function Shannon’s most to (1) common diversity the the syllable proportion index, progeny type, distribution. (6) (3) the Pielou’s number evenness of index, syllable types, and (7) the exponent of the Appendix F

Data Collection

XML parsing and data management was conducted in BaseX (v9.3.2) and R (v3.6.1), using the follow XQuery commands:

//release[genres/genre/text() = ”Electronic”]/string −join((concat(’”’, @id, ’”’), concat (’”’, master id, ’”’), concat(’”’, released , ’”’), concat(’”’, replace(string −j o i n ( ( formats/format/descriptions/description), ’, ’), ’”’, ’’), ’”’), concat(’”’, string − join((styles/style), ’, ’), ’”’), concat(’”’, string −join((artists/artist/id, t r a c k l i s t /*/artists/artist/id), ’, ’), ’”’), concat(’”’, country, ’”’)), ’ ’)

The “extra artists” tag was not included, as it is often used for non-collaborators such as label executives, managers, and audio engineers. Releases tagged as compilations were excluded from the analysis, and the Discogs placeholder tags were removed (194: various artists, 355: unknown artist, 118760: no artist). When multiple versions of a release were available, we used the one with the most credited artists and assigned it the earliest release year. Since Discogs also includes re-releases and promos, we manually checked the release years from a random sample of 500 releases to assess data quality. 98.4% of releases had the correct year. Of the 1.6% that were incorrect the majority were only one or two years off (because of incomplete artist data, bootleg versions, etc.). Only one was off by a significant amount of time (12 years), because the re-release was the only version labelled as “electronic”. Rare releases with very inaccurate release years

156 APPENDIX F. 157 are unlikely to influence community detection because the collaborators of the artists are no longer part of the network.

Figure F.1: The log-transformed geographic distribution of releases in the phylogeny, ex- cluding the 3.6% of releases with either missing data or a region instead of a country. A total of 72 countries are represented, with 50.3% of releases coming from the US and UK.

Computational Limitations

The TILES algorithm was run on a symmetric multiprocessor (72 cores, 768 GB memory). Analyzing all data from 1970-1999 took 79 hours, with 1998-1999 accounting for about 30 hours of that time. Analyzing a random subsample of 50% of the data took only 1 hour, with 1998-1999 accounting for about 15 minutes of that time. This means that the runtime for the final year increased 120x with only 2x data. As the yearly data on Discogs throughout the 2000s and 2010s is roughly double or triple what it was in the 1990s, including more time with the full dataset was not really feasible and with the 50% subsample would have only given us a few more years of results. As such, we chose to restrict the analysis to 1970-1999. That being said, the results with the 50% subset were very similar to the results we got with the full dataset. The eight largest populations were still present and had roughly the APPENDIX F. 158 same level of horizontal transmission between them. The only major differences were in the smaller populations. For example, the independent evolution of electronic music in India (#15 in Figure 7.2) completely disappeared. Future studies might explore using a random subsample of data to increase the computational efficiency of TILES.

Clustering Results

Populations Modularity Between-population links

Fast greedy 87 0.65 20.93% Louvain 51 0.64 23.42% Edge betweenness 279 0.58 27.51% Walktrap 707 0.57 24.58% Label propogation 239 0.56 11.85% Spinglass 25 0.52 22.56% Leading eigenvector 10 0.52 27.14% Infomap 779 0.51 47.94%

Table F.1: The results of the eight clustering algorithms included in igraph ranked by mod- ularity. APPENDIX F. 159

Figure F.2: The overall population size of the phylogeny in each year. APPENDIX F. 160

LDA Results

Figure F.3: The top 10 terms (ranked by β) from each topic assigned to the eight largest populations in the phylogeny. β is the probability of a style coming from each topic. Bibliography

Abdollahpouri, H. and Mansoury, M. (2020). Multi-sided Exposure Bias in Recommendation. arXiv.

Abdollahpouri, H., Mansoury, M., Burke, R., and Mobasher, B. (2020). Addressing the Multistakeholder Impact of Popularity Bias in Recommendation Through Calibration. arXiv.

Aboelela, S. W., Larson, E., Bakken, S., Carrasquillo, O., Formicola, A., Glied, S. A., Haas, J., and Gebbie, K. M. (2007). Defining Interdisciplinary Research: Conclusions from a Critical Review of the Literature. Health Services Research, 42(1p1):329–346.

Acerbi, A. (2016). A cultural evolution approach to digital media. Frontiers in Human Neuroscience, 10(December):636.

Acerbi, A. (2020). Cultural evolution in the digital age. Oxford University Press.

Acerbi, A. and Bentley, R. A. (2014). Biases in cultural transmission shape the turnover of popular traits. Evolution and Human Behavior, 35(3):228–236.

Acerbi, A. and Mesoudi, A. (2015). If we are all cultural Darwinians what’s the fuss about? Clarifying recent disagreements in the field of cultural evolution. Biology & Philosophy, 30(4):481–503.

Acerbi, A., Van Leeuwen, E. J., Haun, D. B., and Tennie, C. (2016). Conformity cannot be identified based on population-level signatures. Scientific Reports, 6:1–9.

Adamczyk, A., Gruenewald, J., Chermak, S. M., and Freilich, J. D. (2014). The Relation- ship Between Hate Groups and Far-Right Ideological Violence. Journal of Contemporary Criminal Justice, 30(3):310–332.

Adelman, J. S., Moyers, S. C., Farine, D. R., and Hawley, D. M. (2015). Feeder use predicts both acquisition and transmission of a contagious pathogen in a North American songbird. Proceedings of the Royal Society B: Biological Sciences, 282:20151429.

Adler, P. B., HilleRisLambers, J., and Levine, J. M. (2007). A niche for neutrality. Ecology Letters, 10(2):95–104.

161 BIBLIOGRAPHY 162

Alarc´on-Nieto, G., Graving, J. M., Klarevas-Irby, J. A., Maldonado-Chaparro, A. A., Mueller, I., and Farine, D. R. (2018). An automated barcode tracking system for be- havioural studies in birds. Methods in Ecology and Evolution, 9(6):1536–1547.

Aldrich, J. W. and Weske, J. S. (1978). Origin and Evolution of the Eastern House Finch Population. The Auk, 95(3):528–536.

Alexandraki, C. and Valsamakis, N. (2009). Enabling virtual music performance communi- ties. In Virtual Community Practices and Social Interactive Media: Technology Lifecycle and Workflow Analysis, chapter 19, pages 378–399. IGI Global, Hershey.

Allam, H. (2019). ’We Were Blindsided’: Families Of Extremists Form Group To Fight Hate. National Public Radio.

Allen, J., Weinrich, M., Hoppitt, W., and Rendell, L. (2013). Network-Based Diffusion Analysis Reveals Cultural Transmission of Lobtail Feeding in Humpback Whales. Science, 340(6131):485–488.

Allport, G. (1954). The nature of prejudice. Addison-Wesley, Cambridge, MA.

Aly, A., Macdonald, S., Jarvis, L., and Chen, T. M. (2017). Introduction to the special issue: Terrorist online propaganda and radicalization. Studies in Conflict and Terrorism, 40(1):1–9.

Amble, J. C. (2012). Combating terrorism in the new media environment. Studies in Conflict and Terrorism, 35(5):339–353.

Anderson, A., Maystre, L., Anderson, I., Mehrotra, R., and Lalmas, M. (2020). Algorithmic Effects on the Diversity of Consumption on Spotify. In Proceedings of the World Wide Web Conference, WWW 2020, volume 2, pages 2155–2165.

Anisin, A. (2018). A configurational analysis of 44 US mass shootings: 1975–2015. Interna- tional Journal of Comparative and Applied Criminal Justice, 42(1):55–73.

Aplin, L. M. (2019). Culture and cultural evolution in birds: a review of the evidence. Animal Behaviour, 147:179–187.

Aplin, L. M., Farine, D. R., Morand-Ferron, J., Cockburn, A., Thornton, A., and Sheldon, B. C. (2014). Experimentally induced innovations lead to persistent culture via conformity in wild birds. Nature, 518(7540):538–541.

Aplin, L. M., Farine, D. R., Morand-Ferron, J., and Sheldon, B. C. (2012). Social networks predict patch discovery in a wild population of songbirds. Proceedings of the Royal Society B: Biological Sciences, 279(1745):4199–4205.

Arbelaitz, O., Gurrutxaga, I., Muguerza, J., P´erez,J. M., and Perona, I. (2013). An extensive comparative study of cluster validity indices. Pattern Recognition, 46(1):243–256. BIBLIOGRAPHY 163

Ashtiani, M., Mirzaie, M., and Jafari, M. (2018). CINNA: an R/CRAN package to decipher Central Informative Nodes in Network Analysis. Bioinformatics, 35(8):1436–1437.

Avise, J. C. and Wollenberg, K. (1997). Phylogenetics and the origin of species. Proceedings of the National Academy of Sciences, 94:7748–7755.

Awan, I. (2017). Cyber-Extremism: Isis and the Power of Social Media. Society, 54(2):138– 149.

Badyaev, A. V., Hill, G. E., Stoehr, A. M., Nolan, P. M., and McGraw, K. J. (2000). The evolution of sexual size dimorphism in the House Finch. II. Population divergence in relation to local selection. Evolution, 54(6):2134–2144.

Badyaev, A. V. and Martin, T. E. (2000). Sexual dimorphism in relation to current selection in the house finch. Evolution, 54(3):987–997.

Badyaev, A. V., Young, R. L., Oh, K. P., and Addison, C. (2008). Evolution on a local scale: developmental, functional, and genetic bases of divergence in bill form and associated changes in song structure between adjacent habitats. Evolution, 62(8):1951–1964.

Bailey, M., Cao, R., Kuchler, T., Stroebel, J., and Wong, A. (2018). Social Connectedness: Measurement, Determinants, and Effects. Journal of Economic Perspectives, 32(3):259– 280.

Baker, S. and Cohen, B. M. Z. (2008). From snuggling and snogging to sampling and scratching: Girls’ nonparticipation in community-based music activities. Youth & Society, 39(3):316–339.

Baraghith, K. (2020). Investigating populations in generalized Darwinism. Biology & Phi- losophy, 35(19):1–27.

Barbastefano, R. G., Souza, C., de Sousa Costa, J. M., and Teixeira, P. M. (2015). In- fluˆenciada ambiguidade de nomes na centralidade de redes de coautoria. Transinforma¸c˜ao, 27(3):189–198.

Barnes, R. D. (1996). Blue by day and white by (k)night: regulating the political affiliations of law enforcement and military personnel. Iowa Law Rev, 81(1079).

Barrett, B. J. (2018). Equifinality in empirical studies of cultural transmission. Behavioural Processes, 161:129–138.

Barrett, B. J., Zepeda, E., Pollack, L., Munson, A., and Sih, A. (2019). Counter-culture: Does social learning help or hinder adaptive response to human-induced rapid environ- mental change? Frontiers in Ecology and Evolution, 7:1–18. BIBLIOGRAPHY 164

Barrows, F. T., Bellis, D., Krogdahl, A.,˚ Silverstein, J. T., Herman, E. M., Sealey, W. M., Rust, M. B., and Gatlin, D. M. (2008). Report of the Plant Products in Aquafeed Strategic Planning Workshop: An Integrated Interdisciplinary Research Roadmap for Increasing Utilization of Plant Feedstuffs in Diets for Carnivorous Fish. Reviews in Fisheries Science, 16(4):449–455.

Bastos, M., Mercea, D., and Baronchelli, A. (2018). The geographic embedding of online echo chambers: Evidence from the Brexit campaign. PLoS ONE, 13(11):1–16.

Bastug, M. F., Douai, A., and Akca, D. (2018). Exploring the “demand side” of online radicalization: evidence from the Canadian context. Studies in Conflict & Terrorism.

Becker, M. H. (2019). When Extremists Become Violent: Examining the Association Be- tween Social Control, Social Learning, and Engagement in Violent Extremism. Studies in Conflict and Terrorism, pages 1–21.

Belthoff, J. R. and Gauthreaux, S. A. J. (1991). Aggression and dominance in house finches. The Condor, 93:1010–1013.

Bentley, R. A., Lipo, C. P., Herzog, H. A., and Hahn, M. W. (2007). Regular rates of popular culture change reflect random copying. Evolution and Human Behavior, 28(3):151–158.

Bentley, R. A. and Shennan, S. J. (2003). Cultural Transmission and Stochastic Network Growth. American Antiquity, 68(3):459–485.

Berger, J. M. and Strathearn, B. (2013). Who Matters Online: Measuring influence, evaluat- ing content and countering violent extremism in online social networks. Technical Report March, The International Centre for the Study of Radicalisation and Political Violence, London.

Berm´udez-Cuamatzin,E., R´ıos-Chel´en,A. A., Gil, D., and Garcia, C. M. (2009). Strategies of song adaptation to urban noise in the house finch: syllable pitch plasticity or differential syllable use? Behaviour, 146(9):1269–1286.

Berm´udez-Cuamatzin,E., R´ıos-Chel´en, A. A., Gil, D., and Garcia, C. M. (2011). Experi- mental evidence for real-time song frequency shift in response to urban noise in a passerine bird. Biol Lett, 7:36–38.

Bertram, L. (2016). Terrorism, the Internet, and the Social Media Advantage: Exploring how terrorist organizations exploit aspects of the internet, social media and how these same platforms could be used to counter-violent extremism. Journal for Deradicalization, Summer(7):225–252.

Biggerstaff, M., Cauchemez, S., Reed, C., Gambhir, M., and Finelli, L. (2014). Estimates of the reproduction number for seasonal, pandemic, and zoonotic influenza: a systematic review of the literature. BMC Infectious Diseases, 14(480). BIBLIOGRAPHY 165

Bitterbaum, E. and Baptista, L. F. (1979). Geographical variation in Songs of California House Finches (Carpodacus mexicanus). The Auk, 96(3):462–474. Blackbourn, J., McGarrity, N., and Roach, K. (2019). Understanding and responding to right wing terrorism. Journal of Policing, Intelligence and Counter Terrorism, 14(3):183–190. Blum, M. G. B., Nunes, M. A., Prangle, D., and Sisson, S. A. (2013). A Comparative Re- view of Dimension Reduction Methods in Approximate Bayesian Computation. Statistical Science, 28(2):189–208. Bock, R. (2010). The give-and-take of dna: horizontal gene transfer in plants. Trends in Plant Science, 15(1):11–22. Bodiford, J. R. (2017). Sharing Sounds: Musical Innovation, Collaboration, and Ideological Expression in the Chilean Netlabel Movement. PhD thesis, University of Michigan. Bogdanov, D. and Serra, X. (2017). Quantifying music trends and facts using editorial metadata from the Discogs database. In 18th International Society for Music Information Retrieval Conference, pages 89–95, Suzhou, China. Bonn, T. (2019). Former extremists call for violent extremism to be treated as public health issue. The Hill. Bonter, D. N. and Bridge, E. S. (2011). Applications of radio frequency identification (RFID) in ornithological research: a review. Journal of Field Ornithology, 82(1):1–10. Bonter, D. N., Zuckerberg, B., Sedgwick, C. W., and Hochachka, W. M. (2013). Daily forag- ing patterns in free-living birds: exploring the predation-starvation trade-off. Proceedings of the Royal Society B: Biological Sciences, 280:20123087. Borgerhoff Mulder, M., Nunn, C. L., and Towner, M. C. (2006). Cultural macroevolution and the transmission of traits. Evolutionary Anthropology, 15(2):52–64. Bouckaert, R., Lemey, P., Dunn, M., Greenhill, S. J., Alekseyenko, A. V., Drummond, A. J., Gray, R. D., Suchard, M. A., and Atkinson, Q. D. (2012). Mapping the Origins and Expansion of the Indo-European Language Family. Science, 337:957–961. Bouckaert, R., Vaughan, T. G., Barido-Sottani, J., Duchˆene,S., Fourment, M., Gavryushk- ina, A., Heled, J., Jones, G., K¨uhnert, D., De Maio, N., Matschiner, M., Mendes, F. K., M¨uller,N. F., Ogilvie, H. A., Du Plessis, L., Popinga, A., Rambaut, A., Rasmussen, D., Siveroni, I., Suchard, M. A., Wu, C. H., Xie, D., Zhang, C., Stadler, T., and Drummond, A. J. (2019). BEAST 2.5: An advanced software platform for Bayesian evolutionary analysis. PLoS Computational Biology, 15(4):1–28. Bouhana, N., Corner, E., Gill, P., and Schuurman, B. (2018). Background and preparatory behaviours of right-wing extremist lone actors: A comparative study. Perspectives on Terrorism, 12(6):150–163. BIBLIOGRAPHY 166

Bowman-Grieve, L. (2009). Exploring stormfront: A virtual community of the radical right. Studies in Conflict and Terrorism, 32(11):989–1007.

Bown, O., Eldridge, A., and McCormack, J. (2009). Understanding interaction in con- temporary digital music: From instruments to behavioural objects. Organised Sound, 14(2):188–196.

Boyd, E. A., Berk, R. A., and M., H. K. (1996). “Motivated by hatred or prejudice”: categorization of hate-motivated crimes in two police divisions. Law Soc Rev, 30(4):819– 850.

Boyd, R., Bogerhoff-Mulder, M., Durham, W. H., and Richerson, P. J. (1997). Are cultural phylogenies possible? In Weingart, P., Mitchell, S. D., Richerson, P. J., and Maasen, S., editors, Human by nature, pages 355–386. Lawrence Erlbaum, Mahwah, NJ.

Boyd, R. and Richerson, P. J. (1985). Culture and the Evolutionary Process. The University of Chicago Press, Chicago.

Boyd, R. and Richerson, P. J. (1992). How microevolutionary processes give rise to human history. In Nitecki, M. H. and Nitecki, D. V., editors, History and Evolution, pages 179– 209. State University of New York Press, Albany, NY.

Brakes, P., Dall, S. R., Aplin, L. M., Bearhop, S., Carroll, E. L., Ciucci, P., Fishlock, V., Ford, J. K., Garland, E. C., Keith, S. A., McGregor, P. K., Mesnick, S. L., Noad, M. J., Notarbartolo di Sciara, G., Robbins, M. M., Simmonds, M. P., Spina, F., Thornton, A., Wade, P. R., Whiting, M. J., Williams, J., Rendell, L., Whitehead, H., Whiten, A., and Rutz, C. (2019). Animal cultures matter for conservation. Science, 363(6431):1032–1034.

Brand, C. O., Acerbi, A., and Mesoudi, A. (2019). Cultural evolution of emotional expression in 50 years of song lyrics. SocArXiv, pages 1–14.

Braun, R. (2011). The diffusion of racist violence in the Netherlands: Discourse and distance. Journal of Peace Research, 48(6):753–766.

Braun, R. and Koopmans, R. (2010). The diffusion of ethnic violence in Germany: The role of social similarity. European Sociological Review, 26(1):111–123.

Breitwieser, F., Gandrud, C., Allaire, J., Russell, K., and Bostock, M. (2017). sankeyD3: D3 JavaScript Sankey Graphs from R. R package version 0.3.2.

Brewer, J., Gelfand, M., Jackson, J. C., MacDonald, I. F., Peregrine, P. N., Richerson, P. J., Turchin, P., Whitehouse, H., and Wilson, D. S. (2017). Grand challenges for the study of cultural evolution. Nature Ecology & Evolution, 1(3):70.

Bridge, E. S. and Bonter, D. N. (2011). A low-cost radio frequency identification device for ornithological research. Journal of Field Ornithology, 82(1):52–59. BIBLIOGRAPHY 167

Bridge, E. S., Wilhelm, J., Pandit, M. M., Moreno, A., Curry, C. M., Pearson, T. D., Proppe, D. S., Holwerda, C., Eadie, J. M., Stair, T. F., Olson, A. C., Lyon, B. E., Branch, C. L., Pitera, A. M., Kozlovsky, D., Sonnenberg, B. R., Pravosudov, V. V., and Ruyle, J. E. (2019). An Arduino-Based RFID Platform for Animal Research. Frontiers in Ecology and Evolution, 7(July):1–10.

Briefer, E., Osiejuk, T. S., Rybak, F., and Aubin, T. (2010). Are bird song complexity and song sharing shaped by habitat structure? An information theory and statistical approach. Journal of Theoretical Biology, 262(1):151–164.

Brock, G., Pihur, V., Datta, S., and Datta, S. (2008). clValid: An R Package for Cluster Validation. Journal of Statistical Software, 25(4).

Brown, A. R. (2011). Wikipedia as a data source for political scientists: Accuracy and completeness of coverage. PS: Political Science and Politics, 44(2):339–343.

Brown, S., Savage, P. E., Ko, A. M. S., Stoneking, M., Ko, Y. C., Loo, J. H., and Trejaut, J. A. (2013). Correlations in the population structure of music, genes and language. Proceedings of the Royal Society B: Biological Sciences, 281(1774).

Buhaug, H. and Gleditsch, K. S. (2008). Contagion or confusion? Why conflicts cluster in space. International Studies Quarterly, 52(2):215–233.

Burnell, K. (1998). Cultural variation in savannah sparrow, Passerculus sandwichensis, songs: an analysis using the meme concept. Animal Behaviour, 56:995–1003.

Burnham, K. P. and Anderson, D. R. (2002). Model selection and multimodel inference: a practical information-theoretic approach. Springer, 2 edition.

Byers, B. E., Belinsky, K. L., and Bentley, R. A. (2010). Independent cultural evolution of two song traditions in the chestnut-sided warbler. The American Naturalist, 176(4):476– 489.

Cabrera, F. (2017). Cladistic Parsimony, Historical Linguistics and Cultural Phylogenetics. Mind and Language, 32(1):65–100.

Caldwell, C. A., Cornish, H., and Kandler, A. (2016). Identifying innovation in laboratory studies of cultural evolution: rates of retention and measures of adaptation. Philosophical Transactions of the Royal Society B, 371(1690).

Capellan, J. A. (2015). Lone Wolf Terrorist or Deranged Shooter? A Study of Ideological Active Shooter Events in the United States, 1970–2014. Studies in Conflict and Terrorism, 38(6):395–413.

Caro, T. and Sherman, P. W. (2012). Vanishing behaviors. Conservation Letters, 5(3):159– 166. BIBLIOGRAPHY 168

Carrignon, S., Bentley, R. A., and Ruck, D. (2019). Modelling rapid online cultural transmis- sion: evaluating neutral models on Twitter data with approximate Bayesian computation. Palgrave Communications, 5(83).

Castillo-Davis, C. I. and Hartl, D. L. (2003). GeneMerge—post-genomic analysis, data mining, and hypothesis testing. Bioinformatics, 19(7):891–892.

Cavalli-Sforza, L. L. and Feldman, M. W. (1981a). Cultural Transmission and Evolution: A Quantitative Approach. Princeton University Press, Princeton, NJ.

Cavalli-Sforza, L. L. and Feldman, M. W. (1981b). Cultural Transmission and Evolution: A Quantitative Approach. Princeton University Press, Princeton, NJ.

Cherif, A., Yoshioka, H., Ni, W., and Bose, P. (2009). Terrorism: Mechanisms of Radical- ization Processes, Control of Contagion and Counter-Terrorist Measures. pages 1–14.

Chmura, H. E., Glass, T. W., and Williams, C. T. (2018). Biologging physiological and ecological responses to climatic variation: New tools for the climate change era. Frontiers in Ecology and Evolution, 6:92.

Ciaburri, I. and Williams, H. (2019). Context-dependent variation of house finch song syntax. Animal Behaviour, 147:33–42.

Claidi`ere,N. and Andr´e,J.-B. (2011). The Transmission of Genes and Culture: A Question- able Analogy. Evolutionary Biology, 39(1):12–24.

Clarivate Analytics (2017). Web of Science Fact Book. Technical report.

Clark, N. J. and Dixon, P. M. (2018). Modeling and estimation for self-exciting spatio- temporal models of terrorist activity. Annals of Applied Statistics, 12(1):633–653.

Clauset, A., Newman, M. E. J., and Moore, C. (2004). Finding community structure in very large networks. Physical Review E, 70(6).

Clayton, N. S. (1987). Song tutor choice in zebra finches. Animal Behaviour, 35(3):714–721.

Cochrane, E. E. and Lipo, C. P. (2010). Phylogenetic analyses of Lapita decoration do not support branching evolution or regional population structure during colonization of remote oceania. Philosophical Transactions of the Royal Society B: Biological Sciences, 365(1559):3889–3902.

Collins, B., Hoang, D. T., Yoon, H. J., Nguyen, N. T., and Hwang, D. (2020). A Survey on Forecasting Models for Preventing Terrorism. In Advanced Computational Methods for Knowledge Engineering, pages 323–334. Springer International Publishing.

Collins, N., Schedel, M., and Wilson, S. (2013). Electronic Music. Cambridge Introductions to Music. Cambridge University Press. BIBLIOGRAPHY 169

Collins, S. (2007). Amen to that. M/C Journal, 10(2).

Costello, M. and Hawdon, J. (2018). Who Are the Online Extremists among Us? Sociodemo- graphic Characteristics, Social Networking, and Online Experiences of Those Who Produce Online Hate Materials. Violence and Gender, 5(1):55–60.

Crall, J. D., Gravish, N., Mountcastle, A. M., and Combes, S. A. (2015). Beetag: A low- cost, image-based tracking system for the study of animal behavior and locomotion. PLOS ONE, 10(9):1–13.

Crauwels, K. (2016).

Crema, E. R., Edinborough, K., Kerig, T., and Shennan, S. J. (2014). An Approximate Bayesian Computation approach for inferring patterns of cultural evolutionary change. Journal of Archaeological Science, 50:160–170.

Crema, E. R., Kandler, A., and Shennan, S. (2016). Revealing patterns of cultural trans- mission from frequency data: Equilibrium and non-equilibrium assumptions. Scientific Reports, 6(December):1–10.

Crewe, L. and Wang, A. (2018). Gender inequalities in the City of London advertising industry. Environment and Planning A: Economy and Space, 50(3):671–688.

Crisp, A., Boschetti, C., Perry, M., Tunnacliffe, A., and Micklem, G. (2015). Expression of multiple horizontally acquired genes is a hallmark of both vertebrate and invertebrate genomes. Genome Biology, 16(50).

Csardi, G. and Nepusz, T. (2006). The igraph software package for complex network research. InterJournal, Complex Systems:1695.

Csillery, K., Francois, O., and Blum, M. G. B. (2012). abc: an r package for approximate bayesian computation (abc). Methods in Ecology and Evolution.

Cui, Y. and Wang, J. (2019). Impact of Dimension and Sample Size on the Performance of Imputation Methods. In He, J., Yu, P. S., Shi, Y., Li, X., Xie, Z., Huang, G., Cao, J., and Xiao, F., editors, Communications in Computer and Information Science, pages 538–549, Ningbo, China. Springer.

Currie, T. E., Greenhill, S. J., and MacE, R. (2010). Is horizontal transmission really a problem for phylogenetic comparative methods? A simulation study using continuous cultural traits. Philosophical Transactions of the Royal Society B: Biological Sciences, 365(1559):3903–3912.

Currie, T. E., Meade, A., Guillon, M., and Mace, R. (2013). Cultural phylogeography of the bantu languages of sub-saharan africa. Proceedings of the Royal Society B, 280:20130695. BIBLIOGRAPHY 170

Cutler, A., Cutler, D. R., and Stevens, J. R. (2012). Random Forests. In Zhang, C. and Ma, Y., editors, Ensemble Machine Learning: Methods and Applications, chapter 5, pages 157–176. Springer, New York.

Dabi, Y., Darrigues, L., Katsahian, S., Azoulay, D., Antonio, M. D., and Lazzati, A. (2016). Publication Trends in Bariatric Surgery: a Bibliometric Study. Obesity Surgery, 26(11):2691–2699. das Neves Machado, R., Vargas-Quesada, B., and Leta, J. (2015). Intellectual structure in stem cell research: exploring Brazilian scientific articles from 2001 to 2010. Scientometrics, 106(2):525–537.

Davey, J. and Ebner, J. (2019). ’The Great Replacement’: The Violent Consequences of Mainstreamed Extremism. Technical report, Institute for Strategic Dialogue, London.

Davis, J. I. (1996). Phylogenetics, molecular variation, and species concepts: Species biology plays a critical role in modern phylogenetics. BioScience, 46(7):502–511.

Dean, G., Bell, P., and Newman, J. (2012). The Dark Side of Social Media: Review of Online Terrorism. Pakistan Journal of Criminology, 3(3):107–126.

Dean, M. D. and Ballard, J. W. O. (2004). Linking phylogenetics with population genetics to reconstruct the geographic origin of a species. Molecular Phylogenetics and Evolution, 32(3):998–1009.

DeFoster, R. and Swalve, N. (2018). Guns, culture or mental health? Framing mass shootings as a public health crisis. Health Communication, 33(10):1211–1222.

Derex, M. and Boyd, R. (2016). Partial connectivity increases cultural accumulation within groups. Proceedings of the National Academy of Sciences, 113(11):2982–2987.

Derex, M., Perreault, C., and Boyd, R. (2018). Divide and conquer: intermediate levels of population fragmentation maximize cultural accumulation. Philosophical Transactions of the Royal Society B: Biological Sciences, 373(1743):20170062.

D´ıaz-B´a˜nez,J. M., Farigu, G., Francisco, G., Rappaport, D., and Toussaint, G. T. (2004). El Comp´asFlamenco: A Phylogenetic Analysis. In Proceedings of BRIDGES: Mathematical Connections in Art, Music and Science, pages 61–70.

DiIulio, J. J. (1996). Help wanted: economists, crime and public policy. J Econ Perspect, 10(1):3–24.

DiMaggio, P., Hargittai, E., Neuman, W. R., and Robinson, J. P. (2001). Social Implications of the Internet. Annual Review of Sociology, 27:307–336. BIBLIOGRAPHY 171

Doolittle, W. F. (2009). The practice of classification and the theory of evolution, and what the demise of charles darwin’s tree of life hypothesis means for both of them. Philosophical Transactions of the Royal Society B: Biological Sciences, 364(1527):2221–2228.

Driscoll, C. (2017). The Evolutionary Culture Concepts. Philosophy of Science, 84(1):35–55.

Duda, P. and Zrzav´y,J. (2016). Human population history revealed by a supertree approach. Scientific Reports, 6:29890.

Dunn, M. (2004). Language phylogenies. In The Routledge Handbook of Historical Linguis- tics, chapter 7, pages 190–211.

Durham, W. (1992). Applications of evolutionary culture theory. Annual Review of Anthro- pology, 21(1):331–355.

Durham, W. H. (1990). Advances in evolutionary culture theory. Annual Review of Anthro- pology, 19(1):187–210.

Durso, R. M. and Jacobs, D. (2013). The Determinants of the Number of White Supremacist Groups: A Pooled Time-Series Analysis. Social Problems, 60(1):128–144.

Easterbrook, S. M. (2014). Open code for open science? Nature Geoscience, 7(11):779–781.

Ebadi, A. and Schiffauerova, A. (2015a). How to become an important player in scientific collaboration networks? Journal of Informetrics, 9(4):809–825.

Ebadi, A. and Schiffauerova, A. (2015b). How to become an important player in scientific collaboration networks? Journal of Informetrics, 9(4):809–825.

Ebadi, A. and Schiffauerova, A. (2015c). On the relation between the small world structure and scientific activities. PLOS ONE, 10(3):1–19.

Ebare, S. (2003). The Computerization of Practice in Peripheral Music Communities. PhD thesis, Simon Fraser University.

Edwards, S. V., Potter, S., Schmitt, C. J., Bragg, J. G., and Moritz, C. (2016). Reticulation, divergence, and the phylogeography-phylogenetics continuum. Proceedings of the National Academy of Sciences, 113(29):8025–8032.

Elliott, J. J. and Arbib, R. S. J. (1953). Origin and status of the House Finch in the eastern United States. Auk, 70:31–37.

Ellis, B. H. and Abdi, S. (2017). Building community resilience to violent extremism through genuine partnerships. American Psychologist, 72(3):289–300.

Ellis, B. K., Hwang, H., Savage, P. E., Pan, B. Y., Cohen, A. J., and Brown, S. (2018). Identifying style-types in a sample of musical improvisations using dimensional reduction and cluster analysis. Psychology of Aesthetics, Creativity, and the Arts, 12(1):110–122. BIBLIOGRAPHY 172

Ercan, S. A. (2017). Engaging with extremism in a multicultural society: A deliberative democratic approach. Journal of Peacebuilding and Development, 12(2):9–21.

Eriksson, K., Enquist, M., and Ghirlanda, S. (2008). Critical points in current theory of conformist social learning. Journal of Evolutionary Psychology, 5(1):67–87.

Espiritu, A. (2004). Racial diversity and hate crime incidents. Social Science Journal, 41(2):197–208.

Evans, J. C., Devost, I., Jones, T. B., and Morand-Ferron, J. (2018). Inferring dominance interactions from automatically recorded temporal data. Ethology, 124(3):188–195.

Evans, T. S. and Giometto, A. (2011). Turnover Rate of Popularity Charts in Neutral Models.

Farine, D. R. (2013). Animal social network inference and permutations for ecologists in R using asnipe. Methods in Ecology and Evolution, 4(12):1187–1194.

Farine, D. R. and Sanchez-Tojar, A. (2019). aniDom: Inferring Dominance Hierarchies and Estimating Uncertainty. R package version 0.1.4.

Fern´andez-Juricic,E., Poston, R., De Collibus, K., Morgan, T., Bastain, B., Martin, C., Jones, K., and Treminio, R. (2005). Microhabitat Selection and Singing Behavior Patterns of Male House Finches (Carpodacus mexicanus) in Urban Parks in a Heavily Urbanized Landscape in the Western U.S. Urban Habitats, 3(1):49–69.

Ferrara, E. (2017). Contagion dynamics of extremist propaganda in social networks. Infor- mation Sciences, 418-419:1–12.

Figg, W. D., Dunn, L., Liewehr, D. J., Steinberg, S. M., Thurman, P. W., Barrett, J. C., and Birkinshaw, J. (2006). Scientific Collaboration Results in Higher Citation Rates of Published Articles. Pharmacotherapy, 26(6):759–767.

Florida, R. (2011). The geography of hate. The Atlantic.

Florida, R. and Jackson, S. (2010). Sonic city: The evolving economic geography of the music industry. Journal of Planning Education and Research, 29(3):310–321.

Formilan, G. and Stark, D. (2020). Underground testing: Name-altering practices as probes in electronic music. British Journal of Sociology, pages 1–18.

Fortin, J.-M. and Currie, D. J. (2013). Big Science vs. Little Science: How Scientific Impact Scales with Funding. PLOS ONE, 8(6):e65263.

Fraiberger, S. P., Sinatra, R., Resch, M., Riedl, C., and Barab´asi,A.-L. (2018). Quantifying reputation and success in art. Science, 362(6416):825–829. BIBLIOGRAPHY 173

Frane, A. V. (2017). Swing Rhythm in Classic Drum Breaks From Hip-Hop’s Breakbeat Canon. Music Perception, 34(3):291–302.

Franz, M. and Nunn, C. L. (2009). Network-based diffusion analysis: a new method for detecting social learning. Proceedings of the Royal Society B: Biological Sciences, 276(1663):1829–1836.

Freeberg, T. M. (2000). Culture and courtship in vertebrates: a review of social learning and transmission of courtship systems and mating patterns. Behavioural Processes, 51(1):177– 192.

Funke, M., Schularick, M., and Trebesch, C. (2016). Going to extremes: Politics after financial crises, 1870–2014. European Economic Review, 88(2011):227–260.

Gagen, J. (2019). Hybrids and fragments: music, genre, culture and technology. PhD thesis, Goldsmiths, University of London.

Gale, L. R., Heath, W. C., and Ressler, R. W. (2002). An Economic Analysis of Hate Crime. Eastern Economic Journal, 28(2):203–216.

Garcia-Bernardo, J., Qi, H., Shultz, J. M., Cohen, A. M., Johnson, N. F., and Dodds, P. S. (2015). Social media affects the timing, location, and severity of school shootings.

Garg, R., Smith, M. D., and Telang, R. (2011). Measuring Information Diffusion in an Online Community. Journal of Management Information Systems, 28(2):11–38.

Garland, E. C., Goldizen, A. W., Lilley, M. S., Rekdahl, M. L., Garrigue, C., Constantine, R., Hauser, N. D., Poole, M. M., Robbins, J., and Noad, M. J. (2015). Population structure of humpback whales in the western and central South Pacific Ocean as determined by vocal exchange among populations. Conservation Biology, 29(4):1198–1207.

Garland, E. C. and McGregor, P. K. (2020). Cultural Transmission, Evolution, and Revo- lution in Vocal Displays: Insights From Bird and Whale Song. Frontiers in Psychology, 11.

Gelfand, M., Shteynberg, G., Lee, T., Lun, J., Lyons, S., Bell, C., Chiao, J. Y., Bruss, C. B., Dabbagh, M. A., Aycan, Z., Abdel-Latif, A. H., Dagher, M., Khashan, H., and Soomro, N. (2012). The cultural contagion of conflict. Philosophical Transactions of the Royal Society B: Biological Sciences, 367(1589):692–703.

Giani, M. and Me´on,P. G. (2019). Global Racist Contagion following Donald Trump’s Election. British Journal of Political Science.

Gibbs, H. L. (1990). Cultural evolution of male song types in Darwin’s medium ground finches, Geospiza fortis. Animal Behaviour, 39:253–263. BIBLIOGRAPHY 174

Gibson, M. A. and Lawson, D. W. (2015). Applying evolutionary anthropology. Evolutionary Anthropology: Issues News, and Reviews, 24(1):3–14.

Gill, P., Corner, E., Conway, M., Thornton, A., Bloom, M., and J, H. (2017). Terrorist use of the internet by the numbers: quantifying behaviors, patterns, and processes. Criminol Public Policy, 16(1):99–117.

Giraudeau, M., Nolan, P. M., Black, C. E., Earl, S. R., Hasegawa, M., and McGraw, K. J. (2014). Song characteristics track bill morphology along a gradient of urbanization in house finches (Haemorhous mexicanus). Front Zool, 11(1):83.

Gladfelter, A. S., Lantz, B., and Ruback, R. B. (2017). The Complexity of Hate Crime and Bias Activity: Variation across Contexts and Types of Bias. Justice Quarterly, 34(1):55– 83.

Glass, C. and Cook, A. (2017). Do Women Leaders Promote Positive Change? Analyzing the Effect of Gender on Business Practices and Diversity Initiatives. Human Resource Management, pages 1–15.

Gl¨anzel,W. and Thijs, B. (2011). Using ’core documents’ for detecting and labelling new emerging topics. Scientometrics, 91(2):399–416.

Gl¨anzel,W. and Thijs, B. (2017). Using hybrid methods and ’core documents’ for the representation of clusters and topics: the astronomy dataset. Scientometrics, 111(2):1071– 1087.

Goetz, S. J., Rupasingha, A., and Loveridge, S. (2012). Social Capital, Religion, Wal-Mart, and Hate Groups in America. Social Science Quarterly, 93(2):379–393.

Gonz´alez-Bail´on,S., Borge-Holthoefer, J., Rivero, A., and Moreno, Y. (2011). The dynamics of protest recruitment through an online network. Scientific Reports, 1:1–7.

Goodrich, B., Gabry, J., Ali, I., and Brilleman, S. (2020). rstanarm: Bayesian applied regression modeling via Stan. R package version 2.19.3.

Gould, S. J. (1991). Bully for Brontosaurus. W. W. Norton, New York.

Graham, S. (2016). Global and Local Underground/Fringe Scenes. In Sounds of the Under- ground, pages 28–50. University of Michigan Press.

Graves, W. R. and Schrader, J. A. (2008). At the interface of phylogenetics and population genetics, the phylogeography of dirca occidentalis (thymelaeaceae). American Journal of Botany, 95(11):1454–1465.

Gray, R. D., Bryant, D., and Greenhill, S. J. (2010). On the shape and fabric of human history. Philosophical Transactions of the Royal Society B, 365(1559):3923–3933. BIBLIOGRAPHY 175

Gray, R. D., Drummond, A. J., and Greenhill, S. J. (2009). Language phylogenies reveal expansion pulses and pauses in pacific settlement. Science, 323(5913):479–483. Green, D. P., Glaser, J., and Rich, A. (1998). From lynching to gay bashing: The elusive connection between economic conditions and hate crime. Journal of Personality and Social Psychology, 75(1):82–92. Greenhill, S. J., Currie, T. E., and Gray, R. D. (2009). Does horizontal transmission in- validate cultural phylogenies? Proceedings of the Royal Society B: Biological Sciences, 276(1665):2299–2306. Greggor, A. L., Thornton, A., and Clayton, N. S. (2017). Harnessing learning biases is essential for applying social learning in conservation. Behavioral Ecology and Sociobiology, 71(1). Grindstaff, B., Mabry, M. E., Blischak, P. D., Quinn, M., and Chris Pires, J. (2019). Af- fordable remote monitoring of plant growth in facilities using Raspberry Pi computers. Applications in Plant Sciences, 7(8):1–8. Gr¨un,B. and Hornik, K. (2011). topicmodels: An R package for fitting topic models. Journal of Statistical Software, 40(13):1–30. Guilbeault, D., Becker, J., and Centola, D. (2018). Complex Contagions: A Decade in Review. In Complex Spreading Phenomena in Social Systems, chapter 1, pages 3–25. Springer. Gunaratna, R., Jerard, J., and Nasir, S. M. (2013). Countering Extremism: Building Social Resilience through Community Engagement, volume 1. Imperial College Press, London, UK. Gyles, C. and Boerlin, P. (2014). Horizontally transferred genetic elements and their role in pathogenesis of bacterial disease. Veterinary Pathology, 51(2):328–340. Hahn, M. W. (2008). Toward a selection theory of molecular evolution. Evolution, 62(2):255– 265. Hawley, D. M., Hanley, D., Dhondt, A. A., and Lovette, I. J. (2006). Molecular evidence for a founder effect in invasive house finch (Carpodacus mexicanus) populations experiencing an emergent disease epidemic. Molecular Ecology, 15(1):263–275. Heggarty, P., Maguire, W., and McMahon, A. (2010). Splits or waves? Trees or webs? How divergence measures and network analysis can unravel language histories. Philosophical Transactions of the Royal Society B: Biological Sciences, 365(1559):3829–3843. Helfgott, J. G. (2015). Criminal behavior and the copycat effect: Literature review and theoretical framework for empirical investigation. Aggression and Violent Behavior, 22:46– 64. BIBLIOGRAPHY 176

Henrich, J. and Boyd, R. (1998). The Evolution of Conformist Transmission and the Emer- gence of Between-Group Differences. Evolution and Human Behavior, 19:215–241.

Henrich, J., Boyd, R., and Richerson, P. J. (2008). Five Misunderstandings About Cultural Evolution. Human Nature, 19(2):119–137.

Hern, A. (2020). Spotify to let artists promote music for cut in royalty rates.

Hernandez, A. M. and MacDougall-Shackleton, S. A. (2004). Effects of early song experience on song preferences and song control and auditory brain regions in female house finches (Carpodacus mexicanus). Journal of Neurobiology, 59(2):247–258.

Hernandez, A. M., Phillmore, L. S., and MacDougall-Shackleton, S. A. (2008). Effects of learning on song preferences and Zenk expression in female songbirds. Behavioural Processes, 77(2):278–284.

Hesmondhalgh, D. (1998). The British Dance Music Industry: A Case Study of Independent Cultural Production. The British Journal of Sociology, 49(2):234–251.

Hey, J. (1994). Bridging phylogenetics and population genetics with gene tree models, pages 435–449. Birkh¨auserBasel, Basel.

Hill, G. E. (2002). A Red Bird in a Brown Bag. Oxford University Press, Inc., New York.

Hill, S. D. and Pawley, M. D. (2019). Reduced song complexity in founder populations of a widely distributed songbird. Ibis, 161(2):435–440.

Hirsch, J. E. (2005). An index to quantify an individual’s scientific research output. Pro- ceedings of the National Academy of Sciences, 102(46):16569–16572.

Hobaiter, C., Poisot, T., Zuberb¨uhler,K., Hoppitt, W., and Gruber, T. (2014). Social Network Analysis Shows Direct Evidence for Social Transmission of Tool Use in Wild Chimpanzees. PLoS Biology, 12(9):e1001960.

Hockman, J. A. and Davies, M. E. P. (2015). Computational Strategies for Breakbeat Classification and Resequencing in Hardcore, Jungle and Drum & Bass. In Proc. of the 18th Int. Conference on Digital Audio Effects (DAFx-15).

Hodge, E. and Hallgrimsdottir, H. (2019). Networks of Hate: The Alt-right, “Troll Culture”, and the Cultural Geography of Social Movement Spaces Online. Journal of Borderlands Studies, pages 1–18.

Hoffman, A. J., Alamilla, S., and Liang, B. (2018). The Role of Community Development in Reducing Extremism and Ethnic Conflict. Palgrave Macmillan, Cham, Switzerland.

Holland, J., McGregor, P. K., and Rowe, C. L. (1996). Changes in Microgeographic Song Variation of the Corn Bunting Miliaria calandra. Journal of Avian Biology, 27(1):47. BIBLIOGRAPHY 177

Holt, T. J., Freilich, J. D., and Chermak, S. M. (2016). Internet-Based Radicalization as Enculturation to Violent Deviant Subcultures. Deviant Behavior, 38(8):855–869.

Holt, T. J., Freilich, J. D., Chermak, S. M., Mills, C., and Silva, J. (2019). Loners, Colleagues, or Peers? Assessing the Social Organization of Radicalization. American Journal of Criminal Justice, 44(1):83–105.

Hoppitt, W. (2017). The conceptual foundations of Network-Based Diffusion Analysis: choosing networks and interpreting results. Philosophical Transactions of the Royal Society B, 372.

Hoppitt, W., Boogert, N. J., and Laland, K. N. (2010). Detecting social transmission in networks. Journal of Theoretical Biology, 263(4):544–555.

Hoppitt, W. and Laland, K. N. (2008). Social Processes Influencing Learning in Animals: A Review of the Evidence. In Advances in the Study of Behavior, volume 38, pages 105–160.

Howe, C. J. and Windram, H. F. (2011). Phylomemetics—Evolutionary Analysis beyond the Gene. PLoS Biology, 9(5):e1001069.

Huai, C. and Chai, L. (2016). A bibliometric analysis on the performance and underlying dynamic patterns of water security research. Scientometrics, 108(3):1531–1551.

Humphries, M. D. and Gurney, K. (2008). Network ‘small-world-ness’: A quantitative method for determining canonical network equivalence. PLOS ONE, 3(4):1–10.

Hunter, D. R., Goodreau, S. M., and Handcock, M. S. (2008). Goodness of Fit of Social Network Models. Journal of the American Statistical Association, 103(481):248–258.

Hutson, S. R. (2000). The Rave: Spiritual Healing in Modern Western Subcultures. Anthro- pological Quarterly, 73(1):35–49.

Ibarra, V., Araya-salas, M., Tang, Y.-p., Park, C., Hyde, A., Wright, T. F., and Tang, W. (2015). An RFID Based Smart Feeder for Hummingbirds. Sensors, 15(November):31751– 31761.

Isaksson, S., Funcke, A., Envall, I., Enquist, M., and Lindenfors, P. (2015). A Novel Method to Analyze Social Transmission in Chronologically Sequenced Assemblages, Implemented on Cultural Inheritance of the Art of Cooking. PLOS ONE, 10(5).

Ishkur (2000).

Jabot, F., Faure, T., and Dumoulin, N. (2013). Easyabc: performing efficient approxi- mate bayesian computation sampling schemes using r. Methods in Ecology and Evolution, 4(7):684–687. BIBLIOGRAPHY 178

Jacobs, J. A. and Frickel, S. (2009). Interdisciplinarity: A Critical Assessment. Annual Review of Sociology, 35(1):43–65.

Jadidi, M., Karimi, F., Lietz, H., and Wagner, C. (2018). Gender Disparities in Science? Dropout, Productivity, Collaborations and Success of Male and Female Computer Scien- tists. Advances in Complex Systems, 21.

Janosov, M., Musciotto, F., Battiston, F., and I˜niguez,G. (2020). Elites, communities and the limited benefits of mentorship in electronic music. Scientific Reports, 10(1):3136.

Jasko, K., LaFree, G., and Kruglanski, A. (2017). Quest for Significance and Violent Ex- tremism: The Case of Domestic Radicalization. Political Psychology, 38(5):815–831.

Jefferson, P. N. and Pryor, F. L. (1999). On the geography of hate. Economics Letters, 65(3):389–395.

Jenkins, P. F. (1977). Cultural transmission of song patterns and dialect development in a free-living bird population. Animal Behaviour, 25:50–78.

Jensen, M. A., Atwell Seate, A., and James, P. A. (2018). Radicalization to Violence: A Pathway Approach to Studying Extremism. Terrorism and Political Violence.

Johnson, S. D. and Braithwaite, A. (2017). Spatial and Temporal Analysis of Terrorism and Insurgency. In The Handbook of the Criminology of Terrorism, chapter 15, pages 232–243.

Johnson, V. B. (2019). KKK in the PD: white supremacist police and what to do about it. Lewis & Clark Law Review, 23(1):205–261.

Johnston, M. (2019). How “fans also like” works.

Jolles, J. W., Laskowski, K. L., Boogert, N. J., and Manica, A. (2018). Repeatable group differences in the collective behaviour of stickleback shoals across ecological contexts. Pro- ceedings of the Royal Society B: Biological Sciences, 285(1872):13–16.

Jordan, P. and Shennan, S. (2003). Cultural transmission, language, and basketry traditions amongst the California Indians. Journal of Anthropological Archaeology, 22(1):42–74.

Joslyn, M. R. and Haider-Markel, D. P. (2017). Gun Ownership and Self-Serving Attributions for Mass Shooting Tragedies. Social Science Quarterly, 98(2):429–442.

Jost, L. (2006). Entropy and diversity. Oikos, 113(2):363–375.

Ju, C., Geller, F. C., Mundinger, P. C., and Lahti, D. C. (2019). Four decades of cultural evolution in House Finch songs. The Auk: Ornithological Advances, 136:1–18.

Kahle, D. and Wickham, H. (2013). ggmap: Spatial visualization with ggplot2. The R Journal, 5(1):144–161. BIBLIOGRAPHY 179

Kandler, A. and Crema, E. R. (2018). Analysing cultural frequency data: neutral theory and beyond.

Kandler, A. and Crema, E. R. (2019). Analysing Cultural Frequency Data: Neutral Theory and Beyond, pages 83–108. Springer International Publishing, Cham.

Kandler, A. and Laland, K. N. (2009). An investigation of the relationship between innova- tion and cultural diversity. Theoretical Population Biology, 76(1):59–67.

Kandler, A. and Powell, A. (2018a). Generative inference for cultural evolution. Philosophical Transactions of the Royal Society B: Biological Sciences, 373(1743).

Kandler, A. and Powell, A. (2018b). Generative inference for cultural evolution. Philosophical Transactions of the Royal Society B: Biological Sciences, 373(1743).

Kandler, A. and Shennan, S. (2015a). A generative inference framework for analysing pat- terns of cultural change in sparse population data with evidence for fashion trends in LBK culture. Journal of the Royal Society Interface, 12.

Kandler, A. and Shennan, S. (2015b). A generative inference framework for analysing pat- terns of cultural change in sparse population data with evidence for fashion trends in LBK culture. Journal of the Royal Society Interface, 12.

Kaplan, J., L¨o¨ow,H., and Malkki, L. (2014). Introduction to the Special Issue on Lone Wolf and Autonomous Cell Terrorism. Terrorism and Political Violence, 26(1):1–12.

Kauhanen, H., Gopal, D., Galla, T., and Berm´udez-Otero,R. (2021). Geospatial distribu- tions reflect temperatures of linguistic features. Science Advances, 7(1).

Kavanaugh, P. R. and Anderson, T. L. (2008). Solidarity and drug use in the electronic dance music scene. Sociological Quarterly, 49(1):181–208.

Keith, S. A. and Bull, J. W. (2017). Animal culture impacts species’ capacity to realise climate-driven range shifts. Ecography, 40(2):296–304.

Klein, B. R., Gruenewald, J., and Smith, B. L. (2017). Opportunity, Group Structure, Temporal Patterns, and Successful Outcomes of Far-Right Terrorism Incidents in the United States. Crime and Delinquency, 63(10):1224–1249.

Klement, B. and Strambach, S. (2019). How do new music genres emerge? Diversification processes in symbolic knowledge bases. Regional Studies, 53(10):1447–1458.

Klimek, P., Kreuzbauer, R., and Thurner, S. (2019). Fashion and art cycles are driven by counter-dominance signals of elite competition: Quantitative evidence from music styles. Journal of the Royal Society Interface, 16(151). BIBLIOGRAPHY 180

Knopper, S. (2009). Appetite for Self-Destruction: The Spectacular Crash of the Record Industry in the Digital Age. Free Press, New York, NY.

Knox, E. G. and Bartlett, M. S. (1964). The detection of space-time interactions. Journal of the Royal Statistical Society. Series C (Applied Statistics), 13(1):25–30.

Kolodny, O., Feldman, M. W., and Creanza, N. (2018). Integrative studies of cultural evolu- tion: crossing disciplinary boundaries to produce new insights. Philosophical Transactions of the Royal Society B: Biological Sciences, 373(1743):20170048.

Kretschmer, H. and Rousseau, R. (2001). Author inflation leads to a breakdown of Lotka’s law. Journal of the American Society for Information Science and Technology, 52(8):610– 614.

Krivitsky, P. N. and Handcock, M. S. (2010). A separable model for dynamic networks. Journal of the Royal Statistical Society, 76(1):29–46.

Kruse, H. (2010). Local Identity and Independent Music Scenes, Online and Off. Popular Music and Society, 33(5):625–639.

Kr¨aenbring, J., Monzon Penza, T., Gutmann, J., Muehlich, S., Zolk, O., Wojnowski, L., Maas, R., Engelhardt, S., and Sarikas, A. (2014). Accuracy and completeness of drug information in wikipedia: A comparison with standard textbooks of pharmacology. PLOS ONE, 9(9):1–7.

Kwon, R. and Cabrera, J. F. (2017). Socioeconomic factors and mass shootings in the United States. Critical Public Health.

Kwon, R. and Cabrera, J. F. (2019a). Income inequality and mass shootings in the United States. BMC Public Health, 19(1147).

Kwon, R. and Cabrera, J. F. (2019b). Social integration and mass shootings in U.S. counties. Journal of Crime and Justice, 42(2):121–139.

Lachlan, R. F., Ratmann, O., and Nowicki, S. (2018). Cultural conformity generates ex- tremely stable traditions in bird song. Nature Communications, 9.

LaFree, G. and Bersani, B. E. (2014). County-level correlates of terrorist attacks in the United States. Criminology and Public Policy, 13(3):455–481.

LaFree, G. and Dugan, L. (2009). Tracking Global Terrorism Trends, 1970-2004. In To Protect and To Serve: Policing in an Age of Terrorism, chapter 3, pages 43–80. Springer.

LaFree, G., Dugan, L., Xie, M., and Singh, P. (2012). Spatial and Temporal Patterns of Terrorist Attacks by ETA 1970 to 2007. Journal of Quantitative Criminology, 28(1):7–29. BIBLIOGRAPHY 181

Lahti, D. C., Moseley, D. L., and Podos, J. (2011). A Tradeoff Between Performance and Accuracy in Bird Song Learning. Ethology, 117(9):802–811.

Laiolo, P. (2008). Characterizing the spatial structure of songbird cultures. Ecological Ap- plications, 18(7):1774–1780.

Laiolo, P. and Jovani, R. (2007). The emergence of animal culture conservation. Trends in Ecology and Evolution, 22(1):5.

Laiolo, P. and Tella, J. L. (2005). Habitat fragmentation affects culture transmission: pat- terns of song matching in Dupont’s lark. Journal of Applied Ecology, 42(6):1183–1193.

Laiolo, P. and Tella, J. L. (2007). Erosion of animal cultures in fragmented landscapes. Frontiers in Ecology and the Environment, 5(2):68–72.

Laiolo, P., V¨ogeli,M., Serrano, D., and Tella, J. L. (2008). Song Diversity Predicts the Viability of Fragmented Bird Populations. PLOS ONE, 3(3):e1822.

Laland, K., Uller, T., Feldman, M., Sterelny, K., M¨uller,G. B., Moczek, A., Jablonka, E., Odling-Smee, J., Wray, G. A., Hoekstra, H. E., Futuyma, D. J., Lenski, R. E., Mackay, T. F. C., Schluter, D., and Strassmann, J. E. (2014). Does evolutionary theory need a rethink? Nature, 514(7521):161–164.

Laland, K. N. and Kendal, J. R. (2003). What the models say about social learning. In Fragaszy, D. M. and Perry, S., editors, The Biology of Traditions: Models and Evidence, chapter 2, pages 33–55. Cambridge University Press, Cambridge.

Lambiotte, R. and Ausloos, M. (2005). Uncovering collective listening habits and music genres in bipartite networks. Physical Review E - Statistical, Nonlinear, and Soft Matter Physics, 72(6):1–11.

Lamichhaney, S., Han, F., Webster, M. T., Andersson, L., Grant, B. R., and Grant, P. R. (2018). Rapid hybrid speciation in darwin’s finches. Science, 359(6372):224–228.

Landherr, A., Friedl, B., and Heidemann, J. (2010). A Critical Review of Centrality Measures in Social Networks. Business & Information Systems Engineering, 2(6):371–385.

Landry, R., Traore, N., and Godin, B. (1996). An econometric analysis of the effect of collaboration on academic research productivity. Higher Education, 32(3):283–301.

Langfelder, P., Zhang, B., and Horvath, S. (2016). dynamicTreeCut: Methods for Detection of Clusters in Hierarchical Clustering Dendrograms. R package version 1.63-1.

Larivi`ere,V., Gingras, Y., Sugimoto, C. R., and Tsou, A. (2014). Team size matters: Collaboration and scientific impact since 1900. Journal of the Association for Information Science and Technology, 66(7):1323–1332. BIBLIOGRAPHY 182

Larivi`ere,V., Haustein, S., and B¨orner,K. (2015). Long-Distance Interdisciplinarity Leads to Higher Scientific Impact. PLOS ONE, 10(3):e0122565.

Lazar, J. and Preece, J. (2002). Social Considerations in Online Communities: Usability, Sociability, and Success Factors. PhD thesis.

Lazerte, S. E., Hill, D. J., Reudink, M. W., Otter, K. A., Kusack, J., Bailey, J. M., Woolver- ton, A., Paetkau, M., and Jong, A. D. (2017). feedr and animalnexus.ca: A paired R package and friendly Web application for transforming and visualizing animal movement data from static stations. Ecology and Evolution, pages 1–13.

Le Bomin, S., Lecointre, G., and Heyer, E. (2016). The evolution of musical diversity: The key role of vertical transmission. PLoS ONE, 11(3):1–17.

Leahey, E., Beckman, C. M., and Stanko, T. L. (2017). Prominent but Less Productive. Administrative Science Quarterly, 62(1):105–139.

Lee, S. and Bozeman, B. (2005). The Impact of Research Collaboration on Scientific Pro- ductivity. Social Studies of Science, 35(5):673–702.

Lee, Y.-N., Walsh, J. P., and Wang, J. (2015). Creativity in scientific teams: Unpacking novelty and impact. Research Policy, 44(3):684–697.

Lemaire, L., Jay, F., Lee, I.-H., Csill´ery, K., and Blum, M. G. B. (2016). Goodness-of-fit statistics for approximate Bayesian computation. pages 1–30.

Lemieux, F. (2014). Effect of Gun Culture and Firearm Laws on Gun Violence and Mass Shootings in the United States: A Multi-Level Quantitative Analysis. International Jour- nal of Criminal Justice Sciences, 9(1):74.

Lena, J. C. (2004). Meaning and membership: samples in rap music , 1979-1995. Poetics, 32:297–310.

Lena, J. C. (2012). Banding together: how communities create genres in popular music. Princeton University Press, Princeton, NJ.

Lendvai, A.´ Z., Ak¸cay, C¸., Weiss, T., Haussmann, M. F., Moore, I. T., and Bonier, F. (2015). Low cost audiovisual playback and recording triggered by radio frequency identification using Raspberry Pi. PeerJ, 2015(4):1–11.

Leonard, R. D. and Jones, G. T. (1987). Elements of an inclusive evolutionary model for archaeology. Journal of Anthropological Archaeology, 6:199–219.

Levinson, S. C. and Gray, R. D. (2012). Tools from evolutionary biology shed new light on the diversification of languages. Trends in Cognitive Sciences, 16(3):167–173.

Lewens, T. (2015). Cultural Evolution: Conceptual Challenges. Oxford University Press. BIBLIOGRAPHY 183

Lewis, E., Mohler, G., Brantingham, P. J., and Bertozzi, A. L. (2012). Self-exciting point process models of civilian deaths in Iraq. Security Journal, 25(3):244–264.

Lewis, M. A. (1997). Variability, patchiness, and jump dispersal in the spread of an invading population. Princeton University Press.

Leydesdorff, L., Carley, S., and Rafols, I. (2012). Global maps of science based on the new Web-of-Science categories. Scientometrics, 94(2):589–593.

Leydesdorff, L. and Rafols, I. (2011). Indicators of the interdisciplinarity of journals: Diver- sity centrality, and citations. Journal of Informetrics, 5(1):87–100.

Liao, S. G., Lin, Y., Kang, D. D., Chandra, D., Bon, J., Kaminski, N., Sciurba, F. C., and Tseng, G. C. (2014). Missing value imputation in high-dimensional phenomic data: Imputable or not, and how? BMC Bioinformatics, 15(1):1–12.

Liebman, E., Ornoy, E., and Chor, B. (2012). A Phylogenetic Approach to Music Perfor- mance Analysis. Journal of New Music Research, 41(2):195–222.

Liechti, J. I. and Bonhoeffer, S. (2020). A time resolved clustering method revealing longterm structures and their short-term internal dynamics.

Lin, P. I., Fei, L., Barzman, D., and Hossain, M. (2018). What have we learned from the time trend of mass shootings in the U.S.? PLoS ONE, 13(10):1–13.

Lindop, R. (2011). Re-evaluating musical genre in UK psytrance. In St John, G., editor, The Local Scenes and Global Culture of Psytrance, chapter 6, pages 114–130. Taylor & Francis.

Lipkind, D. and Tchernichovski, O. (2011). Quantification of developmental birdsong learn- ing from the subsyllabic scale to cultural evolution. Proceedings of the National Academy of Sciences, 108(3):15572–15579.

Liu, L., Chen, B., Ai, C., He, L., Wang, Y., Qiu, X., and Lu, X. (2018). The influence of geographic factors on information dissemination in mobile social networks in China: Evidence from wechat. ISPRS International Journal of Geo-Information, 7(5):1–16.

Liu, P. and Xia, H. (2015). Structure and evolution of co-authorship network in an interdis- ciplinary research field. Scientometrics, 103(1):101–134.

Lloyd, R. (2014). Unauthorized Digital Sampling in the Changing Music Landscape. Journal of Intellectual Property Law, 22(1):143–176.

L´opez-Serrano, P., Dittmar, C., and M¨uller,M. (2017). Finding Drum Breaks in Digital Music Recordings. In Proceedings of the 13th International Symposium on CMMR, pages 68–79. BIBLIOGRAPHY 184

Lotka, A. J. (1926). The frequency distribution of scientific productivity. Journal of the Washington Academy of Sciences, 16(12):317—-323.

Lowe, D. (2019). Christchurch Terrorist Attack, The Far-Right and Social Media: What can we learn? The New Jurist, (April).

Luca, M., Malhotra, D., and Poliquin, C. (2020). The impact of mass shootings on gun policy. Journal of Public Economics, 181:104083.

Lumaca, M. and Baggio, G. (2017a). Cultural transmission and evolution of melodic struc- tures in multi-generational signaling games. Artificial Life, 23:406–423.

Lumaca, M. and Baggio, G. (2017b). Cultural transmission and evolution of melodic struc- tures in multi-generational signaling games. Artificial Life, 23(3):406–423.

Luther, D. and Baptista, L. (2010). Urban noise and the cultural evolution of bird songs. Proceedings of the Royal Society B, 277:469–473.

MacCallum, R. M., Mauch, M., Burt, A., and Leroi, A. M. (2012). Evolution of music by public choice. Proceedings of the National Academy of Science, 109(30):12081–12086.

Mace, R. (2014). Human behavioral ecology and its evil twin. Behavioral Ecology, 25(3):443– 449.

Majumder, M. (2017). Higher rates of hate crimes are tied to income inequality.

Makelberge, N. (2012). Rethinking collaboration in networked music. Organised Sound, 17(1):28–35.

Mall, R., Langone, R., and Suykens, J. A. K. (2015). Netgram: visualizing communities in evolving networks. PLOS ONE, 10(9):e0137502.

Mann, D. C., Lahti, D. C., Waddick, L., and Mundinger, P. C. (2020). House finches learn canary trills. Bioacoustics, pages 1–17.

Marchment, Z., Bouhana, N., and Gill, P. (2018). Lone Actor Terrorists: A Residence-to- Crime Approach. Terrorism and Political Violence, pages 1–26.

Mason, W. A., Jones, A., and Goldstone, R. L. (2008). Propagation of innovations in networked groups. Journal of Experimental Psychology: General, 137(3):422–433.

Mauch, M., Maccallum, R. M., Levy, M., Leroi, A. M., and Mauch, M. (2015). The evolution of popular music: USA 1960–2010. Royal Society Open Science, 2.

Mayr, E. (1982). The Growth of Biological Thought: Diversity, Evolution, and Inheritance. Belknap Press. BIBLIOGRAPHY 185

McGraw, K. J. and Hill, G. E. (2002). Testing reversed sexual dominance from an ontoge- netic perspective: Juvenile female House Finches Carpodacus mexicanus are dominant to juvenile males. Ibis, 144(1):139–142.

McLeod, K. (2001). Genres, subgenres, sub-subgenres and more: music and social differ- entiation within electronic/dance music communities. Journal of Popular Music Studies, 13:59–75.

McVeigh, R. (2004). Structured Ignorance and Organized Racism in the United States. Social Forces, 82(3):895–936.

McVeigh, R. and Cunningham, D. (2012). Enduring Consequences of Right-Wing Extrem- ism: Klan Mobilization and Homicides in Southern Counties. Social Forces, 90(3):843–862.

McVeigh, R., Cunningham, D., and Farrell, J. (2014). Political Polarization as a Social Move- ment Outcome: 1960s Klan Activism and Its Enduring Impact on Political Realignment in Southern Counties, 1960 to 2000. American Sociological Review, 79(6):1144–1171.

Medina, R. M., Nicolosi, E., Brewer, S., and Linke, A. M. (2018). Geographies of organized hate in America: A regional analysis. Annals of the American Association of Geographers, 108(4):1006–1021.

Meniri, M., Farley, A., Helfenstein, F., and Fasel, N. (2019). A guide for ecologists to build a low-cost selective trap using radio frequency identification detection. Behavioral Ecology and Sociobiology, 73:80.

Mennill, D. J., Badyaev, A. V., Jonart, L. M., and Hill, G. E. (2006). Male house finches with elaborate songs have higher reproductive performance. Ethology, 112(2):174–180.

Mesoudi, A. (2015). Cultural Evolution: A Review of Theory Findings and Controversies. Evolutionary Biology, 43(4):481–497.

Mesoudi, A. (2017a). Pursuing Darwin’s curious parallel: Prospects for a science of cultural evolution. Proceedings of the National Academy of Sciences, 114(30):7853–7860.

Mesoudi, A. (2017b). Pursuing Darwin’s curious parallel: Prospects for a science of cul- tural evolution. Proceedings of the National Academy of Sciences of the United States of America, 114(30):7853–7860.

Mesoudi, A., Whiten, A., and Laland, K. N. (2006). Towards a unified science of cultural evolution. Behavioral and Brain Sciences, 29(04).

Meyer, S., Elias, J., and H¨ohle,M. (2012). A Space-Time Conditional Intensity Model for Invasive Meningococcal Disease Occurrence. Biometrics, 68(2):607–616.

Meyer, S. and Held, L. (2014). Power-law models for infectious disease spread. The Annals of Applied Statistics, 8(3):1612–1639. BIBLIOGRAPHY 186

Meyer, S., Held, L., and H¨ohle,M. (2017). Spatio-temporal analysis of epidemic phenomena using the r package surveillance. Journal of Statistical Software, Articles, 77(11):1–55.

Meyer, S., Warnke, I., R¨ossler,W., and Held, L. (2016). Model-based testing for space-time interaction using point processes: An application to psychiatric hospital admissions in an urban area. Spatial and Spatio-temporal Epidemiology, 17:15–25.

Midlarsky, M. I., Crenshaw, M., and Yoshida, F. (1980). Why violence spreads: the contagion of international terrorism. International Studies Quarterly, 24(2):262–298.

Miller, E. T., Bonter, D. N., Eldermire, C., Freeman, B. G., Greig, E. I., Harmon, L. J., Lisle, C., and Hochachka, W. M. (2017). Fighting over food unites the birds of North America in a continental dominance hierarchy. Behavioral Ecology, 28(6):1454–1463.

Miller, P. R. and Conover, P. J. (2015). Red and Blue States of Mind: Partisan Hostility and Voting in the United States. Political Research Quarterly, 68(2):225–239.

Mills, C. E., Freilich, J. D., Chermak, S. M., Holt, T. J., and LaFree, G. (2019). Social Learning and Social Control in the Off- and Online Pathways to Hate Crime and Terrorist Violence. Studies in Conflict and Terrorism, pages 1–29.

Millstein, R. (2010). The concepts of population and metapopulation in evolutionary biol- ogy and ecology. In Evolution since Darwin: the first 150 years, pages 61–68. Sinauer, Sunderland.

Misiak, B., Samochowiec, J., Bhui, K., Schouler-Ocak, M., Demunter, H., Kuey, L., Raballo, A., Gorwood, P., Frydecka, D., and Dom, G. (2019). A systematic review on the rela- tionship between mental health, radicalization and mass violence. European Psychiatry, 56:51–59.

Misztal, M. A. (2019). Comparison of Selected Multiple Imputation Methods for Continuous Variables – Preliminary Simulation Study Results. Folia Oeconomica, 6(339):73–98.

Monechi, B., Gravino, P., Servedio, V. D., Tria, F., and Loreto, V. (2017). Significance and popularity in music production. Royal Society Open Science, 4(7).

Morall, A. (2018). The science of gun policy: A critical synthesis of research evidence on the effects of gun policies in the united states. RAND Health Quarterly, 8(1).

Morrison, D. A. (2014a). Is the tree of life the best metaphor, model, or heuristic for phylogenetics? Systematic Biology, 63(4):628–638.

Morrison, D. A. (2014b). Phylogenetic networks: A new form of multivariate data summary for data mining and exploratory data analysis. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 4(4):296–312. BIBLIOGRAPHY 187

Moyers, S. C., Adelman, J. S., Farine, D. R., Moore, I. T., and Hawley, D. M. (2018). Exploratory behavior is linked to stress physiology and social network centrality in free- living house finches (Haemorhous mexicanus). Hormones and Behavior, 102:105–113.

Muharemi, F., Logofatu, D., and Leon, F. (2018). Review on General Techniques and Packages for Data Imputation in R on a Real World Dataset. In Nguyen, N. T., Pimenidis, E., Khan, Z., and Trawinski, B., editors, Computational Collective Intelligence, pages 386– 395, Bristol, UK. Springer.

M¨ullner,D. (2013). fastcluster: Fast hierarchical, agglomerative clustering routines for R and Python. Journal of Statistical Software, 53(9):1–18.

Mundinger, P. C. (1975). Song dialects and colonization in the house finch, Carpodacus mexicanus, on the east coast. The Condor, 77:407–422.

Mundinger, P. C. (1980). Animal Cultures and a General Theory of Cultural Evolution. Ethology and Sociobiology, 1(3):183–223.

Mundinger, P. C. and Lahti, D. (2014). Quantitative integration of genetic factors in the learning and production of canary song. Proceedings of the Royal Society B, 281:20132631.

Myers, D., Berg, H., and Maneas, G. (2019). Comparing the soundscapes of organic and conventional olive groves: A potential method for bird diversity monitoring. Ecological Indicators, 103:642–649.

Nacos, B. L. (2009). Revisiting the Contagion Hypothesis: Terrorism, News Coverage, and Copycat Attacks. Perspectives on Terrorism, 3(3):3–13.

Nakamura, E. and Kaneko, K. (2018). Statistical Evolutionary Laws in Music Styles. pages 1–23.

National Consortium for the Study of Terrorism and Responses to Terrorism (START), University of Maryland (2017). Profiles of individual radicalization in the united states (pirus).

National Institutes of Health (2016). NIH-Wide Strategic Plan: Fiscal Years 2016-2020. Technical report.

National Science Foundation (2014). Investing in Science, Engineering, and Education for the Nation’s Future: NSF Strategic Plan for 2014-2018. Technical Report March.

Neiman, F. D. (1995). Stylistic Variation in Evolutionary Perspective: Inferences from Deco- rative Diversity and Interassemblage Distance in Illinois Woodland Ceramic Assemblages. American Antiquity, 60(1):7–36.

Nelson, D. A., Hallberg, K. I., and Soha, J. A. (2004). Cultural evolution of puget sound white-crowned sparrow song dialects. Ethology, 110(11):879–908. BIBLIOGRAPHY 188

Nelson, D. A. and Poesel, A. (2009). Does learning produce song conformity or novelty in white-crowned sparrows, Zonotrichia leucophrys? Animal Behaviour, 78(2):433–440.

Nightingale, G. F., Laland, K. N., Hoppitt, W., and Nightingale, P. (2015). Bayesian Spatial NBDA for Diffusion Data with Home-Base Coordinates. PLOS ONE.

Nolan, P. M. and Hill, G. E. (2004). Female choice for song characteristics in the house finch. Animal Behaviour, 67(3):403–410.

Novak, E., Zhao, W., and Reiser, R. A. (2014). Promoting Interdisciplinary Research among Faculty. Journal of Faculty Development, 28(1):19–24.

Noys, B. (1995). Into the ‘Jungle’. Popular Music, 14(3):321–332.

Nunn, C. L., Arnold, C., Matthews, L., and Mulder, M. B. (2010). Simulating trait evolution for cross-cultural comparison. Philosophical Transactions of the Royal Society B: Biological Sciences, 365(1559):3807–3819.

Nunn, C. L., Mulder, M. B., and Langley, S. (2006). Comparative methods for studying cultural trait evolution: A simulation study. Cross-Cultural Research, 40(2):177–209.

O’Brien, M. J., Collard, M., Buchanan, B., and Boulanger, M. T. (2013). Trees, thickets, or something in between? Recent theoretical and empirical work in cultural phylogeny. Israel Journal of Ecology and Evolution, 59(2):45–61.

O’Dair, M. and Fry, A. (2020). Beyond the black box in music streaming: the impact of recommendation systems upon artists. Popular Communication, 18(1):65–77.

O’Dwyer, J. P. and Kandler, A. (2017). Inferring processes of cultural transmission: The critical role of rare variants in distinguishing neutrality from novelty biases. Philosophical Transactions of the Royal Society B: Biological Sciences, 372(1735).

Ogata, Y. (1988). Statistical models for earthquake occurrences and residual analysis for point processes. Journal of the American Statistical Association, 83(401):9–27.

O’Loghlen, A. L. and Rothstein, S. I. (1995). Culturally correct song dialects are correlated with male age and female song preferences in wild populations of brown-headed cowbirds. Behavioral Ecology and Sociobiology, 36(4):251–259.

Ordanini, A. and Nunes, J. C. (2016). Fewer blockbusters and more superstars: How tech- nological innovation has impacted convergence on the music charts. International Journal of Research in Marketing, 33(2):297–313.

Ormrod, R. K. (1992a). Adaptation and Cultural Diffusion. Journal of Geography, 91:258– 262. BIBLIOGRAPHY 189

Ormrod, R. K. (1992b). Adaptation and Cultural Diffusion. Journal of Geography, 91:258– 262.

Ortega, Y. K., Benson, A., and Greene, E. (2014). Invasive plant erodes local song diversity in a migratory passerine. Ecology, 95(2):458–465.

O’Toole, M. (2019). Trump officials have redirected resources from countering far-right, racism-fueled domestic terrorism. Los Angeles Times.

Otter, K. A., Mckenna, A., LaZerte, S. E., and Ramsay, S. M. (2020). Continent-wide Shifts in Song Dialects of White-Throated Sparrows. Current Biology, 30(16):3231–3235.e3.

Ottoni, R., Bernardina, P., Cunha, E., Meira, W., Magno, G., and Almeida, V. (2018). Analyzing right-wing youtube channels: Hate, violence and discrimination. WebSci 2018 - Proceedings of the 10th ACM Conference on Web Science, pages 323–332.

Pagel, M., Atkinson, Q. D., and Meade, A. (2007). Frequency of word-use predicts rates of lexical evolution throughout Indo-European history. Nature, 449:717–720.

Pagel, M., Beaumont, M., Meade, A., Verkerk, A., and Calude, A. (2019). Dominant words rise to the top by positive frequency-dependent selection. Proceedings of the National Academy of Sciences, 116(15):7397–7402.

Pah, A. R., Hagan, J., Jennings, A. L., Jain, A., Albrecht, K., Hockenberry, A. J., and Amaral, L. A. (2017). Economic insecurity and the rise in gun violence at US schools. Nature Human Behaviour, 1(2):2–7.

Pamjav, H., Juh´asz,Z., Zal´an,A., N´emeth, E., and Damdin, B. (2012). A comparative phy- logenetic study of genetics and folk music. Molecular Genetics and Genomics, 287(4):337– 349.

Pandey, H. (2019). Bollywood synth pioneer bappi lahiri’s 303 experiments. Red Bull.

Pang-Ching, J. M., Paxton, K. L., Paxton, E. H., Pack, A. A., and Hart, P. J. (2018). The effect of isolation, fragmentation, and population bottlenecks on song structure of a Hawaiian honeycreeper. Ecology and Evolution, 8(4):2076–2087.

Parker, K. A., Anderson, M. J., Jenkins, P. F., and Brunton, D. H. (2012). The effects of translocation-induced isolation and fragmentation on the cultural evolution of bird song. Ecology Letters, 15(8):778–785.

Pauwels, L., Brion, F., Schils, N., Laffineur, J., Verhage, A., De Ruyver, B., and Easton, M. (2014). Explaining and understanding the role of exposure to new social media in violent extremism: an integrative quantitative and qualitative approach. Academia Press, Gent. BIBLIOGRAPHY 190

Paxton, K. L., Sebasti´an-Gonz´alez,E., Hite, J. M., Crampton, L. H., Kuhn, D., and Hart, P. J. (2019). Loss of cultural song diversity and the convergence of songs in a declining Hawaiian forest bird community. Royal Society Open Science, 6(8):190719.

Payne, R. B. (1981). Song learning and social interaction in indigo buntings. Animal Be- haviour, 29(3):688–697.

Penone, C., Davidson, A. D., Shoemaker, K. T., Di Marco, M., Rondinini, C., Brooks, T. M., Young, B. E., Graham, C. H., and Costa, G. C. (2014). Imputation of missing data in life-history trait datasets: Which approach performs the best? Methods in Ecology and Evolution, 5(9):1–10.

Percino, G., Klimek, P., and Thurner, S. (2014). Instrumentational complexity of music genres and why simplicity sells. PLoS ONE, 9(12):1–16.

P´erez-Granados,C., Osiejuk, T., and L´opez-Iborra, G. M. (2016). Habitat fragmentation effects and variations in repertoire size and degree of song sharing among close Dupont’s Lark Chersophilus duponti populations. Journal of Ornithology, 157(2):471–482.

Perianes-Rodriguez, A., Waltman, L., and van Eck, N. J. (2016). Constructing bibliometric networks: A comparison between full and fractional counting. Journal of Informetrics, 10(4):1178–1195.

Petruskov´a,T., Osiejuk, T. S., and Petrusek, A. (2010). Geographic variation in songs of the tree pipit (Anthus trivialis) at two spatial scales. The Auk, 127(2):274–282.

Pezzella, F. S. (2017). Hate Crime Statutes A Public Policy and Law Enforcement Dilemma. Springer New York.

Pezzella, F. S., Fetzer, M. D., and Keller, T. (2019). The dark figure of hate crime underre- porting. Am Behav Sci.

Phillips, B. J. (2017). Deadlier in the U.S.? On lone wolves, terrorist groups, and attack lethality. Terrorism and Political Violence, 29(3):533–549.

Piazza, J. A. (2017). The determinants of domestic right-wing terrorism in the USA: Eco- nomic grievance, societal change and political resentment. Conflict Management and Peace Science, 34(1):52–80.

Pierre, J. M. (2019). The psychology of guns: risk, fear, and motivated reasoning. Palgrave Communications, 5(1):1–7.

Pigliucci, M. (2006). Phenotypic plasticity and evolution by genetic assimilation. Journal of Experimental Biology, 209(12):2362–2367. BIBLIOGRAPHY 191

Piso, Z., O’Rourke, M., and Weathers, K. C. (2016). Out of the fog: Catalyzing integrative capacity in interdisciplinary research. Studies in History and Philosophy of Science Part A, 56:84–94.

Podos, J., Moseley, D. L., Goodwin, S. E., McClure, J., Taft, B. N., Strauss, A. V., Rega- Brodsky, C., and Lahti, D. C. (2016). A fine-scale, broadly applicable index of vocal performance: frequency excursion. Animal Behaviour, 116:203–212.

Polihronakis, M. (2010). The interface between phylogenetics and population genetics: in- vestigating gene trees, species trees, and population dynamics in the phyllophaga fraterna species group. Evolution, 64(4):1048–1062.

Ponchon, A., Gremillet, D., Doligez, B., Chambert, T., Tveraa, T., Gonzalez-Solis, J., and Boulinier, T. (2013). Tracking prospecting movements involved in breeding habitat selec- tion: insights, pitfalls and perspectives. Methods in Ecology and Evolution, 4:143–150.

Porter, A. L. and Rafols, I. (2009). Is science becoming more interdisciplinary? Measuring and mapping six research fields over time. Scientometrics, 81(3):719–745.

Porter, M. D. and White, G. (2010). Self-exciting hurdle models for terrorist activity. Annals of Applied Statistics, 4(1):106–124.

Post, J. M. (2015). Terrorism and right-wing extremism: The changing face of terrorism and political violence in the 21st century: The virtual community of hatred. International Journal of Group Psychotherapy, 65(2):242–271.

Premo, L. S. (2014). Cultural Transmission and Diversity in Time-Averaged Assemblages. Current Anthropology, 55(1):105–114.

Pritchard, A. (1969). Statistical bibliography or bibliometrics? Journal of Documentation, 25(4):348–349.

Psorakis, I., Voelkl, B., Garroway, C. J., Radersma, R., Aplin, L. M., Crates, R. A., Culina, A., Farine, D. R., Firth, J. A., Hinde, C. A., Kidd, L. R., Milligan, N. D., Roberts, S. J., Verhelst, B., and Sheldon, B. C. (2015). Inferring social structure from temporal data. Behavioral Ecology and Sociobiology, 69(5):857–866.

Pudlo, P., Marin, J. M., Estoup, A., Cornuet, J. M., Gautier, M., and Robert, C. P. (2015). Reliable ABC model choice via random forests. Bioinformatics, 32(6):859–866.

Pudovkin, A. I. and Garfield, E. (2002). Algorithmic procedure for finding semantically related journals. Journal of the American Society for Information Science and Technology, 53(13):1113–1119.

Purgas, P. and Morgan, F. (2020). Electronic india 1969-73 revisited. The Wire. BIBLIOGRAPHY 192

Rafols, I., Porter, A. L., and Leydesdorff, L. (2010). Science overlay maps: A new tool for research policy and library management. Journal of the American Society for Information Science and Technology, 61(9):1871–1887.

Raina, D. and Gupta, B. M. (1998). Four aspects of the institutionalization of physics research in India (1990–1950): Substantiating the claims of histortical sociology through bibliometrics. Scientometrics, 42(1):17–40.

Rajagopalan, M. S., Khanna, V., Stott, M., Leiter, Y., Showalter, T. N., Dicker, A., and Lawrence, Y. R. (2010). Accuracy of cancer information on the internet: A compari- son of a wiki with a professionally maintained database. Journal of Clinical Oncology, 28(15 suppl):6058–6058.

Rao, I. K. R. (1980). The distribution of scientific productivity and social change. Journal of the American Society for Information Science, 31(2):111–122.

Ratanamahatana, C. A. and Keogh, E. (2005). Three myths about dynamic time warping data mining. In Proceedings of the 2005 SIAM International Conference on Data Mining, pages 506–510.

Ravignani, A., Delgado, T., and Kirby, S. (2017). Musical evolution in the lab exhibits rhythmic universals. Nature Human Behaviour, 1.

Ravignani, A., Thompson, B., Grossi, T., Delgado, T., and Kirby, S. (2018). Evolving building blocks of rhythm: how human cognition creates music via cultural transmission. Annals of the New York Academy of Sciences, 1423:176–187.

Raynal, L., Marin, J.-M., Pudlo, P., Ribatet, M., Robert, C. P., and Estoup, A. (2019). Abc random forests for bayesian parameter inference. Bioinformatics, 35(10):1720–1728.

Recht, H. (2019). censusapi: Retrieve Data from the Census APIs. R package version 0.6.0.

Reeping, P. M., Cerd´a, M., Kalesan, B., Wiebe, D. J., Galea, S., and Branas, C. C. (2019). State gun laws, gun ownership, and mass shootings in the US: Cross sectional time series. BMJ, 364.

Reinhart, A. (2018). A review of self-exciting spatio-temporal point processes and their applications. Statistical Science, 33(3):299–318.

Rendell, L., Fogarty, L., Hoppitt, W. J., Morgan, T. J., Webster, M. M., and Laland, K. N. (2011). Cognitive culture: Theoretical and empirical insights into social learning strategies. Trends in Cognitive Sciences, 15(2):68–76.

Reynolds, S. (2012). Energy flash: a journey through rave music and dance culture. Soft Skull Press, New York, NY. BIBLIOGRAPHY 193

Richerson, P. and Boyd, R. (2005). The Origin and Evolution of Cultures. Oxford University Press.

Rivera-Gutierrez, H. F., Matthysen, E., Adriaensen, F., and Slabbekoorn, H. (2010). Reper- toire sharing and song similarity between great tit males decline with distance between forest fragments. Ethology, 116(10):951–960.

Rivero, D. G. (2016). Darwinian Archaeology and Cultural Phylogenetics. In Mendoza Straffon, L., editor, Cultural Phylogenetics: Concepts and Applications in Archaeology, pages 43–72. Springer.

Rodgers, T. (2003). On the process and aesthetics of sampling in electronic music production. Organised Sound, 8(3):313–320.

Rodriguez Zivic, P. H., Shifres, F., and Cecchi, G. A. (2013). Perceptual basis of evolving Western musical styles. Proceedings of the National Academy of Sciences, 110(24):10034– 10038.

Roginek, E. W. (2018). Spatial variation of house finch (Haemorhous mexicanus) song along the American Southwest coast. PhD thesis, Queens College.

Rossetti, G., Pappalardo, L., Pedreschi, D., and Giannotti, F. (2017). Tiles: an on- line algorithm for community discovery in dynamic social networks. Machine Learning, 106(8):1213–1241.

Rossman, G., Chiu, M. M., and Mol, J. M. (2008). Modeling diffusion of multiple innova- tions via multilevel diffusion curves: Payola in pop music radio. Sociological Methodology, 38:201–230.

Rosvall, M. and Bergstrom, C. T. (2010). Mapping change in large networks. PLOS ONE, 5(1):1–7.

Rousseau, R. (1992). Breakdown of the robustness property of Lotka’s Law: The case of adjusted counts for multiauthorship attribution. Journal of the American Society for Information Science, 43(10):645–647.

Rowe, G. (1976). My undercover years with the Ku Klux Klan. Bantam Books, New York.

Ruiz-Castillo, J. and Costas, R. (2014). The skewness of scientific productivity. Journal of Informetrics, 8(4):917–934.

Rushin, S. and Edwards, G. S. (2018). The Effect of President Trump’s Election on Hate Crimes. SSRN Electronic Journal.

Russell, A. L., Ranivo, J., Palkovacs, E. P., Goodman, S. M., and Yoder, A. D. (2007). Work- ing at the interface of phylogenetics and population genetics: a biogeographical analysis of triaenops spp. (chiroptera: Hipposideridae). Molecular Ecology, 16(4):839–851. BIBLIOGRAPHY 194

Rutz, C. and Hays, G. C. (2009). New frontiers in biologging science. Biology Letters, 5(3):289–292.

Safer-Lichtenstein, A., LaFree, G., and Loughran, T. (2017). Studying Terrorism Empiri- cally: What We Know About What We Don’t Know. Journal of Contemporary Criminal Justice, 33(3):273–291.

Salavuo, M. (2006). Open and informal online communities as forums of collaborative musical activities and learning. British Journal of Music Education, 23(3):253–271.

Sanir, J., Nicholson, A., and Giammaria, C. (2017). Countering Violent Extremism Through Public Health Practice: Proceedings of a Workshop. In Forum on Medical and Public Health Preparedness for Disasters and Emergencies. The National Academy of Sciences.

Sard´a-Espinosa,A. (2019). Time-series clustering in r using the dtwclust package. The R Journal.

Savage, P. E. (2019). Cultural evolution of music. Palgrave Communications, 5(16).

Savage, P. E. and Brown, S. (2014). Mapping Music: Cluster Analysis Of Song-Type Fre- quencies Within And Between Cultures. Ethnomusicology, 58(1):133–155.

Savage, P. E., Brown, S., Sakai, E., and Currie, T. E. (2015). Statistical universals reveal the structures and functions of human music. Proceedings of the National Academy of Sciences, 112(29):8987–8992.

Savage, P. E., Chiba, G., Currie, T. E., Suzuki, H., and Atkinson, Q. D. (2020a). Sequence alignment of folk song melodies reveals cross-cultural mechanisms of musical evolution. PsyArXiv preprint.

Savage, P. E., Loui, P., Tarr, B., Schachner, A., Glowacki, L., Mithen, S., and Fitch, T. (2020b). Music as a coevolved system for social bonding. Behavioral and Brain Sciences.

Scapoli, C., Goebl, H., Sobota, S., Mamolini, E., Rodriguez-Larralde, A., and Barrai, I. (2005). Surnames and dialects in france: Population structure and cultural evolution. Journal of Theoretical Biology, 237(1):75–86.

Schilling, M. A. and Phelps, C. C. (2007). Interfirm collaboration networks: The impact of large-scale network structure on firm innovation. Management Science, 53(7):1113–1126.

Schillinger, K. and Lycett, S. (2019). The Flow of Culture: Assessing the Role of Rivers in the Inter-community Transmission of Material Traditions in the Upper Amazon. J Archaeol Method Theory, 26:135–154.

Schillinger, K., Mesoudi, A., and Lycett, S. J. (2016). Copying error, evolution, and phylo- genetic signal in artifactual traditions: An experimental approach using “model artifacts”. Journal of Archaeological Science, 70:23–34. BIBLIOGRAPHY 195

Schlitter, N. and Falkowski, T. (2009). Mining the Dynamics of Music Preferences from a Social Networking Site. In 2009 International Conference on Advances in Social Network Analysis and Mining, Athens, Greece.

Searcy, W. A., Nowicki, S., and Hughes, M. (1997). The Response of Male and Female Song Sparrows to Geographic Variation in Song. The Condor, 99(3):651–657.

Sebasti´an-Gonz´alez, E. and Hart, P. J. (2017). Birdsong meme diversity in a habitat land- scape depends on landscape and species characteristics. Oikos, 126(10):1511–1521.

Seigel, M., Ross, C., and King, C. (2014). A new proxy measure for state-level gun ownership in studies of firearm injury prevention. 20(3):204–207.

Serr`a,J., Corral, A.,´ Bogu˜n´a,M., Haro, M., and Arcos, J. L. (2012). Measuring the evolution of contemporary western popular music. Scientific Reports, 2.

Sethi, S. S., Ewers, R. M., Jones, N. S., Orme, C. D. L., and Picinali, L. (2018). Robust, real-time and autonomous monitoring of ecosystems with an open, low-cost, networked device. Methods in Ecology and Evolution, 9(12):2383–2387.

Shah, A. D., Bartlett, J. W., Carpenter, J., Nicholas, O., and Hemingway, H. (2014). Com- parison of random forest and parametric imputation models for imputing missing data using MICE: A CALIBER study. American Journal of Epidemiology, 179(6):764–774.

Shennan, S. (2011). Descent with modification and the archaeological record. Philosophical Transactions of the Royal Society B: Biological Sciences, 366:1070–1079.

Shennan, S. J. and Wilkinson, J. R. (2001). Ceramic Style Change and Neutral Evolution : A Case Study from Neolithic Europe. American Antiquity, 66(4):577–593.

Sicko, D. (2010). Techno rebels: the renegades of electronic funk. Wayne State University Press, Detroit, Michigan, 2nd edition.

Silk, M. J., Jackson, A. L., Croft, D. P., Colhoun, K., and Bearhop, S. (2015). The conse- quences of unidentifiable individuals for the analysis of an animal social network. Animal Behaviour, 104.

Silkoset, E. (2016). Lone-Actor Terrorists: How Long Does It Take to Plan a Terrorist Attack? PhD thesis, University College of London.

Simi, P. and Bubolz, B. F. (2017). Far Right Terrorism in the United States. In The Handbook of the Criminology of Terrorism, chapter 19, pages 297–309.

Simon, H. A. (1962). The Architecture of Complexity. Proceedings of the American Philo- sophical Society, 106(6):467–482.

Simpson, E. H. (1949). Measurement of diversity. Nature, 163(688). BIBLIOGRAPHY 196

Slater, P. (1989). Bird song learning: causes and consequences. Ethology Ecology & Evolution, 1(1):19–46.

Slater, P. and Ince, S. (1979). Cultural evolution in chaffinch song. Behaviour, 71:146–166.

Smith, B. L., Cothren, J., Roberts, P., and Damphousse, K. R. (2008). Geospatial Analysis of Terrorist Activities: The Identification of Spatial and Temporal Patterns of Prepara- tory Behavior of International and Environmental Terrorists. Technical report, Terrorism Research Center in Fulbright College, University of Arkansas.

Smith, B. L. and Damphousse, K. R. (2009). Patterns of precursor behaviors in the life span of a U.S. environmental terrorist group. Criminology & Public Policy, 8(3):475–496.

Smith, E. A., Mulder, M. B., and Hill, K. (2001). Controversies in the evolutionary social sciences: a guide for the perplexed. Trends in Ecology & Evolution, 16(3):128–135.

Smith, J. A., McPherson, M., and Smith-Lovin, L. (2014). Social Distance in the United States: Sex, Race, Religion, Age, and Education Homophily among Confidants, 1985 to 2004. American Sociological Review, 79(3):432–456.

Smith, J. A. and Moody, J. (2013). Structural effects of network sampling coverage I: Nodes missing at random. Social Networks, 35(4):652–668.

Smith, J. A., Moody, J., and Morgan, J. H. (2017). Network sampling coverage II: The effect of non-random missing data on network measurement. Social Networks, 48:78–99.

Soha, J. A. and Marler, P. (2000). A species-specific acoustic cue for selective song learning in the white-crowned sparrow. Animal Behaviour, 60(3):297–306.

Southern Poverty Law Center (2017). Ten ways to fight hate: a community resource guide. Technical report.

Spencer, M., Davidson, E. A., Barbrook, A. C., and Howe, C. J. (2004). Phylogenetics of artificial manuscripts. Journal of Theoretical Biology, 227(4):503–511.

St John, G. (2006). Electronic Dance Music Culture and Religion: An Overview. Culture and Religion, 7(1):1–25.

Stekhoven, D. J. and B¨uhlmann,P. (2012). MissForest—non-parametric missing value im- putation for mixed-type data. Bioinformatics, 28(1):112–118.

Stirling, A. (2007). A general framework for analysing diversity in science technology and society. Journal of The Royal Society Interface, 4(15):707–719.

Suttmoeller, M., Chermak, S., and Freilich, J. D. (2015). The Influence of External and Internal Correlates on the Organizational Death of Domestic Far-Right Extremist Groups. Studies in Conflict and Terrorism, 38(9):734–758. BIBLIOGRAPHY 197

Suttmoeller, M. J., Chermak, S. M., and Freilich, J. D. (2016). Only the Bad Die Young: The Correlates of Organizational Death for Far-Right Extremist Groups. Studies in Conflict and Terrorism, 39(6):477–499.

Suttmoeller, M. J., Chermak, S. M., and Freilich, J. D. (2018). Is More Violent Better? The Impact of Group Participation in Violence on Group Longevity for Far-Right Extremist Groups. Studies in Conflict and Terrorism, 41(5):365–387.

Sweeney, M. M. and Perliger, A. (2018). Explaining the spontaneous nature of far-right violence in the United States. Perspectives on Terrorism, 12(6):52–71.

Sweileh, W. M., Al-Jabi, S. W., Sawalha, A. F., and Zyoud, S. H. (2016a). Bibliometric profile of the global scientific research on autism spectrum disorders. SpringerPlus, 5(1):1480.

Sweileh, W. M., Shraim, N. Y., Al-Jabi, S. W., Sawalha, A. F., Rahhal, B., Khayyat, R. A., and Zyoud, S. H. (2016b). Assessing worldwide research activity on probiotics in pediatrics using Scopus database: 1994–2014. World Allergy Organization Journal, 9(1):25.

Ta¸skın,Z. and Aydinoglu, A. U. (2015). Collaborative interdisciplinary astrobiology research: a bibliometric study of the NASA Astrobiology Institute. Scientometrics, 103(3):1003– 1022.

Tehrani, J. and Collard, M. (2002). Investigating cultural evolution through biological phy- logenetic analyses of Turkmen textiles. Journal of Anthropological Archaeology, 21(4):443– 463.

Tehrani, J. J. (2013). The Phylogeny of Little Red Riding Hood. PLOS ONE, 8(11).

Tehrani, J. J., Collard, M., and Shennan, S. J. (2010). The cophylogeny of populations and cultures: reconstructing the evolution of Iranian tribal craft traditions using trees and jungles. Philosophical Transactions of the Royal Society B, 365(1559):3865–3874.

Tehrani, J. J. and D’Huy, J. (2017). Phylogenetics Meets Folklore: Bioinformatics Ap- proaches to the Study of International Folktales. In Math Meets Myths: Quantitative Approaches to Ancient Narratives, pages 91–114.

T¨emkin,I. (2016). Homology and phylogenetic inference in biological and material cultural evolution. In Panebianco, F. and Serrelli, E., editors, Understanding cultural traits: A multidisciplinary perspective on cultural diversity, pages 287–313.

T¨emkin,I. and Eldredge, N. (2007). Phylogenetics and material cultural evolution. Current Anthropology, 48(1):146–153.

Tench, S., Fry, H., and Gill, P. (2016). Spatio-temporal patterns of IED usage by the Provisional Irish Republican Army. European Journal of Applied Mathematics, 27(3):377– 402. BIBLIOGRAPHY 198

The Anti-Defamation League Center on Extremism (2019). Murder and Extremism in the United States in 2018. Technical report.

The Department of Homeland Security (2019). Strategic Framework for Countering Terror- ism and Target Violence. Technical report.

Thompson, W. L. (1960a). Agonistic behavior in the House Finch. Part I: Annual cycle and display patterns. The Condor, 62:245–271.

Thompson, W. L. (1960b). Agonistic behavior in the House Finch. Part I: Annual cycle and display patterns. The Condor, 62:245–271.

Thompson, W. L. (1960c). Agonistic behavior in the House Finch. Part II: Factors in aggressiveness and sociality. The Condor, 62:378–402.

Thornton, S. (1995). Club Cultures: Music, Media and Subcultural Capital. Polity Press, Cambridge.

Thouzeau, V., Affholder, A., Mennecier, P., Verdu, P., and Austerlitz, F. (2018). Inferring linguistic transmission between generations at the scale of individuals. bioRxiv.

Tinits, P. and Sobchuk, O. (2020). Open-ended cumulative cultural evolution of hollywood film crews. Evolutionary Human Sciences, 2:e26.

Tominaga, Y. (2018). There’s No Place Like Home! Examining the Diffusion of Suicide At- tacks Through Terrorist Group Locations. Applied Spatial Analysis and Policy, 11(2):355– 379.

Toussaint, G. (2003). Classification and Phylogenetic Analysis of African Ternary Rhythm Timelines. In Proceedings of BRIDGES: Mathematical Connections in Art, Music and Science, pages 1–18.

Towers, S., Gomez-Lievano, A., Khan, M., Mubayi, A., and Castillo-Chavez, C. (2015). Contagion in mass killings and school shootings. PLoS ONE, 10(7):e0117259.

United Nations Convention on Migratory Species (2017). Decisions 12.75 to 12.77 - conser- vation implications of animal culture and social complexity.

United Nations Development Programme (2016). Preventing Violent Extremism Through Promoting Inclusive Development, Tolerance and Respect for Diversity. Technical report, New York, NY.

V¨akev¨a,L. (2010). Garage band or GarageBand®? Remixing musical futures. British Journal of Music Education, 27(1):59–70. van Buuren, S. and Groothuis-Oudshoorn, K. (2011). mice: Multivariate imputation by chained equations in r. Journal of Statistical Software, 45(3):1–67. BIBLIOGRAPHY 199 van Eck, N. J. and Waltman, L. (2007). VOS: A New Method for Visualizing Similarities Between Objects. In Studies in Classification Data Analysis, and Knowledge Organization, pages 299–306. Springer Berlin Heidelberg. van Eck, N. J. and Waltman, L. (2009). Software survey: VOSviewer a computer program for bibliometric mapping. Scientometrics, 84(2):523–538. van Eerten, J.-J., Doosje, B., Konijn, E., de Graaf, B., and de Goede, M. (2017). Developing a social media response to radicalization: The role of counter-narratives in prevention of radicalization and de-radicalization. Technical report, Wetenschappelijk Onderzoek- en Documentatiecentrum, Amsterdam. van Venrooij, A. (2015). A community ecology of genres: Explaining the emergence of new genres in the UK field of electronic/dance music, 1985-1999. Poetics, 52:104–123.

Veit, R. R. and Lewis, M. A. (1996). Dispersal, Population Growth, and the Allee Effect: Dy- namics of the House Finch Invasion of Eastern North America. The American Naturalist, 148(2):255–274.

Velasco, J. D. (2013). Phylogeny as population history. Philosophy and Theory in Biology, 5:e402.

Verhoef, T. (2012). The origins of duality of patterning in artificial whistled languages. Language and Cognition, 4(4):357–380. von Behr, I., Reding, A., Edwards, C., and Gribbon, L. (2013). Radicalisation in the digital era: the use of the internet in 15 cases of terrorism and extremism. Technical report, RAND Europe.

Voogt, S. (2017). Countering far-right recruitment online: CAPE’s practitioner experience. Journal of Policing, Intelligence and Counter Terrorism, 12(1):34–46.

Wagner, C. S., Roessner, J. D., Bobb, K., Klein, J. T., Boyack, K. W., Keyton, J., Rafols, I., and B¨orner,K. (2011). Approaches to understanding and measuring interdisciplinary scientific research (IDR): A review of the literature. Journal of Informetrics, 5(1):14–26.

Waljee, A. K., Mukherjee, A., Singal, A. G., Zhang, Y., Warren, J., Balis, U., Marrero, J., Zhu, J., and Higgins, P. D. (2013). Comparison of imputation methods for missing laboratory data in medicine. BMJ Open, 3(8):1–7.

Walker, K. (2019). tigris: Load Census TIGER/Line Shapefiles. R package version 0.8.2.

Waltman, L., van Eck, N. J., and Noyons, E. C. (2010). A unified approach to mapping and clustering of bibliometric networks. Journal of Informetrics, 4(4):629–635.

Wang, M., Zhao, Y., and Zhang, B. (2015). Efficient Test and Visualization of Multi-Set Intersections. Scientific Reports, 5(1):16923. BIBLIOGRAPHY 200

Webber, D. and Kruglanski, A. W. (2017). Psychological Factors in Radicalization: A ”3 N” Approach. In The Handbook of the Criminology of Terrorism, number 1, chapter 2, pages 33–46. John Wiley & Sons.

Weine, S. and Eisenman, D. (2016). How Public Health Can Improve Initiatives to Counter Violent Extremism. National Consortium for the Study of Terrorism and Responses to Terrorism.

Weiss, J. C., McDevitt, J., and Iwama, J. A. (2016). Group work with victims of hate crimes. In Group work with populations at risk, pages 291–311. Oxford, New York.

Werner, A. (2020). Organizing music, organizing gender: algorithmic culture and Spotify recommendations. Popular Communication, 18(1):78–90.

Whalen, A. and Hoppitt, W. J. (2016). Bayesian model selection with Network Based Diffusion Analysis. Frontiers in Psychology, 7.

Whelan, A. M. (2009). The “Amen” Breakbeat as Fratriarchal Totem. In Gender and Music, pages 111–133. Universitatsverlag Winter, Heidelberg.

White, G., Porter, M. D., and Mazerolle, L. (2013). Terrorism Risk, Resilience and Volatility: A Comparison of Terrorism Patterns in Three Southeast Asian Countries. Journal of Quantitative Criminology, 29(2):295–320.

White, G., Ruggeri, F., and Porter, M. D. (2016). Modelling the Proliferation of Terrorism via Diffusion and Contagion. pages 1–19.

Whitehead, H. (2010). Conserving and managing animals that learn socially and share cultures. Learning and Behavior, 38(3):329–336.

Whytock, R. C. and Christie, J. (2017). Solo: an open source, customizable and inexpensive audio recorder for bioacoustic research. Methods in Ecology and Evolution, 8(3):308–312.

Wild, S. and Hoppitt, W. (2018). Choosing a sensible cut-off point: assessing the impact of uncertainty in a social network on the performance of NBDA. Primates.

Wilder, B. and Kandler, A. (2015). Inference of Cultural Transmission Modes Based on Incomplete Information. Human Biology, 87(3):193–204.

Williams, H. J., Taylor, L. A., Benhamou, S., Bijleveld, A. I., Clay, T. A., de Grissac, S., Demˇsar,U., English, H. M., Franconi, N., G´omez-Laich, A., Griffiths, R. C., Kay, W. P., Morales, J. M., Potts, J. R., Rogerson, K. F., Rutz, C., Spelt, A., Trevail, A. M., Wilson, R. P., and B¨orger,L. (2019). Optimizing the use of biologgers for movement ecology research. Journal of Animal Ecology, 89(1):186–206. BIBLIOGRAPHY 201

Wimsatt, W. C. (2013). Articulating Babel: An approach to cultural evolution. Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences, 44(4):563–571.

Windram, H. F., Charlston, T., and Howe, C. J. (2014). A phylogenetic analysis of Orlando Gibbons’s Prelude in G. Early Music, 42(4):515–528.

Winker, M. A., Abbasi, K., and Rivara, F. P. (2016). Unsafe and understudied: the us gun problem. BMJ, 352.

Winter, A. (2019). Online Hate: From the Far-Right to the ’Alt-Right’ and from the Margins to the Mainstream. In Lumsden, K. and Harmer, E., editors, Online Othering, chapter 2, pages 39–63. Springer International Publishing.

Wong, K. and Christmann, K. (2016). Increasing hate crime reporting: narrowing the gap between policy aspiration, victim inclination and agency capability. British Journal of Community Justice, 14(3):5–23.

Wu, P. (2015). Impossible to Regulate? Social Media, Terrorists, and the Role for the U.N. Chicago International Journal of Law, 16(1):281–311.

Wu, Y., Pitipornvivat, N., Zhao, J., Yang, S., Huang, G., and Qu, H. (2016). egoslider: Visual analysis of egocentric network evolution. IEEE Transactions on Visualization and Computer Graphics, 22(1):260–269.

Yang, Z., Algesheimer, R., and Tessone, C. J. (2016). A comparative analysis of community detection algorithms on artificial networks. Scientific Reports, 6:30750.

Yeung, A. W. K., Goto, T. K., and Leung, W. K. (2017). The Changing Landscape of Neu- roscience Research 2006–2015: A Bibliometric Study. Frontiers in Neuroscience, 11:120.

Youngblood, M. (2019a). Conformity bias in the cultural transmission of music sampling traditions. Royal Society Open Science, 6:191149.

Youngblood, M. (2019b). Cultural transmission modes of music sampling traditions remain stable despite delocalization in the digital age. PLOS ONE, 14(2):e0211860.

Youngblood, M. (2020). A Raspberry Pi-based, RFID-equipped birdfeeder for the remote monitoring of wild bird populations. Ringing & Migration.

Youngblood, M. and Lahti, D. (2018). A bibliometric analysis of the interdisciplinary field of cultural evolution. Palgrave Communications, 4.

Zammit-Mangion, A., Dewar, M., Kadirkamanathan, V., and Sanguinetti, G. (2012). Point process modelling of the Afghan War Diary. Proceedings of the National Academy of Sciences of the United States of America, 109(31):12414–12419. BIBLIOGRAPHY 202

Zarybnicka, M., Kubiznak, P., Sindelar, J., and Hlavac, V. (2016). Smart nest box: a tool and methodology for monitoring of cavity-dwelling animals. Methods in Ecology and Evolution, 7:483–492.

Zuur, A. F., Ieno, E. N., and Elphick, C. S. (2010). A protocol for data exploration to avoid common statistical problems. Methods in Ecology and Evolution, 1(1):3–14.