Psychological Meme Science

by

Ian Dennis Miller

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy
Graduate Department of Psychology
University of Toronto

© Copyright 2019 by Ian Dennis Miller

Abstract

Psychological Meme Science

Ian Dennis Miller
Doctor of Philosophy
Graduate Department of Psychology
University of Toronto
2019

Memes are ideas, often represented using media, with the special characteristics of being repeatable and adaptable. Memes impact our lives in material ways, influencing political systems and propagating the stories our shared is built from. When propagated via online social networks, the massive scale at which memes operate is without precedent. However, the meme does not act on its own; it is only by human activity that memes are created and proliferated. This dissertation will tackle a series of research questions surrounding the scientific study of humans and memes from a psychological perspective.

This work begins with the observation that science is a social enterprise and scientific ideas spread as memes. The first chapter of this dissertation applies social network methods to the global scientific collaboration network in order to build a map of beliefs about systems of humans and memes. The next chapter examines a hierarchical democratic phenomenon - the online campaign preceding an election - in order to determine the appropriate analytical scope for investigating complex systems of political memes.

The final chapter presents a method for translating regression models from the psychological literature into computational social simulations using agent-based models. A computational social simulation of urban legends is then built, replicating a study from the literature and then extending it to examine the effect of social network topology upon the propagation of urban legends.

Humans and memes, together, constitute a that offers new methodological tools to study the human condition.

Acknowledgements

First and foremost, it is my pleasure to acknowledge the contributions of my advisor, Prof. Gerald Cupchik, who made this possible in the first place. The other members of my doctoral committee, Prof. Doug Bors and Prof. Jacob Hirsh, provided valuable encouragement and advice that substantially benefited the final work. In particular, Prof. Elizabeth Page-Gould generously offered her , advice, support, and extensive feedback on this dissertation. I also wish to acknowledge the examiners (alphabetically): Prof. Matthew Feinberg, Prof. Will Gervais, Prof. Cendri Hutcherson, and Prof. Yoel Inbar. Ultimately, I wish to thank everyone who supported me throughout this journey.

Contents

List of Figures

List of Tables

1 Introduction
1.1 Overview
1.2 Current Work

2 The Literature of Psychological Meme Science
2.1 Introduction
2.2 Background
2.2.1 Six Degrees of Separation
2.2.2 Academic Networks
2.2.3 Coauthorship
2.2.4 Summary
2.3 Methods
2.3.1 Bibliographic Entries
2.3.2 Bibliography Management
2.3.3 Scholarship Catalog Methods
2.3.4 Biographic Research
2.3.5 Cleaning Coauthorship Data
2.3.6 Coauthorship Network
2.3.7 Network Analysis Methods
2.3.8 Visualization Methods
2.4 Results
2.4.1 Coauthorship Network
2.4.2 Main Component Network
2.4.3 Component Path Length Distribution
2.4.4 Community Detection
2.4.5 Component Community Size Distribution
2.4.6 Groups of Communities
2.4.7 Community Labels
2.4.8 Author Influence
2.4.9 Influential Institutions
2.4.10 Online Interactive Viewer
2.4.11 Summary of Results
2.5 Discussion
2.5.1 Longer Path Length
2.5.2 Utility of Scholarly Silos
2.5.3 Institutions
2.5.4 Observations of academic publishing over time
2.5.5 When to Stop
2.6 Conclusion
2.6.1 Future Directions
2.6.2 Applying this New Knowledge

3 The Analytical Scale of Online Political Campaigns
3.1 Introduction
3.2 Background
3.2.1 Democracy and Elections
3.2.2 Campaigns and Electioneering
3.2.3 Speaking for a Collective
3.2.4 Finding Symbols in Campaign Speech
3.2.5 Memes
3.2.6 Questions of Scale and Causation
3.2.7 Models
3.2.8 Basic Conceptual Model
3.3 Methods
3.3.1 Data Collection
3.3.2 Collecting Tweets
3.3.3 Collecting the Social Graph
3.3.4 Natural Language Processing
3.3.5 Requirements for a Method that Identifies Memes
3.3.6 Pointwise Mutual Information
3.3.7 Advantages and Disadvantages of PMI
3.3.8 Community Detection
3.3.9 Hierarchical Linear Modelling
3.3.10 Operational Terminology
3.4 Results
3.4.1 Tweets
3.4.2 Social
3.4.3 Models
3.4.4 Summary of Results
3.5 Discussion
3.5.1 Phenomenological Scale
3.5.2 Agency of the Individual
3.5.3 Influence of Collectives
3.5.4 Covariation of Symbols and the Unconscious
3.5.5 Symbols that Emerge
3.5.6 Misinformation and Propaganda
3.6 Conclusion

4 Urban Legend Propagation
4.1 Introduction
4.2 Background
4.2.1 Key Terminology
4.2.2 Reviewing a study on Urban Legends
4.2.3 The current work
4.3 Methods
4.3.1 Computational Modelling Theory
4.3.2 The Current Work
4.3.3 ODD Protocol
4.3.4 Simulations
4.4 Results
4.4.1 Study 1: Replication
4.4.2 Study 2: Receive-transmit pattern at 10x scale
4.4.3 Study 3: Preferential Attachment
4.4.4 Summary
4.5 Discussion
4.5.1 The Lab and The Simulation
4.5.2 Scaling a serial reproduction task
4.5.3 Explanations for Emergent Network Behaviour
4.5.4 Messages and Topologies
4.5.5 Error Bars and Parallel Simulations
4.5.6 Statistical Inference
4.5.7 A General Method for Psychological Research
4.6 Conclusion
4.6.1 Adapting Linear Models to Agent-Based Modelling
4.6.2 Urban Legends Require Networks

5 Conclusion
5.1 Meme Findings
5.2 Research with Memes
5.3 Implications for Psychological Research with Memes
5.4 Final Remarks

References

List of Figures

2.1 Path Length Distribution
2.2 Scholarly Synthesis Algorithm
2.3 Interactive Network Visualization
2.4 Full Coauthorship Network
2.5 Main Component of Coauthorship Network
2.6 Main Component Path Length Distribution
2.7 Component Communities
2.8 Community Size Distribution

3.1 Democracy
3.2 Symbols and the Collective Unconscious
3.3 Jungian Campaign Speech
3.4 Online Social Media Network Model
3.5 Memetic Model
3.6 Social Cognition Model
3.7 Social Forces Model
3.8 Model
3.9 Null Model
3.10 Basic Conceptual Model
3.11 Data Collection Diagram
3.12 Downloading the Statuses
3.13 Downloading the Follower Graph
3.14 Tweets Over Time
3.15 Replication Status of 3-grams
3.16 Frequency of 3-grams
3.17 Frequency Filter for 3-grams
3.18 High-scoring 3-grams
3.19 Tweet Location Map
3.20 Complete Social Graph
3.21 Degree Distribution
3.22 Graph Dynamics of Degree Filter
3.23 Social Graph (degree > 1000)
3.24 Social Graph, Open Ord Layout
3.25 Basic Hierarchy, Operationalized

4.1 I can has cheezburger?
4.2 Process for Creating High and Low Disgust Stories
4.3 Eriksson & Coultas (2014); p. 14
4.4 Research Method Used in Eriksson & Coultas Study 3
4.5 Eriksson & Coultas (2014); p. 17
4.6 Railsback & Grimm (2011, p. 245)
4.7 Serial Reproduction Tasks; length = 2
4.8 Study 1; NetLogo Interface
4.9 Study 2; NetLogo Interface
4.10 Study 3; NetLogo Interface with network layout
4.11 ABM “step” submodel
4.12 share threshold Parameter Search Results
4.13 Computational Model Study 1 Results
4.14 Null choose-to-receive Results
4.15 Null choose-to-transmit Results
4.16 Null receive and transmit Results
4.17 Study 2 Results
4.18 Study 3 Results
4.19 Study 2, Total Stories
4.20 Study 3, Null Comparison

List of Tables

2.1 Disciplines Relevant to Psychological Meme Science
2.2 Complexity Communities
2.3 Psychology Communities
2.4 Ecology Communities
2.5 Social/Information Communities
2.6 Computing Communities
2.7 Physics Communities
2.8 Uncategorized Communities
2.9 Author Centrality Measures
2.10 Silicon Valley Institutions
2.11 Michigan, MIT, and Industry
2.12 Classic Academics

3.1 Model Specifications
3.2 Model Test Results
3.3 BIC Comparisons Among Models

Chapter 1

Introduction

This dissertation explores how memes provide a window into social communication and relationships. A meme is an idea that can be shared among people by representing the idea using common symbols. When a group of people collectively understands something, and when they converge upon a symbolic representation of that understanding, then a meme has been created. For the purposes of this work, memes are defined as the covariation of symbols corresponding to the covariation of . The statistical of covariance will enable several forms of analysis. Altogether, these systems can be investigated at multiple levels: 1) the group that shares the understanding and the memes; 2) the individuals who create and propagate new memes; 3) the network connecting individuals; and 4) the memes or the understandings, themselves.

Ever since the term “meme” was coined, memes have been compared to genes because they were introduced in a book about genetics: Dawkins’ The Selfish Gene (1976). This origin has interfered with the understanding of memes and many thinkers have identified numerous shortcomings of the genetic analogy. The very word meme functions as a shibboleth: it evokes a different idea among the general population than among the scientific community. In popular use, digital images with embedded text captions are commonly called memes. This popular use of the word meme implies that the image is the meme - but this conflation of the meme with its representation also interferes with the understanding of memes. To the extent that memes are shared ideas, a meme simply does not exist as an independent structure; there is a strong complementarity between memes and the humans who create and share them. Furthermore, humans are also not independent from one another; hierarchical social structures exist among individuals and memes may be transmitted through these structures.
It is a mistake to reduce the meme to its media representation because the meme’s capacity for social transmission requires consideration for humans and society. The field of social psychology investigates the individual in the context of their social situation. The complex and interdependent system of memes, humans, and society offers a powerful analytic framework for examining the human condition in context. By virtue of spanning disciplines, the scientific study of memes adopts methodologies from several fields, including psychology, sociology, ecology, , communication, , computation, and more. Psychological Meme Science is the cross-disciplinary investigation of individuals within their social and media context.


1.1 Overview

In this dissertation, several kinds of systems will be examined from the perspective of psychological meme science. Academic ideas, themselves, are memes in which publications are created to represent and propagate ideas among groups and disciplines. Election campaigns can be viewed as the competition of political slogans - themselves, memes - that are designed to establish a majority for the elected representative. Urban legends are also memes in which stories are propagated via word of mouth until they are eventually forgotten. Each of these - academia, elections, and urban legends - is a complex system consisting of memes, individuals, and society that is best studied with an analytic approach that considers all of these. Therefore, psychological meme science is demonstrated throughout this dissertation by its application to these systems.1

Chapter 2: The Literature of Psychological Meme Science The academic social network is a small world, in which all scholars are connected through coauthorship, but some connections are strong and others are weak. Academic disciplines are formed from groups of academics who are strongly connected through coauthorship and citation. Conversely, interdisciplinary research occurs when a publication is coauthored by scholars who are weakly connected by coauthorship and citation. This chapter synthesizes the literature of social psychology, memes, and networks using a social network approach. The literature of psychological meme science is mapped by exploring the coauthorship network of the academics publishing on these topics to find a common community and literature that unites these sciences. The resulting bibliography and academic social network describe the literature of psychological meme science.2

Chapter 3: The Analytical Scale of Online Political Campaigns A political slogan is a meme that is created in the context of an election campaign. Memes are a window into the election dynamic itself: individuals share memes and cast votes, yet the outcomes of elections are collective as determined by the majority of voters. An election campaign is therefore a hierarchical process consisting of memes, individuals, and collectives. This chapter asks a fundamental question of liberal democracy: do we govern ourselves? Are our choices the mere product of memetic contagion, in which the most “viral” memes determine popularity? Or do social network forces jointly influence groups of individuals to ultimately produce a majority? This chapter analyzes the memes shared during an election campaign to examine whether it is memes, individuals, or social networks that determine the popularity of political slogans.

Chapter 4: Urban Legend Propagation Urban legends are a kind of meme that can be transmitted through social networks - both word-of-mouth networks and digital networks. An urban legend cascade is the lifetime history of all the people who transmitted a particular story - and some urban legends have quite old cascades.

1 There is also an autobiographical narrative to these chapters. Chapter 2 is the process of bootstrapping myself for this work by discovering literature, academic communities, and research methods. In Chapter 3, I apply some of these new methods to a specific social network with lots of meme activity. Another objective of Chapter 3 is to establish that the psychological perspective is relevant to the study of memes. Chapter 4 directly synthesizes psychological science with memes using computational simulation methods, demonstrating their validity for psychological research. Ultimately, my purpose for this dissertation is to justify future work with the psychological treatment of memes, including forthcoming research on political polarization and world population simulations.

2 This chapter describes how the literature review was conducted but the actual literature review appears in the background sections of subsequent chapters.

Previous research on urban legends has found that more disgusting stories are more likely to be transmitted along a non-branching cascade (i.e., where each person who transmits an urban legend shares it with only one new person). However, online social networks are laid out as a branching web of connections (i.e., where each person can transmit an urban legend to more than one new person) - and these branching connections enable urban legend cascades to “go viral.” Previous research that used non-branching networks does not replicate real-world conditions and the results did not exhibit story proliferation. The current chapter examines these network differences by replicating a study of urban legend transmission with different network structures.

New Methods Several new methods are described in this dissertation. In Chapter 2, a method is demonstrated for building an inter-disciplinary bibliography by analyzing a network of coauthorships. In Chapter 3, a method is demonstrated for detecting memes in a corpus of text. In Chapter 4, a method is demonstrated for utilizing a regression model from the psychological literature as the basis for a computational simulation with agent-based modelling. Although these methods are applied to memes, they are intended to be generalizable to other domains in psychological science and beyond.

1.2 Current Work

Altogether, the current work situates memes within the discourse of psychological science. Memes will be examined both as the topic of study themselves and as the vehicle to study hierarchical systems of individuals and groups. This work also extends our understanding of memes by using a social psychological perspective to place memes in the context of the individuals who share them and the social networks that connect them.

Chapter 2

The Literature of Psychological Meme Science

2.1 Introduction

The current work examines the scholarly literature in order to determine a context for the scientific, psychological examination of meme phenomena. This literature review identified an academic network that spans many disciplines but nevertheless connects to form a coherent whole. A social network is a mathematical data structure consisting of individuals and the connections between them. In this work, an academic social network was formed out of coauthorships, in which academic collaborations constitute the links of the network. The social psychology literature forms the core of the network. In order to develop a literature of memes, relevant topics were added to the network, including social network analysis, contagion, and computational modelling. This work presents a new method for curating a bibliography of scholarly works in order to support a social network analysis of the literature. This work concludes with a discussion of influential disciplines, authors, and institutions relevant to research with and upon memes.

2.2 Background

2.2.1 Six Degrees of Separation

Is academia a small world, in which we are all connected to one another, or does academia consist of distinct “silos” of disconnected fields? Previous work has examined the general form of this question as applied to society at large (Travers & Milgram, 1967). One theory held that people existed in separate social networks that never overlapped. The alternative theory was that all people were connected to one another in a “small world,” even if some connections were more distant than others. To test the idea, Travers & Milgram developed a methodology in which letters were mailed from Boston to Kansas with return instructions. The instructions specified one special condition: during the return process, the letter should be physically handed to somebody well-known, on a first-name basis. Close social connections of this kind are likely to co-occur with other resource sharing behaviours, forming the basis for communities (Wellman, 1976).


Figure 2.1: Path length distribution reported by Travers & Milgram (1967). The mode of this plot occurs at a length of 6, indicating the most common network distance required to reach the researchers was 6.

Many of the letters were returned to Boston, each of which contained a record of contacts who forwarded the letter. Altogether, these data formed a rudimentary social network (Mitchell, 1969). The number of hops required for the letter to be passed back to the laboratory was recorded as path length, the results of which are summarized in Figure 2.1. The most common path length connecting the researchers to the letter recipients was 6 links long, which inspired the widely known phrase six degrees of separation (Guare, 1990). From this result, it was concluded that everybody is connected; the social circles overlap and, on that basis, it can be inferred that communities are connected as well (Wellman, 1979). In other words, the of short paths connecting people suggested that we live in a small world.

2.2.2 Academic Networks

Networks of Influence

The connection between two individuals can be characterized in several ways, apart from path length. Other research has looked at the strength of the connections between individuals (Granovetter, 1973, 1983). When two individuals have many shared connections - that is, when their social networks have a high proportion of overlap - then those individuals have a strong tie between them. A weak tie occurs when two individuals are connected but few people from their respective social networks know anybody in the other network. The motivating question underlying this original work asked whether stronger ties lead to greater influence. After all, a higher percentage of social network overlap might imply greater social influence between those individuals. Granovetter observed that separate clusters of strong ties can be connected over long distances through a small number of weak ties. These conditions result in a network topology with unexpectedly short path lengths connecting any two people in the network, just as Travers & Milgram observed. The findings from Granovetter (1973) generalize to scholarship as well (Chubin, 1976).
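The idea of tie strength as network overlap can be made concrete with a small sketch. It assumes an undirected network stored as a dictionary of neighbour sets and measures overlap as the proportion of the two individuals' other contacts that they share. Both the measure and the toy names are illustrative assumptions, not the operationalization used by Granovetter.

```python
def neighborhood_overlap(graph, a, b):
    """Proportion of a's and b's other contacts that the two have in common."""
    friends_a = graph[a] - {b}
    friends_b = graph[b] - {a}
    union = friends_a | friends_b
    if not union:
        return 0.0
    return len(friends_a & friends_b) / len(union)


# Hypothetical toy network: two tight clusters joined by one bridge (carl-dana).
toy_graph = {
    "ann": {"bob", "carl"},
    "bob": {"ann", "carl"},
    "carl": {"ann", "bob", "dana"},
    "dana": {"carl", "eve", "fay"},
    "eve": {"dana", "fay"},
    "fay": {"dana", "eve"},
}

strong = neighborhood_overlap(toy_graph, "ann", "bob")    # same cluster
weak = neighborhood_overlap(toy_graph, "carl", "dana")    # bridge between clusters
```

In this toy network the tie inside a cluster has complete overlap, while the bridging tie has none, which is the pattern that distinguishes strong ties from weak ties in the text.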

Strength of Scholarly Ties

The network topology described in Travers & Milgram (1967) was formalized as the Small World Network in Watts & Strogatz (1998), which can be used to generate artificial networks with small world characteristics. Using artificial networks, it can be demonstrated that small world topologies facilitate simple and complex contagion (Newman et al., 2006). Particularly for complex contagions, which require more than mere exposure, strong ties provide multiple opportunities for exposure (Centola & Macy, 2007). Weak ties function like shortcuts in the network to transport simple contagions across long distances to new clusters of strongly-connected neighborhoods.

Coauthorship is an example of a strong tie that usually occurs between people who know each other very well (Abbasi et al., 2011). Previous work investigated whether institutional communities drive coauthorship or if coauthorship is driven by research interests, finding that institutional affiliation imparts the stronger influence (Rodriguez & Pepe, 2008). Recent work confirmed that the academic environment ultimately determines publishing productivity (Way et al., 2019). These findings are consistent with the strong ties intuition: the higher the degree of social overlap, the greater the influence.
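The shortcut role of weak ties in a small world topology can be illustrated with a minimal simulation. This sketch adds random long-range links to a ring lattice instead of rewiring existing links, which is a common simplification of the procedure in Watts & Strogatz (1998). The function names, network size, and number of shortcuts are assumptions chosen for illustration.

```python
import random
from collections import deque


def ring_lattice(n, k):
    """Each node is linked to its k nearest neighbours (k even) on a ring."""
    graph = {node: set() for node in range(n)}
    for node in range(n):
        for offset in range(1, k // 2 + 1):
            neighbour = (node + offset) % n
            graph[node].add(neighbour)
            graph[neighbour].add(node)
    return graph


def add_shortcuts(graph, count, rng):
    """Add `count` random links between unconnected nodes (weak-tie shortcuts)."""
    nodes = list(graph)
    added = 0
    while added < count:
        a, b = rng.sample(nodes, 2)
        if b not in graph[a]:
            graph[a].add(b)
            graph[b].add(a)
            added += 1
    return graph


def average_path_length(graph):
    """Mean shortest-path length over all reachable ordered pairs, via BFS."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for neighbour in graph[current]:
                if neighbour not in dist:
                    dist[neighbour] = dist[current] + 1
                    queue.append(neighbour)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


rng = random.Random(42)  # fixed seed so the sketch is repeatable
lattice_length = average_path_length(ring_lattice(100, 4))
small_world_length = average_path_length(add_shortcuts(ring_lattice(100, 4), 10, rng))
```

Because shortcuts are only ever added, the average path length of the second network cannot exceed that of the plain lattice. Comparing `lattice_length` with `small_world_length` shows how far a handful of long-range ties shortens the typical path.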

Citation Networks

Not all social networks display small-world properties. Although small world networks describe many social phenomena, including coauthorship, other social phenomena are better described using other networks. For example, a scale-free network is a type of network that describes self-organizing phenomena in the natural world, including nervous systems and the world wide web (Barabási & Albert, 1999). Scale-free networks also model the characteristics of scholarly citations (Barabási, 2009), providing a useful theoretical comparison with small-world networks. Scale-free networks can be produced with an algorithm called Preferential Attachment (Newman, 2003) that enacts a “rich get richer” dynamic. Consider celebrity Twitter users for a moment; the most popular Twitter accounts are the most likely to receive new connections. Similarly, academic citation networks are scale-free partly because highly-cited articles are more likely to receive new citations.

Citation data structures tend to be large: when there are thousands of articles and each one cites dozens of other articles, the number of aggregate citations rapidly increases. Perhaps due to its similarity to the World Wide Web, the citation network has received more attention from bibliometric scientists because the same algorithms work on both networks. Due to these shared algorithms, bibliographic citation search tools are affected by popularity in a manner similar to web site rankings. Popularity within an academic discipline indicates the acceptance and reification of commonly-held techniques and practices, resulting in disciplinary boundaries that become more rigid over time.
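The “rich get richer” dynamic of preferential attachment can be sketched in a few lines. In this minimal version each new node creates exactly one link, and the target is chosen with probability proportional to its current degree. The single link per newcomer, the two-node starting network, and the function name are simplifying assumptions, not a reproduction of any particular published model.

```python
import random


def preferential_attachment(n_nodes, rng):
    """Grow a network one node at a time; each newcomer links to one existing
    node chosen with probability proportional to that node's current degree."""
    degree = {0: 1, 1: 1}   # start from a single link between two nodes
    endpoints = [0, 1]      # each link contributes both of its endpoints
    for new_node in range(2, n_nodes):
        # A node appears in `endpoints` once per link it holds, so a uniform
        # draw from this list is a degree-proportional draw over nodes.
        target = rng.choice(endpoints)
        degree[target] += 1
        degree[new_node] = 1
        endpoints.extend([target, new_node])
    return degree


degrees = preferential_attachment(1000, random.Random(7))
best_connected = sorted(degrees.values(), reverse=True)[:5]
```

Inspecting `best_connected` alongside the many nodes that keep a degree of 1 gives a sense of how unevenly links accumulate, in the same way that highly-cited articles attract further citations.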

Bibliometrics

Bibliometrics is the quantitative study of bibliographic indexing (e.g., Garfield & Merton, 1979). Previous work has looked at patterns of coauthorship in disciplines that were early adopters of computation-friendly bibliometric practices (e.g., Ginsparg, 2011). Bibliometric analysis shows that networks of coauthors tend to form small worlds (Newman, 2001) and networks of citations are scale-free (Newman, 2003). Bibliometrics uses its own taxonomy for characterizing scholarly works (Borgman & Furner, 2002). Numerous scholarly activities produce bibliometric records, including writing, citing, submitting articles, and collaboration. Bibliometrics can be aggregated at different resolutions: person-level, group-level, by discipline, by institution, or even by nation. Bibliometrics can also be analyzed in terms of the kind of publication: whether it’s a research article, a review, or reference work. Each of these different kinds of publication has different bibliometric properties, both in terms of how the publication is constructed and the way it is used by the rest of the literature.

Academic Disciplines

Newman’s finding produces the following paradox: when all authors are connected in a small world, why do we observe disciplines in academia? The term academic discipline refers to the shared methods and techniques that are practiced and adhered to by a group of academics. An academic discipline produces a body of literature that is published as articles, books, and otherwise. Disciplines also have social dimensions, including conferences and journals for dissemination. My intuition is that disciplines result from factors precipitating from network distance, rather than differences of kind.1 To the extent that two disciplines might use different methods, they may use different words to describe the same concepts. Without shared vocabulary, it is difficult to harmonize keyword taxonomies and perform scholarly activities. Disciplines can be thought of as network clusters, with longer average path lengths between clusters and shorter path lengths within. A byproduct of longer paths is that, as ideas require more network hops, more adaptation is required to pass ideas between nodes in the network. Increased translation costs produce vocabulary effects - jargon - leading to the same concept having multiple names in different disciplines. As academic disciplines generate their own lexicons, it becomes ever-harder to translate from one discipline to another.

Inter-disciplinary Research

The challenge to synthesizing an inter-disciplinary literature lies in locating the bridges between disciplines. When two disciplines are bridged through collaboration, that initial connection takes the form of a weak tie in the small world collaboration network. Therefore, in order to synthesize an inter-disciplinary literature, we must examine the small world of academic collaboration to search for those rare, weak ties that reach across the disciplines.

2.2.3 Coauthorship

The current work focuses on coauthorship in order to accomplish several objectives: 1) to determine the research literature relevant to memes; 2) to uncover the network structure of the beliefs that drive the literature on memes; and 3) to locate the present work within this literature.

Shared beliefs and coauthorship

Whenever coauthorship occurs, it is partly because the authors agree with what they are publishing; they believe in it (Liberman & Wolf, 2013). In order to identify the network of beliefs, coauthors must be identified in order to construct a network of coauthorship (Newman, 2004a).

1 I think the term academic silo is a misnomer. The silo concept alludes to agriculture, in which the oats are stored in one silo and the corn is stored in a separate silo - and you would never mix the oats and the corn! As the current work will demonstrate, science acts more like a jellyfish, consisting of a core of polyps and emanating tendrils.

Coauthorship is influenced by institution as well as research relevance (Rodriguez & Pepe, 2008). Therefore, there are at least two types of coauthorship: 1) convenient coauthorships, which are based on location; and 2) relevance coauthorships, which occur regardless of location. For the purposes of identifying networks, coauthorships of relevance are important to locate - especially because those coauthorships could be the product of weak ties. Luckily, coauthorships can also be identified through biographical information, including mentorship. By combining several chains of coauthorship that differ in their ease of findability, including both convenient and relevant coauthorships, a single network of beliefs can be obtained. Apart from the belief properties of coauthorship, another reason to focus on coauthorship is that it is the right-sized problem. In contrast to the scale-free network dynamics of citation analysis, which is a big data undertaking that requires hundreds of thousands or even millions of citations, the current coauthorship analysis will require just a few thousand hand-picked bibliographic entries. Consequently, this is a reasonable undertaking that can be completed in a sensible amount of time while providing actionable insights.2

Alternatives to coauthorship

There are many bibliometric approaches to discipline mapping, in addition to coauthorship. Here, I will briefly discuss why an alternative to coauthorship was not chosen. Citation data, in which one article explicitly references another article, are one of the more common objects of bibliometric analysis (Garfield & Merton, 1979), largely because those data are fairly easy to acquire using modern . Citation data are typically leveraged by automatically collecting as many articles and citations as possible. However, doing so would violate my objective of keeping this project small.

Co-citation, in which two articles are cited together by a later article, is an interesting corollary to citation (Small, 1973). By analogy, co-citation is like a strong tie of citedness by implying greater network overlap. Co-cited articles have a higher probability of being related than articles that are not co-cited (White & Griffith, 1981). Co-citation suffers from the same data scale issues as citations: it is another big-data problem.

Articles may include acknowledgements, which appear as loosely-structured paragraphs of natural language either at the beginning or the end of an article (Wang & Shapira, 2011). As such, it is difficult to acquire raw acknowledgements data in the first place and, once obtained, the paragraphs are difficult to parse into data for analysis. Acknowledgements imply a stronger tie than citations because acknowledgements are typically restricted to people known by the authors. Therefore, acknowledgement data are rich and could be theoretically expected to indicate shared beliefs - but raw acknowledgements data are difficult to work with.

It is also possible to analyze the mentorship lineage of academics (Malmgren et al., 2010), as in the case of the Mathematics Genealogy Project (Jackson, 2007). Mentorship ties tend to be very strong, with substantial network overlap (Kram, 1988).
Mentorship networks also transmit beliefs but, as with acknowledgements, the data are unstructured and are therefore difficult to work with. Some bibliometric methods require big data and other methods require significant data cleaning. None of these alternatives to coauthorship provide the right ratio of data size and belief indication. By process of elimination, coauthorship was selected.

2This study is based on a manually-curated bibliographic library consisting of approximately 2500 articles. Consistent with my objectives, it was possible to collect these articles over the course of 12 months.

2.2.4 Summary

Coauthorship indicates a shared belief and these beliefs accumulate into clusters of strong ties that may be called disciplines, which are reified by citation networks and vocabulary. Academia is a small world network in which weak coauthorship ties sometimes exist as bridges between disciplines. If a sufficient quantity of coauthorships is analyzed to identify weak ties, then it might be possible to connect cross-disciplinary scientific interests into a single network.

2.3 Methods

The primary goal for this work was to create a network of coauthorship relevant to the literature of memes and social psychology. This work also presents a method for synthesizing distinct academic literatures into a single bibliography by strategically searching for coauthorships that form bridges between academic disciplines. This synthesis method gradually connects groups of coauthors into a single academic coauthorship network with small world properties. Using this synthesis method, a coauthorship network of memes and social psychology was constructed, then analyzed and interpreted.

The complete scholarship synthesis method is depicted in Figure 2.2. While the details are described over the remainder of this methods section, the method can be summed up as: 1) build a network of coauthorships from a bibliography; 2) identify groups of coauthors in the bibliography; 3) search for new publications with coauthorships that bridge disciplines; 4) extend bibliography with new publications; and 5) repeat.

The goal for each iteration in Figure 2.2 is to locate a new publication, with the coauthorships implied thereby, that connects any small group of coauthors to any bigger group of coauthors. Online scholarly search tools are used to locate publications written by coauthors who create bridges to bigger groups of coauthors. Any time a smaller cluster cannot be connected with the addition of a single publication, longer weak tie chains of coauthorship can also be built. Once a new publication is located and imported into the bibliographic database, the next iteration starts.
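The iteration summarized above can be sketched in Python. This is a minimal illustration under stated assumptions, not the procedure used in this work: the search step was carried out by hand with scholarly search tools, so the `find_bridge` callback, the `pretend_search` stand-in, and the placeholder author names are all hypothetical.

```python
from itertools import combinations


def components(bibliography):
    """Group authors into connected sets; each entry is a list of author names."""
    parent = {}

    def find(author):
        parent.setdefault(author, author)
        while parent[author] != author:
            parent[author] = parent[parent[author]]  # shorten the chain as we go
            author = parent[author]
        return author

    for authors in bibliography:
        for author in authors:
            find(author)  # register single-author works too
        for a, b in combinations(authors, 2):
            parent[find(a)] = find(b)  # coauthors share one group

    groups = {}
    for author in parent:
        groups.setdefault(find(author), set()).add(author)
    return sorted(groups.values(), key=len, reverse=True)


def synthesize(bibliography, find_bridge, max_iterations=100):
    """Repeat: identify groups, search for a bridging publication, extend the bibliography."""
    bibliography = [list(entry) for entry in bibliography]
    for _ in range(max_iterations):
        groups = components(bibliography)
        if len(groups) <= 1:
            break  # everything is connected
        publication = find_bridge(groups)  # hypothetical stand-in for a manual search
        if publication is None:
            break  # no bridge could be located
        bibliography.append(list(publication))
    return bibliography


def pretend_search(groups):
    """Hypothetical search that always 'finds' a paper linking the smallest group to the largest."""
    return [min(groups[-1]), min(groups[0])]


papers = [["A", "B"], ["C", "D"], ["E"]]
result = synthesize(papers, pretend_search)
```

In practice the search can fail, which is why the loop stops when no bridging publication is returned; the unconnected groups then remain as separate components.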

2.3.1 Bibliographic Entries

This dissertation used the BibTeX bibliography format to collect metadata about the literature on humans and memes. BibTeX3 is a file format that is widely used for digitally representing and sharing bibliographic entries (Patashnik & Lamport, 2010). When authoring scholarly works, many BibTeX entries can be combined to create a bibliography using the LaTeX document authoring system (Lamport, 1994). The BibTeX format is ubiquitous. Nearly every bibliography management tool provides an interface for BibTeX. The BibTeX format has among the highest rates of adoption online, which means that most online scholarly catalogs will produce a BibTeX file corresponding to a bibliographic entry.4

3BibTeX is pronounced BIB-tekh, as the TeX refers to techne in the sense of craft or technique. The χ is a chi.

4The University of Toronto Library does not directly support BibTeX through its online portal. However, the University does have an institutional agreement with RefWorks, an online commercial reference manager - and RefWorks supports BibTeX, albeit poorly. I mention RefWorks for completeness and, doubtless, this is the simplest path for new students to get started with bibliography management. However, the use of RefWorks introduces a punishing 15 seconds of latency as well as several additional user interface interaction steps that are frustratingly slow. This observation is not a mere matter of preference; consider that 15 seconds, multiplied by 2500 bibliographic entries, produces over 10 hours of wasted time waiting for software to do its job. I would not want to spend that time in a single stretch, spanning more than a working day, and it is torture to interleave that delay throughout a Csikszentmihalyi-esque flow state. As such, RefWorks is a beginner’s tool.

Figure 2.2: The Scholarly Synthesis Algorithm is a general method for conducting an inter-disciplinary literature review. The general steps depicted in this chart are described in detail in the methods section. The step called “Search for weak ties” could imply multiple sub-steps in which a chain of coauthorships might be required in order to successfully create a bridge to a group of coauthors.

There are many interfaces for translating BibTeX data into different computing and statistical environments (e.g., François, 2019; Boulogne et al., 2019). However, there is some ambiguity to the BibTeX format and subtle incompatibilities persist across BibTeX implementations.5 BibTeX is widely supported by academic software, including Zotero (Roy Rosenzweig Center for History and New Media, 2019); LaTeX (Lamport, 1994); R (R Core Team, 2013); and Python (Python Software Foundation, 2010).

2.3.2 Bibliography Management

The acquired bibliographic entries were stored and managed using the Zotero software package. Zotero is a bibliography management platform, consisting of a stand-alone software application and an optional hosted service for synchronizing bibliographic entries between multiple computers (Roy Rosenzweig Center for History and New Media, 2019). The Zotero desktop software is Open Source.6 Zotero natively imports and exports BibTeX and is also extensible with plug-ins for additional functionality. In particular, the Better-BibTeX plug-in (Heyns, 2019) streamlines the bibliography export process, which became a critical part of the scholarship synthesis method. Zotero also integrates with popular web browsers to import bibliographic entries directly from many online catalogs and databases.

2.3.3 Scholarship Catalog Methods

Numerous scholarly databases were used to search for academic literature. Many of those databases were specific to a discipline or required some domain familiarity. Initially, Google Scholar was used for general-purpose literature searches (Google, 2019). However, Google provides limited access to Scholar - and apparently, the current work exceeded those limits.7 Other search tools that were used included OCLC Worldcat (Kilgour, 1979), Citeseer (Giles et al., 1998), and domain-specific databases like DBLP (Ley, 2002), APA PsycNET (American Psychological Association, 2019), and arXiv (Ginsparg, 2011). One benefit to domain-specific catalogs is the ability to navigate to related resources based on domain-specific criteria other than keywords. The main drawback to domain-specific tools is that the interface is different on each one - so each one must be learned anew.

The University of Toronto library was essential for searching the literature (see Tillotson et al., 1995). Among other functions, the library aggregates licenses to provide access to a vast array of publications.8

5There is no standards body responsible for maintaining a canonical definition of the BibTeX format. In practice, the canonical syntax specification for BibTeX seems to be synonymous with the package implementation (Patashnik & Lamport, 2010). In some cases, the absence of a formal specification has resulted in various BibTeX generators creating malformed bibliographic output and in other cases resulted in BibTeX parsers breaking with seemingly-valid input.

6I believe open source software is fundamental for creating reproducible science.

7As a result of my with Scholar, I now have reservations about it. The closed, proprietary of the Google Scholar database produced circumstances with conflicting incentives. On one hand, Google provides free access to their database and it is reasonable for them to impose limits on their service. On the other hand, scholarship is accelerated as barriers are overcome - but Google Scholar turns out to be quite limited. Admittedly, I came to submit dozens of queries per day to Google Scholar at the height of this work. In this manner, I suspect my profile became associated with this high rate of usage and, as a result, I was lumped into some abuser category. Google Scholar began limiting the number of searches I could perform by deploying Captcha against me (Von Ahn et al., 2003), which is the anti-automation technology employed by Google to protect against software bots. Eventually, I became a de facto worker for some unknown Google machine learning project, performing micro-tasks in exchange for search results. Suffice to say that this experience encouraged me to expand my utilization of search databases.

8I found that Toronto, as a large research institution, offers best-in-class access.

2.3.4 Biographic Research

Many significant works are single-author publications, which affected the topology of coauthorship networks by adding bibliographic entries without adding new coauthorship links. When encountering single-author works, academic biographies became particularly useful. It was frequently helpful to identify an author’s institutional contemporaries in order to locate collaborations. When tracking down an influential scholar who usually published alone, looking at who their mentors collaborated with sometimes revealed useful coauthorship connections. With the addition of biographical information, existing scholarly tools could be used in an effort to search for collaborations involving likely collaborators.

2.3.5 Cleaning Coauthorship Data

During the collection of bibliographic metadata, the quality of the data varied widely and required cleaning before it could be processed. Many bibliographic entries using the BibTeX format are not completely compatible between software packages. The bibclean utility was used to repair common BibTeX problems (Beebe, 2015). After cleaning, the R bibtex package was used to import the citation database into R for analysis (François, 2019).

Online scholarship databases contain inconsistent bibliographic information, especially regarding author names. Each alternative spelling of an author’s name would appear to be an entirely separate author. To deal with naming irregularities, a data cleaning step was implemented in R to regularize author names. The following problems were automatically corrected during this data cleaning step:

• Capitalization may not be applied systematically.

• Middle names may be abbreviated; at other times they are expanded or omitted.

• Names that contain spaces may be spelled with or without spaces in each database.

• Characters with accents or umlauts may be represented in some databases with their undecorated counterparts.
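The four corrections listed above can be illustrated with a short Python sketch. The cleaning step in this work was implemented in R and its exact rules are not reproduced here; the function name, the assumed "Last, First Middle" input format, and the choice to keep only the first initial (so that omitted middle names still match) are all assumptions made for illustration.

```python
import re
import unicodedata


def normalize_author(name):
    """Return a canonical 'surname, initial' key for a 'Last, First Middle' name."""
    # Accents and umlauts: decompose characters and drop the combining marks
    decomposed = unicodedata.normalize("NFKD", name)
    plain = "".join(c for c in decomposed if not unicodedata.combining(c))

    last, _, given = plain.partition(",")

    # Spaces within names and capitalization: remove separators and lowercase
    surname = re.sub(r"[\s'\-]", "", last).lower()

    # Middle names: keep only the first initial, so abbreviated, expanded,
    # and omitted middle names all collapse to the same key
    tokens = re.findall(r"[A-Za-z]+", given)
    initial = tokens[0][0].lower() if tokens else ""

    return f"{surname}, {initial}"


variants = ["Miller, Ian Dennis", "MILLER, Ian D.", "Miller, I."]
keys = {normalize_author(v) for v in variants}  # a single shared key
```

Reducing given names to one initial is an aggressive choice: it merges spelling variants, but it would also merge two distinct authors who share a surname and first initial, so a real cleaning step would need a manual check for such collisions.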

2.3.6 Coauthorship Network

Having obtained clean bibliography data, the authors were then combined into a network based on coauthorship using the R network package (Butts et al., 2019). Each author is a node in the coauthorship network. For each publication coauthored by two authors, an edge connecting the two author nodes is added to the coauthorship network. An algorithm implemented in R iterated across each entry in the bibliography, processing one at a time. For each bibliographic entry, every pairwise combination of coauthors was extracted from the authorship field. For each pair of coauthors, an edge connecting those author nodes was added to the network.9 Authors who are connected by a coauthorship edge are adjacent to one another in the coauthorship network. The edges in the coauthorship network are unweighted10 and undirected.11
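The edge-building step can be sketched as follows. The original algorithm was written in R with the network package; this Python version is only an illustration, and the function name and the placeholder author names are hypothetical. Edges are stored as a set of sorted pairs, which makes them undirected and unweighted: a repeated collaboration does not add a second edge.

```python
from itertools import combinations


def coauthorship_edges(bibliography):
    """Build undirected, unweighted edges from a list of author lists."""
    edges = set()
    for authors in bibliography:
        # Sorting gives each pair a single canonical orientation (a, b) with a < b
        for a, b in combinations(sorted(set(authors)), 2):
            edges.add((a, b))  # adding an existing pair changes nothing
    return edges


bibliography = [
    ["Author A", "Author B"],              # one pair
    ["Author C", "Author D", "Author E"],  # three pairs
    ["Author D", "Author C"],              # repeated collaboration, no new edge
    ["Author F"],                          # single-author work, no edges
]
edges = coauthorship_edges(bibliography)
```

A single-author entry contributes a bibliographic record but no edge, which is why single-author works leave the network topology unchanged.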

9A paper with n authors produces ${}^{n}C_{2} = n(n-1)/2$ pairwise combinations.

10When edges are unweighted, there is no rank order relationship among them; they are all equivalent. The number of times two authors collaborated could have been used as the edge weight. However, if I were to count the frequency of collaborations, I would actually be measuring the number of times I added specific authors to the bibliographic database. The edge weight would say more about my decisions than about the literature - so it is better to leave edges unweighted.

11The coauthorship edge implies a symmetric relationship between authors. Author A collaborated with author B just as much as B collaborated with A. An undirected edge may be contrasted with a directed edge, like the asymmetrical following relationships on Twitter.

2.3.7 Network Analysis Methods

Once the coauthorship network was constructed in R, it could be analyzed. The first analytical step identified connected components describing sets of authors who are interconnected through a network path of some length. Separate components in the coauthorship network signify the nonexistence of any coauthorships in the network capable of bridging those unconnected groups of coauthors. Whenever a coauthorship was located that created a bridge between components, then those two components became merged into one, larger component.

Several graph metrics were then computed on the coauthorship network. Average path length is computed as the average number of coauthorship edges that separate any two authors in the coauthorship network. Average clustering coefficient relates to the overall eagerness of authors to collaborate (Holland & Leinhardt, 1971). As the average clustering coefficient approaches 0, authors tend to operate independently. As the average clustering coefficient approaches 1, authors tend to collaborate with one another. Graph modularity relates to how “siloed” the communities are within a network. As modularity approaches 0, authors tend to belong to a single global community. As modularity approaches 1, communities operate as independent disciplines that have very little cross-talk between them. Graph modularity was calculated using the Louvain modularity algorithm (Blondel et al., 2008) implemented by R package igraph (Csardi & Nepusz, 2006). A computational byproduct of the Louvain method is to identify communities in a network and assign authors to those communities. Afterwards, those community codes were used to label communities and apply colours to authors during visualization.
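The component and path-length calculations described above can be sketched in Python using breadth-first search. The analysis in this work was carried out in R, so the function names and the toy edge list here are assumptions for illustration only. The sketch averages path length over connected pairs of authors, since no path exists between authors in different components.

```python
from collections import deque


def adjacency(edges):
    """Turn a list of undirected edges into a dict of neighbour sets."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def bfs_distances(graph, source):
    """Number of edges on the shortest path from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def connected_components(graph):
    """Sets of mutually reachable nodes, largest first."""
    seen, found = set(), []
    for node in graph:
        if node not in seen:
            component = set(bfs_distances(graph, node))
            seen |= component
            found.append(component)
    return sorted(found, key=len, reverse=True)


def average_path_length(graph):
    """Mean shortest-path length over all connected pairs of nodes."""
    total = pairs = 0
    for node in graph:
        for other, distance in bfs_distances(graph, node).items():
            if other != node:
                total += distance
                pairs += 1
    return total / pairs if pairs else 0.0


graph = adjacency([("A", "B"), ("B", "C"), ("C", "D"), ("X", "Y")])
parts = connected_components(graph)
main = {node: graph[node] for node in parts[0]}  # keep only the largest component
main_apl = average_path_length(main)
```

Authors who only appear on single-author works have no edges and therefore never enter `graph`; a fuller treatment would track them as components of size one.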

2.3.8 Visualization Methods

The interactive graph environment Gephi was used to visualize the coauthorship network (Bastian et al., 2009).12 The Fruchterman-Reingold graph layout algorithm was used to position related authors within proximity to one another (Fruchterman & Reingold, 1991).13 Author nodes were coloured according to community membership, which was previously assigned by the Louvain modularity algorithm. Finally, the network was exported as a web object using the Gephi Sigma.js exporter plug-in (Hale, 2012). The resulting coauthorship network, depicted in Figure 2.3, can be navigated interactively with a web browser (Miller, 2016).

2.4 Results

2.4.1 Coauthorship Network

When the bibliography was exported to perform the final analysis, 2,435 citations had been collected. From those citations, 3,690 individual authors were extracted. A total of 8,734 collaborations were derived from those citations. The average path length of the coauthorship network is 10. The global clustering coefficient is 0.721, meaning these authors have a tendency towards collaboration rather than independent work.

12Gephi is open source software that brands itself as “Photoshop for graphs,” providing many parameters to control the visual rendering of networks.

13In general, force-directed layouts like Fruchterman-Reingold operate as physical simulations of electro-magnetism in which unconnected nodes repel one another and connected nodes attract.

Figure 2.3: The interactive network visualization is available online at http://imiller.utsc.utoronto.ca/media/network. This visualization is designed to be viewed with a web browser. The nodes may be clicked to reveal author names. Colours were adapted from community codes.

Figure 2.4: This is the full coauthorship network, visualized with Fruchterman-Reingold layout. Not all of these coauthors could be connected to form a single component. Therefore, the periphery of the visualization depicts many small components representing coauthorships that were not connected to the main group.

Figure 2.5: The main component of the coauthorship network, also laid out with Fruchterman-Reingold, contains just those authors who are connected by a path of some length. All other authors who were not connected to this group were removed from this visualization.

When the coauthorship network is visualized (see Figure 2.4), a large component is visible in the middle of the plot with many smaller, disconnected components orbiting around the periphery. This plot was generated with ggplot2 (Wickham, 2011) and ggnetwork (Briatte, 2016). The disconnected components represent collaborations that could not be bridged to the main component.14

2.4.2 Main Component Network

Figure 2.5 presents a visualization of the main component, which remains after removing every node not connected to the main component. Of the original 3,690 authors, almost half of them have been merged into a single network. The 1,577 authors in the main component are connected to one another through a path of some length. As a proportion of the overall database, 42.8% of the authors and 62.6% of the collaborations are in the main component. Of the original 8,734 collaborations, the main component contains 5,460 collaborations. Due to the way path length is calculated, removing authors who are not connected to the main component does not dramatically affect average path length. The global clustering coefficient of the main component is 0.675, which is a better estimation of the actual rate of inter-collaboration than was reported in the global analysis.15
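The way small, fully connected components inflate the global clustering coefficient can be shown with a toy example in Python. This is an illustrative sketch only: the coefficient is computed here as the share of connected triples that are closed, which is one common definition of global clustering and is assumed rather than taken from the R code used in this work; the edge lists are invented.

```python
from itertools import combinations


def global_clustering(edges):
    """Fraction of connected triples (two edges sharing a centre) that form triangles."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    closed = triples = 0
    for centre, neighbours in graph.items():
        for u, v in combinations(sorted(neighbours), 2):
            triples += 1          # u - centre - v is a connected triple
            if v in graph[u]:
                closed += 1       # u and v also collaborate, so the triple is closed
    return closed / triples if triples else 0.0


# A toy "main component": one triangle (B, C, D) with a pendant author A
main_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("B", "D")]
# A stand-alone three-author article: every coauthor is tied to every other
standalone = [("X", "Y"), ("Y", "Z"), ("X", "Z")]

main_only = global_clustering(main_edges)
with_standalone = global_clustering(main_edges + standalone)
```

In this toy case the stand-alone article has a coefficient of 1 on its own, so including it raises the overall value above that of the main component alone, which mirrors the direction of the difference between 0.721 and 0.675 reported here.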

14Perhaps they could not be connected because no connection exists - or perhaps I was simply unable to find a connection. I am inclined to believe the latter; if I had infinite time and a larger network, all connections would probably be located eventually. It is unlikely that a highly influential author never collaborated with anybody over the course of their entire career.

15Each of the smaller components that were removed from this analysis had a very high degree of connectedness within their little component. In fact, every stand-alone article that was a small component unto itself would have a clustering coefficient of 1, indicating complete connectedness among the collaborators because each coauthor on those articles would be connected to every other coauthor - and nobody else. Now that those smaller components are eliminated, the remaining component is relatively less-connected, although this is more an artifact of the clustering calculation than anything else.

Figure 2.6: This plot depicts the distribution of path lengths in the main component. The x-axis depicts the path lengths between the nodes of the network. The y-axis depicts the count or frequency of occurrence of those path lengths. The peak of this distribution occurs at length = 8, although a lesser peak also occurs at length = 18.

2.4.3 Component Path Length Distribution

Figure 2.6 depicts the path length distribution of the main component. Path length is along the x-axis or, to use Travers & Milgram’s original terminology in Figure 2.1, “the number of intermediaries needed to reach target person.” The maximum in Travers & Milgram (1967) occurs at length = 6 whereas the coauthorship maximum occurs at length = 8. Interestingly, both distributions have the same bimodal contours. Frequency of path length occurrence is along the y-axis, which is not directly comparable to Travers & Milgram because the network topologies are dramatically different. In the case of Travers & Milgram, the network topology is a star in which the researchers themselves are the central node in the network and all the participants radiate therefrom like spokes on a wheel. Each participant implies the addition of a single path of some length connecting the participant to the researchers. Furthermore, it would be unlikely for any paths to directly bridge to one another due to the sparseness of the graph, since it is a subset of the US national population.

By comparison, the nature of coauthorship is more complicated. The addition of a single author potentially implies the addition of multiple new paths connecting to the rest of the authors in the network. As can be seen in Figure 2.5, the network topology does not resemble hub-and-spokes. Consequently, in the coauthorship graph, the frequency of path length has a non-linear relationship to the number of authors in the network. The topological differences between these two networks must be considered as a caveat to any interpretation.

Figure 2.7: In this visualization, authors have been grouped according to community codes in addition to proximity. This layout was computed using the Distributed Recursive Layout algorithm.

Path length implies a Poisson distribution because 1) path length is bounded on the low-end at 1; and 2) length is an integer dimension. However, this distribution doesn’t look Poisson-like due to the second, smaller mode. In Travers & Milgram (1967), the smaller mode occurs at length = 9 whereas in the coauthorship network, the second mode occurs at length = 18. It is possible we are observing compound Poisson distributions, in which two Poisson distributions with different means are observed simultaneously (Adelson, 1966). One possible explanation for these two modes is that one mode could correspond to strong ties while the other corresponds to weak ties. Could it be that strong ties imply an average path length of 8 whereas weak ties imply an average distance of 18? This will be revisited in the discussion.
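The idea of two Poisson distributions with different means being observed simultaneously can be illustrated numerically. The following Python sketch treats the observed distribution as a weighted mixture of two Poisson distributions; the means of 8 and 18 come from the modes discussed above, while the equal weighting, the function names, and the mixture form itself are assumptions made for illustration, not a model fitted to the coauthorship data.

```python
import math


def poisson_pmf(k, mean):
    """Probability of observing k under a Poisson distribution with the given mean."""
    # Computed in log space so that large k does not overflow the factorial
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))


def mixture_pmf(k, weight, mean_a, mean_b):
    """Weighted combination of two Poisson distributions."""
    return weight * poisson_pmf(k, mean_a) + (1 - weight) * poisson_pmf(k, mean_b)


def local_maxima(values):
    """Indices whose value is strictly greater than both neighbours."""
    return [k for k in range(1, len(values) - 1)
            if values[k - 1] < values[k] > values[k + 1]]


# Assumed equal weighting of a "short path" and a "long path" population
density = [mixture_pmf(k, 0.5, 8, 18) for k in range(41)]
modes = local_maxima(density)
```

Under these assumed weights the mixture has two local maxima, one at the lower mean and one just below the higher mean, which is the qualitative shape described above. Whether the two populations correspond to strong and weak ties cannot be decided from the shape alone.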

2.4.4 Community Detection

To explore the community dynamics of the coauthorship network, the igraph implementation of Louvain community detection was used (Csardi & Nepusz, 2006). In the words of Blondel et al. (2008), this algorithm “finds high modularity partitions” that “unfolds a complete hierarchical community structure.” The modularity of the main component graph is 0.927, indicating that communities are very siloed; they mostly collaborate within their discipline.
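To make the modularity score concrete, the following Python sketch computes modularity for a given assignment of authors to communities. It does not implement the Louvain search itself, which was run through igraph in this work; it only scores a partition using the standard definition (the share of edges inside communities minus the share expected from node degrees alone). The function name and the toy network are assumptions for illustration.

```python
def modularity(edges, community):
    """Modularity of a partition; community maps each node to a community code."""
    m = len(edges)
    if m == 0:
        return 0.0
    internal = {}    # number of edges with both endpoints in the community
    degree_sum = {}  # total degree of the nodes in the community
    for a, b in edges:
        ca, cb = community[a], community[b]
        degree_sum[ca] = degree_sum.get(ca, 0) + 1
        degree_sum[cb] = degree_sum.get(cb, 0) + 1
        if ca == cb:
            internal[ca] = internal.get(ca, 0) + 1
    return sum(internal.get(code, 0) / m - (total / (2 * m)) ** 2
               for code, total in degree_sum.items())


# Two tightly knit groups of three authors joined by a single bridging coauthorship
edges = [("A", "B"), ("B", "C"), ("A", "C"),
         ("D", "E"), ("E", "F"), ("D", "F"),
         ("C", "D")]
siloed = {"A": 1, "B": 1, "C": 1, "D": 2, "E": 2, "F": 2}
merged = {node: 1 for node in siloed}

q_siloed = modularity(edges, siloed)
q_merged = modularity(edges, merged)
```

Placing every author in one community gives a modularity of 0, while separating the two groups gives a positive score; values near 1, such as the 0.927 reported above, arise only when there are many communities with very few edges between them.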

When Louvain community detection is performed on the main component, 36 communities are identified. Community membership codes were then used to lay out the network while optimizing the hierarchical relationships between the nodes, which is a more refined approach than using Fruchterman-Reingold. Figure 2.7 depicts the main component graph with the Distributed Recursive Layout, which incorporates community membership in the layout process (Martin et al., 2008).

Figure 2.8: The distribution of community sizes was plotted in order to verify that no single community was either too large or too small. The x-axis represents the arbitrary community codes that were assigned to groups of coauthorships by the Louvain method. The y-axis depicts the raw membership count.

2.4.5 Component Community Size Distribution

Figure 2.8 depicts the number of authors assigned to each community.16 Along the x-axis are the arbitrarily numbered 36 communities. Each community has a distinct integer code assigned to it and ordering is not meaningful. This visualization merely confirms that there is no systematic pattern to the size of the communities, which is the desired result; anything else would have been surprising.17

2.4.6 Groups of Communities

The Louvain clustering algorithm detected communities of authors who collaborated with one another. Based on my familiarity with the kinds of research conducted by those authors, I manually generated labels for each community. I have further grouped these communities into several broad categories, listed in Table 2.1, that could be thought of as disciplines, although these disciplines are also the result of my personal judgment.

2.4.7 Community Labels

Discipline           Number of sub-disciplines
Complexity           3
Psychology           7
Ecology              2
Social Information   8
Computing            5
Physics              6
Uncategorized        5

Table 2.1: These disciplines were detected in the coauthorship network. These groups of communities represent disciplines relevant to the literature of psychological meme science. Within each discipline are several labelled sub-communities, which are described later.

16The Louvain community detection algorithm is stochastic, producing slightly different results each time it is run. During exploration, this operation was performed repeatedly and the results were similar each time. To facilitate replicability, stochasticity was accounted for by using a constant seed for the pseudo-random generator in order to obtain the same results each subsequent time the analysis is executed.

17Any bias I may have imposed during curation could explain the size differences among communities. Some literatures have greater representation in the citation database as a consequence of my interests.

Complexity Several communities, listed in Table 2.2, involve complexity. I assigned the label of Computational Social Science to one community based on a current trend I’ve detected in the Zeitgeist. I also collected enough Agent-Based Modelling citations to form a distinct community of authors. Lastly, the coauthorship network contains authors who write about complex systems, in a general capacity.

Community Code   Label
1                Computational Social Science
2                Agent-Based Modelling
3                Complex Systems

Table 2.2: These are coauthorship communities within the complexity discipline. These communities are generally concerned with issues of complexity.

Psychology The next major discipline is Psychology, which is shown in Table 2.3. The psychology communities I located are niche, well-defined, and permit a good degree of differentiation among them.18 I believe the primary reason I was able to obtain this level of resolution is due to the quantity of psychology citations in my library.

Community Code   Label
4                Social Psychology, Mischel Cluster
5                Social Psychology
6                Social/Behavioural Economics
7                Social Neuroscience
8                Social/Biological Psychology
9                Social Cognition
10               Cognitive Science

Table 2.3: These are coauthorship communities within the psychology discipline. The Mischel cluster refers to Walter Mischel, whose cluster of coauthors was well-enough differentiated from the rest of Social Psychology as to be labelled on its own.

Ecology Next is the Ecology literature, specifically as it pertains to computational ecological modelling. These communities are listed in Table 2.4.19

Community Code   Label
11               Ecological Modelling
12               Ecology of Communities

Table 2.4: These are coauthorship communities within the ecology discipline. Most of the authors in this discipline identify themselves with the theme of ecology and publish in ecology journals.

18Incidentally, my personal community exists within the context of this larger psychological community. However, I have listed my own community under Other to distinguish it.

19This result surprised me because I wasn’t aware that I was specifically adding citations from the ecology literature. However, this specific finding became more interesting the more I thought about it.

Social Information The next group of communities, listed in Table 2.5, are a blend between Information Science and Sociology. These authors focus on the human side of information, especially as it pertains to the online context. This literature stretches back to the 1940s but becomes extremely prolific following the year 2000.

Community Code   Label
13               Sociology
14               Social Media/Networks
15               Social Networks
16               Internet
17               Data and Information
18               Online Community
19               Early Social Computing
20               Humans, Computers, and Society

Table 2.5: These are coauthorship communities within the social/information discipline. These authors focus on the human side of information, especially as it pertains to the online context.

Computing Computing communities, listed in Table 2.6, are grouped according to their specific focus on the machinery of computation. In some cases, these communities focus more on the machine or the algorithm than upon the application. Nevertheless, many important methods that have been developed in this space may be applied to psychological questions.

Community Code   Label
21               Artificial Intelligence
22               Network Science
23               Algorithms and Systems
24               Big Data, Search, Mturk
25               Human/Computer Interaction

Table 2.6: These are coauthorship communities within the computing discipline. These communities tend to focus more on the machine or the algorithm than upon the application. Mturk refers to Mechanical Turk, which is an online platform commonly used for human subjects research.

Physics There is one final group of communities, listed in Table 2.7, that emerged from the analysis: physics. A notable similarity among these physicist communities is that they are primarily historical, roughly spanning the mid-1800s through 1960. These scientists collaborated on the basis of the physical principles of information transmission and electro-magnetism. This work provided the foundation for digital computation and communication.

Uncategorized Table 2.8 lists the remaining communities. Here, I list my own community, consisting primarily of the people I have published with. In the case of language, digital art, statistics, and design, it would appear that I have enough citations to detect distinct communities but not so many as to permit hierarchically-nested sub-communities within them.

Community Code   Label
26               Physics/Networks
27               Magnetism
28               Nuclear Physics, Early Computation
29               Physical Experimentalists
30               Information, Radar, early AI
31               Cyberneticists

Table 2.7: These are coauthorship communities within the physics discipline. These scientists collaborated on the basis of the physical principles of information transmission and electro-magnetism.

Community Code   Label
32               Ian Dennis Miller co-authors
33               Language
34               Digital Art
35               Statistics
36               Design

Table 2.8: These coauthorship communities are distinct from one another, tending to be disciplines unto themselves. There were not enough coauthors from any of these disciplines to be merged into any superordinate category. The exception to this rule is the Ian Dennis Miller co-authors community, which is most closely related to Social Psychology.

2.4.8 Author Influence

The influential authors are the ones who drive the clustering effects of community formation. A family of network centrality metrics can be used to score authors according to their influence within their network context (Newman & Girvan, 2004). After exploring this family of functions, three centrality measures were selected for comparison: Betweenness, Closeness, and Expected Influence. Betweenness centrality relates to the frequency with which shortest paths between other nodes must include a given node; higher betweenness indicates that a node is central by virtue of being between other nodes. Closeness centrality relates to the distance from a given node to every other node; it is the reciprocal of the sum of shortest path lengths, so a node that reaches the other authors through fewer intermediaries is considered to be closer. Finally, Expected Influence is a weighted measure of centrality. The Newman & Girvan (2004) centrality algorithms have been implemented in the R qgraph package (Epskamp et al., 2019). Table 2.9 presents authors scored on these centrality measures, ranked in descending order by expected influence.
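The centrality measures can be sketched in Python for a small unweighted network. The scores reported in this chapter were produced with the R qgraph package, whose exact conventions are not reproduced here; this sketch uses Brandes' well-known algorithm for betweenness, the reciprocal of summed shortest-path distances for closeness, and plain degree as a stand-in for Expected Influence, on the assumption that a weighted sum of edges reduces to a count of edges when all edges are unweighted. The function name and toy network are hypothetical.

```python
from collections import deque


def centrality(graph):
    """graph maps each node to its set of neighbours (undirected, connected)."""
    betweenness = {node: 0.0 for node in graph}
    closeness = {}
    for source in graph:
        # Breadth-first search that also counts shortest paths to each node
        order, dist = [], {source: 0}
        paths = {node: 0 for node in graph}
        paths[source] = 1
        parents = {node: [] for node in graph}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for neighbour in graph[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
                if dist[neighbour] == dist[node] + 1:
                    paths[neighbour] += paths[node]
                    parents[neighbour].append(node)

        total = sum(dist.values())
        closeness[source] = 1.0 / total if total else 0.0

        # Walk back from the farthest nodes, crediting the nodes in between
        credit = {node: 0.0 for node in graph}
        for node in reversed(order):
            for parent in parents[node]:
                credit[parent] += paths[parent] / paths[node] * (1 + credit[node])
            if node != source:
                betweenness[node] += credit[node]

    # Each unordered pair was visited from both ends, so halve the totals
    betweenness = {node: score / 2 for node, score in betweenness.items()}
    degree = {node: len(neighbours) for node, neighbours in graph.items()}
    return betweenness, closeness, degree


# Toy chain of coauthorships: A - B - C - D
chain = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
betweenness, closeness, degree = centrality(chain)
```

In the chain, the two middle authors sit on every path that connects the ends, so they score highest on both betweenness and closeness; whether pairs are counted once or twice is a convention that differs between implementations and changes the scale of betweenness but not the ranking.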

Name                   Betweenness   Closeness   Expected Influence
Huberman, Bernardo     546758.89     0.0001197   104
Marlow, Cameron        176826.70     0.0001118   64
Christakis, Nicholas   381228.54     0.0001160   60
Grimm, Volker          18465.25      0.0000673   57
Adamic, Lada           440012.52     0.0001197   55
Railsback, Steven      15313.25      0.0000673   53

Table 2.9: Authors are presented alongside their centrality measures, ranked in descending order by Expected Influence.
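The three centrality measures reported in Table 2.9 can be illustrated on a very small graph. The following is a minimal Python sketch, not the qgraph implementation used for the analysis: the toy edge list, the function names, the use of hop counts for path lengths, and the reading of Expected Influence as the sum of a node's edge weights are all assumptions made for illustration.

```python
from collections import deque

# Hypothetical weighted coauthorship edges: (author, author, number of joint papers).
EDGES = [("A", "B", 2), ("B", "C", 1), ("C", "D", 3), ("B", "D", 1), ("D", "E", 1)]

def build_graph(edges):
    """Return an adjacency dict {node: {neighbour: weight}} for an undirected graph."""
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})[u] = w
    return graph

def bfs(graph, source):
    """Breadth-first search: hop distances, shortest-path counts, predecessors, visit order."""
    dist = {source: 0}
    sigma = {node: 0 for node in graph}
    sigma[source] = 1
    preds = {node: [] for node in graph}
    order = []
    queue = deque([source])
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
                preds[v].append(u)
    return dist, sigma, preds, order

def betweenness(graph):
    """Brandes-style betweenness on hop counts; each unordered pair is counted once."""
    score = {node: 0.0 for node in graph}
    for s in graph:
        _, sigma, preds, order = bfs(graph, s)
        delta = {node: 0.0 for node in graph}
        for w in reversed(order):
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    return {node: value / 2.0 for node, value in score.items()}

def closeness(graph):
    """Reciprocal of the summed hop distance to every other reachable node."""
    result = {}
    for s in graph:
        dist, _, _, _ = bfs(graph, s)
        total = sum(dist.values())
        result[s] = 1.0 / total if total else 0.0
    return result

def expected_influence(graph):
    """Sum of the weights on a node's edges (assumed one-step Expected Influence)."""
    return {node: sum(nbrs.values()) for node, nbrs in graph.items()}

graph = build_graph(EDGES)
for name in sorted(graph):
    print(name, betweenness(graph)[name], closeness(graph)[name], expected_influence(graph)[name])
```

Because closeness is a reciprocal of summed distances, its values shrink as a network grows, which is consistent with the very small closeness scores reported for a main component of this size.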

According to several centrality metrics, Bernardo Huberman, Lada Adamic, and Nicholas Christakis are consistently ranked highly. These authors have published in many venues, they are prolific collaborators, and they have been influential researchers in the so-called Computational Social Sciences. Volker Grimm and Steven Railsback, who have an unusually high Expected Influence ranking, also caught my eye. These two are ecologists who have published two textbooks on the topic of Agent-Based Modelling. They are a little different from the other authors who exhibit high centrality rankings - and there will be more to say about them in the discussion.

2.4.9 Influential Institutions

Since there are relatively few authors with high centrality, biographical information was collected about each of them in order to summarize their institutional affiliations. Previous work investigated the positive relationship between institutions and collaboration (Rodriguez & Pepe, 2008). Since there is some institutional overlap among these authors, they have been collected into three arbitrary communities based on my qualitative assessments.

Hewlett-Packard is notable in this coauthorship network for having the most high-centrality academics affiliated with it. The Silicon Valley community, listed in Table 2.10, seems to consist of the overlap between Hewlett-Packard, Xerox, and Stanford, which are all in the Silicon Valley neighborhood.

Name                Institutions
Bernardo Huberman   Xerox, Stanford, Hewlett-Packard
Sitaram Asur        Ohio State, Salesforce, Hewlett-Packard
Tad Hogg            Caltech, Stanford, Xerox, Hewlett-Packard

Table 2.10: The Silicon Valley community seems to consist of the overlap between Hewlett-Packard, Xerox, and Stanford.

Another group of high-centrality authors, listed in Table 2.11, are affiliated with Michigan, Facebook, Yahoo, and MIT.

Name             Institutions
Lada Adamic      Michigan, Facebook, Hewlett-Packard
Cameron Marlow   MIT, Yahoo, Facebook
Eytan Adar       MIT, UW, Michigan
Eytan Bakshy     Michigan, Yahoo, Facebook

Table 2.11: An institutional community consists of the overlap between the University of Michigan, MIT, and several online businesses.

Finally, the academics in Table 2.12 never entered the industry. This result is also interesting because there are several highly influential institutions that do not appear on this list at all. However, since these tables were created from a very short list of names, there is only so much room. This list is by no means exhaustive but it does cast a spotlight on a few key institutions.

Name                  Institutions
Nicholas Christakis   Yale, Harvard, UPenn
Volker Grimm          Helmholtz Umweltforschung
Steven Railsback      Humboldt State
John McCarthy         Stanford, MIT, Princeton; d. 2011
John Cacioppo         University of Chicago; d. 2018
Herbert Simon         Carnegie Mellon; d. 2001

Table 2.12: An institutional community consists of academics who never entered the industry. These classic academics would have interacted through traditional scholarly channels.

2.4.10 Online Interactive Viewer

The static network visualizations produced with R provide information about the structure of the coauthorship network. However, in order to really become familiar with the network, I have found it is helpful to interact with it. The ability to search, zoom, pan, and drag can be used to filter information in order to answer specific questions. For this purpose, the Gephi Sigma Exporter Plugin provides an off-the-shelf solution for producing an interactive online network explorer (Hale, 2012). This network can be accessed online at http://imiller.utsc.utoronto.ca/media/network [20].

2.4.11 Summary of Results

Collaboration paths were located among 1,577 authors - called the main component - corresponding to 42% of the entire citation library. Among those authors in the main component, there were 5,460 coauthorships. The average path length of the network was 10.47. The clustering coefficient of the main component is 0.675, which means an author’s coauthors have a tendency to also collaborate with one another as opposed to working independently. The path length distribution has two distinct modes, much like the Travers & Milgram (1967) results; one mode appears at path length 8 and another smaller mode at path length 18. The Louvain clustering procedure identifies 36 communities, producing a modularity of 0.927, which indicates a high level of siloedness. Those 36 clusters were qualitatively labelled and thematically grouped in order to create an overview of the literature. The resulting communities span complexity, psychology, ecology, social information, computing, physics, language, digital art, statistics, and design. Authors were explored through several centrality measures. In particular, Huberman, Adamic, Christakis, and Grimm emerged as notable authors. Among those authors, there is a high degree of industry crossover, particularly involving Hewlett-Packard.
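To give a sense of what a modularity score measures, the following minimal Python sketch computes Newman-Girvan modularity for a fixed partition of a toy graph. It is not the Louvain procedure itself, which searches for a partition that maximizes this quantity; the edge list, the community labels, and the function name are hypothetical.

```python
def modularity(edges, community):
    """Newman-Girvan modularity Q for an undirected, unweighted graph.

    edges: list of (u, v) pairs; community: dict mapping node -> community label.
    Q sums, over communities, the share of edges that fall inside the community
    minus the share expected if edges were placed at random given node degrees.
    """
    m = len(edges)
    internal = {}  # number of edges with both ends inside each community
    degree = {}    # summed degree of the nodes in each community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (degree[c] / (2 * m)) ** 2
        for c in degree
    )

# Hypothetical example: two triangles joined by a single bridging coauthorship.
EDGES = [("a", "b"), ("b", "c"), ("a", "c"),
         ("x", "y"), ("y", "z"), ("x", "z"),
         ("c", "x")]
SILOED = {"a": 0, "b": 0, "c": 0, "x": 1, "y": 1, "z": 1}
MERGED = {node: 0 for node in SILOED}

print(modularity(EDGES, SILOED))  # partition that respects the two triangles
print(modularity(EDGES, MERGED))  # everything lumped into one community
```

In this sketch, the partition that separates the two triangles scores higher than the single merged community, which is the sense in which a high modularity indicates siloedness.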

2.5 Discussion

The question this work began with asked whether academia exists as a small world. We can compare to the results from Newman (2004b), which reported average path lengths in the range of 4 to 9.7. The coauthorship network described in the current work has an average path length of 10.47, which is higher than anything Newman observed - but not so much so as to be implausible. When comparing the clustering coefficient to other disciplines, we find the coauthorship network’s score of 0.675 is within the range reported by Newman. Because these parameters are comparable, it seems plausible that these coauthors constitute a small world. If the current network were not a small world, it’s possible the iterative citation discovery approach would have been doomed from the start because critical disciplines would have been impossible to connect no matter how many citations were collected. In practice, any time I was unable to find obvious strong tie links between authors, I was usually able to dig deeper to find longer paths of weak ties that connected over longer distances. The Scholarly Synthesis method presented in this work seems to have been successful.

[20] I cannot speculate as to the permanence of this URL because I will not be affiliated with UTSC forever. However, at the time of this dissertation’s publication, it works.

Since this does appear to be a small world network, it is enticing to draw comparisons to other small world networks. With the caveat that my coauthorship network is curated rather than a random sample, I wonder whether I could directly compare this network to those discussed in Newman (2004b). One analysis I recently encountered is called the Exponential Random Graph Model (ERGM), which permits statistical comparisons between graphs (Hunter et al., 2008). ERGM is a future direction I wish to eventually pursue in the context of this work.
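The two statistics used in the small-world comparison, average path length and clustering coefficient, can be sketched in a few lines of Python. This is an illustration on a hypothetical toy graph rather than the code used for the analysis, and it assumes the average local clustering coefficient; the figure of 0.675 reported here, and the values in Newman (2004b), may be based on a different variant such as global transitivity.

```python
from collections import deque
from itertools import combinations

# Hypothetical toy graph: two triangles joined by one bridging coauthorship.
EDGES = [("a", "b"), ("b", "c"), ("a", "c"),
         ("x", "y"), ("y", "z"), ("x", "z"),
         ("c", "x")]

def build_adjacency(edges):
    """Return {node: set of neighbours} for an undirected graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def average_path_length(adj):
    """Mean hop count over all ordered pairs of distinct, mutually reachable nodes.

    Assumes the graph has at least one edge.
    """
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

def average_clustering(adj):
    """Mean over nodes of the fraction of neighbour pairs that are themselves linked."""
    scores = []
    for node, nbrs in adj.items():
        if len(nbrs) < 2:
            scores.append(0.0)
            continue
        linked = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        possible = len(nbrs) * (len(nbrs) - 1) / 2
        scores.append(linked / possible)
    return sum(scores) / len(scores)

adjacency = build_adjacency(EDGES)
print(average_path_length(adjacency), average_clustering(adjacency))
```

A small world, in this framing, combines a short average path length with a high clustering coefficient.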

2.5.1 Longer Path Length

Why is the average path length of this network longer than other literatures? One explanation is that the Scholarship Synthesis method is not optimal, such that shorter paths existed but I simply missed them. A more interesting possibility I would like to suggest relates to the siloedness of the literature. Imagine one researcher on one side of the network - and then imagine somebody 10 hops away from them, on the other side of the network. We can presume that more hops are somehow related to the distance between the kinds of research they conduct. To the extent that these researchers are unaware of one another, they can be said to exist in different silos. At what point does network distance imply or establish a separate discipline? When does this “separateness” become a so-called academic silo? (see, e.g. Small, 1973; White & Griffith, 1981) A span of 10 network hops implies several kinds of distances. Each network hop introduces the possibility of a shift in semantics, in which meaning is somehow altered and disagreements emerge over the interpretation of some finding. Network distance permits linguistic slippage; perhaps synonyms in one discipline are distinct terms in another, interfering with the transfer of knowledge across network distances. Network distance factors combine to make scientific collaboration frustratingly high-effort, resulting in fatigue manifesting as an interest gap between disciplines. Yet another explanation for longer average path length is that perhaps these disciplines are not well-integrated. Demonstrably, by the absence of an extant community, ideas have not spread readily between these silos. It’s likely there is a relationship between network distance and the likelihood that an idea from one silo is going to influence authors in a different silo. In the current graph, following a path of length 10 across the collaboration network leads to a different literature altogether. 
Each degree of authorship-separation that an idea must be mediated through implies some degree of epistemological corruption. Is it conceivable that in order to gather a literature of Psychological Meme Science, the network must span 10 or 11 levels of abstraction? Maybe. Perhaps the act of reconciling these abstractions is tantamount to the formation of a discipline. I would speculate that the average path length probably decreases over time, as a consequence of researchers becoming familiar with one another’s work through new conferences and journals. Given that the current network has an average path length that is less than one hop longer than any of the networks described in Newman (2004b), I wonder whether this literature is on the verge of snapping into a fully-coherent discipline. If the average path length of this literature is measured again in a few years, will it be lower than 10.47?

2.5.2 Utility of Scholarly Silos

What if there is some advantage to the creation of academic silos? To the extent that long distances imply effortful translation of semantic and linguistic terms, the opposite ought to also be true: shorter distances imply lower-effort communication. Many complex systems, from slime molds to our human cognitive anatomy, exhibit topological optimization effects (Hilgetag et al., 2000; Sporns & Kötter, 2004; C. R. Reid et al., 2012). Topological optimization is analogous to force-directed layout in the sense that related nodes attract one another to reduce distance. Low-energy collaborations will manifest in our network as clusters - and, to the extent that a cluster can grow to include many authors, a discipline may form. In this sense, a discipline is a channel of communication that’s easy to use because it implies: 1) shared vocabulary; 2) shared methods; 3) familiarity with certain authors; and 4) familiarity with certain journals.

An interesting analogue to academic collaboration is a desire path, which occurs when people walk across a field of grass often enough to form a path by trampling it (Lidwell et al., 2010). Desire paths usually indicate that established pathways do not provide efficient routes for walking to a certain destination. If a desire path is much more efficient than a sidewalk, then it may be popular enough to become self-sustaining. The existence of the path draws future users to the path, thereby increasing traffic and making the path more permanent. The desire path network dynamic is also called Path Dependence (Ferrers, 1872; Pierson, 2000), which is a phenomenon observable in many domains including ant pheromone trails (Dorigo et al., 1996) and neural network link-weight reinforcement (Rumelhart & McClelland, 1988). The paths created by coauthorships manifest low-energy channels for ideas to be transmitted, which ultimately constitute shared beliefs.
The network of these channels describes the contours of the collective beliefs of disciplines. The existence of scientific belief is the lower-level construct; the collaborations are its indicators. Coauthorship is therefore a way of indicating the structure of the network of beliefs.

2.5.3 Institutions

Those authors with high centrality were located at relatively few institutions. I expected to see many academic institutions and, on that basis, I was surprised to find so many commercial institutions. In particular, I was really surprised by Hewlett-Packard; they apparently had profound insights into a networked future. Xerox also shows up in this network to a degree that I did not expect. However, most of the institutions were academic and there weren’t too many surprises: Stanford, MIT, and Michigan top the list. Based on my time as a researcher at UC Berkeley, I believe Stanford functions as a clearing house for Silicon Valley. MIT became relevant to my literature search as a World War II military powerhouse that eventually grew into the east coast hub for computation. I think Michigan shows up because they have a strong interdisciplinary approach.

It would appear that a single force driving these developments emanates from ARPA, the United States Advanced Research Projects Agency (Fong, 2001). ARPA, founded in 1958 by Dwight Eisenhower, is a common institutional thread that connects the research conducted during World War II (Page, 1962; Seidel, 1983) to the military-industrial-academic complex later identified by Eisenhower during his farewell address (Eisenhower, 1961; Giroux, 2015), which is plainly manifested in the current work. The ARPA Interface Message Processor (IMP) is the original backbone of the proto-Internet (Heart et al., 1970), connecting institutions both thematically and practically. By 1978, dedicated communication channels linked IMP devices at Stanford; MIT; CMU; Xerox; RAND; MITRE; national laboratories including Argonne, Lawrence Berkeley, Lawrence Livermore; ARPA; NSA; and ultimately the US Pentagon itself (Lederberg & Feigenbaum, 1977).

Let’s assume, for the sake of this discussion, that a discipline does function as a low-energy channel of communication.
Perhaps collaborative energy is fungible and greater lexical distances can be spanned whenever the collaboration occurs in-house. Communication costs could be reduced through the institutional co-location of researchers, thereby offsetting some of the costs related to inter-disciplinary knowledge transfer. Location-sensitive events like speakers and conferences facilitate collaboration by reducing the cost of communication - and this is easiest when everybody lives near one another. With devices like the ARPA IMP, ARPANET, and the Internet, some of these costs can be partly mitigated - but there’s nothing quite like the real thing.

For the current analysis, I manually tagged institutions for a few high-ranking authors - but a more rigorous, methodical approach would probably produce insights about the psychological meme science literature. I would expect to see several institutions that did not show up in the current analysis, including the Santa Fe Institute, Northeastern, Northwestern, George Mason, ETH Zurich, and numerous institutions in the United Kingdom including Oxford, Surrey, and St. Andrews. If ARPA-connected institutions are the first wave, then the institutions I just named can be thought of as the second wave.

2.5.4 Observations of academic publishing over time

I can now qualitatively characterize the literature by decade and era based on my experience conducting this work. There is an explosion of publication that occurs during the 2010s. In 2019, the current year, virtually everything is computer-readable, everything is well-indexed, and almost all publications are available as digital objects. Some journals and conferences have already moved to digital-only distribution, ceasing hardcopy publication altogether. A consequence of academic literature digitization is that online scholarship methods can be applied to almost everything published this decade.

During the decade of 2000-2009, not all articles are computer-readable but almost all are available online. Optical character recognition (OCR) has been widely applied to enable full-text indexing for many articles that were not published in digital format. Articles that have not been OCR-ed may suffer from reduced keyword search and other awkwardness.

During the 1990s, access to research becomes irregular. At the time, the academic publishing industry was transitioning to digital and most articles were initially distributed as hardcopy. Not as much of the published record from the 1990s is computer-readable - and, in fact, not all artifacts are even available online. Starting with the 1990s, it becomes increasingly necessary to physically visit the library in order to track some artifacts down. Nevertheless, almost everything published in a major journal is going to be online in one form or another [21].

In the pre-digital days, publication is idiosyncratic and irregular. The time period between 1950 and 1990 is fairly easy to search due to excellent online indexing by librarians, a process which was labor-intensive and took decades to perform. However, there is less standardization across publishers and the media itself.
Articles are difficult to locate for different reasons including availability, citation practices, and the likelihood that works are published as books instead of articles. Any time research from this period was published as a book or chapter, this reliably prompted a visit to the library stacks.

[21] At least, a major institution like the University of Toronto has access to virtually any journal I ever needed.

Prior to 1950, all bets are off. Once again, libraries have maintained an excellent online index of publication for this period. Although some artifacts published between 1920 and 1950 are available online, this availability is not reliable. Relatively fewer journals and conferences even existed prior to 1920 so many familiar scholarship methods are irrelevant for these decades. When my search brought me to this time period, I frequently had to use non-bibliographic methods to continue locating coauthors. Prior to the 1920s, with the rare exception of scientific societies, nearly all scholarship is conducted through books and lectures. When we’re lucky, lectures were read from scripts that were archived. In other cases, a transcript may have been produced as a byproduct of language translation - and we might have access to that. The earliest work in my library comes from Leibniz (1693). Therefore, the sum total of all scholarly information I am able to leverage for the current work has occurred within a 326-year period.

2.5.5 When to Stop

At what point is the coauthorship network big enough? I ended up stopping because I was satisfied with the picture I could produce with more than 2400 citations. Eventually, I found that I was having trouble identifying new connections that could bridge the 900 unconnected authors into the main collaboration network. After enough effort, it became frustrating. As I said, I was hitting some search engines hard enough for it to appear like it was abuse. So, I had to shift my focus from citation collection to the analysis process. Newman (2004b) provides some criteria that might be useful for deciding when to stop. Because the path length and clustering coefficient are comparable to Newman’s results, perhaps that can be taken as an indication that it is okay to stop.

2.6 Conclusion

2.6.1 Future Directions

At this juncture, there are two ways forward with this work. The first path leads deeper into bibliometrics and the other path leads deeper into computational social science. I will describe the bibliometric path first. There are so many ways to get more information about individual academics and the collaborations they engage in. I could look for websites, CVs, professional profiles, university biographies, and so on. One bibliometric artifact I’ve been thinking about is the disciplinary handbook. It’s an academic practice, in fields of sufficient maturity, to produce a handbook with contributions from the eminent scholars in that field. It would be interesting to take a structured look at handbooks. Of course, this method is not appropriate for emerging disciplines that are not mature enough to produce a handbook. There’s probably a linguistic vocabulary relationship that would be interesting to explore for the purpose of identifying communities based on shared nomenclature. Finally, there are other types of co-occurrence: conferences, journals, co-publication, co-appearance, and co-presentation. Each of these represents a social dimension that would be an indicator of the underlying structure of scientific knowledge and beliefs. Although all of this is really interesting, I have chosen the other path - to go deeper into computational social science.

2.6.2 Applying this New Knowledge

The results have provided insights into the structure of the academic community my work seems to fit within. As I scrutinized the names that ranked at the top of the various centrality measures, I was surprised to see Volker Grimm, the ecologist. He doesn’t show up in a lot of the communities I pay attention to - so how did he show up at the top of one of the centrality measures I explored? I dug deeper, using my iterative scholarship method, and this brought me to the topic of ecological modelling. To my delight, I found answers to several fundamental questions I had been struggling with, including: 1) a discussion of strong inference; 2) a method for describing models that I could apply to meme research; and 3) a general approach to modelling. Each of these insights was something I required in order to advance my own science of psychological meme systems. It is a direct result of this coauthorship analysis that I came to apply ecological modelling principles to the study of political memes and urban legends, which are discussed in the remainder of this dissertation.

Chapter 3

The Analytical Scale of Online Political Campaigns

3.1 Introduction

While chapter 2 explored the academic networks that underlie our knowledge of memes and their spread through social networks, the value of this knowledge is in understanding how memes propagate ideas between individuals embedded within social networks in the real world. A pertinent and timely context for this work is the democratic election. The political slogans that a campaign consists of are de facto viral memes. The memes we become aware of are the successful ones. What best predicts the influence of a political meme: the meme itself, the individuals who choose to share it, or the communities in which the individuals are embedded?

Methodologically, this work begins with a corpus of more than a million tweets observed on Twitter during the year preceding a mayoral election, all localized in the North American city of Toronto, Ontario, Canada. From this corpus, all of the possible 3-word phrases were extracted and scored according to the information they contain. Statistically, this information scoring method is related to the joint probability of the words appearing in phrases. Social connections between the users observed tweeting during the election campaign were also collected. Those connections constitute a social network within which communities could be identified. Finally, a comparison of hierarchical models was performed to examine the influence of memes, individuals, and collectives upon the diffusion of election-related tweets.

In order to make use of memes, this work introduces a new method for characterizing the way memes relate to individuals and collectives. Based on an analysis of a large corpus of Twitter data, a statistical property is observed among the lexical tokens that constitute text memes - and that property is covariance [1]. This statistical signature will enable the detection of memes in a large corpus of political slogan data, which can then be related to the communities using that speech.
The statistical relationship between symbols, individuals, and collectives will enable us to analyze the entire campaign to examine the scale of causation.
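The phrase extraction and scoring step described above can be sketched as follows. This introduction does not specify the scoring function, so the sketch assumes one plausible choice: a log ratio that compares the observed joint probability of a 3-word phrase with the probability expected if its words occurred independently. The tokenizer, the function names, and the example tweets are hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase a tweet and split it into simple word tokens."""
    return re.findall(r"[a-z0-9#@']+", text.lower())

def trigrams(tokens):
    """Return every run of three consecutive tokens."""
    return [tuple(tokens[i:i + 3]) for i in range(len(tokens) - 2)]

def score_phrases(tweets):
    """Score each 3-word phrase by log2(joint probability / independent expectation)."""
    word_counts, phrase_counts = Counter(), Counter()
    for tweet in tweets:
        tokens = tokenize(tweet)
        word_counts.update(tokens)
        phrase_counts.update(trigrams(tokens))
    n_words = sum(word_counts.values())
    n_phrases = sum(phrase_counts.values())
    scores = {}
    for phrase, count in phrase_counts.items():
        joint = count / n_phrases
        independent = math.prod(word_counts[w] / n_words for w in phrase)
        scores[phrase] = math.log2(joint / independent)
    return scores

# Hypothetical tweets, for illustration only.
TWEETS = [
    "vote for change today",
    "vote for change now",
    "change is for the vote",
]

for phrase, score in sorted(score_phrases(TWEETS).items(), key=lambda item: -item[1]):
    print(" ".join(phrase), round(score, 3))
```

A positive score means the three words appear together more often than their individual frequencies would predict. One known limitation of this kind of ratio is that it tends to favour phrases built from rare words, so a real analysis would likely combine it with a minimum frequency threshold.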

[1] Once we get into it, covariance will be operationalized as joint probability. Fundamentally, these two terms - covariance and joint probability - will be treated as indicators of the same underlying statistical construct. There is no deep difference intended by the use of one term or the other; it’s simply a matter of the relevant mathematical context within which we are operating.


Figure 3.1: This extremely basic diagram of a democratic system depicts issues, which are represented by leaders, who are then voted upon. The votes are tallied and the majority is determined from that count.

3.2 Background

This work examines memes in the context of a democratic election on the basis that election campaigns use memes - more commonly called slogans - to engage the voting population.

3.2.1 Democracy and Elections

Perhaps the best description of the phenomenon of elections in North America comes to us from Alexis de Tocqueville - who, incidentally, visited Upper Canada in 1831 as part of his mission to observe democracy in the Americas. de Tocqueville literally wrote the book on Democracy in America (de Tocqueville, 1840) in which he famously observed:

...in America, it acts by elections and decree. Public opinion is the predominant authority.

I have roughly depicted a democratic election in Figure 3.1, which consists of issues, leaders, and voters. There are several issues, in the abstract sense, which leaders advocate in some capacity. The voters, in consideration of the issues, decide which leaders will best represent their needs. Then, during the election, votes are tallied to determine which leader shall represent the majority. Election phenomena act at multiple scales, including both the individual scale and the collective scale. de Tocqueville points out that:

...the choice of president, which is of small importance to each individual citizen, concerns the citizens collectively.

Another language for saying this invokes the micro and the macro; in the field of economics, microeconomics refers to individual phenomena while macroeconomics refers to global or system-wide phenomena. The individuals vote; yet, the outcomes are collective. As it so happens, the current work starts in Upper Canada and it starts with an election. Of course, Upper Canada is now called Ontario and, for the remainder of this work, we will be discussing the Toronto 2014 Mayoral election.

3.2.2 Campaigns and Electioneering

To be precise, the election itself was an event that took place on a single day in October, 2014. In fact, we’re not talking about the election so much as we are talking about the electioneering - the campaigning. In this regard, de Tocqueville points out that campaigning has always been a dirty affair:

...electioneering intrigues, the meanness of candidates, these opportunities for animosity which occur the oftener the more frequent elections become.

Despite the nastiness inherent in it, campaigning is a type of parasocial, mediated interaction between the electors and the elected (A. M. Rubin et al., 1985; A. M. Rubin & Step, 2000). The campaign occurs during the lead up to the election - so it covers an extended period of time. The goal of the campaign is to gain the support of the people and win the election through a majority. Campaign statements, ostensibly, compete in the nearly-euphemistic marketplace of ideas. In order to make progress, the current work seeks to provide an operationalization of this so-called marketplace.

3.2.3 Speaking for a Collective

An election is a social computation. The outcome of the vote is a population-level measurement of belief. However, this collective belief is formed from information that is held individually, as each individual casts their own vote (Lyon & Pacuit, 2013). The election process ultimately enables us to determine our own collective interests. Predicting the collective-level outcome from individual beliefs is a deceptively hard problem. After all, election polling is an academic discipline unto itself (e.g., Crespi, 1988; Hillygus, 2011). There are also businesses that do research, year-round, to answer the question of precisely what it is that the electorate wants for itself and how to transform those insights into votes. This phenomenon, in which the collective knows what it wants but doesn’t know how to say it, is eloquently described in Jung (1968):

...the principles of the unconscious are indescribable because of their wealth of reference, although in themselves, recognizable.

Jung’s observation can explain why great political speech is rare - but when you hear it, you instantly recognize it. The property that makes speech great is the sense in which it connects with Jung’s “indescribable.” Great speech consists of the right symbols, masterfully selected to express the collective unconscious in a way that is recognizable. Political memes are likely successful for the same reason; they contain the right symbols that express the collective unconscious in a universally recognizable manner.

Figure 3.2: Individuals are directly influenced by innate behaviours and instincts - and are indirectly influenced by what Jung calls the “collective unconscious.” Although individuals are not directly aware of either instincts or the collective unconscious, those constructs may nevertheless be represented indirectly with symbols.

Figure 3.2 depicts the relationship between the collective unconscious and the symbols that are used to express it [2]. When Jung is pressed for details about the unconscious, he says it is fundamentally innate behaviours or even instincts. Those instincts, which are inherently shared among individuals, may be expressed as symbols which are in turn recognized by individuals. These instinctive drives are viewed with suspicion by de Tocqueville, who labels them as popular passions and warns us about the potential influence they can have in democracies. de Tocqueville observes that the political class are less swayed by these popular passions. Nevertheless, the most recognizable symbols used during a campaign are frequently those that reflect the popular passions - and it is frequently the politicians who use that speech.

[2] The diagram in Figure 3.2 is a visual depiction that might be recognizable to Jung as a mandala. An even more direct comparison can be drawn to Peterson (1999), which includes clip-art-esque diagrams that also attempt to present an architecture of understanding. There are many examples of mandalas in the appendix of Jung (1968) and a common thread that runs through them is the effort to provide metaphysical frameworks that explain the shared human experience. In Figure 3.2, I am using yet another visual language that borrows from statistical modelling and software architecture. In general, the boxes in my diagrams represent classes of objects and the arrows represent the flow of information.

3.2.4 Finding Symbols in Campaign Speech

Figure 3.3: Insightful individuals, through a creative process, may translate aspects of the Collective Unconscious into speech by making use of a shared vocabulary that is understandable by other individuals. The process of developing speech for an election campaign reflects this basic progression.

A method is proposed in Jung (1968) for examining the collective unconscious. The first step in his approach is to gather a few hundred products of dreams or the “active imagination” - but for our purposes, we will focus on the imagination. He notes that the symbols identified in this corpus must be isolated enough to be recognizable as typical phenomena; they must be concrete enough to be analyzed on the basis of their repetition. The purpose of aggregating these symbols will be to draw inferences about the collective, shared aspect of the unconscious - but we know there are limits to this approach. Jung remarks:

A symbol is the best possible expression for an unconscious content whose nature can only be guessed because it is still unknown.

We cannot look directly at the true beliefs of people so perhaps these symbols are the best we can do. For our corpus of the imagination, we turn to Twitter and recognize it for the global hallucination that it is. Through Twitter, we can observe campaign speech, which consists of the things that are said, the symbols that are invented, and the nature of what is shared during the course of the campaign. While we could say that this approach is inspired by Jung, we’re not literally going to follow Jung’s methodology. Instead, we’ll ultimately implement a method that better suits the Twitter phenomenon. I nevertheless produced the diagram in Figure 3.3 in a manner that I think is consistent with the Jungian framework. The collective unconscious is shared among each of these individuals due to its innateness. Through a creative process, individuals invent new symbols by recombining terms drawn from a shared lexicon. New symbols can then be embedded into online speech, perhaps using the Twitter social media network. When symbols are understood by many people, they may be selectively copied as memes - a process which we will discuss at length. Online speech is then shared among individuals who recognize the symbols contained within due to their experience of the collective unconscious. In this manner, Jung’s method for identifying symbols of the imagination may be applied to online speech.

3.2.5 Memes

In this research, we identify memes (Dawkins, 1976) within a corpus taken from Twitter (boyd & Ellison, 2007; boyd et al., 2010). Within this context, a meme is a media text - in this case, a couple of words - that can be replicated and mutated in the manner of Kooti, Yang, et al. (2012), who investigated the emergence of conventions on social media. A complete tweet could be a meme - or just a subset thereof. A new meme occurs when a novel expression accomplishes the articulation of a widely-understood aspect of the collective unconscious. I have adapted a statistical property known as mutual information (Church & Hanks, 1990) in order to formalize the notion of widely-understood; what Jung calls typical. Mutual information is like the inverse of randomness; words that are systematically paired convey information and words that are randomly paired do not. When words are combined to form a new phrase that becomes widely-used, then the words in that phrase come to have a high joint probability that exceeds the expected probability for the use of those words.

By examining memes, we hope to talk about the way the unconscious, itself, covaries across individuals through the symbols that express it. Altogether, this is a description of a meme system consisting of people and symbols, in which the covariance of lexical tokens covaries with the covariances inherent in the human experience.

3.2.6 Questions of Scale and Causation

In this chapter, election campaigns will be characterized at multiple scales: in terms of memes, individuals, and social network communities. Do the voters govern themselves, as de Tocqueville says? Is this an information contagion model, in which we become infected by ideas that we pass along to others? Are we merely hosts to information parasites? Is it the case that politicians and pollsters actually drive this system - or are they fundamentally subordinated to the shared needs they seek to discover? Are there collective issues that drive networks of individuals to coordinate and act together? What really happens during an election campaign; who or what is in charge? Ultimately, how big is this phenomenon?

3.2.7 Models

Models of Political Memes on Twitter

Election phenomena have been examined by many disciplines, each with a different approach representing a separate - but, in all likelihood, compatible - epistemology with differing research methods and different goals. Some of the disciplines that have tried to seriously characterize memes include psychology, sociology, ecology, , anthropology, computer science, and information science - but the most complete analysis will be inter-disciplinary.3

In this section, I will review several election campaign models that I have adapted from various disciplinary frameworks. These models represent radically different perspectives for analyzing the same underlying phenomenon. They are simplifications that are practically guaranteed to make every disciplinarian unhappy. However, we must forge ahead anyway; after all, every model is wrong but some are useful (Box, 1979).

3Inter-disciplinary methods are core to this work. Perhaps a discipline like Political Science could purport to capture the whole election phenomenon but, at the risk of getting mired in unprofitable semantics, let’s just say that this work will draw inspiration from many disciplines.

Figure 3.4: Introducing the Online Social Media Network Model. An individual may create multiple tweets, which may then be retweeted. Shaded boxes indicate tweets with memes in them. The lateral transmission of a meme from Individual 1 to Individual 2 is depicted as occurring through a process like social learning. Separate social groups that do not follow one another have no path through which social learning can occur.

Social Media Model

I will begin with the model that I believe characterizes the full phenomenon of online political campaign memes, seen in Figure 3.4. This model is situated within the Twitter social network, from which some specific vocabulary is derived. In this diagram, Individual 2 is connected to Individual 1 through a social relationship called following, such that Individual 2 follows Individual 1. Each tweet is a statement made by an individual on the network. A retweet refers to the case when an individual copies a tweet. The meme is distinct from the tweet insofar as a single meme could be expressed through any number of tweets.

This is a hierarchically-nested model - tweets shared by individuals within communities - in which we track who belongs to which community and which tweet belongs to whom. Individuals belong to social network collectives, also called online communities, that serve to hierarchically encapsulate the individuals who belong to them. Tweets are written by authors so those tweets are nested within authors. Some of those tweets contain memes, which will be operationalized shortly. Tweets may then be retweeted - that is, replicated - including memes within them.
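The nesting described above can be sketched as a small data structure. The following is a minimal illustration in Python, not the code used in this work; the class names `Community`, `Individual`, and `Tweet`, their fields, and the `retweet` helper are hypothetical labels chosen for this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Tweet:
    text: str
    memes: List[str] = field(default_factory=list)  # memes found in the text
    retweet_of: Optional["Tweet"] = None             # None for an original tweet


@dataclass
class Individual:
    name: str
    tweets: List[Tweet] = field(default_factory=list)  # tweets nested within author


@dataclass
class Community:
    label: str
    members: List[Individual] = field(default_factory=list)  # individuals nested within community


def retweet(individual: Individual, tweet: Tweet) -> Tweet:
    """Copy a tweet, including any memes within it, into another individual's tweets."""
    copy = Tweet(text=tweet.text, memes=list(tweet.memes), retweet_of=tweet)
    individual.tweets.append(copy)
    return copy


# Hypothetical example: Individual 2 retweets a tweet written by Individual 1.
original = Tweet(text="an example slogan for the campaign", memes=["an example slogan"])
individual_1 = Individual(name="Individual 1", tweets=[original])
individual_2 = Individual(name="Individual 2")
community = Community(label="example community", members=[individual_1, individual_2])
copied = retweet(individual_2, original)
```

In this sketch, a meme is stored as a plain string so that the same meme can appear in any number of tweets, which mirrors the distinction drawn above between the meme and the tweet.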

When all these entities are represented hierarchically, we end up with a model that captures the information relationships inherent in election phenomena. Nevertheless, there are other ways to think about it.

Figure 3.5: The memetic model attempted to draw analogies between culture and genetics. In this formulation, memes possessed characteristics - however vaguely defined - that imparted fitness to those memes. An individual could represent aspects of those characteristics using language, which could then be shared to other individuals. The language, and therefore the meme, might be mutated in the process but, presumably, the underlying characteristics would persist to some degree. The whole process is rather fuzzy and the analogy to genetics is not especially strong.

Memetic Model

The memetic perspective can be viewed as a combination of epidemiology, cultural , semiotics, and anthropology. The original form of memetics was proposed by Dawkins (1976) through his analogy with biological, genetic evolution. From the memetic perspective, memes are viewed like information viruses that act as contagious, mental parasites and humans are the hosts to these memes. Dawkins, by the 1990s, came to advocate for a softer form of memetics (Dawkins, 1993).4 Memetic maximalists like Blackmore (1999) hold the view that humans are so subordinated in their relationship to memes as to be shaped by memes in order to become better hosts. The maximalist position places memes in the metaphorical driver’s seat. In this light, memes could be seen as a sort of life form unto themselves.

In the model depicted in Figure 3.5, tweets may contain memetic texts. The memes, by analogy with genes, are seen as having different memetic characteristics that may impart fitness to the memes. Some characteristics could make one meme more competitive than others, just as DNA imparts some biological characteristics that influence fitness. These memes are then seen as fighting amongst themselves for attention. It is this Darwinian process that determines the fitness outcomes of those meme characteristics.

The memetic perspective has received valid criticisms. For example, a meme’s characteristics themselves are not finite in the same sense that genetic characteristics are quantifiably-finite (Atran, 2001); memetic characteristics are fuzzy, lacking strong boundaries. Memes have been found to interact with people, too, and they are subject to a variety of cognitive biases in selection, , and sharing process (e.g., Heath et al., 2001; Eriksson & Coultas, 2014). The nature of those interactions is analogous to phenotypic expression, rather than genetic expression, and it presents other problems for the memetic perspective.

Nevertheless, this diagram captures some of the key characteristics of the model advocated by memeticists. Through an epidemic contagion process, memes infect individuals who, in turn, infect other individuals through their social networks. Altogether, this is a memetic model of the propagation of campaign slogans through a population.

Social Cognition Model

The social cognition model, depicted in Figure 3.6, incorporates theories from social and personality psychology that have to do with people thinking about other people (Zajonc, 1980; Srull & Wyer, 1989). Consider social identity (Tajfel, 1974): the cognitive approach to identity is that individuals think about social objects, including people and characteristics, that are salient in regards to a group. This is seen as a mental phenomenon, typically with neuro-anatomical correlates. This social/personality perspective focuses strongly on the individual and the processes within that individual. For example, attitudes about groups could be the product of individual differences (e.g., Devine, 1989). When thinking about social objects, personality traits might be activated for various reasons, with corresponding consequences (Allport, 1937). Human-meme interaction can be viewed as part of a social situation or environment in which the individual finds themselves (e.g., Heider, 1958; Bowers, 1973). Relatedly, the meme can be seen as a stimulus target for the individual to cognitively process (e.g., McArthur & Baron, 1983). The meme comes to interface with the cognitions that are activated or with personality traits that are salient.

Within a social cognitive framework, the collective can be viewed as the individual’s notion of the collective (e.g., Hilton & von Hippel, 1996). To the extent that many people have similar notions about a group, that group can be said to exist. According to this social/personality definition, no group needs to actually exist at all; only the thought of the group. In contrast, the social model in Figure 3.4 defines groups in terms of the explicit network connections that exist between individuals.

4A meme is less a law of culture and more just a popular idea - a meme unto itself. The hard comparison between memes and genes does not hold up to close scrutiny. Some of the shortcomings inherent in the meme-gene analogy will be discussed shortly.

Figure 3.6: The individual lies at the centre of the social cognition model. In this formulation, the social group exists only insofar as the individual may think of the group. It is the cognitions and traits, possessed by each individual, that define the group, specifying the range of speech and behaviours that individuals engage in.

Social Forces Model

The next model for campaign phenomena, depicted in Figure 3.7, derives from the classical sociological perspective that could have been advanced by Durkheim (2014). This perspective would criticize the cognitive model for reducing social phenomena to an individual’s mere thoughts about social objects.5 The approach I am here calling a social forces model6 would argue that the group is more than an individual thinking about the idea of a group. From the classical sociological perspective, the economy is not reducible to our thoughts about the economy. Instead the economy, or any other macro-scale phenomenon, can be seen as a force unto itself that is able to cause outcomes as if on its own.

Consider the following Durkheim-esque social facts: the news media, the Zeitgeist, education, inequality, war, and so on. In the words of Durkheim, these constructs:

5Tarde was a contemporary of Durkheim who advanced ideas (e.g., Tarde, 1962) that Durkheim appears to have criticized. In fact, Tarde’s ideas would be extremely familiar to modern psychologists on the basis of their formulation in terms of the individual.

6Social forces may be thought of as an homage of sorts to Comte (1858) and his notion of social physics.

Figure 3.7: Various social forces jointly impact individuals, who may then act as a consequence of those forces.

...possess the remarkable property of existing outside the consciousness of the individual.

Consistent with this perspective, McLuhan et al. (1967) observed:

Societies have always been shaped more by the nature of the media by which we communicate, than the content of the communication.

McLuhan’s definition elevates the media, writ large, to the privileged position of altering us more than we can alter ourselves. To contrast with the cognitive perspective, the media network exists as a concrete manifestation - literally as electricity. The medium is therefore not reducible to an individual’s thoughts and perhaps none of these social constructs are, either. A human-meme interaction may therefore be seen as the product of the social forces “existing outside the consciousness of the individual,” constraining the environment within which we interact. In addition to these forces, individuals are connected to one another through a media network that enables the sharing of tweets and memes - which may then be retweeted. From the social forces perspective, part of the reason Individual 2 chooses to retweet what Individual 1 has stated derives from the shared forces that impact them both.

Communication Model

The campaign phenomenon can also be examined as a communication system, resembling the model adapted from Shannon (1948), which is depicted in Figure 3.8. Shannon was primarily concerned with the way networks could be used to pass signals from one end to the other (e.g., Elias et al., 1956). With a campaign slogan as signal, an information transmission phenomenon can be observed as a slogan diffuses through an electoral population.

Figure 3.8: Speech may be viewed from the perspective of the communication process that sustains it. Tweets are messages, constructed from a linguistic corpus, that are transmitted to individuals who receive them. Along the way, the message may be degraded by noise.

This model permits multiple layers of communication (e.g., ISO/IEC, 1994, 7498-1). The lowest layer is physical in nature; for Shannon, it is the wire that electrically connects nodes in a network. However, once this layer becomes coherent, other protocol layers may be added on top of it, which is how we eventually arrive at the modern Internet. In fact, ever more layers can be added on top of that, as is the case with Twitter and other social network applications that were invented decades after the Internet cohered (e.g., Stein et al., 2004). Memes can be viewed as yet another protocol used to encapsulate other signals.7

Shannon observed the noise that interferes with the channel, degrading the signal it carries, and developed error correction techniques to combat the effects of noise (Shannon, 1948). Memes, too, sometimes include characteristics that improve fidelity in order to resist noise and degradation. For example, anthropologists have noted the capacity for rhyming structures and musical tunes to improve memory and recall, increasing transmission fidelity (D. C. Rubin, 1997). Although the memetic perspective discusses fidelity characteristics in a fuzzy manner, as criticized in Atran (2001), the communication perspective can formally quantify this phenomenon in information theoretical terms (e.g., Shannon, 1951).

Thus, memes can be examined as a communication phenomenon by observing the signals that pass through the connection that exists between individuals.

Figure 3.9: In the null model, entropy is the only force that determines the popularity of speech. In other words, the performance of tweets is completely random in this model.

Null Model

What if memes are simply unpredictable and their performance is completely random? Of course, we have good reason to believe this is not the case (see, e.g. Szabo & Huberman, 2010; Kooti, Mason, et al., 2012; Harada et al., 2015). My own Master’s Thesis examined predictors of meme sharing (Miller, 2012), finding sufficient evidence to reject this null hypothesis. In the style of Null Hypothesis Significance Testing (NHST),8 but without actually depending upon NHST, a null model is implied by the model comparison that was used to produce statistical results. The error term in the null model, depicted in Figure 3.9, represents the infinitely-many causes that result in a meme propagating. Conceptually, a hierarchical model comparison will pivot from the null model to obtain a higher resolution of description.

3.2.8 Basic Conceptual Model

Analytically, we must do better than the null model by attributing variance to the influence of individuals, the memes they create and retweet, and the collectives those individuals belong to. When these terms are synthesized into a single diagram, as depicted in Figure 3.10, the resulting hierarchy involves a social construct, an individual construct, and a memetic construct. There are multiple entities at each level: many memes are created by one individual and many individuals belong to a social collective. This is the model we will revisit throughout the rest of this work.

7The 7 layers formalized in the OSI Model (ISO/IEC, 1994) are more of a convention than a hard theory. This has become a bit of a cottage industry, with many other theoretical layers being proposed (e.g., Farquhar, 2010). In this tradition, I will half-seriously propose that memes could be seen as yet another contender for a Layer 8 protocol.

8The analytic approach used in the current work involves model comparison, which is not actually Null Hypothesis Significance Testing (see, e.g. Cohen, 1990). However, insofar as the analysis is built up from linear models, individual slopes are interpreted as differences-from-zero. In this sense, there is no escaping NHST.

Figure 3.10: When all the previous models are boiled down to their , just three constructs remain: the memes that are shared, the individuals who share them, and the social structure that brings individuals together.

3.3 Methods

3.3.1 Data Collection

This work examined Twitter in several ways, including the things people said and who they were connected to. This work began by monitoring Twitter status updates, consisting of both original tweets as well as retweets. The text from these tweets was used to create a corpus of expressions.9 In creating this corpus, we filtered strategically in order to capture the specific time period leading up to the 2014 Toronto mayoral election - and included a little bit afterwards, too. We also restricted the location of the people engaging with this election campaign, in an attempt to localize to the city where the election took place. Lastly, we restricted the topic to politics because, in this city, people talk about lots of things aside from politics.10 In other words, we sought to restrict the time, location, and topic in order to observe the speech of just those individuals who directly interacted with the election campaign.

We also gathered the social network, indicating who pays attention to whom. To construct this social network, we examined the accounts that were observed tweeting by identifying both whom these accounts were following and who were the followers of these accounts. Consequently, the social network was contextualized within our city and topic. In the process, we collected additional information about the individual accounts that were observed, providing us with three crucial kinds of data: tweet text, account identity, and social network connections. This information was used to build a social network of this space.

9These are our Jungian products of the imagination.

10(e.g., Raptors)

Figure 3.11: Box sizes are approximate, representing about 5000 units each. The diagram depicts a brief illustration of the relationships between the levels of data. Three colours - black, red, and blue - correspond to three hypothetical individuals. Red follows black, who follows blue. Red tweeted one time, which was retweeted four times; black was responsible for one retweet and the other three retweets are from other individuals not depicted in the illustration. Black tweeted once and retweeted two times; black’s tweets were never retweeted. Blue tweeted once, which was retweeted by black. Black’s retweet was itself retweeted once by another individual.

Figure 3.12: The Twitter API is directly monitored for status updates - called Tweets. Each tweet implies an individual who was the author of it. As tweets are downloaded, a basic description of individuals is also acquired.

Figure 3.11 depicts the numbers of tweets, individuals, and connections that we collected. Across the top of the diagram are boxes representing 9,794,113 social connection data points. Each of these data points records the observation that an account follows another account. The next level down is the individual level, at which we have observed 147,954 individual accounts. Emanating from those individuals are the social network edges that connect the accounts. Depicted at the bottom of the diagram, 2,660,310 tweets were observed - and of those, 1,486,422 are retweets. This diagram also depicts a brief illustrative example, which is explained in the image caption, that demonstrates how a single individual account can be connected to other accounts, tweets, and retweets.
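As a quick arithmetic check on the relative sizes of these levels, the counts reported above can be combined directly. This is only an illustrative calculation on the four totals given in the text; the variable names are my own, and the derived quantities (share of retweets, mean follow edges per account) are simple ratios rather than results reported elsewhere in this work.

```python
# Totals reported in the text for the TOpoli dataset.
follow_edges = 9_794_113   # observations that one account follows another
accounts = 147_954         # individual accounts observed
tweets = 2_660_310         # all status updates observed
retweets = 1_486_422       # status updates that were retweets

# Derived quantities (illustrative ratios only).
originals = tweets - retweets                 # original, non-retweet status updates
retweet_share = retweets / tweets             # fraction of status updates that were retweets
mean_edges_per_account = follow_edges / accounts

print(f"original tweets: {originals:,}")
print(f"retweet share: {retweet_share:.1%}")
print(f"mean follow edges per account: {mean_edges_per_account:.1f}")
```

In other words, somewhat more than half of the observed status updates were retweets, which is worth keeping in mind when retweets are later treated as an outcome measure.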

3.3.2 Collecting Tweets

Tweets were collected by observing the Twitter streaming API using the Twython library (McGrath, 2013). Parameters were provided to the Twitter API in order to filter the results so that we only received those with a specific keyword: TOpoli. In Twitter’s terminology, #TOpoli is a hash tag that individuals may use to indicate that their tweet is related to city politics. In this case, TO is an abbreviation related to a specific city - Toronto - and -poli is the suffix for politics.11 The #TOpoli hash tag is not official; it exists as a convention that emerged through a process that is likely to be similar to other Twitter conventions (e.g., Kooti, Yang, et al., 2012).12

The Twitter API produces a continuous stream of status updates - and also imposes many technical limitations that affect some kinds of research. Luckily, the volume of tweets that contain our hash tag is low enough that we were probably not affected by Twitter’s quantity restrictions on the use of their public API.13 Tweets provided by the API also contain additional information about authors, including where they were located when they tweeted.

In order to collect these data, the Twitter API was monitored continuously from October 31, 2013 through December 31, 2014. To be successful, online observation required no interruptions to either electricity or network connectivity.14 Despite our best efforts, there was downtime due to a database error that caused an outage of 7 days during July, 2014.15 In proportion to the overall scale of the study, it is unlikely that this outage seriously impacted our results.

11Other related hash tags are #ONpoli and #CDNpoli, which refer to Ontario and Canada, respectively. These hash tags are mentioned for illustrative purposes only, and are not included in the current analysis.

12I hypothesize that the #TOpoli hash tag exists because the community selected this term through a competitive, memetic process. This observation was not formally tested.

Figure 3.13: Once a set of individuals has been acquired, each one is inspected to discover who they follow and who their followers are. These individuals are examined one at a time by a process called a spider. In this visualization, just one individual is highlighted and the set of followers is populated as individuals are iterated across.

3.3.3 Collecting the Social Graph

To construct a social graph, we collected all the social network connections involving accounts that were observed tweeting. Collecting social network data from Twitter is a greater challenge than collecting status updates because the Twitter API is more restrictive about requests of this type.16 Despite the additional challenges, it is nevertheless possible to enumerate the social graph through a process called spidering, which is conceptually depicted in Figure 3.13. Just as a living spider walks through its web, our algorithm navigated the TOpoli social network. We began with a list of all the accounts that tweeted with the TOpoli hash tag. Inspecting one account at a time, the spider slowly downloaded all the followers of those accounts, within the limits imposed by Twitter. Because the spidering process took months to complete, there was a lag between when a tweet was initially observed and when the social network was spidered.

As a consequence of the delayed social network observation, some followers were likely added and removed subsequent to the tweet being observed. Most accounts’ networks probably changed by a relatively small amount during the period between the observation of the tweet and the observation of the social network, so the lag between observation points is not likely to be especially problematic. All social network data were collected within a year. A related problem is that many accounts tweeted many times throughout the observation time period, yet we have only one snapshot of the social network. The social network is constantly changing so we would actually prefer network snapshots corresponding to each tweet time point. However, this temporal resolution is simply not achievable due to the restrictions Twitter places upon their social data. Ultimately, our spidering method did successfully produce a social network of TOpoli users. Despite the caveats mentioned, this network is likely to be a pretty good representation of the ground truth.

13Twitter are not entirely transparent about their policies. At least at the time these data were collected, the Twitter Streaming API provided free access to a maximum of 1% of the total volume of tweets happening at any given time. Their so-called Firehose product provides access to 100% of tweets in real-time. Because our volume of tweets is far below 1% of the Firehose volume, we do not believe we were affected by Twitter’s limits.

14Initial attempts to use a basic university connection were unsuccessful due to frequent disruptions. Consequently, virtual cloud servers were rented for this monitoring task. Initially, Linode were used but at a later time, the task was migrated to Digital Ocean.

15We contacted Twitter to inquire about purchasing those 7 days of missed tweets and received a quote of approximately $3,500 USD. Ultimately, it was determined that the cost far exceeded our ability to pay for the data. As such, these 7 days will remain as missing data.

16For commercial and privacy reasons, Twitter limits the frequency of social graph API requests and also limits the number of results that are provided at a time.
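The spidering process can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the spider used in this work: `fetch_followers` is a hypothetical callable standing in for a rate-limited Twitter API request, the fixed pause is a crude stand-in for Twitter's request limits, and a small in-memory dictionary takes the place of the real network.

```python
import time


def spider(seed_accounts, fetch_followers, pause_seconds=0.0):
    """Visit each seed account once and record who follows it.

    Returns a set of (follower, followed) edges. `fetch_followers` is a
    hypothetical function that returns the followers of one account.
    """
    edges = set()
    for account in seed_accounts:          # inspect one account at a time
        for follower in fetch_followers(account):
            edges.add((follower, account))
        time.sleep(pause_seconds)          # assumption: fixed pause to respect request limits
    return edges


# Hypothetical toy network standing in for the Twitter API.
toy_followers = {
    "account_a": ["account_b", "account_c"],
    "account_b": ["account_a"],
}
edges = spider(["account_a", "account_b"], lambda name: toy_followers.get(name, []))
```

Because each account is visited only once, the result is a single snapshot of the network, which is the limitation discussed above.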

3.3.4 Natural Language Processing

Tweets that contained memes were identified using methods from natural language processing. This work makes use of the Python Natural Language Toolkit (Bird, 2014). The first step in analyzing the Twitter text corpus was to break it down into tokens, the basic units of linguistic analysis (e.g., Herdan, 1960). A token is a contiguous sequence of alpha-numerical characters delimited by spaces or punctuation. After isolating each token, a concordance analysis was performed to produce a dataset with all the words that appear next to one another. A bi-gram concordance consists of sequences of 2 words, a 3-gram concordance consists of sequences of 3 words, and the general form is called an n-gram concordance.17 The concordance process computes the relationships between groups of n words at a time, respecting the order in which the words appear. For the remainder of this work, we will linguistically operationalize a meme in terms of an n-gram that is repeated.
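To make the tokenization and concordance steps concrete, the following is a minimal sketch using only the Python standard library rather than the Natural Language Toolkit. The function names `tokenize` and `ngrams` are hypothetical, and lowercasing the text is an assumption of this sketch rather than a documented step of the analysis.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into contiguous runs of alphanumeric characters.

    Assumption: text is lowercased so that case variants count as one token.
    """
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens, n):
    """Return every run of n consecutive tokens, preserving word order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


fragment = "It was the best of times"
trigrams = ngrams(tokenize(fragment), 3)

# A concordance over a corpus is then a count of how often each n-gram occurs.
concordance = Counter(trigrams)
for gram, count in concordance.items():
    print(" ".join(gram), count)
```

Applied to the familiar fragment "it was the best of times," the sketch yields four 3-grams, each occurring once; in a corpus of tweets, repeated n-grams would accumulate higher counts.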

3.3.5 Requirements for a Method that Identifies Memes

For this work, we required a method for identifying which n-grams are “good” memes - which is to say that we needed some sort of definition for good. The goal was to identify those n-grams that are likely to be memetic - and to do so without any advance knowledge of how much retweet activity an n-gram will ultimately generate.

From a memetic perspective, there is a difference between an n-gram that is copied because it is retweeted versus an n-gram that is copied because it has been reproduced from memory (Cavalli-Sforza & Feldman, 1981). To borrow from genetic terminology, a retweet is like a direct descendant from a tweet and its memetic characteristics are transmitted vertically, with extremely high fidelity. This direction of transmission can be contrasted with lateral transmission, which is accomplished through mechanisms like social learning (e.g., Bandura, 1977), frequently with lower fidelity due to many identifiable causes (Castro & Toro, 2014).

Because retweets ultimately served as an outcome measure in subsequent analyses, all non-original tweets were removed from the corpus used for meme identification. Consequently, a method for identifying memes must operate upon only those memes that were transmitted through social (lateral), as opposed to technical (vertical), means.

The choice to examine 3-grams was informed by a preliminary exploration of the concordance data. 2-grams were found to be too short for political slogans: subjectively speaking, there were not many recognizable 2-word slogans. 4-grams produced a combinatorially-large space, which was problematic both in terms of computing power and corpus size. The 4-gram concordance obtained from the corpus was too sparse, such that most 4-grams were unique - which would not necessarily be the case with a larger corpus. However, even with the relatively-small corpus obtained from TOpoli, performing a concordance analysis with 4-grams would be computationally intensive, such that a larger corpus would have to be matched with correspondingly-increased computing power. When considering the parameters of the corpus and available computing resources, 3-grams exist in the “sweet spot” for detecting memetic campaign slogans.

Finally, our method for detecting memes must discriminate against n-grams that are not memetic despite being used frequently. Many n-grams serve functional purposes and are commonly used in the course of everyday speech.18 A method for identifying memes must exclude functional n-grams despite their high frequency of use.

17For example, the familiar fragment, “it was the best of times,” is broken into 3-grams as: 1) it was the; 2) was the best; 3) the best of; 4) best of times.

3.3.6 Pointwise Mutual Information

To identify memetic n-grams in our corpus, we used a linguistic method called Pointwise Mutual Information (PMI), which scores n-grams according to the non-independence of the tokens appearing within (Church & Hanks, 1990). The intuition underlying this approach is that when two words are independent of one another, they appear in an n-gram due to chance. However, when there is some interesting dependency between those words, they will co-occur at a higher rate than would be expected if they were independent. PMI measures the relationship between word 1 ($w_1$) and word 2 ($w_2$) in terms of joint probability ($p(w_1, w_2)$) versus independent probabilities:

$$\mathrm{pmi}(w_1; w_2) \equiv \log \frac{p(w_1, w_2)}{p(w_1)\,p(w_2)} \tag{3.1}$$

Pointwise Mutual Information recalls our original formulation of meme systems, inspired by Jung, in which symbols covary due to a relationship with covarying experience. PMI has been used by other researchers to detect interesting text (e.g., Freitag et al., 2012; Liu et al., 2015). From a communications perspective, the joint probability of these symbols is viewed as information (Fano, 1961). Symbols with higher joint probability are informative, both statistically and in the Jungian sense. The most informative n-grams are those that systematically co-occur due to some dependence between them; there’s a reason for those words to co-occur at the rate they do.
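As a minimal sketch of equation 3.1, the following Python estimates the probabilities from raw counts and scores adjacent word pairs. The toy corpus, the function name `pmi_scores`, and the restriction to 2-grams are assumptions made for illustration; how the score was extended to 3-grams for the TOpoli corpus is not specified in this section.

```python
import math
from collections import Counter


def pmi_scores(tokens):
    """Score every adjacent word pair (2-gram) with equation 3.1."""
    word_counts = Counter(tokens)
    pair_counts = Counter(zip(tokens, tokens[1:]))
    total_words = sum(word_counts.values())
    total_pairs = sum(pair_counts.values())

    scores = {}
    for (w1, w2), count in pair_counts.items():
        p_joint = count / total_pairs           # p(w1, w2)
        p_w1 = word_counts[w1] / total_words    # p(w1)
        p_w2 = word_counts[w2] / total_words    # p(w2)
        scores[(w1, w2)] = math.log(p_joint / (p_w1 * p_w2))
    return scores


# Hypothetical toy corpus, not drawn from the TOpoli data.
tokens = ("stop the gravy train now stop the gravy train today "
          "the vote is today").split()
scores = pmi_scores(tokens)

# Print the pairs from highest to lowest PMI.
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(score, 2))
```

In this toy corpus, “gravy train” scores higher than “the gravy” because “the” also appears in other contexts, which lowers the dependence between the two words.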

18 Consider 2-grams in the sentence “to be or not to be.” Although the 2-gram “to be” appears twice, this is not because it is memetic; it is repeated because it is a functional verb phrase.

3.3.7 Advantages and Disadvantages of PMI

There were several advantages and disadvantages to using Pointwise Mutual Information. Among the advantages of PMI was the elimination of a human classification step. As a simple statistical operation, PMI was easy to execute and replicate. PMI was also superior to simple linguistic scoring techniques like frequency ranking. A naïve meme detection algorithm might derive from the intuition that the most popular terms are memetic. To evaluate the effectiveness of popularity scoring, the most common phrases were calculated and the top 50 were manually inspected. Although the most popular phrases were related to politics, some phrases were repeated many times with a single word changed. PMI performed much better because it produced a variety of distinct phrases.

PMI was also found to suffer from some disadvantages. As an artifact of the way PMI is calculated, it was sensitive to extremely low-frequency terms, which scored highly due to their uniqueness, not because those terms are memetic. In the TOpoli corpus, PMI was foiled by made-up words which were so unlikely that the division of token dependence by token independence produced an extremely high ratio. Tweets that included URLs with tracking information could also be erroneously classified as memetic. These misclassifications were manually investigated and the error rate was under 20%.
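The sensitivity to low-frequency terms can be illustrated with a short calculation. In this sketch, all counts are hypothetical and the helper `pmi_from_counts` is introduced only for illustration; it applies the ratio in equation 3.1 directly to counts.

```python
import math


def pmi_from_counts(joint, count_w1, count_w2, total_pairs, total_words):
    """PMI of a word pair computed from raw counts (equation 3.1)."""
    p_joint = joint / total_pairs
    p_independent = (count_w1 / total_words) * (count_w2 / total_words)
    return math.log(p_joint / p_independent)


# Hypothetical corpus size, chosen only to make the arithmetic easy to follow.
TOTAL_WORDS = 1_000_000
TOTAL_PAIRS = TOTAL_WORDS - 1

# A slogan seen 500 times, built from two fairly common words.
slogan = pmi_from_counts(500, 800, 600, TOTAL_PAIRS, TOTAL_WORDS)

# Two made-up words that each occur exactly once, side by side.
hapax = pmi_from_counts(1, 1, 1, TOTAL_PAIRS, TOTAL_WORDS)

print(f"slogan: {slogan:.2f}  made-up pair: {hapax:.2f}")
```

The pair that occurs only once receives the higher score because its denominator is vanishingly small, which is the artifact described above.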

3.3.8 Community Detection

We required a method to identify social network communities among individuals who were observed tweeting. This method could not depend upon human-assigned labels, either according to political membership or otherwise, because any manual process would not match the scale of our data. The follower network that was collected would serve as the input to the community detection method. Several algorithms were tested for this purpose, including a random walk method (Pons & Latapy, 2005), but ultimately the Louvain community detection method was selected (Aynaud, 2019). The Louvain method is a graph-based approach that tries to maximize graph modularity (Blondel et al., 2008). In other words, Louvain looks for split points that partition the graph into sub-networks that approximate self-contained clumps. Because the Louvain method is stochastic, the results can differ slightly every time it is executed. However, we found that the communities were fairly stable from one run to the next, at least for our network.
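To make the optimization target concrete, the sketch below computes modularity for a given partition of a small graph. It is not the Louvain algorithm itself, only the quantity Louvain tries to maximize. The toy graph, the function name `modularity`, and the treatment of edges as undirected are assumptions; the follower network in this work is directed.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity of a partition of an undirected, unweighted graph."""
    m = len(edges)
    intra_edges = defaultdict(int)   # edges with both ends in the community
    degree_sum = defaultdict(int)    # summed degree of the community's nodes
    for u, v in edges:
        degree_sum[community_of[u]] += 1
        degree_sum[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            intra_edges[community_of[u]] += 1
    return sum(intra_edges[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Hypothetical graph: two triangles joined by a single bridge edge (c-d).
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]

split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
merged = {node: 0 for node in split}

q_split = modularity(edges, split)    # one community per triangle
q_merged = modularity(edges, merged)  # everything in one community
print(q_split, q_merged)
```

Splitting the graph at the bridge yields higher modularity than treating the whole graph as one community, which is the kind of split point the Louvain method searches for.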

3.3.9 Hierarchical Linear Modelling

In order to compare different models, the unifying framework we employed was hierarchical linear modelling (HLM).19 The goal was the specification of several different models that maintained the hierarchical relationships described in the introduction to this work. HLM enabled us to represent multiple tweets originating from one account, statistically specifying one regression slope for each author. We also modelled multiple authors nested within communities, obtaining one slope for each community. The largest model specification contained 3 levels: tweets nested within authors and communities. Every other model was a subset of this 3-level specification. We directly compared these models using the Bayesian Information Criterion (BIC), permitting us to identify which model is the most likely given the data we collected (Kass & Raftery, 1995). Altogether, HLM enabled us to directly compare the theoretical models of political campaigns that were introduced in this section.

To actually perform these tests, the lme4 package for the R language was used (Bates et al., 2019). The lme4::glmer function provided the ability to model Poisson-distributed outcomes.20 Retweet count, our outcome measure, was an integer measure of the number of times an original tweet was propagated. The retweets variable had a lower bound of 0 and, as the number of retweets increased, ever-fewer original tweets achieved those retweet counts. Since this outcome measure was not normally distributed, we could not use a regression model predicated upon a Gaussian distribution. The generalized linear mixed-effects model with a log link function, corresponding to the Poisson family, met our requirements.

Due to the large number of observations and the eventual model complexity, the glmer nAGQ parameter was used to accelerate parameter estimation during fitting. Initially, slower and more precise estimation was performed with the default of nAGQ = 1 but, in some cases, the model did not converge and some parameters could not be estimated. When nAGQ = 0, models converged each time and the calculations were much quicker. Most importantly, the resulting BIC was not remarkably different from when more precise estimation was performed. Apart from the specification of the Poisson family and tuning nAGQ, all other glmer defaults were used.

19 HLM is also known as multi-level modelling (MLM), growth curve modelling, and linear mixed effects (LME). LME happens to be the terminology that is widely used within the R language.
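The logic of comparing Poisson models by BIC can be sketched without any mixed-effects machinery. The Python below is not the lme4::glmer procedure used in this work: it has no random effects, the retweet counts are hypothetical, and it relies on the fact that the maximum-likelihood rate for a Poisson model with a single binary predictor is the mean of each group.

```python
import math


def poisson_loglik(counts, rate):
    """Log-likelihood of counts under a Poisson distribution with one rate."""
    return sum(y * math.log(rate) - rate - math.lgamma(y + 1) for y in counts)


def bic(loglik, n_params, n_obs):
    """Bayesian Information Criterion; lower values indicate a better model."""
    return n_params * math.log(n_obs) - 2 * loglik


# Hypothetical retweet counts for memetic and non-memetic tweets.
memetic = [4, 6, 5, 7, 3, 5]
non_memetic = [1, 0, 2, 1, 0, 2]
all_counts = memetic + non_memetic
n = len(all_counts)

# Intercept-only model: a single rate for every tweet.
rate_all = sum(all_counts) / n
ll_null = poisson_loglik(all_counts, rate_all)

# Model with a meme predictor: one rate per group.
ll_meme = (poisson_loglik(memetic, sum(memetic) / len(memetic))
           + poisson_loglik(non_memetic, sum(non_memetic) / len(non_memetic)))

bic_null = bic(ll_null, 1, n)
bic_meme = bic(ll_meme, 2, n)
print(f"BIC intercept-only: {bic_null:.2f}  BIC with meme: {bic_meme:.2f}")
```

Although the second model spends an extra parameter, its BIC is lower for these counts, so it would be preferred. The same comparison applies to the hierarchical models, where the penalty also accounts for the random-effect parameters.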

3.3.10 Operational Terminology

individual a single Twitter account.

community a categorical code that identifies an individual’s social network neighborhood, as assigned by the Louvain community detection algorithm.

tweet an original, non-retweet status update posted by an account.

fitness the number of times a tweet is retweeted; more retweets means higher fitness.

meme a 3-gram with a top-ranking PMI score.

is-memetic an effect-coded variable (1 = meme, −1 = not meme) that indicates the presence of a meme within a tweet.
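The is-memetic variable defined above can be sketched as a simple lookup. The whitespace tokenizer, the lower-casing, the two example memes, and the function names are assumptions made for illustration; the tokenization rules actually applied to the corpus may differ.

```python
def three_grams(text):
    """Return the set of 3-grams in a tweet, using naive whitespace tokens."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + 3]) for i in range(len(tokens) - 2)}


def is_memetic(text, meme_set):
    """Effect code: 1 if the tweet contains any top-scoring 3-gram, else -1."""
    return 1 if three_grams(text) & meme_set else -1


# Hypothetical stand-in for the set of top-scoring 3-grams.
MEMES = {("the", "gravy", "train"), ("the", "island", "airport")}

print(is_memetic("Time to stop the gravy train", MEMES))   # contains a meme
print(is_memetic("Council meets on Tuesday", MEMES))       # contains no meme
```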

3.4 Results

3.4.1 Tweets

TOPOLI2014 Data Set

We observed 2,660,310 tweets, beginning on October 31, 2013. The start date was almost exactly one year before the election, which occurred on October 27, 2014. For the purpose of the analysis, the observation period was halted on December 31, 2014, approximately two months after the election was decided.

20 GLMER is an abbreviation for Generalized Linear Mixed Effects Regression.

Figure 3.14: Both tweets and retweets are plotted by day. In most cases, there are more retweets than tweets on any given day. The most activity was observed on the day of the election.

1,173,888 original tweets were observed during the monitoring period. We also observed 1,486,422 retweets. 147,954 individuals were observed engaging through these tweets and retweets. Among those 147,954 individuals, we were able to identify 9,794,113 connections.

Observation Time Period

The number of tweets and retweets collected per day during the observation period is plotted in Figure 3.14. The plot begins in October, 2013, corresponding to the start of data collection. There were a few tweets that occurred before the start of observation, which were a consequence of people retweeting old tweets that were created before October 2013. The plot ends in December, 2014, corresponding to the end of the observation period.21

The election on October 27, 2014 corresponded to the highest point of activity during the observation period. On this day, the greatest number of tweets and retweets occurred. The day after the election, the volume dropped off dramatically. The tweets depicted in this plot are only those that were produced as a consequence of the TOpoli filter, so the correspondence between the election date and the peak volume confirmed that the filter successfully captured the election phenomenon.

As was previously mentioned, there was a technical problem for one week in July, 2014. This week can be seen in the plot as a small period with no activity, corresponding to the outage. In context, it can be seen that this was a relatively small outage; the bulk of the phenomenon was captured.

This plot also depicts the periodic nature of campaign activity. Each weekend, the volume of tweets dropped off and resumed during the week. This time course is potentially very interesting for insights that might relate to other research questions.

21 Incidentally, this data set is still growing - even in 2019. I’ve been running this process almost continuously, with minor interruptions.

Figure 3.15: Most 3-grams are never repeated; they are entirely unique. The remaining 3-grams are those that have been repeated at least once. Any 3-grams that could possibly be successful memes will be detected within the set of 3-grams that are repeated.

Language Use

Next, we examined the lexical tokens extracted from the corpus. There were 22,683,616 individual tokens, separated by spaces or punctuation. Of those 22.68 million tokens, there were 546,269 unique tokens. Among those half-million unique tokens, there were many non-words, including URL fragments, technical strings, and random sequences of characters.

When performing the concordance analysis, we obtained the expected number of 3-grams based on the corpus size.22 Half of those 3-grams are unique, providing our first insight into the upper bound for the global memetic carrying capacity of our corpus.23 As visualized in Figure 3.15, not more than 50% of our text corpus was ever copied; the remainder is unique. This upper bound is an exceptionally crude measure - the actual number of memes is going to be much lower - but we know it cannot be higher than this.
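The concordance step can be sketched in a few lines. The example sentence and the helper `ngrams` are assumptions for illustration; the sketch treats the text as a single token stream, which is the situation in which the number of 3-grams equals the number of tokens minus two.

```python
from collections import Counter


def ngrams(tokens, n=3):
    """Slide a window of n tokens across the token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical token stream; real tweets would need a proper tokenizer.
tokens = "it was the best of times it was the worst of times".split()

grams = ngrams(tokens)
counts = Counter(grams)

unique = [g for g, c in counts.items() if c == 1]    # never repeated
repeated = [g for g, c in counts.items() if c > 1]   # candidate memes

print(len(grams), len(unique), repeated)
```

Only the repeated 3-grams can possibly be memetic, so the share of 3-grams that are never repeated bounds how much of a corpus could have been copied.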

3-gram Frequency

To gain some insight into the 3-grams that were not original, their frequency of repetition was plotted in Figure 3.16.24 It can immediately be seen that 3-grams vary in the degree to which they were replicated. I did not fit a curve to this plot but it would appear to approximate the Zipf distribution (Zipf, 1932), which is an empirical finding that describes individual linguistic tokens. Recent linguistic work has found that Zipf’s law generalizes from individual tokens to n-grams (Ha et al., 2002; Williams et al., 2015), suggesting our corpus contains plausible speech.

The number of unique 3-grams was in the millions - and there were hundreds of thousands of 3-grams that were used just a few times. The sheer volume of infrequent 3-grams caused this curve to become flattened out and uninterpretable. Consequently, the plot was truncated by applying a minimum frequency threshold of 50 and a maximum threshold of 1,000. At the upper bound, not too many 3-grams were used more than 1,000 times. Near the lower bound, there were nearly 2,000 3-grams that were used 100 times. Suffice it to say that most 3-grams were used infrequently.

22 By definition, the number of 3-grams ought to be equivalent to the total number of tokens - minus two - so this simple check was satisfied.

23 A unique 3-gram cannot be memetic, by definition, because it was never replicated.

24 Here we have one of the vanishingly rare cases where a pie chart is actually appropriate.

Figure 3.16: This plot depicts the number of times each 3-gram was repeated. Most 3-grams are seldom repeated. The lower end of this plot, depicting the least-frequently repeated 3-grams, was truncated. In effect, the plot has been “zoomed” in order to get a better look at the more common 3-grams.

3-grams Filtered by Minimum Frequency

Due to the ratio calculation of Pointwise Mutual Information, the PMI result is sensitive to unique and rare terms. Because the high volume of infrequently used 3-grams would interfere with the PMI scoring procedure, the lowest-frequency 3-grams were removed prior to scoring.

To remove infrequent 3-grams, we searched the parameter space for the lowest feasible cutoff that would eliminate some of the noise prior to PMI scoring. Several possible cutoff thresholds were probed to examine their impact on the average frequency of the terms remaining. These probes were evaluated in order to determine whether they caused only high-frequency 3-grams to remain.

The plot in Figure 3.17 depicts the results of the search for a threshold that would ensure some low-frequency 3-grams remained. The threshold value of 500 is especially sub-optimal because the resulting average frequency is unusually high in the context of the plot. Values below 200 exhibit a roughly linear relationship - so any of these thresholds could be viable. A threshold of 100 was chosen because a lower threshold removed a smaller portion of the corpus. When a threshold of 100 was applied, millions of 3-grams still remained. Nearly 10,000 of the remaining 3-grams were distinct, which provided a rich set of 3-grams to score with the Pointwise Mutual Information method.
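The threshold probe described above can be sketched as follows. The frequency table is hypothetical, the function name `probe_threshold` is introduced for illustration, and treating the cutoff as inclusive (a 3-gram is kept when its frequency is at least the threshold) is an assumption.

```python
from collections import Counter


def probe_threshold(counts, threshold):
    """Return how many 3-grams survive a cutoff and their mean frequency."""
    kept = [c for c in counts.values() if c >= threshold]
    if not kept:
        return 0, 0.0
    return len(kept), sum(kept) / len(kept)


# Hypothetical 3-gram frequencies, not taken from the TOpoli corpus.
counts = Counter({
    "one off phrase": 1,
    "seen it twice": 2,
    "fairly common phrase": 50,
    "a campaign slogan": 120,
    "very common phrase": 600,
})

for threshold in (1, 50, 100, 500):
    n_kept, mean_freq = probe_threshold(counts, threshold)
    print(threshold, n_kept, round(mean_freq, 1))
```

As the cutoff rises, fewer 3-grams remain and their mean frequency increases, which is the relationship depicted in Figure 3.17.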

Top 500 3-grams in TOPOLI2014

After removing the lowest-frequency terms, the filtered corpus was scored using the PMI method. The results produced by the PMI scoring operation consisted of the 3-grams along with their associated PMI scores. 3-grams were then ranked by PMI score and the top 500 were retained. The decision to select only 500 top-scoring n-grams was made in consideration of the need to create a discriminant meme classifier based on PMI score. The preponderance of lower-scoring n-grams - well over 9,000 - were regarded as non-memetic for coding purposes.

Figure 3.17: This plot depicts the consequences of removing the least common 3-grams. As more 3-grams are removed, the remaining 3-grams are more common, on average.

Figure 3.18: Several of the highest-scoring 3-grams correspond to visual representations. A few of those 3-grams have been collected here: The Gravy Train, The Island Airport, and Little Red Apples. In the context of the election campaign, each of these images was imbued with significance among the electorate.

Of the top 500 n-grams, the very top 100 were manually reviewed in order to develop a qualitative sense for the content. Of these manually assessed n-grams, 79 out of 100 were related to the campaign. Frequently, politicians’ names co-occur with job titles, which tend to appear together frequently and rarely appear separately. Many of the manually reviewed 3-grams refer to politically important locations. With very few exceptions, these names and locations are highly relevant to the campaign. Some political headlines showed up when they were extremely popular and widely discussed.

Many of these top-scoring n-grams are recognizable as slogans and campaign issues, even years later. For example, the gravy train was a slogan that was repeated frequently during the campaign. The island airport was a location that became a key campaign issue and it figured into the election discussion. Little red apples became a meme due to a profoundly ignorant characterization of racism.25 A speedy recovery related to the incumbent mayor’s cancer diagnosis, and was widely discussed during the campaign.
Another media sensation that occurred during the campaign was the shirtless jogger, who interrupted a campaign rally held along his jogging route and spoke passionately about issues.

The 21 n-grams that were assessed as unrelated to the campaign were generally classified as spam26 and advertising. One of the stranger findings was “infowars zoomer conspiracy” - which, upon closer inspection, was a product of spam. When unusual terms like “zoomer” were spammed often enough, and the terms formed a unique combination of words, PMI predictably produced a relatively high score. Another example of spam came from a sequence of tweets that included a promotional code for a ride sharing program, which was repeatedly targeted at the TOpoli hash tag. These “misses” that were unrelated to the election can be regarded as noise.

PMI scoring identified recognizable, campaign-related 3-grams at a fairly high rate. Despite the 21% that were classified as noise, the entire set of 500 top-scoring 3-grams was retained in order to code original tweets as “memetic” or not. When these top-scoring n-grams were used to code tweets as having memes or not, fewer than 1% of the original tweets, n = 7,587, were ultimately coded as memetic.

Summary: Tweets Results

The PMI scoring method was an algorithmic process that could be applied en masse to the original tweet corpus. The lowest-frequency 3-grams were removed from the corpus prior to PMI scoring because PMI is sensitive to low-frequency outliers. After sorting by PMI score, the top-scoring n-grams contained many recognizable campaign slogans and issues.

By retaining only the top 500 PMI-scoring 3-grams for meme classification, a small minority of 3-grams - and not necessarily the most frequently occurring 3-grams - were used to determine a tweet’s memetic status. A new effect code was calculated for every tweet, indicating whether the tweet was memetic or not based on the presence or absence of a top-scoring 3-gram.

25 This phrase was regrettably uttered by Ontario’s current Premier, Doug Ford.

26 The term spam, which is contrasted with ham, should be interpreted as an imitation of the real thing. Spam messages appear to be like real messages but they are typically advertisements. Although it is beyond the scope of this dissertation, a fascinating folksonomy has arisen to classify other sorts of fake messages. AstroTurf is a fake grassroots message, which frequently features in a political context. Misinformation, which regained prominence in the context of the 2016 U.S. Presidential election, is a classic spam-like form of messaging engineered specifically to deceive.

Figure 3.19: Those tweets with location data attached to them were plotted on this map. Although the location dots are almost transparent, the portion of the map corresponding to the City of Toronto has so many dots overlaid upon it as to be blackened entirely. This result suggests that most individuals who used the #TOpoli hash tag were located in the city when they were tweeting.

3.4.2 Social

Location of Accounts

The next class of results describes the social network we observed. According to the map in Figure 3.19, most individuals were tweeting while located within the vicinity of the election. Each tweet that included geographic information is reported as a dot placed on the map. These dots are almost transparent but they stack upon one another, such that the regions on the map that are virtually blackened represent a high density of activity at that location. 90% of the tweets that have GPS information appear within the boundaries of this map.

This analysis is somewhat special because it is no longer possible to gather these location data. In 2014, Twitter was reporting GPS data along with tweets but, due to privacy concerns, it has since stopped this practice.27 However, at the time of observation in 2014, fully 63% of the tweets we collected did include GPS data.

Even in 2019, users may choose to self-report their location by composing a descriptive sentence as part of their online profile. When examining self-reported location information, 30% of individuals included the keyword “Toronto” in their location description. Meanwhile, 16% of individuals included the keyword “Canada”. While this is certainly less convincing than the GPS data, this result is nevertheless convergent with the map visualization.

On this basis, we inferred that users of the TOpoli hash tag were local to the city where the election took place. Furthermore, the hash tag was probably used to discuss local issues due to the location of the individuals using the hash tag.

27 I agree with Twitter’s cessation of reporting location data. I have to admit that it’s slightly terrifying to consider what becomes possible when these data are available - and a few technologists produced proof-of-concepts to demonstrate various risks. However, my primary concern involves state and military actors with massive resources. In my estimation, the asymmetry between individuals and well-funded military organizations constitutes a serious threat to democracy.

Figure 3.20: The complete social graph is visualized with the Force Atlas 2 layout, which places connected individuals near to one another. Individuals are coloured according to community codes. The large regions of similar colours indicate that community codes successfully identified related individuals.

Social Graph

The global Twitter graph has hundreds of millions of users but only a small subset of those accounts ever mentioned the TOpoli hash tag in a tweet during our observation period. The accounts that did use the hash tag - and the accounts they connected to - were spidered to create a social dataset. The individuals and their social network connections were combined into a single network structure representing who follows whom. In total, 9,794,113 follower data points were collected.

Those nearly-10-million connections are visualized in Figure 3.20 using a layout algorithm called Force Atlas 2 (Jacomy et al., 2014). The colours of the edges are based on the community codes assigned by the Louvain community detection algorithm. There are 55 communities detected by the Louvain method; hence, 55 colours appear in the plot.

Degree Dynamics of Graph

Degree, in graph theory, refers to the number of edges - followers, in Twitter nomenclature - associated with a node or individual. When modelling human social phenomena, there is typically an inverse relationship between degree and frequency. Most accounts have few followers and few accounts have many followers.

Filtering the network based on degree is a common network operation that can illuminate interesting social phenomena observable among the best-connected nodes. Due to the inverse relationship between degree and frequency, eliminating the accounts with the fewest connections results in a dramatic reduction in the overall size of the graph. The plot in Figure 3.21 depicts the degree distribution with various lower-bound thresholds. The vertical scale of the plot is affected by the threshold so, as increasingly high thresholds are applied, greater resolution is obtained for the remaining accounts. Applying a degree lower-bound of 50 eliminates almost half of the accounts from the graph. When the degree threshold is 1000, just 253 accounts remain in the network.28

Figure 3.21: These plots depict the consequences of removing the least-popular individuals from the network. As progressively better-connected individuals are retained, fewer individuals remain in the network.
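The degree filter can be sketched with a follower edge list. In this sketch, degree is counted as the number of followers an account has inside the network, and the edges that survive are those between retained accounts. Both choices, along with the toy edge list and the function name `filter_by_degree`, are assumptions for illustration.

```python
from collections import Counter


def filter_by_degree(edges, min_degree):
    """Keep accounts with at least min_degree followers inside the network.

    Each edge is a (follower, followed) pair.
    """
    follower_counts = Counter(followed for _, followed in edges)
    keep = {node for node, d in follower_counts.items() if d >= min_degree}
    kept_edges = [(u, v) for u, v in edges if u in keep and v in keep]
    return keep, kept_edges


# Hypothetical follower edges: "hub" is followed by four accounts, "d" by two.
edges = [("a", "hub"), ("b", "hub"), ("c", "hub"), ("d", "hub"),
         ("hub", "d"), ("a", "d")]

for threshold in (1, 2, 3):
    nodes, remaining = filter_by_degree(edges, threshold)
    print(threshold, sorted(nodes), remaining)
```

Raising the threshold removes the accounts that nobody in the network follows first, so the retained sub-network shrinks quickly.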

Graph Modularity

Graph modularity is a measurement of the extent to which communities are self-contained versus interconnected.29 The plot in Figure 3.22 simultaneously visualizes the effects of degree threshold on both graph size and modularity. When applying a degree lower-bound, modularity decreases as the least-connected users are removed from the network. One way to interpret this result is that the world becomes “smaller” as popularity increases; there is an increasing likelihood that popular accounts will be connected to one another.30

28 These 253 accounts are followed by 1000 or more other accounts strictly within the TOpoli network. Quite a few accounts have more than 1000 followers in the context of the overall Twitter network. However, in the context of the TOpoli network consisting of 147,954 accounts, it was apparently rare to be followed by 1000 or more accounts from among this smaller subset.

29 When modularity is close to 1, there are many disconnected “neighborhoods”. As modularity approaches 0, nodes cohere into a single network.

30 My interest in degree dynamics, and its effects on modularity and graph size, relates to the challenges I faced in attempting to visualize the complete TOpoli network, consisting of nearly 10 million edges. Due to the size of the network, I was unable to visualize it all until finally solving the problem in 2019. I initially performed these degree analyses in order to justify the use of a degree lower-bound threshold so I could visualize simpler graphs without losing important aspects of the phenomenon. However, I am now able to handle the complete graph so I no longer need to justify the visualization of subsets.

Figure 3.22: This visualization depicts the consequences of removing the least-connected individuals from the network. As the degree filter threshold is increased, fewer individuals remain. In addition, the modularity of the remaining network is plotted. Better-connected individuals tend to belong to broader communities.

Social Graph Layout

When the network was visualized again with a degree lower-bound of 1000, as seen in Figure 3.23, the least-popular accounts were removed in order to observe the connectivity of the most popular accounts. The Force Atlas 2 layout algorithm was used again (Jacomy et al., 2014). Force Atlas 2 operates like a physical simulation in which nodes repel one another like negatively charged particles. When nodes are connected, they exert an attractive force that causes related nodes to pull together.

This visualization provides one way to validate that the community detection worked. Recall that categorical community codes were assigned to individuals who were inter-connected based on the Louvain community detection algorithm (Blondel et al., 2008; Aynaud, 2019). In Figure 3.23, nodes with the same colours are grouped together, which is convergent with the results of the community detection process. If nearby nodes systematically exhibited different colours, that would be evidence against convergence, but because nearby nodes have similar colours, this suggests that community codes were assigned appropriately.

Figure 3.23 also depicts the consequences of pruning the least-popular accounts from the network. In contrast to Figure 3.20, the wispy solitary nodes around the perimeter have been removed. The modularity of the pruned network in Figure 3.23 is lower, which is convergent with the expectation that as accounts become more popular, they tend to be more interconnected.

Figure 3.24 presents the results of the OpenOrd (Martin et al., 2011) layout algorithm, which was applied to the complete, unpruned TOpoli2014 social network. OpenOrd has some similarities to Force Atlas 2 but it is capable of producing more visually distinct clusters.31 Node colours are once again laid out near to one another, suggesting again that communities were coded correctly.

31 In addition to an attraction/repulsion model, OpenOrd also implements a simulated annealing and cooling process.

Figure 3.23: This visualization depicts the consequences of removing those individuals with fewer than 1000 connections to other individuals in the network. In most ways, this plot resembles the complete network visualization. The same communities generally remain.

Figure 3.24: The social network has been plotted with the OpenOrd layout. This layout offers more distinctiveness between communities. Individuals in different communities tend to be laid out farther from one another.

Community Labels

Over the course of data collection and analysis, communities were identified and labelled several times. As network size increased, it became infeasible to perform this labelling task manually and so the final communities were not labelled. Nevertheless, the labels obtained from preliminary data can still inform the final data that were collected because both describe the same underlying TOpoli2014 population. Those preliminary community labels were: 1) news media; 2) politicians; 3) activists; and 4) spammers.

As more data were collected, the number of communities grew, which would likely permit greater resolution in the names for the communities. For example, political communities could be sub-classified, which would be consistent with findings about political blogs (Adamic & Glance, 2005) and polarization in the U.S. Congress (e.g., Waugh et al., 2009). One future direction for this work could be to produce interpretable labels for the communities. Several approaches, including machine learning, topic modelling, or crowdsourcing, might be used to assign labels to communities in arbitrarily large networks. However, creating these labels must be left for future work.

Summary: Social Graph Results

In summary, the social graph consisted of about 147,000 individuals with about 10 million connections among them. For the most part, these individuals were local to Toronto. The Louvain community detection method identified 55 communities. Those communities become less distinct - and the graph, more interconnected - as the accounts increase in popularity. We validated the communities by inspecting the graph layouts, confirming that individuals coded with the same community were, in fact, laid out close to one another. Altogether, this constitutes a robust characterization of the community structure of the TOpoli2014 network.

3.4.3 Models

Comparing Models of Political Memes

What is the analytical scale of online political campaigns? The results so far have yielded memetically-coded tweets and community codes. These data are ready to be subjected to the hierarchical modelling process. There are seven models to compare, which are themselves hierarchically nested within one another. Table 3.1 lists the specifications of each model.

The simplest model predicts tweet diffusion based on memetic content, alone. The next level of hierarchy includes the community in the prediction, reflecting the relationship between tweets and the community they originated within. This is examined hierarchically using one intercept per community, both with and without a slope for memes. The same pattern of analysis is applied to individuals; one intercept per individual, both with and without a slope for memes. Finally, individuals are nested within their communities, to be tested with and without a slope for memes. The model that specifies individuals within communities and a random slope for memes represents the complete hierarchy that was initially presented in Figure 3.4.

The outcome for each of these models will be tweet popularity, as measured by retweet frequency. The model fit results, presented in Table 3.3, will be compared using the Bayesian Information Criterion (BIC), which indicates the likelihood of the model conditioned upon the data (Kass & Raftery, 1995). Whichever model fits best will inform the scale at which the phenomenon operates - and therefore, which scale is analytically appropriate for evaluating online campaign phenomena. The results, presented in Table 3.2, will enable us to interpret the phenomenon. The fundamental questions are: 1) which model is the best predictor of popularity; and 2) what does that model tell us about the phenomenon?

Figure 3.25: Now that the components of the model have been operationalized, the original conceptual model is revisited to be annotated with variable names.

Model                               GLMER Specification
just meme                           retweets ∼ meme
per community                       retweets ∼ (1 | community) + meme
per individual                      retweets ∼ (1 | individual) + meme
per individual in community         retweets ∼ (1 | community/individual) + meme
memes per community                 retweets ∼ (1 + meme | community) + meme
memes per individual                retweets ∼ (1 + meme | individual) + meme
memes per individual in community   retweets ∼ (1 + meme | community/individual) + meme

Table 3.1: These models were specified for use with the R lme4::glmer function. Within these model equations, the parenthetical statements declare random effects, the “1” represents the intercept, “meme” is the effect coded predictor identifying a tweet as memetic or not, and “individual” and “community” are identifier variables used to depict hierarchical relationships. The just meme model was specified and tested with GLM, not GLMER.

Meme Effects

Does the presence or absence of a meme within a tweet predict diffusion by way of retweets? To answer this question, we refer to Table 3.2 for the results of the model tests. The simplest model, labelled Just Meme, considers the effect of memes across all tweets, irrespective of the author or the community the tweet originated in. In the memetic nomenclature, this model could be seen as a test of meme fitness, operationalized as retweet popularity. The raw estimate is given as a log-transformed value, which must be exponentiated[32] to obtain an estimate that may be interpreted as a retweet count. In this model, there was an average effect of memetic content, b = 0.050, SE = 0.005, z = 10.55, p < 0.001, exp(b) = 1.052. Because memetic content is effect coded[33], the amount that a memetic tweet gains relative to a non-memetic tweet is double the estimate. Since retweets are an event count, the effect should be rounded to the nearest integer.[34] Altogether, the Just Meme model result should be interpreted to mean the mere presence of a meme within a tweet increases the number of retweets by 2, on average.

To gain a better understanding of the effect of memes, we look to the Memes per Individual in Community model, which represents the entire campaign phenomenon by specifying random intercepts corresponding to 1) the community within which the meme originated; and 2) the author, nested within their community. The model further specifies random slopes for memes at both levels. Even accounting for this greater specification, there was still an average effect of memetic content in the Memes per Individual in Community model, b = −0.125, SE = 0.024, z = −5.293, p < 0.001, exp(b) = 0.882. When this effect is fully unpacked, the mere presence of a meme within a tweet still boosts expected retweets by 2.
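The arithmetic described in this passage and its footnotes can be traced step by step. The following Python sketch exponentiates the estimate, doubles it for the ±1 effect coding, and rounds half up. The order of operations (exponentiate first, then double) is an assumption, chosen because it reproduces the reported value of 2 for both models; the function names are hypothetical.

```python
import math


def round_half_up(x: float) -> int:
    """Round to the nearest integer; a fractional part of 0.5 or more rounds up."""
    return math.floor(x + 0.5)


def effect_coded_gain(b: float) -> int:
    """Retweet gain of a memetic tweet, following the steps given in the text.

    Assumption: the estimate is exponentiated first and then doubled,
    because the predictor is effect coded as -1 / +1 (a span of 2).
    """
    exp_b = math.exp(b)      # back-transform from the log scale
    doubled = 2 * exp_b      # effect coding spans two units
    return round_half_up(doubled)


# Estimates reported in the text for two of the seven models.
# exp(0.050) evaluates to about 1.051; the text reports 1.052,
# presumably computed from the unrounded estimate.
estimates = {
    "Just Meme": 0.050,
    "Memes per Individual in Community": -0.125,
}

for model, b in estimates.items():
    print(model, round(math.exp(b), 3), effect_coded_gain(b))
```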
The omnibus Memes per Individual in Community model contains the Just Meme model within it and, as such, the two may be contrasted by examining their Bayesian Information Criteria (BIC). According to Kass & Raftery (1995), a BIC difference greater than 150 constitutes “very strong” evidence in support of the better model. Because the actual ∆ BIC is 1,935,751, we would favor the Memes per Individual in Community model - but our interpretation regarding memes is largely unchanged no matter which model we consider. Memes appear to be an important contributor to tweet propagation even after accounting for individuals nested within communities.

[32] exp(x) should be interpreted as: the base e raised to the power x.
[33] Non-memetic tweets are coded with −1 while memetic tweets are coded with +1. The total magnitude of difference between these values is 2.
[34] The rounding rule increments to the next integer when the preceding floating point portion is 5 or above (i.e. rounding up). Otherwise, the integer portion is not incremented and the floating point is truncated entirely to produce an integer (i.e. rounding down).

Table 3.2: Model Test Results. For each of the seven models, the table reports fixed effects for the Intercept and the Slope of Meme (Estimate, Std. Error, Z Value, P-value, exp(Estimate)); random effects for the Intercept and the Slope of Meme at the community level and at the individual in community level (Variance, Correlation); and the likelihood (BIC).

Model                                BIC
Just Meme                            6320689
Per Community                        6285756
Per Individual                       4389932
Per Individual in Community          4388484
Memes per Community                  6285644
Memes per Individual                 4386362
Memes per Individual in Community    4384938
Table 3.3: BIC Comparisons Among Models

Model                                BIC        ∆ Just     ∆ Per        ∆ Per        ∆ Per Individual   ∆ Memes per   ∆ Memes per
                                                Meme       Community    Individual   in Community       Community     Individual
Just Meme                            6320689
Per Community                        6285756    34933
Per Individual                       4389932    1930757    1895824
Per Individual in Community          4388484    1932205    1897272      1448
Memes per Community                  6285644    35045      112          −1895712     −1897160
Memes per Individual                 4386362    1934327    1899394      3570         2122               1899282
Memes per Individual in Community    4384938    1935751    1900818      4994         3546               1900706       1424
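The BIC comparisons quoted in the text are differences between the BIC values reported for the seven models, so they can be recomputed directly. The sketch below uses the BIC values from Table 3.2; the dictionary layout and the function name are illustrative assumptions, and the sign convention (reference minus candidate, so that a positive value favors the candidate) is chosen here for readability.

```python
# BIC values for the seven models, as reported in Table 3.2.
bic = {
    "Just Meme": 6320689,
    "Per Community": 6285756,
    "Per Individual": 4389932,
    "Per Individual in Community": 4388484,
    "Memes per Community": 6285644,
    "Memes per Individual": 4386362,
    "Memes per Individual in Community": 4384938,
}


def delta_bic(reference: str, candidate: str) -> int:
    """BIC(reference) - BIC(candidate); positive when the candidate has the lower BIC."""
    return bic[reference] - bic[candidate]


# The comparisons discussed in the text.
comparisons = [
    ("Just Meme", "Per Community"),
    ("Per Community", "Memes per Community"),
    ("Just Meme", "Per Individual"),
    ("Per Individual", "Memes per Individual"),
    ("Memes per Individual", "Memes per Individual in Community"),
    ("Just Meme", "Memes per Individual in Community"),
]
for reference, candidate in comparisons:
    print(f"{reference} -> {candidate}: {delta_bic(reference, candidate)}")

# The model with the lowest BIC is preferred.
best_model = min(bic, key=bic.get)
print("lowest BIC:", best_model)
```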

Community Effects

The Per Community model specifies a random intercept for each community, such that communities may differ in their average rates of retweeting. Performing a BIC comparison against the Just Meme model, we find a BIC difference of 34,933, which is very strong evidence in support of the Per Community model. However, the Memes per Community model is even better than that, with a “strong” BIC difference[35] from the Per Community model of 112. The Memes per Community model extends the Per Community model by including a random slope for memes, such that the inclusion of memes within tweets could be examined on a per-community basis. There was an average effect of memetic content in the Memes per Community model, b = −0.343, SE = 0.070, z = −4.916, p < 0.001, exp(b) = 0.710. The random intercept for community accounts for a relatively large amount of variance, var = 4.503, while the random slope for memes accounts for a relatively small amount of variance, var = 0.158. The Memes per Community model was itself nested within the omnibus Memes per Individual in Community model and the BIC difference between them is 1,900,706 in favor of the omnibus model. In the omnibus model, we find that the random intercept for communities explains a relatively large amount of variance, var = 1.833, and an extremely small amount of variance is attributable to the random slope for memes, var = 0.002. On this basis, communities systematically differ according to their rates of retweeting, and there is an extremely small but detectable amplification effect of memetic content attributable to the unique correspondence between memes and their respective communities. Altogether, this is evidence to suggest that communities play a distinct role in the ultimate popularity of election campaign messages.

Individual Effects

The Per Individual model specifies a random intercept for each individual, such that individual authors may differ in their average rates of retweeting. Performing a BIC comparison against the Just Meme model, we find a BIC difference of 1,930,757, which is very strong evidence in support of the Per Individual model. However, the Memes per Individual model is even better than that, with a “very strong” BIC difference from the Per Individual model of 3,570. The Memes per Individual model extends the Per Individual model by including a random slope for memes, such that the inclusion of memes within tweets could be examined on a per-individual basis. There was an average effect of memetic content in the Memes per Individual model, b = −0.105, SE = 0.015, z = −6.866, p < 0.001, exp(b) = 0.900. The random intercept for individual accounts for a relatively large amount of variance, var = 1.711, while the random slope for memes accounts for a relatively small amount of variance, var = 0.279. The Memes per Individual model was itself nested within the omnibus Memes per Individual in Community model and the BIC difference between them is 1,424 in favor of the omnibus model. In the omnibus model, we find that the random intercept for individuals within community explains a roughly equivalent amount of variance, var = 1.650, and a relatively small amount of variance is attributable to the random slope for memes, var = 0.279. However, in contrast with the community effects, the random slope for memes in the individual effects accounts for a relatively larger amount of variance, suggesting that memetic content is subject to greater amplification when certain individuals are responsible for authoring it. On this basis, individuals systematically differ according to their rates of retweeting, and there is a small but detectable amplification effect of memetic content attributable to the unique correspondence between memes and their authors. Altogether, as with communities, this is evidence to suggest that individuals play an important role in the ultimate popularity of campaign messages.

Even more so than the community effects, there would appear to be an interesting correspondence between individuals, the memes included in the tweets they author, and the retweets those tweets receive when they include memetic content. These findings could be interpreted to mean that when the “right” individual composes a tweet containing memetic content, and when that individual is a member of the “right” community, then there is a systematic and reliable amplification that occurs beyond the average memetic benefit to determine the ultimate popularity a tweet receives.

[35] The “strong” label is described in Kass & Raftery (1995).

3.4.4 Summary of Results

In summary, we coded tweets as memetic or not based on the presence or absence of the top 500 n-grams, as scored using Pointwise Mutual Information. The social graph was partitioned into 55 communities using the Louvain community detection method. Effects were found at all scales that explained retweet rates:

• Memes present within tweets were found to be predictive of retweet outcomes, irrespective of the other phenomena accounted for in the model.

• Communities were found to be predictive of retweet outcomes, above and beyond average memetic effects.

• Individuals were also found to be predictive of retweet outcomes and a substantial amount of variance is explained by memetic effects at the individual-within-community level.

The full hierarchical model, in which tweets are nested within authors and communities, was determined to be the most likely of all models examined, according to Bayesian Information Criterion comparisons.
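The memetic coding summarized above rests on Pointwise Mutual Information, which for a pair of words x and y is log2 of P(x, y) / (P(x) P(y)): how much more often the words occur together than their individual frequencies would predict. As a minimal sketch under stated assumptions, the Python below scores bigrams only, tokenizes on whitespace, and effect-codes each tweet as +1 or −1 by the presence of a top-scoring bigram. The toy tweets, the function names, and the restriction to bigrams are hypothetical and are not the procedure used on the TOpoli2014 corpus.

```python
import math
from collections import Counter


def bigram_pmi(tweets):
    """PMI score for every adjacent word pair in a list of tweet strings."""
    unigrams = Counter()
    bigrams = Counter()
    for text in tweets:
        tokens = text.lower().split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    n_unigrams = sum(unigrams.values())
    n_bigrams = sum(bigrams.values())
    scores = {}
    for (w1, w2), count in bigrams.items():
        p_xy = count / n_bigrams
        p_x = unigrams[w1] / n_unigrams
        p_y = unigrams[w2] / n_unigrams
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores


def code_tweets(tweets, top_k):
    """Effect-code tweets: +1 if a tweet contains a top_k PMI bigram, else -1."""
    scores = bigram_pmi(tweets)
    top = set(sorted(scores, key=scores.get, reverse=True)[:top_k])
    coded = []
    for text in tweets:
        tokens = text.lower().split()
        has_meme = any(pair in top for pair in zip(tokens, tokens[1:]))
        coded.append(1 if has_meme else -1)
    return coded


# Hypothetical toy corpus, for illustration only.
tweets = [
    "stop the gravy train",
    "the gravy train rolls on",
    "vote on the transit plan",
    "the mayor and the plan",
]

scores = bigram_pmi(tweets)
for pair in sorted(scores, key=scores.get, reverse=True)[:3]:
    print(pair, round(scores[pair], 2))
print(code_tweets(tweets, top_k=1))
```

On a corpus this small, the highest PMI goes to a pair of words that each occur only once, which illustrates why PMI rankings are usually applied to large corpora or combined with a minimum-frequency cutoff.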

3.5 Discussion

3.5.1 Phenomenological Scale

When analyzing the forces that influence the popularity of political campaign messages, our results suggest there are detectable effects at all scales - the memetic, the individual, and the community - though these effects are not equivalent. The model comparison points towards the fully-hierarchically-nested model as the most likely, given the data. Simpler models examined on their own are also explanatory, permitting the investigation of the relative effects of different scales.

Tweets containing memes are more likely to be retweeted. When those tweets are shared within certain communities, the effect is amplified. The effect is further amplified when considering the individual; there is some process involving memetic content with a special relationship to the individual who authored it. On the basis of the variance explained by memes at the individual level in the Memes per Individual in Community model, this is probably the most interesting result and the one which I am most interested to explore further.

An election campaign is a phenomenon that spans all these scales. Campaign slogans form a memetic connection from individual awareness to collective political mandate, by way of popular aggregation. Election campaign analyses that focus on memes alone will miss something. Indeed, one of the shocking events of the 2016 US Presidential Election involved the hybrid use of individual personality traits for the “micro-targeting” of campaign messages. During the ethical reckoning of the 2016 micro-targeting scandal - which is far from complete as of the writing of this dissertation - there is a sense that the pairing of individual characteristics with targeted content is somehow a violation of privacy, ethics, democratic norms, or even something else. The results of the current work suggest it is possible to go even farther by taking the social network community into account. If we were forced to choose a single predictor of memetic tweet diffusion, it would have to be the individual author - because both the intercept and the memetic slope at the individual level would appear to explain a preponderance of variance. However, methods that focus on individuals, as with so-called “influencers,” are likely to be incomplete, just as content-focused analyses are also incomplete. Perhaps unsurprisingly, methods that exclude something are going to miss something - and so a cross-disciplinary approach is indicated.

3.5.2 Agency of the Individual

Based on these results, the individual appears to possess the most interesting influence over outcomes, though the effect is complex. I’d like to discuss the notion of individual agency by briefly mentioning the Situationist criticism of media and culture. Debord wrote about the degradation of society through its reduction to representations, which he called the Spectacle (Debord, 1967). Images and media come to replace lived experience, separating our conscious situatedness from the of life itself. The process by which we descend, according to Debord (1967), is gradual as we learn to accept this mediated replacement and participate in it. We come to be driven by the pursuit of the spectacle - and are robbed of our humanity and agency, in the process. In the age of the celebrity social media influencer, I can appreciate that Debord’s description was far ahead of its time. Debord’s account is also blisteringly cynical - and, according to my reading, it evokes the involuntary contagion model of memes with an emphasis on the pathological and parasitic. The current work could be used to respond to Debord, in which we find that the individual is hierarchically elevated above the meme. It’s inescapable that we are the authors of memes - at least, some of us are. I am willing to accept the preponderance of Debord’s criticism; increasingly, it does appear that our reality is constructed. However, I do not arrive at the same dismal Situationist conclusion. Ultimately, we are the architects of our reality; Debord, as an author, nearly-tautologically proves this point by publishing his work, thereby influencing my current thinking. And I - circularly - reject Debord by refusing to accept his entire argument, instead adapting it to fit my own . Debord, like McLuhan, reminds us of the medium. Indeed, media can influence us - but media objects are created and the influence they exert is not deterministic.
One non-fatalistic implication of individual agency is that, to the extent individuals do matter, then leadership also matters. Thus, we arrive once again at de Tocqueville’s comments regarding the aloof political class towering above the emotionally-inflamed crowds. Meanwhile, Jung points towards the capacity for shared symbols, which are themselves mediated, to assist each of us in our personal process of individuation. Indeed, mandalas are just another kind of imagery that Jung believed to be important for expanding awareness, not for reducing it.

In my final analysis, individual agency is the most important.

3.5.3 Influence of Collectives

According to these results, the collective is also influential over outcomes. Community membership was a strong-enough predictor to appear to serve as a good proxy for the individual. Even if you don’t know who the individuals are, it is still possible to predict tweet outcomes by observing group membership. This result suggests a relationship between the collective, writ large, and the symbols they use. Whatever it is that covaries among these social network communities is one construct that influences outcomes. Our model didn’t specify anything about the communities, apart from their category code, so we have no additional parameters to use. While this could be criticized as underspecification, I would argue that this approach is unbiased and constitutes the first step towards further specification. In fact, this broad and unbiased specification has precedent in the Collective Intelligence literature (Woolley et al., 2010). A reliable statistical effect is found when teams collaborate and their performance is measured - but this effect is not explained by either the individual or the aggregate intelligences of the team members. Initial work in the development of the Collective Intelligence literature simply reported this statistical artifact - and subsequent work has sought to characterize it more precisely. The current work also points towards future directions in the specification of these collectives. Another interesting question to consider is causal direction: whether collectives are formed due to the covariance of consciousness or the other way around. Perhaps membership in a collective will precipitate future cohesion, based on a variety of social psychology theories. For example, many minimal group paradigms use a basic symbol, like the attribution of significance to a coloured hat or shirt, in order to affect an individual’s attitudes about a group (e.g., Sherif et al., 1961).
When the most basic attention is paid to the most basic collective symbols, groups begin to behave differently. In this context, it is interesting to consider a minimal groups induction in our midst: the red MAGA cap sold by Donald Trump. We have every reason to expect those hats to create a minimal group, just as similar symbols have in other contexts. What causes one to purchase and wear a red cap? What are the downstream consequences of seeing members of a community wearing those hats? The hat is not an online meme but it has several memetic characteristics, both aesthetic and informational. The treatment of the hat, as a symbol, by the collective that identifies with it - and also by collectives that reject the symbol for its associations - is a campaign dynamic that is likely to operate very similarly to what we have observed online.

3.5.4 Covariation of Symbols and the Unconscious

Let’s revisit Jung’s method for the symbolic analysis of the imagination, which originally motivated the Pointwise Mutual Information hypothesis for detecting memes in a text corpus. There are several results that support the utility of the PMI-meme method, including the high (79%) detection rate of recognizable memes and consequent prediction of retweet outcomes. Throughout the model testing process, the average effect of memes remains a statistically significant factor, even when successfully attributing variance to other hierarchical factors. In all likelihood, this investigation can be pushed farther - potentially without even needing to collect more data. Rather than a simple effect code for the presence or absence of memes, we could apply categorical codes to tweets indicating which meme is present in a tweet. This would provide vastly more information for our models. In particular, I am excited to consider the interface between communities and certain kinds of content. My intuition suggests that this unexplored avenue is likely to be illuminating. From my anthropologically-embedded position within my own social networks, I have seen that different communities respond differently to different messages. Additionally, this intuition would seem to be supported by the literature on micro-targeting (e.g., Youyou et al., 2015). By digging deeper into the content-by-community mechanics of online campaign phenomena, it ought to be possible to further develop Jung’s ideas about the collective unconscious, as well. According to Jung, we cannot achieve awareness of the unconscious - and that which we do become aware of is what makes us individuals; our individual consciousness separates us from one another. So what is a collective that is collectively aware? To Jung, this is an impossible, paradoxical condition.
Both memes and collectives are able to predict sharing outcomes - and perhaps we can interpret this in support of Jung’s method. To the extent that the widespread sharing of a tweet signifies something of shared importance to many people, this does provide evidence of a relationship between collectives and symbols. Furthermore, to the extent that widespread sharing and majorities are related to one another, votes and memes both embody this shared relationship. The popular vote, itself, represents the aggregation of the unconscious - and the democratic result does rise to the level of consciousness, no matter what Jung says. Perhaps as we come to collectively know ourselves better, the paradoxical relationship between individuation and the explicit knowledge of the collective unconscious offers a theoretical explanation for political polarization and the divisions observable in democracies. Ultimately, for the purposes of identifying memes using Pointwise Mutual Information, I am extremely satisfied by Jung’s method for symbolic analysis of the imagination. I am not yet ready to close the book on the relationship between symbols and the collective unconscious.

3.5.5 Symbols that Emerge

Something qualitative can be said for the actual symbols to emerge from this analysis. Since I manually read through the 100 top-scoring memes, I effectively performed a survey of liminal, collective awareness. This awareness is liminal at best because it is unlikely that any single individual, apart from myself, would have been exposed to all 100 of the memes I reviewed. These symbols can still be thought of as “lurking” just beyond awareness, fulfilling some of Jung’s description of the unconscious. Which symbols do we find? There was a surprising amount of hate speech and Nazi symbolism - some in relation to the campaign; some not. Surprise number 1: this was 2014, well before the 2016 US presidential election in which we became collectively aware of the resurgence of certain social pathologies. Surprise number 2: this happened in Canada, in Toronto, which is not known for a culture of hatred. Nevertheless, Jung warns us about this possibility. In his analysis of archetypes and the collective unconscious (Jung, 1968), he points toward the manner in which nations can be overcome by what he characterizes as “psychic epidemics.” Jung says:

Not much is needed: love and hate, joy and grief, are often enough to make the ego and the unconscious change places. Very strange ideas indeed can take possession of otherwise healthy people on such occasions. Groups, communities, and even whole nations can be seized in this way by psychic epidemics.

3.5.6 Misinformation and Propaganda

Recalling the influence of the individual and the importance of leadership, it is interesting to consider the failure case in which bad leaders are elected. One useful diagnostic criterion for bad leadership is the deliberate use of deception and misinformation for the purposes of persuasion, which can be called propaganda. At a minimum, this violates de Tocqueville’s prescriptions. As with any other information, misinformation may spread memetically - and in this way, memes could be thought of as signifying a “psychic epidemic.” Memorable lies include those that tap into the collective unconscious. Emotions of disgust, fear, and hatred can be stoked by appealing to our basest impulses. As spoken by the Nazis who overran Charlottesville, VA, USA in 2017: “blood and soil,” which refers to genetics and geographic sovereignty. Purity, whether ethnic or otherwise, has been associated with morality through a visceral disgust mechanism (Schnall et al., 2008). I examine some aspects of disgust in greater depth during the Urban Legends chapter of this dissertation. I will also emphasize that every lie has a creator; Spectacular society is authored. The very mechanism of authorship and misinformation has been folded into modern warfare (Haines, 2015). Leaders who lie are toxic to democracies. This represents the opposite of what de Tocqueville admires in his discussion of the political class, uninflamed by popular passions as they may be (or not). Populist leaders would disappoint de Tocqueville, who was suspicious of the “tyranny of the majority.” In my estimation, misinformation can bring about the antithesis of de Tocqueville’s model of democracy.

3.6 Conclusion

We have found that campaign phenomena are hierarchical and effects are observable at all scales. Both symbols and collectives are related to popular outcomes. However, it is the individual who has the strongest claim to agency by way of the most interesting hierarchical interface with memes. Texts, when they are ranked by the information they contain, can be classified as memes at a good rate - an effect that recalls Jung’s suggestions for analyzing symbols of the imagination. Both Jung and de Tocqueville are concerned that democratic systems can be overcome by popular passions and, on the basis of the current work, I believe this concern is distressingly well-founded. Ominously, the current work suggests there are untapped opportunities to push these effects farther. In the final analysis, when the goal is to maintain a healthy democracy, we had better be considerate in our choice of individuals to lead because they have the power to author new through the symbols they form and the unconscious they unleash.

Chapter 4

Urban Legend Propagation

4.1 Introduction

This chapter is an investigation of Urban Legend propagation and the computational modelling of psychological phenomena. An urban legend is an apocryphal tale that the storyteller claims is true even though it might not be. Urban legends are a type of meme that can be passed from person to person by word of mouth. The work begins with a finding from the literature: Urban legends propagate farther when they are disgusting (Eriksson & Coultas, 2014). This is an example of a phenomenon known as emotional selection (e.g., Heath et al., 2001), in which the emotion evoked by a message can influence how the message is treated. The current work is a conceptual replication of urban legend transmission based on a series of studies conducted in Eriksson & Coultas (2014), which we conducted as a computational simulation. We extended their empirical models to generalize beyond the laboratory in an effort to achieve better ecological validity. This work offers two primary contributions. The first contribution pertains directly to the emotional selection of urban legends, which will be explored in a social network context. The second and more fundamental contribution pertains to the science of computational modelling for social psychological phenomena. While the investigation of urban legends is of interest in the context of memes and social networks, the computational modelling developments are intended to apply generally throughout psychological science.

4.2 Background

4.2.1 Key Terminology

This work builds upon the serial reproduction task, which may be used to observe story-passing behaviour in a laboratory setting (see next section, Serial Reproduction Task). The key difference in our study was the use of a different network topology, replacing the serial reproduction task with one that more closely resembled an online social network. Our model employed an epidemiological contagion mechanism which caused urban legend stories to diffuse through this network like the viral spread of diseases.


Serial Reproduction Task

This work extends the Serial Reproduction Task, developed in Bartlett (1932) for the psychological examination of memory and message passing.[1] The Serial Reproduction Task, which was the basis for Eriksson and Coultas’ study, functions like a children’s game called “telephone operator,” in which children line up and whisper a message to one another along the line. The message propagates, one child at a time; child A tells child B, who tells child C, and so on. Usually, by the end of the game, the starting message has become mutated and distorted beyond recognition. A Serial Reproduction Task can be conducted using paper or another medium offering greater transmission fidelity than whispering. Although it is easily illustrated by a child’s game, the Serial Reproduction Task is a formal research method for studying word of mouth phenomena with rigorous scientific controls.

Communication

Eriksson and Coultas described behaviour in the serial reproduction task using terms from early research by Bell Telephone on their communication network (Shannon, 1948). In this framework, there is a transmitter and a receiver, the channel that connects them, and the message that travels along that channel. Shannon (1948) includes a noise source, which manifests in the children’s telephone game as interference that produces unexpected results. Shannon’s work is also the gateway to information theory (e.g., Jaynes, 1957), which can serve to characterize the amount of signal that is lost or recovered in terms of a quantifiable informational property.
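The transmitter, channel, receiver, and noise source described above can be combined into a small simulation of the telephone game. The following Python is a minimal sketch under stated assumptions: noise is modelled as independent random character substitution, fidelity is the fraction of character positions that still match the original, and the message, chain length, and error rate are hypothetical values. It is not the procedure used by Eriksson and Coultas.

```python
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz "


def noisy_channel(message, error_rate, rng):
    """Send a message through a channel; each character is independently
    replaced by a randomly drawn symbol with probability error_rate."""
    received = []
    for ch in message:
        if rng.random() < error_rate:
            received.append(rng.choice(ALPHABET))  # the noise source
        else:
            received.append(ch)
    return "".join(received)


def serial_reproduction(message, chain_length, error_rate, seed=0):
    """Pass a message down a chain of participants, one noisy hop at a time.

    Returns the message held at each position, starting with the original.
    """
    rng = random.Random(seed)
    history = [message]
    for _ in range(chain_length):
        history.append(noisy_channel(history[-1], error_rate, rng))
    return history


def fidelity(original, received):
    """Fraction of character positions that still match the original."""
    matches = sum(a == b for a, b in zip(original, received))
    return matches / len(original)


history = serial_reproduction("the cat sat on the mat",
                              chain_length=10, error_rate=0.05)
for hop, text in enumerate(history):
    print(hop, f"{fidelity(history[0], text):.2f}", text)
```

Because each participant only sees the previous participant's version, errors are inherited down the chain, which is the sense in which the starting message tends to drift as it propagates.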

Networks

Graph Theory is the branch of discrete mathematics that deals with networks, which is suitable for describing the telephone networks of the 1900s and the online social networks of the 2000s - and, for that matter, the bridges of Koenigsberg in the 1700s (Euler, 1953). The serial reproduction task is an example of a very simple communication network. Graph theory provides the vocabulary[2] for describing exactly how a serial reproduction experiment is topologically laid out. In graphical terms, the serial reproduction task consists of a graph that is directed and acyclic: “directed,” meaning all the graph edges feed forward in a specific direction; and “acyclic,” meaning there are no cycles or loops. Every person is represented as a node or vertex in the graph and every connection to another node is called an edge. In Bartlett’s Serial Reproduction Task, for each node there is only one other node that transmits a message and there is only one recipient; thus nodes have 2 edges, which is referred to as a degree of 2. As this is a directed graph, in which the edges point in a specific direction, we say that these nodes have an in-degree of one and an out-degree of one.

1In this work, Bartlett also conducted a study of images, which has renewed relevance to online phenomena in modern times. The current work does not build on this pioneering experimentation - but future work ought to.
2For completeness, the term plot is not synonymous with graph, despite the colloquial interchangeability of the terms. Throughout this dissertation, a plot refers to a data visualization, which is typically laid out on a Euclidean plane, whereas a graph refers to a network, which itself may be plotted for the purpose of visualization. Irrespective of whether the graph is plotted, the mathematical data structure that represents a network is known as a graph.

Chapter 4. Urban Legend Propagation 73

Contagion and Epidemics

Simple contagion3 is a process that involuntarily results in transmission between nodes by mere contact alone (Le Bon, 1895). A contagious process does not involve discretion, decision making, or choices; it operates automatically like an epidemic (e.g., Goffman & Newill, 1964). Practically nobody chooses to become sick with influenza; to the contrary, most people will make an effort to avoid contracting the flu (e.g., Butler, 2009). This type of pathogenic contagion is studied in the field of epidemiology, which has produced a robust literature on the diffusion of epidemics (e.g., Kermack & McKendrick, 1927; Hethcote, 1974, 1994). Epidemiological modelling can be generalized beyond contagion to include processes that involve imitation, choices, and decisions (e.g., Rogers, 1962) or, indeed, “whatever is to be diffused” (Granovetter, 1973).

Other forms of contagion also exist. Emotional contagion occurs when person A “catches” the emotions of person B (e.g., Hatfield et al., 1993). Fear spreads through social animal groups, not because the animals simultaneously choose to feel fear but through the process of contagion. Emotional contagion lies at the heart of more sophisticated social processes like sympathy, perspective-taking, and altruism (De Waal, 2008). Social contagion occurs when person A “catches” the beliefs of person B - due, again, to a process other than choice. A classic example of social contagion comes from Asch (1955), who displayed lines of varying lengths on a screen and induced participants to produce incorrect answers on the basis of group pressure. Roediger et al. (2001) induced false memories, also through social influence, when confederates systematically fed incorrect answers to participants. In these examples, ideas are transmitted socially - and, at least in these cases, it is only by willfully resisting these pressures that misinformation is not permitted to propagate.
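The epidemiological literature cited above can be made concrete with the susceptible-infected-recovered (SIR) compartment model associated with Kermack & McKendrick (1927). The sketch below is a simple Euler discretization offered for illustration only; the function name `simulate_sir`, the step size, and the parameter values in the usage note are assumptions and are not drawn from the cited sources.

```python
def simulate_sir(population, initial_infected, beta, gamma, steps, dt=0.1):
    """Discrete-time approximation of the SIR compartment model.

    beta  - transmission rate per unit time (assumed value supplied by caller)
    gamma - recovery rate per unit time (assumed value supplied by caller)
    Returns a list of (susceptible, infected, recovered) tuples.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    trajectory = [(s, i, r)]
    for _ in range(steps):
        # Contagion requires only contact: new infections scale with S * I.
        new_infections = beta * s * i / population * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        trajectory.append((s, i, r))
    return trajectory
```

For example, `simulate_sir(1000, 1, beta=0.3, gamma=0.1, steps=2000)` traces a single outbreak. No agent in this model makes a choice, which is the sense in which the process is a strict contagion.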

Memes

Memes, introduced in the 7th chapter of Dawkins (1976), offer a useful theory for characterizing the messages passed through a communications network. Dawkins initially regarded memes and genes as being fairly analogous. The idea of memes has been interpreted both more radically (Blackmore, 1999) and more stylistically (Dennett, 1990; Dawkins, 1993). A meme is a unit of culture with aspects from media studies and psychology, necessarily incorporating the human response to the meme within the system. A meme is an idea that can be repeated. In contrast with the definition used in Chapter 3 of this dissertation, a meme doesn’t need to be a text; it could be anything from a musical jingle to a stone tool (e.g., Boyd et al., 2013). A meme can be mutated and changed, even unrecognizably so. For example, poems are typically altered, either intentionally or otherwise, and this can manifest by being mis-remembered or mis-communicated (Boyd & Richerson, 2000).

Memes rose to prominence in light of the viral social network phenomena of the early 2000s (Blackmore, 2000). The overbearing popularity of memes raises the question of whether memes spread as a contagious process (Blackmore, 1998). According to a strict interpretation of contagion, I believe the answer is no (Hodas & Lerman, 2014); this process more closely resembles social learning. Nevertheless, because the spread of memes does involve exposure and transmission, which are elements of contagion, the phenomenon may be subjected to epidemic analysis.

3As opposed to complex contagion (Centola & Macy, 2007).

Figure 4.1: The “I can has cheezburger?” image macro is a historically significant image that gained a lot of attention in its time. It is an example of a LOLcat and the corrupted language was emulated and adopted by many other image macros. This style of image macro is still used in 2019, although the style is now referred to as an “advice animal.”

In the online context, a colloquial “meme” refers to a picture with a text overlay that is embedded within the image so that it is shared as a single digital object (e.g., Vickery, 2014). However, this particular type of meme is more accurately called an image macro (SAclopedia, 2004), which gained popularity through an early online forum called Something Awful (Kyanka, 1999). The premier example of an image macro is the “I can has cheezburger?” cat (Nakagawa & Unebasami, 2007), depicted in Figure 4.1, which produced a prolific cascade of online image sharing. For the present purposes, the meme is the message from a media communication standpoint; it’s what is transmitted and what will ultimately be the subject of the current work.

Social Transmission

Social transmission research incorporates individual differences and decision making in the study of message passing phenomena (Nicol, 1995; Berger & Milkman, 2010). Social transmission, also known as word of mouth, can occur online, in person, by being overheard on the street, via broadcast media, and in many other forms (Dichter, 1966). Social transmission can even occur among non-human animals (e.g., Whiten et al., 2007).

Social transmission is a relaxed form of social contagion; now choice is allowed. Within this framework, messages propagate because they are interesting, popular, gratifying, persuasive, influential, and so on (e.g., Wojnicki & Godes, 2008). The study of social transmission incorporates a variety of phenomena, including how emotions affect whether a story is transmitted or not; that is, does the message evoke strong emotions (e.g., Chattoe-Brown, 2009)? My own work with user-generated content can be classified as social transmission research (Miller, 2012). Sharing decisions are one of the key behaviours that social transmission investigates and emotional selection is a notable factor in sharing decisions. Emotional selection occurs when a person reacts to something they read and this reaction causes the reader to behave differently with respect to the message (Heath et al., 2001). What is the likelihood of re-transmitting a word-of-mouth message, when accounting for the emotion that the message evokes?

Cascades

Social transmission processes, when viewed from the perspective of the content rather than the individual, are called cascades (Lohmann, 1994; Bikhchandani et al., 1998). Cascades appear as a timeline or lifespan of a media object as it is passed from individual to individual (Fowler & Christakis, 2010). This cascade may consist of a series of actions taken by multiple people or institutions, as in the case of financial cascades (Tedeschi et al., 2012). The actions related to a cascade may be simple, like clicking a share button, or they may potentially be complex. There are countless examples of information cascades, including rumors (Allport & Postman, 1946; Friggeri et al., 2014) and viral media (e.g., BBC Staff, 2006). We have already introduced an early image macro known as “I can has cheezburger?” demonstrating that cascades can describe multiple formats of media objects. Cascades also exist in offline contexts and have been around much longer than the Internet. Cascades can be political in nature, such as the messaging on Twitter that was seen as instigating the Arab Spring (Lotan et al., 2011). Cascades can be massive, involving billions of individuals in the case of online viral videos (e.g., Jiang et al., 2014). Cascades could also have a strong behavioural component (Leskovec et al., 2007), potentially requiring significant action from participants in the cascade (Cheng et al., 2018).

Urban Legends

People have been telling stories for a long time - and those stories are propagated through social transmission dynamics (Edwards & Middleton, 1990). Urban legend cascades can exist entirely apart from online media, like in the case of chain letters, mythology, and folk tales (VanArsdale, 1998). An urban legend is an apocryphal story, often with unknown origins, that is passed via word of mouth (Mullen, 1972). Frequently, the legend includes a moral or a fitness-related safety message. Numerous examples of widespread storytelling crazes can be found in the literature (e.g., Stubbersfield et al., 2015). The Phantom Anesthetist is a suspicion cascade that was documented in rural America in the 1940s (Johnson, 1945). Another “mass hysteria” cascade was contagious laughter, which was treated as a health risk in some regions of the US, in part because it was viewed in the strict contagious sense (Freedman & Perlick, 1979). A more modern urban legend is the story of the tainted Halloween candy, although the literature suggests there is little basis for the story (Best & Horiuchi, 1985).

The phenomenon of urban legends and folklore does not particularly resemble the serial reproduction model; the nature of storytelling is usually not one-to-one (Dundes, 1965). Instead, stories are often told to groups of people, perhaps even in a broadcast context (e.g., Heyer, 2003). In the so-called “real world,” urban legends don’t have the harsh experimental constraints that a serial reproduction task imposes. An urban legend can be thought of as a meme and the subtle variations of the story may be viewed as mutations. The story could be described as an information cascade or it could be framed in terms of individual sharing episodes. Emotional selection factors can influence urban legend propagation, including the details that are emphasized and those that are remembered by listeners.

4.2.2 Reviewing a study on Urban Legends

Having reviewed some of the literature, we now turn to Eriksson & Coultas (2014) for its methodologically refined investigation of urban legend phenomena. Their work will be dissected so it can be “reverse engineered” to construct a computational replication based upon it. In many ways, the current work is a computational extension of Eriksson & Coultas (2014).

Disgusting Stories

Eriksson & Coultas (2014) studied disgust as a moderator for urban legend sharing.4 I was particularly drawn to their methodological separation of the urban legend transmission process into 3 stages that map onto Shannon’s communication theory (Shannon, 1948). In the receive stage of Eriksson & Coultas, participants chose whether or not to read an urban legend. During the encode/retrieve stage, participants engaged in a memory and recall process with urban legend stimuli. Finally, during the transmission stage, the story was actually shared to another participant. Eriksson & Coultas (2014) leveraged the serial reproduction task, a la Bartlett (1932), in which one participant transmits stories to the next participant, in sequence. Eriksson & Coultas found that almost all their urban legend cascades were extinguished before the end of the serial reproduction tasks, but this effect was moderated by how disgusting the urban legends were. Low-disgust cascades ended significantly sooner than high-disgust cascades. This finding is consistent with previous studies (e.g., Heath et al., 2001) but it is also surprising: in the real world, some urban legend cascades survive for years or centuries. The laboratory results do not appear to resemble the sort of “viral cascades” that real-world observation suggests are possible.

Eriksson & Coultas Study 2: choose-to-transmit

In study 2, Eriksson & Coultas investigated whether various characteristics of a story, as rated by participants, affected its transmission properties. Were disgusting stories more likely to be transmitted than their non-disgusting counterparts? And further to that question, would the disgust level of a story affect ratings of how funny the story was?

Methods Eriksson & Coultas generated four story topics: 1) cake; 2) dog; 3) Nepal; and 4) pizza. Each topic was converted into a high and low disgust variation, as depicted in Figure 4.2. In the low-disgust version of the dog topic, a dog received an unexpectedly delicious meal from a restaurant.5 In the high-disgust version, the restaurant cooked the dog and served it to its owner. The high-disgust version of the dog story violates numerous expectations about purity and morality that are associated with appraisals of disgust (Schnall et al., 2008). These four base stories were then presented to 80 Mechanical Turk participants who were randomly assigned to receive stories manipulated by disgust level. Participants rated the stories along several dimensions, including humor, disgust, and the likelihood they would pass along the story. Pass-along ratings corresponded to the transmit phase of communication and may be interpreted as “intention to transmit.”

4As an aside, I was going to conduct a similar study in 2013, although with humor instead of disgust. At the time, I had just finished my master’s thesis (Miller, 2012) and I found that I really wanted to dissect the process of meme transmission. Thanks to Eriksson & Coultas (2014), I didn’t have to do that work; the results they produced are just as useful to me. I love this article and they actually saved me a huge headache.
5Eriksson & Coultas were aware that this manipulation does not strictly involve disgust, alone. In this example, the story is not merely benign but it is also humorous.

Figure 4.2: This diagram depicts the process by which story topics are manipulated to create high/low disgust versions.

Figure 4.3: Eriksson & Coultas (2014); p. 14

Results Eriksson & Coultas found a positive effect of disgust upon transmission intention, which replicates earlier findings that disgusting stories are prolific in online communities (Heath et al., 2001). In fact, several of the measured dimensions, including humor, were related to transmission, which can also be interpreted as a replication of Berger (2011). One of the principal results of this study is a linear model of urban legend transmission. Figure 4.3, which is quoted directly from the article, contains the estimates from their linear model. These maximum likelihood estimates were interpreted as slopes corresponding to the effects of disgust, humor, and so on. This model tells me that on average, most stories are not shared because the intercept is nearly a standard deviation below zero. However, disgusting stories counteracted the general bias against sharing, such that disgusting stories were transmitted at a significantly higher rate. The estimates in Figure 4.3 were used as parameters for the current work’s computational model.
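To show how regression estimates can serve as model parameters, the following sketch evaluates a linear predictor for a single story. The coefficient values are placeholders invented for this illustration; the actual estimates are those reported in Figure 4.3 and are not reproduced here. The function name `predicted_transmission` and the restriction to two predictors are likewise assumptions.

```python
# HYPOTHETICAL placeholder coefficients, for illustration only.
# The real estimates are reported in Figure 4.3 (Eriksson & Coultas, 2014).
COEFFICIENTS = {"intercept": -1.0, "disgust": 0.5, "humor": 0.5}


def predicted_transmission(ratings, coefficients=COEFFICIENTS):
    """Linear predictor: intercept plus slope times rating per dimension.

    ratings maps a dimension name (e.g. "disgust") to a standardized rating.
    """
    score = coefficients["intercept"]
    for name, value in ratings.items():
        score += coefficients[name] * value
    return score
```

With these placeholder values, a story rated at zero on every dimension receives a negative score, reflecting a general bias against sharing, and a higher disgust rating raises the score.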

Eriksson & Coultas Study 3: Receive and Transmit

Study 3 from Eriksson & Coultas (2014) focused only on the receive and transmit stages of the serial reproduction task. Eighty participants were grouped into pairs that were organized into 40 separate serial reproduction tasks. Within each serial reproduction task, participants were assigned a sequence number indicating their order of action. The participant who acted first was referred to as the Generation 1 participant and the one who acted next was Generation 2.

Generation 1 participants were presented with four stories that were written on paper. Participants could choose whether to read each story and then choose whether to transmit them to the next participant. While each story had the chance to live for 2 generations - a cascade of length 2 - the full cascade could only occur when both participants decided to transmit those stories. In practice, most story cascades resulted in a length shorter than 2.

Figure 4.4: This flowchart depicts the research method used in Eriksson & Coultas Study 3. The experimenter delivers 4 stories to the first participant. That participant chooses whether to read any, and subsequently, whether to share any. The stories are then passed to the next participant and the process is repeated.

Within each generation, there are 2 decision steps involving an interaction between the participant and the content. During the receive step, participants looked at the titles of the stories they were given and decided whether to read each story. To complete the receive step, participants actually read the stories they wanted to read. Then, during the transmit step, participants decided whether to share any stories. To complete the transmit step, experimenters collected the stories that were shared and provided them to the next person in the task.

Unlike the previous study, which measured intention to share, in this study participants actually engaged in the transmission activity. This paradigm realistically simulates the filtering effects present in the propagation of urban legends. The flowchart in Figure 4.4 depicts person 1 receiving four stories, choosing whether to read any of them, and finally choosing to transmit the stories to person 2. Presuming person 2 received any stories, they may choose whether to read the stories they received and subsequently choose whether to transmit any.

Eriksson & Coultas (2014) tracked the number of stories that were retained after 2 generations. The plot, reproduced in Figure 4.5, demonstrates that as stories pass from person to person, fewer and fewer stories are retained. Notice that high-disgust stories are retained at a higher rate than low-disgust. By the time the generation 2 participant engages in the transmission step, there are no low-disgust stories left and there are few high-disgust stories remaining. In showing different outcomes based on disgust levels, Eriksson & Coultas (2014) demonstrated evidence for emotional selection both during the receive stage and during the transmit stage.
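The receive and transmit steps of this paradigm can be sketched as a short simulation. This is an illustration under stated assumptions rather than the procedure used in the study or in the current work: the function name `run_chain`, the read and share probabilities, the mix of stories, and the rule that a story must be read before it can be shared are all hypothetical.

```python
import random


def run_chain(stories, p_read, p_share, generations=2, seed=None):
    """Simulate one serial reproduction task.

    stories - list of (title, disgust_level) pairs given to Generation 1
    p_read  - assumed probability of reading, keyed by disgust level
    p_share - assumed probability of sharing, keyed by disgust level
    Returns the number of stories passed on after each generation.
    """
    rng = random.Random(seed)
    in_hand = list(stories)
    retained = []
    for _ in range(generations):
        # Receive step: decide whether to read each story.
        read = [s for s in in_hand if rng.random() < p_read[s[1]]]
        # Transmit step: only stories that were read can be shared (assumption).
        in_hand = [s for s in read if rng.random() < p_share[s[1]]]
        retained.append(len(in_hand))
    return retained
```

Because each generation can only pass along what it received, the number of retained stories can never increase from one generation to the next, which is the filtering effect described above.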

Figure 4.5: Eriksson & Coultas (2014); p. 17

4.2.3 The current work

Chapter 3 involved the naturalistic observation of memes spreading on Twitter, a method that was strongly grounded in the phenomenological world but which included no experimental manipulation. By contrast, Eriksson & Coultas (2014) used strong experimental controls but resorted to the laboratory to accomplish it. The current work is an effort to obtain the best of both worlds, transporting the findings from the laboratory into a new environment that more closely resembles aspects of the real world.

In the experimental sciences, validity is related to the proper controls applied to the proper experimental variables - but these controls can also interfere with validity (Campbell, 1957). Ecological validity, one form of external validity, is a common and sometimes-valid criticism of laboratory studies conducted in the psychological sciences (e.g., Schmuckler, 2001). The controls afforded by laboratory research can be at odds with ecological validity but, as Campbell (1957) points out, both internal and external validity are important criteria for experimental research.

The findings reported in Eriksson & Coultas (2014) demonstrate excellent internal validity; they have expertly dissected the urban legend storytelling episode. However, the use of the serial reproduction task is a departure from the nature of human social networks, which are rarely one-to-one, interfering with the generalizability of the results. Correspondingly, the transmission results do not appear to be consistent with the proliferation of urban legends that are observed in the world, as many urban legends persist for years or even centuries. The network itself is therefore one of the key study elements that can be changed to improve the external validity of the results.

This work considers a program of research consisting of two phases: 1) an internally-valid laboratory phase that produces a refined model; and 2) an externally-valid simulation phase that examines that model in an ecologically-grounded environment. The current work translates the Eriksson & Coultas (2014) urban legend transmission model into a computational modelling framework. This transmission model may then be evaluated under different network conditions that more closely resemble the real world. In contrast with the implausible findings produced by a serial reproduction task, we seek recognizable, epidemic sharing of urban legends to emerge from the very same model specified by Eriksson & Coultas (2014).

4.3 Methods

This methods section begins with a review of the computational modelling literature, then describes how computational modelling techniques were applied in the current work.

4.3.1 Computational Modelling Theory

The computational modelling of complex systems is discussed in many disciplines, each of which has evolved a slightly different vocabulary for saying similar things. The current work uses Agent-Based Modelling (ABM), which is gaining prominence in the Computational Social Sciences community. A different modelling approach called Individual-Based Modelling is better-known within the ecological modelling community. Although these two approaches have largely converged into a single practice, they have very different theoretical roots. I initially ignored the differences between these two literatures but I later determined this was an oversight. The ecological literature provides strong theoretical foundations for modelling decisions that I now believe are essential to computational modelling in the psychological sciences. Therefore, this section begins with a synthesis of these literatures for applications in psychology.

Epistemological approaches to systems modelling

Mathematical models have been central to the scientific investigation of physical systems for millennia (e.g., Dreyer, 1906). When the principles of those models are mechanically embodied, as with astronomical computing mechanisms like the orrery and clock, those systems can be simulated and predictions can be investigated (e.g., Pope, 1804; T. Reid, 1832). More recently, simulation techniques have been applied to social systems, including economies (e.g., Walras, 1874), demographics (Dublin & Lotka, 1925), and societies (Schelling, 1971).

Economics Walrasian models were an early approach to the analysis of complex human systems. Walrasian models can be characterized in terms of their formal approach that frequently involves mathematical functions that are solved in order to compute equilibria that balance the equations. The strengths inherent in early economic models derive from their formal specification, permitting analysis by mathematical methods. A classic example of the Walrasian approach is to compute the price of goods from a supply curve and a demand curve, in which a single price may be found at the intersection of these curves. Critics of equilibrium models - including proponents of Agent-based Computational Economics (e.g., Tesfatsion, 2003) - highlight the following shortcomings: 1) Walrasian systems do not support recursive, micro- and macro-level analysis; and 2) the formal notation can interfere with communication. Early iterations of the current work attempted to apply a Walrasian approach to social scientific research with memes - but without much success.
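The supply-and-demand example mentioned above can be worked through in a few lines. The sketch assumes straight-line curves, with demand q = a - b*p and supply q = c + d*p; this linear form, the function name `equilibrium`, and the numbers in the usage note are simplifying assumptions made for illustration.

```python
def equilibrium(a, b, c, d):
    """Solve for the price where linear supply meets linear demand.

    Demand: q = a - b * p
    Supply: q = c + d * p
    Setting them equal gives p = (a - c) / (b + d).
    Returns (price, quantity).
    """
    if b + d == 0:
        raise ValueError("curves are parallel; no unique intersection")
    price = (a - c) / (b + d)
    quantity = a - b * price
    return price, quantity
```

For instance, `equilibrium(100, 2, 10, 1)` yields a price of 30 and a quantity of 40, since both 100 - 2*30 and 10 + 1*30 equal 40. The answer is obtained by solving an equation, with no individual actors represented anywhere in the computation.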

Individual-Based Modelling Individual-Based Modelling takes patterns of empirical observations and builds models from them (Schank, 2001; Grimm & Railsback, 2005). The Individual-Based Modelling approach was developed in the ecology literature, where regressions and algorithms are used to describe systems of wildlife. Some famous examples of individual-based modelling involve forests of trees (Liu & Ashton, 1995), salmon spawning (Railsback & Harvey, 2002), and the natural world at large. Insofar as many human behaviours are closely related to other animal behaviours - particularly to other primates - the epistemology underlying Individual-Based Modelling is a good match for social psychological research.

Agent-Based Modelling Agent-Based Modelling (ABM) is a discrete, computation-oriented modelling approach that is well-suited to social systems (e.g., Epstein & Axtell, 1996; Epstein, 1999; Bonabeau, 2002; Macy & Willer, 2002; Berry et al., 2002; Bankes, 2002). ABMs are sometimes represented algorithmically as finite state machines or as decision trees consisting of if-then statements. A different term for this approach is multi-agent systems (MAS), as it applies to the fields of robotics, computer science, or manufacturing (Jennings et al., 1998). ABM systems are examined by performing computations in a loop to see what inherently-unpredictable results emerge. These may not be analyzed in situ using the same formal methods that are suitable for classical economic systems.
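The idea of agents as finite state machines that are stepped in a loop can be sketched as follows. Everything in this example is a hypothetical illustration: the two states, the if-then rules, the ring arrangement of agents, and the `leave_prob` parameter are assumptions and do not describe any model from the cited literature.

```python
import random


class Agent:
    """A two-state machine: 'idle' or 'active' (hypothetical states)."""

    def __init__(self):
        self.state = "idle"

    def step(self, active_neighbours, rng, leave_prob=0.5):
        # If-then rules: become active when any neighbour is active;
        # an active agent returns to idle with probability leave_prob.
        if self.state == "idle" and active_neighbours > 0:
            self.state = "active"
        elif self.state == "active" and rng.random() < leave_prob:
            self.state = "idle"


def run(n_agents=10, n_steps=5, seed=0):
    """Step all agents in a loop; return the active count after each step."""
    rng = random.Random(seed)
    agents = [Agent() for _ in range(n_agents)]
    agents[0].state = "active"
    counts = []
    for _ in range(n_steps):
        # Snapshot states so every agent reacts to the same moment in time.
        snapshot = [a.state for a in agents]
        for i, agent in enumerate(agents):
            left = snapshot[(i - 1) % n_agents]
            right = snapshot[(i + 1) % n_agents]
            agent.step((left == "active") + (right == "active"), rng)
        counts.append(sum(a.state == "active" for a in agents))
    return counts
```

The results are obtained by executing the loop rather than by solving an equation, which is the contrast with the equilibrium approach drawn in the text.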

A Brief Review of Agent-Based Modelling

Agent-Based Modelling (ABM)6 is a method for studying complex systems including people, animals, cars, and products (Railsback & Grimm, 2011; Wilensky & Rand, 2015). A key characteristic of the ABM method is the use of an agent population in which every agent implies some sort of computation.

Demographics Dublin & Lotka (1925) studied US population growth using a computational approach that might be the earliest example of ABM I’ve found. Dublin & Lotka (1925) represented 100 million Americans in their simulation and, for each one, computed the consequences of mating and childbirth in order to estimate the US national population over time.

Cellular Automata A precursor to modern agent-based modelling is the computational field of cellular automata (CA), in which each automaton is an abstract unit (a cell) that performs a computation (von Neumann, 1966). John Conway’s Game of Life is a classic example of a cellular automaton that is laid out on a grid, like a checker board, in which each square on the grid changes colours based on the colours of neighboring squares (Conway, 1976). The Game of Life produces unexpectedly-fascinating emergence, illustrating the profound and chaotic relationship between simple rules and complex outcomes. Conway (1976) elegantly demonstrates the non-trivial properties of emergence, extending far beyond mere extrapolation in order to produce unpredictable results.
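The rules of the Game of Life are short enough to state as code. The sketch below uses the standard rules (a live cell survives with two or three live neighbours, and a dead cell becomes live with exactly three); representing the grid as a set of live coordinates on an unbounded plane, and the function name `life_step`, are implementation choices made for this illustration.

```python
from collections import Counter


def life_step(live):
    """Compute the next generation of Conway's Game of Life.

    live - set of (row, col) coordinates of live cells.
    Returns the set of live cells in the next generation.
    """
    # Count, for every cell adjacent to a live cell, its live neighbours.
    neighbour_counts = Counter(
        (r + dr, c + dc)
        for r, c in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    return {
        cell
        for cell, n in neighbour_counts.items()
        if n == 3 or (n == 2 and cell in live)
    }
```

A horizontal row of three live cells becomes a vertical row of three on the next step and then flips back, a pattern commonly called a blinker.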

6ABM is also referred to as Agent-Based Social Simulation, or ABSS.

Sociology Racial segregation was studied in Schelling (1971) using a computational simulation of a virtual neighborhood. The agents in Schelling (1971) were influenced by the effects of homophily: each agent measured the ratio of other-race agents in their vicinity and if a threshold was exceeded, they would move to a new location. By varying the homophily threshold, Schelling’s model demonstrated that even a minor preference for homophily could manifest communities of segregation. As demonstrated by Schelling (1971), which was computed using a chess board, a computer is not a requirement for Agent-Based Modelling. Schelling’s computational simulation convincingly bridges the social world and the mathematical, using parsimonious rules to inspect the complex sociological phenomenon of racial segregation.
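The threshold rule described for the segregation model can be sketched in code. This is a simplified illustration and not a reproduction of Schelling (1971): the wrap-around grid, the eight-cell neighbourhood, relocation to a randomly chosen empty cell, and all default parameter values are assumptions, as are the function names.

```python
import random


def unlike_fraction(grid, r, c):
    """Fraction of occupied neighbouring cells holding the other group."""
    size = len(grid)
    me = grid[r][c]
    neighbours = [
        grid[(r + dr) % size][(c + dc) % size]  # wrap-around edges (assumption)
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    ]
    occupied = [n for n in neighbours if n is not None]
    if not occupied:
        return 0.0
    return sum(n != me for n in occupied) / len(occupied)


def schelling(size=8, empty=0.2, threshold=0.5, sweeps=20, seed=0):
    """Run the threshold rule for a number of sweeps; return the grid."""
    rng = random.Random(seed)
    cells = size * size
    n_empty = int(cells * empty)
    n_a = (cells - n_empty) // 2
    population = ["A"] * n_a + ["B"] * (cells - n_empty - n_a) + [None] * n_empty
    rng.shuffle(population)
    grid = [population[i * size:(i + 1) * size] for i in range(size)]
    for _ in range(sweeps):
        for r in range(size):
            for c in range(size):
                agent = grid[r][c]
                if agent is None or unlike_fraction(grid, r, c) <= threshold:
                    continue
                # Threshold exceeded: move to a random empty cell.
                empties = [(i, j) for i in range(size)
                           for j in range(size) if grid[i][j] is None]
                i, j = rng.choice(empties)
                grid[i][j], grid[r][c] = agent, None
    return grid


def mean_unlike(grid):
    """Average unlike-neighbour fraction across all agents."""
    size = len(grid)
    values = [unlike_fraction(grid, r, c)
              for r in range(size) for c in range(size)
              if grid[r][c] is not None]
    return sum(values) / len(values)
```

Comparing `mean_unlike(schelling(sweeps=0))` with `mean_unlike(schelling(threshold=0.3))` gives a rough indication of how much the neighbourhoods sort themselves under a given threshold.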

Generative Social Science The Sugarscape Model was an important Agent-Based Model from which a wide range of natural phenomena were shown to emerge, including: life and death; sex, culture, conflict; disease; society; wealth; social networks; migration; warfare; trade; markets (Epstein & Axtell, 1996).7 Epstein & Axtell obtained these emergent outcomes through the accumulation of simple rules.8 Sugarscape demonstrated that Agent-Based Modelling is a flexible method that can be applied to complex, human social phenomena using models that are easy to describe while offering practical insights on the phenomena they are applied to.

ABM Toolkits Each of the studies described so far simultaneously developed novel theories as well as a computing mechanism for exploring them, which usually amounted to a custom software implementation built expressly for the task. By the late 1990s, scientists began to experiment with general purpose ABM toolkits to provide common modelling resources to facilitate reuse and communication. An ABM toolkit is a programming environment in which a model is described with computer software that determines how agents behave. The ABM toolkit also provides an environment within which the agents exist, which may also be controlled using computer code. Once the agents and environment are described, the ABM toolkit facilitates the simulation of the model by placing agents into the environment and executing the instructions that cause the agents to act. These simulations, implemented by the ABM toolkit, are executed in order to produce the data that is ultimately analyzed.

There are currently many ABM toolkits, including NetLogo (Wilensky, 1999), REPAST (García & Rodríguez-Patón, 2016), and MASON (Luke et al., 2005).9 Of these toolkits, NetLogo is the most widely used across the social sciences, becoming the lingua franca for ABM work in many fields. The ecologists and individual-based modelling community have also come to use NetLogo, despite originating from a different intellectual lineage (e.g., Swarm Development Group, 2016). From the perspectives of usability and communication, NetLogo provides many tools that are useful to researchers.10

Model Building and Documentation

7The Sugarscape model has been extremely influential to my own thinking.
8Epstein & Axtell (1996) is now considered part of a Generative Social Science trilogy, which includes Epstein (2006) and Epstein (2014).
9I used REPAST to implement Miller (2012) but I found that it was difficult to keep the toolkit running over the course of several years, which interfered with reproducibility. I initially used MASON for the current chapter but, after a year, I actually started from scratch with NetLogo. MASON is extremely general-purpose, which I initially considered to be a strength but eventually this constituted a net cost because I had to write so many basic routines from scratch.
10I’ve tried going my own way, developing some of this stuff from scratch, but I found that this is a dead-end because this introduces too many barriers to sharing work with other researchers. It’s worth it to communicate with a community that already exists.

The model building framework described in Grimm & Railsback (2005) was adapted for the current work. Together with Railsback & Grimm (2011), these procedures provide a complete workflow for conducting a computational modelling study according to the Individual-Based Modelling approach. In order to facilitate communication among computational modelers, Grimm et al. (2006) propose a model description framework called the ODD Protocol, which stands for Overview, Design Concepts, and Details. The current work uses the ODD Protocol for documentation about its models, which will be employed a little later in this section. Altogether, this literature provides a methodology that I have applied to the psychological science domain. I have summarized the model building method used in the current work, which was adapted from the theories of Grimm, Railsback, and others, as follows:

1. Formulate the model using the Overview, Design Concepts, and Details (ODD) protocol.

2. Identify characteristic patterns of the emergent behaviour. This corresponds closely to the Pattern Oriented Modelling process described in Railsback & Grimm (2011).

3. Describe the criteria for pattern matching. For the current work, I am characterizing “pattern matching” as a sort of error-minimization process. Pattern matching can be conceived of in different ways - and I think that developing stronger criteria for pattern matching is a likely future direction of my research.

4. Finally, review the model formulation. Models can be reviewed by scrutinizing the ODD protocol. In this way, the review should occur even before the model is implemented in NetLogo.

Pragmatism in Individual-Based Modelling

Individual-based modelling is characterized as describing a system by describing how the individuals within that system work (Railsback & Grimm, 2011). The descriptive process is recursive, requiring the examination of the system’s effects upon the individual while, at the same time, examining the individual’s effects upon the system. Over the years, I tracked down several lectures by Steven Railsback that were shared online.11 In particular, one lecture that was presented to a South American audience was recorded by a hand-held cellphone camera and uploaded over the course of several incomplete clips. Despite this dubious provenance, I memorialized the following Railsback quote in my notes:

You want a model that is not too simple and not too complex.

Presumably, it is only through expertise that one comes to understand when a model has the appropriate amount of complexity in order to simultaneously be descriptive of a meaningful phenomenon and also simple enough to understand. This quote speaks to the practical sensibility that underlies ecological modelling; it is rooted in the real world, both in terms of the phenomenon and in terms of the model specification. The pragmatic capacity to understand a model after it has been constructed can be appreciated by counter-example. A criticism of contemporary Deep Learning methods is that the models become inscrutable, such that they may solve a problem but, in doing so, they may come to defy interpretation (e.g., Doran et al., 2017). The Artificial Intelligence field has embarked on a new effort in explainable AI, which would permit interpretation for such purposes as scientific inquiry or even ethical analysis (e.g., Charisi et al., 2017). Individual-based modelling, because it is specified by humans in the first place, does not presently suffer from this same problem. However, the potential exists for a model’s complexity to outstrip the scientist’s ability to understand the mechanisms that initially enabled the model to produce the desired outcomes. Individual-Based Modelers must resist this complexity. The best models, according to Railsback, will strike the appropriate balance: “not too simple and not too complex.”

11 Regrettably, all of these lectures have been lost to time. They are not online anymore and, consequently, I don’t have a citation for them.

Strong Inference

If a scientific epistemology doesn’t include a mechanism for inference, then there’s virtually no way of scaffolding towards the truth. However, according to Railsback & Grimm (2011), strong inference can be achieved with the Individual-Based Modelling approach. Therefore, it’s essential for psychological science modelers to understand the tenets of Individual-Based Modelling, even though their actual work may not be ecological in nature. Platt (1964) admonishes scientists to, “...make [intellectual] inventions, to take the next step, to proceed to the next fork, without dawdling or getting tied up in irrelevancies.” The essence of the strong inferential approach is rapid hypothesis falsification, which has been adapted to the computational modelling context by Railsback & Grimm (2011, p. 245):

1. Identify alternative traits (hypotheses) for behaviour.

2. Implement the alternative traits in the ABM, testing the software carefully to get “clean results.”

3. Test and contrast the alternatives by seeing how well the model reproduces the characteristic patterns, falsifying traits that cannot reproduce patterns.

4. Repeat the cycle as needed: revise the behaviour traits, look for (or generate, from experiments on the real system) additional patterns that better resolve differences among alternative traits, and repeat tests until a trait is found that adequately reproduces the characteristic patterns.

At this point, I wish to emphasize the unpredictable nature of the emergent outcomes that may be produced by simple model rules. Works such as Epstein & Axtell (1996) are entirely non-trivial; the emergent outcomes obtained therefrom are impossible to predict a priori. It is much simpler to obtain “unwanted” outcomes from computational models than it is to actually replicate the sort of phenomena that were intended. Inferentially, unwanted model outcomes are tantamount to falsification; when a model produces results that do not converge with empirical observation, the model is wrong - full stop. As such, the strong inference algorithm discussed in Railsback & Grimm (2011) provides a formal framework for quickly falsifying bad hypotheses, discarding bad models and, so long as there are intellectual inventions left to investigate, finding one that works. In computational modelling, most models are both wrong and not even useful.

Pattern-oriented Modelling: A Process for Falsification

A blueprint for strong inference with computational models is called Pattern Oriented Modelling (POM) by Railsback & Grimm (2011, p. 245). The POM procedure is reproduced in Figure 4.6, which I have summarized in my own terms as follows:

1. Start with either the literature or with findings from the real world, perhaps collected in your own empirical work.

Figure 4.6: Railsback & Grimm (2011, p. 245)

2. Identify characteristic patterns of emergent behaviour.

3. Propose theories for behaviours. A theory, for Railsback & Grimm, is a general explanation for the pattern of behaviour, which could be expressed as an algorithm, a linear model, or any other mechanism. Psychology is full of theories.

4. Collect these theories together into an individual-based model, which basically amounts to expressing these theories mathematically. In practice, this is the point at which the model may be implemented with NetLogo.

5. Ask how well the model reproduces the observed patterns. The data may be analyzed like any other psychological study - or like any other natural study.

As of 2019, Null Hypothesis Significance Testing (NHST) remains the standard method in psychological science for testing how well any given model fits with observed results. However, it has become abundantly clear that this method alone is no panacea. In my experience, it does not always make sense to apply NHST in a computational modelling context - just as it does not make sense to apply NHST in every empirical context, either. Pattern Oriented Modelling, which is built upon strong inference, presents an alternative to NHST. To Railsback and Grimm’s question of, “How well does the ABM reproduce observed patterns?” we may leverage any number of criteria to produce an answer. In the current work, convergence is operationalized in terms of a model’s ability to reproduce patterns of results with matching time scales, categorical levels, magnitude, and other features as appropriate. When any of these dimensions deviates, the model will be considered as falsified and it will be discarded. Only when all dimensions match will it be claimed that we have arrived at a hypothesis that will not be rejected.
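The all-or-nothing convergence rule described above can be sketched in a few lines. This is a minimal illustration in Python rather than NetLogo, and it is not the implementation used in the current work: the `Pattern` fields, the `tolerance` argument, and the function names are all assumptions introduced here to make the idea concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    """One characteristic pattern, summarised along several dimensions."""
    time_steps: int     # time scale, e.g. the number of generations
    levels: tuple       # categorical levels, e.g. disgust conditions
    magnitudes: tuple   # one summary value per categorical level


def converges(observed, simulated, tolerance):
    """Return True only if every dimension of the pattern matches."""
    if observed.time_steps != simulated.time_steps:
        return False
    if observed.levels != simulated.levels:
        return False
    # Magnitudes match when each simulated value is within the
    # (assumed) tolerance of its observed counterpart.
    return all(
        abs(o - s) <= tolerance
        for o, s in zip(observed.magnitudes, simulated.magnitudes)
    )


def surviving_models(observed, candidates, tolerance=0.25):
    """Falsify (drop) every candidate that deviates on any dimension."""
    return [
        name
        for name, simulated in candidates.items()
        if converges(observed, simulated, tolerance)
    ]
```

A candidate that matches the time scale and the categorical levels but misses on magnitude is discarded just like one that misses everywhere, which is the sense in which any single deviation counts as falsification.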

Psychological Patterns

Modern Psychology is an empirical science based upon evidence and inference. In large part, psychological science proceeds according to a process of: 1) gathering insights about psychological phenomena; 2) generating models corresponding to hypotheses about those phenomena; 3) collecting new data corresponding to those models; and 4) performing inference based on the statistical properties of those models in the context of those data. In other words, psychology is a pattern-oriented science. Therefore, the challenge of creating Agent-Based Models of psychological phenomena is a matter of incorporating psychological patterns into an ABM framework.

4.3.2 The Current Work

The current work is an extension of the Eriksson & Coultas model of emotional selection operating during a serial reproduction task. I formalized their model as a computational construct suitable for agent-based modelling. Those computations were then adapted to different network topologies; first, a serial reproduction task topology, as described in the background section, and then a social network topology that resembles real-world online social networks, in which individuals have multiple connections. The goal of this work was to detect the emergence of accelerating urban legend propagation, in contrast with the original laboratory findings in which cascades terminated in fewer than two transmission generations. I approached this task over the course of three studies.

1. First, the original dynamics of Eriksson & Coultas (2014, Study 3) were replicated by building a computational model that matches the patterns of the original laboratory study.

2. Next, the scale of the serial reproduction study is increased by a factor of ten to explore whether sample size affects the likelihood of urban legend propagation.

3. Finally, the task is repeated using a network topology that resembles an online social network, in which individuals are permitted to have multiple connections.

Study 1: Replicating Urban Legend Cascades

Study 1 is a replication of Eriksson & Coultas (2014, Study 3), in which the first generation participant would read urban legends and choose whether to transmit; then the second generation participant actually received those stories and repeated the task. In the ABM simulation, schematically depicted in Figure 4.7, there will be many little networks representing story cascades of length = 2. Each serial reproduction task consists of two agents that are assigned as generation 1 or generation 2, which determines the directedness of the graph edge between them. Thus, the agent in generation 1 has one directed edge to the agent in generation 2. There are 40 such dyads for a total n of 80, corresponding to the parameters of the Eriksson and Coultas study.

Figure 4.7: This basic diagram depicts the arrangement of dyads into a sequence. Stories are first provided to Generation 1 participants, then they are provided to Generation 2 participants. There is no sharing between dyads; all sharing occurs within-dyad.

The study is specified using the NetLogo environment (Wilensky, 1999) and the screenshot in Figure 4.8 depicts the user interface developed for the task. Although agents are laid out as a circle, like the spokes of a wheel, the actual topology of the networks is exactly equivalent to the cascades specified in Figure 4.7. At the center of the simulation window is a special experimenter agent who is responsible for distributing stories to generation 1 agents.

Figure 4.8: Study 1; NetLogo Interface. Agents are displayed in the box on the left. Buttons appearing on the right are used to interact with the simulation.

The simulation begins when generation 1 agents receive stories from the experimenter agent. From the perspective of a single agent, they receive 4 stories at random from the experimenter. Then, for each story received, the agent decides whether to read it. Of the stories they read, the agent then decides whether to share each story. Any of the stories that are transmitted are passed to the generation 2 agent. There are two generations of agents in this simulation and only one generation acts at a time. Thus, the time duration of this simulation is 2 time units, about which more will be said shortly. Simulation data, including the number of stories retained at each generation, are logged to a CSV file, as if this study were conducted with living human participants.

Figure 4.9: Study 2; NetLogo Interface. There are so many agents and connections that they all blend together. Although it is no longer possible to visualize the distinct dyads, the system is nevertheless able to keep them separate.

Study 2: Scaling Up

Study 2 is a large-scale replication of Study 1. The intuition motivating this study addresses the question of whether the laboratory sample was too small with n = 80 participants. Since urban legend transmission was found to be a rare event in the laboratory, a larger sample will produce more opportunities to share, possibly resulting in more story transmission. As far as the model is concerned, nothing changes apart from the parameter that is used to control how many cascades are initialized and how many agents are created to populate the simulation environment. Thus, Study 2 consists of 10 times as many cascades: 400 serial reproduction tasks, for a total agent sample of n = 800. Figure 4.9 depicts the NetLogo user interface, which is nearly identical to before, with the notable exception that now the hub-and-spoke layout appears as a solid-fill circle as a consequence of the increased density of edges. In every other way, the parameters of Study 2 are just like Study 1.

Increasing the sample size is rarely performed with human participants because it usually implies time and expense. However, computational simulations pose a different set of constraints, such that the cost and expense manifest differently. Below certain thresholds, there is virtually no marginal cost to increasing scale. Therefore, computational simulations are a cost-effective method for studying difficult-to-observe phenomena, including those that depend upon rare events, that might benefit from increased study scale.

Study 3: Social Network Topology

Study 3 is a substantial departure from the previous studies. Instead of using serial reproduction tasks, a different network topology based on a preferential attachment algorithm will be used. Preferential attachment (Barabási & Albert, 1999) is a quick and approximate method for generating networks that have some of the same properties as online social networks. These are “rich get richer” networks in which a small number of agents have links to lots of agents; think of these as the celebrity agents. Meanwhile, the vast majority of agents are not popular; they do not have many links to other agents.

Figure 4.10: Study 3; NetLogo Interface with network layout. Once again, the visualization does not properly represent the topology of the networks. The elaborate network layout methods used in previous plots were not applied to this visualization. The network edges that are depicted in this visualization represent the first generation of the simulation, in which some agents have received stories from the experimenter.

Study 3 uses the same population size as Study 2: n = 800 agents. We will also change another detail of the original studies; the cascades will be permitted to run a little longer - until a potential fifth generation - whereas the other studies stopped at generation 2. By permitting longer cascades, this study will afford the possibility of actually observing an increase in the number of stories that are shared over time. In the lab study, by generation 2, almost every story cascade died, so even if Eriksson & Coultas (2014, Study 3) had specified longer cascade lengths, the effect would have been nullified because all low-disgust story sharing had ceased by the second generation and high-disgust stories would likely cease in the third generation. On this basis, there is no transmission advantage to running the serial reproduction task a little longer. In Study 3, by contrast, providing the simulation more time affords the opportunity to see whether the urban legend cascades die or whether they could actually become more prolific with time.
The agents act according to the rules of an S-I-S epidemic, which is a model of simple contagion that stands for Susceptible, Infected, Susceptible (e.g., Grabowski & Kosinski, 2005). Susceptible agents, in this case, are able to become infected through exposure to urban legend stories. When agents choose to read a story, those agents become infected, recalling the parasitic model of memes that temporarily inhabit a host for as long as the story is remembered. Once agents transmit their stories - should they choose to do so - then they become susceptible again to receiving new stories in the future.
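The S-I-S cycle described above amounts to a two-state machine per agent. The following Python sketch is only an illustration of that cycle; the class and method names are hypothetical and do not correspond to identifiers in the NetLogo model.

```python
from enum import Enum


class State(Enum):
    SUSCEPTIBLE = "S"
    INFECTED = "I"


class Agent:
    """Hypothetical agent that follows the S-I-S cycle for stories."""

    def __init__(self):
        self.state = State.SUSCEPTIBLE
        self.stories = []

    def read(self, story):
        """Choosing to read a story infects the agent (S -> I)."""
        self.stories.append(story)
        self.state = State.INFECTED

    def transmit(self):
        """Transmitting hands the stories on and resets the agent (I -> S)."""
        sent, self.stories = self.stories, []
        self.state = State.SUSCEPTIBLE
        return sent
```

An agent that reads but never transmits simply stays in the infected state in this sketch; how long a story is "remembered" is left unspecified here.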

4.3.3 ODD Protocol

ODD stands for Overview, Design Concepts, and Details (Grimm et al., 2006, 2010). The ODD protocol will now be applied to the current work. In the Overview section, the model is broken down into Entities, Variables, and Process. In the Design Concepts section, the background will be summarized in order to describe why the model was formulated the way it was. Finally, the Details section describes the operation of the model.

Overview

Entities The entities or agents in this work are representative of human participants. Internally, these agents are assigned participant IDs that also determine the sequence in which they are activated. Edges represent the communication channels that agents may use to communicate, thereby forming a communication network. Edges are directed so that messages only flow in one direction, which is similar to the operation of popular online social networks like Twitter. Agents may communicate only when an edge connects them. The model also provides a representation for stories. An agent interacts with a story whenever there is reading or transmission activity. When an agent reads a story, then a readership edge is created linking the agent to the story. A story can have many readers and the number of readers can be counted by enumerating the readership edges that connect to the story. Readership edges primarily exist to track cascades as they develop.

Variables Agent behaviour is influenced by computational variables that vary between agents and conditions. These variables are used to track agent states and to create experiments within the simulation.

Has Content When an agent has one or more stories to read, that fact is tracked using a variable called has content. We can ask an agent: “do you have any stories to read?” If the answer to this question is yes, then we say of the agent that has content is true. Otherwise, the value of has content is false. As a result of formalizing the has content variable, we will be able to construct logical expressions based on knowledge about which agents have stories to read.

Has Decided Once agents have made their decisions about whether to read and share stories, this fact is tracked with a variable called has decided. Whether an agent has decided and whether they have content is relevant to an agent’s eligibility to become active at any given time.12

Decision Model Agents are controlled by a special variable called decision model which is used to select between two models for decision making; either the empirically-calibrated Eriksson & Coultas (2014, Study 2) model or an alternative called the null model, which will be described soon. Unlike previous variables, which exist to clarify the logic of the model, this variable is very consequential because it radically changes the behaviour of the agents.

Sharing Threshold Each agent has a sharing threshold that is used to transform the results of a sharing intention model, which is itself a linear regression, into a decision outcome that causes an agent to share or not. The sharing intention model is derived from Eriksson & Coultas (2014, Study 2) and is described later in the ODD Protocol. This model of sharing is a linear function that produces a continuous value that is compared to the sharing threshold. Values above the threshold cause the agent to share a story and values below cause the agent to effectively ignore the story. As this sharing threshold could hypothetically be any number, it must be calibrated against empirical findings in order to ensure that it represents the correct threshold, a process which is described later in the ODD Protocol.

12 In fact, neither has decided nor has content strictly needs to exist; these values can be derived from network state. However, these two variables simplify some behaviour logic and they improve interpretability.

Story Variables Stories are represented as entities in this model and, consequently, they have their own variables in the same way agents do. In the laboratory study, stories were rated by participants and, consequently, data about the stories are available. Of particular relevance, participants rated how disgusting and amusing the stories were. Stories can therefore be high-disgust or low-disgust, which is captured as a single variable called disgusting that is true in the high disgust condition or false otherwise. Stories also have a categorical variable called themes, which has 4 levels corresponding to the following: cake, dog, Nepal, and pizza.
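The entities and variables listed in this Overview can be summarised as two small record types. The Python sketch below is an illustration only: the field names mirror the variables described above, but the types, the default values, and the `receive` helper are assumptions rather than the NetLogo declarations.

```python
from dataclasses import dataclass, field

THEMES = ("cake", "dog", "Nepal", "pizza")


@dataclass
class Story:
    theme: str          # one of the four THEMES
    disgusting: bool    # True in the high-disgust condition
    readers: list = field(default_factory=list)  # readership edges (agent IDs)

    def __post_init__(self):
        if self.theme not in THEMES:
            raise ValueError(f"unknown theme: {self.theme}")


@dataclass
class Agent:
    participant_id: int                # also fixes the activation order
    decision_model: str = "empirical"  # assumed labels: "empirical" or "null"
    sharing_threshold: float = 0.0     # placeholder; must be calibrated
    has_content: bool = False
    has_decided: bool = False
    inbox: list = field(default_factory=list)

    def receive(self, story):
        """Hypothetical helper: accept a story and flag that content exists."""
        self.inbox.append(story)
        self.has_content = True
```

Counting `len(story.readers)` then gives the number of readers of a story, in the same way that readership edges are enumerated in the model description.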

Scales Scale, here, refers to the types of dimensions that are involved in this model. Time is one scalar property of this model which proceeds in discrete increments of 1 hour, representing the duration of one experimental session. This time scale is plausible because we know the general properties of laboratory psychology research. The decisions made by participants during the Eriksson & Coultas (2014) laboratory studies would all occur within a single experiment block. Similarly, a single time tick is the amount of compute resource allocated to performing the decisions that occur within a single experiment block. Thus, we establish a correspondence between one hour, one experiment session, and one time tick within the simulation. When replicating a 2-link serial reproduction task, think of it as two separate, one-hour sessions that may be summed to 2 hours.

Process overview and scheduling In agent-based modelling, time itself may be conceived of according to various architectures. All simulations proceed in a certain order that is determined by a schedule. This simulation activates agents in the same order during every time tick, and that order is determined by the agent’s participant ID; lowest, first. A consequence of this schedule, in which all agents act at every time tick, is that all generation 2 agents are updated during time tick 1 - even though it is not actually their turn to act. Likewise, all generation 1 agents are updated during tick 2. The variable called has decided therefore prevents agents from acting when it’s not their turn, permitting agents to logically skip a turn when appropriate. Once an agent completes its actions, then the next agent begins according to participant ID.

Another consequence of this schedule is that agents do not act in parallel, which is a departure from all our assumptions about how our lived (non-simulated) experience of reality operates. Only one agent is computed at a time but, importantly, agents do not actually “notice” when they are paused. That is, from an agent’s perspective, it is as if all agents exist within a parallel, simultaneous world. However, from our perspective as researchers who are situated outside the simulation, we have the knowledge that agents do not act simultaneously - although agents are entirely unaware of this quirk.
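A sequential schedule of this kind can be sketched as a single loop per tick. The Python below is a hedged illustration, not the NetLogo scheduler: the `Agent` fields, the `act` callback, and especially the choice to deliver stories only after every agent has had its turn are assumptions made here so that a generation 2 agent cannot act during tick 1.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    participant_id: int
    inbox: list = field(default_factory=list)
    has_decided: bool = False

    @property
    def has_content(self):
        # Derived from state rather than stored separately.
        return bool(self.inbox)


def run_tick(agents, act):
    """Activate every agent once, lowest participant ID first.

    Agents are processed one at a time, never in parallel. An agent with
    nothing to read, or one that has already decided, skips its turn.
    `act` returns a list of (recipient, story) pairs for the active agent.
    """
    acted, deliveries = [], []
    for agent in sorted(agents, key=lambda a: a.participant_id):
        if agent.has_content and not agent.has_decided:
            deliveries.extend(act(agent))
            agent.has_decided = True
            acted.append(agent.participant_id)
    # Assumption: stories arrive only after all agents have been visited.
    for recipient, story in deliveries:
        recipient.inbox.append(story)
    return acted
```

With one dyad in which agent 1 forwards everything it holds to agent 2, the first call to `run_tick` activates only agent 1, the second activates only agent 2, and a third activates nobody.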

Design Concepts

Interaction Interaction describes the ways Entities of various kinds (i.e. agents, stories) can influence one another in the simulation. In this model, there are several possible kinds of interaction, depending on the types of entities that are interacting. The flowchart in Figure 4.4, which describes the methods used in Eriksson & Coultas (2014, Study 3), also happens to describe all of the major interactions among entity types.

Communication Communication is one type of interaction that occurs between two agent entities. In this model, communication occurs by collecting urban legend stories into a packet that is transmitted to the next agent in the cascade.

Reading and Sharing Other forms of interaction occur between Agents and Stories. When an Agent chooses whether to read a story, that is modeled as an interaction between a Story entity and an Agent entity. Agents also interact with stories when they decide whether to share them. In the current research, each of these forms of interaction has network implications. Communication may only occur along existing network routes. Meanwhile, the read and transmit interactions cause new network links to be created that encode the consequences of the interaction.

Stochasticity There are several random processes in the computational model, many of which are modeled as Gaussian noise. In the case of the linear regression models that were used to define the behaviour of the model entities, responses were centered and standardized during the model fitting process, which is calculated under the assumption of normality. So, I added Gaussian noise like a residual term to the fitted model estimates that were reported by Eriksson and Coultas to allow different entities to behave slightly differently from one another.

Emotional selection during the reading phase results in a non-normal distribution of choose-to-receive decisions. I model these choose-to-receive decisions as events over time, which I derive from the Poisson distribution. The mean receive rates that are reported in Eriksson & Coultas (2014, Study 3) determine the parameters of these Poisson distributions. As with Gaussian processes, this linear model may be operated in reverse - “made stochastic” - by sampling randomly from the Poisson distribution to add to the function where the error term ought to be.

A different kind of random process is used to generate preferential attachment networks in Study 3, which is intended to simulate social network dynamics. The preferential attachment algorithm links new agents to another agent in proportion to the number of existing links an agent has. In this way, agents with many links have a higher likelihood of receiving even more links over time. This preferential attachment algorithm produces similar but slightly different topologies each time it is performed.
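The preferential attachment process can be illustrated with a short Python sketch. This is a simplified variant written for illustration, not the NetLogo routine used in Study 3: it assumes one new link per arriving agent, a two-agent seed network, and a `seed` argument for repeatability, none of which are specified in the text.

```python
import random


def preferential_attachment(n_agents, seed=None):
    """Grow a network one agent at a time; return a list of (new, target) pairs.

    Each arriving agent links to one existing agent, chosen with probability
    proportional to the number of links that agent already has.
    Assumes n_agents >= 2.
    """
    rng = random.Random(seed)
    edges = [(1, 0)]     # assumed seed network: agents 0 and 1, linked
    endpoints = [0, 1]   # every agent appears once per link it has
    for new in range(2, n_agents):
        # Drawing uniformly from `endpoints` is a degree-proportional draw.
        target = rng.choice(endpoints)
        edges.append((new, target))
        endpoints.extend([new, target])
    return edges
```

Calling the function twice with different seeds yields networks of the same size whose links differ, which corresponds to the "similar but slightly different topologies" noted above. The direction in which stories flow along each pair is left open in this sketch.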

Emergence We expect several non-obvious patterns to emerge from this model. The number of stories retained per generation is the primary emergent outcome that will be used to compare laboratory results with simulation results. The number of stories retained can be computed on a per-cascade, per-individual, and per-story basis. We expect to see that high-disgust stories will be retained at a higher rate than low-disgust stories. Another pattern we expect to emerge is story extinction for serial reproduction topologies, in order to replicate Eriksson and Coultas. Based on their laboratory results, low-disgust story cascades should completely “die off” within 2 generations. In contrast to story extinction for serial reproduction cascades, we expect social network topologies to manifest ever-increasing shares over time.

Observation The simulation must permit the observation of the total number of Stories shared at each generation. In order to collect data about the computational model, the simulation environment must be monitored by a data logging process. For this purpose, NetLogo provides tools for observing emergent properties of the model while it is running, which can be programmed to log the number of shares per cascade, per generation, and so on. These observations may then be exported to a CSV file.

Figure 4.11: This is a flowchart depiction of the ABM “step” submodel. These labels represent the simulation sequence in a very rough, conceptual manner, as viewed from the perspective of the simulation itself. The behaviour of individual agents, which has been described elsewhere in the ODD protocol, can be thought of as being situated within the steps of this flowchart.

NetLogo also enables direct interaction with the simulation using Graphical User Interface controls and outputs.

Details

Initialization The process of initializing the simulation starts with the creation of all the agents in the environment, then connecting those agents with a network. Studies 1 and 2 produce a serial reproduction topology to connect the agents. In Study 3, a preferential attachment network connects the agents according to the stochastic process previously described. As the last step of initialization, the Experimenter agent is connected to the generation 1 agents and then transmits stories to those agents in order to prepare the simulation before time starts.
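For the serial reproduction topology, initialization can be sketched as follows. This Python fragment is an illustration under stated assumptions, not the NetLogo setup procedure: the story pool (one high-disgust and one low-disgust story per theme), the numbering of participant IDs, and the dictionary layout are all hypothetical, and the experimenter agent is reduced to the step that hands each generation 1 agent four stories at random.

```python
import random

THEMES = ("cake", "dog", "Nepal", "pizza")


def initialize(n_dyads=40, stories_per_agent=4, seed=None):
    """Create the agents, connect them into dyads, and hand out stories."""
    rng = random.Random(seed)
    # Assumed pool: (theme, disgusting) for every theme and disgust level.
    pool = [(theme, disgusting) for theme in THEMES for disgusting in (True, False)]
    agents = {}   # participant ID -> {"generation": int, "inbox": list}
    edges = []    # directed (generation 1 ID, generation 2 ID) pairs
    for d in range(n_dyads):
        first, second = 2 * d + 1, 2 * d + 2
        agents[first] = {
            "generation": 1,
            # The experimenter distributes stories before time starts.
            "inbox": rng.sample(pool, stories_per_agent),
        }
        agents[second] = {"generation": 2, "inbox": []}
        edges.append((first, second))
    return agents, edges
```

With the defaults this produces 40 dyads and 80 agents, matching Study 1; calling `initialize(n_dyads=400)` gives the 800 agents of Study 2 without any other change.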

Step Model Each time tick proceeds according to an algorithm called step, depicted in Figure 4.11, which is responsible for describing what happens during a time tick. Time itself moves forward as a progression of these steps. At the first branch in the flowchart, when an agent has no stories during a step, then the agent does nothing. Otherwise, once some stories have been transmitted to the agent, the agent performs the choose-to-receive behaviour and, for any stories they read, performs the choose-to-transmit behaviour.

choose-to-receive Model The receive phase of the communication model is represented as an algorithm called choose to receive. Eriksson & Coultas used the name “choose to receive” to describe what their participants do when they decide whether to read a given story. The computational model therefore has a corresponding representation with the same name. According to Eriksson & Coultas (2014, Study 3), there is a significant difference in story retention during the choose-to-receive phase, depending upon whether the story is high-disgust. In case the story is high-disgust, then the mean number of stories retained is significantly higher - a mean of 2.4 stories - versus a mean of 1.2 stories for low-disgust. When the story is disgusting, the choose-to-receive algorithm is more likely to produce a result that causes the agent to read the story. This is modeled as a Poisson process that determines the likelihood that a specific story will be retained, depending on disgust level. In proportion to that likelihood, choose-to-receive randomly reports either “yes: read the story” or “no: skip it.”

choose-to-transmit Model The choose-to-transmit algorithm derives from Eriksson & Coultas (2014, Study 2). Agents will share based on two quantities: first, what is the predicted share score for the agent and story; and second, what is the threshold above which agents will share.
The prediction function is derived from the results of the regression analysis conducted by Eriksson & Coultas. I think of this prediction function as a regression in reverse. Instead of fitting a model to a fixed set of data, the model is used to generate a brand new set of outcomes. When the score produced by this function is above the share threshold, the agent has effectively chosen to transmit a given story.

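\begin{equation}
\begin{aligned}
\text{share score} = {} & -0.86 + (0.79 \cdot \text{disgust}) + (0.08 \cdot \text{social}) + (0.22 \cdot \text{amusing}) \\
& + (0.35 \cdot \text{interesting}) + (0.25 \cdot \text{plausible}) + (0.18 \cdot \text{surprising})
\end{aligned}
\tag{4.1}
\end{equation}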

To compute a linear regression in reverse, we sample from the Gaussian distribution for each parameter in Equation 4.1. As discussed in the stochasticity section, a critical assumption of the analysis is that values are zero-centered and standardized. The products and sums are computed to obtain the share score.
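The share score and its comparison against the sharing threshold can be written out directly. The Python sketch below uses the coefficients printed in Equation 4.1; everything else is an assumption made for illustration, including the function names, the dictionary of standardized ratings, and the way Gaussian noise enters as a single residual term with standard deviation `noise_sd`.

```python
import random

# Coefficients as printed in Equation 4.1.
INTERCEPT = -0.86
WEIGHTS = {
    "disgust": 0.79,
    "social": 0.08,
    "amusing": 0.22,
    "interesting": 0.35,
    "plausible": 0.25,
    "surprising": 0.18,
}


def share_score(ratings, noise=0.0):
    """Evaluate Equation 4.1 for one agent-story pair, plus a residual."""
    linear = sum(weight * ratings[name] for name, weight in WEIGHTS.items())
    return INTERCEPT + linear + noise


def choose_to_transmit(ratings, threshold, rng=None, noise_sd=1.0):
    """Share when the (optionally noisy) score exceeds the threshold."""
    # Assumption: with no random generator supplied, the score is noiseless.
    noise = rng.gauss(0.0, noise_sd) if rng is not None else 0.0
    return share_score(ratings, noise) > threshold
```

As a worked check, a story rated at zero (the standardized mean) on every predictor scores −0.86, and a story rated one standard deviation above the mean on all six predictors scores −0.86 + 1.87 = 1.01. With a threshold of 0 and no noise, the second story would be shared and the first would not.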

Null Model of Behaviour Until this point in the ODD Protocol, we tied our model specification to the empirical context of Eriksson & Coultas (2014, Study 3). Now, we consider an alternative theory so we can potentially falsify our computational model; if it is no different from the null model, we ought to reject it. This requires a rhetorical straw man of sorts, analogous to the null hypothesis that is familiar from frequentist significance testing. What happens when agents behave in a purely random manner? To implement this, we can model the receive and transmit decisions as the result of a uniform random process. 50% of the time, agents receive the stories. 50% of the time, agents transmit the stories. I call this random decision model the null model - and, throughout the results, I refer back to various null models of behaviour for comparisons. I specifically think the null model does not describe the way people behave in real life. I expect that when agents behave according to the null model, the results will not resemble the Eriksson & Coultas (2014, Study 3) results. However, if it turns out that the null model produces results that are as good as the computational model, then we have no choice but to reject the theory underlying the computational models and try again.
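The null model is simple enough to state in a few lines. The Python sketch below is illustrative only, with hypothetical function names; it applies an independent 50% decision to each story at the receive phase and again at the transmit phase.

```python
import random


def null_decision(rng):
    """Uniform random decision: True half of the time."""
    return rng.random() < 0.5


def null_agent_step(stories, rng):
    """Apply the null model to both phases for one agent."""
    read = [s for s in stories if null_decision(rng)]    # choose-to-receive
    shared = [s for s in read if null_decision(rng)]     # choose-to-transmit
    return read, shared
```

Because the two decisions are independent, an agent following this sketch passes on about 0.5 × 0.5 = 25% of the stories it is given, regardless of disgust level.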

4.3.4 Simulations

NetLogo

The NetLogo agent-based modelling environment was used for this work. The simulation works with NetLogo version 6.0.4.¹³

BehaviorSpace

The NetLogo BehaviorSpace¹⁴ tool is used to iterate across a specified parameter space, repeatedly running models with varying parameters. The results from BehaviorSpace may then be logged to a CSV file. BehaviorSpace was used extensively in the present research to create structured batches of simulations that generated the raw data. Once a CSV file was obtained from BehaviorSpace, it could be imported into a statistical package for analysis.

4.4 Results

4.4.1 Study 1: Replication

The first objective of this research was to replicate the pattern of results observed in Eriksson & Coultas (2014, Study 3). These patterns can be characterized as follows:

1. High disgust stories propagate at a higher rate than low disgust.

2. Fewer stories are received and transmitted as time proceeds.

3. At each time point, fewer stories are transmitted than received.

4. By the end of two time points, virtually no low disgust stories are transmitted.

Calibrate share threshold parameter

Before producing the results, there was one model parameter that constituted a degree of freedom in the simulation model. The share threshold, which was created in order to transform a regression model into a decision-making function, was not specified by the original research. As this parameter has no direct empirical basis in Eriksson & Coultas (2014, Study 3), the parameter must be calibrated.¹⁵ A sensitivity analysis was performed upon the share threshold parameter (Thiele et al., 2014). The parameter space between −1.5 and +1.5 was searched in increments of 0.5. During sensitivity analysis, k = 30 simulation runs were performed with each threshold value.

¹³ The NetLogo language has changed slowly over the course of the years. I expect that this code will only work with this specific version of NetLogo. My recommendation for reproducibility is to archive a complete copy of NetLogo along with the model specification.
¹⁴ The Canadian English spelling, BehaviourSpace, would produce no results.
¹⁵ Any such degree of freedom yields obvious criticisms for agent-based models. As such, great care was taken to eliminate any other degrees of freedom during the construction of the ODD Protocol. Nevertheless, this one degree of freedom remains. Luckily, this problem is not unique in simulation research and the method of Sensitivity Analysis was developed to solve it.

Figure 4.12: This plot depicts the results of the search for a share threshold parameter. The simulation was executed with different threshold values, which are abbreviated as th and can be seen along the right-most edge of the plot. Thus, the different parameter results are actually stacked on top of one another.

The plot in Figure 4.12 visualizes the parameter space for share threshold along with corresponding results. Mathematically, the share threshold may be interpreted in terms of its computational consequences. When the share threshold is 0, then a transmit intention score of 0 and above will trigger transmission. When share threshold is below zero, agents share at an increased rate - and the farther below 0 the threshold is, the more transmission there is. When share threshold is above zero, there is less transmission. When the threshold is 0, not enough stories are left by the end of the second generation. When the threshold is −1.5, there are too many low-disgust stories. On the basis of these parameter sensitivity results, a share threshold value of 0.5 was selected for all simulations.
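The direction of the threshold effect can be checked with a small Python calculation over the same grid of thresholds. This is only a sketch and not the BehaviorSpace sensitivity analysis itself: it assumes the six predictors of Equation 4.1 are independent standard normal variables, in which case the share score is itself normally distributed and the probability of exceeding a threshold has a closed form. The function name `share_probability` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Intercept and coefficients from Equation 4.1.
INTERCEPT = -0.86
COEFFICIENTS = [0.79, 0.08, 0.22, 0.35, 0.25, 0.18]

# Under the independence assumption, score ~ Normal(INTERCEPT, sigma).
SIGMA = sqrt(sum(b * b for b in COEFFICIENTS))


def share_probability(threshold):
    """Probability that a randomly drawn story scores above the threshold."""
    return 1.0 - NormalDist(mu=INTERCEPT, sigma=SIGMA).cdf(threshold)


# The searched parameter space: -1.5 to +1.5 in increments of 0.5.
thresholds = [-1.5 + 0.5 * step for step in range(7)]

for th in thresholds:
    print(f"th = {th:+.1f}   P(share) = {share_probability(th):.3f}")
```

The printed probabilities fall as the threshold rises, which matches the qualitative description above: lower thresholds produce more transmission and higher thresholds produce less.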

Simulation Results

Figure 4.13 depicts the results obtained from k = 30 simulations performed with NetLogo BehaviorSpace. As with Eriksson & Coultas (2014, Study 3), there are 2 generations of agents in which there is a receive and a transmit step. These results are directly comparable to Figure 4.2. Simulation results converge with the absolute magnitudes that were reported in Eriksson & Coultas (2014, Study 3). When focusing on the generation 1 receive step, on average agents receive a little more than 2 high-disgust stories and a little more than 1 low-disgust story. During the generation 1 transmit step, nearly 2 high-disgust stories are transmitted, on average, whereas just 0.25 low-disgust stories are transmitted. Overall, the simulation produces a similar magnitude of reading and transmission events over time. Directly evaluating the patterns yields the following:

Figure 4.13: The results of Study 1 correspond closely to the empirical results obtained by Eriksson & Coultas.

1. The simulation produces different outcomes for high versus low disgust stories. High disgust stories are transmitted at a higher rate.

2. Stories with different disgust levels are retained at an appropriate rate over time.

3. Within a single generation, the number of stories transmitted is lower than the number that were received.

4. By the end of two generations, low disgust stories are almost entirely extinguished - yet some high disgust stories remain.

Altogether, the overall pattern of stories retained over time resembles the original Eriksson & Coultas (2014, Study 3) results.

A Brief Note About Error Bars The error bars in Figure 4.13 occur because each simulation run produces a slightly different result. The size of the error bars depends on the number of simulations that were run: the more runs, the tighter the bars. This is very different from the error bars that occurred in Eriksson & Coultas (2014, Study 3), which result from between-subjects variability. This topic will be revisited during the discussion.

Compare choose-to-receive null model

The choose-to-receive decision model was investigated by comparing it to a null model of behaviour.¹⁶ An unbiased choose-to-receive model was created that randomized decisions at a 50% rate, rather than determining decisions according to the empirical regression model. This is referred to as a null model because it is analogous to the no-difference condition of null hypothesis significance testing. In case the null model appears identical to the choose-to-receive model, we would conclude the choose-to-receive model adds no information. When no difference is detected, the choose-to-receive model is regarded as falsified.

¹⁶ We cannot simply “turn off” the receiving behaviour. When disabling it entirely, the net effect would result in none of the agents ever doing anything because everything depends on what the agents read.

Figure 4.14: When the behaviour for the choose-to-receive submodel is determined by a random source, instead of the empirically calibrated model, the simulation results no longer match the pattern of the empirical results. The plots are stacked: read model: E&C refers to the Eriksson & Coultas model for choose-to-receive. Meanwhile, read model: null refers to the random model.

Figure 4.14 depicts the null model and the choose-to-receive model. The choose-to-receive model results are actually a full replication of the previous results, not merely a duplication of the plot that was already presented.¹⁷ The plot along the bottom was created by running the simulations with the null model for reading behaviour. As before, there are two generations, each containing a receive and a transmit step. The problem with the null model is evident in the first generation receive step. Almost exactly 2 stories are received by agents, regardless of whether the stories are high or low disgust. The null model result deviates from those that were expected from the Eriksson & Coultas (2014, Study 3) model. The consequences of the null model behaviour propagate throughout subsequent steps and generations, violating expectations at each step. Because the empirically-calibrated model differs from the null model we do not reject our initial model at this point.

Compare choose-to-transmit null model

The choose-to-transmit model was also examined by comparing to a null model. Figure 4.15 is laid out like the previous figure: the top depicts the empirically-calibrated model and the bottom depicts the null model. In the null model, high-disgust stories are not retained at a high enough rate and low-disgust stories are retained too often, as compared to the empirically-calibrated model.

¹⁷ Since I am able to run a batch of simulations in a matter of minutes, I just re-ran the simulation instead of copying the data already produced.

Figure 4.15: When the behaviour for the choose-to-transmit submodel is determined by a random source, instead of the empirically calibrated model, the simulation results no longer match the pattern of the empirical results. The plots are stacked: transmit model: E&C refers to the Eriksson & Coultas model for choose-to-transmit. Meanwhile, transmit model: null refers to the random model.

By the end of the second generation, the number of retained stories drops off geometrically; at each generation, half as many stories are retained. This observation is consistent with the expected value of the null model, which will randomly transmit 50% of the stories. From this, we can infer that the null model for choose-to-transmit does not capture the transmit dynamics that were observed in Eriksson & Coultas (2014, Study 3). As before, we do not discard the choose-to-transmit model in favor of the null model.

Compare both null models to empirical model

Figure 4.16 depicts the number of stories retained when the null model is used for both receive and transmit behaviours. Only the plot in the bottom-right quadrant is new; the other three plots are replications of the results from the previous figures. The null-transmit-null-receive model appears to decrease geometrically by a factor of 0.5, corresponding precisely to a 50/50 coin flip. At the beginning of generation 1, 4 stories become exactly 2 stories in the read step, which become exactly 1 story in the transmit step. This pattern of division, which continues through generation 2, clearly does not resemble the original laboratory results. By looking at all 4 quadrants in Figure 4.16 simultaneously, it is possible to mentally “partial out” the relative contributions from the empirical models for receiving and transmission. The effect observed in Eriksson & Coultas (2014, Study 3) can be thought of as either being greater than or less than the corresponding null models.
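The halving pattern of the full null model can be written as a short worked calculation. This sketch assumes that every receive and transmit decision is an independent coin flip with probability 0.5. The symbol $S_m$, the number of stories retained after $m$ decision steps, is introduced here for illustration only and is not part of the model specification.

$$E[S_m] = 4 \cdot \left(\tfrac{1}{2}\right)^m, \qquad E[S_1] = 2, \quad E[S_2] = 1, \quad E[S_3] = 0.5, \quad E[S_4] = 0.25$$

Steps 1 and 2 are the receive and transmit steps of generation 1; steps 3 and 4 are the corresponding steps of generation 2. The expectation is the same for high-disgust and low-disgust stories, because a coin flip ignores story content.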

Figure 4.16: Plots are both stacked on top of one another and aligned side-by-side. The lower-right quadrant depicts the case when null models are used for both receive and transmit. Meanwhile, the upper-left quadrant depicts the case when both receive and transmit are based on the Eriksson & Coultas model. By comparing the upper-left to the lower-right, it can be seen that the null models fail to capture any of the patterns that were sought after.

Figure 4.17: The Study 2 results are laid out in quadrants, as in Figure 4.16, and can be interpreted in the same way. The general results are preserved: the null results capture none of the phenomenon. Although the scale of this simulation has been increased by a factor of 10, there is no change to the overall pattern of results.

Figure 4.18: Study 3 Results. The empirical Eriksson & Coultas model - both receive and transmit - was used to produce this plot. The difference is that now, a preferential attachment network topology was used. Along the x-axis, there are now 5 generations instead of 2, because the simulation was permitted to run a bit longer. The important thing to take note of is the increase in high-disgust stories that were shared by generation 5. If this pattern were continued into the future, it is now plausible that some of those stories could produce widespread cascades.

4.4.2 Study 2: Receive-transmit pattern at 10x scale

Urban legend transmission in Eriksson & Coultas (2014, Study 3) was a relatively unlikely event. Perhaps the reason the results don’t resemble the widespread urban legend sharing observed in the real world is that sharing is so unlikely that it simply could not be seen with n = 80 participants. Therefore, increasing the number of sharing opportunities could increase the likelihood of observing a cascade that extends beyond 2 generations.

In Study 2, a computational model with n = 800 agents was examined. This size increase is a factor of 10 beyond the Study 1 simulations. The serial reproduction task length remained fixed at l = 2, yielding 400 transmission tasks.

The results are presented in Figure 4.17. As before, null models were simulated for both receive and transmit behaviours. The top-left quadrant depicts the full empirically-calibrated simulation and the bottom-right quadrant depicts the full null model.

Despite increasing the simulation scale by a factor of 10, the results appear virtually identical to before. None of the base rates for any particular behaviours has changed. The net effect of the increase in scale is likely to just increase the precision of the estimates produced by the simulation. Although complex systems are capable of producing unexpected results under certain conditions, it does not appear that this particular system is affected by the sample size.

Figure 4.19: The results from Study 2 are plotted again - this time, as total stories per generation instead of average stories per generation. Despite this change in reporting, there is no substantial difference to the results; the same pattern is observable.

4.4.3 Study 3: Preferential Attachment

Until this point, these studies have been replications of a serial reproduction task and the very simple network topology implied thereby. What happens when a more sophisticated network, simulating more realistic social phenomena, is used instead? As explained in the methods, the new network is built with a preferential attachment algorithm. Consistent with Study 2, the simulation used n = 800 agents.
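For readers unfamiliar with preferential attachment, the idea can be sketched in Python as follows. This is a generic Barabási-Albert-style construction and not the NetLogo code used for Study 3; the function name, the two-agent seed network, and the default of one link per new agent are all assumptions made for illustration.

```python
import random
from collections import Counter


def preferential_attachment(n_agents, links_per_agent=1, seed=0):
    """Grow a network in which new agents prefer well-connected targets."""
    rng = random.Random(seed)
    edges = [(1, 0)]  # assumed seed network: two connected agents
    # Each agent appears in this list once per edge it touches, so a uniform
    # draw from the list selects agents in proportion to their degree.
    endpoints = [0, 1]
    for new_agent in range(2, n_agents):
        targets = set()
        while len(targets) < min(links_per_agent, new_agent):
            targets.add(rng.choice(endpoints))
        for target in sorted(targets):
            edges.append((new_agent, target))
            endpoints.extend([new_agent, target])
    return edges


if __name__ == "__main__":
    edges = preferential_attachment(800)
    degree = Counter(agent for edge in edges for agent in edge)
    print("edges:", len(edges))
    print("largest degree:", max(degree.values()))
    print("agents with exactly one link:", sum(1 for d in degree.values() if d == 1))
```

In networks grown this way, a small number of agents typically accumulate many connections while most agents keep only one or two. Those well-connected agents are what give a story many more opportunities to be passed on than a serial chain allows.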

Reporting Total Stories versus Average Stories

Figure 4.18 depicts the total number of stories retained over time. In studies 1 and 2, averages were reported because that’s how Eriksson & Coultas (2014, Study 3) reported them. However, in Study 3, a different metric for reporting is used so a few words need to be said about it.

With the new network topology in Study 3, agents may have many incoming and outgoing connections. Agents are activated at each generation depending upon which agents transmit urban legend stories to them. Therefore, generations no longer consist of a fixed size. As a consequence of these differences, there’s not really a denominator to use for the calculation of an average on a per-generation basis. Instead, the non-cumulative total stories retained at each generation were plotted in Figure 4.18.

In order to retain the ability to make comparisons to the previous studies, the results from Study 2 have been re-plotted in Figure 4.19 using the total stories retained. As can be seen, the same patterns are observable; there is no substantial difference whether Study 2 results are plotted by total or by average. Therefore, the ability to interpret Study 3 results should be preserved.

Simulation Results

When examining Figure 4.18, generation 1 looks identical to before. However, in generation 2 and beyond, something different is happening. Instead of the number of stories gradually decreasing, the opposite happens: more high-disgust stories are retained in generation 2 than in generation 1 - and this trend continues. The low-disgust stories, as before, are retained less by each successive generation. This result matches our original intuition about urban legends. In contrast to the effect observed in Eriksson & Coultas (2014, Study 3), the very same model produces “viral” sharing when the network resembles a human social network. Furthermore, this effect only holds for high-disgust stories, not for their low-disgust counterparts.

Figure 4.20: To permit comparison with the null models, the Study 3 results are plotted in quadrants again. The upper-left quadrant depicts the empirical models and the lower-right quadrant depicts the null models. In this plot, the term rx was used to abbreviate “receive” and the term tx was used to abbreviate “transmit.” Although the full null model does exhibit an increase in stories over time, it can be seen that the expected patterns are not present.

Null model with Preferential Attachment topology

Null models of receiving and transmission were examined with the preferential attachment network. These results are plotted in Figure 4.20. The lower-right quadrant, in which both null models are used, exhibits a gradual positive trend over time, suggesting that the network topology alone contributes strongly to the effect. Critically, the null model does not exhibit any preference for one type of story over the other, which violates key patterns. On that basis, we do not reject the empirically-calibrated model.

4.4.4 Summary

Study 1 replicated the results of Eriksson & Coultas (2014, Study 3). A sensitivity analysis was performed for the share threshold parameter in order to calibrate it for all subsequent simulations. Null models for receive behaviour and transmit behaviour were examined in order to test whether the Eriksson & Coultas (2014, Study 3) models should be discarded. The null models did not reproduce any of the patterns whereas the empirically-calibrated models did. Thus, the empirical models were not discarded.

In Study 2, the size of the simulation was increased by a factor of 10 in order to look for any sort of scaling properties. It was observed that the number of serial reproduction tasks did not change the outcomes for urban legend sharing. In Study 3, the network topology was constructed using a preferential attachment network in order to resemble a human social network. With this new network, an ecologically-plausible outcome was obtained. Unlike in the lab, the high-disgust stories increase over time, just as can be observed in the real world.

4.5 Discussion

4.5.1 The Lab and The Simulation

We began with urban legend studies that were highly controlled. Although those studies lacked ecological validity and the results didn’t resemble real life, they had excellent internal validity. A computational model of urban legend transmission was constructed on the basis of those studies. The simulations exhibited the patterns that were expected, reproducing both the laboratory results and results that would be expected in the “real world.” Using a realistic network, high-disgust stories are shared at an increasing rate. Low-disgust stories, which are retained at ever-diminishing rates, would likely be forgotten entirely if the simulation were run longer.

4.5.2 Scaling a serial reproduction task

Increasing the scale of the serial reproduction task did not change the rate at which stories were retained. The pattern of results with 800 agents was the same as with 80 agents. This is actually interesting on its own because most other network topologies are parameterized in some way by scale. Serial cascades would appear to be unique in their stability under scaling conditions. On this basis, perhaps the serial reproduction topology is particularly well-suited to experimental control.

4.5.3 Explanations for Emergent Network Behaviour

A key pattern exhibited in Study 3 by the preferential attachment network is that high disgust stories are retained more, not less, at generation two compared to generation one. Why does this increasing-transmission behaviour emerge with the preferential attachment network? Revisiting an earlier intuition, urban legend transmission is a rare activity - but increasing the number of opportunities to share counteracts the model’s general bias against sharing. The preferential attachment topology permits agents to be connected to more than one other agent. By changing the in-degree and out-degree, this produces far more transmission opportunities.

There are epidemiological implications to the preferential attachment network. Whereas the serial reproduction task permitted agents to act just once, now agents could potentially act more than once due to the social-network-like connections they have. In epidemic ABM terms, the preferential attachment network acts as an S-I-S simulation - or susceptible, infected, susceptible. Agents that are available to receive urban legend stories are called susceptible, so the return to susceptibility in an S-I-S model implies that agents can receive and transmit stories multiple times. The network topology and epidemic model therefore radically change the range of possible agent behaviours, going far beyond the serial reproduction task. In this manner, there is a larger difference between Study 2 and Study 3 than might be immediately apparent. That algorithmic change, along with its consequences for agents, is the difference that causes Study 3 to have ecological validity when Study 2 does not.

4.5.4 Messages and Topologies

In their article, Eriksson and Coultas remarked that, “a link between emotional selection and social network structure might act as a factor that magnifies the cultural differences between communities, an idea left for future research.”

I would like to rephrase their comment as follows: what about those messages that produce changes in networks? Using the nomenclature of Twitter or Facebook, what happens when we unfriend people? For example, what happens when a political conversation turns nasty and, as a result of that conversation, you don’t want to be friends with that person anymore? Social networks are dynamic - and a message can cause us to add people to our network or remove them.

It is surprisingly difficult to obtain data from Twitter and - especially - Facebook that would be suitable for examining the unfriending question. However, I can do some satisfying work with computational modelling. I presented some simulation findings on the topic of unfriending at the Political Networks 2018 conference. This is a future direction that I am still interested to pursue.

4.5.5 Error Bars and Parallel Simulations

The error bars produced by the simulations are different from the error bars reported in the laboratory studies. In both cases, the error bars are fundamentally related to the uncertainty surrounding estimation. In the case of human subjects research, individual observations correspond to the measurements from one individual participant and are aggregated as a between-subjects measure. In the case of simulation studies, the results are aggregated at the scale of an entire simulation as a between-simulations measure.

The error bars reported by Eriksson and Coultas represent the uncertainty surrounding the estimates of population parameters. However, there is fundamentally just one population and we presume there is a single, unified reality that all participants belong to. Those error bars, as determined by the sample obtained, are related to the confidence in the estimation of the true mean.

Despite starting with the same parameters each time, every simulation run produces results that are a little bit different due to stochasticity. When summarizing across simulation results, we obtain an estimate of an emergent parameter produced by a complex system - and that estimate has error. In effect, we are sampling from parallel simulations.¹⁸ ¹⁹ ²⁰ ²¹
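The idea of sampling from parallel simulations can be made concrete with a toy Python example. The `toy_simulation` function below is a hypothetical stand-in for a full model run, and its share probability of 0.25 is an arbitrary illustrative value rather than a result from this work. The sketch shows only that a recorded seed reproduces a run exactly and that the standard error of the between-simulations estimate is computed across runs.

```python
import random
from math import sqrt
from statistics import mean, stdev


def toy_simulation(seed, n_agents=80, p_share=0.25):
    """One stochastic run; returns the fraction of agents that share a story."""
    rng = random.Random(seed)  # the seed indexes which parallel simulation is observed
    shares = sum(rng.random() < p_share for _ in range(n_agents))
    return shares / n_agents


def mean_and_standard_error(seeds):
    """Aggregate run-level outcomes into a between-simulations estimate."""
    results = [toy_simulation(seed) for seed in seeds]
    return mean(results), stdev(results) / sqrt(len(results))


if __name__ == "__main__":
    # Re-running with a recorded seed repeats history exactly.
    print(toy_simulation(42) == toy_simulation(42))
    for k in (30, 300, 3000):
        estimate, standard_error = mean_and_standard_error(range(k))
        print(f"k = {k:4d}   estimate = {estimate:.3f}   standard error = {standard_error:.4f}")
```

Because the standard error divides the between-run spread by the square root of the number of runs, adding runs tightens the error bars without changing the quantity being estimated.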

4.5.6 Statistical Inference

During the analysis of the simulation results, it became apparent that the tightness of error bars could be controlled by changing how many times the simulations were run. In order to obtain a better estimate of an emergent simulation parameter, the simulation could be run more times. This is reminiscent of the way frequentist statistics are sensitive to sample size; similarly, in order to obtain a better estimate, just obtain more observations. Gelman & Shalizi (2013) extended this intuition a step further by claiming that the p-value can be viewed as a crude measure of sample size, which is to say that if a specific p-value is desired, sample size can always be increased.

In the context of the current computational modelling work, I have come to wonder whether the following statistical inference process is conceivable: First, let us contrast two theories - say, the Eriksson and Coultas choose-to-receive theory against a null model of choose-to-receive. Next, run both simulations repeatedly, measuring our key emergent property of interest. Finally, determine the likelihood that the values obtained from each theory are drawn from separate distributions. The intuition underlying this process is like Gelman in reverse: perhaps the number of simulation runs is a crude measure of p-value. When more simulations are required in order to obtain a statistically significant difference between estimates obtained from separate theories, the effect is not as strong. I am particularly interested in this possibility because I believe it offers something translatable directly into psychological science in a form that is immediately recognizable.²² Once likelihoods are attached to simulation results, permitting quantifiable tests to contrast between empirically-calibrated models versus null models, I believe Agent-Based Models could become first-class research methods within psychological science.
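One way the proposed inference process might look in practice is a permutation test on run-level outcomes, sketched below in Python. This is an illustration of the general idea and not a procedure that was carried out in this work: the function name, the number of permutations, and the made-up outcome values in the usage example are all assumptions.

```python
import random
from statistics import mean


def permutation_p_value(runs_a, runs_b, n_permutations=2000, seed=0):
    """Estimate how often a mean difference this large arises by chance.

    runs_a and runs_b hold one emergent outcome per simulation run,
    e.g. stories retained at generation 2 under two competing models.
    """
    rng = random.Random(seed)
    observed = abs(mean(runs_a) - mean(runs_b))
    pooled = list(runs_a) + list(runs_b)
    n_a = len(runs_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # relabel runs as if the model made no difference
        difference = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if difference >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Made-up run-level outcomes, for illustration only.
    empirical_runs = [1.9, 2.1, 2.0, 2.2, 1.8, 2.0, 2.1, 1.9]
    null_runs = [1.0, 1.1, 0.9, 1.0, 1.2, 0.8, 1.0, 1.1]
    print(permutation_p_value(empirical_runs, null_runs))
```

Under this scheme, the number of runs needed before the p-value falls below a chosen criterion would serve as the "crude measure" described above: weak differences between theories require many runs, strong differences require few.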

4.5.7 A General Method for Psychological Research

Altogether, the current work presents a general method for psychological research. This style of computational modelling, which is a synthesis of classic agent-based modelling and the epistemology of computational ecology, can be applied across the psychology literature. The class of regression models contains thousands of existing examples of psychological phenomena - and many of these models can be adapted and calibrated to use this approach. I would also expect this approach to work with many other behavioural frameworks, including those that are not described as linear models. Consider the following general three-step program of study.

1. A highly-controlled, small-scale laboratory study is conducted.

2. A computational model is constructed based on the lab results.

3. The computational model is adapted to conditions that exist outside the laboratory.

In this manner, both strict experimental controls and ecological validity can be obtained.

¹⁸ By implication, these are parallel simulated realities implied by the computational simulations. As a stochastic system, each simulation does correspond to slightly different starting parameters, although the parameter that varies - a random seed - exists as part of the simulation environment, not as part of the model itself. This stochastic seed value, alone, is responsible for the chaotic unpredictability that manifests as the distribution that produces the error bars.
¹⁹ The error bars therefore represent the uncertainty obtained from measuring parameters across parallel simulations. In this framework, the only parameter that varies is simply an index (i.e. the random seed) corresponding to which parallel simulation is being observed. So long as that index is recorded, it is then possible to re-run the simulation to obtain precisely the same results, which amounts to the repeatability of history.
²⁰ When this thought experiment is applied to the universe within which we find ourselves, a shocking correspondence to theoretical physics is obtained. Just as parallel simulations may be dimensionalized as an index according to the random seed that implicitly perturbed each run to produce slightly different results, a similar index is conceived of in M-theory, which gives rise to the set of all possible universes (Banks et al., 1997). One advantage to M-theory is that it has already mapped out the possible dimensions that may be parameterized, although those dimensions are physical in nature and the underlying model produces physical constants like gravity instead of urban legend propagation.
²¹ Ultimately, this thought experiment leaves me wondering about the underlying stochasticity of the observed universe. What is randomness, after all? Where, in the natural world, does the normal distribution derive from? Why are statistical mechanics seemingly at the heart of all these simulation systems? Why is simulated annealing such a powerful metaphor in computational modelling techniques? Why is there a profound relationship between heat and entropy and why does that concordance appear to be useful - essential, even - for driving something so trivial as a system that predicts urban legend propagation? My intuition suggests to me that the answer to these questions is actually contained in M-theory - specifically the Hamiltonian - but, at the risk of completely derailing this discussion, I have to leave this as a future direction.
²² For example, this approach could be reminiscent of bootstrapping.

4.6 Conclusion

4.6.1 Adapting Linear Models to Agent-Based Modelling

If the methods from this work can be generalized, then almost any psychological regression model can be translated to a computational model. The general epistemology developed by ecologists for use with plant and animal populations would appear to work with human phenomena as well. The current work demonstrated the entire computational modelling process using a study of urban legends. The Eriksson and Coultas model of urban legend transmission did not produce viral outcomes in their laboratory study. When the same model was transplanted, unmodified, into a more plausible network context, high-disgust stories were transmitted at an increasing rate over time, just as expected.

This work demonstrates that a specific empirical finding can be “reverse engineered” from the literature and examined in new ways. The laboratory provides an ideal environment for highly-controlled research, which comes at the expense of ecological validity. Computational methods permit the laboratory results to be examined in a more ecologically valid context.

4.6.2 Urban Legends Require Networks

In order for a story to truly become prolific within a population, it must have many opportunities to be retold. When the community is structured so as to minimize the number of times stories are told, it causes an already-unlikely storytelling event to become diminished towards nothingness. The inverse can also be inferred: when stories are told to large audiences, the likelihood of retelling is increased as a simple function of the audience size. Online social networks, in which a single account may have millions of connections, provide the opportunity for individuals to share stories widely. Urban legends are told as if they are true, entirely irrespective of whether or not they actually are true, which manifests in the present-day context as misinformation. Network topologies, on their own, are likely a major contributor to the proliferation of misinformation. In the absence of modern online social networks, misinformation would likely remain an expensive proposition that could only be accomplished by wealthy actors like governments. The reduced cost of engaging in storytelling episodes has minimized the expenses associated with misinformation. Stories that are practically unbelievable have a chance at becoming widespread, as a simple function of network topology. Even as the cost to spread urban legends is reduced, the cost for high-quality information remains high. For example, within the scientific community, open access journals are rare and the widespread availability of facts more closely resembles a serial reproduction task than an online social network. These differing network topologies imply troubling consequences for stories that propagate the truth, as opposed to those that spread lies.
At the end of the day, emotional selection would appear to interact with network topology in a meaningful way to determine which stories become prolific - but it appears that the network is ultimately the more influential factor.

Chapter 5

Conclusion

This work sought to achieve two overarching goals: to advance our understanding of memes and also to establish research practices for incorporating memes into psychological science. Regarding memes, this work 1) situated memes within the literature of psychological science; 2) determined that the individual is the primary influence in memetic systems of political speech; and 3) demonstrated that social network topology affects urban legend transmission. Regarding research with memes, this work demonstrated methods for 1) inter-disciplinary literature analysis; 2) identifying memes in a text corpus; and 3) using psychological regression models in social computational simulations. Altogether, this dissertation has argued that memes provide an essential framework for the scientific investigation of complex systems involving individuals and groups.

5.1 Meme Findings

Literature of Memes The literature of memes is interdisciplinary, including psychology, sociology, ecology, anthropology, linguistics, statistics, computer science, media, communications, and more. Although academia consists of disciplinary clusters, coauthorships do occasionally bridge between clusters, enabling ideas to propagate like other systems of memes. It is likely that all authors are somehow connected by coauthorship - a “small world” of academia.

Political Campaigns In a study of political campaigns as a system of memes, this work determined that the individual has the most influence over the success of a political meme. This work also showed that there are strong hierarchical effects; the community and the meme itself are also meaningful predictors of meme cascades. The mere presence of memes within campaign speech is predictive of subsequent message popularity - but this effect can be amplified when certain individuals and communities are involved. Although this work looked at political memes in particular, it is expected that this hierarchical model will generalize to other kinds of memes.

Urban Legend Propagation The experimental finding that urban legends propagate farther when they are disgusting is shown to hold in a simulated online social network context. This work tested whether an empirically-calibrated model of urban legend transmission could produce the expected results even when the underlying social network was changed - and it could. When a scale-free network topology was used, disgusting urban legends were propagated at an increasing rate - which is required to produce long cascades - while non-disgusting stories were quickly forgotten. This work confirmed that the propagation of memes is dependent upon the network topology to achieve long cascades.
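The dependence of cascade length on topology can be illustrated with a minimal sketch. It is not the calibrated simulation from Chapter 4: the transmission probabilities, network sizes, and function names below are hypothetical, the preferential-attachment generator is only one assumed way of producing a scale-free-like network, and the spreading rule is a simple independent cascade chosen for brevity.

```python
import random


def chain_graph(n):
    """Chain in which each agent is linked only to its immediate neighbors,
    loosely resembling a serial reproduction task."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


def preferential_attachment_graph(n, m, rng):
    """Grow a graph in which each new node links to m existing nodes chosen
    with probability proportional to their degree, yielding a few hubs."""
    if n <= m:
        raise ValueError("n must be larger than m")
    neighbors = {i: set() for i in range(n)}
    endpoints = []  # every node appears once per edge it belongs to
    # Start from a fully connected core of m + 1 nodes.
    for i in range(m + 1):
        for j in range(i + 1, m + 1):
            neighbors[i].add(j)
            neighbors[j].add(i)
            endpoints.extend((i, j))
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for target in targets:
            neighbors[new].add(target)
            neighbors[target].add(new)
            endpoints.extend((new, target))
    return {node: sorted(adjacent) for node, adjacent in neighbors.items()}


def cascade_size(graph, seed, p, rng):
    """Independent cascade: each newly informed agent tells each neighbor
    once, and each telling succeeds with probability p."""
    informed = {seed}
    frontier = [seed]
    while frontier:
        next_frontier = []
        for teller in frontier:
            for listener in graph[teller]:
                if listener not in informed and rng.random() < p:
                    informed.add(listener)
                    next_frontier.append(listener)
        frontier = next_frontier
    return len(informed)


def mean_cascade_size(graph, p, trials, rng):
    """Average number of informed agents over randomly seeded cascades."""
    nodes = list(graph)
    total = sum(cascade_size(graph, rng.choice(nodes), p, rng) for _ in range(trials))
    return total / trials


rng = random.Random(2019)
networks = {
    "chain": chain_graph(500),
    "scale-free-like": preferential_attachment_graph(500, 3, rng),
}
# Hypothetical transmission probabilities for low- and high-disgust stories.
for story, p in (("low-disgust", 0.05), ("high-disgust", 0.40)):
    for name, graph in networks.items():
        print(story, name, round(mean_cascade_size(graph, p, 200, rng), 1))
```

In this sketch the transmission rule is identical across networks; only the topology differs, which is the comparison the paragraph above describes.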

5.2 Research with Memes

Several new research methods were developed in the course of this research.

Literature Synthesis Method A method was demonstrated for building an inter-disciplinary bibliography by analyzing a network of coauthorships. Through this coauthorship network, it was eventually possible to bridge separate disciplines into a single interdisciplinary network.

Meme Detection Method In chapter 3, a method was demonstrated for detecting memes in a corpus of text. By focusing on the information properties of text phrases, memes could be detected at a high rate - and those phrases were then shown to be predictive of subsequent propagation.
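One way to picture an information-based phrase score is pointwise mutual information (Church & Hanks, 1990), which compares how often two words co-occur with how often they would co-occur by chance. The sketch below is an assumption for illustration and is not necessarily the measure used in Chapter 3; the function name and the toy corpus are hypothetical.

```python
import math
from collections import Counter


def bigram_pmi(tokens):
    """Return {(w1, w2): PMI in bits} for every adjacent word pair in `tokens`."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    scores = {}
    for (w1, w2), count in bigrams.items():
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        # PMI is positive when the pair occurs more often than chance predicts.
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


# Toy corpus (hypothetical): the phrase "make waves" repeats across messages.
corpus = "we make waves today and they make waves tomorrow and we vote".split()
ranked = sorted(bigram_pmi(corpus).items(), key=lambda item: item[1], reverse=True)
for pair, score in ranked[:3]:
    print(pair, round(score, 2))
```

On a realistic corpus, very rare pairs can receive inflated scores, so a minimum-frequency threshold would normally accompany a measure of this kind.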

Regression in Simulation Method In chapter 4, a method was demonstrated for utilizing a regression model from the psychological literature as the basis for a computational simulation with agent-based modelling. The practice of reusing models from the literature in computational simulations has the potential to establish a solid empirical foundation for many kinds of simulation research.

Although these new methods were applied to memes, it is intended that they might be generalizable to other domains.
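The regression-in-simulation idea can be sketched as follows, assuming a logistic regression with a single predictor. The coefficients are hypothetical placeholders, not the values estimated by Eriksson and Coultas (2014), and the class and function names are illustrative only. The point is that a published model's fitted equation becomes the decision rule of each simulated agent.

```python
import math
import random

# Hypothetical coefficients standing in for a published logistic regression.
INTERCEPT = -2.0
SLOPE_DISGUST = 1.5


def transmission_probability(disgust, intercept=INTERCEPT, slope=SLOPE_DISGUST):
    """Convert the regression's log-odds into a probability of retelling."""
    log_odds = intercept + slope * disgust
    return 1.0 / (1.0 + math.exp(-log_odds))


class Agent:
    """Simulated person whose retelling decision follows the regression model."""

    def __init__(self, rng):
        self.rng = rng

    def decides_to_retell(self, disgust):
        # Draw a Bernoulli outcome with the model-implied probability.
        return self.rng.random() < transmission_probability(disgust)


rng = random.Random(4)
agents = [Agent(rng) for _ in range(10000)]
for disgust in (0, 1):  # 0 = low-disgust story, 1 = high-disgust story
    rate = sum(agent.decides_to_retell(disgust) for agent in agents) / len(agents)
    print(disgust, round(transmission_probability(disgust), 3), round(rate, 3))
```

Because the agents only sample from the fitted equation, the simulated retelling rates should approximate the model's predictions, which gives a simple check that the translation from regression to simulation was done faithfully.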

5.3 Implications for Psychological Research with Memes

Humans and memes comprise a complex and interdependent system that should be studied with consideration for both humanity and memetics, requiring contributions from several disciplines. A meme coexists alongside a human counterpart, implying both an author and audience.1 A meme is capable of replication, requiring a social foundation upon which it may replicate. A meme is propagated until its cascade ends - and the historical trace of this cascade follows the contours of the network it propagated through. A meme is contagious, mutating as it adapts to new populations and spreading as an epidemic function. A meme is also a representation: an idea, a symbol, a story, a text, a picture, a song, etc. Because memes have all of these properties, there are several implications for performing research with memes.2

Implied Psychological Processes Research with memes implies the existence of individuals who engage with those memes, as well as corresponding psychological processes. This literature is derived from psychology and statistics. Many aspects of social transmission have been investigated from a psychological perspective, with many corresponding statistical models already published in the literature.

1 More generally, there must be an intelligent counterpart. Non-human primates have sufficiently demonstrated , as well.

2 By implication, research upon memes that does not address all these meme properties is at risk of being incomplete.

Implied Social Processes Research with memes implies social processes, in which people consider other people as part of their decision making. This literature is derived from social psychology, sociology, and anthropology. These social processes themselves may imply relationships, identities, motivations, and other constructs that can be viewed psychologically.

Implied Hierarchical Processes Research with memes implies hierarchical processes, in which processes are nested within other processes. This literature is derived from statistics, psychology, and sociology. Hierarchies can be obtained when objects belong to other objects, as when tweets belong to accounts or when articles belong to academics.

Implied Networks Research with memes implies the use of networks in order to represent the hierarchical relationships of the meme and its cascading transmission. This literature is derived from graph theory and sociology. Network science is a discipline unto itself which seems to interface with many other disciplines in a manner reminiscent of statistics, which also touches many other disciplines.

Implied Simulation Research with memes implies the use of simulations, in which memetic processes may be rapidly explored with synthetic populations. This literature is derived from computer science and statistics. Computational social simulations permit research that extends beyond the laboratory, exploring situations that might be difficult to observe. It is critical for computational simulations to be empirically calibrated, which can be achieved by leveraging models from the literature.

Implied Epidemics Research with memes implies epidemic contagion, in which a meme is transmitted through a population by way of social networks. This literature is derived from epidemiology, ecology, and social psychology. When epidemic mechanisms are used in simulations of memes, then agents can be “infected” by memes, propagating those memes to other agents.

Implied Information Research with memes implies information. Memes can be quantified in several ways, permitting detection and statistical analysis. This literature is derived from information science, computational linguistics, and computer science.

Implied Inter-disciplinarity Altogether, research with memes implies the relevance of many disciplines that must be combined without discrimination. Research conducted with memes can be related back to any of these disciplines, meaning meme research could ultimately be relevant to many different academic audiences. There is no natural disciplinary home for research with memes.3 However, the current work demonstrates that when a particular stance is adopted - social psychology, in this case - deep connections can be drawn from memes to psychology.

5.4 Final Remarks

This dissertation has demonstrated how scientific methods can be applied to understand memes - and the individuals and networks that propagate them. It is hoped that the principles set forth in this work can be adapted throughout science, particularly in the psychological sciences.

3 In 2019, the keyword Computational Social Science (e.g., Lazer et al., 2009) encompasses much of the methodology related to research with memes.

References

Abbasi, A., Altmann, J., & Hossain, L. (2011, October). Identifying the effects of co-authorship networks on the performance of scholars: A correlation and regression analysis of performance measures and social network analysis measures. Journal of Informetrics, 5 (4), 594-607. doi: 10.1016/j.joi.2011.05.007

Adamic, L. A., & Glance, N. (2005). The Political Blogosphere and the 2004 U.S. Election: Divided They Blog. In Proceedings of the 3rd International Workshop on Link Discovery (pp. 36–43). New York, NY, USA: ACM. doi: 10.1145/1134271.1134277

Adelson, R. M. (1966). Compound poisson distributions. Journal of the Operational Research Society, 17 (1), 73–75.

Allport, G. W. (1937). Personality: A psychological interpretation. New York: Henry Holt.

Allport, G. W., & Postman, L. (1946). An Analysis of Rumor. Public Opinion Quarterly, 10 (4), 501-517. doi: 10.1086/265813

American Psychological Association. (2019). PsycNET. https://psycnet.apa.org.

Asch, S. E. (1955). Opinions and social pressure. Scientific American, 193 (5), 31–35.

Atran, S. (2001). The trouble with memes. Human Nature, 12 (4), 351–381.

Aynaud, T. (2019, June). Louvain Community Detection. Contribute to taynaud/python-louvain development by creating an account on GitHub.

Bandura, A. (1977). Social learning theory (Vol. viii). Oxford, England: Prentice-Hall.

Bankes, S. C. (2002, May). Agent-based modeling: A revolution? Proceedings of the National Academy of Sciences, 99 (suppl 3), 7199-7200. doi: 10.1073/pnas.072081299

Banks, T., Fischler, W., Shenker, S. H., & Susskind, L. (1997, April). M Theory As A Matrix Model: A Conjecture. Physical Review D, 55 (8), 5112-5128. doi: 10.1103/PhysRevD.55.5112

Barabási, A.-L. (2009, July). Scale-Free Networks: A Decade and Beyond. Science, 325 (5939), 412-413. doi: 10.1126/science.1173299

Barabási, A.-L., & Albert, R. (1999). Emergence of scaling in random networks. Science, 286 (5439), 509–512.

Bartlett, F. C. (1932). Remembering: An experimental and social study. Cambridge: Cambridge University.

Bastian, M., Heymann, S., & Jacomy, M. (2009). Gephi: An open source software for exploring and manipulating networks. In Third international AAAI conference on weblogs and social media.

Bates, D., Maechler, M., Bolker, B., Walker, S., Christensen, R. H. B., Singmann, H., . . . Fox, J. (2019, March). Lme4: Linear Mixed-Effects Models using ’Eigen’ and S4.

BBC Staff. (2006, November). Star Wars Kid is top viral video. BBC .

Beebe, N. H. (2015, February). Bibclean – A BibTeX prettyprinter, verifier, etc. https://ctan.org/tex-archive/biblio/bibtex/utils/bibclean?lang=en.

Berger, J. (2011). Arousal Increases Social Transmission of Information. Psychological Science, 22 (7), 891–893.

Berger, J., & Milkman, K. (2010). Social Transmission, Emotion, and the Virality of Online Content. Wharton Research Paper.

Berry, B. J. L., Kiel, L. D., & Elliott, E. (2002, May). Adaptive agents, intelligence, and emergent human organization: Capturing complexity through agent-based modeling. Proceedings of the National Academy of Sciences, 99 (suppl 3), 7187-7188. doi: 10.1073/pnas.092078899

Best, J., & Horiuchi, G. T. (1985). The Razor Blade in the Apple: The Social Construction of Urban Legends. Social Problems, 32 (5), 488–499.

Bikhchandani, S., Hirshleifer, D., & Welch, I. (1998). Learning from the behavior of others: Conformity, fads, and informational cascades. Journal of economic perspectives, 12 (3), 151–170.

Bird, S. (2014). Nltk: Natural Language Toolkit.

Blackmore, S. (1998). Imitation and the definition of a meme. Journal of Memetics.

Blackmore, S. (1999). The meme machine. Oxford [England] ; New York: .

Blackmore, S. (2000). The Power Of Memes.

Blondel, V. D., Guillaume, J.-L., Lambiotte, R., & Lefebvre, E. (2008, October). Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008 (10), P10008. doi: 10.1088/1742-5468/2008/10/P10008

Bonabeau, E. (2002). Agent-based modeling: Methods and techniques for simulating human systems. Proceedings of the National Academy of Sciences, 99 (suppl 3), 7280–7287.

Borgman, C. L., & Furner, J. (2002). Scholarly communication and bibliometrics. Annual review of information science and technology, 36 (1), 1–53.

Boulogne, F., Mangin, O., & Verney, L. (2019, May). Bibtex parser for Python 2.7 and 3.3 or newer. Contribute to sciunto-org/python-bibtexparser development by creating an account on GitHub. sciunto-org.

Bowers, K. S. (1973, September). Situationism in psychology: An analysis and a critique. Psychological Review, 80 (5), 307-336. doi: 10.1037/h0035592

Box, G. E. (1979). Robustness in the strategy of scientific model building. In Robustness in statistics (pp. 201–236). Elsevier.

boyd, d., & Ellison, N. B. (2007). Social Network Sites: Definition, History, and Scholarship. Journal of Computer-Mediated Communication, 13 (1), 210–230. doi: 10.1111/j.1083-6101.2007.00393.x

boyd, d., Golder, S., & Lotan, G. (2010). Tweet, Tweet, Retweet: Conversational Aspects of Retweeting on Twitter. In 43rd Hawaii International Conference on System Sciences (HICSS) (pp. 1–10). IEEE.

Boyd, R., & Richerson, P. J. (2000). Meme theory oversimplifies cultural change. Scientific American, 283 , 54–55.

Boyd, R., Richerson, P. J., & Henrich, J. (2013). The cultural evolution of technology: Facts and theories. Cultural evolution: society, technology, language, and , 119–142.

Briatte, F. (2016, March). Ggnetwork: Geometries to Plot Networks with ’ggplot2’.

Butler, D. (2009, April). Swine flu goes global. Nature, 458 (7242), 1082-1083. doi: 10.1038/4581082a

Butts, C. T., Hunter, D., Handcock, M., Bender-deMoll, S., Horner, J., & Wang, L. (2019, April). Network: Classes for Relational Data.

Campbell, D. T. (1957, July). Factors relevant to the validity of experiments in social settings. Psychological Bulletin, 54 (4), 297-312. doi: 10.1037/h0040950

Castro, L., & Toro, M. A. (2014). Cumulative cultural evolution: The role of teaching. Journal of Theoretical Biology, 347 , 74–83.

Cavalli-Sforza, L. L., & Feldman, M. W. (1981). Cultural transmission and evolution: A quantitative approach (No. 16). Princeton University Press.

Centola, D., & Macy, M. (2007). Complex Contagions and the Weakness of Long Ties. American Journal of Sociology, 113 (3), 702–734.

Charisi, V., Dennis, L., Fisher, M., Lieck, R., Matthias, A., Slavkovik, M., . . . Yampolskiy, R. (2017, March). Towards Moral Autonomous Systems. arXiv:1703.04741 [cs].

Chattoe-Brown, E. (2009, December). The social transmission of choice: A simulation with applications to hegemonic discourse. & Society, 8 (2), 193-207.

Cheng, J., Kleinberg, J., Leskovec, J., Liben-Nowell, D., Subbian, K., & Adamic, L. (2018). Do Diffusion Protocols Govern Cascade Growth? In Twelfth International AAAI Conference on Web and Social Media.

Chubin, D. E. (1976, September). State of the Field The Conceptualization of Scientific Specialties. The Sociological Quarterly, 17 (4), 448-476. doi: 10.1111/j.1533-8525.1976.tb01715.x

Church, K. W., & Hanks, P. (1990). Word association norms, mutual information, and lexicography. Computational linguistics, 16 (1), 22–29.

Cohen, J. (1990, December). Things I Have Learned (So Far). American Psychologist, 45 (12), 1304-1312.

Comte, A. (1858). The positive philosophy of Auguste Comte (Vol. 1). Blanchard.

Conway, J. H. (1976). On numbers and games. New York: Academic Press.

Crespi, I. (1988). Pre-election polling: Sources of accuracy and error. Russell Sage Foundation.

Csardi, G., & Nepusz, T. (2006). The igraph software package for complex network research. InterJournal, Complex Systems, 1695 (5), 1–9.

Dawkins, R. (1976). The selfish gene. New York: Oxford University Press.

Dawkins, R. (1993). . Dennett and his critics: Demystifying mind, 13–27.

Debord, G. (1967). Society of the Spectacle. Bread and Circuses Publishing.

Dennett, D. (1990). Memes and the exploitation of imagination. Journal of aesthetics and art criticism, 127–135.

de Tocqueville, A. (1840). Democracy in America - Volume 2 (H. Reeve, Trans.).

Devine, P. G. (1989, January). Stereotypes and prejudice: Their automatic and controlled components. Journal of Personality and Social Psychology, 56 (1), 5-18. doi: 10.1037/0022-3514.56.1.5

De Waal, F. B. M. (2008). Putting the Altruism Back into Altruism: The Evolution of Empathy. Annual Review of Psychology, 59 , 279–300.

Dichter, E. (1966). How Word-of-Mouth Advertising Works. Harvard Business Review, 44 (6), 147.

Doran, D., Schulz, S., & Besold, T. R. (2017, October). What Does Explainable AI Really Mean? A New Conceptualization of Perspectives. arXiv:1710.00794 [cs].

Dorigo, M., Maniezzo, V., & Colorni, A. (1996, February). Ant system: Optimization by a colony of cooperating agents. IEEE Transactions on Systems, Man, and , Part B (Cybernetics), 26 (1), 29-41. doi: 10.1109/3477.484436

Dreyer, J. L. E. (1906). History of the planetary systems from Thales to Kepler. Cambridge University Press.

Dublin, L. I., & Lotka, A. J. (1925). On the true rate of natural increase: As exemplified by the population of the United States, 1920. Journal of the American statistical association, 20 (151), 305–339.

Dundes, A. (1965). The Study of Folklore in Literature and Culture: Identification and Interpretation. The Journal of American Folklore, 78 (308), 136-142. doi: 10.2307/538280

Durkheim, E. (2014). The Rules of Sociological Method: And Selected Texts on Sociology and its Method. Simon and Schuster.

Edwards, D., & Middleton, D. (1990). Collective remembering. Sage.

Eisenhower, D. D. (1961). Farewell address. Washington, DC , 17 .

Elias, P., Feinstein, A., & Shannon, C. (1956, December). A note on the maximum flow through a network. IRE Transactions on Information Theory, 2 (4), 117-119. doi: 10.1109/TIT.1956.1056816

Epskamp, S., Costantini, G., Haslbeck, J., Cramer, A. O. J., Waldorp, L. J., Schmittmann, V. D., & Borsboom, D. (2019, May). Qgraph: Graph Plotting Methods, Psychometric Data Visualization and Graphical Model Estimation.

Epstein, J. M. (1999). Agent-based computational models and generative social science. , 20.

Epstein, J. M. (2006). Generative Social Science: Studies in agent-based computational modeling. Princeton University Press.

Epstein, J. M. (2014). Agent Zero: Toward Neurocognitive Foundations for Generative Social Science. Princeton: Princeton University Press.

Epstein, J. M., & Axtell, R. L. (1996). Growing Artificial Societies: Social Science from the Bottom Up. Brookings Institution Press.

Eriksson, K., & Coultas, J. C. (2014). Corpses, Maggots, Poodles and Rats: Emotional Selection Operating in three Phases of Cultural Transmission of Urban Legends. Journal of Cognition and Culture, 14 (1-2), 1–26.

Euler, L. (1953). Leonhard Euler and the Koenigsberg Bridges. Scientific American, 189 (1), 66-72.

Fano, R. M. (1961). Transmission of Information: A Statistical Theory of Communication MIT Press. Cambridge, Mass. and Wiley, New York.

Farquhar, I. (2010, December). Engineering Security Solutions at Layer 8 and Above.

Ferrers, N. M. (1872). Extension of Lagrange’s equations. Quart. J. Pure Appl. Math, 12 (45), 1–5.

Fong, G. R. (2001, November). ARPA Does Windows: The Defense Underpinning of the PC Revolution. Business and Politics, 3 (3), 213-237. doi: 10.2202/1469-3569.1025

Fowler, J. H., & Christakis, N. A. (2010). Cooperative Behavior Cascades in Human Social Networks. Proceedings of the National Academy of Sciences, 107 (12), 5334–5338.

François, R. (2019, January). Bibtex parser for R. Contribute to romainfrancois/bibtex development by creating an account on GitHub.

Freedman, J. L., & Perlick, D. (1979, May). Crowding, contagion, and laughter. Journal of Experimental Social Psychology, 15 (3), 295-303. doi: 10.1016/0022-1031(79)90040-4

Freitag, D., Chow, E., Kalmar, P., Muezzinoglu, T., & Niekrasz, J. (2012). A Corpus of Online Discussions for Research into Linguistic Memes. In Web as Corpus Workshop (WAC7) (p. 14).

Friggeri, A., Adamic, L. A., Eckles, D., & Cheng, J. (2014). Rumor Cascades. In ICWSM.

Fruchterman, T. M. J., & Reingold, E. M. (1991, November). Graph drawing by force-directed placement. Software: Practice and Experience, 21 (11), 1129-1164. doi: 10.1002/spe.4380211102

García, A. P., & Rodríguez-Patón, A. (2016, April). Analyzing Repast Symphony models in R with RRepast package. bioRxiv, 047985. doi: 10.1101/047985

Garfield, E., & Merton, R. K. (1979). Citation indexing: Its theory and application in science, technology, and humanities (Vol. 8). Wiley New York.

Gelman, A., & Shalizi, C. R. (2013). Philosophy and the practice of Bayesian statistics. British Journal of Mathematical and Statistical Psychology, 66 (1), 8–38.

Giles, C. L., Bollacker, K. D., & Lawrence, S. (1998). CiteSeer: An Automatic Citation Indexing System. In ACM DL (pp. 89–98).

Ginsparg, P. (2011, August). ArXiv at 20. Nature, 476 , 145-147. doi: 10.1038/476145a

Giroux, H. A. (2015). University in Chains : Confronting the Military-Industrial-Academic Complex. Routledge. doi: 10.4324/9781315631363

Goffman, W., & Newill, V. A. (1964, October). Generalization of Epidemic Theory: An Application to the Transmission of Ideas. Nature, 204 (4955), 225-228. doi: 10.1038/204225a0

Google. (2019). Google Scholar. https://scholar.google.ca/.

Grabowski, A., & Kosinski, R. A. (2005). The SIS model of epidemic spreading in a hierarchical social network. Acta Physica Polonica B, 36 (5), 1579–1593.

Granovetter, M. S. (1973). The Strength of Weak Ties. American Journal of Sociology, 78 (6), 1360-1380.

Granovetter, M. S. (1983). The strength of weak ties: A network theory revisited. Sociological theory, 201–233.

Grimm, V., Berger, U., Bastiansen, F., Eliassen, S., Ginot, V., Giske, J., . . . DeAngelis, D. L. (2006, September). A standard protocol for describing individual-based and agent-based models. Ecological Modelling, 198 (1–2), 115-126. doi: 10.1016/j.ecolmodel.2006.04.023

Grimm, V., Berger, U., DeAngelis, D. L., Polhill, J. G., Giske, J., & Railsback, S. F. (2010, November). The ODD protocol: A review and first update. Ecological Modelling, 221 (23), 2760-2768. doi: 10.1016/j.ecolmodel.2010.08.019

Grimm, V., & Railsback, S. F. (2005). Individual-based modeling and ecology (Vol. 2005). BioOne.

Guare, J. (1990). Six degrees of separation: A play. Vintage.

Ha, L. Q., Sicilia-Garcia, E. I., Ming, J., & Smith, F. J. (2002). Extension of Zipf’s law to words and phrases. In Proceedings of the 19th international conference on Computational linguistics-Volume 1 (pp. 1–6). Association for Computational Linguistics.

Haines, J. R. (2015, February). Russia’s Use of Disinformation in the Ukraine Conflict. , 10.

Hale, S. (2012, November). Build your own interactive network - Interactive Visualizations. http://blogs.oii.ox.ac.uk/vis/2012/11/15/build-your-own-interactive-network/.

Harada, J., Darmon, D., Girvan, M., & Rand, W. (2015). Forecasting high tide: Predicting times of elevated activity in online social media. In Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2015 (pp. 504–507). ACM.

Hatfield, E., Cacioppo, J. T., & Rapson, R. L. (1993). Emotional contagion. Current Directions in Psychological Science, 2 (3), 96-99. doi: 10.1111/1467-8721.ep10770953

Heart, F. E., Kahn, R. E., Ornstein, S. M., Crowther, W. R., & Walden, D. C. (1970). The interface message processor for the ARPA computer network. In Proceedings of the May 5-7, 1970, spring joint computer conference (pp. 551–567). ACM.

Heath, C., Bell, C., & Sternberg, E. (2001). Emotional Selection in Memes: The Case of Urban Legends. Journal of Personality and Social Psychology, 81 (6), 1028.

Heider, F. (1958). The Psychology of Interpersonal Relations. Psychology Press. doi: 10.4324/9780203781159

Herdan, G. (1960). Type-token mathematics (Vol. 4). Mouton.

Hethcote, H. W. (1974). Asymptotic Behavior and Stability in Epidemic Models. In P. van den Driessche (Ed.), Mathematical Problems in Biology (p. 83-92). Springer Berlin Heidelberg.

Hethcote, H. W. (1994). A Thousand and One Epidemic Models. In S. A. Levin (Ed.), Frontiers in Mathematical Biology (p. 504-515). Springer Berlin Heidelberg.

Heyer, P. (2003). America under attack I: A reassessment of Orson Welles’ 1938 war of the worlds broadcast. Canadian Journal of Communication; Toronto, 28 (2), 149-165.

Heyns, E. (2019, May). Better-BibTeX.

Hilgetag, C.-C., Burns, G. A., O’Neill, M. A., Scannell, J. W., & Young, M. P. (2000). Anatomical connectivity defines the organization of clusters of cortical areas in the macaque and the cat. Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences, 355 (1393), 91–110.

Hillygus, D. S. (2011). The evolution of election polling in the United States. Public opinion quarterly, 75 (5), 962–981.

Hilton, J. L., & von Hippel, W. (1996). Stereotypes. Annual Review of Psychology, 47 , 237-271. doi: 10.1146/annurev.psych.47.1.237

Hodas, N. O., & Lerman, K. (2014). The simple rules of social contagion. Scientific reports, 4 , 4343.

Holland, P. W., & Leinhardt, S. (1971, May). Transitivity in Structural Models of Small Groups. Comparative Group Studies, 2 (2), 107-124. doi: 10.1177/104649647100200201

Hunter, D. R., Handcock, M. S., Butts, C. T., Goodreau, S. M., & Morris, M. (2008, May). Ergm: A Package to Fit, Simulate and Diagnose Exponential-Family Models for Networks. Journal of statistical software, 24 (3), nihpa54860.

ISO/IEC. (1994). 7498-1: 1994 information technology–open systems interconnection–basic reference model: The basic model. International Organization for Standardization.

Jackson, A. (2007). A labor of love: The mathematics genealogy project. Notices of the AMS, 54 (8), 1002–1003.

Jacomy, M., Venturini, T., Heymann, S., & Bastian, M. (2014). ForceAtlas2, a Continuous Graph Layout Algorithm for Handy Network Visualization Designed for the Gephi Software. PLOS ONE, 9 (6), e98679. doi: 10.1371/journal.pone.0098679

Jaynes, E. T. (1957). Information theory and statistical mechanics. Physical review, 106 (4), 620.

Jennings, N. R., Sycara, K., & Wooldridge, M. (1998, January). A Roadmap of Agent Research and Development. Autonomous Agents and Multi-Agent Systems, 1 (1), 7–38. doi: 10.1023/A:1010090405266

Jiang, L., Miao, Y., Yang, Y., Lan, Z., & Hauptmann, A. G. (2014). Viral Video Style: A Closer Look at Viral Videos on YouTube. In Proceedings of International Conference on Multimedia Retrieval (pp. 193:193–193:200). New York, NY, USA: ACM. doi: 10.1145/2578726.2578754

Johnson, D. M. (1945). The “Phantom Anesthetist” of Mattoon: A Field Study of Mass Hysteria. The Journal of Abnormal and Social Psychology, 40 (2), 175-186. doi: 10.1037/h0062339

Jung, C. G. (1968). The Archetypes and the Collective Unconscious (2nd ed.,Revised ed.; R. F. C. Hull, Ed.). New York : Florence: Routledge Taylor & Francis Group.

Kass, R. E., & Raftery, A. E. (1995). Bayes factors. Journal of the american statistical association, 90 (430), 773–795.

Kermack, W. O., & McKendrick, A. G. (1927, January). A Contribution to the Mathematical Theory of Epidemics. Proceedings of the Royal Society of London. Series A, 115 (772), 700-721. doi: 10.1098/rspa.1927.0118

Kilgour, F. (1979, March). Shared cataloging at OCLC. Online Review, 3 (3), 275-279. doi: 10.1108/eb024004

Kooti, F., Mason, W. A., Gummadi, K. P., & Cha, M. (2012). Predicting emerging social conventions in online social networks. In Proceedings of the 21st ACM international conference on Information and knowledge management (pp. 445–454).

Kooti, F., Yang, H., Cha, M., Gummadi, P. K., & Mason, W. A. (2012). The Emergence of Conventions in Online Social Networks. In Proceedings of the International AAAI Conference on Weblogs and Social Media.

Kram, K. E. (1988). Mentoring at work: Developmental relationships in organizational life. University Press of America.

Kyanka, R. (1999, November). Something Awful.

Lamport, L. (1994). LaTeX: A document preparation system: User’s guide and reference manual. Addison-Wesley.

Lazer, D., Pentland, A. S., Adamic, L., Aral, S., Barabasi, A. L., Brewer, D., . . . others (2009). Computational social science. Science (New York, NY), 323 (5915), 721.

Le Bon, G. (1895). The Crowd. Routledge. doi: 10.4324/9781315131566

Lederberg, J., & Feigenbaum, E. A. (1977). ANNUAL REPORT-YEAR 05. Resource.

Leibniz, G. W. (1693). ON THE CALCULABILITY OF THE NUMBER OF ALL POSSIBLE . The Leibniz Review, 13 , 99-101. (OCLC: 4839619535)

Leskovec, J., McGlohon, M., Faloutsos, C., Glance, N., & Hurst, M. (2007, April). Cascading Behavior in Large Blog Graphs. arXiv:0704.2803 [physics].

Ley, M. (2002). The DBLP Computer Science Bibliography: Evolution, Research Issues, Perspectives. In A. H. F. Laender & A. L. Oliveira (Eds.), String Processing and Information Retrieval (p. 1-10). Springer Berlin Heidelberg.

Liberman, S., & Wolf, K. B. (2013). Scientific communication in the process to coauthorship. Handbook of the psychology of science, 123–150.

Lidwell, W., Holden, K., & Butler, J. (2010). Universal principles of design, revised and updated: 125 ways to enhance usability, influence , increase appeal, make better design decisions, and teach through design. Rockport Pub.

Liu, J., & Ashton, P. S. (1995). Individual-based simulation models for forest succession and management. Forest Ecology and Management, 73 (1-3), 157–175.

Liu, J., Shang, J., Wang, C., Ren, X., & Han, J. (2015). Mining Quality Phrases from Massive Text Corpora. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data (pp. 1729–1744). New York, NY, USA: ACM. doi: 10.1145/2723372.2751523

Lohmann, S. (1994). The dynamics of informational cascades: The Monday demonstrations in Leipzig, East Germany, 1989–91. World politics, 47 (01), 42–101.

Lotan, G., Graeff, E., Ananny, M., Gaffney, D., Pearce, I., & m. boyd, d. (2011). The Arab Spring - The Revolutions Were Tweeted: Information Flows During the 2011 Tunisian and Egyptian Revolutions. International Journal of Communication, 5 , 31.

Luke, S., Cioffi-Revilla, C., Panait, L., Sullivan, K., & Balan, G. (2005, January). MASON: A Multiagent Simulation Environment. SIMULATION , 81 (7), 517-527. doi: 10.1177/0037549705058073

Lyon, A., & Pacuit, E. (2013). The Wisdom of Crowds: Methods of Human Judgement Aggregation. In P. Michelucci (Ed.), Handbook of Human Computation (p. 599-614). New York, NY: Springer New York. doi: 10.1007/978-1-4614-8806-4_47

Macy, M. W., & Willer, R. (2002). From Factors to Actors: Computational Sociology and Agent-Based Modeling. Annual Review of Sociology, 28 (1), 143-166. doi: 10.1146/annurev.soc.28.110601.141117

Malmgren, R. D., Ottino, J. M., & Amaral, L. A. N. (2010). The role of mentorship in protégé performance. Nature, 465 (7298), 622.

Martin, S., Brown, W. M., Klavans, R., & Boyack, K. W. (2011). OpenOrd: An open-source toolbox for large graph layout. In Visualization and Data Analysis 2011 (Vol. 7868, p. 786806). International Society for Optics and Photonics.

Martin, S., Brown, W. M., & Wylie, B. N. (2008). DrL: Distributed Recursive (graph) Layout (Tech. Rep. No. 2936). Sandia National Laboratories.

McArthur, L. Z., & Baron, R. M. (1983, July). Toward an ecological theory of social perception. Psychological Review, 90 (3), 215-238. doi: 10.1037/0033-295X.90.3.215

McGrath, R. (2013). Twython: Actively maintained, pure Python wrapper for the Twitter API. Supports both normal and streaming Twitter APIs.

McLuhan, M., Fiore, Q., & Agel, J. (1967). The Medium is the Massage: An Inventory of Effects. New York: Bantam Books.

Miller, I. D. (2012). The Social Transmission of User-Generated Memes (MA Thesis, University of Toronto, Toronto, ON, Canada). doi: 1807/67214

Miller, I. D. (2016, May). Coauthorship Network Visualisation. http://imiller.utsc.utoronto.ca/media/network/.

Mitchell, J. C. (1969). Social Networks in Urban Situations: Analyses of Personal Relationships in Central African Towns. Manchester University Press.

Mullen, P. B. (1972). Modern Legend and Rumor Theory. Journal of the Folklore Institute, 9 (2/3), 95-109. doi: 10.2307/3814160

Nakagawa, E., & Unebasami, K. (2007, January). I Can Has Cheezburger.

Newman, M. E. J. (2001). Scientific collaboration networks. I. Network construction and fundamental results. Physical Review E, 64 (1), 016131.

Newman, M. E. J. (2003, June). The Structure and Function of Complex Networks. SIAM Review, 45 (2), 167-256. doi: 10.2307/25054401

Newman, M. E. J. (2004a). Coauthorship networks and patterns of scientific collaboration. Proceedings of the National Academy of Sciences, 101 (suppl 1), 5200–5205.

Newman, M. E. J. (2004b). Who is the best connected scientist? A study of scientific coauthorship networks. Complex networks, 337–370.

Newman, M. E. J., Barabasi, A.-L., & Watts, D. J. (2006). The structure and dynamics of networks. Princeton University Press.

Newman, M. E. J., & Girvan, M. (2004). Finding and Evaluating Community Structure in Networks. Physical Review E, 69 (2).

Nicol, C. J. (1995, September). The social transmission of information and behaviour. Applied Animal Behaviour Science, 44 (2-4), 79-98.

Page, R. M. (1962). The origin of radar.

Patashnik, O., & Lamport, L. (2010, December). CTAN: Package bibtex. https://www.ctan.org/pkg/bibtex.

Peterson, J. B. (1999). Maps of meaning: The architecture of belief. New York: Routledge.

Pierson, P. (2000). Increasing returns, path dependence, and the study of politics. American Political Science Review, 94 (2), 251–267.

Platt, J. R. (1964). Strong Inference. Science, 146 (3642), 347-353.

Pons, P., & Latapy, M. (2005, December). Computing communities in large networks using random walks (long version). arXiv:physics/0512106.

Pope, J. (1804). Description of an Orrery of His Construction. Memoirs of the American Academy of Arts and Sciences, 2 (2), 43-45. doi: 10.2307/27670809

Python Software Foundation. (2010, July). Python Language.

R Core Team. (2013). R: A Language and Environment for Statistical Computing. Vienna, Austria.

Railsback, S. F., & Grimm, V. (2011). Agent-based and individual-based modeling: A practical introduction. Princeton University Press.

Railsback, S. F., & Harvey, B. C. (2002). Analysis of habitat-selection rules using an individual-based model. Ecology, 83 (7), 1817–1830.

Reid, C. R., Latty, T., Dussutour, A., & Beekman, M. (2012, October). Slime mold uses an externalized spatial “memory” to navigate in complex environments. Proceedings of the National Academy of Sciences, 109 (43), 17490-17494. doi: 10.1073/pnas.1215037109

Reid, T. (1832). Treatise on Clock and Watch Making: Theoretical and Practical. Carey and Lea.

Rodriguez, M. A., & Pepe, A. (2008, July). On the relationship between the structural and socioacademic communities of a coauthorship network. Journal of Informetrics, 2 (3), 195-201. doi: 10.1016/j.joi.2008.04.002

Roediger, H. L., Meade, M. L., & Bergman, E. T. (2001). Social contagion of memory. Psychonomic Bulletin & Review, 8 (2), 365–371.

Rogers, E. M. (1962). Diffusion of innovations. New York: Free Press of Glencoe.

Roy Rosenzweig Center for History and New Media. (2019, November). Zotero. George Mason University.

Rubin, A. M., Perse, E. M., & Powell, R. A. (1985, December). Loneliness, Parasocial Interaction, and Local Television News Viewing. Human Communication Research, 12 (2), 155-180. doi: 10.1111/j.1468-2958.1985.tb00071.x

Rubin, A. M., & Step, M. M. (2000, December). Impact of Motivation, Attraction, and Parasocial Interaction on Talk Radio Listening. Journal of Broadcasting & Electronic Media, 44 (4), 635-654. doi: 10.1207/s15506878jobem4404_7

Rubin, D. C. (1997). Memory in oral traditions: The cognitive psychology of epic, ballads, and counting-out rhymes. Oxford University Press on Demand.

Rumelhart, D. E., & McClelland, J. L. (1988). Parallel distributed processing (Vol. 1). IEEE.

SAclopedia. (2004, December). Image Macro [SAclopedia].

Schank, J. C. (2001, January). Beyond reductionism: Refocusing on the individual with individual-based modeling. Complexity, 6 (3), 33-40. doi: 10.1002/cplx.1026

Schelling, T. C. (1971). Dynamic models of segregation. Journal of Mathematical Sociology, 1 (2), 143–186.

Schmuckler, M. A. (2001). What Is Ecological Validity? A Dimensional Analysis. Infancy, 2 (4), 419-436. doi: 10.1207/S15327078IN0204_02

Schnall, S., Haidt, J., Clore, G. L., & Jordan, A. H. (2008, August). Disgust as Embodied Moral Judgment. Personality and Social Psychology Bulletin, 34 (8), 1096-1109. doi: 10.1177/0146167208317771

Seidel, R. W. (1983). Accelerating Science: The Postwar Transformation of the Lawrence Radiation Laboratory. Historical Studies in the Physical Sciences, 13 (2), 375-400. doi: 10.2307/27757520

Shannon, C. E. (1948, July). A Mathematical Theory of Communication. The Bell System Technical Journal, 27, 379-423.

Shannon, C. E. (1951). Prediction and entropy of printed English. Bell Labs Technical Journal, 30 (1), 50–64.

Sherif, M., Harvey, O. J., White, B. J., Hood, W. R., & Sherif, C. W. (1961). Intergroup conflict and cooperation: The Robbers Cave experiment. Oklahoma.

Small, H. (1973, July). Co-citation in the scientific literature: A new measure of the relationship between two documents. Journal of the American Society for Information Science, 24 (4), 265-269. doi: 10.1002/asi.4630240406

Sporns, O., & Kötter, R. (2004, October). Motifs in Brain Networks. PLOS Biology, 2 (11), e369. doi: 10.1371/journal.pbio.0020369

Srull, T. K., & Wyer, R. S. (1989, January). Person Memory and Judgment. Psychological Review, 96 (1), 58.

Stein, S., Averitt, S., & Shaffer, H. (2004, October). LAYER 8: A White Paper on Managing Information Technology Investments to Advance NC State’s Mission. NC State University.

Stubbersfield, J. M., Tehrani, J. J., & Flynn, E. G. (2015). Serial Killers, Spiders and Cybersex: Social and Survival Information Bias in the Transmission of Urban Legends. British Journal of Psychology, 106 (2), 288–307.

Swarm Development Group. (2016, July). SWARMFEST 2016: 20th Annual Meeting on Agent Based Modeling & Simulation (Tech. Rep.). Burlington, Vermont: University of Vermont.

Szabo, G., & Huberman, B. A. (2010). Predicting the popularity of online content. Communications of the ACM, 53 (8), 80–88.

Tajfel, H. (1974, April). Social identity and intergroup behaviour. Information (International Social Science Council), 13 (2), 65-93. doi: 10.1177/053901847401300204

Tarde, G. (1962). The laws of imitation. Gloucester, Mass., P. Smith.

Tedeschi, G., Mazloumian, A., Gallegati, M., & Helbing, D. (2012). Bankruptcy cascades in interbank markets. PLoS ONE, 7 (12), e52749.

Tesfatsion, L. (2003, February). Agent-based computational economics: Modeling economies as complex adaptive systems. Information Sciences, 149 (4), 262-268. doi: 10.1016/S0020-0255(02)00280-3

Thiele, J. C., Kurth, W., & Grimm, V. (2014). Facilitating parameter estimation and sensitivity analysis of agent-based models: A cookbook using NetLogo and R. Journal of Artificial Societies and Social Simulation, 17 (3), 11.

Tillotson, J., Cherry, J., & Clinton, M. (1995). Internet use through the University of Toronto Library: Demographics, destinations, and users’ reactions. Information Technology and Libraries, 14 (3), 190–199.

Travers, J., & Milgram, S. (1967). The small world problem. Psychology Today, 1 (1), 61–67.

VanArsdale, D. W. (1998). Chain Letter Evolution.

Vickery, J. R. (2014, March). The curious case of Confession Bear: The reappropriation of online macro-image memes. Information, Communication & Society, 17 (3), 301-325. doi: 10.1080/1369118X.2013.871056

von Neumann, J. (1966). Theory of self-reproducing automata (A. W. Burks, Ed.). Urbana: University of Illinois Press.

Von Ahn, L., Blum, M., Hopper, N. J., & Langford, J. C. (2003). CAPTCHA: Using hard AI problems for security. In Advances in Cryptology—EUROCRYPT 2003 (pp. 294–311). Springer.

Walras, L. (1874). Elements of Pure Economics. Routledge. doi: 10.4324/9781315888958

Wang, J., & Shapira, P. (2011, March). Funding acknowledgement analysis: An enhanced tool to investigate research sponsorship impacts: The case of nanotechnology. Scientometrics, 87 (3), 563-586. doi: 10.1007/s11192-011-0362-5

Watts, D. J., & Strogatz, S. H. (1998). Collective Dynamics of Small-World Networks. Nature, 393, 440–442.

Waugh, A. S., Pei, L., Fowler, J. H., Mucha, P. J., & Porter, M. A. (2009). Party polarization in congress: A network science approach.

Way, S. F., Morgan, A. C., Larremore, D. B., & Clauset, A. (2019, May). Productivity, prominence, and the effects of academic environment. Proceedings of the National Academy of Sciences, 116 (22), 10729-10733. doi: 10.1073/pnas.1817431116

Wellman, B. (1976). Urban connections.

Wellman, B. (1979, March). The Community Question: The Intimate Networks of East Yorkers. American Journal of Sociology, 84 (5), 1201-1231. doi: 10.1086/226906

White, H. D., & Griffith, B. C. (1981). Author cocitation: A literature measure of intellectual structure. Journal of the American Society for Information Science, 32 (3), 163-171. doi: 10.1002/asi.4630320302

Whiten, A., Spiteri, A., Horner, V., Bonnie, K. E., Lambeth, S. P., Schapiro, S. J., & De Waal, F. B. M. (2007). Transmission of multiple traditions within and between chimpanzee groups. Current Biology, 17 (12), 1038–1043.

Wickham, H. (2011). Ggplot2. Wiley Interdisciplinary Reviews: Computational Statistics, 3 (2), 180-185. doi: 10.1002/wics.147

Wilensky, U. (1999). NetLogo. Center for Connected Learning and Computer-Based Modeling, Northwestern University. Evanston, IL.

Wilensky, U., & Rand, W. (2015). An introduction to agent-based modeling: Modeling natural, social, and engineered complex systems with NetLogo. MIT Press.

Williams, J. R., Lessard, P. R., Desu, S., Clark, E. M., Bagrow, J. P., Danforth, C. M., & Dodds, P. S. (2015). Zipf’s law holds for phrases, not words. Scientific Reports, 5, 12209.

Wojnicki, A. C., & Godes, D. (2008, April). Word-of-Mouth as Self-Enhancement. SSRN eLibrary.

Woolley, A. W., Chabris, C. F., Pentland, A., Hashmi, N., & Malone, T. W. (2010, October). Evidence for a Collective Intelligence Factor in the Performance of Human Groups. Science, 330 (6004), 686-688. doi: 10.1126/science.1193147

Youyou, W., Kosinski, M., & Stillwell, D. (2015). Computer-based personality judgments are more accurate than those made by humans. Proceedings of the National Academy of Sciences, 112 (4), 1036–1040.

Zajonc, R. B. (1980). Cognition and social cognition: A historical perspective. Retrospections on social psychology, 180–204.

Zipf, G. K. (1932). Selected studies of the principle of relative frequency in language.