Research Collection

Doctoral Thesis

Reputation and Networks Studies in Reputation Effects and Network Formation

Author(s): Wehrli, Stefan

Publication Date: 2014

Permanent Link: https://doi.org/10.3929/ethz-a-010432336

Rights / License: In Copyright - Non-Commercial Use Permitted

This page was generated automatically upon download from the ETH Zurich Research Collection. For more information please consult the Terms of use.

ETH Library DISS. ETH NO. 22476

Reputation and Networks

Studies in Reputation Effects and Network Formation

A thesis submitted to attain the of

DOCTOR OF SCIENCES of ETH ZURICH

(Dr. sc. ETH Zurich)

presented by

STEFAN WEHRLI

lic. rer. soc., University of Bern

born on October 13, 1974

citizen of Küttigen, AG

Prof. Dr. Andreas Diekmann (examiner) Prof. Dr. Andreas Flache (co-examiner) Prof. Dr. Ryan Murphy (co-examiner)

2014

Acknowledgments

This dissertation could not have been written without the support of many people. First of all, I would like to thank my advisor Andreas Diekmann for giving me the opportunity and the freedom to follow my curiosity. I am especially thankful to Andreas for creating an inspiring atmosphere and community that allowed me and my colleagues to pursue ideas that leave his center of gravity. He has been my advisor for many years now, but time and again his comments and ideas have offered new insights. Andreas has taught me the fundamentals of social theory, practical knowledge of empirical research, statistics, game theory, and also en- couraged me to pursue studies in analysis that I had the privilege to teach at the ETH for several years. I am also indebted to Ryan Murphy, who as a co-advisor has not only de- cisively supported this dissertation but who is also the best director of the ETH Decision Science Laboratory I can think of. His door was always open for me to discuss urgent matters of the lab’s daily business. Ryan taught me lessons in how to engage in proper decision making and strategic thinking. But perhaps most of all, I admire the elegance with which he approaches, presents, and teaches lex parsimoniae that goes beyond the beauty of behavioral decision science. I am proud to have convinced Andreas Flache to support my dissertation as my second co-advisor and external referee. Many of my colleagues know that I am espe- cially fond of approaching sociology via computational means. And Andreas is truly a pioneer in this field that has become so important in recent years. I owe a lot of my basic views, skills, and concepts to my colleagues Ben Jann and Wojtek Przepiorka, who co-authored the first chapter of this dissertation. I have benefited a lot from Ben’s extraordinary statistical rigor and Wojtek’s ana- lytical and writing skills. The conversations with Reto Meyer have always been encouraging, and I very much enjoyed working with him. Furthermore, I would like to express my gratitude to Dominik Hangartner, who helped me design and collect the experimental data for the second chapter and Debra Hevenstone for editing the first draft of my last chapter. iv

There are many other people whom I met in the past years at ETH. It has been a great privilege to work with them. Kurt Ackermann, Stefano Balietti, Noemi Blättler, Oliver Brägger, Heidi Bruderer, Joël Berger, Dirk Helbing, Marc Höglinger, Stefan Karlen, Nadja Jehli, Claudia Jenny, Silvana Jud, Sergi Lozano, Michael Mäs, Mathias Naef, Lennart Schalk, Heiko Rauhut, Manuela Vieht, Fabian Winter, and all the many researchers I have had the pleasure to assist and guide at DeSciL. Thank you all for wonderful discussions, for your feedback, comments, and support. Finally, I would like to thank my parents and my family, Caroline, Nora, and Julien for the most valuable support of all. Your love, kindness, and inspiration have made this work possible.

Zurich, November 2014

Stefan Wehrli Abstract

This dissertation contributes to research on relationship formation in the online world. The general topic is approached from two different angles. First, I exam- ine how pure strangers establish cooperative relationships on large, anonymous, and geographically dispersed online auction markets and how traders solve the problem of trust symptomatically for any kind of one-shot interactions. Second, I take a trusting relation as given and ask what kind of structures emerge when social actors translate their repeated and enduring social relationships into ac- quaintanceship networks on social networking sites. The first substantive chapter 2 is concerned with the measurement of reputa- tion effects in online auctions. Today’s online markets provide a unique oppor- tunity to test theories of trust, reciprocity, and reputation using large samples of unobtrusive data. Our statistical analyses show that sellers with better reputations have higher sales and obtain higher prices. Furthermore, we observe a high rate of participation in the feedback system, which is largely consistent with theories that suggest mixtures of strong reciprocity and altruism to solve the second order problem of cooperation. In the second article in chapter 3, I study the mechanism design of reputation systems and how social science can contribute to a proper functioning of online market places. The first part of the chapter provides a review of the literature on reputation building and how it relates to indirect reciprocity discussed in theo- retical biology. Special consideration is given to the role of direct reciprocity in bilateral rating systems prevalent in online auction markets. Within the paradigm of binary-choice trust games, I compare the performance of a bilateral-sequential, a bilateral-blind, and a one-sided feedback mechanism by means of experimen- tal methods. Feedback mechanisms that constrain the role of direct reciprocity exhibit substantially higher levels of trust while at the same time lowering the feedback turnout. For the third substantive chapter 4, the perspective changes to repeated inter- actions, acquaintanceship, and social relations that have primarily formed outside vi the online world while having left, however, traces on a popular social networking site. Based on simple generative network models from economics and statistical physics, I characterize an ensemble of five online collegiate networks that corre- spond to five universities in Switzerland. By means of , I reanalyze well-known theoretical and empirical findings from the network lit- erature. It turns out that the predictions from these simple models can draw a surprisingly accurate picture of structures that surround the students. The online world provides a lens through which social scientists can study social phenomena on unprecedented scale and with unparalleled efficiency. In the last chapter (5), I explore how individual differences influences be- havior in online social networks based on data collected from an online survey that can be linked to the data of chapter 4. Rivaling theories give very different answers to the question why people are so remarkably different in terms of the number of relations they maintain. By way of referring to the five-factor model from personality psychology, a potentially stable and exogenous source of indi- vidual variation is used to explain how students shape their ego-centric networks. I show how these network antecedents determine participation, adoption time, nodal degree and network growth over a period of 4 months. The statistical anal- ysis identifies extraversion as a major driving force in the tie-formation process. A counter-intuitive positive effect for emotional stability and a negative influence for conscientiousness are found. Openness and agreeableness are surprisingly unrelated to observed networking behavior. The two parts of this dissertation, even if different in topical focus, operate by a shared methodological strategy. Chapter 2 and 4 both resort to distributed real-time online observation, each complemented in chapter 3 and 5 by more tra- ditional forms of data collection to triangulate results and address open questions. Kurzfassung

Die vorliegende Dissertation leistet einen Beitrag zur Forschung über Entste- hung, Erhalt und Struktur von Beziehungen in der Online-Welt. Das generell formulierte Thema wird aus zwei unterschiedlichen Perspektiven beleuchtet. Die erste Perspektive beschaftigt sich mit der Frage, wie unter komplett unbekan- nten Personen Kooperation entsteht, die unter ungünstigsten Voraussetzungen auf anonymen Online-Auktionsmärkten ein einmaliges Tauschgeschaft vollziehen und damit gezwungen sind, ein Vertrauensproblem zu lösen. Die zweite Perspek- tive setzt vertrauensvolle Beziehungen voraus und fragt danach, welche Struk- turen entstehen, wenn soziale Akteure ihre real-weltlichen Beziehungen in die Online-Welt übertragen und dort fortsetzen. Das erste inhaltliche Kapitel 2 beschäftigt sich mit der Messung von Reputa- tionseffekten bei Online-Auktionen. Heutige Internet-Auktionsmärkte erlauben auf eine einzigartige Weise, Theorien über Vertrauen, Reziprozität und Repu- tation zu testen; dies basierend auf Stichproben nicht-reaktiven Datenmaterials. Unsere statistische Analyse zeigt, dass Verkäufer mit einer Reputation für ver- trauenswürdiges Handel mit höherer Wahrscheinlichkeit ihre Güter verkaufen und dies zusätzlich zu besseren Konditionen und höheren Preisen. Zusätzlich beobachten wir eine hohe Beteiligungsrate im Feedback-Forum. Dieser Be- fund ist konsistent mit Erklärungen des Kooperationsproblems zweiter Ordnung basierend auf Theorien von “starker” Reziprozität und unbedingtem Altruismus. In Kapitel 3 untersuche ich unterschiedliche Mechanismen und Anreize von Reputationssystemen und verweise auf potentielle Beiträge bei der Gestal- tung und Optimierung von Online-Marktplätzen, die man seitens der Sozialwis- senschaften erwarten möchte. Im ersten Teil dieses Kapitel wird ausführlich die Literatur zu Aufbau und Erhalt von Reputation diskutiert. Es wird gezeigt, in welchem Zusammenhang diese zum Konzept der indirekten Reziprozität seitens der theoretischen Biologie steht. Besonderes Augenmerk wir dabei auf die Wirkung direkter Reziprozität für bilaterale Bewertungssysteme gelegt. Basierend auf dem spieltheoretischen Model des Vertrauensspiels vergleiche ich viii mittels Laborexperimenten die Leistungsfähigkeit und Effizienz eines bilateral- sequentiellen, eines bilateral-blinden und eines einseitigen Bewertungssystems. Bewertungssysteme, die gezielt die Wirkung direkter Reziprozität einschränken, zeigen ein signifikant höheres Niveau vertrauenswürdigen Tausches während gle- ichzeitig das Niveau direkter Reziprozität massgeblich zur Erklärung der Bewer- tungsrate beiträgt. Für das Kapitel 4 wechsle ich die Perspektive zu wiederholten Interak- tionen, Bekanntschaften und sozialen Beziehungen, die sich primär ausserhalt der Online-Welt etabliert haben, jedoch analysierbare Spuren auf populären Online-Netzwerken hinterlassen. Basierend auf einfachen generativen Netzwerk- Modellen aus der Ökonomik und der statistischen Physik charakterisiere ich ein Ensemble von fünf Freundschaftsnetzwerken von Studierenden Schweizer Universitäten. Es zeigt sich, dass die Vorhersagen dieser einfachen Netzwerk- modelle ein akkurates Bild der Netzwerkstrukturen anzeigen, die Studierende während ihres Studiums vorfinden. Die Online-Welt eröffnet hier ungeahnte Möglichkeiten soziale Phänomene auf neuen Grössenskalen und effizient zu beobachten. Im letzten Kapitel 5 ergründe ich, wie individual-spezifische Heterogen- ität Verhalten auf Online-Freundschaftsnetzwerken bedingt. Dies erfolgt mittels Daten eines Online-Surveys, die sich mit den Daten des Kapitels 3 verbinden lassen. Rivalisierende Theorien geben unterschiedliche Antworten darauf, we- shalb Personen eine so unterschiedliche Anzahl Kontakte pflegen. Mit einer Anwendung des Fünf-Faktoren-Modells aus der Persönlichkeitspsychologie ver- wende ich eine potentiell stabile und exogene Quelle individueller Variation, um zu erklären, wie Studierende ihre ego-zentrierten Netzwerke auf der Online- Plattform pflegen. Ich zeige, inwieweit sich diese Persönlichkeitsmerkmale auf Online-Verhalten wie Partizipationsrate, Eintrittszeit, den Kontengrad, weit- ere Zentralitäts- und Transitivitätsmasse, sowie auf das Wachstum des ego- zentrierten Netzwerkes über einen Zeitraum von 4 Monaten auswirkt. Eine statis- tische Analyse identifiziert Extraversion als massgeblichen Erklärungsfaktor des Knotengrades. Ein kontraintuitives Resultat zeigt sich für die emotionale Sta- bilität und ein negativer Effekt auf die Netzwerkaktivität resultiert für Gewis- senhaftigkeit. Keine spezifischen Aussagen lassen sich erstaunlicherweise für Offenheit und Verträglichkeit ableiten. Die zwei Teile dieser Dissertation haben zwei unterschiedliche thematis- che Zentren, bilden aber eine Einheit hinsichtlich der methodischen Herange- hensweise an die Online-Forschung. Kapitel 2 und 4 basieren beide auf verteilter Echtzeit-Beobachtung von Internet-Phänomenen. Sie werden in Kapiteln 3 und 5 jeweils ergänzt durch standardisierte und traditionellere Methoden der Datenerhe- bung, die dabei helfen, Resultate zu triangulieren und offene Fragen aufzugreifen. Contents

Acknowledgments iii

Abstract v

Kurzfassung vii

1 Introduction 1

2 Reputation Formation and the Evolution of Cooperation in Anony- mous Online Markets 9 2.1 Introduction ...... 10 2.2 Theory and Hypothesis ...... 12 2.2.1 Transaction Stage ...... 12 2.2.2 Feedback Stage ...... 15 2.3 Previous Findings ...... 17 2.4 Data ...... 19 2.5 Results ...... 21 2.5.1 Effect of Reputation on Auction Success ...... 21 2.5.2 Participation in the Feedback System ...... 25 2.6 Discussion and Conclusions ...... 30 2.7 Appendix ...... 34

3 Reciprocity in Reputation System: An Experimental Study 45 3.1 Introduction ...... 46 3.2 Related Literature ...... 47 3.2.1 Trust and Trustworthiness ...... 48 3.2.2 Reputation in Repeated Games ...... 50 3.2.3 Indirect Reciprocity ...... 53 3.2.4 Strong Reciprocity ...... 56 3.2.5 Reputation Systems ...... 59 x CONTENTS

3.3 Experimental Design ...... 63 3.3.1 Games and Treatments ...... 63 3.3.2 Experimental Protocol ...... 66 3.3.3 Hypotheses ...... 68 3.4 Experimental Results ...... 73 3.4.1 Placing and Honoring Trust ...... 73 3.4.2 Submitting Feedback ...... 78 3.4.3 Feedback Timing ...... 83 3.5 Discussion and Conclusion ...... 87 3.6 Appendix ...... 92

4 Structures in Online Collegiate Networks 107 4.1 Introduction ...... 108 4.2 Related Literature ...... 109 4.3 Materials & Methods ...... 112 4.3.1 Data Collection ...... 112 4.3.2 Networks ...... 114 4.3.3 Attributes ...... 116 4.4 Growing Networks ...... 118 4.4.1 Tie-Formation Process ...... 118 4.4.2 ...... 119 4.4.3 Weak Ties, Bridges, and Small Worlds ...... 124 4.4.4 Degree Correlations ...... 130 4.5 Mixing Patterns ...... 134 4.5.1 Assortativity ...... 135 4.5.2 Multi-Category Mixing ...... 139 4.6 Discussion ...... 141 4.7 Appendix ...... 144

5 Personality on Social Network Sites: An Application of the Five Fac- tor Model 157 5.1 Introduction ...... 158 5.2 Social Network Sites ...... 160 5.3 Personality and Social Networks ...... 161 5.4 Data and Methods ...... 165 5.5 Personality Effects on Social Networks ...... 169 5.6 Conclusions ...... 175 5.7 Appendix ...... 178

6 Summary and Conclusion 187 CONTENTS xi

References 214

Curriculum Vitae 215

List of Tables

2.1 Means and Standard Deviations of Key Variables ...... 21 2.2 Effects of Reputation on Sales and Prices ...... 23 2.3 Feedback Patterns (%) ...... 25 2.4 Hazards of Positive and Negative Feedback in the DVD Market . 28 2.5 Means and Standard Deviations of Control Variables ...... 41 2.6 Effect of Reputation on Sales ...... 42 2.7 Effect of Reputation on Price I ...... 43 2.8 Effect of Reputation on Price II ...... 44

3.1 Experimental Design ...... 66 3.2 Frequencies of Decisions by Session (%) ...... 75 3.3 Determinants of Placing and Honoring Trust ...... 76 3.4 Feedback Frequency and Average Arrival Times ...... 80 3.5 Determinants of Positive and Negative Feedback ...... 82 3.6 Hazards of Positive and Negative Feedback (SE only) ...... 86 3.7 Functional Forms of Reputation Effects (Pooled Fixed Effects LPM) 94 3.8 Fixed Effects Models of Placing and Honoring Trust by Treat- ment (LPM) ...... 95 3.9 Feedback Timing Patterns (%) ...... 96 3.10 Treatment Effects on Feedback Arrival Time (RE) ...... 97 3.11 Ex-Post Randomization Test (MLogit) ...... 100

4.1 Descriptive Statistics of Attributes ...... 117 4.2 Univariate MLE of Degree Distributions ...... 121 4.3 Correlations of Overlap and Edge Betweenness with Tie Strength 129 4.4 Graph-Level Metrics of Swiss StudiVz Networks ...... 134 4.5 Inbreeding Homophily by Gender ...... 139 4.6 Comparison with Official Enrollment Count Reports of 2007/08 147 4.7 Logistic Network Regression Models (MPLE) ...... 156 xiv LIST OF TABLES

5.1 Hypotheses for Personality Effects on Outcome Variables . . . . 164 5.2 Factor Analysis of Personality Scores ...... 166 5.3 Maximum Likelihood Estimates of Personality Effects on Net- work Activity ...... 170 5.4 OLS Estimates of Personality Effects on Node Level Indices . . 173 5.5 Correlations of Contact Frequencies and Life Satisfaction . . . . 181 5.6 Personality Effects on Degree and Life Satisfaction (OLS) . . . 182 5.7 Raw Correlations between Personality and Node Level Indices . 183 5.8 Descriptive Statistics ...... 184 List of Figures

1.1 Degree Distributions of eBay and StudiVz Networks ...... 5

2.1 Survival Functions and Hazard Rates of Sellers’ and Buyers’ Rat- ing Decisions ...... 27

3.1 Extensive Forms of Trust Game and Simultaneous Rating Game 64 3.2 Average Levels of Placing and Honoring Trust by Treatment . . 74 3.3 Smoothed Hazard Rates of Positive and Negative Feedback . . . 84 3.4 Reputation Effects Plots for Placing and Honoring Trust . . . . . 93 3.5 Average Feedback Submission Rates by Period and Treatment . 98 3.6 Cumulative Feedback Incidence Functions from Competing Risks 99 3.7 Instructions SI Treatment (Page 1) ...... 102 3.8 Instructions SI Treatment (Page 2) ...... 103 3.9 Screen for first mover: buy or not buy (AS, SI, SE treatments) . 104 3.10 Screen for second mover: ship or not ship (AS, SI, SE treatments) 104 3.11 Screen for submitting feedback (SE treatment) ...... 105

4.1 Arrival Time and Last Activity on StudiVz Profile Pages . . . . 113 4.2 ETH Friendship Network Colored by Cohort ...... 115 4.3 Proportion of Valid and Missing Attributes ...... 116 4.4 Histogram and Log-Lin-CCDF of ETH Degree Distributions . . 122 4.5 Edge Overlap against Tie Strength and Edge-Betweenness . . . 128 4.6 Degree Correlations ...... 131 4.7 Degree and Type Correlations ...... 132 4.8 Cohort Mixing Matrix of ETH Zurich ...... 136 4.9 Coefficients from Exponential Random Graph Models ...... 140 4.10 StudiVz Profile Page ...... 145 4.11 Degree Distributions of Harvard and Columbia University . . . 150 4.12 Exponential’s Rate and Weibull’s Shape for FB100 Samples . . 151 xvi LIST OF FIGURES

4.13 Graph-Level Metrics of the FB100 Ensemble ...... 153

5.1 Degree Distribution and Marginal Effects of Personality . . . . . 171 5.2 Contact Frequency, Closeness, and Duration for Alter #1 in Name Generator ...... 180 5.3 Histograms of Raw Personality Scores ...... 185 Chapter 1

Introduction

This dissertation is about meeting strangers and friends. It contributes to research on relationship formation in the online world. The general topic is approached from two different angles. First, I examine the question of how pure strangers establish cooperative relationships in large, anonymous, and geographically dis- persed online auction markets. How do traders on these platforms solve the prob- lem of trust typical of any kind of non-repeated human interaction? It will be shown that a simple yet powerful social institution in the form of a reputation mechanism helps to overcome what appears to be a social trap. The second per- spective takes trusting relations as given and asks what kind of structures emerge when social actors translate their repeated and enduring social relationships into acquaintanceship networks on social networking sites. The traces of friendship are very different from those I study under the stranger regime. However, in both situations social actors arrive on Internet platforms and start to interact with ex- isting actors, which gives rise to a growing social interaction network. Some properties of the two networks are similar while others are radically different. Up until recently, research on social interactions has often relied on one-time snapshots of self-reported data on relationships from quite small groups. The Internet has changed this situation dramatically and offers entirely novel channels for understanding how and why people interact (Watts 2007). Already the fact that data collection in the Internet can be executed in an unobtrusive way is very promising on its own. Social scientists are typically confronted with self-reported data which suffers from several sources of error and bias, like social desirability, cognitive biases, framing, and several forms of selection bias that complicate data analysis substantially. Of course, personal and interaction data from the web has its own set of limitations. First of all, users on certain internet platforms tend to constitute a very specific sample with regard to the general population. Scholars 2 Introduction have countered this problem with ever growing datasets of digital traces that can cover substantial parts of a population (e.g., Onnela et al. 2007, Ugander et al. 2011). Individual digital traces can be compiled into comprehensive pictures of how the macro-level of groups, organizations, or even society looks like. While the capacity to collect and analyze massive amounts of data has already changed entire research fields in the natural sciences, this trend is now gaining momentum in the emerging field of “computational social science” (Lazer et al. 2009). In my thesis, I rely on large but not yet big data that I have collected myself by means of poor man’s real-time screen-scraping technologies. Strangers are observed on eBay Germany, where I have collected data on 1 million auctions with roughly 0.5 million traders. The data on friendships has been collected on StudiVz, a social networking site very similar to Facebook, that was quite pop- ular in German-speaking countries at the time of data collection. In retrospect, since Facebook has taken over the entire market, this will be more of an autopsy than an examination of a vivid online community. On StudiVz, I concentrate on mapping relations of acquaintanceship within single organizations. An ensemble of five networks has been collected, those networks representing five Swiss uni- versities with in total 0.5 million relations between 40,000 students. These two data sources build the foundation of chapters 2 and 4 in this book. Compared to more traditional and well-designed forms of data collection, crawling the web means that you only get what you can see. Internet platforms on the web are neither meant nor designed for answering a social scientist’s ques- tions. Even though we receive traces in unprecedented resolution to describe behavioral patterns, there is typically a lack of explanatory and covarying vari- ables that would be required to test existing theories and to explain the data. I see this as the single most largest challenge for computational social science to come on par with traditional methods. One potential way to address these limitations is to combine and triangulate new types of data sources with traditional ones. In chapters 3 and 5, I use ex- perimental and survey methods to address open questions and to cross-validate findings. In chapter 3, a laboratory experiment confirms that students react in a systematic way to the reputation mechanism studied in chapter 2. The online survey conducted for chapter 5 can be linked to the data of chapter 4. I will show that personal attributes collected in the survey contribute greatly to the explana- tion of how and why students take positions in the acquaintanceship network of their own university. In the remainder of this introduction, I will outline the central questions and substantive contributions of each of the four articles in this dissertation. After the outline of the first two papers, I will describe in what respect the two social systems, eBay and StudiVz, are similar and different. 3

Reputation Formation in Online Markets

The first substantive chapter 2 is concerned with the measurement of reputation effects in online auctions and why traders submit feedback even if it is very un- likely that the two transaction partners ever meet again. Today’s online markets provide a unique opportunity to test theories of trust, reciprocity, and reputation using large samples of unobtrusive data. A trader’s reputation and gossip about his past behavior have always helped to overcome trust problems with his trading partners. With trade taking place over ever longer distances, social and economic institutions like credit bureaus evolved as a substitute for informal social con- trol (Greif 1989). In a similar vein, cooperation in peer-to-peer online markets is sustained by an electronic reputation system that collects and disseminates infor- mation about traders’ past behavior (Kollock 1999; Resnick et al. 2000). An elab- orate statistical analysis will show that sellers with better reputations have higher sales and obtain higher prices. Furthermore, we observe a high rate of participa- tion in the feedback system. Together with my colleagues Andreas Diekmann, Ben Jann, and Wojtek Przepiorka, I will show that theories of strong reciprocity (Gintis 2000) and altruism (Becker 1976) help to explain how traders maintain second-order cooperation at the feedback stage. This generates, in turn, the infor- mation necessary to create financial incentives for first-order cooperation at the transaction stage.

Reciprocity in Reputation Systems

In the second article in chapter 3, I study the mechanism design of reputation systems and how social science can contribute to a proper functioning of online market places. The first part of the chapter provides a review of the literature on reputation building (Schelling 1960, Kreps et al. 1982) and how it relates to indirect reciprocity discussed in theoretical biology (Alexander 1987, Nowak and Sigmund 1998a). Special consideration is given to the role of direct reciprocity (Gouldner 1960, Diekmann 2004) in bilateral rating systems prevalent in online auction markets. Within the paradigm of binary-choice trust games, I compare the performance of a bilateral-sequential, a bilateral-blind, and a one-sided feedback mechanism by means of experimental methods. The first version of eBay’s repu- tation mechanism was a bilateral-sequential mechanism where both the buyer and the seller can provide positive, neutral, negative, or no feedback after the end of a transaction. Early on, there was reasonable suspicion that due to the two-sided nature and the sequentiality of the rating process buyers and sellers could have incentives not to report the true outcome of the transaction. If the outcome of the transaction is not above all reproach, a buyer can punish his seller by submit- ting a negative feedback. However, the seller can reciprocate and retaliate on the 4 Introduction buyer’s move. This option for counter-punishment may be why buyers often pre- fer to remain silent after failed transactions. My experimental results will show that feedback mechanisms which constrain the role of direct reciprocity exhibit substantially higher levels of trust. Direct reciprocity has positive and negative consequences for a reputation system. Even if the rates of trustworthy trading turn out to be lower in bilateral systems, I will demonstrate that direct reciprocity is a key behavioral regularity that explains why traders submit ratings in the first place.

Stranger and Friendship Networks

Before I continue with the outline of the last two papers, I briefly the gap between strangers and friends and sketch what differences and commonalities we would expect to see in terms of the interaction topology. In the experimental paper in chapter 3, I let buyers and sellers collide under a perfect-stranger treatment. This means that participants in the experiment are playing in a one-shot situation where they only interact once with any of their interaction partners. This mimics random encounters in a large and well-mixed population and demarcates at the same time the worst-case scenario for the evolution of cooperation. In an anonymous online market we expect traders to collide, to form isolated dyads and then never to meet again. They can be conceived to be structurally homogenous social atoms bumping into each other without any other form of social relation that would embed and stabilize the transaction (Granovetter 1985, 1992). There, it is the reputation system that substitutes embeddedness. In fact, if we check for previous encounters on the list of feedbacks, we find that only about 5% of traders have ever met before. It is also very unlikely that two buyers who bought from the same seller initiate a trade with each other. In technical terms, we expect complete absence of any transitivity and that is exactly what I observe. For the stranger case in general, we expect to see an interaction topology that resembles a random graph. This random structure can be observed indeed. However, it turns out to be a random graph of a specific class of networks. Both, strangers and friends, are observed on internet platforms where new nodes arrive in a network and start to form relations. In the simplest case, they enter, attach to a fixed number of preexisting nodes and turn inactive. If we let this process run for some time and then observe the outcome, we see as a result a grown graph of behavioral traces. Figure 1.1 shows the degree distributions of the stranger and the friendship networks. The former is generated from the DVD market discussed in chapter 2 while the right panel shows the friendship network of ETH Zurich discussed in chapter 4. The y-axis shows which proportion of users has a certain number 5

Buyer−Seller Network on eBay Friendship Network on StudiVz 0 ● ●● ●●●● 10 ●●● ● ●●● ●●● ● ●●● −1 ● ● ●● ● ●●● ●● −1 ●● 10 ● ●● ●●● ●●● ●● ●●● 10 ●●● ●●● ●●● ●●● ●●● ●●● ●●● ●●●● ●●● ●●●● ●●● ● −2 ●●●● ●● ●●●● ●● −3 ●● ● ●●●● ●●●

● 10 ●●●● ●●●● ●●●●● ●●● 10 ●●●●● ●●● ●●●●● ●● ●●●●●●● ● ●●●● ●● ●●● −3 ● ● ● ●● ● Number of Users ● Number of Users ●●● ● ● 10 ● ●● ● ●● −5 ● ●

● −4 ● 10 10 100 101 102 103 104 0 50 100 150 200

Number of Trading Partners Number of Friends

Figure 1.1: Degree Distributions of eBay and StudiVz Networks of interaction partners whereas the number of interaction partners is depicted on the x-axis. Both plots show that a large number of users have just a few contacts while few users have many. On first sight, both plots seem to share the same connection pattern, but they do not. While the left panel has logarithms on both scales, the right panel has a linear x-axis. While it is perfectly feasible to sell 10,000 DVDs in a month on eBay, it is very unlikely that someone finds the time to maintain the same number of friendship relations. The friendship networks seems to have grown from a pure random attach- ment process that produces an exponential distribution (Callaway et al. 2001). The buyer-seller graph, on the other hand, follows roughly a Pareto distribution that suggest a different growth mechanism. The simplest and most popular one is preferential attachment (Barabási and Albert 1999), where a new node attaches to existing nodes proportional to their degree. If a buyer can choose to buy a tooth brush from two sellers, and if the first has 50 successful trades and the second twice as many, then preferential attachment suggests that the buyer attaches ex- actly twice as likely to the second seller. An initial small difference in the success of selling DVDs will turn into a significant competitive advantage. While this nicely fits with my argument on reputation effects in chapter 2, it takes more to show that this mechanism is actually at work. But this will be il- lustrated elsewhere. Both networks share the growth element, and each falls into a certain class of networks that can be addressed with simple stochastic growth models. Both networks exhibit the expected degree distribution of its class beau- tifully so that one is tempted to claim, in the light of Figure 1.1, that the concepts of strangers and friends are just one logarithm apart. This is, of course, close to pure nonsense. 6 Introduction

The Structure of Online Collegiate Networks

In the third substantive chapter 4, I focus on acquaintanceship and social rela- tions that have primarily formed outside the online world which, however, have left traces on a popular social networking site. Based on simple generative net- work models from economics (Jackson and Rogers 2007) and statistical physics (Callaway et al. 2001, Zalányi et al. 2003), I discuss and characterize an ensemble of five online collegiate networks that correspond to five universities in Switzer- land. By means of social network analysis, I reanalyze well-known theoretical and empirical findings from the network literature. First, I describe the degree distribution with the methods of maximum likelihood and show that it nicely fits a random growth model. There, I also discuss the methods of how to fit these red lines in Figure 1.1. In a second step, it will be shown that all networks exhibit the well-known Small-World property (Watts and Strogatz 1998) and that the theory of the strength of weak ties also applies to online collegiate networks (Granovet- ter 1973). Furthermore, I use exponential random graph models (Goodreau et al. 2009) to map the typical opportunity structure, which is, as expected, primarily determined by cohort and study subject. In the appendix of this chapter, I finally compare my ensemble with a set of 100 similar networks consisting of 1.2 mil- lion users from Facebook (Traud et al. 2011). It turns out that all those online collegiate networks are remarkably similar and that the predictions from simple random graph models can give a quite accurate picture of structures that surround the students. The online world provides a lens through which social scientists can gain insights into the macro-structure of universities and upscale the findings from classroom sociometry (Moreno et al. 1943, Coleman 1964, Hallinan 1978) not only to the lecture hall but to an entire campus.

Personality on Social Networking Sites

In the last chapter (5), I explore how individual heterogeneity influences behavior in online social networks based on data collected from an online survey that can be linked to the data of chapter 4. Rivaling theories give very different answers to the question of why people are so remarkably different in terms of the number of relations they maintain. With an application of the five-factor model (Gold- berg 1990) from personality psychology, I use a potentially stable and exogenous source of individual variation to explain how students shape their egocentric net- works. I show how these network antecedents determine participation, adoption time, nodal degree and network growth over a period of 4 months. Besides the nodal degree, I also regress the personality scores on several measures 7

(Freeman 1978, Bonacich 1972) and measures of local clustering (Watts and Stro- gatz (1998), Burt 1992). Special attention will be given to statistical modelling by aapplying negative-binomial regression models and panel data methods.

Chapter 6 presents concluding remarks and a summary of the main findings.

Chapter 2

Reputation Formation and the Evolution of Cooperation in Anonymous Online Mar- kets

Abstract: Theoretical propositions stressing the importance of trust, reciprocity, and reputation for cooperation in social exchange relations are deeply rooted in classical sociological thought. Today’s online markets provide a unique opportunity to test these theories using unobtrusive data. Our study investigates the mechanisms promoting cooperation in an online-auction market where most transactions can be conceived as one-time-only exchanges. We first give a systematic account of the theoretical arguments explaining the process of cooperative transactions. Then, using a large dataset comprising 14,627 mobile phone auctions and 339,517 DVD auctions, we test key hypotheses about the effects of traders’ reputations on auction outcomes and traders’ motives for leaving feedback. Our statistical analyses show that sellers with better reputations have higher sales and obtain higher prices. Furthermore, we observe a high rate of participation in the feedback system, which is largely consistent with strong reciprocity – a predisposition to unconditionally reward (or punish) one’s interaction partner’s cooperation (or defection) – and altruism – a predisposition to increase one’s own utility by elevating an interaction partner’s utility. Our study demonstrates how strong reciprocity and altruism can mitigate the free-rider problem in the feedback system to create reputational incentives for mutually beneficial online trade.

Keywords: trust, reciprocity, reputation, e-commerce

† This chapter is an edited preprint version of the following paper: Diekmann, Andreas, Ben Jann, Wojtek Przepiorka, and Stefan Wehrli. 2014. “Reputation Formation and the Evolution of Coopera- tion in Anonymous Online Markets.” American Sociological Review 79:65-85. 10 Reputation Formation 2.1 Introduction

Social exchange has always been an important part of human sociality and ar- guably a driving force in the evolution of the human mind (Cosmides and Tooby 1992). In archaic societies, social exchange spanned a network of multiplex rela- tions and was accompanied by elaborate ceremonies. A long tradition of research in anthropology and sociology on social exchange began with the work of Mali- nowski and Mauss. Ceremonial exchanges, such as the Kula (Malinowski 1922) or the potlatch (Mauss [1950] 1990), were formal rituals characterized by solem- nity, decorum, and disinterested generosity; they occurred side by side with eco- nomic exchanges, which, in contrast, were guided by utilitarian motives (Heath 1976). Social exchange was governed by norms of reciprocity (Gouldner 1960; Malinowski 1926; Mauss [1950] 1990), and a failure to reciprocate was punished by the loss of reputation and status (Blau 1964; Mauss [1950] 1990). With trade taking place over ever longer distances, markets transcended the borders of small communities, and historical evidence shows that social and eco- nomic institutions evolved as a substitute for informal social control (Greif 1989; Zucker 1986). For example, trade in the early Middle Ages in Europe was charac- terized by geographic specialization, bookkeeping, and cashless payment. At that time, the Champagne fairs in France were a meeting point for traders from all over Europe. Milgrom et al. (1990) discuss the emergence of a private adjudication system (the Law Merchant) that helped overcome trust problems among anony- mous traders. Administered by private judges, it provided a platform for traders to settle disputes and document trading partners’ dishonest behavior. Along with a system of notaries, such an institution could track and disseminate information about traders’ past behavior so that dishonest traders could be denied access to the market. More recently, similar information-collecting institutions have emerged to overcome trust problems between money lenders and borrowers. Credit bureaus, for example, function as information brokers that collect and collate information about borrowers’ liabilities, credit histories (in particular, arrears and defaults), and other characteristics (Pagano and Jappelli 2002). Credit bureaus are private institutions, often operated by a group of lenders, and reporting of borrower data is voluntary. To overcome the free-rider problem in data reporting, credit bureaus’ services are based on reciprocity; only lenders who submit accurate information about their customers are granted access to a comprehensive customer database. This reciprocal information sharing system creates financial incentives for lenders to contribute to the common good of customer information and for borrowers to maintain a good reputation by timely debt repayment. Pagano and Jappelli (2002) Introduction 11 show, moreover, that such information sharing is indeed associated with higher lending and lower default rates across 46 countries. In a similar vein, cooperation in peer-to-peer online markets is sustained by an electronic reputation system that collects and disseminates information about traders’ past behavior (Kollock 1999; Resnick et al. 2000). In online markets, tens of thousands of anonymous buyers and sellers trade with each other every day. These traders are not connected through a social network. In most cases they interact only once (Resnick and Zeckhauser 2002), and – when cheating – they are unlikely to be caught and fined by a central authority because many of these transactions transcend national borders (Przepiorka 2013). Traders’ willingness to submit feedback after transactions are finished is crucial for the functioning of the reputation system. However, unlike in credit markets, information about traders’ reputations is a collective good freely available to anyone (Bolton et al. 2004). Submitting feedback has a cost in terms of time and effort, so traders have no real incentive to submit feedback and a second-order free-rider problem may exist (Heckathorne 1989). Assuming strict rationality, actors would not rate their trading partners and, consequently, the whole market would degenerate or not evolve at all. How is it, then, that online markets are so successful and enjoy increasing popularity? Here we argue that other-regarding preferences, such as altruism (Becker 1976) and strong reciprocity (Gintis 2000), play an important role in maintain- ing the provision of feedback in anonymous online markets. This feedback then generates the information necessary to create financial incentives for mutually beneficial trade. Based on a large sample of process-produced auction data ob- tained from eBay (N ≈ 350,000), we first corroborate previous studies’ findings that reputation has a market value, thus making it rational for traders to build and maintain a good reputation. We investigate the effect of reputation on the probability of sale and the selling price in a high- and a low-cost product mar- ket (mobile phones versus DVDs, respectively) and under high and low buyer uncertainty (used versus new mobile phones, respectively). Second, we investi- gate traders’ motivation to leave feedback, without which reputational incentives would not accrue. Based on a large sample of timed rating events, we test partly competing hypotheses with regard to the possible motives behind traders’ volun- tary contributions to the feedback system. Our analyses take advantage of the repeated measures obtained for the same traders, to control for unobserved con- founders (i.e., unobserved heterogeneity). Our analysis brings together different strands of research and describes theo- retically and empirically the mechanisms that guide traders’ cooperative behavior in an online market. In particular, we show how the institutional setup of an on- 12 Reputation Formation line market engages traders’ moral sentiments and material interests to create opportunities for mutually beneficial trade.

2.2 Theory and Hypothesis

An interaction between a buyer and a seller in a typical online-auction market can be conceived as a process comprising four stages. First (the initiation stage), a seller decides to initiate an auction and sets some relevant parameters. Second (the bidding stage), potential buyers place bids and the highest bidder gets the offered item. Third (the transaction stage), the buyer pays the highest bid (i.e., selling price) and the seller ships the item. Finally (the feedback stage), the buyer can rate the seller and, depending on implementation of the feedback mechanism, the seller can rate the buyer. Ratings can be positive or negative, are typically accompanied by a short written comment, and are immediately available to all market participants free of charge.

2.2.1 Transaction Stage The sequence of moves at the transaction stage is not predefined. However, be- cause sellers cannot choose their buyers, most sellers demand advance payment and ship the item only after payment is received (Diekmann et al. 2009). In this case, buyers cannot inspect the product before payment and a trust problem arises: sellers could deliver poor quality goods or keep the money without delivering at all (Akerlof 1970; Dasgupta 1988; Güth and Ockenfels 2003).1 How do traders overcome this trust problem? Evidence from laboratory experiments suggests there may be a significant proportion of trustworthy sellers who care about the outcome of their trading partners due to otherregarding motives such as positive reciprocity, fairness con- siderations, and altruism (Camerer 2003). However, these trustworthy sellers will be driven out of the market because they cannot keep up with dishonest sellers’ cheap offers. Bad experiences with dishonest sellers will decrease buyers’ will- ingness to pay for a product, and thereby reduce trustworthy sellers’ incentives

1 Most economic and social exchange is sequential. Actor A transfers resources to actor B and expects B to reciprocate at some point in the future so that both A and B earn the gains from trade. A trust problem arises when A cannot expect B to reciprocate with certainty (Coleman 1990). We call A’s expectation regarding B’s propensity to reciprocate “trust.” A trusting A is more likely to transfer resources to B. We call B’s propensity to reciprocate “trustworthiness.” B can be trustworthy because she is committed to act in A’s interest out of self-interest (Hardin 2002), or due to other-regarding preferences and internalized norms of fairness and reciprocity (for other notions of trust, see Cook et al. 2007). Theory and Hypothesis 13 to stay in the market (Akerlof 1970; Yamagashi et al. 2009). Alternatively, sell- ers may be trustworthy because they are embedded in dyadic relations or social networks (Buskens and Raub 2002; Granovetter 1992; Gulati and Gargiulo 1999; Jones et al. 1997 in which cooperation is enforced through direct and indirect reciprocity (Nowak and Sigmund 2005; see also Blau 1964), respectively. Online consumer markets, however, merely coordinate the supply and de- mand of end-product sellers and ultimate buyers. These markets are not char- acterized by customized and complex exchanges, and, as we will show, repeated interactions between the same two traders are infrequent. Under these conditions, it is unlikely that network governance structures will emerge to solve potential ex- change problems (Jones et al. 1997). Moreover, because online trade takes place across large geographic distances, offline commitment relations (Brown et al. 2004; Kollock 1994; Yamagishi et al. 1998) or trust networks (Cook et al. 2004; DiMaggio and Louch 1998), which would allow traders to obtain and dissemi- nate information about their interaction partners, are unlikely to form. Finally, although online markets such as eBay try to establish norms of good conduct by providing discussion forums, chat rooms, and clubs where traders have the op- portunity to exchange information on selected topics, to our knowledge, there is no evidence that eBay traders have developed shared social norms of cooperation. Even if they have, such norms will not prevent fraudulent traders from behaving opportunistically.2 In anonymous online markets, the lack of social embeddedness is compen- sated for through an electronic rating system that efficiently and systematically disseminates information about traders’ reputations (Resnick et al. 2000). From an organizational behavior perspective, an electronic reputation system is an ex- ogenously established, fully connected network that embeds anonymous online traders in weak ties, through which information about their reputations is trans- mitted, without structurally limiting their choice of potential interaction partners and the pursuit of their goals (Gulati and Gargiulo 1999; Uzzi 1997). In other words, in an anonymous online market with a reputation system, traders can be conceived as structurally homogeneous and their interactions as isolated dyads (Granovetter 1992).

2 eBay Germany has 14.5 million active users. We counted 142 eBay clubs with a total of 2,686 members representing .019 percent of active users (eBay.de, March 28, 2013). A somewhat higher number of participants are in discussion forums, but relative to the millions of users and transactions, their proportion is negligible. Moreover, repeated transactions with the same trader are rare. The density of the trader network will thus be close to zero, even for submarkets. Finally, online relations may be “too thin” (Hardin 2004) to establish norms of good conduct to the same extent as offline relations would (Granovetter 1992). 14 Reputation Formation

The resulting social mechanism that resolves exchange problems is rather simple. With their reputations at stake, rational and self-regarding sellers who sufficiently care about their future business have a strong incentive to behave co- operatively. Even in interactions with buyers whom they are unlikely to meet again, these sellers will forgo the short-term temptation to abuse a buyer’s trust to avoid a negative rating that would hamper future business. Market entrants, however, have no reputation and are therefore likely to be distrusted by buyers. Consequently, to enter the market, sellers with no feedback record have to lower prices to compensate buyers for the risk they take buying from a new seller. If new sellers sufficiently care about future business, their good reputations will reim- burse them for this initial investment in the future (Przepiorka 2013; Resnick and Zeckhauser 2002; Shapiro 1983).3 If buyers discriminate among sellers accord- ing to their reputations, selling prices will be positively (negatively) correlated with a seller’s number of positive (negative) ratings. Moreover, if the market does not clear because supply exceeds demand, an analogous hypothesis can be derived with respect to sellers’ positive (negative) ratings and the probability of sale. Previous studies have found a stronger absolute effect of negative ratings than positive ratings on outcomes (Ba and Pavlou 2002), possibly because negative ratings carry more information – they express unmet expectations in relation to proper business practice – whereas positive ratings merely confirm expectations (Lucking-Reiley et al. 2007). Finally, with both new and used products, a poten- tial buyer may fear that a seller will not deliver after payment or that the quality of the item will be lower than advertised. In the case of used products, uncertainty is higher for the buyer because description of the product condition is less standard- ized, leaving more room for dishonest behavior by sellers. For used products, a seller’s reputation may thus be more important and exhibit a larger effect than in the case of new products (Cook et al. 2004; Kollock 1994). In summary, our hypotheses for the transaction stage are as follows: Hypothesis 1: The number of positive (negative) ratings increases (decreases) the probability of a sale (1a) and the selling price (1b). Hypothesis 2: The effects of negative ratings on the probability of a sale and the selling price are stronger than the effects of positive ratings. 3 On eBay, traders can leave feedback only after a transaction, and for every transaction the market platform charges a small fee. Building a good reputation, even with sham transactions, thus requires time and money. This makes a good reputation costly to fake and the reputation system fairly robust against deliberate attempts to influence it. On platforms where the possibility to leave feedback is not restricted to traders of a particular transaction, reputations can be faked more easily. eBay’s reputation system is further stabilized because the platform bans traders with too many negative evaluations and may prevent them from opening a new account under a different alias. Theory and Hypothesis 15

Hypothesis 3: The effects of positive and negative ratings are stronger for used products than for new products.

2.2.2 Feedback Stage We argued that the market value of a good reputation creates financial incentives for traders to behave cooperatively. However, rational and self-regarding traders will not provide feedback after a transaction is finished and thus reputational in- centives will not accrue. Participation in the feedback system can be compared to voting, in that, just as with democratic elections, there is a collective good problem (Downs 1957). If citizens eschew the costs of voting, as strict rationality principles predict, the democratic system collapses, as would the online market should no buyer provide feedback. Fortunately, however, a substantial proportion of citizens do vote, and a large proportion of traders do leave feedback. Similar explanations can be given for both behaviors. Consider a one-sided reputation system in which only buyers can rate sellers after a transaction. There are several possible explanations for buyers’ participa- tion in the feedback system. First, as suggested by signaling theory (e.g., Gam- betta 2009), a buyer might leave feedback to gain a reputation for being an active rater. Such a reputation signals to sellers that the buyer will publicly comment on how a transaction plays out, which might enforce cooperative behavior by the seller. Second, buyers might submit positive feedback because they are altruistic and the increase in the seller’s utility elevates their own utility (Becker 1976).4 Because we assume a decreasing marginal utility of positive ratings, the likeli- hood of an altruistic buyer submitting feedback will decrease with an increasing seller score. Third, some buyers might leave feedback because they gain satis- faction from contributing to the public good provided by the feedback system. This idea corresponds to the notion of “warm glow” altruism (Andreoni 1990; see also Dellarocas et al. 2004). Finally, there is good reason to believe that some buyers are willing to punish defection or reward cooperation without direct ben- efit. That is, giving feedback can be motivated by strong reciprocity (Fehr et al. 2002; Gintis 2000).5 Returning a favor with a favor and responding to misdeeds with sanctions is observed in experiments and seems to be rooted in human nature (de Quervain et al. 2004).

4 Note that Becker’s altruism model is agnostic regarding the emotional triggers of altruistic behav- ior. Regret and guilt, for instance, can be subsumed under the model. 5 Fehr and colleagues (2002, 3) point out that the “essential feature of strong reciprocity is a willing- ness to sacrifice resources for rewarding fair and punishing unfair behavior even if this is costly and provides neither present nor future material rewards for the reciprocator” (emphasis in original). Be- havior as defined by strong reciprocity cannot therefore be explained by reciprocal altruism (Trivers 1971) or by reputation and indirect reciprocity (Nowak and Sigmund 2005). 16 Reputation Formation

Until May 2008, eBay.de had a feedback system in which both transaction partners could leave feedback within 90 days. In such a two-sided rating system, in addition to the motives listed above, the feedback stage is open to strategic thinking. Although both buyers and sellers can be led by strategic (and nonstrate- gic) motives, it is primarily sellers who have an interest in acting strategically at the feedback stage (see also Bolton et al. 2013). For example, the best way for a seller to establish a good reputation is to engage in good conduct, but there is al- ways a small probability that a buyer is dissatisfied for reasons beyond the seller’s control. To safeguard against a buyer’s (possibly unjustified) negative response, a seller will prefer to hold up the threat of retaliation. Because a negative rating is more detrimental for sellers who have not yet established a good reputation than for sellers with many positive ratings, we expect sellers with fewer positive ratings to be more likely to strive for the second mover position at the feedback stage. On the other hand, a seller could try to elicit positive feedback from a buyer by giving positive feedback first. Such a strategy makes sense if the seller believes the buyer adheres to a norm of reciprocity (Gouldner 1960; Resnick and Zeckhauser 2002). If sellers make such an assertion, then the buyer’s score will have a positive effect on the probability of the seller giving feedback first. A fur- ther variable likely to influence a seller’s rating decision is whether the seller has already received a rating from the same buyer in a previous interaction. At the time of our data collection, repeated ratings from the same user did not affect the reputation score. Hence, a seller had no reason to elicit further positive feedback from a buyer by giving positive feedback first (see also Dellarocas et al. 2004). For buyers, the nonstrategic motives mentioned earlier are still valid. In ad- dition, if buyers are motivated by strong reciprocity, positive feedback from the seller will further increase their likelihood of submitting a positive rating. In fact, it is hard to think of reasons why a self-regarding buyer would recipro- cate a seller’s positive feedback. Such behavior would be an indication of strong reciprocity at the feedback stage. Furthermore, indirect reciprocity (Nowak and Sigmund 2005), that is, rewarding sellers for their participation in the feedback system, might be another nonstrategic motive in a two-sided system. According to indirect reciprocity, buyers will be more likely to give (positive) feedback to sellers with a good reputation.6 Finally, due to eBay’s reputation system ignoring repeated ratings, buyers should be less likely to submit positive feedback if they provided a rating to the same seller in a previous interaction. If no such effect exists, altruistic motives based on the buyer’s belief that the seller gains utility from receiving feedback (Becker 1976) can be ruled out. For our empirical anal-

6 A high number of ratings can be interpreted as an indication of being an active rater due to the high correlation between incoming and outgoing ratings. Previous Findings 17 ysis, we confine ourselves to the following hypotheses (see also Dellarocas et al. 2004):

Hypothesis 4: Reciprocity If buyers and sellers are motivated by strong reciprocity at the feedback stage, their inclination to submit a positive (negative) rating increases after receiving positive (negative) feedback from the transaction partner.

Hypothesis 5: Effect of seller’s score on buyer’s behavior:7 a) If buyers are motivated by Becker’s altruism, the seller’s score should have a negative effect on the probability of feedback submission. b) If buyers are motivated by indirect reciprocity, the seller’s score should have a positive effect on the buyer’s decision to provide feedback.

Hypothesis 6: Sellers’ strategic behavior: a) Sellers’ inclination to make the first move at the feedback stage increases with their own score; and b) Sellers’ inclination to make the first move at the feedback stage increases with the buyer’s score.

Hypothesis 7: Previous Interactions: a) Sellers are less likely to submit positive feedback if they have already received a rating from the same buyer in a previous interaction. b) Buyers are less likely to submit positive feedback if they provided a rating to the same seller in a previous interaction.

2.3 Previous Findings

Most studies investigating reputation effects in online markets analyze data col- lected from eBay, and most of these studies find evidence of a positive relation between sellers’ reputations and prices or sales. Due to space limitations, we provide a review of these studies in the appendix (for earlier reviews, see Bajari

7 The two motives described in Hypotheses 5a and 5b are not mutually exclusive and may be present simultaneously in a heterogeneous population. Empirical estimates, however, will provide information about which motive is predominant. Other motives, such as the “warm glow” or signaling, might also be present, but we see no way to disentangle them with our data. These motives would be consistent with a null effect of sellers’ scores on buyers’ behavior. Furthermore, effects described in Hypothesis 5 could also be caused by fear of disagreeable e-mails from the seller if one does not leave feedback. However, we cannot offer any clear ideas about how such fears would relate to a seller’s score. 18 Reputation Formation and Hortaçsu 2004; Resnick et al. 2006). Here, we instead focus on important as- pects in the ongoing discussion about reputation formation in anonymous online markets. In their review of 15 empirical eBay studies, Resnick and colleagues (2006) point out that these studies’ quasi-experimental designs cannot account for un- observed heterogeneity in a satisfactory way. If unobserved factors, such as pre- purchase communication between buyers and sellers or the quality of the product description, are correlated with sellers’ reputations, then estimates of the regres- sion coefficients can be biased. In our dataset, we have repeated measures for a considerable proportion of both buyers and sellers. This allows us to deal with problems of unobserved heterogeneity by using fixed-effects regressions (Allison 2009). Some researchers argue that most eBay studies claim to estimate buyers’ will- ingness to pay for a seller’s reputation but, in fact, only estimate what buyers do pay, which is a poor estimate of their willingness to pay (Snijders and Weesie 2009). The argument is that the highest bids may depend on many other factors not observable in data obtained from eBay. For instance, bidding is a dynamic process and potential buyers’ bidding decisions also depend on their beliefs about other buyers’ reservation prices. Moreover, potential buyers could consider sev- eral offers simultaneously, but their initial choice sets are unobserved or difficult to define (but see Livingston 2003). In our study, we do not claim to be estimating buyers’ willingness to pay for a seller’s reputation. For the sake of our argument, it is enough to show that buyers, on average, pay more for sellers with higher rep- utations and thus create financial incentives for sellers to behave cooperatively in the market. The Achilles’ heel of a reputation system remains market participants’ will- ingness to rate each other. If participants do not provide truthful ratings, the market loses its capacity to identify fraudulent traders and reputation loses its value. For buyers and sellers on eBay, Resnick and Zeckhauser (2002) report a rating frequency of 52 and 61 percent, respectively; Dellarocas and colleagues 2004 find 68 and 78 percent; and Jian et al. (2010) report rates of 33 and 55 per- cent. Moreover, averaging across seven different countries’ eBay sites, Bolton and colleagues (2013) report a rating frequency of 71 percent. They attribute this high rating frequency to eBay’s two-sided rating system. On the one hand, a two- sided system provides incentives to leave feedback due to positive reciprocity expectations. On the other hand, the threat of retaliation against a negative rating leads to a positive evaluation bias. Based on a dataset of ratings obtained from eBay, Dellarocas and Wood (2008) also identify reciprocity as an important factor driving behavior at the feedback stage. Moreover, they suggest that not leaving feedback indicates dissatisfaction and, based on information obtained from the Data 19 observed rating patterns, they estimate the proportion of good, bad, and mediocre transactions. They show that the 99 percent positive rating usually observed on eBay is a biased estimate of the proportion of good transactions. Accounting for the “sound of silence” (i.e., transactions where one or both traders did not leave feedback) gives an estimate of 79 and 86 percent for satisfied buyers and sellers, respectively. In a similar vein, Jian and colleagues (2010) developed a model to estimate the proportion of different rating strategies from eBay feedback data. They assume three rating strategies (do not give feedback, give feedback uncondi- tionally, and reciprocate feedback) and estimate buyers’ and sellers’ propensity to choose one of the strategies. They find that the median buyer and seller choose the reciprocal feedback strategy in 23 and 20 percent of cases, respectively, whereas they choose the unconditional feedback strategy in 38 and 47 percent of cases. Our study adds to these results by analyzing time-specific feedback decisions. Whether trading partners provided feedback first is an explanatory variable in our analysis. We estimate discrete-time event-history models, an approach that does not rely on restrictive assumptions about the timing distribution of traders’ feedback. We also account for unobserved heterogeneity by including buyer and seller fixed effects in our models. Note, however, that we observe behavior in our data; therefore, any inference we make about traders’ underlying motives is based on the consistency of the behavioral data with our theoretical propositions.

2.4 Data

Our data, collected from the German eBay platform, contain information on auc- tions of mobile phones and DVDs that ended between December 1, 2004 and January 7, 2005 (for details on data collection, see the appendix). We restrict our analysis to offers from sellers in Germany and exclude private auctions, where the buyer is unknown, as well as auctions in which multiple pieces are offered that can be purchased by multiple buyers. Some auctions ended early because the seller withdrew the auction before the first bid, the seller stopped the auction with the current bid as the winning bid, or the buyer bought the item at the “Buy It Now” (i.e., fixed) price. We exclude these cases from our analysis. Finally, minor inconsistencies in the regular expressions caused missing or ambiguous data on key variables for a small fraction of auctions, which we also exclude. In total, we excluded 14.5 percent of mobile phone auctions and 10.4 percent of DVD auctions for these reasons. To estimate reputation effects on the highest bids in auctions, it is sensible to select a homogeneous good to avoid bias due to unobserved heterogeneity (Diek- mann et al. 2009). To ensure homogeneity in our mobile phone data, we manu- 20 Reputation Formation ally sorted out offers that contained complementary goods or multiple items (23.1 percent) and divided the sample into new (unused and in original packaging) and used products. The total sample then included 14,627 mobile phones of seven different models, 37.6 percent of which were new and 95.7 percent of which re- ceived at least one bid (i.e., their auctions were successful).8 In our data, the mean selling price was 217 Euros for new mobile phones and 150 Euros for used mo- bile phones (see Table 2.1 for descriptive statistics of our key variables; additional variables are documented in Table 2.5 in the appendix). To evaluate whether rep- utation effects can also be observed for low-cost products, we analyze a sample of 339,517 DVD auctions, 53.3 percent of which were successful. We collected data on all DVD auctions in the mentioned time span, covering hundreds of different movies. Thus, product homogeneity cannot be assumed. Also, due to the lack of a reliable automatic categorization method, we do not distinguish between new and used DVDs. Moreover, unlike the mobile phone market, with 1.25 auctions per seller, on average, the DVD market, with an average of 10.24 auctions per seller, is dominated by a relatively small number of professional sellers. We operationalize buyers’ and sellers’ reputations with their number of pos- itive and negative ratings. Because distributions of these variables are highly right-skewed, we take the logarithm of their values (after adding 1). Note that in terms of ratings, too, sellers in the DVD market are considerably larger than sellers in the mobile phone market. At the feedback stage, we have the time (in days) until feedback for both the seller and the buyer of a successful auction. Recall that these variables are censored at 90 days. The mean duration seems to be lower in the DVD market, and positive feedback is very common, whereas neutral and negative ratings are rare. Negative ratings are more frequent in auctions for mobile phones than for DVDs, indicating that problematic transactions are more likely when they involve more complex and expensive products (although this could also be due to the lower degree of professionalism in the mobile phone market). Another important variable at the feedback stage is whether the seller or buyer has already received a rating from the same person in a previous interaction. Such repeated encounters are more likely in the DVD market (5 percent) than in the mobile phone market (3 percent).9

8 We collected data for five more models, but we excluded these cases from our analysis because the models were released prior to 2004. The excluded cases are composed almost entirely of used products and thus are more heterogeneous than the offers of newer models. Including these older models in our analysis does not change our main findings. 9 We calculated these numbers based on traders’ entire rating histories. They are lower bounds for the actual proportions of repeated encounters because transactions appear in the feedback history only if a rating occurred. Results 21

Table 2.1: Means and Standard Deviations of Key Variables

New Mobile Used Mobile Phones Phones DVDs Seller Seller’s positive ratings (log) 3.77 (1.59) 3.74 (1.61) 7.00 (2.65) Seller’s negative ratings (log) .42 (.70) .47 (.71) 2.00 (2.01) Buyer (if sold) Buyer’s positive ratings (log) 3.06 (1.70) 3.16 (1.83) 3.92 (1.69) Buyer’s negative ratings (log) .27 (.53) .32 (.62) .28 (.54) Feedback (if sold) Time until seller feedback (censored) 32.6 (35.3) 36.5 (36.7) 24.9 (30.1) Positive seller feedback .736 .685 .843 Neutral seller feedback .002 .005 .001 Negative seller feedback .012 .016 .003 Seller has previous rating from buyer .032 .031 .049 Time until buyer feedback (censored) 33.9 (35.7) 39.0 (36.8) 25.2 (30.2) Positive buyer feedback .706 .656 .835 Neutral buyer feedback .007 .011 .004 Negative buyer feedback .019 .020 .004 Buyer has previous rating from seller .030 .037 .049 Auction Successful auction .958 .956 .533 Selling price in Euros (if sold) 216.7 (37.4) 150.3 (47.1) 7.4 (10.2) Number of observations 5,499 9,128 339,517 Number of sellers 4,341 7,687 33,166 Number of buyers 4,881 7,454 100,046 Note: Table S1 in the online supplement list descriptive statistics of further control vari- ables used in our model estimation.

2.5 Results

2.5.1 Effect of Reputation on Auction Success

We first turn to reputation effects on sales and selling price. Table 2.2 gives an overview of results for our key variables. We control for many additional variables in our model estimations, but due to space limitations, we report and discuss their effects in the appendix only. Models 1 through 3 in Table 2.2 show results from logistic regressions of auction success on sellers’ positive and negative ratings. In line with Hypothesis 1, we find a significantly positive effect of positive ratings and a negative effect of negative ratings on the probability of sale for new mobile phones (Model 1) and for DVDs (Model 3). Contrary to our expectation, we find no significant effects for used mobile phones (Model 2). Recall, however, that over 95 percent 22 Reputation Formation of all mobile phones were sold, which leaves relatively little information for the estimation of reputation effects on the probability of sale. Because we are dealing with auctions and not fixed-price offers, a large part of the reputation effect will be exerted on price. Table 2.6 also reports results from linear regressions of selling price on sellers’ positive and negative ratings (Mod- els 4 through 6). As expected, we observe a strong relation between a seller’s reputation and the highest bid in our data. For both mobile phones and DVDs, the number of positive ratings has a significantly positive effect on price, and negative ratings reduce price.10 Because both the number of ratings and the selling price are log-transformed, the estimated coefficients can be interpreted as a percent in- crease in the selling price due to a one percent increase in the number of ratings (the elasticity). Thus, doubling the number of positive ratings increases the sell- ing price of new mobile phones (Model 4) by about .005 × ln(2) × 100 = .35 percent. For used mobile phones (Model 5) and DVDs (Model 6), the effect of a 100 percent increase in positive ratings is about .55 and 3.7 percent, re- spectively. Evaluating these relative effects at the mean prices for new and used mobile phones and DVDs (see Table 2.1) yields absolute effects of 61, 97, and 27 cents, respectively. Although these amounts may appear small at first sight, they are not, in fact, if one takes into account the range of sellers’ reputations in our sample, where differences of 1,000 percent (e.g., 40 versus 440 positive rat- ings) and more are the rule rather than the exception. As Hypothesis 2 predicts, the absolute effect of negative ratings on price and sales is larger than the effect of positive ratings in all models (Models 1 through 6), although the difference is statistically significant at the 5 percent level only in Model 4 (p = .003 for a two-sided test). In terms of absolute effects at mean prices, doubling the number of negative ratings decreases the price by 1.3 Euros, 2.2 Euros, and 52 cents in Models 4, 5, and 6, respectively. In line with our expectation that used products will exhibit more uncertainty than new products, and buyers will thus pay a higher reputation premium to sell- ers of used products (Hypothesis 3), the effects of positive and negative ratings on the selling price are stronger for used mobile phones (Model 5) than for new mobile phones (Model 6).11 Recall, however, that reputation has a significant ef- fect on sales only for new mobile phones. If we take into account both outcomes

10 Censoring at the starting price might bias effects of positive and negative ratings on price be- cause our estimation is based on successful auctions only. To evaluate the robustness of our results, we applied various models that take censoring into account (e.g., censored normal regression and Heckman’s sample selection model). Results are very similar in all models (not shown). 11 A bootstrap test of the joint null-hypothesis that the absolute coefficients of positive and negative ratings are not larger in the model for used mobile phones than in the model for new mobile phones yields a p-value of .04. This p-value is an empirical strength probability computed as the fraction of bootstrap samples in which the point estimates for the coefficients are in accordance with the null- Results 23 ects) ff 18,054 15,964 ects in Models 7 and 8, title fixed ff ects regressions are shown (seller fixed e ff ects of Reputation on Sales and Prices ff 1) Selling Price Selling Price (with Fixed E / ects in Model 10); the dependent variable is the logarithm of the selling price (in Euros). Numbers ff cient estimates of fixed-e Table 2.2: E ffi Product Sold (0 ers, and dummies for private profile, verified identity, Me-Page, PowerSeller, auction picture, thumbnail cient estimates of OLS regressions are shown; the dependent variable is the logarithm of the selling price ff 1 2 3 4 5 6 7 8 9 10 ffi New Used New Used New Used cient estimates of logistic regressions are shown; the binary dependent variable is equal to one for successful (.136) (.064)(.301) (.040) (.132) (.058) (.001) (.003) (.002) (.015) (.006) (.010) (.035) (.045) (.034) (.002) (.033) (.007) (.005) (.014) MobilePhones Mobile Phones DVDs Phones Mobile Phones Mobile DVDs Phones Phones DVDs Mobile Mobile DVDs ffi .001 (two-tailed tests). < .01; ***p < Models 1, 2, and 3: Coe .05; **p < ects in Model 9, title and seller fixed e ff Seller’s positive ratings (log)Seller’s negative .344* ratings (log) -.670* -.019McFadden R-squared -.207 .117**R-SquaredNumber of observations .005*** -.145*Number of sellersNumber -.011*** .008* of .858 titles -.018** 5,499 .053*** -.101** .678 .027** 9,128 -.055 -.002 4,341 339,517 .133 -.025 .016*** 7,687 5,269 -.012** .005 33,166 8,727 -.036** 4,242 180,881 1,612 7,474 1,944 30,018 113,276 585 .844 103,030 691 .513 .111 6,901 (in Euros). Models 7, 8, 9, and 10: Coe Note: auctions. Models 4, 5, and 6: Coe e in parentheses are robustdescription, standard errors number (adjusted of forlisting, competing seller-clusters). bold o Models listing, contain variousonline payment control supplement. modes, variables auction (starting*p price, duration, length time of and product date of auction ending, and product subcategory); for detailed results see the 24 Reputation Formation

(the selling price and the probability of sale), the total effect of reputation is even larger for new than for used mobile phones, although the difference is statistically insignificant (not shown).12 As mentioned earlier, we cannot assume product homogeneity for our DVD dataset. We therefore replicate our analysis using a model with fixed effects for movie titles (Model 9).13 Although smaller in absolute size than in the ordinary DVD model, the effects of positive and negative ratings are still highly significant. One could object that our effects are spurious due to unobserved seller het- erogeneity. For example, some sellers might be more successful for reasons unre- lated to reputation and, therefore, more likely to have a good reputation, resulting in a spurious correlation between reputation and selling prices. We check this argument by analyzing the effects of changes in reputation for sellers who appear repeatedly in our dataset. We estimate models with seller fixed effects for new and used mobile phones (Models 7 and 8) and a model with crossed fixed effects for sellers and titles for DVDs (Model 10).14 In the model for new mobile phones, ef- fects of positive and negative ratings persist, as does the effect of negative ratings in the model for DVDs, but the effects disappear for used mobile phones. How- ever, the fact that we still find reputation effects in most of our models provides strong support for the reputation hypothesis (Hypothesis 1). Note that account- ing for seller fixed effects results in fairly conservative estimates of reputation effects. These estimates are based only on changes in sellers’ reputations during the observation window of about one month, and all between seller variability is discarded. Consequently, model estimations are based on sellers with high trading frequency, who are more likely to have a good reputation, whereas newcomers, for whom an increase in reputation might be most valuable, tend to be excluded. In summary, our analyses replicate previous findings of a positive (negative) effect of positive (negative) ratings on sales and prices (Hypothesis 1); we also find a larger absolute effect of negative ratings as compared to positive ratings (Hypothesis 2). Our additional model estimations also make it rather unlikely that the effects we find are spurious. Evidence for the uncertainty hypothesis hypothesis (see Davidson et al. 2003). The p-values for the separate hypotheses are .13 for positive and .10 for negative ratings. 12 We determined the total effect by computing average marginal effects on the logarithm of the expected price, where the expected price was computed as the predicted selling price weighted by the estimated probability of sale. 13 Title matches could not be found for all auctions, even after removing special characters. The sample size in this analysis is thus reduced to 113,276 sold items. These items were sold by 15,545 different sellers and comprised 18,054 different titles with 6.3 auctions per title, on average (the largest group being Harry Potter and the Prisoner of Azkaban, with 383 auctions). 14 For the two-way fixed-effects model, we use Schmieder’s (2009) implementation of an algorithm proposed by Guimarães and Portugal 2010. Results 25

Table 2.3: Feedback Patterns (%) Mobile Phone DVDs Auctions with mutual feedback Buyer rated first 32.3 48.6 Seller rated first 28.1 30.2 Simultaneous .1 .3 Auctions with buyer feedback only 10.1 5.1 Auctions with seller feedback only 11.8 5.7 Auctions without feedback 17.7 10.1 Total 100.0 100.0 Observations 13,996 180,881

(Hypothesis 3) is mixed, however; the reputation effect does not seem to be larger for used mobile phones than for new ones.

2.5.2 Participation in the Feedback System As argued earlier, if too few traders participate in the feedback forum, the repu- tation system will be ineffective in detecting cheaters, and reputational incentives for (first-order) cooperation at the transaction stage will not accrue. However, we do find high levels of (second-order) cooperation at the feedback stage. Buyers submitted positive feedback in over 65 percent of all transactions in our mobile phone sample, whereas only 3 percent of transactions were rated neutrally or negatively. In the DVD sample, the difference is even larger. Buyers rated 83 per- cent of transactions positively; neutral or negative feedback was submitted in less than 1 percent of cases (see Table 2.1). Furthermore, Table 2.3 shows that over 80 percent of transactions in the mobile phone market and 90 percent of DVD transactions were rated by at least one trader. After an auction ends, traders have 90 days to submit their feedback; because traders can observe their partners giving them feedback, their feedback decisions are not independent from each other. Therefore, it is important to take the exact timing of traders’ feedback decisions into account when analyzing their rating behavior. Figure 2.1 gives an overview of the distribution of feedback decisions over the 90 days. The survival functions (upper panel) show how the propor- tions of buyers and sellers who have not yet given feedback decrease over time; the hazard rates for positive and negative ratings (the middle and lower panel, respectively) show, roughly speaking, the feedback probability on a given day conditional on no feedback being given before; dotted lines indicate 95 percent confidence intervals (see the appendix for computational details). Survival curves for sellers and buyers are synchronized in both markets, but ratings are submitted faster in the DVD market. After 10 days, almost 50 percent 26 Reputation Formation of sellers and buyers had submitted a rating in the DVD market, but only about 40 percent had done so in the mobile phone market. Most notably, sellers are somewhat more likely to quickly submit a positive rating. That is, the hazard rate of a positive rating is higher for sellers than for buyers over the first couple of days in both markets. This indicates that some sellers might act strategically when trying to elicit a positive rating from the buyer by taking the first step. For negative ratings, the picture is different (note that the scale of the hazard rate is magnitudes smaller than for positive ratings). The curves are very similar for sellers and buyers, although in the mobile phone market the hazard rate for buyers is higher than for sellers. Furthermore, negative ratings are submitted more slowly than positive ratings. For positive ratings, hazard rates peak between days 5 and 10; for negative ratings, they are highest between days 15 and 35. Interestingly, unlike for positive ratings, we observe an increase in the hazard rates for negative ratings between days 80 and 90 (although confidence intervals tend to be large as data mass gets small in this region). This indicates that some actors may delay their negative rating to escape retaliation. The survival curves and hazard rates in Figure 2.1 only provide descriptive information on the distribution of ratings over time. To analyze the effects of co- variates, we divide the process time into days and approximate the hazard rates using discrete-time linear probability models (Long 1997; Mood 2010) contain- ing indicator variables for different segments of the process time (i.e., we model the baseline hazard rates shown in Figure 2.1 as step functions).15 We also in- clude two-way buyer and seller fixed effects in the models to control for unob- served buyer and seller characteristics that might otherwise bias the results. For instance, a positive relation between traders’ reputations and their rating probabil- ity could be due to the fact that frequent raters accumulate more ratings because their ratings are reciprocated. Likewise, transaction quality or speed might be higher among traders with better reputations, thus increasing the likelihood of receiving positive feedback or decreasing the time-to-feedback reaction. Such potential biases are kept at a minimum by including seller and buyer fixed effects in the models. Table 2.5.2 provides estimates from the discrete-time linear probability mod- els for positive and negative feedback from buyers and sellers in the DVD market (we do not present results for mobile phones here because the sample is too small for the two-way fixed-effects estimation). The displayed coefficients are scaled

15 We chose the segments such that a sufficient number of rating events are available in each interval. For positive feedback we use one-day intervals up to day 30, followed by two-day intervals up to day 50, followed by five-day intervals. For negative feedback, the intervals in days are: 1-10, 11-15, 16-20, 21-25, 26-30, 31-40, 41-50, 51-70, and 71-90. The exact choice of intervals does not seem to matter much for our findings. Results 27

Mobile phones DVDs

Survival function Survival function 1 1 .9 .9 .8 .8 .7 .7 .6 .6 .5 .5 .4 .4 .3 .3

Proportion without feedback .2 Proportion without feedback .2 .1 .1 0 0 0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90

Seller Buyer Seller Buyer

Hazards of positive feedback Hazards of positive feedback

.1 .1

.08 .08

.06 .06

Hazard rate .04 Hazard rate .04

.02 .02

0 0 0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90

Seller Buyer Seller Buyer

Hazards of negative feedback Hazards of negative feedback

.001 .001

.0008 .0008

.0006 .0006

Hazard rate .0004 Hazard rate .0004

.0002 .0002

0 0 0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90

Seller Buyer Seller Buyer

Figure 2.1: Survival Functions and Hazard Rates of Sellers’ and Buyers’ Rating Decisions by a factor of 100; they can be interpreted as percentage-point effects on the con- ditional probability of submitting a rating on a specific day given that no rating had yet been submitted. Our results are consistent with the hypothesis that ratings are driven by strong reciprocity (Hypothesis 4). All models in Table 2.5.2 include time-dependent variables indicating whether the trading partner submitted a positive, neutral, or negative rating first. Once a seller submits a positive rating, the buyer’s condi- 28 Reputation Formation

Table 2.4: Hazards of Positive and Negative Feedback in the DVD Market

Positive Feedback Negative Feedback 11 12 13 14 Buyer Seller Buyer Seller Positive first move by partner (time- 5.106*** 13.754*** .003 -.006** dependent) (.112) (1.013) (.004) (.002) Neutral first move by partner (time- .410 2.176*** -.011** .440*** dependent) (.732) (.461) (.004) (.109) Negative first move by partner (time- -.544 1.859*** .535** 1.871*** dependent) (.327) (.461) (.166) (.311) Positive ratings (log) -.880*** -.421 -.025* -.012** (.183) (.234) (.012) (.004) Negative ratings (log) -.616 -.225 -.102* -.008 (.508) (.258) (.051) (.006) Partner’s positive ratings (log) -.840*** -.259 .017*** .021*** (.137) (.269) (.005) (.006) Partner’s negative ratings (log) 1.023*** -.894 -.043** -.059 (.259) (.755) (.014) (.035) Previous interaction rating Received only -.067 .943 (.459) (.596) Provided only -2.051*** -1.448* (.428) (.676) Received and provided -1.147*** -.158 (.219) (.286) Received or provided .002 .006* (.005) (.003)

Number of observations 96,055 96,055 96,055 96,055 Number of events 80,601 80,343 310 223 Number of sellers 9,309 9,309 9,309 9,309 Number of buyers 26,188 26,188 26,188 26,188 Note: The table shows coefficient estimates for effects on the conditional probability of submitting a rating on a specific day given that no rating had been submitted yet (scaled by a factor of 100) for discrete-time linear probability models (LPMs) with seller and buyer fixed effects (standard errors in parentheses, adjusted for buyer-clusters in Models 11 and 13 and for seller-clusters in Models 12 and 14). Time indicators parameterizing the baseline hazard are not displayed. *p < .05; **p < .01; ***p < .001 (two-tailed tests).

tional probability of submitting a positive rating sharply increases by about five percentage points (Model 11); for sellers, this effect is even stronger (Model 12). Likewise, the probability that a negative rating will be reciprocated with a neg- ative rating is very high (Models 13 and 14). Note that these effects are much Results 29 smaller because of the low baseline probability.16 A somewhat inconclusive find- ing is the positive effect of a buyer’s negative rating on the seller’s probability of giving a positive rating. Moreover, our results support Hypothesis 5a, not 5b. The more positive rat- ings a seller has, the less likely the buyer is to submit feedback. That is, in addition to strong reciprocity, buyers’ rating behavior is consistent with Becker’s altruism rather than with warm glow altruism, signaling, or indirect reciprocity. The remaining evidence in Table 2.5.2 is consistent with the view that buyers, in particular, take their interaction partners’ utility into account when leaving feed- back. First, in line with Hypothesis 7b, buyers (and sellers) seem to acknowledge that traders to whom they provided a rating before do not benefit from another rating. Second, the more negative ratings a seller has, the more likely a buyer is to leave a positive rating. This might reflect buyers’ willingness to support sellers who want to redeem their good reputation. Finally, we find that buyers (and sell- ers) are reluctant to give a negative rating if a trading partner already has many negative ratings, but they are more likely to give a negative rating to a trading partner with many positive ratings. This might reflect the fact that, after a failed transaction, traders feel greater disappointment toward trading partners with good reputations than toward those with bad reputations. We do not find support for our hypotheses regarding sellers’ strategic behavior at the feedback stage. Neither sellers’ own number of positive ratings (Hypothe- sis 6a) nor buyers’ number of positive ratings (Hypothesis 6b) exhibit a positive effect on sellers’ probability of leaving positive feedback. Also, sellers are not less likely to give feedback to a buyer from whom they have received a rating in a previous interaction, which contradicts Hypothesis 7a. In summary, we observe high levels of participation in the feedback system, which seem to be driven mainly by strong reciprocity and altruism. Both sellers and buyers reciprocate received ratings from their transaction partners (Hypoth- esis 4) and appear to take their interaction partners’ utility into account in their feedback decisions (Hypotheses 5a and 7b). We do not find evidence for sellers’ strategic considerations, as our corresponding hypotheses cannot be confirmed (Hypotheses 6a, 6b, and 7a). However, this does not mean that strategic motives are unimportant or do not exist. For example, according to our descriptive results, sellers are on average quicker than buyers to submit positive feedback, which can be seen as a strategic attempt to elicit reciprocation.

16 In the subsample used for the two-way fixed-effects estimation, the average conditional probability of feedback being given on a specific day is about 3 percent for positive feedback and .01 percent for negative feedback. 30 Reputation Formation 2.6 Discussion and Conclusions

In commerce, trust problems have been ubiquitous ever since humans started trading commodities. Cultural evolution, together with the creativity of social engineers, led to a great variety of institutions intended to help mitigate these trust problems. These institutions include the reputation system of the Maghribi traders in the Middle Ages described by Greif (1989), the “law merchant” sys- tem analyzed by Milgrom and colleagues (1990), the formation of formal social structures in nineteenth-century North America explored by Zucker (1986), and informal network governance structures as described by Jones and colleagues (1997). Factors such as the degree of traders’ social embeddedness (Buskens and Raub 2002; DiMaggio and Louch 1998; Granovetter 1992), the type of commod- ity (Kollock 1994), the complexity of the interaction (Jones et al. 1997), and, most importantly, the lack of timely and inexpensive information about a trading partner’s reputation (Milgrom et al. 1990) have been important determinants in the evolution of these institutions. Recent technological progress has drastically diminished the cost of fast in- formation transmission, and auction markets were among the first to establish online reputation systems for economic transactions. Based on a seller’s repu- tation, buyers can trade off prices against the possibility of being cheated, and the price premium for a good reputation provides a financial incentive for sellers’ cooperative behavior. Moreover, traders do not have to be embedded in offline social networks to access information about other traders as long as they have access to the market platform online. In the past decade, online reputation sys- tems have spread all over the web, and the Internet’s technical possibilities have brought their functioning close to perfection. However, the existence and stability of online reputation systems cannot be taken for granted. A reputation system’s effectiveness in promoting (first-order) cooperation crucially depends on traders’ voluntary provision of feedback. Because a trader gains no direct benefits from leaving feedback, participation in the feedback system is subject to a (second- order) free-rider problem. Our study investigates the role of reputation in promoting cooperation and the process of reputation formation in the field. That is, we explore the mechanisms underlying cooperation by observing transactions and ratings among traders in anonymous online markets in an unobtrusive way. We consider online markets as integrated systems of institutional rules that govern traders’ decisions, and our analysis covers the entire chain of interactions between a buyer and a seller, starting with the seller’s offer and ending with either or both leaving feedback. Our theoretical considerations are rooted in classical sociology and anthropology, with their insights into the roles of trust, reciprocity, and reputation in social Discussion and Conclusions 31 exchange, but we also draw on recent findings from experimental social sciences. Based on a large sample of data from the mobile phone and DVD markets, we test various and partly competing hypotheses derived from our theoretical argument. In both markets, we find that sellers’ number of positive ratings has a positive effect on sales and selling prices, and sellers’ negative ratings have the opposite effect (Hypothesis 1). Moreover, in line with Hypothesis 2, negative ratings have a greater effect on sales and prices than do positive ratings. This corroborates previous findings that reputation has a market value and that buyers take into account information about sellers’ reputations when they place their bids (see the appendix for a review of previous findings). However, the empirical evidence for the uncertainty hypothesis (Hypothesis 3) is mixed. As hypothesized, reputation has a larger effect on price for used mobile phones than for new ones, but this is not the case for the probability of sale. The gross effect of reputation, taking into account both the price and the probability of sale, seems even larger for new mobile phones than for used ones. A possible explanation for this finding is that used products attract buyers who are willing to take a risk and therefore do not take information about sellers’ reputations into account as carefully as buyers of new products do. At the feedback stage, cooperation is very high. Sixty percent of transactions in the mobile phone market and 80 percent in the DVD market were rated by both sellers and buyers, and only a small fraction of transactions had no rating at all. This is comparable to what Dellarocas and colleagues (2004) and Bolton and colleagues (2013) found in their eBay data. Our study corroborates, moreover, that buyers’ and sellers’ behavior at the feedback stage is largely consistent with strong reciprocity (Hypothesis 4). Although one receives no apparent benefits from reciprocating a rating received from one’s trading partner, the likelihood of providing similar feedback sharply increases for both buyers and sellers after re- ceiving a rating. The effects are strong and highly significant for both positive and negative ratings. This is in line with statistics reported in Bolton and colleagues (2013) and with what Dellarocas and Wood (2008) and Jian and colleagues (2010) found based on their model estimations. Besides reciprocity, altruistic preferences seem to motivate traders’ (in partic- ular, buyers’) decisions. Buyers are more likely to give a positive rating to traders with few positive or many negative ratings (Hypothesis 5a). This contradicts Jian and colleagues (2010), who found that both buyers and sellers are more likely to give feedback to experienced traders than to inexperienced traders. We argue that this could be an artifact resulting from Jian and colleagues (2010) disregarding transaction quality in their model estimations. Transaction quality is likely to be higher for experienced traders and could thus lead to a positive correlation be- tween traders’ experience and the probability of giving feedback. We circumvent 32 Reputation Formation this problem by including partner fixed effects in our model estimations. If we do not include them, our results are similar to those of Jian and colleagues (2010). Moreover, we find that buyers (and sellers) are less likely to give positive feed- back to the same trading partner repeatedly (Hypothesis 7b). Hypotheses 5a and 7b are in line with Becker’s (1976) notion of altruism; our results are consistent with the idea that traders take their interaction partners’ utility into account when leaving feedback and are more likely to give feedback the more beneficial it is for their partner. Contrary to our expectations and unlike the results by Dellarocas and col- leagues (2004) would suggest, we find no support for our hypothesis regarding sellers’ strategic considerations at the feedback stage. In particular, we find no support for Hypotheses 6a and 6b, that sellers strategically take into account their own or the buyers’ score at the feedback stage. Dellarocas and colleagues’ (2004) conclusion was based on the finding that a buyer’s first move at the feedback stage had a negative effect on the seller’s overall probability of leaving feedback. We argue that this result could be an artifact due to their use of a probit model; if sellers have a lower a priori probability of providing a rating, buyers will more likely be the first movers at the feedback stage. The event-history models used in our analysis do not suffer from this problem because the buyer’s first move is treated as a time-varying variable. Yet, traders seem to anticipate their trading partners’ reciprocal behavior, especially in the case of negative feedback, as some try to avoid retaliation by submitting their negative rating close to the end of the rating period. Thus, we cannot conclude that strategic motives are unimportant in traders’ decisions to leave feedback; they are difficult to identify in our data. In Gouldner’s (1960, 176) words, the norm of reciprocity is a “’starting mech- anism’ [that] helps to initiate social interaction.” Reciprocity is also a cornerstone in Blau’s (1964) . Both authors elaborate on the principle of reciprocity and outline its implications for the emergence and stability of ex- change relations. Their notion of reciprocity and the reciprocity we observe in our data is similar to the notion of “strong reciprocity” pioneered by Gintis (2000). According to strong reciprocity, actors will return a favor with a favor and retal- iate against an unfriendly act without expecting to be compensated for the costs they incur in doing so. In laboratory experiments, Fehr and colleagues (2002) demonstrated that strong reciprocity is a key mechanism in promoting coopera- tion in voluntary contribution games, and they sparked a cross-disciplinary debate concerning the determinants of human social cooperation (Guala 2012). In our study, we show how the interplay of simple institutional rules and human motiva- tion establishes a high level of cooperation among anonymous traders in online markets. Discussion and Conclusions 33

Based on our findings we argue that, in these markets, altruism, strong reci- procity, and, most likely, also strategic motives play an important role in main- taining (second-order) cooperation at the feedback stage, which generates the information necessary to create financial incentives for (first-order) cooperation at the transaction stage. In other words, giving positive or negative feedback – a second-order issue – can be conceived as rewarding or punishing a trading part- ner’s first-order cooperation or defection, respectively (Resnick and Zeckhauser 2002). Given that the constrained set-up of online markets mirrors laboratory con- ditions relatively well, our findings bolster the external validity of experimental evidence of social cooperation and peer punishment. At the same time, our study makes apparent that unobtrusive behavioral data from online platforms need to be complemented with laboratory and field experiments, which are better suited to disentangling the various motives behind human behavior. Although strong reciprocity and altruism appear as the most likely motives behind the behavioral patterns we observe at the feedback stage, carefully designed experiments should be used to verify our results and rule out alternative explanations. Electronic reputation systems have been constantly refined to better meet the requirements of a steadily growing online community. In fact, the change in eBay.de’s reputation system in spring 2008 was guided by a thorough theoretical and empirical analysis that combined field data from online markets with evi- dence from laboratory experiments (Bolton et al. 2013). The main concern with the old two-sided feedback system was that it might foster a positive evaluation bias. On the one hand, it encouraged positive evaluations due to positive reci- procity expectations; on the other hand, it deterred truthful negative evaluations due to negative reciprocity expectations. Without abandoning the old system, buyers were newly given the opportunity to leave a more detailed and anonymous seller rating. That is, after a transaction, buyers could rate sellers on several at- tributes – such as item description, communication, transaction speed, and costs – and sellers were not able to identify the buyer who rated them because they were only shown average feedback scores on each attribute. Surprisingly, although the costs for leaving feedback increased for buyers, both buyers’ and sellers’ conven- tional rating behavior hardly changed 10 weeks into the system change, and 70 percent of buyers who gave conventional feedback also gave detailed seller feed- back (Bolton et al. 2013). This provides additional support for our assertion that cooperation at the feedback stage is largely motivated by strong reciprocity and altruism, rather than mere strategic considerations encouraged by the two-sided feedback system. 34 Reputation Formation 2.7 Appendix

The materials in this appendix is an edited version of the online supplement ac- companying the published version of this article.

2.7.1 Literature Review Several empirical studies have investigated the effects of reputation in online mar- kets (for earlier reviews, see Bajari and Hortaçsu 2004; Resnick et al. 2006). Most of these studies analyze auction data collected from the online market platform eBay (for other platforms, see Diekmann et al. 2009; Ockenfels 2003; Snijders and Weesie 2009; Snijders and Zijdeman 2004) and use statistical models to es- timate the effects of seller ratings on selling price and probability of sale. These studies employ datasets of different sizes (ranging from 100 to 340,000 cases) and covering various commodities (e.g., guitars, coins, stamps, mobile phones, and DVDs). They differ in the definition of the dependent variable, the opera- tionalization of sellers’ reputation, and the control variables included in the mod- els. The majority of these studies estimate the elasticity or semi-elasticity of the selling price with respect to the number of positive and negative ratings or with respect to the “score” (positive minus negative ratings) and account for censoring (i.e., that prices are unobserved for unsold items). Some studies provide mixed results (Bajari and Hortaçsu 2003; Lucking- Reiley et al. 2007; Standifird 2001) or do not support the reputation hypothesis (Cabral and Hortaçsu 2010; Eaton 2002). The majority of studies, however, find consistent evidence for a positive relation between sellers’ reputation and sell- ing price (Berger and Schmitt 2005; Dewally and Ederington 2006; Dewan and Hsu 2004; Houser and Wooders 2006; Livingston 2005; McDonald and Slaw- son 2002; Melnik and Alm 2002, 2005; Przepiorka 2013; Snijders and Zijdeman 2004). With respect to the probability of sale, the evidence is less conclusive. Re- sults found by Dewan and Hsu (2004), Hsu et al. (2009), and Przepiorka (2013) support the hypothesis of a positive (negative) relation between positive (nega- tive) ratings and sales, but Berger and Schmitt (2005), Eaton (2002), and Cabral and Hortaçsu (2010) find no support or mixed evidence. And Snijders and Zijde- man (2004) find a negative relation between the score and probability of sale.17 Apart from variables accounting for product characteristics (e.g., brands, book values, or condition), the explanatory and control variables most frequently

17 Snijders and Zijdeman (2004) and Eaton (2002) do not include the starting bid (i.e., minimum price) in their model. If the starting bid is negatively (positively) correlated with sellers’ positive (negative) ratings (see, e.g., Przepiorka 2013), the corresponding estimates of reputation effects can be biased in the opposite direction. Appendix 35 used are starting price, number of bids or bidders, existence of a product pic- ture, auction duration, whether the offer ends on a weekend, availability of secure payment methods, and whether the seller set a secret reserve price. The starting price consistently and in line with expectations has a significantly negative effect on sales (Berger and Schmitt 2005; Cabral and Hortaçsu 2010; Dewan and Hsu 2004; Przepiorka 2013), but the merit of its inclusion in a model for the selling price is debatable. On the one hand, the selling price is left-censored at the start- ing price and, therefore, the effect of the starting price will be overestimated in a standard regression model. On the other hand, the starting price may have a genuine effect on the selling price. For example, a high starting price may signal a superior product quality or, conversely, a low starting price may attract more potential buyers (see Livingston 2003). Consequently, reputation effects can be biased in models that do not control for starting prices. The number of bids and the number of bidders are endogenous variables mediating the effects of repu- tation and should not be included in the model; the same is true for all buyer characteristics. Providing a picture of the offered item is found to increase selling price by Snijders and Zijdeman (2004), Eaton (2002), and Dewally and Ederington (2006), as well as the probability of sale by Snijders and Zijdeman (2004). However, Eaton (2002) and Cabral and Hortaçsu (2010) report a decreasing effect on the probability of sale and Melnik and Alm (2002) did not find any effects in regard to product picture. Research finds auction duration has no effect (Houser and Wooders 2006; Melnik and Alm 2002; Melnik and Alm 2005) or a positive effect on sales (Cabral and Hortaçsu 2010; Dewan and Hsu 2004; Przepiorka 2013) and prices (Cabral and Hortaçsu 2010; Dewally and Ederington 2006; Lucking- Reiley et al. 2007; Przepiorka 2013). Results with respect to an auction ending on a weekend are mixed (Dewan and Hsu 2004; Lucking-Reiley et al. 2007; Melnik and Alm 2002; Melnik and Alm (2005); Przepiorka 2013). Moreover, with the exception of two studies (Cabral and Hortaçsu 2010; Przepiorka 2013), there is no evidence that the availability of a secure payment method positively affects sales or prices (Berger and Schmitt 2005; Houser and Wooders 2006; Melnik and Alm 2002; Melnik and Alm 2005). Finally, secret reserve prices are commonly used in online auction markets in the United States and the UK, but on eBay Germany, in most product categories, sellers do not have the option to set secret reserve prices. If sellers set a secret reserve price, a transaction takes place only if the highest bid reaches that threshold. Two studies find a positive and significant effect of a secret reserve price on selling price (Dewan and Hsu 2004; Lucking-Reiley et al. 2007), but results are mixed in one study (Standifird 2001) and two other studies find a significantly negative effect (Bajari and Hortaçsu 2003; Dewally and Ederington 2006). 36 Reputation Formation

2.7.2 Details of Data Collection We collected data on a total of 1,084,882 auctions of various products on the Ger- man eBay market platform between November 7, 2004 and January 7, 2005. We gathered information on auction characteristics using a computer program written to observe auctions for scientific analysis. Based on a set of 144 eBay categories (e.g., Sony-Ericsson T610), the first program maintained a directory of identifi- cation numbers and prospective end dates of new auctions that was updated daily. The auctions in this list were then processed sequentially by a second program, which scanned an auction’s pages once the end date was reached. The following pages were accessed and stored in a local database: (1) the auction’s main page (the item page); (2) the list of bids; (3) the seller’s profile page; (4) the buyer’s profile page; (5) the page listing other items offered by the seller; and (6) the page listing offers by the buyer. The information on these pages was extracted using regular expressions and stored in database fields. For the item page, the HTML source code was also stored. After 90 days, the maximum duration for submitting feedback, a third program recorded the entire feedback history of the seller and the buyer involved in the transaction. The usual procedure on eBay is to handle transactions in the manner of an English auction, but the system also allows sellers to set a “Buy It Now” price at which the product can be purchased immediately. We recorded data on offers where the English auction mechanism was used, possibly in combination with a Buy It Now price, but disregarded offers in which the seller disabled the English auction mechanism and employed a Buy It Now price only. Due to an error, only partial data were recorded for auctions with more than 18 bids. We tried to recover the missing data after detecting the problem, but this was only possible for auctions that ended on December 1, 2004 or later. Auctions that ended earlier were taken offline and we could not retrieve them using the item number. To have complete data and a well-defined sample, we therefore restricted our analysis to auctions that ended between December 1, 2004 and January 7, 2005. Appendix 37

2.7.3 Control Variables Included in the Models in Table 2.2

Apart from positive and negative ratings, we controlled for other seller character- istics, such as whether they had a private profile, a verified identity, a Me-Page, or PowerSeller status.18 These attributes are more common for sellers in the DVD market than for sellers in the mobile phone market, which indicates that profes- sionalism is more pronounced in the latter (see Table 2.5).

In addition to the product subcategories (model for mobile phones, genre for DVDs), control variables included the starting price (minimum bid; Models 1, 2, and 3 only), the duration of the auction, the payment modes offered, the (loga- rithm of the) number of characters used in the product description (as a proxy for the amount of information provided about the offered item), whether a picture was provided, and whether the auction was highlighted in eBay’s list of offers using boldface or a picture thumbnail (eBay charges a small fee for each of these options). We further included a variable called “competition” that captures the (logarithm of the) number of similar offers that were active at the time the auc- tion ended (same model and same new/used status in the case of mobile phones and same genre in the case of DVDs).

Finally, we included variables for the date and time the auction ended. In our models we used dummy variables for the week (the observation window cov- ers six weeks), the weekday (plus a dummy for December 24/25 or December 31/January 1), and the time of day (one eight-hour interval starting at midnight followed by four-hour intervals in the case of mobile phones; two four-hour in- tervals starting at midnight and at 4 a.m. followed by hourly intervals in the case of DVDs).

Unfortunately, we were not able to collect reliable information on shipping costs, which are usually charged by the seller in addition to the selling price. We therefore did not include such a variable in our models. Furthermore, we did not include the number of bids or buyer characteristics as explanatory variables because these variables are endogenous.

18 A private profile means that, although rating statistics are visible, text feedback is hidden from other users (and “private” is displayed next to the user’s name instead of the feedback score); a verified identity can be obtained for a small fee by proving one’s identity at a post office; a Me-Page is a user’s homepage on eBay that can provide additional information on a business or person; and PowerSeller status is granted to professional sellers with a certain trade volume and a good feedback record. 38 Reputation Formation

2.7.4 Discussion of the Effects of the Control Variables in the Models in Table 2.2

Results for the full models for the probability of sale (Models 1, 2, and 3) are displayed in Table S2. Regarding seller characteristics other than positive and negative ratings, no clear pattern can be found. The large negative effect of a seller’s verified identity in Model 1 may be an artifact driven by the small number of sellers with this attribute. Not surprisingly, the starting price is a strong deter- minant of sales in all models. The higher the starting price, the less likely it is the item will be sold. As expected, the level of competition (the number of similar offers currently on the market) exerts a negative effect on the chances of selling a product (neg- ative effect in all models, but not significant for new mobiles). The options to increase the visibility of an auction in the list of offers (thumbnail or boldface) seem to serve their purpose, but their effects are only significant in the model for DVDs. The description length has the expected positive effect for (new) mobile phones, but negatively affects DVD sales. An ad hoc explanation for this coun- terintuitive finding could be that an extensive description is valuable for complex goods such as mobile phones, but is irrelevant or might even raise suspicion in the case of an item such as a DVD. Furthermore, not providing a picture has a nega- tive effect for (used) mobile phones, but does not affect the sales of DVDs. The models also contain a number of dummy variables to control for payment modes offered (bank transfer, PayPal or credit card, cash on pick up, or cash on deliv- ery), auction duration (one, three, five, seven, or ten days), the time-point of the end of the auction (using a total of 16 and 29 parameters for mobile phones and DVDs, respectively; see the earlier description of the control variables), and the product subcategory (seven mobile phone models and seven DVD genres). For these groups of variables, joint tests are reported in the lower panel of the table without listing the single coefficients. The product subcategory is highly signifi- cant in all models, the auction duration is significant for DVDs but not for mobile phones, and the time-point of the end of the auction is significant for DVDs and used mobile phones but not for new mobile phones. Results for the full models for the selling price (Models 4 through 10) are dis- played in Tables 2.7 and 2.8. Positive and negative ratings are not the only source of information a buyer can use to assess sellers and their products. Auction char- acteristics such as a picture and the description length also have a positive effect on price for new and used mobile phones. We also observe a positive effect of the description length for DVDs (counteracting the negative effect on the probability of sale), but providing a picture, again, seems to be less important for DVDs. Seller characteristics such as a verified identity, a Me-Page, or PowerSeller status Appendix 39 do not exert a positive effect on price. A likely reason for the lack of an effect of PowerSeller status is that it can only be acquired by sellers who already have a high number and proportion of positive ratings. Due to the decreasing marginal effect of positive information, PowerSeller status does not contribute much to in- crease a seller’s trustworthiness and therefore is discarded by potential buyers. Moreover, sellers with a private profile do not seem to be disadvantaged. This might appear surprising, but recall that a private profile does not hide the seller’s feedback statistics, only the corresponding text comments. In accordance with theoretical considerations, more intense competition low- ers the price for mobile phones. For DVDs, no competition effect can be found, but this is not very surprising given the broad nature of the competition measure in this case (recall, however, that competition had an effect on sales of DVDs; see Table 2.6). Similarly, increasing the visibility of an offer using a thumbnail or bold listing seems to attract bidders and, as expected, increase the price. Payment modes offered take up some of the unexplained variance except for used mobile phones. Auction duration does not affect price, but there are significant timing effects. In the case of mobile phones, the model makes a large difference and there are also some price differences among DVD genres. We do not include the starting price in Models 4 through 10 because the sell- ing price is censored at the starting price (causing the coefficient for the starting price to be positively biased). Consistent estimates for the effect of the starting price can be obtained via censored regression. In our case, the starting price ex- erts a weak positive effect for used mobile phones and DVDs, but not for new mobile phones (not shown).

2.7.5 Computation of Hazard Rates and Confidence Intervals in Figure 2.1 The displayed hazard rates are kernel-density estimates based on nonparametric hazard components, applying linear boundary correction at days 0 and 90 (Jones 1993) and using a direct plug-in automatic bandwidth (Wand and Jones 1995), as implemented by Jann (2005). The dotted lines are normal approximation boot- strap confidence intervals taking into account clustering of sellers and buyers, respectively.

2.7.6 Model Diagnostics All reported results are fairly robust against changes in model specifications. For example, the exact choice of control variables does not affect the effects of key variables much. Because a selling price cannot be smaller than the starting price, 40 Reputation Formation we replicated results for the selling price (Models 4, 5, and 6) using models for censored data (censored normal regression, censored normal regression with het- eroscedastic errors, Heckman selection regression, and censored quantile regres- sion). In all variants effects of the reputation variables were highly significant. Furthermore, we evaluated the functional forms of effects of continuous variables using graphical methods and detected no major deviations from the chosen forms in the reported models. Multicollinearity does not seem to be a problem either. The only variable that reaches a substantial value for the variance inflation fac- tor is our measure for competition, which strongly depends on the auction date and time. However, excluding competition from the models does not change our findings. Finally, results do not seem to be driven by outliers. We replicated the models excluding the most influential percent of observations (as measured by Cook’s D) without changing our substantial conclusions. Three additional issues are worth mentioning with respect to discrete-time linear probability models (LPM) we used to analyze the feedback behavior (Mod- els 11 through 14). First, an alternative method of analyzing the feedback process would be to apply a proportional-hazards model for continuous time-to-event data (Cox regression; see, e.g., Kalbfleisch and Prentice 1980). We did not employ this model because there is no satisfactory way to incorporate buyer and seller fixed effects simultaneously (for the same reason, we used an LPM instead of the more common discrete-time logit or complementary log-log model). An analysis of our data using fixed-effects (i.e., stratified) proportional-hazards models (Allison 2009), where each model is estimated twice, once stratified by sellers and once by buyers, leads to substantially the same conclusions. Second, in our discrete- time LPMs we did not include auction characteristics (e.g., the selling price) as control variables. Including such variables does not change our results. Third, in our feedback models, effects are assumed to be constant over process time. This assumption seems to be violated for the reciprocity effect (i.e., the effect of a first move by the partner). Detailed analyses allowing the reciprocity effect to vary by process time indicate that the effect is strongest in the beginning, but stays positive over the whole 90-day period. Appendix 41

Table 2.5: Means and Standard Deviations of Control Variables New Mobile Used Mobile Phones Phones DVDs Seller Seller has private profile .025 .028 .017 Seller has verified identity .032 .038 188 Seller has Me-Page .086 .091 .486 Seller is PowerSeller .016 .020 .282 Buyer (if sold) Buyer has private profile .021 .049 .008 Buyer has verified identity .015 .014 .020 Buyer has Me-Page .045 .050 .047 Buyer is PowerSeller .007 .005 .002 Auction Starting price in Euros 28.8 (69.7) 25.9 (57.0) 3.7 (6.6) No picture .050 .058 .071 Description length (log) 7.30 (1.15) 6.95 (1.12) 7.04 (1.36) Competition (log) 3.65 (1.08) 3.87 (.83) 6.92 (.87) Listed with thumbnail .419 .352 .033 Listed in bold .146 .118 .003 Payment modes offered No bank transfer .035 .034 .082 PayPal or credit card .107 .082 .169 Cash on pick up .479 .399 .169 Cash on delivery .117 .087 .023 Auction duration 1 day .142 .104 .291 3 days .319 .278 .195 5 days .191 .183 .112 7 days .267 .344 .236 10 days .081 .091 .166 Model (mobile) or Genre (DVDs) Sony T610 / Action .041 .111 .403 Sony T630 / Drama .061 .085 .061 Nokia 6230 / Classics .531 .152 .051 Nokia 6310i / Comedy .061 .296 .185 Motorola V600 / Crime .089 .068 .093 Samsung E800 / Cartoon .156 .062 .021 Samsung E700 / Children .062 .226 .186 Number of observations 5,499 9,128 339,517 Number of sellers 4,341 7,687 33,166 Number of buyers 4,881 7,454 100,046 42 Reputation Formation ect of Reputation on Sales ff Model 1 Model 2 Model 3 New Mobile Phones Used Mobile Phones DVDs .001 (two-tailed tests). cient estimates of logistic regressions and robust standard errors in parenthe- Table 2.6: E ffi < 29 df) 22.4 36.5** 298.4*** / .01; ***p < The table shows coe .05; **p Payment modes (4 df)Auction duration (4 df)End of auction (16 9.7* 4.1 2.5 7.5 16.1** 36.8*** Product subcategory (6 df) 82.8*** 113.3*** 65.5*** < Seller’s positive ratings (log)Seller’s negative ratings (log)Seller has private profile .344*Seller has -.670* verified identitySeller has Me-Page (.136)Seller is PowerSeller (.301) -3.218***Starting price -.476No picture -.019 (.712) -.207Description length (log) (.865)Competition (log) (.064) -.185 .235Listed (.132) with -.024 thumbnail -.532Listed in .117** bold .423** -.145*Constant (.602) (.366) (.775) -.110*** (.040) (.593)Further (.058) covariates (Chi2-tests): (.159) -.100 .085 (.010) -.126 -.880 .026 -.583* -.883 -.056*** .047 (.206) (.327) (.292) (.419) (.677) (.385) (.803) (.005) -.724*** 1.074 -1.076*** (.078)McFadden R-squared -.203 -.109*** (.153) -.594* 25.165*** .280Number of (.279) -.115*** observations (.028) (.679)Number (3.686) (.165) of sellers (.028) -.656*** (.276) 17.835*** (.175) (.111) .432 (2.018) .692*** .009 7.429*** (.150) (.289) .858 (.983) 5499 (.155) .731* 4341 (.307) .678 9128 7687 339,517 .133 33,166 Note: ses (adjusted for seller-clusters). Theauctions. binary dependent variable in*p all models is equal to one for successful Appendix 43 ect of Reputation on Price I ff Model 1 Model 2 Model 3 New Mobile Phones Used Mobile Phones DVDs .001 (two-tailed tests). cient estimates of OLS regressions and robust standard errors in parenthe- ffi < Table 2.7: E 29 df) 26.6*** 7.6*** 4.6*** / .01; ***p < The table shows coe .05; **p Payment modes (4 df)Auction duration (4 df)End of auction (16 Product subcategory (6 df) 5.4*** 1545.4*** .4 979.8*** 0.7 1.3 18.6*** 3.8** .5 < Seller’s positive ratings (log)Seller’s negative ratings (log)Seller .005*** has private profile -.011***Seller has verified identity (.001)Seller has (.002) Me-PageSeller is PowerSeller .008* -.018**No -.019 picture .001Description length (log) (.006) (.003)Competition (.011) (log) (.007)Listed with thumbnail -.101** .053*** .007Listed .006 in .018 bold .004*** -.002 (.015) (.035) Constant (.005) (.001) (.012)Further (.020) covariates (F-tests): (.019) .010*** -.018** -.024** .026*** -.019 .000 (.003) -.027 (.006) .107 (.009) (.004) (.016) .036*** (.103) (.034) -.037* -.044* (.154) .006 .026* -.267***R-squared (.007) 5.473*** (.016) (.076) Number .020 of (.022) (.011) observations (.004) .321***Number of (.031) sellers -.022 (.052) (.118) .032** .067 5.261*** (.027) (.085) (.010) (.089) 1.519*** 5269 .521*** (.244) (.081) 4242 .844 8727 7474 .513 180,881 30,018 .111 ses (adjusted for seller-clusters).price (in The Euros). dependent variable*p in all models is the logarithm of the selling Note: 44 Reputation Formation

Table 2.8: Effect of Reputation on Price II

Model 7 Model 8 Model 9 Model 10 New Mobile Used Mobile Phones Phones DVDs DVDs Title & Seller FEs Seller FEs Title FEs Seller FEs Seller’s positive ratings (log) .027** -.002 .016*** .005 (.010) (.045) (.002) (.007) Seller’s negative ratings (log) -.055 -.025 -.012** -.036** (.034) (.033) (.005) (.014) Seller has private profile -.002 .008 (.026) (.082) Seller has verified identity .011 -.049* (.013) (.021) Seller has Me-Page .006 -.083 (.009) (.047) Seller is PowerSeller -.024 .000 (.013) (.015) No picture -.029* -.063 -.064*** -.007 (.015) (.073) (.010) (.016) Description length (log) .003 .041 .009*** -.004 (.006) (.031) (.002) (.005) Competition (log) .008 -.037 -.010* -.002 (.010) (.028) (.004) (.004) Listed with thumbnail -.008 .051 .015 .022* (.014) (.034) (.009) (.010) Listed in bold .018 -.014 .071 -.009 (.015) (.037) (.045) (.072) Constant 5.297*** 5.139*** 1.492*** (.084) (.308) (.037) Further covariates (F-tests): Payment modes (4 df) 1.5 .6 .9 1.4 Auction duration (4 df) .4 1.6 5.7*** 4.7*** End of auction (16/29 df) 7.3*** 4.2*** 8.2*** 8.0*** Product subcategory (6 df) 55.0*** 26.2*** Number of observations 1612 1944 113,276 103,030 Number of titles 18,054 15,964 Number of sellers 585 691 6901 Note: The table shows coefficient estimates of fixed-effects (FE) regressions and robust standard errors in parentheses (adjusted for seller-clusters). The dependent variable in all models is the logarithm of the selling price (in Euros). *p < .05; **p < .01; ***p < .001 (two-tailed tests). Chapter 3

Reciprocity in Reputation System: An Ex- perimental Study

Abstract: Electronic reputation systems based on voluntary feedback from traders constitute a powerful institution to create and sustain trust in online marketplaces like eBay or Amazon. This study compares the effectiveness of three different reputation systems, with special attention to the role of direct reciprocity in bilateral rating systems. We first review existing literature on rep- utation building and indirect reciprocity. Within the paradigm of binary-choice trust games, we experimentally compare a bilateral-sequential, a bilateral-blind, and a one-sided feedback mechanism with a control group that has no access to reputation information. Fairly high levels of trust and trustworthiness are found in all markets. However, treatments that prevent participants from submitting strategic and reciprocal ratings exhibit higher efficiency. We compare patterns of strategic feedback timing and determinants of rating behavior with an analysis based on duration models. Direct positive and negative reciprocation in the sequential treatment results in a significantly higher feedback turnout. Participants postpone negative ratings to avoid retaliation. But in contrast to the widespread belief that negative feedback is suppressed, our experimental results suggest otherwise.

Keywords: trust, reputation, reciprocity, online markets, negativity bias

† This chapter is joint work with Dominik Hangartner (London School of Economics). We devel- oped the experimental design together. Dominik also conducted most of the laboratory session at the University of Berne. Data handling, data analysis and the preparation of the manuscript was done by myself. Potential errors are all mine. We thank Marius Rohrer for help in programming the treat- ments. Financial support from Prof. Martin Abraham (now University of Erlangen) and the Institute of Sociology, University of Bern is gratefully acknowledged. 46 Reciprocity in Reputation Systems 3.1 Introduction

This paper reports on the interplay of indirect and direct reciprocity in internet market trust mechanisms. All markets require trust to a minimal extent (Coleman 1990). Compared to face-to-face transactions, traders in e-commerce are espe- cially confronted with a challenging environment. Trading on marketplaces like eBay or Amazon is typically anonymous, is not repeated, takes place over wide geographical distances, and is executed in a sequential fashion where a buyer has to transfer money up-front to receive goods of unknown quality from the seller later in time. Electronic marketplaces try to reduce the risk of opportunistic behavior with different measures like screening participants, monitoring transac- tions, offering assurances, providing escrow services and enforcing money back guarantees. One of the most prominent and widely discussed solutions to elicit cooperation and honest behavior in e-commerce are reputation systems (Kollock 1999, Dellarocas 2003, Yamagashi et al. 2009). Such systems allow trading part- ners to rate each other and to publicly post information about their past trans- action. Because the quality of the transaction is mainly privately observed by the trading parties involved, most online feedback mechanisms rely on voluntary self-reporting of the trading outcome. Many empirical and experimental studies have shown that reputation systems indeed implement an effective mechanism to foster trustworthy behavior. Empir- ical results based on eBay data typically show that sellers with a good reputation have a greater probability of selling their products, and sell them quicker and at a better price (e.g., Diekmann et al. 2014). At the same time, numerous scholars have alluded to reputation systems’ Achilles’ heel: the participant’s propensity to rate and to rate truthfully. Public information about the trustworthiness bene- fits both buyers and sellers. Providing feedback produces a collective good freely available for everybody. Although e-commerce platforms try to minimize the bur- den, feedback submission remains costly and time-consuming. Therefore, under the assumption of self-regarding preferences, traders should have no incentive to participate in the feedback system. This gives rise to a second-order free-rider problem with the prediction that feedback will be underprovided. However, em- pirical results indicate that the collective good problem in feedback systems is less severe than theories of pure self-regarding preferences may expect. Resnick and Zeckhauser (2002) report a rating propensity of 50-60%. More recent studies find rating frequencies between 65 and 85% in eBay markets from several dif- ferent countries (Dellarocas and Wood 2008, Bolton et al. 2013, Diekmann et al. 2014), which suggests that feedback provision does not deteriorate. Feedback submission alone is not sufficient to sustain cooperative online mar- kets. Reports must be truthful so that the information about trading partners is Related Literature 47 trustworthy in itself. Depending on implementation, self-reporting opens the door for several forms of under-reporting, reprisals, and strategic and recipro- cal ratings. Resnick and Zeckhauser (2002) mention several reasons that buyers could be reluctant to submit negative feedback after failed transactions. First, they may fear their negative feedback would provoke a reciprocal response and prefer to remain silent. In particular, buyers with ambitions to become sellers themselves may want to protect their own reputation. Second, buyers might ab- stain from submitting a negative rating to sellers with an unblemished standing. Third, buyers may want to be polite and submit positive feedback only because sellers have submitted their own positive feedback, even if the transaction quality was not above reproach. In general, traders may selectively choose to report on certain types of outcomes, which results in a distorted and biased view of public reputation accounts. Dellarocas and Wood (2008) provided empirical evidence of retaliatory actions and reciprocal behavior, which artificially increases the num- ber of positive ratings and suppresses negative ones. The objective of this article is threefold. First, we try to replicate findings from the existing literature and provide additional evidence of the effects of reputation systems on trustworthy or opportunistic behavior among anonymous traders. We conducted a series of laboratory experiments based on binary trust games implementing different feedback mechanisms. We attempt to compare the credibility and informational value of the different institutional designs and re- port their effects on the propensity to place and honor trust. Second, we attempt to understand determinants of the rating behavior and highlight the dependence of the feedback submission rate on reciprocity and reputation scores of both the seller and the buyer. Third, we intended to contrast our experimental results with empirical findings based on eBay data (Dellarocas and Wood 2008, Diekmann et al. 2014). We gave players in the lab discretion over the time to submit their feedback to make use of the arrival time distribution. In the next section, we will discuss related literature.

3.2 Related Literature

Our experimental design relies on several distinct components that we will dis- cuss in the following section. First, we constrain ourselves to a behavioral def- inition of trust and define the situation. The problem of trust is supposed to be resolved with a reputation system. We will review how rational and self-interested actors build bilateral reputations in repeated interactions and how this logic can be generalized to non-repeated environments with many players. Based on bounded- rationality assumptions, approaches from evolutionary game theory shed light on 48 Reciprocity in Reputation Systems reputation from a different angle. There we address the question of whether a reputation mechanism is evolutionarily stable. How indirect reciprocity relates to direct reciprocity will be discussed thereafter. Finally, we review existing ex- perimental evidence on trust games played in conjunction with reputation mech- anisms.

3.2.1 Trust and Trustworthiness Many researchers have stressed the importance of reputation in social phenom- ena involving uncertainty and trust. During the last two decades, there has been a surge of contributions to the field of trust research. But despite the large body of literature, there is still no consensus on a proper definition of trust. In this pa- per, we adhere to a behavioral definition of trust based on Coleman (1990) that is nicely represented by the game-theoretic formalization of the two-player binary trust game as shown in the left panel of Figure 3.1 in section 3.3.1. According to Coleman (1990, 97-99), a trust situation between a trustor (first mover) and a trustee (second mover) is characterized by four key elements: (1) A trustor places resources at the disposal of the trustee without any legal or formal safeguards. (2) The trustee can only honor or abuse trust if the trustor has placed trust in the first place. (3) Placing trust is associated with the expectation that it will benefit or harm the trustor if trust is honored or abused, respectively. Finally (4), there is a time-lag between the trustor’s and trustee’s decisions. This definition of trust assumes that placing trust is “calculative” and in essence the same as believed or expected trustworthiness (see Williamson 1993, Hardin 2002), whereas trustwor- thiness must be assumed to be either conditional (reciprocity) or unconditional kindness or a mixture of both (Camerer 2003, chap. 2, Ashraf et al. 2006).1 This definition of a trust situation – a transaction that resembles a binary trust game – will be sufficient for our purpose (see Buskens and Raub 2002 for a thor- ough discussion). However, this simple notion of trust has been challenged by recent research. For example, Fehr (2009) concludes in his review that behav- ioral trust is not just a special form of risk taking but is simultaneously deter- mined by risk preferences, social preferences, beliefs, and even neurobiological and genetic factors. The conclusion that trusting differs from non-socially consti- tuted risk taking receives special support from the literature on betrayal aversion (Bohnet and Zeckhauser 2004, Bohnet et al. 2008). Bohnet and colleagues com- pare several situations where players have to decide between a secure payoff and

1 In the terms of Coleman (1990), expected trustworthiness refers to the situation where a person will place trust if pG > (1− p)L, with p corresponding to a subjective probability that trust is honored, and G and L refer to gain and loss if trust is honored or broken, respectively. Information about past behavior (reputation) is one potential source for people’s formation of this subjective probability p. Related Literature 49 a random outcome with an interesting triangulated design. They elicit minimal acceptable probabilities (MAPs) of good outcomes such that players prefer to enter the risky situation. In the situation with social risks, players state signifi- cantly higher MAPs compared to the two other choice situations. This suggests an aversion to intended betrayal from a ‘real’ interaction partner. This means that people are less likely to take risks when facing a human counterpart compared to a gamble with a given probability of bad luck. Besides the binary-choice trust game we apply below, there exists another paradigm to study behavioral trust: the investment game by Berg et al. (1995), which is a continuous version of the binary trust game. We introduce it here because several studies discussed below make use of this alternative. In the in- vestment game a trustor can submit an arbitrary part x of the initial endowment E to the trustee. En route, the invested money is tripled by the experimenter. The trustee can then send back an amount y to the trustor, so that the trustor’s and trustee’s payoffs finally will be π1 = E − x + y and π2 = 3x − y, respectively. The subgame perfect equilibrium prediction for pure self-regarding players in the one- shot situation is to expect no transfers at all. The second mover will not return anything because she prefers more money to less. The first mover – anticipating the partner’s move – will therefore not invest in the first place. However, Berg and colleagues’ results are inconsistent with subgame perfect equilibrium. They find positive amounts sent by trustors and positive reciprocal responses by the trustees. Compared to the binary case, applying the investment game typically results in richer data. Nonetheless, Ermisch and Gambetta (2006) argue that in situations where trust is at stake, involved parties must have a common understanding of what placing and honoring trust actually is. In the investment game, a trustee can send back any positive amount of her share. This prevents the trustor from form- ing any sensible expectation of when trust would be honored or abused. In the binary trust game, where we only observe the corner cases, the situation is much less ambiguous and lends itself to a much simpler interpretation.2 As Bacharach and Gambetta (2001, 150) put it, “trust has two enemies: bad character and poor information”. A trustor always prefers to trust when facing a good character. Moral hazard arises from a potential bad trustee. Trust is therefore a one-sided

2 Ermisch and Gambetta (2006) list several other shortcomings of the investment game (initial endowments and windfall gains, tripling the contribution of the first mover). Most of the arguments are not refuted by switching the paradigm and apply to the binary case too. Besides trust, other social preferences like unconditional altruism can contribute to an explanation of why trustors submit positive amounts. Decomposing the motivational foundation of trust requires greater sophistication in paradigm design (cf. Cox 2004, Ashraf et al. 2006, ). 50 Reciprocity in Reputation Systems incentive problem. We proceed with the question of what kind of information a perfectly rational trustor can have when facing these two basic types of trustees.

3.2.2 Reputation in Repeated Games Reputation is widely accepted as an effective means of enforcing cooperation given that the information is valid, easy to track and disseminate. It has been extensively studied by economists using the tools of game theory. It is typically modeled (a) based on games of incomplete information with two types of players (good/bad, patient/impatient, able/unable), (b) with infinite or finitely repeated games, or (c) with the tools of evolutionary game theory. In economics, the repu- tation effect is the result of introducing a small amount of incomplete information into a repeated game setting (Cripps 2008). In common usage, reputation is a state or an attribute ascribed to an actor by another. It represents a prediction about this actor’s future behavior based on the history of the actor’s previously observed behavior (Wilson 1985). In the case of positive reputation, the attribute signals that the actor is more likely to be desirable for a certain interaction than those without the attribute, whereas a negative reputation signals the opposite. Reputation is thought to be especially effective within small groups where in- volved parties can observe and remember their counterpart’s choices. In repeated interactions, players can make their cooperation conditional on the partner’s his- tory of play, reciprocate uncooperative moves and refuse to cooperate in future rounds. For very long time horizons or as long as a pair of players values profit from future interactions highly enough, the ‘Folk Theorem’ of repeated games suggests that the future payoff stream provides enough incentive for sustained cooperation (Axelrod and Hamilton 1981, Fudenberg and Maskin 1986). This is nicely summarized by Axelrod and Hamilton’s simple rule for when direct reci- procity is able to foster cooperation: w = c/b. The probability w of another encounter of the same two players must exceed the cost-benefit ratio of the stage game payoff.3 With reference to subgame perfection and backwards induction, one would assume that cooperation breaks down as soon as players know when the game

3 With Kreps (1990), we can state the threshold level for the infinitely repeated binary trust game as follows. Given are a continuation probability of w and a discount factor δ with 0 < w, δ < 1, with both w and δ close to unity, i.e., termination of the interaction is sufficiently unlikely and players t value a payoff x from a future trust game t periods away as δ x. Then, wδ > (T2 − R2)/(T2 − P2) will ensure a cooperative equilibrium in repeated bilateral play. In evolutionary biology, the payoffs of the prisoner’s dilemma are typically simplified as T = b, R = b − c, P = 0, S = −c, with b > c. Substituting b and c into the equation above simplifies it to δw = c/b. Note that Axelrod and Hamilton formulated their threshold without discounting. See Buskens and Raub (2002, 176) or Bowles and Gintis (2008) for detailed discussions. Related Literature 51 will end or if the sequence of repeated play is not sufficiently long. But even in finitely repeated games, reputation helps to rationalize cooperative behavior. Player heterogeneity and incomplete information on what kind of trustee a trustor faces changes the setting substantially. The theory of bilateral reputation build- ing goes back to Schelling (1960, 41) and was formally developed in a series of articles by Kreps, Wilson, Milgrom, and Roberts. Kreps et al. (1982) consider two types of players – a long-run player faces a sequence of short-run players – who are unsure about their counterparts’ preferences or type. Short-run players observe the long-run player’s choices, which allow the latter to establish a repu- tation for committing to a certain strategy. Since an uncooperative move early in the sequence reveals the long-run player’s type, it can be (sequentially) rational to maintain a cooperative reputation as long as this reputation promises higher payoff streams. Given a repeated game of incomplete information and a mixture of trustees with self- and other-regarding preferences (e.g. reciprocal types with a preference to honor trust), even the purely selfish trustees have an incentive to imitate the reciprocal types. This creates, in turn, the incentives for trustors to trust. There are several implications that follow from sequential equilibrium rea- soning: First, reputations need to be established. In the initial phase reputations are most informative and incentives for cooperation are strongest. In a second phase, trustees start to deplete their reputation by playing mixed strategies. In the last phase of a finitely repeated game, incentives to maintain a reputation can dis- appear completely as the end of the game comes to a close. Second, even though reputations give rise to trustworthy behavior, their strategic use by bad characters renders their predictive value to be zero. Experimental tests of binary-choice trust games with the sequential equilibrium model were first carried out by Camerer and Weigelt (1988). Players only learn sequential equilibrium reasoning after nu- merous repetitions of short game sequences. Experimental evidence is mixed but supports several predictions of sequential equilibrium theory. This is remarkable given how involved the learning process and equilibria are (Bayesian updating, playing mixed strategies). Bolton et al. (2011) find, however, that human repu- tation builders are more predictable in phase 2 than expected. This suggests that reputations have predictive power and also market value. In contrast to repeated play scenarios with personal enforcement, the func- tioning and effectiveness of the reputation mechanism in larger social systems is less certain. Players continuously meet pure strangers and can only learn about their trading partner’s standing through gossip and word-of-mouth. Depending on the level of information transmission, results from the literature on repeated games can be generalized to cases with many players meeting in one-shot inter- actions. According to Kandori (1992), we can distinguish three cases of commu- 52 Reciprocity in Reputation Systems nity enforcement: (1) In the first and unrealistic case in which all players know the details of all past interactions (perfect information), varying partnership is not an obstacle to cooperation since players are not strangers. This setting is equiv- alent to repeated play. (2) In the minimal information case, in which players only have private information from their past interactions, cooperation can only be sustained if all players adhere to a strict ‘contagious strategy’.4 (3) Finally, in the middle of the two extreme cases, Kandori (1992) and Okuno-Fujiwara and Postlewaite (1995) suggest a status label given to each player which is publicly observable. Each player in the population is labeled “good” or “bad” according to a simple rule: a good player remains good if the player cooperates when the partner is labeled good and defects if the partner is bad. Otherwise, the player becomes labeled bad. Since any future status label depends on the player’s and partner’s current and past status, this is a higher-order summary statistic of past play that is crucial for this class of solutions. If the players are patient enough and status labels are transferred correctly, a cooperative equilibrium can be attained if defectors are punished a finite number of times.5

4 Contagious equilibrium (Kandori 1992) for a two-sided incentive problem like the prisoner’s dilemma expects all players who have experienced a defection or defected themselves to play the non-cooperative move in all future rounds (trigger strategy). This equilibrium concept is fragile to implementation error and trembles, since a single defection will trigger a contagious wave of defec- tions that finally causes a permanent end to all cooperation. Xie and Lee (2012) extend Kandori (1992) to the one-sided incentive problem of the binary trust game. One conclusion from their analysis is that trustors require a high outside option (P1) compared to the cooperative payoff (R1) to not deviate 1 from a contagious strategy: P1 ≥ (1 − n )R1 where n is the number of trustors or pairs of players. Since in our experiment P1 = 0, we cannot expect the contagious strategy to constitute a sequential equilibrium when players hold consistent beliefs as defined in Kreps and Wilson (1982). 5 First-order information is a record of the current partner’s past play and corresponds to cases of online feedback and credit histories that we study here. Second-order information is a record of the past play of the current partner’s past partner. Until recently, there was a consensus among theorists that cooperation cannot be sustained with only first-order information. The main obstacle without higher-order information is that players cannot distinguish cheaters from punishers because the two are behaviorally indistinguishable. Several contributions in the literature on repeated games with random matching are based on first-order information. Rosenthal (1979) studied prisoner’s dilemma games where players observe their partner’s last move. Rosenthal and Landau (1979) discuss the case where reputation is a summary statistic over the entire history of the partner’s play. Institutions like the law merchant (Milgrom et al. 1990) or credit bureaus (Klein 1992) are applications of community enforcement in the domain of one-sided incentive problems. Recently, Takahashi (2010) was able to demonstrate with belief-free equilibria that a community of patient, forward-looking players can sustain cooperation as a social norm even if they only have access to first-order information. Whether belief-free equilibria construction is robust enough to be actually played awaits empirical corrobora- tion. Related Literature 53

3.2.3 Indirect Reciprocity In a similar vein, reputation mechanisms have been widely discussed in evolution- ary biology where this mechanism is known as one variant of indirect reciprocity. Alexander (1987) distinguished two types of indirect reciprocity based on the di- rection of a helpful act: (1) “A helps B, B helps C, C helps A” or (2) “A helps B; C, observing, later helps A”. Boyd and Richerson (1989) called these two variants upstream and downstream tit-for-tat. In the case of upstream reciprocity the mere fact that B receives help from A is motivation enough for B to help C in turn. This forms a chain of help that eventually but not necessarily loops back to A.6 Although there is some experimental evidence for one-way flows of benefits in small groups (Greiner and Levati 2005), upstream reciprocity is hard to explain with narrow self-interest, especially as groups become large. Compared with the upstream variant, the evolution of cooperation via down- stream reciprocity is much less restricted by group size. According to Boyd and Richerson (1989, 1988) it is evolutionarily stable under similar conditions when direct reciprocity is stable. However, while upstream reciprocity requires play- ers only to know what happens to them, downstream reciprocity implies an as- sessment of eligibility by direct observation or based on reputation. Nowak and Sigmund (1998b) simulated a repeated helping game – nowadays also called the ‘indirect reciprocity game’ – with random rematching where donors can decide to help recipients with an altruistic act at a cost of c. If the donor decides to help, the recipient receives a benefit of b (with b > c). Otherwise, both get a payoff of zero. Agents are randomly assigned to the role of either the donor or receiver. Additionally, they introduce the concept of a public ‘image score’ that increases by one if the donor helps and decreases by one if the donor decides not to help. It turns out that for cooperation to evolve there must be a minimal number of discriminators (threshold strategies) that cooperate, conditional on the image score of their interaction partners. The strategy DISC observes n outcomes out of 0 < n ≤ p of all past interactions p where the partner was in the donor role. It counts defections in n and cooperates if the partner defected at most i times. Nowak and Sigmund (1998a) have provided analytical results for the simple binary case of image scoring in which players only see their partners’ last move (n = 1) and where DISC only cooperates if the last partner’s move was coopera-

6 Upstream reciprocity is very similar to what Trivers (1971) called generalized reciprocity as a wider form of reciprocal altruism. Here we follow Sahlins (1972, 191f) and Alexander (1987, 81f) and distinguish indirect from generalized reciprocity, where the generalized variant is defined as a one-way flow of benefits in which the expectation of returns is vague to nonexistent. Upstream reciprocity is also known as serial, misguided or pay-it-forward reciprocity. The two variants of indirect reciprocity are often discussed interchangeably, but in fact they are very different and also do not share the same neurobiological foundation (cf. Watabe et al. 2014). 54 Reciprocity in Reputation Systems tive (i = 0). In this case, indirect reciprocity can only be stable if the probability q of knowing the image score of another player exceeds the cost-benefit ratio of the helpful act (q > c/b).7 Let us assume interactions between DISC and a selfish strategy ALLD (i = −1, defects always). If DISC meets itself it receives b − c. It gets exploited at a rate 1 − q and receives (1 − q)(−c) if the opponent turns out to be ALLD. ALLD receives a zero payoff by meeting one of its own kind and (1 − q)b if it can exploit DISC. DISC outperforms ALLD if b − c > (1 − q)b. If we solve for q, we get Nowak and Sigmund’s inequality: q > c/b. In binary image scoring DISC is only an evolutionarily stable strategy against ALLD. However, Panchanathan and Boyd (2003) show that if we introduce an unconditional cooperator (ALLC, i = n), binary scoring is unstable under ran- dom drift and noisy conditions. It can survive for some time in the population but is bound for extinction in the long run. In reaction to this, attention has shifted toward models based on higher-order information. For example, Leimar and Hammerstein (2001) demonstrated in silico that a higher-order information ‘standing strategy’ is able to outperform the basic discriminating strategy. How- ever, as soon as one departs from the restrictions of the initial binary model, the situation is much less severe. Berger (2011) shows that the binary case is a very special case. He proposed a ‘tolerant scoring strategy’ that samples two obser- vations from the partner’s history (n = 2, first-order). It cooperates if and only if the opponent helped at least once in these two actions. Tolerant scoring is evolutionarily stable under sufficiently low rates of implementation error. It is a strict and locally attracting Nash equilibrium that can be learned from arbitrary initial conditions. On top of that, Berger and Grüne (2014) show in the general case of multi-valued scoring (n ≥ 2, 0 ≤ i ≤ n − 1) that mixtures of neighboring ‘i-discriminator’ strategies can be robust and evolutionarily stable in a substan- tial region of the cost/benefit-error plane. And for mixtures, chances for stable cooperation by indirect reciprocity are increasing by n, i.e., by the number of observations that can be sampled from a partner’s history. Based on the simple model of Nowak and Sigmund, we can conclude that in a world where everybody knows everybody else (q → 1), benefits need only to be insignificantly larger than cost to allow for sustained cooperation. We conclude from Berger and Grüne that it is easier to establish cooperation with longer track records. In large social systems of anonymous traders neither q nor n will be

7 It is noteworthy that several key mechanisms for the evolution of cooperation share a similar and simple dependence stemming from the cost-benefit ratio: kin selection based on the coefficient of genetic relatedness r (Hamilton’s Rule), direct reciprocity based on the probability w of future encoun- ters, and indirect reciprocity based on the probability q of knowing the partner’s reputation from indi- rect reciprocity. They all need to exceed the cost-benefit ration to sustain cooperation: r, w, q > c/b. See Nowak (2006b) for a detailed discussion. Related Literature 55 sufficiently large. However, reputation systems ensure that image scores are pub- lic knowledge and accumulate large numbers of observations from past behavior (n >> 2). Not all transactions are followed by ratings and not all traders have built up a reputation (q < 1). But it is unlikely that traders remain unrated for a long time, therefore q will be close to the rate of entry into the system. In reputation systems the problem is not receiving enough but receiving reliable information. Since transaction outcomes are privately monitored and voluntarily reported, im- age scores can be biased and distorted. DISC strategies know that they do not know. If the partner’s score is unknown, they cooperate. In our situation, how- ever, players do not know exactly what they know. This is a form of perception error that typically renders analytical solutions intractable (cf. Panchanathan and Boyd 2003, 116). Implementation errors  (trembles) in the helping game are intended but failed donations and both parties are aware of that. With perception errors δ, donors believe to have donated while recipients perceive otherwise. In our setting, we expect to have a systematic bias where players are perceived to be more trustworthy than they actually are. This should primarily affect the effi- ciency of discriminating strategies with, likely but not necessarily, the same effect as simple implementation error. See Brandt and Sigmund (2006) and Fudenberg and Maskin (1990) for a discussion on noisy conditions. Altogether, decreasing cost c and increasing information levels foster the evo- lution of indirect reciprocity, i.e., cooperation is easier to establish if c/b → 0, q → 1, n → ∞,  → 0, and δ → 0. For our study we apply the converse argument. Two different reputation systems share a fixed cost-benefit ratio c/b and identical n at the maximum n = p because image scores are summary statis- tics over p. If we introduce a source of error in one of them, ceteris paribus, we expect to observe lower rates of trust there. Similarly, if two reputation systems are different in terms of the feedback submission rate, they will have different q and n. The system that elicits more feedback will, ceteris paribus, generate higher levels of trust. A pessimistic results comes from a simulation study of Suzuki and Kimura (2013), at least at first sight. Not only errors in the transmission of reputation are a threat for indirect reciprocity but cost of establishing and storing it too. If donors, recipients, or independent observers have to incur even a tiny cost, this gives rise to another second-order free-rider problem that renders all cooperation via indirect reciprocity as impossible. In other words, they predict the feedback submission rate to be zero. However, even if costs of submitting ratings do matter, reputation systems have lowered these costs to a level where it is ultimately an empirical question if traders are able to overcome this second-order problem. Despite the theoretical problems in defining the exact amount, order, and cost of information required for sustained cooperation, we have reason to believe that 56 Reciprocity in Reputation Systems indirect reciprocity can have a competitive advantage over other types of reward- ing and sanctioning mechanisms for large populations (e.g. Fehr and Gächter 2002 for a discussion of punishment). Punishment, for example, incurs costs for both the punisher and the punished, whereas an indirect reciprocity mechanism saves punishment costs and disciplines by withholding help or avoiding interac- tions. Rockenbach and Milinski (2006) hypothesize that costly punishment may even become extinct in environments where spreading reputation information is cheaper than direct punishment. There is some experimental evidence that reputation-based indirect reci- procity fosters cooperation in the helping game paradigm. Wedekind and Milin- ski (2000) demonstrate that cooperation can arise if donors know the number of times a receiver has given help. Donors tend to be more generous to receivers who have been generous in previous rounds. Seinen and Schram (2006) vary the cost-benefit ratio and find, as expected, higher donation rates for lower cost. Ad- ditionally, donors make their choices contingent on the receiver’s previous history of donations. They apply a limited memory approach by giving donors access to the last 6 interactions. However, in their setup players interacted repeatedly in small groups of 7 players with simple random rematching. So in principle, their results can also be explained by repeated game incentives. Bolton et al. (2005) experimentally tested a helping game with a binary image score, varying the order of information and the cost-benefit ratio. The higher the cost of the helpful act, the lower is the rate of giving. Second-order information results in higher rates of giving than first-order information, which is, in turn, better than no information. Engelmann and Fischbacher (2009) apply an experiential design that lets them decompose different motifs of indirect reciprocity. They replicate the standard helping game over 80 rounds: the first 40 rounds with a public image score, and the second 40 rounds with a private one. They find a non-trivial amount of pure pay-it-forward indirect reciprocity for the case of private image scores. However, there are strong effects for strategic reputation building. The average helping rate with public image scores is more than twice that of senders with a private score. More experimental evidence similar in logic will be discussed in the context of reputation systems in section 3.2.5.

3.2.4 Strong Reciprocity It is easy to see that the helping game of Nowak and Sigmund (1998a,b) strongly resembles a one-sided version of the investment game of Berg et al. (1995). The crucial difference between the two is obviously that the recipient in the latter case has the option to directly reciprocate. If one would assume, like Charness et al. (2011, 363), a norm among recipients to never return anything reciprocally, Related Literature 57 the predictions from the indirect reciprocity literature would apply in a straight- forward manner to the trust game too. Of course, with reference to the ‘norm of reciprocity’ by Gouldner (1960), we expect the opposite where a reciprocal back transfer is the rule rather than the exception. Introducing an option to return gives rise to potential interactions of indirect and direct reciprocity. We concur with Charness and colleagues that introducing elements of direct reciprocity into games with indirect reciprocity can be a sensible robustness test for both mecha- nisms. Suppose a group of players faces a series sequential trust games in which players are rematched and randomly assigned the roles of first and second mover after every stage of the game. On top of that, we introduce to all players a public and error-free image score of their history of play. The first mover can make the transfer contingent on the partner’s image score and avoid players with a history of no or poor reciprocation. Additionally, and this is different from the basic indirect reciprocity setup, the second mover can make the back transfer dependent on the partner’s image score too. If a first mover is in bad standing, the second mover can hold back or diminish the reciprocal response. On the other hand, a second mover might have a lower propensity to abuse trust of a first mover with an unblemished record. Indirect reciprocity may reinforce or decrease direct reciprocal effects. On the other hand, effects from indirect reciprocity can possibly be overruled by direct reciprocity. The literature on direct reciprocity reveals considerable differences regarding what direct reciprocity is supposed to be. We will not discuss this here in detail. In the following, we will adhere to a simple behavioral definition where direct reciprocity is defined as conditional behavior and is a propensity to respond in kind to beneficial or hostile acts. Direct reciprocity is a dyadic concept and re- quires a sequential game to allow for conditional behavior. Therefore, positively reciprocal behavior is conditional kindness and different from unconditional kind- ness motivated by pure altruism (cf. Diekmann 2004).8 In a sequential one-shot game, a promise of paying back or a threat to retaliate against a first mover’s hos- tile act is not credible. Non-repeated direct reciprocity cannot be explained with models of narrow self-interest. Will a player bear the cost of punishing the coun-

8 With a behavioral definition of direct reciprocity we do not yet make a statement on underlying motivational factors. In the literature, we can identify several approaches to explain the driving force behind reciprocity. Outcome-based models (Fehr and Schmidt 1999, Bolton and Ockenfels 2000) fo- cus on absolute or relative payoff differences and on distributional aspects, and mirror a preference for fairness and balancing. Intention-based models (e.g., Rabin 1993, Falk and Fischbacher 2006) stress the role of the interaction partner’s intentions. Still others allude to the role of positive and negative emotions (Frank 1988) as motivational foundations of reciprocity. Some suggest with arguments from evolutionary psychology that behavioral propensities to reward and retaliate are even “hard-wired” by neurobiological mechanisms into human nature (cf. Hoffman et al. 1998, Fehr and Gächter 2002). 58 Reciprocity in Reputation Systems terpart if the two never meet again? Homo economics will not punish, but homo reciprocans is expected to do so – primarily by assumption but also due to the growing body of empirical evidence. After Fehr et al. (2002, 4), we can define a strong reciprocator as an individual who is “willing to sacrifice resources for rewarding fair and punishing unfair behavior” even if this is costly and provides neither present nor future material rewards for the reciprocator (cf. Gintis 2000, Bowles and Gintis 2004, and Guala 2012 for a survey). Direct reciprocity is the essential ingredient of our experimental design and is supposed to be at work at different locations within the sequence of play of a single stage. First we let subjects play a one-shot trust game. Given that trust has been placed by the trustor, honoring trust can be understood as a positive direct reciprocal response. After the trust game, we let subjects play a ‘rating game’ that has a similar structure to a sequential prisoner’s dilemma. In the one-sided rating game, the buyer can reward or punish the seller’s choices with positive or negative ratings. Submitting feedback can be understood as a reciprocal act toward balancing potential inequalities from the trust game. Finally, in the case of the bilateral rating game, there is yet another option to reciprocate by giving back or retaliate against a partner’s rating. Since feedback is costly on the side of the sender but only symbolic or in- directly effective toward the receiver, we can hypothesize based on pure self- regarding preferences that no feedback will be provided. Furthermore, without any mechanism stabilizing the trust game, trust will be abused and therefore not given in the first place. However, if strong reciprocators with other-regarding preferences are among our subjects, even a small direct reciprocal incentive to submit feedback can bootstrap the process of indirect reciprocity. This is a second way to conceptualize direct reciprocity as a “starting mechanism” for cooperation (Gouldner 1960, 176). On the other hand, options for direct reciprocity abused strategically can also result in a distorted and biased view of the histories of play, even to an extent that it may restrain cooperation by indirect reciprocity. One related example of an adverse effect of chained direct reciprocity is counter-punishment in public goods games. Punishment opportunities have been shown to increase cooperation in social dilemma situations. However, Nikiforakis (2008) adds a counter-punishment level to the basic setup of Fehr and Gächter (2002). Strategic considerations and a desire to reciprocate punishments results in a breakdown of cooperation where players receive even lower payoffs compared to a public goods game without punishment. The threat of revenge weakens the willingness to punish free-riders. The very same argument is brought forward for the case of bilateral rating systems where researchers expect lower (higher) rates of negative (positive) ratings due to counter-rating opportunities. Related Literature 59

There is now a considerable amount of experimental and empirical evidence for positive and negative reciprocity even in one-shot interactions (cf. Berg et al. 1995, Fehr and Gächter 2002) such that they have been considered opposing sides of the same preference for fairness. This symmetric view of reciprocity has re- cently been challenged by experimental (Yamagashi et al. 2012) and empirical evidence from surveys (Egloff et al. 2013) that suggests that propensities for neg- ative and positive reciprocity are uncorrelated. Already Fehr and Gächter (2000, 159) have indicated that “there also seems to be an emerging consensus that the propensity to punish harmful behavior is stronger than the propensity to reward friendly behavior.” This asymmetry in reciprocity (Offerman 2002, Keysar et al. 2008, Al-Ubaydli and Lee 2009) seems to be embedded in a more general finding of a positive-negative asymmetry (Baumeister et al. 2001). We will transfer this asymmetry to the case of indirect reciprocity by splitting the image score into positive and negative components and discussing the differential effectiveness of positive and negative signals.

3.2.5 Reputation Systems A reputation system is a feedback mechanism that transforms privately observed transaction outcomes into public knowledge. See Jøsang et al. (2007) for the var- ious technical implementations. From a game-theoretic point of view, subjects who play in an anonymous, one-shot environment with a system that dissemi- nates complete information should face an identical situation to a repeated play environment. It should not matter where the information is coming from. This information hypothesis was tested in a seminal paper by Bolton et al. (2004). Bolton and colleagues let subjects play 30 rounds of a binary-choice trust game under three conditions: first, a stranger condition where subjects get randomly re- matched after each round, and second, a partner condition where a pair of subjects stays together for all 30 rounds. They show, as Andreoni and Miller (1993) and Duffy and Ochs (2009) did with prisoner’s dilemma games, that a partner market outperforms the trust levels of the stranger market by a wide margin. The third market with a feedback mechanism, where buyers are perfectly informed about their sellers’ previous play, falls in between the stranger and the partner market. Therefore, partners and feedback treatments do differ and the main hypothesis on the irrelevance of the information source is rejected.9

9 This finding seems to be in line with the claim of Granovetter (1985, 490) that people have greater confidence in information from first-hand dealings or from a “trusted informant”. However, Bolton et al. (2004) discuss several other potential explanations. In the feedback market, the risky move of a trusting buyer generates a positive informational externality for other buyers but not for herself. In the partner market, on the other hand, profits from the taken risk primarily accrue toward the focal buyer. 60 Reciprocity in Reputation Systems

Several other contributions shed light on information density, i.e., how com- plete and error-free a trading history needs to be. Keser (2003) confronted sub- jects with a repeatedly played investment game and varied the length of the his- tory of play. A complete history results in higher transfers compared to a treat- ment with only the seller’s last move. Bracht and Feltovich (2009) corroborate this finding but underline that even the observation of the last move yields sub- stantial improvements compared to a complete stranger market. The findings of Bohnet et al. (2005) vary if buyers, sellers, or both have access to the seller’s trading histories. Letting sellers observe other sellers (imitation effect) yield no improvement over the stranger treatment. But the two-sided market transparency is more beneficial with regard to trust and trustworthiness compared to a setting where only buyers observe seller reputations. Typically, reputation information is such that a trustor learns about the trustee’s past behavior. Charness et al. (2011) turn this around and give the buyer information about what the seller did as a buyer. They show that a ‘history of placing trust’ can be as effective as a ‘history of return’ and that the two are behaviorally indistinguishable in their study. One key distinction between reward and sanctioning institutions on the one hand, and reputation systems on the other hand, is that the former are based on direct benefits and costs for involved parties, whereas in the latter case incen- tives for the target are generated indirectly. A negative rating of a buyer does not incur direct costs for the seller. It is the ostracism by future buyers that dis- ciplines the seller. Punishment will be executed by the next buyer downstream. Audience effects and punishment without any direct pecuniary consequences in- crease contributions in public goods games (Masclet et al. 2003), if the symbolic punishment is costly for the sender (Carpenter et al. 2004). Likewise, Lumeau et al. (2013) let subjects interact in repeatedly played investment games without feedback, with costly public feedback, and costly private feedback that was only visible for the matched receiver. Both public and private ratings improve cooper- ation but private ratings are less efficient in enhancing trust and trustworthiness.10 A systematic variation of the feedback cost was conducted by Xie and Lee (2012). In a binary-choice trust game setting, buyers’ feedback costs are varied from zero cost to 10% and 20% of a cooperative stage game payoff. In the case of free rat- ings, 99% of the subjects provided feedback. As expected, with increasing cost

Bolton and colleagues call this the ‘informational dilemma’ of reputation systems, which is distinct from the second-order dilemma of feedback provision discussed below. See Bolton and Ockenfels (2011) for a detailed discussion of information externalities in reputation building. 10 This setup extends the findings of an image scoring experiment of Engelmann and Fischbacher (2009) by adding direct reciprocity on top of indirect reciprocity. The qualitative conclusion in the two cases is the same. We see cooperation in private and public scores, although the latter is more effective. Related Literature 61 the feedback rate dropped to 33% and 11%, respectively. However, neither the purchasing nor the shipping decisions are significantly determined by rating cost. The majority of the studies cited above test effects from a perfect albeit in- complete reputation system where a computer reports trading histories to players without any kind of reporting bias. For example, the implementation of feedback in the study by Gazzale and Khopkar (2011) let buyers publicize their seller’s play by a costly decision to reveal or a costless decision not to reveal. The au- thors argue with reference to the findings of Chen and Hogg (2005) and Li and Xiao (2014) that strategic misreporting and untruthful feedback are rare. In these studies, less than 5% of the reports do not reflect the true transaction outcome. We take this as null hypothesis and test this by modeling the feedback process endogenously. We are aware of three studies with endogenous self-reported feedback. The latter two studies share many central design features with ours. They have been administered at about the same time as ours. We were obviously unaware of each other. Masclet and Pénard (2012) continue the seminal work by Keser (2003). Keser implemented a one-sided rating mechanism at the end of an investment game where player A has the possibility to rate player B’s back transfer with a positive (+), negative (−), or neutral (0) rating. There was no option to abstain from feedback. Players stayed in their roles for 20 rounds and both were shown the complete list of past ratings. Masclet and Pénard (2012) generalized this by adding two-sided feedback mechanisms after the investment game. A pure investment game without a feed- back stage serves as a baseline. Feedback was costly and positive (+), negative (−), or none (¬). Players make their rating decisions either simultaneously (blind) or in a sequential fashion. The latter case was modeled as a two-stage process where players could either submit a rating in the first or, after observing their partners move, in the second stage. In a third treatment the sequential order was imposed randomly by the computer. This allows the authors to study the poten- tial strategic nature of reciprocal feedback in great detail. However, they impose the two-stage logic onto the players and eventually produce strategic effects by their design choice. The authors find that introducing a rating system tends to in- creases transfers and back-transfers. Simultaneous and randomly ordered sequen- tial feedback provides significantly better outcomes than the baseline. However, bilateral ratings without constraining the order of feedback are not significantly better than the baseline. The feedback turnout is rather low in their study. In all treatments, 70-80% of subjects abstain and use the rating system primarily as a punishment device. Interestingly, they find that most negative ratings are indeed assigned in the second feedback phase, whereas positive ratings are generally given in the first stage. This is largely consistent with strategic feedback provi- 62 Reciprocity in Reputation Systems sion where subjects try to elicit a reciprocal answer in positive feedback but avoid retaliation in negative feedback. Bolton et al. (2013) compare a two-sided feedback mechanism called ‘con- ventional’ and known from eBay with two alternatives: a blind variant and a rating system that mixes the two-sided version with a detailed seller rating. In the detailed rating buyers are asked in addition to the conventional feedback to rate the quality of the transaction on a Likert scale. This setup comes close to eBay’s new feedback system introduced in 2008. All three feedback implementa- tions followed an auction game with seller moral hazard, repeatedly played with random rematching for 60 rounds. Bolton and colleagues find that blind and de- tailed version outperform the conventional version in terms of market efficiency. However, only the detailed seller feedback was significantly better than the con- ventional setup. The authors explain these differences by the fact that the blind and detailed versions restrict distortions from direct reciprocity. Before we proceed to our experimental design, we have to mention one criti- cal albeit intended omission in our setup. Most experimental studies that vary the density of reputation information keep the matching protocol and the interaction structure exogenous. Even though buyers have means to avoid sellers, they cannot actively choose them. In our design, there is no freedom of choice for buyers and, to that effect, no competition among sellers. However, in most real-life applica- tions of reputation systems – like eBay – buyers do have this option. To really appreciate the power of reputation-based solutions, one has to understand what competition and assortative partner choice add to the institution. In a replication of their 2004 study, Bolton et al. (2008, 163) offered buyers the opportunity to choose between two sellers. They conclude: “With competition, the performance of the strangers networks common to Internet markets is similar to the partners networks [..].” In a similar vein, Huck et al. (2012, 205) state: “[...] the effects of reputation on its own are limited. Only if some minimal feedback information – enabling the mechanics of reputation building – is coupled with competition we observe that the market-crippling moral hazard problem is fully overcome.” The pivotal role of competition for systems of indirect reciprocity has been introduced into the biology literature by Sylwester and Roberts (2010, 2013). There, partner choice and competition are thought to be two distinct mechanisms. And in fact, Huck and colleagues find that competition fosters trust through two channels: it creates stronger incentives for trustees who compete to be selected and selection by trustors itself. For reasons of traceability, we do not take seller competition into account. But we keep in mind how Schumpeter ([1943] 1976, 86; citation from Hörner 2002) would comment on this omission: “[...] even if correct in logic as well as in fact, it is like Hamlet without the Danish prince.” Experimental Design 63 3.3 Experimental Design

3.3.1 Games and Treatments The experimental design consists of four different treatments. All treatments start with a binary-choice trust game that remains unchanged in all treatments (cf. Kreps 1990, Camerer and Weigelt 1988, Snijders and Keren 2001, Bolton et al. 2004). In the stranger treatment, only the trust game is played, whereas three other treatments implement a feedback mechanism. The trust game equals a one-sided version of the prisoner’s dilemma (cf. Fig 3.1). The first mover (A) corresponds to the trustor or buyer who has to decide whether to place trust by entering into a transaction. If the buyer does not place trust, both players get a zero payoff and the stage game is over. The second mover (B) plays the trustee or seller and must decide if she is willing to honor or to abuse trust. If the seller honors trust, both parties gain from mutual trade. If trust is abused, the seller receives a temptation payoff while the buyer loses money. Generally, we try to replicate the game and incentive structure of Bolton et al. (2004). The payoff structure of Bolton et al. was T = 70, R = 50, P = 35, S = 0. We changed the payoffs with several notable consequences. Ours will be T = 40, R = 20, P = 0, S = −10. First, we rescale the outside option P to be a zero payoff to maintain a status quo if buyers do not enter the game. The gains from trade in the Bolton et al. setup are g = (50 − 35)/35 or about 40%. We choose the extreme case of no gains from any trade to induce a high number of transactions. If subjects wanted to earn money on top of the show-up fee they were forced to enter a risky situation. Second, we implement a negative payoff S = −10 for the situation of abused trust while keeping the buyer’s risk and seller’s temptation as close as possible to those of Bolton and colleagues. To avoid bankruptcy in the presence of negative payoffs the show-up fee of CHF 10 = 500 (ECU) was transferred to the account of each player as an initial endowment at the start of the first round.11 The experimental intervention consists in varying the structure of the feed- back stage. The basic structure is modeled with a rating game that closely re- sembles a prisoner’s dilemma. The second stage is only reached if the buyer enters the trust game in the first stage. The blind version of the rating game is displayed in Figure 3.1 (right panel). In this case, both players simultaneously

11 In BKO, the buyer’s risk is r = (P − S )/(R − S ) ≈ 0.29 and the seller’s temptation is t = (T −R)/(T −S ) ≈ 0.28 and, therefore, almost symmetric. Our choice of r ≈ 0.33 and t = 0.4 provides a stronger temptation for the seller to abuse trust. Accordingly, a cost-benefit ratio of c/b = 20/40 = 0.5 does not match our payoff structure perfectly, i.e., S = −10 should actually be −20 to be compatible with the research on indirect reciprocity. For a detailed discussion on the effects of risk and temptation see Snijders and Keren (2001). 64 Reciprocity in Reputation Systems

Stage I: Binary-Choice Trust Game Stage II: Simultaneous Rating Game

Ship R1,R2 ... B p C,p C B + 1 − 2 − Buy A − n1 C,p2 C Not ship S ,T ... − − − 1 2 ¬ + C,p2 Not buy P1, P2 − p C, n C A + 1 − − 2 −

Player A, B: Buyer, Seller ... n1 C, n2 C − − − − − − ¬ Payoffs: T2 = 40 > R1,2 = 20 > P1,2 = 0 > C, n2 S 1 = −10 ¬ − − p , C Ratings: + (positive), − (negative), ¬ (none) + 1 − Rating Cost: C = 1 b n , C − − 1 − Reputation Effect: Endogenous future payoffs ¬ , from extra positive (p) or negative (n) 0 0 feedback, with p1 = p2, n1 = n2, and |p| < |n|.

Figure 3.1: Extensive Forms of Trust Game and Simultaneous Rating Game rate without knowing the choice of the opponent, depicted with the information set for player B. In the sequential version of the game, the information set is re- moved and the second mover knows in which subgame it is necessary to make the choice. However, we do not enforce the sequence. The role of player A is al- located endogenously on a first-come basis. In the one-sided version of the rating game only buyers can submit ratings, i.e., they just play the first subgame of rat- ing stage. Finally, the stranger treatment is played without feedback intervention. The four treatments are called stranger (ST), asymmetric (AS), simultaneous (SI), and sequential (SE). Players can submit feedback by assigning a positive (+) or a negative (−) reputation point or abstain (¬) from rating. Submitting feedback costs the giver C = 1 ECU. This corresponded to a cost of 5% of a cooperative period outcome and reflects the opportunity cost of submitting feedback. Players can abstain from rating at no cost. Nor do ratings induce direct costs or benefits on the side of the rated players. However, we hypothesize based on existing evidence that positive and negative ratings exert indirect incentives on the players. A money- maximizing player prefers positive feedback over no feedback, which is, in turn, Experimental Design 65 still better than receiving a negative rating. Additionally, players prefer not to provide feedback to save cost C. This is what makes the rating game a social dilemma. Due to turn taking, we expect that benefits from positive ratings (p1 = p2) and losses from negative ratings (n1 = n2) will be equal for both players. However, we do not expect the magnitude for positive and negative reputation effects to be the same (|p| < |n|). The ratings are recorded in the player’s feedback profile as two cumulative sums of positive and negative ratings. Splitting feedback into positive and neg- ative components will allow us to check if the reputation system is used as a re- warding or punishing device. Additionally, we try to test if negative feedback has stronger effects than the positive type. The player’s own and the partner’s feed- back scores are displayed together with the player’s current balance in a round indicator in a static box at the right side of the computer screen. This information box is updated at the start of every new period. Note that it is easy to recover the ‘image score’ (cf. Nowak and Sigmund 1998b) by subtracting negative from positive ratings. The number of rounds where the partner has not received a rating can be computed as a difference from the period indicator. Since the seller has access to the buyer’s reputation profile as well, this double-sided transparency allows the seller to make his or her choices contingent on the buyer’s score as well. Altogether, reputation information is similarly displayed as in Wedekind and Milinski (2000) or Masclet and Pénard (2012). Feedback submission time is limited with a hard-close, meaning that players are given 10 seconds to submit their feedback. If players do not submit feedback in time, the stage defaults on abstention. We measure the submission time of positive and negative ratings with a time resolution of 1 second. Players have also the option to abstain early by pressing a button ‘silence’. About 50% of all abstentions are from participants who preferred to wait for the timeout. As we will see, these 50% are not evenly distributed across treatments. In treatments AS and SI the rating decisions are revealed after ten seconds of consideration to the interaction partner. The sequential treatment is different. The rating decision of a player will be transmitted instantaneously within the interval of 10 seconds to the partner, allowing the latter to reciprocate. This opens up various types of rating strategies. For example, players can submit an early positive rating to elicit a reciprocal response, a proactive strategic motive. On the other hand, players can decide to delay their rating to rule out retaliation, a reactive reciprocal motive. The order is not exogenously constrained. Players choose for themselves whether they rate and who rates first.12 The implementation is different from the two-stage

12 In game theory, timing is often assumed to be irrelevant starting with von Neumann and Morgen- stern (1944, 51f) and their distinction between anteriority (first in time) and preliminarity (priority in 66 Reciprocity in Reputation Systems

Table 3.1: Experimental Design

Treatments Stranger Asymmetric Simultaneous Sequential Total Sessions 3 3 3 4 13 Participants 48 48 48 64 208 Interactions 720 720 720 960 3120 Notes: The table lists the number of sessions, participants, and interactions by treatment. In each session 16 subjects played 30 rounds with alternating partners in either the buyer or the seller role (15 times each). This results in 720 interactions for each treatment, except for the sequential treatment, where 4 sessions have been conducted. setups of Masclet and Pénard (2012) or Bolton et al. (2013). Since the two other studies share the basic idea of endogenous feedback, endogenous timing is the unique feature of our study. Our implementation will not allow us to disentangle the full spectrum of the players’ rating motivations. However, we do not induce strategic considerations with an obvious two-stage procedure and will be able to observe the ‘natural’ rating sequence. Additionally, we can measure potential effects of direct reciprocity with duration models similarly applied in field studies conducted on eBay (Dellarocas and Wood 2008, Diekmann et al. 2014).

3.3.2 Experimental Protocol

The experiment consists of 13 sessions with matching groups of 16 participants in each session. We administered a between-subject design where the first three treatments have three sessions each; the sequential treatment is over-sampled with a fourth session. See Table 3.1 for an overview. We focused on excluding implicit repeated interactions and kept the matching groups as large as possible. Masclet and Pénard (2012) and Bolton et al. (2013) chose groups of only 8 players to be rematched randomly after every round, playing for a total of 20 and 60 rounds, respectively. We think their group sizes are too small. In their setting, two players meet several times, which does not exclude repeated game incentives. However, as a consequence of the group size, we only have 3-4 independent observations per treatment and 13 observations in total. information). Knowing what a counterpart did implies that he or she moved first. On the other hand, if no information is or will be available, a rational player would not care when the opponent makes the move. In our setup, however, reaction time itself determines who has the move and can inform players about their opponent’s preferences. Additionally, it enlightens the analysis of how players respond to the problem of strategic timing even if outcomes do not change dramatically. See Camerer (1997, 176) for a detailed discussion. Generally, we believe that timing has not received enough attention in the research on direct reciprocity. Experimental Design 67

Each session had 30 interaction rounds. This was common knowledge. A fixed market end introduces endgame behavior and incentives for strategic repu- tation building. Toward the end of the 30 rounds, the value of a reputation gets smaller and, to that effect, subjects should be reluctant to invest in their own or their trading partners’ reputation. Although a fixed ending compromises external validity, a stochastic endgame rule introduces unobservable endgame behavior and risk-over-stopping that we prefer to avoid. Subjects where randomly seated and introduced to the environment with two test rounds. Subsequent periods proceeded under identical rules. The test rounds will be excluded from the analysis. At the beginning of each period, participants were assigned anew to the role of player A (buyer) or player B (seller) in an alternating fashion. We know from the literature that role switching tends to lower rates of trust and trustworthiness (cf. Burks et al. 2003). Taking turns in playing the seller and buyer is a requirement, otherwise buyers have no incentive to invest in reputation and therefore will exhibit little interest in submitting feedback. This would make a comparison with the asymmetric treatment impossible. Players are matched under a stranger matching protocol, that is, the compo- sition of the groups changes in every round and no two subjects play more than once together in the same role. Subjects were only informed that they would be randomly matched with a new player in every round. In fact, each pair of players interacted twice but in different roles. Subjects’ decisions have been elicited with the direct-response method. Although seller choices cannot be observed if buyers do not enter the game, we prefer to avoid the strategy method in order to not rule out effects from emotional responses after having been betrayed. See Brandts and Charness (2011) for a discussion of the mixed results on the equivalence of direct-response and strategy methods. Experiments were conducted in spring 2007 at the Department of Social Sci- ences of the University of Berne. The computer room was equipped with movable partition panels to ensure participants’ privacy. Sixteen subjects were recruited for each of the 13 experimental sessions. We spread leaflets on the campus with a link to a website where students could register for one of the scheduled sessions. The subjects were undergraduate and graduate students from a variety of fields. About one third were recruited from the social sciences, 20 percent were students from economics and management, 40 percent came from the humanities, and an- other 10 percent from other faculties. Sixty percent of the participants were male, and the average age was 23.5 (sd = 3.67). About 80 percent of the participants indicated that they had never participated in behavioral experiments before. No individual participated more than once. On average, a session lasted 45 min- utes, including instructions and subject payments. Cash was the only incentive 68 Reciprocity in Reputation Systems to participate. Average player remuneration was CHF 18.- (sd = 2.1, min=10.5, max=23.2). Payment was made in private at the end of every session. At the beginning of each session, subjects were given written instructions. The experimenter read out loud the instructions to the participants to create pub- lic common knowledge. After that, subjects were given another 5 minutes to read the instructions silently. They could indicate whether they had any questions about the instruction, and the experimenter would answer them privately. The ex- periment was programmed and conducted with the software z-Tree (Fischbacher 2007). Detailed game instructions and selected screen shots are reproduced in the appendix.

3.3.3 Hypotheses Besides the main hypotheses below, we can derive several propositions that fol- low directly form the game-theoretic analysis. Absent other-regarding prefer- ences, if subjects can perform backwards induction and believe others can do it as well, then, in the unique subgame perfect Nash equilibrium of any treatment, both sellers and buyers never provide any feedback, sellers never ship, and buy- ers never buy (Harsanyi and Selten 1988). Equilibrium play implies no trade at all. However, as soon as we assume some heterogeneity in preferences in the sense of a reciprocally motivated player type, then the situation may change sub- stantially. Reciprocators are willing to bear the cost of submitting feedback and honor trust if placed. We learned from the literature on repeated games that in- complete information about the opponent’s preferences will introduce reputation incentives such that even a selfish seller has incentives to mimic a trustworthy seller. In repeated games with constant partners, reputations are not informative due to mimicry. In repeated games with rematching, however, the only way a buyer can learn the partner’s type is to rely on third-party information. Cooper- ation in one-shot games with history dissemination requires discriminators, i.e., players who make their choices contingent on the partner’s reputation score. In finite-length game sequences we can expect declining levels of placing and honoring trust (endgame effects) because reputation incentives vanish toward the end of the game. The same holds true for the feedback stage, where we expect a declining submission rate in the course of the 30 rounds. In this paper we are interested in the following set of hypotheses:

Hypothesis 1: Treatment effects on trust and trustworthiness a) Pr(buy | ST) < Pr(buy | SE) < Pr(buy | SI)  Pr(buy | AS) b) Pr(ship | ST) < Pr(ship | SE) < Pr(ship | SI)  Pr(ship | AS) Experimental Design 69

Treatments with a feedback stage (AS, SI, SE) exhibit higher rates of placing (1a) and honoring trust (1b) compared to the control treatment without feedback (ST). We hypothesize that direct reciprocity on the feedback level will distort ratings. It decreases negative ratings and increases positive ones in a way that makes it harder for buyers to discriminate sellers. Diminishing the role of direct reciprocity should result in higher trust rates. We expect the sequential treatment (SE) to perform better than the stranger treatment (ST) but to fall short compared with AS and SI. In part, these hypotheses replicate findings from Masclet and Pé- nard (2012) and Bolton et al. (2013). Unlike their studies, we add the asymmetric treatment and assume equivalence between AS and SI. Since both exclude di- rect reciprocity and strategic ratings, the informational value of the two feedback systems should be identical.

Hypothesis 2: Reputation effects on trust, trustworthiness, and feedback submis- sion rate

In the rating game, there are three different outcomes of a rating r ∈ {+, −, ¬}. Each outcome is present or not, e.g., r+ ∈ {0, 1}. There are two separate reputation scores R+ and R− that sum up positive and negative ratings over the periods p = + 1,..., 30. The positive score Rip is defined as:  0 for p = 1,   p−1 R+ =  X (3.1) ip  r+ for p > 1.  i  i=1 The negative reputation score is formed accordingly by summing up over negative ratings. Occasionally, we will also refer to the image score which is simply the difference between the two: I = R+ − R−. Reputation effects will be tested with logistic regression models by introducing the two scores as inde- pendent variables. We assume simple linear additive effects on the log-odds, i.e., Pr(y = 1|x) = G(β0+xβ) where y is observed trust (buy) or trustworthiness (ship). β0 is an intercept term and G is the logistic function G(z) = exp(z)/[1 + exp(z)] taking values strictly between zero and one. The vector x is our set of explanatory variables and β the vector of estimates. In our case, z = β+R+ + β−R− + ..., where β+ is the effect of positive reputation and β− the effect of negative reputation. We hypothesize a positive effect for positive reputation and a negative effect for negative reputation, both for placing and honoring trust:

2a) β+ > 0 and β− < 0 for placing trust and 2b) β+ > 0 and β− < 0 for honoring trust 70 Reciprocity in Reputation Systems

There is existing evidence for reputation effects of positive and negative scores from experiments (Masclet and Pénard 2012, Li and Xiao 2014) and from several field studies (e.g. Diekmann et al. 2014). We replicate these findings and hypothesize that the number of positive (negative) partner ratings not only in- creases (decreases) the probability of placing trust (2a) but also affects the prob- ability of honoring trust (2b). The latter case is typically unobserved in field studies. Besides reputation effects on buying and shipping choices, the reputation scores can also affect the rating decisions. A buyer might be reluctant to submit a negative rating to the seller if the seller has an unblemished track record or accu- mulated many positive ratings. A high number of negative ratings could, on the other hand, start off a stoning process. We hypothesize that indirect reciprocity repeats itself on the feedback stage, giving rise to a cumulative effect similar to the Matthew Effect (Merton 1968), i.e., a player’s probability to submit a rating r increases if the partner has already many of this type, while many ratings of the other type exhibit a decelerating effect. We call it ‘cumulative’ because in this case indirect reciprocity breeds additional indirect reciprocity. The focal decision maker’s own reputation scores are added too. Again, cooperative behavior in the trust game is expected to repeat itself in the rating game. Players with a high positive (negative) score are expected to rate more positively (negatively).13 We estimate time-invariant logistic regression models for the probability of submit- ting positive and negative ratings with the same logic as above.

+ − 2c) β1 > 0 and β < 0 for positive ratings 2d) β+ < 0 and β− > 0 for negative ratings

Hypothesis 3: Asymmetry in positive and negative indirect reciprocity 3) |β+| < |β−|

Previous field studies with data from eBay have found stronger effects of neg- ative ratings than positive ratings on outcomes (Ba and Pavlou 2002, Lucking- Reiley et al. 2007, Diekmann et al. 2014). Authors typically attribute this to the salience of negative feedback and to its low rate of occurrence. We replicate this finding under experimental conditions. By calibrating the incentives, nega- tive outcomes can be much more prevalent in the laboratory setting. There, they will carry less information and are less salient. The asymmetry is also expected

13 Note that there are several competing theoretical arguments for the dependence of rating behavior on partner reputation scores. Players motivated by indirect reciprocity, for example, will give to those who already have while pure altruists would give to those who do not have. A strictly selfish and strategic player would primarily try to optimize his or her own score. See Dellarocas et al. (2004) for a theoretical overview, Diekmann et al. (2014) for an application, and Nowak and Sigmund (1998b) for a simulation study and a class of strategies that optimizes a player’s own score. Experimental Design 71 based on broad findings from psychology. In an extensive review of why “bad is stronger than good”, Baumeister et al. (2001) mention several related stud- ies, especially findings on negativity bias in impression formation, greater impact and power of negative feedback in generic feedback processes, and possibly also a relation to the asymmetric value function in prospect theory (Kahneman and Tversky 1979). To support the hypothesis, the absolute value of the effect of the negative score (|β2|) needs to be higher than the effect for the positive score (|β1|), where β1 and β2 are again simply the estimates of logistic regression models from above. And again, we can test the hypothesis on the level of the trust and rating game.

Hypothesis 4: Direct reciprocity effect on the feedback submission rate in the SE treatment + + + 4a) Pr(ri |r j ) > Pr(ri ) − − − 4b) Pr(ri |r j ) > Pr(ri )

After having observed the outcome of the trust game, players have a baseline + − propensity to submit a rating Pr(ri ) and Pr(ri ) depending on the transaction out- come and other factors. For players with only self-regarding preferences these two probabilities equal zero, while for reciprocal types these the two probabili- ties can be positive.14 According to hypothesis 4a/b, the player’s probability of submitting a rating is higher after receiving a rating from the transaction partner. This is only observable in the SE treatment. If a seller posts feedback at time ts, this action can affect the buyer’s subsequent conditional probability to rate, given that she has not submitted a rating before ts. The buyer’s independent probability of rating at time tb corresponds simply to a hazard rate λ(t) that a rating will occur 0 in an interval t < tb < t . In discrete time, our hazard rate for player i to submit a rating at Ti is defined as

λit = Pr[Ti = t|Ti ≥ t, xit] (3.2) The choice of a discrete-time duration model comes from the fact that we ob- serve ratings grouped in intervals of one second from t = 0,..., 10. However, we assume an underlying continuous process best captured by a Cox proportional- hazards model and apply the corresponding complementary log-log variant in

14 If we analyze the rating stage separately, reciprocal types would only rate as second movers. ± Their unconditional rating propensity Pr(ri ) would equal the one of homo economicus and only true altruists would rate unconditionally. In the combined examination of the trust and rating game, feedback is primarily a response to the outcome of the trust game. Therefore, observing feedback only means rejecting the hypothesis of narrow self-interest. 72 Reciprocity in Reputation Systems discrete time for grouped survival data (Prentice and Gloeckler 1978, Allison 1982).

0 λit = 1 − exp[− exp(αt + β xit] (3.3) Conditioning a player’s behavior on a previous move of the partner is modeled by a time-varying covariate (tvc). This means a dummy variable in xit changes value at the partner’s submission time and exerts a multiplicative effect on the focal player’s hazard rate (Willett and Singer 1995). Since we have positive and negative ratings as outcomes, we will apply the competing risks framework by censoring the observation in case a rating of a different type occurs. Here we are primarily interested in homomorphic direct reciprocity where a service is reciprocated by exactly the same service, i.e., a positive for a positive and a negative for a negative rating.15 From direct reciprocity of positive feed- back on the individual level follows an increase of the group-level feedback rate. However, we cannot deduce the same aggregation for negative feedback. Even if there is direct reciprocity in negative feedback, rational players might want to remain silent and reduce their unconditional propensity to rate as a first mover to avoid retaliation. With our between-subject design we do not observe uncondi- tional rating propensities since participants only played the blind or the sequential treatment. Therefore, one key hypothesis from the literature, that buyers avoid submitting negative feedback as first movers, cannot be addressed directly. We can expect direct reciprocity to increase the overall positive submission rate in the SE treatment (4c). If we observe a substantially decreased submission rate for negative feedback, this would give at least some support to the hypothesis of Dellarocas and Wood (2008), called the “sound of silence”, that negative ratings get suppressed in a bilateral rating system (4d). Hypotheses 4c and 4d are for- mulated as differences in hazard rates. Due to complications from the competing risks approach, we will test 4c and 4d by comparing cumulative incidence curves.

+ + + 4c) λSE > λSI  λAS − − − 4d) λSE < λSI  λAS

Hypothesis 5: Feedback arrival times ˜+ ˜+ ˜+ 5a) tSE < tSI  tAS ˜− ˜− ˜− 5b) tSE > tSI  tAS

15 We will neglect heteromorphic cases due to a lack of sufficient observations. It is reasonable to assume in these rare cases that hypotheses 4a/b would simply flip signs, i.e., the probability of a positive rating by player i after having received a negative rating by player j should be lower than or + − + + + equal to his unconditional rating propensity, i.e., Pr(ri |r j ) ≤ Pr(ri ) < Pr(ri |r j ). Experimental Results 73

In empirical studies negative ratings typically exhibit longer median arrival times t˜ than positive ratings (cf. Dellarocas and Wood 2008, Diekmann et al. 2014). A transaction failure needs time and the manifestation of positive and negative transaction outcomes differ in the real world. In the lab this endogenous waiting time does not vary. We assume arrival times of feedback to be a function of cognitive processing time and the time needed to strategically expedite or delay a rating. There exists mixed empirical evidence for effects of stimulus-valence on choice latency. In any case, our time resolution is not sufficient to test potential hypotheses (cf. Baumeister et al. 2001, 340f.) However, in comparing the bilat- eral treatments, the sequential treatment invites players to choose the rating time strategically. On the one hand, players want to delay the submission of negative ratings as long as possible to avoid retaliation (sniping, last-minute feedback). On the other hand, an early positive rating could elicit a positive reciprocal response. Therefore, we hypothesize that the median arrival time of positive ratings (t˜+) is lower in the SE treatment compared to the simultaneous treatment, whereas neg- ative ratings take more time in SE compared to the SI treatment. Again, we do not expect any differences between SI and AS treatments where strategic timing is ruled out.

3.4 Experimental Results

This section is organized as follows. First, we discuss how subjects place and honor trust. We start with non-parametric tests of the treatment effects and con- tinue with a regression analysis of the determinants of the buyer’s trust and seller’s trustworthiness. We will discuss period and reputation effects and the asymmetry of positive and negative feedback. In a second part, we analyze the determinants of the rating game with logistic regression models and study feedback timing by comparing median arrival times. We conclude with an analysis of the direct reciprocity effect in the sequential treatment based on duration models.

3.4.1 Placing and Honoring Trust Table 3.2 shows percentages of placed and honored trust together with exchange efficiency in the trust game. The buyer’s decision to place trust is almost unaf- fected by the treatments. Only the sequential treatment falls short and shows at 70.2% an even lower level than the stranger treatment. Under the conservative assumption that each session is a unit of observation, a Mann-Whitney rank-sum test comparing means between treatments reveals no significant differences be- tween any pair of treatments. Surprising and inconsistent with previous findings 74 Reciprocity in Reputation Systems

Placing Trust (Buy) Stranger Asymmetric Simultaneous Sequential Comparison .9 .9 .9 .9 .9 .7 .7 .7 .7 .7 Buy Buy Buy Buy Buy .5 .5 .5 .5 .5 .3 .3 .3 .3 .3 0 10 20 30 0 10 20 30 0 10 20 30 0 10 20 30 0 10 20 30 Period Period Period Period Period

Honoring Trust (Ship) Stranger Asymmetric Simultaneous Sequential Comparison .9 .9 .9 .9 .9 .7 .7 .7 .7 .7 .5 .5 .5 .5 .5 Ship Ship Ship Ship Ship .3 .3 .3 .3 .3 .1 .1 .1 .1 .1 0 10 20 30 0 10 20 30 0 10 20 30 0 10 20 30 0 10 20 30 Period Period Period Period Period

Figure 3.2: Average Levels of Placing and Honoring Trust by Treatment from the literature is the level of buying in the stranger treatment. In 77% of all interactions buyers place trust although no reputation system is in place. In one- shot interactions we would expect a fast decay and trust to completely deteriorate in the course of play. We can see in the upper row of Figure 3.2 that the stranger treatment exhibits the flattest decay of all other treatments. All four treatments accompany each other rather closely. The asymmetric treatment exhibits the high- est average of placed trust, followed by the simultaneous treatment. As expected, buyers from the sequential treatment place the least trust compared with the two other feedback treatments. We conclude from the lack of significant effects on the buyer side that our reputation systems help little to explain how buyers solve their problem of trust. We expected that in a one-sided incentive problem buyers are mainly reacting by withholding trust. As we will see, our reputation systems primarily impact the seller side by disciplining trustworthiness.16

16 The zero payoff for the buyer’s outside option could be one potential explanation for the lack of treatment effects on buyer side and, at the same time, for the high levels of placing trust in the stranger treatment. Buyers cannot earn money on top of the show-up fee if they do not enter into transactions. We rescaled the payoffs to replicate buyers’ risk of Bolton et al. (2004) as closely as possible. Results from our stranger treatment diverge heavily from previous findings. In Bolton and colleagues’ stranger treatment, placing trust completely deteriorates and falls from an initial level of 65% down to 5% with an average level of placed trust of about 35% over 30 rounds. In the study of Charness et al. (2011) the stranger treatment falls from 35% down below 20%. In our case, however, placing trust never falls below 50% and is on average 40% higher over the entire 30 rounds compared to Bolton et al. Table 3.2 indicates that two sessions from the stranger treatment show the highest session averages in placing trust. Furthermore, we can see atypical endgame behavior in the stranger treatment where the buyers’ rate of trust increases in the last round. Experimental Results 75

Table 3.2: Frequencies of Decisions by Session (%)

Treatment Session Buying Shipping Efficiency Buyer Seller Order Rating Rating ST 2 80.8 42.8 34.6 - - ST 5 67.1 44.7 30.0 - - ST 7 83.3 50.0 41.7 - - 77.1 45.9 35.4 - - AS 1 80.4 79.3 63.8 59.1 - AS 4 74.2 70.2 52.1 59.0 - AS 10 78.8 79.4 62.5 63.0 - 77.8 76.4 59.4 60.4 - SI 11 77.9 72.2 56.3 62.0 21.4 SI 12 73.3 77.3 56.7 56.3 36.4 SI 13 79.2 83.7 66.3 63.7 35.8 76.8 77.8 59.7 60.8 31.1 SE 3 71.7 59.3 42.5 83.1 68.6 SE 6 74.6 71.5 53.3 76.0 67.6 SE 8 79.2 71.1 56.3 68.4 56.8 SE 9 55.4 54.9 30.4 69.2 51.1 70.2 65.0 45.6 74.3 61.6 Notes: The table lists the raw percentages of buying, shipping, efficiency, and ratings by session. Session level averages form the basis for non-parametric test of treatment effects. Bold values indicate average values by treatment.

At the seller’s side, we report on the average number of transactions with ship- ment conditioned on buy orders. Here, we find another surprising fact about the stranger treatment. Sellers ship only in 45% of all the cases where trust is placed, opening up a large discrepancy between buyer and seller behavior (cf. second row in Figure 3.2). In the other treatments, buyers adapt faster to the seller’s level of honoring trust. In comparing the average levels of shipment in table 3.2, a Mann-Whitney test indicates significant differences between the stranger and the three feedback treatments (z > 1.96 with p < 0.05) and also between SI and SE (z = 2.121 with p < 0.05). However the difference between AS and SE is not significant on the session level (z = 1.414 with p = 0.157).

The next important dimension is exchange efficiency which is defined as the percentage of successful trades, i.e., where the trustor buys and the trustee ships. Treatments AS and SI show the highest efficiency with about 60% of successful trades, followed by the sequential treatment with a lag of 15%. In the stranger treatment only 35% of all potential trades complete successfully. Efficiency is mainly driven by the seller side. Therefore, non-parametric tests for differences 76 Reciprocity in Reputation Systems

Table 3.3: Determinants of Placing and Honoring Trust

Placing Trust (Buy) Honoring Trust (Ship)

HA (1) (2) (3) (4) (5) (6) Sequential Ref. Ref. Ref. Ref. Ref. Ref.

Simultaneous + .066∗ .066∗ .024 .136∗∗∗ .136∗∗∗ .085∗∗∗ (.034) (.034) (.037) (.042) (.042) (.043) Asymmetric + .076∗∗ .076∗∗ .057 .127∗∗∗ .128∗∗∗ .085∗ (.033) (.033) (.041) (.041) (.041) (.051) Stranger - .069 -.167∗∗∗ (.046) (.062)

Partner’s Pos. Rat. + .025∗∗∗ .016∗∗∗ (.005) (.005) Partner’s Neg. Rat. - -.055∗∗∗ -.025∗∗ (.008) (.010)

Own Pos. Ratings + .008 .006 (.006) (.008) Own Neg. Ratings - .008 -.041∗∗∗ (.009) (.003)

Period - -.014∗∗∗ -.016∗∗∗ -.012∗∗∗ -.014∗∗∗ -.016∗∗∗ -.010∗∗∗ (.001) (0.001) (.002) (.001) (.001) (.003) N 3120 2400 2400 2342 1787 1787 Subjects 208 160 160 208 160 160 Notes: The table shows average marginal effects (dy/dx) of logistic regression coefficient estimates for placing (buy) and honoring (ship) trust. Models 1 and 4 show treatment effects (including a period effect) compared to the sequential treatment. Models 2 and 5 repeat the same models for treatments with a feedback mechanism. Models 3 and 6 incorporate positive and negative ratings of the partner and the focal decision maker as control variables. Robust standard errors clustered on the individual level in parentheses. ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01 (two-tailed tests). in efficiency reveal the same pattern as in shipping.17 Combining the results from the session level, we find support for hypothesis 1b (shipping) but cannot back up hypothesis 1a (buying). The differences in efficiency indicate, after all, that reputation systems improve the situation compared to interaction among complete strangers. Table 3.3 shows average marginal effects from logistic regression models of the individual decisions to buy and ship. Models 1 and 4 replicate treatment effects including a period effect. In these models, standard errors are clustered

17 Although AS is virtually identical in average efficiency to SI, it does not reach a significant dif- ference in honoring trust compared to the SE treatment. This is due to session 4 that falls below the mean of one of the SE sessions. Note that the non-parametric tests are only based on averages of the 13 session level observations and are therefore not very robust. Experimental Results 77 on the decision maker. Here, we loosen the assumption of correlated errors at the session level. All period effects are negative indicating falling rates of trust and trustworthiness over the course of the 30 periods. For every additional pe- riod, the probability of placing and honoring trust falls by 1.4%. Surprisingly, we do not find sharp endgame effects but a continuous decay that starts at an early stage. Taking the sequential treatment as the reference category, Model 1 shows that only the asymmetric treatment exhibits more trust (+7.6%) that is signifi- cant at conventional levels (p < 0.05). Model 4 captures the shipment decision of the seller. All treatment effects are highly significant and show the expected order as in hypothesis 1b. SI and AS enforce higher trustworthiness (+13.6% and +12.7%) while the stranger treatment falls below the sequential treatment (-16.7%). Models 2 and 5 replicate the treatment effects only for sessions with reputation mechanism. They show qualitatively identical results. In Models 3 and 6, we introduce the partners’ positive and negative reputation scores and the scores of the focal decision maker. As hypothesized, positive reputation increases the probability of placing trust while negative reputation is accompanied by lower levels of placed trust. For every additional positive point, the probability increases by 2.5% whereas it decreases by 5.5% for an additional negative point. This confirms hypothesis 2a. Also note that the treatment effects in model 3 are completely absorbed by the reputation effects. In our setup, sellers can make their shipment decision contingent on the partner’s reputation scores too. Model 6 shows that even though sellers do not face buyer moral hazard, they react to reputation information nevertheless. The probability of shipment increases by 1.6% for every additional positive point and decreases by 2.5% for each piece of negative feedback received. This confirms hypothesis 2b. Also apparent from Models 3 and 6 is the fact that the absolute values of the positive and negative reputation effects are different in size. An F-Test for the linear restriction assuming the same size of the two effects is rejected (p = 0.003, two-sided). However, that the effect for negative reputation is higher than (or about twice the size as) the effect for positive reputation cannot be rejected, for both the buyer and seller sides. This is in line with hypothesis 3 and demonstrates the asymmetry of positive and negative responses for the case of indirect reci- procity. In the appendix (section 3.6.1) we will discuss the functional form of reputation effects in greater detail. In Table 3.3, results are pooled over treatments with standard errors clustered on the individual level.18 As soon as we cluster on the session levels, signifi-

18 As indicated in section 3.3.2, we only have 3-4 completely independent observations per treat- ment. We have three nested levels: decisions, individuals, and sessions. The conventional statistical approach is to cluster standard errors on the highest level where errors are supposed to be correlated. Since we cannot exclude potential behavioral cascades in a single session, the most conservative ap- 78 Reciprocity in Reputation Systems cant treatment effects on the buyer side are lost. However, the treatment effects for the seller side and reputation effects for both sides stay highly significant. In the appendix, Table 3.8 reports on models estimated separately by treatment with fixed effects regressions. Reputation effects differ in size between treatments with effect sizes in AS > SI > SE. This is to a large extent expected. In the asymmet- ric treatment, the maximal range of reputation scores is only 50% compared to the bilateral treatments. In addition, as we will show below, the sequential treat- ment generates more feedback compared to the SI treatment. The more feedback information a system generates, the less important a single reputation point is. In Models 3 and 6 we also estimate effects from the decision maker’s own scores. There is little to report here. We only find a significant negative effect in the shipping decision for those sellers who have accumulated a negative rep- utation. Upon closer inspection, this effect turns out to be only present in the SE treatment and not robust under fixed effects. Surprisingly, with fixed effects it even flips signs (cf. Model 8 in Table 3.8). Altogether, we find little to no evidence that the binary trust game is determined by the decision maker’s own score.

3.4.2 Submitting Feedback In this section, we report on rating behavior only for the three treatments imple- menting a feedback mechanism. If buyers do not place trust, no feedback can be submitted. Therefore, we restrict our analysis to transactions in which buyers en- tered the game. We will start again with non-parametric tests of treatment effects, followed by a regression analysis of the individual rating behavior. The raw feed- back frequencies in Table 3.2 from the previous section permit a basic comparison among the three treatments. In this table we do not yet distinguish between pos- itive and negative feedback. A non-parametric Mann-Whitney test between the average submission rates of the 8 sessions in the AS and SI treatments reveals no significant differences in buyer rating behavior (z = 0.218, p = 0.83). Both treatments show with about 60% of submitted ratings a virtually identical turnout. In the four sessions of the SE treatment buyers rate significantly more, with an proach would be to cluster standard errors on the session level. For non-parametric tests we always compare mean values or medians from the session level. For the determinants of individual behav- ior in Tables 3.3, 3.5, and 3.6 we keep, however, clustering at the individual level and discuss cases where we get qualitatively different results from the case where errors are correlated within sessions. According to Cameron et al. (2008), clustered standard errors require cluster counts of at least 50. Clustering could lead to over-rejection and be too conservative in our situation. For all regression models, we conducted robustness checks with their ‘wild cluster bootstrap procedures’ that always return qualitatively similar results as simple clustering on the session level. See Cameron et al. (2008) for details on wild cluster bootstraps. Experimental Results 79 overall frequency of 74% (z = 2.121, p = 0.034). Also for the seller side the SE treatment elicits a higher turnout. In the SI treatment sellers submit feedback in 31% of their transactions whereas sellers in SE treatment provide, at 62%, significantly more (z = 2.121, p = 0.034). Our overall feedback frequencies nicely replicate previous findings from other studies. Bolton et al. (2013) report feedback rates of 80% for buyers and 60% for sellers in a sequential treatment that is very similar to ours. In the blind treatment (similar to our SI), they report 68% and 34% for buyers and sellers, respectively. Their feedback prices are tenfold lower than ours. This suggests that participants are not very price sensitive anymore if prices are close to zero. Masclet and Pénard (2012) have chosen higher feedback prices with the result that 75-80% of their subjects abstained from submitting feedback. Note also that our overall frequencies come close to numbers observed on eBay’s bilateral system. According to Diekmann et al. (2014), in the German cellular phone market about 70% of buyers and sellers submit feedback. Bolton et al. (2013) compare different eBay country markets with provision rates between 70 and 80%. Although we can recover the overall proportions of feedback provision, we are far from replicating the typical distribution of positive and negative ratings. On eBay, feedback is almost completely positive in nature (above 95%). In our experiment, however, subjects use it as both a rewarding and a punishing device. Table 3.4 shows descriptive statistics on feedback values and feedback arrival times for the three treatments. In the second column (all), we see the distribu- tion of positive and negative ratings submitted by the buyer. Even though the SE treatment has the highest frequencies of positive and negative feedback, non- parametric tests indicate that the pairwise differences do not reach a significant level. On the seller side (column 5), we find weakly significant increases for pos- itive feedback (z = 1.768, p = 0.077) but a non-significant increase for negative feedback (p = 0.288). Columns 3-4 and 6-7 show feedback values conditioned on the seller’s shipment decision. Given that the seller did not make a shipment, in the asymmetric treatment 81.1% of all transactions are followed by negative feedback. We find insignificantly different negative responses to non-shipment in SI and SE. But it is obvious that buyers react much more strongly in all treatments to non-shipment with negative feedback than they would with positive feedback after shipment. The vast majority of abstentions results after the seller ships. Note also that there are insignificant differences in the level of strategic or inconsistent ratings. In the asymmetric treatment, 10.7% of transactions receive a negative rat- ing even though the seller shipped. There are also rare cases where buyers submit a positive rating after sellers did not honor trust. All in all, inconsistent feedback does not significantly differ between the treatments. Nevertheless, endogenous rating produces an error rate of about 10%. This is higher than the prediction of 80 Reciprocity in Reputation Systems 2.6 (2) 3.9 (4) 5.0 (5) 3.9 (3) 2.8 (2) 4.2 (4) 2.8 (2) 3.3 (2.5) 5.7 (6) 7.1 (8) 3.7 (3) 5.3 (5) 2.7 (3) 2.0 (2) 2.9 (2) 4.3 (3) 4.6 (4) 4.1 (4) 4.3 (4) 3.4 (3) 3.9 (3) 6.8 (8)3.9 (3) 7.04.3 (8) (4) 4.4 (4) 5.2 (4.5) 4.5 (5)2.8 (3) 4.53.4 (4) (3) 3.4 (2.5) 4.3 (3.5) 3.9 (3)4.8 (4)3.9 (3) 3.9 (3) 7.2 (9) 3.8 (3) overall yes no overall yes no 7.3 (10) 7.3 (9.5) 7.3 (10) 7.7 (10) 8.2 (10) 7.0 (8) 6.5 8.1 40.7 47.0 19.5 53.9 14.5 7.0 23.7 11.2 2.3 1.6 16.6 3.4 37.8 81.1 87.0 84.3 Table 3.4: Feedback Frequency and Average Arrival Times Feedback Frequency (%) Average (Median) Arrival Time in Seconds 42.5 45.3 56.4 Buyer Seller Buyer Seller The table lists raw feedback frequencies (%) by rating value and treatment on the left and average (median) arrival times of feedback Note: on the right. Examplea of rating. proportions (%) In inratings 36.5% SE are the treatment: positive buyer In and submittedthat 25.7% 10.7% a the of are seller negative the negative. made rating interactions a Ifarrives and where shipment, the on in the a seller average buyer positive 27.8% after does buyer placed a 7.2 rating not trust, seconds positive arrives ship, the (median: on rating. 84.3% buyer average 9). of after did Conditional Numbers 3.9 buyers’ not on in seconds ratings submit seller bold (median: is indicate shipment, 3). negative. the 56.4% A Example most negative of of common buyer the rating and timing after buyer relevant in shipment cases. SE: Given Negative ratingPositive rating 27.3Simultaneous (SI) No 33.0 rating 10.7 Negative ratingPositive rating 25.1Sequential (SE) 39.2No 35.6 rating 7.4 47.2Positive rating 11.4 68.9 25.7 37.8 73.5 32.9 52.8 12.3 38.4 34.9 44.9 Seller shippedAsymmetric (AS) No overall rating yes no 39.6 overall yes 46.7 16.7 no Negative rating 36.5 10.7 Experimental Results 81

Gazzale and Khopkar (2011) and also exceeds typical implementation error rates in simulation models of indirect reciprocity. Table 3.5 shows average marginal effects from logistic regression models of the individual decisions to submit positive or negative ratings. Timing is not yet considered here. Models 1, 3, 5, and 7 replicate treatment effects including a pe- riod effect with the sequential treatment as the reference category. We do not find any treatment effects for buyers’ positive ratings. Sellers, on the other hand, show a stark increased propensity to submit positive ratings in the SE treatment. This treatment effect even remains stable after inclusion of additional control variables. For positive ratings, hypothesis 4c is only supported for the seller side. All other treatment effects get almost completely absorbed by additional control variables. For negative feedback we find treatment effects that contradict hypothesis 4d. The probability of negative ratings is not lower but always higher in the sequen- tial treatment. But again, a note of caution is appropriate. These treatment effects are not stable for inclusion of additional controls; nor do they remain significant at conventional levels after raising the clustering of standard errors to the session level. We will come back to hypotheses 4c and 4d below. Period effects have a negative sign for positive feedback. This is in line with the expectation that subjects cease to provide feedback as soon as reputations are established and the end of the super game is approaching. Contrariwise, period effects are positive for negative feedback, indicating that punishment is increasing over the course of the 30 periods. Figure 3.5 in the appendix shows trends of total, positive and negative rating behavior. Declining frequencies for positive ratings and an increasing profile for negative ratings is present in all treatments, both for the buyer and the seller sides. The most relevant and robust determinant of filing feedback is obviously the transaction outcome, i.e., if the seller ships or not. It exerts a positive effect on positive ratings and a negative effect on negative ratings. If we reverse the coding for the shipment variable, we can interpret it as a form of punishment that with- holds a reward. Less consistent findings come from the “cumulative” reputation effects, i.e., the inclusion of the decision maker’s own and the partner’s reputa- tion scores. The pattern of expected signs of the reputation effect is summarized in column HA of Table 3.5. For the decision maker’s own reputation scores we find 6 out of 8 effects to have the correct sign with 4 effects being significant. However, results from partner scores are very mixed. Only sellers submit pos- itive ratings at higher rates for buyers in good standing. And only buyers have a lower probability of submitting negative ratings when facing sellers with good reputations. Three of eight effects show the correct sign and only 2 of these are significant according to the hypotheses. Additional models estimated separately for each treatment, including fixed effects specifications not reported here, reveal 82 Reciprocity in Reputation Systems ∗ ∗∗∗ -.033 -.004 -.002 -.001 .021 -.242 ∗∗ ects (including ∗∗∗ ff ∗ ∗∗ ∗∗∗ ∗∗∗ .002 .007 -.023 -.096 -.016 -.375 -.013 ∗∗ ∗∗∗ cient estimates for submitting positive ∗∗∗ ffi -.103 .014 -.121 ∗∗ ∗∗ ∗∗∗ ∗∗∗ .014 .008 .002 -.004 .019 -.011 .345 -.205 ∗∗∗ ∗∗∗ -.209 dx) of logistic regression coe / ∗ ∗ ∗ ∗ ∗∗∗ .008 .005 -.013 -.004 -.007 (.058) (.050) (.019) (.027) (.007) (.007) (.005) (.005) (.013) (.012) (.007) (.007) (.010) (.010) (.006) (.007) (.017) (.017) (.010) (.012) -.033 Positive Feedback Negative Feedback ects (dy ff indicates the direction of the hypothesis for positive feedback in Models 1-4, while Buyer Seller Buyer Seller ∗∗∗ A H 01 (two-tailed tests). (1) (2) (3) (4) (5) (6) (7) (8) . 0 (.055) (.053) (.053) (.054) (.042) (.037) (.043) (.042) (.055) (.054) (.042) (.042) (.001) (.004) (.002) (.006) (.001) (.003) (.002) (.004) < p - .618 - -.005 .015 - .018 - -.016 -.088 - -.039 -.092 - -.011 A / / / /+ /+ ∗∗∗ H +/ +/ +/ Table 3.5: Determinants of Positive and Negative Feedback 05, . 0 < p ∗∗ ect). Models 2, 4, 6, and 8 incorporate shipping, the positive and negative feedback scores of the decision ff 10, . The table shows average marginal e 0 < p SequentialSimultaneous - Ref. Ref. Ref. Ref. Ref. Ref.N Ref. Ref. 1787 1787 1227 1227 1787 1787 1227 1227 Asymmetric - Ship Part.’s Pos. Ratings Part.’s Neg. Ratings - Own Pos. Ratings Own Neg. Ratings - Period - Note: feedback (M1-4) and negativethe feedback other (M5-8), four respectively. models Models area for 1, the period 2, seller e 5, decisions respectively. andmaker, Models 6 and 1, capture 3, the 5, buyer partner and submission scoreslevel. 7 behavior; as The show only first control treatment sign variables. e in Robust the standard column errors are in parentheses, clustered on the individual the second sign∗ is for negative ratings. Experimental Results 83 substantial heterogeneity between treatments and even significant effects incon- sistent with the proposed pattern. Therefore, we refrain from making any bold statements regarding hypotheses 2c and 2d. Similar to the findings on the level of the trust game, we have to conclude that rating behavior is not systematically determined by reputation scores.

3.4.3 Feedback Timing

The right side of Table 3.4 shows average and median arrival times of feedback by treatment. First we discuss overall timing patterns in the asymmetric and si- multaneous treatments where no incentives for strategic timing exist. In the AS treatment, it takes on average 4.3 seconds for positive feedback and 3.9 seconds for negative feedback to arrive. Ratings arrive significantly faster in the SI treat- ments, with an average of 3.4 seconds for positive ratings and 2.8 seconds for negative ones (p < 0.01, session and individual level).19 We do not have an ex- planation for these differences. Timing in the one-sided and blind setup should be identical for buyers. In the AS and SI treatments negative feedback tends to arrive earlier than positive feedback, for both buyers and sellers. But the differences are small and statistically insignificant, and also the median arrival times tend to be the same. In the SI treatment sellers submit somewhat faster than buyers, for both positive and negative feedback. However, this is also expected since sellers know the outcome of the transaction – their own decisions – earlier than buyers and require less time for consideration. The interesting findings come from a comparison with the sequential treat- ment. The SE treatment allows for strategic feedback submission and the elic- itation or avoidance of reciprocal responses. First we discuss the arrival times of positive ratings. According to hypothesis 5a, we expect accelerated arrival times because at least a fraction of the players might prefer to rate early in the hope of eliciting a reciprocal response. However, the hypothesis that subjects set ‘strategic reciprocity traps’ is clearly rejected. Positive feedback in the SE treat- ment arrives not earlier but insignificantly later compared to the SI treatment. For

19 We will test differences in arrival times (duration) with the very same non-parametric tests from above. Since arrival times in the submission interval of 10 seconds tend to be u-shaped and no severe censoring problem exists, we simply rely on comparing arithmetic means and medians. Treatment effects have also been tested with Cox regressions with qualitatively similar results. However, in our case the proportionality assumption is violated in substantive terms. A high hazard rate indicates that an event is likelier and simultaneously arrives earlier. As we will see below, ratings in the SE treatment are likelier but arrive later due to strategic considerations. 84 Reciprocity in Reputation Systems

Buyers' positive ratings Buyers' negative ratings .15 .12 .1 .1 .08 .06 Hazard Rate Hazard Rate .05 .04 .02 0 0 2 4 6 8 10 0 2 4 6 8 10 Time to Feedback Time to Feedback

Sellers' positive ratings Sellers' negative ratings .08 .08 .06 .06 .04 .04 Hazard Rate Hazard Rate .02 .02 0 0 0 2 4 6 8 10 0 2 4 6 8 10 Time to Feedback Time to Feedback

Asymmetric Simultaneous Sequential

Figure 3.3: Smoothed Hazard Rates of Positive and Negative Feedback sellers in the SE treatment, there is even weak evidence that they postpone their ratings and prefer to be second movers.20 For negative feedback we can confirm hypothesis 5b for buyers and sellers. Figure 3.3 shows smoothed hazard rates of arrival times for positive and negative feedback. It is easy to see that the functional form of the hazard rates for positive feedback does not differ substantially between treatments. However, the hazard rate of negative feedback in the SE treatment exhibits a very different shape, with high probability mass late in the process. This indicates that buyers and sell- ers try to delay negative ratings to escape retaliation. A test on the differences in arrival times for negative feedback is only significant on the individual level. Nevertheless, this finding corroborates evidence from Diekmann et al. (2014) on last-minute feedback. Diekmann and colleagues found an increase in the buyer’s hazard rate of negative feedback in the last few days. They suspected this to be a

20 As soon as we let buyers and sellers determine the rating sequence themselves, they restore the turn taking logic well known from eBay and other e-commerce platforms. Typically buyers pay first, then sellers ship, followed by the buyer and finally the seller rating (cf. Diekmann et al. 2014, Table 3 or Helbing et al. 2005 for another example of turn taking). Experimental Results 85 sign of a buyer strategy to avoid retaliation. This finding is now replicated under laboratory conditions in greater clarity. In hypotheses 4c and 4d, we claim a systematic order of the hazard rates + − where λSE is expected to be located above and λSE below the other rates. This seems not to be the case after inspecting the upper left panel of Figure 3.3. Even though the depicted single-transition hazards for positive ratings from buyers are correct, they do not reflect the competing risks of negative ratings and abstention. If we compare cumulative incidence rates predicted from competing risk models, we get back the findings expected from the raw frequencies in Table 3.4. These complications are discussed in Appendix 3.6.4 in greater detail and are rooted in differential abstentions. Differences in feedback abstention timing are highly significant even if we take median arrival times of sessions as the unit of compar- ison. Subjects have the option to abstain early or to wait for the timeout. In the SE treatment the median time of ‘no rating’ equals or is very close to the time- out of 10 seconds. This demonstrates that the majority of the participants have understood the strategic timing problem in the SE treatment. They prefer to wait and see what their partners do even though the result of the partner move will be displayed on the next screen and made transparent in one’s own reputation score. Another piece of evidence for strategic timing comes from inconsistent feed- back. In the left part of Table 3.4 we can identify cases where rating values do not correspond to the transaction outcome. In the SE treatment, for example, buyers submit negative ratings in 10.7% of the transactions even if the seller shipped and honored trust. This is obviously a hostile act to harm the opponent’s reputa- tion. On the other hand, only few pieces of positive feedback are submitted by buyers after sellers abuse their trust. And few sellers try, accordingly, to appease their buyers with a positive rating after having cheated on them. As noted above, these inconsistent cases appear in all three feedback treatments at a rate not sig- nificantly different from each other. But in the SE treatment the vast majority of inconsistent negative ratings arrive very late in the time-window. The last set of hypotheses concerns the direct reciprocity effect in the sequen- tial treatment (4a/b). Direct reciprocity on the feedback level is ruled out in AS and SI. Therefore, we restrict our analysis to the sequential treatment. Table 3.6 reports estimates from discrete-time proportional hazard models. We introduce the transaction outcome (ship) and the focal actor’s own and the partner’s reputa- tion ratings as control variables. The effect of direct reciprocity is introduced as a time-varying covariate (tvc). We split observations if a partner rating has occurred before. Additionally, the model incorporates the full set of time interval dummies and period dummies not reported in Table 3.6. Again, we report marginal effects on the hazard rate. The obvious effect of the transaction outcome (ship) is present with the expected sign and highly significant. Effects from direct reciprocity are 86 Reciprocity in Reputation Systems

Table 3.6: Hazards of Positive and Negative Feedback (SE only)

Positive Negative Feedback Feedback Buyer Seller Buyer Seller HA (1) (2) (3) (4) Ship +/- .125∗∗∗ .068∗∗∗ -.097∗∗∗ -.020∗∗∗ (.024) (.017) (.010) (.005)

Partner’s First Move (tvc) +/+ .060∗∗∗ .069∗∗∗ .046∗∗∗ .060∗∗∗ (.007) (.008) (.008) (.006)

Partner’s Pos. Ratings +/- .181 .064 -.368∗∗∗ -.146∗ (.157) (.165) (.120) (.084) Partner’s Neg. Ratings -/+ .042 .131 -.399∗∗ -.226∗ (.287) (.226) (.178) (.129) Own Pos. Ratings +/- .206 .192 .134 .016 (.211) (.216) (.138) (.102) Own Neg. Ratings -/+ -.328 -.472 .081 .248 (.428) (.421) (.215) (.177) Observations 5868 5801 6146 6718 Events 255 255 246 160 Clusters 64 64 64 64 Note: Sequential treatment only. The table shows average marginal effects (dydx) of discrete-time proportional hazard rate estimates (cloglog) incorporating the partner decision as a time-varying covariate (partner first). The partner first mod- els the increase of egos propensity to leave positive (negative) feedback if the partner has submitted positive (negative) feedback before. Reputation scores are rescaled by a factor of 100. Time indicators parameterizing the baseline hazard and period dummies are not displayed. Robust standard errors are clustered in parentheses, clustered on the individual level. ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01 (two-tailed tests). present as well, both for the hazards of positive and negative ratings. A partner’s first move increases the hazards by 4.5-6%. Sellers react somewhat more strongly than buyers. We have also tested for heteromorphic direct reciprocity with the ex- pected flips of signs. For example, a negative first move by the partner obviously lowers the rate of a positive rating. Due to low occurrence of these cross-valence cases, they do not reach significance at conventional levels and are omitted here accordingly. Finally, we find only mixed results for indirect reciprocity effects on the feed- back submission rate. Estimates of the reputation scores in Table 3.6 are mul- tiplied by a factor of 100, thereby being very small. No significant effects are found on the hazards of positive ratings, nor do one’s own reputation scores exert any significant influence on the submission rates. As expected, having a good Discussion and Conclusion 87 positive reputation protects one from receiving negative ratings. However, we see the very same protective effects against negative reputations, which is inconsis- tent with our simple expected pattern. Ten out of sixteen effects from reputation scores in Table 3.6 show the correct sign as hypothesized in column HA. However, just two of those are significant, while another two contradict expectations.

3.5 Discussion and Conclusion

We started our discussion of reciprocity in reputation systems with the observa- tion that trust plays a pivotal role for business in general but for e-commerce in particular. Internet-based reputation systems can exhibit a formidable potential to reduce information asymmetries, promise lower risks of moral hazard and fos- ter market efficiency. The value of reputation mechanisms is obviously only as good as the quality of the information that is reported to and disseminated from these systems. In decentralized online trade, transaction outcomes are privately monitored and voluntarily reported. Voluntary feedback submission is prone to reporting bias. One prominent and widely discussed source of bias is direct reci- procity introduced by bilateral rating systems. Sequential implementations of these institutions generate incentives for strategic rating behavior. Strategic sell- ers increase their submission rates to elicit reciprocal positive responses while strategic buyers are expected to remain silent even if the transaction outcome did not meet their expectations. The question is whether traders behave strategically after all. Our study joins the growing experimental literature on reputation effects in the trust game paradigm. We implemented 3 different reputation mechanisms on top of one-shot trust games: a one-sided implementation, a two-sided blind variant, and a bilateral system similar to the original eBay feedback forum. All three reputation systems are compared with a baseline stranger treatment where players have no access to the history of their opponents. The games are played for 30 rounds to study the dynamics of reputation formation. In general, our results confirm that a reputation system can increase trust- worthiness and efficiency. However, in our study, treatments explain surprisingly little about how buyers solve their problem of trust. The buyer’s decision to trust is virtually unaffected by the treatments. We hypothesized that direct reciprocity is the major source of bias introduced into a sequential bilateral ratings system. Bias undermines the informational value of ratings and can possibly hinder buy- ers from discriminating between the different types of sellers. However, it is not the buyer side but rather the seller side that is strongly affected by the rating sys- tem. Our results clearly indicate that the one-sided and blind implementations 88 Reciprocity in Reputation Systems are more powerful for disciplining and enforcing trustworthy behavior in sellers compared to the baseline and the sequential treatments. Screening and discrimination based on reputation scores take place. We find strong evidence for effects of positive and negative reputation scores affecting both the seller’s and the buyer’s decision in a consistent and expected way. This corroborates findings from previous experiments and field studies and suggests that reputation has information and market value. In fact, we find that highly rated subjects receive significant higher total payoff. For each positive reputation point, total payoff increases by 3.9% while it decreases by 3.5% for every neg- ative reputation point (+12.4% and -8.0% change in total payoffs for a standard deviation increase in the reputation scores). In line with hypothesis 3, we find an asymmetry toward negativity in indirect reciprocity. Negative ratings overall are roughly twice as strong as positive ratings in relation to the decisions to place and honor trust. Also the stylized fact of diminishing marginal returns for reputation can be demonstrated with our data. No systematic evidence has been found for reputation effects on the feedback stage. We hypothesized that indirect reciprocity may breed itself, not only on the level of the trust game but also on the feedback stage, where current positive and negative ratings attract disproportionately more ratings of their own kind. How- ever, a simple, consistent, and symmetric pattern is indeterminable. A positive reputation shows indications of being a resource that attracts further positive and safeguards one from receiving negative ratings. Nevertheless, reputation effects on the feedback stage are small, often not statistically significant, and sometimes even inconsistent with the expected pattern. The same holds true for decision makers’ own reputation scores. Theoretical arguments suggest that rational play- ers primarily optimize their own score. In the simulation study by Nowak and Sigmund (1998b), for example, one of the most successful strategies involved decisions based on the partner’s and the player’s own score. Our results confirm that participants share information about their trading partners irrespective of the fact that feedback submission is costly. This finding is inconsistent with theories of narrow self-interest. But it adds to the literature on strong reciprocity. Strong reciprocity primarily denotes a preference and then as a consequence a behavior. It is the behavioral economists’ loophole in subgame perfection that allows them to explain why humans reward benevolent and punish deviant behavior even in one-shot interactions. The rating behavior is first of all determined to be outcome-oriented independent of the implemented treatment. The biggest difference between the treatments is the feedback rate itself. While in the asymmetric and simultaneous treatment about 60 percent of buyers submit a rating, there are roughly 75 percent of buyers providing feedback in the se- quential treatment. The difference on the seller side is even more pronounced. A Discussion and Conclusion 89 comparison between the SI and SE treatments reveals that in the latter case 30% more sellers are willing to share information. We expected increasing feedback rates for second movers in the sequential treatment. Results from discrete-time duration models indicate clear evidence for reciprocal responses. This gives yet additional support to theories of strong reciprocity. These results are consistent with previous findings from field studies (Dellarocas and Wood 2008, Jian et al. 2010, Diekmann et al. 2014 and from experiments (Masclet and Pénard 2012, Bolton et al. 2013). The primary contribution of this paper is the analysis of the endogenous feed- back time while controlling for the transaction outcome. Outcomes are unob- servable in field studies and timing was not yet directly addressed in the existing literature. In the sequential treatment, we find that players postpone their rating in the bad transaction case to guard against retaliation (hypothesis 5b). Inconsistent with our hypothesis 4d, buyers’ negative ratings are not significantly suppressed, only significantly retarded. On the other side, we find that timing is virtually irrelevant and mostly non-strategic for the case of positive feedback. But there, direct reciprocity clearly increases the submission rate, primarily for sellers. Al- together, players do not accelerate their ratings to artificially ameliorate their cur- rent standing. They only behave strategically to act spiteful, to retaliate, or to avoid the stigma of a bad reputation. The key claim, that buyers have a lower propensity to rate negatively in the sequential treatment, remains elusive. While Masclet and Pénard (2012) only confirm it for the second mover (the seller) in the investment game, Bolton et al. (2013) confirm it for the buyer only after controlling for several other covariates. We do not directly test for the unconditional propensity. In any case, it looks like direct reciprocity overrules the suppression effect in the SE treatment such that in the end the incidence curves are strictly higher for positive and negative feedback. One reason for the missing suppression effect could be the lack of ambiguity in our binary-choice trust paradigm. Not shipping is simply revealed betrayal that asks for punishment. In the other two experimental studies, sellers have discretion over the size of the back transfer, i.e., they can vary the quality of their service. Of course, similar things happen in the field. Even though a seller ships, it turns out that the contents of the box have a scratch. In circumstances where buyers are mildly unsatisfied, they might start to balance reasons and get reluctant to submit negative ratings. Irrespective of the individual rating behavior, the results between treatments are obvious. Conventional bilateral feedback clearly falls short of enforcing trust and especially trustworthiness among anonymous traders. Our results indicate that feedback systems can be improved by changing to a one-sided or blind ver- sion. And that is exactly what eBay tried to accomplish with its changes to the 90 Reciprocity in Reputation Systems reputation system in 2007. A blind system seems unpractical due to the very long waiting times of 60 days until revelation. It chose asymmetry but only for one type of rating. In order to improve the quality of the rating system, sellers are no longer able to submit negative feedback. On top of that, buyers received the option to submit detailed seller ratings. To reduce moral hazard, they prevent direct reciprocity from sellers in negative feedback while keeping reciprocal re- sponses in positive ratings intact. This design is inconsistent in light of our results since then. It is mainly direct reciprocity in positive ratings that biases reputation information in our experiment. Masclet and Pénard (2012) explicitly tested this design in their investment game paradigm. eBay’s choice even fares worse than their sequential treatment with regard to investment levels. The current solution seems to be a compromise among empowering the buyer, keeping intact accu- mulated reputations of stakeholder, and not losing the impression of an overall trustful environment that generates a share of positive ratings above 95%. We concur with Bolton et al. (2013, 283), who recommended a “targeted ap- proach rather than one that attempts to remove all forms of reciprocity.” Direct reciprocity obviously has positive and negative consequences for reputation sys- tems. Submitting feedback means providing a public good and investing in a community that surrounds the market. In the terms of Gouldner (1960, 176), di- rect reciprocity is a starting mechanism that helps to initiate social interaction. It scaffolds opposing interests into a simple structure of turn taking. Here, direct reciprocity may also be thought of as second form of starting mechanisms. By lifting feedback submission rates to a credible level where a majority of traders share their information, direct reciprocity is involved to start off the second mech- anism of indirect reciprocity. Much progress has been made in theories of reputation building. Still, we know little under what specific real-world conditions indirect reciprocity robustly sustains trust and cooperation. Several complications like other-regarding prefer- ences, reporting bias, perception errors and differential effectiveness of negative vs. positive reputation have been documented that are all hard to trace in ana- lytical models. This suggests applications of simulation studies that are closer to parameters observed in empirical studies. In our experiment, all implementations of feedback mechanisms finally decay and fail. However, the introduction of stochastic endgame rules and the interaction of indirect reciprocity with compe- tition and endogenous partner choice promise stable levels of trust and trustwor- thiness without steady disintegration or collapse near the end of the game. With laboratory experiments we have primarily focused on varying the credibility of reputation information. Our sequential treatment produces the highest feedback submission rate and the lowest levels of trustworthiness of all treatments imple- Discussion and Conclusion 91 menting a reputation system. Obviously, it is quality rather than quantity that matters. 92 Reciprocity in Reputation Systems 3.6 Appendix

3.6.1 Functional Form of Reputation Effects

In this section we discuss the functional form of reputation effects. We recover two prominent stylized facts common to several theories of decision making: diminishing sensitivity and negativity bias. The former can be derived from Bernoulli (1738) even though it is more often attributed to a similar logarithmic law from early German psychophysics. The claim is confirmed if we find supe- rior explanatory power by taking logarithms of the reputation scores. Negativity bias is most prominently formalized in cumulative prospect theory (Tversky and Kahneman 1992) with an asymmetric value function, i.e., ν(x) = xα for x ≥ 0 and v(x) = −λ(−x)α for x < 0, where negativity is called loss aversion and is captured by a stretching factor λ ≈ 2.25. The upper row of Figure 3.4 shows effects plots of positive and negative rep- utation scores (R+, R−) for buying on the left and shipping on the right side. The effects plots are estimated from a data set pooled over all three feedback treat- ments. In both situations the effect of positive reputation is flatter compared to the case of negative reputation. The lower row depicts effects plots for derived image scores. According to the definition of Nowak and Sigmund (1998a), an im- age score is the difference between the sums of helpful and not helpful acts. In our + − case, it is the difference between positive and negative reports, i.e., Iit = Rit − Rit and mirrors the score on eBay. As already mentioned in the main text, if we pool our data “bad” is roughly twice as strong as “good”. However, the asymmetry is not constant across buyer and seller roles or over treatments. Reputation information has a stronger impact for buyers and also the asymmetry is more pronounced there. The asymmetry is completely determined by neither the average levels of trust (ceiling in probabil- ity space) nor the distribution of positive vs. negative ratings. About the same numbers of positive and negative ratings are submitted in the SE treatment. In the SI and AS treatments there are about 25% and 30% more positive than negative ratings. However, the ratio of the reputation effects (β−/β+) tends to be higher and falls between 1.3 and 2.5 for buying and between 1.1 and 2.4 for shipping.21

21 Interestingly, ratios from several empirical studies on eBay fall roughly into the same range, with drastically different underlying distributions of positive and negative ratings. In Diekmann et al. (2014), bad is not in all models significantly stronger than good. In the market for new mobile phones, for example, the ratio is 2.2. And one can see from their Table 2 that the ratio is typically around 2. In the field experiment of Lucking-Reiley et al. (2007) the ratio lies between 1.5 and 2.8. But there are also counterexamples of positivity bias, for example in Przepiorka (2013). In experiments from Bolton et al. (2004) and Li and Xiao (2014) the reported ratios are 1.7 and 1.6, respectively. Note that Appendix 93

Reputation Effects for Buy Reputation Effects for Ship 1 1 .8 .8 .6 .6 .4 .4 .2 .2 Probability of Buying Probability of Shipping 0 0 0 5 10 15 0 5 10 15 Reputation points Reputation points

positive negative positive negative

Image Score Effects for Buy and Ship Image Score Effects By Treatment (Buy only) 1 1 .8 .8 .6 .6 .4 .4 .2 .2 Probability of Buying 0 0 Probability of Buying or Shipping -10 -5 0 5 10 -10 -5 0 5 10 Image Score Image Score

shipping buying AS SI SE

Figure 3.4: Reputation Effects Plots for Placing and Honoring Trust

In Table 3.7, we replicate the findings from Table 3.3 by means of linear fixed effects panel models. We incorporate partner reputation with different specifica- tions. All variants are highly significant. However, we can see from the adjusted R2 measures that the models with squared or logarithmic form have higher pre- dictive power. This supports the assumptions of diminishing sensitivity, or in economic terms, diminishing marginal returns from reputation. We receive the qualitatively same result after standardizing reputation scores. Finally, there are also systematic interaction effects between the scores and periods (not reported here), i.e., reputation effects get weaker over the course of the 30 periods by about 0.2% for each period.

3.6.2 Fixed Effects Reputation Models

In Table 3.8 we replicate the findings from Table 3.3 by means of fixed effects linear probability panel regression models where we report separate models by taking ratios of coefficients from very different regression models is obviously very rough and not the proper way to test the claim. 94 Reciprocity in Reputation Systems ∗∗∗ ∗∗∗ .096 -.175 ∗∗ ∗∗∗ ∗∗∗ ∗∗∗ ects LPM) .006 ff .050 -.003 -.097 ∗∗ ∗∗∗ .014 -.056 ∗∗∗ .029 ects (Pooled Fixed E ff ∗∗∗ ∗∗∗ (.018)(.022) (.022) (.026) .197 ects linear probability models for buying (M1-M4) and ship- ∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗ ff (.001) (.001) (.002) (.002) .086 .006 -.127 ∗∗∗ ∗∗∗ (.004) (.010) (.005) (.011) (.008) (.014) (.009) (.016) Placing Trust (Buy) Honoring Trust (Ship) .035 ∗∗∗ (1) (2) (3) (4) (5) (6) (7) (8) 2400 2400 2400 2400 1787 1787 1787 1787 0.001 (two-tailed tests). 0.278 0.291 0.317 0.312 0.240 0.248 0.263 0.258 (.004) (.004) .052 < A - -.004 + + + + H 2 0.01; ***p The table shows estimates from fixed e R Table 3.7: Functional Forms of Reputation E 2 < 2 N Adj. Score Pos Pos Neg - -.083 Neg ln(Pos) ln(Neg) - -.270 Note: ping (M5-M8) pooled overutation, all i.e., three the feedback image treatments.with score Models squares (M1, incorporate (M3, M5), measures M7), positivestandard of and and errors the logs are negative partner’s of reputation in rep- reputation parentheses, scores**p scores clustered (M2, (M4, on M6), M8). the reputation individual Period scores level. dummies are not displayed. Robust Appendix 95 ∗∗ ∗∗∗ ∗∗ ∗∗∗ ∗∗∗ .043 .978 -.036 -0.038 0.024 ∗∗ ∗∗∗ ∗∗ ∗∗∗ ∗∗∗ .023 -.061 1.007 ects model the number of ff ∗ ∗∗∗ ∗∗∗ -.043 .038 -.042 .025 .970 -.096 ∗∗ ∗∗∗ .608 ∗∗∗ ∗∗∗ ∗∗∗ -.024 .028 .044 .101 .038 .936 -.051 ∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗ -.030 -.027 -.041 .036 .938 -.090 ∗ ∗∗∗ ∗∗∗ ∗∗∗ -.019 .001 -.025 -.030 -.003 .005 (.029) (.022) (.015) (.043) (.022) (.024) (.028) (.016) (.015) (.024) (.017) (.018) (.020) (.014) (.012) (.025) (.016) (.015) (.013) (.010) (.006) (.014) (.010) (.007) -.051 Placing Trust (Buy) Honoring Trust (Ship) .070 .905 ects linear probability models for buying (M1-M4) and shipping (M5-M8). Models ff ∗ ∗∗∗ (1) (2) (3) (4) (5) (6) (7) (8) ST AS SI SE ST AS SI SE 720 720 720 960 555 560 553 674 (.005) (.008) (.006) (.010) (.005) (.008) (.008) (.011) (.024) (.024)(.018) (.020) (.028) (.023) (.033) (.028) (.019) (.028) (.016) (.028) (.044) (.031) (.041) (.035) (.033) (.028) (.026) (.029) (.038) (.030) (.029) (.035) ects Models of Placing and Honoring Trust by Treatment (LPM) ff A + + H 0.01 (two-tailed tests). < 0.05; ***p < Table 3.8: Fixed E The table shows estimates from fixed e 0.1; **p < N Period - .001 .000 -.011 -.002 -.005 -.008 -.020 Partners’ Pos. Ratings Times partners didn’t ship -Intercept -.032 .858 Partners’ Neg. Ratings - -.146 Times partners didn’t buy - -.008 .023 .055 Own Neg. Ratings - -.022 -.014 -.013 .043 -.021 .058 Own Pos. Ratings incorporate the reputation scores of the decision makers and their partners. Additionally, experience e Note: times a decision makerthe faced individual partners level. who*p did not buy and did not ship. Robust standard errors are in parentheses, clustered on 96 Reciprocity in Reputation Systems

Table 3.9: Feedback Timing Patterns (%)

AS SI SE Mutual feedback Buyer rated first 8.3 27.7 Seller rated first 10.0 14.7 Simultaneous 2.0 5.9 Buyer feedback only 60.4 40.5 26.0 Seller feedback only 10.9 13.2 No feedback 39.6 28.4 12.5 Total 100.0 100.0 100.0 Observations 560 553 674

treatment. Reputation effects turn out to be very robust over a wide range of specifications, also under fixed effects with clustered standard errors on the in- dividual level. Asymmetry in effect sizes is not affected and remains intact. If we cluster on the session level (not reported here), we lose significance at con- ventional levels for positive reputation effects in AS and SI, and for the negative reputation effect in SE, with both buying and shipping. We conclude that nega- tive reputation effects are robust and strongest in AS and SI, but it is the positive reputation effect that is robust in the SE treatment. This holds true for both the buying and the shipping decisions. The decision makers’ own scores show no systematic effects. In the fixed effects models we also introduce history effects as additional con- trol variables, i.e., the number of times a player experienced partners who did not buy or ship in previous round (Gautschi 2000). Although under a rematching protocol one’s own history is uninformative about the current partner, players can learn something about the population in the lab. We find an expected effect for “times partner did not ship” in the stranger treatment, where the behavior of past partners is the only source of learning for the focal actor.

3.6.3 Feedback by Period, Time, and Treatment

Figure 3.5 shows that the total rate of feedback stays more or less stable over the 30 periods of play. The average rate of positive feedbacks is clearly declining while the average rate of negative ratings is always increasing. Table 3.9 displays timing patterns of feedback with the same logic as in Diek- mann et al. (2014, Table 3) or Bolton et al. (2013, Table 3). By comparing the tables, it is easy to see that the rigid two-stage feedback design of Bolton and colleagues does not match the arrival time pattern very well. Ours, on the other Appendix 97

Table 3.10: Treatment Effects on Feedback Arrival Time (RE)

Buyer Feedback Seller Feedback ¬ − + ¬ − + Sequential 3.007∗∗∗ 1.882∗∗∗ 0.649 4.133∗∗∗ 3.539∗∗∗ 1.294∗∗∗ (0.334) (0.458) (0.347) (0.446) (0.581) (0.275)

Simultaneous Ref. Ref. Ref. Ref. Ref. Ref.

Asymmetric 2.276∗∗∗ 0.914∗∗∗ 1.074∗ (0.334) (0.197) (0.421)

Ship -0.404 1.383∗∗∗ -0.869∗ 0.238 2.043∗∗∗ -0.512 (0.341) (0.304) (0.398) (0.350) (0.528) (0.315) Intercept 6.611∗∗∗ 4.072∗∗∗ 5.814∗∗∗ 4.452∗∗∗ 3.059 5.061∗∗∗ (0.514) (0.689) (0.745) (0.674) (1.991) (0.588) N 612 538 637 640 240 347 Note: Estimates from linear random effects panel regression models are shown. The dependent variable is the time in seconds for a transition to one of the three destination states (¬: no, − negative, or + positive feedback). Displayed are two treatment effects with the simultaneous treatment as the reference category and an additional ship effect. The full set of period dummies and random effects are not displayed. Robust standard errors are in parentheses, clustered on the session level. *p < 0.05; **p < 0.01; ***p < 0.001 (two-tailed tests). hand, fits the findings by Diekmann et al. better. In the case of mutual feedback, buyers tend to rate first. In Table 3.10 we report on models where the duration (in seconds) is regressed on treatment indicators, separately for abstention, positive feedback or negative feedback, and split by buyer and seller. The models complement the raw dura- tion in Table 3.4 and add to the non-parametric test discussed in the main text. Results indicate that the duration of the simultaneous treatment is systematically below that of the sequential treatment. Only the time for positive feedback is not significantly longer in the SE treatment. However, this needs to be put into perspective because we also see significantly longer duration in AS even though SI and AS should be identical with regard to timing. Positive feedback tends to arrive sooner if the seller made a shipment. For negative feedback shipment has a retarding effect. 98 Reciprocity in Reputation Systems 30 30 30 30 30 20 20 20 20 20 Period Period Period Period Period 10 10 10 10 10

0 0 0 0 0 AS - Buyer Comparison SI - Buyer Comparison SI - Seller Comparison SE - Buyer Comparison SE - Seller Comparison

.4 .4 0 .6 1 .6 .2 0 1 .8 .6 .2 1 .8 .4 .2 0 1 .6 .4 0 1 .8 .6 .4 .2 0 .8 .8 .2 30 30 30 30 30 20 20 20 20 20 Period Period Period Period Period 10 10 10 10 10

0 0 0 0 0 AS - Buyer Negative SI - Buyer Negative SI - Seller Negative SE - Buyer Negative SE - Seller Negative

.8 1 .6 .2 0 1 .8 .6 .2 1 .8 .6 .4 .2 0 1 .8 .6 .4 .2 0 1 .8 .6 .4 .2 0 .4 .4 0 30 30 30 30 30 20 20 20 20 20 Period Period Period Period Period 10 10 10 10 10

0 0 0 0 0 AS - Buyer Positive SI - Buyer Positive SI - Seller Positive SE - Buyer Positive SE - Seller Positive

1 .6 .2 0 .8 .6 .8 0 1 .4 0 .8 1 .2 1 .6 .4 .2 .6 .6 .4 0 .8 .4 .4 0 .8 .2 1 .2 30 30 30 30 30 20 20 20 20 20 Period Period Period Period Period 10 10 10 10 10

0 0 0 0 0 AS - Buyer Total SI - Buyer Total SI - Seller Total SE - Buyer Total SE - Seller Total

1 .6 .2 0 .8 .6 .8 1 .4 0 .8 1 .2 1 .6 .4 .2 0 .6 .2 1 .6 .4 0 .8 .2 .8 .4 .4 0

Figure 3.5: Average Feedback Submission Rates by Period and Treatment Appendix 99

Buyer's positve ratings Buyer's negative ratings .4 .4 .3 .3 .2 .2 .1 .1 Cumulative Incidence Cumulative Incidence 0 0 0 2 4 6 8 10 0 2 4 6 8 10 Time to feedback Time to feedback

Sellers's positive ratings Sellers's negative ratings .4 .4 .3 .3 .2 .2 .1 .1 Cumulative Incidence Cumulative Incidence 0 0 0 2 4 6 8 10 0 2 4 6 8 10 Time to feedback Time to feedback

Asymmetric Simultanteous Sequential

Figure 3.6: Cumulative Feedback Incidence Functions from Competing Risks

3.6.4 Competing Risks

We model timing in the feedback stage with a competing risks approach for three independent transitions to states where the players have rated positively, rated negatively, or abstained. If a player submits a positive rating after exactly 5 sec- onds, he or she has a zero risk of submitting negative feedback or abstaining thereafter. This is similar to a medical trial with treatment and placebo groups that allow for an additional cause of death, independent of the one the treatment is supposed to retard. For technical simplicity we assume independence of these three risks although we are aware that this is a tenuous position: our risks are likely to be correlated and even have a common cause. Transitions to multiple destination states used to be handled in a manner iden- tical to the single transition case. Every potential outcome event under risk was modeled separately while simply censoring competing risks (Blossfeld and Ro- hwer 2002, 81f,101f). However, the cause-specific hazard functions do not then have a direct interpretation in terms of survival probabilities (here feedback fre- quencies) for the particular failure type. In Figure 3.3, the hazard rate for positive buyer ratings in the SI treatment is strictly above the hazard rate of the SE treat- 100 Reciprocity in Reputation Systems

Table 3.11: Ex-Post Randomization Test (MLogit)

ST AS SI Coef SE Coef SE Coef SE Female 1.055 0.432 1.515 0.615 1.751 0.711 Age 0.952 0.050 0.879∗ 0.055 0.931 0.052 Experience 0.894 0.439 1.323 0.639 1.240 0.603 Faculty Economics Ref. Ref. Ref. Social Science 1.597 0.831 2.649 1.628 0.735 0.443 Arts 1.548 0.829 3.086 1.888 2.393 1.277 Other 1.418 1.242 5.123 4.315 3.144 2.482 2 N = 208, χ(d f ) = 23.9(18), p = 0.1552 Note: Estimates are from a single multinomial logistic regression model with the SE treatment as a baseline. Standard errors are in parentheses. *p < 0.05; (two-tailed tests)

ment even though the frequencies of transactions with a positive rating is larger in SE. This is the result of differential abstention behavior. In the SI treatment, buyers and sellers use the option to remain silent much more often and also click early on the corresponding button. This removes them from the risk set for po- tential positive and negative ratings and raises the hazard rate of the SI treatment upwards. To correctly match the percentages and correctly interpret treatment ef- fect, we have to address the correct subhazards. Figure 3.6 shows cumulative in- cidence functions predicted from a competing risk proportional hazard rate model proposed by Fine and Gray (1999). Fine and Gray’s approach is technically not well suited for our multiple failure case. That is why we stay with the standard time-discrete models in the main part of the text. We only add this analysis to cor- roborate the fact that the incidence function of the SE treatment is strictly above that of the other two treatments for all types of events under study. This gives support to hypothesis 4c, but it rejects 4d.

3.6.5 Ex-Post Randomization Test Our between-subject design is prone to self-selection into treatments since we do not randomize within sessions. Table 3.11 shows an ex-post randomization test, estimated from a multinomial logistic regression with the treatment indicator as the dependent variable, and the SE treatment as the baseline category. We report relative risks for being selected into one of the three treatments compared to the sequential treatment. As explanatory variables we use gender, age, the Appendix 101 faculty of the major study subject, and an experience indicator that turns one if subjects have participated in an experiment before. The only significant result is an age effect in the asymmetric treatment. However, all joint F-Tests of the covariates are non-significant. We conclude that randomization is intact or at least not systematically biased with regard to control variables collected during the post-treatment questionnaire.

3.6.6 Instructions an Screen Layouts 102 Reciprocity in Reputation Systems

Instruktionen zum Experiment

Beim Experiment an dem Sie gerade teilnehmen, geht es um Entscheidungen, die Sie in verschiedenen Situationen treffen sollen. Das Experiment besteht aus 30 Runden. In jeder Runde wird Ihnen ein je- weils anderer Mitspieler zufällig zugeordnet. Die Höhe der Punktzahl, die Sie während des Experi- ments erzielen können, hängt davon ab, wie Sie und Ihre Mitspieler sich in den jeweiligen Situationen entscheiden. Die Punkte, die Sie während des Experiments erzielen werden durch 50 geteilt und so in CHF umgerechnet. Dieser Betrag wird Ihnen nach dem Experiment ausbezahlt. Ihr Antrittsgeld von CHF 10.- wird Ihnen sogleich als 500 Punkte Ihrem Kontostand gutgeschrieben.

Aufbau einer Runde

Jede Runde besteht aus zwei Teilen, die immer zwei Spieler miteinander bestreiten. Jeder Teilnehmer ist von Runde zu Runde entweder Spieler 1 oder Spieler 2. Im 1. Teil einer Runde (siehe Abbildung 1) muss sich Spieler 1 entscheiden, ob er kaufen oder nicht kaufen wählt. Wählt er nicht kaufen, so ist der 1. Teil zu Ende und beide erhalten keine Punkte. Hat Spieler 1 sich für weiter entschieden, kann Spieler 2 sich zwischen liefern und nicht liefern entscheiden. Wählt Spieler 2 liefern, dann erhalten beide 20 Punkte. Wählt Spieler 2 nicht liefern, dann werden dem Spieler 1 10 Punkte abgezogen, Spieler 2 erhält hingegen 40 Punkte. Die Punkteauszahlung findet sich in den Abbildungen jeweils un- ter den Knoten, wobei die 1. Zahl der Punkteauszahlung für Spieler 1 und die 2. Zahl der Auszahlung für den Spieler 2 entspricht.

Abbildung 1: Spielbaum, 1. Teil Abbildung 2: Spielbaum, 2. Teil

Spieler 1 Bewerten A nicht kaufen kaufen C

Spieler 2 positiv schweigen 0, 0 B negativ nicht liefern liefern -1 -1 0

-10, 40 20, 20

Der 2. Teil einer Runde (siehe Abbildung 2) wird nur erreicht, wenn sich Spieler 1 für weiter ent- schieden hat und Spieler 2 zum Zuge kam. In diesem Fall erhalten beide Spieler die Möglichkeit, die Entscheidung des Mitspielers positiv oder negativ zu bewerten. Eine Bewertung kostet Sie 1 Punkt. Sie können aber auch auf eine Bewertung verzichten und sich für schweigen entscheiden. Schweigen kostet nichts. Ihre Bewertung beeinflusst den Punktestand Ihres Mitspielers nicht, wird aber seinen zu- künftigen Mitspielern gezeigt. Sie haben jeweils 10 Sekunden Zeit, eine Bewertung abzugeben. Nach Ablauf der Zeit wird keine Entscheidung als schweigen interpretiert. Sobald Ihr Spielpartner Sie be- wertet hat, wird Ihnen seine Entscheidung sofort eingeblendet. Nach der Bewertungsstufe startet eine neue Runde, die Rollen werden neu verteilt und Sie erhalten einen anderen Mitspieler.

Blatt bitte wenden…

Figure 3.7: Instructions SI Treatment (Page 1) Appendix 103

Informationen zu Ihnen und Ihrem Mitspieler

Am Computerbildschirm werden Ihnen jeweils auf der rechten Seite Informationen über Ihren eigenen Spielverlauf und Informationen zu Ihrem Mitspieler eingeblendet. Oben rechts wird immer Ihr aktuel- ler Punktestand, die Rundenzahl und die Anzahl Bewertungen ausgewiesen, die Sie von Ihren Mitspie- lern in den Vorrunden erhalten haben. Unten rechts werden die Informationen zu Ihrem Mitspieler ein- geblendet, auf die Sie Ihre Entscheidung stützen können.

Abbildung 3: Informationen

Informationen über Sie: Informationen über Ihren Mitspieler:

Aktueller Punktestand: Anzahl Bewertungen, die Ihr Mitspieler von Aktuelle Runde: Anzahl seinen Spielpartnern erhalten hat:

Bewertungen, die Sie von Ihren Positive Bewertungen: Anzahl Mitspielern erhalten haben: Negative Bewertungen: Anzahl

Positive Bewertungen: Anzahl Negative Bewertungen: Anzahl

Schlussfragebogen und Auszahlung

Kurz vor Schluss des Experiments wird Ihnen noch ein kurzer Fragebogen eingeblendet. Wir bitten Sie diesen vollständig auszufüllen. Danach werden Ihnen erneut Ihr Punktestand und die Umrechnung in Schweizer Franken eingeblendet und das Experiment ist abgeschlossen. Im Sitzungszimmer (S121) des Instituts wird ihnen die Auszahlung in einem verschlossenen Umschlag überreicht. Wir bitten Sie drin- gend, weder während noch nach dem Experiment mit den Teilnehmern über Ihre Entscheidungen und Ihre Auszahlung zu sprechen.

Viel Spass!

Figure 3.8: Instructions SI Treatment (Page 2) 104 Reciprocity in Reputation Systems

Figure 3.9: Screen for first mover: buy or not buy (AS, SI, SE treatments)

Figure 3.10: Screen for second mover: ship or not ship (AS, SI, SE treatments) Appendix 105

Figure 3.11: Screen for submitting feedback (SE treatment)

Chapter 4

Structures in Online Collegiate Networks

Abstract: In this paper, I study the structure and composition of online colle- giate networks that have formed at five Swiss universities. By means of social network analysis I re-analyze well-known findings from the network literature. Using cross-sectional network data from StudiVz from a single point in time, simple network formation processes from economics and statistical physics are discussed that can give rise to the observed emergent structures. I discuss statistical distributions that describe the degree distribution of growing social networks. The “Small World” property and the Theory of the Strength of Weak Ties will be tested. Based on attribute data, I show what common mixing patterns on online collegiate networks are observed and how they relate to the concepts of transitivity, propinquity, and homophily. Finally, mixing patterns will be statistically analyzed by means of exponential random graph models. I conclude that simple growing random graph models are able to capture many stylized facts typically observed on online social networks and that the latter can give sensible insights into social structures that have emerged in the offline world.

Keywords: social networks, social networking site, mixing patterns

† I thank several of my students for their participation in the data collection and cleaning process, especially Jonas Blatter, Marcel Breu, Michael Müri, Marc Osswald, and Florian Schlapbach. Support from Claudia Jenny in proofreading the manuscript is gratefully acknowledged. 108 Online Collegiate Networks 4.1 Introduction

Sociologists since Simmel (1908) have been interested in social circles as a cen- tral feature of friendship networks. One central focus (Feld 1981) where circles naturally intersect is public schools and universities. It comes as no surprise that social scientist started early on to study friendship, tie-formation processes, and sociometric choices at these central social crossroads. Starting with the seminal work of social network pioneer Moreno et al. (1943) in the classroom, the size of the networks quickly became bigger and approached entire schools (Coleman 1958, 1961) where, due to logistic constraints, the collection of complete network data has found its upper limit. But educational systems are still one of the richest sources of complete network data, be it on the classroom level (Hallinan 1978), educational institutions like high schools (Moody 2001, Bearman et al. 2004), or samples of university students (Van Duijn et al. 2003). In the last ten years, novel sources of complete network data based on digital traces from email (Kossinets and Watts 2009) or sites (Lewis et al. 2008, Mayer and Puller 2008, Wimmer and Lewis 2010, Traud et al. 2011) have piqued the interest of social scientist and network researchers with the promise to lift logistic constraints, and to allow the study of friendship formation in orga- nizations as large as entire universities. The explosive growth of social network sites like Facebook, LinkedIn, or Xing have even raised expectations of being able to observe social networks on a societal level. On social network sites (SNS), friendship formation and maintenance is com- monly an end in itself, though all major platforms have generalized to wider forms of usage in recent years. SNSs are defined by Boyd and Ellison (2007) as “web- based services that allow individuals to (1) construct a public or semi-public pro- file within a bounded system, (2) articulate a list of other users with whom they share a connection, and (3) view and traverse their list of connections and those made by others within the system.” Facebook is the archetype of an SNS. Other online services might include “social” technologies and features, but they often serve a special interest or are related to a specific focus. See Heidermann et al. (2012) for a recent review on the diversity and a classification of online social networks. For social scientists, social media is relevant in substantive terms, especially if one is interested in how individuals communicate, share information, coordi- nate, and cooperate. These novel forms of interaction facilitate social actions and movements and change resource mobilization due to drastically lowered transac- tion costs (Bond et al. 2012). In my context social media is, however, primarily of interest to solve the methodological problem of upscaling. Digital traces from online social networks have the potential to overcome the traditional limitations Related Literature 109 of social network analysis. In principal egocentric network data, typically col- lected by the means of surveys, allows the exploration of network properties for larger social entities. However, they do not provide a complete picture of the meso- or macro-level. The sociometric method for complete networks as well as the egocentric method are known to be very burdensome and demanding on study participants. Additionally, direct questioning introduces various forms of biases and limitations, such as social desirability, problems of recall, interviewer effects, or simply time constraints in the interview (Marsden 2005). On the other hand, digital traces of social networks can easily be collected by automated procedures in an unobtrusive way. They measure real behavioral phenomena by observing individuals’ choices and, incidentally, require far fewer resources. This is how proponents of computational social science (Watts 2007, Lazer et al. 2009) see the social sciences being revolutionized. As Corten (2012, 2) notes, “research on online social networks has only rarely fulfilled these promises.” The same may also hold true for social media analysis and social data mining in general. However, common knowledge about potentials, limits, and risks of online digital traces is emerging and meanwhile also subject to active debate (Lazer et al. 2014, Ruths and Pfeffer 2014, Zimmer 2010). In this paper, I study the online collegiate networks based on data collected from the SNS StudiVz. The contribution of this paper is to revisit common and well-known facts from and social network analysis and test them on this rather new type of data source. Compared to other studies of online social networks, the analysis will be restricted to single institutions where only links within the organization are considered. In doing so, the results should replicate an image of the inner structures of a university (Kossinets and Watts 2009). Addi- tionally, the data will be compared to an ensemble of similar Facebook networks. It will be shown that online collegiate network share remarkable similarities in terms of their global structure. The article is structured as follows: In the next section I review the previous literature on online collegiate networks. In section 4.3 I discuss the data collection and present the materials. Based on a simple network growth model by Jackson and Rogers (2007), Section 4.4 presents com- mon properties of growing random networks and discusses contributions from the “new” science of networks. Finally, common mixing patterns that mirror the opportunity structure will be discussed in Section 4.5.

4.2 Related Literature

There are three strands of research relevant for this work. The first concerns studies that use SNS data to analyze universities. Lewis et al. (2008), for example, 110 Online Collegiate Networks study a cohort of Harvard students with regards to shared tastes. Based on the same dataset, Wimmer and Lewis (2010) study racial homophily in a network of Facebook picture friends and ask what tie-generating mechanisms contribute to the final social structure. They rely on set of basic processes that determine why two persons establish a relationship with each other: availability, propinquity, and homophily. Their photo-tagging network is directed, so they also include endogenous balancing mechanism that I will not consider. Availability refers to the distribution of individuals over categories, such as race or gender, and goes back to Blau (1977, 79) and his conjecture that social as- sociation primarily depends on opportunity for social contact. From Blau follows, for example, that members of a minority are expected to form more out-group ties because of their relatively small group size. Propinquity refers to the distribution of individuals over physical space (Festinger et al. 1950) and primarily means ex- posure to the same focus (Feld 1981). Availability and propinquity together form the opportunity structure (Moody 2001). Homophily, also called assortative mix- ing, is the third potential mechanism where individuals are thought to exhibit a mutual preference in terms of a socially relevant category (Lazarsfeld and Merton 1978, McPherson et al. 2001). In principle, any shared category can be the basis for selection, including cohort and major study subject that are typically counted as opportunity structure. Ultimately, we observe mixing patterns over categories. Furthermore, there are several important endogenous processes, for example, reciprocity and . Reciprocity is a concept of directed networks, which refers to the tendency of ties becoming mutual. Triadic closure, which also applies to undirected networks, is the tendency that open triads tend to close, i.e., that friends of friends become friends (Rapoport 1953, Coleman 1990). Its outcome is called transitivity. Both endogenous effects are thought to result from a desire to balance (Heider 1946, Davis 1963). Finally, statistical models typically also include a sociality effect that captures an individual’s propensity to form ties by controlling for the degree distribution. In Section 4.5, I will come back to a specification of an exponential random graph model by Goodreau et al. (2009) that integrates several of the aforementioned mechanisms. Sociality, selective mixing, and balancing are not independent. Homophily can induce transitivity. On the other hand, triadic closure can amplify mix- ing patterns. Sociality also influences network-level mixing patterns, given it is correlated with the relevant categories. By including several interdependent mechanisms, Wimmer and Lewis (2010) show that racial segregation is not only driven by choice but is also a result of balancing mechanisms, propinquity, and homophily based on nonracial categories. Racial segregation is also studied by Mayer and Puller (2008). They analyze Facebook networks from 10 universities in Texas and match further attribute data from administrative sources to one of Related Literature 111 their networks. They find race to be strongly related to social ties, but also find nontrivial effects from measures of socioeconomic background, though primar- ily from propinquity and college activities. The strongest predictors are having completed the same high school and staying in the same dorm. Also important are shared major, shared cohort, or a shared college activity such as being an ath- lete or being member of the same fraternity. Mayer and Puller’s result indicates that tie formation is primarily driven by the opportunity structure and replicates the classical conjectures of Blau (1977) and findings of Festinger et al. (1950). Attributes, such as socioeconomic and educational background, gender, and stu- dents’ political ideology, are all significant predictors but tend to have far less explanatory power. A different strand of research is followed by Traud et al. (2011, 2012). Traud and colleagues analyze an ensemble of 100 Facebook networks to evaluate com- munity detection algorithms. Community detection comprises a diverse set of graph partitioning algorithms to find cohesive subgroups in networks (cf. Porter et al. 2009, Fortunato 2010). Traud and colleagues show that their algorithmically identified modules are composed from nodes with similar attribute values such as cohort, major, and dormitory. Even though not mentioned explicitly, their algo- rithms seem to have successfully recovered typical opportunity structures within universities in their ensemble. The work of Traud and colleages belongs to a “new” field of network scientists (Watts 2004, Newman 2010), influenced by diverse disciplines, including statistical physics, economics, and computer sci- ences. The majority of research contributions on online networks comes from this field, which will form the basis of Section 4.4. The third strand of research concerns common usage patterns of SNSs based on survey methods and usage analysis from social network sites that are not ex- plicitly tied to the university context. This body of research has grown to an extent that makes it impossible to discuss here. I refer to the review literature (Boyd and Ellison 2007, Mayer 2009, Heidermann et al. 2012) and highlight a few aspects. Compared with other online social network where users meet on- line, Ellison et al. (2006) find online collegiate networks to be spatially bounded, with users meeting offline first and connecting on online platforms later. Brooks et al. (2014) analyze egocentric networks from Facebook and find the networks of a user typically segmented into several dense subnetworks or foci (such as friends from high school) with very few connection between these subnetworks. In this article, I address only one of those clusters: the university focus. More related literature and empirical findings are introduced in the following sections. 112 Online Collegiate Networks 4.3 Materials & Methods

In this section, I discuss the data-collection procedure and present the graph and attribute data.

4.3.1 Data Collection The data for this study was collected September 2007 - January 2008 by crawling StudiVz pages of five German-speaking universities in Switzerland: University of Bern, Basel, St.Gallen, Zurich and ETH Zurich. StudiVz is an SNS that was very popular at German, Austrian, and Swiss universities and the dominant SNS there between mid-2006 and mid-2008.1 In many respects StudiVz can be considered an almost perfect clone of Facebook. The service imitated all relevant features of Facebook, so that I expect measurements of structural features to be identical. The data collection procedure is similar to the approaches of Mislove et al. (2007) and Ahn et al. (2007). The publicly visible part of the website was crawled by snowball sampling the connected component of the friendship network. An- alytical results from random graph theory suggest that my snowball-sampling procedure reached saturation with high likelihood. A detailed discussion can be found in Appendix 4.7.1. Compared to the crawling approach above, I consider only ties between students at the same university. This makes my ensemble of five Swiss universities comparable with Facebook datasets of Lewis et al. (2008), Mayer and Puller (2008), and Traud et al. (2011). Furthermore, at almost the very same time in January 2008, Lee et al. (2011) also collected StudiVz data from University of Bielefeld with a procedure identical to mine. They studied interactions between and not within universities, where the number of between- university ties is the foundation of their geo-spatial analysis.2 The publicly visible part of StudiVz excludes students who have restricted ac- cess with their privacy settings. However, at the time of data collection, very few students had enabled data-protection measures and hidden their list of friends. About 6.5% of the students chose not to make their profile public.3 These cases

1 According to a StudiVz press release from February 2008, StudiVz counted 4.5 Mio registered users and was in January 2008 the most popular German website. StudiVz measured on average one visit per user per day and about 30 page impressions per visit. As Lee et al. (2011, 130) have correctly identified, the actual number of post-secondary students in the three countries Germany, Austria, and Switzerland would only suggest a member count of 2.4 million users. It appears that StudiVz attracted users from other audiences such as former or exchange students, and young people who associated themselves with an institution without being actually enrolled there. 2 Lee et al. (2011) show that the number of maintained ties between Bielefeld and other institutions can be explained by the log of the geographical distance. This suggests that online friendship ties on StudiVz are determined by physical distance, too. 3 BE: 3.99%. BS: 7.32%, SG: 13.24%, ZH: 6.44%, ETH 4.84%. Materials & Methods 113

A) Arrival Times (All Profiles) B) Last Activity Time (Survivors) 150 3000 100 Frequency Frequency 50 1000 0 0

2005 2006 2006 2007 2007 2006 2007 2009 2010 2012

Time of Arrival Time of Last Activity

Figure 4.1: Arrival Time and Last Activity on StudiVz Profile Pages will be missing from the analysis. During the observation period I took three snap-shots of profiles, friendship list, guest books, and group memberships of the five universities in two-months intervals. In the following, I will concentrate on the last snapshot taken on January 13th, 2008. In November 2007, an online survey with students from ETH Zurich was conducted to better understand SNS usage patterns, to validate the participation rate, and to study the nature of friend- ship relationships. Detailed results from this survey will be discussed elsewhere. At the end of the data collection period, the data was anonymized. All personal identifiers such as clear text names were separated from the research data and replaced by numerical identifiers in the same way as suggested by Lewis et al. (2008) and Kossinets and Watts (2009). The StudiVz participation rate at the time of observation was roughly between 55-85% of all enrolled students at the five observed institutions with significant variations between institutions. A comparison with the official enrollment counts of ETH Zurich suggests a participation rate of about 65% at the time of wave two. In the online survey a rate of 67.5% was measured by direct questioning. By the time of wave three, the raw number indicate a participation rate of about 70%. The detailed comparison with official statistics is shown in Table 4.6 in the ap- pendix. Attributes like the proportion of females and foreign students observed on StudiVz are in good agreement with the official data. The participation rates fall in the middle but are at the lower end of what others have observed for university cohorts’ Facebook participation. While Lewis et al. (2008) report an impressive participation rate of 96% for their cohort of students from Harvard University, 114 Online Collegiate Networks

Mayer and Puller (2008) report numbers between 8-80% for their ensemble of ten Texas universities. In Figure 4.1.A, one can see two peaks of arrival dates that correspond to two new cohorts entering in autumn 2006 and 2007. The three horizontal lines are the observation dates up to snapshot 3. Seven years later, in March 2014, five random samples consisting of 500 students for each of the universities were revisited. We observed the students’ last activity date and number of friends given the profile still existed. Of all the profiles in the random samples, 48-55% were still online even though most of the profiles had been inactive for several years. In Figure 4.1.B the date of the last profile change is shown. The peak of the last activity was in May 2008, two months after Facebook entered the German- speaking market. In retrospect, since Facebook has taken over the entire market, the following analysis will be more of an autopsy than an examination of a vivid online community.

4.3.2 Networks StudiVz is a multiplex network. Four different social relations and tie defini- tions from four distinct networks are considered: (1) the friendship network, (2) a two-mode network of common interest group memberships, (3) a network of guestbook entries, and (4) the network of picture friends. The main focus will rest on the friendship network. A graph visualization of the ETH friendship net- work is depicted in Figure 4.2, where colors are coded according to entry cohort. The other three relational events will be used to build a measure of tie strength of the friendship network in Section 4.4.3. The friendship network can be rep- resented with an adjacency matrix A, whose elements Ai j give the strength of a relation between i and j. The networks are unweighted (Ai j ∈ {0, 1}) and undi- rected (A = AT ). Neither do we know who initiated the relation nor when the relation was established or dissolved. Note that this is a major difference and limitation compared to friendship studies based on directed and even timed net- work data. The networks of guestbook entries and picture friends are directed and the number of transferred messages and tagged photos can be interpreted as weights. Weights can also be derived from the network of common group mem- bership. However, since I concentrate on friendships, the networks have been symmetrized and the weights dropped. Ties in the friendship networks are simply called friendship relations. How- ever, this tie definition must be sharply distinguished from common friendship definitions. Friendship is typically defined as a mutual relationship characterized by trust, affection, and commitment. The number of friends users have accumu- lated indicates that the concept I here call friendship cannot be that of a close Materials & Methods 115

Figure 4.2: ETH Friendship Network Colored by Cohort relationship; it is better described as acquaintanceships. One distinguishes the nodal degree of within-university ties dwithin and the total number of friends on StudiVz dall that include links to nodes in other institutions. The latter metric is observed on the profile page. The median of dall across all five universities is 45 and the upper quartile is 74 friends, while the median and upper quartile of the 4 number of friends within the institution dwithin is 20 and 35, respectively. Such large numbers in nodal degree indicate very low tie formation and tie- maintenance costs. Nevertheless, I have good reason to believe that ties within institutions measure interaction on campus and are rooted in the offline world. Only 20% of my respondents in the online survey indicated they had added people to their list of friends they did not know in person before. And 75% of these

4 Lee et al. (2011) report almost identical measures for the StudiVz dataset from University of Bielefeld collected about ten days later than my snapshot 3. There, the median in the total number of friends dall was 48 and the upper quartile is 86 friends based on an institution with 29’192 students. Compared to StudiVz, users on Facebook tend to have much higher nodal degrees, e.g., Lewis et al. (2008) report a median of about 100 friends. Traud et al. (2011) show with their ensemble that dwithin is mainly driven by the size of the institution and the duration of potential accumulation, i.e., when Facebook was introduced at a specific institution. Accumulation is ongoing. According to Ugander et al. (2011), the median number of total friends on Facebook for the entire Facebook population was 100 in 2011. This number continues to rise. 116 Online Collegiate Networks

A) Proportion of Valid Attributes B) Proportion Missing by Degree

● private 0.6 gender

cohort 0.5 signup date ●

last update 0.4 birthday ●

● ●

subject 0.3 ● geo circle ● ● ● ● ● ● ● ● ● ● country ● ● ● 0.2 ● zip Proportion Missing Attributes ● ● ● canton ● 0.1 ● high school

0 20 40 60 80 0.0 0.2 0.4 0.6 0.8 1.0 Degree

Figure 4.3: Proportion of Valid and Missing Attributes respondents with “virtual contacts” on their list of friends added fewer than four of these. This suggests that the number of “merely online friendships” is expected to be below 5%. Mayer and Puller (2008) report this number to be 0.4% for their ensemble. According to Ellison et al. (2007), participants of online collegiate networks mainly use these sites to interact with people they already know or to keep in contact with people met during their high school days.

4.3.3 Attributes Besides relationship information, the data also contains limited sociodemo- graphic information such as gender, age, cohort, major study subjects, canton or country of origin, that are voluntarily entered by students. Table 4.1 provides descriptive statistics. I concentrated on a few general attributes that help to map the overall opportunity structure and where most of them are exogenous to mini- mize the potential for endogeneity between attributes and tie-formation process. Personal information, such as romantic relationship status, political orientation, tastes, or work related information, will not be considered. These attributes are typically endogenous, missing from the majority of nodes, are very high dimen- sional and don’t have an inherent incentive to be truthfully reported. Figure 4.3 shows the availability of the attributes which are included in the analysis. While gender and platform specific information like sign-up date and date of the last profile update are almost perfectly observed, canton and coun- try of origin or past attended high school are only observed for roughly half of the population. For nodes above the mean degree, typically 80% or more of all Materials & Methods 117

Table 4.1: Descriptive Statistics of Attributes

Attribute Mean Min Max Attribute Mean Sd Min Max Institution Cohort Uni Bern .211 0 1 2001 .081 - 0 1 Uni Basel .183 0 1 2002 .103 - 0 1 Uni St. Gallen .100 0 1 2003 .120 - 0 1 Uni Zurich .312 0 1 2004 .119 - 0 1 ETH Zurich .194 0 1 2005 .145 - 0 1 Faculty School 2006 .194 - 0 1 Languages .041 0 1 2007 .239 - 0 1 Humanities .051 0 1 Missing .122 - 0 1 Social Science .123 0 1 Geo-Circle Economics .148 0 1 Uni Canton .270 - 0 1 Law .090 0 1 Rest of Swiss .300 - 0 1 Exact .041 0 1 Foreign .080 - 0 1 Natural .085 0 1 Missing .351 - 0 1 Medical .092 0 1 Private Profile .065 - 0 1 Engineering .090 0 1 Female .465 - 0 1 Missing .233 0 1 Age 23.29 2.90 16.9 47.8 Note: Descriptive statistics pooled over all 5 institutions. N=39,846. attributes are observed while students with a degree below the mean typically only reveal very few details about themselves. This association is shown in Fig- ure 4.3.B, where the proportion of observed attributes is strongly determined by the degree in the friendship network, i.e., values are not missing at random and are likely not even independent between connected pairs. This demarcates the worst-case scenario and calls for multiple imputation procedures (cf. Rubin and Little 2002), which, however, will not be applied in this context. Missing values concern the analysis in section 4.5. For the case of univariate mixing, missing values are dropped listwise, i.e., I do not take edges into account that have a miss- ing value on either side of the edge. In the multivariate case in section 4.5.2, ERG models are fitted on selected faculties/schools within a university. In these subsamples gender and birthday are almost perfectly observed. Missing values will be randomly imputed. However, cases with missing values on cohort, major subject, and geo-circle will be dropped from the analysis. The variable geo-circle is a combination of the attributes country, canton, zip, and high school, taking three nominal values. For students who live and have completed high school in the canton that is home to the university, geo-circle takes the value 1. Coming from any other canton or simply being Swiss is coded as 2, foreigners are coded as 3. The variable high school consists of a four- digit number that corresponds to the zip code of the high school. In Switzerland, there are 128 high schools (“Gymnasia”) that grant access to higher education. 118 Online Collegiate Networks

There are 152 codes in the data covering almost all high schools, including few additional private schools and teacher colleges. Country, zip, and canton will only be part of the univariate analysis.

4.4 Growing Networks

The goal of this section is to review common characteristics of socially generated networks by means of a very simple network formation process. It will be shown that my networks are consistent with most of its expected properties. Jackson and Rogers (2007) model a tie-formation process that balances between purely random growth and a network-based process. The random component of the model reflects the intuition that in any larger social systems interactions tend to be first of all determined by chance. On the other hand, the network-based aspect stresses the importance of local optimization and reflects the fact that nodes are more likely to connect to nodes that already possess many links since the latter are more visible and easier to find. The combination of the two processes allows the hybrid model of Jackson and Rogers (2007, 891) (JR model) to incorporate and predict many stylized facts common to most social networks:

1. Skewed degree distributions or heterogeneity in the number of contacts a node in the network maintains. 2. Average distances and the maximal distance between any two nodes in a network tend to be small. 3. Triadic closure or the fact that two nodes tend to have more common neigh- bors than one would expect from a random graph. 4. The degrees of connected nodes tend to be positively correlated, which gives rise to a core/periphery structure of the network.

I start with a definition of the tie-formation process in Section 4.4.1 and fit the corresponding degree distribution in Section 4.4.2. In Section 4.4.3, features 2 and 3 are discussed that are commonly known as the Small World property of social networks. Furthermore, a test of the theory of the “Strength of Weak Ties” will be performed. In Section 4.4.4, I will discuss the degree assortativity and age dependency of the tie-formation process.

4.4.1 Tie-Formation Process I begin with some notation and the setup of the JR model. At each time point, the network is defined by a t × t asymmetric binary adjacency matrix, At. Each node i is indexed by the period t ∈ {1, 2,...} it enters the network. A direct link Growing Networks 119

t from node i to j is recorded as Ai j = 1, and the cell equals zero if no link exists. Node i is called the predecessor of j, and the latter is called successor. In each period t a single node enters the network with an in-degree d0 = 0. Nodes neither leave the network, nor are the links ever broken. Therefore, the number of nodes in period t equals t. On arrival, the node meets m ≡ mr + mp successors. First, the node selects mr predecessors by uniform random sampling (without replacement) from the set of existing nodes. With constant probability pr it finds each node attractive enough to establish a link. In the same period, it selects mn nodes by picking uniformly at random from the union of successors of those nodes it has just met by the random process. And with probability pn network-based meetings result in a link. This process forms a directed network where every node has a constant out-degree but a varying in-degree. This simple tie-formation process consisting of a random and a search component is a hybrid of several other models suggested in the literature (cf. Price 1976, Watts and Strogatz 1998, Barabási and Albert 1999, Krapivsky et al. 2000, Callaway et al. 2001). It has become a very popular starting point for generative network modeling in the social sciences and also serves as a bridge between social network analysis and research on strategic network formation.5

4.4.2 Degree Distribution

The range of potential degree distributions suggested with the aforementioned tie- formation process will be the first test of how reasonable the JR model performs in predicting the ensemble of StudiVz networks. The degree distribution is the probability distribution over nodal degrees and one of the fundamental network properties. In the following, I will discuss and approximate degree distributions based on continuous concepts even though the degree is inherently discrete. Of course, a histogram of nodal degrees does not reveal the complete structure of a network since there is typically a very large set of networks that share the same distribution. Nevertheless, generative network models typically give precise pre- dictions that can be ruled out with simple distributional tests. By applying a

5 The model of Jackson and Rogers nicely nests other well-known models of growing random networks. By setting mr = mn = pn = 1 and pr = 0, the model recovers the cumulative advantage process described by Price (1976). If we additionally set d0 = m a special case of Price’s model is recovered, which was studied by Barabási and Albert (1999), where new arriving nodes attach to existing ones proportional to the latter’s degree. In this case, the survivor function of the degree distribution equals 1 − F(d) = d−(m+d0)/m with its probability density Pr(d) ∼ d−3. There, degrees follow a Pareto distribution. On the other hand, if we set pn = mn = 0, it results in a purely randomly grown graph with degrees following an exponential distribution first described by Callaway et al. (2001) and Zalányi et al. (2003). 120 Online Collegiate Networks mean-field approximation, Jackson and Rogers find in-degrees for their model to be distributed as:

!1+r d + rm F(d) = 1 − 0 (4.1) d + rm where the parameters are those introduced in the previous section. Unfortunately, the approximation lacks a proper likelihood. To fit the parameter r to data, Jack- son and Rogers suggest a simple iterative least squares procedure that is now widely used (e.g. Corten 2012). However, as Atalay (2013) has nicely demon- strated, the JR model is virtually identical to a model proposed by Dorogostev et al. (2000) with only a few assumptions adjusted. Dorogostev and colleagues provide a closed-form degree distributions.6 As it turns out, the model is rather difficult to estimate, and this for a simple reason. Both procedures have been applied with the same outcome. Before I estimate m and r using maximum likelihood, requirements and prop- erties of the model will be discussed. The model is primarily applicable in the domain of directed graphs, and only undirected graphs are at hand. Jackson and Rogers show that their model can be adopted for the undirected case where the di- rected nature of the link only indicates who has initiated the meeting. Every link is treated as a mutual acquaintance by assuming perfect reciprocity. Thus, as a practical requirement, the undirected degrees di are divided by 2. In my case, the minimal degree equals d0 = 1, m is half of the average degree, and r represents the relative importance of random link formation to preferential attachment. For r = 0, the graph is formed by pure preferential attachment whereas r = ∞ represents a completely random process. Jackson and Rogers report r = 0.57 for Barabási and Albert’s snapshot of the World Wide Web. Fits for a citation and

6 In the JR model, nodes attach to the sets of chosen predecessors with probability pr and pm. As one can see from Equation 4.1, p is not part of the degree distribution and results in equal predictions for combinations of m and p where mp is constant. Thus, p can be dropped from the model. Further- more, in the tie-formation process of Dorogostev and colleagues’ random sampling is executed with replacement. While this allows for nonzero probabilities of multiple links between two nodes i and j, the probability that this actually happens is small. According to their Equation 15 the probability density is

Γ[(m + 1)r + 1] Γ[d + mr] Pr(d) = (1 + r) i (4.2) Γ(mr) Γ[di + 2 + (m + 1)r]] with the corresponding likelihood function:

! n ! Γ[(m + 1)r + 1] X Γ[d + mr] `(m, r) = n log(1 + r) + n log + log i . (4.3) Γ(mr) Γ[d + 2 + (m + 1)r] i=1 i The model was fitted to data by means of Matlab with an optimization based on simulated annealing in a routine provided by Atalay (2013). Growing Networks 121

Table 4.2: Univariate MLE of Degree Distributions

UBE UBS UZH USG ETH JR-Model m 13.23 14.82 11.04 18.78 11.30 r ∞ ∞ ∞ ∞ ∞ r∗ (1.5e+06) (6.8e+05) (1.2+e06) (9.5+e05) (7.6+e05) Exponential Distribution λ 0.038 0.034 0.045 0.027 0.044 (0.0004) (0.0004) (0.0004) (0.0004) (0.0005) BIC 71817 63948 101894 36841 63758 Weibull Distribution / Stretched Exponential k 1.213 1.143 1.157 1.105 1.153 (0.010) (0.010) (0.008) (0.014) (0.010) λ 28.19 31.08 23.25 38.95 23.78 (0.267) (0.335) (0.190) (0.588) (0.247) BIC 71360 63757 101499 36789 63521 Negative Binomial Distribution µ 26.45 29.64 22.08 37.57 22.59 (0.243) (0.307) (0.172) (0.546) (0.223) θ 1.494 1.336 1.402 1.227 1.402 (0.023) (0.022) (0.018) (0.026) (0.023) BIC 71510 63899 101796 36868 63690 Descriptive Statistics Mean 26.45 29.64 22.08 37.57 22.59 Median 21 23 17 29 18 SD 21.91 25.96 19.26 33.12 20.1 Max 195 290 174 221 226 N 8’398 7’284 12’442 3’981 7’743

coauthor network return r = 0.63 and r = 4.7. A network of prison inmates and a network of romantic relationships in high school both return an r = ∞. The first two networks (www, citations) are well-known candidates of scale-free networks whereas the latter two cases are classical social networks among human actors. Corten (2012) has fitted the JR model for the Dutch online social network Hyves. He reports an r = 4.7 for Hyves and r = 4.35 for a degree distribution sampled from Facebook (received from Gjorka et al. 2010). Compared to my case, Corten studies the degree distributions of the entire networks, including nodes that have accumulated degrees of up to 5000 contacts. Therefore, I expect my estimate of r to fall far above the case of Corten. Table 4.2 reports maximum likelihood estimates of the JR model together with other well-known statistical distributions often applied to modeling degree distributions. The first three rows correspond to the fit of the JR model. The 122 Online Collegiate Networks

A) ETH Degree Distribution B) ETH Degrees (Log−Lin CCDF) 0 ● ●● ●●● 10 ● ●●● ●● ●● Observed 0.04 Observed ●● ●● ●● ●● Exponential Fit Exponential Fit ●● ● ●● −1 ●● Weibull Fit ●● ● ●●

10 ● ●● 0.03 ● ●● ● ●● ● ●● ● ●● ● ●● ●● ● ●● ● ●● ● −2 ●● ●● ●● ●●● 10 ● 0.02 ● ●● ●●● ● ●●● Probability Probability ●● ●● ●● ● ●● ● ●

−3 ● ● ● ● ●

0.01 ●

10 ● ●● ● ● ● ●● ● ●● ● ●●●● ●●●●● ●●●●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ● −4 0.00 10 0 50 100 150 200 0 50 100 150 200

Number of Friends Number of Friends

0 0 0 0 ●●●● ●●● ●●●●● ●●●● ●●●●● ●●●● ●●●●●●● ●●●● 10 ●●●● 10 ●●● 10 ●●●● 10 ●●● ●●●● ●●● ●●●●●● ●●● ●●●● ●●● ●●●●● ●●●● ●●●● ●●● ●●●●● ●●● UZH ● UBE ● UBS ●● −1 ● −1 ● ● ● ●● −1 ●● ●● USG ●● ●● ●● −1 ●●● ●● ●●●● ●●● ●●●●● ●●● ●●●● ●● ●●●●● ●●● ● ● ● 10 ● 10 ● ● ● ●● 10 ●● ●●● ●● ●● ● 10 ●● ● ●●● ●●● ●●●●● ●●● ●●● ●●● ●●●●● ●●● ●● ●● ●●● −2 ●●● −2 ● ● ● ● ● −2 ● ●● ●● ●● ●● ●● ●● ●● ●● −2 ●● ●● ●●●● ●●● ●●●● ●●● ●●● ●● ●● 10 ●● 10 ● ●

10 ● ●●● ●●● ●●●● ●● ●● ● 10 ● ● ●●● ●●● ●●● ●●● ●●● ●● ● −3 ●● −3 ● ● ● −3 ● ●● ●● ●● ●● ●● ● ● −3 ● ● ●● ●● ● 10 10 ● ● 10 ● ● ● ● ● 10 ● −4 −4

● −4 ● ● ● 10 10 10 0 50 100 150 200 0 50 150 250 0 50 100 150 200 0 50 100 150

Figure 4.4: Histogram and Log-Lin-CCDF of ETH Degree Distributions mean degree (or half of it) is perfectly fitted. However, the parameter r diverges and approaches infinity. The parameter r∗ is the maximum an optimizer based on simulated annealing is willing to report. In this parameter region the likelihood is almost flat and fits become unstable. Therefore, proper standard errors and the value of the log likelihood cannot be reported. The reason for this is simple. Ac- cording to the JR model r → ∞ means that the tie-formation process is dominated by pure random tie-formation with a proportion of network-based meetings that is essentially zero. If we restrict the JR model to a purely random growth, the model simplifies to the ones proposed by Callaway et al. (2001) and Zalányi et al. (2003). There, on every time step t a new node enters the network. In the former one, m edges are added randomly to the new set of nodes including the entrant. In the latter case, newly formed ties are only formed between the entrant and the incumbents. Irre- spective of this difference, it has been shown by the authors that in both models degrees follow closely an exponential distribution. Figure 4.4 shows the degree distributions of the ETH network. In Panel A, fitted values of the exponential distribution overlay the counts in a histogram. Panel b depicts the survivor function (CCDF) on a log-linear plot. It’s easy to see Growing Networks 123 that the degree distribution closely follows an exponential except the values for highly active nodes in the upper tail. Less convincing fits are received from the other four institutions where the degree distribution typically falls between the fitted values of an exponential and a Weibull distribution. In Table 4.2, I also report fitted parameters from the exponential and the Weibull distribution. The former is nested in the latter. We can check for devia- tions from the exponential by testing for a shape parameter λ = 1 of the Weibull. The test of a clean exponential distribution is clearly rejected for all five networks since all exhibit a significant shape above 1. However, the shape parameter is rather close to unity which suggests that the exponential is still a viable candi- date. Instead of relying on a model of a growing random graph, one can also test for a static graph model by means of the negative-binomial. The standard ran- dom graph models stipulate a Poisson distribution (Bernoulli graph, Erdos-Rényi˝ model). Since the Poisson is nested in the negative-binomial, deviations can be tested again. Receiving an overdispersion parameter θ > 1 indicates a pure or spurious contagion bias on top of the random network formation. This is the case for all five networks. The Poisson is heavily rejected and can be ruled out. A comparison of the model fit reveals that all three models are very close and hard to distinguish through visual inspection or by statistical tests. The Weibull always performs better with regard to the Bayesian information criterion (BIC) than the exponential. This comes as no surprise since the former has an additional parameter and degree of freedom. A comparison with the negbin based on BIC does not apply. In a Kolmogorov-Smirnov test the exponential distribution shows a higher p-value than the Weibull. However, Kolmogorov-Smirnov tests always reject all candidate distributions due to high statistical power. This claim also applies for the other four networks. Altogether, I conclude that the degree distri- butions follow approximately an exponential distribution, which, in turn, suggests that there is no or only little cumulative advantage from network-based meetings. With a parameter r ∼ ∞, my degree distributions is found to be a limiting case in the JR model. Besides the StudiVz ensemble, all distributional tests have also been per- formed on the FB100 ensemble from where the hypothesis of exponentially dis- tributed online collegiate networks receives additional corroboration. Users in the FB100 ensemble tend to have substantially higher mean degrees compared to Stu- diVz users. Additionally, a nontrivial number of the FB100 networks have few but relevant outlying hubs that have led scholars to apply heavy tailed distributions on this dataset. In the Appendix 4.7.3, I will discuss the claim of “scale-free” online social networks by Hu and Guo (2012) that seems to be prevalent as the following quote from the review paper of Heidermann et al. (2012, 3869) suggests. They 124 Online Collegiate Networks write: “For instance, usually social networks as well as the social and activity graphs of OSNs are scale-free.” However, I will show that the preferential attach- ment model of Barabási and Albert (1999) only applies to a small subsample of outliers whereas the vast majority of users follow the logic discussed above. Degree distribution of online collegiate networks are skewed but not heavy- tailed. One popular consequence of this finding is an intensified friendship para- dox described by Feld (1991) which explains why most people have fewer friends than their friends have. The friendship paradox rests on the assumption that a friendship count of one’s friends serves as a standard or reference point when people evaluate their own popularity. As Feld has demonstrated, the average de- P 2 P gree of neighbors ( x / xi) is generally higher than the degree of a typical P i node ( xi/n) in any network with positive variance in the degree. Since the degree distribution of neighbors depends on the variance of the original distri- bution, skewed degree distributions aggravate the situation because highly con- nected nodes show up disproportionately often on the contact lists of their neigh- bors. The mean degree in the ETH network is 22.6 while the mean degree of neighbors is 33.2. If ETH students would compare themselves with the average (median) of their friends, 83% (73.9) of all students would feel deprived. Since all networks studied here share a similar ratio of the variance to the mean, the magnitude of potential “deprivation” is almost identical for all observed cases.7

4.4.3 Weak Ties, Bridges, and Small Worlds

I continue the discussion on common characteristics of social networks with Jack- son and Rogers’ stylized facts (2), short average path length, and (3) clustering. Forty years ago, Granovetter (1973) proposed a theory that received widespread attention within and outside of sociology. The theory of the “Strength of Weak Ties” (SWT) contributes to the problem of intergroup cohesion, i.e., the integra- tion of the diverse subgroups into larger communities or society. First of all, it pertains to close-knit groups generated by face-to-face interactions that are con- nected and integrated over weak ties. Granovetter’s idea stresses the importance of (local) bridges as important channels of information that overcome traditional cleavages of a differentiated social system. He conjectures that bridges tend to have low tie strength as opposed to strong ties prevalent within cohesive clus- ters. Accordingly, acquaintances (weak ties) of a focal actor are less likely to be socially involved with one another compared to his close friends (strong ties).

7 The proportion of students in the StudiVz networks having a degree lower than the mean degree of their neighbors falls between 0.81-0.83. In the FB100 ensemble, the mean over all networks is 0.84 with a standard deviation of 0.03. Growing Networks 125

Upon closer inspection, the argument of Granovetter rests on two closely re- lated concepts. Burt (1992, 28) states: “The weak tie argument is about the strength of relationships at the same time that it is about their location.” Burt claims that the bridging aspect of the tie is more relevant in the context of in- formation propagation in networks. In graph-theoretic terms, a bridge is a line that, if removed, would increase the number of components of the graph. In other words, information can only flow over the bridge irrespective of its strength. Ac- cording to Granovetter, the stronger a tie gets the less likely it is a bridge because triads tend to close (“forbidden triad”). Similarly, bridges are due to their low tie strength and fleeting nature much less stable and decay faster (cf. Burt 2002). Nevertheless, bridges and weak ties are two opposing sides of the same coin that build the foundation of what Putnam (2000, 22) calls “bridging ”. A different but related strand of research in social networks tried to tackle bridges directly by forming chains of acquaintances and addressed the integra- tion of cohesive clusters from a different angle. Starting with the seminal work of Pool and Kochen (1978, 17) and Travers and Milgram (1969), the small-world research is motivated by the following questions: given the structure of an ac- quaintanceship networks, what is the objective likelihood of two strangers shar- ing a common acquaintance? A more general version asks about the distribution of the minimal chain lengths for populations of different sizes. For the layperson the answer is surprising. Small-World research shows that even the largest popu- lations can be traversed in only a few steps (“6 degrees of separation”). However, a short characteristic path length L, typically measured with the average length of geodesics in a network, is a common feature of any random graph. Short chains are only surprising in the light of highly clustered social networks. Watts and Strogatz (1998) have nicely formalized the small world phenomenon and pro- posed a simple stochastic network model to reproduce their claim. Given is a connected network with n nodes and a mean degree m with m  n and m  1. That is, the number of contacts is small enough that the graph is sparse but high enough that a giant component is present. Starting from a highly clustered graph with C close to 1, Watts and Strogatz show through random-rewiring links that the characteristic path length is reduced sharply while the clustered nature of the graph stays roughly intact.8 In other words, to substantiate the claim of a small

8 There are two different measures of clustering applied in this paper. The first one in Equation 4.4 is a graph level metric of transitivity whereas the second version in Equation 4.5 is a node level measure. Watts (1999) uses the average of local clustering CAVG in Equation 4.6 as a global measure. The measures can give substantially different numbers and are often confused. See Newman (2010, 198f) for a discussion.

(number of triangles) × 3 C = (4.4) (number of connected triples) 126 Online Collegiate Networks world network one needs to show that the network has a characteristic path length close to the one of a random graph but a clustering coefficient which is much higher than expected from a random graph. In an Erdos-Rényi˝ random graph (ER) the characteristic path length L can be approximated by L ∼ ln(n)/ ln(m). The average clustering coefficient is approxi- mately CAvg ∼ m/n or equals the density of the graph (cf. Watts 1999, 502). The ETH network has n = 7741 nodes and a mean degree of m = 22.6. Therefore, Avg 9 based on an ER graph one would predict Lrnd = 2.87 and Crnd = 0.003. The Avg observed values are LETH = 3.38 and CETH = 0.29. While the simple random graph approximation of L would predict average chain lengths of 3, observed chain length suggest about 4 degrees of separations. The clustering coefficients are, however, two orders of magnitude apart. This demonstrates that ETH is a small-world network. Table 4.4 shows that this also holds for the other four net- works.10 To put on record, we now know that my online collegiate networks are clus- tered, exhibit a short characteristic path length, and most likely have bridges that integrate different dense parts of the graphs. I now characterize the bridges and

(number of pairs of neighbors of i that are connected) C = . (4.5) i (number of pairs of neighbors of i)

1 Xn CAvg = C (4.6) n i i=i 9 A better benchmark than the ER model is random graph based on the configuration model that matches the degree sequence of my social network. A simulation with 10’000 random draws from the configuration model with ETH’s degree distribution returns CAvg = 0.008 with sd = 0.0003, which is three times higher than the ER benchmark but still far away from the observed values (Newman 2010, 434f). 10 Let us briefly discuss the prediction of the JR model and show that it poorly predicts the level of clustering under complete random meetings. For values of r < 1, the JR model predicts zero clustering by definition. This is known to be an accurate description of the level of clustering in scale- free networks. However, for values of r > 1, the mean-field approach suggests clustering to follow this equation (cf. theorem 2): 6p C = (4.7) (1 + r)[(3m − 2)(r − 1) + 2mr] By setting p = 1 to its maximum, it is easy to see that the clustering coefficient is unbounded and can give nonsensical values for r < 1. For my case where r → ∞, it follows that C → 0 and even falls below the network density. Poor predictions are also found for networks that are in a more reasonable domain with respect to the fitted r. For the complete networks of Hyves and Facebook, Corten (2012) estimates r = 4.72 and r = 4.35 with mean degrees of 106 and 94. The JR model predicts C ≈ 0.001 for r = 4.5, p = 1, m = 50. There, the prediction is at least above the network density. However, the observed clustering coefficients of 0.18 and 0.16 in these networks are two orders of magnitudes higher than what the JR model would predict. Growing Networks 127 test Granovetter’s claim of the Strength of Weak Ties. The ETH network has 356 proper articulation points and 178 bridges. If we remove all articulation points, the network disintegrates into components. However, the largest component still includes 90.5% of nodes and 89.5% of all edges. This suggest that real bridges are a phenomenon of the periphery, deleted nodes tend to be leafs, and bridging ties in the core will be in most cases local bridges and not real bridges. In graph-theoretic terms, the SWT-argument can be tested in several ways. A famous test of this claim was provided by Friedkin (1980). Friedkin used the dyad state (MAN) in directed graphs as proxy for tie strength and showed that bridges are in fact much more often asymmetric and not mutual. A second more direct way to approach the SWT-argument is testing whether the weight of an edge is proportional to its overlap. Borgatti and Feld (1994) proposed a simple procedure where one correlates the weighted adjacency matrix A with an overlap matrix O. The latter is a dichotomized and a squared version of the former, i.e., O = AAT . The product of an adjacency matrix with itself returns the number of walks of length 2, which is equivalent to the number of common neighbors. A similar but relative measure of topological overlap of a common neighborhood was proposed by Onnela et al. (2007) that falls between zero and one:

ni j oi j = ∈ [0, 1], (4.8) (ki − 1) + (k j − 1) − ni j where ki and k j correspond to the degree of the nodes at either side of the edge and ni j is the number of neighbors common to both nodes i and j. Onnela and colleagues computed edge weights taken from a large data set of mobile phone call records where the number of calls and call durations between two persons are the proxy for the edge weight. They show that overlap is positively correlated with edge weight and corroborate the SWT theory. In a similar vein, they corre- late overlap with edge-betweenness centrality. Edge betweenness bi j for edge e (cf. Brandes 2001, Girvan and Newman 2002) is defined as

X X gie j b = (4.9) i j g i j i j where gi j equals the number of all shortest paths (geodesics) from i to j and gie j refers to the number of shortest paths from i to j that pass through edge e. They show that edges with high betweenness centrality (local bridges) have lower overlap. A very similar analysis with identical conclusion was conducted with data from a massive multi-player online game (cf. Szell and Thurner 2010). For this study, tie strength is derived from multiple social relationships. The presence of additional ties in the other networks will be interpreted as a weak proxy for tie strength. Besides collecting acquaintances in the friendship network, 128 Online Collegiate Networks

A) Overlap vs Tie Strenght B) Overlap vs Edge−Betweenness

● ● 0.5 ●

0.5 ● ● ● ● ● ● ● ● ●● ● ● ● ● ● 0.4 ● ● ● ● ●●● ● 0.4 ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ●● ● ● ● ●●●● ●● ● ● ● ● ● ● ● ● ● ●●●● ● ● ● ● ● ● ● ● ●●● ● ●●● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ●● ● 0.3 ● ●● ●● ●●● ●●●● ● ● ●● ● ● ● ● ● ● ●●●● ●● 0.3 ● ● ● ●● ● ● ● ● ●● ●●● ●● ●● ● ● ●● ● ● ●●● ● ● ● ● ● ●●● ●●● ●●●● ●● ● ●● ●●● ●●●●● ●●● ●● ● ●● ● ● ●● ● ● ●● ●●● ● ●●● ● ● ● ● ● ●●● ●●●●●●● ● ● ●● ●● ● ●● ● ● ●●● ●● ●● ● ● ● ● ● ● ●●●● ● ●● ● ● ●● ● ● ●● ●●● ●●● ● ● ● ● ● ● ● ●●●● ●●●●●●● ● ● ●● ● ●●● ● ● ●● ●●●●●●●●●●●●●●● ●●●●● ●● ● ● ● ● ● ● ●●●●●●●●●●●●● ● ●●●●●●●● ●●● ● ● ● ● ●● ●●●●●●●●●● ●● ●●● ●●● ● ● ● ●● ●● ●●●●●●●● ●● ● ●●● ●●● ● ● ● ●●●●●●● ● ●●●●●●●●●●●●●●●●●●● ● 0.2 ●● ● ●●●●●● ● ●● ●●● ●●● ●● ● ● ● ● ●●●●●● ●● ●●●●●●● ● ●●●● ●●●● ● ● 0.2 ● ●● ●● ● ●● ● ●● ●● ●●●●●●●●●●●●●●●●●●●●● ● ●● ● ● ● Edge Overlap Edge Overlap ● ● ● ●●● ●●● ●●●● ● ● ● ●● ● ●●●●●●● ●●●●●●●●●●●●● ● ●● ●●● ● ● ●● ●●●● ●●●●●●●●●●●●●●●●●● ●● ● ●●● ● ● ●● ● ● ● ●●●●●●●●●●●●●●●●●●●●●●● ● ● ●● ●● ● ●●●●●●● ●●●●●●●●●●●●●●● ● ●●●●● ●● ● ● ● ● ● ●●●●●●●●●●●●●●●●●● ●●●●●●● ●●● ●●● ● ● ●●●●● ● ●●●●●●●● ●●●●●●●●●●●● ● ● ● ● ●●●●●●●●● ●●●●●●●●●●●●●●● ●●●●●●●●●●●●●● ●● ● ● ●● ●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●● ● ● ● ●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ● ● ● ● ● ● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●● ●● ● ● ●● ● ●●●● ●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ● ● ● ● ● ●●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ● ●●● ●● ● ● ● ● ●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●●●●●●● ●●●● ● ● ● ● ● ● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●● ● ● ● ●●● ●●● ●●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●● ● 0.1 ●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●● ●● 0.1 ● ● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●● ● ● ● ● ●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●● ●●● ● ● ● ● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●● ●● ● ●● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●● ● ● ● ● ●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●●●●●●●●●●● ●●●●●●●● ●●●●●● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●● ● ● ●● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●● ● ● ●●● ●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●● ● ● ● ●●●●●●●●●●●●●●●●●●●●●● ● ●●●●●●●●●●●●●●●●● ●●●● ● ● ● ● ● ●●● ● ● ●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●● ●●●●● ● ●● ● ● ●● ●●●●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●● ● ● ● ●● ●● ●●●●●● ●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●●●●● ● ● ● ●● ● ●● ● ● ●●●●● ●●●●●●●●●●●●●●●●● ●●●● ● ●● ● ● ● ●●●●●● ●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●● ● ● ● ● 0.0 0.0

1 2 3 4 4 5 6 7 8 9

Level of Tie Strength Edge−Betweenness (log)

Figure 4.5: Edge Overlap against Tie Strength and Edge-Betweenness

students in the StudiVz network are also engaged in photo-tagging. The edge is flagged in the friendship network if two students appear together on a photo. Another network is formed from the guestbook where student i can post messages on the wall of j. Again, the edge is flagged if a message is passed between i and j in any direction. Finally, students are heavily engaged in topical interest groups that form a bipartite graph between interests and students. I also flag the edge if a common group membership in the student-student projection is found. Tie strength is then a simple linear combination of these three proxy indicators (F1- F3). The ordinal scale then takes the value 1 if the two students are linked in the friendship network only. We add one for a shared group membership, for a guestbook entry and for co-appearance on a published photo. The weights are then plugged into the adjacency matrix and are later correlated with the overlap matrix.

Figure 4.5.A shows that the proportion of common neighbors increases in tie strength. The right panel depicts the relationship of overlap with edge- betweenness. The lower the edge overlap is, the more shortest paths are traversing that edge. The average neighborhood overlap is 12%. Two students with only a link in the friendship network have an average neighborhood overlap of 8% while students with a maximal tie strength value of 4 have on average 20% of common neighbors. Table 4.3 reports the correlations of overlap with the index of tie strength and its three components. All three components and the index are posi- tively correlated (0.175, 0.06, 0.136 and 0.2). On the other hand, the measures of Growing Networks 129 relabelings / Statistics Correlations Descriptive Mean SD O EB GM GB FO F3) (TS) 1.703 0.695 0.200 -0.067 1.000 + F2 + F1 + Table 4.3: Correlations of Overlap and Edge Betweenness with Tie Strength The table shows results from the ETH network. Correlations of overlap and edge-betweenness with Overlap (O)Edge Betweenness (log) (EB)Tie strength (1 F1) Shared group membership (GM)F2) Shared guestbook entry (GB)F3) Coappearance on tagged 0.525 6.799 foto (FO) 0.499 0.783 0.059 0.119 -0.660 0.175 0.235 0.324 1.000 -0.065 0.136 0.120 0.060 0.791 -0.058 0.105 -0.002 1.000 0.479 1.000 0.578 0.098 0.085 0.150 1.000 Note: measures of tie strengths. Allhypothesis correlations that are significant graphs with are respect to unrelated. a permutation Monte test (QAP) Carlo of simulations the null- each with 10’000 permutations always indicate that 100% of draws are below the observed value. 130 Online Collegiate Networks tie strength are negatively correlated with the proxy for bridging. The correlation between tie strength and edge-betweenness are, however, very small. To assess the significance of the correlations, I have performed permutation tests. Standard significance test cannot be applied since dyads are not indepen- dent. Instead, the QAP procedure (Hubert and Arabie 1989, Krackhardt 1987) is used, comparing observed correlations with a distribution of correlations gener- ated according to the null hypothesis of no relations between the matrices. The QAP procedure permutes rows and columns of one of the input matrices. The p- value then corresponds to the proportion of random correlations that are as large as or larger than the observed correlation. In my case, 100% from 10’000 ran- dom draws for the correlations of O-vs-TS and EB-vs-TS fall below the observed correlations. This corresponds to a p-value of p = 0.0001. I conclude that Gra- novetter’s and Burt’s claim can be corroborated with my data. In a similar vein, also the Small World property can be clearly identified.

4.4.4 Degree Correlations Figure 4.6.A shows the inverse relationship between local clustering and nodal degrees. This is a common finding in large networks and corresponds to the fifth stylized fact of Jackson and Rogers. That is, members of the neighborhood of a highly connected node are less likely to be linked among each other compared to neighbors of a low-degree node. For example, for users with just five contacts, the average local clustering coefficient is around 0.35, indicating that 35% of all their friend pairs are themselves friends. The egocentric network of nodes with more than 100 contacts have density of between 15-20% which is still very high. The decay in Figure 4.6.A looks rather flat, predicted by the red lowess fit. This slowly declining curve stands, however, in very good agreement to what Ugander et al. (2011) have found on Facebook and Corten (2012) on Hyves. Both studies show that local clustering stays above 10% for nodes up to 1’000 contacts from where it then sharply drops. This negative degree-clustering correlation is remarkably similar across different online social networks. In the StudiVz ensemble the overall correlation falls between -0.32 and -0.35. In the FB100 ensemble the interquartile range is between -0.30 and -0.38 (cf. Figure 4.13). In essence, this relationship defined on the node level is very similar to the negative correlation of edge-overlap vs edge-betweenness already discussed in the previous section. The higher a node’s degree, the more likely shortest paths will pass through this node, or, in other words, bridges disproportionably often have a highly connected node at one of their ends. This holds especially true for networks with an extremely skewed degree distribution where hubs dominate a centralized topology and connect a large number of nodes with small degree. In Growing Networks 131

A) Transitivity vs Degree B) Degree vs Age of Node

●●●●●● ● ● ● 1.0 ● ● 200 ●●● ● ● ● ● ● ● ● ● ● ●●● ● ● ● ● ● ● ● ●●● ●● ● ● ● ● ●●● ● ● ●

0.8 ●● ● ●●● ● ● ● ●●●● ● ● ●●●● ● ● ●●● 150 ●●●●●●● ● ● ● ●●●●● ● ● ●●●●●●●●● ● ●●● ● ● ● ● ●●●●●●● ● ● ●●● ● ●●●●●●● ● ● ●●●●●●●● ● ● ●●●●●●● ● ● ● ●● ●●●●●● ●●●● ● ●●●●● ●● ● ● ●●●● ●● ● ● ● ● ● ●●●●●●●●●●● ● ● ● ● 0.6 ● ● ●● ●●●●● ●● ●●●● ● ● ● ● ● ●●●●●●● ●● ● ● ● ● ●●●●●●●●●●●●● ● ● ● ● ● ● ● ●●●●● ●● ●●●● ● ● ● ● ● ●●●●●●●●●●●●●●●●●●●●● ● ● ●●●●●●●●●● ●●●● ● ● ● ● ●●●●●●●●●●●●● ● ●● ● ●●●●●●●●●●●● ●●● ● ●●● ● ●● ● ●●●●●●●●●●●●● ● ● ● ●● ● ● ●●●●●●●●● ●●●●●● ● ● 100 ● ● ● ● ● ● ●● ● ● ●●●●●●●●●●●●●●●● ●●● ● ● ● ● ● ● ● ● ● ● ● ●● ●●●●●●●●●●●●●●●●●● ●●● ● ● ●● ● ●● ● ● ● ●●●●●●●●●●●●●●●●● ●● ● ● ●● ● ● ●● ● ●●●●●●●●●●●●●●●●●●●●●● ●● ● ● ● ●● ●● ● ● ● ● ● ●● ●●●●●●●●●●●●●●●●●●●●●● ●● ● ● ● ● ●● ● ● ●● ● ● ● ●● ●●●●●●●●●●●●●●●●●●●●●● ● ● ● ● ● ●● ● ● ● ● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●● ● ●● ●●●●●● ●●● ●● ●●●●●●●●●●●●●●●●●●●● ●●● ●●●● ● ● ●● ● ●●●● ● ● ● ● ● ● ● 0.4 ● ●●●●●●●●●● ●●●●●●●● ● ●● ● ● ●●● ● ● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ● ● ● ● ● ● ●●●● ●●●●●●●●●●● ●● ● ● ● ● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ● ●● ●● ●●● ● ●●●●● ● ●● ● ●● ● ● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ● ● ● ● ● ●●● ●●●●● ●● ● ● ●● ● ● ● ● ● ●●●●●●●●●●●●●●●●●●●●●● ●●●●●● ●●● ● ● ● ● ● ●●●●●●● ●●●●●●● ● ● ● ●● ●● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ● ● ● ●●●●●●●●●●● ●● ● ● ● ●● ●●●●● ● ● ●● ● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ● ● ● ● ● ●● ● ●●●●●● ●●●● ● ● ●●● ● ●● ● ● ● ●●● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ● ● ● ●●●●●●●●●●●●● ●●●●● ● ● ● ● ●● ●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●● ● ●● ●● ●●●●●●●●●●●● ●●● ●●●●●● ●●●●●●● ● ●●● ●●● ●●●● ● ● Local Clustering ● ●●●●●●●●●●●●●●●●●●● ●●●●● ● ● ●●●●●●●●●●●●●●●●●● ● ●● ●● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ● ● ●●● ●● ● ●●●●●●● ●●●●●●●●●●●●●●●●●●● ● ● ●●●●●●●●● ●●● ●● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●● ● ● ● ●● ●● ● ● ●● ● ● ●● ●●●●●●●●●●●●●●●●●●●●● ● ●●● ●● ●● ●●● ● ● ● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●● ●●● ●● ● ● ● ● ●●●● ●● ●●●●●●●●●●●●●●●●●●●●●●● ● ● ●●●●●●● ● ● ● ●●●● ●●● ●● ● ●●●●●●●●● ●●● ● ● ●●●●●● ● Number of Friends ● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●● ●●● ●●● ● ●●●● ●● ● ● ● ●●●●●●● ●● ●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●● ● ● ● ●● ● ●● ●● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ● ●● ● ● ●●●● ●● ● ●●●● ●● ●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●● ●●● ● ●●●● ● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●●●● ●● ●● 50 ● ● ● ● ● ● ●●●●● ●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ● ●●●●● ●●●●●●●●● ●●● ● ● ●●●●●● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ● ●●●● ●● ●● ●● ●●●●●●● ●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●●●●●● ●●●●● ● ● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ● ● ● ● ● ●●●●●●●●● ● ●● ●●●● ●● ●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●● ● ●●●●●●●●● ●●●● ● ●● ● ● ● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ● ● ● ● ●●●●●●●●●●●●●● ●● ●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●● ●●●●●●●●● ●●●●● ●● ●●●● ● ● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●● ●● ● ● ● ● ● ● ● ●●●●●●●●● ●●●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●● ●●●●●●●●●●●●●●● ●●●●● ●● ●● ● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●● ● ●● ● ●● ●● ● ● ●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●● ● ●●●●● ●● ●●●●●●● ● ● ●● ● 0.2 ●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ● ● ● ● ●●●●●●●● ● ●● ●●●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●● ●● ● ●●●● ● ● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●● ●● ●●●●●● ● ● ● ● ●●●●● ●●●●●●●●●●●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●● ● ●● ●●● ●● ●● ●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●● ●●● ●● ● ● ●● ● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●●● ● ●●● ●●●● ●●● ● ●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●● ●●●●● ● ● ●● ●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●● ●●●●●●●●●●● ●●● ● ● ●●●● ●●● ● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●● ●● ●●●● ● ● ● ● ● ● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●● ●●●●●●●●●● ●● ● ●● ● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●● ● ●● ● ● ●● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●●●●●●●●●●● ●●●●●●● ●●● ● ● ●●● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ● ●● ● ●●●● ● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●● ● ● ● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●●●● ●●●●●●●●● ● ● ● ● ●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●●●● ●●●● ●●●● ● ●●● ●● ● ● ●● ●● ●●●●●●●● ●●●●●●●●●● ●● ● ● ● ● ● ● ● ●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●●●●●●●●●●● ●●●●●●●●● ● ●● ●●● ●●● ● ●●●●●●●●●● ●●●● ● ●●●●●●● ●● ●●● ●● ● ● ●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●● ●● ●●●●●●●● ●● ● ● ● ● ●●● ●● ●●● ● ● ● ● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●● ●●●●●●●● ●● ●●●● ● ●● ●● ●● ●●●●● ● ● ● ● ● ●● ●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●● ●● ●●● ●● ● ● ● ● ●● ●● ● ● ● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ● ●●● ●● ● ●● ● ● ●● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●●●●●●● ●●● ●●●● ●● ● ●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ● ●● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●●● ● ● ● ●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●●●● ●●●● ●●●●●●●●●●●●● ● 0 0.0

0 50 100 150 200 0 200 400 600

Number of Friends Age of Nodes (in Days)

Figure 4.6: Degree Correlations these networks the correlation of degrees on either side of the edges would be negatively correlated. It turns out that this is not the case for SNSs and rarely for online social networks in general. Probably one of the most robust findings in the research of online social net- works is the degree assortativity, i.e., a positive relationship between the degrees of two neighboring nodes. It means that gregarious people are friends with other gregarious people while loners gravitate towards other loners. Positive degree assortativity is known to coincide with a core-periphery structure with a densely packed core consisting of nodes with high degree. At the other end of the con- tinuum, disassortatively mixed graphs exhibit a star-like structure. Simulated ex- amples of these two types of networks are show in Panels A and B of Figure 4.7. Both graphs consist of roughly the same number of nodes and edges. The major difference is degree assortativity r = +0.25 in a and r = −0.25 in b. Degree assortativity is important for two reasons: it determines (a) the speed and the outreach of processes taking place on the graph (e.g., information propa- gation, disease spread) and is (b) a key variable with respect to graph resilience. While spread phenomena initially percolate faster through positive assortative graphs, epidemics are finally more widespread in disassortative structures. In the former case, the spread is typically confined to a dense central position called the “core group” in epidemiology. The core group is also thought to act as a reser- voir for diseases or behaviors even if the graph is not dense enough to support endemic infection levels. Compared to star-like structures, assortative networks are remarkably robust against targeted attacks. One can easily disintegrate a star- 132 Online Collegiate Networks

● ● ● ● A B ● ● ● C ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

Figure 4.7: Degree and Type Correlations like network by the removal or immunization of the hubs. However, this strategy is far less effective in a core-periphery structure where the core group maintains high redundancy.11 Following Newman (2003a), I measure degree assortativity simply by a Pear- son correlation coefficient between the degrees of the nodes at either side of the edges. Table 4.4 shows the results of the degree assortativity for the StudiVz en- semble, which falls between 0.16-0.21. Very similar results have been found for the entire Facebook graph: 0.226 (Ugander et al. 2011, 8). Corten (2012) reports 0.3 for Hyves. Surprisingly, the FB100 ensemble exhibits significantly lower lev- els of degree assortativity, with an inter-quartile range of 0.05-0.15 (cf. Figure 4.13). Nevertheless, all these levels are far above from what one would expect in a perfectly mixed environment. The standard null-model with the same degree distribution suggests a zero correlation. Positive degree assortativity can be an outcome of intended preference-based choice of one’s interaction partners. But there exists an alternative explanation that is much more plausible in this context: network growth. In the JR model, degree assortativity is a direct result of the tie-formation process in which nodes

11 Newman and Girvan (2003, 81) provide extensive simulation results. Additionally, they compare levels of assortativity in social (co-authorship, company directors, e-mail address books), technolog- ical (www, software dependencies), and biological networks (food webs, neural networks, metabolic and protein interactions). They conclude: “[. . . ] a particularly striking feature of the table is that the values of r for the social networks are all positive, indicating by degree, while those for the technological and biological networks are all negative, indicating disassortative mixing.” What appears to them as general regularity is, however, primarily a question of tie definition. One can easily find disassortative social networks such as buyer-seller relations or networks with ties defined based on inherent authority, status, or power differences. A surprising example is provided by Holme et al. (2004), who analyze an online dating network that exhibits strong negative degree assortativity. The explanation for the pattern is rather simple. Old incumbent nodes that already have accumulated a high number of contacts pounce on newly arriving nodes which, in turn, gives rise to a negative degree-degree correlation. Growing Networks 133 sequentially arrive in the network. Nodes with a higher degree simply tend to be older nodes. Since nodes must attach to preexisting ones, they connect to nodes that are at least as old as they are. This suggests that in a model where degrees grow with time, nodes with higher degree are more likely to be connected to each other, which leads to a positive correlation of the degrees of connected nodes in the network. Figure 4.7.B shows for the ETH network (and Table 4.4 for all other networks) that, in fact, the degree vs node-age correlation is positive and reaches substantial levels between 0.34 and 0.43. Node age is defined as the arrival time in the network, i.e., it is the sign-up date at StudiVz. It is neither identical with the real ages of students nor is it perfectly related to the study cohort even though these numbers tend to be highly correlated. Node age and real age are positively correlated with r = 0.7 whereas cohort and node age exhibit a correlation of r = 0.24 in the pooled sample of all StudiVz profiles. I have performed numerical simulations based on the JR model’s tie- formation process, incorporating only random meetings that reproduce an iden- tical exponential degree distribution and the degree assortativity of the StudiVz and the FB100 ensemble.12 A simple static random graph (configuration model) with an exponential degree distribution suggests a zero correlation with node age and also destroys any degree-degree correlations. Results for the grown graphs are remarkably different. Degree assortativity comes with rDD ≈ 0.3 close to the observed measures. However, correlation of age and degree are typically in the order of rDA ≈ 0.8 and therefore far above the observed levels. Even though net- work growth recovers both stylized facts, the actual fits of the models are poor. The age of a node does not completely determine this. This suggests that on top of the age dependency there must be also substantial individual heterogeneity in the node’s propensity to form ties that remains unexplained without detailed attribute data. As noted above, degree assortativity can also be an be an outcome of intended preference-based choice of one’s interaction partners based on their attributes or category given these are correlated with the degree. Figure 4.6.C shows a graph with two types of nodes that have preferentially chosen interaction partners based on type. In the following section, I discuss network patterns that are defined based on the relative occurrence of within-category ties (black, gray) and cross-category ties (red).

12 Simulations and all structural measures have been computed with the R Igraph package of Csárdi and Nepusz (2006). Random growth models implemented in this package are based on Zalányi et al. (2003) which is identical to the random-only meeting process in the JR model. I have also tested against the model of Callaway et al. (2001) which produces identical results in terms of age depen- dency but fails to match the degree assortativity of my target ensembles. 134 Online Collegiate Networks

Table 4.4: Graph-Level Metrics of Swiss StudiVz Networks

Wave III (Jan 2008) UBE UBS USG UZH ETH Students WS 07/08 * 10’978 8’665 4’558 20’507 10’627 StudiVZ Uptake Rate 0.77 0.84 0.87 0.61 0.73 Degree & Density Number of Vertices 8’398 7’284 3’981 12’442 7’741 Number of Edges 111’066 107’933 74’774 137’352 87’442 Mean degree 26.45 29.64 37.57 22.08 22.59 Density 0.003 0.004 0.009 0.002 0.003 Degree Standard Deviation Degree Skewness 1.586 1.770 1.506 1.694 2.076 Transitivity & Small World Diameter 9 9 9 12 11 Average Shortest Paths 3.388 3.305 2.986 3.701 3.607 Transitivity (Global) 0.150 0.170 0.158 0.141 0.198 Transitivity (Local) 0.234 0.254 0.226 0.228 0.288 Overlap-Betweenness Correlation -0.404 -0.359 -0.281 -0.310 -0.420 Degree-Transitivity Correlation -0.385 -0.345 -0.340 -0.351 -0.320 Degree-Age Correlation 0.368 0.372 0.433 0.343 0.346 Assortativity & Segregation Degree (c) 0.181 0.161 0.209 0.182 0.209 Gender (n) 0.148 0.143 0.068 0.137 0.150 Cohort (n) 0.396 0.431 0.448 0.406 0.514 Birth Year (c) 0.546 0.542 0.483 0.549 0.580 Member Since (c) 0.302 0.317 0.415 0.287 0.394 Faculty (n) 0.423 0.443 0.142 0.463 0.562 Major (n) 0.326 0.346 0.065 0.359 0.490 High School (n) 0.333 0.304 0.171 0.346 0.226 Canton (n) 0.243 0.216 0.173 0.281 0.203 Foreigner (n) 0.115 0.214 0.293 0.205 0.273 Geo-Circle (n) 0.295 0.271 0.256 0.323 0.234 Note: The table shows structural measures for the five StudiVz networks computed on the data from wave 3. The first two rows compare the number of nodes with official enrollment counts. The Small World properties are computed based on formulas in Section 4.4.3. Assortativity coefficients are significant with respect to a permutation test (QAP) of the null-hypothesis that there is no correlation between attributes at either side of the edge. Monte Carlo simulations each with 10,000 permutations/relabelings indicate that 100% of the draws are below the observed values. Assortativity coefficients indicated with (c) are calculated with Pearson correlations, nominal attributes are based on Equation 4.11.

4.5 Mixing Patterns

The goal of this section is to describe common mixing patterns of online col- legiate networks and to recover the opportunity structure based on the available attribute data. Networks in Swiss universities are, as in their US-American coun- Mixing Patterns 135 terparts, primarily segmented by cohorts, study subject, and friendship circles formed during high school. In the first part, I address each dimension separately, based on segregation measures, and finally reassess their effect in a multivariate model that combines different tie-formation mechanisms.

4.5.1 Assortativity Homophily belongs to the most robust findings in the domain of social networks. For many types of social relations, it is more likely that a tie is formed assorta- tively between similar entities than dissimilar ones (cf. McPherson et al. 2001). The term homophily often refers to the outcome as well as the process that gener- ates it. I will use the term “mixing pattern” for the outcome and restrict the use of the term “homophily” for a selection process based on a salient social attribute. Since this is a cross-sectional analysis, I do not observe the process but only the outcome. Even if attributes in my data, such as gender, cohort, study subject, and provenience are sometimes used to describe homophilous networks, my analysis does not suggest that students actively select on these dimensions. To simplify the proceeding, I introduce some notation following Bojanowski and Corten (2014, 16). In addition to the set of nodes N = {1, 2, ..., N} and the N× N adjacency matrix Ai j, we partition nodes into K mutually exclusive groups G = {G1, ..., Gk, ..., GK}, where h and g will be used to indicates two groups of G. The partition of nodes into groups is formalized by a type vector t = [t1, ..., ti, ..., tN ] with ti ∈ {1, ..., K}. Another way to represent group membership is the type indicator vector for group k as vk = [v1, ..., vi, ..., vN ] with vi ∈ {0, 1}. By binding the type indicator vectors column-wise, we receive an N ×K type indicator matrix I. With this, we can define the mixing matrix M as a distribution of dyads across types as

X X T M = mgh = ai j = I AI (4.10)

i∈Gg j∈Gh where IT is the transpose of the indicator matrix, and the mixing matrix itself can easily be computed by linear algebra. Figure 4.8 depicts a graphical representation of the cohort mixing matrix for ETH Zurich, where the graph is partitioned into seven entry cohorts (2001-2007) and the density is indicated by the shade of the cell. In a perfectly segregated network, where nodes only maintain within-group ties, the density would concen- trate in the diagonal. Two things can be seen in the figure: First, not all cohorts maintain the same number of ties. In fact, the participation rate of the younger cohorts is much higher than that of the older cohorts, which, in turn, results in 136 Online Collegiate Networks

2001 2002 2003 2004 2005 2006 2007

2001

2002

2003

2004

2005

2006

2007

Figure 4.8: Cohort Mixing Matrix of ETH Zurich lower number of edges. Second, younger cohorts primarily maintain ties with their own and neighboring cohorts, whereas older cohorts are less constrained by their own age group.13 The cohort mixing pattern is also clearly apparent from the graph visualization in Figure 4.2 in Section 4.3.2, where nodes are placed by a random force-based layout, and where neighboring cohorts end up being placed side by side, forming a “chain” of entry cohorts. To evaluate assortative mixing on the graph level, I use the assortativity co- efficient proposed by Newman (2003a) and Newman and Girvan (2003). For interval-scaled attributes, Newman suggested measuring assortativity with a Pear- son correlation coefficient of the attributes at either side of the edge. For nominal- scaled attributes he defined the assortativity coefficient as

13 Note that a proper mixing matrix of an undirected graph only includes the upper (or lower) triangle plus the diagonal. The mixing matrix in Figure 4.8 is symmetrized for purely aesthetical reasons. Furthermore, I only consider the contact layer of the mixing matrix, i.e., the pattern of existing ties in a network. See Koehly et al. (2004) or Bojanowski and Corten (2014) for a thorough discussion. The pattern that students primarily maintain ties with their cohort and with an exponentially decaying probability, with neighboring cohorts is also present in the other four networks and in the FB100 ensemble. However, ETH exhibits the highest cohort assortativity in the StudiVz ensemble, whereas in the cantonal universities, where instruction is less predetermined, contact across age group is more prevalent. Strong assortativity is a predominant pattern on all educational levels (cf. Newman 2010, p. 227 based on data from Bearman et al. 2004), however, I find it surprising that it is still so persistent on the university level. Mixing Patterns 137

PK PK pgg − pg+ p+g g=1 g=1 ∈ ≈ − AC = PK [ 1, 1] (4.11) 1 − g=1 pg+ p+g where pgh is the mixing matrix transformed to proportions, and + indicates sum- mation over the particular subscript. The assortativity coefficient is identical to Cohen’s Kappa for inter-rater agreement and measures the relative weight of the diagonal. While it reaches a maximum of 1 for perfectly assortative mixing, the minimum value of -1 for perfect disassortativity depends on the relative number of ties in each group.14 The lower part of Table 4.4 shows assortativity coefficients calculated for all five networks across all available attributes. Again, I have performed for every attribute in every network extensive permutation tests, which indicate that the ob- served assortativity coefficients are above all random draws. Note that edges with missing values are excluded from this analysis. Universities typically segregate the strongest with regards to cohort and birth year. Cohort exhibits lower assor- tativity than birth year since students from different educational routes can join a cohort, but students of the same birth year typically form the majority in a cohort. The second-ranked dimension is major, which is nested in faculty. The latter is always higher than the former, indicating that cross-major ties tend to stay within the same faculty. The third-ranked dimension is shared high school. However, here we see differences with regards to the type of university. UBE, UBS, and UZH are cantonal universities that attract students from within their respective regions, while USG and ETH recruit more widely and internationally. USG is a business school with many foreign students. ETH is one of two places in Switzer- land that offers higher education in engineering. So it comes as no surprise that high school is less tie-determining in these two cases. Integration of foreigners seems to work best at University of Bern. However, the reason for low assortativ- ity on the foreigner attribute is primarily due to its low rate of foreign students of 9%. At USG with a rate of roughly 30% students are given a much higher chance to segregate. For the attribute geo-circle, which roughly codes the distance of origin to the university, we see essentially the same pattern as for the attribute high school. Finally there is gender, which exhibits the lowest assortativity of all attributes on the list. Assortativity coefficients clearly recover the opportunity structure of the uni- versities. All coefficients are highly significant with regards to permutation test. Furthermore, the ranking of the dimension is similar as in the FB100 ensemble,

14 See Bojanowski and Corten (2014) for a detailed discussion of the assortativity coefficient, includ- ing the proper minimum value and discussion of the fact, that AC = 0 for K > 2 does not automatically imply proportionate mixing. 138 Online Collegiate Networks but, and this is surprising, assortativity with regards to the opportunity structure are significantly higher in Switzerland compared to the United States. Assortativity is a graph-level metric but gives no indication on the group level, or, in other words, if all categories within an attribute are selective to the same extent. Following McPherson et al. (2001) and Currarini et al. (2010), we can distinguish baseline homophily from inbreeding homophily. Let Nt denote the number of type t individuals in the population and wt = Nt/N the relative fraction of type t in the network. Let st denote the average number of ties that nodes of type t have with same-type nodes and dt the average number of cross-category ties. Then, the homophily index is simply defined as Ht = st/(st + dt). In case Ht increases with relative group size, then relative homophily is satisfied, i.e., when wg > wh also implies Hg > Hh. We say the network exhibits baseline ho- mophily with regard to an attribute if for all t, Ht = wt. Inbreeding homophily is given if Ht > wt, while heterophily is indicated by Ht < wt. Finally, to relate the homophily index of a group to the potential extent to which the group can be bi- ased, we have to normalize the index in a proper way and arrive at the segregation measure proposed by Coleman (1958).15 It is defined as

Ht − wt IHt = . (4.12) 1 − wt This exercise pursues the goal of identifying group level differences in ho- mophily. Differential homophily is only given if we find IHg , IHh. With Blau (1977) and contact theory, we expect that smaller groups are forced to form more cross-group friendships (dt) on a per capita basis. Second, based on findings from the GSS, we expect larger groups form more relationships (cf. Marsden 1987). And third, all groups have a tendency to form same-type relationships at rates that exceed their relative fraction in the population, or, in other words, they inbreed (IHt > 0). Table 4.5 shows gender homophily indices for UZH and ETH. The main cam- puses of both Zurich universities are located right next to each other. At UZH, the relative fraction of males and females is slightly in favor of women, with a share

15 Note that there are several issues with my application of Coleman’s measure. First, by having fol- lowed Currarini et al. (2009) and for demonstration purposes, I dropped the small-sample correction (N-1). Second, Equation 4.12 only refers to the case of homophily, not heterophily; the latter needs minor adjustment. However, the calculations in Table 4.5 are based on the full specification accord- ing to Bojanowski and Corten (2014, 25). Third, Coleman’s measure is designed for segregation of choices and, therefore, is a measure for directed networks. Although there are no technical reasons to use Coleman’s segregation index for undirected graphs, Bojanowski and Corten discuss a conceptual one: In undirected networks it requires the consent of both actors, however, Coleman’s measure is based on independent choices. Therefore, they recommend caution in the interpretation of Coleman’s index for undirected networks. Mixing Patterns 139

Table 4.5: Inbreeding Homophily by Gender

UZH ETH Male Female Male Female H N 5623 6804 5458 2279 Within-category ties st 12.63 12.49 15.73 10.50 Cross-category ties dt 10.49 8.69 5.96 14.27 + Ties per capita st + dt 23.12 21.16 21.69 24.78 -

Proportions wt 0.452 0.548 0.705 0.295 Homophily Ht 0.546 0.590 0.725 0.424 Inbreeding Homophily Ht 0.171 0.095 0.068 0.184 + Assortativity AC 0.137 0.150 of 55%, while ETH is a man’s world with 70.5% male students. The assortativ- ity coefficient is somewhat lower for UZH than for ETH. However, by taking a closer look at the groups, a more detailed pattern becomes apparent. In fact, “mi- norities” form more cross-category ties, and this is also true for the other three universities in my ensemble. However, the minority does not form fewer ties, as we would expect following Marsden (1987). This is again systematic for all five universities. It looks like the minority group tries to compensate for the lack of availability on campus by an increased online activity. Of course, this claim needs further investigation. Finally, all groups inbreed. However, they inbreed differentially, where the minority always exhibits higher rates of own-category choices than the majority does. I will not repeat this analysis for all other attributes. However, gender ho- mophily is special with regard to minority inbreeding. While students of the same sexes seem to form support networks within universities if they find them- selves in a minority position, the exact opposite is the case for all other attributes. There, typically the majority inbreeds with itself the strongest (not reported here). This corroborates a conjecture by Blau (1977) that as in-group size increases, out-group contact decreases, or in other words, inbreeding homophily tends to increase in heterogeneity and form an inverse u-shape over the group heterogene- ity continuum between 0 and 1. This pattern has been analyzed by Moody (2001, 699) in great detail and was corroborated by Currarini et al. (2009, 2010), who also provide a preference-based model to explain this biased matching process.

4.5.2 Multi-Category Mixing In the last step of this paper, I integrate several aspects of this paper into a single exponential-family random graph model. The model closely follows the specifi- 140 Online Collegiate Networks Coefficients for ERG Models, plotted for 22 Faculties

Sociality Coefficients Gender Male Age Cohort 2001 ● Cohort 2002 ● Cohort 2003 Cohort 2004 ● Cohort 2005 Cohort 2006 Circle 2 ● Circle 3

Gender ● ● Selective Mixing Coefficients Cohort 2001 ● Cohort 2002 ● Cohort 2003 Cohort 2004 ● ● Cohort 2005 Cohort 2006 ● ● Cohort 2007 ●● Subject Circle 1 ●● Circle 2 Circle 3 ● School

GWESP ● Triadic Closure Coefficient

0 1 2 3

Figure 4.9: Coefficients from Exponential Random Graph Models cation of Goodreau et al. (2009) and includes three types of effects. First, there are sociality coefficients that act as intercepts (fixed effects) for the number of edges within gender, age, cohort, and geo-circle. Female, cohort 2007 and geo- circle 1 are reference categories, and coefficients can be interpreted as conditional log-odds. Age in year enters linearity and is added because StudiVz was popular primarily among younger students. This should correct for this fact. Second, I model differential homophily (selective mixing) by adding one statistic for each category of each attribute of interest. Third, a statistic that captures triangles (ge- ometrically edge-wise shared partners) is included to control for triadic closure. Further details on model specification, and the reason why standard errors are not reported here, can be found in Appendix 4.7.5. The five StudiVz networks with more than 106 potential edges are too big for ERG modelling. Instead, models of 22 selected faculties, having at least 100 users, are estimated. Results are shown in Figure 4.9, where the variance in point Discussion 141 estimates are shown by means of box plots. Controlled for age and compared to the cohort of 2007, which just entered university, all other cohorts exhibit higher sociality that increases with time spent at university. This mirrors the fact that they had more time to become acquainted with fellow students compared to the newly arrived students. However, for the last two cohorts the trend is turning. Students, who completed high school in cantons that is not home of the university, typically have a lower degree. The same holds true for foreign students. However, effects are close to zero. And there is also no apparent difference in gender with regard to sociality. With only very few exceptions, all selective mixing coefficients are positive which replicates findings from the previous section. Interestingly, we also see the cohort mixing matrix from the previous section recovered, where the youngest cohort is primarily concerned with itself, and inbreeding slowly declines with increasing university exposure. Similar as in sociality effects, there is “u-shaped” dependency for the sequence of cohorts. To really appreciate the cohort pattern one needs to visually compare my results with Goodreau et al. (2009, 113), who analyze adolescent social networks based on a sample of 59 schools from the AddHealth study. Their effects from selective cohort mixing are stronger but exhibit the identical u-shaped pattern for both sociality and mixing. Apart from cohort, we again see strong selective mixing due to study subject and shared high school. However, compared to the univariate analysis, gender mixing is virtually nonexistent in the ERG models. And the geo-spatial indica- tors change order for inbreeding. While Coleman’s segregation index typically indicates highest inbreeding for geo-circle 1, the multivariate analysis indicates the opposite. This suggest that models which incorporate multiple attributes and multiple mechanisms are, in fact, necessary to address and estimate homophilous tendencies hidden in nested and correlated attributes.

4.6 Discussion

In this article, I have analyzed an ensemble of online social networks that corre- sponds to five Swiss universities. The data for this study has been collected by means of snowball-crawling the public profiles of the SNS StudiVz. StudiVz was at the time of the data collection a very popular online social network at most German-speaking universities in Austria, Switzerland, and Germany. A compar- ison with the official enrollment statistic indicates that between 60-85% of all students maintained a profile there. Analytical results from random graph the- ory as well as simulations suggest that the snowball-samples reached saturation. In total, three snapshots of profiles, list of friends, guest-book entries, common 142 Online Collegiate Networks interest group memberships as well as the network of picture friends have been taken. Even though StudiVz is a multiplex network with different layers, I have concentrated on the friendship network and used all other layers only for a test of the theory on the Strength of Weak Ties. Following a simple random growth model from economics and statistical physics, I show that most of the stylized facts that these models predict can be observed in the networks. The results of the analysis can be summarized as follows: In a first step I have fitted well-known degree distribution models to the data and found degrees to follow approximately an exponential distribution. In the model of Jackson and Rogers (2007) this corresponds to a meeting process that is primarily determined by pure random attachment. The identification of a ran- dom graph process is neither very surprising nor very exciting. The interesting finding with regards to the degree distribution is the similarity within and across ensembles. All five StudiVz networks and many cases in the Facebook network exhibit a beautifully linear pattern on a log-linear plot. However, there are also exceptions. Typically, very few outlying hubs that seem to follow another regime can be identified. These few outlying hubs were also reason enough for schol- ars to fit heavy tailed distributions to these data even if visual inspection already indicates that neither a heavy tail exist nor a substantial part of the distribution follows the scale-free pattern. I have chosen to describe the 95% of users from below instead of the 5% from above. Furthermore, my results are consistent with findings from Corten (2012) and Gjorka et al. (2010), which clearly reject the scale-free hypothesis for their friendship graphs. In a second step, I have computed metrics that prove the Small World prop- erty and performed a simple yet conclusive test of the Strength of Weak Ties theory. The level of measured transitivity is far above random level, and in the StudiVz dataset higher than in the FB100 dataset. The same thing holds true for the topological overlap in edges. The negative degree-clustering correlation shows a similar decay as in Corten (2012). The two ensembles are in this and in many other graph-level metrics almost indistinguishable. In this section, the JR model or any random attachment model does, however, not perform well. Given a process without network meetings, the JR model does not explain how transitivity emerges. Third, the key prediction of a growing random graph is present. If in every discrete time step a new node arrives that attaches to m existing nodes, older nodes need to have a higher degree and degrees of neighboring nodes needs to be cor- related. While the second property (degree-degree correlation) can be observed in both ensembles, the FB100 lacks a proper node-age indicator to compare my results with. It seems there exists no previous empirical evidence on this essential property, at least none that I am aware of. Therefore simulations were performed Discussion 143 that suggest a very strong age dependency of r ≈ 0.8. In the StudiVz data this correlation is only between 0.34-0.43, which leaves, unfortunately, much room for speculation on how we have to interpret this result. In step four, univariate mixing patterns nicely replicate the opportunity struc- ture common to many educational systems. Mixing patterns are primarily deter- mined by cohort, age, and the study subjects wherein students typically meet and interact. To a somewhat lesser extent, associations from past educational levels remain intact, which can be seen from assortativity coefficients in the attributes high school but also geo-spatial indicators. A multivariate analysis of mixing pattern has been performed by means of exponential random graph models. In a model that includes effects from sociality, selective mixing, and triadic closure, the main findings from the univariate analysis stay intact. This paper has provided a comparative analysis of online collegiate networks in two different aspects. I have compared the student population on the online platform with the actual enrollment statistics and found good agreement even though systematic variation in the participation rates exists. Second, I compared findings within an ensemble of five universities and found an almost identical pic- ture in terms of the global network structure as well as with regards to the inner workings. Again, good agreement also means that we recover expected differ- ences. The business school in St. Gallen, for example, has a far higher number of foreign students and only provides few study programs. Such differences from the cantonal universities translate and condense into the mixing patterns. Finally, I have compared the StudiVz ensemble with a set of online collegiate networks from Facebook and find good agreement with regard to global structural features. It turns out that all those online collegiate networks are remarkably similar and that the predictions from simple random graph models can give a quite accu- rate picture of the structures surrounding students. The online world provides a lens through which social scientists can gain insights into the macro-structure of universities and upscale the findings from classroom sociometry not only to the lecture hall but to an entire campus. The richness of the data is astounding, and in this article we only have scratched the surface. 144 Online Collegiate Networks 4.7 Appendix

Figure 4.10 shows a profile page of StudiVz at the time of the data collection. In Section 4.7.1, I present the snowball-sampling procedure together with an eval- uation of its success. Section 4.7.2 compares the number of StudiVz user with official enrollment counts. In Section 4.7.4, I discuss the statistical distributions for degree distributions and comment on the claim of “scale-free” online colle- giate networks. In Section 4.7.4, the Facebook 100 ensemble is presented and compared with the StudiVz ensemble. Section 4.7.5 finally documents specifica- tions of the exponential random graph models.

4.7.1 Snowball Sampling in Online Social Networks Crawling techniques can be roughly distinguished into random walks and graph traversal techniques. Gjorka et al. (2010) sampled Facebook profiles based on dif- ferent random walk procedures. A simple saturation snowball sampling approach with a breadth-first search (BFS) is performed by starting from a seed node which then iteratively inspects neighboring nodes within the university until all nodes in the networks have been visited. With BFS, only nodes within the connected com- ponent can be extracted that also include the seed node. However, the observed mean degree and results from random graph theory suggests that saturation is en- sured. Perhaps the single most studied phenomenon in the field of random graph theory is the formation of the largest component in a Bernoulli graph Gn,δ with n vertices, where each edge appears independently with probability δ. At the mean degree u when δ = u/n for u near 1, the random graph undergoes a phase tran- sition where a giant component starts to form. The size of the giant component follows a transcendental equation S = 1 − e−uS first mentioned by Solomonoff and Rapoport (1951, 113), but now mostly attributed to Erdos˝ and Rényi (1959). For u < 1, clusters are typically of size log(n). For u > 1, the giant component will include a large fraction up to the entire network. In social networks, the size of the giant component is downward biased by transitivity. Newman (2009) pro- vides an analytical solution for clustered networks, S = 1 − e−[uS +vS (2−S )], where u is again the mean degree and v corresponds to the clustering coefficient. With or without transitivity, the tail component, given the observed mean degree, is expected to be smaller than 0.001%. For growing random graphs, saturation seems even more easily obtained. Callaway et al. (2001) propose a model of a growing random graph where a node is added in each time step and then two nodes are picked at random and connected with probability δ. Compared to the static Bernoulli graph, it takes only half as many edges to produce a giant component. However, these results strongly de- Appendix 145

Figure 4.10: StudiVz Profile Page 146 Online Collegiate Networks pend on the growth process. Zalányi et al. (2003) modify the model of Callaway and colleagues and only allow newborn nodes to connect to k existing ones with probability δ. This corresponds to the random-only meeting process in the JR model discussed in Section 4.4. Independent of δ, with k = 1 the graph does not form a giant component at all. Nevertheless, with increasing δ and k  2 the giant component will capture most of the nodes. Since this model represents the worst case scenario, I performed numerical simulations with parameters taken from the ETH network (n = 7741, m = 11). Based on these simulation results, one would expect a tail component not reached by BFS of about 4.4%. This is large, compared to the Bernoulli or Callaway benchmarks but small compared with other sources of bias such as, for example, low participation rates. Luckily, the FB100 ensemble allows an empirical investigation since Traud et al. (2011) received the complete data directly from Facebook. There, the aver- age size of the tail component is 0.13% which indicates a good agreement with random graph theory and suggests that only very few nodes are missing from my observation.

4.7.2 Enrollment Count Reports Table 4.6 shows a comparison with official enrollment statistics of the five uni- versities. N1 corresponds to the number of students including post-graduates and students in advanced training. N2 corresponds to the head count of Bachelor’s and Master’s students. n is the observed number of profiles on StudiVz. Female shows the proportion of females in the official reports and on StudiVz. Canton is the proportion of students living in the canton in which the university is located. Foreign is the proportion of Non-Swiss students. Rate is a lower and upper bound for the StudiVz participation rate in January 2008, i.e., n/N1 to n/N2.

4.7.3 Statistical Models for Degree Distributions Here I introduce the statistical distributions of degree distributions for growing social networks discussed in this paper. Besides the JR model, I also fit the ex- ponential, the Weibull, the negative binomial, and the Pareto distribution. Neg- ative binomial fits to the StudiVz data clearly indicate overdispersion (θ > 1) and departure from the static Bernoulli random graphs. The latter would follow a Poisson. This rules out the random graph model of Erdos˝ and Rényi as a sen- sible null model. Therefore, two other distributions with heavier tails are fitted, in particular the exponential that corresponds to the limiting distribution of pure random growth in the JR model and also corresponds to the generative models of Callaway et al. (2001) and Zalányi et al. (2003). In the following we focus Appendix 147 08 / n Female Canton Foreign Rate 83987284 51.3%3981 53.9% 48.8% 29.6%7743 48.1% 9.1% 21.6% 29.4% 18.8% 64.0-76.5% 26.2% 65.1-84.3% 29.1% 66.7-91.1% 18.1% 58.7-77.5% 12442 54.8% 46.9% 14.7% 51.4-62.3% cial Enrollment Count Reports of 2007 ffi cial Statistic StudiVz Data Female Canton Foreign ffi O 2 N 1 Table 4.6: Comparison with O N UBEUBS 13129USG 11192 10977UZH 5970 8639 51.9% 24195 4369 44.6% 48.3% 19985 28.9% 51.9% 55.4% 9.1% 12.6% 41.3% 18.8% 29.6% 12.4% ETH 13197 9985 29.8% 21.9% 17.1% 148 Online Collegiate Networks on the continuous and not the discrete versions. The Yule-Simon distribution is the discrete analogue of the Pareto distribution, ust as the geometric distribution corresponds to the exponential. Furthermore, the geometric is nested in the neg- ative binomial in a similar way that the exponential is nested in the Weibull. All results below have also been tested for the discrete and discretized versions of the candidate distributions. I have not found any qualitatively different conclusion.16 The probability density of the exponential distribution is

f (x, λ) = λe−λx for x ≥ 0 (4.13) with the rate parameter λ. The Weibull distribution has the probability density

k−1 k  x  k f (x, λ, k) = e−(x/λ) for x ≥ 0 (4.14) λ λ where k is the shape parameter and λ is the scale parameter. The Weibull distri- bution and its underlying random graph model will not be discussed in detail. It will be sufficient to know that it is the limiting distribution for a sublinear pref- erential attachment process, i.e., where at every time step a new node attaches to every existing vertex independently, with a probability proportional to a concave function of its current degree (cf. Krapivsky et al. 2000 and Newman et al. 2002 for an application in e-mail networks). Important is that it nests or relates to sev- eral other candidates and helps to rule out alternatives. For instance, it includes the exponential distribution as a special case for k = 1. If we find significant departure from k = 1, we can reject the exponential hypothesis. The cumulative density of the Weibull is

k F(x, λ, k) = 1 − e−(x/λ) (4.15) and its survival function (CCDF) is a stretched exponential function. It can be shown with a minor reparametrization of the Weibull that the Pareto distribution – the distribution for pure network meetings or preferential attachment – is a limit case of the Weibull for k → 0 (cf. Sornette 2006, 191-192). Formally, the Pareto is not nested in the Weibull, but can be approximated with almost arbitrary precision. So instead of using the complicated likelihood of the JR model and fit r ∈ [0, ∞], one can simply fit Weibull’s shape k ∈ [0, 1] where values very close to 0 indicate Paretian behavior and values close to 1 suggest an exponential. Finally, the probability density of the Pareto (Type 1) distribution is

αxα f (x, α) = min for x ≥ x (4.16) xα−1 min 16 All fits have been performed with R and the packages MASS, fitdistrplus, VGAM, the code from Clauset et al. 2009 for the discretized Weibull accompanying their paper. Appendix 149 where α is the Pareto index and xmin is a scale parameter or the lower cutoff. α The survival function of the Pareto distribution is simply 1 − F(x, α) = (xmin/x) . Given xmin = 1, one can expect to see a straight line on a plot with double- logarithmic axis (log-log plot). It is also easy to see that if X is Pareto distributed, then Z = log(X/xmin) will follow an exponential with rate α. The corresponding signature of the exponential is then simply a straight line on a semi-logarithmic plot (or log-lin plot). With the pdf in Equation 4.16 we state that in a scale- free network the probability of node i having a degree xi follows a power law f (x) = Cx−γ where γ = α + 1 and C is a normalizing constant that includes xmin and γ is typically found to be in the range 2 ≤ γ ≤ 3. This is the degree distribution for the preferential attachment model by Barabási and Albert (1999) (BA), a model for undirected graphs, which, in turn, is a special case of a more general model for directed graphs by Price (1976). See Newman (2005) who provides an extensive discussion. Starting with the BA model, fitting power laws to the upper tail of degree dis- tributions has become a very popular undertaking in the network research litera- ture. See Stumpf and Porter (2012) for a recent comment on this line of research, consult Handcock and Jones (2004) and Clauset et al. (2009) for a discussion of methods to test the power law hypothesis. I revisit the claim of scale-free online collegiate networks of Hu and Guo (2012) since they also rely on the FB100 en- semble. The dataset itself is introduced in more detail in the next section. We are taking a closer look at the degree distribution of two special cases, Harvard and Columbia University. The authors have analyzed the 40 universities in the FB100 ensemble and follow the procedure suggested by Clauset and colleagues to fit the Pareto distri- butions to the degrees xi above some xmin. Hu and Guo (2012, 14) write: “We demonstrate that half of the networks cannot reasonably be considered to follow power law, while for another half the power-law hypothesis appears to be a good one.” If they had taken the entire ensemble of 100 networks, their claim would reduce to 20% support for the power-law hypothesis. Additionally, a closer in- spection of these 20% reveals multiscaling, i.e., nodes seem to be of different types or could adhere to different underlying micro-mechanisms. In Figure 4.11.A one can see on a log-linear plot that Harvard University closely follows an exponential and is far from being Pareto distributed. This also holds true for many other notable cases like MIT, Princeton, or Caltech. These cases closely resemble the StudiVz ensemble even if the mean degree in these networks is much higher than in their Swiss counterparts. According to Hu and Guo’s results, Columbia University is the case where the “power-law hypothesis” finds the best support for xmin ≥ 200. They fit the Pareto to the upper tail (k ≥ 200), where the linear pattern on the log-log plot is present (Panel b 150 Online Collegiate Networks

A) Harvard (LogLin) B) Columbia (LogLog) 0 0 ●● ● ● ●●●●●● ●●●● ●●●●●●●●●●●●●●●●● 10 ● 10 ●●● ●●●●● ●●●●●●● ●●●●● ●●●●● ●●●● ●●●● ●●●● ●●● ●●●● ●●● ●●● ●●● −1 ● ● ● −1 ● ●●● ●● ●●●● ●● ●●● ●● 10 ● ● ●● 10 ● ●●● ●● ●●● ●● ●●● ●● ●●●● ●● ●●● ●● −2 ●● ● −2 ● ●●● ●● ●●●● ●● ●●● ●● 10 ●● ● ●● 10 ● ●●●● ●● ●●● ● ●● ●● ●●● ●● −3 ●● ● ● −3 ● ●● ●● ●● ●● 10 ●● ● ● 10 ●

Probability Density Probability Density ● ●● ● ● ● ● −4 ● ● −4 ● 10 10

0 200 400 600 800 1200 100 101 102 103

Number of Friends Number of Friends

C) Columbia (LogLin) D) Columbia Upper Tail (LogLog) 0 0 ● ●● ●● ●●● 10 ● 10 ● ● ●●● ● ●●● ● ●●● ● ●●● ● ●●● ● ●●● −1 ● ●● ● ●●● ● ●●● ● ●●● 10 ● ●● ● −1 ● ● ●●● ● ●● ● ●●● ● 10 ●● ●● ●● −2 ● ● ●● ●● ●● ● ●● ●● 10 ● ● ●● ● ● ●● ●● ● ● ●● ● ● −2 ●●● ● −3 ●● ● ●● ●

● 10 ● ● 10 ● Probability Density ● ● Probability Density ● ● ● ● −4 ● ● 10

0 500 1500 2500 3500 103

Number of Friends Number of Friends (k>200)

Figure 4.11: Degree Distributions of Harvard and Columbia University above the red line). In doing this, they restrict their sample to 6.2% of the set of nodes and to about 25% of the set of edges, and obviously test the hypothesis on a very specific subset of members. In this small subsample they compare fits of different distributions and come to the conclusion that the single parameter Pareto has the best support. However, their argument critically depends on the lower cutoff xmin. Instead of finding a xmin, where above the distribution is best described as a power law, we can alternatively search a critical value where below the Weibull significantly better explains the data. Malevergne et al. (2005) prove (only in the Arxiv preprint version) that even though the two distributions are not nested, Wilks’s theorem still applies and a simple χ2 likelihood ratio test can be performed. For the Columbia University, the cutoff needs to be lowered to xmin = Appendix 151

A) Boxplot of Exponential's Rate C) Boxplot of Weibull's Shape

● ● 1.5 ● ● ● 1.4

0.025 ● ● ● ● 1.3 0.020 1.2 1.1 0.015 1.0 Shape Parameter of Weibull Shape Parameter Rate Parameter of Exponential Rate Parameter 0.9 0.010

Full Sample Student Sample Full Sample Student Sample

Figure 4.12: Exponential’s Rate and Weibull’s Shape for FB100 Samples

158 or to the 88th percentile from where the Weibull distribution significantly takes the lead. For the entire FB100 ensemble we have on average to consider more than the top 5.28% of nodes and the scale-free claim breaks down. In Figure 4.11.C, the degree distribution shows a kink roughly at x = 500 above which another regime seems to be in place. These kinks appear for many networks in the FB100 ensemble but not at the same position, so that I can rule out technical reasons. Finally, Panel D shows the upper tail region where Hu and Guo fit a straight line. I believe, however, that there is more structure to be seen. The network has obviously outlying hubs above x > 1500. The largest node in the network – this seems to be a student – is connected to 30% of all students at Columbia. Unfortunately, we know nothing about how and why this node established its position. But I suspect that the 27 popular nodes with x > 500 accumulate contacts based on different motivational grounds. Thus, the unity of node and tie definitions might be seriously violated for these few cases. About 80% of the networks in the FB100 ensemble have outlying hubs, typically fewer than 20 cases or about 0.15% of the set of nodes. Looking from below, one might be tempted to exclude them as statistical outliers. However, from a bird’s-eye view, these nodes determine processes running on the graphs to a large extent and their exclusion would change the topology substantially. Figure 4.12 shows a summary of rate and shape parameters from fits with the exponential and Weibull distributions by means of box plots. The two distribu- tions are in good agreement with the lower 95% of the degree distributions but fail to predict extreme values correctly. The Kolmogorov-Smirnov distribution 152 Online Collegiate Networks test always strongly rejects all the candidate distributions due to high statistical power. Since the exponential is nested in the Weibull, finding a shape parameter different from 1 rejects the former. In fact, the Bayesian Information Criterion is systematically in favor of the Weibull. Also note that the vast majority of the networks in the FB100 ensemble show a Weibull shape of k > 1. In the skewness continuum from Gaussian over exponential to Pareto, they fall in the interval be- tween Gaussian and exponential and, strictly speaking, out of the parameter space of the JR model. In the field of reliability and survival analysis a shape parameter above 1 indicates an increasing failure rate due to ageing. In the domain of social networks, this translates into the situation where, with increasing degree, it gets harder and harder to acquire more contacts. On the other hand, Weibull shapes k < 1 would suggest sublinear multiplicative processes such as mild forms of cu- mulative advantage that make the acquisition of additional edges easier, the more edges a node has already accumulated. However, having found k  0 and even k > 1 tells us that degree distributions in the FB100 ensemble do not follow a power law which refutes the claim of Hu and Guo (2012). Even if Hu and Guo follow best practices in statistical physics, I disagree that the single-parameter Pareto is a good description of online collegiate networks, especially those re- stricted to single institutions. Rarely, but from time to time, lex parsimoniae cuts both ways.

4.7.4 The Facebook 100 Dataset

The FB100 ensemble was published by Traud et al. (2011, 2012) and contains friendship networks of the first 100 universities that were present on Facebook. The datasets have been made available through the homepage of Mason A. Porter17. The networks consist of the complete sets of users, a limited set of sociodemographic variables, and the relations between users within each institu- tion. The snapshot was taken on a single day in September 2005. The data is from the early days of Facebook when access was still restricted to students of colleges and universities. Porter calls it the Facebook data of the Precambrian area to em- phasize that today’s topology of Facebook is very different from 2005. Compared to my StudiVz networks, the FB100 networks also contain alumni, graduate stu- dents, faculty members, and administrative staff. Everybody with a valid .edu e-mail address was allowed to sign up. In the following, I will restrict the net- works to the student subgraphs, which will primarily contain relations among undergraduate students.

17 http://arxiv.org/abs/1102.2166 Appendix 153

Density (x 1/10) ●●● ● ● ● Diameter (x 1/10) Shortest Path(x 1/10) Degree−Transitivity (x −1) ●● Transitivity (Global) ● ● Transitivity (Local) ● Degree Assortativity ●● Gender Assortativity ● ● Major Assortativity ●●●●●● Dormitory Assortativity ● ●● Cohort Assortativity High School Assortativity

0.0 0.2 0.4 0.6 0.8 1.0

Figure 4.13: Graph-Level Metrics of the FB100 Ensemble

The total dataset consist of 1,2 million users and 46.9 million relations. The subgraphs I summarize in Figure 4.13 are restricted to 0.96 million users and 32.2 million relations. The ensemble serves as baseline for comparison since random graph models typically lack certain features present in socially generated structures. Figure 4.13 summarizes available structural characteristics discussed in Sections 4.4 and 4.5. The metrics that are different from the StudiVz ensemble are briefly discussed. In Figure 4.13 the metrics density, diameter, and average shortest path length have been rescaled by a factor of ten, the degree-transitivity correlation has been multiplied by -1. The set of attributes is limited to gender, major, dormitory, cohort, and high school identifiers. Even though attribute data is volunteered by the users, they are surprisingly complete. Gender is observed in more than 93% of the nodes, major cohort and high school typically roughly at 85%, and only dormitory falls short with 60%. The first notable difference between the Facebook and StudiVz networks is a substantial gap in the mean degree and density of the graphs. While StudiVz users average about 26 contacts, on Facebook the median of the average number of contacts is 38. Accordingly, the network densities are much higher in the FB100 datasets, the median is at 0.011. Higher densities in FB100 graphs will have obvious but also nontrivial consequences. Adding more edges to the graphs obviously reduces the diameter and average shortest path lengths. Nodes in the ensemble are on average three degrees of separation away from each other (2.71) 154 Online Collegiate Networks while in the StudiVz networks one needs one step more on the shortest distance. The median of the diameter is 7 in FB100, while it is 9 in the StudiVz networks. The two measures for transitivity are very close together: medians of global and local transitivity are 0.167 and 0.236 on Facebook and 0.158 and 0.226 in the StudiVz samples. The same holds true for the inverse relationship of the degree to local transitivity. Medians of the correlations are -0.31 in FB100 and -0.34 on StudiVz. The second major difference between the FB100 and StudiVz ensemble is the level of assortativity that tends to be lower on most dimensions in the FB100 dataset. A non-parametric Mann-Whitney test that compares the graph-level as- sortativity between the two ensembles indicates significant differences in degree, gender, major and high school while the level of cohort is insignificantly higher in the FB100 dataset. The median values of assortativity in gender, major, and shared high school are all below 0.05 in the FB100 dataset. Only similarity in cohort and dormitory show substantial departure compared to random mixing. The StudiVz networks are far more strongly segmented by attributes: assortativ- ity by gender is 3 times, and major and high school are 6 times as high as in the FB100 sample. A part of it can be explained by the higher mean degree. The more random contacts a user adds to their list of friends, the less assortative a network will be. However, adding more ties per capita does not – as we have seen in Section 4.5 – automatically result in lower segregation measures since choices are typically made within the limit of the opportunity structure.

4.7.5 Exponential Random Graph Models The model specification and notation closely follows Goodreau et al. (2009). Let Yi j be a random variable for the tie between nodes i and j, with Yi j = 1 if the nodes share a tie and 0 otherwise. The ties are represented as an adjacency matrix Y of order n × n. For mutual friendship, Y is symmetric with an undefined diagonal. The ERG model specifies a probability for a set of ties in Y, nodes and attributes as:

PS  exp s=1 θszs(y) P(Y = y) = (4.17) κ where the term zs(y) includes a set of S network statistics, such as the number ties within a specific category, category combination, the number of triangles, etc. θS is the vector of parameter estimates. The denominator κ is a normalizing constant that sums over all possible networks with n nodes specified in the numerator. The number of possible combinations of ETH networks corresponds to 229957670, which makes it obvious that for Equation 4.17 maximum likelihood estimation Appendix 155 does not apply. Either standard errors are simulated by MCMCMLE, or one falls back to pseudo-likelihood techniques based on logistic regressions, which are, however, not well understood for this class of problems. My model includes terms for sociality, selective mixing, and triad closure. The terms for sociality and selective mixing have been described above. Triadic closure is modeled with the GWESP statistic, first proposed by Snijders et al. (2006) and re-parameterized by Hunter and Handcock (2006) to avoid common forms of degeneracy. It is defined as

n−2 τ X n −τio w = e 1 − 1 − e pi (4.18) i=1 where pi counts the number of actor pairs who are friends and have exactly i com- mon friends. Note that besides the counts, the statistic also includes a parameter τ that controls the geometric rate of decay in the effect of triadic closure on the tie probability. For the models reported in the main text, I finally fixed τ = 0.5 but the specific value of τ . As it turns out, the common triangle specifications are not well suited for my networks. Even if the specifications do not produce degen- erate networks, and proper MCMC standard errors are obtained, simulations for goodness-of-fit do not adequately recover the observed networks. I conclude that these standard errors should not be trusted, and results should be interpreted with caution. Standard errors will not be reported here but are available upon request. The box plots in Figure 4.17 give a good impression of which covariates tend in some models not to be significant. In 6 out 22 faculties there is no significant gender effect for sociality. In 5 out of 22 faculties selective mixing in geo-circle-1 is not significant. All other sociality and mixing coefficients are supposed to be highly significant, and inclusion or exclusion of the triangle statistic does change point estimates of sociality and mixing coefficients to a large extent. Finally, Table 4.7 shows point estimates for all five universities. 156 Online Collegiate Networks

Table 4.7: Logistic Network Regression Models (MPLE)

UBE UBS USG UZH ETH Edges -5.775 -6.064 -5.193 -6.249 -5.775 Gender Male 0.191 0.112 -0.121 0.121 0.191 Age -0.451 -0.339 -0.327 -0.454 -0.451 Cohort 2001 0.654 0.583 0.588 0.615 0.654 2002 0.818 0.852 0.813 0.752 0.818 2003 0.875 0.919 0.935 0.847 0.875 2004 0.911 0.905 1.027 0.884 0.911 2005 0.806 0.838 0.905 0.857 0.806 2006 0.634 0.642 0.638 0.571 0.634 Geo Circle 2 -0.096 -0.114 0.036 -0.162 -0.096 Geo Circle 3 -0.354 -0.475 0.018 -0.420 -0.354 Gender 0.251 0.235 0.160 0.222 0.251 Cohort 2001 1.695 1.775 1.079 1.512 1.695 2002 1.390 1.468 1.390 1.545 1.390 2003 1.396 1.217 1.171 1.277 1.396 2004 1.245 1.242 1.167 1.255 1.245 2005 1.290 1.335 1.135 1.098 1.290 2006 1.515 1.597 1.382 1.500 1.515 2007 2.269 2.329 1.940 2.132 2.269 Geo Circle 1 0.101 0.165 0.234 0.212 0.101 Geo Circle 2 0.670 0.618 0.426 0.547 0.670 Geo Circle 3 1.624 1.407 0.783 1.538 1.624 High School 2.637 2.443 2.400 2.904 2.637 Major Subject 2.273 2.412 0.324 2.266 2.273 Chapter 5

Personality on Social Network Sites: An Application of the Five Factor Model

Abstract: In this paper we explore how individual personality characteristics influence online social networking behavior. We use data from an online survey with 1560 respondents from a major Swiss technical university and their corresponding online profiles and friendship networks on a popular Social Network Site (SNS). Apart from sociodemographic variables and questions about SNS usage, we collected survey data on personality traits with a short question inventory of the five-factor personality model (BFI-15). We show how these psychological network antecedents influence participation, adoption time, nodal degree and ego-network growth over a period of 4 months on the networking platform. Statistical analysis with overdispersed degree distribution models identifies extraversion as a major driving force in the tie-formation process. We find a counter-intuitive positive effect for neuroticism, a negative influence for conscientiousness and no effects for openness and agreeableness.

Keywords: Online social networks, personality, big five, degree distribu- tions

† In memory of Elisabeth Coutts († August 5, 2009). We thank Elisabeth Coutts, Debra Hevenstone, Ben Jann and Wojtek Przepiorka for helpful comments and several students for their participation in the data collection process, especially Marcel Breu, Michael Müri, Marc Osswald, and Florian Schlapbach. 158 Personality on Social Network Sites 5.1 Introduction

How does an individual set of attributes like personality affect the formation of relationships and individuals’ positions in social networks? Some people seem to accumulate a large number of acquaintances while others have close circles. High variability in the number of contacts is a frequently observed feature of so- cial networks. Especially in the recent network literature, empirical estimation of the exact functional form of such skewed degree distributions has received con- siderable attention (cf. Newman et al. 2001, Newman 2003b, Handcock and Jones 2004). The sources of this excess variability are best illustrated with violations of the independence assumption of the underlying random tie-formation process. A random network exhibits a Poisson degree distribution with the imposed restric- tion that the conditional variance equals the conditional mean. Higher variance arises in statistical terms due to either true or spurious contagion (Cameron and Trivedi 1986). In the first case, all individuals initially have the same probability of forming ties, but the probability of successive ties depends on prior occur- rences. Cumulative advantage, for example, is a well-known related candidate for true (but pure) contagion exhibiting heaviest forms of overdispersion. On the other hand, according to the spurious contagion model individuals are assumed to have constant but unequal probabilities of forming ties. In this case, the major source of overdispersion would be simply population heterogeneity. Unfortu- nately, it has been shown that the two violations can result in the same outcome distribution, therefore making them impossible to distinguish with cross-sectional data. In the following, we will explore the second route – individual heterogeneity in the propensity to form ties – by incorporating a potentially stable and exoge- nous source of variation: personality. Note that this approach is similar in de- sign to the intrinsic fitness model of Bianconi and Barabási (2001) and to certain threshold models from classical diffusion research (Coleman 1964). Apart from the degree distribution, social networks are claimed to share at least three other empirical regularities: short average path lengths, high transitivity (clustering) and positive assortative mixing. The appearance of the first two is also known as the small-world effect (Watts 1999). We will briefly discuss these metrics and show that our network also has these characteristics. Traditionally, network theorists have devoted much of their attention to the consequences of networks and how the behavior of individuals depends on their location in the network. Individuals occupying central positions and having denser or wider reaching networks may gain faster access to information and assistance (Borgatti and Foster 2003). Another view on the causal ordering asks about how individuals’ characteristics influence social structure. Scholars have Introduction 159 mainly relied on two micro-level predictors of network tie-formation: spatial proximity and similarity in terms of personal attributes like age, race, gender, values and education (McPherson et al. 2001). Only recently, a special interest in interaction between personality traits and network positions has emerged. Person- ality traits that predispose people to socialize, such as extraversion or openness to experience, might foster and accelerate tie-formation in social networks while others like neuroticism constrain individuals from creating ties. Mehra et al. (2001), for example, found that high self-monitors – people who are concerned about how they are perceived by others – occupied more central positions in the friendship network of a high-tech company. Burt et al. (1998) showed how entrepreneurial personality characteristics are correlated with net- work constraint and bridging of structural holes. While these authors applied very specific and narrow conceptions of personality, others rely on a more com- prehensive instrument: the five-factor model (FFM, Goldberg 1990). The FFM now seems to be the de facto standard in personality research. Klein et al. (2004) found negative effects on centrality from neuroticism and openness, and a positive effect of agreeableness in friendship networks of work group members. Surpris- ingly, extraversion had no effect on friendship centrality. Vodosek (2003) reported positive effects from extraversion in a friendship network of an undergraduate business class. Asendorpf and Wilpers (1998) found in a very careful assessment of friendship formation processes mainly positive effects from extraversion on contact frequency. In our study we follow the strategy of Klein et al. (2004) and regress ex- ogenous personality traits on behavioral and structural indicators, i.e., on the probability of network membership, on the time until students adopt the social networking technology, on the number of contacts (degree) and on several well- known node level centrality and clustering measures. We conducted an online survey at a major Swiss technical university (ETH Zurich) to collect personality scores, sociodemographic variables and usage patterns from 1560 students. At the same time and at two subsequent time points we collected profiles and con- tact lists from a very popular social networking site (StudiVZ) by means of a snowball crawling approach. The two sources were linked, giving us the possi- bility to study ego-centered networks in the context of a complete network data set. The remainder of the article is organized as follows. In the next section we will briefly discuss validity concerns, as well as the advantages and disadvantages of data acquired on SNS. Personality traits and hypotheses will be presented in section 3, followed by some methodological considerations in section 4. In the last two sections results are presented and discussed. We will show that extraver- 160 Personality on Social Network Sites sion plays a pivotal role in the tie-formation process, exhibiting significant effects on degree and centrality measures.

5.2 Social Network Sites

Social Network Sites (SNSs) like Facebook, MySpace and LinkedIn, as well as the German Facebook clone StudiVZ, have shown dramatic growth over the last few years. At the same time they started to attract the attention of scholars due to the availability of ready-made, process generated complete network data. Boyd and Ellison (2007) characterize SNSs as web-based services that allow individu- als to (1) construct a web presence usually including a photo and descriptors like location, age, study concentration and interests, (2) publicly display a list of other users with whom they share a connection, and (3) traverse this list of connections to view the profiles of others within the system. SNSs particularly evolve around special interests or shared contexts like student populations at universities. Al- though they now represent a typical form of computer-mediated communication, Facebook and StudiVZ are well-known to be rooted in the offline world (Lampe et al. 2007), i.e., students use them typically to stay in contact, communicate with and “spy” on their offline friends. Luckily, this makes this type of data source compatible with classic classroom sociometry (Coleman 1964, Hallinan 1978, Moody 2001), where scientists have collected network data of classes or entire schools by means of survey name generators. The major methodological lim- itation of these studies was the number of contacts they were able to process, normally concentrating on the 3 to 5 strongest ties. On SNSs students collect their acquaintances by themselves, leading to a much wider and more complete picture of the social structure. Unfortunately, SNS data has obvious limitations as well. Not all students are members of SNSs, and often members are scattered across different platforms. Nevertheless, we found that about 70% of all ETH students maintain a StudiVZ account, and StudiVZ was, according to our survey results, definitely the most popular site to join. A major limitation of the StudiVZ data is that we cannot see who initiates a tie or when. Although acquaintance and friendship are thought to be symmetric relationships, directed network data would allow this study to measure whether individuals with high extraversion scores initiate or attract SNS relations. As already mentioned, the degree distribution has high variability, with some students having far more than 100 friends within ETH. The mean is 24.5 contacts with a standard deviation of 20.5. Note that this is only a quarter of the numbers that undergraduates from Michigan State University have accumu- lated on Facebook (Lampe et al. 2007). These high numbers suggest very low Personality and Social Networks 161 tie-formation and virtually no tie-maintenance costs in contrast to kinship, ex- change or support networks. Although strong ties are present in these networks, most of the contacts will be part of a wider circle with a low propensity of being resourceful or helpful. However, survey respondents reported that they know a substantial proportion of their contacts from face-to-face interaction. They meet with about 75% of their friends every once in a while, suggesting a considerable overlap between online and offline social networks. Only 20% of the respondents added unknown people to their lists, and in 70% of these cases fewer than three persons were added. Unfortunately, we know little about real returns to social capital investment and maintenance stemming from this sort of organized friend- ship. According to Granovetter (1973) we should expect users in more central positions to find better summer jobs and access class material or residences more easily, and they should be generally better informed about the latest rumors on campus. Ellison et al. (2007) came to a promising conclusion: “Our findings demonstrate a robust connection between Facebook usage and indicators of so- cial capital, especially the bridging type. Internet use alone did not predict social capital, but intensive use of Facebook did.”

5.3 Personality and Social Networks

In the last few decades a broad consensus has emerged that the structure of the personality trait domain encompasses five major dimensions. The FFM (“Big 5”) received considerable empirical support and is now the standard taxonomy to or- ganize and measure personality traits (Goldberg 1990, Costa and McCrae 1992). The model is based on early trait research and lexical assumptions that socially relevant personality differences become encoded in the language of a population. The scientific term “personality” is conceptualized as the entire mental organiza- tion of a person’s traits, where traits are defined as a cross-situational and tempo- rally stable set of individual attributes. The five factors are extraversion, agree- ableness, conscientiousness, neuroticism (emotional stability), and openness to experience (intellect). These traits are claimed to be, at least to some extent, her- itable on the basis of the neurotransmitter regime (Jang et al. 1998), unaffected by external influences, and very stable after 30 years of age (McCrae and Costa 1990, Soldz and Vaillant 1999). Unfortunately, individuals tend to be less stable between 18 and 30 years of age, when most people experience major life course events and transitions, such as leaving their family of origin and entering uni- versity or the job market (Asendorpf and Wilpers 1998). In this period people are presumed to select an environment congruent with their dispositions, prefer- ences and attitudes while, on the other hand, environment reinforces individual 162 Personality on Social Network Sites characteristics and behavior. This makes it hard to disentangle the endogenous re- lationship between personality and the environment. The five personality factors have been shown to relate to people’s behavior in a broad variety of social con- texts. It is likely that they predispose people’s propensity to form more or fewer social ties, or they may be related to the extent which others form and maintain relationships with the focal actor. In the following, we will briefly discuss each personality trait and its possible association with behavior on SNSs. Openness. McCrae (1996) suggested that openness to experience may have the strongest influence on social and interpersonal phenomena among all of the five factors. Survey questions on the openness dimension measure the propensity of individuals to display imagination, curiosity, originality and open-mindedness. In contrast, low openness scores indicate people who are practical, traditional and down-to-earth. According to Vodosek (2003) the literature offers very little empirical evidence for an association with tie-formation, but it suggests a positive relation to satisfaction and stability once the relationship is formed. In the context of SNSs and members coming from a technical university, we expect individuals with high scores on openness to be more likely to try, to use and to keep up with new social networking technologies. Extraversion. Extraversion refers to the extent to which individuals are out- going, active, assertive and talkative. Extraverts are expected to approach others more easily and engage in more social interaction. In contrast, individuals with low levels of extraversion tend to be “introverted,” reserved and serious, and pre- fer to be alone or stay within close circles. It is hardly surprising that extraverted individuals have been found to have larger networks and show higher contact frequencies (Russell et al. 1997, Anderson et al. 2001, Totterdell et al. 2008). For example, Asendorpf and Wilpers (1998) found that extraversion was highly associated with students’ interaction rates and their peer-relationship formation. Extraversion is the least controversial dimension in this context and is expected to exhibit obvious and strong effects, as has been shown by others too (e.g. Tot- terdell et al. 2008, Ryan and Xenos 2011). Conscientiousness. Conscientiousness refers to the extent that an individual is dependable, careful, responsible and organized, and has a high will to achieve. It has been shown to be associated with performance in the workplace and is known to be the most prominent dimension in the context of education and learn- ing, exhibiting substantial correlations with grade point average, educational per- formance and persistence (De Raad and Schouwenburg 1996). Conscientious students have been found to have more frequent contact with family members (Asendorpf and Wilpers 1998). Furthermore, one might expect this trait to be the central source of strategic network formation. Wanberg et al. (2000) reported in their study on job-seekers higher levels of contacting and asking in association Personality and Social Networks 163 with high levels of conscientiousness. However, SNSs do not promise fast and obvious returns to social networking. We expect that high scores on Conscien- tiousness will lead to lower numbers of contacts in this specific context. Consci- entious individuals will refrain from high investments in SNS profiles; they will stick to their main goals and try to avoid such sources of distraction (cf. Ryan and Xenos 2011). Agreeableness. Agreeable persons tend to be courteous, kind, flexible, trust- ing and forgiving; they are inclined to cooperate but known to avoid conflict. Agreeableness is associated with positive relations to alters, and has been shown to foster peer acceptance and friendship among children from middle and junior high school (Jensen-Campbell et al. 2002). McCarty and Green (2005) report that agreeableness and conscientiousness were most highly correlated with per- sonal network structure. At the same time they only found a small influence from extraversion. Generally, agreeableness is said to have a favorable influence on so- cial interactions and their perceived quality. Although the effect on tie-formation remains unclear, we can at least presume that agreeable individuals will not re- ject an offer of friendship. Exactly this interaction was found in one of the few studies in the domain of personality psychology that applies dynamic methods to a directed social network. In the study of Selfhout et al. (2010), extraversion was the only personality trait responsible for a higher propensity to initiate ties (sender effect), whereas agreeableness was the only significant trait explaining why indi- viduals receive friendship nominations (receiver effect). Besides the sender and receiver effects, Selfhout and colleagues also found similarity effects that even exceed selection effects. People with higher scores on openness, extraversion, and agreeableness show a higher propensity to meet if the person on the other side of the tie matches their scores. Neuroticism. Relatively few studies report on the relationship between neu- roticism and network characteristics. Neuroticism refers to the extent to which individuals experience and display negative affects like anxiety, sadness, embar- rassment, depression and guilt, and is tied to the ability to cope with stress. It is negatively related with status in male social groups (measured by indegree) and the number of peer relationships (Anderson et al. 2001). Individuals with high levels of the factor neuroticism are expected to impose higher tie maintenance costs on their alters, although these negative consequences are less plausible in a computer-mediated environment with a very low tie-formation threshold and, on average, low tie strength. On the other hand, people with high scores on neu- roticism tend to believe that they are not attractive to others and are fearful of rejection. Therefore an intensified desire for an unstained self-presentation could result – counter-intuitively – in higher activities on SNSs. Ross et al. (2009) summarized several studies that claim higher activity in computer-mediated com- 164 Personality on Social Network Sites

Table 5.1: Hypotheses for Personality Effects on Outcome Variables

SNS Adoption Number of Network Local Use Time Friends Centrality Clustering Openness + + + + - Conscientiousness - - - - + Extraversion ++ ++ ++ ++ -- Agreeableness + + + + - Neuroticism - - - - +

munication for those who are high on the neuroticism scale. These vulnerable individuals are likely to use the internet to avoid loneliness and to seek social support that they are otherwise missing. In their study, however, Ross and col- leagues found no indication of a positive effect from neuroticism. Despite the meager empirical evidence, neuroticism is generally assumed to be negatively associated with social relationships (Wanberg et al. 2000, Klein et al. 2004). The studies closest to our setup are based on data from the ‘myPersonality Project’ conducted on Facebook. Kosinski et al. (2013) allow Facebook users to complete a series of personality inventories by means of a Facebook application (cf. Bacharach et al. 2012 and Quercia et al. 2012, where results from the Big 5 are discussed). The myPersonality project has collected personality profiles of several million Facebook users who self-select into the treatment by installing the app. About 40% of the users completed at least one personality inventory. Kosinski and colleagues regressed personality scores of the FFM on the num- ber of friends (logged) and on several other SNS usage indicators (e.g. likes, group memberships, photo counts, status updates). In an analysis on a subset of 170’000 US American users, just two traits exhibited significant correlations with the logged friends count: extraversion (r = 0.20) and neuroticism (r = −0.07). In the authors’ linear regression model just extraversion turned out to be a reli- able predictor of SNS usage. They predicted extreme extraverts to have twice as many contacts as extreme introverts. Compared to their large sample, which tries to be representative of the general internet-using population, ours will be very restricted but more closely matches traditional studies conducted with undergrad- uate psychology students. Apart from that we expect very similar results from our SNS users. Table 5.1 shows an overview of the hypotheses and the direction of the ex- pected effects. Unfortunately, the current literature on associations between per- sonality and network positions is characterized by mixed evidence and inconsis- tencies. We treat personality as exogenous and stable and try to replicate pre- vious results. Extraversion is thought to be an intrinsic quality of an individual Data and Methods 165 exhibiting evident and strong effects on networking behavior. For the other four personality traits, there are several rival hypotheses and potential stories to tell. We invite the reader to interpret the effects with caution, since we were not able to derive all hypotheses from a well established theory. Additionally, we assume a homogenous influence, i.e., a trait is expected to show the same sign and ef- fect strength in all models, except in the clustering models, where effect signs should be reversed. The reason for this will be explained below. The FFM is typ- ically measured with extensive question inventories, sometimes with more than 100 items. Only recently have short inventories been developed and tested that are especially suited for large-scale survey research. We applied the BFI-15, an inventory developed for the German Socioeconomic Panel (Gerlitz and Schupp 2005). The inventory consists of 15 questions, measured on 7-point scales, where 3 questions always map onto one of the 5 dimensions. We conducted a principal component analysis (factor analysis) and were able to replicate the 5-factor nature of the FFM. Table 5.2 displays the factor loadings. The results are as expected. Every item loads on the predetermined dimension, although the factor loadings are in some cases unsatisfactory. The same holds for the scale reliability measure Cronbach’s α. A low reliability of survey questions in an additive index indicates that the underlying dimension is not properly captured. Only the factor extraver- sion exhibits sufficient reliability; agreeableness is with an α = 0.54 much too low, while the other three factors are with α ≈ 0.65 at the lower end of what is commonly accepted in survey and psychometric research. All personality scores are correlated with gender. Apart from neuroticism, female respondents show on all other four dimensions higher scores but similar standard deviations. Further- more, conscientiousness seems to be at least to some extent correlated with age (r = 0.09). All other dimensions show virtually zero correlation with age.

5.4 Data and Methods

Data. The data of this study comes from two distinct sources: an online survey and observations of SNS profiles. The survey was implemented using the online survey facility at the Chair of Sociology at ETH Zurich and was active between November 22 and December 24, 2007. Students were randomly recruited from the official university address list, i.e., we randomly invited 66% of the student population (13’000 students) by e-mail and received 1560 completed question- naires. This corresponds to a response rate of 17.5%. Female respondents are over-represented by about 10% in comparison with the official enrollment count (40% in survey vs. 30% in student body). Students from management and social sciences are over-represented while the proportion of engineering students is rela- 166 Personality on Social Network Sites

Table 5.2: Factor Analysis of Personality Scores

Item / Translated from German O C E A N “I am someone who . . . ” Factor 1 Factor 2 Factor 3 Factor 4 Factor 5 04) “is original, comes up with new ideas” .726 09) “appreciates artistic pursuits” .439 14) “has a vivid imagination, ideas” .631 01) “works diligently” .807 07) “is rather lazy” [-] .534 11) “completes tasks effectively and efficiently” .550 02) “is communicative, talkative” .780 08) “is able to open up to other, is gregarious” .725 12) “is diffident” [-] .674 03) “is sometimes gruff with others” [-] .554 06) “can forgive” .346 13) “is considerate and friendly towards others” .756 05) “often worries” .585 10) “get nervous easily” .591 15) “is relaxed, able to deal with stress” [-] .722 MIC .376 .388 .550 .287 .389 Cronbach’s α .643 .655 .786 .548 .657 Mean 4.828 4.937 4.520 5.305 3.756 SD 1.119 1.098 1.224 1.224 1.173 N 1540 1541 1539 1540 1539 Note: Iterated principal component analysis of personality items results in a five-factor solution. Reported are the varimax rotated factor loadings of the normalized matrix, which are omitted when |loading| < 0.3. Items 3, 7, 12 and 15 are reversed. We build additive (unstandardized) indices for later use as exogenous predictors. MIC shows the average inter-item correlation for the personality indices, α is a scale reliability measure.

tively lower in our sample. As observed in other online surveys, the response rate decreases with age. The survey consisted of questions about SNS usage intensity and preferences and also confirmed those indicators we collected simultaneously from the SNS. The correlation between self-reported and observed indicators is very high (e.g. number of friends: 0.968, gender: 0.995), suggesting that most students fill out their online profiles truthfully. At the same time, on November 22, we collected all the StudiVZ profiles of ETH students with an automated web crawling approach. We started with a highly active user and followed the lists of friends until no new ETH member was found. Students are impossible to reach with this procedure if they are not con- nected to someone in the giant component of the graph. However, according to analytical results from random graph theory and our survey, we expect fewer than 1% of all students to be isolates in the friendship network collected by crawling. This snowballing procedure resulted in 7318 observed profiles. Compared with the enrollment statistics, we estimate that 65% of ETH students are members of StudiVZ (excluding graduate students). The direct question in the survey indi- Data and Methods 167 cates a StudiVZ membership rate of 67.5%. Ten percent of students maintain a membership at another platform such as Facebook, MySpace, or Xing. The two data sources were matched using first and last names. In cases where students have identical names, we used the study subject and birthday. Record linkage was successful in 97.9% of all cases. Comparing our survey respondents with the other platform members, we find our survey respondents are more active, with a mean of 24.5 friends versus 21.1 and a median of 19 friends versus 16. Analysis. We apply maximum likelihood estimation and simple OLS regres- sions, in a total of twelve statistical models (M1-M12). For explaining partici- pation on social network sites we perform a logistic regression on the dependent variable of having at least one membership of an SNS (M1: 0=none, 1=at least one). For model 2 (M2) The adoption rate is modeled with a Cox proportional hazards model as expressed in equation (5.1),

h(t|X) = λ(t) exp(β0X) (5.1) where λ(t) is the time-dependent baseline hazard function, X are vectors of inde- pendent variables and β are regression coefficients. The dependent variable is the duration in days from the opening day of StudiVZ (October 31, 2005) to the time when respondents sign up for StudiVZ. We control for different exposure times with cohort dummies. Positive regression coefficients indicate a higher sign-up rate, i.e., these respondents adopt faster. For the degree distributions and changes in the number of friends we use count models (M3-M5). The number of friends Yi of student i is assumed to follow a 2 negative binomial distribution with expected value λi and dispersion γ (cf. NB2, Cameron and Trivedi 1986, 2013).

0 2 2 E(Yi|X) = λi = exp(β X) with γ = λi + αλi (5.2) The expected event occurrence λ equals that of a Poisson regression, but is itself a random variable drawn from a gamma distribution. The continuous gamma mix- ture of Poisson distributions results in the negative binomial distribution where the conditional variance exceeds the conditional mean. The dispersion parame- ter α serves to estimate the level of overdispersion independently of the mean. Different exposure times are normalized by introducing the log of the duration from the sign-up date at StudiVz to the survey response date, with a regression coefficient constrained to equal 1. Since we have multiple observations of the degree distribution we also fit a panel regression model on waves 2-4. We model overdispersion with individual specific random effects drawn from a beta distri- bution according to the model suggested by Hausman et al. (1984, 922): 168 Personality on Social Network Sites

!λit !yit Γ(λit + yit) 1 δi E(Yit = yit|xit) = with 1/(1+δi) ∼ Beta(s, r) Γ(λit)Γ(yit + 1) 1 + δi 1 + δi (5.3)

The last set of models in Table 5.4 consists of simple linear regressions where network node indices of the friendship network are regressed on independent vari- ables. We report standardized beta coefficients and bootstrapped standard errors. The first set of node level indices comprises the centrality measures degree, close- ness, betweenness (Freeman 1978), and eigenvector centrality (Bonacich 1972). Centrality measures capture the relative position of an individual within the net- work. Degree-based measures focus on the level of communication activity, while betweenness captures stress control and the capacity to interrupt communication. Closeness refers to the freedom from such control in measuring how close/far apart an individual is from everybody else in the network. Eigenvector central- ity reflects the fact that individuals might profit from well-connected friends; it is the positive multiple of the sum of adjacent recursively solved for the entire network. The two indices in the second set are in essence densities of the ego networks. Network constraint measures the extent to which an ego is invested in people who are invested in his alters. It is typically applied in valued networks, but well suited for our setting as well. Burt et al. (1998) used this in- dicator to show that individuals with “entrepreneurial personalities” avoid redun- dant investment. Transitivity on the graph level, last but not least, measures the tendency for individuals’ neighborhoods to be connected. The node level concept of local clustering is closely related to transitivity (cf. Watts 1999). We computed local clustering coefficients for each node, measuring the ratio of triangles con- nected to the vertex to the triples centered on the vertex. We will not discuss the centrality and clustering models in great detail. See the appendix for the graph theoretic definitions and Wasserman and Faust (1994) and Newman (2010) for details on the applied node level indices. The aim of the analysis is to show that an external source of variation like personality will be associated with a wide ar- ray of node indices, given that is in strong relation to the degree of a node. All models use the same set of independent variables. As well as the five personality scores, we include gender, age in years, and a set of 8 dummies for cohort-groups (entry cohorts by year) with the freshmen as the reference category. Estimates for the cohorts are omitted in the tables. The cohort dummies always exhibit the same pattern: a parabolic effect with a maximum at the end of the undergraduate study time. Personality Effects on Social Networks 169 5.5 Personality Effects on Social Networks

In Table 5.3 and 5.4, personality effects on several indicators of networking be- havior and network position are reported. Three of the personality scores, consci- entiousness, extraversion and neuroticism, have a significant effect in the majority of models. There is no support from openness and mixed results for agreeable- ness. First, we discuss the models capturing networking activity and then we will report on the associations with centrality and clustering and discuss some further structural properties of the ETH/StudiVZ friendship network. Activity. In M1 we model the determinants of SNS membership. Estimates from a logistic regression indicate that extraversion is the most prominent pre- dictor of joining and, as expected, people with high scores in conscientiousness refrain from social networking. A standard deviation increase in extraversion al- ters the odds of being a member by 49%, while a standard deviation increase in conscientiousness leads to a 13% decrease in the odds of a registration, holding all other independent variables constant at their mean. Age is the most important explanation for being a member of a network site. Students who are one standard deviation older are 8.2% less likely to be members. Given that other studies (eg. Lampe et al. 2007, Tufekci 2008) have found that women have higher member- ship and activity rates, it is surprising we find no gender effect at all for ETH students. M2 is similar to M1 but with the major difference that we measure the time until registration takes place. Non-members are treated as censored and are kept in the sample for an adequate estimation of the set of people at risk of joining StudiVZ. The logit and Cox models show very similar coefficients and marginal effects, with the only difference between the two models being a positive signif- icant association of agreeableness with adoption time. We did not expect any positive results due to its low scale reliability. Presumably agreeable individuals do not easily reject invitations from other users. Models M3-M5 are the count models, where we estimate effects on the num- ber of friends a user has acquired. The dependent variable of M3 is the number of friends the individual had at the time of the survey, while the dependent variable in M4 is the number of friends 4 months later. In M5 we regress independent variables on the difference between the two. Again, extraversion has the largest influence on friendship. Finally, the panel regression model in M6 shows essen- tially the same results. Note, however, that the effects from conscientiousness are not significant in M5 and M6.1

1 For models M3 and M4 we also estimated generalized negative binomial regression models. There we model dispersion explicitly with ln(α) = z jγ where the number of contacts is a function of, for example, gender, age and cohort, but dispersion ln(α) is a function of personality z j. It turns out that none of the personality traits significantly predicts dispersion, which contradicts the assumption of the 170 Personality on Social Network Sites 4 , 3 ∗∗ , ∗∗∗ ∗∗∗ ∗∗ ∗∗∗ 2 ∗∗∗ .053 .213 1.726 2.053 1.064 Panel 4 − 2 + ∆ ∗ ∗∗∗ .097 -.024 -.015 .101 1.398 .143 Growth ects on Network Activity 4 T ff ∗ ∗ ∗∗∗ ∗ ∗∗∗ .050 -.039 0.776 .222 -1.083 Friends 2 T + ∗ ∗∗∗ ∗ ∗∗∗ ects. Bootstrapped standard errors in parentheses. ff -.008 .006 -.016 -.025 -.024 -.024 -.014 -.192 -.048 .211 ∗ 001 ∗∗ ∗∗∗ . ∗∗∗ 0 – 0.711 < -.060 .163 -.118 p ∗∗∗ ects panel regression model for the degree distributions in waves 2, 3 and 4. 01, . ff ∗ ∗∗∗ 0 ∗∗∗ ∗∗∗ < M1 M2 M3 M4 M5 M6 1520 1520 922 942 906 2921 p (.064) (.031) (.020) (.021) (.045) (.021) (.063) (.029) (.020) (.019) (.046) (.020) (.069) (.033) (.025) (.023) (.056) (.026) (.061) (.028) (.019) (.019) (.043) (.020) (.055)(.141) (.030) (.071) (.021)(.755) (.045) (.021) (.045) (.047) (0.404) (.097) (.017) (0.365) (.045) (.733) (.601) (.023) (.015) (.018) (.017) (.030) (.025) ∗∗ Participation Adoption Friends 05, . 0 < p ∗ 10, . Maximum likelihood estimates of personality e 0 ) -1.011 ) ) < s r α ( ( ( p N ln ln Openness -.007 -.002 .018 .014 .055ln .009 Conscientiousness -.129 Extraversion .327 AgreeablenessNeuroticism .086 .042 .102 .042 .051 FemaleAge (years)ConstantCohort -.056 Dummies -.139 Exposure 3.014 -.090 Yes .065 Yes .049 Yes .023 Yes Yes .002 Yes Yes Yes No No Table 5.3: Maximum Likelihood Estimates of Personality E Note: Model 1 is a logisticCox regression proportional on hazards having rate at model leastand of one 5 the membership are duration of negative until a binomial theis social regressions start a networking with of negative site. overdispersion binomial the random-e and Model StudiVZ normalized 2 membership. exposure is Models time a 3, (NB2). 4 Model 6 Personality scores are unstandardized+ and cohort dummies are not reported. Personality Effects on Social Networks 171

Degree Distribution of Friendship Network Marginal Effects on Degree (95% CIs)

Observed Counts 50 Extraversion .04 Predicted from Negative Binomial Neuroticism Conscientiousness 40 .03 30 .02 Probability 20 .01 Predicted Number of Friends 10 0 0 20 40 60 80 100 1 2 3 4 5 6 7 Number of Friends Raw Personality Scores

Figure 5.1: Degree Distribution and Marginal Effects of Personality

For a standard deviation increase in extraversion, a student’s expected mean number of friends increases by 29%. Conscientiousness significantly reduces the number of friends in M3 and M4, while neuroticism significantly increases the counts in all four models. Figure 5.1 depicts a predicted degree distribution and effect plots for the three significant effects in M3. Neuroticism’s effect was unanticipated. While all evidence from team and work group studies indicates de- bilitating influences from emotional instability, we find evidence for the opposite. One possible clarification comes from the time students spend on the networking site. A path analysis revealed that neuroticism and conscientiousness are moder- ated by time investment. While conscientious students tend to visit the site less frequently and for shorter periods of time, the opposite holds true for individuals with high scores on neuroticism. Extraversion and conscientiousness show di- rect and moderated effects, while neuroticism only shows a moderated effect. In essence, a high level of emotional instability is not a direct cause of having more friends, it is rather the cause for intensified online networking behavior. Further analysis is needed to clarify this puzzle. Node level indices. We now turn to the analysis of associations between per- sonality and indices of the complete network, i.e., for every survey respondent node level indices were computed from the entire network collected by the snow- balling procedure. Before looking at the OLS estimates, some of the global struc- tural characteristics of the friendship network are briefly reviewed. The network consists of 7318 nodes and 78631 edges at the reference time when the online survey starts (T2). Two months earlier (T1, September 2007) and shortly before spurious contagion model where individuals are assumed to have constant but unequal probabilities to form ties. This puzzle cannot be resolved with cross-sectional data alone but requires panel data on outcomes and personality scores. 172 Personality on Social Network Sites new freshmen arrived for the autumn term, the network was populated with 5975 students. Within only 2 months the number of nodes in the network rose by 20%. In the two subsequent periods up to wave 4, we observed further increases by 6% and 2%, which resulted in 7930 members in March 2008. In the same period the number of edges rose by 30%, while the density of the network slightly decreased from 0.0035 to 0.0029. The following structural measures of the friendship net- work are computed for the reference time point T2. It is hardly surprising that ETH is a small world. The network has a diam- eter of 10 and an average shortest path length of 3.6. Small worlds are defined as having a short average path length but high clustering, and both requirements are met. The global clustering coefficient for the friendship network is 0.28, far beyond what we would expect from a random network with equal density and size. This indicates that the network exhibits typical triadic closure and cliquish- ness regularly observed for social networks. Although the network lacks extreme events and huge hubs, the global structural measures strongly resemble several other social networks described in Newman (2002). We will come back to a last structural metric of social networks – assortativity – at the end of the section. Table 5.4 presents personality effects on the network positions of our survey respondents. In model M7, where the logarithm of the degree is the dependent variable, we simply replicate the results from the count model M3 by means of linear regression. In essence, M3 and M7 are identical. All centrality models indicate the same three personality traits at work, which were also significant in the activity models: extraversion, conscientiousness and neuroticism. Extraver- sion regularly displays effect sizes that are three times as strong as the other two factors, making it the most relevant trait for studying large social networks. Once more, we find no evidence for a gender bias. Age is strongly negatively associated with degree and closeness, but has no influence on betweenness and eigenvector centrality. The similarity of the four centrality models is of no surprise. It is well- known that all measures of centrality in graphs tend to be correlated with degree. In this study, the number of friends is the overruling characteristic of the network. Attributes strongly associated with degree will be handed over to all other local graph metrics. Having a high degree inevitably results in higher centralities, and is at the same time a source of lower local clustering around the focal actor. Nodes with very high degrees tend to have low clustering rates since the many nodes that are connected to them are relatively unlikely to be linked to each other. Jackson and Rogers (2007) claim this to be a key empirical regularity for so- cial networks: “The clustering among the neighbors of a given node, in at least some social networks, is inversely related to the node’s degree.” In our study, the correlation between degree and the local clustering measure is r = -0.31. Accord- ingly, the coefficients flip sign between the centrality and clustering models. High Personality Effects on Social Networks 173 + + ∗∗∗ .056 .020 .115 -.173 ∗ + ∗∗∗ ∗∗∗ .058 -.069 -.317 + ∗∗∗ ∗∗∗ -.028 .171 .050 .253 -.092 ects on Node Level Indices ff ∗ ∗ ∗∗ ∗∗∗ -.079 -.110 .099 .359 001. . 0 < p ∗ ∗ ∗ ∗∗∗ Centrality Measures Local Clustering ∗∗∗ .070 -.071 -.096 .291 01, . cients of linear regressions on node level indices, estimated with robust 0 ffi < p + ∗∗ ∗∗ ∗∗∗ ∗∗ M7 M8 M9 M10 M11 M12 .227 .342 0.170 .342 .237 .030 (.234) (.282) (.348) (.646) (.156) (.652) (.080) (.013) (.016) (.001) (.052) (.133) (.000) (.000)(.956) (.816) (.000) (.879) (.000) (.727) (.000) (.721) (.000) (.739) (.009) (.022)(.427) (.981) (.003) (.815) (.089) (.523) (.021) (.690) (.579) (.064) (.005) (.049) (.020) (.497) (.000) (.094) Degree Closeness Betweenness Eigenvector Constraint Transitivity 05, . 0 < p 2 ∗ R 10, Table 5.4: OLS Estimates of Personality E . Standardized beta coe 0 < p Openness .036 .030 .030 .012 -.042Adjusted .015 Conscientious. -.055 Extraversion .326 Agreeableness -.002Neuroticism -.006 .084 -.004 .009 .010 .010 FemaleAge .025 -.001 -.163 .007 -.018 -.012 .065 Cohorts Yes Yes Yes Yes Yes Yes N 958 958 958 958 958 958 Note: standard errors, p-values in parentheses.work, Network consisting measures of were 7’318 computed nodes forof and the the 78’631 entire degree, edges, friendship betweenness, with+ net- eigenvector R(Igraph). and constraint Due scores. to skewness Cohort we dummies take are logarithms not reported. 174 Personality on Social Network Sites scores in extraversion result in the fact that the first neighbors of this individual are less integrated on average (M12). At the same time this lowers the probability of investment in redundant ties, captured by M11 with network constraint as the dependent variable.2 In the study of Burt et al. (1998) an “entrepreneurial personality” was respon- sible for low levels of network constraint. This coincided with a high number of structural holes in the networks of those MBA students who claim the personality of the “entrepreneurial outsider.” To what extent extra edges are in fact bridges cannot be answered here and needs further investigation. Surprisingly, we find only weak evidence for a gender difference in transitivity. Women are known to have a higher proportion of kin in their ego-centered networks and are presumed to invest more time in the near graph neighborhood, resulting in a higher number of closed triads around them. The gender indicator variable in model M12 is only significant on the 10% level, questioning a strong association with gender. Finally, we have to address the question of whether similar personalities meet at a higher rate and show what the study of Selfhout et al. (2010) clearly sug- gests. Friendship choices based on similarity in attributes – better known un- der the term homophily – leads us to another prominent characteristic of social networks: assortativity. Nodes of the same kind tend to cluster. We found no indication of assortative personality mixing in the literature, suggesting that no specific preferences for people of the same or a different personality type exist. Assortative mixing is most easily calculated by a Pearson’s correlation coeffi- cient r for the attributes at either side of an edge (Newman 2002). Unfortunately, the subgraph, spanning the survey respondents by the relations they share among one another, is only a tiny and potentially biased fraction of the whole friendship network (883 nodes, 4182 edges, and a tenfold density of 0.01). Regardless of that fact, we calculated the node-node attribute correlations (with bootstrap re- sampling) and found only negligible deviations from a perfectly mixed random graph. Openness-Openness showed a correlation coefficient of 0.076; all other possible combinations of personality traits are well below 0.05. This is radically different for a set of other mixing patterns. The degree-degree correlations has a coefficient of 0.20, and gender mixing is at the level of r=0.15. The strongest deviations arise from age and cohort. Assortativity by age exhibits a correlation of 0.5, and the cohort is even higher, indicating that friendship choices most of-

2 For the models M8-M12 in Table 5.4 we also estimated alternative specifications including the degree as control variable. Obviously, the standardized beta coefficient of the degree variable then roughly matches the bivariate correlation coefficient in Table 5.7. Inclusion of the degree as the control variable does not change effect directions, but reduces the effect size of extraversion by factor 3. Extraversion stays highly significant but significant effects for conscientiousness and neuroticism are lost. Conclusions 175 ten take place within and not between the age groups. The heavily constraining nature of age on the tie-formation process in friendship networks is a well-known fact in the literature on sociometric choices at school. However, we find it surpris- ing that this pattern is so persistent even on the university level. To summarize, personality seems to be predictive of how many persons someone meets and what position the person takes in the network, but not of the type of personality the person’s interaction partners will have.

5.6 Conclusions

The results of this study show that extraversion, one dimension of the five-factor personality model, plays a uniquely important role in the formation of network ties. We conducted an online survey to collect personality scores of 1560 stu- dents from a major Swiss university. At the same time, we observed students’ online profiles on the popular SNS StudiVZ. Extraverts show a higher probability of joining StudiVZ, they also adopt the technology faster and accumulate more friends on their contact lists. Accordingly, individuals with high scores on ex- traversion take more central positions in the friendship network. All centrality measures applied in this study were highly correlated with degree. So it comes as no great surprise that all centrality models exhibit about the same association with extraversion. Conversely, the clustering among the neighbors of a given node is found to be inversely related to the node’s degree. In the models with network constraint and local transitivity as their dependent variable, most of the indepen- dent variables changed sign, therefore confirming this inverse relationship. Apart from extraversion, we found retardant effects from conscientiousness. Highly conscientious people tend to refrain from participation on SNS, suggesting that they successfully evade this popular source of distraction in a student’s everyday life. Surprisingly, we found activating, positive effects from neuroticism, which stands in sharp contradiction to theory and the majority of empirical findings re- ported in the literature. One possible explanation is that people exhibiting high levels of emotional instability tend to spend more time on SNS. Being fearful of rejection, they might try harder to present themselves well in an unstained and at- tractive manner. There is virtually no support for an association between network characteristics and the last two factors, openness and agreeableness. Although agreeableness seems to accelerate the adoption of social networking technology, we found no associations in the count and centrality models. In all models, ex- traversion exhibits effects three to six times stronger than any other personality trait, making it the most important personality factor in large social friendship networks. Extraversion is responsible for about 10% of the variance in the num- 176 Personality on Social Network Sites ber of friends, while age and entry cohort are responsible for another 10%. We found no evidence for gender effects in general, and only a weak indication that women tend to have higher levels of triadic closure in their ego networks. Clearly, personality does not tell the whole story. While extraversion fits perfectly into the wider picture, effects from conscientiousness and neuroticism seem to be partic- ularities of online social networks or just the data at hand. In their seminal work, Burt et al. (1998) applied a very specific personality trait of an “entrepreneurial outsider” and showed that high values on this person- ality trait correspond to low levels on the network constraint index. Based on Burt and colleagues’ work, several studies – including ours – have tried to find corre- lations between network structures and more general and encompassing concepts of personality. Most of the studies reviewed here corroborate a correlation of extraversion with structural network indices. All other trait dimensions result, however, in an inconclusive picture. Furthermore, correlations reported here and in the literature tend to be rather small. It might be the case that the FFM is too broad to be useful. Kalish and Robins (2005, 80) conclude: “A growing body of knowledge suggests that the Big Five may be too diffuse to capture specific behavioral outcomes [...]. The use of more specific psychological constructs, es- pecially ones that are theoretically relevant to bridging may be worth investigating in future studies.” The degree distribution of the number of friends shows high variability and is best described with the negative binomial distribution. By introducing personality as an exogenous and temporally stable source of variation into count models, we were able to show that the skew of the distribution is not entirely dominated by contagious processes in the network. However, it is very likely that people who have a lot of friends at time point 1 will profit from their visibility and might ac- quire disproportionately more contacts in a subsequent period. Unfortunately, the distinction between heterogeneity and contagion is not an easy undertaking, since these two sources of variation give rise to the same distribution. Distributional evidence is not enough to demonstrate the presence of a specific tie-formation process. Further analysis beyond the cross-sectional framework and the applica- tion of dynamic panel models is needed to distinguish between the two potential sources of excess variability. Apart from the direct effects of personality on the probability of SNS mem- bership, on the adoption rate, on the number of friends, and on network centrality, we found no evidence that personality traits are responsible for similarity effects. Compared with assortativity based on cohort, age, degree and gender, personality will not be of great importance to tie-formation models based on attribute simi- larity. Results from the survey indicate a substantial overlap between offline and online circles of acquaintances. SNS are just another place where students nowa- Conclusions 177 days meet, opening up the possibility to observe their behavioral traces. The ETH friendship network is characterized by a moderately skewed degree distribution, an average shortest path length of 3.6, a high average clustering coefficient of 0.28, positive assortativity and an inverse relation between clustering and degree, therefore exhibiting all key empirical regularities shared by socially generated networks. This makes these easy-to-collect SNS data sources an interesting alter- native to study processes of friendship formation on a larger-scale basis. 178 Personality on Social Network Sites 5.7 Appendix

In the following section we present the node level indices and report on supporting material collected in the online survey.

5.7.1 Node Level Indices In the following section we specify the applied node level indices for reference. We will not discuss them here but refer to Freeman (1978), Borgatti and Everett (2006), Bonacich (1972), Burt (1992), and Watts and Strogatz (1998) for details. Degree centrality is just a count of the number of contacts for vertex i in a graph of order n and is defined as

DEG X DEG ci = ai j; C = A1 (5.4) j where A is the adjacency matrix, ai j are the cells of the adjacency matrix, and 1 is a vector of 1 with length n. Closeness centrality is the total geodesic distance for any given vertex to all other vertices in the network. Closeness centrality for actor i is defined as

CLO X CLO ci = di j; C = D1 (5.5) j where D is the distance matrix (based on geodesics) and 1 is a vector of 1s with length n. The distance matrix D indicates the length of the shortest paths (geodesics) between i and j. The distance matrix is computed by taking powers of the adjacency matrix. The length of the shortest path results from the low- est possible power, where the cell of the distance matrix is greater than 0, i.e., d(i, j) > 0. Formally,

[p] d(i, j) = minp xi j > 0 (5.6) The normalized version of closeness is flipped and multiplied by the minimal sum CLO CLO of distances n−1, i.e., c‘i = (n−1)/D1. Higher values of c‘i indicate higher centrality. Betweenness centrality for actor k is defined as

X X gik j cBET = (5.7) k g i j i j where gi j equals the number of all shortest pathes (geodesics) from i to j. gik j refers to the number of shortest paths from i to j that pass through k. To normalize betweenness centrality we divide by the maximal value reached in a star, i.e., Appendix 179

BET 2 BET c‘k = (2/n − 3n + 2)ck . Eigenvector centrality is a generalization of degree centrality (cf. Bonacich 1972). A vertex’s centrality depends recursively on its neighbors’:

1 XN cEV = A c (5.8) i λ i j j j=1 Or in matrix notation:

λc = Ac (5.9) such that c is an eigenvector of the adjacency matrix A with eigenvalue λ. Since we want all the centralities to be non-negative, c has to be the eigenvector of the largest eigenvalue. We scale vector c to fall between 0 and 1. The constraint measure of Burt (1992, 2004) is defined as:

CR X  X 2 ci = pi j + pik pk j (5.10) j,i k,i,k, j where pi j is the proportional investment of i into j. The constraint measure is CR bounded between 0 and 1 for n ≥ 6. The lower the constraint score ci , the more structural freedom a vertex has and the higher are its network benefits. Finally, the local clustering coefficient of Watts and Strogatz (1998) is our measure of local transitivity. It is defined as:

number of pairs of neighbors of i that are connected cLC = (5.11) i number of pairs of neighbors of i

It takes values between 0 and 1 for vertices with cDEG ≥ 2 and reaches 1 when LC the first-degree neighborhood of i is fully connected. We set ci of vertices with degree 1 to be zero even though local transitivity is not defined.

5.7.2 Name Generators and Direct Questions An integral part of the survey was a name generator to get an impression of po- tential social capital embedded in the StudiVz network. Survey respondents were asked to give the initials of their last two contacts on StudiVz. After this step, they were asked to state how often they are in contact with each of these two alters, how long they have known each other, and if they would discuss important per- sonal matters with them (cf. Burt 1984). Figure 5.2 depicts contact frequency and personal closeness with the first of the two alters. Results for the alters are almost 180 Personality on Social Network Sites

Contact Frequency Discuss Personal Matters Has known him/her for

several times a day yes <= 1 year

once a day

rather yes 1-2 years several times a week

once a week rather no 2-4 years

several times a month

no > 4 years rarer

0 10 20 30 40 0 10 20 30 40 0 10 20 30 40 percent percent percent

Figure 5.2: Contact Frequency, Closeness, and Duration for Alter #1 in Name Generator identical. More than 50% of the respondents are in daily contact with both their alters. A majority of respondents would discuss personal matters with their two last contacts on StudiVz. The median duration of the acquaintanceship is 2 years (24 months, mean: 37 months) for both contacts. This suggests that the StudiVz network includes contacts acquired before entering ETH and that the network is composed of weak and strong ties. Besides the unobtrusive measurement of the number of online contacts, we also asked survey respondents directly how many persons they consider to be in their core “circle of friends” (core friends). Additionally, students were asked how many study-related e-mails they had sent the day before and how many con- tacts they had saved on their mobile phones. Direct questioning methods can be prone to several biases, e.g., social desirability, recall or scaling issues. Distor- tions can also come from different definitions of friendship and acquaintance (cf. Marsden 2005). For example, the number of persons our respondents consider as core friends has the well-known spikes in intervals of 5. Additionally, 20% of our respondents state that they have more than 10 close friends. Notwithstanding the issues related to direct questioning, these measures give us a rough indication of how the nodal degree of the StudiVz friendship network is correlated with other communication channels. Table 5.5 shows that all measures are positively but only slightly or moderately correlated with the number of friends on StudiVz. In Table 5.6 we compare personality effects on the counts of StudiVz contacts with the self-reported number of “core” friends by means of simple OLS regres- sions (M13, M14). Extraversion is highly significant in both models. However, in M14 openness is now significant while conscientiousness and neuroticism flip sign and lose predictive power. This suggests that the finding of a negative effect Appendix 181

Table 5.5: Correlations of Contact Frequencies and Life Satisfaction

mean sd studivz core mail mail mobile degree friends sent received contacts core friends 9.06 6.74 .2468 mails sent 1.53 3.09 .1716 .0789 mails received 3.07 4.78 .1718 .0918 .4157 mobile contacts 107.6 82.61 .3287 .2438 .1057 .0862 life satisfaction 7.43 1.79 .0527 .1692 .0191 .0130 .0179 for conscientiousness and the positive effect for neuroticism are possibly very domain-specific and only apply to online social networks. A last means of triangulation comes from the covariance of personality, de- gree, and life satisfaction. From the US General Social Survey we know that there exists a positive association of the size of the core discussion network with life satisfaction (cf. Burt 1987). Furthermore, numerous studies from personality psychology have shown robust correlations of the FFM with the same construct (cf. DeNeve and Cooper 1998). We asked our survey respondents to state their overall life satisfaction on a scale from 1 to 10. In Models 15 and 16 of Table 5.6, we regress personality scores on the life satisfaction scale by means of OLS regressions. Results have also been tested with ordered probit regressions with no qualitative difference. Our personality effects nicely replicate the finding from the meta study of DeNeve and Cooper (cf. their Table 6), except our significant effect for openness that points in the wrong direction. Openness to experience in a university context seems to capture a different quality compared to studies from the general population. We introduce the StudiVz degree in M15 and the number of core friends in M16. We compare and conclude: while an additional “real” friend can make you happy, yet another “virtual” friend on a social networking site does not. 182 Personality on Social Network Sites (.047) (.027) (.000) (.000) (.000) (.000) ∗ ∗ ∗∗∗ ∗∗∗ ∗∗∗ ∗∗∗ (.000) -.292 (.036)(.000) -.062 (.000) .147 (.025) .175 .070 ∗ ∗ ∗∗∗ ∗∗∗ ∗∗∗ (.001) .003 (.918) .009 (.777) (.006) -.065 (.000) .198 ∗∗ ∗∗∗ ∗∗∗ 001 . 0 ects on Degree and Life Satisfaction (OLS) < ff p (.001) -.020 (.668) .002 (.941) .017 (.560) (.052) .012(.018) (.714) -.012 .145 (.673) -.293 (.000) .215 ∗∗∗ 01, M13 M14 M15 M16 . + ∗ ∗∗ cients of OLS regressions with the dependent variables number of friends on 0 ∗∗∗ Degree Distributions (Counts) Life Satisfaction Scale (1-10) ffi < .124 .069 .181 .193 Coef p Coef p Coef p Coef p StudiVz Friends Close Friends p ∗∗ 05, . 0 < p ∗ Table 5.6: Personality E 10, . Standardized beta coe 0 < 2 p Age .098 Conscientiousness -.065 NeuroticismStudiVz Friends (log)Core Friends (log) .079 FemaleN .012 (.704) -.110 963 .018 (.551) 942 .108 958 940 Openness .031 (.345) .081 R ExtraversionAgreeableness .324 .018 (.566) .037 (.271) .071 StudiVz (M13), the numberwith of p-values self-reported in close parentheses. + friends (M14), and overall life satisfaction (M15, M16), Note: Appendix 183 degree bfo bfc bfe bfa bfn clo bet ev cr Table 5.7: Raw Correlations between Personality and Node Level Indices degree bfobfcbfe .1331 bfa -.0386bfn .3211 .0271 clo -.0084bet .3034 .0265 .0857ev .0478 .8389 -.0221cr .1954 .8478lc -.0458 .1285 -.0242 .7159 .1154 -.1872 -.0262 -.4923 -.0198 .0861 -.0439 -.3186 .2955 -.1005 -.0265 .2805 -.0297 -.0306 -.0080 -.0009 .1912 -.2322 .0535 .0382 .0073 .0261 -.1549 .0079 -.0329 .6363 -.0051 .0286 .5572 -.7166 .0563 -.3058 .6867 -.3807 -.2211 -.3109 -.1506 .2702 184 Personality on Social Network Sites Openness 4.828 1.119Age (years) 1540 Cohort 1 22.96 3.778 0.281 1560 –Cohort 8 1535 0.031 – 1535 Table 5.8: Descriptive Statistics 5.954 9.242 999 Female 0.396 – 1560 Mean SD N Mean SD N 24.59729.013 20.516 21.745 946 966 Agreeableness Neuroticism 5.305 0.947 3.756 1540 1.173 1539 Dependent Variables Independent Variables 4 − 2 2 4 ∆ T T Activity ParticipationAdoption TimeFriends Friends 0.676 556.05Growth 141.37 –Node 1054 Level Indices Degree Extraversion 1560ClosenessBetweenness Conscientiousness x 100Eigenvector 4.521 4.937 0.043Constraint 1.224 1.098Transitivity 24.617 0.283 0.071 1539 1541 20.452 0.029 0.028 975 975 0.123 0.069 Cohort 975 4 0.279 Cohort 2 0.156 975 Cohort 3 0.184 Cohort 975 5 975 Cohort 0.112 6 Cohort 0.161 7 0.141 – – 0.122 – 1535 0.102 1535 – 0.049 1535 – 1535 – 1535 1535 Appendix 185

Openness Conscientiousness Extraversion 200 200 200 150 150 150 100 100 100 Frequency Frequency Frequency 50 50 50 0 0 0 1 2 3 4 5 6 7 1 2 3 4 5 6 7 1 2 3 4 5 6 7

Agreeableness Neuroticism 200 200 150 150 100 100 Frequency Frequency 50 50 0 0 1 2 3 4 5 6 7 1 2 3 4 5 6 7

Figure 5.3: Histograms of Raw Personality Scores

Chapter 6

Summary and Conclusion

On what grounds do pure strangers in the online world choose their interaction partners? Does the same decision rule apply if the two social actors know each other for a long time and have completed several successful online transactions? The answer is most likely no. On what grounds do students add their classmates to their list of friends on a social networking site? Is the rule set for strangers different from the one that applies for this closer form of relationship? The answer is most likely yes. While in stranger meetings one wants to be very selective in choosing with whom to start an engagement, people who already know each other risk little by clicking the okay button in an internet browser. Would you reject an invitation from your colleague at work, from your neighbor, or classmate to become online friends? Most people would say: it depends. However, survey results indicate that people are reluctant to reject these invitations and seem to avoid a potential offense. On SNSs, and especially on those that are rooted in the offline world, users are not very selective of whom they add to their list. It is the opportunity structure, i.e., the places where students have the chance to meet in the offline world, which largely determines the overall connection pattern we later observe by crawling their behavioral traces. The findings of this dissertation contribute to research on relationship forma- tion in the online world. The general topic was approached from the perspec- tive of very loose and rather close relationships. To answer my research ques- tions, I have first collected my datasets by means of screen-scraping technologies. Strangers have been observed on eBay Germany where I collected data from on- line auctions and tracked decision making of roughly 0.5 million traders. Data on friendships have been collected on StudiVz, a social networking site so similar to Facebook that the latter sued the former for theft and plagiarism. On StudiVz, I mapped acquaintanceships within single organizations. Five Swiss universities 188 Summary and Conclusion with 40,000 students in total have been observed at three distinct time points. In both cases I tried to compile digital traces into a comprehensive picture that al- lows me to address specific questions and to test my hypotheses. Even if these digital traces allow social scientists to observe social behavior in unprecedented and efficient ways, I come to the conclusion that the promises of computational social science cannot be fulfilled by mere amassing big data and mining it. A theory-driven approach that adds methods from the traditional toolbox seems to be a fruitful way to integrate these novel data sources. I have conducted exper- iments and executed an online survey that helped me triangulate and pin down very relevant open questions because online data sources are simply not designed for answering a social scientist’s questions.

In chapter 2 I have, together with my coauthors, measured reputation effects in online auctions. We find that the seller’s number of positive ratings has a positive effect on sales and selling price. Negative ratings show the expected reversed association with an effect that is greater in absolute terms. We found mixed evidence for the uncertainty hypothesis. For used goods we conjectured higher reputation premiums compared to new ones because in the former case buyers face two problems of trust. For used goods, there is uncertainty not only over shipment but also over quality. Note that the number of ratings is the nodal degree in a signed network of user ratings that users form by submitting feedbacks after transactions have been completed. The network of ratings is nothing else than a subgraph of a growing transaction network that is formed as a digital trace by buyers and sellers. In a perfect online world, the subgraph of ratings would completely coincide with the transaction graph. However, ratings are voluntary. And since it is at least somewhat costly to provide feedback, we have good theoretical reasons to believe that feedback is underprovided. Under the assumption of narrow self-regarding preferences, the rating graph should be empty. Without the ratings, buyers lack the necessary information to screen for potentially fraudulent sellers since the latter have little incentive to honor the buyers’ trust after having received their money. This sequential game should unravel by reasoning of backwards induction to an uncooperative equilibrium. However, it does not. At the feedback stage, cooperation is very high. Typically more than 50% of all transactions are followed by a rating. In the German eBay market this rate can even approach 80%. Buyers take this information into account and pay a reputation premium which, in turn, creates a proper incentive structure for sellers to execute their business properly and to honor placed trust. A driving force that motivates traders to submit ratings is (strong) direct reci- procity, a behavioral pattern that orthodox economist see as a violation of sub- 189 game perfection. By means of time-varying duration models we have clearly identified this conditional form of cooperation which I was also able to replicate in the lab. This form of direct reciprocity is strong because it happens in one-shot situations without repeated-game incentives. Besides reciprocity, altruistic pref- erences seem to motivate trader’s decision too. Buyers are more likely to give a positive rating to trades with few positive ratings where an additional rating still matters. Based on our findings, we argue that altruism, strong reciprocity, and also strategic motives play an important role to endogenously stabilize (second- order) cooperation.

In chapter 3, I studied the mechanism design of reputation systems. The aim of the chapter was threefold. First, I wanted to replicate the finding of the empiri- cal paper in chapter 2 following a motto from agent based modelling (Epstein and Axtell 1996): “If you can’t grow it, you don’t understand it.” In laboratory exper- iments students played binary-choice trust games in a stylized online market that mimics the procedures and feedback stage on eBay. It turns out that reputation systems are effective and that the very same effect of direct reciprocity described above determines the propensity of submitting ratings to a great extent. Second, the bilateral-sequential rating system of eBay was compared with alternatives that constrain the role of direct reciprocity on the feedback level. Early on, there was reasonable suspicion that due to the two-sided nature and the sequential execution of the rating process buyers and sellers could have incen- tives not to report outcomes truthfully. Bilateral rating systems allow for counter- punishment and it might be that this is why buyers often prefer to remain silent after failed transactions. I compared a two-sided blind and also a unilateral ver- sion with the conventional implementation. My experimental evidence clearly shows that feedback mechanisms which constrain the role of direct reciprocity exhibit substantially higher levels of trust. This also corroborates findings from two other experimental studies that share many design features of mine Masclet and Pénard 2012, Bolton et al. 2013). Third, and this is the main contribution of the article, the endogenous feed- back time is modelled with competing-risks duration models while controlling for the transaction outcome. I analyze the time until feedbacks arrive, and I do so under the general hypothesis that the bilateral system gives incentives to strate- gically accelerate or retard one’s rating. Indeed, it can be shown that players postpone their ratings in the bad transaction case to guard themselves against re- taliation. This corroborates the finding in the empirical study where a small frac- tion of negative feedbacks arrived shortly before the closing time. On the other side, timing is virtually irrelevant and mostly non-strategic for the case of positive feedbacks. Unfortunately, one key hypothesis from the literature (Dellarocas and 190 Summary and Conclusion

Wood 2008) which suggests a suppression of negative buyer feedbacks could not be addressed directly with our between-subject design. While I see buyers post- pone their negative ratings, a lower submission rate is not observed. One would need to observe their unconditional rating behavior in a within-subject design to correctly identify and disentangle their conditional reciprocal behavior. This was an unanticipated consequence of the experimental design choice. Endogenous matching was an intended omission, i.e., buyers were not able to select their sellers but were simply matched in a stranger treatment. This rules out any competition among sellers which is, of course, a major driving force in the evolution of cooperation based on image scoring. In all treatments coopera- tion unravels to the non-cooperative equilibrium due to endgame effects, irrespec- tive of the fact if a feedback mechanism is implemented or not. This raises the question on how we need to design experiments to grow sustained cooperation. The solution to this puzzle most likely can be found in a combination of indirect reciprocity with competition or partner selection. The experimental evidence is promising. Huck et al. (2012, 205) conclude: “Only if some minimal feedback information (...) is coupled with competition we observe that the market-crippling moral hazard problem is fully overcome.” So the question is what happens if we let buyers choose and preferentially attach to sellers? In the introduction of this dissertation I have shown that the interaction topol- ogy of eBay’s DVD market is very different from a well-mixed population we typically use as the benchmark matching topology. Interestingly, the simplest model to let such a degree distribution grow is the preferential attachment ran- dom graph model (Barabási and Albert 1999). Coincidence? Perhaps not. We know that structured population can change evolutionary dynamics substantially (e.g. Nowak 2006a). And from simulation models we also know what happens if we run evolutionary dynamics on the topology observed in eBay’s DVD market. Santos and Pacheco (2005) conclude: “Whenever individuals interact following networks of contacts generated via growth and preferential attachment, leading to strong correlations between individuals, cooperation becomes the dominating trait throughout the entire range of parameters.” The scale-free topology seems to exhibit, figuratively speaking, a lotus effect with “self-cleaning” properties with regards to the evolution of cooperation. This is, of course, a bite more than I am able to chew in this dissertation. However, partner choice and assortative mixing will be an interesting avenue for future research.

Growing networks, assortative mixing, and structured populations are the leading topic of chapter 4. Here, I don’t study the evolution of cooperation but expect social actors to have already solved this problem before I observe their tie generating event. This paper concerns acquaintanceship and weak forms of 191 friendship relations that have primarily formed outside the online world while leaving traces on a popular social networking site. Based on simple genera- tive network models from economics (Jackson and Rogers 2007) and statistical physics (Callaway et al. 2001, Zalányi et al. 2003), I discuss and characterize an ensemble of five online collegiate networks that correspond to five universities in Switzerland. By means of social network analysis, I reanalyze well-known theoretical and empirical findings from the network literature. First, I describe the degree distribution with the methods of maximum likelihood and show that it nicely fits to a random growth model. Furthermore, I also discuss the methods how to fit these red lines in Figure 1.1. In a second step, it will be shown that all networks exhibit the well-known Small-World property (Watts and Strogatz 1998) and that the theory of the strength of weak ties also applies to online colle- giate networks (Granovetter 1973). Furthermore, I use exponential random graph models (Goodreau et al. 2009) to map the typical opportunity structure which is, as expected, primarily determined by cohort and study subject. In the appendix of this chapter, I finally compare my ensemble with a set of 100 similar networks consisting of 1.2 million users from Facebook (Traud et al. 2011). It turns out that all those online collegiate networks are remarkably similar and that the pre- dictions from simple random graph models can give a quite accurate picture of structures that surround the students. The online world provides a lens through which social scientists can gain insights into the macro-structure of universities and raise the findings from classroom sociometry (Moreno et al. 1943, Coleman 1964, Hallinan 1978) not only to the lecture hall but to an entire campus.

In the last substantive chapter 5, I have explored how individual differences influence behavior in online social networks based on data collected from an on- line survey that can be linked to the data of chapter 4. Theories and empirical results give very different answers to the question of how many “friends” a per- son can have and why people are so remarkably different in terms of the number of relations they maintain. Anthropologists, for example, see the typical size of social groups – and with it the upper limit for the degree of a – hardwired in the human brain (Dunbar 1993, 2003). Dunbar suggest that there may be a cognitive constraint on the size of social networks and predicted with fits of neocortex size on observed group sizes that humans should live in social groups of approximately 150 individuals. In my StudiVz networks many nodes have managed to surpass this number just by collecting fellow students which, of course, is not what Dunbar would accept to be an “intensely social group.” Dun- bar’s findings do not explain variants with humans. We know from survey results that the size of “core discussion networks” of the U.S. population systematically vary with sociodemographic factors like gender, age, employment and marital 192 Summary and Conclusion status (Moore 1990). Interestingly, also traits like being “cooperative” or “rest- less/impatient” contribute surprisingly much to the explanation of ego-network sizes in the General Social Survey (McPherson et al. 2006). So the question is what explains the degree in a homogenous group with regard to sociodemograph background? Traits. With an application of the five-factor model (Goldberg 1990) from personality psychology, I used a potentially stable and exogenous source of individual variation to explain how students shape their ego-centric networks. And I can show how these network antecedents determine participation, adop- tion time, nodal degree and network growth over a period of 4 months. Besides the nodal degree, I also regress the personality scores on several centrality mea- sures (Freeman 1978, Bonacich 1972) and measures of local clustering (Watts and Strogatz 1998, Burt 1992). Statistical analysis with overdispersed degree dis- tribution models identifies primarily extraversion as a major driving force in the tie-formation process. I find a counter-intuitive positive effect for emotional sta- bility (neuroticism), a negative influence for conscientiousness and no effects for openness and agreeableness. Extraversion was an obvious result while the effect from neuroticism is not. I suspect that the latter is very domain specific. Perhaps it is the special foci of the online world that lets people with low emotional stabil- ity flourish. However, results from the myPersonality project on Facebook raises doubt (Kosinski et al. 2013). There, they find a negative effect for neuroticism while all other effects nicely mirror my findings. In closesing this dissertation, I come back to a minor detail that can be found in the appendix of chapter 5 and 3. In the online survey I asked participants to rank themselves on the life satisfaction scale. From the U.S. General Social Survey we know that there exists a robust positive correlation of the size of the core discussion network with life satisfaction (Burt 1987). The number of core or close friends was also elicited in the survey. The results are conclusive. While an additional “real” friend is associated with positive life satisfaction, yet another “virtual” friend on a social networking platform is not. The correlation is slightly positive but truly insignificant. The next time you get invited to form an online friendship and you are not sure whether to accept it or not, then, the evidence in this book will help. Utility will not be improved much by adding such a little increment of positivity. Quite the contrary could happen. We know from the first part of this dissertation that increasing one’s degree raises the probability to attract strangers. Since bad is roughly twice as strong as good, mind the tradeoff before you click ok. References

Ahn, Yong-Yeol, Seungyeop Han, Haewoon Kwak, Sue Moon, and Hawoong Jeong. 2007. “Analysis of Topological Characteristics of Huge Online Social Networking Services.” Proceedings of the 16th International Conference on World Wide Web. Banff, Canada. Pp. 835-844. Akerlof, George A. 1970. “The Market for “Lemons”: Quality Uncertainty and the Market Mechanism.” The Quarterly Journal of Economics 84:488–500. Al-Ubaydli, Omar and Min Sok Lee. 2009. “An Experimental Study of Asymmet- ric Reciprocity.” Journal of Economic Behavior and Organization 2009:138– 749. Alexander, Richard D. 1987. The Biology of Moral Systems. New York: Aldine de Gruyter. Allison, Paul D. 1982. “Discrete-Time Methods for the Analysis of Event Histo- ries.” Sociological Methodology 13:61–98. Allison, Paul D. 2009. Fixed Effects Regression Models. Thousand Oaks, CA: Sage. Anderson, Camerion, Oliver P. John, Dacher Keltner, and Ann M Kring. 2001. “Who Attains Social Status? Effects of Personality and Physical Attractive- ness in Three Social Groups.” Journal of Personality and Social Psychology 81:116–132. Andreoni, James. 1990. “Impure Altruism and Donations to Public Goods: A Theory of Warm-Glow Giving.” Economic Journal 100:464–477. Andreoni, James and John H. Miller. 1993. “Rational Cooperation in the Finitely Repeated Prisoner’s Dilemma: Experimental Evidence.” The Economic Jour- nal 103:570–585. Asendorpf, Jens B. and Susanne Wilpers. 1998. “Personality Effects on Social Relationships.” Journal of Personality and Social Psychology 74:1531–1544. Ashraf, Nava, Iris Bohnet, and Nikita Piankov. 2006. “Decomposing Trust and Trustworthiness.” Experimental Economics 9:193–208. 194 REFERENCES

Atalay, Enghin. 2013. “Sources of Variation in Social Networks.” Games and Economic Behavior 79:106–131. Axelrod, Robert and William D. Hamilton. 1981. “The Evolution of Coopera- tion.” Science 221:1390–1396. Ba, Sulin and Paul A. Pavlou. 2002. “Evidence of the Effect of Trust Building Technology in Electronic Markets: Price Premiums and Buyer Behavior.” MIS Quarterly 26:243–268. Bacharach, Michael and Diego Gambetta. 2001. “Trust in Signals.” Pp. 148– 184. In Trust in Society, edited by Karen S. Cook. New York: Russell Sage Foundation. Bacharach, Yoram, Michal Kosinski, Thore Graepel, Pushmeet Kohli, and David Stillwell. 2012. “Personality and Patterns of Facebook Usage.” ACM Web Science Conference (WebSci). Bajari, Patrick and Ali Hortaçsu. 2003. “Winner’s Curse, Reserve Prices and Endogenous Entry: Empirical Insights from eBay Auctions.” RAND Journal of Economics 34:329–355. Bajari, Patrick and Ali Hortaçsu. 2004. “Economic Insights from Internet Auc- tions.” Journal of Economic Literature 42:457–486. Barabási, Albert-Lázló and Réka Albert. 1999. “Emergence of Scaling in Ran- dom Networks.” Science 286:509–512. Baumeister, Roy F., Ellen Bratslavsky, Catrin Finkenauer, and Kathleen D. Vohs. 2001. “Bad Is Stronger Than Good.” Review of General Psychology 5:323– 370. Bearman, Peter, S., James Moody, and Katherine Stovel. 2004. “Chains of Affec- tion: The Structure of Adolescent Romantic and Sexual Networks.” American Journal of Sociology 110:44–91. Becker, Gary S. 1976. “Altruism, Egoism and Genetic Fitness: Economics of Sociobiology.” Journal of Economic Literature 14:817–826. Berg, Joyce, John Dickhaut, and Kevin McCabe. 1995. “Trust, Recprocity, and Social History.” Games and Economic Behavior 10:122–142. Berger, Roger and Katharina Schmitt. 2005. “Vertrauen bei Internetauktionen und die Rolle von Reputation, Infomationen, Treuhandangebot und Preisniveau.” Kölner Zeitschrift für Soziologie und Sozialpsychologie 57:86–111. Berger, Ulrich. 2011. “Learning to Cooperate via Indirect Reciprocity.” Games and Economic Behavior 72:30–37. Berger, Ulrich and Ansgar Grüne. 2014. “Evolutionary Stability of Indirect Reci- procity by Image Scoring.” Department of Economics Working Paper Series, 168. WU Vienna University of Economics and Business, Vienna. Bernoulli, Daniel. 1738. “Specimen theoriae novae de mensura sortis.” Commen- tarii Academiae Scientiarum Imperialis Petropolitanae 5:175–192. REFERENCES 195

Bianconi, G. and Albert-Lázló Barabási. 2001. “Competition and Multiscaling in Evolving Networks.” Europhysics Letters 54:463–442. Blau, Peter. 1977. Inequality and Heterogeneity: A Primitive Theory of Social Structure. New York: Free Press. Blau, Peter Michael. 1964. Exchange and Power in Social Life. NewYork: Wiley. Blossfeld, Hans-Peter and Götz Rohwer. 2002. Techniques of Event History Mod- eling: New Approaches to Causal Analysis. London: Lawrence Erlbaum As- sociates. Bohnet, Iris, Fiona Greig, Benedikt Herrmann, and Richard Zeckhauser. 2008. “Betrayal Aversion: Evidence from Brazil, China, Oman, Switzerland, Turkey, and the United States.” American Economic Review 98:294–310. Bohnet, Iris, Heike Harmgart, Steffen Huck, and Jean-Robert Tyran. 2005. “Learning Trust.” Journal of the European Economic Association 3:322–329. Bohnet, Iris and Richard Zeckhauser. 2004. “Trust, Risk and Betrayal.” Journal of Economic Behavior and Organization 55:467–484. Bojanowski, Michal and Rense Corten. 2014. “Measuring Segregation in Social Networks.” Social Networks 39:14–32. Bolton, Gary E., Ben Greiner, and Axel Ockenfels. 2013. “Engineering Trust: Reciprocity in the Production of Reputation Information.” Management Sci- ence 59:265–285. Bolton, Gary E., Elena Katok, and Axel Ockenfels. 2004. “How Effective Are Electronic Reputation Mechisms: An Experimental Investigation.” Manage- ment Science 50:1587–1602. Bolton, Gary E., Elena Katok, and Axel Ockenfels. 2005. “Cooperation among Strangers with Limited Information about Reputation.” Journal of Public Eco- nomics 89:1457–1468. Bolton, Gary E., Claudia Loebbecke, and Axel Ockenfels. 2008. “Does Compe- tition Promote Trust and Trustworhiness in Online Trading? An Experimental Study.” Journal of Management Information Systems 25:145–169. Bolton, Gary E. and Axel Ockenfels. 2000. “ERC: A Theory of Equity, Reci- procity, and Competition.” American Economic Review 90:166–193. Bolton, Gary E. and Axel Ockenfels. 2011. “Information Value and Externali- ties in Reputation Building.” International Journal of Industrial Organization 29:23–33. Bolton, Gary E., Axel Ockenfels, and Felix Ebeling. 2011. “Information Value and Externalities in Reputation Building.” International Journal of Industrial Organization 29:23–33. Bonacich, Phillip. 1972. “Factoring and Weighting Approaches to Status Scores and Identification.” Journal of Mathematical Sociology 2:113–120. 196 REFERENCES

Bond, Robert M., Christopher J. Fariss, Jason J. Jones, Adam D.I. Kramer, Cameron Marlow, Jaime E. Settle, and James H. Fowler. 2012. “A 61-Million- Person Experiment in Social Influence and Political Mobilization.” Nature 489:295–298. Borgatti, Stephen P. and Martin G. Everett. 2006. “A Graph-Theoretic Perspective on Centrality.” Social Networks 28:466–484. Borgatti, Stephen P. and Scott L. Feld. 1994. “How to Test the Strength of Weak Ties Theory.” Connections 17:45–46. Borgatti, Stephen P. and Pacey C. Foster. 2003. “The Network Paradigm in Or- ganizational Research.” Journal of Management 29:991–1013. Bowles, Samuel and Herbert Gintis. 2004. “The Evolution of Strong Reciprocity: Cooperation in Heterogeneous Populations.” Theoretical Population Biology 65:17–28. Bowles, Samuel and Herbert Gintis. 2008. “Cooperation.” In The New Palgrave Dictionary of Economics, edited by Steven N. Durlauf and Lawrence E. Blume. Palgrave Macmillan. Boyd, Danah M. and Nicole B. Ellison. 2007. “Social Network Sites: Definition, History, and Scholarship.” Computer-Mediated Communication 13:210–230. Boyd, Robert and Peter J. Richerson. 1988. “The Evolution of Reciprocity in Sizable Groups.” Journal of Theoretical Biology 132:337–356. Boyd, Robert and Peter J. Richerson. 1989. “The Evolution of Indirect Reci- procity.” Social Networks 11:213–236. Bracht, Juergen and Nick Feltovich. 2009. “Whatever You Say, Your Reputation Preceds You: Observation and Cheap Talk in the Trust Game.” Journal of Public Economics 93:1036–1044. Brandes, Ulrik. 2001. “A Faster Algorithm For Betweenness Centrality.” Journal of Mathematical Sociology 25:163–177. Brandt, Hannelore and Karl Sigmund. 2006. “The Good, the Bad and the Dis- criminator: Errors in Direct and Indirect Reciprocity.” Journal of Theoretical Biology 239:183–194. Brandts, Jordi and Gary Charness. 2011. “The Strategy versus the Direct- Response Method: A First Survey of Experimental Comparison.” Experimen- tal Economics 14:375–398. Brooks, Brandon, Bernie Hogan, Nicole Ellison, Cliff Lampe, and Jessica Vi- tak. 2014. “Assessing Structural Correlates to Social Capital in Facebook Ego Networks.” Social Networks 38:1–15. Brown, Martin, Armin Falk, and Ernst Fehr. 2004. “Relational Contracts and the Nature of Market Interactions.” Econometrica 72:747–780. Burks, Stephen V., Jeffrey P. Carpenter, and Eric Verhoogen. 2003. “Playing Both Roles in the Trust Game.” Journal of Economic Behavior and Organization REFERENCES 197

51:195–216. Burt, Ronald S. 1984. “Network Items and the General Social Survey.” Social Networks 6:293–339. Burt, Ronald S. 1987. “A Note on Strangers, Friends, and Happiness.” Social Networks 9:311–331. Burt, Ronald S. 1992. Structural Holes: The Social Structure of Competition. Cambridge: Harvard University Press. Burt, Ronald S. 2002. “Bridge Decay.” Social Networks 24:333–363. Burt, Ronald S. 2004. “Structural Holes and Good Ideas.” American Journal of Sociology 110:349–399. Burt, Ronald S., Joseph .E. Janotta, and James T. Mahoney. 1998. “Personality Correlates of Structural Holes.” Social Networks 20:63–87. Buskens, Vincent and Werner Raub. 2002. “Embedded Trust: Control and Learn- ing.” Pp. 167–202. In Group Cohesion, Trust and Solidarity, edited by M.W. Lawler and S.R. Thye. Amsterdam: JAI Press. Cabral, Luís and Ali Hortaçsu. 2010. “The Dynamics of Seller Reputation: Evi- dence from eBay.” The Journal of Industrial Economics . Callaway, Duncan S., John E. Hopcroft, Jon M. Kleinberg, Mark E. J. Newman, and Steven H. Strogatz. 2001. “Are Randomly Grown Graphs Really Ran- dom?” Physical Review E 64:041902. Camerer, Colin F. 1997. “Progress in Behavioral Game Theory.” Journal of Economic Perspectives 11:167–188. Camerer, Colin F. 2003. Behavioral Game Theory. Princeton, NJ: Princeton University Press. Camerer, Colin F. and Keith Weigelt. 1988. “Experimental Test of a Sequential Equilibrium Reputation Model.” Econometrica 56:1–36. Cameron, A. Colin, Jonah B. Gelbach, and Douglas L. Miller. 2008. “Bootstrap- Based Improvements for Inference with Clustered Errors.” The Review of Eco- nomics and Statistics 90:414–427. Cameron, A. Colin and Pravin K. Trivedi. 1986. “Econometric Models based on Count Data: Comparison and Applications of Some Estimators and Tests.” Journal of Applied Econometrics 1:29–53. Cameron, A. Colin and Pravin K. Trivedi. 2013. Regression Analysis of Count Data. New York: Cambridge University Press. Carpenter, Jeffrey P., Amrita G. Daniere, and Lois M. Takahashi. 2004. “Cooper- ation, Trust, and Social Capital in Southeast Asian Urban Slums.” Journal of Economic Behavior and Organization 55:533–551. Charness, Gary, Ninghua Du, and Chun-Lei Yang. 2011. “Trust and Trustwor- thiness Reputations in an Investment Game.” Games and Economic Behavior 72:361–375. 198 REFERENCES

Chen, Kay-Yut and Tad Hogg. 2005. “Experimental Evalutation of an eBay-Style Self-Reporting Reputation Mechanism.” Lecture Notes in Computer Science 3828:434–443. Clauset, Aaron, Cosma Rohilla Shalizi, and Mark E. J. Newman. 2009. “Power- Law Distributions in Empirical Data.” SIAM Review 51:661–703. Coleman, James S. 1958. “Relational Analysis: The Study of Social Organiza- tions With Survey Methods.” Human Organizations 17:28–36. Coleman, James S. 1961. The Adolescent Society. New York: Free Press. Coleman, James S. 1964. Introduction to Mathematical Sociology. New York: Free Press. Coleman, James S. 1990. Foundations of Social Theory. Cambridge, MA: Belk- nap Press of Harvard University Press. Cook, Karen S., Russel Hardin, and Margaret Levi. 2007. Cooperation without Trust? New York: Russell Sage Foundation. Cook, Karen S., Eric R. W. Rice, and Alexandra Gerbasi. 2004. “The Emergence of Trust Networks Under Uncertainty: The Case of Transitional Economies – Insights from Social Psychological Research.” Pp. 193–212. In Creating Social Trust in Post-Socialist Transition, edited by J Kornai, B. Rothstein, and S. Rose-Ackerman. New York: Palgrave MacMillan. Corten, Rense. 2012. “Composition and Structure of a Large Online Social Net- work in the Netherlands.” PLOS One 7:e34760. Cosmides, Leda and John Tooby. 1992. “Cognitive Adaptions for Social Ex- change.” Pp. 163–228. In The Adapted Mind, edited by J. H. Barkow, Leda Cosmides, and John Tooby. New York: Oxford University Press. Costa, Paul T. and Robert R. McCrae. 1992. Revised NEO Personality Inventory NEO PI-R and NEO Five-Factor Inventory NEO-FFI. Odessa, FL: Psycholog- ical Assesment Resources. Cox, James C. 2004. “How to Identify Trust and Reciprocity.” Games and Eco- nomic Behavior 46:260–281. Cripps, Martin W. 2008. “Reputation.” In The New Palgrave Dictonary, edited by Steven N. Durlauf and Lawrence E. Blume. Palgrave Macmillan. Csárdi, Gábor and Tamás Nepusz. 2006. “The Igraph Software Package for Com- plex Network Research.” InterJournal Compex Systems:1695. Currarini, Sergio, Matthew O. Jackson, and Paolo Pin. 2009. “An Economic Model of Friendship: Homophily, Minorities and Segregation.” Econometrica 77:1003–1045. Currarini, Sergio, Matthew O. Jackson, and Trevor Pinch. 2010. “Identifying the Roles of Race-Based Choice and Chance in High School Friendship Network Formation.” PNAS 107:4857–4861. REFERENCES 199

Dasgupta, Partha. 1988. “Trust as a Commodity.” Pp. 49–72. In Trust: Making and Breaking Cooperative Relations, edited by Diego Gambetta. Oxford: Basil Blackwell. Davidson, A. C., D. V. Hinkley, and G. A. Young. 2003. “Recent Developments in Bootstrap Methodology.” Statistical Science 18:141–157. Davis, James. 1963. “Structural Balance, Mechanical Solidarity, and Interper- sonal Relations.” American Journal of Sociology 68:444–462. de Quervain, Dominique J.-F., Urs Fischbacher, Valerie Treyer, Melanie Schell- hammer, Ulrich Schnyder, Alfred Buck, and Ernst Fehr. 2004. “The Neural Basis of Altruistic Punishment.” Science 305:1254–1258. De Raad, Boele and Henri C. Schouwenburg. 1996. “Personality in Lerning and Education: A Review.” European Journal of Personality 10:303–336. Dellarocas, Chrysanthos. 2003. “The Digitization of Word of Mouth: Promise and Challenges of Online Feedback Mechanisms.” Management Science 49:1409–1424. Dellarocas, Chrysanthos, Ming Fan, and Charles A. Wood. 2004. “Self-Interest, Reciprocity, and Participation in Online Reputation Systems.” Working Paper 205, MIT Center for Digital Business. Dellarocas, Chrysanthos and Charles A. Wood. 2008. “The Sound of Silence in Online Feedback: Estimating Trader Satisfaction in the Presence of Reporting Bias.” Management Science 54:460–476. DeNeve, Kristina M. and Harris Cooper. 1998. “The Happy Personality: A Meta- Analysis of 137 Personality Traits and Subjective Well-Being.” Psychological Bulletin 124:197–229. Dewally, Michaël and Louis Ederington. 2006. “Reputation, Certification, War- ranties, and Information as Remedies for Seller-Buyer Information Asymme- tries: Lessons from the On-line Comic Book Market.” Journal of Business 79:693–729. Dewan, Sanjeev and Vernon Hsu. 2004. “Adverse Selection in Electronic Mar- kets: Evidence From Online Stamp Auctions.” The Journal of Industrial Eco- nomics 52:497–516. Diekmann, Andreas. 2004. “The Power of Reciprocity.” Journal of Conflict Resolution 48:487–515. Diekmann, Andreas, Ben Jann, Wojtek Przepiorka, and Stefan Wehrli. 2014. “Reputation Formation and the Evolution of Cooperation in Anonymous On- line Markets.” American Sociological Review 79:65–85. Diekmann, Andreas, Ben Jann, and David Wyder. 2009. “Trust and Reputation in Internet Auctions.” Pp. 139–165. In eTrust: Forming Relationships in the Online World, edited by Karen S. Cook, Chris Snijders, Vincent Buskens, and Coye Cheshire. New York: Russel Sage. 200 REFERENCES

DiMaggio, Paul and Hugh Louch. 1998. “Socially Embedded Consumer Trans- actions: for What Kinds of Purchases Do People Most Often Use Networks?” American Sociological Review 63:619–637. Dorogostev, Staniev N., José F.F. Mendes, and A. N. Samukhin. 2000. “Struc- ture of Growing Networks with Preferential Linking.” Physical Review Letters 85:4633–4636. Downs, Anthony. 1957. An Economic Theory of Democracy. New York: Harper and Row. Duffy, John and Jack Ochs. 2009. “Cooperative Behavior and the Frequency of Social Interaction.” Games and Economic Behavior 66:785–812. Dunbar, R.I.M. 1993. “Coevolution of Neocortical Size, Group Size and Lan- guage in Humans.” Behavioral and Brain Sciences 16:681–735. Dunbar, R.I.M. 2003. “The Social Brain: Mind, Language, and Society in Evo- lutionary Perspective.” Annual Review of Anthropology 32:163–181. Eaton, David H. 2002. “Valuing Information: Evidence from Guitar Auctions on eBay.” Journal of Applied Economics and Policy 24:1–19. Egloff, Boris, David Richter, and Stefan C. Schmukle. 2013. “Need for Conclu- sive Evidence that Positive and Negative Reciprocity are Unrelated.” PNAS 110:786. Ellison, Nicole, Charles Steinfield, and Cliff Lampe. 2006. “Spatially Bounded Online Social Networks and Social Capital.” Annual Conference of the Inter- national Communication Association 2006, Dresden. Ellison, Nicole B., Charles Steinfield, and Cliff Lampe. 2007. “The Benefits of Facebook "Friends:" Social Capital and College Students’ Use of Online So- cial Network Sites.” Journal of Computer Mediated Communication 12:1143– 1168. Engelmann, Dirk and Urs Fischbacher. 2009. “Indirect Reiciprocity and Strategic Reputation Building in an Experimental Helping Game.” Games and Economic Behavior 67:399–407. Epstein, Joshua M. and Robert Axtell. 1996. Growing Artificial Societies: Social Science from the Bottom Up. Washington, D.C.: The Brookings Institution. Erdos,˝ Paul and Alfréd Rényi. 1959. “On Random Graphs.” Publicationes Math- ematicae Debrecen 6:290–297. Ermisch, John and Diego Gambetta. 2006. “People’s Trust: The Design of a Survey-based Experiment.” IZA Discussion Paper 2216. Institute for the Study of Labor, Bonn. Falk, Armin and Urs Fischbacher. 2006. “A Theory of Reciprocity.” Games and Economic Behavior 54:293–315. Fehr, Ernst. 2009. “On the Economics and Biology of Trust.” Journal of the European Economic Association 7:235–266. REFERENCES 201

Fehr, Ernst, Urs Fischbacher, and Simon Gächter. 2002. “Strong Reciprocity, Human Cooperation, and the Enforcement of Social Norms.” Human Nature 13:1–25. Fehr, Ernst and Simon Gächter. 2000. “Fairness and Retaliation: The Economics of Reciprocity.” Journal of Economic Perspectives 14:159–181. Fehr, Ernst and Simon Gächter. 2002. “Altruistic Punishment in Humans.” Nature 415:137–140. Fehr, Ernst and Klaus M. Schmidt. 1999. “A Theory of Fairness, Competition, and Cooperation.” The Quarterly Journal of Economics 114:817–868. Feld, Scott L. 1981. “The Focused Organization of Social Ties.” American Jour- nal of Sociology 86:1015–1035. Feld, Scott L. 1991. “Why Your Friends Have More Friends than You Do.” Amer- ican Journal of Sociology 96:1464–1477. Festinger, Leon, Stanley Schachter, and Kurt Back. 1950. Social Pressures in In- formal Groups: A Study of Human Factors in Housing. Stanford, CA: Stanford University Press. Fine, Jason P. and Robert J. Gray. 1999. “A Proportional Hazards Model for the Subdistribution of a Competing Risk.” Journal of the American Statistical Association 94:496–509. Fischbacher, Urs. 2007. “z-Tree: Toolbox for Ready-made Economic Experi- ments.” Experimental Economics 10:171–178. Fortunato, Santo. 2010. “Community Detection in Graphs.” Physics Reports 486:75–174. Frank, Robert H. 1988. Passion within Reason: The Strategic Role of Emotions. New York: Norton. Freeman, Linton C. 1978. “Centrality in Networks. A Conceptual Clarification.” Social Networks 1:215–239. Friedkin, Noah E. 1980. “A Test of Structural Features of Granovetter’s Strength of Weak Ties Theory.” Social Networks 2:411–422. Fudenberg, Drew and Eric Maskin. 1986. “The Folk Theorem in Repeated Games with Discounting or with Incomplete Information.” Econometrica 50:533–554. Fudenberg, Drew and Eric Maskin. 1990. “Evolution and Cooperation in Noisy Repeated Games.” American Economic Review 80:274–279. Gambetta, Diego. 2009. “Signaling.” Pp. 168–194. In The Oxford Handbook of Analytical Sociology, edited by Peter Hedström and Peter S. Bearman. Oxford: Oxford University Press. Gautschi, Thomas. 2000. “History Effects in Social Dileamma Situations.” Ra- tionality and Society 12:131–162. Gazzale, Robert S. and Tapan Khopkar. 2011. “Remain Silent and Ye Shall Suf- fer: Seller Exploitation of Reticent Buyers in an Experimental Reputation Sys- 202 REFERENCES

tem.” Experimental Economics 14:273–285. Gerlitz, Jean-Yves and Jürgen Schupp. 2005. “Zur Erhebung der Big-Five- basierten Persönlichkeitsmerkmale im SOEP.” Berlin: DIW Research Notes. Gintis, Herbert. 2000. “Strong Reciprocity and Human Sociality.” Journal of Theoretical Biology 206:169–179. Girvan, Michelle and Mark E. J. Newman. 2002. “Community Structure in Social and Biological Networks.” PNAS 99:7821–7826. Gjorka, Minas, Maciej Kurant, Carter T. Butts, and Athina Markopoulou. 2010. “Walking in Facebook: A Case Study of Unbiased Sampling of OSNs.” Pro- ceedings of the IEEE INFOCOM 2010. Goldberg, Lewis R. 1990. “An Alternative Description of Personality: The Big- Five Factor Structure.” Journal of Personality and Social Psychology 70:1216– 1229. Goodreau, Steven M., James A. Kitts, and Martina Morris. 2009. “Birds of a Feather, or . Using Exponential Random Graph Models To Investigate Adolescent Social Networks.” Demography 46:103–125. Gouldner, Alvin. 1960. “The Norm of Reciprocity: A Preliminary Statement.” American Sociological Review 25:161–178. Granovetter, Mark S. 1973. “The Strength of Weak Ties.” American Journal of Sociology 78:1360–1380. Granovetter, Mark S. 1985. “Economic Action and Social Structure: The Problem of Embededness.” American Journal of Sociology 91:481–510. Granovetter, Mark S. 1992. “Problems of Explanation in Economic Sociology.” Pp. 25–56. In Networks and Organizations: Structure, Form, and Action, edited by N. Nohira and R. G. Eccles. Boston: Harvard Business School Press. Greif, Avner. 1989. “Reputation and Coalitions in Medieval Trade: Evidence on the Maghribi Traders.” Journal of Economic History 49:857–882. Greiner, Ben and Maria Vittoria Levati. 2005. “Indirect Reciprocity in Cyclical Networks: An Experimental Study.” Journal of Economic Psychology 26:711– 731. Guala, Francesco. 2012. “Reciprocity: Weak or strong? What Punishment Ex- periments Do (and Do Not) Demonstrate.” Behavioral and Brain Sciences 35:1–59. Guimarães, Paulo and Pedro Portugal. 2010. “A Simple Feasible Alternative Procedure to Estimate Models with High-Dimensional Fixed Effects.” Stata Journal 10:628–649. Gulati, Ranjay and Martin Gargiulo. 1999. “Where Do Interorganizational Net- works Come From.” American Journal of Sociology 104:1439–1493. Güth, Werner and Axel Ockenfels. 2003. “The Coevolution of Trust and Insti- tutions in Anonymous and Non-anonymous Communities.” Pp. 157–174. In REFERENCES 203

Jahrbuch für Neue Politische Ökonomie, edited by Manfred J. Holler, Hartmut Kliemt, D. Schmidtchen, and M. Streit. Tübingen: Mohr Siebeck. Hallinan, Maureen T. 1978. “The Process of Friendship Formation.” Social Net- works 1:193–210. Handcock, Mark S. and James Holland Jones. 2004. “Likelihood-based Inference for Stochastic Models of Formation.” Theoretical Population Biology 65:413–422. Hardin, Russel. 2002. Trust and Trustworthiness. New York: Russell Sage Foun- dation. Hardin, Russel. 2004. “Internet Capital.” Analyse & Kritik 26:122–138. Harsanyi, John C. and Reinhard Selten. 1988. A General Theory of Equilibrium Selection in Games. Cambridge, MA: MIT Press. Hausman, Jerry A., Bronwyn H. Hall, and Zvi Griliches. 1984. “Econometric Models for Count Data with an Application to the Patents-R&D Relationship.” Econometrica 52:909–938. Heath, Anthony. 1976. Rational Choice and Social Exchange: A Critique of Exchange Theory. Cambridge: Cambridge University Press. Heckathorne, Douglas D. 1989. “Collective Action and the Second-Order Free- Rider Problem.” Rationality and Society 1:78–100. Heider, Fritz. 1946. “Attitudes and Cognitive Organization.” Journal of Psychol- ogy 21:107–112. Heidermann, Julia, Mathias Klier, and Florian Probst. 2012. “Online Social Net- works: A Survey of a Global Phenomenon.” Computer Networks 56:3866– 3878. Helbing, Dirk, Martin Schönhof, Hans-Ulrich Stark, and Janusz A. Holyst. 2005. “How Individuals Learn to Take Turns: Emergence of Alternating Cooperation in a Congestion Game and the Prisoner’s Dilemma.” Advances in Complex Systems 8:87–116. Hoffman, Elizabeth, Kevin A. McCabe, and Vernon L. Smith. 1998. “Behav- ioral Foundations of Reciprocity: Experimental Economics and Evolutionary Psychology.” Economic Inquiry 36:335–352. Holme, P., Christofer R. Edling, and Fredrik Liljeros. 2004. “Structure and Evo- lution of an Internet Dating Community.” Social Networks 26:155–174. Hörner, Johannes. 2002. “Reputation and Competition.” American Economic Review 92:644–663. Houser, Daniel and John Wooders. 2006. “Reputation in Auctions: Theory, and Evidence from eBay.” Journal of Economics & Management Strategy 15:353– 369. Hsu, Greta, Michael T. Hannan, and Özgecan Koçak. 2009. “Multiple Category Memberships in Markets: An Integrative Theory and Two Empirical Tests.” 204 REFERENCES

American Sociological Review 74:150–169. Hu, Haibo and Jin-Li Guo. 2012. “A Comparative Research on Facebook Net- works in Different Institutions.” Advances in Complex Systems 15:1250030. Hubert, L. J. and P. Arabie. 1989. “Combinatorial Data Analysis: Confirmatory Comparisons Between Sets of Matrices.” Applied Stochastic Models and Data Analysis 5:273–325. Huck, Steffen, Gabriele K. Luenser, and Jean-Robert Tyran. 2012. “Competition Fosters Trust.” Games and Economic Behavior 76:195–209. Hunter, David R. and Mark S. Handcock. 2006. “Inference in Curved Exponen- tial Family Models for Networks.” Journal of Computational and Graphical Science 15:565–583. Jackson, Matthew O. and Brian W. Rogers. 2007. “Meeting Strangers and Friends of Friends: How Random are Social Networks?” American Economic Review 97:890–915. Jang, Kerry L., Robert R. McCrae, Alois Angleitner, Rainer Riemann, and W. John Livesley. 1998. “Heritability of Facet-level Traits in a Cross-Cultural Twin Sample: Support for a Hierarchical Model of Personality.” Journal of Personality and Social Psychology 74:1556–1565. Jann, Ben. 2005. “kdens: Stata Module for Univariate Kernel Density Estima- tion.” Boston College, Department of Economics, Statistical Software Compo- nents, S456410. Jensen-Campbell, L. A., R. Adams, D. G. Perry, K. A. Workman, J. Q. Furdella, and S. K. Egan. 2002. “Agreeableness, Extraversion, and Peer Relations in Early Adolescence: Winning Friends and Deflecting Aggression.” Journal of Research in Personality 36:224–251. Jian, Lian, Jeffrey K. MacKie-Mason, and Paul Resnick. 2010. “I Scratched Yours: The Prevalence of Reciporocation in Feedback Provision on eBay.” The B.E. Journal of Economic Analysis & Policy 10:1935–1682. Jones, Candace, William S. Hesterly, and Stephen P. Borgatti. 1997. “A Gen- eral Theory of Network Governance: Exchange Conditions and Social Mech- anisms.” Academy of Management Journal 22:911–945. Jøsang, Audun, Roslan Ismail, and Colin Boyd. 2007. “A Survey of Trust and Reputation System for Online Service Provision.” Decision Support Systems 43:618–644. Kahneman, Daniel and Amos Tversky. 1979. “Prospect Theory: An Analysis of Decision under Risk.” Econometrica 47:263–292. Kalbfleisch, John D. and Ross L. Prentice. 1980. The Statistical Analysis of Fail- ure Time Data. Wiley Series in Probability and Mathematical Statistics. New York: John Wiley & Sons. REFERENCES 205

Kalish, Yuval and Garry Robins. 2005. “Psychological Predispositions and Net- work Structure: The Relationship between Individual Predispositions, Strucu- tral Holes and Network Closure.” Social Networks 28:56–84. Kandori, Michihiro. 1992. “Social Norms and Community Enforcement.” The Review of Economic Studies 49:63–80. Keser, Claudia. 2003. “Experimental Games for the Design of Reputation Man- agement Systems.” IBM Systems Journal 42:498–506. Keysar, Boaz, Benjamin A. Converse, Jiunwen Wang, and Nicholas Epley. 2008. “Reciprocity Is Not Give and Take: Asymmetric Reciprocity to Positive and Negative Acts.” Psychological Science 19:1280–1286. Klein, Daniel B. 1992. “Promise Keeping in the Great Society: A Model of Credit Information Sharing.” Economics and Politics 4:117–136. Klein, Katherine J., Beng-Chong Lim, Jessica L. Saltz, and David M. Mayer. 2004. “How Do They Get There? An Examination of the Antecedents of Centrality in Team Networks.” Academy of Management Journal 47:952–963. Koehly, L.M., S.M. Goodreau, and Martina Morris. 2004. “Exponential Family Models for Sampled and Census Network Data.” Sociological Methodology 34:241–270. Kollock, Peter. 1994. “The Emergence of Exchange Structures: An Experimental Study of Uncertainty, Commitment, and Trust.” American Journal of Sociology 100:313–345. Kollock, Peter. 1999. “The Production of Trust in Online Markets.” Pp. 99–123. In Advances in Group Processes, edited by E.J. Lawler and Michael W. Macy, volume 16. Amsterdam: JAI Press. Kosinski, Michal, David Stillwell, and Thore Graepel. 2013. “Private Traits and Attributes are Predictable from Digital Records of Human Behavior.” PNAS 110:5802–5805. Kossinets, Gueorgi and Duncan J. Watts. 2009. “Origins of Homophily in an Evolving Social Network.” American Journal of Sociology 115:405–450. Krackhardt, David. 1987. “QAP Partialling as a Test of Spuriousness.” Social Networks 9:171–186. Krapivsky, Paul L., Sidney Redner, and F. Leyvraz. 2000. “Connectivity of Grow- ing Random Networks.” Physical Review Letters 85:4629–4632. Kreps, David M. 1990. Corporate Culture and Economic Theory. Perspecitives on Positive Political Economy. Cambridge: Cambridge University Press. Kreps, David M., Paul Milgrom, John Roberts, and Robert Wilson. 1982. “Ra- tional Cooperation in the Finitely Repeated Prisoners’ Dilemma.” Journal of Economic Theory 27. Kreps, David M. and Robert Wilson. 1982. “Reputation and Imperfect Informa- tion.” Journal of Economic Theory 27:253–279. 206 REFERENCES

Lampe, Cliff A. C., Nicole Ellison, and Charles Steinfield. 2007. “A Famil- iar Face(book): Profile Elements as Signals in an Online Social Network.” SIGCHI Conference on Human Factors in Computing Systems, Conference Contribution. Lazarsfeld, Paul F. and Robert K. Merton. 1978. “Friendship as Social Process: A Substantive and Methodological Analysis.” Pp. 18–66. In Freedom and Control in Modern Society, edited by Morroe Berger and Theodore Abel. New York: Van Nostrand. Lazer, David, Ryan Kennedy, Gary King, and Alessandro Vespignani. 2014. “The Parable of Google Flu: Traps in Big Data Analysis.” Science 343:1203–1205. Lazer, David, Alex Pentland, Lada Adamic, Sinan Aral, Albert-Lázló Barabási, Devon Brewer, Nicholas A. Christakis, Noshir Contractor, James H. Fowler, Myron Gutman, Tony Jebara, Gary King, Michael W. Macy, Deb Roy, and Marshall Van Alstyne. 2009. “Life in the Network: The Coming Age of Com- putational Social Science.” Science 323:721–723. Lee, Conrad, Thomas Scherngell, and Michael J. Barber. 2011. “Investigating an Online Social Network Using Spatial Interaction Models.” Social Networks 33:129–133. Leimar, Olof and Peter Hammerstein. 2001. “Evolution of Cooperation through Indirect Reciprocity.” Proceedings of the Royal Society B 268:745–753. Lewis, Kevin, Jason Kaufman, and Nicholas A. Christakis. 2008. “The Taste for Privacy: An Analysis of College Student Privacy Settings in an Online Social Network.” Journal of Computer-Mediated Communication 14:79–100. Li, Lingfang and Erte Xiao. 2014. “Money Talks: Rebate Mechanisms in Repu- tation System Design.” Management Science 60:2054–2072. Livingston, Jeffrey A. 2003. “What Attracts a Bidder to a Particular Internet Auc- tion?” Pp. 165–187. In Organizing the New Industrial Economy (Advances in Applied Microeconomics, Volume 12), edited by Michael R. Baye. Amsterdam: Emerald Group Publishing Limited. Livingston, Jeffrey A. 2005. “How Valuable is a Good Reputation? A Sample Selection Model of Internet Auctions.” Review of Economics and Statistics 87:453–465. Long, J. Scott. 1997. Regression Models for Categorical and Limited Dependent Variables. Thousand Oaks, CA: Sage. Lucking-Reiley, David, Doug Bryan, Naghi Prasad, and Daniel Reeves. 2007. “Pennies from eBay: the Determinants of Price in Online Auctions.” Journal of Industrial Economics 55:223–233. Lumeau, Marianne, David Masclet, and Thierry Pénard. 2013. “Reputation and Social (Dis)approval in Feedback Mechanisms: An Experimental Study.” CREM Working Paper 2013-43. REFERENCES 207

Malevergne, Yannik, V. Pisarenko, and Didier Sornette. 2005. “Empirical Dis- tribution of Stock Returns: Between the Stretched Exponential and the Power Law?” Quantitative Finance 5:379–401. Malinowski, Bronislaw. 1922. Argonauts of the Western Pacific: An Account of Native Enterprise and Adventure in the Archipelagoes of Melanesian New Guinea. London: Routledge & Kegan Paul. Malinowski, Bronislaw. 1926. Crime and Custom in Savage Society. London: Routledge & Kegan Paul. Marsden, Peter V. 1987. “Core Discussion Networks of Americans.” American Sociological Review 52:122–131. Marsden, Peter V. 2005. “Recent Developments in Network Measurement.” Pp. 8–30. In Recent Advances in Social Network Analysis, edited by Peter J. Car- rington, John Scott, and Stanley Wasserman. Cambridge: Cambridge Univer- sity Press. Masclet, David, Charles Noussair, Steven Tucker, and Marie-Claire Villeval. 2003. “Monetary and Nonmonetary Punishment in the Voluntary Contribu- tions Mechanism.” American Economic Review 93:366–380. Masclet, David and Thierry Pénard. 2012. “Do Reputation Feedback System Really Improve Trust among Anonymous Traders? An Experimental Study.” Applied Economics 44:4553–4573. Mauss, Marcel. [1950] 1990. The Gift: The Form and Reason for Exchange in Archaic Society. London: Routledge. Mayer, Adalbert. 2009. “Online Social Networks in Economics.” Decision Sup- port Systems 57:169–184. Mayer, Adalbert and Steven L. Puller. 2008. “The Old Boy (and Girl) Network: Social Network Formation on University Campuses.” Journal of Public Eco- nomics 92:329–347. McCarty, Christopher and H. D. Green. 2005. “Personality and Personal Net- works.” Sunbelt XXV, Conference Contribution. McCrae, Robert R. 1996. “Social Consequences of Experimental Openness.” Psychological Bulletin 120:323–337. McCrae, Robert R. and Paul T. Costa. 1990. Personality in Adulthood. New York: Guilford Press. McDonald, Cynthia G. and Carlos Jr. Slawson. 2002. “Reputation in an Internet Auction Market.” Economic Inquiry 40:633–650. McPherson, M., L. Smith-Lovin, and E.M. Brashears. 2006. “Social Isolation in America: Changes in Core Discussion Networks over Two Decades.” Ameri- can Sociological Review 71:353–375. McPherson, M., L. Smith-Lovin, and J.M. Cook. 2001. “Birds of a Feather. Homophily in Social Networks.” Annual Review of Sociology 27:415–444. 208 REFERENCES

Mehra, Ajay, Martin Kilduff, and Daniel J. Brass. 2001. “The Social Networks of High and Low Self-Monitors: Implications for Workplace Performance.” Administrative Science Quarterly 46:121–146. Melnik, Mikhail I. and James Alm. 2002. “Does a Seller’s E-Comerce Reputation Matter? Evidence from eBay Auctions.” The Journal of Industrial Economics 50:337–349. Melnik, Mikhail I. and James Alm. 2005. “Seller Reputation, Information Sig- nals, and Prices for Heterogenous Coins on eBay.” Southern Economic Journal 72:305–328. Merton, Robert K. 1968. “The Matthew Effect in Science.” Science 159:56–63. Milgrom, Paul R., Douglass C. North, and Barry R. Weingast. 1990. “The Role of Institutions in the Revival of Trade: The Law Merchant, Private Judges, and the Champagne Fairs.” Economics and Politics 2:1–23. Mislove, Alan, Massimiliano Marcon, Kirshna P. Gummadi, Peter Druschel, and Bobby Bhattacharjee. 2007. “Measurement and Analysis of Online Social Net- works.” IMC 2007 Conference Proceedings, San Diego. Mood, Carina. 2010. “Logistic Regression: Why We Cannot Do What We Think We Can Do, And What We Can Do About It.” European Sociological Review 26:67–82. Moody, James. 2001. “Race, School Integration, and Friendship Segregation in America.” American Journal of Sociology 107:679–716. Moore, Gwen. 1990. “Structural Determinants of Men’s and Women’s Personal Networks.” American Sociological Review 55:726–735. Moreno, Jacob L., Helen H. Jennings, and Richard Stockton. 1943. “Sociometry in the Classroom.” Sociometry 6:425–428. Newman, Mark E. J. 2002. “Assortative Mixing in Networks.” Physical Review Letters 89:208701. Newman, Mark E. J. 2003a. “Mixing Patterns in Networks.” Physical Review E 67:026126. Newman, Mark E. J. 2003b. “The Structure and Function of Complex Networks.” SIAM Review 45:167–256. Newman, Mark E. J. 2005. “Power laws, Pareto distributions and Zipf’s law.” Contemporary Physics 46:323–351. Newman, Mark E. J. 2009. “Random Graphs with Clustering.” Physical Review E 103:058701. Newman, Mark E. J. 2010. Networks: An Introduction. Oxford: Oxford Univer- sity Press. Newman, Mark E. J., Stephanie Forrest, and Justin Balthrop. 2002. “Email Net- works and the Spread of Computer Viruses.” Physical Review E 66:035101. REFERENCES 209

Newman, Mark E. J. and Michelle Girvan. 2003. “Mixing Patterns and Commu- nity Structure in Networks.” Pp. 66–87. In Statistical Mechanics of Complex Networks, edited by Romualdo Pastor-Satorras, Miguel Rubi, and Albert Diaz- Guilera, volume 625. Berlin: Springer. Newman, Mark E. J., Steven H. Strogatz, and Duncan J. Watts. 2001. “Random Graphs with Arbitrary Degree Distribution and their Applications.” Physical Review E 64:026118. Nikiforakis, Nikos. 2008. “Punishment and Counter-Punishment in Public Good Games: Can We Really Govern Ourselves?” Journal of Public Economics 92:91–112. Nowak, Martin A. 2006a. Evolutionary Dynamics: Exploring the Equations of Life. Cambridge: Belknap Press of Harvard University Press. Nowak, Martin A. 2006b. “Five Rules for the Evolution of Cooperation.” Science 314:1560–1563. Nowak, Martin A. and Karl Sigmund. 1998a. “The Dynamics of Indirect Reci- procity.” Journal of Theoretical Biology 194:561–574. Nowak, Martin A. and Karl Sigmund. 1998b. “Evolution of Indirect Reciprocity by Image Scoring.” Nature 393:573–577. Nowak, Martin A. and Karl Sigmund. 2005. “Evolution of Indirect Reciprocity.” Nature 437:1291–1298. Ockenfels, Axel. 2003. “Reputationsmechansimen auf Internet- Marktplattformen: Theorie und Empirie.” Zeitschrift für Betriebswirtschaft Pp. 295–315. Offerman, Theo. 2002. “Hurting Hurts More Than Helping Helps.” European Economic Review 46:1423–1437. Okuno-Fujiwara, Masahiro and Andrew Postlewaite. 1995. “Social Norms and Random Matching Games.” Games and Economic Behavior 9:79–109. Onnela, Jukka-Pekka, J. Saramäki, Jörkki Hyvönen, Gabor Szabo, David Lazer, Kimmo Kaski, Janos Kertész, and Albert-Lázló Barabási. 2007. “Structure and Tie Strength in Mobile Communication Networks.” PNAS 104:7332–7336. Pagano, Marco and Tullio Jappelli. 2002. “Information Sharing, Lending and Defaults: Cross-Country Evidence.” Journal of Banking & Finance 26:2017– 2045. Panchanathan, Kathik and Robert Boyd. 2003. “A Tale of Two Defectors: The Importance of Standing in the Evolution of Indirect Reciprocity.” Journal of Theoretical Biology 224:115–126. Pool, Ithiel de Sola and Manfred Kochen. 1978. “Contacts and Influence.” Social Networks 1:5–51. Porter, Mason A., Jukka-Pekka Onnela, and Peter J. Mucha. 2009. “Communities in Networks.” Notices of the AMS 56:1082–1097. 210 REFERENCES

Prentice, Ross L. and L.A. Gloeckler. 1978. “Regression Analysis of Grouped Survival Data with Application to Breast Cancer Data.” Biometrics 34:57–67. Price, Derek J. De Solla. 1976. “A General Theory of Bibliometric and Other Cumulative Advantage Processes.” Journal of the American Society for Infor- mation Science 27:292–306. Przepiorka, Wojtek. 2013. “Buyers Pay For and Sellers Invest in a Good Reputa- tion: More Evidence from eBay.” Journal of Socio-Economics 42:31–42. Putnam, Robert D. 2000. Bowling Alone: The Collapse and Revival of American Community. New York: Simon & Schuster. Quercia, Daniele, Renaud Lambiotte, David Stillwell, Michal Kosinski, and Jon Crowcroft. 2012. “The Personality of Popular Facebook Users.” ACM Con- ference on Computer Supported Cooperative Work, Seattle, WA. Rabin, Matthew. 1993. “Incorporation Fairness into Game Theory and Eco- nomics.” The American Economic Review 83:1281–1302. Rapoport, Anatol. 1953. “Spread of Information through a Population with Socio- Structural Bias: I. Assumption of Transitivity.” Bulletin of Mathematical Bio- physics 15:523–533. Resnick, Paul and Richard Zeckhauser. 2002. “Trust Among Stranger in Internet Transactions: Empirical Analysis of eBay’s Reputation System.” The Eco- nomics of the Internet and E-Commerce 11:127–157. Resnick, Paul, Richard Zeckhauser, Eric J. Friedman, and Ko Kuwaba. 2000. “Reputation Systems.” Communication of the ACM 43:45–48. Resnick, Paul, Richard Zeckhauser, John Swanson, and Kate Lockwood. 2006. “The Value of Reputation on eBay: A Controlled Experiment.” Experimental Economics 9:79–101. Rockenbach, Bettina and Manfred Milinski. 2006. “The Efficient Interaction of Indirect Reciprocity and Costly Punishment.” Nature 444:718–723. Rosenthal, Robert W. 1979. “Sequences of Games with Varying Opponents.” Econometrica 47:1353–1366. Rosenthal, Robert W. and Henry J. Landau. 1979. “A Game-Theoretic Analysis of Bargaining with Reputations.” Journal of Mathematical Psychology 20:233– 255. Ross, Craig, Emily S. Orr, Mia Sisic, Jaime M. Arseneault, Mary G. Simmer- ing, and R. Robert Orr. 2009. “Personality and Motivations Associated with Facebook Use.” Computers in Human Behavior 25:578–586. Rubin, Donald B and Roderick J.A. Little. 2002. Statistical Analysis with Missing Data. New York: Wiley, 2nd edition. Russell, Daniel W., Brenda Booth, David Reed, and Philip R. Laughlin. 1997. “Personality, Social Networks, and Perceived Social Support among Alco- holics: A Structural Equation Analysis.” Journal of Personality 63:649–692. REFERENCES 211

Ruths, Derek and Jürgen Pfeffer. 2014. “Social Media for Large Studies of Be- havior.” Science 346:1063–1064. Ryan, Tracii and Sophia Xenos. 2011. “Who Uses Facebook? An Investigation Into the Relationship Between the Big Five, Shyness, Narcissism, Loneliness, and Facebook Usage.” Computers in Human Behavior 27:1658–1664. Sahlins, Marshall. 1972. Stone Age Economics. Chicago, IL: Aldine Atherton. Santos, F.C. and J.M. Pacheco. 2005. “Scale-Free Networks Provide a Unify- ing Framework for the Emergence of Cooperation.” Pysical Review Letters 95:098104. Schelling, Thomas C. 1960. The Strategy of Conflict. Cambridge: Harvard Uni- versity Press. Schmieder, Johannes F. 2009. “gpreg: Stata Module to Estimate Regressions with Two Dimensional Fixed Effects.” Boston College, Department of Economics, Statistical Software Components, S457048. Schumpeter, Joseph A. [1943] 1976. Capitalism, Socialism and Deomocracy. London: Routledge. Seinen, Ingrid and Arthur Schram. 2006. “Social Status and Group Norms: Indirect Reciprocity in a Helping Experiment.” European Economic Review 50:581–602. Selfhout, Maarten, William Burk, Susan Branje, Jaap J. A. Denissen, Marcel A.G. Van Aken, and Wim Meeus. 2010. “Emerging Late Adolescent Friendship Net- works and Big Five Personality Traits: A Social Network Approach.” Journal of Personality 78:509–538. Shapiro, Carl. 1983. “Premiums for High Quality Products as Returns to Reputa- tions.” Quarterly Journal of Economics 98:659–680. Simmel, Georg. 1908. “Die Kreuzung sozialer Kreise.” book section 6. In Untersuchungen über die Formen der Vergesellschaftung. Berlin: Dunker & Humboldt. Snijders, Chris and Gideon Keren. 2001. “Do you trust? Whom do you trust? When do you trust?” Pp. 129–160. In Advances in Group Processes, edited by S.R. Thye, M.W. Lawler, Michael W. Macy, and H.A. Walker, volume 18. Amsterdam: Elsevier Science. Snijders, Chris and Jeroen Weesie. 2009. “Online Programming Markets.” Pp. 166–185. In eTrust: Forming Relationships in the Online World, edited by Karen S. Cook, Chris Snijders, Vincent Buskens, and C. Cheshire. New York: Russel Sage Foundation. Snijders, Chris and Richard Zijdeman. 2004. “Reputation and Internet Auctions: eBay and Beyond.” Analyse & Kritik 26:158–184. Snijders, Tom A. B., Philippa Pattison, Garry L. Robins, and Mark S. Handcock. 2006. “New Specifications for Exponential Random Graph Models.” Socio- 212 REFERENCES

logical Methodolgy 36:99–153. Soldz, Stephen and George E. Vaillant. 1999. “The Big Five Personality Traits and the Life Course: A 45-Year Longitudinal Study.” Journal of Research in Personality 33:208–232. Solomonoff, Ray and Anatol Rapoport. 1951. “Connectivity of Random Nets.” Bulletin of Mathematical Biophysics 13:107–117. Sornette, Didier. 2006. Critical Phenomena in Natural Sciences. Berlin: Springer. Standifird, Stephen, S. 2001. “Reputation and E-Commerce: eBay Auctions and the Asymmetrical Impact of Positive and Negative Reatings.” Journal of Man- agement 27:279–295. Stumpf, Michael P.H. and Mason A. Porter. 2012. “Critical Truth about Power Laws.” Science 335:665. Suzuki, Shinsuke and Hiromichi Kimura. 2013. “Indirect Reciprocity is Sensitive to Costs of Information Transfer.” Scientific Reports 3:1435. Sylwester, Karolina and Gilbert Roberts. 2010. “Cooperatiors Benefit through Reputation-based Partner Choice in Economic Games.” Biology Letters 6:659– 662. Sylwester, Karolina and Gilbert Roberts. 2013. “Reputation-based Partner Choice is an Effective Alternative to Indirect Reciprocity in Sovling Social Dilemmas.” Evolution and Human Behavior 34:201–206. Szell, Michael and Stefan Thurner. 2010. “Measuring Social Dynamics in a Mas- sive Multiplayer Game.” Social Networks 32:313–329. Takahashi, Satoru. 2010. “Community Enforcement when Players Observe Part- ner’s Past Play.” Journal of Economic Theory 145:42–62. Totterdell, Peter, David Holman, and Amy Hukin. 2008. “Social Networkers: Measuring and Examing Individual Differences in Propensity to Connect with Others.” Social Networks 30:283–296. Traud, Amanda L., Eric D. Klesic, Peter J. Mucha, and Mason A. Porter. 2011. “Comparing Community Structure to Characteristics in Online Collegiate So- cial Networks.” SIAM Review 53:526–543. Traud, Amanda L., Peter J. Mucha, and Mason A. Porter. 2012. “Social Structure of Facebook Networks.” Physica A 391:4165–4180. Travers, J. and S. Milgram. 1969. “An Experimental Study of the Small World Problem.” Sociometry 32:425–443. Trivers, Robert, L. 1971. “The Evolution of Reciprocal Altruism.” Quarterly Review of Biology 46:35–57. Tufekci, Zeynep. 2008. “Grooming, Gossip, Facebook and Myspace: What Can We Learn About These Sites From Those Who Won’t Assimilate?” Informa- tion, Communication & Society 11:544–564. REFERENCES 213

Tversky, Amos and Daniel Kahneman. 1992. “Advances in Prospect Theory: Cumulative Representation of Uncertainty.” Journal of Risk and Uncertainty 5:297–323. Ugander, Johan, Brian Karrer, Lars Backstrom, and Cameron Marlow. 2011. “The Anatomy of the Facebook .” arXiv:1111.4503. Uzzi, Brian. 1997. “Social Structure and Competition in Interfirm Networks: The Paradox of Embeddedness.” Administrative Science Quarterly 42:35–67. Van Duijn, Maritje A.J., Evelien P.H. Zeggelink, Mark Huisman, Frans N. Stock- man, and Frans W. Wasseur. 2003. “Evolution of Sociology Freshmen into a Friendship Network.” Journal of Mathematical Sociology 27. Vodosek, M. 2003. “Personality and the Formation of Social Networks.” Annual Meeting of the American Sociological Association, Conference Contribution. von Neumann, John and Oskar Morgenstern. 1944. Theory of Games and Eco- nomic Behavior. Princeton, NJ: Princeton University Press. Wanberg, Connie R., R. Kanfer, and J. T. Banas. 2000. “Predictors and Outcomes of Networking Intensity Among Unemployed Job Seekers.” Journal of Applied Psychology 85:491–503. Wasserman, Stanley and Katherine Faust. 1994. Social Network Analysis: Meth- ods and Applications. Cambridge: Cambridge University Press. Watabe, Takamitsu, Masanori Takezawa, Yo Nakawake, Akira Kunimatsu, Hide- nori Yamasue, and Mitsuhiro Nakamura. 2014. “Two Distinct Neural Mecha- nisms Underlying Indirect Reciprocity.” PNAS 111:3390–3995. Watts, Duncan J. 1999. “Networks, Dynamics, and the Small-World Phe- nomenon.” American Journal of Sociology 105:493–527. Watts, Duncan J. 2004. “The “New” Science of Networks.” Annual Review of Sociology 30:243–70. Watts, Duncan J. 2007. “A Twenty-First Century Science.” Nature 445:489. Watts, Duncan J. and Steven H. Strogatz. 1998. “Collective Dynamics of Small- World Networks.” Nature 292:440–442. Wedekind, Claus and Manfred Milinski. 2000. “Cooperation Through Image Scoring in Humans.” Science 288:850–852. Willett, John B. and Judith D. Singer. 1995. “It’s Déjà Vu All Over Again: Using Multiple-Spell Discrete-Time Survival Analysis.” Journal of Education and Behavioral Statistics 20:41–67. Williamson, Oliver E. 1993. “Calculativeness, Trust, and Economic Organiza- tion.” Journal of Law and Economics 36:453–486. Wilson, Robert. 1985. “Reputation in Games and Markets.” Pp. 27–62. In Game Theoretic Models of Bargaining, edited by Alvin E. Roth. Cambridge: Cambridge University Press. 214 REFERENCES

Wimmer, Andreas and Kevin Lewis. 2010. “Below and Beyond Racial Ho- mophily: ERG Models of a Friendship Network based on Facebook.com.” American Journal of Sociology 116:583–642. Xie, Huan and Yong-Ju Lee. 2012. “Social Norms and Trust among Strangers.” Games and Economic Behavior 76:548–555. Yamagashi, Toshio, Yutaka Horita, Nobuhiro Mifune, Hirofumi Hashimoto, Yang Li, Mizuho Shinada, Arisa Miura, Keigo Inukai, Haruto Takagishi, and Dora Simunovic. 2012. “Rejection of Unfair Offers in the Ultimatum Game is no Evidence of Strong Reciprocity.” PNAS 109:20364–20368. Yamagashi, Toshio, Masafumi Matsuda, Noriaki Yoshikai, Hiroyuki Takahashi, and Yukihiro Usui. 2009. “Solving the Lemons Problem with Reputation.” Pp. 73–108. In eTrust: Forming Relationships in the Online World, edited by Karen S. Cook, Chris Snijders, Vincent Buskens, and C. Cheshire. New York: Russel Sage Foundation. Yamagishi, Toshio, Karen S. Cook, and Motoki Watabe. 1998. “Uncertainty, Trust, and Commitment Formation in the United States and Japan.” American Journal of Sociology 104:165–194. Zalányi, László, Gábor Csárdi, Támas Kiss, Máté Lengyel, Rebecca Warner, Jan Tobochnik, and Péter Érdi. 2003. “Properties of a Random Attachment Grow- ing Network.” Physical Review E 68:066104. Zimmer, Michael. 2010. ““But the Data Is Already Public”: On the Ethics of Research in Facebook.” Ethics and Information Technology 12:313–325. Zucker, Lynne G. 1986. “Production of Trust: Institutional Sources of Economic Structure, 1840-1920.” Research in Organizational Behavior 8:53–111. Curriculum Vitae

Personal Data

Name: Stefan Wehrli Date of birth: October 13, 1974 Place of birth: Lausen, Switzerland Place of origin: Küttigen, AG Nationality: Swiss Marital status: single, two children

Education

1994 – 1996 University of Basel, Bachelor (Vorlizentiat) in Economics 1997 – 2005 Lic.rer.soc. (M.A.) in Sociology with Business Administration and Economics, University of Berne. 2006 ICPSR Summer Program in Quantitative Methods of Social Research, Institute of Social Research, University of Michigan 2007 Doctoral Candidate in Sociology, ETH Zurich 2007 – Essex Summer School in Social Science Data Analysis and Collection, University of Essex

Professional Development

1999 – 2003 Undergraduate research assistant, University of Bern, Institute of Sociology 2005 – 2010 Research assistant and doctoral candidate, ETH Zurich, Chair of Sociology (Management of the Telephone and Experimental Laboratory SocioLab) 2010 – Research assistant and doctoral candidate, ETH Zurich, Chair of Sociology (50%), Management ETH Decision Science Lab- oratory (50%).