The Structural Determinants of Media Contagion by Cameron Alexander Marlow
B.S. Computer Science
University of Chicago, 1999

M.S. Media Arts and Sciences
Massachusetts Institute of Technology, 2001

SUBMITTED TO THE PROGRAM IN MEDIA ARTS AND SCIENCES, SCHOOL OF ARCHITECTURE AND PLANNING, IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY AT THE MASSACHUSETTS INSTITUTE OF TECHNOLOGY

SEPTEMBER, 2005

© 2005 MASSACHUSETTS INSTITUTE OF TECHNOLOGY. ALL RIGHTS RESERVED.

Author
Department of Media Arts and Sciences
August 2, 2005

Certified by
Walter Bender
Senior Research Scientist
Department of Media Arts and Sciences
Thesis Supervisor

Accepted by
Andrew Lippman
Chairman
Departmental Committee for Graduate Students
Department of Media Arts and Sciences

Thesis Committee

Thesis Supervisor
Walter Bender
Senior Research Scientist
Massachusetts Institute of Technology
Department of Media Arts and Sciences

Thesis Reader
Keith Hampton
Assistant Professor
Massachusetts Institute of Technology
Department of Urban Studies and Planning

Thesis Reader
Tom Valente
Associate Professor
University of Southern California
Department of Preventative Medicine

ABSTRACT

Informal exchanges between friends, family and acquaintances play a crucial role in the dissemination of news and opinion. These casual interactions are embedded in a network of communication that spans our society, allowing information to spread from any one person to another via some set of intermediary ties.
Weblogs have recently emerged as a part of our media ecology and incidentally engender this process of media contagion; because weblog authors are tied by social networks of readership, contagious media events happen frequently, and in a form that is immediately measurable. The generally accepted notion of media diffusion is that it occurs through two channels: externally, as applied by a constant force such as the mass media, and internally through socio-structural means. Sitting between our traditional notions of mass media and the public, weblogs problematize this classical theory of mass media influence. This thesis aims to elucidate the role of weblogs in media contagion through a sociological study of this community in two parts: First, I will address the issues of modeling the social structure of weblogs as observed through their readership network, and the various media events that occur therein. Using a large weblog corpus collected over a one-month period, I have constructed a model describing the structure of popularity and influence from the extracted readership network, and will show that this model more accurately describes the weblog network. I will also derive a typology of media events from collected examples using features of structural and non-structural diffusion. Second, the extent to which these data are reflective of actual social processes as opposed to artifacts of data collection and aggregation will be explored. To validate the models presented in part one, I have conducted a survey of randomly selected authors to examine their social behaviors, both in weblog use and otherwise. I will characterize the range of weblog uses and practices, presenting an analysis of personal influence in the blogging community.

Thesis Supervisor: Walter Bender
Title: Senior Research Scientist

Acknowledgments

This thesis is dedicated to the memory of Redmond Lyons-Keefe, a best friend whose kindness and laughter will be dearly missed.
Thanks

This work could not have been accomplished without the help and support of the following individuals:

THE A LIST
To my committee, Walter Bender, Keith Hampton and Thomas Valente. Also my family, who made it possible for me to be at MIT in the first place.

THE AA LIST
Linda Peterson and Pat Solakoff, EP: Erik, Joanie, Scotty, Sunil, Vadim, Carla, Ouko and Larissa. UG: Scorchio, The Pud, You're Fired, Shut up, Daddy Long, the Iz Rocka, Gys, JFlower, Eddie, Tek Fu and TFY200s. Aisling, of course. All of the MIT students, faculty and expats who have helped me along the way: Jeana, Nikita, Jess, Aaron, Jeff, Max, Min, Min Suh, Jonah and Andrea. My family at the CDC, especially the weak tie, Victoria Gammino. All my crew in A-town, Zach, Tom, and Blondie. This font, Hoefler Text.

THE AAA LIST
Andy Baio, Erik Benson, danah boyd, Michael Buffington, Maciej Ceglowski, Tom Coates, Anil Dash, Nick Denton, Redrick DeLeon, Cory Doctorow, Ze Frank, Rusty Foster, Ryan Gantz, Matt Haughey, Alison Headley, Scott Heiferman, Meg Hourihan, Matt Jones, Jason Kottke, Jason Levine, Leonard Lin, Erica Lucci, Peter Merholtz, Joshua Schachter, Tim Shey, Ben and Mena Trott, David Weinberger, and anyone who writes something called BLOG.

Contents

List of Figures
List of Tables
1 Introduction
2 Background
2.1 Social Networks
2.2 Computer Mediated Communication
2.3 Diffusion Studies
2.4 Weblogs
3 Design and Methodology
3.1 Sampling weblogs
3.2 Weblog Aggregator
3.3 Survey
4 Results
4.1 Aggregator
4.2 Survey
5 Conclusions
5.1 Summary
5.2 Contributions
5.3 Future work
Appendix A Weblog aggregator
Appendix B Email
Appendix C MIT Weblog survey
Bibliography

List of Figures

2.1 S-Curve of Cumulative Adoption
2.2 Adopter Categories
2.3 Cumulative diffusion for 3 news stories
2.4 Contagion through structural equivalence
2.5 Contagion through thresholds
2.6 Weblog Anatomy
4.1 Weblog Updates
4.2 Readership Degree Distribution
4.3 Adjusted Degree Distribution
4.4 Updates vs. Dynamic in-degree
4.5 Network Density
4.6 Distribution of meme diffusion times
4.7 Mean vectors from K-Means clustering
4.8 Values of a and b
4.9 Actual and predicted curves for three diffusion types
4.10 New Subjects over time
4.11 Survey questions answered per subject
4.12 Weblog Motivations by Sample
4.13 Scree Plot of PCA for Weblog Motivation
4.14 Personal and Professional Component Distributions
4.15 Distribution of Total Communication Frequency (CommT)
4.16 Buddy list size (cumulative)
4.17 Observed Links
4.18 Position Generator Response
4.19 Position Generator Sum Distribution
A.1 Weblog aggregator system architecture

List of Tables

2.1 Mathematical Models of Information Diffusion
4.1 Detected Languages
4.2 Top weblogs ranked by in-degree
4.3 Connected components
4.4 Degree relationship
4.5 Top Weblogs by Dynamic and Static Degree
4.6 Observed Countries
4.7 Sample demographics
4.8 Survey completion rates
4.9 Demographics reported by survey and LiveJournal
4.10 Weblog Motivations
4.11 Results of Motivation PCA
4.12 Post-type frequency correlations
4.13 Investment into weblogging
4.14 Demographics and communication frequencies
4.15 Communication frequency correlations
4.16 IM frequencies
4.17 Distribution of non-social links
4.18 Top non-social links
4.19 Social link type and relationship
4.20 Readership and relationship
4.21 Social links and communication
4.22 Position Generator
4.23 Position Generator and demographics
4.24 Online and offline Position Generator scores

Chapter 1
Introduction

On November 22, 1963, NBC correspondent Frank McGee declared, "This afternoon, wherever you were and whatever you might have been doing when you received the word of the death of President Kennedy, that is a moment that will be emblazoned in your memory and you will never forget it...as long as you live." For this particular event, wherever they were, over 50% of Americans were talking to someone else, because that person was relaying the horrible news. Whenever a catastrophic news event occurs, the probability that we hear about it from another person increases dramatically; this might not come as a surprise to someone who has experienced an event of this magnitude, which most people have. But what might be unexpected is the fact that as news becomes increasingly irrelevant to the rest of the population, your odds of finding it through interpersonal communication