Emergent Structure, Semantics and Usage of Social Streams
Dipl.-Ing. Claudia Wagner, Bakk.rer.soc.oec.

DISSERTATION for the attainment of the academic degree of Doctor of Technical Sciences in the field of Computer Science at Graz University of Technology

Univ.-Doz. Dipl.-Ing. Dr.techn. Markus Strohmaier
Institut für Wissensmanagement
Technische Universität Graz

Graz 2013

For my parents

Senate. German version: Resolution of the Curricula Commission for Bachelor's, Master's and Diploma Studies of 10.11.2008; approved by the Senate on 1.12.2008.

STATUTORY DECLARATION

I declare that I have authored this thesis independently, that I have not used other than the declared sources / resources, and that I have explicitly marked all material which has been quoted either literally or by content from the used sources.

Graz, …………………………… (date) ……………………………………………….. (signature)

Abstract

Social streams are aggregations of data that are produced by a temporal sequence of users' activities conducted in an online social environment such as Twitter or Facebook, where others can perceive the manifestation of these activities. Although previous research shows that social streams are a useful source for many types of information, most existing approaches treat social streams as just another textual document and neglect the fact that social streams emerge through user activities. This thesis sets out to explore potential relations between the user activities which generate a stream (and therefore impact the emergent structure of a stream) and the semantics of a stream.
This work introduces a network-theoretic model of social streams which allows social streams, and the structures which emerge from them, to be described formally. It further presents several structural stream measures for comparing different social streams and two novel measures for assessing the stability of the emerging structures of social streams. In several empirical studies, this work explores whether a relation between semantics and user activities exists and, if so, to what extent this relation can be exploited for (1) the creation of semantic annotations of social streams and users and (2) the prediction of users' future activities in social streams. This work does not explore causal relations between semantics and user activities, but investigates whether relational patterns between semantics and user activities exist and whether they can be exploited for developing new methods for annotating social streams with semantic and usage metadata.

The empirical results of this work show that the activity patterns which can be observed in social streams around different topics reveal interesting differences. This suggests that a relation between activity patterns and their semantic context exists. Further, this work shows that structural stream properties and activity patterns can indeed be exploited for learning semantic annotations of social streams. The results of this work highlight not only the relation between semantics and user behavior, but also show that (1) incorporating information about the semantic context may increase the accuracy of predictions about users' future activities and (2) incorporating information about users' activities and the structure which emerges from them may help to semantically annotate the context in which the activities are observed.
This work is relevant for researchers interested in developing semantic technologies for mining social streams or user-generated content and for researchers working on modeling and predicting users' behavior in online social environments.

Kurzfassung

The current state of research clearly shows that so-called "social streams", that is, user-generated text streams that are created and shared in an online social environment, are a valuable source of various kinds of information. Although these text streams are generated by user activities, semantic analysis is often restricted to the text alone, because the same methods used for the semantic analysis of text documents can also be applied to social text streams. The aim of this dissertation is to investigate the relation between user activities (and the emergent structures that result from them) and the emergent semantics of social text streams, in order to lay a foundation for new methods for the semantic analysis of social text streams.

This dissertation presents a network-theoretic model that makes it possible to formally describe and characterize social text streams and the emergent structures that arise from them. Furthermore, this work introduces several measures that make it possible to characterize and compare social streams by their structural properties and to measure the stability of the emergent structures. In several empirical studies, this dissertation investigates possible relations between the behavior of users in online social environments and the semantics of these text streams. More precisely, this work examines whether these relations (1) are useful for the semantic analysis and the creation of semantic annotations of text streams and (2) can help to predict future user behavior in connection with these text streams. The focus of this work is not on detecting causal relations, but on detecting relational patterns that can be used to support the creation of semantic annotations and/or user modeling.

The results of this work show that, on the one hand, incorporating information about user behavior and the structures that emerge from it can help to produce more accurate semantic annotations and, on the other hand, semantic annotations can be used to predict user behavior more accurately. This work is relevant for researchers who work on developing semantic analysis methods for social text streams and/or on modeling user behavior in social media.

Acknowledgments

First I would like to thank my advisor, Markus Strohmaier, who has been helping me throughout this research and has been a constant source of inspiration. I am truly grateful for all his valuable feedback, his patience and our fruitful discussions. Markus is a brilliant advisor and he really knows how to create a stimulating research environment which allows students to grow and enjoy working on exciting problems.

Various friends and co-authors deserve my gratitude. In particular, I would like to thank my colleagues at Joanneum Research and Graz University of Technology, with whom I have discussed my work for countless hours. Special gratitude must go to Philipp Singer, Lisa Posch, Silvia Mitter, Christian Körner, Jan Pöschko, Florian Klien and Simon Walk, who supported me and my work at different stages.
You all made me look forward to our productive group meetings on Thursday afternoons, the nightly gaming events and the interesting and funny discussions in our gossip chat.

I also want to sincerely thank Matthew Rowe, Harith Alani, Yulan He and Enrico Motta from The Open University, with whom I had the pleasure of cooperating. They really helped me to progress with my research and understand potential relations between semantics and user behavior on the Web. I also want to express my gratitude to the whole KMi team, which made KMi the most special lab I have ever visited. The warm and welcoming atmosphere at KMi is truly unique and creates an inspiring environment where people enjoy working and make new friends.

My special thanks go to Peter Pirolli, Les Nelson and the whole Augmented Social Cognition Lab at Xerox PARC and to Bernardo Huberman, Sitaram Asur and the whole Social Computing Lab at HP. I feel truly blessed that I was fortunate enough to work with these incredibly smart and inspiring researchers. During many interesting discussions they gave me a new perspective on science, and I think I have never learned more in such a short period of time than during my internships. Thank you for that and thank you for making me feel very welcome from day one!

My internships, research visits and conference trips not only allowed me to reflect on my work but also to make new friends. I am extremely grateful for all the amazing people I met during this journey. I also want to thank all my friends in Austria who have been with me during the course of this work and long before that. I feel truly blessed for having such strong ties, reciprocated with love.

Last, but not least, I would especially like to thank my parents, Ellen and Hubert Wagner, and my sister Katrin. Without them, I would not be who I am today and all this would not have been possible. Words are not enough to express how grateful I am for their love and constant support throughout so many years.
And finally, very special thanks go to Peter, with whom I have shared most of my time during this work.

Institutional Acknowledgements

This thesis was supported by a DOC-fForte fellowship of the Austrian Academy of Sciences.

Contents