Svms for the Blogosphere: Blog Identification and Splog Detection
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Studying Social Tagging and Folksonomy: a Review and Framework
Studying Social Tagging and Folksonomy: A Review and Framework Item Type Journal Article (On-line/Unpaginated) Authors Trant, Jennifer Citation Studying Social Tagging and Folksonomy: A Review and Framework 2009-01, 10(1) Journal of Digital Information Journal Journal of Digital Information Download date 02/10/2021 03:25:18 Link to Item http://hdl.handle.net/10150/105375 Trant, Jennifer (2009) Studying Social Tagging and Folksonomy: A Review and Framework. Journal of Digital Information 10(1). Studying Social Tagging and Folksonomy: A Review and Framework J. Trant, University of Toronto / Archives & Museum Informatics 158 Lee Ave, Toronto, ON Canada M4E 2P3 jtrant [at] archimuse.com Abstract This paper reviews research into social tagging and folksonomy (as reflected in about 180 sources published through December 2007). Methods of researching the contribution of social tagging and folksonomy are described, and outstanding research questions are presented. This is a new area of research, where theoretical perspectives and relevant research methods are only now being defined. This paper provides a framework for the study of folksonomy, tagging and social tagging systems. Three broad approaches are identified, focusing first, on the folksonomy itself (and the role of tags in indexing and retrieval); secondly, on tagging (and the behaviour of users); and thirdly, on the nature of social tagging systems (as socio-technical frameworks). Keywords: Social tagging, folksonomy, tagging, literature review, research review 1. Introduction User-generated keywords – tags – have been suggested as a lightweight way of enhancing descriptions of on-line information resources, and improving their access through broader indexing. “Social Tagging” refers to the practice of publicly labeling or categorizing resources in a shared, on-line environment. -
‗DEFINED NOT by TIME, but by MOOD': FIRST-PERSON NARRATIVES of BIPOLAR DISORDER by CHRISTINE ANDREA MUERI Submitted in Parti
‗DEFINED NOT BY TIME, BUT BY MOOD‘: FIRST-PERSON NARRATIVES OF BIPOLAR DISORDER by CHRISTINE ANDREA MUERI Submitted in partial fulfillment of the requirements For the degree of Doctor of Philosophy Dissertation Adviser: Dr. Kimberly Emmons Department of English CASE WESTERN RESERVE UNIVERSITY August 2011 2 CASE WESTERN RESERVE UNIVERSITY SCHOOL OF GRADUATE STUDIES We hereby approve the thesis/dissertation of Christine Andrea Mueri candidate for the Doctor of Philosophy degree *. (signed) Kimberly K. Emmons (chair of the committee) Kurt Koenigsberger Todd Oakley Jonathan Sadowsky May 20, 2011 *We also certify that written approval has been obtained for any proprietary material contained therein. 3 I dedicate this dissertation to Isabelle, Genevieve, and Little Man for their encouragement, unconditional love, and constant companionship, without which none of this would have been achieved. To Angie, Levi, and my parents: some small piece of this belongs to you as well. 4 Table of Contents Dedication 3 List of tables 5 List of figures 6 Acknowledgements 7 Abstract 8 Chapter 1: Introduction 9 Chapter 2: The Bipolar Story 28 Chapter 3: The Lay of the Bipolar Land 64 Chapter 4: Containing the Chaos 103 Chapter 5: Incorporating Order 136 Chapter 6: Conclusion 173 Appendix 1 191 Works Cited 194 5 List of Tables 1. Diagnostic Criteria for Manic and Depressive Episodes 28 2. Therapeutic Approaches for Treating Bipolar Disorder 30 3. List of chapters from table of contents 134 6 List of Figures 1. Bipolar narratives published by year, 2000-2010 20 2. Graph from Gene Leboy, Bipolar Expeditions 132 7 Acknowledgements I gratefully acknowledge my advisor, Kimberly Emmons, for her ongoing guidance and infinite patience. -
Introduction to Web 2.0 Technologies
Introduction to Web 2.0 Joshua Stern, Ph.D. Introduction to Web 2.0 Technologies What is Web 2.0? Æ A simple explanation of Web 2.0 (3 minute video): http://www.youtube.com/watch?v=0LzQIUANnHc&feature=related Æ A complex explanation of Web 2.0 (5 minute video): http://www.youtube.com/watch?v=nsa5ZTRJQ5w&feature=related Æ An interesting, fast-paced video about Web.2.0 (4:30 minute video): http://www.youtube.com/watch?v=NLlGopyXT_g Web 2.0 is a term that describes the changing trends in the use of World Wide Web technology and Web design that aim to enhance creativity, secure information sharing, increase collaboration, and improve the functionality of the Web as we know it (Web 1.0). These have led to the development and evolution of Web-based communities and hosted services, such as social-networking sites (i.e. Facebook, MySpace), video sharing sites (i.e. YouTube), wikis, blogs, etc. Although the term suggests a new version of the World Wide Web, it does not refer to any actual change in technical specifications, but rather to changes in the ways software developers and end- users utilize the Web. Web 2.0 is a catch-all term used to describe a variety of developments on the Web and a perceived shift in the way it is used. This shift can be characterized as the evolution of Web use from passive consumption of content to more active participation, creation and sharing. Web 2.0 Websites allow users to do more than just retrieve information. -
Citizen Journalism Via Blogging: a Possible Resolution to Mainstream Media’S Ineptitude Heba Elshahed1* and Sally Tayie2
Research Article Global Media Journal 2019 Vol.17 No. ISSN 1550-7521 33:193 Citizen Journalism via Blogging: A Possible Resolution to Mainstream Media’s Ineptitude Heba Elshahed1* and Sally Tayie2 1Journalism and Mass Communication Department, The American University in Cairo, Egypt 2University Autònoma de Barcelona, Spain *Corresponding author: Heba Elshahed, Journalism and Mass Communication Department, The American University in Cairo, Egypt, Tel: +201001924654; E-mail: [email protected] Received date: Nov 02, 2019; Accepted date: Dec 02, 2019; Published date: Dec 09, 2019 Copyright: © 2019 Elshahed H, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Citation: Elshahed H, Tayie S. Citizen Journalism via Blogging: A Possible Resolution to Mainstream Media’s Ineptitude. Global Media Journal 2019, 17:33. Keywords: Blogosphere; Social media; Political activism; Abstract Facebook revolution Throughout the past years, the emergence of the Egyptian Introduction Blogosphere has been a definitive phenomenon. The Egyptian Blogosphere went through fluctuations and The introduction of a “Wireless World” in the mid 1990’s evolutionary phases, resulting in it becoming a powerful and early 2000’s reformed the entire media landscape, giving platform for cyber space political activism and citizen media an entirely new role in society. Today, the world is journalism, in attempts to compensate for the becoming entirely wireless. Informational videos, movies, mainstream media ’ s inadequacy. This paper explores music, pictures and even personas and figures can be accessed previous studies conducted on this topic. -
Barbara Cochran
Cochran Rethinking Public Media: More Local, More Inclusive, More Interactive More Inclusive, Local, More More Rethinking Media: Public Rethinking PUBLIC MEDIA More Local, More Inclusive, More Interactive A WHITE PAPER BY BARBARA COCHRAN Communications and Society Program 10-021 Communications and Society Program A project of the Aspen Institute Communications and Society Program A project of the Aspen Institute Communications and Society Program and the John S. and James L. Knight Foundation. and the John S. and James L. Knight Foundation. Rethinking Public Media: More Local, More Inclusive, More Interactive A White Paper on the Public Media Recommendations of the Knight Commission on the Information Needs of Communities in a Democracy written by Barbara Cochran Communications and Society Program December 2010 The Aspen Institute and the John S. and James L. Knight Foundation invite you to join the public dialogue around the Knight Commission’s recommendations at www.knightcomm.org or by using Twitter hashtag #knightcomm. Copyright 2010 by The Aspen Institute The Aspen Institute One Dupont Circle, NW Suite 700 Washington, D.C. 20036 Published in the United States of America in 2010 by The Aspen Institute All rights reserved Printed in the United States of America ISBN: 0-89843-536-6 10/021 Individuals are encouraged to cite this paper and its contents. In doing so, please include the following attribution: The Aspen Institute Communications and Society Program,Rethinking Public Media: More Local, More Inclusive, More Interactive, Washington, D.C.: The Aspen Institute, December 2010. For more information, contact: The Aspen Institute Communications and Society Program One Dupont Circle, NW Suite 700 Washington, D.C. -
Uncovering Social Network Sybils in the Wild
Uncovering Social Network Sybils in the Wild ZHI YANG, Peking University 2 CHRISTO WILSON, University of California, Santa Barbara XIAO WANG, Peking University TINGTING GAO,RenrenInc. BEN Y. ZHAO, University of California, Santa Barbara YAFEI DAI, Peking University Sybil accounts are fake identities created to unfairly increase the power or resources of a single malicious user. Researchers have long known about the existence of Sybil accounts in online communities such as file-sharing systems, but they have not been able to perform large-scale measurements to detect them or measure their activities. In this article, we describe our efforts to detect, characterize, and understand Sybil account activity in the Renren Online Social Network (OSN). We use ground truth provided by Renren Inc. to build measurement-based Sybil detectors and deploy them on Renren to detect more than 100,000 Sybil accounts. Using our full dataset of 650,000 Sybils, we examine several aspects of Sybil behavior. First, we study their link creation behavior and find that contrary to prior conjecture, Sybils in OSNs do not form tight-knit communities. Next, we examine the fine-grained behaviors of Sybils on Renren using clickstream data. Third, we investigate behind-the-scenes collusion between large groups of Sybils. Our results reveal that Sybils with no explicit social ties still act in concert to launch attacks. Finally, we investigate enhanced techniques to identify stealthy Sybils. In summary, our study advances the understanding of Sybil behavior on OSNs and shows that Sybils can effectively avoid existing community-based Sybil detectors. We hope that our results will foster new research on Sybil detection that is based on novel types of Sybil features. -
Blogs: Blogosphere a Blog (Short for Weblog) Is a Personal Online Journal That Is Frequently Updated and Intended for General P
Blogs: Blogosphere A blog (short for weblog) is a personal online journal that is frequently updated and intended for general public consumption. Blogs are defined by their format: a series of entries posted to a single page in reverse-chronological order. Blogs generally represent the personality of the author or reflect the purpose of the Web site that hosts the blog. Topics sometimes include brief philosophical musings, commentary on Internet and other social issues, and links to other sites the author favours, especially those that support a point being made on a post. Blogs represent a significant shift in information flow, where information flows from many to many seamlessly. It is a serious challenge to traditional journalism. Blogs do not have gatekeepers, so they are raw, honest, immediate passionate, opinionated and strike an emotional chord. At times they may not be credible as there are no gatekeepers. It is professional journalism versus amateur journalism. Media has realised the growing power of blogs. So news websites nowadays encourage blogging by their employees on their site.Many celebrities too have their own blogs. Blogs are on varied topics. They are easy to start but difficult to sustain. Those who wish to start a blog will have higher cyber space without payments and start to use the space. Add text, colours, paintings, photos, audio, visual, animation, graphics and more. Publish advertisements, persuasive pieces, and campaign materials; make money by business promotion, public relation activity, reviews etc. The ówner’of the blog decides the content and design. Seamless freedom is the major attraction of blogs. -
By Nilesh Bansal a Thesis Submitted in Conformity with the Requirements
ONLINE ANALYSIS OF HIGH-VOLUME SOCIAL TEXT STEAMS by Nilesh Bansal A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy Graduate Department of Computer Science University of Toronto ⃝c Copyright 2013 by Nilesh Bansal Abstract Online Analysis of High-Volume Social Text Steams Nilesh Bansal Doctor of Philosophy Graduate Department of Computer Science University of Toronto 2013 Social media is one of the most disruptive developments of the past decade. The impact of this information revolution has been fundamental on our society. Information dissemination has never been cheaper and users are increasingly connected with each other. The line between content producers and consumers is blurred, leaving us with abundance of data produced in real-time by users around the world on multitude of topics. In this thesis we study techniques to aid an analyst in uncovering insights from this new media form which is modeled as a high volume social text stream. The aim is to develop practical algorithms with focus on the ability to scale, amenability to reliable operation, usability, and ease of implementation. Our work lies at the intersection of building large scale real world systems and developing theoretical foundation to support the same. We identify three key predicates to enable online methods for analysis of social data, namely : • Persistent Chatter Discovery to explore topics discussed over a period of time, • Cross-referencing Media Sources to initiate analysis using a document as the query, and • Contributor Understanding to create aggregate expertise and topic summaries of authors contributing online. The thesis defines each of the predicates in detail and covers proposed techniques, their practical applicability, and detailed experimental results to establish accuracy and scalability for each of the three predicates. -
Blogosphere: Research Issues, Tools, and Applications
Blogosphere: Research Issues, Tools, and Applications Nitin Agarwal Huan Liu Computer Science and Engineering Department Arizona State University Tempe, AZ 85287 fNitin.Agarwal.2, [email protected] ABSTRACT ging. Acknowledging this fact, Times has named \You" as the person of the year 2006. This has created a consider- Weblogs, or Blogs, have facilitated people to express their able shift in the way information is assimilated by the indi- thoughts, voice their opinions, and share their experiences viduals. This paradigm shift can be attributed to the low and ideas. Individuals experience a sense of community, a barrier to publication and open standards of content genera- feeling of belonging, a bonding that members matter to one tion services like blogs, wikis, collaborative annotation, etc. another and their niche needs will be met through online These services have allowed the mass to contribute and edit interactions. Its open standards and low barrier to publi- articles publicly. Giving access to the mass to contribute cation have transformed information consumers to produc- or edit has also increased collaboration among the people ers. This has created a plethora of open-source intelligence, unlike previously where there was no collaboration as the or \collective wisdom" that acts as the storehouse of over- access to the content was limited to a chosen few. Increased whelming amounts of knowledge about the members, their collaboration has developed collective wisdom on the Inter- environment and the symbiosis between them. Nonetheless, net. \We the media" [21], is a phenomenon named by Dan vast amounts of this knowledge still remain to be discovered Gillmor: a world in which \the former audience", not a few and exploited in its suitable way. -
Children's Literature in the Blogosphere
Children’s literature in the blogosphere Pat Pledger Director ReadPlus Australia Blogs about children’s literature can enhance professional knowledge and provide up to date information as well as giving insights into the writing process. This paper will examine different types of blogs that readers can use to find out about trends, new books, authors, reviews, and student use. It will also examine strategies to increase the value of library homepages with instant up to date news about what´s new in the library. The advantages and the limitations of blogs along with factors to consider when creating and subscribing to blogs will also be explored. A weblog can be defined as any web page with content organised according to date. In the United States, 8% of Internet users keep a blog and 39% read blogs. 54% of US bloggers are under the age of 30 and are divided evenly between men and women. (Lenhart, A. & Fox, S. 2006). Blogs are increasingly being used by librarians as a means of keeping up-to-date with current issues in children’s literature and reading about new books. Blogs will appeal to the younger generation that is avid users of the Internet and this technology would be useful in motivating young readers and writers. The many uses and benefits of blogs Weblogs provide simple web publishing space for individuals and are easy to create and maintain with little html knowledge required. New content can be published instantly from any location in the world, which has Internet access. Blogs also have the ability to archive old material, thus giving a webpage a new look. -
Web Spam Taxonomy
Web Spam Taxonomy Zolt´an Gy¨ongyi Hector Garcia-Molina Computer Science Department Computer Science Department Stanford University Stanford University [email protected] [email protected] Abstract techniques, but as far as we know, they still lack a fully effective set of tools for combating it. We believe Web spamming refers to actions intended to mislead that the first step in combating spam is understanding search engines into ranking some pages higher than it, that is, analyzing the techniques the spammers use they deserve. Recently, the amount of web spam has in- to mislead search engines. A proper understanding of creased dramatically, leading to a degradation of search spamming can then guide the development of appro- results. This paper presents a comprehensive taxon- priate countermeasures. omy of current spamming techniques, which we believe To that end, in this paper we organize web spam- can help in developing appropriate countermeasures. ming techniques into a taxonomy that can provide a framework for combating spam. We also provide an overview of published statistics about web spam to un- 1 Introduction derline the magnitude of the problem. There have been brief discussions of spam in the sci- As more and more people rely on the wealth of informa- entific literature [3, 6, 12]. One can also find details for tion available online, increased exposure on the World several specific techniques on the Web itself (e.g., [11]). Wide Web may yield significant financial gains for in- Nevertheless, we believe that this paper offers the first dividuals or organizations. Most frequently, search en- comprehensive taxonomy of all important spamming gines are the entryways to the Web; that is why some techniques known to date. -
Robust Detection of Comment Spam Using Entropy Rate
Robust Detection of Comment Spam Using Entropy Rate Alex Kantchelian Justin Ma Ling Huang UC Berkeley UC Berkeley Intel Labs [email protected] [email protected] [email protected] Sadia Afroz Anthony D. Joseph J. D. Tygar Drexel University, Philadelphia UC Berkeley UC Berkeley [email protected] [email protected] [email protected] ABSTRACT Keywords In this work, we design a method for blog comment spam detec- Spam filtering, Comment spam, Content complexity, Noisy label, tion using the assumption that spam is any kind of uninformative Logistic regression content. To measure the “informativeness” of a set of blog com- ments, we construct a language and tokenization independent met- 1. INTRODUCTION ric which we call content complexity, providing a normalized an- swer to the informal question “how much information does this Online social media have become indispensable, and a large part text contain?” We leverage this metric to create a small set of fea- of their success is that they are platforms for hosting user-generated tures well-adjusted to comment spam detection by computing the content. An important example of how users contribute value to a content complexity over groupings of messages sharing the same social media site is the inclusion of comment threads in online ar- author, the same sender IP, the same included links, etc. ticles of various kinds (news, personal blogs, etc). Through com- We evaluate our method against an exact set of tens of millions ments, users have an opportunity to share their insights with one of comments collected over a four months period and containing another.