Web Spam, Propaganda and Trust

Panagiotis T. Metaxas, Wellesley College, Wellesley, MA 02481, USA
Joseph DeStefano, College of the Holy Cross, Worcester, MA 01610, USA
[email protected] [email protected]

ABSTRACT
Web spamming, the practice of introducing artificial text and links into web pages to affect the results of searches, has been recognized as a major problem for search engines. It is also a serious problem for users because they are not aware of it and they tend to confuse trusting the search engine with trusting the results of a search [16]. The parallels between web spamming on the internet and propaganda in the real world suggest that we can use anti-propaganda techniques to educate users and develop tools to help them evaluate the reliability of the information they find online.

In this paper, we first analyze the effects that web spam has on the evolution of search engines and their relationship to propagandistic techniques in society. Then, we examine the neighborhoods of untrustworthy sites, finding that a dense biconnected component (BCC) containing the site provides a reasonable trust neighborhood that has parallels in social network theory. The fact that spammers employ pro-

Retrieval]: Miscellaneous

General Terms
Algorithms, Experimentation, Social Networks, Propaganda, Trust

Keywords
search, Web graph, link structure, PageRank, HITS, Web spam

1. INTRODUCTION
The web has changed the way we inform and get informed. Every organization has a web site and people are increasingly comfortable consulting it for information on any question they may have. The exploding size of the web necessitated the development of search engines and web directories.