REGULATE UNSOLICITED COMMERCIAL EMAIL

House Bill 4519 (Substitute H-2)
First Analysis (5-8-03)

Sponsor: Rep. Bill Huizenga
Committee: Energy and Technology

Analysis available @ http://www.michiganlegislature.org

THE APPARENT PROBLEM:

Advertisers have long claimed that their right to free speech allows them to provide potential customers with information about goods and services. Consumer advocates often reply that respect for individuals’ right to privacy demands acknowledging that certain spheres of people’s lives should be free from the uninvited and unwelcome intrusion of others. Reality is of course more complex, as advertisers rarely confine themselves to providing comprehensive, balanced information about their wares, and despite their protests to the contrary, people are often happy to learn about unfamiliar goods and services, regardless of whether the messenger has a financial stake in the interaction. The regulation of unsolicited direct mail advertising, telephone solicitation, and more recently unsolicited commercial e-mail is an attempt by governments to find a nuanced middle ground.

“Spamming” is the mass distribution of unsolicited electronic mail (e-mail) messages, whether for commercial or noncommercial purposes. The e-mail messages themselves are called “spam”. While mass distributed unsolicited e-mail need not be sent for commercial purposes, and there is no reason why unsolicited commercial e-mail (UCE), like direct mail advertisements and telephone solicitations, cannot be tailored to individuals, spamming is generally valued and reviled as an inexpensive way for advertisers to promote goods and services to large numbers of people. Inexpensive for the sender, that is: e-mail ads do not involve the paper and postal costs of direct mail advertising, and the personnel and phone costs are far less than those for telemarketing sales.

To understand why spamming is such an attractive technique for many advertisers and such an aggravating nuisance for many e-mail users, it helps to understand more about the practice itself. There are two basic ways that advertisers go about spamming: advertisers may send a single message--or a message whose content varies slightly--to multiple Usenet newsgroups or they may send a single message to multiple e-mail addresses. Advertisers send e-mails to newsgroups primarily to target “lurkers” who read a group or groups’ messages, but do not subscribe to groups or share their e-mail addresses publicly. Though the primary target of the spam may be the lurkers, newsgroups have reportedly broken up when participants have become frustrated by the dwindling number of posted messages that have anything to do with the newsgroup’s purported subject matter.

Alternatively, or in conjunction with newsgroup spamming, advertisers may mass distribute messages to e-mail addresses. In some cases, these addresses are obtained when businesses ask for permission to notify their customers about the availability of new products, discounts, and other items of interest by e-mail. Yet a report by the Center for Democracy and Technology suggests that most businesses that have an ongoing relationship with a customer generally respect customers’ express wishes not to receive e-mail ads. The real problem for consumers appears to be businesses and marketers that gather or guess e-mail addresses of the public at large and figure that they stand to gain if they receive even a handful of replies after spamming thousands, hundreds of thousands, even millions of e-mail addresses. Addresses may be found by “mailbots” that “sweep” or “scrape” the Internet for unbroken character strings that use the “@” symbol, judging that the majority of them will be valid e-mail addresses. Alternatively, “brute force” and “dictionary” programs “think up” plausible user names (e.g., “johndoe”, “marysmith”, and “abc123”), and attach them to an @ symbol followed by one or more popular domain names (e.g., Internetserviceprovider.com, nonprofitentity.org, and college.edu) allowing spammers to send messages to addresses that seem likely--but are not known--to be occupied.

Regardless of how one gets on a spammer’s mailing list, removing oneself from the list may be very difficult. Spammers often fail to leave any valid contact information, and even if they do, they may ignore requests to be removed from the list or, worse yet, take any response whatsoever as a sign that the address is occupied, and therefore, ripe for future spamming.

Further, as noted above, spam is cheap for the sender. One of the reasons that commercial spam has provoked so much indignation among recipients is that it shifts the cost burden of advertising from the marketer--or ultimately the actual customer who buys the good or service at a price that reflects the advertising cost--to the potential customer. Generally speaking, the spammer incurs only the costs of personnel for writing and sending the message, obtaining e-mail addresses and an Internet connection. In an April 2003 article The Economist reports that CDs can be bought for $5 per million addresses, and whatever the other costs are, they are fairly negligible when divided by 100,000 or 1,000,000 recipients. According to committee testimony (and corroborated by newspaper and magazine articles, letters to the editor, and information provided by anti-spam groups) it is not unusual for people to come into work on a Monday morning to find 100 or more spam messages or to come back from lunch to find a dozen spams sitting in their in-boxes. Opening and deleting the messages means lost productivity to their employers.

Then, when people leave work for the day and go home and check their personal e-mail accounts, they may find several more spams awaiting them. Opening and deleting those messages deprives recipients of their own time and possibly money, if for example they have a dial-up connection and pay phone charges by the minute or have a restricted number of “free” minutes through their Internet service provider. While some people learn to recognize spam fairly quickly, an employee may be very hesitant to delete a message until she is absolutely sure it has nothing to do with her work, and people who post messages to the web frequently may be used to receiving e-mails from people they do not know. People who do not use e-mail frequently may have a more difficult time distinguishing between legitimate e-mail solicitations and abusive or fraudulent offers. The nuances of individuals’ e-mail habits may make it difficult to determine the precise costs of spam in time and money to businesses and ordinary e-mail users, but it is clear that the costs can be significant.

Then, there are other, less tangible costs of spam: consider the employee or child who receives a message containing pornography. Regardless of one’s web browsing history, anyone can receive such messages, and this may in turn lead a supervisor or parent to suspect that their employees and children have been seeking such inappropriate content. This can create tension in the home or workplace, which may not be easily quantifiable.

The scope of the problem is huge. The Economist cites one estimate that spam accounts for 45 percent of all e-mail traffic. In addition, the article reports that America OnLine (AOL) “blocks an average of 780 million junk e-mails daily, or about 100 million more e-mails than it actually delivers”! AOL and other e-mail address providers offer their address holders the option of having spam--or rather, e-mail that appears to be unsolicited and unwanted--sent to a junk mail folder. Those who choose this option can adjust the settings so that the folder purges itself on a daily or other periodic basis. Others have independently composed “black lists” outing spammers. Filtering invariably leaves gaps, letting some UCE through, and blocks some wanted e-mail from reaching addressees. Moreover, filtering e-mails is time-consuming and expensive, whether it is an individual e-mail address holder trying to sort through the wanted and junk e-mail messages or an e-mail address provider trying to keep address holders happy by preventing the junk from ever reaching their in-boxes. And black lists leave judgments about who is a spammer and who is not to the discretion of whoever creates the lists. Being misidentified as a spammer can create problems for “legitimate” e-mail senders, while illegitimate senders often “hijack” e-mail addresses and domain names to spam, creating difficulties for the victimized e-mail address or domain name holder who must then explain that he is not the spammer.

AOL has joined forces with major competitors Microsoft and Yahoo to collectively target the problem, and in late April and early May, the Federal Trade Commission held a three-day conference addressing spam. Various federal legislative proposals have been introduced and debated, and over half the states have enacted their own anti-spam legislation. Many people believe that Michigan should do the same.

THE CONTENT OF THE BILL:

The bill would create a new act, the “Unsolicited Commercial E-mail Protection Act”, to regulate e-mail messages that promote goods, services, real property, and other things of value and are sent without the recipient’s express permission. Senders of unsolicited commercial e-mail messages (UCE) would have to identify themselves, indicate in the subject heading that the message contained an advertisement, and allow recipients of such ads to “conveniently and at no cost” opt out of receiving future UCE from the sender. In addition, the bill would prohibit the falsification or misrepresentation of a message’s point of origin or transmission path and would prohibit knowingly selling, giving, or otherwise distributing software that falsifies such information.

“Commercial e-mail” would be defined as an electronic message, file, data, or other information promoting the sale, lease, or exchange of goods, services, real property or any other thing of value that was transmitted between two or more computers, computer networks, or electronic terminals within a