Research Article a Behavior-Driven Forum Spammer Recognition Method with Its Application in Automobile Forums

Total Page:16

File Type:pdf, Size:1020Kb

Research Article a Behavior-Driven Forum Spammer Recognition Method with Its Application in Automobile Forums Hindawi Mathematical Problems in Engineering Volume 2021, Article ID 7682579, 11 pages https://doi.org/10.1155/2021/7682579 Research Article A Behavior-Driven Forum Spammer Recognition Method with Its Application in Automobile Forums Han Su,1 Minglun Ren,1 Anning Wang,1 Xiaoan Tang ,1 Xin Ni,2 and Zhao Fang1,3 1School of Management, Hefei University of Technology, Hefei 230009, China 2Department of Design, Information System and Inventive Processes, INSA de Strasbourg, Strasbourg, France 3Department of Information Systems and Analytics, National University of Singapore, 13 Computing Drive, Singapore Correspondence should be addressed to Xiaoan Tang; [email protected] Received 10 June 2021; Revised 9 August 2021; Accepted 13 August 2021; Published 31 August 2021 Academic Editor: Xin Tian Copyright © 2021 Han Su et al. )is is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Forum comments are valuable information for enterprises to discover public preferences and market trends. However, extensive marketing and malicious attack behaviors in forums are always an obstacle for enterprises to make effective use of this in- formation. And these forum spammers are constantly updating technology to prevent detection. )erefore, how to accurately recognize forum spammers has become an important issue. Aiming to accurately recognize forum spammers, this paper changes the research target from understanding abnormal reviews and the suspicious relationship among forum spammers to discover how they must behave (follow or be followed) to achieve their monetary goals. First, we classify forum spammers into automated forum spammers and marketing forum spammers based on different behavioral features. )en, we propose a support vector machine-based automated spammer recognition (ASR) model and a k-means clustering-based marketing spammer recognition (MSR) model. )e experimental results on the real-world labelled dataset illustrate the effectiveness of our methods on clas- sification spammer from common users. To the best of our knowledge, this work is among the first to construct behavior-driven recognition models according to the different behavioral patterns of forum spammers. 1. Introduction happened, the collective memory becomes relatively resis- tant to competing information [3]. )us, fake reviews in In recent years, with the background of social media, forums forums are now the biggest problem for forum users and have become a specific community for users who have the enterprises. same interests. An increasing number of users post related Lots of current studies indirectly identify fake reviews by reviews in forums [1]. )ese reviews cover a wide variety of recognizing forum spammers based on behavioral features content, ranging from breaking news, discussions on various or sentiment analysis methods [4–7]. However, forum topics, posts about one’s personal life, and the sharing of spammers are constantly updating their technology or activities and interests [2]. As a significant platform for the changing their posting methods to prevent them from being users’ discussion, some forums maintain a high level of user detected by the fake reviews recognition system, which activity. In addition, the feedback from forum users is makes many methods no longer useful for recognizing fo- usually an important source of information for potential rum spammers. Although the forum spammers try to dis- consumers to access product features. Enterprises also aim guise themselves as ordinary users, this purposeful posting to discover product defects and real users’ requirements via will eventually show different behaviors from ordinary users. reviews in forums. )erefore, this paper changes the research target from Due to the strong negative response to the initial ex- understanding abnormal reviews and the suspicious rela- posure to erroneous information, it is difficult to correct tionship among forum spammers to discovering how they such influences later. Once a network agrees on what must behave (follow or be followed) to achieve their 2 Mathematical Problems in Engineering monetary goals. Firstly, we classify forum users as automated integrated semantic analysis, was quite promising in the spammers, marketing spammers, and normal users real-world case study, as confirmed with both supervised according to the different behavior patterns of forum users. learning and unsupervised learning techniques by com- Automated spammers are those forum users who are paring a nonsemantic and semantic analysis. As demon- controlled by the spam software. )ey disguise themselves as strated in [16], by analyzing the features of forum users, normal users who display an intention to purchase the forum spammer, and forums, the authors found that every related product or express dissatisfaction toward a related forum has many fake reviews, including some forums with product. Normally, automated spammers mislead forum good reputations. users by posting reviews with a biased emotional tendency. However, our work found that the methods mentioned Marketing spammers are real users who are hired by a spam above are no longer working well. For instance, most users company. In contrast to automated spammers, marketing are now able to easily distinguish rough and fake websites spammers disguise themselves as leading users in forums to with many advertisements, so the number of fake reviews promote related products. )ey post deep, detailed, and with URLs [12, 13] has become much lower. Additionally, positive reviews to overstate the quality of related products. we found that the recognition effect of the method in [14] In general, the more detailed analysis, the more useful in- would be compromised if a large number of forum spam- formation for forum users [8–10]. Moreover, marketing mers have occupied the forums. In our study, the abnormal spammers, as a new but contemptible marketing mode, are feature named spamicity in the first post in [5] does not work emerging in many forums [11]. )en, we propose a be- currently for recognizing forum spammers. At the same havior-driven automated spammer recognition (ASR) time, we found that marketing spammers have a similar model and a marketing spammer recognition (MSR) model abnormal behavioral feature named the submission time of to recognize forum spammers based on the above three types replies in [5] but we cannot find the same behavioral pattern of forum users. Final experimental results illustrate our among automated spammers. In [16], the method that behavior-driven recognition models are able to accurately recognizes spam pages based on spam content features is still detect forum spammers. effective, but this method cannot efficiently recognize forum )e paper is organized as follows: Section 2 reviews the spammers who have many reviews that are similar to those related works. In Section 3, we define some variables to of normal users. In [6], the authors mentioned that once a measure the behavior features of forum users. )e proposed mission is finished, a paid spam poster normally discards the ASR and MSR models are introduced in Section 4. Subse- user ID and never uses it again, potential paid spam posters quently, we describe the experimental dataset and discuss are not willing to continue their activities for a long time. the main experimental results in Section 5. Finally, we In recent years, research on spammers in social media conclude with a summary in Section 6. and e-business websites has been increasing. Liu [17] pro- posed a two-stage cascading model, named ProZombie, 2. Related Works which balanced effectiveness and accuracy well in recog- nizing spammers in Weibo. In [18], message content, user At present, the research on recognizing spammers and fake behavior, and social relationship information were fully used reviews is mainly focused on social media like Twitter. Some to recognize spammers in Weibo. )e work by Hayati et al. e-business websites, such as Amazon and Taobao, have also [19] proposed using a self-organizing map and neural achieved more research attention. In terms of recognizing networks to determine the features of spammers on the forum spammers, a few studies have been conducted in Internet. )ey classified spammers into four categories based recent years, mainly focusing on the recognition of fake on the different behavioral patterns of spammers: content forums and forum spam automator tools. Some recognition submitters, profile editors, content viewers, and mixed be- methods based on abnormal text content have also been havior. Radford et al. [20] constructed an unsupervised proposed by researchers. Some researchers attempt to use representation learning system, which reached an accuracy abnormal URL characteristics in reviews and the link of 91.8% in sentiment analysis by using reviews in Amazon structure of the graph rooted at the posted URL to recognize as training datasets. Furthermore, the authors in [12, 21] posts from the forum spammers [12, 13]. Additionally, recognized fake reviews via the difference of emoticons, contents unrelated to the target posts in the forum were used URLs, @ symbols, and photos in different reviews from to recognize forum spammers [14]. Shin [15] discovered spammers and normal users. Dewang et al. [22] proposed a some features and operational mechanisms of a forum spam spam detection framework combining the PageRank algo- automator tool
Recommended publications
  • Address Munging: the Practice of Disguising, Or Munging, an E-Mail Address to Prevent It Being Automatically Collected and Used
    Address Munging: the practice of disguising, or munging, an e-mail address to prevent it being automatically collected and used as a target for people and organizations that send unsolicited bulk e-mail address. Adware: or advertising-supported software is any software package which automatically plays, displays, or downloads advertising material to a computer after the software is installed on it or while the application is being used. Some types of adware are also spyware and can be classified as privacy-invasive software. Adware is software designed to force pre-chosen ads to display on your system. Some adware is designed to be malicious and will pop up ads with such speed and frequency that they seem to be taking over everything, slowing down your system and tying up all of your system resources. When adware is coupled with spyware, it can be a frustrating ride, to say the least. Backdoor: in a computer system (or cryptosystem or algorithm) is a method of bypassing normal authentication, securing remote access to a computer, obtaining access to plaintext, and so on, while attempting to remain undetected. The backdoor may take the form of an installed program (e.g., Back Orifice), or could be a modification to an existing program or hardware device. A back door is a point of entry that circumvents normal security and can be used by a cracker to access a network or computer system. Usually back doors are created by system developers as shortcuts to speed access through security during the development stage and then are overlooked and never properly removed during final implementation.
    [Show full text]
  • Adversarial Web Search by Carlos Castillo and Brian D
    Foundations and TrendsR in Information Retrieval Vol. 4, No. 5 (2010) 377–486 c 2011 C. Castillo and B. D. Davison DOI: 10.1561/1500000021 Adversarial Web Search By Carlos Castillo and Brian D. Davison Contents 1 Introduction 379 1.1 Search Engine Spam 380 1.2 Activists, Marketers, Optimizers, and Spammers 381 1.3 The Battleground for Search Engine Rankings 383 1.4 Previous Surveys and Taxonomies 384 1.5 This Survey 385 2 Overview of Search Engine Spam Detection 387 2.1 Editorial Assessment of Spam 387 2.2 Feature Extraction 390 2.3 Learning Schemes 394 2.4 Evaluation 397 2.5 Conclusions 400 3 Dealing with Content Spam and Plagiarized Content 401 3.1 Background 402 3.2 Types of Content Spamming 405 3.3 Content Spam Detection Methods 405 3.4 Malicious Mirroring and Near-Duplicates 408 3.5 Cloaking and Redirection 409 3.6 E-mail Spam Detection 413 3.7 Conclusions 413 4 Curbing Nepotistic Linking 415 4.1 Link-Based Ranking 416 4.2 Link Bombs 418 4.3 Link Farms 419 4.4 Link Farm Detection 421 4.5 Beyond Detection 424 4.6 Combining Links and Text 426 4.7 Conclusions 429 5 Propagating Trust and Distrust 430 5.1 Trust as a Directed Graph 430 5.2 Positive and Negative Trust 432 5.3 Propagating Trust: TrustRank and Variants 433 5.4 Propagating Distrust: BadRank and Variants 434 5.5 Considering In-Links as well as Out-Links 436 5.6 Considering Authorship as well as Contents 436 5.7 Propagating Trust in Other Settings 437 5.8 Utilizing Trust 438 5.9 Conclusions 438 6 Detecting Spam in Usage Data 439 6.1 Usage Analysis for Ranking 440 6.2 Spamming Usage Signals 441 6.3 Usage Analysis to Detect Spam 444 6.4 Conclusions 446 7 Fighting Spam in User-Generated Content 447 7.1 User-Generated Content Platforms 448 7.2 Splogs 449 7.3 Publicly-Writable Pages 451 7.4 Social Networks and Social Media Sites 455 7.5 Conclusions 459 8 Discussion 460 8.1 The (Ongoing) Struggle Between Search Engines and Spammers 460 8.2 Outlook 463 8.3 Research Resources 464 8.4 Conclusions 467 Acknowledgments 468 References 469 Foundations and TrendsR in Information Retrieval Vol.
    [Show full text]
  • Battling the Internet Water Army: Detection of Hidden Paid Posters
    Battling the Internet Water Army: Detection of Hidden Paid Posters Cheng Chen Kui Wu Venkatesh Srinivasan Xudong Zhang Dept. of Computer Science Dept. of Computer Science Dept. of Computer Science Dept. of Computer Science University of Victoria University of Victoria University of Victoria Peking University Victoria, BC, Canada Victoria, BC, Canada Victoria, BC, Canada Beijing, China Abstract—We initiate a systematic study to help distinguish on different online communities and websites. Companies are a special group of online users, called hidden paid posters, or always interested in effective strategies to attract public atten- termed “Internet water army” in China, from the legitimate tion towards their products. The idea of online paid posters ones. On the Internet, the paid posters represent a new type of online job opportunity. They get paid for posting comments is similar to word-of-mouth advertisement. If a company hires and new threads or articles on different online communities enough online users, it would be able to create hot and trending and websites for some hidden purposes, e.g., to influence the topics designed to gain popularity. Furthermore, the articles opinion of other people towards certain social events or business or comments from a group of paid posters are also likely markets. Though an interesting strategy in business marketing, to capture the attention of common users and influence their paid posters may create a significant negative effect on the online communities, since the information from paid posters is usually decision. In this way, online paid posters present a powerful not trustworthy. When two competitive companies hire paid and efficient strategy for companies.
    [Show full text]
  • The History of Digital Spam
    The History of Digital Spam Emilio Ferrara University of Southern California Information Sciences Institute Marina Del Rey, CA [email protected] ACM Reference Format: This broad definition will allow me to track, in an inclusive Emilio Ferrara. 2019. The History of Digital Spam. In Communications of manner, the evolution of digital spam across its most popular appli- the ACM, August 2019, Vol. 62 No. 8, Pages 82-91. ACM, New York, NY, USA, cations, starting from spam emails to modern-days spam. For each 9 pages. https://doi.org/10.1145/3299768 highlighted application domain, I will dive deep to understand the nuances of different digital spam strategies, including their intents Spam!: that’s what Lorrie Faith Cranor and Brian LaMacchia ex- and catalysts and, from a technical standpoint, how they are carried claimed in the title of a popular call-to-action article that appeared out and how they can be detected. twenty years ago on Communications of the ACM [10]. And yet, Wikipedia provides an extensive list of domains of application: despite the tremendous efforts of the research community over the last two decades to mitigate this problem, the sense of urgency ``While the most widely recognized form of spam is email spam, the term is applied to similar abuses in other media: instant remains unchanged, as emerging technologies have brought new messaging spam, Usenet newsgroup spam, Web search engine spam, dangerous forms of digital spam under the spotlight. Furthermore, spam in blogs, wiki spam, online classified ads spam, mobile when spam is carried out with the intent to deceive or influence phone messaging spam, Internet forum spam, junk fax at scale, it can alter the very fabric of society and our behavior.
    [Show full text]
  • Image Spam Detection: Problem and Existing Solution
    International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 06 Issue: 02 | Feb 2019 www.irjet.net p-ISSN: 2395-0072 Image Spam Detection: Problem and Existing Solution Anis Ismail1, Shadi Khawandi2, Firas Abdallah3 1,2,3Faculty of Technology, Lebanese University, Lebanon ----------------------------------------------------------------------***--------------------------------------------------------------------- Abstract - Today very important means of communication messaging spam, Internet forum spam, junk fax is the e-mail that allows people all over the world to transmissions, and file sharing network spam [1]. People communicate, share data, and perform business. Yet there is who create electronic spam are called spammers [2]. nothing worse than an inbox full of spam; i.e., information The generally accepted version for source of spam is that it crafted to be delivered to a large number of recipients against their wishes. In this paper, we present a numerous anti-spam comes from the Monty Python song, "Spam spam spam spam, methods and solutions that have been proposed and deployed, spam spam spam spam, lovely spam, wonderful spam…" Like but they are not effective because most mail servers rely on the song, spam is an endless repetition of worthless text. blacklists and rules engine leaving a big part on the user to Another thought maintains that it comes from the computer identify the spam, while others rely on filters that might carry group lab at the University of Southern California who gave high false positive rate. it the name because it has many of the same characteristics as the lunchmeat Spam that is nobody wants it or ever asks Key Words: E-mail, Spam, anti-spam, mail server, filter.
    [Show full text]
  • Understanding the Business Ofonline Pharmaceutical Affiliate Programs
    PharmaLeaks: Understanding the Business of Online Pharmaceutical Affiliate Programs Damon McCoy Andreas Pitsillidis∗ Grant Jordan∗ Nicholas Weaver∗† Christian Kreibich∗† Brian Krebs‡ Geoffrey M. Voelker∗ Stefan Savage∗ Kirill Levchenko∗ Department of Computer Science ∗Department of Computer Science and Engineering George Mason University University of California, San Diego †International Computer Science Institute ‡KrebsOnSecurity.com Berkeley, CA Abstract driven by competition between criminal organizations), a broad corpus of ground truth data has become avail- Online sales of counterfeit or unauthorized products able. In particular, in this paper we analyze the content drive a robust underground advertising industry that in- and implications of low-level databases and transactional cludes email spam, “black hat” search engine optimiza- metadata describing years of activity at the GlavMed, tion, forum abuse and so on. Virtually everyone has en- SpamIt and RX-Promotion pharmaceutical affiliate pro- countered enticements to purchase drugs, prescription- grams. By examining hundreds of thousands of orders, free, from an online “Canadian Pharmacy.” However, comprising a settled revenue totaling over US$170M, even though such sites are clearly economically moti- we are able to provide comprehensive documentation on vated, the shape of the underlying business enterprise three key aspects of underground advertising activity: is not well understood precisely because it is “under- Customers. We provide detailed analysis on the con- ground.” In this paper we exploit a rare opportunity to sumer demand for Internet-advertised counterfeit phar- view three such organizations—the GlavMed, SpamIt maceuticals, covering customer demographics, product and RX-Promotion pharmaceutical affiliate programs— selection (including an examination of drug abuse as a from the inside.
    [Show full text]
  • Internet Security
    In the News Articles in the news from the past month • “Security shockers: 75% of US bank websites Internet Security have flaws” • “Blank robbers swipe 3,000 ‘fraud-proof’ UK passports” • “Korean load sharks feed on hacked data” • “Worms spread via spam on Facebook and Nan Niu ([email protected]) MySpace” CSC309 -- Fall 2008 • “Beloved websites riddled with crimeware” • “Google gives GMail always-on encryption” http://www.theregister.co.uk 2 New Targets of 2007 Scenario 1 • Cyber criminals and cyber spies have • The Chief Information Security Officer shifted their focus again of a medium sized, but sensitive, federal – Facing real improvements in system and agency learned that his computer was network security sending data to computers in China. • The attackers now have two new targets • He had been the victim of a new type of spear phishing attack highlighted in this – users who are easily misled year’s Top 20. – custom-built applications • Once they got inside, the attackers had • Next, 4 exploits scenarios… freedom of action to use his personal • Reported by SANS (SysAdmin, Audit, Network, computer as a tunnel into his agency’s Security), http://www.sans.org systems. 3 4 Scenario 2 Scenario 3 • Hundreds of senior federal officials and business • A hospital’s website was compromised executives visited a political think-tank website that had been infected and caused their computers to because a Web developer made a become zombies. programming error. • Keystroke loggers, placed on their computers by the • Sensitive patient records were taken. criminals (or nation-state), captured their user names and passwords when their stock trading accounts and • When the criminals proved they had the their employers computers, and sent the data to data, the hospital had to choose between computers in different countries.
    [Show full text]
  • Administrator's Guide for Synology Mailplus Server
    Administrator's Guide for Synology MailPlus Server Based on Synology MailPlus Server 2.2 1 Table of Contents Introduction 01 Chapter 1: Deployment Guidelines 02 Select a Synology NAS Estimate RAM and Storage Requirements Running Multiple I/O Intensive Packages on the Same NAS Chapter 2: Getting Started with MailPlus Server 06 Connect Synology NAS to the Internet Set up DNS Set up MailPlus Server Set up MailPlus Client Run MailPlus Third-Party Email Clients Troubleshoot Chapter 3: Mail Migration 19 Create a Mail Migration Task in MailPlus Server Import System Configurations from Microsoft Exchange to MailPlus Server Chapter 4: User Licenses 27 Purchase Licenses Install Licenses Use Licenses Chapter 5: Account Settings 31 Account System Activate Accounts Manage Privileges Chapter 6: Protocol Settings 46 SMTPI MAP/POP3 Network Interface Chapter 7: SMTP Settings 50 Service Settings SMTP Secure Connection Mail Relay Chapter 8: Domain Settings 66 Domain Domain Management Chapter 9: Security Settings 83 Spam Antivirus Scan Authentication Content Protection Chapter 10: Monitor Settings 107 Monitor Server Status Monitor Mail Queue Monitor Mail Log Chapter 11: Disaster Recovery 127 High-Availability Cluster Back up and Restore Email Chapter 12: MailPlus Navigation 140 Basic Operations Advanced Settings Introduction Introduction The Synology MailPlus suite provides advanced and secure mail service with high usability. This suite consists of two packages: MailPlus Server and MailPlus. MailPlus Server is an administration console that offers diverse settings, while MailPlus is an email platform for client users. This administrator's guide will guide you through the MailPlus Server setup and give detailed configuration instructions including DNS settings, mail service migration, and other security adjustments.
    [Show full text]
  • An Empirical Assessment of the CAN SPAM Act
    Portland State University PDXScholar Dissertations and Theses Dissertations and Theses 1-1-2010 An Empirical Assessment of the CAN SPAM Act Alex Conrad Kigerl Portland State University Follow this and additional works at: https://pdxscholar.library.pdx.edu/open_access_etds Let us know how access to this document benefits ou.y Recommended Citation Kigerl, Alex Conrad, "An Empirical Assessment of the CAN SPAM Act" (2010). Dissertations and Theses. Paper 704. https://doi.org/10.15760/etd.704 This Thesis is brought to you for free and open access. It has been accepted for inclusion in Dissertations and Theses by an authorized administrator of PDXScholar. Please contact us if we can make this document more accessible: [email protected]. An Empirical Assessment of the CAN SPAM Act by Alex Conrad Kigerl A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Criminology and Criminal Justice Thesis Committee: Scott Cunningham, Chair Warren Harrison Danielle McGurrin Portland State University ©2010 i Abstract In January 2004, the United States Congress passed and put into effect the Controlling the Assault of Non-Solicited Pornography and Marketing Act (CAN SPAM). The Act was set forth to regulate bulk commercial email (spam) and set the limits for what was acceptable. Various sources have since investigated and speculated on the efficacy of the CAN SPAM Act, few of which report a desirable outcome for users of electronic mail. Despite the apparent consensus of anti-spam firms and the community of email users that the Act was less than effective, there is little to no research on the efficacy of the Act that utilizes any significant statistical rigor or accepted scientific practices.
    [Show full text]
  • Criminal Abuse of Domain Names Bulk Registration and Contact Information Access
    Criminal Abuse of Domain Names Bulk Registration and Contact Information Access Prepared by Dave Piscitello and Dr. Colin Strutt Interisle Consulting Group, LLC 17 October 2019 © 2019 Interisle Consulting Group 2 Executive Summary Domain names that can be rapidly acquired, used in an attack, and abandoned before they can be traced are a critical resource for cybercriminals. Some attacks, including spam and ransomware campaigns and criminal infrastructure operation (e.g., “botnets”), benefit particularly from the ability to rapidly and cheaply acquire very large numbers of domain names—a tactic known as bulk registration. When cybercriminals can register hundreds or thousands of domain names in a matter of minutes, an attack can be widely distributed to make detection, blocking, and dismantling more difficult and prolonged. Cybercrime investigation is always a race against the clock—the longer it takes to identify an attacker and block the attack, the more damage can be inflicted on more victims. Before the adoption by ICANN of a Temporary Specification (“Temp Spec”) for handling domain name registration data in compliance with the European General Data Protection Regulation (GDPR), investigators had ready access to the contact information provided by domain name registrants (“Whois data”). This information, even when incomplete or inaccurate, facilitated rapid attack response both directly (when it correctly identified the attacker) and indirectly (by enabling “connect the dots” methods such as search-and-pivot). The immediate effect of the Temp Spec since the GDPR took full effect on 25 May 2018 has been to severely limit access to domain name registrant contact information, most of which is now redacted by registries and registrars when they respond to Whois data queries.
    [Show full text]
  • Anti-Spam Methods
    INTRODUCTION Spamming is flooding the Internet with many copies of the same message, in an attempt to force the message on people who would not otherwise choose to receive it. Most spam is commercial advertising, often for dubious products, get-rich-quick schemes, or quasi- legal services. Spam costs the sender very little to send -- most of the costs are paid for by the recipient or the carriers rather than by the sender. There are two main types of spam, and they have different effects on Internet users. Cancellable Usenet spam is a single message sent to 20 or more Usenet newsgroups. (Through long experience, Usenet users have found that any message posted to so many newsgroups is often not relevant to most or all of them.) Usenet spam is aimed at "lurkers", people who read newsgroups but rarely or never post and give their address away. Usenet spam robs users of the utility of the newsgroups by overwhelming them with a barrage of advertising or other irrelevant posts. Furthermore, Usenet spam subverts the ability of system administrators and owners to manage the topics they accept on their systems. Email spam targets individual users with direct mail messages. Email spam lists are often created by scanning Usenet postings, stealing Internet mailing lists, or searching the Web for addresses. Email spams typically cost users money out-of-pocket to receive. Many people - anyone with measured phone service - read or receive their mail while the meter is running, so to speak. Spam costs them additional money. On top of that, it costs money for ISPs and online services to transmit spam, and these costs are transmitted directly to subscribers.
    [Show full text]
  • Spam (Spam 2.0) Through Web Usage
    Digital Ecosystems and Business Intelligence Institute Addressing the New Generation of Spam (Spam 2.0) Through Web Usage Models Pedram Hayati This thesis is presented for the Degree of Doctor of Philosophy of Curtin University July 2011 I Abstract Abstract New Internet collaborative media introduce new ways of communicating that are not immune to abuse. A fake eye-catching profile in social networking websites, a promotional review, a response to a thread in online forums with unsolicited content or a manipulated Wiki page, are examples of new the generation of spam on the web, referred to as Web 2.0 Spam or Spam 2.0. Spam 2.0 is defined as the propagation of unsolicited, anonymous, mass content to infiltrate legitimate Web 2.0 applications. The current literature does not address Spam 2.0 in depth and the outcome of efforts to date are inadequate. The aim of this research is to formalise a definition for Spam 2.0 and provide Spam 2.0 filtering solutions. Early-detection, extendibility, robustness and adaptability are key factors in the design of the proposed method. This dissertation provides a comprehensive survey of the state-of-the-art web spam and Spam 2.0 filtering methods to highlight the unresolved issues and open problems, while at the same time effectively capturing the knowledge in the domain of spam filtering. This dissertation proposes three solutions in the area of Spam 2.0 filtering including: (1) characterising and profiling Spam 2.0, (2) Early-Detection based Spam 2.0 Filtering (EDSF) approach, and (3) On-the-Fly Spam 2.0 Filtering (OFSF) approach.
    [Show full text]