International Journal of Computer Applications (0975 – 8887)
Volume 43 – No. 1, April 2012

Identification of Factors Affecting Spam – A Survey Based Approach

Laxmi Ahuja
Amity Institute of Information Technology, Amity University, UP

Ela Kumar
School of Information Technology, Gautam Buddha University, Greater Noida

ABSTRACT
The process of reducing spam has become much more difficult than before, because more than 85% of all email received today is spam. The large quantity of unsolicited email has also become a huge problem for IT relations in several universities. As spammers become more advanced, many more innocent students are being caught in their web of fraud. Most IT-related spam messages revolve around the phishing technique, where bank accounts and credit card information are targeted. On the other hand, spammers that target universities are more intent on gaining access to the universities' networks, a strategy that will enable them to spam more staff members and students. The present paper performs a survey of various web users to identify the factors affecting spam. Analysis shows that high keyword density content, excessive anchor text at the bottom of the web page, excessive reciprocal linking and keyword stuffing are the major factors that affect spam.

Keywords
Anti-spamming, web search engine

1. INTRODUCTION
Spam has become one of the biggest problems facing the Internet today and has turned email into an unreliable Internet tool. If you talk to some end users, they do not see much difference between spam and the ordinary junk mail that mail carriers have delivered for years. In a recent survey, 93% of respondents reported dissatisfaction with the large volume of unsolicited email (spam) they receive. The problem has grown to the point where nearly 50% of the world's email is spam, yet only a few hundred groups are responsible. Many anti-spam solutions have been proposed and a few have been implemented [1,2]. Unfortunately, these solutions do not prevent spam as much as they interfere with everyday email communications.

The problems posed by spam have grown from simple annoyances to significant security issues. The deluge of spam costs up to an estimated $20 billion each year in lost productivity; according to the same document, spam within a company can cost between $600 and $1,000 per year for every user [3].

To prevent spam, both end users and administrators use various anti-spam techniques. One example of blocking spam is the use of filters. Despite this, filtering rests on dubious assumptions, as those who specifically filter spam are the least likely to purchase from the spammer. Another approach to blocking spam is to make use of blacklists that contain addresses of known spammers or compromised hosts. However, these lists have to be constantly updated because spammers have learned to counteract them by rapidly changing the origin of spam [4]. Some of these techniques have been embedded in products, services and software to ease the burden on users and administrators.

However, in spite of a lot of research, no single technique is a complete solution to the spam problem. Each has trade-offs between incorrectly rejecting legitimate e-mail and not rejecting all spam, along with the associated costs in time and effort [5].

2. PROPOSED SURVEY
To identify the factors responsible for spamming, we conducted a detailed survey asking web users about their preferences on some important issues. The survey consists of two parts: the first part collects some information about the respondent, and the second part contains nine questions. The sample questionnaire is given here. The questionnaire has been developed to examine views on search engine spamming techniques in technical educational institutions in India.

The questionnaire consists of questions on the importance and perceived benefits of spamming intervention in technical educational institutes in India, as well as the barriers and threats faced.

The prospective respondents of the questionnaire are section/department heads, faculty, students and other web users. Respondents were requested to give their rating, based on their perception of the ideas listed, as per the following scales.

S.No  Question Statement

IMPORTANCE OF SPAMMING AND ANTI SPAMMING

1  What does the user want from anti-spamming software?
   i) High detection rate of spam
   ii) Low false positives
2  What method does the user use to get protection from search engine spam?
   i) Using anti-spamming software

   ii) Using a need-specific search engine (e.g., torrentz for searching movies)
3  What kind of web spam is most common?
   i) Content spam
   ii) Link spam
   iii) Mirror
4  How do you combat web spam?
   i) TrustRank or HillTop
   ii) BadRank or SpamRank
   iii) SandBox
   iv) Anti Spam
5  How can one identify a spam?
   i) If the recipient's personal identity and context are irrelevant.
   ii) The recipient has not granted permission to receive any information.
   iii) The transmission and reception of the message gives benefit to the sender.
6  How does a spam affect the user?
   i) Install viruses, Trojan, spyware
   ii) Collect personal information
   iii) Compromises your network
   iv) Blacklisting your email system
7  Why is spam used?
   i) To sell a product
   ii) Promote fake product
   iii) Circulate computer viruses
   iv) Connect to illegal activities
8  Which search engine is safer?
   i) Google
   ii) Yahoo
   iii) MSN
   iv) Bing
9  What are your expectations from us in terms of deliverables? How would you qualify success?
   i) Higher rankings in search engines like Google, Yahoo, MSN, ASK etc.
   ii) Increased number of unique and targeted visitors
   iii) Sales and registration conversions

This survey was filled in by 59 respondents from various domains and areas of expertise. The results were compiled and analyzed and are presented in the following section.

3. ANALYSIS
Q1 What does the user want from anti-spamming software?
Ans: 82% of the respondents want a high detection rate of spam and 18% want low false positives.

Q2 What method does the user use to get protection from search engine spam?
Ans: Users apply both techniques to get protection from spam.

Q3 What kind of web spam is most common?
Ans: The majority of respondents feel that content spam is the most common web spam.

Q4 How do you combat web spam?
Ans: The results show that Anti Spam and SpamRank are the two main methods used to combat web spam.

Q5 How can one identify a spam?
Ans: Spam may be identified mainly when the recipient's personal identity and context are irrelevant. A good number of respondents also think that spam can be identified when the transmission and reception of the message benefit the sender.

Q6 How does a spam affect the user?
Ans: Viruses, Trojans and spyware are the main threats posed by spam. This was the effect cited by the majority of the users.

Q7 Why is spam used?
Ans: Circulating viruses and connecting to illegal activities are the main purposes for which spam is used.

Q8 Which search engine is safer?
Ans: Google was voted the safest search engine by the majority of respondents.

Q9 What are your expectations from us in terms of deliverables? How would you qualify success?
Ans: Higher ranking was the preferred choice for most of the respondents.

Based on the above survey, its analysis and the available literature, we extracted some factors which are responsible for spamming. These factors are described in the following section.

4. FACTORS AFFECTING SPAMMING
Based on the above survey and further detailed analysis, the following factors emerged as affecting spamming:

(a) High keyword density content – Google easily detects content that contains keywords beyond the acceptable limit. The ideal keyword density is 5% to 9%. To avoid spamming when keywords must be repeated, it is better to use synonyms of those keywords.

(b) Excessive anchor text at the bottom of the web page – Putting anchor text at the bottom of the page is often recommended by most SEO experts, because they believe that it is easily seen by the search engine crawlers. However, many web marketers overdo this technique, which results in spamming. One or two links at the bottom of the page would be enough [8].

(c) Excessive reciprocal linking – Reciprocal linking is another effective SEO technique. This can only be done right when you are exchanging links with a website whose content is relevant to your own website. Some of the

search engine optimizers are so eager to exchange links with other websites that they even buy them for a fee, without considering whether the website can help boost their own site [7]. The only way to make this tactic work is to link only with websites that, aside from being relevant, have the same page rank (PR) as yours. If your website has a page rank of 3, then you should exchange links with related websites that also have a PR of 3.

If you happen to link with websites that have a lower PR than yours, then your PR will be distributed to the site you link to. This means that your PR will, in a way, be reduced. Of course, if you link with websites that have a higher PR than yours, you are fortunate if they accept your link exchange request.

(d) Keyword stuffing in the web menu – Search engine spiders dislike seeing keywords that are overly stuffed anywhere on the page. Your website menu serves as a guide that helps your visitors understand what your site is all about. It also provides information on the services or products you offer.

(e) Directory submission to thousands of sites – This internet marketing technique does not really deliver the results that you are expecting, such as more traffic or a higher ranking in the search engine. The fact is that this technique gives nothing and only asks for more effort from you. It is not recommended to submit your site to submission directories that are rarely visited or have no visitors at all. You have to target the major players such as Google, Yahoo and MSN [6], which have site directories. The Open Directory Project is also one of the famous web directory submission sites. Only through famous sites can you draw more traffic or an increased ranking in the search engine results page.

To be summarized
The five factors that cause a site to be treated as spam are:
(a) High keyword density content,
(b) Excessive anchor text at the bottom of the web page,
(c) Excessive reciprocal linking,
(d) Keyword stuffing in the web menu, and
(e) Directory submission to thousands of sites.
These are just some of the tactics often committed by careless search engine optimizers.

Spam becomes more "genuine"
Spamming in general has become a gainful business, and the scams that were once easily identified now appear more genuine. Spammers are using legitimate online businesses, such as PayPal or eBay, to trick recipients into giving up important account information. While most people would never respond to messages from senders that they don't know, the situation is totally different when the request appears to come from a sender within the IT network. To truly understand the characteristics of spam targeted at information technology, all students and members of the IT staff must become aware of the spamming patterns and other key factors affecting the network.

Impact of the anti-spam on the email
Anti-spam systems classify e-mails into wanted and unwanted messages. Wanted messages are delivered to the inbox, while unwanted messages are directed to the spam box. In many cases, anti-spam systems make wrong estimations, although their great role in terms of privacy, security and protecting e-mail from attack cannot be ignored. From another view, anti-spam systems cost companies heavily by treating advertisement messages as spam [9]. Recently, most researchers have struggled with false spam estimations and have tried to move toward approaches that minimize false positive estimations. Some analyses show that anti-spam has directly affected e-mail marketing.

Issues and problems associated with spam email
There is no "silver bullet" for killing spam. If that magic bullet existed, ISPs and corporations worldwide would have already pulled the trigger! Many customers think their Internet Service Provider (ISP) should be able to fix the problem. But spam is a worldwide problem, and email systems around the world are not set up in a consistent manner.

5. CONCLUSION
In the present work, a test was conducted to determine the influence of spam on users' satisfaction with their email system, but statistically significant evidence was not found that spam affects overall user satisfaction with the email system. Improvements in user satisfaction are still needed, as spam can be a nasty and frequent source of irritation. Individual users need to be protected from falling victim to frauds that can lead to financial loss, identity theft or even worse. The flood of spam has threatened to make email unattractive to the extent of discarding it as a communications medium. It has also resulted in a decrease of user satisfaction with search results. On the positive side, an unexpected and fortunate side effect has also developed. It is believed that, by using certain techniques, implementations could be developed that are specifically customized to a new approach of dealing with the threat of spam.

6. REFERENCES
[1] Sergey Brin and Lawrence Page, The Anatomy of a Large-Scale Hypertextual Web Search Engine, available at http://infolab.stanford.edu/~backrub/google.html
[2] Message Sniffer Anti Spam Filter - High Performance Scanner, available at http://www.armresearch.com/products/index.jsp
[3] Anti Spam Solutions and Security, available at http://www.symantec.com/connect/articles/anti-spam-solutions-and-security
[4] McAfee Anti-Spam: Protecting Your Organization from Spam, Phishing, and Other Unsolicited Messages (2006), available at http://www.mcafee.com/us/local_content/white_papers/anti_spam.pdf
[5] Anti Spam Techniques, available at http://en.wikipedia.org/w/index.php?title=Anti-spam_techniques; www.gfi.com
[6] Impact of spam advertisement through e-mail: A study to assess the influence of the anti-spam on the e-mail marketing, available at http://academicjournals.org/ajbm/PDF/pdf2010/4Sept/Raad%20et%20al.pdf
[7] SPAM - The Issues, Impact and Reducing SPAM (Part 1), available at http://www.windowsecurity.com/whitepapers/Impact-Reducing-SPAM-Part1.html
[8] http://internetmarketingxpert.com/2009/05/5-factors-of-website-spamming-webmarketing-expertscomau-part-3
[9] Mostafa Raad, Norizan Mohd Yeassen, Gazi Mahabubul Alam, B. B. Zaidan and A. A. Zaidan, Impact of spam advertisement through e-mail: A study to assess the influence of the anti-spam on the e-mail marketing, African Journal of Business Management, Vol. 4(11), pp. 2362-2367, 4 September 2010
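
Illustrative note on factor (a), high keyword density content: the density figure quoted in Section 4 can be read as the share of words on a page that match a given keyword. The following is a minimal sketch under that reading, not the method used by any search engine. The function names `keyword_density` and `exceeds_limit`, the simple tokenization rule and the sample text are assumptions made for illustration; the 9% threshold is simply the upper end of the 5% to 9% range quoted in Section 4.

```python
import re


def keyword_density(text: str, keyword: str) -> float:
    """Return occurrences of `keyword` as a percentage of all words in `text`.

    Assumption: words are runs of letters, digits or apostrophes, compared
    case-insensitively. Only single-word keywords are handled.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word == keyword.lower())
    return 100.0 * hits / len(words)


def exceeds_limit(text: str, keyword: str, upper_limit: float = 9.0) -> bool:
    """Flag text whose keyword density is above `upper_limit` percent.

    The default of 9.0 is the upper end of the range quoted in Section 4;
    it is a hypothetical threshold, not a documented search engine rule.
    """
    return keyword_density(text, keyword) > upper_limit


# Hypothetical sample page text used only to exercise the functions.
SAMPLE = "cheap flights cheap hotels cheap deals book cheap travel today"

if __name__ == "__main__":
    density = keyword_density(SAMPLE, "cheap")
    print(f"density of 'cheap': {density:.1f}%")
    print("above limit:", exceeds_limit(SAMPLE, "cheap"))
```

In this sketch, replacing some repetitions with synonyms, as factor (a) suggests, lowers the count of exact matches and therefore the computed density.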
