Copyright 2009 Cengage Learning. All Rights Reserved. May Not Be Copied, Scanned, Or Duplicated, in Whole Or in Part

Index

Note: Page numbers referencing figures are italicized and followed by an “f”. Page numbers referencing tables are italicized and followed by a “t”.

A
About.com, 350
AboutUs.org, 186, 190–192
Access application, 349
account managers, 37–38
ActionScript, 353–356
Adobe Flash
  application, 340–341
  file format. See .flv file format
  player, 150, 153, 156
Adobe GoLive, 343
Adobe Photoshop, 339–340
Advanced Research Projects Agency (ARPA), 2
advertising
  dating sites, 106
  defined, 397
  e-commerce, 316
  Facebook, 94–96
  family and lifestyle sites, 109
  media, 373–376
  message, 371–372
  myYearbook site, 92
  overview, 364, 365f, 366–371
  product placement, 25
  versus public relations, 383–384
  shopping networking sites, 107
  standardization, 25
  theme, 372–373
  time frame, 373
  video viewer preferences, 171
Advertising Checking Bureau, 420–422
age, Internet use by, 20t, 22, 85, 129–130, 159, 304, 305t
Ajax, 353
Alexa.com, 42, 78
Alta Vista, 7
Amazon.com, 7f, 14f, 48, 247, 248f–249f, 319–320, 322
anonymity, 16
AOL, 8f, 14f, 77, 79f, 416
Apple iTunes, 13f–14f
Apple site, 11f, 284f
applets, Java, 352, 356
applications, see names of specific applications
ARPA (Advanced Research Projects Agency), 2
artistic fonts, 237
ASCO Power University, 168–170
.asf file format, 154t–155t
AskPatty.com, 206–209
AuctionWeb, 7f
audience, capturing and retaining, 61–62, 166, 263, 405–407, 410–422, 432
authors, blog, 140
avatars, 200–201, 210
.avi file format, 154t–155t

B
B2B (business to business) sites, 43–47, 67, 414
B2C (business to consumer) sites, 47–50, 67
B2E (business to employee) sites, 53, 68
Banker blog, 134, 135f
banking, online, 2
bankruptcy, 4, 9f
banner advertising, 7f, 316, 368
Barack Obama’s online store, 328f
BaseballNooz.com, 98–100
BBC News, 3
Bebo, 89t
behavioral changes, 16
benign disinhibition, 16
Best Dates Now blog, 123–125
billboard advertising, 369
bitmaps, 290, 292, 340, 357
BJ’s site, 318
Black Friday, 48
blog communities, 8f
blog editors, 120, 142
blog search engines, 126
Blogger, 344–347
blogging, 7f, 77–78, 86, 122–129, 133–141, 190, 415
blogosphere, 122, 142
blogrolls, 121, 142
Blue Nile site, 418
bookmarking, social, 76, 110–112
bookstores, online, 7f
boot-strapping, 261, 292
bounce rate, 389, 397, 441
brand
  aware consumers, 221
  building, 324–325, 370, 379–380, 432–433
  conscious consumers, 221
  growth stage, 372t
  guides, 239, 240f–241f, 252
  indifferent consumers, 221
  infancy stage, 372t
  loyalty, 219–227, 252, 304, 410
  management, 238, 252
  maturity stage, 372t
  personality, 252
  preferred consumers, 221
  profiles, 96
  promise, 243–244, 252
BrandNewDad.com, 109
brands
  in blogs, 133, 136, 138–139
  colors, 235–238
  consistency, 238–245
  defined, 27
  definitions of, 374
  distinguishing from company, 218–219
  elements of, 227–245
  fonts, 235–238
  heightened consumer interaction, 251
  image, 232–238
  increased markets, 248–249
  individual message delivery, 247–248
  logos, 232–234
  loyalty to, 220–227
  negative impacts on, 19
  overview, 217–218
  personality, 229–230
  promise, 228–229
  reinforcement of message, 250–251
  role of Web, 245–251
  stages of, 372t
  taglines, 234–235
  trust in, 304, 320–321
  unique selling proposition, 231–232
breadcrumbs, 285–286, 292
Breakenridge, Deirdre, 383–386
broad visibility, 82
broadband connections, 157, 158f
Brookstone site, 12f
browsers. See Web browsers
budgets, 224
Burger King, 14f, 396
burn rates, 4, 27
business blogs, 126
business models, 56–57
business networking sites, 100–106
business plans, 261, 292
business to business (B2B) sites, 43–47, 67, 414
business to consumer (B2C) sites, 47–50, 67
business to employee (B2E) sites, 53, 68
Business.com, 7
BusinessWeek, 2, 195f

C
C2C (consumer to consumer) sites, 50–53, 68
Cascading Style Sheets (CSS), 351–352, 354
cash payments, 302
casting, 173
Catalent site, 288f–289f
CERN (European Organization for Nuclear Research), 6f
Chaos radio show, 83–84
character blogs, 134, 137
chat rooms, AOL, 77
check-out process, 319–328
Chief Marketing Officer (CMO) Counsel, 406
churn rate, 406, 424
Citibank, 242
claiming process, loyalty program, 422
Classmates.com, 78
classroom style videos, 162
click-thru rate, 389, 398, 432, 435–436
CMO (Chief Marketing Officer) Counsel, 406
Coca-Cola concert stage, Second Life, 204f
codec, 154, 175
comments, video-sharing site, 161
commercial informational sites, 61
commercials, 25, 367
communication, 87–88, 245
comparison shopping, 301
compatibility, Flash video, 155
competition, 42, 106
compression, video, 174
comScore, 48, 93
concept creation, 172
conceptual marketing, 366, 398
connection speeds, 9
consumer interaction, heightened, 251
consumer products goods (CPG) firms, 408–409
consumer to consumer (C2C) sites, 50–53, 68
consumer visibility, 370–371
container formats, 154, 175
content development, 170, 291
content provision, 328–329
contests, 393, 417–418
Continental OnePass frequent flier program, 419, 420f
control, advertising, 370
convenience, online shopping, 301
conversion, video, 151, 156, 174
copy, 308, 309f
copywriters, 39
cost efficiency, 82
cost-per-impression method, 25, 27
couponing, 408
CPG (consumer products goods) firms, 408–409
Craigslist.com, 50, 51f, 78
creative directors, 37
Crocker, Chris, 162
CSS (Cascading Style Sheets), 351–352, 354
custom player controls, 155–156
“cyber life” personas, 16
Cyber Monday, 48
cyber newsrooms, 380–383, 398
cyber psychology, 17

D
dashboards, 344, 345f–347f, 438
dating, online, 2, 106–107
days to purchase measurement, 406
Deal or No Deal site, 134, 135f
del.icio.us site, 110, 111f
delivery, FLV file, 155
delivery wait time, 302
demographics
  bloggers, 129–130
  complexity of, 269–270
  defined, 27
  defining target market, 262–263
  Internet use, 20–23
  online shoppers, 304, 305t
  social media usage, 85
  video viewers, 157–161
Denny’s, 63f
designing Web sites
  audience retention techniques, 412–413
  breadcrumbs, 285–286
  content development, 291
  graphic design, 290
  navigation, 278–282
  page layout, 286–289
  search engines, 282–283
  site maps, 284
  tag clouds, 284–285
Deutsch NY, 374–376
developing Web sites. See designing Web sites
DHTML (Dynamic HTML), 352, 357
Digg.com, 110
digitizing, 174
direct mail, 387, 423
direct marketing, 364–366, 386–392, 398
direct sales, 307–312
disinhibition effect, 16
Disney, 218–219
display advertising, 107, 111, 368
distributed Web portals, 42, 68
domain names, 7
dot-coms, 4, 27
DoubleClick, 320, 419
Dow Jones Industrial Average, 8
Dreamweaver, 341–343
drop-down menus, 282f
Dynamic HTML (DHTML), 352, 357

E
eBay, 7, 50
e-commerce
  advertising, 316
  B2B sites, 44
  B2C sites, 47–50
  content provision, 328–329
  defined, 68, 332
  direct sales, 307–312
  feedback, 329–331
  indirect sales, 313
  mashups, 200
  overview, 299–306
  paid memberships and subscriptions, 314–315
  retail sales growth, 12f
  shopping cart, 319–323
  social media functionality, 76
  store layout, 317–318
  update frequency, 414
economy, effect of Web on, 3
editing, video, 174
educational level, Internet use by, 20t, 305t
educational videos, 162
effectiveness of advertising, 370
eHarmony.com, 2
elements, brand, 227f
e-mail, 76–77
e-mail blast campaigns, 387–389, 390f–392f, 423, 432, 435–437
eMarketer, 157
emotional connections to brands, 220–222
emotional intelligence quotient (EQ), 376
encyclopedias, online, 9f, 15, 184–186, 188–189
entertainment sites, 61–67
EQ (emotional intelligence quotient), 376
ERC Dataplus, 45–47
ESPN.com, 42, 43f
ethnicity, Internet use by, 20t, 305t
eToys.com, 5, 9f
European Organization for Nuclear Research (CERN), 6f
Everquest, 8f
Excel, 349
experience, product, 302

F
Facebook, 14f, 87, 88f, 89t, 92, 94–96
false transparency, 137, 142
fame, 162
family sites, 109
Famzam.com, 54–59
FedEx, 232–234
feedback, 329–331
file size, 151, 155, 290
Firefox browser, 14f
Flash application, 340–341
Flash (.flv) file format, 38, 61, 155–156
Flash Player 6, Macromedia, 13f
Flash Player, Adobe, 150, 153, 156
flickr, 285f
flogs, 128, 137, 142
Flooz.com, 5
.flv (Flash) file format, 38, 61, 155–156
focus groups, 264, 292
folds, 287, 292
fonts, 236–238
formats, graphic, 38
Forrester Research, 48
FOX News site, 81f
Fox On Demand site, 163f
FrontPage, 343
frustration, video viewer, 172t
Full Metal Jackie, 83–84
full-screen capabilities, 156

G
gaming, online, 8f, 61–62, 63f, 64–67, 201
Geico Cavemen site, 166, 167f
gender, Internet use by, 20t, 21–22
general social networking sites, 87–97, 112
geography, and Internet use, 20t
geotargeting, 328–329, 332
GIFs, 290, 340, 356
gift cards, 421
Gillette NASCAR racing site, 197, 198f
global e-commerce usage, 299
Global Positioning System (GPS), 24, 27
goals, 365–366, 431–437
GoLive, 343
Google, 8f, 14f, 40, 41f, 164
Google AdWords, 58, 368, 369f
Google Analytics, 437–445
GPS (Global Positioning System), 24, 27
graphic design, 290
graphic designers, 9, 38–39
graphics, video, 174
Grocery Manufacturers of America, 219–220
gross revenue goals, 432
Guardian, The, 13f
Guericke, Konstantin, 100–106

H
heavy social networkers, 93
heavy viewers, 159
Hershey Company, 222–227, 407–410
Hewlett Packard, 62f
historical order record, 302
hobby sites, 98–100
Home pages, 40, 41f, 287, 318
honesty, blogger, 137
Hotmail, 7f
how-to videos, 162
HTML (HyperText Markup Language), 2, 6f, 350–351, 353–354, 357

I
iContact site, 390f–392f
image quality, 155
Inc., 23
incentives, consumer, 395, 418
income, and Internet use, 20t, 305t–306t
indirect sales, 313
information distribution, 431–432
informational sites, 59–61
in-house site development, 271–272
Initial Public Offerings.

M
M&M company, 225–226
Macromedia Flash Player 6, 13f
mail, direct, 387, 423
margins, 306, 332
market share, 88–89
marketers, Web site, 39
marketing campaigns, 271, 374–376, 398
marketing goals, 431–432
marketing strategies, 48–49, 65, 79, 192–193, 263, 292
MySpace, 13f, 83–84, 89t, 92, 96, 97f, 164
MySpace TV, 164–165
MySQL, 348
myYearbook.com, 88–97

N
Napster, 8f
narrow visibility, 82
NASDAQ, 4f, 8f
national events, 131
navigation, site, 278–282, 292, 413