Harvard Journal of Law & Technology, Volume 12, Number 1, Fall 1998

PAGE LAYOUT AND NUMBERING DO NOT CORRESPOND TO ORIGINAL

INTERNET INFOGLUT AND INVISIBLE INK: SPAMDEXING SEARCH ENGINES WITH META TAGS

Ira S. Nathenson*

Unseen treasure and hidden wisdom, of what use is either?
-- 14th century librarian[1]

The more we know, the less we know.
-- 20th century author[2]

TABLE OF CONTENTS

I. INTRODUCTION
II. WHERE DO YOU WANT TO GO TODAY?: INTERNET INFOGLUT AND SEARCH ENGINE ABUSE
III. DEFINING THE INTERESTS
    A. Relevance and Noise
    B. Reasonable Expectations
        1. The Public
        2. Webmasters
        3. Proprietary-Rights Holders
    C. Domain Names, Linking, and Framing Compared
    D. Externalization Costs
IV. STRIKING A BALANCE
    A. Deceptiveness: "Protecting" the Public
        1. Section 5 of the Federal Trade Commission Act
        2. How Much Deception Is Sufficient?
        3. FTC Action Is Needed
    B. Unfair Competition: Balancing Competitive Interests
        1. Rationale of Unfair Competition Law
        2. Trademark Infringement: Conceptually Inapposite
        3. Trademark Dilution: Tailor-Made
        4. Protection for Less-Than-Famous Marks?
        5. Defenses to Infringement and Dilution
V. WHERE DO YOU WANT TO GO TOMORROW?: TAMING THE INTERNET INFOGLUT
VI. CONCLUSION

I. INTRODUCTION

Imagine a never-ending traffic jam on a ten-lane highway. Road signs can't be trusted: the sign for Exit 7 leads to Exit 12, the sign for Cleveland leads to Erie. If you ask the guy at the Kwik-E-Mart how to get to I-79, he gives you directions to Route 30. To top it off, when you ask for a Coke, he gives you a Pepsi. Enough already. You stop at a pay phone to call directory assistance for the number to the local auto club, and instead get connected to "Dial-a-porn."

Road trip from hell? Not exactly. Welcome to the Internet, the "information super-highway" that lacks reliable road signs. Suppose, for example, you use an Internet search engine to look for Web pages on the late Princess Diana. Instead, you may find get-rich schemes and pornography.[3] Or you search for an attorney -- by name -- and instead get Internet service providers ("ISPs") and software companies.[4] If you search for pages on "Monica Lewinsky," you might be shocked to find that the top listing from one search engine is "CityAuction," an Internet classified advertising site.[5]

Microsoft's advertising, waxing poetic, asks, "Where do you want to go today?"[6] Instead, maybe it should ask, "How in the hell do you get there?"[7] After all, even those who know where they want to go may need help finding their way. Such help exists in the form of search engines, but some Internet content providers abuse these tools to the frustration of many. This Article is about those who search, those who want to be found, and those who would stand in between.

The primary means of finding information on the Internet is via search engines.[8] In compiling their databases, search engines rely on those who publish Web pages -- webmasters -- to supply indexing information.[9] Because an increase in visits, or "hits," means more exposure or revenue, webmasters are strongly motivated to do anything to increase their chances of getting hits.[10] Many therefore "spam" search engines in hopes of appearing in as many searches, and as highly ranked in those searches, as possible.[11]

Some webmasters use indexing terms that are completely irrelevant to actual Web page content in order to gain hits from searches on those topics.[12] Others "stuff" multiple instances of a term, relevant or not, in hopes of appearing more relevant to the search engine, and therefore higher in its output.[13] Yet others use the trademarks and trade names of third parties, hoping to take advantage of the goodwill of those asserting proprietary rights[14] in an indexing term.[15] USA Today calls these practices "spamdexing."[16]

* J.D. 1998, University of Pittsburgh School of Law; B.A. 1985, University of Pittsburgh College of Arts and Sciences. I would like to thank Professor Julie E. Cohen for her encouragement and suggestions, and Jeffrey R. Waxman for other helpful comments. None of this would have been possible without the love, support, and patience of my wife, Helen, and my children, Katie and Jenny. I may be reached at <[email protected]>.

1. Roger K. Summit, Knowledge Online: Current Implications and Future Trends, in INFORMATION SCIENCE: STILL AN EMERGING DISCIPLINE 101, 102 (James G. Williams & Toni Carbo eds., 1997) [hereinafter INFORMATION SCIENCE].

2. David Shenk, Data Smog: Surviving the Info Glut, TECH. REV., May 15, 1997, at 18.

3. See Elizabeth Weise, Some Web Pages Take Search Engines for a Ride, USA TODAY, Sept. 29, 1997, at 4D.

4. See James Evans, Fraud: Hidden Tags Leading to Court Battles, INTERNET WORLD, Dec. 1997, at 17.

5. A search for "Monica Lewinsky" using the HotBot search engine on Feb. 27, 1998 resulted in 1,126 matches (on file with Harvard Journal of Law & Technology). Of these, CityAuction was the very first site listed. See CityAuction, Monica Lewinsky, Please Use CityAuction Classifieds (visited Feb. 27, 1998) <http://www.cityauction.com/lewinsky.asp>. This page is not the main CityAuction page, but rather a "bridge" page: one designed to draw in the public and then direct it to another place. See infra note 111 and accompanying text. In light of the discussion of relevance in Part III, it is ironic to note the attempt made by this page to appear relevant to Ms. Lewinsky by stating an "offer": "Monica Lewinsky, we hereby invite you to use CityAuction." CityAuction, supra. A look at the code for this page indicates that "Monica Lewinsky" is also used as a meta tag and title tag. See id.; see also infra notes 79-83 and accompanying text (discussing various ways Web pages are indexed by search engines).

6. Michael Newman, Sticky Wickets: Yahoo! and Its Rivals Now Want You to Stay Around and Not Go Searching, PITT. POST-GAZETTE, Apr. 19, 1998, at E1.

7. See Connected: Click Here to Go to Hell, LONDON DAILY TELEGRAPH, June 24, 1997, at 2.

    The background music for Microsoft's current television campaign is the Confutatis from Mozart's Requiem. The words of the final blast of music that accompanies the slogan "Where do you want to go today?" are "confutatis maledictis, flammis acribus addictis" which means "the damned and accused are convicted to flames of hell." All of which is grist to the mill of those who believe that Microsoft boss Bill Gates is the devil in disguise.

Id.

8. See infra notes 68-83 and accompanying text.

9. Strictly speaking, a webmaster is one who designs or maintains a Web page. In the spamdexing context, see infra note 16 and accompanying text, the implementation of irrelevant or deceptive indexing terms may be done by an actual webmaster, or by a webmaster acting as an agent on behalf of his principal. For simplicity, "webmaster" will refer to any party who provides content on the World Wide Web, either personally or via a webmaster-agent.

10. "[A] high ranking can make all the difference. Consider that a search for the words 'tennis racket' on AltaVista yields more than 850 pages. The user is much more likely to glance through the first 20 sites listed than the last 20." Net Interest: Web-Search-2: It's Up to You to Stand Out, DOW JONES NEWS SERV., Oct. 9, 1997.

11. See DON SELLERS, GETTING HITS: THE DEFINITIVE GUIDE TO PROMOTING YOUR WEB SITE 21-22 (1997); Weise, supra note 3; see also Internet and Unix Dictionary (last modified Nov. 17, 1998) <http://www.msg.net/kadow/answers/s.html#spam> (defining "spam" as "bulk, mass, or repeated posting or mailing of substantially identical messages. The emphasis is on the multiple sending, either many copies to one destination, or one copy to many destinations. This is a reference to the famous Monty Python Spam sketch.").

12. See infra note 97 (noting site that used buried text referring to "Seinfeld").

13. See SELLERS, supra note 11, at 32-33; Weise, supra note 3 ("A scuba-diving site might hide the word 'scuba' a hundred times, so search engines would rank it above others."). A view of the source code for an Internet site dedicated to the World Wrestling Federation indicates that the term "WWF" is repeated five times as a meta tag. See WWF No Holds Barred (visited Jan. 31, 1998) <http://www.wwf2000.com> (source code on file with Harvard Journal of Law & Technology).

14. The term "property" is somewhat loaded, so I will avoid its use in this Article. After all, merely calling something property ascribes to it the conclusion to the question presented: should the right of exclusion apply? See Felix S. Cohen, Transcendental Nonsense and the Functional Approach, 35 COLUM. L. REV. 809, 815 (1935) [hereinafter Cohen, Transcendental Nonsense]; see also Zechariah Chafee, Jr., The Disorderly Conduct of Words, 41 COLUM. L. REV. 381, 390-91 (1941). Therefore, references to "proprietary" or "trademark" rights provide a somewhat less volatile label. The proprietary rights to which I refer are more in the nature of quasi-property rights between competitors than rights between the holder and the public.
