US007574433B2

(12) United States Patent
     Engel

(10) Patent No.: US 7,574,433 B2
(45) Date of Patent: Aug. 11, 2009

(54) CLASSIFICATION-EXPANDED INDEXING AND RETRIEVAL OF CLASSIFIED DOCUMENTS

(75) Inventor: Alan Kent Engel, Villanova, PA (US)

(73) Assignee: Paterra, Inc., Kerrville, TX (US)

( * ) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 566 days.

(21) Appl. No.: 10/960,725

(22) Filed: Oct. 8, 2004

(65) Prior Publication Data
     US 2006/0242118 A1    Oct. 26, 2006

(51) Int. Cl.
     G06F 17/30 (2006.01)

(52) U.S. Cl.: 707/4; 707/3; 707/5; 707/104.1

(58) Field of Classification Search: 707/5, 4, 104.1; 709/201, 204; 715/209
     See application file for complete search history.

(56) References Cited

U.S. PATENT DOCUMENTS

5,765,176 A *     6/1998   Bloomberg             715/514
5,835,087 A *    11/1998   Herz et al.           715/810
6,038,560 A *     3/2000   Wical                 707/5
RE36,653 E        4/2000   Heckel et al.
6,105,022 A       8/2000   Takahashi et al.
6,598,046 B1 *    7/2003   Goldberg et al.       707/5
6,625,596 B1 *    9/2003   Nunez                 707/3
6,711,585 B1 *    3/2004   Copperman et al.      707/104.1
6,778,979 B2 *    8/2004   Grefenstette et al.   707/3
6,820,075 B2 *   11/2004   Shanahan et al.       707/3
6,928,425 B2 *    8/2005   Grefenstette et al.   707/2
7,031,961 B2 *    4/2006   Pitkow et al.         707/4
7,177,904 B1 *    2/2007   Mathur et al.         709/204
2002/0147738 A1  10/2002   R d
2003/0229626 A1  12/2003   Nigair
2004/0177015 A1   9/2004   Galai et al.

FOREIGN PATENT DOCUMENTS

JP 2003-345950 A   12/2003

* cited by examiner

Primary Examiner: Shahid A Alam
(74) Attorney, Agent, or Firm: Gianna Julian-Arnold; Miles & Stockbridge P.C.

(57) ABSTRACT

Document classification systems are valuable tools for searching and retrieving classified documents but can be prohibitively complex and cumbersome for users. A system for the indexing and retrieval of classified documents inserts keywords, titles or definitions of previously applied classifications into the document record and provides the resulting record to a search engine. Searchers are able to retrieve documents by searching on keywords from the classification system without looking up class coding.

30 Claims, 12 Drawing Sheets

Representative drawing (flowchart, process P100):
P101 Load XML record from stream
P102 Transform
P103 Add class titles from database
P104html Convert to HTML and save to document store
Last record? (151)

U.S. Patent    Aug. 11, 2009    Sheet 1 of 12    US 7,574,433 B2

FIG. 1 (diagram): document server web site 100 with web server 110, search engine 200 and client 300, joined by network connections 401, 402 and 403.

FIG. 2 Typical Document Server Web Site (diagram): document server web site 100 comprising web server 110, database server 120, network attached storage servers 131, 132 and 133, and router 140.

FIG. 4 (sample document)

United States Patent Application 20040177088
Kind Code A1
Jeffrey, Joel
September 9, 2004

Wide-spectrum information search engine

Abstract
A method and computer program product for comparing documents includes segmenting a judgment matrix into a plurality of information sub-matrices where each submatrix has a plurality of classifications and a plurality of terms relevant to each classification; evaluating a relevance of each term of the plurality of terms with respect to each classification of each information submatrix of the information submatrices; calculating an information spectrum for a first document based upon at least some of the plurality of terms; calculating an information spectrum for a second document based upon at least some of the plurality of terms; and identifying the second document as relevant to the first document based upon a comparison of the calculated information spectrums.

Inventors: Jeffrey, Joel; (Wheaton, IL)
Correspondence Name and Address: FISH & RICHARDSON PC, 3300 DAIN RAUSCHER PLAZA, MINNEAPOLIS MN 55402 US
Assignee Name and Address: H5 Technologies, Inc., a California corporation

Serial No.: 800217
Series Code: 10
Filed: March 12, 2004

U.S. Current Class: 707/102
U.S. Class at Publication: 707/102
Intern'l Class: G06F 017/30

FIG. 5 (sample document)

United States Patent Application 20040177088
Kind Code A1
Jeffrey, Joel
September 9, 2004

Wide-spectrum information search engine

Abstract
A method and computer program product for comparing documents includes segmenting a judgment matrix into a plurality of information sub-matrices where each submatrix has a plurality of classifications and a plurality of terms relevant to each classification; evaluating a relevance of each term of the plurality of terms with respect to each classification of each information submatrix of the information submatrices; calculating an information spectrum for a first document based upon at least some of the plurality of terms; calculating an information spectrum for a second document based upon at least some of the plurality of terms; and identifying the second document as relevant to the first document based upon a comparison of the calculated information spectrums.

Inventors: Jeffrey, Joel; (Wheaton, IL)
Correspondence Name and Address: FISH & RICHARDSON PC, 3300 DAIN RAUSCHER PLAZA, MINNEAPOLIS MN 55402 US
Assignee Name and Address: H5 Technologies, Inc., a California corporation

Serial No.: 800217
Series Code: 10
Filed: March 12, 2004

U.S. Current Class: 707/102
 - Class 707 DATA PROCESSING: DATABASE AND FILE MANAGEMENT OR DATA STRUCTURES
 - 100 DATABASE SCHEMA OR DATA STRUCTURE
 - 102 Generating database or data structure (e.g., via user interface)
U.S. Class at Publication: 707/102
Intern'l Class: G06F 017/30

FIG. 6 (sample document)

United States Patent Application 20040177088
Kind Code A1
Jeffrey, Joel
September 9, 2004

Wide-spectrum information search engine

Abstract
A method and computer program product for comparing documents includes segmenting a judgment matrix into a plurality of information sub-matrices where each submatrix has a plurality of classifications and a plurality of terms relevant to each classification; evaluating a relevance of each term of the plurality of terms with respect to each classification of each information submatrix of the information submatrices; calculating an information spectrum for a first document based upon at least some of the plurality of terms; calculating an information spectrum for a second document based upon at least some of the plurality of terms; and identifying the second document as relevant to the first document based upon a comparison of the calculated information spectrums.

Inventors: Jeffrey, Joel; (Wheaton, IL)
Correspondence Name and Address: FISH & RICHARDSON PC, 3300 DAIN RAUSCHER PLAZA, MINNEAPOLIS MN 55402 US
Assignee Name and Address: H5 Technologies, Inc., a California corporation

Serial No.: 800217
Series Code: 10
Filed: March 12, 2004

FIG. 7

CODE          TITLE
00028000000   APPAREL
00028001000   MISCELLANEOUS
00028455000   GUARD OR PROTECTOR
00028456000   Body cover

FIG. 8a

classid   level   code          cdisp     title
1         0       00028000000   Class 2   APPAREL
2         1       00028001000   1         MISCELLANEOUS
3         1       00028455000   455       GUARD OR PROTECTOR
4         2       00028456000   456       Body cover
5         3       00028457000   457       Hazardous material body cover

FIG. 8b

classid   ancestroid
2         1
3         1
4         1
4         3
5         1
5         3
5         4
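The two tables of FIG. 8 can be read as a class table plus an ancestor table listing every ancestor of each class. The following is only a minimal sketch of how the titles of a subclass and its ancestors might be fetched from such tables. The in-memory SQLite database, the table names `class` and `ancestor`, and the helper `title_path` are assumptions made for illustration and are not named in the patent; the `cdisp` column is omitted, and the `ancestroid` spelling follows the figure.

```python
import sqlite3

# Rows copied from FIG. 8a (classid, level, code, title) and FIG. 8b.
CLASSES = [
    (1, 0, "00028000000", "APPAREL"),
    (2, 1, "00028001000", "MISCELLANEOUS"),
    (3, 1, "00028455000", "GUARD OR PROTECTOR"),
    (4, 2, "00028456000", "Body cover"),
    (5, 3, "00028457000", "Hazardous material body cover"),
]
ANCESTORS = [(2, 1), (3, 1), (4, 1), (4, 3), (5, 1), (5, 3), (5, 4)]


def build_db():
    """Create an in-memory database holding the two tables (assumed schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE class (classid INTEGER PRIMARY KEY, level INTEGER, "
        "code TEXT, title TEXT)"
    )
    db.execute("CREATE TABLE ancestor (classid INTEGER, ancestroid INTEGER)")
    db.executemany("INSERT INTO class VALUES (?, ?, ?, ?)", CLASSES)
    db.executemany("INSERT INTO ancestor VALUES (?, ?)", ANCESTORS)
    return db


def title_path(db, code):
    """Return the titles from the top-level class down to the given code."""
    sql = """
        SELECT c.title
        FROM class AS c
        WHERE c.code = :code
           OR c.classid IN (
                SELECT a.ancestroid
                FROM ancestor AS a
                JOIN class AS s ON s.classid = a.classid
                WHERE s.code = :code)
        ORDER BY c.level
    """
    return [row[0] for row in db.execute(sql, {"code": code})]
```

Because every ancestor is stored explicitly, the whole title path comes back from a single query with no recursion, which is presumably why a table like FIG. 8b is kept alongside FIG. 8a.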

FIG. 9 (flowchart, process P100)
P101 Load XML record from stream
P102 Transform
P103 Add class titles from database
P104 Save to document store (131)
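The loop of FIG. 9 can be sketched as follows. This is a hedged illustration, not the patent's implementation: the element names `doc`, `class` and `classtitle`, the `CLASS_TITLES` dictionary standing in for the classification database, the whitespace-stripping `transform`, and the dictionary used as a document store are all assumptions introduced here.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the classification database queried at P103.
CLASS_TITLES = {
    "707/102": "Generating database or data structure (e.g., via user interface)",
}


def transform(record):
    """P102 placeholder: the real transform is not specified in this section,
    so this sketch only strips surrounding whitespace from text nodes."""
    for element in record.iter():
        if element.text:
            element.text = element.text.strip()
    return record


def add_class_titles(record, titles):
    """P103: append a <classtitle> child to each <class> whose code is known."""
    for cls in list(record.iter("class")):
        title = titles.get(cls.get("code"))
        if title is not None:
            ET.SubElement(cls, "classtitle").text = title
    return record


def build_store(xml_stream, titles):
    """Run P101 to P104 for every record; returns {document id: XML text}."""
    store = {}
    for xml_text in xml_stream:                      # P101 load record
        record = ET.fromstring(xml_text)
        record = transform(record)                   # P102 transform
        record = add_class_titles(record, titles)    # P103 add class titles
        store[record.get("id")] = ET.tostring(record, encoding="unicode")  # P104
    return store
```

The point of the sketch is the ordering: titles are written into the record before it is saved, so whatever indexes the store later sees the title words as ordinary document text.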

FIG. 10 (flowchart, detail of step P103)
P103.1 Get subclass code from m_spDOM
P103.2 Set append target to
P103.3 Load class hierarchy for subclass (spGetSubclassHierarchy)
P103.4 For each row in rowset:
P103.5 usctree element with same classid in target?
    No: P103.6 Create/insert new usctree element in classid sequence, then set append target to new element
    Yes: P103.7 Set append target to existing element
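The per-row logic of FIG. 10 (reuse an existing element with the same classid, otherwise create one in classid order, then descend into it) can be sketched as below. This is an assumption-laden illustration: the function name `insert_hierarchy`, the `(classid, title)` row shape, the `title` attribute, and the choice of the passed-in root as the initial append target are hypothetical; only the `usctree` element name and the `classid` key come from the figure.

```python
import xml.etree.ElementTree as ET


def insert_hierarchy(root, rows):
    """Nest one <usctree> element per hierarchy row under root.

    rows: (classid, title) tuples ordered from top-level class to subclass.
    """
    target = root                                    # initial append target (assumed)
    for classid, title in rows:                      # P103.4: each row in rowset
        existing = None
        for child in target.findall("usctree"):      # P103.5: same classid present?
            if child.get("classid") == str(classid):
                existing = child
                break
        if existing is not None:
            target = existing                        # P103.7: descend into existing
            continue
        # P103.6: create the element and insert it in classid sequence.
        new = ET.Element("usctree", classid=str(classid), title=title)
        position = len(target)
        for index, child in enumerate(target):
            if child.tag == "usctree" and int(child.get("classid")) > classid:
                position = index
                break
        target.insert(position, new)
        target = new                                 # descend into new element
    return root
```

Calling the function once per subclass code applied to a document merges the hierarchies, so a shared ancestor such as the class title appears only once in the record.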

FIG. 11 (flowchart, process P100)
P101 Load XML record from stream
P102 Transform
P103 Add class titles from database
P104html Convert to HTML and save to document store
P105 Last record? No: return to P101. Yes: end (151)
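Step P104html of FIG. 11 turns the expanded record into a static page that a spider can crawl. The sketch below shows one way such a page might be rendered; the function name `to_static_html`, its parameters and the page layout are assumptions for illustration, not the format used in the patent.

```python
from html import escape


def to_static_html(title, abstract, class_titles):
    """Render a record as static HTML whose body text includes the class titles.

    class_titles: list of (code, title text) pairs already looked up for the record.
    """
    items = "\n".join(
        f"    <li>{escape(code)}: {escape(text)}</li>" for code, text in class_titles
    )
    return (
        "<html>\n"
        f"<head><title>{escape(title)}</title></head>\n"
        "<body>\n"
        f"  <h1>{escape(title)}</h1>\n"
        f"  <p>{escape(abstract)}</p>\n"
        "  <ul>\n"
        f"{items}\n"
        "  </ul>\n"
        "</body>\n"
        "</html>\n"
    )
```

Because the titles are plain text in the page body, a full-text search engine would index them like any other words in the document.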

FIG. 12 (flowchart, process P100)
P101 Load XML record from stream
P102 Transform
P104 Save to document store
P105 No: return to P101. Yes: end (131)

CLASSIFICATION-EXPANDED INDEXING AND RETRIEVAL OF CLASSIFIED DOCUMENTS

TECHNICAL FIELD

This invention relates to the indexing and retrieval of documents to which classification codes and schemes have been applied and, in particular, relates to the indexing and retrieval of patent documents.

BACKGROUND

It is standard practice for intellectual property authorities to classify applications and documents by one or more classification and/or indexing schemes. For example, the United States Patent and Trademark Office (USPTO) applies the U.S. Patent Classification (USPC) system and the International Patent Classification (IPC) system to patent applications filed in its offices. Likewise, the European Patent Office applies the European Classification system (ECLA) and IPC to applications filed in its offices and the Japan Patent Office (JPO) applies the File Index system (FI) and F-Terms systems to applications filed at its office.

More broadly, information vendors and database providers frequently develop and apply various coding schemes to documents that they index and provide on their services. For example, BIOBASE, a database produced by Reed Elsevier, uses a proprietary classification coding system. ESBIOBASE [ONLINE]. [retrieved on 2004 Mar. 17]. Retrieved from: .

These classification and indexing systems are indispensable for the rapid retrieval and handling of information. They are essential tools in the efficient and effective examination of patent applications. Their application incorporates a high degree of intellectual input.

Unfortunately, most classification and indexing systems are very sophisticated and complex. Effective use requires a high level of training. For example, European Patent Office examiners receive two years of training on ECLA before they are allowed to conduct unsupervised prior art searches using the ECLA system. The U.S. Patent Classifications and the Japanese F-Term systems are similarly sophisticated.

Moreover, even within the field of patent information, skilled searching of the Trilateral Patent Offices requires that the searcher learn and search each of the national or regional classification systems separately. In other words, the searcher needs to learn ECLA to search EPO documents, the U.S. classifications to search U.S. patent documents, and the FI and F-term systems in order to search JPO documents. Even the tools and resources needed to do this are lacking. For example, there is no known English index of the JPO F-term system. In a recent symposium (FUJI, Yoshihiro "Providing Japanese patent information to non-Japanese users" Far East Meets West in Vienna: EPIDOS Users' Meeting on Japanese Patent Information, 2003 Oct. 23, Vienna, Austria (Post-presentation discussion)), a JPO patent examiner recommended the following procedure for determining the appropriate FI class for searching a particular concept: First, use the EPO website (http://v3.espacenet.com/eclasrch?CY=ep&LG=en) to determine an appropriate ECLA class. Second, assume rough equivalence between ECLA and FI and search the corresponding FI class on the JPO website (http://www4.ipdl.jpo.go.jp/Tokujitu/tjftermenb.ipdl). This is very cumbersome and subject to error.

As a result the advantages of classification and indexing systems are beyond the grasp of more casual users and information professionals.

On the other hand, the rapid recent growth of fulltext-based patent retrieval services on the Internet has led lay persons and information professionals alike to rely increasingly on keyword searching. While keyword searching has its advantages and is easy to use, variations in terminology can easily lead to missed documents. Moreover, the intellectual product embodied in the classifications applied to the documents is totally lost.

In related art, D & B Duns Market Identifiers database on DIALOG (http://library.dialog.com/bluesheets/html/bl0516.html) provides for searching SIC descriptors as a search field. TRADEMARKSCAN provides for searching international class descriptors as a search field (http://library.dialog.com/bluesheets/html/bl0669.html).

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1
Conceptual depiction of document server-search engine-client environment

FIG. 2
Typical hardware and software configuration for document server website according to this invention

FIG. 3
Public search engines in the United Kingdom

FIG. 4
Conventional classified document

FIG. 5
Classified document according to this invention

FIG. 6
Classified document according to this invention with inserted classification information in a second language

FIG. 7
Table of classification information according to this invention

FIG. 8
Table of classification information according to the Preferred Embodiment

FIG. 9
Process for producing document store according to Embodiment 4

FIG. 10
Process for inserting classification information into document

FIG. 11
Process for producing document store according to Embodiment 5

FIG. 12
Process for producing document store according to the Preferred Embodiment

DISCLOSURE OF INVENTION

Problem Invention Seeks To Solve

This invention seeks to make the advantages of classification searching available to information users without compelling them to learn the details, and particularly, the coding schemes and formats, of the various classification systems.

BRIEF SUMMARY OF THE INVENTION

This invention provides for retrieval and indexing by search engines of classified documents in which a portion of the classification coding has been supplemented with inserted terms, keywords, titles or definitions derived from the classification system's schedule and definitions.

One aspect of this invention is a system for the indexing and retrieval of classified documents, the system comprising, at least one server computer which is connected to a document store, said document store containing at least one static document derived from a document collection to which at least one classification system code has been previously applied, said document containing at least one keyword derived from the title or definition of said code, and a connection between the server computer and at least one search engine system. Further, the static document can be in HTML or XML format. Further, the terms derived from the classification system can be in a language other than the language of the document in the document store. Further, the document in the document store can be a patent document. Further, there can be a connection between the server computer and a client computer.

Another aspect of this invention is a system for the indexing and retrieval of classified documents, the system comprising, at least one server computer which is connected to a document store, said document store containing at least one static document derived from a document collection to which at least one classification system code has been previously applied, said static document containing at least one retrieval key corresponding to the title and/or definition of said code; a database system comprising at least one term derived from the title and/or definition of said classification system code; a connection between the server computer and at least one search engine system; and a means for dynamically inserting said term into the static document in response to a request from the search engine system and communicating the resulting document to the search engine system. Further, the static document can be in HTML, XML, PDF or MSWord format. Further, the terms derived from the classification system can be in a language other than the language of the document in the document store. Further, the document in the document store can be a patent document. Further, there can be a connection between the server computer and a client computer.

Another aspect of this invention is a computerized method for the indexing and retrieval of classified documents comprising the method steps of, in response to a request from a search engine system, retrieving a document from a document store, said document store containing at least one static document derived from a document collection to which at least one classification system code has been previously applied, said document containing at least one term derived from the title or definition of said code; and transmitting said document to the search engine system. Further, the static document can be in HTML, XML, PDF or MSWord format. Further, the terms derived from the classification system can be in a language other than the language of the document in the document store. Further, the document in the document store can be a patent document. Further, there can be a connection between the server computer and a client computer.

Another aspect of this invention is a computerized method for the indexing and retrieval of classified documents comprising the method steps of, in response to a request from a search engine system, retrieving a document from a document store, said document store containing at least one static document derived from a document collection to which at least one classification system code has been previously applied, said document containing at least one retrieval code corresponding to the title and/or definition of said code; retrieving from a database at least one term derived from the title and/or definition of said classification system code; dynamically inserting said term into the static document; and transmitting the resulting document to the search engine system. Further, the static document can be in HTML or XML format. Further, the terms derived from the classification system can be in a language other than the language of the document in the document store. Further, the document in the document store can be a patent document. Further, there can be a connection between the server computer and a client computer.

Another aspect of this invention is a computerized method for the retrieval of classified documents comprising the method steps of, causing a client software application in a client computer to initiate a connection to a server computer; and causing the client software application in a client computer to make at least one request to the server computer, said request causing the server computer to carry out a method comprising the following method steps: retrieving a document from a document store, said document store containing at least one static document derived from a document collection to which at least one classification system code has been previously applied, said document containing at least one retrieval code corresponding to the title and/or definition of said code; retrieving from a database at least one term derived from the title and/or definition of said classification system code; dynamically inserting said term into the static document; and transmitting the resulting document to a client computer. Further, the static document can be in HTML, XML, PDF or MSWord format. Further, the terms derived from the classification system can be in a language other than the language of the document in the document store. Further, the document in the document store can be a patent document. Further, there can be a connection between the server computer and a client computer.

DEFINITIONS

Search Engine
A server or a collection of servers dedicated to indexing internet Web pages, storing the results and returning lists of pages which match particular queries. The indexes are normally generated using spiders but may also be based on OEM content provided from a search engine that has a spider that actively crawls the Web. Some of the major search engines are Altavista, Excite, Hotbot, Infoseek, Lycos, Northern Light and Webcrawler.

'Web Spider' or 'Web Robot'
Program that searches the World Wide Web in order to identify new (or changed) pages for the purpose of adding those pages to a search service's ("search engine's") database.

Web Grabber
Program that automatically downloads Web site content for the purpose of subsequent offline viewing or processing.

Web Site
A user-accessible server site that implements the basic World Wide Web standards for the coding and transmission of hypertextual documents. These standards currently include, without limitation, HTML (the Hypertext Markup Language) and HTTP (the Hypertext Transfer Protocol). In addition, reference is made to Java script (also referred to as ), though other types of script, programming languages, and code can be used as well. It should be understood that the term "site" is not intended to imply a single geographic location, as a Web or other network site can, for example, comprise multiple geographically distributed computer systems that are appropriately linked together. Furthermore, while the following description relates to an embodiment utilizing the Internet and related protocols, other networks, such as networked interactive televisions, and other protocols may be used as well.

Document Server-Search Engine-Client Environment

FIG. 1 depicts the generalized operating environment of the current invention. This environment comprises document server web site 100, search engine 200 and client 300. These are interconnected via network connections 401, 402 and 403. This operating environment can reside within a single organization's intranet or can extend across the global Internet with web site 100, search engine 200 and client 300 physically located on separate continents.

Document Server Web Site

FIG. 2 depicts a typical hardware and software configuration for document server website 100 according to this invention. Web server 110 provides the physical housing for a web server application. Database server 120 provides the physical housing for a database containing classification system data. Network attached storage (NAS) servers 131, 132 and 133 provide a data store for documents to be served over the network by web server 110. Router 140 provides the connection to the Internet. Persons knowledgeable in the field will recognize that there can be many variations on this configuration without departing from this invention. For example, there can be a plurality of web servers 110 to provide load balancing or to serve a plurality of document collections. Also, there can be a plurality of database servers 120 to provide load balancing, failover and a plurality of classification systems. The number of NAS servers can vary widely to provide a scalable data store. Finally, all of the functions provided by the hardware depicted in FIG. 2 can be combined on a single server. At the other extreme, document server web site 100 can be a logical one with its physical components distributed distantly and connected via the Internet or other communications network. The content of the document server web site according to this invention is described in more detail below.

Search Engine

Public Search Engines

Public search engines that can be used with this invention include, without limitation, the following: , Yahoo, Ask Jeeves, AllTheWeb.com, AOL Search, HotBot, Teoma, AltaVista, Gigablast, LookSmart, Lycos, MSN Search, Excite, Inktomi, WebWombat, WebCrawler, Overture, and WiseNut. A contemporary diagram of major search engines in the United Kingdom and their relationships is shown in FIG. 3 (from http://www.alphaquad.co.uk/internet_marketing_notes/uk-search_engine_relationships.jpg).

Private Search Engines

This invention can be implemented with a private search engine that is controlled in association with the document server. A common implementation is a server computer on which a search engine software application has been installed.

Examples of server computers that can be used in this role include, without limitation, the following: Windows-installed computers such as Dell brand PowerEdge servers, HP Proliant servers, Sun Fire V20z and IBM e325 servers; LINUX-installed computers such as Dell brand PowerEdge servers; MacOS-installed servers such as Apple Xserve; and UNIX-installed servers such as Sun Netra servers.

Examples of search engine software applications that can be used in this role include, without limitation, the following:

ES.NET 2004 by Innerprise runs on Windows 2000/XP/2003 servers and is a full-text indexing Web crawler and search engine. With ES.NET, documents are crawled and indexed from an Intranet, Web Site, or the Web. Crawling and updating can be automated using the built-in scheduler. ES.NET 2004 consists of a Windows Service (actual spider), a Web Application (interface to the service), and a Search Application (for integration into an existing Web site). ES.NET 2004 supports common file types through the use of IFilters, including, without limitation, HTML, XML, Microsoft Word (.DOC), Microsoft Excel (.XLS), Adobe Acrobat (.PDF), MP3 ID3v1 & ID3v2 (.MP3), and Rich Text Format (.RTF).

Active Search Engine by Myrasoft is a server application that allows developers to create a Yahoo style search engine. It features an interactive user interface and administration tools for link management and approval, category creation, keyword based search, automatic confirmation of new , user email list management, among other features.

Search Engine Studio by Xtreeme automatically indexes a target Web site using four methods, and then creates a search engine for the Web site or an offline search for CD-ROM and DVD distribution.

Other site search engine software applications include, without limitation, Namo DeepSearch by SJ Namo Interactive, Inc., Atrise Everyfind by Atrise Software, ActiveSearch SiteSearch SDK, Albert web, Alkaline (Vestris), Amberfish, ARTS PDF Search, ASPSeek, ASTAWare SearchKey, Atomica, Atomz Search, Autonomy Search Server, BeSeen from Looksmart (aka whatUSeek intraSearch), BooleanSearch, BBDBot, BRS/Search, CGISRCH, Compass (now iPlanet Search), Convera RetrievalWare, Copernic, crawl-it, Cybotics, DarWin Set, Datagold, Datapark Search, DeepSearch, Dieselpoint Search, DioWeb, DMP Scout, DocFather, Doclinx TeraXML, DolphinSearch, dtSearch Web, EasyAsk, ebhath, Educesoft ASP Search Engine, 80-20 Discovery, Elise Matching Engine, Endeca Commerce, Catalog and Enterprise Search, Engenium Semetric, Enterprise Search (Innerprise), Eureka, eVe Image Search, Everyfind, Excalibur RetrievalWare, Extense, Extropia Site Search, F3DSearch, FAST Search Server, Findex (now Onix), Dynamics Search, FreeFind, Fulcrum Search Server (now Hummingbird), FusionBot, Glimpse, Harvest, HomepageSearchEngine, ht://Dig, Hummingbird Search Server, i411 Faceted Metadata Search, IBM Intelligent Miner for Text, ICE, ic-find, IDKSM, IMP Database Search Engine, Index Search (Xavatoria), Index Server (Microsoft), IndexMySite, Inktomi Search Software, InMagic, InQuira for Search, Intelligent Miner for Text, Intelliseek Enterprise Search, InteractiveTools Search Engine, interMedia, Intermediate Search (Fluid Dynamics), IntuiFind (Mercado), Inxight SmartDiscovery, i-phrase, iPlanet Search (formerly Compass), I-Search, Isearch, Isys:web, IXE:Ideare indexing Engine, Jobjects QuestAgent, Juggernautsearch, JXTA Search, KSearch, K2 (Verity), LexiQuest LexiGuide, link Search, Lotus Extended Search (Domino), Lucene, Lycos InSite Pro Service, Master.com (Webinator Remote), Matt's SimpleSearch, Mercado IntuiFind, MetaStar, Microsearch WebSearch, Microsoft Index Server, Microsoft SharePoint, Microsoft Site Server, MiniSearch, mnoGoSearch (formerly UdmSearch), MondoSearch, MPS Information Server, Muscat, Namazu, Nathra, Nav4, NetMind Search-It, Netrics Search (previously LikeIt), Netscape Compass (now iPlanet Search), Net.Sprint, NextPage (LivePublish), Northern Light (search service & EIP), Noviforum (was Trident), NQL, Nutch, Onix, OmSearch, OpenBridge (formerly ZNOW), OpenFTS, OpenText-LiveLink, Oracle Text, Ultra Search and interMedia, Orangevalley Intranet Search Engine, orenge (empolis), Panoptic Search, PDF WebSearch, Perl Scripts, Perlfect Search, Phantom, PicoSearch, PLWeb (PLS/AOL), QuestAgent, QueryServer Metasearch Engine, Recommind MindServer IR, re.se@rch suite, RetrievalWare, RiSearch, RuterSearch, SearchKey Plus (ASTAWare), Selena Sol's Keyword Search (now Extropia), SharePoint (Microsoft Tahoe), Sharewire SiteSearch, Sideran Seamark Faceted Metadata Search (formerly bpAllen Teapot), SimpleSearch, SiteFerret Lite and Pro, siteLevel (formerly intraSearch), SiteMiner, SiteSearch (now DocFather), SiteSearch Indexer (JavaScript), Site Server (Microsoft), SiteSurfer, S.L.I. Search, SmartDiscovery (Inxight), Spiderline, Spy-Server, Subject Search Server (SSServer), SurfMap Search, SWISH-E, SWISH++, Tahoe (Microsoft SharePoint), TEC-IMS, t.find (Eidetica), Thunderstone Webinator, Trident (now Noviforum), TYPENGO N300 Search, UdmSearch (now mnoGoSearch), Ultra Search (Oracle), Ultraseek (Verity, previously by infoseek, then Inktomi), Universal Knowledge Processor, Verity-Search97 & K2, Virage Audio & Video Search, Visual Net, WAIS and freeWAIS, WebCat, WebGlimpse, Webinator (Thunderstone), WebMerger, Webrom, WebSearch Perl Script, WebServer 4D, WebSonar, WebSTAR Search (4D), WideSource, Windex Search, WizDoc, Xapian (formerly Open Muscat, OmSearch), XML Query Engine, YourAmigo, Zebra, ZNOW (now OpenBridge), and Zoom.

Google markets the Google Search Appliance, a self-contained search engine. When applied to this invention, this appliance can be logically placed within the same domain or organization that houses the documents. Alternatively, it can be located anywhere as long as it has network access to the document server and clients have network access to it.

Client

Browser Client

Browser applications that can be used in this invention include, without limitation, the following: Browser One (published by Digital Internet), (Opera Software), Ultra Browser (UltraBrowser.com), Xeonn-Turbo (Xeonn.net), (Anderson Che), Smart Bro (Bassam Jarad), NJ Star Asian Explorer (NJ Star Software), GameNet Browser (Smartalec), (MyIE2 Team), Omnibrowser (Omnibrowser.com), SiteKiosk (PROVISIO), Wichio Browser (Revopoint), NetCaptor (Stilesoft), Mozilla (Mozilla), (Deepnet Technologies), Mozilla (Mozilla), Slim Browser (FlashPeak), SmartFox (Startplane Communications), SportsBrowser (4comtech), KidSplorer (Devicode Technology), Optimal Desktop (Optimal Access), Ace Explorer (Tronix Software), Arlington Kiosk Browser (Arlington Technology), Advanced Browser (Tronix Software), iRider (Wymea Bay), Image Browser (Image-browser.com), WindowSurfer (WindowSurfer Software), 550Access Browser (550Access), FineBrowser (SoftInform), Kopassa Browser (Kopassa), 4C Vision (euris), (Microsoft), Arlington Custom Browser (Arlington Technology), Net Viewer (Accessory Software), Play the Web (Philippe Vaugouin), Wysigot (Wysigot), ServiceHolder (LastReset), CafeTimePro (Protocall Computer), Freeware Browser (4comtech), Web Services Accelerator (Virtual Innovations), iNetAdviser (Softinform), Netscape (Netscape Communications), Surfnet (Info Touch Technologies), Eminem Browser (Interscope Records), PhaseOut (PhaseOut team), Proximat (InnovSoft Consulting), WebView (ABC Enterprise Systems), Internet Research Software (WebSoft), Muse-Lite (Muse Communications), Fast Browser (FastBrowser), ActivatorDesk (R. Lee Heath), Web Padlock (Leithauser Research), LE-Multibrowser (LE-Software Sweden), BrowseMan (Specialized Search), InnerX (InnerX), Aggressive Internet Research (Frank Harrison), Cygsoft LDAP Browser (Cygsoft), and WebSpeedReader (PerMaximumSoftware).

Web Grabber Client

Also known as "offline browsers", web grabber applications that can be used in this invention include, without limitation, the following: Aaron's Web Grabber by Surfware (http://www.surfwarelabs.com/Awebvacuumg.htm), kabestin software's Web Grabber (http://www.kabestin.com/webgrabber.html), PicaLoader (http://www.vowsoft.com/), HTTrack Website Copier (published by HTTrack), Web Shutter (published by MAB Software), Offline Explorer (MetaProducts), Offline Explorer Pro (MetaProducts), Offline Explorer Enterprise (MetaProducts), Power Siphon (Applied Kinematics), Leech (Aeria), WebZIP (Spidersoft), Web Dumper (Maxprog), WebCopier (MaximumSoft), MM3 WebAssistant (MM3Tools Muenzenberger), GetBot (GetBot), WebCloner (ProductsFoundry), SurfOffline (Bimesoft), QuadSucker/Web (SB Software), RafaBot (Spadix Software), Grab-a-Site (Blue Squirrel), Offline CHM (Direct-Soft), WebCatcher (Wizissoft), ActiveSite Compiler (INTOREL), NetGrabber (FuzzSoft), Net-Ripper (SoftByte Labs), BlackWidow (SoftByte Labs), Website Extractor (InternetSoft Corporation), SuperBot (EliteSys), PageSucker (Frederic Veynachter Software), eNotebook (GoldKingko), Baldgorilla Go-Getter (Baldgorilla Software), BackStreet Browser (Spadix Software), Offline Navigator (Asona), WebWhacker (Blue Squirrel), WebGainer (LuoSoft.com), Rip Clip (Kevlex Technologies), JOC Web Spider (JOC Software), Web Capture (E-SOFTWARE), WebSlinky (webslinky.com), HTTP Weazel (Imate Software), SBWcc Website Capture (SB Software) and Teleport Pro (Tennyson Maxwell Information Systems).

Website Extractor Client

Website extractors are client applications that mine and extract data from the web. Web extractor applications that can be used in this invention include, without limitation, Advanced Information Extractor (AIE) by Poorva, Inc., Internet Macros by iOpus, Web Grabber by Ficstar Software, Web-Site-Downloader, WebEx Service by KnowleSys, Visual Web Task by Lencom Software, Web Data Extractor by WebExtractor System, and TextPipe by Crystal Software.

Web Content Repackager

Web content repackagers are intermediate applications that receive requests from downstream client computers; retrieve web content from a server computer in response to client computer requests; modify, transform or translate the retrieved content; and transmit the resulting content to the clients. Website repackagers include, without limitation, automatic web page translators such as Google Translate and AltaVista Translate.

DETAILED DESCRIPTION OF DOCUMENT SERVER WEB SITE

Document server web site 100 provides classified documents to search engine 200 and to client 300. According to this invention, the classified documents served comprise conventional classified document content with the addition or substitution of existing classification codes by titles or definitions of the codes. FIG. 4 shows a typical classified document according to conventional practice. This document is a patent application that has been classified and been published with codes that indicate its classification. FIGS. 5 and 6 show two documents according to this invention. In FIG. 5, titles of the class codes have been added to the document. In FIG. 6, translations of titles of the class codes have been added to the document. When the classification system is hierarchical, it is preferable to add the subclass title together with the titles of its ancestors as has been done in FIGS. 5 and 6.

manner that fulltext searches for terms and phrases occurring in the classification definitions and/or titles can retrieve the document.

There are many classification systems and information coding systems that can serve in embodiments of this invention. Several are described below but this invention is not limited to these examples.

The U.S. Patent Classification System (http://www.usp
to.gov/go/classi?cation/) is used by the United States Patent According to this invention, document server Web site 100 O?ice to classify patent applications, pregrant patent publi can store a collection of static classi?ed documents 110 to cations and granted patents. One or more classi?cations are Which code titles or de?nitions 111 have been added. These assigned to each document and published in the gaZette. static documents can be prepared and stored as ?les in any one The World Intellectual Property Organization (WIPO) of several suitable ?le formats including, but not limited to, administers four classi?cation systems (http://WWW.Wipo.int/ HTML, XML, PDF and MSWord. These can be stored on classi?cations/en/): the International Patent Classi?cation magnetic disc in the Web server itself or on a separate server (IPC) system for patents, the Nice Classi?cation of goods and or NAS device. services for the purposes of the registration of marks, the According to this invention, document server Web site 100 Locamo Classi?cation for industrial designs, and the Vienna preferably produces documents 110 dynamically in response Classi?cation of the ?gurative elements of marks. to a request from a search engine spider or other Web client. 20 The European Patent O?ice maintains the European Patent Collections of Classi?ed Documents Classi?cation (ECLA) system for European patent applica tions and documents. (Searchable at http://I2.espacenet.com/ This invention processes classi?ed starting documents for indexing by search engines. Preferably, these starting docu eclasrch.) The Japanese Patent O?ice (http://WWW.jpo.go.jp) main ments are part of collections of classi?ed starting documents. 
Examples of starting document collections that can be used 25 tains the File Index (FI) classi?cation system (an analogue to ECLA) and the File-Forming Term (F-Term) search coding for this invention include, Without limitation, the folloWing system and applies these, together With the IPC classi?cations patent and trademark patent collections: Weekly Patent Bibliographic RaW Data supplied by the to patent applications and granted patents. Thomson DerWent maintains the DerWent Classi?cation, US. Patent and Trademark O?ice (http://WWW.uspto.gov/ Web/menu/patdata.html) including Grant Red Book V2.5 30 the Chemical Patents Index (CPI) Manual Codes, and the Electrical Patents Index Manual Codes (EPI manual codes) () bibliographic data, Application Red BookVl .5 (XML) system for electrical and electronic engineering patents bibliographic data, and Patent Full-Text/APS (Green Book) bibliographic data. EPO bibliographic data and abstracts sup (http://thomsonderWent.com/suppor‘t/dWpiref/reftools/clas plied by the European Patent Of?ce (http://ebd.epoline.org/ si?cation 35 The North American Industry Classi?cation System (NA ebd/) include EBD ST.32 format data and Abstracts in ST.32 format. Publications by the Japan Patent O?ice include Kokai ICS) is jointly maintained by the governments of the United States, Canada and Mexico (http://WWW.census.gov/epcd/ and Registered Patents on DVD and CD-ROM, Patent and WWW/naics.html) as is the North American Product Classi? Registered Utility Models on DVD and CD-ROM, English cation System (http://WWW.census.gov/eos/WWW/napcs/ Abstracts of Kokai on CD-ROM, Design Patents on CD 40 napcs.htm). The NAICS Was developed as a replacement for ROM, Trademarks on CD-ROM, and International Trade the US. Standard Classi?cation System (SIC) Which is none marks on CD-ROM. Publications by the German Patent theless still in use and can be used in this invention. 
O?ice including Markenblatt (Trade Mark Journal) and Pat The United Nations Statistics Division (http://unstat entblatt (Patent GaZette). Publications of other patent o?ices, s.un.org/unsd/cr/registry/) maintains a registry of Statistical including Without limitation, Boletines de Patentes and Bole tines de Marcas published by the Argentina Patent Of?ce; 45 Classi?cations that can be used in this invention. These include economic activity classi?cations such as the Intema Supplement to the Australian Of?cial Journal of Patents in tional Standard Industrial Classi?cation of All Economic PDF format, Australian Patent Abstracts, OPI Patent Speci ?cations, and Austrailian Patents published by the Australian Activity (ISIC), the Central Product Classi?cation (CPC), the Standard International Trade Classi?cation (SITC), the Clas Patent Of?ce; Patent and Utility Model GaZette ASCII Data si?cation by Broad Economic Categories (BEC), Classi?ca by the Austrian Patent Of?ce; Recueil des brevets d’invention tions of the Functions of Government (COFOG), the Classi by the Belgian Patent Of?ce; Patent Documents on CD-R published by the Canadian Intellectual Property Of?ce; Chi ?cation of Individual Consumption According to Purpose (COICOP), Classi?cation of the Purposes of Non-Pro?t Insti nese Patent Speci?cation CD-ROM, CD-ROM of Chinese tutions Serving Households (COPNI), Classi?cation of the Patent Abstracts, Patent GaZette CD-ROM, CD-ROM for Design, and China Patent Database published by the State 55 Outlays of Producers According to Purpose, (COPP) and the Intellectual Property Of?ce of the People’s Republic of Trial International Classi?cation of Activities for Time-Use China, EkasWa-A and EkasWa-B CD-ROMs published by the Statistics (ICATUS). 
EUROSTAT is custodian of the Statistical Classi?cation of Patent Facilitating Centre, India; Patent Abstracts of Russia; RUPAT and RUABEN published by the Russian Agency for Economic Activities in the European Community (NACE) (http://europa.eu.int/comm/eurostat/ramon), the Statistical Patent and Trademarks; BREF CD-rom published by INPI; Classi?cation of Products by Activity in the European Eco and the PCT Electronic GaZette and the PCT Database on CD-ROM published by the World Industrial Property O?ice. nomic Community (CPA) and the Classi?cation of Environ mental Protection Activities and Expenditure (CEPA). Classi?cation Systems AFRISTAT (http://WWW.afristat.org) is the custodian of the This invention solves this problem by making de?nitions 65 Activity Classi?cation of AFRISTAT Member States or schedules of classi?cations that have been applied to a (NAEMA), the Product Classi?cation ofAFRISTAT Member particular document accessible to a fulltext search engine in a States (NOPEMA). US 7,574,433 B2 11 12 The Australian Bureau of Statistics (http://WWW.abs. hierarchy or indent level of the class, and a CDISP column gov.au/AUSSTATS) is custodian of the Australian and NeW Which contains a string in a format commonly used in public Zealand Standard Classi?cation (ANZSIC). records for that class. (FIG. 8a shoWs the ?rst feW roWs of a The World Customs Organization (http://WWW.Wcoom schedule table for the US. Patent Classi?cation System.) It is d.org/ie/index.html) is custodian of the Harmonized Com important that the table be sortable on the classid column so as modity Description and Coding System (HS). to reproduce the ordering of the US. Patent Classi?cation The International Labor Organization is custodian of the Schedule at least for subclasses Within single classes. 
International Standard Classi?cation of Occupations (ISCO), Preferably, the database also contains a table contain the the International Classi?cation of Status in Employment direct hierarchical lineage of each class under the top level (ICSE), International Standard Industrial Classi?cation of all such as shoWn in tbIUSPCHierarchy in FIG. 8b. In this abbre Economic Activities (ISIC), International Standard Classi? viated table for the US. Patent Classi?cation System, the cation of Education (a UNESCO classi?cation) (ISCED) and classid and ancestorid column entries reference the classid classi?cations of occupational injuries. column in tbIUSPCSchedule. The World Health Organization (WWW.Who.int) is custo Preparation of Classi?cation System Database dian of the International Statistical Classi?cation of Diseases The classi?cation system database can be prepared from an and Related Health Problems (ICD-lO); the International electronic copy of the classi?cation system or by Internet Classi?cation of Impairments, Disabilities, and Handicaps doWnload When available on the Internet. Several program (ICIDH); and the International Classi?cation of Functioning, ming approaches are available to those knowledgeable in the Disability and Health (ICF). ?eld and source code is included in the embodiments beloW. The Library of Congress maintains the Library of Congress 20 Classi?cation (http://WWW.loc.gov/catdir/cpso/lcco/ Document Server Document Store lcco.html). The DeWey Decimal Classi?cation (DDC) system The document store holds the documents that are to be is oWned by OCLC (http://WWW.ocic.org/deWey/about/). provided to search engines and to Web clients. While the store Several technical associations and publishers of scholarly can comprise static documents in a ?le system, it preferablly and technical journals and periodicals maintain classi?cation 25 comprises a base collection of documents in a ?le system or systems that can be used in this invention. 
The American database that are dynamically merged With classi?cation data Economic Association maintains the Journal of Economic When fed to search engines and Web clients. Literature classi?cation system. The Institute of Acoustics A static document collection according to this invention maintains the BEPAC Acoustics Library Classi?cation sys comprises documents that contain content from the starting tem (http://WWW.ioa.org). 30 document collection together With classi?cation information The Government Printing Of?ce maintains the Superinten from the classi?cation system’s schedule and/or de?nitions. dent of Documents classi?cation system (http://WWW.access The static documents are preferably in HTML format, but . gpo . gov/ su_doc s/ fdlp/ pub s/clas sman/ index .html may also be in any format that can be processed by search Classi?cation systems maintained by online database pro engines such as pdf, hdml, xml, cfm, doc, xis, ppt, rtf, Wks, viders can be used in this invention. Examples include, With 35 lWp, Wri, or sWf. out limitation, ABI/INFORM (http://support.dialog.com/ The information from the classi?cation system’s schedule searchaids/dialog/fl 5_f635_ccodes.shtml) BIOSIS® and/or de?nitions may be Whole class or subclass titles, Whole PrevieWs Biosystematic Codes and the Organism Classi?er class or subclass de?nitions, or portions of either, for Term Conversions (http://support.dialog.com/searchaids/ example, selected keyWords extracted from titles and/or de? dialog/f5_codes/) CABICODES in CAP Abstracts (http:// 40 nitions. support.dialog.com/searchaids/dialog/ The information from the classi?cation system’s schedule f50_cabicodes_list.shtml) CAS Registry Numbers, CAL and/or de?nitions may be in the same language as the con classi?cation codes (http://support.dialog.com/searchaids/ taining document. It can also be in a second language. 
For dialog/f8_ccodes.shtml), Merged descriptors and tree struc example, an English patent record derived from a USPTO tures (http://WWW.nlm.nih.gov/mesh/ 45 starting document collection may be merged With a Japanese introduction2004.html), the ACM Computing Classi?cation translation of the applied class code titles. This provides a system (http://WWW.acm.org/class/l 998/) the Inspec(r) Clas mechanism by Which the documents can be searched in the si?cation system (http://WWW.iee.org/publish/support/in second language. spec/document/electronic.com) If the classi?cation system is hierarchical, it is preferable to 50 insert the titles and/or de?nitions of the hierarchically directly Classi?cation System Database superior classi?cations of a starting document’s classi?ca In order to automatically produce a collection of static tions into the containing document. documents or to dynamically produce merged documents as part of the Document Server Web Application, the classi?ca Preparation of Document Server Document Store While a static document store according to this invention tion information to be used for these operations is stored in a 55 database. There are several commercially available softWare can be prepared by manually merging classi?cation titles packages that can be used, including but not limited to Wat and/or de?nitions into a classi?ed document, it is preferable com SQL, Oracle, Sybase, Access, Microsoft SQL Server, to automate this process. Examples of manually prepared IBM’s DB2, AT&T’s Daytona, NCR’s TeraData and Data documents are shoWn in FIGS. 5 and 6. The automatic preparation of a static document store Cache. 60 according to this invention is preferably an extension of a At its simplest, this database comprises a table With tWo preparation of a dynamic document store according to this columns: a normalized code column and a class title column. 
invention so the preparation of a dynamic document store is The normalized code column comprises a unique code that is described ?rst. a retrieval key for locating the class title as shoWn in FIG. 7. Preferably, this table also includes the columns shoWn in 65 Document Server Web Application tbIUSPCSchedule in FIG. 8a, in other Words, an indentity According to this invention, documents from the document column ‘classid’, a level column ‘level’ Which contains the store are made available to search engines and Web clients by US 7,574,433 B2 13 14 means of a server Web application Which communicates equals 1.36 KB. Created Sep. 13, 2004, Last revision: Oct. 7, documents from the document store to a search engine client 2204. The folloWing ?les are included: or Web client in response to a request by the client. This 20040177015.htm.txtiHTML ?le prepared according to communication is preferably performed according to the Embodiment 1, 5 KB, Created Sep. 13, 2004, Last revi HTTP protocol, but can also be according to other protocols, sion: Sep. 13, 2204 including Without limitation, File Transfer Protocol (FTP), 20040167928.htm.txtiHTML ?le prepared according to Simple Mail Transport Protocol (SMTP) and Network NeWs Embodiment 2, 2.97 KB, Created Sep. 13, 2004, Last Transfer Protocol (NNTP). revision: Sep. 13, 2204 Server Web applications that can be used for this invention cxptohtml.xsl.txtiXSL stylesheet according to Preferred include, Without limitation, Apache speci?c servers such as Embodiment, 3.46 KB, Created Sep. 23, 2004, Last AbaSioux, Apache, Apache-(PZ)-1.3.31, Apache-1.3.27, revision: Sep. 23, 2204 Apache-ADTI, Apache-AdvancedExtranetServer, Apache FolderBroWse.aspx.cs.txtiC# ?le for Web site application Coyote, Apache-NeoNova, Apache-NeoWebScript, Apache according to Preferred Embodiment, 3.06 KB, Created SSL, Apache1.3.29, DataClub-Apache, Fjapache, GonZolix Sep. 27, 2004, Last revision: Oct. 
7, 2204 Apache, HP-UX_Apache-based_Web_Server, Rapidsite, FolderBroWse.aspx.resxtxtiResource ?le for Web site Red, Server_Apache, Stronghold, and SudApache_Mi application according to Preferred Embodiment, 1.69 crosoft NT speci?c servers such as Commerce-Builder, KB, Created Sep. 27, 2004, Last revision: Sep. 27, 2204 Microsoft-IIS, Microsoft-Intemet-Information-Server, Pur FolderBroWse.aspx.txtiSource ?le for Web site applica veyor, WebSite, and WebSitePro; Roxen speci?c servers such tion according to Preferred Embodiment, 663 bytes, as Roxen, Roxen Challenger, Roxen Web server, and Spinner; 20 Created Sep. 27, 2004, Last revision: Sep. 27, 2204 and Macintosh speci?c servers such as 4D_WebSTAR_S, Global.asax.txt4Global source ?le for Web site applica 4D_WebStar_D, AppleLISA, AppleShareIP, AppleWSE, tion according to Preferred Embodiment, 1.57 KB, Cre CL-HTTP, HomeDoor, Interaction, MACOS_Personal_ ated Sep. 27, 2004, Last revision: Sep. 27, 2204 Websharing, MacHTTP, NetPresenZ, QuidProQuo, Web PCDoWnloadCode.txt4Computer source code for doWn STAR, WebSTAR4, WebStar, WebStarV, and Web_Serveri 25 loading USPTO class schedule for Embodiment 4, 24.2 4D. KB, Created Sep. 13, 2004, Last revision: Sep. 17, 2204 While this invention can be practiced by serving static ShoWAbstract.aspx.cs.txt4C# ?le for Web site application HTML documents from a simple Web application, it is pref according to Preferred Embodiment, 5.02 KB, Created erably practiced With a Web application that is capable of Sep. 27, 2004, Last revision: Sep. 27, 2204 serving dynamic documents. Dynamic documents (or “server 30 ShoWAbstract.aspx.resx.txtiResource ?le for Web site pages”) comprise dynamic content. Dynamic content is, for application according to Preferred Embodiment, 1.69 example, in the case of the World Wide Web, Web page KB, Created Sep. 27, 2004, Last revision: Sep. 
27, 2204 content that includes the usual static content such as display ShoWAbstract.aspx.txtiSource ?le for Web site applica text and markup tags, and, in addition, executable program tion according to Preferred Embodiment, 114 bytes, content. Executable program content includes, for example, 35 Created Sep. 27, 2004, Last revision: Sep. 27, 2204 Java, VBScript, CGI gateWay scripting, PHP script, and Perl StepP102.txtiXML stylesheet for Step P102 according to code. The kind of executable program content found in any particular dynamic server page depends on the kind of Preferred Embodiment, 17.8 KB, Created Sep. 22, 2004, Last revision: Sep. 23, 2204 dynamic server page engine that is intended to render the StepP103.txt4C++ source code for Step P103 according executable program content. For example, Java is typically 40 used in Java Server Pages (“JSPs”) for Java Server Page to Preferred Embodiment, 14.8 KB, Created Sep. 22, engines (sometime referred to in this disclosure as “JSP 2004, Last revision: Sep. 22, 2204 engines”); VBScript is used in Active Server Pages (“ASPs”) StepP103sql.txtiSQL source code for Step P103 accord for Microsoft Active Server Page engines (sometime referred ing to Preferred Embodiment, 638 bytes, Created Sep. to in this disclosure as “ASP engines”); Visual Basic and C# 45 22, 2004, Last revision: Sep. 22, 2204 are used in Microsoft ASP.NET server Web applications, and StepP104html.txtiC++ source code for Step P104 PHP script, a language based on C, C++, Perl, and Java, is according to Embodiment 5, 1.79 KB, Created Sep. 23, used in PHP pages for PHP: Hypertext Preprocessor engines. 2004, Last revision: Sep. 23, 2204 USPCScheduleAndHierarchyTables.scil.txtiSQL script Documents Produced by Server Web Application 50 for creating tables for Embodiment 4, 1.03 KB, Created The documents produced by the server Web application Sep. 13, 2004, Last revision: Sep. 
13, 2204 and transmitted to search engines and/ or clients can be in any This embodiment discloses the merging of a classi?ed of several ?le formats that can be transmitted over a netWork patent record and subclass titles into a dynamic XML docu and read by search engines and Web clients. These formats ment that is inserted into a Website that is made accessible to include, Without limitation, HTML, XML, MSWord, MSEx 55 a Web spider or craWler so that it can be indexed by a Web cel, RTF and PDF. search engine. Those skilled in the art Will appreciate that many modi? The hardWare environment is a Dell PoWerEdge 1650 cations can be made to the above system and methods Without server equipped With tWo Model 80530 Intel 1.4 GHZ pro departing from the scope of the present invention. cessors, 1 GB of physical memory and 136 GB of hard disk in 60 a RAID 10 con?guration. The operating systems is Microsoft PREFERRED EMBODIMENT WindoWs 2000 Server containing Microsoft Internet Infor mation Services (IIS) Version 5. A Website is created accord Appendix 1 presents source code and other documentation, ing to IIS documentation and con?gured to alloW anonymous on CD-ROM, that has been Written in the course of develop access. The server is connected through a LAN netWork to a ment of a prototype developed according to the embodiments. 65 CISCO 2621XM router Which is connected to the Internet. In The ?le systems of the CD-ROM is CDFS. The operating addition, Microsoft SQL Server Version 7.0 is installed and system is XP Professional. Contents.txt, the Microsoft .NET Framework is installed on the Website.
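By way of illustration only, the merging of a classified record with class titles drawn from a schedule table and a hierarchy table, as described above for the classification system database, can be sketched as follows. This is a minimal sketch in Python and is not the source code of Appendix 1, which is written in C#, C++, SQL and XSL. The table names tblUSPCSchedule and tblUSPCHierarchy and the columns classid, level, CDISP and ancestorid follow the description; the column names code and title, the XML element names, the function names, the use of SQLite in place of Microsoft SQL Server, and the sample rows are all assumptions made for the example. The sample titles are placeholders and are not copied from the official schedule.

```python
import sqlite3
import xml.etree.ElementTree as ET


def build_db():
    """Create an in-memory schedule table and hierarchy table (illustrative rows)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tblUSPCSchedule (
            classid INTEGER PRIMARY KEY,   -- identity column; sort key for schedule order
            code    TEXT UNIQUE NOT NULL,  -- normalized code (hypothetical format)
            level   INTEGER NOT NULL,      -- hierarchy or indent level
            title   TEXT NOT NULL,         -- class title
            CDISP   TEXT NOT NULL          -- display string used in public records
        );
        CREATE TABLE tblUSPCHierarchy (
            classid    INTEGER NOT NULL REFERENCES tblUSPCSchedule(classid),
            ancestorid INTEGER NOT NULL REFERENCES tblUSPCSchedule(classid)
        );
    """)
    conn.executemany(
        "INSERT INTO tblUSPCSchedule VALUES (?, ?, ?, ?, ?)",
        [
            (1, "707000000", 0, "Data processing: database and file management", "707"),
            (2, "707001000", 1, "Database or file accessing", "707/1"),
            (3, "707003000", 2, "Query processing", "707/3"),
        ],
    )
    # Each row links a class to one of its direct-lineage ancestors.
    conn.executemany(
        "INSERT INTO tblUSPCHierarchy VALUES (?, ?)",
        [(2, 1), (3, 1), (3, 2)],
    )
    return conn


def titles_with_ancestors(conn, code):
    """Return ancestor titles (top level first) followed by the class's own title."""
    own = conn.execute(
        "SELECT classid, title FROM tblUSPCSchedule WHERE code = ?", (code,)
    ).fetchone()
    if own is None:
        return []  # unknown code: nothing to merge
    ancestors = conn.execute(
        """
        SELECT a.title
        FROM tblUSPCHierarchy AS h
        JOIN tblUSPCSchedule AS a ON a.classid = h.ancestorid
        WHERE h.classid = ?
        ORDER BY a.classid
        """,
        (own[0],),
    ).fetchall()
    return [row[0] for row in ancestors] + [own[1]]


def merge_record(conn, record):
    """Build an XML document holding the record plus the titles of its class codes."""
    doc = ET.Element("document")
    ET.SubElement(doc, "title").text = record["title"]
    ET.SubElement(doc, "abstract").text = record["abstract"]
    for code in record["class_codes"]:
        cls = ET.SubElement(doc, "classification", code=code)
        for title in titles_with_ancestors(conn, code):
            ET.SubElement(cls, "classtitle").text = title
    return ET.tostring(doc, encoding="unicode")


if __name__ == "__main__":
    conn = build_db()
    sample = {
        "title": "Example classified record",      # hypothetical record
        "abstract": "Example abstract text.",
        "class_codes": ["707003000"],
    }
    print(merge_record(conn, sample))
```

Because the merged document carries the title of each applied class together with the titles of its ancestors, a fulltext search on a word from any of those titles can retrieve the record without the searcher looking up the class code. Ordering ancestors by classid relies on the table being sortable on that column in schedule order, as noted in the description of the schedule table.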