(12) United States Patent (10) Patent No.: US 7,574.433 B2 Engel (45) Date of Patent: Aug
Total Page:16
File Type:pdf, Size:1020Kb
US007574.433B2 (12) United States Patent (10) Patent No.: US 7,574.433 B2 Engel (45) Date of Patent: Aug. 11, 2009 (54) CLASSIFICATION-EXPANDED INDEXING 6,598,046 B1* 7/2003 Goldberg et al. ............... 707/5 AND RETRIEVAL OF CLASSIFIED 6,625,596 B1 9/2003 Nunez ........................... 707/3 DOCUMENTS 6,711,585 B1* 3/2004 Copperman et al. ...... TO7 104.1 6,778,979 B2 * 8/2004 Grefenstette et al. ........... 707/3 (75)75 Inventor: Alan Kent Engel, Villanova, PA (US) 6,928,4256,820,075 B2 * 1 8/20051/2004 GrefenstetteShanahan et al.et al............... ........... 707/2707/3 7,031,961 B2 * 4/2006 Pitkow et al. .................. 7O7/4 (73) Assignee: Paterra, Inc., Kerrville, TX (US) 7, 177,904 B1* 2/2007 Mathur et al. ............... TO9.204 2002/0147738 A1 10, 2002 Read (*) Notice: Subject to any disclaimer, the term of this 2003,0229626 A1 12/2003 s patent is extended or adjusted under 35 2004/01770.15 A1 9, 2004 Galai et al. U.S.C. 154(b) by 566 days. FOREIGN PATENT DOCUMENTS (21) Appl. No.: 10/960,725 JP 2003-345950 A 12/2003 (22) Filed: Oct. 8, 2004 * cited by examiner (65) Prior Publication Data Primary Examiner Shahid A. Alam (74) Attorney, Agent, or Firm Gianna Julian-Arnold; Miles US 2006/0242118A1 Oct. 26, 2006 & Stockbridge P.C. (51) Int. Cl. 57 ABSTRACT G06F 7/30 (2006.01) (57) (52) U.S. Cl. ................. 707/4; 707/3; 707/5; 707/104.1 Document classification systems are valuable tools for (58) Field of Classification Search ..................... 707/3, Searching and retrieving classified documents but can be pro 707/5, 4, 104.1: 709/201, 204; 715/209 hibitively complex and cumbersome for users. See application file for complete search history. (56) References Cited A system for the indexing and retrieval of classified docu ments inserts keywords, titles or definitions of previously U.S. PATENT DOCUMENTS applied classifications into the document record and provides the resulting record to a search engine. Searchers are able to 5,765,176 A * 6/1998 Bloomberg ................. 715,514 retrieve documents by searching on keywords from the clas 5,835,087 A * 1 1/1998 Herz et al. .................. T15,810 6,038,560 A 3/2000 Wical ............................ 707/5 sification system without looking up class coding. RE36,653 E 4/2000 Heckel et al. 6,105,022 A 8/2000 Takahashi et al. 30 Claims, 12 Drawing Sheets P100 LoadXML record from stream P10 P102 Add class titles P103 from database Convert to HTM P104.html and Save to document store U.S. Patent Aug. 11, 2009 Sheet 1 of 12 US 7,574.433 B2 200 402 100 405 FIG. 1 U.S. Patent Aug. 11, 2009 Sheet 2 of 12 US 7,574.433 B2 100 P O. O. O. c isD Ron OeP 120 O O O D D 140 131 133 FIG 2 Typical Document Server Web Site U.S. Patent US 7,574.433 B2 9deOSION O SlapdSÁÐAQOVQ Gae)£"OIH U.S. Patent Aug. 11, 2009 Sheet 4 of 12 US 7,574.433 B2 Uhited States Patent Application 20040177088 Kind Oode A1 Jeffrey, Joel September 9, 2004 Wode-spectrum information search engine Abstract A method and computer program product for comparing documents includes segmenting a judgment matrix into a plurality of information sub-matrices where each submatrix has a plurality of classifications and a plurat it y of terms relevant to each classification; evaluating a relevance of each term of the plurality of terms with respect to each classification of each information submatrix of the information submatrices; calculating an information spectrum for a first document based upon at least some of the plurality of terms; calculating an information spectrum for a second document based upon at least Some of the plurality of terms; and identifying the second document as relevant to the first document based upon a comparison of the calculated information spectrums. Inventors: Jeffrey, Joel; (Weat on, IL) Correspondence Name and Address; FSH & ROHARDSON P. C. 3300 DAN RAUSOHER PLAZA MNNEAPOS MN 55402 US Assignee Name and Adress: H5 Technologies, Inc., a California Corporation Serial No.: 800217 Series Code: 10 Filed; March 12, 2004 U. S. Ourrent O ass: 707 102 U. S. Oass at Publication: 707/102 Intern I O aSS: GO6F 017/30 FIG. 4 U.S. Patent Aug. 11, 2009 Sheet 5 of 12 US 7,574.433 B2 Uhi ted States Patent Application 20040177088 Kind Obde A Jeffrey, Joel September 9, 2004 Wode-spectrum information search engine Abstract A method and computer program product for comparing documents includes segmenting a judgment matrix into a plurality of information sub-matrices where each submatrix has a plurality of classifications and a plurality of terms relevant to each classification; evaluating a relevance of each term of the plurality of terms with respect to each classification of each information submatrix of the information submatrices, calculating an information spectrum for a first document based upon at east some of the plurality of terms; calculating an information spectrum for a second docurrent based upon at least some of the plurality of terms; and identifying the second document as relevant to the first document based upon a comparison of the calculated information spect runs. inventors: Jeffrey, Joel; (Wheaton, L) Correspondence Name and Address: FSH & RCHARDSON P. C. 3300 DAN RASOHER PLAZA MNNEAPOS Assignee Nate and Adress: H5 Technologies, Inc., a California corporation Serial Nb, : 800217 Series Obde: 10 Filed: March 12, 2004 U. S. Current Class: 7071102 - Oass 707 DATA PROCESSNG DATABASE AND FLE MANAGEMENT OR DATA STRUCTURES - 100 DAABASE SCHEMA OR DATASTRUCTURE - 102 Generating database or data structure (e.g., via user interface) U. S. Oass at Publication: 707 102 Intern' O ass: GO6F 017130 FIG. 5 U.S. Patent Aug. 11, 2009 Sheet 6 of 12 US 7,574.433 B2 United States Patent Application 20040177088 Krd Obde A Jeffrey, Joel September 9, 2004 Wole-spectrum information search engine Abstract A method and computer program product for comparing documents includes segmenting a judgment matrix into a plurality of information sub-matrices where each submatrix has a plurality of classifications and a plurality of terms relevant to each classification; evaluating a relevance of each term of the plurality of terms with respect to each classification of each information submatrix of the information submatrices; calculating an information spectrum for a first document based upon at east some of the plurality of terms; calculating an information spectrum for a second document based upon at least some of the plurality of terms; and identifying the second document as relevant to the first document based upon a comparison of the calculated information spect runs. inventors: Jeffrey, Joel; (Weat on, ) Obrrespondence Nante and Address: FSH & R OHARDSON P. C. 33OO DAN RASOHER PLAZA MNNEAPOS MN 554.02 US Assignee Name and Adress: H5 Technologies, Inc., a California Corporation Serial Nb, : 800217 Series Code: 10 Fied: March 12, 2004 U. S. Current O ass: 7071102 - 27 707 is : 7-2-M-777 MJSid:7-5-liik - 100 -3-M-2. Zs-it-s-s - 102 7-5-M-Art:7-2-likofER (Big l-t-. 1 st 7t-2 (s) ) U. S. Oass at Rublication: 707l 102 Intern' OaSS: CD6F 0730 FIG. 6 U.S. Patent Aug. 11, 2009 Sheet 7 of 12 US 7,574.433 B2 FIG. 7 U.S. Patent Aug. 11, 2009 Sheet 8 of 12 US 7,574.433 B2 title O COO2SOOOOOO Class 2 APPAREL MSCELLANEOUS GUARD OR PROTECTOR Body COver 5 3 COO2S457000 457 Hazardous material body cover FIG. 8a. Classid ancestroid U.S. Patent Aug. 11, 2009 Sheet 9 of 12 US 7,574.433 B2 P100 LOadXML record from Stream P10 Add class titles from database P103 P104 Save to document Store FIG. 9 U.S. Patent Aug. 11, 2009 Sheet 10 of 12 US 7,574.433 B2 Get subclass Code from m spDOM P103.1 Set append target to <USCS/> P103.2 Load class hierarchy for Subclass (spGetSubclassHierarchy) For each row in roWSet UCStree element with Create/insert new same classid in target? USCtree element in classid sequence P103.8 P103.7 Set append target Set append target to existing element to new element FIG 10 U.S. Patent Aug. 11, 2009 Sheet 11 of 12 US 7,574.433 B2 P100 LoadXML record from Stream P101 Add class titles from database P103 Convert to HM P104.html and Save to document Store U.S. Patent Aug. 11, 2009 Sheet 12 of 12 US 7,574.433 B2 P100 loadXML record from Stream P101 P104 Save to document Store US 7,574.433 B2 1. 2 CLASSIFICATION-EXPANDED INDEXING As a result the advantages of classification and indexing AND RETRIEVAL OF CLASSIFIED systems are beyond the grasp of more casual users and infor DOCUMENTS mation professionals. On the other hand, the rapid recent growth of fulltext-base TECHNICAL FIELD 5 patent retrieval services on the Internet has led lay persons and information professionals alike to rely increasingly on keyword searching. While keyword searching has its advan This invention relates to the indexing and retrieval of docu tages and is easy to use, variations in terminology can easily ments to which classification codes and Schemes have been lead to missed documents. Moreover, the intellectual product applied and, in particular, relates to the indexing and retrieval 10 embodied in the classifications applied to the documents is of patent documents. totally lost. In related art, D & B Duns Market Identifiers database on BACKGROUND DIALOG (http://library.dialog.com/bluesheets/html/ bI0516.html) provides for searching SIC descriptors as a It is standard practice for intellectual property authorities 15 search field.