US007574433B2

(12) United States Patent
     Engel
(10) Patent No.: US 7,574,433 B2
(45) Date of Patent: Aug. 11, 2009

(54) CLASSIFICATION-EXPANDED INDEXING AND RETRIEVAL OF CLASSIFIED DOCUMENTS

(75) Inventor: Alan Kent Engel, Villanova, PA (US)

(73) Assignee: Paterra, Inc., Kerrville, TX (US)

( * ) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 566 days.

(21) Appl. No.: 10/960,725

(22) Filed: Oct. 8, 2004

(65) Prior Publication Data
     US 2006/0242118 A1    Oct. 26, 2006

(51) Int. Cl.
     G06F 17/30 (2006.01)

(52) U.S. Cl.: 707/4; 707/3; 707/5; 707/104.1

(58) Field of Classification Search: 707/5, 4, 104.1; 709/201, 204; 715/209
     See application file for complete search history.

(56) References Cited

U.S. PATENT DOCUMENTS

5,765,176 A     *   6/1998   Bloomberg             715/514
5,835,087 A     *  11/1998   Herz et al.           715/810
6,038,560 A     *   3/2000   Wical                 707/5
RE36,653 E          4/2000   Heckel et al.
6,105,022 A         8/2000   Takahashi et al.
6,598,046 B1    *   7/2003   Goldberg et al.       707/5
6,625,596 B1    *   9/2003   Nunez                 707/3
6,711,585 B1    *   3/2004   Copperman et al.      707/104.1
6,778,979 B2    *   8/2004   Grefenstette et al.   707/3
6,820,075 B2    *  11/2004   Shanahan et al.       707/3
6,928,425 B2    *   8/2005   Grefenstette et al.   707/2
7,031,961 B2    *   4/2006   Pitkow et al.         707/4
7,177,904 B1    *   2/2007   Mathur et al.         709/204
2002/0147738 A1    10/2002   R d
2003/0229626 A1    12/2003   Nigair
2004/0177015 A1     9/2004   Galai et al.

FOREIGN PATENT DOCUMENTS

JP 2003-345950 A   12/2003

* cited by examiner

Primary Examiner: Shahid A Alam
(74) Attorney, Agent, or Firm: Gianna Julian-Arnold; Miles & Stockbridge P.C.

(57) ABSTRACT

Document classification systems are valuable tools for searching and retrieving classified documents but can be prohibitively complex and cumbersome for users. A system for the indexing and retrieval of classified documents inserts keywords, titles or definitions of previously applied classifications into the document record and provides the resulting record to a search engine. Searchers are able to retrieve documents by searching on keywords from the classification system without looking up class coding.

30 Claims, 12 Drawing Sheets

[Front-page drawing, flowchart P100: Load XML record from stream; Transform; Add class titles from database; Convert to HTML and save to document store (151); Last record?]

U.S. Patent    Aug. 11, 2009    Sheets 1 to 8 of 12    US 7,574,433 B2

[Sheet 1 of 12, FIG. 1: drawing not reproduced; reference numerals 100, 110, 200, 300, 401, 402, 403.]

[Sheet 2 of 12, FIG. 2: drawing not reproduced; reference numerals 100, 110, 120, 132, 133, 140, 151.]

[Sheet 3 of 12: drawing labeled "Typical Document Server Web Site"; not reproduced.]

[Sheet 4 of 12, FIG. 4: sample document record, transcribed below.]

United States Patent Application    20040177088
Kind Code                           A1
Jeffrey, Joel                       September 9, 2004

Wide-spectrum information search engine

Abstract

A method and computer program product for comparing documents includes segmenting a judgment matrix into a plurality of information sub-matrices where each submatrix has a plurality of classifications and a plurality of terms relevant to each classification; evaluating a relevance of each term of the plurality of terms with respect to each classification of each information submatrix of the information submatrices; calculating an information spectrum for a first document based upon at least some of the plurality of terms; calculating an information spectrum for a second document based upon at least some of the plurality of terms; and identifying the second document as relevant to the first document based upon a comparison of the calculated information spectrums.

Inventors: Jeffrey, Joel; (Wheaton, IL)
Correspondence Name and Address: FISH & RICHARDSON PC, 3300 DAIN RAUSCHER PLAZA, MINNEAPOLIS MN 55402 US
Assignee Name and Address: H5 Technologies, Inc., a California corporation
Serial No.: 800217
Series Code: 10
Filed: March 12, 2004
U.S. Current Class: 707/102
U.S. Class at Publication: 707/102
Intern'l Class: G06F 017/30

[Sheet 5 of 12, FIG. 5: the same record as FIG. 4, with the class titles inserted after the U.S. Current Class field, as follows.]

U.S. Current Class: 707/102
  - Class 707 DATA PROCESSING: DATABASE AND FILE MANAGEMENT OR DATA STRUCTURES
  - 100 DATABASE SCHEMA OR DATA STRUCTURE
  - 102 Generating database or data structure (e.g., via user interface)

[Sheet 6 of 12, FIG. 6: the same record as FIG. 4, ending at the field "Filed: March 12, 2004", with no classification fields.]

[Sheet 7 of 12, FIG. 7]

CODE          TITLE
00028000000   APPAREL
00028001000   MISCELLANEOUS
00028455000   GUARD OR PROTECTOR
00028456000   Body cover

[Sheet 8 of 12, FIG. 8a]

classid   level   code          cdisp     title
1         0       00028000000   Class 2   APPAREL
2         1       00028001000   1         MISCELLANEOUS
3         1       00028455000   455       GUARD OR PROTECTOR
4         2       00028456000   456       Body cover
5         3       00028457000   457       Hazardous material body cover

[Sheet 8 of 12, FIG. 8b]

classid   ancestorid
2         1
3         1
4         1
4         3
5         1
5         3
5         4
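As an illustration only, the tables of FIG. 8a and FIG. 8b can be read as a class table plus an ancestor table that lists every ancestor of each class, so that the full chain of titles for a subclass is available from a single query. The sketch below uses Python with the standard sqlite3 module and is not taken from the patent: the table names classes and class_ancestors, the function names, and the " - " separator are assumptions, and the code strings are transcribed from the figure as extracted.

```python
import sqlite3

# Rows as read from FIG. 8a: (classid, level, code, cdisp, title).
# The code strings are transcribed from the extracted figure and are
# used here only as lookup keys.
CLASSES = [
    (1, 0, "00028000000", "Class 2", "APPAREL"),
    (2, 1, "00028001000", "1", "MISCELLANEOUS"),
    (3, 1, "00028455000", "455", "GUARD OR PROTECTOR"),
    (4, 2, "00028456000", "456", "Body cover"),
    (5, 3, "00028457000", "457", "Hazardous material body cover"),
]

# Pairs as read from FIG. 8b: (classid, ancestorid).
ANCESTORS = [(2, 1), (3, 1), (4, 1), (4, 3), (5, 1), (5, 3), (5, 4)]


def build_db():
    """Create an in-memory database holding both tables (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE classes (classid INTEGER PRIMARY KEY, level INTEGER, "
        "code TEXT, cdisp TEXT, title TEXT)"
    )
    conn.execute(
        "CREATE TABLE class_ancestors (classid INTEGER, ancestorid INTEGER)"
    )
    conn.executemany("INSERT INTO classes VALUES (?, ?, ?, ?, ?)", CLASSES)
    conn.executemany("INSERT INTO class_ancestors VALUES (?, ?)", ANCESTORS)
    return conn


def title_chain(conn, code):
    """Return (cdisp, title) for the class with this code and all of its
    ancestors, ordered from the top-level class down to the class itself."""
    return conn.execute(
        """
        SELECT c.cdisp, c.title
        FROM classes AS c
        WHERE c.code = :code
           OR c.classid IN (
                SELECT a.ancestorid
                FROM class_ancestors AS a
                JOIN classes AS leaf ON leaf.classid = a.classid
                WHERE leaf.code = :code)
        ORDER BY c.level
        """,
        {"code": code},
    ).fetchall()


def expand_record(record_text, chain):
    """Append the class titles to the record text so that a full-text
    search engine indexes them together with the document."""
    titles = " - ".join(f"{cdisp} {title}" for cdisp, title in chain)
    return f"{record_text}\n{titles}"
```

With these rows, the chain for code 00028456000 is Class 2 APPAREL, 455 GUARD OR PROTECTOR, 456 Body cover, so a record classified only in subclass 456 would also match a keyword search on "protector" once the expanded text is indexed.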
[Sheet 9 of 12, FIG. 9: flowchart P100. P101 Load XML record from stream; P102 Transform; P103 Add class titles from database; P104 Save to document store (131).]

[Sheet 10 of 12, FIG. 10: flowchart of step P103. P103.1 Get subclass code from m_spDOM; P103.2 Set append target to <uscs/>; P103.3 Load class hierarchy for subclass (spGetSubclassHierarchy); P103.4 For each row in rowset; P103.5 to P103.7: usctree element with same classid in target? If yes, set append target to existing element; if no, create/insert new usctree element in classid sequence and set append target to new element.]

[Sheet 11 of 12, FIG. 11: flowchart P100. P101 Load XML record from stream; P102 Transform; P103 Add class titles from database; P104html Convert to HTML and save to document store (151); P105 decision (Yes/No).]

[Sheet 12 of 12: flowchart P100. P101 Load XML record from stream; P102 Transform; P104 Save to document store (131); P105 decision (Yes/No).]

CLASSIFICATION-EXPANDED INDEXING AND RETRIEVAL OF CLASSIFIED DOCUMENTS

TECHNICAL FIELD

This invention relates to the indexing and retrieval of documents to which classification codes and schemes have been applied and, in particular, relates to the indexing and retrieval of patent documents.

BACKGROUND

As a result the advantages of classification and indexing systems are beyond the grasp of more casual users and information professionals.

On the other hand, the rapid recent growth of fulltext-base patent retrieval services on the Internet has led lay persons and information professionals alike to rely increasingly on keyword searching. While keyword searching has its advantages and is easy to use, variations in terminology can easily lead to missed documents. Moreover, the intellectual product embodied in the classifications applied to the documents is totally lost.

In related art, D & B Duns Market Identifiers database on DIALOG (http://library.dialog.com/bluesheets/html/bl0516.html) provides for searching SIC descriptors as a search field.

It is standard practice for intellectual property authorities