22 STNews – May 2011

Adding Value to INPADOC on STN - Quality Assurance by FIZ Karlsruhe Editorial -

INPADOC on STN is based on the world’s largest collection of patent bibliographic and legal status data from the European Patent Office. The EPO performs a tremendous task in compiling and harmonizing patent data from more than 90 patent authorities into a single patent data resource. Quality assurance processes at the EPO contribute significantly to the quality of the database (1). Nevertheless, FIZ Karlsruhe identified areas in which the quality of the database could be further improved and established its own quality management for INPADOC. A key aspect of these activities is the correction of patent bibliographic data, providing patent information professionals with more accurate and comprehensive patent family information.

The importance of high quality patent numbers

Patent numbers (publication, application and priority numbers) are essential bibliographic elements for building accurate patent families.

• Correct priority numbers are indispensable for building reliable INPADOC families: all patent publications that have a priority number in common (directly or indirectly) constitute one INPADOC family. INPAFAMDB, the INPADOC family file on STN, is designed according to this family concept, i.e. one INPAFAMDB record includes all bibliographic and legal status data of all patent publications belonging to the INPADOC family.
• Application numbers are the key elements for linking all patent publications of a national family. INPADOCDB is the INPADOC database on STN which features an application-based file design, i.e. one INPADOCDB record compiles all bibliographic details and legal status data available for the national family.

The high quality of patent numbers is also essential for the interaction of INPADOC with other patent files on STN: efficient crossfile searching between different patent files requires correct and highly standardized patent numbers. The STN patent standard creates consistency across the various patent databases on STN and harmonizes patent data from different producers, in particular from Chemical Abstracts Service, Thomson Reuters and the .

Standardization efforts and corrections of patent numbers

Every week about 60,000 new and 140,000 updated patent publications enter the INPADOC databases, comprising a huge number of diverse patent numbers. All of these numbers are validated against a standardization table and converted to the STN patent standard format. At the validation stage, publications that do not match the required input standard are rejected. Typical errors include missing or incorrect priority or application numbers and incorrect publication numbers or kind codes. Errors identified in the weekly INPADOC update are corrected manually by the FIZ Karlsruhe editorial team or in case

(1) Albrecht M A, Bosma R, van Dinter T, Ernst J-L, van Ginkel K, Versloot-Spoelstra F. Quality assurance in the EPO Patent Information Resource. World Patent Information, 32 (2010) 279-28

of serial errors, corrections are made automatically. Corrections are typically online within one week after the error has been detected.

The standardization table is the core module of the whole validation and standardization procedure. For historical reasons, the patent numbering formats applied by the more than 90 patent authorities are highly inconsistent. Each patent office uses different numbering formats and kind codes for different patent publication types and time ranges. As a result, the standardization table covers more than 2,200 different numbering formats for patent publication and application numbers. The FIZ Karlsruhe editorial team takes great care to keep this table up to date and to create a consistent STN standard for new numbering formats.

Apart from the weekly update routines, quality management for INPADOC also means that extra data deliveries from the EPO are monitored and that plausibility checks are performed for the entire database.

The editorial activities also involve corrections of single errors reported by customers or STN staff, e.g. wrong patent families, or missing or incorrect patent assignee names or titles. These errors are corrected on a case-by-case basis, and often several different sources need to be consulted, e.g. original sources from patent offices and related patent publications. Priority number corrections in particular require special expertise and careful error analysis.

More accurate patent families through priority number corrections

Family building in INPADOC is a dynamic process which takes new priority relationships into account. An INPADOC family can be reassembled when a priority number is corrected or added to an existing document. This means that separate families merge to form a new family, or that a family is split into different families (Fig. 1).

The two examples in Fig. 1 illustrate the benefit of priority number corrections by the FIZ Karlsruhe editorial team. In both cases the priority information on the original US published application was wrong. The example on the left, a US publication joining a patent family, clearly demonstrates that correct priority information is essential for the comprehensiveness of a patent family.
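The validation step against the standardization table, described earlier in this section, can be pictured with a small sketch. This is only an illustration under stated assumptions: the two raw input patterns and the names `STANDARDIZATION_TABLE` and `standardize` are hypothetical, and only the target form (for example CN2007-10178796 or US2006-492355) is taken from the numbers shown in Fig. 1. The real table covers more than 2,200 formats.

```python
import re

# Hypothetical excerpt of a standardization table. Each entry pairs an
# ASSUMED raw input pattern with a builder for the standardized number.
# The raw formats are invented for illustration; only the output style
# (country code + year + "-" + serial) follows the numbers in Fig. 1.
STANDARDIZATION_TABLE = [
    (re.compile(r"^CN\s?(\d{4})(\d{8})$"),
     lambda m: f"CN{m.group(1)}-{m.group(2)}"),
    (re.compile(r"^US\s?(\d{4})/(\d{6})$"),
     lambda m: f"US{m.group(1)}-{m.group(2)}"),
]


def standardize(raw_number):
    """Return the standardized number, or None if no table entry matches.

    A None result corresponds to a rejected record that would be passed
    on for correction.
    """
    candidate = raw_number.strip()
    for pattern, build in STANDARDIZATION_TABLE:
        match = pattern.match(candidate)
        if match:
            return build(match)
    return None


print(standardize("CN 200710178796"))  # matches the first table entry
print(standardize("US 2006/492355"))   # matches the second table entry
print(standardize("CN 2007-XYZ"))      # no entry matches: rejected
```

A format check of this kind cannot detect every error: the erroneous priority number CN2007-11017879 from Fig. 1 is well-formed and would pass, which is one reason why priority number corrections need the careful error analysis described above.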

Example 1 (left): Separate patent families are merged due to priority number corrections of FIZ Karlsruhe

  Before: CN101175044 A, EP2068515 A1, WO2009076860 A1 | US20090083750 A1
  Correction of Chinese priority number of US20090083750: CN2007-11017879 => CN2007-10178796
  After: CN101175044 A, EP2068515 A1, US20090083750 A1, WO2009076860 A1

Example 2 (right): False patent families are separated due to priority number corrections of FIZ Karlsruhe

  Before: DE 10327112 A1, EP 1495800 A2, EP 1495800 A3, US 20040107821 A1, US 7081579 B2, US 20050038130 A1, US 20060254411 A1, US 20060264521 A1, US 20080021851 A1, US 20080059287 A1
  Correction of US-priority number of US20080021851: US2006-492395 => US2006-492355
  After:
    Ion exchange material production (Lanxess GmbH): DE10327112 A1, EP1495800 A2, EP1495800 A3, US20050038130 A1, US20060264521 A1
    Method and system for music recommendation (Music Intelligence Solut.): US20040107821 A1, US7081579 B2, US20060254411 A1, US20080021851 A1, US20080059287 A1

Fig. 1: Two examples of priority number corrections leading to more accurate patent families
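The family concept (publications linked directly or indirectly by a common priority number) amounts to finding connected groups, and the merge in the left-hand example of Fig. 1 can be reproduced with a minimal sketch. It is not the EPO's or FIZ Karlsruhe's implementation. The function name `build_families` is hypothetical, and the priority lists are an assumption: Fig. 1 gives only the corrected priority of US20090083750, so the other three publications are assumed to claim CN2007-10178796.

```python
from collections import defaultdict


def build_families(priorities):
    """Group publications that share a priority number, directly or indirectly.

    `priorities` maps a publication number to its list of priority numbers.
    Returns a sorted list of families, each a sorted list of publications.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Link every publication to each of its priority numbers.
    for publication, priority_numbers in priorities.items():
        find(("pub", publication))
        for priority in priority_numbers:
            union(("pub", publication), ("prio", priority))

    families = defaultdict(set)
    for publication in priorities:
        families[find(("pub", publication))].add(publication)
    return sorted(sorted(members) for members in families.values())


# ASSUMED priority data for the left-hand example of Fig. 1.
before = {
    "CN101175044 A": ["CN2007-10178796"],
    "EP2068515 A1": ["CN2007-10178796"],
    "WO2009076860 A1": ["CN2007-10178796"],
    "US20090083750 A1": ["CN2007-11017879"],  # erroneous priority number
}
after = dict(before)
after["US20090083750 A1"] = ["CN2007-10178796"]  # corrected priority number

families_before = build_families(before)
families_after = build_families(after)
print(len(families_before), "families before the correction")
print(len(families_after), "family after the correction")
```

Rebuilding the groups from scratch after each correction keeps the sketch simple and handles splits as well as merges. A production system would more likely recompute only the families touched by a changed priority number.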

Statistics of FIZ Karlsruhe editorial corrections

The number of corrections performed by the FIZ Karlsruhe editorial team is constantly increasing. In 2010 the number of corrections was about 40,000, bringing the overall number of corrections to more than 100,000 since the statistics began in 2008 (Fig. 2).

The editorial corrections are well integrated in the weekly update routines for INPADOC. Whenever updated bibliographic information from the EPO includes the same errors as previously corrected by FIZ Karlsruhe editorial, we make sure that our corrections are retained. In addition, all corrections from the EPO are checked against earlier corrections made by FIZ Karlsruhe editorial.

Fig. 2: Corrections of bibliographic data in INPADOC on STN from 2008-2010

Adding value to legal status information

The INPADOC collection is the largest repository of legal status data worldwide, comprising selected legal status events from currently 59 patent authorities. The EPO applies more than 2,600 legal status codes, each code representing a specific legal status event from a particular patent office. Worldwide legal status searching is a rather challenging task, as many different codes from different countries need to be searched for a particular legal request.

FIZ Karlsruhe introduced a set of legal status categories to simplify legal status searches. The FIZ Karlsruhe editorial team reviewed the complete list of legal status codes and assigned legal status categories to half of the codes.

Following discussions with customers, seven categories of major interest were identified (Fig. 3), e.g. applicant reassignments, oppositions or “not in force” (including codes such as lapse or expiry). The editorial team is in charge of these categories, continuously monitors any changes to the legal status codes and makes sure that new codes are captured in the relevant category.

Categories shown in Fig. 3: Reinstatement, Not-in-force, Examination, SPCs, Licensing, Opposition, Change of owner

Fig. 3: Legal status categories assigned and maintained by FIZ Editorial

Outlook

In the fast growing patent information landscape, the role of INPADOC as a central resource for worldwide patent information will become even more important in the future.

More and more patent offices from Asia, the Middle East and Latin America contribute to INPADOC, and the EPO is faced with an increasing number of heterogeneous sources. Patenting activities in Asia in particular account for a rapid growth of the INPADOC data collection. Furthermore, the EPO makes a considerable effort to extend the backfile coverage for several patent authorities and to fill coverage gaps where possible. All of these coverage enhancements go along with additional standardization work and increase the potential for bibliographic errors.

In the wake of these developments, the thorough quality assurance by FIZ Karlsruhe remains a considerable value-add to INPADOC on STN.