Primo Technical Guide
Total Page:16
File Type:pdf, Size:1020Kb
Technical Guide May 2016 Ex Libris Confidential 6/7 1 CONFIDENTIAL INFORMATION The information herein is the property of Ex Libris Ltd. or its affiliates and any misuse or abuse will result in economic loss. DO NOT COPY UNLESS YOU HAVE BEEN GIVEN SPECIFIC WRITTEN AUTHORIZATION FROM EX LIBRIS LTD. This document is provided for limited and restricted purposes in accordance with a binding contract with Ex Libris Ltd. or an affiliate. The information herein includes trade secrets and is confidential. DISCLAIMER The information in this document will be subject to periodic change and updating. Please confirm that you have the most current documentation. There are no warranties of any kind, express or implied, provided in this documentation, other than those expressly agreed upon in the applicable Ex Libris contract. This information is provided AS IS. Unless otherwise agreed, Ex Libris shall not be liable for any damages for use of this document, including, without limitation, consequential, punitive, indirect or direct damages. Any references in this document to third‐party material (including third‐party Web sites) are provided for convenience only and do not in any manner serve as an endorsement of that third‐ party material or those Web sites. The third‐party materials are not part of the materials for this Ex Libris product and Ex Libris has no liability for such materials. TRADEMARKS ʺEx Libris,ʺ the Ex Libris bridge , Primo, Aleph, Alephino, Voyager, SFX, MetaLib, Verde, DigiTool, Preservation, Rosetta, URM, ENCompass, Endeavor eZConnect, WebVoyáge, Citation Server, LinkFinder and LinkFinder Plus, and other marks are trademarks or registered trademarks of Ex Libris Ltd. or its affiliates. The absence of a name or logo in this list does not constitute a waiver of any and all intellectual property rights that Ex Libris Ltd. or its affiliates have established in any of its products, features, or service names or logos. Trademarks of various third‐party products, which may include the following, are referenced in this documentation. Ex Libris does not claim any rights in these trademarks. Use of these marks does not imply endorsement by Ex Libris of these third‐party products, or endorsement by these third parties of Ex Libris products. Oracle is a registered trademark of Oracle Corporation. UNIX is a registered trademark in the United States and other countries, licensed exclusively through X/Open Company Ltd. Microsoft, the Microsoft logo, MS, MS‐DOS, Microsoft PowerPoint, Visual Basic, Visual C++, Win32, Microsoft Windows, the Windows logo, Microsoft Notepad, Microsoft Windows Explorer, Microsoft Internet Explorer, and Windows NT are registered trademarks and ActiveX is a trademark of the Microsoft Corporation in the United States and/or other countries. Unicode and the Unicode logo are registered trademarks of Unicode, Inc. Google is a registered trademark of Google Inc. iPhone is a registered trademark of Apple Inc. Copyright Ex Libris Limited, 2016. All rights reserved. Document released: June 7, 2016 Web address: http://www.exlibrisgroup.com Ex Libris Confidential 6/7 2 The PNX Record This section includes: Introduction PNX Record Sections Subfields in the PNX The Control Section The Display Section The Links Section The Search Section The Facets Section The Sort Section The Duplicate Record Detection Section The FRBR Section The Delivery Section The Ranking Section The Enrichment Section The Additional Data Section Example of a PNX Record Ex Libris, a ProQuest Company 6/7 3 Introduction Primo harvests data from a wide range of source databases and formats. This data must be normalized and optimized to facilitate efficient discovery and delivery. The various source formats must be mapped to the Primo normalized XML (PNX) record. For additional information about the publishing process, refer to the Primo Administration Guide. The PNX record is created and stored in XML format, using the UTF-8 character set. The data in the PNX record is organized in sections, each section containing information for a specific purpose. In some cases, the data is duplicated to provide flexibility to the system when it is processing the data. Ex Libris, a ProQuest Company 6/7 4 PNX Record Sections The following table lists the sections in the PNX record. Sections of the PNX Record Section Contains For additional information, refer to Additional Data Data elements required for a The Additional Data Section number of functions that cannot be extracted from other sections of the PNX. Browse Data that is used to create the Refer to Browse Functionality. browse lists. Control Formatted data that is used for The Control Section control purposes. Delivery and Scoping Data required for managing The Delivery Section delivery and scoping for searches. Display Data displayed in the brief and The Display Section full displays in the user interface (UI). Duplication Detection A Dedup vector, which The Duplicate Record includes all the data required Detection Section by the Duplication Detection algorithm to determine if two records are equivalent. Enrichment Data required by the The Enrichment Section enrichment process. Facets Data used to create faceted The Facets Section browsing in the UI. Grouping (FRBR) A FRBR vector, which includes The FRBR Section one or more keys that identify the group it represents. Ex Libris, a ProQuest Company 6/7 5 Links Links that can be used to The Links Section create the GetIT! functionality and/or links in the record display. Ranking Two booster fields that can be The Ranking Section used to boost the ranking of the record. Search Metadata and full-text data The Search Section that are indexed for searching purposes. Sort Sort fields used as the basis The Sort Section for sorting the results set. Each of the above sections contains various fields. Multiple and repeatable fields of the source record can be concatenated to one field in the PNX record (such as the Display section) or stored in separate fields within the PNX record (such as the Search and Facets sections). In some cases, the system will take only one of the following fields: All fields of the Control section. All fields of the Sort section. All fields of the Dedup and FRBR sections. Delivery section – the Delivery category field. All fields of the Ranking section. Ex Libris, a ProQuest Company 6/7 6 Subfields in the PNX Some fields in the PNX record have multiple values that are delimited by two dollar signs followed by a specific character or number (similar to MARC subfields). The following table lists the various subfield delimiter types used in the PNX record. Subfield Delimiter Types Type Delimiter Description Uppercase Alphabetic The character denotes the content and is persistent across PNX fields. A Algorithm used for FRBR key type. C A constant that displays before the field. This delimiter can be used only in the Display section for the following fields: identifier, relation, and description. The constant can be a code (lower case with no spaces or special characters). The code is translated to a name for display in the Front End using the Display Constants code table. If the text added has no translation in the code table, it will display as entered in the rules. D Text that displays instead of the field—for example, for URLS). I Institution code K Key for FRBR L Library code O Origin of the field used in Deduped records - the sourceid. S Status T Template code U URL Ex Libris, a ProQuest Company 6/7 7 V Value of the field (to distinguish between the value of the field and the display text or constant). Numeric Indicates the order of data elements added to the field. All text fields can be either code or text. Primo first searches for a translation of the value in the codes table. Only if it does not find the translation will Primo display the value. Ex Libris, a ProQuest Company 6/7 8 The Control Section The Control section includes formatted data used for control purposes. The following table lists the contents of the Control section. Control Section Fields Field Code Description sourceid The source ID identifies the source repository in Primo. Every source repository has a configuration file in which the sourceid and other information about the source repository are recorded. originalsourceid This ID identifies the source repository in the source system. This is not necessarily the same as the source repository's identifier in Primo—for example, USM01. sourcerecordid This ID identifies the record in the source repository (such as an ALEPH system number supplied in MARC21 tag 001). This ID must be unique and persistent within the source repository. It is derived from the OAI header. addsrcrecordid This ID identifies an additional ID of the source record. recordid The record ID is a unique identifier of the record in the Primo repository. The sourceid and sourcerecordid are concatenated to create the recordid (for example, ALEPH system number + tag 001). sourcetype Source type—not in use. sourceformat The source format identifies the original format of the source record (such as MARC21, Dublin Core, and MAB2). sourcesystem The source system identifies the system used by the source repository (such as ALEPH, ADAM, MetaLib, SFX, and Digitool). recordtype Record type—not in use. lastmodified Date last modified—not in use. Ex Libris, a ProQuest Company 6/7 9 The Display Section The Display section includes data used in the brief and full display formats of the UI. The basis for the data elements used in the Display section is the Dublin Core element set. Dublin Core was selected as a metadata standard that is intended to support a broad range of purposes and resource types. In some cases, the names of the Dublin Core elements have been modified and a number of additional fields have been added.