IDOL Connector Framework Server 11.6 Administration Guide
Connector Framework Server
Software Version: 11.6
Administration Guide

Document Release Date: February 2018
Software Release Date: February 2018

Legal notices

Warranty

The only warranties for Seattle SpinCo, Inc. and its subsidiaries ("Seattle") products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. Seattle shall not be liable for technical or editorial errors or omissions contained herein. The information contained herein is subject to change without notice.

Restricted rights legend

Confidential computer software. Except as specifically indicated, valid license from Seattle required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor's standard commercial license.

Copyright notice

© Copyright 2018 EntIT Software LLC, a Micro Focus company

Trademark notices

Adobe™ is a trademark of Adobe Systems Incorporated.
Microsoft® and Windows® are U.S. registered trademarks of Microsoft Corporation.
UNIX® is a registered trademark of The Open Group.

Documentation updates

The title page of this document contains the following identifying information:
• Software Version number, which indicates the software version.
• Document Release Date, which changes each time the document is updated.
• Software Release Date, which indicates the release date of this version of the software.

To verify you are using the most recent edition of a document, go to https://softwaresupport.softwaregrp.com/group/softwaresupport/search-result?doctype=online help. This site requires you to sign in with a Software Passport. You can register for a Passport through a link on the site.
You will also receive updated or new editions if you subscribe to the appropriate product support service. Contact your Micro Focus sales representative for details.

Support

Visit the Micro Focus Software Support Online website at https://softwaresupport.softwaregrp.com. This website provides contact information and details about the products, services, and support that Micro Focus offers.

Micro Focus online support provides customer self-solve capabilities. It provides a fast and efficient way to access interactive technical support tools needed to manage your business. As a valued support customer, you can benefit by using the support website to:
• Search for knowledge documents of interest
• Submit and track support cases and enhancement requests
• Access the Software Licenses and Downloads portal
• Download software patches
• Access product documentation
• Manage support contracts
• Look up Micro Focus support contacts
• Review information about available services
• Enter into discussions with other software customers
• Research and register for software training

Connector Framework Server (11.6) Page 2 of 185 Administration Guide

Most of the support areas require you to register as a Passport user and sign in. Many also require a support contract. You can register for a Software Passport through a link on the Software Support Online site. To find more information about access levels, go to https://softwaresupport.softwaregrp.com/web/softwaresupport/access-levels.

About this PDF version of online Help

This document is a PDF version of the online Help. This PDF file is provided so you can easily print multiple topics or read the online Help. Because this content was originally created to be viewed as online help in a web browser, some topics may not be formatted properly. Some interactive topics may not be present in this PDF version. Those topics can be successfully printed from within the online Help.
Contents

Chapter 1: Introduction
  Connector Framework Server
  Filter Documents and Extract Subfiles
  Manipulate and Enrich Documents
  The Ingestion Process
  The Import Process
  Index Documents
  The IDOL Platform
  System Architecture
  OEM Certification
  Related Documentation
  Display Online Help

Chapter 2: Configure Connector Framework Server
  Connector Framework Server Configuration File
  Modify Configuration Parameter Values
  Configure Connector Framework Server
  Include an External Configuration File
  Include the Whole External Configuration File
  Include Sections of an External Configuration File
  Include a Parameter from an External Configuration File
  Merge a Section from an External Configuration File
  Encrypt Passwords
  Create a Key File
  Encrypt a Password
  Decrypt a Password
  Configure Client Authorization
  Example Configuration File

Chapter 3: Start and Stop Connector Framework Server
  Start Connector Framework Server
  Stop Connector Framework Server

Chapter 4: Send Actions to Connector Framework Server
  Send Actions to Connector Framework Server
  Asynchronous Actions
  Check the Status of an Asynchronous Action
  Cancel an Asynchronous Action that is Queued
  Stop an Asynchronous Action that is Running
  Store Action Queues in an External Database
  Prerequisites
  Configure Connector Framework Server
  Store Action Queues in Memory
  Use XSL Templates to Transform Action Responses
  Example XSL Templates

Chapter 5: Ingest Data
  Ingest Data using Connectors
  Ingest an IDX File
  Ingest XML
  Transform XML Files
  Parse XML into Documents
  Ingest PST Files
  Ingest Password-Protected Files
  Ingest Data for Testing

Chapter 6: Filter Documents and Extract Subfiles
  Customize KeyView Filtering
  Disable Filtering or Extraction for Specific Documents

Chapter 7: Manipulate and Enrich Documents
  Introduction
  Choose When to Run a Task
  Create Import and Index Tasks
  Document Fields for Import Tasks
  Write and Run Lua Scripts
  Write a Lua Script
  Run a Lua Script
  Debug a Lua Script
  Lua Scripts Included With CFS
  Use Named Parameters
  Enable or Disable Lua Scripts During Testing
  Example Lua Scripts
  Add a Field to a Document
  Count Sections
  Merge Document Fields
  Add Titles to Documents
  Analyze Media
  Create a Media Server Configuration
  Configure the Media Analysis Task
  Run Analysis From Lua
  Troubleshoot Media Analysis
  Analyze Speech
  Run Analysis on All Audio and Video Files
  Run Analysis on Specific Documents
  Use Multiple Speech Servers
  Language Identification
  Transcode Audio
  Speech-To-Text Results
  Categorize Documents
  Customize the Query
  Customize the Output
  Run Eduction
  Redact Documents
  Lua Post Processing
  Process HTML
  HTML Processing with WKOOP
  Remove Irrelevant Content
  Extract Metadata
  Split Web Pages into Multiple Documents
  HTML Extraction
  Extract Metadata from Files
  Import Content Into a Document
  Reject Invalid Documents
  Reject Documents with Binary Content
  Reject Documents with Import Errors
  Reject Documents with Symbolic Content
  Reject Documents by Word Length
  Reject All Invalid Documents
  Split Document Content into Sections
  Split Files into Multiple Documents
  Example
  Write Documents to Disk
  Write Documents to Disk in IDX Format
  Write Documents to Disk in XML Format
  Write Documents to Disk in JSON Format
  Write Documents to Disk in CSV Format
  Write Documents to Disk as SQL INSERT Statements
  Standardize Document Fields
  Customize Field Standardization
  Normalize E-mail Addresses
  Language Detection
  Translate Documents

Chapter 8: Index Documents
  Introduction
  Configure the Batch Size and Time Interval
  Index Documents into an IDOL Server
  Index Documents into Haven OnDemand
  Prepare Haven OnDemand
  Configure CFS to Index into Haven OnDemand
  Index Documents into Vertica
  Prepare the Vertica Database
  Configure CFS to Index into Vertica
  Troubleshooting
  Index Documents into another CFS
  Index Documents into MetaStore
  Document Fields for Indexing
  AUTN_INDEXPRIORITY
  Manipulate Documents Before Indexing

Chapter 9: Monitor Connector Framework Server
  Use the Logs
  Customize Logging
  Monitor Asynchronous Actions using Event Handlers
  Configure an Event Handler
  Write a Lua Script to Handle Events
  Monitor the size of the Import and Index Queues
  Set Up Document Tracking

Appendix A: KeyView Supported Formats
  Supported Formats
  Archive Formats
  Binary Format
  Computer-Aided Design Formats
  Database Formats
  Desktop Publishing
  Display Formats
  Graphic Formats
  Mail Formats
  Multimedia Formats
  Presentation Formats
  Spreadsheet Formats
  Text and Markup Formats
  Word Processing Formats
  Supported Formats (Detected)

Appendix B: KeyView Format Codes
  KeyView Classes
  KeyView Formats

Appendix C: Document Fields
  Document Fields
  AUTN_IDENTIFIER
  Sub File Indexes
  Append Sub File Indexes to the Document Identifier

Glossary

Send documentation feedback

Chapter 1: Introduction

This section provides an overview of Connector Framework Server.

• Connector Framework