IDOL Connector Framework Server 11.3 Administration Guide


HPE Connector Framework Server
Software Version: 11.3

Administration Guide

Document Release Date: February 2017
Software Release Date: February 2017

Legal Notices

Warranty

The only warranties for Hewlett Packard Enterprise Development LP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HPE shall not be liable for technical or editorial errors or omissions contained herein. The information contained herein is subject to change without notice.

Restricted Rights Legend

Confidential computer software. Valid license from HPE required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor's standard commercial license.

Copyright Notice

© Copyright 2017 Hewlett Packard Enterprise Development LP

Trademark Notices

Adobe™ is a trademark of Adobe Systems Incorporated.
Microsoft® and Windows® are U.S. registered trademarks of Microsoft Corporation.
UNIX® is a registered trademark of The Open Group.
This product includes an interface of the 'zlib' general purpose compression library, which is Copyright © 1995-2002 Jean-loup Gailly and Mark Adler.

Documentation updates

The title page of this document contains the following identifying information:

• Software Version number, which indicates the software version.
• Document Release Date, which changes each time the document is updated.
• Software Release Date, which indicates the release date of this version of the software.

To check for recent software updates, go to https://downloads.autonomy.com/productDownloads.jsp.

To verify that you are using the most recent edition of a document, go to https://softwaresupport.hpe.com/group/softwaresupport/search-result?doctype=online help. This site requires that you register for an HPE Passport and sign in. To register for an HPE Passport ID, go to https://hpp12.passport.hpe.com/hppcf/login.do.

You will also receive updated or new editions if you subscribe to the appropriate product support service. Contact your HPE sales representative for details.

Support

Visit the HPE Software Support Online web site at https://softwaresupport.hpe.com. This web site provides contact information and details about the products, services, and support that HPE Software offers.

HPE Software online support provides customer self-solve capabilities. It provides a fast and efficient way to access interactive technical support tools needed to manage your business. As a valued support customer, you can benefit by using the support web site to:

• Search for knowledge documents of interest
• Submit and track support cases and enhancement requests
• Access product documentation
• Manage support contracts
• Look up HPE support contacts
• Review information about available services
• Enter into discussions with other software customers
• Research and register for software training

Most of the support areas require that you register as an HPE Passport user and sign in. Many also require a support contract. To register for an HPE Passport ID, go to https://hpp12.passport.hpe.com/hppcf/login.do.
To find more information about access levels, go to https://softwaresupport.hpe.com/web/softwaresupport/access-levels.

To check for recent software updates, go to https://downloads.autonomy.com/productDownloads.jsp.

About this PDF version of online Help

This document is a PDF version of the online Help. This PDF file is provided so you can easily print multiple topics or read the online Help. Because this content was originally created to be viewed as online help in a web browser, some topics may not be formatted properly. Some interactive topics may not be present in this PDF version. Those topics can be successfully printed from within the online Help.

Contents

Chapter 1: Introduction 9
  HPE Connector Framework Server 9
  Filter Documents and Extract Subfiles 9
  Manipulate and Enrich Documents 10
  Index Documents 10
  Import Process 11
  Display Online Help 11
  OEM Certification 12
  Related Documentation 12
  HPE's IDOL Platform 12
  System Architecture 13

Chapter 2: Configure Connector Framework Server 15
  Connector Framework Server Configuration File 15
  Modify Configuration Parameter Values 16
  Configure Connector Framework Server 17
  Include an External Configuration File 18
  Include the Whole External Configuration File 19
  Include Sections of an External Configuration File 19
  Include a Parameter from an External Configuration File 20
  Merge a Section from an External Configuration File 20
  Customize Logging 21
  Encrypt Passwords 22
  Create a Key File 22
  Encrypt a Password 23
  Decrypt a Password 24
  Example Configuration File 24

Chapter 3: Start and Stop HPE Connector Framework Server 27
  Start Connector Framework Server 27
  Stop Connector Framework Server 27

Chapter 4: Send Actions to HPE Connector Framework Server 29
  Send Actions to HPE Connector Framework Server 29
  Asynchronous Actions 29
  Check the Status of an Asynchronous Action 30
  Cancel an Asynchronous Action that is Queued 30
  Stop an Asynchronous Action that is Running 30
  Store Action Queues in an External Database 31
  Prerequisites 31
  Configure HPE Connector Framework Server 31
  Store Action Queues in Memory 33
  Use XSL Templates to Transform Action Responses 34
  Example XSL Templates 35

Chapter 5: Ingest Data 36
  Ingest Data using Connectors 36
  Ingest an IDX File 36
  Ingest XML 37
  Ingest PST Files 39
  Ingest Password-Protected Files 39
  Ingest Data for Testing 41

Chapter 6: Filter Documents and Extract Subfiles 42
  Customize KeyView Filtering 42
  Disable Filtering or Extraction for Specific Documents 42

Chapter 7: Manipulate and Enrich Documents 44
  Introduction 44
  Choose When to Run a Task 45
  Create Import and Index Tasks 47
  Document Fields for Import Tasks 48
  Write and Run Lua Scripts 49
  Write a Lua Script 50
  Run a Lua Script 50
  Debug a Lua Script 50
  Lua Scripts Included With CFS 54
  Use Named Parameters 55
  Enable or Disable Lua Scripts During Testing 56
  Example Lua Scripts 56
  Add a Field to a Document 56
  Count Sections 57
  Merge Document Fields 57
  Add Titles to Documents 58
  Analyze Media 59
  Create a Media Server Configuration 59
  Run Analysis on All Supported Files 61
  Run Analysis on Specific Documents 62
  Media Analysis Output 64
  Troubleshoot Media Analysis 65
  Analyze Speech 65
  Run Analysis on All Audio and Video Files 66
  Run Analysis on Specific Documents 67
  Use Multiple Speech Servers 68
  Language Identification 68
  Transcode Audio 69
  Speech-To-Text Results 69
  Categorize Documents 70
  Customize the Query 71
  Customize the Output 72
  Run Eduction 74
  Redact Documents 74
  Lua Post Processing 75
  Extract Content from HTML 76
  Extract Metadata from Files 76
  Import Content Into a Document 77
  Reject Invalid Documents 78
  Reject Documents with Binary Content 78
  Reject Documents with Import Errors 78
  Reject Documents with Symbolic Content 79
  Reject Documents by Word Length 79
  Reject All Invalid Documents 80
  Split Document Content into Sections 80
  Split Files into Multiple Documents 81
  Example 81
  Write Documents to Disk 83
  Write Documents to Disk in IDX Format 83
  Write Documents to Disk in XML Format 84
  Write Documents to Disk in JSON Format 84
  Write Documents to Disk in CSV Format 84
  Write Documents to Disk as SQL INSERT Statements 85
  Standardize Document Fields 86
  Customize Field Standardization 86
  Normalize E-mail Addresses 90

Chapter 8: Index Documents 92
  Introduction 92
  Configure the Batch Size and Time Interval 93
  Index Documents into an IDOL Server 93
  Index Documents into Haven OnDemand 94
  Prepare Haven OnDemand 94
  Configure CFS to Index into Haven OnDemand 95
  Index Documents into Vertica 96
  Prepare the Vertica Database 97
  Configure CFS to Index into Vertica 98
  Troubleshooting 99
  Index Documents into another CFS 99
  Index Documents into MetaStore 100
  Document Fields for Indexing 101
  AUTN_INDEXPRIORITY 101
  Manipulate Documents Before Indexing 101
  Set Up Document Tracking 102

Appendix A: KeyView Supported Formats 106
  Supported Formats 106
  Archive Formats 108
  Binary Format 110
  Computer-Aided Design Formats 110
  Database Formats 112
  Desktop Publishing 113
  Display Formats 113
  Graphic Formats 114
  Mail Formats 117
  Multimedia Formats 119
  Presentation Formats 121
  Spreadsheet Formats 123
  Text and Markup Formats 125
  Word Processing Formats 126
  Supported Formats (Detected) 131

Appendix B: KeyView Format Codes 136
  KeyView Classes 136
  KeyView Formats 137

Appendix C: Document Fields 162
  Document Fields 162
  AUTN_IDENTIFIER 163
  Sub File Indexes 164
  Append Sub File Indexes to the Document Identifier 165

Glossary 166

Send documentation feedback 169

Chapter 1: Introduction

This section provides an overview of Connector Framework Server.

• HPE Connector Framework Server 9
• Related Documentation 12
• HPE's IDOL Platform 12
• System Architecture 13

HPE Connector Framework Server

HPE Connector Framework Server (HPE CFS) processes the information that is retrieved by connectors, and then indexes the information into one or more indexes, such as IDOL Server or Haven OnDemand.

When connectors send documents to CFS, the documents contain only metadata extracted from the repository, such as the location of a file or record that the connector has retrieved. CFS extracts the file-specific metadata and file content from the file and adds it to the document. This allows IDOL to search and extract meaning from the information contained in the repository, without needing to process the information in its native format.

CFS also provides features to manipulate and enrich documents before they are indexed.
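Such enrichment is often carried out by Lua scripts that CFS runs against each document during import (Lua scripting is covered in Chapter 7). The following is a minimal sketch of such a script, assuming the handler entry point and the addField method on the document object that later chapters describe; the field name and value shown here are illustrative only, not part of the product defaults.

    -- Minimal sketch of a CFS import-task Lua script (assumptions noted above).
    -- CFS calls the handler function once for each document that it processes.
    function handler(document)
        -- Add an illustrative field to the document before it is indexed.
        document:addField("EXAMPLE_ENRICHED_BY", "Connector Framework Server")
        -- Return true to keep the document; returning false discards it.
        return true
    end

A script like this would typically be registered as an import task in the CFS configuration file, as described in "Create Import and Index Tasks" in Chapter 7.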