IDOL ConnectorLib Java SDK™ Programming Guide

Version 10.0
Document Revision 0
10 May 2012

Copyright Notice

Notice

This documentation is a proprietary product of Autonomy and is protected by copyright laws and international treaty. Information in this documentation is subject to change without notice and does not represent a commitment on the part of Autonomy. While reasonable efforts have been made to ensure the accuracy of the information contained herein, Autonomy assumes no liability for errors or omissions. No liability is assumed for direct, incidental, or consequential damages resulting from the use of the information contained in this documentation. The copyrighted software that accompanies this documentation is licensed to the End User for use only in strict accordance with the End User License Agreement, which the Licensee should read carefully before commencing use of the software. No part of this publication may be reproduced, transmitted, stored in a retrieval system, nor translated into any human or computer language, in any form or by any means, electronic, mechanical, magnetic, optical, chemical, manual or otherwise, without the prior written permission of the copyright owner. This documentation may use fictitious names for purposes of demonstration; references to actual persons, companies, or organizations are strictly coincidental.

Trademarks and Copyrights

Copyright 2012 Autonomy Corporation plc and all its affiliates. All rights reserved. ACI API, Alfresco Connector, Arcpliance, Autonomy Process Automation, Autonomy Fetch for Siebel eBusiness Applications, Autonomy, Business Objects Connector, Cognos Connector, Confluence Connector, ControlPoint, DAH, Digital Safe Connector, DIH, DiSH, DLH, Documentum Connector, DOH, EAS Connector, Ektron Connector, Enterprise AWE, eRoom Connector, Exchange Connector, FatWire Connector, File System Connector for Netware, File System Connector, FileNet Connector, FileNet P8 Connector, FTP Fetch, HTTP Connector, Hummingbird DM Connector, IAS, IBM Content Manager Connector, IBM Seedlist Connector, IBM Workplace Fetch, IDOL Server, IDOL, IDOLme, iManage Fetch, IMAP Connector, Import Module, iPlanet Connector, KeyView, KVS Connector, Legato Connector, LiquidOffice, LiquidPDF, LiveLink Web Content Management Connector, MCMS Connector, MediClaim, Meridio Connector, Meridio, Moreover Fetch, NNTP Connector, Notes Connector, Objective Connector, OCS Connector, ODBC Connector, Omni Fetch SDK, Open Text Connector, Oracle Connector, PCDocs Fetch, PLC Connector, POP3 Fetch, Portal-in-a-Box, RecoFlex, Retina, SAP Fetch, Schlumberger Fetch, SharePoint 2003 Connector, SharePoint 2007 Connector, SharePoint 2010 Connector, SharePoint Fetch, SpeechPlugin, Stellent Fetch, TeleForm, Tri-CR, Ultraseek, Verity Profiler, Verity, VersiForm, WebDAV Connector, WorkSite Connector, and all related titles and logos are trademarks of Autonomy Corporation plc and its affiliates, which may be registered in certain jurisdictions. Microsoft is a registered trademark, and MS-DOS, Windows, Windows 95, Windows NT, SharePoint, and other Microsoft products referenced herein are trademarks of Microsoft Corporation. UNIX is a registered trademark of The Open Group. AvantGo is a trademark of AvantGo, Inc. Epicentric Foundation Server is a trademark of Epicentric, Inc.
Documentum and eRoom are trademarks of Documentum, a division of EMC Corp. FileNet is a trademark of FileNet Corporation. Lotus Notes is a trademark of Lotus Development Corporation. mySAP Enterprise Portal is a trademark of SAP AG. Oracle is a trademark of Oracle Corporation. Adobe is a trademark of Adobe Systems Incorporated. Novell is a trademark of Novell, Inc. Stellent is a trademark of Stellent, Inc. All other trademarks are the property of their respective owners.

Notice to Government End Users

If this product is acquired under the terms of a DoD contract: Use, duplication, or disclosure by the Government is subject to restrictions as set forth in subparagraph (c)(1)(ii) of 252.227-7013. Civilian agency contract: Use, reproduction or disclosure is subject to 52.227-19 (a) through (d) and restrictions set forth in the accompanying end user agreement. Unpublished-rights reserved under the copyright laws of the United States. Autonomy, Inc., One Market Plaza, Spear Tower, Suite 1900, San Francisco, CA. 94105, US.

Contents

About This Document
  Documentation Updates
  Related Documentation
  Conventions
    Notational Conventions
    Command-line Syntax Conventions
    Notices
  Autonomy Product References
  Autonomy Customer Support
  Contact Autonomy

Part 1 Getting Started

Chapter 1  Introduction ...... 25 Overview ...... 25 About Connector Framework Server ...... 26 System Architecture ...... 27 Import Process...... 28

Chapter 2  Install ConnectorLib Java SDK ...... 29 System Requirements...... 29 Install ConnectorLib Java SDK on Windows ...... 30 Directory Structure—Windows ...... 33 Connector Framework Server Directory Structure ...... 33 ConnectorLib Java SDK Directory Structure ...... 34


Chapter 3  Configure the Connector ...... 37 Modify Parameters ...... 37 Enter Boolean Values ...... 37 Enter String Values ...... 38 Encrypt Passwords ...... 38 Set Up Log Streams ...... 40

Chapter 4  Implement a Connector using the ConnectorLib Java SDK...... 43 Overview ...... 44 Create a New Connector based on ConnectorLibJava ...... 44 Run the Connector ...... 45 Implement the Synchronize Action ...... 46 Configuration and Logging ...... 47 Debug the Connector ...... 48 Implement Other Actions ...... 48 Documents ...... 49 DocInfo Class ...... 49 Identifiers ...... 50 Example Identifier ...... 51 Sub File Indices ...... 51 Append Sub File Indices with Lua ...... 52 Datastore ...... 53 Configure the Datastore Tables ...... 53 Insert Records ...... 54 Update Records ...... 54 Remove Records ...... 54 Commit Changes ...... 55 Select Records ...... 55 SelectOne Method ...... 55 Select Method ...... 55 Upgrade a Datastore ...... 56 Index a Column ...... 57 Ingester Class...... 57 Ingest Result Handler ...... 57 Additional Information...... 58


Chapter 5  Start and Stop the Connector ...... 59 Start the Connector ...... 59 Stop the Connector ...... 60

Chapter 6  Configure Connector Framework Server ...... 61 Connector Framework Server Configuration File...... 61 Modify Parameters ...... 62 Enter Boolean Values ...... 62 Enter String Values ...... 62 Configure Connector Framework Server ...... 63 Example Configuration File ...... 63

Chapter 7  Use Lua Scripts ...... 67 Use Lua Scripts within the CFS ...... 67 Configure a Lua Script ...... 68 Write a Lua Script ...... 68 Method Reference ...... 69 General Methods ...... 74 abs_path ...... 74 convert_date_time ...... 74 convert_encoding ...... 75 copy_file ...... 76 create_path ...... 76 create_uuid ...... 76 delete_file ...... 77 encrypt ...... 77 encrypt_security_field ...... 77 file_setdates ...... 78 getcwd ...... 78 get_config ...... 78 gobble_whitespace ...... 79 hash_file ...... 79 hash_string ...... 80 is_dir ...... 80 log ...... 81 move_file ...... 81

• • • ConnectorLib Java SDK Programming Guide • 5 • • Contents

parse_csv ...... 82 parse_xml ...... 82 regex_match ...... 82 regex_search ...... 83 send_aci_action ...... 83 send_aci_command ...... 84 sleep ...... 85 string_uint_less ...... 85 unzip_file ...... 86 xml_encode ...... 86 zip_file ...... 86 Document Methods ...... 87 addField ...... 87 appendContent ...... 87 copyField ...... 87 copyFieldNoOverwrite ...... 88 countField ...... 88 deleteField ...... 89 findField ...... 89 getContent ...... 89 getField ...... 90 getFields ...... 90 getFieldNames ...... 90 getFieldValue ...... 91 getFieldValues ...... 91 getNextSection ...... 91 getReference ...... 92 hasField ...... 92 insertXML ...... 92 renameField ...... 93 setContent ...... 93 setFieldValue ...... 93 setReference ...... 94 writeStubIdx ...... 94 Field Methods ...... 95 addField ...... 95 copyField ...... 95 copyFieldNoOverwrite ...... 95 countField ...... 96

• • • 6 • ConnectorLib Java SDK Programming Guide • • Contents

deleteAttribute ...... 96 deleteField ...... 96 getAttributeValue ...... 97 getField ...... 97 getFieldNames ...... 98 getFields ...... 98 getFieldValues ...... 98 hasAttribute ...... 99 hasField ...... 99 insertXML ...... 99 name ...... 100 renameField ...... 100 setAttributeValue ...... 100 setValue ...... 101 value ...... 101 XMLDocument Methods ...... 101 root ...... 101 XPathExecute ...... 102 XPathRegisterNs ...... 102 XPathValue ...... 102 XPathValues ...... 103 XmlNodeSet Methods ...... 103 at ...... 103 size ...... 104 XmlNode Methods ...... 104 attr ...... 104 content ...... 104 firstChild ...... 104 lastChild ...... 104 name ...... 105 next ...... 105 nodePath ...... 105 parent ...... 105 prev ...... 106 type ...... 106 XmlAttr Methods ...... 106 name ...... 106 next_attribute ...... 106 previous_attribute ...... 107

• • • ConnectorLib Java SDK Programming Guide • 7 • • Contents

type ...... 107 value ...... 107 RegexMatch Methods ...... 107 length ...... 107 next ...... 108 position ...... 108 size ...... 108 str ...... 109 Config Methods ...... 109 getEncryptedValue ...... 109 getValue ...... 110 getValues ...... 110 Change the Value of a Field ...... 111 Example Script ...... 111 Use Lua Scripts Within the Connector ...... 112 Introduction ...... 112 Example Lua Script ...... 112

Part 2 Parameter and Command Reference

Chapter 8  Parameters Common to CFS Connectors ...... 117 ACI Server Configuration ...... 118 FilePath ...... 118 LibraryName ...... 118 LuaScript ...... 119 MaximumThreads ...... 119 MaxQueueSize ...... 120 MaxScheduledSize ...... 120 OnError ...... 121 OnErrorReport ...... 121 OnFinish ...... 122 OnStart ...... 122 Url ...... 122 Import Service ...... 123 KeyviewDirectory ...... 123 Distributed Connector ...... 124 ConnectorGroup ...... 124

• • • 8 • ConnectorLib Java SDK Programming Guide • • Contents

ConnectorPriority ...... 124 DataPortN ...... 125 HostN ...... 125 PortN ...... 126 RegisterConnector ...... 126 SharedPath ...... 127 SSLConfigN ...... 127 View Server ...... 128 EnableViewServer ...... 128 Host ...... 128 Port ...... 129 SharedPath ...... 129 General Connector Parameters ...... 130 CleanOnStart ...... 130 DatastoreFile ...... 130 DatastoreDirectory ...... 131 EnableExtraction ...... 131 EnableExtractionCopy ...... 131 EnableScheduledTasks ...... 132 EncryptACLEntries ...... 133 HashedDestinationDirectory ...... 133 HashedTempDirectory ...... 134 InsertActions ...... 134 InsertFailedDirectory ...... 135 MinFreeSpaceMB ...... 135 SynchronizeKeepDatastore ...... 135 SynchronizeThreads ...... 136 TaskMaxAdds ...... 136 TaskMaxDuration ...... 137 TaskThreads ...... 137 TempDirectory ...... 138 XsltDLL ...... 138 Fetch Task Configuration ...... 139 IngestConfigSection ...... 139 N ...... 139 Number ...... 140 ScheduleCycles ...... 140 ScheduleRepeatSecs ...... 141 ScheduleStartTime ...... 141 Ingestion ...... 142

• • • ConnectorLib Java SDK Programming Guide • 9 • • Contents

EnableIngestion ...... 142 IndexDatabase ...... 142 IngestActions ...... 143 IngestAddAsUpdate ...... 143 IngestBatchSize ...... 144 IngestCheckFinished ...... 144 IngestConnectorConfigSection ...... 145 IngestDataPort ...... 145 IngestDelayMS ...... 146 IngestEnableAdds ...... 146 IngestEnableDeletes ...... 146 IngestEnableUpdates ...... 147 IngestHashedSharedPath ...... 147 IngesterType ...... 147 IngestHost ...... 148 IngestKeepFiles ...... 149 IngestPort ...... 149 IngestSendByType ...... 149 IngestSharedPath ...... 150 IngestSSLConfig ...... 150 IngestWriteIDX ...... 151 GroupServer ...... 151 GroupServerHost ...... 151 GroupServerPort ...... 152 GroupServerRepository ...... 152 GroupServerSSLConfig ...... 152

Chapter 9  Parameters Common to CFS Connectors Using Java ...... 155 JavaClassPath ...... 155 JavaConnectorClass ...... 156 JavaLibraryPath ...... 156 JavaMaxMemoryMB ...... 157 JVMLibraryPath ...... 157 JavaVerboseGC ...... 158

Chapter 10  CFS Connector Actions...... 159 Synchronous Versus Asynchronous Actions ...... 160

• • • 10 • ConnectorLib Java SDK Programming Guide • • Contents

QueueInfo Action ...... 160 Synchronize Fetch Action...... 162 Synchronize Groups Fetch Action ...... 164 Collect Fetch Action ...... 165 Identifiers Fetch Action...... 168 Insert Fetch Action ...... 170 Delete/Remove Fetch Action...... 173 Hold and ReleaseHold Fetch Actions...... 174 Update Action...... 176 View Action ...... 177 StopFetch Action...... 178

Chapter 11  Connector Framework Server Parameters ...... 179 Service Parameters ...... 180 Server Parameters ...... 180 AdminClients ...... 180 Port ...... 181 QueryClients ...... 181 Actions Parameters ...... 182 MaxQueueSize ...... 182 MaximumThreads ...... 182 Import Tasks and their Parameters ...... 183 Import Tasks ...... 183 Lua ...... 183 IDXWriter ...... 183 TextToDocs ...... 184 Sectioner ...... 184 ImportFile ...... 184 HtmlExtraction ...... 185 PreN ...... 185 PostN ...... 186 UpdateN ...... 186 DeleteN ...... 187 HashN ...... 187 IdxWriter Import Task Parameters ...... 188 IdxWriterFileName ...... 188 IdxWriterArchiveDirectory ...... 188 IdxWriterMaxSizeKBs ...... 189 TextToDocs Import Task Parameters ...... 189

• • • ConnectorLib Java SDK Programming Guide • 11 • • Contents

FilenameMatchesRegex ...... 189 ReferenceMatchesRegex ...... 190 FieldMatchesName ...... 190 FieldMatchesRegex ...... 191 ContentContainsRegex ...... 191 MainRangeRegex ...... 192 MainContentRegex ...... 192 MainFieldName ...... 193 MainFieldRegex ...... 193 ChildrenRangeRegex ...... 194 ChildRangeRegex ...... 195 ChildContentRegex ...... 195 ChildFieldName ...... 196 ChildFieldRegex ...... 196 ChildInheritFields ...... 197 ContentReplaceRegex ...... 197 ContentReplaceFormat ...... 198 FieldReplaceName ...... 198 FieldReplaceRegex ...... 199 FieldReplaceFormat ...... 199 DateFieldName ...... 200 DateFieldFormat ...... 200 Sectioner Import Task Parameters ...... 201 SectionerMaxBytes ...... 201 SectionerMinBytes ...... 201 SectionerSeparatorsN ...... 202 Import Service Parameters ...... 203 ExtractDirectory ...... 203 ImportFamilyRootExcludeFmtCSV ...... 203 ImportHashFamilies ...... 204 ImportInheritFieldsCSV ...... 204 ImportMergeMails ...... 205 KeyviewDirectory ...... 205 MaxImportQueueSize ...... 206 RevisionMarks ...... 206 ThreadCount ...... 207 XsltDLL ...... 207 Indexing Parameters ...... 208 ACIPort ...... 208 CompressIndexFiles ...... 208

• • • 12 • ConnectorLib Java SDK Programming Guide • • Contents

DREHost ...... 209 IndexBatchSize ...... 209 IndexOverSocket ...... 209 IndexTimeInterval ...... 210 KillDuplicates ...... 210

Chapter 12  License Configuration Parameters...... 213 Full ...... 214 Holder ...... 214 Key ...... 214 LicenseServerACIPort ...... 215 LicenseServerHost ...... 216 LicenseServerTimeout ...... 216 LicenseServerRetries ...... 217 Operation ...... 217

Chapter 13  Logging Configuration Parameters ...... 219 LogArchiveDirectory ...... 220 LogCompressionMode ...... 221 LogDirectory ...... 221 LogEcho ...... 222 LogExpireAction ...... 222 LogFile ...... 223 LogHistorySize ...... 224 LogLevel ...... 224 LogLevelMatch ...... 225 LogMaxLineLength ...... 227 LogMaxOldFiles ...... 227 LogMaxSizeKBs ...... 228 LogOldAction ...... 229 LogOutputLogLevel ...... 230 LogSysLog ...... 230 LogTime ...... 231 LogTypeCSVs ...... 231

Chapter 14  Secure Socket Layer Parameters ...... 235 SSLConfig ...... 236 SSLCACertificate ...... 238

• • • ConnectorLib Java SDK Programming Guide • 13 • • Contents

SSLCACertificatesPath ...... 238 SSLCertificate ...... 240 SSLCheckCertificate ...... 240 SSLCheckCommonName ...... 241 SSLMethod ...... 241 SSLPrivateKey ...... 242 SSLPrivateKeyPassword ...... 243

Chapter 15  Service Configuration Parameters ...... 245 ServiceACIMode ...... 246 ServiceControlClients ...... 246 ServiceHost ...... 247 ServicePort ...... 247 ServiceStatusClients ...... 248

Chapter 16  Service Actions...... 249 Action Syntax ...... 249 GetConfig ...... 250 GetLogStream ...... 250 GetLogStreamNames ...... 251 GetStatistics ...... 251 GetStatus ...... 259 GetStatusInfo ...... 259 Stop ...... 260 Service Action Parameters ...... 260

Appendixes
  KeyView Classes
  KeyView Formats

Glossary

Index

About This Document

This guide is for readers who need to use the ConnectorLib Java SDK. It is intended for readers who have installed IDOL and are familiar with concepts related to administering a multi-part distributed application.

• Documentation Updates
• Related Documentation
• Conventions
• Autonomy Product References
• Autonomy Customer Support
• Contact Autonomy

Documentation Updates

The information in this document is current as of ConnectorLib Java SDK version 10.0. The content was last modified 10 May 2012. You can retrieve the most current product documentation from Autonomy’s Knowledge Base on the Customer Support Site. A document in the Knowledge Base displays a version number in its name, such as IDOL Server 7.5 Administration Guide. The version number applies to the product that the document describes. The document may also have a revision number in its name, such as IDOL Server 7.5 Administration Guide Revision 6. The revision number applies to the document and indicates that there were revisions to the document since its original release. It is recommended that you periodically check the Knowledge Base for revisions to documents for the products your enterprise is using.


To access Autonomy documentation

1. Go to the Autonomy Customer Support site at https://customers.autonomy.com
2. Click Login.
3. Enter the login credentials that were given to you, and then click Submit.
   The Knowledge Base Search page opens.
4. In the Search box, type a search term or phrase. To browse the Knowledge Base using a navigation tree only, leave the Search box empty.
5. Ensure the Documentation check box is selected.
6. Click Search.
   Documents that match the query display in a results list.
7. To refine the results list, select one or more of the categories in the Filter By pane. You can restrict results by:
   • Product Group. Filters the list by product suite or division. For example, you could retrieve documents related to the iManage, IDOL, Virage or KeyView product suites.
   • Product. Filters the list by product. For example, you could retrieve documents related to IDOL Server, Virage Videologger, or KeyView Filter.
   • Component. Filters the list by a product’s components. For example, you could retrieve documents related to the Content or Category component in IDOL.
   • Version. Filters the list by product or component version number.
   • Type. Filters the list by document format. For example, you could retrieve documents in PDF or HTML format. Guides are typically provided in both PDF and HTML format.
8. To open a document, click its title in the results list.
   To download a PDF version of a guide, open the PDF version, click the Save icon in the PDF reader, and save the PDF to another location.


Related Documentation

The following documents provide more details on ConnectorLib Java SDK:

• IDOL Administration User Guide
  IDOL Administration provides a distributed, Web-based infrastructure for managing IDOL components and services. The IDOL Administration manual describes how to administer IDOL through the IDOL Administration Dashboard and Dashboard console.
• IDOL Server Administration Guide
  IDOL server lies at the center of an Autonomy infrastructure, storing and processing the data that connectors index into it. The IDOL Server Administration Guide describes the operations that IDOL server can perform with detailed descriptions of how to set them up.
• Distributed Index Handler (DIH) Administration Guide
  This guide contains details on how you can use a DIH to distribute aggregated documents across multiple IDOL servers.
• Intellectual Asset Protection System (IAS) Administration Guide
  This guide contains details on how you can use Autonomy’s Intellectual Asset Protection System (IAS) to ensure secure access through authentication and role permissions.
• License Server Administration Guide
  This guide contains details on how you can use a License Server to license multiple Autonomy services.


Conventions

The following conventions are used in this document.

Notational Conventions

This document uses the following conventions.

Convention          Usage

Bold                User-interface elements such as a menu item or button. For example:
                    Click Cancel to halt the operation.

Italics             Document titles and new terms. For example:
                    • For more information, see the IDOL Server Administration Guide.
                    • An action command is a request, such as a query or indexing instruction, sent to IDOL Server.

monospace font      File names, paths, and code. For example:
                    The FileSystemConnector.cfg file is installed in C:\Program Files\FileSystemConnector\.

monospace bold      Data typed by the user. For example:
                    • Type run at the command prompt.
                    • In the User Name field, type Admin.

monospace italics   Replaceable strings in file paths and code. For example:
                    user UserName


Command-line Syntax Conventions

This document uses the following command-line syntax conventions.

Convention          Usage

[ optional ]        Brackets describe optional syntax. For example:
                    [ -create ]

|                   Bars indicate “either | or” choices. For example:
                    [ option1 ] | [ option2 ]
                    In this example, you must choose between option1 and option2.

{ required }        Braces describe required syntax in which you have a choice, and at least one choice is required. For example:
                    { [ option1 ] [ option2 ] }
                    In this example, you must choose option1, option2, or both options.

required            Absence of braces or brackets indicates required syntax in which there is no choice; you must type the required syntax element.

variable            Italics specify items to be replaced by actual values. For example:
                    -merge filename1
                    (In some documents, angle brackets are used to denote these items.)

...                 Ellipses indicate repetition of the same pattern. For example:
                    -merge filename1, filename2 [, filename3 ... ]
                    where the ellipsis stands for filename4, and so on.

The use of punctuation, such as single and double quotes, commas, and periods, indicates actual syntax; it is not part of the syntax definition.


Notices

This document uses the following notices:

CAUTION     A caution indicates an action can result in the loss of data.

IMPORTANT   An important note provides information that is essential to completing a task.

NOTE        A note provides information that emphasizes or supplements important points of the main text. A note supplies information that may apply only in special cases, for example, memory limitations, equipment configurations, or details that apply to specific versions of the software.

TIP         A tip provides additional information that makes a task easier or more productive.

Autonomy Product References

This document references the following Autonomy products:

• Connector Framework Server
• IDOL
• IDOL Server
• Autonomy Distributed Action Handler (DAH)
• Autonomy Distributed Index Handler (DIH)
• Autonomy License Server
• Autonomy Intellectual Asset Protection System (IAS)
• Autonomy KeyView


Autonomy Customer Support

Autonomy Customer Support provides prompt and accurate support to help you quickly and effectively resolve any issue you may encounter while using Autonomy products. Support services include access to the Customer Support Site (CSS) for online answers, expertise-based service by Autonomy support engineers, and software maintenance to ensure you have the most up-to-date technology.

To access the Customer Support Site, go to https://customers.autonomy.com

The Customer Support Site includes:

• Knowledge Base: The CSS contains an extensive library of end user documentation, FAQs, and technical articles that is easy to navigate and search.
• Case Center: The Case Center is a central location to create, monitor, and manage all your cases that are open with technical support.
• Download Center: Products and product updates can be downloaded and requested from the Download Center.
• Resource Center: Other helpful resources appropriate for your product.

To contact Autonomy Customer Support by e-mail or phone, go to http://www.autonomy.com/content/Services/Support/index.en.html

Contact Autonomy

For general information about Autonomy, contact one of the following locations:

Europe and Worldwide

E-mail: [email protected]
Telephone: +44 (0) 1223 448 000
Fax: +44 (0) 1223 448 001
Autonomy Corporation plc
Cambridge Business Park
Cowley Rd
Cambridge CB4 0WZ
United Kingdom

North and South America

E-mail: [email protected]
Telephone: 1 415 243 9955
Fax: 1 415 243 9984
Autonomy, Inc.
One Market Plaza
Spear Tower, Suite 1900
San Francisco CA 94105
USA


PART 1 Getting Started

This section provides an overview of the ConnectorLib Java SDK, installation procedures, and configuration information for both the connector that you develop and the Connector Framework Server.

• Introduction
• Install ConnectorLib Java SDK
• Configure the Connector
• Implement a Connector using the ConnectorLib Java SDK
• Start and Stop the Connector
• Configure Connector Framework Server
• Use Lua Scripts

CHAPTER 1 Introduction

This section provides an overview of the ConnectorLib Java SDK.

• Overview
• About Connector Framework Server
• System Architecture
• Import Process

Overview

The ConnectorLib Java SDK allows you to develop connectors that automatically aggregate documents from any type of local or remote repository and send them to the Connector Framework server (CFS), which then processes the information and indexes it into an Autonomy IDOL server.


Once IDOL server receives the documents, it automatically processes them, performing a number of intelligent operations in real time, such as:

• Agents
• Alerting
• Categorization
• Channels
• Clustering
• Collaboration
• Dynamic Thesaurus
• Expertise
• Hyperlinking
• Mailing
• Profiling
• Retrieval
• Spelling Correction
• Summarization
• Taxonomy Generation

Refer to your IDOL server’s manual for further details.

About Connector Framework Server

The Connector Framework server (CFS) receives information from various connectors, which it then processes and indexes into an IDOL server. A single CFS can be configured to work with multiple connectors and send documents to multiple IDOL servers or Distributed Index Handlers (DIH). In addition, the server can execute predefined tasks on documents just before they are imported, after they are imported, or if errors occur. CFS filters text from a variety of document types with KeyView filters, which are document-specific readers used for text extraction. Users generally do not access KeyView directly; however, the parameter ImportFamilyRootExcludeFmtCSV requires that you identify the desired KeyView document formats.

Related Topics

• “ImportFamilyRootExcludeFmtCSV” on page 203.
• “KeyView Format Codes” on page 265.


System Architecture

There are several ways to install the Connector Framework server. The simplest installation consists of a single CFS, single connector, and single IDOL server.

It is also possible to have more complex configurations, consisting of more than one connector, a Distributed Index Handler (DIH), multiple IDOL servers, or some combination of these options.


Import Process

The import process consists of the following basic steps:

1. The connector sends documents from the data repository to the CFS.
2. Pre-import tasks are performed, which are typically defined in Lua scripts.
3. KeyView filters the document content.
4. Post-import tasks are performed, as defined in the PostN parameters.
5. Optionally, a backup IDX or XML file is created.
6. The data is indexed into IDOL server, or sent to a DIH.
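The order of the import steps can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not SDK code: the class `ImportPipelineSketch`, the `Stage` enum, and the `stagesFor` method are hypothetical names that do not exist in the ConnectorLib Java SDK or in CFS. The sketch models only the sequence of stages and treats the backup IDX or XML file as the one optional step.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the import sequence; none of these names come from the SDK.
final class ImportPipelineSketch {

    // One constant per step of the import process, declared in processing order.
    enum Stage {
        RECEIVE_FROM_CONNECTOR,  // step 1: connector sends documents to the CFS
        PRE_IMPORT_TASKS,        // step 2: typically defined in Lua scripts
        KEYVIEW_FILTERING,       // step 3: KeyView filters the document content
        POST_IMPORT_TASKS,       // step 4: defined in the PostN parameters
        WRITE_BACKUP_FILE,       // step 5: optional backup IDX or XML file
        INDEX_OR_SEND_TO_DIH     // step 6: index into IDOL server or send to a DIH
    }

    private ImportPipelineSketch() {
    }

    // Returns the stages in order, leaving out the backup stage when it is not wanted.
    static List<Stage> stagesFor(boolean writeBackupFile) {
        List<Stage> stages = new ArrayList<>();
        for (Stage stage : Stage.values()) {
            if (stage == Stage.WRITE_BACKUP_FILE && !writeBackupFile) {
                continue;
            }
            stages.add(stage);
        }
        return stages;
    }
}
```

Calling `ImportPipelineSketch.stagesFor(false)` returns the five mandatory stages in order; passing `true` adds the backup stage between the post-import tasks and indexing.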

CHAPTER 2 Install ConnectorLib Java SDK

This section provides information required to install the ConnectorLib Java SDK.

• System Requirements
• Install ConnectorLib Java SDK on Windows

System Requirements

ConnectorLib Java SDK should be installed by a system administrator as part of a larger Autonomy system that includes an Autonomy IDOL server and an interface for the information stored in the IDOL server.

Supported Platforms

The ConnectorLib Java SDK runs on the 32-bit version of Windows. Versions for Solaris and other platforms can be made available.

NOTE The documented platforms are the recommended and most fully tested platforms for IDOL. Autonomy can provide support for other platforms on request.


Minimum Server Requirements

The minimum requirements for Windows are:

• 2 GHz Pentium 4 processor
• 2 GB RAM
• 20 GB hard disk recommended

Install ConnectorLib Java SDK on Windows

To install the ConnectorLib Java SDK, use the following procedure. After the SDK has been installed, you can create your own connector implementation, and then configure ConnectorLib Java to use it. For more information, see “Implement a Connector using the ConnectorLib Java SDK” on page 43.

To install a standalone version of ConnectorLib Java SDK on Windows 1. Double-click ConnectorLibJava_VersionNumber_WINDOWS.exe The installation wizard opens and the Introduction page is displayed. 2. Read the text, and click Next. The License Agreement page opens. 3. Read the license agreement and if you agree to its terms, click I accept the terms of the License Agreement and click Next. The Choose Install Folder page opens. 4. Choose an installation folder for ConnectorLib Java SDK. By default, ConnectorLib Java SDK is installed in C:\Autonomy\ConnectorLibJava, but you can click Choose to choose another location. After you choose an installation folder, click Next. The Service Name page opens. 5. In the Service Name box, type a name to use for the ConnectorLib Java windows service, and click Next. The License Server Configuration page opens. 6. Type the IP address or hostname, and the ACI Port of the license server and click Next. The Java Configuration page opens. 7. Choose whether you want to use the bundled Java VM. Select or clear the check box, and click Next.

• • • 30 • ConnectorLib Java SDK Programming Guide • • Install ConnectorLib Java SDK on Windows

The DRE Database page opens. 8. In the DRE Database box, type the name of the DRE database that ConnectorLib Java should index into, and click Next. The Connector Framework Server page is displayed. 9. Choose whether you want to install a new CFS or use an existing CFS.  To install a new CFS, click Install New CFS and click Next. The Choose CFS Install Folder page is displayed. Go to Step 10.  To use an existing CFS, click Use Existing CFS and click Next. The Connector Framework Server page is displayed.Type the Hostname and Port of your existing CFS installation. Click Next and go to Step 15. 10. Enter the path where you want the Connector Framework Server to be installed and click Next. The Enter Connector Framework Installation Name page opens. 11. Type a unique name for the Connector Framework installation and click Next. This name is used for the Connector Framework installation directory and various files. The name must not contain any spaces. The Connector Framework Service Settings page opens. 12. Enter the following details, and click Next.

Service Port
   The port number that the Connector Framework Server uses to communicate with the license server. This port must not be used by any other service.

Service Status Clients
   The IP addresses of computers that are permitted to access the Connector Framework service status, but are not permitted to control the status. If you want to permit a number of machines to access the Connector Framework service status, use a wildcard. For example, enter 187.*.*.* to permit any machine with an IP address that begins with 187 to access the Connector Framework service status.

Service Control Clients
   The IP addresses of computers that are permitted to control the Connector Framework service. If you want to permit a number of machines to control the Connector Framework service, use a wildcard. For example, enter 187.*.*.* to permit any machine with an IP address that begins with 187 to control the Connector Framework service.

The DRE Settings page opens.


13. Enter the following details and click Next.

IDOL Server Hostname
   The IP address of the IDOL server to which you want to add documents.

ACI Port
   The port number the connector uses to query IDOL server.

   The Connector Framework Server ACI Port page opens.
14. In the ACI Port box, type the port that you want the Connector Framework to listen on, and click Next.
   The Pre-Installation Summary page opens.
15. Review the installation settings. If necessary, click Previous to change any settings. If you are satisfied with the settings, click Install.
   The Installing ConnectorLib Java SDK page opens. The progress of the installation process is indicated.
   The Start Service page opens.
16. Choose whether to start the ConnectorLib Java service, and click Next.
17. Choose whether to start the Connector Framework service, and click Next.
18. The installation is complete. Click Done.

You can now edit the ConnectorLib Java SDK and Connector Framework Server configuration files. You can also start the ConnectorLib Java and Connector Framework services if you did not start them from the installation wizard.


Directory Structure—Windows Once the installation of ConnectorLib Java SDK is completed, your installation directory contains the following files and subdirectories.

Connector Framework Server Directory Structure

The Connector Framework server (CFS) is installed to the ConnectorFramework directory, which is at the same level as the ConnectorLib Java SDK installation directory. The ConnectorFramework installation directory contains the following files and subdirectories (note that bold font indicates directories).

File Description

convtables Contains various text files used during the importing process.

filters Contains executables used during the importing process.

jre Contains Java Runtime Environment for the uninstaller.

scripts Contains Lua scripts.

Uninstall_ConnectorFramework Files required for uninstalling Connector Framework server.

ConnectorFramework.cfg Connector Framework server configuration file.

ConnectorFramework.exe Connector Framework server executable.

ConnectorFramework_InstallLog.log Installation log file that lists the details of the installation process.

lua.dll Lua library.

version.txt Text file containing version information.


When you start the Connector Framework server for the first time, the following files are created:

File Description

logs Contains CFS log files. By default, this includes action.log, import.log, and indexer.log.

actions Contains queued asynchronous actions so that, if the server goes down, the actions are not lost. When the server comes back up, the queued actions are processed.

uid Contains document tracking files.

autn_ntres.dll NT resource library.

ConnectorFramework.lck Lock file which prevents multiple instances of CFS running simultaneously.

license.log License log file.

licensekey.txt License information text document.

portinfo.dat File that lists the ports that the connector is using.

service.log Service actions log file.

ConnectorLib Java SDK Directory Structure

By default, the ConnectorLib Java SDK is installed to the C:\Autonomy\ConnectorLibJava directory. The ConnectorLib Java SDK installation directory contains the following files and subdirectories (note that bold font indicates directories).

File Description

example The source code for the sample implementation.

jre Contains the Java Runtime Environment.

lib Various JAR files.

logs Contains various configured log files.

Uninstall_ConnectorName Files required to uninstall ConnectorLib Java SDK.

autn_ntres.dll NT resource library.

autpassword.exe Autonomy Password Encryption Utility, which allows you to encrypt the passwords.


ConnectorLibJava.cfg ConnectorLib Java configuration file.

ConnectorLibJava.dll This file is used when running from the command line through Java.

ConnectorLibJava.exe ConnectorLib Java SDK executable.

ConnectorLibJava_InstallLog.log Installation log file that lists the details of the installation process.

license.html License information in HTML format.

license.txt License information in text format.

lua5.1.dll Lua library

ReleaseNotes.pdf Release notes for the ConnectorLib Java SDK.

service.log Service actions log file.

vcredist.exe Visual Studio redistributable file.

When you start ConnectorLib Java SDK for the first time, the following files are created:

File Description

license Contains license information.

uid Contains document tracking files.

ConnectorLibJava.lck Lock file which prevents multiple instances of ConnectorLibJava from running concurrently.

license.log License log file.

licensekey.txt Text file that lists license information.

portinfo.dat File that lists the port information.

CHAPTER 3  Configure the Connector

The configuration settings are stored in the connector configuration file that you construct. Use the information in this chapter to modify parameters directly in the configuration file using a text editor, to set up log streams in the configuration file, and to encrypt any passwords that you enter into a configuration field.

• Modify Parameters
• Encrypt Passwords
• Set Up Log Streams

Modify Parameters

The following section describes how to enter parameter values in the configuration file.

Enter Boolean Values

The following settings for Boolean parameters are interchangeable:

TRUE = true = ON = on = Y = y = 1
FALSE = false = OFF = off = N = n = 0
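The interchangeable spellings above can be captured in a small lookup. The following is a minimal sketch and not part of the ConnectorLib Java SDK: the class name BooleanSetting and its parse method are hypothetical, and accepting mixed-case spellings (such as True) or surrounding whitespace is an assumption that goes beyond the list above.

```java
import java.util.Locale;

/** Sketch: interpret a configuration value using the interchangeable Boolean spellings. */
final class BooleanSetting {
    private BooleanSetting() {}

    /** Returns the Boolean meaning of a recognized spelling; throws for anything else. */
    static boolean parse(String raw) {
        // Assumption: case and surrounding whitespace are ignored.
        String value = raw.trim().toUpperCase(Locale.ROOT);
        switch (value) {
            case "TRUE":
            case "ON":
            case "Y":
            case "1":
                return true;
            case "FALSE":
            case "OFF":
            case "N":
            case "0":
                return false;
            default:
                throw new IllegalArgumentException("Not a Boolean setting: " + raw);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("on"));
        System.out.println(parse("N"));
    }
}
```

Rejecting unrecognized values, rather than silently treating them as false, is a design choice of this sketch; it makes typing mistakes in a configuration file visible early.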


Enter String Values

Some parameters require string values that contain quotation marks. Escape each quotation mark by inserting a backslash before it. For example:

FIELDSTART0=""

Here, the beginning and end of the string are indicated by quotation marks, while all quotation marks that are contained in the string are escaped.

If you want to enter a comma-separated list of strings for a parameter, and one of the strings contains a comma, you must indicate the start and the end of this string with quotation marks. For example:

ParameterName=cat,dog,bird,"wing,beak",turtle

If any string in a comma-separated list contains quotation marks, you must put this string into quotation marks and escape each quotation mark in the string by inserting a backslash before it. For example:

ParameterName="",dog,bird,"wing,beak",turtle
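The quoting rules above can be applied mechanically when a list value is generated by a program. The following is a minimal sketch, assuming only the rules stated in this section; the class ListValue and its methods are hypothetical helpers and are not part of the ConnectorLib Java SDK.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch: build a comma-separated parameter value following the quoting rules above. */
final class ListValue {
    private ListValue() {}

    /** Wraps an item in quotation marks when it contains a comma or a quotation mark. */
    static String quoteIfNeeded(String item) {
        boolean hasQuote = item.indexOf('"') >= 0;
        boolean hasComma = item.indexOf(',') >= 0;
        if (!hasQuote && !hasComma) {
            return item;
        }
        // Escape each embedded quotation mark with a backslash, then wrap the item.
        return "\"" + item.replace("\"", "\\\"") + "\"";
    }

    /** Joins the items with commas, quoting each item only where required. */
    static String join(List<String> items) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < items.size(); i++) {
            if (i > 0) {
                out.append(',');
            }
            out.append(quoteIfNeeded(items.get(i)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("cat", "dog", "bird", "wing,beak", "turtle");
        System.out.println("ParameterName=" + join(items));
    }
}
```

With the items shown in main, the sketch produces the same value as the second example in this section.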

Encrypt Passwords

For added security, it is recommended that you encrypt all passwords before you enter them into a configuration field. To encrypt passwords, follow the steps below.

To encrypt passwords

1. At a command prompt, change directories to InstallDir\ConnectorName.
2. Enter one of the following strings:

   autpassword -e -tEncryptionType [options] PasswordString
   autpassword -d PasswordString
   autpassword -x -tEncryptionType [options]


where,

Option Description

-e Encrypts the password.

-d Decrypts the password.

-x Performs the operation specified by the -o option. See Options.

-tEncryptionType The type of encryption used. The following options are available:
   • Basic
   • AES
   You may use either form of encryption. However, AES is more secure than basic encryption.

PasswordString The password to encrypt or decrypt.

Options Options can be one of the following:
   • -oOptionName=OptionValue. OptionName can be:
     KeyFile. Specifies the path and filename of a keyfile. It should contain 64 hexadecimal characters. This option is only available with the AES encryption type and the -x option.
   • -c. The configuration filename in which to write the encrypted password. This option is only available with the -e argument.
   • -s. The name of the section in the configuration file in which to write the password. This option is only available with the -e argument.
   • -p. The parameter name in which to write the encrypted password. This option is only available with the -e argument.
   When writing the password to a configuration file, you must specify all related options: -c, -s, and -p.

Example:

autpassword -e -tBASIC -c./Config.cfg -sDefault -pPassword passw0r
autpassword -d passw0r
autpassword -x -tAES -oKeyFile=./MyKeyFile.ky
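If you generate autpassword command lines from a script or build tool, the rule that -c, -s, and -p must be specified together is easy to enforce before the command is run. The following is a minimal sketch that only assembles the argument list; it does not run autpassword. The class AutpasswordArgs is hypothetical, and it assumes that option values are attached directly to their flags, as in the example above.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: assemble the arguments for "autpassword -e", enforcing the -c/-s/-p rule. */
final class AutpasswordArgs {
    private AutpasswordArgs() {}

    /**
     * Builds the argument list for encrypting a password. Pass null for configFile,
     * section and parameter when the result should not be written to a configuration file.
     */
    static List<String> encrypt(String encryptionType, String password,
                                String configFile, String section, String parameter) {
        int supplied = (configFile != null ? 1 : 0)
                + (section != null ? 1 : 0)
                + (parameter != null ? 1 : 0);
        if (supplied != 0 && supplied != 3) {
            throw new IllegalArgumentException("Specify all of -c, -s and -p, or none of them");
        }
        List<String> args = new ArrayList<String>();
        args.add("autpassword");
        args.add("-e");
        args.add("-t" + encryptionType);
        if (supplied == 3) {
            args.add("-c" + configFile);
            args.add("-s" + section);
            args.add("-p" + parameter);
        }
        args.add(password);
        return args;
    }

    public static void main(String[] args) {
        List<String> command = encrypt("BASIC", "passw0r", "./Config.cfg", "Default", "Password");
        System.out.println(String.join(" ", command));
    }
}
```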


Set Up Log Streams

Use the following information to set up your own log streams. Each log stream creates a separate log file in which specific log message types (for example, action, index, application, or import) are logged.

To set up log streams

1. Open the configuration file in a text editor.
2. Find the [Logging] section. (If the configuration file does not contain a [Logging] section, create one.)
3. Under the [Logging] section's heading, create a list of the log streams you want to set up using the format N=LogStreamName. For example:

   [Logging]
   LogLevel=FULL
   LogDirectory=logs
   0=ApplicationLogStream
   1=ActionLogStream
   2=JavaLogStream
   3=SynchronizeLogStream
   4=InsertLogStream
   5=CollectLogStream
   6=ViewLogStream
   7=DeleteLogStream
   8=HoldLogStream
   9=UpdateLogStream

   In this example, log streams are defined which report application and action messages, messages for the various fetch actions that can be implemented, and messages written to System.out and System.err. Note that the log streams are listed in consecutive order, starting from 0 (zero).
4. Create a new section for each of the log streams you defined. Each section must have the same name as the log stream. For example:

   [ApplicationLogStream]
   [ActionLogStream]
   [JavaLogStream]
   [SynchronizeLogStream]
   [InsertLogStream]
   [CollectLogStream]
   [ViewLogStream]
   [DeleteLogStream]
   [HoldLogStream]
   [UpdateLogStream]


5. Specify the settings you want to apply to each log stream in the appropriate log stream's section. You can specify the type of logging that should be performed (for example, full logging), whether log messages should be displayed on the console, the maximum size of log files, and so on. For example:

   [ApplicationLogStream]
   logfile=application.log
   logtypecsvs=application

   [ActionLogStream]
   logfile=action.log
   logtypecsvs=action

   [SynchronizeLogStream]
   LogFile=synchronize.log
   LogTypeCSVs=synchronize

   [JavaLogStream]
   LogFile=java.log
   LogTypeCSVs=java

   [InsertLogStream]
   LogFile=insert.log
   LogTypeCSVs=insert

   [CollectLogStream]
   LogFile=collect.log
   LogTypeCSVs=collect

   [ViewLogStream]
   LogFile=view.log
   LogTypeCSVs=view

   [DeleteLogStream]
   LogFile=delete.log
   LogTypeCSVs=delete

   [HoldLogStream]
   LogFile=hold.log
   LogTypeCSVs=hold

   [UpdateLogStream]
   LogFile=update.log
   LogTypeCSVs=update

6. Save and close the configuration file.
7. Restart the service to apply your changes.
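Because the stream list must be numbered consecutively from 0 and every listed stream needs a section of the same name, it can be convenient to generate the [Logging] block rather than type it. The following is a minimal sketch under those two rules only; the class LogStreamConfig is hypothetical, and the naming pattern (log type with an initial capital plus LogStream, written to a file named after the log type) simply mirrors the examples in this procedure.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch: render a [Logging] section and one section per log stream. */
final class LogStreamConfig {
    private LogStreamConfig() {}

    /** Derives a stream name such as ActionLogStream from a log type such as action. */
    static String streamName(String logType) {
        return Character.toUpperCase(logType.charAt(0)) + logType.substring(1) + "LogStream";
    }

    /** Renders the configuration text for the given log types, numbering streams from 0. */
    static String render(List<String> logTypes) {
        StringBuilder cfg = new StringBuilder("[Logging]\nLogLevel=FULL\nLogDirectory=logs\n");
        for (int i = 0; i < logTypes.size(); i++) {
            cfg.append(i).append('=').append(streamName(logTypes.get(i))).append('\n');
        }
        for (String logType : logTypes) {
            cfg.append('\n');
            cfg.append('[').append(streamName(logType)).append("]\n");
            cfg.append("LogFile=").append(logType).append(".log\n");
            cfg.append("LogTypeCSVs=").append(logType).append('\n');
        }
        return cfg.toString();
    }

    public static void main(String[] args) {
        System.out.print(render(Arrays.asList("application", "action", "java", "synchronize")));
    }
}
```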


Related Topics
• “Logging Configuration Parameters” on page 219

CHAPTER 4  Implement a Connector using the ConnectorLib Java SDK

This section describes how to use the ConnectorLib Java SDK.

• Overview
• Create a New Connector based on ConnectorLibJava
• Run the Connector
• Implement the Synchronize Action
• Configuration and Logging
• Debug the Connector
• Implement Other Actions
• Documents
• Datastore
• Ingester Class
• Ingest Result Handler
• Additional Information


Overview

ConnectorLibJava is an implementation of connectorlib for Java. It is compatible with Java 1.4 and later. You can use it to implement a connector that supports any or all of the following actions:

• Synchronize
• Insert
• Collect
• Update
• View
• Hold
• Delete
• ReleaseHold

Create a New Connector based on ConnectorLibJava

To implement a new connector, extend the class com.autonomy.connector.ConnectorBase:

package com.mycompany.connector;

import com.autonomy.connector.Config;
import com.autonomy.connector.ConnectorBase;
import com.autonomy.connector.Log;

public class MyConnector extends ConnectorBase
{
    public MyConnector(Config config, Log log)
    {
        super("My Connector");
    }
}

The string passed to the ConnectorBase constructor (in this case My Connector) is the name of the connector. To run the connector, you must have a matching license in your license server.

This is all that is required to create a new connectorLibJava-based connector. To build the connector, you must include JavaConnector.jar in the compile classpath.

NOTE A sample program is located in the example folder in the installation directory.


Run the Connector

To run the connector, use the following instructions. This procedure assumes that you have built the connector into a jar called MyConnector.jar.

To run the connector

1. Copy MyConnector.jar into the lib directory of your connectorLibJava installation. The connectorLibJava directory should contain at least the following:

   lib\MyConnector.jar
   lib\JavaConnector.jar
   connectorLibJava.cfg
   connectorLibJava.exe
   lua5.1.dll

2. Update connectorLibJava.cfg to contain at least the following:

   [License]
   LicenseServerHost=server
   LicenseServerACIPort=20000

   [service]
   ServicePort=7003
   ServiceStatusClients=*
   ServiceControlClients=*

   [server]
   Port=7002

   [Logging]
   LogLevel=FULL
   LogEcho=TRUE
   LogDirectory=logs
   LogMaxSizeKBs=-1
   0=ApplicationLogStream
   1=ActionLogStream
   2=JavaLogStream

   [ApplicationLogStream]
   LogFile=application.log
   LogTypeCSVs=application

   [ActionLogStream]
   LogFile=action.log
   LogTypeCSVs=action


   [JavaLogStream]
   LogFile=java.log
   LogTypeCSVs=java

   [Connector]
   JavaClassPath=lib\JavaConnector.jar;lib\MyConnector.jar
   JavaConnectorClass=com.mycompany.connector.MyConnector

3. Configure the [License] section to point to your license server.
4. Run the connector.

If the connector stops and writes the error "No license found for My Connector" to license.log, confirm that you have a license matching the string passed to the ConnectorBase constructor, and that your [License] section is set up correctly.

With the default configuration provided with connectorLibJava, the connector outputs the log message "Exception when starting Scheduled Tasks: All connector features are currently disabled". This is because a synchronize job is configured, but synchronize has not yet been implemented. To implement a synchronize action, see “Implement the Synchronize Action” on page 46.

Implement the Synchronize Action

The initial implementation does not support any actions. To implement an action, override the appropriate method in ConnectorBase and provide an implementation. The following example is a simple implementation of the synchronize action. You would add this code to your connector class.

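@Override
public void synchronize(SynchronizeTask task)
{
    // Create a document for this task, using the given reference.
    DocInfo docInfo = new DocInfo(task.taskConfig, "http://www.example.com/");

    // Set a property in the document identifier.
    docInfo.getId().setProperty("IdentifierProperty", "Some Value");

    // Add a metadata field to the document.
    docInfo.getDoc().addFieldValue("Some Field", "Field Value");

    // Send the document for ingestion.
    task.ingester.add(docInfo);
}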


This example also shows how to create a document and send it for ingestion. A DocInfo object is created to store all the information about a document. It then has a property set in its identifier, and a field value added. Finally it is sent to the ingester. At this point you might want to ensure the configuration file includes settings for a synchronize log, and for ingestion:

[Logging]
3=SynchronizeLogStream

[SynchronizeLogStream]
LogFile=synchronize.log
LogTypeCSVs=synchronize

[Ingestion]
IngesterType=CFS
IngestHost=localhost
IngestPort=7000

If you run a CFS on the appropriate port and issue a synchronize action to your connector, you will see that one document is sent to CFS. When sending the action, you should specify the task name in the synchronize command or in the configuration file:

Synchronize Command
   http://localhost:7002/action=fetch&fetchaction=synchronize&tasksections=MyTask

Configuration File
   [FetchTasks]
   Number=1
   0=MyTask

   [MyTask]
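When the synchronize action is issued from a test harness rather than a browser, the command string can be assembled in code. The following is a minimal sketch that only builds the URL shown above; it does not send the request. The class SynchronizeUrl is hypothetical, and URL-escaping the task name is an assumption made for safety with names that contain reserved characters.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Sketch: build the synchronize command for a named task section. */
final class SynchronizeUrl {
    private SynchronizeUrl() {}

    /** URL-escapes a value using UTF-8. */
    static String urlEscape(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Returns the fetch command that asks the connector to synchronize the given task. */
    static String build(String host, int port, String taskName) {
        return "http://" + host + ":" + port
                + "/action=fetch&fetchaction=synchronize&tasksections=" + urlEscape(taskName);
    }

    public static void main(String[] args) {
        System.out.println(build("localhost", 7002, "MyTask"));
    }
}
```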

Configuration and Logging

The taskConfig member of ConnectorTask provides access to configuration settings:

String directory = task.taskConfig.read("Directory");
String password = task.taskConfig.readPassword("Password");
int maxSize = task.taskConfig.read("MaxSize", 14);
boolean doXYZ = task.taskConfig.read("DoXYZ", false);

The settings come from the appropriate configuration section in the configuration file (looking in default sections if necessary) or from the action sent to the connector.

Logging

Logging is performed using the log member of ConnectorTask:

task.log.writeln(Log.NORMAL, "Processing XYZ");
task.log.writeln(Log.WARNING, "Watch out!");

The Java Log Stream

An additional 'Java' log stream is also provided. Any messages written to System.out and System.err by the connector are logged to this stream.

Debug the Connector

A library version of ConnectorLibJava is supplied to simplify the process of debugging a Java connector. This allows you to start the connector in a Java debugger.

To debug the connector, start the application from the ConnectorBase main class. Pass the name of the ConnectorLibJava library as the first argument. By default it will look for a configuration file by appending .cfg to the library name. If you need to change this, use the -configfile parameter. For example:

java -cp ClassPath com.autonomy.connector.ConnectorBase LibraryName -configfile ConfigFile

java -cp com.autonomy.connector.ConnectorBase connectorlibjava -configfile connectorlibjava.cfg

Implement Other Actions

Implementation of the other actions is similar to implementation of the synchronize action (see “Implement the Synchronize Action” on page 46). To implement the other actions, override the appropriate method.


The methods you can override are: synchronize, collect, delete, insert, update, view and hold. Each takes an appropriate implementation of ConnectorTask. The synchronize method is passed a SynchronizeTask, the collect action is passed a CollectTask and so forth. The JavaDoc pages for the various task types give more information about implementing those actions. Hold and ReleaseHold are both implemented by the hold method, which takes a boolean indicating whether it should apply a hold to a document, or release an existing hold. All of the tasks provide access to:

• An ingester.
• datastoreFilename. This is where the connector should store any persistent state information for the task.
• log. This provides a method to log messages to the appropriate log stream.
• taskConfig. This provides access to the configuration settings specific to this action.
• taskName. The name of the task (the configuration file section name).
• tempDirectory. The TempDirectory as specified in the server configuration file. Any temporary files should be created here and deleted when no longer needed.
• stop(). The stop method returns a boolean to indicate whether the action should terminate as soon as possible (the action implementation should monitor this regularly). This is true if the user has instructed the connector to stop.
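The advice to monitor stop() regularly usually translates into checking it once per unit of work. The following is a minimal, SDK-independent sketch of that pattern: the BooleanSupplier stands in for a call to task.stop(), and the class StoppableLoop and the idea of processing a list of item names are hypothetical. It requires Java 8 or later for java.util.function.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BooleanSupplier;

/** Sketch: check a stop signal before each unit of work. */
final class StoppableLoop {
    private StoppableLoop() {}

    /**
     * Processes items in order until all are done or a stop is requested.
     * Returns the number of items that were processed.
     */
    static int processAll(List<String> items, BooleanSupplier stopRequested) {
        int processed = 0;
        for (String item : items) {
            if (stopRequested.getAsBoolean()) {
                // Terminate as soon as possible, leaving remaining items untouched.
                break;
            }
            // Real work on the item (for example, building a document) would go here.
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "b", "c");
        System.out.println(processAll(items, () -> false));
    }
}
```

In a connector, the supplier would typically be a lambda that calls the stop method of the task passed to the action.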

Documents

DocInfo Class

This class has three members:

• document. The document has methods for setting and retrieving the document reference, metadata and content.
• identifier. The identifier holds a reference and a collection of properties that can be used to identify the document within the repository so it can be collected back. During a synchronize action a connector should set whatever identifier properties it will need to retrieve the document if it is passed the identifier as part of another action (such as view, collect, hold or update).
• file. The file represents a file on disk containing the content of the document. A file can be "owned", in which case it is deleted automatically when no longer needed.

These should be accessed through the accessors:

• getDoc()
• getId()
• getFile() and setFile(file)

For most actions (except Synchronize) the action is passed a collection of documents. For these, the metadata and properties of the document and identifier objects can be modified, but the objects themselves should not be replaced (hence there are no setId() or setDoc() methods). The setFile(file) method is provided so that Synchronize, Collect and View actions can assign the collected file into the DocInfo object. Synchronize can alternatively do this directly by passing the filename when constructing the DocInfo.

The Synchronize action will usually create new DocInfo objects for ingestion. Most other actions should not create new DocInfo objects. These are passed a set of DocInfo objects to act upon. For example, the CollectTask contains a set of DocInfo objects, one for each document to be collected. The DocInfo objects are initially blank except for the identifier, which the connector can use to find the document in the repository. The implementation should populate these DocInfo objects with content and metadata from the repository and then report success or failure as documented in the JavaDoc.

The success and failed methods must be called using the provided documents, and not by constructing new instances of DocInfo. (A call to success or failed on an unrecognized DocInfo object will most likely be ignored.)

Identifiers

When the connector creates a new DocInfo, it should give it an identifier which can be used to uniquely identify the repository document it was created for. This will be used by the view and collect actions to retrieve that document from the repository. The Identifier contains the following information:

• The name of the configuration section defining the task that retrieved the document. The same configuration information should ideally be used when performing retrieval as when performing a Synchronize operation.
• The document reference. This can be omitted if the retrieval does not require the reference.
• The repository specific parameters that identify the document within the repository or how to retrieve it. It would be normal to allow any non-document-specific data to also be specified in the configuration file and be overridden by the value in the Identifier if present. Any logon details or other sensitive information should not be stored in the Identifier and should only appear in the configuration file.
• The Identifier should also contain the sub file information for the sub files of a document once it has been ingested.

The Identifier string is stored and constructed by the Identifier class.

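Example Identifier

The following example illustrates an unencoded Identifier (not for any real repository):

<id s="MyTask1" r="http://myserver:4567/doc/_vxswdfguhjknbio_earycqzt_"><p n="SERVICEURL" v="http://myserver:4567/service"/><p n="DOCID" v="_vxswdfguhjknbio_earycqzt_"/></id>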

The Identifier itself is constructed by Base64 encoding the entire XML to give the following for the first example above:

PGlkIHM9Ik15VGFzazEiIHI9Imh0dHA6Ly9teXNlcnZlcjo0NTY3L2RvYy9fdnhzd2 RmZ3VoamtuYmlvX2VhcnljcXp0XyI+PHAgbj0iU0VSVklDRVVSTCIgdj0iaHR0cDov L215c2VydmVyOjQ1Njcvc2VydmljZSIvPjxwIG49IkRPQ0lEIiB2PSJfdnhzd2RmZ3 VoamtuYmlvX2VhcnljcXp0XyIvPjwvaWQ+

Note that this should be URL-escaped as normal to pass to one of the actions. It is not necessary for implementations of connectorLibJava to perform this encoding or escaping. Use the getId() method of the DocInfo class to get access to the Identifier class and then set whichever properties are required. The identifier will be encoded and escaped as necessary by connectorLib. Similarly, any identifiers passed to the connector will be unescaped and decoded as necessary by connectorLib, and can be accessed using the getId() method.
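Although connectorLib performs the encoding for you, it can be useful while debugging to encode or decode an identifier by hand, for example to read an identifier copied from a log or an action URL. The following is a minimal sketch using the standard java.util.Base64 class (Java 8 or later). The class IdentifierEncoding is hypothetical and not part of the SDK, and the XML constant is the text that the encoded example in this section decodes to.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Sketch: Base64-encode, decode and URL-escape an identifier for inspection. */
final class IdentifierEncoding {
    private IdentifierEncoding() {}

    /** The unencoded form of the example identifier in this section. */
    static final String EXAMPLE_XML =
            "<id s=\"MyTask1\" r=\"http://myserver:4567/doc/_vxswdfguhjknbio_earycqzt_\">"
            + "<p n=\"SERVICEURL\" v=\"http://myserver:4567/service\"/>"
            + "<p n=\"DOCID\" v=\"_vxswdfguhjknbio_earycqzt_\"/></id>";

    /** Base64-encodes the UTF-8 bytes of the XML. */
    static String encode(String xml) {
        return Base64.getEncoder().encodeToString(xml.getBytes(StandardCharsets.UTF_8));
    }

    /** Decodes a Base64 identifier back to its XML text. */
    static String decode(String identifier) {
        return new String(Base64.getDecoder().decode(identifier), StandardCharsets.UTF_8);
    }

    /** URL-escapes an encoded identifier so it can be passed in an action. */
    static String urlEscape(String identifier) {
        try {
            return URLEncoder.encode(identifier, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String encoded = encode(EXAMPLE_XML);
        System.out.println(encoded);
        System.out.println(urlEscape(encoded));
        System.out.println(decode(encoded));
    }
}
```

URL-escaping matters here because the Base64 alphabet includes the plus sign, which would otherwise be read as a space in a query string.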

Sub File Indices

In order for Collect and View to retrieve sub files of containers, the Identifier is expanded by appending the KeyView sub file indices. This is done during ingestion when associating the Identifier with the document's children.


The following shows how the indices are appended for a container file (ID represents the top-level Base64 Identifier):

ID
|-- ID|0
|-- ID|1
\-- ID|2
    |-- ID|2.0
    |-- ID|2.1
    |   |-- ID|2.1.0
    |   |-- ID|2.1.1
    |-- ID|2.2

A general Identifier with the indices appended will look like:

ID|Index0.Index1.<...>.IndexN

For example (for the 7th child of the 3rd child of the 2nd child of the top level document):

PGlkIHM9Ik15VGFzazEiIHI9Imh0dHA6Ly9teXNlcnZlcjo0NTY3L2RvYy9fdnhzd2 RmZ3VoamtuYmlvX2VhcnljcXp0XyI+PHAgbj0iU0VSVklDRVVSTCIgdj0iaHR0cDov L215c2VydmVyOjQ1Njcvc2VydmljZSIvPjxwIG49IkRPQ0lEIiB2PSJfdnhzd2RmZ3 VoamtuYmlvX2VhcnljcXp0XyIvPjwvaWQ+|1.2.6

During synchronize, it is CFS that extracts sub files. The Append Sub File Indices with Lua section explains how to set up CFS to include the sub file indices in the identifier. During view or collect, the subfile extraction is performed internally so the connector should only need to retrieve the top level document, and can ignore the indices. To support extraction of main sub files of a container, an index can be replaced at any level by the letter 'M'. This means that the sub file marked as the main sub file by KeyView would be extracted. For example '...|1.M.M' would refer to the main sub file of the main sub file of the second child of the top level document.
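The structure of an expanded identifier can be illustrated by splitting it back into its parts. The following is a minimal sketch for inspection and testing only, since connectorLib handles the indices internally during view and collect. The class SubFileIndices is hypothetical, and representing the letter M with the constant MAIN (-1) is a choice made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: split an expanded identifier into its top-level part and sub file indices. */
final class SubFileIndices {
    private SubFileIndices() {}

    /** Marker used in this sketch for an 'M' (main sub file) entry. */
    static final int MAIN = -1;

    /** Returns the top-level identifier, without any appended indices. */
    static String topLevel(String identifier) {
        // The Base64 alphabet does not contain '|', so the first '|' is the separator.
        int bar = identifier.indexOf('|');
        return bar < 0 ? identifier : identifier.substring(0, bar);
    }

    /** Returns the zero-based indices in order, or an empty list if none are appended. */
    static List<Integer> indices(String identifier) {
        List<Integer> result = new ArrayList<Integer>();
        int bar = identifier.indexOf('|');
        if (bar < 0 || bar == identifier.length() - 1) {
            return result;
        }
        for (String part : identifier.substring(bar + 1).split("\\.")) {
            result.add("M".equals(part) ? MAIN : Integer.parseInt(part));
        }
        return result;
    }

    public static void main(String[] args) {
        // "abc" stands in for a real Base64 identifier.
        System.out.println(topLevel("abc|1.2.6"));
        System.out.println(indices("abc|1.2.6"));
        System.out.println(indices("abc|1.M.M"));
    }
}
```

Because the indices are zero-based, the list 1, 2, 6 addresses the 7th child of the 3rd child of the 2nd child, matching the example above.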

Append Sub File Indices with Lua

The following Lua script will append the sub file indices written to the document's SubFileIndexCSV field during import to the AUTN_IDENTIFIER field:

function handler( document )
    local identifier = document:getFieldValue( "AUTN_IDENTIFIER" )
    if identifier then
        local indices = document:getFieldValue( "SubFileIndexCSV" )
        if indices then
            -- Convert the comma-separated indices to dot-separated form.
            indices = string.gsub(indices, ",", ".")
            local value = identifier .. "|" .. indices
            document:setFieldValue( "AUTN_IDENTIFIER", value)
        end
    end
    return true
end

This Lua script could be configured as a CFS post task. For it to work properly, you must ensure that sub files inherit the AUTN_IDENTIFIER field. You can do this by including it in the fields listed in the ImportInheritFieldsCSV parameter.

Datastore

ConnectorLibJava provides access to the datastore library, which can be used to store and retrieve any state required by the connector. Often the synchronize action will make use of this so that it can determine what documents have been added, updated or deleted since it was last used.

A datastore is a single file on disk, normally with the extension .db. You might also see a journal file when the datastore is in use. Information in a datastore is stored in tables. A datastore can contain a number of tables, each of which has a set of columns defined by the connector.

Configure the Datastore Tables

To access a datastore you must first create a Datastore object:

Datastore datastore = new Datastore(task.datastoreFilename, task.log);

The constructor takes a filename and a log stream. Both of these are provided in the ConnectorTask object. When you use the filename provided in the ConnectorTask, the connector framework ensures that the datastore for each task has a unique name, and that configuration parameters relating to the datastore (such as SynchronizeKeepDatastore) are respected. The Datastore object can be used to create tables:

datastore.createTable("MyTable",
    new String[] { "Url", "ModifiedDate", "Seen" },
    new String[] { "Url" });


This example specifies a table called MyTable with the columns Url, ModifiedDate and Seen. The third argument specifies the primary key for the table, which in this case is the Url column. You should call createTable for each table every time you want to use the datastore. If a table does not exist, it is created. If a table does exist, then the datastore library verifies that the table has the expected format.

Insert Records

To insert a record, populate a new DatastoreRecord object and then call the recordInsert method with the required table name:

DatastoreRecord recordOne = new DatastoreRecord();
recordOne.setString("Url", "http://www.example.com/one");
recordOne.setString("ModifiedDate", "2012-04-02 10:20");
recordOne.setString("Seen", "1");
datastore.recordInsert("MyTable", recordOne);

Update Records

To update a record or records, use the recordUpdate method:

DatastoreRecord filter = new DatastoreRecord();
filter.setString("Url", "http://www.example.com/one");
DatastoreRecord update = new DatastoreRecord();
update.setString("ModifiedDate", "2012-04-02 11:22");
datastore.recordUpdate("MyTable", filter, update);

The filter is a datastore record that is used to specify the records that should be updated. If fields are specified in the filter, only records that match those field values exactly are updated. An empty filter matches all records. In the example above, the filter is used to match exactly one record: it specifies the required value for Url, which is the primary key.

The third argument to recordUpdate (update) specifies the update to be performed. Any fields set in this record are updated to the provided values in any records that match the filter. All other fields remain unaltered.
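The filter and update semantics described above can be modelled without the datastore library. The following is a minimal in-memory sketch in which plain maps stand in for DatastoreRecord objects and a list stands in for a table; the class FilterModel and all of its methods are hypothetical, and the sketch makes no claim about how the datastore library is implemented.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch: in-memory model of exact-match filters and partial updates. */
final class FilterModel {
    private FilterModel() {}

    /** Builds a record from alternating field names and values. */
    static Map<String, String> record(String... namesAndValues) {
        Map<String, String> fields = new LinkedHashMap<String, String>();
        for (int i = 0; i + 1 < namesAndValues.length; i += 2) {
            fields.put(namesAndValues[i], namesAndValues[i + 1]);
        }
        return fields;
    }

    /** A record matches when every field set in the filter has exactly that value. */
    static boolean matches(Map<String, String> record, Map<String, String> filter) {
        for (Map.Entry<String, String> wanted : filter.entrySet()) {
            if (!wanted.getValue().equals(record.get(wanted.getKey()))) {
                return false;
            }
        }
        return true; // An empty filter matches every record.
    }

    /** Applies the changed fields to every matching record; other fields are unaltered. */
    static int update(List<Map<String, String>> table,
                      Map<String, String> filter, Map<String, String> changes) {
        int updated = 0;
        for (Map<String, String> row : table) {
            if (matches(row, filter)) {
                row.putAll(changes);
                updated++;
            }
        }
        return updated;
    }

    /** Builds a two-row table and updates the row whose Url matches the filter. */
    static List<Map<String, String>> demo() {
        List<Map<String, String>> table = new ArrayList<Map<String, String>>();
        table.add(record("Url", "http://www.example.com/one",
                "ModifiedDate", "2012-04-02 10:20", "Seen", "1"));
        table.add(record("Url", "http://www.example.com/two",
                "ModifiedDate", "2012-04-01 09:00", "Seen", "0"));
        update(table, record("Url", "http://www.example.com/one"),
                record("ModifiedDate", "2012-04-02 11:22"));
        return table;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```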

Remove Records

To remove records, specify a filter and use the recordRemove method. The filter works in the same way as for updating records (see “Update Records” on page 54). Any records that match the filter are removed:

DatastoreRecord filter = new DatastoreRecord();
filter.setString("Url", "http://www.example.com/one");
datastore.recordRemove("MyTable", filter);

Commit Changes

When you insert, update or remove records, the changes are not committed to the datastore immediately. Instead, changes that you make are saved and committed at a later time. Until changes are committed, any attempt to select records will act on the old data, not the new. If you want to force your changes to be committed, you can use the processQueue method:

datastore.processQueue();
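The effect of deferred commits can be seen in a small stand-alone model. This is a sketch under stated assumptions: QueuedStoreModel and its methods are hypothetical names, the table is reduced to a key/value map, and the code does not reproduce how the SDK stores or commits data.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of deferred commits; it does not reproduce SDK internals.
class QueuedStoreModel {
    private final Map<String, String> committed = new HashMap<>();
    private final List<String[]> pending = new ArrayList<>();

    // Queues an insert; nothing is visible to select() yet.
    void insert(String key, String value) {
        pending.add(new String[] { key, value });
    }

    // Reads only committed data, so queued changes are not seen.
    String select(String key) {
        return committed.get(key);
    }

    // Applies every queued change in order, like forcing a commit.
    void processQueue() {
        for (String[] change : pending) {
            committed.put(change[0], change[1]);
        }
        pending.clear();
    }
}
```

In the model, a select issued between insert and processQueue returns the old data (here, nothing), which is the behavior to keep in mind when a connector inserts a record and reads it back in the same cycle.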

Select Records

To retrieve records, specify a filter and use either the recordSelectOne or recordSelect method. The filter works in the same way as for updating records (see “Update Records”).

SelectOne Method

If you require only a single record, use the recordSelectOne method:

DatastoreRecord filter = new DatastoreRecord();
filter.setString("Url", "http://www.example.com/one");
DatastoreRecord found = datastore.recordSelectOne("MyTable", filter,
    new String[] { "Url", "ModifiedDate", "Seen" });

As well as taking a filter, recordSelectOne takes a list of columns to retrieve for the matching record. You can access the values of the retrieved columns by using the get methods on the returned DatastoreRecord.

Select Method

To select all records that match a filter (not just the first), use the recordSelect method:

DatastoreRecord filter = new DatastoreRecord();
filter.setString("Seen", "0");
datastore.recordSelect("MyTable", filter,
    new String[] { "Url" },
    this, "handleRecord");

This example selects only the "Url" column for all records in "MyTable" where the "Seen" field is set to "0". For each record it calls the handleRecord method on this. An example record handler is:

public void handleRecord(DatastoreRecord record) {
    log.writeln("Url = " + record.getString("Url"));
}

The method is called once for each record matching the filter.
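The handler is identified by an object and a method name. One way such a name-based callback could be dispatched is with Java reflection, as in the following sketch. This guide does not state how the SDK invokes the handler, so treat this as an assumption; CallbackModel, forEach, and Collector are hypothetical names, and strings stand in for DatastoreRecord objects.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of name-based callbacks; not the SDK implementation.
class CallbackModel {
    // Looks up a public one-argument method by name on 'target' and calls it
    // once for each item, in order.
    static void forEach(List<String> items, Object target, String methodName) {
        try {
            Method handler = target.getClass().getMethod(methodName, String.class);
            for (String item : items) {
                handler.invoke(target, item);
            }
        } catch (NoSuchMethodException | IllegalAccessException | InvocationTargetException e) {
            throw new IllegalStateException("Cannot call handler " + methodName, e);
        }
    }

    // Example handler object: collects every value it is given.
    static class Collector {
        final List<String> seen = new ArrayList<>();

        public void handleRecord(String url) {
            seen.add(url);
        }
    }
}
```

In this sketch, a misspelled method name or a non-public handler is reported only at run time, so it is worth checking that the name passed as a string matches the handler method exactly.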

Upgrade a Datastore

If you decide to change the structure of a datastore table but still want to be able to read datastore files that use the old format, you can provide a function to upgrade the table. For example:

datastore.createTable("MyTable",
    new String[] { "Url", "ModifiedDate", "Seen" },
    new String[0],
    new String[] { "Url" },
    false);
datastore.modifyTableByRow("MyTable",
    new String[] { "Url", "ModifiedDate", "Seen", "Acl" },
    new String[0],
    this, "upgradeV1ToV2Record");
datastore.commitTable("MyTable");

In the example, the table is created with the old column set, but the createTable method is passed false as its last argument. This indicates to the datastore that this is an old table structure that is to be upgraded.

The modifyTableByRow method is then used to specify the new column set. If the datastore table still has the old structure, the method upgradeV1ToV2Record on this is called for each record to upgrade it to the new format. If the table already has the new structure no action is taken.

modifyTableByRow can be called as many times as necessary. commitTable is called last of all to indicate that no further upgrades will now take place. The following code is an example upgrade method:

public boolean upgradeV1ToV2Record(
        DatastoreRecord oldRecord, DatastoreRecord newRecord) {
    newRecord.setString("Url", oldRecord.getString("Url"));
    newRecord.setString("ModifiedDate", oldRecord.getString("ModifiedDate"));
    newRecord.setString("Seen", oldRecord.getString("Seen"));
    newRecord.setString("Acl", "123");
    return true;
}

This is called for each record with the old structure (oldRecord). It should populate newRecord with the same data but in the new structure.

Index a Column

If you often select by a column that is not the primary key, you can index that column to improve performance:

datastore.createIndex("MyTable", new String[] { "Seen" });

Ingester Class

The Ingester class is used to send commands for new, updated, or deleted documents to the CFS for processing. Currently, CFS connections are supported for ingestion into IDOL server. Ingestion into a different repository by the Insert action of another connector is also supported.

An instance of the class is created on the base task class ConnectorTask, so it is accessible from any action (though it is usually required only for Synchronize). Set the host, port, and other settings for the CFS in the Ingestion section of the configuration file.

To send commands during synchronization, call the Add, Update, or Remove functions and pass in a DocInfo object, which should include a reference. Add commands use the metadata and the file name. Update commands update the metadata of a document but not its content, using any metadata provided in the document. The files passed to the ingester should be in the temporary directory specified by ConnectorTask.tempDirectory.

Ingest Result Handler

Result handlers can be added to the ingester. They are called for each document in the following situations:

• When an ingest command has successfully been sent for processing.

• When the current task has completed and all outstanding ingest tasks have failed to be sent.

• When ingestion is disabled, in which case all tasks are assumed to have completed successfully.


The result handler is typically used when state is being stored for each document so that synchronize cycles can be incremental. It is often desirable that the connector only update an item's state information if the document for the item has been ingested successfully. The result handler method might look like this:

public void handler(DocInfo docInfo, Ingester.TaskType type, boolean success) {
    if (success) {
        if (type == Ingester.TaskType.Add || type == Ingester.TaskType.Update) {
            /* Update state */;
        } else if (type == Ingester.TaskType.Remove) {
            /* Remove from state */;
        }
    }
}

The result handler is registered using the addResultHandler method of the ingester:

ingester.addResultHandler(this, "handler");
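The "update state only on success" pattern can be sketched without the SDK as follows. IngestStateModel, its local TaskType enum, and recordedDate are hypothetical names used only for illustration; a real handler receives a DocInfo object and Ingester.TaskType, and would typically write to the datastore rather than to a map.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone model; TaskType here is a local enum,
// not Ingester.TaskType.
class IngestStateModel {
    enum TaskType { Add, Update, Remove }

    // Reference -> modified date recorded for successfully ingested documents.
    private final Map<String, String> state = new HashMap<>();

    // Mirrors the shape of the result handler: state changes only on success.
    void handler(String reference, String modifiedDate, TaskType type, boolean success) {
        if (!success) {
            return; // state is left untouched, so the item still looks unprocessed
        }
        if (type == TaskType.Remove) {
            state.remove(reference);
        } else {
            state.put(reference, modifiedDate);
        }
    }

    // Returns the recorded date, or null if the item has no stored state.
    String recordedDate(String reference) {
        return state.get(reference);
    }
}
```

Because a failed ingest leaves the stored state unchanged in this model, a later synchronize cycle that compares the repository against the state would still treat the item as new or modified.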

Additional Information

Additional information is provided in the JavaDocs for ConnectorLibJava. You can obtain these by extracting them from JavaConnector.jar.

CHAPTER 5 Start and Stop the Connector

This section describes how to start and stop a connector.

NOTE You must start and stop the Connector Framework server separately from the connector.

Start the Connector

Start the connector using one of the following methods.

To start the connector using Windows Services

1. Open the Windows Services dialog box.
2. Select the ConnectorInstallName service, and click Start.
3. Close the Windows Services dialog box.

To start the connector by running the executable

1. In the connector installation directory, locate the connector executable called ConnectorInstallName.exe.
2. On a command line, enter ConnectorInstallName.exe.

Stop the Connector

Stop a connector from running by using one of the following methods.

To stop the connector using Windows Services

1. Open the Windows Services dialog box.
2. Select the ConnectorInstallName service, and click Stop.
3. Close the Windows Services dialog box.

To stop the connector service by sending a command to the service port

Type the following command in the address bar of your browser:

http://host:ServicePort/action=stop

where:

host The IP address (or name) of the machine on which the ConnectorLib Java SDK is running.

ServicePort The ConnectorLib Java SDK service port (specified in the [Service] section of the ConnectorLib Java SDK configuration file).
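If you prefer to issue the stop command from a script or tool rather than a browser, the URL can be assembled from the host and service port. The following is a minimal sketch: StopCommand and stopUrl are hypothetical names, and only the URL pattern comes from this guide.

```java
// Hypothetical helper; only the URL pattern comes from the text above.
class StopCommand {
    // Builds the stop request URL for the given host and service port.
    static String stopUrl(String host, int servicePort) {
        return "http://" + host + ":" + servicePort + "/action=stop";
    }
}
```

The resulting string can be passed to any HTTP client. The port must be the service port, not the ACI port of the connector.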

CHAPTER 6 Configure Connector Framework Server

This section describes how to configure the parameters that determine how the Connector Framework server (CFS) operates.

• Connector Framework Server Configuration File

• Modify Parameters

• Configure Connector Framework Server

• Example Configuration File

Connector Framework Server Configuration File

The parameters that determine how Connector Framework server operates are in the ConnectorFramework.cfg file, located in the CFS installation directory. You can modify these parameters to customize the CFS according to your requirements. The CFS supports all standard Server, Service, Logging, and License parameters. Most of the specific import tasks are defined in Lua scripts; therefore, the Connector Framework server configuration requirements are quite minimal.


Related Topics

• Connector Framework Server Parameters

• Example Configuration File

Modify Parameters

The following section describes how to enter parameter values in the configuration file.

Enter Boolean Values

The following settings for Boolean parameters are interchangeable:

TRUE = true = ON = on = Y = y = 1
FALSE = false = OFF = off = N = n = 0
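If your own connector code needs to read Boolean settings in the same way, the list above translates into a simple lookup. This is a sketch only: BooleanSetting and parse are hypothetical names, and the sketch accepts exactly the spellings listed above. Whether the server also accepts other capitalizations (for example, True) is not stated here, so the sketch rejects them.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical parser for the interchangeable Boolean spellings listed above.
class BooleanSetting {
    private static final Set<String> TRUE_VALUES =
            new HashSet<>(Arrays.asList("TRUE", "true", "ON", "on", "Y", "y", "1"));
    private static final Set<String> FALSE_VALUES =
            new HashSet<>(Arrays.asList("FALSE", "false", "OFF", "off", "N", "n", "0"));

    // Returns the Boolean meaning of a configuration value, or throws if the
    // value is not one of the listed spellings.
    static boolean parse(String value) {
        if (TRUE_VALUES.contains(value)) {
            return true;
        }
        if (FALSE_VALUES.contains(value)) {
            return false;
        }
        throw new IllegalArgumentException("Not a Boolean setting: " + value);
    }
}
```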

Enter String Values

Some parameters require string values that contain quotation marks. Escape each quotation mark by inserting a backslash before it. For example:

FIELDSTART0=""

Here, the beginning and end of the string are indicated by quotation marks, while all quotation marks that are contained in the string are escaped.

If you want to enter a comma-separated list of strings for a parameter, and one of the strings contains a comma, you must indicate the start and the end of this string with quotation marks. For example:

ParameterName=cat,dog,bird,"wing,beak",turtle

If any string in a comma-separated list contains quotation marks, you must put this string into quotation marks and escape each quotation mark in the string by inserting a backslash before it. For example:

ParameterName="",dog,bird,"wing,beak",turtle
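The quoting rules can be made concrete with a small parser. The following sketch is not the CFS parser: ListSetting and parse are hypothetical names, and the code implements only the three rules described above (commas separate items, quotation marks delimit an item that contains commas, and a backslash escapes a quotation mark inside an item).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser for the quoting rules described above; not the CFS parser.
class ListSetting {
    static List<String> parse(String value) {
        List<String> items = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '\\' && i + 1 < value.length() && value.charAt(i + 1) == '"') {
                current.append('"'); // escaped quotation mark is kept literally
                i++;
            } else if (c == '"') {
                inQuotes = !inQuotes; // unescaped quotation mark delimits a string
            } else if (c == ',' && !inQuotes) {
                items.add(current.toString()); // comma outside quotes ends an item
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        items.add(current.toString());
        return items;
    }
}
```

With this sketch, the value cat,dog,bird,"wing,beak",turtle yields five items, the fourth of which is wing,beak. A backslash that is not followed by a quotation mark (as in a Windows path) is kept as an ordinary character.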


Configure Connector Framework Server

This section describes how to configure the basic Connector Framework server parameters.

To configure CFS

1. Open the CFS configuration file.
2. In the [Service] section, specify the service information.
3. In the [Server] section, specify server information.
4. In the [ImportTasks] section, configure how data is imported to IDX or XML before it is indexed into IDOL Server.
5. In the [ImportService] section, specify details for KeyView and the service that imports documents into IDX or XML.
6. In the [Indexing] section, specify the details for the IDOL Server(s) to which the CFS will send documents for indexing.
7. In the [Actions] section, configure how actions are sent to the CFS.
8. Save the configuration file.

Related Topics

• Service Parameters

• Server Parameters

• Import Tasks and their Parameters

• Import Service Parameters

• Indexing Parameters

• Actions Parameters

• Secure Socket Layer Parameters

Example Configuration File

This section contains a basic example configuration file, which meets the minimum configuration requirements.

[Service]
Port=40030
ServiceStatusClients=*.*.*.*
ServiceControlClients=*.*.*.*

[Server]
Port=7000
QueryClients=*
AdminClients=*
MaxInputString=-1
MAXFILEUPLOADSIZE=-1

[Logging]
LogLevel=NORMAL
0=ApplicationLogStream
1=ActionLogStream
2=ImportLogStream
3=IndexLogStream

[actions]
MaxQueueSize=100

[ApplicationLogStream]
LogTypeCSVs=application
LogFile=application.log

[ActionLogStream]
LogTypeCSVs=action
LogFile=action.log

[ImportLogStream]
LogTypeCSVs=import
LogFile=import.log

[IndexLogStream]
LogTypeCSVs=indexer
LogFile=indexer.log

[Indexing]
DREHost=127.0.0.1
ACIPort=9000
IndexBatchSize=1000
IndexTimeInterval=300

[ImportService]
KeyviewDirectory=C:\Documents and Settings\dquatman\My Documents\ConnectorFrameworkServers\filters
ExtractDirectory=C:\Documents and Settings\dquatman\My Documents\ConnectorFrameworkServers\Temp
ThreadCount=3
ImportInheritFieldsCSV=AUTN_GROUP,AUTN_IDENTIFIER,DREDBNAME

[ImportTasks]
//Post0=lua:
Post0=IdxWriter:C:\Documents and Settings\dquatman\My Documents\ConnectorFrameworkServers\IDXArchive\output.idx

CHAPTER 7 Use Lua Scripts

This section contains the following topics:

• Use Lua Scripts within the CFS

• Method Reference

• Use Lua Scripts Within the Connector

Use Lua Scripts within the CFS

Connector Framework server can import or process data using Lua, an embedded scripting language. A Lua script allows CFS to:

• Call out to an external service, for example to alert a user.

• Modify and insert document fields.

• Interface with other libraries.

When data is imported, the script is run for each document. For more information on Lua, see:

http://www.lua.org/

CFS supports all standard Lua functions.

Configure a Lua Script

You can execute four types of script: pre-Lua, post-Lua, Delete, and Update. Pre-Lua scripts are run after the document data is extracted but before it is filtered, so the document contains metadata. Post-Lua scripts are run after the document data is filtered, so the document also contains the document content. Delete scripts are run when a document is deleted. Update scripts are run when a document is updated. Update and Delete scripts are configured in the same way as pre-Lua and post-Lua scripts, but they appear in the [IndexTasks] section.

Use this procedure to specify the location of the Lua script file.

To configure a Lua script

1. Stop the Connector Framework server.
2. Open the Connector Framework server configuration file in a text editor.
3. Locate the [ImportTasks] section, and enter a different value of PreN (for pre-Lua scripts) or PostN (for post-Lua scripts) for each script file. For example:

[ImportTasks]
...
Pre0=Lua:script1.lua
Pre1=Lua:script2.lua
Post0=Lua:script3.lua

4. To enable family hashing, set the HashN parameter. True indicates that the Lua script calculated the hash; False indicates that the hash should be calculated.
5. Save the configuration file.

Write a Lua Script

The script should have this structure:

function handler(document)
    ...
end

The handler function is called for each document and is passed a document object. This is an internal representation of the document being processed. Modifying this object changes the document.

Return true if you want to continue processing the document and return false if you want to stop.

NOTE You can write a library of useful functions to share between multiple scripts. To include the library in a script, add dofile("library.lua") at the top of the Lua script, outside the handler function.

Method Reference

The Connector Framework server supports several methods, which are listed in Table 1.

Table 1 Supported methods

Method Description

General Methods

abs_path Returns the supplied path as an absolute path.
convert_date_time Converts date and time formats using standard Autonomy syntax.
convert_encoding Converts the encoding of the string passed in from UTF8 and returns the converted string.
copy_file Copies the source file to the destination path.
create_path Creates the specified directory tree.
create_uuid Creates a universally unique identifier.
delete_file Deletes the file specified by path.
encrypt Encrypts a string passed in and returns the encrypted string.
encrypt_security_field Encrypts the ACL.
file_setdates Sets the given file times on the file specified by path.
getcwd Returns the current working directory of the application.
get_config Loads a configuration file.
gobble_whitespace Reduces multiple adjacent white spaces to a single space.
hash_file Hashes the specified file using the SHA1 or MD5 algorithm, or both.
hash_string Hashes the specified string.

is_dir Checks if the supplied path is a directory.
log Appends log messages to the specified file.
move_file Moves the source file to the destination path.
parse_csv Parses the given separated values string into a collection of individual strings.
parse_xml Parses the given XML string to an XMLDocument.
regex_match Performs a regular expression match on a string.
regex_search Performs a regular expression search on a string.
send_aci_action Takes the action parameters as a table instead of the full action as a string to avoid issues with parameter values containing “&”.
send_aci_command Sends the given query to the ACI server.
sleep Pauses the executing thread for a number of milliseconds.
string_uint_less Takes two strings and returns True if the second one is longer than the first.
unzip_file Extracts the zip file specified by path to the location specified by dest.
xml_encode Takes a string and encodes it to a string that is valid to be put into XML.
zip_file Zips the supplied path (file or directory).

Document Methods

addField Creates a new field when passed a name and value.
appendContent Appends content to the existing content of the document.
copyField Creates a new named field with the same value as an existing named field.
copyFieldNoOverwrite Copies a field to a certain name but does not overwrite the existing value.
countField Returns an integer of the number of fields with the name specified.

deleteField Removes a field from the document.
findField Returns the LuaField object of the specified name.
getContent Gets the content for a document.
getField Returns the first LuaField object of the specified name.
getFieldNames Gets all the field names for the document.
getFields Returns a table of LuaFields of the specified name.
getFieldValue Gets a field value.
getFieldValues Gets all values of a multi-valued field.
getNextSection Gets the next section in a document, allowing you to perform find or add operations on every section.
getReference Returns a string containing the reference.
hasField Checks whether the document has a particular named field.
insertXML Inserts a portion of XML as a new piece of metadata for the document.
renameField Moves an existing field from one name to another.
setContent Sets the content for a document.
setFieldValue Sets a field value.
setReference Sets the reference to the string passed in.
writeStubIdx Writes out a stub IDX document.

Field Methods

addField Adds a sub field with the specified name and value.
copyField Copies the sub field to another sub field.
copyFieldNoOverwrite Copies the sub field to another sub field but does not overwrite the destination.
countField Returns the number of sub fields that exist with the specified name.
deleteAttribute Deletes the attribute specified by the name passed in.

deleteField Deletes the sub field with the specified name.
getAttributeValue Gets the value of the attribute specified as a string.
getField Gets the sub field specified by the name.
getFieldNames Returns a table containing strings representing all the sub fields’ names.
getFields Gets all the sub fields specified by the name.
getFieldValues Returns a table of strings of all the values of sub fields with the specified name.
hasAttribute Returns a Boolean specifying if the field has the specified attribute passed in by name.
hasField Returns a Boolean specifying if the sub field exists or not.
insertXML Inserts a portion of XML as a new piece of metadata for the document.
name Returns the name of the field object in a string.
renameField Renames the sub field.
setAttributeValue Sets the value for the specified attribute of the field.
setValue Sets the value of the field to the string passed in.
value Returns the value of the field object in a string.

XMLDocument Methods

root Returns an XMLNode that is the root node of the XML document.
XPathExecute Returns an XMLNodeSet that is the result of the supplied XPath query.
XPathRegisterNs Registers a namespace with the XML parser. Returns an integer detailing the error code.
XPathValue Returns the first occurrence of the value matching the XPath query.
XPathValues Returns a table of strings containing the values according to the XPath query.

XMLNodeSet Methods

at Returns XMLNode at position pos in the array.
size Returns size of node set.

XMLNode Methods

attr Returns first XMLAttr attribute object for this element.
content Returns the content (text element) of the xml node.
firstChild Returns XMLNode that is the first child of this node.
lastChild Returns XMLNode that is the last child of this node.
name Returns the name of the xml node.
next Returns XmlNode that is the next sibling of this node.
nodePath Returns the Xml path to the node that can be used in another XPath query.
parent Returns the parent XmlNode of the node.
prev Returns XmlNode that is the previous sibling of this node.
type Returns the type of the node as a string.

XmlAttr Methods

name Returns the name of this attribute.
next_attribute Returns XmlAttr object for the next attribute in the parent element.
previous_attribute Returns XmlAttr object for the previous attribute in the parent element.
type Returns the type of this attribute node.
value Returns the value of this attribute.

RegexMatch Methods

length Returns the length of the sub match. The default value of 0 returns the length of the full match.
next Returns a RegexMatch for the next match.
position Returns the position of the sub match as an index from 1.

size Returns the number of sub matches for the current match as an integer.
str Returns the string for the sub match.

Config Methods

getEncryptedValue Returns the unencrypted value from the config of an encrypted value.

getValue Returns the value of the configuration parameter key in a given section.

getValues Returns a table of strings if you have multiple values for a key (for example, a CSV or numbered like keyN).

General Methods

abs_path Returns the supplied path as an absolute path.

Syntax abs_path( String path )

Arguments

Arguments Type/Description

path The relative path.

Returns A string of the supplied path as an absolute path.

convert_date_time Converts date and time formats using standard Autonomy syntax.

Syntax String convert_date_time (String InputDateTime, String InputFormatCSV, String OutputFormat, [Boolean OutputGMT = false])


Arguments

Arguments Type/Description

InputDateTime The date and time to be converted.

InputFormatCSV A comma-separated list of the possible date and time formats of the input.

OutputFormat The format of the date and time to be output.

OutputGMT Specifies whether to treat the date and time output as Greenwich Mean Time. Default is false.

Discussion All date and time input is treated as local time unless it contains explicit time zone information.

Returns Date and time in the desired format.

convert_encoding This method converts the encoding of the string passed in from UTF8 and returns the converted string.

Syntax convert_encoding ( String content, String encodingname)

Arguments

Arguments Type/Description

content The string to convert.

encodingname The encoding name to convert to (same as IDOL encoding names).

Returns The converted string.


copy_file Copy the source file to the destination path. The copy will fail if the destination file already exists. This can be overridden by providing the optional overwrite argument which forces the copy if the destination exists.

Syntax copy_file( String src, String dest [, Boolean overwrite] )

Arguments

Arguments Type/Description

src The source file.

dest The destination file.

overwrite Forces the copy if the destination exists.

Returns Returns a Boolean indicating success/failure.

create_path Creates the specified directory tree.

Syntax void create_path (String Path)

Arguments

Arguments Type/Description

Path The path to be created.

create_uuid Creates a universally unique identifier.

Syntax String create_uuid()

Returns A universally unique identifier.


delete_file Delete the file specified by path.

Syntax delete_file( String path )

Arguments

Arguments Type/Description

path The path and filename of the file to be deleted.

Returns Returns a Boolean indicating success/failure.

encrypt This method encrypts a string passed in and returns the encrypted string. It uses the same encryption as is used for ACL encryption.

Syntax encrypt (String content)

Arguments

Arguments Type/Description

content The string to encrypt.

Returns The encrypted string.

encrypt_security_field Encrypts the ACL.

Syntax String encrypt_security_field (String ACL)

Arguments

Arguments Type/Description

ACL An Access Control List string.


Returns An encrypted string.

file_setdates Sets the given file times on the file specified by path. If the format parameter is not specified, it is assumed that the dates are provided as seconds since the epoch (1st January 1970).

Syntax file_setdates( String path, String created, String modified, String accessed [, String format] )

Arguments

Arguments Type/Description

path The path or filename of the file on which to set the dates.

created The date created.

modified The date modified.

accessed The last date accessed.

format The format of the supplied date strings. The format syntax is the same as for other Autonomy products.

getcwd Returns the current working directory of the application.

Syntax getcwd()

Returns Returns a string of the current working directory.

get_config Load a configuration file.

Syntax get_config( path )


Arguments

Arguments Type/Description

path The path of the configuration file to load.

Discussion Config files are cached after the first call to get_config, to avoid unnecessary disk I/O in the likely event that the same config is accessed frequently by subsequent invocations of the Lua script. One cache is maintained per Lua state, so the maximum number of reads for a config file is equal to the number of threads that are running Lua scripts.

An error is raised if the configuration file does not exist.

Returns A Config object.
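The caching behavior described in the discussion can be pictured with a small model in which one object stands for one Lua state. This is a sketch under stated assumptions: ConfigCacheModel, getConfig, and reads are hypothetical names, the loader function stands in for reading the file from disk, and the code does not reflect how CFS implements its cache.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical model of a per-state cache; one instance stands for one Lua state.
class ConfigCacheModel {
    private final Map<String, String> cache = new HashMap<>();
    private final Function<String, String> loader;
    private int reads = 0;

    ConfigCacheModel(Function<String, String> loader) {
        this.loader = loader;
    }

    // Loads the file on the first call for a path and serves later calls
    // for the same path from memory.
    String getConfig(String path) {
        String cached = cache.get(path);
        if (cached == null) {
            cached = loader.apply(path); // stands in for reading the file from disk
            reads++;
            cache.put(path, cached);
        }
        return cached;
    }

    // Number of times the loader was actually invoked.
    int reads() {
        return reads;
    }
}
```

In the model, each state reads a given file once, so two states produce two reads in total. One consequence worth noting is that a cached config would not pick up later changes to the file within the same state.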

gobble_whitespace Reduces multiple adjacent white spaces (tab, carriage return, space, and so on) in the specified string to a single space.

Syntax String gobble_whitespace (String Input)

Arguments

Arguments Type/Description

Input An input string.

Returns A string without adjacent white spaces.
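For comparison, equivalent behavior in Java can be sketched with a regular expression. Whitespace and gobble are hypothetical names, and this is not the CFS implementation. In particular, the sketch also turns a single tab or line break into a space, which may or may not match what gobble_whitespace does with a lone white space character.

```java
// Hypothetical Java equivalent of the behaviour described; not the CFS code.
class Whitespace {
    // Replaces each run of white space characters with a single space.
    static String gobble(String input) {
        return input.replaceAll("\\s+", " ");
    }
}
```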

hash_file Hashes the specified file using the SHA1 or MD5 algorithm, or both.

Syntax String, [String] hash_file (String FileName, String Algorithm1, [String Algorithm2])


Arguments

Arguments Type/Description

FileName The name of the file to be hashed.

Algorithm1 The type of algorithm to use. Must be either SHA1 or MD5.

Algorithm2 The optional second type of algorithm to use. Must be whichever algorithm was not used in Algorithm1.

Returns The hash of the file for each algorithm specified.

hash_string Hashes the specified string using the SHA1 or MD5 algorithm.

Syntax String hash_string (String StringToHash, String Algorithm)

Arguments

Arguments Type/Description

StringToHash The string to be hashed.

Algorithm The algorithm to use. Must be either SHA1 or MD5.

Returns The hashed input string.
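Assuming that the algorithm names refer to the standard SHA-1 and MD5 digests, the following Java sketch shows what such hashes look like when written as hexadecimal text. HashModel and hashString are hypothetical names. The output format of the Lua hash_string method (hexadecimal or otherwise) and the character encoding it applies are not stated in this guide, so the UTF-8 and lowercase hexadecimal choices here are assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical Java illustration of SHA-1 and MD5 digests; not the CFS code.
class HashModel {
    // Returns the lowercase hexadecimal digest of the UTF-8 bytes of 'input'.
    // 'algorithm' is a Java algorithm name such as "SHA-1" or "MD5".
    static String hashString(String input, String algorithm) {
        try {
            MessageDigest digest = MessageDigest.getInstance(algorithm);
            byte[] bytes = digest.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : bytes) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unknown algorithm: " + algorithm, e);
        }
    }
}
```

An MD5 digest is 32 hexadecimal characters long and a SHA-1 digest is 40, which can help you recognize which algorithm produced a stored value.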

is_dir Check if the supplied path is a directory.

Syntax is_dir( String path )

Arguments

Arguments Type/Description

path The path to check.


Returns Returns a Boolean indicating if the supplied path is a directory.

log Appends log messages to the specified file.

Syntax log( String file, String message )

Arguments

Arguments Type/Description

file The file to which log messages will be appended.

message The message to print to the file.

Returns Nothing.

move_file Move the source file to the destination path. The move will fail if the destination file already exists. This can be overridden by providing the optional overwrite argument which forces the move if the destination exists.

Syntax move_file( String src, String dest [, Boolean overwrite] )

Arguments

Arguments Type/Description

src The source file.

dest The destination file.

overwrite Forces the move if the destination exists.

Returns Returns a boolean indicating success/failure.


parse_csv Parse the given separated values string into a collection of individual strings.

Syntax parse_csv( csv_string [, delimiter])

Arguments

Arguments Type/Description

csv_string The string to parse.

delimiter The delimiter to use (defaults to ",").

Discussion The method understands quoted values (such that parsing 'foot, "leg, torso", elbow' produces three values) and ignores white space around delimiters.

Returns The elements are returned as multiple return values. You may wish to put them in a table like this:

local results = { parse_csv("cat,tree,house", ",") };

parse_xml Parse the given XML string to an XMLDocument.

Syntax parse_xml( xml_string )

Arguments

Arguments Type/Description

xml_string XML data as a string.

Returns An XMLDocument containing the parsed data, or nil if the string could not be parsed.

regex_match This method performs a regular expression match on a string.


Syntax regex_match (String name, String regex [, Boolean case])

Arguments

Arguments Type/Description

name The string in which to search.

regex The regular expression with which to search.

case An optional Boolean specifying whether or not to be case-sensitive.

Returns A table of strings.

regex_search This method performs a regular expression search on a string.

Syntax regex_search (String name, String regex [, Boolean case])

Arguments

Arguments Type/Description

name The string in which to search.

regex The regular expression with which to search.

case An optional Boolean specifying whether or not to be case-sensitive.

Returns A regular expression match-object.

send_aci_action Sends the given action to the ACI server at host:port, with optional time-out (in milliseconds) and retry settings. Unlike send_aci_command, which takes the full action as a string, this method takes the action parameters as a table, which avoids issues with parameter values that contain “&”.

Syntax send_aci_action( host, port, action [, parameters][, timeout]  [, retries] )


Example send_aci_action( "localhost", 9000, "query", {text = "*", print = "all"} );

Arguments

Arguments Type/Description

host The ACI host to send the query to.

port The port to send the query to.

action The action to perform (for example, query).

parameters This takes a Lua table containing the action parameters, for example, { param1="foo", param2="bar" }

timeout The number of milliseconds to wait before timing out. The default is 3000.

retries The number of times to retry if the request fails. The default is 3.

Returns The XML response is returned as a string. If the request has failed, then nil is returned.
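The reason a parameter table is safer than a single action string can be shown with a short Java sketch. When each value is passed separately, it can be encoded on its own before the action is assembled, so an “&” inside a value cannot be mistaken for a parameter separator. ActionQuery and build are hypothetical names, and the sketch only illustrates the principle; it does not show how CFS builds or sends the request.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;

// Hypothetical illustration of why parameters are safer as a table; not the CFS code.
class ActionQuery {
    // Builds "action=<action>&name=value..." with each value URL-encoded.
    static String build(String action, Map<String, String> parameters) {
        StringBuilder query = new StringBuilder("action=").append(action);
        try {
            for (Map.Entry<String, String> p : parameters.entrySet()) {
                query.append('&').append(p.getKey()).append('=')
                     .append(URLEncoder.encode(p.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
        return query.toString();
    }
}
```

In this sketch, a text value of cats&dogs is sent as cats%26dogs. If the same value were pasted unencoded into a full action string, the receiving server would see a text parameter with the value cats followed by a separate, unintended parameter named dogs.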

send_aci_command Sends the given query to the ACI server at host:port with optional time-out (ms) and retries settings.

Syntax send_aci_command( host, port, query [, timeout] [, retries] )

Arguments

Arguments Type/Description

host The ACI host to send the query to.

port The port to send the query to.

query The query to send (for example, action=getstatus)

timeout The number of milliseconds to wait before timing out. The default is 3000.

retries The number of times to retry if the request fails. The default is 3.

Returns The XML response as a string. If the request fails, nil is returned.

sleep Pause the executing thread for a number of milliseconds.

Syntax sleep( Integer milliseconds )

Arguments

Arguments Type/Description

milliseconds The number of milliseconds for which to pause the current thread.

Returns Nothing.
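As a sketch of how sleep can be combined with send_aci_command, the following helper polls a server until it answers. The helper name wait_for_server and its attempts and delay_ms arguments are assumptions made for illustration; only send_aci_command, sleep, and the getstatus action come from this guide.

```lua
-- Hypothetical helper (not part of the SDK): asks the server for its
-- status up to `attempts` times, pausing `delay_ms` between attempts.
-- Returns the XML response, or nil if every attempt failed.
function wait_for_server(host, port, attempts, delay_ms)
  for attempt = 1, attempts do
    local response = send_aci_command(host, port, "action=getstatus")
    if response then
      return response
    end
    -- Do not pause after the final attempt.
    if attempt < attempts then
      sleep(delay_ms)
    end
  end
  return nil
end
```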

string_uint_less This method takes two strings and returns True if the second one is longer than the first. Otherwise, it returns False.

Syntax string_uint_less (String1, String2)

Arguments

Arguments Type/Description

String1 The string that acts as the standard for comparison.

String2 The string to compare against the first string (the standard).

Returns A Boolean.


unzip_file Extracts the zip file specified by path to the location specified by dest.

Syntax unzip_file( String path, String dest )

Arguments

Arguments Type/Description

path The path or filename of the file to be unzipped.

dest The destination path where the files are to be extracted.

Returns A Boolean indicating success or failure.

xml_encode This method takes a string and encodes it to a string that is valid to be put into XML.

Syntax xml_encode (String content)

Arguments

Arguments Type/Description

content The string to be encoded.

Returns A string.

zip_file Zip the supplied path (file or directory). The output file will only be overwritten if true is supplied for the optional overwrite argument.

Syntax zip_file( String path [, Boolean overwrite] )


Arguments

Arguments Type/Description

path The path or filename of the file to be zipped.

overwrite Forces the creation of the zip file if an output file already exists.

Returns A Boolean indicating success or failure. The output is written to path.zip.

Document Methods

addField Adds a new field to the document.

Syntax addField ( String fieldname, String fieldvalue )

Arguments

Arguments Type/Description

fieldname The name of the field to add.

fieldvalue The value to set for the field.

appendContent Appends content to the existing content of the document.

Syntax appendContent ( String content )

Arguments

Arguments Type/Description

content The content to append to the document content.

copyField Copies a field to a certain name.


Syntax copyField (String sourcename, String targetname [, Boolean case])

Arguments

Arguments Type/Description

sourcename The name of the field to copy.

targetname The destination field name.

case An optional Boolean specifying whether or not to be case-sensitive.

copyFieldNoOverwrite Copies a field to a certain name but does not overwrite the existing value.

Syntax copyFieldNoOverwrite ( String sourcename, String targetname [, Boolean case])

Arguments

Arguments Type/Description

sourcename The name of the field to copy.

targetname The destination field name.

case An optional Boolean specifying whether or not to be case-sensitive.

countField This method returns an integer of the number of fields with the name specified.

Syntax countField (String fieldname [, Boolean case])

Arguments

Arguments Type/Description

fieldname The name of the field to count.

case An optional Boolean to specify whether or not to be case-sensitive.


Returns An integer.

deleteField Deletes a field from a document.

Syntax deleteField ( String fieldname [, Boolean case])

Arguments

Arguments Type/Description

fieldname The name of the field to delete.

case An optional Boolean to specify whether or not to be case-sensitive.

findField This method returns the LuaField object of the specified name.

Syntax findField ( String fieldname)

Arguments

Arguments Type/Description

fieldname The name of the field to find.

Returns A LuaField object of the specified name.

getContent Gets the content for a document.

Syntax getContent ()

Returns The document content as a string.


getField This method returns the first LuaField object of the specified name.

Syntax getField (String name [, Boolean case])

Arguments

Arguments Type/Description

name The name of the LuaField object.

case An optional Boolean to specify whether or not to be case-sensitive.

Returns First LuaField object of the specified name.

getFields This method returns a table of LuaFields of the specified name.

Syntax getFields (String name [, Boolean case])

Arguments

Arguments Type/Description

name The name of the LuaField object.

case An optional Boolean to specify whether or not to be case-sensitive.

Returns A table of LuaFields.

getFieldNames Gets all the field names for the document.

Syntax getFieldNames ( )

Returns A table of all the field names.


getFieldValue Gets the value of a field on a document.

Syntax getFieldValue( String fieldname [, Boolean case])

Arguments

Arguments Type/Description

fieldname The name of the field whose value is to be retrieved.

case An optional Boolean to specify whether or not to be case-sensitive.

Returns A string containing the value.

getFieldValues Gets all values from all fields that have the same name.

Syntax getFieldValues( String fieldname [, Boolean case])

Arguments

Arguments Type/Description

fieldname The name of the field to match.

case An optional Boolean to specify whether or not to be case-sensitive.

Returns A table of all the field values.

getNextSection The document object passed to the script's handler function in fact represents the first section of the document. This means the functions previously detailed only read and modify the first section. This method returns the next section in the document when sectioned.

Syntax LuaDocument getNextSection ()

Example To perform operations on every section, for example:

    local section = document
    while section do
        -- Manipulate section
        section = section:getNextSection()
    end

Returns A document object that contains the next DRE section.

getReference This method returns a string containing the reference.

Syntax getReference ()

Returns The string containing the reference.

hasField Checks to see if a field exists for a document.

Syntax hasField ( String fieldname [, Boolean case])

Arguments

Arguments Type/Description

fieldname The name of the field whose existence you are checking.

case An optional Boolean to specify whether or not to be case-sensitive.

Returns A Boolean: true if the field exists, false otherwise.

insertXML This method inserts a portion of XML as a new piece of metadata for the document.


Syntax insertXML (LuaXMLNode node)

Arguments

Arguments Type/Description

node The node to insert.

Returns A LuaField object of the inserted data.

renameField Changes the name of a field from one name to another.

Syntax renameField ( String currentname, String newname [, Boolean case])

Arguments

Arguments Type/Description

currentname The name of the field to rename.

newname The new name of the field.

case An optional Boolean to specify whether or not to be case-sensitive.

setContent Sets the content for a document.

Syntax setContent ( String content )

Arguments

Arguments Type/Description

content The content to set for the document.

setFieldValue Sets the value of a field on a document.


Syntax setFieldValue( String fieldname, String newvalue )

Arguments

Arguments Type/Description

fieldname The name of the field to set.

newvalue The value to set for the field. If the field already exists, it will be overwritten.

setReference This method sets the reference to the string passed in.

Syntax setReference (String reference)

Arguments

Arguments Type/Description

reference The reference to set.

writeStubIdx Writes out a stub idx document (a metadata file used by IDOL applications).

Syntax writeStubIdx( String filename )

Arguments

Arguments Type/Description

filename The name of the file to create.

Returns A Boolean: true if written, false otherwise.


Field Methods

addField This method adds a sub field with the specified name and value.

Syntax addField (String fieldname, String fieldvalue)

Arguments

Arguments Type/Description

fieldname The name of the field.

fieldvalue The value of the field.

Returns The LuaField object.

copyField This method copies the sub field to another sub field.

Syntax copyField (String fieldname, String destination [, Boolean case])

Arguments

Arguments Type/Description

fieldname The name of the field to copy.

destination The name of the field to copy to.

case A Boolean to specify whether or not to be case-sensitive.

copyFieldNoOverwrite This method copies the sub field to another sub field but does not overwrite the destination.

Syntax copyFieldNoOverwrite (String fieldname, String destination [, Boolean case])


Arguments

Arguments Type/Description

fieldname The name of the field to copy.

destination The name of the field to copy to.

case A Boolean to specify whether or not to be case-sensitive.

countField This method returns the number of sub fields that exist with the specified name.

Syntax countField (String fieldname [, Boolean case])

Arguments

Arguments Type/Description

fieldname The name of the field.

case A Boolean to specify whether or not to be case-sensitive.

Returns The number of sub fields that exist with the specified name.

deleteAttribute This method deletes the attribute specified by the name passed in.

Syntax deleteAttribute (String name)

Arguments

Arguments Type/Description

name The attribute name to delete.

deleteField This method deletes the sub field with the specified name.


Syntax deleteField (String fieldname [, Boolean case])

Arguments

Arguments Type/Description

fieldname The name of the field to delete.

case A Boolean to specify whether or not to be case-sensitive.

getAttributeValue This method gets the value of the attribute specified as a string.

Syntax getAttributeValue (String name)

Arguments

Arguments Type/Description

name The attribute name to get.

Returns The attribute value as a string.

getField This method gets the sub field specified by the name.

Syntax getField (String name [, Boolean case])

Arguments

Arguments Type/Description

name The field name to get.

case A Boolean to specify whether or not to be case-sensitive.

Returns A single field object.


getFieldNames This method returns a table containing strings representing all the sub fields’ names.

Syntax getFieldNames ()

Returns A table containing strings representing all the sub fields’ names.

getFields This method gets all the sub fields specified by the name.

Syntax getFields (String name [, Boolean case])

Arguments

Arguments Type/Description

name The field name to get.

case A Boolean to specify whether or not to be case-sensitive.

Returns A table of field objects.

getFieldValues This method returns a table of strings of all the values of sub fields with the specified name.

Syntax getFieldValues (String fieldname [, Boolean case])

Arguments

Arguments Type/Description

fieldname The name of the field.

case A Boolean to specify whether or not to be case-sensitive.


Returns A table of strings of all the values of sub fields with the specified name.

hasAttribute This method returns a Boolean specifying if the field has the specified attribute passed in by name.

Syntax hasAttribute (String name)

Arguments

Arguments Type/Description

name The name of the attribute.

Returns A Boolean specifying if the field has the specified attribute passed in by name.

hasField This method returns a Boolean specifying if the sub field exists or not.

Syntax hasField (String fieldname [, Boolean case])

Arguments

Arguments Type/Description

fieldname The name of the field.

case A Boolean to specify whether or not to be case-sensitive.

Returns A Boolean specifying if the sub field exists or not.

insertXML This method inserts a portion of XML as a new piece of metadata for the document.

Syntax insertXML (LuaXMLNode node)


Arguments

Arguments Type/Description

node The node to insert.

Returns A LuaField object of the inserted data.

name This method returns the name of the field object in a string.

Syntax name ()

Returns The name of the field object in a string.

renameField This method renames the sub field.

Syntax renameField (String oldname, String newname [, Boolean case])

Arguments

Arguments Type/Description

oldname The previous name of the field.

newname The new name of the field.

case A Boolean to specify whether or not to be case-sensitive.

setAttributeValue This method sets the value for the specified attribute of the field.

Syntax setAttributeValue (String attribute, String value)


Arguments

Arguments Type/Description

attribute The attribute to set.

value The value to set to.

setValue This method sets the value of the field to be passed in a string.

Syntax setValue (String value)

Arguments

Arguments Type/Description

value The value to set.

value This method returns the value of the field object in a string.

Syntax value ()

Returns The value of the field object in a string.
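The document and field methods can be combined to read a field safely. The sketch below assumes only the hasField and getField document methods and the name and value field methods described above; the helper name describe_field is hypothetical.

```lua
-- Hypothetical helper (not part of the SDK): returns "NAME=value" for the
-- first field called fieldname, or nil if the document has no such field.
function describe_field(document, fieldname)
  if not document:hasField(fieldname) then
    return nil
  end
  -- getField returns the first LuaField object with the given name.
  local field = document:getField(fieldname)
  return field:name() .. "=" .. field:value()
end
```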

XMLDocument Methods

root Returns an XmlNode which is the root node of the XML document.

Syntax root()

Returns An XmlNode.

XPathExecute Returns an XmlNodeSet that contains the result of the supplied XPath query.

Syntax XPathExecute( String xpathQuery )

Arguments

Arguments Type/Description

xpathQuery The xpath query to execute.

Returns An XmlNodeSet node set.

XPathRegisterNs Register a namespace with the XML parser. Returns an integer detailing the error code.

Syntax XPathRegisterNs( String prefix, String URI )

Arguments

Arguments Type/Description

prefix The namespace prefix.

URI The namespace location.

Returns 0 in case of success, -1 in case of error.

XPathValue Returns the first occurrence of the value matching the XPath query.

Syntax String XPathValue(String query)


Arguments

Arguments Type/Description

query The XPath query to use.

Returns A string of the value.

XPathValues Returns a table of strings containing the values according to the XPath query.

Syntax Table XPathValues(String query)

Arguments

Arguments Type/Description

query The XPath query to use.

Returns A table of strings of the values.

XmlNodeSet Methods

at Returns XmlNode at position pos in the array.

Syntax at( pos )

Arguments

Arguments Type/Description

pos The index of the item in the array to get.

Returns An XmlNode.


size Returns size of node set.

Syntax size()

Returns An integer of the size of the node set.

XmlNode Methods

attr Returns first XmlAttr attribute object for this element.

Syntax attr()

Returns An XmlAttr object.

content Returns the content (text element) of the xml node.

Syntax content()

Returns A string containing the content.

firstChild Returns XmlNode which is the first child of this node.

Syntax firstChild()

Returns An xmlNode.

lastChild Returns XmlNode which is the last child of this node.


Syntax lastChild()

Returns An xmlNode.

name Returns the name of the xml node.

Syntax name()

Returns A string containing the name.

next Returns XmlNode which is the next sibling of this node.

Syntax next()

Returns An xmlNode.

nodePath Returns the Xml path to the node which can be used in another XPath query.

Syntax nodePath()

Returns A string containing the path.

parent Returns the parent XmlNode of the node.

Syntax parent()

Returns An xmlNode.


prev Returns XmlNode which is the previous sibling of this node.

Syntax prev()

Returns An xmlNode.

type Returns the type of the node as a string.

Syntax type()

Returns A string containing the type. Possible values are:

element_node            comment_node          element_decl
attribute_node          document_node         attribute_decl
text_node               document_type_node    entity_decl
cdata_section_node      document_frag_node    namespace_decl
entity_ref_node         notation_node         xinclude_start
entity_node             html_document_node    xinclude_end
pi_node                 dtd_node              docb_document_node
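The firstChild, next, and name methods can be used together to walk the children of a node. This is a sketch under one stated assumption that the guide does not confirm: that firstChild and next return nil when there is no further node. The helper name child_names is hypothetical.

```lua
-- Hypothetical helper (not part of the SDK): returns a table holding the
-- names of all direct children of node, in document order.
-- Assumption: firstChild() and next() return nil when no node remains.
function child_names(node)
  local names = {}
  local child = node:firstChild()
  while child do
    names[#names + 1] = child:name()
    child = child:next()
  end
  return names
end
```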

XmlAttr Methods

name Returns the name of this attribute.

Syntax name()

Returns A String containing the name of the attribute.

next_attribute Returns XmlAttr object for the next attribute in the parent element.

Syntax next_attribute ()


Returns An XmlAttr.

previous_attribute Returns XmlAttr object for the previous attribute in the parent object.

Syntax previous_attribute ()

Returns An XmlAttr.

type Returns the type of this attribute node.

Syntax type()

Returns This method returns a string containing "attribute_node" if the node is valid, or "null" if the node is invalid.

value Returns the value of this attribute.

Syntax value()

Returns A String containing the value of the attribute.

RegexMatch Methods

length This method returns the length of the sub match. The default sub match, 0, refers to the full match.

Syntax length ( Integer submatch)


Arguments

Arguments Type/Description

submatch The sub match to query.

Returns The length of the sub match.

next Returns a RegexMatch for the next match.

Syntax next ()

Returns A RegexMatch for the next match.

position This method returns the position of the sub match as an index from 1. The default sub match, 0, refers to the full match.

Syntax position (Integer submatch)

Arguments

Arguments Type/Description

submatch The sub match to query.

Returns The position of the submatch as an index from 1.

size This method returns the number of sub matches for the current match as an integer. This includes the full match so it will return one greater than expected.

Syntax size (Integer submatch)


Arguments

Arguments Type/Description

submatch The sub match to query.

Returns The number of sub matches for the current match.

str This method returns the string for the sub match. The default sub match, 0, refers to the full match.

Syntax str (Integer submatch)

Arguments

Arguments Type/Description

submatch The sub match to query.

Returns The string for the sub match.
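The RegexMatch methods are typically used with the match object returned by regex_search. The following sketch collects every match in a string. It assumes that regex_search returns nil when nothing matches and that next returns nil after the last match; neither behavior is confirmed by this guide. The helper name collect_matches is hypothetical.

```lua
-- Hypothetical helper (not part of the SDK): returns a table holding the
-- full text of every match of regex in text.
-- Assumptions: regex_search() returns nil when there is no match, and
-- next() returns nil when there are no further matches.
function collect_matches(text, regex)
  local results = {}
  local match = regex_search(text, regex)
  while match do
    -- Sub match 0 is the full match.
    results[#results + 1] = match:str(0)
    match = match:next()
  end
  return results
end
```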

Config Methods

getEncryptedValue Returns the decrypted form of a value that is stored encrypted in the configuration file.

Syntax String getEncryptedValue(String section, String key)

Arguments

Arguments Type/Description

section The section in the configuration file.

key The key in the configuration file to get the value for.


Returns The unencrypted value.

getValue Returns the value of the configuration parameter key in a given section. If the key does not exist in the section, then the default value is returned.

Syntax getValue( String section, String key, String default )

Arguments

Arguments Type/Description

section The section name in the configuration file.

key The name of the key from which to read.

default The default value to use if no key is found.

Returns A string containing the value read from the configuration file.

getValues Returns a table of strings if you have multiple values for a key (for example, a CSV or numbered like keyN).

Syntax Table getValues(String section, String key)

Arguments

Arguments Type/Description

section The section in the configuration file.

key The key in the configuration file to get the value for.

Returns A table of strings of the values.
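The three Config methods can be combined to read the settings for a task. In this sketch the helper name read_settings and the key names Host, Port, Password, and Extensions are hypothetical; substitute the keys that your configuration file actually defines.

```lua
-- Hypothetical helper (not part of the SDK): reads a few settings from
-- one configuration section. All key names are illustrative assumptions.
function read_settings(config, section)
  local settings = {}
  -- getValue falls back to the supplied default if the key is missing.
  settings.host = config:getValue(section, "Host", "localhost")
  settings.port = config:getValue(section, "Port", "9000")
  -- getEncryptedValue returns the decrypted form of an encrypted value.
  settings.password = config:getEncryptedValue(section, "Password")
  -- getValues returns a table when a key holds several values.
  settings.extensions = config:getValues(section, "Extensions")
  return settings
end
```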

Change the Value of a Field

The functions getFieldValue, fieldGetValue and setFieldValue, fieldSetValue allow you to modify the contents of a field directly. For example:

    local content_field = document:findField("CONTENT")
    local content = document:fieldGetValue(content_field)
    local content = document:getFieldValue("CONTENT")
    content = content .. "\nCopyright MyCorp\n"
    document:setFieldValue("CONTENT", content)
    document:fieldSetValue(content_field, content)

Example Script For each document, this Lua script adds a COUNT field, a total sections count to the title, and replaces the content of each section with the section number.

NOTE The COUNT is 1 for the first document and increases as long as the job is running.

    doc_count = 0

    function handler(document)
        doc_count = doc_count + 1
        document:addField("COUNT", doc_count)

        local section_count = 0
        local section = document

        -- Replace the content of every section with its section number.
        while section do
            section_count = section_count + 1
            section:setFieldValue("CONTENT", "Section "..section_count)
            local field = section:findField("DRECONTENT")
            if field then
                section:fieldSetValue(field, "Section "..section_count)
            end
            section = section:getNextSection()
        end

        -- Append the total section count to the title.
        document:setFieldValue("TITLE", document:getFieldValue("TITLE").." Total Sections "..section_count)
        local field = document:findField("DRETITLE")
        if field then
            document:fieldSetValue(field, document:fieldGetValue(field).." Total Sections "..section_count)
        end
        return true
    end

Use Lua Scripts Within the Connector

This section describes how to use CFS connector actions in Lua scripts to transform documents. It includes the following sections:

• Introduction

• Example Lua Script

Introduction

There are occasions when documents are not to be sent to the Connector Framework Server (CFS). For example, you may use the Collect action to retrieve documents from one repository and then insert them into another. In doing so, you may need to transform the documents from the first repository before they can be accepted by the second repository. You can use a Lua script to do this. Some CFS connector configuration options and actions take a Lua script as a parameter. The information in this chapter discusses the requirements for any Lua script that is used in this way.

Example Lua Script

You can use the CollectActions parameter of the Collect action, the IngestActions parameter of the Synchronize action and the IngestActions parameter in the configuration file to specify a Lua script that runs on each document.


The Lua script takes the following parameters:

Parameter Description

config A configuration file object.

document A document object that represents the document.

params A table containing additional parameters provided by the connector. For example:
• TYPE. The type of the command being performed. This can be ADD, UPDATE, DELETE, or COLLECT.
• SECTION. The configuration section for the task.
• FILENAME. The document filename. The Lua script may modify this file, but should not delete it.

The configuration file object provides the following method:

    getValue(section, parameter, default)

To see the set of methods that the document object provides, refer to “Document Methods” on page 87. An example Lua script appears below:

    function handler( config, document, params )
        -- If these lines are uncommented, and the connector is running
        -- from the console, all the parameters in params will be output
        -- to the console.
        -- for k,v in pairs(params) do
        --     print(k,v)
        -- end

        -- Sets local variables from the parameters passed in.
        local type = params["TYPE"]
        local section = params["SECTION"]
        local filename = params["FILENAME"]

        -- Read a config setting from the config file.
        local val = config:getValue(section, "ConfigSettingName", "Value")

        -- If the document is not being deleted, set the field FieldName
        -- to the value read from the config file.
        if type ~= "DELETE" then
            document:setFieldValue("FieldName", val)
        end

        -- If this document has a file (that is, not just metadata),
        -- copy the file to a new location and write a stub idx
        -- containing the metadata with it.
        if filename ~= "" then
            local copytofilename = "OutputPath/"..create_uuid(filename)
            copy_file(filename, copytofilename)
            document:writeStubIdx(copytofilename..".idx")
        end

        return true
    end

NOTE The Lua script should return true normally, but can return false to reject the document when used as an Ingest action.
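To illustrate the note above, the following sketch rejects any document that lacks a required field. The field name AUTHOR is a hypothetical example, and the choice to let DELETE commands through unchecked is an assumption made for illustration only.

```lua
-- Sketch of an Ingest action handler that rejects incomplete documents.
-- "AUTHOR" is a hypothetical field name chosen for illustration.
function handler(config, document, params)
  -- Assumption: deletions carry no metadata worth checking, so accept them.
  if params["TYPE"] == "DELETE" then
    return true
  end
  -- Returning false rejects the document.
  if not document:hasField("AUTHOR") then
    return false
  end
  return true
end
```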

PART 2 Parameter and Command Reference

This section describes useful configuration parameters and action commands for the connector to be developed and for the Connector Framework Server.

• Parameters Common to CFS Connectors
• Parameters Common to CFS Connectors Using Java
• CFS Connector Actions
• Connector Framework Server Parameters
• License Configuration Parameters
• Logging Configuration Parameters
• Secure Socket Layer Parameters
• Service Actions
• Service Configuration Parameters

CHAPTER 8 Parameters Common to CFS Connectors

This section describes the parameters that are common to all connectors that use the Connector Framework Server (CFS). If more than one configuration file section is specified for a configuration parameter, the value in the left-most section overrides the values in the other sections.

For example, with the Configuration Section “TaskName or FetchTasks or Default,” parameter values in the TaskName section override the corresponding values in the FetchTasks section, which in turn override those in the Default section.

• ACI Server Configuration
• Import Service
• Distributed Connector
• View Server
• General Connector Parameters
• Fetch Task Configuration
• Ingestion
• GroupServer
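The left-most-section-wins rule described at the start of this chapter can be modelled in a few lines of Lua. This is an illustrative model only: it represents the configuration as a plain Lua table of sections, which is an assumption and not how the connector stores its configuration. The function name resolve_parameter is hypothetical.

```lua
-- Illustrative model (not connector code): config is a plain table that
-- maps section names to tables of key/value pairs. sections lists the
-- section names from left-most (highest priority) to right-most.
function resolve_parameter(config, sections, key)
  for _, section in ipairs(sections) do
    local values = config[section]
    if values and values[key] ~= nil then
      -- The first section that defines the key wins.
      return values[key], section
    end
  end
  return nil
end
```

For example, with sections { "TaskName", "FetchTasks", "Default" }, a key set in both TaskName and Default resolves to the TaskName value.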


ACI Server Configuration

The parameters in this section control the way the connector handles the load caused by incoming ACI requests.

FilePath Use this parameter to specify the location of the file to receive event data. Set the value as TextFileHandler to use an internal text file handler.

Type: String

Default:

Required: No

Configuration Section: EventHandler

Example: FilePath=./EventData

See Also: “LibraryName” on page 118

LibraryName Use this parameter to specify the name of the library to use as the event handler. Set it to HttpHandler to use the internal HTTP handler. Specifying the .dll or .so extension is optional.

Type: String

Default:

Required: No

Configuration Section: EventHandler

Example: LibraryName=./luaHandler

See Also: “OnError” on page 121, “OnFinish” on page 122, “OnStart” on page 122


LuaScript Use this parameter to specify the Lua script to execute on the event.

Type: String

Default:

Required: No

Configuration Section: EventHandler

Example: LuaScript=./finished_handler.lua

See Also: “LibraryName” on page 118

MaximumThreads Use this parameter to specify the maximum number of simultaneous ACI actions to process.

The number of synchronous actions (for example, getstatus or view) that should be processed simultaneously:

    [Server]
    MaximumThreads=4

The number of asynchronous actions (for example, fetch) that should be processed simultaneously:

    [Actions]
    MaximumThreads=4

Type: Integer

Default: 2

Required: No

Configuration Section: Actions or Server

Example: MaximumThreads=4


MaxQueueSize Use this parameter to specify the maximum number of asynchronous fetch action commands that will be queued by the connector. No further fetch actions will be accepted once the queue size has been reached (until the queue diminishes).

Type: Integer

Default: The default is the maximum integer value (no limit).

Required: No

Configuration Section: Actions

Example: MaxQueueSize=4

See Also: “MaxScheduledSize” on page 120

MaxScheduledSize Use this parameter to limit the number of Processing, Finished, and Error tasks that the connector stores. All actions and response data are stored in a file, actions/fetch/fetch.queue. If the MaxScheduledSize parameter is not specified, this file continues to grow with each action until a resource limit is reached, which causes unhandled failures.

If this limit is exceeded, the oldest Finished or Error action is discarded and is no longer accessible through the queueinfo action. Together, the MaxScheduledSize and MaxQueueSize parameters give the total number of actions that are stored (Queued, Processing, Finished, or Error). Queued, Processing, Finished, and Error are the action statuses reported by the queueinfo action.

Type: Integer

Default: The default is the maximum integer value (no limit).

Required: No

Configuration Section: Actions

Example: MaxScheduledSize=100

See Also: “MaxQueueSize” on page 120


OnError Use this parameter to specify the handler for the Fetch action error event.

This is the section name that will contain the LibraryName and any other settings for the event handler.

Typical configuration is (using OnFinish):

    [Actions]
    OnFinish=HttpHandler

    [HttpHandler]
    LibraryName=HttpHandler
    Url=http://localhost/dosomething?

Type: String

Default:

Required: No

Configuration Section: Actions

Example: OnError=EventHandler

OnErrorReport Use this parameter to specify the handler for the Fetch action error report event.

Type: String

Default:

Required: No

Configuration Section: Actions

Example: OnErrorReport=EventHandler


OnFinish Use this parameter to specify the handler for the Fetch action finish event.

Type: String

Default:

Required: No

Configuration Section: Actions

Example: OnFinish=EventHandler

OnStart Use this parameter to specify the handler for the Fetch action start event.

Type: String

Default:

Required: No

Configuration Section: Actions

Example: OnStart=EventHandler

Url Use this parameter to specify the URL that receives event data (used only when LibraryName=HttpHandler). Each handler type has its own configuration settings. For HttpHandler, the available settings are Url, SSLConfig, ProxyHost, ProxyPort, ProxyUser, ProxyPassword, BasicUser, and BasicPassword.

If the LibraryName parameter is set to Luahandler, the available parameter is LuaScript, which takes the path of the Lua script that handles the events. Lua script event handlers should be of the form:

function handler(request, xml)
...
end

Here, request is a table holding the request parameters, and xml is a string holding the response to the request.

Type: String

Default:

Required: No

Configuration Actions Section:

Example: Url=http://localhost/dosomething?param=value

or

Url=http://localhost:1234/?action=dosomething See Also:

Import Service

The parameters in this section control the way the connector interfaces with KeyView to extract document sub-files in response to collect or view actions.

KeyviewDirectory Use this parameter to specify the location of the KeyView filters used for sub-file extraction. This is an alternative to setting the KEYVIEW_DIRECTORY environment variable. This parameter is used only if KeyView is required (that is, if the EnableExtraction parameter is set to True).

Type: String

Default:

Required: No

Configuration ImportService Section:

Example: KeyviewDirectory=./filters See Also: “EnableExtraction” on page 131

Distributed Connector

The parameters in this section control the way the connector behaves when used with the Distributed Connector.

ConnectorGroup Use this parameter to specify the name of the connector group to which this connector belongs. The ConnectorGroup parameter can take any value. It is configured only in the individual connectors and is passed to the Distributed Connector at registration.

This parameter is used only if the RegisterConnector parameter is set to True.

Type: String

Default: Connector

Required: No

Configuration DistributedConnector Section:

Example: ConnectorGroup=Connector See Also: “RegisterConnector” on page 126

ConnectorPriority Use this parameter to specify the priority value used to distribute actions to higher priority connectors.

This parameter is used only if the RegisterConnector parameter is set to True.

Type: Integer

Default: 0

Required: No

Configuration DistributedConnector Section:

Example: ConnectorPriority=1 See Also: “RegisterConnector” on page 126

DataPortN Use this parameter to specify the dataport number(s) of the Distributed Connector(s). This parameter is used only if the RegisterConnector parameter is set to True.

Type: Integer

Default:

Required: No

Configuration DistributedConnector Section:

Example: DataPort0=9876 See Also: “RegisterConnector” on page 126

HostN Use this parameter to specify the hostname(s) or IP address(es) of the Distributed Connector(s). This parameter is used only if the RegisterConnector parameter is set to True.

Type: String

Default: localhost

Required: No

Configuration DistributedConnector Section:

Example: Host0=localhost See Also: “RegisterConnector” on page 126

PortN Use this parameter to specify the port number(s) of the Distributed Connector(s). This parameter is used only if the RegisterConnector parameter is set to True.

Type: Integer

Default: 10000

Required: No

Configuration DistributedConnector Section:

Example: Port0=10000 See Also: “RegisterConnector” on page 126

RegisterConnector Use this parameter to register with the Distributed Connector. The connector will wait at startup for registration to be successful.

If the synchronizestate parameter of the fetch action is set, it holds the locations of the datastore files that are used for the tasks. In this case, the DatastoreFile parameter is ignored. The synchronizestate parameter is set when the connector is used through the Distributed Connector.

Type: Boolean

Default: False

Required: No

Configuration DistributedConnector Section:

Example: RegisterConnector=False See Also:

SharedPath Use this parameter to specify the path to a location common to all connectors in the Connector Group. This location is used to store the compressed datastore files used by the connectors. This parameter is used only if the RegisterConnector parameter is set to True.

Type: String

Default: The value of the TempDirectory parameter.

Required: No

Configuration DistributedConnector Section:

Example: SharedPath=./temp See Also: “RegisterConnector” on page 126

SSLConfigN Use this parameter to specify the section(s) containing SSL settings for the Distributed Connector(s). This parameter is used only if the RegisterConnector parameter is set to True.

Type: String

Default:

Required: No

Configuration DistributedConnector Section:

Example: SSLConfig0=SSL See Also: “RegisterConnector” on page 126

View Server

The parameters in this section allow the connector’s view action to use a View Server.

EnableViewServer If this parameter is set to True, documents retrieved by a view action are processed by the View Server before being returned. If set to False, the original documents are returned.

Type: Boolean

Default: False

Required: No

Configuration Connector and ViewServer Section:

Example: EnableViewServer=False See Also:

Host Use this parameter to specify the hostname or IP address of the View Server. This parameter is used only if the EnableViewServer parameter is set to True.

Type: String

Default: localhost

Required: No

Configuration ViewServer Section:

Example: Host=localhost See Also: “EnableViewServer” on page 128

Port Use this parameter to specify the port number of the View Server. This parameter is used only if the EnableViewServer parameter is set to True.

Type: Integer

Default: 9000

Required: No

Configuration ViewServer Section:

Example: Port=9000 See Also: “EnableViewServer” on page 128

SharedPath Use this parameter to specify the path to a location accessible by both the connector and the View Server. Intermediate files are stored here. This parameter is used only if the EnableViewServer parameter is set to True.

Type: String

Default: The value of the TempDirectory parameter.

Required: No

Configuration ViewServer Section:

Example: SharedPath=./temp See Also: “EnableViewServer” on page 128

General Connector Parameters

CleanOnStart Set this parameter to True to delete actions and the temp directory on start. Any action data stored in the actions folder is deleted - including all Queued actions.

Type: Boolean

Default: False

Required: No

Configuration Connector Section:

Example: CleanOnStart=True See Also:

DatastoreFile Use this parameter to override the name of the datastore file used by synchronize actions. Normally, you should use the default value for this parameter.

Type: String

Default: connector_TaskName_datastore.db

Required: No

Configuration TaskName or FetchTasks Section:

Example: DatastoreFile=./Datastore/Datastore.db See Also: “SynchronizeKeepDatastore” on page 135

DatastoreDirectory Use this parameter to specify the directory where datastore files are stored. This parameter is not used when the DistributedConnector section is in use or when the DatastoreFile parameter is specified.

Type: String

Default: .

Required: No

Configuration Connector Section:

Example: DatastoreDirectory=./Datastore/ See Also: “DatastoreFile” on page 130

EnableExtraction Use this parameter to enable the extraction of sub-files for collect and view actions. Extraction requires the KeyView filters to be present and their location to be specified in the KeyviewDirectory parameter or the KEYVIEW_DIRECTORY environment variable.

Type: Boolean

Default: False

Required: No

Configuration Connector Section:

Example: EnableExtraction=False See Also: “KeyviewDirectory” on page 123

EnableExtractionCopy Generally, this parameter is relevant only to the File System Connector, which acts on the original documents instead of temporary copies. When KeyView performs extraction from certain file types, it has the side effect of updating the document. In particular, the modified date is updated, which causes the connector to re-ingest the document on the next synchronize action.

To avoid these modifications, the solution is to make a copy of the original document (by setting this parameter to True) and perform the extraction on the copy. For other connectors, enabling this setting will have no effect, since the connectors will be downloading temporary copies and will have ownership of the files.

Type: Boolean

Default: False

Required: No

Configuration Connector Section:

Example: EnableExtractionCopy=False See Also: “KeyviewDirectory” on page 123

EnableScheduledTasks Use this parameter to enable internal scheduling of synchronize actions. When this is set to True, the numbered tasks configured in the FetchTasks section will be performed according to their schedules. If this is set to False, synchronize actions will be performed only in response to an ACI request.

Type: Boolean

Default: True

Required: No

Configuration Connector Section:

Example: EnableScheduledTasks=False See Also:

EncryptACLEntries If this parameter is set to False, the entries in ACLs will not be encrypted. This should be used only for troubleshooting.

NOTE Some connectors allow this parameter to be set in the Task section. Not every connector that supports security supports this parameter.

Type: Boolean

Default: True

Required: No

Configuration Connector Section:

Example: EncryptACLEntries=True See Also:

HashedDestinationDirectory Set this parameter to True to use sub-directories within the collect destination directory. The collect destination is a location on the file system specified by the destination parameter of the collect fetch action: /action=fetch&fetchaction=collect&identifiers=<...>&destination=\\foo\bar

Type: Boolean

Default: False

Required: No

Configuration Connector Section:

Example: HashedDestinationDirectory=True See Also: “TempDirectory” on page 138

HashedTempDirectory Set this parameter to True to use sub-directories within TempDirectory.

Type: Boolean

Default: False

Required: No

Configuration Connector Section:

Example: HashedTempDirectory=True See Also: “TempDirectory” on page 138

InsertActions Use this parameter to perform some actions on each document before insertion. Each action is in the form ‘ACTIONNAME:ACTIONPARAMETERS’. Possible actions are:

Action   Parameters            Example
META     field=value           InsertActions=META:MyField=MyValue
LUA      lua_script_filename   InsertActions=LUA:myLuaScript.lua

Type: String

Default:

Required: No

Configuration Connector Section:

Example: InsertActions=META:MyField=MyValue See Also: “IngestActions” on page 143

InsertFailedDirectory Use this parameter to specify the directory where failed insert commands are written.

Type: String

Default: ./insertfailed

Required: No

Configuration Connector Section:

Example: InsertFailedDirectory=./insertfailed See Also:

MinFreeSpaceMB The MinFreeSpaceMB parameter defines the minimum amount of free disk space (in megabytes) that must be available for a fetch action to be processed. If the specified amount of free space is not available, the fetch action is not processed and an error is returned in the ACI response.

Type: Integer

Default: 1024

Required: No

Configuration Connector Section:

Example: MinFreeSpaceMB=1024 See Also:
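The check that MinFreeSpaceMB describes amounts to comparing the usable disk space with a threshold. The sketch below is illustrative only: the FreeSpaceCheck class is a hypothetical name, the choice of the working directory as the volume to inspect is an assumption, and one megabyte is assumed to be 1024 x 1024 bytes. The manual does not state which volume the connector checks or how it rounds.

```java
import java.io.File;

// Hypothetical helper; not part of the ConnectorLib API.
class FreeSpaceCheck {
    // True when the usable space meets the configured minimum.
    // Assumes 1 MB = 1024 * 1024 bytes.
    static boolean hasEnoughSpace(long usableBytes, int minFreeSpaceMB) {
        return usableBytes >= (long) minFreeSpaceMB * 1024L * 1024L;
    }

    public static void main(String[] args) {
        // Assumption: inspect the volume that holds the working directory.
        long usable = new File(".").getUsableSpace();
        int minFreeSpaceMB = 1024; // the documented default
        if (hasEnoughSpace(usable, minFreeSpaceMB)) {
            System.out.println("Enough free space to process a fetch action");
        } else {
            System.out.println("Not enough free space; a fetch action would be rejected");
        }
    }
}
```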

SynchronizeKeepDatastore When this parameter is set to True, a datastore file (.db) will remain on the disk after a synchronize action has been performed. When the synchronize action is next performed, this file will be checked by the connector so that it can index only documents that have changed since the last synchronize and can delete documents that have been deleted since the last synchronize.

If this parameter is set to False, the datastore file is deleted at the end of each synchronize action. The next synchronize action will fetch all documents and will not delete old documents.

Type: Boolean

Default: True

Required: No

Configuration Connector Section:

Example: SynchronizeKeepDatastore=False See Also:

SynchronizeThreads Use this parameter to specify the number of threads to use for synchronization if the connector supports multi-threading. This parameter will not have an effect if the connector does not support multi-threading. In cases where this method is not supported by the connector, multiple tasks can be executed using the alternative TaskThreads setting.

Type: Integer

Default: 5

Required: No

Configuration Connector Section:

Example: SynchronizeThreads=4 See Also:

TaskMaxAdds Use this parameter to specify the maximum number of Adds to be processed by the task. The value 0 indicates an infinite number.

Type: Integer

Default: 0

Required: No

Configuration TaskName Section:

Example: TaskMaxAdds=0 See Also:

TaskMaxDuration Use this parameter to specify the maximum duration of the task in the format H[H][:MM][:SS].

Type: String

Default:

Required: No

Configuration TaskName Section:

Example: TaskMaxDuration=12:30:00 See Also:
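When generating or validating configuration files, it can help to check a value against the H[H][:MM][:SS] format before deploying it. The following is a minimal sketch; the DurationParser class is a hypothetical name, and the rule that minutes and seconds must not exceed 59 is an assumption that the format description does not state.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not part of the ConnectorLib API.
class DurationParser {
    // One or two hour digits, then optional :MM and optional :SS.
    private static final Pattern FORMAT =
            Pattern.compile("(\\d{1,2})(?::(\\d{2}))?(?::(\\d{2}))?");

    // Convert a value such as 12:30:00 to a number of seconds.
    static long toSeconds(String value) {
        Matcher m = FORMAT.matcher(value.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Expected H[H][:MM][:SS], got: " + value);
        }
        long hours = Long.parseLong(m.group(1));
        long minutes = m.group(2) == null ? 0 : Long.parseLong(m.group(2));
        long seconds = m.group(3) == null ? 0 : Long.parseLong(m.group(3));
        // Assumption: minutes and seconds are limited to 0-59.
        if (minutes > 59 || seconds > 59) {
            throw new IllegalArgumentException("Minutes and seconds must be 0-59: " + value);
        }
        return hours * 3600 + minutes * 60 + seconds;
    }

    public static void main(String[] args) {
        System.out.println(toSeconds("12:30:00")); // prints 45000
        System.out.println(toSeconds("2:15"));     // prints 8100
        System.out.println(toSeconds("1"));        // prints 3600
    }
}
```

The same format is used by the ScheduleStartTime parameter, so the same check can be applied to both.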

TaskThreads Use this parameter to specify the number of simultaneous tasks that can be performed for an action. Each action=fetch action results potentially in a number of tasks. Each task generally consists of performing a subset of the action using a particular configuration section. For example, a single synchronize action can actually mean performing a synchronize action for multiple configured tasks. For collect and other actions where identifiers are provided, the identifiers are tied to particular configuration sections, so the whole action can span across several configuration sections. Generally, a single task is for a particular configuration section. Each would be processed on a separate thread.

Type: Integer

Default: 1

Required: No

Configuration Connector Section:

Example: TaskThreads=1 See Also:

TempDirectory Use this parameter to specify the directory used to store temporary documents and files.

Type: String

Default: ./temp

Required: No

Configuration Connector Section:

Example: TempDirectory=./TempFiles See Also:

XsltDLL Use this parameter to specify the location of the autnxslt library.

Type: String

Default: autnxslt.dll (if present)

Required: No

Configuration Connector Section:

Example: XsltDLL=autnxslt.dll See Also:

Fetch Task Configuration

Each action the connector performs consists of one or more tasks. Each task is associated with a section in the configuration file. The section to use is either specified in an action parameter or encoded in each document identifier supplied to the action. The parameters below let you specify a numbered list of tasks. This is the set of tasks that will be performed when the connector performs a synchronize action whose parameters do not specify which tasks should be performed. The connector will also run synchronize actions for these tasks automatically according to the configured schedules.

IngestConfigSection Use this parameter to specify an alternative configuration section from which the ingest settings are read in preference to the task section.

Type: String

Default: task name

Required: No

Configuration TaskName or FetchTasks or Connector Section:

Example: IngestConfigSection=MyTask1 See Also:

N Use this parameter to specify the name of the task section containing the parameters for the synchronize task to be performed. The task will only be performed if N is less than Number and greater than or equal to 0.

Type: String

Default:

Required: No

Configuration FetchTasks or Connector Section:

Example: Number=2 0=Task1 1=Task2 See Also: “Number” on page 140

Number The connector schedules the tasks named by the numbered parameters 0 through Number-1. Numbers can be missing from the sequence. Alternatively, leave the Number parameter at its default value of -1. In this case, the tasks configured from 0 up to the first missing number are used.

For example, this configuration executes Task0 and Task2:

[FetchTasks]
Number=3
0=Task0
2=Task2

This configuration executes only Task0:

[FetchTasks]
0=Task0
2=Task2

Type: Integer

Default: -1

Required: No

Configuration FetchTasks or Connector Section:

Example: Number=2 0=Task1 1=Task2 See Also: “N” on page 139

ScheduleCycles Use this parameter to specify the number of scheduled synchronize actions to perform.

A value of -1 repeats the task indefinitely. A value of 0 performs the task once. Any positive value specifies the number of times to perform the task. This parameter has an effect only if the EnableScheduledTasks parameter is set to True.

Type: Integer

Default: -1

Required: No

Configuration TaskName or FetchTasks or Connector Section:

Example: ScheduleCycles=3 See Also: “EnableScheduledTasks” on page 132

ScheduleRepeatSecs Use this parameter to specify the interval (in seconds) between scheduled synchronize actions. This parameter has an effect only if the EnableScheduledTasks parameter is set to True.

Type: Integer

Default: 86400

Required: No

Configuration TaskName or FetchTasks or Connector Section:

Example: ScheduleRepeatSecs=3600 See Also: “EnableScheduledTasks” on page 132

ScheduleStartTime Use this parameter to specify the start time of the first scheduled synchronize action in the format H[H][:MM][:SS]. This parameter has an effect only if the EnableScheduledTasks parameter is set to True.

Type: String

Default:

Required: No

Configuration TaskName or FetchTasks or Connector Section:

Example: ScheduleStartTime=14:30:00 See Also: “EnableScheduledTasks” on page 132
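To see how ScheduleStartTime, ScheduleRepeatSecs, and ScheduleCycles combine, it can be useful to list the start times they imply. The following is a sketch under stated assumptions: the SchedulePreview class and its maxPreview argument are hypothetical, the date of the first run is supplied by the caller, and each run is assumed to start ScheduleRepeatSecs after the previous start (the manual does not say whether the interval is measured from the start or the end of the previous run).

```java
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of the ConnectorLib API.
class SchedulePreview {
    // List up to maxPreview start times for a scheduled task.
    // scheduleCycles: -1 repeats forever, 0 runs once, n > 0 runs n times.
    static List<LocalDateTime> preview(LocalDateTime firstStart, long repeatSecs,
                                       int scheduleCycles, int maxPreview) {
        int runs;
        if (scheduleCycles < 0) {
            runs = maxPreview;                // unbounded, so cap the preview
        } else if (scheduleCycles == 0) {
            runs = Math.min(1, maxPreview);   // performed once
        } else {
            runs = Math.min(scheduleCycles, maxPreview);
        }
        List<LocalDateTime> starts = new ArrayList<>();
        for (int i = 0; i < runs; i++) {
            starts.add(firstStart.plusSeconds(i * repeatSecs));
        }
        return starts;
    }

    public static void main(String[] args) {
        // ScheduleStartTime=14:30:00, ScheduleRepeatSecs=3600, ScheduleCycles=3
        LocalDateTime first = LocalDateTime.of(2012, 5, 10, 14, 30, 0);
        for (LocalDateTime start : preview(first, 3600, 3, 10)) {
            System.out.println(start);
        }
    }
}
```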

Ingestion

The parameters in this section specify where the documents fetched by the synchronize action should be sent.

EnableIngestion Set this parameter to True if documents fetched by the synchronize action should be sent to the CFS or to another connector.

Type: Boolean

Default: True

Required: No

Configuration TaskName or Ingestion or Connector Section:

Example: EnableIngestion=False See Also:

IndexDatabase Use this parameter to specify the value assigned to the DREDBNAME field for all documents.

Type: String

Default:

Required: No

Configuration TaskName or Ingestion Section:

Example: IndexDatabase=News See Also:

IngestActions Use this parameter to specify a comma-separated list of actions to perform on each document before it is sent to the CFS. Each action is in the form ‘ACTIONNAME:ACTIONPARAMETERS’. Possible actions are:

Action   Parameters            Example
META     field=value           IngestActions=META:MyField=MyValue
LUA      lua_script_filename   IngestActions=LUA:myLuaScript.lua

Type: String

Default:

Required: No

Configuration TaskName or Ingestion Section:

Example: IngestActions=META:MyField=MyValue See Also: “InsertActions” on page 134

Related Topics: “Use Lua Scripts” on page 67

IngestAddAsUpdate If you set this parameter to True, Add commands are treated as Updates for full metadata updating.

Type: Boolean

Default: False

Required: No

Configuration TaskName or Ingestion Section:

Example: IngestAddAsUpdate=False See Also:

IngestBatchSize Use this parameter to specify the number of documents that are sent to the CFS in a single batch.

This parameter has an effect only if the EnableIngestion parameter is set to True.

Type: Integer

Default: 100

Required: No

Configuration TaskName or Ingestion Section:

Example: IngestBatchSize=200 See Also: “EnableIngestion” on page 142
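The effect of IngestBatchSize can be pictured as splitting the list of fetched documents into fixed-size groups, with a smaller final group when the count does not divide evenly. The sketch below is illustrative only: the IngestBatcher class is a hypothetical name, and documents are represented by arbitrary objects rather than by any ConnectorLib type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of the ConnectorLib API.
class IngestBatcher {
    // Split documents into consecutive batches of at most batchSize items.
    static <T> List<List<T>> batches(List<T> documents, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> result = new ArrayList<>();
        for (int i = 0; i < documents.size(); i += batchSize) {
            int end = Math.min(i + batchSize, documents.size());
            result.add(new ArrayList<>(documents.subList(i, end)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("doc1", "doc2", "doc3", "doc4", "doc5");
        System.out.println(batches(docs, 2)); // prints [[doc1, doc2], [doc3, doc4], [doc5]]
    }
}
```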

IngestCheckFinished If the IngesterType parameter is set to CFS, setting the IngestCheckFinished parameter to True will cause the connector to wait until documents have been added to the import queue before returning success. If sending to another connector, this connector will wait until the action completes. More specifically, the task is held in a queue, and each time the connector attempts to send more data to the destination connector, it will check the status of the previous actions sent and will only mark those tasks as complete once the status returns Finished. This parameter has an effect only if the EnableIngestion parameter is set to True.

Type: Boolean

Default: False

Required: No

Configuration TaskName or Ingestion Section:

Example: IngestCheckFinished=True See Also: “EnableIngestion” on page 142 “IngesterType” on page 147

IngestConnectorConfigSection Use this parameter to specify the configuration section that the destination connector uses. This parameter applies only when IngesterType=Connector. Normally, when a document is sent to another connector for insertion, the destination connector must read any settings required for the insertion from a configuration section with the same name as the source connector task. This parameter allows that default section to be overridden. For example, Connector1 performs task Task1, which retrieves a document. Connector2 would normally have to configure the insertion in the Task1 section of its own configuration file. This parameter allows a different section to be used specifically for the insertion.

Type: String

Default: IngestConfigSection

Required: No

Configuration TaskName or Ingestion Section:

Example: IngestConnectorConfigSection=IngestConfigSection See Also: “IngesterType” on page 147

IngestDataPort Use this parameter to specify the DataPort number of the destination server.

This parameter has an effect only if the EnableIngestion parameter is set to True.

Type: Integer

Default:

Required: No

Configuration TaskName or Ingestion Section:

Example: IngestDataPort=7051 See Also: “EnableIngestion” on page 142

IngestDelayMS Use this parameter to specify the number of milliseconds to pause between adding individual documents to the ingest queue.

Type: Integer

Default: 0

Required: No

Configuration Ingestion or TaskName Section:

Example: IngestDelayMS=0 See Also:

IngestEnableAdds Use this parameter to specify whether or not Add commands should be sent.

Type: Boolean

Default: True

Required: No

Configuration Ingestion or TaskName Section:

Example: IngestEnableAdds=True See Also:

IngestEnableDeletes Use this parameter to specify whether or not Delete commands should be sent.

Type: Boolean

Default: True

Required: No

Configuration Ingestion or TaskName Section:

Example: IngestEnableDeletes=True See Also:

IngestEnableUpdates Use this parameter to specify whether or not Update commands should be sent.

Type: Boolean

Default: True

Required: No

Configuration Ingestion or TaskName Section:

Example: IngestEnableUpdates=True See Also:

IngestHashedSharedPath Use this parameter to specify whether to use sub-directories within IngestSharedPath.

Type: String

Default: HashedTempDirectory

Required: No

Configuration TaskName or Ingestion Section:

Example: IngestHashedSharedPath=HashedTempDirectory See Also: “IngestSharedPath” on page 150

IngesterType Use this parameter to specify the type of ingestion process. The only allowed values are CFS, AsyncPiranha (alias for CFS), Connector, and ConnectorInsert (alias for Connector). If this parameter is set to CFS, the IngestHost and IngestPort parameters point to a Connector Framework Server (CFS) (which can be used to import the documents and index them).

If this parameter is set to Connector, the IngestHost and IngestPort parameters point to another connector. Documents fetched from this repository by the synchronize action will be inserted into another repository using the connector specified. If this option is used, you will probably need to use the IngestActions parameter to convert the document into a form that can be handled by the other connector.

Note that the synchronize action can result in Add, Update and Delete ingest commands. Adds result in insert actions, Updates result in update actions, and Deletes result in delete actions being sent to the destination connector. This parameter has an effect only if the EnableIngestion parameter is set to True.

Type: String

Default: AsyncPiranha

Required: No

Configuration TaskName or Ingestion Section:

Example: IngesterType=AsyncPiranha See Also: “EnableIngestion” on page 142 “IngestActions” on page 143 “IngestHost” on page 148 “IngestPort” on page 149

IngestHost Use this parameter to specify the hostname or IP address of the destination server.

This parameter has an effect only if the EnableIngestion parameter is set to True.

Type: String

Default: localhost

Required: No

Configuration TaskName or Ingestion Section:

Example: IngestHost=localhost See Also: “EnableIngestion” on page 142

IngestKeepFiles If this parameter is set to True, downloaded documents will not be deleted after they have been ingested.

Type: Boolean

Default: False

Required: No

Configuration TaskName or Ingestion Section:

Example: IngestKeepFiles=True See Also: “EnableIngestion” on page 142

IngestPort Use this parameter to specify the port number of the destination server.

This parameter has an effect only if the EnableIngestion parameter is set to True.

Type: Integer

Default: 7000

Required: No

Configuration TaskName or Ingestion Section:

Example: IngestPort=7050 See Also: “EnableIngestion” on page 142

IngestSendByType Use this parameter to specify whether or not to send Add, Update, and Delete commands separately.

Type: Boolean

Default: False

Required: No

Configuration TaskName or Ingestion Section:

Example: IngestSendByType=False See Also:

IngestSharedPath Use this parameter to specify the location to which documents are saved before ingestion. This should be a path accessible by both the connector and the ingest server.

Type: String

Default: The value of the TempDirectory parameter.

Required: No

Configuration TaskName or Ingestion Section:

Example: IngestSharedPath=./TempDirectory See Also: “EnableIngestion” on page 142

IngestSSLConfig Use this parameter to specify the configuration file section containing the SSL settings that should be used when communicating with the CFS. For more information on the contents of this section, refer to “Secure Socket Layer Parameters” on page 235.

Type: String

Default:

Required: No

Configuration TaskName or Ingestion Section:

Example: IngestSSLConfig=SSLSettings See Also: “EnableIngestion” on page 142

IngestWriteIDX If the IngestKeepFiles parameter is set to True, setting the IngestWriteIDX parameter to True causes the connector to write document metadata in a stub IDX file alongside the document.

Type: Boolean

Default: False

Required: No

Configuration TaskName or Ingestion Section:

Example: IngestWriteIDX=False See Also: “EnableIngestion” on page 142

GroupServer

The parameters in this section are used when performing an update to an instance of OmniGroupServer using the SynchronizeGroups fetch action.

GroupServerHost Use this parameter to specify the host name or IP address of the group server.

Type: String

Default: localhost

Required: Yes

Configuration TaskName or GroupServer Section:

Example: GroupServerHost=localhost See Also:

GroupServerPort Use this parameter to specify the port number of the group server.

Type: Integer

Default: 3057

Required: Yes

Configuration TaskName or GroupServer Section:

Example: GroupServerPort=3057 See Also:

GroupServerRepository Use this parameter to specify the group server repository name. This is the name of any repository section in an OmniGroupServer configuration file. It is named based on the name of the repository the connector is retrieving from.

Type: String

Default:

Required: Yes

Configuration TaskName or GroupServer Section:

Example: GroupServerRepository=RepositoryGroups See Also:

GroupServerSSLConfig Use this parameter to specify the section containing SSL settings for the group server.

Type: String

Default:

Required:

Configuration TaskName or GroupServer Section:

Example: GroupServerSSLConfig=SSLConfig1 See Also:

Related Topics: “Synchronize Groups Fetch Action” on page 164

CHAPTER 9 Parameters Common to CFS Connectors Using Java

This chapter describes the parameters that specify details related to the CFS connectors’ Java files. These parameters are found in the [Connector] section.

JavaClassPath Specify the class path to use, including all dependencies of the connector. This must include the class specified by the JavaConnectorClass parameter. The Java class path must point to all the jar files found in the lib directory when installed. This must include the JavaConnector.jar and will usually include one other connector jar (ConnectorName.jar) file plus any number of dependencies and properties files or other resources required by the libraries.

If the connector is run through Java directly (using the ConnectorLibJava library), the CLASSPATH should be set on the command line. For example:

java -classpath <classpath> com.autonomy.connector.ConnectorBase -configfile <configfile>

Type: String

Default: None

Required: Yes

Configuration Connector Section:

Example: JavaClassPath=./lib/JavaConnector.jar;./lib/MyConnector.jar

or

JavaClassPath0=./lib/JavaConnector.jar
JavaClassPath1=./lib/MyConnector.jar

See Also: “JavaConnectorClass” on page 156
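As a rough illustration of the numbered form shown above, the following sketch turns a list of jar paths into JavaClassPathN lines. The class and method names are hypothetical and are not part of the SDK; only the JavaClassPathN key format is taken from the example above. It assumes Java 9 or later.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of ConnectorLib): formats jar paths as
// numbered JavaClassPathN entries for the [Connector] section.
final class ClassPathConfig {

    static List<String> numberedEntries(List<String> jarPaths) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < jarPaths.size(); i++) {
            // Numbering starts at zero, as in JavaClassPath0 and JavaClassPath1.
            lines.add("JavaClassPath" + i + "=" + jarPaths.get(i));
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> jars = List.of("./lib/JavaConnector.jar", "./lib/MyConnector.jar");
        numberedEntries(jars).forEach(System.out::println);
    }
}
```

The numbered form avoids embedding a platform-specific path separator in a single value, which may make a configuration easier to move between systems.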

JavaConnectorClass Specify the full class name of the Java class that contains the implementation of the connector. Package separators can be either slash (/) or dot (.).

Type: String

Default: None

Required: Yes

Configuration Section: Connector

Example: JavaConnectorClass=com.autonomy.connector.example.FileSystemConnector

See Also: “JavaClassPath” on page 155

JavaLibraryPath Set this parameter to the location of any additional native libraries required by any of the connector classes or dependencies.

Type: String

Default: ““

Required: No

Configuration Section: Connector

Example: JavaLibraryPath=./lib See Also: “JavaClassPath” on page 155

JavaMaxMemoryMB Set this parameter to the maximum memory (in megabytes) to be allocated to the Java virtual machine.

Type: Integer

Default: 64

Required: No

Configuration Section: Connector

Example: JavaMaxMemoryMB=256 See Also: “JavaClassPath” on page 155

JVMLibraryPath Use this parameter to specify the location of the JVM library (for example, jvm.dll or libjvm.so). The connector looks for the library in the locations specified by the following:

 JVMLibraryPath parameter
 Location set in the Windows registry
 JAVA_HOME environment variable
 System library path (PATH, LD_LIBRARY_PATH, and so forth)

Type: String

Default: None

Required: No, however, the connector will not start if the jvm library cannot be found or loaded.

Configuration Section: Connector

Example: JVMLibraryPath=./jre/bin/client See Also:


JavaVerboseGC Set this parameter to True to enable verbose garbage collection (for debugging purposes only).

Type: Boolean

Default: False

Required: No

Configuration Section: Connector

Example: JavaVerboseGC=False See Also:

CHAPTER 10  CFS Connector Actions

CFS connectors may provide one or more of the ACI actions described here. Not all connectors support all the actions. The sample HTTP requests in this section are split across multiple lines for readability. When using these requests, the whole request should be on one line and contain no spaces. Brackets ([]) enclosing a parameter indicate that the parameter is optional.

 Synchronous Versus Asynchronous Actions

 QueueInfo Action

 Synchronize Fetch Action

 Synchronize Groups Fetch Action

 Collect Fetch Action

 Identifiers Fetch Action

 Insert Fetch Action

 Delete/Remove Fetch Action

 Hold and ReleaseHold Fetch Actions

 Update Action

 View Action

 StopFetch Action


Synchronous Versus Asynchronous Actions

Some of the actions described here are synchronous and others are asynchronous. The connector does not respond to a synchronous action until it has completed the request. The result of the action is in the response to the request. An asynchronous action responds immediately; the request is added to a queue of actions to be performed. The response to the request contains a token. You can use this token to determine whether the request has finished and the results of the action. You can do this using the QueueInfo action.

Example http://localhost:1234/action=Fetch&FetchAction=Synchronize

Response FETCH SUCCESS MTAuMi4xMDUuMzQ6MTIzNDpGRVRDSDoxNzUyNTc4MDc0

QueueInfo Action

The QueueInfo action provides information about the asynchronous actions that CFS or a connector is processing. Use this action to determine whether a task has completed and to retrieve the results of the task.

http://host:port/action=QueueInfo
&QueueName=QueueName
&QueueAction=QueueAction
[&Token=Token]

QueueInfo is a synchronous action.


Parameter Name Description

QueueName The name of the queue you wish to retrieve information about. There is one queue per asynchronous action. Most of the connector’s functionality is accessed through action=Fetch, so usually you should specify Fetch.

QueueAction The action you wish to perform on the queue. Possible actions are: GetStatus. The response provides information about the action currently on the queue.

Token This restricts the response to information about the action identified by the token.

Example

http://localhost:1234/ action=QueueInfo&QueueName=Fetch&QueueAction=GetStatus
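The request above can be assembled programmatically. The following is a minimal sketch, assuming Java 10 or later; the class and method names are hypothetical and not part of the SDK, and percent-encoding the token with URLEncoder is an assumption based on the encoded identifiers shown elsewhere in this chapter.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of ConnectorLib): builds a QueueInfo
// GetStatus request on a single line with no spaces.
final class QueueInfoRequest {

    static String build(String host, int port, String queueName, String token) {
        StringBuilder url = new StringBuilder();
        url.append("http://").append(host).append(':').append(port);
        url.append("/action=QueueInfo");
        url.append("&QueueName=").append(encode(queueName));
        url.append("&QueueAction=GetStatus");
        // Token is optional; when present it restricts the response to one action.
        if (token != null && !token.isEmpty()) {
            url.append("&Token=").append(encode(token));
        }
        return url.toString();
    }

    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(build("localhost", 1234, "Fetch", null));
    }
}
```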

Response A sample response appears below. Each action in the queue appears between tags. This example shows a single synchronize action on the queue. This has finished and the response is included between the action tags. QUEUEINFO SUCCESS SYNCHRONIZE


DIR1 DIR2 MTAuMi4xMDUuMzQ6MTIzNDpGRVRDSDoxNDAyOTU3MzY4 Finished 2009-Oct-15 14:44:32 0 2009-Oct-15 14:44:32 3 2009-Oct-15 14:44:35

Synchronize Fetch Action

This action is used to search a repository for document updates and send these updates to an Ingestion module.

http://host:port/action=Fetch&FetchAction=Synchronize
[&Config=Base64_Config]
[&TaskSections=Section_CSV]
[&IngestActions=Document_Action_CSV]

Type: Asynchronous

Parameter Name Description

Config Optional Base64 encoded configuration file. If this parameter is specified, then the encoded configuration options are used instead of the options in the connector configuration file.

TaskSections The names of the task sections to use to perform synchronization. If this parameter is unspecified, all configured task sections are used.


IngestActions This parameter specifies actions to perform on documents before they are ingested. This can be a list of document actions of the form action:parameters, processed from left to right. The available document actions are:

 META. Add a custom field to the document, specified as META:Fieldname=FieldValue
 LUA. Execute a Lua script on the document, specified as LUA:Luascript

Example: To add a field CATEGORY=FILESYSTEM to every document, specify the ingest action as: IngestActions=META:CATEGORY=FILESYSTEM Any commas in the action parameters should be escaped with a backslash (\).
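The comma-escaping rule can be sketched as follows. This is an illustration only: the class and method names are hypothetical, the NOTE field is invented for the example, and whether backslashes themselves also need escaping is not stated here, so the sketch leaves them untouched. It assumes Java 9 or later.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper (not part of ConnectorLib): joins document actions
// into the CSV form used by IngestActions.
final class DocumentActions {

    // A comma inside one action is escaped so it is not read as a separator.
    static String escapeCommas(String action) {
        return action.replace(",", "\\,");
    }

    // Actions are kept in order because they are processed from left to right.
    static String toCsv(List<String> actions) {
        return actions.stream()
                .map(DocumentActions::escapeCommas)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // NOTE is an invented field whose value contains a comma.
        List<String> actions = List.of("META:CATEGORY=FILESYSTEM", "META:NOTE=a,b");
        System.out.println("IngestActions=" + toCsv(actions));
    }
}
```

The resulting value would still need URL encoding before it is placed in a request.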

Example http://host:port/action=Fetch&FetchAction=Synchronize

Response A sample response appears below. In this example, two tasks were performed as part of the synchronize (DIR1 and DIR2). Both of these found 10 new documents, but ingestion failed for all 20 documents. SYNCHRONIZE DIR1 DIR2 MTAuMi4xMDUuMzQ6MTIzNDpGRVRDSDoxNDAyOTU3MzY4


Finished 2009-Oct-15 14:44:32 0 2009-Oct-15 14:44:32 3 2009-Oct-15 14:44:35

Synchronize Groups Fetch Action

This action is used to search a repository for group updates and send these updates to an Ingestion module.

http://host:port/action=Fetch&FetchAction=SynchronizeGroups
[&Config=Base64_Config]
[&TaskSections=Section_CSV]

Type: Asynchronous

Parameter Name Description

Config Optional Base64 encoded configuration file. If this parameter is specified, then the encoded configuration options are used instead of the options in the connector configuration file.

TaskSections The names of the task sections to use to perform synchronization. If this parameter is unspecified, all configured task sections are used. As a minimum, the sections should include the GroupServerHost and GroupServerPort parameters, in addition to any connector-specific parameters.

Example http://host:port/action=Fetch&FetchAction=SynchronizeGroups

Response A sample response appears below. In this example, two tasks were performed as part of the synchronize groups (GROUPS1 and GROUPS2). SYNCHRONIZEGROUPS GROUPS1 GROUPS2 MTAuMi4xMDUuMzQ6MTIzNDpGRVRDSDoxNDAyOTU3MzY4


Finished 2009-Oct-15 14:44:32 0 2009-Oct-15 14:44:32 3 2009-Oct-15 14:44:35

Collect Fetch Action

This action is used to retrieve documents and metadata from a repository by their identifiers and send them either to be ingested or to a specified location.

http://host:port/action=Fetch&FetchAction=Collect
[&Config=Base64_Config]
[&Identifiers=Identifier_CSV]
[&IdentifiersXML=Identifier_XML]
[&CollectActions=Document_Action_CSV]
[&Destination=UNC_Path]

Type: Asynchronous

Parameter Name Description

Config Optional Base64 encoded configuration file. If this parameter is specified, then the encoded configuration options are used instead of the options in the connector configuration file.

failedDirectory The directory in which the action will report failures.

Identifiers A CSV of the identifiers of the documents to be collected.


IdentifiersXML This parameter can be specified in addition to the Identifiers parameter and specifies additional identifiers to collect along with a set of custom metadata to be associated with each collected document. This data should be provided in XML format as below:

Destination Output destination as UNC Path. If this is blank, the documents are added to the ingest queue. The parameter can use fields from the document or identifier to construct the resulting destination for each document. To add a document field value as part of the destination, use the tag within the string. To add an identifier field value as part of the destination use the tag within the string. Example: destination=\\server\share\\ Where a field can have multiple values or is a CSV, multiple destinations are created and each gets a copy of the document. A CSV can be specified by preceding the colon with the CSV separator character. Example: .


CollectActions This parameter specifies actions to perform on documents before they are transferred to their destination. This can be a list of document actions of the form action:parameters, processed from left to right. The available document actions are:

 META. Add a custom field to the document, specified as META:Fieldname=FieldValue
 ZIP. Add the document to a zip file, specified as ZIP:Filename[:Password]
 LUA. Execute a Lua script on the document, specified as LUA:Luascript

Example: To add a field CATEGORY=FILESYSTEM to every document, zip all documents with a password, and add a field COLLECTTIME=1234567890 to the zip, specify the collect action as: CollectActions=META:CATEGORY=FILESYSTEM,ZIP:Output.zip:password,META:COLLECTTIME=1234567890 Any commas in the action parameters should be escaped with a backslash (\).

Example http://localhost:1234/ action=Fetch&FetchAction=Collect&Identifiers=PGlkIHM9IkRJUjEiIHI9I kM6XEF1dG9ub215XEZpbGVTeXN0ZW1Db25uZWN0b3JDRlNcZGlyMVxmaWxlOS50eHQ iLz4%3D,PGlkIHM9IkRJUjEiIHI9IkM6XEF1dG9ub215XEZpbGVTeXN0ZW1Db25uZW N0b3JDRlNcZGlyMVxmaWxlOC50eHQiLz4%3D&Destination=C:\Autonomy\ collected
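The Identifiers value in the request above is a comma-separated list in which each identifier is percent-encoded (note the trailing %3D for the Base64 padding character). A minimal sketch of that encoding follows, assuming Java 10 or later. The class and method names are hypothetical. The decode method is included for inspection only: the sample identifiers in this section happen to be Base64 text, but identifiers are produced by the connector and are best treated as opaque.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper (not part of ConnectorLib): prepares identifiers
// for use in the Identifiers parameter of a fetch action.
final class IdentifierParams {

    // Percent-encodes each identifier and joins them with literal commas.
    static String toParameterValue(List<String> identifiers) {
        return identifiers.stream()
                .map(id -> URLEncoder.encode(id, StandardCharsets.UTF_8))
                .collect(Collectors.joining(","));
    }

    // For debugging only: shows the text behind a Base64 identifier.
    static String decode(String identifier) {
        byte[] raw = Base64.getDecoder().decode(identifier);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // First identifier from the sample request, joined onto one line.
        String sample = "PGlkIHM9IkRJUjEiIHI9IkM6XEF1dG9ub215XEZpbGVTeXN0"
                + "ZW1Db25uZWN0b3JDRlNcZGlyMVxmaWxlOS50eHQiLz4=";
        System.out.println("Identifiers=" + toParameterValue(List.of(sample)));
        System.out.println(decode(sample));
    }
}
```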

Response As this is an asynchronous action, you receive a token in response to the request. A sample response to the action (as retrieved using the QueueInfo action) appears below.

In this example, the tokens for both documents appear between tags showing that they were collected successfully. The documents were output to C:\Autonomy\collected along with stub files containing their metadata.


COLLECT PGlkIHM9IkRJUjEiIHI9IkM6XEF1dG9ub215XEZpbGVTeXN0ZW1Db25uZWN0b3J DRlNcZGlyMVxmaWxlOC50eHQiLz4= PGlkIHM9IkRJUjEiIHI9IkM6XEF1dG9ub215XEZpbGVTeXN0ZW1Db25uZWN0b3J DRlNcZGlyMVxmaWxlOS50eHQiLz4= MTAuMi4xMDUuMzQ6MTIzNDpGRVRDSDotMTI2NTE0MTI5NA== Finished 2009-Oct-15 16:02:53 0 2009-Oct-15 16:02:53 0 2009-Oct-15 16:02:53

Related Topics  “Synchronize Fetch Action” on page 162

Identifiers Fetch Action

This action is used to retrieve a list of document identifiers and optionally perform an action on them (currently only the collect action is available). It should not be used to perform queries that could be more efficiently performed through IDOL Server.

http://host:port/action=Fetch&FetchAction=Identifiers
[&Config=Base64_Config]
&ConfigSection=Section_Name
[&IdentifiersAction=Collect
&Destination=UNC_Path
[&CollectActions=Document_Action_CSV]]
[&Connector-specific_Parameters]


Type: Asynchronous

Parameter Name Description

Config Optional Base64 encoded configuration file. If this parameter is specified, then the encoded configuration options are used instead of the options in the connector configuration file.

ConfigSection The name of the configuration file section containing the task settings.

IdentifiersAction The name of the action to perform on the returned identifiers. If this action should be passed additional parameters, you should specify them as parameters to this action.

Connector-specific_Parameters Additional parameters that are connector-specific and determine which identifiers to return.

Example http://localhost:1234/ action=Fetch&FetchAction=Identifiers&ConfigSection=DIR1

Response As this is an asynchronous action, you receive a token in response to the request. A sample response to the action (as retrieved using the QueueInfo action) appears below. This shows that the action has completed. The identifiers are listed in between the tags. IDENTIFIERS PGlkIHM9IkRJUjEiIHI9IkM6XEF1dG9ub215XEZpbGVTeXN0ZW1Db25uZWN0b3J DRlNcZGlyMVxuZXdmaWxlLnR4dCIvPg== MTAuMi4xMDUuMzQ6MTIzNDpGRVRDSDotMTc4NTQ1MTYwOQ==


Finished 2009-Oct-15 16:36:32 0 2009-Oct-15 16:36:32 0 2009-Oct-15 16:36:32

Related Topics  “Synchronize Fetch Action” on page 162

Insert Fetch Action

This action is used to insert a document or documents into a repository.

http://host:port/action=Fetch&FetchAction=Insert
[&Config=Base64_Config]
&ConfigSection=Section_Name
&InsertXML=Insert_XML

Type: Asynchronous

Parameter Name Description

Config Optional Base64 encoded configuration file. If this parameter is specified, then the encoded configuration options are used instead of the options in the connector configuration file.


ConfigSection The name of the configuration file section containing the task settings.

failedDirectory The directory in which the action will report failures.

InsertXML XML containing all the properties to determine how and where to add each document, all the metadata, and optionally, a file to insert for each document. Some connectors expect a file to be provided. The data should be provided in XML format as below: reference true/false true/false filename/base64file

Most of the tags are optional.

 ownfile. If ownfile is true, the file is deleted after it has been inserted. By default, this is false.
 isfilename. If isfilename is true, the content tag contains the full path to the file. If isfilename is false, the content tag contains the entire file, Base64 encoded. The ownfile tag is ignored in this case.
 insert. The insert tag can be omitted if a single document is being inserted.
 reference, property, metadata. The usage of these tags depends on the connector used.

Example In this example, the object is to insert a file with the reference C:\Autonomy\FileSystemConnectorCFS\dir1\newfile.txt with the content This is my file. First, construct the InsertXML:

<insertXML>
<insert>
<reference>C:\Autonomy\FileSystemConnectorCFS\dir1\newfile.txt</reference>
<file>
<isfilename>false</isfilename>
<content>VGhpcyBpcyBteSBmaWxl</content>
</file>
</insert>
</insertXML>

Note that the content This is my file is Base64 encoded. The XML is then escaped and used in the Insert action.

http://localhost:1234/ action=Fetch&FetchAction=Insert&ConfigSection=DIR1&InsertXML=%3Cin sertXML%3E%3Cinsert%3E%3Creference%3EC%3A%5CAutonomy%5CFileSystemC onnectorCFS%5Cdir1%5Cnewfile.txt%3C%2Freference%3E%3Cfile%3E%3Cisf ilename%3Efalse%3C%2Fisfilename%3E%3Ccontent%3EVGhpcyBpcyBteSBmaWx l%3C%2Fcontent%3E%3C%2Ffile%3E%3C%2Finsert%3E%3C%2FinsertXML%3E
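The escaped request above can be produced with a short program. The sketch below assumes Java 10 or later and mirrors the element names visible in the escaped InsertXML value of the request; the class and method names are hypothetical and not part of the SDK. It handles only the inline (isfilename=false) case and does not XML-escape the reference, so it is suitable only for references without characters such as & or <.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper (not part of ConnectorLib): builds an Insert fetch
// request for a single document whose content is sent inline.
final class InsertXmlBuilder {

    // Builds the InsertXML with the file content Base64 encoded.
    static String buildXml(String reference, byte[] content) {
        String base64 = Base64.getEncoder().encodeToString(content);
        return "<insertXML><insert>"
                + "<reference>" + reference + "</reference>"
                + "<file><isfilename>false</isfilename>"
                + "<content>" + base64 + "</content></file>"
                + "</insert></insertXML>";
    }

    // Escapes the XML and places it in the Insert request.
    static String buildUrl(String host, int port, String configSection, String insertXml) {
        return "http://" + host + ":" + port
                + "/action=Fetch&FetchAction=Insert"
                + "&ConfigSection=" + URLEncoder.encode(configSection, StandardCharsets.UTF_8)
                + "&InsertXML=" + URLEncoder.encode(insertXml, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String xml = buildXml(
                "C:\\Autonomy\\FileSystemConnectorCFS\\dir1\\newfile.txt",
                "This is my file".getBytes(StandardCharsets.UTF_8));
        System.out.println(buildUrl("localhost", 1234, "DIR1", xml));
    }
}
```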

Response As this is an asynchronous action, you receive a token in response to the request. A sample response to the action (as retrieved using the QueueInfo action) appears below. This response shows that the action has completed and that one document has been inserted, and gives the identifier of the new document. INSERT PGlkIHM9IkRJUjEiIHI9IkM6XEF1dG9ub215XEZpbGVTeXN0ZW1Db25uZWN0b3J DRlNcZGlyMVxuZXdmaWxlLnR4dCIvPg== MTAuMi4xMDUuMzQ6MTIzNDpGRVRDSDotMTc4NTQ1MTYwOQ== Finished 2009-Oct-15 16:36:32 0 2009-Oct-15 16:36:32 0 2009-Oct-15 16:36:32


Related Topics  “Synchronize Fetch Action” on page 162

Delete/Remove Fetch Action

This action is used to delete documents from a repository by their identifiers. Remove and delete are different names for the same action.

http://host:port/action=Fetch&FetchAction=Delete
[&Config=Base64_Config]
&Identifiers=Identifier_CSV

http://host:port/action=Fetch&FetchAction=Remove
[&Config=Base64_Config]
&Identifiers=Identifier_CSV

Type: Asynchronous

Parameter Name Description

Config Optional Base64 encoded configuration file. If this parameter is specified, then the encoded configuration options are used instead of the options in the connector configuration file.

Identifiers A CSV of document Identifiers. The documents with these identifiers are removed from the repository.

Example http://localhost:1234/ action=Fetch&FetchAction=Delete&Identifiers=PGlkIHM9IkRJUjEiIHI9Ik M6XEF1dG9ub215XEZpbGVTeXN0ZW1Db25uZWN0b3JDRlNcZGlyMVxuZXdmaWxlLnR4 dCIvPg%3D%3D

Response As this is an asynchronous action, you receive a token in response to the request. A sample response to the action (as retrieved using the QueueInfo action) appears below. This response shows that one document was deleted successfully.


errors="0" holds="0" ingestadded="0" ingestdeleted="0" ingestfailed="0" ingestupdated="0" inserted="0" releasedholds="0" seen="0" task="DIR1" unchanged="0" updated="0"/> DELETE PGlkIHM9IkRJUjEiIHI9IkM6XEF1dG9ub215XEZpbGVTeXN0ZW1Db25uZWN0b3J DRlNcZGlyMVxuZXdmaWxlLnR4dCIvPg== MTAuMi4xMDUuMzQ6MTIzNDpGRVRDSDotMTgwNDU4NzIxMQ== Finished 2009-Oct-15 16:43:17 0 2009-Oct-15 16:43:17 0 2009-Oct-15 16:43:17

Related Topics  “Synchronize Fetch Action” on page 162

Hold and ReleaseHold Fetch Actions

The Hold action places a hold on a document or documents in the repository by their identifier. When a document has been placed on hold, it cannot be deleted by a regular user.

The ReleaseHold action releases a document that has been placed on hold.

http://host:port/action=Fetch&FetchAction=Hold
[&Config=Base64_Config]
&Identifiers=Identifier_CSV

http://host:port/action=Fetch&FetchAction=ReleaseHold
[&Config=Base64_Config]
&Identifiers=Identifier_CSV


Type: Asynchronous

Parameter Name Description

Config Optional Base64 encoded configuration file. If this parameter is specified, then the encoded configuration options are used instead of the options in the connector configuration file.

Identifiers A CSV of document identifiers. The documents with these identifiers are placed on hold or released from hold, depending on whether you used the Hold or ReleaseHold action.

Example http://localhost:1234/ action=Fetch&FetchAction=Hold&Identifiers=PGlkIHM9IkRJUjEiIHI9IkM6 XEF1dG9ub215XEZpbGVTeXN0ZW1Db25uZWN0b3JDRlNcZGlyMVxuZXdmaWxlLnR4dC IvPg%3D%3D

Response As this is an asynchronous action, you receive a token in response to the request. A sample response to the action (as retrieved using the QueueInfo action) appears below. This response shows that one document was successfully put on hold. HOLD PGlkIHM9IkRJUjEiIHI9IkM6XEF1dG9ub215XEZpbGVTeXN0ZW1Db25uZWN0b3J DRlNcZGlyMVxuZXdmaWxlLnR4dCIvPg== MTAuMi4xMDUuMzQ6MTIzNDpGRVRDSDotMTgwNDU4NzIxMQ== Finished 2009-Oct-15 16:43:17 0 2009-Oct-15 16:43:17


0 2009-Oct-15 16:43:17

Related Topics  “Synchronize Fetch Action” on page 162

Update Action

The Update action updates metadata for documents given by their identifier in a repository.

Request /action=fetch&fetchaction=Update [&parenttoken=] [&config=] [&identifiersXML=] [&]

IdentifiersXML The IdentifiersXML parameter specifies identifiers that require metadata updates along with a set of the metadata to be updated for each document. The data should be provided in XML format as below:

Asynchronous Response UPDATE [IDENTIFIER1]


[IDENTIFIER2] [IDENTIFIER3]

Related Topics  “Synchronize Fetch Action” on page 162

View Action

The View action retrieves a single document and returns it.

http://host:port/action=View
[&Config=Base64_Config]
[&NoACI=True/False]
&Identifier=Identifier

Type: Synchronous

Parameter Name Description

Config Optional Base64 encoded configuration file. If this parameter is specified, then the encoded configuration options are used instead of the options in the connector configuration file.

NoACI Specify whether to return the document using a normal ACI response with a Base64 encoded file tag (false), or just return binary content (true). This defaults to true.

Identifier The identifier of the document to be returned.

Example http://localhost:1234/ action=View&Identifier=PGlkIHM9IkRJUjEiIHI9IkM6XEF1dG9ub215XEZpbGV TeXN0ZW1Db25uZWN0b3JDRlNcZGlyMVxmaWxlOC50eHQiLz4%3D

Response The response is the binary content of the file, unless you have specified NoACI=false.

Related Topics  “EnableViewServer” on page 128


StopFetch Action

This action requests all active asynchronous fetch actions or a particular asynchronous fetch action to stop.

http://host:port/action=StopFetch [&Token=Fetch_Action_Token]

Type: Synchronous

Parameter Name Description

Token The token of the asynchronous Fetch action to request to stop. If this is not specified, then the connector requests all asynchronous fetch actions to stop. Doing so does not clear the action queue.

Example http://localhost:1234/action=StopFetch

Response STOPFETCH SUCCESS

CHAPTER 11  Connector Framework Server Parameters

This section describes the Connector Framework server (CFS) configuration parameters.

 Service Parameters

 Server Parameters

 Actions Parameters

 Import Tasks and their Parameters

 Import Service Parameters

 Indexing Parameters

Connector Framework server supports standard service parameters, logging parameters and log streams. For more information, see the IDOL Server Administration Guide. This section lists the Connector Framework server configuration parameters.


Service Parameters

The parameters in this section determine which machines are permitted to use and control the Connector Framework service.

Related Topics  Service Configuration Parameters

Server Parameters

The parameters in this section specify details for the Connector Framework server.

AdminClients Specify the IP addresses or names of clients that can issue administrative commands to the ACI port. To enter multiple addresses, separate the individual addresses with commas (there must be no space before or after the comma). Alternatively, you can use wildcards in the IP address. For example, enter 187.*.*.* to permit any machine whose IP address begins with 187 to control the connector.

Type: String

Default: *.*.*.*

Required: No

Configuration Section: Server

Example: AdminClients=localhost,196.172.87.11 See Also: “Port” on page 181  “QueryClients” on page 181
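To make the wildcard rule concrete, the following sketch checks a client address against a comma-separated list such as the AdminClients value. It is an illustration of the documented syntax only, not the server's actual matching code: the class and method names are hypothetical, host names are compared by simple case-insensitive equality, and no name resolution is attempted.

```java
// Hypothetical illustration (not part of CFS): evaluates a client list
// of the form used by AdminClients and QueryClients.
final class ClientList {

    // Returns true if any entry in the comma-separated list matches.
    static boolean isPermitted(String clientsCsv, String address) {
        for (String pattern : clientsCsv.split(",")) {
            if (matches(pattern, address)) {
                return true;
            }
        }
        return false;
    }

    // An entry matches if it equals the address, or if every dot-separated
    // part is either identical or the wildcard "*".
    static boolean matches(String pattern, String address) {
        if (pattern.equalsIgnoreCase(address)) {
            return true;
        }
        String[] p = pattern.split("\\.");
        String[] a = address.split("\\.");
        if (p.length != a.length) {
            return false;
        }
        for (int i = 0; i < p.length; i++) {
            if (!p[i].equals("*") && !p[i].equals(a[i])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isPermitted("187.*.*.*", "187.12.0.5"));
        System.out.println(isPermitted("localhost,196.172.87.11", "10.0.0.1"));
    }
}
```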


Port Specify the ACI port by which actions are sent to the Connector Framework server.

Type: Long

Default:

Required: Yes

Allowed Minimum: 0 Range: Maximum: 65535

Recommended Minimum: 1024 Range: Maximum: 49151

Configuration Section: Server

Example: Port=7008 See Also:

QueryClients Specify the IP addresses or names of clients that can query the connector. To enter multiple addresses, separate the individual addresses with commas (there must be no space before or after the comma). Alternatively, you can use wildcards in the IP address. For example, enter 187.*.*.* to permit any machine whose IP address begins with 187 to query the connector.

Type: String

Default: *.*.*.*

Required: No

Configuration Section: Server

Example: QueryClients=10.1.1.*,127.0.0.1 See Also: “Port” on page 181  “AdminClients” on page 180


Actions Parameters

The parameters in this section control how actions are sent to Connector Framework server.

MaxQueueSize Use this parameter to specify the maximum number of asynchronous ingest action commands that will be queued by the server. No further ingest actions will be accepted once the queue size has been reached (until the queue diminishes).

Type: Integer

Default: The largest size possible.

Required: No

Configuration Section: Actions

Example: MaxQueueSize=4 See Also:

MaximumThreads Specify the number of actions that the CFS can process in parallel at any one time. The optimal value for this parameter depends on the load of the server. The default is sufficient for most loads.

Type: Integer

Default: 2

Required: No

Configuration Section: Actions

Example: MaximumThreads=10


Import Tasks and their Parameters

The tasks and parameters in this section control the way documents are imported to IDX or XML before they are indexed into IDOL Server. The Import Task types are Lua, IDXWriter, TextToDocs, Sectioner, ImportFile, and HtmlExtraction.

Import Tasks This section describes Import tasks. To define an Import task, use the following line in the CFS configuration file:

[Pre] | [Post] | [Update] | [Delete]N=TaskType

where TaskType is the type of import task that you want to use. For example:

Pre0=HtmlExtraction

Lua The Lua import task is used to run a Lua script. The Lua import task can be configured as a Pre, Post, Update or Delete task. You must specify the path of the script. For example:

Post0=Lua:C:\Scripts\posttask1.lua

IDXWriter The IDXWriter import task is used to call the CFS IDX Writer. The IDX Writer is included in the Connector Framework server and generates an IDX file.

The IdxWriter import task can be configured as a Pre, Post, Update or Delete task. The parameters that are passed to the task are specified in an optional, named section of the configuration file. For example:

Post0=IdxWriter:IdxWriting

[IdxWriting]
IdxWriterFilename=Job0.idx
IdxWriterMaxSizeKBs=100
IdxWriterArchiveDirectory=./IDXArchive

For information about the parameters used to configure this task, see “IdxWriter Import Task Parameters” on page 188.


TextToDocs The TextToDocs import task is used to split a file into a number of documents (a main document, and one or more child documents). This task results in a number of metadata and DRECONTENT documents being generated. The original document is discarded and is not filtered using KeyView.

The TextToDocs import task is always configured as a Pre task. The parameters that are passed to the task are specified in a named section of the configuration file. For example:

[ImportTasks]
Pre0=TextToDocs:TextToDocsSection

[TextToDocsSection]
//Settings to configure how to process the documents.

For information about the parameters used to configure this task, see “TextToDocs Import Task Parameters” on page 189.

Sectioner The Sectioner import task is used to split a large document into smaller sections.

The Sectioner import task is always configured as a Post task. The parameters that are passed to the task are specified in a named section of the configuration file. For example:

Post0=Sectioner:Sectioning

[Sectioning]
SectionerMaxBytes=3000
SectionMinBytes=1500

If a configuration file section is not specified, [Sectioning] is assumed. For information about the parameters used to configure this task, see “Sectioner Import Task Parameters” on page 201.

ImportFile The ImportFile import task imports a file and adds its content to the document being processed.

The ImportFile import task can be configured as a Pre or Post task. To define an ImportFile import task, use the following line in the configuration file:

[Pre] | [Post]N=ImportFile:fieldname

where fieldname is a field that contains the file name of the document to import.


HtmlExtraction The HtmlExtraction import task is used to extract the relevant parts from an HTML page, leaving out the irrelevant parts (for example, advertisements). It is used only with HTML documents.

The HtmlExtraction import task is always configured as a Pre task. To define an HtmlExtraction import task, use the following line in the configuration file:

Pre0=HtmlExtraction

PreN PreN is used to specify tasks when documents are indexed into IDOL Server. Pre tasks are called before file content is filtered out and before sub-files are extracted. Tasks must be numbered starting from zero (0). The import tasks that can be called by PreN are IdxWriter, Lua, HtmlExtraction, and TextToDocs. The fields AUTN_NO_FILTER and AUTN_NO_EXTRACT can be used to customise the task.

 To prevent sub-files being extracted from the document, set the value of AUTN_NO_EXTRACT to true. This setting might be used to prevent the contents of zip files being indexed.

 To prevent any content being filtered out of the document, set the value of AUTN_NO_FILTER to true. This setting might be used to prevent content being indexed from a certain file type.

Type: String

Default: None

Required: No

Configuration Section: ImportTasks

Example: Pre0=Lua:C:\Scripts\pretask1.lua See Also: “PostN” on page 186 “HashN” on page 187 “DeleteN” on page 187


PostN PostN is used to specify tasks when documents are indexed into IDOL Server. Post tasks are called after content has been extracted from documents, and after sub-files have been extracted. Tasks must be numbered starting from zero (0). The import tasks that can be called by PostN are IdxWriter, Lua, and Sectioner.

Type: String

Default: None

Required: No

Configuration Section: ImportTasks

Example: Post0=Lua:C:\Scripts\posttask1.lua See Also: “PreN” on page 185 “HashN” on page 187 “DeleteN” on page 187

UpdateN UpdateN is used to specify tasks that are called when CFS is about to update fields in a document in IDOL Server. This is when a connector updates the metadata for a document but not the content. Tasks must be numbered starting from zero (0). The import tasks that can be called by UpdateN are IdxWriter and Lua. In the example below, Update0 runs a Lua script when a document is about to be updated.

Type: String

Default: None

Required: No

Configuration Section: IndexTasks

Example: Update0=Lua:onUpdate.lua

See Also: “PostN” on page 186, “PreN” on page 185, “HashN” on page 187, “DeleteN” on page 187

DeleteN DeleteN is used to specify tasks that are called when CFS is about to delete a document from IDOL Server. Tasks must be numbered starting from zero (0). The import tasks that can be called by DeleteN are IdxWriter and Lua. In the following example, Delete0 runs a Lua script when a document is about to be deleted.

Type: String

Default: None

Required: No

Configuration Section: IndexTasks

Example: Delete0=Lua:onDelete.lua

See Also: “PostN” on page 186, “PreN” on page 185, “HashN” on page 187

HashN Specify a file containing a Lua script to use for family hashing. The script inserts an MD5 field into the document, which is a hash of the document’s unique fields. In the example below, hash.lua should contain the following script, because the example hashes the file contents:

function handler(document)
    return false
end

The hash is calculated from the whole of the original imported file, whether that file is text or binary.

Type: String

Default: None

Required: No

Configuration Section: ImportTasks

Example: Hash0=hash.lua

See Also: “ImportHashFamilies” on page 204, “ImportFamilyRootExcludeFmtCSV” on page 203
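To make "a hash of the original imported file" concrete, the following Java sketch computes an MD5 digest over raw file bytes. It illustrates the idea only: the class FileMd5Sketch is hypothetical, and the guide does not say how CFS formats or stores the value in the MD5 field.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Illustrative only: MD5 over the raw bytes of a file, text or binary alike. */
final class FileMd5Sketch {

    /** Returns the MD5 digest of the given bytes as lower-case hexadecimal. */
    static String md5Hex(byte[] fileBytes) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            StringBuilder hex = new StringBuilder();
            for (byte b : md5.digest(fileBytes)) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // ASSUMPTION: hex output is chosen here for readability only.
        System.out.println(md5Hex("abc".getBytes(StandardCharsets.US_ASCII)));
    }
}
```

Because the digest is taken over bytes rather than extracted text, two files that differ only in binary metadata produce different hashes.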

IdxWriter Import Task Parameters The parameters in this section are used to customise the IdxWriter Import Task.

IdxWriterFileName The IdxWriterFileName parameter specifies the name of the idx file that is used to store document data before it is indexed into IDOL Server.

Type: String

Default: None

Required: No

Configuration Section: ImportTasks

Example: IdxWriterFileName=Job_0.idx

See Also: “IdxWriterArchiveDirectory” on page 188, “IdxWriterMaxSizeKBs” on page 189

IdxWriterArchiveDirectory The IdxWriterArchiveDirectory parameter specifies the name of the directory into which idx files are archived when the maximum file size is reached.

Type: String

Default: None

Required: No

Configuration Section: ImportTasks

Example: IdxWriterArchiveDirectory=/IDXarchive

See Also: “IdxWriterFileName” on page 188, “IdxWriterMaxSizeKBs” on page 189

IdxWriterMaxSizeKBs The IdxWriterMaxSizeKBs parameter specifies a maximum size for idx files.

Type: Integer

Default: None

Required: No

Configuration Section: ImportTasks

Example: IdxWriterMaxSizeKBs=1000

See Also: “IdxWriterFileName” on page 188, “IdxWriterArchiveDirectory” on page 188

TextToDocs Import Task Parameters The parameters in this section are used to customise the TextToDocs Import Task.

FilenameMatchesRegex The FilenameMatchesRegex parameter is used to restrict the files processed by a TextToDocs import task. This parameter accepts one or more regular expressions. If the file name does not match one of the regular expressions, the file is not included.

Type: String

Default: None

Required: No

Configuration Section: ImportTasks

Example:
FilenameMatchesRegex0=.*htm
FilenameMatchesRegex1=.*txt
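The "matches one of the regular expressions" rule can be sketched in Java as follows. This is not CFS code: the class name is hypothetical, and two behaviours are assumptions, namely that a pattern must match the whole file name and that an empty list of patterns imposes no restriction.

```java
import java.util.List;
import java.util.regex.Pattern;

/** Illustrative only: a file is kept when its name matches at least one pattern. */
final class FilenameFilterSketch {

    static boolean isIncluded(String fileName, List<String> regexes) {
        if (regexes.isEmpty()) {
            return true; // ASSUMPTION: no patterns configured means no restriction
        }
        for (String regex : regexes) {
            // ASSUMPTION: whole-name match, as with Pattern.matches
            if (Pattern.matches(regex, fileName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> patterns = List.of(".*htm", ".*txt");
        System.out.println(isIncluded("index.htm", patterns));
        System.out.println(isIncluded("report.pdf", patterns));
    }
}
```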

ReferenceMatchesRegex The ReferenceMatchesRegex parameter is used to restrict the files processed by a TextToDocs import task. This parameter accepts one or more regular expressions. If the document reference does not match one of the regular expressions, the file is not included.

Type: String

Default: None

Required: No

Configuration Section: ImportTasks

Example:
ReferenceMatchesRegex0=123.*
ReferenceMatchesRegex1=456.*

FieldMatchesName The FieldMatchesName and FieldMatchesRegex parameters are used to restrict the files processed by a TextToDocs import task. If the content of the field specified by the FieldMatchesName parameter does not match the regular expression in the FieldMatchesRegex parameter, the file is not included.

If more than one pair of FieldMatchesName and FieldMatchesRegex parameters is defined, the field content must match the regular expression for every field, or the file is not included.

Type: String

Default: None

Required: No

Configuration Section: ImportTasks

Example:
FieldMatchesName0=Title
FieldMatchesRegex0=.*

See Also: “FieldMatchesRegex” on page 191

FieldMatchesRegex The FieldMatchesName and FieldMatchesRegex parameters are used to restrict the files processed by a TextToDocs import task. If the content of the field specified by the FieldMatchesName parameter does not match the regular expression in the FieldMatchesRegex parameter, the file is not included.

If more than one pair of FieldMatchesName and FieldMatchesRegex parameters is defined, the field content must match the regular expression for every field, or the file is not included.

Type: String

Default: None

Required: No

Configuration Section: ImportTasks

Example:
FieldMatchesName0=Title
FieldMatchesRegex0=.*

See Also: “FieldMatchesName” on page 190
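Where several FieldMatchesName and FieldMatchesRegex pairs are defined, every pair must match. A minimal Java sketch of that rule follows. The class name, the sample field names, and the handling of a missing field (treated as a failed match) are assumptions rather than documented CFS behaviour.

```java
import java.util.Map;
import java.util.regex.Pattern;

/** Illustrative only: every configured name/regex pair must match for the file to be kept. */
final class FieldMatchSketch {

    static boolean isIncluded(Map<String, String> documentFields,
                              Map<String, String> fieldNameToRegex) {
        for (Map.Entry<String, String> rule : fieldNameToRegex.entrySet()) {
            String value = documentFields.get(rule.getKey());
            // ASSUMPTION: a field that is absent from the document fails the rule.
            if (value == null || !Pattern.matches(rule.getValue(), value)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> fields = Map.of("Title", "Annual Report", "Author", "Smith");
        System.out.println(isIncluded(fields, Map.of("Title", ".*Report", "Author", "S.*")));
        System.out.println(isIncluded(fields, Map.of("Title", ".*Report", "Author", "J.*")));
    }
}
```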

ContentContainsRegex The ContentContainsRegex parameter is used to restrict the files processed by a TextToDocs import task. This parameter accepts one or more regular expressions. If the document content does not match all of the regular expressions that are defined, the file is not included.

Type: String

Default: None

Required: No

Configuration Section: ImportTasks

Example: ContentContainsRegex0=.*

MainRangeRegex The MainRangeRegex parameter is used to define the main part of a document. The main part of the document includes the content and all of the fields that are extracted to the main document. This parameter returns the entire document by default. This parameter accepts one or more regular expressions. The regular expressions can contain sub-matches (enclosed in parentheses). If multiple matches are found, the content is concatenated. For example, to define the main part of the document as all content that is enclosed by tags, set the parameter to:

MainRangeRegex0=(.*)

Type: String

Default: *

Required: No

Configuration Section: ImportTasks

Example: MainRangeRegex0=(.*)

MainContentRegex The MainContentRegex parameter is used to define the content that is extracted as the main document content. The content must be located within the range defined by the MainRangeRegex parameter. This parameter accepts one or more regular expressions. The regular expression can contain sub-matches (enclosed in parentheses). If multiple matches are found, the content is concatenated (separated by new line characters). For example, to define the main document content as all content that is enclosed by tags, set the parameter to:

MainContentRegex0=

(.*)

Type: String

Default: None

Required: No

Configuration Section: ImportTasks

Example: MainContentRegex0=

(.*)

See Also: “MainRangeRegex” on page 192

MainFieldName The MainFieldName and MainFieldRegex parameters are used to name and populate a document field within the main document.

The document field named in the MainFieldName parameter is populated by the content identified by the MainFieldRegex parameter. Each pair of parameters produces a single document field.

Type: String

Default: None

Required: No

Configuration Section: ImportTasks

Example: MainFieldName1=Description MainFieldRegex1=

(.*)

See Also: “MainFieldRegex” on page 193

MainFieldRegex The MainFieldName and MainFieldRegex parameters are used to name and populate a document field within the main document.

The document field named in the MainFieldName parameter is populated by the content identified by the MainFieldRegex parameter. Each pair of parameters produces a single document field. The data used to populate the field must be within the range defined by the MainRangeRegex parameter.

The regular expressions used in the MainFieldRegex parameter can contain sub-matches (enclosed in parentheses). If multiple sub-matches are found, the sub-matches are concatenated (separated by spaces).

Type: String

Default: None

Required: No

Configuration Section: ImportTasks

Example: MainFieldName1=Description MainFieldRegex1=

(.*)

See Also: “MainFieldName” on page 193 “MainRangeRegex” on page 192
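The sub-match rule for MainFieldRegex (sub-matches concatenated, separated by spaces) can be sketched in Java. The class name and sample text are hypothetical. Using the whole match when the expression contains no parentheses is an assumption, and CFS might use a regular expression dialect that differs from java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: builds a field value from the sub-matches of a regular expression. */
final class SubMatchSketch {

    static String extractField(String range, String regex) {
        Matcher m = Pattern.compile(regex).matcher(range);
        List<String> parts = new ArrayList<>();
        while (m.find()) {
            if (m.groupCount() == 0) {
                parts.add(m.group()); // ASSUMPTION: no parentheses means use the whole match
                continue;
            }
            for (int g = 1; g <= m.groupCount(); g++) {
                if (m.group(g) != null) {
                    parts.add(m.group(g));
                }
            }
        }
        return String.join(" ", parts); // sub-matches are separated by spaces
    }

    public static void main(String[] args) {
        System.out.println(extractField("name=Ada; role=Engineer", "name=(\\w+); role=(\\w+)"));
    }
}
```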

ChildrenRangeRegex The ChildrenRangeRegex parameter is used to define part of a document that is split into one or more child documents. The part of the document identified should include the content and all of the fields to be extracted into the child documents. This parameter returns the entire document by default. This parameter accepts one or more regular expressions. The regular expression can contain sub-matches (enclosed in parentheses). If there are multiple matches, the content is concatenated.

For example, to define the range as all content that is enclosed by tags, set the parameter to:

ChildrenRangeRegex0=(.*)

Type: String

Default: None

Required: No

Configuration Section: ImportTasks

Example: ChildrenRangeRegex0=(.*)

See Also: “MainFieldName” on page 193, “MainRangeRegex” on page 192

ChildRangeRegex The ChildRangeRegex parameter is used to define part of a document that is split into a single child document. The part of the document identified should include the content and all of the fields to be extracted into a single child document. The content must be in the range identified by the ChildrenRangeRegex parameter. A child document is produced for every match to a regular expression.

The regular expressions used in the ChildRangeRegex parameter can contain sub-matches (enclosed in parentheses). If multiple sub-matches are found, the sub-matches are concatenated.

Type: String

Default: None

Required: No

Configuration Section: ImportTasks

Example: ChildRangeRegex0=(.*)
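"A child document is produced for every match" can be pictured as a loop over regular expression matches. The Java sketch below uses a hypothetical class name and a made-up bracket syntax for the sample text. Joining the sub-matches of one match without a separator is an assumption, because the guide does not name one for this parameter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: one child range per regular expression match. */
final class ChildRangeSketch {

    static List<String> childRanges(String childrenRange, String childRangeRegex) {
        Matcher m = Pattern.compile(childRangeRegex).matcher(childrenRange);
        List<String> children = new ArrayList<>();
        while (m.find()) {
            if (m.groupCount() == 0) {
                children.add(m.group());
                continue;
            }
            StringBuilder child = new StringBuilder();
            for (int g = 1; g <= m.groupCount(); g++) {
                if (m.group(g) != null) {
                    child.append(m.group(g)); // ASSUMPTION: no separator between sub-matches
                }
            }
            children.add(child.toString());
        }
        return children;
    }

    public static void main(String[] args) {
        // Three matches, so three child ranges.
        System.out.println(childRanges("[one][two][three]", "\\[(.*?)\\]"));
    }
}
```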

ChildContentRegex The ChildContentRegex parameter is used to define the content of a single child document. This parameter accepts one or more regular expressions. The content identified must be within the range defined by the ChildRangeRegex parameter. The regular expressions used in the ChildContentRegex parameter can contain sub-matches (enclosed in parentheses). If multiple matches are found, the content is concatenated (separated by new line characters).

For example, to define all content that is enclosed by tags, set the parameter to:

ChildContentRegex0=

(.*)

Type: String

Default: None

Required: No

Configuration Section: ImportTasks

Example: ChildContentRegex0=

(.*)


ChildFieldName The ChildFieldName and ChildFieldRegex parameters are used to name and populate a document field within a child document.

The document field named in the ChildFieldName parameter is populated by the content identified by the ChildFieldRegex parameter. Each pair of parameters produces a single document field.

Type: String

Default: None

Required: No

Configuration Section: ImportTasks

Example:
ChildFieldName1=Description
ChildFieldRegex1=.*

ChildFieldRegex The ChildFieldName and ChildFieldRegex parameters are used to name and populate a document field within a child document.

The document field named in the ChildFieldName parameter is populated by the content identified by the ChildFieldRegex parameter. Each pair of parameters produces a single document field. The data used to populate the field must be within the range defined by the ChildRangeRegex parameter. The regular expressions used in the ChildFieldRegex parameter can contain sub-matches (enclosed in parentheses). If multiple sub-matches are found, they are concatenated (separated by spaces).

Type: String

Default: None

Required: No

Configuration Section: ImportTasks

Example:
ChildFieldName1=Description
ChildFieldRegex1=.*

ChildInheritFields The ChildInheritFields parameter is used to specify a comma-separated list of field names that are inherited by the child documents from the original (not the main) document.

Type: String

Default: None

Required: No

Configuration Section: ImportTasks

Example: ChildInheritFields=Title,Author,ModifiedDate

ContentReplaceRegex The ContentReplaceRegex and ContentReplaceFormat parameters are used to find and replace data in a document.

The data identified by the ContentReplaceRegex parameter is replaced by the string specified in the ContentReplaceFormat parameter. The replacement affects the DRECONTENT of the main and child documents.

Type: String

Default: None

Required: No

Configuration Section: ImportTasks

Example:
ContentReplaceRegex0=.*
ContentReplaceFormat0=replacement

See Also: “ContentReplaceFormat” on page 198

ContentReplaceFormat The ContentReplaceRegex and ContentReplaceFormat parameters are used to find and replace data in a document.

The data identified by the ContentReplaceRegex parameter is replaced by the string specified in the ContentReplaceFormat parameter. The replacement affects the DRECONTENT of the main and child documents.

Type: String

Default: None

Required: No

Configuration Section: ImportTasks

Example:
ContentReplaceRegex0=.*
ContentReplaceFormat0=replacement

See Also: “ContentReplaceRegex” on page 197
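The find-and-replace pairing can be sketched with java.util.regex. This shows the general technique, not the CFS implementation. The class name is hypothetical, and the replacement is treated as a literal string because the guide does not describe the syntax that ContentReplaceFormat accepts.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: replaces every match of a regex with a literal string. */
final class ContentReplaceSketch {

    static String replace(String content, String regex, String replacement) {
        // ASSUMPTION: the replacement is literal; quoteReplacement stops Java from
        // interpreting '$' and '\' as group references.
        return Pattern.compile(regex)
                .matcher(content)
                .replaceAll(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        System.out.println(replace("Order 1234 shipped", "\\d{4}", "####"));
    }
}
```

In Java, a pattern such as .* also matches the empty string at the end of the input, so it inserts the replacement twice. The guide does not say whether CFS behaves the same way, so a more specific expression is the safer choice.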

FieldReplaceName The FieldReplaceName, FieldReplaceRegex, and FieldReplaceFormat parameters are used to identify and replace data within a document field.

The FieldReplaceName parameter identifies the document field to be searched. This parameter must be followed by FieldReplaceRegex and FieldReplaceFormat parameters.

Type: String

Default: None

Required: No

Configuration Section: ImportTasks

Example:
FieldReplaceName0=Description
FieldReplaceRegex0=.*
FieldReplaceFormat0=new value

See Also: “FieldReplaceRegex” on page 199, “FieldReplaceFormat” on page 199

FieldReplaceRegex The FieldReplaceName, FieldReplaceRegex, and FieldReplaceFormat parameters are used to identify and replace data in a document field.

The FieldReplaceRegex parameter defines the part of the field that should be replaced, using a regular expression.

Type: String

Default: None

Required: No

Configuration Section: ImportTasks

Example:
FieldReplaceName0=Description
FieldReplaceRegex0=.*
FieldReplaceFormat0=new value

See Also: “FieldReplaceName” on page 198, “FieldReplaceFormat” on page 199

FieldReplaceFormat The FieldReplaceName, FieldReplaceRegex, and FieldReplaceFormat parameters are used to identify and replace data in a document field.

The FieldReplaceFormat parameter specifies a string value that should replace the data specified in the FieldReplaceRegex parameter.

Type: String

Default: None

Required: No

Configuration Section: ImportTasks

Example:
FieldReplaceName0=Description
FieldReplaceRegex0=.*
FieldReplaceFormat0=new value

See Also: “FieldReplaceName” on page 198, “FieldReplaceRegex” on page 199

DateFieldName The DateFieldName and DateFieldFormat parameters are used to identify a date in a document field and replace it with the date in a standard date format.

The DateFieldName parameter is used to identify the field.

Type: String

Default: None

Required: No

Configuration Section: ImportTasks

Example:
DateFieldName0=ModifiedDate
DateFieldFormat0=YYYY-MM-DD HH:NN:SS

See Also: “DateFieldFormat” on page 200

DateFieldFormat The DateFieldName and DateFieldFormat parameters are used to identify a date in a document field and replace it with the date in a standard date format.

The DateFieldFormat parameter is used to define the formatting of the date.

Type: String

Default: None

Required: No

Configuration Section: ImportTasks

Example:
DateFieldName0=ModifiedDate
DateFieldFormat0=YYYY-MM-DD HH:NN:SS

See Also: “DateFieldName” on page 200

Sectioner Import Task Parameters The following parameters are used to customise the Sectioner Import Task.

SectionerMaxBytes Use this parameter to specify the maximum number of bytes recommended for a section. This is not a hard limit, but the Sectioner will try to keep section sizes below this.

Type: Integer

Default: 3000

Required: No

Configuration Section: The configured section or Sectioning

Example: SectionerMaxBytes=3000

SectionerMinBytes Use this parameter to specify the minimum number of bytes recommended for a section. This is not a hard limit, but the Sectioner will try to keep section sizes above this.

Type: Integer

Default: SectionerMaxBytes/2

Required: No

Configuration Section: The configured section or Sectioning

Example: SectionerMinBytes=1500

SectionerSeparatorsN Use this parameter to specify the fixed strings or regular expressions that the Sectioner can use to identify a suitable location in the content for inserting a section break. For example, you may prefer content to split on paragraph breaks (“%0A%0A”). If a large block of content has no paragraph breaks, the Sectioner can then fall back to splitting on punctuation. A separator can be specified either as a fixed string or as a regular expression. A separator is treated as a regular expression if it begins with an open parenthesis “(” and ends with a close parenthesis “)”. Fixed strings and regular expressions specified in the configuration are URL unescaped before use; this allows you to specify multi-byte and special characters.

Each SectionerSeparatorsN value is a comma-separated list of separators, which can be URL escaped. Separators in an earlier (lower-numbered) SectionerSeparatorsN list have priority over those in later lists. Within a list, separators towards the left have priority over those towards the right.

Backslashes in a regular expression should appear in the configuration as “\\”. Commas in separators should be URL escaped as “%2C” or escaped as “\,”. SectionerSeparators0, SectionerSeparators1, and SectionerSeparators2 are set by default if none are specified in the configuration. If any SectionerSeparatorsN parameters are specified in the configuration, the defaults no longer apply.

Type: String

Default:
SectionerSeparators0=%0A%0A
SectionerSeparators1=([!?.]\\s+),([:;]\\s+),(%EF%BC%81|%EF%BC%9F|%E3%80%82|%EF%BC%8E|[!?]),(%EF%BC%9A|%EF%BC%9B|[:;])
SectionerSeparators2=((%2C|%E3%80%81|%EF%BC%8C)\\s*),((%E3%80%80|\\s)+)

Required: No

Configuration Section: The configured section or Sectioning

Example: SectionerSeparators0=%0A%0A
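The escaping rules for a SectionerSeparatorsN value can be followed step by step in the Java sketch below. It is an interpretation of the rules as worded in this guide, not the Sectioner's own parser. The class SeparatorSketch, the order in which the escapes are undone, and the handling of edge cases are assumptions.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Illustrative parser for one SectionerSeparatorsN value; not the Sectioner's actual code. */
final class SeparatorSketch {

    /** One parsed separator: its unescaped text and whether it is treated as a regex. */
    static final class Separator {
        final String text;
        final boolean regex;

        Separator(String text, boolean regex) {
            this.text = text;
            this.regex = regex;
        }
    }

    /** Splits a CSV value into separators, leftmost (highest priority) first. */
    static List<Separator> parse(String csv) {
        List<Separator> separators = new ArrayList<>();
        // Split on commas that are not written as "\,". Simplification: a comma
        // that directly follows an escaped backslash ("\\") is not split either.
        for (String raw : csv.split("(?<!\\\\),")) {
            boolean regex = raw.startsWith("(") && raw.endsWith(")");
            String unescaped = raw
                    .replace("\\,", ",")     // "\," in the configuration means a literal comma
                    .replace("\\\\", "\\")   // "\\" in the configuration means one backslash
                    .replace("+", "%2B");    // URLDecoder would otherwise turn '+' into a space
            separators.add(new Separator(
                    URLDecoder.decode(unescaped, StandardCharsets.UTF_8), regex));
        }
        return separators;
    }

    public static void main(String[] args) {
        // The Java literal below is the configuration text ([!?.]\\s+),([:;]\\s+)
        for (Separator s : parse("([!?.]\\\\s+),([:;]\\\\s+)")) {
            System.out.println((s.regex ? "regex: " : "fixed: ") + s.text);
        }
    }
}
```

Splitting happens before URL unescaping so that a comma written as %2C stays inside its separator. URLDecoder.decode(String, Charset) requires Java 10 or later.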

Import Service Parameters

The parameters in this section specify details for KeyView and the service that imports documents into IDX or XML.

ExtractDirectory Specify the directory to which files are extracted. Use this parameter only when you want to keep copies of all extracted files.

Type: String

Default: Current directory

Required: No

Configuration Section: ImportService

Example: ExtractDirectory=C:\temp

ImportFamilyRootExcludeFmtCSV Specify which KeyView formats not to designate as family roots if family hashing is enabled. For example, if you exclude the PST format (KeyView value 356), when Import Module Advanced hashes a PST file, it does not consider the PST container as the root format. Instead, it searches for a deeper format that is not listed in the CSV: in this case, it would find the MAIL format, which would then be considered the root of the family. For a complete list of KeyView formats, see “KeyView Format Codes” on page 265.

Type: String

Default:

Required: No

Configuration Section: ImportService

Example: ImportFamilyRootExcludeFmtCSV=356,157,233,345

In this example, the numeric values correspond to the following formats:

356=PST
157=ZIP
233=EML
345=MSG

See Also: “HashN” on page 187 “ImportHashFamilies” on page 204

ImportHashFamilies Specify whether to enable family hashing, which is used for de-duplication.

Type: Boolean

Default: false

Required: No

Configuration Section: ImportService

Example: ImportHashFamilies=true

See Also: “HashN” on page 187, “ImportFamilyRootExcludeFmtCSV” on page 203, “ImportMergeMails” on page 205

ImportInheritFieldsCSV Specify a comma-separated list of fields that should be inherited from parent files by their children. For example, if you specify SUBJECT in this parameter, all the child attachments in a parent MSG file will contain a Subject field.

Type: String

Default: None

Required: No

Configuration Section: ImportService

Example: ImportInheritFieldsCSV=AUTN_IDENTIFIER

ImportMergeMails Specify whether to merge the two files created by KeyView (the empty MSG or EML container file, and the MAIL file that contains the actual message content) when importing MSG or EML files. Set this to true to merge the two files.

Type: Boolean

Default: false

Required: No. Recommended if ImportHashFamilies=true.

Configuration Section: ImportService

Example: ImportMergeMails=true

See Also: “ImportHashFamilies” on page 204

KeyviewDirectory Specify the location of the KeyView filters that Connector Framework Server uses to process documents. Enter the full path to the filters directory.

Type: String

Default: None

Required: Yes

Configuration Section: ImportService

Example: KeyviewDirectory=C:\Autonomy\ConnectorFramework\filters\

MaxImportQueueSize Specify the size of an internal queue where documents are buffered before they are imported.

NOTE It is recommended that this parameter not be changed without consultation with Autonomy support personnel.

Type: Integer

Default: Ten times the size specified by the IndexBatchSize parameter.

Required: No

Configuration Section: ImportService or Server

Example: MaxImportQueueSize=1000

RevisionMarks Specify whether revision mark information (such as deleted text) is extracted from Microsoft Word documents. If Microsoft Word’s revision tracking feature was enabled when changes were made to a document, the CFS can extract the tracked information and include it in the index. Set to true to extract revision mark information.

Type: Boolean

Default: false

Required: No

Configuration Section: ImportService

Example: RevisionMarks=true

ThreadCount Specify the number of threads to run. This parameter is only used for importing.

Type: Integer

Default: 1

Required: No

Configuration Section: ImportService

Example: ThreadCount=3

XsltDLL Use this parameter to specify the location of the autnxslt library.

Type: String

Default: autnxslt.dll (if present)

Required: No

Configuration Section: Paths or ImportService or Server

Example: XsltDLL=autnxslt.dll

Indexing Parameters

The parameters in this section specify the details for the IDOL Server(s) to which the Connector Framework server will send documents for indexing.

ACIPort Specify the ACI port of each IDOL Server with which Connector Framework server communicates. There should be the same number of values in the ACIPort CSV as in the DREHost CSV.

Type: CSV (comma-separated values)

Default: None

Required: At least one entry is required.

Configuration Section: Indexing

Example: ACIPort=9000,9012

See Also: “CompressIndexFiles” on page 208

CompressIndexFiles Set this parameter to True to compress all index files that are sent to IDOL Server. The receiving IDOL Server must be a version that can read compressed index files.

Type: Boolean

Default: False

Required: No

Configuration Section: Indexing

Example: CompressIndexFiles=True

DREHost Specify the IP address or host name of each IDOL Server with which Connector Framework server communicates. There should be the same number of values in the DREHost CSV as in the ACIPort CSV.

Type: CSV (comma-separated values)

Default: None

Required: At least one entry is required.

Configuration Section: Indexing

Example: DREHost=hostmachine0,hostmachine1

See Also: “ACIPort” on page 208
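The requirement that DREHost and ACIPort carry the same number of values can be checked with a few lines of Java. The sketch below is hypothetical: the class name is invented, and pairing the nth host with the nth port is an assumption, because the guide states only that the counts should match.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: pairs each DREHost entry with the ACIPort entry in the same position. */
final class IndexTargetsSketch {

    static List<String> pair(String dreHostCsv, String aciPortCsv) {
        String[] hosts = dreHostCsv.split(",");
        String[] ports = aciPortCsv.split(",");
        if (hosts.length != ports.length) {
            throw new IllegalArgumentException(
                    "DREHost has " + hosts.length + " values but ACIPort has " + ports.length);
        }
        List<String> targets = new ArrayList<>();
        for (int i = 0; i < hosts.length; i++) {
            // ASSUMPTION: position i in one list corresponds to position i in the other.
            targets.add(hosts[i].trim() + ":" + Integer.parseInt(ports[i].trim()));
        }
        return targets;
    }

    public static void main(String[] args) {
        System.out.println(pair("hostmachine0,hostmachine1", "9000,9012"));
    }
}
```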

IndexBatchSize Specify the maximum number of files that are included in each batch that is indexed into IDOL Server.

Type: Integer

Default: 100

Required: No

Configuration Section: Indexing

Example: IndexBatchSize=100

IndexOverSocket Enter true when the IDOL server and connector are installed on different computers and documents are indexed over a network. (In this case, DREADDDATA sends data over the network and is slower.) Enter false when the IDOL server and connector are installed on the same computer and documents are indexed locally. (In this case, DREADD uses file-based indexing and is quicker.)

Type: Boolean

Default: True

Required: No

Configuration Section: Indexing

Example: IndexOverSocket=true

IndexTimeInterval Specify the timeout value in seconds for the index queue. This is the maximum amount of time a document will wait in the index queue before an attempt is made to index it. If no documents were indexed in the specified interval, any documents in the queue (up to the number specified in IndexBatchSize) are indexed.

Type: Integer

Default: 300

Required: No

Configuration Section: Indexing

Example: IndexTimeInterval=100

See Also: “IndexBatchSize” on page 209
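One way to read the interaction between IndexBatchSize and IndexTimeInterval is as a two-condition flush rule: index when a full batch is waiting, or when the interval has passed with documents still queued. The Java sketch below encodes that reading. It is an interpretation of the descriptions above, with hypothetical names, and not the CFS scheduling code.

```java
/** Illustrative only: decides when queued documents are sent for indexing. */
final class IndexQueueSketch {

    /** True when a full batch is waiting or the time interval has elapsed. */
    static boolean shouldIndexNow(int queuedDocuments, long secondsSinceLastIndex,
                                  int indexBatchSize, long indexTimeInterval) {
        if (queuedDocuments <= 0) {
            return false; // nothing to index
        }
        return queuedDocuments >= indexBatchSize
                || secondsSinceLastIndex >= indexTimeInterval;
    }

    /** Number of documents sent in one batch: never more than IndexBatchSize. */
    static int batchSize(int queuedDocuments, int indexBatchSize) {
        return Math.min(queuedDocuments, indexBatchSize);
    }

    public static void main(String[] args) {
        // With the documented defaults IndexBatchSize=100 and IndexTimeInterval=300:
        System.out.println(shouldIndexNow(10, 5, 100, 300));
        System.out.println(shouldIndexNow(10, 300, 100, 300));
    }
}
```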

Related Topics: “Secure Socket Layer Parameters” on page 235

KillDuplicates Use this parameter to specify the string that is used as the KillDuplicates parameter value when sending an index command to IDOL Server. The following options are available for this parameter:

• REFERENCE - Replaces an existing document with the new document if the document to index has the same value in its DREREFERENCE field.

• Blank (the default) - Nothing is appended to the command sent to IDOL Server. This allows duplicate documents in IDOL Server; IDOL Server neither replaces nor deletes documents.

For more information, refer to the IDOL Server Administration Guide.

Type: String

Default:

Required: No

Configuration Section: Indexing

Example: KillDuplicates=REFERENCE

CHAPTER 12 License Configuration Parameters

This chapter describes the license configuration parameters that specify licensing details.

• Full
• Holder
• Key
• LicenseServerACIPort
• LicenseServerHost
• LicenseServerTimeout
• LicenseServerRetries
• Operation

Full

Indicates whether you have a full or an evaluation license.

Type: Boolean

Default: False

Required: Yes

Configuration Section: License

Example: Full=on

In this example, the service is fully licensed.

Holder

The name of the license holder.

Type: String

Default: None

Required: Yes

Configuration Section: License

Example: Holder=Company

Key

The license key.

Type: String

Default: None

Required: Yes

Configuration Section: License

Example: Key=01234567890

LicenseServerACIPort

ACI port of DiSH license server. This must be the Port specified in the DiSH configuration file's [Server] section. This port is used to request licensing from DiSH. This parameter is used in IDOL with Administration.

Type: Long

Default: None

Required: Yes

Allowed range: Minimum: 0, Maximum: 65536

Recommended range: Minimum: 1025, Maximum: 65536

Configuration Section: License

Example: LicenseServerACIPort=20000

See Also: “LicenseServerHost” on page 216, “LicenseServerTimeout” on page 216, “LicenseServerRetries” on page 217
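The allowed and recommended ranges lend themselves to a simple validation check. The Java sketch below copies the bounds from the table above. The class and enum names are hypothetical, and nothing here reflects how the licensing code itself validates the port.

```java
/** Illustrative only: classifies a port against the ranges given for LicenseServerACIPort. */
final class LicensePortSketch {

    enum Verdict { INVALID, ALLOWED_NOT_RECOMMENDED, RECOMMENDED }

    static Verdict check(long port) {
        // Bounds are taken from the parameter table: allowed 0-65536, recommended 1025-65536.
        if (port < 0 || port > 65536) {
            return Verdict.INVALID;
        }
        return port >= 1025 ? Verdict.RECOMMENDED : Verdict.ALLOWED_NOT_RECOMMENDED;
    }

    public static void main(String[] args) {
        System.out.println(check(20000));
        System.out.println(check(80));
    }
}
```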

LicenseServerHost

Address of DiSH host. The IP address (or name) of the machine that hosts the DiSH license server. This parameter is used in IDOL with Administration.

Type: String

Default: None

Required: Yes

Configuration Section: License

Example: LicenseServerHost=1.23.45.6

See Also: “LicenseServerACIPort” on page 215, “LicenseServerTimeout” on page 216, “LicenseServerRetries” on page 217

LicenseServerTimeout

Seconds to timeout when connecting to DiSH. Type the number of seconds after which requests that have been sent to the DiSH license server time out if it does not respond. This parameter is used in IDOL with Administration.

Type: Long

Default: 120000

Required: No

Configuration Section: License

Example: LicenseServerTimeout=600000

See Also: “LicenseServerACIPort” on page 215, “LicenseServerHost” on page 216, “LicenseServerRetries” on page 217

LicenseServerRetries

Number of retries when connecting to the DiSH license server. This parameter is used in IDOL with Administration.

Type: Integer

Default: 5

Required: No

Configuration Section: License

Example: LicenseServerRetries=1

See Also: “LicenseServerACIPort” on page 215, “LicenseServerHost” on page 216, “LicenseServerTimeout” on page 216

Operation

Licensed Operations key to allow additional ACI server operations to be licensed.

Type: String

Default: None

Required: Yes

Configuration Section: License

Example: Operation=803|87sdhsdf9n94nmsf7oasda987w4yriasunfaasd==

CHAPTER 13 Logging Configuration Parameters

This section describes the configuration parameters used to create separate log files for different log message types (such as query, index, and application) and to determine how each stream is logged.

• LogArchiveDirectory
• LogCompressionMode
• LogDirectory
• LogEcho
• LogExpireAction
• LogFile
• LogHistorySize
• LogLevel
• LogLevelMatch
• LogMaxLineLength
• LogMaxOldFiles
• LogMaxSizeKBs
• LogOldAction
• LogOutputLogLevel
• LogSysLog
• LogTime
• LogTypeCSVs

LogArchiveDirectory

Path to log archive directory. Type the directory in which you want the application to archive old log files when LogOldAction is set to Move.

Type: String

Default: ./archive

Required: No

Configuration Section: Logging and/or LogStream or TaskName

If you set this parameter in the Logging and LogStream sections, the setting in the LogStream section takes precedence for the specified log stream.

Example: LogArchiveDirectory=./archive

See Also: “LogOldAction” on page 229
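The precedence rule that recurs throughout this chapter (a LogStream setting overrides the same setting in the Logging section) can be sketched in Java as follows. This is only an illustration under stated assumptions: the class LogSettingResolver and its resolve method are hypothetical and are not part of the ConnectorLib Java SDK, and the configuration sections are modeled as plain maps with exact-match keys.

```java
import java.util.Map;

/** Hypothetical helper (not an SDK class) that applies LogStream-over-Logging precedence. */
class LogSettingResolver {

    /**
     * Returns the value from the log stream section if present, otherwise the
     * value from the Logging section, otherwise the documented default.
     */
    static String resolve(String name,
                          Map<String, String> streamSection,
                          Map<String, String> loggingSection,
                          String defaultValue) {
        String value = streamSection.get(name);
        if (value == null) {
            value = loggingSection.get(name);
        }
        return value != null ? value : defaultValue;
    }

    public static void main(String[] args) {
        // Assumed example values; only the parameter name comes from this guide.
        Map<String, String> logging = Map.of("LogArchiveDirectory", "./archive");
        Map<String, String> stream = Map.of("LogArchiveDirectory", "./archive/application");
        System.out.println(resolve("LogArchiveDirectory", stream, logging, "./archive"));
    }
}
```

The same lookup order would apply to any parameter in this chapter that can be set in both sections.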


LogCompressionMode

Specifies how old log files are compressed when the LogExpireAction parameter is set to Compress. This can be set to either zip or gz.

Type: String

Default: zip

Required: No

Configuration Section: Logging and/or LogStream

If you set this parameter in the Logging and LogStream sections, the setting in the LogStream section takes precedence for the specified log stream.

Example: LogCompressionMode=gz

See Also: “LogExpireAction” on page 222

LogDirectory

Path to log directory. Type the directory in which you want the application to store the log files it creates.

Type: String

Default: ./logs

Required: No

Configuration Section: Logging

Example: LogDirectory=./logs

See Also: “LogArchiveDirectory” on page 220, “LogFile” on page 223


LogEcho

Display logging messages on the console. Enable this parameter to display logging messages on the console.

NOTE This setting has no effect if you are running the application as a Windows service.

Type: Boolean

Default: False

Required: No

Configuration Section: Logging and/or LogStream

If you set this parameter in the Logging and LogStream sections, the setting in the LogStream section takes precedence for the specified log stream.

Example: LogEcho=true

See Also: “LogArchiveDirectory” on page 220

LogExpireAction

Determines how log files are handled when they exceed the maximum size. Type one of the following to determine how log files are handled when they exceed the LogMaxSizeKBs size:

Option Description

Compress The log file's name is appended with a time stamp, compressed and saved in the log directory. By default, this is a Zip file. Use the LogCompressionMode parameter to specify another compression format.

Consecutive The log file's name is appended with a number and saved in the log directory. When the next log file reaches its LogMaxSizeKBs size, it is appended with the next consecutive number.


Datestamp The log file's name is appended with a time stamp and saved in the log directory.

Previous The log file's name is appended with .previous and saved in the log directory. Every time a log file reaches its LogMaxSizeKBs size, it is given the same name, so it overwrites the old log file.

Day Only one log file is created per day and is appended with the current time stamp. Log files are archived after they reach the LogMaxSizeKBs size.

NOTE The LogMaxSizeKBs parameter takes precedence over the LogExpireAction parameter. Therefore, if you set LogExpireAction to Day, and the value for LogMaxSizeKBs results in more than one log file, multiple log files are generated per day.

Type: String

Default: Datestamp

Required: No

Configuration Section: Logging and/or LogStream or TaskName

If you set this parameter in the Logging and LogStream sections, the setting in the LogStream section takes precedence for the specified log stream.

Example: LogExpireAction=Compress

See Also: “LogCompressionMode” on page 221, “LogFile” on page 223, “LogMaxSizeKBs” on page 228

LogFile

Name of the log file. The name of the log file the application creates in the specified LogDirectory.

Type: String

Default: None

Required: Yes


Configuration Section: Logging and/or LogStream

If you set this parameter in the Logging and LogStream sections, the setting in the LogStream section takes precedence for the specified log stream.

Example: LogFile=query.log

See Also: “LogDirectory” on page 221

LogHistorySize

The number of log messages to store in memory.

Type: Long

Default: 100

Required: Yes

Allowed Minimum: 1 Range: Maximum: 520

Configuration Section: LogStream

Example: LogHistorySize=50

See Also: “LogExpireAction” on page 222

LogLevel

The type of messages that are logged. Type one of the following to determine the type of messages that are logged:

Option Description

Always Basic processes are logged. NOTE This produces only minimal logging and no errors are logged.

Error Errors are logged.


Warning Errors and warnings are logged.

Normal Errors, warnings and basic processes are logged.

Full Every occurrence is logged. NOTE This produces a large log file and can affect performance.

The log levels are hierarchical from least logging to most logging. You can use the LogLevelMatch parameter to specify which messages are reported relative to the specified LogLevel. For example, if LogLevelMatch=LessThan and LogLevel=Warning, "Normal" and "Full" message types are reported. Use the LogOutputLogLevel parameter to report the log level in the log.

Type: String

Default: Normal

Required: No

Configuration Section: Logging and/or LogStream

If you set this parameter in the Logging and LogStream sections, the setting in the LogStream section takes precedence for the specified log stream.

Example: LogLevel=Warning

See Also: “LogFile” on page 223, “LogLevelMatch” on page 225

LogLevelMatch

The messages reported relative to the specified LogLevel. The LogLevelMatch parameter specifies the messages that are reported relative to the log-level hierarchy:

 Always

 Error

 Warning

 Normal

 Full


Type one of the following values for LogLevelMatch:

Option Description

Equal Only the message type specified by LogLevel is reported. For example, if LogLevel=warning, only warning messages are reported.

LessThan The message types below the LogLevel setting are reported. For example, if LogLevel=warning, "Normal" and "Full" message types are reported.

LessThanOrEqual The message type specified by LogLevel and any message type below that are reported. For example, if LogLevel=warning, "Normal", "Full", and "Warning" message types are reported.

GreaterThan The message types above the LogLevel setting are reported. For example, if LogLevel=warning, "Error" and "Always" message types are reported.

GreaterThanOrEqual The message type specified by LogLevel and any message type above that are reported. For example, if LogLevel=warning, "Error", "Always", and "Warning" message types are reported.

Type: String

Default: GreaterThanOrEqual

Required: No

Configuration Section: Logging and/or LogStream

If you set this parameter in the Logging and LogStream sections, the setting in the LogStream section takes precedence for the specified log stream.

Example: LogLevelMatch=GreaterThanOrEqual

See Also: “LogFile” on page 223, “LogLevel” on page 224, “LogOutputLogLevel” on page 230
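The interaction between LogLevel and LogLevelMatch can be easier to follow as code. The following Java sketch models the hierarchy in the order listed above (Always, Error, Warning, Normal, Full) and treats "below" as later in that list and "above" as earlier, which matches the examples in the option table. The class LogLevelFilter and its enums are hypothetical illustrations, not SDK types.

```java
/** Hypothetical model (not an SDK class) of the LogLevel / LogLevelMatch rules. */
class LogLevelFilter {

    /** Levels in the hierarchy order listed in this guide (Always first, Full last). */
    enum Level { ALWAYS, ERROR, WARNING, NORMAL, FULL }

    enum Match { EQUAL, LESS_THAN, LESS_THAN_OR_EQUAL, GREATER_THAN, GREATER_THAN_OR_EQUAL }

    /** Returns true if a message of the given level is reported for the configured level. */
    static boolean isReported(Match match, Level configured, Level message) {
        int c = configured.ordinal();
        int m = message.ordinal();
        switch (match) {
            case EQUAL:                 return m == c;
            case LESS_THAN:             return m > c;   // "below" = later in the list
            case LESS_THAN_OR_EQUAL:    return m >= c;
            case GREATER_THAN:          return m < c;   // "above" = earlier in the list
            case GREATER_THAN_OR_EQUAL: return m <= c;
            default: throw new IllegalArgumentException("Unknown match: " + match);
        }
    }
}
```

With LogLevel=Warning and the default LogLevelMatch=GreaterThanOrEqual, this sketch reports Always, Error, and Warning messages and suppresses Normal and Full messages.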


LogMaxLineLength

Maximum characters in a log entry. The number of characters a log entry can include before it is truncated. Increase this value when you want long actions to be logged in full.

Type: Long

Default: 16384

Required: No

Allowed Range: Minimum: 0, Maximum: 2000000000

Recommended Range: Minimum: 100, Maximum: 1000000

Configuration Section: Logging and/or LogStream

If you set this parameter in the Logging and LogStream sections, the setting in the LogStream section takes precedence for the specified log stream.

Example: LogMaxLineLength=24000

See Also: “LogFile” on page 223

LogMaxOldFiles

Maximum number of log files in the log directory. The maximum number of log files the specified LogDirectory can store before the application runs the specified LogOldAction. If you do not want to restrict how many log files the LogDirectory can store, type -1.

Type: Long

Default: -1 (unlimited)

Required: No


Configuration Section: Logging and/or LogStream or TaskName

If you set this parameter in the Logging and LogStream sections, the setting in the LogStream section takes precedence for the specified log stream.

Example: LogMaxOldFiles=1000

See Also: “LogDirectory” on page 221, “LogOldAction” on page 229

LogMaxSizeKBs

Maximum log file size (in KB). If you do not want to restrict the log file size, type -1. The LogExpireAction parameter determines how a log file is handled after it has reached its maximum size. This parameter is used for standard logging streams.

Type: Long

Default: 1024

Required: No

Configuration Section: Logging and/or LogStream or TaskName

If you set this parameter in the Logging and LogStream sections, the setting in the LogStream section takes precedence for the specified log stream.

Example: LogMaxSizeKBs=1000

See Also: “LogExpireAction” on page 222


LogOldAction

Determines how log files are handled when the maximum number of log files is exceeded. Type one of the following to determine how log files are handled when the LogDirectory has reached the maximum number of log files, as determined by the LogMaxOldFiles parameter:

Option Description

Delete The log files are deleted.

Move The log files are moved to the specified LogArchiveDirectory.

Type: String

Default: Delete

Required: No

Configuration Section: Logging and/or LogStream

If you set this parameter in the Logging and LogStream sections, the setting in the LogStream section takes precedence for the specified log stream.

Example: LogOldAction=Move

See Also: “LogArchiveDirectory” on page 220, “LogDirectory” on page 221, “LogMaxOldFiles” on page 227


LogOutputLogLevel

Determines whether the log level is reported in the log. Enable this parameter to include the log level of a message in the log entry.

Type: Boolean

Default: False

Required: No

Configuration Section: Logging and/or LogStream

If you set this parameter in the Logging and LogStream sections, the setting in the LogStream section takes precedence for the specified log stream.

Example:
LogLevel=Always
LogOutputLogLevel=true

In this example, Always is added to the log message:
21/12/2006 12:34:56 [10] Always: ACI Server attached to port 1622

See Also: “LogLevel” on page 224

LogSysLog

Write messages to Windows/Linux system log. Enable this parameter to write messages to the Linux Syslog or the Windows Event Log.

Type: Boolean

Default: False

Required: No

Configuration Section: Logging and/or LogStream

If you set this parameter in the Logging and LogStream sections, the setting in the LogStream section takes precedence for the specified log stream.

Example: LogSysLog=true


LogTime

Display time with each log entry. Enable this parameter to display the current time next to each log entry in the log file.

Type: Boolean

Default: True

Required: No

Configuration Section: Logging and/or LogStream

If you set this parameter in the Logging and LogStream sections, the setting in the LogStream section takes precedence for the specified log stream.

Example: LogTime=false

See Also: “LogFile” on page 223

LogTypeCSVs

List of message types to log. Type one or more of the following message types to specify the type of messages written to the associated log file. If you want to type multiple message types, separate them with commas (there must be no space before or after a comma):

Option Description

All Components

Action Logs actions and related messages.

Application Logs application-related occurrences.

IDOL Server

Agent Logs agent actions and related messages.

Category Logs category actions and related messages.

Cluster Logs cluster actions and related messages.

Community Logs community actions and related messages.

ExtendedIndex Logs index actions as well as index actions that are sent after IDOL Server has routed incoming data through other processes.


Index Logs index actions and related messages.

Mailer Logs mailer actions and related messages.

Profile Logs profile actions and related messages.

Query Logs query actions and related messages.

QueryTerms Logs each query term, after stemming, conversion to UTF8, capitalization and punctuation removal. This is mainly used by the Autonomy DiSH server for statistical reports.

Role Logs role actions and related messages.

Schedule Logs schedule actions and related messages.

Taxonomy Logs taxonomy actions and related messages.

User Logs user actions and related messages.

User_Audit Logs UserAdd and UserDelete actions and related messages.

UserTerm Logs terms that IDOL Server uses to form a user's agents and profiles.

DIH

Index Logs index actions and related messages.

Query Logs query actions and related messages.

DAH

Security Logs security action results.

DiSH

Alert Logs alert actions and related messages.

AlertResults Logs alert action results.

Audit Logs audit actions and related messages.

Schedule Logs schedule actions and related messages.

ScheduleResults Logs schedule action results.

Connectors

FailureList Logs details of files that were not imported successfully.

Import Logs import actions and related messages.

Index Logs index actions and related messages.


Spider Logs spider actions and related messages. (HTTP Connector only)

CFS

Import Logs import actions and related messages.

Indexer Logs the status of Indexing into IDOL.

CFS Connectors

Collect Logs document collection for use in Legal Hold applications.

Delete Logs the deletion of documents from the repository.

Hold Logs details of documents that are put on hold in Legal Hold applications.

Identifiers Logs details of requests for document lists from repositories.

Insert Logs the insertion of documents into the repository.

Synchronize Logs data synchronization when ingesting into IDOL.

Update Logs details of documents whose metadata is updated in the repository.

View Logs details of documents that are viewed from the repository.

Transcode Server

Transcode Logs details of transcoding.

Type: String

Default: None

Required: Yes

Configuration Section: LogStream

Example: LogTypeCSVs=Application,Index

See Also: “LogFile” on page 223
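Because LogTypeCSVs must not contain spaces before or after a comma, it can be useful to validate a value before writing it to a configuration file. The following Java sketch shows one way to do that. The class LogTypeCsvParser is a hypothetical helper, not an SDK class, and it checks only the comma and space rule; it does not check the message type names against the table above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical validator (not an SDK class) for a LogTypeCSVs value. */
class LogTypeCsvParser {

    /** Splits the value on commas and rejects empty entries or entries with surrounding spaces. */
    static List<String> parse(String value) {
        if (value == null || value.isEmpty()) {
            throw new IllegalArgumentException("LogTypeCSVs is required");
        }
        List<String> types = new ArrayList<>();
        // The -1 limit keeps trailing empty entries so that "Index," is rejected too.
        for (String type : value.split(",", -1)) {
            if (type.isEmpty() || !type.equals(type.trim())) {
                throw new IllegalArgumentException("Invalid message type entry: '" + type + "'");
            }
            types.add(type);
        }
        return types;
    }
}
```

For example, parse("Application,Index") returns the two message types, whereas parse("Application, Index") throws an IllegalArgumentException because of the space after the comma.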

CHAPTER 14 Secure Socket Layer Parameters

This section describes the configuration parameters used to configure Secure Socket Layer (SSL) connections between components.

NOTE These parameters usually appear in the [SSLOptions] section of the configuration file.

 SSLConfig

 SSLCACertificate

 SSLCACertificatesPath

 SSLCertificate

 SSLCheckCertificate

 SSLCheckCommonName

 SSLMethod

 SSLPrivateKey

 SSLPrivateKeyPassword


SSLConfig

Identifies the configuration section in which the SSL configuration details are specified, usually SSLOptionN. You must set this parameter if you are using SSL connections between components. To control incoming ACI calls, set this parameter in the [Server] or [Default] section. To control outgoing ACI calls, set this parameter in another component section, such as [DataDRE], [CatDRE], or a connector Job section. The section in which you set SSLConfig depends on whether you are using a distributed architecture and on which component you are configuring. For example, in a standalone Category configuration, you can set SSLConfig in the [Server], [DataDRE], [CatDRE], and [CommunityServer] sections. See each component’s documentation for more information.

Type: String

Default: None

Required: No


Configuration Section: Server or Default, or another section for outgoing communications

Example:

[Server]
SSLConfig=SSLOptions1
...

[AgentDRE]
SSLConfig=SSLOptions2
...

[DataDRE]
SSLConfig=SSLOptions2
...

// For Omni Group Servers:

[Note]
GroupServerHost=...
GroupServerPort=...
SSLConfig=SSLOptions2

[SSLOptions1]
//SSL options for incoming connections
SSLMethod=SSLV23
SSLCertificate=host1.crt
SSLPrivateKey=host1.key
SSLCACertificate=trusted.crt

[SSLOptions2]
//SSL options for outgoing connections
SSLMethod=SSLV23
SSLCertificate=host2.crt
SSLPrivateKey=9s7BxMjD2d3M3t7awt/J8A
SSLCACertificate=trusted.crt

See Also: “SSLCACertificate” on page 238, “SSLCertificate” on page 240, “SSLCheckCertificate” on page 240, “SSLCheckCommonName” on page 241, “SSLMethod” on page 241, “SSLPrivateKey” on page 242, “SSLPrivateKeyPassword” on page 243


SSLCACertificate

Certificate Authority (CA) certificate file of a trusted authority. The component only trusts communication with a peer that provides a certificate signed by the specified CAs.

Type: String

Default: None

Required: No

Configuration Section: SSLOptionN

Example: SSLCACertificate=trusted.crt

See Also: “SSLConfig” on page 236

SSLCACertificatesPath

Use this parameter to specify the path to a directory containing multiple CA certificates in PEM format to check against. Each file must contain one CA certificate. The files are looked up by the CA subject name hash value, which must be available. If more than one CA certificate with the same name hash value exists, the extension must be different (for example, 9dd6633f0.0, 9dd6633f0.1, and so on). The search is performed in the order of the extension number, regardless of other properties of the certificates. As an alternative, you can specify the path to a file containing multiple CA certificates in PEM format. The file can contain certificates identified by sequences like the following example:

-----BEGIN CERTIFICATE-----
... (CA certificate in base64 encoding) ...
-----END CERTIFICATE-----

You can insert text before, between and after the certificates to be used as descriptions of the certificates.


CAUTION If several CA certificates matching the name, key identifier, and serial number condition are available, only the first one is examined. This might lead to unexpected results if the same CA certificate is available with different expiration dates. If a “certificate expired” verification error occurs, no other certificate is searched. Make sure to not have expired certificates mixed with valid ones.

For more information, refer to: http://www.openssl.org/docs/ssl/SSL_CTX_load_verify_locations.html.

Type: String

Default: None

Required: No

Configuration Section: SSLOptionN

Example: SSLCACertificatesPath=C:\Autonomy\HTTPConnector\CACERTS\

See Also: “SSLConfig” on page 236
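When you maintain a single file that contains several CA certificates with descriptive text between them, it can help to check how many certificate blocks the file actually contains. The following Java sketch extracts the base64 body of each block delimited by the standard PEM markers (five dashes on each side of BEGIN CERTIFICATE and END CERTIFICATE) and ignores any other text. The class PemBundleSplitter is a hypothetical helper, not an SDK class, and it does not decode or verify the certificates.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not an SDK class) that lists the certificate blocks in a PEM bundle. */
class PemBundleSplitter {

    // DOTALL lets the body span several lines; the reluctant quantifier stops at the first END marker.
    private static final Pattern CERT_BLOCK = Pattern.compile(
            "-----BEGIN CERTIFICATE-----(.*?)-----END CERTIFICATE-----", Pattern.DOTALL);

    /** Returns the base64 body of each certificate block, with all whitespace removed. */
    static List<String> extractBase64Bodies(String bundleText) {
        List<String> bodies = new ArrayList<>();
        Matcher matcher = CERT_BLOCK.matcher(bundleText);
        while (matcher.find()) {
            bodies.add(matcher.group(1).replaceAll("\\s+", ""));
        }
        return bodies;
    }
}
```

Text before, between, and after the blocks is skipped, which mirrors the description above of how descriptions can be mixed with certificates in one file.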


SSLCertificate

SSL Certificate file to use to identify this component to a peer. It can be either ASN1 or PEM format. This parameter requires a matching SSLPrivateKey value.

Type: String

Default: None

Required: Yes

Configuration Section: SSLOptionN

Example:
SSLCertificate=host1.crt
SSLPrivateKey=host1.key

See Also: “SSLConfig” on page 236, “SSLPrivateKey” on page 242

SSLCheckCertificate

Specifies whether a certificate signed by a trusted authority is requested from peers.

Setting SSLCACertificate implicitly sets SSLCheckCertificate to true. If SSLCheckCertificate is set to false, communications are encrypted, but certificates are not requested from peers.

Type: Boolean

Default: True if SSLCACertificate is set. False if SSLCACertificate is not set.

Required: No

Configuration Section: SSLOptionN

Example: SSLCheckCertificate=true

See Also: “SSLConfig” on page 236


SSLCheckCommonName

Verifies the identity of the peer. Specifies whether the host name listed in the peer's certificate (that is, the CommonName or "CN" attribute) resolves to the same IP address as the peer itself, as determined by the network connection.

For example, if the host name in a certificate is eip.autonomy.com and resolves to an IP address of 12.3.4.56, then the peer must share the same IP address.

Type: Boolean

Default: False

Required: No

Configuration Section: SSLOptionN

Example: SSLCheckCommonName=true

See Also: “SSLConfig” on page 236

SSLMethod

Specifies which SSL protocol is used. The options are:

 SSLV2

 SSLV3

 SSLV23

 TLSV1


SSLV23 is used in most cases.

Type: String

Default: None

Required: Yes

Configuration Section: SSLOptionN

Example: SSLMethod=SSLV23

See Also: “SSLConfig” on page 236

SSLPrivateKey

The private security key for the SSL certificate. It can be either ASN1 or PEM format. This parameter requires a matching SSLCertificate value.

Type: String

Default: None

Required: Yes

Configuration Section: SSLOptionN

Example:
SSLCertificate=host1.crt
SSLPrivateKey=host1.key

See Also: “SSLConfig” on page 236, “SSLCertificate” on page 240, “SSLPrivateKeyPassword” on page 243


SSLPrivateKeyPassword

The password for the file defined in SSLPrivateKey. The password can be in plain text, or in basic or AES encrypted format.

Type: String

Default: None

Required: No

Configuration Section: SSLOptionN

Example:

[SSLOption0]
SSLCertificate=host1.crt
SSLPrivateKey=host1.key
SSLPrivateKeyPassword=PvKey1559

In this example, the private key password to the file host1.key is written in plain text.

...

[SSLOption0]
SSLCertificate=host1.crt
SSLPrivateKey=host1.key
SSLPrivateKeyPassword=9s7BxMjD2d3M3t7awt/J8A

In this example, the private key password to the file host1.key has basic encryption.

See Also: “SSLConfig” on page 236 “SSLPrivateKey” on page 242

CHAPTER 15 Service Configuration Parameters

This section describes the Service configuration parameters that determine which machines are permitted to use and control a service.

 ServiceACIMode

 ServiceControlClients

 ServiceHost

 ServicePort

 ServiceStatusClients

If the ServicePort, ServiceStatusClients and ServiceControlClients configuration parameters are specified, the service port is enabled and accepts the standard status and control actions described in “Service Actions” on page 249.


ServiceACIMode

Generate ACI-compatible XML.

Type: Boolean

Default: False

Required: No

Configuration Section: Service

Example: ServiceACIMode=false

See Also: “ServiceControlClients” on page 246, “ServiceHost” on page 247, “ServicePort” on page 247, “ServiceStatusClients” on page 248

ServiceControlClients

IP addresses or names of clients that can send service control actions to the service. To type multiple addresses, separate the individual addresses with commas (there must be no space before or after the comma). Alternatively, you can use wildcards in the IP address. For example, type 187.*.*.* to permit any machine whose IP address begins with 187 to control the connector.

Type: String

Default: None

Required: Yes

Configuration Section: Service

Example: ServiceControlClients=localhost,127.0.0.1

See Also: “ServiceACIMode” on page 246, “ServiceHost” on page 247, “ServicePort” on page 247, “ServiceStatusClients” on page 248
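The comma-separated list and wildcard rules for ServiceControlClients (and likewise ServiceStatusClients) can be sketched in Java as follows. This is a simplified illustration, not the matching logic used by the service itself: the class ServiceClientFilter is hypothetical, wildcards are assumed to replace whole dotted-decimal octets, and names such as localhost are compared literally without any name resolution.

```java
/** Hypothetical matcher (not an SDK class) for ServiceControlClients-style settings. */
class ServiceClientFilter {

    /** Returns true if the client address matches any entry in the comma-separated setting. */
    static boolean isPermitted(String clientsSetting, String clientAddress) {
        for (String pattern : clientsSetting.split(",")) {
            if (matches(pattern, clientAddress)) {
                return true;
            }
        }
        return false;
    }

    private static boolean matches(String pattern, String address) {
        String[] patternParts = pattern.split("\\.");
        String[] addressParts = address.split("\\.");
        if (patternParts.length != 4 || addressParts.length != 4) {
            // Not a four-part dotted form: treat the entry as a literal host name.
            return pattern.equalsIgnoreCase(address);
        }
        for (int i = 0; i < 4; i++) {
            // "*" matches any value in that position; anything else must match exactly.
            if (!patternParts[i].equals("*") && !patternParts[i].equals(addressParts[i])) {
                return false;
            }
        }
        return true;
    }
}
```

For example, with the setting 187.*.*.* the address 187.20.3.4 is permitted and 188.20.3.4 is not.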


ServiceHost

The host server on which the service is running.

Type: String

Default: *.*.*.*

Required: Yes

Configuration Section: Service

Example: ServiceHost=127.0.0.1

See Also: “ServiceACIMode” on page 246, “ServiceControlClients” on page 246, “ServicePort” on page 247, “ServiceStatusClients” on page 248

ServicePort

The port on the host server on which the service listens for service status and control requests.

Type: Long

Default: 40010

Required: Yes

Allowed Minimum: 1 Range: Maximum: 65535

Recommended Minimum: 1024 Range: Maximum: 65535

Configuration Section: Service

Example: ServicePort=40010

See Also: “ServiceACIMode” on page 246, “ServiceControlClients” on page 246, “ServiceHost” on page 247, “ServiceStatusClients” on page 248


ServiceStatusClients

The IP addresses or names of clients that can request status information from a service. These clients cannot control the service. To type multiple addresses, separate the individual addresses with commas (there must be no space before or after the comma). Alternatively, you can use wildcards in the IP address. For example, type 187.*.*.* to permit any machine whose IP address begins with 187 to access the service's status.

Type: String

Default: None

Required: Yes

Configuration Section: Service

Example: ServiceStatusClients=localhost,127.0.0.1

See Also: “ServiceACIMode” on page 246, “ServiceControlClients” on page 246, “ServiceHost” on page 247, “ServicePort” on page 247

CHAPTER 16 Service Actions

This section describes the Service actions.

 Action Syntax

 GetConfig

 GetLogStream

 GetLogStreamNames

 GetStatistics

 GetStatus

 GetStatusInfo

 Stop

 Service Action Parameters

If the ServicePort, ServiceStatusClients and ServiceControlClients configuration parameters are specified, the service port is enabled and accepts the status and control actions described in this section.

Action Syntax

The actions use the following format:

http://Host:Port/action=ActionName&[Parameters]


where,

Host The IP address (or name) of the machine hosting the service.

Port The ServicePort specified in the Service section of the service’s configuration.

ActionName One of the actions described in this section.

Parameters One or more parameters that might be required by an action.

For example:

http://12.3.4.56:40010/action=GetConfig

This action uses port 40010 to request the service’s configuration file settings.

Related Topics  “Service Actions” on page 249
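The action format described above can be assembled programmatically. The following Java sketch builds an action URL from a host, a port, an action name, and optional parameters. The class ServiceActionUrl is a hypothetical helper, not an SDK class, and URL-encoding the parameter values is an assumption of this sketch rather than a documented requirement of the service. It uses the URLEncoder.encode(String, Charset) overload, which requires Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;

/** Hypothetical helper (not an SDK class) that builds service action URLs. */
class ServiceActionUrl {

    /** Builds http://Host:Port/action=ActionName followed by &Name=Value for each parameter. */
    static String build(String host, int port, String action, Map<String, String> parameters) {
        StringBuilder url = new StringBuilder("http://")
                .append(host).append(':').append(port)
                .append("/action=").append(action);
        for (Map.Entry<String, String> parameter : parameters.entrySet()) {
            url.append('&').append(parameter.getKey()).append('=')
               // Encoding the value is an assumption; plain alphanumeric values are unchanged.
               .append(URLEncoder.encode(parameter.getValue(), StandardCharsets.UTF_8));
        }
        return url.toString();
    }
}
```

Pass a map with a predictable iteration order, such as a LinkedHashMap, if the order of the parameters in the URL matters to you.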

GetConfig

The GetConfig action returns the service’s configuration file settings.

Example action=GetConfig

Parameters None.

GetLogStream

The GetLogStream action returns a specific log stream for the service.

Example action=GetLogStream&Name=ApplicationLogStream&FromDisk=true&Tail=10

This action displays the first ten entries of the ApplicationLogStream log.


Parameters The action has the following optional parameters:

Parameter Description Required

FromDisk Specifies whether the log stream is read from disk or memory.

Name The name of the log stream you want to return.

Tail The number of lines from the log stream to return.

GetLogStreamNames

The GetLogStreamNames action returns the names of the log streams defined for the service.

Example

action=GetLogStreamNames

Parameters None.

GetStatistics

The GetStatistics action returns statistics for the service. Each statistic returns in an autn:stat XML element. This element contains the following attributes:

class The group that the statistic belongs to. For example, Service.

autnid The sub-group that the statistic belongs to. For example, Documents.


name The name of the statistic.

metric The type of statistic. This can have one of the following values:

 0 String
 1 Bytes
 2 Bytes per second
 3 per second
 4 percent
 5 count
 6 number
 7 timestamp
 8 seconds
 9 milliseconds
 10 maximum

value The value of the statistic.

For example:

In this example, the statistic 24hourRequestsPerSecond has a value of 8.5 per second.

The following statistics return for all servers:

Class Statistic Description

[Service] Class

[Statistics]

ServiceDuration The number of seconds the service has been running.

10SecondResponseAverage The average service response time (in milliseconds) measured over the last 10 seconds.

10SecondRequestsPerSecond The number of requests to the service per second within the last 10 seconds.

10SecondRequests The number of requests to the service in the last 10 seconds.

60SecondResponseAverage The average service response time (in milliseconds) measured over the last 60 seconds.


60SecondRequestsPerSecond The number of requests to the service per second within the last 60 seconds.

60SecondPeakRequestsPerSecond The highest number of requests to the service over any 60 second period.

60SecondRequests The number of requests to the service in the last 60 seconds.

1HourResponseAverage The average service response time (in milliseconds) measured over the last hour.

1HourRequestsPerSecond The number of requests to the service per second within the last hour.

1HourPeakRequestsPerSecond The highest number of requests to the service over any 1 hour period.

1HourRequests The number of requests to the service in the last hour.

24HourResponseAverage The average service response time (in milliseconds) measured over the last 24 hours.

24HourRequestsPerSecond The number of requests to the service per second within the last 24 hours.

24HourPeakRequestsPerSecond The highest number of requests to the service over any 24 hour period.

24HourRequests The number of requests to the service in the last 24 hours.

RecentResponseAverage The average service response time (in milliseconds) from the time the last 10 second period finished to the current time.

RecentRequestsPerSecond The number of requests to the service per second from the time the last 10 second period finished to the current time.

RecentPeakRequestsPerSecond The highest number of requests to the service from the time the last 10 second period finished to the current time.

RecentRequests The number of requests to the service from the time the last 10 second period finished to the current time.

TotalRequests The total number of requests that were made to the service.
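As a rough illustration of how the autn:stat attributes described earlier in this section (class, autnid, name, metric, and value) might be read, the following Java sketch parses a single element with the standard JDK DOM parser and maps the metric code to its label. The class StatElementReader is hypothetical and not part of the SDK, and the XML fragment in the usage note is an assumed shape built from the attribute list, not output captured from a real service.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

/** Hypothetical reader (not an SDK class) for a single autn:stat element. */
class StatElementReader {

    // Labels indexed by the metric codes 0 to 10 listed in this section.
    private static final String[] METRIC_LABELS = {
        "String", "Bytes", "Bytes per second", "per second", "percent", "count",
        "number", "timestamp", "seconds", "milliseconds", "maximum"
    };

    /** Returns a one-line summary such as "Service/Statistics/Name = 1 (count)". */
    static String describe(String statXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Namespace processing stays off because the assumed fragment
            // does not declare the autn prefix.
            factory.setNamespaceAware(false);
            Element stat = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(statXml)))
                    .getDocumentElement();
            int metric = Integer.parseInt(stat.getAttribute("metric"));
            String label = (metric >= 0 && metric < METRIC_LABELS.length)
                    ? METRIC_LABELS[metric] : "unknown";
            return stat.getAttribute("class") + "/" + stat.getAttribute("autnid") + "/"
                    + stat.getAttribute("name") + " = " + stat.getAttribute("value")
                    + " (" + label + ")";
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a readable autn:stat element", e);
        }
    }
}
```

Under these assumptions, an element with class Service, autnid Statistics, name 24hourRequestsPerSecond, metric 3, and value 8.5 would be summarized with the label "per second".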


The following statistics return for specific components:

Class Statistic Description

[Service] Class

[Statistics]

TruncatedQueries The number of queries that timed out.

[Documents]

Total The total number of documents that this IDOL Server contains.

Sections The number of document sections that this IDOL Server contains.

TotalSlots The total number of document sections that the IDOL Server contains including document sections that have been deleted.

[Databases]

Number The total number of databases including empty databases and databases that have been deleted.

Active The number of active databases (databases that are empty or contain data).

[ACI] Class

[Action:ActionName]

Count The total number of ActionName actions that were sent to the service.

Avg.Duration The average duration (in ms) of ActionName actions.

Shortest The shortest duration (in ms) of ActionName actions.

Longest The longest duration (in ms) of ActionName actions.

[Indexer] Class

[Connections]

Total The number of socket connections to the index port.

Unauthorized The number of index actions that IDOL Server received from unauthorized clients.

Paused The number of connections that were rejected because the service was paused.

InsufficientDiskSpace The number of connections that were rejected because there was insufficient disk space.

InvalidIndexCode The number of connections that were rejected because they contained an invalid index code.

[Commands]

Invalid The number of actions that the service received to the index port that were not valid index actions.

TruncatedData The number of index actions that were received that had truncated data.

CommandName The number of CommandName index actions that were run.

[Command:CommandName]

Avg.Duration The average duration (in ms) of CommandName index actions.

Shortest The shortest duration (in ms) of CommandName index actions.

Longest The longest duration (in ms) of CommandName index actions.

CommandsRejectedDiskFull The number of index actions that were rejected because the disk was full.

CommandsRejectedInvalidIndexCode The number of index actions that were rejected because their index code was invalid.

[Streaming]

BytesStreamedToDisk The number of bytes of data that the service has streamed to disk.

TimeSpentStreaming The amount of time in seconds that the service has spent streaming data.

[Queue]

Received The number of index actions that have been received.

Completed The number of index actions that have been completed.

Queued The number of index actions that are in the index queue.

[Rejected Commands]

Invalid The number of index actions that were rejected because they were not recognized actions.

RejectedInvalidDatabase The number of index actions that were rejected because they contained an invalid database.

ReadOnlyDatabase The number of index actions that were rejected because they contained a read-only database.

FileNotFound The number of index actions that were rejected because the file was not found.

DocLimitExceeded The number of index actions that were rejected because the document limit was exceeded.

IndexSizeExceeded The number of index actions that were rejected because the maximum index size was exceeded.

UserConfIndexLimitExceeded The number of index actions that were rejected because the configured maximum allowed index size was exceeded.

OutOfMemory The number of index actions that were rejected because IDOL server was out of memory.

BadParameter The number of index actions that were rejected because they contained an invalid parameter or parameter value.

InsufficientFileHandles The number of index actions that were rejected because there were insufficient file handles.

InsufficientDiskSpace The number of index actions that were rejected because there was not enough disk space.

TruncatedData The number of POST index actions that were rejected because their data termination was incorrect.

SuccessfullyProcessed The number of successfully run index actions.

OndiskComponent The number of index actions that have data stored on disk.

[Documents]

ReplacedReindex The number of documents that were re-indexed because an ACLType or Index field had changed.

ReplacedDocsTotal The number of documents that have been replaced.

InvalidDatabaseDocs The number of documents that were not indexed because their database was invalid.

[Database] Class

[DatabaseName]

Documents The number of documents that this database contains.

Sections The number of document sections that this database contains.

[Server] Class

[Tasks]

Number The number of tasks set up in the configuration file.

StartTask The first task that is performed.

IndexCommands The number of index actions that have been processed (the number displayed includes any index action that is currently being processed).

Documents The number of documents that have been processed (the number displayed includes any document that is currently being processed).

DocumentSuccesses The number of documents that have been processed successfully.

DocumentFailures The number of times that document processing has failed.

Sections The number of document sections processed.

[Tasks] Class

[TaskName]

Requests The number of requests sent to a specific task.

Successes The number of requests processed successfully by a specific task.

Failures The number of request-processing failures for a specific task.

[Licensing] Class

[Users]

Maximum The maximum number of users that can be set up for this service.

[Statistics] Class

[Users]

Users The number of users that have been set up for this service.

[CHILDSTAT] Class

[AllChildren]

TotalUpEvents The number of times a DIH child server was marked up.

TotalDownEvents The number of times a DIH child server was marked down.

[Engine N]

UpEvents The number of times this DIH child server was marked up.

DownEvents The number of times this DIH child server was marked down.

CommandsSent The number of actions that were sent to this DIH child server.

Retries The number of times actions to this DIH child server were retried.

TotalBytesSent The total number of bytes of data that were sent to this DIH child server.

AvgSendCommandRate The average rate that actions were sent to this DIH child server.

MinResponseTime The smallest time that DAH took to respond to a request.

AvgResponseTime The average time that DAH took to respond to requests.

MaxResponseTime The largest time that DAH took to respond to requests.

SuccessfulActions The number of actions that were successfully completed.

FailedActions The number of actions that failed.

Timeouts The number of actions that timed out.

Example action=GetStatistics

Parameters None.
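As a minimal sketch, a client could assemble the request for a parameterless action such as GetStatistics as shown here. The http://host:port/action=... request shape, the ServiceActionUri class, and the host and port values are assumptions made for illustration and are not taken from this guide; substitute the host and service port of your own connector.

```java
import java.net.URI;

/** Hypothetical helper for illustration; it is not part of ConnectorLib. */
final class ServiceActionUri {

    /**
     * Builds a URI of the assumed form http://host:port/action=ActionName
     * for an action that takes no parameters, such as GetStatistics.
     */
    static URI forAction(String host, int port, String action) {
        if (action == null || action.isEmpty()) {
            throw new IllegalArgumentException("action must not be empty");
        }
        return URI.create("http://" + host + ":" + port + "/action=" + action);
    }

    public static void main(String[] args) {
        // Host and port are placeholders; use the values for your own service.
        System.out.println(forAction("localhost", 7011, "GetStatistics"));
    }
}
```

The same helper could be reused for GetStatus, GetStatusInfo, and Stop, because none of these actions takes parameters.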

GetStatus

The GetStatus action returns the service’s status (running or stopped) and some current configuration settings.

Example action=GetStatus

Action Parameters None.

GetStatusInfo

The GetStatusInfo action returns status information for the service (for example, the service’s product name and version number). The following status information is returned:

Statistics Description

[StatusInfo]

ServiceStartTime The time the service started running (in epoch seconds).

ServiceUtilsVersion The version of the service utilities.

ServiceName The name of the service.

ProductName The product name of the service.

ProductVersion The version of the product.

ProductBuild The build of the product.

ServicePID The process ID of the service.

ProductUID The user identifier of the service.

ServiceStatus The status of the service (running or stopped).

[Job]

FlowRate The amount of data (in kilobytes) being aggregated per second.

Status The status of the connector job (running or stopped).

Example action=GetStatusInfo

Parameters None.
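Because ServiceStartTime is reported in epoch seconds, a client usually converts it before displaying it. The following sketch shows one way to do that with java.time. The ServiceUptime class and the sample value are hypothetical, and the sketch assumes you have already extracted the number from the GetStatusInfo response.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper for illustration; it is not part of ConnectorLib. */
final class ServiceUptime {

    /** Converts a ServiceStartTime value (epoch seconds) to an Instant. */
    static Instant startInstant(long serviceStartTime) {
        return Instant.ofEpochSecond(serviceStartTime);
    }

    /** Returns how long the service had been running at the given moment. */
    static Duration uptime(long serviceStartTime, Instant now) {
        return Duration.between(startInstant(serviceStartTime), now);
    }

    public static void main(String[] args) {
        long sampleStart = 1336608000L; // made-up sample value, not real output
        System.out.println("Started: " + startInstant(sampleStart));
        System.out.println("Uptime:  " + uptime(sampleStart, Instant.now()));
    }
}
```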

Stop

The Stop action stops the service.

Example action=Stop

Parameters None.

Service Action Parameters

This section describes the parameters for service actions.

FromDisk

Name

Tail

FromDisk Specifies whether the log stream is read from disk or memory. Type true if you want the log stream to be read from disk rather than from memory.

Action GetLogStream

Type: Boolean

Default: false

Required: No

Example: action=GetLogStream&Name=ApplicationLogStream&FromDisk=true&Tail=10

Name Type the name of the log stream you want to return.

Action GetLogStream

Type: String

Default: false

Required: Yes

Example: action=GetLogStream&Name=ApplicationLogStream&FromDisk=true&Tail=10

Tail Type the number of lines from the log stream to return. The lines are read from the top (that is, the most recent lines are returned). Type -1 to return all entries.

Action GetLogStream

Type: Long

Default: -1

Required: No

Example: action=GetLogStream&Name=ApplicationLogStream&Tail=10
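The three GetLogStream parameters described above can be combined into a single action string. The following is a minimal sketch rather than a definitive implementation: the GetLogStreamQuery class is hypothetical, and omitting FromDisk and Tail when they equal their documented defaults (false and -1) is a design choice of the sketch, not a requirement stated in this guide.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper for illustration; it is not part of ConnectorLib. */
final class GetLogStreamQuery {

    /** Builds the action string; Name is required, the other values are optional. */
    static String build(String name, boolean fromDisk, long tail) {
        if (name == null || name.isEmpty()) {
            throw new IllegalArgumentException("Name is required");
        }
        StringBuilder query = new StringBuilder("action=GetLogStream");
        query.append("&Name=").append(encode(name));
        if (fromDisk) {
            query.append("&FromDisk=true"); // documented default is false
        }
        if (tail != -1) {
            query.append("&Tail=").append(tail); // documented default is -1 (all entries)
        }
        return query.toString();
    }

    /** URL-encodes a parameter value as UTF-8. */
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build("ApplicationLogStream", true, 10));
    }
}
```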

Appendixes

This section includes the following appendixes:

• KeyView Format Codes

APPENDIX A KeyView Format Codes

This chapter lists the KeyView format classes and codes used with Connector Framework server. It includes the following sections:

• KeyView Classes

• KeyView Formats

Table 1 lists KeyView file classes. The numbers are reported in the DocumentClass field in IDX files generated by Import Module. Consult the table to determine the file class that was imported.

Table 2 lists all KeyView formats. The numbers are reported in the DocumentType field in IDX files generated by Import Module. Consult the table to determine the file type that was imported. You can use any of the format numbers from Table 2 in conjunction with the ImportFamilyRootExcludeFmtCSV parameter. For more information, see “ImportFamilyRootExcludeFmtCSV” on page 203.
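As a hedged illustration, the following sketch assembles a value for ImportFamilyRootExcludeFmtCSV from Table 2 format numbers. It assumes, as the parameter name suggests, that the value is a comma-separated list of format numbers; check the parameter reference for the exact syntax. The ExcludeFormatsSetting class and the choice of formats are hypothetical.

```java
import java.util.StringJoiner;

/** Hypothetical helper for illustration; it is not part of ConnectorLib. */
final class ExcludeFormatsSetting {

    /** Joins Table 2 format numbers into an assumed comma-separated setting line. */
    static String toConfigLine(int... formatNumbers) {
        StringJoiner csv = new StringJoiner(",");
        for (int number : formatNumbers) {
            if (number < 1) {
                throw new IllegalArgumentException("Not a Table 2 format number: " + number);
            }
            csv.add(Integer.toString(number));
        }
        return "ImportFamilyRootExcludeFmtCSV=" + csv;
    }

    public static void main(String[] args) {
        // 157 = PKZIP_Fmt, 225 = TAR_Fmt, 357 = RAR_Fmt (numbers from Table 2)
        System.out.println(toConfigLine(157, 225, 357));
    }
}
```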

KeyView Classes

Table 1 KeyView Classes

Attribute Number File Class

0 No file class

01 Word processor

02 Spreadsheet

03 Database

04 Raster image

05 Vector graphic

06 Presentation

07 Executable

08 Encapsulation

09 Sound

10 Desktop publishing

11 Outline/planning

12 Miscellaneous

13 Mixed format

14 Font

15 Time scheduling

16 Communications

17 Object module

18 Library module

19 Fax

20 Movie

21 Animation
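To turn a DocumentClass number read from an IDX file into the file class listed in Table 1, a simple lookup is enough. The sketch below copies the names from Table 1. The KeyViewFileClass class and its handling of unlisted numbers are hypothetical.

```java
/** Hypothetical lookup for illustration; it is not part of ConnectorLib. */
final class KeyViewFileClass {

    // Index = attribute number from Table 1, value = file class.
    private static final String[] NAMES = {
        "No file class", "Word processor", "Spreadsheet", "Database",
        "Raster image", "Vector graphic", "Presentation", "Executable",
        "Encapsulation", "Sound", "Desktop publishing", "Outline/planning",
        "Miscellaneous", "Mixed format", "Font", "Time scheduling",
        "Communications", "Object module", "Library module", "Fax",
        "Movie", "Animation"
    };

    /** Returns the Table 1 file class, or a marker string for unlisted numbers. */
    static String describe(int documentClass) {
        if (documentClass < 0 || documentClass >= NAMES.length) {
            return "Unknown (" + documentClass + ")";
        }
        return NAMES[documentClass];
    }

    public static void main(String[] args) {
        System.out.println(describe(2));
    }
}
```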

KeyView Formats

Table 2 KeyView Formats

Format Name Format Number Format Description

AES_Multiplus_Comm_Fmt 1 Multiplus (AES)

ASCII_Text_Fmt 2 Text

MSDOS_Batch_File_Fmt 3 MS-DOS Batch File

Applix_Alis_Fmt 4 APPLIX ASTERIX

BMP_Fmt 5 Windows Bitmap

CT_DEF_Fmt 6 Convergent Technologies DEF Comm. Format

Corel_Draw_Fmt 7 Corel Draw

CGM_ClearText_Fmt 8 Computer Graphics Metafile (CGM)

CGM_Binary_Fmt 9 Computer Graphics Metafile (CGM)

CGM_Character_Fmt 10 Computer Graphics Metafile (CGM)

Word_Connection_Fmt 11 Word Connection

COMET_TOP_Word_Fmt 12 COMET TOP

CEOwrite_Fmt 13 CEOwrite

DSA101_Fmt 14 DSA101 (Honeywell Bull)

DCA_RFT_Fmt 15 DCA-RFT (IBM Revisable Form)

CDA_DDIF_Fmt 16 CDA / DDIF

DG_CDS_Fmt 17 DG Common Data Stream (CDS)

Micrografx_Draw_Fmt 18 Windows Draw (Micrografx)

Data_Point_VistaWord_Fmt 19 Vistaword

DECdx_Fmt 20 DECdx

Enable_WP_Fmt 21 Enable Word Processing

EPSF_Fmt 22 Encapsulated PostScript

Preview_EPSF_Fmt 23 Encapsulated PostScript

MS_Executable_Fmt 24 MSDOS/Windows Program

G31D_Fmt 25 CCITT G3 1D

GIF_87a_Fmt 26 Graphics Interchange Format (GIF87a)

GIF_89a_Fmt 27 Graphics Interchange Format (GIF89a)

HP_Word_PC_Fmt 28 HP Word PC

IBM_1403_LinePrinter_Fmt 29 IBM 1403 Line Printer

IBM_DCF_Script_Fmt 30 DCF Script

IBM_DCA_FFT_Fmt 31 DCA-FFT (IBM Final Form)

Interleaf_Fmt 32 Interleaf

GEM_Image_Fmt 33 GEM Bit Image

IBM_Display_Write_Fmt 34 Display Write

Sun_Raster_Fmt 35 Sun Raster

Ami_Pro_Fmt 36 Lotus Ami Pro

Ami_Pro_StyleSheet_Fmt 37 Lotus Ami Pro Style Sheet

MORE_Fmt 38 MORE Database MAC

Lyrix_Fmt 39 Lyrix Word Processing

MASS_11_Fmt 40 MASS-11

MacPaint_Fmt 41 MacPaint

MS_Word_Mac_Fmt 42 Microsoft Word for Macintosh

SmartWare_II_Comm_Fmt 43 SmartWare II

MS_Word_Win_Fmt 44 Microsoft Word for Windows

Multimate_Fmt 45 MultiMate

Multimate_Fnote_Fmt 46 MultiMate Footnote File

Multimate_Adv_Fmt 47 MultiMate Advantage

Multimate_Adv_Fnote_Fmt 48 MultiMate Advantage Footnote File

Multimate_Adv_II_Fmt 49 MultiMate Advantage II

Multimate_Adv_II_Fnote_Fmt 50 MultiMate Advantage II Footnote File

Multiplan_PC_Fmt 51 Multiplan (PC)

Multiplan_Mac_Fmt 52 Multiplan (Mac)

MS_RTF_Fmt 53 Rich Text Format (RTF)

MS_Word_PC_Fmt 54 Microsoft Word for PC

MS_Word_PC_StyleSheet_Fmt 55 Microsoft Word for PC Style Sheet

MS_Word_PC_Glossary_Fmt 56 Microsoft Word for PC Glossary

MS_Word_PC_Driver_Fmt 57 Microsoft Word for PC Driver

MS_Word_PC_Misc_Fmt 58 Microsoft Word for PC Miscellaneous File

NBI_Async_Archive_Fmt 59 NBI Async Archive Format

Navy_DIF_Fmt 60 Navy DIF

NBI_Net_Archive_Fmt 61 NBI Net Archive Format

NIOS_TOP_Fmt 62 NIOS TOP

FileMaker_Mac_Fmt 63 Filemaker MAC

ODA_Q1_11_Fmt 64 ODA / ODIF

ODA_Q1_12_Fmt 65 ODA / ODIF

OLIDIF_Fmt 66 OLIDIF (Olivetti)

Office_Writer_Fmt 67 Office Writer

PC_Paintbrush_Fmt 68 PC Paintbrush Graphics (PCX)

CPT_Comm_Fmt 69 CPT

Lotus_PIC_Fmt 70 Lotus PIC

Mac_PICT_Fmt 71 QuickDraw Picture

Philips_Script_Word_Fmt 72 Philips Script

PostScript_Fmt 73 PostScript

PRIMEWORD_Fmt 74 PRIMEWORD

Quadratron_Q_One_v1_Fmt 75 Q-One V1.93J

Quadratron_Q_One_v2_Fmt 76 Q-One V2.0

SAMNA_Word_IV_Fmt 77 SAMNA Word

Ami_Pro_Draw_Fmt 78 Lotus Ami Pro Draw

SYLK_Spreadsheet_Fmt 79 SYLK

SmartWare_II_WP_Fmt 80 SmartWare II

Symphony_Fmt 81 Symphony

Targa_Fmt 82 Targa

TIFF_Fmt 83 TIFF

Targon_Word_Fmt 84 Targon Word

Uniplex_Ucalc_Fmt 85 Uniplex Ucalc

Uniplex_WP_Fmt 86 Uniplex

MS_Word_UNIX_Fmt 87 Microsoft Word UNIX

WANG_PC_Fmt 88 WANG PC

WordERA_Fmt 89 WordERA

WANG_WPS_Comm_Fmt 90 WANG WPS

WordPerfect_Mac_Fmt 91 WordPerfect MAC

WordPerfect_Fmt 92 WordPerfect

WordPerfect_VAX_Fmt 93 WordPerfect VAX

WordPerfect_Macro_Fmt 94 WordPerfect Macro

WordPerfect_Dictionary_Fmt 95 WordPerfect Spelling Dictionary

WordPerfect_Thesaurus_Fmt 96 WordPerfect Thesaurus

WordPerfect_Resource_Fmt 97 WordPerfect Resource File

WordPerfect_Driver_Fmt 98 WordPerfect Driver

WordPerfect_Cfg_Fmt 99 WordPerfect Configuration File

WordPerfect_Hyphenation_Fmt 100 WordPerfect Hyphenation Dictionary

WordPerfect_Misc_Fmt 101 WordPerfect Miscellaneous File

WordMARC_Fmt 102 WordMARC

Windows_Metafile_Fmt 103 Windows Metafile

Windows_Metafile_NoHdr_Fmt 104 Windows Metafile (no header)

SmartWare_II_DB_Fmt 105 SmartWare II

WordPerfect_Graphics_Fmt 106 WordPerfect Graphics

WordStar_Fmt 107 WordStar

WANG_WITA_Fmt 108 WANG WITA

Xerox_860_Comm_Fmt 109 Xerox 860

Xerox_Writer_Fmt 110 Xerox Writer

DIF_SpreadSheet_Fmt 111 Data Interchange Format (DIF)

Enable_Spreadsheet_Fmt 112 Enable Spreadsheet

SuperCalc_Fmt 113 Supercalc

UltraCalc_Fmt 114 UltraCalc

SmartWare_II_SS_Fmt 115 SmartWare II

SOF_Encapsulation_Fmt 116 Serialized Object Format (SOF)

PowerPoint_Win_Fmt 117 PowerPoint PC

PowerPoint_Mac_Fmt 118 PowerPoint MAC

PowerPoint_95_Fmt 119 PowerPoint 95

PowerPoint_97_Fmt 120 PowerPoint 97

PageMaker_Mac_Fmt 121 PageMaker for Macintosh

PageMaker_Win_Fmt 122 PageMaker for Windows

MS_Works_Mac_WP_Fmt 123 Microsoft Works for MAC

MS_Works_Mac_DB_Fmt 124 Microsoft Works for MAC

MS_Works_Mac_SS_Fmt 125 Microsoft Works for MAC

MS_Works_Mac_Comm_Fmt 126 Microsoft Works for MAC

MS_Works_DOS_WP_Fmt 127 Microsoft Works for DOS

MS_Works_DOS_DB_Fmt 128 Microsoft Works for DOS

MS_Works_DOS_SS_Fmt 129 Microsoft Works for DOS

MS_Works_Win_WP_Fmt 130 Microsoft Works for Windows

MS_Works_Win_DB_Fmt 131 Microsoft Works for Windows

MS_Works_Win_SS_Fmt 132 Microsoft Works for Windows

PC_Library_Fmt 133 DOS/Windows Object Library

MacWrite_Fmt 134 MacWrite

MacWrite_II_Fmt 135 MacWrite II

Freehand_Fmt 136 Freehand MAC

Disk_Doubler_Fmt 137 Disk Doubler

HP_GL_Fmt 138 HP Graphics Language

FrameMaker_Fmt 139 FrameMaker

FrameMaker_Book_Fmt 140 FrameMaker

Maker_Markup_Language_Fmt 141 Maker Markup Language

Maker_Interchange_Fmt 142 Maker Interchange Format (MIF)

JPEG_File_Interchange_Fmt 143 JPEG Interchange Format

Reflex_Fmt 144 Reflex

Framework_Fmt 145 Framework

Framework_II_Fmt 146 Framework II

Paradox_Fmt 147 Paradox

MS_Windows_Write_Fmt 148 Windows Write

Quattro_Pro_DOS_Fmt 149 Quattro Pro for DOS

Quattro_Pro_Win_Fmt 150 Quattro Pro for Windows

Persuasion_Fmt 151 Persuasion

Windows_Icon_Fmt 152 Windows Icon Format

Windows_Cursor_Fmt 153 Windows Cursor

MS_Project_Activity_Fmt 154 Microsoft Project

MS_Project_Resource_Fmt 155 Microsoft Project

MS_Project_Calc_Fmt 156 Microsoft Project

PKZIP_Fmt 157 ZIP Archive

Quark_Xpress_Fmt 158 Quark Xpress MAC

ARC_PAK_Archive_Fmt 159 PAK/ARC Archive

MS_Publisher_Fmt 160 Microsoft Publisher

PlanPerfect_Fmt 161 PlanPerfect

WordPerfect_Auxiliary_Fmt 162 WordPerfect auxiliary file

MS_WAVE_Audio_Fmt 163 Microsoft Wave

MIDI_Audio_Fmt 164 MIDI

AutoCAD_DXF_Binary_Fmt 165 AutoCAD DXF

AutoCAD_DXF_Text_Fmt 166 AutoCAD DXF

dBase_Fmt 167 dBase

OS_2_PM_Metafile_Fmt 168 OS/2 PM Metafile

Lasergraphics_Language_Fmt 169 Lasergraphics Language

AutoShade_Rendering_Fmt 170 AutoShade Rendering

GEM_VDI_Fmt 171 GEM VDI

Windows_Help_Fmt 172 Windows Help File

Volkswriter_Fmt 173 Volkswriter

Ability_WP_Fmt 174 Ability

Ability_DB_Fmt 175 Ability

Ability_SS_Fmt 176 Ability

Ability_Comm_Fmt 177 Ability

Ability_Image_Fmt 178 Ability

XyWrite_Fmt 179 XYWrite / Nota Bene

CSV_Fmt 180 CSV (Comma Separated Values)

IBM_Writing_Assistant_Fmt 181 IBM Writing Assistant

WordStar_2000_Fmt 182 WordStar 2000

HP_PCL_Fmt 183 HP Printer Control Language

UNIX_Exe_PreSysV_VAX_Fmt 184 Unix Executable (PDP-11/pre-System V VAX)

UNIX_Exe_Basic_16_Fmt 185 Unix Executable (Basic-16)

UNIX_Exe_x86_Fmt 186 Unix Executable (x86)

UNIX_Exe_iAPX_286_Fmt 187 Unix Executable (iAPX 286)

UNIX_Exe_MC68k_Fmt 188 Unix Executable (MC680x0)

UNIX_Exe_3B20_Fmt 189 Unix Executable (3B20)

UNIX_Exe_WE32000_Fmt 190 Unix Executable (WE32000)

UNIX_Exe_VAX_Fmt 191 Unix Executable (VAX)

UNIX_Exe_Bell_5_Fmt 192 Unix Executable (Bell 5.0)

UNIX_Obj_VAX_Demand_Fmt 193 Unix Object Module (VAX Demand)

UNIX_Obj_MS8086_Fmt 194 Unix Object Module (old MS 8086)

UNIX_Obj_Z8000_Fmt 195 Unix Object Module (Z8000)

AU_Audio_Fmt 196 NeXT/Sun Audio Data

NeWS_Font_Fmt 197 NeWS bitmap font

cpio_Archive_CRChdr_Fmt 198 cpio archive (CRC Header)

cpio_Archive_CHRhdr_Fmt 199 cpio archive (CHR Header)

PEX_Binary_Archive_Fmt 200 SUN PEX Binary Archive

Sun_vfont_Fmt 201 SUN vfont Definition

Curses_Screen_Fmt 202 Curses Screen Image

UUEncoded_Fmt 203 UU encoded

WriteNow_Fmt 204 WriteNow MAC

PC_Obj_Fmt 205 DOS/Windows Object Module

Windows_Group_Fmt 206 Windows Group

TrueType_Font_Fmt 207 TrueType Font

Windows_PIF_Fmt 208 Program Information File (PIF)

MS_COM_Executable_Fmt 209 PC (.COM)

StuffIt_Fmt 210 StuffIt (MAC)

PeachCalc_Fmt 211 PeachCalc

Wang_GDL_Fmt 212 WANG Office GDL Header

Q_A_DOS_Fmt 213 Q & A for DOS

Q_A_Win_Fmt 214 Q & A for Windows

WPS_PLUS_Fmt 215 WPS-PLUS

DCX_Fmt 216 DCX FAX Format (PCX images)

OLE_Fmt 217 OLE Compound Document

EBCDIC_Fmt 218 EBCDIC Text

DCS_Fmt 219 DCS

UNIX_SHAR_Fmt 220 SHAR

Lotus_Notes_BitMap_Fmt 221 Lotus Notes Bitmap

Lotus_Notes_CDF_Fmt 222 Lotus Notes CDF

Compress_Fmt 223 Unix Compress

GZ_Compress_Fmt 224 GZ Compress

TAR_Fmt 225 TAR

ODIF_FOD26_Fmt 226 ODA / ODIF

ODIF_FOD36_Fmt 227 ODA / ODIF

ALIS_Fmt 228 ALIS

Envoy_Fmt 229 Envoy

PDF_Fmt 230 Portable Document Format

BinHex_Fmt 231 BinHex

SMTP_Fmt 232 SMTP

MIME_Fmt 233 MIME

USENET_Fmt 234 USENET

SGML_Fmt 235 SGML

HTML_Fmt 236 HTML

ACT_Fmt 237 ACT

PNG_Fmt 238 Portable Network Graphics (PNG)

MS_Video_Fmt 239 Video for Windows (AVI)

Windows_Animated_Cursor_Fmt 240 Windows Animated Cursor

Windows_CPP_Obj_Storage_Fmt 241 Windows C++ Object Storage

Windows_Palette_Fmt 242 Windows Palette

RIFF_DIB_Fmt 243 RIFF Device Independent Bitmap

RIFF_MIDI_Fmt 244 RIFF MIDI

RIFF_Multimedia_Movie_Fmt 245 RIFF Multimedia Movie

MPEG_Fmt 246 MPEG Movie

QuickTime_Fmt 247 QuickTime Movie

AIFF_Fmt 248 Audio Interchange File Format (AIFF)

Amiga_MOD_Fmt 249 Amiga MOD

Amiga_IFF_8SVX_Fmt 250 Amiga IFF (8SVX) Sound

Creative_Voice_Audio_Fmt 251 Creative Voice (VOC)

AutoDesk_Animator_FLI_Fmt 252 AutoDesk Animator FLIC

AutoDesk_AnimatorPro_FLC_Fmt 253 AutoDesk Animator Pro FLIC

Compactor_Archive_Fmt 254 Compactor / Compact Pro

VRML_Fmt 255 VRML

QuickDraw_3D_Metafile_Fmt 256 QuickDraw 3D Metafile

PGP_Secret_Keyring_Fmt 257 PGP Secret Keyring

PGP_Public_Keyring_Fmt 258 PGP Public Keyring

PGP_Encrypted_Data_Fmt 259 PGP Encrypted Data

PGP_Signed_Data_Fmt 260 PGP Signed Data

PGP_SignedEncrypted_Data_Fmt 261 PGP Signed and Encrypted Data

PGP_Sign_Certificate_Fmt 262 PGP Signature Certificate

PGP_Compressed_Data_Fmt 263 PGP Compressed Data

PGP_ASCII_Public_Keyring_Fmt 264 ASCII-armored PGP Public Keyring

PGP_ASCII_Encoded_Fmt 265 ASCII-armored PGP encoded

PGP_ASCII_Signed_Fmt 266 ASCII-armored PGP encoded

OLE_DIB_Fmt 267 OLE DIB object

SGI_Image_Fmt 268 SGI Image

Lotus_ScreenCam_Fmt 269 Lotus ScreenCam

MPEG_Audio_Fmt 270 MPEG Audio

FTP_Software_Session_Fmt 271 FTP Session Data

Netscape_Bookmark_File_Fmt 272 Netscape Bookmark File

Corel_Draw_CMX_Fmt 273 Corel CMX

AutoDesk_DWG_Fmt 274 AutoDesk Drawing (DWG)

AutoDesk_WHIP_Fmt 275 AutoDesk WHIP

Macromedia_Director_Fmt 276 Macromedia Director

Real_Audio_Fmt 277 Real Audio

MSDOS_Device_Driver_Fmt 278 MSDOS Device Driver

Micrografx_Designer_Fmt 279 Micrografx Designer

SVF_Fmt 280 Simple Vector Format (SVF)

Applix_Words_Fmt 281 Applix Words

Applix_Graphics_Fmt 282 Applix Graphics

MS_Access_Fmt 283 Microsoft Access

MS_Access_95_Fmt 284 Microsoft Access 95

MS_Access_97_Fmt 285 Microsoft Access 97

MacBinary_Fmt 286 MacBinary

Apple_Single_Fmt 287 Apple Single

Apple_Double_Fmt 288 Apple Double

Enhanced_Metafile_Fmt 289 Enhanced Metafile

MS_Office_Drawing_Fmt 290 Microsoft Office Drawing

XML_Fmt 291 XML

DeVice_Independent_Fmt 292 DeVice Independent file (DVI)

Unicode_Fmt 293 Unicode

Lotus_123_Worksheet_Fmt 294 Lotus 1-2-3

Lotus_123_Format_Fmt 295 Lotus 1-2-3 Formatting

Lotus_123_97_Fmt 296 Lotus 1-2-3 97

Lotus_Word_Pro_96_Fmt 297 Lotus Word Pro 96

Lotus_Word_Pro_97_Fmt 298 Lotus Word Pro 97

Freelance_DOS_Fmt 299 Lotus Freelance for DOS

Freelance_Win_Fmt 300 Lotus Freelance for Windows

Freelance_OS2_Fmt 301 Lotus Freelance for OS/2

Freelance_96_Fmt 302 Lotus Freelance 96

Freelance_97_Fmt 303 Lotus Freelance 97

MS_Word_95_Fmt 304 Microsoft Word 95

MS_Word_97_Fmt 305 Microsoft Word 97

Excel_Fmt 306 Microsoft Excel

Excel_Chart_Fmt 307 Microsoft Excel

Excel_Macro_Fmt 308 Microsoft Excel

Excel_95_Fmt 309 Microsoft Excel 95

Excel_97_Fmt 310 Microsoft Excel 97

Corel_Presentations_Fmt 311 Corel Presentations

Harvard_Graphics_Fmt 312 Harvard Graphics

Harvard_Graphics_Chart_Fmt 313 Harvard Graphics Chart

Harvard_Graphics_Symbol_Fmt 314 Harvard Graphics Symbol File

Harvard_Graphics_Cfg_Fmt 315 Harvard Graphics Configuration File

Harvard_Graphics_Palette_Fmt 316 Harvard Graphics Palette

Lotus_123_R9_Fmt 317 Lotus 1-2-3 Release 9

Applix_Spreadsheets_Fmt 318 Applix Spreadsheets

MS_Pocket_Word_Fmt 319 Microsoft Pocket Word

MS_DIB_Fmt 320 MS Windows Device Independent Bitmap

MS_Word_2000_Fmt 321 Microsoft Word 2000

Excel_2000_Fmt 322 Microsoft Excel 2000

PowerPoint_2000_Fmt 323 Microsoft PowerPoint 2000

MS_Access_2000_Fmt 324 Microsoft Access 2000

MS_Project_4_Fmt 325 Microsoft Project 4

MS_Project_41_Fmt 326 Microsoft Project 4.1

MS_Project_98_Fmt 327 Microsoft Project 98

Folio_Flat_Fmt 328 Folio Flat File

HWP_Fmt 329 HWP (Arae-Ah Hangul)

ICHITARO_Fmt 330 ICHITARO V4-10

IS_XML_Fmt 331 Extended or Custom XML

Oasys_Fmt 332 Oasys format

PBM_ASC_Fmt 333 Portable Bitmap Utilities ASCII Format

PBM_BIN_Fmt 334 Portable Bitmap Utilities Binary Format

PGM_ASC_Fmt 335 Portable Greymap Utilities ASCII Format

PGM_BIN_Fmt 336 Portable Greymap Utilities Binary Format

PPM_ASC_Fmt 337 Portable Pixmap Utilities ASCII Format

PPM_BIN_Fmt 338 Portable Pixmap Utilities Binary Format

XBM_Fmt 339 X Bitmap Format

XPM_Fmt 340 X Pixmap Format

FPX_Fmt 341 FPX Format

PCD_Fmt 342 PCD Format

MS_Visio_Fmt 343 Microsoft Visio

MS_Project_2000_Fmt 344 Microsoft Project 2000

MS_Outlook_Fmt 345 Microsoft Outlook

ELF_Relocatable_Fmt 346 ELF Relocatable

ELF_Executable_Fmt 347 ELF Executable

ELF_Dynamic_Lib_Fmt 348 ELF Dynamic Library

MS_Word_XML_Fmt 349 Microsoft Word 2003 XML

MS_Excel_XML_Fmt 350 Microsoft Excel 2003 XML

MS_Visio_XML_Fmt 351 Microsoft Visio 2003 XML

SO_Text_XML_Fmt 352 StarOffice Text XML

SO_Spreadsheet_XML_Fmt 353 StarOffice Spreadsheet XML

SO_Presentation_XML_Fmt 354 StarOffice Presentation XML

XHTML_Fmt 355 XHTML

MS_OutlookPST_Fmt 356 Microsoft Outlook PST

RAR_Fmt 357 RAR

Lotus_Notes_NSF_Fmt 358 IBM Lotus Notes Database NSF/NTF

Macromedia_Flash_Fmt 359 SWF

MS_Word_2007_Fmt 360 Microsoft Word 2007 XML

MS_Excel_2007_Fmt 361 Microsoft Excel 2007 XML

MS_PPT_2007_Fmt 362 Microsoft PPT 2007 XML

OpenPGP_Fmt 363 OpenPGP Message Format (with new packet format)

Intergraph_V7_DGN_Fmt 364 Intergraph Standard File Format (ISFF) V7 DGN (non-OLE)

MicroStation_V8_DGN_Fmt 365 MicroStation V8 DGN (OLE)

MS_Word_Macro_2007_Fmt 366 Microsoft Word Macro 2007 XML

MS_Excel_Macro_2007_Fmt 367 Microsoft Excel Macro 2007 XML

MS_PPT_Macro_2007_Fmt 368 Microsoft PPT Macro 2007 XML

LZH_Fmt 369 LHA Archive

Office_2007_Fmt 370 Office 2007 document

MS_XPS_Fmt 371 Microsoft XML Paper Specification (XPS)

Lotus_Domino_DXL_Fmt 372 IBM Lotus representation of Domino design elements in XML format

ODF_Text_Fmt 373 ODF Text

ODF_Spreadsheet_Fmt 374 ODF Spreadsheet

ODF_Presentation_Fmt 375 ODF Presentation

Legato_Extender_ONM_Fmt 376 Legato Extender Native Message ONM

bin_Unknown_Fmt 377 n/a

TNEF_Fmt 378 Transport Neutral Encapsulation Format (TNEF)

CADAM_Drawing_Fmt 379 CADAM Drawing

CADAM_Drawing_Overlay_Fmt 380 CADAM Drawing Overlay

NURSTOR_Drawing_Fmt 381 NURSTOR Drawing

HP_GLP_Fmt 382 HP Graphics Language (Plotter)

ASF_Fmt 383 Advanced Systems Format (ASF)

WMA_Fmt 384 Windows Media Audio Format (WMA)

WMV_Fmt 385 Windows Media Video Format (WMV)

EMX_Fmt 386 Legato EMailXtender Archives Format (EMX)

Z7Z_Fmt 387 7 Zip Format (7z)

MS_Excel_Binary_2007_Fmt 388 Microsoft Excel Binary 2007

CAB_Fmt 389 Microsoft Cabinet File (CAB)

CATIA_Fmt 390 CATIA Formats (CAT*)

YIM_Fmt 391 Yahoo Instant Messenger History

ODF_Drawing_Fmt 392 ODF Drawing

Founder_CEB_Fmt 393 Founder Chinese E-paper Basic (CEB)

QPW_Fmt 394 Quattro Pro 9+ for Windows

MHT_Fmt 395 MHT format

MDI_Fmt 396 Microsoft Document Imaging Format

GRV_Fmt 397 Microsoft Office Groove Format

IWWP_Fmt 398 Apple iWork Pages format

IWSS_Fmt 399 Apple iWork Numbers format

IWPG_Fmt 400 Apple iWork Keynote format

BKF_Fmt 401 Windows Backup File

MS_Access_2007_Fmt 402 Microsoft Access 2007

ENT_Fmt 403 Microsoft Entourage Database Format

DMG_Fmt 404 Mac Disk Copy Disk Image File

CWK_Fmt 405 AppleWorks File

OO3_Fmt 406 Omni Outliner File

OPML_Fmt 407 Omni Outliner File

Omni_Graffle_XML_File 408 Omni Graffle XML File

PSD_Fmt 409 Photoshop Document

Apple_Binary_PList_Fmt 410 Apple Binary Property List format

Apple_iChat_Fmt 411 Apple iChat format

OOUTLINE_Fmt 412 OOutliner File

BZIP2_Fmt 413 Bzip 2 Compressed File

ISO_Fmt 414 ISO-9660 CD Disc Image Format

DocuWorks_Fmt 415 DocuWorks Format

RealMedia_Fmt 416 RealMedia Streaming Media

AC3Audio_Fmt 417 AC3 Audio File Format

NEF_Fmt 418 Nero Encrypted File

SolidWorks_Fmt 419 SolidWorks Format Files

XFDL_Fmt 420 Extensible Forms Description Language

Apple_XML_PList_Fmt 421 Apple XML Property List format

OneNote_Fmt 422 OneNote Note Format

Dicom_Fmt 424 Digital Imaging and Communications in Medicine

EnCase_Fmt 425 Expert Witness Compression Format (EnCase)

Scrap_Fmt 426 Shell Scrap Object File

MS_Project_2007_Fmt 427 Microsoft Project 2007

MS_Publisher_98_Fmt 428 Microsoft Publisher 98/2000/2002/2003/2007

Skype_Fmt 429 Skype Log File

Hl7_Fmt 430 Health Level 7 (HL7) message

MS_OutlookOST 431 Microsoft Outlook OST

Epub_Fmt 432 Electronic Publication

MS_OEDBX_Fmt 433 Microsoft Outlook Express DBX

BB_Activ_Fmt 434 BlackBerry Activation File

DiskImage_Fmt 435 Disk Image

Milestone_Fmt 436 Milestone Document

E_Transcript_Fmt 437 RealLegal E-Transcript File

PostScript_Font_Fmt 438 PostScript Type 1 Font

Ghost_DiskImage_Fmt 439 Ghost Disk Image File

JPEG_2000_JP2_File_Fmt 440 JPEG-2000 JP2 File Format Syntax (ISO/IEC 15444-1)

Unicode_HTML_Fmt 441 Unicode HTML

CHM_Fmt 442 Microsoft Compiled HTML Help

EMCMF_Fmt 443 Documentum EMCMF format

MS_Access_2007_Tmpl_Fmt 444 Microsoft Access 2007 Template

Jungum_Fmt 445 Samsung Electronics Jungum Global document

JBIG2 446 JBIG2 File Format

EFax_Fmt 447 eFax file

AD1_Fmt 448 Forensic Toolkit FTK Imager file

SketchUp_Fmt 449 Google SketchUp

GWFS_Email_Fmt 450 Group Wise File Surf email

JNT_Fmt 451 Windows Journal Format
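If you keep Table 2 as plain text, each row can be split into its three columns so that a DocumentType number can be matched to a format. The following sketch assumes one row per line in the form "name number description". The KeyViewFormatRow class is hypothetical and is not part of ConnectorLib.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical holder for one Table 2 row; it is not part of ConnectorLib. */
final class KeyViewFormatRow {

    // Format name (no spaces), format number, then the free-text description.
    private static final Pattern ROW = Pattern.compile("^(\\S+)\\s+(\\d+)\\s+(.+)$");

    final String name;
    final int number;
    final String description;

    private KeyViewFormatRow(String name, int number, String description) {
        this.name = name;
        this.number = number;
        this.description = description;
    }

    /** Parses a row such as "PDF_Fmt 230 Portable Document Format". */
    static KeyViewFormatRow parse(String line) {
        Matcher m = ROW.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a format row: " + line);
        }
        return new KeyViewFormatRow(m.group(1), Integer.parseInt(m.group(2)), m.group(3));
    }

    public static void main(String[] args) {
        KeyViewFormatRow row = parse("PDF_Fmt 230 Portable Document Format");
        System.out.println(row.number + " -> " + row.name + " (" + row.description + ")");
    }
}
```

Rows parsed this way could be collected into a map keyed by format number to look up the DocumentType value of an imported file.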

Glossary

A

ACI (Autonomy Content Infrastructure) The Autonomy Content Infrastructure is a technology layer that automates operations on unstructured information for cross-enterprise applications, thus enabling an automated and compatible business-to-business, peer-to-peer infrastructure. The ACI allows enterprise applications to understand and process content that exists in unstructured formats, such as e-mail, Web pages, office documents, and Lotus Notes.

ACL (access control list) An ACL is a set of data associated with a document that defines which users, groups, and roles are permitted to access a document or data source (for example, an Oracle database or Windows file system).

C

connector A connector is an Autonomy fetching solution (such as HTTP Connector, Oracle Connector, Exchange Connector, and so on) that allows you to retrieve information from any type of local or remote repository such as a database or Web site. It imports the fetched documents into IDX or XML file format and indexes them into IDOL Server, from where you can retrieve them (for example by sending queries to IDOL Server).

D

database An Autonomy database is an IDOL Server data pool that stores indexed information. The administrator can set up one or more databases and specify how data is fed to the databases. You can retrieve information that is indexed in the IDOL Server database by sending a query to the IDOL Server.

DIH (Distributed Index Handler) The Distributed Index Handler allows you to efficiently split and index extremely large quantities of data into multiple IDOL Servers to create a completely scalable solution that delivers high performance and high availability. It provides a flexible way of transparently batching, routing, and categorizing the indexing of internal and external content into the IDOL Server.

DiSH (Distributed Service Handler) The Distributed Service Handler provides a unified way to communicate with all Autonomy services from a centralized location. It also facilitates the licensing that enables you to run Autonomy solutions. You must have an Autonomy DiSH server running on a machine with a static known IP address.

F

fetch The process of downloading documents from the repository in which they are stored (such as a local folder, Web site, database, Lotus Domino server, and so on), importing them to IDX format, and indexing them into an IDOL Server.

I

IAS (Intellectual Asset Protection System) The Intellectual Asset Protection System provides an integrated security solution to protect your data. At the front end, authentication checks that users are allowed to access the system on which result data is displayed. At the back end, entitlement checking and authentication combine to ensure that query results contain only documents the user is allowed to see, from repositories the user is allowed to access.

IDOL Server Using Autonomy connectors, Autonomy's Intelligent Data Operating Layer (IDOL) server integrates unstructured, semi-structured, and structured information from multiple repositories through an understanding of the content, delivering a real-time environment in which operations across applications and content are automated, removing all the manual processes involved in getting the right information to the right people at the right time.

IDX Apart from XML files, only files in IDX format can be indexed into IDOL Server. You can use a connector to import files into this format or manually create IDX files.
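As a rough illustration of creating an IDX file manually, the sketch below assembles one document record as text. It assumes the commonly documented IDX tags (`#DREREFERENCE`, `#DRETITLE`, `#DREFIELD`, `#DRECONTENT`, `#DREENDDOC`); check the exact layout against the IDOL Server documentation before relying on it. The class `IdxSketch` and method `toIdx` are hypothetical and are not part of ConnectorLib, and the sketch does not escape quotation marks inside field values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not a ConnectorLib class) that renders one document as IDX text. */
final class IdxSketch {
    private IdxSketch() {}

    /** Builds a single IDX record; the tag layout is an assumption to verify against IDOL documentation. */
    static String toIdx(String reference, String title, Map<String, String> fields, String content) {
        StringBuilder sb = new StringBuilder();
        sb.append("#DREREFERENCE ").append(reference).append('\n');
        sb.append("#DRETITLE\n").append(title).append('\n');
        for (Map.Entry<String, String> field : fields.entrySet()) {
            // Each additional field is written as name="value" on its own line.
            sb.append("#DREFIELD ").append(field.getKey())
              .append("=\"").append(field.getValue()).append("\"\n");
        }
        sb.append("#DRECONTENT\n").append(content).append('\n');
        sb.append("#DREENDDOC\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("author", "Brown"); // illustrative field name and value
        System.out.print(toIdx("doc-1", "Example", fields, "Body text"));
    }
}
```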

importing After a document has been downloaded from the repository in which it is stored, it is imported to an IDX or XML file format. This process is called “importing.”

Index fields Store fields containing text which you want to query frequently as index fields. Index fields are processed linguistically when they are stored in IDOL Server. This means stemming and stop lists are applied to text in index fields before they are stored, which allows IDOL Server to process queries for these fields more quickly. Typically, the fields DRETITLE and DRECONTENT are set up as index fields.

indexing After documents have been imported to IDX file format, their content (or links to the original documents) is stored in an IDOL Server. This process is called “indexing.”

Q

query You can submit a natural language query to IDOL Server which analyzes the concept of the query and returns documents that are conceptually similar to the query. You can also submit other query and search types to IDOL Server, such as Boolean, bracketed Boolean, and keyword searches.
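For illustration, a query is typically sent to IDOL Server as an ACI action over HTTP. The sketch below only builds such a request URL and does not send it. It assumes the `Query` action with its `Text` and `MaxResults` parameters; the host name, the port, and the class `QueryUrlSketch` are hypothetical placeholders rather than values from this guide.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not a ConnectorLib class) that builds an ACI query URL as a string. */
final class QueryUrlSketch {
    private QueryUrlSketch() {}

    /** Returns a URL for a Query action; host and port are assumptions supplied by the caller. */
    static String queryUrl(String host, int port, String text, int maxResults) {
        // URL-encode the query text so that spaces and reserved characters are safe to send.
        String encoded = URLEncoder.encode(text, StandardCharsets.UTF_8);
        return "http://" + host + ":" + port
                + "/action=Query&Text=" + encoded
                + "&MaxResults=" + maxResults;
    }

    public static void main(String[] args) {
        System.out.println(queryUrl("localhost", 9000, "wind power", 5));
    }
}
```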

S

Search Unlike ordinary searches that look for keywords, the Autonomy Search allows you to enter a natural language query. The concept of the query is analyzed and documents relevant to this concept are returned to you.
