IDOL ConnectorLib Java SDK™ Programming Guide
Version 10.0
Document Revision 0
10 May 2012

Copyright Notice
Notice
This documentation is a proprietary product of Autonomy and is protected by copyright laws and international treaty. Information in this documentation is subject to change without notice and does not represent a commitment on the part of Autonomy. While reasonable efforts have been made to ensure the accuracy of the information contained herein, Autonomy assumes no liability for errors or omissions. No liability is assumed for direct, incidental, or consequential damages resulting from the use of the information contained in this documentation. The copyrighted software that accompanies this documentation is licensed to the End User for use only in strict accordance with the End User License Agreement, which the Licensee should read carefully before commencing use of the software. No part of this publication may be reproduced, transmitted, stored in a retrieval system, nor translated into any human or computer language, in any form or by any means, electronic, mechanical, magnetic, optical, chemical, manual or otherwise, without the prior written permission of the copyright owner. This documentation may use fictitious names for purposes of demonstration; references to actual persons, companies, or organizations are strictly coincidental.

Trademarks and Copyrights
Copyright 2012 Autonomy Corporation plc and all its affiliates. All rights reserved. ACI API, Alfresco Connector, Arcpliance, Autonomy Process Automation, Autonomy Fetch for Siebel eBusiness Applications, Autonomy, Business Objects Connector, Cognos Connector, Confluence Connector, ControlPoint, DAH, Digital Safe Connector, DIH, DiSH, DLH, Documentum Connector, DOH, EAS Connector, Ektron Connector, Enterprise AWE, eRoom Connector, Exchange Connector, FatWire Connector, File System Connector for Netware, File System Connector, FileNet Connector, FileNet P8 Connector, FTP Fetch, HTTP Connector, Hummingbird DM Connector, IAS, IBM Content Manager Connector, IBM Seedlist Connector, IBM Workplace Fetch, IDOL Server, IDOL, IDOLme, iManage Fetch, IMAP Connector, Import Module, iPlanet Connector, KeyView, KVS Connector, Legato Connector, LiquidOffice, LiquidPDF, LiveLink Web Content Management Connector, MCMS Connector, MediClaim, Meridio Connector, Meridio, Moreover Fetch, NNTP Connector, Notes Connector, Objective Connector, OCS Connector, ODBC Connector, Omni Fetch SDK, Open Text Connector, Oracle Connector, PCDocs Fetch, PLC Connector, POP3 Fetch, Portal-in-a-Box, RecoFlex, Retina, SAP Fetch, Schlumberger Fetch, SharePoint 2003 Connector, SharePoint 2007 Connector, SharePoint 2010 Connector, SharePoint Fetch, SpeechPlugin, Stellent Fetch, TeleForm, Tri-CR, Ultraseek, Verity Profiler, Verity, VersiForm, WebDAV Connector, WorkSite Connector, and all related titles and logos are trademarks of Autonomy Corporation plc and its affiliates, which may be registered in certain jurisdictions. Microsoft is a registered trademark, and MS-DOS, Windows, Windows 95, Windows NT, SharePoint, and other Microsoft products referenced herein are trademarks of Microsoft Corporation. UNIX is a registered trademark of The Open Group. AvantGo is a trademark of AvantGo, Inc. Epicentric Foundation Server is a trademark of Epicentric, Inc. 
Documentum and eRoom are trademarks of Documentum, a division of EMC Corp. FileNet is a trademark of FileNet Corporation. Lotus Notes is a trademark of Lotus Development Corporation. mySAP Enterprise Portal is a trademark of SAP AG. Oracle is a trademark of Oracle Corporation. Adobe is a trademark of Adobe Systems Incorporated. Novell is a trademark of Novell, Inc. Stellent is a trademark of Stellent, Inc. All other trademarks are the property of their respective owners.

Notice to Government End Users
If this product is acquired under the terms of a DoD contract: Use, duplication, or disclosure by the Government is subject to restrictions as set forth in subparagraph (c)(1)(ii) of 252.227-7013. Civilian agency contract: Use, reproduction or disclosure is subject to 52.227-19 (a) through (d) and restrictions set forth in the accompanying end user agreement. Unpublished-rights reserved under the copyright laws of the United States. Autonomy, Inc., One Market Plaza, Spear Tower, Suite 1900, San Francisco, CA. 94105, US.
Contents
About This Document
  Documentation Updates
  Related Documentation
  Conventions
  Notational Conventions
  Command-line Syntax Conventions
  Notices
  Autonomy Product References
  Autonomy Customer Support
  Contact Autonomy
Part 1 Getting Started
Chapter 1 Introduction
  Overview
  About Connector Framework Server
  System Architecture
  Import Process
Chapter 2 Install ConnectorLib Java SDK
  System Requirements
  Install ConnectorLib Java SDK on Windows
  Directory Structure - Windows
  Connector Framework Server Directory Structure
  ConnectorLib Java SDK Directory Structure
• • • ConnectorLib Java SDK Programming Guide • 3 • • Contents
Chapter 3 Configure the Connector
  Modify Parameters
  Enter Boolean Values
  Enter String Values
  Encrypt Passwords
  Set Up Log Streams
Chapter 4 Implement a Connector using the ConnectorLib Java SDK
  Overview
  Create a New Connector based on ConnectorLibJava
  Run the Connector
  Implement the Synchronize Action
  Configuration and Logging
  Debug the Connector
  Implement Other Actions
  Documents
  DocInfo Class
  Identifiers
  Example Identifier
  Sub File Indices
  Append Sub File Indices with Lua
  Datastore
  Configure the Datastore Tables
  Insert Records
  Update Records
  Remove Records
  Commit Changes
  Select Records
  SelectOne Method
  Select Method
  Upgrade a Datastore
  Index a Column
  Ingester Class
  Ingest Result Handler
  Additional Information
Chapter 5 Start and Stop the Connector
  Start the Connector
  Stop the Connector
Chapter 6 Configure Connector Framework Server
  Connector Framework Server Configuration File
  Modify Parameters
  Enter Boolean Values
  Enter String Values
  Configure Connector Framework Server
  Example Configuration File
Chapter 7 Use Lua Scripts
  Use Lua Scripts within the CFS
  Configure a Lua Script
  Write a Lua Script
  Method Reference
  General Methods
  abs_path
  convert_date_time
  convert_encoding
  copy_file
  create_path
  create_uuid
  delete_file
  encrypt
  encrypt_security_field
  file_setdates
  getcwd
  get_config
  gobble_whitespace
  hash_file
  hash_string
  is_dir
  log
  move_file
  parse_csv
  parse_xml
  regex_match
  regex_search
  send_aci_action
  send_aci_command
  sleep
  string_uint_less
  unzip_file
  xml_encode
  zip_file
  Document Methods
  addField
  appendContent
  copyField
  copyFieldNoOverwrite
  countField
  deleteField
  findField
  getContent
  getField
  getFields
  getFieldNames
  getFieldValue
  getFieldValues
  getNextSection
  getReference
  hasField
  insertXML
  renameField
  setContent
  setFieldValue
  setReference
  writeStubIdx
  Field Methods
  addField
  copyField
  copyFieldNoOverwrite
  countField
  deleteAttribute
  deleteField
  getAttributeValue
  getField
  getFieldNames
  getFields
  getFieldValues
  hasAttribute
  hasField
  insertXML
  name
  renameField
  setAttributeValue
  setValue
  value
  XMLDocument Methods
  root
  XPathExecute
  XPathRegisterNs
  XPathValue
  XPathValues
  XmlNodeSet Methods
  at
  size
  XmlNode Methods
  attr
  content
  firstChild
  lastChild
  name
  next
  nodePath
  parent
  prev
  type
  XmlAttr Methods
  name
  next_attribute
  previous_attribute
  type
  value
  RegexMatch Methods
  length
  next
  position
  size
  str
  Config Methods
  getEncryptedValue
  getValue
  getValues
  Change the Value of a Field
  Example Script
  Use Lua Scripts Within the Connector
  Introduction
  Example Lua Script
Part 2 Parameter and Command Reference
Chapter 8 Parameters Common to CFS Connectors
  ACI Server Configuration
  FilePath
  LibraryName
  LuaScript
  MaximumThreads
  MaxQueueSize
  MaxScheduledSize
  OnError
  OnErrorReport
  OnFinish
  OnStart
  Url
  Import Service
  KeyviewDirectory
  Distributed Connector
  ConnectorGroup
  ConnectorPriority
  DataPortN
  HostN
  PortN
  RegisterConnector
  SharedPath
  SSLConfigN
  View Server
  EnableViewServer
  Host
  Port
  SharedPath
  General Connector Parameters
  CleanOnStart
  DatastoreFile
  DatastoreDirectory
  EnableExtraction
  EnableExtractionCopy
  EnableScheduledTasks
  EncryptACLEntries
  HashedDestinationDirectory
  HashedTempDirectory
  InsertActions
  InsertFailedDirectory
  MinFreeSpaceMB
  SynchronizeKeepDatastore
  SynchronizeThreads
  TaskMaxAdds
  TaskMaxDuration
  TaskThreads
  TempDirectory
  XsltDLL
  Fetch Task Configuration
  IngestConfigSection
  N
  Number
  ScheduleCycles
  ScheduleRepeatSecs
  ScheduleStartTime
  Ingestion
  EnableIngestion
  IndexDatabase
  IngestActions
  IngestAddAsUpdate
  IngestBatchSize
  IngestCheckFinished
  IngestConnectorConfigSection
  IngestDataPort
  IngestDelayMS
  IngestEnableAdds
  IngestEnableDeletes
  IngestEnableUpdates
  IngestHashedSharedPath
  IngesterType
  IngestHost
  IngestKeepFiles
  IngestPort
  IngestSendByType
  IngestSharedPath
  IngestSSLConfig
  IngestWriteIDX
  GroupServer
  GroupServerHost
  GroupServerPort
  GroupServerRepository
  GroupServerSSLConfig
Chapter 9 Parameters Common to CFS Connectors Using Java
  JavaClassPath
  JavaConnectorClass
  JavaLibraryPath
  JavaMaxMemoryMB
  JVMLibraryPath
  JavaVerboseGC
Chapter 10 CFS Connector Actions
  Synchronous Versus Asynchronous Actions
  QueueInfo Action
  Synchronize Fetch Action
  Synchronize Groups Fetch Action
  Collect Fetch Action
  Identifiers Fetch Action
  Insert Fetch Action
  Delete/Remove Fetch Action
  Hold and ReleaseHold Fetch Actions
  Update Action
  View Action
  StopFetch Action
Chapter 11 Connector Framework Server Parameters
  Service Parameters
  Server Parameters
  AdminClients
  Port
  QueryClients
  Actions Parameters
  MaxQueueSize
  MaximumThreads
  Import Tasks and their Parameters
  Import Tasks
  Lua
  IDXWriter
  TextToDocs
  Sectioner
  ImportFile
  HtmlExtraction
  PreN
  PostN
  UpdateN
  DeleteN
  HashN
  IdxWriter Import Task Parameters
  IdxWriterFileName
  IdxWriterArchiveDirectory
  IdxWriterMaxSizeKBs
  TextToDocs Import Task Parameters
FilenameMatchesRegex
  DREHost
  IndexBatchSize
  IndexOverSocket
  IndexTimeInterval
  KillDuplicates
Chapter 12 License Configuration Parameters
  Full
  Holder
  Key
  LicenseServerACIPort
  LicenseServerHost
  LicenseServerTimeout
  LicenseServerRetries
  Operation
Chapter 13 Logging Configuration Parameters
  LogArchiveDirectory
  LogCompressionMode
  LogDirectory
  LogEcho
  LogExpireAction
  LogFile
  LogHistorySize
  LogLevel
  LogLevelMatch
  LogMaxLineLength
  LogMaxOldFiles
  LogMaxSizeKBs
  LogOldAction
  LogOutputLogLevel
  LogSysLog
  LogTime
  LogTypeCSVs
Chapter 14 Secure Socket Layer Parameters
  SSLConfig
  SSLCACertificate
  SSLCACertificatesPath
  SSLCertificate
  SSLCheckCertificate
  SSLCheckCommonName
  SSLMethod
  SSLPrivateKey
  SSLPrivateKeyPassword
Chapter 15 Service Configuration Parameters
  ServiceACIMode
  ServiceControlClients
  ServiceHost
  ServicePort
  ServiceStatusClients
Chapter 16 Service Actions
  Action Syntax
  GetConfig
  GetLogStream
  GetLogStreamNames
  GetStatistics
  GetStatus
  GetStatusInfo
  Stop
  Service Action Parameters
Appendixes
  KeyView Classes
  KeyView Formats

Glossary

Index
About This Document
This guide is for readers who need to use the ConnectorLib Java SDK. It is intended for readers who have installed IDOL and are familiar with concepts related to administering a multi-part distributed application.
Documentation Updates
Related Documentation
Conventions
Autonomy Product References
Autonomy Customer Support
Contact Autonomy
Documentation Updates
The information in this document is current as of ConnectorLib Java SDK version 10.0. The content was last modified 10 May 2012. You can retrieve the most current product documentation from Autonomy’s Knowledge Base on the Customer Support Site. A document in the Knowledge Base displays a version number in its name, such as IDOL Server 7.5 Administration Guide. The version number applies to the product that the document describes. The document may also have a revision number in its name, such as IDOL Server 7.5 Administration Guide Revision 6. The revision number applies to the document and indicates that there were revisions to the document since its original release. It is recommended that you periodically check the Knowledge Base for revisions to documents for the products your enterprise is using.
To access Autonomy documentation
1. Go to the Autonomy Customer Support site at https://customers.autonomy.com
2. Click Login.
3. Enter the login credentials that were given to you, and then click Submit.
   The Knowledge Base Search page opens.
4. In the Search box, type a search term or phrase. To browse the Knowledge Base using a navigation tree only, leave the Search box empty.
5. Ensure the Documentation check box is selected.
6. Click Search.
   Documents that match the query display in a results list.
7. To refine the results list, select one or more of the categories in the Filter By pane. You can restrict results by:
   Product Group. Filters the list by product suite or division. For example, you could retrieve documents related to the iManage, IDOL, Virage or KeyView product suites.
   Product. Filters the list by product. For example, you could retrieve documents related to IDOL Server, Virage Videologger, or KeyView Filter.
   Component. Filters the list by a product’s components. For example, you could retrieve documents related to the Content or Category component in IDOL.
   Version. Filters the list by product or component version number.
   Type. Filters the list by document format. For example, you could retrieve documents in PDF or HTML format. Guides are typically provided in both PDF and HTML format.
8. To open a document, click its title in the results list.
   To download a PDF version of a guide, open the PDF version, click the Save icon in the PDF reader, and save the PDF to another location.
Related Documentation
The following documents provide more details on ConnectorLib Java SDK:

IDOL Administration User Guide
   IDOL Administration provides a distributed, Web-based infrastructure for managing IDOL components and services. The IDOL Administration manual describes how to administer IDOL through the IDOL Administration Dashboard and Dashboard console.

IDOL Server Administration Guide
   IDOL server lies at the center of an Autonomy infrastructure, storing and processing the data that connectors index into it. The IDOL Server Administration Guide describes the operations that IDOL server can perform with detailed descriptions of how to set them up.

Distributed Index Handler (DIH) Administration Guide
   This guide contains details on how you can use a DIH to distribute aggregated documents across multiple IDOL servers.

Intellectual Asset Protection System (IAS) Administration Guide
   This guide contains details on how you can use Autonomy’s Intellectual Asset Protection System (IAS) to ensure secure access through authentication and role permissions.

License Server Administration Guide
   This guide contains details on how you can use a License Server to license multiple Autonomy services.
Conventions
The following conventions are used in this document.
Notational Conventions
This document uses the following conventions.

Convention          Usage
Bold                User-interface elements such as a menu item or button. For example: Click Cancel to halt the operation.
Italics             Document titles and new terms. For example: For more information, see the IDOL Server Administration Guide. An action command is a request, such as a query or indexing instruction, sent to IDOL Server.
monospace font      File names, paths, and code. For example: The FileSystemConnector.cfg file is installed in C:\Program Files\FileSystemConnector\.
monospace bold      Data typed by the user. For example: Type run at the command prompt. In the User Name field, type Admin.
monospace italics   Replaceable strings in file paths and code. For example: user UserName
Command-line Syntax Conventions
This document uses the following command-line syntax conventions.

Convention          Usage
[ optional ]        Brackets describe optional syntax. For example: [ -create ]
|                   Bars indicate “either | or” choices. For example: [ option1 ] | [ option2 ] In this example, you must choose between option1 and option2.
{ required }        Braces describe required syntax in which you have a choice and at least one choice is required. For example: { [ option1 ] [ option2 ] } In this example, you must choose option1, option2, or both options.
required            Absence of braces or brackets indicates required syntax in which there is no choice; you must type the required syntax element.
variable            Italics specify items to be replaced by actual values. For example:
...                 Ellipses indicate repetition of the same pattern. For example: -merge filename1, filename2 [, filename3 ... ] where the ellipsis stands for filename4, and so on.

The use of punctuation (such as single and double quotes, commas, and periods) indicates actual syntax; it is not part of the syntax definition.
Notices
This document uses the following notices:
CAUTION A caution indicates an action can result in the loss of data.
IMPORTANT An important note provides information that is essential to completing a task.
NOTE A note provides information that emphasizes or supplements important points of the main text. A note supplies information that may apply only in special cases—for example, memory limitations, equipment configurations, or details that apply to specific versions of the software.
TIP A tip provides additional information that makes a task easier or more productive.
Autonomy Product References
This document references the following Autonomy products:
Connector Framework Server
IDOL
IDOL Server
Autonomy Distributed Action Handler (DAH)
Autonomy Distributed Index Handler (DIH)
Autonomy License Server
Autonomy Intellectual Asset Protection System (IAS)
Autonomy KeyView
Autonomy Customer Support
Autonomy Customer Support provides prompt and accurate support to help you quickly and effectively resolve any issue you may encounter while using Autonomy products. Support services include access to the Customer Support Site (CSS) for online answers, expertise-based service by Autonomy support engineers, and software maintenance to ensure you have the most up-to-date technology.

To access the Customer Support Site, go to https://customers.autonomy.com

The Customer Support Site includes:

Knowledge Base: The CSS contains an extensive library of end user documentation, FAQs, and technical articles that is easy to navigate and search.
Case Center: The Case Center is a central location to create, monitor, and manage all your cases that are open with technical support.
Download Center: Products and product updates can be downloaded and requested from the Download Center.
Resource Center: Other helpful resources appropriate for your product.

To contact Autonomy Customer Support by e-mail or phone, go to http://www.autonomy.com/content/Services/Support/index.en.html
Contact Autonomy
For general information about Autonomy, contact one of the following locations:
Europe and Worldwide
E-mail: [email protected]
Telephone: +44 (0) 1223 448 000
Fax: +44 (0) 1223 448 001
Autonomy Corporation plc
Cambridge Business Park
Cowley Rd
Cambridge CB4 0WZ
United Kingdom

North and South America
E-mail: [email protected]
Telephone: 1 415 243 9955
Fax: 1 415 243 9984
Autonomy, Inc.
One Market Plaza
Spear Tower, Suite 1900
San Francisco CA 94105
USA
PART 1 Getting Started
This section provides an overview of the ConnectorLib Java SDK, installation procedures, and configuration information for both the connector that you develop and the Connector Framework server.
Introduction
Install ConnectorLib Java SDK
Configure the Connector
Implement a Connector using the ConnectorLib Java SDK
Start and Stop the Connector
Configure Connector Framework Server
Use Lua Scripts
CHAPTER 1 Introduction
This section provides an overview of the ConnectorLib Java SDK.
Overview
About Connector Framework Server
System Architecture
Import Process
Overview
The ConnectorLib Java SDK allows you to develop connectors that automatically aggregate documents from any type of local or remote repository and send them to the Connector Framework server (CFS), which then processes the information and indexes it into an Autonomy IDOL server.
Once IDOL server receives the documents, it automatically processes them, performing a number of intelligent operations in real time, such as:
Agents
Alerting
Categorization
Channels
Clustering
Collaboration
Dynamic Thesaurus
Expertise
Hyperlinking
Mailing
Profiling
Retrieval
Spelling Correction
Summarization
Taxonomy Generation
Refer to your IDOL server’s manual for further details.
About Connector Framework Server
The Connector Framework server (CFS) receives information from various connectors, which it then processes and indexes into an IDOL server. A single CFS can be configured to work with multiple connectors and send documents to multiple IDOL servers or Distributed Index Handlers (DIH). In addition, the server can execute predefined tasks on documents just before they are imported, after they are imported, or if errors occur. CFS filters text from a variety of document types with KeyView filters, which are document-specific readers used for text extraction. Users generally do not access KeyView directly; however, the parameter ImportFamilyRootExcludeFmtCSV requires that you identify the desired KeyView document formats.
Related Topics
“ImportFamilyRootExcludeFmtCSV” on page 203.
“KeyView Format Codes” on page 265.
System Architecture
There are several ways to install the Connector Framework server. The simplest installation consists of a single CFS, single connector, and single IDOL server.
It is also possible to have more complex configurations, consisting of more than one connector, a Distributed Index Handler (DIH), multiple IDOL servers, or some combination of these options.
Import Process
The import process consists of the following basic steps:
1. The connector sends documents from the data repository to the CFS.
2. Pre-import tasks are performed, which are typically defined in Lua scripts.
3. KeyView filters the document content.
4. Post-import tasks are performed, as defined in the PostN parameters.
5. Optionally, a backup IDX or XML file is created.
6. The data is indexed into IDOL server, or sent to a DIH.
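The ordering of these steps can be illustrated with a minimal Java sketch. This is not CFS code and does not use any ConnectorLib class: the class name ImportProcessSketch, the method run, and the two boolean switches are hypothetical, introduced only to show that the backup step is optional and that the final step targets either IDOL server or a DIH.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of the import step order; not part of the SDK or of CFS. */
final class ImportProcessSketch {

    /**
     * Returns the names of the steps that run, in order.
     * writeBackup and useDih are assumed switches standing in for real configuration.
     */
    static List<String> run(boolean writeBackup, boolean useDih) {
        List<String> trace = new ArrayList<>();
        trace.add("connector sends document to CFS");
        trace.add("pre-import tasks");
        trace.add("KeyView filtering");
        trace.add("post-import tasks");
        if (writeBackup) {
            // Step 5 is optional, so it only appears when requested.
            trace.add("write backup IDX or XML file");
        }
        // Step 6 has two possible destinations.
        trace.add(useDih ? "send to DIH" : "index into IDOL server");
        return trace;
    }

    public static void main(String[] args) {
        // Print the steps for a run that writes a backup and indexes directly into IDOL server.
        run(true, false).forEach(System.out::println);
    }
}
```

In a real installation the order is fixed by CFS itself; the tasks that run before and after filtering are set in the CFS configuration file.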
CHAPTER 2 Install ConnectorLib Java SDK
This section provides information required to install the ConnectorLib Java SDK.
System Requirements
Install ConnectorLib Java SDK on Windows
System Requirements
ConnectorLib Java SDK should be installed by a system administrator as part of a larger Autonomy system that includes an Autonomy IDOL server and an interface for the information stored in the IDOL server.
Supported Platforms
The ConnectorLib Java SDK runs on 32-bit Windows platforms. Solaris and Linux versions can be made available.
NOTE The documented platforms are the recommended and most fully tested platforms for IDOL. Autonomy can provide support for other platforms on request.
Minimum Server Requirements
The minimum requirements for Windows are:

2 GHz Pentium 4 processor

2 GB RAM

20 GB hard disk recommended
Install ConnectorLib Java SDK on Windows
To install the ConnectorLib Java SDK, use the following procedure. After the SDK has been installed, you can create your own connector implementation, and then configure ConnectorLib Java to use it. For more information, see “Implement a Connector using the ConnectorLib Java SDK” on page 43.
To install a standalone version of ConnectorLib Java SDK on Windows
1. Double-click ConnectorLibJava_VersionNumber_WINDOWS.exe
   The installation wizard opens and the Introduction page is displayed.
2. Read the text, and click Next.
   The License Agreement page opens.
3. Read the license agreement and if you agree to its terms, click I accept the terms of the License Agreement and click Next.
   The Choose Install Folder page opens.
4. Choose an installation folder for ConnectorLib Java SDK. By default, ConnectorLib Java SDK is installed in C:\Autonomy\ConnectorLibJava, but you can click Choose to choose another location. After you choose an installation folder, click Next.
   The Service Name page opens.
5. In the Service Name box, type a name to use for the ConnectorLib Java Windows service, and click Next.
   The License Server Configuration page opens.
6. Type the IP address or hostname, and the ACI Port of the license server and click Next.
   The Java Configuration page opens.
7. Choose whether you want to use the bundled Java VM. Select or clear the check box, and click Next.
   The DRE Database page opens.
8. In the DRE Database box, type the name of the DRE database that ConnectorLib Java should index into, and click Next.
   The Connector Framework Server page is displayed.
9. Choose whether you want to install a new CFS or use an existing CFS.
   To install a new CFS, click Install New CFS and click Next. The Choose CFS Install Folder page is displayed. Go to Step 10.
   To use an existing CFS, click Use Existing CFS and click Next. The Connector Framework Server page is displayed. Type the Hostname and Port of your existing CFS installation. Click Next and go to Step 15.
10. Enter the path where you want the Connector Framework Server to be installed and click Next.
    The Enter Connector Framework Installation Name page opens.
11. Type a unique name for the Connector Framework installation and click Next. This name is used for the Connector Framework installation directory and various files. The name must not contain any spaces.
    The Connector Framework Service Settings page opens.
12. Enter the following details, and click Next.
Service Port              The port number that the Connector Framework Server uses to communicate with the license server. This port must not be used by any other service.

Service Status Clients    The IP addresses of computers that are permitted to access the Connector Framework service status, but are not permitted to control the status. If you want to permit a number of machines to access the Connector Framework service status, use a wildcard. For example, enter 187.*.*.* to permit any machine with an IP address that begins with 187 to access the Connector Framework service status.

Service Control Clients   The IP addresses of computers that are permitted to control the Connector Framework service. If you want to permit a number of machines to control the Connector Framework service, use a wildcard. For example, enter 187.*.*.* to permit any machine with an IP address that begins with 187 to control the Connector Framework service.
The DRE Settings page opens.
13. Enter the following details and click Next.
IDOL Server Hostname The IP address of the IDOL server to which you want to add documents.
ACI Port The port number the connector uses to query IDOL server.
The Connector Framework Server ACI Port page opens.
14. In the ACI Port box, type the port that you want the Connector Framework to listen on, and click Next. The Pre-Installation Summary page opens.
15. Review the installation settings. If necessary, click Previous to change any settings. If you are satisfied with the settings, click Install. The Installing ConnectorLib Java SDK page opens. The progress of the installation process is indicated. The Start Service page opens.
16. Choose whether to start the ConnectorLib Java service, and click Next.
17. Choose whether to start the Connector Framework service, and click Next.
18. The installation is complete. Click Done.
You can now edit the ConnectorLib Java SDK and Connector Framework Server configuration files. You can also start the ConnectorLib Java and Connector Framework services if you did not start them from the installation wizard.
Directory Structure - Windows
After the installation of ConnectorLib Java SDK is complete, your installation directory contains the following files and subdirectories.

Connector Framework Server Directory Structure
The Connector Framework server (CFS) is installed to the ConnectorFramework directory, which is at the same level as the ConnectorLib Java SDK installation directory. The ConnectorFramework installation directory contains the following files and subdirectories (note that bold font indicates directories).
File Description
convtables Contains various text files used during the importing process.
filters Contains executables used during the importing process.
jre Contains Java Runtime Environment for the uninstaller.
scripts Contains Lua scripts.
Uninstall_ConnectorFramework Files required for uninstalling Connector Framework server.
ConnectorFramework.cfg Connector Framework server configuration file.
ConnectorFramework.exe Connector Framework server executable.
ConnectorFramework_InstallLog.log Installation log file that lists the details of the installation process.
lua.dll Lua library.
version.txt Text file containing version information.
When you start the Connector Framework server for the first time, the following files are created:
File Description
logs Contains CFS log files. By default, this includes action.log, import.log, and indexer.log.
actions Contains queued asynchronous actions so that, if the server should go down, the actions will not be lost. When the server comes back up, the queued actions will be processed.
uid Contains document tracking files.
autn_ntres.dll NT resource library.
ConnectorFramework.lck Lock file which prevents multiple instances of CFS running simultaneously.
license.log License log file.
licensekey.txt License information text document.
portinfo.dat File that lists the ports that the connector is using.
service.log Service actions log file.
ConnectorLib Java SDK Directory Structure
By default, the ConnectorLib Java SDK is installed to the C:\Autonomy\ConnectorLibJava directory. The ConnectorLib Java SDK installation directory contains the following files and subdirectories (note that bold font indicates directories).
File Description
example The source code for the sample implementation.
jre Contains the Java Runtime Environment.
lib Various JAR files.
logs Contains various configured log files.
Uninstall_ConnectorName Files required to uninstall ConnectorLib Java SDK.
autn_ntres.dll NT resource library.
autpassword.exe Autonomy Password Encryption Utility, which allows you to encrypt the passwords.
ConnectorLibJava.cfg ConnectorLib Java configuration file.
ConnectorLibJava.dll This file is used when running from the command line through Java.
ConnectorLibJava.exe ConnectorLib Java SDK executable.
ConnectorLibJava_InstallLog.log Installation log file that lists the details of the installation process.
license.html License information in HTML format.
license.txt License information in text format.
lua5.1.dll Lua library
ReleaseNotes.pdf Release notes for the ConnectorLib Java SDK.
service.log Service actions log file.
vcredist.exe Visual Studio redistributable file.
When you start ConnectorLib Java SDK for the first time, the following files are created:
File Description
license Contains license information.
uid Contains document tracking files.
ConnectorLibJava.lck Lock file which prevents multiple instances of ConnectorLibJava from running concurrently.
license.log License log file.
licensekey.txt Text file that lists license information.
portinfo.dat File that lists the port information.
CHAPTER 3 Configure the Connector
The configuration settings are stored in the connector configuration file that you construct. Use the information in this chapter to modify parameters directly in the configuration file with a text editor, to set up log streams, and to encrypt any passwords that you enter into configuration fields.
Modify Parameters
Encrypt Passwords
Set Up Log Streams
Modify Parameters
The following section describes how to enter parameter values in the configuration file.
Enter Boolean Values
The following settings for Boolean parameters are interchangeable:
TRUE = true = ON = on = Y = y = 1
FALSE = false = OFF = off = N = n = 0
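The equivalence above can be sketched in Java as a small lookup. This is only an illustration of the rule, not code from the SDK: the class name ConfigBoolean is hypothetical, and treating every other capitalization (for example "True") as valid is an assumption that goes beyond the values listed above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of the ConnectorLib Java SDK.
final class ConfigBoolean {
    private static final Set<String> TRUE_VALUES =
            new HashSet<>(Arrays.asList("true", "on", "y", "1"));
    private static final Set<String> FALSE_VALUES =
            new HashSet<>(Arrays.asList("false", "off", "n", "0"));

    private ConfigBoolean() { }

    /** Maps a configuration value to a boolean, or throws if it is not a listed setting. */
    static boolean parse(String raw) {
        // Assumption: matching is case-insensitive and ignores surrounding spaces.
        String value = raw.trim().toLowerCase(Locale.ROOT);
        if (TRUE_VALUES.contains(value)) {
            return true;
        }
        if (FALSE_VALUES.contains(value)) {
            return false;
        }
        throw new IllegalArgumentException("Not a Boolean setting: " + raw);
    }

    public static void main(String[] args) {
        System.out.println(parse("ON"));
        System.out.println(parse("n"));
    }
}
```

Rejecting unknown values, rather than silently treating them as false, makes a mistyped setting easier to notice.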
Enter String Values
Some parameters require string values that contain quotation marks. Escape each quotation mark by inserting a backslash before it. For example:
FIELDSTART0=""
Here, the beginning and end of the string are indicated by quotation marks, while all quotation marks that are contained in the string are escaped.
If you want to enter a comma-separated list of strings for a parameter, and one of the strings contains a comma, you must indicate the start and the end of this string with quotation marks. For example:
ParameterName=cat,dog,bird,"wing,beak",turtle
If any string in a comma-separated list contains quotation marks, you must put this string into quotation marks and escape each quotation mark in the string by inserting a backslash before it. For example:
ParameterName="",dog,bird,"wing,beak",turtle
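The quoting rules above can be sketched as a helper that builds the value of a comma-separated parameter from plain strings. The class and method names are hypothetical and are not part of the SDK; the sketch only applies the two rules stated above (quote an item that contains a comma or a quotation mark, and put a backslash before each embedded quotation mark).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the ConnectorLib Java SDK.
final class ConfigStrings {
    private ConfigStrings() { }

    /** Quotes one list item if it contains a comma or a quotation mark. */
    static String quoteIfNeeded(String item) {
        boolean hasQuote = item.indexOf('"') >= 0;
        boolean hasComma = item.indexOf(',') >= 0;
        if (!hasQuote && !hasComma) {
            return item;
        }
        // Escape each embedded quotation mark with a backslash, then wrap the item.
        return "\"" + item.replace("\"", "\\\"") + "\"";
    }

    /** Builds the right-hand side of ParameterName=... from plain strings. */
    static String toCsvValue(List<String> items) {
        List<String> out = new ArrayList<>();
        for (String item : items) {
            out.add(quoteIfNeeded(item));
        }
        return String.join(",", out);
    }

    public static void main(String[] args) {
        System.out.println("ParameterName=" + toCsvValue(
                Arrays.asList("cat", "dog", "bird", "wing,beak", "turtle")));
    }
}
```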
Encrypt Passwords
For added security, encrypt all passwords before you enter them into a configuration field. To encrypt passwords, follow the steps relevant to your operating system.
To encrypt passwords
1. At a command prompt, change directories to InstallDir\ConnectorName.
2. Enter one of the following strings:
   autpassword -e -tEncryptionType [options] PasswordString
   autpassword -d PasswordString
   autpassword -x -tEncryptionType [options]
where,
Option Description
-e Encrypts the password.
-d Decrypts the password.
-x Performs the operation specified by the -o option. See Options.
-tEncryptionType The type of encryption used. The following options are available: Basic, AES. You may use either form of encryption. However, AES is a more secure type of encryption than basic encryption.
PasswordString The password to encrypt or decrypt.
Options Options can be one of the following:
   -oOptionName=OptionValue. OptionName can be:
      KeyFile. Specifies the path and filename of a keyfile. It should contain 64 hexadecimal characters. This option is only available with the AES encryption type and the -x option.
   -c. The configuration filename in which to write the encrypted password. This option is only available with the -e argument.
   -s. The name of the section in the configuration file in which to write the password. This option is only available with the -e argument.
   -p. The parameter name in which to write the encrypted password. This option is only available with the -e argument.
   When writing the password to a configuration file, you must specify all related options: -c, -s, and -p.
Example:
autpassword -e -tBASIC -c./Config.cfg -sDefault -pPassword passw0r
autpassword -d passw0r
autpassword -x -tAES -oKeyFile=./MyKeyFile.ky
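The KeyFile option expects a keyfile that contains 64 hexadecimal characters. The following sketch checks that a string has that shape. It is not part of autpassword or the SDK: the class name is hypothetical, and accepting both upper-case and lower-case digits and ignoring surrounding whitespace are assumptions.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of autpassword or the ConnectorLib Java SDK.
final class KeyFileCheck {
    // Exactly 64 hexadecimal characters (assumption: either letter case is acceptable).
    private static final Pattern SIXTY_FOUR_HEX = Pattern.compile("[0-9A-Fa-f]{64}");

    private KeyFileCheck() { }

    /** Returns true if the text, ignoring surrounding whitespace, is 64 hexadecimal characters. */
    static boolean looksLikeKey(String contents) {
        return SIXTY_FOUR_HEX.matcher(contents.trim()).matches();
    }

    public static void main(String[] args) {
        String sample = "0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef";
        System.out.println(looksLikeKey(sample));
    }
}
```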
Set Up Log Streams
Use the following information to set up your own log streams. Each log stream creates a separate log file in which specific log message types (for example, action, index, application, or import) are logged.
To set up log streams
1. Open the configuration file in a text editor.
2. Find the [Logging] section. (If the configuration file does not contain a [Logging] section, create one.)
3. Under the [Logging] section's heading, create a list of the log streams you want to set up using the format N=LogStreamName. For example:

[Logging]
LogLevel=FULL
LogDirectory=logs
0=ApplicationLogStream
1=ActionLogStream
2=JavaLogStream
3=SynchronizeLogStream
4=InsertLogStream
5=CollectLogStream
6=ViewLogStream
7=DeleteLogStream
8=HoldLogStream
9=UpdateLogStream

In this example, log streams are defined which report application and action messages, messages for the various fetch actions that can be implemented, and messages written to System.out and System.err. Note the log streams are listed in consecutive order, starting from 0 (zero).
4. Create a new section for each of the log streams you defined. Each section must have the same name as the log stream. For example:

[ApplicationLogStream]
[ActionLogStream]
[JavaLogStream]
[SynchronizeLogStream]
[InsertLogStream]
[CollectLogStream]
[ViewLogStream]
[DeleteLogStream]
[HoldLogStream]
[UpdateLogStream]
5. Specify the settings you want to apply to each log stream in the appropriate log stream's section. You can specify the type of logging that should be performed (for example, full logging), whether log messages should be displayed on the console, the maximum size of log files, and so on. For example:

[ApplicationLogStream]
logfile=application.log
logtypecsvs=application

[ActionLogStream]
logfile=action.log
logtypecsvs=action

[SynchronizeLogStream]
LogFile=synchronize.log
LogTypeCSVs=synchronize

[JavaLogStream]
LogFile=java.log
LogTypeCSVs=java

[InsertLogStream]
LogFile=insert.log
LogTypeCSVs=insert

[CollectLogStream]
LogFile=collect.log
LogTypeCSVs=collect

[ViewLogStream]
LogFile=view.log
LogTypeCSVs=view

[DeleteLogStream]
LogFile=delete.log
LogTypeCSVs=delete

[HoldLogStream]
LogFile=hold.log
LogTypeCSVs=hold

[UpdateLogStream]
LogFile=update.log
LogTypeCSVs=update

6. Save and close the configuration file.
7. Restart the service to apply your changes.
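Because the stream list must be numbered consecutively from 0 and every entry needs a section of the same name, it can be convenient to generate this part of the file. The following is a minimal sketch under stated assumptions: the class LogStreamConfig is hypothetical (it is not an SDK class), and it assumes each stream is named after its log type with a LogStream suffix, as in the examples above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical generator, not part of the ConnectorLib Java SDK.
final class LogStreamConfig {
    private LogStreamConfig() { }

    /** Renders a [Logging] section and one matching section per log type. */
    static String render(List<String> logTypes) {
        StringBuilder logging =
                new StringBuilder("[Logging]\nLogLevel=FULL\nLogDirectory=logs\n");
        StringBuilder sections = new StringBuilder();
        for (int i = 0; i < logTypes.size(); i++) {
            String type = logTypes.get(i);
            String stream = capitalize(type) + "LogStream";
            // Streams are numbered consecutively, starting from 0.
            logging.append(i).append('=').append(stream).append('\n');
            // Each stream gets a section with the same name as the stream.
            sections.append("\n[").append(stream).append("]\n")
                    .append("LogFile=").append(type).append(".log\n")
                    .append("LogTypeCSVs=").append(type).append('\n');
        }
        return logging.append(sections).toString();
    }

    private static String capitalize(String s) {
        return Character.toUpperCase(s.charAt(0)) + s.substring(1);
    }

    public static void main(String[] args) {
        System.out.print(render(Arrays.asList("application", "action", "java")));
    }
}
```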
Related Topics “Logging Configuration Parameters” on page 219
CHAPTER 4 Implement a Connector using the ConnectorLib Java SDK
This section describes how to use the ConnectorLib Java SDK.
Overview
Create a New Connector based on ConnectorLibJava
Run the Connector
Implement the Synchronize Action
Configuration and Logging
Debug the Connector
Implement Other Actions
Documents
Datastore
Ingester Class
Ingest Result Handler
Additional Information
Overview
ConnectorLibJava is an implementation of connectorlib for Java. It is compatible with Java 1.4 and later. You can use it to implement a connector that supports any or all of the following actions:
Synchronize
Insert
Collect
Update
View
Hold
Delete
ReleaseHold
Create a New Connector based on ConnectorLibJava
To implement a new connector, extend the class com.autonomy.connector.ConnectorBase:
package com.mycompany.connector;

import com.autonomy.connector.Config;
import com.autonomy.connector.ConnectorBase;
import com.autonomy.connector.Log;

public class MyConnector extends ConnectorBase {
    public MyConnector(Config config, Log log) {
        // The string passed to ConnectorBase is the name of the connector.
        super("My Connector");
    }
}
The string passed to the ConnectorBase constructor (in this case My Connector) is the name of the connector. In order to run the connector you must have a matching license in your license server. This is all that is required to create a new connectorLibJava based connector. To build the connector you must include JavaConnector.jar in the compile classpath.
NOTE A sample program is located in the example folder in the installation directory.
Run the Connector
To run the connector, use the following instructions. This procedure assumes that you have built the connector into a jar called MyConnector.jar.
To run the connector
1. Copy MyConnector.jar into the lib directory of your connectorLibJava installation. The connectorLibJava directory should contain at least the following:

lib\MyConnector.jar
lib\JavaConnector.jar
connectorLibJava.cfg
connectorLibJava.exe
lua5.1.dll

2. Update connectorLibJava.cfg to contain at least the following:

[License]
LicenseServerHost=server
LicenseServerACIPort=20000

[service]
ServicePort=7003
ServiceStatusClients=*
ServiceControlClients=*

[server]
Port=7002

[Logging]
LogLevel=FULL
LogEcho=TRUE
LogDirectory=logs
LogMaxSizeKBs=-1
0=ApplicationLogStream
1=ActionLogStream
2=JavaLogStream

[ApplicationLogStream]
LogFile=application.log
LogTypeCSVs=application

[ActionLogStream]
LogFile=action.log
LogTypeCSVs=action

[JavaLogStream]
LogFile=java.log
LogTypeCSVs=java

[Connector]
JavaClassPath=lib\JavaConnector.jar;lib\MyConnector.jar
JavaConnectorClass=com.mycompany.connector.MyConnector

3. Configure the [License] section to point to your license server.
4. Run the connector.

If the connector stops and writes the error "No license found for My Connector" to license.log, confirm that you have a license matching the string passed to the ConnectorBase constructor, and that your [License] section is set up correctly.

With the default configuration provided with connectorLibJava, the connector outputs the log message "Exception when starting Scheduled Tasks: All connector features are currently disabled". This is because a synchronize job is configured, but synchronize has not yet been implemented. To implement a synchronize action, see “Implement the Synchronize Action” on page 46.
Implement the Synchronize Action
The initial implementation does not support any actions. To implement an action, override the appropriate method in ConnectorBase and provide an implementation. The following example is a simple implementation of the synchronize action. You would add this code to your connector class.
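@Override
public void synchronize(SynchronizeTask task) {
    // Create a DocInfo to hold all the information about one document.
    DocInfo docInfo = new DocInfo(
        task.taskConfig, "http://www.example.com/");
    // Set a property in the document's identifier.
    docInfo.getId().setProperty(
        "IdentifierProperty", "Some Value");
    // Add a metadata field to the document.
    docInfo.getDoc().addFieldValue(
        "Some Field", "Field Value");
    // Send the document for ingestion.
    task.ingester.add(docInfo);
}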
This example also shows how to create a document and send it for ingestion. A DocInfo object is created to store all the information about a document. It then has a property set in its identifier, and a field value added. Finally it is sent to the ingester. At this point you might want to ensure the configuration file includes settings for a synchronize log, and for ingestion:
[Logging]
3=SynchronizeLogStream

[SynchronizeLogStream]
LogFile=synchronize.log
LogTypeCSVs=synchronize

[Ingestion]
IngesterType=CFS
IngestHost=localhost
IngestPort=7000

If you run a CFS on the appropriate port and issue a synchronize action to your connector, you will see that one document is sent to CFS. When sending the action, you should specify the task name in the synchronize command or in the configuration file:
Synchronize Command
http://localhost:7002/action=fetch&fetchaction=synchronize&tasksections=MyTask

Configuration File
[FetchTasks]
Number=1
0=MyTask

[MyTask]
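The synchronize command above can also be assembled in code, for example by a test harness that triggers the action. The following is a minimal sketch: the class SynchronizeUrl is hypothetical and not part of the SDK, and applying form-style URL encoding to the task name is an assumption (it makes no difference for simple names such as MyTask).

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of the ConnectorLib Java SDK.
final class SynchronizeUrl {
    private SynchronizeUrl() { }

    /** Builds the fetch action URL that runs a synchronize for one task section. */
    static String build(String host, int port, String taskSection) {
        try {
            return "http://" + host + ":" + port
                    + "/action=fetch"
                    + "&fetchaction=synchronize"
                    + "&tasksections=" + URLEncoder.encode(taskSection, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required of every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build("localhost", 7002, "MyTask"));
    }
}
```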
Configuration and Logging
The taskConfig member of ConnectorTask provides access to configuration settings:
String directory = task.taskConfig.read("Directory");
String password = task.taskConfig.readPassword("Password");
int maxSize = task.taskConfig.read("MaxSize", 14);
boolean doXYZ = task.taskConfig.read("DoXYZ", false);
The settings come from the appropriate configuration section in the configuration file (looking in default sections if necessary) or from the action sent to the connector.
Logging
Logging is performed using the log member of ConnectorTask:

task.log.writeln(Log.NORMAL, "Processing XYZ");
task.log.writeln(Log.WARNING, "Watch out!");

The Java Log Stream
An additional 'Java' log stream is also provided. Any messages written to System.out and System.err by the connector are logged to this stream.
Debug the Connector
A library version of ConnectorLibJava is supplied to simplify the process of debugging a Java connector. This allows you to start the connector in a Java debugger.
To debug the connector, start the application from the ConnectorBase main class. Pass the name of the ConnectorLibJava library as the first argument. By default it will look for a configuration file by appending .cfg to the library name. If you need to change this, use the -configfile parameter. For example:
java -cp
java -cp
Implement Other Actions
Implementation of the other actions is similar to implementation of the synchronize action (see “Implement the Synchronize Action” on page 46). To implement the other actions, override the appropriate method.
The methods you can override are: synchronize, collect, delete, insert, update, view and hold. Each takes an appropriate implementation of ConnectorTask. The synchronize method is passed a SynchronizeTask, the collect action is passed a CollectTask and so forth. The JavaDoc pages for the various task types give more information about implementing those actions. Hold and ReleaseHold are both implemented by the hold method, which takes a boolean indicating whether it should apply a hold to a document, or release an existing hold. All of the tasks provide access to:
An ingester.
datastoreFilename - This is where the connector should store any persistent state information for the task.
log - This provides a method to log messages to the appropriate log stream.
taskConfig - This provides access to the configuration settings specific to this action.
taskName - The name of the task (the configuration file section name).
tempDirectory - The TempDirectory as specified in the server configuration file. Any temporary files should be created here and deleted when no longer needed.
stop() - The stop method returns a boolean to indicate whether the action should terminate as soon as possible (the action implementation should monitor this regularly). This is true if the user has instructed the connector to stop.
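The advice to monitor stop() regularly amounts to checking the flag once per unit of work. The following sketch shows that pattern without using any SDK classes: the class StoppableLoop is hypothetical, and the BooleanSupplier stands in for the task's stop() method (in a real action you might pass something like task::stop, which is an assumption about how you would wire it up).

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical illustration, not part of the ConnectorLib Java SDK.
final class StoppableLoop {
    private StoppableLoop() { }

    /**
     * Processes items in order and returns how many were processed.
     * The stop flag is checked before each item so that the loop
     * ends promptly once a stop has been requested.
     */
    static int processAll(List<String> items, BooleanSupplier stopRequested) {
        int processed = 0;
        for (String item : items) {
            if (stopRequested.getAsBoolean()) {
                break;
            }
            // Placeholder for the real per-document work on 'item'.
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(processAll(Arrays.asList("a", "b", "c"), () -> false));
    }
}
```

Checking once per document keeps the check cheap; work that takes a long time for a single document may need additional checks inside that work.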
Documents
DocInfo Class
This class has three members:
document. The document has methods for setting and retrieving the document reference, metadata and content.
identifier. The identifier holds a reference and a collection of properties that can be used to identify the document within the repository so it can be collected back. During a synchronize action a connector should set whatever identifier properties it will need to retrieve the document if it is passed the identifier as part of another action (such as view, collect, hold or update).
file. The file represents a file on disk containing the content of the document. A file can be "owned", in which case it is deleted automatically when no longer needed.

These should be accessed through the accessors:
getDoc()
getID()
getFile() and setFile(file)

For most actions (except Synchronize) the action is passed a collection of documents. For these, the metadata and properties of the document and identifier objects can be modified, but the objects themselves should not be replaced (hence there are no setID() or setDoc() methods). The setFile(file) method is provided so that Synchronize, Collect and View actions can assign the collected file into the DocInfo object. Synchronize can alternatively do this directly by passing the filename when constructing the DocInfo.

The Synchronize action will usually create new DocInfo objects for ingestion. Most other actions should not create new DocInfo objects. These are passed a set of DocInfo objects to act upon. For example, the CollectTask contains a set of DocInfo objects, one for each document to be collected. The DocInfo objects are initially blank except for the identifier, which the connector can use to find the document in the repository. The implementation should populate these DocInfo objects with content and metadata from the repository and then report success or failure as documented in the JavaDoc.
The success and failed methods must be called using the provided documents, and not by constructing new instances of DocInfo. (Calling success or failed on an unrecognized DocInfo object will most likely be ignored).
Identifiers
When the connector creates a new DocInfo, it should give it an identifier which can be used to uniquely identify the repository document it was created for. This will be used by the view and collect actions to retrieve that document from the repository.
The Identifier contains the following information:
The name of the configuration section defining the task that retrieved the document. The same configuration information should ideally be used when performing retrieval as when performing a Synchronize operation.
The document reference - this can be omitted if the retrieval does not require the reference.
The repository specific parameters that identify the document within the repository or how to retrieve it. It would be normal to allow any non-document-specific data to also be specified in the configuration file and be overridden by the value in the Identifier if present. Any logon details or other sensitive information should not be stored in the Identifier and should only appear in the configuration file.
The Identifier should also contain the sub file information for the sub files of a document once it has been ingested.
The Identifier string is stored and constructed by the Identifier class.
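Example Identifier
The following example illustrates an unencoded Identifier (not for any real repository):
<id s="MyTask1" r="http://myserver:4567/doc/_vxswdfguhjknbio_earycqzt_"><p n="SERVICEURL" v="http://myserver:4567/service"/><p n="DOCID" v="_vxswdfguhjknbio_earycqzt_"/></id>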
The Identifier itself is constructed by Base64 encoding the entire XML to give the following for the first example above:
PGlkIHM9Ik15VGFzazEiIHI9Imh0dHA6Ly9teXNlcnZlcjo0NTY3L2RvYy9fdnhzd2 RmZ3VoamtuYmlvX2VhcnljcXp0XyI+PHAgbj0iU0VSVklDRVVSTCIgdj0iaHR0cDov L215c2VydmVyOjQ1Njcvc2VydmljZSIvPjxwIG49IkRPQ0lEIiB2PSJfdnhzd2RmZ3 VoamtuYmlvX2VhcnljcXp0XyIvPjwvaWQ+
Note that this should be URL-escaped as normal to pass to one of the actions. It is not necessary for implementations of connectorLibJava to perform this encoding or escaping. Use the getId() method of the DocInfo class to get access to the Identifier class and then set whichever properties are required. The identifier will be encoded and escaped as necessary by connectorLib. Similarly, any identifiers passed to the connector will be unescaped and decoded as necessary by connectorLib, and can be accessed using the getId() method.
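Although connectorLib performs the encoding for you, it can help when debugging to reproduce it by hand. The following sketch Base64-encodes the identifier XML and URL-escapes the result. It is illustrative only: the class IdentifierEncoding is hypothetical, SAMPLE_XML is the XML that the Base64 string in this section decodes to, and java.util.Base64 requires Java 8 or later, so this is a diagnostic aid rather than something to build into a connector.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical diagnostic helper, not part of the ConnectorLib Java SDK.
final class IdentifierEncoding {
    /** The XML obtained by Base64-decoding the example Identifier string. */
    static final String SAMPLE_XML =
            "<id s=\"MyTask1\" r=\"http://myserver:4567/doc/_vxswdfguhjknbio_earycqzt_\">"
            + "<p n=\"SERVICEURL\" v=\"http://myserver:4567/service\"/>"
            + "<p n=\"DOCID\" v=\"_vxswdfguhjknbio_earycqzt_\"/></id>";

    private IdentifierEncoding() { }

    /** Base64-encodes the entire identifier XML. */
    static String encode(String xml) {
        return Base64.getEncoder().encodeToString(xml.getBytes(StandardCharsets.UTF_8));
    }

    /** Reverses encode(), which is useful for inspecting an identifier from a log or URL. */
    static String decode(String base64) {
        return new String(Base64.getDecoder().decode(base64), StandardCharsets.UTF_8);
    }

    /** URL-escapes an encoded identifier so that it can be passed in an action. */
    static String urlEscape(String encoded) {
        try {
            return URLEncoder.encode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required of every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(urlEscape(encode(SAMPLE_XML)));
    }
}
```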
Sub File Indices
In order for Collect and View to retrieve sub files of containers, the Identifier is expanded by appending the KeyView sub file indices. This is done during ingestion when associating the Identifier with the document's children.
The following shows how the indices are appended for a container file (
A general Identifier with the indices appended will look like:
For Example (for the 7th child of the 3rd child of the 2nd child of the top level document):
PGlkIHM9Ik15VGFzazEiIHI9Imh0dHA6Ly9teXNlcnZlcjo0NTY3L2RvYy9fdnhzd2 RmZ3VoamtuYmlvX2VhcnljcXp0XyI+PHAgbj0iU0VSVklDRVVSTCIgdj0iaHR0cDov L215c2VydmVyOjQ1Njcvc2VydmljZSIvPjxwIG49IkRPQ0lEIiB2PSJfdnhzd2RmZ3 VoamtuYmlvX2VhcnljcXp0XyIvPjwvaWQ+|1.2.6
During synchronize, it is CFS that extracts sub files. The Append Sub File Indices with Lua section explains how to set up CFS to include the sub file indices in the identifier. During view or collect, the subfile extraction is performed internally so the connector should only need to retrieve the top level document, and can ignore the indices. To support extraction of main sub files of a container, an index can be replaced at any level by the letter 'M'. This means that the sub file marked as the main sub file by KeyView would be extracted. For example '...|1.M.M' would refer to the main sub file of the main sub file of the second child of the top level document.
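The suffix format described above (a | character followed by dot-separated indices, where M marks the main sub file) can be read back with a few lines of code. The connector normally has no need to do this, because the extraction is performed internally; the following sketch is only meant to make the format concrete. The class SubFileIndices and the MAIN marker value are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not part of the ConnectorLib Java SDK.
final class SubFileIndices {
    /** Marker used in this sketch for an index given as the letter M. */
    static final int MAIN = -1;

    private SubFileIndices() { }

    /** Appends comma-separated indices to an identifier, for example "1,2,6" gives id|1.2.6. */
    static String append(String identifier, String indexCsv) {
        return identifier + "|" + indexCsv.replace(',', '.');
    }

    /** Returns the identifier without any sub file suffix. */
    static String baseIdentifier(String full) {
        int bar = full.lastIndexOf('|');
        return bar < 0 ? full : full.substring(0, bar);
    }

    /** Returns the zero-based indices from the top level document downwards. */
    static List<Integer> indices(String full) {
        List<Integer> result = new ArrayList<>();
        int bar = full.lastIndexOf('|');
        if (bar < 0 || bar == full.length() - 1) {
            return result; // no suffix: this is a top level document
        }
        for (String part : full.substring(bar + 1).split("\\.")) {
            result.add(part.equals("M") ? MAIN : Integer.parseInt(part));
        }
        return result;
    }

    public static void main(String[] args) {
        String full = append("ENCODED_IDENTIFIER", "1,2,6");
        System.out.println(full + " -> " + indices(full));
    }
}
```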
Append Sub File Indices with Lua
The following Lua script will append the sub file indices written to the document's SubFileIndexCSV field during import to the AUTN_IDENTIFIER field:

function handler( document )
    identifier = document:getFieldValue( "AUTN_IDENTIFIER" )
    if identifier then
        indices = document:getFieldValue( "SubFileIndexCSV" )
        if indices then
            indices = string.gsub(indices, ",", ".")
            value = identifier .. "|" .. indices
            document:setFieldValue( "AUTN_IDENTIFIER", value)
        end
    end
    return true
end

This Lua script could be configured as a CFS post task. For it to work properly, you must ensure that sub files inherit the AUTN_IDENTIFIER field. You can do this by including it in the fields listed in the ImportInheritFieldsCSV parameter.
Datastore
ConnectorLibJava provides access to the datastore library, which can be used to store and retrieve any state required by the connector. Often the synchronize action will make use of this so that it can determine what documents have been added, updated or deleted since it was last used.
A datastore is a single file on disk, normally with the extension .db. You might also see a journal file when the datastore is in use. Information in a datastore is stored in tables. A datastore can contain a number of tables, each of which has a set of columns defined by the connector.
Configure the Datastore Tables
To access a datastore you must first create a Datastore object:
Datastore datastore = new Datastore( task.datastoreFilename, task.log);
The constructor takes a filename and a log stream. Both of these are provided in the ConnectorTask object. When you use the filename provided in the ConnectorTask, the connector framework ensures that the datastore for each task has a unique name, and that configuration parameters relating to the datastore (such as SynchronizeKeepDatastore) are respected. The Datastore object can be used to create tables:
datastore.createTable("MyTable", new String[] { "Url", "ModifiedDate", "Seen" }, new String[] { "Url" });
This example specifies a table called MyTable with the columns Url, ModifiedDate and Seen. The third argument specifies the primary key for the table, which in this case is the Url column. You should call createTable for each table every time you want to use the datastore. If a table does not exist, it is created. If a table does exist, then the datastore library verifies that the table has the expected format.
Insert Records
To insert a record, populate a new DatastoreRecord object and then call the recordInsert method with the required table name:

DatastoreRecord recordOne = new DatastoreRecord();
recordOne.setString("Url", "http://www.example.com/one");
recordOne.setString("ModifiedDate", "2012-04-02 10:20");
recordOne.setString("Seen", "1");
datastore.recordInsert("MyTable", recordOne);

Update Records
To update a record or records, use the recordUpdate method:

DatastoreRecord filter = new DatastoreRecord();
filter.setString("Url", "http://www.example.com/one");
DatastoreRecord update = new DatastoreRecord();
update.setString("ModifiedDate", "2012-04-02 11:22");
datastore.recordUpdate("MyTable", filter, update);
The filter is a datastore record that is used to specify the records that should be updated. If fields are specified in the filter, only records that match those field values exactly are updated. An empty filter matches all records. In the example above, the filter is used to match exactly one record: it specifies the required value for Url, which is the primary key.
The third argument to recordUpdate (update) specifies the update to be performed. Any fields set in this record are updated to the provided values in any records that match the filter. All other fields remain unaltered.
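The filter and update semantics described above can be modelled with plain maps. The following sketch is not the datastore library and does not use its classes: FilterSemantics is a hypothetical in-memory stand-in that only illustrates the rules (exact match on every field set in the filter, an empty filter matches all records, and only the fields set in the update are changed).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model, not the datastore library itself.
final class FilterSemantics {
    private FilterSemantics() { }

    /** A record matches when every field set in the filter has exactly the same value. */
    static boolean matches(Map<String, String> record, Map<String, String> filter) {
        for (Map.Entry<String, String> wanted : filter.entrySet()) {
            if (!wanted.getValue().equals(record.get(wanted.getKey()))) {
                return false;
            }
        }
        return true; // an empty filter matches every record
    }

    /** Applies the update fields to all matching records and returns how many changed. */
    static int update(List<Map<String, String>> table,
                      Map<String, String> filter,
                      Map<String, String> update) {
        int changed = 0;
        for (Map<String, String> record : table) {
            if (matches(record, filter)) {
                record.putAll(update); // fields not named in the update are left unaltered
                changed++;
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        Map<String, String> one = new HashMap<>();
        one.put("Url", "http://www.example.com/one");
        one.put("Seen", "1");
        List<Map<String, String>> table = new ArrayList<>();
        table.add(one);

        Map<String, String> filter = new HashMap<>();
        filter.put("Url", "http://www.example.com/one");
        Map<String, String> change = new HashMap<>();
        change.put("ModifiedDate", "2012-04-02 11:22");

        System.out.println(update(table, filter, change) + " record(s) updated");
    }
}
```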
Remove Records
To remove records, specify a filter and use the recordRemove method. The filter works in the same way as for updating records (see “Update Records” on page 54). Any records that match the filter are removed:

DatastoreRecord filter = new DatastoreRecord();
filter.setString("Url", "http://www.example.com/one");
datastore.recordRemove("MyTable", filter);
Commit Changes
When you insert, update or remove records, the changes are not committed to the datastore immediately. Instead, changes that you make are saved and committed at a later time. Until changes are committed, any attempt to select records will act on the old data, not the new. If you want to force your changes to be committed, you can use the processQueue method:
datastore.processQueue();
Select Records
To retrieve a record, specify a filter and use either the recordSelectOne or recordSelect method. The filter works in the same way as for updating records (see “Update Records” on page 54).

SelectOne Method
If you require only a single record, use the recordSelectOne method:

DatastoreRecord filter = new DatastoreRecord();
filter.setString("Url", "http://www.example.com/one");
DatastoreRecord found = datastore.recordSelectOne("MyTable", filter,
    new String[] { "Url", "ModifiedDate", "Seen" } );

As well as taking a filter, recordSelectOne takes a list of columns that should be retrieved for the matching record. You can access the values for the retrieved columns using the get methods on the returned DatastoreRecord.
Select Method
To select all records matching a filter (not just the first), use the recordSelect method:

DatastoreRecord filter = new DatastoreRecord();
filter.setString("Seen", "0");
datastore.recordSelect("MyTable", filter,
    new String[] { "Url" },
    this, "handleRecord");
This example selects only the "Url" column for all records in "MyTable" where the "Seen" field is set to "0". For each record it calls the handleRecord method on this. An example record handler is:
public void handleRecord(DatastoreRecord record) {
log.writeln("Url = " + record.getString("Url")); }
The method is called once for each record matching the filter.
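This guide does not say how the datastore locates handleRecord from an object and a method name. One plausible mechanism is Java reflection. The stand-alone sketch below assumes that mechanism; NamedCallbackDemo and forEach are hypothetical names, and a plain String stands in for DatastoreRecord.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of calling a handler that is named by a string.
class NamedCallbackDemo {
    // Records what the handler received, so the effect is observable.
    final List<String> seen = new ArrayList<>();

    // Same shape as handleRecord in the guide, but taking a String.
    public void handleRecord(String url) {
        seen.add(url);
    }

    // Look up a public one-argument method by name and call it once per item.
    static void forEach(List<String> items, Object target, String methodName) {
        try {
            Method handler = target.getClass().getMethod(methodName, String.class);
            for (String item : items) {
                handler.invoke(target, item);
            }
        } catch (ReflectiveOperationException e) {
            // A misspelled or non-public handler name ends up here.
            throw new IllegalStateException("Cannot call handler " + methodName, e);
        }
    }

    public static void main(String[] args) {
        NamedCallbackDemo demo = new NamedCallbackDemo();
        forEach(Arrays.asList("http://www.example.com/one", "http://www.example.com/two"),
                demo, "handleRecord");
        System.out.println(demo.seen);
    }
}
```

If the SDK does work this way, the handler must be public and its name must match the string exactly, because a mismatch is detected only at run time.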
Upgrade a Datastore If you decide to change the structure of a datastore table but still want to be able to read datastore files using the old format, you can provide a function to upgrade the table. For example:
datastore.createTable("MyTable",
    new String[] { "Url", "ModifiedDate", "Seen" },
    new String[0],
    new String[] { "Url" },
    false);
datastore.modifyTableByRow("MyTable",
    new String[] { "Url", "ModifiedDate", "Seen", "Acl" },
    new String[0],
    this, "upgradeV1ToV2Record");
datastore.commitTable("MyTable");
In the example, the table is created with the old column set but the createTable method is passed false as its last argument. This indicates to the datastore that this is an old table structure and will be upgraded.
The modifyTableByRow method is then used to specify the new column set. If the datastore table still has the old structure, the method upgradeV1ToV2Record on this is called for each record to upgrade it to the new format. If the table already has the new structure no action is taken.
modifyTableByRow can be called as many times as necessary. commitTable is called last of all to indicate that no further upgrades will now take place. The following code is an example upgrade method:
public boolean upgradeV1ToV2Record(
        DatastoreRecord oldRecord, DatastoreRecord newRecord) {
    newRecord.setString("Url", oldRecord.getString("Url"));
    newRecord.setString("ModifiedDate", oldRecord.getString("ModifiedDate"));
    newRecord.setString("Seen", oldRecord.getString("Seen"));
    newRecord.setString("Acl", "123");
    return true;
}
This is called for each record with the old structure (oldRecord). It should populate newRecord with the same data but in the new structure.
Index a Column If you often select by a column that is not the primary key, you can index that column to improve performance:
datastore.createIndex("MyTable", new String[] { "Seen" });
Ingester Class
The Ingester class sends commands for new, updated, or deleted documents to the CFS for processing. Currently, CFS connections are supported for ingestion into IDOL server. Ingestion into a different repository by the Insert action of another connector is also supported.
An instance of the class is created on the base task class ConnectorTask, so it is accessible from any action (though it is usually required only for Synchronize). Set the host, port, and other settings for the CFS in the Ingestion section of the configuration file.
To send commands during synchronization, call the Add, Update, or Remove functions, passing in a DocInfo object that includes a reference. Update commands update the metadata of a document but not its content, and use any metadata provided in the document. Add commands use the metadata and the file name. The files passed to the ingester should be in the temporary directory specified by ConnectorTask.tempDirectory.
Ingest Result Handler
Result handlers can be added to the ingester. They are called for each document in the following situations:
When an ingest command has successfully been sent for processing.
When the current task has completed and all outstanding ingest tasks have failed to be sent.
When ingestion is disabled in which case all tasks are assumed to have completed successfully.
The result handler is typically used when state is being stored for each document so that synchronize cycles can be incremental. It is often desirable that the connector only update an item's state information if the document for the item has been ingested successfully. The result handler method might look like this:
public void handler(DocInfo docInfo, Ingester.TaskType type, boolean success) {
    if (success) {
        if (type == Ingester.TaskType.Add || type == Ingester.TaskType.Update) {
            /* Update state */;
        } else if (type == Ingester.TaskType.Remove) {
            /* Remove from state */;
        }
    }
}
The result handler is registered using the addResultHandler method of the ingester:
ingester.addResultHandler(this, "handler");
Additional Information
Additional information is provided in the JavaDocs for ConnectorLibJava. You can obtain these by extracting them from JavaConnector.jar.
CHAPTER 5 Start and Stop the Connector
This section describes how to start and stop a connector.
NOTE You must start and stop the Connector Framework server separately from the connector.
Start the Connector
Start the connector using one of the following methods.
To start the connector using Windows Services
1. Open the Windows Services dialog box.
2. Select the ConnectorInstallName service, and click Start.
3. Close the Windows Services dialog box.
To start the connector by running the executable
1. In the connector installation directory, locate the connector executable called ConnectorInstallName.exe.
2. On a command line, enter ConnectorInstallName.exe.
Stop the Connector
Stop a connector from running by using one of the following methods.
To stop the connector using Windows Services
1. Open the Windows Services dialog box.
2. Select the ConnectorInstallName service, and click Stop.
3. Close the Windows Services dialog box.
To stop the connector service by sending a command to the service port
Type the following command in the address bar of your browser:
http://host:ServicePort/action=stop
where,
host The IP address (or name) of the machine on which the ConnectorLib Java SDK is running.
ServicePort The ConnectorLib Java SDK service port (specified in the [Service] section of the ConnectorLib Java SDK configuration file).
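The same stop command can also be sent from code rather than from a browser. The following is a minimal sketch using only the standard java.net classes. StopConnector, stopUrl, and sendStop are hypothetical names, and the host and port in main are placeholder values, not defaults taken from this guide.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper; not part of the ConnectorLib Java SDK.
class StopConnector {
    // Build the stop URL in the form given in this section.
    static String stopUrl(String host, int servicePort) {
        return "http://" + host + ":" + servicePort + "/action=stop";
    }

    // Send the stop command and return the HTTP status code of the reply.
    static int sendStop(String host, int servicePort) throws IOException {
        HttpURLConnection conn =
            (HttpURLConnection) new URL(stopUrl(host, servicePort)).openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(3000);
        conn.setReadTimeout(3000);
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder values; substitute the real host and service port.
        System.out.println(stopUrl("localhost", 40030));
        // The request is sent only when a host and port are supplied.
        if (args.length == 2) {
            System.out.println(sendStop(args[0], Integer.parseInt(args[1])));
        }
    }
}
```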
CHAPTER 6 Configure Connector Framework Server
This section describes how to configure the parameters that determine how the Connector Framework server (CFS) operates.
Connector Framework Server Configuration File
Modify Parameters
Configure Connector Framework Server
Example Configuration File
Connector Framework Server Configuration File
The parameters that determine how Connector Framework server operates are in the ConnectorFramework.cfg file, located in the CFS installation directory. You can modify these parameters to customize the CFS according to your requirements. The CFS supports all standard Server, Service, Logging, and License parameters. Most of the specific import tasks are defined in Lua scripts; therefore, the Connector Framework server configuration requirements are quite minimal.
Related Topics Connector Framework Server Parameters
Example Configuration File
Modify Parameters
The following section describes how to enter parameter values in the configuration file.
Enter Boolean Values
The following settings for Boolean parameters are interchangeable:
TRUE = true = ON = on = Y = y = 1
FALSE = false = OFF = off = N = n = 0
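A connector that reads its own settings may want to accept the same spellings. The sketch below is one way to do that in Java. ConfigBoolean is a hypothetical helper, not an SDK class, and it assumes that only the exact spellings listed above are valid; whether the server also accepts mixed-case forms such as True is not stated here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper; not part of the ConnectorLib Java SDK.
class ConfigBoolean {
    // Exactly the spellings listed in the guide.
    private static final Set<String> TRUE_VALUES =
        new HashSet<>(Arrays.asList("TRUE", "true", "ON", "on", "Y", "y", "1"));
    private static final Set<String> FALSE_VALUES =
        new HashSet<>(Arrays.asList("FALSE", "false", "OFF", "off", "N", "n", "0"));

    // Map a raw configuration value to a boolean, rejecting anything unlisted.
    static boolean parse(String raw) {
        String value = raw.trim();
        if (TRUE_VALUES.contains(value)) {
            return true;
        }
        if (FALSE_VALUES.contains(value)) {
            return false;
        }
        throw new IllegalArgumentException("Not a Boolean setting: " + raw);
    }

    public static void main(String[] args) {
        System.out.println(parse("ON"));
        System.out.println(parse("n"));
    }
}
```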
Enter String Values
Some parameters require string values that contain quotation marks. Escape each quotation mark by inserting a backslash before it. For example:
FIELDSTART0=""
Here, the beginning and end of the string are indicated by quotation marks, while all quotation marks that are contained in the string are escaped.
If you want to enter a comma-separated list of strings for a parameter, and one of the strings contains a comma, you must indicate the start and the end of this string with quotation marks. For example:
ParameterName=cat,dog,bird,"wing,beak",turtle
If any string in a comma-separated list contains quotation marks, you must put this string into quotation marks and escape each quotation mark in the string by inserting a backslash before it. For example:
ParameterName="",dog,bird,"wing,beak",turtle
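The quoting rules above can be made concrete with a small parser. The following sketch is an illustration of the rules as written, not the server's own parser. ConfigList and split are hypothetical names, and the behavior for malformed input (for example, an unclosed quotation mark) is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; illustrates the quoting rules, not the server's parser.
class ConfigList {
    // Split a comma-separated value, honoring quoted strings and \" escapes.
    static List<String> split(String value) {
        List<String> items = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '\\' && i + 1 < value.length() && value.charAt(i + 1) == '"') {
                // An escaped quotation mark is kept as a literal character.
                current.append('"');
                i++;
            } else if (c == '"') {
                // An unescaped quotation mark opens or closes a quoted string.
                inQuotes = !inQuotes;
            } else if (c == ',' && !inQuotes) {
                // A comma outside quotation marks ends the current item.
                items.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        items.add(current.toString());
        return items;
    }

    public static void main(String[] args) {
        // The list example from this section; "wing,beak" stays one item.
        System.out.println(split("cat,dog,bird,\"wing,beak\",turtle"));
    }
}
```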
Configure Connector Framework Server
This section describes how to configure the basic Connector Framework server parameters.
To configure CFS
1. Open the CFS configuration file.
2. In the [Service] section, specify the service information.
3. In the [Server] section, specify server information.
4. In the [ImportTasks] section, configure how data is imported to IDX or XML before it is indexed into IDOL Server.
5. In the [ImportService] section, specify details for KeyView and the service that imports documents into IDX or XML.
6. In the [Indexing] section, specify the details for the IDOL Server(s) to which the CFS will send documents for indexing.
7. In the [Actions] section, configure how actions are sent to the CFS.
8. Save the configuration file.
Related Topics Service Parameters
Server Parameters
Import Tasks and their Parameters
Import Service Parameters
Indexing Parameters
Actions Parameters
Secure Socket Layer Parameters
Example Configuration File
This section contains a basic example configuration file, which meets the minimum configuration requirements.
[Service]
Port=40030
ServiceStatusClients=*.*.*.*
ServiceControlClients=*.*.*.*

[Server]
Port=7000
QueryClients=*
AdminClients=*
MaxInputString=-1
MAXFILEUPLOADSIZE=-1

[Logging]
LogLevel=NORMAL
0=ApplicationLogStream
1=ActionLogStream
2=ImportLogStream
3=IndexLogStream

[actions]
MaxQueueSize=100

[ApplicationLogStream]
LogTypeCSVs=application
LogFile=application.log

[ActionLogStream]
LogTypeCSVs=action
LogFile=action.log

[ImportLogStream]
LogTypeCSVs=import
LogFile=import.log

[IndexLogStream]
LogTypeCSVs=indexer
LogFile=indexer.log

[Indexing]
DREHost=127.0.0.1
ACIPort=9000
IndexBatchSize=1000
IndexTimeInterval=300

[ImportService]
KeyviewDirectory=C:\Documents and Settings\dquatman\My Documents\ConnectorFrameworkServers\filters
ExtractDirectory=C:\Documents and Settings\dquatman\My Documents\ConnectorFrameworkServers\Temp
ThreadCount=3
ImportInheritFieldsCSV=AUTN_GROUP,AUTN_IDENTIFIER,DREDBNAME

[ImportTasks]
//Post0=lua:
CHAPTER 7 Use Lua Scripts
This section contains the following topics: Use Lua Scripts within the CFS
Method Reference
Use Lua Scripts Within the Connector
Use Lua Scripts within the CFS
Connector Framework server can import or process data using Lua, an embedded scripting language. A Lua script allows CFS to:
Call out to an external service, for example to alert a user.
Modify and insert document fields.
Interface with other libraries.
When data is imported, the script is run for each document. For more information on Lua, see:
http://www.lua.org/
CFS supports all standard Lua functions.
Configure a Lua Script You can execute four types of script: pre-Lua, post-Lua, Update, and Delete. Pre-Lua scripts run after the document data is extracted but before it is filtered, so the document contains only metadata. Post-Lua scripts run after the document data is filtered, so the document also contains the document content. Update scripts run when a document is updated, and Delete scripts run when a document is deleted. Update and Delete scripts are configured in the same way as pre-Lua and post-Lua scripts, but they appear in the [IndexTasks] section. Use this procedure to specify the location of the Lua script file.
To configure a Lua script
1. Stop the Connector Framework server.
2. Open the Connector Framework server configuration file in a text editor.
3. Locate the [ImportTasks] section, and enter a different value of PreN (for pre-Lua scripts) or PostN (for post-Lua scripts) for each script file. For example:
[ImportTasks]
...
Pre0=Lua:script1.lua
Pre1=Lua:script2.lua
Post0=Lua:script3.lua
4. To enable family hashing, set the HashN parameter. True indicates that the Lua script calculated the hash; False indicates that the hash should be calculated.
5. Save the configuration file.
Write a Lua Script
The script should have this structure:
function handler(document)
    ...
end
The handler function is called for each document and is passed a document object. This is an internal representation of the document being processed. Modifying this object changes the document.
Return true if you want to continue processing the document and return false if you want to stop.
NOTE You can write a library of useful functions to share between multiple scripts. To include the library in a script, add dofile("library.lua") to the top of the Lua script, outside of the handler function.
Method Reference
The Connector Framework server supports several methods, which are listed in Table 1.
Table 1 Supported methods
Method Description
General Methods
abs_path Returns the supplied path as an absolute path.
convert_date_time Converts date and time formats using standard Autonomy syntax.
convert_encoding Converts the encoding of the string passed in from UTF8 and returns the converted string.
copy_file Copies the source file to the destination path.
create_path Creates the specified directory tree.
create_uuid Creates a universally unique identifier.
delete_file Deletes the file specified by path.
encrypt Encrypts a string passed in and returns the encrypted string.
encrypt_security_field Encrypts the ACL.
file_setdates Sets the given file times on the file specified by path.
getcwd Returns the current working directory of the application.
get_config Loads a configuration file.
gobble_whitespace Reduces multiple adjacent white spaces.
hash_file Hashes the specified file using the SHA1 or MDA5 algorithm, or both.
hash_string Hashes the specified string.
is_dir Checks if the supplied path is a directory.
log Appends log messages to the specified file.
move_file Moves the source file to the destination path.
parse_csv Parses the given separated values string into a collection of individual strings.
parse_xml Parses the given XML string to an XMLDocument.
regex_match Performs a regular expression match on a string.
regex_search Performs a regular expression search on a string.
send_aci_action Takes the action parameters as a table instead of the full action as a string to avoid issues with parameter values containing “&”.
send_aci_command Sends the given query to the ACI server.
sleep Pauses the executing thread for a number of milliseconds.
string_uint_less Takes two strings and returns True if the second one is longer than the first.
unzip_file Extracts the zip file specified by path to the location specified by dest.
xml_encode Takes a string and encodes it to a string that is valid to be put into XML.
zip_file Zips the supplied path (file or directory).
Document Methods
addField Creates a new field when passed a name and value.
appendContent Appends content to the existing content of the document.
copyField Creates a new named field with the same value as an existing named field.
copyFieldNoOverwrite Copies a field to a certain name but does not overwrite the existing value.
countField Returns an integer of the number of fields with the name specified.
deleteField Removes a field from the document.
findField Returns the LuaField object of the specified name.
getContent Gets the content for a document.
getField Returns the first LuaField object of the specified name.
getFieldNames Gets all the field names for the document.
getFields Returns a table of LuaFields of the specified name.
getFieldValue Gets a field value.
getFieldValues Gets all values of a multi-valued field.
getNextSection Gets the next section in a document, allowing you to perform find or add operations on every section.
getReference Returns a string containing the reference.
hasField Checks whether the document has a particular named field.
insertXML Inserts a portion of XML as a new piece of metadata for the document.
renameField Moves an existing field from one name to another.
setContent Sets the content for a document.
setFieldValue Sets a field value.
setReference Sets the reference to the string passed in.
writeStubIdx Writes out a stub IDX document.
Field Methods
addField Adds a sub field with the specified name and value.
copyField Copies the sub field to another sub field.
copyFieldNoOverwrite Copies the sub field to another sub field but does not overwrite the destination.
countField Returns the number of sub fields that exist with the specified name.
deleteAttribute Deletes the attribute specified by the name passed in.
deleteField Deletes the sub field with the specified name.
getAttributeValue Gets the value of the attribute specified as a string.
getField Gets the sub field specified by the name.
getFieldNames Returns a table containing strings representing all the sub fields’ names.
getFields Gets all the sub fields specified by the name.
getFieldValues Returns a table of strings of all the values of sub fields with the specified name.
hasAttribute Returns a Boolean specifying if the field has the specified attribute passed in by name.
hasField Returns a Boolean specifying if the sub field exists or not.
insertXML Inserts a portion of XML as a new piece of metadata for the document.
name Returns the name of the field object in a string.
renameField Renames the sub field.
setAttributeValue Sets the value for the specified attribute of the field.
setValue Sets the value of the field to be passed in a string.
value Returns the value of the field object in a string.
XMLDocument Methods
root Returns an XMLNode that is the root node of the XML document.
XPathExecute Returns XMLNodeSet that is the result of supplied XPath query.
XPathRegisterNs Registers a namespace with the XML parser. Returns an integer detailing the error code.
XPathValue Returns the first occurrence of the value matching the XPath query.
XPathValues Returns a table of strings containing the values according to the XPath query.
XMLNodeSet Methods
at Returns XMLNode at position pos in the array.
size Returns size of node set.
XMLNode Methods
attr Returns first XMLAttr attribute object for this element.
content Returns the content (text element) of the xml node.
firstChild Returns XMLNode that is the first child of this node.
lastChild Returns XMLNode that is the last child of this node.
name Returns the name of the xml node.
next Returns XmlNode that is the next sibling of this node.
nodePath Returns the Xml path to the node that can be used in another XPath query.
parent Returns the parent XmlNode of the node.
prev Returns XmlNode that is the previous sibling of this node.
type Returns the type of the node as a string.
XmlAttr Methods
name Returns the name of this attribute.
next_attribute Returns XmlAttr object for the next attribute in the parent element.
previous_attribute Returns XmlAttr object for the previous attribute in the parent element.
type Returns the type of this attribute node.
value Returns the value of this attribute.
RegexMatch Methods
length Returns the length of the sub match. The default value of 0 returns for the full match.
next Returns a RegexMatch for the next match.
position Returns the position of the sub match as an index from 1.
size Returns the number of sub matches for the current match as an integer.
str Returns the string for the sub match.
Config Methods
getEncryptedValue Returns the unencrypted value from the config of an encrypted value.
getValue Returns the value of the configuration parameter key in a given section.
getValues Returns a table of strings if you have multiple values for a key (for example, a CSV or numbered like keyN).
General Methods
abs_path Returns the supplied path as an absolute path.
Syntax abs_path( String path )
Arguments
Arguments Type/Description
path The relative path.
Returns A string of the supplied path as an absolute path.
convert_date_time Converts date and time formats using standard Autonomy syntax.
Syntax String convert_date_time (String InputDateTime, String InputFormatCSV, String OutputFormat, [Boolean OutputGMT = false])
Arguments
Arguments Type/Description
InputDateTime The date and time to be converted.
InputFormatCSV A comma-separated list of the possible date and time formats of the input.
OutputFormat The format of the date and time to be output.
OutputGMT Specifies whether to treat the date and time output as Greenwich Mean Time. Default is false.
Discussion All date and time input is treated as local time unless it contains explicit time zone information.
Returns Date and time in the desired format.
convert_encoding This method converts the encoding of the string passed in from UTF8 and returns the converted string.
Syntax convert_encoding ( String content, String encodingname)
Arguments
Arguments Type/Description
content The string to convert.
encodingname The encoding name to convert to (same as IDOL encoding names).
Returns The converted string.
copy_file Copy the source file to the destination path. The copy will fail if the destination file already exists. This can be overridden by providing the optional overwrite argument which forces the copy if the destination exists.
Syntax copy_file( String src, String dest [, Boolean overwrite] )
Arguments
Arguments Type/Description
src The source file.
dest The destination file.
overwrite Forces the copy if the destination exists.
Returns Returns a Boolean indicating success/failure.
create_path Creates the specified directory tree.
Syntax void create_path (String Path)
Arguments
Arguments Type/Description
Path The path to be created.
create_uuid Creates a universally unique identifier.
Syntax String create_uuid()
Returns A universally unique identifier.
delete_file Delete the file specified by path.
Syntax delete_file( String path )
Arguments
Arguments Type/Description
path The path and filename of the file to be deleted.
Returns Returns a Boolean indicating success/failure.
encrypt This method encrypts a string passed in and returns the encrypted string. It uses the same encryption as is used for ACL encryption.
Syntax encrypt (String content)
Arguments
Arguments Type/Description
content The string to encrypt.
Returns The encrypted string.
encrypt_security_field Encrypts the ACL.
Syntax String encrypt_security_field (String ACL)
Arguments
Arguments Type/Description
ACL An Access Control List string.
Returns An encrypted string.
file_setdates Sets the given file times on the file specified by path. If the format parameter is not specified, it is assumed that the dates are provided as seconds since the epoch (1st January 1970).
Syntax file_setdates( String path, String created, String modified, String accessed [, String format] )
Arguments
Arguments Type/Description
path The path or filename of the file on which to set the dates.
created The date created.
modified The date modified.
accessed The last date accessed.
format The format of the date strings passed in. The format syntax is the same as for other Autonomy products.
getcwd Returns the current working directory of the application.
Syntax getcwd()
Returns Returns a string of the current working directory.
get_config Load a configuration file.
Syntax get_config( path )
Arguments
Arguments Type/Description
path The path of the configuration file to load.
Discussion Config files are cached after the first call to get_config, to avoid unnecessary disk I/O in the likely event that the same config is accessed frequently by subsequent invocations of the Lua script. One cache is maintained per Lua state, so the maximum number of reads for a config file is equal to the number of threads that are running Lua scripts. An error is raised if the configuration file does not exist.
Returns A Config object.
gobble_whitespace Reduces multiple adjacent white spaces (tab, carriage return, space, and so on) in the specified string to a single space.
Syntax String gobble_whitespace (String Input)
Arguments
Arguments Type/Description
Input An input string.
Returns A string without adjacent white spaces.
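For connector code written in Java, an equivalent effect can be obtained with a regular expression. The sketch below is an approximation under stated assumptions, not the CFS implementation: GobbleWhitespace is a hypothetical name, and it assumes that every run of characters matched by the Java \s class collapses to one space, with no trimming at the ends of the string.

```java
// Hypothetical Java approximation of gobble_whitespace; not the CFS code.
class GobbleWhitespace {
    // Replace each run of whitespace (space, tab, CR, LF, and so on) with one space.
    static String gobble(String input) {
        return input.replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        System.out.println(gobble("Title:\t\tAnnual   report\r\n2012"));
    }
}
```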
hash_file Hashes the specified file using the SHA1 or MDA5 algorithm, or both.
Syntax String, [String] hash_file (String FileName, String Algorithm1, [String Algorithm2])
Arguments
Arguments Type/Description
FileName The name of the file to be hashed.
Algorithm1 The type of algorithm to use. Must be either SHA1 or MDA5.
Algorithm2 The optional second type of algorithm to use. Must be whichever algorithm was not used in Algorithm1.
Returns The hash of the file for each algorithm specified.
hash_string Hashes the specified string using the SHA1 or MDA5 algorithm.
Syntax String hash_string (String StringToHash, String Algorithm)
Arguments
Arguments Type/Description
StringToHash The string to be hashed.
Algorithm The algorithm to use. Must be either SHA1 or MDA5.
Returns The hashed input string.
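To check a hash value outside the CFS, for example in a connector unit test, the standard Java MessageDigest class can compute the same kind of digest. The sketch below rests on two assumptions that this guide does not confirm: that MDA5 refers to the MD5 algorithm, and that the hash is returned as lowercase hexadecimal. HashSketch and hashHex are hypothetical names, and the algorithm names passed in are the Java names (SHA-1 and MD5), not the Lua argument values.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper; output format (hex) is an assumption, not the CFS contract.
class HashSketch {
    // Hash the UTF-8 bytes of the input and return the digest as lowercase hex.
    static String hashHex(String input, String javaAlgorithmName) {
        try {
            MessageDigest digest = MessageDigest.getInstance(javaAlgorithmName);
            byte[] bytes = digest.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : bytes) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unknown algorithm: " + javaAlgorithmName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hashHex("abc", "SHA-1"));
        System.out.println(hashHex("abc", "MD5"));
    }
}
```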
is_dir Check if the supplied path is a directory.
Syntax is_dir( String path )
Arguments
Arguments Type/Description
path The path to check.
Returns Returns a Boolean indicating if the supplied path is a directory.
log Appends log messages to the specified file.
Syntax log( String file, String message )
Arguments
Arguments Type/Description
file The file to which log messages will be appended.
message The message to print to the file.
Returns Nothing.
move_file Move the source file to the destination path. The move will fail if the destination file already exists. This can be overridden by providing the optional overwrite argument which forces the move if the destination exists.
Syntax move_file( String src, String dest [, Boolean overwrite] )
Arguments
Arguments Type/Description
src The source file.
dest The destination file.
overwrite Forces the move if the destination exists.
Returns Returns a boolean indicating success/failure.
parse_csv Parse the given separated values string into a collection of individual strings.
Syntax parse_csv( csv_string [, delimiter])
Arguments
Arguments Type/Description
csv_string The string to parse.
delimiter The delimiter to use (defaults to ",").
Discussion The method understands quoted values (such that parsing 'foot, "leg, torso", elbow' produces three values) and ignores white space around delimiters.
Returns The elements are returned as multiple return values. You may wish to put them in a table like this:
local results = { parse_csv("cat,tree,house", ",") };
parse_xml Parse the given XML string to an XMLDocument.
Syntax parse_xml( xml_string )
Arguments
Arguments Type/Description
xml_string XML data as a string.
Returns An XMLDocument containing the parsed data, or nil if the string could not be parsed.
regex_match This method performs a regular expression match on a string.
Syntax regex_match (String name, String regex [, Boolean case])
Arguments
Arguments Type/Description
name The string in which to search.
regex The regular expression with which to search.
case An optional Boolean specifying whether or not to be case-sensitive.
Returns A table of strings.
regex_search This method performs a regular expression search on a string.
Syntax regex_search (String name, String regex [, Boolean case])
Arguments
Arguments Type/Description
name The string in which to search.
regex The regular expression with which to search.
case An optional Boolean specifying whether or not to be case-sensitive.
Returns A regular expression match-object.
send_aci_action Sends the given query to the ACI server at host:port with optional time-out (ms) and retries settings. Takes the action parameters as a table instead of the full action as a string, as with send_aci_command, to avoid issues with parameter values containing “&”.
Syntax send_aci_action( host, port, action [, parameters][, timeout] [, retries] )
Example send_aci_action( "localhost", 9000, "query", {text = "*", print = "all"} );
Arguments
Arguments Type/Description
host The ACI host to send the query to.
port The port to send the query to.
action The action to perform (for example, query).
parameters This takes a Lua table containing the action parameters, for example, { param1="foo", param2="bar" }
timeout The number of milliseconds to wait before timing out. The default is 3000.
retries The number of times to retry if the request fails. The default is 3.
Returns The xml response is returned as a string. If the request has failed, then nil is returned.
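The reason a parameter table avoids problems with “&” is that, in a query string, “&” separates one parameter from the next, so a value that contains it must be escaped before the string is assembled. The Java sketch below illustrates that idea only. How CFS builds the request internally is not documented here, and AciActionSketch and buildQuery are hypothetical names.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of escaping parameter values; not the CFS code.
class AciActionSketch {
    // Build "action=...&name=value..." with every name and value URL-encoded.
    static String buildQuery(String action, Map<String, String> parameters) {
        StringBuilder query = new StringBuilder("action=").append(encode(action));
        for (Map.Entry<String, String> p : parameters.entrySet()) {
            query.append('&').append(encode(p.getKey()))
                 .append('=').append(encode(p.getValue()));
        }
        return query.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> parameters = new LinkedHashMap<>();
        parameters.put("text", "cats & dogs");
        parameters.put("print", "all");
        // The "&" inside the text value is escaped, so it cannot split the parameter.
        System.out.println(buildQuery("query", parameters));
    }
}
```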
send_aci_command Sends the given query to the ACI server at host:port with optional time-out (ms) and retries settings.
Syntax send_aci_command( host, port, query [, timeout] [, retries] )
Arguments
Arguments Type/Description
host The ACI host to send the query to.
port The port to send the query to.
query The query to send (for example, action=getstatus)
timeout The number of milliseconds to wait before timing out. The default is 3000.
retries The number of times to retry if the request fails. The default is 3.
Returns The xml response is returned as a string. If the request has failed, then nil is returned.
sleep Pause the executing thread for a number of milliseconds.
Syntax sleep( Integer milliseconds )
Arguments
Arguments Type/Description
milliseconds The number of milliseconds for which to pause the current thread.
Returns Nothing.
string_uint_less This method takes two strings and returns True if the second one is longer than the first. Will return False otherwise.
Syntax string_uint_less (String1, String2)
Arguments
Arguments Type/Description
String1 The string that acts as the standard for comparison.
String2 The string to compare against the first string (the standard).
Returns A Boolean.
unzip_file Extracts the zip file specified by path to the location specified by dest.
Syntax unzip_file( String path, String dest )
Arguments
Arguments Type/Description
path The path or filename of the file to be unzipped.
dest The destination path where the files are to be extracted.
Returns Returns a boolean indicating success/failure.
xml_encode This method takes a string and encodes it to a string that is valid to be put into XML.
Syntax xml_encode (String content)
Arguments
Arguments Type/Description
content The string to be encoded.
Returns A string.
zip_file Zip the supplied path (file or directory). The output file will only be overwritten if true is supplied for the optional overwrite argument.
Syntax zip_file( String path [, Boolean overwrite] )
Arguments
Arguments Type/Description
path The path or filename of the file to be zipped.
overwrite Forces the creation of the zip file if an output file already exists.
Returns A Boolean indicating success or failure. The output is written to path.zip.
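The following sketch shows one way to combine the Boolean result with the path.zip naming rule described above. The helper name archive_path is hypothetical, and zip_file is only available when the script runs inside the connector.

```lua
-- Hypothetical helper (not part of the SDK): zips a path and returns the
-- name of the archive, or nil if zip_file reports a failure.
function archive_path(path)
    -- Passing true allows an existing path.zip to be overwritten.
    if zip_file(path, true) then
        return path .. ".zip"
    end
    return nil
end
```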
Document Methods
addField Adds a new field to the document.
Syntax addField ( String fieldname, String fieldvalue )
Arguments
Arguments Type/Description
fieldname The name of the field to add.
fieldvalue The value to set for the field.
appendContent Appends content to the existing content of the document.
Syntax appendContent ( String content )
Arguments
Arguments Type/Description
content The content to append to the document content.
copyField Copies a field to a new field with the specified name.
Syntax copyField (String sourcename, String targetname [, Boolean case])
Arguments
Arguments Type/Description
sourcename The name of the field to copy.
targetname The destination field name.
case An optional Boolean specifying whether or not to be case-sensitive.
copyFieldNoOverwrite Copies a field to a new field with the specified name, but does not overwrite an existing value.
Syntax copyFieldNoOverwrite ( String sourcename, String targetname [, Boolean case])
Arguments
Arguments Type/Description
sourcename The name of the field to copy.
targetname The destination field name.
case An optional Boolean specifying whether or not to be case-sensitive.
countField This method returns the number of fields with the specified name.
Syntax countField (String fieldname [, Boolean case])
Arguments
Arguments Type/Description
fieldname The name of the field to count.
case An optional Boolean to specify whether or not to be case-sensitive.
Returns An integer.
deleteField Deletes a field from a document.
Syntax deleteField ( String fieldname [, Boolean case])
Arguments
Arguments Type/Description
fieldname The name of the field to delete.
case An optional Boolean to specify whether or not to be case-sensitive.
findField This method returns the LuaField object of the specified name.
Syntax findField ( String fieldname)
Arguments
Arguments Type/Description
fieldname The name of the field to find.
Returns A LuaField object of the specified name.
getContent Gets the content for a document.
Syntax getContent ()
Returns The document content as a string.
getField This method returns the first LuaField object of the specified name.
Syntax getField (String name [, Boolean case])
Arguments
Arguments Type/Description
name The name of the LuaField object.
case An optional Boolean to specify whether or not to be case-sensitive.
Returns First LuaField object of the specified name.
getFields This method returns a table of LuaFields of the specified name.
Syntax getFields (String name [, Boolean case])
Arguments
Arguments Type/Description
name The name of the LuaField object.
case An optional Boolean to specify whether or not to be case-sensitive.
Returns A table of LuaFields.
getFieldNames Gets all the field names for the document.
Syntax getFieldNames ( )
Returns A table of all the field names.
getFieldValue Gets the value of a field on a document.
Syntax getFieldValue( String fieldname [, Boolean case])
Arguments
Arguments Type/Description
fieldname The name of the field whose value is to be retrieved.
case An optional Boolean to specify whether or not to be case-sensitive.
Returns A string containing the value.
getFieldValues Gets all values from all fields that have the same name.
Syntax getFieldValues( String fieldname [, Boolean case])
Arguments
Arguments Type/Description
fieldname The name of the field to match.
case An optional Boolean to specify whether or not to be case-sensitive.
Returns A table of all the field values.
getNextSection The document object passed to the script's handler function in fact represents the first section of the document. This means the functions previously detailed only read and modify the first section. This method returns the next section in the document when sectioned.
Syntax LuaDocument getNextSection ()
Example To perform operations on every section:
    local section = document
    while section do
        -- Manipulate section
        section = section:getNextSection()
    end
Returns A document object that contains the next DRE section.
getReference This method returns a string containing the reference.
Syntax getReference ()
Returns The string containing the reference.
hasField Checks to see if a field exists for a document.
Syntax hasField ( String fieldname [, Boolean case])
Arguments
Arguments Type/Description
fieldname The name of the field whose existence you are checking.
case An optional Boolean to specify whether or not to be case-sensitive.
Returns A Boolean: true if the field exists, false otherwise.
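As an illustration of hasField together with addField, the sketch below adds a default value only when a field is missing. The helper name ensure_field is hypothetical; the document argument is the document object that the connector passes to the script.

```lua
-- Hypothetical helper (not part of the SDK): adds a field with a default
-- value only when the document has no field of that name.
function ensure_field(document, name, default)
    if not document:hasField(name) then
        document:addField(name, default)
        return true   -- the field was added
    end
    return false      -- the field already existed and was left untouched
end
```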
insertXML This method inserts a portion of XML as a new piece of metadata for the document.
Syntax insertXML (LuaXMLNode node)
Arguments
Arguments Type/Description
node The node to insert.
Returns A LuaField object of the inserted data.
renameField Changes the name of a field from one name to another.
Syntax renameField ( String currentname, String newname [, Boolean case])
Arguments
Arguments Type/Description
currentname The name of the field to rename.
newname The new name of the field.
case An optional Boolean to specify whether or not to be case-sensitive.
setContent Sets the content for a document.
Syntax setContent ( String content )
Arguments
Arguments Type/Description
content The content to set for the document.
setFieldValue Sets the value of a field on a document.
Syntax setFieldValue( String fieldname, String newvalue )
Arguments
Arguments Type/Description
fieldname The name of the field to set.
newvalue The value to set for the field. If the field already exists, it will be overwritten.
setReference This method sets the reference to the string passed in.
Syntax setReference (String reference)
Arguments
Arguments Type/Description
reference The reference to set.
writeStubIdx Writes out a stub idx document (a metadata file used by IDOL applications).
Syntax writeStubIdx( String filename )
Arguments
Arguments Type/Description
filename The name of the file to create.
Returns A Boolean: true if written, false otherwise.
Field Methods
addField This method adds a sub field with the specified name and value.
Syntax addField (String fieldname, String fieldvalue)
Arguments
Arguments Type/Description
fieldname The name of the field.
fieldvalue The value of the field.
Returns The LuaField object.
copyField This method copies the sub field to another sub field.
Syntax copyField (String fieldname, String destination [, Boolean case])
Arguments
Arguments Type/Description
fieldname The name of the field to copy.
destination The name of the field to copy to.
case A Boolean to specify whether or not to be case-sensitive.
copyFieldNoOverwrite This method copies the sub field to another sub field but does not overwrite the destination.
Syntax copyFieldNoOverwrite (String fieldname, String destination [, Boolean case])
Arguments
Arguments Type/Description
fieldname The name of the field to copy.
destination The name of the field to copy to.
case A Boolean to specify whether or not to be case-sensitive.
countField This method returns the number of sub fields that exist with the specified name.
Syntax countField (String fieldname [, Boolean case])
Arguments
Arguments Type/Description
fieldname The name of the field.
case A Boolean to specify whether or not to be case-sensitive.
Returns The number of sub fields that exist with the specified name.
deleteAttribute This method deletes the attribute specified by the name passed in.
Syntax deleteAttribute (String name)
Arguments
Arguments Type/Description
name The attribute name to delete.
deleteField This method deletes the sub field with the specified name.
Syntax deleteField (String fieldname [, Boolean case])
Arguments
Arguments Type/Description
fieldname The name of the field to delete.
case A Boolean to specify whether or not to be case-sensitive.
getAttributeValue This method gets the value of the attribute specified as a string.
Syntax getAttributeValue (String name)
Arguments
Arguments Type/Description
name The attribute name to get.
Returns The attribute value as a string.
getField This method gets the sub field specified by the name.
Syntax getField (String name [, Boolean case])
Arguments
Arguments Type/Description
name The field name to get.
case A Boolean to specify whether or not to be case-sensitive.
Returns A single field object.
getFieldNames This method returns a table containing strings representing all the sub fields’ names.
Syntax getFieldNames ()
Returns A table containing strings representing all the sub fields’ names.
getFields This method gets all the sub fields specified by the name.
Syntax getFields (String name [, Boolean case])
Arguments
Arguments Type/Description
name The field name to get.
case A Boolean to specify whether or not to be case-sensitive.
Returns A table of field objects.
getFieldValues This method returns a table of strings of all the values of sub fields with the specified name.
Syntax getFieldValues (String fieldname [, Boolean case])
Arguments
Arguments Type/Description
fieldname The name of the field.
case A Boolean to specify whether or not to be case-sensitive.
Returns A table of strings of all the values of sub fields with the specified name.
hasAttribute This method returns a Boolean specifying if the field has the specified attribute passed in by name.
Syntax hasAttribute (String name)
Arguments
Arguments Type/Description
name The name of the attribute.
Returns A Boolean specifying if the field has the specified attribute passed in by name.
hasField This method returns a Boolean specifying if the sub field exists or not.
Syntax hasField (String fieldname [, Boolean case])
Arguments
Arguments Type/Description
fieldname The name of the field.
case A Boolean to specify whether or not to be case-sensitive.
Returns A Boolean specifying if the sub field exists or not.
insertXML This method inserts a portion of XML as a new piece of metadata for the document.
Syntax insertXML (LuaXMLNode node)
Arguments
Arguments Type/Description
node The node to insert.
Returns A LuaField object of the inserted data.
name This method returns the name of the field object in a string.
Syntax name ()
Returns The name of the field object in a string.
renameField This method renames the sub field.
Syntax renameField (String oldname, String newname [, Boolean case])
Arguments
Arguments Type/Description
oldname The previous name of the field.
newname The new name of the field.
case A Boolean to specify whether or not to be case-sensitive.
setAttributeValue This method sets the value for the specified attribute of the field.
Syntax setAttributeValue (String attribute, String value)
Arguments
Arguments Type/Description
attribute The attribute to set.
value The value to set to.
setValue This method sets the value of the field to the string passed in.
Syntax setValue (String value)
Arguments
Arguments Type/Description
value The value to set.
value This method returns the value of the field object in a string.
Syntax value ()
Returns The value of the field object in a string.
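The field methods name, value, hasAttribute, and getAttributeValue can be combined as in the sketch below. The helper name describe_field and the attribute name "unit" are assumptions made for illustration; the field argument is a LuaField object such as the one returned by getField.

```lua
-- Hypothetical helper (not part of the SDK): formats a field as
-- NAME=value, appending the "unit" attribute in parentheses if present.
function describe_field(field)
    local text = field:value()
    if field:hasAttribute("unit") then
        text = text .. " (" .. field:getAttributeValue("unit") .. ")"
    end
    return field:name() .. "=" .. text
end
```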
XMLDocument Methods
root Returns an XmlNode which is the root node of the XML document.
Syntax root()
Returns An XmlNode.
XPathExecute Returns an XmlNodeSet that is the result of the supplied XPath query.
Syntax XPathExecute( String xpathQuery )
Arguments
Arguments Type/Description
xpathQuery The xpath query to execute.
Returns An XmlNodeSet node set.
XPathRegisterNs Register a namespace with the XML parser. Returns an integer detailing the error code.
Syntax XPathRegisterNs( String prefix, String URI )
Arguments
Arguments Type/Description
prefix The namespace prefix.
URI The namespace location.
Returns 0 in case of success, -1 in case of error.
XPathValue Returns the first occurrence of the value matching the XPath query.
Syntax String XPathValue(String query)
Arguments
Arguments Type/Description
query The XPath query to use.
Returns A string of the value.
XPathValues Returns a table of strings containing the values according to the XPath query.
Syntax Table XPathValues(String query)
Arguments
Arguments Type/Description
query The XPath query to use.
Returns A table of strings of the values.
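Queries against namespaced XML need the prefix registered first. The sketch below checks the documented return code of XPathRegisterNs (0 for success, -1 for error) before calling XPathValues. The helper name namespaced_values is hypothetical, and xmldoc stands for an XMLDocument object obtained elsewhere in the script.

```lua
-- Hypothetical helper (not part of the SDK): returns all values matched by
-- a namespaced XPath query, or nil if the namespace cannot be registered.
function namespaced_values(xmldoc, prefix, uri, query)
    -- XPathRegisterNs returns 0 on success and -1 on error.
    if xmldoc:XPathRegisterNs(prefix, uri) ~= 0 then
        return nil
    end
    return xmldoc:XPathValues(query)
end
```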
XmlNodeSet Methods
at Returns XmlNode at position pos in the array.
Syntax at( pos )
Arguments
Arguments Type/Description
pos The index of the item in the array to get.
Returns An XmlNode.
size Returns size of node set.
Syntax size()
Returns An integer of the size of the node set.
XmlNode Methods
attr Returns first XmlAttr attribute object for this element.
Syntax attr()
Returns An XmlAttr object.
content Returns the content (text element) of the xml node.
Syntax content()
Returns A string containing the content.
firstChild Returns XmlNode which is the first child of this node.
Syntax firstChild()
Returns An XmlNode.
lastChild Returns XmlNode which is the last child of this node.
Syntax lastChild()
Returns An XmlNode.
name Returns the name of the xml node.
Syntax name()
Returns A string containing the name.
next Returns XmlNode which is the next sibling of this node.
Syntax next()
Returns An XmlNode.
nodePath Returns the Xml path to the node which can be used in another XPath query.
Syntax nodePath()
Returns A string containing the path.
parent Returns the parent XmlNode of the node.
Syntax parent()
Returns An XmlNode.
prev Returns XmlNode which is the previous sibling of this node.
Syntax prev()
Returns An XmlNode.
type Returns the type of the node as a string.
Syntax type()
Returns A string containing the type. Possible values are:
element_node        comment_node        element_decl
attribute_node      document_node       attribute_decl
text_node           document_type_node  entity_decl
cdata_section_node  document_frag_node  namespace_decl
entity_ref_node     notation_node       xinclude_start
entity_node         html_document_node  xinclude_end
pi_node             dtd_node            docb_document_node
XmlAttr Methods
name Returns the name of this attribute.
Syntax name()
Returns A String containing the name of the attribute.
next_attribute Returns XmlAttr object for the next attribute in the parent element.
Syntax next_attribute ()
Returns An XmlAttr.
previous_attribute Returns XmlAttr object for the previous attribute in the parent object.
Syntax previous_attribute ()
Returns An XmlAttr.
type Returns the type of this attribute node.
Syntax type()
Returns This method returns a string containing "attribute_node" if the node is valid, or "null" if the node is invalid.
value Returns the value of this attribute.
Syntax value()
Returns A String containing the value of the attribute.
RegexMatch Methods
length This method returns the length of the sub match. The default value of 0 returns the length of the full match.
Syntax length ( Integer submatch)
Arguments
Arguments Type/Description
submatch The sub match to query.
Returns The length of the sub match.
next Returns a RegexMatch for the next match.
Syntax next ()
Returns A RegexMatch for the next match.
position This method returns the position of the sub match as an index from 1. The default value of 0 returns the position of the full match.
Syntax position (Integer submatch)
Arguments
Arguments Type/Description
submatch The sub match to query.
Returns The position of the submatch as an index from 1.
size This method returns the number of sub matches for the current match as an integer. The count includes the full match, so it is one greater than the number of sub matches in the expression.
Syntax size (Integer submatch)
Arguments
Arguments Type/Description
submatch The sub match to query.
Returns The number of sub matches for the current match.
str This method returns the string for the sub match. The default value of 0 returns the string of the full match.
Syntax str (Integer submatch)
Arguments
Arguments Type/Description
submatch The sub match to query.
Returns The string for the sub match.
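Because size counts the full match as well as the sub matches, and sub match 0 is the full match, the valid indexes for str run from 0 to size minus 1. The sketch below collects them into a table. The helper name collect_submatches is hypothetical, and the sketch assumes that size can be called without an argument; match is a RegexMatch object obtained elsewhere in the script.

```lua
-- Hypothetical helper (not part of the SDK): returns a table holding the
-- full match followed by each sub match of a RegexMatch object.
function collect_submatches(match)
    local parts = {}
    -- size() counts the full match too, so indexes run from 0 to size() - 1.
    for i = 0, match:size() - 1 do
        parts[#parts + 1] = match:str(i)
    end
    return parts
end
```

In the returned table, the first entry is the full match and the remaining entries are the sub matches in order.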
Config Methods
getEncryptedValue Returns the decrypted form of a value that is stored encrypted in the configuration file.
Syntax String getEncryptedValue(String section, String key)
Arguments
Arguments Type/Description
section The section in the configuration file.
key The key in the configuration file to get the value for.
Returns The unencrypted value.
getValue Returns the value of the configuration parameter key in a given section. If the key does not exist in the section, then the default value is returned.
Syntax getValue( String section, String key, String default )
Arguments
Arguments Type/Description
section The section name in the configuration file.
key The name of the key from which to read.
default The default value to use if no key is found.
Returns A string containing the value read from the configuration file.
getValues Returns a table of strings if you have multiple values for a key (for example, a comma-separated list or numbered keys such as keyN).
Syntax Table getValues(String section, String key)
Arguments
Arguments Type/Description
section The section in the configuration file.
key The key in the configuration file to get the value for.
Returns A table of strings of the values.
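The sketch below reads one single-valued and one multi-valued setting through the config object. The helper name read_settings and the key names OutputHost and Extensions are hypothetical and are used only for illustration.

```lua
-- Hypothetical helper (not part of the SDK): reads two settings from one
-- configuration section. The key names are illustrative only.
function read_settings(config, section)
    return {
        -- getValue falls back to the third argument when the key is absent.
        host = config:getValue(section, "OutputHost", "localhost"),
        -- getValues returns a table of strings for a multi-valued key.
        extensions = config:getValues(section, "Extensions"),
    }
end
```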
Change the Value of a Field The functions getFieldValue and fieldGetValue read the contents of a field, and setFieldValue and fieldSetValue write them, so you can modify the contents of a field directly. Use either the field name or a field object. For example:
    -- Using the field name:
    local content = document:getFieldValue("CONTENT")
    content = content .. "\nCopyright MyCorp\n"
    document:setFieldValue("CONTENT", content)

    -- Alternatively, using a field object:
    local content_field = document:findField("CONTENT")
    local content = document:fieldGetValue(content_field)
    content = content .. "\nCopyright MyCorp\n"
    document:fieldSetValue(content_field, content)
Example Script For each document, this Lua script adds a COUNT field, appends the total number of sections to the title, and replaces the content of each section with the section number.
NOTE The COUNT is 1 for the first document and increases as long as the job is running.
    doc_count = 0
    function handler(document)
        doc_count = doc_count + 1
        document:addField("COUNT", doc_count);

        local section_count = 0
        local section = document

        while section do
            section_count = section_count + 1
            section:setFieldValue("CONTENT", "Section "..section_count);
            section = section:getNextSection()
        end

        document:setFieldValue("TITLE", document:getFieldValue("TITLE").." Total Sections "..section_count)
        return true;
    end
The same section loop and title update can be written with field objects, using findField, fieldSetValue, and fieldGetValue on the DRECONTENT and DRETITLE fields:
        while section do
            section_count = section_count + 1
            local field = section:findField("DRECONTENT")
            if field then
                section:fieldSetValue(field, "Section "..section_count);
            end
            section = section:getNextSection()
        end
        local field = document:findField("DRETITLE")
        if field then
            document:fieldSetValue(field, document:fieldGetValue(field).." Total Sections "..section_count)
        end
Use Lua Scripts Within the Connector
This section describes how to use CFS connector actions in Lua scripts to transform documents. It includes the following sections:
Introduction
Example Lua Script
Introduction There are occasions when documents are not to be sent to the Connector Framework Server (CFS). For example, you may use the Collect action to retrieve documents from one repository and then insert them into another. In doing so, you may need to transform the documents from the first repository before they can be accepted by the second repository. You can use a Lua script to do this. Some CFS connector configuration options and actions take a Lua script as a parameter. The information in this chapter discusses the requirements for any Lua script that is used in this way.
Example Lua Script You can use the CollectActions parameter of the Collect action, the IngestActions parameter of the Synchronize action and the IngestActions parameter in the configuration file to specify a Lua script that runs on each document.
The Lua script takes the following parameters:
Parameter Description
config A configuration file object.
document A document object that represents the document.
params A table containing additional parameters provided by the connector. For example:
    TYPE. The type of the command being performed. This can be ADD, UPDATE, DELETE, or COLLECT.
    SECTION. The configuration section for the task.
    FILENAME. The document filename. The Lua script may modify this file, but should not delete it.
The configuration file object provides the following methods:
    getValue(section, parameter, default)
To see the set of methods that the document object provides, refer to “Document Methods” on page 87. An example Lua script appears below:
    function handler( config, document, params )
        -- If these lines are uncommented, and the connector is running
        -- from the console, all the parameters in params will be output
        -- to the console.
        -- for k,v in pairs(params) do
        --     print(k,v)
        -- end

        -- Sets local variables from the parameters passed in.
        local type = params["TYPE"]
        local section = params["SECTION"]
        local filename = params["FILENAME"]

        -- Read a config setting from the config file.
        local val = config:getValue(section, "ConfigSettingName", "Value")

        -- If the document is not being deleted, set the field FieldName
        -- to the value read from the config file.
        if type ~= "DELETE" then
            document:setFieldValue("FieldName", val)
        end

        -- If this document has a file (that is, not just metadata),
        -- copy the file to a new location and write a stub idx
        -- containing the metadata with it.
        if filename ~= "" then
            local copytofilename = "OutputPath/"..create_uuid(filename)
            copy_file(filename, copytofilename)
            document:writeStubIdx(copytofilename..".idx")
        end

        return true
    end
NOTE The Lua script should return true normally, but can return false to reject the document when used as an Ingest action.
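As a sketch of the rejection behavior described in the note, the handler below returns false for documents that have no TITLE field. The rejection rule itself is a hypothetical policy chosen for illustration; only the handler signature, the TYPE parameter, and hasField come from this guide.

```lua
-- Sketch of an ingest-action handler that rejects some documents.
-- The "must have a TITLE field" rule is a hypothetical policy.
function handler(config, document, params)
    -- Deletions carry no content to validate, so let them through.
    if params["TYPE"] == "DELETE" then
        return true
    end
    -- Returning false rejects the document when used as an Ingest action.
    if not document:hasField("TITLE") then
        return false
    end
    return true
end
```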
PART 2 Parameter and Command Reference
This section describes useful configuration parameters and action commands for the connector to be developed and for the Connector Framework Server.
Parameters Common to CFS Connectors
Parameters Common to CFS Connectors Using Java
CFS Connector Actions
Connector Framework Server Parameters
License Configuration Parameters
Logging Configuration Parameters
Secure Socket Layer Parameters
Service Actions
Service Configuration Parameters
CHAPTER 8 Parameters Common to CFS Connectors
This section describes the parameters that are common to all connectors that use the Connector Framework Server (CFS). If more than one configuration file section is specified for a configuration parameter, the value of the parameter in the left-most section overrides the values in the other sections.
For example, for the Configuration Section “TaskName or FetchTasks or Default,” parameter values in the TaskName section override the corresponding values in the FetchTasks section, which in turn override those in the Default section.
ACI Server Configuration
Import Service
Distributed Connector
View Server
General Connector Parameters
Fetch Task Configuration
Ingestion
GroupServer
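The section precedence rule described at the start of this chapter (the left-most section that defines a parameter wins) can be sketched as a simple lookup. This is an illustration of the rule only, not the connector's implementation: the table example_config, the function resolve_parameter, and the parameter names are all hypothetical.

```lua
-- Hypothetical model of a configuration file: section name -> table of keys.
local example_config = {
    MyTask     = { ExampleParam = "task" },
    FetchTasks = { ExampleParam = "fetch", OtherParam = "fetch" },
    Default    = { ExampleParam = "default", OtherParam = "default", ThirdParam = "default" },
}

-- Returns the value from the left-most section that defines the key,
-- together with the name of that section, or nil if no section defines it.
function resolve_parameter(config, sections, key)
    for _, name in ipairs(sections) do
        local section = config[name]
        if section ~= nil and section[key] ~= nil then
            return section[key], name
        end
    end
    return nil
end

-- "TaskName or FetchTasks or Default", with MyTask standing in for the task name.
local order = { "MyTask", "FetchTasks", "Default" }
print(resolve_parameter(example_config, order, "ExampleParam"))
print(resolve_parameter(example_config, order, "OtherParam"))
print(resolve_parameter(example_config, order, "ThirdParam"))
```

Here ExampleParam resolves from MyTask, OtherParam from FetchTasks, and ThirdParam from Default.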
ACI Server Configuration
The parameters in this section control the way the connector handles the load caused by incoming ACI requests.
FilePath Use this parameter to specify the location of the file to receive event data. Set the value as TextFileHandler to use an internal text file handler.
Type: String
Default:
Required: No
Configuration Section: EventHandler
Example: FilePath=./EventData
See Also: “LibraryName” on page 118
LibraryName Use this parameter to specify the name of the library to use as the event handler. Set it to HttpHandler to use the internal HTTP handler. Specifying the .dll or .so extension is optional.
Type: String
Default:
Required: No
Configuration Section: EventHandler
Example: LibraryName=./luaHandler
See Also: “OnError” on page 121, “OnFinish” on page 122, “OnStart” on page 122
LuaScript Use this parameter to specify the Lua script to execute on the event.
Type: String
Default:
Required: No
Configuration Section: EventHandler
Example: LuaScript=./finished_handler.lua
See Also: “LibraryName” on page 118
MaximumThreads Use this parameter to specify the maximum number of simultaneous ACI actions to process.
The number of synchronous actions (for example, getstatus or view) that should be processed simultaneously:
    [Server]
    MaximumThreads=4
The number of asynchronous actions (for example, fetch) that should be processed simultaneously:
    [Actions]
    MaximumThreads=4
Type: Integer
Default: 2
Required: No
Configuration Section: Actions or Server
Example: MaximumThreads=4
See Also:
MaxQueueSize Use this parameter to specify the maximum number of asynchronous fetch action commands that will be queued by the connector. No further fetch actions will be accepted once the queue size has been reached (until the queue diminishes).
Type: Integer
Default: The default is the maximum integer value (no limit).
Required: No
Configuration Section: Actions
Example: MaxQueueSize=4
See Also: “MaxScheduledSize” on page 120
MaxScheduledSize Use this parameter to limit the number of Processing, Finished, and Error tasks that the connector stores. All actions and response data are stored in a file, actions/fetch/fetch.queue. If the MaxScheduledSize parameter is not specified, this file continues to grow with each action, and eventually a resource limit is reached that causes unhandled failures.
If this limit is exceeded, the oldest Finished or Error action is discarded and is no longer accessible through the queueinfo action. The MaxScheduledSize and MaxQueueSize parameters together give the total number of actions that are stored (Queued, Processing, Finished, or Error). Queued, Processing, Finished, and Error are the action statuses reported by the queueinfo action.
Type: Integer
Default: The default is the maximum integer value (no limit).
Required: No
Configuration Section: Actions
Example: MaxScheduledSize=100
See Also: “MaxQueueSize” on page 120
OnError Use this parameter to specify the handler for the Fetch action error event.
This is the section name that will contain the LibraryName and any other settings for the event handler.
Typical configuration is (using OnFinish):
    [Actions]
    OnFinish=HttpHandler
    [HttpHandler]
    LibraryName=HttpHandler
    Url=http://localhost/dosomething?
Type: String
Default:
Required: No
Configuration Section: Actions
Example: OnError=EventHandler
See Also:
OnErrorReport Use this parameter to specify the handler for the Fetch action error report event.
Type: String
Default:
Required: No
Configuration Section: Actions
Example: OnErrorReport=EventHandler
See Also:
OnFinish Use this parameter to specify the handler for the Fetch action finish event.
Type: String
Default:
Required: No
Configuration Section: Actions
Example: OnFinish=EventHandler
See Also:
OnStart Use this parameter to specify the handler for the Fetch action start event.
Type: String
Default:
Required: No
Configuration Section: Actions
Example: OnStart=EventHandler
See Also:
Url Use this parameter to specify the URL to receive event data (only when LibraryName=HttpHandler is configured). Each handler type has specific configuration settings. For HttpHandler, the available settings are:
    Url
    SSLConfig
    ProxyHost
    ProxyPort
    ProxyUser
    ProxyPassword
    BasicUser
    BasicPassword
If the LibraryName parameter is set to Luahandler, the available parameter is LuaScript, which takes the path of the Lua script that handles the events. Lua script event handlers should be of the form:
    function handler(request, xml)
        ...
    end
request is a table holding the request parameters. xml is a string holding the response to the request.
Type: String
Default:
Required: No
Configuration Section: Actions
Example: Url=http://localhost/dosomething?param=value
Url=http://localhost:1234/?action=dosomething
See Also:
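A Lua event handler of the form described for the LuaScript setting might look like the sketch below, which prints one summary line per event. The helper summarize_event and the output format are assumptions made for illustration, and the sketch assumes that the keys of the request table are strings.

```lua
-- Hypothetical helper: builds one log line from the request parameters
-- and the size of the response.
function summarize_event(request, xml)
    local keys = {}
    for k in pairs(request) do
        keys[#keys + 1] = k
    end
    table.sort(keys)  -- pairs() order is unspecified, so sort for stable output
    local parts = {}
    for _, k in ipairs(keys) do
        parts[#parts + 1] = k .. "=" .. tostring(request[k])
    end
    return table.concat(parts, "&") .. " (" .. #xml .. " bytes)"
end

-- Event handler entry point: request is a table, xml is a string.
function handler(request, xml)
    print(summarize_event(request, xml))
end
```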
Import Service
The parameters in this section control the way the connector interfaces with KeyView to extract document sub-files in response to collect or view actions.
KeyviewDirectory Use this parameter to specify the location of the KeyView filters used for sub-file extraction. This is an alternative to setting the KEYVIEW_DIRECTORY environment variable. This parameter is used only if KeyView is required (if the EnableExtraction parameter is set to True).
Type: String
Default:
Required: No
Configuration Section: ImportService
Example: KeyviewDirectory=./filters
See Also: “EnableExtraction” on page 131
Distributed Connector
The parameters in this section control the way the connector behaves when used with the Distributed Connector.
ConnectorGroup Use this parameter to specify the name of the connector group to which this connector belongs. The ConnectorGroup parameter can take any value. It is configured only in the individual connectors, and is passed to the Distributed Connector when registering.
This parameter is used only if the RegisterConnector parameter is set to True.
Type: String
Default: Connector
Required: No
Configuration Section: DistributedConnector
Example: ConnectorGroup=Connector
See Also: “RegisterConnector” on page 126
ConnectorPriority Use this parameter to specify the priority value used to distribute actions to higher priority connectors.
This parameter is used only if the RegisterConnector parameter is set to True.
Type: Integer
Default: 0
Required: No
Configuration Section: DistributedConnector
Example: ConnectorPriority=1
See Also: “RegisterConnector” on page 126
DataPortN Use this parameter to specify the dataport number(s) of the Distributed Connector(s). This parameter is used only if the RegisterConnector parameter is set to True.
Type: Integer
Default:
Required: No
Configuration Section: DistributedConnector
Example: DataPort0=9876
See Also: “RegisterConnector” on page 126
HostN Use this parameter to specify the hostname(s) or IP address(es) of the Distributed Connector(s). This parameter is used only if the RegisterConnector parameter is set to True.
Type: String
Default: localhost
Required: No
Configuration Section: DistributedConnector
Example: Host0=localhost
See Also: “RegisterConnector” on page 126
PortN Use this parameter to specify the port number(s) of the Distributed Connector(s). This parameter is used only if the RegisterConnector parameter is set to True.
Type: Integer
Default: 10000
Required: No
Configuration Section: DistributedConnector
Example: Port0=10000
See Also: “RegisterConnector” on page 126
RegisterConnector Use this parameter to register with the Distributed Connector. The connector will wait at startup for registration to be successful.
If the synchronizestate parameter of an action=fetch request is set, it holds the locations of the datastore files that are used for the tasks. In this case, the DatastoreFile parameter is ignored. The synchronizestate parameter is set when the connector is used through the Distributed Connector.
Type: Boolean
Default: False
Required: No
Configuration Section: DistributedConnector
Example: RegisterConnector=False
SharedPath Use this parameter to specify the path to a location common to all connectors in the Connector Group. This location is used to store the compressed datastore files used by the connectors. This parameter is used only if the RegisterConnector parameter is set to True.
Type: String
Default: The value of the TempDirectory parameter.
Required: No
Configuration Section: DistributedConnector
Example: SharedPath=./temp
See Also: “RegisterConnector” on page 126
SSLConfigN Use this parameter to specify the section(s) containing SSL settings for the Distributed Connector(s). This parameter is used only if the RegisterConnector parameter is set to True.
Type: String
Default:
Required: No
Configuration Section: DistributedConnector
Example: SSLConfig0=SSL
See Also: “RegisterConnector” on page 126
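As an illustration, the Distributed Connector parameters described above might be combined as follows to register a connector with a single Distributed Connector. This is only a sketch: it reuses the example values from the individual parameter descriptions (with RegisterConnector switched to True so that the other settings take effect) and is not taken from a shipped configuration file.

```
[DistributedConnector]
RegisterConnector=True
ConnectorGroup=Connector
ConnectorPriority=1
Host0=localhost
Port0=10000
DataPort0=9876
SharedPath=./temp
```

Additional Distributed Connectors would presumably be listed with the next index (Host1, Port1, DataPort1), following the N suffix in the parameter names.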
View Server
The parameters in this section allow the connector’s view action to use a View Server.
EnableViewServer If this parameter is set to True, documents retrieved by a view action are processed by the View Server before being returned. If set to False, the original documents are returned.
Type: Boolean
Default: False
Required: No
Configuration Section: Connector and ViewServer
Example: EnableViewServer=False
Host Use this parameter to specify the hostname or IP address of the View Server. This parameter is used only if the EnableViewServer parameter is set to True.
Type: String
Default: localhost
Required: No
Configuration Section: ViewServer
Example: Host=localhost
See Also: “EnableViewServer” on page 128
Port Use this parameter to specify the port number of the View Server. This parameter is used only if the EnableViewServer parameter is set to True.
Type: Integer
Default: 9000
Required: No
Configuration Section: ViewServer
Example: Port=9000
See Also: “EnableViewServer” on page 128
SharedPath Use this parameter to specify the path to a location accessible by both the connector and the View Server. Intermediate files are stored here. This parameter is used only if the EnableViewServer parameter is set to True.
Type: String
Default: The value of the TempDirectory parameter.
Required: No
Configuration Section: ViewServer
Example: SharedPath=./temp
See Also: “EnableViewServer” on page 128
General Connector Parameters
CleanOnStart Set this parameter to True to delete actions and the temp directory on start. Any action data stored in the actions folder is deleted, including all queued actions.
Type: Boolean
Default: False
Required: No
Configuration Section: Connector
Example: CleanOnStart=True
DatastoreFile Use this parameter to override the name of the datastore file used by synchronize actions. Normally, you should use the default value for this parameter.
Type: String
Default: connector_TaskName_datastore.db
Required: No
Configuration Section: TaskName or FetchTasks
Example: DatastoreFile=./Datastore/Datastore.db
See Also: “SynchronizeKeepDatastore” on page 135
DatastoreDirectory Use this parameter to specify the directory where datastore files are stored (except when you use the DistributedConnector section or have specified the DatastoreFile parameter).
Type: String
Default: .
Required: No
Configuration Section: Connector
Example: DatastoreDirectory=./Datastore/
See Also: “DatastoreFile” on page 130
EnableExtraction Use this parameter to enable the extraction of sub-files for collect and view actions. This requires the KeyView filters to be present and their location to be specified in the KeyviewDirectory parameter or the KEYVIEW_DIRECTORY environment variable.
Type: Boolean
Default: False
Required: No
Configuration Section: Connector
Example: EnableExtraction=False
See Also: “KeyviewDirectory” on page 123
EnableExtractionCopy Generally, this parameter is relevant only to the File System Connector, which acts on the original documents instead of temporary copies. When performing extraction from certain file types, KeyView has side effects that update the document. Specifically, the modified date is updated, which causes the connector to re-ingest the document on the next synchronize action.
To avoid these modifications, set this parameter to True so that the connector makes a copy of the original document and performs the extraction on the copy. For other connectors, enabling this setting has no effect, because those connectors download temporary copies and have ownership of the files.
Type: Boolean
Default: False
Required: No
Configuration Section: Connector
Example: EnableExtractionCopy=False
See Also: “KeyviewDirectory” on page 123
EnableScheduledTasks Use this parameter to enable internal scheduling of synchronize actions. When this is set to True, the numbered tasks configured in the FetchTasks section will be performed according to their schedules. If this is set to False, synchronize actions will be performed only in response to an ACI request.
Type: Boolean
Default: True
Required: No
Configuration Section: Connector
Example: EnableScheduledTasks=False
EncryptACLEntries If this parameter is set to False, the entries in ACLs will not be encrypted. This should be used only for troubleshooting.
NOTE Some connectors allow this parameter to be set in the Task section. Connectors that support security do not necessarily support this parameter.
Type: Boolean
Default: True
Required: No
Configuration Section: Connector
Example: EncryptACLEntries=True
HashedDestinationDirectory Set this parameter to True to use sub-directories within the collect destination directory. The collect destination is a location on the file system specified by the destination parameter of the Collect fetch action:
/action=fetch&fetchaction=collect&identifiers=<...>&destination=\\foo\bar
Type: Boolean
Default: False
Required: No
Configuration Section: Connector
Example: HashedDestinationDirectory=True
See Also: “TempDirectory” on page 138
HashedTempDirectory Set this parameter to True to use sub-directories within TempDirectory.
Type: Boolean
Default: False
Required: No
Configuration Section: Connector
Example: HashedTempDirectory=True
See Also: “TempDirectory” on page 138
InsertActions Use this parameter to perform actions on each document before insertion. Each action is in the form ACTIONNAME:ACTIONPARAMETERS. Possible actions are:
Action    Parameters             Example
META      field=value            InsertActions=META:MyField=MyValue
LUA       lua_script_filename    InsertActions=LUA:myLuaScript.lua
Type: String
Default:
Required: No
Configuration Section: Connector
Example: InsertActions=META:MyField=MyValue
See Also: “IngestActions” on page 143
InsertFailedDirectory Use this parameter to specify the directory where failed insert commands are written.
Type: String
Default: ./insertfailed
Required: No
Configuration Section: Connector
Example: InsertFailedDirectory=./insertfailed
MinFreeSpaceMB The MinFreeSpaceMB parameter defines the minimum amount of free disk space (in megabytes) that must be available for a fetch action to be processed. If the specified amount of free space is not available, the fetch action is not processed and an error is returned in the ACI response.
Type: Integer
Default: 1024
Required: No
Configuration Section: Connector
Example: MinFreeSpaceMB=1024
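The check that MinFreeSpaceMB describes can be pictured with the following Java sketch. It is not the connector's implementation: the class and method names are hypothetical, one megabyte is assumed to be 1024 x 1024 bytes, and which directory the connector actually inspects is not stated in this guide.

```java
import java.io.File;

// Hypothetical helper, not part of ConnectorLib: illustrates a MinFreeSpaceMB-style check.
final class FreeSpaceCheck {
    // Assumption: one megabyte is 1024 * 1024 bytes.
    private static final long BYTES_PER_MB = 1024L * 1024L;

    private FreeSpaceCheck() {}

    // Pure comparison, kept separate so it can be exercised without touching the disk.
    static boolean hasEnoughSpace(long usableBytes, int minFreeSpaceMB) {
        return usableBytes >= minFreeSpaceMB * BYTES_PER_MB;
    }

    // File.getUsableSpace() reports the bytes available to this JVM on the partition.
    static boolean canProcessFetch(File directory, int minFreeSpaceMB) {
        return hasEnoughSpace(directory.getUsableSpace(), minFreeSpaceMB);
    }
}
```

With the default value of 1024, a partition with exactly 1 GB of usable space passes this check, and one with a single byte less does not.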
SynchronizeKeepDatastore When this parameter is set to True, a datastore file (.db) will remain on the disk after a synchronize action has been performed. When the synchronize action is next performed, this file will be checked by the connector so that it can index only documents that have changed since the last synchronize and can delete documents that have been deleted since the last synchronize.
If this parameter is set to False, the datastore file is deleted at the end of each synchronize action. The next synchronize action will fetch all documents and will not delete old documents.
Type: Boolean
Default: True
Required: No
Configuration Section: Connector
Example: SynchronizeKeepDatastore=False
SynchronizeThreads Use this parameter to specify the number of threads to use for synchronization if the connector supports multi-threading. This parameter will not have an effect if the connector does not support multi-threading. In cases where this method is not supported by the connector, multiple tasks can be executed using the alternative TaskThreads setting.
Type: Integer
Default: 5
Required: No
Configuration Section: Connector
Example: SynchronizeThreads=4
TaskMaxAdds Use this parameter to specify the maximum number of Adds to be processed by the task. The value 0 indicates an infinite number.
Type: Integer
Default: 0
Required: No
Configuration Section: TaskName
Example: TaskMaxAdds=0
TaskMaxDuration Use this parameter to specify the maximum duration of the task in the format H[H][:MM][:SS].
Type: String
Default:
Required: No
Configuration Section: TaskName
Example: TaskMaxDuration=12:30:00
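The H[H][:MM][:SS] format, which is also used by the ScheduleStartTime parameter, can be read as one or two hour digits followed by optional minutes and optional seconds. The following Java sketch converts such a value to seconds. The class is hypothetical and not part of ConnectorLib, and it assumes that a single trailing field is minutes rather than seconds.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of ConnectorLib: parses H[H][:MM][:SS] into seconds.
final class DurationFormat {
    // One or two hour digits, then an optional :MM and an optional :SS.
    private static final Pattern FORMAT =
            Pattern.compile("(\\d{1,2})(?::(\\d{2}))?(?::(\\d{2}))?");

    private DurationFormat() {}

    static long toSeconds(String value) {
        Matcher m = FORMAT.matcher(value.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Expected H[H][:MM][:SS] but got: " + value);
        }
        long hours = Long.parseLong(m.group(1));
        // Assumption: omitted fields count as zero.
        long minutes = m.group(2) == null ? 0 : Long.parseLong(m.group(2));
        long seconds = m.group(3) == null ? 0 : Long.parseLong(m.group(3));
        if (minutes > 59 || seconds > 59) {
            throw new IllegalArgumentException("Minutes and seconds must be below 60: " + value);
        }
        return hours * 3600 + minutes * 60 + seconds;
    }
}
```

For the documented example, DurationFormat.toSeconds("12:30:00") yields 45000 seconds.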
TaskThreads Use this parameter to specify the number of tasks that can be performed simultaneously for an action. Each action=fetch action can result in several tasks. Each task generally consists of performing a subset of the action using a particular configuration section. For example, a single synchronize action can mean performing a synchronize action for each of several configured tasks. For collect and other actions where identifiers are provided, the identifiers are tied to particular configuration sections, so the whole action can span several configuration sections. Generally, a single task corresponds to a particular configuration section, and each task is processed on a separate thread.
Type: Integer
Default: 1
Required: No
Configuration Section: Connector
Example: TaskThreads=1
TempDirectory Use this parameter to specify the directory used to store temporary documents and files.
Type: String
Default: ./temp
Required: No
Configuration Section: Connector
Example: TempDirectory=./TempFiles
XsltDLL Use this parameter to specify the location of the autnxslt library.
Type: String
Default: autnxslt.dll (if present)
Required: No
Configuration Section: Connector
Example: XsltDLL=autnxslt.dll
Fetch Task Configuration
Each action the connector performs consists of one or more tasks. Each task is associated with a section in the configuration file. The section to use is either specified in an action parameter or encoded in each document identifier supplied to the action. The parameters below let you specify a numbered list of tasks. This is the set of tasks that will be performed when the connector performs a synchronize action whose parameters do not specify which tasks should be performed. The connector will also run synchronize actions for these tasks automatically according to the configured schedules.
IngestConfigSection Use this parameter to specify an alternative configuration section from which ingest settings are read in preference to the task section.
Type: String
Default: task name
Required: No
Configuration Section: TaskName or FetchTasks or Connector
Example: IngestConfigSection=MyTask1
N Use this parameter to specify the name of the task section containing the parameters for the synchronize task to be performed. The task will only be performed if N is less than Number and greater than or equal to 0.
Type: String
Default:
Required: No
Configuration Section: FetchTasks or Connector
Example: Number=2
0=Task1
1=Task2
See Also: “Number” on page 140
Number The connector schedules the tasks with the names specified by the numbered parameters 0 through Number-1. Numbers may be missing from the sequence. Alternatively, leave the Number parameter at its default value of -1. In this case, the tasks configured from 0 until the first missing parameter are used.
For example, this configuration executes Task0 and Task2:
[FetchTasks]
Number=3
0=Task0
2=Task2
This configuration executes only Task0:
[FetchTasks]
0=Task0
2=Task2
Type: Integer
Default: -1
Required: No
Configuration Section: FetchTasks or Connector
Example: Number=2
0=Task1
1=Task2
See Also: “N” on page 139
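The interaction between the N and Number parameters can be sketched in Java as follows. This is an illustration of the rules stated above rather than the connector's own code: the class name is hypothetical, and the section is assumed to have been loaded already into a map of parameter names to values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of ConnectorLib: resolves the scheduled task names
// from the contents of a [FetchTasks] section held as a key/value map.
final class FetchTaskList {
    private FetchTaskList() {}

    static List<String> resolve(Map<String, String> section) {
        int number = Integer.parseInt(section.getOrDefault("Number", "-1"));
        List<String> tasks = new ArrayList<>();
        if (number < 0) {
            // Default (-1): read 0, 1, 2, ... and stop at the first missing entry.
            for (int n = 0; section.containsKey(Integer.toString(n)); n++) {
                tasks.add(section.get(Integer.toString(n)));
            }
        } else {
            // Explicit Number: consider 0 through Number-1 and skip any gaps.
            for (int n = 0; n < number; n++) {
                String name = section.get(Integer.toString(n));
                if (name != null) {
                    tasks.add(name);
                }
            }
        }
        return tasks;
    }
}
```

Applied to the two configurations shown for the Number parameter, this sketch returns Task0 and Task2 for the first and only Task0 for the second.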
ScheduleCycles Use this parameter to specify the number of scheduled synchronize actions to perform.
The value -1 specifies to repeat forever. The value 0 specifies to perform the task once. Any other positive value specifies the number of times to perform the task. This parameter has an effect only if the EnableScheduledTasks parameter is set to True.
Type: Integer
Default: -1
Required: No
Configuration Section: TaskName or FetchTasks or Connector
Example: ScheduleCycles=3
See Also: “EnableScheduledTasks” on page 132
ScheduleRepeatSecs Use this parameter to specify the interval (in seconds) between scheduled synchronize actions. This parameter has an effect only if the EnableScheduledTasks parameter is set to True.
Type: Integer
Default: 86400
Required: No
Configuration Section: TaskName or FetchTasks or Connector
Example: ScheduleRepeatSecs=3600
See Also: “EnableScheduledTasks” on page 132
ScheduleStartTime Use this parameter to specify the start time of the first scheduled synchronize action in the format H[H][:MM][:SS]. This parameter has an effect only if the EnableScheduledTasks parameter is set to True.
Type: String
Default:
Required: No
Configuration Section: TaskName or FetchTasks or Connector
Example: ScheduleStartTime=14:30:00
See Also: “EnableScheduledTasks” on page 132
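As an illustration, the fetch task and scheduling parameters might be combined as shown below. This is a sketch assembled from the example values in the parameter descriptions, not a shipped configuration. The task section names Task1 and Task2 are placeholders, and the repository-specific parameters that each task section would also need are omitted.

```
[Connector]
EnableScheduledTasks=True

[FetchTasks]
Number=2
0=Task1
1=Task2

[Task1]
ScheduleStartTime=14:30:00
ScheduleRepeatSecs=3600
ScheduleCycles=3
```

Because Task2 sets no scheduling parameters of its own in this sketch, it would fall back to values set in the FetchTasks or Connector section, or to the documented defaults (ScheduleRepeatSecs=86400 and ScheduleCycles=-1).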
Ingestion
The parameters in this section specify where the documents fetched by the synchronize action should be sent.
EnableIngestion Set this parameter to True if documents fetched by the synchronize action should be sent to the CFS or to another connector.
Type: Boolean
Default: True
Required: No
Configuration Section: TaskName or Ingestion or Connector
Example: EnableIngestion=False
IndexDatabase Use this parameter to specify the value assigned to the DREDBNAME field for all documents.
Type: String
Default:
Required: No
Configuration Section: TaskName or Ingestion
Example: IndexDatabase=News
IngestActions The actions specified in this comma-separated list are performed on each document before it is sent to the CFS. Each action is in the form ACTIONNAME:ACTIONPARAMETERS. Possible actions are:
Action    Parameters             Example
META      field=value            IngestActions=META:MyField=MyValue
LUA       lua_script_filename    IngestActions=LUA:myLuaScript.lua
Type: String
Default:
Required: No
Configuration Section: TaskName or Ingestion
Example: IngestActions=META:MyField=MyValue
See Also: “InsertActions” on page 134
Related Topics “Use Lua Scripts” on page 67
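The ACTIONNAME:ACTIONPARAMETERS form used by IngestActions (and by InsertActions) can be split as in the following Java sketch. The class is hypothetical and not part of ConnectorLib. It assumes that action parameters never contain a comma, and it splits each entry at the first colon only, so that parameters which themselves contain a colon are preserved.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of ConnectorLib: splits an IngestActions value
// into {action name, action parameters} pairs.
final class DocumentActions {
    private DocumentActions() {}

    // Returns the pairs in the order in which they were configured.
    static List<String[]> parse(String csv) {
        List<String[]> actions = new ArrayList<>();
        // Assumption: commas only separate actions and never appear inside parameters.
        for (String entry : csv.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            int colon = trimmed.indexOf(':');
            if (colon < 0) {
                throw new IllegalArgumentException("Missing ':' in action: " + trimmed);
            }
            // Split at the first colon only; the rest belongs to the parameters.
            actions.add(new String[] {
                trimmed.substring(0, colon), trimmed.substring(colon + 1)
            });
        }
        return actions;
    }
}
```

For the value META:MyField=MyValue,LUA:myLuaScript.lua, the sketch produces the pairs (META, MyField=MyValue) and (LUA, myLuaScript.lua).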
IngestAddAsUpdate If you set this parameter to True, Add commands are treated as Updates for full metadata updating.
Type: Boolean
Default: False
Required: No
Configuration Section: TaskName or Ingestion
Example: IngestAddAsUpdate=False
IngestBatchSize Use this parameter to specify the number of documents that are sent to the CFS in a single batch.
This parameter has an effect only if the EnableIngestion parameter is set to True.
Type: Integer
Default: 100
Required: No
Configuration Section: TaskName or Ingestion
Example: IngestBatchSize=200
See Also: “EnableIngestion” on page 142
IngestCheckFinished If the IngesterType parameter is set to CFS, setting the IngestCheckFinished parameter to True will cause the connector to wait until documents have been added to the import queue before returning success. If sending to another connector, this connector will wait until the action completes. More specifically, the task is held in a queue, and each time the connector attempts to send more data to the destination connector, it will check the status of the previous actions sent and will only mark those tasks as complete once the status returns Finished. This parameter has an effect only if the EnableIngestion parameter is set to True.
Type: Boolean
Default: False
Required: No
Configuration Section: TaskName or Ingestion
Example: IngestCheckFinished=True
See Also: “EnableIngestion” on page 142
“IngesterType” on page 147
IngestConnectorConfigSection Use this parameter to specify the section to use for the destination connector. This parameter applies only when IngesterType=Connector. Normally, when a document is sent to another connector for insertion, the destination connector must use the same configuration section name as the source connector task name for any settings required for the insertion of the document. This parameter allows the default configuration section for the task to be overridden. For example, Connector1 performs task Task1, which retrieves a document. Connector2 would normally be forced to configure the insertion task from the Task1 section of its own configuration file. This parameter allows a different section to be used specifically for the insertion.
Type: String
Default: IngestConfigSection
Required: No
Configuration Section: TaskName or Ingestion
Example: IngestConnectorConfigSection=IngestConfigSection
See Also: “IngesterType” on page 147
IngestDataPort Use this parameter to specify the DataPort number of the destination server.
This parameter has an effect only if the EnableIngestion parameter is set to True.
Type: Integer
Default:
Required: No
Configuration Section: TaskName or Ingestion
Example: IngestDataPort=7051
See Also: “EnableIngestion” on page 142
IngestDelayMS Use this parameter to specify the number of milliseconds to pause between adding individual documents to the ingest queue.
Type: Integer
Default: 0
Required: No
Configuration Section: Ingestion or TaskName
Example: IngestDelayMS=0
IngestEnableAdds Use this parameter to specify whether or not Add commands should be sent.
Type: Boolean
Default: True
Required: No
Configuration Section: Ingestion or TaskName
Example: IngestEnableAdds=True
IngestEnableDeletes Use this parameter to specify whether or not Delete commands should be sent.
Type: Boolean
Default: True
Required: No
Configuration Section: Ingestion or TaskName
Example: IngestEnableDeletes=True
IngestEnableUpdates Use this parameter to specify whether or not Update commands should be sent.
Type: Boolean
Default: True
Required: No
Configuration Section: Ingestion or TaskName
Example: IngestEnableUpdates=True
IngestHashedSharedPath Use this parameter to specify whether to use sub-directories within IngestSharedPath.
Type: String
Default: HashedTempDirectory
Required: No
Configuration Section: TaskName or Ingestion
Example: IngestHashedSharedPath=HashedTempDirectory
See Also: “IngestSharedPath” on page 150
IngesterType Use this parameter to specify the type of ingestion process. The only allowed values are CFS, AsyncPiranha (alias for CFS), Connector, and ConnectorInsert (alias for Connector). If this parameter is set to CFS, the IngestHost and IngestPort parameters point to a Connector Framework Server (CFS) (which can be used to import the documents and index them).
If this parameter is set to Connector, the IngestHost and IngestPort parameters point to another connector. Documents fetched from this repository by the synchronize action will be inserted into another repository using the connector specified. If this option is used, you will probably need to use the IngestActions parameter to convert the document into a form that can be handled by the other connector.
Note that the synchronize action can result in Add, Update and Delete ingest commands. Adds result in insert actions, Updates result in update actions, and Deletes result in delete actions being sent to the destination connector. This parameter has an effect only if the EnableIngestion parameter is set to True.
Type: String
Default: AsyncPiranha
Required: No
Configuration Section: TaskName or Ingestion
Example: IngesterType=AsyncPiranha
See Also: “EnableIngestion” on page 142
“IngestActions” on page 143
“IngestHost” on page 148
“IngestPort” on page 149
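Because two of the four allowed IngesterType values are aliases, code that reads this parameter may find it convenient to reduce them to the two underlying ingestion types. The following Java sketch shows one way to do so. The class and enum names are hypothetical, and the case-insensitive comparison and the fallback to the documented default for an unset value are assumptions rather than documented behavior.

```java
import java.util.Locale;

// Hypothetical helper, not part of ConnectorLib: maps IngesterType values and
// their aliases onto the two underlying ingestion types.
final class IngesterTypes {
    enum Kind { CFS, CONNECTOR }

    private IngesterTypes() {}

    static Kind normalize(String configured) {
        // Assumption: an unset parameter falls back to the documented default, AsyncPiranha.
        String value = configured == null ? "AsyncPiranha" : configured.trim();
        // Assumption: values are compared without regard to case.
        switch (value.toLowerCase(Locale.ROOT)) {
            case "cfs":
            case "asyncpiranha":      // alias for CFS
                return Kind.CFS;
            case "connector":
            case "connectorinsert":   // alias for Connector
                return Kind.CONNECTOR;
            default:
                throw new IllegalArgumentException("Unsupported IngesterType: " + configured);
        }
    }
}
```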
IngestHost Use this parameter to specify the hostname or IP address of the destination server.
This parameter has an effect only if the EnableIngestion parameter is set to True.
Type: String
Default: localhost
Required: No
Configuration Section: TaskName or Ingestion
Example: IngestHost=localhost
See Also: “EnableIngestion” on page 142
IngestKeepFiles If this parameter is set to True, downloaded documents will not be deleted after they have been ingested.
Type: Boolean
Default: False
Required: No
Configuration Section: TaskName or Ingestion
Example: IngestKeepFiles=True
See Also: “EnableIngestion” on page 142
IngestPort Use this parameter to specify the port number of the destination server.
This parameter has an effect only if the EnableIngestion parameter is set to True.
Type: Integer
Default: 7000
Required: No
Configuration Section: TaskName or Ingestion
Example: IngestPort=7050
See Also: “EnableIngestion” on page 142
IngestSendByType Use this parameter to specify whether or not to send Add, Update, and Delete commands separately.
Type: Boolean
Default: False
Required: No
Configuration Section: TaskName or Ingestion
Example: IngestSendByType=False
IngestSharedPath Use this parameter to specify the location to which documents are saved before ingestion. This should be a path accessible by both the connector and the ingest server.
Type: String
Default: The value of the TempDirectory parameter.
Required: No
Configuration Section: TaskName or Ingestion
Example: IngestSharedPath=./TempDirectory
See Also: “EnableIngestion” on page 142
IngestSSLConfig Use this parameter to specify the configuration file section containing the SSL settings that should be used when communicating with the CFS. For more information on the contents of this section, refer to “Secure Socket Layer Parameters” on page 235.
Type: String
Default:
Required: No
Configuration Section: TaskName or Ingestion
Example: IngestSSLConfig=SSLSettings
See Also: “EnableIngestion” on page 142
IngestWriteIDX If the IngestKeepFiles parameter is set to True, setting the IngestWriteIDX parameter to True causes the connector to write document metadata in a stub IDX file alongside the document.
Type: Boolean
Default: False
Required: No
Configuration Section: TaskName or Ingestion
Example: IngestWriteIDX=False
See Also: “EnableIngestion” on page 142
GroupServer
The parameters in this section are used when performing an update to an instance of OmniGroupServer using the SynchronizeGroups fetch action.
GroupServerHost Use this parameter to specify the host name or IP address of the group server.
Type: String
Default: localhost
Required: Yes
Configuration Section: TaskName or GroupServer
Example: GroupServerHost=localhost
GroupServerPort Use this parameter to specify the port number of the group server.
Type: Integer
Default: 3057
Required: Yes
Configuration Section: TaskName or GroupServer
Example: GroupServerPort=3057
GroupServerRepository Use this parameter to specify the group server repository name. This is the name of any repository section in an OmniGroupServer configuration file. It is named based on the name of the repository the connector is retrieving from.
Type: String
Default:
Required: Yes
Configuration Section: TaskName or GroupServer
Example: GroupServerRepository=RepositoryGroups
GroupServerSSLConfig Use this parameter to specify the section containing SSL settings for the group server.
Type: String
Default:
Required:
Configuration Section: TaskName or GroupServer
Example: GroupServerSSLConfig=SSLConfig1
Related Topics “Synchronize Groups Fetch Action” on page 164
CHAPTER 9 Parameters Common to CFS Connectors Using Java
This chapter describes the parameters that specify details related to the CFS connectors’ Java files. These parameters are found in the [Connector] section.
JavaClassPath Specify the class path to use, including all dependencies of the connector. This must include the class specified by the JavaConnectorClass parameter. The Java class path must point to all the jar files found in the lib directory when installed. This must include the JavaConnector.jar and will usually include one other connector jar (ConnectorName.jar) file plus any number of dependencies and properties files or other resources required by the libraries.
If the connector is run through Java directly (using the ConnectorLibJava library), the CLASSPATH should be set on the command line. For example:
java -classpath
Type: String
Default: None
Required: Yes
Configuration Section: Connector
Example: JavaClassPath=./lib/JavaConnector.jar;./lib/MyConnector.jar
or
JavaClassPath0=./lib/JavaConnector.jar
JavaClassPath1=./lib/MyConnector.jar
See Also: “JavaConnectorClass” on page 156
JavaConnectorClass Specify the full class name of the Java class that contains the implementation of the connector. Package separators can be either slash (/) or dot (.).
Type: String
Default: None
Required: Yes
Configuration Section: Connector
Example: JavaConnectorClass=com.autonomy.connector.example.FileSystemConnector
See Also: “JavaClassPath” on page 155
JavaLibraryPath Set this parameter to the location of any additional native libraries required by any of the connector classes or dependencies.
Type: String
Default: ““
Required: No
Configuration Section: Connector
Example: JavaLibraryPath=./lib
See Also: “JavaClassPath” on page 155
JavaMaxMemoryMB Set this parameter to the maximum memory (in megabytes) to be allocated to the Java virtual machine.
Type: Integer
Default: 64
Required: No
Configuration Section: Connector
Example: JavaMaxMemoryMB=256
See Also: “JavaClassPath” on page 155
JVMLibraryPath Use this parameter to specify the location of the jvm library (for example, jvm.dll or libjvm.so). The connector will look for the library in various locations specified by the following:
JVMLibraryPath parameter
Location set in the Windows registry
JAVA_HOME environment variable
System library path (PATH, LD_LIBRARY_PATH, and so forth)
Type: String
Default: None
Required: No, however, the connector will not start if the jvm library cannot be found or loaded.
Configuration Section: Connector
Example: JVMLibraryPath=./jre/bin/client
JavaVerboseGC Set this parameter to True to enable verbose garbage collection (for debugging purposes only).
Type: Boolean
Default: False
Required: No
Configuration Section: Connector
Example: JavaVerboseGC=False
CHAPTER 10 CFS Connector Actions
CFS connectors may provide one or more of the ACI actions described here. Not all connectors support all the actions. The sample HTTP requests in this section are split across multiple lines for readability. When using these requests, the whole request should be on one line and contain no spaces. Brackets ([]) enclosing a parameter indicate that the parameter is optional.
Synchronous Versus Asynchronous Actions
QueueInfo Action
Synchronize Fetch Action
Synchronize Groups Fetch Action
Collect Fetch Action
Identifiers Fetch Action
Insert Fetch Action
Delete/Remove Fetch Action
Hold and ReleaseHold Fetch Actions
Update Action
View Action
StopFetch Action
Synchronous Versus Asynchronous Actions
Some of the actions described here are synchronous and others are asynchronous. The connector does not respond to a synchronous action until it has completed the request. The result of the action is in the response to the request. An asynchronous action responds immediately; the request is added to a queue of actions to be performed. The response to the request contains a token. You can use this token to determine whether the request has finished and the results of the action. You can do this using the QueueInfo action.
Example http://localhost:1234/action=Fetch&FetchAction=Synchronize
Response
QueueInfo Action
The QueueInfo action provides information about the asynchronous actions that CFS or a connector is processing. Use this action to determine whether a task has completed and retrieve the results of the task.
http://host:port/action=QueueInfo
&QueueName=QueueName
&QueueAction=QueueAction
[&Token=Token]
QueueInfo is a synchronous action.
Parameter Name Description
QueueName The name of the queue you wish to retrieve information about. There is one queue per asynchronous action. Most of the connector’s functionality is accessed through action=Fetch, so usually you should specify Fetch.
QueueAction The action you wish to perform on the queue. Possible actions are: GetStatus. The response provides information about the action currently on the queue.
Token This restricts the response to information about the action identified by the token.
Example
http://localhost:1234/action=QueueInfo&QueueName=Fetch&QueueAction=GetStatus
Response A sample response appears below. Each action in the queue appears between
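The request format above can be assembled programmatically. The following Java fragment is a minimal sketch, not part of ConnectorLib: the class and method names are hypothetical, and it only builds the request string without contacting a connector. It assumes that parameter values should be percent-encoded.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not a ConnectorLib class): builds a QueueInfo request.
class QueueInfoRequest {

    // Returns a GetStatus request for the named queue. The token is optional;
    // pass null to ask about every action on the queue.
    static String statusUrl(String host, int port, String queueName, String token) {
        StringBuilder url = new StringBuilder()
                .append("http://").append(host).append(':').append(port)
                .append("/action=QueueInfo")
                .append("&QueueName=").append(encode(queueName))
                .append("&QueueAction=GetStatus");
        if (token != null && !token.isEmpty()) {
            // Restricts the response to the action identified by the token.
            url.append("&Token=").append(encode(token));
        }
        return url.toString();
    }

    // Assumption: values are percent-encoded as UTF-8 before they are sent.
    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(statusUrl("localhost", 1234, "Fetch", null));
    }
}
```

Called with a null token, the helper reproduces the example request shown above. A caller would typically repeat the request with the token returned by an asynchronous action until the response reports that the action has finished.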
Synchronize Fetch Action
This action is used to search a repository for document updates and send these updates to an Ingestion module.
http://host:port/action=Fetch&FetchAction=Synchronize
[&Config=Base64_Config]
[&TaskSections=Section_CSV]
[&IngestActions=Document_Action_CSV]
Type: Asynchronous
Parameter Name Description
Config Optional Base64-encoded configuration file. If this parameter is specified, the encoded configuration options are used instead of the options in the connector configuration file.
TaskSections The names of the task sections to use to perform synchronization. If this parameter is unspecified, all configured task sections are used.
IngestActions This parameter specifies actions to perform on documents before they are ingested. This can be a list of document actions of the form action:parameters, processed from left to right. The available document actions are:
META. Add a custom field to the document, specified as META:Fieldname=FieldValue
LUA. Execute a Lua script on the document, specified as LUA:Luascript
Example: To add a field CATEGORY=FILESYSTEM to every document, specify the ingest action as:
IngestActions=META:CATEGORY=FILESYSTEM
Any commas in the action parameters should be escaped with a backslash (\).
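The comma-escaping rule can be illustrated with a short Java sketch. The class and method names below are hypothetical and are not part of ConnectorLib. The sketch assumes that only commas need escaping; the guide does not say how a literal backslash should be treated.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a ConnectorLib class): builds an IngestActions value.
class IngestActionList {

    // Escapes commas in an action's parameters with a backslash, because an
    // unescaped comma separates one action from the next.
    static String escapeCommas(String actionParameters) {
        return actionParameters.replace(",", "\\,");
    }

    // Builds a META action of the form META:Fieldname=FieldValue.
    static String meta(String fieldName, String fieldValue) {
        return "META:" + fieldName + "=" + escapeCommas(fieldValue);
    }

    // Joins the actions in the order in which they should be processed.
    static String join(List<String> actions) {
        return String.join(",", actions);
    }

    public static void main(String[] args) {
        List<String> actions = new ArrayList<>();
        actions.add(meta("CATEGORY", "FILESYSTEM"));
        actions.add(meta("OWNER", "Smith, J")); // hypothetical field and value
        System.out.println("IngestActions=" + join(actions));
    }
}
```

The resulting value still has to be URL-encoded before it is placed in a request.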
Example http://host:port/action=Fetch&FetchAction=Synchronize
Response A sample response appears below. In this example, two tasks were performed as part of the synchronize (DIR1 and DIR2). Both of these found 10 new documents, but ingestion failed for all 20 documents.
Synchronize Groups Fetch Action
This action is used to search a repository for group updates and send these updates to an Ingestion module.
http://host:port/action=Fetch&FetchAction=SynchronizeGroups
[&Config=Base64_Config]
[&TaskSections=Section_CSV]
Type: Asynchronous
Parameter Name Description
Config Optional Base64-encoded configuration file. If this parameter is specified, the encoded configuration options are used instead of the options in the connector configuration file.
TaskSections The names of the task sections to use to perform synchronization. If this parameter is unspecified, all configured task sections are used. As a minimum, the sections should include the GroupServerHost and GroupServerPort parameters, in addition to any connector-specific parameters.
Example http://host:port/action=Fetch&FetchAction=SynchronizeGroups
Response A sample response appears below. In this example, two tasks were performed as part of the synchronize groups (GROUPS1 and GROUPS2).
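A value for the Config parameter can be prepared as in the following Java sketch. This is an illustration under stated assumptions, not the connector's own tooling: the class name is hypothetical, the configuration text is assumed to be UTF-8, and the host and port values are placeholders. Only the GROUPS1 section name and the GroupServerHost and GroupServerPort parameter names come from this guide.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper (not a ConnectorLib class): encodes configuration text
// for use as the Config parameter of a fetch action.
class EncodedConfig {

    static String toConfigParameter(String configText) {
        // Assumption: the configuration is treated as UTF-8 text.
        String base64 = Base64.getEncoder()
                .encodeToString(configText.getBytes(StandardCharsets.UTF_8));
        // Base64 output can contain '+', '/' and '=', so percent-encode it
        // before adding it to the request.
        return URLEncoder.encode(base64, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Placeholder values; replace them with the settings of a real task section.
        String config = "[GROUPS1]\n"
                + "GroupServerHost=localhost\n"
                + "GroupServerPort=3057\n";
        System.out.println("http://host:port/action=Fetch&FetchAction=SynchronizeGroups"
                + "&Config=" + toConfigParameter(config));
    }
}
```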
Collect Fetch Action
This action retrieves documents and metadata from a repository by their identifiers and sends the documents either to the ingest queue or to a specified destination.
http://host:port/action=Fetch&FetchAction=Collect
[&Config=Base64_Config]
[&Identifiers=Identifier_CSV]
[&IdentifiersXML=Identifier_XML]
[&CollectActions=Document_Action_CSV]
[&Destination=UNC_Path]
Type: Asynchronous
Parameter Name Description
Config Optional Base64 encoded configuration file. If this parameter is specified, then the encoded configuration options are used instead of the options in the connector configuration file.
failedDirectory The directory in which the action will report failures.
Identifiers A CSV of the identifiers of the documents to be collected.
IdentifiersXML This parameter can be specified in addition to the Identifiers parameter and specifies additional identifiers to collect along with a set of custom metadata to be associated with each collected document. This data should be provided in XML format as below:
Destination Output destination as UNC Path. If this is blank, the documents are added to the ingest queue. The parameter can use fields from the document or identifier to construct the resulting destination for each document. To add a document field value as part of the destination, use the tag
CollectActions This parameter specifies actions to perform on documents before they are transferred to their destination. This can be a list of document actions of the form action:parameters, processed from left to right. The available document actions are:
META. Add a custom field to the document, specified as META:Fieldname=FieldValue
ZIP. Add the document to a zip file, specified as ZIP:Filename[:Password]
LUA. Execute a Lua script on the document, specified as LUA:Luascript
Example: To add a field CATEGORY=FILESYSTEM to every document, zip all documents with a password, and add a field COLLECTTIME=1234567890 to the zip, specify the collect action as:
CollectActions=META:CATEGORY=FILESYSTEM,ZIP:Output.zip:password,META:COLLECTTIME=1234567890
Any commas in the action parameters should be escaped with a backslash (\).
Example http://localhost:1234/action=Fetch&FetchAction=Collect&Identifiers=PGlkIHM9IkRJUjEiIHI9IkM6XEF1dG9ub215XEZpbGVTeXN0ZW1Db25uZWN0b3JDRlNcZGlyMVxmaWxlOS50eHQiLz4%3D,PGlkIHM9IkRJUjEiIHI9IkM6XEF1dG9ub215XEZpbGVTeXN0ZW1Db25uZWN0b3JDRlNcZGlyMVxmaWxlOC50eHQiLz4%3D&Destination=C:\Autonomy\collected
Response As this is an asynchronous action, you receive a token in response to the request. A sample response to the action (as retrieved using the QueueInfo action) appears below.
In this example the tokens for both documents appear between
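The identifiers in the example request are percent-encoded Base64 strings. The following Java sketch decodes the first identifier from the example so that it can be inspected. The class name is hypothetical. The decoded form is shown only as an observation about this example; the identifier format is connector-specific and should be treated as opaque.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper (not a ConnectorLib class): decodes an identifier as it
// appears in a request, for inspection only.
class IdentifierInspector {

    static String decode(String urlEncodedIdentifier) {
        // Undo the percent-encoding first (for example, %3D becomes '=').
        // Caveat: URLDecoder turns a literal '+' into a space, so this sketch
        // expects any '+' in the Base64 text to arrive as %2B.
        String base64 = URLDecoder.decode(urlEncodedIdentifier, StandardCharsets.UTF_8);
        return new String(Base64.getDecoder().decode(base64), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String first = "PGlkIHM9IkRJUjEiIHI9IkM6XEF1dG9ub215XEZpbGVTeXN0ZW1Db25uZWN0b3JDRlNc"
                + "ZGlyMVxmaWxlOS50eHQiLz4%3D";
        System.out.println(decode(first));
    }
}
```

For the first identifier in the example, the decoded text names the task section DIR1 and the file C:\Autonomy\FileSystemConnectorCFS\dir1\file9.txt.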
Related Topics “Synchronize Fetch Action” on page 162
Identifiers Fetch Action
This action is used to retrieve a list of document identifiers and optionally perform an action on them (currently only the collect action is available). It should not be used to perform queries that could be more efficiently performed through IDOL Server.
http://host:port/action=Fetch&FetchAction=Identifiers
[&Config=Base64_Config]
&ConfigSection=Section_Name
[&IdentifiersAction=Collect
&Destination=UNC_Path
[&CollectActions=Document_Action_CSV]]
[&Connector-specific_Parameters]
Type: Asynchronous
Parameter Name Description
Config Optional Base64-encoded configuration file. If this parameter is specified, the encoded configuration options are used instead of the options in the connector configuration file.
ConfigSection The name of the configuration file section containing the task settings.
IdentifiersAction The name of the action to perform on the returned identifiers. If this action should be passed additional parameters, you should specify them as parameters to this action.
Connector-specific_Parameters Additional parameters that are connector-specific and determine which identifiers to return.
Example http://localhost:1234/action=Fetch&FetchAction=Identifiers&ConfigSection=DIR1
Response As this is an asynchronous action, you receive a token in response to the request. A sample response to the action (as retrieved using the QueueInfo action) appears below. This shows that the action has completed. The identifiers are listed in between the
Related Topics “Synchronize Fetch Action” on page 162
Insert Fetch Action
This action is used to insert a document or documents into a repository.
http://host:port/action=Fetch&FetchAction=Insert
[&Config=Base64_Config]
&ConfigSection=Section_Name
&InsertXML=Insert_XML
Type: Asynchronous
Parameter Name Description
Config Optional Base64-encoded configuration file. If this parameter is specified, the encoded configuration options are used instead of the options in the connector configuration file.
ConfigSection The name of the configuration file section containing the task settings.
failedDirectory The directory in which the action will report failures.
InsertXML XML containing all the properties to determine how and where to add each document, all the metadata, and optionally, a file to insert for each document. Some connectors expect a file to be provided. The data should be provided in XML format as below:
Example In this example, the aim is to insert a file with the reference C:\Autonomy\FileSystemConnectorCFS\dir1\newfile.txt and the content This is my file. First, construct the InsertXML:
http://localhost:1234/action=Fetch&FetchAction=Insert&ConfigSection=DIR1&InsertXML=%3CinsertXML%3E%3Cinsert%3E%3Creference%3EC%3A%5CAutonomy%5CFileSystemConnectorCFS%5Cdir1%5Cnewfile.txt%3C%2Freference%3E%3Cfile%3E%3Cisfilename%3Efalse%3C%2Fisfilename%3E%3Ccontent%3EVGhpcyBpcyBteSBmaWxl%3C%2Fcontent%3E%3C%2Ffile%3E%3C%2Finsert%3E%3C%2FinsertXML%3E
Response As this is an asynchronous action, you receive a token in response to the request. A sample response to the action (as retrieved using the QueueInfo action) appears below. This response shows that the action has completed and that one document has been inserted, and gives the identifier of the new document.
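The example request can be reproduced with the following Java sketch. The class and method names are hypothetical and are not part of ConnectorLib. The element names (insertXML, insert, reference, file, isfilename, content) are taken from the URL-decoded example request; other connectors may expect additional properties or metadata that are not shown here.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper (not a ConnectorLib class): builds an Insert request
// that carries the file content inline as Base64.
class InsertRequest {

    // Minimal XML escaping for text nodes.
    static String escapeXml(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Element names follow the decoded example request in this section.
    static String insertXml(String reference, byte[] content) {
        String encoded = Base64.getEncoder().encodeToString(content);
        return "<insertXML><insert>"
                + "<reference>" + escapeXml(reference) + "</reference>"
                + "<file><isfilename>false</isfilename>"
                + "<content>" + encoded + "</content></file>"
                + "</insert></insertXML>";
    }

    static String insertUrl(String host, int port, String configSection, String xml) {
        return "http://" + host + ":" + port + "/action=Fetch&FetchAction=Insert"
                + "&ConfigSection=" + URLEncoder.encode(configSection, StandardCharsets.UTF_8)
                + "&InsertXML=" + URLEncoder.encode(xml, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String xml = insertXml("C:\\Autonomy\\FileSystemConnectorCFS\\dir1\\newfile.txt",
                "This is my file".getBytes(StandardCharsets.UTF_8));
        System.out.println(insertUrl("localhost", 1234, "DIR1", xml));
    }
}
```

The content value VGhpcyBpcyBteSBmaWxl in the example is the Base64 encoding of the text This is my file.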
Related Topics “Synchronize Fetch Action” on page 162
Delete/Remove Fetch Action
This action is used to delete documents from a repository by their identifiers. Remove and delete are different names for the same action.
http://host:port/action=Fetch&FetchAction=Delete
[&Config=Base64_Config]
&Identifiers=Identifier_CSV
http://host:port/action=Fetch&FetchAction=Remove
[&Config=Base64_Config]
&Identifiers=Identifier_CSV
Type: Asynchronous
Parameter Name Description
Config Optional Base64-encoded configuration file. If this parameter is specified, the encoded configuration options are used instead of the options in the connector configuration file.
Identifiers A CSV of document identifiers. The documents with these identifiers are removed from the repository.
Example http://localhost:1234/action=Fetch&FetchAction=Delete&Identifiers=PGlkIHM9IkRJUjEiIHI9IkM6XEF1dG9ub215XEZpbGVTeXN0ZW1Db25uZWN0b3JDRlNcZGlyMVxuZXdmaWxlLnR4dCIvPg%3D%3D
Response As this is an asynchronous action, you receive a token in response to the request. A sample response to the action (as retrieved using the QueueInfo action) appears below. This response shows that one document was deleted successfully.
errors="0" holds="0" ingestadded="0" ingestdeleted="0" ingestfailed="0" ingestupdated="0" inserted="0" releasedholds="0" seen="0" task="DIR1" unchanged="0" updated="0"/>
Related Topics “Synchronize Fetch Action” on page 162
Hold and ReleaseHold Fetch Actions
The Hold action places a hold on a document or documents in the repository by their identifier. When a document has been placed on hold, it cannot be deleted by a regular user.
The ReleaseHold action releases a document that has been placed on hold.
http://host:port/action=Fetch&FetchAction=Hold
[&Config=Base64_Config]
&Identifiers=Identifier_CSV
http://host:port/action=Fetch&FetchAction=ReleaseHold
[&Config=Base64_Config]
&Identifiers=Identifier_CSV
Type: Asynchronous
Parameter Name Description
Config Optional Base64-encoded configuration file. If this parameter is specified, the encoded configuration options are used instead of the options in the connector configuration file.
Identifiers A CSV of document identifiers. The documents with these identifiers are placed on hold or released from hold, depending on whether you use the Hold or ReleaseHold action.
Example http://localhost:1234/action=Fetch&FetchAction=Hold&Identifiers=PGlkIHM9IkRJUjEiIHI9IkM6XEF1dG9ub215XEZpbGVTeXN0ZW1Db25uZWN0b3JDRlNcZGlyMVxuZXdmaWxlLnR4dCIvPg%3D%3D
Response As this is an asynchronous action, you receive a token in response to the request. A sample response to the action (as retrieved using the QueueInfo action) appears below. This response shows that one document was successfully put on hold.
Related Topics “Synchronize Fetch Action” on page 162
Update Action
The Update action updates the metadata of documents in a repository. The documents are specified by their identifiers.
Request /action=fetch&fetchaction=Update [&parenttoken=
IdentifiersXML The IdentifiersXML parameter specifies identifiers that require metadata updates along with a set of the metadata to be updated for each document. The data should be provided in XML format as below:
Asynchronous Response
Related Topics “Synchronize Fetch Action” on page 162
View Action
The View action retrieves a single document and returns it.
http://host:port/action=View [&Config=
Type: Synchronous
Parameter Name Description
Config Optional Base64-encoded configuration file. If this parameter is specified, the encoded configuration options are used instead of the options in the connector configuration file.
NoACI Specify whether to return the document using a normal ACI response with a Base64 encoded file tag (false), or just return binary content (true). This defaults to true.
Identifiers The identifier of the document to be returned.
Example http://localhost:1234/action=View&Identifier=PGlkIHM9IkRJUjEiIHI9IkM6XEF1dG9ub215XEZpbGVTeXN0ZW1Db25uZWN0b3JDRlNcZGlyMVxmaWxlOC50eHQiLz4%3D
Response The response is the binary content of the file, unless you have specified NoACI=false.
Related Topics “EnableViewServer” on page 128
StopFetch Action
This action requests all active asynchronous fetch actions or a particular asynchronous fetch action to stop.
http://host:port/action=StopFetch [&Token=Fetch_Action_Token]
Type: Synchronous
Parameter Name Description
Token The token of the asynchronous Fetch action to request to stop. If this is not specified, then the connector requests all asynchronous fetch actions to stop. Doing so does not clear the action queue.
Example http://localhost:1234/action=StopFetch
Response
CHAPTER 11 Connector Framework Server Parameters
This section describes the Connector Framework server (CFS) configuration parameters.
Service Parameters
Server Parameters
Actions Parameters
Import Tasks and their Parameters
Import Service Parameters
Indexing Parameters
Connector Framework server supports standard service parameters, logging parameters and log streams. For more information, see the IDOL Server Administration Guide. This section lists the Connector Framework server configuration parameters.
Service Parameters
The parameters in this section determine which machines are permitted to use and control the Connector Framework service.
Related Topics Service Configuration Parameters
Server Parameters
The parameters in this section specify details for the Connector Framework server.
AdminClients
Specify the IP addresses or names of clients that can issue administrative commands to the ACI port. To enter multiple addresses, separate the individual addresses with commas (there must be no space before or after the comma). Alternatively, you can use wildcards in the IP address. For example, enter 187.*.*.* to permit any machine whose IP address begins with 187 to control the connector.
Type: String
Default: *.*.*.*
Required: No
Configuration Section: Server
Example: AdminClients=localhost,196.172.87.11
See Also: “Port” on page 181, “QueryClients” on page 181
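The wildcard form can be pictured with the following Java sketch, which checks an IPv4 address against a comma-separated list of patterns. It is an illustration only and does not describe how the server implements the check. The class and method names are hypothetical, and the sketch handles only dotted IPv4 patterns, not host names such as localhost.

```java
// Hypothetical illustration (not the server's implementation): matches an
// IPv4 address against patterns such as 187.*.*.*
class ClientPatterns {

    // True when every octet of the pattern is "*" or equals the same octet
    // of the address.
    static boolean matches(String pattern, String ipv4) {
        String[] patternOctets = pattern.split("\\.");
        String[] addressOctets = ipv4.split("\\.");
        if (patternOctets.length != 4 || addressOctets.length != 4) {
            return false; // host names and malformed values never match here
        }
        for (int i = 0; i < 4; i++) {
            if (!patternOctets[i].equals("*") && !patternOctets[i].equals(addressOctets[i])) {
                return false;
            }
        }
        return true;
    }

    // The list uses commas with no surrounding spaces, as the parameter requires.
    static boolean isAllowed(String patternCsv, String ipv4) {
        for (String pattern : patternCsv.split(",")) {
            if (matches(pattern, ipv4)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("187.*.*.*", "187.20.3.4"));
        System.out.println(isAllowed("10.1.1.*,127.0.0.1", "10.1.2.5"));
    }
}
```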
Port
Specify the ACI port by which actions are sent to the Connector Framework server.
Type: Long
Default:
Required: Yes
Allowed Range: Minimum: 0 Maximum: 65535
Recommended Range: Minimum: 1024 Maximum: 49151
Configuration Section: Server
Example: Port=7008
QueryClients
Specify the IP addresses or names of clients that can query the connector. To enter multiple addresses, separate the individual addresses with commas (there must be no space before or after the comma). Alternatively, you can use wildcards in the IP address. For example, enter 187.*.*.* to permit any machine whose IP address begins with 187 to query the connector.
Type: String
Default: *.*.*.*
Required: No
Configuration Section: Server
Example: QueryClients=10.1.1.*,127.0.0.1
See Also: “Port” on page 181, “AdminClients” on page 180
Actions Parameters
The parameters in this section control how actions are sent to Connector Framework server.
MaxQueueSize
Use this parameter to specify the maximum number of asynchronous ingest action commands that the server queues. When the queue reaches this size, the server accepts no further ingest actions until the queue diminishes.
Type: Integer
Default: The largest size possible.
Required: No
Configuration Section: Actions
Example: MaxQueueSize=4
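The effect of a queue limit can be pictured with a bounded queue. The following Java sketch is an analogy only and does not describe how CFS is implemented; the class name is hypothetical. It shows a queue that refuses new entries when it is full and accepts them again after an entry has been processed.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical illustration (not CFS code): a queue that rejects new actions
// once it holds maxQueueSize entries.
class BoundedActionQueue {

    private final BlockingQueue<String> queue;

    BoundedActionQueue(int maxQueueSize) {
        this.queue = new ArrayBlockingQueue<>(maxQueueSize);
    }

    // Returns false instead of blocking when the queue is already full.
    boolean submit(String action) {
        return queue.offer(action);
    }

    // Removes and returns the oldest queued action, or null if there is none.
    String next() {
        return queue.poll();
    }

    public static void main(String[] args) {
        BoundedActionQueue actions = new BoundedActionQueue(4);
        for (int i = 1; i <= 5; i++) {
            System.out.println("ingest " + i + " accepted: " + actions.submit("ingest " + i));
        }
    }
}
```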
MaximumThreads
Specify the number of actions that the CFS can process in parallel at any one time. The optimal value for this parameter is dependent on the load of the server. The default is generally sufficient for most loads.
Type: Integer
Default: 2
Required: No
Configuration Section: Actions
Example: MaximumThreads=10
Import Tasks and their Parameters
The tasks and parameters in this section control the way documents are imported to IDX or XML before they are indexed into IDOL Server. The Import Task types are Lua, IDXWriter, TextToDocs, Sectioner, ImportFile, and HtmlExtraction.
Import Tasks
This section describes Import tasks. To define an Import task, use the following line in the CFS configuration file:
[Pre] | [Post] | [Update] | [Delete]N=TaskType
where TaskType is the type of import task that you want to use. For example:
Pre0=HtmlExtraction
Lua
The Lua import task is used to run a Lua script. The Lua import task can be configured as a Pre, Post, Update or Delete task. You must specify the path of the script. For example:
Post0=Lua:C:\Scripts\posttask1.lua
IDXWriter
The IDXWriter import task is used to call the CFS IDX Writer. The IDX Writer is included in the Connector Framework server and generates an IDX file.
The IdxWriter import task can be configured as a Pre, Post, Update or Delete task. The parameters that are passed to the task are specified in an optional, named section of the configuration file. For example:
Post0=IdxWriter:IdxWriting
[IdxWriting]
IdxWriterFilename=Job0.idx
IdxWriterMaxSizeKBs=100
IdxWriterArchiveDirectory=./IDXArchive
For information about the parameters used to configure this task, see “IdxWriter Import Task Parameters” on page 188.
TextToDocs
The TextToDocs import task is used to split a file into a number of documents (a main document, and one or more child documents). This task results in a number of metadata and DRECONTENT documents being generated. The original document is discarded and is not filtered using KeyView.
The TextToDocs import task is always configured as a Pre task. The parameters that are passed to the task are specified in a named section of the configuration file. For example:
[ImportTasks]
Pre0=TextToDocs:TextToDocsSection
[TextToDocsSection]
//Settings to configure how to process the documents.
For information about the parameters used to configure this task, see “TextToDocs Import Task Parameters” on page 189.
Sectioner
The Sectioner import task is used to split a large document into smaller sections.
The Sectioner import task is always configured as a Post task. The parameters that are passed to the task are specified in a named section of the configuration file. For example:
Post0=Sectioner:Sectioning
[Sectioning]
SectionerMaxBytes=3000
SectionMinBytes=1500
If a configuration file section is not specified, [Sectioning] is assumed. For information about the parameters used to configure this task, see “Sectioner Import Task Parameters” on page 201.
ImportFile
The ImportFile import task imports a file and adds its content to the document being processed.
The ImportFile import task can be configured as a Pre or Post task. To define an ImportFile import task, use the following line in the configuration file:
[Pre] | [Post]N=ImportFile:fieldname
where fieldname is a field that contains the file name of the document to import.
HtmlExtraction
The HtmlExtraction import task is used to extract the relevant parts from an HTML page and discard the irrelevant parts (for example, advertisements). It is used only with HTML documents.
The HtmlExtraction import task is always configured as a Pre task. To define an HtmlExtraction import task, use the following line in the configuration file:
Pre0=HtmlExtraction
PreN
PreN is used to specify tasks when documents are indexed into IDOL Server. Pre tasks are called before file content is filtered out and before sub-files are extracted. Tasks must be numbered starting from zero (0). The import tasks that can be called by PreN are IdxWriter, Lua, HtmlExtraction, and TextToDocs. The fields AUTN_NO_FILTER and AUTN_NO_EXTRACT can be used to customise the task.
To prevent sub-files being extracted from the document, set the value of AUTN_NO_EXTRACT to true. This setting might be used to prevent the contents of zip files being indexed.
To prevent any content being filtered out of the document, set the value of AUTN_NO_FILTER to true. This setting might be used to prevent content being indexed from a certain file type.
Type: String
Default: None
Required: No
Configuration Section: ImportTasks
Example: Pre0=Lua:C:\Scripts\pretask1.lua
See Also: “PostN” on page 186, “HashN” on page 187, “DeleteN” on page 187
PostN
PostN is used to specify tasks when documents are indexed into IDOL Server. Post tasks are called after content has been extracted from documents, and after sub-files have been extracted. Tasks must be numbered starting from zero (0). The import tasks that can be called by PostN are IdxWriter, Lua, and Sectioner.
Type: String
Default: None
Required: No
Configuration Section: ImportTasks
Example: Post0=Lua:C:\Scripts\posttask1.lua
See Also: “PreN” on page 185, “HashN” on page 187, “DeleteN” on page 187
UpdateN
UpdateN is used to specify tasks that are called when CFS is about to update fields in a document in IDOL Server. This is when a connector updates the metadata for a document but not the content. Tasks must be numbered starting from zero (0). The import tasks that can be called by UpdateN are IdxWriter and Lua. In the example below, Update0 runs a Lua script when a document is about to be updated.
Type: String
Default: None
Required: No
Configuration Section: IndexTasks
Example: Update0=Lua:onUpdate.lua
See Also: “PostN” on page 186, “PreN” on page 185, “HashN” on page 187, “DeleteN” on page 187
DeleteN
DeleteN is used to specify tasks that are called when CFS is about to delete a document from IDOL Server. Tasks must be numbered starting from zero (0). The import tasks that can be called by DeleteN are IdxWriter and Lua. In the following example, Delete0 runs a Lua script when a document is about to be deleted.
Type: String
Default: None
Required: No
Configuration Section: IndexTasks
Example: Delete0=Lua:onDelete.lua
See Also: “PostN” on page 186, “PreN” on page 185, “HashN” on page 187
HashN
Specify a file containing a Lua script to use for family hashing. The script inserts an MD5 field into the document, which is a hash of the document’s unique fields. In the example below, hash.lua should be as follows (because this uses the file contents):
function handler(document)
  return false
end
The hash is calculated from the original imported file as a whole, regardless of whether the file is text or binary.
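Hashing the raw bytes of a file is what makes the result independent of whether the file is text or binary. The following Java sketch shows the idea using the standard MessageDigest class. It is an illustration only: the class name is hypothetical, and the exact input and output format that CFS uses for its MD5 field is not described in this guide.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration (not CFS code): an MD5 digest of raw file bytes,
// rendered as lowercase hexadecimal.
class FileHash {

    static String md5Hex(byte[] fileBytes) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            StringBuilder hex = new StringBuilder();
            // The digest works on bytes, so text and binary input are treated alike.
            for (byte b : md5.digest(fileBytes)) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(md5Hex("abc".getBytes(StandardCharsets.UTF_8)));
    }
}
```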
Type: String
Default: None
Required: No
Configuration Section: ImportTasks
Example: Hash0=hash.lua
See Also: “ImportHashFamilies” on page 204, “ImportFamilyRootExcludeFmtCSV” on page 203
IdxWriter Import Task Parameters
The parameters in this section are used to customise the IdxWriter Import Task.
IdxWriterFileName
The IdxWriterFileName parameter specifies the name of the idx file that is used to store document data before it is indexed into IDOL Server.
Type: String
Default: None
Required: No
Configuration Section: ImportTasks
Example: IdxWriterFileName=Job_0.idx
See Also: “IdxWriterArchiveDirectory” on page 188, “IdxWriterMaxSizeKBs” on page 189
IdxWriterArchiveDirectory
The IdxWriterArchiveDirectory parameter specifies the name of the directory into which idx files are archived when the maximum file size is reached.
Type: String
Default: None
Required: No
Configuration Section: ImportTasks
Example: IdxWriterArchiveDirectory=/IDXarchive
See Also: “IdxWriterFileName” on page 188, “IdxWriterMaxSizeKBs” on page 189
IdxWriterMaxSizeKBs
The IdxWriterMaxSizeKBs parameter specifies a maximum size for idx files.
Type: Integer
Default: None
Required: No
Configuration Section: ImportTasks
Example: IdxWriterMaxSizeKBs=1000
See Also: “IdxWriterFileName” on page 188, “IdxWriterArchiveDirectory” on page 188
TextToDocs Import Task Parameters
The parameters in this section are used to customise the TextToDocs Import Task.
FilenameMatchesRegex
Type: String
Default: None
Required: No
Configuration Section: ImportTasks
Example: FilenameMatchesRegex0=.*htm
FilenameMatchesRegex1=.*txt
ReferenceMatchesRegex
Type: String
Default: None
Required: No
Configuration Section: ImportTasks
Example: ReferenceMatchesRegex0=123.*
ReferenceMatchesRegex1=456.*
FieldMatchesName
If more than one pair of FieldMatchesName and FieldMatchesRegex parameters is defined, the field content must match the regular expression for every field, or the file is not included.
Type: String
Default: None
Required: No
Configuration Section: ImportTasks
Example: FieldMatchesName0=Title
FieldMatchesRegex0=.*
See Also: “FieldMatchesRegex”
FieldMatchesRegex
If more than one pair of FieldMatchesName and FieldMatchesRegex parameters is defined, the field content must match the regular expression for every field, or the file is not included.
Type: String
Default: None
Required: No
Configuration Section: ImportTasks
Example: FieldMatchesName0=Title
FieldMatchesRegex0=.*
See Also: “FieldMatchesName”
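The rule that every FieldMatchesName and FieldMatchesRegex pair must match can be sketched in Java as follows. This is an illustration with hypothetical names, not the CFS implementation. It assumes that the regular expression must match the whole field value and that a missing field counts as a failed match; neither point is stated in this guide, and Java's regular expression syntax may differ from the one CFS uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical illustration (not CFS code): a file is included only when
// every configured field/regex pair matches.
class FieldFilter {

    // fieldRegexes maps a field name (FieldMatchesNameN) to its regular
    // expression (FieldMatchesRegexN).
    static boolean include(Map<String, String> documentFields, Map<String, String> fieldRegexes) {
        for (Map.Entry<String, String> rule : fieldRegexes.entrySet()) {
            String value = documentFields.get(rule.getKey());
            // Assumption: a missing field fails, and the regex must match the
            // entire field value.
            if (value == null || !Pattern.matches(rule.getValue(), value)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("Title", "Quarterly report");

        Map<String, String> rules = new LinkedHashMap<>();
        rules.put("Title", ".*");
        System.out.println(include(fields, rules));

        rules.put("Author", "Smith.*"); // hypothetical second pair
        System.out.println(include(fields, rules));
    }
}
```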
ContentContainsRegex
Type: String
Default: None
Required: No
Configuration Section: ImportTasks
Example: ContentContainsRegex0=.*
MainRangeRegex
MainRangeRegex0=(.*)
Type: String
Default: *
Required: No
Configuration Section: ImportTasks
Example: MainRangeRegex0=(.*)
MainContentRegex
MainContentRegex0=(.*)
Type: String
Default: None
Required: No
Configuration Section: ImportTasks
Example: MainContentRegex0=(.*)
See Also: “MainRangeRegex”
MainFieldName
The document field named in the MainFieldName
Type: String
Default: None
Required: No
Configuration Section: ImportTasks
Example: MainFieldName1=Description
MainFieldRegex1=(.*)
See Also: “MainFieldRegex”
MainFieldRegex
The document field named in the MainFieldName
The regular expressions used in the MainFieldRegex parameter can contain sub-matches (enclosed in parentheses). If multiple sub-matches are found, the sub-matches are concatenated (separated by spaces).
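The concatenation of sub-matches can be illustrated with the following Java sketch. The class and method names are hypothetical. Java's regular expression syntax may differ from the one CFS uses, and returning the whole match when the expression has no sub-matches is an assumption of this sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration (not CFS code): joins the sub-matches of the
// first match with single spaces.
class SubMatchJoiner {

    static String joinSubMatches(String regex, String text) {
        Matcher matcher = Pattern.compile(regex, Pattern.DOTALL).matcher(text);
        if (!matcher.find()) {
            return null; // nothing matched
        }
        if (matcher.groupCount() == 0) {
            return matcher.group(); // assumption: no sub-matches means the whole match
        }
        StringBuilder joined = new StringBuilder();
        for (int i = 1; i <= matcher.groupCount(); i++) {
            String subMatch = matcher.group(i);
            if (subMatch == null) {
                continue; // this group did not take part in the match
            }
            if (joined.length() > 0) {
                joined.append(' ');
            }
            joined.append(subMatch);
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinSubMatches("Name: (\\w+), Dept: (\\w+)", "Name: Lee, Dept: Sales"));
    }
}
```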
Type: String
Default: None
Required: No
Configuration Section: ImportTasks
Example: MainFieldName1=Description
MainFieldRegex1=(.*)
See Also: “MainFieldName”
ChildrenRangeRegex
For example, to define all the content that is enclosed by tags, set the parameter to:
ChildrenRangeRegex0=(.*)
Type: String
Default: None
Required: No
Configuration Section: ImportTasks
Example: ChildrenRangeRegex0=(.*)
See Also: “MainFieldName”
ChildRangeRegex
The regular expressions used in the ChildRangeRegex parameter can contain sub-matches (enclosed in parentheses). If multiple sub-matches are found, the sub-matches are concatenated.
Type: String
Default: None
Required: No
Configuration Section: ImportTasks
Example: ChildRangeRegex0=(.*)
ChildContentRegex
For example, to define all content that is enclosed by tags, set the parameter to:
ChildContentRegex0=(.*)
Type: String
Default: None
Required: No
Configuration Section: ImportTasks
Example: ChildContentRegex0=(.*)
ChildFieldName
The document field named in the ChildFieldName
Type: String
Default: None
Required: No
Configuration Section: ImportTasks
Example: ChildFieldName1=Description
ChildFieldRegex1=.*
ChildFieldRegex
The document field named in the ChildFieldName
Type: String
Default: None
Required: No
Configuration Section: ImportTasks
Example: ChildFieldName1=Description
ChildFieldRegex1=.*
ChildInheritFields
The ChildInheritFields parameter is used to specify a comma-separated list of field names that are inherited by the child documents from the original (not the main) document.
Type: String
Default: None
Required No
Configuration ImportTasks Section:
Example: ChildInheritFields=Title,Author,ModifiedDate
ContentReplaceRegex
The data identified by the ContentReplaceRegex
Type: String
Default: None
Required No
Configuration ImportTasks Section:
Example: ContentReplaceRegex0=.*
ContentReplaceFormat0=replacement
See Also: “ContentReplaceFormat”
ContentReplaceFormat
The data identified by the ContentReplaceRegex
Type: String
Default: None
Required No
Configuration ImportTasks Section:
Example: ContentReplaceRegex0=.*
ContentReplaceFormat0=replacement
See Also: “ContentReplaceRegex”
FieldReplaceName
The FieldReplaceName
Type: String
Default: None
Required No
Configuration ImportTasks Section:
Example: FieldReplaceName0=Description
FieldReplaceRegex0=.*
FieldReplaceFormat0=new value
See Also: “FieldReplaceRegex”
FieldReplaceRegex
The FieldReplaceRegex
Type: String
Default: None
Required No
Configuration ImportTasks Section:
Example: FieldReplaceName0=Description
FieldReplaceRegex0=.*
FieldReplaceFormat0=new value
See Also: “FieldReplaceName”
FieldReplaceFormat
The FieldReplaceFormat
Type: String
Default: None
Required No
Configuration ImportTasks Section:
Example: FieldReplaceName0=Description
FieldReplaceRegex0=.*
FieldReplaceFormat0=new value
See Also: “FieldReplaceName”
DateFieldName
The DateFieldName
Type: String
Default: None
Required No
Configuration ImportTasks Section:
Example: DateFieldName0=ModifiedDate
DateFieldFormat0=YYYY-MM-DD HH:NN:SS
See Also: “DateFieldFormat”
DateFieldFormat
The DateFieldFormat
Type: String
Default: None
Required No
Configuration ImportTasks Section:
Example: DateFieldName0=ModifiedDate
DateFieldFormat0=YYYY-MM-DD HH:NN:SS
See Also: “DateFieldName”
Sectioner Import Task Parameters
The following parameters are used to customise the Sectioner Import Task.
SectionerMaxBytes
Use this parameter to specify the maximum number of bytes recommended for a section. This is not a hard limit, but the Sectioner will try to keep section sizes below this value.
Type: Integer
Default: 3000
Required No
Configuration The configured section or Sectioning Section:
Example: SectionerMaxBytes=3000
SectionerMinBytes
Use this parameter to specify the minimum number of bytes recommended for a section. This is not a hard limit, but the Sectioner will try to keep section sizes above this value.
Type: Integer
Default: SectionerMaxBytes/2
Required No
Configuration The configured section or Sectioning Section:
Example: SectionerMinBytes=1500
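The relationship between the two defaults (SectionerMinBytes falls back to half of SectionerMaxBytes) can be sketched as follows. This is a minimal illustration that reads the values from a java.util.Properties object; the class and method names are hypothetical, and the way the Sectioner actually reads its configuration is not shown in this guide.

```java
import java.util.Properties;

class SectionerDefaultsSketch {
    static final int DEFAULT_MAX_BYTES = 3000; // documented default for SectionerMaxBytes

    // SectionerMaxBytes, or 3000 when it is not configured.
    static int maxBytes(Properties cfg) {
        return Integer.parseInt(
            cfg.getProperty("SectionerMaxBytes", String.valueOf(DEFAULT_MAX_BYTES)));
    }

    // SectionerMinBytes, or SectionerMaxBytes/2 when it is not configured.
    static int minBytes(Properties cfg) {
        String configured = cfg.getProperty("SectionerMinBytes");
        return configured != null ? Integer.parseInt(configured) : maxBytes(cfg) / 2;
    }

    public static void main(String[] args) {
        Properties cfg = new Properties();
        cfg.setProperty("SectionerMaxBytes", "4000");
        System.out.println(maxBytes(cfg) + " / " + minBytes(cfg));
    }
}
```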
SectionerSeparatorsN
Use this parameter to specify the fixed strings or regular expressions that the Sectioner can use to identify a suitable location in the content for inserting a section break. For example, you may prefer content to split on paragraph breaks (“%0A%0A”). If a large block of content has no paragraph breaks, the Sectioner could then fall back to splitting on punctuation.
A separator can be specified either as a fixed string or as a regular expression. A separator is treated as a regular expression if it begins with an open parenthesis “(” and ends with a close parenthesis “)”. Fixed strings and regular expressions specified in the configuration are URL unescaped before use; this allows you to specify multi-byte and special characters.
Each SectionerSeparatorsN is a CSV of possibly URL-escaped separators. Separators in an earlier SectionerSeparators list have priority over those in a later list. Separators toward the left of a CSV have priority over those toward the right.
Backslashes in a regular expression should appear in the configuration as “\\”. Commas in separators should be URL escaped as “%2C” or escaped as “\,”. SectionerSeparators 0, 1, and 2 are set by default if none are specified in the configuration. If any SectionerSeparators are specified in the configuration, the defaults no longer apply.
Type: String
Default: SectionerSeparators0=%0A%0A
SectionerSeparators1=([!?.]\\s+),([:;]\\s+),(%EF%BC%81|%EF%BC%9F|%E3%80%82|%EF%BC%8E|[!?]),(%EF%BC%9A|%EF%BC%9B|[:;])
SectionerSeparators2=((%2C|%E3%80%81|%EF%BC%8C)\\s*),((%E3%80%80|\\s)+)
Required No
Configuration The configured section or Sectioning Section:
Example: SectionerSeparators0=%0A%0A
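The two rules stated for a single separator (regular expression if wrapped in parentheses, URL unescaped before use) can be sketched as below. This is an assumption-laden illustration, not the Sectioner's code: the class and method names are hypothetical, the decoder treats %XX sequences as UTF-8 bytes, and the handling of “\\” and “\,” escapes and of CSV splitting is deliberately left out. A hand-written decoder is used because java.net.URLDecoder also turns “+” into a space, which would corrupt expressions such as \\s+.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class SeparatorSketch {
    // A separator is treated as a regular expression if it begins with "("
    // and ends with ")"; otherwise it is a fixed string.
    static boolean isRegex(String separator) {
        return separator.length() >= 2
            && separator.startsWith("(") && separator.endsWith(")");
    }

    // Percent-decodes %XX sequences as UTF-8 bytes; all other characters
    // (including '+') are copied unchanged.
    static String urlUnescape(String separator) {
        byte[] in = separator.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < in.length) {
            if (in[i] == '%' && i + 2 < in.length) {
                int hi = Character.digit(in[i + 1] & 0xFF, 16);
                int lo = Character.digit(in[i + 2] & 0xFF, 16);
                if (hi >= 0 && lo >= 0) {
                    out.write((hi << 4) | lo); // one decoded byte
                    i += 3;
                    continue;
                }
            }
            out.write(in[i]); // not an escape sequence, copy as-is
            i++;
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String paragraphBreak = urlUnescape("%0A%0A");
        System.out.println(paragraphBreak.length() + " characters, regex=" + isRegex("%0A%0A"));
    }
}
```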
Import Service Parameters
The parameters in this section specify details for KeyView and the service that imports documents into IDX or XML.
ExtractDirectory
Specify the directory to which files are extracted. Use this parameter only when you want to keep copies of all extracted files.
Type: String
Default: Current directory
Required No
Configuration ImportService Section:
Example: ExtractDirectory=C:\temp
ImportFamilyRootExcludeFmtCSV
Specify which KeyView formats not to designate as family roots if family hashing is enabled. For example, if you exclude the PST format (KeyView value 356), when Import Module Advanced hashes a PST file, it does not consider the PST container as the root format. Instead, it searches for a deeper format that is not listed in the CSV: in this case, it would find the MAIL format, which would then be considered the root of the family. For a complete list of KeyView formats, see “KeyView Format Codes” on page 265.
Type: String
Default:
Required No
Configuration ImportService Section:
Example: ImportFamilyRootExcludeFmtCSV=356,157,233,345
In this example, the numeric values correspond to the following formats:
356=PST
157=ZIP
233=EML
345=MSG
See Also: “HashN” on page 187 “ImportHashFamilies” on page 204
ImportHashFamilies
Specify whether to enable family hashing, which is used for de-duplication.
Type: Boolean
Default: false
Required No
Configuration ImportService Section:
Example: ImportHashFamilies=true
See Also: “HashN” on page 187 “ImportFamilyRootExcludeFmtCSV” on page 203 “ImportMergeMails” on page 205
ImportInheritFieldsCSV
Specify a comma-separated list of fields that should be inherited from parent files by their children. For example, if you specify SUBJECT in this parameter, all the child attachments in a parent MSG file will contain a Subject field.
Type: String
Default: None
Required No
Configuration ImportService Section:
Example: ImportInheritFieldsCSV=AUTN_IDENTIFIER
ImportMergeMails
Specify whether to merge the two files created by KeyView (the empty MSG or EML container file, and the MAIL file that contains the actual message content) when importing MSG or EML files. Set this to true to merge the two files.
Type: Boolean
Default: false
Required No. Recommended if ImportHashFamilies=true.
Configuration ImportService Section:
Example: ImportMergeMails=true
See Also: “ImportHashFamilies” on page 204
KeyviewDirectory
Specify the location of the KeyView filters that Connector Framework Server uses to process documents. Enter the full path to the filters directory.
Type: String
Default: None
Required Yes
Configuration ImportService Section:
Example: KeyviewDirectory=C:\Autonomy\ConnectorFramework\filters\
MaxImportQueueSize
Specify the size of an internal queue where documents are buffered before they are imported.
NOTE It is recommended that this parameter not be changed without consultation with Autonomy support personnel.
Type: Integer
Default: Ten times the size specified by the IndexBatchSize parameter.
Required No
Configuration ImportService or Server Section:
Example: MaxImportQueueSize=1000
RevisionMarks
Specify whether revision mark information (such as deleted text) is extracted from Microsoft Word documents. If Microsoft Word’s revision tracking feature was enabled when changes were made to a document, the CFS can extract the tracked information and include it in the index. Set to true to extract revision mark information.
Type: Boolean
Default: false
Required No
Configuration ImportService Section:
Example: RevisionMarks=true
ThreadCount
Specify the number of threads to run. This parameter is only used for importing.
Type: Integer
Default: 1
Required No
Configuration ImportService Section:
Example: ThreadCount=3
XsltDLL
Use this parameter to specify the location of the autnxslt library.
Type: String
Default: autnxslt.dll (if present)
Required: No
Configuration Paths or ImportService or Server Section:
Example: XsltDLL=autnxslt.dll
Indexing Parameters
The parameters in this section specify the details for the IDOL Server(s) to which the Connector Framework server will send documents for indexing.
ACIPort
Specify the ACI port of each IDOL Server with which Connector Framework server communicates. There should be the same number of values in the ACIPort CSV as in the DREHost CSV.
Type: CSV (comma-separated values)
Default: None
Required At least one entry is required.
Configuration Indexing Section:
Example: ACIPort=9000,9012
See Also: “CompressIndexFiles” on page 208
CompressIndexFiles
Set this parameter to True to compress all index files sent to IDOL. (The IDOL Server version must be one that supports compressed index files.)
Type: Boolean
Default: False
Required No
Configuration Indexing Section:
Example: CompressIndexFiles=True
DREHost
Specify the IP address or host name of each IDOL Server with which Connector Framework server communicates. There should be the same number of values in the DREHost CSV as in the ACIPort CSV.
Type: CSV (comma-separated values)
Default: None
Required At least one entry is required.
Configuration Indexing Section:
Example: DREHost=hostmachine0,hostmachine1
See Also: “ACIPort” on page 208
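Because the DREHost and ACIPort lists must have the same number of entries, a configuration check might pair them by position, as in the sketch below. Pairing by position is an assumption made for this illustration (the guide only states that the counts must match), and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class IndexTargetsSketch {
    // Pairs the Nth DREHost value with the Nth ACIPort value and rejects
    // lists of different lengths.
    static List<String> pair(String dreHostCsv, String aciPortCsv) {
        String[] hosts = dreHostCsv.split(",");
        String[] ports = aciPortCsv.split(",");
        if (hosts.length != ports.length) {
            throw new IllegalArgumentException(
                "DREHost has " + hosts.length + " values but ACIPort has " + ports.length);
        }
        List<String> targets = new ArrayList<>();
        for (int i = 0; i < hosts.length; i++) {
            int port = Integer.parseInt(ports[i].trim()); // fails fast on a non-numeric port
            targets.add(hosts[i].trim() + ":" + port);
        }
        return targets;
    }

    public static void main(String[] args) {
        System.out.println(pair("hostmachine0,hostmachine1", "9000,9012"));
    }
}
```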
IndexBatchSize
Specify the maximum number of files that are included in each batch that is indexed into IDOL Server.
Type: Integer
Default: 100
Required No
Configuration Indexing Section:
Example: IndexBatchSize=100
IndexOverSocket
Enter true when the IDOL server and connector are installed on different computers and documents are indexed over a network. (In this case, DREADDDATA sends data over the network and is slower.) Enter false when the IDOL server and connector are installed on the same computer and documents are indexed locally. (In this case, DREADD uses file-based indexing and is quicker.)
Type: Boolean
Default: True
Required: No
Configuration Indexing Section:
Example: IndexOverSocket=true
IndexTimeInterval
Specify the timeout value in seconds for the index queue. This is the maximum amount of time a document will wait in the index queue before an attempt is made to index it. If no documents were indexed in the specified interval, any documents in the queue (up to the number specified in IndexBatchSize) are indexed.
Type: Integer
Default: 300
Required No
Configuration Indexing Section:
Example: IndexTimeInterval=100
See Also: “IndexBatchSize” on page 209
Related Topics “Secure Socket Layer Parameters” on page 235
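The interaction between IndexBatchSize and IndexTimeInterval can be sketched as a simple decision function. This is a simplified model, not the Connector Framework Server's implementation: the assumption that a full batch is sent as soon as IndexBatchSize documents are queued is made for the example, and all names are hypothetical.

```java
class IndexQueueSketch {
    // True when the queue should be indexed now: either a full batch is
    // waiting (an assumption of this sketch) or IndexTimeInterval seconds
    // have passed since documents were last indexed.
    static boolean shouldIndex(int queued, int indexBatchSize,
                               long secondsSinceLastIndex, long indexTimeInterval) {
        if (queued <= 0) {
            return false; // nothing to send
        }
        return queued >= indexBatchSize || secondsSinceLastIndex >= indexTimeInterval;
    }

    // Number of documents sent in one batch: never more than IndexBatchSize.
    static int batchLength(int queued, int indexBatchSize) {
        return Math.min(queued, indexBatchSize);
    }

    public static void main(String[] args) {
        // Documented defaults: IndexBatchSize=100, IndexTimeInterval=300.
        System.out.println(shouldIndex(5, 100, 10, 300));
        System.out.println(shouldIndex(5, 100, 300, 300));
    }
}
```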
KillDuplicates
Use this parameter to specify the string that is used as the KillDuplicates parameter value when sending an index command to IDOL server. The following options are available for this parameter:
REFERENCE - Replaces an existing document with the new document if the document to index has the same value in its DREREFERENCE field.
The default is to leave the value blank, in which case nothing is appended to the command sent to IDOL. This allows duplicate documents in IDOL server: IDOL server neither replaces nor deletes documents. For more information, refer to the IDOL Server Administration Guide.
Type: String
Default:
Required No
Configuration Indexing Section:
Example: KillDuplicates=REFERENCE
CHAPTER 12 License Configuration Parameters
This chapter describes the license configuration parameters that specify licensing details.
Full
Holder
Key
LicenseServerACIPort
LicenseServerHost
LicenseServerTimeout
LicenseServerRetries
Operation
Full
Indicates whether you have a full or an evaluation license.
Type: Boolean
Default: False
Required: Yes
Configuration License Section:
Example: Full=on
In this example, the service is fully licensed.
Holder
The name of the license holder.
Type: String
Default: None
Required: Yes
Configuration License Section:
Example: Holder=Company
Key
The license key.
Type: String
Default: None
Required: Yes
Configuration License Section:
Example: Key=01234567890
LicenseServerACIPort
ACI port of DiSH license server. This must be the Port specified in the DiSH configuration file's [Server] section. This port is used to request licensing from DiSH. This parameter is used in IDOL with Administration.
Type: Long
Default: None
Required: Yes
Allowed range: Minimum: 0, Maximum: 65536
Recommended range: Minimum: 1025, Maximum: 65536
Configuration License Section:
Example: LicenseServerACIPort=20000
See Also: “LicenseServerHost” on page 216 “LicenseServerTimeout” on page 216 “LicenseServerRetries” on page 217
LicenseServerHost
Address of DiSH host. The IP address (or name) of the machine that hosts the DiSH license server. This parameter is used in IDOL with Administration.
Type: String
Default: None
Required: Yes
Configuration License Section:
Example: LicenseServerHost=1.23.45.6
See Also: “LicenseServerACIPort” on page 215 “LicenseServerTimeout” on page 216 “LicenseServerRetries” on page 217
LicenseServerTimeout
Seconds to timeout when connecting to DiSH. Type the number of seconds after which requests that have been sent to the DiSH license server time out if it does not respond. This parameter is used in IDOL with Administration.
Type: Long
Default: 120000
Required: No
Configuration License Section:
Example: LicenseServerTimeout=600000
See Also: “LicenseServerACIPort” on page 215 “LicenseServerHost” on page 216 “LicenseServerRetries” on page 217
LicenseServerRetries
Number of retries when connecting to the DiSH license server. This parameter is used in IDOL with Administration.
Type: Integer
Default: 5
Required: No
Configuration License Section:
Example: LicenseServerRetries=1
See Also: “LicenseServerACIPort” on page 215 “LicenseServerHost” on page 216 “LicenseServerTimeout” on page 216
Operation
Licensed Operations key to allow additional ACI server operations to be licensed.
Type: String
Default: None
Required: Yes
Configuration License Section:
Example: Operations=803|87sdhsdf9n94nmsf7oasda987w4yriasunfaasd==
CHAPTER 13 Logging Configuration Parameters
This section describes the configuration parameters used to create separate log files for different log message types (such as query, index, and application) and to determine how each stream is logged.
LogArchiveDirectory
LogCompressionMode
LogDirectory
LogEcho
LogExpireAction
LogFile
LogHistorySize
LogLevel
LogLevelMatch
LogMaxLineLength
LogMaxOldFiles
LogMaxSizeKBs
LogOldAction
LogOutputLogLevel
LogSysLog
LogTime
LogTypeCSVs
LogArchiveDirectory
Path to log archive directory. Type the directory in which you want the application to archive old log files when LogOldAction is set to Move.
Type: String
Default: ./archive
Required: No
Configuration Logging and/or LogStream or TaskName Section: If you set this parameter in the Logging and LogStream sections, the setting in the LogStream section takes precedence for the specified log stream.
Example: LogArchiveDirectory=./archive
See Also: “LogOldAction” on page 229
LogCompressionMode
Specifies how old log files are compressed when the LogExpireAction parameter is set to Compress. This can be set to either zip or gz.
Type: String
Default: zip
Required: No
Configuration Logging and/or LogStream Section: If you set this parameter in the Logging and LogStream sections, the setting in the LogStream section takes precedence for the specified log stream.
Example: LogCompressionMode=gz
See Also: “LogExpireAction” on page 222
LogDirectory
Path to log directory. Type the directory in which you want the application to store the log files it creates.
Type: String
Default: ./logs
Required: No
Configuration Logging Section:
Example: LogDirectory=./logs
See Also: “LogArchiveDirectory” on page 220 “LogFile” on page 223
LogEcho
Display logging messages on the console. Enable this parameter to display logging messages on the console.
NOTE This setting has no effect if you are running the application as a Windows service.
Type: Boolean
Default: False
Required: No
Configuration Logging and/or LogStream Section: If you set this parameter in the Logging and LogStream sections, the setting in the LogStream section takes precedence for the specified log stream.
Example: LogEcho=true
See Also: “LogArchiveDirectory” on page 220
LogExpireAction
Determines how log files are handled when they exceed the maximum size. Type one of the following to determine how log files are handled when they exceed the LogMaxSizeKBs size:
Option Description
Compress The log file's name is appended with a time stamp, compressed and saved in the log directory. By default, this is a Zip file. Use the LogCompressionMode parameter to specify another compression format.
Consecutive The log file's name is appended with a number and saved in the log directory. When the next log file reaches its LogMaxSizeKBs size, it is appended with the next consecutive number.
Datestamp The log file's name is appended with a time stamp and saved in the log directory.
Previous The log file's name is appended with .previous and saved in the log directory. Every time a log file reaches its LogMaxSizeKBs size, it is given the same postfix so it overwrites the old log file.
Day Only one log file is created per day and is appended with the current time stamp. Log files are archived after they reach the LogMaxSizeKBs size.
NOTE The LogMaxSizeKBs parameter takes precedence over the LogExpireAction parameter. Therefore, if you set LogExpireAction to Day, and the value for LogMaxSizeKBs results in more than one log file, multiple log files are generated per day.
Type: String
Default: Datestamp
Required: No
Configuration Logging and/or LogStream or TaskName Section: If you set this parameter in the Logging and LogStream sections, the setting in the LogStream section takes precedence for the specified log stream.
Example: LogExpireAction=Compress
See Also: “LogCompressionMode” on page 221 “LogFile” on page 223 “LogMaxSizeKBs” on page 228
LogFile
Name of the log file. The name of the log file the application creates in the specified LogDirectory.
Type: String
Default: None
Required: Yes
Configuration Logging and/or LogStream Section: If you set this parameter in the Logging and LogStream sections, the setting in the LogStream section takes precedence for the specified log stream.
Example: LogFile=query.log
See Also: “LogDirectory” on page 221
LogHistorySize
The number of log messages to store in memory.
Type: String
Default: 100
Required: Yes
Allowed Range: Minimum: 1, Maximum: 520
Configuration LogStream Section:
Example: LogHistorySize=50
See Also: “LogExpireAction” on page 222
LogLevel
The type of messages that are logged. Type one of the following to determine the type of messages that are logged:
Option Description
Always Basic processes are logged. NOTE This produces only minimal logging and no errors are logged.
Error Errors are logged.
Warning Errors and warnings are logged.
Normal Errors, warnings and basic processes are logged.
Full Every occurrence is logged. NOTE This produces a large log file and can affect performance.
The log levels are hierarchical from least logging to most logging. You can use the LogLevelMatch parameter to specify which messages are reported relative to the specified LogLevel. For example, if LogLevelMatch=LessThan and LogLevel=Warning, "Normal" and "Full" message types are reported. Use the LogOutputLogLevel parameter to report the log level in the log.
Type: String
Default: Normal
Required: No
Configuration Logging and/or LogStream Section: If you set this parameter in the Logging and LogStream sections, the setting in the LogStream section takes precedence for the specified log stream.
Example: LogLevel=Warning
See Also: “LogFile” on page 223 “LogLevelMatch” on page 225
LogLevelMatch
The messages reported relative to the specified LogLevel. The LogLevelMatch parameter specifies the messages that are reported relative to the log-level hierarchy:
Always
Error
Warning
Normal
Full
Type one of the following values for LogLevelMatch:
Option Description
Equal Only the message type specified by LogLevel is reported. For example, if LogLevel=warning, only warning messages are reported.
LessThan The message types below the LogLevel setting are reported. For example, if LogLevel=warning, "Normal" and "Full" message types are reported.
LessThanOrEqual The message type specified by LogLevel and any message type below that are reported. For example, if LogLevel=warning, "Normal", "Full", and "Warning" message types are reported.
GreaterThan The message types above the LogLevel setting are reported. For example, if LogLevel=warning, "Error" and "Always" message types are reported.
GreaterThanOrEqual The message type specified by LogLevel and any message type above that are reported. For example, if LogLevel=warning, "Error", "Always", and "Warning" message types are reported.
Type: String
Default: GreaterThanOrEqual
Required: No
Configuration Logging and/or LogStream Section: If you set this parameter in the Logging and LogStream sections, the setting in the LogStream section takes precedence for the specified log stream.
Example: LogLevelMatch=GreaterThanOrEqual
See Also: “LogFile” on page 223 “LogLevel” on page 224 “LogOutputLogLevel” on page 230
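The LogLevelMatch options can be modeled as comparisons against a position in the hierarchy, where Always is the highest level and Full the lowest. The sketch below is one reading of the option table, not the logging library's code; the enum and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class LogLevelMatchSketch {
    // Declared in hierarchy order: earlier constants are "above" later ones.
    enum Level { ALWAYS, ERROR, WARNING, NORMAL, FULL }

    enum Match { EQUAL, LESS_THAN, LESS_THAN_OR_EQUAL, GREATER_THAN, GREATER_THAN_OR_EQUAL }

    // True when a message of the given type is reported for this LogLevel
    // and LogLevelMatch combination.
    static boolean isReported(Level message, Level logLevel, Match match) {
        // Negative: message is above LogLevel. Positive: message is below it.
        int position = Integer.compare(message.ordinal(), logLevel.ordinal());
        switch (match) {
            case EQUAL:                 return position == 0;
            case LESS_THAN:             return position > 0;
            case LESS_THAN_OR_EQUAL:    return position >= 0;
            case GREATER_THAN:          return position < 0;
            case GREATER_THAN_OR_EQUAL: return position <= 0;
            default: throw new IllegalArgumentException("Unknown match: " + match);
        }
    }

    // All message types reported for the given settings, in hierarchy order.
    static List<Level> reported(Level logLevel, Match match) {
        List<Level> result = new ArrayList<>();
        for (Level level : Level.values()) {
            if (isReported(level, logLevel, match)) {
                result.add(level);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(reported(Level.WARNING, Match.GREATER_THAN_OR_EQUAL));
        System.out.println(reported(Level.WARNING, Match.LESS_THAN));
    }
}
```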
LogMaxLineLength
Maximum characters in a log entry. The number of characters a log entry can include before it is truncated. Increase this value when you want long actions to be logged in full.
Type: Long
Default: 16384
Required: No
Allowed Range: Minimum: 0, Maximum: 2000000000
Recommended Range: Minimum: 100, Maximum: 1000000
Configuration Logging and/or LogStream Section: If you set this parameter in the Logging and LogStream sections, the setting in the LogStream section takes precedence for the specified log stream.
Example: LogMaxLineLength=24000
See Also: “LogFile” on page 223
LogMaxOldFiles
Maximum number of log files in the log directory. The maximum number of log files the specified LogDirectory can store before the application runs the specified LogOldAction. If you do not want to restrict how many log files the LogDirectory can store, type -1.
Type: Long
Default: -1 (unlimited)
Required: No
Configuration Logging and/or LogStream or TaskName Section: If you set this parameter in the Logging and LogStream sections, the setting in the LogStream section takes precedence for the specified log stream.
Example: LogMaxOldFiles=1000
See Also: “LogDirectory” on page 221 “LogOldAction” on page 229
LogMaxSizeKBs
Maximum log file size (in KB). If you do not want to restrict the log file size, type -1. The LogExpireAction parameter determines how a log file is handled after it has reached its maximum size. This parameter is used for standard logging streams.
Type: Long
Default: 1024
Required: No
Configuration Logging and/or LogStream or TaskName Section: If you set this parameter in the Logging and LogStream sections, the setting in the LogStream section takes precedence for the specified log stream.
Example: LogMaxSizeKBs=1000
See Also: “LogExpireAction” on page 222
LogOldAction
Determines how log files are handled when the maximum number of log files is exceeded. Type one of the following to determine how log files are handled when the LogDirectory has reached the maximum number of log files, as determined by the LogMaxOldFiles parameter:
Option Description
Delete The log files are deleted.
Move The log files are moved to the specified LogArchiveDirectory.
Type: String
Default: Delete
Required: No
Configuration Logging and/or LogStream Section: If you set this parameter in the Logging and LogStream sections, the setting in the LogStream section takes precedence for the specified log stream.
Example: LogOldAction=Move
See Also: “LogArchiveDirectory” on page 220 “LogDirectory” on page 221 “LogMaxOldFiles” on page 227
LogOutputLogLevel
Determines whether the log level is reported in the log. Enable this parameter to include the log level of a message in the log entry.
Type: Boolean
Default: False
Required: No
Configuration Logging and/or LogStream Section: If you set this parameter in the Logging and LogStream sections, the setting in the LogStream section takes precedence for the specified log stream.
Example: LogLevel=Always
LogOutputLogLevel=true
In this example, Always is added to the log message:
21/12/2006 12:34:56 [10] Always: ACI Server attached to port 1622
See Also: “LogLevel” on page 224
LogSysLog
Write messages to Windows/Linux system log. Enable this parameter to write messages to the Linux Syslog or the Windows Event Log.
Type: Boolean
Default: False
Required: No
Configuration Logging and/or LogStream Section: If you set this parameter in the Logging and LogStream sections, the setting in the LogStream section takes precedence for the specified log stream.
Example: LogSysLog=true
LogTime
Display time with each log entry. Enable this parameter to display the current time next to each log entry in the log file.
Type: Boolean
Default: True
Required: No
Configuration Logging and/or LogStream Section: If you set this parameter in the Logging and LogStream sections, the setting in the LogStream section takes precedence for the specified log stream.
Example: LogTime=false
See Also: “LogFile” on page 223
LogTypeCSVs
List of message types to log. Type one or more of the following message types to specify the type of messages written to the associated log file. If you want to type multiple message types, separate them with commas (there must be no space before or after a comma):
Option Description
All Components
Action Logs actions and related messages.
Application Logs application-related occurrences.
IDOL Server
Agent Logs agent actions and related messages.
Category Logs category actions and related messages.
Cluster Logs cluster actions and related messages.
Community Logs community actions and related messages.
ExtendedIndex Logs index actions as well as index actions that are sent after IDOL Server has routed incoming data through other processes.
Index Logs index actions and related messages.
Mailer Logs mailer actions and related messages.
Profile Logs profile actions and related messages.
Query Logs query actions and related messages.
QueryTerms Logs each query term, after stemming, conversion to UTF8, capitalization and punctuation removal. This is mainly used by the Autonomy DiSH server for statistical reports.
Role Logs role actions and related messages.
Schedule Logs schedule actions and related messages.
Taxonomy Logs taxonomy actions and related messages.
User Logs user actions and related messages.
User_Audit Logs UserAdd and UserDelete actions and related messages.
UserTerm Logs terms that IDOL Server uses to form a user's agents and profiles.
DIH
Index Logs index actions and related messages.
Query Logs query actions and related messages.
DAH
Security Logs security action results.
DiSH
Alert Logs alert actions and related messages.
AlertResults Logs alert action results.
Audit Logs audit actions and related messages.
Schedule Logs schedule actions and related messages.
ScheduleResults Logs schedule action results.
Connectors
FailureList Logs details of files that were not imported successfully.
Import Logs import actions and related messages.
Index Logs index actions and related messages.
Spider Logs spider actions and related messages. (HTTP Connector only)
CFS
Import Logs import actions and related messages.
Indexer Logs the status of Indexing into IDOL.
CFS Connectors
Collect Logs document collection for use in Legal Hold applications.
Delete Logs the deletion of documents from the repository.
Hold Logs details of documents that are put on hold in Legal Hold applications.
Identifiers Logs details of requests for document lists from repositories.
Insert Logs the insertion of documents into the repository.
Synchronize Logs data synchronization when ingesting into IDOL.
Update Logs details of documents whose metadata is updated in the repository.
View Logs details of documents that are viewed from the repository.
Transcode Server
Transcode Logs details of transcoding.
Type: String
Default: None
Required: Yes
Configuration LogStream Section:
Example: LogTypeCSVs=Application,Index
See Also: “LogFile” on page 223
CHAPTER 14 Secure Socket Layer Parameters
This section describes the configuration parameters used to configure Secure Socket Layer (SSL) connections between components.
NOTE These parameters usually appear in the [SSLOptions] section of the configuration file.
SSLConfig
SSLCACertificate
SSLCACertificatesPath
SSLCertificate
SSLCheckCertificate
SSLCheckCommonName
SSLMethod
SSLPrivateKey
SSLPrivateKeyPassword
SSLConfig
Identifies the configuration section in which the SSL configuration details are specified, usually SSLOptionN. You must set this parameter if you are using SSL connections between components. To control incoming ACI calls, set this parameter in the [Server] or [Default] section. To control outgoing ACI calls, set this parameter in another component section, such as [DataDRE], [CatDRE], or a connector Job section. The section in which you set SSLConfig depends on whether you are using a distributed architecture and on which component you are configuring. For example, in a standalone Category configuration, you can set SSLConfig in the [Server], [DataDRE], [CatDRE], and [CommunityServer] sections. See each component’s documentation for more information.
Type: String
Default: None
Required: No
Configuration Section: Server or Default, or other section for outgoing communications
Example:
[Server]
SSLConfig=SSLOptions1
...
[AgentDRE]
SSLConfig=SSLOptions2
...
[DataDRE]
SSLConfig=SSLOptions2
...
// For Omni Group Servers:
[Note]
GroupServerHost=...
GroupServerPort=...
SSLConfig=SSLOptions2
[SSLOptions1]
//SSL options for incoming connections
SSLMethod=SSLV23
SSLCertificate=host1.crt
SSLPrivateKey=host1.key
SSLCACertificate=trusted.crt
[SSLOptions2]
//SSL options for outgoing connections
SSLMethod=SSLV23
SSLCertificate=host2.crt
SSLPrivateKey=9s7BxMjD2d3M3t7awt/J8A
SSLCACertificate=trusted.crt
See Also:
“SSLCACertificate” on page 238
“SSLCertificate” on page 240
“SSLCheckCertificate” on page 240
“SSLCheckCommonName” on page 241
“SSLMethod” on page 241
“SSLPrivateKey” on page 242
“SSLPrivateKeyPassword” on page 243
SSLCACertificate
Certificate Authority (CA) certificate file of a trusted authority. The component only trusts communication with a peer that provides a certificate signed by the specified CAs.
Type: String
Default: None
Required: No
Configuration Section: SSLOptionN
Example: SSLCACertificate=trusted.crt
See Also: “SSLConfig” on page 236
SSLCACertificatesPath
Use this parameter to specify the path to a directory containing multiple CA certificates in PEM format to check against. Each file must contain one CA certificate. The files are looked up by the CA subject name hash value, which must be available. If more than one CA certificate with the same name hash value exists, the extension must be different (for example, 9dd6633f0.0, 9dd6633f0.1, and so on). The search is performed in the order of the extension number, regardless of other properties of the certificates. As an alternative, you can specify the path to a file containing multiple CA certificates in PEM format. The file can contain certificates identified by sequences like the following example:
-----BEGIN CERTIFICATE-----
... (CA certificate in base64 encoding) ...
-----END CERTIFICATE-----
You can insert text before, between and after the certificates to be used as descriptions of the certificates.
CAUTION If several CA certificates matching the name, key identifier, and serial number condition are available, only the first one is examined. This might lead to unexpected results if the same CA certificate is available with different expiration dates. If a “certificate expired” verification error occurs, no other certificate is searched. Make sure to not have expired certificates mixed with valid ones.
For more information, refer to: http://www.openssl.org/docs/ssl/SSL_CTX_load_verify_locations.html.
Type: String
Default: None
Required: No
Configuration Section: SSLOptionN
Example: SSLCACertificatesPath=C:\Autonomy\HTTPConnector\CACERTS\
See Also: “SSLConfig” on page 236
SSLCertificate
SSL Certificate file to use to identify this component to a peer. It can be either ASN1 or PEM format. This parameter requires a matching SSLPrivateKey value.
Type: String
Default: None
Required: Yes
Configuration Section: SSLOptionN
Example:
SSLCertificate=host1.crt
SSLPrivateKey=host1.key
See Also:
“SSLConfig” on page 236
“SSLPrivateKey” on page 242
SSLCheckCertificate
Specifies whether a certificate signed by a trusted authority is requested from peers.
Setting SSLCACertificate implicitly sets SSLCheckCertificate to true. If SSLCheckCertificate is set to false, communications are encrypted, but certificates are not requested from peers.
Type: Boolean
Default: True if SSLCACertificate is set. False if SSLCACertificate is not set.
Required: No
Configuration Section: SSLOptionN
Example: SSLCheckCertificate=true
See Also: “SSLConfig” on page 236
SSLCheckCommonName
Verifies the identity of the peer. Specifies whether the host name listed in the peer's certificate (that is, the CommonName or "CN" attribute) resolves to the same IP address as the peer itself, as determined by the network connection.
For example, if the host name in a certificate is eip.autonomy.com and resolves to an IP address of 12.3.4.56, then the peer must share the same IP address.
Type: Boolean
Default: False
Required: No
Configuration Section: SSLOptionN
Example: SSLCheckCommonName=true
See Also: “SSLConfig” on page 236
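The comparison that SSLCheckCommonName describes can be sketched in Java as follows. This is an illustration only, not the component's implementation: the class name CommonNameCheck and the method resolvesToPeer are hypothetical, and the sketch assumes that an unresolvable host name counts as a failed check.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical illustration; not part of the ConnectorLib Java SDK. */
final class CommonNameCheck {
    private CommonNameCheck() {}

    /**
     * Returns true if any address that commonName resolves to equals the
     * peer's address. An unresolvable name is treated as a failed check
     * (an assumption made for this sketch).
     */
    static boolean resolvesToPeer(String commonName, String peerIp) {
        try {
            InetAddress peer = InetAddress.getByName(peerIp);
            for (InetAddress candidate : InetAddress.getAllByName(commonName)) {
                if (candidate.equals(peer)) {
                    return true;
                }
            }
            return false;
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // IP literals are parsed without a DNS lookup, so this runs offline.
        System.out.println(resolvesToPeer("12.3.4.56", "12.3.4.56"));
    }
}
```

In practice the first argument would be the CommonName read from the peer's certificate, and the second the address reported by the network connection.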
SSLMethod
Specifies which SSL protocol is used. The options are:
SSLV2
SSLV3
SSLV23
TLSV1
SSLV23 is used in most cases.
Type: String
Default: None
Required: Yes
Configuration Section: SSLOptionN
Example: SSLMethod=SSLV23
See Also: “SSLConfig” on page 236
SSLPrivateKey
The private security key for the SSL certificate. It can be either ASN1 or PEM format. This parameter requires a matching SSLCertificate value.
Type: String
Default: None
Required: Yes
Configuration Section: SSLOptionN
Example:
SSLCertificate=host1.crt
SSLPrivateKey=host1.key
See Also:
“SSLConfig” on page 236
“SSLCertificate” on page 240
“SSLPrivateKeyPassword” on page 243
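The required parameters of an SSL options section can be checked before a component is started. The following Java sketch is an illustration under stated assumptions: the class SslOptionsCheck is hypothetical (it is not part of the SDK), the section is assumed to be already loaded into a Map of parameter names to values, and SSLMethod values are assumed to match the listed options exactly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper; not part of the ConnectorLib Java SDK. */
final class SslOptionsCheck {
    // The SSLMethod options listed in this chapter.
    private static final Set<String> METHODS =
            new HashSet<>(Arrays.asList("SSLV2", "SSLV3", "SSLV23", "TLSV1"));

    private SslOptionsCheck() {}

    /** Returns a list of problems found in one SSL options section. */
    static List<String> problems(Map<String, String> section) {
        List<String> problems = new ArrayList<>();
        String method = section.get("SSLMethod");
        if (method == null || !METHODS.contains(method.trim())) {
            problems.add("SSLMethod must be one of SSLV2, SSLV3, SSLV23, TLSV1");
        }
        // SSLCertificate and SSLPrivateKey are both required and must be paired.
        if (!section.containsKey("SSLCertificate")) {
            problems.add("SSLCertificate is required");
        }
        if (!section.containsKey("SSLPrivateKey")) {
            problems.add("SSLPrivateKey is required");
        }
        return problems;
    }

    public static void main(String[] args) {
        Map<String, String> section = new java.util.HashMap<>();
        section.put("SSLMethod", "SSLV23");
        section.put("SSLCertificate", "host1.crt");
        System.out.println(problems(section));
    }
}
```

The sketch checks only that the parameters are present; it does not verify that the key file actually matches the certificate.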
SSLPrivateKeyPassword
The password for the file defined in SSLPrivateKey. The password can be in plain text, or in basic or AES encrypted format.
Type: String
Default: None
Required: No
Configuration Section: SSLOptionN
Example:
[SSLOption0]
SSLCertificate=host1.crt
SSLPrivateKey=host1.key
SSLPrivateKeyPassword=PvKey1559
In this example, the private key password to the file host1.key is written in plain text.
...
[SSLOption0]
SSLCertificate=host1.crt
SSLPrivateKey=host1.key
SSLPrivateKeyPassword=9s7BxMjD2d3M3t7awt/J8A
In this example, the private key password to the file host1.key has basic encryption.
See Also:
“SSLConfig” on page 236
“SSLPrivateKey” on page 242
CHAPTER 15 Service Configuration Parameters
This section describes the Service configuration parameters that determine which machines are permitted to use and control a service.
ServiceACIMode
ServiceControlClients
ServiceHost
ServicePort
ServiceStatusClients
If the ServicePort, ServiceStatusClients, and ServiceControlClients configuration parameters are specified, the service port is enabled and accepts the standard status and control actions described in “Service Actions” on page 249.
ServiceACIMode
Specifies whether the service generates ACI-compatible XML.
Type: Boolean
Default: False
Required: No
Configuration Section: Service
Example: ServiceACIMode=false
See Also:
“ServiceControlClients” on page 246
“ServiceHost” on page 247
“ServicePort” on page 247
“ServiceStatusClients” on page 248
ServiceControlClients
The IP addresses or names of clients that can send service control actions to the service. To specify multiple addresses, separate them with commas (there must be no space before or after the comma). Alternatively, you can use wildcards in the IP address. For example, type 187.*.*.* to permit any machine whose IP address begins with 187 to control the connector.
Type: String
Default: None
Required: Yes
Configuration Section: Service
Example: ServiceControlClients=localhost,127.0.0.1
See Also:
“ServiceACIMode” on page 246
“ServiceHost” on page 247
“ServicePort” on page 247
“ServiceStatusClients” on page 248
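The comma-separated client list and the wildcard form can be illustrated with a short Java sketch. The class ClientList and its methods are hypothetical, and the matching rules are assumptions for illustration: entries without a wildcard are compared as literal strings, and a * stands for exactly one dotted segment. How the service itself resolves names and matches addresses is not specified here.

```java
/** Hypothetical illustration; not part of the ConnectorLib Java SDK. */
final class ClientList {
    private ClientList() {}

    /** Returns true if the comma-separated clients value permits the client. */
    static boolean permits(String clientsValue, String client) {
        // Entries are separated by commas with no surrounding spaces.
        for (String entry : clientsValue.split(",")) {
            if (matches(entry, client)) {
                return true;
            }
        }
        return false;
    }

    private static boolean matches(String pattern, String candidate) {
        if (!pattern.contains("*")) {
            return pattern.equalsIgnoreCase(candidate);
        }
        String[] patternParts = pattern.split("\\.");
        String[] candidateParts = candidate.split("\\.");
        if (patternParts.length != candidateParts.length) {
            return false;
        }
        for (int i = 0; i < patternParts.length; i++) {
            // "*" accepts any value in this segment of the address.
            if (!patternParts[i].equals("*") && !patternParts[i].equals(candidateParts[i])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(permits("localhost,127.0.0.1", "127.0.0.1"));
        System.out.println(permits("187.*.*.*", "187.20.3.4"));
    }
}
```

The same sketch applies to the ServiceStatusClients parameter, which uses the same list format.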
ServiceHost
The host server on which the service is running.
Type: String
Default: *.*.*.*
Required: Yes
Configuration Section: Service
Example: ServiceHost=127.0.0.1
See Also:
“ServiceACIMode” on page 246
“ServiceControlClients” on page 246
“ServicePort” on page 247
“ServiceStatusClients” on page 248
ServicePort
The port on the host server on which the service listens for service status and control requests.
Type: Long
Default: 40010
Required: Yes
Allowed Range: Minimum: 1 Maximum: 65535
Recommended Range: Minimum: 1024 Maximum: 65535
Configuration Section: Service
Example: ServicePort=40010
See Also:
“ServiceACIMode” on page 246
“ServiceControlClients” on page 246
“ServiceHost” on page 247
“ServiceStatusClients” on page 248
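The allowed and recommended ranges for ServicePort can be expressed as a simple check. The following Java sketch is illustrative only; the class ServicePortCheck is hypothetical and is not part of the SDK.

```java
/** Hypothetical illustration; not part of the ConnectorLib Java SDK. */
final class ServicePortCheck {
    static final long ALLOWED_MIN = 1;
    static final long RECOMMENDED_MIN = 1024;
    static final long MAX = 65535;

    private ServicePortCheck() {}

    /** True if the port lies in the allowed range (1 to 65535). */
    static boolean isAllowed(long port) {
        return port >= ALLOWED_MIN && port <= MAX;
    }

    /** True if the port lies in the recommended range (1024 to 65535). */
    static boolean isRecommended(long port) {
        return port >= RECOMMENDED_MIN && port <= MAX;
    }

    public static void main(String[] args) {
        long port = 40010; // the documented default
        System.out.println(isAllowed(port) && isRecommended(port));
    }
}
```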
ServiceStatusClients
The IP addresses or names of clients that can request status information from a service. These clients cannot control the service. To specify multiple addresses, separate them with commas (there must be no space before or after the comma). Alternatively, you can use wildcards in the IP address. For example, type 187.*.*.* to permit any machine whose IP address begins with 187 to access the service's status.
Type: String
Default: None
Required: Yes
Configuration Section: Service
Example: ServiceStatusClients=localhost,127.0.0.1
See Also:
“ServiceACIMode” on page 246
“ServiceControlClients” on page 246
“ServiceHost” on page 247
“ServicePort” on page 247
CHAPTER 16 Service Actions
This section describes the Service actions.
Action Syntax
GetConfig
GetLogStream
GetLogStreamNames
GetStatistics
GetStatus
GetStatusInfo
Stop
Service Action Parameters
If the ServicePort, ServiceStatusClients, and ServiceControlClients configuration parameters are specified, the service port is enabled and accepts the status and control actions described in this section.
Action Syntax
The actions use the following format:
http://Host:Port/action=ActionName&[Parameters]
where,
Host The IP address (or name) of the machine hosting the service.
Port The ServicePort specified in the Service section of the service’s configuration.
ActionName One of the actions described in this section.
Parameters One or more parameters that might be required by an action.
For example:
http://12.3.4.56:40010/action=GetConfig
This action uses port 40010 to request the service’s configuration file settings.
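The action format above can be assembled programmatically. The following Java sketch is a minimal illustration: the class ServiceActionUrl and its build method are hypothetical and are not part of the SDK. The sketch appends parameter values as given and does not URL-encode them, which a real client would normally need to do.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; not part of the ConnectorLib Java SDK. */
final class ServiceActionUrl {
    private ServiceActionUrl() {}

    /** Builds http://Host:Port/action=ActionName followed by &Name=Value pairs. */
    static String build(String host, int port, String actionName,
                        Map<String, String> parameters) {
        StringBuilder url = new StringBuilder("http://")
                .append(host).append(':').append(port)
                .append("/action=").append(actionName);
        // Parameters are appended in the iteration order of the map.
        for (Map.Entry<String, String> parameter : parameters.entrySet()) {
            url.append('&').append(parameter.getKey())
               .append('=').append(parameter.getValue());
        }
        return url.toString();
    }

    public static void main(String[] args) {
        Map<String, String> none = new LinkedHashMap<>();
        System.out.println(build("12.3.4.56", 40010, "GetConfig", none));
    }
}
```

A LinkedHashMap keeps the parameters in insertion order, which makes the generated URL predictable when you compare it against the examples in this chapter.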
Related Topics “Service Actions” on page 249
GetConfig
The GetConfig action returns the service’s configuration file settings.
Example action=GetConfig
Parameters None.
GetLogStream
The GetLogStream action returns a specific log stream for the service.
Example action=GetLogStream&Name=ApplicationLogStream&FromDisk=true&Tail=10
This action displays the first ten entries of the ApplicationLogStream log.
Parameters The action has the following parameters. Name is required; FromDisk and Tail are optional.
Parameter Description
FromDisk Specifies whether the log stream is read from disk or memory.
Name The name of the log stream you want to return.
Tail The number of lines from the log stream to return.
GetLogStreamNames
The GetLogStreamNames action returns the names of the log streams defined for the service.
Example
action=GetLogStreamNames
Parameters None.
GetStatistics
The GetStatistics action returns statistics for the service. Each statistic is returned in an autn:stat XML element. This element contains the following attributes:
class The group that the statistic belongs to. For example, Service.
autnid The sub-group that the statistic belongs to. For example, Documents.
name The name of the statistic.
metric The type of statistic. This can have one of the following values:
0 String
1 Bytes
2 Bytes per second
3 per second
4 percent
5 count
6 number
7 timestamp
8 seconds
9 milliseconds
10 maximum
value The value of the statistic.
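A client that reads autn:stat elements needs to translate the numeric metric attribute into a unit. The following Java sketch shows one way to do this with a lookup table. The class StatMetric is hypothetical and is not part of the SDK; the code-to-name mapping is taken from the metric values listed above.

```java
/** Hypothetical helper; not part of the ConnectorLib Java SDK. */
final class StatMetric {
    // Index in this array corresponds to the metric attribute value.
    private static final String[] NAMES = {
        "String", "Bytes", "Bytes per second", "per second", "percent",
        "count", "number", "timestamp", "seconds", "milliseconds", "maximum"
    };

    private StatMetric() {}

    /** Returns the metric name for a code, or a marker for unlisted codes. */
    static String describe(int code) {
        if (code < 0 || code >= NAMES.length) {
            return "unknown (" + code + ")";
        }
        return NAMES[code];
    }

    public static void main(String[] args) {
        System.out.println(describe(8));
    }
}
```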
For example:
Class Statistic Description
[Service] Class
[Statistics]
ServiceDuration The number of seconds the service has been running.
10SecondResponseAverage The average service response time (in milliseconds) measured over the last 10 seconds.
10SecondRequestsPerSecond The number of requests to the service per second within the last 10 seconds.
10SecondRequests The number of requests to the service in the last 10 seconds.
60SecondResponseAverage The average service response time (in milliseconds) measured over the last 60 seconds.
60SecondRequestsPerSecond The number of requests to the service per second within the last 60 seconds.
60SecondPeakRequestsPerSecond The highest number of requests to the service over any 60 second period.
60SecondRequests The number of requests to the service in the last 60 seconds.
1HourResponseAverage The average service response time (in milliseconds) measured over the last hour.
1HourRequestsPerSecond The number of requests to the service per second within the last hour.
1HourPeakRequestsPerSecond The highest number of requests to the service over any 1 hour period.
1HourRequests The number of requests to the service in the last hour.
24HourResponseAverage The average service response time (in milliseconds) measured over the last 24 hours.
24HourRequestsPerSecond The number of requests to the service per second within the last 24 hours.
24HourPeakRequestsPerSecond The highest number of requests to the service over any 24 hour period.
24HourRequests The number of requests to the service in the last 24 hours.
RecentResponseAverage The average service response time (in milliseconds) from the time the last 10 second period finished to the current time.
RecentRequestsPerSecond The number of requests to the service per second from the time the last 10 second period finished to the current time.
RecentPeakRequestsPerSecond The highest number of requests to the service from the time the last 10 second period finished to the current time.
RecentRequests The number of requests to the service from the time the last 10 second period finished to the current time.
TotalRequests The total number of requests that were made to the service.
The following statistics are returned for specific components:
Class Statistic Description
[Service] Class
[Statistics]
TruncatedQueries The number of queries that timed out.
[Documents]
Total The total number of documents that this IDOL Server contains.
Sections The number of document sections that this IDOL Server contains.
TotalSlots The total number of document sections that the IDOL Server contains including document sections that have been deleted.
[Databases]
Number The total number of databases including empty databases and databases that have been deleted.
Active The number of active databases (databases that are empty or contain data).
[ACI] Class
[Action:ActionName]
Count The total number of ActionName actions that were sent to the service.
Avg.Duration The average duration (in ms) of ActionName actions.
Shortest The shortest duration (in ms) of ActionName actions.
Longest The longest duration (in ms) of ActionName actions.
[Indexer] Class
[Connections]
Total The number of socket connections to the index port.
Unauthorized The number of index actions that IDOL Server received from unauthorized clients.
Paused The number of connections that were rejected because the service was paused.
InsufficientDiskSpace The number of connections that were rejected because there was insufficient disk space.
InvalidIndexCode The number of connections that were rejected because they contained an invalid index code.
[Commands]
Invalid The number of actions that the service received to the index port that were not valid index actions.
TruncatedData The number of index actions that were received that had truncated data.
CommandName The number of CommandName index actions that were run.
[Command:CommandName]
Avg.Duration The average duration (in ms) of CommandName index actions.
Shortest The shortest duration (in ms) of CommandName index actions.
Longest The longest duration (in ms) of CommandName index actions.
CommandsRejectedDiskFull The number of index actions that were rejected because the disk was full.
CommandsRejectedInvalidIndexCode The number of index actions that were rejected because their index code was invalid.
[Streaming]
BytesStreamedToDisk The number of bytes of data that the service has streamed to disk.
TimeSpentStreaming The amount of time in seconds that the service has spent streaming data.
[Queue]
Received The number of index actions that have been received.
Completed The number of index actions that have been completed.
Queued The number of index actions that are in the index queue.
[Rejected Commands]
Invalid The number of index actions that were rejected because they were not recognized actions.
RejectedInvalidDatabase The number of index actions that were rejected because they contained an invalid database.
ReadOnlyDatabase The number of index actions that were rejected because they contained a read-only database.
FileNotFound The number of index actions that were rejected because the file was not found.
DocLimitExceeded The number of index actions that were rejected because the document limit was exceeded.
IndexSizeExceeded The number of index actions that were rejected because the maximum index size was exceeded.
UserConfIndexLimitExceeded The number of index actions that were rejected because the configured maximum allowed index size was exceeded.
OutOfMemory The number of index actions that were rejected because IDOL server was out of memory.
BadParameter The number of index actions that were rejected because they contained an invalid parameter or parameter value.
InsufficientFileHandles The number of index actions that were rejected because there were insufficient file handles.
InsufficientDiskSpace The number of index actions that were rejected because there was not enough disk space.
TruncatedData The number of POST index actions that were rejected because their data termination was incorrect.
SuccessfullyProcessed The number of successfully run index actions.
OndiskComponent The number of index actions that have data stored on disk.
[Documents]
ReplacedReindex The number of documents that were re-indexed because an ACLType or Index field had changed.
ReplacedDocsTotal The number of documents that have been replaced.
InvalidDatabaseDocs The number of documents that were not indexed because their database was invalid.
[Database] Class
[DatabaseName]
Documents The number of documents that this database contains.
Sections The number of document sections that this database contains.
[Server] Class
[Tasks]
Number The number of tasks set up in the configuration file.
StartTask The first task that is performed.
IndexCommands The number of index actions that have been processed (the number displayed includes any index action that is currently being processed).
Documents The number of documents that have been processed (the number displayed includes any document that is currently being processed).
DocumentSuccesses The number of documents that have been processed successfully.
DocumentFailures The number of times that document processing has failed.
Sections The number of document sections processed.
[Tasks] Class
[TaskName]
Requests The number of requests sent to a specific task.
Successes The number of requests processed successfully by a specific task.
Failures The number of request-processing failures for a specific task.
[Licensing] Class
[Users]
Maximum The maximum number of users that can be set up for this service.
[Statistics] Class
[Users]
Users The number of users that have been set up for this service.
[CHILDSTAT] Class
[AllChildren]
TotalUpEvents The number of times a DIH child server was marked up.
TotalDownEvents The number of times a DIH child server was marked down.
[Engine N]
UpEvents The number of times this DIH child server was marked up.
DownEvents The number of times this DIH child server was marked down.
CommandsSent The number of actions that were sent to this DIH child server.
Retries The number of times actions to this DIH child server were retried.
TotalBytesSent The total number of bytes of data that were sent to this DIH child server.
AvgSendCommandRate The average rate that actions were sent to this DIH child server.
MinResponseTime The shortest time that DAH took to respond to a request.
AvgResponseTime The average time that DAH took to respond to requests.
MaxResponseTime The longest time that DAH took to respond to a request.
SuccessfulActions The number of actions that were successfully completed.
FailedActions The number of actions that failed.
Timeouts The number of actions that timed out.
Example action=GetStatistics
Parameters None.
GetStatus
The GetStatus action returns the service’s status (running or stopped) and some current configuration settings.
Example action=GetStatus
Parameters None.
GetStatusInfo
The GetStatusInfo action returns status information for the service (for example, the service’s product name, version number, and so on). The following status information is returned:
Statistics Description
[StatusInfo]
ServiceStartTime The time the service started running (in epoch seconds).
ServiceUtilsVersion The version of the service utilities.
ServiceName The name of the service.
ProductName The product name of the service.
ProductVersion The version of the product.
ProductBuild The build of the product.
ServicePID The process ID of the service.
ProductUID The user identifier of the service.
ServiceStatus The status of the service (running or stopped).
[Job]
FlowRate The amount of data (in kilobytes) being aggregated per second.
Status The status of the connector job (running or stopped).
Example action=GetStatusInfo
Parameters None.
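Because ServiceStartTime is reported in epoch seconds, a client usually converts it before display. The following Java sketch shows the conversion and an uptime calculation using java.time. The class ServiceUptime is hypothetical and is not part of the SDK, and the sketch assumes the value has already been extracted from the GetStatusInfo response as a long.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper; not part of the ConnectorLib Java SDK. */
final class ServiceUptime {
    private ServiceUptime() {}

    /** Converts a ServiceStartTime value (epoch seconds) to an Instant. */
    static Instant startTime(long serviceStartTimeEpochSeconds) {
        return Instant.ofEpochSecond(serviceStartTimeEpochSeconds);
    }

    /** Returns how long the service had been running at the given moment. */
    static Duration uptime(long serviceStartTimeEpochSeconds, Instant now) {
        return Duration.between(startTime(serviceStartTimeEpochSeconds), now);
    }

    public static void main(String[] args) {
        long serviceStartTime = 1336608000L; // sample value, for illustration only
        System.out.println(startTime(serviceStartTime));
        System.out.println(uptime(serviceStartTime, Instant.now()));
    }
}
```

Passing the current time as an argument, rather than calling Instant.now() inside the method, keeps the calculation easy to test.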
Stop
The Stop action stops the service.
Example action=Stop
Parameters None.
Service Action Parameters
This section describes the parameters for service actions.
FromDisk
Name
Tail
FromDisk
Specifies whether the log stream is read from disk or memory. Type true if you want the log stream to be read from disk rather than from memory.
Action GetLogStream
Type: Boolean
Default: false
Required: No
Example: action=GetLogStream&Name=ApplicationLogStream&FromDisk=true&Tail=10
Name
Type the name of the log stream you want to return.
Action GetLogStream
Type: String
Default: false
Required: Yes
Example: action=GetLogStream&Name=ApplicationLogStream&FromDisk=true&Tail=10
Tail
Type the number of lines from the log stream to return. The lines are read from the top (that is, the most recent lines are returned). Type -1 to return all entries.
Action GetLogStream
Type: Long
Default: -1
Required: No
Example: action=GetLogStream&Name=ApplicationLogStream&Tail=10
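The defaults described above (FromDisk=false and Tail=-1) mean that only Name must appear in a GetLogStream action. The following Java sketch builds the action string and omits parameters that are left at their defaults. The class GetLogStreamQuery is hypothetical and is not part of the SDK.

```java
/** Hypothetical helper; not part of the ConnectorLib Java SDK. */
final class GetLogStreamQuery {
    private GetLogStreamQuery() {}

    /**
     * Builds a GetLogStream action string. Name is required; FromDisk and
     * Tail are added only when they differ from their documented defaults.
     */
    static String build(String name, boolean fromDisk, long tail) {
        if (name == null || name.isEmpty()) {
            throw new IllegalArgumentException("Name is required");
        }
        StringBuilder query = new StringBuilder("action=GetLogStream&Name=").append(name);
        if (fromDisk) {
            query.append("&FromDisk=true"); // default is false
        }
        if (tail != -1) {
            query.append("&Tail=").append(tail); // default -1 returns all entries
        }
        return query.toString();
    }

    public static void main(String[] args) {
        System.out.println(build("ApplicationLogStream", true, 10));
    }
}
```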
Appendixes
This section includes the following appendixes:
KeyView Format Codes
APPENDIX A KeyView Format Codes
This chapter lists the KeyView format classes and codes used with Connector Framework server. It includes the following sections:
KeyView Classes
KeyView Formats
Table 1 lists KeyView file classes. The numbers are reported in the DocumentClass field in IDX files generated by Import Module. Consult the table to determine the file class that was imported.
Table 2 lists all KeyView formats. The numbers are reported in the DocumentType field in IDX files generated by Import Module. Consult the table to determine the file type that was imported. You can use any of the format numbers from Table 2 in conjunction with the ImportFamilyRootExcludeFmtCSV parameter. For more information, see “ImportFamilyRootExcludeFmtCSV” on page 203.
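A value for ImportFamilyRootExcludeFmtCSV can be assembled from format numbers in Table 2. The following Java sketch is an illustration only: the class ExcludeFormats is hypothetical, and it assumes that the parameter takes a comma-separated list of format numbers, as the CSV suffix of its name suggests. See the parameter's own reference entry for the authoritative syntax.

```java
import java.util.stream.Collectors;
import java.util.stream.IntStream;

/** Hypothetical helper; not part of the ConnectorLib Java SDK. */
final class ExcludeFormats {
    private ExcludeFormats() {}

    /** Joins KeyView format numbers into a comma-separated value. */
    static String toCsv(int... formatNumbers) {
        return IntStream.of(formatNumbers)
                .mapToObj(number -> Integer.toString(number))
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // 157 is PKZIP_Fmt and 159 is ARC_PAK_Archive_Fmt in Table 2.
        System.out.println("ImportFamilyRootExcludeFmtCSV=" + toCsv(157, 159));
    }
}
```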
KeyView Classes
Table 1 KeyView Classes
Attribute Number File Class
0 No file class
01 Word processor
02 Spreadsheet
03 Database
04 Raster image
05 Vector graphic
06 Presentation
07 Executable
08 Encapsulation
09 Sound
10 Desktop publishing
11 Outline/planning
12 Miscellaneous
13 Mixed format
14 Font
15 Time scheduling
16 Communications
17 Object module
18 Library module
19 Fax
20 Movie
21 Animation
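When you process IDX files, the DocumentClass number can be translated back to a file class with a lookup built from Table 1. The following Java sketch is illustrative: the class KeyViewClasses is hypothetical and is not part of the SDK, and it assumes the field value is available as a string such as 01 or 21.

```java
/** Hypothetical helper; not part of the ConnectorLib Java SDK. */
final class KeyViewClasses {
    // Index in this array corresponds to the attribute number in Table 1.
    private static final String[] NAMES = {
        "No file class", "Word processor", "Spreadsheet", "Database",
        "Raster image", "Vector graphic", "Presentation", "Executable",
        "Encapsulation", "Sound", "Desktop publishing", "Outline/planning",
        "Miscellaneous", "Mixed format", "Font", "Time scheduling",
        "Communications", "Object module", "Library module", "Fax",
        "Movie", "Animation"
    };

    private KeyViewClasses() {}

    /** Returns the file class for a DocumentClass field value. */
    static String fileClass(String documentClassField) {
        int number;
        try {
            number = Integer.parseInt(documentClassField.trim());
        } catch (NumberFormatException e) {
            return "Unknown";
        }
        if (number < 0 || number >= NAMES.length) {
            return "Unknown";
        }
        return NAMES[number];
    }

    public static void main(String[] args) {
        System.out.println(fileClass("01"));
    }
}
```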
KeyView Formats
Table 2 KeyView Formats
Format Name Format Number Format Description
AES_Multiplus_Comm_Fmt 1 Multiplus (AES)
ASCII_Text_Fmt 2 Text
MSDOS_Batch_File_Fmt 3 MS-DOS Batch File
Applix_Alis_Fmt 4 APPLIX ASTERIX
BMP_Fmt 5 Windows Bitmap
CT_DEF_Fmt 6 Convergent Technologies DEF Comm. Format
Corel_Draw_Fmt 7 Corel Draw
CGM_ClearText_Fmt 8 Computer Graphics Metafile (CGM)
CGM_Binary_Fmt 9 Computer Graphics Metafile (CGM)
CGM_Character_Fmt 10 Computer Graphics Metafile (CGM)
Word_Connection_Fmt 11 Word Connection
COMET_TOP_Word_Fmt 12 COMET TOP
CEOwrite_Fmt 13 CEOwrite
DSA101_Fmt 14 DSA101 (Honeywell Bull)
DCA_RFT_Fmt 15 DCA-RFT (IBM Revisable Form)
CDA_DDIF_Fmt 16 CDA / DDIF
DG_CDS_Fmt 17 DG Common Data Stream (CDS)
Micrografx_Draw_Fmt 18 Windows Draw (Micrografx)
Data_Point_VistaWord_Fmt 19 Vistaword
DECdx_Fmt 20 DECdx
Enable_WP_Fmt 21 Enable Word Processing
EPSF_Fmt 22 Encapsulated PostScript
Preview_EPSF_Fmt 23 Encapsulated PostScript
MS_Executable_Fmt 24 MSDOS/Windows Program
G31D_Fmt 25 CCITT G3 1D
GIF_87a_Fmt 26 Graphics Interchange Format (GIF87a)
GIF_89a_Fmt 27 Graphics Interchange Format (GIF89a)
HP_Word_PC_Fmt 28 HP Word PC
IBM_1403_LinePrinter_Fmt 29 IBM 1403 Line Printer
IBM_DCF_Script_Fmt 30 DCF Script
IBM_DCA_FFT_Fmt 31 DCA-FFT (IBM Final Form)
Interleaf_Fmt 32 Interleaf
GEM_Image_Fmt 33 GEM Bit Image
IBM_Display_Write_Fmt 34 Display Write
Sun_Raster_Fmt 35 Sun Raster
Ami_Pro_Fmt 36 Lotus Ami Pro
Ami_Pro_StyleSheet_Fmt 37 Lotus Ami Pro Style Sheet
MORE_Fmt 38 MORE Database MAC
Lyrix_Fmt 39 Lyrix Word Processing
MASS_11_Fmt 40 MASS-11
MacPaint_Fmt 41 MacPaint
MS_Word_Mac_Fmt 42 Microsoft Word for Macintosh
SmartWare_II_Comm_Fmt 43 SmartWare II
MS_Word_Win_Fmt 44 Microsoft Word for Windows
Multimate_Fmt 45 MultiMate
Multimate_Fnote_Fmt 46 MultiMate Footnote File
Multimate_Adv_Fmt 47 MultiMate Advantage
Multimate_Adv_Fnote_Fmt 48 MultiMate Advantage Footnote File
Multimate_Adv_II_Fmt 49 MultiMate Advantage II
Multimate_Adv_II_Fnote_Fmt 50 MultiMate Advantage II Footnote File
Multiplan_PC_Fmt 51 Multiplan (PC)
Multiplan_Mac_Fmt 52 Multiplan (Mac)
MS_RTF_Fmt 53 Rich Text Format (RTF)
MS_Word_PC_Fmt 54 Microsoft Word for PC
MS_Word_PC_StyleSheet_Fmt 55 Microsoft Word for PC Style Sheet
MS_Word_PC_Glossary_Fmt 56 Microsoft Word for PC Glossary
MS_Word_PC_Driver_Fmt 57 Microsoft Word for PC Driver
MS_Word_PC_Misc_Fmt 58 Microsoft Word for PC Miscellaneous File
NBI_Async_Archive_Fmt 59 NBI Async Archive Format
Navy_DIF_Fmt 60 Navy DIF
NBI_Net_Archive_Fmt 61 NBI Net Archive Format
NIOS_TOP_Fmt 62 NIOS TOP
FileMaker_Mac_Fmt 63 Filemaker MAC
ODA_Q1_11_Fmt 64 ODA / ODIF
ODA_Q1_12_Fmt 65 ODA / ODIF
OLIDIF_Fmt 66 OLIDIF (Olivetti)
Office_Writer_Fmt 67 Office Writer
PC_Paintbrush_Fmt 68 PC Paintbrush Graphics (PCX)
CPT_Comm_Fmt 69 CPT
Lotus_PIC_Fmt 70 Lotus PIC
Mac_PICT_Fmt 71 QuickDraw Picture
Philips_Script_Word_Fmt 72 Philips Script
PostScript_Fmt 73 PostScript
PRIMEWORD_Fmt 74 PRIMEWORD
Quadratron_Q_One_v1_Fmt 75 Q-One V1.93J
Quadratron_Q_One_v2_Fmt 76 Q-One V2.0
SAMNA_Word_IV_Fmt 77 SAMNA Word
Ami_Pro_Draw_Fmt 78 Lotus Ami Pro Draw
SYLK_Spreadsheet_Fmt 79 SYLK
SmartWare_II_WP_Fmt 80 SmartWare II
Symphony_Fmt 81 Symphony
Targa_Fmt 82 Targa
TIFF_Fmt 83 TIFF
Targon_Word_Fmt 84 Targon Word
Uniplex_Ucalc_Fmt 85 Uniplex Ucalc
Uniplex_WP_Fmt 86 Uniplex
MS_Word_UNIX_Fmt 87 Microsoft Word UNIX
WANG_PC_Fmt 88 WANG PC
WordERA_Fmt 89 WordERA
WANG_WPS_Comm_Fmt 90 WANG WPS
WordPerfect_Mac_Fmt 91 WordPerfect MAC
WordPerfect_Fmt 92 WordPerfect
WordPerfect_VAX_Fmt 93 WordPerfect VAX
WordPerfect_Macro_Fmt 94 WordPerfect Macro
WordPerfect_Dictionary_Fmt 95 WordPerfect Spelling Dictionary
WordPerfect_Thesaurus_Fmt 96 WordPerfect Thesaurus
WordPerfect_Resource_Fmt 97 WordPerfect Resource File
WordPerfect_Driver_Fmt 98 WordPerfect Driver
WordPerfect_Cfg_Fmt 99 WordPerfect Configuration File
WordPerfect_Hyphenation_Fmt 100 WordPerfect Hyphenation Dictionary
WordPerfect_Misc_Fmt 101 WordPerfect Miscellaneous File
WordMARC_Fmt 102 WordMARC
Windows_Metafile_Fmt 103 Windows Metafile
Windows_Metafile_NoHdr_Fmt 104 Windows Metafile (no header)
SmartWare_II_DB_Fmt 105 SmartWare II
WordPerfect_Graphics_Fmt 106 WordPerfect Graphics
WordStar_Fmt 107 WordStar
WANG_WITA_Fmt 108 WANG WITA
Xerox_860_Comm_Fmt 109 Xerox 860
Xerox_Writer_Fmt 110 Xerox Writer
DIF_SpreadSheet_Fmt 111 Data Interchange Format (DIF)
Enable_Spreadsheet_Fmt 112 Enable Spreadsheet
SuperCalc_Fmt 113 Supercalc
UltraCalc_Fmt 114 UltraCalc
SmartWare_II_SS_Fmt 115 SmartWare II
SOF_Encapsulation_Fmt 116 Serialized Object Format (SOF)
PowerPoint_Win_Fmt 117 PowerPoint PC
PowerPoint_Mac_Fmt 118 PowerPoint MAC
PowerPoint_95_Fmt 119 PowerPoint 95
PowerPoint_97_Fmt 120 PowerPoint 97
PageMaker_Mac_Fmt 121 PageMaker for Macintosh
PageMaker_Win_Fmt 122 PageMaker for Windows
MS_Works_Mac_WP_Fmt 123 Microsoft Works for MAC
MS_Works_Mac_DB_Fmt 124 Microsoft Works for MAC
MS_Works_Mac_SS_Fmt 125 Microsoft Works for MAC
MS_Works_Mac_Comm_Fmt 126 Microsoft Works for MAC
MS_Works_DOS_WP_Fmt 127 Microsoft Works for DOS
MS_Works_DOS_DB_Fmt 128 Microsoft Works for DOS
MS_Works_DOS_SS_Fmt 129 Microsoft Works for DOS
MS_Works_Win_WP_Fmt 130 Microsoft Works for Windows
MS_Works_Win_DB_Fmt 131 Microsoft Works for Windows
MS_Works_Win_SS_Fmt 132 Microsoft Works for Windows
PC_Library_Fmt 133 DOS/Windows Object Library
MacWrite_Fmt 134 MacWrite
MacWrite_II_Fmt 135 MacWrite II
Freehand_Fmt 136 Freehand MAC
Disk_Doubler_Fmt 137 Disk Doubler
HP_GL_Fmt 138 HP Graphics Language
FrameMaker_Fmt 139 FrameMaker
FrameMaker_Book_Fmt 140 FrameMaker
Maker_Markup_Language_Fmt 141 Maker Markup Language
Maker_Interchange_Fmt 142 Maker Interchange Format (MIF)
JPEG_File_Interchange_Fmt 143 JPEG File Interchange Format
Reflex_Fmt 144 Reflex
Framework_Fmt 145 Framework
Framework_II_Fmt 146 Framework II
Paradox_Fmt 147 Paradox
MS_Windows_Write_Fmt 148 Windows Write
Quattro_Pro_DOS_Fmt 149 Quattro Pro for DOS
Quattro_Pro_Win_Fmt 150 Quattro Pro for Windows
Persuasion_Fmt 151 Persuasion
Windows_Icon_Fmt 152 Windows Icon Format
Windows_Cursor_Fmt 153 Windows Cursor
MS_Project_Activity_Fmt 154 Microsoft Project
MS_Project_Resource_Fmt 155 Microsoft Project
MS_Project_Calc_Fmt 156 Microsoft Project
PKZIP_Fmt 157 ZIP Archive
Quark_Xpress_Fmt 158 Quark Xpress MAC
ARC_PAK_Archive_Fmt 159 PAK/ARC Archive
MS_Publisher_Fmt 160 Microsoft Publisher
PlanPerfect_Fmt 161 PlanPerfect
WordPerfect_Auxiliary_Fmt 162 WordPerfect auxiliary file
MS_WAVE_Audio_Fmt 163 Microsoft Wave
MIDI_Audio_Fmt 164 MIDI
AutoCAD_DXF_Binary_Fmt 165 AutoCAD DXF
AutoCAD_DXF_Text_Fmt 166 AutoCAD DXF
dBase_Fmt 167 dBase
OS_2_PM_Metafile_Fmt 168 OS/2 PM Metafile
Lasergraphics_Language_Fmt 169 Lasergraphics Language
AutoShade_Rendering_Fmt 170 AutoShade Rendering
GEM_VDI_Fmt 171 GEM VDI
Windows_Help_Fmt 172 Windows Help File
Volkswriter_Fmt 173 Volkswriter
Ability_WP_Fmt 174 Ability
Ability_DB_Fmt 175 Ability
Ability_SS_Fmt 176 Ability
Ability_Comm_Fmt 177 Ability
Ability_Image_Fmt 178 Ability
XyWrite_Fmt 179 XYWrite / Nota Bene
CSV_Fmt 180 CSV (Comma Separated Values)
IBM_Writing_Assistant_Fmt 181 IBM Writing Assistant
WordStar_2000_Fmt 182 WordStar 2000
HP_PCL_Fmt 183 HP Printer Control Language
UNIX_Exe_PreSysV_VAX_Fmt 184 Unix Executable (PDP-11/pre-System V VAX)
UNIX_Exe_Basic_16_Fmt 185 Unix Executable (Basic-16)
UNIX_Exe_x86_Fmt 186 Unix Executable (x86)
UNIX_Exe_iAPX_286_Fmt 187 Unix Executable (iAPX 286)
UNIX_Exe_MC68k_Fmt 188 Unix Executable (MC680x0)
UNIX_Exe_3B20_Fmt 189 Unix Executable (3B20)
UNIX_Exe_WE32000_Fmt 190 Unix Executable (WE32000)
UNIX_Exe_VAX_Fmt 191 Unix Executable (VAX)
UNIX_Exe_Bell_5_Fmt 192 Unix Executable (Bell 5.0)
UNIX_Obj_VAX_Demand_Fmt 193 Unix Object Module (VAX Demand)
UNIX_Obj_MS8086_Fmt 194 Unix Object Module (old MS 8086)
UNIX_Obj_Z8000_Fmt 195 Unix Object Module (Z8000)
AU_Audio_Fmt 196 NeXT/Sun Audio Data
NeWS_Font_Fmt 197 NeWS bitmap font
cpio_Archive_CRChdr_Fmt 198 cpio archive (CRC Header)
cpio_Archive_CHRhdr_Fmt 199 cpio archive (CHR Header)
PEX_Binary_Archive_Fmt 200 SUN PEX Binary Archive
Sun_vfont_Fmt 201 SUN vfont Definition
Curses_Screen_Fmt 202 Curses Screen Image
UUEncoded_Fmt 203 UU encoded
WriteNow_Fmt 204 WriteNow MAC
PC_Obj_Fmt 205 DOS/Windows Object Module
Windows_Group_Fmt 206 Windows Group
TrueType_Font_Fmt 207 TrueType Font
Windows_PIF_Fmt 208 Program Information File (PIF)
MS_COM_Executable_Fmt 209 PC (.COM)
StuffIt_Fmt 210 StuffIt (MAC)
PeachCalc_Fmt 211 PeachCalc
Wang_GDL_Fmt 212 WANG Office GDL Header
Q_A_DOS_Fmt 213 Q & A for DOS
Q_A_Win_Fmt 214 Q & A for Windows
WPS_PLUS_Fmt 215 WPS-PLUS
DCX_Fmt 216 DCX FAX Format (PCX images)
OLE_Fmt 217 OLE Compound Document
EBCDIC_Fmt 218 EBCDIC Text
DCS_Fmt 219 DCS
UNIX_SHAR_Fmt 220 SHAR
Lotus_Notes_BitMap_Fmt 221 Lotus Notes Bitmap
Lotus_Notes_CDF_Fmt 222 Lotus Notes CDF
Compress_Fmt 223 Unix Compress
GZ_Compress_Fmt 224 GZ Compress
TAR_Fmt 225 TAR
ODIF_FOD26_Fmt 226 ODA / ODIF
ODIF_FOD36_Fmt 227 ODA / ODIF
ALIS_Fmt 228 ALIS
Envoy_Fmt 229 Envoy
PDF_Fmt 230 Portable Document Format
BinHex_Fmt 231 BinHex
SMTP_Fmt 232 SMTP
MIME_Fmt 233 MIME
USENET_Fmt 234 USENET
SGML_Fmt 235 SGML
HTML_Fmt 236 HTML
ACT_Fmt 237 ACT
PNG_Fmt 238 Portable Network Graphics (PNG)
MS_Video_Fmt 239 Video for Windows (AVI)
Windows_Animated_Cursor_Fmt 240 Windows Animated Cursor
Windows_CPP_Obj_Storage_Fmt 241 Windows C++ Object Storage
Windows_Palette_Fmt 242 Windows Palette
RIFF_DIB_Fmt 243 RIFF Device Independent Bitmap
RIFF_MIDI_Fmt 244 RIFF MIDI
RIFF_Multimedia_Movie_Fmt 245 RIFF Multimedia Movie
MPEG_Fmt 246 MPEG Movie
QuickTime_Fmt 247 QuickTime Movie
AIFF_Fmt 248 Audio Interchange File Format (AIFF)
Amiga_MOD_Fmt 249 Amiga MOD
Amiga_IFF_8SVX_Fmt 250 Amiga IFF (8SVX) Sound
Creative_Voice_Audio_Fmt 251 Creative Voice (VOC)
AutoDesk_Animator_FLI_Fmt 252 AutoDesk Animator FLIC
AutoDesk_AnimatorPro_FLC_Fmt 253 AutoDesk Animator Pro FLIC
Compactor_Archive_Fmt 254 Compactor / Compact Pro
VRML_Fmt 255 VRML
QuickDraw_3D_Metafile_Fmt 256 QuickDraw 3D Metafile
PGP_Secret_Keyring_Fmt 257 PGP Secret Keyring
PGP_Public_Keyring_Fmt 258 PGP Public Keyring
PGP_Encrypted_Data_Fmt 259 PGP Encrypted Data
PGP_Signed_Data_Fmt 260 PGP Signed Data
PGP_SignedEncrypted_Data_Fmt 261 PGP Signed and Encrypted Data
PGP_Sign_Certificate_Fmt 262 PGP Signature Certificate
PGP_Compressed_Data_Fmt 263 PGP Compressed Data
PGP_ASCII_Public_Keyring_Fmt 264 ASCII-armored PGP Public Keyring
PGP_ASCII_Encoded_Fmt 265 ASCII-armored PGP encoded
PGP_ASCII_Signed_Fmt 266 ASCII-armored PGP signed
OLE_DIB_Fmt 267 OLE DIB object
SGI_Image_Fmt 268 SGI Image
Lotus_ScreenCam_Fmt 269 Lotus ScreenCam
MPEG_Audio_Fmt 270 MPEG Audio
FTP_Software_Session_Fmt 271 FTP Session Data
Netscape_Bookmark_File_Fmt 272 Netscape Bookmark File
Corel_Draw_CMX_Fmt 273 Corel CMX
AutoDesk_DWG_Fmt 274 AutoDesk Drawing (DWG)
AutoDesk_WHIP_Fmt 275 AutoDesk WHIP
Macromedia_Director_Fmt 276 Macromedia Director
Real_Audio_Fmt 277 Real Audio
MSDOS_Device_Driver_Fmt 278 MSDOS Device Driver
Micrografx_Designer_Fmt 279 Micrografx Designer
SVF_Fmt 280 Simple Vector Format (SVF)
Applix_Words_Fmt 281 Applix Words
Applix_Graphics_Fmt 282 Applix Graphics
MS_Access_Fmt 283 Microsoft Access
MS_Access_95_Fmt 284 Microsoft Access 95
MS_Access_97_Fmt 285 Microsoft Access 97
MacBinary_Fmt 286 MacBinary
Apple_Single_Fmt 287 Apple Single
Apple_Double_Fmt 288 Apple Double
Enhanced_Metafile_Fmt 289 Enhanced Metafile
MS_Office_Drawing_Fmt 290 Microsoft Office Drawing
XML_Fmt 291 XML
DeVice_Independent_Fmt 292 DeVice Independent file (DVI)
Unicode_Fmt 293 Unicode
Lotus_123_Worksheet_Fmt 294 Lotus 1-2-3
Lotus_123_Format_Fmt 295 Lotus 1-2-3 Formatting
Lotus_123_97_Fmt 296 Lotus 1-2-3 97
Lotus_Word_Pro_96_Fmt 297 Lotus Word Pro 96
Lotus_Word_Pro_97_Fmt 298 Lotus Word Pro 97
Freelance_DOS_Fmt 299 Lotus Freelance for DOS
Freelance_Win_Fmt 300 Lotus Freelance for Windows
Freelance_OS2_Fmt 301 Lotus Freelance for OS/2
Freelance_96_Fmt 302 Lotus Freelance 96
Freelance_97_Fmt 303 Lotus Freelance 97
MS_Word_95_Fmt 304 Microsoft Word 95
MS_Word_97_Fmt 305 Microsoft Word 97
Excel_Fmt 306 Microsoft Excel
Excel_Chart_Fmt 307 Microsoft Excel
Excel_Macro_Fmt 308 Microsoft Excel
Excel_95_Fmt 309 Microsoft Excel 95
Excel_97_Fmt 310 Microsoft Excel 97
Corel_Presentations_Fmt 311 Corel Presentations
Harvard_Graphics_Fmt 312 Harvard Graphics
Harvard_Graphics_Chart_Fmt 313 Harvard Graphics Chart
Harvard_Graphics_Symbol_Fmt 314 Harvard Graphics Symbol File
Harvard_Graphics_Cfg_Fmt 315 Harvard Graphics Configuration File
Harvard_Graphics_Palette_Fmt 316 Harvard Graphics Palette
Lotus_123_R9_Fmt 317 Lotus 1-2-3 Release 9
Applix_Spreadsheets_Fmt 318 Applix Spreadsheets
MS_Pocket_Word_Fmt 319 Microsoft Pocket Word
MS_DIB_Fmt 320 MS Windows Device Independent Bitmap
MS_Word_2000_Fmt 321 Microsoft Word 2000
Excel_2000_Fmt 322 Microsoft Excel 2000
PowerPoint_2000_Fmt 323 Microsoft PowerPoint 2000
MS_Access_2000_Fmt 324 Microsoft Access 2000
MS_Project_4_Fmt 325 Microsoft Project 4
MS_Project_41_Fmt 326 Microsoft Project 4.1
MS_Project_98_Fmt 327 Microsoft Project 98
Folio_Flat_Fmt 328 Folio Flat File
HWP_Fmt 329 HWP (Arae-Ah Hangul)
ICHITARO_Fmt 330 ICHITARO V4-10
IS_XML_Fmt 331 Extended or Custom XML
Oasys_Fmt 332 Oasys format
PBM_ASC_Fmt 333 Portable Bitmap Utilities ASCII Format
PBM_BIN_Fmt 334 Portable Bitmap Utilities Binary Format
PGM_ASC_Fmt 335 Portable Greymap Utilities ASCII Format
PGM_BIN_Fmt 336 Portable Greymap Utilities Binary Format
PPM_ASC_Fmt 337 Portable Pixmap Utilities ASCII Format
PPM_BIN_Fmt 338 Portable Pixmap Utilities Binary Format
XBM_Fmt 339 X Bitmap Format
XPM_Fmt 340 X Pixmap Format
FPX_Fmt 341 FPX Format
PCD_Fmt 342 PCD Format
MS_Visio_Fmt 343 Microsoft Visio
MS_Project_2000_Fmt 344 Microsoft Project 2000
MS_Outlook_Fmt 345 Microsoft Outlook
ELF_Relocatable_Fmt 346 ELF Relocatable
ELF_Executable_Fmt 347 ELF Executable
ELF_Dynamic_Lib_Fmt 348 ELF Dynamic Library
MS_Word_XML_Fmt 349 Microsoft Word 2003 XML
MS_Excel_XML_Fmt 350 Microsoft Excel 2003 XML
MS_Visio_XML_Fmt 351 Microsoft Visio 2003 XML
SO_Text_XML_Fmt 352 StarOffice Text XML
SO_Spreadsheet_XML_Fmt 353 StarOffice Spreadsheet XML
SO_Presentation_XML_Fmt 354 StarOffice Presentation XML
XHTML_Fmt 355 XHTML
MS_OutlookPST_Fmt 356 Microsoft Outlook PST
RAR_Fmt 357 RAR
Lotus_Notes_NSF_Fmt 358 IBM Lotus Notes Database NSF/NTF
Macromedia_Flash_Fmt 359 SWF
MS_Word_2007_Fmt 360 Microsoft Word 2007 XML
MS_Excel_2007_Fmt 361 Microsoft Excel 2007 XML
MS_PPT_2007_Fmt 362 Microsoft PPT 2007 XML
OpenPGP_Fmt 363 OpenPGP Message Format (with new packet format)
Intergraph_V7_DGN_Fmt 364 Intergraph Standard File Format (ISFF) V7 DGN (non-OLE)
MicroStation_V8_DGN_Fmt 365 MicroStation V8 DGN (OLE)
MS_Word_Macro_2007_Fmt 366 Microsoft Word Macro 2007 XML
MS_Excel_Macro_2007_Fmt 367 Microsoft Excel Macro 2007 XML
MS_PPT_Macro_2007_Fmt 368 Microsoft PPT Macro 2007 XML
LZH_Fmt 369 LHA Archive
Office_2007_Fmt 370 Office 2007 document
MS_XPS_Fmt 371 Microsoft XML Paper Specification (XPS)
Lotus_Domino_DXL_Fmt 372 IBM Lotus representation of Domino design elements in XML format
ODF_Text_Fmt 373 ODF Text
ODF_Spreadsheet_Fmt 374 ODF Spreadsheet
ODF_Presentation_Fmt 375 ODF Presentation
Legato_Extender_ONM_Fmt 376 Legato Extender Native Message ONM
bin_Unknown_Fmt 377 n/a
TNEF_Fmt 378 Transport Neutral Encapsulation Format (TNEF)
CADAM_Drawing_Fmt 379 CADAM Drawing
CADAM_Drawing_Overlay_Fmt 380 CADAM Drawing Overlay
NURSTOR_Drawing_Fmt 381 NURSTOR Drawing
HP_GLP_Fmt 382 HP Graphics Language (Plotter)
ASF_Fmt 383 Advanced Systems Format (ASF)
WMA_Fmt 384 Windows Media Audio Format (WMA)
WMV_Fmt 385 Windows Media Video Format (WMV)
EMX_Fmt 386 Legato EMailXtender Archives Format (EMX)
Z7Z_Fmt 387 7-Zip Format (7z)
MS_Excel_Binary_2007_Fmt 388 Microsoft Excel Binary 2007
CAB_Fmt 389 Microsoft Cabinet File (CAB)
CATIA_Fmt 390 CATIA Formats (CAT*)
YIM_Fmt 391 Yahoo Instant Messenger History
ODF_Drawing_Fmt 392 ODF Drawing
Founder_CEB_Fmt 393 Founder Chinese E-paper Basic (CEB)
QPW_Fmt 394 Quattro Pro 9+ for Windows
MHT_Fmt 395 MHT format
MDI_Fmt 396 Microsoft Document Imaging Format
GRV_Fmt 397 Microsoft Office Groove Format
IWWP_Fmt 398 Apple iWork Pages format
IWSS_Fmt 399 Apple iWork Numbers format
IWPG_Fmt 400 Apple iWork Keynote format
BKF_Fmt 401 Windows Backup File
MS_Access_2007_Fmt 402 Microsoft Access 2007
ENT_Fmt 403 Microsoft Entourage Database Format
DMG_Fmt 404 Mac Disk Copy Disk Image File
CWK_Fmt 405 AppleWorks File
OO3_Fmt 406 Omni Outliner File
OPML_Fmt 407 Omni Outliner File
Omni_Graffle_XML_File 408 Omni Graffle XML File
PSD_Fmt 409 Photoshop Document
Apple_Binary_PList_Fmt 410 Apple Binary Property List format
Apple_iChat_Fmt 411 Apple iChat format
OOUTLINE_Fmt 412 OOutliner File
BZIP2_Fmt 413 Bzip2 Compressed File
ISO_Fmt 414 ISO-9660 CD Disc Image Format
DocuWorks_Fmt 415 DocuWorks Format
RealMedia_Fmt 416 RealMedia Streaming Media
AC3Audio_Fmt 417 AC3 Audio File Format
NEF_Fmt 418 Nero Encrypted File
SolidWorks_Fmt 419 SolidWorks Format Files
XFDL_Fmt 420 Extensible Forms Description Language
Apple_XML_PList_Fmt 421 Apple XML Property List format
OneNote_Fmt 422 OneNote Note Format
Dicom_Fmt 424 Digital Imaging and Communications in Medicine
EnCase_Fmt 425 Expert Witness Compression Format (EnCase)
Scrap_Fmt 426 Shell Scrap Object File
MS_Project_2007_Fmt 427 Microsoft Project 2007
MS_Publisher_98_Fmt 428 Microsoft Publisher 98/2000/2002/2003/2007
Skype_Fmt 429 Skype Log File
Hl7_Fmt 430 Health Level 7 message
MS_OutlookOST 431 Microsoft Outlook OST
Epub_Fmt 432 Electronic Publication
MS_OEDBX_Fmt 433 Microsoft Outlook Express DBX
BB_Activ_Fmt 434 BlackBerry Activation File
DiskImage_Fmt 435 Disk Image
Milestone_Fmt 436 Milestone Document
E_Transcript_Fmt 437 RealLegal E-Transcript File
PostScript_Font_Fmt 438 PostScript Type 1 Font
Ghost_DiskImage_Fmt 439 Ghost Disk Image File
JPEG_2000_JP2_File_Fmt 440 JPEG-2000 JP2 File Format Syntax (ISO/IEC 15444-1)
Unicode_HTML_Fmt 441 Unicode HTML
CHM_Fmt 442 Microsoft Compiled HTML Help
EMCMF_Fmt 443 Documentum EMCMF format
MS_Access_2007_Tmpl_Fmt 444 Microsoft Access 2007 Template
Jungum_Fmt 445 Samsung Electronics Jungum Global document
JBIG2 446 JBIG2 File Format
EFax_Fmt 447 eFax file
AD1_Fmt 448 Forensic Toolkit FTK Imager file
SketchUp_Fmt 449 Google SketchUp
GWFS_Email_Fmt 450 GroupWise FileSurf email
JNT_Fmt 451 Windows Journal Format
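When a connector needs to turn a format number from Table 2 back into a readable name (for example, when logging), a small lookup table is usually enough. The following is a minimal sketch only: the class KeyViewFormats and its nameOf method are hypothetical helpers and are not part of ConnectorLib. The five entries are copied from Table 2; add the rows that your connector needs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper (not a ConnectorLib class): maps Table 2 format numbers to names. */
final class KeyViewFormats {
    private static final Map<Integer, String> NAMES = new HashMap<>();

    static {
        // Rows copied from Table 2; this is deliberately a partial list.
        NAMES.put(157, "PKZIP_Fmt");
        NAMES.put(230, "PDF_Fmt");
        NAMES.put(236, "HTML_Fmt");
        NAMES.put(291, "XML_Fmt");
        NAMES.put(360, "MS_Word_2007_Fmt");
    }

    private KeyViewFormats() {}

    /** Returns the format name for a number, or empty if the number is not in the map. */
    static Optional<String> nameOf(int formatNumber) {
        return Optional.ofNullable(NAMES.get(formatNumber));
    }

    public static void main(String[] args) {
        System.out.println(nameOf(230).orElse("unknown"));
        System.out.println(nameOf(9999).orElse("unknown"));
    }
}
```

Returning an Optional rather than null makes the caller decide what to do with a number that is not in the map, which matters here because the sketch covers only part of the table.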
Glossary
A
ACI (Autonomy Content Infrastructure) The Autonomy Content Infrastructure is a technology layer that automates operations on unstructured information for cross-enterprise applications, thus enabling an automated and compatible business-to-business, peer-to-peer infrastructure. The ACI allows enterprise applications to understand and process content that exists in unstructured formats, such as e-mail, Web pages, office documents, and Lotus Notes.
ACL (access control list) An ACL is a set of data associated with a document that defines which users, groups, and roles are permitted to access a document or data source (for example, an Oracle database or Windows file system).
C
connector A connector is an Autonomy fetching solution (such as HTTP Connector, Oracle Connector, Exchange Connector, and so on) that allows you to retrieve information from any type of local or remote repository such as a database or Web site. It imports the fetched documents into IDX or XML file format and indexes them into IDOL Server, from where you can retrieve them (for example by sending queries to IDOL Server).
D
database An Autonomy database is an IDOL Server data pool that stores indexed information. The administrator can set up one or more databases and specify how data is fed to the databases. You can retrieve information that is indexed in the IDOL Server database by sending a query to the IDOL Server.
DIH (Distributed Index Handler) The Distributed Index Handler allows you to efficiently split and index extremely large quantities of data into multiple IDOL Servers to create a completely scalable solution that delivers high performance and high availability. It provides a flexible way of transparently batching, routing, and categorizing the indexing of internal and external content into the IDOL Server.
DiSH (Distributed Service Handler) The Distributed Service Handler provides a unified way to communicate with all Autonomy services from a centralized location. It also facilitates the licensing that enables you to run Autonomy solutions. You must have an Autonomy DiSH server running on a machine with a static known IP address.
F
fetch The process of downloading documents from the repository in which they are stored (such as a local folder, Web site, database, Lotus Domino server, and so on), importing them to IDX format, and indexing them into an IDOL Server.
I
IAS (Intellectual Asset Protection System) The Intellectual Asset Protection System provides an integrated security solution to protect your data. At the front end, authentication checks that users are allowed to access the system on which result data is displayed. At the back end, entitlement checking and authentication combine to ensure that query results contain only documents the user is allowed to see, from repositories the user is allowed to access.
IDOL Server Using Autonomy connectors, Autonomy's Intelligent Data Operating Layer (IDOL) server integrates unstructured, semi-structured, and structured information from multiple repositories through an understanding of the content. It delivers a real-time environment in which operations across applications and content are automated, removing the manual processes involved in getting the right information to the right people at the right time.
IDX Apart from XML files, only files in IDX format can be indexed into IDOL Server. You can use a connector to import files into this format or manually create IDX files.
importing After a document has been downloaded from the repository in which it is stored, it is imported to an IDX or XML file format. This process is called “importing.”
Index fields Store fields containing text which you want to query frequently as index fields. Index fields are processed linguistically when they are stored in IDOL Server. This means stemming and stop lists are applied to text in index fields before they are stored, which allows IDOL Server to process queries for these fields more quickly. Typically, the fields DRETITLE and DRECONTENT are set up as index fields.
indexing After documents have been imported to IDX file format, their content (or links to the original documents) is stored in an IDOL Server. This process is called “indexing.”
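The IDX and importing entries above note that IDX files can be created manually. The following minimal sketch shows what that might look like from Java. It is an illustration under stated assumptions: the class IdxSketch and its toIdx method are hypothetical and not part of ConnectorLib, and the tag layout (#DREREFERENCE, #DRETITLE, #DRECONTENT, #DREENDDOC) follows the commonly documented IDX layout rather than anything defined in this glossary, which names only the DRETITLE and DRECONTENT fields. Check the IDOL Server documentation before relying on it.

```java
/** Hypothetical helper (not a ConnectorLib class) that renders one document as IDX text. */
final class IdxSketch {
    private IdxSketch() {}

    /**
     * Builds a single IDX document record.
     * Assumption: the #DRE tag names and ordering follow the usual IDX layout.
     */
    static String toIdx(String reference, String title, String content) {
        StringBuilder idx = new StringBuilder();
        idx.append("#DREREFERENCE ").append(reference).append('\n'); // unique document reference
        idx.append("#DRETITLE\n").append(title).append('\n');        // title on the following line
        idx.append("#DRECONTENT\n").append(content).append('\n');    // body text on the following lines
        idx.append("#DREENDDOC\n");                                  // closes this document record
        return idx.toString();
    }

    public static void main(String[] args) {
        System.out.print(toIdx("file:///docs/report.txt",
                "Quarterly report",
                "Revenue grew in the third quarter."));
    }
}
```

The sketch does no escaping and writes no end-of-file marker, so treat it as a way to see the shape of the format rather than as a replacement for a connector's own import step.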
Q
query You can submit a natural language query to IDOL Server which analyzes the concept of the query and returns documents that are conceptually similar to the query. You can also submit other query and search types to IDOL Server, such as Boolean, bracketed Boolean, and keyword searches.
S
Search Unlike ordinary searches that look for keywords, the Autonomy Search allows you to enter a natural language query. The concept of the query is analyzed and documents relevant to this concept are returned to you.
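The query and Search entries above describe sending query text to IDOL Server. As a rough sketch, the request can be built as an ACI URL that uses the Query action with a Text parameter. The class QueryUrlSketch, the host name, and the port 9000 are assumptions made for illustration; use the host and ACI port of your own IDOL Server.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper (not a ConnectorLib class) that builds an ACI Query URL. */
final class QueryUrlSketch {
    private QueryUrlSketch() {}

    /** Returns a Query action URL with the query text URL-encoded as UTF-8. */
    static String queryUrl(String host, int port, String text) {
        try {
            return "http://" + host + ":" + port
                    + "/action=Query&Text=" + URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Host and port are placeholders for illustration.
        System.out.println(queryUrl("localhost", 9000, "cat AND dog"));
    }
}
```

Encoding the text matters because natural language and Boolean queries routinely contain spaces and punctuation that are not valid in a URL.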