IDOL Keyview Filter SDK 12.5 C++ Programming Guide

KeyView
Software Version 12.5
Filter SDK C++ Programming Guide
Document Release Date: February 2020
Software Release Date: February 2020

Legal notices

Copyright notice

© Copyright 2016-2020 Micro Focus or one of its affiliates.

The only warranties for products and services of Micro Focus and its affiliates and licensors (“Micro Focus”) are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. Micro Focus shall not be liable for technical or editorial errors or omissions contained herein. The information contained herein is subject to change without notice.

Documentation updates

The title page of this document contains the following identifying information:

  • Software Version number, which indicates the software version.
  • Document Release Date, which changes each time the document is updated.
  • Software Release Date, which indicates the release date of this version of the software.

To check for updated documentation, visit https://www.microfocus.com/support-and-services/documentation/.

Support

Visit the MySupport portal to access contact information and details about the products, services, and support that Micro Focus offers.

This portal also provides customer self-solve capabilities. It gives you a fast and efficient way to access interactive technical support tools needed to manage your business. As a valued support customer, you can benefit by using the MySupport portal to:

  • Search for knowledge documents of interest
  • Access product documentation
  • View software vulnerability alerts
  • Enter into discussions with other software customers
  • Download software patches
  • Manage software licenses, downloads, and support contracts
  • Submit and track service requests
  • Contact customer support
  • View information about all services that Support offers

Many areas of the portal require you to sign in.
If you need an account, you can create one when prompted to sign in. To learn about the different access levels the portal uses, see the Access Levels descriptions.

Contents

Part I: Overview of Filter SDK

Chapter 1: Introducing Filter SDK
  Overview
  Features
  Platforms, Compilers, and Dependencies
  Supported Platforms
  Supported Compilers
  C++ Filter SDK
  Software Dependencies
  Windows Installation
  UNIX Installation
  Package Contents
  License Information
  Enable Advanced Document Readers
  Update License Information
  Directory Structure

Chapter 2: Getting Started
  Use the C++ Language Implementation of the API
  Build the C++ API
  Create a KeyView Session
  Configure your session
  Detect the Format of a File
  Filter a File
  Extract Subfiles
  Extract Metadata
  Exceptions
  Generic IO Types

Part II: Use Filter SDK

Chapter 3: Use the File Extraction API
  Introduction
  Extract Subfiles
  Extract Images
  Extract Mail Metadata
  Default Metadata Set
  Extract the Default Metadata Set
  Extract Subfiles from Outlook Express Files
  Extract Subfiles from Mailbox Files
  Extract Subfiles from Outlook Personal Folders Files
  Choose the Reader to use for PST Files
  MAPI Attachment Methods
  Open Secured PST Files
  Detect PST Files While the Outlook Client is Running
  Extract Subfiles from Lotus Domino XML Language Files
  Extract .DXL Files to HTML
  Extract Subfiles from Lotus Notes Database Files
  System Requirements
  Installation and Configuration
  Windows
  Solaris
  AIX 5.x
  Linux
  Open Secured NSF Files
  Format Note Subfiles
  Extract Subfiles from PDF Files
  Improve Performance for PDFs with Many Small Images
  Extract Embedded OLE Objects
  Extract Subfiles from ZIP Files
  Extract Metadata

Chapter 4: Use the Filter API
  Generate an Error Log
  Enable or Disable Error Logging
  Use the API
  Use Environment Variables
  Change the Path and File Name of the Log File
  Report Memory Errors
  Use the API
  Use Environment Variables
  Specify a Memory Guard
  Specify the Maximum Size of the Log File
  Extract Metadata
  Convert Character Sets
  Determine the Character Set of the Output Text
  Guidelines for Character Set Conversion
  Set the Character Set During Filtering
  Set the Character Set During Subfile Extraction
  Extract Deleted Text Marked by Tracked Changes
  Filter a File
  Filter PDF Files
  Filter PDF Files to a Logical Reading Order
  Enable Logical Reading Order
  Use the C++ API
  Use the formats.ini File
  Rotated Text
  Extract Custom Metadata from PDF Files
  Extract All Custom Metadata
  Filter Tagged PDF Content
  Skip Embedded Fonts
  Use the formats.ini File
  Use the C++ API
  Control Hyphenation
  Use the formats.ini File
  Use the C++ API
  Filter Portfolio PDF Files
  Filter Spreadsheet Files
  Filter Worksheet Names
  Filter Hidden Text in Microsoft Excel Files
  Specify Date and Time Format on UNIX Systems
  Filter Very Large Numbers in Spreadsheet Cells to Precision Numbers
  Extract Microsoft Excel Formulas
  Configure Headers and Footers
  Filter Hidden Data
  Hidden Data in HTML Documents
  Tab Delimited Output for Embedded Tables
  Table Detection for PDF Files
  Exclude Japanese Guide Text
  Source Code Identification

Chapter 5: Sample Programs
  Introduction
  Build the Sample Programs
  Run the Sample Programs
  detect
  extract
  filter_document
  metadata
  subfiles
  filter_container

Part III: C++ API Reference

Chapter 7: InputTypes and OutputTypes

Chapter 8: The keyview Namespace
  The Session Class
  Constructor
  config
  detect
  filter
  get_summary_information
  metadata_map
  subfiles
  The Configuration Class
  Constructor
  custom_pdf_metadata
  date_time_field_codes
  extraction_timeout
  filename_field_code
  formatted_mail
  header_and_footer
  header_and_footer_tags
  hidden_text
  no_encoding_conversion
  out_of_process_log
  out_of_process_memory_log
  password
  pdf_logical_reading
  revision_marks
  skip_comments
  skip_embedded_fonts
  skip_thumbnail
  soft_hyphens
  source_encoding
  tagged_pdf_content
  target_encoding
  string& temporary_directory
  timeout
  unicode_byte_order_marker
  The DetectionInfo Class
  appleDoubleEncoded
  appleSingleEncoded
  category
  category_name
  description
  encrypted
  extension
  format
  macBinaryEncoded
  version
  wangGDLencoded
  windowRMSEncrypted
  The Container Class
  The Subfile Class
  extract
  children
  index
  is_folder
  mail_metadata
  parent
  rawname
  size
  time
  type
  The SummaryInfoItem Class
  apply_visitor
  convert_to_string
  name
  type
  The SummaryInfoVisitorBase Class
  visit_boolean
  visit_datetime
  visit_double
  visit_integer
  visit_target_encoding_string
  visit_utf8_string
  Enumerations
  LogicalPDFDirection
  SubFile::Type
  SummaryInfoType
  Exceptions

Chapter 9: The keyview::io Namespace
  InputFile
  Constructors
  OutputFile
  Constructors
  OutputStdout
  Constructors
  InMemoryFile
  Constructors

Appendixes

Appendix A: Supported Formats
  Key to Supported Formats Table
  Supported Formats

Appendix B: Document Readers
  Key to Document Reader Tables
  Archive Formats
  Binary Format
  Computer-Aided Design Formats
  Database Formats
  Desktop Publishing
  Display Formats
  Graphic Formats
  Mail Formats
  Multimedia Formats
  Presentation Formats
  Spreadsheet Formats
  Text and Markup Formats
  Word Processing Formats

Appendix C: Character Sets
  Multibyte and Bidirectional Support
  Coded Character Sets

Appendix D: Extract and Format Lotus Notes Subfiles
  Overview
  Customize XML Templates
  Use Demo Templates
  Use Old Templates
  Disable XML Templates
  Template Elements and Attributes
  Conditional Elements
  Control Elements
  Data Elements
  Date and Time Formats
  Lotus Notes Date and Time Formats
  KeyView Date and Time Formats

Appendix E: File Format Detection
  Introduction
  Extract Format Information
  Determine Format Support
  Example formats.ini file entries
  Refine Detection of Text Files
  Allow Consecutive NULL Bytes in a Text File
  Translate Format Information
  Distinguish Between Formats
  Determine a Document Reader
  Category Values in formats.ini

Appendix F: List of Required Files for Redistribution
  Core Files
  Support Files
  Document Readers

Appendix G: Develop a Custom Reader
  Introduction
  How to Write a Custom Reader
  Naming Conventions
  Basic Steps
  Token Buffer
  Macros
  Reader Interface
  Function Flow
  Example Development of fffFillBuffer()
  Implementation 1: fpFillBuffer() Function
  Structure of Implementation 1
  Problems with Implementation 1
  Implementation 2: Processing a Large Token Stream
  Structure of Implementation 2
  Problems with Implementation 2
  Boundary Conditions
  Implementation 3: Interrupting Structured Access Layer Calls
  Structure of Implementation 3
  Development Tips
  Functions
  xxxsrAutoDet()
  xxxAllocateContext()
  xxxFreeContext()
  xxxInitDoc()
  xxxFillBuffer()
  xxxGetSummaryInfo()
  xxxOpenStream()
  xxxCloseStream()
  xxxCharSet()

Appendix H: Password Protected Files
  Supported Password Protected File Types

Send documentation feedback
Part I: Overview of Filter SDK

This section provides an overview of the Micro Focus KeyView Filter SDK and describes how to use the C++ implementation of the API.

Chapter 1: Introducing Filter SDK

This section describes the Filter SDK package.
