Distributed UIMA Cluster Computing

Written and maintained by the Apache UIMA™ Development Community

Version 3.0.0

Copyright © 2012 The Apache Software Foundation
Copyright © 2012 International Business Machines Corporation

License and Disclaimer

The ASF licenses this documentation to you under the Apache License, Version 2.0 (the "License"); you may not use this documentation except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, this documentation and its contents are distributed under the License on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Trademarks

All terms mentioned in the text that are known to be trademarks or service marks have been appropriately capitalized. Use of such terms in this book should not be regarded as affecting the validity of the trademark or service mark.

Publication date: April 2019

Table of Contents

I DUCC Concepts
  1 DUCC Overview
    1.1 What is DUCC?
    1.2 DUCC Job Model
    1.3 DUCC From UIMA to Full Scale-out
    1.4 Error Management
    1.5 Cluster and Job Management
    1.6 Security Measures
      1.6.1 ducc_ling
    1.7 Security Issues
  2 Glossary
II Ducc Users Guide
  3 Command Line Interface
    3.1 The DUCC Job Descriptor
    3.2 Operating System Limit Support
    3.3 Command Line Forms
    3.4 DUCC Commands
    3.5 ducc_submit
    3.6 ducc_cancel
    3.7 ducc_reserve
    3.8 ducc_unreserve
    3.9 ducc_process_submit
    3.10 ducc_process_cancel
    3.11 ducc_services
      3.11.1 Common Options
      3.11.2 ducc_services --register [specification file] [options]
      3.11.3 ducc_services --start options
      3.11.4 ducc_services --stop options
      3.11.5 ducc_services --enable options
      3.11.6 ducc_services --disable options
      3.11.7 ducc_services --observe_references options
      3.11.8 ducc_services --ignore_references options
      3.11.9 ducc_services --modify options
      3.11.10 ducc_services --query options
    3.12 viaducc and java_viaducc
    3.13 ducc_status
    3.14 ducc_watcher
  4 The DUCC Public API
    4.1 Overview Of The DUCC API
    4.2 Compiling and Running With the DUCC API
    4.3 Java API
  5 Service Management
    5.1 Overview
    5.2 Service Types
    5.3 Service Instance IDs
    5.4 Service References and Endpoints
    5.5 Application Broker for UIMA-AS Services
    5.6 Service Management Policies
    5.7 Service Pingers
      5.7.1 The Pinger API
      5.7.2 Declaring a Pinger in A Service
      5.7.3 Implementing a Pinger
      5.7.4 Building And Testing Your Pinger
      5.7.5 Globally Registered Pingers
    5.8 Sample Pinger
      5.8.1 Using the Sample Pinger
      5.8.2 Understanding Sample Pinger
      5.8.3 Calculating New Deployments in the Pinger
      5.8.4 Summary of Sample Pinger
  6 Job Logs
  7 Job Error Handler
  8 DUCC Web Server
    8.1 Common Links
    8.2 Login
    8.3 Jobs Page
    8.4 Job Details Page
      8.4.1 Processes
      8.4.2 Work Items
      8.4.3 Performance
      8.4.4 Specification
      8.4.5 Files
    8.5 Reservations Page
    8.6 Managed Reservation Details Page
      8.6.1 Processes
      8.6.2 Specification
      8.6.3 Files
    8.7 Services Page
    8.8 Service Details Page
      8.8.1 Deployments
      8.8.2 Registry
      8.8.3 Files
      8.8.4 History
    8.9 System Pages
      8.9.1 Administration
      8.9.2 Broker
      8.9.3 Classes
      8.9.4 Daemons
      8.9.5 Machines
    8.10 Visualization
    8.11 JSON
III Programming Model And Applications
  9 Building and Testing Jobs
    9.1 Overview
      9.1.1 Basic Job Process Threading Model
      9.1.2 Alternate Pipeline Threading Model
      9.1.3 Overriding UIMA Configuration Parameters
    9.2 Collection Segmentation and Artifact Extraction
    9.3 CAS Consumer Changes for DUCC
    9.4 Job Development for an Existing Pipeline Design
    9.5 Job Development for a New Pipeline Design
      9.5.1 Collection Reader (CR) Characteristics
      9.5.2 DUCC built-in Flow Controller
      9.5.3 Workitem Feature Structure
      9.5.4 Deployment Descriptor (DD) Jobs
      9.5.5 Debugging
  10 Sample Application: Raw Text Processing
    10.1 Application Function and Design
    10.2 Configuration Parameters
    10.3 Set up a working directory
    10.4 Download and Install OpenNLP
    10.5 Get some Input Text
    10.6 Run the Job
    10.7 Job Output
    10.8 Job Performance Details
  11 Sample Application: CAS Input Processing
    11.1 Application Function and Design
    11.2 Configuration Parameters
    11.3 Run the Job
    11.4 Job Performance Details
    11.5 Limiting Job Resources
IV Ducc Administrators Guide
  12 Installation, Configuration, and Verification
    12.1 Overview
    12.2 Software Prerequisites
    12.3 Building from Source
    12.4 Documentation