Distributed UIMA Cluster Computing
The DUCC Team
September 23, 2013

Contents

I DUCC Concepts
  1 DUCC Overview
    1.1 What is DUCC?
    1.2 DUCC Job Model
    1.3 DUCC From UIMA to Full Scale-out
    1.4 Error Management
    1.5 Cluster and Job Management
    1.6 Security Measures
    1.7 Security Issues
  2 Application Quick Start
  3 Glossary

II Ducc Users Guide
  4 Command Line Interface
      4.0.1 The DUCC Job Descriptor
      4.0.2 Operating System Limit Support
      4.0.3 Command Line Forms
      4.0.4 DUCC Commands
    4.1 ducc_submit
    4.2 ducc_cancel
    4.3 ducc_reserve
    4.4 ducc_unreserve
    4.5 ducc_process_submit
    4.6 ducc_process_cancel
    4.7 ducc_services
      4.7.1 Common Options
      4.7.2 ducc_services --register Options
      4.7.3 ducc_services --start Options
      4.7.4 ducc_services --stop Options
      4.7.5 ducc_services --modify Options
      4.7.6 ducc_services --query Options
    4.8 ducc_perf_stats
    4.9 viaducc and java_viaducc
  5 The DUCC Public API
    5.1 Overview Of The DUCC API
    5.2 Compiling and Running With the DUCC API
    5.3 Java API
  6 Service Management
    6.1 Overview
    6.2 Service Types
    6.3 Service References and Endpoints
    6.4 Service Classes
      6.4.1 Implicit Services
      6.4.2 Registered Services
    6.5 Service Pingers
      6.5.1 Declaring a Pinger in A Service
      6.5.2 Implementing a Pinger
      6.5.3 Building And Testing Your Pinger
  7 Job Logs
  8 DUCC Web Server
    8.1 Common Links
    8.2 Jobs Page
    8.3 Job Details Page
      8.3.1 Processes
      8.3.2 Work Items
      8.3.3 Performance
      8.3.4 Specification
    8.4 Reservation Page
    8.5 Managed Reservation Details Page
      8.5.1 Processes
      8.5.2 Specification
    8.6 Services Page
    8.7 Service Details Page
      8.7.1 Processes
      8.7.2 Specification
    8.8 System Details Page
      8.8.1 Administration
      8.8.2 Classes
      8.8.3 Daemons
      8.8.4 Machines

III Programming Model And Applications
  9 Building and Testing Applications: All-In-One
  10 Sample Application: Source Ingestion
  11 Sample Application: Fooing The Bar

IV Ducc Administrators Guide
  12 Installation, Configuration, and Verification
    12.1 Overview
    12.2 Software Prerequisites
    12.3 Building from Source
    12.4 Documentation
    12.5 Single-user Installation and Verification
    12.6 Minimal Hardware Requirements for single-user Installation
    12.7 Single-user System Installation
    12.8 Initial System Verification
    12.9 Logs
    12.10 Multi-User Installation and Verification
    12.11 Ducc_ling Installation
    12.12 CGroups Installation and Configuration
    12.13 Set up the full nodelists
    12.14 Full DUCC Verification
  13 Administration
    13.1 WebServer Authentication
      13.1.1 Example Implementation
      13.1.2 IAuthenticationManager
      13.1.3 IAuthenticationResult
      13.1.4 Example ANT script to build jar
      13.1.5 Example ducc.properties entries
      13.1.6 Example ducc.administrators
    13.2 ducc.properties
      13.2.1 General DUCC Properties
      13.2.2 Web Server Properties
      13.2.3 Job Driver Properties
      13.2.4 Service Manager Properties
      13.2.5 Orchestrator Properties
      13.2.6 Resource Manager Properties
      13.2.7 Agent Properties
      13.2.8 Process Manager Properties
      13.2.9 Job Process Properties
    13.3 Resource Manager Configuration: Classes and Nodepools
      13.3.1 Nodepools
      13.3.2 Class Definitions
      13.3.3 Validation
    13.4 Ducc Node Definitions
    13.5 Administrative Commands
      13.5.1 start_ducc
      13.5.2 stop_ducc
      13.5.3 check_ducc
  14 Resource Management
    14.1 Overview
    14.2 Scheduling Policies
    14.3 Priority vs Weight
    14.4 Node Pools
    14.5 Job Classes
  15 Simulation and System Testing
    15.1 Cluster Simulation
      15.1.1 Overview
      15.1.2 Node Configuration
      15.1.3 Starting a Simulated Cluster
      15.1.4 Stopping a Simulated Cluster
    15.2 Job Simulation
      15.2.1 Overview
      15.2.2 Job meta-descriptors
      15.2.3 Prepare Descriptors
      15.2.4 Services
      15.2.5 Generating a Job Set
      15.2.6 Running the Test Driver
    15.3 Pre-Packaged Tests

V Ducc Principles of Operation
  16 Platform
    16.1 Highlights
    16.2 Architecture
    16.3 Jobs
      16.3.1 Characteristics
      16.3.2 Performance
    16.4