1.1 TORQUE Installation


TORQUE® Administrator Guide, version 3.0.2

Contents

Legal Notices
Preface
    Documentation Overview
Introduction
Glossary
1.0 Overview
    1.1 Installation
    1.2 Initialize/Configure TORQUE on the Server (pbs_server)
    1.3 Advanced Configuration
    1.4 Manual Setup of Initial Server Configuration
    1.5 Server Node File Configuration
    1.6 Testing Server Configuration
    1.7 TORQUE on NUMA Systems
    1.8 TORQUE Multi-MOM
2.0 Submitting and Managing Jobs
    2.1 Job Submission
    2.2 Monitoring Jobs
    2.3 Canceling Jobs
    2.4 Job Preemption
    2.5 Keeping Completed Jobs
    2.6 Job Checkpoint and Restart
    2.7 Job Exit Status
    2.8 Service Jobs
3.0 Managing Nodes
    3.1 Adding Nodes
    3.2 Configuring Node Properties
    3.3 Changing Node State
    3.4 Host Security
    3.5 Linux Cpuset Support
    3.6 Scheduling Cores
    3.7 Scheduling GPUs
4.0 Setting Server Policies
    4.1 Queue Configuration
    4.2 Server High Availability
5.0 Interfacing with a Scheduler
    5.1 Integrating Schedulers for TORQUE
6.0 Configuring Data Management
    6.1 SCP/RCP Setup
    6.2 NFS and Other Networked Filesystems
    6.3 File Stage-In/Stage-Out
7.0 Interfacing with Message Passing
    7.1 MPI (Message Passing Interface) Support
8.0 Managing Resources
    8.1 Monitoring Resources
9.0 Accounting
    9.1 Accounting Records
10.0 Logging
    10.1 Job Logging
11.0 Troubleshooting
    11.1 Troubleshooting
    11.2 Compute Node Health Check
    11.3 Debugging
Appendices
    Appendix A: Commands Overview
        Client Commands: momctl, pbsdsh, pbsnodes, qalter, qchkpt, qdel, qhold, qmgr, qrerun, qrls, qrun, qsig, qstat, qsub, qterm, tracejob
        Server Commands: pbs_mom, pbs_server, pbs_track
    Appendix B: Server Parameters
    Appendix C: MOM Configuration
    Appendix D: Error Codes and Diagnostics
    Appendix E: Considerations Before Upgrading
    Appendix F: Large Cluster Considerations
    Appendix G: Prologue and Epilogue Scripts
    Appendix H: Running Multiple TORQUE Servers and MOMs on the Same Node
    Appendix I: Security Overview
    Appendix J: Submit Filter (aka qsub Wrapper)
    Appendix K: torque.cfg File
    Appendix L: TORQUE Quick Start Guide
Changelog

Legal Notices

Copyright © 2011 Adaptive Computing Enterprises, Inc. All rights reserved. Distribution of this document for commercial purposes in either hard or soft copy form is strictly prohibited without prior written consent from Adaptive Computing Enterprises, Inc.

Trademarks

Adaptive Computing, Cluster Resources, Moab, Moab Workload Manager, Moab Cluster Manager, Moab Cluster Suite, Moab Grid Scheduler, Moab Grid Suite, Moab Access Portal, and other Adaptive Computing products are either registered trademarks or trademarks of Adaptive Computing Enterprises, Inc. The Adaptive Computing logo and the Cluster Resources logo are trademarks of Adaptive Computing Enterprises, Inc. All other company and product names may be trademarks of their respective companies.

Acknowledgments

TORQUE includes software developed by NASA Ames Research Center, Lawrence Livermore National Laboratory, and Veridian Information Solutions, Inc. Visit www.OpenPBS.org for OpenPBS software support, products, and information. TORQUE is neither endorsed by nor affiliated with Altair Grid Solutions, Inc.

TORQUE Administrator Guide Overview

Advanced TORQUE Administration, a video tutorial of a session offered at MoabCon, provides further details on advanced TORQUE administration.

This collection of documentation for the TORQUE resource manager is intended as a reference for both users and system administrators.
The 1.0 Overview section provides the details for installation and initialization, advanced configuration options, and (optional) qmgr options necessary to get the system up and running. System testing is also covered.

The 2.0 Submitting and Managing Jobs section covers the different actions applicable to jobs. The first section, 2.1 Job Submission, details how to submit a job and request resources (nodes, software licenses, and so forth) and provides several examples. Other actions include monitoring, canceling, preemption, and keeping completed jobs.

The 3.0 Managing Nodes section covers administrator tasks relating to nodes, including adding nodes, changing node properties, and identifying node state. Configuring restricted user access to nodes is explained in section 3.4 Host Security.

The 4.0 Setting Server Policies section details server-side configuration of queues and high availability.

The 5.0 Interfacing with a Scheduler section offers information about using the native scheduler versus an advanced scheduler.

The 6.0 Configuring Data Management section deals with issues of data management. For non-network file systems, the SCP/RCP Setup section details setting up SSH keys and nodes to automate transferring data. The NFS and Other Networked File Systems section covers configuration for these file systems. This chapter also addresses the use of file stage-in/stage-out using the stagein and stageout directives of the qsub command.

The 7.0 Interfacing with Message Passing section offers details on supporting MPI (Message Passing Interface).

The 8.0 Managing Resources section covers configuration, utilization, and states of resources.

The 9.0 Accounting section explains how jobs are tracked by TORQUE for accounting purposes.

The 10.0 Logging section describes TORQUE's job logging facility.

The 11.0 Troubleshooting section is a troubleshooting guide that offers help with general problems; it includes an FAQ (Frequently Asked Questions) list and instructions for how to set up and use compute node health checks and how to debug TORQUE.

The numerous appendices provide tables of commands, parameters, configuration options, error codes, the Quick Start Guide, and so forth:

    A. Commands Overview
    B. Server Parameters
    C. MOM Configuration
    D. Error Codes and Diagnostics
    E. Considerations Before Upgrading
    F. Large Cluster Considerations
    G. Prologue and Epilogue Scripts
    H. Running Multiple TORQUE Servers and MOMs on the Same Node
    I. Security Overview
    J. Submit Filter (aka qsub Wrapper)
    K. torque.cfg File
    L. TORQUE Quick Start Guide

Introduction

What is a Resource Manager?

While TORQUE has a built-in scheduler, pbs_sched, it is typically used solely as a resource manager, with a scheduler making requests to it. Resource managers provide the low-level functionality to start, hold, cancel, and monitor jobs. Without these capabilities, a scheduler alone cannot control jobs.
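To make these operations concrete, the following shell session sketches the client commands TORQUE provides for them (qsub, qstat, qhold, qrls, and qdel are all documented in Appendix A: Commands Overview). This is a minimal sketch: the script name and the job identifier returned by pbs_server are illustrative, not captured from a real cluster.

    # Submit a job script; pbs_server replies with a job identifier
    # of the form <sequence>.<server host>.
    $ qsub myjob.sh
    4807.headnode

    # Monitor the job; the S column of the output reports its state
    # (Q = queued, H = held, R = running).
    $ qstat 4807.headnode

    # Hold the job, release it again, then cancel it.
    $ qhold 4807.headnode
    $ qrls 4807.headnode
    $ qdel 4807.headnode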
What are Batch Systems?

While TORQUE is flexible enough to handle scheduling a conference room, it is primarily used in batch systems. Batch systems are a collection of computers and other resources (networks, storage systems, license servers, and so forth) that operate under the notion that the whole is greater than the sum of the parts. Some batch systems consist of just a handful of machines running single-processor jobs, minimally managed by the users themselves. Other systems have thousands and thousands of machines executing users' jobs simultaneously while tracking software licenses and access to hardware equipment and storage systems.

Pooling resources in a batch system typically reduces technical administration of resources while offering a uniform view to users. Once configured properly, batch systems abstract away many of the details involved with running and managing jobs, allowing for higher resource utilization. For example, users typically only need to specify the minimal constraints of a job and do not need to know the individual machine names of each host on which they are running. With this uniform abstracted view, batch systems can execute thousands and thousands of jobs simultaneously.

Batch systems are composed of four different components: (1) Master Node, (2) Submit/Interactive Nodes, (3) Compute Nodes, and (4) Resources.

Master Node - A batch system will have a master node where pbs_server runs. Depending on the needs of the system, the master node may be dedicated to this task, or it may fulfill the roles of other components as well.

Submit/Interactive Nodes - Submit or interactive nodes provide an entry point to the system for users to manage their workload. From these nodes, users can submit and track their jobs. Additionally, some sites have one or more nodes reserved for interactive use, such as testing and troubleshooting environment problems. These nodes have the client commands installed (such as qsub and qhold).

Compute Nodes - Compute nodes are the workhorses of the system. Their role is to execute submitted jobs. On each compute node, pbs_mom runs to start, kill, and manage submitted jobs. It communicates with pbs_server on the master node. Depending on the needs of the system, a compute node may double as the master node (or fill other roles as well).

Resources - Some systems are organized for the express purpose of managing a collection of resources beyond compute nodes. Resources can include high-speed networks, storage systems, license managers, and so forth. Availability of these resources is limited, and they need to be managed intelligently to promote fairness and increased utilization.

Basic Job Flow

The life cycle of a job can be divided into four stages: (1) creation, (2) submission, (3) execution, and (4) finalization.

Creation - Typically, a submit script is written to hold all of the parameters of a job. These parameters could include how long a job should run (walltime), what resources are necessary to run it, and what to execute. The following is an example submit file:

    #PBS -N localBlast
    #PBS -S /bin/sh
    #PBS -l nodes=1:ppn=2,walltime=240:00:00
    #PBS -M [email protected]
    #PBS -m ea

    source ~/.bashrc
    cd $HOME/work/dir
    sh myBlast.sh -i -v

This submit script specifies the name of the job (localBlast), what environment to use (/bin/sh), that it needs both processors on a single node (nodes=1:ppn=2), that it will run for at most 10 days (walltime=240:00:00), and that TORQUE should email [email protected] when the job exits or aborts (-m ea). Additionally, the user specifies where and what to execute.

Submission - A job is submitted with the qsub command. Once submitted, the policies set by the administration and technical staff of the site dictate the priority of the job and, therefore, when it will start executing.

Execution - Jobs often spend most of their lifecycle executing. While a job is running, its status can be queried with qstat.
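As a minimal sketch of the submission and execution stages, assume the submit script above has been saved as localBlast.sh; the job identifier, user name, and qstat output below are illustrative rather than captured from a real cluster.

    # Submit the script; the server returns the new job's identifier.
    $ qsub localBlast.sh
    1234.master

    # Query the job while it runs.
    $ qstat 1234.master
    Job id            Name         User    Time Use S Queue
    ----------------  -----------  ------  -------- - -----
    1234.master       localBlast   jsmith  00:04:12 R batch

Here the S column reports the job state (Q for queued, R for running); section 2.2 Monitoring Jobs covers qstat in more detail.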