10/9/2013
Presented by: Walid Budgaga
CS 655 – Advanced Topics in Distributed Systems
Computer Science Department
Colorado State University

Outline
• Condor
• The Anatomy of the Grid
• Globus Toolkit
Motivation
• High Throughput Computing (HTC)?
  • Large amounts of computing capacity over long periods of time
  • Measured: operations per month or per year
• High Performance Computing (HPC)?
  • Large amounts of computing capacity for short periods of time
  • Measured: FLOPS

Motivation
• HTC is suitable for scientific research
• Example (parameter sweep):
  • Testing parameter combinations to keep temperature at a particular level
  • op(x,y,z) takes 10 hours, 500 MB memory, 100 MB I/O
  • x(100), y(50), z(25) => 100 x 50 x 25 = 125,000 runs (145 years)
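The arithmetic behind the parameter-sweep example can be checked with a short sketch. The function name `sweep_cost` and the `machines` parameter are illustrative assumptions, and the model assumes fully independent runs spread evenly over the machines. With 365-day years the sequential total comes to roughly 143 years, close to the rounded 145 on the slide.

```python
# Parameter-sweep cost estimate using the figures from the slide.
HOURS_PER_RUN = 10                      # op(x, y, z) takes 10 hours
GRID = {"x": 100, "y": 50, "z": 25}     # number of values tried per parameter


def sweep_cost(grid, hours_per_run, machines=1):
    """Return (number of runs, wall-clock hours).

    Assumes independent runs that are spread evenly over `machines`.
    """
    runs = 1
    for count in grid.values():
        runs *= count
    return runs, runs * hours_per_run / machines


runs, hours = sweep_cost(GRID, HOURS_PER_RUN)
years_sequential = hours / (24 * 365)

# Hypothetical pool of 1000 workstations, to show why throughput matters.
_, pool_hours = sweep_cost(GRID, HOURS_PER_RUN, machines=1000)
days_on_pool = pool_hours / 24
```

Under these assumptions, a pool of 1000 workstations would bring the same sweep down to a matter of weeks, which is the kind of gain HTC aims for.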
Motivation
• Large amounts of processing capacity?
  • Exploiting computers on the network
  • Utilizing heterogeneous resources
  • Overcoming differences between platforms
    • By building a portable solution
    • Including a resource management framework
• Over long periods of time?
  • System must be reliable and maintainable
  • Surviving failures (software & hardware)
  • Allowing resources to leave and join at any time
  • Upgrading and configuring without significant downtime

HTC Environment
• Fort Collins Science Center uses Condor for scientific projects
• Source: http://www.fort.usgs.gov/Condor/ComputingTimes.asp
HTC Environment
• Also, the system must meet the needs of:
  • Resource owners
    • Rights respected
    • Policies enforced
  • Customers
    • Benefit of additional processing capacity outweighs complexity of usage
  • System administrators
    • Real benefit provided to users outweighs the maintenance cost

HTC
• Other considerations: distributively owned resources lead to:
  • Decentralized maintenance and configuration of resources
  • Resource availability
    • Applications preempted at any time
  • An additional degree of resource heterogeneity
Condor Overview
Open-source high-throughput computing framework for compute-intensive tasks.
Manages distributively owned resources to provide a large amount of computing capacity
Developed at the Computer Sciences Department at the University of Wisconsin-Madison
Name changed to HTCondor in October 2012
Condor Overview
Customer agent
• Represents the customer's job (application)
• Can state its requirements as follows:
  • Need a Linux/x86 platform
  • Want the machine with the highest memory capacity
  • Prefer a machine in lab 120
Condor Overview
Resource agent
• Represents the resource
• Can state its offers as follows:
  • Platform: a Linux/x86 platform
  • Memory: 1GB
• Can state its requirements as follows:
  • Run jobs only when keyboard and mouse have been idle for 15 minutes
  • Run jobs only from the computer department
  • Never run jobs belonging to [email protected]

Condor Overview
Matchmaker
• Matches jobs and resources based on requirements and offers
• Notifies the agents when a match is found
Challenges of HTC system:
Software Development
System Administration
Software Development
Four primary challenges
• Utilization of heterogeneous resources
  • Requires system portability
• Network protocol flexibility
  • Required to cope with constantly changing resource and customer needs
  • Required for adding new features
• Remote file access
  • Required so that data can be accessed from any workstation
• Utilization of non-dedicated resources
  • Requires the ability to preempt and resume applications

Software Development
Utilization of heterogeneous resources:
Requires system portability, obtained through layered system design
• Network API:
  • Connection-oriented and connectionless
  • Reliable and unreliable interfaces
  • Authentication and encryption
• Process management API:
  • Create, suspend, resume, and kill a process
• Workstation statistics API:
  • Reports information needed to
    • Implement resource owner policies
    • Verify that the application's requirements are met
Software Development
Network Protocol Flexibility:
To cope with adding new services to HTC without frequently updating HTC components, a general-purpose data format may be used
• For example: Condor uses a protocol similar to RPC

Software Development
Remote file access (1):
To guarantee that HTC applications can access their data from any workstation in the cluster
• Three possible solutions:
• Using an existing distributed file system (NFS)
  • Condor:
    • Authenticates the customer application,
    • Privileges need to be assigned, or
    • Grants file access permission
Software Development
Remote file access (2):
• Redirecting file I/O system calls
  • Interposing HTC between application & operating system
  • By linking the application with an interposition library
  • Does not require file storage on the remote workstation
  • Reduces performance
  • Difficult to develop & maintain portable interposition

Software Development
Remote file access (3):
• Implementing data file staging
  • Transferring the input and output files specified by the customer to the remote workstation
  • Requires free disk space on the workstation
  • High cost for large data files
Software Development
Utilization of non-dedicated resources
• Requires the ability to preempt and resume applications
• It can be obtained using checkpoints
• Checkpoint:
  • Snapshot of the state of the executing program
  • It can be used to restart the program at a later time
  • Provides reliability
  • Enables preemptive-resume scheduling

Software Development
Checkpoints in Condor (1)
• Used as a migration mechanism
  • Lets the job scheduler migrate jobs from one workstation to another
• Used to resume vacated jobs
• The program has the ability to checkpoint itself
  • Using a checkpointing library
• To provide additional reliability
  • HTCondor can be configured to write checkpoints periodically
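The checkpoint idea (snapshot the state, restart from it later) can be illustrated with a minimal application-level sketch. This is not how Condor's checkpointing library works, since that library snapshots the whole process image; here the job saves its own state with `pickle`. The names `save_checkpoint`, `load_checkpoint`, and `run_job`, and the `stop_after` parameter that simulates preemption, are all hypothetical.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, state):
    """Write the snapshot atomically so a crash never leaves a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as handle:
        pickle.dump(state, handle)
    os.replace(tmp_path, path)


def load_checkpoint(path, default):
    """Return the saved state, or `default` when no checkpoint exists yet."""
    if not os.path.exists(path):
        return default
    with open(path, "rb") as handle:
        return pickle.load(handle)


def run_job(path, total_steps, checkpoint_every=100, stop_after=None):
    """Sum 0..total_steps-1, resuming from `path` if a checkpoint exists.

    `stop_after` simulates preemption: after that many steps in this run the
    job checkpoints and returns None instead of a result.
    """
    state = load_checkpoint(path, {"step": 0, "acc": 0})
    done_this_run = 0
    while state["step"] < total_steps:
        if stop_after is not None and done_this_run >= stop_after:
            save_checkpoint(path, state)      # checkpoint when vacated
            return None
        state["acc"] += state["step"]
        state["step"] += 1
        done_this_run += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)      # periodic checkpoint
    return state["acc"]
```

Calling `run_job` once with `stop_after` set and then again without it resumes from the saved step instead of starting over, which is the behavior that preemptive-resume scheduling relies on.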
Software Development
Checkpoints in Condor (2)
When checkpoints are stored:
• Periodically, if HTCondor is so configured
• At any time by the program
• When a higher priority job has to start on the same machine
• When the machine becomes busy

Software Development
Checkpoints in Condor (3)
Storing of checkpoints
• By default, checkpoints are stored on the local disk of the machine where the job was submitted
• However, it can be configured to store them on a checkpoint server
System Administration Administrator has to answer to:
Resource owners
By guaranteeing that HTC enforces their policies
Customers
By ensuring receipt of valuable services from HTC
Policy makers
By demonstrating that HTC is meeting the stated goals.
System Administration
Access Policies
• Specify when and how the resources can be accessed and by whom
• The policies might be specified using a set of expressions
• For example in Condor:
  • Requirements (true: to start accessing the resources)
  • Rank (preference)
  • Suspend
  • Continue
  • Vacate (notification to stop using resources)
  • Kill (immediately stopping using the resources)

System Administration
Access Policies
[Figure: example from Condor]
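How a resource agent might act on the expression names listed above can be sketched as a small state machine. This is only an illustration: the state names, the thresholds, and the attribute names (`keyboard_idle_min`, `suspended_min`, `vacating_min`) are assumptions, not Condor's actual configuration. Rank is left out because it chooses between competing jobs rather than driving state changes.

```python
# Hypothetical owner policy; thresholds and attribute names are illustrative.
POLICY = {
    "Requirements": lambda m: m["keyboard_idle_min"] >= 15,  # may a job start?
    "Suspend":      lambda m: m["keyboard_idle_min"] < 1,    # owner is back
    "Continue":     lambda m: m["keyboard_idle_min"] >= 5,   # owner left again
    "Vacate":       lambda m: m["suspended_min"] > 10,       # ask job to leave
    "Kill":         lambda m: m["vacating_min"] > 5,         # stop it now
}


def next_state(state, machine, policy=POLICY):
    """Return the next job state given the current machine attributes."""
    if state == "idle":
        return "running" if policy["Requirements"](machine) else "idle"
    if state == "running":
        return "suspended" if policy["Suspend"](machine) else "running"
    if state == "suspended":
        if policy["Vacate"](machine):
            return "vacating"
        return "running" if policy["Continue"](machine) else "suspended"
    if state == "vacating":
        return "killed" if policy["Kill"](machine) else "vacating"
    return state
```

Keeping each expression as a separate predicate mirrors the slide's point that the owner, not the scheduler, decides when each transition is allowed.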
System Administration
Reliability
• The HTC system must be prepared for failures and must automate recovery from common failures
• It is not an easy job:
  • Detect the difference between normal and abnormal termination
  • Don't leave running applications unattended
  • Choose the correct checkpoint to restart
  • Decide when it is safe to restart the application
  • Determine & avoid bad nodes

System Administration
System logs
• They are the primary tools for diagnosing system failures
• They give the ability to reconstruct the events leading up to the failure
• Problems and suggested solutions
  • Log files can grow to unbounded size
    • Detailed logs for recent events and summaries for old information
  • Managing distributed log files
    • Store logs centrally on a file server or a customized log server
    • Provide a single interface by installing logging agents on each workstation
System Administration
Monitoring and Accounting
It helps the Administrator to:
• Assess the current and historical state of the system
• Track the system usage

System Administration
[Figure: CondorView Usage Graph]
System Administration
Security (1)
Possible attacks
• Resource attack
  • Unauthorized user gains access to a resource
  • Authorized user violates resource owner's access policy
• Customer attack
  • Customer's account or files are put at risk via the HTC environment

System Administration
Security (2)
To protect against unauthorized resource access
• Resource owner may specify authorized users in their access policy
• Condor Example:
  Requirement = (Customer == "[email protected]") || (Customer == "[email protected]")
System Administration
Security (3)
To protect against violations of resource access policy, the resource agent may:
• Set resource consumption limits by using the system API
• Run the application under a "guest" account
• Set the file system root directory to a "sandbox" directory
• Intercept the system calls performed by the application via the OS interposition interface

System Administration
Security (4)
To protect the customer's account and files, HTC must ensure that all resource agents are trustworthy
• Placing data files only on trusted hosts
• Using an authentication mechanism
• Encrypting network streams
System Administration Remote Customers
Remote access is more convenient than direct access
Customer creates an HTC account
Customer agent can be installed on customer workstation
The administrator allows this agent to access the HTC cluster
For non-trustworthy customers, extra security procedures may be required
Condor
Condor is suitable for high throughput computations
• Running many jobs at the same time on different machines
• Exploiting idle machines
• Allowing many jobs to be completed over a long period of time
• Useful for researchers concerned with the number of jobs they can complete over a particular length of time

Condor
• Running programs unattended and in the background
• Redirecting console input & output from and to files
• Notifying on completion via email
• Allowing tracking of jobs' progress
• Running one job on multiple machines
• Surviving hardware and software failures
• Allowing machines to join and leave
• Enforcing your own policy
Condor as Distributed Job Scheduler

Condor
Condor can be seen as a distributed job scheduler
• Scheduling submitted jobs on available machines
• Allowing users to specify priorities for their jobs
• Ensuring a fair share of resources by constantly recalculating user priority
  • Lower numerical value means higher priority
  • Each user starts with the highest priority (0.5)
  • Priority improves over time if number of used machines < priority
  • Priority worsens over time if number of used machines > priority
• Using checkpoints
  • Suspending and resuming jobs
  • Rescheduling jobs on different machines
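The user-priority rule above (the value drifts toward the number of machines in use, and lower is better) can be sketched with an exponential-smoothing update. The smoothing form, the `half_life_seconds` parameter, and the function name `update_priority` are assumptions made for illustration; only the 0.5 starting value and the direction of change come from the slide.

```python
BEST_PRIORITY = 0.5  # every user starts at the best (lowest) value


def update_priority(priority, machines_in_use, dt_seconds,
                    half_life_seconds=86400.0):
    """Move the priority value toward the number of machines in use.

    Assumed update rule: new = beta * old + (1 - beta) * machines_in_use,
    with beta = 0.5 ** (dt / half_life). The value never drops below 0.5.
    """
    beta = 0.5 ** (dt_seconds / half_life_seconds)
    updated = beta * priority + (1.0 - beta) * machines_in_use
    return max(BEST_PRIORITY, updated)


# Using 10 machines for one half-life worsens a fresh user's priority value.
heavy_user = update_priority(0.5, machines_in_use=10, dt_seconds=86400)

# Using nothing for one half-life improves a previously heavy user's value.
resting_user = update_priority(8.0, machines_in_use=0, dt_seconds=86400)
```

With this rule the value rises while a user holds more machines than their current priority value and falls while they hold fewer, matching the two bullets on the slide.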
Distributed Job Scheduler
• Machine expresses
  • Attributes
  • Conditions
  • Preferences
• Job expresses
  • Attributes
  • Requirements
  • Preferences
• Matchmaker
  • Finds matches
  • Notifies the matched parties

Distributed Job Scheduler
ClassAd Language
• Describing jobs, workstations, and other resources
• Same idea as the classified advertising section of a newspaper
• Exchangeable between processes to schedule jobs
• Providing information about the state of the system
Distributed Job Scheduler
ClassAd Structure
• Set of attribute-value pairs
• Each value can be:
  • Integer
  • Floating point
  • String
  • Logical expression
    • TRUE
    • FALSE
    • UNDEFINED
• Also, attributes from different ClassAds can be used
  • For example: other.size > 3

Distributed Job Scheduler
[Figure: ClassAd example]
Distributed Job Scheduler
Matchmaker:
• Its job is to find matches between two ClassAds (job & machine)
• A match between two ClassAds (job & machine) occurs if
  • The Requirements expressions in both ClassAds evaluate to true
• If more than one match is found, Rank is used to choose among them
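The matching rule above can be sketched in a few lines. ClassAds are modeled here as plain dictionaries, with Requirements and Rank written as Python callables taking `(my, other)`; this is a stand-in for the ClassAd expression language, not its real syntax. A missing attribute is treated like UNDEFINED, which never counts as true. All attribute names and machine names are hypothetical.

```python
def evaluate(expr, my, other):
    """Evaluate an expression; a missing attribute yields None (UNDEFINED)."""
    try:
        return expr(my, other)
    except KeyError:
        return None


def matches(job, machine):
    """Both Requirements expressions must evaluate to true."""
    return (evaluate(job["Requirements"], job, machine) is True
            and evaluate(machine["Requirements"], machine, job) is True)


def best_machine(job, machines):
    """Among matching machines, pick the one the job's Rank prefers."""
    candidates = [m for m in machines if matches(job, m)]
    if not candidates:
        return None
    return max(candidates, key=lambda m: evaluate(job["Rank"], job, m) or 0)


def owner_idle(my, other):
    return my["KeyboardIdleMin"] >= 15


job = {
    "Owner": "alice",
    "ImageSize": 500,
    "Requirements": lambda my, other: (other["Arch"] == "x86"
                                       and other["OpSys"] == "Linux"
                                       and other["Memory"] >= my["ImageSize"]),
    "Rank": lambda my, other: other["Memory"],   # prefer more memory
}

machines = [
    {"Name": "a", "Arch": "x86", "OpSys": "Linux", "Memory": 1024,
     "KeyboardIdleMin": 20, "Requirements": owner_idle},
    {"Name": "b", "Arch": "x86", "OpSys": "Linux", "Memory": 2048,
     "KeyboardIdleMin": 2, "Requirements": owner_idle},    # owner is active
    {"Name": "c", "Arch": "x86", "OpSys": "Linux", "Memory": 768,
     "KeyboardIdleMin": 30, "Requirements": owner_idle},
    {"Name": "d", "Arch": "x86", "OpSys": "Linux",
     "KeyboardIdleMin": 60, "Requirements": owner_idle},   # Memory undefined
]

chosen = best_machine(job, machines)
```

In this example machine "b" has the most memory but fails its own Requirements, and "d" does not advertise Memory at all, so the job is matched with "a".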
Condor: How to submit the job

Distributed Job Scheduler
Job Submission
• Can be done by submitting a job description file
• Job description file
  • It is a plain ASCII text file used to describe a job or a cluster (several jobs)
  • Specify how many times to run the job
  • Specify the directory of the input and output files
  • Specify how to receive notification when execution completes (email or log)
  • Select a Universe
    • Standard or Vanilla
    • PVM
    • MPI
    • GLOBUS (Grid applications)
    • Scheduler (meta-schedulers)
Distributed Job Scheduler
[Figures: description file examples]
Distributed Job Scheduler
[Figure: description file example]

Distributed Job Scheduler
Standard Universe
• Running serial jobs
• Checkpointing is not supported at the kernel level
  • Relinking the program with the Condor system call library
• Transparently processing & restarting checkpoints
• Transparently processing migration
• Automatically using the remote access mechanism
• By default, storing checkpoints on the local disk of the submit machine
• Configurable: they can be stored on a checkpoint server
Distributed Job Scheduler
Standard Universe
• Remote file access

Distributed Job Scheduler
Vanilla Universe
• Running almost all serial jobs
• Running any program that can run outside of Condor
• Typically relying on a shared file system between the submit machine and other nodes
• If there is no shared file system, files will be transferred
Distributed Job Scheduler
MPI Universe
• Managing parallel programs written using MPI
• Uses only dedicated resources

Distributed Job Scheduler
PVM Universe
• Giving the ability to submit PVM applications
• PVM can ask Condor to add a new machine
Condor: dependencies between jobs

Distributed Job Scheduler
DAGMan Scheduler
• Using a directed acyclic graph (DAG) to specify dependencies
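The way a DAG constrains submission order can be sketched with Python's standard `graphlib` module (Python 3.9+). This illustrates dependency ordering in general, not DAGMan's own input format or implementation. The function name `submission_waves` and the job names are hypothetical.

```python
from graphlib import TopologicalSorter


def submission_waves(parents_of):
    """Group jobs into waves; a job is ready once all its parents are done.

    `parents_of` maps each job name to the set of jobs it depends on.
    """
    sorter = TopologicalSorter(parents_of)
    sorter.prepare()               # raises graphlib.CycleError on a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)        # these jobs could be submitted together
        sorter.done(*ready)        # pretend they all finished successfully
    return waves


# Diamond-shaped workflow: A runs first, B and C depend on A, D on both.
diamond = {"B": {"A"}, "C": {"A"}, "D": {"B", "C"}}
waves = submission_waves(diamond)
# waves == [["A"], ["B", "C"], ["D"]]
```

Jobs in the same wave have no dependencies on each other, so a scheduler is free to run them on different machines at the same time.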
Distributed Job Scheduler
DAGMan Scheduler
• Managing the submission of jobs
Condor Architecture
Condor Pool?
• A pool has a central manager and a collection of jobs and machines
• The central manager serves as a centralized repository of information about the state of the pool

Condor Architecture
[Figure: Job Startup]
Condor Architecture
INFN Condor pool
Grid overview
What is Grid?
• "Flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources"
Virtual Organization (VO)
• Dynamic set of individuals and institutions defined by sharing rules to share their resources to achieve a common goal
Example of VO
• A crisis management team and the databases and simulation systems that they use to plan a response to an emergency situation
Grid overview
VO requirements
• Flexible sharing relationships
• Control over shared resources
• Usage modes
• Shared infrastructure services
• Interoperability
Since Grid technology provides a general resource-sharing framework, it can be used to address the VO requirements

Grid Architecture
• Grid architecture must be formed as layers with an hourglass shape
• Each layer contains components sharing the same role
• Components in each layer can use services of lower layers
• The interaction between components can be done through standard protocols
Grid Architecture
Fabric
Interface to local control
• Implements the local, resource-specific operations
• Implements resource enquiry and resource management mechanisms
  • Computational: monitoring and controlling process execution
  • Storage: reading and writing files
  • Network: control over network resources
  • Code repository: managing versioned source code

Grid Architecture
Connectivity
• Defines core communication and authentication protocols
  • To exchange data between Fabric layer resources
• Authentication solutions:
  • The capability to log on once and have access to multiple Grid resources
  • Delegation
  • User-based trust relationships
Grid Architecture
Resource
Sharing Single Resources
• Defines protocols for secure negotiation, initiation, monitoring, control, accounting, and payment of sharing operations on individual resources
• Two primary classes of Resource layer protocols
  • Information protocols
    • To provide information about the structure and state of a resource
  • Management protocols
    • To negotiate access to a shared resource

Grid Architecture
Collective
Coordinating Multiple Resources
• Defines protocols that capture interactions across collections of resources
• Service examples:
  • Directory services
  • Co-allocation, scheduling, and brokering services
  • Monitoring and diagnostics services
  • Data replication services
  • Grid-enabled programming systems
  • Workload management systems and collaboration frameworks
  • Software discovery services
  • Community authorization servers
Grid Architecture
Application
• Implements the business logic
• Operates within a VO environment
• Constructed by calling services defined at any layer
Globus Toolkit
Globus
• Community of organizations and individuals developing fundamental technologies behind the Grid
Globus Toolkit
• Open source software toolkit that provides basic infrastructure, protocols, and services to build grids and applications

Globus Toolkit
Who is involved in the Globus Alliance?
• Argonne National Laboratory's Mathematics and Computer Science Division
• The University of Southern California's Information Sciences Institute
• The University of Chicago's Distributed Systems Laboratory
• The University of Edinburgh in Scotland
• The Swedish Center for Parallel Computers
• National Computational Science Alliance
• The NASA Information Power Grid project
• …..
Globus Toolkit
Projects using Globus Toolkit
• Computer Science: Condor, DOE e-Services, GridLab, NMI GridShib, NMI Performance Monitoring, OGCE, OGSA-DAI, SciDAC CoG, SciDAC Data Grid, SciDAC Security, vGRADS
• Physics: FusionGrid, LIGO, Particle Physics Data Grid
• Astronomy: Sloan Digital Sky Survey, National Virtual Observatory, GriPhyN
• Infrastructure: ASCI (HPSS), EGEE, Grid3, GRIDS Center, iVDGL, NorduGrid, Open Science Grid, TeraGrid, UK e-Science
• Chemistry: CMCS
• Civil Engineering: NEES
• Climate Studies: LEAD, Earth System Grid
• Collaboration: Access Grid

Globus Toolkit
The Toolkit
• Includes a set of services and software components to support building Grids and their applications
• Includes a set of modules
  • Each module provides an interface used by higher-level services to invoke the module's mechanisms
  • Each module provides implementations that use low-level operations to implement these mechanisms in different environments
Globus Toolkit
Fabric:
• Any resources that can be shared
  • For example: distributed file systems and Condor
• Resources defined by vendor-supplied interfaces
• Includes enquiry software to detect resource capabilities and deliver this information to higher-level services
Globus Toolkit
Connectivity
• Grid Security Infrastructure (GSI)
• Nexus

Globus Toolkit
Resource
• Grid Resource Access and Management (GRAM)
• Grid Resource Information Protocol (GRIP)
• Grid Resource Registration Protocol (GRRP)
• GridFTP
Globus Toolkit
Collective
• Grid Information Index Servers (GIISs)
• LDAP information protocol
• Dynamically Updated Request Online Co-allocator (DUROC)

Commonalities & Contrast
Commonalities
• Using dedicated & non-dedicated resources
• Providing powerful capacity
Contrast
• Globus provides tools to build grids, while Condor is software that exploits the resources of workstations to perform extensive tasks
Condor and Globus are complementary technologies
• Condor-G, a Globus-enabled version of Condor
Inefficiencies & Possible
Problems in Condor:
One central manager
One checkpoint server
Possible Solution:
Each should have a mirror server that can take over
if the original server crashes