asynciojobs Documentation Release 0.13.2

Thierry Parmentelat

Oct 10, 2018

Contents

1 README
    1.1 A simple orchestration engine for asyncio
    1.2 Full documentation
    1.3 Prerequisites
    1.4 Installing
    1.5 Examples
    1.6 Nesting schedulers
    1.7 Other useful features on the Scheduler class
    1.8 Troubleshooting
    1.9 Customizing jobs

2 The asynciojobs API
    2.1 The Scheduler classes
    2.2 Job-like classes
    2.3 The Sequence class
    2.4 Notes on ordering
    2.5 Convenience classes

3 ChangeLog
    3.1 0.13.2 - 2018 Oct 10
    3.2 0.13.1 - 2018 Sep 20
    3.3 0.13.0 - 2018 Aug 30
    3.4 0.12.11 - 2018 Aug 23
    3.5 0.12.10 - 2018 Jul 5
    3.6 0.12.9 - 2018 Jul 5
    3.7 0.12.7 - 2018 Jul 4
    3.8 0.12.2 - 2018 Jun 14
    3.9 0.11.4 - 2018 Jun 12
    3.10 0.11.3 - 2018 Jun 12
    3.11 0.11.2 - 2018 May 15
    3.12 0.11.1 - 2018 May 10
    3.13 0.10.2 - 2018 May 2
    3.14 0.10.1 - 2018 Apr 30
    3.15 0.9.1 - 2018 Apr 25
    3.16 0.8.2 - 2018 Apr 20
    3.17 0.8.1 - 2018 Apr 19
    3.18 0.7.1 - 2018 Apr 17
    3.19 0.6.1 - 2018 Mar 12
    3.20 0.6.0 - 2018 Feb 25
    3.21 0.5.8 - 2018 Jan 16
    3.22 0.5.7 - 2017 Dec 19
    3.23 0.5.6 - 2017 Dec 18
    3.24 0.5.5 - 2017 Nov 2
    3.25 0.5.4 - 2016 Dec 15
    3.26 0.5.2 - 2016 Dec 8
    3.27 0.5.0 - 2016 Dec 6
    3.28 0.4.6 - 2016 Dec 5
    3.29 0.4.4 - 2016 Dec 5
    3.30 0.4.3 - 2016 Dec 2
    3.31 0.4.2 - 2016 Dec 1
    3.32 0.4.1 - 2016 Nov 30
    3.33 0.4.0 - 2016 Nov 21
    3.34 0.3.4 - 2016 Nov 20
    3.35 0.3.3 - 2016 Nov 17
    3.36 0.3.2 - 2016 Nov 17
    3.37 0.3.1 - 2016 Nov 15
    3.38 0.2.3 - 2016 Oct 23
    3.39 0.2.2 - 2016 Oct 20
    3.40 0.2.1 - 2016 Oct 7
    3.41 0.2.0 - 2016 Oct 4
    3.42 0.1.2 - 2016 Oct 2
    3.43 0.1.1 - 2016 Sep 28
    3.44 0.1.0 - 2016 Sep 27
    3.45 0.0.6 - 2016 Sep 21
    3.46 0.0.5 - 2016 Sep 21
    3.47 0.0.4 - 2016 Sep 20
    3.48 0.0.3 - 2016 Sep 19
    3.49 0.0.2 - 2016 Sep 15
    3.50 0.0.1 - 2016 Sep 15

4 Indices and tables

Python Module Index


Contents:


CHAPTER 1

README

1.1 A simple orchestration engine for asyncio

The main and single purpose of this library is to allow for the static description of scenarii involving asyncio-compliant jobs, that have dependencies in the sense that a given job cannot start until its requirements have completed.

So in a nutshell you would:

• define a set of Job objects,
• together with their requires relationship; that is to say, for each of them, which other jobs need to have completed before this one can be triggered,
• and run this logic through a Scheduler object, that will orchestrate the whole scenario.

Further features allow you to:

• define a job as critical or not; a critical job that raises an exception causes the orchestration to terminate abruptly;
• define a job as running forever, in which case the scheduler of course won't wait for it, but instead will terminate it when all other jobs are done;
• define a global timeout for the whole scheduler;
• define a window, in terms of a maximal number of simultaneous jobs that are allowed to run;
• define nested schedulers: a Scheduler instance being also a job, a scheduler can be inserted in another scheduler just as if it were a regular job; nested schedulers allow for reusability, since workflow pieces can for example be returned by regular Python functions.

A job object can be created:

• either as a Job instance from a regular asyncio coroutine;
• or by specializing the AbstractJob class and defining its co_run() method; this is for example the case for the SshJob class in the apssh library.


As a convenience, the Sequence helper class can free you from manually managing the requires deps in long strings of jobs that must run sequentially.

1.2 Full documentation

This document, along with asynciojobs's API reference documentation and changelog, is available at http://asynciojobs.readthedocs.io

Contact author: thierry dot parmentelat at inria dot fr

Licence: CC BY-NC-ND

1.3 Prerequisites

asynciojobs requires asyncio and Python 3.5 or more recent.

```python
import sys
major, minor = sys.version_info[:2]
if (major, minor) < (3, 5):
    print("asynciojobs won't work in this environment")
```

1.4 Installing

asynciojobs requires Python 3.5, and can be installed from the PyPI repository:

```shell
pip3 install asynciojobs
```

1.4.1 Extra dependency to graphviz

This installation method will not try to install the graphviz Python package, which can be cumbersome to install, and which is not strictly necessary at run-time for orchestrating a scheduler. We do recommend installing it for developing scenarios though, since, as we will see shortly, asynciojobs provides a graphical representation of schedulers that is very convenient for debugging.

1.5 Examples

```python
import asyncio
```

In all our examples, we will use the Watch class, a helper class that works like a stopwatch; instead of printing the current time, we prefer to display the time elapsed since the beginning, which corresponds to the time when the watch instance was created or reset:

```python
import time
from asynciojobs import Watch

watch = Watch()
time.sleep(0.5)
watch.print_elapsed('some')
```

```
000.000
000.505 some
```

We can now write a simple coroutine for illustrating schedulers through small examples:

```python
import time

watch = Watch()
```

```python
# just print a message when entering and exiting, and sleep in the middle
async def in_out(timeout):
    global watch
    watch.print_elapsed("-> in_out({})\n".format(timeout))
    await asyncio.sleep(timeout)
    watch.print_elapsed("<- in_out({})\n".format(timeout))
    # return something easy to recognize: the number of milliseconds
    return 1000 * timeout
```

000.000

1.5.1 Example A : running in parallel

Running a series of coroutines in parallel - a la gather - can be done like this:

```python
from asynciojobs import Job, Scheduler

a1, a2, a3 = Job(in_out(0.1)), Job(in_out(0.2)), Job(in_out(0.25))
```

What we’re saying here is that we have three jobs, that have no relationships between them. So when we run them, we would start all 3 coroutines at once, and return once they are all done:

```python
# this is required in our case because our coroutines
# use watch to show time elapsed since reset()
watch.reset()

sa = Scheduler(a1, a2, a3)
sa
```

```
Scheduler with 0 done+0 ongoing+3 idle=3 job(s)
```

```python
sa.run()
```

```
000.016 -> in_out(0.1)
000.016 -> in_out(0.2)
000.016 -> in_out(0.25)
000.119 <- in_out(0.1)
000.219 <- in_out(0.2)
000.269 <- in_out(0.25)
```

True

Note: the run() method is a regular Python function, which is easier to illustrate in this README; in practical terms, it is only a wrapper around the co_run() coroutine method.

1.5.2 Several programming styles

The library offers great flexibility for creating schedulers and jobs. In particular, in the example above we created the jobs first, and then added them to the scheduler; it is also possible to do it the other way around, as in this totally equivalent construction:

```python
# if it is more to your taste, you can as well
# create the scheduler first
sa2 = Scheduler()

# and then add jobs in it as you create them
a1, a2 = Job(in_out(0.1), scheduler=sa2), Job(in_out(0.2), scheduler=sa2)

# or add them later on
a3 = Job(in_out(0.25))
sa2.add(a3)
```

```
Scheduler with 0 done+0 ongoing+3 idle=3 job(s)
```

```python
watch.reset()
sa2.run()
```

```
000.000 -> in_out(0.1)
000.000 -> in_out(0.2)
000.000 -> in_out(0.25)
000.105 <- in_out(0.1)
000.205 <- in_out(0.2)
000.254 <- in_out(0.25)
```

True

1.5.3 Retrieving individual results

We can see right away how to retrieve the results of the various jobs:

```python
a1.result()
```

100.0

6 Chapter 1. README asynciojobs Documentation, Release 0.13.2

1.5.4 Example B : adding requirements (dependencies)

Now we can add requirement dependencies between jobs, as follows. Here we want to run:

• job 1, followed by job 2,
• all this in parallel with job 3.

We take this chance to show that jobs can be tagged with a label, which can be convenient for a more friendly display.

```python
b1, b2, b3 = (Job(in_out(0.1), label="b1"),
              Job(in_out(0.2), label="b2"),
              Job(in_out(0.25)))

b2.requires(b1)
```

Now b2 needs b1 to be finished before it can start. So only the first 2 coroutines get started at the beginning, and b2 starts only once b1 has finished.

```python
watch.reset()
```

```python
# with this setup we are certain that b3 ends in the middle of b2
sb = Scheduler(b1, b2, b3)
sb.run()
```

```
000.000 -> in_out(0.1)
000.000 -> in_out(0.25)
000.103 <- in_out(0.1)
000.104 -> in_out(0.2)
000.255 <- in_out(0.25)
000.309 <- in_out(0.2)
```

True

1.5.5 Example B’ : exact same using a Sequence

The code above in example B is exactly identical to this:

```python
from asynciojobs import Sequence

sb2 = Scheduler(
    Sequence(Job(in_out(0.1), label="bp1"),
             Job(in_out(0.2), label="bp2")),
    Job(in_out(0.25)))

watch.reset()
sb2.run()
```

```
000.000 -> in_out(0.1)
000.000 -> in_out(0.25)
000.104 <- in_out(0.1)
000.104 -> in_out(0.2)
000.255 <- in_out(0.25)
000.309 <- in_out(0.2)
```

True

1.5.6 Return value for Scheduler.run()

Note that because sb.run() returned True, we could have inferred that all jobs completed. As a matter of fact, run() returns True if and only if:

• all jobs have completed during the allocated timeout, if specified, and
• no critical job has raised an exception.

Note: what happens if these two conditions are not met depends on the critical attribute of the scheduler object:

• if the scheduler is not critical: if either of these conditions is not met, run() returns False;
• if the scheduler is itself critical: run() will raise an exception, depending on the reason behind the failure; see co_run() for details.

This behaviour has been chosen so that nested schedulers do the right thing: it allows exceptions to bubble from inner schedulers up to the toplevel one, and to trigger its abrupt termination. See also failed_critical(), failed_time_out(), debrief() and why().

1.5.7 Inspecting scheduler and results - Scheduler.list()

Before we see more examples, let's review more ways to get information about what happened, once run() finishes. For example, to check that job b1 has completed:

```python
print(b1.is_done())
```

True

To check that job b3 has not raised an exception:

```python
print(b3.raised_exception())
```

None

To see an overview of a scheduler, just use the list() method, that will summarize the contents of a scheduler:

```python
sb.list()
```

```
1 [[ -> 100.0]]
2 [[ -> 200.0]] requires={1}
3 [[ -> 250.0]]
```

8 Chapter 1. README asynciojobs Documentation, Release 0.13.2

The textual representation displayed by list() shows all the jobs, each with:

• its rank in the topological order of the graph (graphs with cycles will need to use list_safe()),
• its progress with respect to the job's lifecycle,
• its label, or some computed label if not specified,
• its result or exception, if the job has run,
• its requirements.

The individual lifecycle for a job instance is:

idle → scheduled → running → done

where the 'scheduled' state is for cases where a maximal number of simultaneous jobs has been reached - see jobs_window - so the job essentially has all its requirements fulfilled, but still waits for its turn.

With that in mind, here is a complete list of the symbols used, with their meaning:

• : idle (read: requirements are not fulfilled)
• : scheduled (read: waiting for a slot in the jobs window)
• : running
• : complete
• : raised an exception
• : went through fine (no exception raised)
• : defined as critical
• ∞ : defined as forever

And here's an example of output for list() with all possible combinations of jobs:

```
01 ∞
02 - requires {01}
03 ∞ - requires {02}
04 - requires {03}
05 ∞ - requires {04}
06 - requires {05}
07 ∞ - requires {06}
08 - requires {07}
09 ∞ - requires {08}
10 - requires {09}
11 ∞ - requires {10}
12 - requires {11}
13 ∞ !! CRIT. EXC. => bool:True!! - requires {12}
14 !! CRIT. EXC. => bool:True!! - requires {13}
15 ∞ !! exception => bool:True!! - requires {14}
16 !! exception => bool:True!! - requires {15}
17 ∞ [[ -> 0]] - requires {16}
18 [[ -> 0]] - requires {17}
19 ∞ [[ -> 0]] - requires {18}
20 [[ -> 0]] - requires {19}
```


Note that if your locale/terminal cannot render these symbols, the code will tentatively fall back to pure ASCII output.

1.5.8 Graphical representation

It is easy to get a graphical representation of a scheduler. From inside a Jupyter notebook, you would just need to do e.g.:

```python
sb2.graph()
```

However, in the context of readthedocs, this notebook is translated into a static markdown file, so we cannot use this elegant approach. Instead, we use a more rustic workflow, that first creates the graph as a dot file, and then uses an external tool to produce a png file:

```python
# this can always be done, it does not require graphviz to be installed
sb2.export_as_dotfile("readme-example-b.dot")
```

'(Over)wrote readme-example-b.dot'

```python
# assuming you have the 'dot' program installed (it ships with graphviz)
import os
os.system("dot -Tpng readme-example-b.dot -o readme-example-b.png")
```

0

We can now look at the result, where you can recognize the logic of example B:

1.5.9 Example C : infinite loops, or coroutines that don’t return

Sometimes it is useful to deal with an endless loop; for example, if we want to completely separate actions from printing, we can use an asyncio.Queue to implement a simple message bus, as follows:

```python
message_bus = asyncio.Queue()

async def monitor_loop(bus):
    while True:
        message = await bus.get()
        print("BUS: {}".format(message))
```


Now we need a modified version of the in_out coroutine, that interacts with this message bus instead of printing anything itself:

```python
async def in_out_bus(timeout, bus):
    global watch
    await bus.put("{} -> in_out({})".format(watch.elapsed(), timeout))
    await asyncio.sleep(timeout)
    await bus.put("{} <- in_out({})".format(watch.elapsed(), timeout))
    # return something easy to recognize
    return 10 * timeout
```

We can replay the previous scenario, adding the monitoring loop as a separate job. However, we need to declare this extra job with forever=True, so that the scheduler knows it does not have to wait for the monitoring loop; we know in advance that this monitoring loop will, by design, never return.

```python
c1, c2, c3, c4 = (Job(in_out_bus(0.2, message_bus), label="c1"),
                  Job(in_out_bus(0.4, message_bus), label="c2"),
                  Job(in_out_bus(0.3, message_bus), label="c3"),
                  Job(monitor_loop(message_bus), forever=True, label="monitor"))

c3.requires(c1)

watch.reset()
sc = Scheduler(c1, c2, c3, c4)
sc.run()
```

```
BUS: 000.000 -> in_out(0.2)
BUS: 000.000 -> in_out(0.4)
BUS: 000.205 <- in_out(0.2)
BUS: 000.206 -> in_out(0.3)
BUS: 000.406 <- in_out(0.4)
BUS: 000.509 <- in_out(0.3)
```

True

Note that run() always terminates as soon as all the non-forever jobs are complete. The forever jobs, on the other hand, get cancelled, so of course no return value is available at the end of the scenario:

```python
sc.list()
```

```
1 [[ -> 2.0]]
2 [[ -> 4.0]]
3 [[ -> 3.0]] requires={1}
4 ∞ [not done]
```

Forever jobs appear with a dotted border on a graphical representation:


```python
# a function to materialize the rustic way of producing a graphical representation
def make_png(scheduler, prefix):
    dotname = "{}.dot".format(prefix)
    pngname = "{}.png".format(prefix)
    scheduler.export_as_dotfile(dotname)
    os.system("dot -Tpng {dotname} -o {pngname}".format(**locals()))
    print(pngname)

make_png(sc, "readme-example-c")
```

```
readme-example-c.png
```

Note: a scheduler being essentially a set of jobs, the order of creation of jobs in the scheduler is not preserved in memory.

1.5.10 Example D : specifying a global timeout

A Scheduler object has a timeout attribute, that can be set to a duration (in seconds). When provided, run() will ensure its global duration does not exceed this value; it will return False or raise TimeoutError if the timeout triggers.

Of course this can be used with any number of jobs and dependencies, but for the sake of simplicity, let us see this in action with just one job that loops forever:

```python
async def forever():
    global watch
    for i in range(100000):
        print("{}: forever {}".format(watch.elapsed(), i))
        await asyncio.sleep(.1)

j = Job(forever(), forever=True)
watch.reset()
sd = Scheduler(j, timeout=0.25, critical=False)
sd.run()
```

```
000.000: forever 0
000.104: forever 1
000.209: forever 2
11:34:57.169 SCHEDULER(None): PureScheduler.co_run: TIMEOUT occurred
```

False

As you can see, the result of run() in this case is False, since not all jobs have completed. Apart from that, the job is now in this state:

```python
j
```

[not done]

1.5.11 Handling exceptions

A job instance can be critical or not; what this means is as follows:

• if a critical job raises an exception, the whole scheduler aborts immediately and returns False;
• if a non-critical job raises an exception, the whole scheduler proceeds regardless.

In both cases, the exception can be retrieved from the corresponding Job object with raised_exception().

1.5.12 Example E : non-critical jobs

```python
async def boom(n):
    await asyncio.sleep(n)
    raise Exception("boom after {}s".format(n))
```

```python
# by default everything is non-critical
e1 = Job(in_out(0.2), label='begin')
e2 = Job(boom(0.2), label="boom", critical=False)
e3 = Job(in_out(0.3), label='end')
se = Scheduler(Sequence(e1, e2, e3), critical=False)
```

```python
# with these settings, job 'end' is not hindered
# by the middle job raising an exception
watch.reset()
se.run()
```

```
000.000 -> in_out(0.2)
000.205 <- in_out(0.2)
000.409 -> in_out(0.3)
000.710 <- in_out(0.3)
```

True


```python
# in this listing you can see that job 'end'
# has been running and has returned '300' as expected
se.list()
```

```
1 [[ -> 200.0]]
2 !! exception => Exception:boom after 0.2s!! requires={1}
3 [[ -> 300.0]] requires={2}
```

Non-critical jobs and schedulers show up with a thin, black border:

```python
make_png(se, "readme-example-e")
```

```
readme-example-e.png
```

1.5.13 Example F : critical jobs

Making the boom job critical would instead cause the scheduler to bail out:

```python
f1 = Job(in_out(0.2), label="begin")
f2 = Job(boom(0.2), label="boom", critical=True)
f3 = Job(in_out(0.3), label="end")
sf = Scheduler(Sequence(f1, f2, f3), critical=False)
```

```python
# with this setup, orchestration stops immediately
# when the exception triggers in boom()
# and the last job does not run at all
watch.reset()
sf.run()
```


```
000.000 -> in_out(0.2)
000.202 <- in_out(0.2)
11:34:58.383 SCHEDULER(None): Emergency exit upon exception in critical job
```

False

```python
# as you can see, job 'end' has not even started here
sf.list()
```

```
1 [[ -> 200.0]]
2 !! CRIT. EXC. => Exception:boom after 0.2s!! requires={1}
3 [not done] requires={2}
```

Critical jobs and schedulers show up with a thick, red border:

```python
make_png(sf, "readme-example-f")
```

```
readme-example-f.png
```

1.5.14 Limiting the number of simultaneous jobs

A Scheduler has a jobs_window attribute that allows you to specify a maximum number of jobs running simultaneously. When jobs_window is not specified or 0, no limit is imposed on the number of running jobs.


```python
# let's define a simple coroutine
async def aprint(message, delay=0.5):
    print(message)
    await asyncio.sleep(delay)
```

```python
# let us now add 8 jobs that take 0.5 second each
s = Scheduler(jobs_window=4)

for i in range(1, 9):
    s.add(Job(aprint("{} {}-th job".format(watch.elapsed(), i), 0.5)))
```

```python
# so running them with a window of 4 means approx. 1 second
watch.reset()
s.run()
# expect around 1 second
print("total duration = {}s".format(watch.elapsed()))
```

```
000.469 1-th job
000.469 2-th job
000.469 3-th job
000.469 4-th job
000.469 5-th job
000.469 6-th job
000.469 7-th job
000.469 8-th job
total duration = 001.004s
```

1.6 Nesting schedulers

As mentioned in the introduction, a Scheduler instance can itself be used as a job. This makes it easy to split complex scenarii into pieces, and to combine them in a modular way. Let us consider the following example:

```python
# we start with the creation of an internal scheduler
# that has a simple diamond structure
sub_sched = Scheduler(label="critical nested", critical=True)
subj1 = Job(aprint("subj1"), label='subj1', scheduler=sub_sched)
subj2 = Job(aprint("subj2"), label='subj2', required=subj1, scheduler=sub_sched)
subj3 = Job(aprint("subj3"), label='subj3', required=subj1, scheduler=sub_sched)
subj4 = Job(aprint("subj4"), label='subj4', required=(subj2, subj3),
            scheduler=sub_sched)

make_png(sub_sched, "readme-subscheduler")
```

```
readme-subscheduler.png
```


We can now create a main scheduler, in which one of the jobs is this low-level scheduler:

```python
# the main scheduler
main_sched = Scheduler(
    Sequence(
        Job(aprint("main-start"), label="main-start"),
        # the way to graft the low-level logic in this main workflow
        # is to just use the Scheduler instance as a job
        sub_sched,
        Job(aprint("main-end"), label="main-end"),
    )
)
```

This nested structure is rendered by both list() and graph():

```python
# list() shows the contents of sub-schedulers implemented as Scheduler instances
main_sched.list()
```

```
1 [not done]
2 [not done] requires={1} -> entries={3}
3 > [not done]
4 > [not done] requires={3}
5 > [not done] requires={3}
6 > [not done] requires={4, 5}
2 --end-- < exits={6}
7 [not done] requires={2}
```

When using a Scheduler to describe nested schedulers, asynciojobs will also produce a graphical output that properly exhibits the overall structure.

Let us do this again another way, so that this shows up properly in readthedocs:

```python
make_png(main_sched, "readme-nested")
```


readme-nested.png

Here’s how the main scheduler would be rendered by graph():

Which, when executed, produces this output:

```python
main_sched.run()
```

```
main-start
subj1
subj2
subj3
subj4
main-end
```

True

1.6.1 Benefits of nesting schedulers

This feature can come in handy to deal with issues like these:

• you want to be able to re-use code - as in writing a library - and nesting schedulers is a convenient way to address that; functions can return pieces of workflows implemented as schedulers, that can easily be mixed within a larger scenario;
• in another dimension, nested schedulers can be a solution if:
  – you want the jobs_window attribute to apply to only a subset of your jobs;
  – or you need the timeout attribute to apply to only a subset of your jobs;
  – you have forever jobs that need to be terminated sooner than the very end of the overall scenario.

1.6.2 Historical note

Internally, asynciojobs comes with the PureScheduler class. A PureScheduler instance is a fully functional scheduler, but it cannot be used as a nested scheduler. In terms of implementation, Scheduler is a mixin class that inherits from both PureScheduler and AbstractJob.

In previous versions of this library, the Scheduler class could not be nested, and a specific class was required for the purpose of creating nestable schedulers, as shown in this table.

The bottom line is that, starting with version 0.10, users primarily do not need to worry about that distinction, and creating only nestable Scheduler objects is the recommended approach.

1.7 Other useful features on the Scheduler class

1.7.1 Inspect / troubleshoot : Scheduler.debrief()

Scheduler.debrief() is designed for schedulers that have run and returned False; it outputs the same listing as list(), but with additional statistics on the number of jobs, and, most importantly, on the stacks of the jobs that have raised an exception.

1.7.2 Cleanup : Scheduler.sanitize()

In some cases, especially test scenarios, it can happen that some jobs carry requirements towards jobs that are not inserted in the scheduler. The sanitize() method removes such extra requirements; unless you are certain this does not apply to you, it might be a good idea to call it explicitly before an orchestration.


1.7.3 Early checks : Scheduler.check_cycles()

check_cycles() will check for cycles in the requirements graph; it returns a boolean. It's a good idea to call it before running an orchestration.

1.7.4 Need a coroutine instead ? : Scheduler.co_run()

run() is a regular def function (i.e. not an async def); in fact, it is just a wrapper around the native coroutine called co_run():

```python
def run(self, *args, **kwds):
    loop = asyncio.get_event_loop()
    return loop.run_until_complete(self.co_run(*args, **kwds))
```

1.7.5 Cleaning up - the shutdown() method.

Scheduler objects expose the shutdown() method. This method should be called explicitly by the user when resources are attached to the various jobs and need to be released. Contrary to what was done in older versions of asynciojobs, where nested schedulers were not yet as massively useful, this call needs to be explicit: it is no longer automatically invoked by run() when the orchestration is over.

Although such a cleanup is not really useful in the case of local Job instances, some application libraries like apssh define jobs that are attached to network connections - ssh connections in the case of apssh - and it is convenient to be able to terminate those connections explicitly.

1.7.6 Visualization - in a notebook : Scheduler.graph()

If you have the graphviz package installed, you can inspect a scheduler instance in a Jupyter notebook by using the graph() method, that returns a graphviz.Digraph instance; this way the scheduler graph can be displayed interactively in a notebook - see also http://graphviz.readthedocs.io/en/stable/manual.html#jupyter-notebooks. Here’s a simple example:

```python
# and a simple scheduler with an initialization and 2 concurrent tasks
s = Scheduler()
j1 = Job(aprint("j1"), label="init", critical=False, scheduler=s)
j2 = Job(aprint("j2"), label="critical job", critical=True,
         scheduler=s, required=j1)
j3 = Job(aprint("j3"), label="forever job", critical=False, forever=True,
         scheduler=s, required=j1)
s.graph()
```

In a regular notebook, that is all you need to do to see the scheduler’s graph. In the case of this README though, once rendered on readthedocs.io the graph has got lost in translation, so please read on to see that graph.

1.7.7 Visualization - the long way : Scheduler.export_as_dotfile()

If visualizing in a notebook is not an option, or if you do not have graphviz installed, you can still produce a dotfile from a scheduler object:


```python
s.export_as_dotfile('readme-dotfile.dot')
```

'(Over)wrote readme-dotfile.dot'

Then later on - and possibly on another host - you can use this dot file as an input to produce a .png graphics file, using the dot program (which is part of graphviz), like e.g.:

```python
import os
os.system("dot -Tpng readme-dotfile.dot -o readme-dotfile.png")
```

0

Which now allows us to render the graph for our last scheduler as a png file:

Legend: on this small example, we can see that:

• critical jobs come with a thick, red border;
• while non-critical jobs have a finer, black border;
• forever jobs have a dotted border;
• while usual jobs have a continuous border.

Although this is not illustrated here, the same graphical legend applies to nested schedulers as well.

Note that if you do have graphviz available, you can produce a png file a little more simply - i.e. without the need for creating the dot file - like this:

```python
# a trick to produce a png file on a box that has graphviz pip-installed
g = s.graph()
g.format = 'png'
g.render('readme')
```

'readme.png'

1.8 Troubleshooting

As a general rule, and maybe especially when dealing with nested schedulers, it is important to keep in mind the following constraints.


1.8.1 Don’t insert a job in several schedulers

One given job should be inserted in exactly one scheduler. Be aware that the code does not check for this; it is the programmer's responsibility to enforce this rule.

A job that is not inserted in any scheduler will of course never be run. A job inserted in several schedulers will most likely behave very oddly, as each scheduler will be in a position to have it move along.

1.8.2 You can only create requirements between jobs in the same scheduler

With nested schedulers, it can be tempting to create dependencies between jobs that are not part of the same scheduler, but that belong to sibling or cousin schedulers. This is currently not supported: a job can only have requirements on other jobs in the same scheduler.

As with the previous topic, as of now there is no provision in the code to enforce this, and failing to comply with the rule will result in unexpected behaviour.

1.8.3 Create as many job instances as needed

Another common mistake is to try and reuse a job instance in several places in a scheduler. Each instance carries the state of the job's progress, so it is important to create as many instances/copies as there are tasks, and not to share job objects.

In particular, if you take one job instance that has completed, and try to insert it into a new scheduler, it will be considered done, and will not run again.

1.8.4 You can’t run the same scheduler twice

In much the same way, once a scheduler is done - assuming all went well - essentially all of its jobs are marked as done, and trying to run it again will either do nothing, or raise an exception.

1.9 Customizing jobs

1.9.1 Customizing the Job class

Job actually is a specialization of AbstractJob, and the specification is that the co_run() method should itself denote a coroutine, as that is what Scheduler triggers when running said job.

1.9.2 AbstractJob.co_shutdown()

The shutdown() method on a scheduler invokes the co_shutdown() method on all - possibly nested - jobs. The default behaviour - in the Job class - is to do nothing, but this can be redefined by daughter classes of AbstractJob when relevant. Typically, an implementation of an SshJob will allow for a given SSH connection to be shared amongst several SshJob instances, and so co_shutdown() may be used to close the underlying SSH connections.


1.9.3 The apssh library and the SshJob class

You can easily define your own Job class by specializing job.AbstractJob. As an example - and this was the primary target when developing asynciojobs - you can find in the apssh library an SshJob class, with which you can easily orchestrate scenarios involving several hosts that you interact with using ssh.


CHAPTER 2

The asynciojobs API

2.1 The Scheduler classes

The PureScheduler class is a set of AbstractJob objects, that together with their required relationship form an execution graph.

class asynciojobs.purescheduler.PureScheduler(*jobs_or_sequences, jobs_window=None, timeout=None, shutdown_timeout=1, watch=None, verbose=False)

A PureScheduler instance is made of a set of AbstractJob objects.

The purpose of the scheduler object is to orchestrate an execution of these jobs that respects the required relationships, until they are all complete. It starts with the ones that have no requirement, and then triggers the other ones as their requirement jobs complete. For this reason, the dependency/requirements graph must be acyclic.

Optionally, a scheduler orchestration can be confined to a finite number of concurrent jobs (see the jobs_window parameter below).

It is also possible to define a timeout attribute on the object, that will limit the execution time of a scheduler.

Running an AbstractJob means executing its co_run() method, which must be a coroutine.

The result of a job's co_run() is NOT taken into account, as long as it returns without raising an exception. If it does raise an exception, overall execution is aborted iff the job is critical. In all cases, the result and/or exception of each individual job can be inspected and retrieved individually at any time, including of course once the orchestration is complete.

Parameters

• jobs_or_sequences – instances of AbstractJob or Sequence. The order in which they are mentioned is irrelevant.
• jobs_window – is an integer that specifies how many jobs can be run simultaneously. None or 0 means no limit.


• timeout – can be an int or float, expressed in seconds; it applies to the overall orchestration of that scheduler, not to any individual job. Can also be None, which means no timeout.
• shutdown_timeout – same meaning as timeout, but for the shutdown phase.
• watch – if the caller passes a Watch instance, it is used in debugging messages to show the time elapsed with respect to that watch, instead of using the wall clock.
• verbose (bool) – flag that says if execution should be verbose.

Examples

Creating an empty scheduler:

s = Scheduler()

A scheduler with a single job:

s = Scheduler(Job(asyncio.sleep(1)))

A scheduler with 2 jobs in parallel:

s = Scheduler(Job(asyncio.sleep(1)), Job(asyncio.sleep(2)))

A scheduler with 2 jobs in sequence:

s = Scheduler(
    Sequence(
        Job(asyncio.sleep(1)),
        Job(asyncio.sleep(2)),
    )
)

In this document, the Schedulable name refers to a type hint that encompasses instances of either the AbstractJob or Sequence classes.

add(job)
Adds a single Schedulable object; this method name is inspired by plain Python set.add().
Parameters job – a single Schedulable object.
Returns the scheduler object, for cascading insertions if needed.
Return type self

check_cycles()
Performs a minimal sanity check. The purpose of this is primarily to check for cycles and/or missing starting points. It is not embedded in co_run() because it is not strictly necessary, but it is safer to call this before running the scheduler if one wants to double-check the job dependency graph early on. It might also help to have a sanitized scheduler, but here again this is up to the caller.
Returns True if the topology is fine
Return type bool


coroutine co_run()
The primary entry point for running a scheduler. See also run() for a synchronous wrapper around this coroutine.
Runs member jobs (that is, schedules their co_run() method) in an order that satisfies their required relationships. Proceeds to the end no matter what, except if either:
• one critical job raises an exception, or
• a timeout occurs.

Returns True if none of these 2 conditions occur, False otherwise. Return type bool

Jobs marked as forever are not waited for. No automatic shutdown is performed; the user needs to explicitly call co_shutdown() or shutdown().

coroutine co_shutdown()
Shuts down the scheduler, by sending the co_shutdown() method to all the jobs, possibly nested. Within nested schedulers, a job receives the shutdown event when its enclosing scheduler terminates, and not at the end of the outermost scheduler.
Also note that all job instances receive the co_shutdown() method, even the ones that have not yet started; it is up to the co_shutdown() method to triage the jobs according to their life cycle status - see is_running() and similar.
This mechanism should be used for minimal housekeeping only; it is recommended that intrusive cleanup be made part of separate, explicit methods.

Note: typically in apssh, for example, several jobs sharing the same ssh connection need to arrange for that connection to be kept alive across an entire scheduler lifespan, and closed later on. Historically there had been an attempt to deal with this automagically, through the present shutdown mechanism. However, this turned out to be the wrong choice, as the decision to close connections needs to be left to the user. Additionally, with nested schedulers, this can become pretty awkward. Closing ssh connections is now to be achieved explicitly, through a call to a specific apssh function.

Returns True if all the co_shutdown() methods attached to the jobs in the scheduler complete within shutdown_timeout, which is an attribute of the scheduler. If the shutdown_timeout attribute on this object is None, no timeout is implemented.
Return type bool

Notes

There is probably room for a lot of improvement here:
• behaviour is unspecified if any of the co_shutdown() methods raises an exception;
• right now, a subscheduler that sees a timeout expiration does not cause the overall co_shutdown() to return False, which is arguable;
• another possible weakness of the current implementation is that it does not support shutting down a scheduler that is still running.


debrief(details=False)
Designed for schedulers that have failed to orchestrate. Prints a complete report, that includes list() but also gives more stats and data.

dot_format()
Creates a graph that depicts the jobs and their requires relationships, in DOT format.
Returns: str: a representation of the graph in DOT format underlying this scheduler.
See graphviz's documentation, together with its Python wrapper library, for more information on the format and available tools. See also Wikipedia on DOT for a list of tools that support the dot format.
As a general rule, asynciojobs has support for producing DOT format but stops short of actually importing graphviz, which can be cumbersome to install, with the notable exception of the graph() method. See that method for how to convert a PureScheduler instance into a native DiGraph instance.

entry_jobs()
A generator that yields all jobs that have no requirement.
Examples:
List all entry points:

for job in scheduler.entry_jobs():
    print(job)

exit_jobs(*, discard_forever=True, compute_backlinks=True)
A generator that yields all jobs that are not a requirement to another job; it is thus, in some sense, the reverse of entry_jobs().
Parameters
• discard_forever – if True, jobs marked as forever are skipped; forever jobs often have no successors, but are seldom of interest when calling this method.
• compute_backlinks – for this method to work properly, it is necessary to compute backlinks, an internal structure that holds the opposite of the required relationship. Passing False here allows you to skip that stage, when that relationship is known to be up to date already.

export_as_dotfile(filename)
This method does not require graphviz to be installed; it writes a file in dot format for post-processing with e.g. graphviz's dot utility. It is a simple wrapper around dot_format().
Parameters filename – where to store the result.
Returns a message that can be printed for information, like e.g. "(Over)wrote foo.dot"
Return type str
See also the graph() method, that serves a similar purpose but natively as a graphviz object.
As an example of post-processing, a PNG image can then be obtained from that dotfile with e.g.:

dot -Tpng foo.dot -o foo.png


export_as_pngfile(filename)
Convenience wrapper that creates a png file. Like graph(), it requires the graphviz package to be installed.
Parameters filename – output filename, without the .png extension
Returns created file name

Notes

• This actually uses the binary dot program.
• A file named like the output, but with a .dot extension, is created as an artefact by this method.

failed_critical()
Returns True if and only if co_run() has failed because a critical job has raised an exception.
Return type bool

failed_time_out()
Returns True if and only if co_run() has failed because of a timeout.
Return type bool

graph()
Returns a graph
Return type graphviz.Digraph
This method serves the same purpose as export_as_dotfile(), but it natively returns a graph instance. For that reason, its usage requires the installation of the graphviz package.
This method is typically useful in a Jupyter notebook, so as to visualize a scheduler in graph format - see http://graphviz.readthedocs.io/en/stable/manual.html#jupyter-notebooks for how this works.
The dependency from asynciojobs to graphviz is limited to this method and export_as_pngfile(), as these are the only places that need it, and as installing graphviz can be cumbersome. For example, on macOS I had to do both:

brew install graphviz  # for the C/C++ binary stuff
pip3 install graphviz  # for the python bindings

iterate_jobs(scan_schedulers=False)
A generator that scans all jobs and subjobs.
Parameters scan_schedulers – if not set, nested schedulers are ignored and only actual jobs are reported; otherwise, nested schedulers are listed as well.

list(details=False)
Prints a complete list of jobs in topological order, with their status summarized by a few signs. See the README for examples and a legend.
Beware that this might raise an exception if check_cycles() would return False, i.e. if the graph is not acyclic.


list_safe()
Prints jobs in no specific order; the advantage is that it works even if the scheduler is broken wrt check_cycles(). On the other hand, this method is not able to list requirements.

orchestrate(*args, **kwds)
A synchronous wrapper around co_run(), please refer to that link for details on parameters and return value.
Also, the canonical name for this is run(), but for historical reasons you can also use orchestrate() as an alias for run().

remove(job)
Removes a single Schedulable object; this method name is inspired by plain Python set.remove().
Parameters job – a single Schedulable object.
Raises KeyError – if job not in scheduler.
Returns the scheduler object, for cascading insertions if needed.
Return type self

run(*args, **kwds)
A synchronous wrapper around co_run(), please refer to that link for details on parameters and return value.
Also, the canonical name for this is run(), but for historical reasons you can also use orchestrate() as an alias for run().

sanitize(verbose=None)
This method ensures that the requirements relationship is closed within the scheduler. In other words, it removes any requirement attached to a job in this scheduler that is not itself part of the scheduler. This can come in handy with schedulers whose composition depends on external conditions. In any case, it is crucial that this property holds for co_run() to perform properly.
Parameters verbose – if not None, defines verbosity for this operation. Otherwise, the object's verbose attribute is used. In verbose mode, jobs that are changed, i.e. that have requirement(s) dropped because they are not part of the same scheduler, are listed, together with their container scheduler.
Returns True if the scheduler object was fine, and False if at least one removal was needed.
Return type bool

shutdown()
A synchronous wrapper around co_shutdown().
Returns True if everything went well, False otherwise; see co_shutdown() for details.
Return type bool

stats()
Returns a string like e.g. 2D + 3R + 4I = 9, meaning that the scheduler currently has 2 done, 3 running and 4 idle jobs.

topological_order()
A generator function that scans the graph in topological order, in the same order as the orchestration, i.e. starting from jobs that have no dependencies, and moving forward.
Beware that this is not a separate iterator, so it can't be nested, which in practice should not be a problem.


Examples

Assuming all jobs have a label attribute, print them in the “right” order:

for job in scheduler.topological_order():
    print(job.label)

update(jobs)
Adds a collection of Schedulable objects; this method is named after set.update().
Parameters jobs – a collection of Schedulable objects.
Returns the scheduler object, for cascading insertions if needed.
Return type self

why()
Returns a message explaining why co_run() has failed, or "FINE" if it has not failed.
Return type str

Notes

At this point, the code does not check that co_run() has actually been called.

The Scheduler class makes it easier to nest scheduler objects.

class asynciojobs.scheduler.Scheduler(*jobs_or_sequences, jobs_window=None, timeout=None, shutdown_timeout=1, watch=None, verbose=False, **kwds)

The Scheduler class is a mixin of the two classes PureScheduler and AbstractJob. As such it can be used to create nested schedulers, since it is a scheduler that can contain jobs, and at the same time it is a job, so it can be included in a scheduler.

Parameters
• jobs_or_sequences – passed to PureScheduler, allows adding these jobs inside of the newly-created scheduler;
• jobs_window – passed to PureScheduler;
• timeout – passed to PureScheduler;
• shutdown_timeout – passed to PureScheduler;
• watch (Watch) – passed to PureScheduler;
• verbose (bool) – passed to PureScheduler;
• kwds – all other named arguments are passed to the AbstractJob constructor.

Example

Here's how to create a very simple scheduler with an embedded sub-scheduler; the whole result is equivalent to a plain 4-step sequence:


main = Scheduler(
    Sequence(
        Job(aprint("begin", duration=0.25)),
        Scheduler(
            Sequence(
                Job(aprint("middle-begin", duration=0.25)),
                Job(aprint("middle-end", duration=0.25)),
            )
        ),
        Job(aprint("end", duration=0.25)),
    )
)
main.run()

Notes

There can be several good reasons for using nested schedulers:
• the scope of a window object applies to a scheduler, so a nested scheduler is a means to apply windowing to a specific set of jobs;
• likewise, the timeout attribute only applies to the run of the whole scheduler;
• you can use forever jobs that will be terminated earlier than the end of the global scheduler;
• strictly speaking, the outermost instance in this example could be an instance of PureScheduler, but in practice it is simpler to always create instances of Scheduler.
Using an intermediate-level scheduler can in some cases help alleviate or solve such issues.

check_cycles()
Supersedes check_cycles() to account for nested schedulers.
Returns True if this scheduler, and all its nested schedulers at any depth, have no cycle and can be safely scheduled.
Return type bool

coroutine co_run()
Supersedes the co_run() method in order to account for critical schedulers.
Scheduler being a subclass of AbstractJob, we need to account for the possibility that a scheduler is defined as critical. If the inherited co_run() method fails because of an exception or a timeout, a critical Scheduler will trigger an exception, instead of returning False:
• if orchestration failed because an internal job has raised an exception, raise that exception;
• if it failed because of a timeout, raise TimeoutError.

Returns True if everything went well; False for non-critical schedulers that go south.
Return type bool
Raises
• TimeoutError – for critical schedulers that do not complete in time,


• Exception – for a critical scheduler that has a critical job that triggers an exception, in which case it bubbles up.

dot_cluster_name()
Assigns a name to the subgraph that will represent a nested scheduler; the dot format requires this name to start with cluster_.

2.2 Job-like classes

This module defines AbstractJob, the base class for all the jobs in a Scheduler, as well as a basic concrete subclass, Job, for creating a job from a coroutine. It also defines a couple of simple job classes.

class asynciojobs.job.AbstractJob(*, forever=False, critical=True, label=None, required=None, scheduler=None)

AbstractJob is a virtual class:
• it offers some very basic graph-related features to model requirements a la Makefile;
• its subclasses are expected to implement co_run() and co_shutdown() methods, that specify the actual behaviour of the job, as coroutines;
• AbstractJob is mostly a companion class to the PureScheduler class, which triggers these co_* methods.

Life Cycle: AbstractJob is also aware of a common life cycle for all jobs, which can be summarized as follows:
idle → scheduled → running → done
In un-windowed schedulers, there is no distinction between scheduled and running. In other words, in this case a job goes directly from idle to running. On the other hand, in windowed orchestrations - see the jobs_window attribute to PureScheduler() - a job can be scheduled but not yet running, because it is waiting for a slot in the global window.

Parameters
• forever (bool) – if set, means the job is not returning at all and runs forever; in this case Scheduler.orchestrate() will not wait for that job, and will terminate it once all the regular - i.e. not-forever - jobs are done.
• critical (bool) – if set, this flag indicates that any exception raised during the execution of that job should result in the scheduler aborting its run immediately. The default behaviour is to let the scheduler finish its jobs, at which point the jobs can be inspected for exceptions or results.
• required – this can be one, or a collection of, jobs that will become the job's requirements; requirements can be added later on as well.
• label (str) – for convenience mostly, allows specifying the way that particular job should be displayed by the scheduler, either in textual form by Scheduler.list(), or in graphical form by Scheduler.graph(). See also text_label() and graph_label() for how this is used. As far as labelling goes, each subclass of AbstractJob implements a default labelling scheme, so it is not mandatory to set a specific label on each job instance, although it is sometimes useful. Labels must not be confused with details, see details().


• scheduler – this can be an instance of a PureScheduler object, in which case the newly created job instance is immediately added to it. A job instance can also be inserted in a scheduler instance later on.

Note: a Job instance must be added to at most one Scheduler instance - the code performs no check of this property, but odd behaviours can be observed if it is not fulfilled.

coroutine co_run()
Abstract virtual - needs to be implemented.

coroutine co_shutdown()
Abstract virtual - needs to be implemented.

details()
An optional method to implement on concrete job classes; if it returns a non-None value, these additional details about that job will get printed by asynciojobs.purescheduler.PureScheduler.list() and asynciojobs.purescheduler.PureScheduler.debrief() when called with details=True.

dot_style()
This method computes the DOT attributes used to style boxes according to critical / forever / and similar. The legend is quite simply that:
• schedulers have sharp angles, while other jobs have rounded corners,
• critical jobs have a colored and thick border, and
• forever jobs have a dashed border.

Returns a dict-like mapping that sets DOT attributes for that job. Return type DotStyle

graph_label()
This method is intended to be redefined by daughter classes.
Returns a string used by the Scheduler methods that produce a graph, such as graph() and export_as_dotfile(). Because of the way graphs are presented, it can contain "newline" characters, that will render as line breaks in the output graph.
If this method is not defined on a concrete class, then the text_label() method is used instead.

is_critical()
Returns whether this job is a critical job or not.
Return type bool

is_done()
Returns True if the job has completed.
Return type bool
If this method returns True, it implies that is_scheduled() and is_running() would also return True at that time.

is_idle()
Returns True if the job has not been scheduled already, which in other words means that at least one of its requirements is not fulfilled.


Return type bool
Implies not is_scheduled(), and so a fortiori not is_running() and not is_done().

is_running()
Once a job starts, it tries to get a slot in the windowing system. This method returns True if the job has received the green light from the windowing system. Implies is_scheduled().
Return type bool

is_scheduled()
Returns True if the job has been scheduled.
Return type bool
If True, it means that the job's requirements are met, and it has proceeded to the windowing system; equivalent to not is_idle().

raised_exception()
Returns an exception if the job has completed by raising an exception, and None otherwise.

repr_id()
Returns the job's id inside the scheduler, or '??' if that was not yet set by the scheduler.
Return type str

repr_main()
Returns the standardized body of the object's repr.
Return type str

repr_requires()
Returns the text part that describes requirements.
Return type str

repr_result()
Returns the standardized repr's part that shows the result or exception of the job.
Return type str

repr_short()
Returns a 4-character string (in fact 7 with interspaces) that summarizes the 4 dimensions of the job, that is to say:
Return type str

• its point in the lifecycle (idle → scheduled → running → done)
• is it declared as forever
• is it declared as critical
• did it trigger an exception


requires(*requirements, remove=False)
Parameters
• requirements – an iterable of AbstractJob instances that are added to the requirements.
• remove (bool) – if set, the requirements are dropped rather than added.
Raises KeyError – when trying to remove dependencies that were not present.
For convenience, any nested structure made of job instances can be provided, and if None objects are found, they are silently ignored. For example, with j{1,2,3,4} being jobs or sequences, all the following calls are legitimate:
• j1.requires(None)
• j1.requires([None])
• j1.requires((None,))
• j1.requires(j2)
• j1.requires(j2, j3)
• j1.requires([j2, j3])
• j1.requires(j2, [j3, j4])
• j1.requires((j2, j3))
• j1.requires(([j2], [[[j3]]]))
• any of the above with remove=True.
For dropping dependencies instead of adding them, use remove=True.

result()
When this job is completed and has not raised an exception, this method lets you retrieve the job's result, i.e. the value returned by its co_run() method.

standalone_run()
A convenience helper that just runs this one job on its own. Mostly useful for debugging the internals of that job, e.g. for checking for gross mistakes and other exceptions.

text_label()
This method is intended to be redefined by daughter classes.
Returns a one-line string that describes this job. This representation of the job is used by the Scheduler object through its list() and debrief() methods, i.e. when a scheduler is printed out in textual format.
The overall logic is to always use the instance's label attribute if set, or to use this method otherwise. If none of this returns anything useful, the textual label used is NOLABEL.

class asynciojobs.job.Job(corun, *args, coshutdown=None, **kwds)
The simplest concrete job class, for building an instance of AbstractJob from a Python coroutine.
Parameters
• corun – a coroutine to be evaluated when the job runs
• coshutdown – an optional coroutine to be evaluated when the scheduler is done running
• scheduler – passed to AbstractJob


• required – passed to AbstractJob
• label – passed to AbstractJob

Example

To create a job that prints a message and waits for a fixed delay:

async def aprint(message, delay):
    print(message)
    await asyncio.sleep(delay)

j = Job(aprint("Welcome - idling for 3 seconds", 3))

coroutine co_run()
Implementation of the method expected by AbstractJob.

coroutine co_shutdown()
Implementation of the method expected by AbstractJob.

text_label()
Implementation of the method expected by AbstractJob, or more exactly by asynciojobs.purescheduler.PureScheduler.list().

2.3 The Sequence class

This module defines the Sequence class, designed to ease the building of schedulers.

class asynciojobs.sequence.Sequence(*sequences_or_jobs, required=None, scheduler=None)

A Sequence is an object that organizes a set of AbstractJobs in a sequence. Its main purpose is to add a single required relationship per job in the sequence, except for the first one, which instead receives the sequence's requirements as its own.

If scheduler is passed to the sequence's constructor, all the jobs passed to the sequence are added to that scheduler.

Sequences are not first-class citizens, in the sense that the scheduler primarily ignores these objects; only the jobs inside the sequence matter. However a sequence can be used essentially everywhere a job could be: it can be inserted in a scheduler, added as a requirement, and it can have requirements too.

Parameters
• sequences_or_jobs – each must be a Schedulable object; the order, of course, is important here
• required – one, or a collection of, Schedulable objects that will become the requirements for the first job in the sequence
• scheduler – if provided, the jobs in the sequence will be inserted in that scheduler.

append(*sequences_or_jobs)
Adds these jobs or sequences at the end of the present sequence.
Parameters sequences_or_jobs – each must be a Schedulable object.


requires(*requirements)
Adds requirements to the sequence, that is to say, to the first job in the sequence.
Parameters requirements – each must be a Schedulable object.

2.4 Notes on ordering

Scheduler and job requirements are essentially sets of jobs, and from a semantic point of view, order does not matter. However, for debugging/cosmetic reasons, keeping track of creation order can be convenient.
So using OrderedSet looks like a good idea; but it turns out that on some distros, like Fedora, installing OrderedSet can be a pain, as it involves recompiling C code, which in turn pulls in a great deal of dependencies. For this reason, we use OrderedSet only if available, and resort to regular sets otherwise.
On macOS or Ubuntu, fortunately, this can simply be achieved with:
pip3 install orderedset
or alternatively with:
pip3 install asynciojobs[ordered]

2.5 Convenience classes

The PrintJob class is a specialization of the AbstractJob class, mostly useful for debugging, tests and tutorials.

class asynciojobs.printjob.PrintJob(*messages, sleep=None, banner=None, scheduler=None, label=None, required=None)

A job that just prints messages, and optionally sleeps for some time.

Parameters
• messages – passed to print as-is
• sleep – optional, an int or float describing how long, in seconds, to sleep after the messages get printed
• banner – optional, a fixed text printed out before the messages, like e.g. 40*'='; it won't make it into details()
• scheduler – passed to AbstractJob
• required – passed to AbstractJob
• label – passed to AbstractJob

coroutine co_run()
Implementation of the method expected by AbstractJob.

coroutine co_shutdown()
Implementation of the method expected by AbstractJob; does nothing.


details()
Implementation of the method expected by AbstractJob.

A utility to print time and compute durations, mostly for debugging and tests.

class asynciojobs.watch.Watch(message=None, *, show_elapsed=True, show_wall_clock=False)

This class essentially remembers a starting point, so that durations relative to that epoch can be printed for debugging, instead of a plain timestamp.

Parameters
• message (str) – used in the printed message at creation time,
• show_elapsed (bool) – tells if a message with the elapsed time needs to be printed at creation time (elapsed will be 0),
• show_wall_clock (bool) – same for the wall clock.

Examples

Here's a simple use case; note that print_wall_clock() is a static method because it is mostly useful, precisely, when you do not have a Watch object at hand:

$ python3
Python 3.6.4 (default, Mar 9 2018, 23:15:12)
>>> from asynciojobs import Watch
>>> import time
>>> watch = Watch("hello there"); time.sleep(1); watch.print_elapsed()
000.000 hello there
001.000 >>>
>>> Watch.print_wall_clock()
20:48:27.782
>>>

elapsed()
Returns the number of seconds elapsed since start, formatted on 7 characters: 3 for seconds, a dot, 3 for milliseconds.
Return type str

print_elapsed(suffix=' ')
Prints the elapsed time since start in the format SSS.MMM, plus a suffix.
Parameters suffix (str) – is appended to the output; to be explicit, by default no newline is added.

static print_wall_clock(suffix=' ')
Prints the current time in the format HH:MM:SS.MMM, plus a suffix.
Parameters suffix (str) – is appended to the output; to be explicit, by default no newline is added.

reset()
Use the current wall clock as the starting point.

seconds()
Returns the time elapsed since start, in seconds.
Return type float


History:

CHAPTER 3

ChangeLog

3.1 0.13.2 - 2018 Oct 10

• minor tweaks in the way things are displayed

3.2 0.13.1 - 2018 Sep 20

• bugfix: in dot output, the source and destination of arrows need to be atomic jobs, not subgraphs; we had that wrong in cases where nesting depth was more than 2.

3.3 0.13.0 - 2018 Aug 30

• re-enabled auto-shutdown() message broadcast system; this method is sent to all jobs within a scheduler when co_run() finishes;
• a job in a nested scheduler receives this event when its most enclosing scheduler finishes, not when the highest-level scheduler finishes;
• the feature is deemed complete and well tested, including wrt shutdown_timeout.

3.4 0.12.11 - 2018 Aug 23

• new method iterate_jobs() on schedulers, to scan all jobs in a scheduler and its nested children.

3.5 0.12.10 - 2018 Jul 5

• ironed out unicode support detection, primarily for testing apssh within an ubuntu virtualbox


3.6 0.12.9 - 2018 Jul 5

• make OrderedSet optional again; too cumbersome on some distros like fedora
• OrderedSet will be used if present, otherwise plain sets are used
• can install with either pip3 install asynciojobs[ordered]
• or equivalently with pip3 install orderedset
• release 0.12.8 is broken

3.7 0.12.7 - 2018 Jul 4

• fix packaging, so that orderedset can properly be installed as a dependency
• intermediate releases were still broken in this respect

3.8 0.12.2 - 2018 Jun 14

• use OrderedSets to preserve creation order
• more balanced graphical layout in case of nested schedulers
• 0.12.1 was broken, it used a couple of f-strings

3.9 0.11.4 - 2018 Jun 12

• schedulers are graphically rendered with right-angle corners instead of rounded ones
• darker color for critical jobs - plain red was too much

3.10 0.11.3 - 2018 Jun 12

• new convenience method export_as_pngfile()

3.11 0.11.2 - 2018 May 15

• alter signature of co_shutdown() to remove argument depth, which is not relevant and creates confusion for user libraries
• there is still a need to more accurately specify the expected behaviour of co_shutdown() though, see also notes in PureScheduler.co_shutdown()

3.12 0.11.1 - 2018 May 10

This is a release candidate for 1.0:


3.12.1 major changes

• default value for critical is now True for all species, jobs and schedulers alike - see #7
• shutdown() is no longer implicitly called by run(), it is now up to the caller to call this method - see #10
• critical schedulers now propagate exceptions so they can bubble up the nested schedulers tree - see #12
• co_shutdown() now must accept mandatory argument depth
• rain_check() renamed into check_cycles()

3.12.2 enhancements

• schedulers have a new attribute shutdown_timeout, defaulting to 1s
• list(), list_safe(), check_cycles() and sanitize() know about nested schedulers
• new method PureScheduler.remove(job)
• AbstractJob.requires() accepts kwarg remove=True to allow for removal of requirements
• enhanced border width of critical jobs in the dot representation, for colorblind people
• list_safe() now shows requirements too
• more graphs in the README

3.12.3 minor

• scheduler methods no longer accept a loop parameter

3.13 0.10.2 - 2018 May 2

• empty schedulers now run fine

3.14 0.10.1 - 2018 Apr 30

• make schedulers nestable by default - see issue #3
  – Scheduler is now the nestable class (formerly known as SchedulerJob)
  – PureScheduler is the new name for what was formerly known as Scheduler
• see issue #1: jobs_window and timeout are no longer parameters to co_run(), but attributes of a PureScheduler object
• graphical layout - see issue #4
  – critical jobs or schedulers are shown with a red border
  – forever jobs or schedulers are shown with a dashed line


3.15 0.9.1 - 2018 Apr 25

• graphical output should now properly show nested schedulers in all nesting configurations
• textual output is marginally nicer too
• removed the formal definition of the Schedulable type hint, which was only clobbering the doc
• major renaming: all methods that produce pieces of text for representing objects are called repr_something()
• more tools in tests.utils

3.16 0.8.2 - 2018 Apr 20

• Scheduler.list() shows nested jobs too
• resurrected the PrintJob class

3.17 0.8.1 - 2018 Apr 19

• new class SchedulerJob eases the creation of nested schedulers; no support yet for graphical representation though
• in the process, reviewed the names of Scheduler methods:
  – for synchronous calls, one can use run() (preferred) or orchestrate() (legacy)
  – coroutine co_orchestrate() is renamed to co_run(); co_orchestrate() is now absent from the code
• new class Watch for more elegant tests and nicer outputs

3.18 0.7.1 - 2018 Apr 17

• thoroughly reviewed the way custom labels are defined and used; classes that inherit AbstractJob can redefine text_label() and graph_label()
• graph production: both methods (dot and native digraph) now consistently accept the show_ids argument
• major overhaul of the documentation
  – using the numpy style in docstrings
  – examples of nested schedulers
  – a section on troubleshooting
• code is now totally pep8/flake8- and pylint-clean

3.19 0.6.1 - 2018 Mar 12

• adopt new layout for the doc - no source/ subdir under sphinx
• from asynciojobs import __version__
• cosmetic micro changes in the doc

3.20 0.6.0 - 2018 Feb 25

• Scheduler.graph() can natively visualize in a notebook
• Scheduler.run() is an alias for orchestrate()
• printing a scheduler shows its number of jobs
• the doc uses a new sphinx theme

3.21 0.5.8 - 2018 Jan 16

• introduce dot_label() on jobs as a means to override label() when producing a dotfile

3.22 0.5.7 - 2017 Dec 19

• minor tweaks suggested by pylint

3.23 0.5.6 - 2017 Dec 18

• bugfix: removed one occurrence of Exceptin, a typo for Exception

3.24 0.5.5 - 2017 Nov 2

• just flushing a pile of harmless cosmetic changes

3.25 0.5.4 - 2016 Dec 15

• only minor changes, essentially PEP8-friendly
• with a very shy start at type hints
• but documentation with type hints is still unclear (in fact even without them, I can see weird extra ‘*’)

3.26 0.5.2 - 2016 Dec 8

• setup uses setuptools, not distutils anymore
• cleaned __init__.py
• use super() in subclasses
• autodoc should now outline coroutines
• 0.5.1 was broken


3.27 0.5.0 - 2016 Dec 6

• windowing capability to limit the number of simultaneous jobs
• new status in the job lifecycle: idle → scheduled → running → done; ‘scheduled’ means waiting for a slot in the window
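The windowing idea can be sketched with a plain asyncio semaphore: at most jobs_window coroutines run at once, and the others wait - in effect the ‘scheduled’ state - until a slot frees up. This is an illustrative stdlib-only sketch, not the asynciojobs implementation:

```python
import asyncio

async def run_with_window(coros, jobs_window):
    """Run all coroutines, but no more than jobs_window at a time."""
    sem = asyncio.Semaphore(jobs_window)

    async def windowed(coro):
        async with sem:          # wait for a free slot in the window
            return await coro    # now 'running'

    # gather preserves the order of its inputs in the results
    return await asyncio.gather(*(windowed(c) for c in coros))

async def job(i):
    await asyncio.sleep(0.01)
    return i

results = asyncio.run(run_with_window([job(i) for i in range(5)], jobs_window=2))
print(results)  # [0, 1, 2, 3, 4]
```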

3.28 0.4.6 - 2016 Dec 5

• much nicer dotfile output, using double quotes to render strings instead of the identifiers we were trying to use before
• a PrintJob can be created in a scheduler as well
• 0.4.5 was broken in PrintJob

3.29 0.4.4 - 2016 Dec 5

• a feedback message is forced for severe conditions (critical exception and timeout)
• debrief(details)
• minor improvements in export_as_dotfile()

3.30 0.4.3 - 2016 Dec 2

• hardened export_as_dotfile()

3.31 0.4.2 - 2016 Dec 1

• can create jobs and sequences with scheduler=

3.32 0.4.1 - 2016 Nov 30

• sphinx documentation on http://nepi-ng.inria.fr/asynciojobs

3.33 0.4.0 - 2016 Nov 21

• renamed engine to scheduler

3.34 0.3.4 - 2016 Nov 20

• a first sphinx doc, but not yet available at readthedocs.org because I could not get rtd to run python3.5


3.35 0.3.3 - 2016 Nov 17

• Job’s default_label() to provide a more meaningful default when label is not set on a Job instance

3.36 0.3.2 - 2016 Nov 17

• major bugfix: sometimes a critical job was not properly dealt with because it came last
• new class PrintJob with an optional sleep delay
• Engine.list(details=True) gives details on all the jobs, provided that they have a details() method

3.37 0.3.1 - 2016 Nov 15

• no semantic change, just simpler and nicer
• cosmetic: nicer list() that shows all jobs with a 4-character pictogram reflecting critical / forever / done-running-idle status, and whether an exception occurred
• verbosity reviewed: only one verbose flag for the engine object

3.38 0.2.3 - 2016 Oct 23

• Engine.store_as_dotfile() can export job requirements graph to graphviz

3.39 0.2.2 - 2016 Oct 20

• bugfix for when using Engine.update/Engine.add with a Sequence

3.40 0.2.1 - 2016 Oct 7

• cleanup

3.41 0.2.0 - 2016 Oct 4

• robust and tested management of requirements throughout

3.42 0.1.2 - 2016 Oct 2

• only cosmetic


3.43 0.1.1 - 2016 Sep 28

• hardened and tested Sequence - can be nested and have required=
• jobs are listed in a more natural order by list() and debrief()

3.44 0.1.0 - 2016 Sep 27

• the Sequence class for modeling simple sequences without having to worry about the requires dependencies
• a critical job that raises an exception always gets its stack traced
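What Sequence saves you from doing by hand can be sketched in a few lines: each job in the sequence is made to require its predecessor. FakeJob and sequence() below are hypothetical stand-ins for illustration, not the asynciojobs classes:

```python
class FakeJob:
    """A bare stand-in for a job: just a label and a set of requirements."""
    def __init__(self, label):
        self.label = label
        self.required = set()

def sequence(*jobs):
    """Chain the jobs: each one requires the one right before it."""
    for prev, job in zip(jobs, jobs[1:]):
        job.required.add(prev)
    return jobs

a, b, c = (FakeJob(x) for x in "abc")
sequence(a, b, c)
print(sorted(j.label for j in b.required))  # ['a']
print(sorted(j.label for j in c.required))  # ['b']
```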

3.45 0.0.6 - 2016 Sep 21

• in debug mode, show the stack corresponding to caught exceptions
• various cosmetic changes

3.46 0.0.5 - 2016 Sep 21

• bugfix - missing await

3.47 0.0.4 - 2016 Sep 20

• Engine.verbose
• robustified some corner cases

3.48 0.0.3 - 2016 Sep 19

• Engine.why() and Engine.debrief()

3.49 0.0.2 - 2016 Sep 15

• tweaking pypi upload

3.50 0.0.1 - 2016 Sep 15

• initial version


Indices and tables

• genindex
• modindex
• search


Python Module Index

• asynciojobs.bestset
• asynciojobs.job
• asynciojobs.printjob
• asynciojobs.purescheduler
• asynciojobs.scheduler
• asynciojobs.sequence
• asynciojobs.watch


Index

A
• AbstractJob (class in asynciojobs.job)
• add() (asynciojobs.purescheduler.PureScheduler method)
• append() (asynciojobs.sequence.Sequence method)
• asynciojobs.bestset (module)
• asynciojobs.job (module)
• asynciojobs.printjob (module)
• asynciojobs.purescheduler (module)
• asynciojobs.scheduler (module)
• asynciojobs.sequence (module)
• asynciojobs.watch (module)

C
• check_cycles() (asynciojobs.purescheduler.PureScheduler method)
• check_cycles() (asynciojobs.scheduler.Scheduler method)
• co_run() (asynciojobs.job.AbstractJob method)
• co_run() (asynciojobs.job.Job method)
• co_run() (asynciojobs.printjob.PrintJob method)
• co_run() (asynciojobs.purescheduler.PureScheduler method)
• co_run() (asynciojobs.scheduler.Scheduler method)
• co_shutdown() (asynciojobs.job.AbstractJob method)
• co_shutdown() (asynciojobs.job.Job method)
• co_shutdown() (asynciojobs.printjob.PrintJob method)
• co_shutdown() (asynciojobs.purescheduler.PureScheduler method)

D
• debrief() (asynciojobs.purescheduler.PureScheduler method)
• details() (asynciojobs.job.AbstractJob method)
• details() (asynciojobs.printjob.PrintJob method)
• dot_cluster_name() (asynciojobs.scheduler.Scheduler method)
• dot_format() (asynciojobs.purescheduler.PureScheduler method)
• dot_style() (asynciojobs.job.AbstractJob method)

E
• elapsed() (asynciojobs.watch.Watch method)
• entry_jobs() (asynciojobs.purescheduler.PureScheduler method)
• exit_jobs() (asynciojobs.purescheduler.PureScheduler method)
• export_as_dotfile() (asynciojobs.purescheduler.PureScheduler method)
• export_as_pngfile() (asynciojobs.purescheduler.PureScheduler method)

F
• failed_critical() (asynciojobs.purescheduler.PureScheduler method)
• failed_time_out() (asynciojobs.purescheduler.PureScheduler method)

G
• graph() (asynciojobs.purescheduler.PureScheduler method)
• graph_label() (asynciojobs.job.AbstractJob method)

I
• is_critical() (asynciojobs.job.AbstractJob method)
• is_done() (asynciojobs.job.AbstractJob method)
• is_idle() (asynciojobs.job.AbstractJob method)
• is_running() (asynciojobs.job.AbstractJob method)
• is_scheduled() (asynciojobs.job.AbstractJob method)
• iterate_jobs() (asynciojobs.purescheduler.PureScheduler method)

J
• Job (class in asynciojobs.job)

L
• list() (asynciojobs.purescheduler.PureScheduler method)
• list_safe() (asynciojobs.purescheduler.PureScheduler method)

O
• orchestrate() (asynciojobs.purescheduler.PureScheduler method)

P
• print_elapsed() (asynciojobs.watch.Watch method)
• print_wall_clock() (asynciojobs.watch.Watch static method)
• PrintJob (class in asynciojobs.printjob)
• PureScheduler (class in asynciojobs.purescheduler)

R
• raised_exception() (asynciojobs.job.AbstractJob method)
• remove() (asynciojobs.purescheduler.PureScheduler method)
• repr_id() (asynciojobs.job.AbstractJob method)
• repr_main() (asynciojobs.job.AbstractJob method)
• repr_requires() (asynciojobs.job.AbstractJob method)
• repr_result() (asynciojobs.job.AbstractJob method)
• repr_short() (asynciojobs.job.AbstractJob method)
• requires() (asynciojobs.job.AbstractJob method)
• requires() (asynciojobs.sequence.Sequence method)
• reset() (asynciojobs.watch.Watch method)
• result() (asynciojobs.job.AbstractJob method)
• run() (asynciojobs.purescheduler.PureScheduler method)

S
• sanitize() (asynciojobs.purescheduler.PureScheduler method)
• Scheduler (class in asynciojobs.scheduler)
• seconds() (asynciojobs.watch.Watch method)
• Sequence (class in asynciojobs.sequence)
• shutdown() (asynciojobs.purescheduler.PureScheduler method)
• standalone_run() (asynciojobs.job.AbstractJob method)
• stats() (asynciojobs.purescheduler.PureScheduler method)

T
• text_label() (asynciojobs.job.AbstractJob method)
• text_label() (asynciojobs.job.Job method)
• topological_order() (asynciojobs.purescheduler.PureScheduler method)

U
• update() (asynciojobs.purescheduler.PureScheduler method)

W
• Watch (class in asynciojobs.watch)
• why() (asynciojobs.purescheduler.PureScheduler method)
