Actor Model and Data Flow (using Ported Network Graphs)

Brett Viren Physics Department

DUNE DAQ DFWG – 21 Feb 2020

Outline

Actor Model

Data Flow Programming Paradigm

Brett Viren (BNL) actor + graph 21 Feb 2020

Motivation

Our distributed DAQ software needs to be cohesive in order to keep software development, configuration, operation, etc manageable.

The DAQ workshop began the process of discussing some likely patterns and many of them are already in use in existing DUNE-related offline and online software.

This presentation covers two such patterns identified in our recent discussions.

Two Behavioral Structure Patterns

• Actor Model describes a set of independent code units that intercommunicate asynchronously.
• Data Flow Programming Paradigm describes a graph with edges providing data transfer between nodes representing code units.
  → Will specialize “graph” into “ported graph” (PGraph) and then “ported network graph” (PNGraph).

It is natural, but not required, to implement DFP nodes as actors.

Actor Model (Hewitt, Bishop, Steiger, 1973)

[Figure: the Application creates the Actor Function, which runs in its own «thread»; each side holds one Socket end of the pipe. Signature: actor_function(pipe, userdata).]

An actor is a function started in a thread communicating with its creator over a bidirectional pipe following a protocol.

• Typically, the actor function is called with some user data to use for configuration/initialization.
• The pipe is an exclusive pair of connected sockets, one end for the actor and one end for the application.
• After thread launch, the pipe is a tether for the actor protocol with the app. Typically, the actor protocol is very simple.
• The actor may also communicate with other actors or in general with the “outside world”.
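A minimal sketch of the actor idea, assuming only the Python standard library: multiprocessing.Pipe stands in for the exclusive pair of ZeroMQ sockets, and the names start_actor and demo and the "echo" behavior are hypothetical, chosen for illustration.

```python
import threading
from multiprocessing import Pipe


def actor_function(pipe, userdata):
    """Actor body: runs in its own thread and talks to its creator via `pipe`."""
    prefix = userdata["prefix"]      # user data configures the actor
    while True:
        msg = pipe.recv()            # block until the application sends a message
        if msg == "terminate":
            break
        pipe.send(prefix + msg)      # reply over the same bidirectional pipe


def start_actor(func, userdata):
    """Create the pipe, launch the actor thread, return (thread, app end)."""
    app_end, actor_end = Pipe()      # duplex by default: an exclusive pair of ends
    thread = threading.Thread(target=func, args=(actor_end, userdata))
    thread.start()
    return thread, app_end


def demo():
    """Application side: create the actor, exchange one message, terminate."""
    thread, pipe = start_actor(actor_function, {"prefix": "echo: "})
    pipe.send("hello")
    reply = pipe.recv()
    pipe.send("terminate")
    thread.join()
    return reply
```

Calling demo() exchanges one message with the actor and then shuts it down through the pipe.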

Simple Actor Protocol and Lifetime

• App and actor communicate over a pipe (pipe not explicitly drawn).
• The “ready” message allows app to delay, typically actor notifies immediately.
• App then goes on to do other things.
• Actor enters its “main loop”: process external input, computes, polls pipe for input from app.
• App sends “terminate” via pipe, actor performs any cleanup and the actor function exits.
• Actor may also shutdown on its own but still waits for “terminate” prior to function exit.
• Some cases may need more complexity.
  → Actor protocol may be more complex.
  → App may have many actors.
  → Actors may have actors.
  → App may respond to actor termination.

[Figure: sequence diagrams with Application, World and Actor lifelines: create, ready, go do other things, message, compute, message, terminate, shutdown.]
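The lifetime above can be sketched as follows. This is an illustration under stated assumptions, not the ZeroMQ implementation: multiprocessing.Pipe stands in for the actor pipe, a list of numbers stands in for input from the outside world, and the "done" message and the run_app helper are hypothetical.

```python
import threading
from multiprocessing import Pipe


def actor_function(pipe, work_items):
    pipe.send("ready")                     # notify the app that setup is finished
    total = 0
    terminated = False
    for item in work_items:                # main loop: one pass per external input
        total += item                      # compute
        if pipe.poll(0) and pipe.recv() == "terminate":
            terminated = True              # app asked us to stop early
            break
    if not terminated:
        pipe.send(("done", total))         # actor shuts down on its own...
        while pipe.recv() != "terminate":  # ...but still waits for "terminate"
            pass
    # any cleanup would go here, then the actor function exits


def run_app(work_items):
    app_end, actor_end = Pipe()
    thread = threading.Thread(target=actor_function, args=(actor_end, work_items))
    thread.start()
    events = [app_end.recv()]              # wait for "ready"
    # ... the app is free to go do other things here ...
    events.append(app_end.recv())          # the actor reports it has finished
    app_end.send("terminate")
    thread.join()
    return events
```

The non-blocking pipe.poll(0) inside the loop is what lets the actor keep computing while still noticing a "terminate" from the app.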

Synchronous API to Asynchronous Actor Protocol

A detail:

✓ Actors are nicely object oriented, simplifying app implementation by hiding behavior.
× But, now app must have actor protocol message handling code!
• Hide message handling behind synchronous API.
  ◦ App calls method, API sends message, waits to recv() reply (if appropriate), interprets message, returns result to app, app continues.
• Fact of life, not all async can be totally hidden.
  ◦ Expose pipe to app for explicit polling.
  ◦ Provide a poll() type method, app calls periodically. Either returns value from oldest message sitting in socket queue or a “false” result when queue is empty.

[Figure: sequence diagrams with Application, API, pipe and Actor lifelines: construct/create; method call, msg, recv(), msg, method return; recv().]

Data Flow Programming Paradigm

Structure overall job as code units which receive and/or send data to other code units, forming a directed (and possibly cyclic) graph. “Program” by drawing lines (graph edges) between code units (graph nodes).

Simplistic view of DFP paradigm.

Ported Graph

Specialize from simple, directed graph to ported graph.

[Figure: node A (ports o1, o2) connects to the “in” ports of B1 and B2, whose “out” ports connect to node C (ports i1, i2).]

A port is an identified edge-attachment point on a node. Depending on the DFP system policy, a port may:
• follow a specific protocol (flow-in, flow-out, query/response, etc).
• pass only specific data types or operate in a type-free manner.
• restrict edge multiplicity (allow zero, require exactly one, allow multiple).

Note: policy requires validation! Think on ways to perform this.
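One possible way to validate such a policy, sketched under assumptions: the Port fields, the "in"/"out" directions, the "any" type wildcard and the validate function are hypothetical choices for illustration, not part of any existing DFP system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    node: str
    name: str
    direction: str        # "in" or "out" (a very simple port protocol)
    dtype: str = "any"    # "any" means the port is type-free
    min_edges: int = 1    # edge multiplicity policy
    max_edges: int = 1


def validate(ports, edges):
    """Return a list of policy violations; an empty list means the graph is valid.

    edges is a list of (tail, head) pairs, each end being a (node, port) tuple.
    """
    by_key = {(p.node, p.name): p for p in ports}
    counts = {key: 0 for key in by_key}
    problems = []
    for tail, head in edges:
        if tail not in by_key or head not in by_key:
            problems.append(f"unknown port in edge {tail} -> {head}")
            continue
        counts[tail] += 1
        counts[head] += 1
        src, dst = by_key[tail], by_key[head]
        if src.direction != "out" or dst.direction != "in":
            problems.append(f"bad direction on edge {tail} -> {head}")
        if "any" not in (src.dtype, dst.dtype) and src.dtype != dst.dtype:
            problems.append(f"type mismatch on edge {tail} -> {head}")
    for key, port in by_key.items():
        if not port.min_edges <= counts[key] <= port.max_edges:
            problems.append(f"port {key} has {counts[key]} edges")
    return problems


# The A, B1, B2, C graph with every port requiring exactly one edge.
PORTS = [
    Port("A", "o1", "out"), Port("A", "o2", "out"),
    Port("B1", "in", "in"), Port("B1", "out", "out"),
    Port("B2", "in", "in"), Port("B2", "out", "out"),
    Port("C", "i1", "in"), Port("C", "i2", "in"),
]
EDGES = [
    (("A", "o1"), ("B1", "in")), (("A", "o2"), ("B2", "in")),
    (("B1", "out"), ("C", "i1")), (("B2", "out"), ("C", "i2")),
]
```

validate(PORTS, EDGES) reports no problems; dropping the last edge leaves two ports unconnected and both are reported.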

Ported Graph Abstraction

A powerful, practical feature of ported graphs

[Figure: the subgraph of A (ports o1, o2), B1 and B2 is replaced by a single node X exposing ports B1:out and B2:out.]

X represents all of A and the input ports of B1 and B2.

• A ported subgraph can be abstracted by “removing” all fully-populated ports and presenting a new graph with fewer nodes and ports.
• Resulting subgraph is (apparently) much simpler.
• In practice: experts configure their subsystems in detail and provide abstracted subgraphs. These are connected to produce another graph which can be abstracted, etc, until the entire system is configured.

Note: allows validation with the divide-and-conquer strategy.
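The abstraction step can be sketched in a few lines. This assumes every port accepts exactly one edge, so that "connected" means "fully populated"; the function name abstract and the "node:port" labels are illustrative.

```python
def abstract(name, ports, internal_edges):
    """Collapse a ported subgraph into one node exposing only its unconnected ports.

    ports is a list of (node, port) tuples, internal_edges a list of
    ((node, port), (node, port)) pairs fully inside the subgraph.
    """
    connected = {end for edge in internal_edges for end in edge}
    exposed = [f"{node}:{port}" for node, port in ports
               if (node, port) not in connected]
    return name, exposed


# Subgraph made of A, B1 and B2; the B1/B2 outputs are left dangling.
SUB_PORTS = [("A", "o1"), ("A", "o2"),
             ("B1", "in"), ("B1", "out"),
             ("B2", "in"), ("B2", "out")]
SUB_EDGES = [(("A", "o1"), ("B1", "in")),
             (("A", "o2"), ("B2", "in"))]
```

abstract("X", SUB_PORTS, SUB_EDGES) yields a node X with only the ports B1:out and B2:out, which can then be wired into a larger graph and abstracted again.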

Ported Network Graph

Networking adds some complexity.

[Figure: the ported graph of A, B1, B2 and C with an address on each edge: tcp://a.b.c.:1234, tcp://a.b.c.d:1235, tcp://a.b.c.:1236, tcp://a.b.c.e:1237. Edge ends are marked to distinguish bind() (•) from connect().]

• Networking requires a socket to bind() or connect() via an address.
• PNGraph edges must conceptually “pass through” this address.
• An address is a node, thus PNGraphs are bipartite:
  ◦ ported nodes: data transformation
  ◦ address nodes: data transportation.
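To make the bipartite structure concrete, here is a small sketch that rewrites each port-to-port edge so it passes through an address node. The function name to_bipartite and the loopback addresses are hypothetical; the bound_address mapping records which end of each edge calls bind().

```python
def to_bipartite(edges, bound_address):
    """Split each port-to-port edge into port -> address -> port.

    edges: list of (tail, head) pairs with ports given as (node, port) tuples.
    bound_address: maps the port that calls bind() to its address string.
    Every returned pair joins one ported node end and one address node.
    """
    result = []
    for tail, head in edges:
        binder = tail if tail in bound_address else head
        address = bound_address[binder]   # KeyError if neither end binds
        result.append((tail, address))
        result.append((address, head))
    return result


EDGES = [(("A", "o1"), ("B1", "in")), (("B1", "out"), ("C", "i1"))]
BOUND = {("A", "o1"): "tcp://127.0.0.1:1234",   # A:o1 binds, B1:in connects
         ("C", "i1"): "tcp://127.0.0.1:1236"}   # C:i1 binds, B1:out connects
```

In the result no pair links two ports or two addresses directly, which is the bipartite property.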

Address Resolution

Network addressing makes PNGraphs more complex than PGraphs. Specifying explicit addresses is brittle (eg, collisions are possible).

[Figure: the same graph with address nodes labeled by (node, port) names instead of explicit addresses: (A,o1), (A,o2), (C,i1), (C,i2).]

Robust simplicity: discover addresses given node/port names.

• Every port known by (node, port) name 2-tuple.
• bind() needs no configuration, pick first unused TCP/IP port number.
  ◦ publish 3-tuple: (node, port, address)
• connect() configured with 2-tuple: (node, port) names
  ◦ Ports resolve node/port names to address via discovery mechanism.
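A rough sketch of this resolution scheme using plain TCP sockets from the standard library. The Directory class is a toy, in-process stand-in for a real discovery mechanism, and binding to port 0 lets the operating system choose an unused port, which approximates "pick first unused" rather than reproducing it. All names here are hypothetical.

```python
import socket


class Directory:
    """Toy discovery mechanism: holds published (node, port, address) 3-tuples."""

    def __init__(self):
        self._records = {}

    def publish(self, node, port, address):
        self._records[(node, port)] = address

    def resolve(self, node, port):
        return self._records[(node, port)]


def bind_port(directory, node, port):
    """bind() with no configured address, then publish the 3-tuple."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("127.0.0.1", 0))           # port 0: the OS picks an unused TCP port
    sock.listen()
    host, number = sock.getsockname()
    directory.publish(node, port, f"tcp://{host}:{number}")
    return sock


def connect_port(directory, node, port):
    """connect() configured only with the (node, port) names."""
    address = directory.resolve(node, port)
    host, number = address[len("tcp://"):].rsplit(":", 1)
    return socket.create_connection((host, int(number)))
```

Neither side ever has a port number written into its configuration: the binder learns its address at run time and the connector looks it up by name.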

Node Resolution

Further abstraction and simplification: peers discover port addresses based on attribute matching rules instead of hard-wiring (node,port) name 2-tuple in configuration.

• Node publishes arbitrary key/value attributes (“discovery headers”).
  ◦ A node may have a “type” or a “role” or a “class” or “category” or a “favorite color”....
• Peers discover node attributes and apply attribute matching rules.
• Matching port addresses also held in node’s “discovery headers”.
  → Peer, “I want to connect to all PUB ports of Hit Finders of APA 123”
  → Discovery mechanism, “you want this list of addresses: [...]”
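Attribute matching might look like the sketch below. The layout of the discovered data (a "headers" dict plus a "ports" dict per node), the attribute names and the addresses are all assumptions made for illustration; they are not the header format used by Zyre or ZIO.

```python
def match_addresses(peers, port_type, **wanted):
    """Return addresses of all `port_type` ports on peers whose headers match.

    peers maps node name -> {"headers": {...},
                             "ports": {port_name: (socket_type, address)}}
    """
    found = []
    for node, info in sorted(peers.items()):
        headers = info["headers"]
        if all(headers.get(key) == value for key, value in wanted.items()):
            for port, (stype, address) in sorted(info["ports"].items()):
                if stype == port_type:
                    found.append(address)
    return found


# Hypothetical view of the network as cached by one peer.
PEERS = {
    "hf-apa123-0": {"headers": {"role": "hit-finder", "apa": "123"},
                    "ports": {"hits": ("PUB", "tcp://10.0.0.1:5000")}},
    "hf-apa123-1": {"headers": {"role": "hit-finder", "apa": "123"},
                    "ports": {"hits": ("PUB", "tcp://10.0.0.2:5000")}},
    "hf-apa124-0": {"headers": {"role": "hit-finder", "apa": "124"},
                    "ports": {"hits": ("PUB", "tcp://10.0.0.3:5000")}},
}
```

The request "all PUB ports of Hit Finders of APA 123" then becomes match_addresses(PEERS, "PUB", role="hit-finder", apa="123").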

Discovery Mechanisms

Two basic approaches:
• centralized service (eg DNS): service binds to a “well known address”; peer must connect and CRUD records, and propagation may be required if service is redundant (latency). Peer must poll/query service to learn records of other peers.
• distributed protocol (eg Zyre): “network is the service”; peer publishes to the network, peers get immediate updates (no poll/query), no single point of failure, fundamental redundancy, sub-second latency possible. Bonus: network learns when peers appear, disappear or go quiet.

Note: Zyre is implemented as a ZeroMQ actor, so an application or individual node can use it with very little coding to worry about.

Modeling Our Graphs

Even with simplifying strategies, our graphs will still be complex. We should develop a way to model our graphs independent from “merely” producing configuration files.
• Express model in some data language.
  ◦ (eg, PTMP and Wire-Cell Toolkit use Jsonnet)
• Maintain models in version control.
  ◦ (use a text-based modeling language)
• Develop parameterized models (eg, “use Napa per TC alg”).
  ◦ (eg, exploit Jsonnet’s )
• Produce visualization for debugging model and for operational displays.
  ◦ (Jsonnet to GraphViz dot conversion is easy and exists)
• Validate policy (eg, find unconnected ports).
  ◦ (eg, apply constraints with Jsonnet functions)
• Transform to application configuration files/objects.
  ◦ (eg, compile to JSON, load to DB, load Jsonnet directly, etc)
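The same modeling ideas can be illustrated in Python rather than Jsonnet. This is only a sketch: the model layout, the node names and the functions make_model and to_dot are hypothetical, and the output is ordinary GraphViz dot text.

```python
def make_model(n_finders):
    """Parameterized model: one source fanned out to `n_finders` hit finders."""
    nodes = ["source"] + [f"finder{i}" for i in range(n_finders)] + ["sink"]
    edges = []   # each edge: (tail node, tail port, head node, head port)
    for i in range(n_finders):
        edges.append(("source", f"o{i}", f"finder{i}", "in"))
        edges.append((f"finder{i}", "out", "sink", f"i{i}"))
    return {"nodes": nodes, "edges": edges}


def to_dot(model):
    """Render the model as GraphViz dot, labeling edge ends with port names."""
    lines = ["digraph model {"]
    for node in model["nodes"]:
        lines.append(f'  "{node}";')
    for tail, tport, head, hport in model["edges"]:
        lines.append(f'  "{tail}" -> "{head}" '
                     f'[taillabel="{tport}", headlabel="{hport}"];')
    lines.append("}")
    return "\n".join(lines)
```

Because the model is plain data built by a function, it can be kept in version control, regenerated for a different n_finders, checked for policy violations and rendered for display, all from one source.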

ZIO: an implementation of Ported Network Graph

ZIO is a next generation PTMP and also applies to problems with highly parallel offline applications (Wire-Cell Toolkit). It supports the Ported Network Graph pattern with three classes:
• Node: an identified, coherent set of ports; provides port creation, online/offline transitions.
• Port: a light wrapper around a ZeroMQ socket; bind()/connect(), online/offline.
• Peer: a simplifying wrapper around ZeroMQ’s Zyre mechanism; discovery header caching and matching.

Available in Python and C++ flavors. Both taste similar. https://brettviren.github.io/zio

ZIO node/port/peer Example

Here, in Python; C++ is similar.

    import zmq
    import zio

    # one node publishes
    node = zio.Node("nodename")
    log = node.port("logger", zmq.PUB)
    log.bind()
    node.online(favorite_color="purple")

    msg = zio.Message(...)
    log.send(msg)

    # another node subscribes, naming the publisher by (node, port)
    node = zio.Node("other")
    port = node.port("slurp", zmq.SUB)
    port.connect("nodename", "logger")
    node.online()

    logmsg = port.recv()

A zio.Peer is used inside zio.Node to resolve (node,port) to an address. Resolution can be extended to support node resolution described above.

Last Slide

• Actor Model and the DFP paradigm in general, and the Ported Network Graph pattern in particular, are two behavioral structure patterns which are central to distributed systems. They will show up, either implicitly or explicitly.
• As concepts alone, they at least provide a language with which we can define and discuss our DAQ design.
• Implementations available for adoption or inspiration (in Wire-Cell Toolkit, PTMP and ZIO, and of course elsewhere).
• Providing an implementation shared by all/most DAQ apps will assist in simplifying software development, opening it up to a larger pool of developers.

End note: additional patterns (eg, Interface, Factory, Plugin, (H)FSM, ...) are maybe worth future presentations.

FIN

ZeroMQ Actor Construction Examples

In C/C++ (CZMQ)

    void actor_func(zsock_t* pipe, void* args);
    struct UD {...} ud;
    zactor_t* actor = zactor_new(actor_func, (void*)&ud);
    zsock_signal(actor, 0); // terminate

In Python (PyZMQ/Pyre)

    def actor_func(ctx, pipe, arg1, arg2):
        pass
    actor = ZActor(ctx, actor_func, "name", 42)
    actor.pipe.signal()  # terminate

The high-level ZeroMQ C++ interface package ZMQPP also provides direct support for actors. The simpler CPPZMQ does not but one can easily DIY with std::thread.

Application Construction Patterns

Plugin: dynamically loaded shared libraries provide classes or functions following some contracted interface.

Factory: construct and possibly later retrieve objects as an interface type based on an implementation type and possibly an instance name.

These two may interact:

→ App asks Factory for an unknown implementation type
→ Factory asks Plugin to load plugins until found.
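The interaction can be sketched as follows, with importable Python modules standing in for dynamically loaded shared libraries. The Factory class, its methods and the register_plugins entry point that each plugin module is expected to provide are hypothetical names for this illustration.

```python
import importlib


class Factory:
    """Construct and later retrieve objects by implementation type and instance name."""

    def __init__(self, plugin_modules):
        self._plugin_modules = list(plugin_modules)  # module names not yet loaded
        self._types = {}       # implementation type name -> class
        self._instances = {}   # (type name, instance name) -> object

    def register(self, type_name, cls):
        self._types[type_name] = cls

    def get(self, type_name, instance_name=""):
        key = (type_name, instance_name)
        if key not in self._instances:
            # Unknown type: load plugins one at a time until it is found.
            while type_name not in self._types and self._plugin_modules:
                module = importlib.import_module(self._plugin_modules.pop(0))
                module.register_plugins(self)   # hypothetical plugin entry point
            self._instances[key] = self._types[type_name]()
        return self._instances[key]
```

Asking twice for the same type and instance name returns the same object, while a new instance name constructs a fresh one. A type that no plugin provides ends in a KeyError.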

These are very useful, if rather pedestrian. I assume we will have them in some shape. We can dive into details another day.

Details on Synchronous API to Asynchronous Actor

• Constructor creates actor thread and waits for “ready” message, execution returns to app.
• Actor function is then asynchronously running.
• The terminate() method takes care of creating and sending “terminate” message.
• A query method may directly wait for and return reply value.
• Async results via check method, app calls periodically and uses only if valid, socket buffering soaks up delays.
• May provide app-side pipe socket so app may poll it with complex protocols.

[Figure: class diagram of «class» SomeActorAPI with members Thread _actor; SomeActorAPI(userdata); void terminate(); Value query(); Result check(); Socket pipe(); part of the interface is marked optional.]
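A sketch of such a synchronous wrapper, assuming multiprocessing.Pipe in place of the ZeroMQ pipe. The (kind, value) message format, the submit() method and the squaring "work" are hypothetical additions needed to make the example complete; only the constructor, terminate(), query() and check() come from the class outline above.

```python
import collections
import threading
import time
from multiprocessing import Pipe


def _actor_function(pipe, userdata):
    pipe.send(("ready", None))
    while True:
        kind, value = pipe.recv()
        if kind == "terminate":
            break
        if kind == "query":
            pipe.send(("reply", userdata))          # answer to a synchronous call
        elif kind == "work":
            pipe.send(("result", value * value))    # asynchronous result


class SomeActorAPI:
    def __init__(self, userdata):
        self._pipe, actor_end = Pipe()
        self._stash = collections.deque()   # async results seen while awaiting a reply
        self._actor = threading.Thread(target=_actor_function,
                                       args=(actor_end, userdata))
        self._actor.start()
        self._pipe.recv()                   # block until the actor reports "ready"

    def query(self):
        """Send a message, wait for the reply and return its value."""
        self._pipe.send(("query", None))
        while True:
            kind, value = self._pipe.recv()
            if kind == "reply":
                return value
            self._stash.append(value)       # keep async results for check()

    def submit(self, number):
        """Hand work to the actor without waiting for the result."""
        self._pipe.send(("work", number))

    def check(self):
        """Return the oldest pending async result, or None if there is none."""
        if self._stash:
            return self._stash.popleft()
        if self._pipe.poll(0):
            return self._pipe.recv()[1]
        return None

    def terminate(self):
        self._pipe.send(("terminate", None))
        self._actor.join()


def demo():
    api = SomeActorAPI("configured")
    answer = api.query()
    api.submit(7)
    result = api.check()
    while result is None:                   # a real app would do other work here
        time.sleep(0.01)
        result = api.check()
    api.terminate()
    return answer, result
```

The app only ever calls methods; the message protocol stays inside the class, and pipe buffering lets check() be called whenever convenient.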
