Arxiv:2001.11604V2 [Cs.PL] 6 Sep 2020 of Trigger Conditions, Having an in Situ Infrastructure That Simplifies Results They Desire
Total Page:16
File Type:pdf, Size:1020Kb
DIVA: A Declarative and Reactive Language for in situ Visualization Qi Wu* Tyson Neuroth† Oleg Igouchkine‡ University of California, Davis, University of California, Davis, University of California, Davis, United States United States United States Konduri Aditya§ Jacqueline H. Chen¶ Kwan-Liu Ma|| Indian Institute of Science, India Sandia National Laboratories, University of California, Davis, United States United States ABSTRACT for many applications. VTK [61] also partially adopted this ap- The use of adaptive workflow management for in situ visualization proach, however, VTK is designed for programming unidirectional and analysis has been a growing trend in large-scale scientific simu- visualization pipelines, and provides limited support for highly dy- lations. However, coordinating adaptive workflows with traditional namic dataflows. Moreover, the synchronous dataflow model is procedural programming languages can be difficult because system somewhat difficult to use and does not always lead to modular pro- flow is determined by unpredictable scientific phenomena, which of- grams for large scale applications when control flows become com- ten appear in an unknown order and can evade event handling. This plicated [19]. Functional reactive programming (FRP) [19,24,48,53] makes the implementation of adaptive workflows tedious and error- further improved this model by directly treating time-varying values prone. Recently, reactive and declarative programming paradigms as first-class primitives. This allowed programmers to write reac- have been recognized as well-suited solutions to similar problems in tive programs using dataflows declaratively (as opposed to callback other domains. However, there is a dearth of research on adapting functions), hiding the mechanism that controls those flows under these approaches to in situ visualization and analysis. With this pa- an abstraction. By enabling such a uniform and pervasive man- per, we present a language design and runtime system for developing ner to handle complicated data flows, applications gain clarity and adaptive systems through a declarative and reactive programming reliability. paradigm. We illustrate how an adaptive workflow programming Through our work, we demonstrate how FRP abstractions can be system is implemented using our approach and demonstrate it with used to better assist adaptive in situ visualization and analysis work- a use case from a combustion simulation. flow creation, using a domain-specific language (DSL) we created called DIVA. This language consists of two components: an FRP- 1 INTRODUCTION based visualization specific language and a low-level C++-based dataflow API. Rather than replacing existing in situ infrastructures, Scientific simulations running on petascale high-performance com- it aims to work as a middle layer which can extend existing systems puting (HPC) platforms such as Summit [50] can easily produce such as VTK [61]. Through the description of our design, we empha- datasets at scales beyond what can be efficiently processed. There- size the key principles that ensure a correct implementation of the fore, scientists often have to compromise the allocation of strained FPR abstractions in a parallel environment. These principles can also resources, such as I/O bandwidth, to maximize the overall efficiency be applied to in situ systems that are aimed at enabling declarative of simulations [8, 29]. An ideal solution is to manage the system and reactive programming, without adopting DIVA itself. workflow adaptively through trigger-action mechanisms (which are Our design provides four primary benefits: termed “in situ triggers” in some literature [11, 31]) because they Simple. Traditionally, in situ infrastructures employ callback can prioritize the allocation of these strained resources to the most functions to handle separated programming stages. Useful callback interesting phenomena as they emerge [56]. However, adaptive functions include initialization, per-timestep execution, finalization, workflows are usually more difficult and error-prone to program be- and feedback loops [36]. As such, a single data dependency could cause they are reactive applications rather than transformational [48]. be broken into multiple pieces that are controlled by interleaving Control flows in adaptive workflows are often driven by evolving control flows, making it hard to read and maintain. If an operation simulation outcomes and event sequences that cannot be predicted requires information from multiple timesteps, the handling of static in advance. Therefore, domain scientists must anticipate all possible storage will also be involved, making the implementation even more scenarios in advance and create sophisticated rules to dynamically complex. Declarative programming simplifies these tedious and trigger actions in response to the events [6]. Moreover, because the redundant low-level tasks by placing more of the burden on the quality of adaptive workflows often relies heavily on the accuracy tool developers. This allows the user to focus on specifying the arXiv:2001.11604v2 [cs.PL] 6 Sep 2020 of trigger conditions, having an in situ infrastructure that simplifies results they desire. Meanwhile, reactive programming offers the the customization process is important [8, 31]. ability to automatically coordinate data dependencies and propagate A common approach to implementing reactive systems is through changes from inputs to outputs. This model is commonly used to synchronous dataflow programming models [14, 15, 26]. In such greatly simplify the handling of time-varying signals, such as events models, programs are formed by wiring primitive processing ele- triggered by human interaction. Since adaptive workflows similarly ments and composition operators into a directed graph structure. The specify the reaction of a system to time-varying signals, we believe dataflow programming model provides a natural form of modularity reactive programming is also a good solution for coordinating them. *e-mail: [email protected] Extensible. By providing a fully programmable interface and †e-mail: [email protected] a low-level C++ API, DIVA allows flexible extension through the ‡e-mail: [email protected] development of new custom modules. §e-mail: [email protected] Portable. The low-level API also makes the integration with ¶e-mail: [email protected] existing visualization infrastructures easier, by only asking for a ||e-mail: [email protected] short binding implementation; this can typically be just one source file. Thus, DIVA can be used as a wrapper to enable declarative and reactive programming for these infrastructures. This also means that workflows written in DIVA for one infrastructure can be easily reused in a different infrastructure if the proper bindings are supplied for both of them. workflow management, such signals can represent the outputs from Dynamic. DIVA’s API resolves linking through dynamic load- a simulation [53]. Although there are many variations of FRP focus- ing; therefore, adding or updating linked libraries does not always ing on different applications, they all fall into two main branches: require recompilation nor restarting. This feature can be useful for classical FRP [24] and arrowized FRP [48]. scientific simulations on HPC systems, because allocations for these Classical FRP was first proposed for creating interactive anima- simulations are usually limited; removing unnecessary restarts al- tions. It introduced notations called behaviors (another name for lows for more useful simulation work to be done and gives a higher signals) and events, to represent continuous time-varying values and chance for scientific discoveries. sequences of time-stamped values respectively. Although initially In this paper, we first begin with a summary of related works. We focused on animation, it inspired many later works of broader scope then describe the principles and features of our language with a set due to its elegant semantics [20]. However, because it is a denota- of examples. We also provide a set of examples to demonstrate how tional model which does not restrict the length of a signal stream our design meets the needs of in situ visualization and analysis, and that can be operated on, classical FRP programs can have a high how it supports our assertion that declarative and reactive models memory footprint and long computation times [20]. are good approaches for adaptive workflow management in this Arrowized FRP (AFRP) [48] aims to resolve the space and time domain. Next, we discuss implementation details of our language. leaks without losing the expressiveness of classical FRP. Instead of Finally, we showcase the use of DIVA to program an adaptive in situ working with signals or similar notations directly, AFRP focuses on workflow for a leading edge simulation code running on the Summit manipulating causal functions between signals, connecting to the supercomputer. outside world only at the top level [53]. However, programs written in AFRP can still suffer from issues like global delay or unnecessary 2 BACKGROUND AND MOTIVATION updates, depending on their implementations [20]. We begin with an overview of recent advances in reconfigurable Real-time FRP (RT-FRP) [70], event-driven FRP (E-FRP) [71] in situ workflows, and a history of functional reactive program- and asynchronous FRP (e.g., the original version of Elm) [21] are ming. We then compare existing methods in our domain with the other variations developed to optimize classical FRP. RT-FRP in- FRP model. Finally,