Formalizing the Structural Semantics of Domain-Specific Modeling

1 Formalizing the Structural Semantics of Domain-Specific Modeling Languages ETHAN JACKSON and JANOS SZTIPANOVITS Institute for Software Integrated Systems, Vanderbilt University Abstract— Model-based approaches to system design simulation. This evolution can be seen in the hybrid are now widespread and successful. These approaches systems [83], embedded systems [34], and security [55] make extensive use of model structure to describe systems communities. However, the immediacy of behavioral using domain-specific abstractions, to specify and imple- issues has dominated the spotlight, leaving issues in the ment model transformations, and to analyze structural structural regime behind. properties of models. In spite of its general importance the structural semantics of modeling languages are not In all fairness, it is not obvious that the founda- well-understood. In this paper we develop the formal tions of DSML syntax should differ from that of exist- foundations for the structural semantics of domain specific ing programming languages. Traditionally, programming modeling languages (DSML), including the mechanisms language construction follows a well-defined path [1]: by which metamodels specify the structural semantics of First, language syntax is defined with an eBNF grammar DSMLs. Additionally, we show how our formalization can [15] and a parser is generated. Second, a type-system complement existing tools, and how it yields algorithms is defined. Third, algorithms are developed that walk for the analysis of DSMLs and model transformations. the abstract syntax tree (AST) and check the well- Index Terms— Model-based Design, Domain Specific typedness of the program. The terms “domain-specific” Modeling Languages, Structural Semantics, Metamodel- and “model” do not by themselves indicate that the ing, Formal Logic, Horn Logic procedure should be any different for DSMLs. The first sign that DSMLs diverge from traditional I. INTRODUCTION language design appears in their specification with meta- Domain-specific modeling languages (DSMLs) play models [5]. Metamodels employ UML-like class dia- an important role in software and system design. They grams to describe rich syntactic constructs with hier- are essential components of the OMG’s model-driven archical internal structure (aggregation) carrying typed architecture (MDA) [76], mature tools exist for con- data (attributes). Metamodeling also focuses attention structing and utilizing DSMLs [31], [36], [64], and many to relations (associations) between syntactic entities, methodologies in model-based design, such as platform- providing n-ary relations over infinite sets. Unlike BNF based design [45] and actor-based design [66] exploit the grammars, metamodels treat these building blocks as DSML metaphor [65]. first-class concepts. Expressive constraint languages, Despite widespread application of DSMLs, many sci- such as the Object Constraint Language (OCL) [79], entific questions remain about their formal properties enrich metamodels by supporting expressive constraints [8], [41], [61], which can be loosely grouped into the on legal model instances [55]. These observations have structural and behavioral regimes. The structural regime already led many researchers to relate metamodels to concerns the specification, representation, and manipula- graph grammars, which natively support relations [23], tion of models as represented in some domain-specific [60]. syntax. Research in the behavioral regime focuses on the The second sign of divergence occurs in the varied ap- specification and analysis of domain-specific execution plications of DSMLs syntax, which include model trans- semantics. Efforts to formalize statecharts [37], [38] and formations, design space exploration, and correct-by- message sequence charts [3], [39] fit into this regime. construction design. Model transformations [70] trans- Others have studied the general problem of linking late between domain-specific syntaxes to change ab- domain-specific syntaxes with execution semantics [14], straction levels [32], compose modeling aspects [35], [72]. and relate platform-independent models with platform Anytime a new community adopts DSML-based mod- specific ones [76]. Modern model transformation lan- eling, its members inevitably desire a precise formaliza- guages utilize extended graphs grammars to capture the tion of behaviors for the purposes of verification and complexities of DSML syntax [23], [60]. Syntax has also 2 been used to perform design space exploration. These UML infrastructure [78], and Meta-Object Facil- techniques employ syntactic perturbations to generate ity (MOF) [75] do not provide sufficiently rigid optimized variants of models [57], [74]. Again, the definitions of the DSML process to permit inter- expressiveness of metamodeling constraints facilitates operability between tools. See [24] for a detailed meaningful design space exploration. This same expres- example of this phenomenon. siveness can be used to project behavioral properties (e.g. In this paper we explore a bottom-up approach to for- deadlock freedom) onto the syntactic level [52]. Correct- malizing metamodeling, DSML syntax, and model trans- by-construction design [33] uses expressive syntactic formations. By bottom-up, we mean that our approach rules in conjunction with a suitable behavioral semantics does not begin with the concepts of metamodeling or to statically identify models with bad behavioral prop- model transformation, but ends with these. Instead, we erties. Note that the correct-by-construction approach start with a simple formal core capable of expressing differs from static analysis of traditional programming the rich syntaxes of a class D of DSMLs. Each member languages, because DSMLs are a priori designed to of this class is simply called a domain (motivated by maximize analysis. Programming language static anal- the phrase domain-specific) and each domain D defines ysis has historically evolved in the other direction: First, some domain-specific syntax. Next, we define a class the language is fixed (e.g. C++), and then reasonable ap- T of functions that relate the syntactic elements of proaches to static analysis develop, e.g. dataflow analysis one domain with the elements of another; this class of software [28]. represents model transformations. Third, we identify a Given the important uses-cases for DSML syntax we special pair (Dmeta;Tmeta) D T that together are might expect that: capable of generating all domains2 × in the class; this 1) There exists a precise mathematical foundation for becomes metamodeling. Finally, we define the classes D metamodeling, model transformations, and DSML and T using the same underlying mathematical apparatus syntax. based on deductive logic. 2) Tool-independent formal descriptions of modeling With our approach we can formulate important prop- artifacts can be extracted from tool-dependent ar- erties over domains and transformations, including: Do- tifacts. main emptiness, domain equivalence, and structure pre- 3) The formal foundation yields analysis techniques serving maps. for DSML syntax, metamodels, and model trans- 1) Domain emptiness refers to DSMLs with no legal formations. syntactic instances. Empty domains are created Unfortunately, this is not the current state of affairs. when metamodel composition introduces inconsis- Metaprogrammable tools (i.e. tools that use metamodels) tencies, or when metamodeling constraints contain and standardized metamodeling languages have evolved mistakes. independently from each other, from formalizations of 2) Two domains are equivalent if they have the same the metamodeling process, and from formalizations of set of syntactic instances, even though their metamodel transformations. Here are some examples that models may differ. Domain equivalence provides illustrate this: a mechanism to compare domains independently 1) The work on KM3 [54] provides a formal meta- from the metamodeling language. modeling language for use with graph transforma- 3) Structure preserving maps refer to model trans- tions, but does not address expressive constraints formations that always rewrite legal syntactic in- incorporated into the source/target languages. Sim- stances to legal syntactic instances. ilar limitations are present in the VMP [87] formal- We have developed a theorem prover called FORMULA ism used by Viatra2 [17]. (FORmal Modeling Using Logic Analysis) that calcu- 2) The mature metaprogrammable Generic Modeling lates these properties when domains/transformations are Environment (GME) [64] and Graph Rewriting described using Horn logic [53] with stratified negation And Transformation (GReAT) tool [29] provide [48], [51]. We will not describe the details of FORMULA expressive DSMLs and extensive model transfor- here, but show some results of the tool. mation features. However, the precise structural Finally, we incorporated our formalization into the semantics of tool artifacts depend on (and are Model Integrated Computing (MIC) tool suite, which has hidden in) the implementation of these complex amassed over a decade of massive modeling efforts rang- tools. ing from models of the NASA space station [12] to the 3) Standards such as the UML superstructure [80], sensor and control networks of automotive plants [68]. 3 We show in detail how our framework can be connected UML includes many capabilities (diagrams) including to this tried-and-tested DSML-based modeling suite. metamodeling, state machines, activities, sequence charts This case study illustrates that the bottom-up

Load more