Complexity of Regular Abstractions of One-Counter Languages
Total Page:16
File Type:pdf, Size:1020Kb
Complexity of regular abstractions of one-counter languages Mohamed Faouzi Atig Dmitry Chistikov Piotr Hofman ∗ Uppsala University, Sweden MPI-SWS, Germany LSV, CNRS & ENS Cachan, [email protected] [email protected] Université Paris-Saclay, France [email protected] K Narayan Kumar Prakash Saivasan Georg Zetzsche y Chennai Mathematical Institute, Chennai Mathematical Institute, LSV, CNRS & ENS Cachan, India India Université Paris-Saclay, France [email protected] [email protected] [email protected] Abstract Reasoning about OCA, however, is hardly an easy task. We study the computational and descriptional complexity For example, checking whether two OCA accept some word of the following transformation: Given a one-counter au- in common is undecidable even in the deterministic case; for tomaton (OCA) A, construct a nondeterministic finite au- nondeterministic OCA, even language universality, as well tomaton (NFA) B that recognizes an abstraction of the lan- as language equivalence, is undecidable. For deterministic guage L(A): its (1) downward closure, (2) upward closure, OCA, equivalence is NL-complete; the proof of the member- or (3) Parikh image. For the Parikh image over a fixed al- ship in NL took 40 years [9, 35]. phabet and for the upward and downward closures, we find This lack of tractability suggests the study of finite-state polynomial-time algorithms that compute such an NFA. For abstractions for OCA. Such a transition is a recurrent theme the Parikh image with the alphabet as part of the input, we in formal methods: features of programs beyond finite state find a quasi-polynomial time algorithm and prove a com- are modeled with infinite-state systems (such as pushdown pleteness result: we construct a sequence of OCA that ad- automata, counter systems, Petri nets, etc.), and then finite- mits a polynomial-time algorithm iff there is one for all state abstractions of these systems come as an important OCA. For all three abstractions, it was previously unknown tool for analysis (see, e.g., [4–7, 15, 31, 34]). In our work, we if appropriate NFA of sub-exponential size exist. focus on the following three regular abstractions, each capturing a specific feature of a language L ⊆ Σ∗: • The 1. Introduction downward closure of L, denoted L#, is the set of all subwords (subsequences) of all words w 2 L, i.e., the set of all words The family of one-counter languages is an intermedi- that can be obtained from words in L by removing some ate class between context-free and regular languages: it is letters. The downward closure is always a superset of the strictly less expressive than the former and strictly more ex- original language, L ⊆ L#, and, moreover, a regular one, no pressive than the latter. For example, the language fambm j matter what L is, by Higman’s lemma [22]. m ≥ 0g is one-counter, but not regular, and the set of palin- dromes over the alphabet fa; bg is context-free, but not one- • The upward closure of L, denoted L", is the set of all counter. From the verification perspective, the correspond- superwords (supersequences) of all words w 2 L, i.e., the ing class of automata, one-counter automata (OCA), can set of all words that can be obtained from words in L by model some infinite-state phenomena with its ability to keep inserting some letters. Similarly to L#, the language L" track of a non-negative integer counter, see, e.g., [10], [29, satisfies L ⊆ L" and is always regular. Section 5.1], and [3, Section 5.2]. • The Parikh image of L, denoted (L), is the set of jΣj ∗ Supported by Labex Digicosme, Univ. Paris-Saclay, project all vectors v 2 N , that count the number of occur- VERICONISS rences of letters of Σ in words from L. That is, suppose Σ = fa ; : : : ; a g, then every word w 2 L corresponds to a y Supported by a fellowship within the Postdoc-Program of the 1 k German Academic Exchange Service (DAAD). vector (w) = (v1; : : : ; vk) such that vi is the number of oc- currences of ai in w. The set (L) is always a regular subset jΣj of N if L is context-free, by the Parikh theorem [33]. It has long been known that all three abstractions can be ef- fectively computed for context-free languages (CFL), by the results of van Leeuwen [36] and Parikh [33]. Algorithms per- forming these tasks, as well as finite automata recognizing these abstractions, are now widely used as building blocks in the language-theoretic approach to verification. Specifi- cally, computing upward and downward closures occurs as an ingredient in the analysis of systems communicating via [Copyright notice will appear here once ’preprint’ option is removed.] shared memory, see, e.g., [4, 6, 7, 31]. As the recent pa- 1 2016/1/19 per [28] shows, for parameterized networks of such systems construction has two steps, both of which are of interest: the decidability hinges on the ability to compute downward in the first step we show, using a combination of local and closures. The Parikh image as an abstraction in the verifica- global transformations on runs (Lemmas 20 and 21), that we tion of infinite-state systems has been used extensively; see, may focus our attention on runs with at most polynomially e.g., [1, 2, 5, 13, 15, 17, 20, 25, 34]. For pushdown systems, it many reversals. In the second step, which also works for is possible to construct a linear-sized existential Presburger pushdown automata, we turn the bound on reversals, using formula that captures the Parikh image [37], which leads, for an argument with a flavour of Strahler numbers [16], into a a variety of problems (see, e.g., [1, 13, 20, 25]), to algorithms logarithmic bound on the stack size of a pushdown system that rely on deciding satisfiability for such formulas (which (Lemma 23). is in NP). Finite automata for Parikh images are used as • We prove a lower-bound type result (Theorem 24): We find intermediate representations, for example, in the analysis of a sequence of OCA (Hn)n≥1, where n denotes the number multi-threaded programs [5, 15, 34] and in recent work on of states, over alphabets of growing size, that admits a so-called availability languages [2]. polynomial-time algorithm for computing an NFA for the Extending the scope of these three abstractions from CFL Parikh image if and only if there is such an algorithm for to other classes of languages has been a natural topic of all OCA. Thus, the problem of transforming an arbitrary interest. Effective constructions for the downward closure OCA A into an NFA for (L(A)) is reduced to performing have been developed for Petri nets [19] and stacked counter this transformation on Hn, which enables us to call Hn automata [39]. The paper [38] gives a sufficient condition for complete. This result also has a counterpart referring to just a class of languages to have effective downward closures; this the existence of NFA of polynomial size. condition has since been applied to higher-order pushdown For the Parikh image of CFL, a number of constructions automata [21]. The effective regularity of the Parikh image is can be found in the literature as well; we refer the reader to known for linear indexed languages [12], phase-bounded and the paper by Esparza et al. [14] for a survey and state-of- scope-bounded multi-stack visibly pushdown languages [26, the-art results: exponential upper and lower bounds of the 27], and availability languages [2]. However, there are also form 2Θ(n) on the size of NFA for (L). negative results: for example, it is not possible to effectively compute the downward closure of languages recognized by Applications lossy channels automata—this is a corollary of the fact that, Our results show that for OCA, unlike for pushdown sys- for the set of reachable configurations of a lossy channel tems, NFA representations of downward and upward clo- system, boundedness is undecidable [32]. sures and Parikh image (for fixed alphabet size) have ef- Our contribution ficient polynomial constructions. This suggests a possible way around standard NP procedures that handle existential We study the construction of nondeterministic finite au- Presburger formulas. This insight also leads to significant tomata (NFA) for L#, L", and (L), if L is given as an OCA gains when abstractions are used in a nested manner, as A with n states: L = L(A). It turns out that for one-counter illustrated by the following examples. languages—a proper subclass of CFL—all three abstractions Consider a network of pushdown systems communicat- can be computed much more efficiently than for the entire ing via a shared memory. The reachability problem is un- class of CFL. decidable in this setting. In [7] a restriction called stage- boundedness, generalizing context-boundedness, is explored. Upward and downward closures: We show, for OCA, how During a stage, the memory can be written to only by one to construct NFA accepting L" and L# in polynomial time system. Reachability along runs with at most k stages is (Theorems 2 and 7). The construction for L" is straightfor- decidable when all but one pushdown in the network are ward, but the one for L# is involved and uses pumping-like counters. The procedure in [7] uses NFA that accept up- techniques from automata theory. ward and downward closures of one-counter languages; the These results are in contrast with the exponential lower polynomial-time algorithms developed in the present paper bounds known for both closures in the case of CFL [36]: bring the complexity from NEXP down to NP for any net- Several constructions for L" and L# have been proposed in work with a fixed number of components.