Non-linear Pattern Matching with Backtracking for Non-free Data Types

Satoshi Egi1 and Yuichi Nishiwaki2

1 Rakuten Institute of Technology, Japan
2 University of Tokyo, Japan

arXiv:1808.10603v2 [cs.PL] 27 May 2019

Abstract. Non-free data types are data types whose data have no canonical forms. For example, multisets are non-free data types because the multiset {a, b, b} has two other equivalent but literally different forms {b, a, b} and {b, b, a}. Pattern matching is known to provide a handy tool set to treat such data types. Although many studies on pattern matching and implementations for practical programming languages have been proposed so far, we observe that none of these studies satisfy all the criteria of practical pattern matching, which are as follows: i) efficiency of the backtracking algorithm for non-linear patterns, ii) extensibility of the matching process, and iii) polymorphism in patterns. This paper aims to design a new pattern-matching-oriented programming language that satisfies all the above three criteria. The proposed language features clean Scheme-like syntax and efficient and extensible pattern matching semantics. This is especially useful for the processing of complex non-free data types that not only include multisets and sets but also graphs and symbolic mathematical expressions. We discuss the importance of our criteria of practical pattern matching and how our language design naturally arises from the criteria. The proposed language has already been implemented and open-sourced as the Egison programming language.

1 Introduction

Pattern matching is an important feature of programming languages featuring data abstraction mechanisms. Data abstraction serves users with a simple method for handling data structures that contain plenty of complex information. Using pattern matching, programs using data abstraction become concise, human-readable, and maintainable. Most of the recent practical programming languages allow users to extend data abstraction, e.g. by defining new types or classes, or by introducing new abstract interfaces. Therefore, a good programming language with pattern matching should allow users to extend its pattern-matching facility akin to the extensibility of data abstraction.

Earlier, pattern-matching systems used to assume one-to-one correspondence between patterns and data constructors. However, this assumption became problematic when one handles data types whose data have multiple representations. To overcome this problem, Wadler proposed the pattern-matching system views [28] that broke the symmetry between patterns and data constructors. Views enabled users to pattern-match against data represented in many ways. For example, a complex number may be represented either in polar or Cartesian form, and they are convertible to each other. Using views, one can pattern-match a complex number internally represented in polar form with a pattern written in Cartesian form, and vice versa, provided that mutual transformation functions are properly defined. Similarly, one can use the Cons pattern to perform pattern matching on lists with joins, where a list [1,2] can be either (Cons 1 (Cons 2 Nil)) or (Join (Cons 1 Nil) (Cons 2 Nil)), if one defines a normalization function of lists with join into a sequence of Cons. However, views require data types to have a distinguished canonical form among many possible forms.
In the case of lists with join, one can pattern-match with Cons because any list with join is canonically reducible to a list with join with the Cons constructor at the head. On the other hand, for any list with join, there is no such canonical form that has Join at the head. For example, the list [1,2] may be decomposed with Join into three pairs: [] and [1,2], [1] and [2], and [1,2] and []. For that reason, views do not support pattern matching of lists with join using the Join pattern.

Generally, data types without canonical forms are called non-free data types. Mathematically speaking, a non-free data type can be regarded as a quotient of a free data type by an equivalence relation. An example of non-free data types is, of course, list with join: it may be viewed as a non-free data type composed of a (free) binary tree equipped with an equivalence between trees with the same leaf nodes enumerated from left to right, such as (Join Nil (Cons 1 (Cons 2 Nil))) = (Join (Cons 1 Nil) (Cons 2 Nil)). Other typical examples include sets and multisets, as they are (free) lists with obvious identifications.

Generally, as shown for lists with join, pattern matching on non-free data types yields multiple results (see footnote 3). For example, multiset {1,2,3} has three decompositions by the insert pattern: insert(1,{2,3}), insert(2,{1,3}), and insert(3,{1,2}). Therefore, how to handle multiple pattern-matching results is an extremely important issue when we design a programming language that supports pattern matching for non-free data types.

On the other hand, pattern guard is a commonly used technique for filtering such multiple results from pattern matching. Basically, pattern guards are applied after enumerating all pattern-matching results. Therefore, substantial unnecessary enumerations often occur before the application of pattern guards. One simple solution is to break a large pattern into nested patterns to apply pattern guards as early as possible.
However, this solution complicates the program and makes it hard to maintain. It is also possible to statically transform the program in a similar manner at compile time. However, this makes the compiler implementation very complex.

3 In fact, this phenomenon that “pattern matching against a single value yields multiple results” does not occur for free data types. This is the unique characteristic of non-free data types.

Non-linear patterns are an alternative to pattern guards. Non-linear patterns are patterns that allow multiple occurrences of the same variable in a pattern. Compared to pattern guards, they are not only syntactically beautiful but also compiler-friendly. Non-linear patterns are easier to analyze and hence can be implemented efficiently (Section 3.1 and 4.2). However, it is not obvious how to extend a non-linear pattern-matching system to allow users to define an algorithm to decompose non-free data types. In this paper, we introduce extensible pattern matching to remedy this issue (Section 3.2, 4.4, and 6). Extensibility of pattern matching also enables us to define predicate patterns, which are typically implemented as a built-in feature (e.g. pattern guards) in most pattern-matching systems. Additionally, we improve the usability of pattern matching for non-free data types by introducing a syntactic generalization for the match expression, called polymorphic patterns (Section 3.3 and 4.3). We also present a non-linear pattern-matching algorithm that is specialized for backtracking on infinite search trees and supports pattern matching with infinitely many results while remaining efficient (Section 5).

This paper aims to design a programming language that is oriented toward pattern matching for non-free data types. We summarize the above argument in the form of three criteria that must be fulfilled by a language in order to be used in practice:

1. Efficiency of the backtracking algorithm for non-linear patterns,
2. Extensibility of pattern matching, and
3. Polymorphism in patterns.

We believe that the above requirements, together called the criteria of practical pattern matching, are fundamental for languages with pattern matching. However, none of the existing languages and studies [5,15,26,10] fulfill all of them.
In the rest of the paper, we present a language which satisfies the criteria, together with comparisons with other languages, several working examples, and formal semantics. We emphasize that our proposal has already been implemented in Haskell as the Egison programming language, and is open-sourced [6]. Since we set our focus in this paper on the design of the programming language, detailed discussion on the implementation of Egison is left for future work.

2 Related Work

In this section, we compare our study with the prior work.

First, we review previous studies on pattern matching in functional programming languages. Our proposal can be considered as an extension of these studies.

The first non-linear pattern-matching system was the symbol manipulation system proposed by McBride [21]. This system was developed for Lisp. Their paper demonstrates some examples that process symbolic mathematical expressions to show the expressive power of non-linear patterns. However, this approach does not support pattern matching with multiple results, and only supports pattern matching against a list as a collection.

Miranda laws [27,25,24] and Wadler’s views [28,22] are seminal work. These proposals provide methods to decompose data with multiple representations by explicitly declaring transformations between each representation. These are the earliest studies that allow users to customize the execution process of pattern matching. However, the pattern-matching systems in these proposals treat neither multiple pattern-matching results nor non-linear patterns. Also, these studies demand a canonical form for each representation.

Active patterns [15,23] provide a method to decompose non-free data. In active patterns, users define a match function for each pattern to specify how to decompose non-free data. For example, insert for multisets is defined as a match function in [15]. An example of pattern matching against graphs using a matching function is also shown in [16]. One limitation of active patterns is that they do not support backtracking in the pattern-matching process. In active patterns, the values bound to pattern variables are fixed in order from the left to right of a pattern. Therefore, we cannot write non-linear patterns that require backtracking, such as a pattern that matches a collection (like sets or multisets) that contains two identical elements.
(The pattern matching fails if we unfortunately pick an element that appears more than twice at the first choice.)

First-class patterns [26] is a sophisticated system that treats patterns as first-class objects. The essence of this study is a pattern function that defines how to decompose data with each data constructor. First-class patterns can deal with pattern matching that generates multiple results. To generate multiple results, a pattern function returns a list. A critical limitation of this proposal is that first-class patterns do not support non-linear pattern matching.

Next, we explain the relation with logic programming.

We have mentioned that non-linear patterns and backtracking are important features to extend the efficiency and expressive power of pattern matching, especially on non-free data types. Unification of logic programming has both features. However, how to integrate non-determinism of logic programming and pattern matching is not obvious [18]. For example, the pattern-matching facility of Prolog is specialized only for algebraic data types.

Functional logic programming [10] is an approach towards this integration. It allows both non-linear patterns and multiple pattern-matching results. The key difference between functional logic programming and our approach is in the method for defining pattern-matching algorithms. In functional logic programming, we describe the pattern-matching algorithm for each pattern in the logic-programming style. A function that describes such an algorithm is called a pattern constructor. A pattern constructor takes decomposed values as its arguments and returns the target data. On the other hand, in our proposal, pattern constructors are defined in the functional-programming style: pattern constructors take a target datum as an argument and return the decomposed values. This enables direct description of algorithms.

3 Motivation

In this section, we discuss the requirements for programming languages to establish practical pattern matching for non-free data types.

3.1 Pattern Guards vs. Non-linear Patterns

Compared to pattern guards, non-linear patterns are a compiler-friendly method for filtering multiple matching results efficiently. However, non-linear pattern matching is typically implemented by converting them to pattern guards. For example, some implementations of functional logic programming languages convert non-linear patterns to pattern guards [8,9,18]. This method is inefficient because it leads to enumerating unnecessary candidates. In the following program in Curry, seqN returns "Matched" if the argument list has a sequential N-tuple. Otherwise it returns "Not matched". insert is used as a pattern constructor for decomposing data into an element and the rest ignoring the order of elements.

insert x [] = [x]
insert x (y:ys) = x:y:ys ? y:(insert x ys)

seq2 (insert x (insert (x+1) _)) = "Matched"
seq2 _ = "Not matched"

seq3 (insert x (insert (x+1) (insert (x+2) _))) = "Matched"
seq3 _ = "Not matched"

seq4 (insert x (insert (x+1) (insert (x+2) (insert (x+3) _)))) = "Matched"
seq4 _ = "Not matched"

seq2 (take 10 (repeat 0)) -- returns "Not matched" in O(n^2) time
seq3 (take 10 (repeat 0)) -- returns "Not matched" in O(n^3) time
seq4 (take 10 (repeat 0)) -- returns "Not matched" in O(n^4) time

When we use a Curry compiler such as PAKCS [4] and KiCS2 [11], we see that “seq4 (take n (repeat 0))” takes more time than “seq3 (take n (repeat 0))” because seq3 is compiled to seq3' as follows. Therefore, seq4 enumerates $\binom{n}{4}$ candidates, whereas seq3 enumerates $\binom{n}{3}$ candidates before filtering the results. If the program uses non-linear patterns as in seq3, we easily find that we can check that no sequential triples or quadruples exist simply by checking $\binom{n}{2}$ pairs. However, such information is discarded during the program transformation into pattern guards.

seq3' (insert x (insert y (insert z _))) | y==x+1 && z==x+2 = "Matched"
seq3' _ = "Not matched"

One way to make this program efficient in Curry is to stop using non-linear patterns and instead use a predicate explicitly in pattern guards. The following illustrates such a program.

isSeq2 (x:y:rs) = y == x+1
isSeq3 (x:rs) = isSeq2 (x:rs) && isSeq2 rs

perm [] = []
perm (x:xs) = insert x (perm xs)

seq3 xs | isSeq3 ys = "Matched" where ys = perm xs
seq3 _ = "Not matched"

seq3 (take 10 (repeat 0)) -- returns "Not matched" in O(n^2) time

In the program, because of laziness, only the head part of the list is evaluated. In addition, because of sharing [17], the common head part of the list is pattern-matched only once. Using this call-by-need-like strategy enables efficient pattern matching on sequential n-tuples. However, this strategy sacrifices readability of programs and makes the program obviously redundant. In this paper, instead, we base our work on non-linear patterns and attempt to improve their usability while keeping them compiler-friendly and syntactically clean.
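The gap between the two strategies can be made concrete by counting how many candidate checks each one performs. The following Python sketch is an illustration only and is not taken from the Curry or Egison implementations; the helper names choose, seq3_guard, and seq3_nonlinear are hypothetical. It counts ordered choices, so the totals are n(n-1)(n-2) and n(n-1) rather than binomial coefficients, but the difference in growth is the same.

```python
from typing import Iterator, List, Tuple


def choose(xs: List[int]) -> Iterator[Tuple[int, List[int]]]:
    """Yield every (element, rest) split of xs, ignoring element order."""
    for i, x in enumerate(xs):
        yield x, xs[:i] + xs[i + 1:]


def seq3_guard(xs: List[int]) -> Tuple[bool, int]:
    """Guard style: pick x, y and z first, then test the whole condition."""
    tested = 0
    for x, rest1 in choose(xs):
        for y, rest2 in choose(rest1):
            for z, _ in choose(rest2):
                tested += 1
                if y == x + 1 and z == x + 2:
                    return True, tested
    return False, tested


def seq3_nonlinear(xs: List[int]) -> Tuple[bool, int]:
    """Non-linear style: test each value as soon as it is picked."""
    tested = 0
    for x, rest1 in choose(xs):
        for y, rest2 in choose(rest1):
            tested += 1
            if y != x + 1:
                continue  # backtrack at the first mismatch
            for z, _ in choose(rest2):
                tested += 1
                if z == x + 2:
                    return True, tested
    return False, tested


zeros = [0] * 10
print(seq3_guard(zeros))      # (False, 720)
print(seq3_nonlinear(zeros))  # (False, 90)
```

On a list of ten zeros, the guard-style search tests all 10 * 9 * 8 triples, while the early-check search gives up after the 10 * 9 pairs.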

3.2 Extensible Pattern Matching

As a program gets more complicated, data structures involved in the program get complicated as well. A pattern-matching facility for such data structures (e.g. graphs and mathematical expressions) should be extensible and customizable by users because it is impractical to provide the data structures for these data types as built-in data types in general-purpose languages.

In the studies of computer algebra systems, efficient non-linear pattern-matching algorithms for mathematical expressions that avoid such unnecessary search have already been proposed [2,20]. Generally, users of such computer algebra systems control the pattern-matching method for mathematical expressions by specifying attributes for each operator. For example, the Orderless attribute of the Wolfram language indicates that the order of the arguments of the operator is ignored [3]. However, the set of attributes available is fixed and cannot be changed [1]. This means that the pattern-matching algorithms in such computer algebra systems are specialized only for some specific data types such as multisets. However, there are a number of data types we want to pattern-match other than mathematical expressions, like unordered pairs, trees, and graphs.

Thus, extensible pattern matching for non-free data types is necessary for handling complicated data types such as mathematical expressions. This paper designs a language that allows users to implement efficient backtracking algorithms for general non-free data types by themselves. It gives users power equivalent to freely adding new attributes by themselves. We discuss this topic again in Section 4.4.

3.3 Monomorphic Patterns vs Polymorphic Patterns

Polymorphism of patterns is useful for reducing the number of names used as pattern constructors. If patterns are monomorphic, we need to use different names for pattern constructors with similar meanings. As such, monomorphic patterns are error-prone.

For example, the pattern constructor that decomposes a collection into an element and the rest ignoring the order of the elements is bound to the name insert in the sample code of Curry [8] as in Section 3.1. The same pattern constructor’s name is Add' in the sample program of Active Patterns [15]. However, both can be considered a generalization of the cons pattern constructor from lists to multisets, because both of them decompose a collection into an element and the rest.

Polymorphism is important, especially for value patterns. A value pattern is a pattern that matches when the value in the pattern is equal to the target. It is an important pattern construct for expressing non-linear patterns. If patterns are monomorphic, we need to prepare different notations for value patterns of different data types. For example, we need to have different notations for value patterns for lists and multisets. This is because equality of objects as lists differs from equality of objects as multisets, although both lists and multisets are represented as lists.

pairsAsLists (insert x (insert x _)) = "Matched"
pairsAsLists _ = "Not matched"

pairsAsMultisets (insert x (insert y _)) | (multisetEq x y) = "Matched"
pairsAsMultisets _ = "Not matched"

pairsAsLists [[1,2],[2,1]] -- returns "Not matched"
pairsAsMultisets [[1,2],[2,1]] -- returns "Matched"
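The point that the same search needs a different notion of equality per data type can be sketched in Python by passing the equality as a parameter. This is only an illustration of the idea, not the paper's mechanism; the names list_eq, multiset_eq, and has_equal_pair are hypothetical.

```python
from collections import Counter
from typing import Callable, List, Sequence


def list_eq(a: Sequence[int], b: Sequence[int]) -> bool:
    """Equality as lists: order matters."""
    return list(a) == list(b)


def multiset_eq(a: Sequence[int], b: Sequence[int]) -> bool:
    """Equality as multisets: order is ignored, multiplicity is not."""
    return Counter(a) == Counter(b)


def has_equal_pair(items: List[Sequence[int]],
                   eq: Callable[[Sequence[int], Sequence[int]], bool]) -> bool:
    """True if two distinct positions of items hold elements equal under eq."""
    n = len(items)
    return any(eq(items[i], items[j])
               for i in range(n) for j in range(n) if i != j)


print(has_equal_pair([[1, 2], [2, 1]], list_eq))      # False
print(has_equal_pair([[1, 2], [2, 1]], multiset_eq))  # True
```

The search code is written once; only the equality supplied by the caller changes, which is the role a matcher plays for value patterns.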

4 Proposal

In this section, we introduce our pattern-matching system, which satisfies all requirements shown in Section 3. Our language has Scheme-like syntax. It is dynamically typed and, like Curry, based on lazy evaluation.

4.1 The match-all and match expressions

We explain the match-all expression. It is a primitive syntax of our language. It supports pattern matching with multiple results. We show a sample program using match-all in the following. In this paper, we show the evaluation result of a program in the comment that follows the program. “;” is the inline comment delimiter of the proposed language.

(match-all {1 2 3} (list integer) [<join $xs $ys> [xs ys]])
; {[{} {1 2 3}] [{1} {2 3}] [{1 2} {3}] [{1 2 3} {}]}

Our language uses three kinds of brackets in addition to “(” and “)”, which denote function applications. “<” and “>” are used to apply pattern and data constructors. In our language, the name of a data constructor starts with uppercase, whereas the name of a pattern constructor starts with lowercase. “[” and “]” are used to build a tuple. “{” and “}” are used to denote a collection.

In our implementation, the collection type is a built-in data type implemented as a lazy 2-3 finger tree [19]. The reason is that we thought data structures that support a wider range of operations for decomposition are more suitable for our pattern-matching system. (2-3 finger trees support efficient extraction of the last element.)

match-all is composed of an expression called target, a matcher, and a match clause, which consists of a pattern and a body expression. The match-all expression evaluates the body of the match clause for each pattern-matching result and returns a (lazy) collection that contains all results. In the above code, we pattern-match the target {1 2 3} as a list of integers using the pattern <join $xs $ys>. (list integer) is a matcher to pattern-match the pattern and target as a list of integers. The pattern is constructed using the join pattern constructor. $xs and $ys are called pattern variables. We can use the result of pattern matching by referring to them. A match-all expression first consults the matcher on how to pattern-match the given target and the given pattern. Matchers know how to decompose the target following the given pattern and enumerate the results, and match-all then collects the results returned by the matcher. In the sample program, given a join pattern, (list integer) tries to divide a collection into two collections. The collection {1 2 3} is thus divided into two collections in four ways.

match-all can handle pattern matching that may yield infinitely many results. For example, the following program extracts all twin primes from the infinite list of prime numbers (see footnote 4). We will discuss this mechanism in Section 5.2.

(define $twin-primes
  (match-all primes (list integer)
    [<join _ <cons $p <cons ,(+ p 2) _>>> [p (+ p 2)]]))

(take 6 twin-primes) ; {[3 5] [5 7] [11 13] [17 19] [29 31] [41 43]}
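The twin-primes example depends on results being produced on demand from an infinite target. A rough Python analogue, using generators, is sketched below. It hard-codes the "two adjacent elements" shape of the pattern instead of performing general pattern matching, and the helper names primes and twin_primes are hypothetical.

```python
from itertools import count, islice
from typing import Iterator, List, Optional, Tuple


def primes() -> Iterator[int]:
    """Infinite generator of primes by trial division (simple, not fast)."""
    found: List[int] = []
    for n in count(2):
        if all(n % p for p in found if p * p <= n):
            found.append(n)
            yield n


def twin_primes() -> Iterator[Tuple[int, int]]:
    """Lazily yield (p, p + 2) for adjacent primes that differ by 2."""
    prev: Optional[int] = None
    for p in primes():
        if prev is not None and p == prev + 2:
            yield prev, p
        prev = p


print(list(islice(twin_primes(), 6)))
# [(3, 5), (5, 7), (11, 13), (17, 19), (29, 31), (41, 43)]
```

As with take in the program above, islice forces only as many results as are requested, so the infinite input never has to be fully enumerated.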

4 We will explain the meaning of the value pattern ,(+ p 2) and the cons pattern constructor in Section 4.2 and 4.3, respectively.

There is another primitive syntax called the match expression. While match-all returns a collection of all matched results, match short-circuits the pattern-matching process and immediately returns if any result is found. Another difference from match-all is that it can take multiple match clauses. It tries pattern matching starting from the head of the match clauses, and tries the next clause if it fails. Therefore, match is useful when we write conditional branching.

However, match is inessential for our language. It is implementable in terms of the match-all expression and macros. This is because the match-all expression is evaluated lazily, and, therefore, we can extract the first pattern-matching result from match-all without calculating other pattern-matching results simply by using car. We can implement match by combining the match-all and if expressions using macros. For that reason, we only discuss the match-all expression in the rest of the paper. Furthermore, if is also implementable in terms of the match-all and matcher expressions as follows. We will explain the matcher expression in Section 6.

(define $if
  (macro [$b $e1 $e2]
    (car (match-all b (matcher {[$ something
                                 {[<True> {e1}]
                                  [<False> {e2}]}]})
           [$x x]))))
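The observation that match can be recovered from a lazy match-all can be sketched with Python generators: each clause lazily yields one value per matching result, and match forces only the first. This is an analogy under that assumption, not the Egison macro itself; match, at_least, nonempty, and anything are hypothetical names.

```python
from itertools import count
from typing import Any, Callable, Iterator, Sequence

_NO_MATCH = object()  # sentinel: the clause produced no result


def match(target: Any,
          clauses: Sequence[Callable[[Any], Iterator[Any]]]) -> Any:
    """Return the first result of the first clause that matches at all."""
    for clause in clauses:
        first = next(clause(target), _NO_MATCH)  # forces at most one result
        if first is not _NO_MATCH:
            return first
    raise ValueError("no clause matched")


def at_least(n: int) -> Iterator[int]:
    """Clause with infinitely many results: n, n + 1, n + 2, ..."""
    yield from count(n)


def nonempty(xs: list) -> Iterator[str]:
    """Clause that matches only non-empty lists."""
    if xs:
        yield "non-empty"


def anything(_: Any) -> Iterator[str]:
    """Fallback clause that always matches once."""
    yield "fallback"


print(match(5, [at_least]))             # 5
print(match([], [nonempty, anything]))  # fallback
```

Because only the first element of each generator is requested, a clause with infinitely many results still returns immediately.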

4.2 Efficient Non-linear Pattern Matching with Backtracking

Our language can handle non-linear patterns efficiently. For example, the calculation time of the following code does not depend on the pattern length. Both of the following examples take $O(n^2)$ time to return the result.

(match-all (take n (repeat 0)) (multiset integer)
  [<insert $x <insert ,(+ x 1) _>> x])
; returns {} in O(n^2) time

(match-all (take n (repeat 0)) (multiset integer)
  [<insert $x <insert ,(+ x 1) <insert ,(+ x 2) _>>> x])
; returns {} in O(n^2) time

In our proposal, a pattern is examined from left to right in order, and the binding to a pattern variable can be referred to on its right side in the pattern. In the above examples, the pattern variable $x is bound to any element of the collection since the pattern constructor is insert. After that, the patterns “,(+ x 1)” and “,(+ x 2)” are examined. A pattern that begins with “,” is called a value pattern. The expression following “,” can be any kind of expression. A value pattern matches the target data if the target is equal to the content of the pattern. Therefore, after successful pattern matching, $x is bound to an element that appears multiple times.

We can discuss the difference in efficiency between non-linear patterns and pattern guards more precisely for general cases. The time complexity involved in pattern guards is $O(n^{p+v})$ when the pattern matching fails, whereas the time complexity involved in non-linear patterns is $O(n^{p+\min(1,v)})$, where $n$ is the size of the target object (see footnote 5), $p$ is the number of pattern variables, and $v$ is the number of value patterns. The difference between $v$ and $\min(1, v)$ comes from the mechanism of non-linear pattern matching that backtracks at the first mismatch of the value pattern.

5 Here, we suppose that the number of decompositions by each pattern constructor can be approximated by the size of the target object.

Table 1 shows micro benchmark results of non-linear pattern matching for Curry and Egison. The table shows execution times of the Curry program presented in Section 3.1 and the corresponding Egison program as shown above. The environment we used was Ubuntu on VirtualBox with 2 processors and 8GB memory hosted on MacBook Pro (2017) with 2.3 GHz Intel Core i5 processor. We can see that the execution times in the two implementations follow the theoretical computational complexities discussed above. We emphasize that these benchmark results do not mean Curry is slower than Egison. We can write efficient programs for the same purpose in Curry if we do not insist on using non-linear patterns. Let us also note that the current implementation of Egison is not tuned, and comparing constant factors between the two implementations is meaningless.

Curry    n=15    n=25     n=30     n=50      n=100
seq2     1.18s   1.20s    1.29s    1.53s     2.54s
seq3     1.42s   2.10s    2.54s    7.40s     50.66s
seq4     3.37s   16.42s   34.19s   229.51s   3667.49s

Egison   n=15    n=25     n=30     n=50      n=100
seq2     0.26s   0.34s    0.43s    0.84s     2.72s
seq3     0.25s   0.34s    0.46s    0.82s     2.66s
seq4     0.25s   0.34s    0.42s    0.78s     2.47s

Table 1. Benchmarks of Curry (PAKCS version 2.0.1 and Curry2Prolog(swi 7.6) compiler environment) and Egison (version 3.7.12)

Value patterns are not only efficient but also easy to read once we are used to them, because they enable us to read patterns in the same order in which the execution process of pattern matching proceeds. They also reduce the number of new variables introduced in a pattern. We explain how the proposed system executes the above pattern matching efficiently in Section 5.
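As a worked instance of the complexity estimate above, consider the seqN programs. The reading that a seq-$k$ pattern has $p = 1$ pattern variable and $v = k - 1$ value patterns is our interpretation of the examples, not a statement from the benchmark itself.

```latex
\begin{align*}
\text{pattern guards:}      \quad & O(n^{p+v}) = O(n^{1+(k-1)}) = O(n^{k}), \\
\text{non-linear patterns:} \quad & O(n^{p+\min(1,v)}) = O(n^{1+1}) = O(n^{2}) \qquad (k \ge 2).
\end{align*}
```

This is roughly consistent with Table 1: doubling $n$ from 50 to 100 multiplies the Curry time for seq4 by $3667.49 / 229.51 \approx 16 = 2^4$, whereas the Egison time for seq4 grows by $2.47 / 0.78 \approx 3.2$, below the factor $2^2 = 4$ that quadratic growth alone would give.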

4.3 Polymorphic Patterns

A characteristic of the proposed pattern-matching expressions is that they take a matcher. This ingredient allows us to use the same pattern constructors for different data types.

For example, one may want to pattern-match a collection {1 2 3} sometimes as a list and other times as a multiset or a set. For these three types, we can naturally define similar pattern-matching operations. One example is the cons pattern, which is also called insert in Section 3.1 and 4.2. Given a collection, the pattern <cons $x $rs> divides it into the “head” element and the rest. When we use the cons pattern for lists, it either yields the result which is uniquely determined by the constructor, or just fails when the list is empty. On the other hand, for multisets, it non-deterministically chooses an element from the given collection and yields many results. By explicitly specifying which matcher is used in match expressions, we can uniformly write such programs in our language:

(match-all {1 2 3} (list integer) [<cons $x $rs> [x rs]])
; {[1 {2 3}]}
(match-all {1 2 3} (multiset integer) [<cons $x $rs> [x rs]])
; {[1 {2 3}] [2 {1 3}] [3 {1 2}]}
(match-all {1 2 3} (set integer) [<cons $x $rs> [x rs]])
; {[1 {1 2 3}] [2 {1 2 3}] [3 {1 2 3}]}

In the case of lists, the head element $x is simply bound to the first element of the collection. On the other hand, in the case of multisets or sets, the head element can be any element of the collection because we ignore the order of elements. In the case of lists or multisets, the rest elements $rs are the collection that is made by removing the “head” element from the original collection. However, in the case of sets, the rest elements are the same as the original collection because we ignore the redundant elements. If we interpret a set as a collection that contains infinitely many copies of each element, this specification of cons for sets is natural. This specification is useful, for example, when we pattern-match a graph as a set of edges and enumerate all paths with some fixed length including cycles without redundancy.

Polymorphic patterns are useful especially when we use value patterns. Like other patterns, the behavior of value patterns depends on the matcher. For example, an equality {1 2 3} = {2 1 3} between collections is false if we regard them as mere lists but true if we regard them as multisets. Still, thanks to polymorphism of patterns, we can use the same syntax for both of them. This greatly improves the readability of the program and makes programming with non-free data types easy.

(match-all {1 2 3} (list integer) [,{2 1 3} "Matched"]) ; {}
(match-all {1 2 3} (multiset integer) [,{2 1 3} "Matched"]) ; {"Matched"}
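The matcher-dependent behavior of the cons pattern described above can be sketched in Python as three decomposition functions selected by a matcher name. This is only an illustration of the specification in the text, with Python lists standing in for collections; cons_list, cons_multiset, cons_set, and match_all_cons are hypothetical names.

```python
from typing import Callable, Dict, Iterator, List, Tuple

Split = Tuple[int, List[int]]  # (head element, rest of the collection)


def cons_list(xs: List[int]) -> Iterator[Split]:
    """As a list: the head is the first element, or no result if empty."""
    if xs:
        yield xs[0], xs[1:]


def cons_multiset(xs: List[int]) -> Iterator[Split]:
    """As a multiset: any element may be the head; the rest drops it once."""
    for i, x in enumerate(xs):
        yield x, xs[:i] + xs[i + 1:]


def cons_set(xs: List[int]) -> Iterator[Split]:
    """As a set: any element may be the head; the rest is the whole set."""
    for x in xs:
        yield x, xs


MATCHERS: Dict[str, Callable[[List[int]], Iterator[Split]]] = {
    "list": cons_list,
    "multiset": cons_multiset,
    "set": cons_set,
}


def match_all_cons(target: List[int], matcher: str) -> List[Split]:
    """Collect every (head, rest) result of the cons pattern for a matcher."""
    return list(MATCHERS[matcher](target))


print(match_all_cons([1, 2, 3], "list"))      # [(1, [2, 3])]
print(match_all_cons([1, 2, 3], "multiset"))  # [(1, [2, 3]), (2, [1, 3]), (3, [1, 2])]
print(match_all_cons([1, 2, 3], "set"))       # [(1, [1, 2, 3]), (2, [1, 2, 3]), (3, [1, 2, 3])]
```

The pattern is the same in all three calls; only the matcher argument decides how the target is decomposed.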

We can pass matchers to a function because matchers are first-class objects. This enables us to utilize polymorphic patterns for defining functions. The following is an example utilizing polymorphism of value patterns.

(define $member?/m
  (lambda [$m $x $xs]
    (match xs (list m)
      {[<join _ <cons ,x _>> #t]
       [_ #f]})))

4.4 Extensible Pattern Matching

In the proposed language, users can describe methods for interpreting patterns in the definition of matchers. The matchers that have appeared so far are all defined in our language. We show an example of a matcher definition. We will explain the details of this definition in Section 6.1.

(define $unordered-pair
  (lambda [$a]
    (matcher {[<pair $ $> [a a] {[<Pair $x $y> {[x y] [y x]}]}]
              [$ [something] {[$tgt {tgt}]}]})))

An unordered pair is a pair ignoring the order of the elements. For example, <Pair 2 5> is equivalent to <Pair 5 2>, if we regard them as unordered pairs. Therefore, datum <Pair 2 5> is successfully pattern-matched with pattern <pair ,5 $x>.

(match-all <Pair 2 5> (unordered-pair integer) [<pair ,5 $x> x]) ; {2}
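The idea behind the unordered-pair matcher, namely that a pair is offered to the pattern in both orders, can be sketched in Python as follows. This is a simplified analogy and not the matcher mechanism of the language; the function names and the sample values (2, 5) are assumptions chosen for illustration.

```python
from typing import Iterator, List, Tuple


def unordered_pair_decompositions(pair: Tuple[int, int]) -> Iterator[Tuple[int, int]]:
    """Offer both orderings of the pair, as the matcher clause {[x y] [y x]} does."""
    x, y = pair
    yield x, y
    yield y, x


def match_pair_with_value_first(pair: Tuple[int, int], value: int) -> List[int]:
    """Mimic a pattern whose first slot is a value pattern and whose second
    slot is a pattern variable: collect every binding of the second slot."""
    return [second
            for first, second in unordered_pair_decompositions(pair)
            if first == value]


print(match_pair_with_value_first((2, 5), 5))  # [2]
print(match_pair_with_value_first((2, 5), 7))  # []
```

Because both orderings are tried, the value may sit in either position of the stored pair and the other component is still found.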

We can define matchers for more complicated data types. For example, Egi constructed a matcher for mathematical expressions for building a computer algebra system on our language [7,13,14]. His computer algebra system is implemented as an application of the proposed pattern-matching system. The matcher for mathematical expressions is used for implementing simplification algorithms of mathematical expressions. A program that converts a mathematical expression object $n\cos^2(\theta) + n\sin^2(\theta)$ to $n$ can be implemented as follows. (Here, we introduced the math-expr matcher and some syntactic sugar for patterns.)

(define $rewrite-rule-for-cos-and-sin-poly
  (lambda [$poly]
    (match poly math-expr
      {[<+ <* $n <,cos $x>^,2 $y> <* ,n <,sin ,x>^,2 ,y> $r>
        (rewrite-rule-for-cos-and-sin-poly <+' r <*' n y>>)]
       [_ poly]})))

5 Algorithm

This section explains the pattern-matching algorithm of the proposed system. The formal definition of the algorithm is given in Section 7. The method for defining matchers explained in Section 6 is deeply related to the algorithm.

5.1 Execution Process of Non-linear Pattern Matching

Let us show what happens when the system evaluates the following pattern-matching expression.

(match-all {2 8 2} (multiset integer) [<cons $m <cons ,m _>> m]) ; {2 2}

Figure 1 shows one of the execution paths that reaches a matching result. First, the initial matching state is generated (step 1). A matching state is a datum that represents an intermediate state of pattern matching. A matching state is a compound type consisting of a stack of matching atoms, an environment, and intermediate results of the pattern matching. A matching atom is a tuple of a pattern, a matcher, and an expression called target. MState denotes the data constructor for matching states. env is the environment when the evaluation enters the match-all expression. The stack of matching atoms initially contains a single matching atom whose pattern, target and matcher are the arguments of the match-all expression.

1  MState {[<cons $m <cons ,m _>> (multiset integer) {2 8 2}]} env {}
2  MState {[$m integer 2] [<cons ,m _> (multiset integer) {8 2}]} env {}
   MState {[$m integer 8] [<cons ,m _> (multiset integer) {2 2}]} env {}
   MState {[$m integer 2] [<cons ,m _> (multiset integer) {2 8}]} env {}
3  MState {[$m something 2] [<cons ,m _> (multiset integer) {8 2}]} env {}
4  MState {[<cons ,m _> (multiset integer) {8 2}]} env {[m 2]}
5  MState {[,m integer 8] [_ (multiset integer) {2}]} env {[m 2]}
   MState {[,m integer 2] [_ (multiset integer) {8}]} env {[m 2]}
6  MState {[_ (multiset integer) {8}]} env {[m 2]}
7  MState {[_ something {8}]} env {[m 2]}
8  MState {} env {[m 2]}

Fig. 1. Reduction path of matching states

In our proposal, pattern matching is implemented as reductions of matching states. In a reduction step, the top matching atom in the stack of matching atoms is popped out. This matching atom is passed to the procedure called the matching function. The matching function is a function that takes a matching atom and returns a list of lists of matching atoms. The behavior of the matching function is controlled by the matcher of the argument matching atom. We can control the behavior of the matching function by defining matchers properly. For example, we obtain the following results by passing the matching atom of the initial matching state to the matching function.

matchFunction [<cons $m <cons ,m _>> (multiset integer) {2 8 2}] =
  { {[$m integer 2] [<cons ,m _> (multiset integer) {8 2}]}
    {[$m integer 8] [<cons ,m _> (multiset integer) {2 2}]}
    {[$m integer 2] [<cons ,m _> (multiset integer) {2 8}]} }

Each list of matching atoms is prepended to the stack of matching atoms. As a result, the number of matching states increases to three (step 2). Our pattern-matching system repeats this step until all the matching states vanish. For simplicity, in the following, we only examine the reduction of the first matching state in step 2. This matching state is reduced to the matching state shown in step 3. The matcher in the top matching atom of the stack is changed from integer to something, by the definition of the integer matcher. something is the only built-in matcher of our pattern-matching system. something can handle only wildcards and pattern variables, and is used to bind a value to a pattern variable. This matching state is then reduced to the matching state shown in step 4. The top matching atom in the stack is popped, and a new binding [m 2] is added to the collection of intermediate results. Only something can append a new binding to the result of pattern matching.

Similarly to the preceding steps, the matching state is then reduced as shown in step 5, and the number of matching states increases to two. “,m” is pattern-matched with 8 and 2 by the integer matcher in the next step. When we pattern-match with a value pattern, the intermediate results of the pattern matching are used as an environment to evaluate it. In this way, “m” is evaluated to 2. Therefore, the first matching state fails to pattern-match and vanishes. The second matching state succeeds in pattern matching and is reduced to the matching state shown in step 6. In step 7, the matcher is simply converted from (multiset integer) to something, by the definition of (multiset integer). Finally, the matching state is reduced to the empty collection (step 8). No new binding is added because the pattern is a wildcard. When the stack of matching atoms is empty, reduction finishes and the pattern matching succeeds for this reduction path. The matching result {[m 2]} is added to the entire result of pattern matching. One can check that the pattern matching for sequential triples and quadruples is also executed efficiently by this algorithm.
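The reduction of matching states described in this section can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the Egison interpreter: patterns are encoded as hypothetical tuples ("var", name), ("val", name), ("wild",), and ("cons", head, tail); the matcher names "multiset-integer", "integer", and "something" are plain strings; and only the clauses needed for the example are modelled.

```python
def match_function(pattern, matcher, target, env):
    """Return a list of alternatives, each a list of new matching atoms."""
    if matcher == "multiset-integer" and pattern[0] == "cons":
        _, head_pat, tail_pat = pattern
        # One alternative per element: that element and the remaining multiset.
        return [[(head_pat, "integer", x),
                 (tail_pat, "multiset-integer", target[:i] + target[i + 1:])]
                for i, x in enumerate(target)]
    if matcher == "integer" and pattern[0] == "val":
        # Value pattern: evaluated under the intermediate results so far.
        return [[]] if env[pattern[1]] == target else []
    # Wildcards and pattern variables are delegated to `something`.
    return [[(pattern, "something", target)]]


def match_all(pattern, matcher, target):
    """Reduce matching states (stack of atoms, bindings) until none remain."""
    states = [([(pattern, matcher, target)], {})]
    results = []
    while states:
        stack, env = states.pop(0)
        if not stack:                      # empty stack: this path succeeded
            results.append(env)
            continue
        (pat, m, tgt), rest = stack[0], stack[1:]
        if m == "something":               # only `something` adds bindings
            new_env = {**env, pat[1]: tgt} if pat[0] == "var" else env
            states.append((rest, new_env))
        else:
            for atoms in match_function(pat, m, tgt, env):
                states.append((atoms + rest, env))
    return results


# Encodes the pattern <cons $m <cons ,m _>>
PATTERN = ("cons", ("var", "m"), ("cons", ("val", "m"), ("wild",)))
print([r["m"] for r in match_all(PATTERN, "multiset-integer", [2, 8, 2])])  # [2, 2]
```

The second matching state of step 2 (m bound to 8) produces no alternatives at the value pattern, so it vanishes immediately; this early pruning is what keeps non-linear patterns efficient.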

5.2 Pattern Matching with Infinitely Many Results

The proposed pattern-matching system can eventually enumerate all successful matching results even when there are infinitely many of them. This is achieved by reducing the matching states in a proper order. Consider the following program:

(take 8 (match-all nats (set integer) [<cons $m <cons $n _>> [m n]]))
; {[1 1] [1 2] [2 1] [1 3] [2 2] [3 1] [1 4] [2 3]}

Figure 2 shows the search tree of matching states when the system executes the above pattern-matching expression. Rectangles represent matching states, and circles represent final matching states of successful pattern matching. The rectangle at the upper left is the initial matching state. The rectangles in the second row are the matching states generated from the initial matching state in one step. Circles o8, r9, and s9 correspond to pattern-matching results {[m 1] [n 1]}, {[m 1] [n 2]}, and {[m 2] [n 1]}, respectively.

Fig. 2. Search tree

Fig. 3. Binary reduction tree

One issue with naively searching this tree is that we cannot enumerate all matching states in either a depth-first or a breadth-first manner. The reason is that both the widths and the depths of the search tree can be infinite. Widths can be infinite because a matching state may generate infinitely many matching states (e.g., the width of the second row is infinite), and depths can be infinite when we extend the language with a notion such as recursively defined patterns [12].

To resolve this issue, we reshape the search tree into a reduction tree as presented in Figure 3. A node of a reduction tree is a list of matching states, and a node has at most two child nodes: the left child consists of the matching states generated from the head matching state of the parent, and the right child is a copy of the tail part of the parent's matching states. At each reduction step, the system has a list of nodes. Each row in Figure 3 denotes such a list. One reduction step in our system proceeds in the following two steps. First, for each node, it generates a node from the head matching state. Then, it constructs the nodes for the next step by collecting the generated nodes and the copies of the tail parts of the nodes. The index of each node denotes the depth in the tree at which the node is checked. Since the width of the tree is at most 2^n for some n at any depth, every node can be assigned some finite number, which means that every node in the tree is eventually checked after a finite number of reduction steps.

We adopt the breadth-first search strategy as the default traversal method because there are cases in which breadth-first traversal successfully enumerates all pattern-matching results while depth-first traversal fails to do so, namely when we handle pattern matching with infinitely many results. However, when the size of the reduction tree is finite, the space complexity of depth-first traversal is lower. Furthermore, there are cases in which the time complexity of depth-first traversal is also lower when we extract only the first several successful matches. Therefore, to extend the range of algorithms we can express concisely with pattern matching while keeping efficiency, it is important to provide users with a method for switching the search strategy over reduction trees. We leave further investigation of this direction as interesting future work.
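The traversal of the binary reduction tree can be sketched as follows. This is a toy model in Python under stated assumptions: a node is a lazy (possibly infinite) iterator of states, and the three hypothetical state shapes ("start",), ("m", m), and ("mn", m, n) stand in for the many intermediate matching states of the real system. Only the scheduling of nodes is modelled.

```python
from itertools import count, islice


def expand(state):
    """Reduce one state: either a final result or a lazy list of child states."""
    if state[0] == "start":                      # choose m from nats
        return "children", (("m", m) for m in count(1))
    if state[0] == "m":                          # choose n from nats
        m = state[1]
        return "children", (("mn", m, n) for n in count(1))
    return "result", [state[1], state[2]]        # both variables are bound


def match_all(initial):
    """Each step reduces only the head of every node and keeps the tails."""
    nodes = [iter([initial])]
    while nodes:
        next_nodes = []
        for node in nodes:
            head = next(node, None)
            if head is None:                     # exhausted node: drop it
                continue
            kind, payload = expand(head)
            if kind == "result":
                yield payload
            else:
                next_nodes.append(payload)       # left child: generated states
            next_nodes.append(node)              # right child: tail of parent
        nodes = next_nodes


print(list(islice(match_all(("start",)), 8)))
# [[1, 1], [1, 2], [2, 1], [1, 3], [2, 2], [3, 1], [1, 4], [2, 3]]
```

Because only the head of each node is reduced per step, no infinite row is ever expanded eagerly, and the list of nodes grows by at most a factor of two per step. A plain depth-first search on the same model would never leave m = 1.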

6 User Defined Matchers

This section explains how to define matchers.

6.1 Matcher for Unordered Pairs

We explain how the unordered-pair matcher shown in Section 4.4 works. unordered-pair is defined as a function that takes and returns a matcher to specify how to pattern-match against the elements of a pair. The matcher expression takes matcher clauses. A matcher clause is a triple of a primitive-pattern pattern, a next-matcher expression, and primitive-data-match clauses. The formal syntax of the matcher expression is found in Figure 4 in Section 7.

unordered-pair has two matcher clauses. The primitive-pattern pattern of the first matcher clause is <pair $ $>. This matcher clause defines the interpretation of the pair pattern. pair takes two pattern holes $. This means that the first and second arguments of a pair pattern are interpreted by the matchers specified by the next-matcher expression. In this example, since the next-matcher expression is [a a], both arguments of pair are pattern-matched using the matcher given by a. The primitive-data-match clause of the first matcher clause is {[<Pair $x $y> {[x y] [y x]}]}. <Pair $x $y> is pattern-matched with a target datum such as <Pair 2 5>, and $x and $y are matched with 2 and 5, respectively. The primitive-data-match clause then returns {[2 5] [5 2]}. A primitive-data-match clause returns a collection of next-targets. This means that the patterns “,5” and $x are matched with the targets 2 and 5, or 5 and 2, respectively, using the integer matcher in the next step. Pattern matching of primitive-data patterns is similar to pattern matching against algebraic data types in ordinary languages. As a result, the first matcher clause works in the matching function as follows.

matchFunction [<pair $x $y> (unordered-pair integer) <Pair 2 5>] =
  { {[$x integer 2] [$y integer 5]}
    {[$x integer 5] [$y integer 2]} }

The second matcher clause is rather simple; this matcher clause simply converts the matcher of the matching atom to the something matcher.
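As a rough illustration of the first matcher clause, the Python sketch below models the primitive-data-match step as a function that returns both orderings of a pair as next-targets. The function names and the use of Python tuples for pairs are assumptions made for this example only.

```python
def unordered_pair_next_targets(target):
    """Primitive-data-match step: a pair (x, y) yields the next-targets
    [x y] and [y x]."""
    x, y = target
    return [(x, y), (y, x)]


def match_value_then_variable(value, target):
    """Match a pair pattern whose first argument is the value pattern ,value
    and whose second argument is a pattern variable; return every binding."""
    return [second
            for first, second in unordered_pair_next_targets(target)
            if first == value]          # the integer matcher checks equality


print(unordered_pair_next_targets((2, 5)))    # [(2, 5), (5, 2)]
print(match_value_then_variable(5, (2, 5)))   # [2]
```

The matcher itself knows nothing about the pattern's arguments; it only proposes candidate next-targets, and the matchers named in the next-matcher expression decide which candidates survive.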

6.2 Case Study: Matcher for Multisets

As an example of how we can implement matchers for user-defined non-free data types, we show the definition of multiset matcher. We can define it simply by using the list matcher. multiset is defined as a function that takes and returns a matcher.

(define $multiset
  (lambda [$a]
    (matcher
      {[<nil> [] {[{} {[]}] [_ {}]}]
       [<cons $ $> [a (multiset a)]
        {[$tgt (match-all tgt (list a)
                 [<join $hs <cons $x $ts>> [x (append hs ts)]])]}]
       [,$val []
        {[$tgt (match [val tgt] [(list a) (multiset a)]
                 {[[<nil> <nil>] {[]}]
                  [[<cons $x $xs> <cons ,x ,xs>] {[]}]
                  [[_ _] {}]})]}]
       [$ [something] {[$tgt {tgt}]}]})))

The multiset matcher has four matcher clauses. The first matcher clause handles the nil pattern, and it checks if the target is an empty collection. The second matcher clause handles the cons pattern. The match-all expression is effectively used to destruct a collection in the primitive-data-match clause. Because the join pattern in the list matcher enumerates all possible splitting pairs of the given list, match-all lists all possible consing pairs of the target expression. The third matcher clause handles value patterns. “,$val” is a value-pattern pattern that matches with a value pattern. This matcher clause checks if the content of a value pattern (bound to val) is equal to the target (bound to tgt) as multisets. Note that the definition involves recursion on the multiset matcher itself. The fourth matcher clause is identical to the corresponding clause of unordered-pair and integer.
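The second and third matcher clauses can be mimicked in Python as follows. This is a hedged sketch, not a translation of the Egison definition: collections are modelled as Python lists, and the helper names list_join_splits, multiset_cons, and multiset_equal are hypothetical.

```python
def list_join_splits(xs):
    """All ways to split xs into a prefix and a suffix (the join pattern)."""
    return [(xs[:i], xs[i:]) for i in range(len(xs) + 1)]


def multiset_cons(xs):
    """Next-targets of the cons pattern: each element with the remainder,
    obtained from a split hs ++ (x : ts) as [x, hs ++ ts]."""
    return [(ts[0], hs + ts[1:]) for hs, ts in list_join_splits(xs) if ts]


def multiset_equal(val, tgt):
    """Value-pattern check: val is destructed as a list, tgt as a multiset."""
    if not val and not tgt:
        return True
    if not val or not tgt:
        return False
    x, xs = val[0], val[1:]
    # Recursion on the multiset decomposition, as in the third clause.
    return any(y == x and multiset_equal(xs, rest)
               for y, rest in multiset_cons(tgt))


print(multiset_cons([2, 8, 2]))  # [(2, [8, 2]), (8, [2, 2]), (2, [2, 8])]
print(multiset_equal([2, 8, 2], [8, 2, 2]))  # True
```

The three decompositions printed for [2, 8, 2] are the same three next-targets that the matching function produces for the cons pattern in Section 5.1.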

M  ::= x | c | (lambda [$x ···] M) | (M M ···)
     | [M ···] | {M ···} | <C M ···>
     | (match-all M M [p M])
     | (match M M {[p M] ···})
     | something | (matcher {φ ···})
p  ::= _ | $x | ,M | <C p ···>
φ  ::= [pp M {[dp M] ···}]
pp ::= $ | ,$x | <C pp ···>
dp ::= $x | <C dp ···>

Fig. 4. Syntax of our language

6.3 Value-pattern Patterns and Predicate Patterns

We explain the generality of our extensible pattern-matching framework by taking examples from the integer matcher. We show how value patterns and predicate patterns can be implemented in our language.

(define $integer
  (matcher
    {[,$n [] {[$tgt (if (eq? tgt n) {[]} {})]}]
     [<lt ,$n> [] {[$tgt (if (lt? tgt n) {[]} {})]}]
     [$ [something] {[$tgt {tgt}]}]}))

Value patterns are patterns that successfully match if the target expression is equal to some fixed value. For example, ,5 only matches with 5 if we use the integer matcher. The first matcher clause in the above definition exists to implement this. The primitive-pattern pattern of this clause is ,$n, which is a value-pattern pattern that matches with value patterns. The next-matcher expression is an empty tuple because no pattern hole $ is contained. If the target expression tgt and the content of the value pattern n are equal, the primitive-data-match clause returns a collection consisting of an empty tuple, which denotes success. Otherwise, it returns an empty collection, which denotes failure.

Predicate patterns are patterns that succeed if the target expression satisfies some fixed predicate. Predicate patterns are usually implemented as a built-in feature, such as pattern guards, in ordinary programming languages. Interestingly, we can implement this on top of our pattern-matching framework. The second matcher clause defines a predicate pattern which succeeds if the target integer is less than the content of the value pattern n. A technique similar to the first clause is used.
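The success and failure convention of these clauses (a collection containing one empty tuple for success, an empty collection for failure) can be illustrated with a small Python sketch. The tuple encodings ("val", n), ("lt", n), and ("var", name) for the three kinds of patterns are assumptions made for this example.

```python
def integer_matcher(pattern, target):
    """Return the collection of next-targets for one matching atom."""
    kind = pattern[0]
    if kind == "val":                       # value pattern: ,n
        return [()] if target == pattern[1] else []
    if kind == "lt":                        # predicate pattern: less than n
        return [()] if target < pattern[1] else []
    # Any other pattern: hand the target over to the `something` matcher.
    return [(target,)]


print(integer_matcher(("val", 5), 5))    # [()]      success, nothing left to match
print(integer_matcher(("val", 5), 3))    # []        failure
print(integer_matcher(("lt", 10), 3))    # [()]
print(integer_matcher(("var", "x"), 7))  # [(7,)]    one next-target for `something`
```

Since failure is just the empty collection, a predicate needs no dedicated language construct: any boolean test can be turned into a matcher clause in this way.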

7 Formal Semantics

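In this section, we present the syntax and big-step semantics of our language (Fig. 4 and 5). We use metavariables x, y, z, ..., M, N, L, ..., v, ..., and p, ... for variables, expressions, values, and patterns, respectively. In Fig. 4, c denotes a constant expression and C denotes a data constructor name. X ··· in Fig. 4 means a finite list of X. The syntax of our language is similar to that of the Lisp language. As explained in Section 4.1, [M ···], {M ···}, and <C M ···> denote tuples, collections, and data constructions. All formal arguments are decorated with the dollar mark. φ, pp and dp are called matcher clauses, primitive-pattern patterns and primitive-data patterns, respectively.

Evaluation of matcher and match-all:

\[
\Gamma,\ \texttt{(matcher}\ [pp_i\ M_i\ [dp_{ij}\ N_{ij}]_j]_i\texttt{)} \Downarrow ([pp_i, M_i, [dp_{ij}, N_{ij}]_j]_i,\ \Gamma)
\]

\[
\dfrac{\Gamma, M \Downarrow v \qquad \Gamma, N \Downarrow m \qquad [[[p \sim_m v], \Gamma, \emptyset]] \Rrightarrow [\Delta_i]_i \qquad \Gamma \cup \Delta_i, L \Downarrow v_i\ (\forall i)}
      {\Gamma,\ \texttt{(match-all}\ M\ N\ [p\ L]\texttt{)} \Downarrow [v_i]_i}
\]

Matching states:

\[
\epsilon \to \mathrm{none}, \mathrm{none}, \mathrm{none}
\qquad
(\epsilon, \Gamma, \Delta) : \vec{s} \to (\mathrm{some}\ \Delta), \mathrm{none}, (\mathrm{some}\ \vec{s})
\]

\[
\dfrac{p \sim^{\Gamma \cup \Delta}_{m} v \Downarrow [\vec{a}_i]_i, \Delta'}
      {((p \sim_m v) : \vec{a}, \Gamma, \Delta) : \vec{s} \to \mathrm{none},\ (\mathrm{some}\ [\vec{a}_i + \vec{a}, \Gamma, \Delta \cup \Delta']_i),\ (\mathrm{some}\ \vec{s})}
\]

\[
\dfrac{\vec{s}_i \to \mathrm{opt}\ \Gamma_i,\ \mathrm{opt}\ \vec{s}'_i,\ \mathrm{opt}\ \vec{s}''_i \quad (\forall i)}
      {[\vec{s}_i]_i \Rightarrow \sum_i (\mathrm{opt}\ \Gamma_i),\ \sum_i (\mathrm{opt}\ \vec{s}'_i) + \sum_i (\mathrm{opt}\ \vec{s}''_i)}
\]

\[
\epsilon \Rrightarrow \epsilon
\qquad
\dfrac{\vec{\vec{s}} \Rightarrow \vec{\Gamma}, \vec{\vec{s}}' \qquad \vec{\vec{s}}' \Rrightarrow \vec{\Delta}}
      {\vec{\vec{s}} \Rrightarrow \vec{\Gamma} + \vec{\Delta}}
\]

Matching atoms:

\[
\texttt{\$}x \sim^{\Gamma}_{\texttt{something}} v \Downarrow [\epsilon], \{x \mapsto v\}
\qquad
\dfrac{pp \approx^{\Gamma} p \Downarrow \mathrm{fail} \qquad p \sim^{\Gamma}_{(\vec{\phi}, \Delta)} v \Downarrow \vec{\vec{a}}, \Gamma'}
      {p \sim^{\Gamma}_{((pp, M, \vec{\sigma}) : \vec{\phi}, \Delta)} v \Downarrow \vec{\vec{a}}, \Gamma'}
\]

\[
\dfrac{pp \approx^{\Gamma} p \Downarrow [p'_i]_i, \Delta' \qquad dp \approx v \Downarrow \mathrm{fail} \qquad p \sim^{\Gamma}_{((pp, M, \vec{\sigma}) : \vec{\phi}, \Delta)} v \Downarrow \vec{\vec{a}}, \Gamma'}
      {p \sim^{\Gamma}_{((pp, M, (dp, N) : \vec{\sigma}) : \vec{\phi}, \Delta)} v \Downarrow \vec{\vec{a}}, \Gamma'}
\]

\[
\dfrac{pp \approx^{\Gamma} p \Downarrow [p'_j]_j, \Delta' \qquad dp \approx v \Downarrow \Delta'' \qquad \Delta \cup \Delta' \cup \Delta'', N \Downarrow [[v'_{ij}]_j]_i \qquad \Delta, M \Downarrow [m'_j]_j}
      {p \sim^{\Gamma}_{((pp, M, (dp, N) : \vec{\sigma}) : \vec{\phi}, \Delta)} v \Downarrow [[p'_j \sim_{m'_j} v'_{ij}]_j]_i,\ \emptyset}
\]

Pattern matching on patterns:

\[
\texttt{\$} \approx^{\Gamma} p \Downarrow [p], \emptyset
\qquad
\dfrac{\Gamma, M \Downarrow v}
      {\texttt{,\$}y \approx^{\Gamma} \texttt{,}M \Downarrow \epsilon, \{y \mapsto v\}}
\qquad
\dfrac{pp_i \approx^{\Gamma} p_i \Downarrow \vec{p}_i, \Gamma_i \quad (\forall i)}
      {\langle C\ pp_1 \cdots pp_n \rangle \approx^{\Gamma} \langle C\ p_1 \cdots p_n \rangle \Downarrow \sum_i \vec{p}_i,\ \bigcup_i \Gamma_i}
\]

Pattern matching on data:

\[
\texttt{\$}z \approx v \Downarrow \{z \mapsto v\}
\qquad
\dfrac{dp_i \approx v_i \Downarrow \Gamma_i \quad (\forall i)}
      {\langle C\ dp_1 \cdots dp_n \rangle \approx \langle C\ v_1 \cdots v_n \rangle \Downarrow \bigcup_i \Gamma_i}
\]

Fig. 5. Formal semantics of our language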

In Fig. 5, the following notations are used. We write \([a_i]_i\) to mean a list \([a_1, a_2, \ldots]\). Similarly, \([[a_{ij}]_j]_i\) denotes \([[a_{11}, a_{12}, \ldots], [a_{21}, a_{22}, \ldots], \ldots]\), but each list in the list may have a different length. A list of tuples \([(a_1, b_1), (a_2, b_2), \ldots]\) is often written as \([a_i, b_i]_i\) instead of \([(a_i, b_i)]_i\) for short. Concatenation of lists \(l_1\) and \(l_2\) is denoted by \(l_1 + l_2\), and \(a : l\) denotes \([a] + l\) (adding at the front). \(\epsilon\) denotes the empty list. In general, \(\vec{x}\) for some metavariable x is a metavariable denoting a list of what x denotes. However, we do not mean by \(\vec{x}_i\) the i-th element of \(\vec{x}\); if we write \([\vec{x}_i]_i\), we mean a list of lists of x. Γ, ∆, ... denote variable assignments, i.e., partial functions from variables to values.

Our language has some special primitive types: matching atoms a, ..., matching states s, ..., primitive-data-match clauses σ, ..., and matchers m, .... A matching atom consists of a pattern p, a matcher m, and a value v, and is written as \(p \sim_m v\). A matching state is a tuple of a list of matching atoms and two variable assignments. A primitive-data-match clause is a tuple of a primitive-data pattern and an expression, and a matcher clause is a tuple of a primitive-pattern pattern, an expression, and a list of primitive-data-match clauses. A matcher is a pair containing a list of matcher clauses and a variable assignment. Note that matchers, matching states, etc. are all values.

Evaluation results of expressions are specified by the judgment \(\Gamma, e \Downarrow \vec{v}\), which denotes that, given a variable assignment Γ and an expression e, one gets a list of values \(\vec{v}\). In the figure, we only show the definition of evaluation of matcher and match-all expressions (other cases are inductively defined as usual). The definition of match-all relies on another type of judgment \(\vec{\vec{s}} \Rrightarrow \vec{\Gamma}\), which defines how the search space is examined. \(\Rrightarrow\) is inductively defined using \(\vec{\vec{s}} \Rightarrow \vec{\Gamma}, \vec{\vec{s}}'\), which is again defined using \(\vec{s} \to \mathrm{opt}\ \Gamma, \mathrm{opt}\ \vec{s}', \mathrm{opt}\ \vec{s}''\). In their definitions, we introduced notations for (meta-level) option types. none and some x are the constructors of the option type, and opt x is a metavariable for an optional value (possibly) containing what the metavariable x denotes. \(\sum_i (\mathrm{opt}\ x_i)\) creates a list by collecting all the valid (non-none) \(x_i\), preserving the order.

\(p \sim^{\Gamma}_{m} v \Downarrow \vec{\vec{a}}, \Delta\) is a 6-ary relation. One reads it as “performing pattern matching on v against p using the matcher m under the variable assignment Γ yields the result ∆ and the continuation \(\vec{\vec{a}}\).” The result is a variable assignment because it is a result of unifications. \(\vec{\vec{a}}\) being empty means that the pattern matching failed. If \([\epsilon]\) is returned as \(\vec{\vec{a}}\), it means that the pattern matching succeeded and no further search is necessary. As explained in Section 6, one needs to pattern-match patterns and data to define user-defined matchers. Their formal definitions are given by the judgments \(pp \approx^{\Gamma} p \Downarrow \vec{p}', \Delta\) and \(dp \approx v \Downarrow \Gamma\).

8 Conclusion

We designed a user-customizable, efficient non-linear pattern-matching system by regarding pattern matching as reduction of matching states that have a stack of matching atoms and intermediate results of pattern matching. This system enables us to concisely describe a wide range of programs, especially when non-free data types are involved. For example, our pattern-matching architecture is useful for implementing a computer algebra system because it enables us to directly pattern-match mathematical expressions and rewrite them.

The major significance of our pattern-matching system is that it greatly improves the expressivity of the programming language by allowing programmers to freely extend the process of pattern matching by themselves. Furthermore, in the general cases, use of the match expression will be as readable as that in other general-purpose programming languages. Although we consider that the current syntax of matcher definition is already clean enough, we leave further refinement of the syntax of our surface language as future work.

We believe the direct and concise representation of algorithms enables us to implement really new things that go beyond what was considered practical before. We hope our work will lead to breakthroughs in various fields.

Acknowledgments

We thank Ryo Tanaka, Takahisa Watanabe, Kentaro Honda, Takuya Kuwahara, Mayuko Kori, and Akira Kawata for their important contributions to implement the interpreter. We thank Michal J. Gajda, Yi Dai, Hiromi Hirano, Kimio Kuramitsu, and Pierre Imai for their helpful feedback on the earlier versions of the paper. We thank Masami Hagiya, Yoshihiko Kakutani, Yoichi Hirai, Ibuki Kawamata, Takahiro Kubota, Takasuke Nakamura, Yasunori Harada, Ikuo Takeuchi, Yukihiro Matsumoto, Hidehiko Masuhara, and Yasuhiro Yamada for constructive discussion and their continuing encouragement.

References

1. Attributes::attnf - Wolfram Language Documentation. http://reference.wolfram.com/language/ref/message/Attributes/attnf.html, [Online; accessed 14-June-2018]
2. Introduction to Patterns - Wolfram Language Documentation. http://reference.wolfram.com/language/tutorial/Introduction-Patterns.html, [Online; accessed 14-June-2018]
3. Orderless - Wolfram Language Documentation. http://reference.wolfram.com/language/ref/Orderless.html, [Online; accessed 14-June-2018]
4. PAKCS. https://www.informatik.uni-kiel.de/~pakcs/, [Online; accessed 14-June-2018]
5. ViewPatterns - GHC. https://ghc.haskell.org/trac/ghc/wiki/ViewPatterns, [Online; accessed 14-June-2018]
6. The Egison Programming Language. https://www.egison.org (2011), [Online; accessed 14-June-2018]
7. Egison Mathematics Notebook. https://www.egison.org/math (2016), [Online; accessed 14-June-2018]
8. Antoy, S.: Programming with narrowing: A tutorial. Journal of Symbolic Computation 45(5) (2010)
9. Antoy, S.: Constructor-based conditional narrowing. In: Proceedings of the 3rd ACM SIGPLAN international conference on Principles and practice of declarative programming (2001)
10. Antoy, S., Hanus, M.: Functional logic programming. Communications of the ACM 53(4) (2010)
11. Braßel, B., Hanus, M., Peemöller, B., Reck, F.: KiCS2: A new compiler from Curry to Haskell. In: International Workshop on Functional and Constraint Logic Programming (2011)
12. Egi, S.: Non-linear Pattern Matching against Non-free Data Types with Lexical Scoping. arXiv preprint arXiv:1407.0729 (2014)
13. Egi, S.: Scalar and Tensor Parameters for Importing Tensor Index Notation including Einstein Summation Notation. The Scheme and Functional Programming Workshop (2017)
14. Egi, S.: Scalar and Tensor Parameters for Importing the Notation in Differential Geometry into Programming. arXiv preprint arXiv:1804.03140 (2018)
15. Erwig, M.: Active patterns. Implementation of Functional Languages (1996)
16. Erwig, M.: Functional programming with graphs. In: ACM SIGPLAN Notices. vol. 32 (1997)
17. Fischer, S., Kiselyov, O., Shan, C.c.: Purely functional lazy non-deterministic programming. In: ACM SIGPLAN Notices. vol. 44 (2009)
18. Hanus, M.: Multi-paradigm declarative languages. In: International Conference on Logic Programming (2007)
19. Hinze, R., Paterson, R.: Finger trees: a simple general-purpose data structure. Journal of Functional Programming 16(2) (2006)
20. Krebber, M.: Non-linear Associative-Commutative Many-to-One Pattern Matching with Sequence Variables. arXiv preprint arXiv:1705.00907 (2017)
21. McBride, F., Morrison, D., Pengelly, R.: A symbol manipulation system. Machine Intelligence 5 (1969)
22. Okasaki, C.: Views for Standard ML. In: SIGPLAN Workshop on ML (1998)
23. Syme, D., Neverov, G., Margetson, J.: Extensible pattern matching via a lightweight language extension. In: ACM SIGPLAN Notices. vol. 42 (2007)
24. Thompson, S.: Lawful functions and program verification in Miranda. Science of Computer Programming 13(2-3) (1990)
25. Thompson, S.: Laws in Miranda. In: Proceedings of the 1986 ACM conference on LISP and functional programming (1986)
26. Tullsen, M.: First Class Patterns. Practical Aspects of Declarative Languages (2000)
27. Turner, D.: Miranda: A non-strict functional language with polymorphic types. In: Functional programming languages and computer architecture (1985)
28. Wadler, P.: Views: A way for pattern matching to cohabit with data abstraction. In: Proceedings of the 14th ACM SIGACT-SIGPLAN symposium on Principles of programming languages (1987)