
Universität Stuttgart

Efficient Neighbor-Finding on Space-Filling Curves

Bachelor Thesis

Author: David Holzmüller*
Degree: B. Sc. Mathematik
Examiner: Prof. Dr. Dominik Göddeke, IANS
Supervisor: Prof. Dr. Miriam Mehl, IPVS

October 18, 2017

arXiv:1710.06384v3 [cs.CG] 2 Nov 2019

*E-Mail: [email protected], where the ü in the last name has to be replaced by ue.

Abstract

Space-filling curves (SFC, also known as FASS-curves) are a useful tool in scientific computing and other areas of computer science to sequentialize multidimensional grids in a cache-efficient and parallelization-friendly way for storage in an array. Many algorithms, for example grid-based numerical PDE solvers, have to access all neighbor cells of each grid cell during a grid traversal. While the array indices of neighbors can be stored in a cell, they still have to be computed for initialization or when the grid is adaptively refined. A fast neighbor-finding algorithm can thus significantly improve the runtime of computations on multidimensional grids. In this thesis, we show how neighbors on many regular grids ordered by space-filling curves can be found in an average-case time complexity of 풪(1). In general, this assumes that the local orientation (i.e. a variable of a describing grammar) of the SFC inside the grid cell is known in advance, which can be efficiently realized during traversals. Supported SFCs include Hilbert, Peano and Sierpinski curves in arbitrary dimensions. We assume that integer arithmetic operations can be performed in 풪(1), i.e. independent of the size of the integer. We do not deal with the case of adaptively refined grids here. However, it appears that a generalization of the algorithm to suitable adaptive grids is possible. To formulate the neighbor-finding algorithm and prove its correctness and runtime properties, a modeling framework is introduced. This framework extends the idea of vertex-labeling to a description using grammars and matrices. With the sfcpp library, we provide a C++ implementation to render SFCs generated by such models and to automatically compute all lookup tables needed for the neighbor-finding algorithm. Furthermore, optimized neighbor-finding implementations for various SFCs are included, for which we provide runtime measurements.

Contents

1 Introduction 4
1.1 Related Work 4
1.2 Contribution 5
1.3 Outline 6

2 Overview and Motivating Examples 8
2.1 Modeling the Hilbert Curve 8
2.2 Neighbor-Finding Algorithm 12
2.3 Other Space-Filling Curve Models 14

3 Modeling Space-Filling Curves 21
3.1 Trees, Matrices and States 21
3.2 Algebraic Neighbors 27
3.3 Geometric Neighbors and Regularity Conditions 30
3.4 Algorithmic Verification 38
3.5 Tree Isomorphisms 41

4 Algorithms 45
4.1 General Neighbor-Finding Algorithm 45
4.2 Other Algorithms 48
4.3 Exploiting Symmetry 49

5 Optimizations 52
5.1 Curve-Independent Optimizations 52
5.2 Curve-Dependent Optimizations 54

6 Implementation 56
6.1 Implementation for General Models 56
6.2 Other Code 57
6.3 Experimental Results 60

7 Conclusion 64

1 Introduction

Many algorithms, especially in scientific computing, operate on data stored in multidimensional grids. Often, these grids are stored in a one-dimensional array, using a sequential order on the grid cells that defines which array index corresponds to which grid cell. For example, matrices are usually stored row-major (i.e. row-by-row) or column-major (i.e. column-by-column). Yet, such a naive sequential order may be suboptimal when certain geometric operations are performed on the grid. Space-filling curves (SFC), originally known as a topological curiosity, yield sequential orders on grids that are more cache-efficient and parallelization-friendly [2].

For example, solving a partial differential equation on a grid numerically often involves combining values of a grid cell with values of its geometrical neighbors. In a sequential order induced by a SFC, a much higher percentage of pairs of geometrical neighbors have array indices that are "close", which means that geometrical neighbors are often loaded into cache memory together. Due to the access speed difference between RAM and cache memory, this locality property can have a significant impact on the algorithm's overall efficiency.

Another consequence of the better locality of SFCs is that when splitting the array into 푝 similarly-sized partitions, a high percentage of pairs of geometrical neighbors lie in the same part. If these 푝 parts of the array are processed in 푝 parallel threads, each pair of neighboring cells lying in different parts of the array leads to communication between the corresponding threads or increases the number of cells that are stored in multiple threads. This is another reason why SFCs can help to reduce computational effort [2].

SFC-induced sequential orders are usually recursively defined, starting with a single-cell grid and in each step splitting each cell into 푏 ≥ 2 subcells and arranging them in a certain order.
A disadvantage of such a SFC-induced sequential order over simpler sequential orders is therefore that the algorithms needed to deal with it are often more complicated. Care must be taken that the reduction of data access and transfer time described above outweighs the extra runtime introduced by more complicated algorithms. In particular, this thesis deals with the problem of finding the array index of a geometrical neighbor cell and related problems.

1.1 Related Work

An obvious way to find neighbors in a regular grid is to convert an array index to a coordinate vector, change one coordinate to obtain the neighbor's coordinate vector, and convert the latter back into an array index. Many methods for these conversions have been suggested. The book by Bader [2] provides a good overview of such algorithms. Bartholdi and Goldsman [4] introduced the technique of vertex-labeling. This technique is similar to the modeling approach introduced in this thesis. Moreover, it can be used to convert between array indices and coordinate vectors for a broad range of SFCs. Bartholdi and Goldsman also proposed a neighbor-finding algorithm for various SFCs based on vertex-labeling that uses a coordinate vector of an interior point of each edge of a cell. All neighbor-finding algorithms of this type have a time complexity equal to that of the corresponding conversion algorithm, meaning that they will scale linearly with the refinement level of the curve and thus logarithmically

with the number of grid points (assuming that the grid is regular, i.e. equally refined everywhere). For the popular Morton order, Schrack [5] published a neighbor-finding algorithm with runtime 풪(1), i.e. independent of the refinement level of the grid, assuming constant-time arithmetic integer operations. Aizawa and Tanaka [1] investigated the Morton order on adaptive 2D grids. They suggested storing level differences to neighbors inside the grid cells and presented an adaptation of Schrack's algorithm to find "location codes" of neighbors of equal or bigger size. If the used data structure allows efficient access to data based on its location code, then such neighbors can be accessed efficiently. In stack-based approaches (see Chapter 14 in Bader [2]), data is placed at the vertices of the grid cells, i.e. the grid points, and stored in stacks. Neighboring vertices are automatically found when a cell containing both neighbors is visited. Weinzierl and Mehl [6] presented a stack-based framework for the Peano curve. This approach does not work for all space-filling curves. For example, the Hilbert curves in dimensions 푑 ≥ 3 are not suited for such a traversal [2]. In addition, it can be difficult to embed an existing simulation software in such a framework.

1.2 Contribution

In this thesis, we present an algorithm to find neighbors in space-filling curves with an average-case time complexity of 풪(1) under certain assumptions:

∙ The space-filling curve is generated by a certain recursive pattern that satisfies some regularity conditions. These conditions are precisely given in Section 3. Supported SFCs include Morton, Hilbert, Peano, Sierpinski and variants thereof, some of which are shown in Section 2.3.

∙ Arithmetic operations on (non-negative) integers can be performed in 풪(1) time, i.e. independent of the number of bits of the integer. For SFCs like Morton, Hilbert and Sierpinski, where 푏 (the number of subcells that a cell is split into in the recursive construction) is a power of two, only the following arithmetic operations are needed:

– Increment, decrement and comparison on integers not greater than the level of the given grid cell. Even if no constant-time integer arithmetic is assumed, this only needs 풪(log(level)) time.

– Bit operations that operate on a constant number of bits.

∙ The grid is regular. This usually means that every grid cell has the same size. We expect that an extension of the algorithm may also work on adaptive grids.

∙ The “state”, i.e. the local pattern of the curve inside the given grid cell, is known. During a traversal of the grid, this state can be computed very efficiently with 풪(1) overhead per grid cell. If random access is intended, the state can be computed in 풪(푙) = 풪(log 푛) time, where 푙 is the level of the grid cell in the tree and 푛 is the total number of grid cells. This state can then be used to find all neighbors and their neighbors, if necessary. For SFCs like Hilbert, Peano and Sierpinski, the neighbors can still be found, in unknown order, if the state of the grid cell is not known. This is explained in Section 4.3.

In this thesis, we develop a rigorous framework for modeling space-filling curves based on vertex-labeling [4]. This framework is used to give a general formulation of the neighbor-finding algorithm and to prove its correctness and runtime. Furthermore, it is employed to precisely formulate properties of SFCs, for example conditions for the algorithm to operate correctly or conditions needed to perform certain optimizations in its implementation. We intend to make all aspects of this modeling algorithmically computable. Together with this thesis, we provide a library called sfcpp,¹ where some of these aspects have been implemented, especially:

∙ neighbor-finding algorithms for the Peano curve in arbitrary dimensions, the Hilbert curve in 2D and 3D and the Sierpinski curve in 2D,

∙ models of many SFCs,

∙ LaTeX rendering (visualization) of SFCs specified by such a model,

∙ generation of lookup tables needed for the algorithms given here, and

∙ checking of some properties of a specified model.

A particular version of the neighbor-finding algorithm for the Peano curve, an implementation and visualization code have been developed as a term paper.² This thesis generalizes the algorithm, integrates a further optimized version of the implementation into a bigger library and partially replaces the visualization code with a more general implementation.

1.3 Outline

In Section 2.1, we will give an overview of the modeling framework, using the 2D Hilbert curve as an example. In Section 2.2, the idea behind the neighbor-finding algorithm is presented using some examples. Section 2.3 presents other curves to motivate the introduction of a general model. Section 3.1 then introduces formal definitions for models and related objects such as trees. Section 3.2 establishes a definition of neighborship on trees using lookup tables. This definition is used later to formulate the neighbor-finding algorithm. Section 3.3 formally introduces a geometric definition of neighborship for geometric models and proves that this definition is, under certain regularity conditions, equivalent to the neighborship definition using lookup tables. Section 3.4 then shows how to verify these regularity conditions algorithmically. Section 3.5 introduces isomorphisms between trees. A general formulation of our neighbor-finding algorithm is given in Section 4.1 and its correctness and runtime are proven. Section 4.2 shows how to use the neighbor-finding

¹ https://github.com/dholzmueller/sfcpp
² David Holzmüller: Raumfüllende Kurven, 2016.

algorithm in a traversal. Based on tree isomorphisms, a framework for computing states and coordinates is given. In some cases, calling this algorithm with a wrong state still yields the correct set of neighbors, just in permuted order. Section 4.3 gives formal criteria to verify these cases. Section 5 shows optimizations that can be employed to make the algorithms faster in practice. Some of these optimizations work for all SFCs while some only apply to special curves. Section 6 presents implementations of algorithms, mainly those implemented in the sfcpp library. It also provides runtime measurements for different algorithms that compute neighbors or states of grid cells. Finally, Section 7 points out open research questions.

2 Overview and Motivating Examples

In this section, we want to give an overview over many concepts and algorithms that are introduced formally in the following sections. First, we examine the two-dimensional Hilbert curve as a popular example of a space-filling curve. After discussing various related mathematical tools, we explain the ideas behind the neighbor-finding algorithm presented in Section 4.1. Finally, we present more examples of space-filling curves, focusing on aspects that are important for the design of algorithms that should be applicable to a large class of SFCs. The choice of presented SFCs and their presentation is inspired by Bader [2].

2.1 Modeling the Hilbert Curve

The definition of the term "space-filling curve" is not uniform across the literature ([2], [4]). A space-filling curve (SFC) can be defined as a continuous function 푓 : 퐼 ⊆ R → R^푑, 푑 ≥ 2, for which 푓(퐼) has positive Jordan content. In 1890, Peano showed the first example of such a space-filling curve [2]. SFCs relevant for grid sequentialization are usually specified as the uniform limit of a sequence of piecewise affine curves, where each curve in the sequence is a recursive refinement of the previous curve in the sequence. From an algorithmic perspective, the "finite approximations" in this sequence are the main object of study. In the following, they will also be referred to as space-filling curves, although they do not satisfy the definition above.

Figure 1: Construction of the 2D Hilbert curve. (a) Level 0, (b) Level 1, (c) Level 2.

Figure 1 shows the first three levels of the 2D Hilbert curve. From one level to the next level, each square is subdivided into four equal subsquares. These subsquares are then put into a certain order depending on the state 푠 ∈ {퐻, 퐴, 퐵, 푅} of the square. We can interpret these squares on different levels as nodes (vertices) of a tree: The square 푄(푟) at level 0 is the root of the tree. The subsquares 퐶(푄, 푗), 푗 ∈ ℬ := {0, 1, 2, 3}, of a square 푄, which are located in the next level, form the four children of the square. Conversely, the square 푄 is called the parent of its four children. Each square except the root 푄(푟) has a parent. The tree structure of the squares is also visualized in Figure 2. A tree consisting of 푑-dimensional hypercubes, where each hypercube is partitioned into 푘^푑 equal subcubes by its children, is called a 푘^푑-tree. For the 2D Hilbert curve, we

have 푘 = 2 and 푑 = 2, and each square is partitioned into 푏 := 푘^푑 = 2² = 4 equal subsquares. A 2²-tree is also called a quadtree [2]. To store all squares of a given level sequentially in memory, we take the order generated by the space-filling curve, see Figure 3.

Figure 2: Tree representation of the squares from the 2D Hilbert curve.

Figure 3: Cell enumeration in the 2D Hilbert curve in base 10. (a) Level 0, (b) Level 1, (c) Level 2.

Figure 4 shows the numbers in base 푏 = 4 instead of base 10. This exposes a simple pattern: The base-4 digits of a square in the tree consist of the base-4 digits of its parent square, followed by a digit corresponding to its position inside its parent square. The tree representation using squares is useful to draw SFCs. However, when using SFCs to create efficient algorithms, representing a square using coordinates is computationally slow. Moreover, depending on the application, a square might not be given by its coordinates. Instead, we may represent a square in the tree by a tuple (푙, 푗). Here, 푙 ∈ N₀ is the level of the square and 푗 ∈ {0, . . . , 푏^푙 − 1} is its position along the curve. The first three levels of the corresponding tree (which will be called a level-position 푏-index-tree later) are shown in Figure 5. The base-4 pattern for the position of a node described above can be directly turned into arithmetic formulas: Consider a node 푣 = (푙, 푗).

∙ The parent 푃(푣) of 푣 is given by 푃(푣) = (푙 − 1, 푗 div 4), where 푗 div 4 := ⌊푗/4⌋ is

Figure 4: Cell enumeration in the 2D Hilbert curve in base 4. (a) Level 0, (b) Level 1, (c) Level 2.

(0, 0)

(1, 0) (1, 1) (1, 2) (1, 3)

(2, 0) (2, 1) (2, 2) (2, 3) (2, 4) (2, 5) (2, 6) (2, 7) (2, 8) (2, 9) (2, 10) (2, 11) (2, 12) (2, 13) (2, 14) (2, 15)

Figure 5: Levels 0, 1, 2 of the level-position 푏-index-tree as defined in Definition 3.2.

the integer part of 푗/4. The operation 푗 ↦→ 푗 div 4 eliminates the last digit of 푗 in its base-4 representation.

∙ The 푖-th child 퐶(푣, 푖) is given by 퐶(푣, 푖) := (푙 + 1, 4푗 + 푖), where 푗 ↦→ 4푗 + 푖 inserts the digit 푖 at the end of the base-4 representation of 푗.

∙ The index 퐼(푣) of 푣 inside its parent is given by 퐼(푣) = 푗 mod 4, i.e. the last digit of the base-4 representation of 푗.

Now that we know how to work with levels and positions of nodes in the 2D Hilbert curve, we can turn to the states of the nodes. As noted above, the state 푆(푣) of a node 푣 determines the geometric arrangement of the children of the node. For all SFCs examined here, the state 푆(퐶(푣, 푖)) of the 푖-th child of 푣 is uniquely determined by 푖 and by the state of 푣. That is, we can specify a function 푆^푐 such that 푆(퐶(푣, 푖)) = 푆^푐(푆(푣), 푖). For the 2D Hilbert curve, this function is specified in Table 1.

Table 1: Values of the function 푆푐 for the 2D Hilbert curve.

푆^푐(푠, 푗)   푗 = 0   푗 = 1   푗 = 2   푗 = 3
푠 = 퐻       퐴       퐻       퐻       퐵
푠 = 퐴       퐻       퐴       퐴       푅
푠 = 푅       퐵       푅       푅       퐴
푠 = 퐵       푅       퐵       퐵       퐻

All of the entries except for 푠 = 푅 can be extracted from Figure 1. The state 푠 = 푅 occurs in level 2 of the curve for the first time. Thus, we need level 3 to read off the states of the children of a state-푅 square. Figure 6 shows the construction up to level 3 using a finer Hilbert curve. Note that the function 푆^푐 may also be expressed as a grammar with production rules 퐻 → 퐴퐻퐻퐵, 퐴 → 퐻퐴퐴푅, 푅 → 퐵푅푅퐴 and 퐵 → 푅퐵퐵퐻. We will use the function-based approach since it is better suited for this thesis.

Figure 6: Construction of the 2D Hilbert curve. (a) Level 1, (b) Level 2, (c) Level 3. The curve is drawn one level finer than the grid.

Using the function 푆^푐, we can compute the state 푆(푣) = 푆^푐(푆(푃(푣)), 퐼(푣)) of a node 푣 using only the state 푆(푃(푣)) of its parent and the index 퐼(푣) of 푣 inside its parent. This can be efficiently implemented using a lookup table. Can we also compute the state of a parent from the state of its child? For the 2D Hilbert curve, the answer is yes. The function 푠 ↦→ 푆^푐(푠, 푖) is invertible for all 푖 since all columns of Table 1 contain each state exactly once. Thus, we can find an inverse function 푠 ↦→ 푆^푝(푠, 푖) for all 푖. This means that 푆(푃(푣)) = 푆^푝(푆(푣), 퐼(푣)), since

푆(푣) = 푆(퐶(푃(푣), 퐼(푣))) = 푆^푐(푆(푃(푣)), 퐼(푣)).

Here, we used the fact that a node 푣 is by definition the 퐼(푣)-th child of its own parent.

Remark 2.1. In the case of the 2D Hilbert curve, the function 푆^푝 is equal to the function 푆^푐. From an algebraic perspective, the functions 휎_푗 : 풮 → 풮, 푠 ↦→ 푆^푐(푠, 푗) are invertible and thus permutations on the set 풮 = {퐻, 퐴, 퐵, 푅} of states. For the 2D Hilbert curve, each 휎_푗 is equal to its own inverse, which is why we find that 푆^푝 = 푆^푐. The permutation subgroup ⟨휎_푗 | 푗 ∈ {0, 1, 2, 3}⟩ spanned by these permutations is isomorphic to the Klein four-group Z₂ × Z₂. If we examine the Hilbert curve more closely, this is no surprise: The curves in squares of different states can be seen as mirrored versions of each other, where the mirroring corresponds to an orthogonal transformation given by one of the four matrices

( 1 0 )   ( 0 1 )   ( −1  0 )   (  0 −1 )
( 0 1 ) , ( 1 0 ) , (  0 −1 ) , ( −1  0 ) .

A change of basis to a new basis 푏₁ = (1, 1)^⊤, 푏₂ = (1, −1)^⊤ yields the matrices

( 1 0 )   ( 1  0 )   ( −1  0 )   ( −1 0 )
( 0 1 ) , ( 0 −1 ) , (  0 −1 ) , (  0 1 ) ,

which clearly form a subgroup of O₂(R) isomorphic to the Klein four-group. In an implementation, this suggests encoding the states in a way such that this group operation can be applied efficiently. For example, one might use integers with bitwise XOR as the group operation.

2.2 Neighbor-Finding Algorithm

In this section, we will outline the ideas behind our neighbor-finding algorithm using the example of the Hilbert curve. In a nutshell, the neighbor-finding algorithm takes the shortest path to the neighbor in the corresponding tree: It ascends 푘 levels and then descends 푘 levels, where 푘 is as small as possible and depends on the tree node. An investigation of the average size of 푘 with a standard geometric series argument yields that for all SFCs presented here, this average is bounded independently of the level of the nodes considered. In Section 2.1, we have seen how to efficiently deal with levels, positions and states. Yet, there is one modeling decision remaining: When modeling general space-filling curves, one may or may not include the state as part of a node. In the modeling below, we decided to include the state in the node. Thus, we will from now on deal with nodes of the form (푙, 푗, 푠) instead of (푙, 푗). A disadvantage of this modeling is that for specifying a node, one already has to know its state. However, given the level and the position of a node inside the curve, the state can be computed as shown in Section 4.2. It can also be efficiently tracked during traversals. Furthermore, our neighbor-finding algorithm assumes that the state of a node is known. The following examples show an application of our algorithm and different cases that have to be handled. A complete presentation of Algorithm 1 will be given in Section 4.1.

Example 2.2. Let 푣 be the node at level 2 with position 1 in the curve. In Figure 6, we can see that its state is 퐴, since the second square on the curve at level 2 (i.e. the square with position 1) has state 퐴. Thus, 푣 = (2, 1, 퐴). Now assume that we want to find the upper neighbor. The algorithm proceeds as follows:

(1) Check if the level of 푣 is zero, in which case there would be no neighbor. This is not the case.

(2) Compute the parent 푝_푣 = (2 − 1, 1 div 4, 푆^푝(퐴, 1 mod 4)) = (1, 0, 퐴) of 푣. Also, compute the index 푗_푣 = 1 mod 4 = 1 of 푣.

(3) Every node with state 퐴 in the Hilbert curve "looks essentially the same",³ so we can consult a suitable lookup table to find out whether in a node of state 퐴 = 푆(푝_푣), the child with index 퐼(푣) = 1 has an upper neighbor inside 푝_푣. Indeed, the lookup table will yield that the child with index 2 is an upper neighbor of 푣.

(4) Compute the child with index 2: 푤 := 퐶(푝_푣, 2) = (1 + 1, 4 · 0 + 2, 푆^푐(퐴, 2)) = (2, 2, 퐴).

(5) Return 푤 as the upper neighbor of 푣.

Finding the left neighbor (2, 0, 퐻) of 푣 is completely analogous.

³This corresponds to the pre-regularity condition (P1) from Definition 3.26.

Example 2.3. Now suppose that instead of the upper neighbor of 푣 = (2, 1, 퐴), we are interested in the right neighbor of 푣. This time, the algorithm does the following:

(1) The level of 푣 is not zero, continue.

(2) Compute the parent 푝_푣 = (1, 0, 퐴) of 푣.

(3) Ask the lookup table whether the child with index 1 inside a node with state 퐴 has a right neighbor. This time, the answer is no.

(4) If there is a right neighbor of 푣, its parent must be a right neighbor of 푝_푣.⁴ Apply the algorithm recursively to find a right neighbor of 푝_푣. The recursive call will proceed similarly to Example 2.2 and yield the neighbor 푝_푤 = (1, 3, 퐵).

(5) In the Hilbert curve, whenever a node of state 퐵 is a right neighbor of a node of state 퐴, the geometric arrangement of these two "looks essentially the same".⁵ Thus, we can consult a second lookup table using 퐼(푣) = 1, 푆(푝_푣) = 퐴, 푆(푝_푤) = 퐵 and the facet "right" to find the index of the right neighbor of 푣 inside 푝_푤. The lookup table will yield the index 푖 = 2 as desired.

(6) Now, compute the neighbor 푤 = 퐶(푝_푤, 푖) = (1 + 1, 4 · 3 + 2, 푆^푐(퐵, 2)) = (2, 14, 퐵) of 푣.

(7) Return 푤 as the right neighbor of 푣.

Example 2.4. Finally, suppose that we are seeking the lower neighbor of 푣 = (2, 1, 퐴). The algorithm proceeds as follows:

(1) The level of 푣 is not zero, continue.

(2) Compute the parent 푝_푣 = (1, 0, 퐴) of 푣.

(3) The first lookup table yields that there is no lower neighbor of 푣 inside 푝_푣.

(4) Call the algorithm recursively to find a lower neighbor of 푝_푣:

(i) The level of 푝_푣 is not zero, continue.

(ii) Compute the parent 푝_{푝_푣} = (0, 0, 퐻) of 푝_푣.

(iii) The first lookup table yields that there is no lower neighbor of 푝_푣 inside 푝_{푝_푣}.

(iv) Call the algorithm recursively to find a lower neighbor of 푝_{푝_푣}:

1. The level of 푝_{푝_푣} is zero. Thus, 푝_{푝_푣} cannot have a neighbor. Return a value indicating that there is no lower neighbor of 푝_{푝_푣}.

(v) Since there is no lower neighbor of 푝_{푝_푣}, return a value indicating that there is no lower neighbor of 푝_푣.

(5) Since there is no lower neighbor of 푝_푣, return a value indicating that there is no lower neighbor of 푣.

⁴This corresponds to the regularity conditions (R2) and (R3) of Definition 3.32.
⁵This corresponds to the regularity condition (R1) of Definition 3.32.

In these three examples, it becomes evident that the runtime of the algorithm depends on how far one has to walk up and down the tree to find the neighbor. This concept is later introduced as the neighbor-depth of 푣. The upper-neighbor-depth and the left-neighbor-depth of 푣 in these examples is 1, since there is only one call of the algorithm in Example 2.2. The right-neighbor-depth of 푣 is 2, since the algorithm needs to call itself recursively once. The lower-neighbor-depth of 푣 is 3, since the algorithm calls itself recursively twice. The neighbor-depth allows us to characterize the runtime of the algorithm. In Theorem 4.1, we will show that the runtime of the algorithm is in 풪(neighbor-depth) (assuming constant-time arithmetic operations). Using a standard geometric series argument, we will then show that the average neighbor-depth at level 푙 is 풪(1) for all "normal" SFCs.

Remark 2.5. In Remark 2.1, it was mentioned that in the 2D Hilbert curve, nodes with states 퐴, 퐵 or 푅 are merely mirrored versions of nodes with state 퐻. If the algorithm is given the wrong state of a node, it will behave as if the whole curve was mirrored. For example, given the node 푣 = (2, 1, 퐻) instead of (2, 1, 퐴), requesting the upper neighbor of 푣 will return the right neighbor of (2, 1, 퐴) (also with an incorrect state), since the upper side in the coordinate system of a state-퐻 node corresponds to the right side in the coordinate system of a state-퐴 node. A formal criterion for this behavior is given in Section 4.3.

2.3 Other Space-Filling Curve Models

In this section, we want to take a look at some other curves, with a focus on aspects in which they differ from the 2D Hilbert curve. The general space-filling curve model from Section 3 and the formulation of the algorithm are motivated by many of the aspects presented here. Furthermore, special properties of certain SFCs presented here allow for special optimizations of the algorithm.

Figure 7: Construction of the 2D Peano curve. (a) Level 0, (b) Level 1, (c) Level 2.

Figure 7 shows the construction of the 2D Peano curve. This is the first example of a SFC, given by Peano in 1890 [2]. Note that while the Hilbert curve uses a 2²-tree, this Peano curve uses a 3²-tree, i.e. 푘 = 3. There are also versions of the Peano curve for 푘 = 2푚 + 1, 푚 ∈ N≥1. The Peano curve satisfies most of the properties that the Hilbert curve has. In contrast to the Hilbert curve, it also satisfies the palindrome property: At a boundary of two neighboring squares, the neighboring children are ordered inversely to each other [2]. For example, the node 푤 := (1, 1, 푅) is the right neighbor of 푣 := (1, 0, 푃) and the children (2, 9, 푅), (2, 14, 푆), (2, 15, 푅) of 푤 are the right neighbors of the children (2, 8, 푃), (2, 3, 푄), (2, 2, 푃) of 푣. Here, the palindrome property implies that the indices of these child neighbors inside their parents add up to 3² − 1 = 9 − 1 = 8: Indeed, we find that 0 + 8 = 5 + 3 = 6 + 2 = 8. To see that this does not work for the Hilbert curve, consider e.g. the upper neighbor 푤 = (1, 1, 퐻) of 푣 = (1, 0, 퐴) in Figure 6. A disadvantage of the 2D Peano curve (and other Peano curves) is that the branching factor 푏 = 3² is not a power of two. This means that the arithmetic operations used for computing parents and children of nodes, like division by 푏, multiplication with 푏 or computing modulo 푏, cannot be represented using bit operations, so they are potentially slower. Another disadvantage is that because of the bigger branching factor 푏, adaptive refinement can be controlled less precisely.

Figure 8: The 2D Hilbert curve on an adaptive grid.

Figure 8 shows the 2D Hilbert curve on an adaptively refined grid, i.e. where not all of the nodes have the same level. In an adaptive grid, the question of specifying a neighbor or even finding these neighbors becomes more difficult. These questions are outside of the scope of this thesis. However, we think that it is possible to generalize the results presented here for adaptive grids, if the grid cells are efficiently accessible via their node encoding 푣 = (푙, 푗, 푠).

Figure 9: Construction of the 2D Morton curve. (a) Level 0, (b) Level 1, (c) Level 2.

A particularly simple curve is the Morton curve shown in Figure 9, also known as Morton code, Lebesgue curve, Morton order or Z-order curve. It is built from only one pattern and is popular due to the simplicity and efficiency of the algorithms that are used to deal with it. For example, there is a 풪(1) neighbor-finding algorithm with a small constant by Schrack [5]. Its drawback is that it is not a SFC in the sense that it does not converge to a continuous curve. This means that it has worse locality properties than the other curves presented here [2]. While the algorithms shown in this thesis can be applied to the Morton curve, already existing algorithms perform better on it.

Figure 10: Construction of the 2D Sierpinski curve. (a) Level 0, (b) Level 1, (c) Level 2, (d) Level 3, (e) Level 4, (f) Level 5.

Up to now, we have only considered SFCs on 푘푑-trees. The Sierpinski curve, also known as the Sierpinski-Knopp curve, uses simplices (i.e. triangles in the 2D case) instead of hypercubes. This means that it can be applied to suitable triangular meshes. In the construction in Figure 10, all triangles have the same state 퐺, but in different rotated and mirrored versions. To be more precise, not the state is rotated, but their local coordinate system, a concept that will be explained in Example 3.9. We will refer to such a model as a local model, since the state of a node does not reveal the node’s orientation in the global coordinate system. This is a choice of model that is particularly comfortable for the 2D Sierpinski curve: A global model like those of the Hilbert and Peano curves above would require eight different states since the right-angled edges of the triangles can point to eight different directions. In Figure 10, it can be seen that the Sierpinski curve also satisfies the palindrome property. For local models, the neighbor-finding algorithm has to be extended: Consider for example the nodes 푣 = (2, 1, 퐺) and 푤 = (2, 2, 퐺) in the Sierpinski curve. These nodes share a common facet, the hypotenuse of their respective triangles. Their parents (1, 0, 퐺) and (1, 1, 퐺) also share a common facet, but it is not their hypotenuse. This means

that when finding neighbors of parents in the recursion step of the neighbor-finding algorithm, the neighbor at a possibly different facet has to be found. This problem can be resolved using a third lookup table, which will be introduced in Section 3.2.

(a) Level 0 (b) Level 1 (c) Level 2

Figure 11: Local Construction of the 2D Hilbert curve, where the curve is one level finer than the grid.

Figure 11 shows such a local model for the Hilbert curve. There are four possible orientations of a square, corresponding to the four states 퐻, 퐴, 퐵, 푅 in the global model and to the four group elements from Remark 2.1. We used 퐺 instead of 퐻 since 퐻 is a symmetric letter.

Remark 2.6. In the local model of the Hilbert curve, an assumption from Example 2.3 is violated: the nodes 푣 = (2, 0, 퐺) and 푤 = (2, 5, 퐺) have the same state and their right neighbors also have the same state, but they have a different orientation. This means that the second lookup table is not well-defined. There are two ways to circumvent this: extending the lookup tables or choosing a global model instead. To avoid further complications, we choose the second approach in this thesis. This problem will be revisited in Remark 3.33.

(a) Level 0 (b) Level 1 (c) Level 2

Figure 12: Local Construction of the 2D Peano curve.

The Peano curve can also be modeled locally. This is shown in Figure 12. In Remark 2.6, we have seen that the local model of the Hilbert curve poses some problems. The palindrome property of the Peano curve implies that the orientation of a square and the neighbor direction already uniquely determine the state of a neighbor in that direction. Thus, these problems do not occur for the Peano curve.

(a) Level 0 (b) Level 1 (c) Level 2 (d) Level 3

Figure 13: Construction of the 2D 훽Ω-curve, where the curve is one level finer than the grid.

The 훽Ω-curve shown in Figure 13 exhibits some new properties. The construction intends to create a closed curve, which comes at some expense: first of all, it needs more states than the curves presented above, even though the model from Figure 13 already uses a partially local approach. More importantly, this is an example of a curve where not all of the functions 푠 ↦→ 푆푐(푠, 푗) are invertible. For example, Figure 13 shows that the parent of a node with state 퐵 and index 1 at level 2 may have state 퐵 or state 퐷. Another related phenomenon is that the root state 푅 only occurs at the root node. Figure 14 shows a semi-local model of the Gosper curve, also called the Gosper Flowsnake, using two states. Its tree nodes correspond to hexagons. However, the space filled by the Gosper curve is not a hexagon but a fractal called the Gosper island [2]. This is possible because in the Gosper curve, the children of a hexagon do not form a partition of the hexagon; instead, they cover a region neither including nor included in the parent hexagon. The Gosper curve is thus not well-suited for use on adaptive grids. The level-2 curve together with the boundaries of the hexagons at levels 0, 1, 2 is shown in Figure 15. In this model of the Gosper curve, the functions 푠 ↦→ 푆푐(푠, 푗) are not all invertible.


(a) Level 0 (b) Level 1 (c) Level 2

Figure 14: Construction of the 2D Gosper curve.

Figure 15: Levels 0, 1, 2 of the 2D Gosper curve superimposed.

It seems geometrically clear that, using a suitable model, the algorithms presented here also work for the Gosper curve. However, the proof of correctness given here is partially restricted to curves where all children are contained in the parent polygon and thus does not apply to the Gosper curve. Moreover, the semi-local model of the Gosper curve given here suffers from the problem described in Remark 2.6. Up to now, we have only considered two-dimensional SFCs. For many applications, SFCs in dimensions 푑 ≥ 3 are needed. Curves like the Hilbert, Peano, Morton and Sierpinski curves allow higher-dimensional variants. The Hilbert and the Peano curves have many variants in each dimension 푑 ≥ 2, and for the Hilbert curve, there is no “canonical” variant even in 3D. Figure 16 shows two levels of a 3D Hilbert curve. The 3D Hilbert curve yields a sequential order on a 2^3-tree, also called an octree. The book by Bader [2] presents many more SFCs. We suppose that the methods presented here work for all of these SFCs except possibly for the H-index, a modification of the Sierpinski curve that works on squares and is not presented here.

(a) Level 1 (b) Level 2

Figure 16: Construction of the 3D Hilbert curve.

3 Modeling Space-Filling Curves

In this section, we show a general way of modeling common space-filling curves geometrically. This model essentially expresses the vertex-labeling method (see Bartholdi and Goldsman [4]) using matrices. While the original vertex-labeling method only used local models of SFCs, we also allow multiple states in our model so that it applies to more SFCs. We then show how to abstract the geometric information that is contained in this model to an integer-based model that can be efficiently processed using algorithms to find neighbors of tree cells. An implementation of this transformation process is available in the sfcpp library. We intend to make every aspect of this modeling approach algorithmically verifiable. However, at the time of writing, only some aspects are implemented.

3.1 Trees, Matrices and States

In the following, we will derive a modeling framework for SFCs following the example of the Hilbert curve in two dimensions as examined in Section 2. When a level of the Hilbert curve is refined to obtain the next level, each square is subdivided into four smaller squares. The squares in the Hilbert curve form a tree: each square contains four subsquares in the next level, and these can be viewed as children in a tree. The children can be ordered in the order the curve passes through them in the corresponding level. We can define the notion of such a tree abstractly:

Definition 3.1. Let 푏 ∈ N with 푏 ≥ 2. We define the set ℬ := {0, . . . , 푏 − 1} of indices. A 푏-index-tree is a tuple (풱, 퐶, 푃, ℓ, 푟) that satisfies the following conditions:

∙ 풱 is a set and 푟 ∈ 풱. Elements of 풱 are called (tree) nodes (or vertices). 푟 is called the root node.

∙ 퐶 : 풱 × ℬ → 풱 ∖ {푟} is a bijective function. This implies that 풱 is infinite. 퐶(푣, 푗) is called the 푗-th child of 푣.

∙ 푃 : 풱 ∖ {푟} → 풱 is a function satisfying 푃(퐶(푣, 푗)) = 푣 for all 푣 ∈ 풱, 푗 ∈ ℬ. 푃(푣) is called the parent of 푣.

∙ ℓ : 풱 → N0 is a function satisfying ℓ(푟) = 0 and ℓ(푣) = ℓ(푃(푣)) + 1 for all 푣 ∈ 풱 ∖ {푟}. This immediately yields ℓ(퐶(푣, 푗)) = ℓ(푣) + 1 and ℓ(푣) = 0 ⇒ 푣 = 푟. The level function ℓ thus allows proofs by induction on the level.

For 푣 ∈ 풱 ∖ {푟}, we then define 퐼(푣) to be the unique index satisfying 퐶(푃(푣), 퐼(푣)) = 푣. The root node 푟 and the functions 푃, ℓ, 퐼 are all uniquely determined by 퐶 if they exist. Because of this, we will also write (풱, 퐶) instead of (풱, 퐶, 푃, ℓ, 푟) for brevity.

In the case of the 2D Hilbert curve, the branching factor 푏 of the tree is 푏 = 4 = 2². In the more general setting of a 푘^푑-tree, we would have 푏 = 푘^푑. As shown in Section 2, a square arising in the construction of the Hilbert curve can be described by its level in the tree and its position along the curve. Moreover, the tree operations on such nodes can be efficiently realized because they correspond to simple manipulations in the base-푏 representation of the position. The following definition makes these operations explicit and yields our first example of a 푏-index-tree:

Definition 3.2. The level-position 푏-index-tree 푇 = (풱, 퐶, 푃, ℓ, 푟) is defined by

풱 := {(푙, 푗) ∈ N0² | 푗 < 푏^푙}
푟 := (0, 0)
푃((푙, 푗)) := (푙 − 1, 푗 div 푏)  (if 푙 > 0)
퐶((푙, 푗), 푖) := (푙 + 1, 푗푏 + 푖)
ℓ((푙, 푗)) := 푙 .

Here, 푗 div 푏 := ⌊푗/푏⌋ denotes integer division. It then follows that 퐼((푙, 푗)) = 푗 mod 푏 if 푙 > 0.
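The operations of Definition 3.2 are just base-푏 digit manipulations on the position. A minimal sketch (illustrative only, not taken from the sfcpp library):

```python
# A minimal sketch (not taken from the sfcpp library) of the level-position
# b-index-tree from Definition 3.2: nodes are (level, position) pairs and all
# tree operations are O(1) integer manipulations in base b.
B = 4  # branching factor, e.g. b = 4 for the 2D Hilbert curve

def child(node, i):
    """C((l, j), i) = (l + 1, j*b + i)."""
    l, j = node
    assert 0 <= i < B
    return (l + 1, j * B + i)

def parent(node):
    """P((l, j)) = (l - 1, j div b), defined for l > 0."""
    l, j = node
    assert l > 0
    return (l - 1, j // B)

def level(node):
    """The level function: l((l, j)) = l."""
    return node[0]

def index(node):
    """I((l, j)) = j mod b, the child index within the parent."""
    l, j = node
    assert l > 0
    return j % B

assert child(child((0, 0), 3), 1) == (2, 13)
assert parent((2, 13)) == (1, 3)
```

If 푏 is a power of two, the divisions and remainders above become bit shifts and masks, matching the remark on constant-time integer arithmetic.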

The first three levels of a level-position 푏-index-tree are shown in Figure 5. A level-position 푏-index-tree contains no information about the underlying SFC except for the branching factor 푏. Our next step is to derive a geometric model and include it as additional information in the tree. This geometric 푏-index-tree will simplify geometric definitions, which can then be transferred to a computationally efficient tree like the level-position 푏-index-tree using an isomorphism. Isomorphisms will be covered in Section 3.5. As we have seen in Section 2.3, not every SFC is constructed using squares. We will therefore use polytopes for modeling. Polytopes are a generalization of convex polygons to arbitrary dimension. The following definition introduces some basic concepts of convex geometry together with customary conventions for matrices. For a detailed introduction, see Ziegler [7]. We follow the convention from Ziegler [7] and use the term “polytope” only for convex polytopes. From now on, we assume that 푑 ≥ 2 denotes the dimension of the space we are working in.

Definition 3.3. Let 푘 ∈ N≥1 and 푞1, . . . , 푞푘 ∈ R^푑.

(a) The set

conv({푞1, . . . , 푞푘}) := { ∑_{푖=1}^{푘} 푡푖푞푖 | 푡푖 ∈ [0, 1], ∑_{푖=1}^{푘} 푡푖 = 1 } ⊆ R^푑

is called the convex hull of 푞1, . . . , 푞푘. It is the smallest convex set containing 푞1, . . . , 푞푘. For the matrix

푄 := (푞1 · · · 푞푘)

containing 푞1, . . . , 푞푘 as columns, let conv(푄) := conv({푞1, . . . , 푞푘}). Furthermore, we set conv(∅) := ∅.

(b) A set 푃 ⊆ R^푑 is called a polytope if 푃 = conv(푆) for some finite set 푆 ⊆ R^푑. In particular, ∅ = conv(∅) is also a polytope. The dimension dim(푃) is defined as the dimension of the affine subspace aff(푃) of R^푑 spanned by the polytope 푃.

(c) Let 푃 be a polytope. A subset 퐹 ⊆ 푃 is called a face of 푃 if there exist 푐 ∈ R^푑 and 푐0 ∈ R such that 푐^⊤푥 ≤ 푐0 for all 푥 ∈ 푃 and 퐹 = 푃 ∩ {푥 ∈ R^푑 | 푐^⊤푥 = 푐0}. Again, we define dim(퐹) := dim(aff(퐹)). Let 퐹 be a face of 푃. If dim(퐹) = dim(푃) − 1, 퐹 is called a facet of 푃. If 퐹 = {푣} for some 푣 ∈ R^푑, i.e. dim(퐹) = 0, 푣 is called a vertex of 푃. We denote by vert(푃) the set of all vertices of 푃.

We can now define the initial level-0 square [0, 1]² of the Hilbert curve construction via [0, 1]² = conv(푄(푟)), where 푄(푟) contains the four vertices of the square:

푄(푟) := ⎛ 0 1 0 1 ⎞
        ⎝ 0 0 1 1 ⎠ .     (3.1)

There are different choices for 푄(푟) generating the same polytope, since permuting the vertices or inserting more points of the square as columns does not affect conv(푄(푟)). To include our geometric knowledge about the Hilbert curve into the tree nodes, we want to find such a matrix 푄 for every node. The columns of 푄 should contain the vertices of the associated square. Then, the associated square is given by conv(푄). For example, a corresponding matrix 퐿 of the lower left subsquare of the matrix 푄(푟) from Equation (3.1) can be obtained using matrix multiplication:

퐿 = 푄(푟)푀 = (푙1 · · · 푙푘) , where 푀 := ⎛ 1 1/2 1/2 1/4 ⎞
                                       ⎜ 0 1/2 0   1/4 ⎟
                                       ⎜ 0 0   1/2 1/4 ⎟
                                       ⎝ 0 0   0   1/4 ⎠  and 푙푗 = ∑_{푖=1}^{4} 푀푖푗 푞푖 .

Similarly, 푄(푟) · 푀 · 푀 is the lower left subsquare of the lower left subsquare of 푄(푟), and so forth. It is important to note that all columns of 푀 sum to one. As we will see in Lemma 3.5, this makes the operation 푄 ↦→ 푄 · 푀 commute with translations. First, we will introduce some helpful terminology:

Definition 3.4. Let 1푛 := (1, . . . , 1)^⊤ ∈ R^푛.

(a) A matrix 푀 ∈ R^(푚×푛) is called a transition matrix if 1푚^⊤푀 = 1푛^⊤, that is, the entries of each column sum to one.

푑×푚 푑 ⊤ (b) A matrix 퐵 R is called offset matrix if there exists 푏 R with 퐵 = 푏1푚, ∈ ∈ that is, all columns of 퐵 are identical.

Lemma 3.5. Let 퐵 = 푏1푚^⊤ ∈ R^(푑×푚) be an offset matrix.

(a) For any matrix 퐴 ∈ R^(푑×푑), 퐴퐵 = (퐴푏)1푚^⊤ is an offset matrix.

(b) For any transition matrix 푀 ∈ R^(푚×푛), 퐵푀 = 푏1푚^⊤푀 = 푏1푛^⊤ is an offset matrix with the same columns as 퐵. In particular, if 푚 = 푛, we have 퐵푀 = 퐵.

(c) The product of two transition matrices is a transition matrix.

Proof. Parts (a) and (b) follow directly from the definitions. It remains to prove (c): let 푀 ∈ R^(푚×푛) and 푁 ∈ R^(푛×푝) be transition matrices; then 1푚^⊤(푀푁) = (1푚^⊤푀)푁 = 1푛^⊤푁 = 1푝^⊤.
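The statements of Lemma 3.5 are easy to check numerically. The following sketch (illustrative only; the matrices are chosen arbitrarily) verifies parts (b) and (c) with exact rational arithmetic:

```python
# Numerical sanity check of Lemma 3.5 (illustrative; matrices chosen
# arbitrarily): a product of transition matrices is again a transition
# matrix, and right-multiplying an offset matrix by a square transition
# matrix leaves it unchanged.
from fractions import Fraction as F

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def is_transition(M):
    """Each column sums to one (Definition 3.4 (a))."""
    return all(sum(M[i][j] for i in range(len(M))) == 1
               for j in range(len(M[0])))

M = [[F(1), F(1, 2)], [F(0), F(1, 2)]]
N = [[F(1, 4), F(0)], [F(3, 4), F(1)]]
assert is_transition(M) and is_transition(N)
assert is_transition(matmul(M, N))       # Lemma 3.5 (c)

b = [F(2), F(-1)]
B_off = [[b[0], b[0]], [b[1], b[1]]]     # offset matrix b * 1_2^T
assert matmul(B_off, M) == B_off         # Lemma 3.5 (b) with m = n
```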

We can (by definition of conv(푄)) choose all entries of 푀 to be non-negative if and only if conv(퐿) = conv(푄푀) ⊆ conv(푄), since elements of conv(푄) are exactly those vectors that are a convex combination of columns of 푄. However, we need to allow negative entries if we want to be able to model the Gosper curve, see Figure 14. In some cases, using negative values might also simplify modeling. Once we have found transition matrices for the four subsquares of a square in the Hilbert curve construction, we have to decide how to order them in order to obtain the Hilbert curve. As explained in Section 2, the order of the subsquares is determined by the state 푠 ∈ 풮 := {퐻, 퐴, 퐵, 푅} of a square. Furthermore, the function 푆푐 allows us to track the state of an element when descending in a tree. The following definition introduces such a model abstractly:

Definition 3.6. Let 풮 be a finite set, 푠푟 ∈ 풮 and let 푆푐 : 풮 × ℬ → 풮. The tuple S = (풮, 푆푐, 푠푟) is called a 푏-state-system if every state 푠 ∈ 풮 is reachable from the root state 푠푟, i.e. there exist 푛 ∈ N0, 푠0, . . . , 푠푛 ∈ 풮 and 푗1, . . . , 푗푛 ∈ ℬ such that 푠0 = 푠푟, 푠푛 = 푠 and 푠푘 = 푆푐(푠푘−1, 푗푘) for 푘 ∈ {1, . . . , 푛}.

Remark 3.7. The reachability condition in Definition 3.6 can be easily satisfied by restricting the set of states to all reachable states. It is useful to simplify some definitions but will not be used throughout most of this thesis.

Together with appropriate transition matrices and a matrix for the root square, a 푏-state-system can be used to specify a SFC:

Definition 3.8. Let S be a 푏-state-system as in Definition 3.6. A tuple (S, 푀, 푄(푟)) is called a 푏-specification if there exist 푛푠 ∈ N≥1 for 푠 ∈ 풮 (specifying the number of vertices of a polytope of state 푠) such that

(a) 푄(푟) ∈ R^(푑×푛_{푠푟}) and

(b) 푀 : 풮 × ℬ → ⋃_{푛,푚∈N≥1} R^(푛×푚) is a function such that for all 푠 ∈ 풮 and 푗 ∈ ℬ, 푀^{푠,푗} ∈ R^(푛푠×푛_{푆푐(푠,푗)}) is a transition matrix.

Example 3.9. In this example, we define a local 푏-specification of the Hilbert curve, where 푏 = 4. This specification corresponds to the local construction of the Hilbert curve shown in Figure 11. Our model only has one state 퐺, i.e. 풮 = {퐺}. The child state function is thus 푆푐 : 풮 × ℬ → 풮, (퐺, 푗) ↦→ 퐺. The root state is 푠푟 = 퐺. The root point matrix 푄(푟) is the same as in Equation (3.1). The only interesting part of this specification are the transition matrices. In the global model, the point matrix belonging to the lower left subsquare of 푄(푟) is

퐿 = ⎛ 0 1/2 0 1/2 ⎞
    ⎝ 0 0 1/2 1/2 ⎠ .

Because 퐿 has the same state as 푄(푟) in the local model, we have to mirror its coordinate system to modify the arrangement of its subsquares. In this case, we can do this by exchanging the order of the second and third vertex and setting

퐿 = ⎛ 0 0 1/2 1/2 ⎞
    ⎝ 0 1/2 0 1/2 ⎠ .

Figure 17 shows for each square the order of its vertices in the matrix from the local specification. We can achieve this reordering using correspondingly permuted transition matrices:

푀^{퐺,0} = ⎛ 1 1/2 1/2 1/4 ⎞
          ⎜ 0 0   1/2 1/4 ⎟
          ⎜ 0 1/2 0   1/4 ⎟
          ⎝ 0 0   0   1/4 ⎠

푀^{퐺,1} = ⎛ 1/2 1/4 0 0   ⎞
          ⎜ 0   1/4 0 0   ⎟
          ⎜ 1/2 1/4 1 1/2 ⎟
          ⎝ 0   1/4 0 1/2 ⎠

푀^{퐺,2} = ⎛ 1/4 0   0   0 ⎞
          ⎜ 1/4 1/2 0   0 ⎟
          ⎜ 1/4 0   1/2 0 ⎟
          ⎝ 1/4 1/2 1/2 1 ⎠

푀^{퐺,3} = ⎛ 0   0 1/4 1/2 ⎞
          ⎜ 1/2 1 1/4 1/2 ⎟
          ⎜ 0   0 1/4 0   ⎟
          ⎝ 1/2 0 1/4 0   ⎠
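As a sanity check, multiplying the root point matrix 푄(푟) from Equation (3.1) by the transition matrix 푀^{퐺,0} of the local Hilbert specification must reproduce the reordered point matrix 퐿 of the lower left subsquare from Example 3.9. A sketch (illustrative, not the sfcpp implementation):

```python
# Illustrative check: L = Q^(r) * M^{G,0} reproduces the reordered lower left
# subsquare of Example 3.9, using exact rational arithmetic.
from fractions import Fraction as F

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

Qr = [[0, 1, 0, 1],
      [0, 0, 1, 1]]                      # Equation (3.1)

M_G0 = [[F(1), F(1, 2), F(1, 2), F(1, 4)],
        [F(0), F(0),    F(1, 2), F(1, 4)],
        [F(0), F(1, 2), F(0),    F(1, 4)],
        [F(0), F(0),    F(0),    F(1, 4)]]

L = matmul(Qr, M_G0)
# columns: (0,0), (0,1/2), (1/2,0), (1/2,1/2) -- the lower left subsquare
assert L == [[0, 0, F(1, 2), F(1, 2)],
             [0, F(1, 2), 0, F(1, 2)]]
```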

As stated in Remark 2.6, the local specification of the Hilbert curve cannot be directly used for the algorithms presented here. In the following, we thus have to revert to global models of the Hilbert curve.

(a) Level 0 (b) Level 1 (c) Level 2

Figure 17: Vertex column indices in the local model of the 2D Hilbert curve.

Now that we have seen how to specify a space-filling curve, we want to turn this specification into a tree. Since this specification includes states, the tree should also include states. We will thus first augment the definition of a 푏-index-tree to include states:

Definition 3.10. Let (풱, 퐶) be a 푏-index-tree as in Definition 3.1, let 풮 be a finite set and let 푆 : 풱 → 풮 be a function. Then, 푇푠 = (풱, 퐶, 풮, 푆) is called a state-푏-index-tree. For 푣 ∈ 풱, 푆(푣) is called the state of 푣.

Remark 3.11. A state-푏-index-tree can be obtained from a 푏-index-tree and a 푏-state-system by setting 푆(푟) := 푠푟 and then inductively defining

푆(퐶(푣, 푗)) := 푆푐(푆(푣), 푗) .

From a 푏-specification, we can now build a tree where each tree node comprises the following components:

∙ the level of the node,
∙ its position in the curve inside the given level,
∙ its state, and
∙ a point matrix specifying its corresponding polytope.

The corresponding model is introduced in the following definition.

Definition 3.12. Let (S, 푀, 푄(푟)) be a 푏-specification as in Definition 3.8. Out of a basic set

풱̃ := {(푙, 푗, 푠, 푄) | 푙, 푗 ∈ N0, 푠 ∈ 풮, 푄 ∈ R^(푑×푛푠)}

with root node 푟 := (0, 0, 푠푟, 푄(푟)) and child function

퐶̃ : 풱̃ × ℬ → 풱̃ , ((푙, 푗, 푠, 푄), 푖) ↦→ (푙 + 1, 푗푏 + 푖, 푆푐(푠, 푖), 푄 · 푀^{푠,푖}) ,

we construct our node set 풱 recursively:

풱0 := {푟}
풱_{푙+1} := {퐶̃(푣, 푗) | 푣 ∈ 풱푙, 푗 ∈ ℬ}
풱 := ⋃_{푙=0}^{∞} 풱푙 .

Then, we set our child function on 풱 to 퐶 := 퐶̃|_{풱×ℬ}. The restriction of the codomain to 풱 is valid by construction of 풱. For 푣 = (푙, 푗, 푠, 푄) ∈ 풱, define

푆(푣) := 푠 , 푄(푣) := 푄 .

Note that the function 푣 ↦→ 푄(푣) does not belong to the required functions for a state-푏-index-tree but will be frequently used for geometric purposes later. With these definitions, (풱, 퐶, 풮, 푆) is a state-푏-index-tree with root node 푟 and

ℓ((푙, 푗, 푠, 푄)) = 푙 ,
퐼((푙, 푗, 푠, 푄)) = 푗 mod 푏 (if 푙 > 0) .

We call this tree the geometric 푏-index-tree for the 푏-specification (S, 푀, 푄(푟)).

The geometric 푏-index-tree will be used to define neighborship in space-filling curves geometrically. It can also be implemented for rendering space-filling curves, which has been done to generate most of the figures in this thesis. Next, we want to introduce two trees that are convenient for computations because they contain information about the state of a node, but not its point matrix. The operations of the first tree are easier and more efficient to implement if the functions 푠 ↦→ 푆푐(푠, 푗) are invertible for each 푗 ∈ ℬ. Otherwise, the second tree may be used.

Definition 3.13. Let S be a 푏-state-system. Similar to Definition 3.12, we can define a state-푏-index-tree 푇 = (풱, 퐶, 풮, 푆) satisfying

풱 ⊆ N0 × N0 × 풮
푟 = (0, 0, 푠푟)
퐶((푙, 푗, 푠), 푖) = (푙 + 1, 푗푏 + 푖, 푆푐(푠, 푖))
ℓ((푙, 푗, 푠)) = 푙
퐼((푙, 푗, 푠)) = 푗 mod 푏 (if 푙 > 0)
푆((푙, 푗, 푠)) = 푠

for every (푙, 푗, 푠) ∈ 풱. We call 푇 the algebraic 푏-index-tree without history for S. If 푠 ↦→ 푆푐(푠, 푗) is invertible for every 푗 ∈ ℬ with inverse function 푠 ↦→ 푆푝(푠, 푗), we also have

푃((푙, 푗, 푠)) = (푙 − 1, 푗 div 푏, 푆푝(푠, 푗 mod 푏))

for all (푙, 푗, 푠) ∈ 풱 ∖ {푟}. In this case, assuming constant-time integer arithmetic operations, all functions of 푇 can be realized in 풪(1) time. If 푏 is a power of two, the operations on the position only operate on a constant number of bits.
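The operations of Definition 3.13 can be sketched as follows. The state system below is a toy example (not one of the thesis's curves), chosen so that each 푠 ↦→ 푆푐(푠, 푗) is invertible; for a real curve, 푆푐 and 푆푝 would be lookup tables derived from its 푏-specification:

```python
# Sketch (illustrative, not the sfcpp implementation) of the algebraic
# b-index-tree without history. All operations are O(1) given constant-time
# integer arithmetic.
B = 4

def Sc(s, j):
    """Toy child-state function: for fixed j, s -> Sc(s, j) is an involution."""
    return s ^ (j % 2)

def Sp(s, j):
    """Inverse of s -> Sc(s, j) for fixed j."""
    return s ^ (j % 2)

root = (0, 0, 0)  # (level l, position j, state s) with root state s_r = 0

def child(v, i):
    l, j, s = v
    return (l + 1, j * B + i, Sc(s, i))

def parent(v):
    l, j, s = v
    assert l > 0
    return (l - 1, j // B, Sp(s, j % B))

v = child(child(root, 1), 3)
assert parent(v) == child(root, 1)
assert parent(parent(v)) == root
```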

For the Morton, Hilbert, Peano and Sierpinski curves, the functions 푠 ↦→ 푆푐(푠, 푗) are invertible. For the semi-local models of the 훽Ω-curve and the Gosper curve from Section 2.3, this is not true. In this case, we also need to store the states of ancestors in a node:

Definition 3.14. Let S be a 푏-state-system. Similar to Definition 3.12, we can define a state-푏-index-tree 푇 = (풱, 퐶, 풮, 푆) satisfying

풱 ⊆ N0 × N0 × ⋃_{푛=1}^{∞} 풮^푛
푟 = (0, 0, 푠푟)
퐶((푙, 푗, (푠푝, 푠)), 푖) = (푙 + 1, 푗푏 + 푖, (푠푝, 푠, 푆푐(푠, 푖)))
푃((푙, 푗, (푠푝, 푠))) = (푙 − 1, 푗 div 푏, 푠푝) (if 푙 > 0)
ℓ((푙, 푗, (푠푝, 푠))) = 푙
퐼((푙, 푗, (푠푝, 푠))) = 푗 mod 푏
푆((푙, 푗, (푠푝, 푠))) = 푠

for every (푙, 푗, (푠푝, 푠)) ∈ 풱, where 푠푝 is a (possibly empty) tuple of states. We call 푇 the algebraic 푏-index-tree with history for S. At first, it might seem that not all of these operations are realizable in 풪(1), since they involve creating tuples of arbitrary length. However, we will show in Section 6.2 how to implement all operations in 풪(1) using partial trees for the state tuples.
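One way to obtain 풪(1) history operations is to store the state tuple as a persistent (immutable, tail-sharing) linked list, so that neither the child nor the parent operation copies the whole tuple. This is an illustrative simplification in the spirit of such a scheme, not the thesis's partial-tree data structure from Section 6.2, and the child-state function below is a toy non-invertible example:

```python
# Illustrative O(1) realization of the operations of Definition 3.14: the
# ancestor-state tuple is stored as nested pairs (tail, state) that are
# shared between nodes instead of copied.
B = 4

def Sc(s, j):
    """Toy child-state function; s -> Sc(s, 0) is constant, hence not invertible."""
    return 1 if j == 0 else s

root = (0, 0, (None, 0))  # state history as nested pairs (tail, state)

def child(v, i):
    l, j, hist = v
    _, s = hist
    return (l + 1, j * B + i, (hist, Sc(s, i)))  # O(1): the tail is shared

def parent(v):
    l, j, hist = v
    assert l > 0
    tail, _ = hist
    return (l - 1, j // B, tail)                 # O(1): drop the head

def state(v):
    return v[2][1]

w = child(child(root, 1), 0)
assert state(w) == 1
assert parent(parent(w)) == root
```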

3.2 Algebraic Neighbors

Now, we want to define what it means for a node 푤 to be a neighbor of 푣. First of all, we need a finite set ℱ whose elements encode the facets of a node, i.e. the directions where a neighbor can be looked for. For the 2D Hilbert curve, we might for example choose ℱ = {0, 1, 2, 3}, where 0 encodes the left, 1 the right, 2 the lower and 3 the upper side of a square. If a node 푣 has a neighbor 푤 in direction 푓 ∈ ℱ, we distinguish two cases:

∙ Case 1: 푃(푣) = 푃(푤). In this case, we assume that the index 퐼(푤) of 푤 is uniquely determined by 퐼(푣), 푆(푃(푣)) and 푓. We can model this relationship by a function 푁:

퐼(푤) = 푁(퐼(푣), 푆(푃(푣)), 푓) .

∙ Case 2: 푃(푣) ≠ 푃(푤). In this case, we assume that 푁(퐼(푣), 푆(푃(푣)), 푓) = ⊥, where ⊥ is a value representing an undefined result. In addition, we assume that 푃(푤) is a neighbor of 푃(푣) in direction 푓푝 ∈ ℱ, where 푓푝 can be modeled by another functional relationship:

푓푝 = 퐹푝(퐼(푣), 푆(푃(푣)), 푓) .

For most global models, we can assume that 푓푝 = 푓, i.e. the direction of the neighbor stays the same. In local models, this is generally not the case due to the change of coordinate system. Finally, we assume that 퐼(푤) is given by a third functional relationship:

퐼(푤) = Ω(퐼(푣), 푆(푃(푣)), 푆(푃(푤)), 푓) .

This is the case if the states of the parents provide enough information to know which of their children can be neighbors in which direction.

Definition 3.15. We use ⊥ as a not otherwise used object representing an undefined or invalid result. For any set 푀, let 푀⊥ := 푀 ∪ {⊥}.

We can now take the three functions 푁, Ω and 퐹푝 that were derived from geometric intuition above and use them to define neighborship:

Definition 3.16. A tuple 푇 = (푇푠, ℱ, 푁, Ω, 퐹푝) is called a neighbor-푏-index-tree if

∙ 푇푠 = (풱, 퐶, 풮, 푆) is a state-푏-index-tree as in Definition 3.10,
∙ ℱ is a finite set, called the set of facets,
∙ 푁 : ℬ × 풮 × ℱ → ℬ⊥ is a function,
∙ Ω : ℬ × 풮 × 풮 × ℱ → ℬ⊥ is a function, and
∙ 퐹푝 : ℬ × 풮 × ℱ → ℱ⊥ is a function.

Let 푓 ∈ ℱ, 푣, 푤 ∈ 풱 ∖ {푟} and 푗 := 푁(퐼(푣), 푆(푃(푣)), 푓). We define neighborship inductively as follows:

∙ 푤 is called a depth-1 푓-neighbor of 푣 if 푗 ≠ ⊥ and 푤 = 퐶(푃(푣), 푗).

∙ Let 푘 ≥ 2. 푤 is called a depth-푘 푓-neighbor of 푣 if
  – 푗 = ⊥,
  – 푓푝 := 퐹푝(퐼(푣), 푆(푃(푣)), 푓) ≠ ⊥,
  – 푃(푤) is a depth-(푘 − 1) 푓푝-neighbor of 푃(푣), and
  – 퐼(푤) = Ω(퐼(푣), 푆(푃(푣)), 푆(푃(푤)), 푓).

∙ 푤 is called an 푓-neighbor of 푣 if there exists 푘 ≥ 1 such that 푤 is a depth-푘 푓-neighbor of 푣.
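The inductive definition translates directly into a recursive neighbor-finding procedure: try the depth-1 case via 푁, otherwise ascend to the parent via 퐹푝 and descend again via Ω. The following sketch instantiates this recursion for a toy model (binary subdivision of [0, 1] with a single state and facets 'left'/'right'); it illustrates Definition 3.16 and is not the optimized algorithm of Section 4:

```python
# Illustrative recursion over Definition 3.16. Toy model: b = 2, one state
# 'G', nodes are dyadic subintervals of [0, 1] stored as (level, position).
# N, Fp and Omega are hypothetical lookup tables for this toy model only.
B = 2
BOT = None  # stands for the undefined value ⊥

def N(i, s, f):
    """Index of a same-parent (depth-1) neighbor, or ⊥."""
    if f == 'right':
        return 1 if i == 0 else BOT
    return 0 if i == 1 else BOT

def Fp(i, s, f):
    """Facet to follow at the parent; global model: direction unchanged."""
    return f

def Omega(i, s, s_parent_neighbor, f):
    """Child index of the neighbor inside the parent's neighbor."""
    return 0 if f == 'right' else 1

def neighbor(v, f):
    l, j = v
    if l == 0:
        return None                      # the root has no neighbors
    i = j % B                            # I(v)
    jn = N(i, 'G', f)
    if jn is not BOT:                    # depth-1 case: a sibling
        return (l, (j // B) * B + jn)
    pn = neighbor((l - 1, j // B), Fp(i, 'G', f))
    if pn is None:
        return None                      # no neighbor (domain boundary)
    return (l, pn[1] * B + Omega(i, 'G', 'G', f))

assert neighbor((3, 5), 'right') == (3, 6)
assert neighbor((3, 0), 'left') is None
```

The recursion depth equals the neighbor-depth of Definition 3.18, which is the quantity behind the average-case 풪(1) runtime claim.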

Although this definition makes no assumptions about the functions 푁, Ω and 퐹푝, we can still derive some intuitive properties of neighborship:

Lemma 3.17. Let 푣 ∈ 풱 and 푓 ∈ ℱ. If there is a depth-푘 푓-neighbor 푤 of 푣, it is its only 푓-neighbor, 푘 is unique, ℓ(푤) = ℓ(푣) and 푘 ≤ ℓ(푣).

Proof. We prove this by induction on ℓ(푣). If ℓ(푣) = 0, there is by definition no neighbor of 푣. Now assume that the claim is true up to level 푙 ≥ 0 and let ℓ(푣) = 푙 + 1. Furthermore, let 푗 := 푁(퐼(푣), 푆(푃(푣)), 푓). Suppose that 푤 is a depth-푘 푓-neighbor of 푣 and 푢 is a depth-푘′ 푓-neighbor of 푣. We distinguish two cases:

(1) If 푗 ≠ ⊥, it follows that 푘 = 푘′ = 1 ≤ ℓ(푣). But then, 푢 = 퐶(푃(푣), 푗) = 푤. The levels are identical because ℓ(푤) = ℓ(퐶(푃(푣), 푗)) = ℓ(푃(푣)) + 1 = ℓ(푣).

(2) If 푗 = ⊥, then 푘, 푘′ > 1. Thus, 푃(푤) must be a depth-(푘 − 1) 퐹푝(퐼(푣), 푆(푃(푣)), 푓)-neighbor of 푃(푣) and 푃(푢) must be a depth-(푘′ − 1) 퐹푝(퐼(푣), 푆(푃(푣)), 푓)-neighbor of 푃(푣). Since ℓ(푃(푣)) = ℓ(푣) − 1 = 푙, we can apply the induction hypothesis to obtain 푃(푢) = 푃(푤) and 푘 − 1 = 푘′ − 1 ≤ ℓ(푃(푣)) = ℓ(푃(푤)) = ℓ(푃(푢)). Thus, 푘 = 푘′ ≤ ℓ(푃(푣)) + 1 = ℓ(푣). By definition,

퐼(푢) = Ω(퐼(푣), 푆(푃(푣)), 푆(푃(푢)), 푓) = Ω(퐼(푣), 푆(푃(푣)), 푆(푃(푤)), 푓) = 퐼(푤)

and thus

푢 = 퐶(푃(푢), 퐼(푢)) = 퐶(푃(푤), 퐼(푤)) = 푤 .

With the preceding lemma, the depth of a neighbor is well-defined. This will lead to a characterization of the neighbor-finding algorithm's runtime later.

Definition 3.18. Let 푇 be a neighbor-푏-index-tree as in Definition 3.16. The 푓-neighbor-depth 퐷푇(푣, 푓) of 푣 ∈ 풱 is defined as

∙ 푘, if there is a depth-푘 푓-neighbor of 푣; and
∙ ℓ(푣) + 1, if there is no 푓-neighbor of 푣.

3.3 Geometric Neighbors and Regularity Conditions

This section presents sufficient conditions for a 푏-specification such that, under these conditions,

∙ the geometric 푏-index-tree from Definition 3.12 is extended to a neighbor-푏-index-tree,

∙ a geometric definition of 푓-neighborship is given for such a tree, and
∙ the equivalence of this geometric definition and Definition 3.16 is proven.

Readers who understand the geometric intuitions behind Definition 3.16 and are mainly interested in algorithms may skip this rather technical section and go directly to Section 3.5 or Section 4. To define conditions for regularity of geometric 푏-index-trees, we first need to define what it means for two nodes to “look essentially the same”. We formalize this via an equivalence relation on their point matrices, where two point matrices are defined to be equivalent if their points are affinely transformed versions of each other.

Definition 3.19. Let

Aut_aff(R^푑) := {휏 : R^푑 → R^푑, 푥 ↦→ 퐴푥 + 푏 | 퐴 ∈ R^(푑×푑) invertible, 푏 ∈ R^푑}

be the group of all invertible affine transformations 휏 : R^푑 → R^푑. Furthermore, let 푄·,푖 denote the vector containing the 푖-th column of the matrix 푄. For 푄, 푄′ ∈ R^(푑×푚) and 휏 ∈ Aut_aff(R^푑), we write

휏 : 푄 ∼ 푄′ :⇔ ∀푖 ∈ {1, . . . , 푚} : 푄′·,푖 = 휏(푄·,푖) .

If 휏(푥) = 퐴푥 + 푏 for all 푥 ∈ R^푑, this is equivalent to 푄′ = 퐴푄 + 푏1푚^⊤. We write 푄 ∼ 푄′ if there exists 휏 ∈ Aut_aff(R^푑) such that 휏 : 푄 ∼ 푄′.

It is easy to show that

∙ id_{R^푑} : 푄 ∼ 푄,
∙ 휏 : 푄 ∼ 푄′ implies 휏^{−1} : 푄′ ∼ 푄, and
∙ (휋 : 푄 ∼ 푄′ and 휏 : 푄′ ∼ 푄′′) implies (휏 ∘ 휋) : 푄 ∼ 푄′′

for all 푄, 푄′, 푄′′ ∈ R^(푑×푚) and 휋, 휏 ∈ Aut_aff(R^푑). This shows that ∼ is an equivalence relation.

For 휏 ∈ Aut_aff(R^푑), 푄1, 푄1′ ∈ R^(푑×푚) and 푄2, 푄2′ ∈ R^(푑×푛), we define

휏 : (푄1, 푄2) ∼ (푄1′, 푄2′) :⇔ (휏 : 푄1 ∼ 푄1′ and 휏 : 푄2 ∼ 푄2′) .

This means that 푄1 and 푄2 are related to 푄1′ and 푄2′ by the same affine automorphism 휏. Again, we write (푄1, 푄2) ∼ (푄1′, 푄2′) if there exists 휏 ∈ Aut_aff(R^푑) such that 휏 : (푄1, 푄2) ∼ (푄1′, 푄2′). The relation ∼ on pairs of matrices is also an equivalence relation.

Remark 3.20. Given two matrices 푄, 푄′ ∈ R^(푑×푚), it can be algorithmically checked⁶ whether 푄 ∼ 푄′: let 푢1, . . . , 푢푚 be the columns of 푄 and let 푢1′, . . . , 푢푚′ be the columns of 푄′. Set 푈 := {푢1, . . . , 푢푚} and 푈′ := {푢1′, . . . , 푢푚′}. Let 퐵 = {푢_{푖1}, . . . , 푢_{푖푘}} ⊆ 푈 be an affine basis of aff(푈). Set 퐵′ := {푢_{푖1}′, . . . , 푢_{푖푘}′}. Then, there is a unique affine map 휋 : aff(푈) = aff(퐵) → aff(퐵′) satisfying 휋(푢_{푖푛}) = 푢_{푖푛}′ for all 푛 ∈ {1, . . . , 푘}.

∙ If 휏 : 푄 ∼ 푄′, it follows that 휋 = 휏|_{aff(퐵)} and thus, 퐵′ is affinely independent and 휋(푢푖) = 푢푖′ for all 푖 ∈ {1, . . . , 푚}.
∙ If 퐵′ is affinely independent and 휋(푢푖) = 푢푖′ for all 푖 ∈ {1, . . . , 푚}, then dim(aff(푈)) = dim(aff(푈′)) and 휋 can be extended to an affine isomorphism 휏 ∈ Aut_aff(R^푑) with 휏 : 푄 ∼ 푄′.

Thus, the condition that 퐵′ is affinely independent and 휋(푢푖) = 푢푖′ for all 푖 ∈ {1, . . . , 푚} is equivalent to 푄 ∼ 푄′ and can be algorithmically verified. Since (푄1, 푄2) ∼ (푄1′, 푄2′) if and only if the concatenated matrices (푄1|푄2) and (푄1′|푄2′) are equivalent, equivalence of matrix pairs can also be algorithmically verified.

Remark 3.21. Let 푚, 푛 ∈ N≥1. For 푄 ∈ R^(푑×푚), let [푄] be the equivalence class of 푄 with respect to ∼. The set ℳ of transition matrices in R^(푚×푛) is a semigroup by Lemma 3.5 (c). We can define a right group action of ℳ on the quotient R^(푑×푚)/∼ by

[푄] · 푀 := [푄 · 푀] .

To show that this is well-defined, assume that 휏 : 푄 ∼ 푄′, where 휏(푥) = 퐴푥 + 푏 for all 푥 ∈ R^푑. Then, 푄′ = 퐴푄 + 푏1푚^⊤ and thus

푄′푀 = (퐴푄 + 푏1푚^⊤)푀 = 퐴푄푀 + 푏(1푚^⊤푀) = 퐴(푄푀) + 푏1푛^⊤ ,

which means 휏 : 푄푀 ∼ 푄′푀. This is the idea behind the next lemma.

To work with matrix equivalences in a geometric 푏-index-tree, the following identities are essential:
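The well-definedness computation of Remark 3.21 can also be checked numerically. The following sketch (illustrative; the rotation and translation are chosen arbitrarily) verifies 푄′푀 = 퐴(푄푀) + 푏1푛^⊤ for a representative example:

```python
# Numerical check (illustrative) of Remark 3.21: if Q' = A*Q + b*1^T and M is
# a transition matrix, then Q'*M = A*(Q*M) + b*1^T, so [Q] -> [Q*M] does not
# depend on the chosen representative.
from fractions import Fraction as F

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def affine(A, b, Q):
    """Compute A*Q + b*1^T for a 2x2 matrix A and a 2-vector b."""
    return [[A[r][0] * Q[0][c] + A[r][1] * Q[1][c] + b[r]
             for c in range(len(Q[0]))] for r in range(2)]

Q = [[0, 1, 0, 1], [0, 0, 1, 1]]        # unit square
A = [[0, -1], [1, 0]]                   # rotation by 90 degrees
b = [3, 5]                              # translation
Qp = affine(A, b, Q)                    # Q' = A*Q + b*1^T

M = [[F(1), F(1, 2), F(1, 2), F(1, 4)],
     [F(0), F(0),    F(1, 2), F(1, 4)],
     [F(0), F(1, 2), F(0),    F(1, 4)],
     [F(0), F(0),    F(0),    F(1, 4)]]  # a transition matrix

assert matmul(Qp, M) == affine(A, b, matmul(Q, M))
```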

Lemma 3.22. Let 휏 ∈ Aut_aff(R^푑). For matrices of suitable dimensions, the following implications hold:

∙ If 휏 : 퐴 ∼ 퐵, then 휏 : (퐴, 퐴) ∼ (퐵, 퐵).
∙ If 휏 : (퐴, 퐴′) ∼ (퐵, 퐵′) and 푀, 푀′ are transition matrices, then 휏 : (퐴푀, 퐴′푀′) ∼ (퐵푀, 퐵′푀′).
∙ If 휏 : (퐴, 퐴′) ∼ (퐵, 퐵′), then 휏 : (퐴′, 퐴) ∼ (퐵′, 퐵).

In particular, in a geometric 푏-index-tree 푇, the following implications hold:

⁶This assumes arbitrary-precision arithmetic, which is also required for exact computations on polytopes. This assumption is realistic if only rational numbers are used for modeling, which is usually the case.

∙ If 휏 : 푄(푣) ∼ 푄(푤), then 휏 : (푄(푣), 푄(푣)) ∼ (푄(푤), 푄(푤)).
∙ If 휏 : (푄(푣), 푄(푣′)) ∼ (푄(푤), 푄(푤′)) and 푗, 푗′ ∈ ℬ, then

휏 : (푄(푣′), 푄(푣)) ∼ (푄(푤′), 푄(푤)) ,
휏 : (푄(퐶(푣, 푗)), 푄(퐶(푣′, 푗′))) ∼ (푄(퐶(푤, 푗)), 푄(퐶(푤′, 푗′))) ,
휏 : (푄(푣), 푄(퐶(푣′, 푗′))) ∼ (푄(푤), 푄(퐶(푤′, 푗′))) .

Proof. These identities follow easily from Definition 3.19 and Lemma 3.5.

To prove geometric results, we need to establish some well-known facts about polytopes.

Proposition 3.23. Let 푆 ⊆ R^푑 be finite and let 퐹 ⊆ 푃 be a face of the polytope 푃 := conv(푆). Then,

(a) conv(vert(푃)) = 푃,

(b) vert(푃) ⊆ 푆,

(c) 퐹 is a polytope and vert(퐹) = 퐹 ∩ vert(푃),

(d) if 퐹 and 퐹′ are facets of 푃 and dim(퐹 ∩ 퐹′) = dim(푃) − 1, then 퐹 = 퐹′,

(e) if 휏 ∈ Aut_aff(R^푑), then 휏(푃) is a polytope with dim(휏(푃)) = dim(푃), 휏(퐹) is a face of 휏(푃), dim(휏(퐹)) = dim(퐹), vert(휏(푃)) = 휏(vert(푃)) and conv(휏(푆)) = 휏(conv(푆)),

(f) if 푃 and 푃′ are polytopes, then 푃 ∩ 푃′ is also a polytope.

Proof. Follows partially from Ziegler [7] or elementarily from Definition 3.3.

In the following, we want to investigate the structure of a geometric 푏-index-tree 푇. Our goal is to specify the set ℱ and the functions 푁, Ω and 퐹푝 from Definition 3.16. To this end, we want to specify a map (푣, 푓) ↦→ (푣)푓 that takes a node 푣 ∈ 풱 and an encoding 푓 ∈ ℱ and returns a facet (푣)푓 of the polytope conv(푄(푣)). Having said that, not all such maps allow us to reasonably specify functions 푁, Ω and 퐹푝. We have to ensure that for two nodes 푣, 푤 having the same state, the facets (푣)푓 and (푤)푓 are the “same” facet in the sense that they are the same when viewed from the local coordinate systems of 푣 and 푤, respectively. To make that precise, we look at the indices of the columns of 푄(푣) and 푄(푤) that are contained in the facets (푣)푓 and (푤)푓, respectively. The set of these indices should be the same for 푣 and 푤.

Definition 3.24. For 푄 ∈ R^(푑×푚), let facets(푄) be the set of all facets of conv(푄). For 푓 ∈ facets(푄), define indices(푓, 푄) := {푗 ∈ {1, . . . , 푚} | 푄·,푗 ∈ 푓}, the set of indices of matrix columns contained in the facet 푓. Define indexFacets(푄) := {indices(푓, 푄) | 푓 ∈ facets(푄)}, which means that

indices(·, 푄) : facets(푄) → indexFacets(푄)

is surjective. It is even bijective, since it follows from Proposition 3.23 (a) – (c) that facet(·, 푄) : indexFacets(푄) → facets(푄), 퐽 ↦→ conv({푄·,푗 | 푗 ∈ 퐽}) is its inverse.

To be able to define the facet 푓 of a node 푣, we first show that equivalent matrices have the same facet indices and then require nodes of the same state to have equivalent matrices.

Proposition 3.25. Let 푄, 푄′ ∈ R^(푑×푚). If 푄 ∼ 푄′, then

indexFacets(푄) = indexFacets(푄′) .

Proof. Since ∼ is symmetric, it is sufficient to show “⊆”. Let ℐ ∈ indexFacets(푄). By definition, there exists a facet 퐹 of conv(푄) such that ℐ = {푖 ∈ {1, . . . , 푚} | 푄·,푖 ∈ vert(퐹)}. Choose 휏 ∈ Aut_aff(R^푑) with 휏 : 푄 ∼ 푄′. For all 푖 ∈ {1, . . . , 푚}, Proposition 3.23 yields

푄·,푖 ∈ vert(퐹) ⇔ 푄′·,푖 = 휏(푄·,푖) ∈ 휏(vert(퐹)) = vert(휏(퐹)) ,

where 휏(퐹) is a facet of conv(푄′) = 휏(conv(푄)). This shows ℐ ∈ indexFacets(푄′).

To turn a geometric 푏-index-tree 푇 into a neighbor-푏-index-tree in a meaningful way, we need to impose some regularity conditions on the structure of 푇. We establish these regularity conditions in two steps, where the first one is needed to define the 푓-facet of a node, which is in turn useful to define the second step conveniently.

Definition 3.26. A geometric 푏-index-tree 푇 = (풱, 퐶, 풮, 푆) is called pre-regular if the following conditions are satisfied:

(P1) For all 푣, 푣′ ∈ 풱, 푆(푣) = 푆(푣′) implies 푄(푣) ∼ 푄(푣′),
(P2) For all 푣 ∈ 풱, dim(conv(푄(푣))) = 푑.

Remark 3.27. The geometric 푏-index-trees corresponding to the SFCs presented in Section 2 obviously satisfy these criteria.
They even appear to satisfy a much stronger version of (P1): For any two nodes 푣, 푣′ irrespective of their states, there exists 푑 ∈′ 풱 휏 Autaff (R ) such that 휏 : 푄(푣) 푄(푣 ), where 휏(푥) = 퐴푥 + 푏 with an invertible ∈ ∼ matrix 퐴 R푑×푑 that is a scalar multiple of an orthogonal matrix. ∈ For the rest of this section, we assume 푇 = ( , 퐶, , 푆) to be a pre-regular 풱 풮 geometric 푏-index-tree. Definition 3.28. Let be a finite set for every 푠 and := ⋃︀ (not ℱ푠 ∈ 풮 ℱ 푠∈풮 ℱ푠 necessarily disjoint). The intuition behind this is that the set is a set representing ℱ푠 the facets of a node in state 푠. For every 푠 , choose 푣 satisfying 푆(푣 ) = 푠 ∈ 풮 푠 ∈ 풱 푠 (this is possible because of the reachability condition in Definition 3.6). Let Φ : 푠 ℱ푠 → indexFacets(푄(푣 )) be a bijective function for each 푠 . Note that (P1) together with 푠 ∈ 풮 Proposition 3.25 implies that indexFacets(푄(푣푠)) is independent of the choice of 푣푠. We call ( , Φ) a facet specification for 푇 . ℱ For 푣 , 푓 , we can then define the facet 푓 of 푣,(푣) , by ∈ 풱 ∈ ℱ 푓 ⎧ ⎨facet(Φ푆(푣)(푓), 푄(푣)) = conv( 푄(푣)·,푖 푖 Φ푆(푣)(푓) ) , if 푓 푆(푣) (푣)푓 := { | ∈ } ∈ ℱ ⎩ , otherwise. ∅ Since Φ ( ) is bijective and by Definition 3.24, facet( , 푄(푣)) is bijective, the function 푆 푣 · ( ) facets(푄(푣)), 푓 (푣) is also bijective. This means that (푣) = (푣) ′ implies ℱ푆 푣 → ↦→ 푓 푓 푓 푓 = 푓 ′ if (푣) = . 푓 ̸ ∅ 33 Note that definition of푣 ( ) depends on the choice of ( ) ∈풮 and (Φ ) ∈풮 , i.e. on the 푓 ℱ푠 푠 푠 푠 choice of the facet specification ( , Φ). We will assume for the rest of this section that ℱ one such choice is fixed.

Lemma 3.29. Let $\tau : Q(v) \sim Q(w)$ and $S(v) = S(w)$. Then, $(w)_f = \tau((v)_f)$.

Proof. Let $s := S(v) = S(w)$. We may assume that $f \in \mathcal{F}_s$, otherwise the claim is trivial. With $\mathcal{I} := \Phi_s(f)$, we obtain
$$(w)_f = \operatorname{conv}(\{ Q(w)_{\cdot,i} \mid i \in \mathcal{I} \}) = \operatorname{conv}(\{ \tau(Q(v)_{\cdot,i}) \mid i \in \mathcal{I} \}) = \operatorname{conv}(\tau(\{ Q(v)_{\cdot,i} \mid i \in \mathcal{I} \})) \overset{3.23}{=} \tau(\operatorname{conv}(\{ Q(v)_{\cdot,i} \mid i \in \mathcal{I} \})) = \tau((v)_f)\,.$$

Definition 3.30. For $v, v' \in \mathcal{V}$, let $\operatorname{section}(v, v') := \operatorname{conv}(Q(v)) \cap \operatorname{conv}(Q(v'))$. By Proposition 3.23 (f), $\operatorname{section}(v, w)$ is a polytope. Furthermore, if $\tau : (Q(v), Q(v')) \sim (Q(w), Q(w'))$, Proposition 3.23 yields

$$\operatorname{section}(w, w') = \operatorname{conv}(Q(w)) \cap \operatorname{conv}(Q(w')) = \operatorname{conv}(\tau(Q(v))) \cap \operatorname{conv}(\tau(Q(v'))) = \tau(\operatorname{conv}(Q(v)) \cap \operatorname{conv}(Q(v'))) = \tau(\operatorname{section}(v, v'))\,.$$

Definition 3.31 (Geometric neighbors). Let $v \in \mathcal{V}$ and $f \in \mathcal{F}_{S(v)}$. A node $v' \in \mathcal{V}$ is called geometric $f$-neighbor of $v$ if $\ell(v) = \ell(v')$ and there is $f' \in \mathcal{F}_{S(v')}$ such that $\operatorname{section}(v, v') = (v)_f = (v')_{f'}$. Note that this implies $\dim(\operatorname{section}(v, v')) = d - 1$, hence $v \neq v'$ and thus $\ell(v) = \ell(v') \neq 0$, since only $r$ has level $0$ and $\dim(\operatorname{section}(r, r)) = d \neq d - 1$ by (P2).

Definition 3.32. A pre-regular geometric $b$-index-tree $T$ is called regular if the following conditions hold:

(R1) If $v_2, v_2'$ are geometric $f$-neighbors of $v_1$ and $v_1'$, respectively, satisfying $S(v_i) = S(v_i')$ for $i \in \{1, 2\}$, then $(Q(v_1), Q(v_2)) \sim (Q(v_1'), Q(v_2'))$.

(R2) For all $v \in \mathcal{V}$ and $j \in \mathcal{B}$, $\operatorname{conv}(Q(C(v, j))) \subseteq \operatorname{conv}(Q(v))$.

(R3) For all $v, v' \in \mathcal{V}$ with $\ell(v) = \ell(v')$, the following conditions are satisfied:

(i) If $\dim(\operatorname{section}(v, v')) = d$, then $v = v'$.

(ii) If $\dim(\operatorname{section}(v, v')) = d - 1$, there exists $f \in \mathcal{F}$ such that $v'$ is a geometric $f$-neighbor of $v$.

Note that regularity is a property independent of the choice of the facet specification $(\mathcal{F}, \Phi)$, but a formulation using sets of matrix column indices would be less convenient.

Remark 3.33. Condition (R2) is somewhat restrictive as it rules out the Gosper curve. However, it is satisfied by the most popular curves and it is also important for using a curve in an adaptive setting. Furthermore, it is intuitively obvious that the algorithms given below work for the Gosper curve as well.

Condition (R1) is also restrictive because it is not satisfied by the local model of the Hilbert curve and the semi-local model of the Gosper curve, cf. Remark 2.6. This is a problem because it means that the function $\Omega$ will not be well-defined. A simple remedy is to choose a global model instead, where the state contains more information about the orientation of a polytope. Another possibility would be to add another parameter to $\Omega$: let $w$ be a geometric $f$-neighbor of $v$ and $v$ be a geometric $f'$-neighbor of $w$. Instead of setting
$$\Omega(I(v), S(P(v)), S(P(w)), f) := I(w)\,,$$
we could set
$$\Omega(I(v), S(P(v)), S(P(w)), f, f') := I(w)\,,$$
i.e. include $f'$ as an additional parameter. This has two main drawbacks:

∙ The lookup table for $\Omega$ gets bigger (though it might still be smaller than the lookup table in a global model without the additional parameter).

∙ The facet $f'$ has to be known, which in general requires additional lookup tables and thus complicates the model.

Hence, we will not further investigate this approach.

Condition (R2) yields the relation
$$\bigcup_{j \in \mathcal{B}} \operatorname{conv}(Q(C(v, j))) \subseteq \operatorname{conv}(Q(v))\,.$$
We might also demand equality instead, which is again satisfied by all curves from Section 2 except the Gosper curve. This would yield information about the existence of neighbors in some cases. Here, we stay with the weaker regularity condition because it will be sufficient for our purposes. In condition (R3), we could require $\operatorname{section}(v, v')$ to be a face of $v$ and $v'$ independent of its dimension, but we will not need this requirement and it might be more difficult to verify algorithmically.

To prove useful results about our geometric approach, we need some more preparations.

Lemma 3.34. Let $v \in \mathcal{V}$ and $f, f' \in \mathcal{F}$ such that $\dim((v)_f \cap (v)_{f'}) \geq d - 1$. Then, $f = f'$.

Proof. Under the assumptions above, let $F := (v)_f$ and $F' := (v)_{f'}$. From Definition 3.28, we know that $\dim(F), \dim(F') \leq d - 1$. The assumption $\dim(F \cap F') \geq d - 1$ yields $\dim(F) = \dim(F') = d - 1$. Since $F$ and $F'$ are facets of $\operatorname{conv}(Q(v))$ and since $\dim(\operatorname{conv}(Q(v))) = d$ by (P2), Proposition 3.23 (d) yields $F = F'$, i.e. $(v)_f = (v)_{f'}$. As noted in Definition 3.28, this implies $f = f'$.

Lemma 3.35. Let $\tau : (Q(v), Q(v')) \sim (Q(w), Q(w'))$ with $S(v) = S(w)$ and $S(v') = S(w')$.

(a) If $f, f' \in \mathcal{F}$ such that $(v)_f \supseteq (v')_{f'}$, then $(w)_f \supseteq (w')_{f'}$.

(b) If $v'$ is a geometric $f$-neighbor of $v$ and $\ell(w) = \ell(w')$, then $w'$ is a geometric $f$-neighbor of $w$.

Proof.

(a) By Lemma 3.29, we obtain $(w)_f = \tau((v)_f) \supseteq \tau((v')_{f'}) = (w')_{f'}$.

(b) By Lemma 3.29 and Definition 3.30, we obtain $\operatorname{section}(w, w') = \tau(\operatorname{section}(v, v'))$, $(w)_f = \tau((v)_f)$ and $(w')_{f'} = \tau((v')_{f'})$. Hence, $\operatorname{section}(w, w') = (w)_f = (w')_{f'}$, which means that $w'$ is a geometric $f$-neighbor of $w$.

For the rest of this section, we assume $T$ to be a regular geometric $b$-index-tree.

Lemma 3.36. For any $v \in \mathcal{V}$ and $f \in \mathcal{F}$, there is at most one geometric $f$-neighbor of $v$.

Proof. Assume that $w \in \mathcal{V}$ and $w' \in \mathcal{V}$ are geometric $f$-neighbors of $v$. Let $F := \operatorname{section}(v, w) = (v)_f = \operatorname{section}(v, w')$. Choose a point $p$ in the interior of $F$. By (P2), we have $\dim(\operatorname{conv}(Q(v))) = \dim(\operatorname{conv}(Q(w))) = \dim(\operatorname{conv}(Q(w'))) = d$. Choosing $\varepsilon > 0$ sufficiently small, we can assume that the ball $B_\varepsilon(p)$ with center $p$ and radius $\varepsilon$ intersects $\operatorname{conv}(Q(v))$, $\operatorname{conv}(Q(w))$ and $\operatorname{conv}(Q(w'))$ in half-balls $H_v$, $H_w$ and $H_{w'}$, respectively. Since $H_v \cap H_w, H_v \cap H_{w'} \subseteq F$ by assumption, it follows that $H_w = H_{w'}$. Therefore, $\dim(\operatorname{section}(w, w')) \geq \dim(\operatorname{aff}(H_w)) = d$. But then, $w = w'$ by (R3).

With these preparations, we can now extend $T$ to a neighbor-$b$-index-tree and then show in the following proposition that this extension is well-defined:

Definition 3.37. Let $j \in \mathcal{B}$, $s \in \mathcal{S}$, $f \in \mathcal{F}$.

(a) If there exists $v \in \mathcal{V}$ with $S(v) = s$ and $j' \in \mathcal{B}$ such that $C(v, j')$ is a geometric $f$-neighbor of $C(v, j)$, we set
$$N(j, s, f) := j'\,.$$

(b) If there exist $v, v' \in \mathcal{V}$, $j' \in \mathcal{B}$ and $f' \in \mathcal{F}$ such that $S(v) = s$, $v'$ is a geometric $f'$-neighbor of $v$ and $C(v', j')$ is a geometric $f$-neighbor of $C(v, j)$, we set
$$F^p(j, s, f) := f'\,,\qquad \Omega(j, s, S(v'), f) := j'\,.$$

In all cases not covered, we set $N(j, s, f)$, $F^p(j, s, f)$ or $\Omega(j, s, s', f)$ to $\bot$, respectively.

Proposition 3.38. The functions in Definition 3.37 are well-defined.

Proof. We assume to be given a choice of variables in Definition 3.37 and show that any other choice leads to the same definition:

(a) Let $w \in \mathcal{V}$ with $S(w) = s$ and $j'' \in \mathcal{B}$ such that $C(w, j'')$ is a geometric $f$-neighbor of $C(w, j)$. We show that $j' = j''$: (P1) yields $Q(v) \sim Q(w)$ and thus, by Lemma 3.22,
$$(Q(C(v, j)), Q(C(v, j'))) \sim (Q(C(w, j)), Q(C(w, j')))\,.$$
By Lemma 3.35, $C(w, j')$ is a geometric $f$-neighbor of $C(w, j)$ and thus $j' = j''$ by the uniqueness of geometric neighbors (Lemma 3.36).

(b) Let $w, w' \in \mathcal{V}$, $j'' \in \mathcal{B}$ and $f'' \in \mathcal{F}$ such that $S(w) = s$, $w'$ is a geometric $f''$-neighbor of $w$ and $C(w', j'')$ is a geometric $f$-neighbor of $C(w, j)$. We show that $f' = f''$ and that $S(v') = S(w')$ implies $j' = j''$:

(i) We obtain
$$(v)_{f'} = \operatorname{section}(v, v') \overset{\text{(R2)}}{\supseteq} \operatorname{section}(C(v, j), C(v', j')) = (C(v, j))_f \tag{3.2}$$
and similarly $(w)_{f''} \supseteq (C(w, j))_f$. Since $S(v) = S(w)$, (P1) yields $Q(v) \sim Q(w)$ and thus $(Q(v), Q(C(v, j))) \sim (Q(w), Q(C(w, j)))$ by Lemma 3.22. By Lemma 3.35, Equation (3.2) implies $(w)_{f'} \supseteq (C(w, j))_f$. Consequently,
$$d - 1 = \dim((C(w, j))_f) \leq \dim((w)_{f'} \cap (w)_{f''})$$
and thus $f' = f''$ by Lemma 3.34.

(ii) Now assume that $S(v') = S(w')$. Then, $(Q(v), Q(v')) \sim (Q(w), Q(w'))$ by (R1) and thus $(Q(C(v, j)), Q(C(v', j'))) \sim (Q(C(w, j)), Q(C(w', j')))$ by Lemma 3.22. This means that $C(w', j')$ is a geometric $f$-neighbor of $C(w, j)$ by Lemma 3.35. Since $C(w, j)$ can have only one geometric $f$-neighbor by Lemma 3.36, it follows that $C(w', j') = C(w', j'')$ and thus $j' = j''$.

Definition 3.39. Given a geometric $b$-index-tree $T$ and a facet specification $(\mathcal{F}, \Phi)$ as in Definition 3.28, we can define the extended geometric $b$-index-tree $\hat{T} := (T, \mathcal{F}, N, \Omega, F^p)$, where $N$, $\Omega$ and $F^p$ are defined as in Definition 3.37.

We can now relate the geometric neighborship definition from Definition 3.31 to the algebraic neighborship definition from Definition 3.16.

Theorem 3.40. Let $\hat{T}$ be an extended geometric $b$-tree for the facet specification $(\mathcal{F}, \Phi)$. For any $v, v' \in \mathcal{V}$ and $f \in \mathcal{F}$, $v'$ is a geometric $f$-neighbor of $v$ in $\hat{T}$ if and only if it is an $f$-neighbor of $v$ in $\hat{T}$.

Proof. We will prove both implications by induction on $\ell(v)$. The induction basis is trivial because both neighborship definitions do not allow $\ell(v) = 0$, which would mean that $v = r$.

"$\Rightarrow$": Let $v' \in \mathcal{V}$ be a geometric $f$-neighbor of $v \in \mathcal{V}$ with $\ell(v) \geq 1$. Since $v \neq r \neq v'$, let $v_p := P(v)$, $v_p' := P(v')$.

∙ Case 1: $v_p = v_p'$. The definition of $N$ implies that $N(I(v), S(v_p), f) = I(v')$. By definition, $v'$ is an $f$-neighbor of $v$.

∙ Case 2: $v_p \neq v_p'$. Let $H := \operatorname{section}(v_p, v_p')$. (R2) implies $\operatorname{conv}(Q(v)) \subseteq \operatorname{conv}(Q(v_p))$ and $\operatorname{conv}(Q(v')) \subseteq \operatorname{conv}(Q(v_p'))$ and therefore
$$\dim(H) = \dim(\operatorname{aff}(H)) = \dim(\operatorname{aff}(\operatorname{section}(v_p, v_p'))) \geq \dim(\operatorname{aff}(\operatorname{section}(v, v'))) = \dim(\operatorname{section}(v, v')) = d - 1\,.$$
By (R3), there exists $f_p \in \mathcal{F}_{S(v_p)}$ such that $v_p'$ is a geometric $f_p$-neighbor of $v_p$. Invoking the induction hypothesis, we obtain that $v_p'$ is an $f_p$-neighbor of $v_p$. By definition, we have $f_p = F^p(I(v), S(v_p), f)$ and $I(v') = \Omega(I(v), S(v_p), S(v_p'), f)$. This means that $v'$ is an $f$-neighbor of $v$.

"$\Leftarrow$": Let $v' \in \mathcal{V}$ be an $f$-neighbor of $v \in \mathcal{V}$ with $\ell(v) \geq 1$. Since $v \neq r \neq v'$, let $v_p := P(v)$, $v_p' := P(v')$.

∙ Case 1: $v_p = v_p'$. By definition, $I(v') = N(I(v), S(v_p), f)$. By definition of $N$, there exists $w_p \in \mathcal{V}$ with $S(w_p) = S(v_p)$ such that $w' := C(w_p, I(v'))$ is a geometric $f$-neighbor of $w := C(w_p, I(v))$. (P1) implies $Q(w_p) \sim Q(v_p)$ and thus $(Q(w), Q(w')) \sim (Q(v), Q(v'))$ by Lemma 3.22. By Lemma 3.35, $v'$ is a geometric $f$-neighbor of $v$.

∙ Case 2: $v_p \neq v_p'$. By definition, $v_p'$ is an $F^p(I(v), S(v_p), f)$-neighbor of $v_p$. By the induction hypothesis, $v_p'$ is then also a geometric $F^p(I(v), S(v_p), f)$-neighbor of $v_p$. Furthermore, we have $I(v') = \Omega(I(v), S(v_p), S(v_p'), f)$. By definition of $\Omega$, there exist $w_p, w_p' \in \mathcal{V}$ with $I(w_p) = I(v_p)$, $S(w_p) = S(v_p)$ and $S(w_p') = S(v_p')$ such that $w_p'$ is a geometric $F^p(I(v), S(v_p), f)$-neighbor of $w_p$ and $w' := C(w_p', I(v'))$ is a geometric $f$-neighbor of $w := C(w_p, I(v))$. (R1) yields $(Q(w_p), Q(w_p')) \sim (Q(v_p), Q(v_p'))$, which leads to $(Q(w), Q(w')) \sim (Q(v), Q(v'))$ using Lemma 3.22. By Lemma 3.35, $v'$ is a geometric $f$-neighbor of $v$.

3.4 Algorithmic Verification

In this section, we want to give an algorithmically verifiable necessary and sufficient criterion for the regularity of a geometric $b$-tree $T$. It works by finding a representative finite collection of nodes such that checks on this collection can verify regularity on the whole tree. Such a collection can also be used to compute the functions $N$, $\Omega$, $F^p$. As in the last section, the representations are introduced in two stages, where the first stage can be used to check pre-regularity, which is then required for the definition of neighborship that is in turn used for the second stage.

Definition 3.41. A sequence $u$ of representants $u_s \in \mathcal{V}$ for $s \in \mathcal{S}$ is called pre-representation for $T$ if $u_{S(r)} = r$ and $S(u_s) = s$ for all $s \in \mathcal{S}$.

The reachability condition in Definition 3.6 ensures that nodes $u_s$ with $S(u_s) = s$ for every $s \in \mathcal{S}$ exist. A pre-representation for $T$ can be found algorithmically as follows:

∙ Initialize $u_{S(r)}$ to $r$ and all other representants to $\bot$.

∙ Iteratively check the children of found representants to find a representant for each state.

Definition 3.42. A pre-representation $u$ for $T$ is called pre-regular if the following conditions are satisfied:

(P1') For all $s \in \mathcal{S}$ and $j \in \mathcal{B}$, $Q(C(u_s, j)) \sim Q(u_{S(C(u_s, j))})$.

(P2') For all $s \in \mathcal{S}$, $\dim(\operatorname{conv}(Q(u_s))) = d$.

Proposition 3.43. Let $u$ be a pre-representation for $T$. Then, $T$ is pre-regular if and only if $u$ is pre-regular.

Proof. The direction "$\Rightarrow$" follows immediately from the definition. We will show "$\Leftarrow$" here. Thus, let $u$ be pre-regular. To show that (P1) is satisfied, we show that $Q(x) \sim Q(u_{S(x)})$ for all $x \in \mathcal{V}$ by induction on $\ell(x)$. If $\ell(x) = 0$, then $x = r$ and the assertion follows directly from $u_{S(r)} = r$. Now assume that $\ell(x) = l + 1 \geq 1$ and the claim is true up to level $l$. Set $x_p := P(x)$. Then, $\ell(x_p) = l$ and thus $Q(x_p) \sim Q(u_{S(x_p)})$ by the induction hypothesis. Setting $y := C(u_{S(x_p)}, I(x))$, we obtain $S(y) = S^c(S(x_p), I(x)) = S(x)$ and thus
$$Q(x) = Q(C(x_p, I(x))) \overset{3.22}{\sim} Q(y) \overset{\text{(P1')}}{\sim} Q(u_{S(y)}) = Q(u_{S(x)})\,,$$
which completes the induction.

To show (P2), let $x \in \mathcal{V}$. By (P1), there exists $\tau \in \operatorname{Aut}_{\operatorname{aff}}(\mathbb{R}^d)$ such that $\operatorname{conv}(Q(x)) = \tau(\operatorname{conv}(Q(u_{S(x)})))$. But then,
$$\dim(\operatorname{conv}(Q(x))) = \dim(\tau(\operatorname{conv}(Q(u_{S(x)})))) \overset{3.23}{=} \dim(\operatorname{conv}(Q(u_{S(x)}))) \overset{\text{(P2')}}{=} d\,.$$

For the rest of this section, we assume $T$ to be a pre-regular geometric $b$-index-tree with a neighbor structure given by a fixed facet specification $(\mathcal{F}, \Phi)$.

Definition 3.44. Let $u$ be a pre-representation for $T$. A tuple $(u, v, w)$ with representants $(v_{s,s',f}, w_{s,s',f}) \in ((\mathcal{V} \setminus \{r\}) \times (\mathcal{V} \setminus \{r\})) \cup \{(\bot, \bot)\}$ for all $s, s' \in \mathcal{S}$ and $f \in \mathcal{F}$ is called representation for $T$ if the following conditions are satisfied:

(a) For any $s \in \mathcal{S}$, $j, j' \in \mathcal{B}$ and $f' \in \mathcal{F}$ such that $y := C(u_s, j')$ is a geometric $f'$-neighbor of $x := C(u_s, j)$, we have $(v_{S(x),S(y),f'}, w_{S(x),S(y),f'}) \neq (\bot, \bot)$.

(b) For every $s, s' \in \mathcal{S}$, $f \in \mathcal{F}$ with $(v_{s,s',f}, w_{s,s',f}) \neq (\bot, \bot)$, the following conditions are satisfied:

(1) $S(v_{s,s',f}) = s$,

(2) $S(w_{s,s',f}) = s'$,

(3) $w_{s,s',f}$ is a geometric $f$-neighbor of $v_{s,s',f}$,

(4) For any $j, j' \in \mathcal{B}$ and $f' \in \mathcal{F}$ such that $y := C(w_{s,s',f}, j')$ is a geometric $f'$-neighbor of $x := C(v_{s,s',f}, j)$, we have $(v_{S(x),S(y),f'}, w_{S(x),S(y),f'}) \neq (\bot, \bot)$.

A representation for $T$ can be found algorithmically as follows:

∙ Find a pre-representation $u$ for $T$ as described above.

∙ Initialize all representants $v_{s,s',f}$ and $w_{s,s',f}$ to $\bot$.

∙ Repeatedly check conditions (a) and (b)(4) for all found representants to find new representants until these conditions are satisfied.

Definition 3.45. A pre-regular representation $(u, v, w)$ for $T$ is called regular if the following conditions are satisfied:

(R1') $(Q(x), Q(y)) \sim (Q(v_{S(x),S(y),f'}), Q(w_{S(x),S(y),f'}))$

∙ for all $x := C(u_s, j')$, $y := C(u_s, j'')$, where $y$ is a geometric $f'$-neighbor of $x$, and

∙ for all $x := C(v_{s,s',f}, j')$, $y := C(w_{s,s',f}, j'')$, where $y$ is a geometric $f'$-neighbor of $x$.

(R2') For all $s \in \mathcal{S}$ and $j \in \mathcal{B}$, $\operatorname{conv}(Q(C(u_s, j))) \subseteq \operatorname{conv}(Q(u_s))$.

(R3') For all $x, y \in \mathcal{V}$ of the form

(1) $x = C(u_s, j')$, $y = C(u_s, j'')$, or

(2) $x = C(v_{s,s',f}, j')$, $y = C(w_{s,s',f}, j'')$,

the following conditions are satisfied:

∙ If $\dim(\operatorname{section}(x, y)) = d$, then $x = y$.

∙ If $\dim(\operatorname{section}(x, y)) = d - 1$, then there exists $f' \in \mathcal{F}$ such that $y$ is a geometric $f'$-neighbor of $x$.

Proposition 3.46. Let $(u, v, w)$ be a representation for $T$. Then, $T$ is regular if and only if $(u, v, w)$ is regular.

Proof. Again, the implication "$\Rightarrow$" follows directly from the definition and we only show "$\Leftarrow$". Let $(u, v, w)$ be regular.

To show that (R2) holds, let $x \in \mathcal{V}$ and $j \in \mathcal{B}$. By (P1), there exists $\tau \in \operatorname{Aut}_{\operatorname{aff}}(\mathbb{R}^d)$ such that $\tau : Q(x) \sim Q(u_{S(x)})$. Lemma 3.22 then yields $\tau : Q(C(x, j)) \sim Q(C(u_{S(x)}, j))$. Hence,
$$\operatorname{conv}(Q(C(x, j))) \overset{3.23}{=} \tau(\operatorname{conv}(Q(C(u_{S(x)}, j)))) \overset{\text{(R2')}}{\subseteq} \tau(\operatorname{conv}(Q(u_{S(x)}))) \overset{3.23}{=} \operatorname{conv}(Q(x))\,.$$

To show that (R1) and (R3) hold, we prove the following statements for all $x, y \in \mathcal{V}$ with $\ell(x) = \ell(y)$ by induction on $\ell(x)$:

(i) If $y$ is a geometric $f$-neighbor of $x$ for some $f \in \mathcal{F}$, then $(Q(x), Q(y)) \sim (Q(v_{S(x),S(y),f}), Q(w_{S(x),S(y),f}))$.

(ii) If $\dim(\operatorname{section}(x, y)) = d$, then $x = y$.

(iii) If $\dim(\operatorname{section}(x, y)) = d - 1$, then there exists $f' \in \mathcal{F}$ such that $y$ is a geometric $f'$-neighbor of $x$.

The induction basis is trivial: if $\ell(x) = \ell(y) = 0$, then $x = y = r$, which means that $y$ is not a geometric $f$-neighbor of $x$ and furthermore $\dim(\operatorname{section}(x, y)) = d$ by (P2). Now, let the statement be true for levels $\leq l$ and let $\ell(x) = \ell(y) = l + 1$. Define $x_p := P(x)$, $y_p := P(y)$, $H := \operatorname{section}(x, y)$ and $H_p := \operatorname{section}(x_p, y_p)$. Since (R2) has already been shown, we can conclude $H_p \supseteq H$ and thus $\dim(H_p) \geq \dim(H)$. If $\dim(H) < d - 1$, nothing has to be shown. We therefore assume $\dim(H) \geq d - 1$ and investigate different cases:

∙ Case 1: $\dim(H_p) = d$. By the induction hypothesis, (ii) holds for $x_p$ and $y_p$, which means $x_p = y_p$. (P1) yields $Q(x_p) \sim Q(u_{S(x_p)})$. Set $x_p' := u_{S(x_p)}$, $x' := C(x_p', I(x))$ and $y' := C(x_p', I(y))$. With an appropriate $\tau \in \operatorname{Aut}_{\operatorname{aff}}(\mathbb{R}^d)$, this yields
$$\dim(\operatorname{section}(x', y')) = \dim(\tau(\operatorname{section}(C(x_p, I(x)), C(x_p, I(y))))) = \dim(\operatorname{section}(x, y)) = \dim(H)\,. \tag{3.3}$$

– Case 1.1: $\dim(H) = d$. Then, (R3') yields $x' = y'$, i.e. $I(x) = I(y)$ and thus $x = y$.

– Case 1.2: $\dim(H) = d - 1$. Then, by (R3'), there exists $f \in \mathcal{F}$ such that $y'$ is a geometric $f$-neighbor of $x'$. Since $(Q(x), Q(y)) \sim (Q(x'), Q(y'))$, $y$ is a geometric $f$-neighbor of $x$. This shows (iii). To show (i), assume that $y$ is a geometric $\tilde{f}$-neighbor of $x$. Then, $(x)_{\tilde{f}} = \operatorname{section}(x, y) = (x)_f$ and thus $\tilde{f} = f$. Statement (i) then follows from
$$(Q(x), Q(y)) \sim (Q(x'), Q(y')) \overset{\text{(R1')}}{\sim} (Q(v_{S(x),S(y),f}), Q(w_{S(x),S(y),f}))\,.$$

∙ Case 2: $\dim(H) = \dim(H_p) = d - 1$. By the induction hypothesis, (iii) holds for $x_p$ and $y_p$. Hence, a facet $f_p \in \mathcal{F}$ exists such that $y_p$ is a geometric $f_p$-neighbor of $x_p$. Let $x_p' := v_{S(x_p),S(y_p),f_p}$, $y_p' := w_{S(x_p),S(y_p),f_p}$, $x' := C(x_p', I(x))$ and $y' := C(y_p', I(y))$. By statement (i) of the induction hypothesis, $(Q(x_p), Q(y_p)) \sim (Q(x_p'), Q(y_p'))$. The argument from Equation (3.3) yields $\dim(\operatorname{section}(x', y')) = d - 1$. By (R3'), there exists $f \in \mathcal{F}$ such that $y'$ is a geometric $f$-neighbor of $x'$. The statements (i) and (iii) then follow as in Case 1.2.

By examining children of representants in a regular representation, the values of the functions $N$, $\Omega$, $F^p$ can be computed to automatically generate lookup tables for these functions. This process is implemented in the sfcpp library.

3.5 Tree Isomorphisms

In this chapter, we introduce isomorphisms between trees. We then show isomorphisms between various trees that have already been defined and add the neighbor structure from Section 3.3 to algebraic $b$-index-trees. Furthermore, we show that neighbor-finding on isomorphic trees is equivalent. Isomorphisms will also be used later to

∙ formulate conditions for a certain state invariance of neighborship, cf. Section 4.3,

∙ establish a framework to describe the computation of additional information for a node, for example its state, its point matrix or its coordinates. This is shown in Section 4.2.

Definition 3.47 (Isomorphism).

(1) Let $T = (\mathcal{V}, C)$ and $T' = (\mathcal{V}', C')$ be $b$-index-trees. A map $\phi^V : \mathcal{V} \to \mathcal{V}'$ is called $b$-index-tree isomorphism if it is bijective and $\phi^V(C(v, j)) = C'(\phi^V(v), j)$ for all $v \in \mathcal{V}$, $j \in \mathcal{B}$. We write $\phi^V : T \cong T'$.

(2) Let $T = (\mathcal{V}, C, \mathcal{S}, S)$ and $T' = (\mathcal{V}', C', \mathcal{S}', S')$ be state-$b$-index-trees. A pair $(\phi^V, \phi^S)$ is called state-$b$-index-tree isomorphism if $\phi^V : (\mathcal{V}, C) \cong (\mathcal{V}', C')$ and $\phi^S : \mathcal{S} \to \mathcal{S}'$ is a bijective function satisfying
$$\phi^S(S(v)) = S'(\phi^V(v))$$
for all $v \in \mathcal{V}$. We write $(\phi^V, \phi^S) : T \cong T'$.

(3) Let $T = (\mathcal{V}, C, \mathcal{S}, S, \mathcal{F}, N, \Omega, F^p)$ and $T' = (\mathcal{V}', C', \mathcal{S}', S', \mathcal{F}', N', \Omega', F^{p\prime})$ be neighbor-$b$-index-trees. A tuple $(\phi^V, \phi^S, \phi^F)$ is called neighbor-$b$-index-tree isomorphism if $(\phi^V, \phi^S) : (\mathcal{V}, C, \mathcal{S}, S) \cong (\mathcal{V}', C', \mathcal{S}', S')$ and $\phi^F : \mathcal{F} \to \mathcal{F}'$ is a bijective function satisfying
$$N(j, s, f) = N'(j, \phi^S(s), \phi^F(f))\,,$$
$$\Omega(j, s, \tilde{s}, f) = \Omega'(j, \phi^S(s), \phi^S(\tilde{s}), \phi^F(f))\,,$$
$$\phi^F(F^p(j, s, f)) = F^{p\prime}(j, \phi^S(s), \phi^F(f))$$
for all $j \in \mathcal{B}$, $s, \tilde{s} \in \mathcal{S}$, $f \in \mathcal{F}$. We write $(\phi^V, \phi^S, \phi^F) : T \cong T'$.

In any of these cases, if an isomorphism between $T$ and $T'$ exists, we write $T \cong T'$ and say that $T$ and $T'$ are isomorphic. This is an equivalence relation.

The next two propositions show that isomorphisms have convenient properties:

Proposition 3.48. Let $\phi^V : (\mathcal{V}, C, P, \ell, r) \cong (\mathcal{V}', C', P', \ell', r')$. Then,
$$\phi^V(r) = r'\,,$$
$$P'(\phi^V(v)) = \phi^V(P(v)) \quad (\text{if } v \neq r)\,,$$
$$\ell'(\phi^V(v)) = \ell(v)\,,$$
$$I'(\phi^V(v)) = I(v) \quad (\text{if } v \neq r)$$
for all $v \in \mathcal{V}$. Between two $b$-index-trees $T, T'$, there is exactly one isomorphism $\phi^V_{T \to T'} : T \cong T'$.

Proof. Since $\phi^V$ is bijective, there exists $v \in \mathcal{V}$ with $\phi^V(v) = r'$. Assume that $v \neq r$; then there exist $w \in \mathcal{V}$ and $j \in \mathcal{B}$ such that $C(w, j) = v$. But this means that
$$r' = \phi^V(v) = \phi^V(C(w, j)) = C'(\phi^V(w), j)\,,$$
which is a contradiction. Thus, $\phi^V(r) = r'$. This shows that the other equations are well-defined for $v \neq r$. Let $v \in \mathcal{V} \setminus \{r\}$. Then,
$$P'(\phi^V(v)) = P'(\phi^V(C(P(v), I(v)))) = P'(C'(\phi^V(P(v)), I(v))) = \phi^V(P(v))\,,$$
$$I'(\phi^V(v)) = I'(\phi^V(C(P(v), I(v)))) = I'(C'(\phi^V(P(v)), I(v))) = I(v)\,.$$
The level equation follows by induction on the level of $v$. The equations $\phi^V(r) = r'$ and $\phi^V(C(v, j)) = C'(\phi^V(v), j)$ uniquely define a function $\phi^V$, and this function is an isomorphism. This can also be shown by induction on the level.

Proposition 3.49. Let $T, T'$ be neighbor-$b$-index-trees and $(\phi^V, \phi^S, \phi^F) : T \cong T'$ as in Definition 3.47. Let $v, w \in \mathcal{V}$ and $f \in \mathcal{F}$. Then, $w$ is a depth-$k$ $f$-neighbor of $v$ in $T$ if and only if $\phi^V(w)$ is a depth-$k$ $\phi^F(f)$-neighbor of $\phi^V(v)$ in $T'$.

Proof.

"$\Rightarrow$": We prove this statement by induction on $\ell(v)$. Let $w$ be a depth-$k$ $f$-neighbor of $v$ in $T$. Then, $v, w \neq r$ and thus also $\phi^V(v), \phi^V(w) \neq r'$.

∙ Case 1: $P(v) = P(w)$. Then, $I(w) = N(I(v), S(P(v)), f)$ and $k = 1$. By Proposition 3.48, we have $P'(\phi^V(v)) = \phi^V(P(v)) = \phi^V(P(w)) = P'(\phi^V(w))$. Furthermore,
$$N'(I'(\phi^V(v)), S'(P'(\phi^V(v))), \phi^F(f)) = N'(I(v), S'(\phi^V(P(v))), \phi^F(f)) = N'(I(v), \phi^S(S(P(v))), \phi^F(f)) = N(I(v), S(P(v)), f) = I(w) = I'(\phi^V(w))\,.$$
This shows that $\phi^V(w)$ is a depth-1 $\phi^F(f)$-neighbor of $\phi^V(v)$ in $T'$.

∙ Case 2: $P(v) \neq P(w)$. Then, $P'(\phi^V(v)) = \phi^V(P(v)) \neq \phi^V(P(w)) = P'(\phi^V(w))$. By assumption, $f_p := F^p(I(v), S(P(v)), f) \neq \bot$ and $w_p := P(w)$ is a depth-$(k-1)$ $f_p$-neighbor of $v_p := P(v)$. By the induction hypothesis, $\phi^V(w_p)$ is a depth-$(k-1)$ $\phi^F(f_p)$-neighbor of $\phi^V(v_p)$. Here, the definition yields
$$\phi^F(f_p) = F^{p\prime}(I(v), \phi^S(S(P(v))), \phi^F(f)) = F^{p\prime}(I'(\phi^V(v)), S'(P'(\phi^V(v))), \phi^F(f))$$
and
$$I'(\phi^V(w)) = I(w) = \Omega(I(v), S(P(v)), S(P(w)), f) = \Omega'(I(v), \phi^S(S(P(v))), \phi^S(S(P(w))), \phi^F(f)) = \Omega'(I'(\phi^V(v)), S'(P'(\phi^V(v))), S'(P'(\phi^V(w))), \phi^F(f))$$
by assumption. Thus, $\phi^V(w)$ is a depth-$k$ $\phi^F(f)$-neighbor of $\phi^V(v)$ in $T'$.

"$\Leftarrow$": This direction follows from what we have proven above by applying the inverse isomorphism.

Now, we can revisit the geometric $b$-index-tree from Definition 3.12 and the algebraic $b$-index-trees with or without history from Definition 3.14 and Definition 3.13.

Definition 3.50. For neighbor-finding on SFCs, we want to turn an algebraic $b$-index-tree $T = (\mathcal{V}, C, \mathcal{S}, S)$ (with or without history) for a $b$-state-system $\mathbf{S}$ into a neighbor-$b$-index-tree. Assume that $\hat{T}' = (\mathcal{V}', C', \mathcal{S}, S', \mathcal{F}, N, \Omega, F^p)$ is an extended geometric $b$-index-tree for a $b$-specification $(\mathbf{S}, M, Q(r))$ and a facet specification $(\mathcal{F}, \Phi)$. This has been defined in Definition 3.39. We can now define the extended algebraic $b$-index-tree $\hat{T} := (\mathcal{V}, C, \mathcal{S}, S, \mathcal{F}, N, \Omega, F^p)$ (with or without history). Then, $(\phi^V, \phi^S, \phi^F) : \hat{T}' \cong \hat{T}$, where $\phi^V = \phi^V_{\hat{T}' \to \hat{T}}$, $\phi^S = \operatorname{id}_{\mathcal{S}}$ and $\phi^F = \operatorname{id}_{\mathcal{F}}$. By Proposition 3.49, neighbor-finding on $\hat{T}$ is equivalent to neighbor-finding on $\hat{T}'$.

Using an isomorphism, we can also define coordinates on $b$-index-trees, which is especially useful if they represent $k^d$-trees. A coordinate vector of a tree node is a vector $u \in \mathbb{N}_0^d$.

Definition 3.51 (Coordinates). Let $\mathbf{S}$ be a $b$-state-system. Let $b = k^d$ and for every $s \in \mathcal{S}$, let $\kappa_s : \mathcal{B} \to \{0, \dots, k-1\}^d$ be a bijective function that assigns a $d$-dimensional coordinate vector to a child index in a node of state $s$. Using a similar construction to Definition 3.12, we can define a $b$-index-tree $T_\kappa = (\mathcal{V}, C, P, \ell, r)$ such that
$$\mathcal{V} \subseteq \mathbb{N}_0 \times \mathbb{N}_0^d \times \mathcal{S}\,,$$
$$r = (0, (0, \dots, 0), s_r) \in \mathcal{V}\,,$$
$$\ell((l, u, s)) = l\,,$$
$$I((l, u, s)) = \kappa_s^{-1}(u_1 \bmod k, \dots, u_d \bmod k)\,,$$
$$C((l, u, s), j) = (l + 1,\ k \cdot u + \kappa_s(j),\ S^c(s, j))\,,$$
$$P((l, u, s)) = (l - 1, (u_1 \operatorname{div} k, \dots, u_d \operatorname{div} k), s_p) \text{ for some } s_p \in \mathcal{S}\,.$$

We call $T_\kappa$ the coordinate-$b$-index-tree for $\kappa$. Now, let $T'$ be another $b$-index-tree. We can define coordinate vectors on $T'$ by using an isomorphism $\phi^V := \phi^V_{T' \to T_\kappa}$ (cf. Proposition 3.48): define $\hat{\kappa} : \mathcal{V}' \to \mathbb{N}_0^d$ by $\hat{\kappa}(v') := u$, where $(l, u, s) := \phi^V(v')$.

4 Algorithms

In this section, we first present a new algorithm for finding the $f$-neighbor of a node $v$ in a neighbor-$b$-index-tree if it exists. We then prove its correctness and its average-case and worst-case runtime complexity. For most space-filling curves, we obtain an average-case runtime complexity of $\mathcal{O}(1)$. In Section 4.2, we show how to embed the neighbor-finding algorithm efficiently in a traversal of all nodes at a fixed level. Furthermore, we show how, based on the framework of isomorphisms introduced in Section 3.5, we can elegantly formulate algorithms for the computation of states, point matrices and coordinate vectors. In Section 4.3, we give formal criteria under which executing the algorithm with a wrong state of a node for all facets $f \in \mathcal{F}$ still yields all neighbors, but in permuted order.

4.1 General Neighbor-Finding Algorithm

Algorithm 1 is a general neighbor-finding algorithm for neighbor-$b$-index-trees. To formulate a bound on its runtime, we use the neighbor-depth $D_T(v, f)$ from Definition 3.18.

Algorithm 1 Neighbor-finding in a neighbor-$b$-index-tree

 1: function Neighbor($v \in \mathcal{V}$, $f \in \mathcal{F}$)
 2:   if $\ell(v) = 0$ then                        ◁ If $v$ is the root node,
 3:     return $\bot$                              ◁ return an invalid result.
 4:   end if
 5:   $p_v := P(v)$                                ◁ Let $p_v$ be the parent of $v$,
 6:   $j_v := I(v)$                                ◁ $j_v$ the index of $v$,
 7:   $s_p := S(p_v)$                              ◁ $s_p$ the state of the parent,
 8:   $j_w := N(j_v, s_p, f)$                      ◁ and $j_w$ the index of the direct neighbor of $v$.
 9:   if $j_w \neq \bot$ then                      ◁ If this neighbor index is valid,
10:     return $C(p_v, j_w)$                       ◁ return the neighbor.
11:   end if
12:   $f_p := F^p(j_v, s_p, f)$                    ◁ Find the corresponding facet of the parent.
13:   if $f_p = \bot$ then                         ◁ If it does not exist,
14:     return $\bot$                              ◁ return an invalid result.
15:   end if
16:   $p_w :=$ Neighbor($p_v$, $f_p$)              ◁ Find the parent's neighbor recursively.
17:   if $p_w = \bot$ then                         ◁ If it does not exist,
18:     return $\bot$                              ◁ return an invalid result.
19:   end if
20:   $s_{p_w} := S(p_w)$                          ◁ Let $s_{p_w}$ be the parent's neighbor's state.
21:   $\tilde{j}_w := \Omega(j_v, s_p, s_{p_w}, f)$ ◁ Let $\tilde{j}_w$ be the index of its child adjacent to $v$.
22:   if $\tilde{j}_w = \bot$ then                 ◁ If it is invalid,
23:     return $\bot$                              ◁ return an invalid result.
24:   end if
25:   return $C(p_w, \tilde{j}_w)$                 ◁ Return the adjacent child.
26: end function

Theorem 4.1 proves major results about the correctness and the runtime of Algorithm 1. The runtime assumptions in Theorem 4.1 (b) are satisfied by the extended algebraic $b$-index-trees with or without history from Definition 3.50.

Theorem 4.1. Let $T = (\mathcal{V}, C, \mathcal{S}, S, \mathcal{F}, N, \Omega, F^p)$ be a neighbor-$b$-index-tree and $v \in \mathcal{V}$, $f \in \mathcal{F}$.

(a) If $v$ has an $f$-neighbor $w$, then Neighbor$(v, f) = w$. If $v$ has no $f$-neighbor, then Neighbor$(v, f) = \bot$.

(b) If $S, N, \Omega, F^p$ can be computed in $\mathcal{O}(1)$ and $\hat{t}$ is a nondecreasing function such that $\hat{t}(\ell(v))$ is an upper bound for the runtime of $C(v, j)$, $P(v)$, $I(v)$, there is a constant $D \in \mathbb{R}$ such that
$$\hat{t}_{\text{Neighbor}}(v, f) := D_T(v, f) \cdot (3\hat{t}(\ell(v)) + D)$$
is an upper bound for the actual runtime $t_{\text{Neighbor}}(v, f)$ of Neighbor$(v, f)$.

Proof. Choose $D$ as a constant comprising all computational overhead generated from calls to $S, N, \Omega, F^p$, assignments, if statements etc., excluding those inside a possible recursion in Line 16. We prove both parts by induction on $\ell(v)$. If $\ell(v) = 0$, then $v = r$ and $v$ has no neighbor. The algorithm correctly returns $\bot$ in Line 3 with a time less than $D \leq \hat{t}_{\text{Neighbor}}(v, f)$. Now assume that the claims are true up to level $l$ and that $\ell(v) = l + 1$. Note that $s_p = S(P(v))$. We have to distinguish several cases:

∙ Case 1: $j_w \neq \bot$.

(a) Let $w := C(P(v), j_w)$. The node $w$ is a depth-1 $f$-neighbor of $v$, since $I(w) = j_w = N(I(v), s_p, f) = N(I(v), S(P(v)), f)$. Thus, the algorithm returns the correct result in Line 10.

(b) The runtime in this case consists of one call to $P$, $I$ and $C$, respectively. This yields the term $3\hat{t}(\ell(v))$; the rest is covered by $D$.

∙ Case 2: $j_w = \bot$ and $f_p = \bot$.

(a) By definition, $v$ has no $f$-neighbor. Thus, the algorithm correctly returns $\bot$.

(b) It follows by definition that $D_T(v, f) = \ell(v) + 1 \geq 1$ and thus
$$t_{\text{Neighbor}}(v, f) \leq D + 2\hat{t}(\ell(v)) \leq \hat{t}_{\text{Neighbor}}(v, f)\,.$$

∙ Case 3: $j_w = \bot$, $f_p \neq \bot$ and $p_w = \bot$.

(a) By the induction hypothesis, $p_v = P(v)$ has no $f_p$-neighbor. Since $j_w = \bot$, this means that $v$ has no $f$-neighbor and the algorithm correctly returns $\bot$.

(b) The runtime of this call to Neighbor consists of the recursive call, the calls to $P$, $I$ and the rest covered by $D$. Using the induction hypothesis and $D_T(v, f) = \ell(v) + 1 = \ell(P(v)) + 2 = D_T(P(v), f_p) + 1$, we obtain
$$t_{\text{Neighbor}}(v, f) \leq 2\hat{t}(\ell(v)) + D + D_T(P(v), f_p) \cdot (3\hat{t}(\ell(P(v))) + D) \leq (D_T(P(v), f_p) + 1)(3\hat{t}(\ell(v)) + D) = \hat{t}_{\text{Neighbor}}(v, f)\,.$$

∙ Case 4: $j_w = \bot$, $f_p \neq \bot$ and $p_w \neq \bot$.

(a) In this case, the induction hypothesis tells us that $p_w$ is the $f_p$-neighbor of $p_v$. If there is an $f$-neighbor $w$ of $v$, $P(w)$ must be an $f_p$-neighbor of $p_v = P(v)$, so $P(w) = p_w$. Furthermore,
$$I(w) = \Omega(I(v), S(P(v)), S(P(w)), f) = \Omega(j_v, s_p, s_{p_w}, f) = \tilde{j}_w\,.$$
This means that $\tilde{j}_w \neq \bot$ and we have
$$C(p_w, \tilde{j}_w) = C(P(w), I(w)) = w\,.$$
If $\tilde{j}_w = \bot$, there cannot be any $f$-neighbor of $v$ and the algorithm correctly returns $\bot$.

(b) If $\tilde{j}_w \neq \bot$, we have $D_T(P(v), f_p) + 1 = D_T(v, f)$. Otherwise, $P(v)$ has an $f_p$-neighbor, but $v$ has no $f$-neighbor, and Lemma 3.17 yields $D_T(P(v), f_p) + 1 \leq \ell(P(v)) + 1 < \ell(v) + 1 = D_T(v, f)$. But then,

$$t_{\text{Neighbor}}(v, f) \leq 3\hat{t}(\ell(v)) + D + D_T(P(v), f_p) \cdot (3\hat{t}(\ell(P(v))) + D) \leq (D_T(P(v), f_p) + 1)(3\hat{t}(\ell(v)) + D) \leq \hat{t}_{\text{Neighbor}}(v, f)\,.$$

Theorem 4.1 can be employed to derive more runtime results:

Corollary 4.2. Under the assumptions of Theorem 4.1 (b), the following runtime bounds hold:

(a) Algorithm 1 has a worst-case runtime complexity of $\mathcal{O}(\ell(v) \cdot \hat{t}(\ell(v)))$.

(b) Let $\mathcal{V}_l := \{ v \in \mathcal{V} \mid \ell(v) = l \}$ be the set of all nodes at level $l$. If there exist constants $c \in \mathbb{R}$ and $q \in (0, 1)$ such that
$$\frac{|\{ (v, f) \in \mathcal{V}_l \times \mathcal{F} \mid D_T(v, f) \geq k \}|}{|\mathcal{V}_l| \cdot |\mathcal{F}|} \leq c q^{k-1}$$
for every $l \in \mathbb{N}_0$ and every $k \in \mathbb{N}_{\geq 1}$, then Algorithm 1 has an average-case runtime of $\mathcal{O}(\hat{t}(\ell(v)))$. The average is taken across each set $\mathcal{V}_l \times \mathcal{F}$.

Proof.

(a) By Lemma 3.17 and Definition 3.18, we have 퐷푇 (푣, 푓) ≤ ℓ(푣) + 1. The claim then follows from Theorem 4.1.

(b) We want to determine the average neighbor-depth of nodes 푣 ∈ 풱푙 with level 푙. Observe that

(1/(|풱푙| · |ℱ|)) ∑_{푣∈풱푙, 푓∈ℱ} 퐷푇 (푣, 푓)
= (1/(|풱푙| · |ℱ|)) ∑_{푣∈풱푙, 푓∈ℱ} ∑_{푘=1}^{퐷푇 (푣,푓)} 1
= (1/(|풱푙| · |ℱ|)) ∑_{푘=1}^{푙+1} ∑_{푣∈풱푙, 푓∈ℱ, 퐷푇 (푣,푓)≥푘} 1
= ∑_{푘=1}^{푙+1} |{(푣, 푓) ∈ 풱푙 × ℱ | 퐷푇 (푣, 푓) ≥ 푘}| / (|풱푙| · |ℱ|)
≤ ∑_{푘=1}^{푙+1} 푐푞^{푘−1} ≤ ∑_{푘=1}^{∞} 푐푞^{푘−1} = 푐/(1 − 푞) ,

which is a bound independent of 푙. By Theorem 4.1, this implies an average-case runtime of 풪(푡^(ℓ(푣))).

Remark 4.3. The conditions of Corollary 4.2 (b) are satisfied by sufficiently “normal” space-filling curves like those used in this thesis. For example, in the 2D Hilbert curve, a square of level 푙1 contains 4^{푙2−푙1} subsquares of level 푙2 > 푙1. Of these, for each 푓 ∈ ℱ, 2^{푙2−푙1} squares have no 푓-neighbor. Thus, we can choose 푞 = 2/4 = 1/2 and 푐 = 1. Assuming constant-time arithmetic operations, this yields an average-case runtime in 풪(1) for an extended algebraic 푏-index-tree with or without history as defined in Definition 3.50.
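The geometric series argument above can be checked numerically. The following sketch (our addition, not part of the thesis code) evaluates the truncated series with the 2D Hilbert values 푐 = 1, 푞 = 1/2 from Remark 4.3 and confirms that it stays below 푐/(1 − 푞) = 2 for every level:

```cpp
#include <cassert>
#include <cmath>

// Truncated geometric series from the proof of Corollary 4.2 (b):
// sum_{k=1}^{l+1} c * q^(k-1), which bounds the average neighbor-depth
// of the nodes at level l.
double averageDepthBound(int l, double c, double q) {
  double sum = 0.0;
  for (int k = 1; k <= l + 1; ++k) {
    sum += c * std::pow(q, k - 1);
  }
  return sum;  // always at most c / (1 - q)
}
```

For 푐 = 1 and 푞 = 1/2 the bound evaluates to 2 − 2^{−푙}, so the average neighbor-depth is bounded by 2 independently of the level.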

4.2 Other Algorithms

In this section, we want to show other algorithms related to Algorithm 1. For example, Algorithm 1 can be embedded into a traversal of all nodes at level 푙 as shown in Algorithm 2. During the traversal, the state of a node is computed “on the fly” and does not cause significant overhead. Thus, under the conditions of Remark 4.3, a traversal can be performed in 풪(푛), where 푛 = 푏^푙 is the number of nodes (i.e. grid cells) at level 푙.

Algorithm 2 Traversing a tree
1: function Traverse(푇 = (풱, 퐶, 풮, 푆, ℱ, 푁, Ω, 퐹 푝), 푙 ∈ N0)
2:     TraverseRecursively(푇, 푟, 푙)
3: end function
4: function TraverseRecursively(푇 = (풱, 퐶, 풮, 푆, ℱ, 푁, Ω, 퐹 푝), 푣 ∈ 풱, 푙 ∈ N0)
5:     if ℓ(푣) = 푙 then
6:         for all 푓 ∈ ℱ do
7:             푤 := Neighbor(퐼(푣), 푆(푃 (푣)), 푓)
8:             Do something with 푤
9:         end for
10:     else
11:         for all 푗 ∈ ℬ do
12:             TraverseRecursively(푇, 퐶(푣, 푗), 푙)
13:         end for
14:     end if
15: end function
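The recursion structure of Algorithm 2 can be sketched in C++ as follows; this is a schematic version with the neighbor lookup replaced by a generic visitor (all function and parameter names are our choice, not sfcpp API):

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Schematic version of Algorithm 2: visit all b^l nodes at level
// targetLevel of a b-index-tree in curve order. The visitor stands in
// for the "Do something with w" step; there, Neighbor(v, f) would be
// called for every facet f.
void traverseRecursively(int b, int level, int targetLevel,
                         uint64_t position,
                         const std::function<void(uint64_t)>& visit) {
  if (level == targetLevel) {
    visit(position);  // leaf of the traversal: one grid cell
    return;
  }
  for (int j = 0; j < b; ++j) {
    // child j of (level, position) has position * b + j
    traverseRecursively(b, level + 1, targetLevel, position * b + j, visit);
  }
}
```

Since each node is visited exactly once and the per-node work is constant on average (Remark 4.3), the whole traversal is linear in the number of cells.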

Other useful algorithms like state computation and conversion between curve position and coordinates can be formulated elegantly in the framework of Section 3.5. They are special cases of Algorithm 3, an algorithm to evaluate an isomorphism between two 푏-index-trees that can be seen as a generalization of several well-known algorithms. From Proposition 3.48, we know that there is exactly one isomorphism 휙^푉_{푇→푇′} : 푇 ≅ 푇′ for each pair (푇, 푇′) of 푏-index-trees. To distinguish the functions of different trees, we use the tree as a subscript of these functions. For example, 퐶푇′ means the child function of 푇′.

Algorithm 3 Computing a tree isomorphism
1: function Isomorphism(푇, 푇′, 푣 ∈ 풱푇)
2:     if 푣 = 푟푇 then ◁ If 푣 is the root node of 푇,
3:         return 푟푇′ ◁ return the root node of 푇′.
4:     end if
5:     return 퐶푇′(Isomorphism(푇, 푇′, 푃푇 (푣)), 퐼푇 (푣)) ◁ Else proceed recursively.
6: end function

Obviously, Algorithm 3 runs in time 풪(ℓ(푣)) if the functions 퐶푇′ and 푃푇 are computable in 풪(1), i.e. with a runtime bound independent of ℓ(푣). In the following, we give examples of how to use this algorithm.

Example 4.4 (State computation). Let 푇′ be an algebraic 푏-index-tree (with or without history). Given a level 푙 and a position 푗, we want to find the unique node (푙, 푗, 푠) ∈ 풱푇′. To this end, we set 푇 to the level-position 푏-index-tree from Definition 3.2. We then obtain (푙, 푗, 푠) = 휙^푉_{푇→푇′}((푙, 푗)). Since 퐶푇′ and 푃푇 are efficiently computable, Algorithm 3 can be used to compute 푠. Here, 푠 can be a single state or a state history as in Definition 3.14.

Example 4.5 (Coordinate conversion). In this example, we want to show how to convert between array index and grid coordinates for suitable grids. Assume that 푇 is the level-position 푏-index-tree from Definition 3.2 and 푇′ is a coordinate-푏-index-tree as in Definition 3.51. Using the 푏-index-tree isomorphisms 휙^푉_{푇→푇′} and 휙^푉_{푇′→푇}, we can convert between an array index 푗 and a coordinate representation (푢1, . . . , 푢푑). Note that in order to be able to compute 푃푇′ efficiently, it is necessary for some SFCs to define 푇′ using a state history as in the algebraic 푏-index-tree with history from Definition 3.14.

Example 4.6 (Point matrix computation). By choosing 푇 as the level-position 푏-index-tree and 푇′ as a geometric 푏-index-tree, computing 휙^푉_{푇→푇′}(푣) not only yields the state, but also the point matrix associated with 푣.
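As a concrete instance of Example 4.4, the following sketch computes the state of a 2D Hilbert node (푙, 푗) by walking along the base-4 digits of 푗, exactly as Algorithm 3 does for the level-position tree. The integer encoding 퐻 = 0, 퐴 = 1, 퐵 = 2, 푅 = 3 and the table below (derived from the permutations in Example 4.8) are our assumptions, not fixed by the thesis:

```cpp
#include <cassert>
#include <cstdint>

// State of the 2D Hilbert node (level, position), computed by applying
// the child-state function S^c along the base-4 digits of the position,
// from the most significant digit (root) downwards.
int hilbertState(int level, uint64_t position) {
  // childState[s][j] = S^c(s, j); rows H, A, B, R; encoding H=0,A=1,B=2,R=3.
  static const int childState[4][4] = {
      {1, 0, 0, 2},  // H -> A, H, H, B
      {0, 1, 1, 3},  // A -> H, A, A, R
      {3, 2, 2, 0},  // B -> R, B, B, H
      {2, 3, 3, 1}   // R -> B, R, R, A
  };
  int s = 0;  // root state H
  for (int i = level - 1; i >= 0; --i) {
    int digit = (position >> (2 * i)) & 3;  // i-th base-4 digit of j
    s = childState[s][digit];
  }
  return s;
}
```

For the nodes from Example 5.4, this yields state 푅 for both (3, 28) and (3, 35), matching Figure 18.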

4.3 Exploiting Symmetry

In order to apply Theorem 4.1, the function 푆 must be efficiently computable, which in many cases means that state information must be stored in 푣. For example, this is realized in an extended algebraic 푏-index-tree. Storing states in memory may significantly increase memory usage. Therefore, the user might want to apply the algorithm without knowing the state of a tree node, i.e. just given a level 푙 and an index 푗. One possibility is then to compute the corresponding tree node 푣 in time 풪(ℓ(푣)) as shown in Example 4.4 and then apply the neighbor-finding algorithm. In this section, we show that if an extended algebraic 푏-index-tree 푇 without history (cf. Definition 3.50) satisfies rather common conditions, setting 푣 = (푙, 푗, 푠) and calling Neighbor(푣, 푓) for all 푓 ∈ ℱ yields the same neighbors for each 푠 ∈ 풮, just in permuted order. This essentially corresponds to the existence of a local model for the given SFC, cf. Remark 2.5, i.e. it can be applied to global models of Hilbert, Peano and Sierpinski curves.

In this section, let 푇 = (풱, 퐶, 풮, 푆, ℱ, 푁, Ω, 퐹 푝) be an extended algebraic 푏-index-tree without history for the 푏-specification (풮, 푆푐, 푠푟, 푀, 푄(푟)). We first motivate the introduction of a group structure on 풮 for a certain class of SFCs and then show that such a group structure together with suitable facet correspondences can lead to this symmetry property.

Definition 4.7. If the functions 휎푗 : 풮 → 풮, 푠 ↦→ 푆푐(푠, 푗) are bijective for each 푗 ∈ ℬ, we define the group 풢푇 as the subgroup of the permutation group (or symmetric group) on 풮 spanned by these 휎푗.

Example 4.8. For the 2D Hilbert curve, we can use Table 1 to explicitly obtain the permutations in cycle notation:

휎0 = (퐻퐴)(퐵푅)

휎1 = id풮

휎2 = id풮

휎3 = (퐻퐵)(퐴푅) .

Hence, 풢푇 = {id, (퐻퐴)(퐵푅), (퐻퐵)(퐴푅), (퐻푅)(퐴퐵)}, which is isomorphic to the Klein four-group Z2 × Z2 as noted in Remark 2.1. Note that the permutation (퐻푅)(퐴퐵) is not equal to any of the 휎푗, 푗 ∈ ℬ. This corresponds to the fact that the state 푅 does not occur at level 1 of the Hilbert curve.

Remark 4.9. In many cases, the group 풢푇 can be identified with the set of states 풮: For the Hilbert, Peano and Sierpinski curves, the function 휑 : 풢푇 → 풮, 휋 ↦→ 휋(푠푟) is a bijection. This is due to the fact that, as indicated in Remark 2.1, every state corresponds to a particular rotation or reflection of the coordinate system. Together, these rotations and reflections form a group. A permutation 휎푗 that describes the state transition from a parent to its 푗-th child also corresponds to such a rotation or reflection.

Under these assumptions, we can define a group multiplication · : 풮 × 풮 → 풮, (푠1, 푠2) ↦→ 휑(휑^{−1}(푠1) ∘ 휑^{−1}(푠2)) on the set of states. Then, 휑 : 풢푇 → 풮 is a group isomorphism. Moreover, every index 푗 ∈ ℬ corresponds to a state 퐺(푗) := 휑(휎푗). With these definitions, we can write

푆푐(푠, 푗) = 휎푗(푠) = 휎푗(휑(휑^{−1}(푠))) = 휑^{−1}(퐺(푗))(휑(휑^{−1}(푠)))
= 휑^{−1}(퐺(푗))(휑^{−1}(푠)(푠푟)) = (휑^{−1}(퐺(푗)) ∘ 휑^{−1}(푠))(푠푟)
= 휑(휑^{−1}(퐺(푗)) ∘ 휑^{−1}(푠)) = 퐺(푗) · 푠 .

푆푝(푠, 푗) = 퐺(푗)^{−1} · 퐺(푗) · 푆푝(푠, 푗) = 퐺(푗)^{−1} · 푆푐(푆푝(푠, 푗), 푗) = 퐺(푗)^{−1} · 푠 .

Now assume that some multiplication · : 풮 × 풮 → 풮 and a function 퐺 : ℬ → 풮 are given such that (풮, ·) is a group, 푆푐(푠, 푗) = 퐺(푗) · 푠 and 푆푝(푠, 푗) = 퐺(푗)^{−1} · 푠 for all 푠 ∈ 풮, 푗 ∈ ℬ. For any 푔 ∈ 풮, define the neighbor-푏-index-tree 푇푔 by the following conventions:

풱푇푔 := {(푙, 푗, 푠푔) | (푙, 푗, 푠) ∈ 풱}
푟푇푔 := (0, 0, 푠푟푔)
퐶푇푔 ((푙, 푗, 푠), 푖) := (푙 + 1, 푗푏 + 푖, 퐺(푖) · 푠)
푃푇푔 ((푙, 푗, 푠)) := (푙 − 1, 푗 div 푏, 퐺(푗 mod 푏)^{−1} · 푠)
ℓ푇푔 ((푙, 푗, 푠)) := 푙

퐼푇푔 ((푙, 푗, 푠)) := 푗 mod 푏
풮푇푔 := 풮푇
푆푇푔 ((푙, 푗, 푠)) := 푠
ℱ푇푔 := ℱ
푁푇푔 := 푁푇

Ω푇푔 := Ω푇
퐹 푝푇푔 := 퐹 푝푇 .

Note that while 풱푇푔 depends on 푔, the “implementation” of all functions here is independent of 푔. This means that we can replace any occurrence of 풱푇푔 in their domain by ⋃_{푔′∈풮} 풱푇푔′ without changing efficiency aspects. If Algorithm 1 is implemented using these extended domains, it works for all trees 푇푔 simultaneously. This is because it never uses 풱푇푔 and 푟푇푔, the only objects where the implementation depends on 푔. How can we exploit this? Let

휙^푉_푔 : 풱푇 → 풱푇푔 , (푙, 푗, 푠) ↦→ (푙, 푗, 푠푔)
휙^푆_푔 : 풮푇 = 풮 → 풮푇푔 = 풮 , 푠 ↦→ 푠푔 .

Then (휙^푉_푔, 휙^푆_푔) : 푇 ≃ 푇푔, meaning that 푇 and 푇푔 are isomorphic as state-푏-index-trees in the sense of Definition 3.47. Our last assumption is that for every 푔 ∈ 풮, there exists a map 휙^퐹_푔 : ℱ → ℱ such that (휙^푉_푔, 휙^푆_푔, 휙^퐹_푔) : 푇 ≃ 푇푔, meaning that 푇 and 푇푔 are isomorphic as neighbor-푏-index-trees in the sense of Definition 3.47. This corresponds to a permutation of facets induced by a rotation or reflection associated with the state transition 푠 ↦→ 푠푔.

Now assume that a level 푙 and a position 푗 are given but the state 푠 ∈ 풮 of the corresponding tree node, i.e. the state satisfying 푣푠 := (푙, 푗, 푠) ∈ 풱푇, is unknown. We want to investigate the results of calling Neighbor((푙, 푗, 푠′), 푓) where 푠′ ∈ 풮, i.e. just pretending that the state is 푠′. Defining the (unknown) state 푔 := 푠^{−1}푠′, we obtain

휙^푉_푔(푣푠) = (푙, 푗, 푠푔) = (푙, 푗, 푠푠^{−1}푠′) = (푙, 푗, 푠′) .

By Proposition 3.49, a node 푤 ∈ 풱푇 is an 푓-neighbor of 푣푠 if and only if 휙^푉_푔(푤) is a 휙^퐹_푔(푓)-neighbor of 휙^푉_푔(푣푠). Thus,

{휙^푉_푔(Neighbor((푙, 푗, 푠), 푓)) | 푓 ∈ ℱ} = {Neighbor((푙, 푗, 푠′), 휙^퐹_푔(푓)) | 푓 ∈ ℱ}
= {Neighbor((푙, 푗, 푠′), 푓) | 푓 ∈ ℱ} ,

where 휙^푉_푔(⊥) := ⊥. This means that using the state 푠′ instead of 푠, all the neighbors are still found, but in (possibly) unknown order and with a different state. For the runtime complexity, Theorem 4.1 and Corollary 4.2 can be applied to the tree 푇푔 instead of 푇.^7

^7 The value of 푔 depends on 푙 and 푗, which makes the conclusion of an average-case runtime for this method more difficult. However, there are only finitely many possible values for 푔 and they all have the same average-case runtime complexity of 풪(푡^(ℓ(푣))) under the conditions of Corollary 4.2 (b). The average-case runtime for computing neighbors of (푙, 푗, 푠′) for all 푗 ∈ {0, . . . , 푏^푙 − 1} is therefore bounded by the sum of the average-case complexities for all 푔 ∈ 풮, which is again 풪(푡^(ℓ(푣))).

5 Optimizations

In this section, we want to introduce different kinds of optimizations that can be used to make Algorithm 1 (and other algorithms) faster in practice. Some of these optimizations work for all curves and some only work for special curves.

5.1 Curve-Independent Optimizations

Optimization 5.1 (Avoiding recursion). The recursion in Algorithm 1 is a single recursion, so it can be replaced by two loops. The first loop walks up the tree until it finds a direct neighbor using 푁 or until it reaches the root node. In the former case, the second loop walks down the tree, tracking the neighbors using Ω until it reaches the level of the original node. In general, the second loop will need the states 푠 and facets 푓 computed in the first loop. These can be stored in arrays during the first loop. Optimization 5.11 investigates cases where the arrays and the second loop can be omitted. This optimization can also be applied to Algorithm 3.
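The two-loop structure can be illustrated on a deliberately trivial curve: the identity order in 1D with 푏 = 2, where the tables 푁 and Ω degenerate to simple formulas. This is only a sketch of the control flow; real implementations use the generated lookup tables and also track states and facets as described above:

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Minimal tables for the identity order in 1D, b = 2.
// f = 0 means "left", f = 1 means "right"; states are trivial here.
// N(j, f): index of the depth-1 f-neighbor inside the same parent, or -1.
int N(int j, int f) {
  if (f == 1) return j == 0 ? 1 : -1;
  return j == 1 ? 0 : -1;
}
// Omega(j, f): child index of the tracked neighbor inside the
// neighboring parent; crossing to the right lands on the leftmost child.
int Omega(int /*j*/, int f) { return f == 1 ? 0 : 1; }

std::optional<uint64_t> neighbor(int level, uint64_t j, int f) {
  // First loop: walk up until a neighbor inside a common parent is found.
  std::vector<int> indices;  // child indices passed on the way up
  uint64_t pos = j;
  int k = 0;   // number of levels walked up
  int jw = -1;
  for (; k < level; ++k) {
    jw = N(pos & 1, f);
    if (jw >= 0) break;
    indices.push_back(pos & 1);
    pos >>= 1;
  }
  if (jw < 0) return std::nullopt;  // reached the root: no neighbor
  uint64_t w = (pos >> 1 << 1) | jw;  // sibling at level (level - k)
  // Second loop: walk back down, tracking the neighbor using Omega.
  for (int i = k - 1; i >= 0; --i) {
    w = (w << 1) | Omega(indices[i], f);
  }
  return w;
}
```

For this toy curve the result is simply 푗 ± 1 with ⊥ at the boundary, but the up/down loop pair is exactly the structure that replaces the recursion for the real curves.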

Optimization 5.2 (Loop unrolling). In the iterative algorithm described in Optimization 5.1, it can be effective to unroll the first iteration of each loop. This is the case since by Remark 4.3, most curves have a high chance of finding the neighbor in the first loop. This probability gets even higher when applying Optimization 5.3. The unrolled iteration can then be optimized specifically.

Optimization 5.3 (Multi-level tables). Algorithm 1 traverses the tree one level per recursion. If enough space is available for the lookup tables, these lookup tables can be modified so that 푘 ≥ 2 levels can be handled per recursion. The algorithm then effectively behaves almost as if 푘 levels of the 푏-index-tree each had been “flattened” to a single level of a 푏^푘-index-tree. For example, a call 푁(푗, 푠, 푓) might allow 푗 ∈ {0, . . . , 푏^푘 − 1} instead of 푗 ∈ ℬ = {0, . . . , 푏 − 1}, as well as the calls to Ω, 퐹 푝, 푆푐 and (if available) 푆푝. Such tables may not be applicable to the case ℓ(푣) < 푘, in which case one may revert to using a smaller lookup table. This case can also appear inside a loop or recursion and it is relevant if and only if ℓ(푣) is not a multiple of 푘. Example 5.4 shows the computation of a neighbor in the 2D Hilbert curve with depth-2 tables.

In a case where the functions 푠 ↦→ 푆푐(푠, 푗) are all invertible, it is possible to replace the state of an ancestor by the state of a child to increase efficiency: For example, assume that 푣 = (푙, 푗, 푠) is a node of an algebraic 푏-index-tree representing the 2D Hilbert curve. Instead of calling 푁(푗 mod 푏^푘, 푆(푃^푘(푣)), 푓) to find a potential neighbor of 푣 inside its ancestor 푃^푘(푣) = (푃 ∘ . . . ∘ 푃)(푣), the definition of 푁 may be changed so that 푁(푗 mod 푏^푘, 푆(푣), 푓) yields the desired result. Of course, multi-level tables require more cache memory to be stored and thus can impact the cache performance of a grid traversal.
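A depth-2 child-state table can be generated mechanically from the depth-1 table by composing two transitions. The sketch below does this for the 2D Hilbert curve; the state encoding 퐻 = 0, 퐴 = 1, 퐵 = 2, 푅 = 3 is our assumption:

```cpp
#include <cassert>

// Depth-1 child-state table S^c for the 2D Hilbert curve
// (rows H, A, B, R with encoding H=0, A=1, B=2, R=3).
static const int Sc[4][4] = {
    {1, 0, 0, 2}, {0, 1, 1, 3}, {3, 2, 2, 0}, {2, 3, 3, 1}};

// Flatten two levels into one: Sc2[s][4*j1 + j0] = S^c(S^c(s, j1), j0),
// so one table lookup descends b^2 = 16 children at once.
void buildDepth2Table(int Sc2[4][16]) {
  for (int s = 0; s < 4; ++s)
    for (int j1 = 0; j1 < 4; ++j1)
      for (int j0 = 0; j0 < 4; ++j0)
        Sc2[s][4 * j1 + j0] = Sc[Sc[s][j1]][j0];
}
```

With this encoding, the entry Sc2[H][3] = R reproduces the value 푆̂푐(퐻, 3) = 푅 used in Example 5.4; the tables for 푁, Ω and 퐹 푝 can be flattened in the same way.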

Example 5.4. Consider the computation of the 푓-neighbor of the node 푣 = (3, 28, 푅) in the 2D Hilbert curve as shown in Figure 18, where 푓 encodes the right facet of 푣. With lookup tables of depth 2, we can skip two levels at once:


Figure 18: Positions and states of nodes at level 3 of the 2D Hilbert curve.

(1) Compute the index 푖 = 28 mod 푏^2 = 12 of 푣 inside its grandparent 푝푣 := 푃 (푃 (푣)) = (3 − 2, 28 div 푏^2, 푆̂푝(퐵, 12)) = (1, 1, 퐻). Here, the hat in 푆̂푝 indicates that a modified lookup table of depth 2 is used.

(2) Compute 푗푤 = 푁̂(12, 퐻, 푓) = ⊥. This means that there is no 푓-neighbor of 푣 inside 푝푣. (In a global model of the Hilbert curve, we can assume 푓푝 = 푓 in Algorithm 1.)

(3) Call the algorithm recursively with 푝푣 as a new parameter.
∙ Since ℓ(푝푣) = 1, lookup tables of depth 2 cannot be used (at least directly). Thus, compute the direct parent 푝푝푣 := 푃 (푝푣) = (0, 0, 퐻) of 푝푣 and the index 푖푝 := 퐼(푝푣) = 1.
∙ Compute 푗푝푤 := 푁(1, 퐻, 푓) = 2.
∙ Compute the 푓-neighbor 푝푤 := 퐶(푝푝푣, 푗푝푤) = (1, 2, 퐻) of 푝푣 and return it.

(4) Compute

˜푗푤 := Ω̂(푖, 푆(푝푣), 푆(푝푤), 푓) = Ω̂(12, 퐻, 퐻, 푓) = 3 .

(5) Compute the 푓-neighbor 푤 := 퐶̂(푝푤, ˜푗푤) = (1 + 2, 2 · 푏^2 + 3, 푆̂푐(퐻, 3)) = (3, 35, 푅) of 푣 and return it.

Optimization 5.5 (Precomputations). In the case of a regular grid, the level of all nodes handled is known in advance of the computation. This can be used for precomputations. In particular, if arrays are needed for the second loop in Optimization 5.1, they can be preallocated.

Optimization 5.6 (Eliminating useless checks). For all SFCs presented here, the cases 푓푝 = ⊥ and ˜푗푤 = ⊥ cannot occur. This means that the corresponding if-clauses can be omitted in implementations. By choosing stronger regularity conditions, it would be possible to exclude these two cases for all regular geometric 푏-index-trees, cf. Remark 3.33.

5.2 Curve-Dependent Optimizations

Optimization 5.7 (State groups). If the states 풮 form a group as explained in Remark 2.1 and Section 4.3, it may be possible to exploit this group structure for a more efficient computation of the functions 푆푐 and 푆푝. For example, if the group is isomorphic to Z2^푛 for some 푛 ∈ N0, the states can be stored as non-negative integers using bitwise XOR as the group operation.

Optimization 5.8 (Bit operations). If the branching factor 푏 of a given SFC is a power of two, many arithmetic operations can be replaced by bit operations. This can happen manually or automatically by the compiler. For example, division and multiplication by 푏 translate to bit shifts and computing the remainder modulo 푏 corresponds to applying a bit mask.
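For the 2D Hilbert curve, Optimization 5.7 takes a particularly simple form: with the Klein four-group identification from Section 4.3 and our assumed encoding 퐻 = 0, 퐴 = 1, 퐵 = 2, 푅 = 3, the group operation is bitwise XOR and 푆푐(푠, 푗) = 퐺(푗) XOR 푠, so the two-dimensional child-state table collapses to a four-entry array:

```cpp
#include <cassert>

// XOR-based child-state function for the 2D Hilbert curve,
// S^c(s, j) = G(j) ^ s, with the (assumed) encoding H=0, A=1, B=2, R=3.
int childStateXor(int s, int j) {
  static const int G[4] = {1, 0, 0, 2};  // G(0)=A, G(1)=G(2)=H, G(3)=B
  return G[j] ^ s;
}
```

One can check against the permutations of Example 4.8 that this reproduces the full child-state table; the parent-state function is the same formula since every element of Z2 × Z2 is its own inverse.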

A very effective optimization can be performed for curves that satisfy the palindrome property. Bader [2] uses the term “palindrome property” to describe curves where “at a common (hyper-)face of two subdomains [. . . ], the element orders imposed by the space-filling curve on the two faces are exactly inverse to each other”.^8 Using the terminology from Section 3, we can turn this idea into a rigorous definition:

Definition 5.9. A neighbor-푏-index-tree satisfies the palindrome property if the following condition is satisfied: For all 푓 ∈ ℱ, 푣 ∈ 풱 such that 푣 has no depth-1 푓-neighbor and there exists an 퐹 푝(푆(푣), 퐼(푣), 푓)-neighbor 푝푤 of 푃 (푣), we have

Ω(퐼(푣), 푆(푃 (푣)), 푆(푝푤), 푓) = 푏 − 1 − 퐼(푣) .

Remark 5.10. The Peano curves in arbitrary dimension satisfy the palindrome property while the Hilbert curves do not [2]. Using the sfcpp library, it can be checked for arbitrary curve specifications whether Ω(푗, 푠, 푠′, 푓) ∈ {⊥, 푏 − 1 − 푗} for all 푗 ∈ ℬ, 푠, 푠′ ∈ 풮, 푓 ∈ ℱ. For example, according to our implementation, this property is satisfied by all implemented generalized Sierpinski models up to 푑 = 16, although it is not clear whether all of these models specify regular geometric 푏-index-trees.^9

Optimization 5.11 (Exploiting the palindrome property). In an extended algebraic 푏-index-tree without history where the palindrome property is satisfied, we have Ω(푗, 푠, 푠′, 푓) = 푏 − 1 − 푗 for all parameter combinations (푗, 푠, 푠′, 푓) occurring in the neighbor-finding algorithm. If a node 푣 = (푙, 푗, 푠) has a depth-푘 neighbor 푤 = (푙, 푗′, 푠′), and their positions have base-푏 representations 푗 = (푎푙−1 . . . 푎0)푏 and 푗′ = (푎′푙−1 . . . 푎′0)푏, this means that

푎′푞 = 푎푞 for 푘 ≤ 푞 ≤ 푙 − 1 ,
푎′푞 = 푁(퐼(푃^{푘−1}(푣)), 푆(푃^푘(푣)), 푓′) for 푞 = 푘 − 1 ,
푎′푞 = 푏 − 1 − 푎푞 for 0 ≤ 푞 ≤ 푘 − 2

for a suitable facet 푓′ ∈ ℱ. We can write this as an equation:

푗′ = 푗 − (푗 mod 푏^푘) + 푎′푘−1 · 푏^{푘−1} + (푏^{푘−1} − 1 − (푗 mod 푏^{푘−1})) . (5.1)

^8 Bader uses the term “face” for what we defined as “facet”.
^9 In our experiments, it seems that they specify regular geometric 푏-index-trees, but in dimensions 푑 > 3, their polytopes (simplices) can have arbitrarily large side-length ratios.

Once 푘 and 푎′푘−1 have been found, the position 푗′ of 푤 can be obtained using this formula. It remains to determine the state 푠′ of 푤 if this information is desired. Since the palindrome property poses a condition on the arrangement of children in an 푓-neighbor of 푣, the state of 푣 and the facet 푓 are likely to be sufficient to determine the state of the 푓-neighbor of 푣. For the Peano and Sierpinski curves, this is the case. Therefore, the second loop mentioned in Optimization 5.1 can be omitted for these curves, as well as the arrays needed for the second loop and the lookup table Ω. If 푏 = 2^푛 is a power of two, Equation (5.1) flips the least significant 푛(푘 − 1) bits in the binary representation of 푗, resets the next 푛 bits to 푎′푘−1 and leaves the rest of the bits untouched. If 푏 = 2 as for the Sierpinski curve, the fact that 푎′푘−1 ≠ 푎푘−1 yields 푎′푘−1 = 1 − 푎푘−1, which means that the 푘 least significant bits are flipped and the other bits stay the same.
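For 푏 = 2, the bit-flipping form of Equation (5.1) described above reduces to a single XOR; a minimal sketch (the function name is our choice):

```cpp
#include <cassert>
#include <cstdint>

// b = 2 case of Equation (5.1): the position of a depth-k neighbor is
// obtained by flipping the k least significant bits of j.
uint64_t palindromeNeighborPosition(uint64_t j, int k) {
  return j ^ ((uint64_t(1) << k) - 1);  // XOR with a mask of k ones
}
```

Note that the operation is an involution, matching the fact that 푣 and 푤 are depth-푘 neighbors of each other.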

Optimization 5.12 (Eliminating redundancy). For many SFCs, the lookup tables here contain more parameters than necessary or some variables are not needed. Removing them speeds up the algorithm and reduces memory usage. Examples:

∙ In local models, there is only one state and it does not have to be tracked.
∙ In many global models, the lookup table for 퐹 푝 is not needed since 퐹 푝(푗, 푠, 푓) = 푓 in all occurring cases.

∙ When searching for an upper or lower neighbor in the Peano curve model from Figure 7, the states 푃 and 푅 behave equivalently to the states 푄 and 푆, respectively. This is the case because 푃 and 푄 are both traversed from bottom to top, while 푅 and 푆 are both traversed from top to bottom. Due to the special properties of the Peano curve, all lookup tables for 푓 ∈ {up, down} are invariant under exchanging 푃 and 푄 or 푅 and 푆. When looking for an upper or lower neighbor, only one boolean is needed to track whether the state is in the equivalence class {푃, 푄} or {푅, 푆}.

6 Implementation

In this section, we will address different implementation issues: First, we will present code that has been implemented for extracting information from general SFC models. This code can also be used for visualization of these models. In Section 6.2, we will discuss implemented algorithms and data structures for various particular SFCs. Runtime measurements for many of these algorithms are provided in Section 6.3. Most of the code explained here has been written for this thesis and is provided in the sfcpp library.10

6.1 Implementation for General Models

In the sfcpp library, SFCs can be defined via the CurveSpecification class, which almost directly realizes the definition of a 푏-specification (cf. Definition 3.8). Since states are used as array indices, the set of states 풮 has to be of the form {0, . . . , 푛 − 1} for some 푛 ∈ N≥1. In addition, the root state 푠푟 is automatically assumed to be 0. Since defining a 푏-specification can be quite tedious, the KDCurveSpecification class provides a more convenient way to define SFCs on 푘푑-trees. It allows the user to define an order of subcubes for each state and can infer the matrices 푀^{푠,푗}, 푠 ∈ 풮, 푗 ∈ ℬ from these orders.^11 If possible, a local CurveSpecification can also be generated instead of a global one. Already implemented models comprise arbitrary-dimensional Morton, Hilbert, Peano and Sierpinski curves as well as the 훽Ω and Gosper curves in 2D.

Given a CurveSpecification, a CurveInformation object can be computed that, assuming that the corresponding geometric 푏-index-tree is regular, provides lookup tables for the functions 푁, Ω and 퐹 푝, which provide the indices of neighbors inside the same parent, the indices of neighbors across different parents and the facet that different parents share, respectively. The computation of the CurveInformation is implemented as follows:

(1) Traverse the corresponding geometric 푏-index-tree 푇 to find representatives 푢푠 = (푙푠, 푗푠, 푠, 푄푠) ∈ 풱 of each (reachable) state. In the terminology of Section 3.4, this corresponds to finding a pre-representation.

(2) For all states 푠 ∈ 풮, find all faces of conv(푄푠), specified by the set of indices of columns of 푄푠 that constitute the vertices of the respective face. To do this, a custom adaptation of the QuickHull algorithm [3] is employed that finds all faces and also works for non-simplicial facets. For cubes, computation times are acceptable up to a dimension of about eight.

(3) By using a suitable ordering on the found facet index sets, enumerate the facets of each node canonically. The ordering of the facets is chosen such that for cubes generated out of a global model from a KDCurveSpecification, the index 푓 ∈ ℱ = {0, . . . , |ℱ| − 1} corresponds to the facet with normal vector ⃗푛 = (−1)^{푓+1} 푒_{1+푓 div 2}, where 푒푘 is the 푘-th unit vector.

(4) Using this information, examine the children of all nodes 푢푠 to compute 푁.

^10 https://github.com/dholzmueller/sfcpp
^11 This order is analogous to the coordinate functions 휅푠 from Definition 3.51.

(5) By recursively finding neighboring pairs of nodes with certain states and common facets, compute Ω, and by examining their children, compute 퐹 푝. During this step, violations of the regularity condition (R1) may be detected and reported, for example for the local model of the 2D Hilbert curve. In the terminology of Section 3.4, this corresponds to finding a representation.

(6) Check a property similar to the palindrome property by examining the values of Ω, see Remark 5.10.

The unoptimized brute-force search currently used in steps (4) and (5) may cause the computation to take a very long time for dimensions 푑 ≥ 5. This might be subject to future improvements. The current implementation does not use arbitrary-precision arithmetic. The use of floating point numbers appears to be sufficient for practical purposes.

The CurveRenderer class allows rendering (visualizing) SFCs in various ways using the LaTeX package TikZ. The rendering is done by traversing the specified geometric 푏-index-tree and using the face information from step (2) to create custom objects such as lines connecting subsequent nodes or one-dimensional faces of each polytope. This class has been used to generate most of the figures in this thesis. It is also possible to compute the group 풢푇 defined in Definition 4.7 from a specification.
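The facet enumeration chosen in step (3) can be decoded with a few arithmetic operations; the following sketch maps a facet index 푓 to its axis and orientation under the normal-vector convention ⃗푛 = (−1)^{푓+1} 푒_{1+푓 div 2} (the 0-based axis index and the struct are our conventions, not sfcpp API):

```cpp
#include <cassert>

// Axis and orientation of the facet normal for cube-based global models:
// even f points in the negative, odd f in the positive direction of
// axis f div 2 (0-based here; the thesis uses 1-based unit vectors).
struct FacetNormal {
  int axis;
  int sign;
};

FacetNormal facetNormal(int f) {
  return {f / 2, (f % 2 == 0) ? -1 : +1};
}
```

For a 푑-dimensional cube this enumerates all 2푑 facets as (−푒1, +푒1, −푒2, +푒2, . . .).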

6.2 Other Code

Besides the code for general models, code for particular SFCs is also available. Before we come to neighbor-finding code, we want to examine some examples.

Implementing a state history Listing 1 and Listing 2 show a Java implementation of an algebraic 푏-index-tree with history (cf. Definition 3.14) that realizes every operation in 풪(1). A StateHistory class is implemented that stores the state of a node and the states of all of its ancestors. To realize all operations in time 풪(1), it is necessary that multiple nodes can share a common part of the state history. As an example, this implementation uses a lookup table for the semi-local model of the Gosper curve shown in Figure 14.

Efficient state computation For the 2D Hilbert curve, there is a neat trick to compute the state of a node very efficiently: In Example 4.8, we showed that the group 풢푇 of the 2D Hilbert curve is isomorphic to the Klein four-group Z2 × Z2 with addition as the group operation. We can identify the root state 퐻 = 퐺(1) = 퐺(2) with (0, 0), the state 퐴 = 퐺(0) with (0, 1), the state 퐵 = 퐺(3) with (1, 0) and the state 푅 with (1, 1). To compute the state of a level-position node (푙, 푗), where (푎푙−1 . . . 푎0)4 is the base-four representation of 푗, we have to compute 푠 := 푠푟 + ∑_{푖=0}^{푙−1} 퐺(푎푖) = ∑_{푖=0}^{푙−1} 퐺(푎푖), where the summation order is irrelevant since the Klein four-group is abelian. With the notation 푛푘 := |{푖 ∈ {0, . . . , 푙 − 1} | 푎푖 = 푘}| for 푘 ∈ {0, . . . , 3}, the above identification yields 푠 = (푛3 mod 2, 푛0 mod 2). By taking the AND of pairs of bits in the binary representation of 푗 and counting the number of resulting ones, 푛3 can be computed. The computation of 푛0 is similar but uses NOR instead of AND. The resulting state 푠 = (푠1, 푠2) can be identified with a number 푘푠 = 2푠1 + 푠2. A C++ implementation of this algorithm for levels ≤ 31, i.e. 푗 < 2^64, is shown in Listing 3. Since the popcount

Listing 1: A Java implementation of a partial state-tree.

public class StateHistory {
    // example values for the semi-local model of the 2D Gosper curve
    private static int[][] childStateTable = {
        {0, 1, 1, 0, 0, 0, 1}, {0, 1, 1, 1, 0, 0, 1}};

    private StateHistory parent;
    private int state;

    public StateHistory(StateHistory parent, int state) {
        this.parent = parent;
        this.state = state;
    }

    public StateHistory getChild(int j) {
        return new StateHistory(this, childStateTable[state][j]);
    }

    public StateHistory getParent() {
        return parent;
    }

    public int getState() {
        return state;
    }
}

operation (which counts the number of ones in the bit representation of an integer) is natively supported by modern processors, this algorithm is very fast. Under the assumption of constant-time arithmetic integer operations, its runtime is in 풪(1). Unfortunately, this approach cannot be extended to the 3D Hilbert curve since it appears that its state group 풢푇 is isomorphic to the alternating group 퐴4, which is not abelian.

Neighbor-finding The sfcpp library contains several optimized implementations of Algorithm 1 for different SFCs:

∙ An implementation of the neighbor-finding algorithm, state computation and coordinate conversion for arbitrary-dimensional Peano curves is provided. The dimension 푑 can be passed as a template parameter. The neighbor-finding algorithm uses all optimizations from Section 5 except the bit operations, since 푏 = 3^푑 is not a power of two for the Peano curve. The implementation is based on a local model of the Peano curve, which means that it does not have to track the

Listing 2: A Java implementation of an algebraic 푏-index-tree with history.

public class Node {
    private static int b = 4;
    private int level;
    private long position;
    private StateHistory history;

    public Node(int level, long position, StateHistory history) {
        this.level = level;
        this.position = position;
        this.history = history;
    }

    public Node getChild(int index) {
        return new Node(level + 1, position * b + index, history.getChild(index));
    }

    public Node getParent() {
        return new Node(level - 1, position / b, history.getParent());
    }

    public static Node getRoot() {
        return new Node(0, 0, new StateHistory(null, 0));
    }

    public long getIndex() {
        return position % b;
    }

    public int getLevel() {
        return level;
    }

    public int getState() {
        return history.getState();
    }
}

state of a tree node. This implementation has been developed for a term paper12 and has been further optimized for this thesis.

∙ For the 2D and 3D Hilbert curves as shown in Section 2, the neighbor-finding algorithm is implemented. Most optimizations from Section 5 are used. Since the Hilbert curves do not satisfy the palindrome property, the corresponding

^12 David Holzmüller: Raumfüllende Kurven, 2016.

Listing 3: Efficient Hilbert 2D state computation in C++.

uint64_t getState(uint64_t level, uint64_t position) {
    uint64_t lowerMask = 0x5555555555555555ul; // binary: 01010101...
    uint64_t flipMask = lowerMask >> (64 - 2 * level);
    uint64_t a = position & lowerMask;
    uint64_t b = (position >> 1) & lowerMask;
    uint64_t aband = a & b;
    uint64_t abnor = flipMask ^ (a | b);
    uint64_t n_3 = __builtin_popcountll(aband); // 64-bit popcount
    uint64_t n_0 = __builtin_popcountll(abnor);
    return 2 * (n_3 % 2) + (n_0 % 2);
}

optimization cannot be applied. Optimization 5.7 can only be applied in the 2D case since the state group of the 3D Hilbert curve does not allow an XOR-based implementation. Multi-level tables are not implemented but might result in a considerable speedup. The single-level lookup tables were automatically generated from the implemented Hilbert curve model as described in Section 6.1. For the 2D Hilbert curve, the efficient state computation algorithm from Listing 3 is also included.

∙ For a local model of the 2D Sierpinski curve, the neighbor-finding algorithm is implemented. A suitable enumeration of the facets allows the algorithm to replace lookup tables by simple arithmetic operations. It is unclear whether multi-level tables would bring an advantage. All other optimizations from Section 5 are implemented.

∙ For the 2D Morton curve, the 풪(1) algorithm by Schrack [5] is implemented.

6.3 Experimental Results

To measure the performance of the implemented neighbor-finding algorithms, these algorithms were executed repeatedly with fixed level 푙 and state 푠 = 푠푟 and uniformly distributed random position 푗 and facet 푓. The state was fixed because, as explained in Section 4.3, for all curves where neighbor-finding has been implemented, calling the algorithm with any state for all facets yields the same neighbors. The parameters of one call were also chosen to be dependent on those of the previous call to avoid pipelining of multiple calls by the compiler. Since the random number generation requires a large portion of the execution time, the time needed to execute 푛 := 5 · 10^6 random number generations was measured and subtracted from the time measured for 푛 executions of random number generation together with neighbor-finding. For each algorithm and each level, 15 such time measurements were conducted. The median of these 15 measurements (divided by 푛) was then taken as an estimate of the average runtime of a single execution.

Figure 19, Figure 20 and Figure 21 show the measured average-case runtime of different

[Figure 19 plot: time [ns] versus level for Peano2D(2), Peano3D(2) and Peano4D(2).]

Figure 19: Measured average runtimes of finding a neighbor in random direction of a random grid cell in Peano curves of different dimensions.

[Figure 20 plot: time [ns] versus level for Peano2D(1), Peano2D(2), Peano2D(3) and Peano2D(4).]

Figure 20: Measured average runtimes of finding a neighbor in random direction of a random grid cell in the 2D Peano curve with different lookup table depths.

implementations. The number in parentheses denotes the number of levels included in the lookup tables as explained in Optimization 5.3. If no number is provided in parentheses, the algorithm does not use lookup tables. The measurements were conducted on a system with an Intel Core i7-5600U CPU, 12 GB RAM and Ubuntu 16.04. The library was compiled using gcc version 5.4.0 with optimization flag -O2. Despite some fluctuations in the measured runtimes, it is clearly visible that all new neighbor-finding algorithms perform similarly well and, with multi-level lookup tables as in Figure 20, can even compete with the neighbor-finding algorithm for the Morton curve.
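The measurement protocol of this section (chaining the parameters of successive calls, subtracting the cost of the random number generation, and taking the median over 15 repetitions) can be sketched as follows. This is an illustrative harness, not the benchmarking code of the sfcpp library; all names are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <random>
#include <vector>

// Time n iterations of f (a callable consuming a random value) in nanoseconds.
template <typename F>
double timeLoop(std::size_t n, F&& f) {
    std::mt19937 rng(42);
    auto start = std::chrono::steady_clock::now();
    std::uint32_t sink = 0;          // data dependency defeats call pipelining
    for (std::size_t i = 0; i < n; ++i)
        sink = f(rng() ^ sink);      // feed the previous result into the next call
    auto stop = std::chrono::steady_clock::now();
    volatile std::uint32_t keep = sink;  // keep the result observable
    (void)keep;
    return std::chrono::duration<double, std::nano>(stop - start).count();
}

// Per-call runtime estimate: median over `repeats` samples of
// (RNG + work loop) minus (RNG-only loop), divided by n.
template <typename F>
double medianRuntimeNs(F&& work, std::size_t n = 5000000, int repeats = 15) {
    std::vector<double> samples;
    for (int r = 0; r < repeats; ++r) {
        double rngOnly  = timeLoop(n, [](std::uint32_t x) { return x; });
        double combined = timeLoop(n, work);
        samples.push_back((combined - rngOnly) / static_cast<double>(n));
    }
    std::nth_element(samples.begin(), samples.begin() + samples.size() / 2,
                     samples.end());
    return samples[samples.size() / 2];
}
```

Because the two loops are measured separately, individual samples can be slightly negative due to timer noise; the median over the repetitions smooths this out.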

It has to be noted, though, that due to its simpler structure, the algorithm for the Morton curve may have a clear advantage when compiler pipelining is allowed. Figure 22 shows measured runtimes of state-computation algorithms for the 2D Peano and Hilbert curves. The measurements were also conducted as described above. The algorithm for the Peano curve is an optimized version of Algorithm 3, while for the 2D Hilbert curve, the special algorithm from Listing 3 was used. The effect of the lookup table depth on the Peano algorithm is clearly visible.
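The idea behind deeper lookup tables (Optimization 5.3) is that a single-level state-transition table can be composed with itself, so that descending two refinement levels costs one table lookup instead of two. The following sketch shows this construction; the concrete table entries are an arbitrary placeholder (not the real Peano or Hilbert table), and all names are illustrative.

```cpp
#include <array>
#include <cassert>

constexpr int NUM_STATES = 4;   // e.g. the four orientations of the 2D Hilbert curve
constexpr int NUM_CHILDREN = 4; // quadtree children per refinement level

// Single-level table: next[s][c] = state of child c when the parent has state s.
using Table1 = std::array<std::array<int, NUM_CHILDREN>, NUM_STATES>;

// Placeholder entries for demonstration only; any valid states work here.
constexpr Table1 kDemoTable = {{{1, 0, 0, 2}, {0, 1, 1, 3}, {3, 2, 2, 0}, {2, 3, 3, 1}}};

// Two-level table: answers "state after descending two levels" in one lookup,
// indexed by the pair of child indices packed into one number.
using Table2 = std::array<std::array<int, NUM_CHILDREN * NUM_CHILDREN>, NUM_STATES>;

Table2 buildTwoLevelTable(const Table1& next) {
    Table2 next2{};
    for (int s = 0; s < NUM_STATES; ++s)
        for (int c1 = 0; c1 < NUM_CHILDREN; ++c1)
            for (int c2 = 0; c2 < NUM_CHILDREN; ++c2)
                next2[s][c1 * NUM_CHILDREN + c2] = next[next[s][c1]][c2];
    return next2;
}
```

The same composition extends to deeper tables, trading table size (which grows exponentially in the depth) against the number of lookups per traversal, consistent with the table-depth effect visible in Figure 22.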

[Figure 21 plot: time [ns] versus level for Peano2D(1), Hilbert2D(1), Hilbert3D(1), Morton2D and Sierpinski2D.]

Figure 21: Measured average runtimes of finding a neighbor in random direction of a random grid cell in different 2D curves. For the Morton curve, the algorithm by Schrack [5] was used.

[Figure 22 plot: time [ns] versus level for Peano2D(1), Peano2D(2), Peano2D(3) and Hilbert2D.]

Figure 22: Measured average runtimes of computing the state of a random grid cell in different curves with different lookup table depths.

7 Conclusion

In this thesis, we have derived a modeling framework which can be used to model most practical SFCs. These models were used to define trees on which a general neighbor-finding algorithm was formulated. This algorithm can, under certain assumptions, compute positions of neighbor cells in a regular grid ordered by a SFC with an average-case runtime complexity of 풪(1). Measurements on various implementations were performed to demonstrate that with some optimizations, the runtime of this algorithm is comparable to the runtime of an existing 풪(1) algorithm for the Morton order by Schrack [5]. It was discussed how related computational issues can be solved and how properties of specified models can be verified. An implementation was introduced that can visualize and compute lookup tables for SFC models.

The application of space-filling curves in computer science is a relatively young research topic. Consequently, there are many questions yet to be explored. In particular, this thesis leaves some challenges still to be addressed. For example, future research might investigate neighbor-finding on adaptive grids. For theoretical modeling, one might define a finite set ℒ ⊆ 풱 of leaves. A suitable characterization for such a set might be that for every sequence (푣푛)푛∈N0 of nodes with 푣0 = 푟 and 푣푛 = 푃(푣푛+1), there is exactly one 푛 ∈ N0 with 푣푛 ∈ ℒ. One might then define that 푣′ ∈ ℒ is a geometric 푓-neighbor of 푣 ∈ ℒ if dim(conv(푄(푣′)) ∩ 퐹푓(푣)) = 푑 − 1. For an implementation, the introduction of an additional function similar to Ω might be necessary. The average-case runtime complexity of the algorithm might change if the difference between the maximum and minimum level is unbounded. With the definition of neighborship in adaptive grids used by Aizawa and Tanaka [1], neighbor-finding is even easier since they define neighbors to be of equal or bigger size.
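The adaptive-grid characterization sketched above can be stated compactly as follows; the notation 퐹_푓(푣) for the facet 푓 of cell 푣 is an assumption about the thesis' notation, reconstructed from the surrounding definitions.

```latex
% Leaf set: every infinite chain of parents starting at the root r
% meets the leaf set exactly once.
\forall (v_n)_{n \in \mathbb{N}_0} \text{ with } v_0 = r
  \text{ and } v_n = P(v_{n+1}):\quad
  \exists!\, n \in \mathbb{N}_0 \text{ such that } v_n \in \mathcal{L}

% Geometric f-neighborship: the cells share a (d-1)-dimensional face.
v' \in \mathcal{L} \text{ is a geometric } f\text{-neighbor of } v \in \mathcal{L}
  \;:\Longleftrightarrow\;
  \dim\bigl(\operatorname{conv}(Q(v')) \cap F_f(v)\bigr) = d - 1
```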
It might also be interesting to examine whether the principle of the efficient Hilbert 2D state computation algorithm from Section 6.2 can be transferred to global models of Sierpinski curves.

In the context of this thesis, some aspects of analyzing SFC models (i.e. 푏-specifications) have already been implemented. Future work could extend this analysis to verify several properties of SFC models. Some properties and approaches for their verification have already been presented in this thesis. Other aspects, for example whether and how a model specifies a space-filling curve as the limit of finite approximations, have been left to future research. A suitable analysis tool might then also be extended to implement automatic code-generation for arbitrary regular SFC models.

References

[1] Kunio Aizawa and Shojiro Tanaka. A constant-time algorithm for finding neighbors in quadtrees. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(7):1178–1183, 2009.

[2] Michael Bader. Space-filling curves: an introduction with applications in scientific computing, volume 9. Springer Science & Business Media, 2012.

[3] C. Bradford Barber, David P. Dobkin, and Hannu Huhdanpaa. The quickhull algorithm for convex hulls. ACM Transactions on Mathematical Software (TOMS), 22(4):469–483, 1996.

[4] John J. Bartholdi and Paul Goldsman. Vertex-labeling algorithms for the Hilbert spacefilling curve. Software: Practice and Experience, 31(5):395–408, 2001.

[5] Günther Schrack. Finding neighbors of equal size in linear quadtrees and octrees in constant time. CVGIP: Image Understanding, 55(3):221–230, 1992.

[6] Tobias Weinzierl and Miriam Mehl. Peano—a traversal and storage scheme for octree- like adaptive Cartesian multiscale grids. SIAM Journal on Scientific Computing, 33(5):2732–2760, 2011.

[7] Günter M. Ziegler. Lectures on polytopes, volume 152. Springer Science & Business Media, 2012.
