Partition Refinement:

a meaningful technique for Graphs

Qinna Wang∗

∗ ENS Lyon, Laboratoire de l'information du Parallélisme, 69364 Lyon cedex, France

ABSTRACT
Partition refinement is a useful technique for graph algorithms. Current work on graph problems such as chordal graph recognition [7], permutation graph recognition [6] and modular decomposition [3] focuses on improving efficiency, and partition refinement plays an important role in these solutions. In this paper we describe this efficient technique, which overcomes obstacles such as the unreasonable complexity and the unclear presentation of certain algorithms for graph problems. Through various examples, we show how much efficiency is gained through partition refinement.

Keywords
partition refinement

1. INTRODUCTION
Graph theory has a long history. Since the first graph theory paper [1], written by Leonhard Euler on the Seven Bridges of Königsberg and published in 1736, graph theory has been used to solve problems in many domains such as computer science; for example, [5] shows that hypergraph partitioning is valuable for parallel scientific computing.

Generally, a good graph algorithm should be efficient and simple to understand even when it solves a complicated problem. The focus of this paper is partition refinement, a technique that simplifies and optimizes algorithms in several domains, graphs among them.

Traditionally, pivots are used to split classes. The most classical algorithm based on a pivot rule is Quick Sort, which is used for sorting integers. In this case, the final partition gives the ordering of the elements. Note, however, that the complexity of Quick Sort is considerable. Therefore, a conceptually similar technique, partition refinement, was developed.

In the domain of graphs, partition refinement splits a class of vertices into disjoint subclasses according to the adjacencies between vertices. In this way, the underlying information of a graph can be mined and particular graph problems can be solved. For example, partition refinement has been applied to twin vertex calculation [6], chordal graph recognition [7] and permutation graph recognition [6].

Because of these contributions, we introduce partition refinement in this paper, which is organized as follows: Section 2 presents partition refinement in detail; Section 3 describes several classical applications, organized by the ordering of elements; Section 4 presents Lex-BFS; Section 5 lists categories of applications according to different criteria; finally, we discuss partition refinement and conclude.

2. THE PARTITION REFINEMENT
All applications of partition refinement can be described in one general framework, which is presented in this section.

2.1 definition and notation
First, we introduce the basic definitions and notation related to partition refinement.

Definition 1. A partition P of a set E is a collection of disjoint subsets {χ1, χ2, ..., χk} such that χ1 ∪ χ2 ∪ ... ∪ χk = E. We call these subsets classes.

Definition 2. The set S intersects strictly with the set S′ if and only if S ∩ S′ ≠ ∅ and S ⊈ S′. Note that S′ ⊆ S is possible when the set S intersects strictly with the set S′.
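Definitions 1 and 2 can be checked mechanically. The following is a minimal Python sketch, assuming that classes are represented as Python sets and that Definition 2 reads "S ∩ S′ ≠ ∅ and S ⊈ S′"; the function names `is_partition` and `intersects_strictly` are illustrative and do not come from the paper.

```python
from typing import Hashable, Iterable, Set


def is_partition(classes: Iterable[Set[Hashable]], universe: Set[Hashable]) -> bool:
    """Definition 1: the classes are pairwise disjoint and their union is E."""
    seen: Set[Hashable] = set()
    for cls in classes:
        if seen & cls:          # two classes share an element
            return False
        seen |= cls
    return seen == universe


def intersects_strictly(s: Set[Hashable], s_prime: Set[Hashable]) -> bool:
    """Definition 2 (as read here): S meets S' and S is not contained in S'."""
    return bool(s & s_prime) and not s <= s_prime


if __name__ == "__main__":
    print(is_partition([{1, 2}, {3}], {1, 2, 3}))       # True
    print(intersects_strictly({1, 2, 3}, {2, 3, 4}))    # True
    print(intersects_strictly({1, 2}, {1, 2, 3}))       # False: S is inside S'
```

With this reading, a class χ intersects strictly with S exactly when both χ ∩ S and χ \ S are non-empty, which is the situation in which a refinement step actually splits χ.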

Definition 3. The refined partition P′ = refine(P, S), where P = {χ1, χ2, ..., χk} and S is a subset of E, is obtained by replacing each class χi that intersects strictly with S by χia and χib, where χia = χi ∩ S and χib = χi \ S.

Definition 4. The partition P is stable for the subset S if and only if no class χi intersects strictly with the set S, i.e., P = refine(P, S).

Definition 5. In an ordered partition P = {χ1, χ2, ..., χk}, the elements in each class χi are ordered.

Definition 6. The ordered partition Q is compatible with another ordered partition P, written Q ⪯ P, if and only if for each class χ in Q there exists a class γ in P such that χ ⊆ γ and the order of the elements in χ respects their order in γ.

We then introduce the data structure used in partition refinement: a doubly linked list L(E) stores the elements of the set E = {x1, x2, ..., xk}. Each element x has a pointer to its partition class χ, and each class χ has two pointers to its minimal and maximal elements in L(E). Hence, insertion and deletion are easy to realize in partition refinement algorithms, and a refinement step that deletes the class χa = χ ∩ S and inserts it to the right of the class χ runs in reasonable time with this pointer structure. Further, note that the ordering of the list L(E) is related to the ordering of the partition classes and to the compatibility of the classes. Hence, partition refinement is useful for applications concerned with the ordering of elements.

2.2 the partition refinement
Secondly, we show the complete partition refinement algorithm. To make it easier to understand, we first present a general algorithm. Then we describe and explain its basic procedures. At the end, we give a global view of the partition refinement algorithm.

    Input: the partition P = {χ1, χ2, ..., χk}
    Output: the refined partition P′ = {χ′1, χ′2, ..., χ′m}
    begin
        Pivots ← ∅
        while InitPartition(P) doesn't stop the loop do
            InitPartition(P)
            while Pivots ≠ ∅ do
                select an element E from Pivots
                ClassPivot(E)
            end
        end
    end
    Algorithm 1: a general algorithm for the partition refinement

We now explain the general algorithm shown in Algorithm 1.

• InitPartition(P) is essential for the whole algorithm. It starts the subsequent procedures and indicates the end of the algorithm. One of its important roles is to add the known pivots to Pivots. Additionally, InitPartition(P) can be considered as the beginning of the recursive process. The procedure InitPartition(P) is therefore important. Note that some applications, such as twin vertex calculation, carry out their strategy by executing InitPartition(P) only once.

• ClassPivot(E) is a procedure that produces the subset S according to the pivot p implied by the element E. It depends on the strategy of the application. There are two versions of ClassPivot: either a single pivot p is used for each call of refine(P, S), or several pivots are used for one call of refine(P, S). In the first version, refine(P, S) is executed several times; in the second version, refine(P, S) is run only once. We therefore consider that the first version of ClassPivot saves the space used for pivots.

We then discuss the procedure ClassPivot according to its two versions:

    Input: an element E in Pivots
    Output: the stable partition for the subsets S which are generated by E
    for each element x ∈ E do
        p ← (x, E)
        S ← PivotSet(p)
        refine(P, S)
    end
    Algorithm 2: ClassPivot(E)

    Input: an element E in Pivots
    Output: the stable partition for the subset S which is generated by E
    begin
        S ← ∅
        for each element x ∈ E do
            p ← (x, E)
            S ← S ∪ PivotSet(p)
        end
        refine(P, S)
    end
    Algorithm 3: ClassPivot(E)

In the procedure ClassPivot(E), p is the pivot that produces the subset S, where S ⊆ E, and each element x ∈ E produces a unique pivot p.

Compare Algorithm 2 and Algorithm 3: Algorithm 2 executes refine(P, S) once for each pivot, each time with the single subset S produced by that pivot. In contrast, Algorithm 3 runs refine(P, S) once, with S the union of the subsets produced by all pivots. The second version of ClassPivot(E) is used to minimize deterministic automata.

Next, we present the key algorithm refine(P, S), which refines the current partition with the set S.

• The operation refine(P, S) splits the partition class χ into the classes χa and χb, where χa = χ ∩ S and χb = χ \ S. We then insert the new class χa next to the partition class χ, where the current class χ now equals χb.
Note that the position of the new class χa determines the ordering of the list L(E), and that this position depends on the strategy of the application.

    Input: the partition P = {χ1, χ2, ..., χk} for the set E and a subset S, where some classes χi intersect strictly with S
    Output: the partition P′ = {χ′1, χ′2, ..., χ′m}, which is stable for S
    for each class χ do
        χa = χ ∩ S
        if χa ≠ ∅ and χ \ S ≠ ∅ then
            remove χa from χ
            if InsertRight(χ, χa, p) then
                insert χa to the right of χ
            end
            else
                insert χa to the left of χ
            end
            AddPivot(χ, χa)
        end
    end
    Algorithm 4: refine(P, S)

• InsertRight(χ, χa, p) determines whether the class χa is inserted to the right of χ. It depends on the strategy of the application and on the expectation of the operator. If the return value is true, we insert χa to the right of χ; otherwise, we insert χa to the left of χ.

• AddPivot(χ, χa) updates the pivots. With respect to the new partition P′ and the new class, the set Pivots must be updated for the relations with the subsets Sa = {x ∈ χa : pa ≤ x} and Sb = {x ∈ χb : pb ≤ x}.

• Computational complexity with this data structure: all elements of E are stored in a doubly linked list, and each partition class "touches" its elements in L(E) with two pointers. The whole refinement operation then amounts to removing the elements of χa from the interval χ and, at the same time, creating an interval χa next to the interval χ (once an element x is in χa, a pointer to the interval χa is added). The class χb is represented by the former χ. Therefore, the total complexity is O(|S|), which is the time spent on the interval χa. Note that the position of an element x in the list L(E) does not change, so the ordering of the elements in χa respects the ordering of the elements in L(E). Consequently, the final partition implies the ordering.

    Input: a partition P = {χ1, χ2, ..., χk} for the set E
    Output: the refined partition P′ = {χ′1, χ′2, ..., χ′m}
    begin
        Pivots ← ∅
        while InitPartition(P) doesn't stop the loop do
            InitPartition(P)
            while Pivots ≠ ∅ do
                select an element E from Pivots
                ClassPivot(E):
                for each element x ∈ E do
                    p ← (x, E)
                    S ← PivotSet(p)
                    refine(P, S):
                    if the current partition P isn't stable for S then
                        let M denote the set of classes χ which intersect strictly with S
                        for each class χ ∈ M do
                            χa = χ ∩ S
                            remove χa from χ
                            if InsertRight(χ, χa, p) then
                                insert χa to the right of χ
                            end
                            else
                                insert χa to the left of χ
                            end
                            AddPivot(χ, χa)
                        end
                    end
                end
            end
        end
    end
    Algorithm 5: the global algorithm of the partition refinement

• In the domain of graph theory, the set E of elements can be taken to be the vertex set V of a graph G = (V, E), and the subsets S used for refinement are given by the adjacency lists of the graph. As a result, the partition refinement algorithm is useful for graph problems such as recognizing chordal graphs and permutation graphs.
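The refine operation described in the list above (Algorithm 4) can be sketched in Python. This is a simplified illustration under stated assumptions, not the paper's implementation: classes are plain Python lists kept in left-to-right order, `insert_right` is a hypothetical stand-in for InsertRight(χ, χa, p), and AddPivot is omitted. Because the sketch scans every class, it runs in time proportional to |E| per call rather than the O(|S|) obtained with the doubly linked list L(E).

```python
from typing import Callable, Hashable, List, Set

Class = List[Hashable]  # elements of one class, kept in list order


def refine(partition: List[Class], s: Set[Hashable],
           insert_right: Callable[[Class, Class], bool] = lambda chi, chi_a: True
           ) -> List[Class]:
    """Return a new ordered partition that is stable for s."""
    result: List[Class] = []
    for cls in partition:
        chi_a = [x for x in cls if x in s]       # chi_a = chi ∩ S
        chi_b = [x for x in cls if x not in s]   # chi_b = chi \ S
        if chi_a and chi_b:                      # chi intersects strictly with S
            if insert_right(chi_b, chi_a):
                result.extend([chi_b, chi_a])    # chi_a to the right of chi
            else:
                result.extend([chi_a, chi_b])    # chi_a to the left of chi
        else:
            result.append(cls)                   # class left untouched
    return result


def is_stable(partition: List[Class], s: Set[Hashable]) -> bool:
    """Definition 4: P is stable for S iff refine(P, S) changes nothing."""
    return refine(partition, s) == partition


if __name__ == "__main__":
    p = [[1, 2, 3, 4], [5, 6]]
    print(refine(p, {2, 3, 5}))                        # [[1, 4], [2, 3], [6], [5]]
    print(is_stable(refine(p, {2, 3, 5}), {2, 3, 5}))  # True
```

Elements keep their relative order inside each new class, which mirrors the remark that the ordering of the elements in χa respects the ordering in L(E).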

At the end of this section, we show the global algorithm of the partition refinement. Algorithm 5 shows the global algorithm, which is used by applications aimed at different problems. Relying on the basic procedures InitPartition(P), ClassPivot(E) and refine(P, S), it splits the partition classes into singleton classes or congruent classes¹.

¹ The classes in the partition are equal to each other. It also implies that elements in each class are ordered independently.

We then study the complexity of the global algorithm. We first consider the size of Pivots, which contains the elements E; recall that each element E is used to produce a subset S. When the size of Pivots is unknown, it is difficult to estimate the running time. In a case such as twin vertex calculation, all elements are used as pivots during the whole procedure, so the size of Pivots is bounded by the number of elements in the set E. The time spent on pivot selection is then O(n), where n is the number of elements in E. We next consider the time spent on refinement. Each call of refine(P, S) takes O(|S|) time. Summing over all calls of refine(P, S), we define m = Σ |S|. In conclusion, the partition refinement algorithm costs O(n + m) time. Note that this analysis assumes known Pivots, whose number depends on the size of the set E. With unknown pivots, the total time may be O(n + m log n), as proposed by Hopcroft [4] for the "process the smaller half" pivot rule.

2.3 invariant
[8] proposes an invariant to prove the correctness of partition refinement:

invariant: the partition verifies the property A; if the partition fails to verify the property B, there are partition classes which intersect strictly with the subset S, where S is related to the current Pivots.

• A: some properties imply the existence of a solution which is compatible with the partition P
• B: some properties indicate that the partition P corresponds to the solution

The invariant is used to prove the correctness of partition refinement applications. Here, we take Quick Sort as an example.

Proof. invariant
• A: if an element x is in front of y in the resulting list, then x < y.
• B: all classes in the final partition are singletons.

The input is the ordered partition P of the set E, and the Pivots can be defined in InitPartition(P). According to the procedure ClassPivot(E), the partition P is divided into singleton classes.

For the other partition refinement applications, the proof by invariant follows the same pattern: once the partition satisfies the property A, the procedure InitPartition is executed iteratively until the partition satisfies the property B. This shows the role that InitPartition plays in the correctness of a practical application.

3. APPLICATIONS
Based on the partition refinement algorithm, we present several applications with different strategies. Given the differences between applications, we group them by the kind of problem: either the partition classes are congruent or the classes are sorted. Additionally, the invariant is used to prove the correctness of the applications.

3.1 unordered partition refinement
Applications such as twin vertex calculation [8], deterministic automaton minimization and modular decomposition [3] use unordered partition refinement. In this case, the order of the elements inside a class does not matter, and the partition classes as a whole describe a congruent relation (that is, an equivalence relation). Such applications therefore solve their problems through congruent classes or congruent relations between objects.

We take twin vertex calculation as an example. First, we introduce twin vertices.

Definition 7. Twin vertices are two vertices of a graph that have the same neighbours. Being twin vertices is clearly a congruent relation. Thus, the calculation is a procedure of unordered partition refinement.

Definition 8. Two vertices u and v of a graph are false twin vertices if N(u) = N(v), where N(u) is the open neighbourhood of the vertex u, which excludes u itself.

Definition 9. Two vertices u and v are true twin vertices if and only if N[u] = N[v], where N[u] is the closed neighbourhood of the vertex u, which contains u itself.

Secondly, we introduce the algorithm for calculating the twin vertices:

• In this case, one execution of the procedure InitPartition(P) is enough for the calculation, where the pivots correspond to the adjacency lists of the vertices. After InitPartition(P), the partition classes are divided by the procedure refine(P, S). Therefore, the algorithm stops when it starts InitPartition(P) for the second time. Additionally, the procedure AddPivot(χ, χa) does nothing, because the calculation does not need more pivots.

    Input: the vertex set V in a graph G with the adjacency lists of vertices
    Output: the partition P = {χ1, χ2, ..., χk} where each class is composed of twin vertices
    begin
        P ← {V}
        execute Algorithm 5 with the following procedures
    end
    procedure InitPartition(P)
    begin
        add all vertices in V to Pivots at the first execution
        exit and return the partition P at the second execution
    end

• invariant: the partition verifies the property A; if the partition fails to verify the property B, there are partition classes which intersect strictly with the subset S, where S is related to the Pivots.

    – A: if vertex x and y are twin vertices, they belong

3.2 ordered partition refinement
Some applications, such as Lex-BFS, are concerned with the ordering of the partition. In this case, each class of the refined partition yields the ordering of the input data.

We have introduced unordered partition refinement with the example of twin vertex calculation. In fact, we can consider twin vertex calculation as an ordered algorithm which respects the order of the input data. Here we define the original partition as P and the refined partition as P′. Therefore, the partition P′ is compatible with the partition P: i.e. if x

Definition 11. Assume Σ is a sequence of letters in alphabetical order and Σ∗ is a set of strings over Σ. Additionally, the i-th letter of a string x belonging to the set Σ∗ is denoted x(i). Under this condition, we consider that the string x is in front of y according to the lexicographical order: x

    – A: a lexicographical sort of the strings is compatible with P
    – B: two different strings don't appear in the same class

proof: The correctness of such an algorithm follows directly from the Pivots and the properties of partition refinement.

In ordered partition refinement, the partition is generally refined until every class yields the order of its elements. This means that a final refined class is not necessarily a singleton; congruent classes may remain.

4. LEX-BFS
Because standard breadth-first search fails to produce a suitably ordered list of the vertices visited in a graph G, Lex-BFS was proposed to visit the vertices of a graph in an order that respects certain properties. Lex-BFS is therefore a solution for graph problems. Here, we present the Lex-BFS algorithm in detail.

• Lex-BFS [2] is traditionally used to recognize triangulated graphs.

Definition 12. Triangulated graphs are also called chordal graphs and form a subclass of the perfect graphs. A graph is chordal if each cycle composed of at least four nodes has a chord, where a chord is an edge joining two nodes that are not adjacent in the cycle.

[2] first showed how to use Lex-BFS for graph problems. It numbers all vertices according to the visiting order. Each vertex x has a label label(x), which is related to the adjacency lists. Once a visited vertex x is numbered i, the number i is added to the labels of the non-visited neighbours of x, i.e., i is added to label(y) for every neighbour y of x that has not been visited. In the following execution, the non-visited vertex with the lexicographically largest label is numbered next. In this way, when G is chordal, the result P = {χ1, χ2, ..., χn} implies a perfect elimination ordering of the graph G.

    Input: a graph G = (V, E) with its adjacency lists for vertices
    Output: an ordered Lex-BFS list P for the vertices in set V
    begin
        P ← (V)
        execute Algorithm 5 with respect to the following procedures
    end
    procedure InitPartition(P) with respect to the i-th execution
    begin
        if i = n then
            return P : the end of the procedure
        end
        else
            select the vertex x from the class χi, where χi is composed of non-visited vertices and is the rightmost such class in the partition P
            if |χi| ≠ 1 then
                replace the class χi by χi \ {x}
                add {x} to P
            end
            add {x} to Pivots
        end
    end
    procedure PivotSet(p = (x, χ))
    begin
        return the adjacency list of the vertex x
    end
    procedure InsertRight(χ, χa, p)
    begin
        return true
    end
    procedure AddPivot(χ, χa)
    begin
        nothing to do
    end
    Algorithm 8: Lex-BFS
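The pseudocode of Algorithm 8 can be approximated by a short Python sketch. It is a simplification under stated assumptions: the graph is given as a dictionary `adj` mapping each vertex to its set of neighbours (a representation chosen here, not taken from the paper), only the classes of non-visited vertices are stored, and the visited singleton classes are recorded in the list `order` instead. As in Algorithm 8, the next vertex is taken from the rightmost class and the neighbours of the pivot are inserted to the right (InsertRight returns true). The sketch rebuilds the class list at every step, so it does not achieve the linear running time of the linked-list implementation.

```python
from typing import Dict, Hashable, List, Set


def lex_bfs(adj: Dict[Hashable, Set[Hashable]]) -> List[Hashable]:
    """Return the vertices of the graph in a Lex-BFS visiting order."""
    classes: List[List[Hashable]] = [list(adj)] if adj else []
    order: List[Hashable] = []
    while classes:
        x = classes[-1].pop()            # pivot: a vertex of the rightmost class
        if not classes[-1]:
            classes.pop()                # drop the class if it became empty
        order.append(x)
        neighbours = adj[x]              # PivotSet(p): adjacency list of x
        refined: List[List[Hashable]] = []
        for cls in classes:
            inside = [v for v in cls if v in neighbours]
            outside = [v for v in cls if v not in neighbours]
            if inside and outside:
                refined.extend([outside, inside])   # neighbours go to the right
            else:
                refined.append(cls)
        classes = refined
    return order


if __name__ == "__main__":
    graph = {
        "a": {"b", "c"},
        "b": {"a", "c", "d"},
        "c": {"a", "b", "d"},
        "d": {"b", "c", "e"},
        "e": {"d"},
    }
    print(lex_bfs(graph))   # ['e', 'd', 'c', 'b', 'a']
```

The example graph is chordal, and reversing the returned order gives a, b, c, d, e, in which the later neighbours of every vertex form a clique.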

Definition 13. σ = [x1, x2, ..., xn] is a perfect elimination ordering of a graph G = (V, E) if, for each vertex xi, the neighbours of xi form a clique in the induced subgraph Gi = G[xi, xi+1, ..., xn]. For example, for σ = [f, g, a, e, c, b, d], the subgraph Gi[e, c, b, d] may be a cycle.

As is known, chordal graphs can be characterized by the perfect elimination ordering. Lex-BFS is therefore useful for graph problems such as chordal graph recognition.

• Reviewing the Lex-BFS algorithm, it is easier to understand within the framework of partition refinement. Unlike the previous algorithms, the pivots are unknown in advance in this case. In fact, pivots are produced in PivotSet(p). This is one type of partition refinement application, in which pivots are produced during the execution. So the procedure ClassPivot(E) corresponds to the Algorithm ?? in this case.

• We then discuss what this algorithm implies. During the partition refinement process, each class implies the lexicographical ordering of its elements, i.e., if the vertices x and y belong to the same class χ, they may have the same prefix. And if the vertex x is split before the vertex y, then the x

5. CATEGORIES
We have presented the partition refinement algorithm through diverse applications, classified by the ordering of elements in the partition classes.

In fact, partition refinement applications can be classified into categories according to different criteria. We have introduced the ordering. Next we introduce the pivot rules.

As the descriptions of the different applications show, pivots play the key role in each algorithm, especially with regard to its complexity. If the pivots are known in advance, the application is a linear-time algorithm. Otherwise, the running time is complicated to estimate. Therefore, pivot rules are used as a criterion to classify the categories of applications:

Therefore, the partition refinement algorithm is interesting and useful for graphs.

In this paper, we have introduced the different categories of partition refinement applications according to different criteria. We have also presented some classical partition refinement applications for graph problems, such as twin vertex calculation and Lex-BFS, which obtain their results in reasonable running time. This suggests that partition refinement gives insight into graph problems.

Partition refinement is efficient and simple to understand. Given contributions such as Lex-BFS, it is useful for graph computations.

When partition refinement is used for graph algorithms, the element set E is usually the vertex set V, each class χ reflects properties of the graph G, and the pivots are related to the adjacency lists. Additionally, the execution of a partition refinement application depends on the input data. If the input consists of subgraphs, such as cliques, taken as the elements E, the algorithm can still be applied. Therefore, we conclude that partition refinement has good prospects in the domain of graphs.
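As a concrete instance of this setup, the twin vertex calculation of Section 3.1 can be sketched in Python: the element set is the vertex set, and each vertex contributes its adjacency list as a pivot set. The sketch assumes a graph given as a symmetric dictionary `adj` from vertices to neighbour sets; the function name `twin_classes` and the `closed` flag are illustrative choices, and classes are unordered Python sets, which matches unordered partition refinement but not the linked-list data structure of Section 2.

```python
from typing import Dict, Hashable, List, Set


def twin_classes(adj: Dict[Hashable, Set[Hashable]],
                 closed: bool = False) -> List[Set[Hashable]]:
    """Group vertices into classes of twins by refining {V} with each neighbourhood.

    closed=False compares open neighbourhoods N(u)   (false twins);
    closed=True  compares closed neighbourhoods N[u] (true twins).
    """
    classes: List[Set[Hashable]] = [set(adj)] if adj else []
    for x, nbrs in adj.items():
        pivot_set = nbrs | {x} if closed else nbrs
        refined: List[Set[Hashable]] = []
        for cls in classes:
            inside, outside = cls & pivot_set, cls - pivot_set
            # keep only the non-empty parts; a class is split only when
            # it intersects strictly with the pivot set
            refined.extend(part for part in (inside, outside) if part)
        classes = refined
    return classes


if __name__ == "__main__":
    graph = {1: {3, 4}, 2: {3, 4}, 3: {1, 2, 4}, 4: {1, 2, 3}}
    print(twin_classes(graph))               # classes {1, 2}, {3}, {4} in some order
    print(twin_classes(graph, closed=True))  # classes {3, 4}, {1}, {2} in some order
```

Two vertices end in the same class exactly when no neighbourhood separates them, which for an undirected graph means that their own neighbourhoods coincide.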

7. ACKNOWLEDGMENTS

8. ADDITIONAL AUTHORS

9. REFERENCES
[1] Graph Theory. Oxford University Press, 1986.
[2] D. J. Rose, R. E. Tarjan, and G. S. Lueker. Algorithmic aspects of vertex elimination on graphs. SIAM J. Comput., 5(2):266–283, June 1976.
[3] E. Dahlhaus, J. Gustedt, and R. M. McConnell. Efficient and practical modular decomposition. Proc. 7th Ann. ACM-SIAM Symp., 1997.
[4] J. Hopcroft. An n log n algorithm for minimizing states in a finite automaton. In Theory of Machines and Computations, pages 189–196. Academic Press, 1971.
[5] K. D. Devine, E. G. Boman, R. Heaphy, R. Bisseling, and U. Catalyurek. Parallel hypergraph partitioning for scientific computing. 20th International Parallel and Distributed Processing, 2006.
[6] M. Habib, C. Paul, and L. Viennot. A synthesis on partition refinement: a useful routine for strings, graphs, boolean matrices and automata. 15th Annual Symposium on Theoretical Aspects of Computer Science, 1998.
[7] M. Habib, R. McConnell, C. Paul, and L. Viennot. Lex-BFS and partition refinement, with applications to transitive orientation, interval graph recognition and consecutive ones testing. Theoretical Computer Science, 2000.
[8] C. Paul. Parcours en Largeur Lexicographique: un algorithme de Partitionnement, Application aux Graphes et Généralisations. PhD thesis, Université Montpellier II, 1998.