Analysis of Algorithms for Star Bicoloring and Related Problems

A dissertation presented to the faculty of the Russ College of Engineering and Technology of Ohio University

In partial fulfillment of the requirements for the degree Doctor of Philosophy

Jeffrey S. Jones

May 2015

© 2015 Jeffrey S. Jones. All Rights Reserved.

This dissertation titled

Analysis of Algorithms for Star Bicoloring and Related Problems

by JEFFREY S. JONES

has been approved for

the School of Electrical Engineering and Computer Science and the Russ College of Engineering and Technology by

David Juedes Professor of Electrical Engineering and Computer Science

Dennis Irwin
Dean, Russ College of Engineering and Technology

Abstract

JONES, JEFFREY S., Ph.D., May 2015, Computer Science Analysis of Algorithms for Star Bicoloring and Related Problems (171 pp.)

Director of Dissertation: David Juedes

This dissertation considers certain graph-theoretic combinatorial problems which have direct application to the efficient computation of derivative matrices (“Jacobians”) that arise in many scientific computing applications. Specifically, we analyze algorithms for Star Bicoloring and establish several analytical results. We establish complexity-theoretic lower bounds on the approximability of algorithms for Star Bicoloring, showing that no such polynomial-time algorithm can achieve an approximation ratio of O(N^{1/3−ε}) for any ε > 0 unless P = NP. We establish the first algorithm (ASBC) for Star Bicoloring with a known approximation upper bound, showing that ASBC is an O(N^{2/3}) polynomial-time approximation algorithm. Extending these results, we design a generic framework for greedy Star Bicoloring and implement several specific methods for comparison. General analysis techniques are developed and applied both to algorithms from the literature (CDC, Hossain and Steihaug, 1998 [1]) and to those developed as part of the framework. We provide numerous approximability results, including the first approximation analysis for the CDC algorithm, showing that CDC is an O(N^{3/4}) approximation algorithm. Finally, we observe that all algorithms within this generic framework produce a restricted class of star bicolorings that we refer to as Distance-2 Independent Set (D2ISC) colorings. We establish the relationship between Star Bicoloring and D2ISC. In particular, we show that these two notions are not equivalent, that D2ISC is NP-complete, and that it cannot be approximated to within O(N^{1/3−ε}) for any ε > 0 unless P = NP.

This work is dedicated to the memory of my parents, Ora and Violet Jones, who were the first in their family to ensure that their children enjoyed the benefits of higher education. This work is dedicated to my wife Janis, son Gordon and daughter Margaret, for their gracious support and encouragement. I have “kept my eye on the hat.” This work is dedicated to my brother Bill, who has been an inspiration and guide to me

throughout my life.

Acknowledgements

I would like to extend sincere thanks to Dr. David Juedes for his instruction, support, collaboration, guidance and friendship throughout my entire Ph.D. experience. Dr. Juedes has a gift for algorithmic analysis, a contagious joy for collaborative pursuit of the field, and a cheerful willingness to pass along his knowledge. I would like to thank the members of my Ph.D. dissertation committee, who have given their time both in and outside of the classroom in support of my candidacy: Dr.

Razvan Bunescu; Dr. Frank Drews; Dr. Cynthia Marling; Dr. Sergio Lopez; and Dr. Howard Dewald.

Table of Contents

Page

Abstract ...... 3

Dedication ...... 4

Acknowledgements ...... 5

List of Tables ...... 9

List of Figures ...... 10

List of Symbols ...... 12

List of Acronyms ...... 14

1 Introduction ...... 16
   1.1 Overview ...... 18
   1.2 Background, Motivation and Research Direction ...... 20
   1.3 Examples ...... 23
      1.3.1 Combustion of Propane ...... 23
      1.3.2 Curtis54 ...... 25
      1.3.3 A Representative Sparsity Pattern ...... 28

2 Background and Literature Review ...... 30
   2.1 Preliminary Mathematical Notation ...... 30
   2.2 Applications of the Jacobian ...... 31
   2.3 Sparse Derivative Matrices and Sparsity Patterns ...... 34
   2.4 The Connection between Derivative Matrices and Graphs ...... 35
   2.5 Generic Coloring Approaches ...... 36
   2.6 D-1 Coloring Methods ...... 38
   2.7 Orthogonal Independence ...... 39
   2.8 D-2 Coloring Methods ...... 40
   2.9 Uni-directional versus Bicoloring Approaches ...... 42
   2.10 Bidirectional Direct Computation, SeqDC ...... 44
   2.11 Chromatic and Coloring Numbers ...... 45
   2.12 Direct versus Substitution Methods ...... 47
   2.13 Ordering Methods ...... 48
      2.13.1 Largest First ...... 48
      2.13.2 Smallest Last ...... 48
      2.13.3 Incidence Degree ...... 49

      2.13.4 Saturation Degree ...... 49
   2.14 Existing Coloring Algorithms ...... 50
      2.14.1 Uni-directional Coloring ...... 50
      2.14.2 Minimum Nonzero Count Ordering with Incidence Degree Ordering (MNCO/IDO) ...... 50
      2.14.3 Complete Direct Cover (CDC) ...... 52

3 Initial Results ...... 54
   3.1 An Approximation Lower Bound ...... 54
   3.2 Approximate Star BiColoring (ASBC), an Approximation Algorithm ...... 56
   3.3 ASBC Approximation Analysis ...... 60
      3.3.1 Edge Elimination ...... 60
      3.3.2 Limit on Colors Used ...... 62
      3.3.3 Approximation Ratio ...... 62
   3.4 ASBC Comparative Empirical Results ...... 63

4 A Family of Greedy Star Bicoloring Algorithms ...... 65
   4.1 A General Star Bicoloring Framework ...... 65
   4.2 A Class for GIS Algorithms ...... 67
   4.3 Selected Greedy Strategies, Part One ...... 68
      4.3.1 Locking, a Boundary Condition for Certain GIS Methods ...... 70
   4.4 Selected Greedy Strategies, Part Two ...... 73
      4.4.1 A Limitation to the Neighborhood Ratio Method ...... 79
   4.5 Non-greedy Strategies ...... 81
   4.6 Empirical Results ...... 83

5 Expanded Analyses ...... 89
   5.1 Correctness of General Framework ...... 90
   5.2 An Improved Lower Bound for GIS Family Methods Using Seed Vertex of Maximum Degree (MDGIS) ...... 92
   5.3 Observations on the Size of GIS Neighborhoods ...... 94
   5.4 CDC Approximation Analysis ...... 95
   5.5 CDC Approximation Based on |E| ...... 97
   5.6 Initial Maximum Neighborhood Approximation Analysis ...... 99
   5.7 Improved Maximum Neighborhood Approximation Analysis ...... 100
   5.8 Approximation Analyses for Almost Square Matrices ...... 101
      5.8.1 Hypothetical Strongly Square Matrices ...... 101
      5.8.2 Nearly Square Matrices ...... 103
         5.8.2.1 An Example Nearly Square Matrix Construction ...... 104
   5.9 Neighborhood Ratio Approximation Analysis ...... 107
   5.10 Analysis for Ratio M' Method ...... 109
   5.11 Analysis for Inverse Ratio Method ...... 110
   5.12 Analysis for Dense Ratio Method ...... 111

   5.13 Look Ahead Approximation Analysis ...... 112
   5.14 Weighted Unlocking Approximation Analysis ...... 118
   5.15 Analysis Summary ...... 120
      5.15.1 Observations ...... 122

6 Distance-2 Independent Set Coloring ...... 124
   6.1 The Inequivalence of D2IS Bicoloring and Star Bicoloring ...... 125
   6.2 NP-completeness for D2IS Coloring ...... 127
      6.2.1 Graph Coloring to Distance-2 Independent Set Coloring Reduction Construction ...... 128
      6.2.2 Graph Coloring → Distance-2 Independent Set Coloring ...... 128
      6.2.3 Distance-2 Independent Set Coloring → Graph Coloring ...... 129
7 Concluding Thoughts and Future Directions ...... 131
   7.1 Algorithmic Directions ...... 132
   7.2 Further Analyses ...... 133
   7.3 Other Considerations ...... 134

References ...... 135

Appendix: The Matrix M class ...... 139

List of Tables

Table Page

3.1 Comparative Coloring Performance [2] ...... 64

4.1 “Strategy T” Variations ...... 74
4.2 Full List of GIS Evaluation Test Matrices ...... 84
4.3 Benchmark Coloring Results - Greedy Methods ...... 87
4.4 Benchmark Coloring Results - “Non-greedy” Methods ...... 88

5.1 GIS Methods, Analysis Summary ...... 123

List of Figures

Figure Page

1.1 Coefficient Matrix for the Combustion of Propane System of Equations ...... 24
1.2 Sparsity Pattern for the Combustion of Propane System of Equations ...... 24
1.3 Color Groups for the Combustion of Propane System of Equations ...... 26
1.4 Color Groups for Curtis54 ...... 27
1.5 An Example Sparsity Pattern ...... 29

2.1 Plot of y = x² + 4 ...... 32
2.2 A Jacobian Matrix of First-order Partial Derivatives of Component Functions ...... 33
2.3 Coleman–Moré Column Coloring ...... 39
2.4 A Matrix with No Initial Structurally-orthogonal Columns ...... 43
2.5 Curtis, Powell and Reid (CPR) Coloring ...... 51
2.6 Minimum Nonzero Count Ordering (MNCO) [3] ...... 52
2.7 Complete Direct Cover ...... 53

3.1 Transforming a General Graph G into a Gb ...... 56
3.2 Greedy Independent Set ...... 58
3.3 Approximate Star Bicoloring ...... 59
3.4 Minimizing Approximation Ratio Exponent ...... 61

4.1 A General Framework for Star Bicoloring ...... 66
4.2 Coloring Decision Logic: Second Degree Method ...... 69
4.3 Coloring Decision Logic: Maximum Neighborhood Method ...... 70
4.4 A Simple “Locked” Star Bicoloring ...... 71
4.5 A “Stair” Matrix ...... 71
4.6 A “Drop” Matrix ...... 72
4.7 Coloring Decision Logic: Neighborhood Ratio Method ...... 73
4.8 Coloring Decision Logic: Inverse Ratio Method ...... 75
4.9 Coloring Decision Logic: Ratio M' Method ...... 76
4.10 Coloring Decision Logic: Dense Ratio Method ...... 77
4.11 Coloring Decision Logic: Look Ahead Method ...... 78
4.12 Coloring Decision Logic: Weighted Unlocking Method ...... 79
4.13 Ratio Method Counter-example with n = 5, φ = 2, Matrix Representation ...... 80
4.14 Ratio Method Counter-example with n = 5, φ = 2, Graph Representation ...... 82

5.1 Minimizing Approximation Ratio Exponent for CDC ...... 96
5.2 Minimizing Edge-based Approximation Ratio Exponent for CDC ...... 99
5.3 Optimizing α for Strongly Square Matrices ...... 103
5.4 Forming a “Nearly Square” Matrix ...... 105
5.5 An n = 9 “Nearly Square” Matrix in Compressed Form ...... 106

5.6 An n = 9 “Nearly Square” Matrix in Random Form ...... 106
5.7 Neighborhood Ratio Using O(N) Colors ...... 108
5.8 Neighborhood Ratio Using O(N) Colors, Tie Independent ...... 109
5.9 Inverse Ratio Differs from Maximum Neighborhood ...... 111
5.10 “Stair” Matrix Derivative, Reduced by Coloring Column 1 ...... 114
5.11 Look Ahead Degenerative Case, One Decision ...... 115
5.12 Look Ahead Height Two Case ...... 116
5.13 Look Ahead Chained Sub-optimal Decisions ...... 117
5.14 Look Ahead Decision Tree General Case ...... 118

6.1 A Star Bicoloring Which Is Not a Distance-2 Independent Set Coloring ...... 126
6.2 Construction for D2ISC Reduction from General Graph Coloring ...... 129

List of Symbols

|x|                    the cardinality of x
⊥                      “bottom,” a color typically used for vertices which become disconnected components during the coloring process
Γ(α)                   the set of neighbor vertices of the vertex set α
∆                      |v∆|, the cardinality of a vertex of maximal degree in G
δ                      the cardinality of a vertex of minimal degree in G
µ                      a bipartition of maximal cardinality
ρ                      a bipartition of minimal cardinality
χ(Gb)                  the chromatic number of graph Gb, i.e. the minimal number of colors required for a proper coloring of Gb
χasbc(Gb)              the ASBC chromatic number of graph Gb, i.e. the minimal number of colors required for the ASBC algorithm to color graph Gb
χcdc(Gb)               the CDC chromatic number of graph Gb, i.e. the minimal number of colors required for the CDC algorithm to color graph Gb
χsb(Gb)                the star bicoloring chromatic number of graph Gb, i.e. the minimal number of colors required for a star bicoloring of Gb
c(v)                   the color of vertex v
d                      a vertex of maximal degree from the bipartition not containing the vertex of maximal degree for the entire graph (basically, the minority whip)
d(v)                   the degree of vertex v
dk(vi, vj)             a function which determines if vertices vi and vj are distance-k neighbors
E                      the set of edges in a general graph
Gb                     a bipartite graph
Gb = ({Vb1, Vb2}, Eb)  a bipartite graph with vertices v ∈ Vb1 comprising one bipartition, vertices w ∈ Vb2 comprising the other, and Eb being the set of edges in the graph
GIS                    Greedy Independent Set, a function which builds an independent set for coloring by the successive inclusion of a remaining vertex of maximal degree within the bipartition selected for coloring
GIS(Gb, α)             the Greedy Independent Set from bipartition α of graph Gb
M                      an arbitrary matrix
MAX{α}                 a maximal value from set α
m                      the number of columns in a matrix
min{α}                 a minimal value from set α
n                      the number of rows in a matrix
V                      a bipartition, a set of vertices which can have no edges between any pair, or the set of vertices in a general graph
V̂                      the Greedy Independent Set of vertices selected from bipartition V
Vis                    the bipartition producing the current independent set coloring group
Vn                     the “neighbor” bipartition, or the bipartition which is not producing the current independent set coloring group
v                      an arbitrary vertex
v∆                     a vertex of maximal degree
v⊥                     a vertex which has received the color “bottom”
(vi, vj)               an edge with endpoints vi and vj

List of Acronyms

3SAT          A variant of the Boolean Satisfiability decision problem restricted such that each clause contains at most three variables.
AD            Automatic Differentiation. A computational method which uses the chain rule and storage of intermediate functional primitives to reduce the overall cost of related derivative computations.
ASBC          Approximate Star BiColoring, an algorithm for Star Bicoloring with an approximation ratio of O(N^{2/3}). The first Star Bicoloring algorithm with a published approximation ratio.
CDC           Complete Direct Cover. A set of rows and/or columns of a matrix which includes either the row or the column from each non-zero element of the matrix.
CDC           Complete Direct Cover, an algorithm by Hossain and Steihaug [1] for Star Bicoloring.
CPR           The Curtis-Powell-Reid algorithm, an early unidirectional column-wise algorithm for matrix partitioning.
CSC           Combinatorial Scientific Computing. The application of combinatorial optimization techniques to problems relevant to scientific computing.
D2IS          A Distance-2 Independent Set.
D2ISC         Distance-2 Independent Set Coloring. An NP-complete coloring problem for solving the Jacobian matrix ordering problem.
GIS           A Greedy Independent Set. An independent set formed by greedy decision criteria. In this context, also a distance-2 independent set of a bipartite graph which will be assigned as the next color group.
GIS           The Greedy Independent Set algorithm. Developed as part of ASBC and used by the extended “GIS-family” algorithms to construct distance-2 independent sets for coloring.
GIS-family    Those greedy algorithms for Star Bicoloring which use the GIS function to form color groups.
ID            Incidence Degree. A property of a vertex such that the incidence degree of vi is the degree of vi in the subgraph Gi induced by vi and all previously colored vertices {v1, v2, ..., vi−1}. Or, an ordering strategy where elements are considered in order of non-decreasing Incidence Degree.
LF            Largest First. An ordering strategy where elements are considered in non-increasing order of degree.
MNCO          A non-GIS algorithm for Star Bicoloring by Coleman and Verma [3].
NDC           Non-overlapping Direct Cover. A partitioning of the rows (or columns) of a matrix such that no two elements in a partition share a non-zero value in the same column (row), and the total partition includes all rows (columns) which contain one or more non-zero values.
NP            The complexity class of decision problems which are verifiable in polynomial time.
P             The complexity class of decision problems which are solvable in polynomial time.
SD            Saturation Degree. The number of different colors to which a vertex is adjacent. Or, an ordering strategy where elements are considered in order of non-decreasing Saturation Degree.
SeqDC         A Sequential Direct Cover. An ordered sequence of sets of rows or columns of a matrix such that the elements of each set are orthogonally independent, and all sets together form a complete direct cover of the non-zero matrix elements.
SL            Smallest Last. An ordering strategy where the last ordered vertex is vδ, and each successively preceding vertex vi is the vertex of minimal degree in G − {vi+}, where {vi+} is the set of all vertices ordered after (computed prior to) vi.

1 Introduction

Combinatorial scientific computing (CSC), i.e., the marriage of scientific computing and combinatorial optimization, has become an active area of research because many

problems of great importance to scientific computing (e.g., non-linear least squares optimization, efficient computation of Jacobian matrices) have components that are purely combinatorial in nature. This dissertation explores, in some depth, problems that arise from scientific computing which have non-trivial combinatorial components. Many

problems in scientific computing, such as those that we will explore, involve solutions to simultaneous non-linear equations. It is known that solving these types of problems is sometimes inherently computationally intractable [4, p.245] or has some component that is computationally intractable. Fortunately, this is not always the end of the story. Often, the intractability lies in finding an optimal solution to the computational ordering. In many

cases, a “good” computational ordering is sufficient. This opens the way for reasonable heuristic methods to provide satisfactory answers where exact methods would fail. The challenge then becomes to demonstrate the “good-ness” of these heuristic answers. For our purposes, “good-ness” will be measured by an approximation ratio, a central concept

in this work which we introduce briefly here and explain more fully in Section 1.2. As we discuss in this introduction, the types of problems from scientific computing that we explore can arise from even seemingly straightforward problems from high school chemistry, such as the combustion of propane in air (C₃H₈ + 5O₂ → 3CO₂ + 4H₂O). In this case, the reality is that a number of intermediate products are produced during this combustion process, with their production ratios being affected by parameters such as pressure and air/fuel ratio. Also, air is not simply O₂ and does contain other gaseous reactants (e.g. N₂, Ar and CO₂). Determining the resulting chemical equations at equilibrium requires numerical analysis to determine the number of moles of each product

produced for each mole of fuel consumed. Anticipation of maximal concentrations of intermediate and final products requires the solution of a system of non-linear equations. In fact, the problem mentioned in the previous paragraph is one of the problems in the MINPACK-2 test collection [5]. This problem consists of finding values for x1 through

x11 such that the eleven functions given below simultaneously evaluate to zero, indicating chemical equilibrium.

f_1(x) = x_1 + x_4 − 3
f_2(x) = 2x_1 + x_2 + x_4 + x_7 + x_8 + x_9 + 2x_{10} − R
f_3(x) = 2x_2 + 2x_5 + x_6 + x_7 − 8
f_4(x) = 2x_3 + x_9 − 4R
f_5(x) = K_5 x_2 x_4 − x_1 x_5
f_6(x) = K_6 x_2^{1/2} x_4^{1/2} − x_1^{1/2} x_6 (p/x_{11})^{1/2}
f_7(x) = K_7 x_1^{1/2} x_2^{1/2} − x_4^{1/2} x_7 (p/x_{11})^{1/2}
f_8(x) = K_8 x_1 − x_4 x_8 (p/x_{11})
f_9(x) = K_9 x_1 x_3^{1/2} − x_4 x_9 (p/x_{11})^{1/2}
f_{10}(x) = K_{10} x_1^2 − x_4^2 x_{10} (p/x_{11})
f_{11}(x) = x_{11} − Σ_{j=1}^{10} x_j
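The residual evaluation that drives each iterative step is mechanical. The following C++ sketch (illustrative only — it is neither code from this dissertation nor from MINPACK-2) evaluates f_1 through f_11 for a given x; the parameters R, p, and the equilibrium constants K_5 through K_10 are treated as caller-supplied values, and 0-based indexing is used, so x[0] corresponds to x_1.

```cpp
#include <array>
#include <cmath>

// Residuals f_1..f_11 of the combustion-of-propane system shown above.
// R, p, and K[0..5] (the constants K_5..K_10) are problem parameters.
std::array<double, 11> residuals(const std::array<double, 11>& x,
                                 double R, double p,
                                 const std::array<double, 6>& K) {
    const double r = p / x[10];          // p / x_11
    const double s = std::sqrt(r);       // (p / x_11)^(1/2)
    std::array<double, 11> f{};
    f[0]  = x[0] + x[3] - 3.0;
    f[1]  = 2.0 * x[0] + x[1] + x[3] + x[6] + x[7] + x[8] + 2.0 * x[9] - R;
    f[2]  = 2.0 * x[1] + 2.0 * x[4] + x[5] + x[6] - 8.0;
    f[3]  = 2.0 * x[2] + x[8] - 4.0 * R;
    f[4]  = K[0] * x[1] * x[3] - x[0] * x[4];
    f[5]  = K[1] * std::sqrt(x[1] * x[3]) - std::sqrt(x[0]) * x[5] * s;
    f[6]  = K[2] * std::sqrt(x[0] * x[1]) - std::sqrt(x[3]) * x[6] * s;
    f[7]  = K[3] * x[0] - x[3] * x[7] * r;
    f[8]  = K[4] * x[0] * std::sqrt(x[2]) - x[3] * x[8] * s;
    f[9]  = K[5] * x[0] * x[0] - x[3] * x[3] * x[9] * r;
    f[10] = x[10] - (x[0] + x[1] + x[2] + x[3] + x[4]
                     + x[5] + x[6] + x[7] + x[8] + x[9]);
    return f;
}
```

Each f_i depends only on the variables visible in its formula, and it is exactly this dependence structure, not the numerical values, that the sparsity pattern examined later in this chapter captures.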

While we do not wish to fully explain these 11 equations, as background it suffices to mention the following. First, the variables x1 through x10 correspond to the number of moles of the following potential reactants: CO2, H2O, N2, CO, H2, H, OH, O, NO, O2.

The variable x11 corresponds to the sum of the first 10 variables x1 through x10. Finally,

the equations f1, f2, f3, and f4 enforce that the number of moles of carbon, oxygen, hydrogen, and nitrogen are the same in both the resulting products of combustion as they are from the reactants. Since we know that for each mole of propane there are 3 moles of

carbon, and since CO2 and CO are the only products containing carbon, it must be the 18

case that x_1 + x_4 = 3, or, equivalently, x_1 + x_4 − 3 = 0. The remaining equations relate to enforcing chemical equilibrium among the various products of combustion [6]. Typically, we would solve this system iteratively by making an informed guess as to

the values of the xi, computing the formulae, and updating the xi values based on the residuals and the partial derivatives of the functions. When the residuals are each reduced to 0, we have located our solution. Certainly, given that this is a small example, in this case we could compute the derivatives for this system at each iteration with eleven passes of column-oriented forward-mode automatic differentiation (one for each of the xi, as no pair of columns is structurally orthogonal – see section 2.7). We are, however, interested in the optimization of the computation, and would like to ask if a different computational

ordering of the non-zero elements might result in increased computational efficiency. We will refer back to this problem later in the introduction and show that, in fact, we can reduce the eleven AD passes (for each iterative computation of the derivatives) to eight. While algorithms for solving these ordering problems are themselves

computationally intensive in general, related families of such algorithms may differ in subtle ways which have a significant computational impact. We will explore some of these subtle differences and demonstrate their effects in Chapters 4 and 5. The analyses included in those chapters help to develop a deeper understanding of these algorithms, which in turn allows for their computational refinement. This process of engineering improved algorithms for the computational ordering of matrices can have significant impact on fundamental problems which arise in many CSC applications.
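The notion of structural orthogonality referenced above (and developed formally in Section 2.7) admits a very small computational check. The sketch below is illustrative only — it is not the dissertation's implementation — and assumes the sparsity pattern is stored as a dense 0/1 matrix S, where a nonzero S[i][j] means f_i depends on x_j.

```cpp
#include <vector>

// Columns j and k of a sparsity pattern are structurally orthogonal when no
// row has a nonzero in both columns; such columns can share a single AD pass.
bool structurallyOrthogonal(const std::vector<std::vector<int>>& S,
                            int j, int k) {
    for (const auto& row : S)
        if (row[j] != 0 && row[k] != 0) return false;
    return true;
}
```

For the propane system this check fails for every pair of columns, because f_11 depends on all eleven variables and so its row is entirely nonzero — which is why eleven separate AD passes would be needed without a bicoloring approach.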

1.1 Overview

This dissertation is about the analysis of optimization algorithms which are used in combinatorial scientific computing applications. We have selected a particular problem of 19

study, Star Bicoloring¹, which has been shown to solve a useful combinatorial ordering

problem, but for which modern analysis methods (namely approximation analyses) have yet to be applied. As a part of this investigation we develop new Star Bicoloring algorithms and measure their empirical performance – this being done to ensure that the methods studied provide reasonable, not strictly theoretical, algorithmic solutions. Our

focus, however, will be on the analytical results we can derive for these algorithms. An overarching goal is to identify a family of related Star Bicoloring algorithms for which we can provide performance guarantees (in the form of approximation ratios) while also restricting ourselves to methods which approach the best published empirical computational results.

The chief contributions of this work are as follows. First, we establish complexity theoretic lower-bounds on the approximability of algorithms for Star Bicoloring. In particular, we prove that no polynomial-time approximation algorithm for Star

Bicoloring can achieve an approximation ratio of O(N^{1/3−ε}) for any ε > 0 unless P = NP. Likewise, we establish the first algorithm (ASBC) for Star Bicoloring with a known

approximation upper bound. In particular, we show that ASBC is an O(N^{2/3}) polynomial-time approximation algorithm for Star Bicoloring. This work is extended to a generic framework where arbitrary greedy algorithms for Star Bicoloring are considered.

We establish general techniques for analyzing these types of algorithms, and apply this generic approach to algorithms both from the literature (CDC [1]) and newer algorithms designed within the generic framework. We provide the first approximation analysis for

the algorithm CDC for Star Bicoloring and show that CDC is an O(N^{3/4}) approximation algorithm. Finally, we observe that all algorithms within this generic framework produce a

restricted class of star bicolorings that we refer to as Distance-2 Independent Set (D2ISC)

¹ We ask the reader to note that it is established that any proper star bicoloring also constitutes a proper acyclic bicoloring [7]. For simplicity we will frequently refer only to star bicoloring, while the results herein are generally applicable to both problems.

colorings. We establish the relationship between Star Bicoloring and D2ISC. In particular we show that these two notions are not equivalent, that D2ISC is NP-complete

and that it cannot be approximated to within O(N^{1/3−ε}) for any ε > 0 unless P = NP. These results are joint work between the author and Dr. David Juedes [2, 8, 9].

This work is organized as follows. The remainder of this chapter provides context for this research and illustrates how the family of algorithms for Star Bicoloring can improve the computational requirements of CSC problems. Chapter 2 leads the reader through the path from motivation by a matrix optimization problem to solution using graph-theoretic techniques and provides a summary of the relevant concepts from the literature. In Chapter 3 we establish a basis for our research by developing a new algorithm for Star

Bicoloring and deriving the first published approximation analysis for any Star Bicoloring method. The expanded family of related new algorithms for Star Bicoloring is presented in Chapter 4 along with empirical coloring results. The pseudo-code for each method is included in the discussion, with annotated C++ for the common underlying

class implementation appearing in the appendix. Chapter 5 contains the primary focus of this work, including the majority of our analytical approximation findings. During the course of our approximation analyses, we discovered a gap between the Star Bicoloring problem definition and actual implementations from the literature. This leads to the discussion in Chapter 6, which includes a new problem definition and a proof that it (this new problem) is NP-complete. We close in Chapter 7 with a brief summary of the status of our current research and suggestions for future areas of exploration.

1.2 Background, Motivation and Research Direction

Optimization problems represent the search for a particular, or “best,” solution. The motivation for this work derives from one such optimization problem which has particularly wide applicability. Problems in science and engineering often involve either 21

finding optimal solutions to complex equations or solving sets of simultaneous equations

under certain constraints. The standard methods used to solve these problems (quasi-Newton, Gauss-Newton, etc.) often involve the calculation of first and higher order derivatives. Since such problems are multi-dimensional in the sense that there are multiple inputs to and multiple outputs from the underlying function, the derivatives are naturally

represented by matrices. The problems of: 1) computing derivatives for multi-variate vector-valued functions; 2) non-Cartesian-space least-squares best-fit approximation; and 3) calculating derivatives for systems of non-linear equations, all rely on the (sometimes iterative) computation of the Jacobian matrix [5, 10]. Since the individual floating-point computations of the elements of a Jacobian can be expensive and the size of a Jacobian can be quite large [11], techniques have been developed which take advantage of differencing methods and/or computational re-use to improve the efficiency of the overall calculation [3, 12]. These methods inherently rely on an efficient ordering of the input to maximize their computational savings, which brings us to the specific problem of interest – how to

optimally order the elements of a Jacobian matrix to minimize total computational cost. In some circumstances (as when unordered search spaces grow exponentially or factorially with the size of the problem), one may not be able to effectively enumerate all possible solutions in order to select the de facto “best” answer. Such is the case with our

optimization problem, which shows exponential growth in terms of the number of rows and columns of the input Jacobian. This will lead our discussion to heuristic methods for obtaining (hopefully reasonable) solutions, and also to further analysis of the quality of those proposed solutions. The modern benchmark metric for the quality of a heuristic solution is the approximation ratio; this measurement is a central concept of this

dissertation. While approximation ratios can be deceptively difficult to establish, they are quite simple conceptually. For an algorithm A and an input I, let A(I) be the result (or “output”) of algorithm A on input I, while OPT(I) is the “correct” or “best” answer for 22

input I. For simplicity of discussion, let us assume these are small integer results, such as

the number of hours required to travel from point a to point b within the state of Ohio.

The approximation ratio, r_A = A(I)/OPT(I), thus provides a measure of relative accuracy for A. For example, if OPT(I) = 8 hours and A(I) = 12 hours, then r_A = 12/8 = 1.5. If we can establish that, for all cases, both A(I) and OPT(I) each have respective bounding functions, then we can establish that heuristic algorithm A is an approximation algorithm, and make a claim about results from A being always within some measurable distance from the correct, best answers. Considering our problem of interest, ordering the elements of the Jacobian for optimal computation, the objective is to both group Jacobian elements and order the groups. The prevailing methods for actually computing Jacobian elements, finite differencing and Automatic Differentiation [10, 13, 14, 15], operate such that the non-zero elements in a given row or column of the matrix being computed can be calculated for roughly (within a constant factor) the same cost as computing the original function. The inherent sparsity of typical Jacobian matrices will allow us to benefit from

the fact that this computational savings can also apply to linear combinations of the rows of the matrix as well as linear combinations of the columns. By ordering the non-zero elements in valid linear combinations, many of those elements can be calculated very efficiently. There are specific requirements for determining which rows and columns can

be considered linearly independent (see “orthogonal independence,” section 2.7). Further, there is an inherent ordering of these orthogonal groups, such that certain groups of rows or columns can become linearly independent (or, “orthogonally independent”) after certain predecessor groups are formed. The enforcement of both these group-forming and group-ordering rules has been shown to be equivalent to certain problems in graph coloring, most notably Star Bicoloring – a problem of central interest to our research. In this dissertation we will examine the application of this graph-theoretic problem, Star Bicoloring, to the optimization of Jacobian matrix computation. We build on

previous work which establishes that the star bicoloring of a properly constructed bipartite

graph corresponds to a valid grouping and ordering of Jacobian matrix elements [3] and extend this research in several important ways. These extensions include the description of new algorithms, expanded general and restricted-class analyses, the definition of a new coloring problem, Distance-2 Independent Set Coloring (D2ISC), a proof demonstrating

that D2ISC is NP-complete and exploration of the relationship between D2ISC and Star Bicoloring.

1.3 Examples

1.3.1 Combustion of Propane

As a simple illustration of how Star Bicoloring might be used in scientific computing, let us return to our earlier example, the combustion of propane in air, which we described by the following system of non-linear equations [5]:

f_1(x) = x_1 + x_4 − 3
f_2(x) = 2x_1 + x_2 + x_4 + x_7 + x_8 + x_9 + 2x_{10} − R
f_3(x) = 2x_2 + 2x_5 + x_6 + x_7 − 8
f_4(x) = 2x_3 + x_9 − 4R
f_5(x) = K_5 x_2 x_4 − x_1 x_5
f_6(x) = K_6 x_2^{1/2} x_4^{1/2} − x_1^{1/2} x_6 (p/x_{11})^{1/2}
f_7(x) = K_7 x_1^{1/2} x_2^{1/2} − x_4^{1/2} x_7 (p/x_{11})^{1/2}
f_8(x) = K_8 x_1 − x_4 x_8 (p/x_{11})
f_9(x) = K_9 x_1 x_3^{1/2} − x_4 x_9 (p/x_{11})^{1/2}
f_{10}(x) = K_{10} x_1^2 − x_4^2 x_{10} (p/x_{11})
f_{11}(x) = x_{11} − Σ_{j=1}^{10} x_j

Allowing that any variable not explicitly appearing in a given formula would have a coefficient of 0, we can easily construct a coefficient matrix where each row represents one of the f1(x) through f11(x) and each column represents some xi (see Figure 1.1).

 1    0    0    1    0    0    0    0    0    0    0
 2    1    0    1    0    0    1    1    1    2    0
 0    2    0    0    2    1    1    0    0    0    0
 0    0    2    0    0    0    0    0    1    0    0
-1    K5   0    K5  -1    0    0    0    0    0    0
-1    K6   0    K6   0   -1    0    0    0    0   -p
 K7   K7   0   -1    0    0   -1    0    0    0   -p
 K8   0    0   -1    0    0    0   -1    0    0   -p
 K9   0    K9  -1    0    0    0    0   -1    0   -p
 K10  0    0   -1    0    0    0    0    0   -1   -p
-1   -1   -1   -1   -1   -1   -1   -1   -1   -1    1

Figure 1.1: Coefficient Matrix for the Combustion of Propane System of Equations

If we further simplify this matrix such that any zero coefficients are ignored and all non-zero coefficients become 1, we have the sparsity pattern for the Jacobian matrix representing this system of equations (see Figure 1.2). We can now apply Star Bicoloring methods to locate valid linear combinations of the rows and/or columns of the sparsity pattern in order to reduce the number of AD passes required for its computation.

1 . . 1 . . . . . . .
1 1 . 1 . . 1 1 1 1 .
. 1 . . 1 1 1 . . . .
. . 1 . . . . . 1 . .
1 1 . 1 1 . . . . . .
1 1 . 1 . 1 . . . . 1
1 1 . 1 . . 1 . . . 1
1 . . 1 . . . 1 . . 1
1 . 1 1 . . . . 1 . 1
1 . . 1 . . . . . 1 1
1 1 1 1 1 1 1 1 1 1 1

Figure 1.2: Sparsity Pattern for the Combustion of Propane System of Equations

The Star Bicoloring result provides a strict partial ordering of the rows and columns of the matrix, which corresponds to valid linear combinations (or, orthogonally independent sets, see section 2.7) of those rows and columns which can successively be computed by AD or finite differencing methods. This result also ensures that the ordered rows and columns taken together provide a complete direct cover of the non-zero matrix elements, meaning that every element ∂f_i/∂x_j with a non-zero coefficient is computed. One such Star Bicoloring algorithm, weighted unlocking (described in Chapter 4), provides the following "color groups," or strict partial ordering of the rows and columns of the sparsity pattern (where r_n represents row n, and c_m represents column m).

color group membership set

row color 1       {r11}
column color 1    {c1}
column color 2    {c4}
column color 3    {c2, c3}
column color 4    {c11, c5}
row color 2       {r2, r6}
column color 5    {c7, c8, c9, c10}
column color 6    {c6}

This results in a reduction of AD passes required for each computational iteration from 11 to 8. The ordered color groups, corresponding to sets of non-zero elements which may be computed in the same AD pass, are illustrated in Figure 1.3.
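The validity of this ordering can be checked mechanically. The following Python sketch (the data structures and names are ours) replays the ordered color groups against the sparsity pattern of Figure 1.2, confirming that each group is orthogonally independent over the not-yet-covered non-zeroes and that all non-zeroes are directly covered in 8 passes.

```python
# Sparsity pattern of the propane Jacobian (rows f1..f11, columns x1..x11),
# transcribed from Figure 1.2: nz[i] is the set of columns j with a
# non-zero df_i/dx_j.
nz = {
    1: {1, 4},
    2: {1, 2, 4, 7, 8, 9, 10},
    3: {2, 5, 6, 7},
    4: {3, 9},
    5: {1, 2, 4, 5},
    6: {1, 2, 4, 6, 11},
    7: {1, 2, 4, 7, 11},
    8: {1, 4, 8, 11},
    9: {1, 3, 4, 9, 11},
    10: {1, 4, 10, 11},
    11: set(range(1, 12)),
}

# The ordered color groups reported above ('r' = row group, 'c' = column group).
groups = [
    ('r', {11}), ('c', {1}), ('c', {4}), ('c', {2, 3}),
    ('c', {11, 5}), ('r', {2, 6}), ('c', {7, 8, 9, 10}), ('c', {6}),
]

uncovered = {(i, j) for i, js in nz.items() for j in js}
for kind, members in groups:
    seen = set()
    for m in members:
        # The non-zeroes this row/column still has to cover.
        if kind == 'r':
            line = {(i, j) for (i, j) in uncovered if i == m}
        else:
            line = {(i, j) for (i, j) in uncovered if j == m}
        # Orthogonality: two rows in a group may not share an uncovered
        # column (two columns may not share an uncovered row).
        keys = {j if kind == 'r' else i for (i, j) in line}
        assert not (keys & seen), "group is not orthogonally independent"
        seen |= keys
        uncovered -= line

assert not uncovered   # every non-zero is directly covered in len(groups) passes
```

Each group passes the orthogonality test only because the earlier groups have already covered the conflicting entries, which is exactly the "unlocking" behavior described below for curtis54.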

1.3.2 Curtis54

Much of the test data for this research comes from the National Institute of Standards and Technology (NIST) "Matrix Market" database, a repository for test matrices from

1 . . 1 . . . . . . .
1 1 . 1 . . 1 1 1 1 .
. 1 . . 1 1 1 . . . .
. . 1 . . . . . 1 . .
1 1 . 1 1 . . . . . .
1 1 . 1 . 1 . . . . 1
1 1 . 1 . . 1 . . . 1
1 . . 1 . . . 1 . . 1
1 . 1 1 . . . . 1 . 1
1 . . 1 . . . . . 1 1
1 1 1 1 1 1 1 1 1 1 1

(Legend: color groups 1–8.)

Figure 1.3: Color Groups for the Combustion of Propane System of Equations

scientific applications for use in comparative studies of algorithms [11]. To further

illustrate the results of Star Bicoloring, we provide the coloring of the sparsity pattern for one of the Matrix Market matrices, curtis54, which is taken from a biochemical application. While this 54-by-54 matrix has up to 16 non-zeroes in a single column and 12 non-zeroes in a single row, it can be computed in 10 AD passes, illustrating the increased efficiency of bicoloring methods over earlier column-wise computations such as CPR [10] (see Figure 1.4). Notice the circled region “A” where color group 1 (orange), a set of linearly-independent columns, allows rows 49 and 52 to be included in the same row-oriented color group (5, dark green) by preceding group 5 in the coloring order. Similarly, color group 1 also “unlocks” rows 18 and 23 for inclusion into color group 9

(see circled region "B"). The unlocking of lines in the matrix and partial ordering of the coloring groups are concepts that we will return to later in this dissertation; they provide a distinct advantage of bicoloring methods over former column-oriented colorings (see Chapter 2).

[Figure 1.4 body: the 54 × 54 sparsity pattern of curtis54, colored into ten groups; circled region "A" marks rows 49 and 52, circled region "B" marks rows 18 and 23.]

(Legend: color groups 1–10.)

Figure 1.4: Color Groups for Curtis54

color group membership set

column color 1    {c6, c20, c38, c50}
row color 1       {r6, r20, r40, r10, r50}
row color 2       {r3, r17, r27, r8, r42, r51}
row color 3       {r31, r21, r2, r12, r53, r45, r7}
row color 4       {r35, r19, r30, r14, r5, r49, r0, r46, r52}
row color 5       {r24, r26, r33, r11, r38, r1, r16, r48}
row color 6       {r36, r15, r28, r4, r22, r41}
row color 7       {r25, r32, r34, r13, r43, r9}
row color 8       {r44, r18, r29, r23, r37}
row color 9       {r47, r39}

1.3.3 A Representative Sparsity Pattern

To illustrate the size of the current Harwell-Boeing matrices which are commonly used in the literature (and in this dissertation) as test data, Figure 1.5 represents the sparsity pattern for BP 200, a matrix arising from the simplex method of solving linear programs. BP 200 is an 822 × 822 matrix containing 3802 non-zero entries; its densest row contains 283 non-zeroes, while its densest column contains 21. The best currently observed computational ordering solution to BP 200 is 17 color groups.

Figure 1.5: An Example Sparsity Pattern [11]

2 Background and Literature Review

In this section we summarize many of the background concepts and established results from the literature. This chapter is most applicable to the reader who would like to develop a broader context for the algorithms and analyses which follow in subsequent chapters. One concept of particular relevance to this research is orthogonal independence, which is presented in section 2.7.

2.1 Preliminary Mathematical Notation

In this dissertation we use most of the standard notation from computer science

related to discrete mathematics and graph theory. In this section we review the terms we use most frequently. To begin, the term graph has various definitions, so by way of introduction it serves to precisely establish the term and some related concepts as they are used here. Visually, a graph is a set of points, or vertices, which are connected by a set of line segments, the "edges." Mathematically, a graph is an ordered pair G = (V, E), where:

V is an unordered finite set of uniquely named vertices;

E ⊆ V² is an unordered set of pairs of vertices {u, v} such that u ∈ V, v ∈ V, and u ≠ v.

Each vertex of the graph is an intersection point of zero or more edges. The set of vertices within a graph is typically denoted as V, with |V| denoting the cardinality of that set. Within our context, an edge in a graph connects precisely two vertices. The cardinality of the set of edges may be referred to by |E|. Two vertices are said to be adjacent if they share a common edge and, similarly, two edges are adjacent if they share a common vertex.

The degree of a vertex, v, is the number of edges incident upon v, and is denoted by d(v).

V | | xi j = 1, iff vi, v j E d(v ) = x  { } ∈ i X i j |  j=1 0, otherwise   The maximum degree of a vertex inG is referred to by ∆, and the minimum by δ. A graph G is bipartite if there exists a bi-partitioning of the vertices, V , V , such { 1 2} that each edge has one endpoint in V1, and the other in V2. That is, let Bip(G): G true, f alse — → { }

Bip(G) ⟺ ∀{u, v} ∈ E, u ∈ V1 and v ∈ V2.

A path within a graph is a connected series of adjacent vertices. The length, l, of a path is equal to the number of edges occurring between the initial vertex of the path, v0, and the final path vertex, v_l. A graph is said to be connected if there exists a path between any given pair of vertices. A vertex is said to cover the edges incident upon that vertex. A vertex cover, VC, is a set of vertices which cover all edges of the graph.

VC ⊆ V | ∀{u, v} ∈ E, (u ∈ VC) or (v ∈ VC)

An independent set of graph G is a subset of vertices, IS, which share no common edges.

IS ⊆ V | ∄{u, v} ∈ E such that (u ∈ IS) and (v ∈ IS)
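These set-based definitions translate directly into code. A minimal Python sketch (the graph representation and function names are our own):

```python
# A graph as a set of vertices and a set of 2-element frozensets (edges).
V = {1, 2, 3, 4}
E = {frozenset(e) for e in [(1, 2), (2, 3), (3, 4)]}

def degree(v):
    # d(v): the number of edges incident upon v
    return sum(1 for e in E if v in e)

def is_vertex_cover(S):
    # every edge has at least one endpoint in S
    return all(e & S for e in E)

def is_independent_set(S):
    # no edge has both endpoints in S
    return not any(e <= S for e in E)

assert degree(2) == 2
assert is_vertex_cover({2, 3})
assert is_independent_set({1, 3})
assert not is_independent_set({1, 2})
```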

The induced subgraph, G[V′], of V′ on G is any subset V′ ⊆ V along with all incident edges:

G[V′] = (V′, E′) | V′ ⊆ V, and E′ ⊆ E where each {u, v} ∈ E′ has u ∈ V′ and v ∈ V′.

2.2 Applications of the Jacobian

Many non-linear maximization and minimization problems rely on the computation of derivatives. A simple example is represented in Figure 2.1, where the minimum value

for y can be found by computing the derivative values of x along the plotted line and noting where f′(x) = 0. In the case of more complex examples involving multi-variate functions, the results are not as easily visualized and the "derivatives" not as easily calculated. While the derivative of a function of a real variable is again a function of a real variable, the "derivative" of a vector-valued function, Fv(x1, ..., xm), is the collection of all pair-wise partial derivatives of its components. If Fv is an n-component vector function of several, say m, variables, then the partial derivatives of the components of Fv can be arranged in an n × m matrix. This matrix of component partial derivatives, arranged such that the columns contain sets of component derivatives with respect to a given function variable (see Figure 2.2), is called a Jacobian matrix (hereinafter referred to simply as the "Jacobian"). Computing a Jacobian therefore defines a linear map, ℝ^m → ℝ^n,

which is an approximation of the behavior of Fv near a given point p, just as the slope of a simple function is an approximation of its behavior at a given point. Thus, one could say that the Jacobian provides a generalization of the notion of a “derivative” for multi-variate non-linear vector functions. One specific example of this usage of the Jacobian is provided in the solution to the Bratu problem as it applies to the steady-state model of solid-fuel ignition [5].
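As an illustration of the underlying computation, the sketch below approximates a Jacobian by forward finite differences, one function evaluation per column; the toy function F and the step size are our own illustrative choices, not drawn from the literature:

```python
def F(x):
    # A small vector function R^2 -> R^2 (our own toy example).
    return [x[0] ** 2 + x[1], 3.0 * x[1]]

def jacobian_fd(F, x, h=1e-6):
    # Column j of J is approximated as (F(x + h*e_j) - F(x)) / h,
    # costing one extra function evaluation per column.
    fx = F(x)
    cols = []
    for j in range(len(x)):
        xh = list(x)
        xh[j] += h
        cols.append([(fj - f0) / h for fj, f0 in zip(F(xh), fx)])
    # cols holds columns; transpose so J[i][j] = dF_i/dx_j
    return [list(row) for row in zip(*cols)]

J = jacobian_fd(F, [1.0, 2.0])
# Analytically J = [[2*x0, 1], [0, 3]] = [[2, 1], [0, 3]] at x = (1, 2)
assert abs(J[0][0] - 2.0) < 1e-4 and abs(J[1][1] - 3.0) < 1e-4
```

One difference pass per column is exactly the cost model that the ordering methods of this dissertation aim to improve.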

Figure 2.1: Plot of y = x² + 4

In addition to max/min optimization problems, real-world applications for

computation of the Jacobian extend also into non-linear regression analyses and modeling via systems of linear equations [5]. This need to perform (often iterative) Jacobian calculations arises in diverse fields, including medicine [16], electrical engineering [17], mechanical engineering [18], geology [19], biology [20], meteorology [21], physics [22], bioinformatics/genetics [23] and economics [24]. These real-world examples can generate function vectors with hundreds of elements or more, illustrating the importance of efficient computational methods for this problem.

J =
⎡ ∂F1/∂x1   ∂F1/∂x2   ∂F1/∂x3   …   ∂F1/∂xm ⎤
⎢ ∂F2/∂x1   ∂F2/∂x2   ∂F2/∂x3   …   ∂F2/∂xm ⎥
⎢ ∂F3/∂x1   ∂F3/∂x2   ∂F3/∂x3   …   ∂F3/∂xm ⎥
⎢    ⋮          ⋮          ⋮      ⋱      ⋮   ⎥
⎣ ∂Fn/∂x1   ∂Fn/∂x2   ∂Fn/∂x3   …   ∂Fn/∂xm ⎦

Figure 2.2: A Jacobian Matrix of First-order Partial Derivatives of Component Functions

Different strategies exist for the actual computation of a Jacobian, including finite

differencing methods [25] and Automatic Differentiation (AD) [15]. While each cell in the Jacobian may be computationally intensive to compute, these approaches take advantage of differencing techniques or chain-rule calculations which allow for sub-functional computations to be re-used. Regardless of the specific technique selected, a result of this

inter-dependence between individual calculations is that the ordering of the computations

can significantly affect overall run-time of the total derivative matrix computation. The determination of an efficient computational ordering of the cells of the derivative matrix is often cast as a matrix partitioning problem. This area has been well studied, and no fewer than ten different partitioning strategies are discussed in the literature [7]. These

methods, in turn, can be modeled by problems in graph coloring, allowing both matrix and graph paradigms to be used in understanding and developing enhanced solutions to these optimization problems. Strategies which we will consider and expand upon in this research include various, generally greedy, heuristics which use properties of the rows and columns of the matrix to formulate efficient orderings. Each of these methods can also be cast as a particular graph coloring problem. In particular we will investigate the combinatorial optimization problem of Star Bicoloring and associated algorithms, Star Bicoloring having been demonstrated to solve the Jacobian matrix ordering problem by "induc[ing] a structurally orthogonal bidirectional partition of [an equivalent input

matrix]” [7][3].

2.3 Sparse Derivative Matrices and Sparsity Patterns

The matrix partitioning problem presents itself in two important forms, based upon the properties of the underlying functions (are they scalar or vector functions), and the inherent symmetrical nature, or lack thereof, of the data. In the case of the first-order derivatives of vector functions, the derivative matrix is referred to as the Jacobian matrix

(named in honor of Carl Gustav Jacob Jacobi, 1804–1851, who introduced the "Jacobian," or functional determinant) [26]. The Jacobian matrix is constructed with each row being the set of derivatives for a given component function, and each column being the set of component derivatives with respect to a given function variable. In many real-world applications, "Jacobians" (referring now to the matrix) are quite sparse, with

ten to twenty percent non-zero values.² While Jacobians can exhibit symmetry or partial

symmetry, in general they do not. Another common form of derivative matrix is the second-order derivative matrix for scalar functions, also commonly referred to as the “Hessian.” By their nature, because the partial derivative of f with respect to x and y is equal to the partial derivative of f with

respect to y and x, Hessians have a diagonally reflexive symmetry not typically seen in Jacobians. While some reference to Hessians is made herein, we will typically consider the more general non-symmetric data patterns seen in Jacobians. In summary, our focus in this research is on algorithms which take a sparse, non-symmetric matrix and produce orderings of cells which provide improved efficiency of computation for either finite

differencing or AD methods. The specifics of the underlying functions to be computed creates motivation for this work in that the functions represented in individual cells are inherently related to those in the same row and column. These relationships can be used to improve the computational efficiency of individual matrix elements. It is the pattern of

non-zero entries (or sparsity pattern of the matrix), however, that is used to create the computational orderings, not the matrix element equations themselves. Figure 1.5[11] illustrates a sample matrix sparsity pattern from our empirical testing data.

2.4 The Connection between Derivative Matrices and Graphs

A sparse matrix, M, may be modeled by a column intersection graph by creating a vertex for each column of the matrix, and adding edges when columns have non-zero

values in the same row.

G_cig = (V, E) where:

V = {v_i} where v_i ∈ V corresponds to column i in M;

E = {{v_i, v_j} | ∃k such that (M_ki ≠ 0) and (M_kj ≠ 0)}

² These statistics are from an unpublished summary of the matrices at the NIST "Matrix Market." (http://math.nist.gov/MatrixMarket, spring 2013)
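A direct Python transcription of this construction (variable and function names are our own) also makes the dense-row limitation discussed next easy to observe, since a single dense row turns G_cig into a complete graph:

```python
from itertools import combinations

def column_intersection_graph(M):
    # Vertices are column indices; {i, j} is an edge iff some row k has
    # non-zeros in both column i and column j.
    ncols = len(M[0])
    V = set(range(ncols))
    E = {frozenset((i, j))
         for i, j in combinations(range(ncols), 2)
         if any(row[i] and row[j] for row in M)}
    return V, E

# A single dense row forces every pair of columns to conflict: G_cig is
# complete, every column needs its own color, and nothing is saved.
M = [[1, 1, 1, 1],
     [1, 0, 0, 0],
     [0, 0, 1, 0]]
V, E = column_intersection_graph(M)
assert len(E) == len(V) * (len(V) - 1) // 2   # K_4: all 6 edges present
```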

Curtis, Powell and Reid [10] were the first to use the column intersection graph model to connect graph coloring problems to the computation of structural orthogonality within a sparse matrix. A straightforward limitation of coloring methods which use column intersection graphs is the observation that a single dense row will result in each column being assigned a unique color. Hence, no computational savings is achieved in

those instances. Given this limitation of the column intersection graph model, more recent approaches employ a bipartite graph model which supports both row and column orthogonality. When using the optimal computational ordering algorithms considered in this work, the values of the cells of M are not considered; only the sparsity pattern of the matrix is used. The sparsity pattern is simply an analogous matrix where non-zero values are treated as "true" and zeros as "false." A sparsity pattern is also referred to as an incidence matrix. An incidence matrix, M_s, may be modeled by a bipartite graph, G_b = ({V1, V2}, E), by creating a vertex in V1 for each row of M_s, a vertex in V2 for each column of M_s, and an edge whenever a row/column combination is "true," or non-zero.

G_b = ({V1, V2}, E) where:

V1 = {u_i | i is a row in M_s},

V2 = {v_j | j is a column in M_s}, and

E = {{u_i, v_j} | M_ij ≠ 0}.

Such a bipartite graph, Gb, may be generically referred to as a bipartite incidence structure [27], or, more commonly, simply as the “bipartite [graph] of the Jacobian” [7].
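The bipartite model is equally direct to build from a sparsity pattern; a minimal sketch (our own encoding, tagging vertices 'r'/'c' to keep the bipartitions disjoint):

```python
def bipartite_incidence_graph(M):
    # V1 holds one vertex per row ('r', i); V2 one per column ('c', j);
    # an edge joins ('r', i) and ('c', j) whenever M[i][j] is non-zero.
    V1 = {('r', i) for i in range(len(M))}
    V2 = {('c', j) for j in range(len(M[0]))}
    E = {(('r', i), ('c', j))
         for i, row in enumerate(M) for j, v in enumerate(row) if v}
    return V1, V2, E

M = [[1, 0, 1],
     [0, 1, 0]]
V1, V2, E = bipartite_incidence_graph(M)
assert len(V1) == 2 and len(V2) == 3 and len(E) == 3
```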

2.5 Generic Coloring Approaches

The classic definition of graph coloring specifies that adjacent vertices must have different colors. In this work, we consider graph coloring problems where the term adjacent may mean something different from the usual meaning that adjacent vertices are

directly connected by an edge. In particular, two vertices may be “adjacent,” for our

purposes, if there exists a path of some length between them. This has intuitive practicality when considering bipartite graphs, where no pair of vertices on the same side of a bipartition may share a common edge. Some variation of the term distance occurs within the literature. In this dissertation, the distance between two vertices equals the number of edges along a shortest path between them. Neighbor, or distance-1 neighbor, vertices are vertices where there exists a distance-1 path between them. Distance-k neighbor vertices are those vertices where the shortest path between them is of length k. Given vertices v0 and v_l, and a distance k:

d_k(v0, v_l) = true, iff the shortest path from v0 to v_l has length k; false, otherwise.

Relating the distance-k concept back to independent sets, an independent set may be said to be a subset of vertices with no distance-1 path between any two elements of the subset. Analogously, a distance-k path is a path consisting of exactly k edges, and a distance-k independent set is a subset of vertices which pair-wise have no distance-k paths between members. A distance-k coloring is a mapping, φ : V → {1, 2, ..., p}, of the vertices to some set of identifiers (such as integers) referred to as "colors" such that no pair of vertices which are distance-k neighbors, or closer, have the same color [7]. The distance-k neighborhood of a vertex v is the set of all vertices which are distance-k neighbors of v, and is represented by Γ_k(v).

Γ_k(v) = {v_i | d_k(v, v_i) = true};

The term neighborhood, without a distance quantifier, typically refers to the distance-1 neighborhood, and for convenience is denoted by Γ(v) [28].
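Since all of these notions reduce to shortest-path distances, Γ_k(v) can be computed with a breadth-first search; a sketch (function names are ours):

```python
from collections import deque

def distances_from(adj, src):
    # Breadth-first search: shortest-path (edge-count) distance from src
    # to every reachable vertex.
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                q.append(w)
    return dist

def neighborhood_k(adj, v, k):
    # Gamma_k(v): vertices whose shortest path from v has length exactly k
    return {u for u, d in distances_from(adj, v).items() if d == k}

# A path graph 0 - 1 - 2 - 3
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
assert neighborhood_k(adj, 0, 1) == {1}
assert neighborhood_k(adj, 0, 2) == {2}
```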

McCormick has shown that the general distance-k graph coloring problem is

NP-complete through a reduction from 3SAT [13]; thus an exact polynomial-time solution to the general problem is infeasible unless P = NP. The available computational approaches to the problem, then, are to restrict the instances to some set of special cases, or to employ heuristics. Heuristics are algorithms which, while generally not guaranteed to find an optimal solution, apply some reasonable assumptions which limit the size of the search space so that a "good" result can be obtained within feasible time. Many of these algorithms contain logic which makes a "best available at the moment" type of decision, and are thus termed greedy algorithms. When it can be shown for such a heuristic algorithm that it will, in all cases, produce a result within a definable distance from the actual optimal solution, then that

algorithm is termed an approximation algorithm. The “goodness” of the result obtained from an approximation algorithm is measured by its approximation ratio. Assume an algorithm, A, returns some numerical result for a minimization problem. Let A(G) be that result for input G, and OPT(G) be the actual best possible (lowest number) answer, then

the approximation ratio is given by:

r_A(G) = A(G) / OPT(G)

An approximation algorithm will often demonstrate an approximation ratio that is a

function of the size of the input. When an approximation algorithm exhibits an approximation ratio that is within some constant factor of the optimum, regardless of the input, that algorithm is termed a constant ratio approximation algorithm.

2.6 D-1 Coloring Methods

An early heuristic approach which relied on simple distance-1 relationships was

suggested by Coleman and Moré [14], who established that a simple "sequential" distance-1 coloring of the column intersection graph of a Jacobian matrix provides a consistent partitioning of the columns of the matrix. In that paper Coleman and Moré

present a straight-forward algorithm (see figure 2.3) which considers the vertices of the

graph of M in some order, v1, v2,... vn, assigning each vertex in turn the lowest color number which is not used by any of that vertex’s distance-1 neighbors.

• let c ∈ ℤ⁺ represent numeric color values
• let Palette = {c_i | c_i is used to color graph G_cig(M)}
• let c(v) ∈ ℤ be the numeric color of some vertex v
• let CU = {c(v_i) | v_i ∈ Γ(v)}

for each v ∈ G_cig(M):
    c(v) = c ∈ Palette | (c ∉ CU) and (c is minimal)

Figure 2.3: Coleman–Moré Column Coloring

This column coloring algorithm corresponds to a straight-forward greedy coloring based on the column intersection graph.
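A direct Python rendering of this greedy scheme (our own naming; the adjacency structure stands in for G_cig(M)):

```python
def greedy_color(vertices, adj):
    # Sequential distance-1 coloring: each vertex, considered in the given
    # order, receives the smallest positive color not used by a neighbor.
    color = {}
    for v in vertices:
        used = {color[w] for w in adj[v] if w in color}
        c = 1
        while c in used:
            c += 1
        color[v] = c
    return color

# Column intersection graph of a small matrix: columns 0-1 conflict,
# columns 1-2 conflict, columns 0 and 2 do not.
adj = {0: {1}, 1: {0, 2}, 2: {1}}
color = greedy_color([0, 1, 2], adj)
assert color == {0: 1, 1: 2, 2: 1}
```

As the example shows, columns 0 and 2 share a color and could be evaluated in the same AD pass.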

2.7 Orthogonal Independence

The ordering of matrix computations relies inherently on the relationships of the equations represented in the matrix itself. Generally, computations which seek to re-use sub-functional results can proceed along a row of the Jacobian, corresponding to changing the derivative variable within a function evaluation, or along a column, corresponding to

changing the component function. Indeed, earlier algorithms employed just such a simple process, analyzing the matrices one column at a time, column-wise processing being preferred due to a slight advantage in the way AD computations proceed. An important step in improving the overall efficiency of these computations relies on

the observation that, for either AD or finite differencing, a linear combination of the rows or columns of the matrix can be computed at roughly the same cost as a single row or column. In certain circumstances the elements of the linear combination may contain only

values from the original matrix. This is the case when these elements are orthogonally

independent. Roughly speaking, orthogonal independence corresponds to the case that no two columns (rows) in a linear combination contain a non-zero value in the same row (column). More formally, two columns, x and y, within a matrix are said to be orthogonally independent if:

Given an m × n matrix, M,

where M_ij represents the cell in the i-th row and j-th column,

and 1 ≤ i ≤ m, 1 ≤ x ≤ n, 1 ≤ y ≤ n, x ≠ y:

∀i : ¬((M_ix ≠ 0) and (M_iy ≠ 0))

Multiple rows or multiple columns which are orthogonally independent may be compressed into a single group and evaluated in a single AD pass.
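The definition, and the single-pass evaluation it enables, can be sketched as follows; the toy vector function and the use of forward differencing as a stand-in for an AD pass are our own illustrative choices:

```python
def orthogonally_independent(M, x, y):
    # Columns x and y are orthogonally independent iff no row has a
    # non-zero in both.
    return not any(row[x] and row[y] for row in M)

M = [[1, 0, 0],
     [0, 1, 0],
     [1, 0, 1]]
assert orthogonally_independent(M, 0, 1)      # columns 0 and 1
assert not orthogonally_independent(M, 0, 2)  # both hit row 2

# Orthogonal columns can share one pass: differencing along the seed
# vector e_0 + e_1 yields a compressed vector from which each column's
# entries are read off at that column's non-zero rows.
def F(x):
    return [x[0], 2.0 * x[1], x[0] + 3.0 * x[2]]  # sparsity as in M

h = 1e-6
x0 = [0.0, 0.0, 0.0]
fx = F(x0)
xh = [x0[0] + h, x0[1] + h, x0[2]]            # seed e_0 + e_1
compressed = [(a - b) / h for a, b in zip(F(xh), fx)]
assert abs(compressed[0] - 1.0) < 1e-4        # J[0][0], read for column 0
assert abs(compressed[1] - 2.0) < 1e-4        # J[1][1], read for column 1
assert abs(compressed[2] - 1.0) < 1e-4        # J[2][0], read for column 0
```

Because columns 0 and 1 never overlap, every entry of the compressed vector is attributable to exactly one of them.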

2.8 D-2 Coloring Methods

Coleman and Moré's column intersection graph coloring has some straightforward

limitations, as discussed elsewhere herein. Newer techniques involve coloring the bipartite

graph representation of a matrix Gb(M). In particular these newer techniques examine two

graph coloring problems on Gb(M), Star Bicoloring and Acyclic Bicoloring. The connection between these specific graph coloring algorithms and the computation of

derivative matrices is non-trivial and was established independently by Coleman and Verma [3] and Hossain and Steihaug [1]. Briefly, both Star and Acyclic Bicoloring provide partitions of the rows and columns of M into color groups in such a way that each non-zero entry in M can be calculated from linear combinations of the rows or columns

within some color group. These bi-directional approaches often lead to better solutions with fewer color groups than a "single-sided" column-only method, and tend not to be limited by the density of any particular row or column of the input matrix.

The distance-2 coloring of the bipartite graph representation of a Jacobian matrix,

G_b(M), is considered to be an archetypal problem for the efficient ordering of a derivative matrix for computation [7]. Of the many varieties of distance-2 coloring algorithms in the literature, we will be interested in two in particular: Star Bicoloring and Acyclic Bicoloring.

There are several reasons to consider distance-2 coloring methods in preference to those based on distance-1 techniques [7]. For example, distance-2 colorings have a general re-usable structure for many specific coloring problems, and thus “simplifies the design of algorithms for the specialized variants.” Perhaps most significantly, the distance-2 bipartite approach directly supports more coloring strategies: uni-directional

column-wise; uni-directional row-wise; or bi-directional approaches which form both groups of orthogonal columns and of orthogonal rows. We now present the formal definitions for both the star bicoloring and the acyclic bicoloring of bipartite graphs in Definitions 1 and 2, below.

Definition 1. Given a bipartite graph, G_b = ({V1, V2}, E), where by definition V1 and V2 are necessarily disjoint (i.e. V1 ∩ V2 = ∅), a star bicoloring of G_b is a function C_sbc : {V1, V2} → {⊥, 1, 2, ..., p} that satisfies the following four conditions: ([7], Definition 5.5)

1. The function C_sbc is a proper coloring of G_b, i.e., ∀(i, j), if (v_i, v_j) ∈ E then C_sbc(v_i) ≠ C_sbc(v_j).

2. The (non-bottom) color palettes for V1 and V2 are disjoint. That is, let C*_sbcV1 be the set of colors used to color bipartition V1, and C*_sbcV2 be the set of colors used to color bipartition V2; then C*_sbcV1 ∩ C*_sbcV2 ⊆ {⊥}.

3. Any two vertices v_i and v_j ∈ {V1, V2} connected to the same vertex v_bottom, where C_sbc(v_bottom) = ⊥, have different colors.

4. Every path of edge length ≥ 3 in G_b uses at least 3 colors.
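For small graphs the four conditions can be checked by brute force. The validator below is our own transcription of Definition 1 (with 0 standing for ⊥); condition 4 is checked over all three-edge paths, which suffices because any longer path contains a three-edge subpath:

```python
def is_star_bicoloring(V1, V2, E, c):
    # c maps each vertex to 0 (the neutral color, i.e. bottom) or a
    # positive color.
    adj = {v: set() for v in V1 | V2}
    for u, v in E:
        adj[u].add(v)
        adj[v].add(u)
    # Condition 1: adjacent vertices receive different colors.
    if any(c[u] == c[v] for u, v in E):
        return False
    # Condition 2: the non-bottom palettes of V1 and V2 are disjoint.
    if ({c[v] for v in V1} & {c[v] for v in V2}) - {0}:
        return False
    # Condition 3: neighbors of a bottom-colored vertex differ pairwise.
    for v in adj:
        if c[v] == 0:
            cols = [c[w] for w in adj[v]]
            if len(cols) != len(set(cols)):
                return False
    # Condition 4: every path on 4 distinct vertices uses >= 3 colors.
    for a in adj:
        for b in adj[a]:
            for x in adj[b] - {a}:
                for y in adj[x] - {a, b}:
                    if len({c[a], c[b], c[x], c[y]}) < 3:
                        return False
    return True

# A small bipartite path: r1 - c1 - r2 - c2
V1, V2 = {'r1', 'r2'}, {'c1', 'c2'}
E = [('r1', 'c1'), ('r2', 'c1'), ('r2', 'c2')]
assert is_star_bicoloring(V1, V2, E, {'r1': 1, 'r2': 2, 'c1': 0, 'c2': 0})
assert not is_star_bicoloring(V1, V2, E, {'r1': 1, 'r2': 1, 'c1': 0, 'c2': 0})
```

The failing example violates condition 3: the bottom-colored vertex c1 sees two neighbors with the same color.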

Definition 2. An acyclic bicoloring of G_b is defined as a function C_abc : {V1, V2} → {⊥, 1, 2, ..., p} that satisfies the following four conditions: ([7], Definition 6.8)

1. The function C_abc is a proper coloring of G_b, i.e., ∀(i, j), if (v_i, v_j) ∈ E then C_abc(v_i) ≠ C_abc(v_j).

2. The (non-bottom) color palettes for V1 and V2 are disjoint. That is, let C*_abcV1 be the set of colors used to color bipartition V1, and C*_abcV2 be the set of colors used to color bipartition V2; then C*_abcV1 ∩ C*_abcV2 ⊆ {⊥}.

3. Any two vertices v_i and v_j ∈ {V1, V2} connected to the same vertex v_bottom, where C_abc(v_bottom) = ⊥, have different colors.

4. Every cycle in G_b uses at least 3 colors.

A number of basic properties of both star and acyclic bicoloring have been established. For instance, Coleman and Verma have shown that the minimization version of Star Bicoloring (using the fewest colors) is NP-hard [3]. Multiple heuristic algorithms for these problems have been independently developed [1, 3], but, to this author's knowledge, no approximation analyses have been presented prior to [2].

For simplicity of future discussion, we note here that χ_ab ≤ χ_sb, as it is established that any valid star bicoloring is also a valid acyclic bicoloring [7]. While our findings herein are therefore generally applicable to both problems, when discussing the related minimization problems we will frequently refer simply to the Star Bicoloring case as it is an upper bound on the Acyclic Bicoloring problem.

2.9 Uni-directional versus Bicoloring Approaches

Early algorithms for structural orthogonality considered a "uni-directional" partitioning of the columns of the input matrix (for example, [14]). Considering that AD

operates in two modes, forward and reverse, which correspond to a column-vector or a row-vector ordering of the computational process, both row groups ("row colors") and column groups ("column colors") would be valid inputs to the AD process. This observation led to a straight-forward extension of the uni-directional method, which computed an orthogonal grouping of the columns of matrix M, and then repeated the process on M^T, producing an orthogonal grouping of the rows. The preferred solution was then the partitioning of minimum cardinality. Both Hossain and Steihaug [1] and Coleman and Verma [3] noted that row and column coloring could be combined, producing an orthogonal partitioning of a subset of the columns as well as an orthogonal partitioning of a subset of the rows, which together would allow the direct computation of all non-zero values in the matrix. A major advantage of this approach is that certain "dense" rows or columns can be eliminated from the matrix, often increasing the structural orthogonality of the remaining rows and/or columns. For example, consider the simple example matrix in Figure 2.4.a, where "X"

represents an arbitrary non-zero value:

a.          b.          c.          d.    e.
X X X X     X X X X     · · · ·     ·     ·
X 0 0 0     X 0 0 0     X 0 0 0     X     X
0 X 0 0     0 X 0 0     0 X 0 0     X     X
0 0 X 0     0 0 X 0     0 0 X 0     X     X
0 0 0 X     0 0 0 X     0 0 0 X     X     X

Figure 2.4: A Matrix with No Initial Structurally-orthogonal Columns (greyed-out entries shown as "·")

It is clear that none of the columns in Figure 2.4.a can be combined, so four column colors would be required for a complete cover of the non-zero elements of this matrix in a uni-directional column coloring. However, assume row one is assigned row color one, as in Figure 2.4.b. The values in row one after that coloring step may be computed as one pass in AD and, following their computation, need not be considered further in forming

subsequent orthogonal groups (and are thus “greyed-out” in 2.4.c). With the elimination

of row 1, the remaining columns in 2.4.c are now orthogonal, and may be compressed for coloring purposes, as in 2.4.d. Lastly, in 2.4.e, this reduced column can receive a single column color, resulting in using two colors, not four, to provide a complete direct cover for the matrix. This illustrates that a bicoloring algorithm, one which considers the formation of both orthogonal row groups as well as orthogonal column groups, can unlock opportunities for greater compression of the matrix, use fewer colors in the solution, and require fewer AD passes to compute the matrix.
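This walk-through can be verified mechanically; the sketch below uses the dense-first-row pattern described above (our own encoding of the Figure 2.4.a matrix):

```python
from itertools import combinations

# Figure 2.4.a: a dense first row over an otherwise diagonal pattern
# (1 = non-zero).
M = [[1, 1, 1, 1],
     [1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1]]

def orthogonal(M, i, j, rows):
    # columns i and j are orthogonal over the surviving rows
    return not any(M[r][i] and M[r][j] for r in rows)

cols, rows = range(4), set(range(5))
# Uni-directional: every pair of columns meets in row 0, so no two
# columns may share a color; a complete cover needs 4 column colors.
assert not any(orthogonal(M, i, j, rows) for i, j in combinations(cols, 2))

# Bicoloring: give row 0 a row color (one pass), eliminate it, and the
# remaining columns become pairwise orthogonal; one more pass suffices.
rows -= {0}
assert all(orthogonal(M, i, j, rows) for i, j in combinations(cols, 2))
# Total: 2 colors (1 row color + 1 column color) instead of 4.
```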

2.10 Bidirectional Direct Computation, SeqDC

McCormick [13] provides a simple classification scheme to distinguish unidirectional algorithms from those which use compression and unlocking to reduce the number of colors required to cover a given matrix. The uni-directional column- (or row-) oriented

algorithms, such as the early benchmark CPR program from Curtis, Powell and Reid [10], are termed Non-Overlapping Direct Covers (NDCs). “No overlap may occur within any group of columns, i.e. every [color] group has at most one unknown in each row.” [13]. After the assignment of all colors, each color group will be a structurally orthogonal

collection. In contrast, bidirectional algorithms allow for elimination of non-orthogonal dependencies as color groups are formed and are termed Sequentially Overlapping Direct Covers (SeqDC). The rows or columns in the resulting color sets from a SeqDC algorithm need not be structurally orthogonal according to the contents of the rows and columns in the original graph, but become structurally orthogonal due to an inherent ordering of the color sets, with earlier colors resolving the orthogonality conflicts in the later groups. These ordered bidirectional colorings provide provably better results and will be the focus of our research.

2.11 Chromatic and Coloring Numbers

To provide a reference for the relative coloring performance of these various algorithms and approaches, it is useful to reference the theoretical best-case result. Colloquially, the chromatic number of a graph is the least number of colors required to

assign a color to each vertex such that no pair of adjacent vertices shares the same color. Given a general graph G = (V, E), let c be a coloring which assigns a color c(x) to each vertex x ∈ V. The chromatic number of G, χ(G), is given by:

χ(G) = |{ c : ∀(u, v) ∈ E, c(u) ≠ c(v) }|, where the number of colors used by c is minimized

Intuitively, χ would have some relationship to the maximum degree of vertices in the graph, and Ore [29] showed such a relationship to be:

χ(G) ≤ max_{v_i ∈ V} { d(v_i) } + 1
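The degree bound can be illustrated with a simple greedy coloring sketch: give each vertex, in any order, the smallest color unused by its already-colored neighbors. No vertex ever sees more than Δ(G) occupied colors, so at most Δ(G) + 1 colors are used. The adjacency list below is an arbitrary example, not one drawn from the text.

```python
def greedy_color(adj, order):
    """Assign each vertex the smallest color not used by its neighbors."""
    color = {}
    for v in order:
        used = {color[u] for u in adj[v] if u in color}
        color[v] = min(c for c in range(len(adj) + 1) if c not in used)
    return color

# A 5-cycle with one chord: maximum degree 3, so greedy needs at most 4 colors.
adj = {0: [1, 4, 2], 1: [0, 2], 2: [1, 3, 0], 3: [2, 4], 4: [3, 0]}
coloring = greedy_color(adj, order=list(adj))
max_deg = max(len(ns) for ns in adj.values())
assert all(coloring[u] != coloring[v] for u in adj for v in adj[u])
assert len(set(coloring.values())) <= max_deg + 1
```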

This concept is not limited to distance-1 coloring, having intuitive extensions to other coloring methods. In fact, χ could be more properly termed the distance-1 chromatic number, χ_1. Similarly, the minimal number of colors required for a general distance-k coloring is χ_k(G), and, specific to the methods of interest herein, the minimum for an acyclic bicoloring of graph G is referred to as χ_ab(G), and for a star bicoloring as χ_sb(G). The relationships of the various chromatic numbers can be used to illustrate a basic advantage of bidirectional coloring over uni-directional methods. Given a bipartite graph representation of a Jacobian matrix, G_b = ({V_1, V_2}, E):

χ_ab(G_b) ≤ χ_sb(G_b) ≤ min{ χ_2(G_b, V_1), χ_2(G_b, V_2) }   [3]

where χ_k(G_b, α) is a distance-k coloring of graph G_b in which all vertices in α ⊆ {V_1, V_2} are colored ⊥ (that is, "bottom" or color 0). Thus, the options within the right-hand expression represent the two choices for "one-sided" colorings of the bipartite graph.

The first inequality holds because any star bicoloring φ is also an acyclic bicoloring. The second inequality holds because "a partial distance-2 coloring on V_2 is a valid star bicoloring of ... G_b where all vertices in V_1 are restricted to be colored with color zero." [3] An important consequence of this inequality is that both the acyclic bicoloring and the star bicoloring of a graph may require fewer colors than any single-sided uni-directional distance-2 coloring. While this sequence of inequalities allows that sometimes χ_ab(G_b) = χ_sb(G_b) = χ_uni-2(G_b), Coleman and Verma provide an example where the difference can be quite large [3]. Given a graph G with n vertices, they illustrate the potential for:

χ_ab(G_b) ≤ χ_sb(G_b) = 3 ≤ min{ χ_uni-row-2(G_b, V_1), χ_uni-col-2(G_b, V_2) } = n

Another coloring metric related to the chromatic number is the coloring number, which can be visualized by considering the coloring process. The assignment of successive colors to the reduced matrices produced by successive elimination of groups of orthogonal rows or columns, as is done in a bi-directional SeqDC coloring, implies an

ordering of the vertices of the matrix. Given such an ordering, we define the back-degree

of a vertex v as the number of vertices u_i which are adjacent to v in the original non-reduced matrix M and precede v in the ordering. This quantity is also referred to as the incidence degree and has been shown to be an effective greedy decision heuristic [12].

Given a general graph G = (V, E) and an arbitrary vertex ordering π : V → {1, 2, 3, ..., n},

let B_π(G) represent the maximum back-degree of any vertex in ordering π. Considering the n! possible vertex orderings of G, let B*(G) = min_π { B_π(G) } be the minimum of the maximum back-degrees of the possible orderings. For the ordering producing B*(G), Gebremedhin et al. show that col(G) = B*(G) + 1 colors is sufficient for a distance-1 coloring of G [7]. This quantity, col(G), is referred to as the coloring number of G.

Noting that the search space for col(G) involves n! orderings suggests that computing this metric in the general case may be intractable. However, according to Gebremedhin [7], both Finck and Sachs [30] and Matula [31] provide linear-time algorithms which produce the optimal ordering π_opt, resulting in efficient algorithms for computing col(G). Since computing χ(G) is NP-hard (one of "Karp's 21 problems" [32]), this is useful for providing a computable upper bound on χ(G) that is somewhat tighter than the previously discussed bound based on the maximum-degree vertex in G:

χ(G) ≤ col(G) ≤ ∆(G) + 1   [7]
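On a small graph the sandwich χ(G) ≤ col(G) ≤ ∆(G) + 1 can be checked by brute force over all orderings; the n! search below is purely illustrative, since it is exactly what the cited linear-time algorithms avoid.

```python
from itertools import permutations

def max_back_degree(adj, order):
    """B_pi(G): the largest number of neighbors of v that precede v in order."""
    pos = {v: i for i, v in enumerate(order)}
    return max(sum(1 for u in adj[v] if pos[u] < pos[v]) for v in order)

def coloring_number(adj):
    """col(G) = B*(G) + 1, minimizing the maximum back-degree over orderings."""
    return 1 + min(max_back_degree(adj, p) for p in permutations(adj))

# The 4-cycle: chi = 2, col = 3 (it contains a cycle, so no ordering achieves
# maximum back-degree 1), and Delta + 1 = 3.
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
col = coloring_number(adj)
max_deg = max(len(ns) for ns in adj.values())
assert 2 <= col <= max_deg + 1
```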

2.12 Direct versus Substitution Methods

While discussing the topic of methods for algorithmic comparison, it is interesting to point out the implications of the slight difference in definition between star and acyclic bicoloring. Both uni-directional and bi-directional direct coloring methods rely on an ordering of structurally orthogonal sets of rows or columns. Within these sets, no member

has a conflicting non-zero entry in the same position (row or column, as appropriate) as another member. These methods are referred to as “direct,” as they provide an ordering where each element can be computed directly, without dependence on any member of another set. The restriction on the ordering of the sets is simply that if a set only becomes

orthogonal after other sets are colored, then those “unlocking sets” must be colored first. The requirement of orthogonal independence within a color group can be relaxed

somewhat at the cost of a more restrictive ordering. For example, a column group c_j could be allowed to contain columns which lacked orthogonality in row r_i, as long as r_i was ordered for coloring prior to c_j. The obvious similar argument would allow for non-orthogonality within row groups. This approach is known as the substitution method, and can lead to graph colorings using fewer colors than direct methods.

Coleman and Verma have shown that the characterization of the star bicoloring problem given in section 2.4.1 above corresponds to a bi-directional distance-2 direct graph coloring, while acyclic bicoloring (section 2.4.2) corresponds to the substitution variant [3].

2.13 Ordering Methods

As a final section within the topic of algorithmic comparison, we consider that many options for the order in which to select the input data exist, and that these choices do, in

general, affect greedy algorithms. Specific greedy algorithms are sometimes described as “non-deterministic” due to the fact that they can produce different results from the same data, merely by re-ordering the input (possibly due to the treatment of “ties” within the decision criteria). Conversely, this implies that selecting a “good” ordering of the input might improve the performance of greedy algorithms. Gebremedhin et al. show that the greedy general graph coloring problem has the interesting property that, when presented with the right input ordering, it will produce an optimal coloring [7]. This characteristic is not generally shared with other greedy optimization problems. The following subsections summarize common ordering methods employed in greedy graph coloring algorithms.

2.13.1 Largest First

An intuitive ordering suggested by Welsh and Powell [33] is to consider the vertices

by non-increasing order of their degree in the input graph. This largest first (LF) ordering can be efficiently pre-computed, and often produces reasonable results. However, Coleman and Moré provide examples where LF performs poorly on bipartite graphs [14].

2.13.2 Smallest Last

Gebremedhin [7] credits Matula [31] for proposing an ordering which is generally considered an improvement over largest first. This smallest last (SL) ordering is 49

pre-computed in time proportional to |E|, and is determined in reverse order. The last ordered vertex is the vertex v_δ of minimum degree in G. Each successively preceding vertex v_i is the vertex of minimal degree in G - {v_i}^+, where {v_i}^+ is the set of all vertices ordered after (computed prior to) v_i. While SL conforms to an upper bound which has been demonstrated to be less than or equal to the upper bound for largest first, SL also has been shown to behave poorly on certain bipartite graphs [14]. Matula is also reported to show a particularly interesting property of the SL ordering: a sequential greedy algorithm using SL will use the same number of colors on graph G as the coloring number of G, col(G), thus providing an efficient computation of col(G).

2.13.3 Incidence Degree

Coleman and Moré also explore an ordering which performs similarly to SL while having the desirable property of providing an optimal distance-1 coloring on bipartite graphs [14]. The incidence degree of a vertex v_i is the degree of v_i in the subgraph G_i induced by v_i and all previously colored vertices {v_1, v_2, ..., v_{i-1}}. An incidence degree ordering greedily selects the next vertex based on the maximum incidence degree. Incidence degree ordering is efficient to compute, and typically provides good results [2, 14], but may perform poorly on 3-colorable graphs [14].
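The incidence degree selection can be sketched as follows; the start vertex and the tie-breaking rule are left as free choices, as in the text.

```python
def incidence_degree_order(adj, start):
    """Greedy incidence-degree ordering: the next vertex is an unordered one
    with the most already-ordered neighbors (a quadratic sketch)."""
    order, placed = [start], {start}
    while len(order) < len(adj):
        v = max((u for u in adj if u not in placed),
                key=lambda u: sum(1 for w in adj[u] if w in placed))
        order.append(v)
        placed.add(v)
    return order

# A star centered at 0: starting from leaf 1, the center is chosen second
# because it is the only vertex adjacent to an already-ordered vertex.
adj = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
order = incidence_degree_order(adj, start=1)
assert order[1] == 0
```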

2.13.4 Saturation Degree

The saturation degree of a vertex provides yet another option for relative degree which can be used as a greedy selection criterion. A vertex's saturation degree was defined by Brélaz as "the number of different colors to which it is adjacent" [34]. Initially, the saturation degrees of all v ∈ {V_1, V_2} of a bipartite input graph G_b will be 0, so a saturation degree ordering may start with an arbitrary vertex or one chosen by some other heuristic (such as maximum ordinary degree in G_b). The ordering proceeds by coloring the selected vertex with the lowest permitted color, and then iteratively selecting the next vertex of maximum saturation degree. Ties may be broken arbitrarily, or with some other heuristic. While not providing a proof of the claim, Gebremedhin et al. argue that SD should generally provide a smaller coloring than ID, and ID should generally provide a smaller coloring than LF. While LF and ID can be implemented with a run time of O(|E|), SD ordering requires O(|V|^2), which is a drawback for sparse graphs.

2.14 Existing Coloring Algorithms

This section provides a brief summary of some noteworthy algorithms. Curtis-Powell-Reid (CPR) coloring illustrates the early column-oriented uni-directional coloring schemes, while Minimum Nonzero Count Ordering (MNCO) and Complete Direct Cover (CDC) are illustrative of the diversity within bi-directional approaches.

2.14.1 Uni-directional Coloring

In 1974 Curtis, Powell and Reid [10] noted the utility of grouping columns of a Jacobian matrix in order to simplify its computation. They correctly identify the value of utilizing the typical sparsity of a Jacobian to compress an input matrix, using a simple occurrence-based ordering, making no greedy decisions. Curtis, Powell and Reid's algorithm is presented in Figure 2.5.

2.14.2 Minimum Nonzero Count Ordering with Incidence Degree Ordering (MNCO/IDO)

One of the earliest published bi-directional bicolorings was Minimum Nonzero Count

Ordering (MNCO), by Coleman and Verma [3]. MNCO is a two-stage algorithm, which

first computes two submatrices of M, J_C and J_R, which (confusingly) correspond to a subset of the rows and columns of M respectively, and then colors these submatrices independently. This is consistent with the rules for acyclic and star bicolorings, as the

• given input matrix M

• let C_i be the i-th column of matrix M

• let |C_x| be the number of columns in matrix x

• let G_i be a set, or "group," of columns

M' = M
i = 1
while |C_M'| > 0
    G_i = {C_0}
    for j = 1 to |C_M'|
        if (G_i ∩ C_j = ∅)
            G_i = G_i ∪ C_j
    M' = M' \ G_i
    i++

Figure 2.5: Curtis, Powell and Reid (CPR) Coloring
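A direct rendering of Figure 2.5 in code might look as follows; columns are represented by their sets of nonzero row indices, and a column joins the current group whenever it collides with no column already in it. The example pattern is hypothetical.

```python
def cpr_groups(columns):
    """Greedily partition columns into structurally orthogonal groups,
    scanning columns in plain occurrence order as CPR does."""
    remaining = dict(columns)
    groups = []
    while remaining:
        group, covered_rows = [], set()
        for c, rows in list(remaining.items()):
            if not (rows & covered_rows):  # no shared nonzero row
                group.append(c)
                covered_rows |= rows
                del remaining[c]
        groups.append(group)
    return groups

# Columns 0 and 2 collide in row 1; columns 1 and 3 collide in row 2.
cols = {0: {0, 1}, 1: {2}, 2: {1, 3}, 3: {2, 4}}
assert cpr_groups(cols) == [[0, 1], [2, 3]]
```

Each returned group corresponds to one column color, i.e. one AD pass.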

color sets used to color rows and columns are required to be distinct; thus MNCO suffers no penalty in terms of total colors used from this separated coloring strategy. MNCO builds its submatrices by successively moving either a given row r or column c of M' into submatrix J_C or J_R, where M' is M as reduced by elimination of the rows and columns previously moved. The row/column decision is based on the relative impact adding a row or a column has on the sum of the maximum non-zero densities of J_C and J_R.

Given J_C and J_R as above, a function nnz(α) which provides the number of non-zero elements in a row or column, and a function ρ which provides the maximum number of non-zero elements in a row of a row-wise matrix, or in a column of a column-wise matrix:

ρ(J_R) + max{ ρ(J_C), nnz(r) } < ρ(J_C) + max{ ρ(J_R), nnz(c) }   [3] (2.1)

For each iteration, if equation (2.1) is true, then adding the row under consideration to J_C would increase ρ(J_C) less than adding the current column to J_R would increase ρ(J_R). Please see Figure 2.6 for the MNCO algorithm.³

R = (1 : m), C = (1 : n), M' = M
while M' ≠ ∅
    select r ∈ R with fewest non-zeroes in M'
    select c ∈ C with fewest non-zeroes in M'
    if equation (2.1) is true
        J_C = J_C ∪ (r ∩ C)
        R = R - {r}
        M' = M' - r
    else
        J_R = J_R ∪ (c ∩ R)
        C = C - {c}
        M' = M' - c

Figure 2.6: Minimum Nonzero Count Ordering (MNCO) [3]

One feature of Coleman and Verma’s approach is that, following construction of the subgraphs, the choice of method for the actual coloring step is independent of the MNCO algorithm itself. Coleman and Verma select a greedy coloring based upon Incidence

Degree Ordering [3][14].

2.14.3 Complete Direct Cover (CDC)

Hossain and Steihaug present a straightforward greedy algorithm based on Γ(Γ(v)), the distance-2 neighborhood of a vertex. They use the property of bipartite graphs that, for any two vertices v_1 and v_2 on the same side of the bipartition, if Γ(v_1) ∩ Γ(v_2) = ∅ then v_1 and v_2 are structurally orthogonal and can be part of the same color group. This insight is combined with greedy selection based upon maximal vertex degree in G - {v_i ∈ C}, where C is the current set of previously-colored vertices, to provide a "complete direct cover" of the non-zero values in G. The CDC algorithm is more fully described in Figure 2.7.

³ Figure 2.6 contains corrections of typographical errors found in the original publication.

Given: G_b = ({V_1, V_2}, E) a bipartite graph

color = 0
edgecount = 0
V_1* = V_1
V_2* = V_2
while edgecount < |E|
    color++
    W_k = ∅
    sort {v_1 ∈ V_1*, v_2 ∈ V_2*} by non-increasing degree
    select w_1, vertex of maximum degree
    W_k = W_k ∪ w_1
    for each w_n on same side of bipartition as w_1
        if (!d2(w_1, w_n))
            W_k = W_k ∪ w_n
    if (w_1 ∈ V_1)
        V_1* = V_1* - W_k
    else
        V_2* = V_2* - W_k
    assign color to each vertex in W_k
    let E_k be the edges with endpoints in W_k
    edgecount += |E_k|

Figure 2.7: Complete Direct Cover

3 Initial Results

In our survey of relevant literature, we observed a variety of effective approaches to solving Star Bicoloring. These methods vary not only in their ordering strategies, but even in the structure of the algorithms. Some are elegantly simple, while some use complex multi-phased algorithms. Some rely on a one-sided coloring model, while newer algorithms recognize that a bicoloring strategy generally achieves better results. The analyses published concurrently with these algorithms provide several important results, including establishing that Star Bicoloring provides correct answers to the Jacobian ordering problem, and that the Star Bicoloring decision problem is NP-complete. Notably absent from prior work is an approximation analysis for any of these related methods. We begin to address this gap in this chapter. We present Approximate Star BiColoring (ASBC), a new algorithm for

Star Bicoloring which has been engineered to be competitive in empirical coloring performance while enabling the first approximation analysis for any Star Bicoloring algorithm. Additionally, we develop a new lower bound on approximability results for Star Bicoloring, and provide comparative benchmark results for ASBC, MNCO and

CDC. The majority of the material presented in this section reflects a summary of our initial published results from [2].

3.1 An Approximation Lower Bound

Our investigation of the properties of star and acyclic bicoloring begins with consideration of the approximability limits for these algorithms. Given an algorithm A and

problem instance I, the approximation ratio of A, r_A(I), is a measure of the quality of the algorithm which compares an easily-computable measure of the output of A (such as the number of colors used to color graph G_b) with the optimal result. For minimization problems,

r_A(I) = A(I) / opt(I) ≥ 1

where opt(I) is the value of an optimal solution. For instance, if A were to solve star bicoloring for graph G_b with 4 colors, where the graph had a 3-color solution, then r_A(I) = 4/3. When an algorithm A always produces results which conform to r ≤ c for some constant c, then A is termed a constant-ratio approximation algorithm. If r_A(I) ≤ f(I) for every I, then algorithm A provides an approximation guarantee of f. Building on the work of Zuckerman [35], who shows for general graphs that

Proposition 1: No polynomial-time algorithm can approximate χ(G) to within |V|^(1-ε) for any ε > 0 unless P = NP,

Juedes and Jones construct an approximability-preserving reduction from graph coloring to star bicoloring which leads to an analogous lower bound for χ_sb(G_b) [2]. Given general graph G = (V, E) (see Figure 3.1A), let n = |V|. Each edge {u_i, v_j} ∈ E is replaced by a subgraph of n + 1 nodes, {w^{ij}_1, w^{ij}_2, ..., w^{ij}_{n+1}}, with subgraph edges {(u_i, w^{ij}_1), (u_i, w^{ij}_2), ..., (u_i, w^{ij}_{n+1})} and {(v_j, w^{ij}_1), (v_j, w^{ij}_2), ..., (v_j, w^{ij}_{n+1})} (see Figure 3.1B). Figure 3.1C is formed by moving the original vertices of G to one side of a bipartition, and the new vertices created by the transformation to the other side.

Let G_b = (V_b, E_b) be a bipartite graph constructed as above from graph G, and let A_b be an algorithm for Minimum Star Bicoloring achieving the approximation ratio r = |V|. Using A_b and the approximation-preserving reduction we get a polynomial-time approximation algorithm A for Minimum Graph Coloring. Since |V_b| = (|V| + 1) · |E| + |V| ≤ |V|^3, r ≥ |V_b|^(1/3). This leads directly to

Figure 3.1: Transforming a General Graph G into a Bipartite Graph Gb

Juedes-Jones Corollary 3.2: For a bipartite graph G_b of n vertices, no polynomial-time algorithm approximates χ_sb(G_b) to within n^(1/3 - ε) for any ε > 0, unless P = NP.

A complete proof of Corollary 3.2 is provided in [2].

3.2 Approximate Star BiColoring (ASBC), an Approximation Algorithm

In this section, after a brief summary of its design motivation, we present the Approximate Star BiColoring (ASBC) algorithm and establish that it provides valid star and acyclic bicolorings.

As was discussed in the literature review section, bi-directional coloring approaches provide demonstrably better results than single-sided colorings. While uni-directional algorithms provide a tight upper bound on bi-directional coloring results, straightforward examples show how bi-directional methods can perform better by a factor of n. Since the exact solution of the bi-directional coloring optimization problem is NP-hard in general [3], the example algorithms presented in prior sections are greedy heuristic algorithms, with their performance depending upon the data ordering method and/or the specific greedy selection criterion employed. The technical specifics of data ordering and greedy selection have an impact on the construction of an approximation analysis. Coleman and Verma's MNCO algorithm uses a "minimal impact on maximum non-zero density" metric to formulate column- and row-partitions of the input matrix, and then uses Incidence Degree Ordering in the coloring step. Hossain and Steihaug (CDC algorithm) consider the size of the "neighborhood of the neighborhood" of each input

vertex to both select from and order the data. In the case of both MNCO and CDC, a coloring performance analysis appears not to be straightforward, with neither algorithm's authors providing one. By judiciously selecting the greedy decision criterion, an alternative algorithm which provides similar

coloring performance can also be structured to yield a more direct approximation analysis. As a basis for designing this algorithm, in [2] Juedes and Jones provide a proof of the following inequality

χ_sb(G_b) ≥ χ_ab(G_b) ≥ |E_b| / max{ |V_b1|, |V_b2| } ≥ |E_b| / ( |V_b1| + |V_b2| )   (3.1)

The proof establishes that both star bicoloring and acyclic bicoloring are bounded below by the ratio of the number of edges in G_b to the number of vertices in G_b. This observation, along with the selection logic within this new algorithm, will facilitate an approximation analysis (see section 3.3). ASBC is an efficient algorithm for both minimum star bicoloring and minimum acyclic bicoloring with an approximation ratio of O(n^(2/3)). This algorithm determines an ordering of sets of vertices by successively computing distance-2 independent sets on the

Given: G_b = ({V_b1, V_b2}, E_b) a bipartite graph, i = 1 or i = 2

V' = ∅
while E_b ≠ ∅
    select vertex u ∈ V_bi of highest degree
    V_bi = V_bi \ {u, ΓΓ(u)}
    V' = V' ∪ {u}
return V'

Figure 3.2: Greedy Independent Set
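A sketch of the routine in Figure 3.2: vertices on the chosen side are taken in order of decreasing degree, and each selection removes its distance-2 neighborhood ΓΓ(u) from the candidates, so the returned set is structurally orthogonal. The edge set below is a made-up example.

```python
def greedy_independent_set(edges, side):
    """Greedily build a distance-2 independent set on one side of a bipartite
    graph. `edges` is a set of (v1, v2) pairs; side is 0 or 1."""
    adj = {}
    for e in edges:
        adj.setdefault(e[side], set()).add(e[1 - side])
    candidates = set(adj)
    chosen = []
    while candidates:
        u = max(candidates, key=lambda v: len(adj[v]))  # highest degree first
        chosen.append(u)
        # Drop u and every candidate sharing a neighbor with u, i.e. GG(u).
        candidates -= {v for v in candidates if adj[v] & adj[u]}
        candidates.discard(u)
    return chosen

# Column 2 shares row 'a' with column 0, so it is knocked out when 0 is chosen.
edges = {(0, 'a'), (0, 'b'), (2, 'a'), (1, 'c')}
result = greedy_independent_set(edges, side=0)
assert result == [0, 1]
```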

bipartite graph representation of the Jacobian. As with other algorithms, each computed set is then assigned a color, and then removed from the graph. The result is that for each orthogonal column (or row) set colored, the overall orthogonality of the remaining rows (columns) of the graph will increase, thus likely increasing the number of rows (columns) which subsequently can be compressed into following colors. ASBC is a “greedy” algorithm, with the independent sets computed by the Greedy Independent Set routine,

see Figure 3.2 [2]. To drive Greedy Independent Set, ASBC uses selection logic which considers the maximum degree of any remaining vertex in G_b, and its ratio to n_1 = |V_1| and n_2 = |V_2|. See Figure 3.3.

To see that ASBC produces both valid star and acyclic bicolorings, first consider that a valid star bicoloring is also a valid acyclic bicoloring, so it will suffice to show that ASBC satisfies the four conditions for a star bicoloring (see Definition 1). Conditions (1.1)

and (1.2) are immediately satisfied as the colors used for V_b1 and V_b2 are independent. Condition (1.3) is met because:

• ASBC only assigns the color ⊥ when all edges to a vertex have been eliminated, meaning all adjacent vertices have previously been colored;

• at no time will ASBC assign two vertices which share a neighbor the same color, so as the vertices adjacent to an eventual "bottom vertex" are colored, they will each receive a different color.

Given bipartite graph G_b = ({V_b1, V_b2}, E_b)

i = 1
while E_b ≠ ∅
    color all vertices of degree 0 with color ⊥
    delete all vertices of degree 0
    let ∆ be the maximum degree of any remaining vertex
    let n_1 = |V_b1|, n_2 = |V_b2|
    if (n_1 > n_2) and (n_1/∆ > ∆)
        V' = Greedy_Independent_Set(G_b, 1)
    else if (n_2/∆ > ∆)
        V' = Greedy_Independent_Set(G_b, 2)
    else
        if V_b1 contains the vertex of maximum degree
            V' = Greedy_Independent_Set(G_b, 1)
        else
            V' = Greedy_Independent_Set(G_b, 2)
    color all v ∈ V' with color i
    G_b = G_b \ V'
    i++

Figure 3.3: Approximate Star Bicoloring

For condition (1.4), consider that any arbitrary path ρ of length 3 in G_b will contain 2 vertices from V_b1 and 2 vertices from V_b2. Now, some vertex in the path must be colored first. Without loss of generality, consider that a vertex v_c ∈ V_b1 is in the current set of vertices to be colored. Since the vertices from V_b2 along ρ have not yet been colored, one of those vertices will have edges to both v_c and the other V_b1 vertex, v_nc, along ρ. Since v_c and v_nc share a common neighbor they are not structurally orthogonal, and therefore v_nc cannot be included in the current coloring group and will receive a color distinct from that of v_c.

3.3 ASBC Approximation Analysis

A key objective of our preliminary results was to establish that an approximation ratio does exist for some greedy algorithm for star and acyclic bicoloring. Having designed the ASBC algorithm with this goal in mind, we present in this section the first approximation analysis for an algorithm of this type. The argument hinges on an analysis of the minimum number of edges which can be eliminated in each pass, and then ties that result to the maximum number of colors that will be required.

3.3.1 Edge Elimination

The computational goal of ASBC is to assign a color to each vertex in G_b according to the star/acyclic bicoloring criteria. Once each vertex has a legal color, the computation is, obviously, finished. As the coloring proceeds, a series of derivative graphs G'_bi is formed, with the colored vertices of the prior coloring pass and their associated edges removed.

Since the termination process for a series of disconnected vertices is to merely color those with ⊥, the elimination of all edges from the graph can also be viewed as establishing completion of the coloring process. Notice that in ASBC's main "while" loop, one call to Greedy Independent Set is made with each iteration. With each call to Greedy Independent Set, an orthogonally independent set of vertices, V̂, is formed. This vertex set will be colored with the next appropriate color, and these vertices and their associated edges deleted from G_bi. Notice that exactly |Γ(V̂)| edges will be deleted, since no two vertices in V̂ can have edges to the same neighboring vertex.

Since, for each vertex in V_i, Greedy Independent Set will either assign the vertex to the current orthogonal set V̂ or delete it because it is a neighbor of a neighbor of a vertex

(Plot: the exponents 1 - 1/α and 1/2 + 1/(2α) as functions of α, for α from 1 to 5.)

Figure 3.4: Minimizing Approximation Ratio Exponent

already in V̂, we have that |Γ(Γ(V̂))| = |V_i|. Further, since ∆ is the maximum degree of any vertex in G_b, |Γ(V̂)| ≥ |V_i|/∆. Notice that when ASBC selects the bipartition from which to form a coloring set, the larger partition is chosen when the size of either partition is greater than ∆². Since 0-degree vertices are eliminated prior to partition selection, |Γ(V_i)| must be > ∆ in this case. Therefore,

|Γ(V_i)| ≥ max{ |V_b1|/∆, |V_b2|/∆, ∆ }

Noting that for any a, b and k, max{ a/k, b/k, k } ≥ √(max{ a, b }), and that max{ a, b } ≥ (a + b)/2:

|Γ(V_i)| ≥ max{ |V_b1|/∆, |V_b2|/∆, ∆ } ≥ √(max{ |V_b1|, |V_b2| }) ≥ √( (|V_b1| + |V_b2|) / 2 )

Therefore, ASBC eliminates a minimum of √( (|V_b1| + |V_b2|) / 2 ) edges with each coloring pass.

3.3.2 Limit on Colors Used

Let V^k_b1 and V^k_b2 be the vertices remaining in V_b1 and V_b2 respectively after coloring iteration k. For a given coloring pass, then, ASBC will delete at least √( (|V^k_b1| + |V^k_b2|) / 2 ) edges. But this provides no minimum on the number of edges deleted per pass, as |V^k_b1| and |V^k_b2| represent decreasing values.

Let us consider ASBC execution in two phases. For some α, when |V^k_b1| + |V^k_b2| ≥ 2n^(1 - 1/α), we will say that V^k_b1 and V^k_b2 are sufficiently large, and note that at least √(n^(1 - 1/α)) = n^(1/2 - 1/(2α)) edges will be removed during each coloring pass. Since there are |E| edges, this phase of ASBC will use at most |E| / n^(1/2 - 1/(2α)) colors.

At the end of that first phase, |V^k_b1| + |V^k_b2| will have been reduced to < 2n^(1 - 1/α). At this point, using the worst case of one additional color per remaining vertex, ASBC will use < 2n^(1 - 1/α) colors to finish. Therefore, the maximum total colors needed by ASBC is limited by:

f(G_b) = |E| / n^(1/2 - 1/(2α)) + 2n^(1 - 1/α)

3.3.3 Approximation Ratio

Since χ_sb ≥ 1 for all graphs with at least one edge, and from (3.1) above we have that χ_sb ≥ |E|/n, the approximation ratio for ASBC can be given as:

r_asbc(G) ≤ ( |E| / n^(1/2 - 1/(2α)) + 2n^(1 - 1/α) ) / ( |E| / n )

         ≤ n^(1/2 + 1/(2α)) + 2n^(1 - 1/α)

The exponent on the right-hand side of this equation is minimized by setting α = 3, as can be seen in Figure 3.4. This results in

n^(1/2 + 1/(2α)) + 2n^(1 - 1/α) = n^(1/2 + 1/6) + 2n^(1 - 1/3) = 3n^(2/3), and:

r_asbc(G) ≤ 3n^(2/3) = O(n^(2/3))
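The choice α = 3 can be checked numerically: the two exponents 1/2 + 1/(2α) and 1 - 1/α cross there, and the larger of the two is minimized at the crossing. This short sweep reproduces the trade-off plotted in Figure 3.4.

```python
# Sweep alpha over (1, 5] and track the dominant exponent of the ratio bound.
alphas = [1 + 0.01 * k for k in range(1, 401)]
dominant = {a: max(0.5 + 1 / (2 * a), 1 - 1 / a) for a in alphas}
best = min(dominant, key=dominant.get)
assert abs(best - 3.0) < 0.02              # minimized at alpha = 3
assert abs(dominant[best] - 2 / 3) < 0.01  # yielding exponent 2/3
```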

3.4 ASBC Comparative Empirical Results

For testing and comparison purposes, we implemented three separate algorithms – ASBC and two selections from seminal papers on acyclic Jacobian bicoloring. The first was a “direct” version of Coleman and Verma’s Minimum Non-zero Count Ordering

(MNCO) algorithm [3] using Incidence Degree Ordering [14], and the other was the Complete Direct Cover (CDC) algorithm by Hossain and Steihaug [1]. All three algorithms were implemented using the GNU C compiler running under the Fedora version 13 operating system. Our results for the pre-existing algorithms generally confirmed their reported performance.

For coloring testing we selected a set of well-known test matrices (from the Harwell-Boeing collection [11]) which were used in either [3] or [1] or both. In general ASBC performs similarly to the other programs and, while not a constant-ratio approximation algorithm, produced results on our test data which are within a small constant ratio of optimal. ASBC also produced very efficient run times, with a maximum of 0.154s on a matrix with 12,444 non-zeroes. CDC had a maximum run time of 0.965s (11,500 non-zeroes in that specific matrix), with MNCO running for 151.0s in its worst case (a matrix with

11,150 non-zeroes). These run-time tests were conducted on a 2.4GHz Intel Core-2 Duo processor (running Fedora 13). Thus, ASBC achieves the goal of an efficient algorithm producing coloring results on-par with the best known methods while also providing an approximation guarantee.

These results are summarized in Table 3.1.

Table 3.1: Comparative Coloring Performance [2]

matrix      m     n     nnz     lwr bnd  ASBC  MNCO direct  CDC
abb 313    313   176    1,557      5      17       12        12
arc 130    130   130    1,282     10      26       23        26
ash 219    219    85      438      2       8       10         5
ash 292    292   292    2,208      8      20       15        15
ash 331    331   104      662      2      10        7         6
ash 608    608   188    1,216      2      10       11         7
ash 958    958   292    1,916      2      11        7         7
bp 0       822   822    3,276      4      16       15        14
bp 200     822   822    3,802      5      18       17        18
bp 400     822   822    4,028      5      20       19        18
bp 600     822   822    4,172      6      20       18        18
bp 800     822   822    4,534      6      22       21        21
bp 1000    822   822    4,661      6      22       21        22
bp 1200    822   822    4,726      6      22       22        22
bp 1400    822   822    4,790      6      23       22        22
bp 1600    822   822    4,841      6      22       22        21
curtis 54   54    54      291      6      12       13        11
eris 1176 1176  1176   18,552     16      92       88        92
fs 541 1   541   541    4,285      8      18       14        14
fs 541 2   541   541    4,285      8      18       14        14
ibm32       32    32      126      4       9        9         8
lund a     147   147    2,449     17      26       25        26
lund b     147   147    2,441     17      26       22        26
shl 0      663   663    1,687      3       6        5         4
shl 200    663   663    1,726      3       7        5         4
shl 400    663   663    1,712      3       6        5         4
str 0      363   363    2,454      7      25       27        24
str 200    363   363    3,068      9      31       28        31
str 400    363   363    3,157      9      36       30        36
str 600    363   363    3,279     10      35       30        35
will 57     57    57      281      5      10        9         9
will199    199   199      701      4       9        6         8

4 A Family of Greedy Star Bicoloring Algorithms

In [2] the algorithm Approximate Star BiColoring (ASBC) was developed, analyzed and shown to be the first documented example of an approximation algorithm (a heuristic with an approximation guarantee) for star bicoloring. Since ASBC and other heuristic star bicoloring algorithms share a fundamentally greedy nature, natural questions arise about whether the approximation analysis techniques developed in [2] may be applied to a more general class of algorithms. Corresponding to phase one of our research proposal, the purpose of this section is to expand the set of heuristics for star bicoloring under consideration and document their coloring performance. We provide both a general framework as well as specific related example methods which utilize the Greedy Independent Set approach. Also, specific heuristics which are not members of the described algorithmic family are included for comparison. We describe a C++ class built to support the GIS model and support the implementation of the various greedy strategies included in our test cases. Interspersed within the algorithmic descriptions, at points where the discussion would be most relevant, we also present specific algorithmic “counter examples.” These are specific sparsity patterns which cause notably poor performance for one or more of the GIS model algorithms. This chapter concludes with a summary of our expanded empirical testing results.

4.1 A General Star Bicoloring Framework

In considering a broad class of algorithms from the literature, including ASBC and

CDC, we observe that they share an overall greedy framework, which is summarized in Figure 4.1. In comparing specific algorithms with the general greedy framework, we see that many implementational differences are based around the selection criteria described by

c = 1
while num edges > 0
    1.) select bipartition V1 or V2 from which to form the next
        distance-2 independent set, I
    2.) select a seed vertex, v1, from within the chosen bipartition,
        I = I + v1
    3.) select and add an additional vertex, vi, to the distance-2
        independent set (I = I + vi) until a maximal result is obtained
    4.) color each vertex in I by color c
    5.) c = c + 1
    6.) delete I and all adjacent edges from G
    7.) color all 0-degree vertices with ⊥ and delete from G

Figure 4.1: A General Framework for Star Bicoloring

steps one through three in the pseudo-code (see Figure 4.1). For instance, the difference between the ASBC [2] and CDC [1] algorithms is solely in step one – determining from which bipartition the next independent set will be selected. CDC selects the bipartition

containing the remaining vertex of greatest degree, while ASBC chooses the bipartition with the highest ratio of number of remaining vertices to maximal vertex degree during the first phase of its execution. Returning to the general case, given a selected bipartition, the independent set must be built starting from somewhere. The initial, or "seed," vertex might be that of greatest or least connective degree, greatest or least saturation degree, or even selected by some other non-greedy metric (by occurrence or random). Even with the bipartition and seed vertex chosen, the construction of each independent set must proceed in some selective order.

Similar to seed vertex selection, each added vertex at some point must be chosen for

inclusion based on some decision parameter(s). Typically, this is a greedy choice based on some metric of the vertex, and may relate to the original bipartite representation of the

matrix, Gb(M), or alternatively to some derivative matrix, Gb′ , produced as the result of formation, coloring and deletion of prior independent sets. In this dissertation, we will be

focusing on the set of greedy star bicoloring algorithms which implement the same strategy for step three – once a bipartition is explicitly or implicitly selected, the building of the current coloring group is accomplished with the Greedy Independent Set (GIS) function, which successively adds the next vertex of maximal degree that is a valid addition to that coloring group. We will refer to the subset of this group of algorithms

which use some greedy criteria for steps 1 and 2 and rely on GIS for step 3 as the GIS family. The decision criteria used for steps 1 through 3 of Figure 4.1, which make each algorithm unique, are collectively termed Strategy T.
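To make the GIS step concrete, the following is a minimal sketch, under our own assumptions (a dense 0/1 row-wise pattern, a distance-2 independent set of rows only, and no interaction with previously colored groups), of how such a Greedy Independent Set function might be realized. It is an illustration, not the dissertation's implementation:

```cpp
#include <cassert>
#include <vector>

// Sketch of the Greedy Independent Set (GIS) step: build a distance-2
// independent set of rows -- rows that are pairwise structurally
// orthogonal, i.e. share no nonzero column -- by repeatedly adding the
// remaining row of maximal degree that is a valid addition to the group.
std::vector<int> greedyIndependentSetRows(const std::vector<std::vector<int>>& M) {
    int m = (int)M.size(), n = (int)M[0].size();
    std::vector<int> chosen;               // rows placed in the current group
    std::vector<char> blockedCol(n, 0);    // columns covered by chosen rows
    std::vector<char> used(m, 0);
    while (true) {
        int best = -1, bestDeg = 0;
        for (int r = 0; r < m; ++r) {
            if (used[r]) continue;
            bool valid = true;
            int deg = 0;
            for (int c = 0; c < n; ++c)
                if (M[r][c]) { ++deg; if (blockedCol[c]) valid = false; }
            if (valid && deg > bestDeg) { bestDeg = deg; best = r; }
        }
        if (best < 0) break;               // no valid addition remains
        used[best] = 1;
        chosen.push_back(best);
        for (int c = 0; c < n; ++c)
            if (M[best][c]) blockedCol[c] = 1;
    }
    return chosen;
}
```

Ties are broken here by row order; the specific Strategy T methods below differ precisely in how the seed and ordering preferences replace this default.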

4.2 A Class for GIS Algorithms

To facilitate the implementation of a variety of greedy algorithms and ensure commonality for the color group building strategy (step 3 of Strategy T), we developed a C++ class providing essential common elements for a Greedy Independent Set family of algorithms. As a design criterion illustrative of the philosophy of information hiding, the actual representation of the input matrix is embedded in the class, while the functions which perform coloring operations are not. Thus, the purpose of the class is to provide an interface to an input matrix, along with member functions which produce results of direct matrix manipulation, providing a substantial segregation of the graph-theoretic orientation in the driving routines from the matrix considerations in the class. The class, Matrix M, provides the following public member functions:

• load matrix provides a method for loading a sparsity pattern representation of an input matrix M into a private class data structure.

• vertex degrees accepts lists of current row and column colors, and returns the degrees of all vertices in the descendant graph formed by removing all colored rows and columns from input matrix M.

• greedy row compress takes as input a list of current row and column colors, which together describe a descendant graph M′, and a sorted list of rows. The sorted list is assumed to provide an ordering preference, with the first vertex in the list considered the least preferable to add to the GIS, and the last the most preferred. The last element in this sorted row list is considered the seed vertex, and greedy row compress constructs a GIS based on the sorted ordering provided.

• greedy col compress functions analogously to greedy row compress.

• valid star bicoloring analyzes lists of assigned colors for each row and column against the original input matrix M and determines if these assignments constitute a valid star bicoloring.

A complete annotated listing for the Matrix M class is provided in Appendix A.
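Purely to illustrate the shape of such an interface, a hypothetical, simplified sketch follows (our own member names and a dense 0/1 pattern, not the Appendix A code): the pattern is private, and callers query degrees in the descendant graph formed by deleting already-colored rows and columns.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical sketch of a Matrix-M-style interface (information hiding:
// the sparsity pattern is private).  Color 0 denotes "uncolored" here.
class MatrixM {
    std::vector<std::vector<int>> pat;   // private sparsity pattern of M
public:
    void loadMatrix(const std::vector<std::vector<int>>& p) { pat = p; }

    // Degrees of all row and column vertices in the descendant graph
    // obtained by removing colored rows/columns from M.
    std::pair<std::vector<int>, std::vector<int>>
    vertexDegrees(const std::vector<int>& rowColor,
                  const std::vector<int>& colColor) const {
        int m = (int)pat.size(), n = (int)pat[0].size();
        std::vector<int> rdeg(m, 0), cdeg(n, 0);
        for (int r = 0; r < m; ++r)
            for (int c = 0; c < n; ++c)
                if (pat[r][c] && rowColor[r] == 0 && colColor[c] == 0) {
                    ++rdeg[r];
                    ++cdeg[c];
                }
        return {rdeg, cdeg};
    }
};
```

The driving routines then implement Strategy T purely in graph-theoretic terms against this interface.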

4.3 Selected Greedy Strategies, Part One

Our initial set of GIS algorithms includes, of course, ASBC and CDC, which in part use the concept of v∆, a vertex of maximal degree, in their selection criteria. In this section we present selected alternative methods used in our algorithmic comparison. A summary of all methods is presented in Table 4.1. The initial variant we discuss is closely related to an approach used successfully in natural language processing. In that field it has been observed that in certain greedy-choice

circumstances, improved results can be obtained by considering the “second-best” greedy

choice, as opposed to a strictly maximal strategy. We implement this concept in terms of independent set construction by selecting as the GIS seed that vertex with the second-highest degree. This provides an implicit selection of bipartition, as in CDC. As part of the GIS family, the subsequent coloring groups are built with GIS. We term this

approach second degree, and the coloring decision logic is presented in Figure 4.2.

Given bipartite graph Gb = ({Vb1, Vb2}, Eb)

i = 1
G′ = Gb
while Eb ≠ ∅
    sort all vk ∈ {Vb1, Vb2} in non-decreasing order of degree
    vseed = vk−1
    if (vseed ∈ Vb1)
        gis part = 1
    else
        gis part = 2
    V′ = Greedy Independent Set(G′, gis part)
    color all v ∈ V′ with color i
    G′ = G′ \ V′
    i++

Figure 4.2: Coloring Decision Logic: Second Degree Method

As another alternative, it is also natural to consider the impact of the degree of the

entire independent set when selecting a bipartition, and not just the degree of the seed vertex. Reflecting this concept, the maximum neighborhood method computes independent sets for both bipartitions, using as seeds maximal degree vertices from the respective partitions, and then uses the bipartition which would produce the independent set of highest degree (see Figure 4.3). That is, maximum neighborhood selects the GIS for the next coloring group from {V̂b1, V̂b2} such that |Γ(V̂i)| is maximized, where V̂i is the greedy independent set selected from bipartition Vi. Note that this choice is equivalent to selecting the V̂i (of the two being considered) which will eliminate the most edges on the current coloring pass.

Given bipartite graph Gb = ({Vb1, Vb2}, Eb)

i = 1
G′ = Gb
while Eb ≠ ∅
    vseed1 = vk1 ∈ Vb1 of maximum degree
    V1′ = Greedy Independent Set(G′, 1)
    vseed2 = vk2 ∈ Vb2 of maximum degree
    V2′ = Greedy Independent Set(G′, 2)
    if ( |Γ(V1′)| ≥ |Γ(V2′)| )
        color all v ∈ V1′ with color i
        G′ = G′ \ V1′
    else
        color all v ∈ V2′ with color i
        G′ = G′ \ V2′
    i++

Figure 4.3: Coloring Decision Logic: Maximum Neighborhood Method

4.3.1 Locking, a Boundary Condition for Certain GIS Methods

At this point, it is interesting to diverge for a moment to discuss the concept of locking and to describe a simple matrix construct which defeats all of the methods so far described. Simply, a "lock" is a vertex v ∈ Vi which prevents some otherwise desirable group of vertices vcandidates ∈ Vj from being assigned the same color. Figure 4.4 illustrates a condition where a small set of vertices cannot be 2-colored by independent set coloring methods due to the presence of a vertex in each bipartition which "locks" more than one neighbor from a desired coloring group. We can say that the blue vertex v1 "locks" the yellow vertices, preventing them from forming an independent set coloring group (similarly, blue is prevented from being a color group).

Figure 4.4: A Simple "Locked" Star Bicoloring

The locking concept can be used to design matrix constructs which cause certain

Strategy T methods for seed vertex selection to produce sub-optimal coloring results. A "stair matrix" with r rows will be an r × (r² + 1) matrix containing r² + r non-zeros. Each row will include r + 1 non-zero entries arranged such that each row has a non-zero in column 1 and also r additional non-zero elements which are disjoint from non-zeros in any other row. Allowing for the various column permutations (which would not affect coloring results), this matrix can be arranged to visually resemble a stair. An example for r = 6 is given in Figure 4.5.

r = 6

xxxxxxx------------------------------
x------xxxxxx------------------------
x------------xxxxxx------------------
x------------------xxxxxx------------
x------------------------xxxxxx------
x------------------------------xxxxxx

Figure 4.5: A "Stair" Matrix
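The stair construction can be sketched programmatically; the generator below (0-based indexing and naming are ours) follows the definition above:

```cpp
#include <cassert>
#include <vector>

// Build the r x (r^2 + 1) "stair" pattern: every row has a nonzero in
// the shared first column plus r private nonzeros disjoint from all
// other rows, for r^2 + r non-zeros in total.
std::vector<std::vector<int>> stairMatrix(int r) {
    std::vector<std::vector<int>> M(r, std::vector<int>(r * r + 1, 0));
    for (int i = 0; i < r; ++i) {
        M[i][0] = 1;                       // the shared "locking" column
        for (int k = 0; k < r; ++k)
            M[i][1 + i * r + k] = 1;       // this row's r private entries
    }
    return M;
}
```

For r = 20 this yields the 20 × 401 pattern with 420 non-zeros that appears as stair 20 in Table 4.2.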

Consider the coloring of a stair matrix (or a matrix containing a stair matrix as a submatrix) by those coloring methods whose strategy is to maximize the number of edges which can be eliminated with the next color (for example, CDC, ASBC and maximum neighborhood). Notice that column 1 contains vertices that pairwise lock all rows of the matrix, thus no rows can be compressed. Now, columnar compression would be possible, but a maximally compressed column would contain r vertices. Since each row contains r + 1 vertices, a (non-compressible) row will be selected consistently in preference to the single dense column. This will result in a strictly row-wise coloring of the stair matrix – a "worst case" result, since any matrix can be colored using min(|V1|, |V2|) colors. Notice further that the stair matrix is 2-colorable by first coloring the dense first column and then row-wise compressing all remaining vertices into a single second color group.

Noticing that the number of non-zeros in column 1 decreases monotonically as each row is colored, a slight optimization of the stair matrix which achieves the same coloring performance with an input matrix of fewer non-zero values is possible. The drop matrix is similar to the stair with the exception that each subsequent row contains one fewer non-zero element. This results in an r × (r² + r + 2)/2 matrix with (r² + 3r)/2 non-zeros.

r = 8

xxxxxxxxx----------------------------
x--------xxxxxxx---------------------
x---------------xxxxxx---------------
x---------------------xxxxx----------
x--------------------------xxxx------
x------------------------------xxx---
x---------------------------------xx-
x-----------------------------------x

Figure 4.6: A "Drop" Matrix
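The drop construction can be sketched in the same way (again, 0-based indexing and naming are ours): row i carries r − i private nonzeros in addition to the shared first column.

```cpp
#include <cassert>
#include <vector>

// Build the r x (r^2 + r + 2)/2 "drop" pattern with (r^2 + 3r)/2
// non-zeros: like the stair, but each subsequent row has one fewer
// private entry.
std::vector<std::vector<int>> dropMatrix(int r) {
    int cols = (r * r + r + 2) / 2;
    std::vector<std::vector<int>> M(r, std::vector<int>(cols, 0));
    int next = 1;                          // next unused private column
    for (int i = 0; i < r; ++i) {
        M[i][0] = 1;                       // shared "locking" column
        for (int k = 0; k < r - i; ++k)
            M[i][next++] = 1;              // r - i private entries
    }
    return M;
}
```

For r = 20 this yields the 20 × 211 pattern with 230 non-zeros that appears as drop 20 in Table 4.2.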

The remaining strategies presented in this chapter were, in general, selected to

explore different techniques for successfully computing the stair and drop matrices.

4.4 Selected Greedy Strategies, Part Two

The remaining algorithms to be presented were selected in part to address the stair and drop matrix problems, the goal being to achieve similar or better overall coloring performance while successfully computing the optimal answers for those boundary condition matrices. As a reminder, a summary of these methods is presented in Table 4.1.

The first four such algorithms, like ASBC, employ ratio-based decision metrics. The neighborhood ratio method selects the next bipartition to color by computing |Γ(V̂is)| / |Vn|, which is a measure of the density with which the candidate independent set neighborhood, Γ(V̂is), covers the neighbor bipartition Vn. Put another way, this approach maximizes the percentage of possible edges (and therefore, locks) which can be removed

by the next color (see Figure 4.7).

Given bipartite graph Gb = ({Vb1, Vb2}, Eb)

i = 1
G′ = Gb
while Eb′ ≠ ∅
    vseed1 = vk1 ∈ Vb1 of maximal degree
    V1′ = Greedy Independent Set(G′, 1)
    vseed2 = vk2 ∈ Vb2 of maximal degree
    V2′ = Greedy Independent Set(G′, 2)
    if ( (|Γ(V1′)| / |Vb2|) ≥ (|Γ(V2′)| / |Vb1|) )
        color all v ∈ V1′ with color i
        G′ = G′ \ V1′
    else
        color all v ∈ V2′ with color i
        G′ = G′ \ V2′
    i++

Figure 4.7: Coloring Decision Logic: Neighborhood Ratio Method

Table 4.1: "Strategy T" Variations

technique           | bipartition selection                      | seed vertex          | additional vertices | complexity
ASBC                | phase 1: greatest ratio |Vi|/∆;            | max degree           | current GIS         | O(N³)
                    | phase 2: contains max degree v             |                      |                     |
CDC                 | contains max degree v                      | max degree           | current GIS         | O(N³)
second degree       | contains vertex of second-highest degree   | max ∆2               | current GIS         | O(N³)
max neighborhood    | contains GIS of max neighborhood           | max degree v∆        | GIS                 | O(N³)
                    | from both Vb1, Vb2                         |                      |                     |
neighborhood ratio  | contains GIS of max |Γ(V̂)|/|Vn|            | max degree           | current GIS         | O(N³)
inverse ratio       | contains GIS of max |Γ(V̂)|/|Vis|           | max degree           | current GIS         | O(N³)
ratio M'            | contains GIS of max |Γ(V̂)|/|Vn|            | max degree           | current GIS         | O(N³)
                    | in descendant M'                           |                      |                     |
dense ratio         | if ratio > DENSELINE, contains GIS of      | max degree           | current GIS         | O(N³)
                    | max |Γ(V̂)|/|Vn|; else max degree v         |                      |                     |
look ahead          | allows maximal edges eliminated            | max degree           | current GIS         | O(N³)
                    | in next iteration                          |                      |                     |
weighted unlocking  | contains max weighted v                    | max weighted degree  | GIS                 | O(N³)
semi-brute-force    | phase 1: best solution for log(n)          | all                  | all                 | O(N⁴)
                    | color groups;                              |                      |                     |
                    | phase 2: look-ahead                        | max degree           | current GIS         |
semi-brute-force    | phase 1: best solution for log(n)          | all                  | all                 | O(N⁵/log(N))
iterative           | color groups;                              |                      |                     |
                    | phase 2: semi-brute-force                  | all                  | all                 |
weighted random     | containing seed vertex                     | % chance |Γ(vs)|/|E′| | GIS                | O(N³)

Similarly, Figure 4.8 describes the inverse ratio method, which considers the density of the independent set neighborhood relative to the size of the bipartition from which it is generated, |Γ(V̂is)| / |Vis|, or the relative density (in terms of edge elimination) of the vertices being colored.

Given bipartite graph Gb = ({Vb1, Vb2}, Eb)

i = 1
G′ = Gb
while Eb′ ≠ ∅
    vseed1 = vk1 ∈ Vb1 of maximal degree
    V1′ = Greedy Independent Set(G′, 1)
    vseed2 = vk2 ∈ Vb2 of maximal degree
    V2′ = Greedy Independent Set(G′, 2)
    if ( (|Γ(V1′)| / |Vb1|) ≥ (|Γ(V2′)| / |Vb2|) )
        color all v ∈ V1′ with color i
        G′ = G′ \ V1′
    else
        color all v ∈ V2′ with color i
        G′ = G′ \ V2′
    i++

Figure 4.8: Coloring Decision Logic: Inverse Ratio Method

Both of the preceding approaches use the structure of the original input matrix, M, to determine vertex degrees and bipartition size. A natural question would be to consider the

impact of computing the ratio method based on the ordered series of derivative matrices

M′ produced by the successive removal of the colored vertices as each color group is formed. This approach is investigated by the strategy ratio M’ (see Figure 4.9). The last of the ratio-based methods considers that in a sparse matrix, only a relatively

small number of dense rows/columns should exist. To allow for maximal structural orthogonality, independent sets built on these dense lines could be eliminated first, thus providing increased potential for orthogonality in the descendant matrices and greater

possibility of compression in subsequent coloring groups. The dense ratio method (Figure

4.10) applies the neighborhood ratio method whenever the computed density is greater than a threshold, and otherwise selects a bipartition containing a vertex of maximal degree.

Given bipartite graph Gb = ({Vb1, Vb2}, Eb)

i = 1
G′ = Gb
Vb1′ = Vb1
Vb2′ = Vb2
while Eb′ ≠ ∅
    vseed1 = vk1 ∈ Vb1 of maximal degree
    V1′ = Greedy Independent Set(G′, 1)
    vseed2 = vk2 ∈ Vb2 of maximal degree
    V2′ = Greedy Independent Set(G′, 2)
    if ( (|Γ(V1′)| / |Vb2′|) ≥ (|Γ(V2′)| / |Vb1′|) )
        color all v ∈ V1′ with color i
        G′ = G′ \ V1′
        Vb1′ = Vb1′ \ V1′
    else
        color all v ∈ V2′ with color i
        G′ = G′ \ V2′
        Vb2′ = Vb2′ \ V2′
    i++

Figure 4.9: Coloring Decision Logic: Ratio M' Method

Another approach to solving the stair matrix, and potentially other problematic constructions, would be to look ahead some number of coloring iterations in order to avoid simple locking traps (see Figure 4.11). For bipartitions Vi and Vj, this method computes the coloring results for each of the four alternatives of two ordered bipartition selections: [Vi, Vi], [Vi, Vj], [Vj, Vi], and [Vj, Vj]. The initial selection of Vi or Vj is made based upon which one allows for the best possible outcome of the subsequent potential selections. As each "initial" selection is made, the remaining pair of potential outcomes are made the new initial selection alternatives, and their child possibilities are computed so that the coloring selection process can be similarly continued.

Given bipartite graph Gb = ({Vb1, Vb2}, Eb)

i = 1
G′ = Gb
while Eb′ ≠ ∅
    vseed1 = vk1 ∈ Vb1 of maximal degree
    V1′ = Greedy Independent Set(G′, 1)
    vseed2 = vk2 ∈ Vb2 of maximal degree
    V2′ = Greedy Independent Set(G′, 2)
    Vb1 ratio = |Γ(V1′)| / |Vb2|
    Vb2 ratio = |Γ(V2′)| / |Vb1|
    max ratio = max(Vb1 ratio, Vb2 ratio)
    if ( ((max ratio ≥ DENSELINE) and (Vb1 ratio ≥ Vb2 ratio))
         or ((max ratio < DENSELINE) and (deg(vseed1) ≥ deg(vseed2))) )
        color all v ∈ V1′ with color i
        G′ = G′ \ V1′
    else
        color all v ∈ V2′ with color i
        G′ = G′ \ V2′
    i++

Figure 4.10: Coloring Decision Logic: Dense Ratio Method

As a final GIS family alternative algorithm considered here, the problem of locking is further explored by establishing a preference for coloring vertices which eliminate locks, thus providing greater structural orthogonality within the descendant matrices. Weighted unlocking (Figure 4.12) uses as its seed vertex decision metric the combination of vertex

degree with a measure of the number of locks which can be eliminated by choosing a

Given bipartite graph Gb = ({Vb1, Vb2}, Eb)

k = 1
G′ = Gb
while Eb′ ≠ ∅
    Vi′ = Greedy Independent Set(G′, 1)
    G′ = G′ \ Vi′
    Vii′ = Greedy Independent Set(G′, 1)
    Vij′ = Greedy Independent Set(G′, 2)
    G′ = G′ + Vi′
    Vj′ = Greedy Independent Set(G′, 2)
    G′ = G′ \ Vj′
    Vji′ = Greedy Independent Set(G′, 1)
    Vjj′ = Greedy Independent Set(G′, 2)
    G′ = G′ + Vj′
    Vb1 max = max(|Vii′|, |Vij′|)
    Vb2 max = max(|Vji′|, |Vjj′|)
    if ( Vb1 max > Vb2 max )
        color all v ∈ Vi′ with color k
        G′ = G′ \ Vi′
    else if ( Vb1 max < Vb2 max )
        color all v ∈ Vj′ with color k
        G′ = G′ \ Vj′
    else  /* Vb1 max == Vb2 max */
        if ( |Vi′| ≥ |Vj′| )
            color all v ∈ Vi′ with color k
            G′ = G′ \ Vi′
        else
            color all v ∈ Vj′ with color k
            G′ = G′ \ Vj′
    k++

Figure 4.11: Coloring Decision Logic: Look Ahead Method

given vertex as the seed. In this case, the vertex weight is the degree of the vertex plus the

number of neighbors of the given vertex, vn, with d(vn) > 1.

Given bipartite graph Gb = ({Vb1, Vb2}, Eb)

i = 1
G′ = Gb
while Eb′ ≠ ∅
    for each v ∈ {Vb1′, Vb2′}
        ψ(v) = deg(v)
        for each w ∈ Γ(v)
            if (deg(w) > 1) ψ(v)++
    vseed = v ∈ {Vb1′, Vb2′} of maximal ψ(v)
    if (vseed ∈ Vb1′)
        V1′ = Greedy Independent Set(G′, 1)
        color all v ∈ V1′ with color i
        G′ = G′ \ V1′
    else
        V2′ = Greedy Independent Set(G′, 2)
        color all v ∈ V2′ with color i
        G′ = G′ \ V2′
    i++

Figure 4.12: Coloring Decision Logic: Weighted Unlocking Method
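The weight ψ(v) used above is straightforward to state in code; the following is a sketch for a row vertex of a 0/1 pattern (representation and naming are ours):

```cpp
#include <cassert>
#include <vector>

// Weighted-unlocking seed metric for row r:
//   psi(v) = deg(v) + |{ w in Gamma(v) : deg(w) > 1 }|,
// i.e. the row's degree plus the number of its neighboring columns that
// currently "lock" other rows.
int psiRow(const std::vector<std::vector<int>>& M, int r) {
    int n = (int)M[0].size(), psi = 0;
    for (int c = 0; c < n; ++c) {
        if (!M[r][c]) continue;
        ++psi;                             // deg(v) contribution
        int cdeg = 0;
        for (std::size_t i = 0; i < M.size(); ++i) cdeg += M[i][c];
        if (cdeg > 1) ++psi;               // neighbor w with deg(w) > 1
    }
    return psi;
}
```

A column-vertex version is symmetric; the seed is then the vertex of maximal ψ over both bipartitions.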

4.4.1 A Limitation to the Neighborhood Ratio Method

Intuitively, the ratio methods seem to have an appealing approach. The neighborhood

ratio method, in particular, selects the next coloring group from the bipartition which produces the independent set "covering" the highest percentage of the neighborhood partition, that is, max{ |Γ(V̂)| / |Vn| }. This would seem to remove a high number of potential locks from the neighborhood bipartition, resulting in improved structural orthogonality and thus improved descendant coloring potential. However, the ratio method which performed the best was the hybrid dense ratio approach, which applies the ratio decision logic in a limited fashion. Why would these ratio-based decision criteria perform so poorly in practice? The following "counter-example" construction is intended to illustrate one situation that is handled poorly by the neighborhood ratio method. The reader will note that this example presents a rather dense matrix construct, while the matrices of interest in this work are rather sparse. In consideration of this, note that the coloring process "compresses" rows and columns together, and that following this compression, the resultant color groups form rows or columns whose density is maximized (combining a maximal number of rows/columns to form each color). Therefore, this example can best be considered as modeling a group of "post-compression" row or column groups.

With that observation in mind, consider a matrix constructed as follows:

• select n, representing the number of rows, where n is an odd integer ≥ 5.

• set m = φ(n + 1), where φ is an integer ≥ ⌊n/2⌋.

• for each n by (n + 1) block of this matrix, the two columns in each row closest to the diagonal are given value 0, while the rest of the entries are non-zero.

--xxxx--xxxx
x--xxxx--xxx
xx--xxxx--xx
xxx--xxxx--x
xxxx--xxxx--

Figure 4.13: Ratio Method Counter-example with n = 5, φ = 2, Matrix Representation

To see how this matrix would be colored by the neighborhood ratio method, note first that all rows will have a non-zero density of d_row = (n − 1)/(n + 1), while the columns will each have a density of either d_col1 = (n − 1)/n or d_col2 = (n − 2)/n. Specifically, the first and last columns of each block will have a density greater than the density of the rows. These columns (in fact, all columns) are non-compressible, and will thus force their selection for coloring individually, consuming 2φ colors. Since φ ≥ ⌊n/2⌋, then n − 2φ ≤ 1, and at least n − 1 colors have been used. After the coloring of the first 2φ vertices, the density of the remaining columns will be (n − 2)/n > .5, so no columns are compressible. Thus > 1 colors would be needed to complete a coloring if a column is colored next. Some rows, however, are now compressible. The minimum density of the rows is now (n − 3)/(n + 1), which for n ≥ 5 is ≥ 1/3, so if a row group is formed next, at most 3 < n rows can be included in it. In either event, > 1 colors are needed to complete the coloring, resulting in > n total colors being used. This illustrates that the neighborhood ratio method can result in using more colors than the trivial row-wise matrix coloring using n colors.
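The construction and the densities this argument relies on can be checked mechanically; below is a sketch (the block layout and naming are ours, zeroing the two near-diagonal entries of each block row):

```cpp
#include <cassert>
#include <vector>

// Build the counter-example: phi horizontal blocks of size n x (n+1),
// each all-ones except that block row i has zeros at positions i and
// i+1 (the two entries nearest the block diagonal).
std::vector<std::vector<int>> ratioCounterExample(int n, int phi) {
    std::vector<std::vector<int>> M(n, std::vector<int>(phi * (n + 1), 1));
    for (int b = 0; b < phi; ++b)
        for (int i = 0; i < n; ++i) {
            M[i][b * (n + 1) + i] = 0;       // two entries nearest the
            M[i][b * (n + 1) + i + 1] = 0;   // block diagonal are zero
        }
    return M;
}
```

For n = 5, φ = 2, every row has 8 of 12 entries nonzero (d_row = (n−1)/(n+1)), the first and last column of each block have 4 of 5 (d_col1 = (n−1)/n), and the interior columns have 3 of 5 (d_col2 = (n−2)/n), matching the figure above.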

4.5 Non-greedy Strategies

While exact brute force solutions to the general star bicoloring problem are computationally infeasible, bounded methods can be applied to produce results which are likely of high quality. Two such methods, which do not adhere to the GIS family, are included in this evaluation for comparative purposes. Given a bipartite incidence structure of N vertices, semi-brute-force (SBF) initially computes the coloring solution for log(N) colors which eliminates the most edges, corresponding to an optimal solution for the graph subtended by the set of edges eliminated. SBF completes by using one of the GIS family algorithms to greedily compute the coloring for the remaining descendant graph. Semi-brute-force iterative (SBFI) extends this concept by iteratively applying the log(N) coloring scheme to the series of best derivative graphs selected every log(N) colors. Note

that the last log(N) coloring phase produces an optimal solution on the final remaining log(N)-coloring descendant graph.

Figure 4.14: Ratio Method Counter-example with n = 5, φ = 2, Graph Representation

Lastly, again for comparison purposes, a randomized method for seed vertex

selection is included. Given the nature of the unlocking process, and the demonstration that certain simple constructs can mislead straight-forward greedy criteria (see Section 4.3.1), an algorithm using some form of randomized selection would have a chance of avoiding any such suboptimizing decision. However, the problem with a strictly random selection of the vertices is obvious, as the generated solution would correspond to selecting one permutation on the vertices of the bipartite incidence structure. The chances of selecting a near-optimal solution with this approach would be very low. To improve the general performance of a random approach while retaining the potential

benefit of avoiding graph locks, a weighted random method is used. For each color group, a given vertex vs in the derivative graph G′ may be chosen as the seed vertex with a

probability of |Γ(vs)| / |E′|. In our comparative algorithm, after weighted random seed vertex selection, the color groups are completed using GIS.
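Degree-proportional seed selection of this kind maps directly onto std::discrete_distribution, which draws index i with probability proportional to the i-th weight; a sketch (function name is ours):

```cpp
#include <cassert>
#include <random>
#include <vector>

// Weighted random seed selection: vertex v_s is drawn with probability
// |Gamma(v_s)| / |E'|, i.e. proportionally to its degree in the current
// descendant graph.  Zero-degree vertices are never selected.
int pickWeightedSeed(const std::vector<int>& degrees, std::mt19937& rng) {
    std::discrete_distribution<int> dist(degrees.begin(), degrees.end());
    return dist(rng);
}
```

Since the degrees sum to |E′| for one bipartition of the descendant graph, normalizing the weights reproduces exactly the stated probability.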

4.6 Empirical Results

The smtape collection of matrices from NIST’s Matrix Market was selected as a base set for empirical comparison. This collection has been used in multiple relevant

publications [3][1][2] and includes a variety of sparsity patterns. NIST offers the following description of the set [11]:

A collection of sparse matrices was begun by Curtis and Reid at Harwell in the early 1970’s and was later extended by Duff into the present collection. These 36 matrices and matrix patterns come from a wide range of disciplines.

A major objective of the test collection has been to represent important features of practical problems. Sparse matrix characteristics (such as average density of entries per row, pattern of the entries, symmetry, and matrix size) can differ among matrices arising from different disciplines. The test

problems, though varying widely in their characteristics, have very distinctive patterns.

This set of matrices was enhanced with inclusion of examples of the stair and drop matrices described in section 4.3.1. Descriptions of the full list of test matrices used in this phase of our research are provided in Table 4.2. A summary of coloring performance for our test algorithms on this matrix set is presented in Tables 4.3 and 4.4, with some commentary provided below. The labels for these discussion paragraphs correspond to the column labels in those tables.

Table 4.2: Full List of GIS Evaluation Test Matrices

Matrix Name   Rows   Columns   Non-zeroes
abb 313       313    176       1,557
arc 130       130    130       1,282
ash 219       219    85        438
ash 292       292    292       2,208
ash 331       331    104       662
ash 608       608    188       1,216
ash 958       958    292       1,916
bp 0          822    822       3,276
bp 200        822    822       3,802
bp 400        822    822       4,028
bp 600        822    822       4,172
bp 800        822    822       4,534
bp 1000       822    822       4,661
bp 1200       822    822       4,726
bp 1400       822    822       4,790
bp 1600       822    822       4,841
curtis 54     54     54        291
drop 20       20     211       230
drop 100      100    5,051     5,150
eris 1176     1176   1176      18,552
fs 541 1      541    541       4,285
fs 541 2      541    541       4,285
ibm32         32     32        126
lund a        147    147       2,449
lund b        147    147       2,441
shl 0         663    663       1,687
shl 200       663    663       1,726
shl 400       663    663       1,712
stair 20      20     401       420
stair 100     100    10,001    10,100
str 0         363    363       2,454
str 200       363    363       3,068
str 400       363    363       3,157
str 600       363    363       3,279
will 57       57     57        281
will199       199    199       701

ASBC: The benchmark ASBC algorithm. First demonstrated approximation

algorithm. CDC: The benchmark CDC algorithm. Best previously observed coloring performance for a GIS family method. Does not successfully compute the drop and stair matrices.

Sec Deg: The third column presents the results for the second degree method inspired by natural language processing. The number of colors required is often tied for the best observed, but markedly worse in certain specific cases. These results may suggest that, in addition to the drop and stair matrices, second degree is also susceptible to some other basic locking construct.

Max Hood: Maximum neighborhood provides consistently good results, being in general between ASBC and CDC, but also fails to unlock the drop and stair matrices. Ratio: The ratio method is the first algorithm we considered which was specifically intended to address the stair matrix lock. While generally not as efficient as ASBC and

CDC, it does achieve optimal coloring for the stair and drop matrices. Inv Ratio: Interestingly, the inverse ratio approach generally does better than ratio, but fails the stair lock matrices. M' Ratio: When applied to the ordered derivative matrices generated by the coloring

process, M' ratio also solves the stair lock, but does not perform as well overall as the basic ratio method. Den Ratio: The dense ratio algorithm performs quite well, with general coloring results approaching CDC while correctly solving the stair problem. L-A: The look ahead program performed essentially equivalently to dense ratio, and while it did solve the stair and drop test matrices, it did so in a very specific manner which may not be generally applicable to the stair locking pattern.

W-U: The weighted unlocking approach seems to hold good potential. It performed

the best of any of the tested GIS family methods, achieving equivalent general coloring results with CDC while also solving the stair and drop test matrices. A possible further area of inquiry outside the scope of this work would be to test different values of the locking weight, and how those values might correspond to certain sparsity pattern

structures. Coloring results for the following non-GIS-family algorithms appear in Table 4.4. SBF: As anticipated, the semi-brute-force method computed consistently good results. However, it was not quite as good as weighted unlocking, and has a higher computational complexity than the GIS family algorithms.

SBF-I: The semi-brute-force iterative approach achieved the best overall coloring performance and correctly solved the drop and stair matrices, at the expense of a slightly higher, but still low-order polynomial, run-time computational complexity. Weighted Random: The weighted random algorithm provides an interesting comparison, and illustrates the effectiveness of this approach in avoiding the stair and drop locking constructs. Results are presented for average and best-case coloring performance for 20 and 100 iterations.

Table 4.3: Benchmark Coloring Results - Greedy Methods

Matrix      ASBC  CDC  Sec Deg  Max Hood  Ratio  Inv Ratio  M' Ratio  Den Ratio  L-A  W-U
abb 313     17    12   12       12        26     12         17        14         12   12
arc 130     26    26   132      26        26     26         27        26         26   26
ash 219     8     5    5        5         9      5          7         8          5    5
ash 292     20    15   15       15        15     15         19        15         15   15
ash 331     10    6    6        6         12     6          9         9          6    6
ash 608     10    7    7        7         12     7          8         9          7    6
ash 958     11    7    7        7         13     7          9         9          7    7
bp 0        16    14   16       16        16     16         17        14         15   15
bp 200      19    18   19       17        17     17         18        18         18   18
bp 400      20    18   304      20        20     20         18        18         18   18
bp 600      20    18   21       21        21     21         18        18         18   19
bp 800      22    21   21       23        23     23         21        21         23   21
bp 1000     22    22   22       23        23     23         23        22         23   22
bp 1200     21    22   23       23        23     23         22        22         21   21
bp 1400     23    22   22       26        26     26         24        22         23   22
bp 1600     22    21   22       28        28     28         24        21         22   21
curtis 54   12    11   13       16        16     16         12        11         10   10
drop20      20    20   20       20        2      20         2         2          2    2
drop100     100   100  100      100       2      100        2         2          2    2
eris1176    92    92   105      93        93     93         99        92         93   92
fs541 1     18    14   14       14        14     14         18        14         14   14
fs541 2     18    14   14       14        14     14         18        14         14   14
gent 113    18    18   18       18        18     18         17        18         17   18
ibm 32      9     8    8        8         8      8          10        8          8    8
lund a      26    26   25       26        26     26         40        26         26   26
lund b      26    26   25       26        26     26         39        26         26   26
shl 0       6     4    424      4         4      4          5         4          4    5
shl 200     7     4    4        4         4      4          5         4          4    4
shl 400     6     4    5        4         4      4          5         4          4    4
stair20     20    20   20       20        2      20         2         2          2    2
stair100    100   100  100      100       2      100        2         2          2    2
str 0       25    24   39       26        26     26         25        24         25   24
str 200     30    31   30       31        31     31         33        31         32   30
str 400     35    36   41       36        36     36         36        35         37   35
str 600     36    35   41       37        37     37         38        35         36   36
will 57     11    9    11       11        11     11         10        9          10   10
will 199    9     8    9        9         9      9          8         8          9    8
w/o stair   671   618  1480     652       687    652        699       629        628  618
w/ stair    911   858  1720     892       695    892        707       637        636  626

Table 4.4: Benchmark Coloring Results - "Non-greedy" Methods

                             Weighted Random
                      20 Iterations   100 Iterations
Matrix      SBF  SBF-I  Avg   Best     Avg   Best
abb 313     12   12     19    15       18    11
arc 130     26   26     43    35       46    34
ash 219     5    5      7     5        7     5
ash 292     15   15     22    18       21    18
ash 331     6    6      9     7        9     6
ash 608     6    6      8     6        9     6
ash 958     6    6      9     7        9     7
bp 0        15   15     20    17       20    15
bp 200      18   17     23    21       23    18
bp 400      18   18     23    19       24    19
bp 600      18   17     25    20       24    19
bp 800      21   21     26    22       26    21
bp 1000     23   22     26    22       26    22
bp 1200     21   20     26    23       26    22
bp 1400     23   22     27    24       27    22
bp 1600     22   20     28    24       28    23
curtis 54   10   10     15    11       15    11
drop 20     2    2      15    4        14    2
drop 100    2    2      47    6        53    3
eris 1176   93   93     158   145      157   135
fs541 1     17   17     25    19       24    18
fs541 2     17   17     25    20       24    17
gent 113    17   17     23    20       24    19
ibm 32      8    8      10    8        11    8
lund a      26   26     43    37       43    36
lund b      26   26     42    37       43    34
shl 0       4    4      7     4        7     4
shl 200     4    4      8     4        8     4
shl 400     4    4      8     5        8     4
stair 20    2    2      23    3        22    2
stair 100   2    2      101   22       115   4
str 0       25   25     35    29       36    28
str 200     30   30     44    38       44    36
str 400     33   33     50    42       49    42
str 600     36   36     50    45       50    43
will 57     10   9      13    11       13    11
will 199    8    8      10    8        10    8
tot w/o stair  623  615  907   768     909   726
tot w/ stair   631  623  1093  803     1113  737

5 Expanded Analyses

In this chapter we present a summary of our analytical findings related to the GIS family of algorithms described in chapter 4, building from the initial approximation analysis presented in [2]. We begin with a few background and preliminary items. We first establish, in section 5.1, that methods adhering to our general framework do, indeed, correctly solve the problem of Star Bicoloring. Then, we investigate how particular matrices can be used to establish a tighter lower bound on the approximation ratio for a subset of the GIS methods in section 5.2. We describe this subset as Maximum Degree Greedy Independent Set (MDGIS) methods, as they are the group of GIS methods which always include v_∆ as a member in independent set coloring groups. Next, in section 5.3, we present some straightforward but key results which establish simple bounds for |Γ(V̂)|, the cardinality of the neighborhood of a given distance-2 independent set. These bounds will prove valuable for the subsequent approximation analyses.

The focus of the chapter is presented in two parts, sections 5.4 through 5.7 and also sections 5.9 through 5.14, which expand and extend the analysis techniques used in [2]. These analyses provide approximability results for several new GIS-family Star Bicoloring methods, as well as for Hossain and Steihaug's Complete Direct Cover (CDC) algorithm [1]. Included within those discussions, at a convenient point for the subject, is an exploration in section 5.8 of the impact of matrix squareness on these analysis techniques. The findings in that section seem to indicate a common denominator in all the coloring methods studied and suggest a potentially interesting path for future investigation.

Throughout this chapter, we will make use of the following definitions, symbols and stipulations: Let G_b = ({V_b1, V_b2}, E_b) be a connected bipartite graph. Further, let n be the cardinality of V_b1, and m the cardinality of V_b2.

Let V̂ = GIS(G_b, α) be the resulting independent set of vertices produced by Greedy Independent Set on the input graph G_b, where α is either 1 or 2 and corresponds to the bipartition from which the distance-2 independent set of vertices is selected (either V_b1 or V_b2). Further, let V_is refer to the bipartition producing the current independent set, and V_n be the bipartition containing its neighborhood.

Let Γ(V̂) refer to the neighborhood of the distance-2 independent set, that is, the subset of vertices in V_n which are connected to a vertex in V̂. Further, let |Γ(V̂)| be the cardinality of the neighborhood.

With ∆ being the degree of a maximal degree vertex, let v_∆ be a vertex of maximal degree in graph G_b, while v̂_∆ is a vertex of maximal degree such that v̂_∆ ∈ V̂.

5.1 Correctness of General Framework

The algorithmic framework presented in section 4.1 (summarized in Figure 4.1) describes a related subset of methods for star bicoloring. Before presenting our current analyses of these approaches, we provide a proof that those algorithms which employ this model (refer to Figure 4.1) produce correct star bicoloring results, consistent with Definition 1 (see page 41).

In summary, given a bipartite graph G_b = ({V_b1, V_b2}, E_b), where c(v) is the color of vertex v, a valid star bicoloring must satisfy the conditions:

• it must be a proper coloring, i.e. c(v_i) ≠ c(v_j) if (v_i, v_j) ∈ E_b;

• the set of colors of v_i ∈ V_b1 is disjoint from the set of colors of v_j ∈ V_b2, i.e. (∪{c(v_i) | v_i ∈ V_b1}) ∩ (∪{c(v_j) | v_j ∈ V_b2}) ⊆ {⊥};

• any two vertices connected to a vertex v_k where c(v_k) = ⊥ have different colors;

• every path with edge length ≥ 3 uses at least 3 colors.
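These four conditions translate directly into a mechanical check. The sketch below (a hypothetical helper, not part of the dissertation's framework) validates a candidate coloring of a small bipartite graph, with the integer 0 standing in for the neutral color ⊥:

```python
def is_star_bicoloring(nb1, nb2, edges, color):
    """Check the four conditions above for the bipartite graph with
    V_b1 = {0..nb1-1}, V_b2 = {nb1..nb1+nb2-1}; color[v] == 0 plays the
    role of the neutral color (bottom), other colors are positive ints."""
    V1, V2 = set(range(nb1)), set(range(nb1, nb1 + nb2))
    adj = {v: set() for v in V1 | V2}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # condition 1: proper coloring
    if any(color[u] == color[v] for u, v in edges):
        return False
    # condition 2: the two sides share at most the neutral color
    if ({color[v] for v in V1} & {color[v] for v in V2}) - {0}:
        return False
    # condition 3: neighbors of a neutral vertex have pairwise distinct colors
    for v in adj:
        if color[v] == 0:
            cols = [color[u] for u in adj[v]]
            if len(cols) != len(set(cols)):
                return False
    # condition 4: every path with three edges sees at least three colors
    for a in adj:
        for b in adj[a]:
            for c in adj[b] - {a}:
                for d in adj[c] - {b}:
                    if len({color[a], color[b], color[c], color[d]}) < 3:
                        return False
    return True

# path on 4 vertices 0-2-1-3, with 0, 1 in V_b1 and 2, 3 in V_b2
edges = [(0, 2), (2, 1), (1, 3)]
assert is_star_bicoloring(2, 2, edges, {0: 1, 1: 2, 2: 0, 3: 0})      # valid
assert not is_star_bicoloring(2, 2, edges, {0: 1, 1: 1, 2: 2, 3: 2})  # path sees 2 colors
```

For the four-vertex path, giving the two V_b1 vertices distinct colors and leaving both V_b2 vertices neutral satisfies all four conditions, while one shared color per side violates the length-3 path condition.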

Claim 1. Algorithms following the framework from section 4.1 are correct algorithms for star bicoloring.

Proof. The proof is provided in the following four parts.

Proper Coloring: The process of building orthogonally independent sets requires that members of a given set be from the same side of the bipartite graph and, by the definition of a bipartite graph, no edges exist between members of the same bipartition. Since the framework requires that each color is used only once, only members of the same independent set, and therefore a subset of members of the same bipartition, receive a given color. Therefore, no edge (v_i, v_j) can exist where c(v_i) = c(v_j).

Disjoint Color Sets: Since in the model framework no color is used more than once, and each color is only applied to an orthogonally independent set which is wholly within a single bipartition, no colors are shared between the bipartitions.

No Common Colors Connected to Bottom: For a vertex v_⊥ to receive the color ⊥, all edges adjacent to v_⊥ must have been eliminated prior to coloring v_⊥. This requires that all neighbor vertices of v_⊥ are colored prior to v_⊥. Prior to its coloring, v_⊥ provides a distance-2 path between any pair of its neighbors, which prevents such a pair from being in the same distance-2 independent set, and therefore prevents any pair of neighbors of v_⊥ from receiving the same color.

Edge Length Three Paths Use at Least Three Colors: Since we have the property of disjoint color sets, any path of edge length 1 or greater uses at least 2 colors. Any edge length 3 path consists of 3 edges and 4 vertices, specifically, 2 vertex pairs, one pair from each bipartition. So, let v_i1, v_i2 ∈ V_b1, and also let v_j1, v_j2 ∈ V_b2. For a length 3 path to use only 2 colors, necessarily both vertex pairs would have to use one color each, meaning that in both cases the two vertices within a given bipartition would have the same color. That is, c(v_i1) = c(v_i2) = c_b1 and c(v_j1) = c(v_j2) = c_b2. Assume this is the case. Now observe that for vertices to receive the same color, they must be colored at the same time, which requires that they be members of the same distance-2 independent set. However, in constructing a path of edge length three, it is necessarily true that either v_j1 or v_j2 must be a neighbor of both v_i1 and v_i2, so v_i1 and v_i2 cannot receive the same color until after that vertex, v_jx, is colored. Since v_j1 and v_j2 are receiving the same color, they also must be colored at the same time, so both v_j1 and v_j2 must be colored before v_i1 and v_i2. However, it is also true that either v_i1 or v_i2 must be a neighbor of both v_j1 and v_j2, requiring that both v_i1 and v_i2 be colored before v_j1 and v_j2, providing a contradiction. □



5.2 An Improved Lower Bound for GIS Family Methods Using Seed Vertex of Maximum Degree (MDGIS)

In [2], Juedes and Jones establish an approximation lower bound for star and acyclic bicoloring (χ_sb(G_b)), subject to reasonable approximability assumptions, of O(N^(1/3)) (see section 3.1, page 54). For certain members of the GIS family, including CDC, ASBC and maximum neighborhood, the stair matrix example (section 4.3.1, page 70) provides a tighter lower bound.

A stair matrix containing r rows is constructed so that the degrees of all row vertices will be r + 1 which, while only rows are colored, is greater than the maximum possible degree of any column vertex in G_b and also in any derivative matrix of G_b. Considering first the CDC strategy, which always selects a vertex of maximal degree as the current independent set seed, this method will always select the "row" bipartition to form the next independent set. Since no rows are compressible until after at least one column group is colored, CDC is locked in to a simple row-wise coloring of the stair matrix graph, so χ_cdc = r in this case. Since the number of vertices N in a stair matrix graph equals r² + r + 1, it follows that:

1. N = r² + r + 1

2. r² + r + 1 < (r + 1)²

3. N < (r + 1)²

4. √N < r + 1

5. r > √N − 1

6. r ∈ ℕ⁺

7. r ≥ ⌊√N⌋

8. χ_cdc ≥ ⌊√N⌋

Since χ(stair matrix) = 2, CDC's approximation ratio for this instance is ⌊√N⌋/2 = O(N^(1/2)). This demonstrates that the CDC algorithm, a GIS family implementation which always includes the vertex of maximum degree in the current color group, cannot approximate star bicoloring to within N^(1/2 − ǫ) for any positive ǫ.

In the general case, ASBC is not as restricted in its choice of coloring bipartition as CDC. ASBC first considers "sufficiently large" bipartitions where either |V_b1|/∆ > ∆ or |V_b2|/∆ > ∆. However, since ∆ = r + 1 and |V_b1| ≤ r, V_b1 is never sufficiently large. Similarly, |V_b2| ≤ r² + 1, so the second inequality is likewise never true. Thus, for the stair matrix ASBC will default to selection of the vertex of highest degree as the seed, and will perform identically to CDC in this case.

The stair matrix construct also defeats the maximum neighborhood strategy, which looks at the maximum degree vertex for both bipartitions, forms greedy independent sets seeded with each, and then colors from the bipartition which produces the independent set with the largest neighborhood. As we will see, this method also degenerates into a strictly row-wise coloring of the stair matrix. For V_b1, the set of row vertices, the degrees of all vertices are the same, r + 1, so any arbitrary row may be selected. In any case, no row compression is possible unless column 1 is first colored, so until that occurrence, the cardinality of the neighborhood for any selected row color group will be r + 1. For V_b2, the column vertices, the seed vertex will be column 1, the target column we would like to color to unlock the rows. However, no other columns can be compressed with column 1, and so its degree and the neighborhood of its independent set are r, which is less than the row option. So, at each coloring step, maximum neighborhood will select a row, and will also perform identically to CDC.

Note that in each of these examples, the structure of the stair matrix causes these methods to choose the vertex of maximum degree as the seed for the next coloring set. This particular decision metric then produces worst-case coloring performance on these graphs, since any bipartite graph is trivially colorable with min{|V_b1|, |V_b2|} colors. The property of always including v_∆ in the next computed independent set coloring group defines a subset of the GIS (Greedy Independent Set) algorithms. We will refer to this subset as Maximum Degree Greedy Independent Set (MDGIS) algorithms.

5.3 Observations on the Size of GIS Neighborhoods

Preliminary to presenting additional approximation results, in this section we present some observations regarding the cardinality of the neighborhoods of greedy independent sets (or, in other words, the sum of the degrees of the members of the independent sets). These results will be useful in the following sections 5.4, 5.5 and 5.8.

For the following claims, we consider MDGIS methods, such as CDC, which include a remaining vertex of maximal degree in each coloring group.

Claim 2. |Γ(V̂)| ≥ ρ^(1/2), where µ = max{|V_b1|, |V_b2|} and ρ = min{|V_b1|, |V_b2|}.

Proof. Since the vertex of maximum degree connects to ∆ vertices in its neighborhood, it is trivially true that |Γ(V̂)| ≥ ∆. Also, consider that each vertex v_j ∈ V_n can connect to ≤ ∆ vertices from V_is. Therefore, there must be at least ⌈|V_is|/∆⌉ vertices in Γ(V̂), because those vertices collectively must connect to all v_is ∈ V̂ and also must connect to the remaining v_i ∈ V_is \ V̂ in order to lock them from inclusion in the independent set. So, |Γ(V̂)| ≥ max{∆, ρ/∆}. Since for any positive bi-factor pair x and y of z, either x ≥ z^(1/2) or z/x = y ≥ z^(1/2), then max{∆, ρ/∆} ≥ ρ^(1/2). □

Claim 3. |Γ(V̂)| ≥ µ^(1/3), where µ = max{|V_b1|, |V_b2|} and ρ = min{|V_b1|, |V_b2|}.

Proof. Case ρ ≥ µ^(2/3): By the assumption of our case, ρ^(1/2) ≥ (µ^(2/3))^(1/2) = µ^(1/3).

Case ρ < µ^(2/3): Since only ρ vertices from the smaller bipartition are available to connect to the µ vertices from the larger side, ∆ ≥ µ/ρ, and hence max{∆, ρ/∆} ≥ ∆ ≥ µ/ρ ≥ µ/µ^(2/3) = µ^(1/3).

Therefore, |Γ(V̂)| ≥ max{∆, ρ/∆} ≥ µ^(1/3). □
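Both bounds can be spot-checked empirically. The sketch below (graph parameters are hypothetical, chosen only for illustration) builds a random bipartite graph, forms a greedy distance-2 independent set on the side holding the maximum degree vertex, and verifies the two lower bounds on the neighborhood size:

```python
import math
import random

def mdgis_neighborhood(adj, side):
    """Greedy maximal distance-2 independent set within `side`, seeded with
    a maximum degree vertex of that side; returns the set and its
    neighborhood Gamma on the opposite side."""
    chosen, hood = [], set()
    for v in sorted(side, key=lambda u: len(adj[u]), reverse=True):
        if not (adj[v] & hood):      # shares no neighbor with the set so far
            chosen.append(v)
            hood |= adj[v]
    return chosen, hood

random.seed(1)
n1, n2 = 30, 12                      # mu = 30, rho = 12
V1, V2 = list(range(n1)), list(range(n1, n1 + n2))
adj = {v: set() for v in V1 + V2}
for i in V1:                         # random bipartite graph, V1 degree 3
    for j in random.sample(V2, 3):
        adj[i].add(j)
        adj[j].add(i)

vmax = max(adj, key=lambda v: len(adj[v]))   # the graph's maximum degree vertex
side = V1 if vmax in V1 else V2              # an MDGIS pass colors from its side
_, hood = mdgis_neighborhood(adj, side)
rho, mu = min(n1, n2), max(n1, n2)
assert len(hood) >= math.sqrt(rho)           # Claim 2: |Gamma| >= rho^(1/2)
assert len(hood) >= mu ** (1 / 3)            # Claim 3: |Gamma| >= mu^(1/3)
```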

5.4 CDC Approximation Analysis

In the preceding section 5.3, we employed the characteristic of the CDC algorithm that the remaining vertex of maximum degree is always included in the next independent set of vertices being colored to define some bounds on the size of the neighborhood of that independent set. Noting that the size of this neighborhood (|Γ(V̂)|) is equal to the number of edges eliminated by coloring the independent set, we can apply a technique similar to the ASBC approximation analysis in [2] to show an approximation result for CDC.

Claim 4. The CDC algorithm uses at most 3µ^(3/4)·χ_sb(G_b) colors. Hence, CDC is an O(N^(3/4)) approximation algorithm for star bicoloring.

Proof. As shown by Claim 3 above, CDC deletes at least µ^(1/3) edges with each coloring iteration. As the coloring proceeds, however, the size of the descendant graph, and therefore the value of µ, decreases. To address this property, we consider this analysis in two phases.

For the first phase, let µ_k be the value of µ after the kth coloring iteration, and for some α > 1 define µ_k as sufficiently large while µ_k ≥ N^(1 − 1/α). While this condition holds, CDC will eliminate at least (µ_k)^(1/3) ≥ µ^(1/3 − 1/(3α)) edges per coloring pass. Subsequently, for the last phase of the analysis, we can use the worst-case assumption that each remaining vertex will use an additional color, which will require an additional N^(1 − 1/α) colors.

[Plot omitted: the two exponent curves 1 − 1/α and 2/3 + 1/(3α) plotted against α, crossing at α = 4.]

Figure 5.1: Minimizing Approximation Ratio Exponent for CDC

Thus, CDC will use at most f(G_b) = |E_b| / µ^(1/3 − 1/(3α)) + N^(1 − 1/α) colors. Since |E_b|/µ ≤ χ_sb(G_b), this gives an approximation ratio r(G_b) = f(G_b)/χ_sb(G_b) of:

r(G_b) ≤ (|E_b| / µ^(1/3 − 1/(3α))) / (|E_b|/µ) + N^(1 − 1/α)
       = µ / µ^(1/3 − 1/(3α)) + N^(1 − 1/α)
       = µ^(2/3 + 1/(3α)) + N^(1 − 1/α)
       ≤ µ^(2/3 + 1/(3α)) + 2µ^(1 − 1/α)

As seen in Figure 5.1, the exponent for µ is minimized when α = 4, and thus

µ^(2/3 + 1/(3α)) + 2µ^(1 − 1/α) = µ^(2/3 + 1/12) + 2µ^(1 − 1/4) = 3µ^(3/4)

Since µ = max{ρ, µ} and N = ρ + µ, µ is O(N), and CDC is an O(N^(3/4)) approximation algorithm for Minimum Star Bicoloring. □
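The α-balancing step here recurs in sections 5.5 and 5.8 with different roots: when a method deletes at least X^(1/k) edges per pass, the two phases contribute exponents (k − 1)/k + 1/(kα) and 1 − 1/α, and the larger of the two is minimized where they coincide, at α = k + 1, giving exponent k/(k + 1). This consolidation (ours, not stated in the dissertation) reproduces α = 4 and 3/4 here (k = 3), α = 6 and 5/6 in section 5.5 (k = 5), and α = 3 and 2/3 in section 5.8 (k = 2). A sketch with exact arithmetic:

```python
from fractions import Fraction

def phase_exponents(k, a):
    """Exponents of the two cost terms when at least X^(1/k) edges are
    deleted per pass: phase one contributes X^((k-1)/k + 1/(k*a)) colors,
    phase two X^(1 - 1/a) colors."""
    return (Fraction(k - 1, k) + Fraction(1, k) / a, 1 - 1 / a)

for k, expo in [(2, Fraction(2, 3)), (3, Fraction(3, 4)), (5, Fraction(5, 6))]:
    a = Fraction(k + 1)              # the balancing choice alpha = k + 1
    e1, e2 = phase_exponents(k, a)
    assert e1 == e2 == expo          # both phases cost Theta(X^(k/(k+1)))
    for other in (a - Fraction(1, 2), a + Fraction(1, 2)):
        # moving alpha either way strictly raises the larger exponent
        assert max(phase_exponents(k, other)) > expo
```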

5.5 CDC Approximation Based on |E|

In the general graph coloring case, the number of edges is typically greater than the number of vertices in a given graph. Appropriately, approximation analyses for graph coloring are therefore typically based upon the number of vertices. However, with sparse graphs, it is possible for the number of edges to be less than the cardinality of the set of vertices. In fact, some of our test graphs approach this situation, and for the stair and drop matrices |E_b| < |V_b1| + |V_b2|, albeit slightly. With this rationale, we look at what a similar style of approximation analysis based on |E_b| would provide for GIS family algorithms which include a remaining vertex of maximal degree in each independent set.

Claim 5. The CDC algorithm, and MDGIS algorithms generally, use at most 2|E_b|^(5/6) colors; hence, they are O(|E_b|^(5/6)) approximation algorithms for Minimum Star Bicoloring.

Proof. As shown in Claims 2 and 3 above, given bipartite graph G_b = ({V_b1, V_b2}, E_b), maximal degree GIS model algorithms will always eliminate at least e = max(ρ^(1/2), µ^(1/3)) edges with each coloring cycle, where ρ = min(|V_b1|, |V_b2|) and µ = max(|V_b1|, |V_b2|), and therefore |E_b| ≤ ρµ.

For the initial graph, when ρ ≥ µ^(2/3):

ρ^(1/2) ≥ µ^(1/3)
e = max(ρ^(1/2), µ^(1/3)) = ρ^(1/2)
ρ^(3/2) ≥ µ
|E_b| ≤ ρµ ≤ ρ·ρ^(3/2) = ρ^(5/2)

Since e = ρ^(1/2) and |E_b| ≤ ρ^(5/2), then e ≥ |E_b|^(1/5) for the first coloring pass.

Similarly, when ρ ≤ µ^(2/3):

ρ^(1/2) ≤ µ^(1/3)
e = max(ρ^(1/2), µ^(1/3)) = µ^(1/3)
µ^(2/3) ≥ ρ
|E_b| ≤ ρµ ≤ µ^(2/3)·µ = µ^(5/3)

Since e = µ^(1/3) and |E_b| ≤ µ^(5/3), then e ≥ |E_b|^(1/5) for the first coloring pass.

As e and |E_b| are declining values with each coloring iteration, let E_k be the set of remaining edges, and e_k the value of e, after the kth coloring iteration. Further, for some α > 1, define E_k as sufficiently large when |E_k| ≥ |E_b|^(1 − 1/α). While this condition holds, the greedy model algorithms will eliminate at least |E_k|^(1/5) ≥ |E_b|^(1/5 − 1/(5α)) edges each pass. Subsequent to this phase, we will assume the worst case that each remaining edge results in a new color being added, for an additional |E_b|^(1 − 1/α) colors. Therefore, the maximum number of colors used will be:

|E_b| / |E_b|^(1/5 − 1/(5α)) + |E_b|^(1 − 1/α) = |E_b|^(1 − (1/5 − 1/(5α))) + |E_b|^(1 − 1/α) = |E_b|^(4/5 + 1/(5α)) + |E_b|^(1 − 1/α)

As shown in Figure 5.2, this equation is minimized when α = 6. Substituting into the above, |E_b|^(4/5 + 1/30) + |E_b|^(1 − 1/6) = 2|E_b|^(5/6), showing that maximal degree GIS algorithms provide an O(|E_b|^(5/6)) approximation in terms of graph edges. □

[Plot omitted: the two exponent curves 1 − 1/α and 4/5 + 1/(5α) plotted against α, crossing at α = 6.]

Figure 5.2: Minimizing Edge-based Approximation Ratio Exponent for CDC

5.6 Initial Maximum Neighborhood Approximation Analysis

The maximum neighborhood method is among the group of GIS algorithms which compute tentative greedy independent set solutions for both bipartitions in each pass, selecting one to become the next color group based upon some comparative greedy choice. In the case of maximum neighborhood, the greedy independent set of maximal neighborhood cardinality is selected, which is equivalent to selecting the coloring group which would eliminate the most edges. Given the CDC analysis for MDGIS methods which include the vertex of maximum degree in each greedy independent set, an analysis for maximum neighborhood is straightforward.

Claim 6. The maximum neighborhood algorithm uses at most 3µ^(3/4)·χ_sb(G_b) colors. Hence, maximum neighborhood is an O(N^(3/4)) approximation algorithm for star bicoloring.

Proof. As before, we let V̂ represent the greedy independent set chosen from the bipartition which includes a maximal degree vertex. Now, we let V_∆ represent the bipartition from which V̂ is formed (that is, the bipartition containing v_∆, a vertex of maximal degree). Further, let V_d represent the other bipartition, where its largest vertex degree is d and necessarily d ≤ ∆. Also, let V̂_d be the greedy independent set computed for bipartition V_d. Obviously, maximum neighborhood will remove max{|Γ(V̂)|, |Γ(V̂_d)|} ≥ |Γ(V̂)| ≥ ∆ edges with each coloring pass. As observed in section 5.3, |Γ(V̂)| ≥ ⌈|V_is|/∆⌉. Recalling that V_is is the bipartition producing a given greedy independent set, ρ represents the cardinality of the smaller bipartition, and µ that of the larger, |V_is| ≥ ρ. So, maximum neighborhood will remove

max{|Γ(V̂)|, |Γ(V̂_d)|} ≥ max{∆, ⌈ρ/∆⌉} ≥ ρ^(1/2)

edges with each pass. Referring again to section 5.3, Claim 3, we can similarly show that

max{|Γ(V̂)|, |Γ(V̂_d)|} ≥ µ^(1/3)

This provides the same lower bounds for the number of edges deleted per pass as the CDC algorithm, and the balance of the proof from section 5.4 shows that maximum neighborhood is an O(N^(3/4)) approximation algorithm for Minimum Star Bicoloring. □

5.7 Improved Maximum Neighborhood Approximation Analysis

The preceding approximation analysis for maximum neighborhood (section 5.6) can be improved by considering the result for ASBC.

Claim 7. The maximum neighborhood algorithm uses at most 3µ^(2/3)·χ_sb(G_b) colors. Hence, maximum neighborhood is an O(N^(2/3)) approximation algorithm for star bicoloring.

Proof. As shown in section 3.3, for at least one bipartition Greedy Independent Set will produce an independent set which will eliminate a minimum of √((|V_b1| + |V_b2|)/2) edges on a given coloring pass (that is, for the bipartition which would be selected by the ASBC algorithm). Let V̂_asbc be that independent set, selected from bipartition V_asbc, where |Γ(V̂_asbc)| is therefore necessarily ≥ √((|V_b1| + |V_b2|)/2), and let V_n be the other bipartition. Maximum neighborhood will always select the bipartition whose independent set generates the highest cardinality neighborhood, that is, the coloring bipartition corresponding to max{|Γ(V̂_asbc)|, |Γ(V̂_n)|}. Since max{|Γ(V̂_asbc)|, |Γ(V̂_n)|} ≥ |Γ(V̂_asbc)|, maximum neighborhood eliminates a minimum of |Γ(V̂_asbc)| ≥ √((|V_b1| + |V_b2|)/2) edges with each coloring pass. Therefore, the analyses from sections 3.3.1 and 3.3.2 also apply to maximum neighborhood, showing it to be an O(N^(2/3)) approximation algorithm for star bicoloring. □

5.8 Approximation Analyses for Almost Square Matrices

In this section we consider a gap in the CDC analysis presented in section 5.4, and further analyze the behavior of CDC and other maximal degree GIS family (MDGIS) algorithms when the character of the input matrix is such that it shows a tendency to remain approximately square throughout the coloring process.

5.8.1 Hypothetical Strongly Square Matrices

It is interesting to note the role of |V_b1| and |V_b2| in the analysis from [2] and Section 5.4. In each case, |E| / max{|V_b1|, |V_b2|} is used as a lower bound on χ(G_b). This bound represents a theoretical best-case coloring for a ρ × µ matrix where µ ≥ ρ and complete rows (columns) of size µ can be colored in each pass. Intuitively, this introduces a gap in the analysis, as in most cases some color groups would come from the small side of the bipartition and be of cardinality less than ρ, thus producing a larger value for χ(G_b). To explore the potential effect of this gap, consider a hypothetical "strongly square matrix" that is square and remains square as row and column color groups are eliminated from it. That is, ρ_Gb = µ_Gb, and ∀{G′ | G′ is a descendant graph of G_b}, ρ_G′ = µ_G′.

Claim 8. Maximal degree GIS family (MDGIS) algorithms use at most 3(N/2)^(2/3)·χ_sb(G_b) colors for strongly square matrices; hence, they are O(N^(2/3)) approximation algorithms for star bicoloring on strongly square matrices.

Proof. Given bipartite graph G_b = ({V_b1, V_b2}, E_b), let µ = max{|V_b1|, |V_b2|}, ρ = min{|V_b1|, |V_b2|} and N = |V_b1| + |V_b2|. As shown in Claim 2 and Claim 3, MDGIS algorithms delete at least max{ρ^(1/2), µ^(1/3)} edges with each coloring iteration. Since ρ^(1/2) ≤ max{ρ^(1/2), µ^(1/3)}, at least ρ^(1/2) edges will be deleted with each iteration. Since ρ^(1/2) decreases in value as the algorithm continues, we again consider the following analysis in two phases.

As before, let ρ_k be the value of ρ after the kth coloring iteration. In the first phase, for some α > 1, we define ρ_k as sufficiently large while ρ_k ≥ N^(1 − 1/α). During this coloring phase, MDGIS algorithms will eliminate at least (ρ_k)^(1/2) ≥ ρ^(1/2 − 1/(2α)) edges per pass. Subsequently, for the second and last phase of this analysis, we again use the worst-case assumption that each remaining vertex will use an additional color, which will require an additional N^(1 − 1/α) colors.

Thus, MDGIS algorithms will use at most f(G) = |E| / ρ^(1/2 − 1/(2α)) + N^(1 − 1/α) colors for strongly square matrices. This gives an approximation ratio r(G) = f(G)/χ_sb(G) of:

r(G) ≤ (|E| / ρ^(1/2 − 1/(2α))) / (|E|/µ) + N^(1 − 1/α)

Since we are assuming that all G′ are square, µ = ρ and:

r(G) ≤ (|E| / ρ^(1/2 − 1/(2α))) / (|E|/ρ) + N^(1 − 1/α)
     = ρ / ρ^(1/2 − 1/(2α)) + N^(1 − 1/α)
     = ρ^(1/2 + 1/(2α)) + N^(1 − 1/α)
     ≤ ρ^(1/2 + 1/(2α)) + 2ρ^(1 − 1/α)

Referring to Figure 5.3, it is seen that the exponent for ρ is minimized when α = 3, and thus

ρ^(1/2 + 1/(2α)) + 2ρ^(1 − 1/α) = ρ^(1/2 + 1/6) + 2ρ^(1 − 1/3) = 3ρ^(2/3)

[Plot omitted: the two exponent curves 1 − 1/α and 1/2 + 1/(2α) plotted against α, crossing at α = 3.]

Figure 5.3: Optimizing α for Strongly Square Matrices

Since ρ = N/2, ρ is O(N), and MDGIS algorithms including CDC are O(N^(2/3)) approximation algorithms for star bicoloring on strongly square matrices.

It is also straightforward to establish that the O(N^(2/3)) approximation ratio also applies to maximum neighborhood, as it was shown in section 5.6 that that algorithm eliminates at least max{ρ^(1/2), µ^(1/3)} edges per color. □

5.8.2 Nearly Square Matrices

While strongly square matrices are theoretical constructs, many real matrices can be constructed that would remain "close to square" throughout the star bicoloring process with MDGIS algorithms. For discussion purposes, we will term these matrices "nearly square matrices".

If the amount of divergence from square can be minimized, that is, minimize(|n − m|), then coloring results might approach that of the theoretical strongly square matrix. Consider the case where, ∀G′, m ≤ n^(1+ǫ) for some 0 < ǫ < 1/2. We can consider the end of the proof provided in section 5.8.1, where |E|/µ is used to bound the chromatic number for the computation of the approximation ratio. For example, in the case where ǫ = 1/4, we can provide an approximation ratio for MDGIS algorithms on nearly square matrices as follows:

r(G) ≤ (|E| / ρ^(1/2 − 1/(2α))) / (|E|/µ) + N^(1 − 1/α) ≤ (|E| / ρ^(1/2 − 1/(2α))) / (|E| / ρ^(1 + 1/4)) + N^(1 − 1/α)
     = ρ^(1 + 1/4) / ρ^(1/2 − 1/(2α)) + N^(1 − 1/α)
     = ρ^(3/4 + 1/(2α)) + N^(1 − 1/α)
     ≤ ρ^(3/4 + 1/(2α)) + 2ρ^(1 − 1/α)

The exponent for ρ will be minimized when α = 6, and thus

ρ^(3/4 + 1/(2α)) + 2ρ^(1 − 1/α) = ρ^(3/4 + 1/12) + 2ρ^(1 − 1/6) = 3ρ^(5/6)

Thus, ǫ = 1/4 provides an approximation ratio of O(N^(5/6)).

Intuitively, as ǫ → 0, the approximation results for MDGIS algorithms on nearly square matrices will approach the O(N^(2/3)) result for strongly square matrices. Indeed, with ǫ ≤ 1/8 (i.e. m ≤ n^(1 + 1/8)), α minimizes at 4, providing an approximation ratio of O(N^(3/4)). Similarly, ǫ ≤ 1/16 produces a minimum α at approximately 3.5, with an approximation ratio of roughly O(N^(7/10)).
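The quoted α values all arise from balancing the two phase exponents 3/4 + 1/(2α) and 1 − 1/α with 3/4 replaced by 1/2 + ǫ, which (our consolidation, not stated explicitly in the text) gives α = 3/(1 − 2ǫ). A quick check with exact arithmetic:

```python
from fractions import Fraction as F

def nearly_square_alpha(eps):
    """Balance the exponents 1/2 + eps + 1/(2a) and 1 - 1/a (an assumed
    consolidation of the per-epsilon cases discussed in the text)."""
    a = F(3) / (1 - 2 * eps)
    exponent = 1 - 1 / a
    assert F(1, 2) + eps + F(1, 2) / a == exponent   # the two phases agree
    return a, exponent

assert nearly_square_alpha(F(1, 4)) == (F(6), F(5, 6))    # eps = 1/4 case
assert nearly_square_alpha(F(1, 8)) == (F(4), F(3, 4))    # eps <= 1/8 case
a, e = nearly_square_alpha(F(1, 16))                      # eps <= 1/16 case
assert abs(float(a) - 3.5) < 0.1 and abs(float(e) - 0.7) < 0.01
assert nearly_square_alpha(F(0)) == (F(3), F(2, 3))       # strongly square limit
```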

In some cases, the size of µ relative to ρ may conform to a constant ratio rather than an exponential factor. In such instances where µ ≤ φρ for some constant φ > 1, it is straightforward to see that r(G) ≤ 3φρ^(2/3), providing the O(N^(2/3)) approximation result.

5.8.2.1 An Example Nearly Square Matrix Construction

In considering the process of coloring matrices to minimize |n − m|, intuitively it would be desirable for coloring groups to alternate between rows and columns, thus minimizing the departure from a square form. One way to construct such a matrix in the case where |V_b1| = |V_b2| = n is described by the following algorithm in Figure 5.4. Example nearly square matrices for n = 9 are presented in Figures 5.5 and 5.6.

within an empty n x n matrix
while a row containing n nnzs does not exist
    add nnzs to form one row of size n
    add nnzs to form one column of size n
    n = n - 2
reorder rows as desired
reorder columns as desired
insert all-zero rows as desired
insert all-zero columns as desired

Figure 5.4: Forming a "Nearly Square" Matrix
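A concrete rendering of the Figure 5.4 loop (our interpretation; the exact nonzero placement in the dissertation's figures may differ) together with a check of the two properties the coloring argument uses, alternating dense lines and a descendant of the same form:

```python
def nearly_square(n):
    """n x n 0/1 pattern realizing Figure 5.4: nested full row/column pairs
    over residual blocks of sizes n, n-2, n-4, ... (compressed form; the
    optional all-zero padding rows and columns are left out)."""
    A = [[0] * n for _ in range(n)]
    k = 0
    while n - 2 * k >= 1:
        for j in range(k, n - k):    # one full row and one full column of
            A[k][j] = 1              # the remaining (n - 2k) x (n - 2k) block
            A[j][k] = 1
        k += 1
    return A

A = nearly_square(9)
rowdeg = [sum(r) for r in A]
coldeg = [sum(A[i][j] for i in range(9)) for j in range(9)]
# row and column degrees interleave (9, 8, 7, ...), so the densest remaining
# line alternates between a row and a column as groups are colored
assert rowdeg == coldeg == list(range(9, 0, -1))
# coloring the densest row, then the densest column, leaves the same form
assert [row[1:8] for row in A[1:8]] == nearly_square(7)
```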

The all-zero rows and columns are provided to illustrate the potential for these matrices to appear sparse and do not affect their MDGIS coloring performance, so for this discussion we let n be the number of rows and m be the number of columns in the equivalent compressed form (see Figure 5.5). It is clear that no further compression of the matrix is possible, and that the first color group will be either the maximum degree row or column (row 1 or column 1 in this example). WLOG, assume row 1 is assigned to the first color group. Column 1 is now the most dense matrix line, and will be assigned to color group 2. This produces a descendant matrix which is smaller but of the same form as the original M, and the coloring process would continue with alternating selection of the remaining most dense row/column (or column/row) pair.

In this example, with ρ representing the cardinality of the smaller side of the equivalent graph G_b and µ that of the larger, ∀{G′ | G′ is a descendant graph of G_b}, µ ≤ ρ + 1, and further

r(G) ≤ (|E| / ρ^(1/2 − 1/(2α))) / (|E| / (ρ + 1)) + N^(1 − 1/α) = (ρ + 1) / ρ^(1/2 − 1/(2α)) + N^(1 − 1/α)

which differs from the strongly square bound (denominator |E|/ρ) only by a constant factor. This observation provides the O(N^(2/3)) approximation result for nearly square matrices constructed according to Figure 5.4.

[Matrix figures omitted: nonzero patterns for the n = 9 construction.]

Figure 5.5: An n = 9 "Nearly Square" Matrix in Compressed Form

Figure 5.6: An n = 9 "Nearly Square" Matrix in Random Form

5.9 Neighborhood Ratio Approximation Analysis

As mentioned in section 4.4.1, the neighborhood ratio method provides an appealing intuitive approach, but produces inconsistent results. While this method has limitations, which we will consider further, it does solve the stair matrix correctly and also can be shown to perform reasonably well on nearly square matrices.

To illustrate, consider a strongly square matrix represented by the bipartite graph G_b = ({V_b1, V_b2}, E_b). Let V̂_b1 ⊆ V_b1 and V̂_b2 ⊆ V_b2 be the greedy independent sets computed from each of their respective bipartitions. The neighborhood ratio decision logic will select independent set V̂_b1 for coloring when the following condition is true

|Γ(V̂_b1)| / |V_b2| ≥ |Γ(V̂_b2)| / |V_b1|

selecting independent set V̂_b2 otherwise. Since we are considering the case of strongly square matrices, it is always the case that |V_b2| = |V_b1|. This results in neighborhood ratio always selecting an independent set with a maximal cardinality neighborhood, which is the same decision logic as the maximum neighborhood method. Therefore, neighborhood ratio will provide an approximation ratio of O(N^(2/3)) for strongly square matrices (see section 5.7). Further, neighborhood ratio bases the sizes of the bipartitions, |V_b1| and |V_b2|, on the original matrix M, and therefore provides O(N^(2/3)) approximation for any square input matrix.

However, when matrices are not square, in certain circumstances neighborhood ratio coloring performance can deteriorate badly. Section 4.4.1 provided an example where the number of colors used could exceed the trivial min(|V_b1|, |V_b2|) solution, but even that example is not worst-case behavior. Consider the graph in Figure 5.7. In the case of bipartition V_b1, all vertices are neighbors-of-neighbors, so a single vertex (WLOG assume the first one) will be selected as the sole member of the greedy independent set V̂_b1, producing |Γ(V̂_b1)| = 1. For bipartition V_b2, the greedy independent set is obvious, with |Γ(V̂_b2)| = N − 1. The decision on which set is to receive the next color will be:

1/1 ≥ (N − 1)/(N − 1)

This results in a "tie" case, which in the worst case will select the independent set V̂_b1. Since neighborhood ratio bases the sizes of the bipartitions, |V_b1| and |V_b2|, on the original matrix M, once the first row is colored, the ratio |Γ(V̂_b2)| / |V_b1| will have a declining value while |Γ(V̂_b1)| / |V_b2| will remain a constant 1, causing the overall coloring process to use N − 1 colors on a two-colorable graph (counting ⊥ as a color in this case). The resulting approximation ratio is O((N − 1)/2) = O(N), which disproves Conjecture 1, that the general Greedy Independent Set family of star bicoloring algorithms are all O(n^α) approximation algorithms for some α < 1.
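The degeneration on the Figure 5.7 star graph can be traced in a few lines (a toy simulation mirroring the ratios derived above; the choice N = 10 is arbitrary):

```python
# Star graph of Figure 5.7: one hub in Vb2 joined to all N - 1 leaves in Vb1.
# Neighborhood ratio fixes |Vb1| and |Vb2| from the original matrix, and the
# worst-case tie rule keeps choosing single-leaf groups from Vb1.
N = 10
nb1, nb2 = N - 1, 1              # bipartition sizes of the original graph
leaves = set(range(nb1))         # uncolored Vb1 vertices
colors_used = 0
while leaves:
    ratio_b1 = 1 / nb2           # any single leaf: |Gamma| = 1, over |Vb2|
    ratio_b2 = len(leaves) / nb1 # hub's remaining neighborhood, over |Vb1|
    assert ratio_b1 >= ratio_b2  # Vb1 wins (or ties) on every pass
    leaves.pop()                 # so one leaf is colored per pass
    colors_used += 1
assert colors_used == N - 1      # versus 2 colors when the hub goes first
```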

[Graph figure omitted: the single vertex of V_b2 is adjacent to all N − 1 vertices of V_b1.]

Figure 5.7: Neighborhood Ratio Using O(N) Colors

The example in figure 5.7 is easy to explain and illustrative of the problem under discussion, but does depend on “tie handling” for the initial poor selection. As a further illustration, figure 5.8 provides a slightly more complex example which is independent of ties. The coloring process using neighborhood ratio would proceed as follows. Initially, the vertex of maximum degree (labeled vm) would be selected as the seed vertex for Vb1. Since all remaining vertices are neighbors-of-neighbors, V̂b1 = {vm}. Either vertex from Vb2 may be selected, so WLOG V̂b2 can be represented by the green vertex. This will result in |Γ(V̂b1)|/|Vb2| = 2/2 and |Γ(V̂b2)|/|Vb1| = ((N − 1)/2)/(N − 2), thus V̂b1 will be colored. Subsequently, two

disjoint subgraphs exist. For Vb1, any one vertex from each subgraph, e.g. the two yellow vertices, may be selected as V̂b1, and |Γ(V̂b1)|/|Vb2| will remain 2/2. For this coloring step and the remainder of the coloring process, V̂b2 will include both vertices from Vb2. At each step, |Γ(V̂b2)|/|Vb1| will be (N − 2 − θ)/(N − 2), where θ is the number of vertices previously colored, which will always be less than 2/2, so the coloring proceeds to select pairs of vertices from Vb1 for each subsequent color group. This process results in using (N − 1)/2 = O(N) colors.

Figure 5.8: Neighborhood Ratio Using O(N) Colors, Tie Independent

5.10 Analysis for Ratio M’ Method

A natural question regarding the neighborhood ratio method would be to consider the

impact of using the bipartition sizes from the series of descendant graphs, Gb(Mi′), generated by the coloring process, rather than the sizes from the original input matrix M. We refer to this method as ratio M' (ratio M prime), and its decision criterion is to select partition Vb1 when

|Γ(V̂b1)|/|V′b2| ≥ |Γ(V̂b2)|/|V′b1|

is true and select Vb2 otherwise, where |V′b1| and |V′b2| are the cardinalities of the bipartitions as reduced by the removal of all preceding colored vertices. While this would seem to be an intuitive improvement, as it more accurately models the reduction of the matrix as it is colored, we actually obtain poorer empirical results (see Table 4.3) with no

improvement in worst case coloring performance. The latter observation can be seen with the example graphs in Figures 5.7 and 5.8 by following the same analysis as in section 5.9. Further, while not affecting the strongly square analysis, updating the sizes of the bipartitions does negate the O(N^{2/3}) analysis result for general square input matrices that is obtained by neighborhood ratio.

5.11 Analysis for Inverse Ratio Method

While the neighborhood ratio method selects a coloring bipartition based upon the relative sizes of the greedy independent set neighborhoods to the partitions containing those neighborhoods, the inverse ratio method compares the neighborhood size to the size

of the coloring partition. That is, the criterion for selecting Vb1 as the partition to color is

|Γ(V̂b1)|/|Vb1| ≥ |Γ(V̂b2)|/|Vb2|

This choice considers the relative density of edges deleted per member vertex in the current color group. In practice, inverse ratio outperforms neighborhood ratio on the Harwell-Boeing test matrix group by about 5%, but fails to correctly solve the stair matrix.

Curiously, inverse ratio had identical empirical results with maximum neighborhood, but can be shown to solve certain matrices differently. Consider the matrix illustrated in Figure 5.9. Recall that the formation of independent sets by Greedy Independent Set begins with selection of a vertex of maximal degree within the bipartition as the seed

vertex. For bipartition Vb1, where |Vb1| = 7, the yellow vertex of degree 4 will constitute the greedy independent set, as all other vertices are neighbors-of-neighbors. For bipartition Vb2, |Vb2| = 4 and we can use the first occurring vertex of maximal degree, indicated in green, as the greedy independent set. In this graph, |Γ(V̂b1)| = 4 while |Γ(V̂b2)| = 3. Therefore, maximum neighborhood would select Vb1 as the bipartition to color. However, since |Γ(V̂b1)|/|Vb1| ≥ |Γ(V̂b2)|/|Vb2|, i.e. 4/7 ≥ 3/4, is false, inverse ratio would select bipartition Vb2.
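The ratio criteria of sections 5.9 through 5.11 can be written as interchangeable selection predicates. The sketch below is ours (the names and signatures are illustrative, not the dissertation's code); each predicate returns True when Vb1 should be colored next.

```python
# g1, g2 = |Γ(V̂b1)|, |Γ(V̂b2)|; v1, v2 = |Vb1|, |Vb2| from the original matrix;
# v1p, v2p = |V′b1|, |V′b2| as reduced by previously colored vertices.

def neighborhood_ratio(g1, g2, v1, v2):
    return g1 / v2 >= g2 / v1          # denominators fixed to the original matrix M

def ratio_m_prime(g1, g2, v1p, v2p):
    return g1 / v2p >= g2 / v1p        # denominators track the reduced matrix M'

def inverse_ratio(g1, g2, v1, v2):
    return g1 / v1 >= g2 / v2          # compare against the coloring partition itself

# Figure 5.9 values: |Vb1| = 7, |Vb2| = 4, |Γ(V̂b1)| = 4, |Γ(V̂b2)| = 3.
print(neighborhood_ratio(4, 3, 7, 4))  # True: selects Vb1, like maximum neighborhood
print(inverse_ratio(4, 3, 7, 4))       # False: inverse ratio colors Vb2 instead
```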

Once again, as with neighborhood ratio, in the case of the theoretical strongly square matrices, this method becomes equivalent to maximum neighborhood, and would therefore have an approximation ratio of O(N^{2/3}) in that case. However, inverse ratio can deviate from maximum neighborhood in cases where both |Γ(V̂b1)| > |Γ(V̂b2)| and |Γ(V̂b1)|/|Vb1| < |Γ(V̂b2)|/|Vb2| are true, as in the example. Given that there is no guarantee the bipartition selected in those cases would contain a vertex of maximal degree in the graph, and that the example demonstrates that in some cases inverse ratio may color fewer edges than maximum neighborhood, the approximation ratio for inverse ratio is likely no better than that of maximum neighborhood.

Figure 5.9: Inverse Ratio Differs from Maximum Neighborhood

5.12 Analysis for Dense Ratio Method

The dense ratio method was developed to address the observation that a certain number of very dense lines (rows or columns) tended to skew the behavior of the ratio

methods in general. We conjectured that giving some preference to eliminating these dense lines early in the coloring process might increase the residual orthogonal independence sufficiently to obtain a more efficient result. In practice, dense ratio turns out to be the most effective of the ratio methods investigated. It comes very close to CDC performance on the Harwell-Boeing matrix set (within 2%) while also solving the stair matrices correctly (which CDC does not).

Dense ratio also performs well on strongly square matrices. The decision logic for

dense ratio is to use the neighborhood ratio selection criterion of

|Γ(V̂b1)|/|Vb2| compared to |Γ(V̂b2)|/|Vb1|

whenever the larger ratio is higher than some constant (an arbitrary value of .66 used in our tests), and otherwise default to forming the next color group from the bipartition containing a vertex of maximal degree (v∆), as in the CDC method. We first note that, when the ratios are sufficiently large, similar to other ratio methods the decision logic becomes equivalent to maximum neighborhood when considering strongly square matrices. So with large ratios on strongly square matrices, as discussed in section 5.7, maximum neighborhood (and therefore dense ratio) will eliminate a minimum of

√((|Vb1| + |Vb2|)/2) edges per coloring pass. If the ratios are small, then a coloring group will be based on a partition containing v∆, and therefore will eliminate at least ρ^{1/2} edges, as shown in section 5.4, claim 2. Since ρ is the partition of smaller cardinality and µ the larger, ρ ≤ µ. Thus, √((|Vb1| + |Vb2|)/2) = ((ρ + µ)/2)^{1/2} ≥ ρ^{1/2}, and a minimum of ρ^{1/2} edges are eliminated. Following claim 3 and the proof of claim 8 in section 5.8.1 shows that dense ratio is an O(N^{2/3}) approximation algorithm for star bicoloring on strongly square matrices. Unfortunately, dense ratio suffers from the same treatment of the examples in figures 5.7 and 5.8 as does neighborhood ratio, and despite its good empirical results is also an O(N) approximation algorithm for the general case of star bicoloring.
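The dense ratio decision logic can be sketched as a small predicate. This is our illustrative encoding, not the dissertation's code; only the .66 threshold constant comes from the text, and the function and parameter names are assumptions.

```python
# Dense ratio: apply a ratio-style decision when either neighborhood ratio is large,
# otherwise fall back to the bipartition containing a vertex of maximal degree (v∆),
# as in the CDC method.

THRESHOLD = 0.66   # the arbitrary constant used in the dissertation's tests

def dense_ratio_choice(g1, g2, v1, v2, v_delta_in_vb1):
    """Return 'Vb1' or 'Vb2', the bipartition to form the next color group from."""
    r1, r2 = g1 / v2, g2 / v1                    # the two neighborhood ratios
    if max(r1, r2) > THRESHOLD:                  # dense enough: ratio-style decision
        return 'Vb1' if r1 >= r2 else 'Vb2'
    return 'Vb1' if v_delta_in_vb1 else 'Vb2'    # sparse: follow v∆ as CDC does

print(dense_ratio_choice(8, 2, 10, 10, False))   # 'Vb1': ratio 0.8 exceeds threshold
print(dense_ratio_choice(1, 2, 10, 10, False))   # 'Vb2': ratios small, follow v∆
```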

5.13 Look Ahead Approximation Analysis

Constructions such as the stair matrix (see section 4.3.1) illustrate the property that short-term greedy logic can lead to acceptance of a local maximum which fails to lead to

an optimal overall solution. The look ahead method attempts to address this concern, making its bipartition selection by looking at the impact of the current decision on the next subsequent bipartition decision. Look ahead computes coloring results for each of the four

possible Greedy Independent Set row/column choice combinations for the current and next

possible bipartitions (i.e. coloring a row then another row, a row followed by a column, a column followed by a row, or a column then another column). The bipartition selected for coloring will be the one that preserves the best possible outcome of its following decision. For example, consider the stair matrix in Figure 4.5. Let R1R2 be the number of

edges eliminated by coloring first a row group and then another row group, C1C2 the total edges eliminated by coloring two column groups in a row, and R1C2 and C1R2 be the obvious analogs. For the first coloring pass, look ahead would proceed as follows. Since no rows are compressible in this example, coloring a row will eliminate 7 edges. This will not unlock any additional rows, so subsequently coloring another row eliminates 7 more edges, resulting in R1R2 = 14. If a column were to be colored after the first row, then the dense column 1 would have 5 edges remaining, and R1C2 = 12. Selecting to color a column first would eliminate column 1 for 6 edges. This would allow the remaining columns to be compressed into 6 columns each containing 6 edges, and results in

C1C2 = 12. Lastly, if a row were colored after the first column, then all rows in the descendant graph could be compressed into a single row containing 36 edges, giving C1R2 = 42. Based on this, look ahead would initially color a column, in spite of the fact that this eliminates 6 edges while choosing to color a row (the simple greedy choice) would have eliminated 7. This coloring step results in the derivative matrix M1′ given in Figure 5.10. For the second coloring pass, all remaining rows are compressible, so coloring a row will eliminate 36 edges. Since no edges would remain, both R1R2 = 36 and R1C2 = 36.

As noted, the columns of descendant graph M1′ are also compressible into 6 columns each with 6 edges, so coloring a column eliminates 6 edges (WLOG, this can be thought of as the compressed column containing the first non-zero in each row). Subsequently, coloring another column would eliminate another 6 edges (e.g. the second non-zero in each row),

Figure 5.10: “Stair” Matrix Derivative, Reduced by Coloring Column 1 (r = 6)

and C1C2 = 12. Alternatively, after coloring a column, the remaining rows could be compressed into a single row of 30 edges, and C1R2 = 36. Since the best case results for the (second) descendant graph edges eliminated are tied, look ahead defaults to the maximal current greedy edge reduction (i.e. the maximum neighborhood method), colors a compressed row group and correctly solves the stair matrix using 2 colors. Further, look-ahead does this in a way that provides an approximation analysis equivalent to the best currently observed methods.
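The two-step selection walked through above can be sketched compactly. The snippet is ours, not the dissertation's implementation; it only assumes the four two-pass totals and the immediate greedy gains have already been computed.

```python
# Two-step look ahead: commit the current row/column choice that preserves the best
# outcome over the following decision as well, breaking ties by the immediate greedy
# gain (i.e. defaulting to the maximum neighborhood choice).

def look_ahead_choice(r1r2, r1c2, c1r2, c1c2, row_now, col_now):
    best_row = max(r1r2, r1c2)    # best two-pass outcome starting with a row group
    best_col = max(c1r2, c1c2)    # best two-pass outcome starting with a column group
    if best_row != best_col:
        return 'row' if best_row > best_col else 'column'
    return 'row' if row_now >= col_now else 'column'    # tie: greedy default

# First pass on the stair matrix (Figure 4.5): R1R2=14, R1C2=12, C1R2=42, C1C2=12.
print(look_ahead_choice(14, 12, 42, 12, row_now=7, col_now=6))   # 'column'
# Second pass on M1': R1R2=36, R1C2=36, C1R2=36, C1C2=12; tied, greedy picks a row.
print(look_ahead_choice(36, 36, 36, 12, row_now=36, col_now=6))  # 'row'
```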

Claim 9. The look ahead algorithm eliminates a minimum of √((|Vb1| + |Vb2|)/2) edges per coloring pass.

Proof. This proof is by induction, and requires several steps. We first establish a degenerative base case that illustrates the method look ahead uses to complete the coloring process while introducing necessary symbols and terminology. To begin, we can

view the process of look ahead coloring as a binary tree representing the sequence of bipartition selection choices (i.e. which bipartition from which to select the next independent set coloring group). Since we are using Greedy Independent Set to form the color groups, we know that in each coloring pass at least one choice of bipartition will result in an independent set which will eliminate a minimum of √((|Vb1| + |Vb2|)/2) edges (see sections 3.3 and 5.7). In the trivial case where the last coloring selection is being made, look ahead defaults to maximum neighborhood behavior, making a selection which will complete the coloring in one step. This choice is equivalent to the greedy selection of maximum number of edges eliminated with that coloring group. This situation is illustrated graphically in Figure 5.11. Here we can arbitrarily say a branch from the root node to the left represents a “row” bipartition choice, and the branch to the right a “column” choice. Letting φ = √((|Vb1| + |Vb2|)/2) (the minimum number of edges guaranteed to be eliminated by the “best” choice), we label the left edge (WLOG) to denote guaranteed elimination of φ edges, and the left child node as the choice selected by the maximum neighborhood and look ahead methods.

Figure 5.11: Look Ahead Degenerative Case, One Decision

As long as look ahead makes the same choice as maximum neighborhood we are

assured that φ edges will be eliminated with those coloring steps. We next consider graphs which require two colors where look ahead may diverge from the maximum neighborhood decision path. For this discussion, let the cardinality of some node x in the decision tree be the minimum number of input graph edges guaranteed to be eliminated by the decision path

from the root of the decision tree to node x. We know that at least one choice at each

decision tree node will eliminate φ edges. In Figure 5.12, maximum neighborhood may proceed from the node selected in Figure 5.11 along the path of maximum neighborhood to the node labeled MN in Figure 5.12. We then know that |a1| ≥ 2φ. If |a1| = MAX{|a1|, |b1|, |c1|, |d1|}, then look ahead would commit its first coloring decision to the left hand node of Figure 5.11, which assures that ≥ φ edges are eliminated. However, it is possible that some other node may have a higher cardinality than a1. We note that this cannot be b1, as the selection of a1 by maximum neighborhood ensures that |a1| ≥ |b1|. It could occur, though, that some node ΩA, the node of maximum cardinality between c1 and d1, does exist where |ΩA| > |a1|. Since look ahead commits the current decision along the path to the best subsequent outcome, this condition would cause selection of the right-hand “column” choice (node labeled LA(1), Figure 5.12). In this case node d1 has maximal cardinality, so is selected as node ΩA. Necessarily, since we know |a1| ≥ 2φ, then |ΩA| = 2φ + ε1 for some integer ε1 > 0. Continuing from node LA(1), look ahead will select the maximum neighborhood choice as its final selection (on the last level of the decision tree), completing this example with ≥ φ edges eliminated per color group.

Figure 5.12: Look Ahead Height Two Case

Now we will consider a three-colorable graph where look ahead makes a series of decisions which each diverge from the maximum neighborhood path. As the size of the decision tree increases, look ahead may continue to select a chain of near-term

sub-optimal bipartition choices. Consider Figure 5.13, which continues the example above

which selected decision node LA(1). We know that |ΩA| = 2φ + ε1, and that therefore some node d2 exists where |d2| ≥ 3φ + ε1. If |d2| = MAX{|a2|, |b2|, |c2|, |d2|}, then node ΩA would be selected as the next coloring decision, confirming the maximum neighborhood path descending from LA(1) which has been established to eliminate ≥ φ input graph edges per color group. Similar to before, though, if some node a2 exists where |a2| > |d2|, then look ahead need not include node ΩA in the coloring decision path. For this to occur, |a2| = |d2| + ε2 = 3φ + ε1 + ε2, or, noting the height of the decision path h = 3, |a_{h−1}| = hφ + Σ_{j=1}^{h−1} εj. In that case, look ahead would commit to decision node LA(2) and complete the coloring process with node ΩB, which results in > φ edges eliminated per color.

Figure 5.13: Look Ahead Chained Sub-optimal Decisions

Moving to the general case, illustrated in Figure 5.14, we assume that any chain of look ahead sub-optimal decisions of length i will eliminate more than φ input graph edges

per color group (ending at node ΩAi). Since there is guaranteed to be a node such as d2

where |d2| = (i + 1)φ + Σ_{j=1}^{i} εj, and that |ΩBi| = |d2| + ε_{i+1}, then any chain of length i + 1

will also eliminate more than φ input graph edges per color group. By the hypothesis of

induction any chain of look ahead sub-optimal decisions of arbitrary length will eliminate more than φ input graph edges per color group. Since look ahead either follows the maximum neighborhood path or constructs a sub-chain of sub-optimal decisions ending with a maximum neighborhood decision, look

ahead will always eliminate a minimum of φ input graph edges per color group.



Figure 5.14: Look Ahead Decision Tree General Case

Claim 10. The look ahead algorithm uses at most 3µ^{2/3} χsb(Gb) colors. Hence, look ahead is an O(N^{2/3}) approximation algorithm for star bicoloring.

Proof. Given the proof of claim 9, this claim follows naturally from the analysis in

sections 3.3.2 and 3.3.3. 

5.14 Weighted Unlocking Approximation Analysis

The weighted unlocking strategy is based on the observation that not all vertices are created equal. Coloring certain key rows or columns can have significant impact on the

residual structural orthogonality of the descendant matrices, leading to greater subsequent

compression and better coloring results. Weighted unlocking combines the degree of a vertex v with the number of neighbors of v who also have one or more additional edges back to the bipartition containing v to create a weighted degree. A vertex of maximal weighted degree is used as the seed vertex to create the next greedy independent set color group. Even though the weighting factor is somewhat arbitrary (refinement of the metric is proposed in the future work section), in practice this strategy performs very well. The initial weighted unlocking strategy tied CDC in overall colors used for the Harwell-Boeing test set while also correctly solving the stair matrix, representing a significant practical improvement.

Using this “weighted degree” metric causes weighted unlocking to depart from the strategy of ensuring the inclusion of a maximal degree vertex in each greedy independent set coloring group. A key observation for the approximation analysis is to note that there is a simple bound which limits how far the algorithm will vary from selecting the vertex of maximal degree. The maximum amount of “weight” which can be added to the degree of a given vertex, v, is equal to |Γ(v)|. This occurs when ∀ wi ∈ Γ(v): d(wi) > 1. So, the maximum weighted degree for each vertex is double the regular degree of that vertex. Therefore, weighted unlocking will select seed vertices which have regular degree of at least ∆/2. This observation flows naturally into the analysis method presented in sections 5.3 and 5.4.
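The weighted degree metric can be sketched on an adjacency-list representation. The snippet is ours, not the dissertation's code; the graph encoding and function name are illustrative assumptions.

```python
# Weighted degree as described above: degree(v) plus one unit for each neighbor w of v
# that has at least one additional edge (in a bipartite graph, every edge of w leads
# back into the bipartition containing v).

def weighted_degree(v, adj):
    return len(adj[v]) + sum(1 for w in adj[v] if len(adj[w]) > 1)

# Hub-and-leaves example: every neighbor of 'hub' has degree 1, so no weight is added,
# while each leaf gains one unit from the hub itself.
adj = {'hub': ['a', 'b', 'c'], 'a': ['hub'], 'b': ['hub'], 'c': ['hub']}
print(weighted_degree('hub', adj))  # 3 (equal to its regular degree)
print(weighted_degree('a', adj))    # 2 (double its regular degree, the maximum)
```

Note that the bound discussed above holds by construction: the weighted degree never exceeds twice the regular degree.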

Claim 11. The weighted unlocking algorithm uses at most 4µ^{3/4} χsb(Gb) colors. Hence, weighted unlocking is an O(N^{3/4}) approximation algorithm for star bicoloring.

Proof. Since each weighted unlocking seed vertex will have a degree ≥ ∆/2, similarly to claim 2 we can observe that |Γ(V̂)| ≥ MAX{∆/2, ρ/∆} ≥ ρ^{1/2}/2. Analogous with claim 3, we can also show that |Γ(V̂)| ≥ µ^{1/3}/2. Continuing with the analysis method presented in section 5.4, defining µ^{1/3}/2 as being sufficiently large while µ^{1/3}/2 ≥ N^{1−1/α} for some α > 1 allows weighted unlocking to eliminate (µ^{1−1/α})^{1/3}/2 = (µ^{1/3 − 1/(3α)})/2 edges per coloring pass during the first phase of the algorithm, with an additional N^{1−1/α} colors required to complete the coloring in the worst case. This provides an approximation ratio of:

r(Gb) ≤ ( |Eb| / ((µ^{1/3 − 1/(3α)})/2) + N^{1−1/α} ) / ( |Eb| / µ )

≤ µ / ((µ^{1/3 − 1/(3α)})/2) + N^{1−1/α} = 2µ^{2/3 + 1/(3α)} + N^{1−1/α} ≤ 2µ^{2/3 + 1/(3α)} + 2µ^{1−1/α}

As seen in Figure 5.1, the exponent for µ is minimized when α = 4, and thus

2µ^{2/3 + 1/(3α)} + 2µ^{1−1/α} = 2µ^{2/3 + 1/12} + 2µ^{1 − 1/4} = 4µ^{3/4}

This establishes that weighted unlocking is an O(N^{3/4}) approximation algorithm for Minimum Star Bicoloring. □
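As a numeric sanity check of the exponent balancing in the proof above (this snippet is ours, not part of the dissertation), the two exponents in 2µ^{2/3 + 1/(3α)} + 2µ^{1 − 1/α} cross at α = 4, where both equal 3/4:

```python
# Verify that alpha = 4 equalizes and minimizes the dominant exponent of the bound.

def exponents(alpha):
    return 2 / 3 + 1 / (3 * alpha), 1 - 1 / alpha

a, b = exponents(4)
print(abs(a - 0.75) < 1e-12, abs(b - 0.75) < 1e-12)  # True True

# The dominant exponent max(...) is minimized at alpha = 4 over a range of choices:
best = min(range(2, 100), key=lambda al: max(*exponents(al)))
print(best)  # 4
```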

A similar constant-factor modification to the proof of claim 8 also establishes the O(N^{2/3}) result for weighted unlocking on strongly square matrices.

5.15 Analysis Summary

Prior to our previous results [2], no approximation analysis existed for any of the published algorithms for star bicoloring. The primary purpose of this chapter was to expand and build on the approximation results we obtained for ASBC [2] to provide more generalized results for the class of GIS star bicoloring algorithms.

In this chapter we have shown that the family of algorithms described by the framework presented in chapter 4 accurately solves the problem of star bicoloring. Further, this framework was used to construct the GIS family of related greedy algorithms

in chapter 4, and this current chapter explored how the specific subtle differences between those algorithms affected their approximation complexity. It is immediately interesting to note that the algorithms for which the best approximation ratios were determined do not directly correspond to those algorithms which have obtained the best empirical results. The ASBC and CDC algorithms, both of which were published prior to the completion of this work, served as our comparative baseline. We note that within each approximation ratio class (O(N^{3/4}) algorithms and O(N^{2/3}) algorithms), we successfully presented new algorithms which provide improved results. The look ahead method both improved the ASBC empirical Harwell-Boeing results to within 2% of the best observed and also successfully solved the stair matrices, while achieving the same improved approximation ratio of O(N^{2/3}). For the O(N^{3/4}) category, weighted unlocking achieved the same approximation ratio as CDC while improving on CDC's empirical results by solving the stair matrix class of input matrices, resulting in weighted unlocking achieving the best overall coloring performance of all tested GIS methods from both classes.

For further comparison, also presented in chapter 4 are the two methods semi-brute-force (SBF) and semi-brute-force iterative (SBF-I). These methods are both polynomial time heuristics which approach brute-force methods, to some degree, and were included in this dissertation to provide a reference point suggestive of the likely best-case results. SBF computes an optimal solution for the first log(N) color groups, and finishes the remaining sub-problem with the look ahead method. In contrast, SBF-I repeats the SBF process iteratively, committing to a greedy choice partial solution after every log(N) color groups. Both of these methods provide higher-order polynomial-time performance, and are similar in structure to a bounded exact search approach. These methods performed well, but even with their greater program sophistication and increased CPU resource requirements, SBF required 5 more colors on our test dataset than weighted unlocking, and SBF-I only bettered the weighted unlocking result by 3 colors (less than 1/2%). This seems to suggest that the top performing GIS algorithms are approaching optimal empirical results. Also in this section, using the stair matrix construct, we showed that the MDGIS algorithm sub-family (those GIS algorithms which include v∆ in their independent sets) are subject to a higher (worse) lower bound on their approximation ratios than the other methods studied. This may imply the existence of a better greedy selection metric than merely vertex degree. The results of our accumulated analyses, along with summary empirical results, for the studied GIS methods are summarized in Table 5.1. Also in this section we began to explore the impact of matrix squareness on these

analyses. By introducing the concept of hypothetical “strongly square” matrices, we were able to show improved approximation ratios for many algorithms. Then, by discussing matrices of “nearly square” construction, we showed potential for practical application of this concept. Further exploration of the properties of “square” versus “ribbon” matrices is

left for future work.

5.15.1 Observations

As an aid to future investigation, we observe from the analyses provided in this

chapter that demonstrating that a particular algorithm deletes at least ρ^{1/2} edges per coloring pass is sufficient to also show that such an algorithm deletes at least µ^{1/3} edges per pass. The latter observation is sufficient to provide an O(N^{3/4}) approximation ratio (see sections 5.4 and 5.6). We further observe that establishing that an algorithm will delete at least

√((|Vb1| + |Vb2|)/2) edges per coloring pass will provide an O(N^{2/3}) approximation ratio.

Table 5.1: GIS Methods, Analysis Summary

Method          Coloring      Coloring       Stair Matrix   Strongly Square  Approx.
                Performance   Performance    Greedy Lower   Approx. Ratio    Ratio
                w/o "Stair"   Incl. "Stair"  Bound
ASBC            671           911            O(N^{1/2})     O(N^{2/3})       O(N^{2/3})
CDC             618           858            O(N^{1/2})     O(N^{2/3})       O(N^{3/4})
Maximum         652           892            O(N^{1/2})     O(N^{2/3})       O(N^{2/3})
Neighborhood
Neighborhood    687           695            O(1)           O(N^{2/3})       O(N)
Ratio
Inverse Ratio   652           892            O(N^{1/2})     O(N^{2/3})       undetermined
Ratio M'        699           707            O(1)           O(N^{2/3})       O(N)
Dense Ratio     629           637            O(1)           O(N^{2/3})       O(N)
Look Ahead      628           636            O(1)           O(N^{2/3})       O(N^{2/3})
Weighted        618           626            O(1)           O(N^{2/3})       O(N^{3/4})
Unlocking

The following conjecture was presented in our proposal as one intended direction of inquiry:

Conjecture 1. The general greedy family of star bicoloring algorithms are all O(n^α) approximation algorithms, for some α < 1.

As a result of the analyses in sections 5.9, 5.10 and 5.12 we have shown conjecture 1 to be false.

6 Distance-2 Independent Set Coloring

It is well established that the combinatorial problem of ordering the non-zero elements of a sparse Jacobian matrix for optimized computation can be estimated by heuristically generating a star bicoloring of the corresponding bipartite incidence

structure, Gb(M)[1–3, 7, 14, 25]. In preceding chapters we have presented both prior and new algorithms for star bicoloring, and included expanded analyses establishing that these approaches provide valid star bicolorings. As well, we have developed additional analyses deriving approximation ratios for most of the studied algorithms, demonstrating that they are indeed approximation methods for star bicoloring. Of particular interest in this work have been the algorithms which use Greedy Independent Set to form legal star bicoloring

color groups. These GIS-based algorithms enforce the requirements of star bicoloring (see Definition 1), in part, by requiring that each individual color group be a distance-2 independent set of the derivative graph resulting from deleting all previously colored vertices and their incident edges. We observe that, in all cases for the GIS algorithms

studied, this results in an ordered sequence of distance-2 independent sets (D2IS), each of

which may or may not be a D2IS in the original graph Gb, but is guaranteed to be a D2IS at the time each specific set is chosen as a coloring group. This observation leads to the following natural definitions.

Definition 3. Given a bipartite graph, Gb = ({V1b, V2b}, Eb), a distance-2 independent set coloring of Gb is an ordered partition φ = (Υ1, Υ2, ..., Υk) of sets of vertices in Gb such that, for each 1 ≤ j ≤ k, Υj is a distance-2 independent set in Gb − ∪_{i=1}^{j−1} Υi.

Definition 4. The Bipartite Distance-2 Independent Set Coloring (D2ISC) problem is the decision problem where, given a bipartite graph Gb and an integer k, determine whether there is a distance-2 independent set coloring of Gb using at most k colors, such that each set Υi ∈ φ represents a unique color.

Definition 5. The distance-2 independent set chromatic number, χ2i, of Gb is the minimal

number of colors required for a distance-2 independent set coloring of Gb.
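Definitions 3 through 5 translate directly into a small checker. The sketch below is ours (the adjacency-dict representation and function names are illustrative assumptions); it assumes each color group lies within a single bipartition, as the GIS algorithms guarantee.

```python
def is_d2is(group, adj, alive):
    """group is distance-2 independent in the remaining graph: no two of its members
    share a still-alive common neighbor (group is assumed to lie in one bipartition)."""
    seen = set()
    for v in group:
        for w in adj[v]:
            if w in alive:
                if w in seen:
                    return False      # two group members at distance 2
                seen.add(w)
    return True

def is_d2is_coloring(phi, adj):
    """phi: ordered color groups (Υ1, ..., Υk); per Definition 3, Υj must be a D2IS
    after all earlier groups and their incident edges have been deleted."""
    alive = set(adj)
    for group in phi:
        if not is_d2is(group, adj, alive):
            return False
        alive -= set(group)
    return not alive                  # φ must be an ordered partition of all vertices

# Path a-b-c-d (V1b = {a, c}, V2b = {b, d}):
adj = {'a': ['b'], 'b': ['a', 'c'], 'c': ['b', 'd'], 'd': ['c']}
print(is_d2is_coloring([['a'], ['c'], ['b', 'd']], adj))  # True
print(is_d2is_coloring([['a', 'c'], ['b', 'd']], adj))    # False: a, c share neighbor b
```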

In this chapter we take a brief look at distance-2 independent set coloring (D2ISC) and consider it as a model for star bicoloring. In section 6.1 we show how the two related

problems can diverge, and in section 6.2 we establish that the newly defined (D2ISC) problem is NP-complete.

6.1 The Inequivalence of D2IS Bicoloring and Star Bicoloring

Claim 12. Every distance-2 independent set coloring of Gb is a star bicoloring of Gb.

Proof. Referring to the general framework presented in Figure 4.1, the proof of Claim 12 is provided in section 5.1. 

Corollary to Claim 12: χsb(Gb) ≤ χ2i(Gb).

Theorem 1. There exists a bipartite graph, Gb such that χsb(Gb) < χ2i(Gb).

Proof. Consider the coloring, SBC6.1, of the small graph presented at the top of Figure 6.1. Referring to Definition 1 from Section 2.8 we observe that: 1.) since no edge is incident upon two vertices of the same color, this is a proper coloring; 2.) the color palettes for Vb1 and Vb2 are disjoint; 3.) since no vertices are colored ⊥, Definition 1.3 is true vacuously; and 4.) since there are no paths of edge length three, Definition 1.4 is also true vacuously. Thus SBC6.1 is a valid Star Bicoloring, and χsb ≤ 2 for this graph. Conversely, we first observe that any D2IS coloring of any graph which contains at least one edge uses at least two colors. To see this, notice that each vertex must be assigned to some distance-2 independent set and vertices from the separate bipartitions

least one edge uses at least two colors. To see this, notice that each vertex must be assigned to some distance-2 independent set and vertices from the separate bipartitions 126

cannot be in the same independent set. Hence, at least two colors are required for such a

graph. Referring again to the graph in Figure 6.1, consider the construction of the first maximal distance-2 independent set, Υ1, forming the first color group. Since no maximal independent set within this graph can include all the vertices from either bipartition, at least one edge will remain in G Υ , therefore χ 3 for this graph. b − 1 2i ≥ 

Some of the possible optimal D2IS 3-colorings for this graph are provided in the bottom of Figure 6.1.

Figure 6.1: A Star Bicoloring Which Is Not a Distance-2 Independent Set Coloring (top: a valid optimal star bicoloring, SBC6.1; bottom: possible optimal distance-2 independent set colorings)

We include in the Future Work section investigation of the open problem as to how great the separation between results of Star Bicoloring and Bipartite Distance-2 Independent Set Coloring might be in the general case.

6.2 NP-completeness for D2IS Coloring

Having established that coloring by distance-2 independent sets and star bicoloring are related but distinct problems, it is natural to ask whether the D2IS problem is easier (or harder) to solve than the original star bicoloring problem.

Theorem 2. Bipartite Distance-2 Independent Set Coloring (D2ISC) is NP-complete.

Proof. Our initial goal is to establish that D2ISC is in NP by showing that it is verifiable in polynomial time. For this analysis, we rely on an interactive proof system, or “prover-verifier” model. The “prover” (aka an “oracle”) possesses unlimited resources and

can provide immediate answers to any specified problem. Such a purported answer is then given to a “verifier,” which must validate the claimed answer with finite resources, i.e. it must complete the verification in polynomial time. We stipulate that the prover’s output, coloring φ, will consist of a list of colors, each color having an associated list of vertices having been assigned that color. The verification process therefore becomes one of

identifying successive color groups as independent sets. First, construct a |V| by |V| boolean distance-2 adjacency matrix for the vertices v ∈ {V1b, V2b}, which can be done in O(2|E|) time. For a valid D2IS coloring, at least one of the color groups (Υ1) must be a

distance-2 independent set in the original Gb. Since a valid distance-2 independent set is required to exist, its location can be determined by considering each possible assigned

color, ci, and scanning the rows of the adjacency matrix corresponding to each vertex

where c(v) = ci. In the worst case where the last color group considered is the only independent set, this step may require O(|V|^2) steps (having scanned the entire adjacency matrix). Once an independent set is found, set all the adjacency matrix entries for rows and columns corresponding to members of that independent set color to false, effectively removing those vertices from the graph, and taking ≤ O(|V|^2) time. Since deletion of any given color group, Υi, ensures that the next color group, Υi+1, is now a D2IS, we can

simply repeat the independent set location and removal process until all vertices are eliminated, which will take C iterations, where C is necessarily V . Thus, solution | | | | ≤ | | verification can be completed in no worse than O(2 E + C ( V 2 + V 2)) = O( V 3) time | | | | | | | | | | and D2ISC is in NP. To establish that D2ISC is NP-complete, we provide a reduction from graph coloring, a known NP-complete problem [4]. We first present, following the strategy of Coleman and Cai [36], a k (k + 1) reduction construction which demonstrates a method → of creating an equivalent Distance-2 Independent Set Coloring problem for an arbitrary Graph Coloring problem. We then show that an instance of Graph Coloring, < G, k >, is true the corresponding instance of Distance-2 Independent Set Coloring, ⇐⇒

< Gb, k′ > (where k′ = k + 1), is also true.
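The verification procedure in the membership argument above can be sketched in code. The following is a minimal illustration, not the dissertation's implementation: it assumes an adjacency-list representation of Gb (all names here are hypothetical), and it checks each claimed color group for distance-2 independence among the vertices not yet removed, deleting groups as they are validated.

```cpp
#include <vector>
using namespace std;

// Distance-2 test in the residual graph: u and v are distance-2 neighbors
// if they share a surviving common neighbor w.
bool d2_residual(const vector<vector<int>>& adj, const vector<bool>& removed,
                 int u, int v)
{
    for (int w : adj[u])
        if (!removed[w])
            for (int x : adj[w])
                if (x == v) return true;
    return false;
}

// Verify a claimed D2IS coloring phi (colors 1..k): repeatedly locate a
// color group that is a distance-2 independent set among the surviving
// vertices, then delete that group, until every vertex is eliminated.
bool verify_d2isc(const vector<vector<int>>& adj, const vector<int>& phi, int k)
{
    int n = (int)adj.size();
    vector<bool> removed(n, false), used(k + 1, false);
    int remaining = n;
    while (remaining > 0) {
        int found = -1;
        for (int c = 1; c <= k && found < 0; c++) {
            if (used[c]) continue;
            bool nonempty = false, independent = true;
            for (int u = 0; u < n && independent; u++) {
                if (removed[u] || phi[u] != c) continue;
                nonempty = true;
                for (int v = u + 1; v < n; v++)
                    if (!removed[v] && phi[v] == c &&
                        d2_residual(adj, removed, u, v))
                        independent = false;
            }
            if (nonempty && independent) found = c;
        }
        if (found < 0) return false;   // no remaining group is a residual D2IS
        used[found] = true;
        for (int u = 0; u < n; u++)
            if (!removed[u] && phi[u] == found) { removed[u] = true; remaining--; }
    }
    return true;
}
```

Each pass over the candidate groups costs polynomial time, and at most |C| ≤ |V| groups are removed, consistent with the polynomial bound in the proof.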

6.2.1 Graph Coloring to Distance-2 Independent Set Coloring Reduction Construction

Consider the general graph G = (V, E) which has been assigned a β-coloring. Construct from this graph a bipartite graph, Gb = ({Vb1, Vb2}, Eb), where Vb1 = V, and for each edge e = {u, v} ∈ E, create a set of β + 1 vertices in Vb2 and edges from each of these vertices to u and v in Vb1. This construction, taking O(|V| + β|E|) steps, is clearly polynomial-time. Now assume that assignment φ is a β-coloring of G. We first color the vertices of Vb1 according to φ. For each group of β + 1 vertices in Vb2, we then assign the members of a given group the same color, selecting an arbitrary unused color.
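As a concrete sketch of this construction (the names and the edge-list representation are this author's assumptions, not taken from the dissertation), the gadget for each edge {u, v} of G is a set of β + 1 new Vb2 vertices, each joined to u and v:

```cpp
#include <vector>
#include <utility>
using namespace std;

// Bipartite result: Vb1 vertices are 0..n1-1 (copies of V); Vb2 vertices
// are 0..n2-1; each edge pairs a Vb1 index with a Vb2 index.
struct BipartiteGraph {
    int n1 = 0, n2 = 0;
    vector<pair<int,int>> edges;   // (Vb1 vertex, Vb2 vertex)
};

// Build Gb from G = (V, E) for parameter beta: for every edge {u, v} of G,
// create beta + 1 gadget vertices in Vb2, each adjacent to u and v.
BipartiteGraph build_gb(int n, const vector<pair<int,int>>& E, int beta)
{
    BipartiteGraph gb;
    gb.n1 = n;
    for (const auto& e : E) {
        for (int i = 0; i <= beta; i++) {
            int w = gb.n2++;                    // new Vb2 gadget vertex
            gb.edges.push_back({e.first,  w});
            gb.edges.push_back({e.second, w});
        }
    }
    return gb;
}
```

For a triangle (|V| = 3, |E| = 3) with β = 3, the construction yields 12 gadget vertices and 24 bipartite edges, illustrating the polynomial size of the reduction.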

6.2.2 Graph Coloring Distance-2 Independent Set Coloring →

We first claim that if G has a proper β-coloring, then Gb has a D2ISC (β + 1)-coloring. Consider that any pair of vertices, vc1 ∈ G and vc2 ∈ G, which have the same color, c, cannot have an edge directly connecting them in G. Therefore, by the method of construction, any such pair of vertices when represented in Gb cannot be distance-2 neighbors. Since those vertices of like color are all pairwise not distance-2 neighbors in Gb, they form a distance-2 independent set. This allows the vertices in Vb1 of Gb to receive the same colors as the corresponding vertices in G, which will completely color the Vb1 bipartition. Thus, by coloring the vertices of Vb1 according to the β-coloring of G, all edges would be eliminated from Gb, allowing the vertices of Vb2 to be assigned one additional color and providing a D2ISC (β + 1)-coloring of Gb.

Figure 6.2: Construction for D2ISC Reduction from General Graph Coloring

6.2.3 Distance-2 Independent Set Coloring Graph Coloring →

To complete the proof, we need to establish that if Gb has a D2ISC (β + 1)-coloring, then G is β-colorable. Consider that members of any distance-2 independent set must lie on the same side of the bipartition of Gb. Further, when forming a distance-2 independent set from bipartition Vb2 of Gb, only one vertex from each (β + 1)-sized group (corresponding to an edge in G) could be included in any given distance-2 independent set. Thus, at least β + 1 colors would be required to color all the vertices of Vb2. Further, to increase the potential size of any independent set of vertices in Vb1, all distance-2 paths between a given pair of vertices in Vb1 would have to be eliminated, a preliminary step which would itself take β + 1 colors. Therefore, if Gb has a D2ISC (β + 1)-coloring, then such a coloring must represent an assignment of distance-2 independent sets from Vb1. Since the vertices of each such assigned group have no distance-2 path between them, they have no edge between them in G. Therefore the D2ISC restricted to Vb1 constitutes a proper coloring of G. □

7 Concluding Thoughts and Future Directions

Our central theme in this dissertation has been the investigation of ordering matrix elements for optimal computation, an optimization problem which arises in many diverse scientific and engineering applications. We have illustrated how this problem can be cast as certain combinatorial problems on bipartite graphs, particularly the NP-complete

problem of Star Bicoloring, and summarized the prior research history in this area. We have described a new algorithm, Approximate Star BiColoring, and derived its approximation ratio – the first such analysis for any Star Bicoloring algorithm. Building on those results, we described a framework for a related family of Greedy Independent Set methods for star bicoloring. We compared empirical results for these

methods, along with specific “counter-examples” which establish limitations on certain approaches. We also demonstrated the correctness of these algorithms for Star Bicoloring, and provided expanded approximation analyses for most (all but one) of these methods, as well as for the existing Complete Direct Cover algorithm of Hossain and

Steihaug [1]. We also investigated the effects on the approximability characteristics of these methods of restricting the input to theoretical "strongly square" matrices, and then showed some practical ramifications of those results.

We concluded our current work with an investigation of the differences between published algorithmic models and the requirements of Star Bicoloring. We showed that this leads to the definition of a new problem, Distance-2 Independent Set Coloring, and established its membership in the category of NP-complete problems. With this work as a "jumping-off point," there are several potentially interesting avenues to continue this research. While we have bettered the existing standard for empirical results, there remain some interesting algorithmic options, both within and outside of the GIS framework, that might achieve even better coloring performance.

While we have improved on the upper and lower bounds for approximability, there remains a gap that might be tightened. We have taken an introductory look at the effects of matrix squareness on approximability. This matrix characteristic has an opposite extreme in very wide "ribbon matrices." Establishing some limits based on very wide matrices might combine well with the current square-matrix results. In an entirely different direction, we

have not considered the potential for parallelization techniques. Lastly, we have the newly defined problem of Distance-2 Independent Set Coloring, which creates the potential for expanded or improved analyses based on its simpler problem definition (when compared to Star Bicoloring). The following concluding subsections enumerate some of the areas currently of interest to the author.

7.1 Algorithmic Directions

Certainly, there are many possible avenues to explore for improved empirical results.

Within the GIS framework, the following suggest interesting potential to this author.

• The best of the “ratio methods” was the dense ratio approach. The threshold value for making a ratio decision versus a maximal degree greedy approach was initially set at an arbitrary 66%. Varying this threshold could improve the results of this

method.

• The method within our general framework providing the best results was weighted unlocking, which uses an arbitrary weighting factor (equal to 1 degree per increment) to modify the vertex degree prior to greedy selection. Varying this factor

to make it relatively more or less significant than vertex degree could improve coloring performance.

• The greedy vertex selections mainly use edge-elimination-based criteria. Some allowance for "orphaned vertex creation" may improve the greedy selection.

Orphaned vertices are automatically accumulated into the next color group without

penalty of exclusion of any other vertex in the next D2IS . Preference could be given (via a weighting factor) to greedy selections that maximize orphaned vertex creation.

• Vertex degree is used as the primary ordering method in this study. Other metrics, such as saturation degree or incidence degree, might prove interesting.
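To make the first two tuning suggestions above concrete, the decision logic can be parameterized. The sketch below is purely illustrative (the names, the density measure, and the exact ratio criterion are this author's stand-ins, not the framework's actual code); it selects the next greedy vertex either by a ratio-style comparison or by maximal degree, switching on a tunable density threshold in place of the fixed 66%:

```cpp
#include <vector>
#include <cstddef>
using namespace std;

// Hypothetical parameterized selection: density[v] is the fraction of
// nonzeros in candidate row/column v; "threshold" replaces the fixed 66%.
int select_vertex(const vector<int>& degrees, const vector<double>& density,
                  double threshold /* e.g. 0.66 in the original setting */)
{
    int best = -1;
    for (size_t v = 0; v < degrees.size(); v++) {
        if (degrees[v] == 0) continue;           // skip eliminated vertices
        if (best < 0) { best = (int)v; continue; }
        bool use_ratio = density[v] >= threshold || density[best] >= threshold;
        if (use_ratio) {
            // ratio-style comparison: prefer higher degree per unit density
            double rv = degrees[v]    / (density[v]    + 1e-9);
            double rb = degrees[best] / (density[best] + 1e-9);
            if (rv > rb) best = (int)v;
        } else if (degrees[v] > degrees[best]) { // maximal-degree greedy
            best = (int)v;
        }
    }
    return best;
}
```

Sweeping the threshold over a test suite would then expose how sensitive the coloring counts are to this parameter.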

Outside of the GIS framework, the possibility of specifying a semidefinite programming characterization for D2ISC seems interesting. Also, the definition of D2ISC provides a clearer comparison between the Star Bicoloring problem and the algorithms currently used to solve it, and this renewed insight may inspire improved Star Bicoloring algorithms.

7.2 Further Analyses

• We establish that χ_sb(Gb) ≤ χ_2i(Gb) in the general case. An open question asks if there is an upper bound for χ_2i(Gb) such that χ_2i(Gb) ≤ f(χ_sb(Gb)).

• We establish that the majority of the investigated methods currently fall into two approximability result categories, O(N^(2/3)) algorithms and O(N^(3/4)) algorithms. Can either or both of these limits be improved?

• We establish that O(N^(1/2)) is a hard lower bound for methods which fail on the stair matrices. Is there an analogous construct which defeats the other methods? Specifically, is there a stair matrix variant that would defeat weighted unlocking or look-ahead? (Consider the effect of a stair matrix with two dense columns on look-ahead.)

• We have looked briefly at the effects of matrix "squareness." What would be the effects of very long/wide "ribbon" matrices on the approximation analyses? Would those results combine with the square matrix analyses to provide a tighter general approximation ratio?

7.3 Other Considerations

• Are there efficient methods to parallelize any or all of the studied algorithms? Are there straightforward modifications that would increase inherent parallelism?

• Is there a semidefinite programming characterization for D2ISC?

• Can an improved lower bound for Star Bicoloring be constructed based on an analog of “coloring number” (see section 2.11)?

• Building on [7], is there a pre-computable ordering to provide an optimal greedy

Star Bicoloring solution (see section 2.13)?

References

[1] S. Hossain and T. Steihaug, "Computing a sparse Jacobian matrix by rows and columns," Optimization Methods and Software, vol. 10, pp. 33–48, 1998.

[2] D. Juedes and J. Jones, "Coloring Jacobians revisited: A new algorithm for star and acyclic bicoloring," Optimization Methods and Software, vol. 27, no. 1-3, pp. 295–309, 2012.

[3] T. F. Coleman and A. Verma, "The efficient computation of sparse Jacobian matrices using automatic differentiation," SIAM Journal on Scientific Computing, vol. 18, no. 4, pp. 1201–1233, 1998.

[4] M. R. Garey and D. S. Johnson, Computers and Intractability. Macmillan Higher Education, 1979.

[5] B. M. Averick, R. G. Carter, and J. J. Moré, "The Minpack-2 test problem collection," 1991.

[6] K. Meintjes and A. P. Morgan, "Chemical equilibrium systems as numerical test problems," ACM Transactions on Mathematical Software, vol. 16, no. 2, pp. 143–151, 1990.

[7] A. Gebremedhin, F. Manne, and A. Pothen, "What color is your Jacobian? Graph coloring for computing derivatives," SIAM Review, vol. 47, no. 4, pp. 629–705, 2005.

[8] D. Juedes and J. Jones, "A generic framework for approximation analysis of greedy algorithms for star bicoloring," 2014, in preparation.

[9] D. Juedes and J. Jones, "Distance-2 independent set coloring with applications to efficient calculation of large sparse Jacobians," 2015, in preparation.

[10] A. R. Curtis, M. J. D. Powell, and J. K. Reid, "On the estimation of sparse Jacobian matrices," Journal of the Institute of Mathematics and its Applications, vol. 13, pp. 117–119, 1974.

[11] Boisvert, Pozo, Remington, Miller, and Lipman, "Matrix Market version 3.0." http://math.nist.gov/MatrixMarket, fall 2011.

[12] T. F. Coleman and J. J. Moré, "Estimation of sparse Hessian matrices and graph coloring problems," Mathematical Programming, vol. 28, no. 3, pp. 243–270, 1984.

[13] S. T. McCormick, "Optimal approximation of sparse Hessians and its equivalence to a graph coloring problem," Mathematical Programming, vol. 26, pp. 153–171, 1983.

[14] T. F. Coleman and J. J. Moré, "Estimation of sparse Jacobian matrices and graph coloring problems," SIAM Journal of Numerical Analysis, vol. 20, no. 1, pp. 187–209, 1983.

[15] B. M. Averick, J. J. Moré, C. H. Bischof, A. Carle, and A. Griewank, "Computing large sparse Jacobian matrices using automatic differentiation," SIAM Journal on Scientific Computing, vol. 15, no. 2, pp. 285–294, 1994.

[16] C. G. Billings, H. L. Wei, P. Thomas, S. J. Linnane, and B. D. M. Hope-Gill, “The prediction of in-flight hypoxaemia using non-linear equations,” Respiratory Medicine, vol. 107 (6), pp. 841–847, 2013.

[17] Y. Li, Z. Nie, X. Sun, and X. Zhang, “RIC preconditioning algorithm for solving waveguide problem FEM linear system,” Bandaoti Guangdian/Semiconductor Optoelectronics, vol. 34 (1), pp. 34–37+4, 2013.

[18] P. Kotyczka, “Local linear dynamics assignment in IDA-PBC for underactuated mechanical systems,” Proceedings of the IEEE Conference on Decision and Control, vol. 6160656, pp. 6534–6539, 2011.

[19] M. S. Özeren and N. Postacioglu, "Nonlinear landslide tsunami run-up," Journal of Fluid Mechanics, vol. 691, pp. 440–460, 2012.

[20] M. Haque, “Existence of complex patterns in the Beddington-DeAngelis predator-prey model,” Mathematical Biosciences, vol. 239 (2), pp. 179–190, 2012.

[21] G. J. M. Pieters and H. M. Schuttelaars, “On the nonlinear dynamics of a saline boundary layer formed by throughflow near the surface of a porous medium,” Physica D: Nonlinear Phenomena, vol. 237 (23), pp. 3075–3088, 2008.

[22] D. L. Green and L. A. Berry, “Iterative addition of parallel temperature effects to finite-difference simulation of radio-frequency wave propagation in plasmas,” Computer Physics Communications, vol. Article in Press, 2013.

[23] A. Naik, J. S. Jones, R. Schmidt, R. Al-Ouran, F. Drews, D. Juedes, and L. Welch, “Motif selection using sequence coverage analysis,” 2015 accepted for publication.

[24] L. Dymova, Sevastjanov, and M. Pilarek, “A method for solving systems of linear interval equations applied to the Leontief input-output model of economics,” Expert Systems with Applications, vol. 40 (1), pp. 222–230, 2013.

[25] T. F. Coleman, D. S. Garbow, and J. J. Moré, "Software for estimating sparse Jacobian matrices," ACM Transactions on Mathematical Software, vol. 10, pp. 329–345, 1984.

[26] C. C. Gillespie, ed., Complete Dictionary of Scientific Biography, vol. 7. Charles Scribner’s Sons, 2007.

[27] T. Pisanski and B. Servatius, Configurations from a Graphical Viewpoint. Springer, 2010.

[28] B. Bollobás, Extremal Graph Theory. Dover Publications, 1978.

[29] Ø. Ore, Theory of Graphs. American Mathematical Society Colloquium Publications, 1962.

[30] H. J. Finck and H. Sachs, "Über eine von H. S. Wilf angegebene Schranke für die chromatische Zahl endlicher Graphen," Mathematische Nachrichten, vol. 39, pp. 373–386, 1969.

[31] D. W. Matula, "A min-max theorem for graphs with application to graph coloring," SIAM Review, vol. 10, no. 4, pp. 481–482, 1968.

[32] R. M. Karp, “Reducibility among combinatorial problems,” in Complexity of computer computations, (New York - London), pp. 85–103, Plenum, 1972.

[33] D. J. A. Welsh and M. B. Powell, “An upper bound for the chromatic number of a graph and its application to timetabling problems,” The Computer Journal, vol. 10, pp. 85–86, 1967.

[34] D. Brélaz, "New methods to color the vertices of a graph," Communications of the ACM, vol. 22, pp. 251–256, 1979.

[35] D. Zuckerman, “Linear degree extractors and the inapproximability of max clique and chromatic number,” Theory of Computing, vol. 3, pp. 103–128, 2007.

[36] T. F. Coleman and J. Y. Cai, “The cyclic coloring problem and estimation of sparse Hessian matrices,” SIAM Journal on Algebraic Discrete Methods, vol. 7, no. 2, pp. 221–235, 1986.

[37] M. Lülfesmann and H. M. Bücker, "An efficient graph coloring algorithm for stencil-based Jacobian computations," in 6th SIAM Workshop on Combinatorial Scientific Computing, 2014.

[38] J. A. Telle and Y. Villanger, “FPT algorithms for domination in biclique-free graphs,” in Lecture Notes in Computer Science, vol. 7501, pp. 802–812, Springer-Verlag, 2012.

[39] H. Y. Wei and M. Soleimani, "Three-dimensional magnetic induction tomography imaging using a matrix free Krylov subspace inversion algorithm," Progress In Electromagnetics Research, vol. 122, pp. 29–45, 2012.

[40] B. Hendrickson and A. Pothen, "Combinatorial scientific computing: The enabling power of discrete algorithms in computational science," in Proceedings of the 7th International Meeting on High Performance Computing for Computational Science (VECPAR06), Lecture Notes in Computer Science, pp. 260–280, Springer-Verlag, 2007.

[41] B. Chor, M. Fellows, and D. Juedes, "Linear kernels in linear time, or how to save k colors in O(n²) steps," in Proceedings of the 30th International Conference on Graph-Theoretic Concepts in Computer Science, WG'04, (Berlin, Heidelberg), pp. 257–269, Springer-Verlag, 2004.

[42] M. Li, X. Chen, X. Li, B. Ma, and P. Vitanyi, "The similarity metric," IEEE Transactions on Information Theory, vol. XX, no. Y, 2004.

[43] A. H. Gebremedhin, I. G. Lassous, J. Gustedt, and J. A. Telle, “Graph coloring on coarse grained multicomputers,” 2002.

[44] I. G. Lassous, J. Gustedt, and M. Morvan, “Feasibility, portability, predictability and efficiency: Four ambitious goals for the design and implementation of parallel coarse grained graph algorithms,” Feb 2000.

[45] M. Thorup, “All structured programs have small tree width and good register allocation,” Information and Computation, vol. 142, pp. 159–181, 1998.

[46] U. Geitner, J. Utke, and A. Griewank, “Automatic computation of sparse Jacobians by applying the method of Newsam and Ramsdell,” SIAM Proceedings of the Second International Workshop on Computational Differentiation, pp. 161–172, 1996.

[47] T. F. Coleman and S. Verma, “Structure and efficient Jacobian calculation,” SIAM Proceedings of the Second International Workshop on Computational Differentiation, pp. 149–159, 1996.

[48] D. S. Johnson, "Worst case behavior of graph coloring algorithms," Proceedings of the Fifth Southeastern Conference on Combinatorics, Graph Theory and Computing, pp. 513–528, 1974.

Appendix 1: The class Matrix_M

Annotated interface file definition for the class Matrix_M. This class hides the implementation considerations of matrix storage; the simple 2D array data structure could be replaced or expanded for large data.

/*************************************************************************\
* Matrix_M.h INTERFACE FILE
*
* Jeffrey S. Jones, Ohio University, Fall '11
\*************************************************************************/

#include <fstream>
#include <iostream>
#include <cstdlib>

using namespace std;

#define SUCCESS 0
#define FAIL    1
#define BOTTOM  -1

typedef float* MatrixLine;

inline int MAX(int a, int b)
{
    return a >= b ? a : b;
}

class Matrix_M
{
public:
    void load_matrix(ifstream& Minput, int rows, int cols, int& num_edges);
    void build_out_mats(int rows, int cols);
    void vertex_degrees(int rows, int cols, int row_colors[], int col_colors[],
                        int row_degrees[], int col_degrees[]);
    int  greedy_row_compress(int rows, int cols, int row_colors[], int row_degrees[],
                             int col_colors[], int sorted_rows[], bool result_set[]);
    int  greedy_col_compress(int rows, int cols, int col_colors[], int col_degrees[],
                             int row_colors[], int sorted_cols[], bool result_set[]);
    bool valid_star_bicoloring(int declared_rows, int declared_cols,
                               int row_colors[], int col_colors[]);
private:
    bool some_unlocked(bool locks[], int num_locks);
    MatrixLine *M_in, *KsubR, *KsubC;
};

The load_matrix method uses the specified rows and cols for error checking, and also tallies the number of non-zeroes in the input matrix, returning this value in the call-by-reference parameter num_edges (the number of non-zeroes is equivalent to the number of edges).

build_out_mats is not currently used.

In vertex_degrees, rows and cols reflect the declared size of the input matrix. row_colors and col_colors specify the current state of the coloring, with uncolored vertices having color 0. This determines the current derivative graph, M', and allows computation of the degrees of all vertices in that derivative graph. The degrees are returned in the array parameters row_degrees and col_degrees.

In greedy_row_compress, rows and cols reflect the declared size of the input matrix; row_colors and col_colors determine the derivative graph M'. sorted_rows determines the "greedy" ordering of the rows, with the first element [0] being the least desirable and the last element [rows-1] being the most desirable, the one selected as the "seed vertex." The GIS is returned via the boolean array parameter result_set. row_degrees is used to determine the number of edges eliminated by the formed GIS, and this value (which is equivalent to the cardinality of the neighborhood of the GIS) is returned. greedy_col_compress is analogous to greedy_row_compress.

valid_star_bicoloring provides a check that the coloring specified by row_colors and col_colors is a valid and complete star bicoloring.

The helper function some_unlocked determines if there are still rows or columns which are available for inclusion in the current GIS being formed. M_in is a simple 2D array representation of the input matrix M; KsubR and KsubC are for future functionality and are not currently used.
144 * * * * 1 /***************************************************************************************\ 2 * Matrix_M IMPLEMENTATION FILE 3 * 4 * Jeffrey S.5 Jones, \***************************************************************************************/ Ohio University, Fall6 ’11 #include "Matrix_M.h" 7 using namespace std; 8 9 10 /***************************************************************************************\ 11 * the input12 matrix * file should have13 white-space precisely * separated "rows" floating number point of14 the entries rows, \***************************************************************************************/ number on with of each "cols" edges line. represented while by loading, the determine matrix * (non-zero entries) * load_matrix assumes fairly well-behavederror data, checking. but provides Notice, aabout referring place the to to sparsity the segregate pattern, private any any data desired valid structure floating M_in, point input that values although are we accepted. 
only care 145 cerr << "failed toexit(FAIL); read data item, row: " << i << " col: " << j << endl; Minput >> M_in[i][j]; if ( Minput.fail() ) { 15 void Matrix_M::load_matrix(ifstream& Minput,16 int { rows, int cols,17 int& char num_edges) 18 float verify; 19 over_run; 2021 M_in = new22 MatrixLine[rows]; for ( int23 i = 0; i24 < num_edges rows; = i++ 0; )25 M_in[i] for = ( new int float[cols]; 26 i { = 0; i27 < rows; i ++28 ) for ( int j { = 0; j < cols; j++ ) 29 30 31 32 146 || (verify == ’\t’) ) Minput.get(verify); } if ( M_in[i][j] != 0.0 ) num_edges++; cerr << "input rowexit(FAIL); terminator not found, row: " << i << endl; 33 39 404142 if ( verify != { ’\n’ ) 34 353637 } 38 Minput.get(verify); while ( (verify == ’ ’) 43 444546 } } 4748 if ( Minput49 >> { over_run ) 50 cerr << "input data exit(FAIL); beyond last row\n"; 147 55 /***********************************************************************\ 56 * initialize the57 row-compressed \***********************************************************************/ and column-compressed output58 matrices void Matrix_M::build_out_mats(int rows, * 59 int { cols) 6061 KsubR = new62 MatrixLine[rows]; KsubC = new63 MatrixLine[rows]; for ( int64 i { = 0; i65 < rows; i++ ) 66 KsubR[i] = new float[cols]; } KsubC[i] = new float[cols]; 5152 } } 53 54 this function is for future enhancement and is not currently used 148 KsubR[i][j] = 0.0; KsubC[i][j] = 0.0; 67 6869 for ( int70 i { = 0; i71 < rows; i ++72 ) for ( int j { = 0; j < cols; j++ ) 73 747576 } } } 77 78 79 /***********************************************************************\ Given input matrix M,specified a by descendant row_colors matrix, and M’,represents col_colors. is rows determinable and color by columns 0 theing yet is list between to unused of uncolored be by current vertices. colored. 
the colors coloring vertex_degrees algorithm, need so only count edges remain- 149 * * int row_degrees[], int col_degrees[]) for ( j ={ 0; j < cols; j++ ) 80 * given the81 current * set of uncolored degree rows of and each columns, remaining determine vertex the * 82 * colored vertices83 are \***********************************************************************/ set to degree84 0 void Matrix_M::vertex_degrees(int rows, int cols, int row_colors[],85 int { col_colors[], 86 int i,j; 87 8889 for ( i90 = for 0; ( i i <91 = rows; 0; i++ i ) <92 cols; row_degrees[i] for i++ = ( ) 0; i93 = col_degrees[i] { 0; = i 0; <94 rows; i++ ) 95 if ( row_colors[i] == { 0 ) 96 150 row_degrees[i]++; col_degrees[j]++; if ( M_in[i][j] != 0.0 ) { } if ( col_colors[j] =={ 0 ) } } 97 98 99 100 101 102 103 104 106107108 } } } 109 110 105 As mentioned above, greedy_row_compressto uses form the a vertex GIS priorityresult_set, of established and a by the derivative sorted_rows return matrix code M’. is the The number resultant of set edges is which returned would via be the eliminate array by parameter this GIS, 151 * int col_colors[], int sorted_rows[], bool result_set[]) 115 \***********************************************************************/ 116 int Matrix_M::greedy_row_compress(int rows, int cols, int row_colors[],117 int { row_degrees[], 118 bool unlocked_rows[rows]; 119 int i,120 j, int eliminatable; cur_row,121 cur_row_rank; 122123 for ( i124 = { 0; i <125 rows; i++) 126 unlocked_rows[i] = true; } result_set[i] = false; 111 /***********************************************************************\ 112 * perform a113 greedy * row compression on114 the uncolored * number rows of of vertices M_in, seed which returning row can * be eliminated with the selected * which is equivalent to the cardinality of the neighborhood of the GIS. 
152 * = rows-1; = sorted_rows[cur_row_rank]; = row_degrees[cur_row]; 136137 \***********************************/ 138 for ( i139 = { 0; i < rows; i ++ ) if ( (row_colors[i] != 0 ) 127 cur_row_rank 128129 cur_row 130 eliminatable 131 result_set[cur_row]132 unlocked_rows[cur_row] = = true; false; 133134 /***********************************\ 135 * lock rows already * colored and 0-degree rows * unlocked rows are thoseGIS. still available we for lock considerationcolored, rows to or when be when they added their are intois degree colored, the it is when currently a zero they forming property (0-degree are of row neighbors-of-neighbors the logic of coloring is rows process. handled being within the main function as 153 unlocked_rows[i] = false; for ( j ={ 0; j < rows; j++ ) || (row_degrees[i] == 0 ) ) && (M_in[cur_row][i] != 0.0) ) 143144145 } } 146147 /***********************************\ 148 * lock rows149 for \***********************************/ primary row150 for ( i151 = { 0; i < * 152 cols; i ++ ) if ( (col_colors[i]155 == 0 ) 140 141142 { 153154 { the "primary row" is the seed vertex. 154 if ( M_in[j][i] != 0.0 ) unlocked_rows[j] = false; } 158159160 } } 161162 while ( some_unlocked(unlocked_rows, rows) { ) 163164 /***********************************\ 165 * find next166 (greedy) \***********************************/ unlocked row * cur_row_rank--; 156 157 if some_unlocked is true,current then color some group non-zero-degree vertices remain which may be added to the sorted_rows contains the listentry of is row the vertices most inmined "desirable" non-decreasing by row, order the and of calling therefore desirability. routine, the and seed The is vertex. last independent of The the sorting actual criteria matrix is compression deter- process. 
155 += row_degrees[cur_row]; && (M_in[cur_row][i] != 0.0) ) if ( (col_colors[i] == 0) 180 167168169 while ( (!unlocked_rows[sorted_rows[cur_row_rank]])170 ) cur_row = sorted_rows[cur_row_rank]; cur_row_rank--; eliminatable 171172173 unlocked_rows[cur_row] = false; result_set[cur_row] = true; 174175 /***********************************\ 176 * lock rows177 for \***********************************/ row just compressed178 * 179 for ( i = { 0; i < cols; i ++ ) lock the next vertex being colored lock distance-2 neighbors of vertex just added to color group 156 if ( M_in[j][i] != 0.0 ) unlocked_rows[j] = false; for ( j ={ 0; j < rows; j++ ) } { } 182 187188189 } } 190191 } return(eliminatable); 192 193 181 183 184 186 185 eliminatable is the numberstructure of by edges this which color wouldjust-formed group, be GIS. and eliminated is from equal the to bipartite the incidence cardinality of the neighborhood of the 157 * int row_colors[], int sorted_cols[], bool result_set[]) 198 \***********************************************************************/ 199 int Matrix_M::greedy_col_compress(int rows, int cols, int col_colors[],200 int { col_degrees[], 201 bool unlocked_cols[cols]; 202 int i,203 j, int eliminatable; cur_col,204 cur_col_rank; 194 /***********************************************************************\ 195 * perform a196 greedy * column compression on197 returning uncolored * the columns number of of M_in, selected vertices seed which column can be eliminated * with the * As mentioned above, greedy_col_compressto uses form the a vertex GIS priorityresult_set, of established and a by the derivative sorted_cols return matrixwhich code M’. is is equivalent the The to number resultant the of set cardinality edges is of which returned the would viaFor neighborhood be the specific of eliminate array comment, the by parameter please GIS. this refer GIS, to the analogous row compression routine above. 
  for ( i = 0; i < cols; i++ )
  {
    unlocked_cols[i] = true;
    result_set[i]    = false;
  }

  cur_col_rank = cols-1;
  cur_col      = sorted_cols[cur_col_rank];
  eliminatable = col_degrees[cur_col];
  result_set[cur_col]    = true;
  unlocked_cols[cur_col] = false;

  /***********************************\
   * lock columns already colored    *
   * and 0-degree columns            *
  \***********************************/
  for ( i = 0; i < cols; i++ )
    if ( (col_colors[i] != 0)
      || (col_degrees[i] == 0) )
    {
      unlocked_cols[i] = false;
    }

  /***********************************\
   * lock columns for primary column *
  \***********************************/
  for ( i = 0; i < rows; i++ )
    if ( (row_colors[i] == 0)
      && (M_in[i][cur_col] != 0.0) )
    {
      for ( j = 0; j < cols; j++ )
      {
        if ( M_in[i][j] != 0.0 )
          unlocked_cols[j] = false;
      }
    }

  while ( some_unlocked(unlocked_cols, cols) )
  {
    /*******************************************\
     * find next (greedy) unlocked column      *
    \*******************************************/
    cur_col_rank--;
    while ( !unlocked_cols[sorted_cols[cur_col_rank]] )
      cur_col_rank--;
    cur_col = sorted_cols[cur_col_rank];

    unlocked_cols[cur_col] = false;
    eliminatable += col_degrees[cur_col];
    result_set[cur_col] = true;

    /*******************************************\
     * lock columns for column just compressed *
    \*******************************************/
    for ( i = 0; i < rows; i++ )
      if ( (row_colors[i] == 0)
        && (M_in[i][cur_col] != 0.0) )
      {
        for ( j = 0; j < cols; j++ )
        {
          if ( M_in[i][j] != 0.0 )
            unlocked_cols[j] = false;
        }
      }
  }

  return(eliminatable);
}

/***********************************************************************\
 * check if rows/cols remain unlocked                                  *
 * a useful utility routine to allow for better code readability       *
\***********************************************************************/
bool Matrix_M::some_unlocked(bool locks[], int num_locks)
{
  for ( int i = 0; i < num_locks; i++ )
  {
    if ( locks[i] )
      return(true);
  }
  return(false);
}

/***************************************************************************************\
 * verify the assigned coloring is a valid star bicoloring                             *
 *                                                                                     *
 * the last step in each of our implemented algorithms is to verify that we have       *
 * constructed a valid star bicoloring.                                                *
 *                                                                                     *
 * there are four conditions to a valid star bicoloring:                               *
 *   1. no two adjacent vertices have the same color                                   *
 *      vacuously true for this application, as row and column vertices use           *
 *      disjoint color sets                                                            *
 *   2. the V1 and V2 color classes use disjoint sets of colors                        *
 *      again, vacuously true for the same reason, as the two color palettes are       *
 *      indexed independently in this application                                      *
 *   3. if any pair of vertices are connected to any third vertex colored "bottom,"    *
 *      then the pair of vertices have different colors                                *
 *      checked by this routine                                                        *
 *   4. any path of edge length 3 (4 vertices) uses >= 3 colors                        *
 *      checked by this routine                                                        *
 *                                                                                     *
 * algorithmic notes:                                                                  *
 * note that in a bi-coloring, not all rows nor columns necessarily receive a          *
 * color. this is the case, for example, when column compression colors all            *
 * remaining lines, eliminating the remaining edges and leaving perhaps many           *
 * rows uncolored. rows and columns remaining color 0 at the completion of the         *
 * algorithm are considered "bottom" vertices of the "bottom" technique.               *
 *                                                                                     *
 * this verification utility uses the knowledge that the color palettes for rows       *
 * and columns are disjoint, and that the completion criterion for all algorithms      *
 * is the exact elimination of all edges (eliminating "too many" edges is an           *
 * error condition). therefore, conditions #1 and #2 are vacuously true. this          *
 * routine could be enhanced with simple checks of conditions #1 and #2.               *
\***************************************************************************************/
bool Matrix_M::valid_star_bicoloring(int declared_rows, int declared_cols,
                                     int row_colors[], int col_colors[])
{
  int row, col1, col2;
  int column, row1, row2;
  int node1, node2, node3, node4;
  int num_colors;

  /***************************************************************\
   * #3 check for disjoint colorings for uncolored or "BOTTOM"   *
   *    lines -- no two neighbors of a "bottom" vertex can have  *
   *    the same color.                                          *
  \***************************************************************/
  for ( row = 0; row < declared_rows; row++ )
    if ( (row_colors[row] == BOTTOM)
      || (row_colors[row] == 0) )
    {
      for ( col1 = 0; col1 < (declared_cols-1); col1++ )
        if ( (M_in[row][col1] != 0)
          && (col_colors[col1] > 0) )
        {
          for ( col2 = (col1+1); col2 < declared_cols; col2++ )
            if ( (M_in[row][col2] != 0)
              && (col_colors[col1] == col_colors[col2]) )
            {
              cerr << "conflicting column color " << col_colors[col1]
                   << " on bottom-color row, columns " << col1
                   << " and " << col2 << ", row " << row << endl;
              return(false);
            }
        }
    }

  for ( column = 0; column < declared_cols; column++ )
    if ( (col_colors[column] == BOTTOM)
      || (col_colors[column] == 0) )
    {
      for ( row1 = 0; row1 < (declared_rows-1); row1++ )
        if ( (M_in[row1][column] != 0)
          && (row_colors[row1] > 0) )
        {
          for ( row2 = (row1+1); row2 < declared_rows; row2++ )
            if ( (M_in[row2][column] != 0)
              && (row_colors[row1] == row_colors[row2]) )
            {
              cerr << "conflicting row color " << row_colors[row1]
                   << " on bottom-color column, rows " << row1
                   << " and " << row2 << ", column " << column << endl;
              return(false);
            }
        }
    }

  /***************************************************************\
   * #4 check 3 colors among length-3 edge groups                *
   *    any path of edge length 3 must use 3 or more colors.     *
   *    to violate this rule, both row vertices on the path      *
   *    must have the same color, and both column vertices on    *
   *    the path must have the same color.                       *
  \***************************************************************/
  for ( node1 = 0; node1 < declared_rows; node1++ )
    if ( row_colors[node1] > 0 )
    {
      for ( node2 = 0; node2 < declared_cols; node2++ )
        if ( (M_in[node1][node2] != 0)
          && (col_colors[node2] > 0) )              /* find leg-1 edges */
        {
          for ( node3 = 0; node3 < declared_rows; node3++ )
            if ( (node1 != node3)
              && (M_in[node3][node2] != 0)
              && (row_colors[node3] > 0) )          /* find leg-2 edges */
            {
              for ( node4 = 0; node4 < declared_cols; node4++ )
                if ( (node2 != node4)
                  && (M_in[node3][node4] != 0)
                  && (col_colors[node4] > 0) )      /* find leg-3 edges */
                {
                  /* count colors per path, so one valid path does not
                     mask a later violation on the same leg-2 pair */
                  num_colors = 2;                   /* guaranteed by disjoint color sets */
                  if ( row_colors[node1] != row_colors[node3] )
                    num_colors++;
                  if ( col_colors[node2] != col_colors[node4] )
                    num_colors++;
                  if ( num_colors < 3 )
                  {
                    cerr << "edge length 3 with fewer than 3 colors:\n"
                         << "\tr1: " << node1 << " color: " << row_colors[node1] << "\n"
                         << "\tc2: " << node2 << " color: " << col_colors[node2] << "\n"
                         << "\tr3: " << node3 << " color: " << row_colors[node3] << "\n"
                         << "\tc4: " << node4 << " color: " << col_colors[node4] << "\n"
                         << endl;
                    return(false);
                  }
                }
            }
        }
    }

  return(true);
}
