
HIDDEN VARIABLE RESULTANT

APPROACH FOR CLASSICAL COMPUTER

VISION PROBLEMS

by

JORDI NONELL PARE

B.S., Universitat Politècnica de Catalunya, 2016

M.S., Universitat Politècnica de Catalunya, 2016

A thesis submitted to the Graduate Faculty of the

University of Colorado Colorado Springs

in partial fulfillment of the

requirements for the degree of

Master of Science

Department of Computer Science

2018

This thesis for the Master of Science degree by

Jordi Nonell Paré

has been approved for the

Department of Computer Science

by

Terrance Boult, Chair

Jonathan Ventura

Sudhanshu Semwal

December 12th, 2018

Nonell Paré, Jordi (M.S., Computer Science)

Hidden Variable Resultant Approach for Classical Computer Vision Problems

Thesis directed by Professor Jonathan Ventura.

Abstract

Many problems in computer vision may be formulated as minimal problems: problems that require a minimal number of inputs and whose solution amounts to solving a system of non-linear equations with a finite number of solutions. Problems like relative and absolute camera pose computation fall into this category.

These systems usually do not have a straightforward solution, which makes the general algorithms for solving systems of polynomials not performant enough. This raises the need to develop specific algorithms that solve those particular problems.

This thesis reviews the state of the art of the current solvers for these problems, and proposes a novel method to tackle them based on combining resultant theory with Gröbner basis methods and numerical optimization.

Acknowledgments

I would like to start by thanking Dr. Ventura for introducing me to the world of computer vision, first as my professor and later as my advisor, and for all the help, patience and understanding he showed as I was learning about this topic. In addition, I would like to express my gratitude to the committee for providing comments and suggestions on a niche topic like this one.

I would also like to make a special mention of the Balsells Foundation, whose fellowship has granted me the incredible opportunity to go through this master's program.

And finally, I would like to thank my family and all my friends: the ones from Barcelona, with their evenings spent in video calls, and the new ones from Colorado, with whom I have created unbelievable memories.

Contents

1 Introduction 1

2 Contribution of the thesis 3

3 State-of-the-Art 4

3.1 Systems of polynomial equations 4

3.1.1 Numerical methods 5

3.1.2 Algebraic methods 5

3.2 Minimal problems in computer vision 8

4 Solving systems of polynomial equations 10

4.1 Background theory 10

4.2 Gröbner bases 13

4.3 Elimination templates 17

4.4 Hidden variable resultant 18

5 Minimal problems 19

5.1 Preliminaries 20

5.2 Five-point relative pose 21

5.3 Perspective-3-point 25

6 Conclusion 29

Bibliography 31

List of Figures

5.1 Five-point relative pose problem schematic 22

5.2 Residuals comparison for the five-point relative pose problem 24

5.3 Time comparison for the five-point relative pose problem 25

5.4 Perspective-3-point schematic 26

5.5 Residuals comparison for the perspective-3-point problem 28

5.6 Time comparison for the perspective-3-point problem 28

List of Tables

4.1 Multiplication table in quotient ring 15

Chapter 1

Introduction

Many problems in computer vision can be formulated as systems of non-linear polynomial equations. One example is estimating the camera's absolute pose, which means determining the position and orientation of the camera, and possibly also the intrinsic parameters, depending on the problem. Improving the efficiency of these problems has an impact on several applications, from 3D reconstruction to robotics and augmented reality.

Solving a system of non-linear polynomial equations is a well-studied problem in mathematics, for which different mathematical methods have already been developed. Building on those, specific solvers for computer vision problems have been developed and improved over and over. Those methods can be classified roughly into two big approaches: numerical methods, which we won't consider in this thesis, and algebraic methods.

Solving a system of non-linear polynomial equations is an old and well-studied problem, whose solutions can roughly be divided into two big groups: numerical methods and algebraic methods. In this thesis we focus on the algebraic approach. The main problem surrounding the existing mathematical methods is that, being so general, they are usually not designed to find the solutions as efficiently and as fast as computer vision applications require.

Among the algebraic methods the problem is usually limited to a minimal problem, which means that only the minimal number of point correspondences is used to find the solution. Since this approach is susceptible to failure because of outliers, these algorithms are usually implemented inside a RANSAC [5] framework, which in turn belongs to a bigger system. These surrounding conditions impose a particular set of performance requirements on computer vision problems, implying that the solvers may need to work in real-time situations.

The fact that there is a simpler subset of problems that provides value to the computer vision community also means that particular solutions for this subset of concrete problems can be found. In other words, there is no need for an all-purpose general solver for systems of polynomial equations, and the concrete solver is allowed to take advantage of the particular conditions of the problem. That usually translates into precomputing parts of a general problem so it can be solved faster with concrete coefficients afterwards.

In this thesis we revisit some of the current approaches to solving these systems using algebraic methods, and discuss their advantages and drawbacks. We also explain our general approach to tackling computer vision problems, and how it can generalize and be used for many different problems.

Chapter 2

Contribution of the thesis

The focus of this thesis is on how to apply algebraic methods to solve different computer vision problems. It aims to test a new method on two well-known classic minimal problems, to check whether it can improve the current state of the art in some cases.

The main contribution of this thesis can be found in the algebraic development in chapter 5, as our method is mathematically proven to be correct, i.e. to yield the correct results, by using the tools and theorems provided by the state-of-the-art contributions to algebraic geometry.

After that, this thesis checks the performance of this method with synthetic data on two different problems, the five-point relative pose problem and the perspective-3-point problem, to assess whether this method is suitable for real-world data.

Chapter 3

State-of-the-Art

In this chapter we summarize the state of the art on how to solve non-linear systems of polynomial equations, and then we describe the state of the art for problems in computer vision, particularly the so-called minimal problems.

3.1 Systems of polynomial equations

A system of polynomial equations is a system that takes the following form

f1(x) = ... = fm(x) = 0 (3.1)

where x ∈ C^n and fi(x) ∈ K[x1, ..., xn].

Solving systems of polynomial equations, both linear and non-linear, is an ancient problem that has been approached from many different perspectives for centuries. Therefore, we can find many different approaches to tackling it. A common way to classify these approaches is to divide them into numerical methods and algebraic methods. This thesis focuses mostly on algebraic methods, but we will talk briefly about the former.

3.1.1 Numerical methods

These methods can be divided into two main groups.

The first big group of numerical methods to solve a system of polynomials is the iterative methods. These methods take an initial guess of the solution of the system, and successively obtain better approximations to the actual roots. They are mostly developed from Newton's method for one-variable polynomials, and can be extended to systems of multiple equations and several variables in a diverse number of ways, changing the order of convergence of the method to the actual solution. The biggest problem with these methods is that they require a close enough initial guess to each of the solutions so the method can converge, and that guess is not always easy to find.
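As an illustration of the iterative approach, a minimal sketch of the multivariate Newton iteration, using numpy; the two-equation system below is an assumption made for the example, not one of the vision problems discussed later:

```python
import numpy as np

def newton_system(f, jac, x0, tol=1e-12, max_iter=50):
    """Multivariate Newton iteration for a square system f(x) = 0."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        fx = f(x)
        if np.linalg.norm(fx) < tol:
            break
        # Solve J(x) dx = -f(x) and update the current iterate.
        x = x + np.linalg.solve(jac(x), -fx)
    return x

# Toy system (illustrative): x^2 + y^2 - 5 = 0, x*y - 2 = 0.
f = lambda v: np.array([v[0]**2 + v[1]**2 - 5, v[0] * v[1] - 2])
jac = lambda v: np.array([[2 * v[0], 2 * v[1]], [v[1], v[0]]])

# Converges quadratically here, but only because the guess is near (2, 1);
# a guess near a different root would converge to that root instead.
root = newton_system(f, jac, x0=[1.8, 1.2])
```

The need for a good initial guess for every root is exactly the drawback described above.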

The second numerical approach is based on homotopy methods. These methods try to avoid the initial-guess problem that iterative methods have, and take a completely different approach. First, a system is constructed with the same structure and number of solutions as the original system, but whose solutions are already known. Next, that system is transformed into the original system, keeping track of how the roots move as the system gets transformed. In practical terms, homotopy methods require a huge amount of computation to transform one system into the other.

3.1.2 Algebraic methods

Algebraic methods are based on partially solving the system so the different variables can be eliminated, and thus the problem is later reduced to finding the roots of a univariate polynomial. This kind of approach relies on obtaining the roots with enough precision, and is therefore not suitable for larger systems or high-degree polynomials, since the time spent in computation will eventually be excessive. But for the kind of problems found in computer vision, those methods are still tractable and produce results with the desired accuracy.

In chapter 4 we talk in greater detail about the different state-of-the-art algebraic methods, so here we expose at a high level the main groups and their differences.

Resultant methods

Resultants are the classic approach to variable elimination, initially developed around the late 1800s.

The idea behind these methods is that we can keep adding new equations, built as polynomial multiples of the original ones, to a system of polynomials without modifying its roots, until at some point we have as many different equations as the total number of monomials in the system. In that situation, every monomial can be treated as an independent variable and we can therefore use all the strategies available for linear systems.

A common tool used to solve such systems of polynomials is the multivariate resultant, which is a new polynomial that is equal to zero if and only if the polynomials used to create it have a common root. The roots of the resultant can then be used to compute the original solutions of the problem.
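The elimination step can be seen on a toy bivariate system (an illustration using sympy, not one of the vision problems treated later): the resultant with respect to x is a polynomial in y alone whose roots are exactly the y-coordinates of the common roots.

```python
import sympy as sp

x, y = sp.symbols('x y')
f1 = x**2 + y**2 - 5
f2 = x*y - 2

# The resultant w.r.t. x is a polynomial in y alone; it vanishes exactly
# at the y-values where f1 and f2 share a common root in x.
res = sp.resultant(f1, f2, x)    # y**4 - 5*y**2 + 4
ys = sp.solve(res, y)            # roots are {-2, -1, 1, 2}

# Back-substitute each y to recover the matching x and the full solutions.
sols = [(sp.solve(f2.subs(y, yv), x)[0], yv) for yv in ys]
```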

Many of the problems in computer vision end up involving sparse matrices, therefore a special extension of resultant theory taking advantage of that fact has also led to better algorithms.

Gröbner bases

A different approach to obtaining variable elimination relies on the connections between systems of polynomials, algebraic varieties and ideal theory.

This theory emerges from the idea of transforming the original system of equations into a new system without modifying its solutions, while at the same time obtaining a system with some interesting properties that make it more tractable.

In chapter 4 this idea will be mathematically explained, but as a general idea, a system of polynomials defines an ideal, which is an infinite set of polynomials that share the same roots. A polynomial belongs to that ideal if it can be generated from the polynomials that define the ideal. From this concept emerges the idea of a "smallest" generating set, one that can be used to divide any other polynomial in the ideal, and this set is called a Gröbner basis for that ideal. Later on, using this particular basis, the roots of the system can be "easily" obtained.

The downside of this method is that computing Gröbner bases is an EXPSPACE-complete problem [6], which means that an amount of space exponential in the input is needed to compute the solution. Moreover, the ordering in which these bases are constructed greatly affects the computation time: orderings like the Graded Reverse Lexicographic order (grevlex) in general perform well, while the simple Lexicographic order (lex) tends to be more costly.

This theory has greatly improved since it was developed by Bruno Buchberger in 1965, together with his algorithm to obtain Gröbner bases, Buchberger's algorithm. This algorithm is based on successively adding new polynomials constructed from the polynomials already in the basis, and then filtering out all the polynomials that are divisible by other polynomials in the same basis. Because of how the algorithm is designed, the order in which new polynomials are generated and removed greatly affects its performance, implying that sometimes it does not even finish in a reasonable amount of time.

Faugère improved the performance of this algorithm with two new algorithms, F4 and F5 [3][4], which greatly improved the ability to compute a specific Gröbner basis for a problem, although we do not expose their technical improvements in detail in this thesis.

Once a Gröbner basis is found, it can be used in several ways to obtain the solutions of the system. The most straightforward one (and therefore one of the least efficient ones) is to compute the Gröbner basis in Lexicographic order for the system. By properties of the lexicographic order, the last polynomial in the basis will be a polynomial in only one variable, so using back-substitution we can solve the whole system. Since the problem with this approach is that computing a Gröbner basis w.r.t. the lexicographic ordering is expensive, the basis can be computed in another ordering (usually grevlex) and then converted from one ordering to another using one of several available methods.
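The lex back-substitution strategy can be sketched with sympy on a small illustrative system (chosen for this example only, not taken from the vision problems later on):

```python
import sympy as sp

x, y = sp.symbols('x y')

# Lex Groebner basis of the toy system x^2 + y^2 - 5 = 0, x*y - 2 = 0.
G = sp.groebner([x**2 + y**2 - 5, x*y - 2], x, y, order='lex')

# With lex ordering the last basis element is univariate in y ...
ys = sp.solve(G.exprs[-1], y)

# ... and x follows by back-substituting each root into the first element.
sols = [(sp.solve(G.exprs[0].subs(y, yv), x)[0], yv) for yv in ys]
```

The same basis computed w.r.t. grevlex would be cheaper to obtain but would not have this triangular, back-substitutable shape.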

Lastly, the main method for solving these systems using Gröbner bases is to use the basis to craft a special matrix called the "action" matrix, which encodes the operation of multiplying a polynomial by another one in the quotient ring defined by the Gröbner basis. If the multiplier polynomial is chosen appropriately, the eigenvalues of this matrix will correspond to partial roots of the system, i.e. solutions for one of the variables.

3.2 Minimal problems in computer vision

The set of points that is a solution to a system of polynomials can take different forms. For instance, in R3, the set of equations {x − y = 0, z − 3 = 0} defines a line, which means that there is an infinite number of points that satisfy these equations. On the other hand, we can have a system with more equations than the actual number of parameters, and therefore a system for which there does not exist any point that simultaneously satisfies all the equations.

The problems that are exposed in this thesis don't belong to either of those categories. These problems are defined and obtained by using only the minimal number of equations necessary to obtain a system whose solutions form a zero-dimensional space, i.e. there is a finite set of solutions. That is why they are known as minimal problems in the computer vision community.

Focusing on this particular approach makes sense from a practical point of view, since minimizing the number of points needed to solve a problem reduces the chances of getting an outlier among those points, and that, combined with a RANSAC [5] framework, leads to solutions that don't require iterative numerical methods.

Several minimal problems are considered in this thesis, and they are explained in more detail in chapter 5.

Chapter 4

Solving systems of polynomial equations

In this chapter we review in greater detail the algebraic methodology introduced in chapter 3, expose the successive improvements and finally introduce our approach, the hidden variable resultant. To do so, we first overview the required background theory, usually taking the same definitions and examples as the ones found in [1] and [2].

In this whole chapter, we aim to solve the following problem: given a set of polynomial equations, find the set of points (the variety) that satisfies all the equations. Therefore, we consider a system of polynomial equations as a system that takes the following form:

f1(x) = ... = fm(x) = 0 (4.1)

where x = (x1, ..., xn) ∈ C^n and fi(x) ∈ K[x1, ..., xn].

4.1 Background theory

Definition 1 A monomial in a collection of variables x1, ..., xn is a product

x1^α1 x2^α2 ··· xn^αn (4.2)

where the αi are non-negative integers. To abbreviate (4.2) we will often write x^α.

Definition 2 Given K any field, we can form finite linear combinations of monomials with coefficients in K. The resulting objects are known as polynomials in K[x1, ..., xn], and can be expressed as

f = ∑i ci x^α(i) (4.3)

Given a set of polynomials {f1, ..., fm}, let us consider the set of polynomials that can be generated from it by adding its elements together and multiplying them by arbitrary polynomials.

Definition 3 Let f1, ..., fm ∈ K[x1, ..., xn]. We let ⟨f1, ..., fm⟩ denote the collection

⟨f1, ..., fm⟩ = {p1 f1 + ··· + pm fm : pi ∈ K[x1, ..., xn] for i = 1, ..., m}. (4.4)

This generating set, also known as a basis, is strongly related to the following definition.

Definition 4 Let I ⊂ K[x1, ..., xn] be a non-empty subset. I is said to be an ideal if

• f + g ∈ I whenever f ∈ I and g ∈ I, and

• p f ∈ I whenever f ∈ I and p ∈ K[x1, ..., xn] is an arbitrary polynomial.

By that, we now see that ⟨f1, ..., fm⟩ generates an ideal. We call it the ideal generated by f1, ..., fm. And it works the other way around too: given an ideal I, by the Hilbert Basis Theorem there exists a finite set of polynomials that generates the ideal, i.e. there is a basis for that ideal.

Let us now introduce the concepts of ordering, leading terms and division. A monomial ordering is a total ordering that unequivocally defines an order among the monomials of a polynomial. Different orderings are mentioned in this thesis, and we introduce them by example:

• Lexicographic (Lex) The same order as words in a dictionary

xy3z > y5z3

• Graded Lexicographic (Grlex) First by total monomial degree, then fall back on the Lex ordering.

y5z3 > xy3z, and x2z > z3

• Graded Reverse Lexicographic (Grevlex) First by total monomial degree; among monomials of equal degree, the one with the lower power of the smallest variable wins.

y5z3 > xy3z, and xy3z > x2yz2 (whereas x2yz2 > xy3z under Grlex)
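These orderings can be checked with sympy's `LT`, which accepts an `order` argument; the polynomials below are picked purely for illustration:

```python
import sympy as sp
from sympy import LT

x, y, z = sp.symbols('x y z')

# Leading term of the same polynomial under different orderings.
f = x*y**3*z + y**5*z**3
print(LT(f, x, y, z, order='lex'))      # x*y**3*z  (any power of x wins)
print(LT(f, x, y, z, order='grlex'))    # y**5*z**3 (total degree 8 beats 5)

# Grlex and grevlex can disagree on monomials of equal total degree.
g = x**2*y*z**2 + x*y**3*z
print(LT(g, x, y, z, order='grlex'))    # x**2*y*z**2
print(LT(g, x, y, z, order='grevlex'))  # x*y**3*z
```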

Definition 5 The leading term of a polynomial f = ∑i ci x^α(i) with respect to a monomial ordering > is the term LT(f) = cj x^α(j), where x^α(j) is the largest monomial appearing in f w.r.t. the ordering >.

Definition 6 Fix any monomial ordering > in K[x1, ..., xn] and let F = {f1, ..., fm} be a set of polynomials in K[x1, ..., xn]. Then, every other polynomial f ∈ K[x1, ..., xn] can be written as

f = a1 f1 + ... + am fm + r (4.5)

where ai, r ∈ K[x1, ..., xn], and either ai fi = 0 or LT(f) ≥ LT(ai fi). We will call r the remainder of f on division by F.

Now that we have the concept of division, and knowing that an ideal is built by multiplication and addition, one might think we can use division to check whether a given polynomial belongs to an ideal. But the following example proves us wrong.

Example 1 Consider the set F = {xy + 1, y2 − 1} with the lex ordering x > y, and the polynomial f = xy2 − x. Since f = x · (y2 − 1), clearly f ∈ ⟨F⟩.

However, dividing f by F while taking xy + 1 as the first divisor gives

f = y · (xy + 1) + (−x − y)

so the remainder is −x − y, which is nonzero even though f belongs to the ideal (taking y2 − 1 first instead gives remainder 0).

That seems to indicate that some generating sets, or bases, are better suited than others for this task. Moreover, modifying the basis of the ideal, as long as the ideal itself stays the same, does not change the solutions of the system of polynomials, which indicates that we might be able to transform our system into another basis that is more suitable to be solved.
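The order-dependence of the division remainder can be verified with sympy's `reduced`, here on a small illustrative basis:

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x*y**2 - x                 # f = x*(y**2 - 1), so f is in the ideal
F1 = [x*y + 1, y**2 - 1]
F2 = [y**2 - 1, x*y + 1]       # same set of divisors, different order

_, r1 = sp.reduced(f, F1, x, y, order='lex')   # remainder -x - y
_, r2 = sp.reduced(f, F2, x, y, order='lex')   # remainder 0
```

A zero remainder proves membership, but a nonzero one proves nothing, which is exactly the problem a Gröbner basis will fix.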

4.2 Gröbner bases

We will first define what a Gröbner basis is, and some of its properties.

Definition 7 Fix a monomial order > on K[x1, ..., xn] and let I ⊂ K[x1, ..., xn] be an ideal. A Gröbner basis for I w.r.t. > is a finite collection of polynomials G = {g1, ..., gt} ⊂ I with the property that for every nonzero f ∈ I, LT(f) is divisible by LT(gi) for some i, where LT(f) is the leading term of the polynomial f.

Definition 8 A reduced Gröbner basis for an ideal I is a Gröbner basis G for I such that for all distinct p, q ∈ G, no monomial appearing in p is a multiple of LT(q).

It can be proved that every non-null ideal I has a Gröbner basis, and that the polynomials in that basis are a generating set for I. Gröbner bases have some interesting properties that will be useful later: the remainder of dividing any polynomial f by the Gröbner basis G is uniquely determined. Moreover, if that remainder is 0, the polynomial f can be generated by the basis G, and thus it belongs to the ideal I.

To take advantage of this fact, let us first define the following algebraic structure.

Definition 9 Let I be an ideal. The quotient ring A = K[x1, ..., xn]/I is the set of equivalence classes for congruence modulo I

A = K[x1, ..., xn]/I = {[f] : f ∈ K[x1, ..., xn]}, (4.6)

where

[f] = [g] ⇔ f − g ∈ I (4.7)

The class [f] is called a coset, and it is defined as

[f] = {g ∈ K[x1, ..., xn] : g = f + h, h ∈ I} (4.8)

Alternatively, if G is a Gröbner basis for I, we can make the following definition:

[f] = {g ∈ K[x1, ..., xn] : f and g have the same remainder on division by G} (4.9)

And note that (4.9) is equivalent to (4.8), i.e. there is a one-to-one correspondence.

Let us now analyze this new quotient ring. Adding elements of this space and multiplying elements of this space keeps us in the same space, thus the quotient ring is also a vector space.

[f] + [g] = [f + g] ∈ K[x1, ..., xn]/I

[f] · [g] = [f · g] ∈ K[x1, ..., xn]/I

Since it is a vector space, we can now find a basis for it. Let us consider the set of monomials

x^α ∉ ⟨LT(I)⟩ (4.10)

This set is linearly independent in the quotient ring, and thus it can be thought of as a basis for it. The following example will clarify this concept:

Example 2 Let us consider the following generating set:

G = {x2 + 3xy/2 + y2/2 − 3x/2 − 3y/2, xy2 − x, y3 − y} (4.11)

It can be easily verified that G is a Gröbner basis w.r.t. the lex ordering for the ideal I = ⟨G⟩. The leading term ideal is ⟨x2, xy2, y3⟩, and these are all the monomials that do not belong to it:

B = {1, x, y, xy, y2}

Therefore, these monomials form a basis for the vector space C[x, y]/I.

Thinking of it as a vector space, the sum can be seen as a straightforward operation; what is more interesting to think about is the multiplication. For example, how does a vector in C[x, y]/I transform if we multiply it by x? We can find the answer to this question by computing the multiplication table for the elements of the basis B.

·    | 1    x    y    xy   y2
1    | 1    x    y    xy   y2
x    | x    α    xy   β    x
y    | y    xy   y2   x    y
xy   | xy   β    x    α    xy
y2   | y2   x    y    xy   y2

Table 4.1: Multiplication table in the quotient ring

where α = −3xy/2 − y2/2 + 3x/2 + 3y/2

β = 3xy/2 + 3y2/2 − 3x/2 − y/2

It can be proven that this vector space is finite-dimensional, i.e. that there is a finite number of monomials in B, for every Gröbner basis of such an ideal. This result helps us introduce the concept of the action matrix.

Definition 10 Given a quotient ring K[x1, ..., xn]/I, let us define the linear mapping m_f : K[x1, ..., xn]/I → K[x1, ..., xn]/I given by a polynomial f ∈ K[x1, ..., xn] such that

m_f([g]) = [f] · [g] = [f g] (4.12)

It can be proven that this is, in fact, a linear mapping.

Since the previous mapping is a linear operation in a vector space, we must be able to express it in terms of a matrix.

Example 3 Developing the previous example further, and considering f = x, the linear mapping mx in the basis B = {1, x, y, xy, y2} can be expressed as

mx =
|  0     0    0    0    0 |
|  1    3/2   0  −3/2   1 |
|  0    3/2   0  −1/2   0 |
|  0   −3/2   1   3/2   0 |
|  0   −1/2   0   3/2   0 |   (4.13)

by using the results of Table 4.1.

This matrix is extremely useful because of the following theorem

Theorem 11 Let I ⊂ K[x1, ..., xn], let f ∈ K[x1, ..., xn], and let m_f be the linear mapping on K[x1, ..., xn]/I. Then, for λ ∈ K, the following are equivalent:

1. λ is an eigenvalue of the matrix m_f

2. λ is a value of the function f on the algebraic variety V(I).

We are not going to prove this theorem, but what it means is that the eigenvalues of mx are the solutions of the elimination ideal I ∩ K[x], i.e. the solutions for the variable x. This process can be repeated for the different variables in our problem.
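As a numerical sanity check of Theorem 11, the eigenvalues of the action matrix of Example 3 can be compared against the x-coordinates of the five solutions of the system in Example 2, which (as can be checked by substitution into (4.11)) are (0,0), (1,1), (−1,1), (1,−1) and (2,−1):

```python
import numpy as np

# Action matrix m_x from Example 3 (multiplication by x in C[x, y]/I,
# with respect to the basis B = {1, x, y, x*y, y**2}).
mx = np.array([
    [0.0,  0.0, 0.0,  0.0, 0.0],
    [1.0,  1.5, 0.0, -1.5, 1.0],
    [0.0,  1.5, 0.0, -0.5, 0.0],
    [0.0, -1.5, 1.0,  1.5, 0.0],
    [0.0, -0.5, 0.0,  1.5, 0.0],
])

# By Theorem 11 the eigenvalues are the x-coordinates of the five
# solutions: 0, 1, -1, 1 and 2 (x = 1 appears twice).
eigs = np.sort(np.linalg.eigvals(mx).real)
print(eigs)   # approx [-1, 0, 1, 1, 2]
```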

4.3 Elimination templates

In practical terms, doing all the previously described work to obtain the action matrix for every new instance of a problem is a long process that can be optimized in several ways for computer vision problems. These approaches are called elimination templates, and they are usually precomputed in an "offline" phase, to then be able to quickly obtain the action matrix, and therefore the solutions, in an "online" phase with the real data.

There are several approaches and strategies for elimination templates, but the one that we mention in this research is the transformation of the original system into a polynomial eigenvalue problem, as in [7].

This method first takes a resultant approach, and transforms the original system of equations into a system of the form

C(λ)v = 0 (4.14)

where λ is one of the variables of the system, and v is a vector of all the monomials that don't contain λ. As mentioned previously when discussing resultant theory, this is also called linearizing the system. In that publication, Kukelova et al. showed how to obtain the action matrix from this system.

A problem might arise when linearizing the system: the new system may not have enough equations for the matrix C(λ) to be square. In these situations, new equations need to be added to the system, following the Gröbner basis and ideal theory stated previously, to transform the system into a linear one.

4.4 Hidden variable resultant

This is the new approach described in this thesis. As done previously, we transform the system into the shape of equation (4.14). But instead of obtaining the action matrix from it, we note that C(λ) must have a non-trivial null space at the solutions, and therefore the system must satisfy the following equation

det(C(λ)) = 0 (4.15)

This determinant is usually reasonably sized for the problems considered in this thesis (to the point that the problem is still tractable), the largest matrix found being 15 × 15.

The roots of this determinant give us the solutions of the elimination ideal I ∩ K[λ], which are in fact the solutions for the variable λ of our system. Repeating the process with all the other variables leads us to all the solutions of the polynomial system.
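The mechanics of hiding a variable can be sketched with sympy on a toy system (an illustration, not one of the vision problems): hiding y, padding with a multiple of one equation so that C(y) becomes square, and reading the solutions for y off det(C(y)) = 0.

```python
import sympy as sp

x, y = sp.symbols('x y')

# Illustrative system: x**2 + y**2 - 5 = 0 and x*y - 2 = 0.  Hiding y,
# each row is linear in the monomial vector v = [x**2, x, 1]; the second
# equation is also multiplied by x to make C(y) square.
C = sp.Matrix([
    [1, 0, y**2 - 5],   # x**2 + (y**2 - 5) * 1
    [0, y, -2],         # y*x - 2
    [y, -2, 0],         # x * (y*x - 2) = y*x**2 - 2*x
])

# C(y) v = 0 admits a nonzero v only where det(C(y)) = 0, which gives
# the univariate polynomial of the elimination ideal in y.
detC = sp.expand(C.det())   # -y**4 + 5*y**2 - 4
ys = sp.solve(detC, y)      # roots are {-2, -1, 1, 2}
```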

Chapter 5

Minimal problems

The term "minimal problem" originates from the way these problems are defined, as they are obtained by requiring only the minimal number of point correspondences possible, using all the available geometric constraints. Usually, the problems are named by stating the number of points that the problem requires, like five-point relative pose or six-point relative pose with unknown focal length.

Furthermore, minimal problems represent a different approach to solving these problems compared to the classical way of minimizing the least-squares error over all points. The use of a small number of points lowers the probability of an incorrect point being present among the values, which avoids common mistakes of the classical approach, on top of greatly reducing the number of iterations required in RANSAC algorithms.

In this chapter we introduce several classical computer vision problems and their algebraic definitions, and we solve them using a state-of-the-art algorithm (the polynomial eigenvalue method) and our approach, both introduced previously in chapter 4. Performance in terms of time and accuracy will be used to compare both algorithms.

5.1 Preliminaries

Before the different problems are shown, the notation and the basic geometric structures shared among the different problem definitions are introduced in this section.

First, image points are represented by homogeneous 3-vectors q and q′ in the first and second image, respectively. Each image can be considered the result of a projection P, that can be expanded as

P = K[R|t] (5.1)

where K is a 3 × 3 upper triangular matrix known as the calibra- tion matrix, which holds the intrinsic parameters of the camera, R is a rotation matrix and t is a translation vector. With that information, we can encode the two projections as P1 = K1[I|0] and P2 = K2[R|t], that is, we consider the original camera to be at the center and keep as unknowns the rotation and the translation.

From the translation vector we can construct the skew-symmetric matrix [t]×, which has the property that [t]× x = t × x, ∀x.

[t]× = |  0   −t3   t2 |
       |  t3   0   −t1 |
       | −t2   t1   0  |

Using all this information we can define the fundamental matrix F = K2^−T [t]× R K1^−1. This matrix is used in the epipolar constraint, a constraint that every pair of corresponding points q, q′ has to fulfill.

q′T Fq = 0 (5.2)

If K1 and K2 are known, the cameras are said to be calibrated, and then we can safely assume that the points q and q′ have been previously multiplied by K1^−1 and K2^−1, respectively, so the epipolar constraint can be simplified to

q′T E q = 0 (5.3)

where E is known as the essential matrix. Both of these matrices are used to set up the minimal problems, and the following theorems give us a way to build the equations.

Theorem 12 A real nonzero 3 × 3 matrix, F, is a fundamental matrix if and only if it satisfies the following equation

det(F)= 0 (5.4)

There is an additional property that the essential matrix has to satisfy, which is that the two nonzero singular values are equal. This can be expressed in terms of the following cubic constraints on the essential matrix.

Theorem 13 A real nonzero 3 × 3 matrix, E, is an essential matrix if and only if it satisfies the equation

EETE − (1/2) trace(EET)E = 0 (5.5)
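Both characterizations are easy to sanity-check numerically; in the sketch below the rotation and translation are arbitrary illustrative values:

```python
import numpy as np

def skew(t):
    """Skew-symmetric [t]x, so that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

# Arbitrary pose: a rotation about the z-axis plus a translation.
a = 0.3
R = np.array([[np.cos(a), -np.sin(a), 0.0],
              [np.sin(a),  np.cos(a), 0.0],
              [0.0, 0.0, 1.0]])
t = np.array([0.5, -1.0, 2.0])

E = skew(t) @ R   # essential matrix of the calibrated pair

# Theorem 12: det(E) = 0.
print(abs(np.linalg.det(E)))
# Theorem 13: E E^T E - (1/2) trace(E E^T) E = 0.
residual = E @ E.T @ E - 0.5 * np.trace(E @ E.T) * E
print(np.abs(residual).max())
```

Both printed values are zero up to floating-point round-off, for any choice of R and t.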

5.2 Five-point relative pose

The five-point relative pose problem has been intensively studied over the years. Faugeras et al. proved that this problem has at most 10 solutions and that there are 10 solutions in general, including complex ones. Later, Nistér et al. [8] showed that the solutions to this problem can be obtained by solving a degree-10 polynomial.

The problem definition is as follows: given the images of five unknown scene points from two distinct and unknown cameras, return all the possible configurations for the points and the cameras. A visualization of this problem can be seen in Figure 5.1.

Figure 5.1: Five-point relative pose problem schematic

Also, in this problem, it is assumed that the intrinsic camera calibrations K1 and K2 are known.

To solve it, the following system of equations is created: first, the epipolar constraint (5.2) gets rewritten as

q̃T Ẽ = 0 (5.6)

where

q̃ = [q1q′1, q2q′1, q3q′1, q1q′2, q2q′2, q3q′2, q1q′3, q2q′3, q3q′3]

Ẽ = [E11, E12, E13, E21, E22, E23, E31, E32, E33]

Then, by stacking the condition (5.6) for the five points, we obtain a 5 × 9 matrix. Four vectors span the nullspace of this matrix, and they can be rearranged into four 3 × 3 matrices X, Y, Z, W. The essential matrix can now be expressed as:

E = xX + yY + zZ + wW (5.7)

But since the problem is defined only up to scale, we can safely assume w = 1. By inserting this into the cubic constraint (5.5) we obtain a system of 9 polynomial equations. Those, plus Equation (5.4), give us our 10 equations, containing the following monomials

{x3, x2y, x2z, xy2, xyz, xz2, y3, y2z, yz2, z3, x2, xy, xz, y2, yz, z2, x, y, z, 1}

Once we have our system of polynomials, we check the performance of our method compared to the polynomial eigenvalue approach, since both share the idea of the hidden variable. First, notice that by hiding z, the system looks like this:

C(z)v = 0 (5.8)

where C(z) is a 10 × 10 matrix and

v = [x^3, x^2y, xy^2, y^3, x^2, xy, y^2, x, y, 1]^T

is a 10 × 1 vector, so we already have enough equations for our system and do not need to add any more.

Polynomial eigenvalue

Equation 5.8 is well defined, and C(z) is a square matrix polynomial, so we can compute the eigenvalues and eigenvectors of the system. The eigenvalues are the candidate values for z, but not all of them are valid, since some are zero or correspond to a zero eigenvector. For each valid eigenvalue, x and y are obtained from the corresponding eigenvector, since we know how the monomials are ordered in v.
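A minimal sketch of such a polynomial eigenvalue solver, assuming C(z) = C0 + z C1 + ... + z^d Cd is given by its coefficient matrices (the block-companion linearization is standard; the function name and the assumption of an invertible leading coefficient are our own):

```python
import numpy as np

def polyeig(*C):
    """Solve the polynomial eigenvalue problem
        (C[0] + z*C[1] + ... + z**d * C[d]) v = 0
    via the standard block-companion linearization. Returns the eigenvalues
    and, for each, the first block v of the companion eigenvector
    [v, z*v, ..., z**(d-1) * v]."""
    d, n = len(C) - 1, C[0].shape[0]
    A = np.zeros((n * d, n * d))
    B = np.eye(n * d)
    A[: n * (d - 1), n:] = np.eye(n * (d - 1))        # shift blocks: w_{i+1} = z * w_i
    for k in range(d):
        A[n * (d - 1):, n * k : n * (k + 1)] = -C[k]  # last block row encodes C(z) v = 0
    B[n * (d - 1):, n * (d - 1):] = C[d]
    # For a generically invertible leading coefficient C[d], the pencil
    # reduces to an ordinary eigenvalue problem.
    vals, W = np.linalg.eig(np.linalg.solve(B, A))
    return vals, W[:n, :]
```

For the five-point problem one would pass the 10 × 10 coefficient matrices of the cubic matrix polynomial C(z); x and y are then read off each eigenvector using the known monomial ordering of v.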

Hidden variable resultant

From Equation 5.8, where C(z) is already square, the determinant det(C(z)) can be computed and the roots of the resulting univariate polynomial in z can be found. At this point there are two possible approaches.

The first is to substitute each root z back into the system and repeat the process with a new, simpler system of polynomials. Due to the high degree of the univariate polynomial obtained from the determinant, the rounding errors in the previously found variables are amplified, and the final system in a single variable has no exact algebraic solution; it has to be solved by minimizing the least-squares error or by discarding some of the equations.

In early tests the residuals were larger for this system, so this path was discarded in favor of the second approach.

The second approach is based on the idea that any of the variables can play the role of the hidden variable in the resultant matrix C(λ), so the determinant of this matrix is computed three times, once for each variable. Conveniently, in this problem C(λ) is a square matrix for all three variables, so no extra preprocessing is required.
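One way to carry out this determinant-and-roots step numerically, sketched here with design choices of our own (evaluation at sample points followed by a polynomial fit, rather than symbolic expansion of the determinant):

```python
import numpy as np

def resultant_roots(C_of, deg_det, imag_tol=1e-8):
    """Recover the univariate resultant polynomial det(C(z)) by evaluating
    the determinant at deg_det + 1 sample points (Chebyshev nodes) and
    fitting the coefficients, then return its real roots. deg_det must be
    an upper bound on the degree of det(C(z))."""
    n = deg_det + 1
    zs = np.cos(np.pi * (np.arange(n) + 0.5) / n)   # Chebyshev nodes in [-1, 1]
    dets = np.array([np.linalg.det(C_of(z)) for z in zs])
    coeffs = np.polyfit(zs, dets, deg_det)          # highest degree first
    roots = np.roots(coeffs)
    return roots[np.abs(roots.imag) < imag_tol].real
```

Spurious roots introduced by an overestimated degree bound are removed later, when candidates are substituted back into the original system.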

Ten solutions are obtained for each variable, and as a last step these solutions need to be plugged back into the original system of equations to check which triplets solve the equations and which must be discarded.
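The brute-force filtering described above can be sketched as follows (a naive version with a hypothetical tolerance; the real equation set would be the ten polynomials of the five-point system):

```python
from itertools import product

def filter_triplets(equations, xs, ys, zs, tol=1e-6):
    """Plug every candidate (x, y, z) triplet back into the original system
    (given as a list of callables) and keep those whose largest absolute
    residual is below tol."""
    keep = []
    for cand in product(xs, ys, zs):
        if max(abs(f(*cand)) for f in equations) < tol:
            keep.append(cand)
    return keep
```

With ten candidates per variable this is exactly the 10 × 10 × 10 = 1000 back-substitutions discussed in the results below.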

Results

Despite trying different optimization techniques, our approach requires the computation of three determinants plus the filtering step, in which 1000 candidate triplets need to be plugged back into the system to check whether they satisfy the original equations.

Figure 5.2: Residuals comparison for the five point relative pose problem

Figure 5.2 shows that our method obtains smaller residuals in general, at the cost of being more than 1000 times slower, as can be seen in Figure 5.3. Because of that, our approach as it stands is not suitable for real-time applications, and it is even inconvenient for use in RANSAC systems.

Figure 5.3: Time comparison for the five point relative pose problem

5.3 Perspective-3-point

The perspective-n-point pose determination problem, also called PnP, has been studied for values of n ranging from the minimum number of points, 3, upwards. The task at hand is to determine the six-degree-of-freedom camera position and orientation from the correspondences between n known 3D points and their projections in the 2D image.

Since we are interested in the minimal problem, we will only consider the case of n = 3, which is the first one in which the number of solutions is finite.

As defined in the preliminaries, we have the projection matrix P, which relates the 3D points x and the 2D points q, both homogenized, assuming that the points q have already been multiplied by the inverse calibration matrix K−1.

qi = Pxi (5.9)

Now we can transform this system to its non-homogeneous version, and instead of taking qi we will use the "3D version" of qi, i.e. the 3D point on the camera plane.

λiui = R(xi − T), i = 1, ..., 3 (5.10)

Taking norms on both sides of Equation 5.10 gives the following system, in which the λi are unknown, the ui are unit vectors, and T is still unknown.

|λi| ||ui|| = ||xi − T||, i = 1, ..., 3 (5.11)

That transformation holds because rotation matrices preserve norms: ||Rv|| = ||v|| for any vector v.
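This norm-preservation property (a consequence of R^T R = I) is easy to confirm numerically; the QR-based construction below is just a convenient way to generate a test rotation and is not part of the solver.

```python
import numpy as np

def random_rotation(rng):
    """A random proper rotation obtained from the QR decomposition of a
    Gaussian matrix; used here only to illustrate norm preservation."""
    Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    return Q if np.linalg.det(Q) > 0 else -Q
```

For any vector u, np.linalg.norm(R @ u) then agrees with np.linalg.norm(u) up to floating-point error.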

Figure 5.4: Perspective-3-point schematic

Figure 5.4 shows the constraint that we are looking at: given a triangle with fixed side lengths (the distances between the xi), find the rotation and translation by which the triplet (v1, v2, v3) must be moved so that the straight lines given by those vectors pass through the points x1, x2, x3.

Applying the law of cosines to these triangles gives us the following system of equations

||x1 − x2||^2 = ||x1 − T||^2 + ||x2 − T||^2 − 2||x1 − T|| ||x2 − T|| cos(∠v1v2)
||x1 − x3||^2 = ||x1 − T||^2 + ||x3 − T||^2 − 2||x1 − T|| ||x3 − T|| cos(∠v1v3)
||x2 − x3||^2 = ||x2 − T||^2 + ||x3 − T||^2 − 2||x2 − T|| ||x3 − T|| cos(∠v2v3)
(5.12)

So this is our system of equations to solve. Once the distances are figured out, we can use Equation 4.14 to recover the rotation.
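System (5.12) can be written directly as a residual function, useful both when building a solver and when filtering candidate solutions; the naming below (d for the candidate distances ||xi − T||, v for the unit bearing vectors) is our own:

```python
import numpy as np

def p3p_residuals(d, x, v):
    """Residuals of the law-of-cosines system (5.12).
    d: (3,) candidate distances ||x_i - T||;
    x: (3, 3) known 3D points, one per row;
    v: (3, 3) unit bearing vectors from the camera center."""
    res = []
    for i, j in [(0, 1), (0, 2), (1, 2)]:
        cos_ij = v[i] @ v[j]              # cosine of the angle between bearings
        lhs = np.sum((x[i] - x[j]) ** 2)  # squared known side length
        res.append(lhs - (d[i] ** 2 + d[j] ** 2 - 2.0 * d[i] * d[j] * cos_ij))
    return np.array(res)
```

A ground-truth pose gives residuals of zero, which also makes this a convenient correctness check for candidate distance triplets.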

Polynomial eigenvalue

In this case, hiding one variable yields a 3 × 6 matrix, which is not square, so we need to keep generating equations until the system has as many equations as monomials. We do this by multiplying the equations of the system by simple monomials and adding the products to the system.
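A small SymPy sketch of this construction (assuming SymPy is available; the helper names are our own): the original equations are multiplied by simple monomials to extend the system, and the extended system is then collected into a coefficient matrix whose entries are polynomials in the hidden variable.

```python
import sympy as sp

def extend_system(eqs, multipliers):
    """Append monomial multiples of the original equations, as is done
    when the hidden variable matrix is not yet square."""
    return list(eqs) + [m * e for e in eqs for m in multipliers]

def hidden_variable_matrix(eqs, variables, monoms):
    """Collect the system eqs = 0 into C * v = 0, where v is the vector of
    monomials `monoms` in `variables` and each entry of C is a polynomial
    in the remaining (hidden) variable."""
    rows = []
    for eq in eqs:
        p = sp.Poly(sp.expand(eq), *variables)
        rows.append([p.coeff_monomial(m) for m in monoms])
    return sp.Matrix(rows)
```

Repeating the extension step with further multipliers is what eventually produces the square 15 × 15 matrix described below for this problem.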

At some point we end up with a 15 × 15 matrix and a vector of 15 monomials, and this can now be solved as in the five-point relative pose problem.

The difference now is that some of the solutions appearing in the system are invalid solutions generated during the process of adding equations, and those need to be filtered out. The easiest way to do this is to check whether the solutions satisfy the restrictions imposed by the structure of the eigenvector, which means that we do not need to plug them back into the system of equations.
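Such a consistency check can be sketched generically (the exponent-list representation is our own device; for the five-point problem, for instance, the vector v of Equation 5.8 would be described by exponents running from (3, 0) down to (0, 0)):

```python
import numpy as np

def solution_from_eigenvector(v, exps, tol=1e-4):
    """Try to read a solution (x, y) off an eigenvector v whose k-th entry
    should equal x**a * y**b for (a, b) = exps[k]. The vector is first
    scaled so the constant monomial (0, 0) equals 1; vectors inconsistent
    with the monomial structure are rejected (returns None). Assumes exps
    contains (0, 0), (1, 0) and (0, 1)."""
    v = np.asarray(v, dtype=complex)
    one = exps.index((0, 0))
    if abs(v[one]) < 1e-12:
        return None  # degenerate eigenvector: constant entry vanishes
    v = v / v[one]
    x, y = v[exps.index((1, 0))], v[exps.index((0, 1))]
    expected = np.array([x**a * y**b for a, b in exps])
    if np.max(np.abs(v - expected)) > tol:
        return None  # fails the monomial consistency check
    return x, y
```

Eigenvectors generated by the artificial extra equations fail this check and are discarded without any back-substitution.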

Hidden variable resultant

The same process as in the polynomial eigenvalue method is used to obtain the hidden variable matrix, but it is repeated three times, once for each variable in the system.

After that, once the determinants are computed and the candidate solutions found, they are filtered by checking whether they satisfy the original system.

Results

As we have seen before, our method is not suitable for real-world applications, due to the extra time it requires relative to the accuracy benefit it brings.

Figure 5.5: Residuals comparison for the perspective-3-point problem

Figure 5.6: Time comparison for the perspective-3-point problem

Chapter 6

Conclusion

The main goal of this thesis was to test the possibilities of our newly suggested approach to solving the systems of non-linear polynomial equations found in many computer vision problems, and to iterate on and improve the algorithm based on the results, so that the current state of the solvers for many minimal problems could be improved.

In this thesis we have mathematically proved the correctness of this new algorithm, while comparing it to a state-of-the-art algorithm that shares a similar formulation but diverges from our path at some point. Both algorithms should be capable of solving similar problems due to their similarity, and they have been tested on two of the most classical minimal problems: the five-point relative pose and the perspective-3-point.

In Chapter 5 we compared the accuracy and the performance of our new algorithm against the aforementioned polynomial eigenvalue solver, and the results obtained did not leave us too optimistic about the achievable performance of this new algorithm. Nonetheless, we still believe that our algorithm can be improved in certain ways to reach sub-second running times, by using customizations specific to each problem. However, there is an obvious drawback in our algorithm compared to the polynomial eigenvalue one, which is our inability to obtain the solutions for the three variables in just one iteration of the algorithm. That means that an extra computation step will always be required in this approach, making it less performant, except in the cases where we can take advantage of some structure of the problem.

On the other hand, the results obtained showed better accuracy than the state-of-the-art ones, and this fact supports our understanding that solving the system for the elimination ideal leads to more accurate solutions. Nevertheless, we have shown that our current approach, as it stands, is not competitive with the state-of-the-art solvers.

Bibliography

[1] David A. Cox, John B. Little, and Don O’Shea. Using Algebraic Geometry. Springer, 2005.

[2] David A. Cox, John B. Little, and Don O’Shea. Ideals, Varieties, and Algorithms. Springer, 2016.

[3] J.C. Faugère. A new efficient algorithm for computing Gröbner bases (F4). Journal of Pure and Applied Algebra, 1999.

[4] J.C. Faugère. A new efficient algorithm for computing Gröbner bases without reduction to zero (F5). ISSAC '02, pages 75–83, 2002.

[5] M.A. Fischler and R.C. Bolles. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. SRI International, 1981.

[6] K. Kühnle and E.W. Mayr. Exponential space computation of Gröbner bases. ISSAC '96, pages 63–71, 1996.

[7] Z. Kukelova, M. Bujnak, and T. Pajdla. Polynomial eigenvalue solutions to minimal problems in computer vision. IEEE PAMI, 2012.

[8] D. Nistér. An efficient solution to the five-point relative pose problem. IEEE PAMI, 2004.
