Visualization Research Center

University of Stuttgart, Allmandring 19, D–70569 Stuttgart

Masterarbeit

Visualizing Optimization Trajectories

David Hägele

Course of Study: Informatik

Examiner: Prof. Dr. Daniel Weiskopf

Supervisor: Dr. Antoine Lhuillier, M.Sc. Moataz Abdelaal, M.Sc. Rafael Garcia

Commenced: May 27, 2019

Completed: November 27, 2019

Abstract

Nonlinear constraint optimization has many applications in technical, scientific, and economic fields. Understanding solver behavior can help to improve solvers, choose appropriate hyperparameters, and formulate better-performing nonlinear programs. This thesis proposes a visual analytics tool for analyzing constraint optimization problems. The optimization process is depicted by a set of two-dimensional trajectories, representing the trace of intermediate solutions during the optimization process. This allows us to obtain an overview of the evolution of the optimization process. To support detailed analysis, supplemental views are added to show the constraint violations and areas of feasible solutions. Furthermore, different interaction techniques are implemented to facilitate the exploration process. To showcase the usefulness of the approach, findings from an exemplary analysis based on optimization logs of robot motion planning are presented.

Kurzfassung

Nichtlineare Optimierung unter Nebenbedingungen hat vielerlei Anwendungen in technischen, wissenschaftlichen und auch ökonomischen Bereichen. Das Verstehen von Löserverhalten kann dabei helfen Löser zu verbessern, angemessene Hyperparameter zu wählen, und performantere nichtlineare Programme zu formulieren. Diese Masterarbeit schlägt ein Visual Analytics Werkzeug zur Analyse von Optimierungsproblemen vor. Der Optimierungsvorgang wird durch eine Menge von zweidimensionalen Trajektorien dargestellt, die den Verlauf der zwischenzeitigen Lösungen während des Prozesses repräsentieren. Das erlaubt es uns einen Überblick über die zeitliche Entwicklung des Optimierungsvorgangs zu gewinnen. Um eine detaillierte Analyse zu ermöglichen, werden ergänzende Ansichten zur Darstellung der Nebenbedingungsverletzungen und Bereiche zulässiger Lösungen hinzugefügt. Außerdem werden verschiedene Interaktionstechniken implementiert, um die Exploration zu erleichtern. Um die Nützlichkeit des Ansatzes zu zeigen, werden die Ergebnisse aus einer beispielhaften Analyse eines Optimierungslogs zur Bewegungsplanung eines Roboters präsentiert.


Contents

1 Introduction 11
1.1 Motivation...... 11
1.2 Goals...... 12
1.3 Outline...... 13

2 Related Work 15
2.1 Optimization Visualization...... 15
2.2 Visualization of Evolution...... 17

3 Constraint Optimization 21
3.1 Optimization Problem...... 21
3.2 Nonlinear Programming...... 22

4 Visualizing Optimization Trajectories 27
4.1 Visual Representation...... 27
4.2 Robot Path Trajectories...... 27
4.3 Dimensionality Reduction...... 29
4.4 Constraints Visualization...... 32
4.5 Explore Through Interaction...... 34

5 Implementation 39
5.1 Architecture...... 39
5.2 Dataset Description...... 40
5.3 Dimensionality Reduction Considerations...... 42
5.4 Used Technologies...... 42

6 Evaluation 45
6.1 Visualization Tool Overview...... 45
6.2 Use Case Scenarios...... 47

7 Conclusion 55
7.1 Summary...... 55
7.2 Results...... 55
7.3 Future Work...... 56

Bibliography 59


List of Figures

3.1 Nonlinear program examples...... 22

4.1 Optimization Trajectory...... 28
4.2 Per Configuration Trajectories...... 29
4.3 Dedicated Time Axis & Linesearch Highlighting...... 31
4.4 Dimensionality Changing Trajectories...... 31
4.5 Per Configuration Trajectory Violations...... 32
4.6 Constraint-Violations-View...... 33
4.7 Constraint Curve Plots...... 33
4.8 Zooming in on a view...... 34
4.9 Supplementary GUI for Filtering...... 35
4.10 Selected Per Configuration Trajectory...... 36
4.11 Selected Per Configuration Trajectory...... 37
4.12 Constraint-Violations of Selected Groups...... 38

5.1 MVC Pattern...... 39

6.1 Visualization Tool Overview...... 45
6.2 Video Sequence of Robot Movement...... 47
6.3 Analysis of Robot Path Sections...... 48
6.4 Analysis of Robot Path Velocities...... 49
6.5 Analysis of Initial and Final Path...... 49
6.6 Analysis of Path Evolution...... 50
6.7 Analysis of Constraint Evolution...... 51
6.8 Analysis of Constraint Violations on Trajectories...... 52
6.9 Analysis of Constraint Violations...... 53


List of Listings

3.1 Pseudo code implementation of the augmented Lagrangian method..... 24
3.2 Pseudo code implementation of Newton’s method...... 25

5.1 Example KOMO log file content...... 44


1 Introduction

1.1 Motivation

Mathematical optimization plays an important role in today’s world, where complex systems emerge in many different technical, economic, and scientific fields. The subdiscipline of constrained optimization, also known as linear/nonlinear programming, deals with problems of finding optima within a constrained set of possibilities and is particularly challenging. It has many applications, such as utility maximization and expenditure minimization in microeconomics, route planning for drones [ZD15], motion planning for autonomously moving robots [Tou15], solving inverse imaging problems such as tomographic reconstruction [ABF11], or optimizing signal transmitting systems for minimal transmission power [YL07].

Constrained optimization bears many challenges, starting with the formulation of the optimization problem, which includes the objective function as well as equality and inequality constraint functions. These functions can be linear or nonlinear, differentiable or discrete, and the optimization problem is classified accordingly, which leads to different approaches for solving it. Nonlinear programs (NLPs) can be especially hard to solve due to possible local optima, disconnected feasible regions, and feasible starting points that are hard to determine [Chi06]. Moreover, considerable effort has been put into determining whether an NLP is feasible at all, as this is not trivial [AA95; NTY99]. Apart from these general difficulties in constrained optimization, the choice of algorithm for solving may change the result, due to its individual behavior. Choosing an appropriate algorithm depends on the problem and whether it fulfills certain requirements the algorithm imposes. Iterative solvers with different initializations can yield different results, or may even fail to produce a feasible solution in one case but not the other. Last but not least, the choice of parameters to tune an individual implementation of an algorithm can have a major effect on the optimization’s outcome.
Exemplary optimization problems from textbooks may be easy to understand and are great for building intuition on the subject, but real-world problems quickly become very complex for several reasons, such as a large number of constraints, which creates many complicated regions. High-dimensional problems are quite common and harder to grasp, especially when constraints depend on several of the dimensions. Nonlinear objectives or constraints also make it difficult to envision a problem geometrically. Time-dependent systems complicate matters further when constraints are introduced that have to be met only at specific times or apply to sets of subsequent time steps.


Ultimately, a solver may generate an unsatisfactory solution or generate solutions under intolerable circumstances: for example, an infeasible solution or a solution that is only locally optimal. The algorithm may take too long to converge or not converge at all, or may only succeed for specific initializations. Diagnosing the causes of such behavior is hard due to the sheer number of variables that come into play during an optimization and due to the complexity of the optimization problem itself. Because of the fuzziness of such diagnostic tasks and the impossibility of tackling them algorithmically, it seems reasonable to leverage visualization in order to get a better understanding of the optimization problem and the corresponding algorithm behavior. The knowledge of a domain expert paired with a visual analytics tool has proven to work well in these kinds of scenarios in the past and should be able to shed light on the matter.

1.2 Goals

Since constraint optimization solvers often employ iterative methods to refine their solution and work their way to the optimum step by step, there is an inherent notion of trajectories that an optimization algorithm leaves behind. These trajectories promise to be insightful with respect to the behavior of an algorithm, its configuration, and the optimization problem it is applied to. Thus, this thesis focuses on techniques for the visualization of trajectories that emerge from iterative solvers, i.e., the evolution of an optimization solution and its corresponding attributes and properties. Unfortunately, only very little work has been published on the topic of nonlinear constrained optimization visualization as of this writing. In order to build visualizations that can help to diagnose optimization performance, it is important to explore general possibilities for visualizing different aspects of an NLP and the corresponding optimization. Apart from the exploration of visualization methods, a visual analytics system for analyzing optimization runs is to be realized. The system should be tailored towards the k-order Markov Path Optimization (KOMO) framework [Tou14b], which can output a detailed log file of an optimization process. Using such log files, the system should be able to retrieve and display trajectory data as well as supplemental information about the corresponding optimization. The framework is intended for optimizing the motion of robots, which implies a temporal aspect of the data as well as a larger number of dimensions to account for the numerous actuators within a robotic unit. The contribution of this thesis is a visual analytics tool for analyzing optimizations of NLPs, intended for domain experts of optimization. The tool can support an analyst in troubleshooting optimization issues, provides insights into algorithmic behavior, and helps to understand the properties of a robotics optimization problem.


1.3 Outline

The contents of this thesis are divided into several chapters. It starts by giving an overview of related publications in the fields of visualization of constraint optimization and the visualization of temporal evolution in Chapter 2. Following that, Chapter 3 introduces the reader to constraint optimization and discusses prerequisites concerning the definition of nonlinear programs, conditions for an optimal solution, and an algorithm for solving NLPs. From there on, the visualization techniques for optimization trajectories are introduced in Chapter 4. In this chapter, the visualization of constraints and possible interaction techniques are discussed as well. In Chapter 5, the implementation of the visual analytics tool for optimization log analysis is presented, discussing architecture, technical details, and the dataset’s properties. The final tool is presented in Chapter 6, where an evaluation based on an analysis use case scenario is conducted. In the end, a conclusion is drawn in Chapter 7 and ideas for future work are stated.


2 Related Work

In this chapter, work that is similar or related to this thesis is presented. A short description of each work is given, along with how it relates to this thesis and what distinguishes it from it. This is done to make the actual novelties of this thesis clearer, by briefly sketching what has already been done in the areas of visualizing optimization and constraint aspects, as well as the visualization of patterns of evolution or time-dependent data.

2.1 Optimization Visualization

This section covers works related to the visualization of optimization and constraint problem aspects, such as linear, nonlinear, or constraint programming (LP, NLP, CP), problem formulation, solver behavior, or solution assessment.

2.1.1 Visualization in linear programming using parallel coordinates

A quite early approach to visualizing linear programs by Chatterjee, Das, and Bhattacharya uses parallel coordinates to visualize the constraint boundaries [CDB93]. They argue that while 2D and 3D problems can be illustrated using Cartesian coordinates, higher-dimensional problems lack a visualization method. Parallel coordinates allow for displaying high-dimensional geometry; the cognitive process of interpreting hyperplane representations, however, is very involved and unintuitive. While their approach can visualize the geometric shape of a linear program and the feasible regions, the methods presented in this thesis aim to visualize the evolution of the solution to an NLP.

2.1.2 Linear programming concept visualization

Another approach, by Charalambos and Izquierdo, also deals with linear programs but is limited to three dimensions and can therefore visualize a problem’s constraint boundaries in Cartesian coordinates [CI01]. The concept visualization allows for tracing an algorithm’s path to the optimal solution (e.g., the path of the simplex algorithm within the polytope) as well as sensitivity analysis to examine changes of the feasible region and solution when varying parameters of the problem. The approach pursues similar objectives and gives insights into algorithm behavior; this thesis, however, focuses on NLPs instead, and the presented methods are not limited to 3D but intended for high-dimensional problems.


2.1.3 A Generic Visualization Platform for CP

Simonis et al. describe the design and implementation of the visualization tool CP-Viz [SDF+10]. Their tool allows for a “post-mortem” analysis of constraint program (CP) optimizations, which is very similar to the goal of this thesis. They provide the user with an interactive user interface where the states of constraints and variables, as well as a search tree, are visualized based on the selected time step within the algorithm. This way, an analyst can observe the behavior of the solver for a particular problem. The main difference between the projects is that CP-Viz focuses on CPs, which boil down to combinatorial or integer-constrained problems and thus correspond to different kinds of algorithms. While solving CPs involves tree traversal, backtracking, and dynamic programming techniques, NLPs consist of differentiable functions, and their iterative solvers are based on algebraic methods such as gradient descent. The whole geometric nature of NLPs is missing in CPs, which calls for different visualization methods.

2.1.4 EVOLVE: A Visualization Tool for Multi-objective Optimization Featuring Linked View of Explanatory Variables and Objective Functions

Kubota et al. propose a visualization tool for assessing alternative solutions to an optimization problem [KIOT14]. With their tool, it is possible to compare different solutions according to their explanatory variables and corresponding objective function values, which supports decision makers in selecting the most suitable solution out of many. A brushing-and-linking mechanism as well as automatic clustering allow a user to select samples from a scatter plot and a parallel coordinates plot to analyze the optimization’s outcomes. While EVOLVE visualizes solutions to high-dimensional problems, this thesis concerns itself with the evolution of a solution, i.e., the question of how the variables changed during the optimization and how this corresponds to the optimization problem.

2.1.5 What do Constraint Programming Users Want to See? Exploring the Role of Visualisation in Profiling of Models and Search

Goodwin et al. conducted a user study to identify use cases for visualization in constraint programming and to investigate what kinds of visualizations would be valuable for domain experts of CP [GMD+17]. They found that most of the participants thought about their constraint problem visually and also liked to use visualization for statistics about optimization runs in order to debug or profile their solver. It was also found that many participants were generally interested in a visualization of the execution process of their solver, to better understand what it was doing. However, the task of creating visualizations of the execution was rated to be the most complicated among the tasks for dealing with bad optimizer performance. The authors also conducted a creativity workshop to gain further insights into what the experts would want to see or find out, what barriers exist, and what would be possible to do when those barriers were overcome. They found different features and

aspects that experts would value in a visualization, e.g., exploration of what happens during the search for a solution, seeing the effect of constraints and their linkage to variables, or comparing algorithms and models. Case studies with functional prototypes were used to evaluate the findings; however, their emphasis on tree search and general CP as opposed to NLP makes them less relevant for this thesis. The general insights gained from the questionnaires and workshop, however, seem to also apply to the NLP domain, which reinforces the goals of this thesis and the validity of visualization for helping to diagnose optimization performance.

2.2 Visualization of Evolution

This section covers works concerned with conveying the progression of time and evolution in visualization. Methods for coping with high-dimensional, time-dependent data are especially emphasized.

2.2.1 Time Curves: Folding Time to Visualize Patterns of Temporal Evolution in Data

Time curves [BSH+16] are a visualization technique proposed by Bach et al. They can be used to show the progression of datasets with temporal snapshots by arranging the individual snapshots as points in 2D space and connecting them by a continuous line. The arrangement is done using multidimensional scaling (MDS), a dimensionality reduction technique that preserves pairwise distances (for an arbitrary distance/similarity measure). This way, different patterns such as small or large changes or reversals to previous states can be shown. Due to the use of MDS, the approach is data-agnostic and can be applied to any temporal dataset as long as a similarity metric between snapshots exists. The authors showcased time curves with different datasets such as Wikipedia article histories or surveillance videos, where “edit wars” in a document history could be observed very well and outliers, in the case of video, represented a change in a static scene such as a pedestrian walking by. The method of connecting the points of subsequent snapshots by a line to indicate temporal progression is also leveraged by visualizations in this thesis. The resulting trajectories then represent the evolution of an optimization solution. However, different methods for arranging the points in 2D space are applied and evaluated. Furthermore, additional information is encoded in, and linked to, the trajectory.
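The core of the time-curve idea can be sketched in a few lines: snapshots are embedded as 2D points via MDS on their pairwise distances and then connected in temporal order. The following Python sketch is not the authors' implementation; it uses classical (metric) MDS and a synthetic random-walk dataset, both assumptions for illustration.

```python
import numpy as np

def classical_mds(d, k=2):
    """Embed points in k dimensions while preserving the pairwise distances d."""
    n = d.shape[0]
    j = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    b = -0.5 * j @ (d ** 2) @ j           # double-centered Gram matrix
    w, v = np.linalg.eigh(b)              # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:k]         # keep the top-k components
    return v[:, idx] * np.sqrt(np.maximum(w[idx], 0))

# Synthetic "snapshots": a 10-dimensional state drifting over 30 time steps
rng = np.random.default_rng(0)
snapshots = np.cumsum(rng.normal(size=(30, 10)), axis=0)
dists = np.linalg.norm(snapshots[:, None] - snapshots[None, :], axis=-1)

# The time curve is the polyline through these 2D points in temporal order,
# e.g. plt.plot(points[:, 0], points[:, 1], "-o")
points = classical_mds(dists)
```

Any similarity measure between snapshots can replace the Euclidean distance here, which is what makes the approach data-agnostic.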

2.2.2 Temporal MDS Plots for Analysis of Multivariate Data

Jäckle et al. investigated how to use multidimensional scaling for time series data. They came up with a novel visualization technique called TMDS, i.e., temporal multidimensional scaling [JFSK16], which is intended for time series data where the temporal attribute of the data is to be preserved. While time is mapped to the horizontal axis, the other dimensions

are projected onto the vertical axis. For this, a one-dimensional MDS is performed within a sliding time window. The dimensions are also weighted by a user-defined weight function before MDS is applied, to allow for a selection of important dimensions of the multivariate data set. As MDS is not rotationally invariant (since it only preserves distances), adjacent 1D MDS slices created by the algorithm may display corresponding points rotated by 180° (flipped bottom side up). This is not desirable, as it destroys continuous time-related patterns, which is why a heuristic is used to decide whether a 1D MDS slice should be flipped. The heuristic compares the positions of points that are contained in the overlap of the time windows for the previous and current slice and decides to flip if more than half of the points have changed sign (points are mapped to [−1/2, 1/2]). The approach is of interest for this thesis as it is capable of visualizing high-dimensional data while dedicating an axis to mapping time, which makes it easy to visually assess temporal evolution. Some inspiration is drawn from this work for the layout of optimization trajectories in this thesis.
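The sliding-window scheme and the sign-flip heuristic can be illustrated with a small Python sketch. The window size, stride, and dataset below are illustrative assumptions, and classical MDS stands in for whatever 1D projection an actual TMDS implementation uses; dimension weighting is omitted for brevity.

```python
import numpy as np

def mds_1d(x):
    """Classical 1D MDS of the rows of x, scaled to [-1/2, 1/2]."""
    d = np.linalg.norm(x[:, None] - x[None, :], axis=-1)
    n = len(x)
    j = np.eye(n) - np.ones((n, n)) / n
    w, v = np.linalg.eigh(-0.5 * j @ d ** 2 @ j)
    y = v[:, -1] * np.sqrt(max(w[-1], 0.0))      # top principal coordinate
    return y / (2 * np.abs(y).max() + 1e-12)     # map into [-1/2, 1/2]

def tmds(data, window=8, stride=4):
    """Sliding-window 1D MDS with the sign-flip heuristic."""
    slices, prev = [], None
    for s in range(0, len(data) - window + 1, stride):
        y = mds_1d(data[s:s + window])
        if prev is not None:
            overlap = window - stride            # points shared with previous window
            changed = np.sign(y[:overlap]) != np.sign(prev[stride:])
            if changed.sum() > overlap / 2:      # more than half changed sign
                y = -y                           # flip this slice
        slices.append(y)
        prev = y
    return slices

# Synthetic multivariate time series, 40 time steps x 5 dimensions
slices = tmds(np.cumsum(np.random.default_rng(1).normal(size=(40, 5)), axis=0))
```

Each slice provides the vertical coordinates for one time window; plotting the slices side by side along a horizontal time axis yields the TMDS view.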

2.2.3 Reducing Snapshots to Points: A Visual Analytics Approach to Dynamic Network Exploration

Van den Elzen et al. proposed a visual analytics approach for the analysis of dynamic networks [vdEHBv16]. In their work, they take snapshots of a dynamic network and represent these as vectors. To show the network’s evolution, they use dimensionality reduction to lay out the different snapshots as points. By connecting subsequent snapshots through lines and using color coding to represent time, they can show how the network changes. It is possible to observe clusters of snapshots corresponding to certain time spans, such as night clusters and clusters for different days of the week. To understand the similarities of network snapshots within clusters, they provide a linked view displaying the network of a selected snapshot as a node-link diagram. The authors investigated different dimensionality reduction techniques such as PCA, t-SNE, and MDS, as well as different normalization strategies for PCA and similarity measures for t-SNE and MDS. Further research was put into the creation of snapshots, where smoothing using a time window and kernel was suggested. The approach shares quite some similarities with this thesis, in which the intermediate solutions of the optimization process are the analog of the network snapshots. However, there is a major difference in the implication of time as it occurs in optimization, which is the convergence to a desired state towards the end of the process. Also, the connection of the trajectory to constraints and algorithm, i.e., the causation of the trajectory’s form, which has a geometric semantic, has no analogy in this related work.

In this chapter, an overview of work related to this thesis was given. It was found that, while there has been some significant work in the field of visualizing constraint programming, little to no work has been published on the subject of nonlinear programming visualization. Related work in the area of visualizing temporal evolution has

solved similar problems, but is not directly applicable to NLP trajectory visualization and misses out on domain-specific aspects.


3 Constraint Optimization

In this chapter, a few concepts of constraint optimization and the corresponding mathematics are recapitulated in order to concretize the subject of this thesis and to prepare the reader for the following chapters. Relevant algorithms for solving an NLP that correspond to the KOMO framework [Tou14b], which produces the optimization run log files (see Section 1.2), are also introduced briefly.

3.1 Optimization Problem

In this section, a general definition of constraint optimization is given and explained. The field of optimization concerns itself with finding optima for a specific objective. Mathematically, an optimum is a critical point of a function (the objective function in this case), such as a maximum or minimum.

Given f : A → ℝ
Find x∗ such that f(x∗) ≤ f(x) ∀ x ∈ A   (minimum)

Usually, such problems can be solved by analytically finding roots of the derivative; however, an analytic solution does not always exist. Because of that, iterative methods for approaching extrema have been developed, such as gradient descent or Newton’s method. While in classic optimization the whole domain of f makes up the search space for the optimum, i.e., x∗ is allowed to take on any value f is defined for, in constraint optimization certain restrictions are imposed on x∗. Such a constraint optimization problem can be expressed as

minimize f(x) subject to h(x) = 0 , g(x) ≤ 0 .   (3.1)

The equation h(x) = 0 is known as an equality constraint, which limits the search space to the points that fulfill it. Analogously, the inequation g(x) ≤ 0 is known as an inequality constraint that further limits possible solutions to x∗ ∈ {x ∈ A | h(x) = 0 ∧ g(x) ≤ 0}. An example of a constraint optimization problem from economics would be the task of optimizing spending for greatest profit while remaining within budget. From a geometric point of view, h(x) = 0 defines a hypersurface the optimum has to be on, whereas g(x) ≤ 0 defines a halfspace of feasible points. The whole set of points defined by the constraints is called the feasible region of the optimization problem; when this region is empty, the problem cannot be solved.
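As an illustration of the general form (3.1), the following Python sketch solves a small invented problem with SciPy's SLSQP solver; the objective and constraint functions are assumptions for demonstration only. Note that SciPy uses the opposite sign convention for inequalities, expecting fun(x) ≥ 0.

```python
from scipy.optimize import minimize

# Illustrative instance of (3.1): minimize f(x) = (x0 - 2)^2 + (x1 - 1)^2
# subject to h(x) = x0 - 2*x1 = 0 and g(x) = x0^2/4 + x1^2 - 1 <= 0
res = minimize(
    lambda x: (x[0] - 2) ** 2 + (x[1] - 1) ** 2,
    x0=[0.0, 0.0],
    method="SLSQP",
    constraints=[
        {"type": "eq", "fun": lambda x: x[0] - 2 * x[1]},
        # SciPy's "ineq" means fun(x) >= 0, so we pass -g(x)
        {"type": "ineq", "fun": lambda x: 1 - x[0] ** 2 / 4 - x[1] ** 2},
    ],
)
# res.x lies on the line x0 = 2*x1 and on the ellipse boundary,
# at approximately (1.414, 0.707)
```

Here the unconstrained minimum (2, 1) is infeasible, so the solver ends up at the point of the feasible region closest to it in the objective's sense.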


3.2 Nonlinear Programming

Solving an optimization problem where the objective function or one of the constraint functions is nonlinear is called nonlinear programming. Figure 3.1 illustrates exemplary NLPs. The NLP corresponding to Figure 3.1a is mathematically formulated as follows:

minimize x⊤x subject to −x⊤(0, 1)⊤ − 1 ≤ 0 , −x⊤(2, 3)⊤ − 9 ≤ 0 .

[Figure 3.1: two X–Y plots, (a) linear constraints, (b) linear and quadratic constraint]

Figure 3.1: Graphs show 2D NLPs with a quadratic objective function and two inequality constraints. The arrows point towards the minimum of the objective; the blue areas are infeasible regions. The red square shows the location of the optimal solution, and the red line shows the optimizer’s path to that solution.

While the optimum in linear programs can be found on the constraint boundaries, this is not necessarily the case for NLPs, where the optimum can also lie inside the feasible region. Therefore, an optimal point has to fulfill different criteria, known as the KKT conditions. These conditions, formulated by Karush, Kuhn, and Tucker [Kje00; KT14], have to hold at

an optimum x∗ of a constraint optimization problem. In the following, the conditions are explained, assuming the optimum x∗ is a minimum.

−∇f(x∗) = Σᵢ₌₁ᴺ κᵢ∇hᵢ(x∗) + Σⱼ₌₁ᴹ λⱼ∇gⱼ(x∗)   (stationarity) (3.2)

hᵢ(x∗) = 0 ∀ i ∈ {1..N} ,  gⱼ(x∗) ≤ 0 ∀ j ∈ {1..M}   (primal feasibility) (3.3)

λⱼ ≥ 0 ∀ j ∈ {1..M}   (dual feasibility) (3.4)

λⱼ gⱼ(x∗) = 0 ∀ j ∈ {1..M}   (complementary slackness) (3.5)

The primal feasibility condition (Equation (3.3)) is straightforward and requires that the minimum x∗ fulfills the constraints. In order to understand the stationarity condition (Equation (3.2)), let us first consider only a single equality constraint h(x) = 0 and bivariate x ∈ ℝ². This equality constraint describes a level curve that the minimum has to be on in order to be feasible. The objective function together with the optimal value y∗ = f(x∗) also describes a level curve f(x) = y∗, which means that both level curves have to be tangential at x∗. Thus their gradients have to be parallel, i.e., be scalar multiples of each other, which is expressed by −∇f(x∗) = κ · ∇h(x∗), where κ is called a dual variable. Unfortunately, this geometric interpretation cannot be used for multiple constraints right away; however, the primal feasibility (Equation (3.3)) and complementary slackness condition (Equation (3.5)) imply that the weighted sum of the constraints also has to equal zero, which creates a combined level curve 0 = Cκ,λ(x) = Σᵢ₌₁ᴺ κᵢhᵢ(x) + Σⱼ₌₁ᴹ λⱼgⱼ(x) to which the objective function’s level curve f(x) = y∗ has to be tangential.

Since gⱼ(x) ≤ 0 are inequality constraints and do not require x∗ to lie on their boundaries, conditions 3.5 and 3.4 provide a mechanism to “deactivate” an inequality constraint when it is fulfilled, by setting λⱼ = 0. When inactive, the gradient of this constraint makes no contribution to the stationarity equation, whereas a constraint can only be active (λⱼ > 0) when gⱼ(x∗) = 0 (that is, when the minimum is on the inequality constraint’s boundary). An optimization algorithm’s job is to find the unknowns x∗, κ, and λ which fulfill the KKT conditions.
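To make the conditions concrete, the following Python sketch numerically checks the four KKT conditions for a one-dimensional toy problem with a single inequality constraint; the problem, tolerances, and helper names are illustrative assumptions.

```python
import numpy as np

def kkt_check(x, lam, grad_f, g, grad_g, tol=1e-6):
    """Check the KKT conditions for a single inequality constraint g(x) <= 0."""
    stationarity = np.linalg.norm(grad_f(x) + lam * grad_g(x)) < tol
    primal = g(x) <= tol                       # primal feasibility
    dual = lam >= -tol                         # dual feasibility
    slackness = abs(lam * g(x)) < tol          # complementary slackness
    return bool(stationarity and primal and dual and slackness)

# Toy problem: minimize f(x) = x^2 subject to g(x) = 1 - x <= 0.
# The constrained minimum is x* = 1 with an active constraint; stationarity
# -f'(x*) = lam * g'(x*) gives -2 = lam * (-1), i.e. lam = 2.
ok = kkt_check(np.array([1.0]), 2.0,
               grad_f=lambda x: 2 * x,
               g=lambda x: float(1 - x[0]),
               grad_g=lambda x: np.array([-1.0]))
# ok is True
```

Evaluating the same check at an infeasible point, e.g. x = 0 with lam = 0, fails the primal feasibility condition and returns False.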

3.2.1 Augmented Lagrangian Method

The augmented Lagrangian method [Hes69; Pow78; Tou14a] is an optimization algorithm that forms an unconstrained problem from the original nonlinear constrained problem


(Equation (3.1)) in order to solve it. The unconstrained problem, i.e., the augmented Lagrangian, has the following form:

L(x) = f(x) + Σᵢ₌₁ᴺ κᵢhᵢ(x) + Σⱼ₌₁ᴹ λⱼgⱼ(x) + ν Σᵢ₌₁ᴺ hᵢ(x)² + µ Σⱼ₌₁ᴹ [gⱼ(x) > 0] gⱼ(x)²   (3.6)

This term is put together from a Lagrangian (as in the method of Lagrange multipliers) and the penalizations of the squared penalty method. The indicator function [g(x) > 0] evaluates to 1 when the inequality g(x) ≤ 0 is violated, and to 0 when x is feasible with respect to g. Note how the squared penalties are zero when x is feasible but greater than zero otherwise. In order to find the minimum x∗ of the constrained problem, the algorithm solves for a minimum of the augmented Lagrangian L(x) multiple times while updating the dual variables κ and λ in between. A naive pseudo code implementation of the algorithm is shown in Listing 3.1 to illustrate the process.

Listing 3.1 Pseudo code implementation of the augmented Lagrangian method

# AUGMENTED LAGRANGIAN #
BEGIN
    given objective f, equalities h and inequalities g
    define augmented Lagrangian function L for f, h and g
    initialize x = random, ν, µ = 1, κ, λ = 0
    WHILE solution NOT fulfilling KKT
        x = minimize L(ν, µ, κ, λ) with initialization x
        update κ, λ
    END WHILE
END
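A minimal executable version of Listing 3.1 can be sketched in Python. The toy problem, the fixed penalty weight µ, and the fixed iteration count are assumptions for illustration; the inner unconstrained minimization is delegated to SciPy, and the dual update anticipates the rule derived below in Section 3.2.1.

```python
import numpy as np
from scipy.optimize import minimize

# Toy NLP (illustrative): minimize f(x) = x0^2 + x1^2  s.t.  g(x) = 1 - x1 <= 0
f = lambda x: x @ x
g = lambda x: 1.0 - x[1]

def aug_lagrangian(x, lam, mu):
    # Equation (3.6) for a single inequality constraint; the squared
    # penalty is only active while the constraint is violated
    gx = g(x)
    return f(x) + lam * gx + mu * (gx > 0) * gx ** 2

x, lam, mu = np.array([3.0, -4.0]), 0.0, 1.0
for _ in range(30):                                         # outer loop of Listing 3.1
    x = minimize(lambda y: aug_lagrangian(y, lam, mu), x).x  # inner unconstrained solve
    lam = max(0.0, lam + 2 * mu * g(x))                     # dual update
# x approaches the constrained minimum (0, 1) and lam approaches 2
```

For this problem one can verify analytically that each outer iteration halves the dual variable's distance to its optimal value lam = 2, so the iterates converge to the constrained minimum.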

To explain how the dual variables κ and λ are updated, let us first take a look at the derivative of the augmented Lagrangian.

∇L(x) = ∇f(x) + Σᵢ₌₁ᴺ κᵢ∇hᵢ(x) + Σⱼ₌₁ᴹ λⱼ∇gⱼ(x) + 2ν Σᵢ₌₁ᴺ hᵢ(x)∇hᵢ(x) + 2µ Σⱼ₌₁ᴹ [gⱼ(x) > 0] gⱼ(x)∇gⱼ(x)   (3.7)

Observe how the derivative is closely related to the KKT stationarity condition (Equation (3.2)). In fact, ∇L(x) = 0 results in the stationarity condition with the derivatives of the quadratic penalizations added to the right. In order to find a solution that fulfills the KKT conditions, the dual variables need to be determined besides the related minimal x∗. For this, the duals are updated in each iteration in the following way.

κᵢ ← κᵢ + 2ν · hᵢ(x) ,  λⱼ ← max( 0 , λⱼ + 2µ · gⱼ(x) )

When analyzing Equation (3.7) again, it can be seen that the updates are exactly the scaling factors of the constraint gradients ∇hᵢ(x), ∇gⱼ(x) found within the quadratic penalty terms.


The use of the max function is related to dual feasibility (see Equation (3.4)), which only allows λ ≥ 0. In a sense, the update trades the gradient forces of the quadratic penalties for Lagrange multipliers κ and λ. The algorithm’s minimization step of the unconstrained L(x) can be done using gradient descent or Newton’s method, which is covered in the next subsection.

3.2.2 Newton’s Method

While Newton’s method is classically used for root finding via the well-known iterative update x ← x − f(x)/∇f(x), it also has an application in finding minima. The method is very similar to gradient descent but uses Newton steps δ = −(Hf(x))⁻¹∇f(x) instead of the steepest descent direction −∇f(x). By using the Hessian Hf(x), a step towards the root of the gradient is taken. Since Newton’s method uses second-derivative information, it generally performs better on higher-order polynomials than gradient descent. A pseudo code implementation of Newton’s method for minimum finding is shown in Listing 3.2.

Listing 3.2 Pseudo code implementation of Newton’s method

# NEWTON’S METHOD #
BEGIN
    given function f and initial guess x
    initialize stepsize a = 1
    WHILE stepsize a reasonably large
        set descent direction δ = newton step for f at x
        WHILE f(x + aδ) NOT sufficiently smaller than f(x)
            decrease stepsize a
        END WHILE
        x = x + aδ
        increase stepsize a
    END WHILE
END
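Listing 3.2 translates into a short Python sketch; the backtracking factor, Wolfe constant c, and tolerance below are illustrative choices, and the Hessian is assumed to be invertible at every iterate.

```python
import numpy as np

def newton_minimize(f, grad, hess, x, c=1e-4, tol=1e-8):
    """Newton's method with backtracking line search (first Wolfe condition)."""
    a = 1.0
    while True:
        d = -np.linalg.solve(hess(x), grad(x))     # Newton step
        # line search: shrink a until sufficient decrease is achieved
        while f(x + a * d) > f(x) + c * grad(x) @ (a * d):
            a *= 0.5
        if np.linalg.norm(a * d) < tol:            # steps have become tiny
            return x
        x = x + a * d
        a = min(1.0, 2 * a)                        # increase step size again

# Example: minimize f(x) = ||x - 3||^2; the minimum is at (3, 3)
x_min = newton_minimize(lambda x: (x - 3) @ (x - 3),
                        lambda x: 2 * (x - 3),
                        lambda x: 2 * np.eye(2),
                        np.zeros(2))
```

On this quadratic example a single full Newton step already lands on the minimum, which is why the method is exact for quadratics; for general nonlinear f the line search becomes essential.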

The outer loop of Newton’s method iteratively moves x towards the minimum of f and terminates when the steps have become so tiny that it is safe to assume x is the location of the minimum. The inner loop is called line search and is used to find a step size that satisfies the first Wolfe condition [Wol69].

f(x + aδ) ≤ f(x) + c · ∇f(x)⊤(aδ) with 0 < c < 1

This is important as it ensures convergence by comparing the decrease achieved by the Newton step with the decrease expected from a linear approximation of the function at the current location. Putting together the augmented Lagrangian method (Listing 3.1) and Newton’s method (Listing 3.2) results in the final algorithm for solving a nonlinear constraint optimization problem. From the listings, the iterative nature of the algorithm becomes clear, which explains how the series of updates to x forms a trajectory during the optimization process, the main subject of this thesis.
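The interplay of Newton steps and backtracking line search from Listing 3.2 can be sketched in Java as follows, here in one dimension for brevity. The function triple (f, f′, f″), the Armijo constant, and the termination threshold are illustrative choices, not values from the actual solver:

```java
// One-dimensional sketch of Newton's method with backtracking line search
// (cf. Listing 3.2). f, df, ddf are the function and its first and second
// derivative; the constants are illustrative choices.
final class Newton1D {
    interface Fn { double at(double x); }

    static double minimize(Fn f, Fn df, Fn ddf, double x) {
        final double c = 1e-4; // Armijo constant, 0 < c < 1
        double a = 1.0;        // step size
        while (true) {
            double delta = -df.at(x) / ddf.at(x);    // Newton step
            if (Math.abs(a * delta) < 1e-10) break;  // steps became tiny
            // line search: shrink a until the first Wolfe condition
            // f(x + a*delta) <= f(x) + c * f'(x) * (a*delta) holds
            while (f.at(x + a * delta) > f.at(x) + c * df.at(x) * a * delta)
                a *= 0.5;
            x += a * delta;
            a = Math.min(1.0, a * 2.0); // optimistically increase again
        }
        return x;
    }
}
```

On a non-quadratic function such as f(x) = (x − 2)⁴, the sketch converges to the minimizer x = 2 after several damped Newton steps.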


In this chapter, prerequisites of nonlinear constraint optimization were introduced. This included an overview of general constraint optimization as well as the more specific case of nonlinear programming and the related KKT conditions. The augmented Lagrangian method, an algorithm for solving NLPs, was introduced, as well as Newton’s method, a downhill algorithm for finding unconstrained minima.

4 Visualizing Optimization Trajectories

This chapter introduces methods used to visualize an optimization trajectory and gives a definition of what this trajectory is. Techniques for visualizing corresponding information about the evolution of the constraint values over the optimization process are also discussed. Finally, interaction techniques for exploring the visualizations in more detail are proposed.

4.1 Visual Representation

The iterative process of solving a nonlinear constraint optimization problem is a sequence of optimization steps. At each of these steps, the objective function and the constraints are evaluated in order to decide on the next step to be taken. Visualizing the optimization process as a trajectory offers a chance to see how the solution evolves during the process. It may be possible to identify different stages within the optimization process, for example the solver taking huge steps away from the initialization, slowly overcoming a plateau on the way to the minimum, or converging to the final solution. Relating such observations to the constraints or the optimization algorithm promises to be insightful and of use for a deeper understanding of the behavior of the solver when applied to a specific problem. A trajectory, as in fluid mechanics, is the trace of a single point within a flow, for example given by a time-varying flow field ⃗v(⃗x, t). Similarly, the argument x of the optimization problem moves in a time-varying flow field defined by the optimization algorithm and eventually reaches the position of the problem’s minimum. The optimization trajectory thus is the sequence of values assigned to x during the optimization process (x0, .., xn). Typically, a trajectory is visualized as a line that connects the discrete positions in their chronological order. Inspired by [Goh17], an exemplary optimization trajectory is plotted in Figure 4.1. In the plot, every point that is connected by the trajectory line corresponds to a step of the optimization algorithm.

4.2 Robot Path Trajectories

Robot motion calculation can be done by solving an NLP that describes the robot’s objective and the constraints for possible movement and environment interaction. This is done by defining a number of discrete time steps where each step represents the robot’s configuration at that time. The sense in which the term “configuration” is used here refers



Figure 4.1: Graph shows the optimization trajectory of a single optimization run. The red square marks the final location of the optimal solution.

to the robot’s state, for example the settings of its actuators to put it in a certain pose. The argument x to the optimization problem represents all configurations at once, the complete robot path so to speak. A robot path of m discrete time points is encoded in x as follows:

\[
x = \begin{pmatrix} c_1 \\ \vdots \\ c_m \end{pmatrix} \quad \text{with } c_j \in \mathbb{R}^{d_j} \text{ being the } d_j\text{-dimensional configuration at time point } j \tag{4.1}
\]

This property is leveraged to provide more context for the trajectory by splitting it up into parts that correspond to the same robot configuration. So instead of showing a single trajectory of x, multiple trajectories of the cj are visualized at once. Since x is actually a robot path, showing the optimization trajectories for each part of it gives a visual impression of the path’s evolution and puts each trajectory in the context of the others. A color gradient can be used to encode the robot configurations, which implicitly maps to time on the robot’s path. Figure 4.2 shows an example of this technique. The two notions of time and traces can get a bit confusing: there is an optimization trajectory that corresponds to the steps taken by the optimization algorithm, and there is a robot path that corresponds to the movement of the robot, which is to be calculated by the optimization. To keep things clear, the term “trajectory” will only be used in the sense of optimization steps, whereas “path” will only be used when referring to robot movement. Figure 4.4 shows the final robot path within the optimization trajectories.
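Decomposing x into its per-configuration parts can be sketched as follows (illustrative Java; the names and the flat-array layout are assumptions, not the tool’s actual data structures):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: decompose one optimization argument x into the per-time-point
// robot configurations c_1 .. c_m according to their dimensions d_j
// (Equation 4.1).
final class PathSplitter {
    static List<double[]> split(double[] x, int[] dims) {
        List<double[]> configs = new ArrayList<>();
        int offset = 0;
        for (int dj : dims) {
            double[] cj = new double[dj];
            System.arraycopy(x, offset, cj, 0, dj);
            configs.add(cj);
            offset += dj;
        }
        return configs;
    }
}
```

Collecting the j-th element of this list over all optimization steps then yields the trajectory of configuration cj.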



Figure 4.2: Graphs show optimization trajectories for each configuration of the robot path. Color encodes robot configurations, i.e. time of the robot movement path.

4.3 Dimensionality Reduction

Due to the usually large number of dimensions of the trajectory points, means to embed them in 2D space are needed in order to display them on paper or a computer screen. There are many techniques for reducing dimensionality to be found in the literature, such as principal component analysis (PCA) [Hot33; Pea01], multidimensional scaling (MDS) [Kru64; Tor58], or t-distributed stochastic neighbor embedding (t-SNE) [MH08]. All of the displayed trajectories in this chapter used some form of dimensionality reduction. Most plots employ PCA, but t-SNE was used for example in Figure 4.1, and Isomap was used in Figure 4.2. Depending on the properties of the used reduction method, different patterns become visible. Since PCA preserves the global variance of the data, the projection of the trajectory will expose it in its most stretched out way. Being a linear projection, PCA may not work well with trajectories that correspond to nonlinear geometry. When traveling on a Swiss-roll-like manifold, the resulting projection may convey a wrong sense of circular

motion. Using t-SNE, on the other hand, preserves more of the local structure of the data, which means that the relation of neighboring points is more visible. Since it minimizes the divergence between probability distributions for the low-dimensional embedding of points, it can easily cope with nonlinear geometries. MDS also preserves similarity between points but depends on the used similarity measure. Due to this, MDS is highly customizable and can adapt to any dataset. Depending on the semantics of the optimization argument vector x, a fitting similarity measure could be defined which might yield superior results compared to other techniques. For displaying all the trajectories per configuration, all points of all configuration trajectories have to be reduced jointly in order to relate to each other. For n − 1 optimization steps and m − 1 robot time steps, there are a total of n · m points, n points for each of the m configuration trajectories. The dataset for learning the projection or embedding thus consists of

\[
X = \begin{pmatrix} c_{1,1} \\ \vdots \\ c_{m,1} \\ \vdots \\ c_{m,n} \end{pmatrix} \quad \text{where } c_{j,i} \text{ is the configuration vector for robot time } j \text{ at optimization time } i.
\]
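Assembling such a joint dataset from the per-configuration trajectories could look like the following sketch (illustrative Java, assuming a single path section where all configurations share the same dimension):

```java
// Sketch: collect the points of all m configuration trajectories into one
// (m*n) x d matrix so that a single projection relates them to each other.
// trajectories[j][i] holds configuration c_{j,i} (robot time j, opt. time i).
final class JointStacker {
    static double[][] stack(double[][][] trajectories) {
        int m = trajectories.length;        // robot time points
        int n = trajectories[0].length;     // optimization steps
        double[][] X = new double[m * n][];
        for (int j = 0; j < m; j++)
            for (int i = 0; i < n; i++)
                X[j * n + i] = trajectories[j][i].clone();
        return X;
    }
}
```

The resulting matrix X is what a reduction technique such as PCA would be fitted on.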

Instead of projecting the trajectory points to 2D, using a 1D projection gives the opportunity to map the second axis to another data attribute. Similar to temporal MDS [JFSK16], the x-axis is used to encode time, optimization time in this case. This gives a clear impression of the change of a configuration for each optimization step and makes it easy to perceive time. Figure 4.3 shows an example of this technique. In this view it is also possible to observe sudden changes in the direction taken by the optimizer. However, these could be related to a step size reduction during line search and in this case correspond to a backtracking step taken by the algorithm. To discern line search steps from regular steps, the corresponding segments can be highlighted, as also seen in Figure 4.3.

Path Sections Another important aspect of the robot motion optimization problems that has to be taken special care of is the fact that each robot configuration vector cj can have its own dimensionality dj, as pointed out in Equation (4.1). However, it is common that whole subsequences of the robot path (cj1 .. cj2−1) with 1 ≤ j1 < j2 ≤ m have the same dimension. This is related to modeling the robot’s task as an optimization problem. For example, a task could be a robot grabbing a box with its end effector and then placing it somewhere else. The robot’s movement path increases in dimensionality as soon as the box is grabbed, to be able to model the additional geometry of the box that was just attached to the robot. This change in dimensionality is needed in order to express constraints on moving around with the box or for placing the box on some surface. Since robot configurations may differ in dimensionality, they are grouped into parts of consecutively equal dimensions. These parts define sections of the robot’s path that relate to

30 4.3 Dimensionality Reduction


Figure 4.3: Graphs show optimization trajectories per configuration where positions are projected onto the y-axis using PCA and optimization time is mapped to the x-axis. The magenta colored segments indicate step size reductions due to line search.

different stages of the robot’s task and therefore may correspond to different objectives and constraints. Due to the change in dimensionality, the dimensionality reduction needs to be done separately for each section of the robot path. Unfortunately, this creates unrelated 2D embeddings for each path section and a disconnection of path and trajectories. Displaying the individual parts next to each other still conveys a sense of robot time progression. See Figure 4.4 as an example of a trajectory visualization with changing dimensionality.



Figure 4.4: Graph shows optimization trajectory and final robot path for an optimization problem where dimensions of configurations change along the robot’s movement path, resulting in four separate sections. Solid lines are optimization trajectories, stippled lines correspond to the robot’s path.


4.4 Constraints Visualization

Since the trajectories originate from solving a constraint optimization problem, it makes sense to include some information about the constraints. Unfortunately, there is no information about the constraint functions available from the log file, which makes it impossible to sample them for contours as was done in the neat 2D example (Figure 3.1) earlier in Chapter 3. However, it is still possible to visualize the feasibility of the points of the trajectory. The log file provides the values for each feature included in the augmented Lagrangian term in the graphQuery elements. It is also known which constraint corresponds to which robot configurations, so that each trajectory point can be encoded with information on whether it violates a constraint or not. Figure 4.5 shows an example where such infeasibility indicators were used.


Figure 4.5: Graphs show optimization trajectories for each configuration of the robot path. The red dots indicate a constraint violation for the configuration. For dimensionality reduction, two of eleven dimensions were selected.

Using the feature values ϕj(x), information other than binary feasibility can be encoded. For example, the severity of a violation can be calculated by summing up the absolute feature values of all active constraints of a configuration. Encoding the number of violated constraints will also give a better impression of how bad a point on the trajectory really is. While the optimization trajectory is an interesting quantity to visualize, there are still many other related aspects of the optimization problem that are left out. It is also impossible to encode all of the different variables corresponding to the individual steps of the optimizer within a trajectory plot, which is why supplementary views provide a way to convey additional information. Some information may also be easier to observe when not encoded within the trajectory visualization.
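A hypothetical computation of such severity and count values could look like this (the data layout of feature values and violation flags is an assumption for illustration):

```java
// Sketch: per-configuration violation severity, computed by summing the
// absolute feature values of all violated constraints of that configuration.
// phi holds the evaluated feature values, 'violated' flags which of them
// count as constraint violations.
final class Severity {
    static double severity(double[] phi, boolean[] violated) {
        double s = 0.0;
        for (int k = 0; k < phi.length; k++)
            if (violated[k]) s += Math.abs(phi[k]);
        return s;
    }

    static int violationCount(boolean[] violated) {
        int c = 0;
        for (boolean v : violated) if (v) c++;
        return c;
    }
}
```

Either quantity could then be mapped to the size or color of a violation indicator.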


The feasibility indicators on the trajectory points were already introduced in Figure 4.5. For a quick overview of the optimization process with respect to constraint feasibility, a heatmap-like layout gives a clear impression. In Figure 4.6, the number of constraint violations per optimization step and robot configuration is shown. From this visualization it is possible to identify parts of the path which are harder to optimize than others, as well as the time step at which the optimizer made it into the feasible region.


Figure 4.6: Graph shows constraint violations during the optimization process for each con- figuration (x-axis are optimization steps, y-axis robot configurations). Orange indicates few, dark red many violations.


(a) constraint group aggregations (b) all constraints of a single group

Figure 4.7: Graphs show the development of equality constraint function values during the optimization process. The x-axis maps to optimization time i, y-axis to function value ϕ(xi). Stippled lines correspond to group aggregations, solid lines are actual constraints.

The evolution of the constraint values can be shown as a classic line chart. Using optimization time steps i on the x-axis and values ϕk(xi) (that correspond to values of

constraint functions) on the y-axis shows how the intermediate solution relates to the constraints during the optimization. When interpreting these values as the distance of the optimization trajectory to the constraints, the lines show the optimizer approaching and leaving boundaries and feasible regions. In the case of a large number of constraints, it makes sense to aggregate lines in order to avoid visual clutter. The log files from the KOMO framework [Tou14b] usually contain constraints with identical labels but differing robot configurations that they apply to. Grouping them by labels yields sets of constraints with the same semantics, which can then be aggregated. To discern different groups in the line chart, a qualitative color mapping is used. A visual cue showing the threshold within which a value is considered feasible is realized as a white area around zero (for equalities). Figure 4.7 shows an example of this visualization method. For the aggregation of equality constraint groups, a signed maximum function

\[
\operatorname{signmax}(a, b) = \begin{cases} a & \text{if } |a| > |b| \\ b & \text{else} \end{cases}
\]

was used. This aggregation shows the furthest deviations of the constraints from zero, which gives an impression of the degree of violation and whether constraints may alternate around zero. For inequalities, a simple maximum aggregation is sufficient, since we are only interested in values above zero, as they indicate a violation and thus contribute to the optimization steps taken.
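A minimal sketch of these aggregation functions in Java (illustrative; the tool’s actual code may differ):

```java
// Sketch of the aggregation functions used for the constraint curves:
// signmax keeps the value that is furthest from zero (for equalities),
// a plain maximum is used for inequalities.
final class Aggregation {
    static double signmax(double a, double b) {
        return Math.abs(a) > Math.abs(b) ? a : b;
    }

    // aggregate one group of equality constraint values at a time step
    static double aggregateEqualities(double[] values) {
        double agg = 0.0;
        for (double v : values) agg = signmax(agg, v);
        return agg;
    }

    // aggregate one group of inequality constraint values at a time step
    static double aggregateInequalities(double[] values) {
        double agg = Double.NEGATIVE_INFINITY;
        for (double v : values) agg = Math.max(agg, v);
        return agg;
    }
}
```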

4.5 Explore Through Interaction

Many of the previously introduced visualization methods can be enhanced by interaction techniques to make details visually accessible and thus give the user the possibility to explore the optimization dataset in more depth.


(a) zoomed in a little (b) further zoomed in

Figure 4.8: Graphs show zoomed in versions of Figure 4.3. Left graph shows the dense upper part. Another zooming operation (right) reveals more of the upper right corner.

Zooming & Panning Some of the most basic interaction techniques like zooming and panning are a great extension to the trajectory views introduced earlier. They provide

the ability to adapt the view so that the tiniest trajectory changes can be observed, which is necessary as the step size of the optimizer naturally decreases towards the end of the optimization process. For the same reason, the constraint value views from Figure 4.7 benefit from this technique. Also, different constraints may be scaled differently and thus the view needs to be adaptable to be able to frame curves as desired. Figure 4.8 illustrates how zooming can reveal small details.

Filtering Another common interaction technique is filtering. Through filtering, the user can select certain parts of the displayed data that are of special interest. In this scenario, the user may be interested in a specific optimization step, a selection of robot configurations, or a group of constraints. Convenient access to filterable quantities can be provided by additional views. For access to the constraints, a tree view (similar to a file system tree) can be used to show the constraint groups and the contained members. The group nodes can show the constraint label, type (equality or inequality), and number of members; the member nodes can show the constraint’s referenced robot configurations. A list or table view can be used to display the individual robot configurations j and their corresponding dimensionality dj. The sequence of log entries can also be displayed in a list view for quickly accessing steps of the optimization run. Figure 4.9 illustrates these GUI elements.


Figure 4.9: Graphical user interface views for convenient filtering and a better overview of the optimization problem. Left is a tree view of the constraints, middle is a list of the robot configurations and their dimensions, right is a list of the log entries, where each graphQuery element corresponds to an optimization step.

These graphical user interface elements can be used for exploring the dataset interactively. How interaction can be used to create a better understanding is covered in the next section. In the following, the realization of filtering for these aspects of an optimization run is explained.


4.5.1 Filtering Optimization Steps

Since the optimization algorithm works its way to the solution step by step, what was happening at a specific step and what properties the intermediate solution had at that point can be of interest to an analyst. Selecting a specific optimization step could be done by dragging a slider to the desired optimization time step of the run or by selecting a graphQuery entry from the earlier mentioned log entry list view.



Figure 4.10: Graph shows optimization trajectories and the robot path for a selected optimization step. The stippled black line represents the robot path.

When an optimization step has been selected, the views need to be updated to show what that optimization step relates to. In the trajectory view, the intermediate robot path that corresponds to the selected step can be shown, as in Figure 4.10. Also, the segment thickness of the trajectories corresponding to the step can be increased so that they are more noticeable. This also leads to an interesting wave-like animation effect when scrolling through the optimization steps, which reveals fast and slow moving phases of the process. In the other views (which need to be linked), such as the constraint curves or the constraint violations view (Figures 4.6 and 4.7), a vertical line that indicates the currently selected step helps to relate the intermediate solution to the constraints. See Figure 4.12 for reference.

4.5.2 Highlighting Robot Configurations

Many aspects of the optimization process have a relation to specific robot configurations, i.e. robot time on its movement path: the robot path xi itself, which is the sequence of configurations (c1 .. cm); the constraints, each of which corresponds to its own set of robot time steps; and the possible change in dimension of the robot’s path. Also, while splitting up the optimization trajectory into several per-configuration trajectories reveals a better representation of the solution (i.e. the robot path), it introduces more visual elements and is therefore prone to visual clutter. Thus, filtering for certain robot configurations can be

used to overcome clutter and helps to make a connection between individual robot time points and the performance of the optimization process.


Figure 4.11: Graphs show an optimization trajectory for a single selected configuration of the robot path. The other trajectories are grayed out. The shrinking effect on non-corresponding violation indicators can also be seen.

Selecting one or more robot configurations can be performed by clicking the corresponding trajectories or by choosing from the earlier mentioned list view of configurations. A more sophisticated mechanism for implicit selection could be realized through constraint filtering using the corresponding configurations. In the trajectory views, the trajectories corresponding to the selection need to be highlighted. This can be done by increasing the line width, changing color, or by hiding the trajectories not corresponding to the selection. Hiding trajectories can be done aggressively by completely removing their elements, or more softly by making them translucent so that they become less visible. For a combined selection of optimization time and robot time, a marker can be placed on the corresponding trajectory points for the user to locate the combinations quickly. Figure 4.11 illustrates the highlighting technique and shows such a marker at the end of the trajectory with a green inline that indicates feasibility for this point. It also shows that constraint violation indicators for unrelated trajectories are shrunk to dot size. Linking the configuration selection to the filtering of corresponding constraints leads to secondary effects in the supplementary views, as will be explained in the next part.

4.5.3 Filtering Constraints

For assessing optimization performance, the general development of the constraint values and the way constraints affect the optimization trajectory are key insights to be gained. However, when dealing with many constraints, analyzing the behavior of a single constraint or group requires filtering to discern it from the others.

37 4 Visualizing Optimization Trajectories



Figure 4.12: Graphs show constraint violations for different constraint groups. The stippled vertical line indicates the currently selected optimization step.

Filtering for constraint groups can be done by selecting them in the earlier mentioned tree view for constraints, or by clicking the corresponding constraint aggregate curve in the constraint curve view. This can be done analogously for individual constraints, but the group node has to be expanded beforehand. Similar to the configuration selection, which highlights the corresponding trajectory, the constraint curves can be highlighted using increased line widths and hiding of unselected curves. When expanding, the aggregates vanish from the view and instead the actual constraints are displayed (see Figure 4.7). Selecting a group also needs to affect the violation indicators in the trajectory view and the constraint violations overview. As shown in Figure 4.12, only the indicators for which the constraint group was violated are shown. Selecting an individual constraint differs in that it references the concrete robot configurations it applies to. Thus, highlighting the corresponding configurations shows what the constraint refers to. This can be done by linking the constraint selection to the configuration filtering; however, special care has to be taken in case the inverse linking is also used, to avoid a cascading selection effect. The effect of configuration filtering on the violation indicators can be seen in Figure 4.11, where the indicators that correspond to the same group but different configurations are shrunk to dots.

This chapter introduced techniques for visualizing optimization trajectories. The different methods were illustrated and explained in order to describe their properties. Methods for visualizing constraints to supplement the trajectory views and techniques for interactively exploring the visualizations were also discussed. The presented techniques form the basis for building a visual analytics system for optimization runs of robot motion planning problems.

5 Implementation

In this chapter the implementation of the proposed visualization tool is explained. The general design of the software is discussed and details about the dataset that is subject to visualization are given.

5.1 Architecture

The visualization software was realized as a standalone desktop application. The software architecture uses the model-view-controller (MVC) pattern, which divides its components into three categories. The model components contain the data (i.e. the contents of the log file and the trajectories) as well as the configuration of the application (i.e. the set filters). The view components realize the graphical appearance of the model, which comprises the different views and their graphical elements (e.g. list items, trajectory segments, markers). The controller components process user input events and manipulate the model. Upon changes in the model, the model components update the view, e.g. change trajectory segments or update list elements. Figure 5.1 illustrates the MVC pattern.

Figure 5.1: Block diagram of the Model-View-Controller pattern.

The view component, which is the graphical user interface (GUI), consists of a trajectory view, two constraint curve views for equalities and inequalities, as well as the list and tree views for overview and filtering. Apart from these elements, a slider for selecting the optimization step is provided, as well as a combo box for selecting the desired trajectory view

mode (i.e. the method used for laying out the trajectories, such as PCA). See Figure 6.1 for reference. The model component comprises the log file contents, which will be described in detail in Section 5.2, as well as selection models for constraints, optimization steps, and robot configurations. For the off-the-shelf Swing components, the list and tree views, there are custom content models. The controller component comprises all of the action handlers and event listeners attached to the GUI elements. Mouse event listeners interface with the plot views, which process clicks or dragging motions and translate them to selections or zooming, for example.

5.2 Dataset Description

The KOMO framework, which solves NLPs for calculating robot movement, outputs a detailed log file about the optimization process, including information about evaluations of the augmented Lagrangian, dual variable updates, line search acceptance, and of course updates to the robot path x. The log file is YAML formatted and consists of a sequence of different entries. In order to give an impression of the information available for visualization and to make implementation details more understandable, this section describes the file contents in more detail.

The first entry, the graphStructureQuery, describes the optimization problem. It contains information about the number of configurations (i.e. the number of time steps) and the dimensionality dj of each configuration cj. It also contains information about the features ϕ(x) of the augmented Lagrangian, which is a generalization of terms that contribute to the Lagrangian, such as inequality constraints, equality constraints, and parts of the objective function. Each feature ϕk has a label, a type (equality, inequality, objective), and a list of time steps that it corresponds to. Unfortunately, there is no information about the meaning of the dimensions available in the log, and the feature functions such as the constraints are black boxes as well. The following entries describe the optimization process and are of the following types.

lagrangianQuery: This entry describes an evaluation of the augmented Lagrangian function. It reports the value of the objective term, the inequality term, and the equality term.

graphQuery: This entry describes a detailed evaluation of the augmented Lagrangian function. It contains the current robot path x and the feature values ϕk(x).

newton: This entry signals that an iteration of the Newton method is in progress and includes information about the current step size a and step direction δ.

lineSearch: This entry signals that the algorithm is in the line search state and tests for the Wolfe condition. It contains information about the current step size a and whether the Wolfe condition is met.

optConstraint: This entry corresponds to an alteration of the quadratic penalty variables µ, ν, which results in a different weighting of the constraints.

A usual sequence of entries would be: 1 graphQuery, 2 lagrangianQuery, 3 lineSearch, 4 graphQuery, 5 lagrangianQuery, 6 lineSearch, 7 newton, 8 graphQuery.

In this sequence, the graphQuery and lagrangianQuery entries can be related to the function evaluations needed to test the Wolfe condition (see Listing 3.2). The first lineSearch entry is not followed by newton, which means that the Wolfe condition has not been met and the step size is reduced.

Extracting the robot path x from all graphQuery elements results in the sequence (x1, .., xn), which is the optimization trajectory. The rejected line search locations xcurrent + aδ are intentionally kept as part of the optimization sequence, even though the algorithm never actually takes a step to these locations.
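Extracting the trajectory from the parsed entries can be sketched as follows (illustrative Java; the key names and nesting are assumptions based on the description above, not the exact KOMO log schema):

```java
import java.util.*;

// Sketch: extract the optimization trajectory (x_1, .., x_n) from the parsed
// log, assuming each entry is a key-value map as produced by the YAML parser
// and each graphQuery entry carries the current robot path under key "x".
final class TrajectoryExtractor {
    @SuppressWarnings("unchecked")
    static List<List<Double>> extract(List<Map<String, Object>> entries) {
        List<List<Double>> trajectory = new ArrayList<>();
        for (Map<String, Object> entry : entries) {
            if (entry.containsKey("graphQuery")) {
                Map<String, Object> gq = (Map<String, Object>) entry.get("graphQuery");
                trajectory.add((List<Double>) gq.get("x"));
            }
        }
        return trajectory;
    }
}
```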

5.2.1 Log Parsing Pipeline

For extracting the relevant information from the YAML formatted KOMO log file, a simple pipeline is used. The log file is initially parsed by a Jackson parser, which yields a list of key-value maps that represent the various entries of the log. For reference, an excerpt from a log file is listed in Listing 5.1. From the first element, the graphStructureQuery, the following information is retrieved: the number of robot time steps (i.e. the number of configurations), the dimensions of the individual robot configurations, the labels of the configurations (e.g. “config_1”), the number of features of the problem, the feature types (objectives, inequalities, and equalities), the feature labels, and the features’ referenced configurations. During this process, a few supporting data structures are created, such as index ranges of subsequent configurations sharing the same number of dimensions, and index sets of features sharing the same type. The following elements of the log describe the optimization process, from which the optimization trajectory can be extracted. While iterating through the remaining list elements, the type of each element is determined. Whenever a graphQuery element is identified, its index in the list is remembered so that random access to the individual queries is possible later on. The graph query elements contain the trajectory locations x and the feature values ϕ(x) as lists of numbers. Since lists of boxed primitive types in Java require a lot of memory, these lists are replaced by primitive double arrays, whereas the large vector

41 5 Implementation

x is also decomposed into its smaller parts the configurations cj (see Equation (4.1)) for easier access later on.
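The two transformations described above, unboxing the parsed number lists and decomposing x into per-configuration vectors, can be sketched as follows. This is a simplified illustration with hypothetical method names, assuming the configuration dimensions from the graphStructureQuery are available as an int array:

```java
import java.util.Arrays;
import java.util.List;

public class TrajectoryData {

    /** Converts a list of boxed numbers (as produced by the parser)
     *  into a memory-friendly primitive double array. */
    static double[] toPrimitive(List<? extends Number> boxed) {
        double[] arr = new double[boxed.size()];
        for (int i = 0; i < arr.length; i++)
            arr[i] = boxed.get(i).doubleValue();
        return arr;
    }

    /** Decomposes the large vector x into the per-configuration
     *  vectors c_j according to the configuration dimensions. */
    static double[][] splitIntoConfigurations(double[] x, int[] dims) {
        double[][] configs = new double[dims.length][];
        int offset = 0;
        for (int j = 0; j < dims.length; j++) {
            configs[j] = Arrays.copyOfRange(x, offset, offset + dims[j]);
            offset += dims[j];
        }
        return configs;
    }
}
```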

5.3 Dimensionality Reduction Considerations

For dimensionality reduction, only PCA was kept among other possible techniques such as MDS, t-SNE, or Isomap. This was mainly done for reasons of performance and memory requirements. While there is a very fast and memory-saving solution for PCA using the singular value decomposition, the implementations of the other dimensionality reduction techniques required more memory (e.g., for the calculation of the dissimilarity matrix in MDS) or were too slow for an on-demand reduction approach. Especially since the t-SNE implementation did not make use of the Barnes-Hut algorithm, it took several minutes to converge to a reasonable result.

Memory and speed have to be considered since the datasets for dimensionality reduction are not small. For a log file containing 170 optimization steps and a robot path section of 100 time steps, there are 17,000 high-dimensional points to be considered for dimensionality reduction. The corresponding upper triangular dissimilarity matrix in double precision (discarding half of the symmetric full matrix) already amounts to roughly 1 GB of RAM. For PCA, the data was normalized in advance to have mean µ = 0 and variance σ² = 1. This was done since the meaning of the dimensions is not known and their scales could otherwise vary by orders of magnitude depending on the used log file and corresponding optimization problem.
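The normalization step can be sketched as follows; this is a minimal column-wise z-score implementation for illustration, not the SMILE/EJML code actually used by the tool:

```java
public class ZScoreNormalization {

    /** Normalizes each column (dimension) of the data matrix in place
     *  to mean 0 and variance 1; rows are the high-dimensional points.
     *  Constant columns (standard deviation 0) are set to 0. */
    static void normalize(double[][] data) {
        int n = data.length, d = data[0].length;
        for (int col = 0; col < d; col++) {
            double mean = 0;
            for (double[] row : data) mean += row[col];
            mean /= n;
            double var = 0;
            for (double[] row : data) var += (row[col] - mean) * (row[col] - mean);
            var /= n;
            double sd = Math.sqrt(var);
            for (double[] row : data)
                row[col] = sd > 0 ? (row[col] - mean) / sd : 0;
        }
    }
}
```

After this step, every dimension contributes on the same scale to the principal components regardless of its original unit.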

5.4 Used Technologies

The KOMO log visualization tool is a desktop application written in Java 8. It is based on the well-known graphical user interface toolkits Swing and the Abstract Window Toolkit (AWT). It uses the library JPlotter [Häg19] for creating plots and charts, the Statistical Machine Intelligence & Learning Engine (SMILE) [Li16] for dimensionality reduction tasks, and the Efficient Java Matrix Library (EJML) [Abe10] for other linear algebra and matrix operations. For parsing the YAML-formatted log files, the Jackson library [Fas12] is used. The JPlotter library for scientific plotting emerged as a by-product of this thesis and was developed and refined during the research and implementation process. The library depends on OpenGL for efficient rendering of many graphical elements. Executing the visualization tool therefore requires OpenGL capabilities, specifically GL version 3.3 or later. Due to JPlotter's dependencies, the currently supported operating systems are Microsoft Windows and Linux. The software uses the Apache Maven project management tool, which allows for convenient specification of library dependencies and also serves as a build tool.
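As an illustration of the Maven-based dependency specification, an excerpt of a pom.xml declaring the mentioned libraries could look roughly like the following sketch. The group and artifact coordinates are the libraries' published Maven coordinates, but the version numbers shown here are only illustrative and not taken from the actual project:

```xml
<dependencies>
  <!-- YAML parsing of the KOMO log files -->
  <dependency>
    <groupId>com.fasterxml.jackson.dataformat</groupId>
    <artifactId>jackson-dataformat-yaml</artifactId>
    <version>2.9.9</version>
  </dependency>
  <!-- dimensionality reduction (PCA) -->
  <dependency>
    <groupId>com.github.haifengl</groupId>
    <artifactId>smile-core</artifactId>
    <version>1.5.3</version>
  </dependency>
  <!-- linear algebra and matrix operations -->
  <dependency>
    <groupId>org.ejml</groupId>
    <artifactId>ejml-all</artifactId>
    <version>0.38</version>
  </dependency>
</dependencies>
```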


This implementation chapter explained the architecture and technical details of the realized visualization software. A description of the log files coming from the KOMO framework for robot motion path optimization was given in order to convey a detailed impression of the contained information. The final appearance of the software, as well as a presentation of its utility, can be seen in the following chapter.


Listing 5.1 Example KOMO log file content

[{ graphStructureQuery: True, numVariables: 100,
   variableNames: ["config_0", "config_1", "config_2", "config_3", "config_4", ....
   variableDimensions: [11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, ...
   numFeatures: 1236,
   featureNames: ["Transition#pos0#vel0#acc1", "Transition#pos0#vel0#acc1", ......
   featureVariables: [[-2, -1, 0], [-2, -1, 0], [-2, -1, 0], [-2, -1, 0], [-2, ....
   featureTypes: [2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, ..
},{
   graphQuery: 0,
   errors: [5.55383e+08, 0, 0],
   x: [0.00251275, -0.00967902, 1.56985, 0.0913059, -1.00232, 0.498743, -0.9847, ..
   lambda: [],
   phi: [0.7946, -3.06077, -0.298813, -824.796, -0.0733272, -0.0397408, 0.48218, ..
},{
   newton: 0, evaluations: 1, f_x: 5.55383e+08, alpha: 1
},{
   optConstraint: 0, mu: 1, nu: 1, L_x: 5.55383e+08, errors: [5.55383e+08, 0, 0], lambda: []
},{
   graphQuery: 1,
   errors: [3.17827e+08, 0, 1.05841],
   x: [0.00191178, -0.0073299, 1.57007, 0.0934231, -1.00177, 0.49904, -0.988465, ..
   lambda: [],
   phi: [0.604559, -2.31792, -0.228457, -623.942, -0.0560238, -0.0303629, 0.364, ..
},{
   lagrangianQuery: True, errors: [3.17827e+08, 0, 1.05841]
},{
   lineSearch: 0, alpha: 1, beta: 1, f_x: 5.55383e+08, f_y: 3.17827e+08, wolfe: 1, accept: True
},{
   newton: 1, evaluations: 2, f_x: 3.17827e+08, alpha: 1.5,
   Delta: [-0.000600961, 0.00234912, 0.000222484, 0.00211719, 0.000547182, ......
},{ ...

6 Evaluation

In this chapter, a brief overview of the final visualization tool is given to show how the techniques introduced in Chapter 4 have been incorporated. Following this, an analysis use case scenario is walked through to validate the viability of the presented approach by presenting the gained insights.

6.1 Visualization Tool Overview

In this section, the optimization log visualization tool is presented. The different views and control elements are briefly explained in order to give an impression of its functionality. A screenshot of the tool is shown in Figure 6.1. The figure is annotated with numbers for the different parts of the user interface; each of these parts is covered in the following.


Figure 6.1: Screenshot with annotation numbers of the KOMO log visualization tool. The different views and GUI items are explained in Section 6.1.

1 - Trajectory View: The trajectory view shows the optimization trajectories and the robot path corresponding to the currently selected optimization step. The layout of the trajectories is defined by the selected trajectory mode (8). The view can display the plots that were already introduced in Figures 4.3 to 4.6. Clicking on a trajectory in this view selects the corresponding configuration.

2 - Constraint Curves: These views show the constraint curves already presented in Figure 4.7. The left view shows the evolution of inequality constraints, the right view that of equality constraints. A constraint group aggregation curve can be expanded either by right-clicking it or by expanding its corresponding node in the constraint tree view (3). On expansion, the contained constraints become visible and the aggregate curve vanishes, as is the case for the left view in Figure 6.1. A single constraint or group can be selected by left-clicking it.

3 - Constraint Tree: The constraint tree view shows the different constraint groups found in the log and their contained constraints. A group node shows the constraint label, the number of constraints belonging to it, and the corresponding color. A leaf node shows the constraint's index and the configurations it corresponds to. When a group node is expanded, the respective aggregate curve in the constraint curve view (2) is expanded as well. Constraints can be selected in the tree, which triggers an automatic selection of the corresponding robot configurations. These selections are then reflected by highlighting the configurations and trajectories in views 1 and 4.

4 - Configurations List: This list shows all robot configurations (i.e., discrete time points on the robot's movement path), the path section each belongs to (see Section 4.3), and the corresponding number of dimensions. From this view, a selection of configurations can be made, which is reflected in trajectory highlighting. When the "autoselect features" box is checked, all constraints that reference the selected configurations become selected as well.
5 - Log Entry List: This list shows the sequence of entry types in the log file. Each graphQuery entry is associated with an optimization step of the algorithm; the other entries show what the algorithm is doing in between steps. When no newton element appears between graph queries, the step corresponds to a backtracking during line search and is highlighted in magenta. Selecting a graphQuery list element sets the optimization step filter to the corresponding step.

6 - Detail Checkboxes: These checkboxes determine what information is shown in the trajectory view (1). The violations and line search indicators have already been discussed and illustrated in Figures 4.3 and 4.5.

7 - Optimization Step Slider: Using this slider, an optimization step can be selected. The selection is reflected in the trajectory view (1), the constraint curve views (2), and the log entry list (5): the trajectory view shows the corresponding robot path and the curve views show a vertical line marking the selected step.

8 - Trajectory Mode: From this drop-down list, the layout strategy for the trajectory view (1) can be selected. It is possible to select combinations of actual dimensions, PCA projection, mixed PCA and optimization steps (see Figure 4.3), or the violations view (see Figure 4.6).


6.2 Use Case Scenarios

In this section, the presented methods are evaluated based on a use case scenario. The scenario showcases the analysis of a single log file of a robot movement optimization problem. To give the reader an idea of the optimization problem, the robot's task that is modeled is described. Imagine a robot arm with some kind of gripper at its end. It is located in front of a flat surface on which a box is placed. Right next to the surface is a stick. The robot's task is to grab the stick and then to push the box in a circular motion to a specified location using only the stick. This is a very brief version of the task, and there are presumably more specific objectives regarding the exact way of placing the stick on the box or grabbing the stick, but this should be enough to understand what the optimization problem refers to. Figure 6.2 shows frames of a video sequence that was created using the optimized path. It shows an abstract animation of the robot performing the specified task.


Figure 6.2: Selected frames of a video sequence showing the movement of a robot to complete a special task. Frames are in row major order, from left to right. The yellow object represents an end effector, i.e. the robot’s hand. (a-b) moving to the stick; (c-d) moving with the stick to the box; (e-g) moving the box with the stick; (h) retracting hand to final position.

In the remainder of this section, the analysis of the log file corresponding to the problem just introduced is illustrated and explained.


Identifying robot movement phases In the beginning of this analysis we try to relate the optimization problem to the visualizations. We’ll start with the trajectory view that is divided into 4 parts as in Figure 6.1. These parts can be related to the different phases of the robot’s task. When analyzing the video sequence we can identify 4 phases which are: 1) The robot moving to the stick to grab it, 2) the robot moving the stick to the box, 3) the robot moving the box with the stick and 4) the robot retracting its hand away from the box.

Robot path evolution

part 1 part 2 part 3 part 4

Figure 6.3: Robot path sections corresponding to the video sequence in Figure 6.2. The squares mark the beginnings of each of the path parts.

Relating phases to the visualization When hiding the optimization trajectories for a moment, we can see the robot's movement path clearly, as shown in Figure 6.3. We also selected the first configuration of each part from the configurations list; these are marked by squares, so we know where sub-paths start and end. We can see that all parts, except for part 3, are quite straight, which we can verify from the video sequence. In part 3 of the path, we can observe a turning motion, which can be directly related to animation frames 4, 5, and 6 of Figure 6.2, where the robot moves the box around. To verify our assumption that these 4 parts correspond to the 4 phases we identified, we compared the configuration numbers in our list view with the numbers shown in the frames of the video sequence. We observed that the changes in the number of dimensions happen at configurations corresponding to frames 2, 4, and 6, which are the starting frames for phases 2, 3, and 4.

Getting a sense of velocity We can even get a sense of the velocity of the movement when selecting all configurations, as shown in Figure 6.4. In the first part, we can clearly observe acceleration in the beginning and deceleration at the end, where the robot approaches the stick in order to grab it. Similar behavior can be seen in part 2, where the robot approaches the box, and in part 3, where it pushes the box to its final location. Of course, this discussion only holds for equidistant robot time steps, which we are assuming. Also, since we are using a projection (PCA in this case), some dimensions may not be well represented in these plots, which is why we cannot deduce the behavior of every dimension.


Robot path evolution

part 1 part 2 part 3 part 4

Figure 6.4: Robot path sections with all configurations highlighted. Denser regions of the path correspond to lower velocity.

Comparison of initial and final solution As a next step, we want to explore how this path evolved during the optimization process. For this, we enable the trajectories again and zoom in on the first part of the path. For a first impression, we compare the initial and final robot path, as shown in Figure 6.5. From the comparison, we can see that the initial guess of the robot's path is very chaotic and seems like it was randomly generated. Observing the trajectories shows how the solver starts to organize the path and brings it out of its chaotic form into a state where subsequent configurations are adjacent to each other. We can also see that the orange trajectories, which correspond to later configurations (in robot time), travel further than the purple trajectories, which relate to the start of the robot's movement path in this phase. This could be explained by an initial guess close to the starting configuration of the robot, or by the first robot time steps not being subject to as many constraints and cost function features as later configurations, which makes them easier to get right early on in the optimization. We will later study the trajectories with respect to constraints.


(a) initial path (b) final path

Figure 6.5: Graphs show robot paths of initial guess and final solution with optimization trajectories. Color gradient indicates robot time from dark to bright (purple to orange).


Analyzing the path's evolution To get a better understanding of the path's evolution, we can browse through the optimization steps to get a sense of the amount of progress made during different parts of the process. We observe that the path quickly gets from its chaotic form into a reasonable shape after only 17 optimization steps. Then the optimizer seems to go back and forth, as if it were trying to overcome an obstacle or find the way down from a plateau, until it finally walks towards the final solution from around step 58 onwards. Figure 6.6 shows the paths for these two optimization steps to illustrate the region of back and forth movement in the trajectories. To make sure the algorithm was not instead moving forward in other parts of the path, we analyzed their path evolutions as well and found that they also seem to be stuck during this part of the optimization.


(a) path at step 17 (of 173) (b) path at step 58 (of 173)

Figure 6.6: Graphs show robot paths at optimization steps 17 and 58 together with optimization trajectories.

Analyzing constraint evolution Even though the algorithm does not seem to make a lot of progress between steps 17 and 58 trajectory-wise, there may still be something going on with respect to the constraints. Looking into the constraint curves reveals that the inequality constraints do not change much during these steps; the equality constraints, on the other hand, change a lot. The respective plots are shown in Figure 6.7. Especially the constraint groups associated with the light blue and light purple colors change a lot, moving towards zero (i.e., towards feasibility). The green curve, on the other hand, moves away from zero, which means that these constraints are violated more strongly. We can also see that the constraint violations were huge at the beginning of the optimization process.

Exploring a constraint in detail We now want to get some more information about the equality constraints corresponding to the light blue line in Figure 6.7. The constraint tree tells us its label, "F_Pose-stick", which is related to the objective of the robot approaching or grabbing the stick. To see where these constraints get violated along the optimization trajectories, we select the group and enable the violation indicators. For a better sense of optimization time, we switch the trajectory view to display a 1-dimensional PCA over optimization steps. The result can be seen in Figure 6.8. We see that the constraints refer to the last two configurations of the first robot phase and the first two configurations of the second robot phase, which correspond exactly to the time when the stick is retrieved (frame 2 in Figure 6.2). Similarly to Figure 6.6, we can observe that some significant change happens to the robot path when the assumed obstacle is overcome after step 58. From the plot, we can also observe for how long these constraints stay violated, which is around three fourths of the whole optimization period. When closely analyzing the individual trajectories, it is possible to see a zig-zag pattern. This emerges from steps taken in opposing directions, in extreme cases even backtracking that "naturally" happens on step size adaptation during line search. In any case, these zig-zag patterns show how difficult it is for the algorithm to walk downhill to the minimum. If we want to check how many of these back and forth movements are due to line search, we can do so either by enabling the line search highlighting or by scrolling through the log entry list of the tool (see Figure 6.1). It turns out that after optimization step 35, hardly any three consecutive steps are taken without a rejected line search adapting the step size.

(a) Inequality constraints (b) Equality constraints

Figure 6.7: Graphs show the evolution of constraints over the optimization steps. The colored lines are aggregations of the different constraint groups. The fine vertical dotted line indicates the 58th optimization step.

Figure 6.8: Graph shows optimization trajectories with constraint violation indicators. The trajectory points are laid out using a PCA for the y-axis and optimization steps on the x-axis. The stippled vertical line indicates optimization step 58.

Analyzing constraint violation patterns The complexity of solving the optimization problem is also reflected in the constraint violations view. When switching to it and deselecting all constraints, we get a heat map view of the total number of constraint violations over optimization time for each of the robot's configurations. Figure 6.9 shows this view. Here, we can immediately identify the third phase of the robot's movement path as the hottest (in terms of the heat map). This is what we would expect, since it is the part where most interaction between the robot and its environment is happening (stick and box). The transition from the second to the third phase is also more complicated in terms of constraint satisfaction, similar to the pattern we see at the transition from phase 1 to phase 2. Since these phase transitions share a semantic similarity, namely the robot approaching an object and making contact with it, it is understandable that we see the same pattern. Also interesting to see is that almost all of the configurations of the first phase are feasible right from the beginning, which means that the chaotic initial path from Figure 6.5 is actually feasible. However, feasibility does not imply optimality, which we only get at the end of the process.

In this chapter, the final visual analytics tool was presented. To evaluate its utility, a use case scenario was presented in which an analysis of a robot path optimization run was conducted. In this evaluation, the optimization problem was explained by means of illustrating the robot's task. The corresponding visualizations were then shown and related to the optimization problem while highlighting the findings of the exploratory analysis process.


Constraint Violations

part 1 part 2 part 3 part 4

Figure 6.9: Figure shows the number of constraint violations per optimization step and robot configuration. Orange color indicates few violations, red indicates many. The robot configurations are grouped by their corresponding phase and are ordered from top to bottom. Optimization time goes from left to right in each part.


7 Conclusion

In this chapter, a recapitulation of the thesis is given, as well as prospects regarding future work. A short conclusion is drawn to assess the results of this thesis.

7.1 Summary

In Chapter 1, the topic of nonlinear constraint optimization was introduced and the need for visualization was motivated. The goals were concretized to focus on the visualization of optimization trajectories, as they are a resourceful quantity to analyze with respect to an optimization process.

In Chapter 2, existing publications concerned with visualizing aspects of optimization, as well as the visualization of temporal evolution, were presented. It was made clear in which way these works relate to this thesis and how they differ from it.

Chapter 3 introduced prerequisites with regard to nonlinear constraint optimization to make the reader familiar with the most important concepts. The formulation of constraint programs was discussed, conditions for optimality and feasibility of a solution to an NLP were explained, and an algorithm for solving NLPs was introduced.

From there on, methods for visualizing optimization trajectories were discussed in Chapter 4, as well as methods for visualizing constraints and techniques for interactively exploring such visualizations. Visual representations were motivated and the introduced methods were illustrated and explained with respect to their properties and advantages.

The implementation of the visual analytics system for log files of robot path optimizations was discussed in Chapter 5. The architecture was explained and technical details of the resulting software were given. The structure of the log files and the contained data was described as well.

Finally, the resulting software system was presented and evaluated in Chapter 6. For the evaluation, a use case scenario of an optimization run analysis was presented, in which the exploratory process was illustrated. Insights gained during this process were presented to show the usefulness of the tool.

7.2 Results

The goals of this thesis were to explore the visualization of optimization trajectories and to build a visual analytics tool based on this visualization technique. The tool should be able to support an analyst or domain expert in troubleshooting optimization performance, give insights into the behavior of the optimization algorithm, and help to understand optimization problems better. Optimization trajectory visualization proved to be insightful, making it possible to observe different stages of fast and slow progression throughout the optimization process. By integrating information about constraint satisfaction and the representation of the optimization problem's argument, a deeper understanding of the problem can be gained through observing how its solution unfolds. Due to the scope of the project and the nature of the given dataset, the tool cannot provide detailed information about the causality of performance issues. However, in combination with other tools, it can serve as an entry point for optimization performance assessment.

7.3 Future Work

In this section, possibilities for future enhancements and extensions of the visualization tool, as well as research opportunities, are discussed. It starts with suggestions for changes to the visualizations, moves on to ideas for new visualizations, and finishes with a broader picture of possibilities to be explored in terms of optimization trajectory visualization.

• While it makes sense to show the optimization trajectories of the different parts of the robot's movement path side by side to convey a sense of the temporal order of the phases, there is a better way of arranging these parts in the constraint violations view (Figure 6.9). When the parts are stacked on top of each other instead of being placed side by side, there is no visual disconnect between the adjacent configurations of two phases. Transposing the plot so that optimization time moves downwards is also more intuitive, as it conveys the story of walking down to the minimum.

• Sorting the constraint groups in the tree by their aggregated number of violations or by the sum of violating constraint values would make it easier to identify constraints that are hard to satisfy. It would also remove the need for interactively comparing the groups against each other. The sorting could also be done on a per-optimization-step basis to quickly identify interesting constraints for a specific time step.

• To be able to compare different constraint groups with each other, a possibility to overlay different constraint violation plots (Figure 6.9) would help. Currently, only switching between selected constraint groups lets the user see the different violation patterns. Being able to discern between different groups in this view would provide a means to compare patterns.

• Even though different phases of the optimization could be identified during the analysis in Section 6.2, the process required interaction to see how far the optimization had progressed trajectory-wise. A novel kind of plot that directly visualizes the amount of progress made would help a lot in identifying optimization phases of slow and fast progress. One can imagine seeing the algorithm walking on plateaus or downhill to a more optimal place. For creating this kind of plot, the average change in trajectory location as well as the change in objective or Lagrangian values could be leveraged. Using a smoothing window could provide a mechanism to select a granularity level of optimization time.

• Instead of visualizing a single optimization run, the comparison of trajectories of different runs with differing initialization, hyperparameters, or optimization problems could provide further insights into optimization performance. In such an approach, the robustness of a solver could be assessed and its behavior studied in more detail. Also, a sensitivity analysis could be performed with respect to the parameters of an optimization problem or solver settings.

• For relating constraints and their influence on the optimization trajectory, a way of visualizing the constraint boundary proximity of the optimization trajectory sounds quite interesting. Interpreting boundaries as attractive or repelling surfaces makes reasoning about trajectory evolution more intuitive; however, high dimensionality makes visualization of optimization problem geometry a challenging task. In general, the geometric relations of constraints, objectives, and the optimization trajectory seem to be worth researching in terms of visualizing optimization.
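The progress measure suggested in the list above could be prototyped directly on the trajectory data. The following sketch (hypothetical, not part of the tool) computes the Euclidean step length per optimization step and smooths it with a moving-average window whose size selects the granularity level:

```java
public class ProgressMeasure {

    /** Computes the Euclidean distance between consecutive trajectory
     *  locations, smoothed by a moving average of the given window size.
     *  'trajectory' holds one high-dimensional location per optimization step. */
    static double[] smoothedProgress(double[][] trajectory, int window) {
        int steps = trajectory.length - 1;
        double[] progress = new double[steps];
        for (int t = 0; t < steps; t++) {
            double sum = 0;
            for (int i = 0; i < trajectory[t].length; i++) {
                double diff = trajectory[t + 1][i] - trajectory[t][i];
                sum += diff * diff;
            }
            progress[t] = Math.sqrt(sum);
        }
        // moving average; larger windows yield a coarser view of optimization time
        double[] smoothed = new double[steps];
        for (int t = 0; t < steps; t++) {
            int to = Math.min(steps, t + window / 2 + 1);
            int from = Math.max(0, to - window);
            double s = 0;
            for (int k = from; k < to; k++) s += progress[k];
            smoothed[t] = s / (to - from);
        }
        return smoothed;
    }
}
```

Plotting this quantity over the optimization steps would show plateaus as flat, low-valued regions and phases of fast progress as peaks.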


Bibliography

[AA95] E. D. Andersen, K. D. Andersen. “Presolving in linear programming.” In: Mathematical Programming 71.2 (Dec. 1995), pp. 221–245 (cit. on p. 11). [Abe10] P. Abeles. EJML - Efficent Java Matrix Library. https : / / github . com / lessthanoptimal/ejml. 2010 (cit. on p. 42). [ABF11] M. V. Afonso, J. M. Bioucas-Dias, M. A. T. Figueiredo. “An Augmented La- grangian Approach to the Constrained Optimization Formulation of Imaging Inverse Problems.” In: IEEE Transactions on Image Processing 20.3 (Mar. 2011), pp. 681–695 (cit. on p. 11). [BSH+16] B. Bach, C. Shi, N. Heulot, T. Madhyastha, T. Grabowski, P. Dragicevic. “Time Curves: Folding Time to Visualize Patterns of Temporal Evolution in Data.” In: IEEE Transactions on Visualization and Computer Graphics 22.1 (Jan. 2016), pp. 559–568. DOI: 10.1109/TVCG.2015.2467851 (cit. on p. 17). [CDB93] A. Chatterjee, P. Das, S. Bhattacharya. “Visualization in linear program- ming using parallel coordinates.” In: Pattern Recognition 26.11 (1993), pp. 1725–1736. ISSN: 0031-3203. DOI: https://doi.org/10.1016/0031- 3203(93)90027-T. URL: http://www.sciencedirect.com/science/article/pii/ 003132039390027T (cit. on p. 15). [Chi06] J. W. Chinneck. “Practical optimization: a gentle introduction.” In: Systems and Computer Engineering), Carleton University, Ottawa. http://www. sce. carleton. ca/faculty/chinneck/po. html (2006) (cit. on p. 11). [CI01] J. P. Charalambos, E. Izquierdo. “Linear programming concept visualization.” In: Proceedings Fifth International Conference on Information Visualisation. July 2001, pp. 529–535. DOI: 10.1109/IV.2001.942107 (cit. on p. 15). [Fas12] FasterXML. Jackson - JSON for Java. https://github.com/FasterXML/jackson. 2012 (cit. on p. 42). [GMD+17] S. Goodwin, C. Mears, T. Dwyer, M. G. de la Banda, G. Tack, M. Wallace. “What do Constraint Programming Users Want to See? Exploring the Role of Visualisation in Profiling of Models and Search.” In: IEEE Transactions on Visualization and Computer Graphics 23.1 (Jan. 
2017), pp. 281–290. DOI: 10.1109/TVCG.2016.2598545 (cit. on p. 16).

[Goh17] G. Goh. “Why Momentum Really Works.” In: Distill (2017). DOI: 10.23915/ distill.00006. URL: http://distill.pub/2017/momentum (cit. on p. 27).

59 Bibliography

[Häg19] D. Hägele. JPlotter – OpenGL based 2D Plotting Library for Java using AWT and LWJGL. https://github.com/hageldave/JPlotter. 2019 (cit. on p. 42).

[Hes69] M. R. Hestenes. “Multiplier and gradient methods.” In: Journal of Optimization Theory and Applications 4.5 (Nov. 1969), pp. 303–320. ISSN: 1573-2878. DOI: 10.1007/BF00927673. URL: https://doi.org/10.1007/BF00927673 (cit. on p. 23).

[Hot33] H. Hotelling. “Analysis of a complex of statistical variables into principal components.” In: Journal of Educational Psychology 24.6 (1933), p. 417 (cit. on p. 29).

[JFSK16] D. Jäckle, F. Fischer, T. Schreck, D. A. Keim. “Temporal MDS Plots for Analysis of Multivariate Data.” In: IEEE Transactions on Visualization and Computer Graphics 22.1 (Jan. 2016), pp. 141–150. ISSN: 1077-2626. DOI: 10.1109/TVCG.2015.2467553 (cit. on pp. 17, 30).

[KIOT14] M. Kubota, T. Itoh, S. Obayashi, Y. Takeshima. “EVOLVE: A Visualization Tool for Multi-objective Optimization Featuring Linked View of Explanatory Variables and Objective Functions.” In: 2014 18th International Conference on Information Visualisation. July 2014, pp. 351–356. DOI: 10.1109/IV.2014.43 (cit. on p. 16).

[Kje00] T. H. Kjeldsen. “A Contextualized Historical Analysis of the Kuhn–Tucker Theorem in Nonlinear Programming: The Impact of World War II.” In: Historia Mathematica 27.4 (2000), pp. 331–361. ISSN: 0315-0860. DOI: 10.1006/hmat.2000.2289. URL: http://www.sciencedirect.com/science/article/pii/S0315086000922894 (cit. on p. 22).

[Kru64] J. B. Kruskal. “Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis.” In: Psychometrika 29.1 (Mar. 1964), pp. 1–27. ISSN: 1860-0980. DOI: 10.1007/BF02289565. URL: https://doi.org/10.1007/BF02289565 (cit. on p. 29).

[KT14] H. W. Kuhn, A. W. Tucker. “Nonlinear Programming.” In: Traces and Emergence of Nonlinear Programming. Ed. by G. Giorgi, T. H. Kjeldsen. Basel: Springer Basel, 2014, pp. 247–258 (cit. on p. 22).

[Li16] H. Li. SMILE – Statistical Machine Intelligence & Learning Engine. https://github.com/haifengl/smile. 2016 (cit. on p. 42).

[MH08] L. v. d. Maaten, G. Hinton. “Visualizing data using t-SNE.” In: Journal of Machine Learning Research 9.Nov (2008), pp. 2579–2605 (cit. on p. 29).

[NTY99] Y. Nesterov, M. Todd, Y. Ye. “Infeasible-start primal-dual methods and infeasibility detectors for nonlinear programming problems.” In: Mathematical Programming 84.2 (Feb. 1999), pp. 227–267 (cit. on p. 11).

[Pea01] K. Pearson. “On lines and planes of closest fit to systems of points in space.” In: The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science 2.11 (1901), pp. 559–572. DOI: 10.1080/14786440109462720 (cit. on p. 29).

[Pow78] M. J. D. Powell. “Algorithms for nonlinear constraints that use lagrangian functions.” In: Mathematical Programming 14.1 (Dec. 1978), pp. 224–248. ISSN: 1436-4646. DOI: 10.1007/BF01588967. URL: https://doi.org/10.1007/BF01588967 (cit. on p. 23).

[SDF+10] H. Simonis, P. Davern, J. Feldman, D. Mehta, L. Quesada, M. Carlsson. “A Generic Visualization Platform for CP.” In: vol. 6308. Sept. 2010, pp. 460–474. DOI: 10.1007/978-3-642-15396-9_37 (cit. on p. 16).

[Tor58] W. S. Torgerson. “Theory and methods of scaling.” In: (1958) (cit. on p. 29).

[Tou14a] M. Toussaint. A Novel Augmented Lagrangian Approach for Inequalities and Convergent Any-Time Non-Central Updates. 2014. arXiv: 1412.4329 [math.OC] (cit. on p. 23).

[Tou14b] M. Toussaint. Newton methods for k-order Markov Constrained Motion Problems. 2014. arXiv: 1407.0414 [cs.RO] (cit. on pp. 12, 21, 34).

[Tou15] M. Toussaint. “Logic-geometric programming: An optimization-based approach to combined task and motion planning.” In: Twenty-Fourth International Joint Conference on Artificial Intelligence. 2015 (cit. on p. 11).

[vdEHBv16] S. van den Elzen, D. Holten, J. Blaas, J. J. van Wijk. “Reducing Snapshots to Points: A Visual Analytics Approach to Dynamic Network Exploration.” In: IEEE Transactions on Visualization and Computer Graphics 22.1 (Jan. 2016), pp. 1–10. DOI: 10.1109/TVCG.2015.2468078 (cit. on p. 18).

[Wol69] P. Wolfe. “Convergence Conditions for Ascent Methods.” In: SIAM Review 11.2 (1969), pp. 226–235. DOI: 10.1137/1011036 (cit. on p. 25).

[YL07] W. Yu, T. Lan. “Transmitter Optimization for the Multi-Antenna Downlink With Per-Antenna Power Constraints.” In: IEEE Transactions on Signal Processing 55.6 (June 2007), pp. 2646–2660 (cit. on p. 11).

[ZD15] X. Zhang, H. Duan. “An improved constrained differential evolution algorithm for unmanned aerial vehicle global route planning.” In: Applied Soft Computing 26 (2015), pp. 270–284. ISSN: 1568-4946. DOI: 10.1016/j.asoc.2014.09.046. URL: http://www.sciencedirect.com/science/article/pii/S1568494614004992 (cit. on p. 11).

All links were last followed on October 25, 2019.

Declaration

I hereby declare that the work presented in this thesis is entirely my own and that I did not use any sources or references other than those listed. I have marked all direct or indirect statements from other sources contained therein as quotations. Neither this work nor significant parts of it were part of another examination procedure. I have not published this work in whole or in part before. The electronic copy is consistent with all submitted copies.

place, date, signature