<<

Efficient Trajectory Generation for Robotic Systems Constrained by Contact Force via Nonlinear Programming

Jaemin Lee a, Efstathios Bakolas b, Luis Sentis b

aDepartment of Mechanical Engineering, The University of Texas at Austin, TX, 78712-1221, USA bDepartment of Aerospace Engineering and Engineering Mechanics, The University of Texas at Austin, TX 78712-1221, USA

Abstract

In this work, we propose a trajectory generation method for robotic systems with contact force constraint based on optimal control and reachability analysis. Normally, the dynamics and constraints of the contact-constrained are nonlinear and coupled to each other. Instead of linearizing the model and constraints, we directly solve the optimal control problem to obtain the feasible state trajectory and the control input of the system. A tractable optimal control problem is formulated which is addressed by dual approaches, which are sampling-based dynamic programming and rigorous reachability analysis. The sampling-based method and Partially Observable Markov Decision Process (POMDP) are used to break down the end-to-end trajectory generation problem via sample-wise optimization in terms of given conditions. The result generates sequential pairs of subregions to be passed to reach the final goal. The reachability analysis ensures that we will find at least one trajectory starting from a given initial state and going through a sequence of subregions. The distinctive contributions of our method are to enable handling the intricate contact constraint coupled with system’s dynamics due to the reduction of computational complexity of the algorithm. We validate our method using extensive numerical simulations with a .

Key words: ; Reachability; Contact Force.

1 Introduction manoid robots use very simple models of a robot such as considering center of dynamics under contact con- This paper considers the optimal control of robotic sys- straints [18, 28, 38]. However, those methods result on tems with contact force constraints. Often, it is required lower performance of the robots since they cannot cap- that legged or humanoid robots maintain stable foot or ture the robot’s or input constraints among body contacts while executing given tasks. In such cases, other limitations. contact forces constrain and determine the robot’s state reachability together with other state and input con- Optimal control is an alternative approach to solve the straints. Therefore, we seek to devise control algorithms trajectory generation problem for contact-constrained that can generate trajectories for contact-constrained robots. Recently, trajectory generation for the legged robots via formal state reachability analysis. Often, con- robots was formulated as a bi-level optimization prob- trol studies for assume that task trajectories are lem and solved by an iterative Linear Quadratic Regula- predefined [20,34,37], then attempt to find an instanta- tor (iLQR) [8]. However, the iLQR has still challenging neously optimal solution to accomplish them. However issues to address generic nonlinear constraints without the desired trajectories are frequently infeasible and it constraint softening. This could result on motion plan- is not straight-forward to check the feasibility of trajec- ers that violate important physical constraints of robots. tories under contact constraints a priori. Many motion To strictly consider the constraints and nonlinearity of planning and trajectory generation approaches for hu- the robot models, we should directly solve the optimal control problem via Nonlinear Programming (NLP) in ? Some part of this paper will be presented at American the process of obtaining feasible trajectories. Although Control Conference 2019 [25]. Corresponding author L. Sen- many NLP solvers, i.e. SNOPT [12] and IPOPT [40], tis. are available, NLP has significantly challenging issues. Email addresses: [email protected] (Jaemin Lee), One is high computational cost for obtaining the solu- [email protected] (Efstathios Bakolas), tion. Also, feasible initial conditions are necessary for [email protected] (Luis Sentis). NLP. In our work, we propose two complementary pro-

Preprint submitted to Automatica 20 March 2019 cesses to resolve these problems: reachability analysis with system dynamics. Since the contact force is time and sampling-based dynamic programming. varying and our problem is also high dimensional, it is very difficult to do reachability analysis of our system The reachability problem consists of checking whether via Hamilton-Jacobi-Bellman PDE. Linearization of dy- the state of the system can reach a specific state over namics and approximating reachable sets with convex a finite time horizon, starting from a given initial state. sets is not applicable to our problem because reachable The field of robotics has often focused on configuration sets of constrained nonlinear systems may not be con- space reachability guided by the given tasks such as se- vex. Moreover, we want to be strictly compared to meth- lecting the stance location of humanoid robots [7,41]. Al- ods softening or linearizing constraints like iLQR. Thus, though these methods are useful for kinematic feasibility, we devise a new method consisting of propagating sys- they are limited to address the requirements of dynami- tem states and approximating the reachable set. NLP cal systems considering various constraints such as joint using optimal control utilizes the approximated reach- / limits and contact force constraints. To able set. In order to address the increasing computa- address this gap, we will employ optimal control on the tional complexity, we propose various techniques which nonlinear dynamical system with constraints. are described thereafter. And, we do so, in the context of robotic systems with contact force constraints, in a In optimal control, the reachability analysis has been computational efficient way. often used for nonlinear systems [2–4, 35, 36], hybrid dynamical systems [15, 30–32], and stochastic sys- Concretely, we incorporate a sampling-based approach, tems [1, 27, 39]. We can categorize established reacha- quadratic programming (QP), NLP, and approximation bility analysis methods for nonlinear systems into three techniques such as propagation of boundary samples to groups: 1) solving Hamilton-Jacobi-Bellman PDE, 2) solve our problem. More specifically, for dividing the using linearization and mathematical approximation 3) end-to-end trajectory generation problem, we obtain the using propagation and mappings of a set of reachable constrained-state set using a sampling-based approach states. First, for low dimensional dynamical systems, and QP, then, reformulate the problem as a Partially the reachability analysis is often achieved via Hamilton- Observable Markov Decision Process (POMDP) in the Jacobi-Bellman PDE [6, 19]. For some systems, it is system’s output space. An optimal Markov policy re- impossible to perform the reachability analysis by solv- sulting from the dynamic programming (DP) provides ing Hamilton-Jacobin-Bellman PDE. Instead, many a sequence of output subregions. The sequence of out- approaches have been proposed to compute reachable put subregions guides the path of output with avoiding sets exploiting mathematical techniques, optimization, unsafe output regions such as locations of obstacles in inherent characteristics of systems, etc. the output space. In the next step, we implement a rig- orous reachability analysis between given pairs of sub- Other than the methods using Hamilton-Jacobi- regions by propagating the states from the given initial Bellman PDE, many methodologies have been proposed state. In addition, we propose a method to approximate to obtain reachable sets of systems. The logarithmic the reachable set using propagation of boundary states. norm of a type of system’s Jacobian is utilized to ob- The algorithmic efficiency of our method is one of the tain over-approximated reachable sets for nonlinear contributions of our work. continuous-time systems [29] and that norm is utilized for simulation-based reachability analysis [5]. Another This paper is organized as follows. Section 2 defines our approach tries to do more accurate reachability analysis problem and the target class of system. A sampling- for uncertain nonlinear systems by using more con- based algorithm for obtaining the set of constrained servative approximations [4, 35]. Also, for continuous- states in Section 3 and a POMDP for obtaining an op- time piecewise affine systems, linear matrix inequal- timal Markov policy are described in Section 4. In Sec- ities (LMI) are employed to characterize the bounds tion 5, we propose an approach to obtaining the reach- of reachable regions [16]. Another class of reachability able set and analyzing the method in detail. Based on analysis uses convex sets for approximation such as el- the result of rigorous reachability analysis in Section 5, lipsoid [22, 23], polytopes, zonotopes [13], and support an optimal controller is designed by implementing the functions [14, 24]. Although those approximation-based NLP of Section 6. The proposed approach is validated approaches are capable of extending to nonlinear sys- by simulation of a robotic legged system with contact tems, they only consider convex sets and often ignore force constraint in Section 6. other constraints. These are challenges for extending the previous approaches to more sophisticated and compli- cated systems. Additionally, computational complexity 2 Problem Formulation exponentially increases with respect to the dimension of the state space and the time length for those methods. 2.1 Notation

Our problem considers a constrained nonlinear system We denote the set of real n-dimensional vectors and the with a constrained variable, e.g. a contact force, coupled set of real n×m matrices by Rn and Rn×m, respectively.

2 The sets of non-negative and non-positive real numbers Rny is a nonlinear output function that represents the are represented as R≤0 and R≥0, respectively. The set of desired tasks, i.e. of points of robot natural numbers and the set of integer numbers are de- that we wish to control. We define a discrete time state noted by N and Z, respectively. The set of positive defi- space model of the contact-constrained robotic system nite n × n matrices and the set of positive semi-definite (DTSCR) as the discrete counterpart of the state space n n n × n matrices are denoted by S>0 and S≥0. When con- model: sidering z1, z2 ∈ N with z2 > z1, the discrete interval x(tk+1) = fD (x(tk), u(tk),Fc(tk)) between z1 and z2 is defined as [z1, z2]N := {z1, z1 + ∞ 1, . . . , z2 − 1, z2}. In case of real numbers z1, z2 ∈ ≥0, R X Bi(x(tk), u(tk),Fc(tk)) i ∆ : = x(tk) + (∆t) [z1, z2]d = {z1, z1 +∆, . . . , z2 −∆, z2} denotes a discrete i! interval with ∆ being the increment. When n real num- i=1 n n (3) bers a1, . . . , an are consider, Vec[ai]i=1 ∈ R represents a vector whose i-th element is ai. Given n×m real num- bers a11, . . . , amn, a matrix whose (i, j) element is aij is where ∆t denotes the sampling period for the dis- n,m n×m nx+nu+nc nx denoted by Mat[aij]i,j=1 ∈ R . Given a square ma- cretization of the model. fD : R 7→ R n×n ∂Bi−1(x,u,Fc) trix A ∈ R , tr(A) denotes its trace. σ(A) and σ(A) and Bi(x, u, Fc) = ∂x B1(x, u, Fc) and represent the largest and smallest singular values of A, B1(x, u, Fc) = fx(x) + fu(x)u + fc(x)Fc. ni×m respectively. Given matrices Ai ∈ R i ∈ [1, z]N, (nq +···+nz )×m Vertcat(A1,..., Az) ∈ R indicates a block 2.3 Constraints of the System matrix constructed by vertically concatenating the ma- trices A i ∈ [1, z] . Given a set of real vectors A ⊆ n, i N R We refer to h and h as the equality constraint func- card(A) denotes its cardinality. When considering par- e i tion and the inequality constraint function, respectively, ticular cases such that A ⊂ n with n ∈ [1, 3] , ghull(A) R N where we assume that h (x) = 0 and h (x) ≤ 0. Three and gbd(A) represent the general hull and the set of vec- e i types of constraint functions are considered in this pa- tors closest the boundary of A. [.] represents the prob- E per, i.e. functions describing state, the input, and the abilistic expectation operator. mixed state-input constraints denoted as hx(x), hu(u), and hxu(x, u), respectively. In practice, the state con- 2.2 Nonlinear System Model straint is introduced to avoid violating joint position or velocity limits when controlling the robot. Input con- We characterize the equation of motion for general straints are considered for describing the joint torque robotic systems with contact forces and assuming rigid limit or the underactuation of floating robots. The body linkages as follows: mixed state-input constraint contains physically more intricate conditions such as mechanical power. Since the DTSCR is required to maintain stable contact while M(q)¨q + g(q, ˙ q) = S>u + J>(q)F (1) c c being controlled, we consider an additional constraint that is known as the contact wrench cone constrained nq nq nq nq ×nu where q ∈ R , M(q) ∈ S>0, g(q, ˙ q) ∈ R , S ∈ R , to prevent slip and flip on contact surface. In particular, nu nc×nq nc u ∈ R , Jc(q) ∈ R , and Fc ∈ R denote the joint it is required that variable, sum of Coriolis/centrifugal and gravitational forces, selection matrix for the actuation, input actuat- h (x, F ) ≤ 0, h (x, F ) := W (x)F (4) ing joint , contact Jacobian matrix, and contact xc c xc c c c force, respectively. We can bring the differential equa- where W (x) is a matrix describing the unilateral con- tion (1) into a state space form by defining the state c straint using a polyhedral approximation of the friction x := [x>x>]> ∈ nx where x = q and x =q ˙: 1 2 R 1 2 cone of a surface [9]. The position at the contact point should be constant, which is identical to having null ve- x˙(t) = fC(x(t), u(t),Fc(t)) (2a) locity on the robot’s contact with respect to the surface = fx(x(t)) + fu(x(t))u(t) + fc(x(t))Fc(t) (2b) contact. The zero velocity constraint corresponds to the y(t) = fy(x(t)) (2c) state constraint. We will aim at controlling robots rep- " # resented by the DTSCR model for desired output goals x2(t) given the aforementioned constraints. fx(x) := −1 −M (x1)g(x1, x2) " # " # 2.4 Overall Scheme 0nq ×nu 0nq ×nc fu(x) := , fc(x) := −1 > −1 > M (x1)S M (x1)Jc (x1) The proposed approach consists of three methods, which are sampling-based optimization, solving a POMDP, nx nx nx nu nx where fx : R 7→ R , fu : R 7→ R , and fc : R 7→ and reachability-based optimal control. The overall nc nx R are nonlinear functions of the state x. fy : R 7→ procedure is depicted in Fig. 1. The first part of the

3 small Xʟ Yʟ Yʟ large Feasible goal with high probability Subregion

Infeasible goal Xʞ Yʞ Yʞ (a) (b) (c)

node associated with node associated with output associated with initial output goal output Time Time

optimal sequence of nodes

Yʞ Yʞ node output associated with Yʟ Yʟ (d) (e) (f)

Fig. 1. Proposed method: (a) Generating random state vectors and obtaining a set of sample-wisely feasible states, Section 3.1, (b) Checking the feasibility goal output via output approximation, Section 3.2, (c) Defining output subregions and discrete node space, (d) Solving POMDP to obtain an optimal sequence of nodes, Section 4, (e) Obtaining reachable sets and optimizing trajectories between sequential nodes, Section 5 and 6 (f) Consecutively executing trajectory optimizations with reachable sets, Section 5 and 6 approach aims to obtain feasible states using sampling ply a least-square QP process to obtain feasible states. c c with respect to the given constraints as shown in Fig. Let us consider ne state equality constraints and ni 1(a), then to approximately check whether the goal state inequality constraints. We define constraint func- c output is feasible via output approximation as shown tions as hx,e[ke](x) and hx,i[ki](x) where ke ∈ [1, ne]N and in Fig. 1(b). If the goal is achievable, we formulate and c ki ∈ [1, ni ]N are indices. Let us define Jacobian matrices solve a POMDP process to create multiple tractable and error vectors as follows: sub problems as shown in Fig. 1(c) and (d). Based on an optimal sequence of nodes, we obtain reachable sets c Je(x) := Vertcat(Je[k](x): ∀k ∈ [1, ne]N) (5a) and perform trajectory optimization between neighbor- c Ji(x) := Vertcat(Ji[k](x): ∀k ∈ [1, ni ]N) (5b) ing subregions associated with two sequential nodes, c then we recursively iterate this process for all sequence ee(x) := −Vertcat(hx,e[k](x): ∀k ∈ [1, ne]N) (5c) of nodes until reaching the final goal output as shown int c ei(x) := v\i − Vertcat(hx,i[k](x): ∀k ∈ [1, ni ]N) (5d) in Fig 1(e) and (f). By connecting all trajectories, we obtain the entire trajectory in an efficient way. ∂hx,e[k] ∂hx,i[k] where Je[k](x) := ∂x (x) and Ji[k](x) := ∂x (x) denote the Jacoabians for equality and inequality con- 3 Sample-Based Optimization int straint functions, respectively. v\i is an arbitrary inte- rior vector satisfying inequality constraints used as an In this section, we obtain a sequence of subregions in out- attractor. The main idea for obtaining the state fulfill- put space considering initial and goal states. Our method ing constraints is to iteratively update the sampled state breaks down the end-to-end trajectory generation prob- using the state increment ∆x until the constraints are lem into multiple intermediate points sequentially con- satisfied. The state increment ∆x is obtained by using nected. We start by creating random samples from a the QP method as follows: nx Gaussian distribution x ∼ N (µx, Σx) where µx ∈ R nx and Σx ∈ are the mean and the covariance ma- 2 S>0 min kwk2 trix, respectively, that is, µx := E[x] and Σx := E[(x − ∆x,w > (6) µx)(x − µx) ]. s.t Je(x)∆x ≤ ee(x) + w Ji(x)∆x ≤ ei(x) 3.1 Update for Random State Samples and we update the state x using the optimal variable We consider a set of states fulfilling desired constraints. ∆x. For numerical efficiency, we discard state samples if Since the Monte-Carlo method is very inefficient, we ap- QP does not converge. Let us define a set X consisting

4 of states fulfilling the constraints as follows: where κ is a coefficient determined by the cumulative probability of the Chi-square distribution. For instance, : nx c ∗ ∗ 2 X = {x ∈ R :khx,e[ke](x)k ≤ ε, ∀ke ∈ [1, ne]N κ = 5.991 for Pr(y ∈ Eκ) = 0.95 and y ∈ R . Our c (7) method to check if a goal output y is interior to E is hx,i[k ](x) ≤ 0, ∀ki ∈ [1, n ] } g κ i i N more efficient than using a Monte Carlo method, because where ε is a desired tolerance. we only need to compute µy∗ and Σy∗ using the mean and covariance matrix of the samples using (9). For the next step, we take all elements of X and check whether they fulfill the input, mixed state-input, and 4 POMDP for a Sequence of Subregions contact force constraints. To achieve this, we formulate an optimization problem with a quadratic cost function as follows: After checking that the desired output goal yg is located at the interior of the ellipsoid Eκ in (11), the end-to-end > > min Fc QcFc + u Quu trajectory generation problem is divided using interme- Fc diate points in the output space. To start the process, s.t. x(tk+1) = fD(x(tk), u, Fc) we define output subregions: c (8) hu[ku](u) ≤ 0, ∀ku ∈ [1, nu]N c ny Yi := {y ∈ : ky − yc,ik∞ < εy} (12) hxu[kxu](x(tk), u) ≤ 0, ∀kxu ∈ [1, nxu]N R hxc(x(tk),Fc) ≤ 0, x(tk), x(tk+1) ∈ X ny where yc,i ∈ R denotes the center of the output subre- n n S ny c u gion Yi and i ∈ [1, m] where i∈[1,m] Yi = Y ⊂ R . where Qc ∈ S>0 and Qu ∈ S>0 are weighting matrices N N ˆ for the cost. By solving the optimization problem (8) for Also, we obtain a set of outputs Yˆ := fy(Xˆ) = {y ∈ all x(t ) ∈ X , we obtain a set of states, Xˆ, that fulfills ny ˆ ˆ ˆ ˆ k R : y = fy(x), x ∈ X} where fy : X ⇒ Y. To formu- all desired constraints. late our problem as a POMDP, we define discrete nodes associated with the previous subregions as follows: 3.2 Output Space Approximation

si = node(Yi), i ∈ [1, m]N (13) In this subsection, we check the feasibility of reaching the desired goal output. To do so, we approximate the output where S := {s1, . . . , sm}. Based on these nodes, we ∗ samples with a Gaussian distribution y ∼ N (µy∗ , Σy∗ ) transform the problem to a POMDP. We will formulate [17]. The mean and covariance matrix obtained after the probability of observations using the sampled states. neglecting higher order terms are Definition 1. (POMDP) Partially Observable Decision ny µy∗ := fy(µx) + Vec [tr(Hy,i(µx)Σx)]i=1 (9a) Making Process is defined as a tuple P = (S, A, O, T, Z): > Σy∗ := Jy(µx)ΣxJy (µx) • S is a finite set of nodes, S := {s , ··· , s } 1 1 ms + Mat [tr(Σ H (µ )Σ H (µ ))]ny ,ny (9b) • A is a finite set of actions, A := {a , ··· , a } 2 x y,i x x y,j x i,j=1 1 ma • O is a finite set of observations, O := {o1, ··· , omo } • T is the transition dynamics T(s0, s, a) defining the where Jy(µ) and Hy,i(µ) denote the Jacobian matrix of transition from s ∈ S to s0 ∈ S after taking an action the output function fy(µ) and the 2nd derivative matrix a ∈ A. of the output function fy,i(µ), for the i-th element. In particular, • Z is the observation Z(s, a, o) consisting of the proba- bility of observing o ∈ O after taking an action a ∈ A 2 2  ∂ fy,i(µ) ∂ fy,i(µ)  from node s ∈ S. 2 ... ∂x1 ∂x1xnx ∂fy  . . .  Jy(µ) := (µ), Hy,i(µ) :=  . .. .  The problem concerning this section is on finding a ∂x  . .   2 2  sequence of feasible subregions towards an output goal ∂ fy,i(µ) ∂ fy,i(µ) ∂x x ... x2 using POMDP tools and analysis. nx 1 nx (10) Definition 2. (Markov Policy) A Markov policy Π is (1) (n) (j) ny ×nx nx×nx defined as a sequence: Π := {a , ··· , a }. a ∈ A, where Ju(µ) ∈ R and Hy,i(µ) ∈ R is a sym- metric matrix. We construct a probabilistic ellipsoid in where a(j) : S → S is a measurable map from a node to the output space to approximate whether an output another one, j ∈ [1, n]N . sample y∗ is feasible. We define a set of outputs that lie inside an ellipsoid Eκ with We convert the POMDP into a belief MDP. Belief b[si] is defined with respect to discrete nodes si ∈ S. Let ny > −1 (j) 0 (j+1) (j) Eκ := {y ∈ R :(y − µy∗ ) Σy∗ (y − µy∗ ) ≤ κ} (11) suppose b = b[s ], b = b[s ], and a = a where

5 s(j) represents the node for the j-th step of the POMDP. of subregions associated with s and s0, respectively. We The belief transition function, Γ(b, a, b0), is equal to define the transition dynamics as

X ( Γ(b, a, b0) = Pr(b0|b, a, o)Pr(o|b, a) (14a) T (s0, s, a)/$, if $ 6= 0, T(s0, s, a) := (19a) o∈O 0, else ( 0 1, if belief update returns b 0  > Pr(b0|b, a, o) = (14b) T (s , s, a) := max 0, V (s)a (19b) 0, otherwise P 0 0 X 0 X 0 where $ = a0∈A T (s , s, a ) denotes a normalization Pr(o|b, a) = Z(s , a, o) T(s , s, a)b. (14c) constant. s0∈S s∈S Proposition 1. Let s = s ∈ S and the corresponding The key challenges of this POMDP are on defining use- i subregion be Yi. If the output samples are uniformly dis- ful observations and on finding their conditional proba- tributed in Y , then T(s0, s, a) = 0. bility. Let us consider that Y = Y(j) and Y0 = Y(j+1) i associated with the nodes s(j) and s(j+1). We propose to define observations as the set of feasible states after Proof. Since the samples are uniformly distributed, it taking an action a, i.e. is possible to select any unit vector in Rny as the PSV of si, that is, RaV(si) where Ra ∈ SO(3) is a rotation ˆ nx matrix. If we select the rotation matrix R such that O :={z ∈ R : z = x(tk), ∃(Fc, u) in (8) with a (15) ⊥ 0 d = RaV(si), which is orthogonal to V(si), it follows fy(x(tk)) ∈ Y ∩ Yˆ, fy(x(tk+1) ∈ Y ∩ Y}ˆ . 0 > ⊥ > that T (s , s, a) = V (si)d = (d ) d = 0 for all a ∈ A. where Y is the subregion before taking the action a. If z1 ∈ Oˆ, it holds that there exists at least one sample 0 We now solve a finite-horizon belief MDP. The optimal connecting fy(z1) to another output in the subregion Y satisfying the constraints. Considering the above obser- policy, denoted by Π?, is obtained by solving the Bell- vations, we define the conditional probability as man equation as follows: X 0 : 0 ˆ 0 ˆ D?(b) = max[r(b, a) + γ Pr(o|b, a)D?(Γ(b, a, o))] Z(s , a, o) = Pr(o|s , a) = card(O)/card(Y ∩ Y). (16) a∈A o∈O (20) Let us focus on the reward and transition dynamics. As P a heuristic, a higher number of feasible samples falling where r(b, a) = s∈S b(s)R(s, a) denotes the belief re- into a subregion implies a higher probability of reaching ward. The result of the DP provides an optimal Markov it. Therefore we define the reward policy which we transform to a sequence of nodes as

(1) (nπ ) R(si, a) := Krcard(Yi ∩ Yˆ)/card(Yˆ) + ηi (17) Π? = {a? , ··· , a? } (21a) (1) (nπ ) S? = {s? , ··· , s? } (21b) where Yi is the subregion associated with node si ∈ S. Kr ∈ R>0 and ηi ∈ R are the gain and reward offset, (i) (i) where a? and s? are i-th action and node in a sequence respectively. We take ηi to be large when yg ∈ Yi. Also, toward reaching the final output goal, respectively. The to avoid unsafe output regions we set ηi to large nega- tive values. sequence of subregions in output space is

(1) (nπ ) Definition 3. Consider a node si ∈ S associated with Y? = {Y? , ··· , Y? }. (22) an output subregion Yi and a set Yˆ = fy(Xˆ). Consider ˆ Based on the generated sequence of subregions, we will ΣYi being the covariance matrix for the set Yi ∩ Y, that > generate trajectories using reachability analysis connect- is, ΣYi = E[(y − µy)(y − µy) ] and µy = E[y] where ing subregions in Y? in the next section. y ∈ Yi ∩ Yˆ. A principle singular vector is defined as

V(si) = col(VYi )k, σk(ΣYi ) = σ(ΣYi ). (18) 5 Reachability Analysis where Σ = V> Λ V , Λ = diag(σ , . . . , σ ), and Yi Yi Yi Yi Yi 1 ny In this section, we define discrete reachable sets and σk denotes the singular value of ΣYi . propose the way to obtain the reachable sets via opti- mizations. To overcome the computational complexity of For defining the transition dynamics, let an action a ∈ A propagation algorithm for the reachable sets, a method map state s ∈ S to s0 ∈ S. d is a vector in our grid world propagating the boundary samples is proposed and an- 0 0 defined as d := (yc − yc) where yc and yc are the centers alyzed in the views of computational complexity.

6 5.1 Optimization-based Reachability Analysis set of outputs associated with reachable states such D ˆ D D ∆t as Ry (t, x0) = fy(Rx (t, x0)) and Ry ([t0, tf ]d , x0) = We define reachable sets in continuous time domain as ˆ D ∆t fy(Rx ([t0, tf ]d , x0)). C R (t, x ) := {x(t): ∃u([t , t]), ∃F ([t , t]) D x 0 0 c 0 Theorem 2. Suppose that R (t0, x0) is a non-empty set c x hx[kx](x(τ)) ≤ 0, ∀kx ∈ [1, nx]N and the desired output, yd, is given. Let us assume that c D ∆t h (x(τ), u(τ)) ≤ 0, ∀k ∈ [1, n ] the set, Ry ([t0, tf ]d , x0), is compact, connected, and xu[kxu] xu xu N (23) hxc(x(τ),Fc(τ)) ≤ 0, u(τ) ∈ U D ∆t yd ∈ R ([t0, tf ] , x0). (26) x˙(τ) = fC(x(τ), u(τ),Fc(τ)) y d C x(t0) = x0 ∈ Rx (t0, x0), τ ∈ [t0, t]} At least, one state trajectory Ψ := {ξ(t0), ξ(t1), . . . , ξ(τ)} C exists such that fy(ξ(τ)) = yd where τ ≤ tf . Rx (t0, x0) is equal to {x0} which means that the initial state fulfills all constraints. D ∆t Proof. Since Ry ([t0, tf ] , x0) is compact and fy is con- C d Definition 4. Let x0 ∈ Rx (t0, x0), be an initial state D ∆t −1 ∆t tinuous, Rx ([t0, tf ]d , x0) is closed and fy is also con- and t ∈ [t0, tf ]d be an arbitrary time interval. We define D ∆t tinuous. Then, Rx ([t0, tf ]d , x0) is connected because a reachable set in discrete time domain as: D ∆t −1 Ry ([t0, tf ]d , x0) is connected and fy is continuous. D : ∆t ∆t Therefore, there exists at least one trajectory connecting Rx (t, x0) = {x(t): ∃u([t0, t]d ), ∃Fc([t0, t]d ) D ∆t c x0 to xf satisfying fy(xf ) = yd in Rx ([t0, tf ]d , x0). hx[kx](x(τ)) ≤ 0, ∀kx ∈ [1, nx]N c D ∆t h (x(τ), u(τ)) ≤ 0, ∀k ∈ [1, n ] Corollary 1. The set, R ([t0, tk] , x0), is compact. xu[kxu] xu xu N (24) y d hxc(x(τ),Fc(τ)) ≤ 0, u(τ) ∈ U x(τ + ∆t) = fD(x(τ), u(τ),Fc(τ)) D ∆t Proof. Let us consider ghull(Ry ([t0, tk]d , x0)), be- D ∆t x(t0) = x0 ∈ Rx (t0, x0), τ ∈ [t0, t]d } ing compact. By the Heine−Borel theorem, all closed subsets of a compact set are also compact. Since D ∆t D ∆t where ∆t > 0 is the discretization step or sampling period Ry ([t0, tk]d , x0) ⊂ ghull(Ry ([t0, tk]d , x0)), the reach- D ∆t for our discrete model. able set Ry ([t0, tk]d , x0) is compact. D We extend the reachable set defined above for the finite Corollary 2. Suppose that x0 ∈ Rx (t0, x0) and fy is ∆t D ∆t D ∆t discrete time interval [t0, tf ]d as Rx ([t0, tf ]d , x0) := continuous. Then, a set, Ry ([t0, tk]d , x0), is connected. S RD(t, x ). For any t < ∞, the reachable t∈[t0,tf ]d,∆t x 0 f D ∆t set satisfies the following bound Rx ([t0, tf ]d , x0) ⊆ Proof. Consider three sets: F = RD([t , t ]∆t, x ), D C 1 x 0 k−1 d 0 Rx ([t0, +∞), x0) ⊆ Rx ([t0, +∞), x0) X . D 0 0 ( F2 = Rx (tk, x0), and F3 = F2 ∪ F1, where F1 is the collection of states x ∈ F1 producing the next feasible Consider x and RD([t , t ]∆t, x ). A random input is 0 x 0 k d 0 state via the optimization (27) with respect to x ∈ F1. drawn from a Gaussian distribution u ∼ N (µ , Σ ) at D ∆t u u Then, R ([t0, tk] , x0) = F1 ∪ F2 = F2 ∪ F3. Let each instant of time with the input set U defined as the x d us consider arbitrary two sets H1 and H2 satisfying collection of inputs fulfilling input constraint. We define D ∆t Rx ([t0, tk]d , x0) = H1 ∪ H2 with H1 ∩ H2 = ∅. Let a QP to check for feasible contact forces, i.e. 0 xh ∈ F1 and suppose xh ∈ H1. Then, H1 ∩ F1 6= ∅ and H1 ∩ F3 6= ∅. This implies that F1 ⊆ H1 and F3 ⊆ H1, > > D ∆t min Fc QcFc + ∆x Qx∆x hence, H2 = ∅. This proves that R ([t0, tk] , x0) is Fc,∆x x d connected. Since the mapping fy is continuous, we also s.t. x(tk+1) = fD(x(tk), u(tk),Fc) D ∆t conclude that the set R ([t0, tk] , x0) is connected. D ∆t y d x(tk) ∈ Rx ([t0, tk]d , x0) c (25) hx[kx](x(tk+1)) ≤ 0, ∀kx ∈ [1, nx]N c 5.2 Propagation of Boundary States hxu[kxu](x(tk), u(tk) ≤ 0, ∀kxu ∈ [1, nxu]N hxc(x(tk),Fc) ≤ 0, u(tk) ∈ U The basic algorithm for reachability analysis suffers from ∆x = x(tk+1) − x(tk) exponential complexity with respect to the number of time steps. Although the previous POMDP contributes The reachable set at the next time instance tk+1 is to reducing the time horizon to be checked for reachabil- obtained by collecting the x(tk+1) updated by the QP ity analysis, full-state propagation would still result in solution ∆x in (25). Then, we can extend the reachable heavy computational burden. In this section, we propose ∆t D ∆t set over [t0, tk+1]d such that Rx ([t0, tk+1]d , x0) := a method for reducing the computational complexity of D ∆t D Rx ([t0, tk]d , x0) ∪ Rx (tk+1, x0). Also, we define the the algorithm by only propagating selected states. This

7 approach results in more conservative reachable sets. We The effect of the boundary sampling on computational implement the propagation of boundary states by solv- complexity becomes significantly advantageous in terms D ∆t ing (25) for only state samples x(tk) ∈ Bx ([t0, tk]d , x0) of the number of time steps. We will show the compari- such that son of the computational complexity using an example in the simulation section. D ∆t nx D ∆t Bx ([t0, tk]d , x0) := {x ∈ R : x ∈ Rx ([t0, tk]d , x0), D ∆t fy(x) ∈ gbd(Ry ([t0, tk]d , x0))}. 6 Optimal Control (27) In this work, we describe the use of sequential optimal We then compute the reachable set by collecting the up- control for nonlinear programs without constraint soft- dated states and extend the time horizon. In this way, ening. Instead of considering end-to-end trajectory gen- the complexity becomes linear with respect to the num- eration, we focus on finding a trajectory connecting two D ∆t subregions obtained by the POMDP process described ber of boundary samples, card(B ([t0, tk]d , x0)). In order to replace full-state propagation with boundary- earlier. By iterating this process for connecting subre- state propagation, we show that the set of reach- gions (22), the optimal control process is able to attain able outputs is compact and connected. First, the set the desired output with reduced computational cost. D ∆t Ry ([t0, tk]d , x0) is compact, because we are able to obtain the hull of the set as shown in Corollary 1. Next, 6.1 Nonlinear Programming D we prove the reachable set Ry is connected. In order to formulate the optimal control problem solved D Corollary 3. Suppose that x0 ∈ Rx (t0, x0) 6= ∅ and fy by NLP, a performance measure is defined in the discrete D ∆t time and state space, that is, is continuous. Then, the set, Ry ([t0, tk]d , x0), is con- nected N X > > L(U,N) := (u (tk)Quu(tk) + Fc (tk)QcFc(tk) Proof. The proof is similar to that of Corollary 2 and k=0 > > therefore is omitted. + ex (tk)Qxex(tk)) + ey (tN )Qyey(tN ) ex(tk) := ξ(tk) − x0, ey(tk) := (fy(ξ(tk)) − yd) 5.3 Computational Complexity Analysis where U := {u(t0), u(t1), . . . , u(tN )}. In addition, Qx ∈ n nx and Q ∈ y denote the weighting matrices for We analyze the computational complexity to compare S>0 y S>0 the state and output terms, respectively. ξ(t) ∈ nx the efficiency of the propagation of full states and that R and y ∈ ny denote the trajectory of the state and of boundary states. There exists many algorithms to ob- d R the desired goal of the output of the NLP, respectively. tain the concave hull from the set of data [10, 11, 33]. Based on the set of reachable states, we can formulate They have O(n3) or O(n log n) time complexity with n the NLP as follows: data in 2-D space. General QPs are non-deterministic polynomial-time hard, which means the algorithms are min L(U,N) more complex than the polynomial time complexity to Ψ,U be solved. In the case that the QP is convex, it is widely s.t. ξ(tk+1) = fD(ξ(tk), u(tk),Fc(tk)) known that the time complexity of the QP is O(m3) ξ(t ) ∈ RD([t , t ]∆t, x ) where m is the number of decision variables. k x 0 k d 0 D ∆t (29) ξ(tk+1) ∈ Rx ([t0, tk+1]d , x0) c Based on the aforementioned discussion, we can com- hx[kx](ξ(tk+1)) ≤ 0, ∀kx ∈ [1, nx]N pare the computational complexity of two cases: prop- c hxu[kxu](ξ(tk), u(tk) ≤ 0, ∀kxu ∈ [1, nxu] agation of full states and propagation of boundary N hxc(ξ(tk), u(tk)) ≤ 0, u(tk) ∈ U. states. Let us consider Nt steps over the time interval ∆t [t0, tk]d where ∆t = (tk − t0)/Nt, and Nu is the num- ber of input samples. For each propagation method, the The proposed approach recursively executes the formu- computational complexity can be represented as C ∼ lated NLP (29) by changing the initial conditions, the f constraints, and the goal output. Let us represent the PNt i 3 Nt+1 3 O( i=1 Nu(nc + nx) ) ≈ O Nu (nc + nx) , and (j) (j+1) optimization problem for connecting Y? to Y? as PNt 3 3 Cb ∼ O( i=1 Nb(nc + nx) + (iNb) ) ≈ O(NtNb(nc + 3 4 3 (j) (j) (j) nx) + Nt Nb ) where Cf , Cb, and Nb denote the com- (Ψ(j), U(j)) = NLP(x , y , t ,N (j)) (30) plexity of full state propagation, that of boundary state 0 d 0 propagation, and the number of boundary samples. Nor- (j+1) (j+1) mally, a set of boundary samples contains much smaller where j ∈ [1, nπ − 1]N. We replace t0 , x0 and (j+1) (j) (j) (j) (j+1) samples than a set of entire states, that is, Nb  Nu. yd with tN , ξ (tN ) and yc , respectively.

8 0.65 18 Virtual Joints for Floating base 0.60 Goal Output 16 Final Node 0.55 Sequence of 14 Obstacle 1 0.5 Nodes 0.45 12 Hip joint 0.4 10 Z (m) Z

0.35 (nodes) Z 0.3 8 Knee joint 0.25 6 Initial Node Ankle joint 0.2 Initial Output 4 Obstacle 2 0.15 Surface Contact 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 2 4 6 8 10 12 14 16 X (m) X (nodes) (a) (b) (c)

Fig. 2. (a) Legged robotic system with surface contact, consisting of three virtual joints and three actuated joints, (b) A set of output samples satisfying the constraints and the approximated ellipsoids with respect to κ in (11), (c) A sequence of nodes solved by POMDP.

N (j+1) is determined by reachability analysis using where superscripts (.) and (.)⊕ represent the lower and t(j+1), x(j+1), and y(j+1). In particular, we set x(1) = x upper limits, respectively. The specific conditions are 0 0 d 0 0 ⊕ (nπ −1) qj ,j ,j = [−1.2, 0.5. −1.5], qj ,j ,j = [−0.2, 2.6, −0.5], and y = yg. After performing the iterative NLP, 4 5 6 4 5 6 d u = [−1000, −1000, −1000], and u⊕ = we obtain the entire state and input trajectories to j4,j5,j6 j4,j5,j6 reach the goal output such that [1000, 1000, 1000]. The position, orientation, and veloc- ity of the foot should satisfy the kinematic constraints Ψ? := {Ψ(1),..., Ψ(nπ −1)} (31a) for the contact. We consider a surface contact with rect- angular support polygon on the foot so that the friction ? (1) (n −1) U := {U ,..., U π }. (31b) cone constraints can be characterized as

? ? The solutions, Ψ and U , enable the robotic system to |fx| ≤ kµfz, fz > 0, |τy| ≤ dxfz (33) reach the goal output state maintaining all constraints and contacts. where dx denotes the distance between the center of the polygon and the vertex in the local frame of the foot and k represents the friction coefficient. F := [f , f , τ ]> 7 Numerical Simulations µ c x z y is the contact wrench, which is a resultant contact force at the center of the support polygon. Based on the in- In this section, we validate the proposed approach by equality constraints of (33) and the coordinate transfor- simulating a legged robot Draco, which is developed for mation from local frame on the foot to the global frame, efficient and dynamic locomotion using liquid-cooled se- we can represent the friction cone constraints in the form ries elastic actuators [21]. The dynamic simulation is WlocalRc(q)Fc = Wc(q)Fc ≤ 0 where Wlocal is a coeffi- implemented by DART [26]. We utilize two optimizers: cient matrix derived from (33) and Rc(q) is a rotational Goldfarb for QP and IPOPT implementing a primal- matrix from global to foot frames. In Wlocal, we set the dual interior point method [40]. The simulation is exe- friction coefficient kµ = 0.4. Considering all constraints, cuted on a laptop with a Core i7-8650U CPU and 16.0 we will control the robot’s motion while maintaining the GB RAM. contact.

7.1 A Robotic System and Constraints 7.2 Setup for Simulation

A simulation model of Draco consists of three virtual To start the reachability analysis, we generate 106 state joints, i.e. position and orientation of its floating base samples and gather the states fulfilling the constraints. (q1, q2, q3) and three actuated joints, i.e. knee and ankle The mean and covariance of the Gaussian distribution joints (q4, q5, q6) as shown in Fig. 2 (a). For our sim- for sampling are ulations, the state and input constraints are defined as > follows: µx = [0.352, 0.384, −0.95, 2.2, −1.25, 01×6] " # ⊕ ⊕ ⊕ πI 0 (34) q ≤ qj ≤ q , q˙ ≤ q˙j ≤ q˙ , u ≤ uj ≤ u 6×6 6×6 j1 1 j1 j1 1 j1 j1 1 j1 (32) Σx = . O 2πI uj2 = 0, j1 ∈ {4, 5, 6}, j2 ∈ {1, 2, 3} 6×6 6×6

9 200 550

180 Time for full-states propagation 500 Time for boundary-states propagation 450 160 Accumulated time for reachable sets 400 140 350 120 300 100 250 80 200 60 150 Computation Time (seconds) Time Computation (seconds) Time Computation 40 100 20 50 0 0 5 10 15 20 Propagation Step 0.001 Fig. 3. Reachable sets over a finite time horizon [0, 0.05]d : The reachable sets are obtained by boundary-state propa- Fig. 4. Computational time for obtaining reachable sets with gation. discretization step ∆t = 0.01 and 10 simulations. We define a threshold for numerical convergence (7) with state propagation methods and display the results in value 1.0 × 10−7 and the maximum number of itera- Fig. 4. The algorithm cannot compute the reachable sets tions is 1.0×106. After implementing the sampling-based via the full-state propagation method for more than two approach described in Section 3, we obtain 3.47 × 105 steps due to the lack of memory capacity as shown in states among 106 state samples. For formulating the the blue dotted line in Fig 4. As we predicted in the POMDP problem in 2D space, we set 40 nodes defined complexity analysis of Section 5.3, the boundary-states by S = {si} where i ∈ [1, 40] and 8 actions A = {aj} propagation method significantly reduces the computa- N tional time for computing reachable sets. where j ∈ [1, 8]N, and each action consists of moving up, down, right, left, up-right, up-left, down-right, and down-left in the grid world, respectively. We consider Fig. 5 shows the results of trajectory optimization to two static obstacles for the robot to avoid in the out- reach the goal position with respect to a given initial put space. The objective of our numerical simulation is state. The optimization result includes both joint posi- to obtain an optimized trajectory to reach the goal out- tion and velocity trajectories fulfilling kinodynamic con- put while avoiding the obstacles and fulfilling all con- straints, e.g. joint position, velocity, and torque limits straints. In addition, we generate 1.0×105 input samples and contact kinematic and force constraints. The op- for propagating the states in the reachability analysis. timization results for the actuated joints in the phase space have stabilizable end points which are marked as 7.3 Simulation Results blue diamonds in Fig. 5(a), (b), and (c). As shown in Fig 5(d), the generated trajectories pass through the subre- gions in an optimized sequence obtained by solving the First, the results of our sample-based optimization is POMPD problem and reaching the final output goal po- shown in Fig. 2(b). The set with red dots contains the sition. outputs associated with the states fulfilling the con- straints given by the optimization process described in Section 3. As shown in Fig. 2(b), both the initial 8 Conclusion [0.384, 0.352] and goal [0.51, 0.52] outputs are located at the interior of the feasible set. Then, we solve the This paper proposes a method to generate trajectories POMDP problem to find a sequence of nodes, which for complex robotic systems considering contact con- result in 12 of them, avoiding the obstacles as shown in straints. We utilize NLP to solve the optimal control Fig. 2(c). After obtaining the sequence of nodes, we ob- problem. Our approach focuses on efficiently solving tain the reachable sets as shown in Fig. 3. The reachable the NLP problem so that we can scale the method to D 0.001 set Ry ([0, 0.05]d , x0) contains the desired output many types of complex robotic systems. We devise a associated with the first node. Based on this result, we new approach to obtain discrete-time reachable sets solve the NLP (29) to find a trajectory to reach the for trajectory generation and solve the nonlinear opti- desired output from the initial configuration. mization problem. Although the computational cost is significantly reduced, it is still challenging to employ The computational complexity is analyzed by measuring this approach to real-time control. Therefore, in the the execution time of the algorithm for computing the near future, we will investigate ways to combine this reachable sets. We repeat 10 simulations to measure the approach with feedback controllers and extend the pro- computation time for both the full-state and boundary- posed method for hybrid dynamical systems, such as

10 Fig. 5. Simulation results: (a), (b), and (c) illustrate the joint position and velocity trajectories of the actuated joints in the phase space, (d) shows the optimized trajectory in the output space, (e) shows the joint position trajectory of the actuated joints in the time domain. The blue circle and red diamond indicate the initial and final values, respectively. biped humanoid robots or dual arm manipulators. [4] Matthias Althoff, Olaf Stursberg, and Martin Buss. Reachability analysis of nonlinear systems with uncertain parameters using conservative linearization. In Decision and Acknowledgements Control, 2008. CDC 2008. 47th IEEE Conference on, pages 4042–4048. IEEE, 2008. The authors would like to thank the members of the Hu- [5] Murat Arcak and John Maidens. Simulation-based reachability analysis for nonlinear systems man Centered Robotics Laboratory at The University of using componentwise contraction properties. arXiv preprint Texas at Austin for their great help and support. This arXiv:1709.06661, 2017. work was supported by an NSF Grant# 1724360 and [6] Eugene Asarin, Olivier Bournez, Thao Dang, and Oded partially supported by an ONR Grant# N000141512507. Maler. Approximate reachability analysis of piecewise-linear dynamical systems. In International Workshop on Hybrid Systems: Computation and Control, pages 20–31. Springer, References 2000. [7] Felix Burget and Maren Bennewitz. Stance selection for [1] Alessandro Abate, Maria Prandini, John Lygeros, and humanoid grasping tasks by inverse reachability maps. In Shankar Sastry. Probabilistic reachability and safety Robotics and Automation (ICRA), 2015 IEEE International for controlled discrete time stochastic hybrid systems. Conference on, pages 5669–5674. IEEE, 2015. Automatica, 44(11):2724–2734, 2008. [8] Jan Carius, Ren´eRanftl, Vladlen Koltun, and Marco Hutter. [2] Matthias Althoff. Reachability analysis of nonlinear systems Trajectory optimization with implicit hard contacts. IEEE using conservative polynomialization and non-convex sets. In Robotics and Automation Letters, 3(4):3316–3323, 2018. Proceedings of the 16th international conference on Hybrid [9] St´ephane Caron, Quang-Cuong Pham, and Yoshihiko systems: computation and control, pages 173–182. ACM, Nakamura. Stability of surface contacts for humanoid 2013. robots: Closed-form formulae of the contact wrench cone for [3] Matthias Althoff and Bruce H Krogh. Reachability analysis of rectangular support areas. In IEEE International Conference nonlinear differential-algebraic systems. IEEE Transactions on Robotics and Automation, pages 5107–5112, 2015. on Automatic Control, 59(2):371–383, 2014. [10] Matt Duckham, Lars Kulik, Mike Worboys, and Antony

11 Galton. Efficient generation of simple polygons for [27] Kendra Lesser and Meeko Oishi. Reachability for characterizing the shape of a set of points in the plane. partially observable discrete time stochastic hybrid systems. Pattern recognition, 41(10):3224–3236, 2008. Automatica, 50(8):1989–1998, 2014. [11] Antony Galton and Matt Duckham. What is the region [28] Yiping Liu, Patrick M Wensing, David E Orin, and Yuan F occupied by a set of points? In International Conference Zheng. Trajectory generation for dynamic walking in a on Geographic Information Science, pages 81–98. Springer, humanoid over uneven terrain using a 3d-actuated dual- 2006. slip model. In Intelligent Robots and Systems (IROS), 2015 IEEE/RSJ International Conference on, pages 374– [12] Philip E Gill, Walter Murray, and Michael A Saunders. 380. IEEE, 2015. Snopt: An sqp algorithm for large-scale constrained optimization. SIAM review, 47(1):99–131, 2005. [29] John Maidens and Murat Arcak. Reachability analysis of nonlinear systems using matrix measures. IEEE Transactions [13] Antoine Girard. Reachability of uncertain linear systems on Automatic Control, 60(1):265–270, 2015. using zonotopes. In International Workshop on Hybrid Systems: Computation and Control, pages 291–305. Springer, [30] Moussa Maiga, Nacim Ramdani, Louise Trav´e-Massuy`es, 2005. and Christophe Combastel. A comprehensive method for reachability analysis of uncertain nonlinear hybrid systems. [14] Antoine Girard and Colas Le Guernic. Efficient reachability IEEE Transactions on Automatic Control, 61(9):2341–2356, analysis for linear systems using support functions. IFAC 2016. Proceedings Volumes, 41(2):8966–8971, 2008. [31] Ian Mitchell, Alexandre M Bayen, and Claire J Tomlin. [15] LCGJM Habets, Pieter J Collins, and Jan H van Schuppen. Validating a Hamilton-Jacobi approximation to hybrid Reachability and control synthesis for piecewise-affine hybrid system reachable sets. In International Workshop on Hybrid systems on simplices. IEEE Transactions on Automatic Systems: Computation and Control, pages 418–432. Springer, Control, 51(6):938–948, 2006. 2001. [16] Abdullah Hamadeh and Jorge Goncalves. Reachability [32] Ian M Mitchell, Alexandre M Bayen, and Claire J Tomlin. analysis of continuous-time piecewise affine systems. A time-dependent Hamilton-Jacobi formulation of reachable Automatica, 44(12):3189–3194, 2008. sets for continuous dynamic games. IEEE Transactions on [17] Gustaf Hendeby and Fredrik Gustafsson. On nonlinear automatic control, 50(7):947–957, 2005. transformations of gaussian distributions. Technical Report [33] Adriano Moreira and Maribel Yasmina Santos. Concave hull: from Automatic Control at Link? pings Universitet, 2007. A k-nearest neighbours approach for the computation of the [18] Shuuji Kajita, Fumio Kanehiro, Kenji Kaneko, Kiyoshi region occupied by a set of points. 2007. Fujiwara, Kensuke Harada, Kazuhito Yokoi, and Hirohisa [34] Ludovic Righetti, Jonas Buchli, Michael Mistry, and Stefan Hirukawa. Biped walking pattern generation by using preview Schaal. Inverse dynamics control of floating-base robots control of zero-moment point. In ICRA, volume 3, pages with external constraints: A unified view. In Robotics and 1620–1626, 2003. Automation (ICRA), 2011 IEEE International Conference [19] Nikolaos Kariotoglou, Sean Summers, Tyler Summers, on, pages 1085–1090. IEEE, 2011. Maryam Kamgarpour, and John Lygeros. Approximate [35] Matthias Rungger and Majid Zamani. Accurate reachability dynamic programming for stochastic reachability. In analysis of uncertain nonlinear systems. In Proceedings Proceedings of European Control Conference, pages 584–589, of the 21st International Conference on Hybrid Systems: 2013. Computation and Control (part of CPS Week), pages 61–70. [20] Oussama Khatib. A unified approach for motion and ACM, 2018. force control of robot manipulators: The operational space [36] Joseph K Scott and Paul I Barton. Bounds on the reachable formulation. IEEE Journal on Robotics and Automation, sets of nonlinear control systems. Automatica, 49(1):93–100, 3(1):43–53, 1987. 2013. [21] Donghyun Kim, Junhyeok Ahn, Orion Campbell, Nicholas [37] Luis Sentis and Oussama Khatib. Synthesis of whole- Paine, and Luis Sentis. Investigations of a robotic testbed body behaviors through hierarchical control of behavioral with viscoelastic liquid cooled actuators. IEEE/ASME primitives. International Journal of Humanoid Robotics, Transactions on Mechatronics, 2018. 2(04):505–518, 2005. [22] Jin-Hoon Kim. Improved ellipsoidal bound of reachable [38] Benjamin J Stephens and Christopher G Atkeson. Dynamic sets for time-delayed linear systems with disturbances. balance force control for compliant humanoid robots. In Automatica, 44(11):2940–2943, 2008. IEEE/RSJ International Conference on Intelligent Robots [23] Alexander B Kurzhanski and Pravin Varaiya. Ellipsoidal and Systems, pages 1248–1255, 2010. techniques for reachability analysis: internal approximation. [39] Sean Summers, Maryam Kamgarpour, Claire Tomlin, and Systems & control letters, 41(3):201–211, 2000. John Lygeros. Stochastic system controller synthesis [24] Colas Le Guernic and Antoine Girard. Reachability analysis for reachability specifications encoded by random sets. of linear systems using support functions. Nonlinear Automatica, 49(9):2906–2910, 2013. Analysis: Hybrid Systems, 4(2):250–262, 2010. [40] Andreas W¨achter and Lorenz T [25] Jaemin Lee, Efstathios Bakolas, and Luis Sentis. Trajectory Biegler. On the implementation of an interior-point filter generation for robotic systems with contact force constraints. line-search algorithm for large-scale nonlinear programming. arXiv preprint arXiv:1809.10598, 2018. Mathematical programming, 106(1):25–57, 2006. [26] Jeongseok Lee, Michael X Grey, Sehoon Ha, Tobias Kunz, [41] Yiming Yang, Wolfgang Merkt, Henrique Ferrolho, Vladimir Sumit Jain, Yuting Ye, Siddhartha S Srinivasa, Mike Stilman, Ivan, and Sethu Vijayakumar. Efficient humanoid motion and C Karen Liu. Dart: Dynamic animation and robotics planning on uneven terrain using paired forward-inverse toolkit. The Journal of Open Source Software, 3(22):500, dynamic reachability maps. IEEE Robotics and Automation 2018. Letters, 2(4):2279–2286, 2017.

12