<<

A Flying Inverted

Markus Hehn and Raffaello D’Andrea

Abstract— We extend the classic control problem of the inverted pendulum by placing the pendulum on top of a quadrotor aerial vehicle. Both static and dynamic equilib- V ria of the system are investigated to find nominal states of the system at standstill and on circular trajectories. Control laws are designed around these nominal trajectories. A yaw- independent description of quadrotor is introduced, z using a ‘Virtual Body Frame’. This allows for the time-invariant description of curved trajectories. The balancing performance y of the controller is demonstrated in the ETH Zurich Flying O Machine Arena testbed. Development potential for the future is highlighted, with a focus on applying learning methodology x to increase performance by eliminating systematic errors that were seen in experiments. Fig. 1. The inertial coordinate system O and the vehicle coordinate system V. I. INTRODUCTION The inverted pendulum is a classic control problem, of- in Section VI and conclusions are drawn in Section VII, fering one of the most intuitive, easily describable and re- where an outlook is also presented. alizable nonlinear unstable systems. It has been investigated for several decades (see, for example, [11], and references II. DYNAMICS therein). It is frequently used as a demonstrator to show- We derive the of the quadrotor and case theoretical advances, e.g. in reinforcement learning [6], the inverted pendulum for the trajectory-independent general neural networks [14], and fuzzy control [13]. case. In this paper, we develop a control strategy that enables Given that the mass of the pendulum is small compared to an inverted pendulum to balance on top of a quadrotor. the mass of the quadrotor, it is reasonable to assume that the Besides being a highly visual demonstration of the dynamic pendulum’s reactive forces on the quadrotor are negligible. capabilities of modern quadrotors, the solution to such a The dynamics of the quadrotor, then, do not depend on complex control problem offers insight into quadrotor control the pendulum, whereas the dynamics of the pendulum are strategies, and could be adapted to other tasks. influenced by the motion of the quadrotor. This assumption Quadrotors offer exceptional agility. Thanks to the off- is justified by the experimental setup, with the weight of center mounting of the propellers, extraordinarily fast rota- the pendulum being less than 5% of that of the quadrotor tional dynamics can be achieved. This is combined with typ- vehicle. ically high thrust-to-weight ratios, resulting in large achiev- able translational accelerations when not carrying a payload. A. Quadrotor While most early work on quadrotors focused on near- The quadrotor is described by six degrees of freedom: The hover operation (e.g. [5], and references therein), a growing translational position (x, y, z) is measured in the inertial community is working on using the full dynamical potential coordinate system O as shown in Figure 1. The vehicle of these vehicles. Flips have been executed by several groups, attitude V is defined by three Euler angles. From the inertial some focusing on speed and autonomous learning [7] and coordinate system, we first rotate around the z-axis by the some on safety guarantees [3]. Other complex maneuvers, yaw angle α. The coordinate system is then rotated around including flight through windows and perching have been the new y-axis by the pitch angle β, and finally rotated about demonstrated [8]. the new x-axis by the roll angle γ: In Section II, we introduce the dynamic models used in the controller design. Section III presents static and O , (1) dynamic nominal trajectories for the quadrotor to follow. V R(α,β,γ)= Rz (α) Ry (β) Rx (γ) The dynamics are then linearized around these trajectories in where Section IV, and linear state controllers are designed in Section V. The experimental setup and results are shown 1 0 0 The authors are with the Institute for Dynamic Systems and Control, ETH Rx (γ)= 0 cos γ − sin γ , (2) Zurich, Switzerland. {hehnm, rdandrea}@ethz.ch 0 sin γ cos γ   cos β 0 sin β the y-axis). For notational simplicity, we describe the relative Ry (β)=  0 1 0  , (3) position of the pendulum along the z-axis as − sin β 0 cos β   2 2 2 cos α − sin α 0 ζ := L − r − s , (8) p Rz (α)= sin α cos α 0 . (4) where L to denotes the length from the base of the pendulum 0 0 1   to its . We model the pendulum as an iner- The translational acceleration of the vehicle is dictated by tialess point mass that is rigidly attached to the mass center the attitude of the vehicle and the total thrust produced by of the quadrotor, such that rotations of the vehicle do not the four propellers. With a representing the mass-normalized cause a motion of the pendulum base. In the experimental collective thrust, the translational acceleration in the inertial setup, the point that the pendulum is attached to is mounted frame is off-center by about 10% of the length of the pendulum. While this assumption causes modeling errors, it simplifies x¨ 0 0 O the dynamics to such a great extent that the problem becomes y¨ = V R(α,β,γ) 0 +  0  . (5) much more tractable. The Lagrangian [9] of the pendulum z¨ a −g       can be written as The vehicle attitude is not directly controllable, but it is subject to dynamics. The control inputs are the desired 1 rr˙ + ss˙ L = (˙x +r ˙)2 +(˙y +s ˙)2 +(˙z − )2 rotational rates about the vehicle body axes, (ωx, ωy, ωz), 2  ζ  (9) and the mass-normalized collective thrust, a, as shown in − g (z + ζ) , Figure 2. High-bandwidth controllers on the vehicle track the desired rates using feedback from gyroscopes. The quadrotor where we assume unit pendulum mass without loss of has very low rotational inertia, and can produce high generality. The first term represents the kinetic energy of due to the outward mounting of the propellers, resulting in the pendulum, and the second the potential energy. The full, L very high achievable rotational accelerations on the order of nonlinear dynamic equations can be derived from using 200 rad/s2. The vehicle has a fast response time to changes in conventional : the desired rotational rate (experimental results have shown d ∂L ∂L − = 0 (10) time constants on the order of 20 ms). We will therefore dt  ∂r˙  ∂r assume that we can directly control the vehicle body rates d ∂L ∂L and ignore rotational acceleration dynamics. As with the − = 0 , (11) dt ∂s˙ ∂s vehicle body rates, we assume that the thrust can be changed   instantaneously. Experimental results have shown that the resulting in a system of equations of the form true thrust dynamics are about as fast as the rotational r¨ dynamics, with propeller spin-up being noticeably faster than = f (r,s, r,˙ s,˙ x,¨ y,¨ z¨) , (12) s¨ spin-down. The rates of the Euler angles are converted to the vehicle where f are the nonlinear equations (13) and (14). body coordinate system V through their respective transfor- C. Combined dynamics mations: The full dynamics of the combined system are described ωx γ˙ 0 0 −1 −1 −1 entirely by Equations (5), (7), and (12). The three body ωy = 0 + R (γ) β˙ + R (γ) R (β) 0 .     x   x y   rate control inputs (ωx,ωy,ωz) control the attitude V of ωz 0 0 α˙         the vehicle in a nonlinear fashion. This attitude, combined (6) The above can be written more compactly by combining the Euler rates into a single vector, calculating the relevant a rows of the rotation matrices, and solving for the Euler angle rates: ωy ωz

−1 γ˙ cos β cos γ − sin γ 0 ωx ωx β˙ = cos β sin γ cos γ 0 ωy . (7) α˙ − sin β 0 1 ωz      

B. Inverted Pendulum The pendulum has two degrees of freedom, which we describe by the translational position of the pendulum center Fig. 2. The control inputs of the quadrotor: The rotational rates ωx, ωy, of mass relative to its base in O (r along the x-axis, s along and ωz are tracked by an on-board controller, using gyroscope feedback. 1 2 r¨ = −r4x¨ − L2 − s2 x¨ − 2r2 sr˙s˙ + −L2 + s2 x¨ + r3 s˙2 + ss¨ − ζ (g +z ¨) + (L2 − s2)ζ2      r −L2ss¨ + s3s¨ + s2 r˙2 − ζ (g +z ¨) + L2 −r˙2 − s˙2 + ζ (g +z ¨) (13) 1 2   s¨ = −s4y¨ − L2 − r2 y¨ − 2s2 rr˙s˙ + −L2 + r2 y¨ + s3 r˙2 + rr¨ − ζ (g +z ¨) + (L2 − r2)ζ2      s −L2rr¨ + r3r¨ + r2 s˙2 − ζ (g +z ¨) + L2 −r˙2 − s˙2 + ζ (g +z ¨) (14)   with the thrust a, controls the translational acceleration of To describe the vehicle position, the following coordinate the vehicle. While the acceleration drives the translational system C is introduced, with (u,v,w) describing the position motion of the vehicle linearly, it also drives the motion of in C: the pendulum through nonlinear equations. The combined x u cos Ωt − sinΩt 0 u system consists of thirteen states (three rotational and six y =: Rz (Ωt) v  = sinΩt cos Ωt 0 v  . translational states of the quadrotor, and four states of the z w 0 0 1 w pendulum), and four control inputs (three body rates, and        (19) the thrust). To describe the vehicle attitude, a second set of Euler angles is introduced, describing the ‘virtual body frame’ W and III. NOMINAL TRAJECTORIES named η, µ, and ν: In this section, we find static and dynamic equilibria of O the system that satisfy Equations (5), (7), and (12). These are W R(η,µ,ν)= Rz (η) Ry (µ) Rx (ν) , (20) used as nominal trajectories to be followed by the quadrotor. Corresponding nominal control inputs are also described. We subject to the constraint that denote nominal values by a zero index (x0, r0, etc.). 0 0 O O V R(α,β,γ) 0 = W R(η,µ,ν) 0 . (21) A. Constant position 1 1     In a first case, we require 0, 0, and 0 to be constant. x y z The above equation defines values for three elements of the Substituting these constraints into (5), it can be seen that rotation matrices. As every column of a rotation matrix has 0 and 0 solve the equations with 0 , while β = 0 γ = 0 a = g unit norm [12], however, this equation only defines two of α0 can be chosen freely. We arbitrarily choose to set α0 = 0. the angles (η,µ,ν). Using the given angles 0 0 0 , equation (7) can be (α ,β ,γ ) Comparing the constraint (21) with the translational equa- solved with the body rate control inputs being ω 0 = ω = x y 0 tion of motion of the vehicle (5), it is straightforward to see 0 . ωz = 0 that the virtual body frame W represents an attitude that is Inserting the nominal states (x0,y0,z0) into the pendulum constrained such that it effects the same translational motion equations of motion (12), they simplify to of the quadrotor as the vehicle attitude V. The remaining 2 gζ3 − L2 r˙2 +s ˙2 +(sr˙ − rs˙) degree of freedom represents the fact that rotations about the r¨ = r 2 2 , (15) axis along which ω acts have no affect on the quadrotors L ζ  z translational motion. 3 − 2 2 2 − 2 gζ L r˙ +s ˙ +(sr˙ rs˙) Applying (19), its derivatives, (21), and setting the free s¨ = s 2 2 . (16) L ζ  parameter η = Ωt, the quadrotor equation of motion (5) These equations are solved by the static equilibrium simplifies to u¨ a sin µ cos ν + Ω2u +2Ω˙v r = r0 = 0 (17) 2 v¨ =  −a sin ν − 2Ωu ˙ + Ω v  . (22) s = s0 = 0, (18) w¨ a cos µ cos ν − g     meaning that, as expected, the inverted pendulum is exactly The circular trajectory is described by u0 = R, v0 = 0, over the quadrotor. and w˙0 = 0. Using these values, the nominal Euler angles

µ0 and ν0, and the nominal thrust a0 can be calculated: B. Circular trajectory Ω2R As a second nominal trajectory, the quadrotor is required µ0 = arctan(− ) , (23) g to fly a circle of a given radius R at a constant rotational ν0 = 0 , (24) rate Ω, at a constant altitude z0. We seek to transform the equations of motion into different 2 2 2 a0 = g + (Ω R) . (25) coordinate systems, such that the nominal states and the q linearized dynamics about them can be described in a time- Knowing the nominal values for (η0,µ0,ν0), we solve for invariant manner. (α0,β0,γ0) using Equation (21). Analogous to the constant position case, we set α0 = 0, simplifying (21) to that the centripetal force acting on it is compensated for by gravity. If the center of mass were to lie further than R sin β0 cos γ0 cos Ωt sin µ0 cos ν0 + sinΩt sin ν0 inwards, the centripetal force would change sign, making  − sin γ0  = sinΩt sin µ0 cos ν0 − cos Ωt sin ν0 , the pendulum fall. cos β0 cos γ0 cos µ0 cos ν0     (26) IV. DYNAMICSABOUTNOMINALTRAJECTORIES which can be solved for β0 and γ0. This completes the Linear approximate dynamics are derived through a first- description of the nominal states required for the translational order Taylor expansion of the equations of motion (5), (7), motion (5): In the coordinate systems C and W, the nominal and (12) about the nominal trajectories found in Section position and attitude are constant. Using Equations (19) III. We denote small deviations by a tilde (x˜, r˜, etc.). and (26), the time-varying nominal states in O and V may We present only the resulting linearized dynamics. The be found. corresponding derivations are not shown in this paper due To calculate the rotational rate control inputs in Equa- to space constraints, but are made available online. tion (7), we take the first derivative of Equation (26). It can be shown that A. Constant position Assuming a constant nominal position and zero yaw angle, 3 −1 −1 the three translational degrees of freedom of the quadrotor- ˙ RΩ cos γ0(tan β0 tan γ0 cos(Ωt)+cos β0 sin(Ωt)) β0 = 2 2 2 pendulum system decouple entirely along the three axes g + (Ω R) of the O coordinate system, resulting in the following p (27) 3 −1 equations: RΩ cos γ0 cos(Ωt) γ˙0 = . (28) g g2 + (Ω2R)2 r˜¨ =r ˜ − β˜g (34) L p g Combining these equations with the results from s˜¨ =s ˜ +γ ˜g (35) Equations (26) and (7), the nominal states can be L ¨ ˜g (36) solved for the nominal control inputs (ωx0,ωy 0,ωz 0). x˜ = β The full derivation is made available online at y˜¨ = −γ˜g (37) www.idsc.ethz.ch/people/staff/hehn-m . ˜˙ Identically to the vehicle, the pendulum relative coordi- β =ω ˜y (38) nates r and s are rotated by Ωt: γ˜˙ =ω ˜x (39) r cos Ωt − sinΩt p z˜¨ =a ˜ (40) =: . (29) s sinΩt cos Ωt q      The two horizontal degrees of freedom represent fifth- Applying this rotation to the Lagrangian derivations of order systems, with the vehicle forming a triple integrator the motion of the pendulum (10), (11) and setting the base from the body rate to its position. The vertical motion is motion (¨x, y,¨ z¨) to the circular trajectory, the pendulum represented by a double integrator from thrust to position. dynamics can be shown to be 2 2 2 2 2 2 B. Circular trajectory pp¨ +p ˙ + qq¨ +q ˙ q q˙ + p p˙ + 2pqp˙q˙ g 2 p 2 + 4 − − Ω To derive linear dynamics about the circular trajectory,  ζ ζ ζ  ˙ ˙ 2 the Euler angle rates µ˜ and ν˜ are treated as control inputs. +p ¨ − 2Ωq ˙ − RΩ = 0 Solving the time-derivative of equation (21) (analogously to (30) solutions (27) and (28) for the nominal case) allows the 2 2 2 2 2 2 ˙ qq¨ +q ˙ + pp¨ +p ˙ p p˙ + 2pqp˙q˙ + q q˙ g 2 calculation of (β,˜ γ˜˙ ). These can then be converted to the q 2 + 4 − − Ω  ζ ζ ζ  true inputs (ω ˜x, ω˜y, ω˜z) using Equation (7). +q ¨ +2Ω˙p = 0 In contrast to a constant nominal position, the dynamics (31) on a circular trajectory do not decouple. The linearized equations of motion can be shown to be We seek a solution where p˙0 =q ˙0 =p ¨0 =q ¨0 = 0, leading 2 g 2 to the following constraints for equilibrium points: ¨ ζ0 2 L ˙ p˜ = 2 p˜(Ω + 3 ) + 2q˜Ω+ (41) 2 gq0 L  ζ0 Ω (q0)+ = 0 , (32) ζ0 p0 p0 − 0 0 − 0 0 0 − 0 2 gp0 µ˜( a sin µ a cos µ ) +a ˜( cos µ sin µ ) 0 0 Ω (R + p0)+ = 0 . (33) ζ ζ  ζ0 2 g q˜¨ =q ˜(Ω + ) − 2p˜˙Ω +νa ˜ 0 (42) The only solution to the first equation is q0 = 0. The second ζ0 2 equation defines a nonlinear relationship between Ω, R, and u˜¨ =a ˜ sin µ0 +µa ˜ 0 cos µ0 + 2v˜˙Ω +u ˜Ω (43) p0, with a solution p0 always existing in the range −R ≤ 2 v˜¨ = −νa˜ 0 − 2u˜˙Ω +v ˜Ω (44) p0 ≤ 0. This result is intuitive: The center of mass of the ¨ pendulum must lie towards the center of the circle, such w˜ =a ˜ cos µ0 − µa˜ 0 sin µ0 (45) We observe that the linearized dynamics are indeed time- platform at ETH Zurich [7]. We present results demonstrating invariant in the coordinate systems C and W. the performance of the controllers designed in the previous The above equations reduce to Equations (34) – (40) if Section. setting R = 0 and Ω = 0. If R = 0, Ω is a free parameter A. Experimental setup (see Equation (33)), allowing the description of the dynamics (34) – (40) in a rotating coordinate system. We use modified Ascending Technologies ‘Hummingbird’ quadrotors [4]. The vehicles are equipped with custom elec- V. CONTROLLER DESIGN tronics, allowing greater control of the vehicle’s response We design linear full state feedback controllers to stabilize to control inputs, a higher dynamic range, and extended the system about its nominal trajectories. We use an infinite- interfaces [7]. A small cup-shaped pendulum mounting point horizon linear-quadratic regulator (LQR) design [1] and is attached to the top of the vehicle, approximately 5 cm determine suitable weighting matrices. above the geometrical center of the vehicle. The pendulum can rotate freely about the mounting point up to an angle A. Constant position of approximately 50 degrees. At larger angles, the mounting Because the system is decoupled in its nominal state, the point offers no support and the pendulum falls off the vehicle. control design process can be separated. The two horizontal Commands are sent through a proprietary low-latency degrees of freedom are single-input, five-state systems. The 2.4 GHz radio link at a frequency of 50 Hz. Command loss vertical degree of freedom is a single-input, two-state system. is in the range of 0.1%. An infrared motion tracking system Although simpler design methods exist for such systems, provides precise vehicle position and attitude measurements LQR is used to make the results easily transferable to the at 200 Hz, using retro-reflective markers mounted to the ve- design for a circular trajectory. We design a lateral controller hicle. The total closed-loop latency is approximately 30 ms. that is identically applied to the x˜-r˜-system and the y˜-s˜- The inverted pendulum consists of a carbon fiber tube, system (except for different signs mirroring the signs in the measuring 1.15 m in length. The top end of the pendulum equations of motion (34)–(37)). A controller for the vertical carries a retro-reflective marker, allowing the position of this direction is designed separately. point to be determined through the motion tracking system For the lateral controllers, we penalize only the vehicle in the same manner as vehicles are located. The center of mass of the pendulum is m away from its base. Figure position (x˜ or y˜) and the control effort (ω˜y or ω˜x). There 0.565 is no penalty on the pendulum state. One tuning parameter 3 shows the quadrotor and the pendulum. remains: The ratio of the penalties on position and control Conventional desktop computers are used to run all control effort controls the speed at which the position set point is algorithms, with one computer acting as an interface to the tracked. Values for this ratio are tuned manually until the system shows fast performance, without saturating the con- trol inputs. This tuning is initially carried out in simulation, and then refined on the experimental setup. The vertical controller is tuned much in the same manner as the lateral controller. Here again, we tune only the ratio between penalties on position errors and control effort until satisfying performance is achieved.

B. Circular trajectory On the circular trajectory, the system represents a thirteen- state system with three control inputs that cannot readily be decoupled. To more easily tune the weighting matrices, we use the same approach as in the constant position case and penalize only the position errors (u˜, v˜, and w˜) and control effort (µ˜˙ , ν˜˙, and a˜). The relative size of weights on control inputs and states is carried over from the standstill design. Because the controller for the vertical axis was tuned separately in the standstill case, the relative size of the penalties on the horizontal positions (u˜, v˜) and the vertical position (w˜) is adapted. Again, this is first carried out in simulation and then improved upon using the experimental testbed.

VI. EXPERIMENTAL RESULTS

The algorithms presented herein were implemented in Fig. 3. The quadrotor balancing the pendulum, at standstill. The mass the Flying Machine Arena, an aerial vehicle development center is about half-length of the pendulum. Error (m) switch to a circular nominal trajectory and a corresponding r˜ x˜ controller occurs, with R = 0.1 m. The controller is seen to stabilize the pendulum, with the pendulum relative position 0.2 s˜ y˜ errors in the rotating coordinate system (p, q) converging to non-zero values. The vehicle errors show two distinct 0 components: Like the pendulum errors, there is a non-zero mean error. Additionally, the error oscillates at the rotational rate Ω, representing a near-constant position error in an -0.2 inertial coordinate system. Figure 6 shows a comparison Time t (s) of the actual and nominal trajectories of the vehicle and pendulum in the inertial coordinate system O. It confirms 0 5 10 15 that the oscillating errors in the rotating coordinate system Fig. 4. State errors during balancing: Pendulum position error (˜r, s˜) and are constant position errors in the inertial coordinate system, quadrotor position error (˜x, y˜). The pendulum is manually placed on the with a magnitude of approximately 0.1m. The mean errors vehicle at approximately t = 4.25 s, and the balancing controller is switched in C are represented by the circle radius being significantly on at approximately t . s. = 4 75 larger than the nominal value R. A video showing the experiments of both testbed. Data is exchanged over Ethernet connections. A cases presented herein is available online at Luenberger observer is used to filter the sensory data and www.idsc.ethz.ch/people/staff/hehn-m . provide full state information to the controller. The observer also compensates for systematic latencies occurring in the control loop, using the known control inputs to project the system state into the future. Error (m) B. Constant position p˜ u˜ 0.2 Experiments are initialized by manually placing the pen- q˜ v˜ dulum on the mounting point. The vehicle holds a constant position using a separate controller, waiting for the pendu- 0 lum. The balancing controller is switched on if r˜ and s˜ are sufficiently small for 0.5 seconds. -0.2 Figure 4 shows the pendulum position errors (r˜, s˜) and the horizontal quadrotor position errors (x˜, y˜). The pendulum -0.4 is placed on the vehicle at approximately t = 4.25 s, and Time t (s) the control is switched from position holding to balancing 0 5 10 15 at approximately t = 4.75 s. The pendulum position errors are relatively large in the beginning, but quickly converge Fig. 5. Errors in the rotating coordinate system: Pendulum position error to values close to zero. The vehicle settles at a stationary (˜p, q˜) and quadrotor position error (˜u, v˜). The pendulum is balanced by the quadrotor during the entire duration shown on this plot. At t = 2 s, offset on the order of 5 cm from the desired position. Note the controller is switched from a constant nominal position to a circular that the balancing controller does not provide feedback on trajectory with R = 0.1 m. integrated errors. The main suspected reason for these steady- state errors are miscalibrations in the system: Errors in the vehicle attitude measurement lead to the linear controller 0.4 trading off the attitude error and the position error, and biased (x0,y0) (x,y) measurements of the on-board gyroscopes result in a biased (r0,s0) (r,s) response to control inputs. 0.2 Though originally designed for a constant position, this controller has been successfully tested for set point tracking 0 at moderate speeds. A video of this is available online at www.idsc.ethz.ch/people/staff/hehn-m . y Position (m) -0.2 C. Circular trajectory R We arbitrarily choose to set p0 = − 2 , bringing the center of mass of the pendulum half-way between the vehicle -0.4 -0.2 0 0.2 0.4 and the circle center, nominally. Assuming a given R, this x Position (m) fixes Ω through the equilibrium constraint (33). Figure 5 shows the system performance when circling. The pendulum Fig. 6. The trajectory of the quadrotor and pendulum in O, compared to is first balanced at a constant position. At t = 2 s, a the nominal trajectory. Error (m) 0.4 (x0,y0) (x,y) p˜ u˜ 0.2 q˜ v˜ (r0,s0) (r,s) 0.2 0 0 -0.2

y Position (m) -0.2 -0.4 Time t (s) -0.4 0 5 10 15 -0.4 -0.2 0 0.2 0.4 x Position (m) Fig. 7. Simulation results: Errors in the rotating coordinate system: Pendulum position error (˜p, q˜) and quadrotor position error (˜u, v˜). At t = 2 s, the controller is switched from a constant nominal position to a circular Fig. 8. Simulation results: The trajectory of the quadrotor and pendulum trajectory with R =0.1m. in O, compared to the nominal trajectory.

D. Comparison with simulation results positioning errors increase with the circle size, but the controllers are still capable of keeping the pendulum in The controllers have also been tested in simulation. The balance. Flying Machine Arena software environment allows the testing of controllers by simply re-routing the controller’s VII. CONCLUSION AND OUTLOOK outputs to a simulation. The simulation reproduces the behav- We have developed linear controllers for stabilizing a ior of the entire system. It includes the full dynamics of the pendulum on a quadrotor, which can be used for both static quadrotor, including the on-board control loops, rotational and dynamic equilibria of the pendulum. The virtual body accelerations, and propeller dynamics. It also reproduces frame is a useful tool to describe motions in a convenient system latencies and the noise characteristics of sensors. As coordinate system (e.g. allowing the use of symmetries), the simulation output mimics the motion system’s output as without enforcing this rotation for the vehicle. Using its closely as possible, the same state observer is employed properties and a rotating coordinate system, the system in reality and in simulation. This simulation environment description is time-invariant on circular trajectories. has been extended to include the inverted pendulum. The This allows the straightforward application of well- pendulum is modeled with its full nonlinear dynamics (12), established state feedback design principles. Controllers for neglecting the off-center mounting on the vehicle for reasons standstill and circular motion have been validated experi- of tractability, as discussed in Section II-B. Figures 7 and 8 mentally and are shown to stabilize the pendulum. This key show the exact same circular trajectory experiment that was milestone allows us to shift our focus towards improving carried out on the testbed (Section VI-C). For this simulation, system performance. systematic errors in the gyroscopic sensors and the motion Experimental results revealed systematic errors when ap- tracking system were disabled. In the initial standstill phase plying the control laws. There appear to be different sources (the first two seconds in Figure 7), all errors are close to of these errors: zero. During circling, the errors converge to nearly stationary • Miscalibrations of sensors cause biases in the experi- values that are significantly smaller than in the real testbed. mental setup. These errors are observed in the attitude The close match of the vehicle’s nominal trajectory and the information from the motion capture system, and in pendulum’s simulated trajectory in Figure 8 appears to be the vehicle on-board control loops using gyroscope coincidental. feedback. This result highlights the influence of biased sensory in- • The simplifying assumption that the pendulum is formation, leading to position errors in the inertial coordinate mounted at the center of mass of the vehicle is violated O C system . These are observed as oscillating errors in . in the experimental setup. Rotations of the vehicle The mean errors on the trajectory also show different therefore cause a motion of the pendulum base point. characteristics in simulation and reality. This is particularly • The equations of motion used to derive nominal tra- noticeable in the pendulum position error p˜ and q˜. The sim- jectories and linear models neglect many real-world ulation contains a detailed dynamic model of the quadrotor effects, such as drag and underlying dynamics of the that has been validated in several experiments. It is therefore control inputs. probable that the differences are due mainly to the simulated • The control laws are designed assuming continuous- pendulum dynamics. The non-modeled off-center mounting time control, while the vehicle is controlled at only of the pendulum could explain this discrepancy. 50Hz. Circle radii R of up to 0.5 m have been successfully We have identified two approaches for extending the tested in simulation and reality. The vehicle and pendulum controller design presented in this paper. The first, and most straightforward approach, is to include states that represent REFERENCES the integrated errors, and to weigh them appropriately in the [1] Dimitri P. Bertsekas. Dynamic Programming and Optimal Control, controller design. This would permit compensation for some Volume 2. Athena Scientific, 2007. of the systematic errors. For instance, one would expect this [2] Insik Chin, S. Joe Qin, Kwang S. Lee, and Moonki Cho. A two-stage iterative learning control technique combined with real-time feedback to drive the vehicle position errors in the standstill case to for independent disturbance rejection. Automatica, 40(11):1913–1922, zero. November 2004. Alternatively, a machine learning approach could be ap- [3] JH Gillula, Haomiao Huang, MP Vitus, and CJ Tomlin. Design of Guaranteed Safe Maneuvers Using Reachable Sets: Autonomous plied. The measurement data indicates that systematic er- Quadrotor Aerobatics in Theory and Practice. In IEEE International rors greatly dominate stochastic errors. During the circular Conference on and Automation, 2010. trajectory in particular, there are systematic, repeated errors [4] Daniel Gurdan, Jan Stumpf, Michael Achtelik, Klaus-Michael Doth, Gerd Hirzinger, and Daniela Rus. Energy-efficient Autonomous Four- that could well be learned and compensated for in a feed- rotor Flying Robot Controlled at 1 kHz. In IEEE International forward fashion. The system could therefore ‘learn’ better Conference on Robotics and Automation, 2007. nominal trajectories, resulting in a correction of the nominal [5] Jonathan P. How, Brett Bethke, Adrian Frank, Daniel Dale, and John Vian. Real-Time Indoor Autonomous Vehicle Test Environment. IEEE control inputs. This could, for instance, be accomplished with Control Systems Magazine, (April):51–64, 2008. iterative learning control [10], [2]. The present problem is [6] Michail G. Lagoudakis and Ronald Parr. Least-Squares Policy Itera- especially well suited to this type of approach due to its tion. Journal of Machine Learning Research, 4(6):1107–1149, January 2003. repetitive nature. We are planning to use this experimental [7] Sergei Lupashin, Angela Schollig,¨ Michael Sherback, and Raffaello setup as a testbed and benchmark for learning methods. D’Andrea. A Simple Learning Strategy for High-Speed Quadrocopter The concept of the virtual body frame is applicable to a Multi-Flips. In IEEE International Conference on Robotics and Automation, 2010. wide range of quadrotor control problems that goes beyond [8] Nathan Michael, Daniel Mellinger, Quentin Lindsey, and Vijay Kumar. balancing a pendulum. It allows the time-invariant descrip- The GRASP Multiple Micro UAV Testbed. IEEE Robotics and tion of general circular trajectories if the circle size and rate Automation Magazine, 17(3):56 – 65, 2010. [9] Francis C. Moon. Applied Dynamics: With Applications to Multibody are constant. We are investigating extensions of this concept and Mechatronic Systems. Wiley, 1998. to allow its application to more general problems. [10] Angela Schollig¨ and Raffaello D’Andrea. Optimization-Based Iterative Learning Control for Trajectory Tracking. In European Control ACKNOWLEDGMENTS Conference, pages 1505–1510, 2009. [11] Jinglai Shen, Amit K. Sanyal, Nalin A. Chaturvedi, Dennis Bernstein, The authors wish to thank Sergei Lupashin for his innu- and Harris McClamroch. Dynamics and control of a 3D pendulum. merable contributions to the Flying Machine Arena hardware In 43rd IEEE Conference on Decision and Control, pages 323–328 and software, Michael Sherback for his work on the quadro- Vol.1, 2004. [12] Gilbert Strang. Linear Algebra and its Applications. Thomson tor control and simulation algorithms, and Angela Schollig¨ Brooks/Cole, 2006. for her work on the Flying Machine Arena testbed. [13] H.O. Wang, K. Tanaka, and M.F. Griffin. An approach to fuzzy control of nonlinear systems: Stability and design issues. IEEE Transactions on Fuzzy Systems, 4(1):14–23, 1996. [14] Victor Williams and Kiyotoshi Matsuoka. Learning to balance the inverted pendulum using neural networks. In 1991 IEEE International Joint Conference on Neural Networks, 1991, pages 214–219, 1991.