Deep Drone Acrobatics
Robotics: Science and Systems 2020, Corvallis, Oregon, USA, July 12-16, 2020

Elia Kaufmann∗‡, Antonio Loquercio∗‡, René Ranftl†, Matthias Müller†, Vladlen Koltun†, Davide Scaramuzza‡

∗Equal contribution. ‡Robotics and Perception Group, University of Zurich and ETH Zurich. †Intelligent Systems Lab, Intel.

Fig. 1: A quadrotor performs a Barrel Roll (left), a Power Loop (middle), and a Matty Flip (right). We safely train acrobatic controllers in simulation and deploy them with no fine-tuning (zero-shot transfer) on physical quadrotors. The approach uses only onboard sensing and computation. No external motion tracking was used.

Abstract—Performing acrobatic maneuvers with quadrotors is extremely challenging. Acrobatic flight requires high thrust and extreme angular accelerations that push the platform to its physical limits. Professional drone pilots often measure their level of mastery by flying such maneuvers in competitions. In this paper, we propose to learn a sensorimotor policy that enables an autonomous quadrotor to fly extreme acrobatic maneuvers with only onboard sensing and computation. We train the policy entirely in simulation by leveraging demonstrations from an optimal controller that has access to privileged information. We use appropriate abstractions of the visual input to enable transfer to a real quadrotor. We show that the resulting policy can be directly deployed in the physical world without any fine-tuning on real data. Our methodology has several favorable properties: it does not require a human expert to provide demonstrations, it cannot harm the physical system during training, and it can be used to learn maneuvers that are challenging even for the best human pilots. Our approach enables a physical quadrotor to fly maneuvers such as the Power Loop, the Barrel Roll, and the Matty Flip, during which it incurs accelerations of up to 3g.

SUPPLEMENTARY MATERIAL

A video demonstrating acrobatic maneuvers is available at https://youtu.be/2N_wKXQ6MXA. Code can be found at https://github.com/uzh-rpg/deep_drone_acrobatics.

I. INTRODUCTION

Acrobatic flight with quadrotors is extremely challenging. Human drone pilots require many years of practice to safely master maneuvers such as power loops and barrel rolls¹. Existing autonomous systems that perform agile maneuvers require external sensing and/or external computation [23, 1, 3]. For aerial vehicles that rely only on onboard sensing and computation, the high accelerations required for acrobatic maneuvers, together with the unforgiving requirements on the control stack, raise fundamental questions in both perception and control. Acrobatic maneuvers therefore provide a natural benchmark for comparing the capabilities of autonomous systems against trained human pilots.

¹https://www.youtube.com/watch?v=T1vzjPa5260

Acrobatic maneuvers represent a challenge for the actuators, the sensors, and all physical components of a quadrotor. While hardware limitations can be resolved using expert-level equipment that allows for extreme accelerations, the major limiting factor for agile flight is reliable state estimation. Vision-based state estimation systems either provide significantly reduced accuracy or fail completely at high accelerations due to effects such as motion blur, large displacements, and the difficulty of robustly tracking features over long time frames [4]. Additionally, the harsh requirements of fast and precise control at high speeds make it difficult to tune controllers on the real platform, since even tiny mistakes can result in catastrophic outcomes.

The difficulty of agile autonomous flight has led previous work to focus mostly on specific aspects of the problem. One line of research focused on the control problem, assuming near-perfect state estimation from external sensors [23, 1, 15, 3]. While these works showed impressive examples of agile flight, they focused purely on control. The issues of reliable perception and state estimation during agile maneuvers were cleverly circumvented by instrumenting the environment with sensors (such as Vicon and OptiTrack) that provide near-perfect state estimation to the platform at all times. Recent works addressed the control and perception aspects in an integrated way via techniques like perception-guided trajectory optimization [9, 10, 33] or training end-to-end visuomotor agents [34]. However, acrobatic performance of high-acceleration maneuvers with only onboard sensing and computation has not yet been achieved.

In this paper, we show for the first time that a vision-based autonomous quadrotor with only onboard sensing and computation is capable of autonomously performing agile maneuvers with accelerations of up to 3g, as shown in Fig. 1. This contribution is enabled by a novel simulation-to-reality transfer strategy, which is based on abstraction of both visual and inertial measurements. We demonstrate both formally and empirically that the presented abstraction strategy decreases the simulation-to-reality gap with respect to a naive use of sensory inputs. Equipped with this strategy, we train an end-to-end sensorimotor controller to fly acrobatic maneuvers exclusively in simulation. Learning agile maneuvers entirely in simulation has several advantages: (i) maneuvers can simply be specified by reference trajectories in simulation and do not require expensive demonstrations by a human pilot, (ii) training is safe and does not pose any physical risk to the quadrotor, and (iii) the approach can scale to a large number of diverse maneuvers, including ones that can only be performed by the very best human pilots.

Our sensorimotor policy is represented by a neural network that combines information from different input modalities to directly regress thrust and body rates. To cope with the different output frequencies of the onboard sensors, we design an asynchronous network that operates independently of the sensor frequencies. This network is trained in simulation to imitate demonstrations from an optimal controller that has access to privileged state information.
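To make the multi-modal, asynchronous design described above concrete, the following is a minimal sketch of how such a policy could be structured. It is an illustrative approximation, not the authors' released implementation: the use of PyTorch, the temporal-convolution branches, the layer sizes, the input dimensions, and the reference-state input are all assumptions. The only property it is meant to convey is that each modality is processed over a fixed-length history by its own branch before fusion into a four-dimensional command (collective thrust and three body rates).

import torch
import torch.nn as nn

class SensorimotorPolicy(nn.Module):
    """Illustrative multi-branch policy: fuses an IMU history and a history of
    abstracted visual observations and regresses thrust and body rates."""

    def __init__(self, imu_dim=6, feat_dim=40, ref_dim=10, horizon=8):
        super().__init__()
        # Each branch consumes a fixed-length history sampled at its own sensor
        # frequency, so the branches can be refreshed asynchronously.
        self.imu_branch = nn.Sequential(
            nn.Conv1d(imu_dim, 32, kernel_size=2), nn.LeakyReLU(),
            nn.Conv1d(32, 32, kernel_size=2), nn.LeakyReLU(),
            nn.Flatten())
        self.vision_branch = nn.Sequential(
            nn.Conv1d(feat_dim, 32, kernel_size=2), nn.LeakyReLU(),
            nn.Conv1d(32, 32, kernel_size=2), nn.LeakyReLU(),
            nn.Flatten())
        fused_dim = 2 * 32 * (horizon - 2) + ref_dim
        self.head = nn.Sequential(
            nn.Linear(fused_dim, 128), nn.LeakyReLU(),
            nn.Linear(128, 4))  # [collective thrust, body rates p, q, r]

    def forward(self, imu_hist, feat_hist, ref):
        # imu_hist:  (B, imu_dim, horizon)  gyroscope/accelerometer history
        # feat_hist: (B, feat_dim, horizon) history of abstracted visual observations
        # ref:       (B, ref_dim)           reference state of the current maneuver
        fused = torch.cat([self.imu_branch(imu_hist),
                           self.vision_branch(feat_hist),
                           ref], dim=1)
        return self.head(fused)

Under this kind of structure, fresh inertial samples can update the command at a high rate even when the most recent camera-derived input is older, which is the practical meaning of operating independently of the sensor frequencies.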
We apply the presented approach to learning autonomous execution of three acrobatic maneuvers that are challenging even for expert human pilots: the Power Loop, the Barrel Roll, and the Matty Flip. Through controlled experiments in simulation and on a real quadrotor, we show that it leads to robust and accurate policies that reliably perform the maneuvers with only onboard sensing and computation.

II. RELATED WORK

Acrobatic maneuvers comprehensively challenge perception and control stacks. The agility required to perform acrobatic maneuvers demands carefully tuned controllers together with accurate state estimation. Compounding the challenge, the large angular rates and high speeds that arise during the execution of a maneuver induce strong motion blur in vision sensors and thus compromise the quality of state estimation.

The complexity of the problem has led early works to focus only on the control aspect while disregarding the question of reliable perception. Lupashin et al. [23] proposed iterative learning of control strategies to enable multiple flips of a quadrotor.

Aggressive flight with only onboard sensing and computation is an open problem. The first attempts in this direction were made by Shen et al. [33], who demonstrated agile vision-based flight. The work was limited to low-acceleration trajectories and therefore only accounted for part of the control and perception problems encountered at high speed. More recently, Loianno et al. [20] and Falanga et al. [10] demonstrated aggressive flight through narrow gaps with only onboard sensing. Even though these maneuvers are agile, they are very short and cannot be repeated without re-initializing the estimation pipeline. Using perception-guided optimization, Falanga et al. [9] and Lee et al. [17] proposed a model-predictive control framework to plan aggressive trajectories while minimizing motion blur. However, such control strategies are too conservative to fly acrobatic maneuvers, which always induce motion blur.

Abolishing the classic division between perception and control, a recent line of work proposes to train visuomotor policies directly from data. Similarly to our approach, Zhang et al. [34] trained a neural network from demonstrations provided by an MPC controller. While the latter has access to the full state of the platform and knowledge of obstacle positions, the network only observes laser range finder readings and inertial measurements. Similarly, Li et al. [18] proposed an imitation learning approach for training visuomotor agents for the task of quadrotor flight. The main limitation of these methods lies in their sample complexity: large amounts of demonstrations are required to fly even straight-line trajectories. As a consequence, these methods were only validated in simulation or were constrained to slow hover-to-hover trajectories.

Our approach employs abstraction of sensory input [27] to reduce the problem's sample complexity and enable zero-shot sim-to-real transfer. While prior work has demonstrated the possibility of controlling real-world quadrotors with zero-shot sim-to-real transfer [21, 32], our approach is the first to learn an end-to-end sensorimotor mapping – from sensor measurements to low-level controls – that can perform high-speed and high-acceleration acrobatic maneuvers on a real physical system.
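To illustrate what an abstraction of the visual input can look like in practice, the sketch below reduces a pair of consecutive grayscale frames to a fixed-size set of normalized feature tracks, a representation that can be computed in exactly the same way from rendered and from real images. This is only a sketch of the general idea, not the pipeline used in the paper: the Shi-Tomasi detector, the Lucas-Kanade tracker, the number of tracks, and the normalization scheme are assumptions chosen for illustration.

import cv2
import numpy as np

def feature_track_abstraction(prev_gray, curr_gray, max_tracks=40):
    """Return a (max_tracks, 4) array of [x, y, dx, dy] feature tracks, with
    positions and displacements normalized by image size and unused rows zeroed."""
    h, w = prev_gray.shape
    tracks = np.zeros((max_tracks, 4), dtype=np.float32)
    # Detect corners in the previous frame and track them into the current frame.
    prev_pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=max_tracks,
                                       qualityLevel=0.01, minDistance=10)
    if prev_pts is None:
        return tracks
    curr_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, prev_pts, None)
    good = status.ravel() == 1
    p0 = prev_pts.reshape(-1, 2)[good]
    p1 = curr_pts.reshape(-1, 2)[good]
    n = min(len(p0), max_tracks)
    tracks[:n, 0:2] = p0[:n] / [w, h]             # normalized keypoint position
    tracks[:n, 2:4] = (p1[:n] - p0[:n]) / [w, h]  # normalized displacement (optical flow)
    return tracks

Because both the simulated renderer and the real camera are reduced to the same low-dimensional track representation, the policy never observes the photometric detail in which the two domains differ, which is what makes zero-shot transfer of the visual input plausible.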
III. OVERVIEW

In order to perform acrobatic maneuvers with a quadrotor, we train a sensorimotor controller to predict low-level actions from a history of onboard sensor measurements and a user-