Toward Optimal Motion Planning for Dynamic Robots: Applications On-Orbit by Keenan Eugene Sumner Albee

Submitted to the Department of Aeronautics and Astronautics in partial fulfillment of the requirements for the degree of

Master of Science in Aerospace Engineering

at the

MASSACHUSETTS INSTITUTE OF TECHNOLOGY

June 2019

© Massachusetts Institute of Technology 2019. All rights reserved.

Author: Signature redacted. Department of Aeronautics and Astronautics, May 23, 2019.
Certified by: Signature redacted. David W. Miller, Professor, Aeronautics and Astronautics, Thesis Supervisor.
Accepted by: Signature redacted. Sertac Karaman, Chairman, Department Committee on Graduate Theses.

Toward Optimal Motion Planning for Dynamic Robots: Applications On-Orbit by Keenan Eugene Sumner Albee

Submitted to the Department of Aeronautics and Astronautics on May 23, 2019, in partial fulfillment of the requirements for the degree of Master of Science in Aerospace Engineering

Abstract

Robotic motion planning is a well-studied field at the intersection of optimal control, artificial intelligence, and applied mechanics. Methods for robots without dynamics are well-studied and are often practical for real-time deployment on systems. However, robots that have dynamics, operate with uncertainty, or navigate difficult environments are at the edge of what is deployable today. While one may be able to develop a kinematic trajectory for a stationary, fully-actuated robotic manipulator with modest degrees of freedom, developing strategies for underactuated systems of nearly any dimensionality remains a formidable challenge. This thesis addresses two primary corners of this problem, with specific focus on a use case of microgravity robotics: (1) how can one develop trajectories, ideally optimally and in real time, for systems with high dimensionality and challenging environmental/dynamical constraints, and (2) how can uncertain parameters be dealt with explicitly via motion planning. These methods are placed in context with discussion of the on-orbit assembly and servicing scenarios in which they might be used. A launch manifesting strategy for future missions requiring a large number of components on a mix of rockets is presented. These tools are a step in the direction of greater autonomy on-orbit, with wider applicability to robotic systems at large.

Thesis Supervisor: David W. Miller Title: Professor, Aeronautics and Astronautics

Acknowledgments

This work was performed primarily under the NASA Space Technology Research Fellowship program, grant number 80NSSC17K0077. Additional support was provided by the NASA Jet Propulsion Laboratory's iSAT investigation, subcontract number 1613399. The author gratefully acknowledges these sponsors for enabling this research.

I wouldn't have been able to piece two words together, let alone write this thesis, if it weren't for my family. Thanks mom and dad, for making countless sacrifices for me, Cyril, and Miranda, and always supporting our choices in life. Kristina, you are amazing and I enjoy every minute we have together in Cambridge.

My research advisor, Professor Miller, got me started in the lab and has always offered useful feedback and occasional scary plane rides; thank you for your support. Thanks to the SPHERES team for onboarding the final SPHERES generation, especially Will and Tonio for endless support and advice. Sorry Tonio for almost letting you freeze to death in Montana. Thanks Danilo for saving us many times. The rest of the SSL team has kept the lab going and made MIT a better place: Charlotte, Alex, Hailee, Daniel, Michael, and everyone else. Thanks to Bobby and other AeroAstro friends for being great colleagues and sometimes better hockey teammates. Thanks to Professor Mike for helping me get through quals and always offering advice. Thanks In Won and Brian for your advice and support, especially over the summer.

Finally, many past mentors taught me to be a better engineer, and for that I am forever grateful. Recalling just some of them, thank you to Al Bowers, Oscar Murillo, Tom Andrews, Kim Harrison, Darwin Chang, Kirk Svartstrom, Jason Graham, John O'Neil, Doug Adams, Prof. Marc Donohue, Prof. Sunil Agrawal, Haohan Zhang, and Adrian Stoica.

I would also like to acknowledge research collaborators that I worked with directly

The title of this thesis was partially inspired by the title of an ICRA 2019 workshop, "Toward Online Optimal Control of Dynamic Robots," which happened to combine multiple buzzwords the author was floating around in a sensible way.

on portions of this work, particularly Monica Ekal, Ryan de Freitas Bart, Hailee Hettrick, Oliver Jia-Richards, Alex Steighner, and the iSAT team.

Contents

1 Introduction
  1.1 Motivation
    1.1.1 Robotic Motion Planning
    1.1.2 Microgravity Space Robotics
  1.2 Simulators and Testbeds
    1.2.1 SPHERES
    1.2.2 Astrobee
  1.3 Thesis Organization and Approach

2 Background and Research Gap
  2.1 Robotic Space Systems
    2.1.1 Microgravity Robotics
    2.1.2 Planetary Surface Robotics
  2.2 Motion Planning
    2.2.1 Discrete Planning
    2.2.2 Classical Robotic Motion Planning
    2.2.3 Control and Dynamical Systems Theory
    2.2.4 Optimal Control: Optimization for Controlling Dynamical Systems
    2.2.5 Trajectory Optimization: Adapting Mathematical Optimization for Dynamical Systems
    2.2.6 Planning under Uncertainty
  2.3 Research Gap for Microgravity Robotics

3 Dynamical Systems of Interest
  3.1 Dynamics Preliminaries
    3.1.1 Attitude Representation
    3.1.2 Attitude Kinematics and Dynamics
    3.1.3 Translation Kinematics and Dynamics
  3.2 Model Systems
    3.2.1 Double Integrator
    3.2.2 3 DoF Rigid Body
    3.2.3 Rigid Body Satellite (6 DoF Rigid Body)
    3.2.4 Robotic Manipulator Systems (Free-Flying and Free-Floating)
    3.2.5 Challenges
    3.2.6 Demonstrations in Simulation

4 Motion Planning for Challenging Dynamical Robots
  4.1 Constraints
    4.1.1 Dynamics Constraints
    4.1.2 Environmental Constraints
  4.2 Underlying Scenario
    4.2.1 Planning for Complex Dynamics
    4.2.2 Systems of Interest
  4.3 Sampling-Based Planning (SBP)
    4.3.1 Optimal SBP
    4.3.2 Dynamics-Aware SBP
    4.3.3 Dynamics-Aware Optimal SBP
  4.4 SBP for Satellite Dynamics
  4.5 Results for Satellite Dynamics

5 Planning for Information Gain
  5.1 Background on Probability
  5.2 Underlying Scenario
    5.2.1 Planning with Uncertain Parameters
    5.2.2 System of Interest
  5.3 Relevant Methods
    5.3.1 Real-Time Motion Planning
    5.3.2 Adaptive and Robust Control
    5.3.3 Estimation Methods and Parameter Estimation
  5.4 Real-time Planning for Information Gain
    5.4.1 The Fisher Information Matrix
    5.4.2 Nonlinear Model Predictive Control
    5.4.3 Implementation Details
  5.5 Algorithm Results
    5.5.1 Demonstration of Updating Parameters
    5.5.2 Results on a 6 DoF System
    5.5.3 Summary of Framework

6 Systems Considerations and Launch Manifesting Optimization
  6.1 Scenarios for On-Orbit Motion Planning
  6.2 Case Study: Launch Manifesting Optimization
    6.2.1 Large-Aperture Assembled Telescope Background
    6.2.2 Problem Formulation
    6.2.3 Existing Approaches and Challenges
    6.2.4 Integer Optimization Approach
    6.2.5 Greedy Packing Approach
    6.2.6 Repacking
    6.2.7 Test Cases and Results
  6.3 Summary and Contrast to Motion Planning Optimization

7 Conclusions and Future Work
  7.1 Conclusions and Contributions
  7.2 Future Work

A … and Component Tables
  A.1 20 m In-Space Assembled Telescope
  A.2 Launch Vehicles

B Robust Optimization for Satellite Control
  B.1 Robust Optimization Primer
  B.2 Robust Receding Horizon Control for 3 DoF Dynamics
    B.2.1 Problem Formulation
    B.2.2 Creating a Robust …
    B.2.3 Results

C The SPHERES Testbed
  C.1 LQR Position Controllers for SPHERES
  C.2 Lessons Learned and Practical Advice

List of Figures

1-1 Atlas biped jumping over a log.
1-2 SPHERES with …
1-3 Astrobee robot on a granite table.

2-1 List of robotic servicers and free-flyers.
2-2 Repair of Solar Max.
2-3 Recovery of two satellites on STS 51-A.
2-4 The fundamental trajectory optimization problem.
2-5 Decision-making flowchart for kinodynamic planning.
2-6 Decision-making flowchart for kinodynamic planning, with emphasis.

3-1 6 DoF dynamics example.
3-2 Free-flying satellite visualization.
3-3 Free-flying dynamics demonstration.

4-1 RRT algorithm explanation.
4-2 RRT options chart.
4-3 RRT* algorithm explanation.
4-4 RRT* example.
4-5 The two-point boundary value problem.
4-6 LQR-RRT* diagram.
4-7 Astrobee-compatible implementation summary.
4-8 RRT demonstration example.
4-9 2D double integrator example.
4-10 3D double integrator example.
4-11 Underactuated 3 DoF diagram.
4-12 Guided-Kino-RRT example 1.
4-13 Guided-Kino-RRT example 2.
4-14 Guided-Kino-RRT example 3.
4-15 Guided-Kino-RRT results.
4-16 The 6 DoF satellite …
4-17 6 DoF Kino-RRT results.

5-1 Uncertain trajectory planning diagram.
5-2 Benefit: improved tracking.
5-3 Benefit: improved parameter knowledge.
5-4 Planning for information gain system diagram.
5-5 Online mass update, 2D double integrator.
5-6 Information gain demonstration, 3 DoF.
5-7 6 DoF with no information gain.
5-8 6 DoF with information gain.
5-9 Tracking results with and without parameter updates.
5-10 Information weighting term evolution.

6-1 A construction site scenario.
6-2 An example of a Surveyor.
6-3 Previous on-orbit assembled telescope proposals.
6-4 Rocket options and components.
6-5 Packing generation.
6-6 The heuristic manifesting process.
6-7 Combining packings pairwise.
6-8 Spreading packings between all launch vehicles.
6-9 20 m telescope test case result.
6-10 12 m telescope test case result.

7-1 A manipulator-equipped satellite scenario.

B-1 SDP results compared to LQR.
B-2 SOCP results compared to LQR.
B-3 SOCP results relative to SDP.

C-1 SPHERES LQR sim, fast.
C-2 SPHERES LQR sim, slow.
C-3 SPHERES LQR hardware, fast.
C-4 SPHERES LQR hardware, slow.

Chapter 1

Introduction

1.1 Motivation

Today's robotic systems have achieved stunning advances in their ability to operate in highly dynamic ways, exploiting their natural dynamics and moving away from strict assumptions on their environment. In the robotic biped field, Boston Dynamics' Atlas (Figure 1-1) is a prime example of the agility of some of today's most advanced robots. Yet most systems remain limited in their deployment to real-world scenarios: robotic bipeds aren't wandering around the streets of Manhattan just yet. Further progress in multiple areas of robotics is required before robots with non-negligible dynamics operating under uncertainty are deployed for practical use. In particular, the space domain, which is traditionally risk-averse, demands methods that have an understanding of system dynamics and deal with the inevitable uncertainties that arise from the system and its environment.

1.1.1 Robotic Motion Planning

One bottleneck is robotic motion planning, detailed further in Chapter 2. At a high level, motion planning involves determining how to transition a robot between a set of configurations in order to reach a goal while avoiding obstacles, respecting the physical laws of motion, and obeying other constraints. Motion planning, particularly for systems with dynamics, is hindered in its ability to deal with systems with complex constraints (e.g., those with certain types of challenging dynamics) and to compute fast enough to be useful for real-time implementation.

Figure 1-1: Boston Dynamics' Atlas robot has capabilities that allow it to operate dynamically in complex environments, like running over obstacles [1].

Imagine, for example, a UAV zooming through a forest. The environment is not known beforehand, so the UAV must figure out how to avoid obstacles on-the-fly as its perception system identifies them. Further, this UAV has significant dynamics: it cannot simply be modeled as a dynamics-free bounding box moving through a dense forest (see the piano mover's problem for an interesting variety of bounding box [2]). There may also be disturbances, for example wind or incorrect modeling of the propellers' flow environment. All told, the UAV has a challenging task ahead of it. The state of the art today deals with such problems under certain simplifying assumptions, detailed further in Chapter 2. However, fast and effective motion planning is coming closer and closer to practical deployment. Continuing the UAV analogy, Skydio produces one of the most advanced such systems today for consumer use.

1.1.2 Microgravity Space Robotics

Advances in robotics at large directly benefit microgravity space robotics, an area seeing increased interest for use in on-orbit servicing, assembly, and debris removal. Greater motion planning capability offers the opportunity for space robots to operate in real time under true autonomy, rather than relying on teleoperation by human tenders. Following a few on-orbit servicing demonstrations in the early 2000s and work involving Shuttle and the ISS, recent interest centers largely on proposed satellite servicing missions and assistive free-flyers. Chapter 6 provides some context for the different roles of robotic motion planning in microgravity space robotics, and Chapter 2 goes deeper into the background on past missions and new proposals.

1.2 Simulators and Testbeds

Some robotic free-flyer testbeds exist today on-orbit. These state-of-the-art systems allow for testing in an authentic microgravity environment. A variety of simulators and ground analogs exist as well. Ground hardware testing typically involves either a glass or granite table: an exceptionally flat, smooth surface that allows three-degree-of-freedom (DoF) frictionless movement. Typically, algorithmic development occurs in simulation, moves to these ground hardware testbeds, then progresses to on-orbit hardware testing once matured. The SPHERES platform at MIT was used in ground testing of some of the algorithms mentioned in this thesis, and is a target for future on-orbit validation. Below are some of the testbeds and simulation environments of interest to robotic free-flyer testing.

1.2.1 SPHERES

The SPHERES satellites (Figure 1-2) onboard the International Space Station are the most prominent example of a robotic free-flyer, serving as a research testbed for autonomy and controls since their launch in 2006 [3][4]. The satellites each have 12 CO2 compressed gas thrusters and an ultrasound-based global metrology system, allowing them to operate in the Japanese Experiment Module (JEM). Peripherals, including a stereo vision camera system and docking ports, can be added to the satellites. More than 100 test sessions have been conducted with these satellites, which serve as a microgravity testbed for satellite control and autonomy algorithms.

Figure 1-2: The SPHERES satellites, which operate onboard the International Space Station, with astronaut Scott Kelly [5].

The SPHERES program also includes a MATLAB/Simulink/C simulation environment and a glass table testing facility. The glass table facility was used for verifying online parameter updates during the algorithmic development from Chapter 5.

1.2.2 Astrobee

The soon-to-be-operational Astrobee system [6] (Figure 1-3) will include a 2 DoF robotic manipulator and will be the first microgravity testbed of its kind for free-flying manipulator dynamics. Such systems are expected to become increasingly prevalent as groups look toward on-orbit servicing, assembly, and assistive tasks; effective motion planning will be crucial in controlling these difficult systems. Astrobee uses a vision-based estimation system, and applies forces and torques using an impeller-vent system, which provides a constant flow of air to dispatch from a set of 12 vents, providing holonomic control [7]. Astrobee's computational power exceeds that of SPHERES, and it serves a dual purpose as both a research testbed and an astronaut-assistive free-flyer. Astrobee's software stack is built on ROS, the Robot Operating System [8].

Figure 1-3: Astrobee, shown here on an air-bearing in the ground environment, is undergoing checkout procedures aboard the ISS in 2019. It is the first of its kind in terms of robotic manipulation testbeds [9].

The Astrobee platform has an accompanying simulator built on ROS/Gazebo, with flight software mainly written in C++. ROS is message-passing middleware for robotics, relying on the concept of nodes which pass messages to one another to relay data streams; it is ubiquitous in robotics applications. Gazebo is a ROS-compatible simulation environment that provides physics simulation and hardware device simulators, among other capabilities.

1.3 Thesis Organization and Approach

This thesis discusses two primary research thrusts related to motion planning for dynamical robotic systems, with particular emphasis on applications in microgravity space robotics. Chapter 2 provides additional background on space robotic systems and the past and present of motion planning research. Chapter 3 provides dynamics background and outlines some space robotics model systems that will be investigated. Chapter 4 addresses motion planning for dynamical systems with particularly challenging constraints (e.g., underactuation, high dimensionality) and proposes a motion planning algorithm that, while not optimal, can provide first-pass solutions to other optimizers. Chapter 5 introduces a novel approach to execute motions that blend optimal trajectories with ones that allow uncertain parameters to be learned, providing better online tracking. Chapter 6 discusses different roles of microgravity robots and provides a case study of a contrasting optimization problem, introducing a novel approach to optimal launch manifesting. Finally, Chapter 7 summarizes the contributions of Chapters 4 through 6 and provides areas for future development. Throughout, particular emphasis is placed on applications for microgravity space robotics.

Chapter 2

Background and Research Gap

2.1 Robotic Space Systems

Space robotics, while sometimes referred to as a whole, actually differs quite significantly between two primary use environments: satellites operating in microgravity and robots operating on a planetary surface. The NASA Technology Roadmaps make this distinction clear [10]. Here, background on the state of microgravity robotics will be given, with some discussion of planetary surface robotics.

2.1.1 Microgravity Robotics

A comprehensive summary of current space robotics systems is provided by Flores-Abad et al. [11]. Further, the NASA Technology Roadmaps recommend development of multiple areas in microgravity robotics, particularly autonomous grappling and interacting with tumbling targets [10]. Microgravity robotics is an umbrella term for robotic on-orbit assembly (constructing multi-component structures or vehicles in microgravity), on-orbit servicing (rendezvousing with and repairing broken satellites), and an emerging class of smaller free-flyers that can perform a variety of tasks. The majority of the space robotic manipulator and free-flying systems currently deployed or seriously proposed are shown in Figure 2-1. The decreasing cost of access to orbit and the growing capability of autonomous systems have led to a recent resurgence of interest in all of these varieties of space robotic systems.

The reader should note that there is a distinction between satellites equipped with manipulators (e.g. Restore-L) and those without manipulators (e.g. SPHERES). Both types of satellites can be thought of as robots, but a distinction is made in the planning literature since manipulator-equipped satellites have additional complexity in their dynamics. Manipulator-equipped satellites are referred to as free-flyers if they also have thrusters available to use, while those without thrusters are called free-floaters. Confusingly, the term free-flyer is sometimes used broadly to refer to any small robotic spacecraft with thrusters, regardless of whether it has a manipulator. Chapter 3 goes into greater detail on manipulator-equipped dynamics.

On-Orbit Assembly

On-orbit assembly is sought-after because it enables the creation of structures larger than launch vehicle fairing capacity and allows for the harnessing of in-space manufactured materials. A significant amount of work has been performed in human-operated assembly, from the days of Gemini and Apollo, when docking was performed manually, to modern-day ISS operations, which make significant use of the SSRMS for satellite berthing. Various proposals have also been made for beam or truss assemblers and construction robots, e.g. Skyworker [13] and the Beam Assembly Teleoperator [14]. There have even been studies of human-only assembly, including a series of studies on Shuttle where truss structures were assembled by hand, e.g. STS 61-B.

Operating autonomously or under teleoperation, more recent on-orbit assembly proposals relate to the creation of large structures on-orbit (e.g. [15]), or the joining of modules together for space stations. On-orbit assembly remains a dream today, but serious investigations are underway for the use of robotic assemblers on a large scale, e.g. NASA JPL's iSAT large telescope study. The SPHERES testbed is one example of a microgravity research platform enabling this work [16].

[Figure panels: existing manipulators and servicers; proposed servicers; existing free-flyers.]

Figure 2-1: The majority of flown and proposed on-orbit manipulators and free-flyers. From top left, left-to-right and downward: Shuttle/Station Remote Manipulator System (SSRMS); JAXA ETS-VII; NASA Restore-L; Special Purpose Dexterous Manipulator (DEXTRE); Orbital Express; DARPA RSGS; NASA AERCam Sprint; NASA Astrobee; NASA/MIT SPHERES. There are other proposed missions, including many varieties of free-flyer [9][5][12].

On-Orbit Servicing

On-orbit servicing dates back to the Shuttle days, including human-only tasks and human teleoperation of the SSRMS to grapple non-tumbling targets. STS-41-C performed the first on-orbit repair of an unmanned satellite, attempting to fix Solar Max, shown in Figure 2-2. STS 51-A in particular recovered two improperly-placed satellites and returned them safely to Earth. The Hubble missions are one of the most spectacular examples of the value of on-orbit servicing, saving the space telescope from an improperly ground mirror and performing subsequent upgrades. Hubble remains in service to this day, in 2019. Some robotic servicing approaches (e.g. Ranger) came very close to launch, but were never flown.

Figure 2-2: Astronaut George Nelson uses the MMU to hand-grapple Solar Max in the first instance of on-orbit servicing. This attempt inadvertently sent the satellite tumbling, nearly ending the mission. Solar Max went on to operate an additional five years [17].

Missions in the early 2000s began to take a more robotic approach to on-orbit servicing, including a number of demonstration missions. Some notable efforts on the microgravity manipulator side include the proposed Restore-L servicer [18] and the ETS-VII on-orbit demonstration [19]. Orbital Express was another famous demonstration of on-orbit servicing capability that rendezvoused with, docked with, and inspected a target. On-orbit servicing has a larger potential for practical application today, with multiple proposed manipulator-equipped missions in the works (e.g. [20][21]), mainly for satellite servicing but some for self-servicing/assembly of stowed spacecraft (e.g. Dragonfly).

Figure 2-3: STS 51-A recovered the Palapa B2 and Westar 6 satellites for return to Earth and relaunch. Here, an astronaut holds the satellites for ransom. This was one of the first instances of on-orbit servicing [22].

Free-Flying Assistance and Other Tasks

Free-flyers are an emerging class of microgravity robotic spacecraft that are often used as testbeds, assistants, or to perform a variety of supportive tasks. They are generally smaller than traditional satellites, and as such need careful consideration of their dynamics if they are equipped with manipulator arms, due to base disturbance. Today free-flyers exist only in research form, though their potential for a plethora of on-orbit uses is promising and is bringing about further research. Flores-Abad et al. give an extensive summary of free-flyer history, arguably beginning with AERCam Sprint [23], which flew untethered in the Shuttle payload bay. The Personal Satellite Assistant [24] was a NASA Ames proposed IVA free-flyer [11], though it did not make it to space. The MIT SPHERES testbed, launched in 2006, has had a long and successful history of autonomy and controls research in microgravity [3][4]. Launching in 2019, Astrobee will bring the first manipulator-equipped free-flyer testbed to orbit [6][7][25].

2.1.2 Planetary Surface Robotics

Only a brief discussion of planetary surface robotics will be provided here. Unlike microgravity robots, planetary surface robots deal with gravity and operate in potentially complex surface environments. These robots are often tested using Earth analogues in extreme environments replicating conditions seen on other planets, e.g. [26][27][28]. The concerns with these robots are often mechanism design and dealing with certain types of particularly challenging environments, like wall-climbing, rocky surfaces, or extreme lighting conditions [29]. The most successful planetary surface robots to date are the series of Mars rovers produced by NASA's Jet Propulsion Laboratory, e.g. Sojourner, MER [30], and MSL. All of these are rovers that use a rocker-bogie suspension system. Limbed robots, flying robots, and other varieties of planetary surface robots have yet to be flown. However, advances in robotics will surely be integrated into future systems due to increased productivity and safety from greater autonomy.

2.2 Motion Planning

Pavone et al. [31] detail some of the autonomy challenges facing today's microgravity robotic explorers, among them motion planning. The motion planning literature is vast and extends across robotics and multiple other fields. An overview of the breadth of planning algorithms is provided in the subsequent sections. This material is primarily covered in the standard references [2][32], with some ideas from the optimal controls literature provided in Kirk [33]. Some ideas introduced here are common across multiple disciplines, but each discipline has a preferred notation. Powell attempts to provide a common language between control theory, artificial intelligence, and operations research in [34]. Finally, some additional good surveys of robotic motion planning can be found in [35] and [36].

Figure 2-4: The fundamental trajectory optimization problem. A system with dynamics must avoid obstacles and obey other constraints while reaching its goal in a cost-minimizing way.

The fundamental trajectory optimization problem, which encompasses motion planning, is given by Equation 2.1 and is summarized by Paden [36]. Figure 2-4 shows the essential problem. A robot, which has dynamics, must navigate to some goal state (here a green circle) while avoiding obstacles (a satellite with panels). The fundamental trajectory optimization problem may be mathematically stated as:

    minimize_{u(t)}   J = h(x(t_f), t_f) + ∫_{t_0}^{t_f} g(x(t), u(t), t) dt

    subject to   ẋ(t) = f(x(t), u(t), t)
                 x(t_0) = x_start                                  (2.1)
                 x(t_f) ∈ X_goal
                 x(t) ∈ X_free
                 u ∈ U

Here, J is the cost function, x is the state vector, t_0 is the initial time, t_f is the final time, u is the input vector, h(x(t_f), t_f) is the terminal cost, and g(x, u, t) is the additive cost. X_goal defines a goal region, and X_free is the obstacle-free space. U defines the set of valid inputs. Note that the dynamics must be satisfied as a constraint, ẋ = f(x, u). Additional hard constraints such as collision avoidance are often required. Chapter 4 goes into further detail on some of the constraints that may be required.

Figure 2-5 broadly summarizes some of the decisions that must be made in choosing a motion planning method. Some of the chief questions are: does the system have dynamical constraints? What is the complexity class of the optimization problem? Does the solution need to run in real time? Are there additional constraints? And are the dynamics particularly "hard", i.e., do they prohibit the use of nonlinear optimization methods? This chart assumes a good model of the system exists: learning-based approaches are not considered, though they certainly have a place in certain versions of trajectory optimization.

[Flowchart omitted. Key branches: dynamics? (no: traditional motion planning); complexity class (convex/QP, convex/non-QP, nonconvex); constrained?; real-time?; dynamics and constraint difficulty. Outcomes include indirect methods, MPC, SOS and dynamic programming, adaptive and robust control, direct transcription, NMPC, and open research areas.]

Figure 2-5: A decision-making flowchart for choosing appropriate motion planning algorithms given some key information about a system. This does not incorporate every aspect of choosing the right approach, but captures some key parameters.

Real-time use is a key question when trajectory optimization exceeds the computational abilities of existing hardware: take, for instance, the growth of Model Predictive Control (MPC) in the 1970s and 1980s out of the process industry, which typically has plants that evolve over very long timeframes, allowing for extended computational effort [37]. As computational and algorithmic power expands, the breadth of methods that can be considered "real-time" also grows; for those methods beyond computational reach, algorithmic cleverness must be used to provide online solutions (desirably, with guarantees).

[Flowchart omitted. Same decision tree as Figure 2-5, with the paths most relevant to this thesis highlighted, including LQR and MPC for manageable cases, and direct transcription, NMPC, and open research areas for challenging ones.]

Figure 2-6: The motion planning areas that are most relevant to work covered in this thesis. Dark red highlights offline algorithms for challenging dynamics; if algorithms are fast enough for real-time use, they are suitable for offline use too.

The desired framework in this thesis follows the markings in Figure 2-6. The free-flying motion planning problem is one that should have dynamics considered, is (likely) nonconvex, must run online, and has "difficult" constraints such as obstacle avoidance and underactuation. Such methods are at the state of the art for motion planning. If instead the constraints are "manageable," then an online NLP-based method might be appropriate; considering system uncertainty, Chapter 5 proposes an NMPC information gain approach. More broadly, the kinodynamic planning presented in Chapter 4 attempts to bring challenging high-dimensional systems into the realm of real-time planning, and at the very least to tackle such systems offline. Also note that methods that are candidates for challenging real-time scenarios are also candidates for challenging offline scenarios, shown in dark red. The subsequent sections provide a broad background of motion planning, leading up to the present-day state of the art, much of which is covered in [2], [32], and [33].

2.2.1 Discrete Planning

Motion planning has undergone a long and complicated evolution through various disciplines. Some of the earliest work is in discrete planning for deterministic systems, for example classical search methods. The notation provided here largely follows that of the trajectory optimization formulation. Each situation a system can be in is described as a state, x, with state space X describing the set of all possible states. A state transition function, f, describes state transitions given some input (or action), u:

x' = f(x, u)   (2.2)

Here u ∈ U(x), the action space. There is also a goal set X_G ⊆ X which describes the set of goal states. The solutions for these systems generally arise from classical search methods, like breadth-first and depth-first search. Dynamic programming adds in notions of optimality and produces the principle of optimality, which states that portions of an optimal plan are themselves optimal, a vital piece of information for optimal search algorithms. Dijkstra's algorithm, which maintains a queue ordered by cost-to-come, iteratively checks whether a lower-cost cost-to-come path exists whenever a state x' is added. Value iteration, meanwhile, iteratively evaluates the result of applying u at each x and evaluates the cost-to-go (in backward value iteration). Generally, these approaches optimize over graphs and can sometimes be applied to continuous systems via discretization [32].
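As a concrete illustration of these graph search methods, the following sketch implements Dijkstra's algorithm with a priority queue ordered by cost-to-come. The example graph, state labels, and edge costs are illustrative, not taken from the thesis:

```python
import heapq

def dijkstra(adj, x_start, x_goal):
    """Dijkstra's algorithm: maintain a queue ordered by cost-to-come and
    relax edges whenever a cheaper path to a state is found."""
    cost = {x_start: 0.0}
    parent = {x_start: None}
    queue = [(0.0, x_start)]
    while queue:
        c, x = heapq.heappop(queue)
        if x == x_goal:
            break
        if c > cost[x]:
            continue  # stale queue entry; a cheaper path was already found
        for x_next, edge_cost in adj.get(x, []):
            c_new = c + edge_cost
            if c_new < cost.get(x_next, float("inf")):
                cost[x_next] = c_new
                parent[x_next] = x
                heapq.heappush(queue, (c_new, x_next))
    # Reconstruct the optimal path by walking parents back from the goal.
    path, node = [], x_goal
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1], cost[x_goal]

# Illustrative graph: each entry maps a state to (successor, edge cost) pairs.
adj = {"A": [("B", 1.0), ("C", 4.0)],
       "B": [("C", 1.0), ("D", 5.0)],
       "C": [("D", 1.0)]}
path, c = dijkstra(adj, "A", "D")
```

Note how the principle of optimality is what justifies the relaxation step: any prefix of the cheapest path found so far is itself a cheapest path.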

2.2.2 Classical Robotic Motion Planning

"Classical" robotic motion planning deals with robot kinematics in low dimensions, with constraints such as collision avoidance and self-intersection. The kinematic state space is often referred to as the configuration space, C, whose dimension equals the robot's number of DoF. Robot kinematics governs how linkages move, restricting the feasible configuration space due to collision and linkage geometry. In some cases, continuous planning for robot kinematics can be turned into decisions over a discrete set of actions, resulting in discrete planning (combinatorial motion planning). Sampling-based planning also uses discretization by selecting a subset of points in X to search over, but with different guarantees: combinatorial planning offers completeness, which guarantees finding a solution or correctly reporting failure; meanwhile, sampling-based planning usually offers probabilistic completeness, which states that success or failure will be reported in the limit. Classical robot motion planning often concerns itself with finding feasible paths in configuration space for lower dimensional robot kinematics. A classic example might be a wheeled robot navigating a 2D maze, or a 6 DoF arm determining a feasible path through its workspace [2]. Chapter 4 goes into greater depth on sampling-based planning (SBP).

2.2.3 Control and Dynamical Systems Theory

Separately, control has always approached problems involving dynamical systems. Attention has been paid to rigorous analysis with guarantees (e.g. optimality, stability, robustness) for certain classes of systems. Classical control has extensively examined single-input single-output (SISO) linear time-invariant (LTI) systems, and has produced many tools with strong theory to prove system stability and margin, for example. In particular, linear systems have a deep and well-established theory: this is a mature field. State space models, which extend the state space x stated in the previous section to include the derivatives of the configuration space, allow for powerful tools to be developed. LTI systems in particular have strong theory. Some of the major developments include controllability, observability, and closed-form solutions of the state transition matrix (STM). Methods such as pole placement exist for designing controllers (policies) that give a system desired behavior (e.g. the desired eigenvalues). These methods often break down for nonlinear systems: similar notions of controllability/observability exist in a less clean form (Lie brackets, etc.) [38] [39].
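The controllability test for LTI systems can be made concrete: a system (A, B) with n states is controllable if the matrix [B, AB, ..., A^(n-1)B] has full rank. The sketch below (pure Python, with the double integrator as the example system; the helper names are my own) builds this matrix and, for the 2 x 2 case here, checks rank via the determinant:

```python
def matmul(A, B):
    """Plain-Python matrix product for small dense matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def controllability_matrix(A, B):
    """Stack [B, AB, A^2 B, ...] column-wise, using n - 1 powers of A."""
    n = len(A)
    cols = [row[:] for row in B]
    power = B
    for _ in range(n - 1):
        power = matmul(A, power)
        for i in range(n):
            cols[i] = cols[i] + power[i]
    return cols

# Double integrator: x = [position, velocity], u = commanded acceleration.
A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0], [1.0]]
C_ctrb = controllability_matrix(A, B)
det = C_ctrb[0][0] * C_ctrb[1][1] - C_ctrb[0][1] * C_ctrb[1][0]
controllable = abs(det) > 1e-12  # full rank for this 2x2 case
```

The determinant shortcut only works because the controllability matrix is square here; in general a rank computation is needed.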

2.2.4 Optimal Control: Optimization for Controlling Dynamical Systems

Optimal control provides a framework to analyze the operation of dynamical systems while minimizing some measure of the system performance, a functional:

J = ∫_{t0}^{tf} g(x(t), u(t), t) dt   (2.3)

In classical optimal control results, indirect trajectory optimization methods rely on deriving the fundamental conditions for optimality through the calculus of variations. The Hamiltonian is formed, and the variation of the augmented cost functional is set to 0, yielding a variety of necessary conditions. Applications include Pontryagin's Minimum Principle (PMP) and clear results for certain low-dimensional or special systems, though these results break down for severe constraints or complex costs/dynamics. A classical example is time-optimal control of the double integrator (Chapter 3).
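The time-optimal double integrator result can be sketched numerically. The bang-bang switching curve x = -(1/2) v |v| for |u| ≤ 1 is the classical solution; the tolerance, step size, and initial condition below are illustrative choices of mine:

```python
def bang_bang(x, v, eps=1e-9):
    """Time-optimal (bang-bang) control for xddot = u with |u| <= 1.
    The switching function s = x + 0.5 * v * |v| selects full thrust toward
    the switching curve, then full braking along it to the origin."""
    s = x + 0.5 * v * abs(v)
    if s > eps:
        return -1.0
    if s < -eps:
        return 1.0
    return -1.0 if v > 0 else (1.0 if v < 0 else 0.0)

# Start at rest at x = 2; the optimal arrival time is 2*sqrt(2) s, so after
# 10 s of simulation the state should be chattering tightly around the origin.
x, v, dt = 2.0, 0.0, 1e-4
for _ in range(100000):
    u = bang_bang(x, v)
    v += u * dt          # semi-implicit Euler integration
    x += v * dt
```

In continuous time the trajectory slides exactly along the switching curve; the discrete simulation chatters around it, which is the usual numerical signature of bang-bang control.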

Linear Quadratic Regulator

A special result in optimal control involves a cost function of quadratic weighting on state error and control effort with linear dynamics. It happens that the optimal solution to such a problem involves full-state feedback. For the infinite horizon case, the solution is a constant linear feedback law, and the tools of feedback control and linear systems analysis become relevant. LQR is limited in that it cannot accept constraints, and systems must be in this special form. Linearization can extend LQR, and linear systems analysis in general, to the analysis of fixed points and feasible trajectories of nonlinear systems. There is no guarantee that such an approximation will be valid for arbitrary states of a nonlinear system, however. For a system of the form:

J = (1/2) ∫_{t0}^{∞} ( x^T Q x + u^T R u ) dt
ẋ = A x + B u   (2.4)

the optimal time-varying solution is:

-Ṗ = P A + A^T P + Q - P B R^{-1} B^T P   (2.5)

u*(t) = -R^{-1} B^T P(t) x(t)

Here, Q is a weighting matrix on state error, R is a weighting matrix on input use, A and B are the linear dynamics matrices, and P describes the costate dynamics. Appendix C provides an example of constructing a simple LQR controller.
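A minimal numerical sketch of this result: integrating the Riccati equation (Eq. 2.5) backward in time until it reaches its stationary solution yields the infinite-horizon gain. The double integrator plant and the identity weights are illustrative; for this case the known closed-form gain is K = [1, √3]:

```python
import numpy as np

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # double integrator dynamics
B = np.array([[0.0], [1.0]])
Q = np.eye(2)                             # state error weight (illustrative)
R = np.array([[1.0]])                     # input weight (illustrative)
Rinv = np.linalg.inv(R)

# March the Riccati ODE backward from P(tf) = 0; in backward time s = tf - t,
# dP/ds = P A + A^T P + Q - P B R^{-1} B^T P, which settles at the CARE root.
P = np.zeros((2, 2))
dt = 1e-3
for _ in range(20000):
    P = P + dt * (P @ A + A.T @ P + Q - P @ B @ Rinv @ B.T @ P)

K = Rinv @ B.T @ P   # optimal full-state feedback: u*(t) = -K x(t)
```

The fixed point of this iteration is exactly the algebraic Riccati solution, so after the transient decays the computed K matches the analytical gain.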

Dynamic Programming

Dynamic programming is also relevant to optimal control, largely due to Bellman and the operations research community [40]. Kirk states:

[...from the recurrence relation of dynamic programming] we obtain an exact solution to a discrete approximation of the optimization equation, whereas in performing a numerical solution to the HJB equation we obtain an approximate solution to the exact optimization equation. Both approaches lead to an optimal control law (closed-loop optimal control) [33].

Dynamic programming discretely models a dynamical system, and computes the optimal policy for an entire state space using an algorithm such as value iteration.

(Dynamic programming is the process of approximation; value iteration is the method for solving, though the two are often used interchangeably.) The result is a discrete approximation of the famous Hamilton-Jacobi-Bellman equation:

-∂J*/∂t = min_u [ g(x, u, t) + (∂J*/∂x)^T f(x, u) ]   (2.6)

Here, the variable definitions follow those given in the trajectory optimization formulation. This approach is limited to systems of low dimension due to the "curse of dimensionality," the exponential scaling in time complexity required for solution [33].
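A small backward value iteration sketch on a toy discretized system (the chain of states, unit step cost, and goal choice are all illustrative): the cost-to-go J converges to the fixed point of the Bellman recursion, which is the discrete counterpart of solving the HJB equation:

```python
states = range(5)      # 1-D chain of discretized states; state 4 is the goal
actions = [-1, +1]     # move left / move right

def step(x, u):
    """Deterministic state transition, clipped to the chain boundaries."""
    return min(max(x + u, 0), 4)

# Initialize: zero cost-to-go at the goal, infinite elsewhere.
J = {x: 0.0 if x == 4 else float("inf") for x in states}
for _ in range(10):    # iterate the Bellman backup until the fixed point
    J_new = {}
    for x in states:
        if x == 4:
            J_new[x] = 0.0
            continue
        # Unit cost per step plus the cost-to-go of the successor state.
        J_new[x] = min(1.0 + J[step(x, u)] for u in actions)
    J = J_new
```

The optimal policy is then recovered greedily: at each state, take the action minimizing the one-step cost plus successor cost-to-go. The exponential blow-up appears when the state grid has many dimensions, exactly as the text describes.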

2.2.5 Trajectory Optimization: Adapting Mathematical Optimization for Dynamical Systems

The success of mathematical optimization in providing fast, efficient solvers for a variety of problem types has led to the adaptation of trajectory optimization problems into a mathematical optimization framework. Broadly, mathematical optimization (also called programming) may be categorized as:

* Linear Optimization

* Quadratic Optimization

* Convex Optimization

* Nonlinear/Nonconvex Optimization

Optimization problems have an objective function, f(x), with decision variables x and a set of constraints which must be satisfied, g(x) for inequality and h(x) for equality (note that these are not the same as the g and h used when defining the cost function previously). The complexity class of the objective and constraints determines which variety of optimization a problem belongs to. Already, the problem statement parallels that of optimal control.

min f(x)

subject to   g_i(x) ≤ 0,   h_j(x) = 0   (2.7)

The watershed in optimization is often said to be convex versus nonconvex optimization. For convex problems, i.e. those with convex objective functions and constraints, fast interior point methods can be used to find solutions. Often in robotics, simply stating that a problem is convex is considered equivalent to having solved it for practical use [41] [42].

Offline Trajectory Optimization

A few methods rely on discretizing the states and inputs of a trajectory optimization problem to write an optimal control problem as a mathematical optimization problem with a discrete set of decision variables. These approaches are flexible and, for convex problems, often very fast in practice. Further, mathematical optimization makes it simple to add constraints on states and inputs, though heavily constrained problems or problems with complex constraints can be challenging to solve (see Chapter 4). Direct transcription does this by writing the discretized set of x and u as decision variables, with a constraint relating them via the dynamics. Direct collocation, meanwhile, expresses the state and input trajectories as parameterized splines and optimizes over these spline parameters (knot points), while also enforcing dynamics at a set of collocation points [43] [44] [45]. These methods extend to nonconvex problems; however, nonconvex optimization in general finds local optima rather than global optima: guarantees on global optimality are hard to come by. A variety of recent approaches aim to solve these nonconvex problems efficiently [46] [47].
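Direct transcription can be made concrete on a case simple enough to solve exactly: a minimum-effort double-integrator transfer with only equality constraints is a QP whose KKT conditions form a single linear system. The horizon, step size, and boundary states below are illustrative choices of mine:

```python
import numpy as np

# Direct transcription: states x_k and inputs u_k are all decision variables,
# and the dynamics x_{k+1} = Ad x_k + Bd u_k enter as equality constraints.
N, dt = 20, 0.1
Ad = np.array([[1.0, dt], [0.0, 1.0]])      # discretized double integrator
Bd = np.array([[0.5 * dt ** 2], [dt]])
x_init = np.array([0.0, 0.0])
x_goal = np.array([1.0, 0.0])

nx, nu = 2, 1
nz = (N + 1) * nx + N * nu                   # stacked decision vector length
H = np.zeros((nz, nz))
for k in range(N):                           # cost: 0.5 * sum_k u_k^2
    i = (N + 1) * nx + k * nu
    H[i, i] = 1.0

nc = nx * (N + 2)                            # initial + N dynamics + final rows
E = np.zeros((nc, nz))
e = np.zeros(nc)
E[0:nx, 0:nx] = np.eye(nx)                   # x_0 = x_init
e[0:nx] = x_init
for k in range(N):                           # x_{k+1} - Ad x_k - Bd u_k = 0
    r = nx * (k + 1)
    E[r:r+nx, (k+1)*nx:(k+2)*nx] = np.eye(nx)
    E[r:r+nx, k*nx:(k+1)*nx] = -Ad
    E[r:r+nx, (N+1)*nx + k*nu:(N+1)*nx + (k+1)*nu] = -Bd
r = nx * (N + 1)
E[r:r+nx, N*nx:(N+1)*nx] = np.eye(nx)        # x_N = x_goal
e[r:r+nx] = x_goal

# Equality-constrained QP: solve the KKT system [[H, E^T], [E, 0]] [z; lam] = [0; e].
KKT = np.block([[H, E.T], [E, np.zeros((nc, nc))]])
sol = np.linalg.solve(KKT, np.concatenate([np.zeros(nz), e]))
x_traj = sol[:(N + 1) * nx].reshape(N + 1, nx)
u_traj = sol[(N + 1) * nx:nz]
```

With inequality constraints (e.g. input saturation) the KKT system no longer reduces to one linear solve and an iterative QP or NLP solver is needed, which is where the solvers discussed below come in.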

Real-time Trajectory Optimization Control

Problems which can be stated in an amenable form can often be solved online, leading to model predictive control (MPC), which is really both a method of control and of trajectory optimization. MPC and its nonlinear cousin, nonlinear model predictive control (NMPC), bear a close resemblance to many of the trajectory optimization techniques. These methods, in effect, perform a trajectory optimization at each designated time step, except that this optimization is: (1) usually performed over a finite horizon and (2) implemented in a receding horizon approach; that is, the optimal control is applied only for a limited timeframe before the trajectory optimization is performed again at a future time step. Normally, trajectory optimization distinguishes planning from one point in X to another from the control used to track these points in real-time. MPC, which sometimes refers to receding horizon control as a whole, normally indicates optimization in a receding horizon fashion using a quadratic cost function with linear state constraints. This is a convex program, and can be solved very quickly with modern computational capability. NMPC is the nonlinear version, which relies on nonlinear trajectory optimization techniques to find a solution; such methods are not guaranteed to be globally optimal, as they rely on parameterizations as NLPs. Many backends are used, for example SNOPT (a sequential quadratic programming, SQP, solver) and IPOPT (an interior point solver); sequential convex programming (SCP) approaches are also common. Providing analytical data, such as gradients and Hessians, can improve the quality of solutions from these solvers, but they may nevertheless provide suboptimal solutions or no solution at all. Obtaining guarantees on NMPC control (e.g. stability) is typically more difficult than for MPC. Chapter 5 addresses these methods in more detail.
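The receding horizon idea can be sketched with unconstrained LQ subproblems, each solved by a backward Riccati recursion; only the first input of each plan is applied before re-planning. The plant, horizon, and weights are illustrative (a real MPC would also carry state and input constraints, which this sketch omits):

```python
import numpy as np

dt = 0.1
Ad = np.array([[1.0, dt], [0.0, 1.0]])      # discretized double integrator
Bd = np.array([[0.5 * dt ** 2], [dt]])
Q, R = np.eye(2), np.array([[1.0]])          # stage weights (illustrative)
Qf = 10.0 * np.eye(2)                        # terminal weight (illustrative)
H = 15                                       # planning horizon, in steps

def first_input(x):
    """Solve the finite-horizon LQ problem from state x and return only the
    first input of the optimal plan (the receding-horizon step)."""
    P = Qf.copy()
    K0 = None
    for _ in range(H):                       # backward Riccati recursion
        S = R + Bd.T @ P @ Bd
        K0 = np.linalg.solve(S, Bd.T @ P @ Ad)
        P = Q + Ad.T @ P @ Ad - Ad.T @ P @ Bd @ K0
    return -(K0 @ x).item()

# Closed loop: plan over the horizon, apply the first input, then re-plan.
x = np.array([2.0, 0.0])
for _ in range(200):
    u = first_input(x)
    x = Ad @ x + (Bd * u).ravel()
```

Re-solving from the measured state at every step is what gives MPC its feedback character; with constraints present, each subproblem becomes a QP (MPC) or NLP (NMPC) instead of a closed-form recursion.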

2.2.6 Planning under Uncertainty

Planning under uncertainty is another open research area with a variety of different approaches. Broadly, these approaches aim to obtain some guarantees on future system performance, sometimes by assuming bounds on the uncertain system parameters [48]. Other approaches conduct graph searches over nominal trajectories, propagating system uncertainty forward and maintaining a bound on collision probability [49]. Regardless, a probabilistic consideration of the dynamics is required, and often a nominal trajectory must be provided around which guarantees on performance can be made [50].
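The forward propagation of uncertainty mentioned above is, for linear(ized) dynamics with additive Gaussian noise, just a covariance recursion P_{k+1} = A P_k A^T + W along the nominal trajectory. All numbers below are illustrative:

```python
import numpy as np

dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])    # double-integrator state transition
W = np.diag([1e-6, 1e-4])                # process noise covariance (illustrative)

P = np.diag([1e-4, 1e-4])                # initial state uncertainty
history = [P]
for _ in range(50):
    # Covariance recursion for x_{k+1} = A x_k + w_k, w_k ~ N(0, W).
    P = A @ P @ A.T + W
    history.append(P)
```

The growing position variance (velocity uncertainty integrating into position) is exactly the quantity a chance-constrained planner would compare against obstacle clearances to bound collision probability.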

2.3 Research Gap for Microgravity Robotics

Modern trajectory optimization and motion planning research frequently addresses some of the areas relayed in Figure 2-5. Systems with uncertainty, difficult dynamics and environments, and real-time computation requirements are under particular scrutiny today, because these are the areas which hinder application on real systems. Likewise, borrowing from robotics at large, microgravity robotics requires consideration of these areas in order to make autonomy on-orbit a reality. Methods to address these areas, with a particular emphasis on microgravity robotics, are required.

Chapter 3

Dynamical Systems of Interest

A few model systems are used throughout Chapters 4 and 5 to provide relevant dynamics to the demonstrated planning methods. This chapter lays out some of the preliminary material needed to understand these systems, and provides a background on their dynamics. They are, in order of complexity: the 2 DoF double integrator, 3 DoF rigid body dynamics, 6 DoF rigid body dynamics, and 6 DoF free-flying manipulators.

3.1 Dynamics Preliminaries

The following sections present necessary material for describing robot kinematics and dynamics. Perhaps the most perplexing section is 3.1.1, on attitude representation-a thorough practical knowledge of quaternions is essential.

3.1.1 Attitude Representation

There are almost as many ways of describing attitude as there are ways to orient an object in SO(3). Diebel and others provide an extensive summary of attitude representation [51] [52]. In particular, this thesis prefers to use Furgale's convention for notation [53], while using the SPHERES scalar-last quaternion convention [3]. Suppose there is a body frame, F_B, and an inertial frame, F_I. A unit

quaternion is a vector in R^4 such that

q = [q_1 q_2 q_3 q_4]^T   (3.1)

where n is the unit rotation axis and α is the angle of rotation about that axis. Using a scalar-last convention,

[q_1 q_2 q_3]^T = n sin(α/2),   q_4 = cos(α/2)   (3.2)

For the unit quaternion, the conjugate, q*, is:

q* = [-q_1 -q_2 -q_3 q_4]^T   (3.3)

As a matter of convention, I_B q means the orientation/attitude of F_B with respect to F_I, equivalent to the direction cosine matrix I_B R. I_B q rotates vectors from F_B to F_I.

Quaternions can be multiplied; this is equivalent to rotation matrix composition. The attitude of F_M with respect to F_I, I_M q, is given by:

I_M q = I_N q ⊗ N_M q

R(I_M q) = R(I_N q ⊗ N_M q) = R(I_N q) R(N_M q)   (3.4)

Note that a conversion to a direction cosine matrix is given by:

B_I R = I_B R^T = R(B_I q) =

[ q_1^2 - q_2^2 - q_3^2 + q_4^2    2(q_1 q_2 + q_3 q_4)              2(q_1 q_3 - q_2 q_4)
  2(q_1 q_2 - q_3 q_4)             -q_1^2 + q_2^2 - q_3^2 + q_4^2    2(q_2 q_3 + q_1 q_4)
  2(q_1 q_3 + q_2 q_4)             2(q_2 q_3 - q_1 q_4)              -q_1^2 - q_2^2 + q_3^2 + q_4^2 ]   (3.5)

Quaternion multiplication is given by:

I_M q = Q(N_M q) I_N q   (3.6)

Q(q) = [ q_4 I_3 + C(q_{1:3})   q_{1:3}
         -q_{1:3}^T             q_4    ]

     = [  q_4  -q_3   q_2   q_1
          q_3   q_4  -q_1   q_2
         -q_2   q_1   q_4   q_3
         -q_1  -q_2  -q_3   q_4 ]   (3.7)

where C(x) is the cross product matrix,

C(x) = [  0    -x_3   x_2
          x_3   0    -x_1
         -x_2   x_1   0  ]   (3.8)

This is equivalent to composition of the rotation matrices I_N R and N_M R. For unit quaternions, the following decisions are required for the representation to be well-defined:

* Is the quaternion angle-first or axis-first?

* Is the quaternion left- or right-handed? (Usually, the convention is right- handed.)

* Is the quaternion active (representing a rotation of a vector B_x → B_y within F_B), or passive (representing a resolution of B_x within F_B to I_x within F_I)? Passive and active quaternions are complex conjugates, and therefore rotations of opposite sign about the same axis (using the active interpretation). This is equivalent to transposing a rotation matrix. Specifically, the quaternion operation on a vector v ∈ R^3 is given as:

[v'; 0] = q ⊗ [v; 0] ⊗ q*   (3.9)

An active interpretation of q means

[B_y; 0] = q ⊗ [B_x; 0] ⊗ q*   (3.11)

A passive interpretation of q means

[I_x; 0] = q* ⊗ [B_x; 0] ⊗ q   (3.13)

The interpretation of v' as being resolved in F_I (I_v') or F_B (B_v') determines entirely whether q is passive or active, respectively. The following relationships can hopefully provide some clarity when dealing with active and passive interpretations:

q_active = q*_passive

R_active = R_passive^T = R_passive^{-1}

Note that declaring a rotation to be passive is equivalent to taking an additional inverse of the active interpretation. Therefore, the inverse of a passive interpretation is simply the active interpretation. The reader is referred to [53] and [51] for an excellent summary of the possibilities for representing attitude. Finally, if Euler angles are used, the following questions must have well-defined answers for the representation to be meaningful:

* What is the order of rotation?

* About which axes are these rotations performed?

* And finally, are the axes of rotation body-fixed or inertial-fixed?

An Euler rotation should be specified, for instance, as ZYZ body-fixed rather than simply listing out three angles.
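The conventions above can be collected into a few helper routines. This sketch uses the scalar-last layout and the active interpretation v' = q ⊗ [v; 0] ⊗ q*; the Hamilton product used here is one common choice, and the function names are my own (not from the SPHERES codebase):

```python
import math

def qmul(p, q):
    """Hamilton product of two scalar-last quaternions (x, y, z, w)."""
    px, py, pz, pw = p
    qx, qy, qz, qw = q
    return (pw * qx + qw * px + py * qz - pz * qy,
            pw * qy + qw * py + pz * qx - px * qz,
            pw * qz + qw * pz + px * qy - py * qx,
            pw * qw - px * qx - py * qy - pz * qz)

def qconj(q):
    """Conjugate (Eq. 3.3): negate the vector part, keep the scalar part."""
    return (-q[0], -q[1], -q[2], q[3])

def qrotate(q, v):
    """Rotate a 3-vector by unit quaternion q, active interpretation."""
    x, y, z, _ = qmul(qmul(q, (v[0], v[1], v[2], 0.0)), qconj(q))
    return (x, y, z)

def axis_angle(n, alpha):
    """Unit quaternion for a rotation of alpha radians about unit axis n (Eq. 3.2)."""
    s = math.sin(alpha / 2.0)
    return (n[0] * s, n[1] * s, n[2] * s, math.cos(alpha / 2.0))
```

A quick consistency check of the conventions: rotating the x-axis by +90 degrees about z should yield the y-axis under the active interpretation, and applying the conjugate quaternion should undo the rotation.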

3.1.2 Attitude Kinematics and Dynamics

Attitude kinematics are obtained by composing multiple frames of reference. Using the prior section, a frame change is sufficient to represent attitude:

I_r = R(I_B q) B_r   (3.14)

Angular rate is typically given as that of F_B with respect to F_I, expressed in F_B as B_ω_IB. There is a mapping between I_B q̇ and B_ω_IB. Quaternion rates can be expressed using the quaternion multiplication matrix to convert from B_ω_IB [53].

H(q) = [ q_4 I_3 - C(q_{1:3})   -q_{1:3} ]

     = [  q_4   q_3  -q_2  -q_1
         -q_3   q_4   q_1  -q_2
          q_2  -q_1   q_4  -q_3 ]   (3.15)

I_B q̇ = (1/2) H(I_B q)^T B_ω_IB

The dynamics of the problem are given by Euler's equation:

ω̇ = I^{-1} ( -ω × Iω + τ )   (3.16)

where I ∈ R^{3×3} is the second moment of inertia with respect to the center of mass, expressed in F_B.

3.1.3 Translation Kinematics and Dynamics

Translation is represented (e.g. of a point a, ra) relative to a frame of reference using the following notation:

I_r_a = I_B R B_r_a + I_r_IB   (3.17)

The translation dynamics relative to the center of mass can be decoupled from the attitude dynamics.

ṙ_CoM = v
v̇_CoM = F / m   (3.18)

These quantities are all in F_I; the presuperscript notation is dropped for convenience here. F is the vector of applied forces.

3.2 Model Systems

The 6 DoF satellite model (essentially just the Newton-Euler equations) is a realistic model of satellite dynamics, and will be examined further. There are a few lower-dimensional model systems that are helpful in understanding satellite dynamics. The Newton-Euler equations can be simplified to these lower-dimensional models by restricting the state to the desired variables. Robotic manipulator system dynamics will also be analyzed and simulated, but will not be applied in Chapter 4.

3.2.1 Double Integrator

The double integrator dynamics are simply given by the following state, which obeys 3D translation kinematics.

r = [r_x r_y r_z]^T
v = [v_x v_y v_z]^T   (3.19)

ṙ_CoM = v
v̇_CoM = F / m

This is equivalent to the motion of a frictionless mass.

3.2.2 3 DoF Rigid Body

The double integrator dynamics restricted to planar translation and given an attitude component can be stated using 2D rigid body dynamics.

r = [r_x r_y]^T,   θ_z
v = [v_x v_y]^T,   ω_z   (3.20)

ṙ_CoM = v
v̇_CoM = F / m   (3.21)
ω̇_z = τ_z / I_zz

This is a good model of satellite motion on a granite table, for instance.


3.2.3 Rigid Body Satellite (6 DoF Rigid Body)

The rigid body dynamics describe satellite motion in a frictionless microgravity environment. A rigid body satellite model has, in addition, attitude dynamics which obey the equations below.

r = [r_x r_y r_z]^T
q = [q_x q_y q_z q_0]^T   (3.22)
v = [v_x v_y v_z]^T
ω = [ω_x ω_y ω_z]^T

ṙ_CoM = v
v̇_CoM = F / m   (3.23)
ω̇ = I^{-1} ( -ω × Iω + τ )
q̇ = (1/2) H(q)^T B_ω_IB

The torque inputs, τ, and force inputs, F, can also be coupled with a mixer to convert thruster activations into pure forces and torques.

3.2.4 Robotic Manipulator Systems (Free-Flying and Free-Floating)

Systems equipped with a robotic manipulator are classified as either free-flyers or free-floaters. Free-flyers have the capability of using both their thrusters and manipulator actuators; free-floaters use only manipulator actuators. Both systems have been studied extensively, beginning in the late 1980s and early 1990s [54] [55]. To quote Dubowsky,

... the key to solving the problems of planning and control of space robotic systems lies in understanding the fundamental dynamic behavior of these systems.

Robots are often expressed using the manipulator equations, given below.

M(q) q̈ + C(q, q̇) q̇ = τ + B u   (3.24)

M is the inertia matrix, C is the Coriolis matrix, τ captures external generalized forces, B is a mapping from system inputs to influence on the state, and q is the configuration vector (not to be confused with the quaternion). The free-flying dynamics will be given in this form.
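Forward simulation of the manipulator equation is a linear solve at each step: q̈ = M(q)^{-1}(u - C(q, q̇) q̇), here taking B = I and no external forces. The sketch below uses a toy planar two-link arm with no gravity (a stand-in for microgravity); the inertia constants are illustrative, not those of any real free-flyer:

```python
import math

a, b, c = 2.0, 0.5, 1.0   # hypothetical inertia constants for the toy arm

def M(q):
    """Configuration-dependent inertia matrix of a planar two-link arm."""
    m12 = c + b * math.cos(q[1])
    return [[a + 2 * b * math.cos(q[1]), m12], [m12, c]]

def C(q, qd):
    """Coriolis/centrifugal matrix for the same toy arm."""
    s = b * math.sin(q[1])
    return [[-s * qd[1], -s * (qd[0] + qd[1])], [s * qd[0], 0.0]]

def forward_dynamics(q, qd, u):
    """qdd = M^{-1}(u - C qd), via an explicit 2x2 inverse."""
    Mq, Cq = M(q), C(q, qd)
    rhs = [u[i] - (Cq[i][0] * qd[0] + Cq[i][1] * qd[1]) for i in range(2)]
    det = Mq[0][0] * Mq[1][1] - Mq[0][1] * Mq[1][0]
    return [( Mq[1][1] * rhs[0] - Mq[0][1] * rhs[1]) / det,
            (-Mq[1][0] * rhs[0] + Mq[0][0] * rhs[1]) / det]

def kinetic_energy(q, qd):
    Mq = M(q)
    return 0.5 * (qd[0] * (Mq[0][0] * qd[0] + Mq[0][1] * qd[1])
                + qd[1] * (Mq[1][0] * qd[0] + Mq[1][1] * qd[1]))

# With u = 0 and no gravity, kinetic energy should be (nearly) conserved,
# a verification in the same spirit as the CoM check of Section 3.2.6.
q, qd, dt = [0.0, 0.5], [1.0, -0.5], 1e-4
E0 = kinetic_energy(q, qd)
for _ in range(20000):
    qdd = forward_dynamics(q, qd, [0.0, 0.0])
    qd = [qd[i] + dt * qdd[i] for i in range(2)]
    q = [q[i] + dt * qd[i] for i in range(2)]
E1 = kinetic_energy(q, qd)
```

The energy check is a useful guard when implementing M and C by hand, since sign errors in the Coriolis terms show up immediately as energy drift.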

Free-Flying Dynamics

The free-flying dynamics are distinguished from the free-floating dynamics in that all thrusters on the satellite are available for use: the free-floating dynamics are underactuated and nonholonomic, as only the manipulator actuators are available for use. The state, for an n DoF manipulator arm is given as:

r = [r_x r_y r_z]^T
q = [q_x q_y q_z q_0]^T
v = [v_x v_y v_z]^T   (3.25)
ω = [ω_x ω_y ω_z]^T
θ_arm = [θ_1 ... θ_n]^T
θ̇_arm = [θ̇_1 ... θ̇_n]^T

The dynamics (presented here in manipulator equation form) are given by Dubowsky

[55], using a Lagrangian derivation.

q = [r^T q^T θ_arm^T]^T
q̇ = [v^T ω^T θ̇_arm^T]^T   (3.26)

M = [ M I_3   0        0
      0       M_ω      M_ωm
      0       M_ωm^T   M_m ]


M is the total system mass. M_ω ∈ R^{3×3} contains terms affecting ω, M_ωm ∈ R^{3×n} is a coupling term that shows the effect of arm motion on base disturbance, and M_m ∈ R^{n×n} captures the self-disturbance of the arm. The B matrix is determined based on the actuators available on the spacecraft, and usually involves postmultiplying a mixing matrix. C is not shown here and contains the nonlinear (Coriolis and centrifugal) terms. The following grouping of base and arm terms makes individual contributions explicit. Here, the subscript m refers to configuration vector components of the arm (manipulator), and 0 to those of the base [56].

[ M_0     M_0m ] [ q̈_0 ]   [ C_0    C_0m ] [ q̇_0 ]   [ τ_0 ]   [ B_0   B_0m ] [ u_0 ]
[ M_0m^T  M_m  ] [ q̈_m ] + [ C_m0   C_m  ] [ q̇_m ] = [ τ_m ] + [ B_m0  B_m  ] [ u_m ]   (3.27)

The reader should note that there are a variety of competing notations for these dynamics-the clearest way to make sense of a different notation is to verify the configuration vector q that is used.

Free-Floating Dynamics

The free-floating dynamics are the same as those given by Equation 3.27, except that u_0 = 0, since there are no control inputs applied via thrusting. For the zero initial momentum case, v = 0 and r is unchanging.

3.2.5 Challenges

Rigid body satellites and free-flying/free-floating systems exhibit some unique challenges that will be detailed further in Chapter 4. These challenges are:

* High dimensionality (up to 12 + 2 × n for free-floating)

* Nonholonomicity, i.e. path dependence (for the free-floating dynamics)

* "Complexity" in the sense of coupling of states in the multi-body dynamics

* Underactuation for certain operational modes or thruster configurations

Additional challenges may arise, such as uncertainty and cluttered operation, and are outlined in Chapter 4. To provide a brief discussion of nonholonomicity for free-floaters, it can be shown that the conservation of angular momentum is written,

ω = -M_0^{-1} M_0m θ̇_m   (3.28)

Equation 3.28 is a nonholonomic constraint: it cannot be integrated. This introduces path dependence into the system; that is, paths taken in the joint space will change the orientation of the spacecraft. Further, some states of the free-floating system are dynamically singular, meaning that the dimensionality of end effector motion is reduced. Portions of the workspace in which dynamic singularities may occur are termed the Path Dependent Workspace, while those which do not contain dynamic singularities are the Path Independent Workspace [54].
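The path dependence implied by Equation 3.28 can be demonstrated numerically. The coupling term below, M_0m(θ) = [cos(θ_2), 0.5] with M_0 = 1, is purely illustrative (not a physical free-flyer model), but it is non-integrable, so two joint-space paths with identical endpoints accumulate different base rotations:

```python
import math

def base_rotation(path, steps=20000):
    """Integrate omega_0 = -M_0^{-1} M_0m(theta) theta_dot along a joint path.
    `path` maps s in [0, 1] to the joint angles (theta1, theta2)."""
    phi, ds = 0.0, 1.0 / steps
    for i in range(steps):
        t1a, t2a = path(i * ds)
        t1b, t2b = path((i + 1) * ds)
        t2_mid = 0.5 * (t2a + t2b)   # midpoint rule for the line integral
        phi -= math.cos(t2_mid) * (t1b - t1a) + 0.5 * (t2b - t2a)
    return phi

# Both paths start at (0, 0) and end at (1, 1), but move the joints in
# opposite orders, so the accumulated base rotation differs.
path_a = lambda s: (min(2 * s, 1.0), max(2 * s - 1.0, 0.0))   # theta1 first
path_b = lambda s: (max(2 * s - 1.0, 0.0), min(2 * s, 1.0))   # theta2 first
phi_a = base_rotation(path_a)
phi_b = base_rotation(path_b)
```

If the constraint were integrable, the line integral would depend only on the endpoints and phi_a would equal phi_b; the gap between them is exactly the base reorientation a free-floating planner can exploit (or must guard against).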

3.2.6 Demonstrations in Simulation

The 6 DoF rigid body dynamics were developed in MATLAB to provide a dynamics module for future experimentation. Figure 3-1 provides a visualization of these dynamics. Simplifications of these dynamics were also produced to yield the other, simpler model systems described in the previous sections. The free-flying dynamics were produced using the spacecraft robotics toolkit, SPART [56]. A MATLAB wrapper was written to make the manipulator equation matrices available as a dynamics module. An example of a simple 2-link free-flyer is shown in Figure 3-2, using an RViz visualization front-end. Figure 3-3 provides an example of the dynamic coupling inherent in the free-flying dynamics. Here, a sinusoidal joint torque τ_m is applied, inducing a change in base spacecraft attitude. Simple Euler integration is used for forward propagation of the system state. For verification of the dynamics, the center of mass position is plotted, which should not vary since external forces are not applied.


Figure 3-1: An example of a 6 DoF satellite responding to input forces and torques, using the 6 DoF dynamics module.

The dynamics necessary to conduct trajectory optimization for spacecraft systems have been developed. Primarily, the 6 DoF and an underactuated 3 DoF dynamics model will be used in the subsequent chapters. The free-flying and free-floating dynamics are targets for future experimentation using the algorithms developed in Chapters 4 and 5.

Figure 3-2: A free-flying satellite with a 2-link manipulator, with state space given by Equation 3.25.

(Figure panels: "Coupled Dynamics," showing the base Z Euler angle over time, and "CoM Variation," showing the CoM x and y position over time.)

Figure 3-3: A simulation of the free-floating dynamics, showing the base disturbance resulting from manipulator motion. CoM position is unchanged, as expected, other than slight deviation due to propagation error.

Chapter 4

Motion Planning for Challenging Dynamical Robots

4.1 Constraints

Trajectory optimization problems have constraints due to dynamics and constraints due to external factors (the environment). Some of the "challenging" dynamical and environmental constraints are addressed here. Many of the model systems shown in Chapter 3 exhibit some of these challenges, or operate in environments that have them.

4.1.1 Dynamics Constraints

As stated previously, differential dynamical constraints of the form of Equation 4.1 are considered:

Based on Albee, K., and Coltin, B. Kinodynamic-RRT for Robotic Free-Flyers: Generating Feasible Trajectories for On-orbit Mobile Manipulation. In ICRA 2019 Workshop: Toward Online Optimal Control of Dynamic Robots. (In Press).

ẋ = f(x, u)

or, equivalently:   (4.1)

q̈ = f(q, q̇, u)

Underactuation and Nonholonomicity A system is underactuated, broadly speaking, if it cannot instantaneously command an arbitrary acceleration [42]. Stated mathematically, a system is underactuated if:

rank[ ∂f(q, q̇, u)/∂u ] < dim[q]   (4.2)

Nonholonomicity arises from systems that have non-integrable constraints. These constraints may either be a result of the environment (e.g. being constrained to rolling on a disk), or might be related to the inherent dynamics/kinematics of a robot. The classic example is a Dubins car with velocity kinematics specified by a non-integrable constraint; the result is path-dependent behavior. Mathematically, this means a constraint of the form φ(q, q̇, u) = 0 cannot be integrated into one of the form φ(q, u) = 0. The practical result is that to reach an arbitrary point in configuration space, q, the system is restricted in the paths it can take to reach it (rather than restricting the configuration space itself).

High Dimensionality The curse of dimensionality mentioned in Chapter 2 limits the applicability of many motion planning algorithms. Systems with high-dimensional state spaces (roughly, anything above, say, dim[x] = 6) can take unreasonable amounts of computation time for real-time use. Further, high-dimensional state spaces can be more difficult to reason about intuitively, so tricks like motion primitives might be harder to find and describe. Finally, high-dimensional systems, especially ones that exhibit nonlinearity, can be very difficult for nonlinear-programming-based solutions to solve globally.

State and Actuation Limits Systems often have limits on state (e.g. x_min ≤ x_i ≤ x_max) and actuation (e.g. u_min ≤ u_i ≤ u_max) that place hard bounds on possible inputs and on where a system can move. These constraints automatically rule out some otherwise successful methods, like LQR, and can be a challenge even for optimization methods, particularly if they are nonlinear.

4.1.2 Environmental Constraints

A variety of constraints can be thought of as coming from the environment a robot operates in, rather than from the robot's dynamics itself. Obstacles and areas that impose state and actuation limits are two of the most common examples.

Collision Collision constraints are very frequently encountered, and can in fact be thought of as the basis for much of motion planning. Collision constraints arise from requiring that a robot's geometry does not intersect the geometry of a set of obstacles. Without this requirement, the solution to simple Euclidean-distance motion planning is just the straight line between start and goal. Collision constraints (imposed as a set of state constraints, or enforced in a collision checking routine) restrict the allowable state space.
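A minimal collision-checking routine of the kind such a planner would call on every candidate state and edge can be sketched for a point robot among circular obstacles; the obstacle set and sampling resolution are illustrative:

```python
import math

# Obstacles as (center, radius) pairs; all numbers illustrative.
obstacles = [((2.0, 2.0), 1.0), ((4.0, 1.0), 0.5)]

def state_free(p):
    """A 2-D point is collision-free if it lies outside every obstacle."""
    return all(math.hypot(p[0] - c[0], p[1] - c[1]) > r for (c, r) in obstacles)

def edge_free(p, q, resolution=0.01):
    """Check a straight edge by sampling it at a fixed spatial resolution."""
    n = max(1, int(math.hypot(q[0] - p[0], q[1] - p[1]) / resolution))
    return all(state_free((p[0] + (q[0] - p[0]) * k / n,
                           p[1] + (q[1] - p[1]) * k / n))
               for k in range(n + 1))
```

Sampling an edge at fixed resolution can miss thin obstacles between samples; production collision checkers use exact segment-shape tests or conservative inflation instead, but the interface (state check plus edge check) is the same one sampling-based planners consume.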

Keep-out Zones, No-thrust Zones Keep-out zones are bounding areas equivalent to collision zones, but may sometimes be allowed to be in violation if they are not hard constraints. Meanwhile, no-thrust zones are areas where actuation limits are enforced depending on the location in state space. An example of a no-thrust zone from space robotics is the area near a spacecraft where thruster firing is restricted to prevent plume impingement of exhaust gas.

4.2 Underlying Scenario

The fundamental problem being solved for the satellite dynamics of interest is stated here. The trajectory optimization problem of Chapter 2 is considered with "difficult" constraints.

4.2.1 Planning for Complex Dynamics

The standard continuous-time trajectory optimization problem is summarized by Paden and is given in Chapter 2 [36]. The dynamics must be satisfied as a constraint, ẋ = f(x, u). Additional hard constraints such as collision avoidance, x(t) ∈ X_free, and actuator saturation, u(t) ∈ U, are often included. A feasible solution to this problem is desired; ideally, an optimal solution is also sought.

As outlined in Chapter 2, robotic motion planning has traditionally not regarded system dynamics in developing plans, yet these constraints are at the core of trajectory optimization. This has been changing, causing motion planning to collide with concepts from optimal control and applied mechanics. Increasingly complex robotic systems demand motion planning that can produce dynamically feasible trajectories while enforcing other difficult constraints. One particularly challenging example is microgravity robotic free-flyers, i.e., satellites with manipulator arms of non-negligible mass, introduced in Chapter 3. These systems (when free-floating) are nonholonomic and often have high-dimensional state spaces. As such, trajectory optimization is a challenge: direct methods might fail to produce even a feasible solution. Even lower dimensional systems encounter challenges in kinodynamic planning when facing, for example, cluttered space station interiors.

4.2.2 Systems of Interest

The 6 DoF dynamics (Equation 4.4, see Figure 4-16) are of particular interest because they are nonlinear, considered high dimensional, and are a good model of satellite dynamics. A variety of interesting environmental constraints may also be added to them. Providing these dynamics once more:

x = [rᵀ qᵀ vᵀ ωᵀ]ᵀ, with
r = [r_x r_y r_z]ᵀ,  q = [q_x q_y q_z q_0]ᵀ,  v = [v_x v_y v_z]ᵀ,  ω = [ω_x ω_y ω_z]ᵀ    (4.3)

ṙ_CoM = v
v̇_CoM = F / m    (4.4)
ω̇ = I⁻¹(−ω × Iω + τ)
q̇ = (1/2) Ξ(q) ᴮω_{I/B}

The definitions here follow those provided in Chapter 3. An underactuated satellite will also be introduced. This system is simply the 3 DoF dynamics of Chapter 3, but instead of pure force and torque, inputs are restricted to a set of four thrusters along one axis of the spacecraft (see Figure 4-11).

4.3 Sampling-Based Planning (SBP)

Sampling-based planners offer some promising characteristics when applied to difficult trajectory optimization problems. A key advantage of sampling-based planners is that constraint checks, such as collision-checking, are performed explicitly during exploration of the state space. However, default versions have no sense of optimality, and may even be guaranteed to produce a suboptimal solution [57]. Sampling-based planners are probabilistically complete, meaning that they will provide a feasible solution to the motion planning problem, if one exists, with probability approaching one. This is in contrast to planning based on nonlinear programming, which is unable to provide such completeness guarantees.

This is a rather weak statement, as computation time could be infinite, but in practice solutions can be found in reasonable time. Optimizing variants also exist [57], but for kinodynamic planning their implementation requires the solution of a two-point boundary value problem, and so this area remains a challenge. For non-optimizing variants, some common primitive functions are required; see the subsequent sections for further details on implementation.

Common Primitives  It is necessary to explore how sampling-based planners work. SBPs develop tree (e.g. RRT) or graph structures (e.g. PRM) in the configuration or state space to represent discretized plans tied together at the nodes. The rapidly-exploring random tree (RRT) algorithm is widely used because of its simplicity and power in explicitly evaluating constraints. Essentially, random sampling builds a tree structure from an initial node by reaching toward independent and identically distributed (i.i.d.) samples and obeying constraints until a goal region, X_goal, has been reached. The basic algorithm is outlined in Figure 4-1. RRT makes use of the following primitive functions [57]:

" sampleFree : Return i.i.d. samples from the free space, Xfree.

* nearest(x) : Given a node, x, return the closest node in the tree to this sample. "Closest" is defined by the distance metric.

* near(x, r) : Return Xnear, the set of all nodes within radius r around point x.

" steer(xfrom, Xto) : Given two nodes, return a point that is closer to xto than xf rom is. The returned point need not reach xto.

" inCollision(xfrom, Xto) : Given two nodes, return True if the path between

them is clear, i.e. remains within Xfree.
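These primitives compose into the basic RRT loop. A minimal dynamics-free sketch in Python (the 2D point world, circular obstacles, and fixed step size are illustrative assumptions, not the thesis implementation) might look like:

```python
import math
import random

def sample_free(bounds, obstacles):
    """Return an i.i.d. sample from X_free via rejection sampling."""
    while True:
        x = (random.uniform(*bounds[0]), random.uniform(*bounds[1]))
        if not any(math.dist(x, c) < r for c, r in obstacles):
            return x

def nearest(tree, x):
    """Closest existing node under the (Euclidean) distance metric."""
    return min(tree, key=lambda v: math.dist(v, x))

def steer(x_from, x_to, step=0.1):
    """Move from x_from toward x_to by at most `step`; need not reach x_to."""
    d = math.dist(x_from, x_to)
    if d <= step:
        return x_to
    t = step / d
    return (x_from[0] + t * (x_to[0] - x_from[0]),
            x_from[1] + t * (x_to[1] - x_from[1]))

def in_collision(x_from, x_to, obstacles, n_checks=10):
    """True if the straight segment between the nodes leaves X_free."""
    for i in range(n_checks + 1):
        t = i / n_checks
        p = (x_from[0] + t * (x_to[0] - x_from[0]),
             x_from[1] + t * (x_to[1] - x_from[1]))
        if any(math.dist(p, c) < r for c, r in obstacles):
            return True
    return False

def rrt(x_init, goal, goal_r, bounds, obstacles, max_iters=2000):
    """Grow a tree (node -> parent map) until the goal region is reached."""
    parent = {x_init: None}
    for _ in range(max_iters):
        x_rand = sample_free(bounds, obstacles)
        x_near = nearest(parent, x_rand)
        x_new = steer(x_near, x_rand)
        if not in_collision(x_near, x_new, obstacles):
            parent[x_new] = x_near
            if math.dist(x_new, goal) < goal_r:  # reached X_goal
                return parent, x_new
    return parent, None
```

A path is recovered by walking the parent map back from the returned goal node to x_init.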

When deciding which sampling-based planning algorithm is appropriate, one must determine whether the problem is kinodynamic and whether an optimizing variant is desired (Figure 4-2). Each of these varieties will be introduced in turn.

4.3.1 Optimal SBP

Optimal sampling-based planners are asymptotically optimal. That is, as the number of samples goes to infinity, the optimal solution will be found. This is a potential improvement over non-optimizing solutions, but in practice finding the optimal solution may be computationally infeasible. What's more, these planners are only valid when L2 distance is the desired cost function. Extending their application to systems with dynamics is an open question.

Figure 4-1: The RRT builds a tree structure from x_init by calling x_rand ← sampleFree at each iteration and using x_new ← steer(x_nearest, x_rand) to produce a point to add, while enforcing system constraints.

Figure 4-2: A few of the RRT algorithm choices when considering whether a problem is kinodynamic and/or optimizing:

                     not kinodynamic   kinodynamic
    non-optimizing   RRT               kino-RRT
    optimizing       RRT*              LQR-RRT*, kino-RRT*

RRT*

RRT* is one of the recently proposed optimizing sampling-based planners, which have asymptotic optimality [57]; it is shown executing in Figure 4-3. These planners are limited to systems without dynamics, and only guarantee optimality when the desired metric is distance. RRT* relies on a rewire module, which attempts to connect x_new to a set X_near and, further, attempts to reconnect every node in X_near through x_new in order to reduce their cost by moving through x_new. The distance metric must be defined carefully: simple Euclidean distance might not portray movement through the differential constraint required for dynamical systems, for instance. RRT* was implemented in a 2D dynamics-free simulation, shown in Figure 4-4.
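The chooseParent/rewire logic can be sketched as follows; the dictionary-based tree, Euclidean edge cost, and pluggable collision check are illustrative simplifications (and propagation of reduced costs to descendants is omitted for brevity):

```python
import math

def rewire(parent, cost, x_new, X_near, collision_free):
    """RRT* connection step: first choose the cheapest parent for x_new from
    X_near (chooseParent), then reroute any node in X_near whose cost would
    drop by passing through x_new (rewire)."""
    # chooseParent: best collision-free connection into x_new
    for x in X_near:
        c = cost[x] + math.dist(x, x_new)
        if collision_free(x, x_new) and c < cost.get(x_new, math.inf):
            parent[x_new], cost[x_new] = x, c
    # rewire: reroute members of X_near through x_new when cheaper
    for x in X_near:
        c = cost[x_new] + math.dist(x_new, x)
        if collision_free(x_new, x) and c < cost[x]:
            parent[x], cost[x] = x_new, c
    return parent, cost
```

Replacing `math.dist` here with a dynamics-aware cost-to-go is exactly the difficulty discussed below for kinodynamic extensions.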

4.3.2 Dynamics-Aware SBP

Dynamics-aware motion planning that is complete (i.e. not a nonlinear optimization) usually either (1) decomposes nonlinearity into discrete decisions or (2) involves randomized SBP. On the randomization side, a suite of algorithms branching off of kinematic SBPs exists which adds in dynamics constraints. However, these are currently fundamentally limited in their ability to deal with arbitrary dynamical systems, specifically in [42]:

" Sampling appropriately

" Defining distance metrics

* Connecting arbitrary points in state space

Kino-RRT  First proposed by [58], Kino-RRT explores the state space rather than the configuration space and enjoys the benefits of explicit evaluation of constraints, including dynamics. Essentially, forward evaluation of the dynamics is incorporated into steer and an appropriate distance metric is developed so that nearest works appropriately. Realistically, it can sometimes be fast enough for online deployment, and can at least provide reference solutions to other solvers that fail from poor initial guesses. Pseudo-code for the algorithm (with steer and the distance metric adapted for satellite dynamics) is given in Algorithm 1.

Figure 4-3: RRT* executing in a 2-dimensional state space. In (a), a tree exists and sampleFree finds a random point, x_rand, to move toward. steer creates x_new, moving toward x_rand from the nearest node in the tree, x_nearest. The set X_near is created in (b), representing all possible nodes within a set radius to consider for connection to x_new. The lowest-cost path is added in (c), and nodes in X_near are then considered for connection through x_new. Any higher-cost paths are removed and rewired, shown in (d).

Figure 4-4: RRT* executing on a 2D dynamics-free state space. Note the branching outward from the origin, reflecting that rewire is rerouting non-optimizing RRT branches.

4.3.3 Dynamics-Aware Optimal SBP

Adapting Kino-RRT to an optimizing version requires solving a two-point boundary value problem, among other complications. The two-point boundary value problem is a fundamental limitation in applying optimizing sampling-based planners to systems with dynamics. Figure 4-5 captures the idea: two nodes define initial and final conditions, between which the dynamics (a differential constraint) must be obeyed. Solving this in general is a trajectory optimization problem.

Figure 4-5: The essential challenge of optimizing SBP: a two-point boundary value problem, obeying ẋ = f(x, u) between x_near and x_rand, must be solved to know the optimal cost-to-go.

Some methods propose workarounds to solving this problem, but rely on linearizing nonlinear systems at non-fixed points, or use heuristics that may not approximate the actual cost-to-go well. More work remains to be done in this area.

LQR-RRT* and Kino-RRT*

LQR-RRT* relies on using the infinite-horizon LQR solution around linearized points in the state space to create a locally optimal control policy for steer, and to create the distance metrics used in near, nearest, chooseParent, and rewire [59]. The LQR policy is found by linearization about the desired goal point, either x_rand, x_new, or x_near, depending on when steer is needed. The rewire and chooseParent portions of the algorithm are illustrated in Figure 4-6. This assumes steer functions that do not have fixed final time: executing the infinite-horizon LQR control policy over a fixed time step, as in LQR-RRT*, does not guarantee arrival at the goal state. Kino-RRT* offers improvements by guaranteeing a fixed final state, but also assumes linearization for nonlinear systems [60]. Work in this area to find a suitable nonlinear solution remains open.
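For intuition, the LQR cost-to-go metric can be sketched on a double integrator standing in for the locally linearized dynamics. The weights Q and R are illustrative, and the simple fixed-point integration of the Riccati ODE below is a stand-in for a library CARE solver:

```python
import numpy as np

# Double integrator standing in for the linearized dynamics:
# x = [position, velocity], u = acceleration
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)            # state weight (illustrative)
R = np.array([[1.0]])    # input weight (illustrative)

def solve_care(A, B, Q, R, dt=1e-3, T=30.0):
    """Solve the continuous algebraic Riccati equation by integrating the
    Riccati ODE to steady state (the fixed point is the CARE solution)."""
    S = np.zeros_like(Q)
    Rinv = np.linalg.inv(R)
    for _ in range(int(T / dt)):
        S = S + dt * (A.T @ S + S @ A - S @ B @ Rinv @ B.T @ S + Q)
    return S

S = solve_care(A, B, Q, R)       # for this system, S = [[sqrt(3), 1], [1, sqrt(3)]]
K = np.linalg.inv(R) @ B.T @ S   # infinite-horizon LQR gain, u = -K (x - x_goal)

def lqr_distance(x, x_goal):
    """LQR cost-to-go, used in place of Euclidean distance in nearest/near."""
    e = np.asarray(x, float) - np.asarray(x_goal, float)
    return float(e @ S @ e)
```

Unlike the Euclidean metric, lqr_distance rates a state with velocity pointing toward the goal as "closer" than one at the same position moving away, which is precisely the dynamics-awareness the metric is meant to supply.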

4.4 SBP for Satellite Dynamics

Satellites and free-flying robotic systems have highly nonlinear, high-dimensional dynamics. Trajectory planning for high-dimensional systems with highly nonlinear dynamics often suffers from an inability to obtain even feasible solutions. Such problems can be converted into nonlinear programs (NLPs) via direct methods (e.g. collocation, transcription, etc.), but these methods often benefit from a good initial guess for convergence [43]. Iterative methods including, for example, SQP and SCP also require trajectory initializations [61]. As a first step in efficiently generating feasible trajectories for other solvers (and in looking toward optimizing sampling-based methods), this section demonstrates non-optimizing kinodynamic planning for the satellite dynamics, which can guarantee constraint satisfaction via explicit evaluation.

Figure 4-6: LQR-RRT* introduces modifications to RRT* to account for differential constraints. Assuming an LQR-compatible problem (quadratic cost and linear dynamics), the nearest selection in (a) is performed using the infinite-horizon LQR cost-to-go to x_rand. The optimal input from this policy is followed to arrive at x_new. The nearest and rewire steps used in (b) to (d) are performed using the infinite-horizon LQR cost-to-go, and proceed as in nominal RRT*.

Problem Setup  The state vector for a 6 DoF satellite, given in Equation 4.3, consists of rigid body position r, orientation q, linear velocity v, and angular velocity ω. Kinodynamic RRT [58] is applied to the problem, resulting in explicit evaluation of the forward dynamics and hard enforcement of constraints. Leveraging these advantages, feasible solutions are guaranteed. The method is detailed in Algorithm 1.

Notably, SteerFF guides the expansion from x_nearest toward x_rand, explicitly evaluating the dynamics within actuator constraints to produce x_new. Since a quaternion is used, nearest and sampleFree must be modified in a future iteration, as a Euclidean metric does not suffice. It is noted that LaValle's original implementation of Kino-RRT, for instance, used weighted Euclidean distance as a metric [58].

4.5 Results for Satellite Dynamics

All simulations were performed on an 8-core Intel i7-4700MQ CPU at 2.40 GHz with 16 GB of RAM, running Linux Mint 18.3. Kino-RRT cases for the desired systems were implemented in C++ and Python. The ROS-compatible Astrobee simulation implementation for the non-manipulator dynamics is summarized in Figure 4-7. A 3 DoF underactuated system also used an LQR-RRT* approach, but was modified to a custom Kino-RRT due to drawbacks of LQR-RRT*.

Figure 4-7: The ROS-compatible C++ planner implementation software architecture, currently implemented up to the 6 DoF dynamics. This planner is capable of working with Astrobee's choreographer planner interface.

Lower-Dimensional Implementations  First, some lower-dimensional results will be demonstrated in building up to the full satellite dynamics. Figure 4-8 is entirely dynamics-free, showing the typical RRT two-dimensional state space exploration.


Figure 4-8: RRT expansion in a 2D dynamics-free state space.

Once dynamics are added and Kino-RRT is implemented, the result of forward propagation becomes clear. Figure 4-9 shows the two-dimensional double integrator dynamics, with curved trajectories obeying forward propagation. Note that the steering function tests steering over a variety of inputs; the best is selected to get the closest possible point to x_rand. Extending this result to three-dimensional dynamics results in solutions similar to those shown in Figure 4-10, the three-dimensional double integrator. Further, an underactuated satellite model was used as a testbed for optimizing sampling-based planners. A visualization of the system is shown in Figure 4-11, where the underactuated thrusters lie along the longitudinal axis. LQR-RRT*, while successfully implemented for this system, posed a few challenges which led to poor solutions that rarely approached the goal. In response, a Kino-RRT with an LQR-supplied distance metric and a custom steer function was designed, named Guided-Kino-RRT (Algorithm 2). In this case, rather than executing the LQR optimal policy derived from a linearized point, the steering module is provided with a set of "sensible" inputs to apply to the system, e.g. thruster activations for spinning, translating along the longitudinal axis, activating single thrusters, etc. These inputs are applied at varying thrust levels and forward propagated over a set time step. The result that is closest to the desired point, using the LQR cost, is selected as x_new.

Figure 4-9: Kino-RRT expansion for 2D double integrator dynamics.

Figure 4-10: Kino-RRT expansion for 3D double integrator dynamics.

Figure 4-11: An underactuated satellite. The dynamics are 3 DoF, but thrusters are only placed along the longitudinal axis: lateral accelerations cannot be commanded.

Some drawbacks of LQR-RRT* are noted here. In particular, the linearization of the underactuated satellite system is not stabilizable about u = 0, and so an LQR policy cannot be found without, e.g., using a non-zero linearization point. More generally:

- The infinite-horizon LQR policy does not guarantee a fixed final state

- Certain systems may have linearizations that are not stabilizable, for which LQR policies cannot be found

- The shrinking ball for the near search radius is not justified for non-Euclidean spaces, as it is in [57]

- Linearization around a non-fixed point is likely a poor approximation of the dynamics

The Guided-Kino-RRT algorithm is summarized in Algorithm 2, which uses an LQR cost-to-go distance metric and a motion primitive steering function. Twelve thruster activation combinations, searched over at 20 thrust levels, resulted in 240 forward evaluations at each timestep.

Algorithm 1 Kino-RRT
 1: procedure Kino-RRT(X, U)
 2:   V ← {x_init}; E ← ∅
 3:   while x_err > tol do
 4:     x_rand ← sampleFree
 5:     x_nearest ← nearest(V, x_rand)
 6:     x_new ← steerFF(x_nearest, x_rand)
 7:     if obstacleFree(x_nearest, x_new) then
 8:       V ← V ∪ {x_new}; E ← E ∪ {(x_nearest, x_new)}
 9:     end if
10:   end while
11:   return G = (V, E)
12: end procedure

Algorithm 2 Guided-Kino-RRT
 1: procedure Guided-Kino-RRT(X, U)
 2:   V ← {x_init}; E ← ∅
 3:   while x_err > tol do
 4:     x_rand ← sampleFree
 5:     x_nearest ← nearest(V, x_rand)    ▷ uses linearized LQR cost-to-go
 6:     x_new ← guidedSteer(x_nearest, x_rand)
 7:     if inBounds(x_new) then
 8:       V ← V ∪ {x_new}; E ← E ∪ {(x_nearest, x_new)}
 9:     end if
10:   end while
11:   return G = (V, E)
12: end procedure

Iterating over these input values for forward integration, the motion primitives used were:

" Single-thruster spin-translation

" Two-thruster translation (pure force pair)

* Two-thruster spin (pure torque pair)

" Three-thruster spin/translation

Some typical results are shown in Figure 4-15. In these examples, the spacecraft mass is specified as m = 4.1 kg, the side length as l = 0.25 m, and the rotational inertia as I = ml² kg·m². Typical times for 1000-node explorations were roughly 9 seconds (Python implementation). The tree explores the space fairly well, reaching the goal and terminating at a distance metric tolerance of 0.1. The presented paths are also clearly sub-optimal: while some paths show intuitively optimal motion, others contain wasteful looping, for instance. With a more precise steering function, it would be reasonable to once again add in rewire, creating an optimizing variant. Guided-Kino-RRT, circumventing issues with LQR-RRT*, produced reasonable feasible trajectories in real-time (seconds). It is a potentially useful alternative, particularly in highly cluttered environments.

Figure 4-15: Sample test results (Figures 4-12, 4-13, and 4-14) from Guided-Kino-RRT. Note the sub-optimal behavior, but also the good exploration properties of the RRT (or, rather, the ability to randomly sample appropriately using the distance metric). Feasible solutions are shown in red.

6 DoF Implementation  The 6 DoF dynamics are considered, represented by Figure 4-16. Figure 4-17 shows a visualization of a sample system using the full free-flying dynamics developed in simulation. SteerFF handles forward evaluation, currently for the rigid body dynamics. This solution is relatively fast, taking 4 s for a 6300-node search with a 38-node solution (C++ implementation). This example does not include collision checking, though implementing a collision-checking module would be a worthwhile extension to truly showcase the benefits of sampling-based planning in complex environments.


Figure 4-16: The 6 DoF satellite, with pure force/torque actuation.


Figure 4-17: RRT nodes for the 6 DoF dynamics (left) and a feasible solution (right).

Summary and Conclusions  The application of Kino-RRT to free-flying robots specifically offers a path to developing feasible trajectories which can serve to warm-start other optimizing solvers. Sampling-based methods might also have a place in environments where NLP-based solutions can fail, offering at least feasible solutions. However, there is no sense of optimality in this formulation; this is the subject of ongoing and future work. It was demonstrated via a low-dimensional example that current optimizing dynamics-aware SBP methods are limited in a few regards. Even though many x_new fail to satisfy constraints, the methods lead to a relatively fast exploration of the state space, obeying differential constraints. The wider question of how to find optimal trajectories for high-dimensional dynamical systems, especially under challenging constraints, remains open. Future extensions aim to make more robust comparisons of these SBP results to fast iterative NLP methods, particularly in cluttered environments, and to begin coupling initial guesses from these SBP methods to other solvers where they might otherwise fail. Finally, additional work is necessary in bringing optimal dynamics-aware planning to complex nonlinear systems; this is particularly important for the free-floating dynamics, for instance.

Chapter 5

Planning for Information Gain

Robotic systems often operate with uncertainties in their dynamics: manipulating unknown objects means dealing with uncertain mass properties, for example. Two main approaches are often used: operate under assumptions on the uncertain system and design robust controllers in spite of uncertainty, or characterize a system to one's satisfaction before even attempting system control. A middle-ground approach attempts to blend these two paradigms, say, if making trajectory progress toward a goal is of the essence, but gaining information about a system is also desirable. If some degree of sub-optimality for a traditional cost function (e.g. min-time) can be tolerated, the trajectory may be modified in such a way that it weighs information gain of uncertain parameters alongside other metrics of optimality. What's more, a real-time planning framework can take advantage of newfound dynamics information and incorporate this information on-the-fly into its planning to generate trajectories that are more robust. This chapter details this learn-as-you-go approach which blends excitation trajectories, which are usually intended to optimize information gain in conjunction with some estimation mechanism, alongside other trajectory optimization metrics in a real-time planning framework.

Based on Albee, K., Ekal, M., Ventura, R., and Linares, R. (2019). Combining Parameter Identification and Trajectory Optimization: Real-time Planning for Information Gain. In ESA ASTRA 2019. (In Press).

5.1 Background on Probability

Because robots often operate with uncertainty in their dynamics or environment, the ability to describe their operation in the language of probability is a powerful tool [50]. A probability density function (pdf) is one of the most essential pieces of this toolkit. A pdf is a function p(x) which returns the likelihood of the realization of a random variable, X. A cumulative distribution function (CDF), P(x), is the integral of the pdf: it gives the likelihood that a realization less than or equal to x occurs, and approaches one as x goes to infinity.

p(x) = det(2πΣ)^(−1/2) exp{−(1/2)(x − μ)ᵀ Σ⁻¹ (x − μ)}    (5.1)
P(x) = ∫_{−∞}^{x} p(x′) dx′

Above, the multivariate form is given. p(x) is stated as a Gaussian, N(μ, Σ), a common pdf with expected value μ and covariance matrix Σ. The expected value is the long-run average, and is usually synonymous with the mean. The variance is a measure of how "spread out" a pdf is from its mean.
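The multivariate Gaussian pdf of Equation 5.1 can be evaluated directly; a quick numerical check (the mean and covariance below are illustrative):

```python
import numpy as np

def gaussian_pdf(x, mu, Sigma):
    """Multivariate normal density, following Eq. 5.1."""
    d = x - mu
    norm = np.sqrt(np.linalg.det(2 * np.pi * Sigma))
    return float(np.exp(-0.5 * d @ np.linalg.solve(Sigma, d)) / norm)

mu = np.zeros(2)
Sigma = np.eye(2)
# At the mean, the standard 2D Gaussian density is 1/(2*pi)
p_at_mean = gaussian_pdf(mu, mu, Sigma)
```

Integrating this density numerically over a grid recovers a total probability of one, consistent with the CDF property above.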

A likelihood function, L(θ | ŷ), is a function over the parameter space describing the probability of obtaining the observations ŷ under parameters θ. It is proportional to the joint probability distribution p(ŷ | θ). θ̂ = argmax_θ L(θ | ŷ) is the value of θ with the greatest likelihood of having produced ŷ. Maximum likelihood estimation, for example, deals with finding estimates of the parameters that maximize a likelihood function. Probabilistic robotics runs deep, but these essentials are enough to provide some context for planning with uncertainty and the problem scenario of interest.

5.2 Underlying Scenario

This section will introduce the fundamental problem, as well as the desired systems for testing. This setup is similar to the nominal trajectory optimization setup of Chapter 2, but with added uncertain dynamics and uncertain parameters [36].


Figure 5-1: Robotic systems often have a variety of uncertain parameters. In particular, space systems that must manipulate or dock with an uncertain target (left) will inherit the uncertainty of the target (right) as the system's inertial properties change (e.g. mass, shown here with a sample probability density function).

5.2.1 Planning with Uncertain Parameters

A robotic system with state x ∈ Rⁿ and uncertain parameters θ ∈ Rᵖ is initially positioned at state x₀. A goal region X_g is specified. (For a rigid body system, θ may include: mass, moments and products of inertia, and center of mass.) The dynamics and measurement model of the system are specified and include process noise. The aim is to plan a trajectory respecting constraints while minimizing a cost function J by selecting optimal inputs u ∈ Rᵐ to the dynamic system.

5.2.2 System of Interest

3 DoF and 6 DoF rigid body dynamics systems will be analyzed. Both are accurate representations of spacecraft motion in microgravity. The 3 DoF dynamics are often used as a simplified form for testing. Using the formulations of Chapter 3, the 3 DoF dynamics are given by:

x = [r_x  r_y  θ_z  v_x  v_y  ω_z]ᵀ    (5.2)

ṙ_CoM = v
v̇_CoM = F / m    (5.3)
θ̇_z = ω_z
ω̇_z = τ_z / I_zz

The 6 DoF dynamics are given by:

x = [rᵀ qᵀ vᵀ ωᵀ]ᵀ, with
r = [r_x r_y r_z]ᵀ,  q = [q_x q_y q_z q_0]ᵀ,  v = [v_x v_y v_z]ᵀ,  ω = [ω_x ω_y ω_z]ᵀ    (5.4)

ṙ_CoM = v
v̇_CoM = F / m    (5.5)
ω̇ = I⁻¹(−ω × Iω + τ)
q̇ = (1/2) Ξ(q) ᴮω_{I/B}

where I ∈ R³ˣ³ is the second moment of inertia with respect to the center of mass, expressed in F_B. Note that the quaternion convention is scalar-last, and q represents the attitude of the body frame with respect to the inertial frame. ᴮω_{I/B} indicates the angular velocity of F_B with respect to F_I, expressed in F_B.

The uncertain parameters examined include:

78 mass CoM MoI Pol

M B rBC Ixx, Iyy, Izz Exy, lxz, Iyz

In simulations shown later in this chapter, the actual system mass is taken as 9.7 kg, while the principal moments of inertia I_xx and I_yy are taken as 7 kg·m², and I_zz as 10 kg·m². The NMPC loop was run at 1 Hz, with a horizon length of 40 s. The scenario examined here will consider mass and MoI uncertainty only. Now, methods of planning for information gain are sought.

5.3 Relevant Methods

Some existing methods relevant to this problem will be summarized, as well as the fundamentals of real-time planning and estimation: this problem relies on trajectory optimization and estimation working in tandem. Existing methods that deal with uncertainty generally fall into a few categories: guarantee that the system will operate well given uncertainty bounds (robust control), adapt parameters online to track a reference trajectory (adaptive control), or generate an offline reference trajectory for the system to follow that optimally excites the system for the parameter estimator (excitation trajectories) [63].

5.3.1 Real-Time Motion Planning

From Chapters 2 and 4, the motion planning problem is to determine an optimal trajectory while obeying system dynamics and other constraints, such as collision avoidance [36]. Many motion planning optimization frameworks assume certain dynamics and deem offline reference trajectory generation sufficient. Of these frameworks, some are suited for real-time trajectory planning.

- Sampling-Based Planners: Sampling-based planners can be augmented with heuristics to become optimizing planners for systems with dynamic constraints. Depending on their implementation, these planners may be suitable for real-time planning, especially for systems with particularly complex dynamics. They have been proposed for use in belief space, an extension to uncertainty planning [49].

- Graph-Based Planners: Planners that use motion primitives to stockpile certain system motions offline and to combine primitives online are categorized as a type of graph search. Such methods might be suitable for dealing with uncertain parameters, but the number of trajectories generated for unknown parameterizations might be unusably large [36] [2].

Receding horizon methods are another particularly popular framework amenable to incorporating updates to the dynamics on-the-fly. MPC, for instance, optimizes a mathematical programming statement of the cost and system constraints over a time horizon and executes only the first few control actions, repeatedly optimizing at each time step. MPC often uses a discrete formulation of the LQR in order to create a quadratic programming problem over a short time horizon for systems with linear state dynamics [62]. Methods like Sequential Convex Programming (SCP) [47] can also be run in a receding horizon fashion, getting into the realm of nonlinear MPC.
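The receding-horizon pattern can be sketched independently of the solver. Below, a finite-horizon discrete LQR (backward Riccati recursion) stands in for the inner optimization, with an illustrative double-integrator model and a 1 Hz replanning rate:

```python
import numpy as np

# Discrete double integrator (dt = 1 s, matching a 1 Hz replanning loop)
dt = 1.0
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.5 * dt ** 2], [dt]])
Q, R = np.eye(2), np.array([[0.1]])   # weights are illustrative

def lqr_horizon(A, B, Q, R, N):
    """Backward Riccati recursion over horizon N; return first-step gain."""
    P = Q.copy()
    for _ in range(N):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

def mpc_rollout(x0, steps=30, N=10):
    """Replan over horizon N at every step, applying only the first input."""
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        K = lqr_horizon(A, B, Q, R, N)  # inner optimization, re-solved each step
        u = -K @ x                       # execute only the first control action
        x = A @ x + B @ u                # plant propagates one step
    return x
```

Re-solving at every step is what allows updated parameter estimates (a new A, B, or m) to be folded into the next horizon, which is exactly the hook exploited later in this chapter.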

5.3.2 Adaptive and Robust Control

Robust control guarantees that, within bounded parameter values, certain system behavior will hold (e.g. stability). Some recent work, taking a robust approach, bounds parameter uncertainty and guarantees that executed trajectories will remain within regions that satisfy obstacle avoidance constraints. Such funnel library approaches do not yet weigh information gain or use updated information about the system [48]. However, sum-of-squares optimization is becoming increasingly prevalent for constructing Lyapunov functions guaranteeing system performance around nominal trajectories.

In addition, a strong literature exists in adaptive control, which (as opposed to robust control) adapts parameters during operation and uses them in computing control actions online. Model reference adaptive control (MRAC), for example, uses an adaptation law to adjust certain parameters in order to achieve tracking against a desired reference model. Stability and convergence proofs (often using Lyapunov theory) can often be constructed without too much trouble for these controllers. Self-tuning controllers (STCs), which pair a controller with an estimator that updates parameter values, use certainty equivalence to update the controller with parameter estimates online. Such methods typically have more difficult convergence and stability analysis. Further, STCs require some "richness" (i.e. excitation) to encourage the estimated values to converge [38] [64] [65].

5.3.3 Estimation Methods and Parameter Estimation

The prototypical online estimation method is the Kalman filter, which is essentially a combination of a recursive least squares estimator and propagation of the state expected value and covariance. This two-step process propagates forward a Gaussian representation of the state when new measurement information is unavailable; once a measurement is obtained, a least squares update is performed. An extension, the extended Kalman filter (EKF), locally linearizes nonlinear dynamics in order to extend the Kalman filter to nonlinear systems. The unscented Kalman filter (UKF) offers another extension for very nonlinear dynamics by selecting a set of sigma points to propagate through the nonlinear dynamics to obtain a better estimate of the actual mean and covariance. Augmenting a filter's state with unknown parameters allows the filter to also act as a parameter estimator, performing both system identification and state estimation [66].
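Parameter augmentation can be illustrated with a toy example: a 1D cart with unknown inverse mass, where the filter state is augmented as [position, velocity, 1/m]. The setup below (noise levels, input, initial guess) is illustrative and not the thesis estimator; since the known-input dynamics here are linear in the augmented state, the EKF-style linearization is exact:

```python
import numpy as np

rng = np.random.default_rng(0)
dt, u = 0.1, 2.0           # timestep and (constant, known) thrust input
beta_true = 1.0 / 9.7      # true inverse mass (9.7 kg, as in the simulations)

def f(x):
    """Augmented propagation: the parameter beta = 1/m is modeled constant."""
    p, v, b = x
    return np.array([p + v * dt, v + b * u * dt, b])

F = np.array([[1, dt, 0], [0, 1, u * dt], [0, 0, 1]])  # Jacobian of f
H = np.array([[1.0, 0.0, 0.0]])                        # position measurement
Qn = np.diag([1e-8, 1e-8, 1e-10])                      # process noise (assumed)
Rn = np.array([[1e-4]])                                # measurement noise

x_true = np.array([0.0, 0.0, beta_true])
x_hat = np.array([0.0, 0.0, 0.05])    # poor initial guess of 1/m
P = np.diag([1e-4, 1e-4, 0.1])

for _ in range(500):
    x_true = f(x_true)
    z = H @ x_true + rng.normal(0.0, 1e-2, 1)   # noisy position measurement
    # predict
    x_hat, P = f(x_hat), F @ P @ F.T + Qn
    # update
    S = H @ P @ H.T + Rn
    K = P @ H.T @ np.linalg.inv(S)
    x_hat = x_hat + (K @ (z - H @ x_hat)).ravel()
    P = (np.eye(3) - K @ H) @ P
```

Because the constant input excites acceleration, position measurements carry information about 1/m and the augmented component converges; with no excitation it would not, which motivates the information-gain planning below.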

5.4 Real-time Planning for Information Gain

The problem of combining information gain with goal-driven planning, considering a robot with uncertain model parameters (e.g. a docked satellite), must be solved by: incorporating a measure of information gain; adapting parameters; adjusting information gain weighting; and performing real-time computation of u*. At a high level, these components interact as shown in Figure 5-4. The aim is to not only control the satellite to reach the goal position despite this uncertainty, but also to learn the system parameters while tracking a path towards the goal. Common sensor fusion algorithms combine measurements from different sensors to produce estimates of x. It is assumed that these estimates are available for use by the controller and the parameter estimator.

This work proposes combining the excitation obtained via offline excitation trajectories with traditional goal-oriented model predictive control. The result is a real-time planning framework that adds excitation to goal-achieving trajectories in order to characterize the system to satisfaction during useful motion. The Fisher information matrix is used as a measure of the information content, and a receding horizon planner is used for real-time planning in combination with a parameter estimator. Three key advantages are obtained:

" Inclusion of excitation in goal-achieving trajectories (Figure 5-3)

* Update of (more accurate) parameters on-the-fly (Figure 5-2)

" Continual replanning for trajectory deviation (Figure 5-2)


Figure 5-2: Real-time parameter updates allow for better decision-making since a more accurate model is available. The poorly known parameter impedes tracking (left), while online update improves tracking performance (right). The latest information from parameters and the environment can be taken in using this real-time implementation.

5.4.1 The Fisher Information Matrix

The Fisher information can be plainly stated as the amount of information that a measurement carries about an unknown parameter 0. The term 1Jog(p(Q|0)) is


deemed the score, or how sensitive the log-likelihood function is to changes in a parameter, $\theta$. The Fisher information matrix simply takes the expected value of the quadratic of these scores with respect to $\theta$, i.e., the variance of the score. This indicates whether the likelihood function is sharply peaked with respect to $\theta$. The result is effectively a measure of how much information about $\theta$ a given measurement $y$ contains. Therefore, maximizing this information means promoting meaningful measurements to understand the unknown parameters. The Fisher information is represented in matrix form when there is more than a single parameter $\theta$. For $\theta \in \mathbb{R}^{p}$, the FIM is $\mathcal{I} \in \mathbb{R}^{p \times p}$.

Figure 5-3: Excitation (information gain weighting) allows uncertain parameters to be better understood. The non-exciting trajectory has no need to apply torque, so it gains no information about its moment of inertia (left). The addition of excitation allows tracking to the goal, but with additional rotation to understand moment of inertia (right).

$$\mathcal{I}(\theta) = \mathbb{E}\left[\left(\frac{\partial}{\partial \theta}\log p(y \,|\, \theta)\right)\left(\frac{\partial}{\partial \theta}\log p(y \,|\, \theta)\right)^{T}\right] \qquad (5.6)$$
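The score-variance definition above can be checked numerically. The sketch below is a minimal illustration (not from the thesis), assuming a scalar Gaussian measurement model $y \sim \mathcal{N}(\theta, \sigma^2)$ with $n$ i.i.d. samples, for which the Fisher information has the closed form $n/\sigma^2$:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, theta_true, n = 0.5, 2.0, 50

def score(y, theta):
    # d/dtheta of the log-likelihood of n iid samples from N(theta, sigma^2)
    return np.sum(y - theta) / sigma**2

# Empirical Fisher information: the variance of the score over many datasets
scores = [score(rng.normal(theta_true, sigma, n), theta_true)
          for _ in range(20000)]
empirical_fim = np.var(scores)
analytic_fim = n / sigma**2  # known closed form for this Gaussian model

print(empirical_fim, analytic_fim)
```

The empirical variance of the score converges to the analytic value as the number of sampled datasets grows, illustrating why a sharply peaked likelihood (large score variance) corresponds to informative measurements.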

Calculation of the FIM

Calculation of the FIM largely follows [67] [68]. The FIM must be discretized for use in an NLP trajectory optimization scheme. This is accomplished using:

$$\mathcal{I}(\theta, t_0, t_f) = \sum_{k=t_0}^{t_f} H(t_k)^{T}\, \Sigma^{-1} H(t_k) \qquad (5.7)$$

where $H$ is a result of taking the first derivative of the least squares parameter estimator cost function:

$$H = \frac{\partial y}{\partial x}\frac{\partial x}{\partial \theta} + \frac{\partial y}{\partial \theta} \qquad (5.8)$$

Note that computing $H$ requires solving for $\frac{\partial x}{\partial \theta}$, which can be stated as

$$\frac{d}{dt}\left(\frac{\partial x}{\partial \theta}\right) = \frac{\partial f}{\partial x}\frac{\partial x}{\partial \theta} + \frac{\partial f}{\partial \theta}, \qquad \frac{\partial x}{\partial \theta}(0) = \{0\}^{n \times p}$$

$H$ can be effectively computed using the ODE solution for $\frac{\partial x}{\partial \theta}$, evaluating at discrete time points and summing the result.
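A minimal numeric sketch of this procedure for a double integrator with unknown mass (a simplification chosen for illustration; the thesis applies it to full 6 DoF dynamics). Euler integration of the sensitivity ODE stands in for a full ODE solver, and full-state measurements are assumed so that $H = \partial x / \partial \theta$:

```python
import numpy as np

def discretized_fim(u_profile, m=9.7, dt=0.1):
    """Euler-propagate the sensitivity ODE and accumulate the discretized FIM."""
    A = np.array([[0., 1.], [0., 0.]])   # df/dx for a double integrator
    Sigma_inv = np.eye(2)                # inverse measurement covariance
    s = np.zeros(2)                      # sensitivity dx/dm, zero at t0
    fim = 0.0
    for u in u_profile:
        dfdm = np.array([0., -u / m**2])  # df/dtheta for theta = m
        s = s + dt * (A @ s + dfdm)       # sensitivity ODE, Euler step
        H = s                             # y = x, so dy/dx = I and H = dx/dm
        fim += H @ Sigma_inv @ H          # scalar FIM for the single parameter
    return fim

t = np.arange(100) * 0.1
fim_excited = discretized_fim(np.sin(0.5 * t))  # thrust applied: informative
fim_idle = discretized_fim(np.zeros(100))       # no input: no information on m
print(fim_excited > 0.0, fim_idle == 0.0)
```

The idle case mirrors the non-exciting trajectory of Figure 5-3: with no input, the sensitivity never grows and the FIM stays zero.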

The Fisher information matrix must be reduced to a scalar form to be used in the cost function. There are a variety of approaches to do this, including the determinant, the trace, and the Frobenius norm; each has a nuanced meaning. For example, the determinant, the product of the eigenvalues, is most sensitive to zeroed terms: it is useful for detecting when certain measurements will not be useful at all in encouraging motion.

Note that the calculation of $\frac{\partial x}{\partial \theta}$ can be performed by appending it as an additional state during model propagation:

$$x_{aug} = \begin{bmatrix} r & q & v & \omega & \frac{\partial x}{\partial \theta} \end{bmatrix}^{T} \qquad (5.9)$$

The Jacobian $\frac{\partial f}{\partial x}$ is rather unintuitive for the quaternion portion of the state. The following is provided for clarity:

$$\dot{q} = \frac{1}{2}\,\Xi(q)\,{}^{B}\omega_{I/B}$$

$$\dot{q} = f(x, u) = \frac{1}{2}\begin{bmatrix} q_4 & -q_3 & q_2 \\ q_3 & q_4 & -q_1 \\ -q_2 & q_1 & q_4 \\ -q_1 & -q_2 & -q_3 \end{bmatrix}\begin{bmatrix} \omega_x \\ \omega_y \\ \omega_z \end{bmatrix}$$

$$\frac{\partial f}{\partial x} = \frac{1}{2}\begin{bmatrix} 0 & \omega_z & -\omega_y & \omega_x & q_4 & -q_3 & q_2 \\ -\omega_z & 0 & \omega_x & \omega_y & q_3 & q_4 & -q_1 \\ \omega_y & -\omega_x & 0 & \omega_z & -q_2 & q_1 & q_4 \\ -\omega_x & -\omega_y & -\omega_z & 0 & -q_1 & -q_2 & -q_3 \end{bmatrix}$$

for $x = \begin{bmatrix} q_x & q_y & q_z & q_0 & \omega_x & \omega_y & \omega_z \end{bmatrix}^{T}$, with $q_{1:3} = (q_x, q_y, q_z)$ and $q_4 = q_0$.
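The quaternion Jacobian block above can be sanity-checked by finite differences. The sketch below re-implements the $\Xi(q)$ kinematics and the analytic Jacobian and compares them numerically (a scalar-last convention is assumed, with $q_4$ as the scalar part):

```python
import numpy as np

def qdot(q, w):
    # Quaternion kinematics, scalar-last convention: qdot = 0.5 * Xi(q) @ w
    q1, q2, q3, q4 = q
    Xi = np.array([[ q4, -q3,  q2],
                   [ q3,  q4, -q1],
                   [-q2,  q1,  q4],
                   [-q1, -q2, -q3]])
    return 0.5 * Xi @ w

def jac_analytic(q, w):
    # d(qdot)/d[q, w], matching the 4x7 block structure in the text
    q1, q2, q3, q4 = q
    wx, wy, wz = w
    return 0.5 * np.array([
        [  0,  wz, -wy,  wx,  q4, -q3,  q2],
        [-wz,   0,  wx,  wy,  q3,  q4, -q1],
        [ wy, -wx,   0,  wz, -q2,  q1,  q4],
        [-wx, -wy, -wz,   0, -q1, -q2, -q3]])

z = np.array([0.1, -0.2, 0.3, 0.927, 0.4, -0.1, 0.25])  # [q; w], arbitrary
eps = 1e-6
J_fd = np.zeros((4, 7))
for i in range(7):
    dz = np.zeros(7); dz[i] = eps
    J_fd[:, i] = (qdot((z + dz)[:4], (z + dz)[4:]) - qdot(z[:4], z[4:])) / eps
err = np.abs(J_fd - jac_analytic(z[:4], z[4:])).max()
print(err < 1e-6)
```

Because the kinematics are bilinear in $(q, \omega)$, the finite-difference Jacobian agrees with the analytic one up to roundoff.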

Now that a measure of information gain is available, it must be incorporated into a real-time control scheme.

5.4.2 Nonlinear Model Predictive Control

Use of a fast receding horizon planner allows the excitation trajectory to gradually blend into a time/energy optimal trajectory based on the latest available information. A large body of work uses model predictive control for spacecraft operations, e.g., [69] [70]. Going into greater detail than mentioned previously, receding (or moving) horizon controllers function on the principle of only executing the first n of N timesteps of computed trajectory inputs. Every n timesteps, an optimal trajectory is recomputed over a horizon of length N. Only u(1)...u(n) of these optimal inputs are actually executed. Since the optimization is computed online in real-time, as frequently as every timestep (n = 1), new information about the model and the environment can be incorporated on-the-fly. Moreover, the online trajectory generation also serves as a controller, making the system robust to error and uncertainty. Guarantees on stability and robustness are possible for model predictive control (MPC) with quadratic cost and linear state dynamics, but are difficult to obtain for nonlinear MPC (NMPC). MPC is particularly suited for online control in the presence of constraints as long as the problem being solved is a quadratic program. Nonlinear dynamics and cost violate these assumptions: NMPC must be used, which ultimately relies on nonlinear programming (NLP) to solve a nonlinear optimization description of the system trajectory. This approach is required here due to nonlinearities in both the system dynamics and cost.
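The receding horizon principle can be sketched for a simple linear case. The following is a minimal unconstrained linear MPC for a double integrator (an illustrative stand-in, not the thesis's ACADO-based NMPC): at each step a finite-horizon quadratic problem is solved and only the first input is applied before replanning.

```python
import numpy as np

dt, N, rho = 0.1, 20, 0.01
A = np.array([[1., dt], [0., 1.]])   # discrete double integrator
B = np.array([[0.], [dt]])

# Prediction matrices: stacked states X = Phi @ x0 + Gamma @ U over the horizon
Phi = np.vstack([np.linalg.matrix_power(A, i + 1) for i in range(N)])
Gamma = np.zeros((2 * N, N))
for i in range(N):
    for j in range(i + 1):
        Gamma[2*i:2*i+2, j] = (np.linalg.matrix_power(A, i - j) @ B).ravel()

def mpc_step(x0):
    # Minimize sum ||x_i||^2 + rho*||u_i||^2; return ONLY the first input
    H = Gamma.T @ Gamma + rho * np.eye(N)
    U = np.linalg.solve(H, -Gamma.T @ Phi @ x0)
    return U[0]

x = np.array([1.0, 0.0])             # start 1 m from the goal (origin)
for _ in range(100):
    u = mpc_step(x)                  # replan every step (n = 1)
    x = A @ x + B.ravel() * u        # plant update with new information
print(np.linalg.norm(x) < 1e-2)
```

Because the whole input sequence is recomputed at every step, a model update (e.g., a new mass estimate changing A, B) is picked up automatically at the next replan, which is exactly the property exploited in this chapter.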

5.4.3 Implementation Details

For testing performed to this point, the planner is a receding horizon formulation and the estimator is an EKF for the unknown parameters; it is assumed that a state estimator exists. The trace of the covariance matrix of the estimates is used as a measure of estimate confidence [71].


Figure 5-4: A high-level overview of NMPC's role in the system. Information gain is weighted in the objective function, and NMPC may take in new information on-the-fly during its real-time trajectory update.

The cost function combines two contrasting objectives: maximizing the excitation and minimizing the state error toward the goal. The FIM must be interpreted in scalar form to be incorporated in the excitation portion. Multiple ways of converting the information in the FIM to a scalar exist, as mentioned previously. Minimizing the

trace of the FIM inverse, i.e., $\mathrm{Tr}\left(\mathcal{I}^{-1}\right)$, is chosen in this case, equivalent to minimizing the lengths of the axes of the estimate uncertainty ellipsoid.

Algorithm 3 An abbreviated summary of the planner's logic. The estimator and planner are initialized, and a desired target set $X_g$ is set. At each step, a trajectory optimization is computed online with the latest $\hat{\theta}$. The $\gamma$ modification logic can be adapted to the desired operation in SetGamma.

1: procedure PARAM-PLAN($x_0$, $X_g$, $P_0$, $Q$, $R$, $\theta_0$)
2:     InitEst($P_0$, $\theta_0$)
3:     InitNMPC($x_{ref}$, $u_{ref}$)
4:     while $x_k \notin X_g$ do
5:         $u_{k+1} \leftarrow$ NmpcStep($x_k$, $\theta_k$, $x_{ref}$, $u_{ref}$)
6:         $x_{k+1} = f(x_k, u_{k+1})$    ▷ system dynamics update
7:         ($\theta_{k+1}$, $P_{k+1}$) $\leftarrow$ ParamEst($\theta_k$, $P_k$, $y_{k+1}$, $u_k$)
8:     end while
9:     return ($x_{0:k}$, $u_{0:k}$)
10: end procedure
11:
12: procedure NMPCSTEP($x_k$, $\theta_k$, $x_{ref}$, $u_{ref}$)
13:     $\gamma \leftarrow$ SetGamma
14:     $u_{k+1} =$ RunNmpc(CalcFisher)
15:     return $u_{k+1}$
16: end procedure

The second part of the objective function is goal-driven, including tracking error and input with relative weighting matrices, Q and R. The adjustable relative weighting term, $\gamma$, assigns weight relative to this state-input weighting at a desired rate.

$$J = \int_{t_0}^{t_f} x^{T}Qx + u^{T}Ru + \gamma\,\mathrm{Tr}\left(\mathcal{I}^{-1}\right)\, dt \qquad (5.10)$$

This $\gamma$ assigns relative weight between fuel/time optimality and information gain. At $\gamma = 0$, the cost is effectively a nominal quadratic weighting, whereas $\gamma \rightarrow \infty$ yields a pure excitation trajectory.

Discretizing the above using the discrete version of the FIM, adding constraints, and setting R to 0 for this implementation gives:

$$\begin{aligned}
\text{minimize} \quad & J = \sum_{i=0}^{\ell} x(t_i)^{T}Q\,x(t_i) + \gamma\,\mathrm{Tr}\left(\mathcal{I}(t_i)^{-1}\right) \\
\text{subject to} \quad & \dot{x} = f(x, u), \\
& u_{min} \le u(t) \le u_{max}, \\
& x \in \mathcal{X}_{free}
\end{aligned} \qquad (5.11)$$

Note that $\ell$ is the horizon length. In the implementation discussed here, $\gamma$ depends on the norm of the state error as well as elapsed time through the following relation,

$$\gamma = e^{-t/\tau} + \|x\|_{2} \qquad (5.12)$$

where $\tau$ is the time constant for decay. As a result, the emphasis on the FIM weighting decreases exponentially, and is practically zero by the time the goal position is reached, avoiding wasteful excitation. Developing a good heuristic for $\gamma$ is a matter of design choice: potentially useful inputs include how much information gain is obtained compared to past values, and trajectory-tracking error.
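A sketch of the $\gamma$ schedule of Eq. (5.12), as reconstructed here (the additive form $e^{-t/\tau} + \|x\|_2$ and the time constant value are assumptions for illustration):

```python
import math

def gamma(t, x_err_norm, tau=20.0):
    # Information weighting: exponential decay in time plus a state-error term
    return math.exp(-t / tau) + x_err_norm

# Early in the maneuver, far from the goal: strong excitation weighting
g0 = gamma(0.0, 2.0)
# Late, near the goal: weighting is practically zero, avoiding wasteful excitation
gT = gamma(100.0, 0.01)
print(g0 > 1.0, gT < 0.05)
```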

5.5 Algorithm Results

First, the benefit of receiving online parameter updates is demonstrated for a simple double integrator system. The following sections then develop two main test scenarios demonstrating (1) the benefit of information weighting for information gain and (2) the benefit of information weighting for system performance.

5.5.1 Demonstration of Updating Parameters

Using the double integrator dynamics of Chapter 3 with added system noise, a simulation was developed to demonstrate the benefit of receiving online parameter updates. In Figure 5-5, the system mass, m, is initially set to 2 kg. The system is paired with a KF performing parameter estimation of the unknown mass and a linear MPC to guide motion toward the goal. The updated system (blue) begins to receive


the updated $\hat{m}$ after a specified time interval (black dot), allowing the estimate to improve. The MPC for the updated system then has a more accurate model to plan with, allowing for better tracking. This is represented by the L2-norm of the state, which is significantly improved relative to the overshoot of the non-updated (orange) system, which has a markedly low guess of its initial mass.
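The effect described above can be reproduced with a much simpler stand-in for the KF + MPC pair: a model-based feedback law whose mass model is switched to the true value at the update time. All gains and values below are hypothetical and chosen only to illustrate the mechanism, not taken from the thesis's simulation:

```python
def simulate(update_time=None, T=8.0, dt=0.01):
    # Feedback linearization with a (possibly wrong) mass model:
    # u = m_model * (-kp*x - kd*v); the true plant divides by m_true.
    m_true, m_model = 2.0, 0.5      # markedly low initial mass guess
    kp, kd = 4.0, 4.0
    x, v, t = 1.0, 0.0, 0.0
    while t < T:
        if update_time is not None and t >= update_time:
            m_model = m_true        # parameter update arrives from the estimator
        u = m_model * (-kp * x - kd * v)
        a = u / m_true              # true plant response
        x, v, t = x + dt * v, v + dt * a, t + dt
    return abs(x)                   # final distance from the goal

err_static = simulate(update_time=None)  # never learns the true mass
err_updated = simulate(update_time=1.0)  # estimate switched in at t = 1 s
print(err_updated < err_static)
```

With the low mass guess the effective feedback gains are quartered, producing the sluggish, oscillatory convergence analogous to the non-updated (orange) curve; switching in the true mass restores the intended well-damped response.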


Figure 5-5: Online mass-updating using an MPC for a double integrator system. Here, mass is an unknown parameter whose update is turned on at the black point of update.

5.5.2 Results on a 6 DoF System

Test Scenarios A free-flyer with uncertain mass, m, and uncertain principal moments of inertia, $I_{xx}$, $I_{yy}$, $I_{zz}$, is considered, using the 6 DoF dynamics of Chapter 3. In such a situation, the benefits of using the proposed algorithm are two-fold: the

information weighting will lead to an improvement in estimation accuracy, while updating these parameters in the NMPC system model should result in more accurate tracking. In order to highlight these aspects, the performance of this method was evaluated through two kinds of tests:

1. Comparison of the estimates when the cost function also includes information weighting, as opposed to exclusively performing goal-tracking ($\gamma = 0$).

2. Comparison of tracking performance with and without updating the system model with parameters estimated on-the-fly.

The simulation was performed on a 16 GB, 8 core Intel i7-4700MQ CPU @ 2.40 GHz, running Linux Mint 18.3, using the C++ bindings of the ACADO MATLAB interface [72]. NMPC steps are fast, requiring approximately 85 ms on average for the stated 40 s time horizon. Reiterating, the actual system mass is taken as 9.7 kg, while the principal moments of inertia $I_{xx}$ and $I_{yy}$ are taken as 7 kg·m², and $I_{zz}$ as 10 kg·m². The NMPC loop was run at 1 Hz, with a horizon length of 40 s. The 6 DoF system is commanded to translate and rotate to the origin.

Benefit for Information Gain

First, the benefit with regard to information gain is highlighted. Figure 5-6 shows a 3 DoF system as a clear example of the role of excitation in the goal-achieving trajectory. The system on the left has $\gamma = 0$ and performs a nominal motion toward the origin; no information is gained about its uncertain $I_{zz}$. However, when the information gain term is added, the system adds the desired amount of excitation to reveal information about the unknown parameter. In this case, torque is added about the z-axis. Now, the framework is shown on the 6 DoF system. Figure 5-8 shows the result of parameter estimation when information weighting is added, using the $\gamma$ evolution of Figure 5-10. Convergence toward the moments of inertia, which would be less excited nominally, as in the 3 DoF example, is clear. This is in contrast to Figure


Figure 5-6: Two different trajectories for a system with uncertainty in mass and moment of inertia, $I_{zz}$. On the left, the non-excited trajectory gains no information about $I_{zz}$. However, on the right, torques are commanded to aid the parameter estimator when information weighting is added. Note that the motion is 3 DoF in this example for clarity.


Figure 5-7: Evolution of the estimates of the principal moments of inertia of a 6 DoF free-flyer when the cost function does not have information weighting, i.e., $\gamma = 0$.

5-7, which shows evolution of the parameter estimates when no information gain term is included in the cost function.


Figure 5-8: Values of the inertial parameters (mass and principal moments of inertia) with respect to time, for the updated and non-updated cases.

Benefit for System Performance

The addition of this excitation is also beneficial to trajectory tracking. Figure 5-9 shows the evolution of the L2-norm of a system using Figure 5-10's $\gamma$ update, versus a $\gamma = 0$ system. The non-updated system, lacking accurate parameter information, easily overshoots the origin.

5.5.3 Summary of Framework

A framework for robotic systems to plan an optimal goal-oriented trajectory while simultaneously attempting to identify some of their uncertain parameters was introduced. The latest estimates, as well as information regarding the estimator's confidence in them, are incorporated in the real-time calculation of the trajectories. This


Figure 5-9: Trajectory tracking, with and without parameter updates. With poor initial guesses for parameters and no updates, the system does not converge well to its desired attitude and position.


enables the system to perform calibration while also using this information to plan for its primary task in a more robust manner.

Figure 5-10: The information weighting term, $\gamma$, decreases exponentially with time and shrinks with the distance to the goal state. Many interesting heuristics for $\gamma$ may be created, depending on the desired operation. The plot for test objective 2 is shown here.

A cost function composed of the trace of the inverse Fisher information matrix, as a measure of information gain, and the error of the current state to the goal was formulated. A nonlinear model predictive controller (NMPC) was used to minimize this cost, obeying system dynamics and actuator constraints. Its real-time nature (approximately 85 ms per 40 s time horizon) allows for model updates based on new information from the estimator. Simulation results for 3 DoF planar as well as complete 6 DoF rigid body dynamics with uncertain inertial parameters were presented. Trajectories with little overshoot from arbitrary initial conditions were shown, particularly in comparison to planning that does not update parameters on-the-fly. In addition, trajectories for another scenario show the ability of information-weighted planning to discover parameters that would otherwise not be adequately excited by a standard cost function.

Chapter 5 Acknowledgments The content in this chapter was the result of collaborative work with Monica Ekal, some of which is also summarized in [73]. In producing final results on the 6 DoF system, Monica Ekal produced an appropriate estimator (EKF) and implemented much of the information gain term discussed in Section 5.4.1. The author developed the NMPC implementation and system dynamics, jointly helped with FIM derivations, and jointly integrated the overall framework. Many of the figures presented in Section 5.5 are based on this joint implementation. The concept for the framework was truly collaborative, and the author thanks Monica Ekal for working together on this interesting project.

Chapter 6

Systems Considerations and Launch Manifesting Optimization

Motion planning for on-orbit robotics (especially real-time planning) is necessary whenever rigorous, risk-averse offline planning is not possible. There are a variety of mission scenarios, detailed in the next section, that demonstrate different motion planning use cases and establish some of their unique characteristics. The Assembler scenario is examined from a systems perspective, and a launch manifesting optimization problem with contrasting characteristics to some of the motion planning optimizations is proposed.

6.1 Scenarios for On-Orbit Motion Planning

There are a few case study scenarios, touched upon in Chapter 1, that are of special interest to microgravity motion planning. (A plethora of examples also exist in the context of planetary exploration, but this chapter will focus specifically on microgravity robotics.) These use cases are also in a different problem domain than traditional satellite trajectory optimization and control. Traditional orbital trajectory optimization is generally obstacle-free, performs detailed resource-intensive offline trajectory optimization, and requires little if any autonomy; control is performed around a small number of rigorously-computed offline trajectories. Online recomputation is generally

unnecessary. However, for the scenarios in mind, the proposed use cases are:

" Assembler

" Servicer (Manipulator)

" Assistive Free-Flyer

" Surveyor

Assembler Assembly on-orbit using autonomous spacecraft is an option for multi-component structures or spacecraft (e.g. Figure 6-1). At a large scale, this yields a construction site scenario with multiple spacecraft, many components, and defined roles for different operators. Jewison provides a tradespace categorization of many of these roles [74]. The Assembler differs from a Servicer because replanning is more frequent and likely occurs in a more cluttered environment. Multi-agent operations become a concern.


Figure 6-1: A cluttered construction site scenario with multiple robotic free-flyers in stylized form (left) and an artist's concept (right) [75].

Servicer (Manipulator) This category broadly captures satellites that rendezvous and perform proximity operations around a target, followed by interaction with this target. Grappling and docking/berthing both fall under this category. Unlike Assemblers, Servicers may be more risk-averse, operate in less cluttered environments, and

might have more ability to give special consideration to the environment in which they operate (e.g. significant expert-assisted plan development offline for servicing a known satellite). Servicers are not necessarily only for repairing defunct satellites: they have applications in debris removal and one-off missions requiring satellite proximity operations, for example.

Assistive Free-Flyer Assistive free-flyers are multi-purpose satellites intended to accomplish a variety of tasks involving repeated proximity operations. They may operate under conditions more similar to traditional Earth-centric robotics, like IVA pick-and-place operations. Nonetheless, they are free-flyers and perform activities similar to Assemblers and Servicers, though perhaps in a more risk-tolerant environment with significant clutter.

Surveyor Surveyors are distinct from the other scenarios in that they do not physically interact with their environment (e.g. Figure 6-2). Surveyors provide more information about the environment they operate in to enable other work, for example, surveying the exterior of a space station for damage or overseeing a construction site scenario to coordinate multi-agent activity. Surveyors, likely without manipulator arms, exhibit 6 DoF rigid body dynamics.

Figure 6-2: SPHERES with attached VERTIGO camera system, an example of a Surveyor testbed, shown here with astronaut Thomas Marshburn [5].

6.2 Case Study: Launch Manifesting Optimization

A packing and scheduling problem emerges from the construction-site Assembler scenario. Independent of motion planning on-orbit, an offline optimization assigning components to launch vehicles is required to launch components in a cost-optimal way. This optimization expands the breadth of optimization problems considered in this thesis: it is discrete; it is subject to scheduling, volume, and mass constraints; and it can only be heuristically approximated, as it will be demonstrated that the problem is NP-hard.

6.2.1 Large-Aperture Assembled Telescope Background

In terms of assembly, space telescopes come in three varieties:

* Monolithic

* Deployable

" Assembleable

Assembleable telescopes find their niche when launch vehicle constraints (e.g. fairing size) push large primary diameters out of the realm of possibility. There is also, potentially, some overlap in application with deployable and monolithic telescopes due to potential gains in utilizing cheaper launch vehicles and in reducing mission risk. Because ever-increasing primary mirror diameters are sought, an on-orbit assembled space telescope appears to be the logical, and perhaps only, solution, accounting for fairing size limitations. Therefore, an on-orbit assembled large-aperture space telescope is a prime candidate for on-orbit assembly. Many proposals have been floated over the years, with some of the more recent ones shown in Figure 6-3. A variety of works center on segmenting the primary mirror [76] [77] [78] [79] [80] [81]. Miller also discusses a proposed demonstration testbed centered on mirror segmentation [82]. Very recent work by Mukherjee and Sigler et al. provides detailed conceptual analysis for a 20 m on-orbit assembled telescope.


[Lillie, 2005] [Miller, 2006] [Basu, 2003] [Oegerle, 2006] [Pollidan, 2016] [Feinberg, 2013] [Saunders, 2017] [Mukherjee, 2018]

Figure 6-3: Proposed large telescope missions, prime candidates for a construction site scenario.

6.2.2 Problem Formulation

An on-orbit assembled telescope, like other multi-component missions, will require a new set of tools in order to best design launch campaigns. Such complex mission proposals require many segmented components to be launched at varying times to facilitate assembly or different mission phases. In contrast to the assembly of the International Space Station, the most ambitious segmented on-orbit project to date, such missions will likely have many possible launch vehicles available with differing constraints and costs. Packing components into a launch vehicle fairing is one of the most notable constraints, given that components may not be designed around the capacity of a Shuttle payload bay and may instead be able to ride on any number

of rockets emerging from the commercial sector. The framework here addresses this optimal launch manifesting problem, in the sense of minimizing the cost of using a number of launch vehicles with constraints placed on packing a selection of components, trajectory limitations in reaching a desired destination, and scheduling of launch vehicles and components. The decision variables of this problem are which and how many launch vehicles are used, their launch dates, and which components are placed in each vehicle. That is, the cost-optimal launch manifest must be determined: which components are placed on which launch vehicle, and when. Decision variable $x^{i}_{jk}$ at launch time $t_i$ captures the number of component packings of type $j$ loaded on rocket type $k$ to launch. The optimization problem to solve can be stated:

$$\begin{aligned}
\underset{x^{i}_{jk}}{\text{minimize}} \quad & J = \theta^{T} x^{i}_{jk}, \quad \forall i \\
\text{subject to} \quad & l_k \in L, \\
& \textstyle\sum_{c_j \in C_{jk}} m_{c_j} \le \text{maxMass}_{k}, \\
& \textstyle\sum_{c_j \in C_{jk}} V_{c_j} \le \text{maxVolume}_{k}, \\
& c_j \in C_{jk} \ \text{meetDimensions}, \\
& c_j \in C_{jk} \ \text{meetPriority}
\end{aligned} \qquad (6.1)$$

With the following definitions:

$L$ : set of launch vehicle types
$\theta$ : cost vector for rocket variety associated with $x_{jk}$
$x_{jk}$ : number of component packings $j$ on launch vehicle $k$ to launch $\qquad$ (6.2)
$l_k$ : launch vehicle $k$
$c_j$ : component of packing $j$
$C_{jk}$ : component packing $j$ on launch vehicle $k$

Note that superscript i indicates the ith launch window available for this launch

campaign. Additional constraints can be added as needed, for example, restricting the center of mass of the payload to be below a certain height. Figure 6-4 shows a graphic of different rockets and components.
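The structure of problem (6.1) can be illustrated on a toy instance small enough to solve by brute force. All numbers below are hypothetical, and scheduling, dimension, and priority constraints are omitted; components are grouped into fairing loads and each load is assigned the cheapest rocket that satisfies its mass and volume constraints:

```python
from itertools import product

components = [(10, 40), (8, 35), (6, 20), (3, 10)]  # (mass, volume), hypothetical
rockets = [("small", 12, 50, 60),                    # (name, maxMass, maxVolume, cost)
           ("large", 25, 90, 140)]

def cheapest_fit(mass, vol):
    # Cheapest rocket type whose mass/volume constraints admit this load
    fits = [(cost, name) for name, mM, mV, cost in rockets
            if mass <= mM and vol <= mV]
    return min(fits) if fits else None

best = None
# Brute force: assign each component to one of up to 4 fairing "slots"
for assign in product(range(len(components)), repeat=len(components)):
    slots = {}
    for (m, v), slot in zip(components, assign):
        sm, sv = slots.get(slot, (0, 0))
        slots[slot] = (sm + m, sv + v)
    choice = [cheapest_fit(m, v) for m, v in slots.values()]
    if any(c is None for c in choice):
        continue  # some load fits no available rocket
    cost = sum(c for c, _ in choice)
    if best is None or cost < best[0]:
        best = (cost, sorted(name for _, name in choice))
print(best)
```

Even this tiny instance shows the combinatorial character of the problem: the optimum here uses three cheap rockets rather than the obvious small-plus-large split, and the search space grows exponentially with the component count.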


Figure 6-4: A mix of rockets, $L$, is available to launch components, $C$, while meeting a variety of constraints.

6.2.3 Existing Approaches and Challenges

A few attempts at similar problems have been made. Morgan's is by far the most similar, and was an inspiration for the baseline version of this approach [83]. Morgan et al. propose using an exhaustive enumeration of all possible component packings, and then running a linear optimization. A heuristic version uses greedy packing with possible refinement steps to follow. However, the scenario here involves many more components, as well as trajectory, scheduling, and volume constraints. Gralla et al. have also looked at the problem in terms of an interplanetary mission, considering

a direct enumeration of packing options and an in-progress integer programming solution [84]. However, direct enumeration for a significant number of components quickly becomes infeasible. Other proposals include [85] and a variety of works that implicitly deal with launch manifesting but do not use an optimizing approach, or instead use hand-designed missions. As [83] points out, this problem is NP-hard since the bin-packing problem is a special subcase. The bin packing problem states: given a number of objects m with volumes $V_m$, find the minimum number of bins n, each of volume V, such that n is minimized when packing all m. Using a solution gadget (a special simplification to show a result covers a subcase), bin packing reduces to the launch optimization problem: consider when there is only one type of launch vehicle and there are only volume constraints (1D only). The launch optimization problem is therefore NP-hard, with substantially more complexity than classic bin packing. Thus, heuristic approaches are one way of approximating a sub-optimal solution in feasible runtime. Morgan proposes a heuristic approach for rockets with payload mass constraints only, but relies on an enumeration of all possible combinations of component fairing packs. As stated above, the approach presented here enforces additional constraints and, because there are a significant number of components, uses heuristics to better guide the development of fairing packings to then hand off to the integer optimization. In effect, this reduces the number of decision variables required in the integer optimization. A post-processing step aims to reduce wasted mass and volume capacity. In addition, the approach is run in multi-step fashion over a set of desired launch windows and with the aid of a trajectory optimization subroutine.
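The bin-packing subcase mentioned above, and the "largest-first" flavor of heuristic used later, can be illustrated with the classic first-fit-decreasing heuristic (a standard textbook algorithm, not the thesis's implementation):

```python
def first_fit_decreasing(volumes, bin_capacity):
    # FFD heuristic for the bin-packing special case of the manifest problem:
    # sort items largest-first, place each in the first bin where it fits.
    bins = []  # each bin holds a list of item volumes
    for v in sorted(volumes, reverse=True):
        for b in bins:
            if sum(b) + v <= bin_capacity:
                b.append(v)
                break
        else:
            bins.append([v])  # open a new bin (a new launch vehicle)
    return bins

bins = first_fit_decreasing([5, 7, 5, 2, 4, 2, 5], bin_capacity=10)
print(len(bins))
```

The heuristic is fast but sub-optimal in general (FFD is guaranteed only to be within a constant factor of optimal), which is exactly why the refinement/repacking steps described later are worthwhile.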

6.2.4 Integer Optimization Approach

The approach proposed here begins in Algorithm 4. First, trajectory constraints must be related back to the capabilities of a launch vehicle. This is accomplished through TrajLookup, which uses a powerful custom trajectory optimization tailored for the desired destination. This lookup, however, depends on the desired launch


timeframe. It is therefore necessary to set time constraints on when rockets can be launched, $(t_0, t_f) \in T$, a time window. Assigning priority levels to every component is a convenient way to deal with this: components can be specified to launch in a desired order, and a time window for each priority can be developed within desired bounds. Then, TrajLookup may perform sampling at times within this window to find a near-optimal $\Delta v$ to the desired destination. Subsequently, $\Delta v$ requirements are mapped back to produce an $m_{max}$ for each rocket.


Figure 6-5: Fairing packings are generated heuristically for an integer optimization to select from, guaranteeing that the minimum number of required components are launched.

$m_{max}$ produces the necessary constraint in the initial problem definition. The constraints must now be considered in generating options of ways to put components in fairings, i.e., packings $C_{jk}$ (Figure 6-5). Exhaustive enumeration is not possible, so a "largest-first" approach is used by ordering components with a relative weighting on mass and volume, $\gamma$, depending on whether the user believes their problem is more mass or volume constrained. More massive/voluminous components are packed first, until fairings are filled (constraints violated) and new packings are required. Additional individual component constraints are also considered. This process is repeated for every rocket, creating a set of packings to choose from for all rockets. This process is repeated within each priority level. Now, Algorithm 5 continues, creating a min-cost integer program for each priority level using the packings and constraints on the number of components required to launch. The result is an assignment of fairing packings to launch vehicles, within each

103 individual time window. The process is also outlined in Figure 6-6.


Figure 6-6: The entire process from sorting of components (a), to packing creation (b), integer programming solution (c), to refinement (d) to eliminate wasted capacity.

Algorithm 4 Manifesting with Repacking

1: procedure GENERATEMANIFEST($C$, $L$)
2:     for $(t_{0_i}, t_{f_i}) \in T$ do
3:         $C_{types} \leftarrow$ IDENTIFYTYPES
4:         $\Delta v_k \leftarrow$ TRAJLOOKUP($t_{0_i}$, $t_{f_i}$, $k$), $\forall k$
5:         $M_i \leftarrow$ GREEDYPACK($C_{types}$, $C$, $L$, $\Delta v$) or INTPROGPACK($C_{types}$, $C$)
6:     end for
7:     $M \leftarrow$ REPACK($M_i$)
8: end procedure

6.2.5 Greedy Packing Approach

A modification, eliminating the need for integer programming, uses a heuristic to choose packings. This is detailed in Algorithm 6. Generation of fairing packing options, $C_{jk}$, occurs as desired. The current approach is to sort components by $m + \gamma V$, as stated before. Essentially, GreedyPack pregenerates a set of fairing packing options and ranks them by some metric. The algorithm then cycles to choose the "best" packing available that does not exceed the number of components needed, and continues until all components are assigned. This process tends to choose the most cost-effective rockets, but can lead to underutilization since packings are only considered one-by-one. Further, it requires a generous number of packing options to be generated.

Algorithm 5 IntProgPack

1: procedure INTPROGPACK($C_{types}$, $C$)
2:     $x_{jk} \leftarrow$ GENERATEPACKINGS
3:     for $c_i \in C$ do
4:         $rank_i \leftarrow m_i + \gamma V_i$
5:     end for
6:     SORTRANKBESTFIRST($C$)
7:     for $l_k \in L$ do
8:         for $c_i \in C$ do
9:             if $l_k$ NotFull then
10:                $C_{jk} \leftarrow c_i$
11:            else
12:                BEGINNEW($C_{jk}$)
13:                $C_{jk} \leftarrow c_i$
14:            end if
15:        end for
16:    end for
17:    $x_{jk} \leftarrow$ CREATEDECISIONVEC($C_{jk}$)
18:    $\theta \leftarrow$ ASSIGNCOSTS($x_{jk}$)
19:    $M \leftarrow$ SOLVEINTPROG($x_{jk}$, $\theta$, constraints)
20: end procedure

Algorithm 6 GreedyPack

1: procedure GREEDYPACK($C_{types}$, $C$, $L$, $\Delta v$)
2:     $C_{jk} \leftarrow$ GENERATEPACKINGS
3:     for $x_i \in x_{jk}$ do
4:         $score_i \leftarrow m_i + \gamma V_i$
5:     end for
6:     SORTSCOREBESTFIRST($C_{jk}$)
7:     while ConstraintsNotSatisfied do
8:         if WithinConstraints($C_{jk}$) then
9:             $x_{jk} \leftarrow x_{jk} + 1$
10:        else
11:            NEXT($C_{jk}$)
12:        end if
13:    end while
14:    $cost \leftarrow$ GETCOST($x_{jk}$)
15:    $M \leftarrow cost$, $x_{jk}$
16: end procedure
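A minimal sketch of the largest-first packing generation described above, simplified to a single rocket type with sequential fill and hypothetical component values (the real algorithm also handles multiple rocket types, dimension checks, and priority levels):

```python
def largest_first_pack(components, max_mass, max_vol, gamma=0.5):
    # Sort components by m + gamma*V ("largest-first"); open a new fairing
    # packing whenever the mass or volume constraint would be violated.
    order = sorted(components, key=lambda c: c[0] + gamma * c[1], reverse=True)
    packings = [[]]
    for m, v in order:
        pm = sum(c[0] for c in packings[-1])
        pv = sum(c[1] for c in packings[-1])
        if pm + m > max_mass or pv + v > max_vol:
            packings.append([])   # current fairing is full; start a new one
        packings[-1].append((m, v))
    return packings

packs = largest_first_pack([(6, 9), (2, 3), (5, 8), (1, 1)],
                           max_mass=8, max_vol=12)
print(len(packs))
```

Tuning `gamma` shifts the ordering toward mass- or volume-dominated components, mirroring the user-set weighting described in the text.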

6.2.6 Repacking

For either GreedyPack or IntProgPack, suboptimal choices will be made, particularly because not all packing options can be generated to optimize over. A convenient and rather effective heuristic is to Repack components between manifests. This approach is mainly inspired by Morgan et al. Component repacks are attempted in a two-step process: (1) attempt to merge pairs of launch vehicle packings, then (2) spread components between remaining rockets to eliminate other rockets. This prevents rockets from being underutilized. Figure 6-7 shows the first step of this process. This allows moderately underutilized rockets to be eliminated.

repack pairwise

Figure 6-7: Fairing packings are first considered for pairwise combination, eliminating one of the underfilled pair.

Figure 6-8 shows the second step of this process. This allows less underutilized rockets to be filled entirely, if another very underutilized rocket can be removed. In practice, this leads to consistently high utilization of vehicle capacity with little computational overhead. The refining process mixes rockets of different priority levels. For certain situations this is acceptable, such as storing components for a time near the assembly site of the on-orbit structure. One potential workaround is to repack only within a certain priority level and then, if desired, mix priority levels. Additional heuristics may be beneficial: this one in particular has proved successful for the results shown in the next section.



Figure 6-8: Components are spread among all other fairing packings if constraints allow, removing very underutilized vehicles.

Algorithm 7 Repack
 1: procedure REPACK(M_i)
 2:   for M_m ∀m do
 3:     for l_n ∈ M_m ∀n do
 4:       for l_p ∈ M_m ∀p ≠ n do
 5:         for c_q ∈ l_n ∀q do
 6:           l_p ← CHECKIFMOVABLE(c_q, l_p)
 7:         end for
 8:         if Movable ∀q then
 9:           REMOVE(l_n)
10:           REASSIGN(c_q)
11:         end if
12:       end for
13:     end for
14:   end for
15:   for M_m ∀m do
16:     for l_n ∈ M_m ∀n do
17:       for c_p ∈ l_n ∀p do
18:         l_target ← CHECKIFMOVABLE(c_p, l ∈ M)
19:       end for
20:       if Movable ∀p then
21:         REMOVE(l_n)
22:         REASSIGN(c_p)
23:       end if
24:     end for
25:   end for
26: end procedure
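The pairwise-merge step of Repack can be sketched as follows. A single mass capacity stands in for the full mass/volume/priority feasibility checks, and the loads are invented toy values.

```python
def repack_pairwise(loads, capacity):
    """First Repack step (sketch): merge pairs of launch-vehicle loads
    whenever their combined component mass fits one fairing, eliminating
    the second vehicle of the pair. `loads` is a list of lists of
    component masses; `capacity` is an illustrative single-fairing limit."""
    merged = True
    while merged:
        merged = False
        for i in range(len(loads)):
            for j in range(i + 1, len(loads)):
                if sum(loads[i]) + sum(loads[j]) <= capacity:
                    loads[i] = loads[i] + loads[j]  # combine into one fairing
                    del loads[j]                    # drop the emptied vehicle
                    merged = True
                    break
            if merged:
                break  # restart the scan over the shortened list
    return loads

loads = [[400, 300], [250], [500], [100, 100]]
print(repack_pairwise(loads, 1000))  # → [[400, 300, 250], [500, 100, 100]]
```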

6.2.7 Test Cases and Results




Figure 6-9: Fairing packs generated for a 20 m space telescope component list. Note that the optimizer is limited by both mass and volume constraints, and attempts to spread the allocation of components within a time window so that the fewest, most economical rockets possible are used. Plot sizes are scaled to rocket dimensions. The heading above each rocket indicates {name, priority number, rocket ID}.

A test case with a large in-space assembled telescope was conducted (118 components of 19 different component types, 20 m primary diameter). Realistic component parameter values were used, along with a mix of 7 possible rocket types. In particular, the Space Launch System (SLS) was included, constituting a significantly different class of launch vehicle from the others available. The dimensionality of this problem exceeds the capability of a brute-force solution and becomes challenging for a human expert to assign optimally. Rockets and components considered are listed in Appendix A. Figure 6-9 visualizes the rockets selected and their utilization. An average utilization of 91.51% by mass and 77.76% by volume was obtained. The components

used have three different priority levels, corresponding to the three possible rocket launch dates. The total cost of this scenario is $2,884 million, using launch vehicle cost estimates from the FAA [86]. A contrasting scenario is shown in Figure 6-10. In this example, small components

(mass- and volume-wise) are able to fit entirely within more economical launch vehicle choices: the Falcon 9 and Falcon Heavy. This scenario has 82 components, 19 component types, and 7 launch vehicle options. The primary diameter is 12 m with a 2.4 m segment size, for a total cost of $686 million.



Figure 6-10: Fairing packs generated for a 12 m space telescope component list. Smaller, more cost-effective rockets are selected since they are able to satisfy all constraints.

Drastic changes occur when components violate a constraint that forces a jump to a vastly different rocket class (in terms of capability and cost). If a constraint exceeds the mid-size rocket class, the heavy-lift SLS is the only remaining option. Though expensive on a dollar-per-kg basis, its additional volume can siphon off components from other rockets; strictly in terms of complexity, this may be seen as beneficial. The utopia scenario is to launch the largest component possible on the biggest possible rocket for little cost; realistically, these factors must be balanced by careful component design and careful manifesting.

6.3 Summary and Contrast to Motion Planning Optimization

Standalone, the manifesting optimization just presented is a useful tool for cost-optimally assigning components to rockets under a variety of constraints. Its intended use is as an aid to an experienced user, who might add insights to the suggested manifest that are not captured by the constraints, for instance, recognizing that a particular combination of components cannot be launched together without presenting unacceptable risk. Such additional constraints are a fruitful area for future exploration of the optimal launch manifesting problem.

The launch manifesting problem also provides a useful contrast that highlights the characteristics of the problem domain in which optimal motion planning problems reside. On-orbit motion planning resides in a continuous domain, deals with differential dynamical and environmental constraints, and is often desired to run online. Constraints are potentially nonlinear, though linear input and state constraints are common. The problem has been demonstrated to be at least PSPACE-hard (implying NP-hard) [2], [87]. Meanwhile, the launch manifesting problem is in a discrete domain, with a coupled trajectory optimization problem that can be reduced to discrete launch time choices via sampling. Constraints are linear and decision variables are constrained to be integer. The problem is at least NP-hard, and is rendered feasibly solvable via various heuristic approximations (e.g., generating a subset of packings to choose from).

Regardless of the algorithmic comparison, the manifesting optimization problem should serve future missions well as payloads become increasingly ambitious and the arsenal of launch vehicles to choose from grows. One particularly interesting application is the addition of small launch vehicle classes to the mix, to take advantage of their excellent dollar-per-kg performance for small payloads.

Chapter 7

Conclusions and Future Work

7.1 Conclusions and Contributions

Algorithms for dealing with some cases of challenging dynamics and system uncertainty were presented, with particular emphasis on dynamics relevant to spacecraft. This line of work will become increasingly important as greater autonomy is sought for more complicated microgravity robots. Chapter 4 applied kinodynamic sampling-based planning to multiple cases of relevant dynamics, in addition to an optimizing sampling-based planner on a low-dimensional underactuated system. Kino-RRT was primarily used, which showed that with appropriate intuitive steering functions, feasible solutions for complicated systems with dynamics and environmental constraints can be obtained. The computation times for both the 6 DoF fully actuated and 3 DoF underactuated dynamics are on the order of a few seconds for non-optimized implementations on a standard personal computer. Optimizing sampling-based planning was shown to have some clear drawbacks, which must be addressed in future work. Relative to the current literature, adaptations of the Kino-RRT steering function have been introduced, with specific emphasis on and discussion of the implications of dynamics-aware motion planning for space systems. The approaches take inspiration from LaValle and Karaman. An approach for explicitly tackling dynamics uncertainty using real-time motion planning was provided in Chapter 5. This planning for information gain approach

demonstrated on the 6 DoF dynamics that adding information weighting to a real-time trajectory optimization formulation can result in systems that better estimate their unknown parameters, and may also lead to better trajectory tracking when these unknown parameters are used by the system model. Simulation results on a system with process noise and uncertain moments of inertia and mass demonstrated both better estimator convergence with weightings on these parameters and improved trajectory tracking using the improved parameter estimates. The work is truly real-time because of its use of NMPC, taking approximately 85 ms to compute control actions for 40 s time horizons. This is a novel approach: trajectory optimization for information gain blended with goal-oriented motion has not, to the author's knowledge, been formulated in this way. Wilson provides a useful framework that has helped in defining information gain.

Finally, multiple roles for on-orbit robots have been classified (with additional emphasis on roles often overlooked by other surveys, e.g., Surveyors and Free-Flyers), and an optimization problem complementing the motion planning work has been presented. Launch manifesting optimization will become increasingly important as new rocket types come online and larger, multi-component missions become a reality. Finding the cheapest possible way to execute these missions will surely be one of the considerations in developing launch manifests. Further, documenting some of the constraints that must be considered in this problem provides a useful reference. Morgan provided strong inspiration for the heuristic approach, while additional constraints and practical adaptations (e.g., generating subsets of fairing packings) have been proposed within this framework.

7.2 Future Work

One of the inspirations for this work was observing the proliferation of discussion about microgravity free-flying manipulators. While this thesis has produced methods potentially valuable for these systems, further application of the approaches to manipulator-equipped satellites is key (e.g., Figure 7-1). Manipulating objects opens

up a new world of challenging and interesting problems, including microgravity contact dynamics and object manipulation. Many exciting fields of robotics, like motion planning, deserve further consideration in the context of the space environment. The planning for information gain approach presented here can be extended to systems with more complex models, for instance, control and characterization of docked satellites or manipulator-equipped satellites handling uncertain payloads. In cases where some parameters are known with more certainty than others, the proposed algorithm could also be modified to re-plan trajectories by focusing only on the uncertain subset of parameters. Many possible tweaks can be made to handle the quality of parameter estimates, for example, a protocol to determine how well a parameter is actually converging. Developing useful heuristics for shifting the relative weighting term, γ, is another interesting area of future progress, along with a more thorough analysis of the guarantees provided by this control method.

Figure 7-1: Future work will aim to address more complex scenarios, including a desired use case of manipulation of uncertain grappled targets.

Development of experiments using the SPHERES or Astrobee testing platforms onboard the International Space Station is under consideration, which would provide 6 DoF hardware validation for both the sampling-based planning and planning for information gain approaches. Hardware experiments on the ground with simulated parameter uncertainty or with obstacle environments are useful future demonstrations. Additionally, the experiments conducted in Chapters 4 and 5 have not yet demonstrated one of the key advantages of real-time trajectory optimization: obstacle avoidance. Additional useful test scenarios (e.g., with obstacles) would be worth analyzing, and comparisons against more traditional approaches are worth considering, i.e., those that ignore uncertainty, or that use motion planning not well-suited to cluttered environments.

Finally, more work remains in defining protocols and classifications for microgravity space robots. As on-orbit servicing and assembly become commonplace, standardization and careful consideration of the role of autonomy will become ever more essential, for example, standard interfaces for autonomous grappling. Further, future algorithmic developments must be carefully considered for the space domain to ensure safe, effective robots that enable mission success.


Appendix A

Launch Vehicle and Component Tables

A.1 20 m In-Space Assembled Telescope

Component                          Mass [kg]  Quantity  Order Priority  Length [m]  Width [m]  Height [m]
Spacecraft Bus                     4278       1         1               3           3          0.5
Transition Structure               633        1         1               4.5         4.5        1
PM Raft Structure                  50         37        1               3.1         3.1        0.3
PM/SM Metering Truss, fifths       709        5         1               5           5          1
Instrument Support Truss, fifths   584.6      5         1               4.5         4.5        1
SM Support Structure               65         1         1               4.2         4.2        0.35
SI Radiator                        657        1         2               1           1          1
Sunshade, PM Side                  1267       4         2               0.6         0.6        1
Sunshade, SI Side                  2240       4         2               1           1          1
Mirror Raft                        910        37        2               3.1         3.1        0.5
SM Enclosure                       4355       1         2               2.8         2.8        1
SM Module                          403        1         3               2.1         2.1        2.1
AOS Module                         12818      1         3               6           6          1
Instrument 1, Thirds               937        3         3               6           3.5        4
Instrument 2, Fourths              615        4         3               6           3.5        3.5
Instrument 3, Fourths              614        4         3               6           3.4        3.6
Instrument 4, Fourths              603        4         3               6           3.25       3.7
Instrument 5, Thirds               849        3         3               5.6         4          3.4
SI Structure                       9466       1         3               2.92        2.92       1

A.2 Launch Vehicles

Launch Vehicle   Approximate Cost [million $]  Payload Diameter [m]  Fairing Height [m]
F9 Block FT      62                            4.6                   6.7
FHeavy           100                           4.6                   6.6
SLS              1200                          7.5                   19.1
Delta IV Heavy   350                           4.57                  12
Atlas V 551      92                            4.57                  7.63
Antares          80                            3.9                   7.52

Appendix B

Robust Optimization for Satellite Control

This appendix presents an implementation of robust optimization-based control for free-flying satellite dynamics with process noise. Applying the results of robust optimization (RO) to control problems is still relatively new to the community. An RO approach has benefits including explicit incorporation of constraints and guarantees on cost given an uncertainty set on probabilistic disturbances. RO is a promising way to formulate operating optimally with uncertain dynamics while allowing control over the "probabilistic protection" of the cost function. As an additional benefit, these approaches can incorporate state and input constraints, unlike LQR solutions.

B.1 Robust Optimization Primer

System uncertainty enters optimization problems in various ways; for instance, constraints may have constant terms that are represented according to some assumed probability distribution. Robust optimization deals with solving problems with these uncertain parameters within a bounded uncertainty set:

min_x { f_0(x) : f_i(a, x) ≤ 0, ∀ a ∈ U, ∀ i }    (B.1)

In this case, the uncertainty set U represents an assumed set of possible parameter values. This set might be obtained by setting a γ that defines the range of values that can occur, effectively guaranteeing that the solution must be feasible for these parameter values. Normally, such a problem must be solved using an infinite number of constraints covering all values contained in the uncertainty set; this can be thought of as optimizing for the "worst case" occurrence of the constraints. However, for certain problem classes a robust counterpart can be obtained, which turns such problems into computationally tractable ones. Robust counterparts are arrived at using tools from convex optimization. The interested reader is referred to [88] for an in-depth explanation of robust optimization and the formation of robust counterparts.
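As a concrete illustration of a robust counterpart (a standard textbook example, not one derived in this thesis), a single linear constraint under an ellipsoidal uncertainty set on its coefficient vector collapses to one deterministic second-order cone constraint:

```latex
% Robust linear constraint over a norm-ball uncertainty set.
% The worst case over the ball is attained at a = \bar{a} + \gamma x / \|x\|_2,
% so the semi-infinite constraint collapses to a single deterministic one:
a^\top x \le b \quad \forall\, a \in \{\bar{a} + \gamma u : \|u\|_2 \le 1\}
\quad\Longleftrightarrow\quad
\bar{a}^\top x + \gamma \|x\|_2 \le b .
```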

B.2 Robust Receding Horizon Control for 3 DoF Dynamics

Optimization for stochastic LQR (LQR with stochastic process dynamics) is optimal only in the sense of the expected value of the resulting cost. One may desire, for instance, guarantees on cost performance in the face of noise in the dynamics. This section implements such a scheme, using Bertsimas' robust optimization solution to the problem [89]. A desired use case, linear satellite dynamics with Gaussian process noise, is demonstrated using two approaches and compared against the commonly implemented LQR solution.

B.2.1 Problem Formulation

A system with state x ∈ R^n and input u ∈ R^m can be modeled as uncertain by incorporating an uncertainty vector w ∈ R^{n_w} into its dynamics. A quadratic cost function is considered for the system, shown below:

J = ∫_0^{t_f} ( xᵀ Q x + uᵀ R u ) dt    (B.2)

This cost function applies a penalty for not being at the desired state and for applying control input. For the purposes of optimization, this equation will be discretized as a summation:

J = Σ_{i=1}^{N} ( x_iᵀ Q_i x_i + u_{i-1}ᵀ R_i u_{i-1} )    (B.3)

The state dynamics are a function of the current state, the control input, and the uncertainty, and are assumed to be linear:

ẋ = f(x, u, w) = Ax + Bu + Cw    (B.4)

The dynamics may also be discretized:

x_{n+1} = x_n + f(x_n, u_n) dt    (B.5)
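A minimal sketch of this discretization, propagating the linear dynamics of (B.4) with the Euler step of (B.5). The matrices, dimensions, and the zeroed noise sample are illustrative choices, not values from the thesis:

```python
def euler_step(x, u, w, dt, A, B, C):
    """One forward-Euler step of x' = Ax + Bu + Cw (Eq. B.5). Plain-list
    matrix math keeps the sketch dependency-free; dimensions are assumed
    consistent with the example below."""
    def mv(M, v):  # matrix-vector product
        return [sum(m * vi for m, vi in zip(row, v)) for row in M]
    return [xi + (ax + bu + cw) * dt
            for xi, ax, bu, cw in zip(x, mv(A, x), mv(B, u), mv(C, w))]

# 1-D double integrator: state [position, velocity], one thruster input.
A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0], [1.0]]
C = [[0.0], [1.0]]  # process noise would enter as an extra acceleration
x = [0.0, 0.0]
for _ in range(10):  # ten 0.1 s steps at constant unit thrust
    x = euler_step(x, [1.0], [0.0], 0.1, A, B, C)  # noise zeroed for repeatability
print(x)  # position and velocity after 1 s
```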

B.2.2 Creating a Robust Solution

An uncertainty set bounding the noise vector is given by:

W_γ = { w : ‖w‖₂ ≤ γ }    (B.6)

It can be shown that a receding horizon formulation (see Chapter 2) of the discrete dynamics results in a problem for which a robust counterpart can be developed. The first such result is a semi-definite program, which grows excessively large for online computation within as few as N = 5 timesteps forward:

min_{y, z, λ}  z

subject to

⎡  λI − FᵀF          −(h + Fᵀy)    ⎤
⎢                                  ⎥ ⪰ 0    (B.7)
⎣  −(h + Fᵀy)ᵀ    z − yᵀy − γ²λ    ⎦

λ ≥ 0

The decision variable of interest here, y, can be converted into an input vector, of which only the first m entries are taken, corresponding to the first input over the time horizon.

u = B^{−1/2} y − B^{−1} b    (B.8)

Finally, a second approach formulates the robust counterpart as a second-order cone program (SOCP), which is significantly faster to solve but yields worse cost performance.

min z

subject to

‖y‖₂² ≤ z − γs    (B.9)

|2(h + Fᵀy)ᵀ e_i| ≤ t_i,    i = 1, …, N·n_w

‖t‖₂ ≤ s

The reader is referred to [89] for a detailed derivation of the robust counterparts of this problem, along with definitions of notation.

B.2.3 Results

The SDP and SOCP approaches were implemented using the MOSEK Fusion MATLAB API, and the results were compared against a baseline LQR for various levels of noise and γ. The results serve as a verification of [89] for this system, demonstrating the process on a higher-dimensional state space (n = 6) with implications for a real satellite system. The results are shown below, with the SDP and SOCP approaches compared cost-wise against the LQR baseline. Stability relative to LQR was improved, e.g., with a relative disturbance from the goal at w = 0.1 of 0.2 m for the SOCP compared to 0.5 m for LQR. Matching Bertsimas, larger uncertainty sets decreased system performance relative to LQR, particularly for larger γ; the benefit, however, comes in terms of stability relative to the goal point. As expected, the SDP approach, while slower, leads to better cost performance than the SOCP.

γ       J (w = 0.001)   J (w = 0.01)   J (w = 0.1)
0.001   3.3261          3.2586         4.0336
0.01    3.3263          3.3373         4.9688
0.1     3.3817          3.4195         5.5989
1.0     3.9098          3.9493         9.8329
10      3.9124          4.1376         9.0717
LQR     4.0585          4.0706         5.999

Figure B-1: The results for the closed-loop SDP formulation for varying levels of w and γ, compared against the average LQR baseline.

γ       J (w = 0.001)   J (w = 0.01)   J (w = 0.1)
0.001   3.3281          3.3619         6.0897
0.01    3.3290          3.3597         6.1661
0.1     3.3540          3.3448         6.6999
1.0     3.7308          3.9813         8.8506
10      4.9254          4.9813         11.5006
LQR     4.0585          4.0706         5.999

Figure B-2: The results for the closed-loop SOCP formulation for varying levels of w and γ, compared against the average LQR baseline.

γ       J (w = 0.001)   J (w = 0.01)   J (w = 0.1)
0.001   0.06            3.17           50.97
0.01    0.08            0.67           24.10
0.1     0.82            2.18           19.66
1.0     4.58            0.81           9.99
10      25.89           20.39          26.77

Figure B-3: Percent cost difference of SOCP relative to SDP.

Application on actual hardware is a logical next step. Additionally, new techniques in adaptive robust optimization could be applied, as the time-horizon nature of this problem seems to lend itself to adaptive decision rules. The results of this section are presented as an initial step toward incorporating uncertainty-aware solutions into deployed hardware. This particular receding horizon approach benefits from the rigorous analysis of robust optimization: using a higher γ to bound the uncertainty set leads to a higher total cost in the solution, but also a stronger guarantee of stability in the presence of noise. The approach allows a higher level of decision-making over how the cost function probability distribution should be treated, rather than LQR's optimal expected-value formulation.

Appendix B Acknowledgments

Alex Steighner worked on this implementation with the author. The author primarily created the dynamics and implemented the robust formulations in MOSEK, and Alex Steighner obtained the results shown in the figures.

Appendix C

The SPHERES Testbed

For reference, this appendix provides a brief discussion of some of the author's additional work involving the SPHERES testbed. It also includes practical advice from working on multiple on-orbit test sessions.

C.1 LQR Position Controllers for SPHERES

An LQR controller was developed for SPHERES ground use, based on idealized system dynamics. (The linear quadratic regulator is presented in Chapter 2.) Using double integrator dynamics and a nominal SPHERES-plus-air-carriage mass of m = 12.09 kg, this simple model is:

A = [ 0_{3×3}  I_3 ; 0_{3×3}  0_{3×3} ],    B = [ 0_{3×3} ; (1/m) I_3 ]

C = I_6,    D = 0_{6×3}

"Slow" and "fast" gain values for two different possible operating modes were developed, with simulation results shown in Figures C-1 and C-2. The ρ relative weighting




Figure C-1: Fast gain simulation response.




Figure C-2: Slow gain simulation response.

between Q and R was used to trade off weighting between state error and input use. The resulting gains were:

K_fast = [ 0.7955  0       0       5.1251  0       0
           0       0.7955  0       0       5.1251  0
           0       0       0.7955  0       0       5.1251 ]

Q = diag(4.5, 4.5, 4.5, 0.375, 0.375, 0.375),    u_max = 0.25 N,    ρ = 0.5

K_slow = [ 0.7955  0       0       5.1251  0       0
           0       0.7955  0       0       5.1251  0
           0       0       0.7955  0       0       5.1251 ]

Q = diag(5, 5, 5, 0.33, 0.33, 0.33),    u_max = 0.2 N,    ρ = 15
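For reference, a gain of this kind can be reproduced by iterating the discrete-time Riccati recursion; the sketch below does so for a single-axis double integrator in dependency-free Python. The discretization step, cost weights, and helper names are illustrative, and the thesis gains were produced by a separate continuous-time design, so the numbers will differ.

```python
def dlqr_gain(A, B, Q, R, iters=2000):
    """Infinite-horizon discrete LQR gain for a 2-state, 1-input system,
    found by iterating the Riccati recursion to convergence."""
    def mm(X, Y):  # small dense matrix product
        return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
                 for j in range(len(Y[0]))] for i in range(len(X))]
    def t(X):      # transpose
        return [list(r) for r in zip(*X)]
    P = [row[:] for row in Q]
    K = [0.0, 0.0]
    for _ in range(iters):
        BtP = mm(t(B), P)
        s = R + mm(BtP, B)[0][0]             # scalar R + B'PB
        K = [v / s for v in mm(BtP, A)[0]]   # K = (R + B'PB)^-1 B'PA
        Acl = [[A[i][j] - B[i][0] * K[j] for j in range(2)] for i in range(2)]
        PA = mm(mm(t(Acl), P), Acl)
        P = [[Q[i][j] + R * K[i] * K[j] + PA[i][j] for j in range(2)]
             for i in range(2)]              # Joseph-form Riccati update
    return K

dt, m = 0.1, 12.09                  # illustrative step; SPHERES + carriage mass
A = [[1.0, dt], [0.0, 1.0]]         # discretized 1-axis double integrator
B = [[0.0], [dt / m]]
K = dlqr_gain(A, B, [[4.5, 0.0], [0.0, 0.375]], 1.0)
print(K)  # [position gain, velocity gain]
```

Simulating x⁺ = Ax + Bu with u = −Kx then verifies that the resulting closed loop drives the state to the origin.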

After many trials on the SPHERES 3 DoF glass table, some representative data sets were acquired. Figures C-3 and C-4 show rather good tracking for the commanded test routine, which commanded the SPHERES satellite to move through four points in the x-y plane: (0, 0), (0, 0.25), (-0.25, 0.25), and (-0.25, 0). Faster responses during each 30 second maneuver were visible compared to the slower gains, which had a higher weight, ρ, placed on fuel usage (15 as opposed to 0.5). These results matched expectations from the SPHERES simulation, which showed a markedly faster response to commands for the fast gains, reaching commanded targets in no more than about 10 seconds as opposed to approximately 15 seconds for the slow gains.


Figure C-3: Hardware position and velocity states, slow gains.


Figure C-4: Hardware position and velocity states, fast gains.

On the SPHERES hardware, both gain sets showed slower responses, perhaps due to table friction or slant, but the fast gain responses were faster overall. A slightly higher thrust limit for the fast gains, 0.25 N as opposed to 0.2 N, was likely also responsible for better recovery against adverse table conditions (i.e., starting in a high corner and drifting, seen in Figures C-3 and C-4 during maneuver three). This accessible LQR controller for SPHERES position control could be a welcome addition

125 to the default Space Systems Lab controller suite.

C.2 Lessons Learned and Practical Advice

After participating in multiple SPHERES test sessions over two years on the SPHERES team, along with assisting in code preparation for these sessions, the author would like to share a few nuggets of wisdom that are easily overlooked on long-lived, knowledge-critical engineering projects. Two of the most significant factors that have come up again and again are: (1) the importance of being prepared; and (2) the importance of documenting and passing on knowledge.

First, the importance of being prepared. For SPHERES, and for any time-critical field-testing application, being prepared before crunch time hits is essential. Sometimes this is codified in procedure: a flight readiness review for flying an experimental aircraft, for example, or adhering to ISS documentation procedures to ensure crew safety and mission success. Ultimately, however, much of the preparation comes down to personal accountability for checking that all cracks are filled; procedure can only go so far.

As an example, test sessions frequently involve back-and-forth chatter about procedure and setup. There is usually a team member who is extraordinarily knowledgeable about the test being performed, and yet there is no codified rule that an individual must have every possible aspect of their test memorized. Personal attention to detail ensures that the likely trouble spots are well understood, that backup procedures are already in mind, and that anticipated problems are thought about ahead of time. This way, if a time-critical question comes to the ground, an answer is ready or a process is in mind to find a solution. Essentially: there are many situations procedure will not cover, and common sense and self-preparation can help to ensure success in spite of this.

Finally, the importance of passing on knowledge. SPHERES is a long-lived project, first conceived in 1999, with an accompanyingly massive amount of documentation.
Historically, much of this information has been passed on from grad student to grad student, and spurts of it have been documented in different notations, in different places, with different levels of rigor. Usually, the chain of in-person knowledge-passing is enough to make sure collective group memory is maintained. But, inevitably, bits of information are lost over time in a game of information telephone, passing from person to person. For example, the SPHERES team recently graduated quite a few senior members. While this was a blow to the on-hand knowledge base, it was mostly compensated for by good prior documentation.

Documentation is extraordinarily helpful, not only in fighting knowledge loss over time but in allowing knowledge transfer to go even faster, since team members can reference information at their own convenience. That said, documentation must be clear, complete, and accessible to be useful. There are many instances where certain gems of knowledge are hidden away in the corner of a database, impossible to find. This is likely a problem in any program or project with a long lifespan, particularly one whose founding team members have moved on. Awareness of the inherent difficulties in keeping knowledge alive can go a long way toward ensuring a team's success.

Bibliography

[1] E. Guizzo, "Boston Dynamics' Atlas Robot Shows Off Parkour Skills," 2018.

[2] S. LaValle, "Randomized Kinodynamic Planning," 2001.

[3] M. O. Hilstad, J. P. Enright, A. G. Richards, and S. Mohan, "The SPHERES Guest Scientist Program," pp. 1-45, 2010.

[4] M. O. Hilstad, J. P. Enright, and A. G. Richards, "The SPHERES Facility Description," 2013.

[5] NASA, "SPHERES with Astronaut Scott Kelly," 2016.

[6] T. Smith, J. Barlow, M. Bualat, T. Fong, C. Provencher, H. Sanchez, and E. Smith, "Astrobee: A New Platform for Free-Flying Robotics on the ISS," 2016.

[7] L. Flückiger, K. Browne, B. Coltin, J. Fusco, T. Morse, and A. Symington, "Astrobee Robot Software: Enabling Mobile Autonomy on the ISS," tech. rep.

[8] M. Quigley, B. Gerkey, K. Conley, J. Faust, T. Foote, J. Leibs, E. Berger, R. Wheeler, and A. Ng, "ROS: an open-source Robot Operating System."

[9] NASA, "Astrobee on a Granite Table," 2016.

[10] NASA STMD, NASA Space Technology Roadmaps and Priorities. 2012.

[11] A. Flores-Abad, O. Ma, K. Pham, and S. Ulrich, "A review of space robotics technologies for on-orbit servicing," Progress in Aerospace Sciences, vol. 68, pp. 1-26, 2014.

[12] DARPA, "Various DARPA servicing projects," 2016.

[13] P. J. Staritz, S. Skaff, C. Urmson, and W. Whittaker, "Skyworker: A Robot for Assembly, Inspection and Maintenance of Large Scale Orbital Facilities," Proc. of the 2001 IEEE International Conference on Robotics and Automation, 2001.

[14] D. Akin and R. Howard, "Telerobotic Operations Testing in Neutral Buoyancy Simulation," pp. 42-49, 2005.

[15] K. Hogstrom, "Robotically Assembled Space Telescopes with Deployable Modules: Concepts and Design Methodologies," 2017.

[16] B. P. McCarthy, A. Saenz-Otero, and D. W. Miller, "Flight Hardware Development for a Space-based Robotic Assembly and Servicing Testbed," 2014.

[17] NASA, "Solar Max," 2016.

[18] B. B. Reed, R. C. Smith, B. Naasz, J. Pellegrino, and C. Bacon, "The Restore-L Servicing Mission," 2018.

[19] K. Yoshida, K. Hashizume, and S. Abiko, "Zero Reaction Maneuver: Flight Validation with ETS-VII Space Robot and Extension to Kinematically Redundant Arm," pp. 441-446, 2001.

[20] R. McCormick, A. Austin, K. Wehage, S. Backus, R. Miller, J. Leith, B. Bradley, P. Durham, and R. Mukherjee, "REMORA CubeSat for Large Debris Rendezvous, Attachment, Tracking, and Collision Avoidance," 2018.

[21] A. Bylard, R. MacPherson, B. Hockman, M. R. Cutkosky, and M. Pavone, "Robust Capture and Deorbit of Rocket Body Debris Using Controllable Dry Adhesion," 2016.

[22] NASA, "STS-51a Satellite Capture," 2016.

[23] J. Wagenknecht, S. Fredrickson, T. Manning, and B. Jones, "Design, development and testing of the miniature autonomous extravehicular robotic camera (Mini AERCam) guidance, navigation, and control system," Advances in the Astronautical Sciences, vol. 113, pp. 187-205, 2003.

[24] Y. Gawdiak, J. Bradshaw, B. Williams, and H. Thomas, "R2D2 in a Softball: The Portable Satellite Assistant," 2000.

[25] I. W. Park, T. Smith, H. Sanchez, S. W. Wong, P. Piacenza, and M. Ciocarlie, "Developing a 3-DOF compliant perching arm for a free-flying robot on the International Space Station," IEEE/ASME International Conference on Advanced Intelligent Mechatronics (AIM), pp. 1135-1141, 2017.

[26] P. Lehner, S. Brunner, A. Doemel, H. Gmeiner, S. Riedel, B. Vodermayer, and A. Wedler, "Mobile Manipulation for Planetary Exploration," 2018.

[27] A. Parness, N. Abcouwer, C. Fuller, N. Wiltsie, J. Nash, and B. Kennedy, "LEMUR 3: A limbed climbing robot for extreme terrain mobility in space," Proceedings - IEEE International Conference on Robotics and Automation, pp. 5467-5473, 2017.

[28] A. Curtis, M. Martone, and A. Parness, "Roving on ice: Field testing an Ice Screw End Effector and sample collection tool," IEEE Aerospace Conference Proceedings, vol. 2018-March, pp. 1-17, 2018.

[29] T. Bretl, J.-C. Latombe, and S. Rock, "Toward Autonomous Free-Climbing Robots," 2002.

[30] J. Biesiadecki and M. Maimone, "The Mars Exploration Rover Surface Mobility Flight Software: Driving Ambition," 2006 IEEE Aerospace Conference, pp. 1-15, 2006.

[31] M. Pavone and J. Starek, "Spacecraft Autonomy Challenges for Next Generation Space Missions," in Advances in Control System Technology for Aerospace Applications, pp. 1-34, 2014.

[32] S. Russell and P. Norvig, Artificial Intelligence: A Modern Approach. 2013.

[33] D. E. Kirk, "Optimal Control Theory," 1971.

[34] W. B. Powell, "AI, OR and Control Theory: A Rosetta Stone for Stochastic Optimization," tech. rep., 2012.

[35] M. Elbanhawi and M. Simic, "Sampling-Based Robot Motion Planning: A Review," IEEE Access, vol. 2, pp. 56-77, 2014.

[36] B. Paden, M. Cap, S. Z. Yong, D. Yershov, and E. Frazzoli, "A Survey of Motion Planning and Control Techniques for Self-driving Urban Vehicles."

[37] C. E. Garcia, D. M. Prett, and M. Morari, "Model Predictive Control: Theory and Practice - a Survey," Automatica, vol. 25, no. 3, pp. 335-338, 1989.

[38] J.-J. E. Slotine and W. Li, Applied Nonlinear Control.

[39] K. J. Åström and R. M. Murray, Feedback Systems: An Introduction for Scientists and Engineers. 2008.

[40] R. Bellman, "Dynamic Programming," Science, vol. 153, no. 3731, pp. 34-37, 1966.

[41] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004.

[42] R. Tedrake, "Underactuated Robotics: Algorithms for Walking, Running, Swimming, Flying, and Manipulation (Course Notes for MIT 6.832)."

[43] A. Rao, "Trajectory Optimization: A Survey," in Optimization and Optimal Control in Automotive Systems, vol. 455 of LNCIS, ch. 1, 2014.

[44] J. T. Betts, "Survey of Numerical Methods for Trajectory Optimization," Journal of Guidance, Control, and Dynamics, vol. 21, no. 2, pp. 193-207, 1998.

[45] M. Kelly, "An Introduction to Trajectory Optimization: How to Do Your Own Direct Collocation," SIAM Review, vol. 59, no. 4, pp. 849-904, 2017.

[46] M. Zucker, N. Ratliff, A. D. Dragan, M. Pivtoraiko, M. Klingensmith, C. M. Dellin, J. A. Bagnell, and S. S. Srinivasa, "CHOMP: Covariant Hamiltonian Optimization for Motion Planning," tech. rep.

[47] S. Boyd, "Sequential Convex Programming."

[48] A. Majumdar and R. Tedrake, "Funnel Libraries for Real-Time Robust Feedback Motion Planning," pp. 1-60, 2017.

[49] A. Bry and N. Roy, "Rapidly-exploring Random Belief Trees for Motion Planning Under Uncertainty," 2011.

[50] S. Thrun, "Probabilistic Robotics," Communications of the ACM, vol. 45, no. 3, 2002.

[51] J. Diebel, "Representing Attitude: Euler Angles, Unit Quaternions, and Rotation Vectors," tech. rep., 2006.

[52] K. Großekatthöfer and Z. Yoon, "Introduction into quaternions for spacecraft attitude representation," TU Berlin, pp. 1-16, 2012.

[53] P. Furgale, "Representing Robot Pose: the Good, the Bad, and the Ugly," in ICRA 2014.

[54] S. Dubowsky, "A Virtual Manipulator Model for Space Robotic Systems."

[55] S. Dubowsky and E. Papadopoulos, "The Kinematics, Dynamics, and Control of Free-Flying and Free-Floating Space Robotic Systems," IEEE Transactions on Robotics and Automation, 1993.

[56] J. Virgili-Llop, J. V. Drew II, and M. Romano, "Spacecraft Robotics Toolkit: An Open-Source Simulator for Spacecraft Robotic Arm Dynamic Modeling and Control."

[57] S. Karaman and E. Frazzoli, "Sampling-based Algorithms for Optimal Motion Planning," The International Journal of Robotics Research, 2011.

[58] S. LaValle, "Randomized Kinodynamic Planning (IEEE)."

[59] A. Perez, R. Platt, G. Konidaris, L. Kaelbling, and T. Lozano-Perez, "LQR-RRT*: Optimal Sampling-Based Motion Planning with Automatically Derived Extension Heuristics," tech. rep.

[60] D. J. Webb and J. Van Den Berg, "Kinodynamic RRT*: Optimal Motion Planning for Systems with Linear Differential Constraints."

[61] R. Zappulla II and M. Romano, "Convex Optimization for Proximity Maneuvering of a Spacecraft with a Robotic Manipulator," 2017.

[62] S. Boyd, "Model Predictive Control."

[63] J. Swevers, C. Ganseman, D. B. Tukel, J. De Schutter, and H. Van Brussel, "Optimal robot excitation and identification," IEEE Transactions on Robotics and Automation, vol. 13, no. 5, pp. 730-740, 1997.

[64] A. T. Espinoza and A. Saenz-Otero, "Probabilistic and Learning Approaches through Concurrent Parameter Estimation and Adaptive Control for In-Space Robotic Assembly," 2017.

[65] A. T. Espinoza and W. Sanchez, "On-board parameter learning using a model reference adaptive position and attitude controller," IEEE Aerospace Conference Proceedings, pp. 1-8, 2017.

[66] D. Simon, "Optimal State Estimation," 2006.

[67] A. D. Wilson, J. A. Schultz, and T. D. Murphey, "Trajectory synthesis for Fisher information maximization," IEEE Transactions on Robotics, vol. 30, no. 6, pp. 1358-1370, 2014.

[68] A. D. Wilson, J. A. Schultz, A. R. Ansari, and T. D. Murphey, "Real-time Trajectory Synthesis for Information Maximization using Sequential Action Control and Least-Squares Estimation," 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 4935-4940, 2015.

[69] A. Weiss, M. Baldwin, R. S. Erwin, and I. Kolmanovsky, "Model predictive control for spacecraft rendezvous and docking: Strategies for handling constraints and case studies," IEEE Transactions on Control Systems Technology, vol. 23, no. 4, pp. 1638-1647, 2015.

[70] H. Park, S. Di Cairano, and I. V. Kolmanovsky, "Model Predictive Control for Spacecraft Rendezvous and Docking with a Rotating/Tumbling Platform and for Debris Avoidance," American Control Conference, pp. 1922-1927, 2011.

[71] J. L. Crassidis and F. L. Markley, "Unscented filtering for spacecraft attitude estimation," Journal of Guidance, Control, and Dynamics, vol. 26, no. 4, pp. 536-542, 2003.

[72] B. Houska, H. Ferreau, and M. Diehl, "ACADO Toolkit - An Open Source Framework for Automatic Control and Dynamic Optimization," Optimal Control Applications and Methods, vol. 32, no. 3, pp. 298-312, 2011.

[73] K. Albee, M. Ekal, R. Ventura, and R. Linares, "Combining Parameter Identification and Trajectory Optimization: Real-time Planning for Information Gain," in ESA ASTRA 2019, no. 80, 2019.

[74] C. Jewison, D. Sternberg, B. Mccarthy, D. W. Miller, and A. Saenz-Otero, "Definition and Testing of an Architectural Tradespace for On-Orbit Assemblers and Servicers," 65th International Astronautical Congress.

[75] D. W. Miller, "Image of an EMFF Servicer." Personal communication, 2019.

[76] S. Basu, T. S. Mast, and G. T. Miyata, "A Proposed Autonomously Assembled Space Telescope," in Space 2003, 2003.

[77] C. F. Lillie, "Large deployable telescopes for future space observatories," UV/Optical/IR Space Telescopes: Innovative Technologies and Concepts II, vol. 5899, p. 58990D, 2005.

[78] W. R. Oegerle, L. R. Purves, J. G. Budinoff, R. V. Moe, T. M. Carnahan, D. C. Evans, and C. K. Kim, "Concept for a large scalable space telescope: in-space assembly," Space Telescopes and Instrumentation I: Optical, Infrared, and Millimeter, vol. 6265, p. 62652C, 2006.

[79] L. D. Feinberg, J. Budinoff, H. MacEwen, G. Matthews, and M. Postman, "Modular assembled space telescope," Optical Engineering, vol. 52, no. 9, p. 091802, 2013.

[80] R. S. Polidan, J. B. Breckinridge, C. F. Lillie, H. A. MacEwen, M. R. Flannery, and D. R. Dailey, "Innovative telescope architectures for future large space observatories," Journal of Astronomical Telescopes, Instruments, and Systems, vol. 2, no. 4, p. 041211, 2016.

[81] C. Saunders, D. Lobb, M. Sweeting, and Y. Gao, "Building large telescopes in orbit using small satellites," Acta Astronautica, vol. 141, pp. 183-195, 2017.

[82] D. W. Miller, S. Mohan, and J. Budinoff, "Assembly of a Large Modular Optical Telescope (ALMOST)," Space Telescopes and Instrumentation 2008: Optical, Infrared, and Millimeter, vol. 7010, p. 70102H, 2008.

[83] L. O. Morgan, A. R. Morton, and R. L. Daniels, "Simultaneously determining the mix of space launch vehicles and the assignment of satellites to rockets," 2005.

[84] E. L. Gralla, W. Nadir, H. Mamani, and O. de Weck, "Optimal Launch Vehicle Size Determination for Moon-Mars Transportation Architectures," 2005.

[85] "TITANS AE: Tradespace Investigation of a Telescope Architecture for Next-generation Space Astronomy and Exploration," 2013.

[86] US Federal Aviation Administration, "The Annual Compendium of Commercial Space Transportation: 2018," 2018.

[87] J. Reif, "Complexity of the Generalized Mover's Problem."

[88] D. Bertsimas and D. D. Hertog, "Robust and Adaptive Optimization," 2019.

[89] D. Bertsimas and D. B. Brown, "Constrained stochastic LQC: A tractable approach," IEEE Transactions on Automatic Control, vol. 52, no. 10, pp. 1826-1841, 2007.
