Hybridized Spacecraft Attitude Control Via Reinforcement Learning Using Control Moment Gyroscope Arrays
Air Force Institute of Technology, AFIT Scholar
Theses and Dissertations, Student Graduate Works, 3-2021

Recommended Citation:
Agu, Cecily C., "Hybridized Spacecraft Attitude Control via Reinforcement Learning using Control Moment Gyroscope Arrays" (2021). Theses and Dissertations. 4983. https://scholar.afit.edu/etd/4983

This thesis is brought to you for free and open access by the Student Graduate Works at AFIT Scholar. It has been accepted for inclusion in Theses and Dissertations by an authorized administrator of AFIT Scholar. For more information, please contact [email protected].

THESIS
Cecily C. Agu, First Lieutenant, USAF
AFIT/ENY/MS/21-M-328

DEPARTMENT OF THE AIR FORCE
AIR UNIVERSITY
AIR FORCE INSTITUTE OF TECHNOLOGY
Wright-Patterson Air Force Base, Ohio

DISTRIBUTION STATEMENT A: APPROVED FOR PUBLIC RELEASE; DISTRIBUTION UNLIMITED.

The views expressed in this document are those of the author and do not reflect the official policy or position of the United States Air Force, the United States Department of Defense, or the United States Government. This material is declared a work of the U.S. Government and is not subject to copyright protection in the United States.

Presented to the Faculty of the Graduate School of Engineering and Management, Air Force Institute of Technology, Air University, Air Education and Training Command, in partial fulfillment of the requirements for the degree of Master of Science in Astronautical Engineering.

Cecily C. Agu, B.S.A.E.
First Lieutenant, USAF
March 25, 2021

Committee Membership:
Major Joshuah A. Hess, Ph.D. (Chair)
Major Costantinos Zagaris, Ph.D. (Member)
Dr. Richard G. Cobb (Member)

Abstract

Next-generation spacecraft face increased requirements for precise pointing accuracy and rotational agility to accomplish a variety of missions. Current attitude determination and control systems (ADCS) employ multiple types of actuators, such as reaction wheels, control moment gyroscopes (CMGs), and thrusters, to maneuver spacecraft through minimum-time slews between orientations. Rapid retargeting is of practical importance in spacecraft design, and CMGs are an effective option for achieving it. Traditional methods for solving the general attitude control problem and conducting control allocation combine open-loop optimal control algorithms, such as dynamic programming, to optimize overall system performance with state-feedback control to track the optimal solution. A limitation of these methods is their difficulty in handling realistic system constraints, such as rate constraints, while providing solutions with guaranteed stability and accuracy. Machine learning techniques in the form of reinforcement learning (RL) can address these complex nonlinear problems by serving as a potential controller for the ADCS. The current study examines the use of an RL agent in an ADCS to enhance the performance of a spacecraft conducting multiple slewing maneuvers to gather ground-target information. Three CMG arrays were implemented in two simulated spacecraft environments using an RL controller.
The performance of the controllers was evaluated using target profiles from traditional control-law implementations, the singularity measure, and two sets of initial state values. The current research demonstrates that while RL techniques can be implemented, further exploration is needed to determine whether an approach such as hybridized control can deliver performance comparable to traditional control laws.

Table of Contents

Abstract
List of Figures
List of Tables
I. Introduction
  1.1 Motivation
  1.2 Problem Statement
  1.3 Objectives, Tasks, and Scope
    1.3.1 Objectives
    1.3.2 Tasks
    1.3.3 Assumptions and Scope
  1.4 Research Overview
II. Background
  2.1 Control Theory
    2.1.1 Equations of Motion
    2.1.2 Active Control
    2.1.3 Control Laws
  2.2 Singularity Avoidance
    2.2.1 Singularity Types
    2.2.2 Steering Laws
    2.2.3 Techniques and Methods
  2.3 Machine Learning Overview
    2.3.1 Supervised Learning and Neural Networks
    2.3.2 Reinforcement Learning
    2.3.3 Reinforcement Learning and Optimal Control
    2.3.4 Spacecraft Control Authority with Machine Learning
  2.4 Training and Software Optimization
III. Methodology
  3.1 Spacecraft Mission Description
    3.1.1 Mission Environment
    3.1.2 Simulated Spacecraft Parameters
    3.1.3 Hardware and Software
  3.2 Simulation Model Closed-Loop Control Scheme
    3.2.1 Observation and Action Spaces
    3.2.2 Dynamic Equations
    3.2.3 Singularity Avoidance
    3.2.4 Target Collection
  3.3 Reinforcement Learning Control Structure
    3.3.1 Implementation and Model
    3.3.2 Policy Description
    3.3.3 Reward Function Description
  3.4 Problem Formulation
    3.4.1 Problem A Methodology
    3.4.2 Problem A Evaluation
    3.4.3 Problem B Methodology
    3.4.4 Problem B Evaluation
IV. Results and Analysis
  4.1 Problem A Results
    4.1.1 Baseline Results
    4.1.2 Reinforcement Learning Training Results
    4.1.3 Agent Performance Evaluation
  4.2 Problem B Results
    4.2.1 Baseline Results
    4.2.2 Agent Performance Evaluation
  4.3 General Performance Evaluations
V. Conclusions
  5.1 Summary
    5.1.1 Problem A
    5.1.2 Problem B
  5.2 Contributions and Recommendations
  5.3 Future Work
  5.4 Final Remarks
Appendix A. Twin Delayed DDPG (TD3) Algorithm
Appendix B. Problem B Baseline State Profiles
Appendix C. Initial State Values for SimSat, PHR, and CMG Arrays
Appendix D. Reward Function Formulations
Appendix E. Problem A VSCMG and RWCMG Baseline Results
Appendix F. Problem A RL Training Results - All CMG Arrays
Bibliography

List of Figures

Figure 1. Various CMG arrays [1].
Figure 2. Pyramid mounting arrangement of four single-gimbal CMGs [2].