Deformable Object Manipulation: Learning While Doing

by

Dale McConachie

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Robotics) in The University of Michigan 2020

Doctoral Committee:
Associate Professor Dmitry Berenson, Chair
Professor Jessie Grizzle
Associate Professor Chad Jenkins
Professor Leslie Pack Kaelbling, Massachusetts Institute of Technology

Dale McConachie
[email protected]
ORCID iD: 0000-0002-2615-3473

© Dale McConachie 2020

ACKNOWLEDGEMENTS

Thanks to Calder Phillips-Grafflin, Brad Saund, Andrew Dobson, and Yu-Chi Lin for helpful discussions. Thank you to all my collaborators from the Autonomous Robotic Manipulation Lab. Thanks to my advisor Dmitry Berenson for his guidance and encouragement. Thanks to Chad Jenkins, Jessie Grizzle, and Leslie Pack Kaelbling for their advice. Thanks to my parents for supporting my return to university and advice throughout. Finally, special thanks to my wife Molly for everything.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS

LIST OF FIGURES

LIST OF TABLES

ABSTRACT

CHAPTER

I. Introduction

II. Related Work
  2.1 Modelling Deformable Objects
  2.2 Local Control for Manipulation Tasks
  2.3 Using Multiple Models
  2.4 Motion Planning for Deformable Objects
  2.5 Interleaving Planning and Control for Deformable Object Manipulation
  2.6 Learning for Planning in Reduced State Spaces

III. Deformable Object Modelling
  3.1 Definitions
  3.2 Diminishing Rigidity Jacobian
  3.3 Constrained Directional Rigidity
    3.3.1 Model Overview
    3.3.2 Directional Rigidity
  3.4 Results
    3.4.1 Simulation Environment Model Accuracy Results
    3.4.2 Physical Robot Experiments
    3.4.3 Computation Time

IV. Local Control
  4.1 Problem Statement
  4.2 Reducing Task Error
  4.3 Stretching Avoidance Controller
    4.3.1 Stretching Correction
    4.3.2 Finding the Best Robot Motion and Avoiding Collisions
  4.4 Stretching Constraint Controller
    4.4.1 Overstretch
    4.4.2 Collision
    4.4.3 Optimization Method
  4.5 Results
    4.5.1 Constraint Enforcement
    4.5.2 Controller Task Performance
    4.5.3 Physical Robot Experiment
    4.5.4 Computation Time

V. Estimating Model Utility
  5.1 Problem Statement
  5.2 Bandit-Based Model Selection
  5.3 MAB Formulation for Deformable Object Manipulation
  5.4 Algorithms for MAB
    5.4.1 UCB1-Normal
    5.4.2 KF-MANB
    5.4.3 KF-MANDB
  5.5 Experiments and Results
    5.5.1 Synthetic Tests
    5.5.2 Simulation Trials
  5.6 Discussion

VI. Interleaving Planning and Control
  6.1 Problem Statement
  6.2 Interleaving Planning and Control
    6.2.1 Local Control
    6.2.2 Predicting Deadlock
    6.2.3 Setting the Global Planning Goal
  6.3 Global Planning
    6.3.1 Planning Setup
    6.3.2 Planning Problem Statement
    6.3.3 RRT-EB
  6.4 Probabilistic Completeness of Global Planning
    6.4.1 Assumptions and Definitions
    6.4.2 Proof of Nearest-Neighbors Equivalence
    6.4.3 Construction of a δq-similar Path
  6.5 Simulation Experiments and Results
    6.5.1 Single Pillar
    6.5.2 Double Slit
    6.5.3 Moving a Rope Through a Maze
    6.5.4 Repeated Planning
    6.5.5 Computation Time
  6.6 Physical Robot Experiment and Results
    6.6.1 Experiment Setup
    6.6.2 Experiment Results
  6.7 Discussion
    6.7.1 Parameter Selection
    6.7.2 Limitations

VII. Learning When To Trust Your Model
  7.1 Introduction
  7.2 General Problem Formulation
  7.3 Learning Transition Reliability
    7.3.1 Data Generation and Labeling
    7.3.2 An Illustrative Navigation Example
    7.3.3 What Can Be Learned
    7.3.4 Using the Classifier in Planning
  7.4 Application to a Torque-Limited Planar Arm
    7.4.1 Problem Statement
    7.4.2 Data Collection, Labelling, and Training
    7.4.3 Planning and Results
  7.5 Application to Rope Manipulation
    7.5.1 Problem Statement
    7.5.2 Reduction
    7.5.3 Learning the Classifier
  7.6 Rope Manipulation Experiments
    7.6.1 Scenarios
    7.6.2 Data Collection
    7.6.3 Training the Classifier
    7.6.4 Planning Results
  7.7 Discussion

VIII. Discussion and Future Work

BIBLIOGRAPHY

LIST OF FIGURES

Figure

3.1 Euclidean distance measures the length of the shortest path between p_i and p_j in ℝ^3 (gold). Geodesic distance measures the length of the shortest path, constrained to stay within the deformable object (red).

3.2 An illustrative example of directional rigidity. Left: The rope moves almost rigidly when dragging it by one end to the left. Right: The rope deforms when pulling it on the right in the opposite direction.

3.3 The length of the red segment on the rope is the geodesic distance D_ij. v_ij is the vector showing the relative position of p_j with respect to p_i.

3.4 Projection process for points that are predicted to be in collision after movement.

3.5 RMS model prediction error for the simulated rope model accuracy test. The gripper pulls the rope for the first 4.5 seconds, then turns for half a second, then moves in the opposite direction at the 5 second mark.

3.6 RMS model prediction error for the simulated cloth model accuracy test. The grippers pull the cloth for the first 2.3 seconds, then turn for 0.63 seconds, then move in the opposite direction at the 2.93 second mark. At the 5 second mark the cloth is no longer folded.

3.7 Initial setup for the physical robot model accuracy experiment.

3.8 RMS model prediction error for the physical cloth accuracy test. The grippers pull the cloth toward the robot for the first 10 timesteps, upward for 5 timesteps, rotate for 15 timesteps, diagonally down and away for 9 timesteps, then directly away from the robot.

3.9 Initial state of the four experiments, where the red points act as attractors for the deformable object. (a) Rope wrapping cylinder. (b) Cloth passing single pole. (c) Cloth covering two cylinders. (d) Rope matching zig-path.

4.1 Top line: moving the point does not change the error, so the desired movement is zero; however, it is not important to achieve zero movement, thus W_d = 0. Bottom line: error is at a local minimum; thus moving the point increases error.

4.2 The arrows in gray show the direction of each stretching vector at the corresponding gripper with respect to the gripper pair q_g and q_k. Left: stretching vectors on the rope when the rope is at rest (above) or is deformed (below). Right: stretching vectors on the cloth when the cloth is at rest (above) or is deformed (below). The red lines denote the geodesic connecting the corresponding p_{I_g(q_g,q_k)} and p_{I_k(q_g,q_k)} on the object.

4.3 Initial state of the four experiments, where the red points act as attractors for the deformable object. (a) Rope wrapping cylinder. (b) Cloth passing single pole. (c) Cloth covering two cylinders. (d) Rope matching zig-path.

4.4 (a) The red line shows the γ of the benchmark and the blue line shows the γ of the new controller with s_s = 0.4 throughout the simulation. (b) The purple line shows the γ of the benchmark, and the blue, red, and yellow lines each show the γ of the new controller with s_s = 0.4, s_s = 0.6, and s_s = 0.8, respectively.

4.5 Cloth-covering-two-cylinder task start and end configurations. Both controllers are unable to make progress due to a local minimum.

4.6 Rope-matching-zig-path start and end configurations. Both controllers are able to succeed at the task, bringing the rope into alignment with the desired path.

4.7 Initial setup for the physical robot stretching avoidance test.

5.1 Sequence of snapshots showing the execution of the simulated experiments using the KF-MANDB algorithm. The rope and cloth are shown in green, the grippers are shown in blue, and the target points are shown in red. The bottom row additionally shows Ṗ_d as green rays with red tips.

5.2 Experimental results for the rope-winding task. Top left: alignment error for 10 trials for each MAB algorithm, and each model in the model set when used in isolation. The UCB1-Normal, KF-MANB, and KF-MANDB lines overlap in the figure for all trials. Top right: total regret averaged across 10 trials for each MAB algorithm with the minimum and maximum drawn in dashed lines. Bottom row: histograms of the number of times each model was selected by each MAB algorithm; UCB1-Normal (bl), KF-MANB (bm), KF-MANDB (br).

5.3 Experimental results for the table coverage task. See Figure 5.2 for description.

5.4 Experimental results for the two-part coverage task. See Figure 5.2 for description.

6.1 Four example manipulation tasks for our framework. In the first two tasks, the objective is to cover the surface of the table (indicated by the red lines) with the cloth (shown in green). In the first task, the grippers (shown in blue) can move freely; however, the cloth is obstructed by a pillar. In the second task, the grippers must pass through a narrow passage before the table can be covered. In the third task, the robot must navigate a rope (shown in green in the top left corner) through a three-dimensional maze before covering the red points in the top right corner. The maze consists of top and bottom layers (purple and green, respectively). The rope starts in the bottom layer and must move to the target on the top layer through an opening (bottom left or bottom right). For the fourth task, the physical robot must move the cloth from the far side of an obstacle to the region marked in pink near the base of the robot.

6.2 Block diagram showing the major components of our framework. On each cycle we use either the local controller (dotted purple arrows) or a planned path (dashed red arrows), and predict if the system will be deadlocked in the future; planning a new path is needed to avoid deadlock.

6.3 Motivating example for deadlock prediction. The local controller moves the grippers on opposite sides of an obstacle, while the geodesic between the grippers (red line) cannot move past the pole, eventually leading to overstretch or tearing of the deformable object if the robot does not stop moving towards the goal.

6.4 Example of estimating the gross motion of the deformable object for a prediction horizon N_p = 10. The magenta lines start from the points of the deformable object that are closest to the target points (according to the navigation function). These lines show the paths those points would follow to reach the target when following the navigation function.

6.5 Estimated gross motion of the deformable object (magenta lines) and end effectors (blue spheres). The VEB (black lines) is forward propagated by tracking the end effector positions, changing to cyan lines when overstretch is predicted.

6.6 Left: b(2) is the nearest node to b_rand in robot space, but it may be as far as D_max,b away in the full configuration space. By considering all nodes within D_max,b in robot space, we ensure that any node (such as b(1)) that is closer to b_rand than b(2) is selected as part of V_near, while nodes such as b(4) are excluded in order to avoid the expense of calculating the full configuration space distance. Right: we then measure the distance in the full configuration space to all nodes that could possibly be the nearest to b_rand, returning b(1) as the nearest node in the tree.

6.7 Example covering ball sequence for an example reference path with a distance along the path of δ_s between each ball. Given that the path is δ_q-robust, each ball is a subset of Q_valid.

6.8 Minimum domination region for a node b_i, adapted from Li et al. [1] Lemma 23. Sampling b_rand in the shaded region guarantees that a node b^near ∈ B_δq(b_k^∗) is selected for propagation so that either b^near = b_i or Cost(b^near) < Cost(b_i).

6.9 Sequence of snapshots showing the execution of the first experiment. The cloth is shown in green, the grippers are shown in blue, and the target points are shown as red lines. (1) The approximate integration of the navigation functions from error reduction over N_p timesteps, shown in magenta, pulls the cloth to opposite sides of the pillar. (2) A sequence of VEBs between the grippers is shown in black and teal, indicating the predicted gripper configuration over the prediction horizon as the local controller follows the navigation functions. The elastic band changes to teal as the predicted motion of the grippers moves the cloth into an infeasible configuration. (3-5) The resulting plan by the RRT, shown in magenta and red, moves the system into a new neighbourhood. (6) Final system state when the task is finished by the local controller.

6.10 Sequence of snapshots showing the execution of the second experiment. We use the same colors as the previous experiment (Figure 6.9), but in this example instead of detecting future overstretch in panel (2), we detect that the system is stuck in a bad local minimum and unable to make progress.

6.11 Sequence of snapshots showing the execution of the third experiment. The rope is shown in green starting in the top left corner, the grippers are shown in blue, and the target points are shown in red in the top right corner. The maze consists of top and bottom layers (green and purple, respectively). The rope starts in the bottom layer and must move to the target on the top layer through an opening (bottom left or bottom right).

6.12 Sequence of snapshots for the fourth experiment. We use the same colors as the previous experiment (Figure 6.11), but in this example the local controller gets stuck twice, in panels 3 and 6. In panel 7 the global planner finds a new neighbourhood that is distinct from previously-tried neighbourhoods.

6.13 Cloth placemat task. The placemat starts on the far side of an obstacle and must be aligned with the pink rectangle near the robot.

6.14 Constraint and objective graph for Eq. (6.37). Note that not all constraints are shown to avoid clutter; every estimated position has a constraint between itself and every other estimated position.

6.15 Histogram of planning times across 100 trials for the cloth placemat experiment.

7.1 Top: a plan generated without using a classifier moves the rope under a hook and gets caught. Bottom: a plan generated with a classifier avoids this mistake, and reaches the goal.

7.2 An outline of our framework. First, we plan and execute many control sequences to gather a dataset of transitions. These transitions are labeled according to a function l and used to train a classifier which predicts whether a transition is reliable given the model reduction. This classifier is used to bias the planner away from unreliable transitions.

7.3 Circles represent variables and boxes represent functions. Orange: variables defining the t-th transition. Red path: reduced dynamics prediction; blue path: rollout result.

7.4 Illustration of desired prediction from a classifier. Dotted triangles indicate b̂_1s from different u_0 inputs. Green: the classifier predicts these transitions are Reliable. Red: the classifier predicts these transitions are Unreliable. Note that the line between b_0 and b̂_1 is collision-free for all examples shown.

7.5 Illustration of the effect of information loss on the predictability of a transition. In both scenarios states with different velocities reduce to the same b_0. Left: a case where rolling out the same u from different initial velocities (blue) produces the same b̃_1 values, since the controller is robust to initial velocity in this case. Right: a case where rolling out the same u with different initial velocities produces different b̃_1 values. At high initial velocity (c) the controller cannot turn before reaching the obstacle.

7.6 Left: plan generated using the learned classifier to go from [-π/2, 0, 0] to [π/2, 0, 0]. The plan avoids transitions which move the arm toward a horizontal position and successfully completes the task. Center: plan generated without the classifier. The plan takes the arm to the horizontal position where it fails due to the torque limit. Right: number of successes as the success threshold β varies.

7.7 Input to the VoxNet classifier is a 3-channel voxel grid, where white is the local environment, red is the pre-transition band, and blue is the post-transition band. Positions outside the bounds of the environment are marked as occupied in the local environment channel.

7.8 The rope is shown in green, with the grippers shown in blue. The target area for the grippers is shown in red. Walls with narrow slits for the grippers are shown in purple. Hooks and other obstacles are shown in dark cyan. Left: Simple Hook; Center Left: Multi Hook; Center Right: Complex Hook; Right: Engine Assembly.

LIST OF TABLES

Table

3.1 Top two rows: mean computation time (ms) per model prediction for a given gripper motion. BT: Bullet simulator; CDR: constrained directional rigidity. Bottom row: mean number of times the model was evaluated when executing the controller in Section 4.4.

4.1 Mean computation time (s) to compute the gripper motion for a given state. BM: stretching avoidance controller; NM: stretching constraint controller.

5.1 Controller parameters.

5.2 KF-MANB and KF-MANDB parameters.

5.3 Synthetic trial results showing total regret with standard deviation in brackets for all bandit algorithms for 100 runs of each setup.

6.1 Deadlock prediction parameters.

6.2 Distance and planner parameters.

6.3 Planning statistics for the first plan for each example task in simulation, averaged across 100 trials. Standard deviation is shown in brackets.

6.4 Smoothing statistics for the first plan for each example task in simulation, averaged across 100 trials. Standard deviation is shown in brackets.

6.5 Local controller and deadlock prediction average computation time per iteration for each type of deformable object, averaged across all trials.

6.6 Average computation time to compute the effect of a gripper motion.

6.7 Planning statistics for the cloth placemat example, averaged across 100 trials. Standard deviation is shown in brackets.

6.8 Smoothing statistics for the cloth placemat example, averaged across 100 trials. Standard deviation is shown in brackets.

7.1 Planning statistics, averaged over 30 trials.

ABSTRACT

This dissertation is motivated by two research questions: (1) How can robots perform a broad range of useful tasks with deformable objects without a time-consuming modelling or data collection phase? and (2) How can robots take advantage of information learned while manipulating deformable objects?

To address the first question, I propose a framework for deformable object manipulation that interleaves planning and control, enabling complex manipulation tasks without relying on high-fidelity modeling or simulation. Each part of the framework uses a different representation of the deformable object that is well suited to the specific requirements of each component. The key idea behind these techniques is that we do not need to explicitly model and control every part of the deformable object, instead relying on the object's natural compliance in many situations.

For the second question, I consider the two major components of my framework and examine what can cause failure in each. The goal then is to learn from experience gathered while performing tasks in order to avoid making the same mistake again and again. To this end I formulate the controller's task as a Multi-Armed Bandit problem, enabling the controller to choose models based on the current circumstances. For the planner, I present a method to learn when we can rely on the robot's model of the deformable object, enabling the planner to avoid generating plans that are infeasible.

This framework is demonstrated in simulation with free-floating grippers as well as on a 16 DoF physical robot, where reachability and dual-arm constraints make the tasks more difficult.

CHAPTER I

Introduction

In the 1950s and 1960s George Devol, Joseph Engelberger, and many others began developing machines that were programmable manipulators of objects. Industrial manufacturers, in a desire to reduce labour costs, improve quality, and reduce delivery times, were early adopters of this technology. These industrial robots were programmed to perform highly repetitive manipulation of various rigid objects in fixed environments. Key to the robot's success are the specific program instructions derived from previously measured and calculated features of the real-world objects under manipulation in a fixed environment. The pervasive use of industrial robots performing flawlessly around the globe, in factories carrying out tasks like medical laboratory testing, automobile assembly, or electronics circuit board manufacturing, demonstrates the success of programmable manipulators of objects. Complexity has increased, dexterity has improved, and a given robot may be capable of more than one task, or can be applied to a different but previously known size of object, or to a different, fully described environment; but the vast majority of the modeling and planning remains a human creation and is merely programmed into the robot in advance. The industrial robot certainly does not learn.

Indeed, 35 years ago, Michael Brady [2] argued that "Since robotics is the field concerned with the connection of perception to action, Artificial Intelligence must have a central role in Robotics if the connection is to be intelligent." Since then, there have been great strides made developing robots with intelligence; one prominent example of this is the self-driving automobile. Like industrial manipulators, a self-driving automobile knows much about its own dimensions and physics, but it is constantly relying on computation-intensive processes for intelligence. The robot's task is achievable because the environment being modeled is rigid, the automobile's dynamics are known, and advancements in computing speed have soared, making it possible for the robot to execute a very large, but finite, number of calculations quickly enough for secure and accurate control. Brady went on to say, "Robotics challenges AI by forcing it to deal with real objects in the real world." As true as that statement certainly is for robots such as self-driving automobiles, it is all the more true for robots that manipulate deformable objects, which possess an infinite number of degrees of freedom and an inherently incomplete description of the object in all of its possible arrangements. An interesting example of a robot that manipulates deformable objects in current use is the da Vinci surgical robot, well known for its YouTube video demonstrating it stitching a grape back together with thread. Its relevance to this dissertation is simple: the surgical robot does not learn, plan, or control anything directly; those computationally intensive tasks are performed by a human who controls the robot's manipulating tools. The only way an argument could be made for declaring this surgical robot a learning, autonomous robot would be to assume that it ships from the factory complete with a human operator who is deemed one of its components. Computational intensity is one of its biggest hurdles. The computational challenge posed by modeling, planning, and control for deformable objects will be addressed in this dissertation; we describe a framework that we successfully used to obtain measurable improvements against that challenge.

Traditional motion planning techniques such as A*, probabilistic roadmaps, and rapidly-exploring random trees were designed with rigid body motion in mind. In this framework contact with the environment is explicitly disallowed. In order to manipulate an object, a typical approach is to build a model of the object being grasped; this model is then used to ensure that the object does not collide with anything during a particular motion. In contrast, when working with deformable objects, interaction with the environment is common (and often required), with the deformable object complying to the environment rather than colliding with it. This raises the question "How accurate do our models need to be?" There is a broad range of deformable object manipulation tasks that robots have performed without highly accurate models, ranging from surgical applications [3] to knot-tying [4]. While these methods have some success, they rely on either a hand-designed sequence of actions (or controllers) or a time-consuming data collection phase.

To address these limitations, this dissertation is motivated by two key research questions: (1) How can robots perform a broad range of tasks with deformable objects without high-fidelity models and simulation? and (2) How can robots take advantage of information learned while manipulating deformable objects? By answering these questions we can take a step towards a household robot that is capable of performing a broad range of tasks without any additional training, and can improve its ability to perform the specific tasks that it commonly encounters.

Examples of deformable object manipulation range from domestic tasks like folding clothes to time- and safety-critical tasks such as robotic surgery. One of the challenges in planning for deformable object manipulation is the high number of degrees of freedom involved; even approximating the configuration of a piece of cloth in 3D with a 4 × 4 grid of points results in a 48-degree-of-freedom configuration space (16 points, each with 3 coordinates). In addition, the dynamics of the deformable object are difficult to model [5]; even with high-fidelity modeling and simulation, planning for an individual task can take hours [6]. Local controllers, on the other hand, are able to generate motion very efficiently; however, they are only able to successfully complete a task when the initial configuration is in the "attraction basin" of the goal [7, 8]. We propose combining the strengths of global planning with the strengths of local control while mitigating the weaknesses of each; we propose a framework for interleaving planning and control which uses global planning to generate gross motion of the deformable object, and a local controller to refine the configuration of the deformable object within the local neighborhood. By separating planning from control we are able to use different representations of the deformable object, each suited to efficient computation for its respective role.

Two key ideas allow this framework to reliably perform tasks without a time-consuming modelling or data collection phase. First, we do not need to explicitly model and control every part of the deformable object, instead relying on the object's natural compliance in many situations. By doing so we drastically reduce the need for model fidelity, enabling the use of model approximations that do not need to be highly accurate. Second, our framework does not assume that the local controller or global planner is infallible. Instead, we assume that mistakes will be made, and implement learning algorithms designed to avoid making the same mistake again. To this end I formulate the controller's task as a Multi-Armed Bandit problem, enabling the controller to choose models based on the current circumstances. For planning we present a formulation that explicitly exposes the challenge of planning with model approximations, as well as a method for learning when we can and cannot rely on a model approximation during planning.

This dissertation makes seven contributions towards answering these research questions:

• We introduce a more accurate geometric model of how the direction of gripper motion and obstacles affect deformable objects.

• We specify a novel stretching avoidance constraint to prevent the object from being overstretched by the robot as part of a local controller, allowing for the use of less accurate models without risking tearing the deformable object.

• We formulate the task of the local controller as a Multi-Armed Bandit problem, with each arm representing a model of the deformable object.

• We introduce a manipulation framework that interleaves planning and control, choosing each when most useful.

• We present a global motion planner to generate gross motion of the deformable object, and provide a proof of probabilistic completeness for our planner, which is valid despite the fact that our system is underactuated and we do not have a steering function.

• We introduce a novel formulation of planning in reduced state spaces.

• We propose a method for improving the performance of the global planner as mistakes are made due to model approximations, enabling the planner to learn from experience.

CHAPTER II

Related Work

Robotic manipulation of deformable objects has been studied in many contexts ranging from surgery to industrial manipulation (see [9, 10, 11] for surveys). Below we discuss the most relevant methods to the work presented in this dissertation, starting with methods of modelling and simulating deformable objects. We then discuss visual servoing and other local control methods for performing deformable object manipulation tasks. Next we discuss related work for model selection and using multiple models for control. We then describe work relevant to combining local controllers and global planners for accomplishing tasks. We conclude with related work in motion planning for deformable objects and ways to consider topology in planning.

2.1 Modelling Deformable Objects

Much work in deformable object manipulation relies on simulating an accurate model of the object being manipulated. Motivated by applications in computer graphics and surgical training, many methods have been developed for simulating string-like objects [12, 13] and cloth-like objects [14, 15]. The most common simulation methods use mass-spring models [16, 5], which are generally not accurate for large deformations [17], and finite-element (FEM) models [18, 19, 20]. FEM-based methods are widely used and physically well-founded, but they can be unstable when subject to contact constraints, which are especially important in this work. They also require significant tuning and are very sensitive to the discretization of the object. Furthermore, such models require knowledge of the physical properties of the object, such as its Young's modulus and friction parameters, which we do not assume are known.

Also, we seek a model that can be evaluated very quickly inside an optimal control framework, and finite-element models, while accurate, can be computationally expensive to simulate. While methods have been developed to track objects using FEM in real time [21], a controller may need to evaluate the model many times to find an appropriate command, requiring speeds faster than real time. Specialized models have also been developed, e.g., [22] and [23] focus on elastic rods that are not in contact. We seek a model that works well with rope-like and cloth-like materials that can deform as a result of contact. Finally, researchers have also investigated automatic modeling of deformable objects [24, 25]. However, these methods rely on a time-consuming training phase for each object to be modeled, which we would like to avoid. Our work is complementary to methods that adapt the model of the object during manipulation [26, 27, 28]. Our model can serve as an initial guess and a reference for such methods so that the online adaptation process does not diverge too far from a reasonable model as a result of perception or modeling error. Our modelling methods build on the idea of diminishing rigidity Jacobians [7], improving the model by considering the effects of the direction of motion and the static obstacles that the deformable object interacts with.

2.2 Local Control for Manipulation Tasks

Given a model such as those above, researchers have investigated various control methods to manipulate deformable objects. Model-based visual servoing approaches bypass planning entirely, and instead use a local controller to determine how to move the robot end-effector for a given task [29, 30, 31]. Other approaches [7, 26, 32, 27] bypass the need for an explicit deformable object model, instead using approximations of the Jacobian to drive the deformable object to the attractor of the starting state. More recent work by Hu et al. [28] has enabled the use of Gaussian process regression while controlling a deformable object. Our work builds on Berenson [7], incorporating overstretching and obstacle avoidance into control constraints that are more effective at preventing damage to the deformable object.

2.3 Using Multiple Models

In order to accomplish a given manipulation task, we need to determine which type of model to use at the current time to compute the next velocity command, as well as how to set the model parameters. Frequently this selection is done manually; however, there are methods designed to make these determinations automatically. Machine learning techniques such as [33, 34] rely on supervised training data in order to intelligently search for the best regression or classification model. These methods are able to estimate the accuracy of each model as training data is processed, pruning models from the training that are unlikely to converge or otherwise outperform the models that are kept. These methods are designed for large datasets rather than an online setting where we may not have any training data a priori. While it may be possible to adjust these methods to consider model utility instead of model accuracy, it is unclear how to acquire the needed training data for the task at hand without having already performed the task. The most directly applicable methods come from the Multi-Armed Bandit (MAB) literature [35, 36, 37]. In this framework there are multiple actions we can take, each of which provides us with some reward according to an unknown probability distribution. The problem then is to determine which action to take (which arm to pull) at each time step in order to maximize reward.

The MAB approach is well-studied for problems where the reward distributions are stationary, i.e., the distributions do not change over time [36, 38]. This is not the case for deformable object manipulation; consider the situation where the object is far away from the goal versus the object being at the goal. In the first case there is a possibility of an action moving the object closer to the goal and thus achieving a positive reward; in the second case any motion would, at best, give zero reward. In the contextual bandits [39, 40] variation of the MAB problem, additional contextual information or features are observed at each timestep, which can be used to determine which arm to pull. Typical solutions map the current features to estimates of the expected reward for each arm using regression techniques or other metric-space analysis. In order to use contextual bandits for a given task, a set of features would need to be engineered; however, it is not clear what features to use.

Recent work [41] on non-stationary MAB problems offers promising results that utilize independent Kalman filters as the basis for the estimation of a non-stationary reward distribution for each arm. This algorithm (KF-MANB) provides a Bayesian estimate of the reward distribution at each timestep, assuming that the reward is normally distributed. KF-MANB then performs Thompson sampling [38] to select which arm to pull, choosing each in proportion to the belief that it is the optimal arm. We build on this approach in this dissertation to produce a method that also accounts for dependencies between arms by approximating the coupling between arms at each timestep; a sketch of the underlying mechanism is shown below.
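As a minimal sketch of that mechanism (not the thesis implementation; the function name and the simplified scalar noise parameters are our own), one round of Kalman-filtered Thompson sampling looks like this:

```python
import numpy as np

def kf_thompson_step(mean, var, obs_noise, drift_noise, pull_arm):
    """One round of Thompson sampling with an independent Kalman filter
    per arm, in the spirit of KF-MANB. `mean` and `var` hold the Gaussian
    belief over each arm's (non-stationary) expected reward.
    """
    # Sample one value from each arm's belief and pull the most promising arm.
    arm = int(np.argmax(np.random.normal(mean, np.sqrt(var))))
    reward = pull_arm(arm)

    # Kalman measurement update for the pulled arm only.
    gain = var[arm] / (var[arm] + obs_noise)
    mean[arm] += gain * (reward - mean[arm])
    var[arm] *= 1.0 - gain

    # Process update: every arm's reward may drift, so all variances grow.
    var += drift_noise
    return arm, reward
```

Because the variance of unpulled arms grows through the process noise, the sampler naturally re-explores arms whose reward estimates have gone stale; the coupled variant developed in Chapter V extends this by maintaining a joint belief over all arms.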

For the tasks we address, the reward distributions are both non-stationary and dependent. Because all arms are operating on the same physical system, pulling one arm both gives us information about the distributions of other arms, and changes the future reward distributions of all arms. While work has been done on dependent bandits [42, 39], we are not aware of any work addressing the combination of non-stationary and dependent bandits using a regret-based formulation. Our method for model selection is inspired by KF-MANB; however, we directly use coupling between models in order to form a joint reward distribution over all models. This enables a pull of a single arm to provide information about all arms, and thus we spend less time exploring the model space and more time exploiting useful models to perform the manipulation task.

2.4 Motion Planning for Deformable Objects

Motion planning for manipulation of deformable objects is an active area of research [10]. Saha et al. [43] present a Probabilistic Roadmap (PRM) [44] that plans for knot-tying tasks with rope. Rodriguez and Amato [45] study motion planning in fully deformable simulation environments. Their method, based on Rapidly-exploring Random Trees (RRTs) [46], applies forces directly to an object to move it through narrow spaces while using the simulator to compute the resulting deformations. Frank et al. [47] presented a method that pre-computes deformation simulations in a given environment to enable fast multi-query planning. Other sampling-based approaches have also been proposed [48, 49, 50, 51, 52, 53]. However, all the above methods either disallow contact with the environment or rely on potentially time-consuming physical simulation of the deformable object, which is often very sensitive to physical and computational parameters that may be difficult to determine. In contrast, our method uses simplified models for control and motion planning with far lower computational cost.

Our planning method has some similarity to topological [54, 55] and tethered robot [56, 57] planning techniques; these methods use the topological structure of the space to define homotopy classes, either as a direct planning goal, or as a way to help inform planning in the case of tethered robots. Planning for some deformable objects, in particular rope or string, can be viewed as an extension of the tethered robot case where the base of the tether can move. This extension, however, requires a very different approach to homotopy than is commonly used, particularly when working in three-dimensional space instead of a planar environment. In our work we use visibility deformations from [54] as a way to encode homotopy-like classes of configurations.

Previous approaches to proving probabilistic completeness for efficient planning of underactuated systems rely on the existence of a steering function to move the system from one region of the state space to another, or on choosing controls at random [58, 59, 60, 1]. For deformable objects, a computationally-efficient steering function is not available, and using random controls can lead to prohibitively long planning times. Roussel et al. [53] bypass this challenge by analyzing completeness in the submanifold of quasi-static contact-free configurations of extensible elastic rods. In contrast, we show that our method is probabilistically complete even when contact between the deformable object and obstacles is considered along the path. Note that it is especially important to allow contact at the goal configuration of the object to achieve coverage tasks. Li et al. [1] present an efficient asymptotically-optimal planner which does not need a steering function; however, they do rely on the existence of a contact-free trajectory where every point in the trajectory is in the interior of the valid configuration space. Our proof of probabilistic completeness is based on Li et al. [1], but we allow for the deformable object to be in contact with obstacles along a given trajectory.

2.5 Interleaving Planning and Control for Deformable Object Manipulation

The use of a local controller is not considered in the above methods; they instead rely on a global planner (and thus implicitly on the accuracy of the simulator) to generate a path that completes the entire task. In contrast, our framework combines the strengths of global planning with the strengths of local control in order to perform tasks. Park et al. [61] considered interleaving planning and control for arm reaching tasks in rigid unknown environments. In their method, they assume an initially unknown environment in which they plan a path to a specific end-effector position. This path is then followed by a local controller until the task is complete or the local controller gets stuck. If the local controller gets stuck, then a new path is planned and the cycle repeats. In contrast, our controller performs the task directly rather than following a planned reference trajectory, incorporating deadlock prediction into the execution loop, while our global planner plans for both the robot motion and the deformable object stretching constraint.

Approaches based on learning from demonstration avoid planning and deformable object modelling challenges entirely by using offline demonstrations to teach the robot specific manipulation tasks [4, 62]; however, when a new task is attempted a new training set needs to be generated. In our application we are interested in a way to manipulate a deformable object without a high-fidelity model or training set available a priori. For instance, imagine a robot encountering a new piece of clothing for a new task. While it may have models for previously-seen clothes or training sets for previous tasks, there is no guarantee that those models or training sets are appropriate for the new task.

2.6 Learning for Planning in Reduced State Spaces

In terms of applying machine learning to control and planning, prior work has primarily used learned dynamics models for control [63, 64, 65, 66, 67]. Recent work [68] has also explored planning in a learned reduced space, but it does not consider the error in a reduced model's prediction when planning. Visual Planning and Acting (VPA) [69] learns a latent transition model based on visual input for planning. This work uses a classifier to prune infeasible transitions during planning. However, despite this classifier, only 15% of generated plans were visually plausible, with only 20% of the visually plausible plans being executable. When considering machine learning methods in this dissertation we do not focus on learning a reduction, but rather on creating a framework that can be used to overcome limitations in a given model reduction.

CHAPTER III

Deformable Object Modelling

One of the key challenges in manipulating deformable objects is the inherent difficulty of modeling and simulating them. While there has been some progress towards online modeling of deformable objects [24, 25], these methods rely on a time-consuming training phase for each object to be modeled. This training phase typically consists of probing the deformable object with test forces in various configurations, and then fitting model parameters to the generated data. While this process can generate useful models, the time it takes to generate a model for each task can be prohibitive for some applications. Of particular interest are Jacobian-based models; in these models we assume that there is some function F : SE(3)^G → ℝ^N which maps a configuration of G robot grippers q ∈ SE(3)^G to a parameterization of the deformable object P ∈ ℝ^N, where N is the dimensionality of the parameterization of the deformable object. These models are then linearized by calculating the Jacobian of F:

\[
\begin{aligned}
\mathcal{P} &= F(q) \\
\frac{\partial \mathcal{P}}{\partial t} &= \frac{\partial F(q)}{\partial q} \frac{\partial q}{\partial t} \\
\dot{\mathcal{P}} &= J(q)\, \dot{q} \,.
\end{aligned} \tag{3.1}
\]

Computation of an exact Jacobian J(q) at a given configuration q is often computationally intractable and requires high-fidelity models and simulators, so instead approximations are frequently used.

In this chapter we discuss a diminishing-rigidity based approximation first introduced by Berenson [7] and extensions of this model. The diminishing-rigidity model assumes that points on the deformable object that are near a gripper move "almost rigidly" with respect to the gripper, while points that are further away move "less rigidly". In addition to this Jacobian-based model, we also introduce a non-linear modification of the diminishing-rigidity Jacobian which more accurately captures the effect of the direction of gripper motion and of obstacles.

3.1 Definitions

Let the robot be represented by a set of G grippers with configuration q ∈ SE(3)^G. We assume that the robot configuration can be measured exactly; in this work we assume the robot to be a set of free-floating grippers; in practice we can track the motion of these with inverse kinematics on robot arms (see Section 4.3.2.2 for an implementation). We use the Lie algebra [70] of SE(3) to represent robot gripper velocities. This is the tangent space of SE(3), denoted as se(3). The velocity of a single gripper g is then q̇_g = [v_g^T ω_g^T]^T ∈ se(3), where v_g and ω_g are the translational and rotational components of the gripper velocity. We define the velocity of the entire robot to be q̇ = [q̇_1^T … q̇_G^T]^T ∈ se(3)^G. We define the inner product of two gripper velocities

q̇_1, q̇_2 ∈ se(3) to be
\[
\langle \dot{q}_1, \dot{q}_2 \rangle = v_1^T v_2 + c\, \omega_1^T \omega_2 \,, \tag{3.2}
\]
where c > 0 is a scaling factor relating rotational and translational velocities. This defines the se(3) norm
\[
\lVert \dot{q}_g \rVert^2_{se(3)} = \langle \dot{q}_g, \dot{q}_g \rangle \,. \tag{3.3}
\]
Let the configuration of a deformable object be a set of P points with configuration P = [p_1^T … p_P^T]^T ∈ ℝ^{3P}. We assume that we have a method of sensing P. Let D be a symmetric P × P matrix where D_ij is the geodesic distance (see Figure 3.1) between p_i and p_j when the deformable object is in its "natural" or "relaxed" state. To measure the norm of a deformable object velocity Ṗ = [ṗ_1^T … ṗ_P^T]^T ∈ ℝ^{3P} we will use a weighted Euclidean seminorm
\[
\lVert \dot{\mathcal{P}} \rVert^2_W = \sum_{i=1}^{P} w_i\, \dot{p}_i^T \dot{p}_i = \dot{\mathcal{P}}^T \operatorname{diag}(W)\, \dot{\mathcal{P}} \,, \tag{3.4}
\]
where W = [w_1 … w_P]^T ∈ ℝ^P is a set of non-negative weights. The rest of the environment is denoted O and is assumed to be both static and known exactly.

Figure 3.1: Euclidean distance measures the length of the shortest path between p_i and p_j in ℝ^3 (gold). Geodesic distance measures the length of the shortest path, constrained to stay within the deformable object (red).

The current state of the deformable object is a function of the current gripper configuration q, the history of gripper motions that have been applied Q^hist, the object's initial configuration P_0, and the obstacles in the environment O:
\[
\mathcal{P} = F(q, Q^{hist}, \mathcal{P}_0, O) \,. \tag{3.5}
\]

Let a deformation model φ be defined as a function which takes as input the system configuration, gripper velocities, and obstacle configuration, and returns a deformable object velocity:
\[
\dot{\mathcal{P}} = \phi(q, \dot{q}, \mathcal{P}, O) \,. \tag{3.6}
\]
For brevity this will frequently be shortened to Ṗ = φ(q̇). For Jacobian-based models, we take the time derivative of Eq. (3.5) to get
\[
\frac{d\mathcal{P}}{dt} = \frac{\partial F}{\partial q} \frac{\partial q}{\partial t}
+ \frac{\partial F}{\partial Q^{hist}} \frac{\partial Q^{hist}}{\partial t}
+ \frac{\partial F}{\partial \mathcal{P}_0} \frac{\partial \mathcal{P}_0}{\partial t}
+ \frac{\partial F}{\partial O} \frac{\partial O}{\partial t} \,. \tag{3.7}
\]
Only the first term is non-zero, thus
\[
\dot{\mathcal{P}} = \frac{\partial F(q, Q^{hist}, \mathcal{P}_0, O)}{\partial q}\, \dot{q} = J(q, \dot{q}, \mathcal{P}, O)\, \dot{q} \,. \tag{3.8}
\]
Note that Q^hist and P_0 are needed in F to compute the current state of the object, but if we can sense P directly (as we assume), then Q^hist and P_0 are not needed to compute the Jacobian J. Thus for Jacobian-based models Eq. (3.8) directly defines the deformation model φ:
\[
\phi(\dot{q}) \approx J(q, \dot{q}, \mathcal{P}, O)\, \dot{q} \,. \tag{3.9}
\]
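To make the interface concrete, the following is a minimal numpy sketch of how a Jacobian-based model is used, combining the prediction of Eq. (3.9) with the seminorm of Eq. (3.4); the function and variable names are illustrative assumptions, not those of the thesis codebase:

```python
import numpy as np

def predict_object_motion(J, qdot):
    """Jacobian-based deformation model (Eq. 3.9): P_dot = J @ qdot.

    J    : (3P, 6G) approximate Jacobian mapping stacked gripper velocities
           in se(3)^G to deformable object point velocities.
    qdot : (6G,) stacked gripper velocities [v_1, w_1, ..., v_G, w_G].
    """
    return J @ qdot

def weighted_seminorm(P_dot, W):
    """Weighted Euclidean seminorm of an object velocity (Eq. 3.4).

    P_dot : (3P,) stacked point velocities.
    W     : (P,) non-negative per-point weights.
    """
    per_point = P_dot.reshape(-1, 3)                    # one row per point
    return np.sqrt(np.sum(W * np.sum(per_point * per_point, axis=1)))
```

A local controller may evaluate predict_object_motion for many candidate gripper velocities q̇ and score each prediction with task-defined weights W, which is why the model must be far cheaper to evaluate than a full simulator.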

3.2 Diminishing Rigidity Jacobian

The key assumption used by this method [7] is diminishing rigidity: the closer a gripper is to a particular part of the deformable object, the more that part of the object moves in the same way that the gripper does (i.e. more "rigidly"). The further away a given point on the object is, the less rigidly it behaves; the less it moves when the gripper moves. In this section we refine Berenson's method by redefining J^rot_{i,g} and introducing an extra parameter. This results in two parameters k^trans ≥ 0 and k^rot ≥ 0 which control how the translational and rotational rigidity scales with distance. Small values entail very rigid objects such as steel cable; high values entail very deformable objects such as fine string.

For every point i and every gripper g we construct a Jacobian J^rigid(i, g) such that if p_i were rigidly attached to the gripper q_g then
\[
\dot{p}_i = J^{rigid}_{i,g}\, \dot{q}_g = \begin{bmatrix} J^{trans}_{i,g} & J^{rot}_{i,g} \end{bmatrix} \dot{q}_g \,. \tag{3.10}
\]
We then modify this Jacobian to account for the effects of diminishing rigidity. Let the indices of the set of points grasped by gripper g be G_g ⊆ {1, …, P}. Let c(i, g) be the index of the point with minimal relaxed geodesic distance to p_i among the ones grasped by gripper g:
\[
c(i, g) = \operatorname*{argmin}_{j \in \mathcal{G}_g} D_{ij} \,. \tag{3.11}
\]
Then the translational rigidity of point i with respect to gripper g is defined as
\[
w^{trans}_{i,g} = e^{-k^{trans} D_{ic(i,g)}} \tag{3.12}
\]
and the rotational rigidity is defined as
\[
w^{rot}_{i,g} = e^{-k^{rot} D_{ic(i,g)}} \,. \tag{3.13}
\]

To construct an approximate Jacobian J̃(i, g) for a single point and a single gripper, we combine the rigid Jacobians with their respective rigidity values
\[
\tilde{J}(i, g) = \begin{bmatrix} w^{trans}_{i,g} J^{trans}_{i,g} & w^{rot}_{i,g} J^{rot}_{i,g} \end{bmatrix} \,, \tag{3.14}
\]
and then combine the results into a single matrix
\[
\tilde{J}(q, \mathcal{P}) = \begin{bmatrix}
\tilde{J}(1,1) & \tilde{J}(1,2) & \cdots & \tilde{J}(1,G) \\
\tilde{J}(2,1) & \ddots & & \vdots \\
\vdots & & & \\
\tilde{J}(P,1) & & \cdots & \tilde{J}(P,G)
\end{bmatrix} \,. \tag{3.15}
\]
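A minimal numpy sketch of the construction in Eqs. (3.11)-(3.15) is shown below, assuming the per-point rigid Jacobians of Eq. (3.10) are available from the grasp geometry; all names are illustrative:

```python
import numpy as np

def diminishing_rigidity_jacobian(J_rigid, D, grasped, k_trans, k_rot):
    """Assemble the approximate Jacobian J_tilde(q, P) of Eq. (3.15).

    J_rigid : function (i, g) -> (3, 6) rigid Jacobian of point i w.r.t.
              gripper g, as in Eq. (3.10)
    D       : (P, P) relaxed geodesic distances between object points
    grasped : list of G index arrays; grasped[g] holds the points in G_g
    """
    P, G = D.shape[0], len(grasped)
    J = np.zeros((3 * P, 6 * G))
    for i in range(P):
        for g in range(G):
            d = D[i, grasped[g]].min()            # D_{ic(i,g)}, Eq. (3.11)
            w_trans = np.exp(-k_trans * d)        # Eq. (3.12)
            w_rot = np.exp(-k_rot * d)            # Eq. (3.13)
            Jig = J_rigid(i, g)                   # [J_trans | J_rot]
            J[3*i:3*i+3, 6*g:6*g+3] = w_trans * Jig[:, :3]   # Eq. (3.14)
            J[3*i:3*i+3, 6*g+3:6*g+6] = w_rot * Jig[:, 3:]
    return J
```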

3.3 Constrained Directional Rigidity

While the diminishing rigidity Jacobian method has been used to do practical manipulation tasks with a deformable object [7], we observe that this rigidity does not only diminish as the distance from the gripper increases. Instead, it is a function of a larger set of variables derived from the configuration of the object. First, the rigidity also depends on the direction of gripper motion. Figure 3.2 shows an example of an object's directional rigidity. In addition to capturing the effects of directional rigidity, in this section we seek to address contact with the environment, increasing the accuracy of the approximation.

Figure 3.2: An illustrative example of directional rigidity. Left: The rope moves almost rigidly when dragging it by one end to the left. Right: The rope deforms when pulling it on the right in the opposite direction.

3.3.1 Model Overview

In Section 3.2, J is assumed to be independent of q̇ and O, yielding ∂F/∂q = J(q, Q^hist, P_0) = J(q, P), which is analogous to a rigid-body Jacobian. While these assumptions allow a linear relationship between q̇ and Ṗ, and thus computational convenience, they are not accurate in many situations (see Figure 3.2 for an example). In this section we augment the definition of J to include effects from q̇ and O:
\[
\dot{\mathcal{P}} = J(q, \dot{q}, \mathcal{P}, O)\, \dot{q} = \phi(q, \dot{q}, \mathcal{P}, O) \,. \tag{3.16}
\]

We now describe how J is approximated, focusing on how it accounts for directional rigidity (using q̇) and how it enforces obstacle penetration constraints (using O).

3.3.2 Directional Rigidity

We build on the idea proposed by Berenson [7], which approximates J based on the observation that the deformable object behaves rigidly near points grasped by the robot grippers. [7] encoded this effect through a simple function that only considered the distance of a point from the nearest gripper. However, we find that we can exploit geometric information in the object's configuration to better predict the object's motion when we use a more complex model. We have observed that the key features of the deformable object configuration for predicting its motion are its deformability (which is determined by its material properties) and where it is slack. The deformation influences the transmission of the force from the grippers, i.e., the more stretchable the object, the more it will stretch when force is applied. However, when a region of the object is taut, regardless of how stretchable it is, it will move as if it were rigidly connected to a gripper (e.g. imagine a rope held taut by two grippers). We also must take into account that points are not influenced equally by different grippers; i.e., grippers farther away contribute less to the motion of a point than those closer to it. To incorporate the above effects into our model, we define the following variables, which can be derived from q, q̇, and P:

• D_ij: the geodesic distance (a scalar) between points p_i and p_j on the surface of the object.

• v_ij: the vector starting at a point p_i and ending at the point p_j, as shown in Figure 3.3.

• q_g: the configuration of gripper g.

• q̇_g: the velocity of gripper g.

Furthermore, let c(i, g) be the index of the point with the minimal geodesic distance to p_i among the ones grasped by the g'th gripper.

Figure 3.3: The length of the red segment on the rope is the geodesic distance D_ij. v_ij is the vector showing the relative position of p_j with respect to p_i.

We address the notion of rigidity in object motion by considering the slackness of the object and reformulating the rigidity as a function of D_ic(i,g), v_ic(i,g), and q̇_g. For each point i and gripper g we compute
\[
\tilde{J}(i, g) = \theta_{i,g} \begin{bmatrix} w^{trans}_{i,g} J^{trans}_{i,g} & w^{rot}_{i,g} J^{rot}_{i,g} \end{bmatrix}, \qquad
w^{trans}_{i,g} = w^{trans}(D_{ic(i,g)}, v_{ic(i,g)}, \dot{q}_g), \qquad
w^{rot}_{i,g} = w^{rot}(D_{ic(i,g)}) \,, \tag{3.17}
\]
where w^trans and w^rot are the corresponding translational and rotational diminishing rigidity factors defined by p_i and gripper g (discussed below). Our goal is to encode the directional rigidity of the object motion into w^trans and w^rot, and use θ_{i,g} to describe the influence of gripper g on p_i. Intuitively, w^trans should decrease with increasing geodesic distance D_ic(i,g) between p_i and p_c(i,g).

This is because the deformation of the region between p_i and p_c(i,g) will attenuate the transmitted force of the gripper's motion unless the object is taut. Since the effects on w^rot_{i,g} from q̇_g and v_ic(i,g) are not as clear or significant as those of D_ic(i,g), we keep w^rot_{i,g} as a function of D_ic(i,g), where

1. w^rot_{i,g} ranges between 0 and 1.

2. w^rot_{i,g} decreases as D_ic(i,g) increases.

We give the definition of w^rot_{i,g} below.

From observation, we find two key reasons related to the slackness of the object that induce the diminishing rigidity effect for translational motion, and we aim to encode these factors into w^trans_{i,g}. The first case is that the moving direction of q̇_g makes the region on the object between p_i and p_c(i,g) less taut. The second case is that this region is already slack. w^trans_{i,g} is thus a product of two terms:

\[
w^{trans}_{i,g} = \alpha_{i,g}\, \beta_{i,g} \,, \tag{3.18}
\]
where α_{i,g} addresses the effect in the first case (motion reducing tension), and β_{i,g} addresses the effect in the second case (object slackness). Both α_{i,g} and β_{i,g} are functions of some of q_g, q̇_g, p_i, or variables derived from these.

For p_i on the object, we find α_{i,g} is greatly impacted by v_ic(i,g) and v_g. We decompose v_g into v_g^rad, the component in the direction of v_ic(i,g), and v_g^perp, the component perpendicular to v_ic(i,g). We observed that if v_g^rad is in the opposite direction to v_ic(i,g), then it is more likely to make the intervening region slacker and thus reduce the transmission of force from the gripper to p_i. Moreover, if v_g^rad and v_ic(i,g) are in the same direction when the object is not already slack, p_i can move almost rigidly with q̇_g. Figure 3.2 shows an example of the impact of this alignment. Based on these observations, we design the function α_{i,g} = α(v_ic(i,g), q̇_g) with the following properties:

1. α(v_ic(i,g), q̇_g) ranges between 0 and 1.

2. α(v_ic(i,g), q̇_g) > α(v_jc(j,g), q̇_g) if ⟨v_ic(i,g), v_g⟩ > ⟨v_jc(j,g), v_g⟩ and D_ic(i,g) = D_jc(j,g).

3. α(v_ic(i,g), q̇_g) > α(v_jc(j,g), q̇_g) if ⟨v_ic(i,g), v_g⟩ = ⟨v_jc(j,g), v_g⟩ and D_ic(i,g) > D_jc(j,g).

We give the definition of α(v_ic(i,g), q̇_g) below.

As mentioned above, β_{i,g} depends on the current slackness of the intervening region. Without other external forces applied on the object, the pulling force applied by the robot will tend to unwind or unfold the object eventually (we do not consider cases where the object is tied into knots). For this reason, the part of the intervening region on the object that is not already spread out is less likely to move rigidly with gripper g. One indicator that can address this property is the ratio between the Euclidean distance between p_i and p_c(i,g) and the geodesic distance D_ic(i,g) between them. We denote this ratio r_{i,g} = ‖v_ic(i,g)‖ / D_ic(i,g). A larger r_{i,g} indicates a tauter intervening region. A tauter intervening region is more likely to result in ṗ_i moving more rigidly. Thus we can design the function β_{i,g} = β(r_{i,g}) with the following properties:

1. $\beta(r_{i,g})$ ranges between 0 and 1.

2. $\beta(r_{i,g}) = 1$ if $r_{i,g} = 1$.

3. $\beta(r_{i,g}) > \beta(r_{j,g})$ if $r_{i,g} > r_{j,g}$.

Finally, $\theta_{i,g}$, which captures the influence of gripper $g$ on $p_i$, should have the following properties (where $k$ is the index of a different gripper on the robot):

1. $\theta_{i,g}$ ranges between 0 and 1.

2. $\theta_{i,g} < \theta_{i,k}$ if $D_{ic(i,g)} > D_{ic(i,k)}$.

3. $\sum_{m=1}^{G} \theta_{i,m} = 1$.

Through experimentation, we obtained good results with the following functions:

$$\alpha(v_{ic(i,g)}, \dot{q}_g) = e^{k_{dir} D_{ic(i,g)} \left( \cos\angle(v_{ic(i,g)}, v_g) - 1 \right)}$$
$$\beta(r_{i,g}) = \left( \frac{\|v_{ic(i,g)}\|}{D_{ic(i,g)}} \right)^{k_{dist}}$$
$$w^{rot}_{i,g} = e^{-k_{rot} D_{ic(i,g)}} \qquad (3.19)$$
$$\theta_{i,g} = \frac{x_g}{\sum_{m=1}^{G} x_m}, \qquad x_m = \frac{\min\{D_{ic(i,1)}, \dots, D_{ic(i,G)}\}}{D_{ic(i,m)}}$$

where $k_{dir}$, $k_{dist}$, and $k_{rot}$ are non-negative parameters. Specifically, a larger $k_{dir}$ indicates a greater diminishment of rigidity due to motion that reduces tension. A larger $k_{dist}$ indicates a greater diminishment of rigidity due to the slackness of the object in its current state. A larger $k_{rot}$ indicates a faster decrease in rotational rigidity as the distance from $p_i$ to the gripper increases.
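To make these definitions concrete, the following is a minimal numpy sketch of the weighting functions in Eq. (3.19) for a single point–gripper pair. The function names (`rigidity_weights`, `gripper_influence`) and argument conventions are ours, not the dissertation's; we assume the geodesic distances and grasped-point correspondences have already been computed.

```python
import numpy as np

def rigidity_weights(D_ig, v_ig, v_g, k_dir=4.0, k_dist=10.0, k_rot=20.0):
    """Sketch of the per-point weights in Eq. (3.19).
    D_ig: geodesic distance (> 0) from p_i to the closest point grasped by gripper g
    v_ig: relative position of p_c(i,g) with respect to p_i, shape (3,)
    v_g:  translational velocity of gripper g, shape (3,)"""
    # alpha: motion-reducing-tension term, e^{k_dir * D * (cos(angle) - 1)}
    denom = np.linalg.norm(v_ig) * np.linalg.norm(v_g)
    cos_angle = np.dot(v_ig, v_g) / denom if denom > 1e-12 else 1.0
    alpha = np.exp(k_dir * D_ig * (cos_angle - 1.0))
    # beta: slackness term, (Euclidean / geodesic distance)^{k_dist};
    # clamping the ratio at 1 is our assumption for slightly overstretched states
    r_ig = np.linalg.norm(v_ig) / D_ig
    beta = min(r_ig, 1.0) ** k_dist
    # rotational weight decays with geodesic distance alone
    w_rot = np.exp(-k_rot * D_ig)
    return alpha * beta, w_rot

def gripper_influence(D_i_all):
    """theta_{i,g} for every gripper g: normalized so the weights sum to 1."""
    D = np.asarray(D_i_all, dtype=float)
    x = np.min(D) / D
    return x / np.sum(x)
```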

3.3.2.1 Obstacle Penetration Constraints

By combining the contributions of each individual gripper using the model developed above, we get a prediction of a point's movement from

$$\dot{\tilde{p}}_i = \begin{bmatrix} J(i, 1) & \dots & J(i, G) \end{bmatrix} \dot{q} = J_i(q, \dot{q}, \mathcal{P})\, \dot{q} \qquad (3.20)$$

However, at this stage we have not taken into account the effect of the obstacles $\mathcal{O}$. Thus the predicted $\dot{\tilde{p}}_i$ can move $p_i$ into an obstacle.

When the prediction of $p_i$ enters the obstacle, we project any penetration by the predicted $\dot{\tilde{p}}_i$ into the tangent space of the obstacle surface (Figure 3.4). Let $d_i < \|\dot{\tilde{p}}_i\|$ be the distance to collision in direction $\dot{\tilde{p}}_i$ from point $p_i$; let $\lambda_i = \frac{d_i}{\|\dot{\tilde{p}}_i\|}$; let $\hat{n}_i$ be the unit surface normal of the obstacle in contact; and let $N_i = \left( I_{3\times3} - \hat{n}_i \hat{n}_i^T \right)$. Then to account for obstacles we compute

$$\tilde{J}_i(q, \dot{q}, \mathcal{P}, \mathcal{O}) = \begin{cases} \left( \lambda_i + (1 - \lambda_i) N_i \right) J_i(q, \dot{q}, \mathcal{P}) & \text{if } p_i + \dot{\tilde{p}}_i \text{ is in collision} \\ J_i(q, \dot{q}, \mathcal{P}) & \text{otherwise} \end{cases} \qquad (3.21)$$


Figure 3.4: Projection process for points that are predicted to be in collision after movement.

To generate $J$ for all the points and grippers we compute $J_i(q, \dot{q}, \mathcal{P})$ for each $p_i$. These matrices are modified using the penetration constraints to get $\tilde{J}_i(q, \dot{q}, \mathcal{P}, \mathcal{O})$, and are then stacked to obtain $J(q, \dot{q}, \mathcal{P}, \mathcal{O})$. Finally, we arrive at our approximate model: $\phi(q, \dot{q}, \mathcal{P}, \mathcal{O}) = J(q, \dot{q}, \mathcal{P}, \mathcal{O})\, \dot{q}$.
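The following is a small numpy sketch of the projection in Eq. (3.21) for a single point. The helper name `project_jacobian` and its arguments are hypothetical; in practice the distance to collision and the surface normal would come from a collision checker.

```python
import numpy as np

def project_jacobian(J_i, p_dot_tilde, dist_to_collision, normal):
    """Damp the predicted motion of a point that would penetrate an obstacle
    by projecting the excess onto the obstacle's tangent plane (Eq. 3.21).
    J_i: 3 x Q Jacobian for point i, with p_dot_tilde = J_i @ q_dot
    dist_to_collision: distance d_i to the obstacle along p_dot_tilde
    normal: unit outward surface normal n_i at the contact, shape (3,)"""
    step = np.linalg.norm(p_dot_tilde)
    if dist_to_collision >= step:       # prediction stays collision-free
        return J_i
    lam = dist_to_collision / step      # fraction of the motion that is allowed
    n = normal.reshape(3, 1)
    N = np.eye(3) - n @ n.T             # projector onto the tangent plane
    return (lam * np.eye(3) + (1.0 - lam) * N) @ J_i
```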

3.4 Results

Our goal for the constrained directional rigidity model is to improve the accuracy of the deformable object motion model (for use in the controller in Section 4.4) while maintaining reasonable computation speed. Our benchmark model is the diminishing rigidity model described in Section 3.2. To evaluate our method we perform experiments in simulation and on a physical robot. The simulator used is Bullet physics [71]; however, we emphasize that our method has no knowledge of the simulation parameters or simulation methods used therein. The simulator is used as a "black box," mainly to stand in for a perception system and to allow us to do repeatable experiments. The physical robot consists of two KUKA iiwa 7DoF arms with Robotiq 3-finger hands.

We ran experiments with scenarios involving both cloth and rope. The parameters we used for the benchmark method (Section 3.2) are the best values found in [7]: $k_{trans} = k_{rot} = 10$ for rope and $k_{trans} = k_{rot} = 14$ for cloth. The parameters for the new model (Section 3.3) are set as $k_{dir} = 4$, $k_{dist} = 10$, $k_{rot} = 20$. All experiments were run on an i7-8700K 3.7 GHz CPU with 32 GB of RAM. A video showing the experiments can be found at https://www.youtube.com/watch?v=Y-wPsPdQVgg.

3.4.1 Simulation Environment Model Accuracy Results

We evaluated model accuracy by pulling the rope in a straight line along the direction of the rope, then turning the gripper and pulling back towards the rope as shown in Figure 3.2. As shown in Figure 3.5, our new model is a better approximation of the true motion when the gripper is pulling the rope. When the gripper is turning, both the baseline and the new model produce comparable error, but when the gripper starts pulling again (this time in the opposite direction), the new model is a significantly better approximation.

We also evaluated model accuracy by pulling the cloth in a similar fashion: pulling the cloth one way, turning the grippers, and then pulling in the opposite direction. As shown in Figure 3.6, our new model is a better approximation of the true motion when the grippers are pulling the cloth. As in the rope test, when rotating the grippers both models produce comparable error. While the cloth is folded on itself both models produce noisy results, but when the cloth lies flat again, the new model achieves lower error.

Figure 3.5: RMS model prediction error for the simulated rope model accuracy test. The gripper pulls the rope for the first 4.5 seconds, then turns for half a second, then moves in the opposite direction at the 5 second mark.

Figure 3.6: RMS model prediction error for the simulated cloth model accuracy test. The grippers pull the cloth for the first 2.3 seconds, then turn for 0.63 seconds, then move in the opposite direction at the 2.93 second mark. At the 5 second mark the cloth is no longer folded.

3.4.2 Physical Robot Experiments

To evaluate our new model on a physical system, we set up an experiment with a cloth-like object manipulated by two 7DoF KUKA iiwa arms (Figure 3.7). To sense the position of the cloth, we use the AprilTags [72] and IAI Kinect2 [73] libraries. We use the same parameters as we used for the simulated experiments. This test, which evaluates model accuracy, uses a motion profile similar to the simulation accuracy tests (Figure 3.8). Similar to the simulation results, the new model improves performance when dragging the cloth (first and last sections of Figure 3.8), and is comparable during rotational motion and when the cloth is resting on an edge perpendicular to the table (see video).

Figure 3.7: Initial setup for the physical robot model accuracy experiment.

Figure 3.8: RMS model prediction error for the physical cloth accuracy test. The grippers pull the cloth toward the robot for the first 10 timesteps, upward for 5 timesteps, rotate for 15 timesteps, diagonally down and away for 9 timesteps, then directly away from the robot.

3.4.3 Computation Time

To verify the practicality of our method, we gathered data comparing its computation time to the benchmark's and to using the Bullet simulator for a variety of tasks (see Figure 3.9 and Section 4.5). Table 3.1 shows a comparison between the average time needed to evaluate the new model and the time needed to simulate a gripper motion with the Bullet simulator. Note that the amount of time required for the simulator to converge to a stable estimate depends on many conditions, including what object is being simulated. Through experimentation we determined that 4 simulation steps were adequate for rope and 10 for cloth. Comparing the time needed to do this simulation to the time needed to evaluate our model, we see that the new model is indeed faster by at least an order of magnitude, in some cases by two orders of magnitude, confirming that, despite being slower than the diminishing rigidity model, our method still outperforms the simulator in terms of computation time. This is particularly important given the average number of times a model is evaluated in a control loop.


Figure 3.9: Initial state of the four experiments, where the red points act as attractors for the deformable object. (a) Rope wrapping cylinder. (b) Cloth passing single pole. (c) Cloth covering two cylinders. (d) Rope matching zig-path.

Table 3.1: Top two rows: Mean computation time (ms) per model prediction for a given gripper motion. BT: Bullet simulator; CDR: constrained directional rigidity. Bottom row: Mean number of times the model was evaluated when executing the controller in Section 4.4.

           rope-wrapping   rope-matching   cloth-passing   cloth-wrapping
           -cylinder       -cylinder       -single-pole    -two-cylinder
BT         0.686           0.571           19.29           3.680
CDR        0.029           0.014           1.172           0.339
# evals    50.72           143.5           83.81           63.32

CHAPTER IV

Local Control

The previous chapter presented multiple models that approximate the effects of gripper motion on the deformable object. Next we introduce controllers that use these models as part of our framework for performing a broad range of tasks. The role of the local controller is not to perform the whole task, but rather to refine the configuration of the deformable object locally. For our local controller we use a controller of the form introduced in [7] and [8]. These controllers locally minimize error while avoiding robot collision and excessive stretching of the deformable object. We present two different methods for addressing overstretch in Sections 4.3 and 4.4. Both of these controllers rely on the same method for computing the direction to move the deformable object in order to reduce task error.

4.1 Problem Statement

We define a task based on a set of $T$ target points $\mathcal{T} \in \mathbb{R}^{3T}$, a function $\rho(\mathcal{T}, \mathcal{P}) \geq 0$ which measures the alignment error between $\mathcal{P}$ and $\mathcal{T}$, and a termination function $\Omega(\mathcal{T}, \mathcal{P})$ which indicates if the task is finished. The methods we present in this chapter

are local, i.e., at each time $t$ they choose an incremental movement $\dot{q}_t$ which reduces the alignment error as much as possible at time $t + 1$:

$$\min_{\dot{q}_t} \rho\left( \mathcal{T}, \mathcal{P}_{t+1} \right) \qquad (4.1)$$

where $\mathcal{P}_{t+1}$ is the result of executing $\dot{q}_t$ for one unit of time. $\dot{q}_t$ must also be feasible, i.e., it should not bring the grippers into collision with obstacles and should not cause the object to stretch excessively.

4.2 Reducing Task Error

We build on previous work [7], splitting the desired deformable object movement into two parts: an error correction part and a stretching correction part. When defining the direction we want to move the deformable object to minimize error, we calculate two values: the direction to move the deformable object points, $\dot{\mathcal{P}}_e$, and the importance of moving each deformable object point, $W_e$. This is analogous to computing the gradient of error, as well as an "importance factor" for each part of the gradient. We need these weights to be able to differentiate between points of the object where the error function is on a plateau and points where the error function is at a local minimum (Figure 4.1). Typically this is achieved using a Hessian; however, our error function does not have a second derivative at many points.

Figure 4.1: Top Line: moving the point does not change the error, thus the desired movement is zero; however, it is not important to achieve zero movement, thus $W_d = 0$. Bottom Line: error is at a local minimum; thus moving the point increases error.

In order to calculate $\dot{\mathcal{P}}_e$ and $W_e$, we start by defining a workspace navigation function towards each target point $\mathcal{T}_k \in \mathcal{T}$ using Dijkstra's algorithm. This gives us the shortest collision-free path between any point in the workspace and the target point, as well as the distance travelled along that path. Using these distances, at every timestep for every target point $\mathcal{T}_k$, we recalculate which point on the deformable object $p_i$ is closest (Alg. 1). The directions each navigation function indicates are added together to define the overall direction to manipulate a point (Alg. 2 line 5).

For the importance factors $W_{e,i}$, we take only the largest distance that $p_i$ would have to move, as a way to mitigate discretization effects (Alg. 2 line 6).

4.3 Stretching Avoidance Controller

An outline of how this controller functions is shown in Alg. 3; first, we calculate the error reduction direction and weight as discussed in the previous section (Lines 1 and 2). These error reduction terms are then combined with stretching avoidance terms $\dot{\mathcal{P}}_s, W_s$ to define the desired manipulation direction and importance weights

26 Algorithm 1 CalculateCorrespondences(P, T )

1: $PC \leftarrow [\emptyset]_{1 \times P}$
2: for $k \in \{1, 2, \dots, T\}$ do
3:   $i \leftarrow \operatorname{argmin}_{j \in \{1,2,\dots,P\}} d_{Dijkstras}(\mathcal{T}_k, p_j)$
4:   $d \leftarrow d_{Dijkstras}(\mathcal{T}_k, p_i)$
5:   $PC[i] \leftarrow \{PC[i] \cup (k, d)\}$
6: end for
7: return $PC$

Algorithm 2 FollowNavigationFunction($\mathcal{P}, PC$)
1: $\dot{\mathcal{P}}_e \leftarrow 0_{3P \times 1}$
2: $W_e \leftarrow 0_{P \times 1}$
3: for $i \in \{1, 2, \dots, P\}$ do
4:   for $(k, d) \in PC[i]$ do
5:     $\dot{p}_{e,i} \leftarrow \dot{p}_{e,i} +$ DijkstrasNextStep($p_i, k$)
6:     $W_{e,i} \leftarrow \max(W_{e,i}, d)$
7:   end for
8: end for
9: return $\dot{\mathcal{P}}_e, W_e$
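As a rough illustration of Algs. 1 and 2 combined, the sketch below accumulates a desired motion and importance weight per deformable object point. The helpers `nav_dist(k, p)` and `nav_step(p, k)` are hypothetical stand-ins for a precomputed Dijkstra navigation field: the collision-free path length from point `p` to target `k`, and the first step along that path, respectively.

```python
import numpy as np

def follow_navigation_function(P, num_targets, nav_dist, nav_step):
    """Sketch of Algs. 1-2: each target pulls on its closest (by Dijkstra
    distance) object point; weights keep only the largest pull distance."""
    P = np.asarray(P)                      # (num_points, 3) object points
    P_dot_e = np.zeros_like(P)             # desired per-point motion
    W_e = np.zeros(len(P))                 # per-point importance weights
    for k in range(num_targets):
        dists = [nav_dist(k, p) for p in P]
        i = int(np.argmin(dists))          # Alg. 1: closest object point
        P_dot_e[i] += nav_step(P[i], k)    # Alg. 2 line 5: sum directions
        W_e[i] = max(W_e[i], dists[i])     # Alg. 2 line 6: max distance
    return P_dot_e, W_e
```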

$\dot{\mathcal{P}}_d, W_d$ at each timestep (Lines 3 and 4). We then find the best robot motion to achieve the desired deformable object motion, while preventing collision between the robot and obstacles (Line 5).

Algorithm 3 StretchingAvoidanceController(q, P, T )

1: $PC \leftarrow$ CalculateCorrespondences($\mathcal{P}, \mathcal{T}$)
2: $\dot{\mathcal{P}}_e, W_e \leftarrow$ FollowNavigationFunction($\mathcal{P}, PC$)
3: $\dot{\mathcal{P}}_s, W_s \leftarrow$ StretchingCorrection($\mathcal{P}$)
4: $\dot{\mathcal{P}}_d, W_d \leftarrow$ CombineTerms($\dot{\mathcal{P}}_e, W_e, \dot{\mathcal{P}}_s, W_s$)
5: $\dot{q}^{cmd} \leftarrow$ FindBestRobotMotion($q, \mathcal{P}, \dot{\mathcal{P}}_d, W_d$)

4.3.1 Stretching Correction

Our algorithm for stretching correction is similar to that found in [7], with the addition of a weighting term $k_s$ and a change in how we combine error correction and stretching correction. We use the StretchingCorrection() function (Alg. 4) to compute $\dot{\mathcal{P}}_s$ and $W_s$ based on a task-defined maximum stretching ratio $\gamma^{max} \geq 0$. First we compute the distance between every two points on the object and store the result in $E$. We then compare $E$ to $D$, which contains the relaxed lengths between every pair of points. If any two neighbouring points are stretched by more than a factor of $\gamma^{max}$, we attempt to move the points closer to each other. We use the same strategy for setting the importance of this stretching correction $W_s$ as we use for error correction. When combining stretching correction and error correction terms (Alg. 5) we prioritize stretching correction, accepting only the portion of the error correction that is orthogonal to the stretching correction term for each point. $k_s$ is used to define the relative scale of the importance factors $W_e$ and $W_s$.

Algorithm 4 StretchingCorrection($\mathcal{P}$)
1: $E \leftarrow$ EuclidianDistanceMatrix($\mathcal{P}$)
2: $\dot{\mathcal{P}}_s \leftarrow 0_{3P \times 1}$
3: $W_s \leftarrow 0_{P \times 1}$
4: for $i \in \{1, 2, \dots, P\}$ do
5:   for $j \in$ Neighbours($i$) do
6:     if $i < j$ and $E_{ij} > \gamma^{max} D_{ij}$ then
7:       $\Delta_{ij} \leftarrow E_{ij} - D_{ij}$
8:       $v \leftarrow \Delta_{ij}(p_j - p_i)$
9:       $\dot{p}_{s,i} \leftarrow \dot{p}_{s,i} + \frac{1}{2}v$
10:      $\dot{p}_{s,j} \leftarrow \dot{p}_{s,j} - \frac{1}{2}v$
11:      $W_{s,i} \leftarrow \max(W_{s,i}, \Delta_{ij})$
12:      $W_{s,j} \leftarrow \max(W_{s,j}, \Delta_{ij})$
13:    end if
14:  end for
15: end for
16: return $\dot{\mathcal{P}}_s, W_s$

Algorithm 5 CombineTerms($\dot{\mathcal{P}}_e, W_e, \dot{\mathcal{P}}_s, W_s$)
1: for $i \in \{1, 2, \dots, P\}$ do
2:   $\dot{p}_{d,i} \leftarrow \dot{p}_{s,i} + \left( \dot{p}_{e,i} - \text{Proj}_{\dot{p}_{s,i}}\, \dot{p}_{e,i} \right)$
3:   $W_{d,i} \leftarrow k_s W_{s,i} + W_{e,i}$
4: end for
5: return $\dot{\mathcal{P}}_d, W_d$
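Below is a minimal numpy sketch of Algs. 4 and 5 for the simple case of a rope, where each point's neighbours are just the adjacent points along the rope; the function names and the neighbour convention are our simplifying assumptions.

```python
import numpy as np

def stretching_correction(P, D, gamma_max):
    """Alg. 4 sketch for a rope. P: (n, 3) current points;
    D: (n, n) relaxed pairwise distances; gamma_max: stretch threshold."""
    n = len(P)
    P_dot_s = np.zeros((n, 3))
    W_s = np.zeros(n)
    for i in range(n - 1):
        j = i + 1                                   # rope neighbour
        e_ij = np.linalg.norm(P[j] - P[i])
        if e_ij > gamma_max * D[i, j]:              # overstretched pair
            delta = e_ij - D[i, j]
            v = delta * (P[j] - P[i])
            P_dot_s[i] += 0.5 * v                   # pull the points together
            P_dot_s[j] -= 0.5 * v
            W_s[i] = max(W_s[i], delta)
            W_s[j] = max(W_s[j], delta)
    return P_dot_s, W_s

def combine_terms(P_dot_e, W_e, P_dot_s, W_s, k_s):
    """Alg. 5 sketch: keep only the error-correction component orthogonal
    to the stretching correction for each point."""
    P_dot_d = np.zeros_like(P_dot_e)
    for i in range(len(P_dot_e)):
        s, e = P_dot_s[i], P_dot_e[i]
        norm_sq = np.dot(s, s)
        proj = (np.dot(e, s) / norm_sq) * s if norm_sq > 1e-12 else 0.0
        P_dot_d[i] = s + (e - proj)
    W_d = k_s * W_s + W_e
    return P_dot_d, W_d
```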

4.3.2 Finding the Best Robot Motion and Avoiding Collisions

Given a desired deformable object velocity $\dot{\mathcal{P}}_d$ and relative importance weights $W_d$, we want to find the robot motion that best achieves $(\dot{\mathcal{P}}_d, W_d)$, i.e.

$$\begin{aligned} \operatorname*{argmin}_{\dot{q}} \quad & \|\phi(q, \dot{q}, \mathcal{P}, \mathcal{O}) - \dot{\mathcal{P}}_d\|_{W_d} \\ \text{subject to} \quad & \|\dot{q}\| \leq \dot{q}^{max} \\ & (q + \dot{q}) \in \mathcal{Q}_{valid} \end{aligned} \qquad (4.2)$$

In general, $\phi(\cdot)$ is not known. For our stretching avoidance controller we use a Jacobian-based approximation (Chapter III):

$$\phi(q, \dot{q}, \mathcal{P}, \mathcal{O}) \approx J(q, \dot{q}, \mathcal{P}, \mathcal{O})\, \dot{q} \qquad (4.3)$$

Our method for ensuring that the robot stays in $\mathcal{Q}_{valid}$ differs depending on which robot we are using.

4.3.2.1 Simulated experiments:

For the simulated experiments, we first solve Eq. (4.2) using our Jacobian approximation while neglecting the collision constraints:

$$\begin{aligned} \psi_{se(3)}(\dot{\mathcal{P}}, W) = \operatorname*{argmin}_{\dot{q}} \quad & \|J\dot{q} - \dot{\mathcal{P}}\|_W \\ \text{subject to} \quad & \|\dot{q}_g\|_{se(3)} \leq \dot{q}^{max,s}_{se(3)}, \quad g = 1, \dots, G \end{aligned} \qquad (4.4)$$

where $\dot{q}^{max,s}_{se(3)}$ is the maximum velocity for each individual gripper. In order to guarantee that the grippers do not collide with any obstacles, we use the same strategy from [7], smoothly switching between collision avoidance and other objectives (see Alg. 7). For every gripper $g$ and an obstacle set $\mathcal{O}$ we find the distance $d_g$ to the nearest obstacle, a unit vector $\dot{x}_{p_g}$ pointing from the obstacle to the nearest point on the gripper, and a Jacobian $J_{p_g}$ between the gripper's DoF and the point on the gripper, as shown in Alg. 8. We then project the servoing motion from Eq. (4.4) into the null space of the avoidance motion using the null space projector $\left( I - J_{p_g}^{+} J_{p_g} \right)$. $\beta > 0$ sets the rate at which we change between servoing and collision avoidance objectives. $\dot{q}^{max,c}_{se(3)} > 0$ is an internal parameter that sets how quickly we move the robot away from obstacles.

Algorithm 6 FindBestRobotMotionSim($q, \mathcal{P}, \dot{\mathcal{P}}_d, W_d$)
1: $\dot{q}_s \leftarrow \psi_{se(3)}(\dot{\mathcal{P}}_d, W_d)$   {Eq. (4.4)}
2: $\dot{q}^{cmd} \leftarrow$ ObstacleRepulsion($\dot{q}_s, \mathcal{O}$)
3: return $\dot{q}^{cmd}$

Algorithm 7 ObstacleRepulsion($\dot{q}_s, \mathcal{O}$)
1: for $g \in \{1, \dots, G\}$ do
2:   $J_{p_g}, \dot{x}_{p_g}, d_g \leftarrow$ Proximity($q_g, \mathcal{O}$)
3:   $\lambda \leftarrow e^{-\beta d_g}$
4:   $v \leftarrow J_{p_g}^{+} \dot{x}_{p_g}$
5:   $\dot{q}_{g,c} \leftarrow \dot{q}^{max,c}_{se(3)} \frac{v}{\|v\|}$
6:   $\dot{q}_g \leftarrow \lambda \left( \dot{q}_{g,c} + \left( I - J_{p_g}^{+} J_{p_g} \right) \dot{q}_{g,s} \right) + (1 - \lambda)\, \dot{q}_{g,s}$
7: end for
8: return $\dot{q}$

Algorithm 8 Proximity(qg, O)

1: $d_g \leftarrow \infty$
2: for $o \in \{1, 2, \dots, |\mathcal{O}|\}$ do
3:   $p^g, p^o \leftarrow$ ClosestPoints($q_g, o$)
4:   $v \leftarrow p^g - p^o$
5:   if $\|v\| < d_g$ then
6:     $d_g \leftarrow \|v\|$
7:     $\dot{x}_{p_g} \leftarrow \frac{v}{\|v\|}$
8:     $J_{p_g} \leftarrow$ RobotPointJacobian($q_g, p^g$)
9:   end if
10: end for
11: return $J_{p_g}, \dot{x}_{p_g}, d_g$
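The core of Alg. 7 for a single gripper can be sketched as follows; the function signature is our own, and the proximity quantities ($J_{p_g}$, $\dot{x}_{p_g}$, $d_g$) are assumed to come from a routine like Alg. 8.

```python
import numpy as np

def obstacle_repulsion(q_dot_s, J_p, x_dot_p, d, beta, q_dot_max_c):
    """Blend a repulsive motion with the servoing motion q_dot_s, pushing
    the servoing term into the null space of the proximity Jacobian as the
    gripper nears an obstacle (Alg. 7, one gripper).
    J_p: 3 x 6 Jacobian of the closest point on the gripper
    x_dot_p: unit vector from the obstacle to that point, shape (3,)
    d: distance to the nearest obstacle."""
    lam = np.exp(-beta * d)                 # ~1 near contact, ~0 far away
    J_pinv = np.linalg.pinv(J_p)
    v = J_pinv @ x_dot_p                    # motion directly away from obstacle
    q_dot_c = q_dot_max_c * v / np.linalg.norm(v)
    null_proj = np.eye(J_p.shape[1]) - J_pinv @ J_p
    return lam * (q_dot_c + null_proj @ q_dot_s) + (1.0 - lam) * q_dot_s
```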

4.3.2.2 Physical experiments:

For the physical robot, instead of handling collision avoidance in a post-processing step, we build the collision constraints directly into the optimization function (Alg. 9).

To do so, we define a set of points $C = \{c_1, c_2, \dots\}$ on the robot that must stay at least $d_{buffer}$ away from obstacles. In our implementation, these are the end-effectors, wrists, and elbows of each arm of the robot. We then use a function equivalent to Proximity() for collision checking of the points in $C$ in order to maintain a minimum distance from collision:

$$\begin{aligned} \psi_{\mathbb{R}^N}(\dot{\mathcal{P}}, W) = \operatorname*{argmin}_{\dot{q}} \quad & \|J J_r \dot{q} - \dot{\mathcal{P}}\|^2_W \\ \text{subject to} \quad & q + \dot{q} \in \mathcal{Q}_{valid} \\ & \|\dot{q}\| \leq \dot{q}^{max}_{\mathbb{R}^N} \\ & \|J_{r,g}\dot{q}\| \leq \dot{q}^{max,s}_{se(3)}, \quad g = 1, \dots, G \\ & \dot{x}_{c_i}^T J_{c_i} \dot{q} \leq d_{c_i} + d_{buffer}, \quad i = 1, \dots, |C| \end{aligned} \qquad (4.5)$$

In addition, we constrain the velocity of the robot both in joint configuration space,
$$\|\dot{q}\| \leq \dot{q}^{max}_{\mathbb{R}^N},$$
and the velocity of the end-effectors in $SE(3)$,
$$\|J_{r,g}\dot{q}\| \leq \dot{q}^{max,s}_{se(3)},$$
where $J_r$ is the Jacobian between robot motion and end-effector motion for all grippers, and $J_{r,g}$ is the Jacobian for gripper $g$.

Algorithm 9 FindBestRobotMotionPhys($q, \mathcal{P}, \dot{\mathcal{P}}_d, W_d$)
1: for $g \in \{1, 2, \dots, |C|\}$ do
2:   $J_{p_g}, \dot{x}_{p_g}, d_g \leftarrow$ Proximity($\mathcal{O}, g$)
3: end for
4: $\dot{q}^{cmd} \leftarrow \psi_{\mathbb{R}^N}(\dot{\mathcal{P}}_d, W_d)$   {Eq. (4.5)}

To solve Equations (4.4) and (4.5) we use the Gurobi optimizer [74].
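For illustration only, the following sketch solves a simplified version of Eq. (4.4): it solves the unconstrained weighted least-squares problem with numpy and then scales the result to respect the speed limit, rather than solving the constrained QP exactly as the Gurobi-based implementation does. All names are our own.

```python
import numpy as np

def weighted_least_squares_motion(J, P_dot_d, W, q_dot_max):
    """Crude stand-in for Eq. (4.4). J: (3P, N) Jacobian; P_dot_d: (3P,)
    desired object motion; W: (P,) per-point importance weights."""
    # Weighted least squares: minimize || J q_dot - P_dot_d ||_W
    sqrt_w = np.sqrt(np.repeat(W, 3))     # per-point weights -> per-coordinate
    q_dot, *_ = np.linalg.lstsq(sqrt_w[:, None] * J, sqrt_w * P_dot_d,
                                rcond=None)
    # Enforce the speed bound by scaling (conservative, not the exact optimum)
    speed = np.linalg.norm(q_dot)
    if speed > q_dot_max:
        q_dot *= q_dot_max / speed
    return q_dot
```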

4.4 Stretching Constraint Controller

While the controller in the previous section has had some success at preventing excessive stretching [7], it is not able to prevent stretching when the grippers move on opposite sides of an obstacle (see video at https://www.youtube.com/watch?v=Y-wPsPdQVgg). To address this, we introduce a novel geometric constraint which directly addresses the cause of overstretch. This constraint is included directly in the optimization problem solved at each timestep, resulting in a straightforward control algorithm (Alg. 10).

Algorithm 10 ConstrainedController(q, P, T )

1: $PC \leftarrow$ CalculateCorrespondences($\mathcal{P}, \mathcal{T}$)
2: $\dot{\mathcal{P}}_e, W_e \leftarrow$ FollowNavigationFunction($\mathcal{P}, PC$)
3: $\dot{q}^{cmd} \leftarrow$ FindBestConstrainedRobotMotion($q, \mathcal{P}, \dot{\mathcal{P}}_e, W_e$)

4.4.1 Overstretch

The stretching avoidance of the deformable object is difficult to formulate due to the compliant and underactuated nature of deformable objects. In the previous section, a stretching correction term $\dot{\mathcal{P}}_s \in \mathbb{R}^{3P}$ is applied when the object becomes overstretched. However, this method cannot handle cases where the object is caught on an obstacle.

We detect the overstretching (i.e., excessive strain) of the object by examining the value of the stretching ratio $\gamma$, which denotes the maximum value among the ratios between the Euclidean distance $\|v_{ij}\|$ and the geodesic distance $D_{ij}$ for every pair of points $p_i$ and $p_j$:

$$\gamma = \max_{\substack{i,j \in 1,\dots,P \\ j > i}} \frac{\|v_{ij}\|}{D_{ij}}. \qquad (4.6)$$

Denote $\gamma^{max}$ as the maximum allowed stretching ratio; this controller initiates stretching avoidance when $\gamma > \gamma^{max}$.

We assume that the object starts in an unstretched state, so any overstretch that arises is due to the motion of the grippers. Thus if we can constrain gripper motions to a set which does not overstretch the object further than a threshold, we can prevent or reduce the overstretch at the next time step. We know that the force causing the overstretch comes from the grippers, so if we reduce the length of geodesic paths through the object between grippers, the strain on the object should decrease. When

overstretch is detected, we thus introduce a conical constraint for each gripper that shrinks the allowable $\dot{q}_g$ to reduce the length of the geodesics between the grippers. A conical constraint is constructed for each gripper and points along the stretching avoidance vector, which is an estimate of the direction to move to decrease the strain.

For a pair of grippers with indices $g$ and $k$, two stretching avoidance vectors are defined, one for each gripper. Let $I_g(q_g, q_k)$ be the index of the point grasped by the $g$th gripper which has the minimum geodesic distance to the set of points grasped by $q_k$. We define $I_k(q_g, q_k)$ similarly. Let $\int_{gk}$ be the geodesic on the object from $p_{I_g(q_g,q_k)}$ to $p_{I_k(q_g,q_k)}$. We denote $u_g^k$ and $u_k^g$ as the pair of stretching avoidance vectors on grippers $g$ and $k$ respectively. Then $u_g^k$ is the tangent vector of $\int_{gk}$ at $p_{I_g(q_g,q_k)}$ and $u_k^g$ is the tangent vector of $\int_{kg}$ at the point $p_{I_k(q_g,q_k)}$ (as shown in Figure 4.2).

Figure 4.2: The arrows in gray show the direction of each stretching vector at the corresponding gripper with respect to the gripper pair $q_g$ and $q_k$. Left: stretching vectors on the rope when the rope is at rest (above) or is deformed (below). Right: stretching vectors on the cloth when the cloth is at rest (above) or is deformed (below). The red lines denote the geodesic connecting the corresponding $p_{I_g(q_g,q_k)}$ and $p_{I_k(q_g,q_k)}$ on the object.

To specify the stretching constraint, we first define the function $s(\dot{q}_g, q_g, q_k, \mathcal{P})$, which specifies the constraint on gripper $g$ defined by the interaction of grippers $g$ and $k$. Correspondingly, $u_g^k$ is the stretching avoidance vector for gripper $g$, which is the tangent vector of $\int_{gk}$ at $p_{I_g(q_g,q_k)}$. The larger the value of $s$, the more we expect the geodesic path length between the grippers to be reduced. Thus, $s$ should decrease as $\angle(\dot{q}_g, u_g^k)$ increases. Assuming we wish to have a lower bound $s_s$ on $s$, then $C_s$ is a set of constraints $C_s = \{C_s^1, \dots, C_s^G\}$, where each constraint is:

$$C_s^g = \left\{ \dot{q}_g \in se(3) \mid \forall k \neq g,\; s(\dot{q}_g, q_g, q_k, \mathcal{P}) \geq s_s \right\} \qquad (4.7)$$

Many functions can satisfy the requirements of s. In our work, we specify the function

as
$$s(\dot{q}_g, q_g, q_k, \mathcal{P}) = \cos\angle(\dot{q}_g, u_g^k) = \cos\angle(v_g, u_g^k)\,. \qquad (4.8)$$

4.4.2 Collision

Collision avoidance for the robot is addressed by the constraint Cc, which is the set of motions that keeps the grippers away from obstacles:

$$C_c = \left\{ \dot{q}_g = \begin{bmatrix} v_g^T & \omega_g^T \end{bmatrix}^T \in se(3) \;\middle|\; d_{buffer} - l(q_g) - \frac{n(q_g)^T v_g}{\|v_g\|}\, \|v_g\|\, \Delta t < 0 \right\} \qquad (4.9)$$

where $l(q_g)$ is the function returning the distance from the gripper to its closest obstacle, and $n(q_g)$ returns the unit surface normal of the obstacle closest to the $g$th gripper. The idea is to make each gripper keep at least a safe distance away from the closest obstacle. While we consider free-flying grippers in this section, similar constraints can be imposed on the entire geometry of a robot arm to avoid collisions all along the arm, as was done in Section 4.3.2.2.

4.4.3 Optimization Method

Given these constraints, we then formulate an optimization problem similar to Eq. (4.4), replacing the approximate Jacobian model with the directional rigidity model (Section 3.3), and adding the new stretching and collision constraints.

$$\begin{aligned} \psi(\dot{\mathcal{P}}, W) = \operatorname*{argmin}_{\dot{q}} \quad & \|\phi(q, \dot{q}, \mathcal{P}, \mathcal{O}) - \dot{\mathcal{P}}_e\|_{W_e} \\ \text{subject to} \quad & \dot{q} \in C_s \\ & \dot{q} \in C_c \\ & \|\dot{q}_g\|_{se(3)} \leq \dot{q}^{max,s}_{se(3)}, \quad g = 1, \dots, G \end{aligned} \qquad (4.10)$$

Because our objective function is not necessarily convex, we use a custom optimization method to solve the problem specified in Eq. (4.10). Our method is a type of numerical gradient descent with an additional projection step to enforce constraints. This addresses the constraints, but we are still using a local optimization method to solve a non-convex problem; in practice this has been a significant limitation for our experiments.

Our method's outer loop computes the numerical gradient of the objective function. An inner loop then performs backtracking line search to find the gradient step size. However, the gradient step may cross a constraint boundary, so after we compute the step size, we check whether any constraint has been violated after taking the step. If it has, we project the step back to the feasible space. A simple projection to the boundary of a violated constraint may satisfy that constraint but violate others. Instead, to perform the projection, we solve a convex optimization problem (using the Gurobi optimizer [74]) to find the nearest feasible point. This is possible because all the constraints in our problem are convex. Once such a point is found, the outer loop continues to iterate until convergence.
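A skeletal version of this projected gradient loop might look as follows; the `project` callback stands in for the Gurobi-based convex program that finds the nearest feasible point, and all names are our own.

```python
import numpy as np

def projected_gradient_descent(objective, project, q_dot0, step0=1e-2,
                               iters=100, tol=1e-6):
    """Numerical gradient descent with backtracking line search and a
    projection back to the (convex) feasible set after each step."""
    q = np.asarray(q_dot0, dtype=float)
    eps = 1e-6
    for _ in range(iters):
        # Central-difference gradient of the (possibly non-convex) objective
        grad = np.array([(objective(q + eps * e) - objective(q - eps * e))
                         / (2 * eps) for e in np.eye(len(q))])
        # Backtracking line search for the step size
        step = step0
        while step > 1e-10 and objective(q - step * grad) >= objective(q):
            step *= 0.5
        # Pull the step back into the feasible set if constraints are violated
        q_new = project(q - step * grad)
        if np.linalg.norm(q_new - q) < tol:
            return q_new
        q = q_new
    return q
```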

4.5 Results

Our goal for the stretching constraint controller is to formulate a set of constraints for the controller that mitigate collision and excessive stretching issues. As mentioned in previous sections, our benchmark controller is based on [7] and described in Section 4.3, using the diminishing rigidity model from Section 3.2. To evaluate our method we perform experiments in simulation and on a physical robot. The simulator used is Bullet physics [71]; however, we emphasize that our method has no knowledge of the simulation parameters or simulation methods used therein. The simulator is used as a "black box," mainly to stand in for a perception system and to allow us to do repeatable experiments. The physical robot consists of two KUKA iiwa 7DoF arms with Robotiq 3-finger hands.

We ran experiments with scenarios involving both cloth and rope, using the same model parameters as Section 3.4. For both controllers we set $c = 0.0025$, $\dot{q}^{max,s}_{se(3)} = 0.2$, $\gamma^{max} = 1.1$ for rope, and $\gamma^{max} = 1.67$ for cloth. For the benchmark controller we additionally set $\dot{q}^{max,c}_{se(3)} = 0.2$, $\beta = 200$ for rope, and $\beta = 1000$ for cloth. For the stretching constraint controller we set $l_c = 0.023$ and $s_s = 0.4$. All experiments were run on an i7-8700K 3.7 GHz CPU with 32 GB of RAM. A video showing the experiments can be found at https://www.youtube.com/watch?v=Y-wPsPdQVgg.

4.5.1 Constraint Enforcement

Since the benchmark controller can already handle the collision constraint very well, and the new controller addresses the collision constraint in a similar way to the benchmark, there is not a significant difference in how the collision constraint is enforced. However, the stretching constraint shows a very clear improvement.

The metric for stretching avoidance is the stretching ratio $\gamma$ defined in Section 4.4.1. A controller with good stretching avoidance should prevent $\gamma$ from increasing beyond a certain threshold.

The two experiments we used for the stretching avoidance test are the rope-wrapping-cylinder and the cloth-passing-single-pole, shown in Figure 4.3 (a-b). We ran each controller separately for a fixed amount of time for each task and show $\gamma$ vs. time for both controllers in Figure 4.4. In both setups, the desired object motion $\dot{\mathcal{P}}_e$ generated by the Dijkstra field will tear the object unless overstretching is prevented.


Figure 4.3: Initial state of the four experiments, where the red points act as attractors for the deformable object. (a) Rope wrapping cylinder. (b) Cloth passing single pole. (c) Cloth covering two cylinders. (d) Rope matching zig-path.

Figure 4.4 shows that the new controller is able to prevent further stretching once the object is taut, for both the rope and the cloth. In the rope test, the new controller prevents overstretching with $s_s = 0.4$, as defined in Eq. (4.8). We can see that the $\gamma$ of the benchmark method keeps growing beyond this threshold, while the $\gamma$ of the new method stays close to the threshold. In the cloth test, the benchmark method's $\gamma$ (in purple) increases above the threshold $\gamma^{max} = 1.667$ for cloth, and a sudden drop in $\gamma$ happens after running the test for 2 seconds. This drop is the "tearing" point in the simulator. Although we still see overstretch with the new method for some settings of $s_s$, in all cases $\gamma$ converged before tearing happened (instead of growing without bound).

(a) Rope wrapping cylinder (b) Cloth passing single pole

Figure 4.4: (a) The red line shows the $\gamma$ of the benchmark and the blue line shows the $\gamma$ of the new controller with $s_s = 0.4$ throughout the simulation. (b) The purple line shows the $\gamma$ of the benchmark, and the blue, red, and yellow lines each show the $\gamma$ of the new controller with $s_s = 0.4$, $s_s = 0.6$, and $s_s = 0.8$, respectively.

4.5.2 Controller Task Performance

Besides the quantitative analysis of model accuracy and stretching avoidance, we ran two further experiments, rope-matching-zig-path and cloth-covering-two-cylinder, one each for the rope and the cloth, as shown in Figure 4.3, to see how the new method performs on coverage tasks. Both the benchmark and the new controller are able to perform these tasks with comparable performance, reaching approximately the same configurations when forward progress stops due to a local minimum (Figure 4.5) and completing the task (Figure 4.6). This result suggests that we have not lost functionality with respect to the benchmark despite changing the model and control method used.

Figure 4.5: Cloth-covering-two-cylinder task start and end configurations. Both controllers are unable to make progress due to a local minimum.

Figure 4.6: Rope-matching-zig-path start and end configurations. Both controllers are able to succeed at the task, bringing the rope into alignment with the desired path.

4.5.3 Physical Robot Experiment

To evaluate our new model and controller on a physical system, we set up an experiment with cloth-like objects manipulated by two 7DoF KUKA iiwa arms (Figure 4.7). To sense the position of the cloth, we use the AprilTags [72] and IAI Kinect2 [73] libraries. The parameters are set as $k_{dir} = 4$, $k_{dist} = 10$, $k_{rot} = 10$ for the new model, and $l_c = 0.08$ and $s_s = 0.6$ for the new controller. We set up a task similar to the cloth-passing-single-pole example using a paper towel. For this task, the baseline controller tears the paper towel, while the new controller avoids excessive overstretch, instead wrapping around the pole to reach a local minimum.

Figure 4.7: Initial setup for the physical robot stretching avoidance test.

4.5.4 Computation Time

To verify the practicality of our method, we gathered data comparing its computation time to the benchmark's and to using the Bullet simulator. Table 4.1 shows the average computation time of a call to the controller for the new method vs. the benchmark. As expected, the benchmark, which uses a linear model, is faster than the new method. However, the computation times for the new method are still reasonable for use in a control loop.

Table 4.1: Mean computation time (s) to compute the gripper motion for a given state. BM: stretching avoidance controller; NM: stretching constraint controller.

        rope-wrapping   rope-matching   cloth-passing   cloth-wrapping
        -cylinder       -cylinder       -single-pole    -two-cylinder
BM      0.0055          0.0054          0.0153          0.0037
NM      0.0342          0.0834          0.2363          0.1008

CHAPTER V

Estimating Model Utility

In the previous chapters, we have been working with a single model and a single controller for any given task. When given a new task, however, a new choice needs to be made about which model and controller are most suitable. Rather than assuming we have a single high-fidelity model of a deformable object interacting with its environment, our approach in this chapter is to have multiple models available for use, any one of which may be useful at a given time. We do not assume these models are correct; we simply treat the models as having some measurable utility for the task. The utility of a given model is the expected reduction in task error when using this model to generate robot motion. As the task proceeds, the utility of a given model may change, making other models more suitable for the current part of the task. However, without testing a model's prediction, we do not know its true utility. Testing every model in the set is impractical, as all models would need to be tested at every step, and performing a test changes the state of the object and may drive it into a local minimum. The key question is then which model should be selected for testing at a given time.

The central contribution of this chapter is framing the model selection problem as a Multi-armed Bandit (MAB) problem where the goal is to find the model that has the highest utility for a given task. An arm represents a single model of the deformable object; to "pull" an arm is to use the arm's model to generate and execute a velocity command for the robot. The reward received is the reduction in task error after executing the command. In order to determine which model has the highest utility we need to explore the model space; however, we also want to exploit the information we have gained by using models that we estimate to have high utility. One of the primary challenges in performing this exploration versus exploitation trade-off is that our models are inherently coupled and non-stationary; performing an action changes the state of the system, which can change the utility of every model, as well as the

reward of pulling each arm. While there is work that frames robust trajectory selection as a MAB problem [75], we are not aware of any previous work which either 1) frames model selection for deformable objects as a MAB problem; or 2) addresses the coupling between arms for non-stationary MAB problems. In our experiments, we show how to formulate a MAB problem with coupled arms for Jacobian-based models. We perform our experiments on three synthetic systems, and on three deformable object manipulation tasks in the Bullet Physics [71] simulator. We demonstrate that formulating model selection as a MAB problem is able to successfully perform all three manipulation tasks. We also show that our proposed MAB algorithm outperforms previous MAB methods on synthetic trials, and performs competitively on the manipulation tasks.

5.1 Problem Statement

Using similar notation as previous chapters, let a deformation model be defined as a function

$$\phi : se(3)^G \rightarrow \mathbb{R}^{3P} \qquad (5.1)$$

which maps a change in robot configuration $\dot{q}$ to a change in object configuration $\dot{\mathcal{P}}$. Let $\mathcal{M}$ be a set of $M$ deformable models which satisfy this definition (Chapter III). Each model is associated with a robot command function

$$\psi : \mathbb{R}^{3P} \times \mathbb{R}^{P} \rightarrow se(3)^G \qquad (5.2)$$

which maps a desired deformable object velocity $\dot{\mathcal{P}}$ and weight $W$ (Section 4.2) to a robot velocity command $\dot{q}$. The deformation model $\phi$ and robot command function $\psi$ also take the object and robot configuration $(\mathcal{P}, q)$ and environment $\mathcal{O}$ as additional input; however, these are frequently omitted for brevity. When a model $m$ is selected for testing, the model generates a gripper command

$$\dot{q}_m(t) = \psi_m(\dot{\mathcal{P}}(t), W(t)) \qquad (5.3)$$

which is then executed for one unit of time, moving the deformable object to configuration $\mathcal{P}(t+1)$. The problem we address in this chapter is which model $m \in \mathcal{M}$ to select in order to move $G$ grippers such that the points in $\mathcal{P}$ align as closely as possible with some task-defined set of $T$ target points $\mathcal{T} \subset \mathbb{R}^3$, while avoiding gripper collision and

excessive stretching of the deformable object. Each task defines a function $\rho$ which measures the alignment error between $\mathcal{P}$ and $\mathcal{T}$. The method we present is a local method which picks a single model $m^*$ at each timestep to treat as the true model. This model is then used to reduce error as much as possible while avoiding collision and excessive stretching.

$$m^* = \operatorname*{argmin}_{m \in \mathcal{M}} \rho(\mathcal{T}, \mathcal{P}(t+1)) \qquad (5.4)$$

We show that this problem can be treated as an instance of the multi-arm non-stationary dependent bandit problem.

5.2 Bandit-Based Model Selection

The primary difficulty with solving Eq. (5.4) directly is that the effectiveness of a particular model in minimizing error is unknown. It may be the case that no model in the set produces the optimal option; however, this does not prevent a model from being useful. In particular, the utility of a model may change from one task to another, and from one configuration to another, as the deformable object changes shape and moves in and out of contact with the environment.

We start by defining the utility $u_m(t) \in \mathbb{R}$ of a model as the expected improvement in task error $\rho$ if model $m$ is used to generate a robot command at time $t$. If we know which model has the highest utility then we can solve (5.4). This leads to a classic exploration versus exploitation trade-off where we need to explore the space of models in order to learn which one is the most useful, while also exploiting the knowledge we have already gained. The multi-armed bandit framework is explicitly designed to handle this trade-off. In the MAB framework, each arm represents a model in $\mathcal{M}$; to pull arm $m$ is to command the grippers with velocity $\dot{q}_m(t)$ (Eq. 5.3) for 1 unit of time. We then define the reward $r_m(t+1)$ after taking action $\dot{q}_m(t)$ as the improvement in error

$$r_m(t+1) = \rho(t) - \rho(t+1) = u_m(t) + w \qquad (5.5)$$

where $w$ is a zero-mean noise term. The goal is to pick a sequence of arm pulls to minimize total expected regret $R(T_f)$ over some (possibly infinite) horizon $T_f$

$$E[R(T_f)] = \sum_{t=1}^{T_f} \left( E[r^*(t)] - E[r(t)] \right) \qquad (5.6)$$

where $r^*(t)$ is the reward of the best model at time $t$. The next section describes how to use bandit-based model selection for deformable object manipulation.

5.3 MAB Formulation for Deformable Object Manipulation

Our algorithm (Alg. 11) can be broken down into four major sections and an initialization block. In the initialization block we pre-compute the geodesic distance (see Figure 3.1) between every pair of points in P when the deformable object is in its “natural” or “relaxed” state and store the result in D. These distances are used to construct the deformation models (Section 3.2), as well as to avoid overstretching the object (Section 4.3.1). At each iteration we:

1. pick a model to use to achieve the desired direction (Section 5.4);

2. compute the task-defined desired direction to move the deformable object (Sec- tion 4.2);

3. generate a velocity command using the chosen model (Section 4.3);

4. modify the command to avoid obstacles (Section 4.3);

5. update bandit algorithm parameters (Section 5.4).

5.4 Algorithms for MAB

Previous solutions [36, 41] to minimizing Eq. (5.6) assume that rewards for each arm are normally and independently distributed and then estimate the mean and variance of each Gaussian distribution. We test three algorithms in our experiments: Upper Confidence Bound for normally distributed bandits UCB1-Normal, Kalman Filter Based Solution to Non-Stationary Multi-arm Normal Bandits (KF-MANB), and our extension of KF-MANB, Kalman Filter Based Solution to Non-Stationary Multi-arm Normal Dependent Bandit (KF-MANDB).

5.4.1 UCB1-Normal

The UCB1-Normal algorithm [36] (Alg. 12) treats each arm (model) as independent, estimating an optimistic Upper Confidence Bound (UCB) for the utility of each model. The model with the highest UCB is used to command the robot at each

Algorithm 11 MainLoop($\mathcal{O}, \beta, \lambda$)
1: $t \leftarrow 0$
2: $D \leftarrow$ GeodesicDistanceMatrix($\mathcal{P}_{relaxed}$)
3: $\mathcal{M} \leftarrow$ InitializeModels($D$)
4: InitializeBanditAlgorithm()
5: $\mathcal{P}(0) \leftarrow$ SensePoints()
6: $q(0) \leftarrow$ SenseRobotConfig()
7: while true do
8:   $m \leftarrow$ SelectArmUsingBanditAlgorithm()
9:   $\mathcal{T} \leftarrow$ GetTargets()
10:  $PC \leftarrow$ CalculateCorrespondences($\mathcal{P}(t), \mathcal{T}$)
11:  $\dot{\mathcal{P}}_e, W_e \leftarrow$ FollowNavigationFunction($\mathcal{P}(t), PC$)
12:  $\dot{\mathcal{P}}_s, W_s \leftarrow$ StretchingCorrection($D, \gamma^{max}, \mathcal{P}$)
13:  $\dot{\mathcal{P}}_d, W_d \leftarrow$ CombineTerms($\dot{\mathcal{P}}_e, W_e, \dot{\mathcal{P}}_s, W_s, k_s$)
14:  $\dot{q}_d \leftarrow \psi_m(\dot{\mathcal{P}}_d, W_d)$
15:  $\dot{q} \leftarrow$ ObstacleRepulsion($\dot{q}_d, \mathcal{O}, \beta$)
16:  CommandConfiguration($q(t) + \dot{q}$)
17:  $\mathcal{P}(t+1) \leftarrow$ SensePoints()
18:  $q(t+1) \leftarrow$ SenseRobotConfig()
19:  UpdateBanditAlgorithm()
20:  $t \leftarrow t + 1$
21: end while

timestep. This algorithm assumes that the utility of each model is stationary, gradually shifting from exploration to exploitation as more information is gained. While our problem is non-stationary and dependent, we use UCB1-Normal as a baseline algorithm to compare against due to its prevalence in previous work.

5.4.2 KF-MANB

The Kalman Filter Based Solution to Non-Stationary Multi-arm Bandit (KF-MANB) algorithm [41] (Alg. 13) uses independent Kalman filters to estimate the utility distribution of each model, and then uses Thompson sampling [38] to choose which model to use at each timestep. Because this algorithm explicitly allows for non-stationary reward distributions, it is able to "switch" between models much faster than UCB1-Normal.

5.4.3 KF-MANDB

We also propose a variant of KF-MANB, replacing the independent Kalman filters with a single joint Kalman filter (Alg. 14). This enables us to capture the correlations

Algorithm 12 UCB1-Normal – reproduced from [36]
for $t = 1, 2, \dots$ do
  • If there is an arm which has been pulled fewer than $\lceil 8 \log t \rceil$ times, then pull this arm. If multiple arms qualify, we select the arm that has been pulled less, selecting the arm with the lower index in the case of a tie.
  • Otherwise pull arm $j$ that maximizes
$$\bar{u}_j + \sqrt{16 \cdot \frac{q_j - n_j \bar{u}_j^2}{n_j - 1} \cdot \frac{\ln(t-1)}{n_j}}$$
  where $\bar{u}_j$ is the average reward obtained from arm $j$, $q_j$ is the sum of squared rewards obtained from arm $j$, and $n_j$ is the number of times arm $j$ has been pulled so far.
  • Update $\bar{u}_j$ and $q_j$ with the obtained reward $r_j$.
end for
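For reference, the UCB1-Normal index from Alg. 12 reduces to a one-line computation per arm. This sketch assumes the arm has already been pulled at least twice (so the sample-variance term is defined) and that the forced-exploration rule above is handled separately.

```python
import numpy as np

def ucb1_normal_index(u_bar, q_sum, n, t):
    """UCB1-Normal index for one arm (see Alg. 12).
    u_bar: mean reward; q_sum: sum of squared rewards;
    n: pulls of this arm (>= 2); t: current timestep (>= 2)."""
    sample_variance = (q_sum - n * u_bar**2) / (n - 1)
    return u_bar + np.sqrt(16.0 * sample_variance * np.log(t - 1) / n)
```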

between models, allowing us to learn more from each pull. We start by defining utility as a linear system with Gaussian noise, with process model $u(t+1) = u(t) + v$ and observation model $r(t) = C(t)u(t) + w$, where $u(t)$ is our current estimate of the relative utility of each model, while $v$ and $w$ are zero-mean Gaussian noise terms. $C(t)$ is a row vector with a 1 in the column of the model we used and zeros elsewhere. The variance on $w$ is defined as $\sigma^2_{obs}\eta^2$. $\eta$ is a tuning parameter to scale the covariance to match the reward scale of the specific task, while $\sigma_{obs}$ controls how much we believe each new observation.

To define the process noise $v$ we want to leverage correlations between models; if two model commands are similar at the current time, the utility of these models is likely correlated. To measure the similarity between two models $i$ and $j$ we use the angle between their gripper velocity commands $\dot{q}_i$ and $\dot{q}_j$. This similarity is then used to directly construct a covariance matrix for each arm pull:

$$v \sim \mathcal{N}(0, \Sigma_{tr})$$
$$\Sigma_{tr} = \sigma^2_{tr}\eta^2 \left( \xi \Sigma_{sim} + (1 - \xi) I \right) \qquad (5.7)$$
$$\Sigma_{sim,i,j} = \frac{\langle \dot{q}_i, \dot{q}_j \rangle}{\|\dot{q}_i\| \|\dot{q}_j\|} = \cos\theta_{i,j}\,.$$

$\sigma_{tr}$ is the standard Kalman filter transition noise factor tuning parameter. $\xi \in [0, 1]$ is the correlation strength factor; larger $\xi$ gives more weight to the arm correlation, while smaller $\xi$ gives lower weight. When $\xi$ is zero, KF-MANDB will have the

Algorithm 13 KF-MANB – reproduced from [41]
Input: Number of bandit arms $M$; observation noise $\sigma^2_{obs}$; transition noise $\sigma^2_{tr}\eta^2$.
Initialization: $\bar{u}_1(1) = \bar{u}_2(1) = \dots = \bar{u}_M(1) = A$; $\sigma_1(1) = \sigma_2(1) = \dots = \sigma_M(1) = B$
  # Typically, $A$ can be set to 0, with $B$ being sufficiently large
for $t = 1, 2, \dots$ do
  1. For each arm $j \in \{1, \dots, M\}$, draw a value $x_j$ randomly from the associated normal distribution $f(x_j; \bar{u}_j(t), \sigma_j(t))$ with the parameters $(\bar{u}_j(t), \sigma_j(t))$.
  2. Pull the arm $i$ whose drawn $x_i$ is the largest one:
$$i = \operatorname*{argmax}_{j \in \{1,\dots,M\}} x_j.$$
  3. Receive reward $\tilde{r}_i$ from pulling arm $i$, and update parameters as follows:
  • Arm $i$:
$$\bar{u}_i(t+1) = \frac{\left( \sigma^2_i(t) + \sigma^2_{tr}\eta^2 \right) \cdot \tilde{r}_i + \sigma^2_{obs} \cdot \bar{u}_i(t)}{\sigma^2_i(t) + \sigma^2_{tr}\eta^2 + \sigma^2_{obs}}$$
$$\sigma^2_i(t+1) = \frac{\left( \sigma^2_i(t) + \sigma^2_{tr}\eta^2 \right) \sigma^2_{obs}}{\sigma^2_i(t) + \sigma^2_{tr}\eta^2 + \sigma^2_{obs}}$$
  • Arm $j \neq i$:
$$\bar{u}_j(t+1) = \bar{u}_j(t)$$
$$\sigma^2_j(t+1) = \sigma^2_j(t) + \sigma^2_{tr}$$
end for

end for same update rule as KF-MANB, thus we can view KF-MANDB as a generalization of KF-MANB, allowing for correlation between arms. After estimating the utility of each model and the noise parameters at the current timestep, these values are then passed into a Kalman filter which estimates a new joint distribution. The next step is the same as KF-MANB; we draw a sample from the resulting distribution, then use the model that yields the largest sample to generate the next robot command. In this way we automatically switch between exploration and exploitation as the system evolves; if we are uncertain of the utility of our models then we are more likely to choose different models from one timestep to the next. If we believe that we have accurate estimates of utility, then we are more likely to choose the model with the highest utility.

Algorithm 14 KF-MANDB
Input: Number of bandit arms $M$; observation noise $\sigma^2_{obs}$; transition noise $\sigma^2_{tr}\eta^2$.
Initialization: $\bar{u}(1) = A \in \mathbb{R}^M$; $\Sigma(1) = B \in \mathbb{R}^{M \times M}$
  # Typically, $A$ can be set to 0, with $B \succ 0$ and sufficiently large
for $t = 1, 2, \dots$ do
  1. For each arm $j \in \{1, \dots, M\}$, generate a gripper velocity command $\dot{q}_j$.
  2. Draw a value $x = \begin{bmatrix} x_1 & \dots & x_M \end{bmatrix}^T$ randomly from the joint normal distribution $f(x; \bar{u}(t), \Sigma(t))$ with the parameters $(\bar{u}(t), \Sigma(t))$.
  3. Pull the arm $i$ whose drawn $x_i$ is the largest one:
$$i = \operatorname*{argmax}_{j \in \{1,\dots,M\}} x_j.$$
  4. Receive reward $\tilde{r}_i$ from pulling arm $i$, and perform a standard Kalman filter prediction and update step:
  • Compute the a priori covariance estimate and Kalman gain ($\Sigma_{tr}$ is calculated using Eq. 5.7):
$$\hat{\Sigma} = \Sigma(t) + \Sigma_{tr}$$
$$S = C(t) \hat{\Sigma} C(t)^T + \sigma^2_{obs}$$
$$K = \hat{\Sigma} C(t)^T S^{-1}$$
  • Compute the a posteriori utility and covariance estimates:
$$\bar{u}(t+1) = \bar{u}(t) - K \left( C(t)\bar{u}(t) - \tilde{r}_i \right)$$
$$\Sigma(t+1) = \hat{\Sigma} - K C(t) \hat{\Sigma}$$
end for
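To summarize Alg. 14 in code, one iteration of KF-MANDB can be sketched with numpy as below. `reward_fn` is a hypothetical stand-in for executing the chosen command on the robot or simulator and measuring the error reduction; everything else follows Eq. (5.7) and the Kalman update in the algorithm.

```python
import numpy as np

def kf_mandb_step(u_bar, Sigma, q_dots, reward_fn,
                  sigma_obs2, sigma_tr2, eta, xi):
    """One KF-MANDB iteration. u_bar: (M,) utility means; Sigma: (M, M)
    joint covariance; q_dots: list of gripper commands, one per model."""
    M = len(u_bar)
    # Thompson sampling from the joint utility estimate
    x = np.random.multivariate_normal(u_bar, Sigma)
    i = int(np.argmax(x))
    r = reward_fn(i)                          # pull arm i, observe reward
    # Process noise: correlate arms by command similarity (Eq. 5.7)
    V = np.stack([q / max(np.linalg.norm(q), 1e-12) for q in q_dots])
    Sigma_sim = V @ V.T                       # cosine similarity matrix
    Sigma_tr = sigma_tr2 * eta**2 * (xi * Sigma_sim + (1 - xi) * np.eye(M))
    # Standard Kalman prediction/update with observation row C = e_i^T
    C = np.zeros((1, M)); C[0, i] = 1.0
    Sigma_hat = Sigma + Sigma_tr
    S = C @ Sigma_hat @ C.T + sigma_obs2
    K = Sigma_hat @ C.T / S
    u_new = u_bar - (K @ (C @ u_bar - r)).ravel()
    Sigma_new = Sigma_hat - K @ C @ Sigma_hat
    return u_new, Sigma_new, i
```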

5.5 Experiments and Results

We test our method on three synthetic tests and three deformable object manipulation tasks in simulation. The synthetic tasks show that the principles we use to estimate the coupling between models are reasonable, while the simulated tasks show that our method is effective at performing deformable object manipulation tasks.

Table 5.1 shows the parameters used by the Jacobian-based controller, while Table 5.2 shows the parameters used by the bandit algorithms for all experiments. We chose these parameters by comparing performance across noise factors $\sigma^2_{obs}$ and $\sigma^2_{tr}$ from $\{0.01, 0.1, 1, 10\}$ and correlation strength factor $\xi$ from $\{0.1, 0.5, 0.9, 0.99\}$. While performance on individual experiments could be marginally improved by using

different values, we found that $\sigma^2_{obs} = 0.01$, $\sigma^2_{tr} = 0.1$, and $\xi = 0.9$ resulted in robust performance for all of our manipulation tasks. $\eta$ is set dynamically and discussed in Section 5.5.1.

Table 5.1: Controller parameters

                                                        Synthetic   Rope      Table      Two Stage
                                                        Trials      Winding   Coverage   Coverage
se(3) inner product constant              c             -           0.0025    0.0025     0.0025
Servoing max gripper velocity             q̇_se(3)^max,s 0.1         0.2       0.2        0.2
Obstacle avoidance max gripper velocity   q̇_se(3)^max,c -           0.2       0.2        0.2
Obstacle avoidance scale factor           β             -           200       1000       1000
Stretching correction scale factor        k_s           -           0.005     0.03       0.03

Table 5.2: KF-MANB and KF-MANDB parameters

                                                  Synthetic   Rope      Table      Two Stage
                                                  Trials      Winding   Coverage   Coverage
Correlation strength factor ξ (KF-MANDB only)     0.9         0.9       0.9        0.9
Transition noise factor σ²_tr                     1           0.1       0.1        0.1
Observation noise factor σ²_obs                   1           0.01      0.01       0.01

5.5.1 Synthetic Tests

For the synthetic tests, we set up an underactuated system that is representative of manipulating a deformable object, with configuration $y \in \mathbb{R}^n$ and control input $\dot{x} \in \mathbb{R}^m$ such that $m < n$ and $\dot{y} = J\dot{x}$. To construct the Jacobian of this system we start with

$$J = \begin{bmatrix} I_{m \times m} \\ 0_{(n-m) \times m} \end{bmatrix}$$

and add uniform noise drawn from $[-0.1, 0.1]$ to each element of $J$. The system configuration starts at $\begin{bmatrix} 10 & \dots & 10 \end{bmatrix}^T$ with the target configuration set to the origin. Error is defined as $\rho(t) = \|y(t)\|$, and the desired direction to move the system at each timestep is $\dot{y}_d(t) = -y(t)$. These tasks have

no obstacles or stretching, thus $\beta$, $k_s$, and $\dot{q}^{max,c}_{se(3)}$ are unused. Rather than setting the utility noise scale $\eta$ a priori, we use an annealing filter

$$\eta(t+1) = \max\left( 10^{-10},\; 0.9\,\eta(t) + 0.1\,|r(t+1)| \right). \qquad (5.8)$$

This enables us to track the changing available reward as the system gets closer to the target. To generate a model for the model set we start with the true Jacobian J and add uniform noise drawn from [−0.025, 0.025] to each element of J. For an individual trial, each bandit algorithm uses the same J and the same model set. Each bandit algorithm receives the same random number stream during a trial, ensuring that a more favourable stream doesn’t bias results. We ran one small test using a 3 × 2 Jacobian with 10 arms in order to yield results that are easily visualised. The second and third tests are representative of the scale of the simulation experiments, using the same number of models and similar sizes of Jacobian as are used in simulation. A single trial consists of 1000 pulls (1000 commanded actions); each test was performed 100 times to generate statistically significant results. Our results in Table 5.3 show that KF-MANDB clearly performs the best for all three tests.

Table 5.3: Synthetic trial results showing total regret with standard deviation in brackets for all bandit algorithms for 100 runs of each setup.

# of Models   # of rows in J   # of cols in J   UCB1-Normal   KF-MANB       KF-MANDB
10            3                2                4.41 [1.65]   3.62 [1.73]   2.99 [1.40]
60            147              6                5.57 [1.37]   4.89 [1.32]   4.53 [1.42]
60            6075             12               4.21 [0.64]   3.30 [0.56]   2.56 [0.54]

5.5.2 Simulation Trials

We now demonstrate the effectiveness of multi-arm bandit techniques on three example tasks, show how to encode those tasks for use in our framework, and discuss experimental results. The first task shows how our method can be applied to a rope, with the goal of winding the rope around a cylinder in the environment. The second and third tasks show the method applied to cloth. In the second task, two grippers manipulate the cloth so that it covers a table. In the third task, we perform a two- stage coverage task, covering portions of two different cylinders. In all three tasks, the alignment error ρ(P, T ) is measured as the sum of the distances between every

49 point in T and the closest point in P in meters. Figure 5.1 shows the target points in red, and the deformable object in green. A video showing the experiments can be found at https://www.youtube.com/watch?v=d6ma_Kg8QlQ.


Figure 5.1: Sequence of snapshots showing the execution of the simulated experiments using the KF-MANDB algorithm. The rope and cloth are shown in green, the grippers are shown in blue, and the target points are shown in red. The bottom row additionally shows $\dot{\mathcal{P}}_d$ as green rays with red tips.

All experiments were conducted in the open-source Bullet simulator [71], with additional wrapper code developed at UC Berkeley. The rope is modeled as a series of 49 small capsules linked together by springs and is 1.225 m long. The cloth is modeled as a triangle mesh of size 0.5 m × 0.5 m for the table coverage task, and size 0.5 m × 0.625 m for the two-stage coverage task. We emphasize that our method does not have access to the model of the deformable object or the simulation parameters. The simulator is used as a "black box" for testing.1

In addition to the diminishing rigidity model introduced in Section 3.2, we also use adaptive Jacobian models based on the work of Navarro-Alarcon et al. [26]. This formulation uses an online estimation method to approximate $J(q, \mathcal{P})$. First we begin with some estimate of the Jacobian $\tilde{J}(0)$ at time $t = 0$ and then use the Broyden

1Our code is available at https://github.com/UM-ARM-Lab/mab_ms.

50 update rule [76] to update J˜(t) at each timestep t

$$\tilde{J}(t) = \tilde{J}(t-1) + \Gamma \left( \frac{\dot{\mathcal{P}}(t) - \tilde{J}(t-1)\dot{q}(t)}{\dot{q}(t)^T \dot{q}(t)} \right) \dot{q}(t)^T. \qquad (5.9)$$

This update rule depends on an update rate $\Gamma \in (0, 1]$ which controls how quickly the estimate shifts between timesteps. We use models generated using the same parameters for all three tasks, with a total of 60 models: 49 diminishing rigidity models with rotational and translational deformability values $k_{trans}$ and $k_{rot}$ ranging from 0 to 24 in steps of 4, as well as 11 adaptive Jacobian models with learning rates $\Gamma$ ranging from 1 to $10^{-10}$ in multiples of 10. All adaptive Jacobian models are initialized with the same starting values; we use the diminishing rigidity Jacobian for this seed, with $k_{trans} = k_{rot} = 10$ for the rope experiment and $k_{trans} = k_{rot} = 14$ for the cloth experiments, to match the best model found in [7]. We use the same strategy for setting $\eta$ as we use for the synthetic tests.

We evaluate results for the MAB algorithms as well as using each of the models in the set for the entire task. To calculate regret for each MAB algorithm, we create copies of the simulator at every timestep and simulate the gripper command, then measure the resulting reward $r_m(t)$ for each model. The reward of the best model $r^*(t)$ is then the maximum of the individual rewards. As KF-MANB and KF-MANDB are not deterministic algorithms, each task is performed 10 times for these methods. All tests are run on an Intel Xeon E5-2683 v4 processor with 64 GB of RAM. UCB1-Normal and KF-MANB solve Eq. (4.4) once per timestep, while KF-MANDB solves it for every model in $\mathcal{M}$. Computation times for each test are shown in their respective sections.

Winding a Rope Around a Cylinder: In the first example task, a single gripper holds a rope that is lying on a table. The task is to wind the rope around a cylinder which is also on the table (see Figure 5.1). Our results (Figure 5.2) show that at the start of the task all the individual models perform nearly identically, starting to split at 2 seconds (when the gripper first approaches the cylinder) and again at 6 seconds. Despite our model set containing models that are unable to perform the task, our formulation is able to successfully perform the task using all three bandit algorithms. Interestingly, while KF-MANDB outperforms UCB1-Normal and KF-MANB in terms of regret, all three algorithms produce very similar results. Solving Eq. (4.4) at each iteration requires an average of 17.3 ms (std. dev. 5.5 ms) for a single model, and 239.5 ms (std. dev. 153.7 ms) for 60 models.
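Returning to the adaptive Jacobian models used above, the Broyden update of Eq. (5.9) is a rank-one correction that can be sketched in a few lines of numpy; the function name and the guard against a zero command are our own.

```python
import numpy as np

def broyden_update(J_est, q_dot, p_dot_observed, gamma):
    """Shift the Jacobian estimate toward explaining the observed object
    motion at learning rate gamma (Eq. 5.9)."""
    residual = p_dot_observed - J_est @ q_dot   # prediction error
    denom = q_dot @ q_dot
    if denom < 1e-12:                           # no commanded motion: no update
        return J_est
    return J_est + gamma * np.outer(residual, q_dot) / denom
```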

51 20

15

10

Total Regret UCB1-Normal 5 KF-MANB KF-MANDB

0 0 5 10 15 Time (s)

Figure 5.2: Experimental results for the rope-winding task. Top left: alignment error for 10 trials for each MAB algorithm, and each model in the model set when used in isolation. UCB1-Normal, KF-MANB, and KF-MANDB lines overlap in the figure for all trials. Top right: total regret averaged across 10 trials for each MAB algorithm, with the minimum and maximum drawn in dashed lines. Bottom row: histograms of the number of times each model was selected by each MAB algorithm; UCB1-Normal (bottom left), KF-MANB (bottom middle), KF-MANDB (bottom right).

Spreading a Cloth Across a Table: The second scenario we consider is spreading a cloth across a table. In this scenario two grippers hold the rectangular cloth at two corners and the task is to cover the top of the table with the cloth. All of the models are able to perform the task (see Figure 5.3); however, many single-model runs are slower than the bandit methods at completing the task, showing the advantage of the bandit methods. When comparing between the bandit methods, both error and total regret indicate no performance difference between the methods. Solving Eq. (4.4) at each iteration requires an average of 89.5 ms (std. dev. 82.4 ms) for a single model, and 605.1 ms (std. dev. 514.3 ms) for 60 models.

Two-Part Coverage Task: In this experiment, we consider a two-part task. The first part of the task is to cover the top of a cylinder, similar to our second scenario. The second part of the task is to cover the far side of a second cylinder. For this task the GetTargets function used previously pulls the cloth directly into the second cylinder. The collision avoidance term then negates any motion in that direction, causing the grippers to stop moving. To deal with this, we discretize the free space using a voxel grid, and then use Dijkstra's algorithm to find a collision-free path between each cover point and every point in free space. We use the result from Dijkstra's algorithm to

52 60 2

50 Diminishing Rigidity UCB1-Normal Adaptive Jacobian KF-MANB 1.5 UCB1-Normal KF-MANDB 40 KF-MANB KF-MANDB 30 1 Error

20 Total Regret 0.5 10

0 0 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2 Time (s) Time (s)

Figure 5.3: Experimental results for the table coverage task. See Figure 5.2 for description.

define a vector field that pulls the nearest (as defined by Dijkstra's) deformable object point $p_k$ along the shortest collision-free path to the target point. This task is the most complex of the three (see Figure 5.4); many models are unable to perform the task at all, becoming stuck early in the task. We also observe that both KF-MANB and KF-MANDB show a preference for some models over others. Two interesting trials using KF-MANDB stand out. In the first, the grippers end up on opposite sides of the second cylinder; in this configuration the physics engine has difficulty resolving the scene and allows the cloth to be pulled straight through the second cylinder. In the other trial the cloth is pulled off of the first cylinder; however, KF-MANDB is able to recover, moving the cloth back onto the first cylinder. KF-MANDB and UCB1-Normal are able to perform the task significantly faster than KF-MANB, though all MAB methods complete the task using our formulation. Solving Eq. (4.4) at each iteration requires an average of 102.6 ms (std. dev. 30.6 ms) for a single model, and 565.5 ms (std. dev. 389.8 ms) for 60 models.

5.6 Discussion

One notable result we observe is that finding and exploiting the best model is less important than avoiding poor models for extended periods of time; in all of the experiments UCB1-Normal never leaves its initial exploration phase, yet it is able to successfully perform each task and remains competitive with the other bandit-based methods.

Figure 5.4: Experimental results for the two-part coverage task. See Figure 5.2 for description.

We believe this is due to many models being able to provide commands that have a positive dot-product with the correct direction of motion. Intuitively, this means that even inaccurate models can be used to generate useful motion. Indeed, in our experience, models do not need to be particularly accurate to succeed at straightforward tasks; for example, using a set of Jacobian models with randomly generated elements, we were still able to successfully cover the table with the cloth, albeit requiring much longer to complete the task.

While avoiding bad models is important for being able to succeed at the task, we can see the effect of choosing good models in the two-part coverage task (Figure 5.4). In this task, the grippers can become stuck for a significant period of time. By taking advantage of model coupling, KF-MANDB is able to explore the model space much more efficiently, quickly finding a model that is able to complete the task. In contrast, the table coverage task (Figure 5.3) does not benefit from significant model switching, and thus all bandit algorithms show very similar performance.

Another benefit of our approach is the ability to escape local minima of individual models without any explicit detection of local minima. If a given model is unable to improve task error, then our estimate of its utility will decrease, leading to other models being more likely to be selected for testing, which will in turn generate different gripper commands that may be able to escape the local minimum.

CHAPTER VI

Interleaving Planning and Control

The previous chapters have focused on local methods for solving tasks. While we have shown that these methods are capable of performing interesting tasks, they are unable to escape from local minima by their very design. This chapter focuses on a method to overcome this limitation by adding planning to the set of tools that we can apply to a given task. One of the challenges in planning for deformable object manipulation is the high number of degrees of freedom involved; even approximating the configuration of a piece of cloth in 3D with a 4 × 4 grid results in a 48 degree of freedom configuration space. In addition, the dynamics of the deformable object are difficult to model [5]; even with high-fidelity modeling and simulation, planning for an individual task can take hours [6]. Local controllers, on the other hand, are able to generate motion very efficiently; however, they are only able to successfully complete a task when the initial configuration is in the "attraction basin" of the goal, as seen in Chapter IV. The central question we address in this chapter is: how can we combine the strengths of global planning with the strengths of local control while mitigating the weaknesses of each? We propose a framework for interleaving planning and control which uses global planning to generate gross motion of the deformable object, and a local controller to refine the configuration of the deformable object within the local neighborhood. By separating planning from control we are able to use different representations of the deformable object, each suited to efficient computation for their respective roles. In order to determine when to use each component, we introduce a novel deadlock prediction algorithm that is inspired by topologically-based motion planning methods [54, 55]. By answering the question "Will the local controller get stuck?" we can predict if the local controller will be unable to achieve the task from the current configuration. If we predict that the controller will get stuck, we can then invoke the global planner, moving the deformable object into a new neighbourhood from which the local controller may be able to succeed.

The key to our efficient prediction is forward-propagating only the stretching constraint, assuming the object will otherwise comply to contact. We seek to solve problems for one-dimensional and two-dimensional deformable objects (i.e. rope and cloth) where we need to arrange the object in a particular way (e.g. covering a table with a tablecloth) but where complex environment geometry prevents us from directly completing the task. While we cannot claim to solve all problems in this class (in particular in environments where the deformable object can be snagged), we can still solve practical problems where the path of the deformable object is obstructed by obstacles. In this work we restrict our focus to controllers of the form described in Section 6.2.1, and to tasks suited to these controllers. Examples of these types of tasks are shown in Figure 6.1. In our experiments we show that this iterative method of interleaving planning and control is able to successfully perform several interesting tasks where neither our planner nor our controller alone is able to succeed.

Figure 6.1: Four example manipulation tasks for our framework. In the first two tasks, the objective is to cover the surface of the table (indicated by the red lines) with the cloth (shown in green). In the first task, the grippers (shown in blue) can move freely; however, the cloth is obstructed by a pillar. In the second task, the grippers must pass through a narrow passage before the table can be covered. In the third task, the robot must navigate a rope (shown in green in the top left corner) through a three-dimensional maze before covering the red points in the top right corner. The maze consists of top and bottom layers (purple and green, respectively). The rope starts in the bottom layer and must move to the target on the top layer through an opening (bottom left or bottom right). For the fourth task, the physical robot must move the cloth from the far side of an obstacle to the region marked in pink near the base of the robot.

Our contributions are: (1) a novel deadlock prediction algorithm to determine when a global planner is needed; (2) an efficient and probabilistically-complete global planner for rope and cloth manipulation tasks; and (3) a framework to combine local control and global motion planning, leveraging the strengths of each while mitigating their weaknesses. We present experiments in both a simulated environment and on a physical robot.

Our results suggest that our planner can efficiently find paths, taking under a second on average to generate a feasible path in three out of four simulated scenarios. The physical experiment shows that our framework is able to effectively perform tasks in the real world, where reachability and dual-arm constraints make the planning more difficult.

6.1 Problem Statement

Define the robot configuration space to be $\mathcal{Q}$. We assume that the robot configuration can be measured exactly. Denote an individual robot configuration as $q \in \mathcal{Q}$. This set can be partitioned into a valid set and an invalid set. The valid set is referred to as $\mathcal{Q}^{\text{valid}}$, and is the set of configurations where the robot is not in collision with the static geometry of the world. The invalid set is referred to as $\mathcal{Q}^{\text{invalid}} = \mathcal{Q} \setminus \mathcal{Q}^{\text{valid}}$. We assume that our model of the robot is purely kinematic, with no higher-order dynamics. Previous chapters assumed an arbitrary number of grippers; in this chapter we restrict the problem to cases where two end-effectors are rigidly attached to the object. We assume that the robot moves slowly enough that we can treat the combined robot and deformable object as quasi-static. Let the function $\phi(q, P, \dot{q})$ map the system configuration $(q, P)$ and robot movement $\dot{q}$ to the corresponding deformable object movement $\dot{P}$. Similar to the previous chapter, we define a task based on a set of $T$ target points $\mathcal{T} \subset \mathbb{R}^3$, a function $\rho(\mathcal{T}, P) \to \mathbb{R}^{\geq 0}$ which measures the alignment error between $P$ and $\mathcal{T}$, and a termination function $\Omega(\mathcal{T}, P)$ which indicates if the task is finished. Let a robot controller be a function $C(\mathcal{T}, q, P)$¹ which maps the system state $(q, P)$ and alignment targets $\mathcal{T}$ to a desired robot motion $\dot{q}^{\text{cmd}}$. In this work we restrict our discussion to tasks and controllers of the form introduced in Chapter IV; these controllers are local, i.e. at each time $t$ they choose an incremental movement $\dot{q}^{\text{cmd}}$ which reduces the alignment error as much as possible at time $t + 1$.
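To make these definitions concrete, the following sketch writes them as Python type signatures, with numpy arrays standing in for $q$, $P$, and $\mathcal{T}$; these aliases are illustrative assumptions, not part of our implementation:

from typing import Callable
import numpy as np

# phi(q, P, q_dot) -> P_dot: maps a robot motion to deformable object motion
DeformableDynamics = Callable[[np.ndarray, np.ndarray, np.ndarray], np.ndarray]

# rho(T, P) -> float >= 0: alignment error between object and targets
AlignmentError = Callable[[np.ndarray, np.ndarray], float]

# Omega(T, P) -> bool: task termination check
Termination = Callable[[np.ndarray, np.ndarray], bool]

# C(T, q, P) -> q_dot_cmd: local controller producing a desired robot motion
LocalControllerFn = Callable[[np.ndarray, np.ndarray, np.ndarray], np.ndarray]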

The problem we address in this chapter is how to find a sequence of $N_e$ robot commands $\left(\dot{q}^{\text{cmd}}_0, \ldots, \dot{q}^{\text{cmd}}_{N_e - 1}\right) = \dot{Q}^{\text{cmd}}$ such that each motion is feasible, i.e. it should not bring the grippers into collision with obstacles, should not cause the object to stretch excessively, and should not exceed the robot's maximum velocity $\dot{q}^{\max}$. Let these feasibility constraints be represented by $A(\dot{Q}^{\text{cmd}}) = 0$.

¹A specific controller may have additional parameters (such as gains in a PID controller), but we do not include such parameters here to keep $C(\ldots)$ in a more general form.

Then the problem we seek to solve is:
$$\begin{aligned} \text{find} \quad & N_e, \dot{Q}^{\text{cmd}} \\ \text{s.t.} \quad & \Omega(\mathcal{T}, P_{N_e}) = \text{true} \\ & A(\dot{Q}^{\text{cmd}}) = 0 \end{aligned} \tag{6.1}$$
where $P_{N_e}$ is the configuration of the deformable object after executing $\dot{Q}^{\text{cmd}}$.

Solving this problem directly is impractical in the general case for two major reasons. First, modeling a deformable object accurately is very difficult in the general case, especially if it contacts other objects or itself. Second, even given a perfect model, computing the precise motion of the deformable object requires physical simulation, which can be very time consuming inside a planner or controller where many potential movements need to be evaluated. We seek a method which does not rely on high-fidelity modelling and simulation; instead we present a framework combining both global planning and local control, leveraging the strengths of each in order to efficiently perform the task.

6.2 Interleaving Planning and Control

Global planners are effective at finding paths through complex configuration spaces, but for highly underactuated systems such as deformable objects, achieving a specific configuration is very difficult even with high-fidelity models; this means that we cannot rely on a planner to complete a task independent of a local controller. In order for the local controller to complete the task, the system must be in the correct basin of attraction. From this point of view it is not the planner's responsibility to complete the task, but rather to move the system into the right basin for the local controller to finish the task. By explicitly separating planning from control we can use different representations of the deformable object for each component; this allows us to use a highly-simplified model of the deformable object for global planning to generate gross motion of the deformable object, while using an independent local approximation for the controller. The key question then is: when should we use global planning versus local control? Our framework can be broken down into three major components: (1) a global motion planner to generate gross motion of the deformable object; (2) a local controller for refinement of the configuration of the deformable object; and (3) a novel deadlock prediction algorithm to determine when to use planning versus control. Figure 6.2 shows how these components are connected, switching between a local-controller loop and a planned-path-execution loop as needed.

In the following sections we describe each component in turn, starting with the local controller.

Figure 6.2: Block diagram showing the major components of our framework. On each cycle we use either the local controller (dotted purple arrows) or a planned path (dashed red arrows); if we predict that the system will be deadlocked in the future, a new path is planned to avoid deadlock.

6.2.1 Local Control

The role of the local controller is not to perform the whole task, but rather to refine the configuration of the deformable object locally. For our local controller we use a controller of the form introduced in Section 4.3. These controllers locally minimize error ρ while avoiding robot collision and excessive stretching of the deformable object. An important limitation of this approach is that the individual navigation functions used by these controllers are defined and applied independently of each other; this means that the navigation functions that are combined to define the direction to move the deformable object can cause the controller to move the end effectors on opposite sides of an obstacle, leading to poor local minima, i.e. becoming stuck. Figure 6.3 shows our motivating example of this type of situation. Other examples of this kind of situation are shown in Section 6.5. In addition, while this local controller prevents collision between the robot and obstacles, it does not explicitly have any ability to go around obstacles. In order to address these limitations we introduce a novel deadlock prediction algorithm to detect when the system $(q_t, P_t)$ is in a state that will lead to deadlock (i.e. becoming stuck) if we continue to use the local controller.

Figure 6.3: Motivating example for deadlock prediction. The local controller moves the grippers on opposite sides of an obstacle, while the geodesic between the grippers (red line) cannot move past the pole, eventually leading to overstretch or tearing of the deformable object if the robot does not stop moving towards the goal.

6.2.2 Predicting Deadlock

Predicting deadlock is important for two reasons: first, we do not want to waste time executing motions that will not achieve the task; second, we want to avoid the computational expense of planning our way out of a cul-de-sac after reaching a stuck state. By predicting deadlock before it happens we address both of these concerns. The key idea is to detect situations similar to Figure 6.3, where the local controller will wrap the deformable object around an obstacle without completing the task. We also need to detect situations where no progress can be made due to an obstacle directly in the path of the desired motion of the robot. Let $E(q, P, \dot{q}^{\text{cmd}}) = \dot{q}^{\text{act}}$ be the true motion of the robot when $\dot{q}^{\text{cmd}}$ is executed for unit time; in this section we will be predicting the future state of the system, thus it is not sufficient to consider only $\dot{q}^{\text{cmd}}$, we must also consider $\dot{q}^{\text{act}}$. Modelling inaccuracies, as well as the deformable object being in contact, can lead to meaningful differences between $\dot{q}^{\text{cmd}}$ and $\dot{q}^{\text{act}}$. Specifically, when a deformable object is in contact with the environment, tracking $\dot{q}^{\text{cmd}}$ perfectly may lead to a constraint violation (i.e. overstretch or tearing of the deformable object).

We consider a controller to be deadlocked if the commanded motion produces (nearly) no actual motion and the task termination condition is not met:
$$\|\dot{q}^{\text{act}}_t\| \approx 0 \quad \text{and} \quad \Omega(\mathcal{T}, P_t) = \text{false}. \tag{6.2}$$

In general we cannot predict if the system will get stuck in the limit; to do so would require a very accurate simulation of the deformable object. Instead we predict if the system will get stuck within a prediction horizon of $N_p$ timesteps. We divide our deadlock prediction algorithm into three parts and discuss each in turn: 1) estimating gross motion; 2) predicting overstretch; and 3) progress detection.

6.2.2.1 Estimating Gross Motion

The idea central to our prediction (Alg. 15) is that while we may not be able to determine precisely how a given controller will steer the system, we can capture the gross motion of the system and estimate if the controller will be deadlocked. We split the prediction into two parts; first we assume that controller C is able to manipulate the deformable object with a reasonable degree of accuracy within a local neighborhood of the current state. This allows us to approximate the motion of the deformable object by following the task-defined navigation functions for each pi ∈ P. Examples of this approximation are shown in Figure 6.4. Next we use a simplified version of LocalController() which omits the stretching avoidance terms (Alg. 3 lines 3 and 4) to predict the commands sent to the robot. These terms are omitted as they can be sensitive to the exact configuration of the deformable object, which is not considered in this approximation. If we are executing a path then we can use the planned path directly to predict overstretch.

6.2.2.2 Predicting Overstretch

Next we introduce the notion of a virtual elastic band (VEB) between the robot's end-effectors. This VEB represents the shortest path through the deformable object between the end-effectors. The band approximates the constraint imposed by the deformable object on the motion of the robot: if the end-effectors move too far apart, then the VEB will be too long, and thus the deformable object is stretched beyond a task-specified maximum stretching factor $\gamma^{\max}$. Similarly, if the VEB gets caught on an obstacle and becomes too long, then the deformable object is also overstretched. By considering only the geodesic between the end-effectors, we are assuming that the rest of the deformable object will comply to the environment and does not need to be considered when predicting overstretch.

Algorithm 15 PredictDeadlock(ρ, q_t, P_t, v_t, T, N_p, Path)

1: ConfigHistory ← [ConfigHistory, q_t]
2: ErrorHistory ← [ErrorHistory, ρ(P_t)]
3: BandPredictions ← [ ]
4: P_C ← CalculateCorrespondences(P_t, T)
5: for n = t, ..., t + N_p − 1 do
6:   if Path = ∅ then
7:     Ṗ_e, W_e ← FollowNavigationFunction(P_n, P_C)
8:     P_{n+1} ← P_n + Ṗ_e
9:     q̇_n^cmd ← FindBestRobotMotion(q_n, P_n, Ṗ_e, W_e)
10:    q_{n+1} ← q_n + q̇_n^cmd
11:  else
12:    q_{n+1} ← q_n + FollowPath(Path)
13:  end if
14:  v_{n+1} ← ForwardPropagateBand(v_n, q_{n+1})
15:  BandPredictions ← [BandPredictions, v_{n+1}]
16: end for
17: if PredictOverstretch(BandPredictions) or NoProgress(ConfigHistory, ErrorHistory) then
18:   return true
19: else
20:   return false
21: end if

The VEB representation allows us to use a fast prediction method, but it does not account for the part of the material that is slack; we discuss this trade-off further in Chapter VII. The VEB is based on Quinlan's path deformation algorithm [77] and is used both in deadlock prediction and in global planning (Section 6.2.3 and Section 6.3). Denote the configuration of a VEB at time $t$ as a sequence of $N_{v,t}$ points $v_t \subset \mathbb{R}^3$. The number of points used to represent a VEB can change over time, but for any given environment and deformable object there is an upper limit $N_v^{\max}$ on the number of points used. Define $\text{Path}(v)$ to be the straight-line interpolation of all points in $v$, and define the length of a band to be the length of this straight-line interpolation. At each timestep the VEB is initialized with the shortest path between the end effectors through the deformable object, and then "pulled" tight using the internal contraction force described in [77], Section 5, together with a hard constraint for collision avoidance. The endpoints of the band track the predicted translation of the end effectors (Alg. 16). This band represents the constraint that must be satisfied for the object not to tear.

By considering only this constraint on the object in prediction, we are implicitly relying on the object to comply to contact as it is moved by the robot. We discuss the limitations of this assumption in the discussion (Section 6.7).

Algorithm 16 ForwardPropagateBand(v, q)

1: (p_0, p_1) ← ForwardKinematics(q)
2: v ← [p_0, v, p_1]
3: v ← InterpolateBandPoints(v)
4: v ← RemoveExtraBandPoints(v)
5: v ← PullTight(v)
6: return v
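As a rough illustration of Alg. 16, the following sketch propagates a band under strong simplifying assumptions: obstacles are ignored and PullTight is approximated by repeatedly contracting interior points toward their neighbors; the full method also interpolates and prunes points and enforces collision constraints, per Quinlan [77]:

import numpy as np

def forward_propagate_band(band, p0, p1, iters=25, step=0.5):
    """band: (N, 3) array of band points; p0, p1: new end-effector positions."""
    pts = np.vstack([p0, band, p1]).astype(float)  # endpoints track the grippers
    for _ in range(iters):
        # contract each interior point toward the midpoint of its neighbors,
        # a crude stand-in for the internal tension force of PullTight
        midpoints = 0.5 * (pts[:-2] + pts[2:])
        pts[1:-1] += step * (midpoints - pts[1:-1])
    return pts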

Let $L_{t+n}$ be the length of the path defined by the VEB $v_{t+n}$ at timestep $n$ in the future, and let $L^{\max}$ be the longest allowable band length. To use this length sequence to predict if the controller will overstretch the deformable object, we perform three filtering steps: an annealing low-pass filter, a filter to eliminate cases where the band is in free space, and the detector itself which predicts overstretch. We use a low-pass annealing filter with annealing constant $\alpha \in [0, 1)$ to mitigate the effect of numerical and approximation errors which could otherwise lead to unnecessary planning:
$$\tilde{L}_{t+1} = L_{t+1}, \qquad \tilde{L}_{t+n} = \alpha \tilde{L}_{t+n-1} + (1 - \alpha) L_{t+n}, \quad n = 2, \ldots, N_p. \tag{6.3}$$

Second, we discard from consideration any bands which are not in contact with an obstacle; we can eliminate these cases because our local controller includes an overstretch avoidance term which will generally prevent overstretch in free space. Last, we compare the filtered length of any remaining band predictions to $L^{\max}$; if after filtering there is an estimated band length $\tilde{L}$ that is larger than $L^{\max}$, then we predict that the local controller will become stuck. An example of this type of detection is shown in Figure 6.5, where the local controller will wrap the cloth around the pole, eventually becoming deadlocked in the process.
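These filtering steps can be sketched as follows, assuming the predicted band lengths and per-band contact flags over the prediction horizon have already been computed elsewhere:

def predict_overstretch(lengths, in_contact, alpha, L_max):
    """lengths: predicted band lengths L_{t+1..t+Np}; in_contact: matching flags."""
    filtered = lengths[0]  # Eq. (6.3): the filter is seeded with L_{t+1}
    for L, contact in zip(lengths, in_contact):
        filtered = alpha * filtered + (1.0 - alpha) * L
        # only bands touching an obstacle may trigger the detector
        if contact and filtered > L_max:
            return True
    return False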

6.2.2.3 Progress Detection

Last, we track the progress of the robot and the task error to estimate if the controller C is making progress towards the task goal. This is designed to detect cases where the robot is trapped against an obstacle. Naively we could look for instances when $\dot{q}^{\text{act}} = 0$; however, due to sensor noise, actuation error, and finite-precision arithmetic, we need to use a threshold instead. At the same time we want to avoid false positives, where the robot is moving slowly but task error is decreasing.

Figure 6.4: Example of estimating the gross motion of the deformable object for a prediction horizon $N_p = 10$. The magenta lines start from the points of the deformable object that are closest to the target points (according to the navigation function). These lines show the paths those points would follow to reach the target when following the navigation function.

To address these concerns we record the configuration of the robot (stored in ConfigHistory) and the task error (stored in ErrorHistory) every time we check for deadlock, and introduce three parameters to control what it means to be making progress: a history window $N_h$, an error improvement threshold $\beta_e$, and a configuration distance threshold $\beta_m$. If over the last $N_h$ timesteps the improvement in error is less than $\beta_e$ and the robot has moved less than $\beta_m$, then we predict that the controller will not be able to reach the goal from the current state, and we trigger global planning.
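A sketch of the resulting check, assuming both histories are appended once per deadlock-prediction call:

import numpy as np

def no_progress(config_history, error_history, N_h, beta_e, beta_m):
    """Trigger planning if error improved less than beta_e and the robot
    moved less than beta_m over the last N_h recorded samples."""
    if len(error_history) < N_h:
        return False  # not enough history yet
    error_improvement = error_history[-N_h] - error_history[-1]
    config_motion = np.linalg.norm(
        np.asarray(config_history[-1]) - np.asarray(config_history[-N_h]))
    return error_improvement < beta_e and config_motion < beta_m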

6.2.3 Setting the Global Planning Goal

In order to enable efficient planning, we need to approximate the configuration of the deformable object in a way that captures the gross motion of the deformable object without being prohibitively expensive to use. We use the same approach from Section 6.2.2.2, but the interpretation in this use is slightly different: the VEB is a proxy for the leading edge of the deformable object. In this way we can plan to move the deformable object to a different part of the workspace without needing to simulate the entire deformable object; instead, the deformable object conforms to the environment naturally.

Figure 6.5: Estimated gross motion of the deformable object (magenta lines) and end effectors (blue spheres). The VEB (black lines) is forward propagated by tracking the end effector positions, changing to cyan lines when overstretch is predicted.

In order to make progress towards achieving the task, we want to set the goal for the global planner to be a configuration that we have not explored with the local controller. We do so in two parts. First, we find the set of all target points $\mathcal{T}_U$ which are contributing to task error, split these points into two clusters, and use the cluster centers to define the goal region of the end effectors, $q_{xyz}^{\text{goal}}$; any end-effector position within a task-specified distance $\delta^{\text{goal}}$ is considered to have reached the end-effector goal (Alg. 17 lines 1-3). Second, we set the goal configuration of the VEB to be any configuration that is not similar to a blacklist of VEBs. This blacklist is the set of all band configurations from which we predicted that the local controller would be deadlocked in the future (Section 6.2.2).

Algorithm 17 PlanPath(qt, Pt, vt, T , Blacklist)

1: T_U ← UncoveredTargetPoints(T, P_t)
2: q_xyz^goal ← ClusterCenters(T_U)
3: q_xyz^goal ← ProjectOutOfCollision(q_xyz^goal)
4: V^goal ← {v | VisCheck(v, Blacklist) = 0}
5: Path ← RRT-EB(q_t, v_t, q_xyz^goal)
6: if Path ≠ Failure then
7:   return ShortcutSmooth(Path, V^goal)
8: else
9:   return Failure
10: end if
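A minimal sketch of the clustering step (Alg. 17, lines 1-3) is given below, using a two-means split of the uncovered target points; the collision projection is environment-specific and omitted, and the function names are illustrative:

import numpy as np

def two_cluster_centers(points, iters=20, seed=0):
    """points: (N, 3) uncovered target points; returns two cluster centers."""
    pts = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    centers = pts[rng.choice(len(pts), size=2, replace=False)]
    for _ in range(iters):
        # assign each point to its nearest center, then recompute the means
        d = np.linalg.norm(pts[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centers[k] = pts[labels == k].mean(axis=0)
    return centers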

To define similarity we use Jaillet and Siméon's visibility deformation definition to compare two VEBs [54]. Intuitively, two VEBs are similar if you can sweep a straight line connecting the two bands from the start points to the end points of the two bands without intersecting an obstacle.

Unlike the original use, we do not constrain the start and end points of each path to match, but the algorithm is identical. We use this as a heuristic to find states that are dissimilar from states where we have already predicted that the local controller would be deadlocked. Let $\text{VisCheck}(v, \text{Blacklist}) \to \{0, 1\}$ denote this visibility deformation check, returning 1 if $v$ is similar to a band in the blacklist and 0 otherwise. Then

$$\mathcal{V}^{\text{goal}} = \{v \mid \text{VisCheck}(v, \text{Blacklist}) = 0\} \tag{6.4}$$

is the set of all VEBs that are dissimilar to the Blacklist. Combined, $q_{xyz}^{\text{goal}}$, $\delta^{\text{goal}}$, and $\mathcal{V}^{\text{goal}}$ define what it means for the planner to have found a path to the goal (Alg. 18): the end-effectors must be in the right region, and the VEB must be dissimilar to any band in the Blacklist.

Algorithm 18 GoalCheck(N, q_xyz^goal, δ^goal, V^goal)
1: for b = (q, v) ∈ N do
2:   q_xyz ← ForwardKinematics(q)
3:   if ‖q_{xyz,0} − q_{xyz,0}^goal‖ ≤ δ^goal and ‖q_{xyz,1} − q_{xyz,1}^goal‖ ≤ δ^goal and v ∈ V^goal then
4:     return true
5:   end if
6: end for
7: return false

The combination of local control, deadlock prediction, and global planning is shown in the MainLoop function (Alg. 19). Because the VEB is an approximation, we need to predict deadlock while executing the planned path as well; we use the same prediction method for path execution as for the local controller. To set the maximum band length $L^{\max}$ used by the global planner and the deadlock prediction algorithm, we calculate the geodesic distance between the grippers through the deformable object in its "laid-flat" state and scale it by the task-specified maximum stretching factor $\gamma^{\max}$.

Algorithm 19 MainLoop(T, Ω, ρ, P_relaxed, γ^max)

1: D ← GeodesicDistanceBetweenEndEffectors(P_relaxed)
2: L^max ← γ^max D
3: Blacklist ← ∅
4: Path ← ∅
5: t ← 0
6: q_0 ← SenseRobotConfig()
7: P_0 ← SensePoints()
8: while ¬Ω(T, P_t) do
9:   v_t ← InitializeBand(P_t)
10:  if PredictDeadlock(ρ, q_t, P_t, v_t, T, Path) then
11:    Blacklist ← Blacklist ∪ {v_t}
12:    Path ← PlanPath(q_t, P_t, v_t, T, Blacklist)
13:    if Path = Failure then
14:      return Failure
15:    end if
16:  end if
17:  if Path ≠ ∅ then
18:    q̇^cmd ← FollowPath(Path)
19:    if PathFinished(Path) then
20:      Path ← ∅
21:    end if
22:  else
23:    q̇^cmd ← LocalController(q_t, P_t, T, D, γ^max)
24:  end if
25:  CommandConfiguration(q_t + q̇^cmd)
26:  q_{t+1} ← SenseRobotConfig()
27:  P_{t+1} ← SensePoints()
28:  t ← t + 1
29: end while
30: return Success

6.3 Global Planning

The purpose of the global planner is not to find a path to a configuration where the task is complete, but rather to move the system into a state from which the local controller can complete the task. Planning directly in the configuration space of the full system $\mathcal{Q} \times \mathbb{R}^{3P}$ is not practical for two important reasons. First, this space is very high-dimensional and the system is highly underactuated. More importantly, to accurately know the state of the deformable object after a series of robot motions, one would need a high-fidelity simulation that has been tuned to represent a particular task. We seek to plan paths very quickly without knowing the physical properties of the deformable object a priori. The key idea that allows us to plan paths quickly is to consider only the constraint on robot motion that is imposed by the deformable object: the robot motion shall not tear or cause excessive stretching of the deformable object. We represent this constraint using a virtual elastic band and enforce the constraint that the band's length cannot exceed $L^{\max}$.

6.3.1 Planning Setup

Denote the planning configuration space as B = Q × V. In order to split B into valid and invalid sets, we first define what it means for a band v ∈ V to be valid. A band v ∈ V is considered valid if the band is not overstretched and the path defined by v does not penetrate an obstacle:

$$\mathcal{V}^{\text{valid}} = \{v \mid \text{Length}(v) \le L^{\max} \text{ and } \text{Path}(v) \cap \text{Interior}(\mathcal{O}) = \emptyset\}. \tag{6.5}$$

Then the invalid set is $\mathcal{V}^{\text{inv}} = \mathcal{V} \setminus \mathcal{V}^{\text{valid}}$. Similarly define $\mathcal{B}^{\text{valid}} = \mathcal{Q}^{\text{valid}} \times \mathcal{V}^{\text{valid}}$ and $\mathcal{B}^{\text{inv}} = \mathcal{B} \setminus \mathcal{B}^{\text{valid}}$. An individual element is then $b = (b_q, b_v) = (q, v) \in \mathcal{B}$. $\mathcal{Q}$ and $\mathcal{B}$ are imbued with distance metrics $d_q(\cdot, \cdot) : \mathcal{Q} \times \mathcal{Q} \to \mathbb{R}^{\geq 0}$ and $d_b(\cdot, \cdot) : (\mathcal{Q} \times \mathcal{V}) \times (\mathcal{Q} \times \mathcal{V}) \to \mathbb{R}^{\geq 0}$, respectively. We define distances in robot configuration space and band space to be additive; i.e.,

$$d_b(\cdot, \cdot)^2 = d_q(\cdot, \cdot)^2 + \lambda_v\, d_v(\cdot, \cdot)^2 \tag{6.6}$$

for some scaling factor $\lambda_v > 0$. To measure distances in $\mathcal{V}$, we first upsample each band using linear interpolation to the maximum number of points $N_v^{\max}$ for the given task, then measure the Euclidean distance between the upsampled points when considered as a single vector (Alg. 20).

Algorithm 20 BandDistance: d_v(v_1, v_2)
1: ṽ_1 ← UpsamplePoints(v_1, N_v^max)
2: ṽ_2 ← UpsamplePoints(v_2, N_v^max)
3: return ‖ṽ_1 − ṽ_2‖
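A sketch of Alg. 20 and the combined metric of Eq. (6.6), assuming bands are stored as (N, 3) arrays and upsampling is linear in arc length; this is an illustration, not our implementation:

import numpy as np

def upsample_points(band, n_max):
    """Resample a band to n_max points, linearly in arc length."""
    band = np.asarray(band, dtype=float)
    seg = np.linalg.norm(np.diff(band, axis=0), axis=1)
    s = np.concatenate([[0.0], np.cumsum(seg)])  # arc length at each point
    t = np.linspace(0.0, s[-1], n_max)
    return np.column_stack([np.interp(t, s, band[:, k]) for k in range(3)])

def band_distance(v1, v2, n_max):
    return np.linalg.norm(upsample_points(v1, n_max) - upsample_points(v2, n_max))

def full_distance(q1, q2, v1, v2, lam_v, n_max):
    # Eq. (6.6): additive combination of robot-space and band-space distances
    d_q = np.linalg.norm(np.asarray(q1) - np.asarray(q2))
    return np.sqrt(d_q ** 2 + lam_v * band_distance(v1, v2, n_max) ** 2)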

For a given planning problem, we are given a query $(b^{\text{init}}, \mathcal{B}^{\text{goal}})$ which describes the initial configuration of the robot and band, as well as a goal region for the system to reach. Note that $\mathcal{B}^{\text{goal}}$ is defined implicitly via the GoalCheck() function and the parameters $(q_{xyz}^{\text{goal}}, \delta^{\text{goal}}, \text{Blacklist})$ rather than by any explicit enumeration.

We now establish a relationship between a path in robot configuration space πq and one in the full configuration space πb by making the following assumption.

Assumption VI.1 (Deterministic Propagation). Given an initial configuration in full space $b^{\text{init}} \in \mathcal{B}^{\text{valid}}$ and the corresponding robot configuration $b_q^{\text{init}} \in \mathcal{Q}^{\text{valid}}$, a path $\pi_q : [0, 1] \to \mathcal{Q}^{\text{valid}}$ in robot configuration space with $\pi_q(0) = b_q^{\text{init}}$ uniquely defines a single path in full space $\pi_b$, where $\pi_b(0) = b^{\text{init}}$. Specifically, define
$$\pi_b(t) = \begin{bmatrix} \pi_q(t) \\ \lim_{h \to 0^-} \text{ForwardPropagateBand}(v(t - h), \pi_q(t)) \end{bmatrix}. \tag{6.7}$$

Eq. (6.7) implicitly defines an underactuated system where the only way we can change the state of the band is by moving the robot; for a path in the full configuration space $\pi_b$ to be achievable, there must be a robot configuration space path $\pi_q$ which, when propagated using Eq. (6.7), produces $\pi_b$. Let $\text{FullSpace}(\pi_q, b^{\text{init}})$ be the function that maps a given robot configuration space path $\pi_q$ and full space initial configuration $b^{\text{init}}$ to the full space path defined by Eq. (6.7).

6.3.2 Planning Problem Statement

For a given planning instance, the task is to find a path starting from $b^{\text{init}}$ through $\mathcal{B}^{\text{valid}}$ to any point in $\mathcal{B}^{\text{goal}}$, while obeying the constraints implied by Eq. (6.7). For a sequence of robot configurations $b_q^{\text{init}}, b_{1,q}, \ldots, b_{M,q} \in \mathcal{Q}$, let
$$\pi_q = \text{Path}(b_q^{\text{init}}, b_{1,q}, \ldots, b_{M,q}) \tag{6.8}$$
be the path defined by linearly interpolating between each point in order. Then, formally, the problem our planner addresses is the following:
$$\begin{aligned} \text{find} \quad & M, \{b_{1,q}, \ldots, b_{M,q}\} \\ \text{s.t.} \quad & \pi_q = \text{Path}(b_q^{\text{init}}, b_{1,q}, \ldots, b_{M,q}) \\ & \pi_b = \text{FullSpace}(\pi_q, b^{\text{init}}) \\ & \pi_b(s) \in \mathcal{B}^{\text{valid}}, \; \forall s \in [0, 1] \\ & \pi_b(1) \in \mathcal{B}^{\text{goal}}. \end{aligned} \tag{6.9}$$

6.3.3 RRT-EB

Our planner, RRT for Elastic Bands (RRT-EB) (Alg. 21), is based on an RRT with changes to account for a virtual elastic band in addition to the robot configuration. Lines 5-12 perform random exploration, with lines 13-23 biasing the tree expansion towards the goal region. The key variations are the BestNearest function (Alg. 22) and the goal bias method. BestNearest is based on the selection method used by [1], selecting the node of smallest cost within a radius $\delta_{BN}$ if one exists, falling back to standard nearest-neighbour behaviour if no node in the tree is within $\delta_{BN}$ of the random sample. We use path length in robot configuration space $\mathcal{Q}$ as the cost function in our implementation. This helps reduce path length and ensures that we can specify lower bounds in Section 6.4.3. In order to avoid calculating distances in the full configuration space when it is not necessary, our method for finding the nearest neighbor is split into two parts, first searching in robot space, then searching in the full configuration space (see Figure 6.6). Section 6.4.2 shows that this method is equivalent to searching in the full configuration space directly. $\delta_{BN}$ is an additional parameter compared to a standard RRT; it controls how much focus is placed on path cost versus exploration.

The smaller $\delta_{BN}$ is, the less impact it has compared to a standard RRT; the larger $\delta_{BN}$ is, the harder it is to find narrow passages. We discuss further constraints on $\delta_{BN}$ in Section 6.4.3.1. To sample $b^{\text{rand}} = (q^{\text{rand}}, v^{\text{rand}})$, we sample the robot and band configurations independently, then combine the samples. For typical robot arms, $q^{\text{rand}}$ is generated by sampling each joint independently and uniformly from the joint limits. To sample from $\mathcal{V}$, we draw a sequence of $N_v^{\max}$ points from the bounded workspace. For our example tasks, workspace is a rectangular prism, and we sample each axis independently and uniformly.

Algorithm 21 RRT-EB(q_t, v_t, q_xyz^goal, δ^goal, V^goal)

1: N ← {(q_t, v_t)}
2: E ← ∅
3: Q^goal ← GetGoalConfigs(q_xyz^goal)
4: while ¬MaxTimeElapsed() do
5:   b^rand = (q^rand, v^rand) ← SampleUniformConfig()
6:   b^near ← BestNearest(N, E, δ_BN, b^rand)
7:   N^new, E^new ← Connect(b^near, q^rand, L^max)
8:   N ← N ∪ N^new
9:   E ← E ∪ E^new
10:  if GoalCheck(N^new, q_xyz^goal, δ^goal, V^goal) then
11:    return ExtractPath(N, E)
12:  end if
13:  γ ∼ Uniform[0, 1]
14:  if γ ≤ γ_gb then
15:    b^last = (q^last, v^last) ← LastConfig(b^near, N^new)
16:    q^bias ← argmin_{q ∈ Q^goal} d_q(q^last, q)
17:    N^new, E^new ← Connect(b^last, q^bias, L^max)
18:    N ← N ∪ N^new
19:    E ← E ∪ E^new
20:    if GoalCheck(N^new, q_xyz^goal, δ^goal, V^goal) then
21:      return ExtractPath(N, E)
22:    end if
23:  end if
24: end while
25: return Failure

Due to the fact that our system is highly underactuated and the goal region is defined implicitly by a function call rather than an explicit set of configurations, we cannot sample from the goal set directly as is typically done for a goal bias. Instead we pre-compute a finite set of robot configurations $\mathcal{Q}^{\text{goal}}$ such that the end-effectors of the robot are at $q_{xyz}^{\text{goal}}$. Then, as a goal bias mechanism, $\gamma_{gb}$ percent of the time we attempt to connect to a potential goal configuration starting from the last configuration created by a call to the Connect function (or the last node selected by BestNearest if $\mathcal{N}^{\text{new}} = \emptyset$). A connection is then attempted between $b^{\text{last}}$ and the nearest configuration in $\mathcal{Q}^{\text{goal}}$. This allows us to bias exploration toward the robot component of the goal region, which we are able to define explicitly.

Algorithm 22 BestNearest(N, E, δ_BN, b^rand)
1: V^near ← {b | b ∈ N, d_b(b, b^rand) ≤ δ_BN}
2: if V^near ≠ ∅ then
3:   return argmin_{b ∈ V^near} Cost(b, N, E)
4: else
5:   D_near,q^2 ← min_{(q,v) ∈ N} d_q(q^rand, q)^2
6:   D_max,b^2 ← D_near,q^2 + λ_v D_max,v^2
7:   V^near ← {b | b ∈ N, d_q(q, q^rand)^2 ≤ D_max,b^2}
8:   return argmin_{b ∈ V^near} d_b(b, b^rand)
9: end if
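The two-stage query of Alg. 22 could be sketched as follows; for clarity this version computes distances by brute force, whereas an implementation would use spatial indices for the radius and nearest-neighbor queries. All helper callables here are assumptions:

import numpy as np

def best_nearest(nodes, b_rand, delta_bn, lam_v, d_max_v, d_q, d_b, cost):
    """nodes: list of tree vertices; d_q, d_b: distance callables; cost: node cost."""
    near = [b for b in nodes if d_b(b, b_rand) <= delta_bn]
    if near:
        return min(near, key=cost)  # lowest-cost node within the radius
    # bound the full-space distance using only the robot-space distance (Eq. 6.13)
    d_near_q = min(d_q(b, b_rand) for b in nodes)
    d_max_b = np.sqrt(d_near_q ** 2 + lam_v * d_max_v ** 2)
    candidates = [b for b in nodes if d_q(b, b_rand) <= d_max_b]
    return min(candidates, key=lambda b: d_b(b, b_rand))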

6.4 Probabilistic Completeness of Global Planning

Proving probabilistic completeness in $\mathcal{B}$ is challenging due to the multi-modal nature of the problem. Specifically, as the virtual elastic band moves in and out of contact, the dimensionality of the manifold that the system is operating in can change. In addition, the virtual elastic band forward propagation function (Alg. 16) can allow the band to "snap tight" as the grippers move past the edge of an obstacle, changing the number of points in the band representation as it does so. By leveraging the assumptions from Section 6.4.1, we are able to bypass most of these challenges by focusing on the portion of $\mathcal{B}$ that can be analyzed, i.e. $\mathcal{Q}$. This section proves the probabilistic completeness of the planning approach in two major steps. First, it shows that the approach for selecting the nearest node in the tree for expansion is equivalent to performing a nearest-neighbor query in the full space. Second, it proves that our algorithm will eventually return a path that is $\delta_q$-similar to an optimal δ-robust solution to the planning problem with probability 1 (if such a solution exists), or it will terminate early having found an alternate path to the goal region. Recall that we do not require an optimal path, only a feasible one.

6.4.1 Assumptions and Definitions:

Our problem allows for the virtual elastic band to be in contact with the surface of an obstacle, both during execution and as part of the goal set; this means that common assumptions regarding the expansiveness [78] of the planning problem may not hold. Instead of relying on expansiveness, we define a series of alternate definitions and assumptions which are sufficient to ensure the completeness of our method. First, in line with prior work, we assume properties of the problem instance in regards to robustness. In particular, we assume the existence of a solution to a given query $\pi_b^{\text{ref}} : [0, 1] \to \mathcal{B}^{\text{valid}}$ which has several robustness properties. This solution is called a reference path. To begin describing the properties of the reference path, we assume $\pi_b^{\text{ref}}$ has robustness properties in the robot configuration space; that is, the corresponding path in robot configuration space $\pi_q^{\text{ref}}$ has strong $\delta_q$-clearance under distance metric $d_q(\cdot, \cdot)$ for some $\delta_q > 0$.

Definition VI.2 (Strong δ-clearance). A path $\pi : [0, 1] \to \mathcal{B}^{\text{valid}}$ has strong δ-clearance under distance metric $d(\cdot, \cdot)$ if $d(\pi(s), \mathcal{B}^{\text{inv}}) \ge \delta$ for all $s \in [0, 1]$, for some $\delta > 0$.

Given our assumption about the $\delta_q$-clearance of the reference path in robot space, there exists a set $T_q$ of $\delta_q$-similar paths to the reference path which are also collision-free.

Definition VI.3 (δ-similar path). Two paths $\pi_a$ and $\pi_b$ are δ-similar if the Fréchet distance between the paths is less than or equal to δ.

Informally, the Fréchet distance is described as follows [79]: suppose a man is walking a dog, with the man walking on one curve and the dog on another. Both may walk at any speed, but neither is allowed to move backwards. The Fréchet distance of the two curves is then the minimum length of leash necessary to connect the man and the dog. Given the relationship between robot-space and full-space paths, we can define a full-space equivalent to $T_q$ as
$$T_b = \{\pi_b \mid \pi_q \in T_q \text{ and } \pi_b = \text{FullSpace}(\pi_q, b^{\text{init}})\}. \tag{6.10}$$

Given these assumptions and definitions, we are ready to define an acceptable δ-robust path:

Definition VI.4 (Acceptable δ-Robust Path). A path $\pi_b^{\text{ref}}$ is acceptable δ-robust if the following hold:

1. the robot-space reference path $\pi_q^{\text{ref}}$ has strong $\delta_q$-clearance for some $\delta_q > 0$;

2. the final state of every path $\pi_b \in T_b$ is in $\mathcal{B}^{\text{goal}}$.

We assume there exists a reference path which satisfies this property and answers our given planning query:

Assumption VI.5 (Solvable Problem). There exists some $\delta_q > 0$ such that the planning problem admits an acceptable δ-robust path.

If a planning problem does not yield a reference path with this property, then it would be practically impossible for a sampling-based approach to solve it, as this would require sampling on a lower-dimensional manifold in robot space. Given that our planner is able to find paths, we believe this assumption is true except in special cases where the band must achieve a singular configuration to reach the goal. While the focus of this work is not on asymptotic optimality, we will make use of a cost function $\text{Cost}(\pi)$ of a path in Section 6.4.3.1. Our cost function is path length in robot configuration space. With a cost function of this form, we assume from here onward that the reference path in question is optimal under the following definition.

Definition VI.6 (Optimal δ-Robust Path). Let $T_{b,\delta}$ be the set of all acceptable δ-robust paths. A path $\pi_b^{\text{ref}}$ is optimal δ-robust if
$$\text{Cost}(\pi_b^{\text{ref}}) = \inf_{\pi_b \in T_{b,\delta}} \text{Cost}(\pi_b). \tag{6.11}$$

Finally, we also assume that workspace is bounded. This will be true for any practical task and is rarely mentioned in the literature, but we will use this assumption in our analysis in Section 6.4.2.

6.4.2 Proof of Nearest-Neighbors Equivalence

Lemma VI.7. If the maximum distance between any two points in workspace is bounded by $D_{\max,w} > 0$, then under distance metric $d_v(\cdot, \cdot)$ the maximum distance between any two points in virtual elastic band space is bounded; i.e. $\exists D_{\max,v} > 0$ such that $d_v(v_1, v_2) \le D_{\max,v}$ for all $v_1, v_2 \in \mathcal{V}$.

Proof. From the definition of $\mathcal{V}$ in Section 6.2.2.2, the number of points used to represent a virtual elastic band is bounded by $N_v^{\max}$. Let $v_1, v_2 \in \mathcal{V}$ be two virtual elastic band configurations, and let $\tilde{v}_1 = (\tilde{b}_{1,1}, \ldots, \tilde{b}_{1,N_v^{\max}})$ and $\tilde{v}_2 = (\tilde{b}_{2,1}, \ldots, \tilde{b}_{2,N_v^{\max}})$ be their upsampled versions as described in Alg. 20. Then
$$d_v(v_1, v_2)^2 = \sum_{i=1}^{N_v^{\max}} \left\| \tilde{b}_{1,i} - \tilde{b}_{2,i} \right\|^2 \le \sum_{i=1}^{N_v^{\max}} D_{\max,w}^2 = N_v^{\max} D_{\max,w}^2 = D_{\max,v}^2. \tag{6.12}$$

74 Lemma VI.8. If workspace is bounded, then lines 5-8 in Alg. 22 are equivalent to a nearest neighbor search in the full configuration space directly.

Proof. The upper bound $D_{\max,v}$ and our additive distance metric (Eq. (6.6)) ensure that the distance between any two configurations in the full space $\mathcal{B}$ can be bounded using only the distance in robot configuration space:
$$d_b(\cdot, \cdot)^2 \le d_q(\cdot, \cdot)^2 + \lambda_v D_{\max,v}^2. \tag{6.13}$$

Next, consider that in Line 5 of the algorithm, the nearest neighbor to $q^{\text{rand}}$ under distance metric $d_q$ is found. Let this nearest neighbor be denoted $\tilde{q}^{\text{near}}$, keeping in mind that it belongs to a vertex in the tree $\tilde{b}^{\text{near}} = (\tilde{q}^{\text{near}}, \tilde{v}^{\text{near}})$. Let the squared distance between these points under $d_q$ be $D_{\text{near},q}^2$. From Eq. (6.13), we can bound the squared distance between the random sample and $\tilde{b}^{\text{near}}$ under $d_b$ as $d_b(b^{\text{rand}}, \tilde{b}^{\text{near}})^2 \le D_{\text{near},q}^2 + \lambda_v D_{\max,v}^2 = D_{\max,b}^2$.

In Line 7 of the algorithm, a radius nearest-neighbors query of radius $D_{\max,b}$ is performed, returning a set $V^{\text{near}}$. By construction, if there is a node $b \in \mathcal{N}$ that is closer to $b^{\text{rand}}$ than $\tilde{b}^{\text{near}}$, then $b \in V^{\text{near}}$ (Figure 6.6). The method then selects the true nearest neighbor in full space, $b^{\text{select}} = \operatorname{argmin}_{b \in V^{\text{near}}} d_b(b, b^{\text{rand}})$.

Figure 6.6: Left: $b^{(2)}$ is the nearest node to $b^{\text{rand}}$ in robot space, but it may be as far as $D_{\max,b}$ away in the full configuration space. By considering all nodes within $D_{\max,b}$ in robot space, we ensure that any node (such as $b^{(1)}$) that is closer to $b^{\text{rand}}$ than $b^{(2)}$ is included in $V^{\text{near}}$, while nodes such as $b^{(4)}$ are excluded in order to avoid the expense of calculating the full configuration space distance. Right: we then measure the distance in the full configuration space to all nodes that could possibly be the nearest to $b^{\text{rand}}$, returning $b^{(1)}$ as the nearest node in the tree.

6.4.3 Construction of a $\delta_q$-similar Path

The objective here is to show that, with probability approaching 1, the planner generates a $\delta_q$-similar path to some robustly-feasible solution given enough time. If an alternate path is found and the algorithm terminates before generating a $\delta_q$-similar path, then this is still sufficient for probabilistic completeness. This analysis is similar to [1], and is based on a covering ball sequence of the optimal δ-robust path $\pi_q^{\text{ref}}$.

Definition VI.9 (Covering Ball Sequence). Given a path $\pi_q : [0, 1] \to \mathcal{Q}^{\text{valid}}$, robust clearance $\delta_q > 0$, a BestNearest distance $\delta_{BN} > 0$, and a distance value $0 < \delta_s < \delta_{BN} < \delta_q$, the covering ball sequence is defined as a set of $K + 1$ hyper-balls $\{\mathcal{B}_{\delta_q}(q_0), \ldots, \mathcal{B}_{\delta_q}(q_K)\}$ of radius $\delta_q$, where the $q_k$ are defined such that:

• $q_0 = \pi_q(0)$;
• $q_K = \pi_q(1)$;
• $\text{PathLength}(q_{k-1}, q_k) = \delta_s$ for $k = 1, \ldots, K$.

Denote $q_k^*$ to be the center of the $k$th covering hyper-ball for the reference path $\pi_q^{\text{ref}}$. Figure 6.7 shows an example of a covering ball sequence.

Figure 6.7: Example covering ball sequence for a reference path, with a distance along the path of $\delta_s$ between each ball. Given that the path is $\delta_q$-robust, each ball is a subset of $\mathcal{Q}^{\text{valid}}$.

The objective is to show that the vertex set $\mathcal{N}_n$ of the planning tree after $n$ iterations probabilistically contains a node within the goal set, i.e.
$$\liminf_{n \to \infty} \mathbb{P}\left(\mathcal{N}_n \cap \mathcal{B}^{\text{goal}} \neq \emptyset\right) = 1. \tag{6.14}$$

To do this, the analysis examines $K$ subsegments of the reference path $\pi_q^{\text{ref}}$, based on the covering ball sequence for the reference path. If we can generate a robot path that is $\delta_q$-similar to $\pi_q^{\text{ref}}$, then given Assumption VI.5 and the properties of the reference path, the corresponding full space path will be a solution to the given planning problem. Let $A_k^{(n)}$ be the event that on the $n$th iteration of the algorithm, it generates a $\delta_q$-similar path to the $k$th subsegment of $\pi_q^{\text{ref}}$. This requires two events to occur: the node generated from the prior propagation covering segment $k - 1$ must be selected for expansion, and the expansion must then produce a $\delta_q$-similar path to the current segment. Then, let $E_k^{(n)}$ be the event that for segment $k$, $A_k^{(i)}$ has occurred for some $i \in [1, n]$; i.e. $E_k^{(n)}$ indicates whether the algorithm has constructed the $\delta_q$-similar edge for subsegment $k$. From these definitions, the goal then is to show that
$$\lim_{n \to \infty} \mathbb{P}(\text{Success}) = \lim_{n \to \infty} \mathbb{P}\left(E_K^{(n)}\right) = 1. \tag{6.15}$$
We start by considering the probability of failing to generate an arbitrary segment $1 \le k \le K$. Then

$$\begin{aligned} \mathbb{P}\left(\neg E_k^{(n)}\right) &= \mathbb{P}\left(\neg A_k^{(1)} \cap \cdots \cap \neg A_k^{(n)}\right) \\ &= \mathbb{P}\left(\neg A_k^{(1)}\right) \mathbb{P}\left(\neg A_k^{(2)} \mid \neg A_k^{(1)}\right) \cdots \mathbb{P}\left(\neg A_k^{(n)} \mid \neg A_k^{(1)} \cap \cdots \cap \neg A_k^{(n-1)}\right) \\ &= \prod_{i=1}^{n} \mathbb{P}\left(\neg A_k^{(i)} \mid \neg E_k^{(i-1)}\right). \end{aligned} \tag{6.16}$$

Note that the definition of $\neg E_k^{(i-1)}$ is what allows us to collapse the product into a concise form. The probability that $\neg A_k^{(i)}$ happens given $\neg E_k^{(i-1)}$ is equal to the probability that we have not yet generated a $\delta_q$-similar path for segment $k - 1$ (i.e. $\mathbb{P}(\neg E_{k-1}^{(i-1)})$), plus the probability that the previous segment has been generated but we fail to generate the current segment:
$$\mathbb{P}\left(\neg A_k^{(i)} \mid \neg E_k^{(i-1)}\right) = \mathbb{P}\left(\neg E_{k-1}^{(i-1)}\right) + \mathbb{P}\left(E_{k-1}^{(i-1)}\right) \mathbb{P}\left(\neg A_k^{(i)} \mid E_{k-1}^{(i-1)} \cap \neg E_k^{(i-1)}\right), \tag{6.17}$$
which we can rewrite in terms of $A_k^{(i)}$ instead of $\neg A_k^{(i)}$:
$$\mathbb{P}\left(\neg A_k^{(i)} \mid \neg E_k^{(i-1)}\right) = \mathbb{P}\left(\neg E_{k-1}^{(i-1)}\right) + \mathbb{P}\left(E_{k-1}^{(i-1)}\right)\left(1 - \mathbb{P}\left(A_k^{(i)} \mid E_{k-1}^{(i-1)} \cap \neg E_k^{(i-1)}\right)\right). \tag{6.18}$$

Then, multiplying out the last term, we get
$$\mathbb{P}\left(\neg A_k^{(i)} \mid \neg E_k^{(i-1)}\right) = \mathbb{P}\left(\neg E_{k-1}^{(i-1)}\right) + \mathbb{P}\left(E_{k-1}^{(i-1)}\right) - \mathbb{P}\left(E_{k-1}^{(i-1)}\right)\mathbb{P}\left(A_k^{(i)} \mid E_{k-1}^{(i-1)} \cap \neg E_k^{(i-1)}\right). \tag{6.19}$$

Finally, summing the first two terms, we arrive at
$$\mathbb{P}\left(\neg A_k^{(i)} \mid \neg E_k^{(i-1)}\right) = 1 - \mathbb{P}\left(E_{k-1}^{(i-1)}\right)\mathbb{P}\left(A_k^{(i)} \mid E_{k-1}^{(i-1)} \cap \neg E_k^{(i-1)}\right). \tag{6.20}$$

Two events need to happen in order to generate a path to the next hyperball: an appropriate node must be selected for expansion, and Connect(...) must generate a $\delta_q$-similar path segment, assuming that the appropriate node has already been selected. Denote the probability of these events at iteration $i$ as $\gamma_k^{(i)}$ and $\rho_k^{(i)}$ respectively. Then
$$\mathbb{P}\left(\neg A_k^{(i)} \mid \neg E_k^{(i-1)}\right) = 1 - \mathbb{P}\left(E_{k-1}^{(i-1)}\right)\gamma_k^{(i)}\rho_k^{(i)}. \tag{6.21}$$
As we are examining this probability in the limit, we will instead draw a bound on this probability to put it in a form whose limit is easy to examine. To do so, we must carefully consider the values of $\gamma_k^{(i)}$ and $\rho_k^{(i)}$. In Section 6.4.3.1, it will be shown that $\gamma_k^{(i)}$ is a generally decreasing function, but converges to a finite value $\gamma_k^{(\infty)} > 0$ in the limit; therefore we let $\gamma_k^{(\infty)}$ be a lower bound of $\gamma_k^{(i)}$. Then, in Section 6.4.3.2, $\rho_k^{(i)}$ will similarly be shown to be positive and lower-bounded; in particular $\gamma_k^{(i)}\rho_k^{(i)} \ge \gamma_k^{(\infty)}$. Taking $\gamma_k^{(\infty)}$ as constant, we can bound Eq. (6.21) as

$$\mathbb{P}\left(\neg A_k^{(i)} \mid \neg E_k^{(i-1)}\right) \le 1 - \mathbb{P}\left(E_{k-1}^{(i-1)}\right)\gamma_k^{(\infty)}. \tag{6.22}$$

Combining equations (6.22) and (6.16) we have
$$\mathbb{P}\left(\neg E_k^{(n)}\right) \le \prod_{i=1}^{n}\left(1 - \mathbb{P}\left(E_{k-1}^{(i-1)}\right)\gamma_k^{(\infty)}\right). \tag{6.23}$$

Denote $y_k^{(n)} = \prod_{i=1}^{n}\left(1 - \mathbb{P}\left(E_{k-1}^{(i-1)}\right)\gamma_k^{(\infty)}\right)$. Then
$$\mathbb{P}\left(\neg E_k^{(n)}\right) \le y_k^{(n)}. \tag{6.24}$$

We will show, using induction over $k$, that Eq. (6.24) tends to 0 as $n \to \infty$, and thus $\lim_{n \to \infty} \mathbb{P}(\text{Success}) = 1$.

Base case ($k = 1$): Note that $\mathbb{P}\left(E_0^{(i)}\right) = 1$ because the start node always exists. Then
$$\lim_{n \to \infty} \mathbb{P}\left(\neg E_1^{(n)}\right) \le \lim_{n \to \infty} \prod_{i=1}^{n}\left(1 - \mathbb{P}\left(E_0^{(i-1)}\right)\gamma_k^{(\infty)}\right) = \lim_{n \to \infty} \prod_{i=1}^{n}\left(1 - \gamma_k^{(\infty)}\right) = \lim_{n \to \infty}\left(1 - \gamma_k^{(\infty)}\right)^n = 0. \tag{6.25}$$

Induction hypothesis:
$$\lim_{n \to \infty} \mathbb{P}\left(\neg E_m^{(n)}\right) = 0 \quad \text{for } m = 1, 2, \ldots, k - 1. \tag{6.26}$$
Note that this implies $\lim_{n \to \infty} \mathbb{P}\left(E_m^{(n)}\right) = 1$ for $m = 1, 2, \ldots, k - 1$.

Induction step ($2 \le k \le K$): Consider the log of the bound on $\mathbb{P}\left(\neg E_k^{(n)}\right)$:
$$\log y_k^{(n)} = \sum_{i=1}^{n} \log\left(1 - \mathbb{P}\left(E_{k-1}^{(i-1)}\right)\gamma_k^{(\infty)}\right). \tag{6.27}$$

Denote $x = \mathbb{P}\left(E_{k-1}^{(i-1)}\right)\gamma_k^{(\infty)}$. Given that $0 \le x < 1$, and writing the Taylor series expansion of $\log(1 - x)$ centered at $x = 0$, we have
$$\log(1 - x) = -\sum_{m=1}^{\infty} \frac{x^m}{m}. \tag{6.28}$$

Substituting Eq. (6.28) back into Eq. (6.27) we get
$$\log y_k^{(n)} = -\sum_{i=1}^{n} \sum_{m=1}^{\infty} \frac{\left(\mathbb{P}\left(E_{k-1}^{(i-1)}\right)\gamma_k^{(\infty)}\right)^m}{m}. \tag{6.29}$$

Dropping all but the first term in the infinite sum we get the bound
$$\log y_k^{(n)} \le -\sum_{i=1}^{n} \mathbb{P}\left(E_{k-1}^{(i-1)}\right)\gamma_k^{(\infty)}. \tag{6.30}$$

Rearranging terms yields
$$\log y_k^{(n)} \le -\gamma_k^{(\infty)} \sum_{i=1}^{n} \mathbb{P}\left(E_{k-1}^{(i-1)}\right). \tag{6.31}$$

We now use the induction hypothesis. We know that $\mathbb{P}\left(E_{k-1}^{(n)}\right) \to 1$ as $n \to \infty$, thus $\sum_{i=1}^{n} \mathbb{P}\left(E_{k-1}^{(i-1)}\right) \to \infty$. Then
$$\lim_{n \to \infty} \log y_k^{(n)} \le -\gamma_k^{(\infty)} \lim_{n \to \infty} \sum_{i=1}^{n} \mathbb{P}\left(E_{k-1}^{(i-1)}\right) = -\infty. \tag{6.32}$$

Taking the log of Eq. (6.24) and combining with Eq. (6.32) we get
$$\lim_{n \to \infty} \log \mathbb{P}\left(\neg E_k^{(n)}\right) \le \lim_{n \to \infty} \log y_k^{(n)} = -\infty \tag{6.33}$$
and therefore
$$\lim_{n \to \infty} \mathbb{P}\left(\neg E_k^{(n)}\right) = 0, \tag{6.34}$$
which completes the induction step. Thus, given that $\mathbb{P}\left(\neg E_k^{(n)}\right) \to 0$ as $n \to \infty$ for any $1 \le k \le K$,
$$\lim_{n \to \infty} \mathbb{P}(\text{Success}) = \lim_{n \to \infty}\left(1 - \mathbb{P}\left(\neg E_K^{(n)}\right)\right) = 1. \tag{6.35}$$

6.4.3.1 Selection of an appropriate node ($\gamma_k^{(\infty)}$):

First, we define the following restriction on the definition of $\delta_{BN}$:

Definition VI.10 ($\delta_{BN}$ Restriction). For a reference path $\pi_q^{\text{ref}}$ with robustness $\delta_q$, $\delta_{BN}$ is defined such that $\delta_\theta = \delta_q - \delta_{BN} > 0$.

The proof that $\gamma_k^{(\infty)} > 0$ follows directly from the related work of [1] (proof of Lemma 23). To summarize: due to best-nearest-neighbors selection, there exists a positive-measure region around the minimum-cost vertex $b^{\text{near}}$ which observes the optimal reference path, in which its cost dominates all other nearby nodes; therefore, when $b^{\text{rand}}$ is drawn in this volume, $b^{\text{near}} = (q^{\text{near}}, v^{\text{near}})$ is guaranteed to be selected (Figure 6.8). Since our approach follows an equivalent sampling and nearest-neighbor method to [1] (as shown in Section 6.4.2),
$$\gamma_k^{(\infty)} = \frac{\mu\left(\mathcal{B}_{\delta_\theta}(b_k^*) \cap \mathcal{B}_{\delta_{BN}}(b^{\text{near}})\right)}{\mu(\mathcal{B})} > 0 \tag{6.36}$$
follows directly.

Figure 6.8: Minimum domination region for a node $b_i$, adapted from Li et al. [1] Lemma 23. Sampling $b^{\text{rand}}$ in the shaded region guarantees that a node $b^{\text{near}} \in \mathcal{B}_{\delta_q}(b_k^*)$ is selected for propagation, so that either $b^{\text{near}} = b_i$ or $\text{Cost}(b^{\text{near}}) < \text{Cost}(b_i)$.

To show that $\gamma_k^{(\infty)} < 1$, we need only consider the case when there are at least two nodes in $\mathcal{N}$.

6.4.3.2 $\delta_q$-similar Propagation ($\rho_k^{(i)}$):

Given that our nearest-neighbor method is non-standard and operates in the full configuration space $\mathcal{B}$, we need to carefully consider how this affects the propagation probability $\rho_k^{(i)}$. Given the kinematic model of our robot system, it is straightforward to show that the system in robot space is Small-Time Locally Controllable (STLC), i.e. $q$ can be instantaneously moved in any direction, barring the presence of obstacles or configuration space limits. Then, based on the construction of the covering ball sequence and the $\delta_{BN}$ restriction, the following lemma holds.

Lemma VI.11. If $b^{\text{rand}}$ is within the minimum domination region as described in [1] Lemma 23 (Figure 6.8), then $q^{\text{rand}} \in \mathcal{B}_{\delta_q}(q_k^*)$ and Connect() will generate a segment that is $\delta_q$-similar to segment $k$ of the reference path.

Proof. Assume that $b^{\text{rand}} \in \mathcal{B}_{\delta_\theta}(b_{k-1}^*)$. Then we have
$$d_q(q^{\text{rand}}, q_k^*) \le d_b(b^{\text{rand}}, b_k^*) \le d_b(b^{\text{rand}}, b_{k-1}^*) + d_b(b_{k-1}^*, b_k^*) \le \delta_\theta + \delta_s = \delta_q - \delta_{BN} + \delta_s.$$
Then, by construction of the covering ball sequence, we have that $\delta_s - \delta_{BN} < 0$ and thus $d_q(q^{\text{rand}}, q_k^*) < \delta_q$. In addition, the straight line between $q^{\text{near}}$ (as selected by $q^{\text{rand}}$) and $q^{\text{rand}}$ is entirely contained in $\mathcal{B}_{\delta_q}(q_{k-1}^*)$, and thus is also in $\mathcal{Q}^{\text{valid}}$ as the reference path is optimal δ-robust. We then have that the path generated by Connect is $\delta_q$-similar to the $k$th segment of the reference path.

Lemma VI.12. The probability of covering segment $k$ at iteration $i$, given that we have not yet covered segment $k$ but have covered segment $k - 1$,
$$\mathbb{P}\left(A_k^{(i)} \mid E_{k-1}^{(i-1)} \cap \neg E_k^{(i-1)}\right) = \gamma_k^{(i)}\rho_k^{(i)},$$
is lower-bounded by $\gamma_k^{(\infty)}$.

Proof. Consider two possible events. First, that $b^{\text{rand}}$ is within the minimum domination region (Figure 6.8) of $V^{\text{near}}$; in this case, by Lemma VI.11, Connect() will generate a $\delta_q$-similar segment with probability 1. Denote this event as $B$. Second, the event that $b^{\text{rand}}$ is somewhere else; denote this event as $C$. Then we can bound $\mathbb{P}\left(A_k^{(i)} \mid E_{k-1}^{(i-1)} \cap \neg E_k^{(i-1)}\right)$ by considering only $B$:
$$\mathbb{P}\left(A_k^{(i)} \mid E_{k-1}^{(i-1)} \cap \neg E_k^{(i-1)}\right) = \mathbb{P}(B) + \mathbb{P}(C) \ge \mathbb{P}(B) \ge \gamma_k^{(\infty)}.$$

6.5 Simulation Experiments and Results

We now present four example tasks to demonstrate our algorithm: two with cloth and two with rope. These tasks are designed to show that our framework is able to handle non-trivial tasks which cannot be performed using either our controller or planner alone. In Section 6.6 we demonstrate that our method can also be applied to a physical robot. For these simulation tasks $\mathcal{Q} = SE(3)^G$, i.e. there are two free-flying grippers. In the first and second tasks, two grippers manipulate the cloth so that it covers a table. In the first task the cloth is obstructed by a pillar, while in the second task the grippers must pass through a narrow passage before the table can be covered. The third and fourth scenarios require the robot to navigate a rope through a three-dimensional maze before aligning the rope with a line traced on the floor (see Figure 6.1). A video showing the experiments can be found at https://www.youtube.com/watch?v=O9wOqpbev6U.

All experiments were conducted in the open-source Bullet simulator [71], with additional wrapper code developed at UC Berkeley [80]. The cloth is modeled as a triangle mesh using 1500 vertices with a total size of 0.3 m × 0.5 m. The rope is modeled as a series of small capsules linked together by springs. In the first rope experiment we use 39 capsules for a 0.78 m long rope, and 47 capsules for a 0.94 m rope in the last experiment. We emphasize that our method does not have access to the model of the deformable object or the simulation parameters; the simulator is used as a "black box" for testing. We set the maximum stretching factor $\gamma^{\max}$ to 1.17 for the cloth and 1.15 for the rope. All tests are performed using an i7-8700K 3.7 GHz CPU with 32 GB of RAM. We use the same deadlock prediction and planner parameters for all tasks, shown in Tables 6.1 and 6.2. For the purpose of the planner we treat the grippers as spheres, reducing the planning space from $SE(3)^G \times \mathcal{V}$ to $\mathbb{R}^6 \times \mathcal{V}$.

Table 6.1: Deadlock prediction parameters

Prediction Horizon $N_p$: 10
Band Annealing Factor $\alpha$: 0.3
History Window $N_h$: 100
Error Improvement Threshold $\beta_e$: 1
Configuration Distance Threshold $\beta_m$: 0.03

Table 6.2: Distance and planner parameters

Goal Bias $\gamma_{gb}$: 0.1
Workspace Goal Radius $\delta^{\text{goal}}$: 0.02
Best Nearest Radius $\delta_{BN}$: 0.001
Band Distance Scaling Factor $\lambda_v$: $10^{-6}$
Maximum Band Points $N_v^{\max}$: 500

To smooth the path returned by the planner, at each iteration we randomly select either a single gripper or both grippers, and two configurations in the path. To smooth between the configurations we use the same forward-propagation method for the VEB as used in the planning process. If we have selected only one gripper for smoothing, we do not change the configuration of the second gripper during that smoothing iteration. We also forward-propagate the VEB to the end of the path to ensure that the band at the end of the smoothed path is dissimilar from the blacklist. We perform 500 smoothing iterations for experiments 1, 2, and 4, and 1500 iterations for experiment 3 due to the larger environment.
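A simplified sketch of one smoothing iteration is given below (the single-gripper variant and the final blacklist re-check are omitted); propagate and band_valid stand in for the VEB forward propagation and validity checks described above, and path entries are assumed to be numpy arrays:

import random

def smoothing_iteration(path, bands, propagate, band_valid):
    """path: list of gripper configurations; bands: matching list of VEBs."""
    i, j = sorted(random.sample(range(len(path)), 2))
    if j - i < 2:
        return path, bands  # nothing to shortcut
    # propose a straight-line shortcut between waypoints i and j
    n = j - i
    shortcut = [path[i] + (k / n) * (path[j] - path[i]) for k in range(n + 1)]
    v, new_bands = bands[i], [bands[i]]
    for q in shortcut[1:]:
        v = propagate(v, q)        # forward-propagate the VEB along the shortcut
        if not band_valid(v):      # reject on collision or overstretch
            return path, bands
        new_bands.append(v)
    return (path[:i] + shortcut + path[j + 1:],
            bands[:i] + new_bands + bands[j + 1:])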

6.5.1 Single Pillar

In the first example task, the objective is to spread the cloth across a table that is on the far side of a pillar (see Figure 6.9). We uniformly discretize the surface of the table to create the target points $\mathcal{T}$, with each discretized point creating a navigation function that pulls the closest point on the deformable object towards the target. These target points are set slightly above the surface to allow for collision margins within the simulator. A single point on the cloth can have multiple "pulls" or none. Task error ρ is defined as the sum of the Dijkstra's distances from each target point to the closest point on the cloth. If a target point in $\mathcal{T}$ is within a small enough threshold of its nearest neighbor in $P$, then that point is considered "covered" and does not influence task error or any other calculation. Our results show that even though the global planner plans using only the gripper positions and a VEB between them, it is able to find the correct neighbourhood for the local controller to complete the task. On average we are able to find and smooth a path in 3.0 seconds (Table 6.3, Table 6.4), with the majority of the planning time spent on forward propagation of the VEB as part of the validity check for a potential movement of the grippers. In all 100 trials the global planner is invoked only once, with the local controller completing the task after the plan finishes.

[Figure 6.9 panels: (1) Initial state; (2) Deadlock prediction; (3) Planned path; (4) Path execution; (5) Path finished; (6) State reached by controller]

Figure 6.9: Sequence of snapshots showing the execution of the first experiment. The cloth is shown in green, the grippers are shown in blue, and the target points are shown as red lines. (1) The approximate integration of the navigation functions from error reduction over Np timesteps, shown in magenta, pulls the cloth to opposite sides of the pillar. (2) A sequence of VEBs between the grippers is shown in black and teal, indicating the predicted gripper configuration over the prediction horizon as the local controller follows the navigation functions. The elastic band changes to teal as the predicted motion of the grippers moves the cloth into an infeasible configuration. (3–5) The resulting plan by the RRT, shown in magenta and red, moves the system into a new neighbourhood. (6) Final system state when the task is finished by the local controller.

6.5.2 Double Slit

The second experiment uses the same setup as the first, with the only change being that the single pillar obstacle is replaced by a wide wall with two narrow slits (Figure 6.10). This adds a narrow passage problem and also demonstrates the utility of the progress detection filter. In this example the local controller is trying to move the deformable object straight forward, but with the wall in the way it is unable to make progress; the local controller cannot explicitly go around obstacles. This experiment shows comparable planning time, but it takes longer to smooth the resulting path (as expected, given that the VEB forward propagation takes longer near obstacles). The local controller is again able to complete the task after invoking the planner a single time in all 100 trials.

[Figure 6.10 panels: (1) Initial state; (2) Deadlock prediction; (3) Planned path; (4) Path execution; (5) Path finished; (6) State reached by controller]

Figure 6.10: Sequence of snapshots showing the execution of the second experiment. We use the same colors as the previous experiment (Figure 6.9), but in this example instead of detecting future overstretch in panel (2), we detect that the system is stuck in a bad local minimum and unable to make progress.

6.5.3 Moving a Rope Through a Maze

In the third task, the robot must navigate a rope through a three-dimensional maze before aligning the rope with a line traced on the floor (Figure 6.11). This scenario is meant to represent tasks such as moving a heavy cable through a construction zone without crane access. In this task, the correspondences between the target points T and the deformable object points P are fixed in advance; thus the CalculateCorrespondences() function does not have to do any work, as shown in Table 6.5. Task error ρ is defined in the same way as in the first two experiments. Again the planner is invoked a single time per trial, but planning and smoothing times are longer than in the previous tasks. This is a function of the size of the environment rather than any particular difference in the difficulty of performing the planning or smoothing. The planner finds a feasible path in 4.2s on average, suggesting that our method can maintain fast planning times even in larger environments with many more obstacles.

[Figure 6.11 panels: (1) Initial state; (2) Deadlock prediction; (3) Planned path; (4–7) Path execution; (8) Path finished; (9) State reached by controller]

Figure 6.11: Sequence of snapshots showing the execution of the third experiment. The rope is shown in green starting in the top left corner, the grippers are shown in blue, and the target points are shown in red in the top right corner. The maze consists of top and bottom layers (green and purple, respectively). The rope starts in the bottom layer and must move to the target on the top layer through an opening (bottom left or bottom right).

6.5.4 Repeated Planning

The fourth task is a variant of the third, with the start configuration of the rope moved near the goal region on the top layer of the maze and a longer rope. This task has the most potential for a planned path to move the deformable object into a configuration from which the local controller cannot finish the task, by wrapping the rope around an obstacle near the goal. For this experiment we reduce the size of the planning arena to only the goal area and its immediate surroundings on the top layer (Figure 6.12). From this starting position, the planner is more likely to find, on the first attempt, the incorrect neighborhood for the local controller, which corresponds to placing the rope into the wrong homotopy class. We emphasize that the correct homotopy class is unknown, as we assume no information is given about the connectivity of the target points. Thus our method must discover the correct homotopy class by trial-and-error, invoking the planner when the deadlock prediction determines the controller will be stuck. In 71 of the 100 trials the planner was invoked twice, in 13 other trials it was invoked three times, and in 2 trials it was invoked four times. These additional planning and smoothing stages took on average an additional 6.6 seconds, but the task was completed successfully in all 100 trials. This experiment suggests that our framework is able to effectively explore different band neighborhoods until the correct one is found, enabling the local controller to finish the task, even when the initial configuration is adversarial.

[Figure 6.12 panels: (1) Initial state; (2) Potential fields; (3) Deadlock prediction; (4) Planned path; (5) Path finished; (6) Deadlock prediction; (7) Planned path; (8) Path execution; (9) Path finished; (10) Task finished]

Figure 6.12: Sequence of snapshots for the fourth experiment. We use the same colors as the previous experiment (Figure 6.11), but in this example the local controller gets stuck twice, in panels 3 and 6. In panel 7 the global planner finds a new neighbourhood that is distinct from previously-tried neighbourhoods.

Table 6.3: Planning statistics for the first plan for each example task in simulation, averaged across 100 trials. Standard deviation is shown in brackets.

                    Samples   States   NN Time (s)   Validity Checking Time (s)   Total Time (s)
Single Pillar       158       1182     ∼0.0          0.6                          0.6
                    [121]     [804]    [∼0.0]        [0.5]                        [0.5]
Double Slit         478       2124     ∼0.0          0.7                          0.7
                    [353]     [1428]   [∼0.0]        [0.8]                        [0.8]
Rope Maze           4796      9926     0.1           4.0                          4.2
                    [1613]    [3760]   [∼0.0]        [1.7]                        [1.8]
Repeated Planning   54        153      ∼0.0          0.1                          0.1
                    [46]      [147]    [∼0.0]        [0.1]                        [0.1]

Table 6.4: Smoothing statistics for the first plan for each example task in simulation, averaged across 100 trials. Standard deviation is shown in brackets.

                    Iterations   Validity Checking Time (s)   Visibility Deformation Time (s)   Total Time (s)
Single Pillar       500          0.8                          1.6                               2.4
                                 [1.2]                        [0.2]                             [1.2]
Double Slit         500          2.5                          ∼0.0                              2.5
                                 [2.6]                        [∼0.0]                            [2.6]
Rope Maze           1500         6.4                          ∼0.0                              6.5
                                 [3.9]                        [∼0.0]                            [3.9]
Repeated Planning   500          1.4                          ∼0.0                              1.4
                                 [0.9]                        [∼0.0]                            [0.9]

6.5.5 Computation Time

To verify the practicality of our deadlock prediction algorithm and VEB approximation, we gathered data comparing the computation time for these components to the local controller by itself, and to using the Bullet simulator. Table 6.5 shows the average times per iteration for the local controller and deadlock prediction algorithms, averaged across all trials of all experiments. As expected, adding in the deadlock prediction step does increase computation time, but the overall control loop is still fast enough for practical use.

Table 6.5: Local controller and deadlock prediction avg. computation time per iteration for each type of deformable object, averaged across all trials.

        CalculateCorrespondences() Time (s)   PredictDeadlock() Time (s)   Local Controller Time (s)
Cloth   0.0114                                0.0077                       0.0126
Rope    0                                     0.0119                       0.0023

Table 6.6: Average computation time to compute the effect of a gripper motion.

        Bullet Simulation Time (ms)   VEB Propagation Time (ms)
Cloth   36.12                         0.19
Rope    3.19                          0.58

Table 6.6 shows a comparison between the average time needed to compute the VEB propagation for a gripper motion and the time needed to reliably simulate a gripper motion with the Bullet simulator. Note that the amount of time required for the simulator to converge to a stable estimate depends on many conditions, including what object is being simulated. Through experimentation we determined that 4 simulation steps were adequate for rope and 10 for cloth. Comparing the time needed to do this simulation to the time needed to forward propagate a VEB, we see that our approximation is indeed faster by an order of magnitude for rope, and by two orders of magnitude for cloth. This result reinforces the importance of using a simplified model, such as the VEB, within the planner—this model, while not as accurate as a simulation, allows us to evaluate motions much faster.

6.6 Physical Robot Experiment and Results

In order to show that our method is practical for a physical robotic system, not only free-floating end-effectors, we set up a task similar to the single pillar task (Section 6.5.1) with a dual-arm robot. This experiment also shows that, while our method makes strong assumptions in Section 6.1 about the ability to perceive the deformable object (in particular, no occlusions and no sensor noise), our framework is still able to perform meaningful tasks when those assumptions are violated. In this task the robot must align a cloth placemat inside of the pink rectangle, going around an obstacle in the process (Figure 6.13).

[Figure 6.13 panels: (1) Initial state; (2) Deadlock prediction; (3) Planned path; (4) Path execution; (5) Path finished; (6) State reached by controller]

Figure 6.13: Cloth placemat task. The placemat starts on the far side of an obstacle and must be aligned with the pink rectangle near the robot.

6.6.1 Experiment Setup

6.6.1.1 Robotic Platform:

Val is a stationary robotic platform with a 2-DoF torso, two 7-DoF arms, and a rotary pincer per arm. As in the simulated environments, it is assumed that Val is already holding the cloth, leaving 16 DoF to be controlled and planned for (Q = R^16).

6.6.1.2 Cloth Perception:

The placemat is 0.33m × 0.46m, which we discretize into a 3 × 3 grid. As tracking of deformable objects is a difficult problem, and out of scope of this work, we instead use fiducials to track the configuration of the cloth. Two of the points are tracked using the position of the grippers; the other 7 points are tracked with AprilTags [72] and a Kinect V2 RGB-D sensor [73].

In order to address occlusions and noisy data, we filter the raw observations using a set of objective terms and a set of constraints (see Figure 6.14). Denote $z_i$ as the last observed position of point $i$, and denote $t_i$ as the last time point $i$ was observed. Then we add objective terms to pull the cloth estimate towards the observations, combined with constraints between each pair of points to ensure that the estimate is plausible:

$$
\mathcal{P}(t) = \operatorname*{argmin}_{\{p_i\}} \sum_i e^{-K_T (t - t_i)} \left\lVert p_i - z_i \right\rVert^2
\quad \text{subject to} \quad \lVert p_i - p_j \rVert^2 \le K_L \, d_{ij}^2 \quad \forall i, j \text{ s.t. } i \ne j .
\tag{6.37}
$$

$K_T$ and $K_L$ are task-defined scale factors, which we set to 1.5 and 1.0001 respectively for this task.
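As one possible realization, the filter in Eq. (6.37) can be solved with an off-the-shelf constrained optimizer. The sketch below uses SciPy's SLSQP, with the nominal inter-point distances d assumed to come from the discretized placemat; this is an illustrative implementation, not necessarily the one used for the experiments:

import numpy as np
from scipy.optimize import minimize

def filter_cloth_estimate(z, t_obs, t_now, d, p_init, K_T=1.5, K_L=1.0001):
    """Sketch of the tracking filter in Eq. (6.37).

    z: (N, 3) last observed position of each tracked point.
    t_obs: (N,) time each point was last observed; t_now: current time.
    d: (N, N) nominal inter-point distances on the placemat grid.
    p_init: (N, 3) initial guess (e.g. the previous estimate).
    """
    n = z.shape[0]
    weights = np.exp(-K_T * (t_now - t_obs))  # stale observations count less

    def objective(flat_p):
        p = flat_p.reshape(n, 3)
        return np.sum(weights * np.sum((p - z) ** 2, axis=1))

    # One inequality constraint per point pair keeps the estimate plausible.
    constraints = []
    for i in range(n):
        for j in range(i + 1, n):
            constraints.append({
                "type": "ineq",  # SLSQP expects fun(x) >= 0
                "fun": lambda fp, i=i, j=j: K_L * d[i, j] ** 2
                       - np.sum((fp.reshape(n, 3)[i] - fp.reshape(n, 3)[j]) ** 2),
            })

    result = minimize(objective, p_init.ravel(), method="SLSQP",
                      constraints=constraints)
    return result.x.reshape(n, 3)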

[Figure 6.14 legend: Observation (z_i, t_i); Estimated Position (p_i); Constraint Term; Objective Term]

Figure 6.14: Constraint and objective graph for Eq. (6.37). Note that not all constraints are shown to avoid clutter; every estimated position has a constraint between itself and every other estimated position.

6.6.2 Experiment Results

We use the same deadlock, distance, and planner parameters as used in the simulation experiments, performing 500 smoothing iterations once a path is found. We constrain the rotation of the end-effectors to stay within 1.6 radians of their starting orientation during the planning process, as well as constrain the grippers to stay close to the table. This forces the planner to move the placemat around the obstacle rather than over the obstacle. Last, we also introduce planning restarts [81] into the planning process in order to address the greater complexity added by using a 16-DoF robot and the relatively strict workspace constraints; we set the restart timeout to 60 seconds. Tables 6.7 and 6.8 show the planning and smoothing statistics across 100 planning trials with identical starting configurations, but different random seeds. On average, planning and smoothing take less than 60 seconds, with forward kinematics and collision checking dominating the planning time. The restart timeout was unused in 68 out of 100 trials, with the other 32 trials requiring a total of 50 restarts between them. Figure 6.15 shows that the planning time follows a “heavy tail” distribution typical of sampling-based planners.

Table 6.7: Planning statistics for the cloth placemat example, averaged across 100 trials. Standard deviation is shown in brackets.

Samples   States   NN Time (s)   Validity Checking Time (s)   Random Restarts   Total Time (s)
83041     8438     4.5           44.1                         0.5               50.0
[83677]   [6182]   [4.9]         [44.5]                       [0.9]             [50.9]

Table 6.8: Smoothing statistics for the cloth placemat example, averaged across 100 trials. Standard deviation is shown in brackets.

Iterations   Validity Checking Time (s)   Visibility Deformation Time (s)   Total Time (s)
500          3.6                          0.1                               3.6
             [1.1]                        [∼0.0]                            [1.1]

Our overall framework is able to complete this task as shown in Figure 6.13. As in the simulated version of this task, we are able to predict deadlock before the robot gets stuck, plan and execute a path to a new neighbourhood, and then use the local controller to finish the task.

Figure 6.15: Histogram of planning times across 100 trials for the cloth placemat experiment.

6.7 Discussion

We have presented a method to interleave global planning and local control for deformable object manipulation that does not rely on high-fidelity modeling or simulation of the object. Our method combines techniques from topologically-based motion planning with a sampling-based planner to generate gross motion of the deformable object. The purpose of this gross motion is not to achieve the task alone, but rather to move the object into a position from which the local controller is able to complete the task. This division of labor enables each component to focus on its strengths rather than attempt to solve the entire problem directly. We also presented a probabilistic completeness proof for our planner which does not rely on either a steering function or choosing controls at random, and addresses our underactuated system. As part of our framework, we introduced a novel deadlock prediction algorithm to determine when to use the local controller and when to use the global planner. Our experiments demonstrate that our framework can be applied to several interesting tasks for rope and cloth, including an adversarial case where we set up the planner to fail on the first attempt. For the simulated tasks, our framework is able to succeed at each task 100/100 times, with average planning and smoothing time under 4 seconds for 3 tasks, and under 11 seconds for the larger environment. The physical robot experiment shows that our framework can be used for practical tasks in the real world, with planning and smoothing taking less than 60 seconds on average. This experiment also shows that our methods can function despite noisy and occluded perception of the deformable object.

6.7.1 Parameter Selection

There are several parameters in both the local controller and the global planner that can have a large impact on the performance of our method. In particular, if the local controller is prone to oscillations, this can cause the deadlock prediction algorithm to incorrectly predict that the local controller will get stuck, leading to an unnecessary planning phase. In the worst case, this can cause the global planner to be unable to find an acceptable path due to the blacklisting procedure. One interesting direction of future research is how to perform reachability analysis for deformable objects in general, in particular when a high-fidelity model of the deformable object is not available. In practice we found that increasing the prediction horizon Np and the band annealing factor α was not useful, as the prediction accuracy degrades quickly. We did have to tune the history window Nh and the thresholds βe, βm against each other. The error improvement threshold βe needs to be set relative to the definition of task error ρ, while βm is more sensitive to oscillations. If βm is too small, then the system will fail to detect that the controller is stuck in a poor local minimum. If these thresholds are too high or Nh is too low, then false positives become common near the end of the table coverage tasks.

For the global planner, we found that the goal bias γgb has a similar effect on planning time as in a standard RRT; values in the range [0.05, 0.15] produced similar planning times for our experiments. In addition, if λv is not small, then nearest-neighbour checks can become very expensive. In practice, distances in band space are used to disambiguate between nodes that are at nearly identical configurations in robot configuration space. This happens when multiple nodes connect to the position goal $q_{xyz}^{goal}$, but their bands are similar to a blacklisted band. One potential way to make distances in band space more informative would be to develop a way to sample interesting band configurations.

6.7.2 Limitations

We made a choice to favor speed over model accuracy. As a consequence, there are several issues that our method does not address. In particular, environments with “hooks” can cause problems due to our approximation methods; the virtual elastic band we use for constraint checking and planning assumes (1) that there is no minimum length of the deformable object and (2) that there are no holes in the deformable object. These assumptions mean that our planner cannot detect cases where the slack material or a hole can get snagged on corners or hooks, preventing the motion plan from being executed. One way this can be mitigated is by using a more accurate model (at the cost of speed and task-specific tuning). Other potential solutions include online modeling methods such as [28], or learning which features of the workspace can lead to highly inaccurate approximations and planning paths that avoid those areas.

In addition, we have no explicit method to avoid twisting or knot-tying behavior. While shortcut smoothing can potentially mitigate the worst effects, avoiding such cases is not something that is within the scope of this work. Similarly, we have no explicit consideration for achieving a task that requires knot-tying or twisting; while some other local controller may be able to perform these tasks from a suitable starting state, we have not investigated this option. Last, we cannot guarantee that we can achieve any given task in general; while our blacklisting method is designed to encourage exploration of the state space, it also has the potential to block regions of the state space from which the local controller can achieve the task. Defining a set of tasks which our framework can successfully perform is not practical given the limited set of assumptions we are making about the deformable object. Despite these limitations, we find that our framework is able to reliably perform complex tasks where neither planning nor control alone is sufficient.

CHAPTER VII

Learning When To Trust Your Model

7.1 Introduction

Robot motion planning algorithms have been extremely successful for systems where the dynamics can be easily specified and efficiently evaluated. However, for tasks such as manipulation of deformable objects, the dynamics are very difficult to model [5] and usually require numerical simulation to evaluate. This simulation can be time-consuming and/or inaccurate. Including such simulations inside a planner can result in plans that take hours to compute [6]. Motivated by tasks where dynamics are difficult to specify and evaluate, we present a framework to plan in a reduced state space with simplified dynamics while biasing the planner to find plans that are likely to be feasible under the true dynamics. To do this, we define a function that maps from the true state space to a reduced state space as well as a dynamics model in the reduced space. We can then generate plans in the reduced space and roll them out on the true system offline to gather data on how the reduced and true dynamics correspond. Specifically, we find which transitions (i.e. state-action pairs) in the reduced state space produce reliable predictions as compared to rolling out the given action with the true dynamics. After gathering a dataset where transitions are labeled as reliable or unreliable, we train a classifier to predict the reliability of a given transition. We then incorporate this classifier into an RRT-based planner by biasing the planner to reject transitions that are classified as unreliable. The resulting planner tends to find sequences of transitions that are likely to produce the desired outcome when the true dynamics of the system are applied, even though the planner plans with no explicit knowledge of the true dynamics. This chapter presents both an abstract formulation of the problem of planning in a reduced state space with a classifier and how to apply this formulation to two

systems. First, to illustrate our framework on a straightforward example, we consider a planar three-link arm with limited joint torque. Using a learned classifier for this system allows us to plan in configuration space (not considering dynamics) while avoiding transitions that cannot be accomplished with the limited torque. The second system focuses on rope manipulation tasks; we use a Virtual Elastic Band (VEB) (Section 6.2.2.2) as the reduced dynamics model of the rope, as this has been shown to allow fast planning in difficult scenarios. However, this model assumes that there is no minimum length of the rope, which entails that the planner cannot detect cases where the slack material is caught on corners or hooks, preventing the motion plan from being completed because the caught object can overstretch (see Figure 7.1). Thus, we learn a classifier to bias the planner away from states where the object can be caught on an obstacle.

Figure 7.1: Top: a plan generated without using a classifier moves the rope under a hook and gets caught. Bottom: a plan generated with a classifier avoids this mistake, and reaches the goal.

The contributions of this chapter are:

1. A novel formulation of planning in reduced state spaces

2. A method to generate and label data for classification of transition reliability

3. Experiments demonstrating the efficacy of our framework for both the planar arm reaching and rope manipulation tasks

Our experiments suggest that we can learn a classification function for the reliability of transitions which improves the success rate of planning with the reduced model for both the planar arm and rope manipulation. Our simulation experiments considering rope manipulation in several challenging environments containing hooks demonstrate the classifier's ability to bias the planner away from unreliable transitions and to generalize over environments and rope lengths. Finally, to show a practical application of our work, we demonstrate our method running on a 16-DoF robot manipulating a rope near an engine assembly.

7.2 General Problem Formulation

We begin by formulating our problem in a system-agnostic way and then describe how to apply this formulation to planning for rope manipulation. Let the true system operate in a state space $X$ with dynamics $x_{t+1} = f(x_t, u^x, O)$, where $u^x$ is a command given to the system and $O$ is the environment. We assume that the true state space has Markovian dynamics.

The problem we address in this work is how to find a sequence of $N_e$ commands $\{u_1^x, \ldots, u_{N_e}^x\}$ to move from a start state $x_0$ to a goal state. The goal set is specified by the function $\text{Goal}_x : X \to \{0, 1\}$, which returns 1 if a state is a goal and 0 otherwise. We thus seek to solve the following:

$$
\begin{aligned}
\text{find} \quad & N_e,\ \{u_1^x, \ldots, u_{N_e}^x\} \\
\text{s.t.} \quad & \text{Goal}_x(x_{N_e}) = 1 \\
& x_t = f(x_{t-1}, u_t^x, O), \quad t = 1, \ldots, N_e .
\end{aligned}
\tag{7.1}
$$

However, $f$ may not be known in closed form or it may be expensive to evaluate within a planner. Thus we cannot solve this problem by planning in $X$ with the true dynamics. To create a more tractable planning problem we define $B$ to be a reduced state space and define a reduction function $b = r(x, O)$. We do not assume that $r$ is invertible. Dynamics in $B$ are defined as $b_{t+1} = g(b_t, u^b, O)$ (note that $u^b$ and $u^x$ may be in different spaces). We treat the dynamics in this reduced state space as Markovian. $B$, $r$, and $g$ are user-defined. There is then an analogous goal function

for reduced states $\text{Goal}_b : B \to \{0, 1\}$. The planning problem then becomes:

$$
\begin{aligned}
\text{find} \quad & N_e,\ \{u_1^b, \ldots, u_{N_e}^b\} \\
\text{s.t.} \quad & b_0 = r(x_0, O) \\
& \text{Goal}_b(b_{N_e}) = 1 \\
& b_t = g(b_{t-1}, u_t^b, O), \quad t = 1, \ldots, N_e .
\end{aligned}
\tag{7.2}
$$

Rather than making explicit guarantees on the relationship between $f$ and $g$, we assume we have access to a rollout function $x_{t+1} = \Gamma(x_t, u^b, O)$, which outputs the next state when attempting to perform an action $u^b$ given some controller for the system. We assume that $\Gamma$ has built-in safety limits, so it will stop before violating a constraint (e.g. stopping before colliding with an obstacle). If $\Gamma$ reaches a constraint boundary it will output the state on the boundary and will not violate the constraint. $\Gamma$ is treated as a black box. The form of $\Gamma$ may be known, but even if it is, we assume it is too expensive to evaluate within the planner; otherwise there would likely be no need for the reduction. We assume we are able to gather data by executing $\Gamma$. We will solve the problem in Eq. (7.2) using a motion planner that plans in $B$. However, the plan we generate may not lead to the goal in execution because we may have lost information in $r$ and/or $g$. Our approach is thus to bias our planner so that it avoids taking actions for which $r$ and $g$ are not reliable approximations of the behavior of the system. See Figure 7.2 for an overview of our framework.
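In code, the framework requires only three user-supplied callables; the sketch below names them for illustration (the names and signatures are ours, not an existing API):

from typing import Any, Callable, List, Optional

# User-supplied pieces of the framework (names illustrative):
#   r(x, env)            -> b          : reduction to the reduced state space B
#   g(b, u_b, env)       -> b or None  : reduced dynamics (None if invalid)
#   rollout(x, u_b, env) -> x          : Γ, the black-box true system with
#                                        built-in safety limits
Reduction = Callable[[Any, Any], Any]
ReducedDynamics = Callable[[Any, Any, Any], Optional[Any]]
Rollout = Callable[[Any, Any, Any], Any]

def execute_plan(x0: Any, commands: List[Any], rollout: Rollout, env: Any) -> List[Any]:
    """Execute a planned command sequence on the true system, recording the
    ground-truth state sequence used later for labeling. Because Γ stops at
    constraint boundaries instead of violating them, every recorded state is
    feasible even when the plan is not."""
    states = [x0]
    for u_b in commands:
        states.append(rollout(states[-1], u_b, env))
    return states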

Figure 7.2: An outline of our framework. First, we plan and execute many control sequences to gather a dataset of transitions. These transitions are labeled according to a function l and used to train a classifier which predicts whether a transition is reliable given the model reduction. This classifier is used to bias the planner away from unreliable transitions.

7.3 Learning Transition Reliability

To bias the planner that plans in $B$ we will learn a classifier that attempts to predict if a given transition $T^b = (b_t, u^b, O)$ will reliably succeed (e.g. not be stopped by a constraint boundary) when executed in environment $O$. We thus wish to learn a function $\text{Classify} : \{T^b\} \to \{\text{Reliable}, \text{Unreliable}\}$, which outputs Reliable when performing this transition reliably succeeds under various starting $x_0$s and previous command sequences $u^b_{0:t-1}$, and Unreliable otherwise, in which case the planner should be biased not to use $T^b$. Ideally, we would include $x_0$ and $u^b_{0:t-1}$ as input to the classifier. However, $X$ may be arbitrarily high-dimensional (e.g. for a deformable object) and there may be an arbitrary number of previous commands before $T^b$, thus making the classifier very difficult to learn with a realistically-sized dataset. Instead we only consider $T^b$ as input.

7.3.1 Data Generation and Labeling

To train our classifier, we need to gather a dataset of transitions and label them by whether the model reduction produces a reliable prediction. We would like to generate a training dataset of transitions in a similar way to how a planner would generate transitions, since this avoids distribution mismatch problems when planning. We therefore collect and label data from executed plans which we generate without a classifier. To do this, we run the planner without using Classify starting from some $x_0$. Planning generates a transition sequence $T^b = \{T^b_t \mid t = 1, 2, \ldots\}$. We then execute the plan, which gives a ground-truth sequence of states $\tau^x = \{x_0,\ x_t = \Gamma(x_{t-1}, T^b_t.u^b, O) \mid t = 1, 2, 3, \ldots, |T^b|\}$. We then reduce $\tau^x$ to the reduced state space, producing $\tau^b = \{\tilde{b}_t = r(\tau^x_t, O) \mid t = 1, 2, \ldots, |\tau^x|\}$. For time $t$, let the reduced dynamics prediction be $\hat{b}_t = T^b_t.b$ and the rollout result be $\tilde{b}_t = \tau^b_t$. Figure 7.3 summarizes the computation required to produce these variables. To label the data we require a function that evaluates if the two predictions are meaningfully similar for the given system. Let the function $\text{Close} : B \times B \to \{0, 1\}$ return 1 when two reduced states are meaningfully similar and 0 otherwise. The environment $O$ is also an input to Close but we omit it for brevity.

Figure 7.3: Circles represent variables and boxes represent functions. Orange: variables defining the tth transition. Red path: reduced dynamics prediction; Blue path: rollout result.

We label the $t$th transition in $T^b$ using the function $l$:

$$
l(t, T^b, \tau^b) =
\begin{cases}
\text{Reliable} & \text{if } \text{Close}(\hat{b}_t, \tilde{b}_t) \text{ and } \text{Close}(\hat{b}_{t+1}, \tilde{b}_{t+1}) \\
\text{Unreliable} & \text{otherwise}
\end{cases}
\tag{7.3}
$$

Intuitively, this function first checks if $\hat{b}_t$ and $\tilde{b}_t$ are close. If they are, and $\hat{b}_{t+1}$ and $\tilde{b}_{t+1}$ are not close, then the transition is labeled Unreliable because the reduced dynamics prediction was inaccurate. If the starting states $\hat{b}_t$ and $\tilde{b}_t$ are not similar, then the rollout and reduced dynamics predictions have already diverged and we do not have a meaningful ground-truth label for this transition. To be conservative in our prediction, we label this transition Unreliable. If both $\hat{b}_t, \tilde{b}_t$ are close and $\hat{b}_{t+1}, \tilde{b}_{t+1}$ are close, then $r$ and $g$ have performed well and we label the transition Reliable.
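A direct Python transcription of Eq. (7.3), where predicted and rolled_out are the aligned sequences of reduced-dynamics predictions and reduced rollout results, and close is the task-specific Close predicate:

def label_transition(t, predicted, rolled_out, close):
    """Label the t-th transition as in Eq. (7.3). predicted[t] is the
    reduced-dynamics prediction (from the plan); rolled_out[t] is the
    rollout result (from executing Γ and reducing with r)."""
    if close(predicted[t], rolled_out[t]) and close(predicted[t + 1], rolled_out[t + 1]):
        return "Reliable"
    # Either the prediction at t+1 was inaccurate, or the sequences had
    # already diverged at t; we are conservative in both cases.
    return "Unreliable"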

7.3.2 An Illustrative Navigation Example

To clarify the above framework and learning problem formulation, we describe an example system where the various functions and spaces can be easily visualized. Consider a car with state $x = [q, \dot{q}]$, where $q = [q_x, q_y, q_\theta]$, and control inputs $u^x = [u_a, u_\phi]$, which correspond to the acceleration and steering angle. $f(x_t, u^x, O)$ is the standard second-order car dynamics.

We define a reduced state using the function $b = r(x, O) = x_q$ (i.e. we only consider position variables in the reduced space) and the controls to be $u^b = [\Delta x, \Delta y, \Delta \theta]$. The reduced dynamics are $b_{t+1} = g(b_t, u^b, O) = b_t + u^b$. Let the rollout function $x_{t+1} = \Gamma(x_t, u^b, O)$ be a method that uses a controller to

drive the car toward $r(x_t, O) + u^b$. $\Gamma$ also checks if the car reaches the boundary of an obstacle in $O$ and will return the state on the boundary if it does. $\text{Close}(\hat{b}_1, \tilde{b}_1)$ outputs 1 if the two reduced states are within a small Euclidean distance and 0 otherwise. The task for the car is to drive to a given location while maintaining low speed and not colliding with obstacles. If we gather training data from this task domain we will find that when $u^b$ drives the car toward an obstacle that is nearby, depending on the velocity at $x_0$, the car can hit the obstacle even though the lines between the planned $b_t$ and $b_{t+1}$ are collision-free for all $t$. Using only the planner, we would accept all transitions where the line from $b_t$ to $b_{t+1}$ is collision-free. However, the classifier will learn that it is better to avoid transitions that entail driving past nearby obstacles. Using the classifier's output to further prune transitions will restrict the planner to transitions that are more likely to succeed when executing $\Gamma$ (see Figure 7.4).
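The reduction and reduced dynamics for this example are short enough to write out directly; the sketch below emphasizes exactly what information (the velocities) is discarded:

import numpy as np

def r_car(x, env=None):
    """Reduction for the car: keep only q = [q_x, q_y, q_theta] and discard
    the velocity q_dot. This is where information is lost."""
    q, q_dot = x
    return np.asarray(q, dtype=float)

def g_car(b, u_b, env=None):
    """Reduced dynamics: purely kinematic, b' = b + u_b with
    u_b = [dx, dy, dtheta]. Whether the true car can actually realize this
    displacement depends on the discarded velocity, which is exactly what
    the classifier must learn to predict."""
    return np.asarray(b, dtype=float) + np.asarray(u_b, dtype=float)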

Figure 7.4: Illustration of desired prediction from a classifier. Dotted triangles indicate $\hat{b}_1$s from different $u^b_0$ inputs. Green: Classifier predicts these transitions are Reliable. Red: Classifier predicts these transitions are Unreliable. Note that the line between $b_0$ and $\hat{b}_1$ is collision-free for all examples shown.

7.3.3 What Can Be Learned

While we may produce a useful classifier for planning, it is important to know that there is a fundamental limitation on what can be learned by this approach because of the loss of information that may happen in $r$ and/or $g$. Consider the following scenario: let $b_0 = r(x_0^a, O) = r(x_0^b, O) = r(x_0^c, O)$, e.g. there are multiple states where the car is at a certain position but with a different velocity. If we apply the reduced dynamics prediction for some $u^b$, we obtain $\hat{b}_1 = g(b_0, u^b, O)$. However, if we do rollouts we obtain three resulting states: $x_1^a = \Gamma(x_0^a, u^b, O)$, $x_1^b = \Gamma(x_0^b, u^b, O)$, $x_1^c = \Gamma(x_0^c, u^b, O)$, and then three reduced states: $\tilde{b}_1^a = r(x_1^a, O)$, $\tilde{b}_1^b = r(x_1^b, O)$, $\tilde{b}_1^c = r(x_1^c, O)$. It may be the case that $\text{Close}(\hat{b}_1, \tilde{b}_1^k)$ does not produce the

same result for k = a, b, c (an example for the car system is shown in Figure 7.5). In terms of classification, this is a case of noisy labeling, and many methods have been devised to address this problem (e.g. SVM with slack variables).

Figure 7.5: Illustration of the effect of information loss on the predictability of a transition. In both scenarios, states with different velocities reduce to the same $b_0$. Left: A case where rolling out the same $u^b$ from different initial velocities (blue) produces the same $\tilde{b}_1$ values, since the controller is robust to initial velocity in this case. Right: A case where rolling out the same $u^b$ with different initial velocities produces different $\tilde{b}_1$ values. At high initial velocity (c) the controller cannot turn before reaching the obstacle.

However, if there are many noisy labels in the data, it means that r and g are not useful for this task domain. As a result the classifier will not make meaningful predictions and we would expect that it would provide no benefit over simply planning in B. However, in our experiments with a planar arm and with rope manipulation we found that we do indeed see a benefit when using the classifier.

7.3.4 Using the Classifier in Planning

While it is possible to query every transition considered by the planner and reject all those that are classified as Unreliable, such a strategy would likely be overly optimistic about what the classifier has learned. In difficult scenarios the classifier may erroneously reject a set of transitions which is necessary to reach the goal, thus causing the planner to fail. Thus we accept transitions that are classified as Unreliable with a small probability determined by the parameters $k$, a manually-specified constant, and $p_{acc}$, the validation accuracy of the classifier (see Algorithm 23). $p_{acc}$ is included because we wish to be more permissive about accepting transitions when the classifier performs more poorly in terms of generalization. While the above approach of incorporating classification can be applied to a wide range of planners, in this paper we focus on using RRT-based planners. An advantage of this approach for RRT-based planners is that we maintain the probabilistic completeness properties of the planner by guaranteeing that any transition will be accepted with a non-zero probability (although the probability is small for transitions classified as Unreliable).

Algorithm 23 CheckTransition(T^b)
 1: b′ ← g(T^b.b, T^b.u^b, O)
 2: if Classify(T^b) == Reliable then
 3:     return b′
 4: a ∼ U[0, 1]
 5: if a < e^(−k · p_acc) then
 6:     return b′
 7: return ∅
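The following is a Python rendering of Algorithm 23; the transition object and its field names are illustrative, and the k and p_acc defaults shown are the values used for the rope experiments in Section 7.6:

import math
import random

def check_transition(t_b, g, classify, env, k=10.0, p_acc=0.91):
    """Algorithm 23 (CheckTransition): return the successor reduced state if
    the transition is accepted, else None. t_b carries the pre-transition
    reduced state t_b.b and command t_b.u_b (field names illustrative)."""
    b_next = g(t_b.b, t_b.u_b, env)
    if classify(t_b) == "Reliable":
        return b_next
    # Accept Unreliable transitions with small probability e^{-k * p_acc}
    # so the planner retains probabilistic completeness.
    if random.random() < math.exp(-k * p_acc):
        return b_next
    return None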

7.4 Application to a Torque-Limited Planar Arm

First, we demonstrate our framework on a 3-link arm that moves in the X-Z plane with gravity in the −z direction. For this system we focus on the effects of including a classifier in the planner without using a reduction function. This allows us to disentangle the effects of model reduction and inaccurate dynamics. For this experiment we use the MuJoCo simulator [82] as the ground-truth dynamics. Each joint is controlled using the default position servo actuator available in MuJoCo. We convert the MuJoCo simulation to a quasistatic system by waiting for the arm to settle after a configuration is commanded (this is f). These experiments do not have any obstacles, so we omit O for brevity.

7.4.1 Problem Statement

Let $x \in \mathbb{R}^3$ be the true state of the system. Let $u^x = x_{des} \in \mathbb{R}^3$ be the commanded position of the arm at each timestep. Let $f(x_t, u_t)$ be the quasistatic dynamics defined by MuJoCo. We set torque limits $\tau_1, \tau_2, \tau_3$ so that $\tau_1 \ll \tau_2 = \tau_3$. This means the first joint cannot support the weight of the arm when extended horizontally. As we are not doing a model reduction, $b = r(x) = x$. Commands in both spaces are also the same: $u^b = u^x$. The dynamics in $B$ are purely kinematic: $b_{t+1} = g(b_t, u^b_t) = u^b_t$. These dynamics are fast to evaluate and therefore efficient for planning, but can result in plans which do not reach the goal configuration when executed (Figure 7.6). As there are no obstacles and $u^b = u^x$, the rollout function $\Gamma(x_t, u^b_t, O) = f(x_t, u^x_t)$. The planning problem is for the arm to reach a goal end-effector position. Using the true dynamics, this corresponds to Problem (7.1). However, since we assume the

true dynamics are not available, we seek to solve Problem (7.2) given the definitions of r and g above.

7.4.2 Data Collection, Labelling, and Training

To collect training data we initialized the system from random start configurations, planning to random goal configurations using RRT-Connect [83]. In practice these are straight lines in configuration space after smoothing the path. This generated a total of 231,815 transitions to use in training and validation. A randomly selected 20% of the data is held out for validation. We define $\text{Close}(\hat{b}_t, \tilde{b}_t)$ based on the Euclidean distance between configurations:

$$
\text{Close}(\hat{b}_t, \tilde{b}_t) = \lVert \hat{b}_t - \tilde{b}_t \rVert < \alpha
\tag{7.4}
$$

with $\alpha = 0.075$. For this planar arm, our classifier is a neural network that takes $b$ and $b'$ as input. The network has three hidden layers with 256, 128, and 64 hidden units respectively and a single output neuron. The hidden units use a Leaky ReLU activation [84], and the output uses a sigmoid activation. With this network we achieve 99% training accuracy and 99% validation accuracy within 4 epochs.
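A sketch of this network in PyTorch; the layer sizes and activations follow the text, while everything else (class name, initialization defaults) is an assumption:

import torch
import torch.nn as nn

class TransitionClassifier(nn.Module):
    """Planar-arm transition classifier sketch: input is the concatenated
    pre- and post-transition configurations (b, b'), each in R^3, so 6
    inputs total."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(6, 256), nn.LeakyReLU(),
            nn.Linear(256, 128), nn.LeakyReLU(),
            nn.Linear(128, 64), nn.LeakyReLU(),
            nn.Linear(64, 1), nn.Sigmoid(),  # P(transition is Reliable)
        )

    def forward(self, b, b_next):
        return self.net(torch.cat([b, b_next], dim=-1))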

7.4.3 Planning and Results

To test the effect of using the learned classifier, we randomly generate 100 planning queries and evaluate how well the generated plans perform when executed. Each planning query consists of a random start configuration in $\mathbb{R}^3$ and goal IK solutions for a random target point in $\mathbb{R}^2$. We only consider queries for which there is at least one IK solution where the controller can maintain the configuration of the arm within $\alpha = 0.075$ of the IK solution. For planning we use the OMPL [85] implementation of RRT-Connect, setting the probability scale factor $k$ to 1, and $p_{acc}$ to 0.99. The resulting path is post-processed using the default simplifyMax options. A plan is considered successful if the distance between the final configuration and any IK solution is less than a threshold $\beta$ (Figure 7.6). Tests are performed on an i5-3570K @ 4.3 GHz. Planning and simplification take approximately 2 ms without a classifier, and approximately 7 seconds with our classifier. For small $\beta$, using the classifier does not improve performance due to the steady-state error inherent in the system. As $\beta$ increases, we see that the planner that uses a classifier is able to successfully find

paths to all queries while the baseline is unable to succeed at some queries (see video at https://www.youtube.com/watch?v=esgWD8Iqi34).

Figure 7.6: Left: Plan generated using the learned classifier to go from $[-\pi/2, 0, 0]$ to $[\pi/2, 0, 0]$. The plan avoids transitions which move the arm toward a horizontal position and successfully completes the task. Center: Plan generated without the classifier. The plan takes the arm to the horizontal position, where it fails due to the torque limit. Right: Number of successes as the success threshold $\beta$ varies.

7.5 Application to Rope Manipulation

We now present the application of our framework to rope manipulation, where we use both a reduction of the state space and an approximate dynamics model.

7.5.1 Problem Statement

Let the robot be represented by a pair of grippers with configuration $q \in SE(3)^G$. We assume that the robot configuration can be measured exactly. In this work we assume the robot to be a set of free-floating grippers; in practice we can track the motion of these with inverse kinematics. We assume that our model of the robot is purely kinematic, with no higher-order dynamics. We assume that the robot has two end-effectors that are rigidly attached to the rope. The configuration of a rope is a set $\mathcal{P} \subset \mathbb{R}^3$ of $P = |\mathcal{P}|$ points. We assume that we have a method of sensing $\mathcal{P}$. The rest of the environment $O$ is assumed to be both static and known exactly. We assume that the robot moves slowly enough that we can treat the combined robot and rope as quasi-static. The true state of the system is then $x = [q, \mathcal{P}]$ and $u^x = \Delta q$. Let $f$ be a joint-space controller for the robot that stops when any of the following occur: 1) the grippers contact an object; or 2) the object stretches by more than a factor $\lambda$. Due to the difficulty of simulating rope physics, we do not assume we can execute $f$ within a motion planner.

We wish to find a sequence of $N_e$ commands $\{u_1^x, \ldots, u_{N_e}^x\}$ to move from a start state $x_0$ to a goal gripper configuration $q_{goal}$ such that each motion is feasible (this corresponds to Problem (7.1)). Note that this planning problem does not require bringing the rope to a specific configuration, which can often be done using local control after bringing the object to a desired area (as in the previous chapter). Because we do not have access to $f$, we cannot solve this problem by planning in $X$ directly. To make planning tractable we will perform a reduction and learn a classifier from data to predict when the reduction can be trusted. That classifier will then be used in a motion planner to bias it away from transitions that are not likely to be feasible under $\Gamma$.

7.5.2 Reduction

Chapter VI introduced the idea of a virtual elastic band (VEB) between the robot's end-effectors. This VEB represents the shortest path through the rope between the end-effectors. The band approximates the constraint imposed by the rope on the motion of the robot; if the end-effectors move too far apart, then the VEB will be too long, and thus the rope is stretched beyond a task-specified maximum stretching factor. Similarly, if the VEB gets caught on an obstacle and becomes too long, then the rope is also overstretched. By considering only the geodesic between the end-effectors, we are assuming that the rest of the rope will comply to the environment, and does not need to be considered when predicting overstretch. The VEB representation allows us to use a fast prediction method when planning, but does not account for the part of the material that is slack. Denote the configuration of a VEB (i.e. a sequence of points) as $v$. Then $b = [q, v]$, and $b$ can be generated by $b = r(x, O)$ for the reduction function defined in Chapter VI. The dynamics of a VEB, $g(b, u^b, O)$, are based on Quinlan's path deformation algorithm as presented in [77] (see Section 6.2.2.2). We also augment $g$ to return $\emptyset$ when $u^b$ causes the object to become overstretched. $b$ is then propagated using $g$ and $u^b = u^x$. Since the commands are the same, our rollout function $\Gamma$, which uses $u^b$, is equivalent to $f$. To find a path in $B$ we must solve Problem (7.2). We use the planner described in Chapter VI to solve this problem; this is an RRT-based planner designed for use with virtual elastic bands as part of the planning configuration space. This planner searches for a feasible path for the robot between a given start and goal configuration, while ensuring the VEB is never overstretched. The VEB approximation choice favors speed over model accuracy; as a consequence, there are several issues that it does not address. Specifically, environments

with “hooks” can cause problems due to the approximation methods: the virtual elastic band assumes that there is no minimum length of the rope. This assumption means that the planner cannot detect cases where the slack material can get caught on corners or hooks, preventing the motion plan from being completed because the caught object can overstretch. To reduce the chances of this occurring, we learn a classifier for $T^b$ to predict if a given transition is either Reliable or Unreliable and bias the planner away from Unreliable transitions. We bias the planner using the CheckTransition function shown in Algorithm 23, which is used as an edge validity check in addition to the collision and overstretching checks described in Chapter VI.

7.5.3 Learning the Classifier

To learn the classifier we define the Close function for our domain to be:

$$
\text{Close}(\hat{b}_t, \tilde{b}_t) = \left( D(\hat{b}_t.v, \tilde{b}_t.v) < \alpha \right) \wedge \text{FOH}(\hat{b}_t.v, \tilde{b}_t.v, O)
$$

where $D$ computes the sum of the point-wise distances between the two virtual elastic bands, $\alpha$ is a constant, and FOH evaluates if the two bands are in the same first-order homotopy class [54]. We then apply the labeling function $l$ shown in Eq. (7.3) for each transition. We generate many plans using our planner (without the classifier) to produce the dataset (see Section 7.6.2). For our classifier we use a neural network based on VoxNet [86], a network for classifying objects from a voxel grid. The network consists of two 3D convolutional layers (filter size 5, 3 and stride 2, 1 respectively) with max pooling, followed by two fully-connected layers. All layers have ReLU activations except the output layer, which has a sigmoid activation. The input for our classifier is a three-channel binary voxel grid, with channels $\langle O, b, b' \rangle$, where $b' = g(b, u^b, O)$. Each voxel grid is 32 × 32 × 32, and is constructed by checking for occupancy at every cell center. An example of the voxelized representation is shown in Figure 7.7.
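The voxelization step might be sketched as follows; env_occupied is a placeholder occupancy query, and the grid origin and cell size are assumed to be chosen so that the grid covers the local neighborhood of the transition:

import numpy as np

def make_classifier_input(env_occupied, band_pre, band_post,
                          origin, cell_size, n=32):
    """Build the 3-channel binary voxel grid fed to the classifier:
    channel 0 = local environment, 1 = pre-transition band,
    2 = post-transition band. Bands are (P, 3) arrays of band points."""
    grid = np.zeros((3, n, n, n), dtype=np.float32)

    # Environment channel: query occupancy at every cell center. Cells
    # outside the environment bounds are treated as occupied (as in
    # Figure 7.7).
    for idx in np.ndindex(n, n, n):
        center = origin + (np.array(idx) + 0.5) * cell_size
        grid[(0, *idx)] = 1.0 if env_occupied(center) else 0.0

    # Band channels: mark the cells containing each band point.
    for channel, band in ((1, band_pre), (2, band_post)):
        cells = np.floor((band - origin) / cell_size).astype(int)
        for c in cells:
            if np.all((c >= 0) & (c < n)):
                grid[(channel, *c)] = 1.0
    return grid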

7.6 Rope Manipulation Experiments

To characterize the planner performance with the classifier, we designed seven simulation scenarios where the rope must be moved from one side of the scene to the other, along with one physical experiment for real-world validation. All simulation experiments were conducted in the open-source Bullet simulator [71], with additional wrapper code developed at UC Berkeley [80]. The rope is modeled as a series of small capsules linked together by springs. We emphasize that our method does not have access to the model of the rope or the simulation parameters. The simulator is used as a “black box” for testing. All tests are performed using an i7-7700K 4.2 GHz CPU with 32 GB of RAM. For all experiments we set $\gamma_{max}$ to 1.15 and $\alpha$ to 0.5.

Figure 7.7: Input to the VoxNet classifier is a 3-channel voxel grid, where white is the local environment, red is the pre-transition band, and blue is the post-transition band. Positions outside the bounds of the environment are marked as occupied in the local environment channel.

7.6.1 Scenarios

Each scenario involves moving the rope past one (or more) hooks, while the grippers traverse a narrow slit (Figure 7.8).

Figure 7.8: The rope is shown in green, with the grippers shown in blue. The target area for the grippers is shown in red. Walls with narrow slits for the grippers are shown in purple. Hooks and other obstacles are shown in dark cyan. Left: Simple Hook; Center Left: Multi Hook; Center Right: Complex Hook; Right: Engine Assembly

Simple Hook: In the Simple Hook environment, the end of the hook is not near any obstacles; thus the planner does not need to deal with the end of the hook and the narrow slit at the same time. We test four variants of the Simple Hook environment: Short, Regular, Long, and Very Long, corresponding to rope lengths of 0.55m, 0.87m, 1.13m, and 1.59m respectively.

Multi Hook: In the Multi Hook environment, the rope must pass through three series of hooks and two narrow slots before reaching the red region on the far side of a solid wall. The rope has length 0.87m.

Complex Hook: In the Complex Hook environment, the grippers are forced to pass on opposite sides of a small hook, while also passing through a narrow slit. The rope has length 0.87m.

Engine Assembly: In the Engine Assembly environment, we seek to move the grippers from one side of an engine model [87] to the other, avoiding two hooks on the front and back of the engine. The rope has length 0.87m. The engine assembly environment is shown in Figure 7.8.

Physical Robot: In the Physical Robot environment, we attempt the engine assembly task on a physical 16-DoF robot, shown in Figure 7.1, with a 3D-printed model of the engine and a rope of length 0.46m.

7.6.2 Data Collection

To collect training data, we ran the planner without any classifier on the Simple Hook Regular scene repeatedly, generating a total of 4190 plans from many different starting locations. This generated a total of 562,177 transitions to use in training and validation. This training set is generated only from the Simple Hook environment using the Regular length rope. We emphasize that we use the classifier trained on this data for planning in all test environments.

7.6.3 Training the Classifier

VoxNet is trained using the Adam optimizer [88] with an initial learning rate of $5 \times 10^{-4}$. We use a learning rate decay of 0.8 every 4 epochs. Since the dataset is imbalanced, with 32% of the examples labelled as unreliable and 68% labelled as reliable, we use a weighted random sampler to make all minibatches balanced. A randomly selected 10% of the data is held out for validation. Our minibatch size is 32, and we train for 100 epochs. We use the binary cross-entropy loss function during training. Training took approximately 16 hours on a Tesla V100-SMX2 GPU. The VoxNet classifier achieved 99% accuracy on the training set and 91% accuracy on the validation set.

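The training setup described above might look as follows in PyTorch; details beyond those stated in the text (the dataset interface and exact sampler configuration) are assumptions:

import torch
from torch.utils.data import DataLoader, WeightedRandomSampler

def train_voxnet(model, dataset, labels, epochs=100, batch_size=32):
    """Training sketch: `dataset` yields (voxel_grid, label) pairs and
    `labels` is the full 0/1 label tensor used to balance minibatches."""
    # Weight each example inversely to its class frequency so that every
    # minibatch is balanced despite the 32%/68% class imbalance.
    class_counts = torch.bincount(labels.long())
    weights = 1.0 / class_counts[labels.long()].float()
    sampler = WeightedRandomSampler(weights, num_samples=len(dataset))
    loader = DataLoader(dataset, batch_size=batch_size, sampler=sampler)

    optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)
    # Decay the learning rate by a factor of 0.8 every 4 epochs.
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=4, gamma=0.8)
    loss_fn = torch.nn.BCELoss()

    for _ in range(epochs):
        for grids, targets in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(grids).squeeze(-1), targets.float())
            loss.backward()
            optimizer.step()
        scheduler.step()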

7.6.4 Planning Results

To evaluate the planning performance when using the classifier, we generated 30 plans using the classifier on each test environment and compare the success rate and planning time to planning without a classifier. A success is when executing the plan results in a final gripper configuration which is within a small tolerance of the goal gripper configuration. If, for example, the rope gets caught on a hook and prevents the grippers from reaching the goal, the trial is marked as a failure. Results are shown in Table 7.1. We set $k$ to 10 and $p_{acc}$ to 0.91. Our results show that using a classifier improves the success rate of the planner over not using a classifier in all tested scenarios, but the effect is less prominent in the Multi Hook environment. The use of a classifier does lead to longer planning times, partially due to extra computation when checking each edge, as well as making the planning problem harder to solve. Section 7.7 discusses this in more detail. Example results can be found in the video at https://www.youtube.com/watch?v=esgWD8Iqi34.

7.7 Discussion

We found that the use of a neural network classifier to evaluate the reliability of a model approximation can be an effective way to improve the success rate of a planner for rope manipulation. When using the classifier in the planner as a hard constraint ($k = \infty$), some starting configurations would cause the classifier to reject all transitions from that state, leading to planning failure. An interesting approach to setting $k$ would be to treat it similarly to temperature, as is done in T-RRT [89]. Despite training the rope manipulation classifier on only a single rope and environment (the Simple Hook with Regular-length rope), we found that the classifier was able to generalize and lead to improved planning performance with both different lengths of rope and more complex environments. It is important to note that while our method can find more feasible plans than not using a classifier, we may be making the planning problem more difficult. In the planar arm example, a straight line in configuration space between start and goal solves the planning problem under $g$ but may not be a feasible plan under $\Gamma$. To avoid this mismatch, the classifier restricts the set of transitions that can be used, but doing so may induce a narrow passage effect, which leads to longer planning times. A major challenge in this work is determining how to label the data in a way


that will lead to good performance. In particular, by including transitions where the reduced dynamics predictions have already diverged in our dataset and labelling these transitions as Unreliable, the classifier learns that interaction with objects is frequently poorly modelled by the VEB. This leads the planner to avoid contact with the environment when possible, but many interesting tasks involving deformable objects will explicitly require interaction with other objects. One interesting point is that increased classification accuracy does not necessarily lead to better planning performance. We experimented with different classifiers during development, and although VoxNet had the best classification accuracy on the validation set and usually the best performance, other classifiers performed equivalently or better for planning in some environments.

Table 7.1: Planning statistics, averaged over 30 trials

                                               Classifier
Environment               Metric               None     VoxNet
Simple Hook - Short       Success rate         18/30    30/30
                          Planning time (s)    0.6      14.6
                          Smoothing time (s)   1.0      5.4
Simple Hook - Regular     Success rate         20/30    29/30
                          Planning time (s)    3.7      17.3
                          Smoothing time (s)   4.0      6.5
Simple Hook - Long        Success rate         23/30    29/30
                          Planning time (s)    6.8      51.9
                          Smoothing time (s)   6.4      8.0
Simple Hook - Very Long   Success rate         18/30    27/30
                          Planning time (s)    12.4     15.4
                          Smoothing time (s)   15.6     11.4
Multi Hook                Success rate         9/30     13/30
                          Planning time (s)    11.5     44.1
                          Smoothing time (s)   22.0     30.0
Complex Hook              Success rate         0/30     20/30
                          Planning time (s)    3.7      28.7
                          Smoothing time (s)   27.0     23.4
Engine                    Success rate         1/30     10/30
                          Planning time (s)    4.8      2.0
                          Smoothing time (s)   0.3      4.1

CHAPTER VIII

Discussion and Future Work

This dissertation presented a framework for autonomously manipulating deformable objects, focusing on methods that can be used without a time-consuming modelling phase and without high-fidelity simulation. We have focused on techniques and algorithmic choices that enable the robot to learn from performing tasks, improving the robot's ability to manipulate deformable objects and perform interesting tasks. Our contributions in modelling and control do not involve learning directly, but they form part of the basis that enables the robot to acquire the experience from which to learn. These methods could also be combined with other techniques such as residual physics [90] to integrate analytical methods with learning-based approaches.

Our initial goal for integrating learning into the planning process was to do so in an online fashion; i.e., after making a plan that leads to getting caught on a hook, we would like to avoid repeating a similar mistake when generating the next plan. The key challenge here is how to evaluate similarity; what does it mean for two transitions to be similar, particularly when accounting for the local environment? By using a classifier directly in state transition space we were able to avoid making the same mistake on a given hook, but this approach does not generalize to different obstacle configurations. Being able to learn quickly from a small number of mistakes and to generalize that experience to new scenarios is a skill that will enable robots to move from the lab to unstructured environments like the home.

Chapter V introduced the idea of using multiple models for control, switching between them as the task progressed; we experimented with a similar idea during the planning process. There are three key challenges that need to be overcome to make this approach workable. First, we need a way to acquire multiple models that might be useful in various situations. The representations used by these models would need to be interchangeable, i.e. in order to switch to a different model at a later timestep we need to be able to convert between the representations used by

each model. Second, we need a way to evaluate the performance of a model in a way that allows us to make decisions during planning about which model to use. If we are using multiple models during the planning process, disentangling the effects of choices made early in the planning process from their long-term effects later in the plan is difficult. In Chapter VII we addressed this credit assignment problem for a single model by introducing a conservative, model-accuracy-based supervised classification problem; this approach has the potential to apply to a multi-model approach, but the effects of switching between models while planning need investigating. Third, it is not clear how and when to use predictions from multiple models during planning; Phillips-Grafflin and Berenson [91] introduced a clustering approach to managing the exponential growth of the planning tree when multiple predictions are considered at each timestep, which could be a promising direction of research.

One of the limiting assumptions of this dissertation is the representation of the deformable object as a set of points, accurately and directly sensed. By making these assumptions we are requiring far more from our perception systems than they are generally capable of. For example, if we consider a pile of clothes, being able to disambiguate between each clothing item that the robot can see, as well as estimating the configuration of every item, is prohibitive. Instead, we may want to consider task-specific representations that enable a higher level of reasoning over actions rather than a point-wise representation. It is an open question how to come by these representations, or how to know when to use a given representation for the task at hand. In this work, we used two representations, one for local control purposes and another for global planning. By using these different representations we were able to write algorithms focused on different parts of an overall task, and then combine these algorithms together into a single framework. Other representations may allow for a different way of thinking about deformable object manipulation that may be better suited to other tasks.

In this dissertation we hand-designed the decisions of when to use each representation and the associated method for manipulating the deformable object. Extending this framework to a broader set of abilities and representations naturally leads to approaches such as task and motion planning techniques, which reason over discrete choices (for example, which representation/algorithm to use) along with continuous variables (for example, where to grasp). A great deal more work is needed in this area as we work towards integrating deformable object manipulation tools into a general robotics solution. Some of the limitations imposed by our interleaving framework may be addressed

One of the limiting assumptions of this dissertation is the representation of the deformable object as a set of points, accurately and directly sensed. By making these assumptions we are requiring far more from our perception systems than they are generally capable of. For example, if we consider a pile of clothes, disambiguating between each clothing item that the robot can see, as well as estimating the configuration of every item, is prohibitive. Instead, we may want to consider task-specific representations that enable a higher level of reasoning over actions rather than a point-wise representation. It is an open question how to obtain these representations, or how to know when to use a given representation for the task at hand. In this work, we used two representations: one for local control purposes and another for global planning. By using these different representations we were able to write algorithms focused on different parts of an overall task, and then combine these algorithms into a single framework. Other representations may allow for a different way of thinking about deformable object manipulation that may be better suited to other tasks.

In this dissertation we hand-designed the decisions of when to use each representation and the associated method for manipulating the deformable object. Extending this framework to a broader set of abilities and representations naturally leads to approaches such as task and motion planning techniques, which reason over discrete choices (for example, which representation or algorithm to use) along with continuous variables (for example, where to grasp). A great deal more work is needed in this area as we work towards integrating deformable object manipulation tools into a general robotics solution.

Some of the limitations imposed by our interleaving framework may be addressed by closing the loop on the execution of plans generated by RRT-EB. By monitoring the deformable object while a plan is executed we can quickly determine when the deformable object has diverged from the planned path, but it is an open question how to recover from such a divergence. Can we define a local controller that reasons about how closely our VEB model reduction represents the true configuration of the deformable object, and close the loop at this level? While this may be possible in some circumstances, in general we believe that something like our interleaving framework will always be needed in order to respond to unmodelled aspects of problems and continue with a given task.
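A minimal sketch of such a monitoring loop follows, with assumed names (execute_with_monitoring, d_max) and a deliberately crude point-wise divergence measure standing in for a metric over the VEB representation; detecting divergence is the easy half, and recovering from it remains the open question discussed above.

```python
# Illustrative sketch only (assumed names): execute a plan step by step and
# stop as soon as the observed object configuration diverges from the plan.
import numpy as np

def divergence(observed_points, planned_points):
    """Crude stand-in metric: mean distance between corresponding points."""
    diff = np.asarray(observed_points) - np.asarray(planned_points)
    return float(np.mean(np.linalg.norm(diff, axis=1)))

def execute_with_monitoring(plan, observe, execute_step, d_max):
    """Return True if the plan completes without exceeding divergence d_max.

    plan:         sequence of steps with .robot_motion and .expected_points
    observe:      callable returning the sensed object configuration
    execute_step: callable commanding one robot motion
    """
    for step in plan:
        execute_step(step.robot_motion)
        if divergence(observe(), step.expected_points) > d_max:
            return False  # diverged: the caller should replan or fall back
    return True
```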

One of the key differences between rigid object and deformable object manipulation is that deformable objects are pliable; we explicitly took advantage of this when designing our VEB model reduction and planner. That said, we have not taken advantage of that compliance as an explicit manipulation tool; the directional rigidity model in Section 3.3 can enable a controller to take advantage of interactions with the environment, but it does so without explicitly reasoning about them. It is not clear how to exploit such contact in a principled fashion: is including contact in the modelling process enough? Is it better to use contact as a way of creating funnels for chaining actions or controllers? Can we exploit contact in a way that helps reduce the scope of the problem that a task and motion planning framework needs to solve?

In this dissertation, we have presented a framework and series of algorithms, supported by successful simulations and lab robot performance, that answers two primary questions that motivated this research. First, our robot reduced the amount of time-consuming modelling and data collection necessary to perform tasks with rope and cloth by interleaving planning and control. By using different representations for each component, we are able to perform these tasks without high-fidelity modelling or simulation. Second, we integrated ideas from decision theory and machine learning into classical motion planning and control approaches in order to overcome the limitations of using low-fidelity modelling and approximation methods, by learning from the robot's experience gained while using these approximations to perform manipulation tasks.

However, some questions remain, arising from the inherent difference between rigid body manipulation and deformable object manipulation, and generating a number of worthwhile future research endeavors, as presented above. Loosely grouped as related to planning and online learning, they include topics such as learning to avoid repeating similar mistakes, evaluating and defining "similar", integrated task planning and motion planning, and how to achieve a modicum of intuition such that the robot is able to understand what success looks like when encountering new tasks or environments. Topics grouped around the idea of modelling include improved use of a wider variety of model types, real-world sensing of the deformable object and how it is represented, and how to take better advantage of the compliant nature of the object. Control-related questions include how to better incorporate interleaving, quicker discovery of divergence from the desired path, and of course the ongoing challenges of dexterous manipulation.

BIBLIOGRAPHY

[1] Y. Li, Z. Littlefield, and K. E. Bekris, "Asymptotically optimal sampling-based kinodynamic planning," International Journal of Robotics Research (IJRR), vol. 35, pp. 528–564, Apr. 2016.

[2] M. Brady, "Artificial intelligence and robotics," Artificial Intelligence, vol. 26, no. 1, pp. 79–121, 1985.

[3] Z. Wang, X. Li, D. Navarro-Alarcon, and Y.-h. Liu, "A Unified Controller for Region-reaching and Deforming of Soft Objects," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2018, pp. 472–478.

[4] S. H. Huang, J. Pan, G. Mulcaire, and P. Abbeel, "Leveraging appearance priors in non-rigid registration, with application to manipulation of deformable objects," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2015, pp. 878–885.

[5] N. Essahbi, B. C. Bouzgarrou, and G. Gogu, "Soft Material Modeling for Robotic Manipulation," in Applied Mechanics and Materials, vol. 162, Apr. 2012, pp. 184–193.

[6] Y. Bai, W. Yu, and C. K. Liu, "Dexterous Manipulation of Cloth," Computer Graphics Forum, vol. 35, no. 2, pp. 523–532, 2016.

[7] D. Berenson, "Manipulation of deformable objects without modeling and simulating deformation," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013, pp. 4525–4532.

[8] D. McConachie and D. Berenson, "Estimating model utility for deformable object manipulation using multiarmed bandit methods," IEEE Transactions on Automation Science and Engineering (T-ASE), vol. 15, no. 3, pp. 967–979, July 2018.

[9] F. Khalil and P. Payeur, "Dexterous robotic manipulation of deformable objects with multi-sensory feedback – a review," in Robot Manipulators, Trends and Development. InTech, 2010, ch. 28, pp. 587–621.

[10] P. Jiménez, "Survey on model-based manipulation planning of deformable objects," Robotics and Computer-Integrated Manufacturing, vol. 28, no. 2, pp. 154–163, Apr. 2012.

[11] J. Sanchez, J.-A. Corrales, B.-C. Bouzgarrou, and Y. Mezouar, "Robotic manipulation and sensing of deformable objects in domestic and industrial applications: a survey," International Journal of Robotics Research (IJRR), vol. 37, no. 7, pp. 688–716, 2018.

[12] M. Bergou, M. Wardetzky, S. Robinson, B. Audoly, and E. Grinspun, "Discrete elastic rods," ACM Transactions on Graphics (Proceedings of SIGGRAPH), vol. 27, no. 3, p. 1, Aug. 2008.

[13] W. Rungjiratananon, Y. Kanamori, N. Metaaphanon, Y. Bando, B.-Y. Chen, and T. Nishita, “Twisting, Tearing and Flicking Effects in String Animations,” in Motion in Games, ser. Lecture Notes in Computer Science, J. M. Allbeck and P. Faloutsos, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011, vol. 7060, pp. 192–203.

[14] D. Baraff and A. Witkin, “Large steps in cloth simulation,” in Proceedings of SIGGRAPH. ACM Press, Jul. 1998, pp. 43–54.

[15] R. Goldenthal, D. Harmon, R. Fattal, M. Bercovier, and E. Grinspun, “Efficient Simulation of Inextensible Cloth,” ACM Transactions on Graphics (Proceedings of SIGGRAPH), vol. 26, no. 3, 2007.

[16] S. F. F. Gibson and B. Mirtich, “A survey of deformable modeling in computer graphics,” Mitsubishi Electric Research Laboratories, Tech. Rep., 1997.

[17] B. Maris, D. Botturi, and P. Fiorini, “Trajectory planning with task constraints in densely filled environments,” in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Oct 2010, pp. 2333–2338.

[18] M. Müller, J. Dorsey, L. McMillan, R. Jagnow, and B. Cutler, "Stable real-time deformations," ACM Transactions on Graphics (Proceedings of SIGGRAPH), pp. 49–54, 2002.

[19] G. Irving, J. Teran, and R. Fedkiw, "Invertible finite elements for robust simulation of large deformation," ACM Transactions on Graphics (Proceedings of SIGGRAPH), pp. 131–140, 2004.

[20] P. Kaufmann, S. Martin, M. Botsch, and M. Gross, "Flexible simulation of deformable models using discontinuous Galerkin FEM," in ACM Transactions on Graphics (Proceedings of SIGGRAPH), 2008.

[21] A. Petit, V. Lippiello, G. A. Fontanelli, and B. Siciliano, "Tracking elastic deformable objects with an RGB-D sensor for a pizza chef robot," Robotics and Autonomous Systems, vol. 88, no. C, pp. 187–201, Feb. 2017.

[22] A. Borum, D. Matthews, and T. Bretl, "State estimation and tracking of deforming planar elastic rods," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), May 2014, pp. 4127–4132.

[23] T. Bretl and Z. McCarthy, "Quasi-static manipulation of a Kirchhoff elastic rod based on a geometric analysis of equilibrium configurations," International Journal of Robotics Research (IJRR), vol. 33, no. 1, pp. 48–68, 2014.

[24] J. Lang, D. Pai, and R. Woodham, “Acquisition of elastic models for interactive simulation,” International Journal of Robotics Research (IJRR), vol. 21, no. 8, pp. 713–733, 2002.

[25] A. M. Cretu, P. Payeur, and E. M. Petriu, "Neural network mapping and clustering of elastic behavior from tactile and range imaging for virtualized reality applications," IEEE Transactions on Instrumentation and Measurement, vol. 57, no. 9, pp. 1918–1928, Sept. 2008.

[26] D. Navarro-Alarcon, Y.-h. Liu, J. G. Romero, and P. Li, "On the visual deformation servoing of compliant objects: Uncalibrated control methods and experiments," International Journal of Robotics Research (IJRR), vol. 33, no. 11, pp. 1462–1480, Jun. 2014.

[27] D. Navarro-Alarcon and Y. Liu, “Fourier-based shape servoing: A new feedback method to actively deform soft objects into desired 2-d image contours,” IEEE Transactions on Robotics (T-RO), vol. 34, no. 1, pp. 272–279, Feb 2018.

[28] Z. Hu, P. Sun, and J. Pan, "Three-dimensional deformable object manipulation using fast online Gaussian process regression," IEEE Robotics and Automation Letters (RA-L), vol. 3, no. 2, pp. 979–986, April 2018.

[29] S. Hirai and T. Wada, “Indirect simultaneous positioning of deformable objects with multi-pinching fingers based on an uncertain model,” Robotica, vol. 18, no. 1, pp. 3–11, Jan. 2000.

[30] T. Wada, S. Hirai, S. Kawamura, and N. Kamiji, "Robust manipulation of deformable objects by a simple PID feedback," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2001, pp. 85–90.

[31] J. Smolen and A. Patriciu, "Deformation Planning for Robotic Soft Tissue Manipulation," in 2009 Second International Conferences on Advances in Computer-Human Interactions, Feb. 2009, pp. 199–204.

[32] D. Navarro-Alarcon, H. M. Yip, Z. Wang, Y. Liu, F. Zhong, T. Zhang, and P. Li, “Automatic 3-d manipulation of soft objects by robotic arms with an adaptive deformation model,” IEEE Transactions on Robotics (T-RO), vol. 32, no. 2, pp. 429–441, April 2016.

[33] O. Maron and A. W. Moore, “Hoeffding Races: Accelerating Model Selection Search for Classification and Function Approximation,” in Proceedings of the Conference on Neural Information Processing Systems (NeurIPS), 1994.

[34] E. R. Sparks, A. Talwalkar, D. Haas, M. J. Franklin, M. I. Jordan, and T. Kraska, "Automating model search for large scale machine learning," in Proceedings of the ACM Symposium on Cloud Computing (SoCC), 2015.

[35] P. Whittle, "Restless Bandits: Activity Allocation in a Changing World," Journal of Applied Probability, vol. 25, p. 287, 1988.

[36] P. Auer, N. Cesa-Bianchi, and P. Fischer, "Finite-time Analysis of the Multi-armed Bandit Problem," Machine Learning, vol. 47, no. 2/3, pp. 235–256, 2002.

[37] J. Gittins, K. Glazebrook, and R. Weber, Multi-armed Bandit Allocation Indices. John Wiley & Sons, 2011.

[38] S. Agrawal and N. Goyal, “Analysis of Thompson Sampling for the multi-armed bandit problem,” Conference on Learning Theory, 2012.

[39] J. Langford and T. Zhang, "The Epoch-Greedy Algorithm for Multi-armed Bandits with Side Information," in Proceedings of the Conference on Neural Information Processing Systems (NeurIPS), 2008.

[40] A. Slivkins, "Contextual bandits with similarity information," Journal of Machine Learning Research (JMLR), vol. 15, no. 1, pp. 2533–2568, Jan. 2014.

[41] O.-C. Granmo and S. Berg, "Solving Non-Stationary Bandit Problems by Random Sampling from Sibling Kalman Filters," in Trends in Applied Intelligent Systems. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010, pp. 199–208.

[42] S. Pandey, D. Chakrabarti, and D. Agarwal, “Multi-armed bandit problems with dependent arms,” in International Conference on Machine Learning (ICML). New York, NY, USA: ACM, 2007, pp. 721–728.

[43] M. Saha, P. Isto, and J.-C. Latombe, “Motion planning for robotic manipulation of deformable linear objects,” in Proceedings of the International Symposium on Experimental Robotics (ISER). Berlin, Heidelberg: Springer Berlin Heidelberg, 2008, pp. 23–32.

[44] L. E. Kavraki, P. Svestka, J. C. Latombe, and M. H. Overmars, “Probabilistic roadmaps for path planning in high-dimensional configuration spaces,” IEEE Transactions on Robotics and Automation, vol. 12, no. 4, pp. 566–580, 1996.

[45] S. Rodriguez and N. Amato, “An obstacle-based rapidly-exploring random tree,” in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2006, pp. 895–900.

[46] S. M. LaValle, Planning Algorithms. Cambridge, U.K.: Cambridge University Press, 2006.

[47] B. Frank, C. Stachniss, N. Abdo, and W. Burgard, "Efficient motion planning for manipulation robots in environments with deformable objects," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Sep. 2011, pp. 2180–2185.

[48] E. Anshelevich, S. Owens, F. Lamiraux, and L. Kavraki, “Deformable volumes in path planning applications,” in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2000, pp. 2290–2295.

[49] F. Lamiraux and L. E. Kavraki, “Planning Paths for Elastic Objects under Manipulation Constraints,” International Journal of Robotics Research (IJRR), vol. 20, no. 3, pp. 188–208, Mar. 2001.

[50] O. B. Bayazit, J.-M. Lien, and N. Amato, "Probabilistic roadmap motion planning for deformable objects," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), vol. 2, 2002, pp. 2126–2133.

[51] R. Gayle, M. Lin, and D. Manocha, “Constraint-Based Motion Planning of Deformable Robots,” in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2005, pp. 1046–1053.

[52] M. Moll and L. E. Kavraki, “Path Planning for Deformable Linear Objects,” IEEE Transactions on Robotics (T-RO), vol. 22, no. 4, pp. 625–636, 2006.

[53] O. Roussel, A. Borum, M. Taïx, and T. Bretl, "Manipulation planning with contacts for an extensible elastic rod by sampling on the submanifold of static equilibrium configurations," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), May 2015, pp. 3116–3121.

[54] L. Jaillet and T. Siméon, "Path deformation roadmaps: Compact graphs with useful cycles for motion planning," International Journal of Robotics Research (IJRR), vol. 27, no. 11–12, pp. 1175–1188, 2008.

[55] S. Bhattacharya, M. Likhachev, and V. Kumar, "Topological constraints in search-based robot path planning," Autonomous Robots, vol. 33, no. 3, pp. 273–290, Jun. 2012.

[56] P. Brass, I. Vigan, and N. Xu, “Shortest path planning for a tethered robot,” Computational Geometry, vol. 48, no. 9, pp. 732–742, 2015.

[57] S. Kim and M. Likhachev, "Path planning for a tethered robot using Multi-Heuristic A* with topology-based heuristics," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Sep. 2015, pp. 4656–4663.

[58] S. LaValle and J. Kuffner, “Randomized kinodynamic planning,” International Journal of Robotics Research (IJRR), vol. 20, May 2001.

[59] S. Karaman and E. Frazzoli, "Sampling-based optimal motion planning for non-holonomic dynamical systems," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), May 2013, pp. 5041–5047.

[60] T. Kunz and M. Stilman, "Kinodynamic RRTs with Fixed Time Step and Best-Input Extension Are Not Probabilistically Complete," in Algorithmic Foundations of Robotics XI. Springer, Cham, 2015, pp. 233–244.

[61] D. Park, A. Kapusta, J. Hawke, and C. C. Kemp, "Interleaving planning and control for efficient haptically-guided reaching in unknown environments," Proceedings of the IEEE/RAS International Conference on Humanoid Robots (Humanoids), pp. 809–816, 2014.

[62] J. Schulman, J. Ho, C. Lee, and P. Abbeel, “Learning from demonstrations through the use of non-rigid registration,” in Springer Tracts in Advanced Robotics, vol. 114. Springer International Publishing, 2016, pp. 339–354.

[63] C. Finn and S. Levine, “Deep visual foresight for planning robot motion,” in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2017.

[64] E. Banijamali, R. Shu, M. Ghavamzadeh, H. Bui, and A. Ghodsi, "Robust Locally-Linear Controllable Embedding," Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), Oct. 2017.

[65] B. Jia, Z. Hu, J. Pan, and D. Manocha, "Manipulating highly deformable materials using a visual feedback dictionary," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2018.

[66] M. Zhang, S. Vikram, L. Smith, P. Abbeel, M. J. Johnson, and S. Levine, "SOLAR: Deep Structured Representations for Model-Based Reinforcement Learning," International Conference on Machine Learning (ICML), pp. 7444–7453, 2019.

[67] G. Sutanto, N. Ratliff, B. Sundaralingam, Y. Chebotar, Z. Su, A. Handa, and D. Fox, “Learning Latent Space Dynamics for Tactile Servoing,” Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2019.

[68] B. Ichter and M. Pavone, “Robot motion planning in learned latent spaces,” IEEE Robotics and Automation Letters (RA-L), vol. 4, no. 3, pp. 2407–2414, 2019.

[69] A. Wang, T. Kurutach, K. Liu, P. Abbeel, and A. Tamar, “Learning robotic manipulation through visual planning and acting,” in Proceedings of Robotics: Science and Systems (RSS), 2019.

[70] R. M. Murray, Z. Li, and S. S. Sastry, A Mathematical Introduction to Robotic Manipulation. CRC Press, 1994, vol. 29.

[71] E. Coumans, "Bullet physics library," Open source: bulletphysics.org, 2010.

[72] E. Olson, "AprilTag: A robust and flexible visual fiducial system," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), May 2011, pp. 3400–3407.

[73] T. Wiedemeyer, "IAI Kinect2," https://github.com/code-iai/iai_kinect2, Institute for Artificial Intelligence, University Bremen, 2014–2015, accessed July 1, 2018.

[74] Gurobi, "Gurobi optimization library," Proprietary: gurobi.com, 2016.

[75] M. C. Koval, J. E. King, N. S. Pollard, and S. S. Srinivasa, "Robust trajectory selection for rearrangement planning as a multi-armed bandit problem," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2015.

[76] C. G. Broyden, "A class of methods for solving nonlinear simultaneous equations," Mathematics of Computation, vol. 19, no. 92, pp. 577–593, 1965.

[77] S. Quinlan, "Real-time modification of collision-free paths," Ph.D. dissertation, Department of Computer Science, Stanford University, 1994.

[78] D. Hsu, J. Latombe, and R. Motwani, "Path planning in expansive configuration spaces," International Journal of Computational Geometry & Applications, vol. 9, no. 4–5, pp. 495–512, Aug. 1999.

[79] H. Alt and M. Godau, "Computing the Fréchet distance between two polygonal curves," International Journal of Computational Geometry & Applications, vol. 5, no. 1–2, pp. 75–91, Mar. 1995.

[80] Robot Learning Lab, "Simulation environment with Bullet physics," https://github.com/rll/bulletsim, University of California, Berkeley, 2012, accessed July 2, 2012.

[81] N. A. Wedge and M. S. Branicky, "On heavy-tailed runtimes and restarts in rapidly-exploring random trees," in Proceedings of the National Conference on Artificial Intelligence (AAAI), 2008, pp. 127–133.

[82] E. Todorov, T. Erez, and Y. Tassa, "MuJoCo: A physics engine for model-based control," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2012, pp. 5026–5033.

[83] J. J. Kuffner and S. M. LaValle, "RRT-Connect: An efficient approach to single-query path planning," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2000, pp. 995–1001.

[84] A. L. Maas, A. Y. Hannun, and A. Y. Ng, "Rectifier nonlinearities improve neural network acoustic models," in International Conference on Machine Learning (ICML), 2013.

[85] I. A. Şucan, M. Moll, and L. E. Kavraki, "The Open Motion Planning Library," IEEE Robotics and Automation Magazine, vol. 19, no. 4, pp. 72–82, December 2012, http://ompl.kavrakilab.org.

[86] D. Maturana and S. Scherer, "VoxNet: A 3D convolutional neural network for real-time object recognition," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2015, pp. 922–928.

[87] J. Tumber, "Engine assembly," https://grabcad.com/library/engine-assembly-11, 2016, accessed August 28, 2019.

[88] D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," in Proceedings of the International Conference on Learning Representations (ICLR), 2015.

[89] L. Jaillet, J. Cortés, and T. Siméon, "Transition-based RRT for path planning in continuous cost spaces," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2008, pp. 2145–2150.

[90] A. Zeng, S. Song, J. Lee, A. Rodriguez, and T. Funkhouser, "TossingBot: Learning to throw arbitrary objects with residual physics," in Proceedings of Robotics: Science and Systems (RSS), 2019.

[91] C. Phillips-Grafflin and D. Berenson, “Planning and resilient execution of policies for manipulation in contact with actuation uncertainty,” in Proceedings of the International Workshop on the Algorithmic Foundations of Robotics (WAFR), 2016.
