Reactive Manipulation with Contact Models and Tactile Feedback by Francois R. Hogan Submitted to the Department of Mechanical Engineering in partial fulfillment of the requirements for the degree of

Doctor of Philosophy in Mechanical Engineering at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY February 2020 © Massachusetts Institute of Technology 2020. All rights reserved.

Author: Department of Mechanical Engineering, December 16, 2019

Certified by: Alberto Rodriguez, Associate Professor, Thesis Supervisor

Accepted by: Nicolas Hadjiconstantinou, Department Graduate Officer

Reactive Manipulation with Contact Models and Tactile Feedback by Francois R. Hogan

Submitted to the Department of Mechanical Engineering on December 16, 2019, in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Mechanical Engineering

Abstract

This thesis focuses on closing the loop in robotic manipulation, moving towards robots that can better perceive their environment and react to unforeseen situations. Humans effectively process and react to information from visual and tactile sensing; robots, however, often remain programmed in an open-loop fashion and struggle to correct their motion based on detected errors.

We begin our work by developing full-state feedback controllers for dynamical systems involving frictional contact interactions. Hybridness and underactuation are key characteristics of these systems that complicate the design of feedback controllers. We design and experimentally validate the controllers on a planar manipulation system where the purpose is to control the motion of a sliding object on a flat surface using a point robotic pusher. The pusher-slider is a simple dynamical system that retains many of the challenges that are typical of robotic manipulation tasks.

We extend this work to partially observable systems by developing closed-loop tactile controllers for dexterous manipulation with dual-arm robotic palms. We introduce Tactile Dexterity, an approach to dexterous manipulation that plans for robot/object interactions that render interpretable tactile information for control. Key to this formulation is the decomposition of manipulation plans into sequences of manipulation primitives with simple mechanics and efficient planners.

Thesis Supervisor: Alberto Rodriguez Title: Associate Professor

Acknowledgments

First and foremost, I thank my advisor Alberto Rodriguez. His guidance, along with the freedom to pursue my own ideas, has been invaluable and key to helping me become an independent researcher. In the years to come, I will always keep Alberto as a role model to draw inspiration from, in particular his care for his students, his attention to detail, and his kindness.

I thank James Forbes, my advisor during my time as a Master's graduate student at McGill University, who is the reason I pursue academic research. My time under his guidance has taught me invaluable lessons: the importance of academic rigor, the process of scientific writing, and the value of creating knowledge.

I am grateful to my colleagues from the MCube lab for creating a stimulating environment that was enriching both personally and academically. I thank Peter Yu for sharing his passion for robotic systems and for patiently answering the questions of a motivated new robotics researcher. I thank Nikhil Chavan-Dafle for the many insightful discussions on the subtleties of the mechanics of manipulation. I am grateful to Nima Fazeli for his academic and career advice as well as for being a valued friend during my time at MIT. I thank Maria Bauza, whose drive and work ethic have been conducive to an exciting research collaboration. Finally, I thank my collaborators, Jose Ballester, Siyuan Dong, Oleguer Canal, Eudald Romo Grau, and the members of the MCube lab, who have been instrumental in many of the results presented in this thesis.

Contents

1 Introduction
1.1 Contributions
1.2 Outline

2 Related Work
2.1 Nonprehensile Manipulation
2.2 Contact-Constrained Motion Planning
2.3 Hybrid Controller Design
2.4 Grasp Planning
2.5 Tactile Sensing

3 Feedback Control with Contact Models
3.1 Contribution
3.2 Introduction
3.3 Challenges
3.3.1 Hybridness
3.3.2 Underactuation
3.4 Planar Manipulation
3.4.1 Nomenclature
3.4.2 Modeling
3.4.3 Frictional Contact Constraints
3.4.4 Linearization
3.5 Controller Design
3.5.1 Hybrid Model Predictive Control
3.5.2 Mixed-Integer Quadratic Program (MPC-MIQP)
3.5.3 Family of Modes (MPC-FOM)
3.5.4 Learned Mode Scheduling (MPC-LMS)
3.6 Numerical Results
3.6.1 Straight-Line Tracking Simulation
3.6.2 Sensitivity to initial state errors
3.6.3 Sensitivity to contact mode errors
3.7 Experimental Results
3.7.1 Case Study A: Single Point Pushing
3.7.2 Case Study B: Pushing with Line Contact
3.8 Influence of Control Parameters
3.8.1 Controller Frequency
3.8.2 Tracking Velocity
3.8.3 Planning horizon
3.8.4 Coefficient of Friction
3.8.5 Race Track Radius of Curvature
3.9 Discussion

4 Tactile Dexterity
4.1 Contribution
4.2 Introduction
4.3 Approach
4.4 Manipulation Primitives
4.4.1 Grasp
4.4.2 Push
4.4.3 Pull
4.4.4 Pivot
4.5 Mechanics
4.6 Tactile Control
4.6.1 Contact State Control
4.6.2 Object State Control
4.7 Planning
4.7.1 Low-Level Trajectory Planning
4.7.2 High-Level Planning
4.8 Results
4.9 Discussion

5 Conclusion
5.1 Key Findings
5.2 Limitations
5.3 Future Work
5.4 Closing Thoughts
5.4.1 What is the right tradeoff between model complexity and computational efficiency?
5.4.2 Combining Model-Based and Data-Driven Approaches?
5.4.3 Towards Achieving General Manipulation Capabilities

A Data-Driven Control of Planar Manipulation

List of Figures

3-1 The hand manipulates a light bulb into its socket. The dynamics of the light bulb are dictated by rigid body motions under applied external forces arising from contact interactions with the environment.

3-2 Depiction of hybridness. Animation of a simple manipulation task that exploits multiple contact modes. First, the hand sticks to the book and drags it backwards exploiting friction. Second, thumb and fingers slide to perform a regrasp maneuver. Finally, the book is retrieved from the shelf using a stable grasp.

3-3 Depiction of underactuation. When we interact with an object through contact, we can only transfer a limited set of forces. When pushing a coffee mug with a finger, the finger can only push on the object and cannot pull. Underactuation constrains the possible motions of the cup that can be impressed by the finger.

3-4 Planar manipulation setup. The goal is to control the motion of the object on a flat surface using a velocity-controlled robotic pusher. The pose of the object is tracked using a Vicon camera system.

3-5 Free body diagram of a sliding object with C = 2 contact points.

3-6 Depiction of the limit surface. The limit surface (ellipsoid in the figure) describes the set of forces and moments that can be transmitted by a patch contact. By the principle of maximal dissipation, the object twist is perpendicular to the limit surface of the applied frictional wrench w.

3-7 Friction cone constraint. The applied force must remain within the blue shaded region.

3-8 Mode dependent constraints following Coulomb's frictional interaction law. (Force constraint).

3-9 Tree of optimization programs for an MPC program with N prediction steps. The tree scales exponentially due to contact hybridness.

3-10 Hybrid MPC framework. A sequence of control inputs is computed that will drive the predicted states to the reference trajectory while simultaneously finding the schedule of optimal hybrid mode transitions m = {m_1, ..., m_M}. The control input io + u* is applied to the system.

3-11 Block diagram of the hybrid controller design proposed in Eq. (3.22). The resulting MPC controller design involves solving a non-convex mixed-integer quadratic program. Note that the integer variables are internal to the controller, necessary to represent the hybrid dynamics, and serve as an aid to find the optimal control sequence.

3-12 Example of optimal mode schedule for the pusher-slider system converging to a straight horizontal trajectory.

3-13 Block diagram of the hybrid controller design with Family of Modes Scheduling (FOM). The resulting MPC controller design contains K convex quadratic programs that can be solved in parallel.

3-14 Block diagram of the hybrid controller design with learned mode schedule classifier. The resulting MPC controller design is a convex quadratic program.

3-15 Supervised learning framework for mode schedule selection. A dataset of E labelled datapoints is generated by solving Optimization Problem 1. From the training examples, a classifier is trained to return the mode schedule based on the state error vector.

3-16 Closed-loop straight line tracking of a pushed square object. The MPC-LMS controller performance is compared to four benchmarks: the optimal baseline MPC-MIQP, Dubin's Car, MPC with sticking contacts, and LQR with frictionless contacts. Both the MPC-LMS and optimal baseline MPC-MIQP are able to track the trajectory. The frictionless controller, which neglects friction, is unable to slide to the proper contact location due to unexpected friction. The Dubin's Car controller is able to track the nominal configuration; however, its limited control authority leads to a significantly slower convergence to the nominal trajectory.

3-17 Error dynamics response from perturbed initial conditions associated with the MPC-LMS controller shown in Fig. 3.6.1. The perturbed initial conditions are shown to converge towards zero after approximately 2 s.

3-18 Control effort response from perturbed initial conditions associated with the MPC-LMS controller shown in Fig. 3.6.1. The control input space u = [f_n f_t ṗ_y]^T is mapped to the robot end-effector velocity v_c using Eq. (3.15). Although the commanded velocities are discontinuous at contact switches, the resultant commanded robot positions x and y are smooth.

3-19 Trajectory tracking error mean and variance taken over 100 simulations with random initial states. The error is computed as the Euclidean distance between the observed center of mass position and its desired location. In general, the sticking and frictionless controllers cannot control the system.

3-20 MPC cost vs. classification mistakes. Random classification errors are introduced on all contact modes to the MPC-MIQP solution to understand the sensitivity of the controller to classification mistakes.

3-21 Tracking error vs. classification mistake at a given time step as defined in Fig. 3-10. Random classification errors are introduced (100% error) on individual contact modes to the MPC-MIQP solution to understand the sensitivity of the controller. Results show that mistakes committed at the beginning of the control horizon have the largest impact on controller performance.

3-22 Experimental setup for point pusher.

3-23 Experimental tracking of the 8 track with the MPC-LMS controller for the robotic point pusher.

3-24 Experimental setup for line pusher.

3-25 Experimental tracking of the 8 track with the LMS controller for the robotic point pusher.

3-26 Experimental analysis of the performance as a function of the a) controller frequency, b) tracking velocity, c) planning horizon, d) error in coefficient of friction, and e) radius of curvature of the track.

4-1 A dual arm robot manipulates an object from an initial pose to a target configuration. The manipulation task combines 3 primitive actions to achieve the task: pull the object to the middle of the table, pivot the object, and push it to its target location.

4-2 Planning to pivot a box. Planning to pivot a box involves several levels of complexity. The robot arms move the end-effectors (palms), which in turn generate the trajectory of contact points that ultimately move the box. Tactile sensors allow us to observe and control these contact points. In this chapter, we propose an approach to dexterous manipulation that structures the planning problem so that when unexpected behavior of the contact points is observed with the tactile sensors, the robot can replan the motion of the contact points, palms, and arms in real-time.

4-3 Manipulation primitives. a) Grasp: the robotic palms align as a parallel jaw gripper to grasp an object. b) Push: a single robotic palm contacts the object laterally to manipulate it within the plane. c) Pivot: the object is rotated about a point on the table by both palms. d) Pull: the robotic palm presses vertically down on the object to slide it within the support plane.

4-4 Contact state control. How should the robot react to undesirable slippage? We design a model-based tactile controller that determines locally optimal robot adjustments to recover from contact state deviations.

4-5 The stability margin 0 quantifies how close a contact is to the slipping boundary. The goal of the contact state controller is to maximize/minimize the stability margin to encourage/discourage slippage.

4-6 Tactile object localization. By localizing descriptive object tactile features (lines, points), we update our estimate of its pose used by the Object State Controller.

4-7 We decouple the search into finding 1) a sequence of stable placements from the object's initial to final pose and 2) the sequence of manipulation primitives to achieve the desired pose. This separation of the problem into two subgoals allows us to search through a reduced state space, permitting faster planning.

4-8 Closed-loop evaluation of the tactile controller. We consider the task of maintaining the object in a stationary pose under external perturbations. When perturbed, the contact state controller increases the normal force on the object to prevent slippage while the object state controller replans robot motions to bring the object back to its nominal pose.

A-1 Planar pushing system with world frame F and body frame b. We denote the length of the square as a.

A-2 Training data (dots) and learned model for the object's change in orientation, Δθ_b (rad). From left to right the number of datapoints is 10, 100 and 1000. We observe that the model complexity and accuracy increase with the number of training data.

A-3 Desired trajectory (black) compared with the motion followed by the object's geometric center (blue) using the analytical controller at 80 and 50 mm/s respectively. The average error for the 8-track example is 9.00 mm and 6.05 mm for the square.

List of Tables

3.1 Accuracy results for the straight line point pushing numerical results. The neural network predictions are evaluated on a validation set of 50K labelled data points. We evaluate the performance on each mode separately, as defined in Fig. 3-10.

3.2 Neural network parameters.

3.3 Experimental system parameters.

3.4 Accuracy results for the point pushing experimental results on an 8 track trajectory. The neural network predictions are evaluated on a validation set of 50K labelled data points. We evaluate the performance on each mode separately, as defined in Fig. 3-10.

3.5 Accuracy results for the line pushing experimental results on an 8 track trajectory. The neural network predictions are evaluated on a validation set of 50K labelled data points. We evaluate the performance on each mode separately, as defined in Fig. 3-10.

A.1 Controller performance comparison

Chapter 1

Introduction

Robotic automation is at a turning point in its history, promising to change the landscape of warehouse automation, flexible manufacturing, recycling, and home assistance. The successful integration of robots in factories and homes hinges on their ability to manipulate their environment productively. Understanding how to control physical interactions is key to a new generation of dexterous robots.

The primary research objective of this thesis is to enable reactive robotic manipulation. We explore the role of model-based feedback control for robotic tasks with rich contact interactions. The ability to perceive the environment and react to unanticipated events is of paramount importance for deploying robots that interact with the physical world, which is difficult to observe and predict. Whereas humans effectively process and react to information from tactile and vision sensing, robot manipulation systems often remain programmed in an open-loop fashion, incapable of correcting their motion based on detected errors in the motion of the object. We believe that the inability of robots to react to mistakes can largely be attributed to their ineffectiveness at using sensor information for real-time control purposes. To purposefully manipulate their environment, robots must be skilled at using sensed information to i) monitor the execution of a task, and ii) correct detected errors in real-time.

With the recent development of sensing equipment (tactile sensors, stereo cameras, proximity sensors, force/torque sensors, etc.), the question remains: how should robots use sensed information? This thesis develops reactive manipulation policies that rely on visual and tactile feedback. We are particularly interested in contact-rich robotic manipulation tasks where the dynamics are dominated by frictional interactions. Such tasks remain challenging for both model-based and learning-based control approaches. Classical physics-based control methods struggle to control such systems because their equations of motion, which exhibit hybridness and underactuation, are not amenable to standard tools. While model-free reinforcement learning approaches are general and do not require a description of the motion model, they typically rely on large quantities of data that make their generalization to more complex tasks challenging. Whereas there is a rich history of simulating multi-body contact interactions, the structure of these models has made it challenging to apply classical tools from control theory. Due to the nature of contact, the dynamics describing the interaction between two contacting bodies are fundamentally non-smooth and governed by non-convex constraints.

This thesis explores the development of feedback control strategies for physical tasks driven by contact interactions. Enabling this new tool will unlock an important array of interesting manipulation applications, from dynamic object manipulation to multi-contact assembly of parts. Most importantly, it will bring a fundamental shift in the way robots interact with objects, from open-loop to closed-loop execution.

1.1 Contributions

The purpose of this thesis is to develop feedback control architectures that will enable reactive robotic manipulation.

A major contribution of this research is the development of algorithms for real-time control of systems undergoing frictional contact interactions. We develop physics-based controllers for nonprehensile manipulation tasks involving rich frictional contact dynamics. A key challenge of such tasks is that the dynamical system undergoes discontinuous switches between dynamical regimes. In this thesis, we develop controllers that can reason in real-time across contact modalities. We develop and experimentally validate three real-time algorithms: Model Predictive Control with Family of Modes (MPC-FOM), and Model Predictive Control with Learned Mode Schedules (MPC-LMS). These controllers are developed under the assumption of full-state feedback and are validated on a planar manipulation task where the pose of the object is tracked using a motion capture system.

Another important contribution of this research is the development of reactive manipulation policies that exploit tactile feedback. This thesis gives robots tactile reflexes, i.e., the ability to correct their behavior based on tactile feedback. We present the framework of tactile dexterity, an approach to dexterous manipulation that plans for robot/object interactions that render interpretable tactile information for control. We divide the role of tactile control into two goals: 1) control the contact state between the end-effector and the object (contact/no-contact, stick/slip, forces) and 2) control the object state by tracking the object with a tactile-based state estimator. We introduce a dual-arm robotic manipulation platform equipped with tactile sensorized palms. We consider the application of manipulating an object from an initial pose to a target pose while being robust against external perturbations and uncertainty in object pose. We define a library of robust manipulation primitives, for which we develop mechanics models and motion planners. Finally, we develop a high-level planner with the ability to sequence a set of manipulation primitives to reconfigure an object from an initial pose to a target pose.

1.2 Outline

This thesis is separated into two main chapters and draws from the author's prior published work.

In Chapter 3, we develop hybrid controller designs for nonprehensile manipulation tasks assuming full-state visual feedback. In nonprehensile tasks, the dynamics of contact interactions between object and gripper dictate the object motion. We design and test novel feedback controllers for planar manipulation tasks. Planar manipulation is an interesting benchmark problem as the motion of the object is fully driven by frictional interactions. Hybridness and underactuation are key characteristics of these systems that complicate the design of feedback controllers. We show that a model predictive control approach used in tandem with integer programming offers a powerful solution to control hybrid dynamical systems with frictional contacts. Chapter 3 reviews and extends versions of manuscripts submitted to the International Journal of Robotics Research [36] and published in the Workshop on the Algorithmic Foundations of Robotics [37], the International Conference on Robotics and Automation [38], and the Conference on Robot Learning [5].

In Chapter 4, we introduce tactile dexterity, an approach to dexterous manipulation that plans for robot/object interactions that render interpretable tactile information for control. Key to this formulation is the decomposition of manipulation plans into sequences of manipulation primitives with simple mechanics and efficient planners. A version of Chapter 4 has been submitted to the International Conference on Robotics and Automation (ICRA) [27].

Chapter 5 concludes the thesis with a discussion of the limitations of the thesis contributions and the potential for new research directions.

Chapter 2

Related Work

Previous robotics research relevant to our work on reactive manipulation falls into five broad categories: nonprehensile manipulation, contact-constrained motion planning, hybrid controller design, grasp planning, and tactile sensing. We now review some of the earlier work from these fields.

2.1 Nonprehensile Manipulation

The problem of planar pushing has a rich literature due to its theoretical and practical importance as one of the simplest nonprehensile manipulation problems. Since the seminal work by Mason [67], there has been a wealth of research on its modeling, planning, and control. Early work on pushing focused on developing models of planar pushing interactions from first principles. Because the pressure distribution between an object and its support surface is indeterminate, [67] introduces the voting theorem, which resolves the direction of rotation of an object under an external pushing action without explicit knowledge of the pressure distribution. Following Mason's work, several researchers have proposed practical models, most notably [32], which introduces the concept of the limit surface, and [61], which uses it to model the dynamics of planar pushing. Under the assumption of quasi-static interactions, the limit surface has been successfully used in simulation by [68], planning by [63], state estimation by [105], and feedback control by [62, 37, 101]. Due to the high computational costs associated with building the limit surface, [56] propose an ellipsoidal approximation that yields invertible models from force to motion. [110] exploit the convex properties of the limit surface to develop an efficient data-driven algorithm for its construction from contact interactions. [108] develops a Linear Complementarity Formulation that describes the planar motion of objects subject to robotic pushes.
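
To make the idea of an invertible force-to-motion model concrete, the following is a minimal sketch of an ellipsoidal limit surface. It is an illustration under stated assumptions, not the formulation of [56]: the function names, the diagonal form of the matrix A, and the numerical values of f_max and m_max are hypothetical. The sketch uses the property that the object twist is normal to the limit surface at the applied wrench, so an ellipsoid w^T A w = 1 gives a twist direction proportional to A w, and the map can be inverted in closed form.

```python
import numpy as np

def ellipsoid_matrix(f_max, m_max):
    """Matrix A such that the limit surface is {w : w^T A w = 1}, w = [fx, fy, m].

    f_max and m_max (maximum frictional force and moment) are assumed values.
    """
    return np.diag([1.0 / f_max**2, 1.0 / f_max**2, 1.0 / m_max**2])

def twist_direction(wrench, A):
    """Unit twist direction [vx, vy, omega]: the surface normal at the wrench."""
    grad = A @ wrench  # gradient of w^T A w, up to a factor of 2
    return grad / np.linalg.norm(grad)

def wrench_from_twist(twist, A):
    """Inverse map: the wrench on the ellipsoid whose normal is along the twist."""
    A_inv = np.linalg.inv(A)
    return A_inv @ twist / np.sqrt(twist @ A_inv @ twist)

# Example with hypothetical parameters: recover the frictional wrench for a twist.
A = ellipsoid_matrix(f_max=2.0, m_max=0.1)
twist = np.array([0.05, 0.0, 0.2])
wrench = wrench_from_twist(twist, A)
print("wrench on limit surface:", wrench)
print("recovered twist direction:", twist_direction(wrench, A))
```

Because the ellipsoid is described by a single positive-definite matrix, both directions of the map reduce to a matrix-vector product and a normalization, which is what makes this approximation convenient for planning and control.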

In recent years, researchers have turned to data-driven techniques to improve the accuracy of planar pushing models [84, 99, 54, 71, 110, 111]. [110] present a physics-inspired data-driven model for systems with planar contacts. The algorithm approximates the limit surface as the level set of a convex polynomial. In [47], a modular structure is used to learn a forward predictive model of pushing from data. These models are shown to have the ability to transfer to novel objects and to extrapolate to novel actions. In [71], an experience-based object motion simulator is built with the ability to predict the motion of complex 3D structures pushed by a robot. This model is used to plan for robot motions that arrange a set of chairs around a table. [51] learns an inverse model of robot-object interactions to push unknown objects to a fixed target.

[63] and [109] introduce trajectory planning algorithms to find open-loop pushing trajectories achieving a target object pose. Key to the success of these executions are the assumptions that sticking interactions are maintained for the entirety of the push and that the object remains unperturbed during the execution. [22], [23], and [50] present planning frameworks to grasp objects in cluttered environments that leverage pushing actions to address uncertainty in the pose of objects. [62] implements a tactile-based feedback controller for a point pusher-object system to maintain the heading of an object. This PD-based controller can stably control the orientation of a pushed object but cannot control its position. [35] connects object affordances to associated robot behavior to achieve a desired change in object state, by designing independent centroidal and rotational alignment controllers. These controllers can reason about general reconfigurations of an object, but do not explicitly reason about the state or dynamics of the object.

The seminal work on planar manipulation by [68] has inspired the concept of sensorless manipulation [65], where an object can be controlled without any sensory feedback. This line of work has been extended to a large class of manipulation problems, including prehensile pushing [16], tumbling [85], pivoting [46, 40, 41], throwing and catching [41], and dynamic in-hand sliding [87].

2.2 Contact-Constrained Motion Planning

There are ongoing efforts to develop motion planning frameworks that can effectively handle the complexity associated with frictional contact interactions. Over the past decade, this topic has been a focus of the robotics locomotion and manipulation communities.

In the robotic locomotion community, a common approach consists of formulating the search for gaits as a nonlinear optimization program. [75] determines the gait trajectories and impact times under a prespecified contact sequence. [86] and [95] autonomously compute the gait contact sequences using mixed-integer nonlinear programming, where integer variables are used to encode the active contact modes during the trajectory. [79] employ a Linear Complementarity Problem formulation to encode the hybrid nature of contact interactions by including contact forces as decision variables within the program. This method has been shown to be effective for path planning of high degree of freedom systems undergoing contact-rich interactions.

In the robotic manipulation community, [17] use a sampling-based algorithm to plan robot motions for in-hand manipulation tasks that exploit both sticking and sliding interactions with the environment. This approach relies on open-loop stable executions and on an accurate description of contact interactions. [101] and [41] present a graph search algorithm to plan through a sequence of manipulation primitives describing different contact states to achieve a manipulation task. More recently, [94] formulate a task and motion planning framework that can handle complex interactions by formulating the search for contact sequences as a nonlinear mixed-integer optimization program.

Despite the variety of motion planning frameworks available for multi-contact dynamic interactions, these approaches have large computational requirements associated with solving nonlinear and non-convex optimization programs that make them unsuitable for online replanning. A key challenge that remains open in the community is to find control strategies that can replan mode sequences at real-time rates.

In a similar fashion to [86] and [95], this thesis formulates the optimal control solution to the hybrid control problem of planar manipulation as a Mixed-Integer Program. We propose online approximations to the optimal solution based on learning a map from object states to predicted future mode transitions. This approximation leads to a real-time controller with the ability to reason across contact modalities, including sticking and sliding.

2.3 Hybrid Controller Design

A common strategy to deal with hybrid dynamics has been to design feedback controllers that rely on a fixed mode schedule set to follow a nominal plan computed offline. [101] and [53] use a linear-quadratic-regulator (LQR) control architecture to stabilize the nominal trajectory subject to a mode sequence searched offline. [75] employ a feedback linearization approach to track the planned trajectory. Both of these approaches assume that the mode sequence remains unchanged during the execution of the task. [80] formulate a constrained LQR approach as a convex optimization program integrating control input constraints. While this approach can handle minor variations in the timing of impacts with the ground, it does not have the ability to alter its planned mode sequences over a finite horizon, for example to change its gait sequence or footstep plan. It is important to note that while this approach might be relatively sensible in locomotion (a gait precisely defines an ordered sequence of contacts), it is very limiting in manipulation, where a significant part of the richness of reactive behavior comes from quickly adapting to unexpected contact events.

Hence, an important drawback of the aforementioned approaches is the controller's inability to replan the mode sequences in real-time. There have been recent efforts towards designing hybrid feedback control architectures that reason across system discontinuities [14, 72]. Of particular interest to this research are [8] and [55], which formulate the hybrid MPC problem as a mixed-integer program that integrates both continuous and discrete variables. The controllers developed in these works establish closed-loop tracking by reasoning across multiple contact modes, but struggle to achieve real-time rates even for low-dimensional systems, as the scalability of the approach is limited by the number of hybrid states and the length of the control horizon.

Another approach that shows promise is explicit MPC, a multiparametric programming technique that computes the optimal control action offline as an "explicit" function of the state and reference vectors, so that online operations reduce to a function evaluation [9, 3, 74]. While these approaches enable real-time control in theory, in practice they are associated with large offline computational requirements that scale poorly with the dimensionality of the system, the number of hybrid modes, and the length of the control horizon. These approaches require enumerating the complete set of feasible switching sequences offline through a backwards reachability analysis.

Closely related to our work is that of [70], which leverages offline data to warm-start mixed-integer quadratic programs. [21] develops a hybrid controller with the ability to reason across contact modalities by learning a value function from offline solutions of a mixed-integer MPC program.

Rather than pre-computing an offline solution to the hybrid control problem as done in explicit MPC, this thesis proposes to search for the optimal mode sequences offline, separately from the search for optimal control inputs online. We formulate the offline mode sequence search as a supervised classification problem, and show that this leads to a convex hybrid MPC program that can be solved in real-time.

2.4 Grasp Planning

Grasp planning has a long and rich history in robotic manipulation [12, 13, 10, 19, 83].

Conventional approaches focus on determining grasp configurations that ensure there is some form of geometric closure on the object. The success of such grasp planning strategies relies on accurate state estimation of the object pose. Moreover, it has been shown that classical grasp metrics are weak predictors of grasp quality when implemented on a physical robotic platform [30].

Recently, a large portion of the robotic manipulation community has converged to grasp planning methods that are agnostic to the identity and state of the object. Some model-based approaches rank grasp points according to a grasp quality metric that is based on local properties of the camera point cloud [92, 33]. Similarly, leveraging recent advances in computer vision and deep learning, many researchers have turned to data-driven methods that localize grasp points directly from an RGBD image [82, 78, 77, 64, 57]. Both of these object-agnostic approaches have led to effective grasp planning algorithms that deal with a large variety of object types and dense clutter. However, one limitation of such approaches is that they are most often implemented in an open-loop fashion, where the robot motion remains unchanged after the initial grasp location is determined. Due to inaccuracies in perception, dense clutter, and unaccounted object motions, the robot often encounters unanticipated events, such as premature contact, collisions with other objects, or imperfect grasp point locations.

To address these issues, there have been recent attempts to develop closed-loop approaches to grasping. In [57], a deep reinforcement learning approach is used to learn a closed-loop control policy from RGB video feed. In [60], RGBD cameras are combined with infrared sensors to design a reactive algorithm that improves the robustness and adaptability of grasps of unknown objects with uncertain position. In [98], sensor data is generated in simulation to build a reactive grasping policy that computes grasp affordances based on depth images.

2.5 Tactile Sensing

In an effort to enable more effective closed-loop approaches to robotic grasping, researchers have turned to tactile sensing to enable reactive behavior based on what the robot feels rather than what it sees. As reviewed in [102], there is a wealth of literature concerning the use of tactile sensors in robotic manipulation. Tactile sensors have already proved effective at detecting contact slip between the gripper and grasped objects [89, 2, 25], estimating contact forces [66], and localizing objects [48, 44]. In [7], a grasp quality predictor is constructed using self-supervised learning to predict the probability of success given tactile information. In [18], a reinforcement learning approach uses a grasp quality predictor to learn grasp adjustments for a cylindrical object based on tactile feedback. In [76], tactile sensors are integrated into the Dynamic Motion Primitives (DMP) framework to enable Associative Skill Memories (ASM). This imitation learning technique allows a robotic manipulator to replicate both the kinematics of the robot and the sensorimotor measurements it encounters during expert demonstrations. In [43], pressure sensors located in the robot fingers are used to adjust the planned trajectory of a robot in real time to improve the robustness of horizontal grasps.

In this research, we make use of GelSlim [26], a tactile sensor based on GelSight [107]. The GelSight sensor has proved useful for identifying object properties [58, 106], detecting slippage [25], localizing objects [59, 44], and evaluating grasp stability. We are particularly inspired by [15], who showed that a combination of tactile sensing with visual information can reliably determine whether a given grasp will lead to a successful execution. In this research, we focus on the case of pure tactile feedback, where only local contact information is used for controller design.

Veiga [97] learns a slip detector that predicts slippage of an object within the robot hand. This model is leveraged by [96] to maintain a stable object grasp under externally applied perturbations on a multi-fingered robot hand, where each finger acts independently to enforce sticking interactions. Dong [24] develops an incipient slip detection algorithm with a vision-based tactile sensor, GelSlim [26], and uses it to design a closed-loop tactile controller that maintains stable grasps in a bottle-cap screwing experiment. Both studies focus on controlling stick/slip interactions but do not explicitly control the trajectory of the manipulated object.

Tian [93] trains a deep convolutional neural network to predict the motion of a ball rolled on the ground with a tactile finger, directly in tactile space. This model is used within a sample-based MPC controller. A drawback of this approach is its need for large quantities of real-world data, which would be challenging to collect for the rich palm/object interactions considered in this thesis.

Li [59] shows that localized object features can be exploited with tactile sensing to recover an accurate estimate of its pose. This strategy has been shown to be effective at performing challenging manipulation tasks such as part insertion with small tolerances. Izatt [44] fuses tactile and visual perception by interpreting tactile imprints as local 3D point clouds within a Kalman filter framework for object pose estimation. More recently, Bauza [4] develops a tactile-based pose estimation algorithm that exploits a high-resolution tactile map of the object to localize tactile imprints.

Chapter 3

Feedback Control with Contact Models

3.1 Contribution

In this chapter, we introduce and evaluate novel controller designs for physical interactions involving frictional contacts. These control architectures are designed to address two key control challenges typical of robotic manipulation: underactuation and hybrid dynamics. This chapter presents three model-based controllers:

1. MPC-MIQP. In Section 3.5.2, the combinatorial hybrid nature of manipulation dynamics is modeled by introducing integer decision variables into the optimization program. The resulting mixed-integer quadratic program (MIQP) can be solved efficiently using commercial numerical tools, such as Gurobi ([34]). This controller handles the hybrid and underactuated nature of frictional contact interactions well, but is too computationally expensive for real-time control.

2. MPC-FOM. In Section 3.5.3, we introduce a controller that achieves real-time implementation by reasoning across a fixed set of contact schedules.

3. MPC-LMS. A challenge with the MPC-FOM method is that it requires knowledge of what constitutes a good set of candidate mode schedules. In Section 3.5.4, we present MPC with Learned Mode Scheduling (MPC-LMS), a controller design that eliminates the need for human intuition by leveraging offline computation. It combines integer programming and machine learning techniques to effectively deal with the combinatorial complexity associated with determining the sequence of contact modes.

The controller designs are validated on a planar manipulation task, where the goal is to control the motion of a sliding object using a robotic pusher. We measure the performance of each controller through numerical simulations and experimental results using an industrial ABB IRB 120 robotic arm.

3.2 Introduction

Manipulating an object requires physical interaction by an agent. For example, when screwing in a light bulb as shown in Fig. 3-1, the bulb moves due to the physical forces applied by the agent's hand. Following Newton's second law, we can derive the motion equations of the light bulb by drawing its free body diagram, where the applied forces on the light bulb arise due to contact interactions with the hand.


Figure 3-1: The hand manipulates a light bulb into its socket. The dynamics of the light bulb are dictated by rigid body motions under applied external forces arising from contact interactions with the environment.

The motion equation of the system in Fig. 3-1 is:

$$M\ddot{q} + g(q) = \sum_{c=1}^{C} J_c^\top w_c, \tag{3.1}$$

where $M$ is the mass matrix of the bulb, $q$ is the generalized coordinate of the bulb, $g$ is the gravity vector, $J_c$ is the Jacobian corresponding to contact point $c$, and $w_c$ is the wrench (force and torque) applied at the $c$-th contact by the agent on the body. At first glance, there is nothing particularly complex about this form, which is similar to that commonly encountered when modeling robotic manipulators, spacecraft, unmanned aerial vehicles, underwater vehicles, etc. One key difference, however, is that the applied contact wrench $w_c$ cannot take on any arbitrary value and must obey the laws of frictional contact interactions. This fact has two important consequences for the design of planning and control algorithms: these algorithms must 1) handle non-smooth contact switches and 2) reason about the actuation limits that can be sustained by frictional contact interactions.

3.3 Challenges

This work aims to design a closed-loop controller that can reason about the frictional contact interactions arising between a robot's end-effector and a manipulated object. Systems undergoing frictional contact with their environment present two key challenges for controller design: hybridness and underactuation.

3.3.1 Hybridness

When in contact, object and manipulator interact in different contact modes. For example, during manipulation tasks, the object can slip within the fingers of the gripper, the gripper can throw the object in the air, or the gripper can perform pick and place maneuvers. These manipulation actions correspond to different contact interaction modes, characterized by the sliding, sticking, or separation of the individual contacts. The hybridness associated with the transitions between modes can result in a non-smooth dynamical system. This complicates the design of feedback controllers since the vast majority of standard control techniques rely on smoothness of the dynamical model.

Figure 3-2: Depiction of hybridness. Animation of a simple manipulation task that exploits multiple contact modes. First, the hand sticks to the book and drags it backwards exploiting friction. Second, thumb and fingers slide to perform a regrasp maneuver. Finally, the book is retrieved from the shelf using a stable grasp.

In many applications involving hybrid dynamical systems, this difficulty is overcome by setting a schedule of mode transitions of the controller offline. This limitation prevents the controller from fully exploiting the dynamics of the system in response to external perturbations. Furthermore, for robotic manipulation tasks, the mode scheduling is often not known a priori and can be challenging to predict. In such cases, we must rely on the controller to decide, during execution, what interaction mode is most beneficial to the task. Figure 3-2 illustrates the example of picking a book from a shelf. The hand interacts with the book in a complex manner. It is difficult to say when fingers and palm stick or slide, but those transitions not only happen, they are necessary to pick the book. Likely the hand initially sticks to the book and drags it backwards exploiting friction. Then, the thumb and fingers swiftly slide to regrasp the book. Finally, the book is retrieved from the shelf using a stable grasp. For such manipulation tasks where the motion is not periodic, determining a fixed mode sequencing strategy is not obvious and often impractical. Mistakes committed during execution or external perturbations encountered along the way will surely require that the mode sequencing be altered.

Figure 3-3: Depiction of underactuation. When we interact with an object through contact, we can only transfer a limited set of forces. When pushing a coffee mug with a finger, the finger can only push on the object and cannot pull. Underactuation constrains the possible motions of the cup that can be impressed by the finger.

3.3.2 Underactuation

Underactuation is due to the fact that contact interactions can only transmit a limited set of forces and torques to the object. As such, the controller must choose only among the forces that can physically be realized. For example, the normal forces commanded should be positive, as contact interactions can only "push" and cannot "pull." To achieve this, the physical constraints associated with contact interactions must be imposed explicitly in the controller design. The principles and conditions that must be considered include Coulomb's frictional law, the non-penetration condition, and the principle of maximal dissipation. These concepts are described in [90] and further detailed in Section 3.4.3. The consequence of limited control authority is that the controller must reason beyond instantaneous actuation by considering the long-term consequences of control actions.

3.4 Planar Manipulation

This chapter studies planar manipulation, a nonprehensile task where the goal is to control the motion of a sliding object through frictional contact interactions. Planar manipulation is an interesting dynamical system for studying controller design since the source of actuation arises purely from friction. Moreover, this system highlights the importance of reasoning in real-time across contact modes. When the system is perturbed and the contact state is altered, the success of the task depends on the ability of the controller to modify its originally planned contact state in an online fashion.

Figure 3-4: Planar manipulation setup. The goal is to control the motion of the object on a flat surface using a velocity-controlled robotic pusher. The pose of the object is tracked using a Vicon camera system.

We examine and test a feedback controller design for the pusher-slider system, where the purpose is to control the motion of a sliding object on a flat surface using a robotic pusher. The pusher-slider system is a simple dynamical system that incorporates several of the challenges that are typical of robotic manipulation tasks, namely hybridness and underactuation. It is a hybrid dynamical system that exhibits different contact modes between the pusher and slider (e.g., separation, sticking, sliding up, and sliding down). Transitions between these modes result in discontinuities in the dynamics, which complicate controller design. Moreover, it is an underactuated system where the contact forces from the pusher acting on the sliding object are constrained to remain inside the friction cone. These constraints on the control inputs lead to a dynamical system where the velocity control of the pusher is not sufficient to produce an arbitrary acceleration of the slider. Ultimately, the controller must reason about finite horizon trajectories and not just instantaneous actuation.

The purpose of this chapter is to develop a feedback controller design that can handle both challenges described above. In Section 3.5.2, we present the offline optimal control solution to the hybrid control problem of multiple-point planar manipulation as a Mixed-Integer Program. We show that a model predictive control approach used in tandem with integer programming offers a powerful solution to capture the dynamic constraints associated with the friction cone as well as the hybrid nature of contact. We propose two simplifications to speed up the integer program in order to achieve real-time control. First, in Section 3.5.3, we present the Family of Modes (MPC-FOM) control architecture, which simulates the dynamical system forward using a set (i.e., family) of mode schedules that are identified as being key. Second, in Section 3.5.4, we introduce the Learned Mode Sequences (MPC-LMS) controller, which learns a map from object states to predicted future mode transitions.

3.4.1 Nomenclature

We describe here the notation used in the chapter:

• $F_a$: Inertial reference frame fixed to the ground.
• $F_b$: Body reference frame fixed to the object.
• $C$: Number of contact points (indexed by $c$).
• $w = [f_x\ f_y\ \tau]^\top$: Applied wrench on the object resolved in the body frame.
• $t = [v_x\ v_y\ \omega]^\top$: Object twist resolved in the body frame.
• $J_c$: Jacobian matrix associated with the contact point $c$ resolved in the body frame.
• $N = [J_1^\top n_1\ \dots\ J_C^\top n_C]$: Matrix of object normal vectors at contact points resolved in body frame.
• $T = [J_1^\top t_1\ \dots\ J_C^\top t_C]$: Matrix of object tangent vectors at contact points resolved in body frame.
• $f_n = [f_{n,1}\ \dots\ f_{n,C}]^\top$: Vector of applied normal force magnitudes at contact points resolved in body frame.
• $f_t = [f_{t,1}\ \dots\ f_{t,C}]^\top$: Vector of applied tangential force magnitudes at contact points resolved in body frame.
• $\phi = [\phi_1\ \dots\ \phi_C]^\top$: Vector of relative angles of pusher relative to body frame.
• $x = [x\ y\ \theta\ \phi^\top]^\top$: System state vector: position and orientation of the object in $F_a$ as well as the relative angles of the contact points relative to $F_b$.
• $u_f = [f_n^\top\ f_t^\top]^\top$: Vector of applied normal and tangential contact forces.
• $u_\phi = \dot{\phi}$: Vector of commanded angular velocities resolved in body frame.
• $u = [u_f^\top\ u_\phi^\top]^\top$: Control input.

3.4.2 Modeling

This section describes the mechanics of planar manipulation and derives the motion equations used in Section 3.5 for controller design. This model generalizes to an arbitrary number of contact points and arbitrary object shapes. Consider the pusher-object system in Fig. 3-5. The pose of the object is given by $q = [x\ y\ \theta]^\top$, where $x$ and $y$ denote the Cartesian coordinates of the center of mass of the object and $\theta$ its orientation relative to the inertial reference frame $F_a$. Assuming the point $c \in \{1, \dots, C\}$ associated with the pusher remains in contact with the object at all times, the position of the contact point $c$ relative to the object resolved in the body frame $F_b$ is

$$r_c = [x_c\ y_c]^\top.$$

Figure 3-5: Free body diagram of a sliding object with $C = 2$ contact points.

For shapes that can be parametrized radially, the position of the contact point can be described in terms of the radial distance $r = f(\phi)$ as

$$r_c = [-f(\phi_c)\cos\phi_c\ \ f(\phi_c)\sin\phi_c]^\top,$$

where the angle $\phi_c$ describes the location of the contact point along the perimeter and $f$ is used to parametrize the shape of the object radially. Note that we choose to parameterize the location of the pusher on the surface of the object by the angle $\phi$. This assumes a bijective mapping between the angle $\phi$ and the surface of the object, which is not always true. Whereas an arc-length parametrization would be more general, the parametrization with the angle $\phi$ works for "star-shaped" objects and simplifies the description of the kinematics of pushing.
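As a minimal sketch of the radial parametrization above, the following Python snippet evaluates the contact position for two star-shaped outlines. The function names, the circle and ellipse shapes, and the dimensions are hypothetical choices made for illustration; they are not taken from the experimental setup.

```python
import math


def contact_position(phi_c, radial_fn):
    """Contact point r_c = [-f(phi_c) cos(phi_c), f(phi_c) sin(phi_c)] in the body frame."""
    radius = radial_fn(phi_c)
    return (-radius * math.cos(phi_c), radius * math.sin(phi_c))


def circle(radius):
    """Radial function f(phi) of a circle (hypothetical example shape)."""
    return lambda phi: radius


def ellipse(a, b):
    """Radial function f(phi) of an ellipse with semi-axes a (body x) and b (body y)."""
    return lambda phi: a * b / math.hypot(b * math.cos(phi), a * math.sin(phi))


# Contact on the negative body x-axis of a 5 cm circle (assumed dimensions).
r_circle = contact_position(0.0, circle(0.05))
# Contact on the positive body y-axis of a 6 cm by 3 cm ellipse (assumed dimensions).
r_ellipse = contact_position(math.pi / 2, ellipse(0.06, 0.03))
```

Any outline for which each angle maps to exactly one boundary point can be passed as `radial_fn`, which is the bijectivity assumption stated in the text.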

Applying Newton's second law in the x-y plane of $F_a$ yields the motion equations

$$M\ddot{q} = f_G + w, \tag{3.2}$$

where $M$ is the inertia matrix of the system, $w$ is the generalized frictional force applied by all pushers on the object, and $f_G$ is the generalized frictional force applied by the ground on the object. The quasi-static assumption observes that at low velocities, frictional contact forces dominate and inertial forces do not have a decisive role in determining the motion of the object [69]. Under this assumption, the frictional wrench applied by the pusher is in equilibrium with the ground planar frictional force, being of equal magnitude and opposite direction (i.e., $w = -f_G$). The quasi-static assumption leads to a simplified analysis of the motion of the pushed object by using a force balance between the applied forces on the object and the frictional forces between the object and the ground.

Note that the term $M\ddot{q}$ could be integrated into the control formulation presented in Section 3.5. The quasi-static assumption, however, presents advantages: it leads to a direct mapping between the motion of the object and the motion of the pusher and has useful properties, such as invariance of the motion equations to the magnitude of the planar coefficient of friction. This also proved desirable from an experimental implementation standpoint when using a position-controlled robotic manipulator.

The limit surface is a geometric representation that bounds the set of all possible frictional forces and moments that can be sustained by a frictional interface. First introduced in [31], under the quasi-static assumption, the limit surface maps the applied frictional force on an object to its resulting velocity. We use a convex quadratic approximation to the limit surface as described in [110], where the limit surface can be expressed as the sub-level set

$$H(w) = \frac{1}{2} w^\top L w,$$

where $L$ is positive definite and the applied pusher wrench is denoted by $w$. In this thesis, we use an ellipsoidal approximation to the limit surface by [56], where the semi-principal axes are given by $f_{max}$, $f_{max}$, and $m_{max}$, defined by

$$f_{max} = \mu_g m g \quad \text{and} \quad m_{max} = \frac{\mu_g m g}{A} \int_A \|r_{dm}\|\, dA.$$

The term $\mu_g$ describes the coefficient of friction between the object and the ground, $m$ is the mass of the object, $g$ is the gravitational acceleration, $A$ is the surface area of the object exposed to friction, and $r_{dm}$ is the position of an infinitesimal mass $dm$ relative to the origin of the object. The ellipsoidal approximation captures well the shape of the limit surface for object-surface contact interactions that have uniform pressure distributions and yields a convenient invertible analytical form [56].

Figure 3-6: Depiction of the limit surface. The limit surface (ellipsoid in the figure) describes the set of forces and moments that can be transmitted by a patch contact. By the principle of maximal dissipation, the object twist is perpendicular to the limit surface at the applied frictional wrench $w$.
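The semi-principal axes of the ellipsoidal approximation can be evaluated numerically. The sketch below assumes a rectangular footprint with uniform pressure, so that the maximum moment is the friction force per unit area integrated against the distance to the origin, and it assumes the diagonal scaling $L = \mathrm{diag}(1/f_{max}^2, 1/f_{max}^2, 1/m_{max}^2)$ for an ellipsoid with those semi-axes. The function names, the midpoint integration rule, and the numerical values are illustrative assumptions.

```python
import math


def limit_surface_axes(mu_g, mass, width, height, g=9.81, n=200):
    """Approximate f_max and m_max for a uniform-pressure rectangular footprint.

    The integral of the distance to the origin over the footprint is evaluated
    with a midpoint rule on an n-by-n grid (an assumed discretization).
    """
    f_max = mu_g * mass * g
    dx, dy = width / n, height / n
    integral = 0.0
    for i in range(n):
        x = -width / 2 + (i + 0.5) * dx
        for j in range(n):
            y = -height / 2 + (j + 0.5) * dy
            integral += math.hypot(x, y) * dx * dy
    area = width * height
    m_max = f_max / area * integral
    return f_max, m_max


def limit_surface_matrix(f_max, m_max):
    """Diagonal L of an ellipsoid with semi-axes f_max, f_max, m_max (assumed scaling)."""
    return [
        [1.0 / f_max**2, 0.0, 0.0],
        [0.0, 1.0 / f_max**2, 0.0],
        [0.0, 0.0, 1.0 / m_max**2],
    ]


# Hypothetical 9 cm square object of 1 kg with ground friction coefficient 0.35.
f_max, m_max = limit_surface_axes(mu_g=0.35, mass=1.0, width=0.09, height=0.09)
L = limit_surface_matrix(f_max, m_max)
```

For a square footprint the ratio `m_max / (f_max * width)` should approach the mean distance from the center of a unit square, which offers a quick sanity check on the integration.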

The principle of maximal dissipation states that an object will react to an applied force by moving in the direction that maximizes the system's dissipated power. In practice, this establishes a constraint between the applied force on the object and the resulting velocity of the object. Given the convex limit surface described by $H(w)$, the resulting object twist subjected to an external force will be in the perpendicular direction to the limit surface,

$$t = \nabla H(w) = L w, \tag{3.3}$$

where the applied frictional wrench by the set of pushers to the object is

$$
\begin{aligned}
w &= \sum_{c=1}^{C} J_c^\top \left(n_c f_{n,c} + t_c f_{t,c}\right) && (3.4) \\
  &= \underbrace{\begin{bmatrix} J_1^\top n_1 & \dots & J_C^\top n_C \end{bmatrix}}_{N} \begin{bmatrix} f_{n,1} \\ \vdots \\ f_{n,C} \end{bmatrix} && (3.5) \\
  &\quad + \underbrace{\begin{bmatrix} J_1^\top t_1 & \dots & J_C^\top t_C \end{bmatrix}}_{T} \begin{bmatrix} f_{t,1} \\ \vdots \\ f_{t,C} \end{bmatrix} && (3.6) \\
  &= B u_f, && (3.7)
\end{aligned}
$$

with

$$B = [N\ T], \quad J_c = \begin{bmatrix} 1 & 0 & -y_c \\ 0 & 1 & x_c \end{bmatrix}, \quad u_f = [f_n^\top\ f_t^\top]^\top,$$

and where $n_c$ and $t_c$ denote the normal and tangential directions of the applied forces in $F_b$ and $f_{n,c}$ and $f_{t,c}$ represent the magnitudes of the normal and tangential applied forces in $F_b$.

Figure 3-7: Friction cone constraint. The applied force must remain within the blue shaded region.

Figure 3-8: Mode dependent constraints following Coulomb's frictional interaction law. (a) Sticking. The relative velocity between the pusher and object is zero (kinematic constraint). (b) Sliding left. The frictional force lies on the lower boundary of the friction cone (force constraint). (c) Sliding right. The frictional force lies on the upper boundary of the friction cone (force constraint).

Consider the planar manipulation system with multiple contact points shown in Fig. 3-5. The motion equations of the system can be expressed as

$$\dot{x} = f(x, u) = \begin{bmatrix} R t \\ u_\phi \end{bmatrix} = \begin{bmatrix} R L B u_f \\ u_\phi \end{bmatrix} = \begin{bmatrix} R L B & 0 \\ 0 & I \end{bmatrix} u, \tag{3.8}$$

with

$$R = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix},$$

where $R$ is a rotation matrix and $u_\phi$ represents the relative sliding velocity between pusher and object. Equation (3.8) assumes that all points maintain contact with the sliding object and that the applied forces satisfy physical interaction laws as presented in Section 3.4.3.
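To make the chain from contact forces to state derivative concrete, the sketch below evaluates the wrench of a single contact and the resulting quasi-static state derivative. It is a simplified illustration for one contact point and a diagonal $L$; the function names, the contact geometry, and the unit limit surface are hypothetical values chosen so the result is easy to check by hand.

```python
import math


def contact_wrench(r_c, n_c, t_c, f_n, f_t):
    """Planar wrench J_c^T (n_c f_n + t_c f_t) with J_c = [[1, 0, -y_c], [0, 1, x_c]]."""
    x_c, y_c = r_c
    f_x = n_c[0] * f_n + t_c[0] * f_t
    f_y = n_c[1] * f_n + t_c[1] * f_t
    return (f_x, f_y, -y_c * f_x + x_c * f_y)


def state_derivative(theta, limit_diag, wrench, phi_dot):
    """State derivative [R L w; phi_dot] for a single contact and a diagonal L."""
    # Body-frame twist t = L w.
    v_x, v_y, omega = (l * w for l, w in zip(limit_diag, wrench))
    # Rotate the linear velocity into the inertial frame; omega is unchanged.
    c, s = math.cos(theta), math.sin(theta)
    return (c * v_x - s * v_y, s * v_x + c * v_y, omega, phi_dot)


# Hypothetical contact on the left face of the object, pushing along body x.
w = contact_wrench(r_c=(-0.045, 0.0), n_c=(1.0, 0.0), t_c=(0.0, 1.0), f_n=1.0, f_t=0.5)
x_dot = state_derivative(theta=math.pi / 2, limit_diag=(1.0, 1.0, 1.0), wrench=w, phi_dot=0.0)
```

With several contacts, the wrenches would be summed before applying $L$, and one sliding velocity per contact would be appended to the state derivative.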

3.4.3 Frictional Contact Constraints

The motion equations in Eq. (3.8) do not enforce that the reaction forces between manipulator and sliding object are feasible. For example, if the input $u_f = [f_n^\top\ f_t^\top]^\top$ is unconstrained, negative normal forces could be applied to the object, which is physically inconsistent, since contact interactions cannot transmit such forces. To ensure that the motion equations are associated with physically reasonable behavior, we must impose constraints on the control input $u$, ensuring that the motion model obeys contact interaction laws. An important property of contact mechanics is that the physical constraints dictating the magnitude and direction of the frictional forces vary with the contact interaction mode.

Friction Cone In accordance with Coulomb's frictional law, the following constraints on the inputs are always satisfied independently of the contact mode:

$$C_0: \quad f_{n,c} \ge 0, \quad |f_{t,c}| \le \mu_p f_{n,c}, \tag{3.9}$$

where $\mu_p$ denotes the coefficient of friction at the pusher-object contact. These constraints imply that each pusher can only exert a compressive force on the object and that the net frictional force applied on the object remains within the bounds of the friction cone in Fig. 3-7. In addition, we must enforce constraints that depend on the contact interaction mode.

Sticking When the pusher is sticking relative to the object, the relative tangential velocity is zero, as in Fig. 3-8(a)

$$C_1: \quad \dot{\phi}_c = 0. \tag{3.10}$$

Sliding Left When the pusher is sliding left relative to the object, the tangential velocity is strictly positive and the frictional force must remain on the right hand side of the friction cone, as in Fig. 3-8(b)

$$C_2: \quad \dot{\phi}_c > 0, \quad f_{t,c} = \mu_p f_{n,c}. \tag{3.11}$$

Sliding Right When the pusher is sliding right relative to the object, the tangential velocity is strictly negative and the frictional force is constrained to remain on the left hand side of the friction cone, as in Fig. 3-8(c)

$$C_3: \quad \dot{\phi}_c < 0, \quad f_{t,c} = -\mu_p f_{n,c}. \tag{3.12}$$
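The friction cone and the three mode-dependent constraints can be collected into a single feasibility check for one contact point. The following is a minimal sketch; the function name, the string mode labels, and the numerical tolerance are assumptions introduced for illustration.

```python
def satisfies_mode_constraints(mode, phi_dot, f_n, f_t, mu_p, tol=1e-9):
    """Check the friction cone and the mode-dependent constraint for one contact."""
    # Friction cone: compressive normal force and bounded tangential force.
    if f_n < -tol or abs(f_t) > mu_p * f_n + tol:
        return False
    if mode == "sticking":
        # Kinematic constraint: no relative sliding.
        return abs(phi_dot) <= tol
    if mode == "sliding_left":
        # Positive sliding velocity, force on one boundary of the cone.
        return phi_dot > tol and abs(f_t - mu_p * f_n) <= tol
    if mode == "sliding_right":
        # Negative sliding velocity, force on the opposite boundary of the cone.
        return phi_dot < -tol and abs(f_t + mu_p * f_n) <= tol
    raise ValueError(f"unknown contact mode: {mode}")


# Hypothetical inputs with a pusher-object friction coefficient of 0.3.
ok_stick = satisfies_mode_constraints("sticking", 0.0, 1.0, 0.1, 0.3)
ok_left = satisfies_mode_constraints("sliding_left", 0.2, 1.0, 0.3, 0.3)
bad_left = satisfies_mode_constraints("sliding_left", 0.2, 1.0, 0.1, 0.3)
pulling = satisfies_mode_constraints("sticking", 0.0, -1.0, 0.0, 0.3)
```

The check makes the hybrid structure visible: the same force pair can be feasible in one mode and infeasible in another, which is why mode and input must be chosen together.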

The motion equations in Eq. (3.8) are expressed in terms of the object state $x$, the applied forces $u_f$, and the relative pusher sliding velocities $u_\phi$. This mapping from applied forces to object velocity is desirable from a controller design perspective as the motion equations in Eq. (3.8) are independent of the contact mode. This is not the case for the relation between pusher velocity and object velocity, as described in [37].

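In practice, it is easier to command most position-controlled robots kinematically (i.e., through velocity control) rather than through force control. As such, once the target control $u$ is computed, it is necessary to map it back to a desired robot velocity $v_c$, resolved in the body frame. The robot velocity is linearly related to the object velocity through the kinematic relation

$$
\begin{aligned}
v_c &= J_c t + \dot{r}_c && (3.13) \\
    &= J_c L B u_f + \frac{\partial r_c}{\partial \phi_c} \dot{\phi}_c && (3.14) \\
    &= \begin{bmatrix} J_c L B & \frac{\partial r_c}{\partial \phi_c} e_c^\top \end{bmatrix} u, && (3.15)
\end{aligned}
$$

where $e_c$ denotes the $c$-th standard basis vector, which selects $\dot{\phi}_c$ from $u_\phi$.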
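The mapping from the object twist and the sliding velocity to a commanded pusher velocity can be sketched as follows, using the relation that the pusher velocity is the velocity of the contact point on the object plus the rate of change of the contact position along the perimeter. The function names and the circular object used in the example are hypothetical.

```python
import math


def pusher_velocity(r_c, dr_dphi, twist, phi_dot):
    """Body-frame pusher velocity v_c = J_c t + (d r_c / d phi_c) * phi_dot."""
    x_c, y_c = r_c
    v_x, v_y, omega = twist
    # J_c t with J_c = [[1, 0, -y_c], [0, 1, x_c]].
    return (
        v_x - y_c * omega + dr_dphi[0] * phi_dot,
        v_y + x_c * omega + dr_dphi[1] * phi_dot,
    )


def circle_contact(radius, phi_c):
    """Contact position and its derivative with respect to phi_c for a circle (assumed shape)."""
    r_c = (-radius * math.cos(phi_c), radius * math.sin(phi_c))
    dr_dphi = (radius * math.sin(phi_c), radius * math.cos(phi_c))
    return r_c, dr_dphi


r_c, dr_dphi = circle_contact(0.05, 0.0)
# Sticking contact: the pusher moves with the contact point on the object.
v_stick = pusher_velocity(r_c, dr_dphi, twist=(0.1, 0.0, 0.0), phi_dot=0.0)
# Sliding contact: the pusher additionally moves along the perimeter.
v_slide = pusher_velocity(r_c, dr_dphi, twist=(0.1, 0.0, 0.0), phi_dot=2.0)
```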

3.4.4 Linearization

This section describes the linearization of the motion equations in Section 3.4.2 about a given nominal trajectory. We will see that this linearization yields approximate dynamic equations, which can be enforced as linear matrix inequalities in an optimization program and are computationally tractable. This will be essential for the real-time execution of the controller design presented in Section 3.5. Consider a feasible nominal trajectory $x^*(t)$ of the sliding object with nominal control input $u^*(t)$ of the pusher. The notation $(\cdot)^*$ is used to evaluate a term at the equilibrium state and $\bar{(\cdot)}$ is used to denote a perturbation about the equilibrium state. The linearization of the motion equations in Eq. (3.8) about a nominal trajectory is

$$\dot{\bar{x}} = A(t)\bar{x} + B(t)\bar{u},$$

where $\bar{x} = x - x^*$, $\bar{u} = u - u^*$, and

$$A(t) = \left.\frac{\partial f(x, u)}{\partial x}\right|_{x^*(t),\, u^*(t)}, \quad B(t) = \left.\frac{\partial f(x, u)}{\partial u}\right|_{x^*(t),\, u^*(t)}. \tag{3.16}$$

The terms $A(t)$ and $B(t)$ are computed symbolically using the function jacobian() in MATLAB, where $f(x, u)$ is given by Eq. (3.8).
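When a symbolic toolbox is not available, the two Jacobians can be approximated numerically. The sketch below uses central finite differences, which is a swapped-in technique rather than the symbolic computation described above. The function names and the toy dynamics used to exercise it are hypothetical.

```python
import math


def numerical_jacobian(fn, z, eps=1e-6):
    """Central-difference Jacobian of fn at z, returned as a list of rows."""
    columns = []
    for k in range(len(z)):
        z_plus, z_minus = list(z), list(z)
        z_plus[k] += eps
        z_minus[k] -= eps
        f_plus, f_minus = fn(z_plus), fn(z_minus)
        columns.append([(a - b) / (2 * eps) for a, b in zip(f_plus, f_minus)])
    n_rows = len(columns[0])
    # Transpose the list of columns into a list of rows.
    return [[columns[k][r] for k in range(len(z))] for r in range(n_rows)]


def linearize(f, x_star, u_star):
    """Return A = df/dx and B = df/du evaluated at the nominal pair (x_star, u_star)."""
    A = numerical_jacobian(lambda x: f(x, u_star), x_star)
    B = numerical_jacobian(lambda u: f(x_star, u), u_star)
    return A, B


def toy_dynamics(x, u):
    """Hypothetical smooth dynamics used only to exercise the linearization."""
    return [math.cos(x[2]) * u[0], math.sin(x[2]) * u[0], u[1]]


A, B = linearize(toy_dynamics, x_star=[0.0, 0.0, 0.0], u_star=[1.0, 0.0])
```

In a controller, `linearize` would be called at each sample of the nominal trajectory to obtain the time-varying pair used in the prediction model.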

3.5 Controller Design

In this section, we present a closed-loop controller design that can stabilize the motion of a pushed object about a nominal trajectory. A key feature of this controller is its ability to reason across a sequence of contact modes to fully exploit the dynamics of the system. The proposed controller determines the desired pusher motion (applied force and velocity) at each time step based on the sensed pose of the object.

3.5.1 Hybrid Model Predictive Control

A successful feedback controller for planar manipulation must:

1. Allow for sliding and sticking at contact.

2. Recover from applied perturbations to the nominal trajectory.

3. Be fast enough to solve online.

Figure 3-9: Tree of optimization programs for an MPC program with N prediction steps. Each node branches into the sticking, sliding up, and sliding down contact modes, so the tree scales exponentially due to contact hybridness.

To satisfy these requirements, we use an MPC formulation, which takes the form of an optimization program over the control inputs during a finite time horizon $t_0, \dots, t_N$. The decision variables of the optimization program include the perturbed states of the system about the nominal trajectory for N time steps $\bar{x}_1, \dots, \bar{x}_N$ and the perturbed control inputs $\bar{u}_0, \dots, \bar{u}_{N-1}$. The goal is represented by a finite-horizon cost-to-go function that we minimize subject to the constraints on the control inputs and the dynamics of the system detailed in Section 3.4.2. We express the cost-to-go for N time steps as:

$$J(\bar{x}_i, \bar{u}_i) = \bar{x}_N^\top Q_N \bar{x}_N + \sum_{i=0}^{N-1} \left( \bar{x}_{i+1}^\top Q \bar{x}_{i+1} + \bar{u}_i^\top R \bar{u}_i \right). \tag{3.17}$$

The terms $Q$, $Q_N$, and $R$ denote weight matrices associated with the error state, final error state, and control inputs. These represent standard objectives in a trajectory optimization problem where the planned trajectory must reach the goal, the intermediate trajectory should approach the nominal trajectory, and the actuation effort should be minimized. We subject the search for optimal control inputs to the constraints:

$$\forall (i, c): \quad \bar{x}_{i+1} = \bar{x}_i + h\left[A_i \bar{x}_i + B_i \bar{u}_i\right], \quad f_{n,c,i} \ge 0, \quad \mu_p f_{n,c,i} \ge |f_{t,c,i}|, \tag{3.18}$$

developed in Section 3.4.2, where $i$ is the time index, $c$ denotes the contact point, and $h$ is the integration time step. The first constraint is the forward Euler discretization of the linearized equations of motion, with $A_i$ and $B_i$ obtained by linearizing Eq. (3.8) about the nominal trajectory. This leads to linear constraints that are computationally tractable for real-time execution.
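For a candidate sequence of perturbed inputs, the predicted error states and the associated cost-to-go can be evaluated directly. The sketch below rolls out the Euler-discretized linear model and sums the terminal, state, and input penalties. The helper names and the scalar toy system are hypothetical, and no optimization is performed here; an actual controller would minimize this cost subject to the constraints.

```python
def matvec(M, v):
    """Matrix-vector product for matrices stored as lists of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def quad_form(v, W):
    """Quadratic form v^T W v."""
    return sum(a * b for a, b in zip(v, matvec(W, v)))


def rollout(x0, u_seq, A_seq, B_seq, h):
    """Predict error states with x_{i+1} = x_i + h (A_i x_i + B_i u_i)."""
    states = [list(x0)]
    for A, B, u in zip(A_seq, B_seq, u_seq):
        x = states[-1]
        x_dot = [a + b for a, b in zip(matvec(A, x), matvec(B, u))]
        states.append([xi + h * di for xi, di in zip(x, x_dot)])
    return states


def cost_to_go(states, u_seq, Q, Q_N, R):
    """Terminal penalty plus running penalties on predicted states and inputs."""
    total = quad_form(states[-1], Q_N)
    for i, u in enumerate(u_seq):
        total += quad_form(states[i + 1], Q) + quad_form(u, R)
    return total


# Hypothetical scalar integrator with a two-step horizon.
A_seq, B_seq = [[[0.0]]] * 2, [[[1.0]]] * 2
u_seq = [[-1.0], [-1.0]]
states = rollout([1.0], u_seq, A_seq, B_seq, h=0.1)
J = cost_to_go(states, u_seq, Q=[[1.0]], Q_N=[[10.0]], R=[[0.5]])
```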

Additionally, depending on the contact mode at play at each iteration $i$ of the finite prediction horizon, the controller enforces the extra constraints

$$\text{if Mode}(i) = \text{Sticking}: \quad \dot{\phi}_{c,i} = 0, \tag{3.19}$$

$$\text{if Mode}(i) = \text{Sliding up}: \quad \dot{\phi}_{c,i} > 0, \quad f_{t,c,i} = \mu_p f_{n,c,i}, \tag{3.20}$$

$$\text{if Mode}(i) = \text{Sliding down}: \quad \dot{\phi}_{c,i} < 0, \quad f_{t,c,i} = -\mu_p f_{n,c,i}, \tag{3.21}$$

where the term Mode(i) denotes the contact mode of interaction at the $i$-th step of the prediction horizon.

The constraints in Eqs. (3.19), (3.20), and (3.21) depend on the contact mode and complicate the search for optimal and feasible control inputs. Contact modes and control inputs must be chosen simultaneously. In its naive form, this problem takes the form of a tree of optimization programs with $(M^n)^N$ possible contact schedules, where M denotes the number of possible contact modes for each contact point, n the number of contact points, and N the length of the control horizon. Each branch of the tree requires solving a convex quadratic program, so enumerating all branches is too computationally expensive to perform online.
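The growth of the tree can be checked with a few lines of arithmetic. The sketch below evaluates $(M^n)^N$ for the point pusher (M = 3 modes, n = 1 contact point) with the horizon N = 35 used later in Section 3.6.1; the function name is illustrative.

```python
def num_contact_schedules(num_modes, num_contacts, horizon):
    """Number of branches in the naive tree: (M^n)^N."""
    return (num_modes ** num_contacts) ** horizon


# Point pusher: 3 modes, 1 contact point, N = 35 steps (about 5e16 schedules).
point_pusher_schedules = num_contact_schedules(3, 1, 35)

# Adding a second independent contact point squares the per-step branching.
two_contact_schedules = num_contact_schedules(3, 2, 35)
```

Even for a single contact point, solving one quadratic program per branch is far out of reach within a control cycle, which motivates the formulations that follow.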

3.5.2 Mixed-Integer Quadratic Program (MPC-MIQP)

The combinatorial hybrid nature of the pusher-object dynamics can be modeled by introducing integer decision variables into the optimization program, as is commonly done in mixed-integer programming. The resulting mixed-integer quadratic program (MIQP) can be solved efficiently using commercial numerical tools, such as Gurobi ([34]).

In the case of the pusher-object system, we introduce the integer variables $z_{1i} \in \{0, 1\}$, $z_{2i} \in \{0, 1\}$, and $z_{3i} \in \{0, 1\}$, where $z_{1i} = 1$, $z_{2i} = 1$, or $z_{3i} = 1$ indicate that the contact interaction mode at step i is either sticking, sliding up, or sliding down, respectively. The hybrid MPC problem with additional integer variables denoting the contact modes is detailed as:

Optimization Problem MPC-MIQP (MIQP): Given the current error state $\bar{x}_0$ and nominal trajectory $(x^*, u^*)$, solve

Figure 3-10: Hybrid MPC framework. A sequence of control inputs is computed that will drive the predicted states to the reference trajectory while simultaneously finding the schedule of optimal hybrid mode transitions $m = \{m_1, \ldots, m_M\}$. The control input $\bar{u}_0 + u^*$ is applied to the system. (Rows shown over the horizon $t_0, t_1, \ldots, t_N$: error states, control inputs, and contact modes $m_1, m_2, \ldots, m_M$.)

Figure 3-11: Block diagram of the hybrid controller design proposed in Eq. (3.22). The resulting MPC controller design involves solving a non-convex mixed-integer quadratic program. Note that the integer variables are internal to the controller, necessary to represent the hybrid dynamics, and serve as an aid to find the optimal control sequence.

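\[
\begin{aligned}
\min_{\bar{x}_i, \bar{u}_i, z_i} \quad & \bar{x}_N^T Q_N \bar{x}_N + \sum_{i=0}^{N-1} \left( \bar{x}_{i+1}^T Q \bar{x}_{i+1} + \bar{u}_i^T R \bar{u}_i + z_i^T W_i z_i \right) \\
\text{subject to} \quad & \bar{x}_{i+1} = \bar{x}_i + h \left[ A_i \bar{x}_i + B_i \bar{u}_i \right], \\
& \bar{u}_{c,i} \in C_0, \\
& \bar{u}_{c,i} \in C_1 \ \text{if } c \text{ is sticking (i.e., } z_{1c,i} = 1), \\
& \bar{u}_{c,i} \in C_2 \ \text{if } c \text{ is sliding left (i.e., } z_{2c,i} = 1), \\
& \bar{u}_{c,i} \in C_3 \ \text{if } c \text{ is sliding right (i.e., } z_{3c,i} = 1), \\
& z_{1c,i} + z_{2c,i} + z_{3c,i} = 1,
\end{aligned}
\tag{3.22}
\]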

with $z_i = [z_{1c,i}, z_{2c,i}, z_{3c,i}]^T$. The term $W_i$ is a weight matrix that can be used to penalize contact switches between the nominal trajectory and the corrected actions. We enforce that the sum of the integer values must be unity at each time step to ensure that only one mode can be active at a time.

The resulting mixed-integer optimization program is simultaneously tasked with finding the optimal hybrid mode schedule during the prediction horizon (zi) and the optimal control sequence (ui). In practice, we employ the big-M formulation [73] to formulate the problem, where M is a large scalar value used to activate and deactivate the contact mode dependent constraints, through a set of linear equations.
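As a generic illustration of the big-M idea (a sketch of the standard construction, not the exact constraints used in this work), suppose a mode-dependent linear constraint $a^T \bar{u}_i \le b$ should hold only when the binary variable $z_{k,i}$ is equal to one. The vector $a$ and scalar $b$ are placeholders introduced here for illustration.

```latex
a^T \bar{u}_i \le b + M \left( 1 - z_{k,i} \right), \qquad z_{k,i} \in \{0, 1\}.
```

When $z_{k,i} = 1$ the inequality reduces to $a^T \bar{u}_i \le b$. When $z_{k,i} = 0$ the right-hand side becomes $b + M$, which leaves the constraint inactive provided M exceeds the largest value that $a^T \bar{u}_i - b$ can take over the feasible set. An equality constraint can be switched on and off in the same way by writing it as a pair of opposing inequalities.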

To speed up computation and reduce the number of integer variables, it is useful to constrain adjacent time steps within a prediction horizon to have the same contact mode. This is shown in Fig. 3-10, where the agglomerated mode sequence $m = \{m_1, \ldots, m_M\}$ is introduced, with $m_m \in \{S, L, R\}$ denoting sticking, sliding left, and sliding right.

3.5.3 Family of Modes (MPC-FOM)

The MIQP formulation greatly reduces the computational cost associated with the tree of optimization programs in Eq. (3.17) and Fig. 3-9. With an efficient implementation, it can be solved in almost real-time for low dimensional systems with few contact interactions. However, it does not scale well for systems with more degrees of freedom or additional contact points. The method presented in this section is motivated by the observation that many of the branches of the tree in Fig. 3-9 will give very good solutions, even if not exactly optimal. For planar manipulation tasks, it is reasonable to expect that the optimal mode schedule will follow a certain predictable structure. For example, we can intuitively expect that when the object is located in the y direction above the reference straight line trajectory, as in Fig. 3-12, the pusher will likely slide up to correct the orientation of the object followed by a downward sliding motion as the object converges to the desired trajectory. This pushing strategy represents one possible mode schedule.

The family of modes algorithm consists in determining a fixed set of probable mode schedules that span a large range of primitive behaviors of the system. Each mode schedule in the family specifies a sequence of N contact modes to be imposed during the finite prediction horizon in the MPC formulation. By doing so, the combinatorial problem reduces to solving K convex optimization programs, where K is the number of mode sequences in the family. A key challenge is in determining a small number of mode schedules that spans a "significant" set of dynamic behaviors. For the point pusher system, one could consider a family of three mode sequences:

Example: Family of Modes for the pusher-object system

• $m_1$: the pusher slides left relative to the object followed by a sticking phase (i.e., $m_1 = \{L, S, S, \ldots, S\}$)

• $m_2$: the pusher slides right relative to the object followed by a sticking phase (i.e., $m_2 = \{R, S, S, \ldots, S\}$)

• $m_3$: the pusher sticks to the object for the full length of the prediction horizon (i.e., $m_3 = \{S, S, S, \ldots, S\}$).

Figure 3-12: Example of optimal mode schedule for the pusher-slider system converging to a straight horizontal trajectory. (Phases shown along the trajectory: sliding up, sticking, sliding down, sticking.)

Even though this family of mode sequences only contains 3 of the $3^N$ possible contact mode combinations in the tree in Fig. 3-9, it spans a very large set of dynamic behaviors between the pusher and the object. Part of the reason is that the controller will re-optimize the selection of optimal modes in real-time. The Family of Modes algorithm is detailed in Algorithm 1.

Algorithm 1: Family of Modes (FOM)
    input : $\bar{x}(t_0)$, $x^*(t)$, $u^*(t)$
    output: $\bar{u}(t_0)$
    Select K mode schedules $m_1, m_2, \ldots, m_K$;
    for k ← 1 to K do
        $\bar{u}_k, J_k$ = FM-MPC($\bar{x}(t_0)$, $x^*(t)$, $u^*(t)$, $m_k$)
    end
    $k^* = \arg\min_k \{J_k\}$
    $\bar{u}(t_0) = \bar{u}_{k^*}$
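The selection loop of Algorithm 1 can be sketched in a few lines. The example below is a minimal illustration, not the implementation used in this work: the fixed-mode solver is passed in as a callable, and `toy_solver` is a hypothetical stand-in that returns an arbitrary cost instead of solving the quadratic program FM-MPC.

```python
def family_of_modes(x_err, schedules, solve_fixed_mode):
    """Solve one fixed-mode program per schedule and keep the cheapest.

    solve_fixed_mode(x_err, schedule) must return (input_sequence, cost),
    or None when the program is infeasible for that schedule.
    """
    best = None
    for schedule in schedules:
        result = solve_fixed_mode(x_err, schedule)
        if result is None:
            continue
        inputs, cost = result
        if best is None or cost < best[1]:
            best = (inputs, cost, schedule)
    if best is None:
        raise RuntimeError("no feasible mode schedule in the family")
    inputs, cost, schedule = best
    return inputs[0], schedule, cost  # only the first input is applied


def toy_solver(x_err, schedule):
    """Hypothetical stand-in for FM-MPC: the cost depends on the first mode."""
    penalty = {"L": abs(x_err - 1.0), "R": abs(x_err + 1.0), "S": abs(x_err)}
    return [-x_err] * len(schedule), penalty[schedule[0]]


HORIZON = 5
FAMILY = [
    ["L"] + ["S"] * (HORIZON - 1),  # slide left, then stick
    ["R"] + ["S"] * (HORIZON - 1),  # slide right, then stick
    ["S"] * HORIZON,                # stick throughout
]

first_input, chosen_schedule, chosen_cost = family_of_modes(0.9, FAMILY, toy_solver)
```

Because the K programs are independent of one another, the loop body is a natural candidate for parallel execution.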

By preselecting a candidate mode schedule, all combinatorial aspects disappear, and each mode schedule in the family leads to a computationally tractable quadratic program, as defined in Optimization Problem FM-MPC (QP). Moreover, computation of the K quadratic programs can be sped up by parallelizing the programs at run time. The controller selects the mode schedule that leads to the minimum cost among the K possibilities. The control input is selected at each time step by choosing the first element of the sequence of control inputs as $u = u^* + \bar{u}_0$, where the term $\bar{u}_0$ is obtained from the optimization program with minimum cost and $u^*$ denotes the nominal control input.

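Optimization Problem FM-MPC (QP): Given the current error state $\bar{x}_0$, nominal trajectory $(x^*, u^*)$, and mode schedule m, solve

\[
\begin{aligned}
\min_{\bar{x}_i, \bar{u}_i} \quad & \bar{x}_N^T Q_N \bar{x}_N + \sum_{i=0}^{N-1} \left( \bar{x}_{i+1}^T Q \bar{x}_{i+1} + \bar{u}_i^T R \bar{u}_i \right) \\
\text{subject to} \quad & \bar{x}_{i+1} = \bar{x}_i + h \left[ A_i \bar{x}_i + B_i \bar{u}_i \right], \\
& \bar{u}_{c,i} \in C_0, \\
& \bar{u}_{c,i} \in C_1 \ \text{if } m \text{ specifies that } c \text{ is sticking}, \\
& \bar{u}_{c,i} \in C_2 \ \text{if } m \text{ specifies that } c \text{ is sliding left}, \\
& \bar{u}_{c,i} \in C_3 \ \text{if } m \text{ specifies that } c \text{ is sliding right}.
\end{aligned}
\]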

The block diagram associated with the hybrid controller design with Family of Modes Scheduling is shown in Fig. 3-13.

Figure 3-13: Block diagram of the hybrid controller design with Family of Modes Scheduling (FOM). The resulting MPC controller design contains K convex quadratic programs, one FM-MPC (QP) block per mode schedule $m_1, \ldots, m_K$, that can be solved in parallel.

When selecting a candidate family of mode sequences, it is important to ensure consistency between planning and control. This concept is related to "sequential consistency", defined in [11]. For example, if a sliding mode is used within the prediction horizon, then at least one mode schedule should begin with a sliding mode to ensure that the controller can actually apply this action on the system. The iterative nature of the controller means that only the first action will be executed before replanning. If all the mode schedules were to start by sticking, the system would never slide during execution.
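One simple way to audit a candidate family for this property is to check that every mode appearing anywhere in the family also appears as the first mode of some schedule. The sketch below implements that check; it is one possible, fairly strict reading of the consistency requirement, and the function names are illustrative.

```python
def first_modes(family):
    """Modes that can actually be executed, i.e. first entries of schedules."""
    return {schedule[0] for schedule in family}


def modes_never_executed(family):
    """Modes used somewhere in the family that no schedule starts with."""
    used = {mode for schedule in family for mode in schedule}
    return used - first_modes(family)


consistent_family = [["L", "S", "S"], ["R", "S", "S"], ["S", "S", "S"]]
inconsistent_family = [["S", "L", "S"], ["S", "S", "S"]]

ok = modes_never_executed(consistent_family)     # empty set
bad = modes_never_executed(inconsistent_family)  # sliding left is never applied
```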

3.5.4 Learned Mode Scheduling (MPC-LMS)

The Family of Modes algorithm offers a powerful solution to the stabilization of hybrid and underactuated dynamical systems that achieves real-time performance and has been successfully implemented in experiments as detailed in Section 3.7. A challenge with this method is that it requires knowledge of what constitutes a good set of candidate mode schedules. In the case of the point pusher system, human intuition is sufficient to determine an effective set of modes. However, for higher dimensional systems with more complex contact interactions, this approach might prove challenging.

This section introduces a controller design that eliminates the need for human intuition by separating the search for the mode schedule from the search for the optimal control sequence. We describe how this splitting of the problem permits us to leverage offline computations to achieve real-time control (see Section 3.6 for more details).

Figure 3-14: Block diagram of the hybrid controller design with learned mode schedule classifier. The resulting MPC controller design is a convex quadratic program.

Figure 3-15: Supervised learning framework for mode schedule selection. A dataset of E labelled datapoints is generated by solving Optimization Problem 1. From the training examples, a classifier is trained to return the mode schedule based on the state error vector. (Training: a random state error generator produces $\bar{x}_e$, the MPC (MIQP) labels it with $m_e$, and the learning algorithm is fit to the training examples $D = \{\bar{x}_1, m_1, \ldots, \bar{x}_E, m_E\}$. Prediction: the classifier model maps a state error to a mode schedule.)

Consider the controller design architecture proposed in Fig. 3-14. Suppose that, given the state error $\bar{x}$, we had access to an oracle function that returned an effective mode schedule $m = \{m_1, m_2, \ldots, m_M\}$ to be enforced during the prediction horizon. In such a case, the control problem would reduce to determining the optimal control inputs under the prescribed mode sequence by solving Optimization Problem 2 (FM-MPC). Although we do not have direct access to a real-time function that determines the optimal mode schedule, we can query Optimization Problem 1 (MPC-MIQP) as much as desired offline to find optimal mode sequences given errors in the input state. This formulation lends itself well to a supervised learning setting, where the objective is to train a classifier model that can select an effective mode schedule given the error state.

We present the learning framework used to design the classifier model shown in Fig. 3-15. Using the Optimization Problem 1 presented in Eq. (3.22), we generate a dataset of E training examples $\{\bar{x}_e, m_e\}$, where $m_e$ represents the mode schedule associated with the eth datapoint. The purpose of the machine learning algorithm is to train a classifier model that minimizes the cross-entropy error function of the labelled training set. In this paper, we use a fully connected neural network as described in Table 3.2.

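Algorithm 2: Mode classifier
    input : E
    output: $\mathcal{M}_\theta = \{f_{\theta_1}, \ldots, f_{\theta_M}\}$
    for e ← 1 to E do
        sample state $\bar{x}_e \sim \mathcal{N}(0, \Sigma)$;
        $\bar{x}_e, \bar{u}_e, m_e, J_e$ = MPC-MIQP($\bar{x}_e$, $x^*$, $u^*$);
    end
    for m ← 1 to M do
        randomly initialize $\theta_m$;
        $\theta_m \leftarrow \arg\min_\theta \mathcal{L}(m_1(m), \ldots, m_E(m), \theta)$
    end

Algorithm 3: Learned Mode Scheduling (LMS)
    input : $\bar{x}(t_0)$, $x^*(t)$, $u^*(t)$, E
    output: $\bar{u}(t_0)$
    Offline:
        $\mathcal{M}_\theta$ = Mode Classifier(E)
    Online:
        $\hat{m} = \mathcal{M}_\theta(\bar{x}(t_0))$;
        $\bar{u}, J$ = FM-MPC($\bar{x}(t_0)$, $x^*(t)$, $u^*(t)$, $\hat{m}$)
        $\bar{u}(t_0) = \bar{u}_0$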

This new hybrid control architecture leads to a convex optimization program with a prescribed mode sequence. The main attraction of this approach, detailed in Algorithm 3, is to convert a non-convex mixed-integer quadratic program into a convex quadratic program that can be solved in real-time.
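The offline/online split of Algorithms 2 and 3 can be illustrated with a small, self-contained sketch. Two substitutions are made here and should be read as assumptions: `toy_oracle` is a hypothetical stand-in for the offline MPC-MIQP solve, and a nearest-neighbor lookup replaces the neural network classifier so that the example needs no external libraries. The sampling standard deviations are the ones quoted in Section 3.6.2.

```python
import random

SIGMA = [0.03, 0.03, 0.4, 0.025]  # std. dev. of the sampled error state


def toy_oracle(x_err):
    """Hypothetical stand-in for the offline MIQP: label by lateral error."""
    lateral_error = x_err[1]
    if lateral_error > 0.01:
        return ("U", "S")  # slide up, then stick
    if lateral_error < -0.01:
        return ("D", "S")  # slide down, then stick
    return ("S", "S")      # stick throughout


def generate_dataset(oracle, num_examples, sigma=SIGMA, seed=0):
    """Offline: sample error states and label each with the oracle's schedule."""
    rng = random.Random(seed)
    dataset = []
    for _ in range(num_examples):
        x_err = [rng.gauss(0.0, s) for s in sigma]
        dataset.append((x_err, oracle(x_err)))
    return dataset


def fit_nearest_neighbor(dataset):
    """Return a classifier mapping an error state to a mode schedule.

    Swapped-in technique: 1-nearest-neighbor instead of a neural network.
    """
    def predict(x_err):
        def squared_distance(example):
            return sum((a - b) ** 2 for a, b in zip(example[0], x_err))
        return min(dataset, key=squared_distance)[1]
    return predict


# Offline phase.
dataset = generate_dataset(toy_oracle, 500)
classifier = fit_nearest_neighbor(dataset)

# Online phase: one cheap query replaces the search over mode schedules.
predicted_schedule = classifier([0.0, 0.02, 0.1, 0.0])
```

The predicted schedule would then be handed to the fixed-mode quadratic program, which is the only optimization solved online.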

3.6 Numerical Results

This section implements the controller designs presented in Section 3.5 in a numerical simulation study. In Section 3.6.1, we consider the task of stabilizing an object with perturbed initial conditions about a nominal trajectory and compare the tracking performance and computational requirements of the model-based controllers presented in Sections 3.5.2, 3.5.3, and 3.5.4. In Sections 3.6.2 and 3.6.3, we compare the performance of the proposed controller against the baseline controllers: MPC-MIQP, Dubin's Car, MPC with Sticking Contacts, and LQR with Frictionless Contact.

1. MPC-MIQP. The MPC controller with Mixed-Integer Quadratic Programming is described in Section 3.5.2 and represents the optimal baseline. This controller can only be executed in numerical simulations, as it is not fast enough for online execution.

2. MPC-FOM. Online approximation to the optimal solution described in Section 3.5.3 based on searching through a fixed set of probable mode schedules that span a range of primitive behaviors of the system.

3. MPC-LMS. Online approximation to the optimal solution described in Section 3.5.4 based on learning a map from object states to predicted future mode transitions.

4. Dubin's Car. This controller is based on [109], who show that under the assumption of pure sticking interactions, planar pushing can be reduced to a Dubin's car problem where the sticking contact constraints translate to bounded curvature.

5. Sticking Contacts. To evaluate the value of a controller design that can reason across multiple contact modes, we compare the performance against a controller that is limited to reason over a single contact configuration: sticking. This controller requires solving Optimization Program FM-MPC with the mode sequence fixed as $m = \{S, \ldots, S\}$ for the entirety of the control horizon. This strategy is equivalent to that used by [80], [101], and [75].

6. Frictionless Contacts. When controlling systems with sustained contact interactions, one could assume a frictionless contact model that neglects tangential frictional forces. This assumption leads to a smooth dynamical system that can be stabilized using a Linear-Quadratic Regulator (LQR). This controller can be interpreted as assuming a contact that can slide freely while applying positive normal forces.

In Section 3.6.2, we compare the sensitivity to initial state errors of the proposed algorithm against the aforementioned benchmarks across 100 simulations with random initial perturbations. Finally, in Section 3.6.3, we test the robustness of the MPC-LMS controller to errors made in contact classification.

3.6.1 Straight-Line Tracking Simulation

Figure 3-16: Closed-loop straight line tracking of a pushed square object. The MPC-LMS controller performance is compared to four benchmarks: the optimal baseline MPC-MIQP, Dubin's Car, MPC with sticking contacts, and LQR with frictionless contacts. Both the MPC-LMS and optimal baseline MPC-MIQP are able to track the trajectory. The frictionless controller that neglects friction is unable to slide to the proper contact location due to unexpected friction. The Dubin's Car controller is able to track the nominal configuration; however, its limited control authority leads to a significantly slower convergence to the nominal trajectory. (Top: object trajectories against x (m) for MPC-MIQP, MPC-FOM, MPC-LMS, Dubin's Car, Sticking, and Frictionless. Bottom: contact modes of each controller over the push, marked as sliding up, sticking, or sliding down.)

In this section, we consider the task of controlling the motion of a square object using a single point pusher about a straight line trajectory at a constant velocity, as shown in Fig. 3-16. The point pusher system is defined by the state vector $x = [x \; y \; \theta \; \phi_c]^T$ and the control vector $u = [f_n \; f_t \; \dot{r}_y]^T$, where $r_y = \tan \phi_c$. The initial conditions are $x_0 = -0.01$ m, $y_0 = 0.03$ m, $\theta_0 = 30$ deg, and $r_{y0} = -0.02$ m. The physical parameters used in the numerical simulation are reported in Table 3.3.

Figure 3-17: Error dynamics response from perturbed initial conditions associated with the MPC-LMS controller shown in Fig. 3-16. The perturbed initial conditions are shown to converge towards zero after approximately 2 s. (Four panels, each plotted against time in seconds.)

The MPC-FOM controller is designed using a family of three mode sequences defined as

\[
m_1: \begin{cases} \text{Slide left} & \text{if } i = 0 \\ \text{Stick} & \text{if } i > 0 \end{cases}
\qquad
m_2: \begin{cases} \text{Slide right} & \text{if } i = 0 \\ \text{Stick} & \text{if } i > 0 \end{cases}
\qquad
m_3: \text{Stick}, \ \forall i
\]

where i denotes the time step of the optimization program. Note that although the family of selected mode schedules includes a small fraction of the total $3^N$ possible mode schedules, the selected modes span all three possible behaviors that are relevant to the stabilizing task, namely sliding left, sliding right, and sticking. The rest of the horizon is selected so as to replicate the behavior of the nominal trajectory, which in the case of the selected trajectory consists of sticking interactions. The controller design parameters are set to N = 35 steps, h = 0.03 seconds, $Q = 10\,\mathrm{diag}\{1, 3, 0.1, 0\}$, $Q_N = 2000\,\mathrm{diag}\{1, 3, 0.1, 0\}$, and $R = 0.1\,\mathrm{diag}\{1, 1, 0.01\}$. To regulate the velocity of the pusher, we include the pusher velocity constraints $|v_n| \le 0.3$ m/s and $|v_t| \le 0.3$ m/s in the optimization program.

Figure 3-18: Control effort response from perturbed initial conditions associated with the MPC-LMS controller shown in Fig. 3-16. The control input space $u = [f_n \; f_t \; \dot{r}_y]^T$ is mapped to the robot end-effector velocity $v_c$ using Eq. (3.15). Although the commanded velocities are discontinuous at contact switches, the resultant commanded robot positions in x and y are smooth. (Panels plotted against time in seconds, with the final panel showing the commanded robot path.)

The classifier model used in the MPC-LMS controller is learned using a multilayer neural network with the architecture reported in Table 3.2. The controller design parameters used in the numerical simulations are h = 0.03 s, N = 35, Q = 10 diag{3, 3, 0.1, 0}, QN = 2000 diag{3, 3, 0.1, 0}, and R = 0.5 diag{1, 1}. The prediction horizon is split into M = 8 bins during which the contact modes are held constant. The number of time steps associated with each aggregated contact mode section $m_m$ is {1, 5, 5, 5, 5, 5, 5, 4}, with the associated contact mode weight matrix W = 0.1 diag{0, 3, 1,11, 1 1, 1 1} for all time steps. All MPC based benchmarks make use of identical design parameters following those of the MPC-LMS, while the LQR based controller uses Q = 10 diag{3, 3, 0.1, 0} and R = 100 diag{0.5, 0.5}. To regulate the velocity of the pusher, we include the pusher velocity constraints $|v_n| \le 0.3$ m/s and $|v_t| \le 0.3$ m/s in the optimization program.
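The aggregation of the horizon into bins can be made concrete by expanding one mode per bin into one mode per time step. The sketch below uses the bin lengths quoted above, which sum to the N = 35 steps of the horizon; the function name and the single-letter mode labels are illustrative.

```python
BIN_LENGTHS = [1, 5, 5, 5, 5, 5, 5, 4]  # time steps per aggregated mode


def expand_schedule(bin_modes, bin_lengths=BIN_LENGTHS):
    """Repeat each aggregated mode over the time steps of its bin."""
    if len(bin_modes) != len(bin_lengths):
        raise ValueError("one mode is required per bin")
    per_step = []
    for mode, length in zip(bin_modes, bin_lengths):
        per_step.extend([mode] * length)
    return per_step


# Slide left during the first step, then stick for the rest of the horizon.
per_step_modes = expand_schedule(["L", "S", "S", "S", "S", "S", "S", "S"])
```

With 8 bins and 3 modes per bin, the classifier chooses among $3^8 = 6561$ aggregated schedules rather than $3^{35}$ per-step schedules. The first bin has length one, so the mode that is actually executed before replanning is decided on its own.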

In Fig. 3-16, we show the tracking performance of the MPC-MIQP, MPC-FOM, and MPC-LMS controllers and compare their performances against the three baseline algorithms. Results show that the MPC-MIQP, MPC-FOM, and the MPC-LMS algorithms succeed in tracking the desired trajectory. The frictionless controller is not able to track the nominal trajectory, as its model neglects tangential frictional forces, which causes the pusher to get stuck in a position offset from the nominal position. This limitation leads to the controller's inability to overcome friction and control the motion of the object. In contrast, the sticking controller is not able to track the trajectory as it does not have the ability to slide and cannot find convergent trajectories within the considered time horizon to recover from the initial perturbations in $r_y$. The Dubin's Car controller is able to achieve the final target pose, but has limited control authority due to the initial location of the pusher. The Dubin's Car solution is time-optimal only under the constraining assumption of sticking interaction. In Fig. 3-17, the error dynamics of the simulation presented in Fig. 3-16 are shown to tend towards zero after approximately 2 s. In Fig. 3-18, we show the control effort of the simulation in Fig. 3-16. The control input space $u = [f_n \; f_t \; \dot{r}_y]^T$ is mapped to the robot end-effector velocity $v_c$ using Eq. (3.15). Note that although the commanded velocities are discontinuous during contact switches, the resultant commanded robot positions in x and y are smooth.

It is interesting to compare the contact modes enforced by the MPC-LMS vs. the optimal baseline MPC-MIQP. For example, in Fig. 3-16, although the two sequences are not identical, they capture the same general behavior: quickly slide up, stick shortly, slide down, and stick during the remainder of the push. While the controller most often relies on sticking interactions to control the motion of the object, it exploits the additional sliding modes to increase the ability of the pusher to rotate the object. The inability of the sticking controller to recover from external perturbations shows that this ability is not only desirable for improved performance but also necessary to perform the task.

3.6.2 Sensitivity to initial state errors

To test the robustness of the MPC-LMS controller, we evaluate its performance on 100 trajectory tracking simulations with perturbed initial conditions drawn uniformly from the range $\pm[0.03 \; 0.03 \; 0.4 \; 0.025]$ associated with the dimensions $[x \; y \; \theta \; r_y]$. Figure 3-19 shows the mean and variance of the Euclidean distance between the measured and desired position over the entirety of the trajectories. Results show that the MPC-LMS achieves an average performance (E = 0.0064 ± 0.0055 m) that is comparable to that of MPC-MIQP (E = 0.0051 ± 0.0051 m). Both the frictionless and sticking control architectures have high error, (E = 0.0247 ± 0.0063 m) and (E = 0.0499 ± 0.055 m) respectively, as the controllers are unable to track the nominal trajectory.
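The tracking metric described above can be computed as follows. This is a minimal sketch with made-up sample points; it assumes the reported spread is a standard deviation taken over all samples, which the text does not state explicitly.

```python
import math
import statistics


def tracking_errors(measured, desired):
    """Euclidean distance between measured and desired (x, y) positions."""
    return [math.dist(p, q) for p, q in zip(measured, desired)]


def summarize(errors):
    """Mean and population standard deviation of the tracking error."""
    return statistics.mean(errors), statistics.pstdev(errors)


# Made-up positions in meters, for illustration only.
measured_xy = [(0.0, 0.0), (3.0, 4.0)]
desired_xy = [(0.0, 0.0), (0.0, 0.0)]
mean_error, std_error = summarize(tracking_errors(measured_xy, desired_xy))
```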

Figure 3-19: Trajectory tracking error mean and variance taken over 100 simulations with random initial states. The error is computed as the Euclidean distance between the observed center of mass position and its desired location. In general, the sticking and frictionless controllers cannot control the system. (Panels, each plotted against time in seconds: (a) MPC-MIQP, (b) MPC-FOM, (c) MPC-LMS, (d) Dubin's Car, (e) Sticking, (f) Frictionless.)

Table 3.1: Accuracy results for the straight line point pushing numerical results. The neural network predictions are evaluated on a validation set of 50K labelled data points. We evaluate the performance on each mode separately, as defined in Fig. 3-10.

Mode Id.    1     2     3     4     5     6     7     8
Accuracy    0.93  0.93  0.92  0.91  0.90  0.90  0.97  0.99

The major advantage presented by the MPC-LMS controller is that it allows for real-time execution on an experimental setup. Results taken over 100 random simulations show that the MPC-LMS controller achieves an average computational time of 0.0028 ± 0.00075 s compared to that of 0.40 ± 0.17 s for the MPC-MIQP.
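The reported average solve times translate directly into achievable update rates. The short calculation below uses only the two mean values quoted above; comparing them against the discretization step h = 0.03 s is an interpretation added here, not a requirement stated in the text.

```python
def solves_per_second(mean_solve_time_s):
    """Upper bound on the update rate if each cycle waits for one solve."""
    return 1.0 / mean_solve_time_s


MPC_LMS_TIME_S = 0.0028   # mean solve time reported for MPC-LMS
MPC_MIQP_TIME_S = 0.40    # mean solve time reported for MPC-MIQP
STEP_H_S = 0.03           # discretization step of the prediction horizon

lms_rate_hz = solves_per_second(MPC_LMS_TIME_S)    # roughly 357 Hz
miqp_rate_hz = solves_per_second(MPC_MIQP_TIME_S)  # 2.5 Hz
speedup = MPC_MIQP_TIME_S / MPC_LMS_TIME_S         # roughly 143x

lms_fits_in_step = MPC_LMS_TIME_S < STEP_H_S
miqp_fits_in_step = MPC_MIQP_TIME_S < STEP_H_S
```

On these numbers the learned scheduler solves in a fraction of one discretization step, whereas the mixed-integer program takes more than thirteen steps.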

We parametrize the classifier model presented in Fig. 3-14 using a neural network, as reported in Table 3.2. Table 3.1 shows the prediction accuracy of the neural network trained on 100,000 labelled data points on a validation set of 50,000 labelled data points, both generated by sampling the error state from a normal distribution with standard deviation $[0.03 \; 0.03 \; 0.4 \; 0.025]$, associated with the dimensions $[x \; y \; \theta \; r_y]$. We evaluate the performance on each mode individually, as defined in Fig. 3-10. The prediction accuracy is at least 90% on all 8 contact mode clusters, showcasing the neural network's ability to accurately predict the contact mode based on the error state.

Table 3.2: Neural network parameters.

Property                     Value
Number of hidden layers      3
Neurons in hidden layer 1    32
Neurons in hidden layer 2    50
Neurons in hidden layer 3    50
Activation functions         ReLU
Output layer                 Softmax
Loss function                Cross entropy
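For readers who want to see the shape of such a network, the following is a minimal forward pass with the hidden-layer widths of Table 3.2, ReLU activations, a softmax output, and a cross-entropy loss. The input width (4, one per error-state dimension) and output width (3, one per contact mode) are assumptions made for this sketch, as are the random, untrained weights.

```python
import math
import random

LAYER_SIZES = [4, 32, 50, 50, 3]  # input and output widths are assumptions


def init_params(sizes, seed=0):
    """Random weights and zero biases for each fully connected layer."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        params.append((weights, [0.0] * n_out))
    return params


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def forward(params, x):
    """ReLU on the hidden layers, softmax on the output layer."""
    for k, (weights, biases) in enumerate(params):
        x = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        if k < len(params) - 1:
            x = [max(0.0, v) for v in x]
    return softmax(x)


def cross_entropy(probabilities, label):
    """Loss for a single example whose true class index is `label`."""
    return -math.log(probabilities[label])


params = init_params(LAYER_SIZES)
mode_probabilities = forward(params, [0.01, 0.02, 0.1, -0.01])
loss = cross_entropy(mode_probabilities, 0)
```

Training would adjust the weights to minimize the average of this loss over the labelled dataset; that step is omitted here.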

3.6.3 Sensitivity to contact mode errors

In Fig. 3-20, we explore the sensitivity of the MPC-LMS controller to mistakes committed by the contact mode classifier. Figure 3-20 investigates the performance of the controller when errors are randomly introduced in the contact mode selection relative to the MPC-MIQP optimal baseline. Results show that the performance of the MPC controller degrades linearly up to a classification error of 20%, at which point the performance degrades significantly. Note that the error rates of the learned classifier reported in Table 3.1 (at most 10%) are well below this threshold.
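The error-injection experiment can be mimicked by corrupting a reference mode schedule at a chosen rate. The sketch below shows one way to do this; the mode labels, the uniform choice among the wrong modes, and the function name are assumptions, since the text does not specify how the random errors were drawn.

```python
import random

MODES = ("S", "U", "D")  # sticking, sliding up, sliding down


def corrupt_schedule(schedule, error_rate, rng):
    """Replace each mode by a different one with probability error_rate."""
    corrupted = []
    for mode in schedule:
        if rng.random() < error_rate:
            wrong_modes = [m for m in MODES if m != mode]
            corrupted.append(rng.choice(wrong_modes))
        else:
            corrupted.append(mode)
    return corrupted


rng = random.Random(0)
reference = ["U", "S", "D", "S", "S", "S", "S", "S"]
unchanged = corrupt_schedule(reference, 0.0, rng)
all_wrong = corrupt_schedule(reference, 1.0, rng)
partly_wrong = corrupt_schedule(reference, 0.2, rng)
```

Solving the fixed-mode program with each corrupted schedule and recording the resulting cost would then give one point of a curve such as the one in Fig. 3-20.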

Figure 3-20: MPC cost vs. classification mistakes. Random classification errors are introduced on all contact modes to the MPC-MIQP solution to understand the sensitivity of the controller to classification mistakes. (Horizontal axis: classification error in %, from 0 to 100.)

Figure 3-21: Tracking error vs. classification mistake at a given time step as defined in Fig. 3-10. Random classification errors are introduced (100% error) on individual contact modes to the MPC-MIQP solution to understand the sensitivity of the controller. Results show that committed mistakes during the beginning of the control horizon have the most important impact on controller performance. (Horizontal axis: errors in mode scheduling, for modes 1 through 8.)

In Fig. 3-21, we explore the sensitivity to classification errors as a function of the time step of the MPC program at which it was committed. As expected, results show that mistakes committed during the beginning of the control horizon have the most important impact on controller performance. We hypothesize that this is the case as errors committed during the beginning of the horizon have a stronger impact on the future trajectory of the object. Moreover, mistakes committed during the first time step are expected to have a more significant influence on the performance as MPC only applies the first command of the control sequence.

3.7 Experimental Results

We experimentally validate the MPC-LMS controller design on a planar experimental setup using an industrial robotic manipulator (ABB IRB 120) shown in Fig. 3-4. We refer the readers to the attached video for a visualization of the closed-loop pushing experiments. The pose of the pushed object is tracked using a Vicon system (Bonita). The experimental setup is depicted in Fig. 3-4, where a metallic rod (pusher) attached to the robot is used to push an aluminum object on a flat surface (plywood, ABS, Delrin, etc.).

Section 3.7.1 considers a trajectory tracking problem using a square object with a point robotic pusher, while Section 3.7.2 extends the contact configuration to a two point pusher. The objective of the controller is to track the trajectory of the center of mass of the object along the race track defined by two circles of radii 0.15 m at a constant velocity of v = 0.05 m/s, as in Fig. 3-23.
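A nominal trajectory of this kind can be generated by parametrizing two tangent circles by arc length. The sketch below assumes a figure-eight layout in which the two circles touch at the origin and are stacked along the y axis; the orientation, starting point, and direction of travel are assumptions made for illustration, while the radius and speed are the values quoted above.

```python
import math

RADIUS = 0.15  # m
SPEED = 0.05   # m/s


def lap_time(radius=RADIUS, speed=SPEED):
    """Time to traverse both circles once at constant speed."""
    return 4.0 * math.pi * radius / speed


def nominal_position(t, radius=RADIUS, speed=SPEED):
    """Desired (x, y) position at time t along the two-circle track."""
    lap_length = 4.0 * math.pi * radius
    s = (speed * t) % lap_length  # arc length travelled within the current lap
    if s < 2.0 * math.pi * radius:
        # Upper circle, centered at (0, radius), counter-clockwise from the origin.
        angle = -math.pi / 2.0 + s / radius
        return radius * math.cos(angle), radius + radius * math.sin(angle)
    # Lower circle, centered at (0, -radius), clockwise from the origin.
    angle = math.pi / 2.0 - (s - 2.0 * math.pi * radius) / radius
    return radius * math.cos(angle), -radius + radius * math.sin(angle)


seconds_per_lap = lap_time()                      # 12 * pi, about 37.7 s
start_xy = nominal_position(0.0)                  # the crossing point (0, 0)
top_xy = nominal_position(seconds_per_lap / 4.0)  # top of the upper circle
```

Traversing one circle counter-clockwise and the other clockwise keeps the heading continuous at the crossing point, which is what makes the path a figure-eight rather than two separate loops.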

3.7.1 Case Study A: Single Point Pushing

Figure 3-22: Experimental setup for point pusher.

Figure 3-22 shows the experimental setup for the planar manipulation task with a point pusher. The desired velocity along the trajectory is set to 0.05 m/s and all control parameters are kept identical to those described in Section 3.6. We evaluate the performance of the MPC-LMS controller design approach on the race track in Fig. 3-23. First, we consider the ability of the controller to track the nominal trajectory without exerting any external perturbations on the system. Second, we perform the experiments while subjecting the pusher to controlled external perturbations to evaluate the reactive capabilities of the controller.

(a) Point pusher tracking a trajectory for 7 consecutive laps. The black line represents the desired trajectory and the blue lines track the center of mass of the object.

(b) Point tracking of the same trajectory with external perturbations for a single lap. The black line represents the desired trajectory and the hand represents the locations and directions in which the perturbations were applied.

Figure 3-23: Experimental tracking of the 8 track with the MPC-LMS controller for the robotic point pusher.

Table 3.4 shows the prediction accuracy of the neural network trained on 120,000 labelled data points on a validation set of 50,000 data points, both generated by sampling the error state from a normal distribution with standard deviation $[0.03 \; 0.03 \; 0.4 \; 0.025]$, associated with the dimensions $[x \; y \; \theta \; \phi_c]$. The system parameters used in the experiments are presented in Table 3.3.

Figure 3-23 shows the robot manipulator pushing the square object about the race track without any external perturbations for 7 consecutive laps. The black line is the desired trajectory and the blue line tracks the geometric center of the object. The MPC-LMS controller succeeds in tracking the desired trajectory within an average state error of 0.02 m. The stability of the MPC-LMS can be seen in Fig. 3-23(a) by the small variance associated with the 7 consecutive trajectories that overlap within a very small tolerance.

Table 3.3: Experimental system parameters.

Property                                   Symbol    Value
Coefficient of friction (pusher-object)    μ_p       0.3
Coefficient of friction (object-table)     μ_g       0.35
Mass of object, kg                         m         0.827
Object side length, m                      a         0.09
Line pusher width, m                       d         0.03

Table 3.4: Accuracy results for the point pushing experimental results on an 8 track trajectory. The neural network predictions are evaluated on a validation set of 50K labelled data points. We evaluate the performance on each mode separately, as defined in Fig. 3-10.

Mode Id.    1     2     3     4     5     6     7     8
Accuracy    0.95  0.95  0.95  0.93  0.95  0.96  0.96  0.98

In Fig. 3-23(b), we apply four successive impulsive forces to the object, transverse to its direction of motion, to perturb the system about its nominal trajectory and evaluate the stabilizing capabilities of the feedback controller. The forces are applied to ensure that the object is pushed at the same location by an equal distance in each experiment.

The controller quickly reacts to external perturbations and acts in such a manner as to eliminate the perturbation. During real-time execution, the controller reasons about both future control inputs and contact modes to eliminate errors at an average frequency of 250 Hz.

3.7.2 Case Study B: Pushing with Line Contact

Figure 3-24: Experimental setup for line pusher.

The experimental setup for the planar manipulation task with a line pusher is shown in Fig. 3-24. The task of the controller is to determine the motion of the robot in real-time to control the motion of the object about a race track shaped trajectory defined by two circles of radii 0.15 meters at a constant velocity of v = 0.05 m/s. We model the line pusher as 2 contact points that are constrained to move as a rigid body, with the position of the center point of the pusher denoted as $r_y$ and the state vector defined by $x = [x \; y \; \theta \; r_y]^T$. We include the sliding velocity of only one contact point, since both points move as a rigid body.

(a) Line pusher tracking of the 8 track for 7 consecutive laps. The black line represents the desired trajectory and the blue lines track the center of mass of the object.

(b) Line tracking of the 8 track with external perturbations for a single lap. The black line represents the desired trajectory and the hand represents the locations and directions in which the perturbations were applied.

Figure 3-25: Experimental tracking of the 8 track with the LMS controller for the robotic line pusher.

Following a similar approach to that described in Section 3.7.1, we train a classifier model using 80K labelled data points to predict the optimal mode schedule based on the error state of the system. The physical properties used to parametrize the classifier model and the neural network properties are reported in Tables 3.2 and 3.3, respectively.

Table 3.5: Accuracy results for the line pushing experiments on an 8 track trajectory. The neural network predictions are evaluated on a validation set of 50K labelled data points. We evaluate the performance on each mode separately, as defined in Fig. 3-10.

Mode Id.    1     2     3     4     5     6     7     8
Accuracy    0.96  1.0   0.95  0.98  0.99  1.0   1.0   1.0

For the line pusher, both contact points are constrained to have the same contact mode, as they move together as a rigid body and remain in contact with the object. The controller design parameters are selected as h = 0.03 seconds, N = 35, Q = 10 diag{1, 1, 1, 0.1}, QN = 2000 diag{1, 1, 1, 0.1}, and R = diag{1, 1, 1, 1, 0.01}. We split the prediction horizon into 8 parts during which the contact modes are held constant. The number of time steps associated with each contact mode section mm is {1, 5, 5, 5, 5, 5, 5, 4}, with the associated weight matrix W = 0.1 diag{0, 3, 1, 1, 1, 0, 0, 0} for all time steps.
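As a quick consistency check on these parameters, the following sketch expands the per-section step counts into one section index per time step and computes the duration of the prediction horizon. It assumes the section lengths read {1, 5, 5, 5, 5, 5, 5, 4} (eight sections, which sum to N = 35); the function and variable names are hypothetical and are not taken from the thesis implementation.

```python
# Illustrative sketch (not the thesis code): expand the piecewise-constant
# contact-mode sections into one section index per MPC time step.
H = 0.03   # time step h in seconds (from the text)
N = 35     # number of prediction steps (from the text)
# Assumed reading of the section lengths: eight sections summing to N.
SECTION_STEPS = [1, 5, 5, 5, 5, 5, 5, 4]


def expand_sections(section_steps):
    """Return a list giving, for each time step, the index of its section."""
    schedule = []
    for section_index, n_steps in enumerate(section_steps):
        schedule.extend([section_index] * n_steps)
    return schedule


schedule = expand_sections(SECTION_STEPS)
horizon_seconds = N * H  # duration covered by the prediction horizon
print(len(schedule), round(horizon_seconds, 2))
```

Under these assumptions the schedule covers exactly N steps, and the horizon spans N·h = 1.05 seconds.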

Figure 3-25(a) depicts the robotic line pusher pushing the square object about the race track without any external perturbations for 7 consecutive laps. Figure 3-25(b) depicts the robotic line pusher pushing the square object about the race track with external perturbations for a single lap. Each time a perturbation is encountered, the pusher reacts to reduce the error by following a fast sliding motion to stabilize the object and then push it back towards the desired trajectory using a sticking phase.

The controller is executed online at an average frequency of 200 Hz. Note that the frequency is lower because of the extra complexity associated with the larger input space and consequent additional constraints.

3.8 Influence of Control Parameters

In this section, we investigate the dependence and sensitivity of the proposed controller in Section 3.5.4 (MPC-LMS) to design parameters. Specifically, we evaluate the effect of the controller frequency, tracking velocity, planning horizon, estimated coefficient of friction, and radius of curvature of the track on the closed-loop performance of the pusher-object system.

We evaluate the performance of the controller by computing the average mean squared error between the desired trajectory and the actual trajectory for the point pusher system. All experiments are conducted by tracking three consecutive laps of the race track at 0.08 m/s, with the last two laps subject to two controlled external perturbations. An example of this can be viewed in the video attachment.
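The error metric can be sketched as follows. This is a minimal illustration, assuming that the desired and actual object positions are sampled at the same time instants; the text does not specify how samples are paired, and the function name and sample values are hypothetical.

```python
# Illustrative sketch (not the thesis code): mean squared tracking error
# between time-aligned desired and actual planar positions.


def mean_squared_tracking_error(desired, actual):
    """Mean squared Euclidean distance between paired (x, y) samples."""
    if len(desired) != len(actual) or not desired:
        raise ValueError("trajectories must be non-empty and equally long")
    total = 0.0
    for (x_d, y_d), (x_a, y_a) in zip(desired, actual):
        total += (x_a - x_d) ** 2 + (y_a - y_d) ** 2
    return total / len(desired)


# Hypothetical samples in meters: a straight desired path and a noisy execution.
desired = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0)]
actual = [(0.0, 0.01), (0.1, -0.01), (0.2, 0.02)]
mse = mean_squared_tracking_error(desired, actual)
print(mse)
```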

Figure 3-26: Experimental analysis of the performance as a function of the a) controller frequency (Hz), b) tracking velocity (m/s), c) planning horizon (steps), d) error in coefficient of friction, and e) radius of curvature of the track.

3.8.1 Controller Frequency

The pusher-object system is naturally unstable in its forward motion and can be stabilized using a feedback controller for trajectory tracking. A natural question to ask is "what is the lowest controller bandwidth at which the system can be controlled?"

Figure 3-26(a) depicts the effect of the control frequency on the closed-loop tracking performance. Results show that the system requires a minimal control bandwidth of 20 Hz for closed-loop stability. Above this frequency, the performance remains unchanged. Note that, intuitively, this minimum control frequency will change with the velocity of the target trajectory.

3.8.2 Tracking Velocity

The model used for controller design uses the quasi-static approximation, which has been shown to be a good approximation for object velocities under 0.05 m/s by [6]. Above this velocity, the unmodeled inertial forces are appreciable and can negatively impact performance. In Fig. 3-26(b), the controller performance degrades with the increase of the desired velocity. The controller remains stable until 0.1 m/s, at which point the system becomes unstable. We hypothesize that the tracking error grows with the tracking velocity as inertial forces start to have an impact on the object motion and are unaccounted for in the controller design. Furthermore, trajectory tracking tasks at higher velocities are more difficult as they require shorter reaction times to correct mistakes.

3.8.3 Planning horizon

Due to the underactuated nature of the dynamics of the pusher-object system, a certain planning horizon length must be considered for closed-loop stability. The controller must reason beyond instantaneous actuation since the forces required to drive the task in the direction of the goal might not be feasible at the current instant. To investigate the influence of the planning horizon, we gradually increase the planning horizon from 5 to 50 time steps, which corresponds to planning horizons of 0.15 to 1.5 seconds. Results in Fig. 3-26(c) show that the system requires a minimal planning horizon of 20 steps for stability, while performance continues to increase until it reaches its peak performance around 35 time steps. Beyond that, the controller performance degrades, possibly due to the additional computational complexity associated with planning for very long horizons.

3.8.4 Coefficient of Friction

Coefficients of friction of robotic systems are among the most difficult parameters to estimate and can have a notable impact on the system's dynamics. Often they are difficult or impossible to estimate without explicit force sensing [29]. Here, we investigate how the performance of the controller is impacted by varying our estimate of the coefficient of friction between the pusher and the object from 0 to 1. Our best estimate of this value, based on steel-steel contact interactions, is detailed in Table 3.3 as 0.3. Figure 3-26(d) shows that the performance increases monotonically with the estimate of the coefficient of friction.

At first glance, these results seem surprising and counterintuitive, as it is unreasonable that the real value of the coefficient of friction lies as high as 1. We hypothesize that the controller benefits from overestimating the pusher-object coefficient of friction as it leads to more aggressive robot motions, where the robot favors high tangential velocity during sliding motion to ensure that the reaction force lies on the boundary of the friction cone.

3.8.5 Race Track Radius of Curvature

Finally, in Fig. 3-26(e) we test the performance of the controller design by increasing the desired radius of curvature of the race track trajectory from 11 to 17 cm. As expected, the performance of the controller degrades as the radius of curvature decreases because tighter curves require more aggressive pusher motions with higher control authority and are more difficult to track. The minimal radius of curvature achieved is 11 cm prior to the controller going unstable, while the maximum radius of curvature is constrained by the kinematic workspace of the robotic arm. The minimal radius of curvature is achieved with a square object with side lengths of 9 cm.

3.9 Discussion

In this work, we develop and investigate the performance of novel feedback controllers for a planar pushing task. The control formulation is based on a model predictive control approach, where the hybridness and underactuation associated with contact are explicitly enforced as linear constraints within a mixed-integer optimization program. We propose two online approximate solutions to the non-convex MPC-MIQP to achieve real-time computational requirements while explicitly reasoning across multiple contact modes. The MPC-FOM simulates the dynamical system forward using a set (i.e., family) of mode schedules that are identified as being key. The proposed MPC-LMS approach consists in formulating the search for optimal modes offline separately from the search for optimal control inputs online, by leveraging machine learning methods to select mode sequences from prior experience.

We validate the MPC-FOM and MPC-LMS algorithms in numerical simulations and compare them to the optimal baseline as well as two other benchmarks (pure sticking and pure sliding). We further validate our controller design experimentally, where the feedback controller successfully stabilizes the motion of a sliding object about various nominal trajectories. We demonstrate implementation of the controller for two manipulation tasks (point and line pushers) while providing a framework for the mechanics that generalizes to multiple contact formations and different object shapes.

This section develops full state feedback control architectures that reason across contact modalities to handle external perturbations and model uncertainty. In the following section, we explore how to exploit the sense of touch to control contact interactions. A particular focus of this research is on designing tactile controllers that can actively enforce desired contact states between the end-effector and the object. We focus on enabling more dexterous manipulation behavior by using a dexterous robot platform with dual robotic palms equipped with high resolution tactile sensors.

Chapter 4

Tactile Dexterity

4.1 Contribution

This chapter develops closed-loop tactile controllers for dexterous manipulation with dual-arm robotic palms. Tactile dexterity is an approach to dexterous manipulation that plans for robot/object interactions that render interpretable tactile information for control. The contributions of this work are:

• The development of a library of robust manipulation primitives. For each skill, we develop a mechanics model and a pose planner.

• A novel tactile controller design that can 1) control the contact mode between end effector and object (contact/no contact, stick/slip) and 2) replan object motions using a tactile-based object state estimator.

• A high-level planner that sequences a set of manipulation primitives to reconfigure an object from an initial pose to a target pose.

We consider the scenario of manipulating an object from an initial pose to a target pose on a flat surface while correcting for external perturbations and uncertainty in the initial pose of the object. We validate the approach with an ABB YuMi dual-arm robot and demonstrate the ability of the tactile controller to handle external perturbations.

4.2 Introduction

This chapter studies the use of tactile sensing for dexterous robotic manipulation, or tactile dexterity. Despite the evidence that humans heavily depend on the sense of touch to manipulate objects [45], robots still rely mostly on visual feedback. The vision-based approach has been effective for tasks such as pick-and-place [20], but it presents fundamental limitations to accomplish dexterous manipulation tasks that depend on more accurate and controlled contact interactions, such as object reorientation, object insertion, or almost any kind of object use. These executions require planning and reacting to contact geometry, contact force, and contact velocity. How is the object moving relative to the fingers? Where is the object within the hands?

Figure 4-1: A dual arm robot manipulates an object from an initial pose to a target configuration. The manipulation task combines 3 primitive actions to achieve the task: pull the object to the middle of the table, pivot the object, and push it to its target location.

The dynamics of manipulating an object are driven primarily by the relative motions and forces at the interfaces where the object contacts the end-effectors and the environment. With this knowledge, we believe that tactile sensors, with their ability to localize contact geometry, detect contact motion, and infer contact forces, should be at the center of our manipulation plans.

Our approach to tactile dexterity, described in Sec. 4.3, is based on planning for robot/object interactions that render interpretable tactile information for control. At its core, this is an approach to robotic manipulation that puts tactile feedback at the center, to bypass some of its common caveats. The exploitation of tactile sensing for robotic manipulation has remained a longstanding challenge within the robotics community. This can be attributed to difficulties in interpreting tactile signals and designing reactive policies. Tactile information is by nature local and in general is not sufficient to fully describe the state of a manipulated system [49]. Furthermore, the design of feedback policies for systems undergoing physical contact is challenging, even under the assumption of full state feedback [101, 38], as explored in Chapter 3.

Figure 4-2: Planning to pivot a box. Planning to pivot a box involves several levels of complexity. The robot arms move the end-effectors (palms), which in turn generate the trajectory of contact points that ultimately move the box. Tactile sensors allow us to observe and control these contact points. In this chapter, we propose an approach to dexterous manipulation that structures the planning problem so that when unexpected behavior of the contact points is observed with the tactile sensors, the robot can replan the motion of the contact points, palms, and arms in real-time. Panel labels: robot arm trajectory, end-effector trajectory, contact trajectory, tactile feedback.

In that view, the key question becomes: How do we structure and guide manipulation planning so that tactile feedback is not only relevant, but also convenient? We propose to do it by:

• Restricting contact interactions to a set of manipulation primitives that: 1) Target contacts with the object on geometric-rich features (for estimation); and 2) Define dynamic systems with simple mechanics and efficient closed-loop manipulation policies (for control). We describe four such primitives in Sec. 4.5.

• Dividing tactile control into two roles: 1) Controlling the state of the contacts between end-effector and object; and 2) Controlling the state of the object in its environment. We describe a formulation for both in Sec. 4.6.

We present a first study of this approach in the scenario of a dual-arm robot equipped with high-resolution tactile-sensing palms (based on [26]) as end-effectors, and tasked with manipulating an object on a table-top from an arbitrary initial pose to an arbitrary target pose. An offline graph-search task planner, described in Sec. 4.7, sequences the manipulation primitives, which are then executed in a closed-loop fashion by the robot. The focus of the experiments described in Sec. 4.8 is evaluating the robustness of the system to external perturbations and to uncertainty in the initial pose of the object.

Figure 4-3: Manipulation primitives. a) Grasp: the robotic palms align as a parallel jaw gripper to grasp an object. b) Push: a single robotic palm contacts the object laterally to manipulate it within the plane. c) Pivot: the object is rotated about a point on the table by both palms. d) Pull: the robotic palm presses vertically down on the object to slide it within the support plane.

4.3 Approach

This chapter considers the task of manipulating an object from an initial pose to a target pose on a table top. We focus on enabling tactile reactivity where the robot reacts to external perturbations during executions. We consider a robotic platform that is 1) dexterous, where both robotic palms can be controlled independently, and 2) tactile sensorized, where each palm is equipped with high-resolution tactile sensing.

We formulate the manipulation problem as a sequencing of primitive behaviors. We design each primitive to have a prescribed contact interaction between the robotic palms and the object. Examples of the primitive interactions are shown in Fig. 4-3. This structuring of the manipulation task into simpler behaviors gives us the freedom to design interactions for which we understand the mechanics, are able to interpret the tactile information, and can develop effective planning algorithms:

1. Simple Mechanics. Robot/object interactions that can be reliably modeled using rigid body contact mechanics.

2. Efficient Planning. Computationally fast algorithms able to replan robot trajectories at real-time rates by restricting the contact modes to specific interactions (contact configuration, stick/slip).

3. Tactile Control. Closed-loop tactile controllers that enable robust manipulation behavior, where the robot can react to external object perturbations. We divide the role of tactile control into 1) a contact state controller and 2) an object state controller. The contact state controller enforces a desired contact formation (contact/no-contact, stick/slip) between end effector and object. The object state controller plans the motion of the object using a contact based state estimator (tactile localization).

This chapter is structured as follows. In Section 4.4, we introduce four dual-arm manipulation primitives and present their mechanics in Section 4.5. In Section 4.6, we present a controller design that exploits tactile sensing to react to external perturbations by replanning robot motions in real-time. In Section 4.7.1, we detail the robot trajectory planning framework, tasked with computing robot/object trajectories to accomplish a desired object pose transformation under a prescribed manipulation primitive. In Section 4.7.2, we present the high-level planner that searches across manipulation primitives to achieve a task. Finally, in Section 4.8, we present experiments on an ABB YuMi dual-arm robot and demonstrate the ability of the tactile controller to handle external perturbations.

4.4 Manipulation Primitives

In Fig. 4-3, we introduce 4 manipulation primitives, for which we describe the mechanics, tactile control, and trajectory planning in Sections 4.5, 4.6, and 4.7, respectively.

4.4.1 Grasp

The robotic palms can align as a parallel jaw gripper to grasp an object. By securing a stable grasp on the object with both contacts sticking relative to the object, both arms move synchronously to achieve the desired object reconfiguration. Tactile sensing is used to monitor and maintain a prehensile grasp on the object. If the object slips relative to the robot, the displacement is tracked using the tactile sensor and the robot trajectory is replanned online to account for the new object location. Note that although in theory this primitive could perform the majority of pose to pose transformations, in practice it is often challenging to find feasible joint trajectories satisfying the planned palm poses.

4.4.2 Push

A single robotic palm can poke the object laterally to manipulate it within the plane. By orienting itself such that the tactile sensor interacts with an edge of the object, the palm is able to apply both forces and moments by maintaining line and surface contacts with the object. Under mild assumptions about the object geometry and the contact supports, [109] shows that computing robot trajectories achieving planar reconfigurations of the object can be solved algebraically. Pushing interactions lead to object transformations limited to SE(2) and, as such, must be sequenced with other primitives to achieve SE(3) reconfigurations.

4.4.3 Pull

By applying a vertical force downwards, the robotic palm can move the object in all directions within the plane. Pull is an interesting primitive as it can in theory have full control authority over the object. Unlike push, the end-effector can cause the object to move in any direction in the plane, provided that the applied normal force is sufficient to prevent translational and rotational slippage. This slippage is monitored in real-time to control the vertical downward force applied on the object. In practice, this primitive often struggles with computing continuous joint trajectories for large rotational object reconfigurations that would require the. Moreover, its applicability is limited to environments where the coefficient of friction of the table is lower than that of the surface of the palm.

4.4.4 Pivot

The full dexterity of the arms can be exploited to perform pivoting operations, where the object is rotated about a point on the table using the motions of both palms. During a pivot, both palms maintain line contact with edges of the object and apply forces such that the object rotates about the rotation center without slippage. Tactile sensing is employed to locate the edges of the object and to ensure that both contacts do not slip. Whereas both grasp and pivot enable SE(3) reconfigurations of the object, pivot requires smaller arm motions.

4.5 Mechanics

This section describes the mechanics of four manipulation primitives: grasp, pull, pivot, and push (shown in Fig. 4-3). For each primitive, the interactions between the palms, the object, and the environment are modeled assuming:

• Known geometry of object, robot, and environment

• Known coefficients of friction

• Rigid-body interaction

• Coulomb's frictional law

• Quasi-static interaction

• Surface contact interactions with uniform pressure distribution

These simplifications help to design computationally fast trajectory planning algorithms in Section 4.7. The unmodeled elements of the interactions (non-uniform pressure distributions, inertial forces, deformation of contact, etc.) are addressed by designing closed-loop tactile controllers in Section 4.6 that can quickly react to undesirable slippage.

Assuming quasi-static interactions, force equilibrium dictates that contact forces on the object with the end-effector or environment are balanced by:

$$\sum_{i=1}^{C} G_i(q)^T w_i = w_{ext}, \tag{4.1}$$

where $q = [p_o^T \; p_l^T \; p_r^T]^T$ is the concatenation of the object pose and the left/right palm poses, $w_i = [c_i^T \; \tau_i^T]^T$ is the wrench applied on the object by the $i$th contact in the contact frame, $w_{ext}$ is the external force applied by gravity in the world frame, $G_i$ is a grasp matrix transforming the coordinates of a contact wrench from the contact frame to the world frame [88], and $C$ is the number of contacts.

The contact forces are constrained to remain within the friction cone in accordance with Coulomb's frictional law. Denoting the normal and tangential components of the contact force as $c_i = [f_{n,i} \; f_{t,i}^T]^T$, we express Coulomb's frictional law as:

$$f_{n,i} \geq 0, \tag{4.2}$$

$$|f_{t,i}| \leq \mu |f_{n,i}|, \tag{4.3}$$

where $\mu$ denotes the coefficient of friction at the contact.
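To make the friction cone condition concrete, the sketch below checks (4.2) and (4.3) for a planar contact and applies them to a simple quasi-static case: two opposing palms holding an object against gravity, where the vertical force balance requires each contact to carry half of the weight tangentially. The mass, friction coefficient, and function names are hypothetical values chosen for illustration and are not taken from the thesis.

```python
# Illustrative planar sketch of Coulomb's law (4.2)-(4.3); not the thesis code.
G = 9.81  # gravitational acceleration in m/s^2


def satisfies_coulomb(f_n, f_t, mu):
    """Return True if f_n >= 0 and |f_t| <= mu * |f_n|."""
    return f_n >= 0.0 and abs(f_t) <= mu * abs(f_n)


def min_grasp_normal_force(mass, mu, n_contacts=2):
    """Smallest normal force per contact that holds `mass` against gravity.

    Quasi-static vertical balance: n_contacts * f_t = mass * G, and each
    contact must satisfy |f_t| <= mu * f_n.
    """
    f_t = mass * G / n_contacts
    return f_t / mu


# Hypothetical example: a 0.5 kg object and a friction coefficient of 0.3.
f_t_required = 0.5 * G / 2  # tangential load carried by each palm
f_n_min = min_grasp_normal_force(0.5, 0.3)
print(satisfies_coulomb(f_n_min + 1.0, f_t_required, 0.3))  # sticking feasible
print(satisfies_coulomb(f_n_min - 1.0, f_t_required, 0.3))  # object would slip
```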

For contacts modeled using surface contacts (push and pull), the surface is able to resist a certain amount of frictional moment. In the case of point contact interactions (grasp and pivot), the contact is unable to sustain frictional moments, implying $\tau_i = 0$. We model surface contacts using the limit surface [32], which describes the set of forces and moments that can be transmitted through the contact interaction. In practice, we make use of the ellipsoidal approximation to the limit surface introduced in [42] that analytically represents the limit surface as an ellipsoid.
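A minimal sketch of an ellipsoidal limit surface test is given below. It assumes a circular contact patch of radius r under uniform pressure, for which the maximum frictional moment is (2/3) μ N r, and a maximum frictional force of μ N; the semi-axes used in the thesis may be defined differently, and all names and numerical values are hypothetical.

```python
# Illustrative sketch of an ellipsoidal limit-surface test; not the thesis code.
# Assumption: circular contact patch of radius `radius` under uniform pressure,
# for which the maximum frictional moment is (2/3) * mu * normal_force * radius.


def limit_surface_value(f_x, f_y, moment, mu, normal_force, radius):
    """Evaluate (f_x/f_max)^2 + (f_y/f_max)^2 + (m/m_max)^2.

    Values below 1 lie inside the ellipsoid (sticking); a value of 1 lies on
    the limit surface (sliding).
    """
    f_max = mu * normal_force
    m_max = (2.0 / 3.0) * mu * normal_force * radius
    return (f_x / f_max) ** 2 + (f_y / f_max) ** 2 + (moment / m_max) ** 2


def can_stick(f_x, f_y, moment, mu, normal_force, radius):
    """True if the requested friction wrench is strictly inside the ellipsoid."""
    return limit_surface_value(f_x, f_y, moment, mu, normal_force, radius) < 1.0


# Hypothetical contact: mu = 0.5, 10 N normal force, 3 cm patch radius.
print(can_stick(3.0, 0.0, 0.05, 0.5, 10.0, 0.03))
print(can_stick(4.0, 0.0, 0.08, 0.5, 10.0, 0.03))
```

The first wrench lies inside the ellipsoid and can be transmitted without slip, while the second exceeds it.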

4.6 Tactile Control

In this section, we describe tactile-based controllers that give robust behavior to the manipulation primitives. We use GelSlim [26], an optical-based tactile sensor that renders high resolution images of the contact surface geometry and strain field, as shown in Fig. 4-6.

A primary goal of the tactile controller is to enforce a desired contact state between the palms and the object. For example, during a pivot maneuver in Fig. 4-3(c), we want both palms to maintain sticking contacts with the corners of the object. By monitoring slippage, we design a controller that regulates the applied forces on the object to prevent further slip. We refer to this as Contact State Control (Sec. 4.6.1). This is shown in Fig. 4-4, where the contact state controller rotates the configurations of the palms and applies additional normal force on the object in reaction to a slip event.

An important consequence of undesired slippage is that the position of the object has deviated from the nominal trajectory. To address this, we make use of a controller, shown in Fig. 4-6, that tracks local features on the object in real-time to continuously replan robot motions for the manipulation primitive. We refer to this as Object State Control (Sec. 4.6.2).

4.6.1 Contact State Control

Each primitive assumes a particular contact formation between the palms and the object. This assumption is likely to be broken as unmodeled perturbations are applied on the system and cause undesired slippage. We design a contact state controller that acts to enforce the planned contact modes by reacting to the binary incipient slip signal $s_i \in \{0, 1\}$ at contact $i$, based on [24].

Figure 4-4: Contact state control. How should the robot react to undesirable slippage? We design a model based tactile controller that determines locally optimal robot adjustments to recover from contact state deviations. Panel labels: slip detected, rotate palm, apply more force.

Coulomb friction states that slippage occurs when the contact force lies on the boundary of the friction cone, as shown in Fig. 4-5. Given an undesired slippage signal, we find local robot adjustments leading to a more stable sticking solution (i.e. contact forces nearer to the center of the friction cone). The stability margin $\phi$ in Fig. 4-5 is defined as the shortest distance from the contact force to the friction cone boundary and quantifies how close a particular contact is to slipping. The goal of this controller is to find robot adjustments that either maximize (for sticking) or minimize (for sliding) the stability margin of a contact to enforce a desired contact mode.
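For a planar contact, the stability margin has a simple closed form, sketched below. The friction cone boundary is the pair of rays |f_t| = μ f_n, so the distance from a force inside the cone to the nearest ray is (μ f_n - |f_t|) / sqrt(1 + μ²). The restriction to the planar case and the function name are assumptions made for illustration; the thesis considers contact forces in three dimensions.

```python
# Illustrative planar sketch of the stability margin; not the thesis code.
import math


def stability_margin(f_n, f_t, mu):
    """Distance from the force (f_n, f_t) to the nearest friction cone edge.

    Positive inside the cone and zero on its boundary. A negative value only
    signals that the force lies outside the cone; it is not a true distance.
    """
    return (mu * f_n - abs(f_t)) / math.sqrt(1.0 + mu * mu)


# Pressing harder with the same tangential load moves the force away from
# the cone boundary, i.e. towards a more stable sticking contact.
margin_before = stability_margin(10.0, 2.0, 0.5)
margin_after = stability_margin(12.0, 2.0, 0.5)
print(margin_before, margin_after)
```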

Given the slippage signal $s_i \in \{0, 1\}$ at each contact and the current robot pose configuration $q_p = [p_l^T \; p_r^T]^T$, we search for a robot palm pose adjustment $\Delta q_p$, solving:

$$
\begin{aligned}
\max \quad & \sum_{i=1}^{C} \beta_i \phi_i \\
\text{s.t.} \quad & \sum_{i=1}^{C} G_i(q_p + \Delta q_p)^T w_i = w_{ext}, \\
& f_{n,i} \geq 0, \\
& |f_{t,i}| \leq \mu |f_{n,i}|, \\
& w_{min} \leq w_i \leq w_{max}, \\
& \Delta q_{min} \leq \Delta q_p \leq \Delta q_{max},
\end{aligned}
\tag{4.4}
$$

with $G_i$, $q_p$, $w_i$, $w_{ext}$ defined in Section 4.5, where $\phi_i$ is the stability margin shown in Fig. 4-5, $w_{min}$ and $w_{max}$ represent the minimal and maximal allowable applied contact wrenches, and $\Delta q_{min}$ and $\Delta q_{max}$ denote the minimal and maximal allowable robot palm pose adjustments. The notation $q_p + \Delta q_p$ is abused here for simplicity. We ensure that $q_p + \Delta q_p$ satisfies SE(3) group constraints and does not violate the kinematics of the contact formation. In practice, this will define a different space of possible robot adjustments for each primitive. For example, in Fig. 4-3(c), the palms can press harder and/or pivot about the contact edge. The hyperparameter $\beta_i$ is a weight chosen to enforce the desired behavior at the contact $i$ based on the slip signal $s_i$. For example, following a slippage event at contact $i$, the controller would increase $f_{n,i}$.

The optimization program in (4.4) is non-convex due to the nonlinear nature of $\phi$ and the bilinear constraint associated with static equilibrium. In [1], the surrogate stability margin $\alpha$ illustrated in Fig. 4-5 is proposed as a convex approximation to $\phi$. In (4.5), we reformulate the optimization program in (4.4) using the surrogate margin $\alpha$ and linearizing the static equilibrium Eq. (4.1) about the current robot configuration $q_p^*$ and contact forces $f_{n,i}^*$, $f_{t,i}^*$ for computational efficiency. The linearized contact state controller becomes:

$$
\begin{aligned}
\min \quad & -\sum_{i=1}^{C} \beta_i \alpha_i \\
\text{s.t.} \quad & \sum_{i=1}^{C} \left[ \left(G_i^T w_i\right)^* + \left(\frac{\partial (G_i^T w_i)}{\partial q_p}\right)^* \Delta q_p + \left(\frac{\partial (G_i^T w_i)}{\partial w_i}\right)^* \Delta w_i \right] = w_{ext}, \\
& f_{n,i} \geq 0, \\
& \alpha_i \geq 0, \\
& \|f_{t,i}\| \leq \mu f_{n,i}, \\
& \|f_{t,i}\|_2 \leq \mu (f_{n,i} - \alpha_i), \\
& \Delta w_{min} \leq \Delta w_i \leq \Delta w_{max}, \\
& \Delta q_{min} \leq \Delta q_p \leq \Delta q_{max},
\end{aligned}
\tag{4.5}
$$

where the symbol $(\cdot)^*$ is used to evaluate a term at the nominal configuration, $\Delta w_{min}$ and $\Delta w_{max}$ denote the minimal and maximal allowable variations of the applied contact wrenches relative to the equilibrium, and $\Delta q_{min}$ and $\Delta q_{max}$ denote the minimal and maximal allowable robot palm pose adjustments. The optimization program in (4.5) then takes the form of a convex quadratic program under a polyhedral approximation to the friction cone [91].

Figure 4-5: The stability margin $\phi$ quantifies how close a contact is to the slipping boundary. The goal of the contact state controller is to maximize/minimize the stability margin to discourage/encourage slippage. Figure labels: real stability margin, friction cone, contact force, convex stability margin approximation.
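The "apply more force" reaction can be illustrated with a one-contact toy problem. The sketch below assumes that the surrogate margin is the largest α for which |f_t| ≤ μ (f_n - α) holds, which may differ from the exact definition used in [1], and it ignores the palm pose adjustment and the equilibrium constraint altogether. With the tangential load fixed and the normal force change bounded, the margin increases with f_n, so the maximizer sits at the upper bound. All names and values are hypothetical.

```python
# Toy one-contact sketch of the "apply more force" reaction; not the thesis code.
# Assumed surrogate margin: the largest alpha with |f_t| <= mu * (f_n - alpha).


def surrogate_margin(f_n, f_t, mu):
    """Return alpha = f_n - |f_t| / mu, which grows linearly with f_n."""
    return f_n - abs(f_t) / mu


def react_to_slip(f_n, f_t, mu, slipped, delta_f_max):
    """Pick the normal-force change in [0, delta_f_max] maximizing the margin.

    The margin increases with f_n, so the maximizer sits at the upper bound
    when slip was detected; without slip the force is left unchanged.
    """
    delta_f = delta_f_max if slipped else 0.0
    return f_n + delta_f


f_n_new = react_to_slip(f_n=5.0, f_t=2.0, mu=0.5, slipped=True, delta_f_max=2.0)
print(surrogate_margin(5.0, 2.0, 0.5), surrogate_margin(f_n_new, 2.0, 0.5))
```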

4.6.2 Object State Control

The contact state controller in Section 4.6.1 regulates the applied forces on the object to enforce the desired contact mode. This local controller reacts to fight against external perturbations but does not have the ability to change the desired trajectory of the object in response to the perturbation. To address this, we design an object state controller running in parallel tasked with replanning object/palm trajectories to drive the object to its target location.

Figure 4-6: Tactile object localization. By localizing descriptive object tactile features (lines, points), we update our estimate of its pose used by the Object State Controller.

In this chapter we track two types of features: points (corners of the object) and lines (edges of the object). We formulate the tactile object state estimator as an optimization program updating the pose $p_o$ of the object to satisfy the geometric constraints associated with the tactile features, as shown in Fig. 4-6. We quantify the error between two poses using the distance $d_{TS}(p, p^*)$, where $d_{TS}$ is defined as the weighted sum of the Euclidean metric in $\mathbb{R}^3$ and the great circle angle metric in SO(3) for the respective components [39]. We enforce that detected lines are collinear with their associated edge on the object mesh and that the sensed points are coincident with the object corner. In addition to the detected geometric constraints, we constrain the estimated object pose to satisfy the geometric constraints consistent with the current manipulation primitive. For example, for a pull primitive, we enforce that the bottom surface of the object is in contact with the table top.

The estimated object pose is used to update the nominal robot palm pose trajectory and allows the robot to adapt to local object perturbations. This is further described in Sec. 4.7.

Given the previous object pose estimate $p_o^*$, the detected line $l_j$ with associated object edge $l_j^*$, and the detected point $x_k$ with associated object corner $x_k^*$, we solve

$$
\begin{aligned}
\min_{p_o} \quad & d_{TS}(p_o, p_o^*) \\
\text{s.t.} \quad & -\delta_a \leq \angle(l_j, l_j^*) \leq \delta_a, \\
& -\delta_l \leq \mathrm{dist}(l_j, l_j^*) \leq \delta_l, \\
& -\delta_x \leq \mathrm{dist}(x_k, x_k^*) \leq \delta_x,
\end{aligned}
\tag{4.6}
$$

where the distance $d_{TS}(p_1, p_2)$ between two poses is defined as the weighted sum of the Euclidean metric in $\mathbb{R}^3$ and the great circle angle metric in SO(3) for the respective components [39]. The symbol $(\cdot)^*$ is used to denote the location of the detected feature on the object mesh evaluated at the previous pose $q^*$. For example, the detected line $l_j$ has the associated edge $l_j^*$ on the object mesh. The symbols $\angle(\cdot)$ and $\mathrm{dist}(\cdot)$ denote the angle and Euclidean distance between two lines and points, respectively. The constraints in Eq. (4.6) enforce that the detected lines/points in the tactile sensor are collinear/coincident with their parent location on the updated object pose, respectively. The tolerance bounds $\delta_a$, $\delta_l$ and $\delta_x$ are used as robust constraint bounds to handle overconstrained situations where multiple contacts are interacting with the object and noisy measurements can lead to infeasible solutions.
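The pose distance used in the objective can be sketched as follows, with poses represented as a position and a 3x3 rotation matrix. The great circle angle between two rotations is computed from the trace of their relative rotation. The unit weights, the rotation matrix representation, and the function names are assumptions made for illustration; the thesis does not state the weights it uses.

```python
# Illustrative sketch of a weighted pose distance; not the thesis code.
import math


def rotation_angle(R1, R2):
    """Great circle (geodesic) angle in radians between two rotation matrices."""
    # trace(R1^T R2) equals the sum of the element-wise products.
    trace = sum(R1[i][j] * R2[i][j] for i in range(3) for j in range(3))
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))  # guard rounding
    return math.acos(cos_angle)


def pose_distance(p1, R1, p2, R2, w_trans=1.0, w_rot=1.0):
    """Weighted sum of the Euclidean distance and the rotation angle."""
    return w_trans * math.dist(p1, p2) + w_rot * rotation_angle(R1, R2)


# Hypothetical poses: identity versus a 90 degree rotation about z, 5 cm apart.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
ROT_Z_90 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
d = pose_distance((0.0, 0.0, 0.0), IDENTITY, (0.03, 0.04, 0.0), ROT_Z_90)
print(d)
```

Because translation (meters) and rotation (radians) have different units, the weights set how the two error terms trade off.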

4.7 Planning

We formulate the manipulation planning problem as a sequencing of manipulation primitives, as described in Section 4.4. We consider a hierarchy of two levels: low-level trajectory planning and high-level task planning. The trajectory planning level computes the low-level 14-dof kinematic trajectory of the robot arms as well as the trajectory of contact forces achieving a desired object transformation. The task planner determines how to sequence the manipulation primitives to achieve the task.

4.7.1 Low-Level Trajectory Planning

We use the mechanics of quasi-static manipulation presented in Section 4.5 to develop trajectory planning algorithms, tasked with computing robot arm motions that achieve target object transformations subject to a specific manipulation primitive.

When planning robot/object trajectories, we split the search into:

1. Contact Trajectory. Given a desired object trajectory, we first plan for the location and magnitude of the contact forces. These trajectories are constrained to satisfy Eqs. (4.1)-(4.3).

2. End-Effector Trajectory. Given a prescribed contact trajectory, we find a trajectory of end-effector poses achieving the desired contact configurations between the robot arms and the object.

3. Robot Arm Trajectory. Given an end-effector trajectory, we solve the inverse kinematics problem to find a robot arm joint trajectory that 1) avoids collision between the object, the arms, the body, and the environment and 2) is continuous.

During the execution of a primitive, we require that the planners can regenerate trajectories at fast frequencies to react to the pose estimate updates from Section 4.6.2 in a timely fashion.

Algorithms 4-7 generate a kinematic trajectory of the robotic palms $\tau_p$ given a desired object pose transformation from the initial pose $p_o^i$ to the final pose $p_o^f$.

Prior to executing the robot trajectory, we map the Cartesian palm poses to robot joint trajectories using the RRT-Connect motion planner [52]. The resulting robot joint trajectory avoids collision with the environment and the object while maintaining joint continuity. The functions MakeContact and ReleaseContact, common to all primitives in Algorithms 4-7, establish how the robot should approach the object before and after the manipulation process. We rely on the motion planner [52] to find collision-free joint trajectories achieving the target palm/object configuration.

Algorithm 4: Grasp Planning
1: Given: Initial object pose $p_o^i$, final object pose $p_o^f$, initial left palm pose $p_{p,l}^i$, initial right palm pose $p_{p,r}^i$
2: procedure
3: $\tau_{o,1}, \tau_{p,1}$ = MakeContact($p_o^i$, $p_{p,l}^i$, $p_{p,r}^i$)
4: $\tau_{o,2}, \tau_{p,2}$ = LiftObject($\tau_{o,1}$, $\tau_{p,1}$, $p_o^f$)
5: $\tau_{o,3}, \tau_{p,3}$ = RotateObject($\tau_{o,2}$, $\tau_{p,2}$, $p_o^f$)
6: $\tau_{o,4}, \tau_{p,4}$ = PlaceObject($\tau_{o,3}$, $\tau_{p,3}$, $p_o^f$)
7: $\tau_{o,5}, \tau_{p,5}$ = ReleaseContact($\tau_{o,4}$, $\tau_{p,4}$)
8: return Object trajectory $\tau_o$ = Concat($\tau_{o,1}, \ldots, \tau_{o,5}$),
9: Palms trajectory $\tau_p$ = Concat($\tau_{p,1}, \ldots, \tau_{p,5}$)

Algorithm 5: Push Planning
1: Given: Initial object pose $p_o^i$, final object pose $p_o^f$, initial left palm pose $p_{p,l}^i$, initial right palm pose $p_{p,r}^i$, robot arm $A$
2: procedure
3: $\tau_{o,1}, \tau_{p,1}$ = MakeContact($p_o^i$, $p_{p,l}^i$, $p_{p,r}^i$, $A$)
4: $\tau_{o,2}, \tau_{p,2}$ = DubinsPush($\tau_{o,1}$, $\tau_{p,1}$, $p_o^f$, $A$)
5: $\tau_{o,3}, \tau_{p,3}$ = ReleaseContact($\tau_{o,2}$, $\tau_{p,2}$, $A$)
6: return Object trajectory $\tau_o$ = Concat($\tau_{o,1}, \ldots, \tau_{o,3}$),
7: Palms trajectory $\tau_p$ = Concat($\tau_{p,1}, \ldots, \tau_{p,3}$)

The grasp primitive trajectory planner is detailed in Algorithm 4. First, the function MakeContact brings the palms to their respective initial poses $p_{p,l}^i$ and $p_{p,r}^i$. Prior to performing the desired transformation bringing the object from $p_o^i$ to $p_o^f$, the function LiftObject elevates the object so that it does not collide with the table. The function RotateObject rotates the object such that it reaches the target orientation of $p_o^f$. The object and robot palm trajectories $\tau_{o,3}$ and $\tau_{p,3}$ are generated using spherical linear interpolation (SLERP). Finally, the function PlaceObject places the object on the table at the target pose $p_o^f$.
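The interpolation step used by RotateObject can be sketched as follows. This is a minimal, standard SLERP between unit quaternions, not the thesis implementation; the quaternion layout `[x, y, z, w]` and the helper `interpolate_rotation` are assumptions introduced for illustration.

```python
import math


def slerp(q0, q1, t):
    """Spherical linear interpolation between unit quaternions [x, y, z, w]."""
    dot = sum(a * b for a, b in zip(q0, q1))
    # Take the shorter arc: q and -q encode the same rotation.
    if dot < 0.0:
        q1 = [-c for c in q1]
        dot = -dot
    dot = min(1.0, dot)
    theta = math.acos(dot)
    if theta < 1e-8:
        # Nearly identical rotations: linear interpolation is accurate enough.
        q = [(1.0 - t) * a + t * b for a, b in zip(q0, q1)]
    else:
        s0 = math.sin((1.0 - t) * theta) / math.sin(theta)
        s1 = math.sin(t * theta) / math.sin(theta)
        q = [s0 * a + s1 * b for a, b in zip(q0, q1)]
    norm = math.sqrt(sum(c * c for c in q))
    return [c / norm for c in q]


def interpolate_rotation(q_start, q_goal, n_steps):
    """Return n_steps + 1 orientations from q_start to q_goal (inclusive)."""
    return [slerp(q_start, q_goal, k / n_steps) for k in range(n_steps + 1)]
```

Because SLERP moves along the geodesic at constant angular velocity, the resulting orientation waypoints are evenly spaced in rotation angle, which is convenient when the palm trajectory is later passed to the joint-space motion planner.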

The trajectory generation algorithm for the push primitive is detailed in Algorithm 5. The robot first contacts the object with a line contact at the contact locations $p_{p,l}^i$ and $p_{p,r}^i$ with the robot arm $A$. The trajectory of the object/contacts from $p_o^i$ to $p_o^f$ is generated using the Dubins car planner introduced in [109], where pushing motions are treated as a kinematic problem by restricting object motions to remain within the pusher motion cone. Under the assumption of pure sticking interactions, planar pushing trajectory planning reduces to a Dubins car problem where the sticking contact constraints translate to bounded curvature. The Dubins car planner computes a time-optimal trajectory relating the initial and final object configurations via a single push.

Algorithm 6: Pull Planning
1: Given: Initial object pose $p_o^i$, final object pose $p_o^f$, initial left palm pose $p_{p,l}^i$, initial right palm pose $p_{p,r}^i$, robot arm $A$
2: procedure
3: $\tau_{o,1}, \tau_{p,1}$ = MakeContact($p_o^i$, $p_{p,l}^i$, $p_{p,r}^i$, $A$)
4: $\tau_{o,2}, \tau_{p,2}$ = InterpolatePath($\tau_{o,1}$, $\tau_{p,1}$, $p_o^f$, $A$)
5: $\tau_{o,3}, \tau_{p,3}$ = ReleaseContact($\tau_{o,2}$, $\tau_{p,2}$, $A$)
6: return Object trajectory $\tau_o$ = Concat($\tau_{o,1}, \ldots, \tau_{o,3}$),
7: Palms trajectory $\tau_p$ = Concat($\tau_{p,1}, \ldots, \tau_{p,3}$)

Algorithm 7: Pivot Planning
1: Given: Initial object pose $p_o^i$, final object pose $p_o^f$, initial left palm pose $p_{p,l}^i$, initial right palm pose $p_{p,r}^i$, robot arm $A$
2: procedure
3: $\tau_{o,1}, \tau_{p,1}$ = MakeContact($p_o^i$, $p_{p,l}^i$, $p_{p,r}^i$, $A$)
4: $\tau_{o,2}, \tau_{p,2}$ = PivotObject($\tau_{o,1}$, $\tau_{p,1}$, $p_o^f$, $A$)
5: $\tau_{o,3}, \tau_{p,3}$ = ReleaseContact($\tau_{o,2}$, $\tau_{p,2}$, $A$)
6: return Object trajectory $\tau_o$ = Concat($\tau_{o,1}, \ldots, \tau_{o,3}$),
7: Palms trajectory $\tau_p$ = Concat($\tau_{p,1}, \ldots, \tau_{p,3}$)

In Algorithm 6, we describe the planning procedure for the pull primitive. Given the palm contact locations $p_{p,l}^i$ and $p_{p,r}^i$, the robot arm $A$ makes contact with the object such that the palm contacts the object normal to the contacted face. Once contact is made, the function InterpolatePath interpolates the object/contact trajectories from $p_o^i$ to $p_o^f$.

The pivot trajectory planning framework is detailed in Algorithm 7. The function PivotObject rotates the object about its corner while checking that the contacts can apply sufficient force to enforce static equilibrium. A precondition of this primitive is that the transformation from $p_o^i$ to $p_o^f$ can be achieved through a pure rotation about an object corner making contact with the ground. This precondition is discussed further in Section 4.7.2.

4.7.2 High-Level Planning

Given a desired object transformation, it is unlikely that a single manipulation primitive will be able to achieve it, because many primitives can only achieve limited object reconfigurations. For example, the push and pull primitives can only perform object transformations in $SE(2)$, where the object's surface contact with the table remains unchanged. Similarly, the pivot primitive can only perform pure object rotations. While grasp could achieve any object reconfiguration in theory, in practice the required robot motions are often large and run into collisions with the environment or infeasible inverse kinematics. To achieve general object transformations, manipulation primitives must be combined.

We formulate the sequencing of manipulation primitives as a graph search problem. We adapt the regrasp graphs developed in [100, 41] to include a broader set of manipulation primitives. We decouple the search into finding 1) a sequence of stable placements from the object's initial to final pose and 2) the sequence of manipulation primitives to achieve the desired pose. This separation of the problem into two subgoals allows us to search through a reduced state space, permitting faster planning.

1. Object Stable Placement. Given the convex hull $\hat{O}$ of the triangulated object mesh $O$ and the planar environment $E$, we find the set of stable placements $P$, defined as the ways in which the object can stably rest on the table. We find the set of stable placements by projecting the center of gravity of the convex hull of the object onto the plane defined by each face. If the projected point falls within the considered face, that face is marked as a valid stable placement on which the object can rest in equilibrium. Given a nominal object pose in free space $p_o^m$, with associated $4 \times 4$ transformation matrix $T_o^m$, each stable placement can be expressed as the rigid body transformation $T_o^{p_n} = {}^{p_n}T_m \, T_o^m$, where the subscript $p_n \in P$ denotes the $n$th stable placement and ${}^{p_n}T_m$ is the transformation matrix bringing the nominal pose $p_o^m$ to the stable placement pose $p_o^{p_n}$. For convenience, we introduce the rigid-body transformation operator $\mathcal{T}$ that operates directly on poses, such that $p_o^{p_n} = {}^{p_n}\mathcal{T}_m(p_o^m)$ is equivalent to the operation $T_o^{p_n} = {}^{p_n}T_m \, T_o^m$.

The first step of the high-level planning algorithm is to find a sequence of manipulation primitives that brings the object from its initial stable placement associated with $p_o^i$ to its final stable placement associated with $p_o^f$. This requires searching through the primitives capable of achieving $SE(3)$ transformations, namely grasp and pivot.

2. Object Pose. Once the object has been manipulated to its target stable placement, we search for manipulation primitives to bring it to the specified final pose $p_o^f$. This requires searching through the primitives capable of achieving $SE(2)$ transformations, namely push and pull.
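The stable placement test in item 1 can be illustrated with a short Python sketch. It assumes the convex hull is already available as a list of vertices and a list of convex faces (vertex indices in order around each face), and that the center of mass is given; the function names and the tetrahedron example are hypothetical and chosen only to show a case where one face is rejected.

```python
def sub(a, b):
    return [x - y for x, y in zip(a, b)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def project_onto_plane(point, origin, normal):
    """Orthogonal projection of point onto the plane through origin."""
    scale = dot(sub(point, origin), normal) / dot(normal, normal)
    return [p - scale * n for p, n in zip(point, normal)]


def inside_convex_face(point, corners, normal, tol=1e-9):
    """True if an in-plane point lies inside the convex polygon 'corners'."""
    signs = []
    for i, a in enumerate(corners):
        b = corners[(i + 1) % len(corners)]
        signs.append(dot(cross(sub(b, a), sub(point, a)), normal))
    # The same sign on every edge means the point is on one side of all edges.
    return all(s >= -tol for s in signs) or all(s <= tol for s in signs)


def stable_faces(vertices, faces, center_of_mass):
    """Indices of hull faces on which the object can rest in equilibrium."""
    stable = []
    for index, face in enumerate(faces):
        corners = [vertices[i] for i in face]
        normal = cross(sub(corners[1], corners[0]), sub(corners[2], corners[0]))
        foot = project_onto_plane(center_of_mass, corners[0], normal)
        if inside_convex_face(foot, corners, normal):
            stable.append(index)
    return stable


# Hypothetical example: a tetrahedron whose apex overhangs its base.
vertices = [[0, 0, 0], [1, 0, 0], [0, 1, 0], [2, 2, 1]]
faces = [[0, 1, 2], [0, 1, 3], [0, 2, 3], [1, 2, 3]]
# For a uniform tetrahedron the center of mass is the vertex average.
center_of_mass = [sum(v[k] for v in vertices) / 4.0 for k in range(3)]
print(stable_faces(vertices, faces, center_of_mass))
```

In this example the base face (index 0) is rejected because the projected center of mass falls outside the triangle, so the object would topple if placed on it. With a triangulated hull, coplanar triangles of the same physical face are tested separately in this sketch; merging them first would be a natural refinement.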

Placement Manipulation Planning. We define the search for manipulation primitives bringing the object from the initial object stable placement to the target stable placement as a graph search problem. The nodes of the manipulation graph are defined as stable placements of the object on the table with a specified contact configuration. The edges of the graph represent manipulation primitive actions, able to transform the object from one stable placement to another.

For each primitive $a$, we sample a set of $S_a$ candidate dual palm poses $\mathcal{S}_a = \{\{p_{a_1,l}, p_{a_1,r}\}, \{p_{a_2,l}, p_{a_2,r}\}, \ldots, \{p_{a_{S_a},l}, p_{a_{S_a},r}\}\}$, where the subscripts $l$ and $r$ denote the poses of the left and right palms relative to the object.

For grasp, we uniformly sample candidate contact points on each face of the object mesh $O$ and find a bipolar grasp by projecting the sample along the face normal until it intersects with an opposing face. We verify that the candidate grasp satisfies force-closure by checking that the angle between the face pairs is within a specified tolerance. Once we construct a set of $S_g$ valid bipolar contact points $\mathcal{C}_g = \{\{c_{g_1,l}, c_{g_1,r}\}, \{c_{g_2,l}, c_{g_2,r}\}, \ldots, \{c_{g_{S_g},l}, c_{g_{S_g},r}\}\}$, we further sample each contact point by considering $M_g$ palm pose orientations, each obtained by rotating the palm about the contact face's normal axis from $0$ to $2\pi$. The valid grasp pose samples, resolved in the nominal object pose $p_o^m$, are denoted as

$\mathcal{S}_g = \{\{p_{g_1,l}, p_{g_1,r}\}, \{p_{g_2,l}, p_{g_2,r}\}, \ldots, \{p_{g_{\bar{S}_g},l}, p_{g_{\bar{S}_g},r}\}\}$, where $\bar{S}_g = S_g \times M_g$.
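The grasp sampling procedure can be sketched as below. The sketch assumes that each candidate contact pair already comes with the unit outward normals of its two faces, and that the $M_g$ palm orientations are evenly spaced over $[0, 2\pi)$; both the data layout and the even spacing are assumptions of this example, since the text only specifies a rotation about the face normal from $0$ to $2\pi$.

```python
import math


def normals_oppose(n_left, n_right, tol_rad):
    """True if the two unit face normals are antiparallel within tol_rad."""
    cos_angle = sum(a * b for a, b in zip(n_left, n_right))
    cos_angle = max(-1.0, min(1.0, cos_angle))
    # Angle between n_left and -n_right; zero means perfectly opposing faces.
    return math.pi - math.acos(cos_angle) <= tol_rad


def palm_yaw_angles(m_g):
    """M_g evenly spaced rotations about the contact normal in [0, 2*pi)."""
    return [2.0 * math.pi * k / m_g for k in range(m_g)]


def grasp_samples(contact_pairs, m_g, tol_rad):
    """Expand every valid contact pair into M_g (left, right, yaw) samples.

    contact_pairs holds tuples (c_left, n_left, c_right, n_right), where the
    c's are contact points and the n's are unit outward face normals.
    """
    samples = []
    for c_left, n_left, c_right, n_right in contact_pairs:
        if not normals_oppose(n_left, n_right, tol_rad):
            continue  # Face pair is too far from antipodal to be accepted.
        for yaw in palm_yaw_angles(m_g):
            samples.append((c_left, c_right, yaw))
    return samples
```

With $S_g$ accepted contact pairs, the function returns $S_g \times M_g$ samples, matching the count $\bar{S}_g$ above. Each `(left, right, yaw)` triple would still need to be converted into full palm poses, which is omitted here.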

For pivot, we determine the set of samples by first enumerating the ways in which the object can be pivoted from one face to another. For each stable placement $p_n \in P$ of the object mesh $O$, we denote its $N$ neighbor stable placements as $\mathcal{N} = \{p_1, p_2, \ldots, p_N\}$. When pivoting an object, there are three contact surfaces of interest: the ground face, the pivot face, and the anchor face. The ground face denotes the stable placement face on which the object initially rests prior to the manipulation. The pivot face, shown by the right palm in Fig. 4-3(c), is the neighbor face to the ground face on which the palm contact drives the pivoting motion. Finally, the anchor face, shown by the left palm in Fig. 4-3(c), is the face on which the palm makes contact to prevent the object from falling during the pivoting motion. For every candidate pivot transformation from the stable placement $p_n \in P$ to the anchor face $p_j \in \mathcal{N}$, we set the anchor palm to contact the anchor face's edge whose center point has the largest $z$ value and is within a specified angle tolerance relative to the ground. The pivot contact point is set following a similar procedure with respect to the pivot face. The pivot face is determined by projecting the center of the anchor face along its face normal and identifying the object stable placement with which it intersects. Once we determine the set of valid pivoting contact locations, we further sample each anchor contact point by considering $M_{p,l}$ orientations about the anchor face contact's edge and $M_{p,r}$ orientations about the pivot face contact's edge. The pivot pose samples, resolved in the nominal object pose $p_o^m$, are denoted as

$\mathcal{S}_p = \{\{p_{p_1,l}, p_{p_1,r}\}, \{p_{p_2,l}, p_{p_2,r}\}, \ldots, \{p_{p_{\bar{S}_p},l}, p_{p_{\bar{S}_p},r}\}\}$, where $\bar{S}_p = S_p \times M_{p,l} \times M_{p,r}$.

The complete set of sampled contact poses is $\mathcal{S} = \mathcal{S}_g \cup \mathcal{S}_p$.

Figure 4-7: (a) Stable Placement Manipulation. The first step of the high-level planning algorithm is to find a sequence of manipulation primitives that brings the object from its initial stable placement associated with $p_o^i$ to its final stable placement associated with $p_o^f$. This requires searching through the primitives capable of achieving $SE(3)$ transformations, namely grasp and pivot. (b) Object Pose Manipulation. Once the object has been manipulated to its target stable placement, we search for manipulation primitives to bring it to the specified final pose $p_o^f$. This requires searching through the primitives capable of achieving $SE(2)$ transformations, namely push and pull. We decouple the search into finding 1) a sequence of stable placements from the object's initial to final pose and 2) the sequence of manipulation primitives to achieve the desired pose. This separation of the problem into two subgoals allows us to search through a reduced state space, permitting faster planning.

Above, we described the sampling procedure for the grasp and pivot primitives. All samples are generated with the object in free space at the nominal pose $p_o^m$. When constructing the manipulation graph, we transform the object along with the palm samples to each object stable placement. For grasp, for example, this leads to $|P|$ sets of grasp samples, where the samples for the stable placement $p_n$ are denoted as

$\mathcal{P}_{g,p_n} = \{\{{}^{p_n}\mathcal{T}_m(p_{g_1,l}), {}^{p_n}\mathcal{T}_m(p_{g_1,r})\}, \{{}^{p_n}\mathcal{T}_m(p_{g_2,l}), {}^{p_n}\mathcal{T}_m(p_{g_2,r})\}, \ldots, \{{}^{p_n}\mathcal{T}_m(p_{g_{\bar{S}_g},l}), {}^{p_n}\mathcal{T}_m(p_{g_{\bar{S}_g},r})\}\}$.

We verify the feasibility of each contact sample by performing a collision check between the palm poses and the table. If the samples are feasible, they are added as nodes in the manipulation graph. The same procedure is used for pivot. For both grasp and pivot, we consider two graph edge types, i.e., ways of connecting the nodes.

• Object manipulation. The robot manipulates the object from one stable placement to another. We add this edge to the manipulation graph if the palms do not collide with the table at both the initial and final object configurations.

• Contact reconfiguration. The robot releases its grasp on the object and finds a new contact configuration.

Algorithm 8: Placement manipulation planning
1: Given: Object mesh $O$, gripper mesh $G$, environment mesh $E$, initial pose $p_o^i$, final pose $p_o^f$
2: procedure
3: $\hat{O} \leftarrow$ ConvexHull($O$)
4: $P \leftarrow$ GetPlacements($\hat{O}$)
5: $\mathcal{S} \leftarrow$ SampleObject($O$)
6: for each $p_n \in P$ do
7: ${}^{p_n}T_m \leftarrow$ GetPlacementTransform($p_n$)
8: $\mathcal{P}_{p_n} \leftarrow$ TransformSamples(${}^{p_n}T_m$, $\mathcal{S}$)
9: $\mathcal{P}_{p_n} \leftarrow$ CheckCollisions($\mathcal{P}_{p_n}$, $E$)
10: $\mathcal{G} \leftarrow$ BuildGraph($\mathcal{P}_{p_1}, \mathcal{P}_{p_2}, \ldots, \mathcal{P}_{p_N}$)
11: path $\leftarrow$ Dijkstra($\mathcal{G}$, $p_o^i$, $p_o^f$)
12: return Sequence of primitives and contact poses path

We search for the shortest path within the constructed graph using Dijkstra's algorithm that will bring the object from the initial placement to the target placement. The high-level task planner algorithm is detailed in Algorithm 8. The algorithm returns path, a dictionary that contains the sequence of manipulation primitives, the sequence of placement transformations, and the sequence of end effector contact locations. The variable path details how to manipulate the object to the target stable placement, as shown in Fig. 4-7(a). Once the object has been manipulated to its target placement, we append an $SE(2)$ (pull/push) primitive to the plan to bring the object to the final target pose, as shown in Fig. 4-7(b). For both pull and push, we sample four different end effector poses and compute the trajectories using Algorithms 5 and 6. We select the trajectory with the shortest Euclidean distance.
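The graph search can be illustrated with a small Python sketch of Dijkstra's algorithm over a toy manipulation graph. The node labels (placement/contact-sample pairs such as `"A/g1"`), the edge costs, and the single goal node are hypothetical; the thesis does not specify the edge cost used, and in the planner the goal is a target stable placement rather than one specific node.

```python
import heapq
import itertools


def dijkstra(graph, start, goal):
    """Return (cost, plan) for the cheapest path, or (inf, []) if unreachable.

    graph maps node -> list of (neighbor, cost, primitive) edges. The plan is
    the list of (primitive, resulting node) steps from start to goal.
    """
    counter = itertools.count()  # Tie-breaker so the heap never compares plans.
    queue = [(0.0, next(counter), start, [])]
    visited = set()
    while queue:
        cost, _, node, plan = heapq.heappop(queue)
        if node == goal:
            return cost, plan
        if node in visited:
            continue
        visited.add(node)
        for neighbor, edge_cost, primitive in graph.get(node, []):
            if neighbor not in visited:
                step = plan + [(primitive, neighbor)]
                heapq.heappush(queue, (cost + edge_cost, next(counter), neighbor, step))
    return float("inf"), []


# Toy graph: nodes are "placement/contact sample"; costs are illustrative only.
graph = {
    "A/g1": [("B/g1", 1.0, "grasp"), ("A/p1", 0.5, "reconfigure")],
    "A/p1": [("C/p1", 1.0, "pivot")],
    "B/g1": [("B/p2", 0.5, "reconfigure")],
    "B/p2": [("C/p2", 1.0, "pivot")],
}
cost, plan = dijkstra(graph, "A/g1", "C/p1")
print(cost, plan)
```

Here the two edge types of the manipulation graph appear as `"grasp"`/`"pivot"` edges (object manipulation between placements) and `"reconfigure"` edges (contact reconfiguration at a fixed placement). Returning the primitive labels along with the nodes mirrors the role of the `path` dictionary described above.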

4.8 Results

We validate our approach to tactile dexterity on an ABB YuMi dual-arm robot where we evaluate the ability of the tactile controller to handle external perturbations on a table top manipulation task.

[Figure 4-8 plots: horizontal axis Time (s); annotations Nominal pose, Disturbance, Slip, Stick.]

Figure 4-8: Closed-loop evaluation of the tactile controller. We consider the task of maintaining the object in a stationary pose under external perturbations. When perturbed, the contact state controller increases the normal force on the object to prevent slippage while the object state controller replans robot motions to bring the object back to its nominal pose.

Figure 4-1 shows snapshots of an experiment where the robot manipulates an object from the initial pose $q_0 = [0.3, -0.2, 0.07, 0.38, 0.60, 0.60, 0.38]^T$ to the target pose $q_f = [0.45, 0.3, 0.045, 0.0, 0.71, 0.0, 0.71]^T$. To achieve the task, the robot follows the sequence: pull the object to the middle of the table, pivot the object to its target placement, and push it to its target location. The initial pull primitive is necessary to move the object to a location that allows the robot to perform a pivot manoeuvre with well-defined inverse kinematics and that avoids collisions with the environment.

In Fig. 4-8, we evaluate the closed-loop performance of the tactile controller presented in Section 4.6 by quantifying its ability to handle external object perturbations during execution. We consider the regulation task of maintaining the object in a stationary pose for the pull and grasp primitives. The regulation task allows us to better visualize the reactive capabilities of the controller without any loss of generality. In each experiment, we apply two successive impulsive forces on the object and evaluate the stabilizing capabilities of the tactile controller. Figure 4-8 plots the mean and standard deviation of the error between the object's desired and measured pose for 5 consecutive experiments. In both cases, the controller quickly reacts to the disturbance by 1) detecting slippage events at the contact interfaces and 2) tracking the pose of the object using the detected object edge in the tactile signal. First, the applied normal force is increased following (4.5) in reaction to the detected slippage at the contact interface. Second, the robot replans its trajectory in real time using the tactile state estimate such that the object quickly returns to its nominal pose. During this experiment, an AprilTag is used to provide the ground truth pose of the object.

4.9 Discussion

This chapter introduces tactile dexterity, an approach to dexterous manipulation that exploits tactile sensing for reactive control. By structuring the manipulation problem as a sequence of manipulation primitives that render interpretable tactile information, we enable tactile object state estimation, tactile object control, and robust manipulation behavior. This requires restricting the types of primitive interactions so they yield simple rigid body mechanics models and efficient planners. We show that the developed tactile controllers modulate grasp forces, track the pose of the object, and handle external perturbations by replanning in real time.

Chapter 5

Conclusion

This thesis studies the problem of reactive robotic manipulation. In particular, we develop algorithms that exploit models of frictional mechanics to design feedback controllers. In this last chapter, we discuss the key findings of our research, the limitations of the approaches introduced in this thesis, present topics of future work, and discuss open research questions.

5.1 Key Findings

Here, we present a summary of the major insights gained from our exploratory work on closing the loop in manipulation. Our key findings are:

1. Feedback design with rigid-body frictional contact interactions is naturally represented by a mixed-integer optimization program. Using a model predictive control approach combined with an integer programming formulation allows us to integrate the physical constraints and the hybrid nature of the associated frictional contact interactions.

2. It is possible to use tactile feedback to enable dexterous manipulation control by targeting specific contact interactions. Restricting contact interactions to a set of well-understood primitives and geometrically rich object features renders interpretable information that informs the design of robust manipulation control algorithms.

3. A robotic platform with dual robotic palms can exhibit dexterous manipulation skills. Dual palms offer a rich set of possible manipulation behaviors where they can 1) act as a parallel jaw gripper by aligning their motion, 2) be used as individual palms to openly reach out to sense/displace objects, and 3) exploit the full dexterity of the arms to perform in-hand manipulation.

5.2 Limitations

This thesis makes a number of assumptions motivated by our emphasis on understanding the fundamental challenges in closing the loop in robotic manipulation. The limitations of our work are:

• The analysis and experimental validation of feedback controllers are limited to motions in the plane. In Chapter 3, we explore the planar manipulation setting in depth to study the core theoretical difficulties of the problem. We limit the scope of our analysis to planar pushing interactions and leave validation on higher-dimensional systems to future work. In Chapter 4, all primitives involve motions on a planar subspace of $SE(3)$.

• The feedback control algorithms assume full object state feedback. We develop control algorithms that rely on tracking the object state using visual markers (Chapter 3) and using tactile feedback (Chapter 4). This approach requires knowledge of the object geometry and does not clearly generalize to the case of deformable objects or objects in clutter.

• Visual and tactile feedback are studied in isolation. This thesis explores feedback strategies relying on either visual or tactile feedback. To improve performance and robustness, these approaches could be fused together to provide a richer set of observations.

• The manipulation primitives are hand designed using human intuition. In Chapter 4, we rely on human intuition to develop a library of manipulation primitives that satisfy two criteria: interpretable tactile feedback and ease of planning.

• Tasks assume full knowledge of robot, object, and environment properties. Chapters 3 and 4 study reactive manipulation in a structured setting, where all geometric and physical properties of the system are known.

5.3 Future Work

The research presented in this thesis offers many promising avenues for future research. In this section, we highlight a few natural extensions of our work:

• Extend the controller designs to systems with multi-contact interactions. The application of feedback control strategies to systems with multi-contact interactions is a topic of future work. Examples include manipulation primitives such as pivoting, prehensile pushing, finger gaiting, rolling, etc. A possible practical approach to plan and control dexterous manipulation tasks is to 1) plan for sticking contact interactions and 2) design feedback controllers that use contact models with smoothed stick/slip transitions to enable sliding behavior at contact when necessary or beneficial.

• Fuse visual and tactile feedback for object pose estimation. Combining touch with vision is a natural extension of this research that would allow the robot to recover from larger object disturbances and improve the object state controller presented in Chapter 4. We envision vision playing a key role in localizing the object and tactile sensing being the feedback source for controlling the desired contact state.

• Develop rich dual-arm manipulation primitives. Developing manipulation primitives with richer contact interactions (contact switches, stick/slip interactions, etc.) that allow more expressive behavior is a direction of future work. This line of research will require control architectures that can reason across hybrid contact switches. Another interesting avenue is to learn manipulation primitives from experience, thus removing the need for human intuition.

• Integrate Task-and-Motion Planning with Tactile Dexterity. This thesis formulates long-term manipulation planning as a graph search problem that is tailored to an environment with a flat surface. To deal with more complex environments and tasks (e.g., pick up the box and put it on a shelf, or inside a closed drawer), the system would benefit from a high-level task planner that can sequence manipulation primitives, each with its own reachability and preconditions, as is commonly done in task and motion planning (TAMP). Ideally, these planners would be reactive and allow the robot to switch primitives in response to unexpected deviations.

5.4 Closing Thoughts

In this last section, I present open questions that have motivated much of my work and are important to advance the field of robotic manipulation.

5.4.1 What is the right tradeoff between model complexity and computational efficiency?

A mechanics model that can accurately predict real-world interactions does not necessarily translate to a practical model for closed-loop control. A key finding of this thesis is that simple contact models that assume rigid body interactions with point/line contacts and Coulomb friction are expressive enough to design robust controllers. This result is meaningful, as the Coulomb frictional model, which assumes rigid-body interactions, is known to be an approximation of rigid-body frictional interactions [103, 28]. Given this observation and the complexity of deriving controllers for the rigid body model (Chapters 3 and 4), a natural question is "are there alternate frictional contact models that are also approximate, yet more amenable and effective for control than the rigid body model?" Less expressive models can be more amenable to controller synthesis.

In Appendix A, we present a preliminary exploration of the feasibility of using smoothed contact models for reactive control purposes. We investigate the performance of using a data-driven model parametrized as a Gaussian Process to capture the stick-slip planar pushing interactions for controller design, where the non-smoothness of the stick-slip transition has been smoothed by the Gaussian Process. The experimental results show that data-driven models have the ability to control planar pushing interactions. We hypothesize that the smooth data-driven model is able to reason through the complexity of stick/slip interactions because the motion model is formulated in velocity space rather than in force space, which presents smoother transition functions, as shown in Fig. A-2. While we show that a smoothed function approximator might be appropriate to capture the nature of stick/slip interactions for planar point pushing, we do not establish whether this method extends to making and breaking of contacts or finger gaiting, for which hybridness might carry a larger complexity.

While this offers a promising methodology to control a class of contact interactions, the development of algorithms for control of analytical models remains an important topic, as it allows generalization across a wide range of dynamical systems driven by frictional interactions and does not require expensive data collection. A more in-depth investigation of the applicability and performance of using smoothed contact models for controller design remains an interesting topic of future work.

5.4.2 Combining Model-Based and Data-Driven Approaches?

Incredible advances in machine learning and computer vision have recently changed the landscape of robotic manipulation. These advances enable new tools for the robotic manipulation community (object segmentation, object feature tracking, keypoint trackers, etc.) that are in many aspects complementary to the contributions of this thesis. While this thesis largely focuses on controlling the manipulation process assuming knowledge of the object pose, these advances develop tools with the potential to replace external tracking systems with affordable sensor modalities (RGB-D cameras, contact sensors, etc.).

Another major area of recent development has been the application of reinforcement learning algorithms in robotic manipulation. Of particular interest are model-based reinforcement learning algorithms that typically involve 1) learning the dynamic model from data and 2) using a model-based approach for control (e.g., MPC, Differential Dynamic Programming (DDP), etc.). Given that we have the ability to derive the motion equations of multi-body interactions from first principles, why is it that researchers opt to learn a model from data instead? I believe that there are three reasons accounting for this. First, the nature of the sensors of interest (observations) does not always permit exploiting knowledge of physics for feedback controller design (e.g., camera pixels, tactile measurements, etc.). Second, given enough data, data-driven models can potentially outperform physics models for controller design. Finally, the compact mathematical form of the physics-based equations is not easily amenable to feedback controller design. This last point represents an important bottleneck. Due to this, it has become common to use a physics simulator to generate samples to learn a representation of a motion model for control. The development of a general framework for controller design of physical interactions remains a fundamental open problem to the robotics community.

5.4.3 Towards Achieving General Manipulation Capabilities

In Chapter 4, we show that by carefully targeting contacts with the object, we can solve certain planning, control, and state estimation problems. We show that it is especially relevant when integrating sensor modalities that render partial observations of the world, such as tactile feedback. This is achieved by defining manipulation primitives as a way of transferring human knowledge to the system. We believe that this approach is appropriate for robotic manipulation due to the complexity of the general problem. Developing planning and control algorithms that are general to any end effector, object shape, and environment requires searching over very large state and action spaces while reasoning about the non-smooth and underactuated dynamics of frictional interaction. Decomposing manipulation plans into sequences of manipulation primitives with simpler mechanics offers many advantages. First, this divide-and-conquer approach permits fully exploiting the known dynamics and constraints associated with a particular primitive. Second, developing closed-loop control algorithms on physical systems requires iterative development and validation, which is facilitated when working with primitives in isolation. Finally, the segmentation of robotic manipulation into simpler manipulation behaviors presents an opportunity to combine model-based and model-free controller design approaches.

Appendix A

Data-Driven Control of Planar Manipulation

In contrast to traditional model-based control approaches, a growing trend in the robotics community is to learn control policies directly from experience, which typically relies on a very large quantity of training data to achieve good performance. This section explores the data complexity required to control manipulation tasks with a model-based approach, where the model is learned from data.

Figure A-1: Planar pushing system with world frame $F_a$ and body frame $F_b$. We denote the side length of the square as $a$.

As investigated in Section 3.5.2, the discontinuous constraints in the dynamics lead to a hybrid system that makes controller design challenging. This section aims to explore the feasibility of using a data-driven model as an alternative to the analytical model for controller design.

Figure A-1 illustrates the planar pushing system, where $x$, $y$, and $\theta$ denote the position of the geometric center of the object and its orientation in the world frame. The term $p_y$ denotes the tangential distance between the pusher and the center-line of the object in the body frame.

We consider the data-driven approach proposed by [6] that has been shown to capture the complex frictional interactions between pusher, object, and support surface.

[6] show that as few as 100 samples are enough to train a Gaussian process (GP) to surpass the accuracy of the analytical model. A GP is specified by a mean function $\mu(x)$, such that $\mu(x)$ is the mean of $f(x)$, and a covariance/kernel function $k(x, x')$, such that $k(x, x')$ is the covariance between $f(x)$ and $f(x')$. Given a dataset of observed inputs $\{x_0, \ldots, x_n\}$ and corresponding output values $\{f(x_0), \ldots, f(x_n)\}$, we compute the posterior over the value $f(x_*)$ at a new query $x_*$ as:

$$f(x_*) \sim \mathcal{N}\left(\mu(x_*), \sigma^2(x_*)\right), \tag{A.1}$$

with

$$\mu(x_*) = k(x_*) K^{-1} y, \tag{A.2}$$

and

$$\sigma^2(x_*) = k(x_*, x_*) - k(x_*) K^{-1} k(x_*)^T, \tag{A.3}$$

where $k(x_*)$ represents the vector $[k(x_*)]_i = k(x_i, x_*)$, $K$ is a matrix such that $[K]_{ij} = k(x_i, x_j)$, and $y$ is the vector of observed outputs. We train one GP for each output using a zero-mean prior and the Automatic Relevance Determination (ARD) squared exponential kernel function

$$k(x_1, x_2) = \sigma_f^2 \exp\left(-(x_1 - x_2)^T \Lambda^{-1} (x_1 - x_2)\right),$$

where $\sigma_f^2$ is the signal variance and $\Lambda$ is a diagonal matrix with the estimated characteristic lengths of each input dimension [81]. When learning a GP, the hyperparameters $\sigma_f$ and $\Lambda$ are optimized while the other variables can be directly computed from the training data. To collect data, the robot executes pushes with random initial contact position $p_y$ and direction $\phi$ as in [104]. The angle $\phi$ describes the orientation of the pusher relative to the object body frame, where $\tan\phi = v_t / v_n$, and $v_n$, $v_t$ denote the normal and tangential components of the pusher velocity in the body frame.
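As an illustration of Eqs. (A.1)-(A.3), the following minimal NumPy sketch computes the posterior mean and variance with the ARD squared exponential kernel written above. The function names, the small `jitter` term added to the diagonal of $K$ for numerical stability, and the fixed hyperparameter values are assumptions made for this example. They are not taken from the implementation of [6], and hyperparameter optimization is omitted.

```python
import numpy as np

def ard_se_kernel(X1, X2, signal_var, lengthscales):
    """ARD squared exponential kernel in the form used in the text:
    k(x1, x2) = signal_var * exp(-(x1 - x2)^T Lambda^{-1} (x1 - x2)),
    with Lambda = diag(lengthscales)."""
    lam = np.asarray(lengthscales, dtype=float)
    diff = X1[:, None, :] - X2[None, :, :]          # pairwise differences
    return signal_var * np.exp(-np.sum(diff**2 / lam, axis=-1))

def gp_posterior(X, y, X_star, signal_var, lengthscales, jitter=1e-8):
    """Posterior mean (A.2) and variance (A.3) of a zero-mean GP at X_star."""
    K = ard_se_kernel(X, X, signal_var, lengthscales)
    K = K + jitter * np.eye(len(X))                 # assumption: numerical jitter
    k_star = ard_se_kernel(X_star, X, signal_var, lengthscales)   # shape (m, n)
    mean = k_star @ np.linalg.solve(K, y)
    # k(x*, x*) equals signal_var for this kernel
    var = signal_var - np.sum(k_star * np.linalg.solve(K, k_star.T).T, axis=1)
    return mean, var

# Hypothetical toy data: inputs [p_y, phi], one scalar output per sample.
X_train = np.array([[0.0, 0.0], [0.5, 0.2], [1.0, -0.3]])
y_train = np.array([0.1, -0.2, 0.3])
mean, var = gp_posterior(X_train, y_train, np.array([[0.25, 0.1]]),
                         signal_var=1.0, lengthscales=[0.5, 0.5])
```

Solving the linear system with `np.linalg.solve` rather than forming $K^{-1}$ explicitly is a common numerical choice. Following the text, one such model would be fit per output dimension.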

The learning problem is defined as:

Inputs: $x = [p_y \;\; \phi]^T$, as defined above.

Outputs: $f(x) = \Delta \mathbf{x}_b = [\Delta x_b \;\; \Delta y_b \;\; \Delta\theta_b]^T$, where $\Delta x_b$, $\Delta y_b$, and $\Delta\theta_b$ represent the displacement of the object's center and change in orientation in the body frame for the duration of the push $\Delta t$. Figure A-2 shows the model obtained for $\Delta\theta_b$ depending on the training datapoints.

By leveraging the quasi-static assumption, which neglects inertial effects, the model is learned for a predetermined velocity $v_0$ and scaled proportionally with the velocity of the pusher to recover the object velocity in the body frame as

$$\dot{\mathbf{x}}_b = \frac{\|v_p\|}{\|v_0\| \, \Delta t} \Delta \mathbf{x}_b,$$

where $v_p = [v_n \;\; v_t]^T$ is the pusher velocity in the body frame.
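The quasi-static scaling can be sketched in a few lines. This is a sketch under stated assumptions: the learned model is taken to return a body-frame displacement for a push of duration `dt` at reference speed `v0_speed`, and only the magnitude of the pusher velocity is rescaled, the push direction being already encoded in the model input. The function name and the numerical values are hypothetical.

```python
import numpy as np

def object_velocity_body(delta_xb, v_pusher, v0_speed, dt):
    """Convert a displacement learned at reference speed v0_speed over a push
    of duration dt into an object velocity at the current pusher speed."""
    speed = np.linalg.norm(v_pusher)                 # magnitude of [v_n, v_t]
    return (speed / v0_speed) * np.asarray(delta_xb, dtype=float) / dt

# Hypothetical numbers: displacement [dx_b, dy_b, dtheta_b] predicted by the
# model for a 0.2 s push at 0.05 m/s, with the pusher now moving at 0.1 m/s.
xb_dot = object_velocity_body([0.01, 0.002, 0.05],
                              v_pusher=[0.1, 0.0], v0_speed=0.05, dt=0.2)
```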

Figure A-2: Training data (dots) and learned model for the object's change in orientation, $\Delta\theta_b$ (rad), plotted against the normalized contact position $p_y/a$. From left to right, the number of datapoints is 10, 100, and 1000. We observe that the model complexity and accuracy increase with the number of training data.

We write the data-driven motion equations in a similar form to (3.8). The velocity of the pusher relative to the object is resolved in the body frame as

$$v_p - J_c \dot{\mathbf{x}}_b. \tag{A.4}$$

The data-driven motion equations are

$$\dot{x} = f_d(x, u_d), \tag{A.5}$$

where the control input for the data-driven model is $u_d = v_p$. We control the learned models with a model predictive control framework due to its flexibility and its ability to enforce state and action constraints.

MPC with Gaussian Process Model (MPC-GP): Given the current error state $\bar{x}_0$ and a nominal trajectory $(x^*, u^*)$, solve

$$\min_{\bar{x}_i, \bar{u}_i} \; \bar{x}_N^T Q_N \bar{x}_N + \sum_{i=0}^{N-1} \left( \bar{x}_{i+1}^T Q \bar{x}_{i+1} + \bar{u}_i^T R \bar{u}_i \right)$$

subject to

$$\bar{x}_{i+1} = \bar{x}_i + h \left[ A_i \bar{x}_i + B_i \bar{u}_i \right] \quad \text{(Linearized Motion Equations)}, \tag{A.6}$$

$$x_i \in \mathcal{X} \quad \text{(State Constraints)},$$

$$u_i \in \mathcal{U} \quad \text{(Input Constraints)},$$

with integration time step $h$, $\bar{x}_i = x_i - x_i^*$ and $\bar{u}_i = u_i - u_i^*$. The terms $Q$, $Q_N$, and $R$ denote weight matrices associated with the error state, the final error state, and the control input. The optimization is performed by linearizing the GP about a desired nominal trajectory. The nominal trajectory is computed using the analytical model with sticking interactions as done in [109].
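To make the structure of the MPC-GP problem concrete, the sketch below solves a simplified version of it with NumPy. It is an illustration under stated assumptions, not the controller used in the experiments: the state and input set constraints are dropped so that the quadratic program reduces to a single linear solve, the linearizations $A_i$, $B_i$ are assumed to be given (for instance by differentiating the GP mean along the nominal trajectory), and the function name and toy matrices are hypothetical.

```python
import numpy as np

def solve_linearized_mpc(x0_bar, A_list, B_list, Q, QN, R, h):
    """Minimise x_N' QN x_N + sum_i (x_{i+1}' Q x_{i+1} + u_i' R u_i)
    subject to x_{i+1} = x_i + h (A_i x_i + B_i u_i).
    Assumption: the set constraints on x_i and u_i are omitted."""
    N = len(A_list)
    n, m = B_list[0].shape
    Sx = np.zeros((N * n, n))        # stacked states: X = Sx x0 + Su U
    Su = np.zeros((N * n, N * m))
    Phi = np.eye(n)
    for i in range(N):
        Ad = np.eye(n) + h * A_list[i]
        Bd = h * B_list[i]
        if i > 0:                    # propagate the effect of earlier inputs
            Su[i*n:(i+1)*n, :] = Ad @ Su[(i-1)*n:i*n, :]
        Su[i*n:(i+1)*n, i*m:(i+1)*m] = Bd
        Phi = Ad @ Phi
        Sx[i*n:(i+1)*n, :] = Phi     # row block i holds x_{i+1}
    Qbar = np.kron(np.eye(N), Q)
    Qbar[-n:, -n:] += QN             # final state carries both Q and QN
    Rbar = np.kron(np.eye(N), R)
    H = Su.T @ Qbar @ Su + Rbar
    g = Su.T @ Qbar @ (Sx @ x0_bar)
    U = np.linalg.solve(H, -g)       # stationarity condition of the QP
    X = Sx @ x0_bar + Su @ U
    return U.reshape(N, m), X.reshape(N, n)

# Hypothetical toy system (not the pusher-slider): a double integrator.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
U_opt, X_opt = solve_linearized_mpc(np.array([1.0, 0.0]), [A] * 20, [B] * 20,
                                    Q=np.diag([10.0, 1.0]),
                                    QN=np.diag([10.0, 1.0]),
                                    R=np.array([[0.1]]), h=0.1)
```

In a receding-horizon loop only the first input `U_opt[0]` would be applied before re-solving from the newly measured error state. Enforcing the state and input constraints would require a QP solver in place of the linear solve.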

The data-driven model is more amenable to MPC than the analytical model, as it presents continuous differential equations. As such, no particular care needs to be taken with regard to system hybridness and selecting mode sequences. The control input is given directly by the velocity of the pusher in the body frame of the object, $u_d = v_p$, along with the motion equations (A.5) linearized about the nominal trajectory.

To address the stochastic nature of GPs, we make use of the certainty equivalent approximation [11], which replaces random variables with their expected values during the optimization process. In the case of the pusher-slider system modeled with GPs, this implies that the mean of the dynamics is propagated forward by setting the state noise value to 0. This approximation is computationally beneficial since it converts a stochastic optimization problem into a deterministic one. It has been shown to produce good results for linear systems, where the certainty equivalence principle is optimal for systems with additive Gaussian noise.
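The certainty equivalent approximation amounts to rolling out only the mean dynamics. The sketch below assumes a forward-Euler discretization with time step `h` and a user-supplied `mean_dynamics(x, u)` function standing in for the GP posterior mean. Both names, and the toy dynamics in the usage lines, are hypothetical.

```python
import numpy as np

def rollout_certainty_equivalent(x0, controls, mean_dynamics, h):
    """Propagate x_{k+1} = x_k + h * (f(x_k, u_k) + w_k) with the noise w_k
    replaced by its expected value (zero), so only the mean is propagated."""
    xs = [np.asarray(x0, dtype=float)]
    for u in controls:
        xs.append(xs[-1] + h * mean_dynamics(xs[-1], u))
    return np.array(xs)

# Hypothetical stand-in for the learned mean dynamics.
toy_mean = lambda x, u: -x + u
trajectory = rollout_certainty_equivalent([1.0], np.zeros((3, 1)), toy_mean, h=0.1)
```

The predictive variance of the GP is simply not used in this rollout, which is what turns the stochastic problem into a deterministic one.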

In Fig. A-3, we evaluate the tracking performance of the data-driven controller presented in this appendix and compare its performance to a model-based approach. The data-driven GP model is trained with 5000 randomly collected data points as described in [103]. The control hyperparameters for the MPC-GP controller are $Q = [6000, 3000, 10, 3000]$, $Q_N = Q$, and $R = [10, 0.001]$, associated with $x = [x \;\; y \;\; \theta \;\; p_y]^T$ and $u_d = v_p$, respectively.

Table A.1 compares the performance of the analytical and the data-driven controller designs on a series of trajectory tracking problems. The performance of each controller is measured by computing the average mean squared error over the duration of the experiment between the desired trajectory and the actual motion of the object's geometric center (see Fig. A-3). We conduct four benchmark experiments for each controller: 1. low tracking velocity (20 mm/s) without external perturbations, 2. high tracking velocity (80 mm/s) without external perturbations, 3. high tracking velocity (80 mm/s) with external perturbations, and 4. square tracking at 50 mm/s. Two perturbation types are considered in order to test the robustness of the controllers: tangential and normal. Tangential perturbations are applied laterally to the motion of the object by perturbing the initial position of the contact point from its desired position. Normal disturbances are applied orthogonal to the object's motion by detaching the object away from the robotic pusher by 30 mm.
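A tracking-error metric of this kind can be sketched as follows. The text does not spell out the exact formula, so this example is only one plausible reading: it assumes time-aligned samples of the desired and measured positions of the object's center and returns both the mean Euclidean distance and the root-mean-square distance, each in the units of the input (for example mm). The function name and sample values are hypothetical.

```python
import numpy as np

def tracking_errors(desired_xy, actual_xy):
    """Return (mean distance, RMS distance) between time-aligned desired and
    actual planar positions, each given as an array of shape (T, 2)."""
    desired_xy = np.asarray(desired_xy, dtype=float)
    actual_xy = np.asarray(actual_xy, dtype=float)
    dist = np.linalg.norm(actual_xy - desired_xy, axis=1)
    return dist.mean(), np.sqrt(np.mean(dist**2))

# Hypothetical samples in mm.
mean_err, rms_err = tracking_errors([[0, 0], [1, 0], [2, 0]],
                                    [[0, 3], [1, 4], [2, 0]])
```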

Table A.1 summarizes the tracking performance results for the analytical and data-driven (5000 datapoints) controllers. Both controller designs successfully achieve closed-loop tracking within 10 mm accuracy when no external perturbations are applied.

Figure A-3: Desired trajectory (black) compared with the motion followed by the object's geometric center (blue) using the analytical controller at 80 and 50 mm/s, respectively: (a) 8-track, (b) square track. The average error is 9.00 mm for the 8-track example and 6.05 mm for the square.

Table A.1: Controller performance comparison

Trajectory                                       Error (Analytical)   Error (Data-Driven)
8-track, no perturbation, v = 80 mm/s            9.56 mm              8.50 mm
8-track, no perturbation, v = 20 mm/s            2.89 mm              6.53 mm
8-track, normal perturbation, v = 80 mm/s        11.10 mm             8.52 mm
8-track, tangential perturbation, v = 80 mm/s    12.37 mm             9.28 mm
Square trajectory, v = 50 mm/s                   4.95 mm              6.60 mm

Bibliography

[1] Bernardo Aceituno-Cabezas, Carlos Mastalli, Hongkai Dai, Michele Focchi, Andreea Radulescu, Darwin G Caldwell, José Cappelletto, Juan C Grieco, Gerardo Fernández-López, and Claudio Semini. Simultaneous contact, gait, and motion planning for robust multilegged locomotion via mixed-integer convex optimization. IEEE Robotics and Automation Letters, 3(3):2531-2538, 2017.

[2] Arash Ajoudani, Elif Hocaoglu, Alessandro Altobelli, Matteo Rossi, Edoardo Battaglia, Nikos Tsagarakis, and Antonio Bicchi. Reflex control of the Pisa/IIT SoftHand during object slippage. In IEEE International Conference on Robotics and Automation (ICRA), pages 1972-1979, 2016.

[3] Alessandro Alessio and Alberto Bemporad. Feasible mode enumeration and cost comparison for explicit quadratic model predictive control of hybrid systems. IFAC Proceedings Volumes, 39(5):302-308, 2006.

[4] Maria Bauza, Oleguer Canal, and Alberto Rodriguez. Tactile Mapping and Localization from High-Resolution Tactile Imprints. In IEEE International Conference on Robotics and Automation (ICRA), under review, 2019.

[5] Maria Bauza, Francois R Hogan, and Alberto Rodriguez. A data-efficient approach to precise and controlled pushing. In Conference on Robot Learning, pages 336-345, 2018.

[6] Maria Bauza and Alberto Rodriguez. A probabilistic data-driven model for planar pushing. In Robotics and Automation (ICRA), 2017 IEEE International Conference on, pages 3008-3015, 2017.

[7] Yasemin Bekiroglu, Dan Song, Lu Wang, and Danica Kragic. A probabilistic framework for task-oriented grasp stability assessment. In IEEE International Conference on Robotics and Automation (ICRA), pages 3040-3047, 2013.

[8] Alberto Bemporad and Manfred Morari. Control of systems integrating logic, dynamics, and constraints. Automatica, 35(3):407-427, 1999.

[9] Alberto Bemporad, Manfred Morari, Vivek Dua, and Efstratios N Pistikopoulos. The explicit linear quadratic regulator for constrained systems. Automatica, 38(1):3-20, 2002.

[10] Dmitry Berenson, Rosen Diankov, Koichi Nishiwaki, Satoshi Kagami, and James Kuffner. Grasp planning in complex scenes. In Humanoid Robots, 2007 7th IEEE-RAS International Conference on, pages 42-48, 2007.

[11] D.P. Bertsekas, J.N. Tsitsiklis, and C. Wu. Rollout algorithms for combinatorial optimization. Journal of Heuristics, 3(3):245-262, 1997.

[12] Antonio Bicchi and Vijay Kumar. Robotic grasping and contact: A review. In IEEE International Conference on Robotics and Automation (ICRA), pages 348-353, 2000.

[13] Ch Borst, Max Fischer, and Gerd Hirzinger. Grasp planning: How to choose a suitable task wrench space. In IEEE International Conference on Robotics and Automation (ICRA), volume 1, pages 319-325, 2004.

[14] Martin Buehler, Daniel E Koditschek, and Peter J Kindlmann. Planning and control of robotic juggling and catching tasks. The International Journal of Robotics Research, 13(2):101-118, 1994.

[15] Roberto Calandra, Andrew Owens, Manu Upadhyaya, Wenzhen Yuan, Justin Lin, Edward H. Adelson, and Sergey Levine. The feeling of success: Does touch sensing help predict grasp outcomes? arXiv preprint arXiv:1710.05512, 2017.

[16] N. Chavan-Dafle and A. Rodriguez. Sampling-based planning for in-hand manipulations with external pushes. In International Symposium on Robotics Research (ISRR), 2017.

[17] Nikhil Chavan-Dafle, Alberto Rodriguez, Bowei Tang, R. Paolini, Siddhartha Srinivasa, Michael Erdmann, Matthew T Mason, Ivan Lundberg, Harald Staab, and Thomas Fuhlbrigge. Extrinsic Dexterity: In-Hand Manipulation with External Forces. In Robotics and Automation (ICRA), 2014 IEEE International Conference on, 2014.

[18] Yevgen Chebotar, Karol Hausman, Zhe Su, Gaurav S Sukhatme, and Stefan Schaal. Self-supervised regrasping using spatio-temporal tactile features and reinforcement learning. In Intelligent Robots and Systems (IROS), 2016 IEEE/RSJ International Conference on, pages 1960-1966, 2016.

[19] Matei T Ciocarlie and Peter K Allen. Hand posture subspaces for dexterous robotic grasping. The International Journal of Robotics Research, 28(7):851-867, 2009.

[20] Nikolaus Correll, Kostas E Bekris, Dmitry Berenson, Oliver Brock, Albert Causo, Kris Hauser, Kei Okada, Alberto Rodriguez, Joseph M Romano, and Peter R Wurman. Analysis and observations from the first amazon picking challenge. IEEE Transactions on Automation Science and Engineering, 15(1):172-188, 2016.

[21] Robin Deits, Twan Koolen, and Russ Tedrake. Lvis: Learning from value function intervals for contact-aware robot controllers. In 2019 International Conference on Robotics and Automation (ICRA), pages 7762-7768. IEEE, 2019.

[22] Mehmet Dogar and Siddhartha Srinivasa. A framework for push-grasping in clutter. Robotics: Science and systems VII, 1, 2011.

[23] Mehmet R Dogar and Siddhartha S Srinivasa. A planning framework for non-prehensile manipulation under clutter and uncertainty. Autonomous Robots, 33(3):217-236, 2012.

[24] Siyuan Dong, Daolin Ma, Elliott Donlon, and Alberto Rodriguez. Maintaining Grasps within Slipping Bound by Monitoring Incipient Slip. In IEEE International Conference on Robotics and Automation (ICRA), under review, 2019.

[25] Siyuan Dong, Wenzhen Yuan, and Edward H Adelson. Improved gelsight tactile sensor for measuring geometry and slip. In Intelligent Robots and Systems (IROS), 2017 IEEE/RSJ International Conference on, pages 137-144, 2017.

[26] Elliott Donlon, Siyuan Dong, Melody Liu, Jianhua Li, Edward Adelson, and Alberto Rodriguez. Gelslim: A high-resolution, compact, robust, and calibrated tactile-sensing finger. Intelligent Robots and Systems (IROS), 2018 IEEE/RSJ International Conference on, 2018.

[27] F. R. Hogan, J. Ballester, S. Dong, and A. Rodriguez. Tactile dexterity: Robust manipulation primitives with tactile feedback. In 2019 IEEE International Conference on Robotics and Automation (ICRA). IEEE.

[28] N. Fazeli, E. Donlon, E. Drumwright, and A. Rodriguez. Empirical evaluation of common contact models for planar impact. In Robotics and Automation (ICRA), 2017 IEEE International Conference on, 2017.

[29] Nima Fazeli, Russ Tedrake, and Alberto Rodriguez. Identifiability analysis of planar rigid-body frictional contact. In Robotics Research, pages 665-682. Springer, 2018.

[30] Alex K Goins, Ryan Carpenter, Weng-Keen Wong, and Ravi Balasubramanian. Evaluating the efficacy of grasp metrics for utilization in a gaussian process-based grasp predictor. In Intelligent Robots and Systems (IROS), 2014 IEEE/RSJ International Conference on, pages 3353-3360, 2014.

[31] S. Goyal, A. Ruina, and J. Papadopoulos. Planar sliding with dry friction Part 1. Limit surface and moment function. Wear, 143:307-330, 1991.

[32] Suresh Goyal, Andy Ruina, and Jim Papadopoulos. Planar sliding with dry friction part 1. limit surface and moment function. Wear, 143(2):307-330, 1991.

[33] Marcus Gualtieri, Andreas ten Pas, Kate Saenko, and Robert Platt. High precision grasp pose detection in dense clutter. In Intelligent Robots and Systems (IROS), 2016 IEEE/RSJ International Conference on, pages 598-605, 2016.

[34] Gurobi Optimization, Inc. Gurobi optimizer reference manual, 2015.

[35] Tucker Hermans, James M Rehg, and Aaron F Bobick. Decoupling behavior, control, and perception in affordance-based manipulation. In IROS Workshop on Cognitive Assistive Systems, 2012.

[36] F. R. Hogan and A. Rodriguez. Reactive planar manipulation with hybrid model predictive control. International Journal of Robotics Research, 2019.

[37] Francois R Hogan and Alberto Rodriguez. Feedback control of the pusher-slider system: a story of hybrid and underactuated contact dynamics. In Proceedings of the 12th International Workshop on the Algorithmic Foundations of Robotics (WAFR), San Francisco, CA, USA, December 18-20, 2016.

[38] Francois Robert Hogan, Eudald Romo Grau, and Alberto Rodriguez. Reactive planar manipulation with convex hybrid mpc. In 2018 IEEE International Conference on Robotics and Automation (ICRA), pages 247-253. IEEE, 2018.

[39] Rachel M Holladay and Siddhartha S Srinivasa. Distance metrics and algorithms for task space path optimization. In 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 5533-5540. IEEE, 2016.

[40] Yifan Hou, Zhenzhong Jia, Aaron M Johnson, and Matthew T Mason. Robust planar dynamic pivoting by regulating inertial and grip forces. In The 12th International Workshop on the Algorithmic Foundations of Robotics (WAFR). Springer, 2016.

[41] Yifan Hou, Zhenzhong Jia, and Matthew T Mason. Fast planning for 3d any-pose-reorienting using pivoting. In 2018 IEEE International Conference on Robotics and Automation (ICRA), pages 1631-1638. IEEE, 2018.

[42] Robert D Howe and Mark R Cutkosky. Practical force-motion models for sliding manipulation. The International Journal of Robotics Research, 15(6):557-572, 1996.

[43] Kaijen Hsiao, Sachin Chitta, Matei Ciocarlie, and E Gil Jones. Contact-reactive grasping of objects with partial shape information. In Intelligent Robots and Systems (IROS), 2010 IEEE/RSJ International Conference on, pages 1228-1235, 2010.

[44] Gregory Izatt, Geronimo Mirano, Edward Adelson, and Russ Tedrake. Tracking objects with point clouds from vision and touch. In IEEE International Conference on Robotics and Automation (ICRA), pages 4000-4007, 2017.

[45] Roland S Johansson and Goran Westling. Roles of glabrous skin receptors and sensorimotor memory in automatic control of precision grip when lifting rougher or more slippery objects. Experimental brain research, 56(3):550-564, 1984.

[46] Yiannis Karayiannidis, Christian Smith, Danica Kragic, et al. Adaptive control for pivoting with visual and tactile feedback. In Robotics and Automation (ICRA), 2016 IEEE International Conference on, pages 399-406. IEEE, 2016.

[47] Marek Kopicki, Sebastian Zurek, Rustam Stolkin, Thomas Moerwald, and Jeremy L Wyatt. Learning modular and transferable forward models of the motions of push manipulated objects. Autonomous Robots, 41(5):1061-1082, 2017.

[48] Michael C Koval, Mehmet R Dogar, Nancy S Pollard, and Siddhartha S Srinivasa. Pose estimation for contact manipulation with manifold particle filters. In Intelligent Robots and Systems (IROS), 2013 IEEE/RSJ International Conference on, pages 4541-4548, 2013.

[49] Michael C Koval, Nancy S Pollard, and Siddhartha S Srinivasa. Pose estimation for planar contact manipulation with manifold particle filters. The International Journal of Robotics Research, 34(7):922-945, 2015.

[50] Michael C Koval, Nancy S Pollard, and Siddhartha S Srinivasa. Pre- and post-contact policy decomposition for planar contact manipulation under uncertainty. The International Journal of Robotics Research, 35(1-3):244-264, 2016.

[51] Senka Krivic and Justus Piater. Online adaptation of robot pushing control to object properties. In 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 4614-4621. IEEE, 2018.

[52] James J Kuffner and Steven M LaValle. Rrt-connect: An efficient approach to single-query path planning. In Proceedings 2000 ICRA. Millennium Conference. IEEE International Conference on Robotics and Automation. Symposia Proceedings (Cat. No. 00CH37065), volume 2, pages 995-1001. IEEE, 2000.

[53] Scott Kuindersma, Robin Deits, Maurice Fallon, Andrés Valenzuela, Hongkai Dai, Frank Permenter, Twan Koolen, Pat Marion, and Russ Tedrake. Optimization-based locomotion planning, estimation, and control design for the atlas humanoid robot. Autonomous Robots, 40(3):429-455, 2016.

[54] Manfred Lau, Jun Mitani, and Takeo Igarashi. Automatic Learning of Pushing Strategy for Delivery of Irregular-Shaped Objects. In ICRA, 2011.

[55] Mircea Lazar, WPMH Heemels, Siep Weiland, and Alberto Bemporad. Stabilizing model predictive control of hybrid systems. IEEE Transactions on Automatic Control, 51(11):1813-1818, 2006.

[56] Soo H Lee and Mark Cutkosky. Fixture planning with friction. Journal of Manufacturing Science and Engineering, 113(3):320-327, 1991.

[57] Sergey Levine, Peter Pastor, Alex Krizhevsky, Julian Ibarz, and Deirdre Quillen. Learning hand-eye coordination for robotic grasping with deep learning and large-scale data collection. The International Journal of Robotics Research, 2016.

[58] Rui Li and Edward H. Adelson. Sensing and recognizing surface textures using a gelsight sensor. In Computer Vision and Pattern Recognition (CVPR), 2013.

[59] Rui Li, Robert Platt, Wenzhen Yuan, Andreas ten Pas, Nathan Roscup, Mandayam A. Srinivasan, and Edward Adelson. Localization and manipulation of small parts using gelsight tactile sensing. In Intelligent Robots and Systems (IROS), 2014 IEEE/RSJ International Conference on, 2014.

[60] Emanuele Luberto, Yier Wu, Gaspare Santaera, Marco Gabiccini, and Antonio Bicchi. Enhancing adaptive grasping through a simple sensor-based reflex mechanism. IEEE Robotics and Automation Letters, 2(3):1664-1671, 2017.

[61] Kevin M Lynch. The mechanics of fine manipulation by pushing. In ICRA, pages 2269-2276. Citeseer, 1992.

[62] Kevin M Lynch, H. Maekawa, and K. Tanie. Manipulation and active sensing by pushing using tactile feedback. In Intelligent Robots and Systems (IROS), 1992 IEEE/RSJ International Conference on, 1992.

[63] Kevin M Lynch and Matthew T Mason. Stable pushing: mechanics, controllability, and planning. The International Journal of Robotics Research, 15(6):533-556, 1996.

[64] Jeffrey Mahler, Jacky Liang, Sherdil Niyaz, Michael Laskey, Richard Doan, Xinyu Liu, Juan Aparicio Ojea, and Ken Goldberg. Dex-net 2.0: Deep learning to plan robust grasps with synthetic point clouds and analytic grasp metrics. Robotics: Science and Systems (RSS), 2017.

[65] M. Mani and R. D. W. Wilson. A programmable orienting system for flat parts. In Proc. North American Mfg. Research Inst. Conf. XIII, 1985.

[66] G. De Maria, C. Natale, and S. Pirozzi. Force/tactile sensor for robotic applications. Sensors and Actuators A, 1, 2012.

[67] Matthew T Mason. Mechanics and planning of manipulator pushing operations. The International Journal of Robotics Research, 5(3):53-71, 1986.

[68] Matthew T Mason. Mechanics and planning of manipulator pushing operations. The International Journal of Robotics Research, 5(3):53-71, 1986.

[69] Matthew T Mason. Mechanics of robotic manipulation. MIT Press, Cambridge, Massachusetts, 2001.

[70] Daniele Masti and Alberto Bemporad. Learning binary warm starts for multiparametric mixed-integer quadratic programming. In Proc. of European Control Conference, 2019.

[71] Tekin Meriçli, Manuela Veloso, and H Levent Akin. Push-manipulation of complex passive mobile objects using experimentally acquired motion models. Autonomous Robots, 38(3):317-329, 2015.

[72] Todd D Murphey and Joel W Burdick. Feedback control methods for distributed manipulation systems that involve mechanical contacts. The International Journal of Robotics Research, 23(7-8):763-781, 2004.

[73] George L Nemhauser and Laurence A Wolsey. Integer Programming and Combinatorial Optimization. Wiley, Chichester, England, 1988.

[74] Richard Oberdieck and Efstratios N Pistikopoulos. Explicit hybrid model-predictive control: The exact solution. Automatica, 58:152-159, 2015.

[75] Diego Pardo, Michael Neunert, Alexander W Winkler, Ruben Grandia, and Jonas Buchli. Hybrid direct collocation and control in the constraint-consistent subspace for dynamic legged robot locomotion. In Robotics: Science and Systems, 2017.

[76] Peter Pastor, Mrinal Kalakrishnan, Ludovic Righetti, and Stefan Schaal. Towards associative skill memories. In Humanoid Robots (Humanoids), 2012 12th IEEE-RAS International Conference on, pages 309-315, 2012.

[77] Lerrel Pinto, James Davidson, and Abhinav Gupta. Supervision via competition: Robot adversaries for learning tasks. In IEEE International Conference on Robotics and Automation (ICRA), pages 1601-1608, 2017.

[78] Lerrel Pinto and Abhinav Gupta. Supersizing self-supervision: Learning to grasp from 50k tries and 700 robot hours. In IEEE International Conference on Robotics and Automation (ICRA), pages 3406-3413, 2016.

[79] Michael Posa, Cecilia Cantu, and Russ Tedrake. A direct method for trajectory optimization of rigid bodies through contact. The International Journal of Robotics Research, 33(1):69-81, 2014.

[80] Michael Posa, Scott Kuindersma, and Russ Tedrake. Optimization and stabilization of trajectories for constrained dynamical systems. In Robotics and Automation (ICRA), 2016 IEEE International Conference on, 2016.

[81] Carl Rasmussen and Chris Williams. Gaussian Processes for Machine Learning. MIT Press, 2006.

[82] Joseph Redmon and Anelia Angelova. Real-time grasp detection using convolutional neural networks. In IEEE International Conference on Robotics and Automation (ICRA), pages 1316-1322, 2015.

[83] Anis Sahbani, Sahar El-Khoury, and Philippe Bidaud. An overview of 3d object grasp synthesis algorithms. Robotics and Autonomous Systems, 60(3):326-336, 2012.

[84] Marcos Salganicoff, Giorgio Metta, Andrea Oddera, and Giulio Sandini. A vision-based learning method for pushing manipulation. Technical report, IRCS-93-47, U. of Pennsylvania, Department of Computer and Information Science, 1993.

[85] Naoyuki Sawasaki and Hirochika Inoue. Tumbling objects using a multi-fingered robot. Journal of the Robotics Society of Japan, 9(5):560-571, 1991.

[86] Gerrit Schultz and Katja Mombaur. Modeling and optimal control of human-like running. IEEE/ASME Transactions on mechatronics, 15(5):783-792, 2009.

[87] Jian Shi, J Zachary Woodruff, Paul B Umbanhowar, and Kevin M Lynch. Dynamic in-hand sliding manipulation. IEEE Transactions on Robotics, 33(4):778-795, 2017.

[88] Bruno Siciliano and Oussama Khatib. Springer handbook of robotics. Springer, 2016.

[89] M. Stachowsky, T. Hummel, M. Moussa, and H. A. Abdullah. A slip detection and correction strategy for precision robot grasping. IEEE/ASME Transactions on Mechatronics, 21(5), 2016.

[90] David E Stewart and J.C. Trinkle. An implicit time-stepping scheme for rigid body dynamics with inelastic collisions and coulomb friction. International Journal for Numerical Methods in Engineering, 39(15):2673-2691, 1996.

[91] David E Stewart and Jeffrey C Trinkle. An implicit time-stepping scheme for rigid body dynamics with inelastic collisions and coulomb friction. International Journal for Numerical Methods in Engineering, 39(15):2673-2691, 1996.

[92] Andreas ten Pas and Robert Platt. Using geometry to detect grasp poses in 3d point clouds. In Robotics Research, pages 307-324. Springer, 2018.

[93] Stephen Tian, Frederik Ebert, Dinesh Jayaraman, Mayur Mudigonda, Chelsea Finn, Roberto Calandra, and Sergey Levine. Manipulation by feel: Touch-based control with deep predictive models. arXiv preprint arXiv:1903.04128, 2019.

[94] Marc Toussaint, Kelsey Allen, Kevin Smith, and Joshua B Tenenbaum. Differentiable physics and stable modes for tool-use and manipulation planning. In Robotics: Science and Systems (RSS), 2018.

[95] Andres Klee Valenzuela. Mixed-integer convex optimization for planning aggressive motions of legged robots over rough terrain. PhD thesis, Massachusetts Institute of Technology, 2016.

[96] Filipe Veiga, Benoni B Edin, and Jan Peters. In-hand object stabilization by independent finger control. arXiv preprint arXiv:1806.05031, 2018.

[97] Filipe Veiga, Herke Van Hoof, Jan Peters, and Tucker Hermans. Stabilizing novel objects by learning to predict tactile slip. In 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 5065-5072. IEEE, 2015.

[98] Ulrich Viereck, Andreas ten Pas, Kate Saenko, and Robert Platt. Learning a visuomotor controller for real world robotic grasping using simulated depth images. Conference on Robot Learning (CoRL), 2017.

[99] Sean Walker and J Kenneth Salisbury. Pushing using learned manipulation maps. In 2008 IEEE International Conference on Robotics and Automation, pages 3808-3813. IEEE, 2008.

[100] Weiwei Wan, Matthew T Mason, Rui Fukui, and Yasuo Kuniyoshi. Improving regrasp algorithms to analyze the utility of work surfaces in a workcell. In Robotics and Automation (ICRA), 2015 IEEE International Conference on, pages 4326-4333. IEEE, 2015.

[101] J Zachary Woodruff and Kevin M Lynch. Planning and control for dynamic, nonprehensile, and hybrid manipulation tasks. In Robotics and Automation (ICRA), 2017 IEEE International Conference on, 2017.

[102] Hanna Yousef, Mehdi Boukallel, and Kaspar Althoefer. Tactile sensing for dexterous in-hand manipulation in robotics - a review. Sensors and Actuators A: physical, 167(2):171-187, 2011.

[103] K.T. Yu, M. Bauza, N. Fazeli, and A. Rodriguez. More than a million ways to be pushed. a high-fidelity experimental dataset of planar pushing. In Intelligent Robots and Systems (IROS), 2016 IEEE/RSJ International Conference on, 2016.

[104] Kuan-Ting Yu, Maria Bauza, Nima Fazeli, and Alberto Rodriguez. More than a million ways to be pushed. a high-fidelity experimental dataset of planar pushing. In Intelligent Robots and Systems (IROS), 2016 IEEE/RSJ International Conference on, pages 30-37. IEEE, 2016.

[105] Kuan Ting Yu, John Leonard, and Alberto Rodriguez. Shape and pose recovery from planar pushing. In Intelligent Robots and Systems (IROS), 2015 IEEE/RSJ International Conference on, 2015.

[106] Wenzhen Yuan, Siyuan Dong, and Edward H. Adelson. High-resolution robot tactile sensors for estimating geometry and force. Sensors, 17(2), 2017.

[107] Wenzhen Yuan, Chenzhuo Zhu, Andrew Owens, Mandayam A. Srinivasan, and Edward H. Adelson. Shape-independent hardness estimation using deep learning and a gelsight tactile sensor. In IEEE International Conference on Robotics and Automation (ICRA), 2017.

[108] Jiaji Zhou, James Bagnell, and Matthew Mason. A fast stochastic contact model for planar pushing and grasping: theory and experimental validation. In Robotics Science and Systems, Cambridge, MA, USA, July 12-16, 2017.

[109] Jiaji Zhou and Matthew T Mason. Pushing revisited: Differential flatness, trajectory planning and stabilization. In Proceedings of the International Symposium on Robotics Research (ISRR), 2017.

[110] Jiaji Zhou, Robert Paolini, J Andrew Bagnell, and Matthew T Mason. A convex polynomial force-motion model for planar sliding: Identification and application. In 2016 IEEE International Conference on Robotics and Automation (ICRA), pages 372-377. IEEE, 2016.

[111] Shaojun Zhu, Andrew Kimmel, and Abdeslam Boularias. Information-theoretic model identification and policy search using physics engines with application to robotic manipulation. arXiv preprint arXiv:1703.07822, 2017.
