ViSP for Visual Servoing: A Generic Software Platform with a Wide Class of Robot Control Skills
By Éric Marchand, Fabien Spindler, and François Chaumette

To cite this version: É. Marchand, F. Spindler, F. Chaumette. ViSP for visual servoing: a generic software platform with a wide class of robot control skills. IEEE Robotics and Automation Magazine, 12(4):40-52, December 2005. HAL Id: inria-00351899, https://hal.inria.fr/inria-00351899.

Several software packages or toolboxes written in various languages have been proposed in order to simulate robotic manipulator control. The MATLAB Robotics Toolbox [8] allows for the simple manipulation of serial-link manipulators; Roboop [14] is a manipulator simulation package (written in C++). On the other hand, only a few systems allow the simple specification and execution of a robotic task on a real system. Some systems, though not always related to vision-based robotics, have been described in [9].

Visual servoing is a very important research area in robotics. Despite all the research in this field, it seems there is no software environment that allows fast prototyping of visual servoing tasks. The main reason is that visual servoing usually requires specific hardware (the robot and specific framegrabbers). Consequently, the resulting applications are not portable and cannot easily be adapted to other environments. Today's software design allows us to propose elementary components that can be combined to build portable high-level applications. Furthermore, the increasing speed of microprocessors allows the development of real-time image processing algorithms on a simple workstation.

A visual servoing toolbox for MATLAB/Simulink [3] has been proposed recently, but it features only simulation capabilities. Chaumette et al. [6] proposed a "library of canonical vision-based tasks" for visual servoing that catalogs the most classical linkages. Toyama and Hager describe in [32] what such a system should be in the case of stereo visual servoing; the presented system (called Servomatic) was specified using the same philosophy as the XVision system [16] and would have been independent from the robot and the tracking algorithms.

Following these precedents, ViSP (that is, Visual Servoing Platform), the software environment we present in this article, features all these capabilities: independence with respect to the hardware, simplicity, extendibility, and portability. Moreover, ViSP features a large library of elementary tasks with various visual features that can be combined together, an image processing library that allows the tracking of visual cues at video rate, a simulator, an interface with various classical framegrabbers, etc. The platform is implemented in C++ under Linux.

ViSP: Overview and Major Features

Control Issue

Visual servoing techniques consist of using the data provided by one or several cameras in order to control the motion of a robotic system [13], [19]. A large variety of positioning or target-tracking tasks can be implemented by controlling from one to all degrees of freedom of the system. Whatever the sensor configuration, which can vary from one camera mounted on the robot end effector to several free-standing cameras, a set of visual features $s$ must be designed from the visual measurements $x(t)$ ($s = s(x(t))$), allowing control of the desired degrees of freedom. A control law also must be designed so that these features $s$ reach a desired value $s^*$, defining a correct realization of the task. A desired trajectory $s^*(t)$ can also be tracked [1], [29]. The control principle is thus to regulate the error vector $s - s^*$ to zero.
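This regulation principle maps directly onto code. The following is a minimal sketch of an eye-in-hand simulation in which the image coordinates of four points are driven toward their desired values, written in the style of ViSP's own example programs. Class names, header paths, and numeric values follow recent ViSP releases (the visp3 layout postdates this 2005 article), so treat them as assumptions rather than the article's exact listing.

```cpp
#include <visp3/core/vpColVector.h>
#include <visp3/core/vpHomogeneousMatrix.h>
#include <visp3/core/vpPoint.h>
#include <visp3/robot/vpSimulatorCamera.h>
#include <visp3/visual_features/vpFeatureBuilder.h>
#include <visp3/visual_features/vpFeaturePoint.h>
#include <visp3/vs/vpServo.h>

int main()
{
  // Four 3-D points forming a square, expressed in the object frame.
  vpPoint point[4] = { vpPoint(-0.1, -0.1, 0), vpPoint(0.1, -0.1, 0),
                       vpPoint(0.1, 0.1, 0), vpPoint(-0.1, 0.1, 0) };

  vpHomogeneousMatrix cMo(0.15, -0.1, 1.0, 0, 0, 0); // initial camera-object pose
  vpHomogeneousMatrix cdMo(0, 0, 0.75, 0, 0, 0);     // desired camera-object pose

  vpServo task;
  task.setServo(vpServo::EYEINHAND_CAMERA);          // velocity computed in the camera frame
  task.setInteractionMatrixType(vpServo::CURRENT);
  task.setLambda(0.5);                               // proportional gain of the control law

  // Build the current features s and the desired features s*, and stack them.
  vpFeaturePoint p[4], pd[4];
  for (int i = 0; i < 4; i++) {
    point[i].track(cdMo);                            // project onto the desired image
    vpFeatureBuilder::create(pd[i], point[i]);
    point[i].track(cMo);                             // project onto the initial image
    vpFeatureBuilder::create(p[i], point[i]);
    task.addFeature(p[i], pd[i]);                    // contributes one block of s - s*
  }

  // Free-flying camera simulator standing in for a real robot and framegrabber.
  vpSimulatorCamera robot;
  robot.setSamplingTime(0.040);
  vpHomogeneousMatrix wMc, wMo;
  robot.getPosition(wMc);
  wMo = wMc * cMo;                                   // the object is motionless in the world

  for (unsigned int iter = 0; iter < 150; iter++) {
    // Update s from the current camera pose, then apply the computed velocity.
    robot.getPosition(wMc);
    cMo = wMc.inverse() * wMo;
    for (int i = 0; i < 4; i++) {
      point[i].track(cMo);
      vpFeatureBuilder::create(p[i], point[i]);
    }
    vpColVector v = task.computeControlLaw();
    robot.setVelocity(vpRobot::CAMERA_FRAME, v);
  }
  return 0;
}
```

The vpServo object stacks every feature pair added with addFeature() and computes the camera velocity from the corresponding stacked error and interaction matrix, which is exactly the regulation of $s - s^*$ described above.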
Control Law

To ensure the convergence of $s$ to its desired value $s^*$, we need to model or approximate the interaction matrix $L_s$, which links the time variation of the selected visual features to the relative camera-object kinematic screw $v$ and is defined by the classical equation [13]

$$\dot{s} = L_s v. \qquad (1)$$

If we want to control the robot using the joint velocities, we have

$$\dot{s} = J_s \dot{q} + \frac{\partial s}{\partial t}, \qquad (2)$$

where $J_s$ is the features Jacobian and where $\partial s/\partial t$ represents the variation of $s$ due to potential motion of the object (for an eye-in-hand system) or to potential motion of the camera (for an eye-to-hand system). More precisely, if we consider an eye-in-hand system, we have

$$J_s = L_s \, {}^{c}V_n \, {}^{n}J_n(q), \qquad (3)$$

where
◆ ${}^{n}J_n(q)$ is the robot Jacobian expressed in the end-effector frame $R_n$
◆ ${}^{c}V_n$ allows the transformation of the velocity screw between the camera frame $R_c$ and the end-effector frame $R_n$. It is given by

$${}^{c}V_n = \begin{pmatrix} {}^{c}R_n & [{}^{c}t_n]_\times \, {}^{c}R_n \\ 0_3 & {}^{c}R_n \end{pmatrix}, \qquad (4)$$

where ${}^{c}R_n$ and ${}^{c}t_n$ are the rotation and translation between frames $R_c$ and $R_n$, and $[t]_\times$ is the skew-symmetric matrix related to $t$. This matrix is constant if the camera is rigidly linked to the end effector.

Now, if we consider an eye-to-hand system, we have

$$J_s = -L_s \, {}^{c}V_F \, {}^{F}J_n(q), \qquad (5)$$

where
◆ ${}^{F}J_n(q)$ is the robot Jacobian expressed in the robot reference frame $R_F$
◆ ${}^{c}V_F$ allows the transformation of the velocity screw between coordinate frames (here the camera frame $R_c$ and the robot reference frame $R_F$). This matrix is constant if the camera is motionless.

In all cases, a control law that minimizes the error $s - s^*$ is then given by

$$\dot{q} = -\lambda J_s^{+}(s - s^*) - J_s^{+}\,\widehat{\frac{\partial s}{\partial t}}, \qquad (6)$$

where $\lambda$ is the proportional coefficient involved in the exponential convergence of the error and $\widehat{\partial s/\partial t}$ is an estimation of the object/camera motion. ViSP allows the consideration of both eye-in-hand and eye-to-hand configurations. Finally, note that a secondary task $e_2$ can simply be added (as reported in "Introduce More Complex Image Processing and a Secondary Task") when all the degrees of freedom are not constrained by the visual task. In that case, we have [13]

$$\dot{q} = -\lambda \, W^{+}W J_s^{+}(s - s^*) + (I - W^{+}W)\,e_2 + (I - W^{+}W)\frac{\partial e_2}{\partial t}, \qquad (7)$$

where $W^{+}$ and $I - W^{+}W$ are projection operators that guarantee that the camera motion due to the secondary task is compatible with the regulation of $s$ to $s^*$.

If we consider here, without loss of generality, the case of an eye-in-hand system observing a motionless target, we have

$$\dot{s} = L_s v, \qquad (8)$$

where $v = {}^{c}V_n \, {}^{n}J_n(q) \, \dot{q}$ is the camera velocity. If the low-level robot controller allows $v$ to be sent as input, a simple control law can be obtained:

$$v = -\lambda \widehat{L}_s^{+}(s - s^*), \qquad (9)$$

where $\widehat{L}_s$ is a model or an approximation of the interaction matrix. It can be chosen as [4]:
◆ $\widehat{L}_s = L_s(s, r)$, where the interaction matrix is computed at the current position of the visual features and the current three-dimensional (3-D) pose (denoted $r$), if it is available
◆ $\widehat{L}_s = L_s(s^*, r^*)$, where the interaction matrix is computed only once, at the desired position of $s$ and $r$
◆ $\widehat{L}_s = \frac{1}{2}\left(L_s(s, r) + L_s(s^*, r^*)\right)$, as reported in [23].

These possibilities have been integrated in ViSP, but other possibilities (such as a learning process of the interaction matrix [18], [20], [21]) remain possible, although they are not currently integrated in the software.
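In recent ViSP releases, these three choices of $\widehat{L}_s$ correspond to options of the vpServo class; the enum names below are taken from the current API and are an assumption with respect to the 2005-era interface.

```cpp
#include <visp3/vs/vpServo.h>

// Select how the interaction-matrix approximation of (9) is obtained.
void selectInteractionMatrix(vpServo &task)
{
  // L_s(s, r): recomputed at each iteration from the current features and pose.
  task.setInteractionMatrixType(vpServo::CURRENT);

  // L_s(s*, r*): computed once, at the desired configuration.
  // task.setInteractionMatrixType(vpServo::DESIRED);

  // (L_s(s, r) + L_s(s*, r*)) / 2, as reported in [23].
  // task.setInteractionMatrixType(vpServo::MEAN);
}
```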
Let us point out that in this case, (7) can be rewritten as [13]

$$v = -\lambda \, W^{+}W \widehat{L}_s^{+}(s - s^*) + (I - W^{+}W)\,e_2 + (I - W^{+}W)\frac{\partial e_2}{\partial t}. \qquad (10)$$

A Library of Visual Servoing Skills

With a vision sensor providing two-dimensional (2-D) measurements $x(t)$, potential visual features $s$ are numerous. Two-dimensional data (such as the coordinates of feature points in the image), as well as 3-D data (provided by a localization algorithm exploiting the extracted 2-D measurements), can be considered. It is also possible to combine 2-D and 3-D visual features to exploit the advantages of each approach while avoiding their respective drawbacks [24].

A systematic method has been proposed to derive analytically the interaction matrix of a set of visual features defined upon geometric primitives [13], [5], [6]. Any kind of visual feature can be considered within the same formalism (coordinates of points, straight-line orientation, area or, more generally, image moments, distance, etc.). Knowing these interaction matrices, the construction of elementary visual servoing tasks is straightforward. As explained in [6], a large library of elementary skills can be proposed. The current version of ViSP has the following predefined visual features:
◆ 2-D visual features: 2-D points, 2-D straight lines, 2-D ellipses
◆ 3-D visual features: 3-D points, 3-D straight lines, $\theta u$, where $\theta$ and $u$ are the angle and the axis of the rotation that the camera must realize.

For instance, stacking the interaction matrices of a 2-D point $p$, of a depth-related feature $z$, and of the rotation $\theta u$ yields

$$L_s = \begin{pmatrix} L_p \\ L_z \\ L_{\theta u} \end{pmatrix}, \qquad (11)$$

where $L_p$, $L_z$, and $L_{\theta u}$ have the following forms:

$$L_p = \begin{pmatrix} -1/Z & 0 & x/Z & xy & -(1+x^2) & y \\ 0 & -1/Z & y/Z & 1+y^2 & -xy & -x \end{pmatrix}, \qquad (12)$$

$$L_z = \begin{pmatrix} 0 & 0 & -1/Z & -y & x & 0 \end{pmatrix}. \qquad (13)$$
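Such a hybrid (2-1/2-D) feature set can be assembled in ViSP by adding one feature object per block of (11). The sketch below uses the class names and signatures of recent releases (vpFeaturePoint, vpFeatureDepth for $\log(Z/Z^*)$, vpFeatureThetaU); these, and all numeric values, are assumptions for illustration, not the article's own code.

```cpp
#include <cmath>
#include <iostream>
#include <visp3/core/vpColVector.h>
#include <visp3/core/vpThetaUVector.h>
#include <visp3/visual_features/vpFeatureDepth.h>
#include <visp3/visual_features/vpFeaturePoint.h>
#include <visp3/visual_features/vpFeatureThetaU.h>
#include <visp3/vs/vpServo.h>

int main()
{
  vpServo task;
  task.setServo(vpServo::EYEINHAND_CAMERA);
  task.setInteractionMatrixType(vpServo::CURRENT);
  task.setLambda(0.5);

  // 2-D part: current and desired normalized point coordinates (x, y) plus
  // depth Z; its interaction matrix is L_p of (12).
  vpFeaturePoint p, pd;
  p.buildFrom(0.1, -0.2, 1.0);
  pd.buildFrom(0.0, 0.0, 0.75);
  task.addFeature(p, pd);

  // Depth part: s_z = log(Z/Z*), whose row corresponds to L_z of (13);
  // its desired value is zero, so no explicit desired feature is added.
  double Z = 1.0, Zd = 0.75;
  vpFeatureDepth logZ;
  logZ.buildFrom(p.get_x(), p.get_y(), Z, std::log(Z / Zd));
  task.addFeature(logZ);

  // 3-D part: the rotation theta-u that the camera must realize (L_thetau
  // in (11)); its desired value is also zero (no remaining rotation).
  vpThetaUVector tu(0.1, -0.05, 0.2); // illustrative current rotation error
  vpFeatureThetaU rot(vpFeatureThetaU::cdRc);
  rot.buildFrom(tu);
  task.addFeature(rot);

  // vpServo stacks the individual interaction matrices as in (11).
  vpColVector v = task.computeControlLaw();
  std::cout << "v: " << v.t() << std::endl;
  return 0;
}
```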