
End-User Robot Programming Using Mixed Reality

Samir Yitzhak Gadre1, Eric Rosen1, Gary Chien1, Elizabeth Phillips1,2, Stefanie Tellex1, George Konidaris1

Abstract— Mixed Reality (MR) is a promising interface for robot programming because it can project an immersive 3D visualization of a robot's intended movement onto the real world. MR can also support hand gestures, which provide an intuitive way for users to construct and modify robot motions. We present a Mixed Reality Head-Mounted Display (MR-HMD) interface that enables end-users to easily create and edit robot motions using waypoints. We describe a user study in which 20 participants were asked to program a robot arm using 2D and MR interfaces to perform two pick-and-place tasks. In the primitive task, participants created typical pick-and-place programs. In the adapted task, participants adapted their primitive programs to address a more complex pick-and-place scenario, which included obstacles and conditional reasoning. Compared to the 2D interface, a higher number of users were able to complete both tasks in significantly less time, and reported experiencing lower cognitive workload, higher usability, and higher naturalness with the MR-HMD interface.

Fig. 1: Our mixed reality interface. (a) A screenshot from the MR perspective of a user programming a robot motion. Users can specify green waypoints. (b) After creating the waypoints, users can visualize the robot arm motion that is planned through the waypoints.

I. INTRODUCTION

For robots to become widely used, humans must be able to program their actions. For example, consider the task of binning items. A roboticist might accomplish this task by specifying a series of waypoints in computer code for the robot to visit one by one. If the action must be modified, the roboticist would explicitly modify the waypoints specified in the code. This method is widely popular, but will not work for end-users. The abstraction of breaking down actions into a series of waypoints could be communicated, but requiring the use of programming languages to specify those waypoints is beyond their scope. Therefore, we need an alternate method of interfacing with the waypoint action system.

Visual programming is one such methodology. A broad term, visual programming refers to a class of interfaces where one interacts with a visual representation of objects, variables, classes, etc. [1]. Visual programming has been a popular tool for programming computers by non-programmers in visual art [2], audio production [3], and education [4] because it allows users to focus on algorithmic thinking, rather than the syntax needed to express their intent. In the world of robotics, RViz and the Interactive Markers package [5] allow for the creation of visual programming interfaces that control and visualize a robot. Robotics researchers have also investigated how effective visual programming is by creating and evaluating their own frameworks [6]. Using our binning task from before, it would be possible to create and modify waypoints visually in a keyboard and mouse interface. However, users would not see the waypoints overlaid on the real robot environment.

We propose an end-user mixed reality-based visual programming framework for creating and modifying waypoints to create complex, multistep robot actions (Fig. 1). Users can specify and group waypoints together to create primitive motions, and adapt these waypoints to perform similar tasks. Our interface allows users to visualize the entire motion the robot plans to perform through the waypoints and have the robot execute it. We use a commercially available Mixed Reality Head-Mounted Display (MR-HMD), the Microsoft HoloLens [7].

The MR-HMD at first seems to pose only advantages over other visual programming interfaces by combining the robot workspace and the GUI space for the end-user. However, there are limitations to the technology that do not make it obviously preferable to a 2D interface. For example, the HoloLens has a limited field of view, so it relies on the user to move around to get a full view of the MR scene. Furthermore, imperfect hand tracking on the HoloLens makes selection and dragging gestures less reliable than mouse clicks, especially for novice HoloLens users.

We therefore conducted a user study with 20 participants and compared the effectiveness of our MR interface to a 2D visual programming interface for two similar pick-and-place tasks. In the first task, participants programmed primitive robot motions to pick up a block and place it on a platform. In the second task, participants adapted their primitive robot programs to sequentially pick-and-place two cubes on different platforms. Our results show that, compared to the 2D interface, a higher number of users were able to complete both tasks in significantly less time. Users reported experiencing lower cognitive workload, and higher usability and naturalness, with the MR-HMD interface.

1 Computer Science, Brown University
2 Behavioral Sciences and Leadership, United States Air Force Academy
II. RELATED WORK

The traditional way to program a robot is to write code. ROS [8] is an extremely powerful environment for roboticists. ROS includes packages that allow programmers to use languages like C++ and Python to interface with robot hardware. However, leveraging the expertise of end-users who lack engineering skills would help make robots more widely accessible.

ROS also includes many graphical user interfaces (GUIs), such as RViz [5], for visualization. RViz can display robot sensor data on a 2D screen and can be connected to MoveIt! [9] motion planners to enable users to program robot movement via keyboard and mouse. 2D interfaces have been shown to be useful for robot programming, but have their own shortcomings with regard to immersiveness and intuitiveness. They force users to interpret 3D information on a 2D platform and to use control interfaces that do not match how users interact with the world.

Alexandrova et al. [6] created a 2D visual programming language, RoboFlow, to enable end-users to easily program mobile manipulators to perform general tasks in open-ended environments. Action editing was important to resolve errors [6].

Alexandrova et al. [10] developed a framework that enables users to provide a demonstration of a task and then use an intuitive monitor and mouse interface to edit demonstrations for adaptation to new tasks. Elliott et al. [11] extended this work to allow for the grouping of several poses relative to landmarks or objects in the scene. Having an intuitive interface for non-experts to edit example motions for robots is especially important in Learning from Demonstration (LfD), where collecting the many examples needed for learning is not always feasible. In such cases, a preferable method may be to have the human perform one general demonstration of the task, and then adapt parts of their programmed action to new environments. Both Alexandrova et al. [10] and Elliott et al. [11] used a 2D monitor interface to adapt robot programs.

Language is a well-studied modality for programming robots because it is one of the main ways humans communicate with each other. Language has been shown to be an efficient way to parameterize a task through descriptions of key items and locations in the environment [12]. Forbes et al. [13] developed a natural language interface to allow users to program robot motions via LfD. However, using language to program robots has its limitations. For example, Forbes et al. [13] studied basic object manipulation tasks that required robot action classes that could easily be identified through natural language. Tasks that rely on actions that are harder to describe can make language a noisy and difficult interface to robots.

Recent techniques experiment with Virtual Reality (VR) teleoperation interfaces. Whitney et al. [14] compared a VR interface against a 2D monitor interface for teleoperating a robot to perform a cup-stacking task. Whitney et al. [14] found that users had a significantly easier time using the VR headset, because they could navigate and interact with the scene by moving their head and hands naturally. This contrasts with the keyboard and mouse actions that a typical 2D interface provides. Like the VR headset, the MR-HMD allows users to navigate and interact with the perceived robot environment using natural actions. However, an additional benefit of the MR-HMD is that it allows the user to also see the real world.

Rosen et al. [15] created an open-source ROS package, ROS Reality, which enables robots to convey intent. The package allows the robot to display its intended path as a holographic time-lapse trail to a user wearing an MR-HMD. Rosen et al. [15] conducted a user study to compare the speed and task completion rate of novice participants using a HoloLens and a 2D monitor interface to determine if a proposed robot trajectory would collide with the environment. They found that the MR-HMD increased accuracy and lowered task completion time. While Rosen et al. [15] showed the promise of using an MR-HMD for visualizing robot motion to non-experts, they did not address how effective an MR-HMD is for programming these motions. The HoloLens's limited hand-gesture tracking capabilities pose the possible issue that an MR-HMD may not be an effective interface for creating these robot actions.

Walker et al. [16] investigated different MR signalling mechanisms for a drone to communicate with a human. They found that quantitative task efficiency was significantly higher when using MR signals than when using physically-embodied signals.

Fang et al. [17] created an interface to program and visualize robot trajectories; however, they do so using a 2D display. Ni et al. [18] evaluated the effectiveness of an augmented reality interface against traditional robot welding programming methodologies. Virtual representations of the robot trajectory were visualized over a video feed of the real robot, enabling novice users to use the augmented reality interface to program new welding paths for the robot to act out. Ni et al. [18] found that the augmented reality interface allowed users to program robot actions more quickly and intuitively, especially for users without prior computer-aided design knowledge. However, the 2D interfaces of both Fang et al. [17] and Ni et al. [18] force users to look at a screen and not strictly at the robot workspace, which poses safety issues when collaborating with a robot in close quarters. On the other hand, MR-HMDs allow users to both view visualizations over the workspace and safely interact with the 3D GUI components using hand gestures.

III. APPROACH

We designed an MR-based interface that allows users to specify waypoints in 3D trajectories. We wanted to test if the notion of adaptation [10], [11] was made more powerful when end-users could edit robot motion in the real world. We therefore created a baseline 2D GUI and an MR-HMD 3D GUI. Our contribution is evaluating the effectiveness of using an MR-HMD interface for programming and editing robot motion. The following subsections address important features of our approach: waypoints, groups, visualization, and execution. At the lowest level, users program the robot by creating waypoints through which the robot must move. These waypoints can be collected into groups, which can be visualized in MR and executed in the real world.

The waypoint (Fig. 1a) is the fundamental unit of programming a robot motion. A user adds a waypoint to the tail of a sequence of end-effector poses (3D positions and quaternions) that the robot arm must move through. A waypoint is represented as a hologram of the end-effector, which can be open or closed. For example, if an arbitrary waypoint is closed, then a command is sent to the gripper to close after the arm has converged on the waypoint pose. Waypoints can be adjusted and deleted by the user.
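To make the waypoint abstraction concrete, the following is a minimal sketch of how a waypoint might be represented, assuming a plain Python encoding of the end-effector pose (position plus unit quaternion) and the gripper command; the names and values are illustrative rather than taken from our implementation.

```python
# Minimal illustrative sketch (not our released code): a waypoint pairs an
# end-effector pose with the gripper command applied once the pose is reached.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Waypoint:
    position: Tuple[float, float, float]             # (x, y, z) in meters, robot base frame
    orientation: Tuple[float, float, float, float]   # unit quaternion (x, y, z, w)
    gripper_open: bool = True                        # False -> close gripper at this waypoint

# A motion is an ordered sequence of waypoints; new waypoints append to the tail.
motion: List[Waypoint] = []
motion.append(Waypoint((0.60, 0.10, 0.30), (0.0, 1.0, 0.0, 0.0)))                      # approach
motion.append(Waypoint((0.60, 0.10, 0.15), (0.0, 1.0, 0.0, 0.0), gripper_open=False))  # grasp
```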
A group (Fig. 1a) comprises a sequence of waypoints that together form an action. Hence, every waypoint is associated with a group. Groups are split into primitive and adapted groups. Primitive groups are taken to be fundamental building blocks, such as a generic pick-and-place action. After the primitive motions are created, they can be altered—by moving, adding, and deleting individual waypoints—to solve related or more complex tasks. These altered primitives are adapted groups.

Given a group, or a sequence of groups, it is possible to visualize the path that the robot arm will take from waypoint to waypoint (Fig. 1b). Visualization allows the robot to convey how it will move through the specified waypoints in the form of a holographic time-lapse trail. Visualization is key in the editing process, as it allows the user to further change an adapted group if they are not satisfied with the trajectory.

Once a user is satisfied with the adapted groups, they can execute them to make the robot arm move in the real world as per the visualization. We use MoveIt! to plan from waypoint to waypoint.
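As a rough illustration of this planning step, the sketch below computes a Cartesian plan through a sequence of end-effector poses using the moveit_commander Python bindings and Baxter's "left_arm" planning group; it is a simplified stand-in for our back-end, and the pose values are placeholders.

```python
# Illustrative planning sketch (assumes a running ROS/MoveIt! setup for Baxter;
# not our exact back-end code).
import sys
import rospy
import moveit_commander
from geometry_msgs.msg import Pose

def make_pose(position, orientation):
    """Build a geometry_msgs/Pose from (x, y, z) and a unit quaternion (x, y, z, w)."""
    p = Pose()
    p.position.x, p.position.y, p.position.z = position
    p.orientation.x, p.orientation.y, p.orientation.z, p.orientation.w = orientation
    return p

if __name__ == "__main__":
    moveit_commander.roscpp_initialize(sys.argv)
    rospy.init_node("waypoint_planner", anonymous=True)
    arm = moveit_commander.MoveGroupCommander("left_arm")

    # End-effector poses for one group, in order (placeholder values).
    waypoints = [
        make_pose((0.60, 0.10, 0.30), (0.0, 1.0, 0.0, 0.0)),
        make_pose((0.60, 0.10, 0.15), (0.0, 1.0, 0.0, 0.0)),
        make_pose((0.60, 0.40, 0.30), (0.0, 1.0, 0.0, 0.0)),
    ]

    # Interpolate in 1 cm steps; fraction reports how much of the path was planned.
    plan, fraction = arm.compute_cartesian_path(waypoints, 0.01, 0.0)
    if fraction > 0.99:
        arm.execute(plan, wait=True)  # actuate the arm through the waypoints
```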
IV. SYSTEM

Implementation was split into front-end Unity code for the HoloLens MR-HMD and back-end ROS code for MoveIt! and the Baxter. The 2D monitor baseline used the same back-end as the 3D interface. The front-ends were identical, except that the 2D monitor interface displayed graphics on a screen instead of holograms. Additionally, the 2D monitor interface had a rendering of a point cloud of the workspace (Fig. 3a).

Unity code supported both GUIs. This code made requests to the back-end as waypoints were altered, updated the robot arm movement trails upon receiving motion plans from the back-end, and sent execution requests to the back-end. We used ROS Reality [19] to visualize the robot model and 3D sensor data for the monitor interface. The hologram of the Baxter was created by parsing and rendering a Unified Robot Description Format (URDF) model. The rendering of Baxter was overlaid on the real robot in a calibration step.

There were two slotted columns attached to the hologram of the robot in the form of a manager menu (Fig. 3b). The right column contained primitive groups and the left column contained adapted groups. A user started by creating a primitive. They could add waypoints or start a new primitive group by clicking the "Add Gripper" and "New Group" buttons, respectively. After creating primitives, the user could drag groups to the left column, where they became adapted groups. The user could add and delete waypoints to alter the primitive so that it was well suited for an adapted task. The left column also acted as a run queue that executed the group actions from top to bottom. By clicking the "Visualize" button, a user was shown the movement trail implied by the order of the adapted groups. By clicking the "Execute" button, the user made the robot execute the desired motion.

Fig. 3: (a) Our 2D visual programming interface, showing motion through waypoints; the manager menu is not visible. (b) The manager menu included in both the 2D monitor and MR interfaces. Users can add waypoints, create primitive and adapted groups, and visualize and execute motions.

The front-end used a websocket client, ROS# [20], to send messages to the back-end. The client connected to rosbridge [21], which linked the front-end to the Baxter ROS network. The client code marshaled data and sent it to the bridge, and the bridge published these messages to the desired ROS topics. If the location, state, or existence of a waypoint was changed, a message was sent to update the back-end's knowledge of the waypoint. The front-end also sent two other types of messages: one to visualize a motion and one to execute a trajectory. Both of these requests hit MoveIt! code. We used MoveIt! to create a Cartesian plan through the waypoints of the various groups. These plans were sent to the front-end via the bridge server. If the end-user wanted to visualize the path, then the motion plan was displayed as a movement trail on the front-end. If the end-user wished to execute the motion plan, MoveIt! communicated with Baxter to actuate its joints.
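For concreteness, the sketch below shows the kind of JSON message a rosbridge client exchanges when a waypoint changes. The topic name and message contents are hypothetical stand-ins; our front-end sends such messages through ROS# in C#, whereas this example uses Python's websocket-client package purely for illustration.

```python
# Illustrative only: publishing a waypoint update over the rosbridge protocol.
# The topic and message type are hypothetical stand-ins for our actual messages.
import json
import websocket  # pip install websocket-client

ws = websocket.create_connection("ws://localhost:9090")  # rosbridge server (default port)

# Advertise a (hypothetical) waypoint-update topic, then publish one update.
ws.send(json.dumps({"op": "advertise",
                    "topic": "/waypoint_update",
                    "type": "geometry_msgs/Pose"}))
ws.send(json.dumps({"op": "publish",
                    "topic": "/waypoint_update",
                    "msg": {"position": {"x": 0.60, "y": 0.10, "z": 0.15},
                            "orientation": {"x": 0.0, "y": 1.0, "z": 0.0, "w": 0.0}}}))
ws.close()
```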
V. EXPERIMENT

The aim of our evaluation was to test the hypothesis that the MR interface would be faster and easier to use than the 2D baseline interface. We conducted a user study asking non-experts to program a 7 degree-of-freedom (7-DoF) robot arm to perform a primitive and an adapted pick-and-place task using our MR and 2D interfaces. Twenty users—15 male and 5 female—participated in the study. We measured user task completion times, system usability, subjective cognitive workload, and perceived naturalness for both interfaces.

A. Task

Participants completed a primitive and an adapted task (Fig. 2). In the primitive task (Fig. 2a), users were instructed to program the robot to pick up a cube, bring it over a wall, and place it on a platform. In the adapted task (Fig. 2b), the participant was asked to program the robot arm to pick up the first block from a new location, move it over the wall, and place it on the same platform. As part of the adapted task, the participant also had to pick up a second block, take it over a wall twice the height of the first, and place it on a second platform.

Fig. 2: The primitive and adapted tasks used in our experiments. (a) The primitive task: the user had to program the robot to 1) pick up the cube and bring it over the wall, and 2) place it on the platform. (b) The adapted task: the user had to program the robot to 1) pick up the cube from a new location, 2) place it on the same platform, 3) pick up a second cube, and 4) place it on a second platform, all while avoiding walls.

B. Interfaces

We compared two interfaces:
• Monitor Interface: Participants used a monitor, keyboard, and mouse to program and edit robot motions (Fig. 3). Using this interface, a user navigated the virtual scene and oriented the camera view in 3D space using a combination of right-mouse clicks, drags, and keyboard button presses. To interact with the waypoint programming interface, users used the left mouse button. This 2D monitor interface framework is similar to that of Alexandrova et al. [10] and Elliott et al. [11].
• Mixed Reality Interface: Participants used a HoloLens to program and edit robot motions. Users could translate and rotate their perspective by simply moving their head as they normally would. To interact with the programming interface, users performed tapping and dragging hand gestures on the GUI components shown in Fig. 1.

C. Study Design

The study used a within-subjects design in which all participants performed both experimental programming tasks using both interfaces. Participants were randomly assigned to the order in which they used each interface to complete the programming tasks, and experimenters counterbalanced the order of the two interfaces across study participants. For each interface, participants performed first the primitive task, then the adapted task.

D. Experimental Procedure

Participants began the study by reviewing the informed consent information. After they agreed to participate, the experimenters explained the two tasks that the users were to program. In addition, the experimenters explained the various components of the programming interfaces as described in Section III. After the users understood the tasks and how the programming interfaces worked, they were randomly selected to first complete tasks on either the 2D monitor or MR interface. For each interface, users were taught the controls before completing the primitive task. Then, they moved on to the adapted task. In the adapted task, users could access the successful motions they had programmed in the primitive task and edit them to achieve the new goal. Before each task, experimenters gave a 3-2-1 countdown, at which point users were timed while programming the necessary motions, up to the point of having the robot execute the final motion. For both the primitive and adapted tasks, users were given up to five attempts, where an attempt was considered a failure if a) the robot failed to sequentially place the cubes on the correct platforms or b) the robot arm ran into a wall while executing a motion. The time of their first success was recorded. After completing both the primitive and adapted tasks, users filled out all usability, workload, and naturalness surveys for that interface, and then moved on to the other interface condition.
E. Dependent Measures

Our objective dependent variables were the task success rate and the task completion times for both the primitive and adapted tasks. If users were unable to create a successful robot motion to complete the primitive task within 5 tries in a given interface, then that user did not attempt the adapted task in that interface. Thus, we report task success rate in the primitive task for each interface as the number of users who had at least one successfully programmed robot motion divided by the total number of participants. Furthermore, we report task completion rate in the adapted task for each interface as the number of users who were able to successfully program the robot motion divided by the number of people who had successfully programmed a robot motion in the primitive task. The subjective dependent measures included the NASA Task Load Index (NASA-TLX) [22], the System Usability Scale (SUS) [23], and a survey intended to measure user perceptions of the naturalness and usefulness of each interface.

NASA-TLX: the NASA-TLX is a subjective measure of perceived cognitive workload. Participants provide ratings of their perceived workload while completing a task on six sub-scales: mental demand, physical demand, temporal demand, effort, frustration, and performance. Five of the sub-scales are rated from 0 (very low demand) to 100 (very high demand), and the performance sub-scale is rated from 0 (perfect performance) to 100 (failure). For this experiment, the weighted measure of paired comparisons among the sub-scales was not included; see Moroney et al. [24] for a discussion. The subjective cognitive workload score was calculated by taking the average of the six sub-scale scores.

SUS: the System Usability Scale assesses overall system usability by asking participants to rate ten statements on a 7-point Likert-type scale from "Strongly Disagree" to "Strongly Agree". The statements cover aspects like system complexity, consistency, and cumbersomeness, among others. The SUS scores are converted to a scale of 0 (low usability) to 100 (high usability). See Sauro [25] for more detail on SUS scoring.

Perceived Naturalness Survey: the Perceived Naturalness Survey was derived to measure how natural using the interface felt to each participant. The naturalness of an interface has been discussed as a necessary component of building good user interfaces [26], [27] and can lead to positive outcomes like better learning in computer-mediated environments [28], but little work has been done on how to directly measure how "natural" an interface feels to a user. Prior work has suggested that components of interface naturalness include natural mapping—the ability of a system to map its controls to changes in the mediated environment in a natural and predictable manner [26]—as well as control, maneuverability, direct connections, and salience of input and feedback, to name a few. This survey was a first attempt at measuring those concepts of naturalness. The survey included items like, "It was easy to understand how changes in the interface would result in changes in the real world," "Using the interface, I felt like I had full control over the robot," "The interface felt predictable," and "The interface felt natural," among others. Participants responded to these statements using a 7-point Likert-type scale that ranged from "Strongly Agree" to "Strongly Disagree." Scores on this measure were calculated by taking users' average response across the 11 items.
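To illustrate how the composite scores are formed, the sketch below computes an overall (unweighted) NASA-TLX score as the mean of the six sub-scale ratings and a naturalness score as the mean of the 11 survey items; the example ratings are hypothetical, not participant data.

```python
# Illustrative scoring example with made-up ratings (not participant data).

def nasa_tlx_score(subscales):
    """Unweighted ("raw") TLX: mean of the six 0-100 sub-scale ratings."""
    assert len(subscales) == 6
    return sum(subscales) / 6.0

def naturalness_score(item_responses):
    """Mean response across the 11 naturalness items (7-point Likert scale)."""
    assert len(item_responses) == 11
    return sum(item_responses) / 11.0

# Example usage with hypothetical ratings:
# mental, physical, temporal, effort, frustration, performance
print(nasa_tlx_score([40, 20, 35, 45, 25, 30]))               # -> 32.5
print(naturalness_score([6, 5, 6, 7, 5, 6, 6, 5, 7, 6, 5]))   # -> ~5.82
```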
F. Hypotheses

Overall, we expected users to perform better (lower task completion times, higher task completion rate, lower workload, higher usability and preference) on the MR interface than on the 2D monitor interface in both the primitive and adapted tasks.

H1: Users will quantitatively perform better on the MR interface than the 2D monitor interface for completing the primitive task. Better performance is categorized as a) lower task completion time and b) higher task completion rate.

H2: Users will quantitatively perform better on the MR interface than the 2D monitor interface for completing the adapted task. Better performance is categorized as a) lower task completion time and b) higher task completion rate.

H3: Users will report higher subjective impressions of the MR interface than the 2D monitor interface for completing the primitive task. Better subjective performance is categorized as a) lower reported workload, b) higher usability, and c) higher perceived naturalness of the interface.

G. Results

1) Primitive and Adapted Task Completion: Two paired-sample t-tests were conducted to evaluate differences in task completion times between the 2D monitor and MR interfaces for both the primitive and adapted tasks (Fig. 4a). There was a statistically significant difference in mean task completion times (in seconds) between the 2D interface (M=189.44, SD=105.33) and the MR interface (M=132.50, SD=92.40) on the primitive task, t(17)=2.77, p=0.013, Cohen's d=0.57. There was also a significant difference in mean task completion times between the 2D (M=263.98, SD=123.88) and MR (M=145.28, SD=108.43) interfaces for the adapted task, t(17)=5.656, p<0.001, Cohen's d=1.02. Additionally, all participants (100%) were able to complete the primitive and adapted tasks using the MR interface. On the other hand, fewer participants (N=18) were able to complete the primitive task using the 2D interface; the participants who could not complete the primitive task were unable to adapt that plan in the adapted task, and thus did not complete either. Using the MR interface, participants completed both the primitive and adapted tasks faster than in the 2D interface and with a higher task completion rate. Thus, Hypotheses 1 and 2 were supported.

2) Workload: Statistically significant differences were found between subjective ratings of mental workload for the 2D and MR interfaces (Fig. 4b), t(19)=4.07, p=0.001, Cohen's d=0.93. Participants reported significantly lower cognitive workload using the MR interface (M=31.54, SD=15.69) than the 2D interface (M=46.13, SD=15.59).

3) Usability: Participants also reported significantly higher usability scores for the MR interface (M=76.75, SD=15.88) as compared to the 2D interface (M=47.75, SD=21.73), t(19)=7.73, p<0.001, Cohen's d=1.52 (Fig. 4c).

4) Naturalness of the interface: Finally, participants also reported that interacting with the MR interface (M=5.8, SD=0.67) felt significantly more natural than using the 2D interface (M=3.82, SD=1.37), t(19)=8.72, p<0.001, Cohen's d=1.83 (Fig. 4d). Taken together, the results of the subjective measures of workload, usability, and perceived naturalness support Hypothesis 3.
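The comparisons reported above follow a standard paired-samples analysis; the sketch below shows how such a test and the corresponding effect size could be computed with SciPy, using placeholder completion times rather than the study data.

```python
# Illustrative analysis sketch with placeholder data (not the study data).
import numpy as np
from scipy import stats

# Per-participant task completion times in seconds, paired by participant.
times_2d = np.array([150.0, 210.0, 185.0, 240.0, 170.0])   # placeholder values
times_mr = np.array([120.0, 160.0, 130.0, 180.0, 140.0])   # placeholder values

t_stat, p_value = stats.ttest_rel(times_2d, times_mr)       # paired-samples t-test

# Cohen's d for paired data (one common convention): mean difference divided
# by the standard deviation of the differences.
diff = times_2d - times_mr
cohens_d = diff.mean() / diff.std(ddof=1)

print(f"t({len(diff) - 1})={t_stat:.2f}, p={p_value:.3f}, d={cohens_d:.2f}")
```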

Fig. 4: Selected quantitative and qualitative results; error bars show standard error. (a) Task completion times for each interface and task. (b) NASA-TLX scores for the monitor and MR interfaces. (c) SUS survey results. (d) Naturalness survey results.

5) Order effects: Additional analyses were conducted to test whether the order in which participants interacted with each interface biased the results. There were a few results affected by order. First, if participants interacted with the 2D interface after interacting with the MR interface, they rated the 2D interface significantly lower on usability than participants who interacted with the 2D interface first, mean difference=23.31, t(18)=3.17, p=0.008, Cohen's d=1.46. This result may suggest that participants were disappointed or frustrated by the 2D interface after having first interacted with the MR system. A look at the frustration subscale of the NASA-TLX revealed that participants did indicate significantly higher frustration scores for the 2D interface as compared to the MR interface, t(19)=4.13, p=0.001, Cohen's d=1.27, mean difference=31.5. Similar results were found for the perceived naturalness measure. If participants interacted with the 2D interface after interacting with the MR interface, they rated the 2D interface significantly lower on how natural the interface felt than participants who interacted with the interfaces in the reverse order, t(18)=4.29, p=0.002, Cohen's d=2.00, mean difference=1.98. The inverse was also true for perceived naturalness of the MR interface: if participants interacted with the MR interface second, they rated the MR interface significantly higher on naturalness than participants who interacted with the MR interface first, t(18)=2.45, p=0.033, Cohen's d=1.14, mean difference=0.69.

VI. DISCUSSION

Overall, our results strongly support the use of MR interfaces for novice users to program and adapt robot motions. In our user study, participants were significantly faster and better able to complete the programming tasks using the MR interface than the 2D keyboard and mouse interface. In addition, participants reported lower levels of cognitive workload (e.g., frustration, effort, mental, temporal, and physical demand) in the MR interface. Users reported that the MR interface felt more usable and natural when completing the programming tasks. This seemed to stem from the tedious nature of using the mouse and keyboard to reposition the camera view and manipulate waypoints. The 2D monitor also did not seem ideal for interpreting the 3D workspace, making it harder to understand the true pose of the specified waypoints. Using MR, users could simply move their heads to readjust their point of view.

Our findings on the order of completion may also reveal the subjective appeal of using the MR interface. If participants used the 2D interface after interacting with the MR interface, they rated the 2D interface significantly lower in usability and naturalness. It was also the case that if participants used the MR interface after having been exposed to the 2D interface, they rated the MR interface significantly higher.

This interface could be adapted for future research and industry applications. We see particular applications to LfD, where waypoints are specified to constitute a demonstration. This approach could be used to program personal and industrial robots. In a household setting, a user might program a sweeping primitive for a robot outfitted with a broom attachment. In a factory line setting, a worker might program a soldering robot to place molten metal at a precise location on a chip.

VII. CONCLUSION

We introduced a new MR-based interface for end-users to program and adapt robot trajectories. We evaluated the effectiveness of using our MR system for programming and editing robot motions in a primitive and an adapted pick-and-place task against a 2D monitor and keyboard equivalent. Our study showed that users were able to more quickly and accurately program and edit robot motions using MR than the 2D baseline. In addition, users found that our MR interface required less work, was easier to use, and was more natural to use in comparison to the baseline. This study shows the large promise MR has for making robot programming more accessible to all end-users.
REFERENCES

[1] B. A. Myers, "Visual programming, programming by example, and program visualization: A taxonomy," in SIGCHI Conference on Human Factors in Computing Systems (CHI), 1986, pp. 59–66.
[2] "Blender," Blender Foundation, http://www.blender.org, [Accessed: 2018].
[3] "Max software tools for media," Cycling '74, https://cycling74.com/products/max, [Accessed: 2018].
[4] M. Resnick, J. Maloney, A. Monroy-Hernández, N. Rusk, E. Eastmond, K. Brennan, A. Millner, E. Rosenbaum, J. Silver, B. Silverman, and Y. Kafai, "Scratch: Programming for all," in Communications of the ACM, vol. 52, no. 11, 2009, pp. 60–67.
[5] D. Gossow, A. Leeper, D. Hershberger, and M. Ciocarlie, "Interactive markers: 3-D user interfaces for ROS applications [ROS topics]," in IEEE Robotics & Automation Magazine, vol. 18, no. 4, 2011, pp. 14–15.
[6] S. Alexandrova, Z. Tatlock, and M. Cakmak, "RoboFlow: A flow-based visual programming language for mobile manipulation tasks," in IEEE International Conference on Robotics and Automation (ICRA), 2015, pp. 5537–5544.
[7] "HoloLens," Microsoft, https://www.microsoft.com/en-us/hololens, [Accessed: 2018].
[8] M. Quigley, B. Gerkey, K. Conley, J. Faust, T. Foote, J. Leibs, E. Berger, R. Wheeler, and A. Ng, "ROS: an open-source robot operating system," in Open-source software workshop of the International Conference on Robotics and Automation (ICRA), 2009.
[9] "MoveIt!," Willow Garage, 2007, https://moveit.ros.org, [Accessed: 2018].
[10] S. Alexandrova, M. Cakmak, K. Hsiao, and L. Takayama, "Robot programming by demonstration with interactive action visualizations," in Robotics: Science and Systems (RSS), 2014.
[11] S. Elliott, R. Toris, and M. Cakmak, "Efficient programming of manipulation tasks by demonstration and adaptation," in IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), 2017.
[12] T. Kollar, S. Tellex, M. Walter, A. Huang, A. Bachrach, S. Hemachandra, E. Brunskill, A. Banerjee, D. Roy, S. Teller, and N. Roy, "Generalized grounding graphs: A probabilistic framework for understanding grounded language," in Journal of Artificial Intelligence Research (JAIR), 2013.
[13] M. Forbes, R. Rao, L. Zettlemoyer, and M. Cakmak, "Robot programming by demonstration with situated spatial language understanding," in IEEE International Conference on Robotics and Automation (ICRA), 2015, pp. 2014–2020.
[14] D. Whitney, E. Rosen, E. Phillips, G. Konidaris, and S. Tellex, "Comparing robot grasping teleoperation across desktop and virtual reality with ROS Reality," in International Symposium on Robotics Research (ISRR), 2017.
[15] E. Rosen, D. Whitney, E. Phillips, G. Chien, J. Tompkin, G. Konidaris, and S. Tellex, "Communicating robot arm motion intent through mixed reality head-mounted displays," in International Symposium on Robotics Research (ISRR), 2017.
[16] M. Walker, H. Hedayati, J. Lee, and D. Szafir, "Communicating robot motion intent with augmented reality," in ACM/IEEE International Conference on Human-Robot Interaction (HRI), 2018, pp. 316–324.
[17] H. Fang, S. Ong, and A. Nee, "Interactive robot trajectory planning and simulation using augmented reality," in Robotics and Computer-Integrated Manufacturing, vol. 28, no. 2, 2012, pp. 227–237.
[18] D. Ni, A. Yew, S. Ong, and A. Nee, "Haptic and visual augmented reality interface for programming welding robots," in Advances in Manufacturing, vol. 5, no. 3, 2017, pp. 191–198.
[19] D. Whitney, E. Rosen, D. Ullman, E. Phillips, and S. Tellex, "ROS Reality: A virtual reality framework using consumer-grade hardware for ROS-enabled robots," in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2018.
[20] "ROS#," Siemens, https://github.com/siemens/ros-sharp, [Accessed: 2018].
[21] C. Crick, G. Jay, S. Osentoski, and O. Jenkins, "ROS and rosbridge: Roboticists out of the loop," in ACM/IEEE International Conference on Human-Robot Interaction (HRI), 2012, pp. 493–494.
[22] S. Hart and L. Staveland, "Development of NASA-TLX (Task Load Index): Results of empirical and theoretical research," in Advances in Psychology, vol. 52, 1988, pp. 139–183.
[23] J. Brooke, "SUS - a quick and dirty usability scale," in Usability Evaluation in Industry, 1996, pp. 189–194.
[24] W. Moroney, D. Biers, F. Eggemeier, and J. Mitchell, "A comparison of two scoring procedures with the NASA Task Load Index in a simulated flight task," in IEEE National Aerospace and Electronics Conference (NAECON), 1992, pp. 734–740.
[25] J. Sauro, "Measuring usability with the System Usability Scale (SUS)," https://measuringu.com/sus/, [Accessed: 2018].
[26] P. Skalski, R. Lange, and R. Tamborini, "Mapping the way to fun: The effect of video game interfaces on presence and enjoyment," in International Workshop on Presence, 2006, pp. 63–64.
[27] S. Ramm, J. Giacomin, D. Robertson, and A. Malizia, "A first approach to understanding and measuring naturalness in driver-car interaction," in International Conference on Automotive User Interfaces and Interactive Vehicular Applications, 2014, pp. 1–10.
[28] S. Bailey, "Getting the upper hand: Natural gesture interfaces improve instructional efficiency on a conceptual computer lesson," Ph.D. dissertation, University of Central Florida, 2017.