MotionHub: Middleware for Unification of Multiple Body Tracking Systems
Philipp Ladwig, Kester Evers, Eric J. Jansen, Ben Fischer, David Nowottnik, Christian Geiger
Mixed Reality and Visualization Group (MIREVI), University of Applied Sciences, Düsseldorf, Germany

Figure 1: MotionHub is an open-source middleware that offers interfaces to multiple body tracking systems. a) Two users were captured and tracked by a Microsoft Azure Kinect. b) The graphical user interface of MotionHub. The green cubes represent an OptiTrack recording while the yellow ones represent an Azure Kinect live capture. c) MotionHub streams unified skeletal representation in real time to clients such as the Unity game engine via a plug-in.

ABSTRACT
There is a substantial number of body tracking systems (BTS), which cover a wide variety of different technology, quality and price range for character animation, dancing or gaming. To the disadvantage of developers and artists, almost every BTS streams out different protocols and tracking data. Not only do they vary in terms of scale and offset, but also their skeletal data differs in rotational offsets between joints and in the overall number of bones. Due to this circumstance, BTSs are not effortlessly interchangeable. Usually, software that makes use of a BTS is rigidly bound to it, and a change to another system can be a complex procedure. In this paper, we present our middleware solution MotionHub, which can receive and process data of different BTS technologies. It converts the spatial as well as the skeletal tracking data into a standardized format in real time and streams it to a client (e.g. a game engine). That way, MotionHub ensures that a client always receives the same skeletal-data structure, irrespective of the used BTS. As a simple interface enabling the user to easily change, set up, calibrate, operate and benchmark different tracking systems, the software targets artists and technicians. MotionHub is open source, and other developers are welcome to contribute to this project.

CCS CONCEPTS
• Computing methodologies → Motion processing; • Information systems → Multimedia streaming;

KEYWORDS
Body tracking, middleware, skeletal data, motion capture, Azure Kinect, OptiTrack

ACM Reference Format:
Philipp Ladwig, Kester Evers, Eric J. Jansen, Ben Fischer, David Nowottnik, and Christian Geiger. 2020. MotionHub: Middleware for Unification of Multiple Body Tracking Systems. In 7th International Conference on Movement and Computing (MOCO ’20), July 15–17, 2020, Jersey City/ Virtual, NJ, USA. ACM, New York, NY, USA, 8 pages. https://doi.org/10.1145/3401956.3404185

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
MOCO ’20, July 15–17, 2020, Jersey City/ Virtual, NJ, USA
© 2020 Copyright held by the owner/author(s). Publication rights licensed to the Association for Computing Machinery.
ACM ISBN 978-1-4503-7505-4/20/07...$15.00
https://doi.org/10.1145/3401956.3404185

1 INTRODUCTION
Real-time body tracking is an indispensable part of many interactive art performances, mixed-reality applications and games. Due to the progress in research, higher computing power and advances in sensor technology, a large number of BTSs that are suitable for such applications have been developed in recent years. Developers are spoilt for choice: They have to select a system and must consider various advantages and disadvantages concerning price, accuracy and size of the tracking area. However, in many cases not all requirements are known at the beginning of a project and will only be developed over time. This can pose a challenge as a subsequent change to another tracking system can be costly. In such a scenario, a middleware that allows for an effortless switch between different BTSs would be useful.

The term ‘middleware’ was probably first used in 1968 [35]. Since then, various sub types have been described, but the basic meaning has never changed: A middleware receives, processes and transmits information between at least two peers. Middlewares are often able to understand multiple information representations and to translate them into a standardized or unified one.

The contribution of this paper is the open-source middleware ‘MotionHub’ and an open-source game engine plugin, which can be seen in Fig. 1b and c. The intention behind our work is to unify the output data of the wide variety of available BTSs. We believe that a unification will not only lead to an easier switch between BTSs during production but that it will also reduce the preparation time for developments that involve BTSs. Integrating a BTS into an existing application requires a manual effort. Then, if developers want to integrate another BTS or exchange the previous one with it, an additional effort must be considered. In some cases, an update to a newer version of a BTS is desired, which produces an even higher manual effort. Our intention is to integrate a generic BTS protocol into applications once in order to reduce the overall time spent for maintenance, the setup and the switch between different BTSs.

Beyond this, more benefits can be mentioned. If a middleware understands different BTSs, it is possible to prioritize and to switch between them automatically to exploit the advantages of each system. A possible scenario could be the use of a BTS that is accurate but limited in tracking space and a different one that has a larger tracking space but is less accurate. A middleware could prioritize the more accurate BTS as long as the tracked person stays in the ‘accurate’ tracking space and automatically switches to the ‘less accurate’ space whenever he or she leaves it.

Despite the advantages, the concept of the MotionHub also has a major drawback: By nature, a middleware induces a delay because it receives, converts and resends the data between two ends, which requires a certain processing time.

A demo video with a summary of MotionHub’s capabilities can be watched here: https://youtu.be/7caURXod-ag.
The Git repository can be found here: https://github.com/Mirevi/MotionHub.

2 RELATED WORK
Body tracking is a wide field with many years of research and development, and it has produced a large number of different software and hardware approaches as well as standardizations and file formats. In this section, we only mention the most common and important ones for MotionHub.

2.1 Standards and File Formats
The probably best known and oldest de facto standard for data exchange in virtual reality is the Virtual-Reality Peripheral Network (VRPN) by Taylor et al. [42]. It offers simple and unified interfaces to a broad range of devices of different manufacturers. Many of these devices share common functionalities such as 6-DOF tracking or button input while the way of accessing these functions differs between manufacturers. VRPN unifies functions across different devices as generic classes such as vrpn_Tracker or vrpn_Button. Therefore, it can be seen as both a standard and a middleware. The approach of MotionHub is similar, but it focuses on body tracking.

The first official international standard for humanoid animation is the H-Anim, which was created within the scope of the Extensible 3D (X3D) standard and is a successor of the Virtual Reality Modeling Language (VRML) [13, 14]. H-Anim was published in 2006 [15], updated in 2019 [16, 17] and is one of the only efforts yet to create an official open standard for humanoid avatar motion and data exchange.

COLLADA and FBX are interchange file formats for 3D applications and are widely used today. While humanoid animation is not the focus of COLLADA, its open and versatile structure enables developers to save body tracking data. Compared to COLLADA, the proprietary FBX format mainly focuses on motion data but lacks a clear documentation, which has led to incompatible versions.

While COLLADA and FBX can also be used for writing and reading 3D geometry, the Biovision Hierarchy (BVH) file format was developed exclusively for handling skeletal motion data and is therefore simpler in structure. It is supported by many body tracking applications and, because of its simplicity and less overhead compared to other file formats, it is often used for real-time transmission of humanoid motion data. A deeper and more comprehensive overview of further tracking file formats is given by Meredith and Maddock [30].

2.2 Software and Hardware
Microsoft started shipping the Kinect in 2010 and thereby allowed the body tracking community to grow substantially since the Kinect was an affordable and capable sensor. During this time, PrimeSense made OpenNI [39] and NITE [39, p.15] publicly available. OpenNI offers low-level access to the Microsoft Kinect and other PrimeSense sensors, while NITE was a middleware that enables the user to