Extended Multitouch: Recovering Touch Posture and Differentiating Users Using a Depth Camera

Total Page:16

File Type:pdf, Size:1020Kb

Extended Multitouch: Recovering Touch Posture and Differentiating Users Using a Depth Camera Extended Multitouch: Recovering Touch Posture and Differentiating Users using a Depth Camera Sundar Murugappan1, Vinayak1, Niklas Elmqvist2, Karthik Ramani1;2 1School of Mechanical Engineering and 2School of Electrical and Computer Engineering Purdue University, West Lafayette, IN 47907, USA fsmurugap,fvinayak,elm,[email protected] Figure 1. Extracting finger and hand posture from a Kinect depth camera (left) and integrating with a pen+touch sketch interface (right). ABSTRACT INTRODUCTION Multitouch surfaces are becoming prevalent, but most ex- Multitouch surfaces are quickly becoming ubiquitous: from isting technologies are only capable of detecting the user’s wristwatch-sized music players and pocket-sized smart- actual points of contact on the surface and not the identity, phones to tablets, digital tabletops, and wall-sized displays, posture, and handedness of the user. In this paper, we define virtually every surface in our everyday surroundings may the concept of extended multitouch interaction as a richer in- soon come to life with digital imagery and natural touch in- put modality that includes all of this information. We further put. However, achieving this vision of ubiquitous [27] sur- present a practical solution to achieve this on tabletop dis- face computing requires overcoming several technological plays based on mounting a single commodity depth camera hurdles, key among them being how to augment any physical above a horizontal surface. This will enable us to not only surface with touch sensing. Conventional multitouch tech- detect when the surface is being touched, but also recover the nology relies on either capacitive sensors embedded in the user’s exact finger and hand posture, as well as distinguish surface itself, or rear-mounted cameras capable of detect- between different users and their handedness. We validate ing the touch points of the user’s hand on the surface [10]. our approach using two user studies, and deploy the tech- Both approaches depend on heavily instrumenting the sur- nique in a scratchpad tool and in a pen + touch sketch tool. face, which is not feasible for widespread deployment. In this context, Wilson’s idea [30] of utilizing a front- ACM Classification Keywords mounted depth camera to sense touch input on any physical H.5.2 Information Interfaces and Presentation: User surface is particularly promising for the following reasons: Interfaces—Interaction styles; I.3.6 Computer Graphics: Methodology and Techniques—Interaction techniques • Minimal instrumentation: Because they are front- mounted, depth cameras integrate well with both pro- General Terms jected imagery as well as conventional screens; Design, Algorithms, Human Factors • Surface characteristics: Any surface, digital or inani- mate, flat or non-flat, horizontal or vertical, can be instru- Author Keywords mented to support touch interaction; Multitouch, tabletop, depth camera, pen + touch, evaluation. • On and above: Interaction both on as well as above the surface can be detected and utilized for input; Permission to make digital or hard copies of all or part of this work for • Low cost: Depth cameras have recently become com- personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies modity products with the rise of the Microsoft Kinect, bear this notice and the full citation on the first page. To copy otherwise, or originally developed for full-body interaction with the republish, to post on servers or to redistribute to lists, requires prior specific Xbox game console, but usable with any computer. permission and/or a fee. UIST’12, October 7–10, 2012, Cambridge, Massachusetts, USA. In this paper, we define a richer touch interaction modality Copyright 2012 ACM 978-1-4503-1580-7/12/10...$15.00. that we call extended multitouch (EMT) which includes not only sensing multiple points of touch on and above a sur- surface, but their system requires a black background for re- face using a depth camera, but also (1) recovering finger, liable recognition. Wang et al. [25] detect the directed fin- wrist and hand posture of the user, as well as (2) differenti- ger orientation vector from contact information in real time ating users and (3) their handedness. We have implemented by considering the dynamics of the finger landing process. a Windows utility that interfaces with a Kinect camera, ex- Dang et al. [4] mapped fingers to their associated hand by tracts the above data from the depth image in real-time, and making use of constraints applied to the touch position with publishes the information as a global hook to other appli- finger orientation. Freeman et al. [7] captured the user’s cations, or using the TUIO [16] protocol. To validate our hand image on a Microsoft Surface to distinguish hand pos- approach, we have also conducted two empirical user stud- tures through vision techniques. Holz et al. [14] proposed ies: one for measuring the tracking accuracy of the touch a generalized perceived input point model in which finger- sensing functionality, and the other for measuring the finger prints were extracted and recognized to provide accurate and hand posture accuracy of our utility. touch information and user identification. Finally, Zhang et al. [34] used the orientation of the touching finger for dis- The ability to not only turn any physical surface into a touch- criminating user touches on a tabletop. sensitive one, but also to recover full finger and hand pos- ture of any interaction performed on the surface, opens up an entirely new design space of interaction modalities. We Glove-based Techniques have implemented two separate touch applications for the Gloves are widely popular in virtual and augmented reality purpose of exploring this design space. The first is a proto- environments [22]. Wang et al. [26] used a single camera to type scratchpad where multiple users can draw simple vector track a hand wearing an ordinary cloth glove imprinted with graphics using their fingers, each finger on each user being a custom pattern. Marquardt et al. [20] used a fiduciary- separately identified and drawing a unique color and stroke. tagged gloves on interactive surfaces to gather input about The second application is called USKETCH, and is an ad- many parts of a hand and to discriminate between one per- vanced sketching application for creating beautified mechan- son’s or multiple peoples’ hands. Although simple and ro- ical engineering sketches that can be used for further finite bust, these approaches require instrumented gear which is element analysis. uSketch integrates pen-based input that we not always desirable compared to using bare hands. use for content creation with our touch sensing functionality that we use for navigation and mode changes. Specialized Hardware Specialized hardware is capable of deriving additional touch RELATED WORK properties beyond the mere touch points. Benko et al. [1] Our work is concerned with touch interaction, natural user sensed muscle activity through forearm electromyography to interfaces, and the integration of several input modalities. provide additional information about hand and finger move- ments away from the surface. Lopes et al. [17] integrated Multitouch Technologies touch with sound to expand the expressiveness of touch in- Several different technologies exist for sensing multiple terfaces by supporting recognition of different body parts, touches on a surface, including capacitance, RFID, and such as fingers and knuckles. TapSense [12] is a similar vision-based approaches. Vision-based approaches are par- acoustic sensing system that allows conventional surfaces to ticularly interested for our purposes since they are suitable identify the type of object being used for input such as fin- for surfaces with large form factor, such as tabletops and gertip, pad and nail. The above systems require additional walls [10, 19, 29]. These systems rely on the presence of a instrumentation of either the user or the surface and take a dedicated surface for interaction. significant amount of time for setup. Background noise is also a deterrent to use acoustic sensing techniques. Recently, depth-sensing cameras are beginning to be used in various interactive applications where any surface can be Pen and Touch-based Interaction made interactive. Wilson [30] explored the use of depth A few research efforts have explored the combination of pen sensing cameras to sense touch on any surface. Omni- and touch [2, 8, 33]. Hinckley et al. [13] described tech- Touch [11] is a wearable depth-sensing and projection sys- niques for direct pen + touch input for a prototype, cen- tem that enables interactive multitouch applications on ev- tered on note-taking and scrapbooking of materials. Lopes et eryday surfaces. Lightspace [31] is a small room installation al. [18] assessed the suitability of combining pen and multi that explores interactions between on, above and between touch interactions for 3D sketching. Through experimen- surfaces combining multiple depth cameras and projectors. tal evidence, they demonstrated that using bimanual inter- actions simplifies work flows and lowers task times, when Sensing Touch Properties compared to the pen-only interface. Going beyond sensing touch and deriving properties of the touches requires more advanced approaches. We classify ex- isting approaches for this into vision-based techniques, in- OVERVIEW: EXTENDED MULTITOUCH INTERACTION strumented glove-based tracking, and specialized hardware. Traditional multitouch interaction [5] is generally defined as the capability to detect multiple points of touch input on a surface. Here, we enrich this concept into what we call ex- Vision-based Techniques tended multitouch interaction (EMT), defined as follows: Malik et al. [19] used two cameras mounted above the sur- face to distinguish which finger of which hand touched a • Detecting multiple touch points on and above a surface; 2 • Recovering finger, wrist and hand postures as well as the handedness (i.e., whether using the left or right hand); and • Distinguishing between users interacting with the surface.
Recommended publications
  • Supporting Everyday Activities Through Always-Available Mobile Computing
    ©Copyright 2010 Timothy Scott Saponas Supporting Everyday Activities through Always-Available Mobile Computing Timothy Scott Saponas A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy University of Washington 2010 Program Authorized to Offer Degree: Department of Computer Science and Engineering University of Washington Graduate School This is to certify that I have examined this copy of a doctoral dissertation by Timothy Scott Saponas and have found that it is complete and satisfactory in all respects, and that any and all revisions required by the final examining committee have been made. Co-Chairs of the Supervisory Committee: James A. Landay Desney S. Tan Reading Committee: James A. Landay Desney S. Tan James A. Fogarty Date:______________________________________ In presenting this dissertation in partial fulfillment of the requirements for the doctoral degree at the University of Washington, I agree that the Library shall make its copies freely available for inspection. I further agree that extensive copying of the dissertation is allowable only for scholarly purposes, consistent with ―fair use‖ as prescribed in the U.S. Copyright Law. Requests for copying or reproduction of this dissertation may be referred to ProQuest Information and Learning, 300 North Zeeb Road, Ann Arbor, MI 48106-1346, 1-800-521-0600, to whom the author has granted ―the right to reproduce and sell (a) copies of the manuscript in microform and/or (b) printed copies of the manuscript made from microform.‖ Signature __________________________ Date ______________________________ University of Washington Abstract Supporting Everyday Activities through Always-Available Mobile Computing T. Scott Saponas Co-Chairs of the Supervisory Committee: Professor James A.
    [Show full text]
  • Smart Workstation with a Kinect-Projector Assistance System for Manual Assembly
    Lorenzo Andreoli Smart Workstation with a Kinect-Projector Assistance System for Manual Assembly Master’s thesis in Global Manufacturing Management Supervisor: Fabio Sgarbossa June 2019 Master’s thesis Master’s NTNU Faculty of Engineering Faculty Norwegian University of Science and Technology of Science University Norwegian Department of Mechanical and Industrial Engineering and Industrial Department of Mechanical Lorenzo Andreoli Smart Workstation with a Kinect-Projector Assistance System for Manual Assembly Master’s thesis in Global Manufacturing Management Supervisor: Fabio Sgarbossa June 2019 Norwegian University of Science and Technology Faculty of Engineering Department of Mechanical and Industrial Engineering Executive Summary Although automation has increased more and more in manufacturing companies over the last decades, manual labor is still used in a variety of complex tasks and is currently irreplaceable, especially in assembly operations. The problem of assisting and supporting the human worker during potentially complex assembly tasks is, therefore, very relevant. Clear and easy-to-read assembly instructions, error-proofing methods, and an intuitive user interface for the worker have the potential to reduce the cognitive workload of the operator, increase the productivity, improve the quality, reduce defects, and consequently reduce costs. Digital technologies such as augmented reality and motion recognition sensors can help assembly workers in their tasks and have been subject of research with increased interest over the years. In this thesis, we develop a prototype of a smart workstation equipped with a Kinect-projector assistance system for manual assembly. By following a V-model for systems development approach, we identify key requirements for assistance systems for both continuously supporting workers and teaching assembly steps to workers.
    [Show full text]
  • Using a Depth Camera As a Touch Sensor
    ITS 2010: Devices & Algorithms November 7-10, 2010, Saarbr ¨ucken, Germany Using a Depth Camera as a Touch Sensor Andrew D. Wilson Microsoft Research Redmond, WA 98052 USA [email protected] ABSTRACT In comparison with more traditional techniques, such as We explore the application of depth-sensing cameras to capacitive sensors, the use of depth cameras to sense touch detect touch on a tabletop. Limits of depth estimate resolu- has the following advantages: tion and line of sight requirements dictate that the determi- • The interactive surface need not be instrumented. nation of the moment of touch will not be as precise as that • The interactive surface need not be flat. of more direct sensing techniques such as capacitive touch • Information about the shape of the users and users’ screens. However, using a depth-sensing camera to detect arms and hands above the surface may be exploited in touch has significant advantages: first, the interactive sur- useful ways, such as determining hover state, or that face need not be instrumented. Secondly, this approach multiple touches are from the same hand or user. allows touch sensing on non-flat surfaces. Finally, infor- mation about the shape of the users and their arms and However, given the depth estimate resolution of today’s hands above the surface may be exploited in useful ways, depth-sensing cameras, and the various limitations imposed such as determining hover state, or that multiple touches by viewing the user and table from above, we expect that are from same hand or from the same user. We present relying exclusively on the depth camera will not give as techniques and findings using Microsoft Kinect.
    [Show full text]
  • (MRT): a Low-Cost Teleconferencing Framework for Mixed-Reality Applications
    Mixed Reality Tabletop (MRT): A Low-Cost Teleconferencing Framework for Mixed-Reality Applications Daniel Bekins, Scott Yost, Matthew Garrett, Jonathan Deutsch, Win Mar Htay, Dongyan Xu+, and Daniel Aliaga* Department of Computer Science at Purdue University There is a wide spectrum of application scenarios made pos- ABSTRACT sible by the MRT environment. Given its low-cost infrastructure, Today’s technology enables a rich set of virtual and mixed- schools can install the system to enable children at several differ- reality applications and provides a degree of connectivity and ent stations to work collaboratively on a virtual puzzle, a painting, interactivity beyond that of traditional teleconferencing scenarios. or learn a skill like origami from a remotely stationed teacher. The In this paper, we present the Mixed-Reality Tabletop (MRT), an MRT keeps children focused on the learning task instead of com- example of a teleconferencing framework for networked mixed puter interaction. Students could also use real-world objects to reality in which real and virtual worlds coexist and users can fo- participate in physical simulations such as orbital motion, colli- cus on the task instead of computer interaction. For example, sions, and fluid flow in a common virtual environment. Because students could use real-world objects to participate in physical there are no physical input devices involved, several students can simulations such as orbital motion, collisions, and fluid flow in a participate on the same station without restriction. common virtual environment. Our framework isolates the low- Our general framework provides the essential functionality level system details from the developer and provides a simple common to networked mixed-reality environments wrapped in a programming interface for developing novel applications in as flexible application programmer interface (API).
    [Show full text]
  • A Surface Technology with an Electronically Switchable
    Going Beyond the Display: A Surface Technology with an Electronically Switchable Diffuser Shahram Izadi1, Steve Hodges1, Stuart Taylor1, Dan Rosenfeld2, Nicolas Villar1, Alex Butler1 and Jonathan Westhues2 1 Microsoft Research Cambridge 2 Microsoft Corporation 7 JJ Thomson Avenue 1 Microsoft Way Cambridge CB3 0FB, UK Redmond WA, USA {shahrami, shodges, stuart, danr, nvillar, dab, jonawest}@microsoft.com Figure 1: We present a new rear projection-vision surface technology that augments the typical interactions afforded by multi-touch and tangible tabletops with the ability to project and sense both through and beyond the display. In this example, an image is projected so it appears on the main surface (far left). A second image is projected through the display onto a sheet of projection film placed on the surface (middle left). This image is maintained on the film as it is lifted off the main surface (middle right). Finally, our technology allows both projections to appear simultaneously, one displayed on the surface and the other on the film above, with neither image contaminating the other (far right). ABSTRACT Keywords: Surface technologies, projection-vision, dual We introduce a new type of interactive surface technology projection, switchable diffusers, optics, hardware based on a switchable projection screen which can be made INTRODUCTION diffuse or clear under electronic control. The screen can be Interactive surfaces allow us to manipulate digital content continuously switched between these two states so quickly in new ways, beyond what is possible with the desktop that the change is imperceptible to the human eye. It is then computer. There are many compelling aspects to such sys- possible to rear-project what is perceived as a stable image tems – for example the interactions they afford have analo- onto the display surface, when the screen is in fact transpa- gies to real-world interactions, where we manipulate ob- rent for half the time.
    [Show full text]
  • Extended Multitouch: Recovering Touch Posture, Handedness, and User Identity Using a Depth Camera
    Extended Multitouch: Recovering Touch Posture, Handedness, and User Identity using a Depth Camera Sundar Murugappana, Vinayaka, Niklas Elmqvistb, Karthik Ramania aSchool of Mechanical Engineering bSchool of Electric Engineering Purdue University, West Lafayette, IN 47907 [email protected] Figure 1. Extracting finger and hand posture from a Kinect depth camera (left) and integrating with a pen+touch sketch interface (right). ABSTRACT Author Keywords Multitouch surfaces are becoming prevalent, but most ex- Multitouch, tabletop displays, depth cameras, pen + touch, isting technologies are only capable of detecting the user’s User studies. actual points of contact on the surface and not the identity, posture, and handedness of the user. In this paper, we define the concept of extended multitouch interaction as a richer in- INTRODUCTION put modality that includes all of this information. We further Multitouch surfaces are quickly becoming ubiquitous: from present a practical solution to achieve this on tabletop dis- wristwatch-sized music players and pocket-sized smart- plays based on mounting a single commodity depth camera phones to tablets, digital tabletops, and wall-sized displays, above a horizontal surface. This will enable us to not only virtually every surface in our everyday surroundings may detect when the surface is being touched, but also recover the soon come to life with digital imagery and natural touch in- user’s exact finger and hand posture, as well as distinguish put. However, achieving this vision of ubiquitous [28] sur- between different users and their handedness. We validate face computing requires overcoming several technological our approach using two user studies, and deploy the technol- hurdles, key among them being how to augment any phys- ogy in a scratchpad application as well as in a sketching tool ical surface with touch sensing.
    [Show full text]
  • Beyond Flat Surface Computing
    BEYOND FLAT SURFACE COMPUTING Hrvoje Benko Microsoft© Research ACM MultiMedia 2009 – Beijing, China Surface Computing Holowall, ‘97 Augmented Surfaces, ‘99 Microsoft Surface Apple iPhone DiamondTouch, ‘01 PlayAnywhere, ‘05 Perceptive Pixel TouchWall Smart Table FTIR Display, ‘06 TouchLight, ‘06 Microsoft© Research Microsoft Surface Microsoft© Research Surface Computing An interface where instead of through indirect input devices (mice and keyboard) the user is interacting directly with the content on the screen’s surface. Direct un-instrumented interaction! Microsoft© Research Surface Computing “Surface computing is the term for the use of a specialized computer GUI in which traditional GUI elements are replaced by intuitive, everyday objects.” Wikipedia Content is the interface! Microsoft© Research Digital vs. Real Microsoft© Research Beyond Flat Surface Computing Transcend the flat two-dimensional surface and typical 2D media associated with it and explore the curved, three-dimensional interfaces that cross the boundary between the digital and physical world. Direct un-instrumented interaction . Content is the interface Microsoft© Research Two Approaches 1. Non-flat interactive surfaces 2. Depth-aware interactions above the surface Microsoft© Research Approach #1 Enable touch and gesture interactions on non-flat surfaces. Microsoft© Research Sphere Benko, Wilson, & Balakrishnan, ACM UIST 2008 Microsoft© Research Microsoft© Research Microsoft© Research Reusing the Optical Path Microsoft© Research Sensing & Projection Distortions Microsoft© Research Unique Properties = Opportunities . Borderless, but finite display . Non-visible hemisphere . No master user position/orientation . Smooth transitions between Vertical and horizontal Near and far Shared and private Microsoft© Research Sphere Interactions Microsoft© Research OmniDirectionalSphere ProjectorProjector Microsoft© Research Everywhere Displays Pinhanez et al. ‘01 Microsoft© Research Microsoft© Research Pinch-the-Sky Dome Microsoft© Benko, Wilson, and Fay, 2009 Research Omni-Directional Content .
    [Show full text]
  • Bringing the Physical to the Digital: a New Model for Tabletop Interaction
    Bringing the Physical to the Digital: A New Model for Tabletop Interaction Otmar Hilliges München 2009 Bringing the Physical to the Digital: A New Model for Tabletop Interaction Otmar Hilliges Dissertation an der Fakultät für Mathematik, Informatik und Statistik der Ludwig-Maximilians-Universität München vorgelegt von Otmar Hilliges aus München München, den 19. Januar 2009 iv Erstgutachter: Prof. Dr. Andreas Butz Zweitgutachter: Prof. Dr. Sheelagh Carpendale Drittgutachter: Dr. Shahram Izadi Viertgutachter: Dr. Andrew D. Wilson Tag der mündlichen Prüfung: 22. Juli 2009 Abstract v Abstract This dissertation describes an exploration of digital tabletop interaction styles, with the ultimate goal of informing the design of a new model for tabletop interaction. In the context of this thesis the term digital tabletop refers to an emerging class of devices that afford many novel ways of interaction with the digital. Allowing users to directly touch information presented on large, horizontal displays. Being a relatively young field, many developments are in flux; hardware and software change at a fast pace and many interesting alternative approaches are available at the same time. In our research we are especially interested in systems that are capable of sensing multiple contacts (e.g., fingers) and richer information such as the outline of whole hands or other physical objects. New sensor hardware enable new ways to interact with the digital. When embarking into the research for this thesis, the question which interaction styles could be appropriate for this new class of devices was a open question, with many equally promising answers. Many everyday activities rely on our hands ability to skillfully control and manipulate physi- cal objects.
    [Show full text]