Motion-Capture-Based Hand Gesture Recognition for Computing and Control Andrew Gardner Louisiana Tech University
Total Page:16
File Type:pdf, Size:1020Kb
Louisiana Tech University Louisiana Tech Digital Commons Doctoral Dissertations Graduate School Summer 2017 Motion-capture-based hand gesture recognition for computing and control Andrew Gardner Louisiana Tech University Follow this and additional works at: https://digitalcommons.latech.edu/dissertations Part of the Artificial Intelligence and Robotics Commons, and the Statistics and Probability Commons Recommended Citation Gardner, Andrew, "" (2017). Dissertation. 57. https://digitalcommons.latech.edu/dissertations/57 This Dissertation is brought to you for free and open access by the Graduate School at Louisiana Tech Digital Commons. It has been accepted for inclusion in Doctoral Dissertations by an authorized administrator of Louisiana Tech Digital Commons. For more information, please contact [email protected]. MOTION-CAPTURE-BASED HAND GESTURE RECOGNITION FOR COMPUTING AND CONTROL by Andrew Gardner, B.S., M.S. A Dissertation Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy COLLEGE OF ENGINEERING AND SCIENCE LOUISIANA TECH UNIVERSITY August 2017 ProQuest Number: 10753664 All rights reserved INFORMATION TO ALL USERS The quality of this reproduction is dependent upon the quality of the copy submitted. In the unlikely event that the author did not send a complete manuscript and there are missing pages, these will be noted. Also, if material had to be removed, a note will indicate the deletion. ProQuestQue ProQuest 10753664 Published by ProQuest LLC(2018). Copyright of the Dissertation is held by the Author. All rights reserved. This work is protected against unauthorized copying under Title 17, United States Code. Microform Edition © ProQuest LLC. ProQuest LLC 789 East Eisenhower Parkway P.O. Box 1346 Ann Arbor, Ml 48106-1346 LOUISIANA TECH UNIVERSITY THE GRADUATE SCHOOL 5 /15/2017 Date We hereby recommend that the dissertation prepared under our supervision by Andrew Bryan Gardner entitled_____________________________________________________________________ _________________ Motion-Capture-Based Hand Gesture Recognition for Computing and Control be accepted in partial fulfillment of the requirements for the Degree of Doctor of Philosophy in Computational Analysis and Modeling I f "Supervisor of Dissertation Research ... i f c ..................... ....... U Head of Department Computational Analysis and Modeling Department Recommendation concurred in: ^ -rL vyyCa-ns Advisory Committee Approved ApWjdved:V ^ r ? * / / Director of Graduate Studies Dean of the Graduate School Dean of the College GS Form 13a (6/07) ABSTRACT This dissertation focuses on the study and development of algorithms that enable the analysis and recognition of hand gestures in a motion capture environment. Central to this work is the study of unlabeled point sets in a more abstract sense. Evaluations of proposed methods focus on examining their generalization to users not encountered during system training. In an initial exploratory study, we compare various classification algorithms based upon multiple interpretations and feature transformations of point sets, including those based upon aggregate features (e.g. mean) and a pseudo-rasterization of the capture space. We find aggregate feature classifiers to be balanced across multiple users but relatively limited in maximum achievable accuracy. Certain classifiers based upon the pseudo-rasterization performed best among tested classification algorithms. We follow this study with targeted examinations of certain subproblems. For the first subproblem, we introduce the a fortiori expectation-maximization (AFEM) algorithm for computing the parameters of a distribution from which un labeled, correlated point sets are presumed to be generated. Each unlabeled point is assumed to correspond to a target with independent probability of appearance but correlated positions. We propose replacing the expectation phase of the algo rithm with a Kalman filter modified within a Bayesian framework to account for the unknown point labels which manifest as uncertain measurement matrices. We also propose a mechanism to reorder the measurements in order to improve parameter estimates. In addition, we use a state-of-the-art Markov chain Monte Carlo sampler to efficiently sample measurement matrices. In the process, we indirectly propose a constrained /c-means clustering algorithm. Simulations verify the utility of AFEM against a traditional expectation-maximization algorithm in a variety of scenarios. In the second subproblem, we consider the application of positive definite kernels and the earth mover’s distance (EMD) to our work. Positive definite kernels are an important tool in machine learning that enable efficient solutions to otherwise difficult or intractable problems by implicitly linearizing the problem geometry. We develop a set-theoretic interpretation of EMD and propose earth mover’s intersection (EMI), a positive definite analog to EMD. We offer proof of EMD’s negative definiteness and provide necessary and sufficient conditions for EMD to be conditionally negative definite, including approximations that guarantee negative definiteness. In particular, we show that EMD is related to various min-like kernels. We also present a positive definite preserving transformation that can be applied to any kernel and can be used to derive positive definite EMD-based kernels, and we show that the Jaccard index is simply the result of this transformation applied to set intersection. Finally, we evaluate kernels based on EMI and the proposed transformation versus EMD in various computer vision tasks and show that EMD is generally inferior even with indefinite kernel techniques. Finally, we apply deep learning to our problem. We propose neural network architectures for hand posture and gesture recognition from unlabeled marker sets in a coordinate system local to the hand. As a means of ensuring data integrity, we also V propose an extended Kalman filter for tracking the rigid pattern of markers on which the local coordinate system is based. We consider fixed- and variable-size architectures including convolutional and recurrent neural networks that accept unlabeled marker input. We also consider a data-driven approach to labeling markers with a neural network and a collection of Kalman filters. Experimental evaluations with posture and gesture datasets show promising results for the proposed architectures with unlabeled markers, which outperform the alternative data-driven labeling method. APPROVAL FOR SCHOLARLY DISSEMINATION The author grants to the Prescott Memorial Library o f Louisiana Tech University the right to reproduce, by appropriate methods, upon request, any or all portions o f this Dissertation. It is understood that “proper request” consists o f the agreement, on the part o f the requesting party, that said reproduction is for his personal use and that subsequent reproduction will not occur without written approval o f the author o f this Dissertation. Further, any portions o f the Dissertation used in books, papers, and other works must be appropriately referenced to this Dissertation. Finally, the author o f this Dissertation reserves the right to publish freely, in the literature, at any time, any or all portions o f this Dissertation. Author pL/At* f~~ Date 7/S/^>l7 GS Form 14 (5/03) DEDICATION This dissertation is dedicated to my loving and supportive parents, without whom it would likely have never been completed. TABLE OF CONTENTS ABSTRACT................................................................................................................. iii DEDICATION.............................................................................................................. vii LIST OF TABLES.........................................................................................................xiii LIST OF FIGURES.......................................................................................................xiv ACKNOWLEDGMENTS............................................................................................ xvii CHAPTER 1 INTRODUCTION........................................................................... 1 1.1 An Overview of Hand Gesture Recognition ................................................ 2 1.2 Problem Statement and Setting .................................................................. 6 1.2.1 Problem Setting ................................................................................ 6 1.2.2 Key Objectives ................................................................................. 10 1.3 Contribution .................................................................................................. 12 1.4 Limitations of the Study .............................................................................. 13 1.5 Organization of Dissertation ....................................................................... 14 CHAPTER 2 BACKGROUND............................................................................... 15 2.1 Fundamentals ................................................................................................ 15 2.1.1 Linear Algebra .................................................................................. 16 2.1.2 Quaternions ...................................................................................... 17 2.1.3 Metrics............................................................................................... 20 2.1.4 Kernels .............................................................................................