Human Motion Tracking and Recognition with Microsoft Kinect
Total Page:16
File Type:pdf, Size:1020Kb
Human Motion Tracking and Recognition with Microsoft Kinect Wenbing Zhao Cleveland State University 1 Outline Introduction to the Kinect technology Kinect applications Human motion tracking with Kinect Demo Human motion recognition with Kinect Publicly available dataset 2 The Kinect Technology Launched in 2010 as an enhancement device for the Xbox 360 game console “You are the controller”: Game playing using gestures and voices 24 million units sold as of Feb 2013 Software development kit (SDK) was released by Microsoft in late 2011 Third party SDKs were available before that Two generations of Kinect sensor Kinect v1: Kinect for Xbox, Kinect for Windows Kinect v2: Kinect for XboxOne What Kinect SDK provides 2D color image frames 3D depth image frames 3D skeletal frames with a set of joints in each skeleton 3 Steps of Motion Tracking and Analysis with Kinect 4 Kinect Depth Sensing Technology The depth-sensing technology used in Kinect v1 was developed by PrimeSense It uses structured light with a single infrared (IR) emitter and a single depth sensor for triangulation of depth in each pixel 5 Kinect Depth Sensing Technology The depth-sensing technology used in Kinect v2 was based on time- of-flight measurement The depth of each pixel can be calculated based on the phase shift between the emitted light and the reflected light A clever design: IR emitter is periodically turned on and off, and the output from the depth sensor is sent to two different ports when the light is on (port A) and off (B), respectively A-B: removes background ambient light => depth + high quality IR images 6 IR and Depth Image Quality Comparison 7 Feature Comparison of Kinect v1 & v2 8 Outline Introduction to the Kinect technology Kinect applications Demo Human motion tracking with Kinect Human motion recognition with Kinect Publicly available dataset 9 Kinect Applications 3D Reconstrucition Virtual Reality & Gaming Motion Sign Language R recognition Recognition Natural User Interface Education & Robotics Control Healthcare Retail Training Performing Arts & Interaction Physical Medical Fall Therapy Operation Detection 10 Kinect Applications In Healthcare 11 Exploration of medical images Controller-free exploration of medical image data: experiencing the Kinect, by Gallo, Placitelli, and Ciampi, Italy http://www.youtube.com/watch?v=CsIK8D4RLtY 12 Kinect Applications in Virtual Reality and Gaming 13 Kinect Applications in Natural User Interface Kinect Applications in Education and Performing Arts 14 Kinect Applications in Robotics Control and Interaction 15 Kinect Applications in Sign Language Recognition 16 Sign Language Translation Sign Language Recognition and Translation with Kinect, by Chai et al, Chinese Academy of Sciences & Microsoft Research Asia http://research.microsoft.com/en-us/collaboration/stories/kinect-sign-language- translator.aspx https://www.youtube.com/watch?v=HnkQyUo3134 17 Kinect Applications in Retail, Training, & 3D Reconstruction 18 Scanning 3D Full Human Bodies Tong, J., Zhou, J., Liu, L., Pan, Z., & Yan, H. (2012). Scanning 3D full human bodies using kinects. IEEE Transactions on Visualization and Computer Graphics, 18(4), 643-650. https://www.youtube.com/watch?v=OJZbmUVflYc 19 Kinect Application: Privacy-Aware Human Motion Tracking Patient privacy is protected by government regulations, such as the Health Insurance Portability and Accountability Act of 1996 (HIPAA) Patients cannot be videotaped without their explicit consent In the mean time, it could be very beneficial to monitor the activities of health caregivers: safe patient handling, training, etc. Inertial sensor based tracking preserves the patient privacy, however, this type of tracking is intrusive and cannot provide accurate and detailed information of the activities comparable to that of vision-based tracking (unless a large set of sensors attached to specific locations) 20 Privacy-Aware Human Motion Tracking Computer vision based tracking: non-intrusive, accurate, and detailed information human activities Problem: a camera is non-discriminative – everyone in the view is tracked! What we want: track only consented human subjects when they interact with other people The proposed system ensures the tracking of only consented personnel: captured information regarding non-consented personnel in the view is discarded immediately 21 Privacy-Aware Human Motion Tracking: Hardware Components Microsoft Kinect sensor (or any equivalent programmable depth camera) Computer (to run the Kinect motion tracking application) Wearable device, such as smart watch (worn by the consented personnel) Smart phone (needed only if the wearable device could not directly communicate with the application running on the computer) Ceiling Kinect#1 Kinect#2 Computer Wifi Caregiver#1 Smart Phone Wearable Device Bluetooth Bluetooth Wearable Device Patient Caregiver#2 22 Privacy-Aware Human Motion Tracking: Software Components Kinect Pebble Mobile Application Skeleton (PebbleKit JS) AppMessage XMLHttpRequest Stream Pebble pebble-js- Server Application app.js Application AppMessage XMLHttpResponse Pebble Smart Watch Smart Phone (JSON object) Computer 23 Discriminative Tracking Mechanism Simultaneous detection of a predefined gesture by both the wearable device worn by the consented personnel and the depth camera: personnel registration mechanism If detected: start to track the registered personnel If not, collected data (current frame) is discarded Workflow of the registration mechanism: Consented personnel walks in front of the camera He/she performs a predefined gesture with the wearable device Wearable device captures the gesture, sends a notification to the vision- side application Vision-based application examines the next frame to correlate the event to a specific skeleton detected 24 Discriminative Tracking Mechanism Predefined gesture: tapping the wearable device (worn on one wrist) with a finger => Distance between two wrists below a threshold For each human skeleton detected by the programmable camera, calculate the distance between two wrists The one with a distance below the threshold is the one to be tracked Need to examine the current frame only To be more robust, on the vision-side, the entire gesture could be tracked to minimize the chance of tracking a wrong subject Wrist-distance reduces to a minimum (< threshold) and then increases beyond the threshold Require the caching of more frames 25 Discriminative Tracking Mechanism Practical issues in discriminative tracking: Consented person might forget to register The person might not be in the view of the camera when registering The person might go out of the view of the camera => would require re-registration Reminder via the wearable device worn by the person Wearable Device Connected No Notification Sent Notification Sent Tap on Device No Connected Time Yes to Wearable to Wearable Detected Exceeded Threshold Device Device Yes Caregiver not in No Wrist-Wrist Distance Below Kinect View Threshold (Kinect) Yes Caregiver Registration Complete 26 Registration Latency 0.008" • Mean round-trip time is 262ms: tolerable 0.007" for caregiver registration. 0.006" • The skeleton frame examined by the Func2on* 0.005" server application will be 4-5 frames late most of the time 0.004" Density* 0.003" 0.002" Probability* 0.001" 0" 0" 100" 200" 300" 400" 500" 600" 700" 800" 900" 1000" Round5Trip*Latency*(ms)* Kinect Pebble Mobile Application Skeleton (PebbleKit JS) AppMessage XMLHttpRequest Stream Pebble pebble-js- Server Application app.js Application AppMessage XMLHttpResponse Pebble Smart Watch Smart Phone (JSON object) Computer 27 0.8" 0.7" 0.6" Registration 0.5" 0.4" 0.3" 0.2" 0.1" Gesture 0" 0" 5" 10" 15" 20" 25" Recognition 0.8" 0.7" 0.6" 0.5" 0.4" 0.3" 0.2" 0.1" 0" Tapping gesture: 0" 5" 10" 15" 20" 25" •1.5 seconds or so 0.9" •Wrist-to-wrist distance < 0.1m 0.8" 0.7" 0.6" 0.5" 0.4" 0.3" 0.2" 0.1" 0" 0" 5" 10" 15" 20" 0.8" 0.7" 0.6" 0.5" 0.4" 0.3" 0.2" 0.1" 0" 0.5" 1" 1.5" 2" 2.5" 3" 3.5" 4" 28 A Use Case: Safe Patient Handling Monitoring Nursing aids might not follow best practices when handling patients Privacy-aware vision-based human motion tracking could be used to help increase compliance Generate haptic notification via wearable device, together with a short descriptive message, when non- compliant activities are detected Such incidence, as well as statistics for correct activities are logged 29 Demo 30 Outline Introduction to the Kinect technology Kinect applications Demo Human motion tracking with Kinect Human motion recognition with Kinect Publicly available dataset 31 Human Detection and Tracking Object detection and tracking Main approach: background subtraction => foreground humans RGB-images sensitive to illuminating changes, even if camera is static Depth data helps to establish a more stable background model Pose estimation: skeletal tracking Per-pixel body part classification Estimating body joint positions by computing a local centroids of the body part probability mass using mean shift mode detection 32 Object Detection and Tracking Depth data Benefit: more robust against illuminating changes Disadvantage: limited dimension => less discriminative in representing an object May be beneficial to combine both depth and RGB data Common background subtraction algorithms Gaussian mixture model (GMM) Histograms of oriented gradients (HOG) Pyramid HOG Local binary patterns (LBP) Local ternary patterns (LTP) Census transform histograms