Kinect Based Suspicious Posture Recognition for Real-Time Home Security Applications

Kinect Based Suspicious Posture Recognition for Real-Time Home Security Applications Mandikal Vikram*, Aditya Anantharaman*, Suhas BS*, Ashwin TS, Ram Mohana Reddy Guddeti Department of Information Technology National Institute of Technology Karnataka, Surathkal, India 575025 Email: f15it217.vikram, 15it201.aditya.a, [email protected], fashwindixit9, [email protected] Abstract—Suspicious posture recognition is a paramount task Our Approach for real-time home security consists of: with numerous applications in everyday life. We explore one • Using Kinect depth camera to detect human postures from such application in real-time home security using the Microsoft Kinect depth camera. We propose a novel method where the the skeleton. remote device itself detects the suspicious activity. The server is • Classify postures as suspicious or normal using logistic alerted by the remote device in case of a suspicious activity which regression. further alerts the home owners immediately. We show that our • In case the posture is suspicious, then alert the homeown- method, works in real-time, is robust towards changing lighting ers by a push notification on their phones. conditions and the computations happen on the remote device itself which makes it suitable for real-time home security. Keywords—Posture recognition, Home security, Kinect, An- All the computations involving the inference part of the droid app machine learning algorithm, occur on the remote device itself. I. INTRODUCTION This avoids unnecessary communication overhead between the server and the remote device. The remote device communi- Home security is a very important domain of security. cates with the server only when it detects a suspicious activity. The growing crime rates across cities showcase the bitter reality. Many people underestimate the need to put in place II. RELATED WORKS appropriate home security measures. A burglary or theft can often lead to devastating consequences for the homeowners. Kinect is extensively used by researchers for gesture and The current security systems rely largely on visual features posture recognition, fall detection, in-home security and such as facial features which can easily be masked. The surveillance and 3D modeling. The IR Sensor or the depth criminals could mask themselves with sunglasses, caps or camera of Kinect was used by [1] for fall detection. In this use more sophisticated techniques to deceive such visual work the authors have demonstrated a real-time application to feature dependent security systems. Also, these systems fail detect walking falls accurately and robustly. The depth camera in adverse lighting conditions. was used by [2] for gesture recognition. Here, the authors used the body-joint positions obtained from Kinect and then applied Hence, we need reliable intelligent systems which can machine learning techniques on these, to classify the gestures. alert homeowners at real-time, irrespective of the lighting In [3], authors developed a home security surveillance sys- conditions of the place or the clothes worn by the criminals. tem which can detect abnormal activities such as intrusion In this work, we propose a real-time system for home detection, fire, smoke etc. using sensors and CMOS camera. security using the Microsoft Kinect depth camera which A Short Message Service (SMS) or Multimedia Messaging checks for suspicious postures outside the door and alerts Service (MMS) notification is sent on detection of suspicious the homeowners using a push notification on their phones. activity. The authors in [4] developed a multi-sensor fusion Since our method uses the skeletal information captured by algorithm that includes Kinect devices also, to detect abnormal the Kinect depth camera (which is done using infrared rays) activities among the elderly. They used numerous sensors such and not the visual features, it is thus robust towards lighting as bands, Zenith cameras and other wearables which contain conditions and clothes of criminals (intruders), which makes accelerometer, gyroscope, heart rate sensor etc. to monitor it suitable for home security applications. the patient. The work [5] discussed about monitoring elderly home occupants. The authors used visual features from Kinect Our main contribution in this paper is a novel real-time devices placed at multiple locations inside the house to learn home security solution for suspicious posture detection which a few fuzzy parameters. These parameters are further used can be deployed remotely with minimalistic interaction with to detect abnormal behavior patterns. A disadvantage of this the web server. work is the usage of multiple cameras for different locations, which might not be practically viable. * equal contribution To the best of our knowledge, there have been no works 978-1-5386-8325-7/18/$31.00 ©2018 IEEE on home security using Kinect’s depth camera. Table I summarizes some of the works that use Kinect, with their B. Classification using Logistic Regression methodology, merits and demerits. Logistic regression is a simple classifier that is widely used for binary classification and the details are shown in Algorithm III. PROPOSED METHODOLOGY 1. It can describe and find relationships between a dependent We propose a novel technique for real time home security and an independent variable. The sigmoid or logistic function using suspicious posture recognition with the help of Kinect is defined as follows: depth camera and logistic regression. The proposed method- 1 σ¹zº = (1) ology consists of the following steps: 1 + exp(−zº • Data Acquisition: Capture skeleton information using The function(1) also serves as the hypothesis. It can also be Kinect depth sensor. observed that the range of this function is [0, 1]. So, this • Data Processing and Feature Extraction: Features ex- function provides a probability measure for classification. For tracted by calculating the joint angles. labeled training examples ¹xi; yiº : i = 1::m, the cost function • Posture Recognition using Logistic regression. is defined as follows: • Alerting the homeowners using a push notification. m 1 Õ The proposed methodology is illustrated in Fig. 1. The Mi- J¹θº = − ¹ y¹iºlog¹h ¹x¹iººº+¹1−y¹iººlog¹1−h ¹x¹iººº (2) m θ θ crosoft Kinect depth camera captures the skeletal information i=1 of the subject. The angles between the joints are used as the The cost function (2) measures how well the hypothesis fits features to classify the postures. This is done on a frame to the data. The objective is to minimize the cost function. We use frame basis - if the classifier classifies the posture as suspicious gradient descent for this purpose. The weights are modified as for a certain threshold number of continuous frames, a photo follows: is taken using a camera - not necessarily the Kinect camera, θ θ α¹y¹iº − h ¹x¹iºººx¹iº it could be any camera which will have a good view of the j := j + θ j (3) subject. This photo is then forwarded to the server which stores where α is the learning rate. it and also notifies the concerned homeowners using a push notification to their phone. Algorithm 1: Logistic Regression The premise assumed for our approach is shown in Fig. 2. The Kinect is placed in front of the door such that any 1 initialize vector θ to random values ; person approaching the door would be in between the door 2 m = number of training examples; and the Kinect. Thus, the Kinect will capture the position of 3 while iter < maxiter do the subject from behind. Another camera which would have 4 calculate cost J(θ); a good view of the subject would be required to take a photo 5 i = 0; in case of an alarm. A secondary camera such as a CCTV 6 while i < m do ¹Tº ¹iº camera can be used for this purpose. 7 calculate the hypothesis hθ = σ¹θ X º; 8 update θ using (3); A. Feature Extraction from Skeletal Information 9 i = i+1; 10 end Skeletal information is obtained from Kinect depth camera 11 iter = iter+1; directly and is plotted on the depth image. It provides the 12 end coordinates of these joints. The joints which we consider are - left shoulder, right shoulder, head, neck, left hand, right hand, left elbow and right elbow. The following vectors Here, the problem is a two way classification into suspicious are calculated between the joints - neckHead (between neck and normal actions with six features which are the angles as and head), neckLeftShoulder (between neck and left shoul- mentioned before. Logistic regression suits well for this data. der), neckRightShoulder (between neck and right shoulder), rightShoulderElbow (between right shoulder and right elbow), C. Scenarios Considered leftShoulderElbow (between left shoulder and left elbow), In this work, we considered the below mentioned scenarios rightElbowHand (between right elbow and right hand), left- while training the model. We created the data based on these ElbowHand (between left elbow and left hand) and torsoNeck scenarios. between the torso and the neck. These vectors are further used 1) Normal Activity: These include the normal postures to calculate the following angles - between rightShoulderEl- without any suspicion. It is important for the system to not bow and rightElbowHand, between leftShoulderElbow and unnecessarily send alarms to the homeowner. The image from leftElbowHand, between neckrightShoulder and rightShoul- the secondary camera and the depth camera can be seen in Fig. derElbow, between neckleftShoulder and leftShoulderElbow, 3(a) and Fig. 3(e). Many casual poses of ordinary people were between torsoNeck and the y-axis and between torsoNeck and taken to train the model to identify these normal postures. The neckHead, these 6 angles are the features which are used for number of normal samples in the generated dataset is equal to classifying the postures. the sum of the the number of samples in all other scenarios. TABLE I RELATED WORKS Author Methodology Merit Demerit Features extracted from the 3D It is a robust walking fall detec- This deals with safety inside the bounding boxes obtained from the Mastorakis et al.

Load more