
Article A New Approach to Fall Detection Based on the Torso Motion Model

Leiyue Yao, Weidong Min * and Keqiang Lu

School of Information Engineering, Nanchang University, Nanchang 330031, China; [email protected] (L.Y.); [email protected] (K.L.) * Correspondence: [email protected]; Tel.: +86-0791-8396-9684

Received: 26 July 2017; Accepted: 18 September 2017; Published: 26 September 2017

Featured Application: In our proposed method, a new feature named the torso angle is proposed and incorporated into the human torso motion model (HTMM) for fall detection. By tracking the changing rates of the torso angle and the centroid height, our fall detection model has a strong capability of differentiating falls from other fall-like activities, such as controlled lying down, quickly crouching and bending, which are difficult for existing computer vision-based approaches to discriminate.

Abstract: This paper presents a new approach for fall detection based on two features and their motion characteristics extracted from the human torso. The 3D positions of the shoulder center and the hip center joints in depth images are used to build a fall detection model named the human torso motion model (HTMM). The person's torso angle and centroid height are used as the key features in HTMM. Once a person comes into the scene, the positions of these two joints are fetched to calculate the person's torso angle. Whenever the angle is larger than a given threshold, the changing rates of the torso angle and the centroid height are recorded frame by frame over a given period of time. A fall is identified when both changing rates reach their thresholds. By using the new feature, falls can be accurately and effectively distinguished from other fall-like activities, which are very difficult for other computer vision-based approaches to differentiate. Experiment results show that our approach achieved a detection accuracy of 97.5%, a 98% true positive rate (TPR) and a 97% true negative rate (TNR). Furthermore, the approach is time efficient and robust because it only calculates the changing rates of the torso angle and the centroid height.

Keywords: fall detection; human torso motion model; torso angle; centroid height; computer vision; depth image; Kinect

1. Introduction

Falls are one of the major health hazards among the aging population over 60. According to the report of the World Health Organization, approximately 28~35% of people aged 65 and over fall each year, and 32~42% of those over 70 years of age. In fact, falls increase sharply with age-related biological changes, which leads to a high incidence of falls and fall-related injuries in ageing societies [1]. Nowadays, falls are not only life threatening, but also one of the major issues in elderly health. A fall can cause severe consequences, including broken bones, superficial cuts and abrasions to the skin, and soft tissue injuries [2,3]. If a falling person cannot get help in a short period of time, the situation will be even worse. As many people in this ageing group live alone, it is difficult for them to seek help immediately. For these and many other reasons, the number of research works on fall detection has increased dramatically over recent years. Automatic detection of human falls helps to reduce the time between the fall and the arrival of medical attention [4,5].


During recent years, different kinds of approaches have been proposed to realize automatic fall detection. Mubashir et al. [6] classified the fall detection methods into three major categories: the wearable device-based method, the ambient sensor-based method and the vision-based method. Wearable sensors detect the motion and location of the human body. M.J. Mathie et al. [7,8] adopted acceleration as the major feature to identify falls. Sixsmith A. [9] judged falls by checking periods of inactivity. M. Kangas et al. [10] identified a fall by analyzing the posture of the person. Although these approaches did work in detecting falls, their disadvantages are also obvious. The wearable sensors can wear out or be damaged due to external factors, and they will not detect falls if the person forgets to wear them. Besides, wearable devices may make people feel uncomfortable.

Another type of approach uses sensors that are deployed in the environment to detect falls. Zhuang, Huang et al. [11] checked whether there was a fall by collecting sounds, while M. Alwan et al. [12] detected falls by monitoring vibrations in a room. In [13], the authors focus on detecting falls in the bathroom by using smart floors to collect floor pressure information. Although the ambient sensor-based method is a creative way for fall detection, the accuracy and false alarm rate are unacceptable in most situations. Moreover, this kind of approach is limited to those places where the sensors have been deployed.

Due to these main drawbacks of the above two approaches, the computer vision-based method has become a hot topic in recent research. This method needs no wearable devices and aims at detecting falls by analyzing video streams from one or more cameras. Compared with other approaches, this kind of method has more advantages. However, privacy is a problem for people who live under supervision. Q.M. Rajpoot et al. [14] provided solutions for privacy identification and protection. In addition, Chaaraoui, A.A. et al. [15] defined five specific visualization levels corresponding to five visualization models, respectively; the appearance of persons can be protected at each level to a different degree. Our proposed method is based on computer vision, and the usage of only depth images helps maintain privacy. Compared with the method proposed in [15], the usage of depth images reaches the third visualization level. More detailed information about the typical methods based on computer vision technology is discussed in Section 2.

The remaining part of this paper is organized as follows. Section 2 gives an overview of the current state-of-the-art with regard to features and methods used in vision-based fall detection. Section 3 describes our model and algorithm of fall detection. Section 4 presents the experiments and gives the results of our approach. Section 5 lists three major limitations of our model. Finally, we present our concluding remarks in the last section.

2. Related Work

With the development of machine learning, computer vision and image processing techniques, the video camera or computer vision-based method has become a new, but hot topic of fall detection. Compared with the wearable device-based approaches, the camera or computer vision-based methods are less intrusive, more accurate and more robust. Moreover, the detection systems can be easily updated by downloading the newest programs.

One typical fall detection method identified and located the person in the video by using a vertical projection histogram with a statistical scheme. Then, the ratio of the width and height of the bounding box and the absolute difference between these two values were used to detect falls [16]. In [17], the authors divided the image into many rectangular areas, and falls were then detected by calculating the area of the body. Kwolek et al. [18] presented a fuzzy system that used an accelerometer and the Kinect sensor to detect the fall event. However, since the human external rectangle is difficult to detect accurately, false alarms and missed detections are the major problems of this kind of method.

Video sequence analysis is another major branch of vision-based fall detection. Khan and Habib [19] used motion history images and detected large motion in a video sequence. The combination of the aspect ratio of the bounding box and the changing rates of width and height was used in their approach to detect falls. In [20], overlapping smart cameras were used for fall detection and localization. In [21], a single camera covering the full view of the room environment is used for the video recording of an elderly person's daily activities for a certain time period; fall detection is realized by analyzing the postures extracted from the video clips. Rougier et al. [22] proposed a new way to track the 3D head trajectory with only one calibrated camera, and the vertical velocity of the 3D head is used to judge whether a fall happens. Olivieri et al. [23] used optical flow to detect falls and recognize other human activities.

Recently, depth images have been widely used for action recognition and classification because of their particular advantages in privacy protection and the ease with which they can be collected by Kinect or other sensors. Lei Yang et al. [24] proposed a fall detection method based on spatio-temporal context tracking over three-dimensional (3D) depth images captured by the Kinect sensor. Rougier, C. et al. [25] used several cameras to create 3D vision; a measure of the vertical distribution along the vertical axis is calculated, and a fall event is detected when this distribution is abnormally near the ground for a certain length of time. Gasparrini et al. [26] proposed a fall detection method for indoor environments, based on the usage of the Kinect depth sensor in an "on-ceiling" configuration and on the analysis of depth frames. Rodrigo Ibañez et al. [27] proposed a lightweight approach to recognize gestures with Kinect by utilizing approximate string matching. Georgios Mastorakis et al. [28] fetched data from the depth image and detected falls by measuring the velocity based on the contraction or expansion of the width, height and depth of the 3D bounding box. Aguilar et al. [29] presented a 3D path planning system that uses a point cloud obtained from the robot workspace with a Kinect V2 sensor to identify the interest regions and the obstacles of the environment. Alazrai R.
[30] built a view-invariant descriptor for human activity representation from 3D skeleton joint positions to detect falls and other activities. However, all of these methods have shortcomings. For example, in [31], the ratio of the height and width of the rectangle extracted from a person was used to determine whether a person falls. This method uses three features, i.e., the human aspect ratio, the effective area ratio and the center variation rate. However, the human aspect ratio changes greatly when the relative positions of the camera and the target change, which results in a high false alarm rate and low accuracy of the detection system. Multi-camera [32] and wide-angle camera-based [33] methods are effective ways to detect falls; however, the pre-work, i.e., camera installation, image calibration and 3D human identification, is difficult and complex. As for the deep learning method [34], the need for a huge number of labeled data makes it complicated to adjust and poor in flexibility.

3. Our Approach for Fall Detection

In our proposed method, a new feature named the torso angle is adopted. This feature, together with the centroid height and their motion characteristics, forms a descriptor for human fall representation, called the human torso motion model (HTMM). Different from machine learning and deep learning methods, HTMM is a threshold-based approach for fall detection with high time efficiency, because it calculates only the changing rates of the torso angle and centroid height. Before we elaborate the model, some technical terms are detailed in the following.

3.1. The Key Concepts and Definitions in HTMM

3.1.1. Torso Line and Torso Vector

There are in total 20 joints of a person's skeleton that can be tracked by the Kinect sensor when the person is standing, while 10 joints can be tracked when the person is sitting in front view. Additionally, the untracked joints can be estimated by the embedded program in Kinect; especially when the part of the body that contains the joints can be detected, the positions of these untracked joints can be estimated exactly. Figure 1 shows the details of the joints in standing and sitting postures. A line connecting the shoulder center and the hip center is a key concept in our algorithm to calculate the torso angle. This line is named the torso line and is marked in red, as shown in Figure 1.


Figure 1. Joints of a person's skeleton provided by Kinect. (a) Twenty joints of the skeleton that can be tracked when the person is standing; (b) 10 joints and two estimated joints of the skeleton when the person is sitting, where the spine and hip center joints, marked with two circles of dotted lines, are estimated.

In a red, green, blue (RGB) image, each pixel only contains 2D position information. However, the depth image provides each pixel's 3D position information. Figure 2 shows the joints in 3D depth image space. These joints of a person's skeleton can be considered as pixels in the depth image. With these 3D joint points of the skeleton, the computer can understand the meaning of human gestures when a set of complex activities is performed in front of the Kinect. As the main purpose of our system is to alarm as soon as a fall happens, we only take the shoulder center and hip center joints into consideration. In other words, our method only pays attention to the human torso.

The torso line can be represented as a vector and calculated by:

HS = (Xh − Xs, Yh − Ys, Zh − Zs),  (1)
where HS is the vector of the torso line, H(Xh, Yh, Zh) is the position of the hip-center joint in 3D coordinates and S(Xs, Ys, Zs) indicates the position of the shoulder-center joint. We call this vector the torso vector in our approach.
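As a concrete illustration of Equation (1), the short Python sketch below computes the torso vector from the two joint positions. It is not the authors' implementation and makes no Kinect SDK calls; the joint coordinates are made-up values in metres.

# Illustrative sketch of Equation (1): the torso vector HS points from the
# shoulder-center joint S to the hip-center joint H.
def torso_vector(hip_center, shoulder_center):
    """Both arguments are (x, y, z) tuples in metres; returns HS = H - S."""
    xh, yh, zh = hip_center
    xs, ys, zs = shoulder_center
    return (xh - xs, yh - ys, zh - zs)

# Made-up example positions for a standing person (not real Kinect data).
hip_center = (0.05, -0.15, 2.80)       # H(Xh, Yh, Zh)
shoulder_center = (0.04, 0.30, 2.78)   # S(Xs, Ys, Zs)
print(torso_vector(hip_center, shoulder_center))   # approx (0.01, -0.45, 0.02)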

Figure 2. Depth image space created by the Kinect sensor.

3.1.2. Gravity Vector and Torso Angle

As we all know, the gravity line is always vertical to the ground. Therefore, all lines parallel with the y-axis can be seen as gravity lines. Every two points on a gravity line form a vector, which is called the gravity vector in our approach.

According to vector translation theory in solid geometry, the start point of the gravity vector and the start point of the torso vector can be moved to the coordinate origin. We take a middle frame of a fall video as an example; Figure 3 shows the process of forming the torso angle.


Figure 3. The process of extracting the torso angle from a depth image. (a) A depth image fetched from a middle frame in a fall video; (b) torso vector represented in 3D coordinates; (c) torso angle represented in a 2D plane after vector translation.

According to the cosine law, the angle between these two vectors, named the torso angle, can be calculated by:

cos(α) = cos(HS, GH) = (HS · GH) / (|HS| × |GH|),  (2)

where GH is the vector from any point on the y-axis to the hip center joint. Since the point on the gravity line is self-defined and the hip center is moved to the coordinate origin, GH = (0, −Yh, 0). Equation (2) can also be represented as:

cos(α) = cos(GH, HS) = (Yh − Ys) / √((Xs − Xh)² + (Ys − Yh)² + (Zs − Zh)²),  (3)

Figure 4 shows the torso angles in three common daily activities: standing, walking and sitting. Figure 4a–c shows the RGB images of each video and Figure 4d–f are their depth images, marked with the torso line and the gravity line.
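The torso angle itself can then be evaluated as in the sketch below. It measures the angle between the torso line and the upward vertical, which is the same quantity as Equation (3) up to the chosen direction of the gravity vector, so an upright posture gives an angle near zero. This is an illustrative snippet with made-up joint values, not the authors' code.

import math

# Sketch of Equations (2)-(3): angle between the torso line and the vertical.
def torso_angle_deg(hip_center, shoulder_center):
    xh, yh, zh = hip_center
    xs, ys, zs = shoulder_center
    length = math.sqrt((xs - xh) ** 2 + (ys - yh) ** 2 + (zs - zh) ** 2)
    # Cosine of the angle between the hip-to-shoulder direction and straight up;
    # this matches Equation (3) apart from the direction chosen for the gravity vector.
    cos_alpha = (ys - yh) / length
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_alpha))))

print(torso_angle_deg((0.05, -0.15, 2.80), (0.04, 0.30, 2.78)))   # upright: about 3 degrees
print(torso_angle_deg((0.05, -0.60, 2.80), (0.50, -0.55, 2.78)))  # lying: about 84 degrees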

Figure 4. The torso angles in three common daily activities. (a–c) The RGB images of standing, walking and sitting; (d–f) their depth images, where the solid line is the gravity line and the dotted line is the torso line.


3.1.3. Centroid Height

Centroid height means the distance from the centroid point of a person to the ground. In our approach, we adopt the hip center joint as the rough centroid of a person. There are three reasons for using the hip center joint as the centroid, rather than calculating the human shape pixels and finding the centric position. First, the exact centroid of the human shape is not necessary for our method; an approximate point is sufficient because only the descending rate of its height is considered. Secondly, the hip touching the ground is a common scene in nearly all kinds of falls. Therefore, although the position of the hip center joint is not the exact position of the centroid of the human shape, it is more representative for fall detection. Thirdly, the Kinect software development kit (SDK) provides a set of user-friendly and effective functions to track or estimate the position of the hip center. Before calculating the centroid height, the ground plane should be found first. Although finding the ground plane is complex work, the Kinect SDK provides four parameters, which can be used to calculate exactly the height from any point in the depth image to the ground plane. Equation (4) shows the ground plane defined by these four parameters.

Ax + By + Cz + D = 0, (4)

where A, B, C and D are the ground parameters that can be used to calculate point height. The height from the hip center to the ground Hc is calculated using Equation (5).

Hc = |A·xc + B·yc + C·zc + D| / √(A² + B² + C²),  (5)

where C(Xc, Yc, Zc) is the position of the hip center joint in the depth image. Through our experiments, we found that the centroid height can be exactly calculated by Equation (5) only when the ground is in the scene. When the Kinect sensor is installed too high to detect the ground, the parameters cannot be provided accurately; all four parameters will be set to zero as their default values. Obviously, this is a big obstacle to estimating one's centroid height. To address this issue, we use the right foot point to estimate the centroid height when the ground parameters are zero. Therefore, Equation (5) is enhanced as:

Hc = |A·xc + B·yc + C·zc + D| / √(A² + B² + C²),   if the ground can be detected
Hc = √((xc − xf)² + (yc − yf)² + (zc − zf)²),       if the ground cannot be detected,  (6)

where f(Xf, Yf, Zf) is the position of the right foot joint in 3D coordinates. Figure 5 shows the values of the centroid height in two typical daily activities, crouching and walking. From the experiment data, it can be concluded that the deviation caused by calculating the height from the right foot joint is acceptable.
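The following sketch mirrors Equations (5) and (6): when the floor plane coefficients are available it uses the point-to-plane distance, otherwise it falls back to the distance between the hip-center and right-foot joints. The plane coefficients and joint positions are made-up illustrative values; in the real system they would come from the Kinect SDK.

import math

def centroid_height(hip_center, right_foot, floor_plane):
    """Height of the hip center above the floor, following Equations (5)-(6)."""
    a, b, c, d = floor_plane
    xc, yc, zc = hip_center
    if any(coeff != 0.0 for coeff in floor_plane):
        # Ground detected: point-to-plane distance, Equation (5).
        return abs(a * xc + b * yc + c * zc + d) / math.sqrt(a * a + b * b + c * c)
    # Ground not detected: distance to the right foot joint, Equation (6).
    xf, yf, zf = right_foot
    return math.sqrt((xc - xf) ** 2 + (yc - yf) ** 2 + (zc - zf) ** 2)

# Illustrative floor plane for a sensor mounted about 1.4 m above the ground.
plane = (0.0, 1.0, 0.0, 1.4)
print(centroid_height((0.05, -0.55, 2.80), (0.10, -1.38, 2.90), plane))          # about 0.85 m
print(centroid_height((0.05, -0.55, 2.80), (0.10, -1.38, 2.90), (0, 0, 0, 0)))   # fallback, about 0.84 m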

3.2. Human Torso Motion Model

Fall activity is the loss of balance of a person. When a fall happens, it is always accompanied by sharp changes in the torso angle and the centroid height. In the proposed method, we take the torso angle and the centroid height as the two key features. There are four thresholds in our fall detection model: Tα is the threshold for detecting the start key frame, Tvα is the threshold of the changing rate of the torso angle, Tvh is the threshold of the velocity of the centroid height and Ŧ is the tracking time after the torso angle exceeds Tα. In HTMM, the changing rates of the torso angle and centroid height are calculated frame by frame after the torso angle exceeds Tα. The max values of these two features in the given period of time, Ŧ, are compared to their thresholds, respectively. When both rates exceed their corresponding thresholds Tvα and Tvh, a fall is detected.

According to the results of the limit of stability test (LOST), an adult person can keep his/her balance with the body leaning forward/backward no more than 12.5 degrees and leaning left/right no more than 16 degrees [35]. In LOST, the person is asked to keep the whole body in a line. However, there is always an angle between the lower body and the upper body in daily activities, while the person's torso always keeps parallel with the gravity line. Therefore, we take the torso line rather than the body line to form the torso angle with the gravity line. In our experiments, the torso angle exceeding 13 degrees marks the start of our detection model.

Using only the torso angle to detect falls is insufficient and will result in low accuracy and a high false alarm rate; for example, bending or controlled lying down would be judged as a fall. To address this issue, the centroid height is imported into our model as the second feature.

Since a fall is an activity that usually happens in a short period of time (in our experiments, most of the fall samples last 1.1~1.6 s), for a video captured at 30 frames per second there are 33–48 images during a fall. Thus, we built a motion model called HTMM for fall detection. In this model, we pay special attention to the changing rates of the torso angle and centroid height in a given period of time.

Assume that the given period of time is represented as Ŧ, and N (n | n ∈ Z ∧ n ≤ Ŧ × 30) denotes the frames' order in Ŧ. Then, the person's torso angle (α), centroid height (H) and recorded time (T) in each frame can be represented as α = {α1, α2, …, αn}, H = {h1, h2, …, hn} and T = {t1, t2, …, tn}. The changing rates of α and H in Ŧ for each frame are calculated using Equations (7) and (8):

V_t^α = { x | x ∈ R ∧ x_i = (α_i − α_1) / (t_i − t_1) },  (7)

V_t^h = { y | y ∈ R ∧ y_i = (h_i − h_1) / (t_i − t_1) },  (8)

In HTMM, the max changing rates of the torso angle and centroid height in the given period of time are compared with their thresholds, Tvα and Tvh. When both of them exceed their thresholds, HTMM outputs one to indicate that the activity is judged as a fall; otherwise, HTMM outputs zero to indicate that it is not. Our model can be represented as:

α >>Λ h 1 Max(VTttvh )vα Max( VT ) HTMM(,)α ht , =  , (9) 0 else Figure 6 shows the general block diagram of our approach. Appl. Sci. 2017, 7, 993 6 of 17


Figure 5. The values of centroid height recorded in two daily activity videos: crouching and walking. (a) Centroid height in the crouching activity video; (b) centroid height in the walking activity video.


Figure 6. General block diagram of our approach, where Max(V_t^h) is the max changing rate of the centroid height in the given period of time and Max(V_t^α) is the max changing rate of the torso angle in the given period of time. The definitions of the given threshold values are elaborated in Section 4.

The general block can be divided into two main steps. The first step is calculating the torso angle of every frame and comparing it with the threshold value. When the torso angle reaches the threshold value, the program turns to Step 2. The second step is tracking the changing rates of the torso angle and centroid height frame by frame in a given period of time. Once both rates exceed their threshold values, the activity is judged as a fall and the system starts alarming; otherwise, the activity is considered a normal activity when the given period of time is over, and the program turns back to Step 1 to start another loop. The pseudocode of our fall detection method is described in the following Algorithm 1.

Algorithm 1: Pseudocode of our proposed method for fall detection.
Input: Sequence of the skeleton frames captured by the Kinect sensor
Output: bool bFallIsDetected
Loop 1:
while (torsoAngle < Tα)
{
    joint_shoulderCenter ← fetch shoulder center 3D position from current skeleton frame;
    joint_hipCenter ← fetch hip center 3D position from current skeleton frame;
    torsoAngle ← calculate torso angle of current skeleton frame;
}
Loop 2:
while (trackingTime < Ŧ)
{
    V_torsoAngle ← calculate the current torso angle changing rate from current frame;
    MaxV_torsoAngle ← update its value if V_torsoAngle is larger;
    V_CentroidHeight ← calculate the current centroid height changing rate from current frame;
    MaxV_CentroidHeight ← update its value if V_CentroidHeight is larger;
    if (MaxV_torsoAngle > Tvα && MaxV_CentroidHeight > Tvh)
    {
        bFallIsDetected = TRUE;
        fallAlarm();
    }
}
goto Loop 1;
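A compact Python rendering of the two-step flow in Algorithm 1 is given below. Here get_next_frame and fall_alarm are placeholders for the sensor interface and the alarm action (they are not Kinect SDK functions), and each frame object is assumed to expose already-computed torso_angle and centroid_height attributes; the default threshold values follow Section 4.

def detect_falls(get_next_frame, fall_alarm,
                 t_alpha=13.0, t_v_alpha=120.0, t_v_h=1.21, window_s=1.3, fps=30):
    """Illustrative sketch of Algorithm 1; runs until the frame source is exhausted."""
    while True:
        # Loop 1: wait for the start key frame (torso angle reaches T_alpha).
        frame = get_next_frame()
        if frame is None:
            return
        if frame.torso_angle < t_alpha:
            continue
        # Loop 2: track both changing rates for a window of length Ŧ.
        start, max_v_alpha, max_v_h = frame, 0.0, 0.0
        for i in range(1, int(window_s * fps)):
            cur = get_next_frame()
            if cur is None:
                return
            dt = i / fps
            max_v_alpha = max(max_v_alpha, (cur.torso_angle - start.torso_angle) / dt)
            max_v_h = max(max_v_h, (start.centroid_height - cur.centroid_height) / dt)
        if max_v_alpha > t_v_alpha and max_v_h > t_v_h:
            fall_alarm()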

In HTMM, we introduced the torso angle to clearly judge whether the supervised person is in balance or not. Furthermore, the centroid height is used to find out whether the person is falling. These two features give HTMM high accuracy in fall detection and make it work well in differentiating falls from fall-like activities. Figure 7 lists three typical fall-like activities that cause a high false alarm rate in previous approaches.

In the bounding box ratio analysis approach, the ellipse shape analysis approach or even the deep learning approach, the biggest shortcoming is that there is no clear line between balance and unbalance. Therefore, when a fall-like activity appears, the system will consider it a fall. However, the torso angle together with the centroid height makes it easy for HTMM to distinguish fall-like activities. Take the activities in Figure 7 as examples; each activity reaches the threshold of only one of the two features. For controlled lying down and bending, although the torso angle changes greatly in a short period of time, the centroid height does not change or changes little during the activity, while for crouching, the centroid height changes dramatically in a short period of time, but the torso angle stays at around 12.5 degrees, which is lower than the threshold. Section 4 elaborates the detailed statistics of our experiments.
Figure 7. The data changing curves of the torso angle and centroid height. (a) The torso angle data changing curve; (b) the centroid height data changing curve.

4. Experiment and Results

4.1. Experimental Setup and Dataset

Our method was implemented using Microsoft Visual Studio 2013 Ultimate (Redmond, WA, USA) + emgu.cv 3.1 (SourceForge, San Francisco, CA, USA) + Microsoft Kinect sensor v1.0 (Redmond, WA, USA) on a PC with an Intel Core i7-4790 3.60-GHz processor and 8 GB RAM clocked at 1333 MHz. Since Kinect is a newly emerging sensor and the feature, the torso angle, is first raised and imported for fall detection, the existing depth action datasets cannot provide the necessary information we need. For example, Cornell Activity Datasets CAD-60/120 [36], the most commonly-used RGB-D dataset for action recognition, contains 12 activities, such as rinsing mouth, brushing teeth and wearing contact lenses, which do not provide falling action samples. Therefore, we built a dataset by ourselves for the experiments.

This dataset was collected using Microsoft Kinect sensor v1.0, which was installed 1.4 m above the ground. The people involved in the self-collected dataset are aged between 20 and 36, with different heights (1.70~1.81 m) and genders (four male and one female volunteers). The distance from the monitored person to the Kinect sensor is between 3 and 4.5 m. The actions performed by each volunteer were separated into two categories: ADL (activities of daily living, including walking, controlled lying down, bending and crouching) and fall (four directions of falls, including forward, backward, left and right). For safety and realistic performance considerations, subjects performed the fall actions on a 15 cm-thick cushion. Each activity was repeated five times by each subject involved. There are in total 100 fall videos (each fall direction contains 25 videos) and 100 ADL videos (each ADL contains 25 videos). In these experiments, the five volunteers were asked to perform in slow motion to imitate the behavior of an elderly person at least one time in the different kinds of activities. The joint positions in 3D coordinates and the joint heights were recorded frame by frame in an XML file. The fall samples contain 55~290 frames, and the ADL samples contain 65~317 frames in each video. All of the samples start with standing postures, which last 1~8 s, containing 30~240 frames.

4.2. Result and Evaluation

The human skeleton is used to track the monitored person and to discriminate human objects from other objects. Figure 8 shows that the Kinect-provided skeleton is an effective way to recognize the human object correctly, even when the joints are covered by a skirt or gown or there are other objects in the scene. Additionally, walking devices, such as a walking stick, can also be recognized and excluded from the skeleton information by the embedded program of Kinect. Although the joints of the lower part of the skeleton may have a slight deviation where they are covered, the effect on the accuracy of HTMM can be ignored because only the shoulder center and hip center joints are considered.

Figure 8. The detected skeletons with different postures, clothes or walking devices.

The four typical falls in Figure 9 show the common torso angle and centroid height changes. Before a fall is detected, the torso angle increases rapidly while the centroid height declines sharply. In HTMM, the changing rate of the torso angle, the changing rate of the centroid height and the tracking time are the three parameters that have a strong effect on the accuracy. In [37], the range of the decay rate of the centroid height is 1.21~2.05 m/s, so we set 1.21 m/s as its threshold value in our model. Through the experiments, our approach achieved the best accuracy when the parameters were set as follows:

Tvα = 12 degree/100 ms

Tvh = 1.21 m/s

Ŧ = 1300 ms

where Tvα is the threshold value of the torso angle increase rate, Tvh is the threshold value of the centroid height decay rate and Ŧ is the tracking time.
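For convenience, these reported settings can be grouped in one place, as in the small sketch below; the conversion of Tvα from degree/100 ms to degree/s is ours and only for use with the earlier illustrative snippets, not part of the original model description.

# Threshold values reported in this section (illustrative grouping, not the authors' code).
HTMM_PARAMS = {
    "T_alpha_deg": 13.0,           # torso angle that starts the tracking (Section 3.2)
    "T_v_alpha_deg_per_s": 120.0,  # 12 degree / 100 ms
    "T_v_h_m_per_s": 1.21,         # decay rate of the centroid height
    "tracking_window_s": 1.3,      # Ŧ = 1300 ms
}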

α  αα−  0 0 =∈Λ=i 1 , Vxxx|R Centroid height (m) tii  (7)1 7 131925313743 − (Degree) angle Torso 1 7 131925313743 tti 1 Frames Frames  hh−  (c) Vyh =∈Λ=|y R yi 1 , tii −  80(8) 1.5 tti 1 60 1 In HTMM, the max changing rates of torso angle and centroid height in the given period of 40time are used to compare with their thresholds, Tvα and Tvh. When both of them exceed their thresholds,20 0.5 HTMM outputs 1one to indicate that the activity is judged as a fall; else HTMM outputs zero to 0 0

indicate that it is not. Our model can be represented as: Centroid height (m) Torso angle (Degree) angle Torso 1 7 1319 25 3137 43 1 6 11162126313641 α >>Λ h Frames Frames 1 Max(VTttvh )vα Max( VT ) HTMM(,)α ht , =  , (9) (d) 0 else FigureFigure 9. 9.Detected Detected frames frames of of different different falls falls and and two two features’ features’ curves. curves. The The left left picture picture of of each each group group is is Figure 6 shows the general block diagram of our approach.thethe depth depth image image of of the the frame frame when when our our method method detected detected the the fall. fall. The The dotted dotted line line in in the the depth depth image image standsstands for for the the torso torso line, line, while while the the solid solid line line stands stands for for the the gravity gravity line. line. Shoulder Shoulder center center joint joint and and hip hip centercenter joint joint were were marked marked in in red. red. ( a(a)) Fall Fall left; left; ( b(b)) fall fall right; right; ( c()c) fall fall forward; forward; ( d(d)) fall fall backward. backward.
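To make the mechanism above concrete, the following is a minimal sketch of how Equations (7)–(9) and the thresholds Tα, Tvα, Tvh and Ŧ could be turned into code. It assumes the shoulder center and hip center joints have already been extracted from the depth stream; all identifiers, the midpoint approximation of the centroid and the unit conversions are our own illustration, not the authors' implementation.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

# Thresholds from the paper, expressed in SI-friendly units (our conversion):
T_ALPHA = 13.0      # torso angle that starts tracking (degrees)
TV_ALPHA = 120.0    # 12 degree / 100 ms expressed as degree per second
TV_H = 1.21         # centroid height decay rate threshold (m/s)
TRACK_TIME = 1.3    # tracking window Ŧ in seconds (default 1300 ms)

@dataclass
class Frame:
    t: float                                        # timestamp in seconds
    shoulder_center: Tuple[float, float, float]     # (x, y, z), y is vertical
    hip_center: Tuple[float, float, float]

def torso_angle(frame: Frame) -> float:
    """Angle between the torso line (hip center -> shoulder center) and the gravity line."""
    dx = frame.shoulder_center[0] - frame.hip_center[0]
    dy = frame.shoulder_center[1] - frame.hip_center[1]
    dz = frame.shoulder_center[2] - frame.hip_center[2]
    norm = math.sqrt(dx * dx + dy * dy + dz * dz)
    cos_a = abs(dy) / norm if norm > 0 else 1.0     # cosine of the angle to the vertical axis
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_a))))

def centroid_height(frame: Frame) -> float:
    """Rough centroid height: midpoint of the two torso joints (our simplification)."""
    return (frame.shoulder_center[1] + frame.hip_center[1]) / 2.0

def htmm_is_fall(frames: List[Frame]) -> bool:
    """Return True when, inside the tracking window after a start key frame,
    both the torso angle rise rate and the centroid height decay rate exceed
    their thresholds (Equations (7)-(9))."""
    angles = [torso_angle(f) for f in frames]
    heights = [centroid_height(f) for f in frames]
    for i, a in enumerate(angles):
        if a <= T_ALPHA:
            continue                                 # not a start key frame yet
        max_va, max_vh = 0.0, 0.0
        j = i + 1
        while j < len(frames) and frames[j].t - frames[i].t <= TRACK_TIME:
            dt = frames[j].t - frames[j - 1].t
            if dt > 0:
                va = (angles[j] - angles[j - 1]) / dt          # Eq. (7)
                vh = (heights[j - 1] - heights[j]) / dt        # Eq. (8), decay taken as positive
                max_va = max(max_va, va)
                max_vh = max(max_vh, vh)
            j += 1
        if max_va > TV_ALPHA and max_vh > TV_H:
            return True                              # Eq. (9): HTMM outputs one
    return False
```

In practice, the per-frame rates could equally be computed incrementally as frames arrive; the batch form above is only for readability.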

As shown in Figure 10, most falls in our self-collected dataset (79% of the samples) happened in 1.1~1.6 s. Therefore, values from 1100~1600 ms are all suitable for Ŧ. The max changing rates of the torso angle and the centroid height in this period are used in HTMM to check whether there is a fall. In the program, we took 1300 ms as the default value.

Figure 10. The duration of each fall in the self-collected dataset. There are in total 100 fall videos in the dataset.

Eleven groups of experiments were tested on the self-collected dataset with different Tvα to find its best default value. According to the results, when Tvα was set to 12 degree/100 ms, a fall can be detected in the middle of the activity, while a fall may not be detected if Tvα is set too large, and the false alarm rate increases when Tvα is set too small. Table 1 shows the test results on our self-collected dataset with different values of Tvα.

Table 1. The test results on the self-collected dataset with different Tvα. TP: true positive; FN: false negative; FP: false positive; TN: true negative.

Tvα (degree/100 ms)   Fall: TP / FN   Non-Fall: FP / TN
3                     98 / 2          4 / 96
6                     98 / 2          4 / 96
9                     98 / 2          4 / 96
12                    98 / 2          3 / 97
15                    93 / 7          2 / 98
18                    92 / 8          1 / 99
21                    89 / 11         1 / 99
24                    86 / 14         1 / 99
27                    80 / 20         1 / 99
30                    70 / 30         1 / 99
33                    67 / 33         1 / 99
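This eleven-group sweep amounts to re-running the detector over the labelled clips for each candidate Tvα and tallying the confusion counts. A small sketch of such a sweep is given below; `dataset` and `detect` are placeholders for the labelled self-collected clips and for a detector like the one sketched in Section 3.2, and the candidate values are taken from Table 1.

```python
def sweep_tv_alpha(dataset, detect, candidates=range(3, 34, 3)):
    """Tally TP/FN/FP/TN for each candidate threshold (degree/100 ms).

    `dataset` is an iterable of (clip, is_fall) pairs and `detect(clip, tv_alpha)`
    returns True when the clip is judged as a fall; both are hypothetical hooks."""
    rows = []
    for tv_alpha in candidates:          # 3, 6, ..., 33: the eleven groups in Table 1
        tp = fn = fp = tn = 0
        for clip, is_fall in dataset:
            judged_fall = detect(clip, tv_alpha)
            if is_fall and judged_fall:
                tp += 1
            elif is_fall:
                fn += 1
            elif judged_fall:
                fp += 1
            else:
                tn += 1
        rows.append((tv_alpha, tp, fn, fp, tn))
    return rows
```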

Figures 11 and 12 record the changing curves of the torso angle and the centroid height of the five wrongly judged samples. By reviewing the data, we found that there are two major factors that affect the accuracy of our method. First and foremost, in rare situations, joint positions may be improperly provided by the embedded program of the Kinect sensor. For example, in Figure 11, from Frame 22 to Frame 24 of the "lay" curve, the torso angle increased from 20 to 69.8 degrees. This abnormal transformation led our method to make false judgments in two fall samples and one lay sample. Secondly, some samples were too quick to be detected correctly. For example, the "crouch" and "bend" curves show that the actions were accompanied by a quick lean of the upper body. For the first factor, the average filter or other filters can be employed to address the issue. For the second factor, the centroid height can be employed as the third feature to improve the accuracy of our method. These two improvements are under further investigation for our future work.

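As a sketch of the first improvement, a plain moving-average filter applied to the joint-derived torso angle (or to the raw joint coordinates) would damp single-frame jumps such as the 20 to 69.8 degree spike mentioned above. The window length below is our assumption, not a value from the paper.

```python
def moving_average(values, window=5):
    """Causal average filter: each output is the mean of the last `window` samples,
    so an isolated mis-tracked frame no longer dominates the torso angle signal."""
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - window + 1)
        smoothed.append(sum(values[lo:i + 1]) / (i + 1 - lo))
    return smoothed
```

Smoothing the angle sequence before evaluating Equation (7) would trade a small detection delay for robustness against mis-tracked joints.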


Figure 11. The torso angle changing curves of falsely judged samples.

Figure 12. The centroid height changing curves of falsely judged samples.

Fall detection systems are expected to detect falls as soon as possible, so that the most severe consequences of falls can be avoided if the falling person is assisted immediately. Our method has advantages in time efficiency because of the calculation of only two features. Figure 13 records the time consumption of each frame in walk and fall.

Figure 13. The time consumptions of each frame in walk and fall. The two test videos of this graph were selected randomly from the self-collected dataset.

As shown in Figure 13, most frames were processed in 0.1~0.8 ms. Although a few frames needed 1.6~1.8 ms, the processing time of each frame can be ignored because the time interval between two frames (33.33 ms in our 30 fps videos) is far larger than that.
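The real-time argument is simply that the per-frame processing cost stays far below the 33.33 ms frame interval. A small sketch of how such a per-frame budget check could be instrumented is shown below; `process_frame` is a placeholder for the detector step, not the authors' code.

```python
import time

FRAME_BUDGET_MS = 1000.0 / 30.0   # 33.33 ms between frames at 30 fps

def timed_process(process_frame, frame):
    """Run one detector step and report whether it stayed within the frame budget."""
    start = time.perf_counter()
    result = process_frame(frame)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms, elapsed_ms < FRAME_BUDGET_MS
```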

False alarms comprise one of the biggest obstacles that prevent the computer vision-based fall detection method from being commercialized. Our approach works well in reducing the false alarm rate in fall-like activities' detection. Table 2 records the detailed results.

Table 2. Experimental results and evaluation of our approach.

Activity Type   Number of Sample Videos   Number of Videos Detected as Fall
Fall            100                       98
Lay             25                        1
Crouch          25                        1
Bend            25                        1
Walk            25                        0

Overall: TP = 98, FN = 2, FP = 3, TN = 97; TPR = 98.00%, TNR = 97.00%, Accuracy = 97.50%.
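The summary values in the last line of Table 2 follow directly from the four confusion counts (the terms are defined formally in the next paragraph). A minimal sketch, with a helper name of our own choosing:

```python
def evaluate(tp, fn, fp, tn):
    """Standard confusion-matrix metrics reported in Table 2."""
    return {
        "TPR": tp / (tp + fn),
        "TNR": tn / (tn + fp),
        "Accuracy": (tp + tn) / (tp + tn + fp + fn),
    }

# Counts from Table 2: 98 of 100 falls detected; 3 of 100 non-fall clips flagged.
print(evaluate(tp=98, fn=2, fp=3, tn=97))
# {'TPR': 0.98, 'TNR': 0.97, 'Accuracy': 0.975}
```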

In Table 2, TP (true positive) denotes the fall samples that are judged as falls. FP (false positive) means the non-fall samples that are judged as falls. TN (true negative) denotes the non-fall samples that are judged as non-falls. FN (false negative) means the fall samples judged as non-falls. Then, we have TPR = TP / (TP + FN), TNR = TN / (TN + FP) and Accuracy = (TP + TN) / (TP + TN + FP + FN).

We compared our proposed method with other Kinect sensor-based approaches [22,24,26,28]. All of these approaches were tested on our self-collected dataset. The results are recorded in Table 3.

Table 3. Comparison of the fall detection capability on our self-collected dataset with other Kinect-based fall detection approaches.

Approach                      Fall Forward   Fall Backward   Fall Left   Fall Right   Crouch   Bend   Walk   Lay
Rougier, C. et al. [22]       25             25              25          25           11       20     0      1
Yang, L. et al. [24]          10             15              23          18           2        8      0      25
Gasparrini, S. et al. [26]    23             22              25          25           12       25     0      25
Mastorakis, G. et al. [28]    8              22              21          25           0        0      0      0
Our proposed method           25             23              25          25           1        1      0      1

As shown in Table 3, all of the approaches worked well in discriminating fall and walk. However, the height-based approaches [24,26] were unable to differentiate falls from other fall-like activities. Especially in terms of controlled lying down, all of the lay samples were wrongly judged as falls by these two methods. The vertical height velocity-based approach [22] performed much better than the height-based approaches [24,26] in discriminating falls from controlled lying down, but it performed poorly in differentiating fall, crouch and bend. The main reason is that both the height-based and the vertical velocity-based approaches did not draw a clear line between balance and unbalance. Hence, the fall-like activities are difficult or even impossible to detect correctly. As for the bounding box-based approach [28], it performed well in differentiating falls from fall-like activities. However, it performed badly in terms of detecting falling forward. By reviewing these undetected forward fall samples, we found that the main feature, V_{wd}, which was calculated by V_{wd} = \sqrt{width^2 + depth^2} [28], changed more gently than it should in the front-view condition. Although the accuracy can be improved by programmatic calibration, this is not a trivial task. After comparing and analyzing these Kinect sensor-based approaches for fall detection, the conclusion is given in Table 4.

Table 4. The general comparison of different categories of Kinect sensor-based fall detection approaches.

Approach                                                  Detecting Falls   Discriminating Controlled Lying Down   Discriminating ADLs Changing Slightly in Height   Discriminating Fall-Like Activities Changing Sharply in Height
Height-based method                                       High              Poor                                    High                                               Poor
Height velocity-based method                              High              High                                    High                                               Poor
Bounding box/ratio of the height and width-based methods  Medium            High                                    High                                               High
Our proposed method                                       High              High                                    High                                               High

5. Conclusions

In this paper, we proposed a new and fast fall detection method based on the Kinect sensor. A motion model named the human torso motion model (HTMM) is proposed, and a new and significant feature named the torso angle was first adopted in our approach for fall detection. Compared with existing posture-based approaches, machine learning-based methods and wearable device-based approaches, our proposed approach has obvious advantages in privacy protection and in differentiating fall-like activities. Our method is time efficient and robust because it calculates only the changing rates of the torso angle and the centroid height. Moreover, using only the Kinect sensor is inexpensive, affordable for common families and easily applied in an elderly person's home.

However, compared to a monocular video camera, the limitation of the utilized Kinect sensor is its relatively narrow field-of-view. The best distance from the monitored person to the Kinect sensor should be within 7 m; beyond this distance, the depth data become unreliable. Nevertheless, Kinect-based approaches are still effective, efficient and widely used for fall detection. Although the Microsoft Kinect sensor v2.0 was launched in 2013 and is more powerful than the v1.0, the Kinect sensor v1.0 is sufficient for our proposed method because only two joints are calculated and the cost is lower. The mechanism of our proposed method is suitable for the Kinect sensor 2.0 as well.

In our future work, the Kinect sensor 2.0, which is more sensitive than the Kinect 1.0, will be adopted to collect depth images that contain more information for further improvement. Our fall detection model will be continuously improved by importing other filters and other features to address exceptional situations.

Acknowledgments: This research was supported by the National Natural Science Foundation of China under Grants 61762061 and 61662044 and the Natural Science Foundation of Jiangxi Province, China, under Grant 20161ACB20004.

Author Contributions: All authors of the paper have made significant contributions to this work. Weidong Min led the project, directed and revised the paper writing. Leiyue Yao conceived of the idea of the work, wrote the manuscript and analyzed the experiment data. Keqiang Lu collected the original data of the experiment and participated in programming.

Conflicts of Interest: The authors declare no conflict of interest.

References

1. Ozcan, A.; Donat, H.; Gelecek, N.; Ozdirenc, M.; Karadibak, D. The relationship between risk factors for falling and the quality of life in older adults. BMC Public Health 2005, 5, 1–6. [CrossRef][PubMed]

2. Yardley, L.; Smith, H. A prospective study of the relationship between feared consequences of falling and avoidance of activity in community-living older people. Gerontologist 2002, 42, 17–23. [CrossRef][PubMed] 3. Salvà, A.; Bolíbar, I.; Pera, G.; Arias, C. Incidence and consequences of falls among elderly people living in the community. Med. Clín. 2004, 122, 172–176. 4. Doughty, K.; Lewis, R.; Mcintosh, A. The design of a practical and reliable fall detector for community and institutional telecare. J. Telemed. Telecare 2000, 6, 150–154. [CrossRef] 5. Huang, C.; Chiang, C.; Chen, G. Fall detection system for healthcare quality improvement in residential care facilities. J. Med. Biol. Eng. 2010, 30, 247–252. [CrossRef] 6. Mubashir, M.; Shao, L.; Seed, L. A survey on fall detection: Principles and approaches. Neurocomputing 2013, 100, 144–152. [CrossRef] 7. Mathie, M.J.; Coster, A.C.F.; Lovell, N.H.; Celler, B.G. Accelerometry: Providing an integrated, practical method for long-term, ambulatory monitoring of human movement. Physiol. Meas. 2004, 25, 1–20. [CrossRef] 8. Nguyen, T.T.; Cho, M.C.; Lee, T.S. Automatic fall detection using wearable biomedical signal measurement terminal. In Proceedings of the International Conference of the IEEE Engineering in Medicine & Biology Society, Cheongju, Korea, 3–6 September 2009. 9. Sixsmith, A.; Johnson, N. A smart sensor to detect the falls of the elderly. IEEE Pervasive Comput. 2004, 3, 42–47. [CrossRef] 10. Kangas, M.; Vikman, I.; Wiklander, J.; Lindgren, P.; Nyberg, L.; Jämsä, T. Sensitivity and specificity of fall detection in people aged 40 years and over. Gait Posture 2009, 29, 571–574. [CrossRef][PubMed] 11. Zhuang, X.; Huang, J.; Potamianos, G.; Hasegawa-Johnson, M. Acoustic fall detection using Gaussian mixture models and GMM super vectors. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Taipei, Taiwan, 19–24 April 2009. 12. Alwan, M.; Rajendran, P.J.; Kell, S.; Mack, D. A Smart and Passive Floor-Vibration Based Fall Detector for Elderly. Inf. Commun. Technol. 2006, 1, 1003–1007. 13. Feng, G.; Mai, J.; Ban, Z.; Guo, X.; Wang, G. Floor pressure imaging for fall detection with fiber-optic sensors. IEEE Pervasive Comput. 2016, 15, 40–47. [CrossRef] 14. Rajpoot, Q.M.; Jensen, C.D. Security and Privacy in Video Surveillance: Requirements and Challenges. In Proceedings of the 29th IFIP International Information Security and Privacy Conference, Marrakech, Morocco, 2–4 June 2014. 15. Chaaraoui, A.A.; Padillalópez, J.R.; Ferrándezpastor, F.J.; Nietohidalgo, M.; Flórezrevuelta, F. A vision-based system for intelligent monitoring: Human behaviour analysis and privacy by context. Sensors 2014, 14, 8895–8925. [CrossRef][PubMed] 16. Rougier, C.; Meunier, J.; St-Arnaud, A.; Rousseau, J. Robust video surveillance for fall detection based on human shape deformation. IEEE Trans. Circuits Syst. Video Technol. 2011, 21, 611–622. [CrossRef] 17. Dong, Q.; Yang, Y.; Wang, H.; Xu, J.H. Fall alarm and inactivity detection system design and implementation on Raspberry Pi. In Proceedings of the 17th IEEE International Conference on Advanced Communications Technology, Pyeonhchang, South Africa, 1–3 July 2015. 18. Kwolek, B.; Kepski, M. Fuzzy inference-based fall detection using Kinect and body-worn accelerometer. Appl. Soft Comput. 2016, 40, 305–318. [CrossRef] 19. Khan, M.J.; Habib, H.A. Video analytic for fall detection from shape features and motion gradients. Lect. Notes Eng. Comput. Sci. 2009, 1, 1311–1316. 20. 
Williams, A.; Ganesan, D.; Hanson, A. Aging in place: Fall detection and localization in a distributed smart camera network. In Proceedings of the 15th International Conference on Multimedia, Augsburg, Germany, 24–29 September 2007. 21. Yu, M.; Yu, Y.; Rhuma, A.; Naqvi, S.M.; Wang, L.; Chambers, J.A. An online one class support vector machine-based person-specific fall detection system for monitoring an elderly individual in a room environment. IEEE J. Biomed. Health Inform. 2013, 17, 1002–1014. [PubMed] 22. Rougier, C.; Meunier, J.; St-Arnaud, A.; Rousseau, J. 3D head tracking for fall detection using a single calibrated camera. Image Vis. Comput. 2013, 31, 246–254. [CrossRef] 23. Olivieri, D.N.; Mez Conde, I.; Vila Sobrino, X. Eigenspace-based fall detection and activity recognition from motion templates and machine learning. Expert Syst. Appl. 2012, 39, 5935–5945. [CrossRef] 24. Yang, L.; Ren, Y.; Hu, H.; Tian, B. New fast fall detection method based on spatio-temporal context tracking of head by using depth images. Sensors 2015, 15, 23004–23019. [CrossRef][PubMed]

25. Rougier, C.; Auvinet, E.; Rousseau, J.; Mignotte, M.; Meunier, J. Fall detection from depth map video sequences. In Proceedings of the 9th International Conference on Smart Homes and Health Telematics, Montreal, QC, Canada, 20–22 July 2011. 26. Gasparrini, S.; Cippitelli, E.; Spinsante, S.; Gambi, E. A depth-based fall detection system using a Kinect® sensor. Sensors 2014, 14, 2756–2775. [CrossRef][PubMed] 27. Ibañez, R.; Soria, Á.; Teyseyre, A.; Rodriguez, G.; Campo, M. Approximate string matching: A lightweight approach to recognize gestures with Kinect. Pattern Recognit. 2016, 62, 73–86. [CrossRef] 28. Mastorakis, G.; Makris, D. Fall detection system using Kinect’s infrared sensor. J. Real-Time Image Process. 2014, 9, 635–646. [CrossRef] 29. Aguilar, W.; Morales, S. 3D environment mapping using the Kinect v2 and path planning based on RRT algorithms. Electronics 2016, 5, 70. [CrossRef] 30. Alazrai, R.; Momani, M.; Daoud, M.I. Fall detection for elderly from partially observed depth-map video sequences based on view-invariant human activity representation. Appl. Sci. 2017, 7, 316. [CrossRef] 31. Hong, L.; Zuo, C. An improved algorithm of automatic fall detection. AASRI Procedia 2012, 1, 353–358. 32. Wang, X.; Li, M.; Ji, H.; Gong, Z. A novel modeling approach to fall detection and experimental validation using motion capture system. In Proceedings of the 2013 IEEE International Conference on Robotics and Biomimetics, Shenzhen, China, 12–14 December 2013. 33. Bosch-Jorge, M.; Sánchez-Salmerón, A.J.; Valera, Á.; Ricolfe-Viala, C. Fall detection based on the gravity vector using a wide-angle camera. Expert Syst. Appl. 2014, 41, 7980–7986. [CrossRef] 34. Feng, P.; Yu, M.; Naqvi, S.M.; Chambers, J.A. Deep learning for posture analysis in fall detection. In Proceedings of the International Conference on Digital Signal Processing, Milan, Italy, 19–21 November 2014. 35. Melzer, I.; Benjuya, N.; Kaplanski, J. Association between muscle strength and limit of stability in older adults. Age Ageing 2009, 38, 119–123. [CrossRef][PubMed] 36. Sung, J.; Ponce, C.; Selman, B.; Saxena, A. Unstructured human activity detection from RGBD images. In Proceedings of the 2012 IEEE International Conference on Robotics and Automation, St. Paul, MN, USA, 14–18 May 2012. 37. Kobayashi, J.; Abdulrazak, L.; Mokhtari, M. Inclusive society: Health and wellbeing in the community, and care at home. In Proceedings of the 11th International Conference on Smart Homes and Health Telematics, Singapore, 19–21 June 2013.

© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).