Human Activity Discovery and Recognition using Probabilistic Finite-State Automata K Viard, M. P. Fanti, G. Faraut, J-J Lesage
To cite this version:
K Viard, M. P. Fanti, G. Faraut, J-J Lesage. Human Activity Discovery and Recognition using Probabilistic Finite-State Automata. IEEE Transactions on Automation Science and Engineering, Institute of Electrical and Electronics Engineers, In press. hal-02557589
HAL Id: hal-02557589 https://hal.archives-ouvertes.fr/hal-02557589 Submitted on 28 Apr 2020
HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. IEEE Trans. on Automation Science and Engineering, 2020 1
Human Activity Discovery and Recognition Using Probabilistic Finite-State Automata
K. Viard, M. P. Fanti, Fellow, IEEE, G. Faraut, Member, IEEE, and J-J. Lesage, Member, IEEE
Abstract— Ambient assisted living and smart home technologies I. INTRODUCTION are a good way to take care of dependant people whose number will increase in the future. They allow the discovery and the CCORDING to several demography studies, the recognition of human’s Activities of Daily Living (ADLs) in order A dependency rate of the world population is continuously to take care of people by keeping them in their home. In order to increasing since 2010. In 2050, the part of the population aged consider the human behaviour non-determinism, probabilistic 60 or more will rise to 30% in the majority of countries [1]. This approaches are used despite difficulties encountered in model societal evolution is becoming an important human and generation and probabilistic indicators computing. In this paper, economic issue for next years. In fact, current health and well- a global method, based on Probabilistic Finite-State Automata and fare institutions will not be sufficient to treat this proportion of the definition of the normalised likelihood and perplexity is proposed to manage ADLs discovery and recognition. In order to elderly people. Hence, severe pressure on the public healthcare reduce the computational complexity, some results about a sector and lack of adequate facilities are driving the way in simplified normalised likelihood computation are proved. A real which health services are delivered to the patients case study showing the efficiency of the proposed method is [2, 3]. Therefore, alternative solutions have to be found and discussed. rapidly developed in order to supply help and independence to people suffering from not too severe pathologies. Note to Practitioners— This paper is motivated by the problem Smart homes, which integrate medical equipment and other of the automatic recognition of activities that are daily performed ambient assisted living technologies, can play a lead role in by elderly or disabled people in a smart dwelling. The set of revolutionizing the way in which healthcare services are being activities to be recognized is defined by a medical staff (e.g. to provided to the elderly people [4, 5]. Health at home systems is prepare meal, to do housework, to take leisure, etc.) and an efficient possible solution that consists of keeping old people correspond to pathologies that have to be monitored by doctors at home as long as possible, thanks to an automatic monitoring (e.g. loss of memory, loss of mobility, etc.). The proposed method of their daily life. is based on a systematic procedure of offline construction of a The Activities of Daily Living (ADLs) analysis is one of the model for each activity to be monitored (the activity discovering step). The online recognition of activities actually performed (the main investigation fields in health-assistive smart homes and activity recognition step) is afterwards based on these models of smart environments [6-15]. To live independently at home, activities. Since the human behaviour is non-deterministic, and individuals need to be able to complete ADLs such as eating, may even be irrational, probabilistic activity models are built from dressing, cooking, drinking, and taking medicine. Current a learning database. In the same way, probabilistic indicators are studies about ADLs mainly treat two problems: Activity used for determining online the most probable activities actually Discovery (AD) and Activity Recognition (AR) [6]. In performed. The efficiency of the proposed approach is illustrated particular, AD is a technique employed to reduce the need for through a case study performed in a smart living lab. expert knowledge by using learning algorithms to discover activities in sensor events raw sequences [6-8]. In addition, the Index Terms— Smart Home, Activity of Daily Living, Activity objective of AR is to detect the activity actually performed by Discovery, Activity Recognition, Probabilistic Finite-State the inhabitant [9-15]. The generally accepted approach to Automata, Normalised Likelihood. activity recognition is to design and/or use machine learning techniques to map a sequence of sensor events to a corresponding activity label. In [9, 12] the authors describe all inhabitant activities by Manuscript received March 5, 2018. only one Hidden Markov Model (HMM), one of the most K. Viard, J-J. Lesage and G. Faraut are with the Université Paris-Saclay, frequent models used in literature for AD and recognise each ENS Paris-Saclay, LURPA, 94235, Cachan, France (e-mail: {kevin.viard, jean- jacques.lesage, gregory.faraut}@ens-paris-saclay.fr) activity by applying the Viterbi algorithm [11]. Unfortunately, M. P. Fanti is with Dipartimento di Ingegneria Elettrica e dell’Informazione, the complexity of the model drastically increases with the Politecnico di Bari, Via Orabona 4, 70125 Bari, Italy (phone: 39-080-5963643; number of activities and sensors. Furthermore, the used model fax: 39-080-5963410; e-mail: [email protected]). has no intermediary semantic levels between activities and sensors and the precision of the recognition is not guaranteed. In a previous work [8], each activity is modelled by only one IEEE Trans. on Automation Science and Engineering, 2020 2
HMM for discovery purpose and in [10] a new probabilistic with the methods of the related literature. Indeed, the proposed indicator for AR, called normalised likelihood, is proposed. method allows building the AD module and performing the In [13-15], the authors present a system that recognises a set recognition by using short strings of data: hence, the approach of activities modelled by HMMs. Moreover, activities are can be suitably applied in real situations. classified by a probability that allows recognising the In this paper, assumptions and related works are first performed activity as being the one, which is represented by the presented in Section II and Section III proposes a formal most probable model. Typically, the works focusing on statement of the problem. Section IV and V present the AD and recognition give few details about how the probabilistic models AR methods, respectively, and Section VI discusses a real case are built. By using a similar model, paper [16] addresses the study. Finally, Section VII draws the conclusions. problem of recognising ADLs in smart homes in a Hidden Semi-Markov Model (HSMM) framework. In addition, II. LITERATURE REVIEW AND ASSUMPTIONS comparing different kinds of HMMs, the authors show that the In this section we start by listing the main assumptions of proposed novel form of HSMM, called Coaxian Hidden Semi- the proposed framework, in relation with the results and the Markov Model, performs online activity classification and contributions of the related literature. The methodology segmentation from a segmented training data. The performance proposed for solving the AR and AD problems are based on the of the model is compared with various counterparts by a following assumptions: likelihood computation that can be used only if the systems 1) only binary sensors are used for observing patient exhibit the same event sets. On the contrary, typically activities activities; are linked to different events (i.e., sensors) because they occur 2) human behaviour is non-deterministic and may even be in different home areas and are detected by using equipment irrational; located in different areas. A common drawback of these 3) the smart dwelling is supposed to be occupied by a single approaches is that they do not explain how the probabilistic inhabitant; model of scenarios is built and, typically, they are manually 4) the database which has been recorded during the learning built by an expert. period, and which is the input data of AD, does not include This paper proposes a novel approach for AR and AD in the knowledge of activities which have been actually order to fill the gap that is present in the works of the related performed. literature. Indeed, most contributions suggest AR The rationality of considering such assumptions is based on methodologies, but they do not give precise information on the the following justifications that are discussed on the basis of way activities are discovered and whether the recognition the literature review. works well when there are variations during the performance of the activities. Assumption 1: The novelty of this paper is threefold. In most works of the related literature, the use of cameras First, the activities are modelled by using Probabilistic Finite is needed in at least one step of ADLs monitoring [18-20]. State Automata (PFA), a superclass of HMMs [17] that are Indeed, cameras provide information of very high level of powerful modelling techniques when the system states are semantics for discovering activities. Nevertheless, they are partially or completely unknown. On the other hand, PFA often considered too intrusive and raise problems of models are chosen for three main reasons: i) the structure of the acceptance by monitored people [21]. This is the main reason PFA can be automatically deducted from the input data of AD, why cameras are not used in this work, but only binary sensors, hence, the use of HMMs is not necessary; ii) the proposed such as motion detectors or door barriers. Such sensors are framework allows automatically building the models of furthermore low cost, what is very interesting in the context of scenarios by overcoming the drawback of the Markov models; health at home, which is becoming a mass problem. iii) the PFA allows exploiting existing tools and consolidated theoretical results that give the possibility of automatically Assumption 2: building the activity models (see for example the algorithms Human behaviour is non-deterministic and characterized by presented in [30-31]). small changes every day. Therefore, ADLs models which are Second, in order to identify ADLs linked to different sets of not robust to minor variations, like data mining approach [7, events and having some non-deterministic variations, a new 22], are non-considered to maximise the robustness and method is developed to allow distinguishing activities. To this applicability of the proposed method. aim, we adopt the perplexity evaluation by introducing the normalized likelihood to select the most probable activity. This Assumption 3: methodology allows overcoming the limit of considering In order to assume that more inhabitants are in the models sharing the same set of events. Moreover, in order to dwelling, it is necessary is to consider that each inhabitant reduce the computational complexity some results simplifying wears a sensor that allows identifying himself (e.g. a RFID the normalised likelihood computation are proved. sensor) and therefore knowing who has generated which event. Third, the presented methods are applied to a real smart flat Such wearable sensors are very often used for AD and AR provided by the ENS Paris-Saclay (France) and experimental [20], but the efficiency of this kind of sensors strongly depends results are discussed for both AD and AR methodologies. on the ability and the willingness of the patients to wear them Finally, an important benefit of the proposed approach every day, and sometimes during the night. As in the case of concerns the collection of the needed expert data in comparison IEEE Trans. on Automation Science and Engineering, 2020 3 cameras, this sensor technology also raises some problems of The structure of the framework developed to perform AD acceptance and it is sometimes not compatible with the and AR procedures is represented in Fig. 2 showing the main pathology of patients to be monitored (e.g. loss of memory). modules of the strategy: the AD and AR modules. Hence, according to Assumption 1, only binary sensors are In some preliminary operations, the set of activities to be used for both AD and AR and it is necessary to assume that a monitored are determined and the related actions and moves are single inhabitant is living in the smart home. This assumption singled out. On the basis of the considered moves, the sensors is quite restrictive but allows proposing a complete solution for are chosen and positioned in the house. AD and AR that is based on the use of binary sensors only.
Assumption 4: In the main part of the works based on assumptions 1 to 3, a perfect knowledge of activities performed by the inhabitant during the learning period is needed [12, 23]. This information is in practice very difficult to obtain. In [23], the monitored patient indicates what activity is performed. Of course, the efficiency of this approach is compared with the ability and the willingness of the person to declare his activity: in general, numerous reported activities errors are introduced in the database [23]. In other works [19], experts are in charge of enriching the database by studying sensor logs or using cameras exclusively during the learning phase. This approach is Fig. 2 Structure of the activity discovery and recognition. expensive, intrusive and not accurate. In both cases, the labelling step is difficult and unreliable: in this paper the AD The AD module is applied off-line for a learning period with does not require the knowledge of actually performed activities the objective of generating formal models of ADLs. To this during the learning phase. aim, in coherence with assumptions presented in Section II, two sets of inputs denoted � and � are employed: III. FRAMEWORK AND NOTATIONS FOR THE ACTIVITY • � symbolizes the items that are provided by an expert: DISCOVERY AND RECOGNITION 1) the set of activities to be monitored; 2) the set of actions In this section we describe the main components of the composing each activity; 3) the moves connected with the proposed discovery and recognition framework that allows actions and linked with the sensors in the dwelling; automatically building models representing activities, models • � is a log obtained by observing an inhabitant life during that are used to recognize the ADL performed by the inhabitant. a learning period, i.e., streaming data represented by This framework has to consider the privacy of the inhabitant, sequences of events detected by the sensors. the non-determinism of the human behaviour, and the different Starting from such inputs, the AD module provides the set ways to perform the same ADL, i.e. the slight variations to � = {� } of PFA models of the activities. Note that in this perform an activity has to lead to a recognition of the activity. paper, symbol � Î� is used to represent both the activity and Hence, we describe the structure of the AD and AR tasks and the PFA modelling this activity. the used notations for the definition of the PFA. The AR module works on-line to identify in real time the
A. Framework of the Activity Discovery and Recognition activity actually performed by the inhabitant. The inputs of this ADLs are regularly performed by a person and can be module are the following: decomposed into several actions [18, 24, 25]. For instance, • the observation � provided by the binary sensors of the “cooking” can be decomposed in “preparing pasta”, “preparing real-time behaviour of the monitored person; a ready-cook dish”, “ordering meal on the net”, etc. Moreover, • the set of the activity models � = {� } obtained by the actions can be described as a sequence of elementary moves. AD. The hierarchical decomposition of activities in actions and The output � of the AR module is the recognition of the observable moves is represented in Fig. 1. Note that an activity performed by the inhabitant. observable move can be linked to several actions and can be observed by binary sensors. B. Probabilistic Finite-State Automata: Notations and Definitions In order to describe the non-deterministic behaviour of the house inhabitant, the activities are modelled in a PFA framework as follows [17]:
Fig. 1 Hierarchical decomposition of activities in actions and moves. Definition 1 : A PFA � is a tuple 〈 〉 � = �� , Σ� , �� , �� , �� , �� , where:
• �� is a finite non-empty set of states �; IEEE Trans. on Automation Science and Engineering, 2020 4
• Σ� is a non-empty alphabet of events �; • � ⊂ � is the projection of sequence � on alphabet Σ� of activity � , where the projection of � ∈ Σ∗ on alphabet • �� ⊆ �� × Σ� × �� is a set of transitions; Σ is defined as follows [29]: • �� ∶ �� → [0,1]: the initial-state probabilities; � ∗ ∗ ���� ∶ Σ → Σ� (3) • �� ∶ �� → [0,1]: the transition probabilities; • �� ∶ �� → [0,1]: the final-state probabilities. with: ����(�) ≔ � Note that �� , �� and �� are functions such that [17]: � �� � ∈ Σ� ����(�) ≔ ∑ � �� � ∈ Σ \Σ ∈ � �� (�) = 1, (1) � ( ) ( ) ( ) ∗ ���� �� ≔ ���� � ���� � for � ∈ Σ , � ∈ Σ� and ∗ ∗ ∀� ∈ �� , �� (�) + ∑ ∈ , ∈ �� (�, �, � ) = 1. � � (2) where Σ is the set of system events, Σ and Σ� represent
the Kleene-closure [29] of Σ and Σ� , respectively, and � When used in subscript of a symbol, � represent the PFA is the empty symbol, i.e., the sequence of length 0. In other to which the symbol is linked. words, the mathematical projection function allows obtaining a sequence including only events of a specific Now, according to Definition 1, the PFA model of activity activity � . For instance, let � = � � � � � � � � � � � is specified as follows: be an observed sequence: the projection of w on Σ� = • � ∈ �� represents an action performed during the {� , � , � } is the sequence � = � � � � � � . activity; transitions starting from this state represent • ℒ(�) is a language generated from sequence � and the probabilities to switch from this action to another composed by all substrings of w such that |�| ≥ 2. one; IV. ACTIVITY DISCOVERY • � ∈ �� is the initial dummy state that we consider as the initial condition of the activities where no action In this section the AD method to model ADLs in a PFA is performed; framework is developed by following three steps: 1) generating
• an event � ∈ Σ� describes a move that may occur in the PFA structure; 2) analysing and processing a set of a state of activity � ; streaming data; 3) computing the probabilities of the PFA model. • a transition (� , � , � ) ∈ �� if starting from state � event � may occur and activity � reaches state � ; A. Generation of the PFA Structure • each activity starts in the initial dummy state � with Starting from the hierarchical description of each activity probability equal to 1, it holds �� (� ) = 1 and (given in Fig. 1), the PFA structure of each activity � is ( ) �� � = 0 for each � ∈ �� ; determined by building the associated transition digraph (direct
• �� (� , � , � ) is the probability, in activity � , of graph) D(� )=(�� , �� ). More precisely, the set of nodes of
observing event � and destination state � starting D(� ) corresponds to the states set �� and the set of direct
from state � ; arcs is associated with the transitions in �� , i.e., there exists a direct arc starting from node � and ending to node � , if • �� (� )=0 for each � ∈ �� .
An example of PFA is illustrated on Fig. 3. (� , � , � ) ∈ �� . Fig. 3 shows an example of digraph describing the structure of an activity. On the basis of Definition 1, several objects can be defined: • � = � � … � is a sequence of observed events and the length of � (denoted |�|) corresponds to the number of events in the sequence; • � = (� , � , � , � , � , … , � , � , � ) is a path of transitions for � in � , i.e., the sequence of transitions
(� , � , � ), (� , � , � ),…, � , � , � ∈ �� consistent with � = � � … � ; • ℒ(�) denotes the language, i.e., a set of sequences of events generated from the observed sequence �; • Σ� is the set of all possible sequences of length � which
can be generated with symbols of the alphabet Σ� . For
example, let Σ� = {� , � , � } be an alphabet, then Σ ={� � � , � � � , � � � , � � � , … , � � � } is the set � Fig. 3 Generation of model structure from the activities semantic description. of all possible sequences of length 3; IEEE Trans. on Automation Science and Engineering, 2020 5
B. Analysis and Process of Streaming Data � (� |� → � ) �(� |� → � , � ) = (6) This section briefly describes the approach for processing ∑ ∈ � �(� |� → � ) the streaming data in the context of the smart home dataset. The sensors embedded in the considered smart apartments are where � (� → � |� ) denotes the number of occurrences of binary sensors that are in two states - ‘ON’ and ‘OFF’. Then transitions from � to � and � (� |� → � ) the occurrence of they can generate two events: one is linked to the rising edge of event � conditioned to the transition � → � during the run the binary information (sensor|1), the other one is linked to the period. falling edge (sensor|0) (see as an example Fig. 4). Unfortunately, the number of the performed actions in each In the related literature different approaches are proposed activity is unknown. By consequence, we can only consider the for processing streaming data represented by sequences of the sequence � to evaluate probabilities (5) and (6). Then, in sensor firings [26, 27]. The explicit windowing is not adapted to our hypotheses because it requires an expert to segment order to compute �(� → � |� ) and �(� |� → � ), we use sequence. The time windowing could be interesting in case of observable occurrences or successions of events by the timed approach, but our approach is event-based. Furthermore, following indicators: with time windowing, some windows could have no event. By • � : number of times event � is observed in the consequence, we choose the sensor event-based windowing sequences � ; approach that consists in dividing the sequence into windows • � : number of times event � is observed as first containing a fixed number of sensor events. In this case, the event in the sequences � ; windows vary in their duration and this is fine considering that • � : number of times event � follows event � in during the performance of activities, multiple sensors could be → triggered, while during silent periods, there will not be many the sequences � . sensor firings. Hence, we consider a long observation that we divide in windows w containing a fixed number of events and Let � (resp. � ) be the number of actions (states) having we denote by � the projection of w on activity alphabet Σ� . the event � (resp. � ) as input, and let Σ (resp. Σ ) be the set of events that are in input of state � (resp. � ). Now, � (� → � |� ) and � (� |� → � ) are determined as follows:
� (� → � |� ) 1 ⎧ × � �� � = � � ⎪ ⎪ ∈ ⎪ � �� � = � ≠ � (7) = → Fig. 4. Event emission from sensor binary information. , ∈ ⎨ C. Probabilities Computation ⎪ 1 ⎪ × � �� � ≠ � ≠ � ⎪ → The goal of this subsection is to show how to compute the � ⎩ ∈ ; ∈ ; ∉ probabilities associated to each transition of the PFA model, that is the output of the AD module. and � (� |� → � ) = � �� � ∈ Σ ; 0 otherwise. (8) Let � be the PFA of the activity � ; let � , � ∈ Σ� be two events of the activity � ; let � be a sequence of events of Σ obtained by projection function of the sequence w on In equations (7) and (8) we consider the fact that when a � human starts a new action, the performed move is not dependent alphabet Σ ; let � ,� ∈ � be two actions performed in the � � on the past action. In equation (7), the number of occurrences activity � . The probability to move from action � into action of transitions from � to � is determined considering 3 cases: � by the event � is defined by: i) if � = � we consider the number of times event � that is in input of � is observed as first event, averaged by the number �(� , � , � ) = �(� → � |� ) × �(� |� → � , � ). (4) of states having the event � as input; ii) if � = � ≠ � we count the number of times event � follows event � where � In words, the probability to move from � to � by event � and � are in input of � ; iii) if � ≠ � ≠ � then the count is is defined by the probability to move from � to � multiplied similar to case i) but for the computation of � → that by the probability to generate � during this move. Moreover, the probabilities of the equation (4) are computed represents number of times � follows � in the considered by a standard approach: the likelihood of one or more events sequence. happening divided by the number of possible outcomes. More Moreover, in equation (8), we consider the occurrence of � precisely, it holds: independent of the starting state � but it depends only on the reached state � . Hence, the occurrence of event � conditioned � (� → � |� ) to the transition � → � during the run period is equal to �(� → � |� ) = (5) ∑ ∈ � (� → � |� ) � IEEE Trans. on Automation Science and Engineering, 2020 6
number of times � , which is in input of � , is observed during Definition 3 : Perplexity the run period. The perplexity “per string” is defined as the inverse of the geometric mean of the likelihood: D. Complexity of the AD algorithm (ℒ( )) | (10) The complexity of the three steps of the AD algorithm is the ��(ℒ(�) � ) = �� (�) . following: ∈ℒ( ) � = The perplexity is based on the computation of likelihood | | | | = � ( ���������� − � ) max ���� �� � (�) for � ∈ ℒ(�) and can be used to distinguish between � � two automatons � and � only if the same sequence � is used ∗ ∗ for the two models; i.e., if � ∈ Σ� and � ∈ Σ� . However, the + max c���(Σ� ) × ����(�) � values of the perplexity are not significant if the alphabet of the where: sequence is not included in the alphabet of automaton. • symbol card(.) stands for “cardinality of the set (.); In the considered problem two issues prevent applying the • |����������| − |�| is the number of windows standard likelihood-based approaches. First, all the PFAs �
obtained by sliding; do not share the same alphabet Σ� . Second, it is necessary to project the observed sequence on the PFA’s alphabet in order • max ���� �� is the maximum number of actions � to filter the events not belonging to the PFA alphabet. linked to the same activity; In order to overcome these two issues, we propose an AR
• max c���(Σ� ) is the maximum number of events protocol that is composed by four steps: � linked to the same activity. 1 – the windowing of observed events considers a sequence Note that the complexity of the AD algorithm is polynomial � composed with a fix number of events; 2 – for each PFA � , the projection � of the considered and mainly depends on the maximum number of actions linked � to the same activity. Since the number of actions is not large, sequence � is obtained; the algorithm is easily computable. 3 – a language ℒ(� ) is generated for each projected � sequence � as described in subsection III.B; V. ACTIVITY RECOGNITION � 4 – the probability for each model � to generate the A. Recognition protocol language ℒ(� � ) is computed by likelihood algorithm. The AR consists in recognising on-line (i.e., in reasonable time), which activity is actually performed by the inhabitant. B. Normalised likelihood and perplexity Recognition is done by computing an indicator evaluating Note that methods based on the likelihood computation are the probability that a PFA generates an observed sequence �. not pertinent to the projected sequences since it can lead to The activity having the maximum probability of generating compare likelihood or perplexity of sequences having different sequence � can be considered as being the activity currently lengths. The risk is the shorter sequences will always more performed. In [17], perplexity is presented as a useful indicator probable than longer ones as a consequence of the projection of to compute distances between a language ℒ(�) generated from the observed sequences. In order to be able to compare the observed sequence � and a PFA � . sequences having different length, we propose a normalisation of the classical likelihood computation. Definition 2 : Likelihood
Let Θ� (�) be the set of paths for � in � . The probability Definition 4 : The normalised likelihood of generating � with � is the likelihood of � considering � Let us consider the PFA � and a given sequence � ∈ Σ∗ . � and can be computed by: The normalised likelihood of sequence w in � , is defined as: