Let the Objects Tell What You are Doing

Gabriele Civitarese, Claudio Bettini, Stefano Belfiore
University of Milan, Via Comelico 39, Milano
[email protected], [email protected], [email protected]

Abstract
Recognition of activities of daily living (ADLs) performed in smart homes has proved to be very effective when the interaction of the inhabitant with household items is considered. Analyzing how objects are manipulated can be particularly useful, in combination with other sensor data, to detect anomalies in performing ADLs, and hence to support early diagnosis of cognitive impairments in elderly people. Recent improvements in sensing technologies can overcome several limitations of the existing techniques for detecting object manipulations, which are often based on RFID, wearable sensors and/or computer vision methods. In this work we propose an unobtrusive solution that shifts the entire monitoring burden to the objects' side. In particular, we investigate the effectiveness of tiny BLE beacons, equipped with accelerometer and temperature sensors, attached to everyday objects. We adopt statistical methods to analyze in real time the accelerometer data coming from the objects, with the purpose of detecting specific manipulations performed by seniors in their homes. We describe our technique and present the preliminary results obtained by evaluating the method on a real dataset. The results indicate the potential utility of the method in enriching systems for the recognition of ADLs and abnormal behaviors by providing detailed information about object manipulations.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
UbiComp/ISWC'16 Adjunct, September 12-16, 2016, Heidelberg, Germany
© 2016 ACM. ISBN 978-1-4503-4462-3/16/09...$15.00
DOI: http://dx.doi.org/10.1145/2968219.2968285

ACM Classification Keywords
I.5.1 [Pattern Recognition]: Models—Statistical

Author Keywords
Activity recognition; smart homes; sensing

Introduction
The recent improvements in sensor technologies are having a deep impact on a long-standing challenge in ubiquitous computing, namely the recognition of human activities [4] and in particular of activities of daily living (ADLs).

The advantages of having sensors on everyday artifacts for ADL recognition were identified long ago, in solutions mainly based on accelerometers and RFID [2, 11]; however, the technology has not been sufficiently reliable and cost-effective for a wide-scale deployment. A common argument against using sensor-augmented objects, as opposed to wearables, for ADL recognition, in addition to the technological issues, has been the difficulty of identifying the subject performing the activity when the same space has multiple inhabitants [1]. In this respect, there has been some progress both on the technological side (miniaturization of identifying beacons) and on wearable-free solutions based on data analysis [7]. Another approach recognizes specific object manipulations with neither sensors on objects nor wearables by taking advantage of audio and/or video recording [13], but this solution is often perceived as too obtrusive.

Our investigation is driven by a specific application domain: the recognition of fine-grained anomalies in the performance of instrumented activities of daily living by elders at risk of cognitive impairment [12]. Clinicians need to identify manipulations of specific objects in a home environment, including omissions, substitutions and improper manipulations; examples include reaching and opening a wrong medicine box, using the wrong tool to perform an action, or unnecessarily repeating a given manipulation. The system described in this paper is not intended by itself to support early diagnosis based on improper object manipulations. However, reliable object manipulation monitoring is an essential subsystem of a more complex monitoring environment. In particular, what we describe is intended to substitute the RFID-based subsystem used in [12] to monitor the use of items in preparing and consuming meals as well as in taking medicines. As shown in Figure 1, other sensor subsystems are used to recognize anomalies in performing these high-level activities, including sensors revealing presence, pressure, temperature, power consumption and more.

In our experience with deployments in the real homes of elderly people for continuous monitoring, solutions based on wearables are critical: there is no guarantee that wristbands or pendants are constantly worn, not to mention the RFID readers that have been proposed for the advantage of identifying the specific manipulated object. There are also indications of a general aversion or disaffection of users towards wearables targeted at healthcare-related applications [5]. Similarly, cameras and microphones are sometimes tolerated in retirement residences, but much less so in private homes.

Our major contribution is a set of experimental results on the effectiveness of unobtrusive object manipulation recognition, using current commercial low-cost and low-energy-consumption multi-sensor devices that can be attached to everyday objects. A closely related work is [10], which uses acceleration data acquired from sensors on items to evaluate surgeons' skill in manipulating precision tools. With respect to that work, we monitor manipulations relevant to our application domain, which are more coarse-grained and of a different nature. We collected a dataset of more than two thousand labeled manipulations, and we report encouraging preliminary results on their recognition through machine learning techniques applied to the accelerometer data collected from the objects. We believe that our study contributes to the design of a sensing subsystem that could be effectively integrated into the smart-home environments used in several previous works on monitoring complex activities at home [4, 9, 6], independently of the algorithmic method being used, since object manipulations may be considered as simple events.

Modeling manipulations
We define an object manipulation as the interaction of an individual with an object of interest with the objective of achieving some task within the execution of a particular ADL. More formally, we define a manipulation instance as m = ⟨o, M, ts, te⟩, where o is the object manipulated, M is the manipulation type, and ts and te are respectively the start and end times of the manipulation. Given i_A, an instance of an ADL A, we say that a manipulation m ∈ i_A if m is performed during i_A.

Example 1. Considering a glass as the object of interest, some possible types of manipulation of that object could be: using the glass to drink while eating a meal, moving the glass on the table while setting the table, emptying the glass in the sink, inserting the glass in the dishwasher, and so on. A manipulation instance could be m = ⟨glass, drinking, 12:45:32, 12:45:45⟩ where m ∈ i_eating (a manipulation that consists in using the glass to drink during the consumption of a meal).

We point out that not every type of object manipulation is interesting for monitoring ADL execution; hence we divide the manipulation types into two categories: relevant and irrelevant. We consider a manipulation relevant if the task achieved by performing it is crucial for monitoring a particular ADL, and irrelevant otherwise. Of course, the classification of manipulation types into relevant and irrelevant has to be decided carefully by domain experts. Manipulations considered relevant are further classified into specific sub-classes.

Example 2. Suppose that we are interested in monitoring the activity of taking medicines. In this scenario, we consider a manipulation relevant if a medicine package is extracted from its repository, or if it is opened; it is considered irrelevant if a medicine package is just displaced inside the repository while searching.

The technique
In this section, we illustrate our technique for analyzing the data coming from the accelerometers positioned on objects, in order to recognize specific manipulations.

A recognition framework
The system is considered as part of a smart-home environment instrumented with several environmental sensors. The general architecture is shown in Figure 1. Each object of interest has an attached device that incorporates a 3-axis accelerometer sensor. Each device periodically communicates the raw sensor data, along with the device's unique identifier, to the smart-object data processing module. This module is in charge of: a) segmenting the accelerometer data in order to identify the manipulation occurrences, b) extracting several features and c) applying machine learning techniques in order to recognize the specific manipulation performed. Since each type of object has an associated set of specific manipulations (e.g., a bottle of water is used to pour/drink water, while a medicine box is used to extract pills), we built a specialised classifier for each object type. Of course, this does not mean using a different classifier for each object: for instance, a bottle of water and a milk box can be manipulated similarly, and a single classifier is in charge of recognizing the manipulations of both objects. Detected manipulations, along with measurements acquired from the smart-home environmental sensors, are transmitted to a system that is in charge of recognizing the ADLs performed by the monitored subject and possible abnormal behaviors.

Segmentation and feature extraction
We pre-process the data transmitted from the objects in order to identify the manipulation occurrences. To do this, we analyze the 3-axis accelerometer data to detect whether an object is in motion, using a straightforward threshold-based method that detects when the object starts and stops moving. Each manipulation occurrence occ_i = ⟨o, ts, te, x, y, z⟩ is represented by the object o manipulated, the start time ts (i.e., the time instant at which the object started moving), the end time te (i.e., the time instant at which the object stopped moving) and the accelerometer data on the three axes. The output of the segmentation module is a set of n manipulation occurrences O = {occ_1, occ_2, ..., occ_n}.
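As an illustration, the threshold-based segmentation step can be sketched in Python. The packet layout and the 0.15 g motion threshold below are our own assumptions for the sketch, not values taken from the paper:

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical raw sample: (timestamp in seconds, ax, ay, az in g units).
Packet = Tuple[float, float, float, float]

@dataclass
class Occurrence:
    """One manipulation occurrence occ_i = <o, ts, te, x, y, z>."""
    obj: str
    ts: float
    te: float
    x: List[float] = field(default_factory=list)
    y: List[float] = field(default_factory=list)
    z: List[float] = field(default_factory=list)

def segment(obj: str, packets: List[Packet], thresh: float = 0.15) -> List[Occurrence]:
    """Group consecutive 'in motion' samples into occurrences.

    A sample counts as motion when the acceleration magnitude deviates
    from 1 g (gravity at rest) by more than `thresh`.
    """
    occurrences, current = [], None
    for t, ax, ay, az in packets:
        moving = abs((ax * ax + ay * ay + az * az) ** 0.5 - 1.0) > thresh
        if moving:
            if current is None:
                current = Occurrence(obj, ts=t, te=t)  # object started moving
            current.te = t
            current.x.append(ax)
            current.y.append(ay)
            current.z.append(az)
        elif current is not None:
            occurrences.append(current)  # object stopped moving
            current = None
    if current is not None:
        occurrences.append(current)
    return occurrences
```

Feeding this a stream of still and moving samples yields one `Occurrence` per uninterrupted burst of motion, matching the set O defined above.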

From each manipulation occurrence, we build a feature vector which comprises more than 40 different features regarding statistics on the accelerometer data and the duration of the manipulation.
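A handful of such per-occurrence statistics can be computed as follows; the paper does not list its feature set, so the selection below (per-axis mean, standard deviation, min, max and range, plus duration) is illustrative only:

```python
import statistics as st

def feature_vector(ts, te, x, y, z):
    """Build a flat feature dict from one manipulation occurrence.

    Only a handful of the >40 features mentioned in the text are shown;
    the exact feature set is an assumption of this sketch.
    """
    feats = {"duration": te - ts}
    for name, axis in (("x", x), ("y", y), ("z", z)):
        feats[f"{name}_mean"] = st.mean(axis)
        feats[f"{name}_std"] = st.pstdev(axis)   # population std dev
        feats[f"{name}_min"] = min(axis)
        feats[f"{name}_max"] = max(axis)
        feats[f"{name}_range"] = max(axis) - min(axis)
    return feats
```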

Figure 1: General architecture

Manipulation recognition
The next step is to infer, for each feature vector, the specific manipulation performed with the related object. As previously described, for each type of object we are interested in distinguishing between irrelevant manipulations and a set of specific relevant manipulations. Since we are not interested in detecting the fine-grained types of irrelevant manipulations, they are grouped together into a single class called Irrelevant. In our experience, tree-based discriminative learning models proved effective for this class of problems. In particular, we adopt a supervised approach, using state-of-the-art classifiers such as Random Forest [3] and AdaBoost [8], depending on the specific object.

We adopted two different classification approaches:

• Direct classification
• Multi-layer classification

In the following we describe these techniques.

Direct classification
Our first, straightforward approach consists in directly distinguishing the fine-grained manipulations using a multi-class classifier, as shown in Figure 2.
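A minimal sketch of direct classification with one specialised model per object type. To keep the sketch dependency-free, we substitute a nearest-centroid model for the paper's Random Forest / AdaBoost classifiers, so the learner itself is only a placeholder:

```python
class NearestCentroid:
    """Placeholder learner standing in for Random Forest / AdaBoost."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, lbl in zip(X, y) if lbl == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict(self, x):
        # Nearest centroid in squared Euclidean distance.
        return min(self.centroids,
                   key=lambda lbl: sum((a - b) ** 2
                                       for a, b in zip(x, self.centroids[lbl])))

# One specialised classifier per object *type*, not per object.
classifiers = {}

def train(object_type, X, y):
    classifiers[object_type] = NearestCentroid().fit(X, y)

def recognise(object_type, feature_vec):
    """Direct classification: a single multi-class model labels the
    fine-grained manipulation (relevant classes plus 'Irrelevant')."""
    return classifiers[object_type].predict(feature_vec)
```

The `classifiers` dispatch mirrors the design choice above: objects that are manipulated similarly can share one entry, while each object type gets its own trained model.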

Figure 3: Multi-layer classification schema for a specific object

Figure 2: Direct classification schema for a specific object

Depending on the type of object from which the manipulation comes, a specific classifier is used. Each classifier is trained with a set of relevant manipulations and irrelevant manipulations of the specific object type.

Multi-layer classification
With the objective of improving the above method, we also propose a different approach, represented in Figure 3. Instead of directly detecting the manipulation type from the feature vector, we use two layers. In the first layer, a binary classifier is in charge of distinguishing, for a specific object, relevant manipulations from irrelevant ones. This classifier is trained with relevant manipulations (all grouped together in the same class) and irrelevant manipulations of the specific object. Only the manipulations classified as relevant are forwarded to the second layer, while the others are discarded. In the second layer, a multi-class classifier is in charge of recognizing the specific relevant manipulation performed on the object. Hence, this classifier is trained only with the corresponding relevant manipulations.

Experimental evaluation
In this section we describe our experimental setup and how we acquired a dataset of manipulations, and finally we present our preliminary results.

Experimental setup
Driven by requirements from clinicians, we currently focus on fine-grained monitoring of three complex activities: preparing a meal, consuming a meal and taking medicines. While the final deployment of stable versions of the system is in real homes, our experimental activity is conducted in a smart-room lab. Activity recognition is performed by processing data coming from a wide variety of environmental sensors, including pressure pads, temperature sensors, power meters, magnetic switches, presence sensors and more. The experiment reported in this paper is intended to verify the viability of substituting our RFID-based solution for recognizing manipulations of specific items. For this purpose we selected specific objects: a) medicine boxes, as they have a key role in monitoring adherence to prescriptions and their improper manipulation can also be a useful indicator; b) a liquid bottle, as an example of an item used in meal consumption that may have to be refrigerated and may also play a role in monitoring water consumption; and c) a kitchen tool, a knife in particular, as a tool used both in meal preparation and in meal consumption. These objects are shown in Figure 4 with their sensing device attached.

The sensing devices
In order to monitor object manipulations, we take advantage of current off-the-shelf devices: Estimote's Stickers. A sticker is a packaged PCB with a battery-powered ARM CPU equipped with a 3-axis accelerometer, a temperature sensor, and a Bluetooth Smart radio able to periodically broadcast its sensed data over a short range (a few meters). Their tiny packaging makes it easy to attach them to objects, as shown in Figure 4. Each sticker can be easily distinguished by a unique identifier, which is particularly useful to improve manipulation detection by knowing exactly which kind of object is manipulated.

Field          Description
Identifier     Unique identifier
Motion         Whether the sticker is moving (boolean)
xAcceleration  X-axis acceleration
yAcceleration  Y-axis acceleration
zAcceleration  Z-axis acceleration
Temperature    Sticker's temperature value (in Celsius)
Orientation    Physical orientation of the sticker
RSSI           Signal strength
Power          Signal strength at 0 meters
Battery Level  Sticker's battery level

Table 1: Nearables data frame
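The two-layer scheme of the multi-layer classification approach reduces to a simple composition of the two trained models. The toy classifiers here are plain functions with arbitrary thresholds, standing in for the trained binary and multi-class models:

```python
def multilayer_predict(binary_clf, multiclass_clf, feature_vec):
    """Layer 1 filters out irrelevant manipulations; only vectors judged
    relevant reach the layer-2 multi-class model."""
    if binary_clf(feature_vec) == "Irrelevant":
        return "Irrelevant"  # discarded before the second layer
    return multiclass_clf(feature_vec)

# Toy stand-ins for trained models (thresholds are arbitrary):
binary = lambda v: "Irrelevant" if v[0] < 1.0 else "Relevant"
multiclass = lambda v: "Drink" if v[1] > 3.0 else "Displaced"
```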
Estimote Stickers adopt a proprietary communication protocol called Nearables; Table 1 reports the data frame of this protocol. In our setup, each sticker broadcasts a packet every 100 milliseconds while it is moving, and every 200 milliseconds otherwise. A BLE scanner is in charge of collecting the data coming from each sticker.

Sensor data analysis
We perform data acquisition by scanning the BLE signal with a mobile device. To perform segmentation, we exploit the value of the Motion field transmitted in every packet by the stickers. This field is set to true when the sticker is in motion. Our experiments revealed that this value provides sufficient accuracy for determining the beginning and end of our manipulations. Hence, for a specific sticker, we consider all consecutive data packets with the Motion field set to true as part of the same manipulation occurrence.

A labeled dataset is used to construct the predictive model. We performed experiments with different types of models for each of the considered objects, and we selected Random Forest for the liquid bottle and the medicine boxes, and AdaBoost for the knife manipulations.

Segmentation, feature extraction and classification are performed in real time on the mobile platform. The output, serialized in JSON format, is sent to a REST server for integration with events detected by processing data coming from other sensors, as illustrated in Figure 1.

Figure 4: The monitored objects: (a) liquid bottle, (b) medicine boxes, (c) knife

Dataset collection
Since our recognition technique is based on supervised machine learning, a critical task is the acquisition of a sufficiently large and significant dataset of object manipulations. The dataset must also be annotated with the ground truth for each manipulation. To facilitate this task, we developed a mobile application. The application starts with a simple screen consisting of only one button. When that button is clicked, the Bluetooth scanner starts acquiring Nearables data packets, which are stored internally.
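The Motion-field segmentation and the JSON output described above can be sketched as follows. The dictionary field names loosely mirror Table 1, but the exact wire format of the proprietary Nearables protocol is not given in the text, so this packet layout is an assumption:

```python
import json
from itertools import groupby

def occurrences(stream):
    """Consecutive packets with Motion == true form one occurrence.

    `stream` is assumed to be already filtered to a single sticker
    and ordered by arrival time.
    """
    out = []
    for moving, run in groupby(stream, key=lambda p: p["motion"]):
        if moving:
            run = list(run)
            out.append({"id": run[0]["id"],
                        "samples": [(p["x"], p["y"], p["z"]) for p in run]})
    return out

packets = [
    {"id": "sticker-1", "motion": False, "x": 0.00, "y": 0.01, "z": 1.00},
    {"id": "sticker-1", "motion": True,  "x": 0.40, "y": 0.10, "z": 0.90},
    {"id": "sticker-1", "motion": True,  "x": 0.55, "y": 0.20, "z": 0.80},
    {"id": "sticker-1", "motion": False, "x": 0.00, "y": 0.00, "z": 1.00},
]

# A detected occurrence, serialised as it would be for the REST server:
payload = json.dumps(occurrences(packets)[0])
```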
After a few manipulations, we conclude the experiment and the app performs segmentation: it creates a set of three-axis acceleration data for each manipulation and then allows the user to label each one with the ground truth (Figure 5).

Figure 5: Application layout

As already mentioned, in this experiment we focus on manipulations of a liquid bottle, a medicine box, and a knife. For this first assessment of our system, we collected manipulations performed by six different adults without physical impairments. They executed the manipulations spontaneously within realistic scenarios of activities of daily living in a smart-room lab (e.g., cooking, taking medicines, etc.). The total number of manipulations is 2058, with 887 manipulations involving the liquid bottle, 656 the medicine boxes and 515 the knife. Out of the total, 1365 manipulations are considered relevant, while the rest are considered irrelevant. This distinction is clearly application dependent; in our case it has been driven by the scenarios of our e-health domain and by the clinicians' interest in specific manipulations.

It is important to consider that the specific way in which we perform segmentation can group more than one manipulation into a single one. For example, if a subject extracts the water bottle from the fridge and pours the water into a glass as a single action without interruption, the Motion value of the Nearable remains true and the whole action is segmented as a single manipulation. On the contrary, if the bottle is moved from the fridge to the table and then used to fill a glass, the system identifies two manipulations. We considered alternative segmentation methods, but observed that the presence of these 'composed' manipulations is sometimes a benefit for our specific application, considering the final recognition accuracy.

Liquid bottle's manipulations
The total number of manipulations of the bottle that we acquired is 887. We consider 500 of them relevant, because their detection is useful to monitor the activity of meal consumption or even just drinking (e.g., "extract from the fridge" or "pour water"). Table 2 shows how we classify manipulations of the bottle.

Class: Irrelevant
  Minor displacement          Bottle is displaced in the same place
  Irrelevant                  Bottle is moved, but not by a person (e.g., movements of the fridge)
  Displacing in the fridge    Bottle is displaced inside the fridge
  Opening/closing fridge door When the bottle is in the fridge door and it is opened, the bottle moves
Class: Relevant displacement
  Displaced                   Bottle is displaced from a place to another which is not the fridge
  Inserted                    Bottle is displaced from a place to the fridge
  Extracted                   Bottle is displaced from the fridge to a place
Class: Drinking/Pouring
  Drink                       Bottle is taken from a place, brought to the lips and tilted
  Pour                        Bottle is taken from a place and liquid is poured into a glass

Table 2: Liquid bottle's manipulations

Medicine box's manipulations
The total number of these manipulations is 656. We consider 474 of them relevant, because their detection is useful to monitor the activity of taking medicines (e.g., "extract from the repository" or "open medicine box"). Table 3 shows how we classify manipulations of medicine boxes. Note that distinguishing manipulations like "displacing the medicine M box" and "accessing the content of medicine M box" is very important in our domain, since the first, if not followed by the second, may be an indication that the patient prepared the medicine but in the end forgot to take it.

Class: Irrelevant
  Handle                             Medicine box is taken and handled
  Irrelevant                         Medicine box is moved, but not by a person (e.g., by hitting the repository)
  Displacing in the repository       Medicine box manipulations when someone searches for the correct one
  Opening/closing repository drawer  When the medicine box is in the repository and the drawer is opened, the medicine box moves
Class: Relevant displacement
  Displaced                          Medicine box is displaced from a place to another which is not the medicine repository
  Inserted                           Medicine box is displaced from a place to the correct repository
  Extracted                          Medicine box is displaced from the repository to a place
Class: Accessing content
  Opened                             Medicine box is taken from a place and a blister pack is extracted, in the same place or in another

Table 3: Medicine box's manipulations

Knife's manipulations
The total number of manipulations involving the knife is 515. We consider 391 of them relevant, because their detection is useful to monitor the activity of preparing a meal (e.g., "extract from the drawer" or "cut something"). Table 4 shows how we classify manipulations involving the knife.

Results
Table 5 summarizes our results on the recognition of object manipulations, obtained with 10-fold cross-validation. The table shows the results for both the direct classification approach and the layered approach. Although several extensions will be required, we consider these results encouraging, since they show that direct classification with a simple segmentation strategy and state-of-the-art machine learning already provides quite adequate accuracy for our application requirements. We expected more from the layered approach, which shows improvements only on specific object manipulations.
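The 10-fold protocol behind the reported accuracies can be reproduced with a plain k-fold helper; this is a sketch, where `train_fn` stands for fitting any classifier and is an assumption of ours (the paper does not specify its cross-validation implementation):

```python
def cross_val_accuracy(X, y, train_fn, k=10):
    """Plain k-fold cross-validation returning accuracy in percent.

    train_fn(X_train, y_train) must return a predict(x) callable.
    Folds are taken by striding over the index, without stratification.
    """
    folds = [list(range(i, len(X), k)) for i in range(k)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        X_tr = [x for i, x in enumerate(X) if i not in held_out]
        y_tr = [lbl for i, lbl in enumerate(y) if i not in held_out]
        model = train_fn(X_tr, y_tr)
        correct += sum(1 for i in fold if model(X[i]) == y[i])
    return 100.0 * correct / len(X)
```

With a trivial majority-class baseline as `train_fn`, the helper returns the baseline accuracy that any useful manipulation classifier would have to beat.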
Class: Irrelevant
  Irrelevant                         Knife is moved, but not by a person (e.g., the repository is shaken)
  Displacing in the repository       Knife manipulations when someone searches for a tool or silverware
  Opening/closing repository drawer  When the knife is in the repository and it is opened, the knife is moved
Class: Relevant displacement
  Displaced                          Knife is displaced from a place to another which is not the knife repository
  Inserted                           Knife is displaced from a place to the correct repository
  Extracted                          Knife is displaced from the repository to a place
Class: Cutting
  Cut                                Knife is taken from a place and something is cut, in the same place or in another

Table 4: Bread knife's manipulations

                     Accuracy (%)
                     Multi-layer     Direct          Total
                     classification  classification  occurrences
Bottle
  Total              90.41           91.54           887
  Irrelevant         91.73           94.83           387
  Rel. displacement  85.28           85.71           231
  Drink/Pour         92              91.82           269
Medicine box
  Total              92.07           92.98           656
  Irrelevant         88.46           91.75           182
  Rel. displacement  84.67           85.40           137
  Accessing content  97.03           96.73           337
Knife
  Total              96.88           96.11           515
  Irrelevant         97.58           97.56           124
  Rel. displacement  95.51           93.58           156
  Cut                97.44           97.02           235
Total
  Total              92.56           93.14           2058
  Irrelevant         91.91           94.51           693
  Relevant           93.51           93.06           1356

Table 5: Results

Conclusions and discussion
In this paper, we proposed an unobtrusive and effective approach to monitor specific manipulations of objects as a necessary component for the recognition of normally and abnormally performed ADLs in smart homes. Extensive experiments with a dataset consisting of more than two thousand manipulations show encouraging results.

Technological limitations
The use of BLE accelerometers attached to objects addresses important drawbacks of the technological solutions proposed in the literature. However, it currently has several limitations. First of all, energy consumption: we observed that the high transmission rate we used reduced the battery life to levels not acceptable for a real home deployment. Energy consumption was also affected by the need to increase the standard transmission power in order to cover at least the whole room. A second problem we found is interference when these devices are close to metal objects. Other problems arise when the monitored objects are dipped in water or exposed to high temperatures, since the devices would be damaged. However, we are confident that technological evolution will soon solve these limitations, while the ones affecting other approaches are not only technological, but involve user acceptance and privacy issues that may be more difficult to overcome.

Future work
We intend to extend our work in several directions. We plan to extend the number of objects and manipulations, and to compare different recognition techniques. Acceleration data can also be usefully combined with fine-grained indoor positioning data, as well as with other sensor data, to refine manipulation detection. We also want to perform more experiments acquiring data from senior subjects while they perform manipulations as part of complex activities.

REFERENCES
1. Oliver Amft and Gerhard Tröster. 2008. Recognition of dietary activity events using on-body sensors. Artificial Intelligence in Medicine 42, 2 (Mar 2008), 121–136.
2. Michael Beigl, Hans-W. Gellersen, and Albrecht Schmidt. 2001. MediaCups: Experience with Design and Use of Computer-Augmented Everyday Artefacts. Computer Networks (2001).
3. Leo Breiman. 2001. Random Forests. Machine Learning (2001), 5–32.
4. Liming Chen, Chris D. Nugent, and Hui Wang. 2012. A Knowledge-Driven Approach to Activity Recognition in Smart Homes. IEEE Trans. Knowl. Data Eng. (2012).
5. James Clawson, Jessica A. Pater, Andrew D. Miller, Elizabeth D. Mynatt, and Lena Mamykina. 2015. No Longer Wearing: Investigating the Abandonment of Personal Health-tracking Technologies on Craigslist. In Proceedings of the 2015 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp '15).
6. Barnan Das, Diane Cook, Narayanan Krishnan, and Maureen Schmitter-Edgecombe. 2016. One-Class Classification-Based Real-Time Activity Error Detection in Smart Homes. IEEE Journal of Selected Topics in Signal Processing (2016).
7. Ifat Afrin Emi and John A. Stankovic. 2015. SARRIMA: Smart ADL Recognizer and Resident Identifier in Multi-resident Accommodations. In Wireless Health '15.
8. Yoav Freund and Robert E. Schapire. 1997. A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting. J. Comput. System Sci. 55, 1 (1997), 119–139.
9. Enamul Hoque, Robert F. Dickerson, Sarah M. Preum, Mark Hanson, Adam Barth, and John A. Stankovic. 2015. Holmes: A Comprehensive Anomaly Detection System for Daily In-home Activities. In 2015 International Conference on Distributed Computing in Sensor Systems. IEEE, 40–51.
10. Aftab Khan, Sebastian Mellor, Eugen Berlin, Robin Thompson, Roisin McNaney, Patrick Olivier, and Thomas Plötz. 2015. Beyond Activity Recognition: Skill Assessment from Accelerometer Data. In Proceedings of the 2015 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp '15). ACM.
11. Donald J. Patterson, Dieter Fox, Henry Kautz, and Matthai Philipose. 2005. Fine-Grained Activity Recognition by Aggregating Abstract Object Usage. In ISWC '05: Proceedings of the Ninth IEEE International Symposium on Wearable Computers. IEEE Computer Society.
12. Daniele Riboni, Claudio Bettini, Gabriele Civitarese, Zaffar Haider Janjua, and Rim Helaoui. 2016. SmartFABER: Recognizing fine-grained abnormal behaviors for early detection of mild cognitive impairment. Artificial Intelligence in Medicine (2016).
13. Jianxin Wu, Adebola Osuntogun, Tanzeem Choudhury, Matthai Philipose, and James M. Rehg. 2007. A Scalable Approach to Activity Recognition based on Object Use. In Proc. IEEE International Conference on Computer Vision (ICCV '07). IEEE Press, 1–8.