
Article

A Body Tracking-Based Low-Cost Solution for Monitoring Workers' Hygiene Best Practices during Pandemics

Vito M. Manghisi 1,*, Michele Fiorentino 1, Antonio Boccaccio 1, Michele Gattullo 1, Giuseppe L. Cascella 2, Nicola Toschi 3, Antonio Pietroiusti 3 and Antonio E. Uva 1

1 Department of Mechanics, Mathematics, and Management, Polytechnic University of Bari, via Orabona 4, 70125 Bari, Italy; [email protected] (M.F.); [email protected] (A.B.); [email protected] (M.G.); [email protected] (A.E.U.)
2 Department of Electrical Engineering and Information Technology, Polytechnic University of Bari, via Orabona 4, 70125 Bari, Italy; [email protected]
3 Department of Biomedicine and Prevention, University of Rome 'Tor Vergata', Facoltà di Medicina e Chirurgia, Viale Montpellier 1, 00133 Rome, Italy; [email protected] (N.T.); [email protected] (A.P.)
* Correspondence: [email protected]

 Received: 10 September 2020; Accepted: 26 October 2020; Published: 29 October 2020 

Abstract: Since its beginning at the end of 2019, the pandemic spread of the severe acute respiratory syndrome coronavirus 2 (Sars-CoV-2) has caused more than one million deaths in only nine months. Emerging and re-emerging infectious diseases represent an imminent threat to human health. It is essential to implement adequate hygiene best practices to break the contagion chain, to enhance society's preparedness for such critical scenarios, and to understand the relevance of each disease transmission route. As the unconscious hand–face contact gesture constitutes a potential pathway of contagion, in this paper the authors present a prototype system based on low-cost depth sensors able to monitor in real-time the attitude towards such a habit. The system records people's behavior to enhance their awareness by providing real-time warnings, to provide statistical reports for designing proper hygiene solutions, and to better understand the role of this route of contagion. A preliminary validation study measured an overall accuracy of 91%. A Cohen's Kappa equal to 0.876 supports rejecting the hypothesis that such accuracy is accidental. Low-cost body tracking technologies can effectively support monitoring compliance with hygiene best practices and training people in real-time. By collecting data and analyzing them with respect to people categories and contagion statistics, it could be possible to understand the importance of this contagion pathway and identify for which people categories such a behavioral attitude constitutes a significant risk.

Keywords: body tracking; azure kinect; occupational safety; safety training; hygiene best practices; pandemics containment

1. Introduction

The Risk of Pandemics

Following the pandemic spread of the severe acute respiratory syndrome coronavirus 2 (Sars-CoV-2) and its related disease, COVID-19 [1], the world is experiencing an exceptional threat to public health involving, in only nine months, over 39 million contagions and costing over one million human lives. To contain the virus spread, extraordinary social-distancing measures (also known as lockdown) have been applied worldwide, involving school and workplace closures (Figure 1), travel limitations, and stay-at-home requirements. Though effective, such measures are not sustainable for an extended period and involve heavy repercussions, both social and economic [2,3]. Unfortunately, the pathogen has high transmissibility and infectivity [4,5]. The primary carrier of contagions is made up of respiratory droplets that are directly inhaled or come into contact with the oral, nasal, or periocular mucous by hand-mediated transmission [6,7]. As no specific therapeutics and vaccines are available for disease control, the COVID-19 pandemic is posing a great threat to global public health. Such a critical scenario is related to every pandemic event, and the threat of emerging and re-emerging infectious diseases exists as an imminent threat to human health and international security [8]. Consequently, it is of paramount importance to implement adequate solutions to enhance society's preparedness to face the insurgence of, and coexist with, such events.

Figure 1. Workplace closures during the COVID-19 pandemic on 2 April 2020. There may be sub-national or regional differences in policies on workplace closures. The policy categories shown may not apply at all sub-national levels. A country is coded as "required closures" if some sub-national regions have required closures (source [2]).

Digital technologies are a crucial asset in the fight against pandemics. In the initial spreading phase, they are used to model viral activity and forecast the virus outbreak [9,10], to collect and access real-time data monitoring the spread of contagion [11–13], to boost the research for early illness diagnoses [14], virus genome sequencing, and accelerated drug and vaccine development [15–20], and to fill the communication gap induced by the social distancing measures—the so-called "Infodemic" [21].

In the recovery process, innovative technological solutions have a high potential to safeguard people's health by enhancing the capability to break the chain of contagion. The World Health Organization (WHO) guidelines provide specific hygiene best practices for implementation to prevent contagion [22]. Such practices provide for:

• Environment decontamination based on periodic sanitation and ventilation and the use of specific air conditioning filters;
• Use of Personal Protection Equipment (PPE) such as gloves, masks, face shields, and gowns (Figure 2);
• Strict compliance with Behavioral Protection Practices (hereinafter BPPs), both in interpersonal relationships and at the workplace. BPPs provide for: avoiding crowding conditions, maintaining at least a 1-m distance between individuals, using PPE correctly, frequently washing hands, and avoiding hand–face contacts [23].



Figure 2. Personal Protection Equipment (PPE) used in the industrial scenario at one Lamborghini facility in Italy converted into PPE production after the COVID-19 outbreak. (Gentle permission of Automobili Lamborghini S.p.A.).

Avoiding hand–face contacts is also essential for individuals using PPE. Wearing masks could increase hand–face contacts due to frequent moving of the mask, unconscious "fidgeting," or skin discomfort generated in the perioral area. Furthermore, there is a need for further investigation on this topic, given the lack of a specific evaluation of the relative role of inhalation and hand-mediated transmission [6].

Novel technologies have already been used to support compliance with BPPs. For instance, the Integrated Circuits industry developed solutions based on Radio Frequency Identification (RFID) or Wi-Fi networks for monitoring safety distances and overcrowding [24]. Whilst these solutions exploit body-location tracking and do not allow monitoring of hand–face contacts, another technology—full-body tracking—can be exploited to support compliance with BPPs. It adopts two main approaches: body-worn sensors and digital image processing.

Body-worn sensor systems use accelerometers, gyroscopes, goniometers, and magnetometers that are either held, worn, or attached to body parts. These systems are relatively inexpensive and have advantages such as a relatively quick adoption and integration with existing PPE. Furthermore, they are largely embedded in daily-life devices such as smartwatches or fitness smart-bracelets. Recently, D'Aurizio et al. presented the No Face-Touch system, a solution using Micro-Electro-Mechanical-Systems-based sensors embedded in typical smartwatches, able to estimate hand proximity to the face and notify the user whenever a face-touch movement is detected [25]. The authors implemented two algorithms to detect face-touch movements. The first algorithm correctly detects 91.3% of the face-touch movements by measuring the magnetic field emitted by a handmade magnetic necklace and evaluating the hands' distance. The second algorithm does not use the magnetic necklace and has a 92.6% accuracy. False positives affect the specificity of the system. The rationale for developing the second algorithm relies on some drawbacks hindering the use of this kind of sensor in the daily life scenario. Inertial-Measurement-Unit (IMU) based technologies suffer from ferromagnetic disturbances, and accelerometers require periodic registration because of the inherent drift error [26]. Besides, using PPE may hinder the effective positioning of such sensors which, if not correctly designed, can be invasive to normal activities [27–29].

Among digital image processing-based systems, the marker-based ones are the golden standard. These systems use either reflective or active markers attached to body parts. Their complexity ranges from single-camera to multi-camera setups enabling orientation flexibility and full 3D body-joint tracking [30]. Despite their accuracy, such solutions are limited to laboratory studies because they require a complex and expensive setup difficult to reproduce in the daily life scenario [31].

The other solutions, based on digital image processing, use either 2D Red-Green-Blue (RGB) images or depth images. The former relies on machine-learning algorithms to segment the body parts and detect their 2D space position. Several studies investigated using such technology for detecting hand-over-face gestures [32–34]. This task is considered very challenging to handle because hand occlusion is one of the main factors limiting the accuracy of facial analysis systems. As the face becomes occluded, facial features are either lost, corrupted, or erroneously detected. Nevertheless, this technological solution achieves a good tracking performance (e.g., 90% accuracy for hand–lip occlusions) [32], but the skin color, at the basis of the hand detection algorithms, can influence its effectiveness. Consequently, such a solution is not easily exploitable in environments where BPPs require wearing PPE, such as gloves and face masks. Despite such a limitation, a web application for detecting a single user's hands-over-face gestures was presented in March 2020—the "don't touch your face" application [35]. The application can detect face occlusions and notify them in real-time by using Artificial Intelligence (AI) and a standard webcam. However, it is unable to identify the touched face areas and does not return any user-activity log. Furthermore, it requires a user-specific training phase, and its reliability is strongly dependent on the user's position relative to the webcam.

Such limitations can be overcome by exploiting low-cost body tracking systems based on depth cameras (also known as RGB-D cameras). RGB-D cameras acquire a depth map of the framed scene used by body tracking systems to feed AI algorithms for identifying body parts and their 3D coordinates. These cameras were initially introduced on the consumer market as gaming interface sensors, such as the ASUS Xtion, the Orbbec3d, and the Kinect families. As image-based technologies, they offer great flexibility thanks to the easy setup and ready-to-use functionalities. Thus, despite limited success in the gaming market, they have been used in other fields ranging from research to rehabilitation. Today they find new applications within the Industry 4.0 program. In particular, the Microsoft Kinect family has found applications in many research areas, including human–computer interaction [36–38], occupational health [39–41], physiotherapy [42–45], and daily life safety [46]. Furthermore, Microsoft recently released the Azure Kinect DK (also known as k4a) [47]. It is an evolution of the previous Microsoft RGB-D sensor, the Microsoft Kinect V2, whose accuracy has been widely studied [29,48–50].

This work aims at investigating whether and how commercially available low-cost RGB-D-based tracking systems can be effectively and easily exploited for monitoring people's compliance with behavioral hygiene measures in the workplace. In the remainder of this paper, the authors propose the HealthSHIELD tool, a software prototype based on the Azure Kinect camera. The prototype has been developed for monitoring, in real-time, people's tendency to touch their face, increasing their awareness with real-time warnings, and providing statistical reports to limit, as far as possible, any sources of contagion from pathogens such as Sars-CoV-2.
The authors describe the approach followed to define the system requirements, the functionalities, and a preliminary validation study assessing the system’s accuracy in laboratory conditions. Finally, the authors draw their conclusions and discuss the pros and cons of the proposed solution. To the best of the authors’ knowledge, there are no other applications of RGB-D-based technologies for monitoring operator compliance with BPPs counteracting virus contagion in the workplace.

2. Materials and Methods

2.1. System Requirements

The authors identified, as an application scenario, a fixed workstation such as a position in a factory production line, an office desktop, a reception for triage in a healthcare facility, or a supermarket checkout. The authors focused on two hygiene measures to be applied in this scenario:

• Avoiding hand contacts with the perioral and periocular areas;
• Maintaining an interpersonal distance of at least 1 m.

The system requirements were defined considering the following needs:

• Monitoring people's behavior in the application scenarios for providing proactive feedback;
• Collecting behavioral data for statistical reports;
• Studying the relative importance of direct inhalation of droplets and hand-mediated contamination for better understanding the virus contagion pathways.

This pool of needs is the rationale at the base of the system requirements. In detail, the prototype should:

1. Be transparent to the users (that is, not hindering working activities);
2. Be easy to set up in the workplace;
3. Allow 3D body-part tracking (i.e., hands, eyes, ears, and nose);
4. Allow a reliable real-time hand–face contact detection;
5. Distinguish between hand–face contact areas;
6. Generate real-time hand–face contact warnings;
7. Monitor the distance between people;
8. Log the hand–face contact events for further behavioral studies and eventual task-execution procedure redesign.

By mapping such requirements to the commercially available low-cost RGB-D-based tracking systems, the Microsoft Azure Kinect DK (also known as k4a) meets all of them. The RGB-D technology is transparent to the user, and the operative range of the k4a avoids any interference with normal activities. Differently from other RGB-D sensors such as the Asus Xtion and the Orbbec3d families, the k4a tracks specific body parts such as hands, eyes, ears, and nose (see Figure 3). Such tracking functionalities are ready to use. Furthermore, the k4a is easy to set up (e.g., plug and play) thanks to the availability of pre-built drivers.

Figure 3. The data returned by the Microsoft Azure Kinect Body Tracking Software Development Kit. (a) The surface and the skeleton visualized by the Azure Kinect Body Tracking Viewer; (b) joint positions and respective parent joint in the joints' hierarchy.

Regarding the last three requirements, thanks to the availability of the Azure Kinect Sensor Software Development Kit (SDK) [51], it is possible to easily implement a software tool that fulfills them.

2.2. The HealthSHIELD Tool

The HealthSHIELD tool has been implemented with the C# programming language by using the Azure Kinect Sensor SDK v1.4.0 [51], the Azure Kinect Body Tracking SDK v1.0.1 [52], the C# API wrappers for the previous SDKs v1.4.0 [53], the Windows Presentation Foundation toolkit v3.5.50211 [54], and the OpenGL.Net libraries v0.8.4 [55].

For the main application, the tool should be in a fixed place (e.g., office desktop, industry or laboratory workstation, store checkouts, school desk, and so on) where people accomplish their daily life activities. People under observation are expected to face the sensor, at a distance of about two meters, without turning their face more than 45 degrees with respect to the sensor, which is positioned at a height of about 170 cm. The operative range of the sensor avoids any interference with normal activities.

The tool operates sequentially with four modules. The first module, the data retrieval, acquires the raw data through the k4a and passes them to the gesture detection module, which detects hand–face contact gestures, classifies them with respect to the face contact areas, and labels the corresponding frame accordingly. The third module, the attitude monitor, processes the hand–face contacts, identifies and labels the contact events, and feeds the Graphical User Interface (GUI) module (a description of the GUI module and its functionalities is available in Appendix A).
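The actual prototype is implemented in C# on top of the SDKs listed above; the following Python sketch is only meant to make the four-module data flow concrete. The BodyFrame and FrameLabel structures and the function names are illustrative placeholders, not the Azure Kinect SDK API.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple

Vec3 = Tuple[float, float, float]            # 3D joint position (meters)

@dataclass
class BodyFrame:                             # output of the data retrieval module
    timestamp: float                         # seconds since the observation started
    joints: Dict[str, Vec3]                  # subset of the 32-joint k4a skeleton

@dataclass
class FrameLabel:                            # output of the gesture detection module
    timestamp: float
    contact: bool                            # hand-face contact in this frame?
    area: Optional[str]                      # e.g., "mouth_nose", "eye_left", or None

def run_pipeline(frames: Iterable[BodyFrame], detect, monitor, gui) -> None:
    """Run the four modules sequentially on every newly acquired frame.

    `detect`, `monitor`, and `gui` stand in for the gesture detection,
    attitude monitor, and GUI modules, respectively.
    """
    for frame in frames:                     # the data retrieval module feeds the chain
        label = detect(frame)                # gesture detection
        events = monitor(label)              # attitude monitor (event grouping, reports)
        gui(events)                          # real-time warnings and statistics
```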

2.2.1. The Data Retrieval Module

The data retrieval module connects to the k4a and uses it to collect both the 3D coordinates of the body joints and the infrared video stream. The k4a is based on a 12-Megapixel CMOS sensor and a 1-Megapixel time-of-flight depth camera. The field of view of the depth camera ranges from 75° × 65° to 120° × 120°, with operative distances of 0.5–3.86 m and 0.25–2.21 m, respectively [47]. The body tracking algorithms of the body tracking SDK return for each frame a hierarchical skeleton composed of 32 joint objects (Figure 3). It is worth noting that the k4a can track the eye, nose, ear, and hand joints.

2.2.2. The Gesture Detection Module

The Gesture Detection module uses the 3D joint coordinates to detect frames with hand–face contacts and to classify the contact area using custom heuristics implemented by the authors. Such an approach allows avoiding training any machine learning algorithms.

To detect the hand–face contacts, the head has been modeled as a sphere centered in the midpoint between the two ear joints and having a diameter equal to the distance between them (Figure 4a). A contact volume has been defined corresponding to a contact sphere centered as the sphere modeling the head. If a thumb joint enters the contact volume during the operator's activity, the corresponding frame is labeled as a hand–face contact.

Figure 4. (a) The sphere modeling the head and the contact volume (in light violet). (b) A schematic of the contact areas on the head modeled as a sphere and projected onto the frontal plane; each number corresponds to a joint, in detail: (1) nose, (2) mouth, (3) right ear, (4) left ear, (5) right eye, (6) left eye. (c) A schematic of the contact areas on the head modeled as a sphere and projected onto the sagittal plane.

The contact volume definition represents a tradeoff choice. On the one hand, defining a wide contact volume would involve detecting false contacts; on the other hand, narrowing it would involve missing true contacts. The contact volume was defined by adopting a trial and error approach, iteratively increasing and decreasing the contact sphere radius to find an acceptable threshold. The contact sphere radius has been defined as 70% of the distance between the ear joints to adapt to the operator's anthropometrics. It is worth noting that, according to the anthropometric dimensional data for a 40-year-old American male, the 50th percentile head breadth is equal to 15.7 cm, the 5th percentile is equal to 14.8 cm, and the 95th percentile is equal to 16.8 cm [56]. Consequently, between the 5th and the 95th percentile, the contact sphere radius ranges from 10.36 to 11.76 cm.

After detecting a frame with a hand–face contact, this module classifies the contact area by evaluating the face joint nearest to the thumb one. To this end, the authors implemented custom heuristics to distinguish between the mouth area, nose area, ear area, eyes area, and contacts in a low-risk area far from these regions (Figure 4b,c). By doing so, the sphere surface is divided into areas similarly to a Voronoi diagram [57]. To distinguish contacts in a low-risk area, which is far from the mouth, the nose, the eyes, and the ears, the gesture detection module uses a secondary distance threshold. Each contact frame is labeled as a low-risk area contact if the thumb joint distance from the nearest joint exceeds the secondary distance threshold. The threshold has been defined empirically, similarly to the contact volume radius, and it is equal to 75% of the distance between the two ear joints.

At the end of the contact area classification, each contact belongs to one of the seven areas in Figure 4b,c. Finally, as the mouth area is near the nose one, these areas are merged into a single one in a subsequent processing stage. By doing so, the contact areas are reduced from seven to six (Figure 5). At the end of this stage, each frame has a contact-state label (i.e., contact or no-contact), and each contact frame has the corresponding contact area label.
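To illustrate the heuristics described above, the following self-contained Python sketch reproduces the contact test and the area classification from a dictionary of 3D joint positions. The joint names and the input format are assumptions for illustration (the actual tool works in C# on the k4a body tracking output); the 70% and 75% thresholds follow the values given in the text, and the mouth and nose areas are mapped directly to the merged mouth–nose class.

```python
import numpy as np

# Illustrative joint names; the real k4a skeleton uses its own enumeration.
FACE_JOINTS = {
    "nose": "mouth_nose",      # nose and mouth are merged into a single class
    "mouth": "mouth_nose",
    "eye_right": "eye_right",
    "eye_left": "eye_left",
    "ear_right": "ear_right",
    "ear_left": "ear_left",
}

def classify_frame(joints, thumb="thumb_right"):
    """Return (contact, area) for one frame of 3D joint positions (meters).

    Head model: a sphere centered at the midpoint between the ear joints.
    Contact volume radius: 70% of the inter-ear distance.
    Low-risk threshold: 75% of the inter-ear distance (distance from nearest face joint).
    """
    ear_l = np.asarray(joints["ear_left"], dtype=float)
    ear_r = np.asarray(joints["ear_right"], dtype=float)
    thumb_p = np.asarray(joints[thumb], dtype=float)

    head_center = (ear_l + ear_r) / 2.0
    ear_dist = np.linalg.norm(ear_l - ear_r)
    contact_radius = 0.70 * ear_dist          # contact volume radius
    low_risk_threshold = 0.75 * ear_dist      # secondary distance threshold

    # Contact detection: the thumb joint enters the contact volume.
    if np.linalg.norm(thumb_p - head_center) > contact_radius:
        return False, None

    # Area classification: nearest face joint (Voronoi-like partition of the sphere).
    nearest_joint, nearest_dist = None, np.inf
    for name in FACE_JOINTS:
        d = np.linalg.norm(thumb_p - np.asarray(joints[name], dtype=float))
        if d < nearest_dist:
            nearest_joint, nearest_dist = name, d

    # Contacts far from every face joint are labeled as the low-risk area.
    if nearest_dist > low_risk_threshold:
        return True, "low_risk"
    return True, FACE_JOINTS[nearest_joint]

# Tiny usage example with made-up coordinates (meters): thumb close to the nose.
example = {
    "ear_left": (-0.08, 0.0, 2.0), "ear_right": (0.08, 0.0, 2.0),
    "nose": (0.0, 0.02, 1.92), "mouth": (0.0, -0.02, 1.93),
    "eye_left": (-0.03, 0.04, 1.94), "eye_right": (0.03, 0.04, 1.94),
    "thumb_right": (0.0, 0.02, 1.90),
}
print(classify_frame(example))  # -> (True, 'mouth_nose')
```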

Figure 5. A schematic of the contact areas onto the sphere modeling the head corresponding to the six contact event classes detected by the HealthSHIELD event classifier.

2.2.3. The Attitude Monitor Module

This module connects the gesture detection module to the graphical user interface one and operates in two moments of the observation—in real-time and at the end.

In real-time, the module compares each newly acquired frame with the previous one. If the contact-state label changes from no-contact to contact, or the contact area label changes, this module sends a signal to the interface module, indicating a contact event and the contact area. By doing so, the module groups sequences of frames with the same contact-state label into a contact event with the corresponding contact area label.

In addition, the module checks, for each frame, if there is more than one person in the viewing range and calculates the distance between the hip-center body joints of each one of them. If the interpersonal safety distance is not respected, the module sends a signal indicating an under-security distance condition to the interface. During the real-time observation, all frame labels are stored in a buffer together with their timestamps.

At the end of the observation, the module processes the data in the memory buffer, generates a statistical report on the observed worker's attitude, sends it to the interface module, and stores it in a comma-separated values file.

The flowchart in Figure 6 explains the algorithm on the basis of the detection and classification heuristics and the procedure used to keep track of the frames' contact state. It is executed iteratively for each newly acquired frame, as evidenced by the green arrow connecting the end node to the start one.

Figure 6. Hand–face contact detection and classification algorithms; the green arrow connecting the end node to the start one indicates that the algorithm is iterated for each newly acquired frame.
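The following sketch, in the same illustrative Python style, shows one way the attitude monitor could group the per-frame labels into contact events and check the interpersonal distance; the data layout and function names are assumptions, while the grouping rule (a new event whenever the state switches to contact or the contact area changes) and the 1-m threshold follow the description above.

```python
import math

SAFETY_DISTANCE_M = 1.0  # minimum interpersonal distance required by the BPPs

def group_contact_events(frame_labels):
    """Group per-frame (timestamp, contact, area) labels into contact events.

    A new event starts whenever the state changes from no-contact to contact,
    or the contact area label changes between consecutive contact frames.
    Each event is reported as (area, start_time, duration).
    """
    events = []
    prev_contact, prev_area, start = False, None, None
    for t, contact, area in frame_labels:
        if contact and (not prev_contact or area != prev_area):
            if prev_contact:                       # close the previous event
                events.append((prev_area, start, t - start))
            start = t                              # open a new event
        elif not contact and prev_contact:
            events.append((prev_area, start, t - start))
        prev_contact, prev_area = contact, area
    if prev_contact:                               # close an event still open at the end
        last_t = frame_labels[-1][0]
        events.append((prev_area, start, last_t - start))
    return events

def under_security_distance(hip_centers):
    """True if any pair of tracked people is closer than the safety distance."""
    for i in range(len(hip_centers)):
        for j in range(i + 1, len(hip_centers)):
            if math.dist(hip_centers[i], hip_centers[j]) < SAFETY_DISTANCE_M:
                return True
    return False

# Usage with synthetic labels sampled at 10 Hz: one mouth/nose contact lasting ~0.3 s.
labels = [(0.0, False, None), (0.1, True, "mouth_nose"), (0.2, True, "mouth_nose"),
          (0.3, True, "mouth_nose"), (0.4, False, None)]
print(group_contact_events(labels))                           # one 'mouth_nose' event
print(under_security_distance([(0, 0, 2.0), (0.5, 0, 2.2)]))  # True (people ~0.54 m apart)
```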

2.3. Validation Procedure

As previously described, the detection module together with the attitude monitoring module acts as an event classifier by detecting and classifying the contact events into six classes—that is, the mouth–nose, the right-eye, the left-eye, the right-ear, the left-ear, and the low-risk contact events (Figure 5).

Regardless, during daily life activities, it is possible to raise the hand and cause face-occlusion events without actual face contacts (i.e., gestures in which one hand is raised at the face level and occludes it to the camera field of view without actually touching the face). It is worth noticing that the HealthSHIELD classifier does not detect the occlusion events and only considers the corresponding frames as no-contact state frames.

To answer the research question, an experiment was carried out to evaluate the HealthSHIELD classifier's capability to distinguish between the aforementioned contact event classes. To this end, the detection and classification capabilities of a human observer were used as a baseline.

After this evaluation, the collected data were further processed. The contact event classes were clustered into two main classes. The first one, named risk contacts, contains the mouth–nose, the right-eye, and the left-eye contact events. The second one, named no-risk contacts, contains the right-ear, the left-ear, and the low-risk contact events. After clustering, it was possible to evaluate the overall capability to distinguish between risk and no-risk contact events. Two hypotheses were formulated:

Hypothesis 1 (H1). The HealthSHIELD classifier classifies the events into the six contact areas in agreement with the human observer.

Hypothesis 2 (H2). The HealthSHIELD classifier classifies the events into risk and no-risk contact areas in agreement with the human observer.

2.3.1. Experimental Setup

One volunteer (male, age 49 years, height 178 cm, weight 92 kg) stood in front of the k4a sensor, at a distance of about two meters; the sensor was placed at a height of 170 cm from the ground. Simultaneously, next to the Kinect sensor and at the same height from the ground, an RGB camera framed the scene.

The experimental procedure consisted of two phases. First, the volunteer had to mimic random face-contact gestures (there were three sessions, each one lasting about 15 min). The volunteer during this phase could also execute gestures corresponding to face occlusions. During gesture execution, the k4a sensor framed the volunteer, and the HealthSHIELD tool ran the analysis on a high-end laptop with a CUDA-enabled GPU. Simultaneously, the RGB camera recorded the gestures' execution, and a specific tool recorded the HealthSHIELD GUI on the laptop screen to visually evaluate if the contact events were detected and classified in real-time.

Second, an experimenter carefully analyzed the RGB recordings to detect and annotate the hand–face contact events, the execution time, and the contact area. The experimenter also detected the face-occlusion gestures. An event-based procedure was applied to guarantee the synchronization between both recordings' timestamps, as in [29]. At the end of this phase, it was possible to compare the events detected and classified by the HealthSHIELD tool classifier with the ground truth obtained from the experimenter.

2.3.2. Data Analysis and Metrics

The data gathered from the HealthSHIELD tool and the experimenter allowed the generation of the confusion matrix (Table 1) and the evaluation of the metrics for assessing the classifier's reliability.

Table 1. The confusion matrix for the seven-class classification.

                        Truth Data
Classifier Results   Mouth/Nose  Right Eye  Left Eye  Right Ear  Left Ear  Low Risk Area  Occlusions
Mouth/Nose               74          7         14         0          0           1             0
Right Eye                12         55          1         9          0           1             0
Left Eye                  7          0         64         0          5           1             0
Right Ear                 0          1          0        69          0           6             0
Left Ear                  0          1          1         0         68           6             0
Low risk                  0          5          2         4          7          64             5
Occlusions                0          0          0         0          0          20            65

In the first row of a confusion matrix, there are the classes each sample belongs to, and in the first column, the classes in which each sample is classified. If a sample is correctly classified as belonging to its class, it is a True Positive (TP); if it is correctly classified as not belonging to a class, it is a True Negative (TN). Correspondingly, if a sample is wrongly classified as belonging to a class, it is a False Positive (FP), and if wrongly classified as not belonging to a class, it is a False Negative (FN). By doing so, all the correctly classified samples (the TPs of each class) lie on the main diagonal. In each row, the samples outside the diagonal elements are FPs for the class of that row. In each column, the samples outside the diagonal elements are the FNs for the class of that column. The confusion matrix had seven rows and seven columns, corresponding to the six contact events and the occlusion events. By asking the experimenter to also detect the face-occlusion events, it was possible to also take into account, in the confusion matrix, the contact events missed by the classifier.

The base metrics used to evaluate the classifier's reliability were the TPs, the TNs, the FPs, and the FNs. These metrics allowed the evaluation of three advanced metrics assessing the reliability of a classifier: the accuracy, the precision, and the sensitivity. They are defined as follows:

Accuracy = (TPs + TNs)/(TNs + FNs + FPs + TPs) × 100  (1)

Precision = TPs/(TPs + FPs) × 100  (2)

Sensitivity = TPs/(TPs + FNs) × 100  (3)

Furthermore, the classifier's overall accuracy and the strength of the classification accuracy as expressed by Cohen's Kappa were evaluated [58]. Cohen's Kappa is a measure of agreement between the classifier and the ground truth that accounts for how much of the agreement could be expected by chance.

After evaluating the classifier reliability with respect to the seven classes, its general capability to distinguish between the risk contacts, the no-risk contacts, and the face-occlusion events was evaluated. To this end, the confusion matrix for the classification into these classes was generated and the corresponding reliability metrics were assessed.
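For reference, the sketch below (Python with numpy; not the authors' code) computes the per-class metrics of Equations (1)–(3), the overall accuracy, and Cohen's Kappa directly from a confusion matrix whose rows are the classifier results and whose columns are the ground truth.

```python
import numpy as np

def classifier_metrics(cm):
    """Per-class accuracy/precision/sensitivity (%) plus overall accuracy and Cohen's Kappa.

    cm[i, j] = number of samples classified as class i whose true class is j
    (rows: classifier results, columns: ground truth), as in Tables 1 and 2.
    """
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    tp = np.diag(cm)
    fp = cm.sum(axis=1) - tp          # other cells in the same row
    fn = cm.sum(axis=0) - tp          # other cells in the same column
    tn = total - tp - fp - fn

    accuracy = (tp + tn) / (tp + tn + fp + fn) * 100   # Equation (1), per class
    precision = tp / (tp + fp) * 100                   # Equation (2)
    sensitivity = tp / (tp + fn) * 100                 # Equation (3)

    overall_accuracy = tp.sum() / total * 100
    # Cohen's Kappa: observed agreement corrected for agreement expected by chance.
    p_observed = tp.sum() / total
    p_expected = (cm.sum(axis=1) * cm.sum(axis=0)).sum() / total**2
    kappa = (p_observed - p_expected) / (1.0 - p_expected)
    return accuracy, precision, sensitivity, overall_accuracy, kappa

# Example: the three-class matrix of Table 2 (risk, low-risk, occlusions).
table2 = [[234, 17, 0],
          [10, 224, 5],
          [0, 20, 65]]
acc, prec, sens, overall, kappa = classifier_metrics(table2)
print(f"overall accuracy = {overall:.2f}%, Cohen's Kappa = {kappa:.3f}")
```

Applied to the Table 2 values, this reproduces the overall accuracy of 90.96% reported in Section 3 for the three-class case.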

3. Results

By analyzing the recordings, the experimenter observed 575 hand–face gesture events. Table 1 reports the confusion matrix for the classification in seven classes. The graph in Figure 7 reports the reliability metrics.

Figure 7. The reliability metrics assessed for each class of the seven-class classification.

The seven-class classification achieved an overall accuracy of 79.83%, supported by Cohen's Kappa, which was equal to 0.764.

Table 2 reports the confusion matrix for the three-class classification. The graph in Figure 8 reports the reliability metrics.


Table 2. The confusion matrix for the three-class classification.

                        Truth Data
Classifier Results    Risk Contacts   Low-Risk Contacts   Occlusions
Risk contacts              234               17                0
Low-risk contacts           10              224                5
Occlusions                   0               20               65
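As a cross-check of how Table 2 derives from Table 1, the following sketch sums the seven-class confusion matrix over the class groups defined in Section 2.3 (risk: mouth–nose and eyes; no-risk: ears and low-risk area; occlusions kept apart); the index order is an assumption that simply follows the column order of Table 1.

```python
import numpy as np

# Table 1, rows = classifier results, columns = ground truth, in the order:
# mouth/nose, right eye, left eye, right ear, left ear, low risk, occlusions.
table1 = np.array([
    [74,  7, 14,  0,  0,  1,  0],   # classified as mouth/nose
    [12, 55,  1,  9,  0,  1,  0],   # classified as right eye
    [ 7,  0, 64,  0,  5,  1,  0],   # classified as left eye
    [ 0,  1,  0, 69,  0,  6,  0],   # classified as right ear
    [ 0,  1,  1,  0, 68,  6,  0],   # classified as left ear
    [ 0,  5,  2,  4,  7, 64,  5],   # classified as low risk
    [ 0,  0,  0,  0,  0, 20, 65],   # classified as occlusion
])

# Risk = mouth/nose + eyes; no-risk = ears + low-risk area; occlusions kept apart.
groups = [[0, 1, 2], [3, 4, 5], [6]]
table2 = np.array([[table1[np.ix_(r, c)].sum() for c in groups] for r in groups])
print(table2)   # rows/columns: risk, low-risk, occlusions -> matches Table 2
```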


Figure 8. The reliability metrics assessed for each class of the three-class classification.

The three-class classification achieved an overall accuracy of 90.96%, supported by Cohen's Kappa equal to 0.851.

4. Discussion

In the seven-class classification, the system achieves an overall accuracy of 79.83%, supported by Cohen's Kappa of 0.735, implying a substantial agreement in the Landis and Koch scale [58]. This result confirms hypothesis 1: the HealthSHIELD tool classifies the events in substantial agreement with the human observer. More importantly, hypothesis 2 is also confirmed. After clustering the contact events into the two main classes, risk and no-risk, the classification achieves 90.96% accuracy supported by Cohen's Kappa at 0.876, corresponding to an almost perfect agreement in the Landis and Koch scale.

Regarding the system performance, a tradeoff choice in the definition of the sphere radius and the secondary distance threshold was applied to detect, as much as possible, the risk-contact events. By doing so, the performance was reduced in terms of precision for this class and in terms of sensitivity for the low-risk contact class. For the occlusion class, the authors preferred reducing, as much as possible, the FP rate at the expense of the precision metrics.

The visual evaluation of the capability to detect and classify the events in real-time gave a positive response, not evidencing any noticeable delay between the contact timestamp and the GUI alert notification.

In an empirical comparison, the HealthSHIELD tool outperforms the automatic "don't touch your face" web-based application. This result is due to the different video technology. The web-based application uses a standard RGB camera that does not provide stereoscopic vision, so that, if the face area is occluded by a hand, there is a high probability of false-positive detection. Stereoscopy-based technologies can overcome this issue, but they imply an increased cost in terms of hardware and software implementation. The proposed prototype, based on depth images, suffers much less from this problem and distinguishes between the contact areas.

Apart from better performance, the prototype has two main advantages. First, it transparently collects data, and second, the time requested to obtain such data in digital format is equal to the activities' duration. If the same analysis were carried out offline by a human observer, it would require much more time, both for contact events' detection/classification and for data digitalization.
Another advantage consists of being ready to use not requiring any training phase or setup effort. It can be moved in an observation place with the only necessity being keeping individuals under observation inside the sensor’s field of view. Nevertheless, the prototype has some limitations. Two main factors can negatively affect the system’s tracking accuracy. The first one consists of possible occlusions of body parts to the sensor’s field of view. The second one consists of the inability to segment body parts and track the body joints. Indeed, the segmentation process exploits the depth variations between body parts in the depth image. Sometimes, it is possible that, if the fingers are in proximity to the face, given the small variation of depth between the points of the two surfaces, the system cannot segment them. In the worst-case scenario, the system is unable to track the thumb joint. In other ones, the joint 3D-coordinates suffer from slight jittering. These limitations can cause two main issues. First, if the individual is back facing the sensor, it is impossible to correctly track the face and thumb joints. Similarly, if turning more than 45 degrees with respect to the sensor, one of the hands is likely to be occluded by other body parts and, consequently, it is not trackable. Second, when the thumb joint is near the face, the body tracking system could not track it with sufficient accuracy for the aforementioned motivations. Therefore, the monitoring system can generate two kinds of errors: it can miss that contact, or it can detect multiple successive contacts because, given the slight jittering, the thumb joint enters and exits the contact volume. Moreover, given the lockdown condition, the preliminary test has been carried out only in a single-use case scenario. Nevertheless, as soon as the security measures’ loosening allows, the authors plan to validate the prototype extensively. They also plan to evaluate its effectiveness as a training tool that, thanks to real-time warnings, and periodic reports regarding the hand–face gesture attitude, can positively influence the compliance with BPPs by establishing a virtuous circle. The authors also plan to evaluate the user acceptance of such a warning system based on visual and acoustic feedback and compare it to a warning system based on a wrist-worn vibrating bracelet.

5. Conclusions

By analyzing the current practice for the containment of the Sars-CoV-2 outbreak and searching for the available technological solutions, the authors designed and developed HealthSHIELD, a software tool based on the Microsoft Azure Kinect RGB-D camera aimed at monitoring compliance with the BPPs. The preliminary study assessing the system's detection effectiveness was encouraging and returned a 90.96% accuracy. The results evidence that low-cost RGB-D technologies can be effectively exploited for monitoring compliance with some of the BPPs for virus contagion containment.

The capability to break the chain of contagion by improving people's compliance with hygiene best practices and avoiding risk conditions (e.g., meeting people and being closer to them than the recommended security distance) will be useful in the post-lockdown phase. Considering that the pandemic will not disappear abruptly and that successive waves are expected, the wide adoption of the solution can potentially support and secure the recovery of daily life and support the preparedness of society to face future pandemics.

Apart from warning people in real-time and providing BPP training support, by systematically collecting data and analyzing them with respect to people's categories and contagion statistics, it will be possible to aid the understanding of the importance of this contagion pathway and to identify for which people categories such a behavioral attitude constitutes a significant risk. In that case, proper solutions could be applied (e.g., occupational health professionals could define how workplaces should be redesigned, or the work-shift job-rotation could be modified to lessen risky events [59]). Consequently, the HealthSHIELD tool could find applications in most places where people act in a shared environment (e.g., schools, public offices, factory shopfloors, hospital triage areas).

Supplementary Materials: The following are available online at http://www.mdpi.com/1424-8220/20/21/6149/s1, Video S1: Video HealthSHIELD tool.

Author Contributions: Conceptualization, V.M.M., N.T. and A.P.; Data curation, M.F. and A.B.; Formal analysis, M.G.; Investigation, A.E.U. and V.M.M.; Methodology, V.M.M., A.B. and A.E.U.; Software, V.M.M. and G.L.C.; Supervision, A.P. and A.E.U.; Validation, V.M.M., A.B. and N.T.; Visualization, M.G. and G.L.C.; Writing—original draft, V.M.M. and M.F.; Writing—review & editing, V.M.M., M.F., A.B. and A.E.U. All authors have read and agreed to the published version of the manuscript.

Funding: This research received no external funding.

Conflicts of Interest: The authors declare no conflict of interest.

Appendix A. The HealthSHIELD GUI

The GUI has two windows: the main window and the report window. The main window (Figure A1) allows the monitoring of people's behavior. The left panel visualizes the acquired depth stream with the tracked skeleton overlapped on the worker's body silhouette (visualized with the depth stream for privacy matters). On the upper right side, a dropdown menu allows a selection between two monitoring modalities: real-time monitoring and background monitoring.

In the real-time modality, an expert can observe the worker's behavior (please see the provided Video S1). The GUI notifies both the under-security distance and the hand–face contact events and updates, in real-time, a bar chart with the contact events' statistics binned into the respective categories. Two counters update the total amount of events, which are further grouped into two classes: contacts onto a risk region (i.e., mouth/nose and eyes areas) and contacts onto a low-risk region (i.e., ears and low-risk areas). A further counter updates the under-security distance events. The Video S1 provided as supplementary material shows the HealthSHIELD tool working in the real-time monitoring modality.

Figure A1. The HealthSHIELD tool's main window in the real-time monitoring modality. In the display area, a worker at a seated workstation.
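As a rough illustration of the event bookkeeping described above, the following C# sketch shows one way the detected contacts could be binned into the risk and low-risk classes and counted alongside the under-security distance events. All type and member names are hypothetical and are not taken from the HealthSHIELD source code.

    // Hypothetical face regions a hand–face contact can be attributed to.
    enum ContactRegion { MouthNose, Eyes, Ears, OtherLowRisk }

    sealed class MonitoringCounters
    {
        public int RiskRegionContacts { get; private set; }        // mouth/nose and eyes
        public int LowRiskRegionContacts { get; private set; }     // ears and other low-risk areas
        public int UnderSecurityDistanceEvents { get; private set; }

        // Bins a detected hand–face contact into one of the two classes shown in the GUI.
        public void RegisterContact(ContactRegion region)
        {
            if (region == ContactRegion.MouthNose || region == ContactRegion.Eyes)
                RiskRegionContacts++;
            else
                LowRiskRegionContacts++;
        }

        // Called whenever two tracked people are closer than the recommended security distance.
        public void RegisterUnderSecurityDistance() => UnderSecurityDistanceEvents++;
    }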

The background modality allows people to be trained to comply with the BPPs (i.e., avoiding hand–face contact gestures and keeping the security distance). In this modality, if the monitored person has a monitor connected to the system, a window pops up on it and displays real-time visual and acoustic warnings at each face-contact and under-security distance event. The report window allows a detailed analysis of each monitoring report (Figure A2). It visualizes a graph with the sequence of contact events, each with its initial timestamp and duration, a bar chart with the overall statistics, and the mean contact-event frequency over the observed period (expressed as events per minute).

Figure A2. The report window of the HealthSHIELD tool prototype.
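To make the report statistics concrete, the short C# sketch below shows how a logged event sequence (start timestamp and duration per event) can yield the mean contact-event frequency in events per minute. The record type and method are illustrative assumptions, not the tool's actual data model.

    using System;
    using System.Collections.Generic;

    // Hypothetical log entry for a single hand–face contact event.
    sealed record LoggedContactEvent(DateTime Start, TimeSpan Duration, bool OnRiskRegion);

    static class ReportStatistics
    {
        // Mean contact-event frequency over the observed period, in events per minute.
        public static double EventsPerMinute(IReadOnlyList<LoggedContactEvent> events, TimeSpan observedPeriod)
        {
            if (observedPeriod <= TimeSpan.Zero)
                throw new ArgumentOutOfRangeException(nameof(observedPeriod));
            return events.Count / observedPeriod.TotalMinutes;
        }
    }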

References

1. WHO. Available online: https://www.who.int/emergencies/diseases/novel-coronavirus-2019/technical-guidance/naming-the-coronavirus-disease-(covid-2019)-and-the-virus-that-causes-it (accessed on 4 June 2020).
2. Thomas, H.; Sam, W.; Anna, P.; Toby, P.; Beatriz, K. Oxford COVID-19 Government Response Tracker, B.S. of G. Our World in Data. 2020. Available online: https://ourworldindata.org/grapher/workplace-closures-covid?time=2020-04-02 (accessed on 21 September 2020).
3. Fernandes, N. Economic Effects of Coronavirus Outbreak (COVID-19) on the World Economy. SSRN Electron. J. 2020. [CrossRef]
4. Wang, L.; Wang, Y.; Ye, D.; Liu, Q. A review of the 2019 Novel Coronavirus (COVID-19) based on current evidence. Int. J. Antimicrob. Agents 2020, 55, 105948. [CrossRef] [PubMed]
5. Prompetchara, E.; Ketloy, C.; Palaga, T. Immune responses in COVID-19 and potential vaccines: Lessons learned from SARS and MERS epidemic. Asian Pac. J. Allergy Immunol. 2020, 38, 1–9. [PubMed]
6. Semple, S.; Cherrie, J.W. Covid-19: Protecting Worker Health. Ann. Work Expo. Health 2020, 64, 461–464. [CrossRef] [PubMed]
7. Basic Protective Measures against the New Coronavirus. Available online: https://www.acbf-pact.org/media/news/basic-protective-measures-against-new-coronavirus (accessed on 13 April 2020).
8. Davies, S.E.; Kamradt-Scott, A.; Rushton, S. Disease Diplomacy: International Norms and Global Health Security; Johns Hopkins University Press: Baltimore, MD, USA, 2015; ISBN 9781421416496.
9. Wu, J.T.; Leung, K.; Leung, G.M. Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: A modelling study. Lancet 2020, 395, 689–697. [CrossRef]

10. Bogoch, I.I.; Watts, A.; Thomas-Bachli, A.; Huber, C.; Kraemer, M.U.G.; Khan, K. Potential for global spread of a novel coronavirus from China. J. Travel Med. 2020, 27, taaa011. [CrossRef] [PubMed]
11. COVID-19 Coronavirus Pandemic. Available online: https://www.worldometers.info/coronavirus/ (accessed on 9 April 2020).
12. Johns Hopkins University of Medicine—Coronavirus Resource Center. Available online: https://coronavirus.jhu.edu/map.html (accessed on 21 September 2020).
13. Chatterjee, A.; Gerdes, M.W.; Martinez, S.G. Statistical explorations and univariate timeseries analysis on covid-19 datasets to understand the trend of disease spreading and death. Sensors 2020, 20, 3089. [CrossRef] [PubMed]
14. Ozturk, T.; Talo, M.; Yildirim, E.A.; Baloglu, U.B.; Yildirim, O.; Rajendra Acharya, U. Automated detection of COVID-19 cases using deep neural networks with X-ray images. Comput. Biol. Med. 2020, 121, 103792. [CrossRef] [PubMed]
15. Ting, D.S.W.; Carin, L.; Dzau, V.; Wong, T.Y. Digital technology and COVID-19. Nat. Med. 2020, 26, 459–461. [CrossRef] [PubMed]
16. Jiang, X.; Coffee, M.; Bari, A.; Wang, J.; Jiang, X.; Huang, J.; Shi, J.; Dai, J.; Cai, J.; Zhang, T.; et al. Towards an Artificial Intelligence Framework for Data-Driven Prediction of Coronavirus Clinical Severity. Comput. Mater. Contin. 2020, 63, 537–551. [CrossRef]
17. Yip, C.C.-Y.; Ho, C.-C.; Chan, J.F.-W.; To, K.K.-W.; Chan, H.S.-Y.; Wong, S.C.-Y.; Leung, K.-H.; Fung, A.Y.-F.; Ng, A.C.-K.; Zou, Z.; et al. Development of a Novel, Genome Subtraction-Derived, SARS-CoV-2-Specific COVID-19-nsp2 Real-Time RT-PCR Assay and Its Evaluation Using Clinical Specimens. Int. J. Mol. Sci. 2020, 21, 2574. [CrossRef] [PubMed]
18. Robson, B. COVID-19 Coronavirus spike protein analysis for synthetic vaccines, a peptidomimetic antagonist, and therapeutic drugs, and analysis of a proposed achilles' heel conserved region to minimize probability of escape mutations and drug resistance. Comput. Biol. Med. 2020, 121, 103749. [CrossRef] [PubMed]
19. Robson, B. Computers and viral diseases. Preliminary bioinformatics studies on the design of a synthetic vaccine and a preventative peptidomimetic antagonist against the SARS-CoV-2 (2019-nCoV, COVID-19) coronavirus. Comput. Biol. Med. 2020, 119, 103670. [CrossRef] [PubMed]
20. Mavrikou, S.; Moschopoulou, G.; Tsekouras, V.; Kintzios, S. Development of a portable, ultra-rapid and ultra-sensitive cell-based biosensor for the direct detection of the SARS-COV-2 S1 spike protein antigen. Sensors 2020, 20, 3121. [CrossRef]
21. Gov.sg WhatsApp Subscription. Available online: https://www.form.gov.sg/#!/5e33fa3709f80b00113b6891 (accessed on 9 April 2020).
22. World Health Organization (WHO) Coronavirus Disease (COVID-19) Advice for the Public. Available online: https://www.who.int/emergencies/diseases/novel-coronavirus-2019/advice-for-public (accessed on 15 May 2020).
23. World Health Organization (WHO). Rational Use of Personal Protective Equipment for Coronavirus Disease 2019 (COVID-19). Available online: https://www.who.int/publications/i/item/rational-use-of-personal-protective-equipment-for-coronavirus-disease-(covid-19)-and-considerations-during-severe-shortages (accessed on 10 May 2020).
24. ABRFID Companies Unite Against COVID-19. Available online: https://www.rfidjournal.com/abrfid-companies-unite-against-covid-19 (accessed on 9 April 2020).
25. D'Aurizio, N.; Baldi, T.L.; Paolocci, G.; Prattichizzo, D. Preventing Undesired Face-Touches with Wearable Devices and Haptic Feedback. IEEE Access 2020, 8, 139033–139043. [CrossRef]
26. Yang, L.; Liao, R.; Lin, J.; Sun, B.; Wang, Z.; Keogh, P.; Zhu, J. Enhanced 6D measurement by integrating an Inertial Measurement Unit (IMU) with a 6D sensor unit of a laser tracker. Opt. Lasers Eng. 2020, 126, 105902. [CrossRef]
27. Schall, M.C.; Sesek, R.F.; Cavuoto, L.A. Barriers to the Adoption of Wearable Sensors in the Workplace: A Survey of Occupational Safety and Health Professionals. Hum. Factors 2018, 60, 351–362. [CrossRef]
28. Kowalski, K.; Rhodes, R.; Naylor, P.-J.; Tuokko, H.; MacDonald, S. Direct and indirect measurement of physical activity in older adults: A systematic review of the literature. Int. J. Behav. Nutr. Phys. Act. 2012, 9, 148. [CrossRef]
29. Xu, X.; McGorry, R.W.; Chou, L.-S.; Lin, J.; Chang, C. Accuracy of the Microsoft Kinect® for measuring gait parameters during treadmill walking. Gait Posture 2015, 42, 145–151. [CrossRef]

30. O'Hara, K.; Morrison, C.; Sellen, A.; Bianchi-Berthouze, N.; Craig, C. Body Tracking in Healthcare. Synth. Lect. Assist. Rehabil. Health Technol. 2016, 5, 1–151. [CrossRef]
31. Manghisi, V.M.; Uva, A.E.; Fiorentino, M.; Gattullo, M.; Boccaccio, A.; Evangelista, A. Automatic Ergonomic Postural Risk Monitoring on the factory shopfloor—The ErgoSentinel Tool. Procedia Manuf. 2019, 42, 97–103. [CrossRef]
32. Mahmoud, M.; Baltrušaitis, T.; Robinson, P. Automatic analysis of naturalistic hand-over-face gestures. ACM Trans. Interact. Intell. Syst. 2016, 6, 1–18. [CrossRef]
33. González-Ortega, D.; Díaz-Pernas, F.J.; Martínez-Zarzuela, M.; Antón-Rodríguez, M.; Díez-Higuera, J.F.; Boto-Giralda, D. Real-time hands, face and facial features detection and tracking: Application to cognitive rehabilitation tests monitoring. J. Netw. Comput. Appl. 2010, 33, 447–466. [CrossRef]
34. Naik, N.; Mehta, M.A. Hand-over-Face Gesture based Facial Emotion Recognition using Deep Learning. In Proceedings of the 2018 International Conference on Circuits and Systems in Digital Enterprise Technology, Kottayam, India, 21–22 December 2018; pp. 1–7. [CrossRef]
35. DO NOT TOUCH YOUR FACE. Available online: https://donottouchyourface.com/ (accessed on 9 April 2020).
36. Manghisi, V.M.; Uva, A.E.; Fiorentino, M.; Gattullo, M.; Boccaccio, A.; Monno, G. Enhancing user engagement through the user centric design of a mid-air gesture-based interface for the navigation of virtual-tours in cultural heritage expositions. J. Cult. Herit. 2018, 32, 186–197. [CrossRef]
37. Uva, A.E.; Fiorentino, M.; Manghisi, V.M.; Boccaccio, A.; Debernardis, S.; Gattullo, M.; Monno, G. A User-Centered Framework for Designing Midair Gesture Interfaces. IEEE Trans. Human-Mach. Syst. 2019, 49, 421–429. [CrossRef]
38. Guerra, M.S.D.R.; Martin-Gutierrez, J. Evaluation of full-body gestures performed by individuals with down syndrome: Proposal for designing user interfaces for all based on kinect sensor. Sensors 2020, 20, 3903. [CrossRef]
39. Zhang, M.; Shi, R.; Yang, Z. A critical review of vision-based occupational health and safety monitoring of construction site workers. Saf. Sci. 2020, 126, 104658. [CrossRef]
40. Manghisi, V.M.; Uva, A.E.; Fiorentino, M.; Bevilacqua, V.; Trotta, G.F.; Monno, G. Real time RULA assessment using Kinect v2 sensor. Appl. Ergon. 2017, 65, 481–491. [CrossRef]
41. Reyes, M.; Clapés, A.; Ramírez, J.; Revilla, J.R.; Escalera, S. Automatic digital biometry analysis based on depth maps. Comput. Ind. 2013, 64, 1316–1325. [CrossRef]
42. Liao, Y.; Vakanski, A.; Xian, M.; Paul, D.; Baker, R. A review of computational approaches for evaluation of rehabilitation exercises. Comput. Biol. Med. 2020, 119, 103687. [CrossRef]
43. Abraham, L.; Bromberg, F.; Forradellas, R. Ensemble of shape functions and support vector machines for the estimation of discrete arm muscle activation from external biceps 3D point clouds. Comput. Biol. Med. 2018, 95, 129–139. [CrossRef] [PubMed]
44. Vitali, A.; Regazzoni, D.; Rizzi, C. Digital motion acquisition to assess spinal cord injured (SCI) patients. Comput. Aided Des. Appl. 2019, 16, 962–971. [CrossRef]
45. Liu, C.H.; Lee, P.; Chen, Y.L.; Yen, C.W.; Yu, C.W. Study of postural stability features by using kinect depth sensors to assess body joint coordination patterns. Sensors 2020, 20, 1291. [CrossRef] [PubMed]
46. Loeb, H.; Kim, J.; Arbogast, K.; Kuo, J.; Koppel, S.; Cross, S.; Charlton, J. Automated recognition of rear seat occupants' head position using Kinect™ 3D point cloud. J. Safety Res. 2017, 63, 135–143. [CrossRef] [PubMed]
47. Microsoft Azure Kinect DK. Available online: https://docs.microsoft.com/en-us/azure/Kinect-dk/hardware-specification (accessed on 28 May 2020).
48. Xu, X.; McGorry, R.W. The validity of the first and second generation Microsoft Kinect® for identifying joint center locations during static postures. Appl. Ergon. 2015, 49, 47–54. [CrossRef] [PubMed]
49. Kurillo, G.; Chen, A.; Bajcsy, R.; Han, J.J. Evaluation of upper extremity reachable workspace using Kinect camera. Technol. Health Care Off. J. Eur. Soc. Eng. Med. 2012, 21, 641–656. [CrossRef] [PubMed]
50. Plantard, P.; Shum, H.; Le Pierres, A.-S.; Multon, F. Validation of an ergonomic assessment method using Kinect data in real workplace conditions. Appl. Ergon. 2017, 65, 562–569. [CrossRef]
51. Microsoft Azure-Kinect-Sensor-SDK. Available online: https://microsoft.github.io/Azure-Kinect-Sensor-SDK/master/index.html (accessed on 28 May 2020).
52. Microsoft Azure Kinect Body-SDK. Available online: https://docs.microsoft.com/en-us/azure/kinect-dk/body-sdk-download (accessed on 28 May 2020).

53. K4AdotNet. Available online: https://www.nuget.org/packages/K4AdotNet/1.4.1 (accessed on 5 June 2020).
54. WPFToolkit 3.5.50211.1. Available online: https://www.nuget.org/packages/WPFToolkit/3.5.50211.1# (accessed on 5 June 2020).
55. OpenGL.Net 0.8.4. Available online: https://www.nuget.org/packages/OpenGL.Net/ (accessed on 5 June 2020).
56. NASA Man-Systems Integration Standards, Volume I, Section 3, Anthropometry and Biomechanics. Available online: https://msis.jsc.nasa.gov/sections/section03.htm (accessed on 20 September 2020).
57. Erwig, M. The graph Voronoi diagram with applications. Networks 2000, 36, 156–163. [CrossRef]
58. Landis, J.R.; Koch, G.G. The measurement of observer agreement for categorical data. Biometrics 1977, 33, 159–174. [CrossRef]
59. Boenzi, F.; Digiesi, S.; Facchini, F.; Mummolo, G. Ergonomic improvement through job rotations in repetitive manual tasks in case of limited specialization and differentiated ergonomic requirements. IFAC Pap. 2016, 49, 1667–1672. [CrossRef]

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).