Human-Computer Interaction 2

Prof. Dr. Andreas Butz, Dr. Bastian Pfleging

Hand Occlusion with Tablet-sized Direct Pen Input

Daniel Vogel 1,2, Matthew Cudmore 2, Géry Casiez 3, Ravin Balakrishnan 1, and Liam Keliher 2
1 Dept. of Computer Science, University of Toronto, CANADA ({dvogel|ravin}@dgp.toronto.edu)
2 Dept. of Math & Computer Science, Mount Allison University, CANADA ({dvogel|mcudmore|lkeliher}@mta.ca)
3 LIFL & INRIA Lille, University of Lille, FRANCE ([email protected])

Occlusion-aware interfaces

• problem: system generated messages may be positioned under the user's hand

ABSTRACT
We present results from an experiment examining the area occluded by the hand when using a tablet-sized direct pen input device. Our results show that the pen, hand, and forearm can occlude up to 47% of a 12 inch display. The shape of the occluded area varies between participants due to differences in pen grip rather than simply anatomical differences. For the most part, individuals adopt a consistent posture for long and short selection tasks. Overall, many occluded pixels are located higher relative to the pen than previously thought. From the experimental data, a five-parameter scalable circle and pivoting rectangle geometric model is presented which captures the general shape of the occluded area relative to the pen position. This model fits the experimental data much better than the simple bounding box model often used implicitly by designers. The space of fitted parameters also serves to quantify the shape of occlusion. Finally, an initial design for a predictive version of the model is discussed.

Author Keywords: Hand occlusion, pen input, Tablet PC.
ACM Classification: H5.2. Information interfaces and presentation: User Interfaces - Input devices and strategies.

INTRODUCTION
Given our familiarity with using pens and pencils, one would expect that operating a tablet computer by drawing directly on the display would be more natural and efficient. However, issues specific to direct pen input, such as the user's hand covering portions of the display during interaction – a phenomena we term occlusion (Figure 1a) – create new problems not experienced with conventional mouse input [12].

Compared to using pen on paper, occlusion with pen computing is more problematic. Unlike paper, the results of pen input, or system generated messages, may be revealed in occluded areas of the display. Researchers have suggested that occlusion impedes performance [7,10] and have used it as motivation for interaction techniques [1,14,24], but as of yet there has been no systematic study or model to quantify the amount or shape of occlusion.

Certainly, any designer can simply look down at their own hand while they operate a Tablet PC and take the perceived occlusion into account, but this type of ad hoc observation is unlikely to yield sound scientific findings or universal design guidelines. To study occlusion properly, we need to employ controlled experimental methods.

In this paper we describe an experimental study using a novel combination of video capture, marker tracking, and image processing techniques to capture images of hand and arm occlusion from the point-of-view of a user. We call these images occlusion silhouettes (Figure 1b). Analyses of these silhouettes found that the hand and arm can occlude up to 47% of a 12 inch display and that the shape of the occluded area varies across participants according to their style of pen grip, rather than basic anatomical differences.

Based on our findings, we create a five parameter geometric model, comprised of a scalable circle and pivoting rectangle, to describe the general shape of the occluded area (Figure 1c). Using non-linear optimization algorithms, we fit this geometric model to the silhouette images captured in the experiment. We found that this geometric model matches the silhouettes with an F1 score [18] of 0.81 compared to 0.40 for the simple bounding box which designers often use implicitly to account for occlusion. The space of fitted parameters also serves to quantify the shape of occlusion, capture different grip styles, and provide approximate empirical guidelines. Finally, we introduce an initial scheme for a predictive version of the geometric model which could enable new types of occlusion-aware interaction techniques.

Figure 1: (a) Occlusion caused by the hand with direct pen input; (b) an occlusion silhouette image taken from the point-of-view of a user and rectified; (c) a simplified circle and rectangle geometric model capturing the general shape of the occluded area.

Literature: Vogel, D. et al. (2009). Hand Occlusion with Tablet-sized Direct Pen Input, CHI'09

Occlusion-aware interfaces

• one approach: experimental study using a novel combination of video capture, augmented reality marker tracking, and image processing techniques to capture occlusion silhouettes.

FORMAL EXPERIMENT
Our goal is to measure the size and shape of the occluded area of a tablet-sized display. To accomplish this, we record the participant's view of their hand with a head-mounted video camera as they select targets at different locations on the display. We then extract key frames from the video and isolate occlusion silhouettes of the participant's hand as they appear from their vantage point.

Participants
22 people (8 female, 14 male) with a mean age of 26.1 (SD 8.3) participated. All participants were right-handed and pre-screened for color blindness. Participants had little or no experience with direct pen input, but this is acceptable since we are observing a lower level physical behaviour.
At the beginning of each session, we measured the participant's hand and forearm since anatomical dimensions likely influence the amount of occlusion (Figure 2). We considered controlling for these dimensions, but recruiting participants to conform to anatomical sizes proved to be difficult, and the ranges for each control dimension were difficult to define.

Apparatus
The experiment was conducted using a Wacom Cintiq 12UX direct input pen tablet. It has a 307 mm (12.1 inch) diagonal display, a resolution of 1280 by 800 pixels (261 by 163 mm), and a pixel density of 4.9 px/mm (125 DPI). We chose the Cintiq because it provides pen tilt information which is unavailable on current Tablet PCs.
We positioned the tablet in portrait-orientation and supported it such that it was at an angle of 12 degrees off the desk, oriented towards the participant. Participants were seated in an adjustable office chair with the height adjusted so that the elbow formed a 90 degree angle when the forearm was on the desk. This body posture is the most ergonomically sound according to Pheasant and Hastlegrave [13].
To capture the participant's point-of-view, we use a small head-mounted video camera to record the entire experiment at 640 × 480 px resolution and 15 frames-per-second (Figure 3a). The camera is attached to a head harness using hook-and-loop strips making it easy to move up or down so that it can be positioned as close as possible to the center of the eyes, without interfering with the participants' line of sight. In pilot experiments, we found that we could position the camera approximately 40 mm above and forward of the line of sight, and the resulting image was very similar to what the participant saw.
Printed fiducial markers were attached around the bezel of the tablet to enable us to transform the point-of-view frames to a standard, registered image perspective for analysis. Details of the image analysis steps are in the next section.

Figure 2. Anthropomorphic measurements (diagram adapted from Pheasant and Hastlegrave [13]):
EL - elbow to fingertip length
SL - shoulder to elbow length
UL - upper limb length including hand
FL - upper limb length, elbow to crease of wrist, EL - HL
HL - hand length, crease of the wrist to the tip of finger
HB - hand breadth, maximum width of palm

Figure 3. Experiment apparatus: (a) head mounted camera to capture point-of-view; (b) fiducial markers attached to tablet bezel (image is taken from head mounted camera video frame).

Figure 4. (a) 7 x 11 grid for placement; (b) square; (c) circle target (targets are printed actual size).

Task and Stimuli
Participants were presented with individual trials consisting of an initial selection of a home target, followed by selection of a measurement target.
The 128 px tall and 64 px wide home target was consistently located at the extreme right edge of the tablet display, 52 mm from the display bottom. This controlled the initial position of the hand and forearm at the beginning of each trial. We observed participants instinctively returning to a similar rest position in our initial observational study.
The location of the measurement target was varied across trials at positions inscribed by a 7 × 11 unit invisible grid (Figure 4a). This created 77 different locations with target centers spaced 122 px horizontally and 123 px vertically.
We observed two primary styles of pen manipulation in our initial observational study: long, localized interactions where the participant rested their palm on the display (such as adjusting a slider), and short, singular interactions performed without resting the hand (such as pushing a button). Based on this, our task had two types of target selection: tap – selection of a 64 px square target with a single tap (Figure 4b); and circle – selection of a circular target by circling within a 28 px tolerance between a 4 px inner and 32 px outer radius (Figure 4c). The circle selection is designed to encourage participants to rest their palm, while the tap selection can be quickly performed with the palm in the air. The different shapes for the two selection tasks were intended to serve as a mnemonic to the user as to what action was required.

Literature: Vogel, D. et al. (2009). Hand Occlusion with Tablet-sized Direct Pen Input, CHI'09

559 CHI 2009CHI 2009~ Non-traditional ~ Non-traditional Interaction Interaction Techniques Techniques April 7th,April 2009 7th, ~ 2009 Boston, ~ Boston, MA, USA MA, USA ticipants,ticipants, we found we found that grip that style grip style varied varied predominately predominately ate a simpleate a simple model model with a with small a number small number of parameters of parameters, yet , yet acrossacross three dimensions:three dimensions: size of size fist, of angle fist, ofangle pen, of and pen, height and heightstill producestill produce a reasonable a reasonable degree degreeof accuracy. of accuracy. of grip location on pen. We believe it is these characteristics of grip location on pen. We believe it is these characteristics ScalableScalable Circle Circleand Pivoting and Pivoting Rectangle Rectangle Model Model of grip style that interact with anatomical measurements and of grip style that interact with anatomical measurements and We noticedWe noticed that the that occlusion the occlusion silhouettes silhouettes produced produced by the by the ultimately govern occlusion area. ultimately govern occlusion area. experimentalexperimental data often data resembled often resembled a lopsided a lopsided circle for circle the for the fist, a fist,thick a narrowingthick narrowing rectangle rectangle sticking sticking out the outbottom the bottomfor for the arm,the and, arm, with and, some with participants, some participants, there was there also was a thinner also a thinner rectanglerectangle puncturing puncturing the top the of top the of ball the for ball the for pen. the This pen. This meant meantthat a singlethat a orientedsingle oriented bounding bounding box would box bewould unlikely be unlikely to captureto capture all grip all styles grip accurastyles tely.accura Ourtely. first Our approach first approach then, then, was to wascreate to acreate geometric a geometric model usingmodel an using ellipse an forellipse the forfist, the fist, an isoscelesan isosceles trapezoid trapezoid for the forarm, the and ar m,a rectangle and a rectangle for the pen.for the pen. However,However, even this even model this modelhad 11 hadparameters 11 parameters and automati- and automati- (a) (b) (c) (a) (b) (c) cally fittingcally fittingthe geometry the geometry to our experimentalto our experimental data was data prob- was prob- FigureFigure 11. Grip 11. styles: Grip styles: (a) loose (a) loose fist, low fist, angle, low angle, me- me- lematic.lematic. Instead, Instead, we simplified we simplified our representation our representation further further to to dium dium grip height; grip height; (b) tight (b) fist, tight high fist, angle, high angle, high grip high grip an offsetan circleoffset andcircle a rectangland a rectangle with onlye with the only following the following 5 pa- 5 pa- height;height; (c) loose (c) loosefist, straight fist, straight angle, angle,low CHIgrip low 2009 height. grip~ Non-traditional height. Interaction Techniques April 7th, 2009 ~ Boston, MA, USA rametersrameters (also illustrated (also illustrated in Figure in 12b):Figure 12b): Left-handedLeft-handed Users Users ticipants, we found that grip style varied predominately ate a simple model with a small number of parameters, yet across three dimensions: size of fist, angle of pen,q is and the height q offset is thestill from offsetproduce the a fromreasonable pen the position degree pen of accuracy. 
position p to the p circle to the edge, circle edge, We conductedWe conducted a small a smallfollow-up follow-up study studywithof grip two withlocation left-handed twoon pen. left-handed We believe it is these characteristics Scalable Circle and Pivoting Rectangle Model of grip style that interact with anatomical measurements r is the andr radius is theWe of radius noticed the that ciofrcle the the occlusion over circle the silhouettes over fist the producedarea, fist by area, the users.users. Similar Similar to Hancock to Hancock and Booth’s and ultimately Booth’s finding govern occlusion finding with area. with experimental data often resembled a lopsided circle for the performanceperformance [7], we [7], found we foundthat the that left-handed the left-handed data mirrored data mirrored fist, a thick narrowing rectangle sticking out the bottom for is the rotation is thethe arm, rotation angle and, with of someangle the participants, circle of the therearound circle was also aroundp a (expressedthinner p (expressed in in the right-handedthe right-handed individuals. individuals. rectangle puncturing the top of the ball for the pen. This degreesdegrees wheremeant wherethat a= single 0 oriented when = 0 bounding the when box centre would the be centreunlikely is due is East, due East, InfluenceInfluence of Clothing of Clothing = -45 =for -45to North-East, capture for all North-East, grip styles and accura tely. =and 45 Our firstfor = approach 45South-East), for then, South-East), We gathered our data for sleeveless participants to maintain a was to create a geometric model using an ellipse for the fist, We gathered our data for sleeveless participants to maintain a an isosceles trapezoid for the arm, and a rectangle for the pen. consistentconsistent baseline, baseline, but we but recognize we recognize that size that of size the ofocclu- the occlu- is the angleis theHowever, of angle rotation even ofthis rotationmodel of thehad 11 rectangleof parameters the rectangle and around automati- aroundthe cen- the cen- (a) (b)tre (c) of the circlecally fitting(using the geometrythe same to our angleexperimental configuration data was prob- as ), sion silhouettesion silhouette could couldbe much be muchlarger largerwhen whenclothedFigure 11. clothed Grip (consider styles: (consider (a) loose fist, low angle, me- tre of lematic.the circle Instead, (using we simplified the our same representation angle further configuration to as ), dium grip height; (b) tight fist, high angle, high grip an offset circle and a rectangle with only the following 5 pa- using a tablet while wearing a loose fitting sweaterheight; (c) looseor jacket).fist, straight angle, low grip height. using a tablet while wearing a loose fitting sweater or jacket). w is the w width is therameters ofwidth the (also rectangleillustratedof the inrectangle Figure representing 12b): representing the forearm. the forearm. As a generalAs a general rule, Pheasant rule, Pheasant and Hastlegrave and HastlegraveLeft-handed [13] suggest Users [13] suggest add- add- q is the offset from the pen position p to the circle edge, We conducted a small follow-up study withNote two left-handed thatNote the that length the lengthof the rectangleof the rectangle is infinite is infinite for our forpur- our pur- ing 25mming 25mm to all toanatomical all anatomical dimensions dimensions forusers. men Similarfor and men to 45mm Hancock and 45mm and Booth’s finding with r is the radius of the circle over the fist area, poses. 
poses.If we wereIf we building were building a model a modelfor larger for displays, larger displays, this this for womenfor women to account to account for thickness for thickness of clothing. ofperformance clothing. [7], we found that the left-handed data mirrored is the rotation angle of the circle around p (expressed in the right-handed individuals. may becomemay become anotherdegrees another parameter, where = parameter, 0 when but the at centre present but is at due present we East, are we con- are con- GEOMETRIC MODEL OF OCCLUSION SHAPEInfluence of Clothing = -45 for North-East, and = 45 for South-East), GEOMETRIC MODEL OF OCCLUSIONWe gathered SHAPE our data for sleeveless participantscerned to maintain cernedwith a tablet-sized with tablet-sized displays displays like the like portable the portable Tablet PC.Tablet PC. The experimentThe experiment revealed revealed that the that occlusion the occlusionconsistent shape baseline, shapewas butsome- wewas recognize some- that size of the occlu- is the angle of rotation of the rectangle around the cen- sion silhouette could be much larger when clothed (consider tre of the circle (using the same angle configuration as ), what whatuniform uniform within within a participant a participant and high andusing level higha tablet similaritieswhilelevel wearing similarities a lo ose fitting sweater or jacket). w is the width of the rectangle representing the forearm. As a general rule, Pheasant and Hastlegrave [13] suggestScalable add- Circle and Pivoting Rectangle appeared across participants. We wondered if a simple geo- pNote thatp the length of the rectangle is infinite for our pur- appeared across participants. We wondereding 25mm to allif anatomicala simple dimensions geo- for men and 45mm Model poses. If we were building a model for larger displays, this for women to account for thickness of clothing. metricmetric model model could coulddescribe describe the general the general shape shapeand position and position of of may become another parameter, but at present we are con- the occlusion silhouettes. If so, by fittingGEOMETRIC this model MODEL to OF OCCLUSION the SHAPE cernedq with tablet-sizedr displays like the portable Tablet PC. the occlusion silhouettes. If so, by The fitting experiment this revealed model that the to occlusion the shape was some- q r actualactual silhouettes, silhouettes, the resulting the resulting model model parameterswhat parameters uniform could within a servecouldparticipant serve and high level similarities appeared across participants. We wonderedp if a simplep geo- p p p as empiricalas empirical guidelines guidelines for designers. for designers. Moreover,metric Moreover, model this could geomet- describe this thegeomet- general shape and position of ric representation could form the basis forthe a occlusionpredicative silhouettes. ver- If so, by fitting this model to the q r ric representation could form the basisactual forsilhouettes, a predicative the resulting model ver- parameters could serve p p sion ofsion model: of model: in real in time, real atime, system a system wouldas empiricalwould be aware guidelines be awareof for oc- designers. of oc- Moreover, this geomet- cluded portions of the interface without theric representation aid of elaborate could form the basis for a predicative ver- cluded portions of the interface withoutsion of model:the inaid real oftime, elaborate a system would be aware of oc- w sensors. 
For example, imagine an interfacecluded that portions knows of the wheninterface without the aid of elaborate w w sensors. For example, imagine an interfacesensors. For example,that knows imagine an when interface that knows when a statusa status message message is occluded, is occluded, and re-displays and re-displaysa status it messageas a itbubble is asoccluded, a bubble in and re-displays in it as a bubble in a nearby non-occluded area instead. a nearby non-occluded area instead. a nearby non-occluded area instead. There are many ways to approach modeling the shape of the (a) (b) (c) There are many ways to approach modelingocclusion the silhouettes.shape of Perhaps the the most straightforward ap- There are many ways to approach modelingproach is to assume the pixels shape below of and theto the right of the(a) pen’s (a) (b) (b) (c) (c) occlusion silhouettes. Perhaps the most straightforwardposition are occluded, an approach ap- which some designers and occlusion silhouettes. Perhaps the mostresearchers straightforward seem to use implicitly. ap- We refer to this as a proachproach is to assume is to assume pixels pixels below below and to and thebounding toright the rectangle of right the model ofpen’s the(Figure pen’s 12c). This model is con- Figure 12. Three occlusion shape models: (a) Bézier position are occluded, an approach which somestant relative designers to the pen’s positionand and requires no other input,Bézier splinespline; (b) circle and rectangle; (b) boundingbounding rectan- rectangle position are occluded, an approach whichbut the accuracy some is poor.designers At the other and end of the spectrum, we gle. p is the position of the pen. model researchers seem to use implicitly. We could refer create to a model this with as a flexible a shape such as one com- Fitting the Geometric Model to Captured Silhouettes researchers seem to use implicitly.posed We of Bézier refer spline to segmen thists as (Figure a 12a). While this bounding rectangle model (Figure 12c). This model is con- Figure 12. ThreeFor each occlusionsilhouette image shapefrom our experiment,models: we (a) use non-Bézier bounding rectangle model (Figure 12c).would certainly This yieldmodel a very is accurate con- representation of Figure the linear 12. optimization Three techniquesocclusion to set shape the five parameters models: of (a) Bézier stant relative to the pen’s position and requiresoccluded no area, other the huge input, number of parametersspline; would make (b) circlethe geometric and model rectangle; so that it “fits” (b) over bounding the silhouette rectan- as stant relative to the pen’s position andfitting requires and interpreting no otherthe model input, difficult and hence imprac-spline;accurately (b) circle as possible. and Note rectangle; that other optimization (b) bounding algo- rectan- but the accuracy is poor. At the other end oftical the for creating spectrum, empirical weguide lines. Our aim gle.then is p to iscre- the position of the pen. but the accuracy is poor. At the other end of the spectrum,LMU München we — Medieninformatikgle. — Andreas p Butz,is the Bastian position Pfleging — Mensch-Maschine-Interaktion of the pen. II — WS2016/17 Slide 5 could create a model with a flexible shape such as one com- Fitting the Geometric Model to Captured Silhouettes could create a model with a flexible shape such as one com- Fitting563 the Geometric Model to Captured Silhouettes posedposed of Bézier of Bézier spline spline segmen segments (Figurets (Figure 12a). While12a). 
While this thisFor each silhouette image from our experiment, we use non- would certainly yield a very accurate representation of the For each silhouette image from our experiment, we use non- would certainly yield a very accurate representation of thelinear linear optimization optimization techniques techniques to set the to set five the parameters five parameters of of occludedoccluded area, thearea, huge the number huge number of parameters of parameters would would make makethe geometric model so that it “fits” over the silhouette as fitting and interpreting the model difficult and hence imprac- the geometric model so that it “fits” over the silhouette as fitting and interpreting the model difficult and hence imprac-accurately as possible. Note that other optimization algo- tical for creating empirical guidelines. Our aim then is to cre- accurately as possible. Note that other optimization algo- tical for creating empirical guidelines. Our aim then is to cre-

Occlusion-aware techniques

http://www.youtube.com/watch?v=4sOmlhEJ2ac

Chapter 3: Mobile and Wearable Interfaces

• More Interaction Techniques –Gestural input on Touch screens –Input around the device & body –Interaction across devices ==> Int. Environments • Wearable Computing –historic roots –some more recent examples • Augmented Reality –historic roots: AR on HMDs –Contemporary examples for handheld AR

Bezel Swipe: Conflict-Free Scrolling and Selection on Mobile Touch Screen Devices
• Drag from screen edges through thin bars • Edge bar encodes command • Multiple commands without interference –Selection, cut, copy, paste –Zooming, panning, tapping

Roth, Turner. Bezel Swipe: Conflict-Free Scrolling and Multiple Selection on Mobile Touch Screen Devices. CHI 2009.
Figure: touch on bar: no activation; touch on bezel: activation
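The core mechanism can be sketched in a few lines. This is a reconstruction of the idea rather than the authors' code, with an illustrative screen layout: a gesture only becomes a command if its first sample lies on the bezel edge and it then crosses the thin bar next to it; touches that start inside the content area keep their normal scrolling or tapping meaning.

```python
# Hypothetical layout: the bezel is the strip at x = 0, the thin activation
# bar spans x = 1..4 (values are illustrative, not from the paper).
BEZEL_X = 0
BAR_RIGHT = 4

def classify_touch(samples):
    """samples: ordered list of (x, y) points of one touch.

    Returns 'bezel-swipe' if the touch started on the bezel and crossed
    the bar, otherwise 'normal' (scrolling, panning, tapping, ...).
    """
    if not samples:
        return "normal"
    x0, _ = samples[0]
    if x0 > BEZEL_X:          # started on the bar or in the content: no activation
        return "normal"
    crossed = any(x > BAR_RIGHT for x, _ in samples[1:])
    return "bezel-swipe" if crossed else "normal"

print(classify_touch([(0, 100), (3, 101), (12, 103)]))  # -> bezel-swipe
print(classify_touch([(2, 100), (10, 101)]))            # -> normal (started on the bar)
```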

Bezel Swipe: Conflict-Free Scrolling and Selection on Mobile Touch Screen Devices

Roth, Turner. Bezel Swipe: Conflict-Free Scrolling and Multiple Selection on Mobile Touch Screen Devices. CHI 2009.

iOS 5 Notification Bar

Image source: http://www.macstories.net/stories/ios-5-notification-center/

MicroRolls: Expanding Touch-Screen Input by Distinguishing Rolls vs. Slides of the Thumb

• Input vocabulary for touchscreens is limited • MicroRolls: thumb rolls without sliding –Roll vs. slide distinction possible –No interference • Enhanced input vocabulary –Drags, Swipes, Rubbings and MicroRolls

Roudaut, Lecolinet, Guiard. MicroRolls: Expanding Touch-Screen Input Vocabulary by Distinguishing Rolls vs. Slides of the Thumb. CHI 2009.
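The roll/slide distinction is visible in the kinematics alone: a MicroRoll displaces the contact centroid only a few millimetres, while drags and swipes travel much farther and faster. The heuristic below is a simplified illustration with made-up thresholds, not the classifier from the paper.

```python
import math

ROLL_MAX_TRAVEL_MM = 3.0   # assumed: rolls barely displace the contact centroid
SWIPE_MIN_SPEED = 40.0     # mm/s, assumed threshold separating swipes from drags

def classify(trace):
    """trace: list of (t_seconds, x_mm, y_mm) centroid samples of one touch."""
    if len(trace) < 2:
        return "tap"
    (t0, x0, y0), (t1, x1, y1) = trace[0], trace[-1]
    travel = math.hypot(x1 - x0, y1 - y0)
    duration = max(t1 - t0, 1e-6)
    if travel <= ROLL_MAX_TRAVEL_MM:
        return "microroll"
    return "swipe" if travel / duration >= SWIPE_MIN_SPEED else "drag"

print(classify([(0.0, 10, 10), (0.2, 11.5, 10.4)]))   # -> microroll
print(classify([(0.0, 10, 10), (0.3, 40, 12)]))       # -> swipe
```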

Kinematic Traces of Different Touch Gestures

Roudaut, Lecolinet, Guiard. MicroRolls: Expanding Touch-Screen Input Vocabulary by Distinguishing Rolls vs. Slides of the Thumb. CHI 2009.

Mapping MicroRoll Gestures to Actions

Figure: menu (300 ms timeout) and the MicroRoll gestures

• Menu supports gesture learning –Menu only appears after 300ms timeout –Experts execute gestures immediately • Precision: selecting small targets • Quasi-mode: modify subsequent operation
Roudaut, Lecolinet, Guiard. MicroRolls: Expanding Touch-Screen Input Vocabulary by Distinguishing Rolls vs. Slides of the Thumb. CHI 2009.
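The novice/expert transition works like a marking menu: if the finger stays down past the timeout the menu is drawn, otherwise the gesture is interpreted directly. A minimal event-handling sketch under an assumed API (not from the paper):

```python
import time

MENU_TIMEOUT_S = 0.3   # 300 ms, as on the slide

class MicroRollMenu:
    def __init__(self):
        self.touch_down_at = None
        self.menu_visible = False

    def on_touch_down(self, t=None):
        self.touch_down_at = t if t is not None else time.monotonic()
        self.menu_visible = False

    def on_tick(self, t=None):
        """Call regularly while the finger is down; novices get the menu."""
        t = t if t is not None else time.monotonic()
        if self.touch_down_at is not None and not self.menu_visible:
            if t - self.touch_down_at >= MENU_TIMEOUT_S:
                self.menu_visible = True   # novice: show the gesture menu

    def on_gesture(self, gesture):
        """Experts finish the MicroRoll before the timeout; the menu never appears."""
        self.touch_down_at = None
        return f"execute {gesture} (menu shown: {self.menu_visible})"

m = MicroRollMenu()
m.on_touch_down(t=0.0)
m.on_tick(t=0.1)                      # expert finishes early, before the timeout
print(m.on_gesture("roll-right"))
```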

MicroRolls: Expanding Touch-Screen Input by Distinguishing Rolls vs. Slides of the Thumb

Roudaut, Lecolinet, Guiard. MicroRolls: Expanding Touch-Screen Input Vocabulary by Distinguishing Rolls vs. Slides of the Thumb. CHI 2009.

Air + Touch

Literature: Chen, X. et al.: Air+Touch: Interweaving Touch & In-Air Gestures, UIST'14

Chapter 3: Mobile and Wearable Interfaces

• More Interaction Techniques –Gestural input on Touch screens –Input around the device & body –Interaction across devices ==> Int. Environments • Wearable Computing –historic roots –some more recent examples • Augmented Reality –historic roots: AR on HMDs –Contemporary examples for handheld AR

LMU München — Medieninformatik — Andreas Butz, Bastian Pfleging — Mensch-Maschine-Interaktion II — WS2016/17 Slide 16 BodySpace 2 • uses inertial sensing and basic pattern recognition to allow gestural control user’s personal finances on the mobile device. From the enabling us to easily alter the dynamics of interaction • control by placing the device at different body parts machine’s sensor perspective this is simply a time- and multimodal feedback by varying the parameters of series in acceleration but the user thinks of the back our model.

–magnetometer pocket as the location of their 4 personal finances, and –accelerometer that this can be accessed by moving their phone there. Gesture Controlled Applications Inertial sensing has proved to be a viable technique for Previous work on this concept focussed mainly on the sensing movement for gestural interaction with mobile –gyroscope accelerometer measurements, can indicate which body in figure 6, a simulation of a ‘ball in a bowl' metaphor locations are, as shown in figure 4, compatible. This basicrepresents ideas the state and of therequirements interaction. We can imaginefor the project without a devices. Rekimoto et al [9] describe their GestureWrist then allows us to use the second stage of recognition to a ball placed in a bowl or concavity as shown in figure 5 gather extra evidence for the inferred body location by working,where each mobile bowl or concavity implementation. represents a different Strachan et al. [12] system, which consists of a wristband that recognises comparing how the device moved to that position, track. The simulation approach allows us to add based on accelerometer data for the last second of builtformative a dynamic feedback such systems as the real-time implementation synthesis of a gesture hand and forearm movements and uses these motion. A simple Multi-Layer Perceptron [2] is used to used by Rath & Rocchesso in [9] which gives the user a classify one of four body positions. The use of a Multi- recogniser.sense of the devices In sensitivity this to his paper action. we describe the first movements to communicate with a computer. Ubi- Layer Perceptron at this stage shows the generality of implementation of a completely handheld and fully Finger [13] is another system which uses acceleration the approach and was perfectly adequate for this task since its compact final form of a handful of parameters, functioning ‘BodySpace’ system, which uses inertial and touch sensors to detect a fixed set of hand and low processing cost makes it very suitable for low memory mobile devices. Training of this system sensing to recognise when it is placed at different areas gestures and Kela et al describe the use of a matchbox

involves repeated gestures to the four different parts of the body with three gestures per location required to of Figure a user’s 5: Combination body of bowls, in which order the user must to navigate control a music player, sized sensor pack, SoapBox [3], which they use to achieve adequate training in this set-up. essentiallythe ball into in order utilising to switch tracks the human body as the mnemonic control the functionality of different appliances in their With a row of bowls representing a list of tracks, it is Modelling device.possible simulate A user the task may of transferring place a the ball from device at their hip in design studio. They describe a study designed to In this work we also incorporated dynamic model-based one bowl to the next by providing an external force approaches to interaction. By basing our interaction on order to control the volume of their current song or at compare the usefulness of the gesture modality from the movement of the device. In this case the a simulation of a physical model we enable a more external force comes from a flick of the device. active exploration of the potential range of interaction their ear, in order to switch tracks. compared to other modalities for control such as RFID Figure6: Illustration of the main Increased velocity and momentum of the flick would functionality of the BodySpace system. with the device. It also allows us to alter the ‘look and allow users to reach the peak, and effectively fall into objects or PDA and stylus, finding that gestures are a Figure1:feel’ of the Examplesinteraction very of easily other by simply ways altering we the the next track. We may model the surface friction and parameters of the model and gives us great scope for Literature: Strachan, S. et al.: BodySpace: Inferring body pose for natural controlmay of a usemusic the player bodyspace , CHI’07 system Ourthe effort system required to differs overcome the from peak of the other bowl gesture controlled natural modality for certain tasks. This reflects the designing multi-modal interaction, where the vibration with some simple physics. Each bowl is represented by and audio feedback can be generated in real-time as systems in that we are not required to explicitly design conclusions of Pirhonen et al. [7] who investigated the LMU München — Medieninformatik — Andreas Butz, Bastian Pfleging — Mensch-Maschine-Interaktion II — WS2016/17 Slide 17 a simple parabola, with a certain height, y, used to the user performs the gesture. −1 a lexiconcalculate angle of ofgestures. slope:θ = tan The(x / yrange) and the of gestures we use is use of gesture and non-speech based audio as a way to When the device is classified to be at a certain body constrainedforce: F = mg sin byθ , minus the surface limits friction (static [4]. This and kinematic) of the improve the interface on a mobile music player. The location, the system switches to the correct mode and interaction is also augmented with vibrotactile feedback model associated with that part of the body. So for humanallowing thebody user toin feelthat when the the trackarm switch can has only move to a finite key advantage of this gestural approach is that it example, when we wish to switch tracks, the device is occurred, where the level of feedback presented is first moved to the left ear where recognition occurs. A numberassociated of with locations a parameter of around the physical the model. 
body, A providing us with enables eyes-free interaction with a music player, mode switch then means that when the device is tilted ansimilar obvious, mechanism perfectly is used to control natural the volume and of a easily generated set which is advantageous, especially when the user is `on back or forward at the ear, in order to switch tracks, as track, which is located, in this set-up, at the left hip. So of gestures. Another difference is that we do not use the move'. any buttons at all in our interface, making the interaction more fluid and natural than a gestural Hardware system that requires an explicit button press at the The equipment used consists of an HP iPAQ 5550 beginning of each gesture (e.g. the Samsung SCH 310, running windowsCE equipped with a MESH [6] inertial the only gesturally controlled phone on the market, navigation system (INS) backpack consisting of 3 uses a gesture button which has to be activated while Analog Devices ±2g dual-axis accelerometers, 3 Analog generating gestures). Additionally, we also use a Devices ±300deg/s single chip gyroscopes, 3 Honeywell model-based approach to our interaction design devices magnetometers, a Trimble LassenSq GPS and a Virtual Shelf

Virtual Shelf
• access programmable shortcuts on mobile phone by pointing to a body-relative location around the body –especially interesting for visually impaired users • shortcuts are arranged in an imaginary sphere.

Li, F., Dearman, D., and Truong, K. Virtual shelves: interactions with orientation aware devices. In Proc. UIST (2009), 125–128
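Conceptually, Virtual Shelves quantizes the device's body-relative pointing direction (azimuth and elevation) into a grid of "shelves", each bound to a shortcut. The toy version below uses an assumed 6 x 3 layout and assumed angular ranges; the segmentation in the actual paper differs.

```python
# Assumed layout: a hemisphere in front of the user,
# 6 azimuth columns (-90..90 deg) x 3 elevation rows (-30..60 deg).
AZ_BINS, EL_BINS = 6, 3
AZ_RANGE, EL_RANGE = (-90.0, 90.0), (-30.0, 60.0)

shortcuts = [[f"app_{r}_{c}" for c in range(AZ_BINS)] for r in range(EL_BINS)]

def _bin(value, lo, hi, n):
    if not lo <= value <= hi:
        return None
    return min(int((value - lo) / (hi - lo) * n), n - 1)

def select_shortcut(azimuth_deg, elevation_deg):
    """Map a body-relative pointing direction to the shortcut stored there."""
    c = _bin(azimuth_deg, *AZ_RANGE, AZ_BINS)
    r = _bin(elevation_deg, *EL_RANGE, EL_BINS)
    return shortcuts[r][c] if c is not None and r is not None else None

print(select_shortcut(-80, 50))   # upper-left shelf
print(select_shortcut(10, 0))     # roughly straight ahead
```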

Around-body interaction

• phone's 3D location tracking: front camera, accelerometer and inertial measurement units

• three levels of around-body interaction: –canvas: expand the interaction area beyond the screen boundaries (e.g. place UI elements in a space larger than the screen; see the sketch below) –modal: switch between different applications or modes within a given application –context: use the device's spatial relationship to the user
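For the canvas level, the screen acts as a peephole onto a virtual surface larger than the display: moving the phone in space pans the viewport by the tracked displacement. A minimal sketch under assumed units (device position in metres from the tracker, a fixed pixels-per-metre scale); this is not the authors' implementation.

```python
PX_PER_M = 3000                    # assumed mapping from physical motion to canvas pixels
SCREEN_W, SCREEN_H = 320, 480
CANVAS_W, CANVAS_H = 1280, 1920    # virtual canvas larger than the screen

def viewport(device_pos_m, origin_pos_m):
    """Return the (left, top) canvas offset for the current device position."""
    dx = (device_pos_m[0] - origin_pos_m[0]) * PX_PER_M
    dy = (device_pos_m[1] - origin_pos_m[1]) * PX_PER_M
    left = min(max(0, CANVAS_W // 2 - SCREEN_W // 2 + int(dx)), CANVAS_W - SCREEN_W)
    top = min(max(0, CANVAS_H // 2 - SCREEN_H // 2 + int(dy)), CANVAS_H - SCREEN_H)
    return left, top

print(viewport((0.00, 0.00, 0.3), (0.0, 0.0, 0.3)))   # centred
print(viewport((0.10, -0.05, 0.3), (0.0, 0.0, 0.3)))  # moved right and up -> viewport pans
```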

Chen, X. et al.: Around-Body Interaction: Sensing & Interaction Techniques for proprioception-enhanced input with mobile devices, MobileHCI'14
Side-of-Device Interaction: SideSight

• Useful if device is placed on table • Distance sensors along device edges – Multipoint interactions • IR proximity sensors – Edge: 10x1 pixel “depth” image

Left and right “depth” images

Butler, Izadi, Hodges. SideSight: Multi-“touch” Interaction Around Small Devices. UIST’08.
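Each edge delivers a 10 x 1 "depth" profile from the IR proximity sensors; fingers show up as local groups of near readings. The sketch below turns one such profile into multi-touch points beside the device; the threshold and units are assumptions, not values from the paper.

```python
NEAR_THRESHOLD = 30   # assumed: readings (mm) below this count as a finger

def fingers_from_profile(profile_mm):
    """profile_mm: 10 proximity readings along one device edge.

    Returns (edge_position_index, distance_mm) centroids, one per finger blob.
    """
    blobs, current = [], []
    for i, d in enumerate(profile_mm):
        if d < NEAR_THRESHOLD:
            current.append((i, d))
        elif current:
            blobs.append(current)
            current = []
    if current:
        blobs.append(current)
    return [(sum(i for i, _ in b) / len(b), min(d for _, d in b)) for b in blobs]

# two fingers hovering beside the right edge of the device:
right_edge = [120, 118, 25, 22, 110, 115, 28, 26, 24, 119]
print(fingers_from_profile(right_edge))   # -> [(2.5, 22), (7.0, 24)]
```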

Side-of-Device Interaction: SideSight

Butler, Izadi, Hodges. SideSight: Multi-“touch” Interaction Around Small Devices. UIST’08.

Chapter 3: Mobile and Wearable Interfaces

• More Interaction Techniques –Gestural input on Touch screens –Input around the device & body –Interaction across devices ==> Int. Environments • Wearable Computing –historic roots –some more recent examples • Augmented Reality –historic roots: AR on HMDs –Contemporary examples for handheld AR

Wearable Computing
• Small and wearable -> hands free (otherwise just portable ;-) • Sensors onboard –Cameras, temperature sensors, microphones, GPS • Unobtrusive displays –Wrist watch –Clip-on for glasses • “Always on” –Sense the environment and observe user actions

Goals of Wearable Computing

• Personal assistant (general purpose) –Internet access –Remind of dates and tasks –Handle large Databases (lexicon access, document management) • Specialized devices –Museum guide –City and Campus guides –Maintenance and surveillance tasks

Early Wearable Input: Keyboards (images from left to right)

• Twiddler (www.handykey.com/): chord keyboard • WearClam (www.robots.ox.ac.uk/~wmayol/WearClam/) • WristPC Keyboards (L3 Systems) • Fitaly keyboard (1-finger System)

Output (Historic Examples)

• Clip-on glasses (Micro Optical Corp.) • Retina displays • See e.g., TekGear for more • www.tekgear.com

Early Platform: CharmIT

• MIT Media Lab spin-off • Complete „wearable“ system bundle –PC class hardware • Transmeta Crusoe processor • 20GB hard disk • 8hrs battery life (differing info) –Linux operating system –Clip-on display –Finger mouse –Carrying bag

Early Commercial attempts

• IBM wearables (pictures on the right) • I-wear (EU-project, 2003, link now void) –Siemens, Philips, Samsonite, adidas, starlab.org –Intelligent Clothing –Antennas integrated into clothing • Xybernaut • Matsucom onHandPC • IBM Linux wrist watch

Some viable „Wearable“ Products

• Apple & Burton „Amp Jacket“ (2003) –Includes Apple iPod MP3 player –Player control keys integrated in sleeve –$500 without iPod

• Infineon & O‘Neill „Hub Jacket“ (2004) –Includes custom unit which… • Contains a 128MB MP3 player • Acts as a bluetooth headset for mobile phones –Player and phone control integrated in sleeve –€550 including player/headset electronics

MIT Wearables

• MIThril • Platform to develop wearable computing applications • www.media.mit.edu/wearables/

MIThril Architecture

Head of group: Sandy Pentland

Eyetap Technology

• http://www.eyetap.org/ • Computer mediated reality • modify visual perception –Augment –Diminish –Alter
Chapter 3: Mobile and Wearable Interfaces

• More Interaction Techniques –Gestural input on Touch screens –Input around the device & body –Interaction across devices ==> Int. Environments • Wearable Computing –historic roots –some more recent examples • Augmented Reality –historic roots: AR on HMDs –Contemporary examples for handheld AR

SixthSense Project

• vision-based approach for always available input

Literature: Mistry, P. : Wear Ur World - A Wearable Gestural Interface, CHI 2009 https://www.youtube.com/watch?v=JnxGUPzP_U0

Biosensors

• signals traditionally used for diagnostic medicine –emotional state: heart rate, skin resistance –brain sensing technology for cognitive and emotional state (BCI - brain-computer interaction): • electroencephalography (EEG) • functional near-infrared spectroscopy (fNIR) –signals generated by muscle activation • electromyography (EMG) • bone conduction – sound frequencies propagate well through bone

Myo: gesture sensing via electric currents

• https://www.youtube.com/watch?v=oWu9TFJjHaM • Mini-discussion: what types of gestures are shown?

Brain-Computer Interfaces

For thesis topics, contact [email protected]

The *annoyingly* talking window

https://www.youtube.com/watch?v=azwL5eoE5aI

https://www.youtube.com/watch?v=g3XPUdW9Ryg
Skinput

• Bio-sensing approach –leverages natural acoustic conduction properties of the body

Shoesense

• Idea: attach a gesture sensor to your shoe –MS Kinect • detect gestures from below • short quiz: what do we need for gesture recognition? – – • what types of gestures might one want? – – • http://www.gillesbailly.fr/publis/BAILLY_CHI12a.pdf • https://www.youtube.com/watch?v=htN6u7tWbHg

Chapter 3: Mobile and Wearable Interfaces

• More Interaction Techniques –Gestural input on Touch screens –Input around the device & body –Interaction across devices ==> Int. Environments • Wearable Computing –historic roots –some more recent examples • Augmented Reality –historic roots: AR on HMDs –Contemporary examples for handheld AR

What is AR (and what isn't)?

• 1) Combines real and virtual • 2) Interactive in real time • 3) Registered in 3-D • Ronald T. Azuma: A Survey of Augmented Reality In Presence: Teleoperators and Virtual Environments 6, 4 (August 1997), 355-385. http://www.cs.unc.edu/~azuma/ARpresence.pdf

http://www.tomsguide.com/us/google-glass,news-17711.html
Ivan Sutherland, 1968
Principle: closed (video only) HMD

• Monitor is mounted very close to the eye • Additional lens makes it appear distant • -> all images appear at the same distance –Usually at infinity or slightly less
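This can be made concrete with the thin-lens equation: if the micro-display sits at (or just inside) the focal length of the magnifier lens, its virtual image appears at a large, fixed distance. The numbers below are illustrative, not those of a specific HMD.

```latex
\frac{1}{f} = \frac{1}{d_o} + \frac{1}{d_i}
\;\;\Rightarrow\;\;
d_i = \frac{f\, d_o}{d_o - f}
% Display exactly at the focal plane (d_o = f): d_i \to \infty, the image appears "at infinity".
% Display slightly inside the focal length, e.g. f = 40\,\mathrm{mm}, d_o = 38\,\mathrm{mm}:
% d_i = \frac{40 \cdot 38}{38 - 40} = -760\,\mathrm{mm},
% i.e. a virtual image roughly 0.76\,\mathrm{m} in front of the eye ("slightly less" than infinity).
```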

Creating VR with a HMD

Figure: head tracker -> rendering <- 3D scene

Challenges with HMDs in VR

• Lag and jitter between head motion and motion of the 3D scene –Due to tracking -> predictive tracking –Due to rendering -> nowadays mostly irrelevant • Leads to different motion cues from –Eye (delayed) and –Vestibular system (not delayed) • Result: cyber sickness
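Predictive tracking compensates the motion-to-photon latency by extrapolating the head pose forward by the expected delay, typically from the measured angular velocity. A minimal one-axis sketch (constant-velocity extrapolation plus simple smoothing against jitter; real trackers filter and predict full 3D orientation):

```python
class YawPredictor:
    """Smooths jittery yaw measurements and extrapolates over the display latency."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha    # smoothing factor against jitter
        self.yaw = None
        self.rate = 0.0       # estimated angular velocity in deg/s

    def update(self, measured_yaw_deg, dt_s):
        if self.yaw is None:
            self.yaw = measured_yaw_deg
            return
        new_rate = (measured_yaw_deg - self.yaw) / dt_s
        self.rate = self.alpha * new_rate + (1 - self.alpha) * self.rate
        self.yaw = self.alpha * measured_yaw_deg + (1 - self.alpha) * self.yaw

    def predict(self, latency_s):
        return self.yaw + self.rate * latency_s   # pose to render for, not the measured one

p = YawPredictor()
for y in [0.0, 1.2, 2.3, 3.7]:          # noisy yaw samples at 100 Hz
    p.update(y, dt_s=0.01)
print(round(p.predict(0.03), 1))        # extrapolated 30 ms into the future
```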

Rift

• custom-built motion and orientation sensor unit –gyroscope –accelerometer –magnetometer • head tracking with 3 degrees of freedom
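A standard way to fuse exactly these three sensors into a drift-corrected 3-DOF head orientation is a complementary filter: integrate the gyroscope for responsiveness, then pull slowly towards the accelerometer (gravity) for pitch/roll and the magnetometer (north) for yaw. The per-axis sketch below is a generic illustration, not Oculus' actual tracking code.

```python
import math

ALPHA = 0.98   # trust gyro integration 98%, correct 2% per step with the absolute references

def fuse(pitch, roll, yaw, gyro_dps, accel_g, mag_heading_deg, dt):
    """One complementary-filter step. Angles in degrees, gyro in deg/s, accel in g."""
    # 1) integrate angular rates (fast, but drifts)
    pitch += gyro_dps[0] * dt
    roll  += gyro_dps[1] * dt
    yaw   += gyro_dps[2] * dt
    # 2) absolute references: gravity gives pitch/roll, magnetometer gives yaw
    ax, ay, az = accel_g
    acc_pitch = math.degrees(math.atan2(-ax, math.hypot(ay, az)))
    acc_roll  = math.degrees(math.atan2(ay, az))
    # 3) blend: high-pass the gyro, low-pass the references
    pitch = ALPHA * pitch + (1 - ALPHA) * acc_pitch
    roll  = ALPHA * roll  + (1 - ALPHA) * acc_roll
    yaw   = ALPHA * yaw   + (1 - ALPHA) * mag_heading_deg
    return pitch, roll, yaw

state = (0.0, 0.0, 0.0)
# head turning at 90 deg/s, device held level, compass agreeing with the integration
for _ in range(100):                                   # 1 s of samples at 100 Hz
    state = fuse(*state, (0.0, 0.0, 90.0), (0.0, 0.0, 1.0), state[2], 0.01)
print([round(a, 1) for a in state])
```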

http://static3.businessinsider.com/image/51d1949aecad04ad3e000000/heres-what-happened-when-we-strapped-a-bunch-of-people-into-the-oculus-rift-virtual-reality-headset.jpg
Creating AR with video see-through HMDs

Figure: head tracker -> rendering <- 3D scene; camera image and rendered image combined by video mixing; parallax error between camera and eye

Advantages of video-based see-through

• Lag between physical and virtual image can be compensated • Camera can be used for tracking as well –Physical image = raw tracking data –Perfect registration possible • Video mixer can add or subtract light –Virtual objects can be drawn in black –Physical objects can be substituted –Virtual objects can be behind physical objects • Just one image with a given focus distance

Challenges of video-based see-through

• Lag between physical and virtual image can be compensated –…by delaying the physical image –Leads back to the cyber sickness problem • Parallax error can not be corrected electronically –Wrong stereo cues when used for stereo • Richness of the world is lost –Video image just 0.5 megapixels –Resolution of human vision is much higher (>10x)

Creating AR with optical see-through HMDs

Figure: head tracker -> rendering <- 3D scene

Advantages of optical see-through HMDs

• Preserve the richness of the world –Very high resolution of physical image –No lag between motion and phys. image –Physical objects can be focused at their correct distance

• Limitations: –can only add light, i.e. not cover things –lag of digital image very noticeable

https://www.youtube.com/watch?v=hglZb5CWzNQ
Contemporary Example: Microsoft HoloLens • IMHO a quantum leap in augmentation quality

Creating AR with Head-up Displays (HUDs)

Figure: head tracker -> rendering <- 3D scene

Head-Up Display with 3D registration

• Currently mostly military use • Fixed Display • Very exact head or eye tracking needed – Easy for jet pilots • High brightness and dynamics needed

HUD without 3D registration

• Optional Equipment in premium cars (image source: www.bmw.ch) • easy: no tracking needed! -> not AR!

HUD app for iPhone

• can be bought from app store • put iPhone under windshield • uses GPS and accel. sensors to sense car motion • can display speed, heading, ...

Handheld AR in “Reality Browsers”

• Show augmented video stream • No image processing; only other sensors – GPS – Accelerometer – Magnetometer • Examples – Layar – Wikitude • Limitations – only loose registration with the real world view

Layar “Reality Browser”

• Position + orientation: GPS, accelerometer, magnetometer • Show POIs as overlays on viewfinder image • Platform allows inserting new layers and POIs • Layers – Real estate – Transportation – Tours / Guides – Eating & Drinking • http://layar.com/
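The core of such a reality browser is turning each POI's GPS position into a screen position: compute bearing and distance from the user to the POI, subtract the device heading (magnetometer) and pitch (accelerometer), and map the resulting angles onto the camera's field of view. The rough, flat-earth sketch below uses an assumed field of view and screen size; real apps also handle roll and use proper geodesy.

```python
import math

H_FOV, V_FOV = 60.0, 45.0          # assumed camera field of view in degrees
SCREEN_W, SCREEN_H = 480, 800

def poi_on_screen(user_lat, user_lon, poi_lat, poi_lon, heading_deg, pitch_deg=0.0):
    """Return (x, y) pixel position of the POI overlay, or None if outside the view."""
    # flat-earth approximation of bearing (fine for nearby POIs)
    d_north = (poi_lat - user_lat) * 111_320.0
    d_east = (poi_lon - user_lon) * 111_320.0 * math.cos(math.radians(user_lat))
    bearing = math.degrees(math.atan2(d_east, d_north)) % 360.0
    rel = (bearing - heading_deg + 180.0) % 360.0 - 180.0   # -180..180, relative to view direction
    if abs(rel) > H_FOV / 2:
        return None
    x = SCREEN_W / 2 + rel / (H_FOV / 2) * (SCREEN_W / 2)
    y = SCREEN_H / 2 + pitch_deg / (V_FOV / 2) * (SCREEN_H / 2)   # crude vertical placement
    return round(x), round(y)

# user near Marienplatz looking roughly east (heading 80 deg), POI a few hundred metres east:
print(poi_on_screen(48.1374, 11.5755, 48.1376, 11.5800, heading_deg=80.0))
```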

https://www.youtube.com/watch?v=vDNzTasuYEw
Contemporary commercial example
