Image Understanding and Computer Vision Research at MSR Redmond
Zhengyou Zhang
Research Manager/Principal Researcher
Microsoft Research, Redmond, WA

Jian Sun’s team at MSRA

Very many sources of image variability [Slide credit: John Winn]
• Scene type: Street scene
• Scene geometry
• Object classes: Sky, Sidewalk, Bicycle, Bollard, Building×3, Tree×3, Car×5, Road, Person×4, Bench
• Object position
• Object orientation
• Object shape
• Depth/occlusions
• Object appearance
• Illumination
• Shadows
• Motion blur
• Camera effects

Collaborative Office Space: Microsoft Surface Hub

ViiBoard: Vision-enhanced Immersive Interaction with Touch Board
Experimental Setup: Surface Hub Display, Kinect

Big Touch Board (Surface Hub) + RGB-D Sensor (Kinect)
leads to more natural and immersive interaction with touch boards
• VTouch: Natural and Rich Interaction Beyond Touch
• ImmerseBoard: Immersive Remote Collaboration (reference point, same space, gaze, intention)

VTouch
[Figure panels (A) to (D); label: Menu Buttons]
[Chart: Preference for Vision-enabled UI and Importance for Vision-enabled UI. Items: Body Following, Hand Gesture, Hover, Distinguishing Hands, Distinguishing Users. Scales: Strongly NOT Prefer to Strongly Prefer; Strongly Disagree to Strongly Agree]
[Chart: Vision-enabled UI is easy to use and remember. Items: Body Following, Hand Gesture, Hover, Distinguishing Hands, Distinguishing Users. Scale: Strongly Disagree to Strongly Agree]

ImmerseBoard
[Diagram labels: Person 1, Person 2; Face, Eye gaze, Gestures, Proxemics, etc.; Content; Content Creation]

RGBD Sensor (Kinect) + Touch Board (Surface Hub) = Immersive Remote Collaboration
as if writing on a physical whiteboard side-by-side
• Seeing the reference point
• Sharing the same space
• Being aware of gaze
• Predicting intention

Side-by-side writing on a whiteboard / on a mirror

ImmerseBoard: Implemented Conditions
Setup, Hybrid, Mirror, Tilt Board

Varun Ramakrishna, Richard Stebbing, Aaron Hertzmann, Sameh Khamis, Toby Sharp, David Kim, Cem Keskin, Christoph Rhemann, Duncan Robertson, Yichen Wei, Jonathan Taylor, Daniel Freedman, Jamie Shotton, Eyal Krupka, Ido Leichter, Andrew Fitzgibbon, Alon Vinnikov, Shahram Izadi

[Diagram labels: Reinitializer, Batch Rendering, Hand Detector, Region of Interest, Stochastic Optimizer, Batch Golden Energy Computation]

Understanding Reality for Generating Credible Augmentations
Microsoft Research

[With Shapovalov et al.
CVPR ’12]
Inference Machine = Extension of Random Forests
Colours represent different object categories
[Silberman, Shapira, Gal, Kohli, ECCV 2014]
[Kim, Kohli, Savarese, ICCV 2013]
[Silberman, Hoiem, Kohli, Fergus, ECCV 2012]

Interacting with objects requires understanding of support relationships!
Can I move the book?
[With Oxford Brookes, Shahram Izadi, TOG 2015]
Video