NATURAL USER INTERFACES
The Second Revolution in Human-Computer Interaction
Natural User Interface Track
Bill Curtis, AMD Senior Fellow
AGENDA
How do we control “things”?
Human-computer interface revolution #1: Interactive computing
Human-computer interface revolution #2: Natural user interface (NUI)
Three layer NUI model
“Revolutionary” platforms for NUI
Summary – why it’s important to “think revolutionary”
3 | Natural User Interface | June 2011 HOW DO WE CONTROL “THINGS”?
Mechanical machines have always used direct, intuitive controls
Doorknob: US Patent 1878 “Improvement in a door holding device”
Machine tools – 19th century
HOW DO WE CONTROL “THINGS”?
As complexity increased, we still used familiar wheels, knobs, keys, buttons, levers
HOW DO WE CONTROL “THINGS”?
Even the most complex electronic systems follow mechanical control patterns
Fit-for-purpose systems, no matter how complicated, use direct, intuitive controls
HOW DO WE CONTROL “THINGS”?
Direct control concepts also apply to consumer electronics
Fit-for-purpose remotes are perfectly designed for each device, but multi-purpose is a big problem!
HOW DO WE CONTROL “THINGS”?
How does all this apply to computing?
HOW DO WE CONTROL “THINGS”?
Human-Computer Interface: “HCI” began as “CI”
Early computers were not interactive
– Machine output: Print, plot, character CRT
– Machine input: Cards, tapes, console
For >40 years, we’ve been trying to make HCI interactive and intuitive by simulating the real world and emulating direct controls
– Metaphorical output
  • 2D and 3D graphics, video, audio, physical device controls
  • Realistic rendering and instrumentation – desktop, appliances with buttons and knobs, game worlds
– Indirect human input
  • Manipulate rendered output
  • Keyboards, pointing devices, handheld controllers, voice
Result: Interactive computing!
REVOLUTION 1 | Interactive Computing
FIRST REVOLUTIONARY * CHANGE IN HCI | Interactive Computing
Desktop metaphor + Mouse
* Revolution – a fundamental change in the way of thinking about or visualizing something: a change of paradigm (Merriam-Webster’s Collegiate Dictionary)
“Though the world does not change with a change of paradigm, the scientist afterward works in a different world.” - Thomas Samuel Kuhn, The Structure of Scientific Revolutions, 1962
REVOLUTION 1 | Interactive Computing
Started ~20 years ago – based on ~25 years of invention and evolution
Timeline, 1960 to 2020: Invention (>15 years), Evolution (10 years), Revolution (multi-billion $ industry)
Milestones: Pointing devices; Raster graphics; Xerox Alto (74); IBM PC; SPARCstation; Windows NT; Windows 95 & IE; Macintosh x86; HTML; Linux
Milestones: Doug Engelbart patents mouse - 1964; Unix graphics workstations; Apple Macintosh; Microsoft Windows 3.1; Internet privatization (NSFNET reverts to research); 1 million Internet hosts
REVOLUTION 1 | Interactive Computing
Why did the UI revolution take >25 years to reach consumers?
Platform and ecosystem (mature, complete): CPU, GPU technology; software; industry-wide apps; interactive computing; Web, HTML, browser; multi-purpose OS
User acceptance: productivity + fun; familiarity; affordability
Result: multi-billion $ industry
REVOLUTION 2 | Natural User Interface
SECOND REVOLUTIONARY * CHANGE IN HCI | Natural User Interface
Computers start to communicate more like people
More natural, more intuitive
REVOLUTION 2 | Natural UI
Starting now – based on ~40 years of invention and evolution
Timeline, 1960 to 2020: Invention (>30 years), Evolution (10 years), Revolution (multi-$B industry)
Milestones: Capacitive touch R&D (1); Resistive touch patents; First practical speech recog; Multi-touch capacitive (2); Newton MessagePad; Microsoft Tablet; iPhone 2007; iPad; Kinect
Milestones: Computer vision R&D; 3D motion capture; Dictation apps; Depth cams; Voice controls (car, phone)
1 – E.A. Johnson (1967). “Touch Displays: A Programmed Man-Machine Interface”. Ergonomics 10 (2): 271-277
2 – http://www.billbuxton.com/multitouchOverview.html
REVOLUTION 2 | Natural UI
Why did the NUI revolution take ~40 years to reach consumers?
Platform and ecosystem (mature, complete): CPU, GPU; software frameworks; tailored apps; touch, voice, sensors; curated software; tailored OS
User acceptance: mobility + fun; ease of use; affordability
Result: multi-billion $ industry
REVOLUTION 2 | Natural UI
The second “NUI Revolution” is just getting started
Where is it heading?
THREE LAYER NUI MODEL
3. Ambient Computing: Computing becomes part of everyday life
   NUI extends across multiple devices
   Networked, cloud-based, always active
2. NUI (Natural User Interface): Emulate human communication
   Software interprets human behavior
   Multi-sensory, contextual, intuitive, learning
1. HCI: Detect and process human behavior
   Sensors detect human behavior - Vision, sound, physical, environmental, biometric
   >40 years of evolution
“The most profound technologies are those that disappear. They weave themselves into the fabric of everyday life until they are indistinguishable from it.” - Mark Weiser, Xerox PARC, 1991
THREE LAYER NUI MODEL
1. HCI Layer – Human-Computer Interface
Detect and process human behavior
Physical
• Mouse, Keyboard
• Multi-touch
• Tactile, haptics
• Position sensors
• Game controllers
• Physical objects
Visual
• Free-space gestures
• Person recognition
• Eye, gaze tracking
• Activity modeling
• Background removal
• Photo, video search
Auditory
• Context-free commands
• Speaker independent
• No training
• Voice Search
• Ambient sound recognition (always listening)
Environmental
• GPS, RFID
• Magnetometer
• Temperature, pressure
• Gyros, accelerometers
• Molecular detection
Biometric
• Brain-Computer Interface (BCI)
• Implantables
• Neuroprosthetics
• Security
• Gaming
• Medical
THREE LAYER NUI MODEL
2. NUI Layer – Middleware and Application Framework
Human behavior translates into action
Ambient Computing Cloud Services
NUI Apps Examples: Collaboration, Conferencing, Education, Healthcare, Gaming, Location-based, Security, Multimedia
NUI Platform Middleware Examples: Recognition – object, face; Gesture recognition; Point, select, manipulate; Human factors; Image processing; Voice, sound recognition; Ambient monitoring; Common controls
HCI Sensors Examples:
THREE LAYER NUI MODEL
3. Ambient Computing Layer – Multiple devices plus cloud services
NUI becomes part of everyday life
Ambient Computing Cloud Services
User’s identity
• Identity
• Rules
• Preferences
• State
Cloud context
• Device registry
• Services registry
• Current location, status
• Social connections
Device context
• Ambient apps
• Events and triggers
• NUI services (server offload)
• Multi-user apps
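One way to picture the three context groups on this slide is as plain data records. The sketch below is only an illustration: the class names, field types, the `fire` method and the `user_recognized` event are all hypothetical and do not come from any AMD or cloud-service API.

```python
from dataclasses import dataclass, field

# Hypothetical records mirroring the three columns on the slide.
@dataclass
class UserIdentity:
    identity: str
    rules: list = field(default_factory=list)
    preferences: dict = field(default_factory=dict)
    state: dict = field(default_factory=dict)

@dataclass
class CloudContext:
    device_registry: list = field(default_factory=list)
    services_registry: list = field(default_factory=list)
    location_status: str = "unknown"
    social_connections: list = field(default_factory=list)

@dataclass
class DeviceContext:
    device_id: str
    ambient_apps: list = field(default_factory=list)
    triggers: dict = field(default_factory=dict)  # event name -> handler

    def fire(self, event, user):
        """Run the handler registered for an event, if there is one."""
        handler = self.triggers.get(event)
        return handler(user) if handler else None

# The same user record drives behavior on whichever device recognizes the user.
user = UserIdentity(identity="jennifer", preferences={"route": "fastest"})
cloud = CloudContext()
car = DeviceContext(
    device_id="car",
    triggers={"user_recognized": lambda u: f"route={u.preferences['route']}"},
)
cloud.device_registry.append(car.device_id)
print(car.fire("user_recognized", user))  # route=fastest
```

The point of the split is that identity and preferences live with the user, not the device, so any registered device can pick them up when it recognizes the user.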
ILLUSTRATION OF AMBIENT COMPUTING
Corning’s Video – “A Day Made of Glass”
Video is shown with permission of the Corning Glass Technologies Group
ILLUSTRATION OF AMBIENT COMPUTING
Ambient Computing Situations in the Video
There’s more going on here than just glass and “touch screens everywhere”
User’s identity is passively recognized on multiple devices
– Car, store computer recognized Jennifer. Could be via mobile device or facial recognition.
Cloud context creates consistent experience across multiple devices
– Bathroom mirror, car navigation, highway display signs, surfaces, store computer
Device context flows between small screen and large screen devices
– Stove, bus stop route display, office Surface, flexible display
– Device-to-device communication
What’s missing? NUI is limited to touch. No voice or gesture HCI.
“REVOLUTIONARY” PLATFORMS FOR NUI
Compute
– Realism – high fidelity video, audio
– Natural input – Goal: intuitive human communication
– Acceleration (APU) – data parallel algorithms
– Efficiency – NUI duty cycle can be 100%
Software
– Tailored OS and apps – fit-for-purpose controls
– Ecosystem – Apps written for the platform
Sensors
– High fidelity – video, audio
– Low latency – I/O at greater than human speed
– High bandwidth – systems for continuous duty cycle
REVOLUTIONARY PLATFORMS FOR NUI
Computational horsepower for NUI: The case for Fusion
Higher performance, lower power for visual NUI computing
Computer vision acceleration (diagram labels: Rendering, APU, Vision)
– Turn the graphics pipeline around
Fusion: Optimize user experience per unit of energy
– Many HCI / NUI algorithms are well suited for data parallel execution
Fusion: High performance GPU memory access
– Improves GP-GPU performance and programming productivity
Future Fusion: Architectural Optimization for HCI / NUI
– Short term: Algorithm architecture and implementation (e.g., OpenCL™, OpenCV)
– Long term: GPU architecture, camera input data path
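To make the "well suited for data parallel execution" claim concrete, here is a minimal sketch of background removal by per-pixel subtraction, one of the visual HCI functions listed in the three layer model. It is not taken from the talk or from any AMD code: the function name, the threshold value and the tiny grayscale frames are assumptions, and plain Python stands in for what would in practice be an OpenCL kernel or an OpenCV call.

```python
def remove_background(frame, background, threshold=30):
    """Return a foreground mask (1 = foreground) by background subtraction.

    Each output pixel depends only on the two input pixels at the same
    position, so every pixel could be computed by an independent GPU
    work-item. The nested loops here stand in for that parallel dispatch.
    """
    return [
        [1 if abs(p - b) > threshold else 0 for p, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]

# Hypothetical 2x3 grayscale frames (values 0..255).
background = [[10, 10, 10], [10, 10, 10]]
frame = [[12, 200, 11], [9, 180, 10]]
mask = remove_background(frame, background)
print(mask)  # [[0, 1, 0], [0, 1, 0]]
```

Because no pixel needs the result of any other pixel, the work scales across as many parallel lanes as the hardware provides, which is the property the slide points to.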
FUSION: USER EXPERIENCE PER UNIT OF ENERGY
Trend lines need final calibration
Chart: High-end (~130 watt ASIC) GPU performance industry trend, single precision (not reflective or predictive of specific AMD products)
Axes: Peak GFlops (0 to 4500) and GFlops / Watt (0 to 30), by year, 2005 to 2013
Annotations: Rule of thumb: ~20X better GFlops / Watt; CPU GFlops / Watt are way down here
SOFTWARE – PROGRAMMING FOR AN ACCELERATED PLATFORM
Some of the use-cases AMD partners are working on that REQUIRE acceleration:
Gestures (camera)
– Wide field of view
– Depth of field – 10 inches to 20 feet
– Multi-user tracking
– Very low latency
– Detailed kinetic models (fingers, eyes, mouth)
– Stereopsis (depth) with cheap 2D cameras
Sounds
– Large vocabulary, multi-lingual speech recog
– Speaker independent
– Eliminate training
– Ambient sound classification
– Multi-person speech separation
Eye tracking
– Eliminate or automate calibration and training
– Real-time area of interest
– Practical UI controls
Face / object recognition
– Fast, low power object classification
– Error rate low enough for secure login
SENSOR I/O
What’s unique about sensor I/O for NUI?
– High bandwidth – Cameras can stream gigabits per second (1080p60 24 bit payload is ~3Gb/s)
– Many sensors – Multiple cameras, multiple microphones, gyro, accelerometer, magnetometer, barometer, thermometer, near-field comms, GPS, ambient light, …
– Low latency – Goal: real-time response to human input (60 fps isn’t enough for fast gestures)
– Continuous duty cycle – Sensors for NUI are active all the time
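The ~3 Gb/s figure and the 60 fps remark can be checked with simple arithmetic. The sketch below assumes a raw, uncompressed 1920x1080 payload with no blanking or protocol overhead. The function names and the 2 m/s hand speed are illustrative assumptions, not values from the talk.

```python
def video_payload_bits_per_second(width, height, fps, bits_per_pixel):
    """Raw pixel payload only; ignores blanking and protocol overhead."""
    return width * height * fps * bits_per_pixel

def frame_interval_ms(fps):
    """Time between consecutive frames, in milliseconds."""
    return 1000.0 / fps

def travel_per_frame_cm(speed_m_per_s, fps):
    """Distance an object moves between two consecutive frames, in cm."""
    return speed_m_per_s * 100.0 / fps

bps = video_payload_bits_per_second(1920, 1080, 60, 24)
print(f"1080p60, 24-bit payload: {bps / 1e9:.2f} Gb/s")            # 2.99 Gb/s
print(f"Frame interval at 60 fps: {frame_interval_ms(60):.1f} ms")  # 16.7 ms

# Assumed hand speed of 2 m/s (hypothetical, for illustration only).
print(f"Hand travel per frame: {travel_per_frame_cm(2.0, 60):.1f} cm")  # 3.3 cm
```

Under that assumed speed a hand moves several centimeters between samples, which is one way to read the slide's point that 60 fps is not enough for fast gestures.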
Platform design implications
– Efficient interfaces – Low overhead, low power (e.g., avoid USB for internal sensors)
– Local processing – Round trips to networked services increase latency
– Partitioned design – Sensor processing in parallel with application processing
SUMMARY | Why It’s Important to “Think Revolutionary”
SUMMARY: SWING FOR THE FENCES!
This is a revolution – not an incremental change – and it’ll play out over the next 20 years
– Legacy compatibility is OK, but don’t dumb down revolutionary NUI products to fit the old HCI paradigm
Use the whole platform
– Yes, you have to write data parallel code
Do not compromise the user experience
– Intuitive, truly natural, no “training”, multi-sensory, multi-user, multi-cultural
Go for mass markets
– Consumers love this stuff! Build Fords and Toyotas, not just Maybachs and Bentleys
Tell us what you need in software support and future APUs
– We’re just gettin’ started!
QUESTIONS
Disclaimer & Attribution
The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors.
The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. There is no obligation to update or otherwise correct or revise this information. However, we reserve the right to revise this information and to make changes from time to time to the content hereof without obligation to notify any person of such revisions or changes.
NO REPRESENTATIONS OR WARRANTIES ARE MADE WITH RESPECT TO THE CONTENTS HEREOF AND NO RESPONSIBILITY IS ASSUMED FOR ANY INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION.
ALL IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE ARE EXPRESSLY DISCLAIMED. IN NO EVENT WILL ANY LIABILITY TO ANY PERSON BE INCURRED FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION CONTAINED HEREIN, EVEN IF EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
AMD, the AMD arrow logo, and combinations thereof are trademarks of Advanced Micro Devices, Inc. All other names used in this presentation are for informational purposes only and may be trademarks of their respective owners.
© 2011 Advanced Micro Devices, Inc. All rights reserved.