Voice (VUI) Human Computer Interaction Fulvio Corno, Luigi De Russis Academic Year 2019/2020 Echo Hands-On

2 Human Computer Interaction Voice User Interfaces

§ Voice User Interfaces (VUIs) allow the user to interact with a system through voice or speech commands o primary advantage: hands-free, possibly eyes-free interaction § Voice User Interfaces or Conversational User Interfaces? o "which mimics a conversation with humans" o "conversational" applies to both text-based and VUIs § Contemporary VUIs can be divided in: o screen-first systems o voice-only systems o voice-first systems

3 Human Computer Interaction Screen-First Devices

§ Most of contemporary voice interaction happens on screen-first devices o , mainly § Impressive and language processing features o but overall experience is fragmented § Main limitations o missing functionality o poor use of screen space while speaking o missing affordances

4 Human Computer Interaction Missing Functionality and Affordances

§ Users can start a task via voice, but subsequent steps require them to use the touchscreen § Visual affordances are missing (or poor) o omits several visual affordances (e.g., it does not show that people can edit a text message before sending it) o Google is better in this

5 Human Computer Interaction Poor Screen Space Use

§ Tasks with some support for multi- step voice input exhibit a screen design: o totally different from the "normal" GUI version o which limits the information available to the user

6 Human Computer Interaction Voice-Only Devices

§ No visual display at all o like the we "tested" before o audio is for input and output (plus some "feedback lights") o hands-free operation § Quite good accuracy in speech recognition o if you do not mix different languages in a sentence o auditory signals are the only used cues (no visual affordances)

7 Human Computer Interaction Voice-Only Devices: Limitations

§ They are quite prolix in the answers § You have to know what to say! § Some operations are "challenging", e.g., o once a timer is set up, the user can only ask how much time is left o getting a weekly weather forecast is a… memory test § Some actions are not allowed nor expected, e.g., o you cannot insert your wifi password, vocally o you cannot hear about all the available (and installable) skills

8 Human Computer Interaction Voice-First Devices

§ Voice-only devices… with a screen § A system which primarily accept user input via voice commands, and may augment audio output with visual information o no differences from the "voice" perspective o GUI is less capable than the one in screen-first devices § Typically, the display is a touch screen o but it rarely provides buttons or menus o the focus is still on voice

9 Human Computer Interaction Designing (New) VUIs Background, process, and guidelines

10 Human Computer Interaction Designing Voice User Interfaces

§ Voice interaction between people and devices is analogous to learning a foreign languages o both for users and designers/developers § Easily learnt through immersion o voice-first devices have an advantage in this § Successful examples on voice-first devices: o sequential numbering of search results o randomly show new speech commands o voice-accessible interactive (visual) content § Beware: people often have unrealistic expectations o they think a VUI as a "natural conversation partner"

11 Human Computer Interaction Designing Voice User Interfaces

§ To design a VUI, you firstly need to have a clear picture of o who is communicating, i.e., who are your users o what they are communicating about, what they will ask about, i.e., what their needs are § Then, you can write some sample dialogs and sketch a diagram of the conversation flow o both convey the flow that the user will actually experience o you can also informally experiment with and evaluate different strategies • e.g., is it better to confirm a user's request with an implicit confirmation or an explicit one? § Focus on the spoken conversation before considering any visual element o imagine to work with a voice-only device

12 Human Computer Interaction Basic Conversational Frames

§ Controlling: specifying a goal with means of achieving it o "Play Radio Deejay from TuneIn" § Delegating: asking for an outcome without specifying how to achieve it o "Play some jazz music" § Guiding: discussing the means of achieving a goal Currently adopted by contemporary VUIs contemporary by adopted Currently o "I want to hear some music, how should I do it?" § Collaborating: mutually deciding on goals between both participants o "What should we do?"

13 Human Computer Interaction Guidelines for Designing Voice User Interfaces

§ Provide users with information about what they can do o if the user asks something that does not make sense or it is not possible, provide them with the available options • for instance, a weather app can say "You can ask for today's weather or a weekly forecast" o an "exit" strategy must be always present and available • e.g., Alexa's "stop"

14 Human Computer Interaction Guidelines for Designing Voice User Interfaces

§ Where am I? o users must be told which functionality (or part of it) they are using o for example:

User Today's weather forecast VUI Today's weather forecast is rainy Rainy

§ Use non-auditory feedback, if possible o some visual feedback may be useful, e.g., a light or a message on screen, to let user know that the system is listening

15 Human Computer Interaction Guidelines for Designing Voice User Interfaces

§ Express intentions in examples o in providing sample of speech commands, present the full intention • "you may ask: What is the weather in Turin, tomorrow?" • "you may say: Play Radio Deejay from TuneIn" § During execution, leverage on default settings, additional information, or just ask for missing pieces o e.g., the location for the weather forecast can be retrieved by GPS or TuneIn can be set as the default for radio broadcasts

16 Human Computer Interaction Guidelines for Designing Voice User Interfaces

§ Limit the amount of information o Keep the delivered information brief! • e.g., Amazon recommends not to list more than 3 different options each time o If you have more options, either group them or find another way to accomplish the same goal

17 Human Computer Interaction Other Guidelines Amazon's and Google's Suggestions for Designers and Developers

18 Human Computer Interaction Amazon's Guidelines for Alexa

1. Be adaptable o https://developer.amazon.com/docs/alexa-design/adaptable.html 2. Be personal o https://developer.amazon.com/docs/alexa-design/personal.html 3. Be available o https://developer.amazon.com/docs/alexa-design/available.html 4. Be relatable o https://developer.amazon.com/docs/alexa-design/relatable.html 5. Establish and maintain trust o https://developer.amazon.com/docs/alexa-design/trustbusters.html

19 Human Computer Interaction Google's Guidelines for Assistant

§ Conversation Design Guidelines – Getting Started o https://designguidelines.withgoogle.com/conversation/ § Style Guide o https://designguidelines.withgoogle.com/conversation/style- guide/language.html § Error Handling o https://designguidelines.withgoogle.com/conversation/conversational- components/errors.html

20 Human Computer Interaction References

§ Voice First: The Future of Interaction? o https://www.nngroup.com/articles/voice-first/ § How to Design Voice User Interfaces o https://www.interaction-design.org/literature/article/how-to-design-voice- user-interfaces § The Narrowing Rift: Voice UI and Conversational UI o https://medium.com/@muppetaphrodite/the-narrowing-rift-voice-ui-and- conversational-ui-7d5c95cf086c

21 Human Computer Interaction License

§ These slides are distributed under a Creative Commons license “Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)” § You are free to: o Share — copy and redistribute the material in any medium or format o Adapt — remix, transform, and build upon the material o The licensor cannot revoke these freedoms as long as you follow the license terms. § Under the following terms: o Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. o NonCommercial — You may not use the material for commercial purposes. o ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original. o No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits. § https://creativecommons.org/licenses/by-nc-sa/4.0/

22 Human Computer Interaction