US010096319B1

(12) United States Patent
     Jin et al.

(10) Patent No.: US 10,096,319 B1
(45) Date of Patent: Oct. 9, 2018

(54) VOICE-BASED DETERMINATION OF PHYSICAL AND EMOTIONAL CHARACTERISTICS OF USERS

(71) Applicant: Amazon Technologies, Inc., Seattle, WA (US)

(72) Inventors: Huafeng Jin, Sammamish, WA (US); Shuo Wang, Bellevue, WA (US)

(73) Assignee: Amazon Technologies, Inc., Seattle, WA (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days.

(21) Appl. No.: 15/457,846

(22) Filed: Mar. 13, 2017

(51) Int. Cl.
     G10L 21/00 (2013.01)
     G06F 17/00 (2006.01)
     G10L 15/22 (2006.01)
     G10L 15/30 (2013.01)
     G10L 25/63 (2013.01)
     G10L 25/66 (2013.01)
     G10L 25/84 (2013.01)
     G10L 15/08 (2006.01)

(52) U.S. Cl.
     CPC ... G10L 15/22 (2013.01); G10L 15/08 (2013.01); G10L 15/30 (2013.01); G10L 25/63 (2013.01); G10L 25/66 (2013.01); G10L 25/84 (2013.01); G10L 2015/088 (2013.01)

(58) Field of Classification Search
     CPC ... G10L 21/00; G10L 15/00; G06F 17/60
     See application file for complete search history.

(56) References Cited

     U.S. PATENT DOCUMENTS

     6,665,644 B1 *      12/2003   Kanevsky   G10L 17/26    704/246
     9,070,357 B1 *       6/2015   Kennedy    G10L 15/00
     9,177,557 B2 *      11/2015   Talwar     G10L 17/02
     2004/0243443 A1 *   12/2004   Asano      G06Q 10/10    705/2
     2013/0339028 A1 *   12/2013   Rosner     G10L 15/222   704/275
     2014/0025623 A1 *    1/2014   Lindhiem   G16H 50/20    706/52
     2014/0074454 A1 *    3/2014   Brown      G06F 19/345   704/9
     2017/0076740 A1 *    3/2017   Feast      G10L 25/63

     * cited by examiner

Primary Examiner: Shreyans A Patel
(74) Attorney, Agent, or Firm: Eversheds Sutherland (US) LLP

(57) ABSTRACT

Systems, methods, and computer-readable media are disclosed for voice-based determination of physical and emotional characteristics of users. Example methods may include determining first voice data, wherein the first voice data is generated by a user, determining a first real-time user status of the user using the first voice data, generating a first data tag indicative of the first real-time user status, determining first audio content for presentation at a speaker device using the first data tag and the first voice data, and causing presentation of the first audio content via a speaker of the speaker device.

19 Claims, 5 Drawing Sheets

[Front-page drawing (FIG. 1): a user and a user device exchange the following dialog, with voice data sent over network(s) to voice processing server(s) and audio content returned from audio content server(s).
User: "Alexa, *cough* I'm hungry *sniffle*"
Device: "Would you like a recipe for chicken soup?"
User: "No, thanks"
Device: "Ok, I can find you something else. By the way, would you like to order cough drops with 1 hour delivery?"
User: "That would be awesome! Thanks for asking!"
Device: "No problem. I'll email you an order confirmation. Feel better!"]
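By way of illustration only, the sequence of operations summarized in the abstract (determine voice data, determine a real-time user status, generate a data tag, determine audio content using the tag and the voice data, and present it) may be sketched in Python as follows. Every name below (VoiceData, DataTag, CATALOG, respond, the cough and sniffle counters, and the threshold of two events) is a hypothetical stand-in chosen for this sketch and does not appear in the disclosure; actual embodiments may use signal processing algorithms rather than simple counters.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class VoiceData:
    """Hypothetical container for processed voice input."""
    transcript: str          # recognized words
    cough_count: int = 0     # assumed acoustic event counts
    sniffle_count: int = 0


@dataclass(frozen=True)
class DataTag:
    """Hypothetical tag indicative of a real-time user status."""
    value: str


def determine_user_status(voice: VoiceData) -> str:
    # Assumption: two or more cough/sniffle events suggest cold symptoms.
    if voice.cough_count + voice.sniffle_count >= 2:
        return "cold_symptoms"
    return "normal"


def generate_data_tag(status: str) -> Optional[DataTag]:
    # No tag is generated when nothing abnormal was detected.
    return None if status == "normal" else DataTag(status)


# Hypothetical catalog keyed by (request intent, tag value).
CATALOG: Dict[Tuple[Optional[str], Optional[str]], str] = {
    ("hungry", "cold_symptoms"): "Would you like a recipe for chicken soup?",
    ("hungry", None): "Would you like a recipe suggestion?",
}
DEFAULT_RESPONSE = "Sorry, I did not catch that."


def determine_audio_content(voice: VoiceData, tag: Optional[DataTag]) -> str:
    # Uses both the voice data (for the request) and the data tag (for status).
    intent = "hungry" if "hungry" in voice.transcript.lower() else None
    tagged = CATALOG.get((intent, tag.value if tag else None))
    # Fall back to untagged content for the same intent, then to a default.
    return tagged or CATALOG.get((intent, None), DEFAULT_RESPONSE)


def respond(voice: VoiceData) -> str:
    """Run the full sketch; the returned text stands in for spoken audio."""
    status = determine_user_status(voice)
    tag = generate_data_tag(status)
    return determine_audio_content(voice, tag)
```

In this sketch, calling respond(VoiceData("Alexa, I'm hungry", cough_count=1, sniffle_count=1)) returns the chicken soup prompt, while the same request without coughs or sniffles falls back to the untagged recipe suggestion.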
[Sheet 1 of 5, FIG. 1: example use case 100. A user speaks to a user device, which exchanges voice data and audio content over network(s) with voice processing server(s) and audio content server(s). Process blocks shown: receive voice input; determine that user has an abnormal physical or emotional condition; determine audio content for presentation based at least in part on the abnormal condition; present audio content.]

[Sheet 2 of 5, FIG. 2: process flow 200.
210: Receive voice input from a user at a user device.
220: Process voice data of the voice input using one or more signal processing algorithms.
230: Determine one or more real-time traits of the user.
240: Generate one or more data tags corresponding to the one or more real-time traits.
250: Determine candidate audio content for presentation using the one or more real-time traits.
260: Present selected audio content via a speaker device.]

[Sheet 3 of 5, FIG. 3: data and process flow over time among voice interaction device(s) 300, voice processing server(s) 310, exchange server(s), and audio content server(s) 330. Labeled operations: user request via voice data; determine meaning of user request; determine a physical status of the user; determine an emotional status of the user; determine a language accent of the user; determine background noise feature; request audio content; determine candidate content using data from request for audio content; determine ranking for candidate content; send winning content identifier; provide audio content; request follow up action.]

[Sheet 4 of 5, FIG. 4: process flow 400.
402: Receive first analog sound input.
404: Convert the first analog sound input to voice data.
Determine an emotional status of the user.
Determine a physical status of the user.
410: Does the user have a non-local language accent? If YES, 412: determine a language accent of the user.
Generate at least one indication of a real-time user status for use in selecting audio content.
416: Determine candidate audio content.
418: Select, based at least in part on the at least one indication of the real-time user status, audio content for presentation.
420: Present the audio content.]
[Sheet 5 of 5, FIG. 5: block diagram of a device 500 in communication, via network(s) 540, with voice processing server(s) 550 and audio content server(s) 560. Device components shown: processor(s); memory device(s); I/O interface(s); network interface(s); sensor(s)/sensor interface(s); transceiver(s); speaker(s); microphone(s); antenna(e); and data storage holding an O/S, a DBMS, speech recognition module(s), communication module(s), signal processing module(s), and physical/emotional characteristics module(s).]

US 10,096,319 B1

VOICE-BASED DETERMINATION OF PHYSICAL AND EMOTIONAL CHARACTERISTICS OF USERS

BACKGROUND

Users may consume audio content via a number of content consumption devices. Certain content consumption devices may be configured to receive voice-based commands, or may otherwise be configured to recognize speech. Voice input from users to such devices may reflect a physical or emotional characteristic of the user. Accordingly, determining a physical or emotional characteristic of a user using a voice input may be desired.

BRIEF DESCRIPTION OF THE DRAWINGS

The detailed description is set forth with reference [...]

[...] modify home settings, report news, and the like. Voice-based commands may be provided via one or more voice inputs from a user.

Certain embodiments of the disclosure may determine one or more physical and/or emotional characteristics of a user based at least in part on a voice input from the user. For example, physical conditions such as sore throats and coughs may be determined based at least in part on a voice input from the user, and emotional conditions such as an excited emotional state or a sad emotional state may be determined based at least in part on voice input from a user.

Determined physical and/or emotional states or conditions of a user may be used to select or determine relevant audio or visual content for presentation to the user. Selected or determined content may be highly targeted due to the real-time determination of the physical and/or emotional characteristics of the user, and may therefore be timely and relevant to the user's current state. Embodiments of the [...]