Advanced Interaction Techniques
Carlos Duarte, 2015/2016

Speech API

API

• SAPI - Speech API

• Windows native API

• Provides a high-level interface between an application and speech engines

Using TTS

• Add reference to solution

System.Speech

• Add using clause

using System.Speech.Synthesis;

Create a TTS engine

• Declare a variable of type SpeechSynthesizer for the engine

SpeechSynthesizer speaker;

• Create the object

speaker = new SpeechSynthesizer();

Synthesising speech

• Call the method Speak with the text to synthesise

speaker.Speak("I like my new voice!");       // blocks until the text has been spoken

speaker.SpeakAsync("Don't you?");            // returns immediately; speech plays in the background

Configure voice parameters

• For every voice

speaker.Rate = -2;

speaker.Volume = 60;

• If other voices are installed

speaker.SelectVoiceByHints(VoiceGender.Female);

Find which voices are installed

foreach (InstalledVoice voice in speaker.GetInstalledVoices()) {
    VoiceInfo info = voice.VoiceInfo;
    Console.WriteLine(" Name: " + info.Name);
    Console.WriteLine(" Culture: " + info.Culture);
    Console.WriteLine(" Age: " + info.Age);
    Console.WriteLine(" Gender: " + info.Gender);
    Console.WriteLine(" Description: " + info.Description);
    Console.WriteLine(" ID: " + info.Id);
}

Using ASR

• Add references to solution

System.Speech

• Add using clauses

using System.Globalization;
using System.Speech.Recognition;

Create an ASR engine

• Declare a variable of type SpeechRecognitionEngine for the engine

SpeechRecognitionEngine recognizer;

• Create the object

recognizer = new SpeechRecognitionEngine(new CultureInfo("en-US"));

Load a grammar

• For user defined grammar

recognizer.LoadGrammar(cityChooser);

• For dictation tasks

recognizer.LoadGrammar(new DictationGrammar());

Assign to an audio source

recognizer.SetInputToDefaultAudioDevice();

Attach handler for recognition event

• Attach SpeechRecognized event handler

recognizer.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(SpeechRecognizedHandler);

• Create handler code

void SpeechRecognizedHandler(object sender, SpeechRecognizedEventArgs e) {
    Console.WriteLine(e.Result.Text);
}

Other events

recognizer.SpeechDetected
recognizer.SpeechHypothesized
recognizer.SpeechRecognitionRejected
recognizer.RecognizeCompleted

Create a grammar

// Alternatives that may fill each city slot
Choices cities = new Choices(new string[] {"Los Angeles", "London", "Lisbon"});

// Phrase template: "I want to fly from <city> to <city>"
GrammarBuilder gb = new GrammarBuilder();
gb.Append("I want to fly from");
gb.Append(cities);
gb.Append("to");
gb.Append(cities);

Grammar cityChooser = new Grammar(gb);
cityChooser.Name = "City Chooser";

Assign input to Kinect

• Add using clauses

using Microsoft.Kinect;
using System.IO;
using System.Speech.AudioFormat;   // needed for SpeechAudioFormatInfo and EncodingFormat

• Set up audio source

myKinect = KinectSensor.KinectSensors[0];
myKinect.Start();
kinectSource = myKinect.AudioSource;
kinectSource.BeamAngleMode = BeamAngleMode.Adaptive;
audioStream = kinectSource.Start();

// Arguments: encoding, samples per second, bits per sample, channels,
// average bytes per second, block align, format-specific data
recognizer.SetInputToAudioStream(audioStream,
    new SpeechAudioFormatInfo(EncodingFormat.Pcm, 16000, 16, 1, 32000, 2, null));
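
Putting the ASR steps together

The slides above configure the recogniser but do not show the call that starts recognition. The following is a minimal sketch of one way to combine the steps in a console application. It assumes the default audio device rather than the Kinect stream, and the class name AsrSketch, the explicit GrammarBuilder culture, the rejection message and the "press Enter to quit" loop are illustrative choices that do not appear in the slides.

```csharp
using System;
using System.Globalization;
using System.Speech.Recognition;

class AsrSketch
{
    static void Main()
    {
        CultureInfo culture = new CultureInfo("en-US");

        // Build the "City Chooser" grammar from the slides.
        Choices cities = new Choices(new string[] { "Los Angeles", "London", "Lisbon" });
        GrammarBuilder gb = new GrammarBuilder();
        gb.Culture = culture; // assumption: set explicitly so it matches the recogniser's culture
        gb.Append("I want to fly from");
        gb.Append(cities);
        gb.Append("to");
        gb.Append(cities);
        Grammar cityChooser = new Grammar(gb);
        cityChooser.Name = "City Chooser";

        using (SpeechRecognitionEngine recognizer = new SpeechRecognitionEngine(culture))
        {
            recognizer.LoadGrammar(cityChooser);
            recognizer.SetInputToDefaultAudioDevice();

            recognizer.SpeechRecognized += SpeechRecognizedHandler;
            recognizer.SpeechRecognitionRejected +=
                (sender, e) => Console.WriteLine("Speech detected but not recognised.");

            // Start asynchronous recognition; Multiple keeps listening after each result.
            recognizer.RecognizeAsync(RecognizeMode.Multiple);

            Console.WriteLine("Speak now; press Enter to quit.");
            Console.ReadLine();

            // Stop after any recognition operation in progress has finished.
            recognizer.RecognizeAsyncStop();
        }
    }

    static void SpeechRecognizedHandler(object sender, SpeechRecognizedEventArgs e)
    {
        // Print the recognised phrase together with the engine's confidence score.
        Console.WriteLine(e.Result.Text + " (confidence " + e.Result.Confidence + ")");
    }
}
```

With the Kinect as the source, the SetInputToDefaultAudioDevice call would be replaced by the SetInputToAudioStream call shown in the Kinect setup; the rest of the sketch should stay the same.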