Advanced Interaction Techniques
Carlos Duarte
2015/2016

Speech API

API
• SAPI - Microsoft Speech API
• Windows native API
• Provides a high-level interface between an application and speech engines

Speech Synthesis

Using TTS
• Add reference to solution
System.Speech
• Add using clause
using System.Speech.Synthesis;

Create a TTS engine
• Declare a variable of type SpeechSynthesizer for the engine
SpeechSynthesizer speaker;
• Create the object
speaker = new SpeechSynthesizer();

Synthesising speech
• Call the method Speak (blocks until the speech finishes) or SpeakAsync (returns immediately) with the text to synthesise
speaker.Speak("I like my new voice!");
speaker.SpeakAsync("Don't you?");

Configure voice parameters
• For every voice
speaker.Rate = -2;
speaker.Volume = 60;
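SpeechSynthesizer.Rate accepts values from -10 to 10 and Volume from 0 to 100; values outside those ranges throw an ArgumentOutOfRangeException. When the values come from user input, one option is to clamp them first. The following is a minimal sketch: the VoiceSettings class and its method names are hypothetical helpers, not part of System.Speech.

```csharp
using System;
using System.Speech.Synthesis;

// Hypothetical helper class (not part of System.Speech).
static class VoiceSettings
{
    // Clamp a requested rate to the range accepted by SpeechSynthesizer.Rate.
    public static int ClampRate(int rate) => Math.Max(-10, Math.Min(10, rate));

    // Clamp a requested volume to the range accepted by SpeechSynthesizer.Volume.
    public static int ClampVolume(int volume) => Math.Max(0, Math.Min(100, volume));

    // Apply both settings to an existing engine without risking an exception.
    public static void Apply(SpeechSynthesizer speaker, int rate, int volume)
    {
        speaker.Rate = ClampRate(rate);
        speaker.Volume = ClampVolume(volume);
    }
}
```

For example, VoiceSettings.Apply(speaker, -2, 60) sets the same values as the two assignments above.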
• If other voices are installed
speaker.SelectVoiceByHints(VoiceGender.Female);

Find which voices are installed
foreach (InstalledVoice voice in speaker.GetInstalledVoices())
{
    VoiceInfo info = voice.VoiceInfo;
    Console.WriteLine(" Name: " + info.Name);
    Console.WriteLine(" Culture: " + info.Culture);
    Console.WriteLine(" Age: " + info.Age);
    Console.WriteLine(" Gender: " + info.Gender);
    Console.WriteLine(" Description: " + info.Description);
    Console.WriteLine(" ID: " + info.Id);
}

Speech Recognition

Using ASR
• Add references to solution
System.Speech
• Add using clauses
using System.Globalization;
using System.Speech.Recognition;

Create an ASR engine
• Declare a variable of type SpeechRecognitionEngine for the engine
SpeechRecognitionEngine recognizer;
• Create the object
recognizer = new SpeechRecognitionEngine(new CultureInfo("en-US"));

Load a grammar
• For a user-defined grammar
recognizer.LoadGrammar(cityChooser);
• For dictation tasks
recognizer.LoadGrammar(new DictationGrammar());

Assign to an audio source
recognizer.SetInputToDefaultAudioDevice();

Attach handler for recognition event
• Attach SpeechRecognized event handler
recognizer.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(SpeechRecognizedHandler);
• Create handler code
void SpeechRecognizedHandler(object sender, SpeechRecognizedEventArgs e)
{
    Console.WriteLine(e.Result.Text);
}

Other events
recognizer.SpeechDetected
recognizer.SpeechHypothesized
recognizer.SpeechRecognitionRejected
recognizer.RecognizeCompleted

Create a grammar
Choices cities = new Choices(new string[] {"Los Angeles", "London", "Lisbon"});
GrammarBuilder gb = new GrammarBuilder();
gb.Append("I want to fly from");
gb.Append(cities);
gb.Append("to");
gb.Append(cities);
Grammar cityChooser = new Grammar(gb);
cityChooser.Name = "City Chooser";

Assign input to Kinect
• Add using clauses
using Microsoft.Kinect;
using System.IO;
using System.Speech.AudioFormat;
• Setup audio source
myKinect = KinectSensor.KinectSensors[0];
myKinect.Start();
kinectSource = myKinect.AudioSource;
kinectSource.BeamAngleMode = BeamAngleMode.Adaptive;
audioStream = kinectSource.Start();
// 16 kHz, 16-bit, mono PCM: 32000 bytes per second, block alignment of 2 bytes
recognizer.SetInputToAudioStream(audioStream, new SpeechAudioFormatInfo(EncodingFormat.Pcm, 16000, 16, 1, 32000, 2, null));
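Putting the recognition steps together, the following is a minimal console sketch that uses the default audio device rather than the Kinect. The slides do not show the call that starts recognition; this sketch assumes RecognizeAsync with RecognizeMode.Multiple, which keeps the engine listening until it is stopped. The class name RecognitionSketch and the helper BuildCityChooser are hypothetical names introduced for illustration.

```csharp
using System;
using System.Globalization;
using System.Speech.Recognition;

class RecognitionSketch
{
    // Hypothetical helper: builds the "I want to fly from X to Y" grammar.
    public static Grammar BuildCityChooser(CultureInfo culture)
    {
        Choices cities = new Choices("Los Angeles", "London", "Lisbon");
        GrammarBuilder gb = new GrammarBuilder();
        gb.Culture = culture; // the grammar culture must match the recognizer's
        gb.Append("I want to fly from");
        gb.Append(cities);
        gb.Append("to");
        gb.Append(cities);
        Grammar cityChooser = new Grammar(gb);
        cityChooser.Name = "City Chooser";
        return cityChooser;
    }

    static void Main()
    {
        CultureInfo culture = new CultureInfo("en-US");
        using (SpeechRecognitionEngine recognizer = new SpeechRecognitionEngine(culture))
        {
            recognizer.LoadGrammar(BuildCityChooser(culture));

            // Print accepted phrases with the engine's confidence score.
            recognizer.SpeechRecognized += (sender, e) =>
                Console.WriteLine("{0} (confidence {1:F2})", e.Result.Text, e.Result.Confidence);

            // Report speech that matched no phrase in the grammar.
            recognizer.SpeechRecognitionRejected += (sender, e) =>
                Console.WriteLine("Speech not recognised");

            recognizer.SetInputToDefaultAudioDevice();

            // Assumed start call: listen continuously until stopped.
            recognizer.RecognizeAsync(RecognizeMode.Multiple);

            Console.WriteLine("Listening; press Enter to stop.");
            Console.ReadLine();
            recognizer.RecognizeAsyncStop();
        }
    }
}
```

To use the Kinect instead, the SetInputToDefaultAudioDevice call would be replaced by the SetInputToAudioStream call from the Kinect setup above.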