
while (speech > milk)

Speech Synthesis and Recognition
Vista - SAPI 5.3 - Speech Synthesis

OK, I don't want to add more confusion to the community about SAPI vs. MSS, but I'm going to be posting about Vista's speech synthesis and recognition. MSS 2007 Beta will still be the subject of most of my posts, at least until OCS comes out!

 

First, let me make the general distinction: SAPI was designed for desktop applications, while the SASDK was designed for writing Web (multimodal) and IVR applications for Microsoft Speech Server 2004. Anyone trying to use SAPI directly over the web is really hitting a nail with a sledgehammer. This gets more confusing going forward, as there doesn't seem to be a plan for a public SASDK for VS 2005; it is only included in the MSS 2007 Beta, aka OCS. For speech-enabled Web applications, I predict that either you will have to write your own SALT, or possibly the SAPI team at MS will pick this up. If you have a choice, write your own SALT rather than trying to use SAPI through a COM wrapper in an ASP.NET application.

 

Vista

If you haven't checked out the speech capabilities in Vista, you really need to. The new voice, Anna, sounds great (at least compared to Sam!). The speech recognition lets you do a lot of things right out of the box. In fact, you don't really have to put any speech recognition code in your application for users to be able to control it with their voice. I'm talking about the common items, such as the File menu, dictation into text boxes, etc.

Vista Speech Recognition and Dictation also has a Speech Toolbar, which is very useful as it shows the user what it heard, or that it didn't understand them. It gives the user nice visual notices.

 

 

System.Speech

System.Speech is a managed API for SAPI. It is included in the .NET Framework 3.0, which comes preinstalled on Vista. You can use this managed API on Windows XP (SAPI 5.1), but you won't be able to use SSML or any 5.3-specific features. You can still use the COM interop stuff too, but why would you? Unless you are doing something with profiles, the System.Speech API will work for you with fewer lines of code.
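To give a feel for how little code the managed API needs, here is a minimal sketch that lists the voices installed on the machine (on Vista you would expect to see Anna). It assumes a console project with a reference to System.Speech.dll; the class name ListVoices is just for illustration.

```csharp
using System;
using System.Speech.Synthesis;

class ListVoices
{
    static void Main()
    {
        using (SpeechSynthesizer synth = new SpeechSynthesizer())
        {
            // GetInstalledVoices returns every voice the synthesizer can see.
            foreach (InstalledVoice voice in synth.GetInstalledVoices())
            {
                VoiceInfo info = voice.VoiceInfo;
                Console.WriteLine("{0} ({1}), enabled: {2}",
                    info.Name, info.Culture, voice.Enabled);
            }
        }
    }
}
```

What gets printed depends entirely on which voices are installed, so the output will differ between XP and Vista.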

 

System.Speech.Synthesis

I'm going to demonstrate how to use speech synthesis for SAPI 5.3 using the managed API. You really have two options: Speak and SpeakAsync. Typically you should use SpeakAsync, which returns immediately and lets other actions take place in your application while it is speaking, such as clicking another button on the form. I'll go more into the difference in a future post.
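In the meantime, here is a rough sketch of the difference, written as a console program rather than a form so it stays self-contained. The class name and the spoken sentences are made up for illustration, and it assumes a reference to System.Speech.dll and a working audio device.

```csharp
using System;
using System.Speech.Synthesis;

class SpeakVersusSpeakAsync
{
    static void Main()
    {
        using (SpeechSynthesizer synth = new SpeechSynthesizer())
        {
            synth.SetOutputToDefaultAudioDevice();

            // Speak blocks the calling thread until the prompt has finished.
            synth.Speak("This call returns only after I stop talking.");

            // SpeakCompleted is raised when an asynchronous prompt finishes.
            synth.SpeakCompleted += delegate(object sender, SpeakCompletedEventArgs e)
            {
                Console.WriteLine("Asynchronous prompt finished.");
            };

            // SpeakAsync queues the prompt and returns right away.
            synth.SpeakAsync("This call returns before I finish talking.");
            Console.WriteLine("Free to do other work while the prompt plays.");

            // Keep the process alive so the prompt is not cut off
            // when the synthesizer is disposed.
            Console.WriteLine("Press Enter to exit.");
            Console.ReadLine();
        }
    }
}
```

In a Windows Forms application the Console.ReadLine is unnecessary, since the form keeps the process alive; the point is only that Speak holds up the caller and SpeakAsync does not.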

 

Once you initialize the SpeechSynthesizer, you have two main ways of providing the output: a String or a PromptBuilder. (There are other ways, but these are really the main ones.)

  1. String
  2. PromptBuilder class
    1. SSML
    2. Text with Pronunciation
    3. Text with Hint

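using System.Speech.Synthesis;  // requires a reference to System.Speech.dll

SpeechSynthesizer synth = new SpeechSynthesizer();

// String
synth.SpeakAsync("Hello World");

// PromptBuilder
PromptBuilder pb = new PromptBuilder();

// Append the contents of an SSML file, given its path
pb.AppendSsml("\\sample.ssml");

// Hints tell the engine how to read the text: letter by letter, or as a date
pb.AppendTextWithHint("SSML", SayAs.SpellOut);
pb.AppendTextWithHint("10/16/2007", SayAs.Date);

// Note: the second argument is documented as a string of IPA phones, so a
// plain-text substitute like "dot net" may be better passed to AppendTextWithAlias
pb.AppendTextWithPronunciation(".NET", "dot net");

synth.SpeakAsync(pb);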
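If you are curious what a PromptBuilder actually hands to the engine, one option is to dump the SSML it generates with ToXml. The following is a small sketch under the same assumptions as above (a reference to System.Speech.dll); the class name and the prompt text are invented for illustration, and it uses AppendTextWithAlias for the plain-text substitution.

```csharp
using System;
using System.Speech.Synthesis;

class InspectPrompt
{
    static void Main()
    {
        PromptBuilder pb = new PromptBuilder();

        // Plain text, followed by text that carries a say-as hint.
        pb.AppendText("The release date is ");
        pb.AppendTextWithHint("10/16/2007", SayAs.Date);

        // Speak "dot net" wherever the text ".NET" appears in this prompt.
        pb.AppendTextWithAlias(".NET", "dot net");

        // ToXml returns the SSML markup the builder has accumulated so far.
        Console.WriteLine(pb.ToXml());
    }
}
```

Nothing is spoken here; the program only prints the markup, which is handy when a prompt does not sound the way you expected.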

 

Next Time … System.Speech.Recognition with Sample Application

Posted: Thursday, February 15, 2007 10:15 PM by MichaelDunn
