Vista - SAPI 5.3 - Speech Recognition
Quick overview of speech recognition for SAPI 5.3:
You can use the System.Speech.Recognition namespace to add speech recognition to desktop applications. You have two choices: the SpeechRecognizer or the SpeechRecognitionEngine. So what is the difference?
The SpeechRecognizer uses the shared recognizer, the same recognizer that Vista uses for its own speech recognition. This means you can use the speech toolbar to interact with the user. The SpeechRecognitionEngine runs entirely in your application's own process, which means you can't use the new speech toolbar, and you must explicitly set its audio input and tell it when to start recognition. If you need access to some of the standard commands that Vista uses (e.g. "MouseGrid", "Click"), you should use the SpeechRecognizer, since those commands belong to the shared recognizer.

SpeechRecognizer _sharedRecognizer = new SpeechRecognizer();
SpeechRecognitionEngine _inprocRecognizer = new SpeechRecognitionEngine();
Now you have to choose how to provide the grammar. You have several options: a .grxml file (SRGS XML format), a GrammarBuilder with a list of Choices, or a class that inherits from SrgsDocument in the System.Speech.Recognition.SrgsGrammar namespace. So let's create a little grammar each way. The grammar will recognize the colors "Red", "White" and "Blue".
GrXML
Create a .grxml file:
<grammar xml:lang="en-US" version="1.0" root="Color" xmlns="http://www.w3.org/2001/06/grammar" tag-format="semantics-ms/1.0">
  <rule id="Color" scope="public">
    <one-of>
      <item> red </item>
      <item> white </item>
      <item> blue </item>
    </one-of>
    <tag> $ = $$ </tag>
  </rule>
</grammar>
In Code:
grammar = new Grammar(_pathToGrxml);
GrammarBuilder
//Build Choice List
Choices grammarChoices = new Choices();
grammarChoices.Add("Red");
grammarChoices.Add("White");
grammarChoices.Add("Blue");
//Add to Grammar Builder
GrammarBuilder gb = new GrammarBuilder(grammarChoices);
//Add Grammar Builder to Grammar
grammar = new Grammar(gb);
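A plain Choices list attaches no semantic value to its phrases, so code that later reads Semantics.Value may find nothing there. As a minimal sketch, assuming you want each color to report a value of its own, SemanticResultValue can pair a phrase with one. The ColorGrammarFactory name, the "Colors" grammar name and the hex strings are illustrative assumptions, not part of the original sample.

```csharp
using System.Speech.Recognition;

// Hypothetical helper: builds the Red/White/Blue grammar with semantic values.
static class ColorGrammarFactory
{
    public static GrammarBuilder CreateBuilder()
    {
        Choices colors = new Choices();
        // Each SemanticResultValue pairs a spoken phrase with the value
        // reported through Semantics.Value (hex codes are made up for the demo).
        colors.Add(new GrammarBuilder(new SemanticResultValue("Red", "#FF0000")));
        colors.Add(new GrammarBuilder(new SemanticResultValue("White", "#FFFFFF")));
        colors.Add(new GrammarBuilder(new SemanticResultValue("Blue", "#0000FF")));
        return new GrammarBuilder(colors);
    }

    public static Grammar Create()
    {
        Grammar grammar = new Grammar(CreateBuilder());
        // A name makes the grammar easier to identify when several are loaded.
        grammar.Name = "Colors";
        return grammar;
    }
}
```

With this grammar, saying "Red" should give a Text of "Red" and a semantic value of "#FF0000".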
SrgsDocument Class:
Create a class that derives from SrgsDocument:
using System.Speech.Recognition.SrgsGrammar;

class SrgsGrammar : SrgsDocument
{
    private static readonly string[] _colors = new string[] { "red", "white", "blue" };

    //Pass the rule to the base constructor, which makes it the root rule
    public SrgsGrammar() : base(GrammarRule())
    {
    }

    private static SrgsRule GrammarRule()
    {
        //Create new rule
        SrgsRule rule = new SrgsRule("Color");
        //Create a one-of list
        SrgsOneOf choices = new SrgsOneOf();
        //Add one item per color to the one-of list
        foreach (string color in _colors)
        {
            SrgsItem grammarItem = new SrgsItem(color);
            choices.Add(grammarItem);
        }
        //Add list to rule
        rule.Add(choices);
        //Return rule
        return rule;
    }
}
In your application code use:
Grammar grammar = new Grammar(new SrgsGrammar());
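To check what a code-built SrgsDocument actually contains, it can help to write it out as SRGS XML. The sketch below uses SrgsDocument.WriteSrgs with an XmlWriter; the SrgsDump class and its ToXml method are hypothetical names introduced only for this illustration.

```csharp
using System.Speech.Recognition.SrgsGrammar;
using System.Text;
using System.Xml;

// Hypothetical helper: serializes an SrgsDocument to an SRGS XML string.
static class SrgsDump
{
    public static string ToXml(SrgsDocument document)
    {
        StringBuilder xml = new StringBuilder();
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Indent = true;

        // Disposing the writer flushes the output into the StringBuilder.
        using (XmlWriter writer = XmlWriter.Create(xml, settings))
        {
            document.WriteSrgs(writer);
        }
        return xml.ToString();
    }
}
```

Calling SrgsDump.ToXml(new SrgsGrammar()) with the class above should produce markup close to the hand-written .grxml file, which makes the two approaches easy to compare.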
Code:
Now load the grammar into either the SpeechRecognizer or the SpeechRecognitionEngine:
_sharedRecognizer.LoadGrammar(grammar);
Or
_inprocRecognizer.LoadGrammar(grammar);
Now we need to get the results. For both recognizers you can handle the SpeechRecognized and SpeechRecognitionRejected events.
Shared
private void sharedRecognizer_SpeechRecognitionRejected(object sender, SpeechRecognitionRejectedEventArgs e)
{
    //Send results to Toolbar with Message
    SpeechUI.SendTextFeedback(e.Result, "Say Red, White or Blue", false);
}

private void sharedRecognizer_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
{
    //Send results to Toolbar
    SpeechUI.SendTextFeedback(e.Result, e.Result.Text, true);
    string confidenceValue = e.Result.Confidence.ToString();
    string saidText = e.Result.Text;
    //Convert.ToString returns an empty string if there is no semantic value
    string semanticValue = System.Convert.ToString(e.Result.Semantics.Value);
}
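The handlers only run once they are subscribed to the recognizer's events. Here is a minimal sketch of that wiring, assuming a small wrapper class; SharedColorListener and FormatSemanticValue are hypothetical names, and the console output stands in for whatever your application does with the result.

```csharp
using System;
using System.Speech.Recognition;

// Hypothetical wrapper: loads a grammar into the shared recognizer
// and subscribes to the two result events.
class SharedColorListener : IDisposable
{
    private readonly SpeechRecognizer _sharedRecognizer = new SpeechRecognizer();

    public SharedColorListener(Grammar grammar)
    {
        _sharedRecognizer.LoadGrammar(grammar);
        _sharedRecognizer.SpeechRecognized += OnSpeechRecognized;
        _sharedRecognizer.SpeechRecognitionRejected += OnSpeechRecognitionRejected;
    }

    private void OnSpeechRecognized(object sender, SpeechRecognizedEventArgs e)
    {
        Console.WriteLine("Heard '{0}' (confidence {1:F2}, semantics: {2})",
            e.Result.Text, e.Result.Confidence,
            FormatSemanticValue(e.Result.Semantics.Value));
    }

    private void OnSpeechRecognitionRejected(object sender, SpeechRecognitionRejectedEventArgs e)
    {
        // Prompt the user through the speech toolbar.
        SpeechUI.SendTextFeedback(e.Result, "Say Red, White or Blue", false);
    }

    // Guards against a missing semantic value before calling ToString.
    public static string FormatSemanticValue(object value)
    {
        return value == null ? "(none)" : value.ToString();
    }

    public void Dispose()
    {
        _sharedRecognizer.Dispose();
    }
}
```

Keep the listener alive for as long as you want to receive results, and dispose it when the application shuts down.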
InProc
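For the InProc recognizer, first select an audio source, for example with SetInputToDefaultAudioDevice. You can then handle the events the same way as with the shared recognizer and start recognition with Recognize or RecognizeAsync, or you can work directly with the RecognitionResult object that Recognize returns:
_inprocRecognizer.SetInputToDefaultAudioDevice();
RecognitionResult results = _inprocRecognizer.Recognize();
if (results != null) //null means nothing was recognized
{
    string saidText = results.Text;
    string confidenceValue = results.Confidence.ToString();
    string semanticValue = System.Convert.ToString(results.Semantics.Value);
}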
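Putting the in-process steps together, a single blocking recognition might look like the sketch below. It assumes the default audio device is a working microphone; InProcColorDemo, ListenOnce and DescribeText are hypothetical names used only for illustration.

```csharp
using System;
using System.Speech.Recognition;

// Hypothetical helper: performs one synchronous recognition in-process.
static class InProcColorDemo
{
    public static string ListenOnce(Grammar grammar, TimeSpan initialSilenceTimeout)
    {
        using (SpeechRecognitionEngine engine = new SpeechRecognitionEngine())
        {
            engine.LoadGrammar(grammar);
            // The in-process engine has no input until one is assigned.
            engine.SetInputToDefaultAudioDevice();

            // Blocks until something is recognized or the silence timeout passes.
            RecognitionResult result = engine.Recognize(initialSilenceTimeout);
            return DescribeText(result == null ? null : result.Text);
        }
    }

    // Recognize returns null when nothing was recognized, so handle that case.
    public static string DescribeText(string text)
    {
        return text == null ? "(nothing recognized)" : "You said: " + text;
    }
}
```

For continuous listening, the event-based route with RecognizeAsync(RecognizeMode.Multiple) is usually the better fit, since Recognize blocks the calling thread.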
And those are the basics of speech recognition for a desktop application. I'll post a demo app soon.