
Marshall Harrison - "the gotspeech guy"

Site news, Speech Server insight and assorted ramblings
Migrating MSS 2004 SALT app to OCS Speech Server 2007

I've been involved lately with migrating a large SALT app written in Speech Server 2004 R2 over to an OCS SALT application. It has been fun, and I got a feel for all of the changes that have occurred in the .NET Framework.

The approach that I took was to back up the code, then open the project in VS 2005 and let the automatic conversion take place. This was a good starting point but that was all it was. One of the things that frustrated me was that when I did a build I didn't get errors, but when I opened some of the pages the SALT components wouldn't display and gave me errors. I quickly tracked that down and discovered that it was a simple matter of updating the .aspx pages to reflect the newer version of Microsoft.Speech.Web (Version=2.0.3400.0). There were also some methods that had changed and a few other things to fix. I'll admit that I was aggravated when the conversion didn't just update the version number for me, but I guess the conversion wizard is just a Visual Studio wizard and isn't aware of Speech Server.
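For illustration, the line that typically carries this version number is the Register directive at the top of each .aspx page. Only the Version value below comes from this post; the TagPrefix, Namespace, Culture and PublicKeyToken values are assumptions, so keep whatever your converted pages already contain and change just the version number.

```aspx
<%-- Sketch only: everything except Version=2.0.3400.0 is assumed; keep your page's existing values --%>
<%@ Register TagPrefix="speech" Namespace="Microsoft.Speech.Web.UI"
    Assembly="Microsoft.Speech.Web, Version=2.0.3400.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35" %>
```

A project-wide search for the old version string is probably the quickest way to find every page that needs the change.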

After getting this conversion well under way I found this link "Migrating Applications to Speech Server". The article does a great job of explaining your options for migrating and even explains why you should consider migrating existing SALT apps over to OCS 2007 Speech Server.

Odds and Ends:

I've also been very busy with some other things (and actually learning something in the process) so I haven't been as active on GotSpeech as I would like but I'm back now. I have some catching up to do with the forum postings but I'm going to make hanging out here on the site a priority.

One of the things that kept me busy was dealing with some site issues. The database had grown to over 500 MB, so I had to purge some of the data and then shrink the database to get things back to where they should be. One of the results of this (besides saving me some added hosting costs) has been that the site's response time has improved.

I am grateful that during my absence other members have picked up the slack and shared their expertise by answering posts in the forums. I want to extend my thanks to all who contribute.

MRCP

For those of you who like to "roll your own" here are some MRCP projects that look promising:

http://www.openmrcp.org/

http://www.unimrcp.org/

http://sourceforge.net/projects/openmrcpclient/

 

It would be interesting to see how well these can be made to interface with Speech Server.

OCS Videos

Some OCS videos from my good friend Ken Circeo at Microsoft - http://www.microsoft.com/video/en/us/search?phrase=communicator

There are more OCS videos in the pipeline but I would love to see some for Speech Server.

The Speech Server Scoop on OCS R2

OK, the word is getting out now about Office Communications Server 2007 R2, so I thought I would give you some details on how this will affect Speech Server developers.

First let me say that there are no changes for Speech Server in OCS R2. It will still be a separate install and the bits will be the same. You will still use Visual Studio 2005 and all of the development tools are the same.

Now for the cool news about the R2 release.

R2 will include the new UC Managed API 2.0. The API shows the new approach for developing speech applications going forward: speech technology will be an integrated developer capability across the whole UC platform. UCMA 2.0 consists of three major pieces: Core (including a SIP signaling stack and a media stack), a managed server Speech API, and UC Workflow Activities that are built on top of both the Core and server Speech managed APIs. Together they make up the single UC Managed API 2.0.

The UCMA 2.0 Server Speech SDK will support 12 languages with both ASR and TTS: US English, Canadian French, Mexican Spanish, Brazilian Portuguese, UK English, German (Germany), French (France), Italian, Japanese, Mandarin (simplified/mainland plus traditional Taiwanese), and Korean.

And get this - it will support Visual Studio 2008! The UC Workflow Activities also include activities both for speech and for IM automated agents (a.k.a. bots).

More info -

  1. You can develop Speech Server (2007) applications just like you have in the past (using VS 2005)
  2. You can now develop speech "bots" using the new Workflow Activities on top of the UCMA 2.0, or in managed code only using the Core and Speech APIs, if you are really hard core.
  3. The UCMA Speech SDK will be missing some of the tools that you are currently used to having. For example there is no grammar tool but SRGS grammars are still supported and you can use the existing Grammar Editor (in VS2005) to create grammars, or use your favorite XML editor.
  4. Conversational grammars may or may not work due to changes in the way the engine works.
  5. OCS 2007 R2 has no VXML support on top of UCMA 2.0. This might change for the future '14' release. SALT has definitely been dropped from the roadmap.
  6. The UCMA is much closer to SIP but will still be familiar to you. It will be able to manipulate the SIP stack and the media stack as well.
  7. In the next '14' release (the one after R2) Speech Server will no longer be a standalone install but will be an integral part of OCS.

You are probably wondering how you can get your hands on Office Communicator 2007 R2?
The official launch date will be early February. Until then there is only a very small private beta.

There is, however, a developer program called Metro (http://www.discovermetro.net) for managed Microsoft accounts.

Managed ISVs and corporate developers just need to get in touch with their Microsoft (Partner) Account Manager and ask to be admitted to the Metro program. The program gives access to Hyper-V images of a complete developer OCS 2007 R2 setup (including speech), training around the world on the complete platform, and an email-only help desk, in exchange for a commitment to build applications on the UC (OCS 2007 R2 and Exchange) platform.

I am really excited about this as it will allow us Speech Server developers better access to the core OCS components and will give us a new way to develop speech applications. For now the best approach will probably be to keep developing the way you have in the past and start experimenting with the new stuff before settling on it for all of your development. Or at least that is the approach I plan on using.

Gold Systems (the company I work for) has OCS R2 up and running in production and we are very excited about the new release.

I'll blog more on the UCMA later.

Office Communications Server 2007 R2

CNET has this article on its web site today - Microsoft plans unified communications update

Another link - Microsoft Unveils Microsoft Office Communications Server 2007 Release 2

I've been on a conference call this morning concerning R2 and I'll provide you with more details later today or early tomorrow after I have had a chance to distill my notes and write the blog post.

As I said before there are some exciting things coming.

Microsoft looking for our input

Several months ago I started a thread on GotSpeech called Long Compile Times. In that thread I described some problems I was having with long load times when trying to load a Speech project into Visual Studio. I was experiencing compile times of around 10 minutes and long load times when running the application on a production server. I also had some rather long times loading pages as the caller moved through the application.

I contacted Microsoft about this, a ticket was opened, and I've stayed in touch with them since. It seems the problem revolves around using subflows (or whatever you want to call them) and nesting of subflows. I'm not going to delve any further into the problem here, as you can just read the thread (and others on GotSpeech) to see what others are experiencing.

Now Microsoft is soliciting some more information on this topic, and Anthony has posted to the thread describing what is happening and asking us to fill in some of the details. So if you have encountered this issue then head on over here and make yourself heard. Let's make this an active thread.

Microsoft is also working on a whitepaper dealing with this. No firm date on when it will be published but I'm guessing it will depend somewhat on how soon Anthony gets the answers to his questions.

Here's your chance to make a difference so go for it.

Back on the Road

I'm in Minneapolis next week conducting some Speech Server training so if you live in the area and want to meet up to talk about Speech Server or OCS then shoot me an email.

It would be great to see what is happening in the area. Am I Done? and I will meet for dinner Monday evening and it will be great to catch up on things. Got to have something to fill my time on these business trips. :-)

3 Years Isn't As Long As It Used To Be

No wonder I feel like I'm getting old so fast. I just realized that it only takes about 17 months to age 3 years.

In April of 2007 Cisco was touting their 3 year lead on Microsoft in Unified Communications. Well it looks like that 3 year lead has vanished and Cisco seems to have decided that they may be going about things all wrong. Check out this post on the Office Communications Server team blog.

And while we are talking about Cisco you should know that they just announced last week that they are acquiring Jabber.

Fun times are ahead for Speech Server and OCS. Those of you that are developing on these platforms are on the leading edge of a revolution in how call centers and data centers will communicate in the near future.

SAPI

Our SAPI forums are getting busier and it looks like there are a lot of developers new to desktop speech development.

So I thought it would be a good time to blog this link - http://blogs.msdn.com/speech/default.aspx. There is some good stuff there.

 

If you are into SAPI programming please tell your friends about our forums and encourage them to visit. I want to grow the SAPI part of GotSpeech.Net. I'm looking for someone to blog about SAPI so if you are interested or know of someone then let me know.

Enjoy.

MSDN Virtual Lab for OCS 2007 Speech Server

There is a virtual lab for Speech Server over on the MSDN site. It is titled Anywhere Information Access with Office Communications Server 2007 Speech Server. I've registered but I haven't run the lab yet, as I have too much going on at the moment to find the time.

 

Enjoy.

To hear prompts in Spanish press 1.

"To hear prompts in Spanish press 1."

 

On the surface this is a fairly simple and common prompt. But when you start to think about how to implement it, things get a little more complicated. There are two responses to this prompt: the user can press 1 or simply do nothing. If the user presses 1 then they should hear the prompts in Spanish, but if they do nothing then things should proceed without any user input.

 

Normally a QuestionAnswer activity has a grammar that expects the user to choose something but the prompt above allows the user to do nothing as one of the responses. If the user does nothing (i.e. silence) then the normal response for a QuestionAnswer activity is to play the silence prompt and wait for a key press.

 

So how do you handle a prompt that doesn’t always require input? Well, I’m going to show you one way of doing it. Note that this approach also works for a barge-in prompt where you want to give the user the ability to skip a prompt.

 

First you need to create a new speech application and include a grxml grammar. In the grammar, set up a DTMF rule that only accepts a “1” key press.
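As a rough sketch, a DTMF-only grammar along these lines would accept a single “1” key press. It follows the W3C SRGS XML format; the rule name PressOne is made up for illustration, and your project may need extra attributes (for example semantic tag settings) that the Grammar Editor normally adds for you.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the rule id "PressOne" is a hypothetical name -->
<grammar xmlns="http://www.w3.org/2001/06/grammar"
         version="1.0" mode="dtmf" root="PressOne">
  <!-- The only accepted input is the single DTMF digit 1 -->
  <rule id="PressOne" scope="public">
    <item>1</item>
  </rule>
</grammar>
```

Because the rule matches nothing but “1”, any other key press should come back as a no-recognition rather than a match, which is the behavior the rest of this walkthrough relies on.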

 

Drop a SpeechSequence into the designer then add a QuestionAnswer activity and name them using whatever naming conventions you have.  Add text to the QA either using the Property Builder or in the TurnStarting event then attach the DTMF grammar. You now have the basics down and the QA will work if the user presses the “1” key. But how do you handle the “do nothing” choice? To handle that we will need to add some more stuff.

 

Next add an IfElse with two branches.

 

 

 

Then right click on the SpeechSequence and choose “View SpeechEvents” as shown below.

 

 

 

In the event handler add a ConsecutiveNoInputsSpeechEvent and set MaximumNoInputs to 1. By using a ConsecutiveNoInputsSpeechEvent the event will fire as soon as we get a silence or a noreco on the QA. This will catch the case where the user chooses to “do nothing” or presses any key other than “1”. This doesn’t override the normal behavior of the QA which is to replay the prompts so we need some way of getting out of the QA and we can do this by dropping a GoTo into the handler and setting its target to the IfElse we created earlier. This will get us out of the QA and allow the workflow to continue. Your handler should look like this:

 

 

Once we are out of the handler we need to check whether the user pressed “1”, and we do that in the IfElse we created earlier. Here we encounter a slight problem. The normal way an IfElse works is that it checks the branches from left to right until it finds a matching condition, hits the Else branch or runs out of branches. We can’t just use the normal QuestionAnswer.RecognitionResult logic that we are all so used to. Since we jumped out of the QA on a silence or noreco event, the QA’s RecognitionResult is null, and we have to handle that case in the first branch or we risk blowing up our code. You can do that in the first branch’s condition code like this:

 

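        // Condition for the first IfElse branch: true when the caller gave no
        // usable input, so the QA never produced a RecognitionResult.
        private void ifSpanishNoInput(object sender, ConditionalEventArgs e)
        {
            e.Result = (askSpanish.RecognitionResult == null);
        }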

 

From here it is simple. All you need to do is add a code block to the Else branch and put this code in the _ExecuteEvent:

 

            // Requires "using System.Globalization;" at the top of the file.
            this.TelephonySession.CurrentUICulture = new CultureInfo("es-US");

 

 

That’s all there is to it. The QA now lets the caller either press the “1” key, in which case the call continues with the TTS set to Spanish, or simply do nothing, in which case the call continues unchanged. You will also need to set your prompts to Spanish but I’ll leave that exercise for you. I will give you a hint though: I use AppendAudio and .wav files for most of my stuff, so I just switch from an English directory of .wav files to the Spanish directory and append the prompts from there.

 

As I said, this technique will work any time you need a prompt that you want to barge in on or a prompt (QA) that only has one input. I've given you enough information to get something like this up and running in your code, but if you have any questions just let me know.

 

 

 

 

 

 

AudioCodes MediaPack MP-114-2FXO-2FXS Analog VoIP Gateway

I have an AudioCodes MediaPack MP-114-2FXO-2FXS Analog VoIP Gateway that I'm going to sell on eBay later this week.

If anyone is interested you can buy it directly from me for $300. If you want it then contact me before Wednesday afternoon.

Just email using marshall at got speech dot net (you can figure out the address).

New Speech Bloggers

I'm excited to have added 3 new Speech Server bloggers to GotSpeech. I'm really looking forward to their contributions. The new blogs are:

  1. Speech From Moscow - Dmitry is a speech programmer from Moscow where he is very active in the local community
  2. Speaking From the Edge - This is Marc LaFleur's blog and he will be writing about the Core API and other things out there on the "bleeding edge" of speech. Marc's current blog can be found at http://weblogs.asp.net/mlafleur/
  3. Dszabo Speaks - This will be the speech blog of David Szabo. David is a Microsoft consultant based out of Dublin, Ireland. He also has a blog at http://blogs.msdn.com/dszabo/default.aspx

 

These blogs should come on line in the next few days and I'll let them introduce themselves and fill you in on what they will be blogging about.

Microsoft E-Reference Library Subscription Discount

I received this information recently:

Microsoft Press has created an exclusive discount URL for the E-Reference Library that MVPs can pass along to the broader community without any limitations or restrictions. To create a trial subscription, community referrals should use the Trial URL (http://microsofteref2.books24x7.com/promo.asp?ref=mvptry). Any community referrals who subscribe to E-Reference Libraries through the Subscription URL (http://microsofteref2.books24x7.com/promo.asp?ref=mvpbuy) will receive a 40% discount on a one-year subscription. This discount offer ends on September 30, 2008.

 

I have a subscription and use it from time to time. Having the books as an online resource sure keeps my office neater and makes my wife happy. I have too many books already and I like having access to the books online. I just wish there were some related to Speech Server.

On the road - In D.C.

Yes, I'm traveling again and this week I'm in our nation's capital. If you are around this area then give me a shout.

When I get home next week I'll blog about my book plans and some new things that are happening around GotSpeech Central. In the meantime keep posting in the forums.
