
Marshall Harrison - "the gotspeech guy"

Site news, Speech Server insight and assorted ramblings
The Speech Server Scoop on OCS R2

OK, the word is getting out about Office Communications Server 2007 R2, so I thought I would give you some details on how it will affect Speech Server developers.

First let me say that there are no changes for Speech Server in OCS R2. It will still be a separate install and the bits will be the same. You will still use Visual Studio 2005 and all of the development tools are the same.

Now for the cool news about the R2 release.

R2 will include the new UC Managed API 2.0. The API shows the new approach for developing speech applications going forward: speech technology will be an integrated developer capability across the whole UC platform. UCMA 2.0 consists of three major pieces: Core (including a SIP signaling stack and a media stack), a managed server Speech API, and UC Workflow Activities that are built on top of both the Core and server Speech managed APIs. Together they make up the single UC Managed API 2.0.

The UCMA 2.0 Server Speech SDK will support 12 languages with both ASR and TTS: US English, Canadian French, Mexican Spanish, Brazilian Portuguese, UK English, German (Germany), French (France), Italian, Japanese, Mandarin (both simplified/mainland and traditional/Taiwanese), and Korean.

And get this - it will support Visual Studio 2008! The UC Workflow Activities include activities for speech as well as for IM automated agents (a.k.a. bots).

More info -

  1. You can develop Speech Server (2007) applications just like you have in the past (using VS 2005)
  2. You can now develop speech "bots" using the new Workflow Activities on top of the UCMA 2.0, or in managed code only using the Core and Speech APIs, if you are really hard core.
  3. The UCMA Speech SDK will be missing some of the tools that you are currently used to having. For example there is no grammar tool but SRGS grammars are still supported and you can use the existing Grammar Editor (in VS2005) to create grammars, or use your favorite XML editor.
  4. Conversational grammars may or may not work due to changes in the way the engine works.
  5. OCS 2007 R2 has no VXML support on top of UCMA 2.0. This might change for the future '14' release. SALT definitely is dropped from the roadmap.
  6. The UCMA is much closer to SIP but will still be familiar to you. It will be able to manipulate the SIP stack and the media stack as well.
  7. In the next '14' release (the one after R2) Speech Server will no longer be a standalone install but will be an integral part of OCS.
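
Since item 3 says you can write SRGS grammars in any XML editor, here is a minimal sketch of what a hand-written one looks like, plus a quick sanity check. The grammar follows the W3C SRGS 1.0 XML form; the rule name `yesno` and the helper `root_rule_phrases` are made up for illustration, and the check uses only Python's standard library. It does not touch Speech Server or the UCMA API at all.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C SRGS 1.0 specification.
SRGS_NS = "http://www.w3.org/2001/06/grammar"

# A hypothetical yes/no grammar, written by hand as plain XML.
GRAMMAR = """
<grammar version="1.0" xml:lang="en-US" mode="voice" root="yesno"
         xmlns="http://www.w3.org/2001/06/grammar">
  <rule id="yesno" scope="public">
    <one-of>
      <item>yes</item>
      <item>no</item>
    </one-of>
  </rule>
</grammar>
"""


def root_rule_phrases(grammar_xml):
    """Return the <item> phrases found under the grammar's root rule."""
    grammar = ET.fromstring(grammar_xml.strip())
    if grammar.tag != "{%s}grammar" % SRGS_NS:
        raise ValueError("document element is not an SRGS <grammar>")

    root_id = grammar.get("root")
    for rule in grammar.findall("{%s}rule" % SRGS_NS):
        if rule.get("id") == root_id:
            # Collect the text of every <item> nested inside the root rule.
            return [item.text.strip() for item in rule.iter("{%s}item" % SRGS_NS)]

    raise ValueError("root rule %r is not defined" % root_id)


if __name__ == "__main__":
    print(root_rule_phrases(GRAMMAR))
```

This only confirms that the file is well-formed XML and that the `root` attribute points at a rule that exists. Whether a given recognizer accepts the grammar is a separate question.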

You are probably wondering how you can get your hands on Office Communications Server 2007 R2.
The official launch date will be early February. Until then there is only a very small private beta.

There is, however, a developer program called Metro (http://www.discovermetro.net) for managed Microsoft accounts.

Managed ISVs and corporate developers just need to get in touch with their Microsoft (Partner) Account Manager and ask to be admitted to the Metro program. The program gives access to Hyper-V images of a complete developer OCS 2007 R2 setup, including speech; training across the world on the complete platform; and an email-only help desk, in exchange for a commitment to build applications on the UC (OCS 2007 R2 and Exchange) platform.

I am really excited about this as it will allow us Speech Server developers better access to the core OCS components and will give us a new way to develop speech applications. For now the best approach will probably be to keep developing the way you have in the past and start experimenting with the new stuff before settling on it for all of your development. Or at least that is the approach I plan on using.

Gold Systems (the company I work for) has OCS R2 up and running in production and we are very excited about the new release.

I'll blog more on the UCMA later.

Posted: Tuesday, October 14, 2008 10:30 PM by marshallharrison

Comments

ml_ said:

I'm not sure I like the sound of OCS absorbing Speech Server. OCS requires Active Directory integration and therefore really limits what I can do as an ISV. We deploy Speech Server applications as an appliance to various enterprises. If we are required to integrate into their ADS then it just won't work for us.

And am I to understand that I'm going to lose VXML and remain stuck in VS 2005?

I'm glad one of us is excited.

# October 14, 2008 11:14 PM

marshallharrison said:

Yes I guess ADS could be a problem.

It just takes so much time to get everything included, and sometimes things have to wait for the next release. I think that is what happened with the VXML support, and it may be back in '14'. I'll ask about that and get back to you.

# October 14, 2008 11:23 PM

ml_ said:

My suggestion to them would be to post the VXML source to CodePlex and let the community help out. This would a) ensure that those of us interested in VXML support continue to have it available and b) make it much easier to port VXML from other platforms as we could customize the stack to add support for features not found in the strict VXML 2.1 specifications. We use a lot of property tags in our Nuance implementations that are not available to us in the Speech Server world for example.

Unfortunately if ADS is a requirement I'm going to be hurting. Having just spent 2 months trying to get read-only access to one customer's ADS I can just imagine what I'd be looking at with OCS...

# October 15, 2008 8:53 AM

bcxml said:

Any word on whether R2 will have support for dictation ?

# October 15, 2008 11:36 AM

cmccarrick said:

Does this mean that the new announced support for direct SIP trunking in OCS R2 is not supported in Speech Server?

# October 15, 2008 12:39 PM

Aaron Tiensivu's Blog said:

Thanks to the watchful eye of Elan - he spotted a press release about OCS 2007 R2. It sounds pretty cool - just check out the details: Dial-in audioconferencing - Office Communications Server 2007 R2 enables businesses to eliminate costly audioconf

# October 15, 2008 6:24 PM

Albert said:

Thanks for your feedback. OCS 2007 R2 is an important step in Microsoft’s strategy to create a managed platform for SIP applications, yet for speech functionality it clearly should be seen as an interim state. When Microsoft announced Speech Server’s integration into Unified Communications in August 2006, UC was still a concept that was being developed. Two years later it is clear that Unified Communications is the cornerstone of innovation in the telecom industry. I dare say that IVR as a stand-alone business is on the decline. Microsoft strongly believes in the power of speech and invests heavily in it, yet we also strongly believe it should be regarded as a feature. A key feature, yet part of a bigger whole.

For Speech Server (2007) Microsoft postponed the tighter integration into OCS at the request of our installed base. With OCS 2007 R2 Microsoft has clearly decided that our customers have more to gain from speech being part of the UC platform than they have to lose from not having a stand-alone IVR. That is where our investments will be.

In R2 the new speech platform is a work in progress. The accompanying speech tools are missing, Microsoft is still working hard on our application server platform, and OCS can do better on integrating OA&M and reporting for third-party SIP applications. That is why Speech Server (2007) is still supported for the R2 timeframe.

Creating GRUUs in Active Directory indeed is less than smooth. Yet the UCMA is an endpoint API. And AD offers a lot of infrastructure (like for security) which we choose to leverage. The key here (and partners, please jump in!) will be to build better tools than the sample code for a tool we provide in R2.

On VoiceXML. Microsoft recognizes that VoiceXML support is another key feature going forward. Our core strategy for the UC platform is to offer a Unified Communications Managed API and Web Services APIs. VoiceXML is a higher-level API on top of that UCMA. The UCMA is capable of hosting a VXML browser on top of it. We think Microsoft will need to do some basic work to integrate VXML on top of UCMA and as a Workflow Activity in the UC Workflow Activities. Yet I will explore whether we can put that code on CodePlex. Thanks for that suggestion.

In R2 the tighter integration into the Windows Workflow Foundation has been key. In the coming months Microsoft will be revealing more about ‘Oslo’ and Visual Studio’s future. That is a big bet for Microsoft, and I think UC needs to be an integrated part of it. Unfortunately it means that the roadmap is defined along those timelines and will take its sweet time... ‘Oslo’ needs to be ready before we can do our next push there.

We will be targeting a broad set of developers that are not necessarily speech developers. We want to take speech mainstream. As a feature of productivity enhancing applications, where speech can play its role. In this sense we are working on a dual strategy: the speech tools will be more low level. The application authoring tools will need to be more high level, and need to make UC application development a lot easier leveraging the Workflow Foundation work Microsoft is doing overall.

Some of you have been asking for more core access to the speech engines. MRCP 2.0 as offered by Aumtech is a great example of how speech can currently be used by other platforms. UCMA 2.0 does offer API-level access to the speech engines, ASR and TTS. The speech investments in Microsoft are unprecedented and growing. We have more people working on speech than ever before. We are working on support for 26 telephony languages (12 being ready by R2). We are working on integrating the latest research insights into our product to make core speech technology even more robust. That includes server-side dictation, although it is not ready yet (did you check out the Dictation Resource Kit download on Microsoft Download for Windows Vista?). Microsoft has the ambition to offer the best engines out there. Yet as a feature of a bigger developer platform story. Not as a stand-alone business.

On the telephony integration side there is a lot happening. Support for SIP over TCP in the legacy PBX/switch industry is growing. SIP Trunking specs are also being finalized in the SIP Forum, and Microsoft hopes some key carriers will offer SIP Trunking based on that standard. Speech Server (2007)’s tested gateways have been of Dialogic and Audiocodes make, along with Aculab’s boards. We depend on vendors to qualify against our product, as Paraxip/Sangoma did. The same will be true for SIP Trunking. It will be terra incognita. Gateways will likely be your best bet for the year to come. That is just reality. Clearly, with tens of millions of OCS licenses sold versus the number of Speech Server deployments, OCS will be the focus for carriers to test against and Speech Server (2007) will not be their priority. That is another reason why integrating speech into OCS will be a good thing, leveraging the scale and growth of Microsoft’s Unified Communications as a whole.

I hope all of the above gives better insight into our current thinking.

Keep the feedback coming!

Albert Kooiman - Microsoft

This posting is provided "AS IS" with no warranties, and confers no rights.

# October 15, 2008 10:56 PM

marshallharrison said:

Thanks, Albert, for filling in some more details.

# October 16, 2008 8:47 AM

SydneyOs said:

I'm unclear - will I be able to work on my existing Speech Server apps in VS 2008, or only on new UCMA apps in 2008?  Thanks.

# October 16, 2008 6:41 PM

marshallharrison said:

UCMA 2.0 only in VS 2008

# October 16, 2008 8:48 PM

Kuno Weiss said:

Great post with very insightful comments. I'm curious whether the transposing extends to the transposition of entire audio conferences as well. This way we wouldn't need to have someone write down details but could later on just look at who said what.

Please keep the discussion going. OCS R2 is THE application. I'm totally excited about its features and need all the insights I can get to convince our management to go for it.

I'm currently working out an approach to using Nortel's CS1000E along with OCS R2. If someone has any comments, I'd highly appreciate it. Thanks.

# October 17, 2008 5:35 AM

JeffS said:

Since the current speech tools did not get propagated up to VS2008, how long do you think Microsoft will support them?  Looking at Michael Dunn’s post

http://blogs.msdn.com/midunn/archive/2008/10/15/speech-server-2007-vs-ucma-v2-0-wf-activites.aspx

UCMA 2.0 does not seem to be a solid solution for IVRs.

Any thoughts or suggestions?

# October 17, 2008 2:38 PM

marshallharrison said:

JeffS,

I wish I had the answer to your question about how long the current tools will be supported.

For now I think the best approach is to just keep on the current path with the current tools while experimenting with the new stuff.

Hopefully the picture will clear up some more before too long.

# November 6, 2008 10:00 AM

zaziki said:

Well, I'm now a little bit confused about the new R2 release. Does it mean that they are replacing the "Speech Server application development technique" (defining grammars with the editors, debugging applications with the SIP debugger, using the tuning and analysis tools, ...) with the UCMA 2.0 API (I've also read the article from Michael Dunn about that topic)?

Isn't this a step backwards? Although I'm new to the business, I selected Speech Server and its development tools and API because it is a modern kind of VUI development (state-of-the-art?).

So what are they doing now? Skipping VXML support is also a bad decision, because it is a kind of de facto metadata standard for voice applications (although I preferred managed code applications). Okay, supporting SIP trunks is very cool (it cost me a lot of time before to configure an IP-PBX to work with Speech Server). Furthermore, VS 2008 integration is good and dropping SALT is acceptable (why do we need another standard?).

Nevertheless, right now there are more negative items than positive ones on my list.

Cheers zazi

# January 29, 2009 4:13 AM

marshallharrison said:

For the time being Speech Server will remain as it is. You can utilize the UCMA 2.0 to do some of the same things.

As I understand it, VXML will be there in later releases and the other tools will be made available too. It just takes time to get all of the pieces together. You can't do everything at once.

# February 3, 2009 10:33 AM

zaziki said:

Thanks a lot for that significant answer Marshall. So it is just a better decision to wait for switching to the next version. Let's celebrate now the R2 launch ;)

Cheers zazi

# February 3, 2009 12:08 PM