Fun with outbound, v2...

Wow, 2 years since my last post! I'll see if I can be a bit more regular.

A lot of what bridgeSpeak (my company) does is outbound. Ever since the original Speech Server JDP, it seems we're one of the few (very few I'm told) partners doing outbound via analog. It has been a rough road at times... Anyone who remembers the original SQL Server Notification Services triggering mechanism and/or v1 TIMs knows what I'm talking about.

Anyway, now we've been moving along with OCS Speech Server and I can't quite decide if I miss the "good old days" of outbound on 2004 R2, or if we're moving forward. It's difficult to tell in an analog world! So I thought I'd outline some of the challenges we've seen, some of the solutions we've found, and provide a jumping off point for anyone else in a similar situation.

One of the main challenges we face is that the gateways that provide the interface to the PBX/PSTN were not really built for this purpose (outbound IVR in particular). I'm not a telephony guy, so anyone can feel free to tell me I'm wrong. But just a simple comparison from the previous technology seems to make that case. With a Dialogic D41JCTLS card, you got some call progress analysis, answering machine detection, etc. You do not really get this in the gateway. Our initial thought was just to continue using the (more expensive) cards along with the TIMC. BUT the TIMC requires a TIM, and the TIM vendors are no longer licensing/supporting their TIMs. So the only scenario that the TIMC is good for that I'm aware of is to migrate an existing licensed 2004 deployment to 2007.

We have bounced back and forth between the AudioCodes (MP114FXO) and Dialogic gateways. It seems that both manufacturers are working towards making their gateways more comparable to some of the telephony cards. They are introducing more call progress analysis, tone detection, etc. We are happy to see this evolution, but it can still be a bit tricky. One hurdle we've had to jump is in regard to early media. Both gateways let you choose between an 'instant' mode and a 'voice detection' mode. When the setting is instant (early media), the gateway sends the 200 OK back as soon as the call is initiated or set up and starts streaming the call audio back to OCS. As far as OCS knows, the call is connected. In voice detection mode, the gateway starts streaming audio to OCS only once it determines that something has answered the call, unless some preconnect analysis (busy, for example) prevents the call from being connected.

Here's the trouble. Since the gateways don't currently support answering machine detection, the application must do it. If you are using the DetectAnsweringMachineActivity with the gateway in voice detection mode, a 'hello' is generally going to be 'eaten' by the gateway's voice detection and will not be streamed to the app. So an end user generally has to say 'hello' twice before the application can determine that a human answered. This isn't as much of an issue if an answering machine answers, because it is generally still talking when the DetectAnsweringMachineActivity's turn starts. We have attempted to solve this by wrapping the DetectAnsweringMachineActivity in a while loop. We set a configurable maximum number of times for it to loop, and continue looping while the activity's DetectionResult is None:

private void whileNoneLoop_CodeCondition(object sender, ConditionalEventArgs e)
{
    if (detectAnsweringMachineActivity1.DetectionResult == DetectionResult.None && loopCount < loopQuantity)
    {
        loopCount++;
        e.Result = true;
        return;
    }
    else
    {
        e.Result = false;
        return;
    }
}

The key here is that the gateway must be set to 'instant' mode so that it starts streaming data back to the application immediately. That way all utterances are available to the application for analysis. This can present some issues with preconnect call analysis (busy, SIT, etc.), and we are still working through some of those with the gateway vendors. If we reach the max loop count and the DetectionResult is still None, we just make an assumption about what answered and go from there.
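
For anyone who wants to experiment with the retry-and-fallback idea outside of a workflow, here is a minimal sketch in Python. It is not Speech Server code: the DetectionResult values, detect_with_retries, and the detect_once callable are hypothetical stand-ins for the activity and the loop condition shown above, and the fallback is simply whatever assumption you decide to make when every attempt comes back with nothing.

```python
from enum import Enum
from typing import Callable, Tuple


class DetectionResult(Enum):
    # Hypothetical stand-ins for the results the detection activity reports.
    NONE = "none"
    LIVE_PERSON = "live_person"
    ANSWERING_MACHINE = "answering_machine"


def detect_with_retries(
    detect_once: Callable[[], DetectionResult],
    max_attempts: int,
    fallback: DetectionResult,
) -> Tuple[DetectionResult, int]:
    """Run detect_once until it returns something other than NONE,
    or until max_attempts is reached. Returns (result, attempts used)."""
    attempts = 0
    result = DetectionResult.NONE
    while result is DetectionResult.NONE and attempts < max_attempts:
        attempts += 1
        result = detect_once()
    if result is DetectionResult.NONE:
        # Every attempt was inconclusive: fall back to the configured assumption.
        return fallback, attempts
    return result, attempts


if __name__ == "__main__":
    # Simulated call: the first pass hears nothing, the second hears a person.
    outcomes = iter([DetectionResult.NONE, DetectionResult.LIVE_PERSON])
    print(detect_with_retries(lambda: next(outcomes), 3,
                              DetectionResult.ANSWERING_MACHINE))
```

Keeping the fallback as a parameter makes the "assumption" explicit and configurable per campaign, rather than buried in the loop.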

We have also experimented with a very promising product from Paraxip, their NetBorder Call Analyzer. This is a software solution that can basically insert itself between the gateway and OCS (or any SIP UA, I suppose) and provide call progress information. The trouble we have had here seems to be related to our analog lines. When we analyze the recorded SIT tones we get back, for example, the frequencies do not match up with the standard definitions, and so the Call Analyzer does not recognize them. Does anyone else use analog lines and have trouble detecting tones, etc.? In any case, the Paraxip product looks very promising and their support has been very good.
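
If you want to quantify how far your recorded tones drift, a small script can compare measured frequencies against the nominal SIT segments. The sketch below is in Python and assumes the commonly published nominal values (913.8 or 985.2 Hz, then 1370.6 or 1428.5 Hz, then 1776.7 Hz). The function names and the 15 Hz default tolerance are my own illustrative choices, and this says nothing about how the NetBorder Call Analyzer does its matching internally.

```python
from typing import List, Sequence, Tuple

# Nominal SIT segment frequencies in Hz (commonly published values).
# The first and second segments each have a low and a high variant.
NOMINAL_SIT_HZ = (
    (913.8, 985.2),
    (1370.6, 1428.5),
    (1776.7,),
)


def sit_deviation(measured_hz: Sequence[float]) -> List[Tuple[float, float]]:
    """For each of the three measured segments, return
    (nearest nominal frequency, measured minus nominal) in Hz."""
    if len(measured_hz) != len(NOMINAL_SIT_HZ):
        raise ValueError("expected exactly three segment frequencies")
    report = []
    for measured, candidates in zip(measured_hz, NOMINAL_SIT_HZ):
        nearest = min(candidates, key=lambda f: abs(f - measured))
        report.append((nearest, measured - nearest))
    return report


def looks_like_sit(measured_hz: Sequence[float], tolerance_hz: float = 15.0) -> bool:
    """True if every segment is within tolerance_hz of a nominal value.
    The default tolerance is an assumption chosen for illustration."""
    return all(abs(dev) <= tolerance_hz for _, dev in sit_deviation(measured_hz))


if __name__ == "__main__":
    recorded = [950.0, 1400.0, 1776.7]  # made-up measurements
    for nominal, dev in sit_deviation(recorded):
        print(f"nearest nominal {nominal} Hz, off by {dev:+.1f} Hz")
    print("within tolerance:", looks_like_sit(recorded))
```

Running something like this over a batch of recordings would at least show whether the drift is consistent (which a wider tolerance might absorb) or all over the place.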

We have also experienced some issues with the outbound message queue locking up (the queue itself works fine, but OCS stops pulling messages) and with message priorities being ignored. Both have been acknowledged as bugs by Microsoft, and we are waiting on fixes to test.

Anyway, I think this is long enough for now. I'd love to hear other people's feedback about using OCS for outbound calls!

Why the bad rap?

In our market, we're often confronted by resistance to automation.  The common sentiment:

"I have to call XYZ Co. all the time and I hate their automated systems!  I can never do what I need, and I can never talk to a person.  I'd never put something like that in my store!"

Well, having experienced many of the same systems myself, I can't blame them.  I don't believe that they're complaining about automation per se; they're complaining about their experiences with poorly designed systems.  Unfortunately for many of us in this industry, the customer makes no distinction between the two.  They have simply made the decision that automation = poor customer service.

Now here's what I find really interesting about this.  The primary alternative to automation is, of course, people.  But are people always a better alternative?  Surely customers experience some of the same frustrations with people as they do with automation.  I know that I have personally had excruciatingly painful customer service experiences with human reps.  So why does automation get such a bad rap?

Certainly a big part of the reason is that there are systems out there that seem to have been designed to infuriate people.  There can be no other explanation for the way these systems work (or fail to).  Unfortunately these systems are often widely deployed by national companies and tend to color many people's perception of automation in general.  But that's been well discussed in numerous places already.

I believe another reason is that customers can complain to people.  For example, people can complain about the horrible experience they just had with the company's IVR (or the company's product, or the weather, or whatever strikes their fancy), and they can even demand the system be replaced with humans.  But can you really complain about a bad experience with a human rep to a computer?  You typically don't get much sympathy!  Have you ever been talking to a rep and said "I demand to talk with your IVR!"?  Does part of the appeal of human reps have nothing to do with their actual service, and everything to do with the ability to complain to someone?  Even if they lack the customer service skills to empathize with you, at least you typically get something better than, "I'm sorry, I didn't understand you.  Let's start over..."

So what does this mean for speech companies?  I think that will be my next topic.  I'd like to hear other people's opinions on it in the meantime, though...

Off Topic: The reason I missed the beta meeting...

On Friday, my wife Kristen gave birth to our new daughter, Morgan!  I was supposed to attend the beta meetings for MSS 2007 a few weeks back, but somebody was giving us some signs that she was going to arrive early.  She fooled us and didn't show up until last Friday...  Should have known then that she was a girl... :)

Sorry for the off-topic post, but hey, can you blame a new (second-time) dad??

Introduction

Greetings!  Thank you to Marshall for the invite to do this blog.  Blogging is entirely new to me, so please bear with me.  I'm not quite sure what to do or how to do it, but I'm open to suggestions!

A little background.  I am one of the co-founders of bridgeSpeak, a speech ISV.  We build packaged speech applications for specific vertical industries.  Our first app is for car dealers.  Anyway, we have been involved with MSS since the original TAP (or JDP as they were called back then).  I think it's a terrific platform to work with and it has opened up a whole new market to speech: small- and mid-sized businesses.  And we're very excited at the things we're seeing with the 2007 TAP/beta.

One of the exciting things about MSS is that it opened up speech to non-speech people.  Like me.  :)  I have a Systems Analysis degree from Miami University (OH!), and an MBA from Xavier University.  I don't (maybe "didn't" is a better word) know a thing about signal processing, speech, telephony, etc.  Warehousing (boxes, not data) I knew something about, but not speech. 

Looking back, I consider this a real advantage.  I think it allowed me to look at what the platform and applications could do for businesses without getting hung up on all the "speech stuff".  I read some of the industry publications and the inward focus of the industry ("does your TTS engine cry?  If not, how will users know what it's feeling??" <== slight exaggeration, but not far off)  can be a real distraction to focusing on the basics of serving our customers.  Is TTS "emotion" (for example) the problem, or is it that so many applications are so poorly designed at a basic level that a national bank has run ad campaigns based on that fact??  Methinks the latter...

Those are some of the things I'd like to talk about in this blog.  I will probably put some technical comments in here (I also do some development), but I would like to try to focus a bit on the business of speech (and maybe ranting about the business of speech... :) )  Hopefully some of my thoughts add some value to what you're doing.  Or at least provide some entertainment.  Or at least eat up a few minutes of time you're looking to kill.

Anyway, thanks for reading!

Jon