Where Do We Go From Here?
Recently I have been following a lively discussion on the VUID group, titled “Are we the orchestra on the Titanic?” It began with a post from the inimitable Phillip Hunter containing a link to Robert Fortner’s blog: Rest in Peas: The Unrecognized Death of Speech Recognition
His premise is that the progress of automated speech recognition has peaked and is well along a slow decline into obsolescence. Fortner cites various studies showing that the core technologies were able to take advantage of the enormous gains in hardware capability and speed, as well as the tremendous amount of text content available for semantic analysis. And then progress stalled, unable to get past about 80% accuracy regardless of the horsepower or the data the recognizers are fed.
Our VUID discussion has focused more on the role that we play in augmenting and assisting the ASR model. We use constrained grammars and carefully crafted dialogs to drive the telephone conversation down the path that provides the most efficient way to service the customer. All useful tools have inherent limitations; the masters work with them, and around them. Why would we consider ASR any different?
As people come to rely more and more on self-service applications with speech-enabled devices such as the OnStar car safety service, a host of new GPS units, Google’s Android Nexus One and other smartphones, they will train themselves to work within those limitations as well. Even something as simple and intuitive as moving to a quieter location before beginning an interaction, or remembering not to carry on two conversations at once, one with the IVR and the other with a store clerk, will make a huge difference. And I suspect that recognition will take another small leap, not necessarily because the technology improves, but because it is being used with, rather than against, the flow.
More on this as the next tantalizing bit of the story unfolds, where we look at telephone vs. online self-service, and how the line is blurring.
Stupid Computer!
Too often, IVR systems get blamed for a poor user experience. The impulse to blame the user interface for a poor customer experience is common, and quick to gain traction (everyone’s got a favorite IVR-flaw story). Unfortunately, many people don’t take into account the type of device being used by the caller or the environment they are calling from. In general, IVR systems work very well when they are properly designed. With that being said, here are a few examples of devices and environments that are not well suited for IVR use. Enjoy.
My Wife has an IVR-killing Superpower
A couple of months ago we dropped our traditional landline and began using the AT&T U-verse voice service. Shortly after that I discovered my wife’s unusual superpower. It started innocently enough - we chat on the phone a few times during a day. That’s normal, right?
I should mention my wife is a multi-tasker and has a habit of resting the cordless phone on her shoulder as she types on a keyboard, rinses dishes, or performs an appendectomy. Often her cheek or chin would touch the keypad and voila, I would hear random touch tones.
After our switch to U-verse, I began thinking my wife was really leaning on the phone. I would hear two, three, or four tones in every call.
My complaints were met with firm denials and that’s when we figured it out: her voice triggers U-verse touch tones! She does not hear the tones. They are produced in the network in response to her voice. Is this an odd in-band signal problem?
It seems to be related to vowels; it’s not sensitive to volume, thankfully, so I am spared additional tones during our occasional debates about global warming or whether the right person was eliminated from “So You Think You Can Dance.”
Unfortunately, she has yet to master her superpower, so my requests to hear “Jingle Bells” or Iron Butterfly’s “In-A-Gadda-Da-Vida” go unfulfilled.
But as someone deeply involved in IVR services, I am struck by how her superpower can wipe out an IVR program. Imagine an application that allows a free-form recording of a consumer problem. In many cases, the recording can be stopped by pressing a touch-tone. Imagine the irritation generated when the consumer is recording a complaint and is interrupted by the IVR program as it prematurely moves to the next dialog after the recording.
Is this an isolated problem, or is this something to be expected in our new world of hybrid traditional and SIP telephony? Is it time to change the default value of the dtmfterm attribute (of the <record> tag) from “true” to “false”?
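To make that question concrete, here is a minimal Python sketch of the behavior described above. It is a toy model rather than VoiceXML, and the names (AudioEvent, recorded_seconds) and the timings are made up for illustration; the only thing taken from the discussion above is that a recording with dtmfterm set to true stops at the first touch-tone it hears.

```python
from dataclasses import dataclass


@dataclass
class AudioEvent:
    """One thing the platform hears during a recording (hypothetical model)."""
    at_seconds: float
    kind: str  # "speech" or "dtmf"


def recorded_seconds(events, caller_done_at, dtmfterm=True):
    """Return how much audio a simplified recorder would keep.

    With dtmfterm=True the first DTMF event ends the recording, whether the
    caller pressed a key or the network produced a phantom tone.
    With dtmfterm=False tones are ignored and the recording runs until the
    caller finishes speaking.
    """
    if dtmfterm:
        for event in sorted(events, key=lambda e: e.at_seconds):
            if event.kind == "dtmf" and event.at_seconds < caller_done_at:
                return event.at_seconds
    return caller_done_at


# A 30-second complaint with one phantom tone 4.2 seconds in (made-up timings).
complaint = [
    AudioEvent(0.0, "speech"),
    AudioEvent(4.2, "dtmf"),
    AudioEvent(4.3, "speech"),
]
print(recorded_seconds(complaint, caller_done_at=30.0, dtmfterm=True))
print(recorded_seconds(complaint, caller_done_at=30.0, dtmfterm=False))
```

In this toy model the tone-triggered version keeps only the first few seconds of the complaint. The trade-off with dtmfterm turned off is that callers can no longer end a recording with a key press, so the application would need another way to stop, such as a silence timeout.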
My wife’s talent is not yet listed on www.superpowerlist.com, but I am sure it will be soon. Next we need to work on a cool superhero name. Any ideas?
Big Brother is Listening To You?
A new patent has been awarded to Charles Humble of the National Institute for Truth Verification (NITV) that assigns numerical values to the stress levels experienced when lying, even using recorded speech. Read the entire article here. The NITV markets the Computer Voice Stress Analyzer (CVSA I and CVSA II), which purports to be 96%-98% accurate at discerning truth from fiction. Thus far, it has been marketed primarily to law enforcement and military intelligence agencies. It uses an algorithm to analyze and graph frequency modulations in unstructured speech. These graphs then display “positively” whether the person has lied in response to a question.
I remember when my apparently prescient mother used vibrations to test my own veracity. She would have me put my index finger in a bowl of water and answer her questions. If the water vibrated, I was lying. She swore by it, but my independent observations were that it was about 50/50, and easily manipulated. Ahem.
Other interesting ways of teasing out the truth include one near and dear to my heart—the magic donkey. (Why? See my portrait on the first blog–Jan 2008.)
…circa 500 B.C. in India. A priest put lampblack on the tail of a donkey in a dark room, and all suspects were to pull the magic donkey’s tail. They were told that when the thief pulled the tail, the donkey would speak and be heard throughout the temple. The person who did not pull the tail had clean hands and was pronounced the thief and punished.
As if all of this weren’t frightening enough to the average “little white liar”, a South Korean company claims to be able to identify real vs. fake emotion. An article in Cellular News, dated 09/26/2006, says: “Nemesysco’s leading technology is also powering KTF’s new ‘Love Detector’ service, which tells the caller the ‘love level’ of the person on the other end of the line every 10 seconds - so that subscribers can tell whether their loved ones share their feelings all through the conversation. Once the call is completed, the subscriber also receives a message ranking the overall level of affection, plus graphs that measure various attributes such as level of interest, attention, expectation, and embarrassment.”
Gives a whole new sensation of terror to the question “Does this dress make me look fat?”, doesn’t it? I wonder if they caught the irony in their company name…Nemesys aka Nemesis. I did!
Supposedly, there are no known countermeasures to the CVSA truth verification methodology. I’d be interested in knowing whether a skilled character actor could deliver lines convincingly enough to fool the system. I hope so. As much as I want the truth to win out over lies, there are situations where a half truth, a kind fiction, is a far, far better response than the cold, clinical, absolute truth. And I am sure that the marketing groups, political organizations and pundits of all flavors would agree wholeheartedly, eh?
The power of speech is unmistakable, inescapable. Its power for good and harm is real. Have we reached a place where we are orchestrating a version of Brave New World in which the privacy of our own mind and our heartfelt intentions are lost? What do you think? Speak up. And remember…Big Brother is listening…
Selecting your IVR host is like buying a car
Oftentimes when you set out to buy that shiny new or used car, things don’t go exactly as planned. In the same way, finding the right IVR hosting partner can be a tricky business. As with buying a car, if you walk into the IVR world without doing your research, you could get a lemon. It’s very important to understand some of the tricks that some IVR vendors use to get you to BUY NOW. Don’t get me wrong: not all IVR hosts are unscrupulous and not all are trying to take you to the cleaners. In fact, many hosts are really trustworthy. With that being said, the potential for a rip-off is still ever present. Some of the practices I have seen over the last eight years in this business justify some of the suspicions people have when it comes to IVR. If you catch a whiff of any of the practices below, it would be wise to be on the alert. Armed with a little IVR knowledge, you should be able to detect and avoid some common pitfalls. Here are a few car dealer tricks that are applicable to the IVR hosting world.
Trade In Value
IVR Industry: Old IVR applications are often pretty beat up (poor VUI design, not VXML compliant, voice recordings that sound like they were made in a Dr. Pepper can, etc.). Customers feel attached to these old systems because they paid a lot of money to have them built or even built the apps themselves. Many IVR hosts don’t provide professional services and aren’t interested in improving the performance of your application.
MTI: Despite a recent program introduced in the auto industry, we are not going to give you $3,500 because your current IVR system is a gas-guzzling road hog. What we will do is provide you with our expert VUI design and professional services team, with more than 20 years of experience in the industry. If your old application is great and it is VXML compliant, by all means bring it over and run it on our platform. If it’s not so great, our professional services team will work with your team to design a system that meets all of your needs. MTI has two models:
1. We turnkey the whole process and build the apps for you.
2. We allow our partners to build their own applications and host them on our platform.
Upgrades and extras
IVR Industry: Scalability, ASR and TTS resources are a few of the add-ons that will quickly blow any budget. Many hosts will charge per-use fees for features that should be included with your base service. Some “low-cost” hosts charge more for enterprise-grade technology but will let you use their in-house ASR or TTS for a lower price. Don’t settle for inferior technology or service.
MTI: Any enterprise interested in IVR should choose a hosting provider that is driven by performance and doesn’t nickel and dime you for basic services. MTI strives to become a true partner. We know it is in our best interest to create lasting relationships with our customers. We provide fair pricing that is easy to understand. It is our philosophy that customers deserve best-of-breed technology with full redundancy and automated failover, a 24/7 Network Operations Center, a disaster recovery plan, and the scalability to absorb spikes in call volume. Did I mention that your IVR host should be standards-based (VXML 2.0/2.1, SIP, etc.)? You wouldn’t buy a car that some guy pieced together from some parts he found at the junkyard, would you?
Bait And Switch (We’re giving away cars, it’s FREE, blowout sale, 12 months same as cash)
IVR Industry: Cars aren’t free, and neither is IVR. You may be able to sign up for free or you may be able to download a fancy platform that you can run in your basement, but just remember: you get what you pay for. Once you go for the bait, they set the hook. In the beginning you have this great little platform that takes ten calls a day and you don’t have any problems. Here is where they get you: when it comes time to deploy the application with real call volume, you suddenly realize that your free IVR platform does not scale, is not very reliable, and your monthly invoice goes from $39.95 to $3995.00. How did this happen? Well… when you signed up for your free or cheap account, you didn’t notice that those add-on fees and the per-port or per-minute rates were ridiculously high. These platforms are great for developers, not for enterprises. I could go on and on about this one, but the bottom line is: be careful.
MTI: We allow prospective customers to fully test their applications in our sandbox environment before they sign a service agreement. No hidden fees or tricks. Before you decide to become a customer of MTI, you will have a clear understanding of how the billing works and, if you know your call volumes, you will have a good idea of your monthly invoice.
Bottom line
If you are shopping for IVR, know your facts and know what you are looking for. If that is developing a partnership with a company that offers a great platform with exemplary service, you should give us a call. If you’re not careful when you choose your host, you may end up with Sal the IVR guy with his pinky ring and Velcro shoes.
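The jump in the monthly invoice is easy to reproduce with back-of-the-envelope arithmetic. The sketch below is only an illustration: the included minutes, the overage rate and the call volumes are invented numbers, not any real host’s price list, and monthly_invoice is a hypothetical helper. Only the $39.95 starting point comes from the story above.

```python
from decimal import Decimal


def monthly_invoice(base_fee, minutes_used, included_minutes, overage_rate):
    """Estimate one month's bill: flat fee plus per-minute overage charges."""
    overage_minutes = max(0, minutes_used - included_minutes)
    return base_fee + overage_rate * overage_minutes


# Hypothetical plan: $39.95 a month, 300 minutes included, $0.25 per extra minute.
plan = dict(base_fee=Decimal("39.95"), included_minutes=300,
            overage_rate=Decimal("0.25"))

pilot = monthly_invoice(minutes_used=280, **plan)         # a handful of test calls
production = monthly_invoice(minutes_used=16120, **plan)  # real call volume
print(pilot, production)
```

Running the same calculation with a prospective host’s actual rates and your own expected call minutes, before signing, is the quickest way to see whether a cheap plan stays cheap.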
Over-thinking the Cloud
Many of you have probably spent a few days, or possibly even a few weeks, recently lying on your back staring at the sky. With any luck there was just a sea of never-ending blueness that made you relaxed and happy to be at the beach/pool/lake/mountain top or wherever your vacation dreams reside. However, I suspect that during this period of unrelenting bliss, a few clouds showed up that made you think about the future of network computing… unless that was just me.
With all the talk of cloud computing, you will notice that the cloud typically used to depict the “Cloud” is a sizeable puffy one, fairly neatly formed and oval-ish. However, when you look at real clouds you will notice that they vary as much as people. No two clouds are alike. They all exhibit different personalities and traits.
During our relaxed state, the most welcome form of high-level cloud is not puffy at all. Known as “Cirrus” clouds, they are usually very thin and often wispy. Typically found at heights greater than 20,000 feet, Cirrus clouds are composed of ice crystals that originate from the freezing of super-cooled water droplets (I looked that up…). What we like about Cirrus clouds is that they generally occur in fair weather and don’t threaten to disturb our dreamy state of relaxation.
Those large, puffy “Internet” clouds, also known as Cumulus (yep…the ones that start to look like rabbits and puppies after we stare at them for a while) tend to be mid to low level clouds that turn a gray color when they become “Cumulonimbus”. This is because they are reaching their capacity and are expected to be overloaded soon. The sight of a large looming cumulonimbus will motivate us to start rolling up the beach towel and head to the hotel for an early happy hour. However, their slightly less impressive half-brothers “Stratocumulus” that generally appear as a low, lumpy layer of clouds with breaks of clear sky in between generally make us hold off on the exit strategy but do have a tendency to ruin the mood. As you can see the Cloud world is very rich and diverse.
So, what is the point of all this? Well, as more and more companies talk about their “Cloud” computing strategy, I think we should be asking ourselves exactly what type of cloud they are building and whether the term “Cloud” is really meaningful as the Internet develops. Is their cloud something large and looming that, as it grows, will eventually ruin our day, or something simple and unintrusive that is far away and not likely to have a big impact on our lives at all? It’s actually probably neither.
Unfortunately, as someone who starts to pace relentlessly when he loses Blackberry connectivity for even 5 minutes, I think we really need to develop an all-encompassing analogy that depicts something that will entirely envelop us and completely change the way we think about computing, the web, the phone, networks, technology, vacations, clouds and everything else.
I propose that network diagrams should no longer show the Internet as a “Cloud” off in the corner with lines coming out of it. They should start with a musty gray indeterminate background with all the stuff sitting right in the middle of the grayness, connecting not by clear lines, but via some ethereal mist-like form. I further suggest we no longer use the term “Cloud” computing, but start using “Fog” computing. Simple, brutal, cold and in your face.
That way we will be much more honest with the public at large and all the confusion will be completely eliminated.
Putting Out the Welcome Mat for Outbound Calls
Recently a fellow VUI designer, Caroline Leathem, asked for volunteers to take a survey on outbound contact perception. You can take it here.
In particular, she wanted callers from outside the U.K. to express their opinion on how they preferred to be contacted in a variety of situations. The options were: Text Message, Email, Letter, Phone Call from an Automated Service, Phone Call from an Onshore Agent and Phone Call from an Offshore Agent. The order wasn’t fixed, which I found a bit odd, as it made me wonder whether it was accidental or intended to shift the bias somewhat.
But I did notice that my knee-jerk reaction was to eschew the automated call more often than not, either by not choosing it at all, or by ranking it very low in my list of choices. This is downright peculiar, as we have written a number of applications that use Outbound calling to very good effect. And I genuinely do prefer the speed of dealing with an automated application when I am trying to get something done, without having to take the time for human interaction. This isn’t to say that I don’t enjoy a good conversation with a stranger, but not when I am task-focused. And perhaps, too, it is specific to the task. Arranging for a delivery or setting up an appointment really is easier to do with a person. But soliciting or providing information is generally easier to do with a machine.
In the informal surveys we have performed for our customer base, the reaction to our automated outbound call has been positive: the call recipients received just enough information, in a timely way, to act upon. Or they were able to respond and log their information when they needed to. I suspect at least some of the difference is that they were expecting the call, so that it wasn’t an intrusion. And the consistency of the interaction was comfortable over time.
Outbound calling has been getting steadily increasing attention in the news. Datamonitor predicts the market for hosted Outbound IVR services in North America will more than double from an estimated $213 million in 2008 to $524 million by 2013. They attribute the rapid growth to the economy:
“Outbound IVR applications are simpler than inbound IVR applications, and require less intelligence to determine end-user goals. Over the next few years, vendors will integrate outbound IVR applications with backend business logic and by taking advantage of all major customer touchpoints, notably email and SMS. The biggest factor that will influence the market for outbound IVR applications is the economy, due to which enterprises across all verticals have adapted ‘do more with less’ mantra.” The Rise of Outbound Applications in an Economic Recession (Strategic Focus)
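For a sense of scale, the two Datamonitor figures can be turned into an implied yearly growth rate. This is my own arithmetic on the numbers quoted above, not a figure from the report, and it assumes the growth compounds evenly over the five years.

```python
def compound_annual_growth_rate(start_value, end_value, years):
    """Return the constant yearly growth rate linking start_value to end_value."""
    return (end_value / start_value) ** (1 / years) - 1


# Datamonitor figures quoted above, in millions of US dollars.
rate = compound_annual_growth_rate(213, 524, years=2013 - 2008)
print(f"Implied growth: {rate:.1%} per year")
```

Under that even-compounding assumption, the forecast works out to a little under 20% growth per year.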
That being said, we need to prepare for more calls, both making and receiving them. It is important that we understand when and where outbound calls can fit into the ways we communicate, so that we don’t alienate the very people we want to reach out to and engage. More on this topic soon.
Cashing in on people with FAT FINGERS
In the IVR world we all know what a fat finger is. Just in case you haven’t heard the term before, it works like this: a caller dials a telephone number like 1-800-MYBANK1 to check his balance, but he fat fingers the number and accidentally dials 1-800-MYBANK2. Instead of checking his balance, the caller is now talking to some guy trying to sell him a timeshare on the Jersey shore. The term fat finger actually has nothing to do with the size of your index finger; it is basically a typo made on the telephone keypad. Oh I know, it’s probably not politically correct these days to use the term fat finger, so for the remainder of this blog I’ll try to clean it up and use terms such as differently weighted digit and massive dactyl.
Thanks to the geniuses running American Idol, massive dactyl will soon be more popular than bling amongst 13- to 22-year-olds. Here’s why: American Idol decided to increase the number of finalists this year from 12 to 13. American Idol only owns the sequence 1-866-IDOLS-01 through 1-866-IDOLS-12. Here is the problem: when Johnny calls to vote for his favorite singer (the 13th to appear) he doesn’t get to vote, but he does get to chat with a nice young lady who, for a minimal fee of $15.00 per minute, will gladly talk to him for the next few hours.
Shame on the telco team at American Idol. Most companies purchase banks of toll free numbers from the carriers, and someone at AI should have shelled out the extra $50 for fifty numbers instead of 12. Shame on the carrier, too. C’mon, you’ve got to know that a ton of people are going to accidentally dial the wrong number. Someone in the chain should have been smart enough to know that owning a few extra sequential numbers and the commonly mis-dialed numbers would be a good idea. Given the current economic situation, a lot of parents are going to be very upset at the end of the month when they find out that Johnny’s American Idol vote cost them the mortgage payment. There is a lot of blame to go around for this monumental oversight, but rest assured that some goofball is sitting in his basement watching the calls come in and planning his retirement.
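For anyone who wants to check their own number block, here is a small Python sketch of the two ideas in this post: what a vanity number turns into on the keypad, and which contestant slots fall outside the block a show owns. The letter layout is the standard telephone keypad; the function names are invented for this example.

```python
KEYPAD = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}
LETTER_TO_DIGIT = {
    letter: digit for digit, letters in KEYPAD.items() for letter in letters
}


def vanity_to_digits(number):
    """Convert a vanity number such as 1-800-MYBANK1 to the digits actually dialed."""
    dialed = []
    for char in number.upper():
        if char.isdigit():
            dialed.append(char)
        elif char in LETTER_TO_DIGIT:
            dialed.append(LETTER_TO_DIGIT[char])
        # Hyphens and spaces are not dialed, so they are skipped.
    return "".join(dialed)


def uncovered_contestants(finalists, numbers_owned):
    """List the contestant slots whose voting number the show does not own."""
    return [slot for slot in range(1, finalists + 1) if slot > numbers_owned]


print(vanity_to_digits("1-800-MYBANK1"), vanity_to_digits("1-800-MYBANK2"))
print(uncovered_contestants(finalists=13, numbers_owned=12))
```

The two bank numbers differ only in their final digit, which is all it takes for a slip of the finger to reach someone else’s line, and the second call shows the one slot (13) that lands outside the block the show owns.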
The W4’s of Search
When most people think about Search today, they simply “Google”. However, Yahoo is trying hard to catch up and, recognizing that the desktop war is over, is focusing its attention on the mobile platform. Their belief is that Search will become much more of a mobile application in the future and that a good Mobile Search will be much more than just typing in what you are looking for and seeing a list of results. At the Voice Search conference today, Yahoo presented the idea that Mobile Search involves the four “W’s”: What you want, Where you are, Who you are and When you want it. Yahoo has coined the term W4 to describe this concept to those of us who can’t keep up. Or, to be more technical, you are really looking at four data types concurrently when performing a mobile search: Topical, Spatial, Social and Temporal.
That being said, I think Google is already well down the path to W4 nirvana and Yahoo may be too late (again). Here’s why. As an avid Google Maps user I can attest to the fact that my desire to “map” before I embark on a road trip (i.e. on my desktop, and then print it out) has now been 100% supplanted by my ability to “Google Map” during the trip. Of course this presents its own set of challenges if I am driving. Also, I do find that Google’s sense of direction competes with my wife’s Garmin on a regular basis, resulting in lively discussions about which (i.e. who) is right. (The Garmin usually wins, of course!)
However, Google Maps is a great example of a real product that exhibits W3 and possibly W4-ness. If you think about it, I am combining Topical Data (What is my destination) with Spatial Data (Where do I go to get there). I have recently added Google Latitude to my mapping experience, which allows me to see where my wife and kids are at any time, assuming they have their devices on. So that’s Social Data, i.e. “Who”. So there’s W3 already. The only thing left to reach a state of W4-ness is Temporal (When). Is that hard to imagine? I can envisage that soon I will be able to ask Google Maps for “places to eat” and that it will only return those things that I could do at the time I asked. It will show me lunch spots at lunchtime and diners at breakfast. It may only show me ones that have been recommended by my Social Data base, if that’s what I want, and perhaps only those that are within walking distance if it has ascertained from my GPS that I am in fact walking.
If you assume that Mobile Search is the brave new world of Search, then the W4 concept is very compelling. The trick seems to be that the one with the most data will win. Can I see myself “Yahooing” from my mobile device at some point in the future… no, not really.
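The “places to eat” example above can be sketched in a few lines of Python to show what looking at four data types at once might mean in practice. Everything here is hypothetical: the Place record, the w4_search function and the sample data are invented for illustration and have nothing to do with how Yahoo or Google actually implement search.

```python
from dataclasses import dataclass, field


@dataclass
class Place:
    name: str
    category: str          # What: topical data
    miles_away: float      # Where: spatial data
    opens: int             # When: opening hour, 24-hour clock
    closes: int            # When: closing hour, 24-hour clock
    liked_by: set = field(default_factory=set)  # Who: social data


def w4_search(places, what, max_miles, friends, hour):
    """Keep only places that match on all four W's at once."""
    return [
        p.name for p in places
        if p.category == what                 # What you want
        and p.miles_away <= max_miles         # Where you are
        and p.liked_by & friends              # Who you are (your circle)
        and p.opens <= hour < p.closes        # When you want it
    ]


# Made-up sample data.
places = [
    Place("Sunrise Diner", "restaurant", 0.3, 6, 11, {"Ann"}),
    Place("Noon Bistro", "restaurant", 0.4, 11, 15, {"Ann", "Raj"}),
    Place("Far Grill", "restaurant", 12.0, 11, 22, {"Raj"}),
    Place("Corner Books", "bookstore", 0.2, 9, 18, {"Ann"}),
]
print(w4_search(places, "restaurant", max_miles=1.0, friends={"Ann"}, hour=12))
print(w4_search(places, "restaurant", max_miles=1.0, friends={"Ann"}, hour=8))
```

The same query returns a different answer at noon than at breakfast, which is the Temporal piece. The filter is only as good as the data behind each field, which is why the one with the most data has the edge.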
Sugar and Spice…Not!
Susan Hura wrote an interesting article for SpeechTech magazine discussing the critical requirement that a VUI designer really understand the business logic used in all of the different interaction channels offered by a client. It is worth reading for that discussion alone. She always has good insights. FYI, I myself am definitely in the behavioral science camp, in case you were wondering. Anyway, she ended with a reference to an earlier article describing the reaction to two differently worded prompts intended to take the caller to an agent:
“…The only difference between the prompts was wording and how each influenced users’ perceptions of why they were being asked to make a selection. For the first prompt (OK, I can transfer you to an agent after you make a selection), we noted a very odd pattern: Many users selected what we knew to be low-frequency options, randomly responding with the last option they heard before requesting an agent. The reason behind this odd pattern became clear once we studied responses to the second prompt (OK, I’ll get you to an agent, but first please tell me if you need help with A, B, C, or D). Users in this case successfully and happily made appropriate selections and were almost always routed correctly; this prompt obviously motivates users to make a good choice because there is a direct benefit to them. The same benefit exists for the first prompt, but the wording makes the selection seem like just another hoop they must jump through for the sake of the automated system.”
She concludes that promising a reward for good behavior, in her words “cajoling to cooperate”, increases caller buy-in with an automated system, causing them “…to view automation as a tool for accomplishing their goals rather than as a barrier between them and a live agent.” And sure, that is our goal.
But I beg to differ. I am not convinced that it is as fuzzy as all of that. Clear direction isn’t necessarily cajoling. Setting the caller’s expectation as “if x, then y” is always better than “if some x or another, then I can y.” I would posit that the indeterminate gray area caused the random option selection, not some subtle manipulative coercion with sweet nothings, I mean, wording. What was missing in the first prompt was the implication that the caller would be routed correctly to the agent that could best help them.
Note the emotive context that Susan used to describe the caller’s reaction: “successfully and happily”, jumping through hoops for the app. When I use a tool, I rarely care one way or the other about it unless it doesn’t perform as expected. My good friend Leslie, a Public Defender in N. GA, once explained to me that a computer was no different than a toaster. She wanted to turn it on and use it without giving it any thought. Press the buttons, and words come out on the screen. No more, no less. No interference between her and her goal. Pop! Here is my toast, hot and ready for butter and marmalade.
I believe that the VUI should do the same thing: act like a well-designed tool that assists you in accomplishing your task without getting in the way or adding to the experience at all, one way or the other. Transparency.
For what it’s worth, this is a cute bit on another toaster. But I digress.