Where Do We Go From Here?
Recently I have been following a lively discussion on the VUID group, titled “Are we the orchestra on the Titanic?” It began with a post from the inimitable Phillip Hunter containing a link to Robert Fortner’s blog post, “Rest in Peas: The Unrecognized Death of Speech Recognition.”
His premise is that the progress of automated speech recognition has peaked and is well along a slow decline into obsolescence. Fortner cites various studies showing that the core technologies were able to take advantage of the enormous gains in hardware capability and speed, as well as the tremendous amount of text content available for semantic analysis. And then progress stalled, unable to get past about 80% accuracy regardless of the horsepower or the data the recognizers are fed.
Our VUID discussion has focused more on the role that we play in augmenting and assisting the ASR model. We use constrained grammars and carefully crafted dialogs to drive the telephone conversation down the path that provides the most efficient way to service the customer. All useful tools have inherent limitations; the masters work with them, and around them. Why would we consider ASR any different?
As people come to rely more and more on self-service applications with such speech-enabled devices as the OnStar car safety service, a host of new GPS units, Google’s Android Nexus One and other smartphones, they will train themselves to work within those limitations as well. Even something as simple and intuitive as moving to a quieter location before beginning an interaction, or remembering not to try to carry on two conversations, one with the IVR and the other with a store clerk, will make a huge difference. And I suspect that recognition will take another small leap, not necessarily because the technology improves, but because it is being used with, rather than against, the flow.
More on this as the next tantalizing bit of the story unfolds, where we look at telephone vs. online self-service, and how the line is blurring.
Big Brother is Listening To You?
A new patent has been awarded to Charles Humble of the National Institute for Truth Verification (NITV) that assigns numerical values to the stress levels experienced when lying, even using recorded speech. Read the entire article here. The NITV markets the Computer Voice Stress Analyzer (CVSA I and CVSA II), which purports to be 96%-98% accurate at discerning truth from fiction. It has been marketed primarily to law enforcement and military intelligence agencies thus far. It uses an algorithm to analyze and graph frequency modulations in unstructured speech. These graphs then display “positively” whether the person has lied in response to a question.
I remember when my apparently prescient mother used vibrations to test my own veracity. She would have me put my index finger in a bowl of water and answer her questions. If the water vibrated, I was lying. She swore by it, but my independent observations were that it was about 50/50—and easily manipulated. Ahem.
Other interesting ways of teasing out the truth include one near and dear to my heart—the magic donkey. (Why? See my portrait on the first blog–Jan 2008.)
…circa 500 B.C. in India. A priest put lampblack on the tail of a donkey in a dark room and all suspects were to pull the magic donkey’s tail. They were told that when the one who was the thief pulled the magic donkey’s tail, he would speak and be heard throughout the temple. The person who did not pull the tail had clean hands and was pronounced the thief and punished.
As if all of this weren’t frightening enough to the average “little white liar”, a South Korean company claims to be able to identify real vs. fake emotion. An article in Cellular News, dated 09/26/2006, says: “Nemesysco’s leading technology is also powering KTF’s new ‘Love Detector’ service, which tells the caller the ‘love level’ of the person on the other end of the line every 10 seconds - so that subscribers can tell whether their loved ones share their feelings all through the conversation. Once the call is completed, the subscriber also receives a message ranking the overall level of affection, plus graphs that measure various attributes such as level of interest, attention, expectation, and embarrassment.”
Gives a whole new sensation of terror to the question “Does this dress make me look fat?”, doesn’t it? I wonder if they caught the irony in their company name…Nemesys aka Nemesis. I did!
Supposedly, there are no known countermeasures to the CVSA truth verification methodology. I’d be interested in knowing whether a skilled character actor could deliver lines convincingly enough to fool the system. I hope so. As much as I want the truth to win out over lies, there are situations where a half truth, a kind fiction, is a far, far better response than the cold, clinical, absolute truth. And I am sure that the marketing groups, political organizations and pundits of all flavors would agree wholeheartedly, eh?
The power of speech is unmistakable, inescapable. Its power for good and harm is real. Have we reached a place where we are orchestrating a version of Brave New World in which the privacy of our own mind and our heartfelt intentions are lost? What do you think? Speak up. And remember…Big Brother is listening…
Putting Out the Welcome Mat for Outbound Calls
Recently a fellow VUI designer, Caroline Leathem, asked for volunteers to take a survey on outbound contact perception. You can take it here.
In particular, she wanted callers from outside the U.K. to express their opinion on how they preferred to be contacted in a variety of situations. The options were: Text Message, Email, Letter, Phone Call from an Automated Service, Phone Call from an Onshore Agent and Phone Call from an Offshore Agent. The order wasn’t fixed, which I found a bit odd, as it made me wonder whether it was accidental or intended to shift the bias somewhat.
But I did notice that my knee-jerk reaction was to eschew the automated call more often than not, either by not choosing it at all, or by ranking it very low in my list of choices. This is downright peculiar, as we have written a number of applications that use outbound calling to very good effect. And I genuinely do prefer the speed of dealing with an automated application when I am trying to get something done, without having to take the time for human interaction. This isn’t to say that I don’t enjoy a good conversation with a stranger, but not when I am task-focused. And perhaps, too, it is specific to the task. Arranging for a delivery or setting up an appointment really is easier to do with a person. But soliciting or providing information is generally easier to do with a machine.
In the informal surveys we have performed for our customer base, the reaction to our automated outbound calls has been positive—the call recipients received just enough information, in a timely way, to act upon. Or they were able to respond and log their information when they needed to. I suspect at least some of the difference is that they were expecting the call, so that it wasn’t an intrusion. And the consistency of the interaction was comfortable over time.
Outbound calling has been getting steadily increasing attention in the news. Datamonitor predicts the market for hosted Outbound IVR services in North America will more than double from an estimated $213 million in 2008 to $524 million by 2013. They attribute the rapid growth to the economy. “Outbound IVR applications are simpler than inbound IVR applications, and require less intelligence to determine end-user goals. Over the next few years, vendors will integrate outbound IVR applications with backend business logic and by taking advantage of all major customer touchpoints, notably email and SMS. The biggest factor that will influence the market for outbound IVR applications is the economy, due to which enterprises across all verticals have adapted “do more with less” mantra.” The Rise of Outbound Applications in an Economic Recession (Strategic Focus)
That being said, we need to prepare for more calls, both making and receiving them. It is important that we understand when and where outbound calls can fit into the ways we communicate, so that we don’t alienate the very people we want to reach out to and engage. More on this topic soon.
Sugar and Spice…Not!
Susan Hura wrote an interesting article for SpeechTech magazine discussing the critical requirement that a VUI designer really understand the business logic used in all of the different interaction channels offered by a client. It is worth reading just for that discussion alone. She always has good insights. FYI, I myself am definitely in the behavioral science camp, in case you were wondering. Anyway, she ended with a reference to an earlier article describing the reaction to two differently worded prompts intended to take the caller to an agent.
“…The only difference between the prompts was wording and how each influenced users’ perceptions of why they were being asked to make a selection. For the first prompt (OK, I can transfer you to an agent after you make a selection), we noted a very odd pattern: Many users selected what we knew to be low-frequency options, randomly responding with the last option they heard before requesting an agent. The reason behind this odd pattern became clear once we studied responses to the second prompt (OK, I’ll get you to an agent, but first please tell me if you need help with A, B, C, or D). Users in this case successfully and happily made appropriate selections and were almost always routed correctly; this prompt obviously motivates users to make a good choice because there is a direct benefit to them. The same benefit exists for the first prompt, but the wording makes the selection seem like just another hoop they must jump through for the sake of the automated system.”
She concludes that promising a reward for good behavior, in her words, “cajoling to cooperate”, increases caller buy-in with an automated system, causing them “…to view automation as a tool for accomplishing their goals rather than as a barrier between them and a live agent.” And sure, that is our goal. But I beg to differ. I am not convinced that it is as fuzzy as all of that. Clear direction isn’t necessarily cajoling.
Setting the expectation of the caller that “if x, then y” is always better than “if some x or another, then I can y.” I would posit that the indeterminate gray area caused the random option selection, not some subtle manipulative coercion with sweet nothings, I mean, wording. What was missing in the first prompt was the implication that the caller would be routed correctly to the agent that could best help them. Note the emotive language that Susan used to describe the caller’s reaction: “successfully and happily” making selections versus jumping through hoops for the app. When I use a tool, I rarely care one way or the other about it unless it doesn’t perform as expected.
My good friend Leslie, a Public Defender in N. GA, once explained to me that a computer was no different than a toaster. She wanted to turn it on and use it without giving it any thought. Press the buttons, and words come out on the screen. No more, no less. No interference between her and her goal. Pop! Here is my toast, hot and ready for butter and marmalade. I believe that the VUI should do the same thing: act like a well-designed tool that assists you in accomplishing your task without getting in the way or adding to the experience at all, one way or the other. Transparency.
For what it’s worth, this is a cute bit on another toaster. But I digress.
Mirabile Dictu
Mirabile dictu–this is Latin for “wonderful to relate, miraculous”. And those were the words that sprang into my head last week when listening to NPR’s All Things Considered on the way home. Robert Siegel was interviewing Mark Blumenthal on the historical accuracy of pollsters as reported on his web site, www.pollster.com. Siegel: “…a website that does for political polling what MLB.com does for baseball statistics. Let’s start with the national popular vote. It was 52% Obama 46% for McCain. Which pre-election day poll or couple of polls came the closest?” To hear the entire interview, follow this link:
What an astonishing endorsement from a professional statistician—that an automated survey compares not only favorably, but is described as being as accurate as a live survey! At MTI, we have seen a lot of positive response to virtual agents, virtual nurses, and even virtual fund raisers. But this is the first time that I have heard a “civilian” tout the evidence that we have seen in our analysis: in the right circumstances, virtual is as good as real. It makes a designer proud.
Cacophony
Ah…SpeechTek in New York City amid the frenetic activity of hustlers and buskers, business suits and Bermuda-shirted tourists of every size, shape and nationality imaginable. What a backdrop for a convention dedicated to trends and technology for Speech IVR applications! The noise quotient roars in the background—cars and trucks and ambulances and construction and people. It is a voice designer’s nightmare. All of this din competing with the caller’s voice for the hapless speech recognizer to filter, parse and manage to return a reasonable response. And yet, that is the focus of this conference—how to reach and engage the new mobile user—the customer of our future. And this includes dealing with these very conditions. So what seemed dichotomous proves to be a perfect setting.
How will we serve this new breed? Our industry is maturing; the rush for better and better technology is evolving to a new focus on how to do what we do better—how to target and respond in new ways to the challenge of communicating with and selling to these tech-savvy, in-a-hurry, high-energy and strongly individualistic people. Turn the clock back and we see that, in choosing machine over personal service, we have taken away the very reason that we need and want to interact with each other as customer and vendor. Once upon a time, when you wanted to speak with someone about something, you went to them, and asked them your question. If they were not available, there was someone there who knew where they were, when they would return, and who would very likely be able to answer your question or address your concern for you—personal contact, in a direct context.
In today’s IVR environment, we have a one-size-fits-all response through which the caller must wade step by step, without regard for what she actually needs or wants. She must sit impatiently through menus, marketing messages and jargon before selecting the option of interest. What a waste!
In the brave new IVR of the future, we want to see a return to the personal service that comes from an understanding of the caller–using caller history, reverse ANI matching and demographic information behind the scenes to offer pertinent options and messages, and then displaying them in a multimodal way that allows the caller to respond and track that information in a way that is meaningful and useful for them. The message is “It is all about me. I am The Customer!” That is the voice that rises above all the sound and fury here in New York.
SpeechCycle Reaches 50,000,000 Automated Interactions
Just wanted to tout an article from Call Centre Clinic featuring one of our partners, SpeechCycle. They have hosted with us since 2002. The technique for which they credit their astounding success in automation is called delegation. Below is an excerpt from Roberto’s recent blog.
The reason for this dramatic increase in automation performance is simple, and is called “delegation”. Rather than having callers perform certain operations or provide certain pieces of information, the dialog system delegates other systems and information repositories to do that. We like to say: “The best question is a question not asked,” to stress on the fact that if there are other ways to collect some pieces of information other that asking the caller, that should be done. By delegating the collection of information or the performing of some actions to external enterprise backends rather than to the caller would lead to better interaction experience and higher automation. Read the whole blog…
It just makes common sense to use the tools you have: reverse ANI lookup, databases with customer information and configurations, and, when you do need to put a call through to an agent, CTI to pass in any information you have gathered to streamline the call and enhance the user experience. Some of our customers have balked at the additional cost, but when you can quantify the cost of an unhappy caller anecdotally or graphically they usually come around.
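For the programmers in the audience, the delegation idea is easy to picture in code. What follows is only a minimal sketch in Python, and it is not SpeechCycle's implementation or ours. The names `CUSTOMER_DB`, `lookup_by_ani` and `collect_slots`, along with the sample phone number and fields, are all made up for illustration. The point is simply that the dialog checks what the backend already knows before it asks the caller anything.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical customer records keyed by the caller's number (ANI).
# A real deployment would query a CRM or billing backend instead.
CUSTOMER_DB: Dict[str, Dict[str, str]] = {
    "4045550100": {"account_id": "A-1001", "zip_code": "30303"},
}


def lookup_by_ani(ani: str) -> Dict[str, str]:
    """Reverse ANI lookup: return a copy of what we already know."""
    return dict(CUSTOMER_DB.get(ani, {}))


def collect_slots(
    ani: str,
    required: List[str],
    ask: Callable[[str], str],
) -> Tuple[Dict[str, str], List[str]]:
    """Fill the required fields, asking the caller only for what is missing.

    Returns the filled fields plus the list of questions actually asked.
    """
    known = lookup_by_ani(ani)
    asked: List[str] = []
    for slot in required:
        if slot not in known:
            known[slot] = ask(slot)  # fall back to a prompt only when needed
            asked.append(slot)
    return known, asked


if __name__ == "__main__":
    # Stand-in for a speech prompt; a real app would play audio and recognize.
    def fake_prompt(slot: str) -> str:
        return "billing" if slot == "reason" else "unknown"

    slots, asked = collect_slots(
        "4045550100", ["account_id", "zip_code", "reason"], fake_prompt
    )
    print(asked)  # only the fields the backend could not supply
    print(slots)  # the kind of data CTI could hand to the agent
```

For a caller the system recognizes, only the reason for the call gets asked. The same dictionary is the sort of thing you would pass along via CTI if the call does end up with an agent, so the agent never has to ask for it again.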
Cowboys vs. Farmers
One of my very favorite customers (that’s you, Carl) asked that I return to an earlier post and offer some suggestions to replace the list of 10 things that should never again be spoken in an IVR application. Well, I will get to that highly contentious debate one of these days, but today I want to share an excerpt of Bruce Balentine’s excellent summary on the VUI design and gethuman.com war of words.
“…the gethuman effort is really about the proposition that an IVR and a call center agent together represent a single *system* that has one goal — delivering service to a caller. I think everyone agrees with that. Gethuman is generally right about the spec. This vuids group is right to be a bit incensed that non-designers can or should specify exactly *how* those specs should be rendered. If designers can agree to accept specs from non-vuids people, then the spec people should be willing to capitulate on details of the design itself. Then there would be a larger community with one common goal — delivering service to callers.” Read the entire post.
His point was that functional specification and the design principles need to be segmented. They are a complementary pair—two mules pulling the wagon. Form needs to mold function to get the job done in the best possible way. That’s right. The cowboys and the farmers can be friends.
GetHuman loses its founder, and its focus
I see that Paul English, of GetHuman.com fame, or perhaps notoriety is a better term, is throwing in the towel. Remember him? Only a few years ago he led the angry mob in a fight against the gnarly evils of telephone automation, providing a list of ways to get around the IVR and to a human agent. Today he is apparently too busy to continue to champion the movement he started, and has turned it over to Walt Tetschner, a self-styled ASR specialist and industry curmudgeon. Walt publishes an online newsletter with slightly whimsical pans and plugs of IVR applications, as well as well-researched articles on events and trends in the speech technology arena.
I first ran across him when I was reeling from an encounter of the hideous kind using a Social Security Administration self-service application. Hapless me, I just wanted to find out how to change the name on my Social Security record to my married name. After 20 minutes of fumbled repeated attempts, I gave up and drove 45 minutes to the nearest office. It was a waste of my time and energy. And such frustration! I am well schooled in my IVR responses. They are crisp and without disfluencies. But I was stuck in a revolving nightmare of broken steps, recursive paths, illogical phrasing and overwhelming bureaucratic traps. Walt gave it a less stinging review than I would have, but overall, had the same negative perception of the experience that I did.
Since then I have read Walt’s posts in various forums. It will be interesting to see what he does with GetHuman, and whether his vinegar rather than honey approach invigorates or alienates the VUI design standard movement.
What Lies Beneath
New and old VUI designers alike are always looking for tips on how to improve their scripting. As with any other field of endeavor, there are conflicting opinions; dissonance and debate abound. In my experience, we are passionate in our arguments, intense in our rationalizations. Design isn’t just a dry, analytical laying out of the prompts; it is an emotional interweaving of technique and form, nuance and balance. I like being a part of such a group, artists working their magic, taking words and sound and crafting a personal interaction with the caller–Geppetto creating Pinocchio, a real boy.
This, of course, requires two things—that the customer allows free and open discussion and implementation of the information given and paths chosen, and that we as designers keep our hearts and minds open to the evolving sophistication and needs of our target population. On the customer end, there has to be give and take between the demands of marketing, branding, business requirements and usability. I’ll say it outright; there needs to be far more giving and far less taking than usually happens. VUI designers are often brought in after the initial requirements gathering has happened. Well-meaning folks with expertise in other specialties within the customer’s company have already laid out scripting rules and language based on experience gleaned from bad or banal interfaces in the past, ensuring that more such experiences follow for the rest of us. Like lemmings, we are forced to continue that flight over the cliff of bad decisions into the sea of bad design.
And honestly, we ourselves have gotten into the ill-conceived habit of using these same tired gambits over and over. Knowing so much better, we are yet the worst offenders–whether by sin of commission or omission–we let ourselves be drawn down the paths of convenience, conformity, laziness and acquiescence. Bruce Balentine of EIG posted the following in the Yahoo VUID group on 02/04/2008.
He points the finger directly, and appropriately, at us.
“…I ascribe it to the somewhat small population of companies and people doing the implementation work. Since everyone used to work for someone else and “this is the way we did it then,” these kinds of ideas get inbred and then become dogma. It’s a kind of “convergence to a local minimum” like in neural networks or quantum systems. It takes energy to tunnel back out once we’ve converged. I think the same thing is true of those unhelpful recovery techniques that continue to persist — “I didn’t recognize that, I didn’t get that, I didn’t hear you;” — and the exclamatory grounding expressions, “Got it!” and “Great!” What happens is that everyone’s ear becomes accustomed to the sound of a given solution, and in the absence of any rigorous debate or viable alternative, it becomes “comfortable” and subsequently “invisible” to the design team’s ears. “This is just how these things sound, and we used to work for XYZ so we know best by definition, and these other proposed solutions sound a little “weird” or offputting — they couldn’t possibly be an improvement.” So our designs converge to a local minimum and it’s very hard to tunnel out…”
That same Yahoo VUID group has been grousing over these issues of late. Some of them I have been guilty of myself, shamefully. We have compiled a list of phrases never to be heard in modern, professional interfaces again. Let’s band together and make it happen, I say!
1. Please listen carefully as our menu options have changed.
2. For more information, please see our website at www.whatever.com.
3. Your call may be recorded for quality assurance purposes.
4. My name is Beth, your virtual agent.
5. Press 1 for English.
6. You can speak or press your answers to each question.
7. Sales pitches.
8. Menu options that go on and on.
9. Lengthy legal disclaimers.
10. It’s my fault, I’m sorry.
And I am sure you can think of more offenders.
When you are writing your script, eliminate the non-informational bits that interfere with the primary aim of the caller: to accomplish the task in the shortest possible, easiest way that she can. The virtue of automation is tarnished by embellishment. Self-service, like the drive-in window, should be fast, efficient and painless.