Where Do We Go From Here?

  Posted by Laura Chumley on May 20, 2010

Recently I have been following a lively discussion on the VUID group, titled “Are we the orchestra on the Titanic?” It began with a post from the inimitable Phillip Hunter containing a link to Robert Fortner’s blog: Rest in Peas: The Unrecognized Death of Speech Recognition

His premise is that the progress of automated speech recognition has peaked, and is well along the slow decline into obsolescence. Fortner cites various studies showing that the core technologies were able to take advantage of the enormous gains in hardware capability and speed, as well as the tremendous amount of text content available for semantic analysis. And then it stalled, unable to get past about 80% accuracy regardless of the horsepower or the data it is fed.

Our VUID discussion has focused more on the role that we play in augmenting and assisting the ASR model. We use constrained grammars and carefully crafted dialogs to drive the telephone conversation down the path that provides the most efficient way to service the customer. All useful tools have inherent limitations; the masters work with them, and around them. Why would we consider ASR any different?

As people come to rely more and more on self-service applications with such speech-enabled devices as the OnStar car safety service, a host of the new GPS units, Google’s Android Nexus One and other mobile smartphones, they will train themselves to work within those limitations as well. Even something as simple and intuitive as moving to a quieter location before beginning an interaction, or remembering not to carry on two conversations at once, one with the IVR and the other with a store clerk, will make a huge difference. And I suspect that recognition will take another small leap, not necessarily because the technology improves, but because it is being used with, rather than against, the flow.

More on this, as the next tantalizing bit of the story unfolds, where we look at telephone vs. online self service, and how the line is blurring.

Putting Out the Welcome Mat for Outbound Calls

  Posted by Laura Chumley on July 17, 2009

Recently a fellow VUI designer, Caroline Leathem, asked for volunteers to take a survey on outbound contact perception. You can take it here.

In particular, she wanted callers from outside the U.K. to express their opinion on how they preferred to be contacted in a variety of situations. The options were: Text Message, Email, Letter, Phone Call from an Automated Service, Phone Call from an Onshore Agent and Phone Call from an Offshore Agent. The order wasn’t fixed, which I found a bit odd, as it made me wonder whether it was accidental or intended to shift the bias somewhat.

But I did notice that my knee-jerk reaction was to eschew the automated call more often than not, either by not choosing it at all, or by ranking it very low in my list of choices. This is downright peculiar, as we have written a number of applications that use outbound calling to very good effect. And I genuinely do prefer the speed of dealing with an automated application when I am trying to get something done, without having to take the time for human interaction. This isn’t to say that I don’t enjoy a good conversation with a stranger, but not when I am task-focused. And perhaps, too, it is specific to the task. Arranging for a delivery or setting up an appointment really is easier to do with a person. But soliciting or providing information is generally easier to do with a machine.

In the informal surveys we have performed for our customer base, the reaction to our automated outbound call has been positive—the call recipients received just enough information in a timely way to act upon. Or they were able to respond to and log their information as needed when they needed to. I suspect at least some of the difference is that they were expecting the call, so that it wasn’t an intrusion. And the consistency of the interaction was comfortable over time.

Outbound calling has been getting steadily increasing attention in the news. Datamonitor predicts the market for hosted Outbound IVR services in North America will more than double from an estimated $213 million in 2008 to $524 million by 2013. They attribute the rapid growth to the economy.

“Outbound IVR applications are simpler than inbound IVR applications, and require less intelligence to determine end-user goals. Over the next few years, vendors will integrate outbound IVR applications with backend business logic and by taking advantage of all major customer touchpoints, notably email and SMS.

The biggest factor that will influence the market for outbound IVR applications is the economy, due to which enterprises across all verticals have adapted “do more with less” mantra.” The Rise of Outbound Applications in an Economic Recession (Strategic Focus)

That being said, we need to prepare for more calls, both making and receiving them. It is important that we understand when and where outbound calls can fit into the ways we communicate, so that we don’t alienate the very people we want to reach out to and engage.  More on this topic soon.

Sugar and Spice…Not!

  Posted by Laura Chumley on December 18, 2008

Susan Hura wrote an interesting article for SpeechTech magazine discussing the critical requirement that a VUI designer really understands the business logic used in all of the different interaction channels offered by a client. It is worth reading just for that discussion alone. She always has good insights. FYI, I myself am definitely in the behavioral science camp, in case you were wondering. Anyway, she ended with a reference to an earlier article describing the reaction to two differently worded prompts intended to take the caller to an agent.

…”The only difference between the prompts was wording and how each influenced users’ perceptions of why they were being asked to make a selection. For the first prompt (OK, I can transfer you to an agent after you make a selection), we noted a very odd pattern: Many users selected what we knew to be low-frequency options, randomly responding with the last option they heard before requesting an agent. The reason behind this odd pattern became clear once we studied responses to the second prompt (OK, I’ll get you to an agent, but first please tell me if you need help with A, B, C, or D). Users in this case successfully and happily made appropriate selections and were almost always routed correctly; this prompt obviously motivates users to make a good choice because there is a direct benefit to them. The same benefit exists for the first prompt, but the wording makes the selection seem like just another hoop they must jump through for the sake of the automated system.”

She concludes that promising a reward for good behavior, in her words, “cajoling to cooperate”, increases caller buy in with an automated system, causing them “…to view automation as a tool for accomplishing their goals rather than as a barrier between them and a live agent.” And sure, that is our goal. But I beg to differ. I am not convinced that it is as fuzzy as all of that. Clear direction isn’t necessarily cajoling. Setting the expectation of the caller that: if x then y, is always better than: if some x or another then I can y. I would posit that the indeterminate gray area caused the random option selection, not some subtle manipulative coercion with sweet nothings, I mean, wording.

What was missing in the first prompt was the implication that the caller would be routed correctly to the agent that could best help them. Note the emotive context that Susan used to describe the caller’s reaction—“successfully and happily”, jump through the hoops for the app—when I use a tool, I rarely care one way or the other about it unless it doesn’t perform as expected.

My good friend Leslie, a Public Defender in N. GA, once explained to me that a computer was no different than a toaster. She wanted to turn it on and use it without giving it any thought. Press the buttons, and words come out on the screen. No more, no less. No interference between her and her goal. Pop! Here is my toast, hot and ready for butter and marmalade.

I believe that the VUI should do the same thing. Act like a well designed tool that assists you in accomplishing your task without getting in the way or adding to the experience at all one way or the other.

Transparency.


1 From http://ilovemytoaster.com/

For what it’s worth, this is a cute bit on another toaster. But I digress.

Mirabile Dictu

  Posted by Laura Chumley on November 17, 2008

Mirabile dictu–this is Latin for “wonderful to relate, miraculous”. And those were the words that sprang into my head last week when listening to NPR’s All Things Considered on the way home.

Robert Siegel was interviewing Mark Blumenthal on the historical accuracy of pollsters as reported on his web site, www.pollster.com.

Siegel: “…a website that does for political polling what MLB.com does for baseball statistics. Let’s start with the national popular vote. It was 52% Obama 46% for McCain. Which pre-election day poll or couple of polls came the closest?”
Blumenthal: “Well, let me answer it two ways. If the margin ends up being six points, there were two national surveys that had that exactly right. The one from the Pew Research Center, on one extreme, a highly regarded traditional polling method and on the other extreme, the Rasmussen Reports automated survey.”
Siegel: “That is a robo poll, right?”
Blumenthal: “Some call a robo poll.”

Siegel: “What does it say if two of the very best polls getting it right in the end are, on the one hand, as you say, the Pew Research Center, a gold plated poll that we hear about very often on this program and on the other hand, the Rasmussen automated telephone surveys which people might look very skeptically at. Are there questions to be raised about methodology?”

Blumenthal: “I think it says that we have been maybe a little too hard on the researchers that don’t use a live interviewer. I think that the automated surveys have proved themselves to give us as good a picture of the horse race at the end as those that use live interviewers.”

To hear the entire interview, follow this link:
http://www.npr.org/templates/story/story.php?storyId=96670546

What an astonishing endorsement from a professional statistician: an automated survey not only compares favorably with a live survey, but is described as being just as accurate!

At MTI, we have seen a lot of positive response to virtual agents, virtual nurses, and even virtual fund raisers. But this is the first time that I have heard a “civilian” tout the evidence that we have seen in our analysis: in the right circumstances, virtual is as good as real.

It makes a designer proud.

Cacophony

  Posted by Laura Chumley on August 19, 2008

Ah…SpeechTek in New York City amid the frenetic activity of hustlers and buskers, business suits and Bermuda shirted tourists of every size, shape and nationality imaginable. What a backdrop for a convention dedicated to trends and technology for Speech IVR applications!

The noise quotient roars in the background—cars and trucks and ambulances and construction and people. It is a voice designer’s nightmare. All of this din competing with the caller’s voice for the hapless speech recognizer to filter, parse and manage to return a reasonable response. And yet, that is the focus of this conference—how to reach and engage the new mobile user—the customer of our future. And this includes dealing with these very conditions. So what seemed dichotomous proves to be a perfect setting.

How will we serve this new breed? Our industry is maturing; the rush for better and better technology is evolving into a new focus on how to do what we do better: how to target and respond in new ways to the challenge of communicating with and selling to these tech-savvy, in-a-hurry, high-energy and strongly individualistic people. Turning the clock back, we can see that in choosing machine over personal service, we have taken away the very reason that we need and want to interact with each other as customer and vendor.

Once upon a time, when you wanted to speak with someone about something, you went to them, and asked them your question. If they were not available, there was someone there who knew where they were, when they would return, and very likely, be able to answer your question or address your concern for you—personal contact, in a direct context.

In today’s IVR environment, we have a one size fits all response through which the caller must wade step by step, without regard for what she actually needs or wants. She must sit impatiently through menus, marketing messages and jargon before selecting the option of interest. What a waste!

In the brave new IVR of the future, we want to see a return to the personal service that comes from an understanding of the caller–using caller history, reverse ANI matching and demographic information behind the scenes to offer pertinent options and messages, and then displaying them in a multimodal way that allows the caller to respond and track that information in a way that is meaningful and useful for them. The message is “It is all about me. I am The Customer!” That is the voice that rises above all the sound and fury here in New York.

What Lies Beneath

  Posted by Laura Chumley on February 17, 2008

New and old VUI designers alike are always looking for tips on how to improve their scripting. As with any other field of endeavor, there are conflicting opinions; dissonance and debate abound. In my experience, we are passionate in our arguments, intense in our rationalizations. Design isn’t just a dry, analytical laying out of the prompts; it is an emotional interweaving of technique and form, nuance and balance. I like being a part of such a group, artists working their magic, taking words and sound and crafting a personal interaction with the caller–Geppetto creating Pinocchio, a real boy.

This, of course, requires two things—that the customer allows free and open discussion and implementation of the information given and paths chosen, and that we as designers keep our hearts and minds open to the evolving sophistication and needs of our target population.

On the customer end, there has to be give and take between the demands of marketing, branding, business requirements and usability. I’ll say it outright; there needs to be far more giving and far less taking than usually happens. VUI designers are often brought in after the initial requirements gathering has happened. Well meaning folks with expertise in other specialties within the customer’s company have already laid out scripting rules and language based on experience gleaned from bad or banal interfaces in the past, ensuring that more such experiences follow for the rest of us. Like lemmings, we are forced to continue that flight over the cliff of bad decisions into the sea of bad design.

And honestly, we ourselves have gotten into the ill-conceived habit of using these same tired gambits over and over. Knowing so much better, we are yet the worst offenders. Whether by sin of commission or omission, we let ourselves be drawn down the paths of convenience, conformity, laziness and acquiescence.

Bruce Balentine of EIG posted the following in the Yahoo VUID group on 02/04/2008. He points the finger directly, and appropriately, at us.

“…I ascribe it to the somewhat small population of companies and people doing the implementation work. Since everyone used to work for someone else and “this is the way we did it then,” these kinds of ideas get inbred and then become dogma. It’s a kind of “convergence to a local minimum” like in neural networks or quantum systems. It takes energy to tunnel back out once we’ve converged.

I think the same thing is true of those unhelpful recovery techniques that continue to persist — “I didn’t recognize that, I didn’t get that, I didn’t hear you;” — and the exclamatory grounding expressions, “Got it! and “Great!” What happens is that everyone’s ear becomes accustomed to the sound of a given solution, and in the absence of any rigorous debate or viable alternative, it becomes “comfortable” and subsequently “invisible” to the design team’s ears. “This is just how these things sound, and we used to work for XYZ so we know best by definition, and these other proposed solutions sound a little “weird” or offputting — they couldn’t possibly be an improvement.” So our designs converge to a local minimum and it’s very hard to tunnel out…”

That same Yahoo VUID group has been grousing over these issues of late. Some of them I have been guilty of myself, shamefully. We have compiled a list of phrases never to be heard in modern, professional interfaces again. Let’s band together and make it happen, I say!

1 Please listen carefully as our menu options have changed.

2 For more information, please see our website at www.whatever.com.

3 Your call may be recorded for quality assurance purposes.

4 My name is Beth, your virtual agent.

5 Press 1 for English.

6 You can speak or press your answers to each question.

7 Sales pitches.

8 Menu options that go on and on.

9 Lengthy legal disclaimers.

10 It’s my fault, I’m sorry.

And I am sure you can think of more offenders.

When you are writing your script, eliminate the non-informational bits that interfere with the primary aim of the caller: to accomplish the task in the shortest possible, easiest way that she can. The virtue of automation is tarnished by embellishment. Self-service, like the drive-in window, should be fast, efficient and painless.

Dawn of the WUI

  Posted by Laura Chumley on February 10, 2008

Recently I read an article speculating that we would soon have identified the elements of canine speech. Yes, the secrets of doggie language have been revealed to us mere mortals. As the proud owner of three good-looking pups of above-average intelligence, I began to think about how we could now communicate, and what that might mean for our household. For example, should Mr. Buck notice that Milky Way’s cough has returned, he can immediately call the vet for a prednisone refill. Imagine…a WUI (Woof User Interface)…

Virtual Vet: Hello, thanks for calling Cherokee Animal Hospital. To continue in Canine, just say woof.
Mr. Buck: Wwoof
Virtual Vet: Thanks, is your family one of our clients?
Mr. Buck: Woof
Virtual Vet: With whom am I speaking?
Mr. Buck: Wooooof
Virtual Vet: Ah, hello Mr. Buck. One moment while I look up your file. Are you calling for yourself or one of the other pets?
Mr. Buck: Woof woof
Virtual Vet: Milky Way, eh? Is he coughing again?
Mr. Buck: Woof
Virtual Vet: Alright, you can have Laura pick up his medication on her way home. Will there be anything else?
Mr. Buck: Wooffff
Virtual Vet: You’re welcome, good bye!

And with a stunning 43% accuracy rate, the woof recognition software would be comparable to speech recognition only a few years ago. How far we have come in such a short time! We started dabbling with speech recognition in the 90’s. It was dreadful. But soooo intriguing.

Dragon NaturallySpeaking 1.0 required hours of training, not just for the program to learn your voice, but for you to learn how to speak in a way it would recognize, for you to learn how to behave. Quirky. Unpredictable. Inaccurate. Slow. Today Dragon NaturallySpeaking 9.0 boasts 99% accuracy. 99%!

Now our esteemed CEO often uses it for casual email as well as contracts. Just kidding, Mark. Just email and white papers. And we have built a thriving VXML hosting business with enterprise level Genesys servers whose recognition capabilities will knock your socks off!

A 2007 presentation at The Radiological Society of North America stated that their research found that ASR (automated speech recognition) programs have exceeded the accuracy of human translation. Yes, recognizing and interpreting human speech was done better by a machine than a human. Read the article here. The best is yet to come. And we are ready!

Alright you enterprising entrepreneurs out there…who is going to hire me to write the first WUI?

One man’s hair is another man’s harrow.

  Posted by Laura Chumley on February 2, 2008

My mother spoke like Scarlett O’Hara, with an elegant, deep Southern drawl. She was exceedingly proud to be a 4th generation Atlantan, and her life was steeped in that tradition–charm and drama, drama and charm.

While she was not exactly a Luddite, she would eschew most things that smacked of modern technology. She accepted ballpoint pens only grudgingly, preferring the smooth ink spread of the fountain pen. She dreamed of debutante balls and ladies’ club meetings, magnolia-perfumed encounters and genteel discourse. So of course, she bore a changeling: a redneck geek.

We were taught to speak precisely even as small children, with good grammar and crisp diction. So when I told my new nephew-in-law what I do for a living–script design for speech enabled applications–I was taken aback when he said “Oh, that is why you talk so funny!”, and then he blushed, stammering, “I mean, all proper sounding.” I talk funny? Man, the hillbilly family I recently married into is the one that talks funny!

I am learning a whole new language these days, using immersion techniques: do or die! Sure, they grew up only 50 miles from where I did, but believe me, there is such a distinct difference between my urban dialect and their rural one that it is as if we lived in different countries on opposite sides of the world!

Most of the time, I can now understand my husband without asking him to repeat himself, at least not too often. But the other day when we were removing the bush hog implement from the John Deere tractor, he told me to get the cutting hairs and put it on. OK, I figured I would find something that looked like a tangle of wires or something. I looked and I looked. Nothing fit the bill. Nor did I understand what value something like that would have for the garden, but hey, he is the farmer, so I tried.

“Hon? I don’t see it.”
“O’er thar.”

I looked over there. Nope. Lots of different attachable things, but no hair-like things.

“Ummmmm.”

He looked over at me with some impatience and indicated a long bar with large scallop-shaped discs at closely spaced intervals. I smiled politely and dragged the thing over to the tractor. “And this is called…what?”

“Cuttin’ hare.”
“Hair?”

Then he realized what was wrong, and yet once again started to laugh at me.

“Tractor harrow. We call it a hare ‘round heah.”

Ah, another illuminating moment. One man’s hair is another man’s harrow.

The Path to Value

  Posted by Laura Chumley on January 28, 2008

Yesterday, as I was preparing the garden and corn field for February’s planting, it occurred to me that plowing with a tractor is very like flying an airplane under VFR. You fix the focal point of your attention on a distant spot and, ignoring all other cues, head directly for that spot. The view changes, the angle changes, there are side paths and distractions. Each of them contributes to your experience, your opinion, your analysis. But a straight row requires absolute focus on that goal. And unlike in a plane, you have no instruments to guide you. You have to use common sense and reason to get there.

My son calls it the happy path. I think of it in terms of designing voice applications as well. Each program has the goal of serving the primary need of the caller. In most self-service applications there is one function that is the real destination: the one that gives the most return for the company’s investment. Whether it is giving directions, tracking an order status or changing the on-call personnel, the flow needs to be driven towards that goal with all due expediency.

It is far too easy to get caught up in a call flow maelstrom. Not all options deserve the same weight as the primary one. The 80/20 rule applies—the path that gets most of the traffic is the path that needs to be expedited. Everything else is secondary. It is far more important to ensure that the majority of your callers are satisfied than to offer a dazzling smörgåsbord of choices and options that dizzy the mind and slow the pace to a crawl.
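For the numerically inclined, here is one rough way to make the 80/20 idea concrete: rank the menu options by how often callers actually choose them, and see how few of them account for most of the traffic. This is only a sketch. The function name `primary_paths`, the 0.8 threshold and the call counts are all made up for illustration and do not come from any real application.

```python
from typing import Dict, List


def primary_paths(call_counts: Dict[str, int], coverage: float = 0.8) -> List[str]:
    """Return the busiest options that together reach `coverage` of all calls."""
    total = sum(call_counts.values())
    if total == 0:
        return []
    # Busiest options first.
    ranked = sorted(call_counts.items(), key=lambda item: item[1], reverse=True)
    chosen: List[str] = []
    running = 0
    for option, count in ranked:
        chosen.append(option)
        running += count
        if running / total >= coverage:
            break
    return chosen


# Hypothetical call counts, invented for illustration only.
example_counts = {
    "order status": 620,
    "directions": 210,
    "on-call schedule": 90,
    "billing": 50,
    "other": 30,
}

print(primary_paths(example_counts))
```

With these invented numbers, two of the five options carry 83% of the calls, so those two are the paths to expedite; the rest can sit further back in the flow.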

I quote from Eduardo Olvera’s blog I read last year, http://www.vuidesign.net/the-value-of-menu-choices.htm, “Bottom line, any choice should add value to the caller and not waste their time. ‘Value IS what the impatient caller values.’”

Disclaimer: Note that the ideas, musings, and opinions expressed here are mine alone, and do not necessarily reflect those of MTI, my long suffering employer. :-)