By Peter DeHaan
I’m a huge fan of technology, and speech recognition (often deployed as part of IVR, or interactive voice response) holds great appeal. Yet when it comes to real-life implementations, I find it decidedly lacking and frustration-filled.
In the past I’ve been reluctant to voice my dislike, knowing that I’m part of the problem: my words often lack clarity. Clearly, I don’t make a speech recognition engine’s job easy.
Some errors are easily explainable given my imprecise speaking tendencies, such as asking for Candy Lane and ending up with Cam DeLain. However, other occurrences are nonsensical, making for a great comedy skit, albeit poor customer service. For example:
“Good morning, Acme Call Center; your call is important to us. Please say the department or name of the person you are calling.”
“Sally Pavasaris,” I dutifully respond.
“Did you say ‘Ned Flanders’?”
“NO!” I exclaim. Nothing happens. “Sal-ee-Pa-va-sar-is,” I decidedly project using my best possible diction.
“I’m sorry, I don’t understand. Please say the department or name of the person you are calling.”
“Agent!” I implore. “Operator!” I beg. I begin pressing zero with repeated vigor. When I’m finally connected to a person, my demeanor is less than stellar. I know why, but the agent is clueless, likely muttering about rude customers after she transfers my call.
To further complicate matters, what if I don’t know the person’s full name? What if I can’t pronounce their last name? Speech recognition is ill-equipped for such situations.
Another common problem is not knowing how to proceed when the software and I talk at the same time. A typical exchange:
“Please say your account number…”
“Seven,” I begin.
“…followed by the pound sign,” the voice continues.
At this point I have a critical decision to make, and a wrong choice could prove frustrating. Do I assume that “seven” was recognized, allowing me to confidently proceed in giving my account number? Or should I play it safe and repeat the first digit? If I guess wrong, even more time will be wasted attempting fruitless communication with a machine. Either way, I’ll inevitably hear: “I’m sorry; that number is invalid; please try again.”
Sometimes I try to suppress my impatient tendencies (why am I patient with people and impatient with machines?) and wait to make sure the voice is done talking. Sometimes I pause too long, at which point I’m rewarded with the unappreciated prompt, “Please respond now.”
To avoid causing the voice further frustration, I quickly comply. This usually results in the very situation I was attempting to avoid in the first place: the machine and I speaking simultaneously. At this point things usually spiral further out of control. The software still doesn’t know my account number, I still don’t know when to speak and when to listen, and I’m sensing that the likelihood of talking with a real person (versus talking to a machine trying to act like a person) is even lower than when I started the call.
It is true that a careful speech recognition implementation can speed up call processing and improve caller satisfaction. Sadly, that goal is not often realized. Instead, grandiose efforts are attempted with little to show for them, aside from frustrated customers and unnecessarily maligned telephone agents and customer service personnel. Is that the intended result of technology?