Archive for February, 2011

Zings, Zaps, and Zoodles!

You’ve heard them: those brief, intriguing “blasts” of a short, dynamic tone — or “flurry” as I’ve been known to call them — when you dial into the main number of a telephone system or even when you dial a company. It’s almost a “tone” or “glimmer” which lets you know — especially if their sound is distinct to them — that you’ve reached the right place.

Possibly the most recognizable one today is the staccato piano tone which plays when you reach T-Mobile (it’s even the ringtone when messages and texts roll in via T-Mobile). It’s a distinctive sound which has actually become part of their brand. Lance Massey composed it, and likely had no idea how huge it was going to be. Of course, on the broadcast side, NBC’s 3-note flourish is a classic; the Macintosh startup tone, THX’s “Deep Note”, and the ever-recognizable “Intel Inside” musical flourish (composed by Walter Werzowa) are firmly entrenched in our consciousness.

I decided a while back that it would be an interesting add-on to mix “flurries” into the telephone prompts I record. I set about looking for them on sound effects CDs I already had in my library and found that, in amongst the plethora of intergalactic outer-space zings and zaps which were readily available, there was a dearth (and not Vader) of the right-sounding effects which would mix well with telephone prompts. I then decided to mine some of my favorite sites where I find on-hold music, www.musicbakery.com and www.soundrangers.com, and posed the challenge to their customer service reps, using descriptors like “logos”, “flurries”, “zaps”, and “stingers”. I carefully avoided terms like “telephone sounds”, which will give you all the ringtones, dialpad sounds, and out of service tones you could ever want, and I wanted to avoid long, broadcast-y sounding flurries which might play as someone lurches onstage to accept an award, or which might sound like a radio program intro. It was key that they were short, attention-getting, fresh, and modern sounding.

This one (supplied by Sound Rangers) has long been a favorite of mine, and I’ve used it often:

Sound Rangers Flurry Example 1

…they even work well capping off the end of a prompt:

Sound Rangers Flurry Example 2

However, these favorite “zings” and others soon became over-used, and it became clear to me that I had to keep looking. I was getting bored with them; I can only imagine that my regulars were, too. They’re just not all that readily available — and actually very hard to find.

Enter Craigslist. My first foray onto Craigslist involved sending out a request to composers who would like to try their hand at designing short “signature logo” sounds for me to use with my clients’ files. Only one reply came, and it was the best possible reply I could have gotten.

John Kasiewicz

John Kasiewicz, a composer based just outside of New York City, replied, indicating his interest in giving this a try. He had designed musical scores for films and TV, so I knew he had chops. Whether or not he was interested in such a weird-ball project, which was such a departure from what he usually does, was another thing altogether.

Luckily, he took on the project, and with very little guidance or direction, managed to compose several fresh new flurries which I use each and every day.

When I asked him what intrigued him about doing this project, he emphasised that timing was everything. “Around the time you approached me I was thinking a lot about composing miniatures in a variety of musical forms,” explains Kasiewicz. “Probably similar to the desire some people have for building a boat in a bottle, composing telephony sound effects seemed like a great fit for my current musical aspirations.”

Composing within the limits of telephony sound posed an interesting challenge for John, not just in the file specifications, but in the tonality: “Finding the appropriate timbre for IVR systems, especially reminding myself that the delivery system for these ‘stingers’ is typically a lo-fi telephone handset speaker, helped limit my work environment.”

The fact that commercial jingles or stingers are so prevalent in our consciousness (and particularly in our adolescence) helped John think in truncated, brief terms required for telephony flurries: “I think we could all sing at least a dozen famous audio logos or jingles off the top of our head, right? I used to sing them on the bus ride to school growing up and I still sing while cooking a meal or mowing the lawn,” muses Kasiewicz.

I could definitely identify: my mother said I literally drove her nuts by singing the jingle for every product I recognized during each grocery shopping trip we took. This was actually a tip-off that my eyesight was bad from an early age; commercials were the only TV I could watch.

Before embarking on the project, I described to John what I needed in the form of “zingers”, and gave samples of what I prefer and what I want to get away from — and then backed away. I needed to hand over the creative control, much like my favorite clients do to me. I needed to know if that style of hands-off project management worked for him. Turns out it did: “I was thankful that you left so much unsaid and allowed me to create with few boundaries.” It’s not unusual for Kasiewicz to work on a film score in which the director has already roughed-in pre-existing music, and, as he explains: “It’s always tricky business to create something new that is so closely tied to something already so well known.”

With his tags being used on IVRs I’ve recorded for KitchenAid, Sony Technical Support, Electrolux and Whirlpool, John’s sounds are reaching a whole different audience, some of whom have enquired about “buying” the sounds from me (they’re not for sale) or whether a sound can be made exclusively “theirs” (I steer them towards Mr. Kasiewicz for direct negotiations on that one). John Kasiewicz’s website can be accessed at: http://seejohnplay.com/

Here are some samples using John Kasiewicz’s original “logo” sounds:

Kasiewicz Flurry Example 1

Kasiewicz Flurry Example 2

“Flurries”, “Zings”, “Zaps”, or “Zoodles” are short bursts of sound which I mix into my sound files gratis: no additional charge, just kind of a fun extra I throw in, which goes over famously almost every time (I think an online parole payment system was resistant to a cute, perky sound greeting their callers). It generates lots of repeat business, with many clients actually saying: “Oh! And throw in that little…..sound effect…like last time.”

Next blog, I’ll be discussing the odd and sometimes unpredictable uses for voice-over — we appear in the places you’d least expect!

(Editor’s note: I have decided to change the interval of blogs from weekly to every two weeks, effective immediately. I want to keep the content rich, exciting, and always of interest to my readership. Blogging weekly for nearly two years is tapping me out a bit, so I hope you’ll stick with me while the interval between articles is longer — but the content still maintains its quality! Thanks for your flexibility!)

Allison Smith is a professional telephone voice who can be heard on telephone systems and for private companies throughout the world, including platforms for Verizon, Qwest, Cingular, Sprint, Bell Canada, Hawai’ian Telcom, and Asterisk. Her website is www.theivrvoice.com.


Do You Know What I Mean?

The stories are now legendary: the mis-firings of speech recognition utilities, when a phrase is uttered into a system and something seemingly random is repeated back to the originating voice, are entertaining as well as fear-inducing. Imagine having your completely coherent and well-thought-out message to a colleague turn out sounding like this: “Hey don’t forget your Dad killed her by name. Be careful on the way. Read some pretty clear down here bomb within like 130 to be careful. Bye.” Or: “Hi again this is Michael. So calling from Ralph there. Volkswagen lasagna.”

When speech recognition programs (also known as Automatic Speech Recognition or Computer Speech Recognition), designed to convert speech to text, go wrong or misinterpret what is said, there seems to follow some sort of perverse satisfaction in machines being not quite as intuitive as we are. It was much the same when the IBM computer persona “Watson” competed on Jeopardy! this last week: we took just a little too much glee in his failures and felt just slightly too much angst when he actually trumped what we know to be a very capable human.

Having voiced many prompts to build text-to-speech applications (where typed words are converted to the spoken word), I have also been an actual human being on the other side of it: I have attempted to order items via automated systems, following prompts which I, myself, have voiced, and have had the automated version of “me” say things like: “Great. I think you said: International Sales” when I clearly intoned: “Visa Payment”. Then there was the time I got my first voice-enabled dialing feature on a cell phone years ago and distinctly told it to dial “Kelsey”, and it repeated back to me: “OK, I think you said…..JEROME…”

Gerd Graumann, Director of Business at Lumenvox (www.lumenvox.com), one of the leading providers of speech development products, filled me in on some background and history of Speech Recognition: “AT&T Bell Laboratories developed a primitive device that could recognize speech as far back as the 40’s, and even back then, researchers knew that the widespread use of speech recognition would depend on the ability to accurately and consistently perceive complex verbal input,” explains Graumann.

“In the 60’s, researchers turned their focus towards creating a device that would use discrete speech, verbal stimuli punctuated by small pauses,” further explains Graumann. “However, in the 1970’s, continuous speech recognition, which does not require the user to pause between words, began. The technology became functional in the 1980’s, and is still being developed and refined today.”

In 1982, Kurzweil Applied Intelligence released speech recognition products, and by 1985, their software had a vocabulary of 1,000 words — uttered one word at a time. In just two years, its lexicon reached 20,000 words — entering the realm of actual human vocabularies, which typically range from 10,000 to 150,000 words. Despite that healthy base, the recognition accuracy was still only 10% in 1993. Two years later, the error rate crossed below 50%. In 2001, the recognition accuracy reached a plateau of 80%, no longer growing with data or computer power. When, in 2006, Google published a trillion word corpus, Carnegie Mellon University researchers found no significant increase in recognition accuracy.

Ever-increasing processor speed, overall system performance and improved algorithms now enable speech recognition systems to run more effectively than ever and deliver the results of massive probability calculations within fractions of a second. Even the stumbling block which was at one time considered to be close to insurmountable, the challenge of speakers with accents, has been largely eradicated. Current-generation speech recognition systems learn over time to “understand” various speakers with accents and strong regionalities from the data they are trained with. Gerd Graumann further clarifies this point: “The training data that goes into the acoustic model makes all the difference. With today’s models, the spectrum is fairly broad, and many non-native speakers are part of the training data to reflect how people from many different backgrounds speak. Of course,” warns Graumann, “there is always the end of the spectrum.”

When it comes to the words people use to interact with automated systems, the latest technology already allows the systems to interpret what the person is saying. This is achieved by the use of statistical linguistic models, a new technology that tries to understand the intent of what is being said, versus the exact words that were spoken. It’s not unlike texting with an SMS utility which remembers the likely words you might mean as you type, and not unlike how the actual human brain works.
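For the technically curious, here is a minimal sketch of that “intent over exact words” idea. To be clear, this is a toy of my own and not how Lumenvox or any real speech product works: the intent names, the sample phrases, and the `train` and `classify` helpers are all invented for illustration. It simply counts which words tend to show up with which intent, then picks the intent whose word counts best fit what the caller said.

```python
import math
from collections import Counter

# Hypothetical sample phrases for two made-up intents (illustration only).
TRAINING = {
    "visa_payment": [
        "pay with visa",
        "visa payment",
        "i want to pay my bill by card",
    ],
    "international_sales": [
        "international sales",
        "talk to sales overseas",
        "sales outside the country",
    ],
}


def train(examples):
    """Count how often each word appears under each intent."""
    counts = {
        intent: Counter(word for phrase in phrases for word in phrase.split())
        for intent, phrases in examples.items()
    }
    vocab = {word for counter in counts.values() for word in counter}
    return counts, vocab


def classify(utterance, counts, vocab):
    """Return the intent whose word counts best explain the utterance.

    Each word contributes an add-one smoothed log probability, so a word
    the model has never seen lowers the score without zeroing it out.
    """
    best_intent, best_score = None, -math.inf
    for intent, counter in counts.items():
        total = sum(counter.values())
        score = sum(
            math.log((counter[word] + 1) / (total + len(vocab)))
            for word in utterance.lower().split()
        )
        if score > best_score:
            best_intent, best_score = intent, score
    return best_intent
```

With these sample phrases, `classify("i would like to pay by visa", *train(TRAINING))` comes back as `visa_payment`, even though the caller never said the exact words “Visa Payment”. A real system would learn from vastly more data and would work from audio, not typed text.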

The applications for speech recognition are vast. Medical and legal uses abound, not the least of which involve transcription and real-time dictation, which is made considerably more efficient when digital dictation systems are routed through speech recognition utilities (known as Deferred SR). Speech recognition is aggressively being implemented in high-performance military fighter aircraft, with the capability to set radio frequencies, command the autopilot system, set steer-point coordinates and weapon release parameters, and control flight displays. From enhancing the lives of people with disabilities to training Air Traffic Controllers, and even improving the experience of video games, speech recognition’s uses and applications are immense and growing continuously. And hopefully, with the refinement of the technology, the likelihood is minimal of receiving the following cryptic voice mail transcription: “I just wanted to let you know so that you weren’t surprised if you come back for shower tomorrow that cousin is girlfriend, maybe..” Or how about “Kelly” receiving a message from her father: “Hi, Kelly, Death calling…”

Next week, I’m excited to blog about those fascinating — and largely subliminal — short “flurry” sound effects you sometimes hear when accessing a telephone system…they’re almost like a trademark musical “scale” which can become closely associated with a telephone company’s identity — and they’re *very* hard to find! I’ll discuss how they’ve become a big boon to my business, and why the sounds which I own are closely guarded.

Thanks for reading!


IT Expo Wrap-Up

(Apologies for not blogging last week — travelling back from IT Expo and handling the backlog of work which seems to accumulate whenever I’m away from the studio necessitated a brief break — I’m back this week with a re-cap of the always amazing IT Expo!)

Miami's Sunshine and Art-Deco Pastels by day get replaced by heart-stopping neon...

Every year, people flock to Miami. For those situated in cold climes like myself, it’s an opportunity to finally get warm and enjoy sublime Cuban cuisine, but for those who work in the area of Internet Telephony, IT Expo in Miami is a set item on their yearly business travel agenda. IT Expo is the event with an educational program that teaches resellers, enterprises, SMBs, and Government Agencies how to select IP-based voice, video, fax, and unified communications to purchase or resell. Even this year’s harsh winter temperatures and tricky airline travel didn’t put too much of a glitch in the conference, although I lost count of how many people told me they were on literally the last plane out of their local airport!

I, of course, attend IT Expo to meet with (and sometimes introduce myself to) contacts I’ve already worked for, and also to mine the rich opportunity of companies who may be in need of a professional voice talent to voice their platforms and who might have been at a loss as to how to outsource that. Without fail, every year I attend (and this is my fourth IT Expo) I make significant contacts.

There were a couple of aspects to IT Expo which made this an extra special year to attend:

My Talk at Digium Asterisk World

Gah! I'm actually on the Big Poster of Speakers!

I was invited by Bryan Johns, Asterisk Community Director at Digium, to speak at their pavilion, Digium Asterisk World — an opportunity which I jumped at, as it becomes clearer and clearer to me all the time that there is very little content out there about the mechanics of how to write effective IVR prompts — a topic which I’ve evangelized on repeatedly in this blog, and one which I looked forward to making the topic of my presentation: “IVR Mistakes and How To Avoid Them”.

I came to the conclusion that the biggest fear of the public speaker *isn’t* crowds; it’s the *lack* of crowds. As speakers left the podium and the next speaker got ready, crowds had a way of dissipating. The fleeting thought of speaking to empty chairs crossed my mind as the Digium staff set up my laptop and outfitted me with a mic, but as soon as the talk got underway, seats started to fill again. I touched on the common pitfalls of IVR scripts which personally drive me crazy: overly lengthy opening greetings, too many mailbox options, and urgent, critical information left inexplicably until the *bottom* of the phone tree. These are all really basic, common-sense points when they’re pointed out, but they’re mistakes which all too many people seem eager to replicate again and again. I explained that, ultimately, I’m not a technical writer or even an expert on IVR, but my day-to-day voicing of prompts gives me a good idea of the common traits exhibited by effective IVR writers, and the less effective systems, which needlessly take up customers’ time and contribute to frustration, exhibit some common traits as well. Gone during this presentation was the nervousness which I used to encounter regularly when speaking. I credit my new-found career coach for a much-needed tune-up (more about her in upcoming blogs), but there’s also some advantage to talking about a topic which you’re passionate about, and a certain level of calm which comes along with that.

The Women In VoIP Breakfast

Whenever I told men at IT Expo that I had attended the first annual Women in VoIP Breakfast, they presumed (knowing what they know about the ratio of women to men attending technology conventions) that it must have been myself and one other woman sharing an Egg McMuffin. There were actually close to a dozen of us (including someone joining us via Skype). The breakfast was organized by Suzanne Bowen of DID Xchange, possibly one of the most personable and well-connected women in telephony today. I was honored to be a part of this group of motivated women: consultants, marketing gurus, PR agents, and directors of sales, with myself as the only professional voice talent attending. We made a very interesting group coming from far and wide (one of the attendees from Germany is even starting her own Women in VoIP gatherings in Frankfurt!). Truly an amazing occasion to connect with great women who, I’m happy to say, are more and more visible in the industry with every passing year.

With fascinating keynotes (I’m sorry, I’m biased, but Digium’s Danny Windham shows everyone how public speaking is done to perfection), better and more diverse exhibitors than ever, and the opportunity to freak out Conference Chair Rich Tehrani with my “telephone voice”, IT Expo was and always will be one of the few never-miss telephony conferences of the year.

Join me here next week, where I will delve into the mystifying, intriguing, and often comical world of Speech Recognition! (“I think you said…..SCREECH RETRIBUTION…!”)

You’re great for reading. Feel free to leave a comment!