The microphone may be mightier than the keyboard: business uses of speech recognition

by Richard Seltzer

This article was heard on the radio program "The Computer Report," which is broadcast live on WCAP in Lowell, Mass., and is syndicated on WBNW in Boston and WPLM in Plymouth, MA.

Please visit our online store at

After experimenting for a couple of weeks with speech recognition software for the PC (not the Mac) -- Dragon's Naturally Speaking 5 Essentials -- I held a chat session with Bill De Stefanis, the VP of product development at Lernout & Hauspie, the company that bought Dragon.

This isn't a limited-vocabulary speech recognition product -- like a video game that recognizes a couple dozen verbal commands. Rather, it is intended to recognize anything that you would normally say in English. To make that possible, you must "train" the software to understand your voice. The more time you spend training, the more accurate the results. You can store multiple speech profiles -- for different members of your family, or for your own voice input through different types of microphones or recorded in different environments. But it can't handle more than one profile at a time, which means that the base product works well for personal dictation, but you wouldn't want to use it to transcribe conversations with more than one person talking.

When you make corrections (by saying "Correct that"), you can also go into training mode to teach the software particular expressions that you use often -- and how to spell them (e.g., the name of your company). The more often you do that, the better the results. Once it is well trained, you really can speak at a normal pace -- which is faster than typing.

Bill says that the typical training time is about 20 minutes. In addition to reading preselected texts into your microphone, you can also have the program scan some of your own Word documents and email messages or you can read those documents into your mic to extend the software's vocabulary and to help it understand your writing style -- how you combine words and phrases.

You can use this software to input text into most Windows applications -- including chat programs. Once you launch Dragon and turn on its microphone, your voice input becomes text output in whatever application you currently have active.

I use the software in mixed mode -- talking, typing, and clicking. I use voice for the main input, and keyboard and mouse to make corrections and edits. You can also edit by voice, if you like. There's a wide range of voice commands that let you move the cursor, select text, capitalize, delete, etc.  You can even launch new applications, like the Internet Explorer browser, by voice.

In addition to the command control, Bill notes, Dragon Naturally Speaking can be scripted or programmed to perform automated tasks such as filling out forms and working with structured documents.

As Bill explains, Naturally Speaking comes in many "flavors." Essentials is the entry-level product that gives you a chance to experiment with this new capability. I got my Essentials for under $50 from Broderbund (the company that sells Dragon consumer products). That includes a high-quality microphone. A significantly better version (Preferred) now sells for $179.99. And Version 6, due out soon, has the ability to ignore common extraneous sounds like "ahhs" and "umms."

The minimum system requirements for Dragon Naturally Speaking are a Pentium II processor and 64 MB of RAM. Hard disk requirements vary from a minimum of about 110 MB up to about 300 MB.

In addition, they have a product called MediaIndexer that can index an audio or video recording and allow you to search for words and phrases against multiple video or audio streams.
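To make the idea of searching recordings for spoken words more concrete, here is a minimal sketch in Python. It is not how MediaIndexer itself works (the article gives no details about that); it simply assumes that a recognizer has already turned each recording into a list of (seconds, word) pairs. The file names, the sample words, and the function names build_index and search_phrase are all hypothetical.

```python
from collections import defaultdict

def build_index(streams):
    """Map each lowercased word to a list of (stream_name, position, seconds)."""
    index = defaultdict(list)
    for name, words in streams.items():
        for pos, (seconds, word) in enumerate(words):
            index[word.lower()].append((name, pos, seconds))
    return index

def search_phrase(index, streams, phrase):
    """Return (stream_name, start_seconds) for every place the phrase is spoken."""
    terms = phrase.lower().split()
    if not terms:
        return []
    hits = []
    # Look up the first word, then check that the following words match too.
    for name, pos, seconds in index.get(terms[0], []):
        window = streams[name][pos:pos + len(terms)]
        if [w.lower() for _, w in window] == terms:
            hits.append((name, seconds))
    return hits

# Hypothetical recognizer output: (seconds from start, word) pairs per recording.
streams = {
    "interview.wav": [(0.0, "speech"), (0.6, "recognition"), (1.4, "saves"), (1.9, "time")],
    "meeting.avi": [(12.0, "we"), (12.3, "tested"), (12.9, "speech"), (13.4, "recognition")],
}
index = build_index(streams)
print(search_phrase(index, streams, "speech recognition"))
# prints [('interview.wav', 0.0), ('meeting.avi', 12.9)]
```

Each hit names the recording and the time at which the phrase begins, so a player could jump straight to that point. A real product would also have to cope with recognition errors, which this sketch ignores.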

As an example of how well speech recognition can work, go to (their Web site). Under About, go to their contact information. Give them a phone call. One of the first options is to go to their directory. There you simply state the name of the person you want to speak to and the system rings their line. It's very slick. And the voice that speaks back to you, repeating the name that you spoke, sounds natural, not machine-like.

I'd like to experiment with Naturally Speaking for writing -- fiction and articles. I need to do some more training, though, to reduce the number of errors. Think of this as being like scanning -- you always need to proofread text that you've scanned -- only this is more like dealing with an early-generation scanner, when errors were more common.

Speech recognition could be a godsend for someone with carpal tunnel syndrome or arthritis, and for slow typists. It also allows hands-free use of the computer when you have to do many things at the same time.

Bill adds that most business applications for speech recognition have to do with creating large volumes of text. It is faster to talk than to type. Hence, speech recognition can increase productivity in legal departments, government agencies, and various medical and general business practices that do lots of document creation.

Meanwhile, telephone recognition (input by telephone instead of by microphone) has made great strides in the last several years, adds Bill. The company has developed a product specifically for medical transcription, where the vocabulary is limited and known.

To see more examples, check the value-added reseller (VAR) section of their Web site.

Currently, the biggest research tasks involve making speech recognition more natural, for instance adding automatic punctuation, speaker independence with no need for training, and handling multi-voice conversation.

This site is published by Samizdat Express, 213 Deerfield Lane, Orange, CT 06477. (203) 553-9925.

Buy Richard's book Web Business Bootcamp (published by Wiley)