Saturday, September 18, 2010

Dictate to your computer like a pro

Do your wrists hurt? If you program all day, or write prose, or otherwise spend most of your time typing on a keyboard, chances are you'll develop some pain. The intensity of the pain will fall somewhere between not much at all and the somewhat rare catastrophic wrist failure I got.

The details of the risk factors of repetitive stress injury (RSI) are still being mapped out. People suspect influences from your overall health and constitution, your amount of stress, your general level of intensity at the keyboard, how many breaks you take and how effective they are. However, the largest factor is well known: it's the amount of time spent at the keyboard. According to the medical literature, anyone who types more than four hours per day is at risk -- though that's in part caused by medicine's low threshold for calling something a risk.

The point is, at four hours the risk is measurable, then it goes up from there.

As I was saying in my article on wrist health, the best way to prevent (or alleviate) RSI is to remove the hours spent typing emails from the equation. Dictate them instead. Because this strategy targets the #1 risk factor, it is more effective than any fancy keyboard, or any posture training.

Since 2005, I have been using Dragon NaturallySpeaking to dictate email messages and other documents of all sorts. Dragon's recognition accuracy has improved a great deal over the years. Over the last few releases, it has gone from being accurate enough to use if you don't have an alternative (v7 and v8), to being fine if you speak with an impeccable accent (v9), to being faster than anyone can humanly type on a keyboard, for pretty much everyone (v10). I have no reason to suspect the new Version 11 doesn't continue this progression.

Since I have spent so much time with Dragon, I have a collection of tips and tricks to offer.

[1] Get the cheap version of Dragon NaturallySpeaking, at $50


The more expensive versions of Dragon provide fancier support for complex programmed commands, and little else. Unless you are a doctor, a lawyer, or you need a bilingual version (such as the French+English version available at amazon.fr), save the money.

There is no need to bother with Dragon's large set of voice commands. Everything you can do with Dragon's commands you can do more easily with the mouse or through a few keyboard shortcuts. If your hands hurt too much to use a mouse, get a GlidePoint USB touchpad. The only commands I use are "Cap that," to turn text into title case, and "Switch to dictation mode," which turns off a few useless commands that only get in the way.


[2] Get a decent noise-isolating USB microphone


First, the microphone has to be USB, or it has to be plugged into an external USB sound pod. Any sound card inside the case of the computer will pick up interference from the CPU and hard drives and more or less ruin the sound quality. I would advise getting the USB pod even if your computer happens to have some internal shielding, or else you won't have the option of switching computers.

Second, the microphone has to be noise-isolating. Dragon has essentially no ability to tell your voice apart from background noise. You depend on your microphone's ability to filter it out. If you use the microphone that comes with Dragon, you will have to sit alone in a perfectly silent room, wondering whether it's the songs of the neighborhood birds or the sound of your computer's fan that's undermining your recognition accuracy. My microphone might be expensive, but I can dictate papers while on the road to Washington, and that's awesome.

Third, it's better if your microphone has good sensitivity. You almost have to yell into the stock microphone to get good enough sound quality. It's annoying, it's obnoxious to your neighbors, and it will give you RSI of the vocal cords if you do it for too long. With a microphone worth more than $100 or so, you start being able to dictate in hushed tones, and that's quite nice.

Fourth, you will spend many hours with this microphone, so find one that's not fragile, plasticky and uncomfortable, like most gaming headsets are.

My favorites are the Buddy FlamingoMic 7G, the Sennheiser ME3, or possibly the Sennheiser ME3 Knock Off. But there are more good microphones on this list and this list.


[3] Make sure you have enough RAM. Boost your CPU. Upgrade to an SSD.

The performance of your computer will have a huge impact on your recognition accuracy. After you speak, Dragon launches a search through a list of utterances you might have said. Dragon sets itself a deadline of around half a second per word to choose its best guess. The faster your machine, the deeper Dragon can search, and as in chess, the deeper the search, the better the choice.
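
To make the deadline argument concrete, here is a toy sketch in Python. It is not Dragon's actual algorithm, which is not public; the candidate phrases, their scores, and the `budget` parameter (standing in for how many guesses a machine can evaluate before the deadline) are all made up for illustration.

```python
def best_guess(candidates, score, budget):
    """Return the best-scoring candidate among the first `budget` evaluated."""
    best, best_score = None, float("-inf")
    for evaluated, candidate in enumerate(candidates):
        if evaluated >= budget:  # deadline reached: stop searching
            break
        s = score(candidate)
        if s > best_score:
            best, best_score = candidate, s
    return best


# Hypothetical scores for three readings of the same utterance.
SCORES = {
    "wreck a nice beach": 0.61,
    "recognise speech": 0.74,
    "recognize speech": 0.92,
}
CANDIDATES = list(SCORES)

# A slow machine only has time to evaluate one guess before the deadline.
slow_machine = best_guess(CANDIDATES, SCORES.get, budget=1)
# A fast machine gets through all three and finds the better reading.
fast_machine = best_guess(CANDIDATES, SCORES.get, budget=3)
print(slow_machine, "|", fast_machine)
```

The only point of the sketch is that the deadline is fixed, so a faster machine gets to look at more candidates before it has to commit.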

Nothing kills Dragon accuracy more than swapping. If you don't have much RAM headroom, the first few sentences you pronounce after turning the microphone on will get terrible accuracy, guaranteed, because Dragon will be busy swapping itself into RAM and will miss its deadline. But the effect of RAM shortage will be felt throughout your dictation session. Each time Dragon touches the RAM limit, it hits the disk and makes a mistake. It's frustrating.

Once you have enough RAM, get the fastest single-threaded processor you can afford. In laptops (at the moment) it's the Intel i7-M620, which is actually not that expensive a CPU. The more expensive CPUs sacrifice single-threaded performance to satisfy the marketing department's need for more cores. The i7-M620 runs at 1.6 GHz when all four cores are engaged, but it runs at 2.8 GHz when running single-threaded. And in single-threaded benchmarks, the CPUs are faithful to their maximum clock ratings.

Finally, upgrade to a solid-state drive. They are expensive, but they are also a hundredfold faster than normal laptop hard drives. It will eliminate all the strange delays Dragon otherwise sometimes has.

[4] Learn to be patient with the Dragon


It's natural to get frustrated when someone repeatedly fails to understand what you say. You cannot have the same attitude when speaking to Dragon -- you would go crazy. Meditate, if needed. Don't hesitate to make a correction with the keyboard when Dragon is being stubborn (to the extent your wrists will allow). It will help.

If your accuracy is bad all of a sudden, launch Audacity or some other audio software and listen to the sound quality coming out of your microphone. It's often eye-opening. Listen for common sound quality problems: Gaussian noise, interference being picked up by your sound card from the CPU and hard drive, or booms caused by the wind of your voice hitting the microphone element.
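
Listening is the real test, but a quick numeric check can complement it. The sketch below is a minimal Python example, assuming you have exported a short recording as a 16-bit PCM WAV file (Audacity can do this). It reports the peak level, the RMS level, and the fraction of samples at full scale. The function names and the synthetic test signal are made up for illustration, and the script only measures level and clipping; it will not identify noise or interference for you.

```python
import io
import math
import struct
import wave


def analyze_wav(source):
    """Return (peak, rms, clipped_fraction) for a 16-bit PCM WAV file.

    `source` is a file name or a binary file-like object. Peak and RMS are
    scaled so that 1.0 is digital full scale.
    """
    with wave.open(source, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames = wav.readframes(wav.getnframes())
    count = len(frames) // 2
    if count == 0:
        return 0.0, 0.0, 0.0
    # WAV samples are little-endian signed 16-bit integers.
    samples = struct.unpack("<%dh" % count, frames)
    peak = max(abs(s) for s in samples) / 32768.0
    rms = math.sqrt(sum(s * s for s in samples) / count) / 32768.0
    clipped = sum(1 for s in samples if abs(s) >= 32767) / count
    return peak, rms, clipped


def make_test_wav(samples, rate=16000):
    """Build a mono 16-bit WAV in memory, standing in for a real recording."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))
    buf.seek(0)
    return buf


peak, rms, clipped = analyze_wav(make_test_wav([0, 16384, -16384, 32767]))
print("peak=%.3f rms=%.3f clipped=%.0f%%" % (peak, rms, clipped * 100))
```

On a real recording you would call `analyze_wav("recording.wav")` with your own file name. A noticeable clipped fraction suggests the input gain is too high, and a very low RMS while you are speaking suggests the microphone is too quiet or too far away.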

[5] Go to the preferences dialog box and remap the keyboard shortcuts


Get into the habit of pressing Ctrl-Z instead of saying "Scratch that." Saying "Scratch that" is too tedious and unreliable. If you're like me, your hands are not in such bad shape that you can't afford to press Ctrl-Z.

Bind "Correct that" to F1. This way your fingers are automatically near the keyboard shortcuts needed to select from the correction options after you bring up the correction menu (press ALT+number). ALT-F1 should be bound to the dictation box. Then choose one function key to use for Microphone On/Off, possibly F7. Turn on "Double click to correct". When correcting (that is, when the correction menu is shown), switch between the utterances with the left and right arrows.

This arrow-key trick won't work well if you're dictating emails straight into Firefox, or into any other application that doesn't have native support from Dragon. Dragon has native support for Microsoft Word, WordPad, and Notepad. All other applications run in compatibility mode. In that mode, Dragon can't read the text you're editing, which means that all features that require knowledge of the text outside of what you have just dictated are disabled. Saying "Select foo" when "foo" isn't something you just said won't work, and neither will the left and right arrows.

The fix is to install the Text Editor Anywhere program or the It's All Text Firefox add-on and configure it to launch WordPad on Ctrl-F1. My Gmail is set with the rich text box off (otherwise It's All Text doesn't work) and has keyboard shortcuts turned on. This way, to reply to an e-mail I press in sequence: "a" for reply-all, Shift-F1 for the microphone, Ctrl-F1 to launch WordPad, then I leave my left index finger on F1, with the thumb on the left-side ALT key, so that I am ready to select corrections.

Text Editor Anywhere works out of the box. For It's All Text, you need to set WordPad as the default editor. In the preferences dialog box, use this path: "C:\Program Files\Windows NT\Accessoires\wordpad.exe", then set It's All Text's character set to "iso-8859-1" (without the quotes).
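
One thing to keep in mind with the "iso-8859-1" setting is that this character set covers Western European accented letters but not every symbol. The short Python sketch below lists which characters in a piece of text fall outside ISO-8859-1. The helper name `latin1_problems` is made up for illustration, and how It's All Text itself treats such characters is not something this sketch can tell you.

```python
def latin1_problems(text):
    """Return the distinct characters in `text` that ISO-8859-1 cannot encode."""
    bad = []
    for ch in text:
        try:
            ch.encode("iso-8859-1")
        except UnicodeEncodeError:
            if ch not in bad:
                bad.append(ch)
    return bad


# Accented Latin letters are part of ISO-8859-1.
print(latin1_problems("Déjà vu, naïve café"))
# The euro sign and curly quotes are not.
print(latin1_problems("Price: 50 € “approx.”"))
```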

[6] Run the email corpus training


After the 15-minute initial voice training, the email corpus training will give your recognition accuracy another factor of improvement. From this training, Dragon learns a sense of the words used to talk about the things you talk about. It actually doesn't matter that much if it's your writing or somebody else's. What matters is to give Dragon an opportunity to rule out words you never, or hardly ever, use.
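
As a rough illustration of why a corpus helps, here is a toy Python sketch that counts how often each word appears in some text and uses the counts to choose between similar-sounding candidates. This is a simple word count, not Dragon's language model, and the corpus, the candidate words, and the function names are invented for the example.

```python
import re
from collections import Counter


def word_counts(corpus):
    """Count how often each lowercase word appears in the corpus."""
    return Counter(re.findall(r"[a-z']+", corpus.lower()))


def prefer(candidates, counts):
    """Among similar-sounding candidates, pick the one seen most often."""
    return max(candidates, key=lambda word: counts[word])


# Hypothetical scrap of a programmer's email.
corpus = "Please send the patch to the list. The patch fixes the parser."
counts = word_counts(corpus)

# "batch" never appears in the corpus, so "patch" wins.
print(prefer(["batch", "patch"], counts))
```

Words that never show up in your email get a count of zero, which is the toy equivalent of ruling them out.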

[7] Teach Dragon new words as you go


If you have a new word, just pronounce it. Dragon will write rubbish of course, but then press the correction shortcut key twice, once to bring up the corrections, then once to bring up the spell dialog box, then type the correct word. That's sufficient to train a new word. Dragon remembers the pronunciation and will associate it with the spelling.

[8] Don't hesitate to move the speed-vs-accuracy slider to the right


If Dragon is making mistakes, move it to the right. The added speed is never worth the frustration of the mistakes. Don't move it all the way though. The last tick mark is very demanding on the computer and not that useful. Try setting it 95% to the right.

[9] Apply corrections by repeating yourself


After pressing F1, you have two options. You can either pick a correction amongst the ones offered in the menu, or you can repeat what you just said. Repeating is surprisingly effective. When you repeat yourself, Dragon takes into account that its first guess was wrong and goes looking for a new interpretation. It's often faster than reaching for the mouse to choose a correction.

Finally,

[10] If you're not a native English speaker, take an accent-training course


Seriously, Dragon NaturallySpeaking is excellent accent practice. Dragon will flag your mispronunciations like no polite friend ever dared to. On your own it's just frustrating, but with the help of a teacher you'll learn to mouth the missing sounds, and Dragon will mark your progress.

6 comments:

Ornthalas said...

You put twice the same link for the headsets.

Guillaume said...

Thanks Ornthalas. I just fixed it in the text. The second link I had in mind is http://www.pcspeak.com/products/microphones.shtml

Wil Forbis said...

Very good advice, especially the idea of binding correction to F1. You might want to mention this link over at speechcomputing.com

My two cents: For me, Dragon became much more usable when I combined it with the vocola (vocola.net) free package. Vocola (like more advanced versions of Dragon) allows you to create custom commands like "Launch Hot Mail" or "Start Notepad." Even though I'm not really bothered by RSI anymore I still use these commands simply for ease of use.

I did my own write up on Dragon here (it is somewhat out of date compared to my current practices with the program.)
http://www.wilforbis.com/pages/hands_free_computing.htm

Wil

coyote said...

Does DNS only use a single core of a multi-core CPU?

coyote said...

DNS now has "multi-core support" (per
http://www.nuance.com/ucmprod/groups/dragon/@web-enus/documents/collateral/nc_016429.pdf regarding DNS11.5), so I think your recommendation of "the fastest single-threaded processor" is no longer the best advice.

Guillaume said...

Nuance's claim that they rewrote their code to make use of multi-core CPUs is a marketing lie.

I have Dragon 11.5 running on a 4-core CPU, and in no event does my CPU usage ever go above 25% during recognition, which is what you would expect from hopelessly single-threaded code that was written before the era of multi-core processors and never rewritten.

My advice still holds. Buy the fastest single-threaded CPU you can afford. Look at Notebook Check's list of mobile CPU performance, and pay attention to the "Cinebench R10 32Bit Single" benchmark. It's the only single-thread benchmark in the list.