To speak or not to speak, that is the question

Summary: Carpal Tunnel Syndrome has me exploring the feasibility of leaving the keyboard behind for my writing.

TOPICS: Mobility

Speech recognition doesn't intimidate me

I won't be entering into the valley of speech input blindly. I've been playing with speech recognition for over a decade. It's been a passion of mine since the earliest methods appeared.


I was impressed when IBM first introduced ViaVoice, probably the first commercial personal speech recognition product. The PCs of that time were barely able to provide the compute power needed for real-time speech recognition, which made IBM's technology all the more impressive.

ViaVoice was such an advanced product for its time that I was captivated by the technology. I got extensive training on the technology and its practical use directly from IBM. They showed me why interpreting spoken words accurately was so complicated. It was fascinating training, and IBM certified me as a Speech Recognition Specialist as a result.

My fascination with speech input has continued since then, and I try it on every platform I use. I've come to realize its usefulness is hit and miss depending on a lot of factors. Those factors will play a significant role in my attempt to use speech for my writing work.

Work methods must change

No matter what device and platform ends up working best for this work, my work methods will have to change. I will still do research for my articles in public venues, but there will be no more writing in those places.

For speech recognition to have a chance to work well, a quiet area is required. I plan on doing the "writing", or speech input, in my home office. There will be no more music playing in the background, as is my common practice; quiet is required to make this work.

Dictating text into a computer means speaking clearly and slowly to improve the accuracy of the interpretation. That will require lots of practice on every device and platform I test. The real trick to input by speech will be making sure that my writing style doesn't change. When speaking long articles, it is common to end up with short, choppy sentences, and that is no good.

The spoken word is often much different than the written word. Through trial and error, I will have to come up with an entry methodology that works well for speech recognition, while maintaining my voice or writing style. I will only consider this a success if it's impossible to tell from reading my articles if they were typed as usual or dictated into the system.

You deserve the best writing I can do, and that's what you will get from me. That's not an idle promise, that is the way it will be.

Devices to be tested

I will start my journey into speech input with the MacBook Pro I recently purchased. Speech recognition is ingrained in OS X, and from the little experimentation I've done so far, it is OK. It allows speech input in any recognizable text entry box on the screen.

I will also test the Chromebook, although speech input is a very recent addition to Chrome OS. I don't hold out much hope to use it extensively, but will give it a shot.

I have great hope for using speech with the ThinkPad Tablet 2 I am testing. The speech recognition integrated into Windows has been good for years, and I'm hoping Windows 8 is as good as or better than earlier versions. Speech recognition requires a lot of processor horsepower, and I'm concerned the Atom processor might not be up to the task.

Google's speech input is much better than most people realize, and I will be trying it on the Nexus 7. How it will handle longer entries is not clear, but I will see.

I'll also be testing the iPad, both the standard one and the iPad mini. I have used speech input in Siri quite successfully, and Apple has rolled that out across the system. I should be able to dictate articles into the browser tool we use at ZDNet, at least in theory.

Primary question

There is a big unknown as I start using speech for text entry, which will have to be figured out quickly. The most important question I have is whether the internal microphones on these devices are good enough for accurate recognition.

Most of the devices I will test have array microphones designed to cancel background noise. This is to make it easier for the system software to accurately interpret the spoken words. If they don't work well enough, then an external noise cancelling microphone will be required for the writing. I have several to choose from, so we'll see how it goes.

Methodology for writing

I plan on doing my research for articles much the same as I do now. I can do that using any of my devices since speech will not be a big factor. I should be able to do light typing for this work. I already use short voice notes in my work, and expect I'll do more of that. That works well across all the platforms at my disposal, so it shouldn't be a problem.

Writing the articles proper will be done totally with speech, using whatever I determine does it best. I will dictate each article from start to finish in as many sessions as it takes. My experience with speech recognition has taught me to ignore "typos" as I go, and just get the words into the system and "on paper". For those times when the interpretation fails miserably, I plan on having an external audio recording of my dictation. That will allow me to play back what I said at the time and work out how to correct the bad interpretation.

After each article is written, I will do the editing phase much as I do now. I'm hoping the light typing required for editing won't cause my wrist any problems and that the brace doesn't interfere with it. If it does, I'll have to get good at using speech for this work, too. I hope that is not the case.

I am interested to hear from anyone who is currently using speech input on a regular basis. Please share what you are using and how you make it work. This is not going to be an easy change for me, and I can use any help you can offer.

  • Why not...

    ...just get the surgery? Slice, recover, good as new.
    • Probably will

      If the doctor recommends it. He wants me to try less invasive methods first. Even if I end up going that route I will end up having a period where typing won't be a good option so this test is valid.
      • Ah, I see

        I have CTS as well. The doc has me wearing that stupid brace until I can't stand the pain any more.
    • Get the Surgery

      With all due respect, your doctor is not giving you good advice. There is no way you should have been suffering for 5 YEARS!! I am a hand surgeon. I perform this surgery multiple times per week. I do this endoscopically. My patients are typing again in sometimes under a week. The surgery takes about 15 minutes, is outpatient, and involves a half inch incision in the wrist that often requires nothing more than Motrin for pain.
      Again, you are getting BAD advice. You will feel better the next day after this surgery and your numbness (if you have any residual) will go away over about 5-6 months. No more suffering and no more splints.
      If you wait too long, you can have permanent damage and your recovery will be both harder and longer. Skip your doctor and find a hand surgeon that uses the endoscope.
  • Sorry for your pain.

    I have also used speech recognition tech for a long time. It can be tricky sometimes, especially with spelling, grammar, and punctuation: key things for a writer. I look forward to hearing about your tests and methods in this area. Hope the impact on your work is negligible.
  • dogs

    I used to have something CTS-like.
    It got worse and worse, but for some reason it has been getting better over the past 3 years.
    I don't know what did it, but 3 things come to mind:
    1) I'm no longer using a mouse but a multi-touch trackpad
    2) I'm now on a chiclet keyboard throughout the day
    3) I have dogs that I walk on a leash for about an hour and a half per day
    My best guess is that 'walking the dogs' is what cured it, because it means having to use all the muscles in my hand while they sometimes rip my arm off in vain attempts to regain their freedom :-D
  • Short story

    I wrote a short story (for a friend) last month using iOS speech recognition in Evernote. It worked, but it was quite tedious. I'd like to find something that (a) is more accurate, and (b) more interactive. But the fact was, a 1K+ word story did get written. I did have to do a hand edit after, though.
    David Gewirtz
    • The rain in Spain falls mainly on the plain.

      Or where is Professor Henry Higgins when you need him? Don't blame Siri for your poor speech patterns, David! It must be that crazy New England/Floridian accent mashup of yours that is causing all that extra editing. Grin.
  • Headset

    A good headset with noise cancelling sounds like an ideal pairing with speech dictation.
  • Sorry to hear that James

    While you and I clearly have our differences, I would never wish anyone to be in pain like that. I hope you find a solution that clears this up for you.

    I'm glad you will be testing out Windows 8 speech recognition on your ThinkPad. Please keep in mind, dear ZDNet readers, that while MS has done absolutely nothing to advertise this, the Surface RT contains the full Windows 8 speech recognition platform. Well, it's probably not right to call it "Windows 8 speech recognition" since, as far as I can tell, absolutely nothing has changed from Windows 7. I tried it for a bit on the Surface RT and it seemed to be able to keep up perfectly well. I must admit, though, that I didn't use it for long since there isn't any need for it in my use cases.
  • glove

    The best glove I ever used was a bowling glove with steel stays in the palm and back. Totally rigid yet not too tight. Good luck with it.
  • James, muscle atrophy is not good

    If you have muscle atrophy, that means you have nerve damage. That is a strong indication for surgery, and you still may not recover that muscle strength. Any doctor that continues to tell you to try non-invasive methods when you have muscle atrophy is of questionable judgement.
    Nevertheless, you should have been using speech recognition years ago. I have been using it since 1998. It is really great now. Even if you just use the software that comes with Windows 8, it is great. Dragon is very affordable now as well.
  • surgery

    I had hand surgery last summer, though not for carpal tunnel. Surgery will help with pain, etc., but it will cause stiffness, which will have some of the same effect that you get from wearing a brace. Plus, if you continue the same tasks that caused the CTS in the first place, you'll end up back in a brace anyway.

    While my hand was non functional before the surgery, and in various casts, splints, and braces after, I used Dragon Speaking. It's an awesome tool, but you get out of it what you put into it. You really need to make the effort to learn to use it to its full potential. If you do, it's an amazing tool.

    Luckily, I'm back to typing, though with some modifications to work around the stiffness and immobility I have in one thumb.

    Good Luck!
  • And this is why I use a Microsoft Natural keyboard.

    And this is why you'll pry my Microsoft Natural keyboard out of my cold, dead hands.

    As much as you guys *love* to tout these fancy new Chromebooks and such - they have horrific ergonomics.
  • your speaking trek...

    Looking forward to hearing about this endeavor, surgery or not, I agree, it is worth the effort and time.

    Another idea - recording into a digital recorder, then using something like Dragon Naturally Speaking to do the conversion. I've done this with reasonably good results overall. I used to drive 2 hours each way to work for about 18 months, so I began podcasting, then did a little bit of using Dragon to convert my yammering into text. With a good digital recorder, it worked OK - more of an experiment for me...

    Wishing you less pain and happy un-typing.

    -- James
  • Speech Recognition

    I have had CTS for years and I use Dragon NaturallySpeaking on a Windows laptop for writing longer text. I still use a keyboard for final editing, shorter emails and basic computing. For heavy duty writing, Dragon works great. I wouldn't be able to type longer texts without it. It has also been good for my kids (5th and 7th grade). They cannot type fast, so being able to speak their writing assignments is much faster. I recommend you include Dragon in your tests.
    • Speech-recognition s/w helps a lot

      Absolutely. After enduring CTS-like symptoms for a while, I started using Dragon NaturallySpeaking Professional on my laptop (Win 7) and found it eased my writing activity a lot. It's quite accurate, though one still has to edit one's write-ups. Barring small writing activities like responding to email in brief, I use the s/w, which has greatly benefited me. Wonder why Microsoft and other OS makers can't build good speech-recognition s/w into the OS. Guess the Star Trek days are a long way off....
  • The dictation experience

    I've covered speech recognition for more than a decade as an analyst. I write constantly, having published over 200 monthly issues of my newsletter, Speech Strategy News (a typical issue runs over 20,000 words), and a book, The Software Society, just recently. I don't have RSI problems, in part because I use speech recognition to do rough drafts. The trick in using speech recognition isn't the accuracy; the technology is very good, gets better as it adapts to your particular voice and vocabulary, and the error rate is dropping 18% per year, and will continue to do so indefinitely, according to Nuance Communications' CTO ("Vlad's Law"). The difficulty most people have is that dictation of something one wants to appear in print is an acquired skill. We aren't used to talking the same way we write. It takes a bit of patience to learn to pause to formulate a sentence, rather than stopping in mid-sentence and repeating a different version, for example. The key for me was thinking of it as a way to create a rough draft, to get all my ideas down, and then edit by keyboard. If you worry about each word as you dictate, it isn't very effective for creating large amounts of text. Editing is much less of a strain than the original writing.
  • Nice thought but .....

    Most jobs and job tasks do not lend themselves well to the use of this tech. Add to that the need to seriously edit what you have done. Unless you have a job that fits well (site and structure) you are better off tippity tapping away.
  • Create a lexicon

    As you undoubtedly know - but other readers might not - creating a lexicon upfront and updating it as you go makes the recognition a lot more accurate. The idea is to speak common terms and phrases ahead of dictation (and correct the text that it generates) to make it easier for the software/platform to recognise and transcribe them accurately. This is especially relevant if you're writing in a field that has its own terms and acronyms.

    It's similar to "training" the software to recognise your voice and speaking style, while you train yourself to speak in a way that the software can recognise easily. Different applications use different methods for doing this, of course.