MacSpeech Dictate

I was pretty excited to get a copy of MacSpeech Dictate last week. (It was offered as an upgrade to those unfortunate enough to have owned the developer’s previous VR application, iListen.) This is a new Mac voice recognition program that uses the same engine as the Windows program Dragon NaturallySpeaking. DNS is so superior to any previous Mac voice rec that I was willing to partition my hard drive and buy a copy of XP just to run that one app, fighting through the endless hysteria-tinged security warnings that kept popping up every 2 minutes. That’s how much I love DNS. I haven’t used Dictate much, but it seems to be up to the standards of DNS.

I need all this in order to transcribe thousands of snippets from the many books I use in research. (I also use it to transcribe recorded interviews, although it can’t do that automatically.) I could type it all in by hand, but my accuracy is not great when I’m transcribing from a page, and an old case of RSI tends to flare up if I do it for too long. Booting into Windows to use DNS was great for this simple transcription need, but the software is also designed to control your computer – I just didn’t need to do anything else in Windows besides transcribe with DNS. In OS X, the only command I’ve issued to Dictate so far is “Open TextEdit,” but it worked! And I did manage to dictate several paragraphs into Scrivener itself.

I would be very grateful to hear any further experiences you have with the new MacSpeech application. I’ve ordered the upgrade, too, but haven’t received it yet, so I’m terribly curious how this is going to go. I was also a big fan of DNS, but after a few passes trying to use that unspeakable dog iListen, I gave up trying to dictate to my MacBook Pro entirely.

Naturally, I’d love to use the new version of the application with Scrivener, and your suggestion that it might work is tantalizing. Please do tell more as you gain more experience…

Update: I took McSD out for a serious spin for the first time yesterday, so I can deliver a more complete report now.

It’s not quite “there” in comparison to Dragon NaturallySpeaking. The accuracy and word recognition are terrific – many times I thought, as I was speaking, that I’d muffed my pronunciation enough that the software wouldn’t be able to get the word right, and time and again it surprised me. It makes very few errors.

However, sometimes it does, and sometimes I do, and so the editing capabilities need to be good, too, and that’s where McSD seems a little underdone. For example, in DNS, when you come to an unusual name or word that you suspect the software won’t be able to recognize, you can use voice commands to send DNS into “spelling” mode and spell out the word. McSD does not yet have this capability, though the manual promises that it will in the future.

When there is an error, either human or machine in origin, McSD has only very primitive voice-activated editing capabilities. Basically, if I wanted to change the word “terrific” in the second graf of this post, I need to be able to say something like “select terrific” to get it to select that word for editing. I’m not sure just how this kind of software works, but apparently it needs to “memorize” the whole chunk of text dictated into it so it can search back through what’s already been dictated in order to find any word that needs to be changed.

DNS did this quickly, but McSD reminds me of iListen in that it takes much longer, and you can see the cursor ticking slowly backwards through the text (presumably also back through the sound file) looking for the word. This is a big problem. If, for example, I told it to select “terrific,” but for whatever reason it didn’t recognize my pronunciation of the word, it will search laboriously all the way back to the beginning before stopping. Now, usually the word you want McSD to correct is in the last paragraph or so, but if it doesn’t recognize the word, it keeps going and going, back and back, looking for whatever word it thinks you said. Depending on how long the document is, this can take as long as a couple of minutes, and while it’s happening, McSD seizes control of the cursor – you can’t even switch over to another program to, say, check your email. This is infuriating when you know that the thing is wrong-headed from the start. It’s a drag to have to wait around for a process that might yield some fruit, but it’s crazy-making to be forced to wait and wait for software to do something you already know is pointless.

I think I managed to abort a few of these wild goose searches, but I’m not sure how: I tried a lot of things, like saying “stop” and pressing Escape. The manual (of course!) doesn’t explain how to do it.

One aggravating thing about iListen was that you had to be very careful not to screw up the software’s understanding of where the cursor was in the text – unlike the user, it apparently can’t just look and see. If you used the keyboard to try to correct any of its many errors it could get confused and all voice-directed editing would take place at the wrong spots in the text and turn it into gibberish. I get the impression McSD is some hybrid of that method and whichever one DNS uses to keep track of the dictated text. It seems better able to handle the user resorting to the keyboard occasionally, however, so if you use it, I would recommend just correcting by hand if you can.

I use voice-rec software to dictate out selected passages from books, so that I have searchable, digital notes without having to transcribe by hand. I don’t often dictate extemporaneously (off the top of my head), so as a result, my dictations, while long, have relatively few errors and require relatively few corrections. For someone who uses it otherwise, the rudimentary editing features of McSD might well make it not ready for prime time. For me, the ease of not having to boot into Windows makes up for having to work around McSD’s current “eccentricities.” I don’t mind resorting to the keyboard often, because it’s only extended transcription that bothers my hands. Someone who found typing more difficult would, I suspect, not find it very useful – especially not until they get the spelling feature up and running.

I admit, I’ve not really tried McSD out as a way of issuing commands to my computer – say, as a way of creating, composing and sending email or IMing. I don’t really need it for that, so I can’t testify to its efficacy there. It’s possible that they skimped on the dictation/editing aspects while perfecting the computer-controlling capabilities. I feel hopeful that later updates to the software will deal with some of these deficiencies.

Thanks for that review, that’s very helpful, because it brings up many important points not covered in the NY Times review by David Pogue (check out the funny video, though :wink:).

I’ve now had an opportunity to test the program extensively as well and I’m absolutely appalled. There is no mechanism whatsoever by which you can train the program to accept unknown words. In my view, that fatally cripples it for writers.

If, for example, you have a character in your novel named Rasputin, the program will not recognize the word when you dictate it and you have no way to teach the program the name, so every single time you dictate your character’s name it will be rendered unintelligibly. Maybe you also have a character who is a little potty-mouthed. Try dictating words like asshole, shithead, jerk-off and so forth and so on. Anything and everything that’s not already in the program’s vocabulary is unintelligibly scrambled and you have no way to fix that. Ever. The application does not learn anything and you can’t teach it anything.

Forget the learn-vocabulary training ‘feature.’ It’s a bad joke. You can input documents (although the application only accepts plain text and RTF documents, not Word or anything else), go through an excruciatingly slow ‘analysis,’ and then the application will cheerfully announce that it has added the new words to your vocabulary. What the hell? Unlike DNS, the application doesn’t provide you any way of training the application as to what these new words are, so to say they have been ‘added’ to your vocabulary amounts to an outright fraud.

Trying to correct everything manually is a thoroughly unpleasant experience. Using the built-in Notepad, every attempt at correction turned into a complete mess. Manually deleted text would occasionally reappear somewhere else in the document entirely, and sometimes it wouldn’t delete at all. Believe it or not, when I attempted to insert words manually, the application insisted on deleting the spaces before and after the inserted word and then rendering the added word backwards. Seriously.

It’s all very sad because, as another poster observed above, the program’s recognition accuracy is really extraordinary overall, but clearly the full application isn’t ready for release. Still, they did release it and they conned at least some of us out of $89 for it. For the time being, all we can do is admit that we’ve been had. Eventually perhaps they’ll finish the damned thing and we can upgrade what we have now into a useful application, but buying this piece of half-finished junk they’re trying to foist off on us now would be a serious mistake for any writer of fiction.

I’ve had problems with MacSpeech Dictate, and found ways to work around some of its limitations.

The lack of a spelling mode is a large drawback, but in a pinch the NATO phonetic alphabet can be used with moderate success.

While it learns vocabulary from reading text samples, it doesn’t learn to pronounce the words. I have to use a very deliberate and “literal” pronunciation (even if it is not correct) and be careful if there are homonyms or similar-sounding words.

Editing in the MacSpeech text window should be avoided. Dictate into that window, then copy/paste the text into your target application and do your edits in the target app. Don’t dictate directly into, say, a Scrivener window. Apparently the program gets confused by two input methods (microphone and keyboard) acting on the same window.

For higher success rates on word recognition, calibrate the microphone before every session and more often if you are running the program for a long period.

Speak deliberately, neither fast nor slow. If I pace myself carefully, the program can do some remarkably accurate transcription, recognizing proper names that it could not possibly know and was never trained on.

I’ve just been using the program to dictate a few e-mails and web postings, including this one. (A stiff muscle in my shoulder has been acting up, and I’m trying to baby it.) I really haven’t had that much trouble with switching between dictation and keyboard. Of course, I’m only dictating relatively short passages at a time. Every so often, it gets confused, but I quit Dictate and restart and then everything is fine. I think MacSpeech Dictate is a frustratingly inadequate program when compared to Dragon NaturallySpeaking, but if the standard you’re judging it by is iListen, it can seem like a godsend. This is the first time I’ve been able to use a dictation program to control the actions of applications like Mail or Safari, and I’m pretty pleased with the results. As I said before, people who are truly disabled are likely to find this application frustrating to use, but if you just need to lay off the keyboard for the most part (and maybe just for a while now and then), it’s wonderful to have this option in the operating system you use every day. That said, you really don’t want to do much editing in it, as that seems to be what confuses it.

It looks like the developers have launched an update to address some of the failings identified in this thread: http://www.macspeech.co.uk/article_info.php?articles_id=306

H

I’d be very interested to hear from MacSpeech users whether the new features make it more useable (if and when you have time to try it and to post about it, of course).

I am eager to try the update, which does sound like a significant improvement. Unfortunately, I haven’t been able to use Dictate in a couple of months. This could be due either to a malfunctioning microphone/USB adaptor or a weird new problem I’ve been having with Leopard in which my permissions keep mutating. (Time Machine is also nonfunctional.) After consulting with someone at the Genius Bar, I decided the only way to get to the bottom of it is to wipe my disk entirely and reinstall everything, so I’m waiting until I clear a few deadlines first. The first thing I’ll try after that is Dictate, but I may still need to order a new mike adaptor if that’s where the problem lies. Those adaptors are expensive!

Thanks to Hugh @all for noticing the new update. It looks like this new version (1.2) is quite buggy. To get it to work at all you need to delete the prefs file and create a new profile. It’s OK with dictation, but command mode seems not to work. MS responds to a mere handful of commands. Maybe other users are having better luck?

Yeah, I had some trouble with the installation until I worked out that my old profile wasn’t being recognized and created a new one. I’m sorry to hear that probably isn’t the extent of the problems. I’m really beginning to wonder if these clowns at MacSpeech are worth the time I have wasted with them. Lord, I miss NaturallySpeaking. Why are these MacSpeech people having so much trouble releasing an application with a level of functionality anywhere close to the one that has been available to Windows users seemingly forever?

I’ve heard that 1.2 is the first version to use the Dragon engine, so it should be markedly improved. (Haven’t tried it yet myself, though it’s on my ‘thinking about’ list. Right now I just pull out the ole Vista machine and use Dragon NS when I want to dictate, which is, of course, the main reason Dictate is high on my ‘thinking about’ list. :stuck_out_tongue: )