Unique to Scrivener Text-to-speech features

I’ve heard that 2.0 will offer text-to-speech like it Ought To Be Done (meaning starting at the cursor). Marvelous! Every excuse I can find, I complain to Apple about that lack in Cocoa Text. If I keep it up, they’re likely to buy me “any Windows PC you want,” just to get rid of me.

I’d like to offer some suggestions as to how it’s implemented, suggestions that users are likely to love and that will set Scrivener apart. They should also make Scrivener the application to have for those who are visually impaired.

  • The same keyboard command starts and stops text-to-speech. That’s one less thing to remember. Ditto using the same icon in the toolbar to start and stop. In 1.5 the icon starts the reading again at the beginning.

  • Pause and restart reading with the spacebar, like iTunes. As MacFixIt just reported, with 10.5, text-to-speech seems to have gotten a little erratic. The new guy, Alex, may sound great, but sometimes his tongue slips. This would let us pause to see if what we thought we heard really needs correcting. Often it doesn’t.

  • Jump back one sentence every time Tab or a like key is pressed. Option-Tab might jump back a paragraph. Shift-Tab might jump forward a sentence. Shift-Option-Tab forward to the next paragraph.

  • This might be too much trouble, but the ability to read punctuation characters aloud would be helpful, particularly when proofing quoted or OCRed text. I’ve gotten to where I can spot the difference in pause between a comma and a period, but other distinctions escape me.

  • Display text as it is read in a separate, newly opened window showing 2-3 lines of enlarged text, with the words highlighted as they are read. It might even make sense to allow correction and an easy resume of text-to-speech from that window. That’d mean not having to find where we are in a screenful of text. Again, this is something the visually impaired should particularly like.

  • If Apple provides the tools, take advantage of the headphone buttons the new MacBooks have when used with Shuttle-like headphones. Pause, jump ahead and jump back, that sort of thing. If Apple allows it, I’d suggest using the louder and softer buttons to speed up and slow down playback. That might be more useful, since we already have volume controls on the keyboard in front of us.

  • Only you know how much work this last item would be, but script writers might love a Smart Read feature for scripts. Instructions and descriptions would be read in a different voice. A writer might even be able to specify in advance which voice is used for the major characters, with the character caption being dropped. It wouldn’t be as good as hiring professional actors to read the lines, but it’d be better for establishing authenticity than trying to imagine voices in your head.

If others have ideas, feel free to post them here.

–Michael W. Perry, Untangling Tolkien


I’m afraid all that is being implemented is that the current speech feature will start at the cursor position if there is no selection. All of the other features you suggest would pretty much involve creating a separate text-to-speech application in itself. Scrivener will still be relying on the OS text-to-speech system, which isn’t capable of these things. Sorry!

All the best,

However, you could send those feature suggestions to Apple… they aren’t exactly responsive to suggestions, but they definitely can’t respond to something they don’t see.

That’s OK. You’re implementing what really matters–starting the reading at the cursor. All the rest is icing on the cake. It’s especially important to get quotes in historical books right. That’s why it matters to me.

As you suggested, I edited and reposted the suggestions to Apple for inclusion in Cocoa Text. And since text-to-speech is especially helpful for the visually impaired, I sent an email to their accessibility team. It seems the most open to new ideas of this sort. Earlier, I sent them a series of suggestions for an interface that would use swiping Morse code to enter text on the iPhone. The primary ‘tricks’ to that interface were:

  • Swiping instead of touching. A short swipe for a dot, a long one for a dash.

  • Changing from left-to-right swipes to up-to-down swipes to distinguish between letters. That way, timing is irrelevant, unlike regular Morse.

  • Swiping the other direction, i.e., right to left or down to up, for uppercase.

  • As an alternative, two-finger swiping could be used to distinguish dots from dashes or lowercase from uppercase.

  • After a certain user-set delay, the last series of swipes becomes a letter.

  • Various easy motions, diagonal and circular, for common punctuation combinations such as comma+space and period+space. Users could also set certain gestures to input a longer string, such as their phone number.
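For anyone who wants to play with the idea, the core swipe rules above can be sketched roughly as follows. This is only a minimal illustration under my own assumptions, not anything Apple offers: the `Swipe` class, the `decode` function, and the `LONG_SWIPE` threshold of 60 points are all hypothetical names and values, and the user-set delay, two-finger variant, and punctuation gestures are left out.

```python
from dataclasses import dataclass
from itertools import groupby

# International Morse code for the 26 letters.
MORSE = {
    ".-": "a", "-...": "b", "-.-.": "c", "-..": "d", ".": "e",
    "..-.": "f", "--.": "g", "....": "h", "..": "i", ".---": "j",
    "-.-": "k", ".-..": "l", "--": "m", "-.": "n", "---": "o",
    ".--.": "p", "--.-": "q", ".-.": "r", "...": "s", "-": "t",
    "..-": "u", "...-": "v", ".--": "w", "-..-": "x", "-.--": "y",
    "--..": "z",
}

LONG_SWIPE = 60.0  # hypothetical threshold: shorter is a dot, longer a dash


@dataclass(frozen=True)
class Swipe:
    """One swipe as a displacement in screen coordinates (y grows downward)."""
    dx: float
    dy: float

    @property
    def horizontal(self) -> bool:
        return abs(self.dx) >= abs(self.dy)

    @property
    def length(self) -> float:
        return abs(self.dx) if self.horizontal else abs(self.dy)

    @property
    def backward(self) -> bool:
        # Right-to-left or down-to-up: the "other direction" used for uppercase.
        return (self.dx if self.horizontal else self.dy) < 0


def decode(swipes):
    """Turn a sequence of swipes into text.

    A change of axis (horizontal vs. vertical) marks a new letter,
    so no timing information is needed.
    """
    out = []
    for _, run in groupby(swipes, key=lambda s: s.horizontal):
        run = list(run)
        code = "".join("-" if s.length >= LONG_SWIPE else "." for s in run)
        letter = MORSE.get(code, "?")  # "?" for an unrecognised pattern
        if run[0].backward:
            letter = letter.upper()
        out.append(letter)
    return "".join(out)
```

With these assumptions, four short right-to-left swipes followed by two short downward swipes would decode as "Hi": the backward horizontal run gives an uppercase H, and the switch to the vertical axis starts the i.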

Keep in mind that Morse was designed so that common characters are simpler and shorter than less common ones. Tested against thumb keyboards, Morse is faster for IMing. Since touch keyboards are slower than thumb keyboards, Morse should be the fastest possible interface.

–Mike Perry, Seattle

I have just one request for this feature –

The option of a British accent.

Whilst using the text-to-speech for my MS, it drives me absolutely batty when any of my characters say “neether” instead of “neither”… in my twisted little mind it sounds like they’re mispronouncing “nether”, which is an entirely different word, in any context.

[please-oh-please-oh-please, with big wet puppy eyes]

Er, I’m afraid I have no control over the voices; they are built into OS X. :) Would love an English accent myself, though. Jobs!


Did they get RTF right? doc export? docx export? Pages support?

If you think american actors can’t do an english accent correctly, and then consider the above, what gives you the slightest hope they will get the king’s accent correct?

more than likely the only person who would understand it would be vic-k. Birds of a feather and all¥.

¥ which means the voice will talk drunken stockportian.

~ whimper ~

Dang. I was really hoping there might’ve been something packed in Scrivener that’d sound a tad more civilised. I did find the other ‘voices’ [do I have to take a pill for that?], but they all still say “neether”.