Text Timings

Have searched but can’t find anything.

Does Scrivener for Mac analyse the written text and work out timings, as if you were reading it aloud?

Ulysses does this - can’t find any info on whether Scrivener does this.

Thanks all.

Just a Scriv user, but I don’t think Scriv has a function like this.

Any idea how Ulysses’ algorithm works? Are they doing anything more sophisticated than dividing the word count by, say, 275?


It doesn’t sound too algorithmic. According to the U web site, “Average reading time is based on a reading speed of 4 words per second. Fast uses 5 words per second, slow uses 3. Read aloud time is 2 words per second.”
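For anyone who wants the same numbers without Ulysses, here is a minimal Python sketch of that arithmetic. The rates are the ones quoted above; the function names and the whitespace-based word count are my own assumptions, not anything Ulysses or Scrivener documents.

```python
# Rates quoted from the Ulysses web site, in words per second.
RATES_WORDS_PER_SECOND = {"average": 4, "fast": 5, "slow": 3, "aloud": 2}


def reading_time_seconds(text: str, mode: str = "aloud") -> float:
    """Estimate reading time by dividing the word count by a fixed rate.

    Assumption: a "word" is any whitespace-separated token.
    """
    words = len(text.split())
    return words / RATES_WORDS_PER_SECOND[mode]


def format_mmss(seconds: float) -> str:
    """Render a duration in seconds as minutes:seconds."""
    total = round(seconds)
    return f"{total // 60}:{total % 60:02d}"


if __name__ == "__main__":
    sample = "word " * 600  # stand-in for a 600-word draft
    print(format_mmss(reading_time_seconds(sample, "aloud")))
```

So it really is just word count divided by a constant, which is why it can only ever be a rough guide.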



Print it on A4 paper with 1 inch left and right margins using Times New Roman regular 12pt; allow 5 seconds per line at normal recording speed. I spent 13 years recording voice-overs of text translated from Chinese for videos and found that was pretty accurate. I needed to match the English voice-over to the length of the Chinese voice-over, though on one critical recording I resorted to counting syllables, editing each English paragraph to match the number of syllables in the original Chinese paragraph.

Scrivener has no idea of line length until you compile; words can be many syllables or only 1 syllable long, syllables range between roughly 1 and 8 letters, so counting “words” doesn’t help—“antidisestablishmentarian” is one word, so is “a” and so is “scrounge”. So there is no way Scrivener can know how long something will take to say, as the line length on screen may change with changes to your window, for instance with hiding or showing the binder or inspector.

I would be very suspicious of what Ulysses is doing, for the same reasons. If you really want a rough idea immediately, use Text-to-Speech and time how long that takes; but I’ve never used Text-to-Speech either, so I don’t know how accurate that would be.


We could adapt Mark’s page-up method to a calculation in the following way: Assuming character count is a more stable count per line than words, and given an average of 78.5 characters per line (12pt Times New Roman on A4 w/ 1" margins), Mark’s five seconds per line translates to 15.7 characters per second.

Thus, if you take the character count of your text and divide by 942, you should get a Xiamenesian approximation of the number of minutes it takes to read your text aloud.
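A rough sketch of that calculation in Python, in case it helps. It derives the divisor from the two inputs (characters per line and seconds per line) rather than hard-coding it, so you can swap in your own measurements. The names are hypothetical, and counting every character including spaces is my assumption.

```python
CHARS_PER_LINE = 78.5    # 12pt Times New Roman on A4 with 1" margins (from the post above)
SECONDS_PER_LINE = 5.0   # Mark's rule of thumb at normal recording speed


def read_aloud_minutes(text: str) -> float:
    """Estimate read-aloud time in minutes from the character count.

    Assumption: every character counts, including spaces and punctuation.
    """
    chars_per_minute = CHARS_PER_LINE / SECONDS_PER_LINE * 60
    return len(text) / chars_per_minute


if __name__ == "__main__":
    draft = "x" * 4710  # stand-in for a draft of 4,710 characters
    print(f"{read_aloud_minutes(draft):.1f} minutes")
```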


For the reasons above, one of the three or four absolutely “must have” requirements for someone aspiring to write for TV or radio used to be the ability to count. As the great British journalist Nicholas Tomalin once noted, it was also essential if you wished to write for newspapers (alongside, amongst other absolute necessities, rat-like cunning, a plausible manner and “a little literary ability” - note the “little” :slight_smile: ).

Now it’s easier. Technology does the job for you. And, at least for TV and radio, in my experience precision is not always required. The speaker, if he or she is moderately experienced, knows to speed up or slow down - up to a point. Or the Autocue/Teletext equipment can be dialled up - slightly. 8)

You could try it. My metric was a good rule of thumb, but I often had to edit on the fly to overcome serious disparities.

However, the best metric for assessing timing is syllables, as in most languages they are spoken at roughly the same speed, English being somewhat bizarre from this point of view. So while characters-to-lines is feasible, characters-to-syllables does not equate, as I tried to show … “splurge” perceptibly occupies the same time as “hug”. Numbers also throw everything out … 7,777,777 is only 9 characters long including the commas, but it’s ‘seven million seven hundred and seventy seven thousand seven hundred and seventy seven’ when read, i.e. 27 syllables. Not that that particular number is likely to crop up often, but on some recordings I did have to deal with figures like that.
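To make the characters-versus-syllables point concrete, here is a crude Python sketch that counts syllables by counting groups of vowels. This is a common rough heuristic, not a real pronunciation model: it will miscount plenty of English words, and it knows nothing about digits, which would have to be spelled out first. The function names are hypothetical.

```python
import re


def estimate_syllables(word: str) -> int:
    """Crude estimate: one syllable per run of vowels (a, e, i, o, u, y)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing "e" as silent ("splurge"), except in "-le" ("table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def count_syllables(text: str) -> int:
    """Sum the estimates over every alphabetic word in the text."""
    return sum(estimate_syllables(w) for w in re.findall(r"[A-Za-z]+", text))


if __name__ == "__main__":
    for word in ("splurge", "hug", "scrounge", "a"):
        print(word, len(word), estimate_syllables(word))
```

All four of those words come out as one syllable despite running from 1 to 8 letters, which is the mismatch described above.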

Also, the way things worked in China was that the video was created complete with its Chinese commentary, and very often I had to use a dubbing machine to match each paragraph in English to the existing Chinese recording. Chinese is easy to calculate, as each character occupies exactly the same space on the line, and also represents a single syllable. But in translation, English basically takes 1.5 times as long to say the same thing as Chinese, so a lot of careful editing was necessary.

As it happened, some of the videos I had done the English soundtrack on for the TV Station were re-edited some time later, with changes to the Chinese base, including the one where I resorted to counting syllables (a 20 minute video!). Whoever they got to do it clearly had no knowledge of Chinese, knew nothing about these issues of Chinese vs English, and ended up gabbling his version … ghastly!

But that is off-topic really. GR’s ‘Xiamenesian approximation’ might well work for @chilledstorm, depending on how accurate he needs to be, but on that I defer to Hugh, who knows more in terms of what is needed here.


This is a very patchy thing to aspire to.

To give two examples:

  1. I’m listening to the audiobook of Rupert Sheldrake’s “The Science Delusion”, read by the author. His reading speed is quite deliberate and measured, you might almost say “slow”.

  2. There’s a well known BBC weather forecaster who gabbles the forecast so quickly it’s hard to catch or retain what he’s saying. At a rough estimate I would say he’s speaking at least twice as fast as Sheldrake.

Why not time yourself reading a fixed number of words - say 300 - and then use that as the multiplier for your ongoing word count?
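As a sketch of that idea in Python: time a sample once, turn it into seconds per word, then multiply by whatever your word count is. The function names are hypothetical, and the 120 seconds in the example is a made-up figure, not a measurement.

```python
def seconds_per_word(sample_words: int, sample_seconds: float) -> float:
    """Personal multiplier from one timed read-through of a sample."""
    if sample_words <= 0:
        raise ValueError("sample_words must be positive")
    return sample_seconds / sample_words


def estimate_seconds(word_count: int, multiplier: float) -> float:
    """Projected read-aloud time for a text of word_count words."""
    return word_count * multiplier


if __name__ == "__main__":
    # Hypothetical calibration: a 300-word sample that took 120 seconds to read aloud.
    multiplier = seconds_per_word(300, 120.0)
    print(f"{estimate_seconds(1500, multiplier) / 60:.1f} minutes for 1500 words")
```

The advantage over a fixed rate is that the multiplier reflects your own delivery, whether that is measured or gabbled.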