Discrepancy between e-book and PDF word count

I thought about the TOC before posting, but with about fifty chapters, that amounts to about a hundred new “words” (the chapter number to the left and the page number to the right).

But since we’re talking about a discrepancy of (in my case) about 900 words, I just can’t find anything that would change the count by that much.

In fact, I’ve got entire chapters of that length. So it’s hard not to worry there’s a chapter missing in the PDF… :open_mouth:

OH. Edited to add: Sorry for posting this in the Windows thread; I’m on Mac, I just did a search for word count and didn’t realize I had ended up in the wrong platform. Is this related to something else on Mac?

Wouldn’t comparing the chapter counts at the end of each respective book put that to rest? If the Mobi ends on “Chapter Fifty: The End” and so does the PDF, then they are all there. There isn’t anything in the compile settings that can change content level choices via a mere “Compile For” switch. Even changing the entire preset would not do that. Presets do not store project specific settings in the Contents pane.

Wouldn’t that negate the purpose of using a computer?

Sorry, couldn’t resist. :smiling_imp:

But seriously: no. My chapters consist of scenes (documents inside folders) and some chapters are built of ten or more of those. If a couple of scene documents inside chapter folders were to be dropped, it wouldn’t show in the (manual) chapter count.
I could obviously compare the different exports page by page, but I don’t think either one of us considers that something one should need to go through. Especially not when using software as well thought-out and stable as Scrivener. Trust me, it’s actually the reliability of the program that makes discrepancies like these so much more scary. (In, say, Final Draft they would be more or less expected.)
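For anyone who does want to rule out dropped scenes without reading page by page, the comparison can be automated. Below is a minimal sketch, assuming the text of both exports has already been extracted to plain text by some other tool, with paragraphs separated by blank lines. The function names and the sample strings are hypothetical, and nothing here is part of Scrivener.

```python
import difflib


def paragraphs(text):
    """Split plain text into paragraphs, collapsing runs of whitespace."""
    return [" ".join(block.split()) for block in text.split("\n\n") if block.strip()]


def unmatched_paragraphs(reference, candidate):
    """Return paragraphs of `reference` that have no counterpart in `candidate`.

    Paragraphs that were altered (not only dropped) are reported as well,
    so treat the result as a list of places to look, not as proof of loss.
    """
    ref = paragraphs(reference)
    cand = paragraphs(candidate)
    matcher = difflib.SequenceMatcher(a=ref, b=cand, autojunk=False)
    unmatched = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            unmatched.extend(ref[i1:i2])
    return unmatched


# Hypothetical stand-ins for text extracted from the two exports.
epub_text = "Scene one.\n\nScene two.\n\nScene three."
pdf_text = "Scene one.\n\nScene three."

print(unmatched_paragraphs(epub_text, pdf_text))  # ['Scene two.']
```

An empty result would suggest that every paragraph of the e-book also appears in the PDF, which points to a counting difference instead of missing content.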

I sort of feel like you dodged the bullet here: Is a nine-hundred word discrepancy expected/reasonable?
Even though compiled with the exact same settings, only to a different file format/medium?

As I said, “There isn’t anything in the compile settings that can change content level choices via a mere ‘Compile For’ switch. Even changing the entire preset would not do that. Presets do not store project specific settings in the Contents pane.”

So, no, that wouldn’t be reasonable or expected :slight_smile: At least in theory. It’s really difficult to say what is going on, though, with only the information that there is a 900 word difference between ePub and PDF. There are too many potential factors that could be involved, especially since none of them should be factors to begin with, meaning that large portions of the software are open to being suspect. I might be overlooking something, too.

Would you mind sending a copy of the project to support for testing? You use Save As to make a new copy, then scramble the contents by using Project Replace to swap out common letters if need be. Just refer to this thread URL in the e-mail.

Ok, I tested this a little, and the count is always higher in the epub if you’ve set the word count (in the compile Statistics) to include all text. I compiled a document of roughly 100,500 words to several formats and got a matching <$wc> count for PDF and RTF with an inflation in the epub, so I wouldn’t worry about sections being missing from the PDF. Stripping this down, I’ve just done a couple of runs with a blank project containing unnamed documents with nothing but the <$wc> tag (i.e. each document has only that single word), and each document increments the count by four rather than by one as it does in a PDF. This is true whether or not I generate an HTML TOC and regardless of what I name the documents, so I have yet to figure out what extra text is getting included.

At any rate, try switching the word count statistics to include main text only, and see if that gives you more comparable counts for PDF and epub. When I did this on the large project, my epub and PDF counts matched exactly.
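As a rough sanity check on those numbers: if the blank-project result generalizes (an assumption, since it was only observed on single-word documents), each compiled document adds three more words to the epub count than to the PDF count. The sketch below works out how many documents it would take to produce a given gap; the function name is hypothetical.

```python
def estimated_document_count(discrepancy_words, epub_increment=4, pdf_increment=1):
    """Estimate how many compiled documents would explain a word-count gap.

    Assumes every document inflates the epub count by the same fixed amount,
    as seen in the blank-project test (4 per document in epub, 1 in PDF).
    """
    extra_per_document = epub_increment - pdf_increment
    return discrepancy_words / extra_per_document


print(estimated_document_count(900))  # 300.0
```

Around 300 scene documents is within reach for a project of about fifty chapters where some chapters hold ten or more scenes, so a per-document inflation could plausibly account for a 900 word gap.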

When is the $wc value being calculated? Before or after transforming the text into the target format? I ask because HTML (which I believe is the basis for ebooks) has a lot of tags and other stuff added that could be counted as words, since it’s essentially plain text, as opposed to word processing markup, which Scrivener natively understands and so wouldn’t count as “words”.
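To illustrate the general effect being asked about, here is a minimal sketch comparing a naive whitespace word count taken on raw HTML with one taken after the markup is stripped. It says nothing about how Scrivener actually computes <$wc>; the sample markup and function names are made up for the example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects only the text content of an HTML fragment, ignoring tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def count_words(text):
    """Naive word count: split on any run of whitespace."""
    return len(text.split())


def count_words_without_markup(html):
    """Strip tags first, then count words in what remains."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Joining with a space keeps adjacent paragraphs from merging into one word,
    # at the cost of splitting a word that has inline markup in the middle of it.
    return count_words(" ".join(parser.chunks))


plain = "It was a dark and stormy night."
wrapped = '<p class="body">It was a <em>dark</em> and stormy night.</p>'

print(count_words(plain))                   # 7
print(count_words(wrapped))                 # 8
print(count_words_without_markup(wrapped))  # 7
```

In this sample, the attribute on the opening tag adds one whitespace-separated token, so a count taken after conversion to HTML comes out higher than one taken on the underlying text.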