NaNo word count fixer?

Are there any hooks in Scriv 2.0 to send the output of Compile Draft through an external script? I doubt I’ll get to it this year, but I was thinking I might figure out how to make the NaNoWriMo word count match Scrivener’s by manipulating the text (maybe replace all hyphens with spaces, and maybe do the same for most punctuation).

Any advice on this kind of script? Also, is Scrivener’s word counter pretty accurate? For instance: would it count this ->****** as a word? How about an isolated comma and quote like the following? “That’s no moon ,” he said. Any other quirkiness with regards to word counting in Scrivener version 2?

What is your level of computer skill? Are you comfortable writing a Perl script? If so, Scrivener’s hooks (and this is for 1.0 as well, incidentally) are in fact the MMD engine. All Scrivener is doing is creating an MMD file, and then feeding that to external UNIX scripts to produce the final result. If you know where to intercept, you could insert your own custom script in here that replaces MMD.

Otherwise, if you are fluent in AppleScript, I’d recommend going that route as it would be far easier. You can just set up a dedicated compile folder and attach a script to that folder which cleans your documents as you compile them.

But if I remember correctly, Scrivener’s counter is a touch more strict than Nano’s (which is fairly loose). That means if you go by Scrivener’s project goal, you’ll most likely end up going over the 50,000 words by a bit. You can, of course, check these other questions by typing the examples into Scrivener and seeing whether the project target increments. :slight_smile: The ones you posted don’t increment for me in either version.

Also, I think both the Nano engine and Scrivener count hyphenated words as two, so there would be no need to replace these with spaces in order to increase the word count.

I’ve done a lot of programming in PERL (years ago), but with a specific sub-set of perl modules, so my skill set has some large and very odd omissions & weaknesses. I rarely created CGI scripts for instance, and have never used any XML perl module. Still, I’m up for a challenge if I can find the starting point.

My experience with Scriv vs. NaNo is that when the NaNo word count validator takes my pasted text from either an Edit Scrivenings or a compiled RTF file, my word count drops. By the time I’m near the end, I have to get Scrivener up to about 51k in order to get my NaNo word count up to 50k. Someone, somewhere said that the NaNo word counter is just the Unix “wc -w” utility behind the scenes, which counts hyphenated words as one, not two words. And we all know that the internet is never wrong.

Is there some documentation I can look at for intercepting the MMD output with a perl script? I must admit that I’m not terribly familiar with how to make PERL interact with other programs unless I’m piping their STDOUT stream into my script on the command line. But there’s no time like when you’re writing a 50,000 word novel in 30 days to learn some new programming tricks. :stuck_out_tongue:

Thanks!

The NaNo official word counter is either wc -w or something that counts words exactly like it. My local wc always gave me exactly the same result as the nano upload counter, every year I tried it (since 2003).
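
For anyone who wants to see what that counting rule means in practice, here is a minimal Perl sketch of whitespace-delimited counting, which is how wc -w counts. The helper name count_words and the sample sentence are made up for illustration; this is not the validator’s actual code.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Count words as maximal runs of non-whitespace characters, like `wc -w`.
# count_words is a hypothetical helper name used only for this sketch.
sub count_words {
    my ($text) = @_;
    my @words = split ' ', $text;    # split ' ' ignores leading whitespace
    return scalar @words;
}

my $sample = "A well-known half-truth ,\" he said.";

# Hyphenated words count once each; the stray ," counts as a word too.
print count_words($sample), "\n";

# Same text with every hyphen turned into a space.
(my $spaced = $sample) =~ tr/-/ /;
print count_words($spaced), "\n";
```

Under this rule each hyphenated pair is one word, so swapping hyphens for spaces raises the count by one per hyphen.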

Okay perfect. The way Scrivener works with MMD is all about pipes, both STDIN and STDOUT, so all you need is a script that takes the compiled plain-text, runs a few regexps to clean out the punctuation, and spits the result back out so that Scrivener can dump the final file in the specified location.

The first thing you need to do is download a fresh copy of MultiMarkdown and install it into your [b]~/Library/Application Support[/b] folder. Scrivener is set up to use that folder instead of its internal copy, if one exists, and this will make it easier for you to edit the guts without poking around in Scrivener.app.

Once you unzip the distribution, the folder you want to navigate to within it is [b]bin[/b], which contains all of the control scripts that handle the MMD system. Scrivener just calls one of these, depending on the compile format type, and, as I said, pipes stuff in and gathers the results out. Thus all we need to do is choose one of the control scripts and subvert it into something else entirely.

Any of them are fine, especially if you have no intention of using MMD in the future. If you do, then I’d pick RTF or LaTeX, depending on whether or not you’d ever intend to use either of those. So depending on that selection, you want to create a backup of [b]mmd2LaTeX.pl[/b] or [b]multimarkdown2RTF.pl[/b], and then replace the contents with something very simple like:

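[code]#!/usr/bin/env perl
use strict;
use warnings;

# Test marker, followed by everything Scrivener pipes in on STDIN.
my $data = "Testing MMD over-ride:\n\n";
$data .= join '', <>;    # list context reads every line, not just the first

# Do stuff to $data

print $data;[/code]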

Now, if you compile your draft to MMD->LaTeX (or RTF) and get your manuscript compiled out completely, with the test line at the very top, you know you did everything right, and now all you have to do is build and test your data manipulation code.

That’s it! Once you have that set up, you should be able to compile straight out of Scrivener and get an optimised draft for Nano.

One thing you may want to address in your data manipulation is removing hashmarks. If you export titles, Scrivener isn’t aware that you aren’t actually using MMD, so it will wrap each title in a number of hashmarks equal to its outline depth. You’ll most likely need a regexp that strips those out, as they might otherwise inflate the word count. Titles look like this in MMD:

[code]# Level one #

## Level two header ##

### Level three header ###[/code]

Build the regexp to handle any number of hashmarks on the front and back.

s/^#+\s(.+)\s#+$/$1/mg

Ought to do the trick.
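
As a rough sketch of how that substitution behaves on a whole compiled document, here is a small self-contained test. The helper name strip_heading_marks and the sample text are assumptions for illustration, and the pattern is loosened slightly so that a title without trailing hashmarks is handled too.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Remove MMD-style hashmarks from the front and back of title lines.
# strip_heading_marks is a hypothetical helper name for this sketch.
sub strip_heading_marks {
    my ($text) = @_;
    # /m makes ^ and $ match at each line; /g repeats for every title.
    # Trailing hashmarks are optional in this variant.
    $text =~ s/^#+[ \t]*(.*?)[ \t]*#*$/$1/mg;
    return $text;
}

my $sample = "# Level one #\n\nBody text here.\n\n### Level three header ###\n";
print strip_heading_marks($sample);
```

Lines that do not begin with a hashmark pass through unchanged, so ordinary body text is left alone.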

The only caveat with this trick is that Scrivener 1.x will force extensions on you. You’ll have .tex or .rtf files even if they are actually .txt files. Just fix it in the Finder and you’ll be okay. Though, by the time you’ll be doing real compiles in late November you’ll be using 2.0, which lets you supply your own extensions easily.

That was incredibly enlightening. Thank you! I’ll have to start experimenting post-haste, right after I get done reading my research materials for my NaNoWriMo story.

Oh, one other thing I forgot to mention is that Scrivener will insert a bunch of meta-data at the top. Much more in 1.x than in 2.0. You can either choose to handle this programmatically or just edit it out every time you compile. Since it will always be predictably at the top, it would be easy either way. The key thing to search for is the first empty line in the document. Anything above that can be assumed to be meta-data and discarded.
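
Handled programmatically, the "first empty line" rule could look something like the sketch below. The helper name strip_metadata and the sample header keys are invented for illustration; check them against what your own compile actually produces.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Discard everything up to and including the first blank line.
# strip_metadata is a hypothetical helper name for this sketch.
sub strip_metadata {
    my ($text) = @_;
    # /s lets . cross newlines; the lazy .*? stops at the first blank line.
    # If there is no blank line at all, the text is returned unchanged.
    $text =~ s/\A.*?\n[ \t]*\n//s;
    return $text;
}

my $sample = "Title: My Novel\nAuthor: Me\n\nIt was a dark night.\n\nThe end.\n";
print strip_metadata($sample);
```

One caution with this approach: if a compile ever comes out without any meta-data, the first paragraph of the manuscript would be discarded instead, so it is worth eyeballing the top of the output now and then.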

I have something that works, though it’s a little clunky. I have attached the script to this post. When you download it, remove the “.txt” extension (the forum software wouldn’t allow me to upload a .pl file), then drop it into ~/Library/Application Support/MultiMarkdown/bin, which will REPLACE the original RTF exporter. Back up the original, or be prepared to download a fresh copy of MMD to get the original functionality back.

The script does 3 things: It strips out the header that Scrivener adds, then it replaces all letters with the ‘x’ character (but not numbers, just to leave clues for debugging if something odd occurs), and finally, it strips out the hyphen.
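
For readers who don’t want to download the attachment, the three steps can be sketched roughly as follows. This is not the attached script, just an illustration of the same idea; the helper name obfuscate is hypothetical, and replacing each hyphen with a space (rather than deleting it) is an assumption made so that hyphenated words count as two under wc -w.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Sketch of the three steps described above; obfuscate is a hypothetical name.
sub obfuscate {
    my ($text) = @_;
    $text =~ s/\A.*?\n[ \t]*\n//s;   # 1. drop the header (up to first blank line)
    $text =~ tr/A-Za-z/x/;           # 2. ASCII letters become x; digits are kept
    $text =~ tr/-/ /;                # 3. hyphens become spaces (assumption)
    return $text;
}

my $sample = "Title: Draft\n\nA well-known 2nd act.\n";
print obfuscate($sample);
```

Inside the replaced control script, the same function would be fed the piped-in text, for example with print obfuscate(join '', <STDIN>). Note that tr/A-Za-z/x/ only touches ASCII letters, so accented characters and curly quotes pass through as they are.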

I tested this on a long manuscript and compared the output of wc -w to the word count Scrivener reported. It was off by 1 word. I’m not sure why, since I did remove all of the header, but I’m calling it “good enough” until I can use it with the official NaNoWriMo word count validator. I tried stripping out other punctuation, but that ended up inflating the word count well beyond what Scrivener reported, so I went back to leaving those symbols alone.

Though I may never touch this code again, potentially cool features might include: a way to process special MMD headers added in Scrivener that tell the script to produce the real RTF output alongside a second file with the text “fixed” and obfuscated; automatic upload of the obfuscated text to the NaNoWriMo word count validator; and a fancier obfuscator that substitutes random Latin words instead of x’s.

Anyway, if you have a manuscript you want to test this with, give it a whirl and see if the output and the Scrivener word counts jibe (or come within 1 word at least).
multimarkdown2RTF.pl.txt (1.05 KB)