User coding an alternate Text To Speech approach? (Anyone? L&L Scriv Devs?)

Julian_M1 · October 30, 2022, 2:09pm

The built-in text-to-speech system works, but I would like to use external tools.

Rationale: some proofreading can be done as a background task by simply listening to the text read aloud, if the reading is of sufficiently high quality. The built-in TTS is fine for small snippets (it has its own dialog) but not for e.g. whole chapters.

I can hack my way through Python, for example, and have discovered that Google’s gTTS library produces good output, and that if I want to do it all locally I could use something like TensorFlowTTS: a bit more effort is required, but the samples contain amazingly high-quality output.
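A minimal sketch of the gTTS route, assuming the text is already available as a plain string. The helper names `split_into_chunks` and `text_to_mp3_files`, the `part_NNN.mp3` naming, and the 4000-character chunk size are hypothetical choices for illustration; only `gTTS(text=..., lang=...)` and `.save()` come from the library itself, and they need network access.

```python
from pathlib import Path


def split_into_chunks(text: str, max_chars: int = 4000) -> list[str]:
    """Group blank-line-separated paragraphs into chunks of at most max_chars.

    A single paragraph longer than max_chars is kept whole rather than
    being cut in the middle of a sentence.
    """
    chunks: list[str] = []
    current = ""
    for paragraph in (p.strip() for p in text.split("\n\n")):
        if not paragraph:
            continue
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = paragraph
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def text_to_mp3_files(text: str, out_dir: str, lang: str = "en") -> list[Path]:
    """Write one MP3 per chunk with gTTS (requires `pip install gTTS`)."""
    # Imported lazily so the chunking helper can be used without gTTS installed.
    from gtts import gTTS

    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, chunk in enumerate(split_into_chunks(text), start=1):
        path = target / f"part_{index:03d}.mp3"
        gTTS(text=chunk, lang=lang).save(str(path))
        paths.append(path)
    return paths
```

Writing one file per chunk is a design choice rather than a gTTS requirement: it makes a whole chapter easier to resume partway through and keeps a failed request from costing the entire run.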

As for getting text out of Scrivener, obviously there are files and one can convert RTF to plain text using e.g. the Python package striprtf - if I knew which content.rtf I wanted to read aloud.
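For the RTF step, a sketch along these lines might do, assuming `pip install striprtf`. The function name `rtf_file_to_text` and the injectable `convert` parameter are hypothetical; the only library call used is `striprtf.striprtf.rtf_to_text`. Reading the file as UTF-8 with replacement characters is also an assumption about how the RTF was saved.

```python
from pathlib import Path
from typing import Callable, Optional


def rtf_file_to_text(path: str,
                     convert: Optional[Callable[[str], str]] = None) -> str:
    """Read an RTF file (read-only) and return its plain-text content."""
    if convert is None:
        # Lazy import: striprtf is only needed when no converter is supplied.
        from striprtf.striprtf import rtf_to_text
        convert = rtf_to_text
    # The file is only ever opened for reading, never written back.
    raw = Path(path).read_text(encoding="utf-8", errors="replace")
    return convert(raw).strip()
```

Usage would be something like `rtf_file_to_text("path/to/content.rtf")`, with the path still to be worked out, which is the open question here.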

Hence:

Does Scrivener have an API, events, etc. that would make it easy to get e.g. the text/rtf of the current editor (to pipe into the TTS)?
Is there an easy way to know which rtf file(s) is open in the editor(s)? (is there an active “temp” file in a fixed location perhaps?)
Could I perhaps Compile (with a single action/keyboard shortcut) only the active document to a default file whose modification timestamp could be monitored?
Any other neat ideas for approaching this task? (I fear I don’t know enough about Scrivener to ask more precise questions than I’ve already tried)
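The "monitor the modification timestamp" idea can be sketched with nothing but the standard library. This is a simple polling loop under the assumption that some output file gets rewritten at a fixed path; the function name `wait_for_change` and its parameters are hypothetical.

```python
import os
import time
from typing import Optional


def wait_for_change(path: str,
                    last_mtime: Optional[float],
                    poll_seconds: float = 1.0,
                    timeout: Optional[float] = None) -> Optional[float]:
    """Block until the file's mtime differs from last_mtime.

    Returns the new mtime, or None if the timeout expires first.
    A missing file is treated as "not changed yet".
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        try:
            mtime = os.path.getmtime(path)
        except FileNotFoundError:
            mtime = None
        if mtime is not None and mtime != last_mtime:
            return mtime
        if deadline is not None and time.monotonic() >= deadline:
            return None
        time.sleep(poll_seconds)
```

A caller would keep the returned value and pass it back in as `last_mtime` on the next call, handing the file to the TTS step each time a new timestamp comes back.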

TIA!

kewms · October 30, 2022, 4:19pm

There is no Scrivener API.

The fastest way to get the text of the current document is via the File → Print Current Document command, which can (at least on the Mac) be configured to print to PDF rather than to a physical printer.

Vincent_Vincent · October 30, 2022, 5:36pm

Same under Windows.

Julian_M1 · October 31, 2022, 9:38am

@kewms, @Vincent_Vincent That’s good to know in general, but extracting the text from PDF involves another set of unknowns (i.e. PDF handling, which should be OK for really plain text, but I consider it a known unknown), and I don’t want to use a PDF read-aloud feature (due diligence check performed: Acrobat Reader just choked on the “Philosophy…1” intro page of the Scrivener manual and locked up the whole app).

Other suggestions?

JimRac · October 31, 2022, 1:04pm

You could use something like Autohotkey to automate the front end of this. It could run a macro that would launch via a keyboard shortcut, to do something like this:

Place your cursor in the editor panel you want read, press the designated keyboard shortcut. Macro would Select All text, copy to clipboard, paste it to a temp location for TTS to read, launch TTS and/or a script, etc. I’m not certain what the exact sequence would be (I have no idea how your TTS app works or what it’s expecting), but you get the idea.

Caveat: This approach will only work with the contents of one editor pane. It would not work in Scrivenings view, because in Windows Scrivenings you cannot Select All past document boundaries. This is a known limitation of Windows Scrivenings view.
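The hand-off half of that macro (after Select All and Copy have been done by the automation tool) could look roughly like this in Python. It is only a sketch: `read_clipboard_text`, `stash_for_tts`, and the `tts_input.txt` file name are hypothetical, and the clipboard is read through the standard `tkinter` module, which raises an error if the clipboard holds no text.

```python
import tempfile
from pathlib import Path
from typing import Optional


def read_clipboard_text() -> str:
    """Return the current clipboard text using tkinter (needs a desktop session)."""
    import tkinter  # lazy import so the file helper works on headless machines

    root = tkinter.Tk()
    root.withdraw()  # no visible window is needed
    try:
        return root.clipboard_get()
    finally:
        root.destroy()


def stash_for_tts(text: str, directory: Optional[str] = None) -> Path:
    """Write the text to a fixed temp file that a TTS script can pick up."""
    folder = Path(directory or tempfile.gettempdir())
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / "tts_input.txt"
    target.write_text(text, encoding="utf-8")
    return target
```

The macro would then end by running something like `stash_for_tts(read_clipboard_text())` and launching whatever TTS script reads that file.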

A simpler non-automated alternative is to keep your TTS up and running and pointing to a particular source file, and simply copy/paste from Scriv into the TTS source file whenever you want to narrate it. It does sound like you really want to automate this, but is that really necessary?

Best,
Jim

Julian_M1 · October 31, 2022, 2:26pm

Thanks Jim. I have a whole novel to proof… I don’t mind getting my hands dirty when I have to, but I’m not going to copy and paste my way through the whole thing. Autohotkey is a possibility; I’ve never really looked at it.

Hmmm… .scrivx is an XML file… I can pluck the binder out of that, and that will give me the UUID of the folder corresponding to each BinderItem Title. Seems nicely ordered.
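A read-only sketch of that lookup with the standard library's ElementTree. It assumes each `BinderItem` element carries a `UUID` attribute and a `Title` child element, which is how the file appears to be laid out but is not a documented format; the function name `binder_titles` is hypothetical.

```python
import xml.etree.ElementTree as ET


def binder_titles(scrivx_path: str) -> list[tuple[str, str]]:
    """Return (UUID, title) pairs in binder order, top to bottom.

    ET.parse only reads the file; nothing is ever written back.
    """
    root = ET.parse(scrivx_path).getroot()
    pairs = []
    # iter() walks the tree in document order, so nested items follow
    # their parent just as they do in the binder.
    for item in root.iter("BinderItem"):
        uuid = item.get("UUID")
        title = item.findtext("Title", default="")
        if uuid:
            pairs.append((uuid, title))
    return pairs
```

Working from a copy of the .scrivx file rather than the live project would be the more cautious choice, since the parser has no need to touch the original.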

And/or, if I watch all the folders, I can use that UUID-Title correspondence to ID which Title was updated.
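Watching the folders could start as simply as asking which one was saved last. This sketch assumes the project keeps one subfolder per UUID, each holding a `content.rtf`, under a single data directory (the exact location inside the project is an assumption to verify); `most_recently_saved` is a hypothetical name.

```python
from pathlib import Path
from typing import Optional


def most_recently_saved(data_dir: str,
                        filename: str = "content.rtf") -> Optional[str]:
    """Return the name of the subfolder whose `filename` was modified last.

    Returns None when no subfolder contains such a file.
    """
    newest_name, newest_mtime = None, -1.0
    for child in Path(data_dir).iterdir():
        candidate = child / filename
        if child.is_dir() and candidate.is_file():
            mtime = candidate.stat().st_mtime
            if mtime > newest_mtime:
                newest_name, newest_mtime = child.name, mtime
    return newest_name
```

The returned folder name is the UUID, so it can be looked up in the UUID-to-Title mapping taken from the .scrivx file.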

Not a five minute job, but I feel a strategy emerging.

Nice side effect: it would be relatively easy to get a machine-read audiobook out of it.

JimRac · October 31, 2022, 4:46pm

Autohotkey is particularly good for manipulating an app’s GUI, to automate selection of menu options, pressing buttons, etc., but it’s also a ‘real’ Windows coding toolkit, in that it can access the Windows API, open & process files, invoke apps, things like that. But it has its quirks–not sure if it’s the right tool for what you’re describing.

It sounds like a fun little project. Best of luck with it!

xiamenese · October 31, 2022, 4:58pm

Just be aware that if anything you do modifies the .scrivx file in any way, your project may well be borked and Lit&Lat are less than likely to be sympathetic.

Just saying.

Mark

kewms · October 31, 2022, 6:41pm

If printing to PDF doesn’t work, the second-best approach would be to Compile the desired portion to plain text, from which point any text manipulation tool should work. One of the available Compile filters is “selected file(s),” so you wouldn’t need to “know” which UUIDs are involved.
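Once the text is compiled to a plain-text file, a small clean-up pass before TTS might look like this. It is a sketch that assumes one paragraph per line and that separator lines consist only of characters such as `#`, `*`, or `-`; both depend on the Compile settings, and `prepare_for_tts` is a hypothetical name.

```python
import re


def prepare_for_tts(compiled_text: str) -> list[str]:
    """Return the non-empty paragraphs of a compiled plain-text document.

    Runs of whitespace are collapsed, and lines made up only of
    separator characters (#, *, -) are dropped so they are not read aloud.
    """
    paragraphs = []
    for line in compiled_text.splitlines():
        cleaned = " ".join(line.split())
        if not cleaned or re.fullmatch(r"[#*\s-]+", cleaned):
            continue
        paragraphs.append(cleaned)
    return paragraphs
```

For example, `prepare_for_tts("Chapter One\n\n  It was  late.\n#\nShe left.\n")` returns `["Chapter One", "It was late.", "She left."]`.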

As @xiamenese said, there is no supported way to modify the .scrivx file (or any other part of the project) other than with Scrivener itself.

Julian_M1 · October 31, 2022, 10:04pm

It will be opened READ ONLY… even though I have backups of backups ad quasi infinitum, probably from a copy that I will then use the pymultiverse package to translate across the 'verse where it cannot affect The Precious Manuscript… and reading will be done by weak quantum measurements, just in case.

Julian_M1 · October 31, 2022, 10:05pm

Compile to plain text is also a nice idea, thanks.

narrsd · November 1, 2022, 12:09am

small point, but I keep wondering if File | Sync | with External Folder might be useful here.

at the least, it could give you a convenient way and place to have an updated copy of the work at any time. Since you say you’ll treat it as read-only, you’d have no issues with syncing a modified project back into Scrivener, which is the original intended use.

kewms · November 1, 2022, 4:16am

Generally speaking, yes, External Folders are a great way to give other programs access to the contents of a project.

But for this specific task the OP said they wanted to access the currently selected file(s).