Scrivener XML XSD/DTD/?

Russell411 · March 12, 2015, 3:41am

Hi,

I’m new to Scrivener and am considering it for a project I’m working on. I’ve tinkered a bit and I see that the main file is an XML file that references other files stored in folders. I need some round-trip import/export capability, so that updates I make in Scrivener affect another program (one that I wrote) and changes I make in my program affect my Scrivener file. Specifically, I want to round-trip documents in the Research area.

Is there an XSD, DTD, DOM model or even just a schema writeup that I could use for this process? Any other suggestions?

Thanks,

Russell

JJSlote · March 12, 2015, 12:06pm

I’ve found Scriv’s XML data scheme to be inherently discoverable, and have developed small programs that read it (routinely) and update it (occasionally). Scriv’s developers have excellent reasons not to make tinkering with these files too easy, as assisting with the inevitable file damage would divert resources, unmanageably and unsustainably, from those customers who are trying to get their writing done via the UI.

I do hope Scrivener will someday offer the capability to import and update any field. The input would be an XML with the same structure as project.scrivx, with blank entries representing no change. So an entry with just the DocID and a meta-data field filled, and with the surrounding scaffolding empty, would update that field within the Scrivener project. The import would be pre-validated and previewed, and would generate its own undo. Thus we’d have a way to update existing entries without having to tinker with production files.
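
To make the idea concrete, here is a minimal Python sketch of such a sparse update. It is not a Scrivener feature or API. It assumes binder entries are `BinderItem` elements carrying an `ID` attribute, and the `MetaData`/`LabelID` field names in the demo are placeholders chosen for illustration. Blank fields in the patch mean "no change", the plan is built and validated before anything is written, and the old values are kept so the update can be undone.

```python
import xml.etree.ElementTree as ET


def plan_updates(project_root, patch_root):
    """Validate the patch and list the changes without applying them."""
    index = {item.get("ID"): item for item in project_root.iter("BinderItem")}
    plan = []
    for patch_item in patch_root.iter("BinderItem"):
        doc_id = patch_item.get("ID")
        if doc_id not in index:
            raise KeyError(f"unknown DocID {doc_id}")
        _collect(patch_item, index[doc_id], doc_id, "", plan)
    return plan


def _collect(patch_el, target_el, doc_id, path, plan):
    for field in patch_el:
        if field.tag == "Children":
            continue  # nested items are matched by their own ID
        field_path = f"{path}/{field.tag}" if path else field.tag
        target = target_el.find(field.tag)
        if target is None:
            if "".join(field.itertext()).strip():
                raise ValueError(f"DocID {doc_id} has no field {field_path}")
            continue  # empty scaffolding with no counterpart: nothing to do
        if len(field):
            _collect(field, target, doc_id, field_path, plan)
        elif (field.text or "").strip():
            plan.append((target, doc_id, field_path, target.text, field.text.strip()))


def apply_plan(plan):
    for target, _doc_id, _path, _old, new in plan:
        target.text = new


def undo_plan(plan):
    for target, _doc_id, _path, old, _new in plan:
        target.text = old


# Demo with made-up data: only LabelID is filled in, so only LabelID changes.
project = ET.fromstring(
    "<Binder><BinderItem ID='4'><Title>Notes</Title>"
    "<MetaData><LabelID>1</LabelID></MetaData></BinderItem></Binder>"
)
patch = ET.fromstring(
    "<Binder><BinderItem ID='4'><Title></Title>"
    "<MetaData><LabelID>7</LabelID></MetaData></BinderItem></Binder>"
)
plan = plan_updates(project, patch)   # preview: inspect before committing
apply_plan(plan)
```

Running this against a copy of the project data rather than the live file keeps the production project out of harm's way.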

Rgds – Jerome

Russell411 · March 15, 2015, 4:30pm

Hi Jerome,

Thanks for your reply. I agree that Scrivener’s XML schema is quite discoverable and readable; I was just hoping for an official document so I could be sure I keep to the developers’ structure. I can understand, to some degree, why the developers might not want to share this information. But since the format isn’t exactly hidden or proprietary, an official definition would help prevent the “inevitable file damage” you’re referring to, rather than leaving you, me, and others to hack at it. That is undoubtedly where I’m headed, since there is no bidirectional import/export to keep things current between my research software and Scrivener, and my research ALWAYS evolves as I write.

I noticed in your signature that you have something over 8000 documents in your Binder. I’d appreciate a screen capture or an explanation of how you keep everything sorted and organized in there, and why you’ve chosen this method over OneNote, Mendeley, Docear, and other note-taking/research tools. Those aren’t working for me either, so I too plan to have lots of documents in my Binder.

Thanks,

Russell

JJSlote · March 16, 2015, 4:51am

Hi Russell

Great questions. By reading the binder contents into a responsive database with sorting and selection tools, we can easily manage 8000 documents in our preferred writing environment, with the prompting that only integrated notes and research materials can provide. Illustrations to follow.

Rgds – Jerome

JJSlote · March 16, 2015, 9:45pm

OK, by request, here are a couple of screenshots from my main AutoHotkey script.

The first is a ListView, showing how it loads the Binder XML. (Most entries masked.) Type and Preferred Date are my own fields. Beyond the doc title we have the Parent ID, Parent Title, Tree Depth and the Load Sequence. Everything’s sortable by clicking on a column, of course, so if we sort by Load Sequence we’ll see documents listed in the order they’d appear in an expanded binder within Scrivener.
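
For readers who want to try the same kind of load, here is a minimal sketch in Python rather than AutoHotkey. It is not the script shown in the screenshots. It assumes the binder is stored as nested `BinderItem` elements with an `ID` attribute, a `Title` child and an optional `Children` container; check those names against your own project.scrivx before relying on them. The sample XML is made up.

```python
import xml.etree.ElementTree as ET

# Made-up sample using the assumed element names.
SAMPLE = """<ScrivenerProject>
  <Binder>
    <BinderItem ID="1" Type="DraftFolder">
      <Title>Draft</Title>
      <Children>
        <BinderItem ID="3" Type="Text"><Title>Chapter One</Title></BinderItem>
      </Children>
    </BinderItem>
    <BinderItem ID="2" Type="ResearchFolder">
      <Title>Research</Title>
      <Children>
        <BinderItem ID="4" Type="Text"><Title>Interview notes</Title></BinderItem>
      </Children>
    </BinderItem>
  </Binder>
</ScrivenerProject>"""


def load_binder(xml_text):
    """Flatten the binder tree into rows, depth-first, in document order."""
    rows = []

    def walk(item, parent, depth):
        row = {
            "doc_id": int(item.get("ID")),
            "title": item.findtext("Title", default=""),
            "parent_id": parent["doc_id"] if parent else None,
            "parent_title": parent["title"] if parent else "",
            "depth": depth,
            "load_seq": len(rows) + 1,  # position in the expanded binder
        }
        rows.append(row)
        children = item.find("Children")
        if children is not None:
            for child in children.findall("BinderItem"):
                walk(child, row, depth + 1)

    binder = ET.fromstring(xml_text).find("Binder")
    for top in binder.findall("BinderItem"):
        walk(top, None, 0)
    return rows


rows = load_binder(SAMPLE)
by_title = sorted(rows, key=lambda r: r["title"])      # like clicking the Title column
by_binder = sorted(rows, key=lambda r: r["load_seq"])  # back to expanded-binder order
```

Because the load sequence is recorded during a depth-first walk, sorting on it always restores the order of a fully expanded binder, whatever column was sorted last.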

Here’s the Inspector as I typically use it. The script’s GUI overlays some sections, with a small notch for the Label dropdown.

The .ymd file in Doc Refs is a dummy file generated against a text date or the session date. It’s recognized on load as the data for the document’s PrefDate. The “Article .typ” entry is a real internal XRef whose name becomes source data for the ListView’s Type field. Thus I can cluster all documents of type “Article” together. (There’s another “.typ” entry shown in the ListView, “Foods”, which is itself of type DataElmt and has the parent “Document Types”.)

The 5346 entry is the really clever one, if I may say so. It’s the DocID in a binary notation, with exclamation points representing one, inverted exclamations zero. (These characters kern tightly and uniformly with each other in any sane typeface.) Four times per second, the script searches the screen for an image of the attention character, a Latin Small Letter T with Stroke: ŧ. If the script finds that character, it adds row and column offsets, then interrogates every fourth pixel rightward, 16 of the next 64, along the row that aligns with the single pixel gap in each inverted exclamation. The script adds the correct power of two for each dark pixel read, and thus derives the DocID.
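
The decoding step can be sketched in Python as follows. This is an illustration of the arithmetic only, not the AutoHotkey script itself, and it leaves out the on-screen search for the ŧ character. Two things are assumed here rather than taken from the post: bits are written most significant first, and each glyph is four pixels wide with its stroke in the first column. The pixel row is simulated as a list of booleans instead of being read from the screen.

```python
ONE, ZERO = "!", "\u00a1"  # "!" reads as 1, inverted exclamation as 0
BITS = 16                  # 16 samples are taken
STRIDE = 4                 # every fourth pixel, so 64 pixels are spanned


def encode_doc_id(doc_id):
    """Write a DocID as 16 marks, most significant bit first (assumed order)."""
    if not 0 <= doc_id < 2 ** BITS:
        raise ValueError("DocID does not fit in 16 bits")
    return "".join(
        ONE if (doc_id >> i) & 1 else ZERO for i in reversed(range(BITS))
    )


def render_row(marks):
    """Simulate the pixel row at the height of the gap in the inverted mark.

    On that row an upright "!" shows its solid stroke (dark), while the
    inverted mark shows its gap (light). Glyph width and stroke column are
    assumptions made for this sketch.
    """
    row = []
    for mark in marks:
        row.extend([mark == ONE] + [False] * (STRIDE - 1))
    return row


def read_doc_id(row, start=0):
    """Sample every fourth pixel and add a power of two for each dark one."""
    doc_id = 0
    for i in range(BITS):
        if row[start + i * STRIDE]:
            doc_id += 2 ** (BITS - 1 - i)
    return doc_id


marks = encode_doc_id(5346)
assert read_doc_id(render_row(marks)) == 5346
```

Sixteen marks cover DocIDs up to 65535, which leaves plenty of room for a binder of 8000 documents.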

A TextBox at the bottom shows that the script has recognized the DocID in the editor as I navigate around Scrivener. It also displays the wrapped title and the Parent title, two fields not seen easily otherwise.

There’s a ListView with bidirectional Doc Refs as well, and more richly clustered relationships. Just a few thousand AutoHotkey lines, and time very well spent.

(Sorry for the initial tepid reply, born of privacy concerns. Easily addressed by masking the doc titles.)

Rgds – Jerome