RTF Engine - Cocoa?

Hey all - question, the RTF engine in the Mac version of Scrivener, is that using Cocoa hooks, or is that built into the app itself?

Asking as my Dad is working with a genealogy database and old file formats. Wants to convert to RTF and then work with Scrivener to organize. Plus, he is a programmer, so looking into additional information for him.

Ideas on this? He will probably have to design the parsing and whatnot himself, but looking for a good reference. If I understand the construction of the Mac version correctly, Cocoa reference books would be his best bet, but figured I would check here before giving him a recommendation.


Preliminary answer (as Keith would be the best one to answer this): It’s using the Cocoa engine, yes, but it has been significantly modified to provide features the stock engine does not, and to fix many of Apple’s bugs with it. In terms of specifics, Scrivener’s RTF import honours more of the RTF specification than the average Mac application. So to optimise an RTF engine for working with Scrivener, one should attempt to adhere to the official specs as closely as possible. There are many things the importer does not do anything with though, so a little experimentation will be necessary.

Ahh ok, thanks AmberV.

That is what I was trying to explain. Cocoa would be the best place to start for him, though I think importing and working via Scrivener might be an option as well.

(Actually looks like you replied to his other question here: literatureandlatte.com/forum … =2&t=10099 )

He asked yesterday if I had any suggestions for writing software… “Well now that you mention it…” :slight_smile: Day later we have a convert.

So it's an interesting project. Trying to recover information out of mixed proprietary (or goofy) formats into a usable standardized format. RTF being the most logical, as it's an established standard that is cross-platform. Add to that the new Windows beta we are testing, and we have a single format (.scriv) that could be passed from historian to historian. Assuming the programmer guy is able to recover the old format.

Yeah, I wondered if there was a connexion when I saw the genealogy reference in that other post. :slight_smile:

Cocoa is a great place to start. The RTF engine isn’t perfect, but it is a huge leg-up, and for anyone familiar with OO programming it’s very easy to learn. For a project like this I think it would be ideal. The main problem with Cocoa’s engine is that once you get to a certain point with it, it is difficult to add new features. But these areas are mostly within the realm of edge cases for writers—for a legacy-to-RTF convertor, it would probably work great.

In fact I have four separate problems. I will probably end up blogging about “Data Mining with Scrivener!”

I have a collection of manuscript files which were created in the 1970s through early 1990s, which have been converted to RTF by FileMerlin. From these were printed five large bound volumes of family genealogy. The books are going out of print, and republishing on paper is prohibitive. The first problem was to re-create the first volume as a PDF file. The draft is complete and out for review, thanks to Scrivener.

The association President has already asked if Scrivener is available for Windows. I won’t be surprised if the Historians and Directors start using Scrivener themselves!

The second problem is a data mining issue. To move forward, we really need to migrate that information to a genealogy database. The traditional method of doing so is to turn loose an army of volunteers for a few years, who type in the 7,000 pages of printed material, one person at a time.

However, it occurs to me that I might be able to save literally five years of typing time by ingesting those manuscripts. That means scanning the manuscripts and identifying the fields and their relationships based on the document content. The formatting itself is my biggest clue to the data organization. After reading through the forums here, I realized that there must be a native Mac RTF-reading engine, and that somebody here knows how to use it!

So, it looks like I need to be in the Cocoa environment to get at the RTF engine. Thank you!
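To make the "formatting as a clue" idea concrete, here is a rough sketch of pulling bold runs out of a simple RTF string. It is written in Python only to keep it short and is not the Cocoa engine; the function name `bold_runs`, the list of skipped header groups, and the sample record layout (bold name followed by plain details) are all assumptions for illustration. Real FileMerlin output will almost certainly need more cases handled (hex escapes such as `\'e9`, Unicode escapes, style sheets, and so on).

```python
import re

# One token is: a control word (optional numeric parameter, optional trailing
# space), a control symbol, a group brace, or a run of plain text.
TOKEN = re.compile(r"\\([a-zA-Z]+)(-?\d+)? ?|\\(.)|([{}])|([^\\{}]+)", re.DOTALL)

# Header groups whose contents are not document text (assumed, minimal list).
SKIP = {"fonttbl", "colortbl", "stylesheet", "info"}


def _append(runs, bold, text):
    """Add text to the run list, merging with the previous run if the style matches."""
    if runs and runs[-1][0] == bold:
        runs[-1] = (bold, runs[-1][1] + text)
    else:
        runs.append((bold, text))


def bold_runs(rtf):
    """Return a list of (is_bold, text) runs from a simple RTF string."""
    runs = []
    stack = []        # saved (bold, skipping) state, one entry per open group
    bold = False
    skipping = False  # True while inside a header group we want to ignore
    for m in TOKEN.finditer(rtf):
        word, param, symbol, brace, text = m.groups()
        if brace == "{":
            stack.append((bold, skipping))
        elif brace == "}":
            if stack:
                bold, skipping = stack.pop()
        elif word:
            if word in SKIP:
                skipping = True
            elif not skipping:
                if word == "b":
                    bold = param != "0"   # \b turns bold on, \b0 turns it off
                elif word == "par":
                    _append(runs, bold, "\n")
        elif symbol:
            if symbol == "*":
                skipping = True           # \* marks an ignorable destination
            elif not skipping and symbol in "\\{}":
                _append(runs, bold, symbol)  # escaped literal character
        elif text and not skipping:
            text = text.replace("\r", "").replace("\n", "")
            if text:
                _append(runs, bold, text)
    return runs


# Hypothetical sample: each person's name is bold, details are plain.
sample = (r"{\rtf1\ansi{\fonttbl{\f0 Times;}}\f0 "
          r"{\b John Smith}, born 1850\par {\b Mary Jones}, born 1852\par}")
names = [text for is_bold, text in bold_runs(sample) if is_bold]
print(names)
```

The same approach extends to italics, tabs, or indentation if those turn out to mark fields in the manuscripts. In Cocoa, the equivalent would be walking the attribute runs of the loaded document rather than tokenizing the RTF by hand.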

The third problem is that we have sixteen large, full file drawers of updated information submitted on paper. It may be possible to OCR a fair proportion of that material, at which point I already know Scrivener is the perfect tool for organizing and managing it.

The fourth problem is that we have another 5,000-10,000 page manuscript in the form of several large Microsoft Word documents. The best method of publishing this is to migrate it to a genealogy database, from which manuscripts can be generated. This becomes a combination of all of the above: Break the document into small sections, which can be imported and managed with Scrivener. Do a bit of magic to create a genealogy database from those sections’ content. Finally, use notes within Scrivener as a project management tool, keeping track of transcription, verification, and review processes.

Some things you might want to take a look at, when using Scrivener as a pre-processor for database prep, are the Separator and Formatting panels. There are some interesting things you can do with these two panels, using them in a completely non-standard way: you can insert full control sequences, or simply unique delimiters, that would make it easier for a script to convert the compiled output into SQL or whatever is being used to talk to the database.
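As a rough sketch of the script side of that idea, the following assumes the compile was set up to put `@@RECORD@@` between documents and `@@FIELD@@` between fields. Both markers, the three-column `person` table, and the function names are made up for illustration; any delimiter that never occurs in the real text would do, and the real schema would depend on the genealogy database in use.

```python
import sqlite3

# Hypothetical delimiters inserted via the compile settings.
RECORD_SEP = "@@RECORD@@"
FIELD_SEP = "@@FIELD@@"


def parse_compiled(text):
    """Split compiled plain text into records, each a list of trimmed fields."""
    records = []
    for chunk in text.split(RECORD_SEP):
        chunk = chunk.strip()
        if not chunk:
            continue  # ignore empty chunks around leading/trailing separators
        records.append([field.strip() for field in chunk.split(FIELD_SEP)])
    return records


def load_people(conn, records):
    """Insert records into an assumed three-column table; returns the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS person (name TEXT, born TEXT, notes TEXT)"
    )
    # Pad short records and truncate long ones so every row has three values.
    rows = [tuple((record + ["", "", ""])[:3]) for record in records]
    conn.executemany("INSERT INTO person VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# Hypothetical compiled output with the delimiters in place.
compiled = (
    "John Smith@@FIELD@@1850@@FIELD@@Farmer in Ohio.\n"
    "@@RECORD@@\n"
    "Mary Jones@@FIELD@@1852@@FIELD@@Married John Smith.\n"
)

conn = sqlite3.connect(":memory:")
load_people(conn, parse_compiled(compiled))
print(conn.execute("SELECT name, born FROM person ORDER BY name").fetchall())
```

Using parameterised inserts rather than building SQL strings by hand means quotes and apostrophes in the manuscript text do not need any special treatment.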

I agree it would be a good platform for OCR review. Being able to view and scroll the original document scan while proofing and editing the text result side-by-side would be great.

Another thing to look at is Scrivener’s custom meta-data for some of the paper-trail type stuff you intend to track in notes. The advantage of using meta-data is that the fields can be placed into Outliner columns, sorted, and searched.