Microsoft-Style RTF Outlines import as Scrivener Outline

Scrivener can import OPML files and convert them to a Binder outline: each line in the OPML file becomes a separate file in Scrivener, and the outline hierarchy is preserved. This allows one to import data from OmniOutliner.

I wish that Scrivener could do the same for Microsoft Word outlines saved as RTF.

Microsoft Word creates an outline hierarchy by indenting the margins of paragraphs. This indentation is preserved when Microsoft Word saves to RTF, as can be seen when the RTF file is opened in other applications such as TextEdit.

Inspiration 9 also exports its outlines to Microsoft Word outlines.

Having this capacity in Scrivener would make it easier to import outlines from Microsoft Word and Inspiration.

Unfortunately, not everyone uses OPML.

I’m not entirely sure what you mean - which outline capability in Word? Do you mean just bulleted lists? Converting those to a hierarchy in the binder would be very difficult for a number of technical reasons - partly because the OS X text system imports them as special blocks which aren’t very open to third-party Cocoa developers for interpretation, and partly because it is possible to create a textual outline that could not be translated to a folder/file hierarchy. If you mean a styles-based hierarchy, then I’m afraid that’s not possible as I rely on the RTF importers/exporters as provided by Apple in their Cocoa frameworks. Although I’ve added a lot to these importers and exporters, reading and writing styles hierarchy information could not be hacked in - I would have to write my own RTF parser and writer, which would take around a year. :slight_smile: But as I don’t entirely understand the request, you may mean something different entirely.

All the best,
Keith

marianco,

  1. I wrote an AppleScript to accomplish what you are trying to do – namely, carve a path from your outlines in Word to binder structures in Scriv. It does this by converting Word outlines into OPML (or MMD). See below for more on this.

2a) Actually, outline structures in Word are not tracked by style information. Word tracks outline level and styling separately. If you go into Word’s Format>Style, you will see that outline level and style are independently controllable.

2b) Thus, in an RTF file, though lines of outline typically have a characteristic style, what actually tracks the outline level is a simple RTF tag – which is simply ‘outlinelevel’ followed immediately by a digit. So, ‘outlinelevel1’ is the tag for the first (flush) outline level, etc., up to 9. ‘outlinelevel0’ is body text.
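For anyone who wants to look at this in a file of their own, here is a minimal Python sketch that pulls the outline-level numbers out of raw RTF text. It is only an illustration: the sample string is hand-written rather than real Word output, the function name is made up, and the meaning of each number follows the description above, so it is worth checking against a file saved from your own version of Word. A real file may also carry the same tag inside its style table, which this simple scan does not filter out.

```python
import re

# Matches the RTF control word \outlinelevelN, where N is one or more digits.
OUTLINE_LEVEL = re.compile(r"\\outlinelevel(\d+)")


def outline_levels(rtf_text):
    """Return the outline-level numbers found in raw RTF text, in order."""
    return [int(match.group(1)) for match in OUTLINE_LEVEL.finditer(rtf_text)]


# Hand-written sample (an assumption, not genuine Word output).
sample = (
    r"{\rtf1 \pard\outlinelevel1 Chapter\par "
    r"\pard\outlinelevel2 Scene\par "
    r"\pard\outlinelevel0 Body\par}"
)

levels = outline_levels(sample)
print(levels)
```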

–Greg

P.S. I have taken the liberty of attaching the AppleScript, “Outline Converter”, that I spoke of in (1) above, in case you would like to download it.
To use it, drop a Word file on the script icon, or just run the script with an outline document already open in Word. It will offer you a choice of three kinds of conversion.
If your outline is meant to become entirely Binder structure (no document text body), then you can choose the middle choice (to convert an MS outline into OPML). Note: Since OPML sadly has no provision for body text in an outline, all body text is converted to an outline level (determined contextually). The result is an OPML file that Scriv can import with its OPML import ability.
If your outline contains some body text which is meant to end up /in/ the Scriv doc that is associated with the outline item that heads it, then choose the third option (to convert MS outline to MultiMarkdown). This can be a bit slow, because Word is slow in responding to AppleScript commands. The result will be a MultiMarkdown file which can be imported with Scriv’s MultiMarkdown import option.
This script works for me on my Mac (OS X Tiger) with my version of Word (2004). I haven’t tested it in other settings, so your mileage may vary.
Outline Converter.zip (59.9 KB)
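To give a rough idea of what the OPML side of such a conversion involves, here is a small Python sketch that turns a list of (level, heading) pairs into nested OPML `outline` elements. This is not the attached script and does not talk to Word; the function name and the sample headings are invented for illustration, and the "contextual" handling of body text described above is left out.

```python
import xml.etree.ElementTree as ET


def outline_to_opml(items, title="Outline"):
    """Build an OPML string from (level, text) pairs; level 1 is the top level."""
    opml = ET.Element("opml", version="1.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    # stack[i] is the most recent element at depth i; stack[0] is <body>.
    stack = [body]
    for level, text in items:
        # Clamp the level so an item is never more than one step below its parent.
        level = max(1, min(level, len(stack)))
        del stack[level:]
        node = ET.SubElement(stack[-1], "outline", text=text)
        stack.append(node)
    return ET.tostring(opml, encoding="unicode")


# Hypothetical outline, for illustration only.
items = [(1, "Part One"), (2, "Chapter 1"), (2, "Chapter 2"), (1, "Part Two")]
opml_text = outline_to_opml(items)
print(opml_text)
```

Saved with an .opml extension, output of this shape is the kind of nested file an OPML importer works from.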

And humans think that when they purchase Scrivener, all they’re getting is a piece of software. Nice one Mr Greg :wink:
Fluff

Keith,

GR’s AppleScript does exactly what I want. Thus the logic and routine are there for you to examine.

It converts Microsoft’s Outlines (which can be seen in Outline View in Word in a .doc format file) to OPML files. These can then be imported to create a new binder folder/file structure.

If possible, can this code - which has been done for you by GR (thanks so much GR) - be added to your import routines? After all, it works. If Scrivener does this natively, it would save an extra conversion step.

Thanks again, GR.

Given the ubiquity of Microsoft Word - and how outlines can be created within it - it would be nice to see Scrivener be able to import such an outline to create the folder/file structure in the binder.

I may consider it for a post-2.0 update, although I’m sure you must understand that just because the “logic is done”, creating a new import routine for Scrivener is still hours of work and not something I can just knock together in five minutes. Definitely a good idea for the future, though.

Many thanks to GR for providing the script.

Best,
Keith

marianco,

Glad the script does the job for you!

What he said. I suspect that the difference between my script and what Keith would need may be pretty significant. You see, my script does not operate directly on your Word document. What it does is instruct Word to make the necessary changes to your document. That’s kind of what AppleScript was made for – the automated control of other applications. But I am supposing that an import function needs to do all the work of processing the document itself. The logic of the code I wrote might give some guidance, but the code itself would be useless, of course.

So, I think it’s terrific that Keith is thinking of looking into the possibility of adding an import function of this sort, but he is certainly right that there would be some non-trivial work involved.

Best,
Greg

P.S. When I came to Scrivener I had years’ worth of outlines made in Word – hence the script. Since that time, I find I have pretty much stopped making outlines in Word. Though some of that kind of work just happens in Scrivener now, I do a lot of it in a good mindmap application (which makes OPML output that Scrivener easily imports). Though I think Word’s outline mode is still the best outliner I’ve worked with, I now find the mindmap representation actually better for most of the development work I was using outlines for.

Thanks for the script, Greg. Pre-Scrivener, I did most of my writing in Word’s outline view, and though my workflow has changed post-Scrivener, I sometimes miss it.

I am in the position of others, trying to convert Word outlines into Scrivener outlines. The Word outlines typically comprise a heading of a given level, followed by some text.

Since many of my outlines are on one level, I first tried using find/replace in Word to place a hash sign between each outline header and associated text, then use Scrivener’s Import and split facility. I had hoped the first line following the split character would be used as the header, but the filename followed by a number was used instead.

So I turned to Greg’s script. Being an early adopter (I keep asking myself why), I have Word 2011 and Scrivener 2. Greg’s script seemed to work fine. I watched all the find/replace commands at work (hashes added to the outline headers, line breaks removed etc), and ended up with what looked like a serviceable file (though in one case, with a particularly complex outline with notes and comments, I got the AppleScript error message ‘The object you are trying to access does not exist’). I saved the converted file as plain text (Mac OS encoding), but when I tried to import it into Scrivener, it wasn’t recognised as an MM file. (I have attached a sample of a converted output file.)

I would be very grateful for any thoughts/guidance on how I can get my Word outlines into Scrivener, via this script or by other means.

Thanks
Ben Woolley
MultiMarkdown conversion test.txt (348 Bytes)

Further to the plea for help above, I continued looking for ways of converting Microsoft Word outlines into Scrivener outlines. At last, I can report some success, thanks to the good people at Eastgate (who produce Tinderbox):

  1. Download the file http://www.acrobatfaq.com/tbdemos/word2opml.zip, which contains an AppleScript file called word2opml-text.scpt. (The script was written by Mark Anderson at Eastgate for Tinderbox users.)

  2. Unzip the file, and copy word2opml-text.scpt to the Documents > Microsoft User Data > Word Script Menu Items folder.

  3. Open the document you want to convert in Word, and execute the script (it should be listed in the scripts menu, represented by an icon next to ‘Help’). The script will output a file, which in my case appeared on the desktop, with the name ‘outline-with-text.opml.rtf’.

  4. This is a version of the outline in OPML, which is like HTML, but for outlines. However, it is in RTF format, and OPML files must be plain text, at least if you want to use them with Scrivener (or Tinderbox).

  5. To convert to plain text, I used TextEdit. With the file open, I executed the ‘Make Plain Text’ command in the Format menu. I then saved as a Unicode encoded plain text file, giving it an .opml extension (rather than .txt).

  6. I imported the outline into Scrivener using File > Import > Files… Worked a dream.
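If you want to check that the file really is plain-text OPML before importing it (steps 4 and 5), a short Python sketch along these lines will do it. The function name and sample strings are made up for illustration, and it only tests that the text parses as XML with an `opml` root; it makes no claim about what Scrivener itself checks on import.

```python
import xml.etree.ElementTree as ET


def count_outline_items(opml_text):
    """Parse OPML text and return how many <outline> elements it contains.

    Raises xml.etree.ElementTree.ParseError if the text is not well-formed
    XML (for example, if it is still RTF), and ValueError if the root
    element is not <opml>.
    """
    root = ET.fromstring(opml_text)
    if root.tag != "opml":
        raise ValueError("root element is %r, expected 'opml'" % root.tag)
    return sum(1 for _ in root.iter("outline"))


# Hypothetical sample content, for illustration only.
good = (
    '<opml version="1.0"><head/><body>'
    '<outline text="Heading"><outline text="Subheading"/></outline>'
    "</body></opml>"
)
still_rtf = r"{\rtf1\ansi <opml version='1.0'></opml>}"

print(count_outline_items(good))
try:
    count_outline_items(still_rtf)
except ET.ParseError:
    print("not plain-text OPML yet")
```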

The software environment I used was Word 2011 and Scrivener 2.0.2. The script was written for Word 2008, and may behave differently with that version. I have no idea whether it will work with 2004.

Hope this helps.

Ben,

There are two problems with the sample file that are keeping it from getting into Scrivener as MM.

  1. The Scrivener import routine appears to be choking on the curly apostrophe in line 3. If you put in a regular apostrophe character, Scrivener will import this file as MM. (Funny, I have the vague recollection that this is a re-emergent problem with the import routine – which had been fixed in an earlier version.) For the nonce, you could turn off Word’s curly-quote substitution function and straighten your curly quotes. (All to the good, I’d say, but that is just me!)

  2. However, even the above will not give you what you want, because the line breaks have gotten munged in your output file, so the import will give you just one document with a long title. To finally fix the file, I opened it in TextWrangler, put double line breaks in before each hash mark, and also set the document’s line-termination setting to Unix (LF) instead of Classic Mac (CR). All that shouldn’t be necessary, but it may be due to changes in the way the new Word is responding to commands or what it is doing when saving as Plain Text.
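The two manual fixes above could be roughed out in Python as follows. This is only a sketch: the function name and sample text are invented, it assumes headings are marked by a run of hash marks, and it will also split on any hash mark that appears inside ordinary text, so the result should be checked by eye.

```python
import re


def tidy_mmd(text):
    """Normalise line endings, straighten quotes, and separate hash headings."""
    # Classic Mac (CR) and Windows (CRLF) line endings become Unix (LF).
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Replace curly quotes and apostrophes with their straight equivalents.
    for curly, straight in (
        ("\u2018", "'"),
        ("\u2019", "'"),
        ("\u201c", '"'),
        ("\u201d", '"'),
    ):
        text = text.replace(curly, straight)
    # Put a blank line before every run of hash marks, dropping any
    # whitespace that came immediately before it.
    text = re.sub(r"[ \t\n]*(#+)", r"\n\n\1", text)
    # The first heading does not need leading blank lines.
    return text.lstrip("\n")


# Invented sample with CR line endings and a curly apostrophe.
munged = "# One\rIt\u2019s text ## Two\rmore"
cleaned = tidy_mmd(munged)
print(cleaned)
```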

I will take another look at the algorithm and run some tests here to see if I can ameliorate (2) – but I don’t have Word 2011 just yet. BTW, I suppose code could be added to straighten quotes, too, though that looks like a bug with the Scriv importer.

If I find I can usefully update the script, I will post it here.

–Greg

Thanks for giving this your attention, and also for the pointer to TextWrangler, which will be useful for cleaning up converted files in the future.

For the record, Mark Anderson doesn’t work for Eastgate. I’m just a Tinderbox user who’s also a volunteer Tinderbox community helper. No big deal though :smiley: