Importing RTF to Scrivener: only from Word?

Hello,

I’m new to the forum, and I’ve been using Scrivener for only a few months. I’m a writer, but I’m also an academic editor, and I’m trying to create a good workflow for our editing. We mainly work with digital documents that have to be revised by multiple readers. Each of them writes their own comments, sometimes using text tools (like color changes) or comment tools.

So, in order to truly put Scrivener to a real test, I’m importing some mid-production projects. I started by importing the documents into Scrivener, in order to divide them into more manageable chunks.

My experience so far is this: although Scrivener can import .doc and .odt documents, it alters their original formatting: bullets, margins, text size… everything is a mess.

Since RTF is the best format for Scrivener, I opened my .doc documents and exported them to RTF. I used NeoOffice, Pages and Mariner Write. The result was chaotic. None of them was able to export either the text formatting or the comments correctly.

This morning I came into my Windows workstation (I only have a Mac at home) and opened the same .doc documents, this time using Microsoft Word 2007 for Windows, and exported them as RTF. I imported them into Scrivener and this time, when I opened them, they were almost perfect: margins, text size… and all comments in their place. I’m still having problems with complex tables, but that’s a minor bug compared to the rest.

I would like to know if you have discussed this before, if I’m right when I say that the only good RTF files are the ones that come directly from Microsoft Word, or if there’s any chance that I can export good complex RTF from NeoOffice or Pages. I’m asking this because I categorically refuse to buy a Microsoft Word licence only to export RTF.

Thank you.

Pages is terrible when it comes to RTF - Pages is designed almost solely for those who want to exchange documents with Word users or not at all.

I haven’t tried Mariner Write for a while so will have to test that, but I just tested OpenOffice - on which NeoOffice is built - and it seems that its RTF exporter is terrible, even worse than that of Pages. The formatting - which wasn’t very complicated - was completely messed up.

So I’m afraid the programs you have tried so far do indeed create poor RTF. However, there are other programs that create good RTF other than Microsoft Word. Two of the best on the Mac are:

Nisus Writer - www.nisus.com
Mellel - www.redlex.com

Both are affordable and much better Word alternatives than NeoOffice in my opinion.

If you don’t need footnotes and suchlike, there is also:

Bean - www.bean-osx.com

Bean is free, and an excellent rich text application.

Hope that helps.
All the best,
Keith

Thanks, Keith. I’ll try those programs right away. In the meantime, I made another test: I created the RTF from Word for Mac and, guess what? It’s not as bad as NeoOffice, but it’s still not as good as Word for Windows: I can see all my numbered paragraphs correctly, but not my bulleted paragraphs (the bullet gets lost along the way). At least now I know a little bit more about RTF.

Thanks again!

Hello, Keith,

I’ve tested Mellel and Bean. The problem is the way they interpret the original .doc document. Mellel is a complete disaster; it can’t recognize .docx at all. Bean works better with .docx, since all bullets are visible. Problems: all numbered paragraphs are automatically converted to bulleted paragraphs, and it doesn’t seem to recognize comments. It’s much better at importing tables, though, even better than Word 2007 for Windows.

In a couple of hours I’ll see what Nisus can do.

Thanks!

Last update: Nisus imports .docx the same way Bean does; no difference there. One advantage: it’s the only software able to interpret both comments and tracked changes, but only when reading an RTF document. Nevertheless, all numbered paragraphs are turned into bulleted paragraphs, so there is still no improvement.

So far, the only option remains saving the original document as RTF from Word for Windows and forgetting about the track changes option. Bean remains the most efficient software for complicated tables, though (better than Nisus).

Hmm, I just checked that and it seems you are right. It looks as though the difference between Word’s fonts and those in the OS X text system means that the bullet character Word uses comes across as a private Unicode character in the OS X text system. Yes, looking at the underlying RTF, Word is using a private Unicode character for its bullet symbol - it’s actually a symbol that isn’t defined in Unicode at all, so any program other than Word either wouldn’t know what it is or might use it for something different. I may be able to find the character and replace it with a regular bullet in my custom RTF importer; I’m looking into it.
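For anyone who wants to check their own text for this kind of character, here is a minimal Python sketch. It is not what Scrivener does internally; it only flags characters whose Unicode category is "Co" (private use). The sample string and the U+F0B7 code point in it are assumptions chosen for illustration.

```python
import unicodedata

def find_private_use(text):
    """Return (index, code point) pairs for every Private Use character in text."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Co"  # "Co" = Other, private use
    ]

# Hypothetical sample: a private-use bullet followed by a standard bullet (U+2022).
sample = "\uf0b7 First item\n\u2022 Second item"
print(find_private_use(sample))
```

Only the first bullet is reported, because the standard bullet is an ordinary punctuation character.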

Nisus is probably the best for compatibility with Word, though - have you tried exporting from Word (on the Mac) as RTF and opening that in Nisus? The Nisus guys wrote their own RTF importers/exporters and as RTF is Nisus’s default format, it tends to render RTF extremely well. They also do a lot of testing with Word to ensure compatibility. Like the rest of us Cocoa developers, though, they have to rely on the .docx importers/exporters that Apple make public, and they are shoddy in the extreme. So I would recommend testing Nisus with some RTF files before making up your mind.

I should also note that no matter what you try, you are bound to run into formatting problems and differences unless all collaborators use exactly the same program. It’s probably best not to worry about the formatting until you come to working on the final draft, but your needs may vary.

All the best,
Keith

Hi, Keith,

I opened the RTF in Nisus, from there exported it as RTF again, and now I can see all my bullets and numbered paragraphs. That doesn’t fix the problem of forcing me to buy a Word license, and a Nisus license, only to use them both as a link between Word and Scrivener.

I certainly understand that some formatting will be lost in the process; what I’m trying to find out is precisely what kind of compromises I will have to make in my workflow.

We edit college academic textbooks. Therefore, many of our first drafts have many bulleted lists, numbered paragraphs (every single chapter must have a list of objectives and self-evaluation questions, for instance), long quotes, complementary readings, learning resources, etc. Even if I’m only working on a draft, not thinking about the final formatting of the project, as an editor I will require some basic formatting functions. If the process of importing the original texts brought by the authors, revising them, and exporting them again to be reread many times back and forth fails because of compatibility problems between Scrivener and the word processor… then my workflow won’t be useful.

I need to know what kind of rules I will have to set for this kind of workflow. For instance, it would be better to avoid Word’s “track changes”, “text boxes” and “comments”. That works for me, but I need to warn the author before he starts writing, in order to make things easier in an already complex process.

I would consider those minor problems given the magnificent tools Scrivener provides to make general revisions easier: full screen reading, basic formatting (bold/italics, different text colors, bulleted lists, margins), outlining, chunks, snapshots, basic comments and footnotes functions, divided screen, split, interaction with bibliography software (such as Sente), and, of course, all its capability to maintain additional information in the same project. For instance, I’ve been able to track the author’s versions much more easily, since all I have to do is create a folder in the binder, where I add the new revised versions, with their dates. And every time a major revision comes, I take a full snapshot first. This is unbelievably useful when it comes to managing many versions of a manuscript for a co-written text.

Bottom line is, if exporting and importing is easy and more or less efficient (not necessarily perfect), the rest will be almost like heaven, compared to using Word to edit books. Better yet: I might be able to convince my superiors to switch our entire department to Mac and Scrivener… :wink:

Regarding the bullets thing, following your post I looked into this and it turns out that Word writes its bullets in the RTF as \u-3913 (U+F0B7, a Private Use Area code point), which isn’t actually a defined Unicode character at all. When it comes into Scrivener, it should really be converted to U+2022, which is the standard bullet character. I have therefore added some code to my custom RTF importer to fix this issue, so that in Scrivener 2.0 the bullets should survive the import (I can’t add it to a 1.x fix, unfortunately, as it depends on new code I’ve written for 2.0 that makes it more rigorous about checking it is altering only the RTF characters that it should).
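To give a rough idea of what such a fix involves, here is a minimal Python sketch. It is not Scrivener's actual code. It assumes the bullet appears in the RTF as a \uN escape, where N is a signed 16-bit decimal (so U+F0B7 is written \u-3913), and the names `PRIVATE_BULLETS` and `fix_bullets` are made up for this example.

```python
import re

STANDARD_BULLET = 0x2022  # U+2022 BULLET, written \u8226 in RTF

# Assumption: the private-use code points we want to treat as "Word's bullet".
PRIVATE_BULLETS = {0xF0B7}

# Matches RTF Unicode escapes such as \u233 or \u-3913.
_U_ESCAPE = re.compile(r"\\u(-?\d+)")

def fix_bullets(rtf):
    """Rewrite private-use bullet escapes as the standard bullet; leave the rest alone."""
    def swap(match):
        n = int(match.group(1))
        # RTF stores values above 32767 as negative numbers; undo the wrap-around.
        codepoint = n + 0x10000 if n < 0 else n
        if codepoint in PRIVATE_BULLETS:
            return f"\\u{STANDARD_BULLET}"
        return match.group(0)
    return _U_ESCAPE.sub(swap, rtf)

sample = r"{\rtf1 \u-3913? First item\par \u233? stays}"
print(fix_bullets(sample))
```

This only shows the substitution itself. A real importer would also have to cope with the font applied to the bullet and with other ways of encoding the same character, which is why it needs to be careful about altering only the characters it should.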

You should be able to accept Word documents with comments, by the way. These should come across into Scrivener as annotations, and should survive in Nisus, too - although again, this only works for RTF.

To explain the issues you are having from a development perspective, the trouble is that there are many file formats and they are vastly complicated, having to account for all the nuances of any given text system. Most major programs use different code for rendering their text, so have to interpret file formats and render them as best as they can to make them appear the same on screen. For a lone developer such as myself, or even for a small team such as at Nisus, it would be impossible to write importers and exporters for all the major formats. Moreover, formats such as .doc and .docx are hideously complicated - a .docx file, for instance, is a .zip file containing XML files that refer to other XML files and so on. Fortunately, Apple, who have the manpower and resources to write as many importers and exporters as they feel like, provide importers and exporters for the major formats to Cocoa developers - .doc, .docx, .rtf, .odt, .html, and .txt, for instance.
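The "XML files inside a .zip file" structure of .docx is easy to see for yourself. The following Python sketch lists the parts inside such an archive. To keep it self-contained it builds a tiny stand-in archive in memory using typical .docx part names; the stand-in is an assumption for illustration and is not a valid Word document.

```python
import io
import zipfile

def list_parts(docx_bytes):
    """Return the names of the files stored inside a .docx (zip) archive."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        return archive.namelist()

# Stand-in archive with typical part names (contents are placeholders).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("[Content_Types].xml", "<Types/>")
    archive.writestr("_rels/.rels", "<Relationships/>")
    archive.writestr("word/document.xml", "<w:document/>")
    archive.writestr("word/comments.xml", "<w:comments/>")

print(list_parts(buffer.getvalue()))
```

With a real document you would pass in the bytes of the file instead; the text, comments, styles and numbering each tend to live in their own XML part, which is part of why a full importer is so much work.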

The catch is that sometimes these exporters are buggy, and the only people who can fix them are the developers at Apple. And sadly, the .odt, .docx and .doc importers/exporters are all horribly buggy. Moreover, Apple have written completely different .doc and .docx importers/exporters for Pages, which aren’t so buggy (but Pages uses a different text system to the one Apple provides to developers). Oh, and they don’t provide a .pages importer/exporter to third-party developers.

Amongst all of this mess, RTF provides a great solution. RTF is well-documented, well-established (over 20 years old), and is supported by most major word processors on all platforms. Moreover, it is essentially a plain text + markup format. This means that teams such as Nisus can write their own parsers, and lone guns such as myself can use the Apple importers/exporters to do the heavy lifting (fortunately the Apple RTF importers/exporters at least get most of the formatting right) and then do some plain text jiggery-pokery to fix things up and add the bits that Apple don’t support. (For instance, Apple’s RTF importers/exporters don’t support footnotes, comments, images or headers and footers; they don’t do highlights the proper RTF way either. I’ve had to add support for all of those things myself.)
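Because RTF is plain text, you can do a crude check of what a given exporter actually wrote just by searching the file. The Python sketch below looks for the RTF control words associated with the features mentioned above. It is a plain text search, not a parser and not how Scrivener's importer works, and the `FEATURES` table and `report_features` name are made up for this example.

```python
import re

# RTF control words that mark each feature (crude text search, not a real parser).
FEATURES = {
    "footnotes": r"\\footnote\b",
    "comments": r"\\annotation\b",
    "images": r"\\pict\b",
    "headers/footers": r"\\(?:header|footer)[lrf]?\b",
    "highlights": r"\\highlight\d+",
}

def report_features(rtf):
    """Return a dict saying which features appear somewhere in the RTF text."""
    return {name: bool(re.search(pattern, rtf)) for name, pattern in FEATURES.items()}

# Hypothetical RTF fragment with one footnote and one highlighted run.
sample = r"{\rtf1\ansi Text{\footnote A note.} and \highlight3 marked\highlight0  text.}"
print(report_features(sample))
```

Running this over the RTF exported by each word processor would give a quick first impression of which ones drop comments or footnotes before you import anything into Scrivener.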

In other words, in all of this quagmire of file formats, it’s probably not surprising that you are having so many problems and seeing so many differences - but of course that doesn’t make it any less frustrating from a user’s point of view.

Thanks for the kind words about Scrivener, by the way, and for wanting to find a way to stick with it despite problems in finding a workflow with some other programs.

All the best,
Keith

Hello Keith,

Thank you for your detailed explanation. I’m not a developer, but this gives me a glimpse of the backstage process. Please don’t feel that I’m complaining or that I feel frustrated by Scrivener. Not at all. If anyone is to blame, it’s Microsoft Word. Besides, I love all this testing; it’s amusing.

Nevertheless, I’ve got a reality to face: all my colleagues, and my entire working environment, are stuck on Word as if it were the only word processor that ever existed. And all our authors use it in the strangest ways, since they don’t really know how to manipulate it or take real advantage of the software.

As an editing department, we must give up many of the text formatting tools just because they’re not fully compatible with InDesign; therefore, making things even simpler is not a problem, but something we need to do.

If a complex import/export process is needed, I know I will do it, but I also know for sure that many of my colleagues won’t. That’s why I’m looking for the simplest, most efficient way to bring into Scrivener all the work that we know will be done in Word and, maybe, OpenOffice.

I’ll research a little more and run some more tests. I’ve already learned a lot, and I know now that OpenOffice is not as efficient as I thought it would be.

Thanks again,

Jacqueline