Different flavors of RTF

oldrafe · August 23, 2014, 10:27pm

I’m having trouble with the different flavors or dialects of rtf files.

I’m trying to export some of my documents from Scrivener to rtf files so that they can be edited on a computer that does not contain Scrivener, and then reimported to Scrivener.

Some of them export with no problem, but in some others, quotation marks and em dashes do not get exported. Instead there appears a vertical black bar where these missing characters are supposed to go. Obviously these miscreant files have been created in a different RTF flavor.

Within Scrivener there is no difference between the files that export properly and those that show up without quotation marks. And if I open the exported files with Word or WordPerfect, all is well, the quote marks are there. But even if I save these Word or WordPerfect files back to RTF, those black bars are again there where the quote marks should be if I open the files with WordPad or some other program.

Is there any way to convert those misbehaving files to the proper version of RTF?

DavidR · August 25, 2014, 8:19pm

I have a very similar problem, precisely with smart quotes and em dashes, going between Scrivener and the word processor Nota Bene. See the thread here. In my case, the problem was happening in Scrivener (but not in WordPad or Word): text imported or pasted from Nota Bene showed placeholders where smart quotes and em dashes should have been.

I gathered as much technical data as I could, but no one from Scrivener ever gave me a definitive answer about what was happening, only “Scrivener is just taking the RTF content from the clipboard. The trouble here looks like whatever encoding [Nota Bene] is using for these characters is not interpreted as such in Scrivener’s stricter RTF handling.” The Nota Bene people told me that NB saves standard “hi-bit” characters such as these in ANSI encoding, while Word saves them in Unicode. I’m sure that has something to do with it, but what I don’t know is how Scrivener encodes them…

However Scrivener is handling these “hi-bit” characters, it seems to cause occasional problems with interoperability. But you seem to describe a situation in which Scrivener itself saves different files in different encodings or something. That’s harder to understand.
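
To make the ANSI versus Unicode distinction concrete, here is a minimal Python sketch of the two escape forms RTF allows for a character such as a left smart quote: a `\'hh` hex escape (a byte in the document's code page) and a `\uN` escape (a signed 16-bit decimal followed by a fallback character). The function names are hypothetical, and Windows-1252 as the code page is an assumption; nothing here is taken from Nota Bene's, Word's, or Scrivener's actual code.

```python
def rtf_ansi_escape(ch, codepage="cp1252"):
    # "ANSI" style: one byte in the code page, written as \'hh.
    # The code page (cp1252 here) is an assumption for illustration.
    data = ch.encode(codepage)
    if len(data) != 1:
        raise ValueError("character is not a single byte in this code page")
    return "\\'" + format(data[0], "02x")


def rtf_unicode_escape(ch, fallback="?"):
    # Unicode style: \uN, where N is a signed 16-bit decimal,
    # followed by a fallback character for readers that ignore \u.
    code = ord(ch)
    if code > 0xFFFF:
        raise ValueError("this sketch only handles BMP characters")
    if code > 32767:
        code -= 65536
    return "\\u" + str(code) + fallback


for name, ch in [("left quote", "\u201c"), ("right quote", "\u201d"), ("em dash", "\u2014")]:
    print(name, rtf_ansi_escape(ch), rtf_unicode_escape(ch))
```

Both forms describe the same character; a reader that handles only one of them would presumably show a placeholder for the other.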

oldrafe · August 26, 2014, 12:18am

Thanks for your input, David. I wish someone from Scrivener, more knowledgeable than you or me, could help us out.

According to Wikipedia, there are/were quite a few versions of RTF. It seems to me that Scrivener is capable of reading them all and displaying them properly, but not changing them. So in my case, some of my older files may have originally been imported to Scrivener from another program and may be of a version that my WordPad cannot read, but Scrivener can get it right. But I would think that once it is displayed properly in Scrivener and then gets exported, it should be okay. But it’s not. So something else is going on. Scrivener is obviously not just taking the content from the clipboard, as someone told you. Scrivener is interpreting it properly in my case, but not in yours.

MimeticMouton · August 28, 2014, 8:34am

Scrivener follows the RTF 1.9.1 specification and uses Unicode for smart quotes, like Word.

oldrafe, could you provide a sample project containing a document that demonstrates the issue when exported to RTF and opened in a specific program? (WordPad?) You can attach the zipped project here or send it to windows.support AT literatureandlatte DOT com with a reference back to this thread. Being able to see the file directly in Scrivener and test with it will help determine what is happening. Is the text you’re experiencing this with text written in Scrivener, or imported or copied from elsewhere? Are you ever seeing that quotes come out with some correct and some incorrect within the same file?

The export problem described by oldrafe isn’t necessarily related at all to the copy/paste problem from Nota Bene, as these are different processes despite all being related to RTF. DavidR, I’ve already got your sample from the other thread, thanks. Since Scrivener’s RTF parser can’t recognise the characters as Nota Bene is writing them, if you need to copy and paste from there, you can use WordPad (or Word, etc.) as a middleman: paste to a blank WordPad document, then select and copy that and paste into Scrivener. The paste coming from there will be in a format Scrivener can handle.

Alternatively, you could use Paste and Match Formatting, if you don’t mind stripping the rich-text formatting.

DavidR · August 28, 2014, 4:30pm

Thanks, MM, for helping to distinguish between the two issues. I’ll continue to use Paste and Match Formatting for the most part, unless there are more italics, for instance, than smart quotes to deal with.

May I take it that it is Nota Bene’s writing of the hi-bit characters in ANSI rather than Unicode format that prevents Scrivener from recognizing them? Since even lowly WordPad is able to interpret the ANSI encoding along with Unicode, is there any hope that a future (perhaps far future!) update to Scrivener will also enable this?
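
As an illustration of what accepting both forms could look like, the following Python sketch decodes `\'hh` and `\uN` escapes in an RTF fragment into plain characters. It is not how WordPad or Scrivener actually parse RTF. The function name is hypothetical, and the sketch assumes a Windows-1252 code page and exactly one fallback character after each `\uN` (the RTF default); it ignores `\ansicpg`, `\ucN`, and per-font character sets.

```python
import re

# \uN plus one fallback (a literal character or a \'hh escape), or a bare \'hh.
_ESCAPE = re.compile(r"\\u(-?\d+) ?(?:\\'[0-9a-fA-F]{2}|.)|\\'([0-9a-fA-F]{2})")


def decode_rtf_escapes(fragment, codepage="cp1252"):
    # Hypothetical helper: turns both escape styles into real characters.
    def replace(match):
        if match.group(1) is not None:
            code = int(match.group(1))
            if code < 0:
                code += 65536  # \uN values are signed 16-bit
            return chr(code)
        raw = bytes.fromhex(match.group(2))
        return raw.decode(codepage, errors="replace")

    return _ESCAPE.sub(replace, fragment)


print(decode_rtf_escapes("\\'93quoted\\'94 and \\u8220?quoted\\u8221?"))
```

A real parser would also have to track the document's declared code page and font table, which is part of why this is only a sketch.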

oldrafe · August 28, 2014, 8:07pm

MM, the misbehaving texts were originally written elsewhere, probably in WordPerfect, and imported to Scrivener some time ago, in one of the earliest versions of Scrivener for Windows. I have not seen variation within a single file; either it’s all okay, or all missing the quote marks.

The project in which this is happening is huge, several hundred files, and I have no idea how many of them would display this problem upon export, as I am only trying to do this external editing with about half a dozen files and have had the problem with two. So it doesn’t seem efficient to zip you the whole project.

I will try to re-create the problem in a sample dummy project, importing a couple of files from WordPerfect and writing others within the project. It might not work because the computer I’m using now is not the same one on which the original misbehaving files were written.

I’ll report back either way. Meanwhile, thanks, as always, for your help.

MimeticMouton · August 28, 2014, 8:57pm

Thanks, the sample will help a lot if we can get it. What you could try, when you have a chance, is to make a copy of the project with the problem, then trash all but one or two files that show the problem. (Be sure to empty the trash as well.) If these have already been edited in the latest version of Scrivener, it also shouldn’t matter if you trim the text in the files, whether because they’re lengthy or because you want to obfuscate the text.

MimeticMouton · August 29, 2014, 3:05am

Even lowly WordPad is part of Microsoft. I don’t know exactly what WordPad is doing to parse the RTF Nota Bene gives it or whether we will be able to do the same for Scrivener. This also isn’t a common problem for paste or import into Scrivener, so it may come to a point where we just don’t have the time and people needed to rewrite the RTF parser at the moment given all the other, higher-priority tasks still on the to-do list. I’ve passed the sample files and testing notes to the developers and we’ll see what we can do.

DavidR · August 29, 2014, 4:17pm

Thanks, MM. I understand completely.

oldrafe · August 31, 2014, 1:43am

MM, I’m going to save you guys some trouble. I’ve fixed the problem but I haven’t solved the puzzle.

Apparently it was not a matter of RTF flavors but Courier font flavors. I keep my texts in Scrivener in Courier New (holdover from my ancient typewriter years), and use other fonts for research files, metadata, etc. All the files that were written in WordPerfect and imported to Scrivener were also originally in a Courier font, but for some reason some of them apparently came in as straight Courier, not Courier New. I never noticed the minor differences because Scrivener handles both with no problem.

However, when I export back to RTF those files that were originally in straight Courier, they come out in straight Courier and the quotes and em dashes and some variations of paragraph format get lost: black bars, odd spacing, etc. But all I have to do to fix the problem is select the whole file and change the font to Courier New, and like magic, all the proper punctuation reappears.

So all’s well here and I no longer need your further analysis. But of course there are still a couple of mysteries that you wizards might enjoy savoring with your latte. Why does a file that looks perfectly okay in Scrivener revert to something else on export? And then that same file, reimported to Scrivener, is again okay. And what causes some files to import in one flavor of the font while others, from the same source, come in as another? A possible clue is that at least one of the errant files originally contained a WordPerfect comment that ended up as metadata. Finally, regardless of font, why did punctuation get lost in the journey in and out of Scrivener?

Thanks again for your help.