Technical question on RTF images

gshenaut · March 2, 2010, 12:53am

I’ve noticed a few things about how jpg and png images are imported from and exported to RTF files and wanted to ask about them. My goal is to import generic RTF and to export Word-compatible RTF (I’m test-exporting to Word 2008 for Mac and importing from a variety of sources).

First, it appears that the jpegblip command can be used for both jpegs and pngs, so the pngblip command is never needed. Is that correct? In any case, it appears that Scrivener uses jpegblip in both cases, which Word likes fine, but if you change the jpegblip to pngblip with a png image, Word won’t display it.

Second, Scrivener appears to import jpegs and pngs that have been sized using the picwgoal and pichgoal commands, but exports them only with picw and pich, scaled to the same aspect ratio. According to the RTF spec, picw & pich are supposed to be used only for bitmap files, but in experimenting with TextEdit, it appears that only they are supported and that picwgoal and pichgoal don’t work. Is this due to the reliance of TextEdit on rtfd for images? This appears to confuse Word, causing it to size images in Scrivener-produced RTF incorrectly. In fact, based on my experimentation, Word just ignores picw and pich with jpegblip; you can delete them and it doesn’t make any difference (i.e., use just “\pict \jpegblip data…”)

Anyway, I wanted to know: when converting from picwgoal and pichgoal, which are sized in twips, to picw and pich, which are in pixels, it seems like there should be some fixed conversion factor, for example, based on 72 ppi, 96 ppi, or whatever. However, as I play with it, I don’t arrive at any simple factor (for example, picw326 : picwgoal3832). Is there a constant factor or other method being used here? If so, what is it? (I suppose it doesn’t matter since Word won’t care, but I’m sort of curious.)

Cheers,
Greg Shenaut

KB · March 2, 2010, 10:13am

Hi Greg,

An interesting question! Obviously I’ve done a lot with RTF and have spent a lot of time with the specs, having added so much extra RTF support to Scrivener, so this is a topic close to my heart.

This should work pretty well out of the box, as I’ve done a lot of work on RTF and I test with Word 2008 myself.

I’ve never tried this, so I’d be surprised if it works. According to the docs, JPG files should use jpegblip and PNG files should use pngblip:

No, Scrivener uses jpegblip if you have set the Preferences (General panel) to use JPG format for images in RTF files, and pngblip if you have set the Preferences to use PNG format:

The import format is ignored - on export they get converted to whichever format is chosen there. The data is saved in hexadecimal format in the RTF file. So my guess is that you have imported a PNG file and exported it with the preferences set to use JPG format, and noted in the RTF that the format is set to jpegblip - but this is correct because it has been saved as JPG data. (2.0 will be better in this regard - if a file is imported as JPG, it gets exported as JPG using the original compression; all other files get exported as PNG.)
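To illustrate the point that the blip control word has to match the data actually stored, here is a minimal Python sketch that picks \jpegblip or \pngblip from the image's leading bytes and writes the data out as hex. The function names `blip_for` and `rtf_pict` are made up for this example; this is not Scrivener's code, only an illustration of the idea.

```python
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"   # fixed 8-byte PNG signature
JPEG_MAGIC = b"\xff\xd8"           # JPEG start-of-image marker

def blip_for(data):
    """Choose the RTF blip control word from the image's leading bytes."""
    if data.startswith(PNG_MAGIC):
        return "\\pngblip"
    if data.startswith(JPEG_MAGIC):
        return "\\jpegblip"
    raise ValueError("unrecognised image format")

def rtf_pict(data, width_twips, height_twips):
    """Return a \\pict group with the image stored as hexadecimal text."""
    return "{\\pict%s\\picwgoal%d\\pichgoal%d %s}" % (
        blip_for(data), width_twips, height_twips, data.hex())

# Hypothetical three-byte "JPEG" just to show the output shape.
print(rtf_pict(b"\xff\xd8\xff", 1440, 720))
```

A PNG that has been re-encoded as JPEG starts with the JPEG marker, so this sketch would label it \jpegblip, which matches what was observed in the exported file.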

Hmm, interesting. I had read the specs as using bitmaps in that instance in a generic way (as both JPG and PNG files store bitmap data), but I could be wrong. Looking in my O’Reilly RTF pocketbook, it looks as though it would be better to use both picw and picwgoal. But I’m talking about exporting - when it comes to importing, Scrivener’s image-importing routines are more basic and currently ignore the picw and picwgoal settings. It just grabs the image data and dumps it in the text.

Indirectly. TextEdit doesn’t support images in RTF at all (neither does it support headers, footers, comments or footnotes). So I have had to write my own code to handle images (and all the other stuff), and sort of graft it onto the RTF import/export methods that already exist (one day I’d love to write my own parser from scratch, if I get a spare couple of years).

It sounds as though you are making your own edits to the RTF itself, and I’m wondering why you would want to do this? Why are you manually changing these settings in the RTF? When Scrivener sets the picw and pich controls, it does so by passing in the actual size of the image, as it appears in Scrivener. So if you resize the image in Scrivener (double-click on it to bring up the scaling panel), then the images should appear the same size when you open the file in Word. So to me, it sounds as though you are making things more difficult for yourself than they need to be - did you know Scrivener could scale images? I’m wondering if you missed that.

Well, picw and pich are set in pixels. picwgoal and pichgoal are set in twips, as you say, and one twip is one-twentieth of a point. It should be safe to assume that one pixel = one point, so picw326 should be picwgoal6520. But as I say, you shouldn’t need to mess around with the RTF file in this manner.
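As a quick check of that arithmetic, here is a small Python sketch. It relies on there being 20 twips to the point and on the assumption stated above that one pixel equals one point; the function name is hypothetical.

```python
TWIPS_PER_POINT = 20  # a twip is one-twentieth of a point

def points_to_twips(points):
    """Convert a length in points to twips."""
    return round(points * TWIPS_PER_POINT)

# Treating picw326 as 326 points (one pixel = one point):
print(points_to_twips(326))  # 6520
```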

Hope that helps.
All the best,
Keith

gshenaut · March 2, 2010, 2:42pm

Thank you!

I had wondered whether Scrivener might have been modifying the images internally. For some reason, I had missed or forgotten the preferences setting you mentioned, so obviously the reason why png images I imported into Scrivener always were coming out as jpegblip is that Scrivener converted them into jpegs! That clears up one bit of confusion.

As for the image-scaling business, your answer also clears up some confusion for me. So on import of an RTF file containing an image, all of the possible RTF scaling and sizing information is ignored, and the image is simply placed into the text? I didn’t suspect that. I did know that Scrivener resizes images; in fact, it appears that one must use this function to get a printable file if the original image is larger than the margins, but I was hoping to be able to import RTF files that contained images that were already sized correctly. How does Scrivener do the resizing on export: does it actually recompute the image bits and put out a file that has different dimensions from the image’s dimensions on import at the bit level? Is this why one must choose an image output format, because the images are being recomputed on export?

Now, your explanation of how the numbers used in the exported picw & pich commands are derived is less clear to me. I spent quite a bit of time on the Internet trying to find a rule for this, and what I found was that the size of a pixel is intended to vary depending on the output device. That is, there is no constant size of a pixel. Apparently, this fact has confused Windows-based developers of, e.g., forms that must display and/or print on a wide range of devices, for quite some time. I guess that early on, most people assumed 96 dpi, since that corresponded to VGA and other early displays, but of course that changed with different displays and was useless with things like laser printers. I believe that on Windows, there was a system-wide pixel-size parameter that could be set to control this on different machines according to the main display’s resolution. If you look at the early, Windows-only bitmap graphics formats that were being inserted into RTF files back then, you’ll see that the dimensions of the bitmap were found only in the picw/pich commands; they were essential for the correct interpretation of the hex-coded bits. Later, when the “metafile” formats were developed, they were also encoded into the graphics files themselves, as of course they are in jpegs and pngs, and picw/pich became redundant. If Scrivener is, in fact, recomputing the image bits as they are resized, then I suppose the exported picw/pich numbers are the actual dimensions of the (recomputed) image and therefore they don’t have any connection to image size in twips or inches except in terms of whatever Word might be using as its default dpi setting.

The picwgoal/pichgoal commands, along with all the other image-scaling commands, were aimed at the problem of printing RTF files, because in the world of print, pixels are the wrong way to do things and points, twips, and dynamic rescaling of images are the right way.

So, in my playing around with these files, what I found was that it was not necessary to modify the content of image files at all; for Word, they can be resized simply by changing the picwgoal/pichgoal commands preceding them. However, this doesn’t work with the picw/pich commands, and in fact, as I said before, Word may be ignoring them completely; it definitely does so in some cases. (In any case, if Scrivener is simply copying the actual dimensions encoded into the embedded (resized) image file, they are redundant.)

As to why I am doing this, well, I’m still in the mode of trying to make submittable manuscripts in APA format that are Word-compatible. One of the new requirements is that figures be incorporated into the text on the same page as the figure caption (previously, they were submitted separately as high-resolution image files). I’m trying to understand how Scrivener’s conception of RTF images differs from Word’s, specifically as it pertains to Word’s notion of embedding an image file “as-is” (i.e., full resolution, just copied into the text file in hex or binary format), but scaled for printing in a specific manner. It would be useful if the full-resolution versions of the figures could be embedded into the submitted manuscript rather than having to provide them separately later on.

Thanks again for your illuminating reply, RTF is just unbelievably opaque.

Greg Shenaut

KB · March 2, 2010, 5:03pm

When it comes to points and pixels, everything is messed up. You are right that points and pixels are separate things, but in many places they are assumed to be the same thing. In fact, the information about width and height I am placing in picw and pich is derived from NSImage’s -size method, which really returns the size in points. So to get the correct height and width in twips, you just need to multiply the height and width Scrivener gives in picw and pich by 20. (Scrivener should really use NSImageRep’s -pixelsHigh and -pixelsWide in the picw and pich settings, and I’ll look into this in the future.)

As for resolution, you shouldn’t have any problems here. Even though the image is resized, its image data is retained. So it is resized, not rescaled. I have just tried going back and forth between Word and Scrivener, scaling images to the smallest they will go in each, opening up the RTF file in the other application and scaling them large again, and no resolution was lost.

All the best,
Keith

gshenaut · March 2, 2010, 8:50pm

OK, I’m beginning to understand this better.

I did some tests similar to what you did with scaled images, and I learned two things from them: as before, the picw/pich commands could be removed from the exported Scrivener file, and Word would still display the images perfectly well, sized correctly. I also doubled both of the numbers to see what Word would do, and as before, this had no effect on the size of the image displayed in Word.

Next, I extracted two image bitmaps (one for png, one for jpeg) from Scrivener-exported files and converted them byte-for-byte into binary files with the appropriate suffix, and opened them in Preview. They opened in their “full” (unresized) size (in this case, it was a 640x480 cellphone photo). However, when I opened Preview’s information window for the images, it turned out that for both jpegs and pngs, there is a “dpi” number encoded in the file, which Preview reports but apparently doesn’t use in displaying the image.
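For anyone wanting to repeat that extraction, here is a rough Python sketch of one way to pull hex-encoded images back out of an RTF file. It assumes simple \pict groups with no nested groups inside them and no \bin binary data, so it is not a general RTF parser. The name `extract_picts` is invented for the example.

```python
import re

# Matches a \pict group that contains no nested braces (an assumption).
PICT_RE = re.compile(r"\{\\pict\b([^{}]*)\}")

def extract_picts(rtf_text):
    """Yield (blip_type, image_bytes) for each simple \\pict group."""
    for match in PICT_RE.finditer(rtf_text):
        body = match.group(1)
        blip = re.search(r"\\(jpegblip|pngblip)", body)
        # Drop control words and their numeric arguments, keeping the hex digits.
        hex_text = re.sub(r"\\[a-zA-Z]+-?\d* ?", "", body)
        hex_text = re.sub(r"\s+", "", hex_text)
        yield (blip.group(1) if blip else None, bytes.fromhex(hex_text))

sample = "{\\rtf1 text {\\pict\\jpegblip\\picw2\\pich1 ffd8\nffe0} more}"
for kind, data in extract_picts(sample):
    print(kind, len(data))
```

Each yielded byte string can then be written to a file opened in binary mode, with a .jpg or .png suffix chosen from the blip type, and opened in an image viewer.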

This accounts for how NSImage -size can report points, given pixels and dpi. It also accounts for how at least these two types of NSImage can be resized with no loss of information.
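The relationship between pixels, dpi and twips can be sketched in a few lines of Python. This assumes the usual 72 points per inch and 20 twips per point (1440 twips per inch); the function names are hypothetical, and the second one only reports what dpi a given picw/picwgoal pair would imply if both numbers describe the same image.

```python
TWIPS_PER_INCH = 1440  # 72 points per inch x 20 twips per point

def pixels_to_twips(pixels, dpi):
    """Physical size in twips of a pixel count at the given dots per inch."""
    return round(pixels * TWIPS_PER_INCH / dpi)

def implied_dpi(pixels, twips):
    """Dots per inch implied by a pixel count and a size in twips."""
    return pixels * TWIPS_PER_INCH / twips

print(pixels_to_twips(640, 72))   # 640 px wide at 72 dpi
print(pixels_to_twips(640, 96))   # the same pixels at 96 dpi come out smaller
print(implied_dpi(326, 3832))     # the pair quoted in the first post
```

At 72 dpi this reduces to the multiply-by-20 rule, and the picw326 : picwgoal3832 pair from the first post works out to roughly 122.5 dpi, which would explain why no round factor turned up.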

Furthermore, it accounts for how picw/pich can be ignored and how picwgoal/pichgoal can be used: the dimensions in bits always come from the file, never from picw/pich, but apparently, if a picwgoal/pichgoal command is present, it can override the internal dpi parameter.

And finally, it tells me what I need to do to get a correctly pre-sized image into Scrivener or Word: I need to stop playing around with picw/pich and instead learn how to set the dpi parameter in the image files themselves.

Thanks again,
Greg

KB · March 2, 2010, 10:26pm

Hmm, this has been interesting - and fruitful. I recently did a lot of work on the RTF import/export code, so I took another look today to see if it would be easy to address some of these things, and I think I have it licked. On export, Scrivener 2.0 will enter the correct pixel values for picw and pich and also the correct twip values for picwgoal and pichgoal. And even better, it should also be able to import images at the correct size. The other thing it has to take into account is picscalex and picscaley. So version 2.0 looks for picwgoal and pichgoal. It then scales them (or the default image size if they don’t exist) by picscalex and picscaley and resizes the image (without affecting resolution of course) before inserting it into the text.
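The import rule described here can be sketched in Python as follows. This is only an illustration of the described logic, not Scrivener's actual code: the function names are hypothetical, the image's default size is assumed to be supplied in twips by the caller, and picscalex/picscaley are treated as percentages defaulting to 100, as in the RTF spec.

```python
import re

def pict_params(header):
    """Collect numeric \\pic... control words from a \\pict header into a dict."""
    return {name: int(value)
            for name, value in re.findall(r"\\(pic[a-z]+)(-?\d+)", header)}

def display_size_twips(header, natural_w_twips, natural_h_twips):
    """Goal size (or the image's default size) scaled by picscalex/picscaley."""
    params = pict_params(header)
    width = params.get("picwgoal", natural_w_twips)
    height = params.get("pichgoal", natural_h_twips)
    scale_x = params.get("picscalex", 100)  # percent
    scale_y = params.get("picscaley", 100)  # percent
    return round(width * scale_x / 100), round(height * scale_y / 100)

header = ("\\pict\\jpegblip\\picw640\\pich480"
          "\\picwgoal6400\\pichgoal4800\\picscalex50\\picscaley50")
print(display_size_twips(header, 12800, 9600))  # (3200, 2400)
```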

All the best,
Keith

gshenaut · March 3, 2010, 12:47am

That sounds perfect.

Cheers,
Greg