Hello, I became curious why my project backup ZIP was larger than expected. So I unzipped it, selected the file and used “Show Package Contents”. From there the structure was quite obvious as there were my PDF research documents, various RTF files and so on.
I saw one RTF file that was much larger than the others. It was my free-form notes weighing in at 1450 words. It only contained typed content; no photos or anything like that.* I find it odd that macOS tells me that the file is over 3 MB. (Exported to plain text the file became 12 KB.)
While theorizing, I wondered:
-Do RTF files retain history?
-Do RTF files embed lots of extra stuff like font families?
-Are the many Scrivener-created hairlines in this particular file culprits? (Edit > Insert > Horizontal Line > Centered Line)
Just wondering if anyone has insight. Thanks!
*The file did have a couple of pasted photos in it at one point, hence my wondering whether file history is retained.
This is an interesting theory. Of course your Mac will save old versions of files in the background; that is just something it does. It is likely that a copy of the RTF with the pasted images still exists deep in the Mac, but it’s not going to be zipped in the backup, nor will it show up in “Get Info” when looking at the present-tense version of the file. So ultimately I doubt that is the culprit, unless something is different with how your Mac works compared to mine.
No, in fact the RTF code produced by the text engine is quite economical by most standards.
If you examine what gets inserted there, you’ll find it is just a bunch of non-breaking spaces with underline formatting. In terms of bytes in the file, that is no more complex than an equivalent number of alphabetic characters set to underline.
What I would do is open a copy of the RTF file in a plain-text coding editor. If you don’t have one, enable “Display RTF files as RTF code instead of formatted text” in TextEdit’s “Open and Save” preferences, then load the file. Scroll through this, and I bet you’ll come across tons of gibberish at some point. Make note of where the gibberish starts relative to the content you can read, and then load the file normally. Somewhere near that point, you must have an image, maybe a big one that got resized down so tiny you can’t even see it. Selecting the region and retyping it may solve a problem like that.
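If scrolling through the raw code by eye is tedious, a short script can do the hunting for you. This is only a rough sketch, and it assumes the image is stored the way standard RTF usually stores pictures: as a `\pict` group followed by a long run of hex digits. The function name `find_embedded_pictures` is made up for illustration, and the brace matching is deliberately simplistic (it does not handle raw `\bin` data, for instance).

```python
import re

def find_embedded_pictures(rtf_bytes):
    """Return (offset, approx_size) for each picture group found in raw RTF bytes."""
    results = []
    # Match the "\pict" control word only, not a longer word that starts with it.
    for match in re.finditer(rb"\\pict(?![a-z])", rtf_bytes):
        start = match.start()
        depth = 0
        i = start
        # Walk forward until the brace that closes the group holding the picture.
        while i < len(rtf_bytes):
            ch = rtf_bytes[i:i + 1]
            if ch == b"\\":
                i += 2  # skip the backslash and the next byte (covers \{ and \})
                continue
            if ch == b"{":
                depth += 1
            elif ch == b"}":
                if depth == 0:
                    break  # this brace closes the picture group
                depth -= 1
            i += 1
        results.append((start, i - start))
    return results
```

To try it on a real file, read a copy in binary mode, e.g. `find_embedded_pictures(open("notes.rtf", "rb").read())`, where `notes.rtf` is a placeholder name. A result in the millions of bytes next to a few kilobytes of text would point to a leftover image, and the offset tells you roughly where in the file to look.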
Thanks. That last bit about using TextEdit to evaluate the file’s inner workings revealed that indeed there was an image attached. And it turns out the image showed up when I viewed the file as part of a Scrivener project, but it didn’t show up when I extracted the file and viewed the RTF as I described in my original post.
However, I removed the image and got the file down from 3+ MB to less than 25 KB.
Lesson learned when haphazardly pasting images into documents. Thanks again.
Okay, some RTF editors (TextEdit is one) will not show inline images, so that’s probably what happened. If you had exported the file as RTFD from Scrivener, the image would have shown in TextEdit.