Highlighting Web Pages

It would be nice to highlight archived web pages directly, but unfortunately WebKit, the macOS technology used to save and display archived web pages, doesn’t have that capability. So while invoking some kind of stripped-down HTML editor inside of Scrivener is out of the question, there are still some very useful approaches you can consider:

  • Document Notes: everything in the binder has them, and they are particularly useful for cases where we either cannot attach, or do not want to attach, our notes to specific content within the material itself.
  • Convert to Text: you might consider converting archived pages to text. It’s a matter of taste, I suppose: some people like the page to look the way it did, but for those mainly interested in the content it is a nice alternative. Not only can you weed out all of the navigation, ads and other junk, but once it is text you can annotate it to the full extent that Scrivener can annotate any text.

The Documents ▸ Convert ▸ Web Page to Text command is for switching over to text from a previously archived .webarchive file (note that this is destructive; duplicate the original first if you want to retain a copy in that state).

  • Import as Text: and if you’re like me, you couldn’t care less what the article looked like on the web site 99.99% of the time, and thus always do want text (in all of its editable glory). Instead of manually converting every time, the Sharing: Import preferences tab contains options to automatically import all web content as editable text from the get-go. This also applies to loose .webarchive files you import from Finder.
  • Handled from the Browser: and in some cases I’d prefer not to have Scrivener do any of the importing. Some pages are such a huge mess it’s more effort to clean up all of the junk than it is to simply get the right content from the very start. Browsers all come with fairly effective “Reader mode” views these days, from which copy and paste produces a very clean result. Beyond that, there are tools like Aardvark that can isolate content and suppress page elements until you have what you need, and can then select all, copy and paste. The quality of that depends on the browser though.
  • Convert to PDF: as with WebKit, PDFKit limits Scrivener to a basic viewer, but in external programs it is generally easier to annotate PDFs than it is to annotate web pages. So archiving your imported content from the Web in this format is, I would say, a decent compromise for those cases where the layout of the text is very important to the content, and thus a raw dump of the text content itself doesn’t suffice. There are WebArchive → PDF converters out there, but it might be easier to simply Print from your browser, and use the “PDF” button in the print dialogue to send the copy to Scrivener. And if you’re like me you might often find the printed version of a page more suitable for research in general. Sites sometimes use cleaner CSS for print; the difference between a PDF from Wikipedia and a WebArchive is pretty substantial.
  • Edit as WebArchive (??): you will note that we do allow one to open WebArchive files in an external editor. Good luck finding something that lets you annotate these files though. I did a couple of quick searches, and the general consensus seems to be that if one wants to annotate a WebArchive, the best approach is the above: convert to PDF. It makes sense: why spend years building a program that edits one proprietary macOS format when you could spend that energy on an open format that was designed to be annotated from the start? Perhaps you’ll find something though; I have seen attempts at it in the past. To my knowledge none of them survived.
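
If you would rather do the "convert to text" step outside of Scrivener, for instance on a folder of loose .webarchive files, here is a minimal sketch in Python. It is not how Scrivener performs the conversion. It assumes the file is a property list with the page's HTML stored under WebMainResource ▸ WebResourceData, which is the usual layout of Safari web archives, and the function name webarchive_to_text is just an illustrative choice. Layout, images and links are all discarded.

```python
import plistlib
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty runs of text found outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def webarchive_to_text(raw: bytes) -> str:
    """Return the main page of a .webarchive (given as bytes) as plain text.

    Assumption: the archive is a plist whose "WebMainResource" dictionary
    holds the HTML bytes in "WebResourceData" and, optionally, the encoding
    name in "WebResourceTextEncodingName".
    """
    archive = plistlib.loads(raw)
    main = archive["WebMainResource"]
    encoding = main.get("WebResourceTextEncodingName") or "utf-8"
    try:
        html = main["WebResourceData"].decode(encoding, errors="replace")
    except LookupError:
        # Encoding name unknown to Python: fall back to UTF-8.
        html = main["WebResourceData"].decode("utf-8", errors="replace")
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)
```

To use it, read the archive as bytes (for example with open("article.webarchive", "rb"), where the file name is hypothetical), pass the result to webarchive_to_text, and paste or import the returned text into Scrivener. Navigation and ads that are ordinary page text will still come through, so some manual weeding is usually needed.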