Highlighting Web Pages

I like that I can add web pages to my project; it integrates nicely with my research notes. A nice bonus would be the ability to annotate web pages, or at the very least to highlight text.

It would be nice, but unfortunately WebKit, the macOS technology used to save and display archived web pages, doesn’t have that capability. So while invoking some kind of stripped-down HTML editor inside of Scrivener is out of the question, there are still some very useful approaches you can consider:

  • Document Notes: everything in the binder has them, and they are particularly useful for cases where we either cannot attach, or do not want to attach, our notes to specific content within the material itself.
  • Convert to Text: you might consider converting them to text. It’s a matter of taste I suppose; some people like the page to look the way it did, but for those mainly interested in the content it is a nice alternative. Not only can you weed out all of the navigation, ads and other junk, but once it is text you can annotate it to the full extent that Scrivener can annotate any text.

The Documents ▸ Convert ▸ Web Page to Text command is for switching over to text from a previously archived .webarchive file (note that this is destructive; duplicate the original if you want to retain a copy in that state).

  • Import as Text: and if you’re like me, you couldn’t care less what the article looked like on the web site 99.99% of the time, and thus always do want text (in all of its editable glory). Instead of manually converting every time, the Sharing: Import preferences tab contains options to automatically import all web content as editable text from the get-go. This can also be done for loose .webarchive files you import from Finder.
  • Handled from the Browser: and in some cases I’d prefer not to have Scrivener do any of the importing. Some pages are such a huge mess it’s more effort to clean up all of the junk than it is to simply get the right content from the very start. Browsers all come with fairly effective “Reader mode” views these days, from which copy and paste produces a very clean result. Beyond that, there are tools like Aardvark that can isolate content and suppress page elements until you have what you need, and can then select all, copy and paste. The quality of that depends on the browser though.
  • Convert to PDF: as with WebKit, PDFKit limits Scrivener to a basic viewer. It is, however, generally easier to annotate PDFs in external programs than it is to annotate web pages. So archiving your imported content from the Web using this format is, I would say, a decent compromise for those cases where the layout of the text is very important to the content, and thus a raw dump of the text content itself doesn’t suffice. There are WebArchive → PDF converters out there, but it might be easier to simply Print from your browser, and use the “PDF” button in the print dialogue to send the copy to Scrivener. And if you’re like me you might often find the printed version of a page more suitable for research in general. Sites sometimes use cleaner CSS for print; the difference between a PDF from Wikipedia and a WebArchive is pretty substantial.
  • Edit as WebArchive (??): you will note that we do allow one to open WebArchive files in an external editor. Good luck finding something that lets you annotate these files though. I did a couple of quick searches, and the general consensus seems to be that if one wants to annotate a WebArchive, the best approach is the above: convert to PDF. It makes sense: why spend years building a program that edits one proprietary macOS format when you could spend that energy on an open format that was designed to be annotated from the start? Perhaps you’ll find something though; I have seen attempts at it in the past. To my knowledge none of them survived.
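
For anyone comfortable with a little scripting, the "get at the text" idea can also be done outside of Scrivener. What follows is only a rough sketch, not how Scrivener itself does the conversion. It assumes the usual .webarchive layout (a property list whose "WebMainResource" entry holds the page HTML under "WebResourceData", with an optional "WebResourceTextEncodingName"); the function names are made up for illustration.

```python
import plistlib
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def webarchive_main_html(raw: bytes) -> str:
    """Return the main page's HTML from the bytes of a .webarchive file.

    Assumes the plist keys named in the lead-in are present.
    """
    main = plistlib.loads(raw)["WebMainResource"]
    encoding = main.get("WebResourceTextEncodingName") or "utf-8"
    data = main["WebResourceData"]
    try:
        return data.decode(encoding, errors="replace")
    except LookupError:
        # Encoding name Python does not recognise: fall back to UTF-8.
        return data.decode("utf-8", errors="replace")


def html_to_text(html: str) -> str:
    """Reduce HTML to its visible text, one text run per line."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

With a hypothetical file named page.webarchive, you would read it in binary mode and call html_to_text(webarchive_main_html(raw)). Navigation, ads and other junk still come along as text, so this saves the conversion step but not the cleanup.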

Pretty significant limitation. I get the Mac technology issue, but honestly it sounds like you don’t care to ever enable the functionality.

Hmm, on the PC Scrivener can import and view plain old HTML files as well as the self-contained WebArchive documents. The user can modify the HTML in any visual HTML editor or in Notepad. Viewed within Scrivener, the HTML can redirect to any other HTML doc in the user’s file system or on the web, adhere to style sheets, and even run scripts. Is this not the case on the Mac? These are capabilities I would not be without.

Rgds - Jerome

Is this what people (or the OP) would normally mean by “annotate”? Editing the raw source code underlying the file format? You can do all that on a Mac, but if someone’s asking for Scrivener to allow you to select text in Scrivener’s editor and highlight it in yellow or green, that’s a bit different, no?

How is that a significant limitation? And how is it possible to “enable functionality” that does not exist, exactly? Not many apps allow you to highlight web pages out of the box, because that essentially involves providing a rich text HTML or web editor - not exactly simple stuff. And Scrivener is not an HTML or web page editor - it just allows you to bring in web pages to reference them. And, as already pointed out, you can absolutely convert a web page to text and highlight it that way, so if you need a way of annotating web pages - it’s already there.

Listen buddy, we all know that KB has the code for a “publish best-selling-novel” button embedded in Scrivener, and we’ve been asking him to “enable” that functionality for years. Get in line. :smiling_imp:

Hi rdale - The OP gave highlighting HTML as an example of annotation. I don’t think editing raw source code is necessary to the criteria. If I can view a file type within Scrivener’s panes, and open it for editing through Scrivener, even if with a third-party editing tool, then my own annotation criteria are satisfied. PDFs, WebArchives, image files, multimedia files, HTML files are among those types Scrivener supports, by this standard, under Windows.

Amber advised OP that Scriv would certainly bring up a WebArchive for editing with a tool of the OP’s choice, but that WebArchive is not an editor-friendly file format. I use pure HTML as my primary reference file format. On the PC, HTML files are editor-friendly, zoom and wrap friendly, viewable within Scrivener, and easily dragged to and viewed on a smartphone. So I was wondering whether this was the case as well on the Mac.
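
To make "editor-friendly" concrete: because an HTML file is plain text, even a few lines of script can highlight a phrase in it. This is a naive sketch under my own assumptions, not a Scrivener feature. The highlight function is hypothetical, it wraps matches in the standard <mark> element, and it rewrites the HTML source itself rather than adding a separate annotation layer.

```python
import re
from html import escape


def highlight(html: str, phrase: str) -> str:
    """Wrap each occurrence of `phrase` in the page text with <mark>...</mark>.

    Naive on purpose: it leaves tags and attributes alone, but it will not
    match a phrase that spans several tags, and it does not treat <script>
    or <style> contents specially.
    """
    # Escape the phrase the way it would appear in HTML text (& becomes &amp;).
    needle = re.compile(re.escape(escape(phrase, quote=False)), re.IGNORECASE)
    # Splitting on a capturing group keeps the tags; they land on odd indices.
    parts = re.split(r"(<[^>]*>)", html)
    for i in range(0, len(parts), 2):  # even indices are the text between tags
        parts[i] = needle.sub(lambda m: f"<mark>{m.group(0)}</mark>", parts[i])
    return "".join(parts)
```

You would read the .html file, pass its contents through highlight, and write the result back; browsers render <mark> with a yellow background by default.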

Cheers – Jerome

I kind of glossed over that, but yes, the Mac can do the same thing, including rendering (but not editing) plain-text .html files. Underneath the GUI, there’s a full Unix-like environment (based on a core of the BSD operating system), complete with an Apache web server, command-line editors, Perl, Python, and shell scripting, not that it matters if you have a GUI editor. And Scrivener for Mac can absolutely associate an external editor of the user’s choosing for any file type that Scrivener doesn’t convert to a rich text file. For other document types that you import, macOS provides a read-only “Quick Look” view that Scrivener for Mac leverages (one of the many things that I believe the Windows crew has to build from the ground up if they want it in Scrivener for Windows).

My main point of contention is that markup, while it is kind of a “squishy” term, implies that there’s a layer of markings that don’t touch the original source, whereas even if there is a program out there for editing webarchives reliably, my guess is that it won’t separate out “markup” from the source, permanently changing the original files.

FYI, there is a free program that can edit webarchives on the Mac: It’s called Bean. bean-osx.com/fileformats.html But once you edit and save, much of the stylesheet info seems to get lost, and the formatting gets very basic. I don’t think there’s a Windows version… and it’s still not something that provides a separate layer of markup the way that PDFs can provide.

After years of being annoyed with how badly the Mac’s WebKit mangles some web pages, I’ve turned to “Reader view” + “print to PDF” to provide me with a snapshot of the information in a consistent, readable font. It works far more reliably, and it brings PDF annotations into the mix.

Robert, thanks for clearing that up. I did take it as a given that highlighting would alter the copy of the document in the binder. And with the understanding that Scrivener is a project manager for writing rather than a file editor for an assortment of doc types, I’d maintain that Scrivener actually does provide the capability that the OP was seeking.

Rgds - Jerome