How to bulk import web pages into research and bulk convert them to text?

I create complex technical papers with Scrivener, and I sometimes receive documents containing a lot of links. For research, I need the content of the web pages behind those links.

Example content of such a file:

… 100+ more lines of content like the above

Q1: Can I import the URLs all at once, but each as a single item, into the ‘Research’ folder?

As a result I have hundreds of web pages in Scrivener.

Q2: Can I then bulk convert these web pages to text?

As a result I have hundreds of text pages in Scrivener within my ‘Research’ folder.

How can I make this workflow possible?

I would do the initial downloading with another tool, such as wget (or a front end, if you prefer), beforehand, so that you end up with a batch of .html files that can be imported in bulk like any other files. Scrivener itself has no “scour” feature that works the way you need here; it imports one URL at a time. The other reason for preferring that approach is efficiency: for whatever reason, acquiring a page through Scrivener’s whole archive mechanism (even when going straight to text) takes much longer than simply loading the page in a browser, and even a browser is much slower than something like wget, which downloads files as fast as your connection to the server allows.
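As a rough sketch of that first step, assuming your links live in a plain-text export of the document (the file name `links.txt` and the sample content here are made up for illustration), you could pull the URLs out with grep and hand the list to wget:

```shell
# Stand-in for a document full of links; replace with your own exported file.
cat > links.txt <<'EOF'
Intro: see https://example.com/a for background.
Details at https://example.com/b and again https://example.com/a
EOF

# Extract every http(s) URL, one per line, de-duplicated.
grep -Eo 'https?://[^ "<>)]+' links.txt | sort -u > urls.txt
cat urls.txt

# Once urls.txt looks right, download each page as a regular .html file
# into ./pages/, ready for a bulk import into Scrivener:
#   wget --directory-prefix=pages --adjust-extension --input-file=urls.txt
```

The download line is left commented so you can inspect `urls.txt` first; `--adjust-extension` makes wget save pages with an `.html` suffix, which matters for the import-as-text conversion described below.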

As for importing as text, though, there are two ways of doing that:

  • In advance: in the case where you have 50 .html files you want to import as text, set the Convert HTML files to: Text setting, in the Sharing: Import preference pane. (Note also a checkbox below that to do likewise for downloaded .webarchive files and live downloads.)
  • Retroactively: select the WebArchive files in the binder, and run the Documents ▸ Convert ▸ Web Pages to Text menu command on the batch.