File Import > Web Pages > PDF via Internet Explorer Crashes Computer

Fourth try and I’ve been able to reproduce this every time.

Windows Version: 1.9.7.0

When I import a web page using the File > Import Web Pages > PDF via Internet Explorer on a Windows 10 (updated this morning) machine, it crashes the whole thing. Total reboot. No warning, nothing. Just bang, black, and I’m at the Win login screen.

The import web pages features have always been an issue with the Windows version, and I would not normally test run each of the options, but I’m teaching a class on Scrivener and wanted to display the results of this import view in the Research folder. Crashes each time, so skipping it for these students. SIGH.

Internet Explorer is a problem for lots of things. That is why Microsoft abandoned it and is trying to get people to use Edge, the browser they’ve been pushing for a couple of years now that comes with Windows 10 and is recommended for use with Windows 10.

So I tried to replicate your issue by importing a PDF via WebKit into Research. It did nothing, but there was no crash. However, the PDF was a fillable form, so it probably can’t be translated into RTF for Scrivener. So I tried a simpler PDF file from the Internet and got a “file cannot be imported” message, but at least no crash.

I would download the pdf then save it to a files folder under your Scrivener project then do a file link to it on your own computer. This will allow the pdf to be displayed properly with your pdf viewer using a scrivener link.
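
If you would rather script the download step than do it by hand, here is a minimal Python sketch. It only fetches the PDF and drops it into a folder you choose; linking it inside Scrivener is still a manual step. The function name `download_pdf`, the `files` subfolder name, and the example paths are my own assumptions, not anything Scrivener defines.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import unquote, urlparse


def download_pdf(url: str, project_dir: str, subfolder: str = "files") -> Path:
    """Download `url` into <project_dir>/<subfolder> and return the saved path.

    `project_dir` and `subfolder` are hypothetical: point them wherever you
    keep material that belongs with your project.
    """
    target_dir = Path(project_dir) / subfolder
    target_dir.mkdir(parents=True, exist_ok=True)

    # Take the filename from the URL path; fall back to a generic name.
    name = Path(unquote(urlparse(url).path)).name or "download.pdf"
    target = target_dir / name

    # Stream the response to disk rather than holding it all in memory.
    with urllib.request.urlopen(url) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return target
```

Something like `download_pdf("https://example.com/report.pdf", r"C:\Writing\MyNovel")` (both values are placeholders) would then leave the PDF in `C:\Writing\MyNovel\files`, ready to be linked from the project.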

TL;DR “Not a Microsoft problem, but will bash them anyway”

It is always easy to bash Microsoft, especially for Internet Explorer. I teach my students not to use it, get excited for the promises of each new iteration or update, then quickly go back to not recommending it again and again. So I’m just as guilty as a teacher of web publishing, design, and development.

Aside from that, blaming IE does not settle whether this is a single instance, a bug, or something else. I found a few other comments about problems with web page imports, so I leave this here with the hope that either the issue will be resolved or someone may find a better solution.

I made it a habit to save web pages as PDFs in the browser a long time ago to import into Scrivener, so this isn’t my first go-around with this. I was just hoping for a more definitive answer for my students other than “don’t use it” or “let’s sing the bash-MSIE song again.” :smiley:

Thanks.

I’ll try and be clear here.

  1. IE has been deprecated by Microsoft, but many programs use the IE libraries to do HTML work, including importing and displaying web pages. Everyone with Windows has it. It seemed like an excellent choice at the time. But, as time goes on, more and more of those programs are having problems with various versions of Windows. I know of a spell checker that has stopped working only on Windows 8.1 but not on Windows 10, because the two use different versions of IE. I also know of display problems with long pages. So as users, if given an option, we should try the other one. This is not bashing. This is the state we are in, particularly with smaller companies that chose to use the IE libraries.

  2. I tried the other option (WebKit) and it didn’t work either. So as far as I can tell, Scrivener offers the option to import PDF files from web pages, but it doesn’t work with IE or WebKit.

  3. When a program crashes the entire system, there is a problem with it. I suspect that the problem Lorelle is having when importing using IE is due to IE, not the import programming in Scrivener. It simply should not crash the system.

This is a problem for many smaller software companies that relied on IE. It simply exists, like mosquitoes. They made a reasonable decision choosing the program that was on all systems. Microsoft made a reasonable decision choosing to abandon the problematic IE (except for security updates), design a new browser from scratch, and put their resources into that browser rather than keeping the old one updated with more and more tweaks.

Scrivener probably knows that their pdf import from a webpage doesn’t work, but won’t fix it and instead puts their limited resources into the new Scrivener 3.0. Again, a reasonable decision. Personally, I’d rather they stay frugal and keep the price of Scrivener down, rather than raise the price, and hire someone to do more bug fix patches.

Importing anything directly from the Web into Scrivener for Windows has been absolute rubbish for a very long time. If it ever works, it’s so random as to be useless as a general-purpose tool. My assumption is that this is because Scrivener (or the underlying Qt framework) relies on outdated ways of importing that the Internet has left behind (e.g., IE). So the hope is for v. 3 sometime next year, based on a new Qt and hopefully with other enhancements besides. Till then, as you say, I try something else (e.g., saving to PDF, or just copying and pasting) when I need something from an online source. If it’s something that could potentially be updated (and even if it’s not), of course I include the link to check up.

When will it be 2018? Are we there yet? :cry: :wink:

MSIE lives on as a programmable Windows ActiveX component even as it has ceased to be competitive as a daily internet browser. ScrivW should be able to rely on IE indefinitely, and of course invoking it should never crash an up-to-date Windows.

But importing a web page — with its scripts, its nested styles and its ever-evolving standards — is never going to be one of Scrivener’s core competencies. Moreover, PDF, with its fixed line breaks, is not an appealing format for working with an archived web page within Scrivener. I prefer to save the HTML locally, and open it in Scriv’s, i.e. Qt’s, own extremely powerful HTML viewer, which zooms and wraps, adheres to external style sheets, displays remote and local pages residing outside the project, and even runs scripts (try CKEditor!), all within Scriv’s editor pane.

I use Firefox to render a web page as a local file. Chrome and IE work similarly. They’re programs designed and maintained to perform this function. I also use the web page editor BlueGriffon to clear extraneous matter on the saved page, and an AutoHotkey script to inject a common style sheet, so that the saved HTML pages roughly adhere to my default style for Scrivener docs.
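
For anyone without AutoHotkey, the style-sheet injection step can be approximated in a few lines of Python. This is a rough sketch, not the script described above: the function name `inject_stylesheet` and the example style sheet location are assumptions made for illustration.

```python
import html
import re


def inject_stylesheet(page: str, css_href: str) -> str:
    """Return the HTML text `page` with a <link> to `css_href` added.

    `css_href` is whatever URL or relative path your common style sheet
    lives at (hypothetical here).
    """
    link = f'<link rel="stylesheet" href="{html.escape(css_href, quote=True)}">'

    # Preferred spot: just before </head>, after the page's own styles.
    match = re.search(r"</head\s*>", page, flags=re.IGNORECASE)
    if match:
        return page[:match.start()] + link + "\n" + page[match.start():]

    # No </head> found: put the link at the very top as a fallback.
    return link + "\n" + page
```

You would read the saved page, pass its text through `inject_stylesheet(text, "file:///C:/styles/scriv.css")` (a placeholder location), and write the result back over the saved file.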

Here’s an approach to get started, minus the third-party software and scripts:

Create a small plain-text or empty file with an HTML extension, drag it from Explorer into the Binder, and view it in Scrivener. There should be a file:/// URL at the base of the editor pane. Select and copy that URL, and navigate in Firefox, Chrome or IE to the page you’d like to save, and “Save as” or “Save Page As” or Ctrl-S. For the filename, paste in the URL you just copied, and choose between “Web Page, complete” and “Web Page, HTML Only”.

Now return to Scriv, reload the HTML doc, and see if you’re happy with the results. If not, find an HTML editor — BlueGriffon is free of charge and open source — and edit the doc using the same copied URL. I’d also recommend dragging the original web page URL into Document References, since it won’t otherwise be displayed with your local copy.
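
If an editor or script wants a plain filesystem path rather than the copied file:/// URL, the conversion can be sketched in Python with the standard library. The function name `file_url_to_path` is made up for this example, and the exact path format returned depends on the operating system the code runs on.

```python
from urllib.parse import urlparse
from urllib.request import url2pathname


def file_url_to_path(url: str) -> str:
    """Convert a file:/// URL into a local path for the current OS."""
    parsed = urlparse(url)
    if parsed.scheme != "file":
        raise ValueError(f"not a file URL: {url!r}")
    # url2pathname decodes %20-style escapes and, on Windows,
    # rewrites the drive letter and slashes into a native path.
    return url2pathname(parsed.path)
```

Paste the URL from the editor footer into `file_url_to_path(...)` and use the returned path wherever a program refuses the URL form.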

A more advanced method involves copying page selections out to BlueGriffon first, saving to another local file, viewing that file in your browser, and saving it from there with the copied URL filename and thus into your Scrivener project. Note the three browsers mentioned will save the images and other supporting files for the page into an nn_files subfolder within your project’s Docs folder. (BlueGriffon itself will not; Edge will not.) Editing to a local file before saving into Scrivener from your browser will reduce the number of support files saved into nn_files, limiting them to the graphics you retain, leaving out some unneeded styles and scripts that would be saved with the starter method.

Rgds — Jerome

Very interesting, Jerome. I find that if I simply save a page as “HTML only” from my browser and open that in Scrivener, it does display, ugly and incomplete, but with the text (which is generally what I’m interested in).

I tried to follow your method by creating a blank test .html file and dragging it into the binder. But then I don’t find a “file:/// URL” at the base of the editor pane or anywhere else, so I can’t go any farther. What am I missing?

Oops, try View > Layout > Footer View

Rgds - Jerome

Got footer view turned on. I find that some .html files show a URL in the footer and some don’t.

Well, I have to let it go for now. There’s actual work to do. I do find that I can simply save a Web page from the browser, and as long as I don’t need it to look, well, right or have much more than text, I can open it in Scrivener.

Footer view is set per editor pane rather than globally, and you also won’t see the URL when viewing the entry in Outline or Corkboard view. That’s all I can think of. Good luck with the project!