How to import a web page as a link only? Project size is too big.

During the past few months, I was importing web pages as complete web pages for offline viewing. But I soon discovered that after importing nearly 70 web pages into the research folder this way, the project folder in Windows File Explorer grew by more than 150MB. Project backup in Scrivener is now painstakingly slow. What is the solution to this? Is there a way to only have a link to the web site in Scrivener which, when clicked, takes me to the site in my external web browser? I know that I would need to be online to see the web page if it’s set up this way. What other solutions would work? By the way, it now seems like the latest beta version of Scrivener 3 for Windows (Release Candidate 2) no longer has an option to import web pages as a link only. Any thoughts? Thanks.

The command File > Import > Research Files as Shortcuts works splendidly to pare down the project and backup while enabling the user to retain research files locally and view them in Scriv’s editor pane.

But there’s a quirk users should know about, and which the developers should seriously consider addressing.

Scrivener represents imported web pages as MIME HTML files, extension .MHT. We can see this by clicking on the Open in External Editor button at the lower right of the editing pane when a research file is in view. Similarly, an MHT file on the local drive can be the target when we run Import Research Files as Shortcuts.

But the most popular browsers, including Chrome, Edge and Opera, save web archives locally with the default extension .MHTML. And Scrivener doesn’t recognize these for Import as Shortcuts. All we need do is rename an .MHTML file to .MHT, and Scrivener will work with it like a boss. But most users will not know to do that.

This is not quite a bug, but an anomaly well worth the fix. If it’s too late to repair the code for the production release, maybe just reword the error message on a failed MHTML import to advise the user to rename the file for a clean import as shortcut.

Cheers - Jerome

This option just opens a file-open dialog window. How do I do this? Do I place a website shortcut in a folder and then point the dialog window to the shortcut?

Have you looked into Scrivener’s Bookmarks feature? If you’re not familiar with it, the feature allows you to add an external link, and it will display web pages pretty well in the inspector as a result. If you need to open the link in a browser, then you can double-click on the bookmark in the inspector.

More details can be found in the manual, or in the beginning sections of the interactive tutorial project (available via the templates chooser, or the Help menu).

Thanks for letting me know about this. I did try it before, but like you said it displays the site right within the inspector. But, it’s displayed in a thin strip of space in the inspector which is pretty much useless in terms of its usability or readability. Is it possible to disable the rendering of the website in the inspector? Is there a solution to this without disabling the rendering so that I can see the website right within Scrivener and at the same time use it as a bookmark?

By the way, I tried to import the websites as links like @JJSLOTE suggested, but selecting multiple website shortcuts in the open-file dialog causes the dialog window to freeze. I had to click on the cancel button. From my understanding, the only way to do it would be to import each shortcut one at a time, which is not viable. So, instead I just dragged and dropped all the shortcuts directly into Scrivener. My project size is now only 3MB, down from 153MB! :smiley:

Looks like you’ve found a workable approach, ScrivTrex, but I’ll post my intended reply for completeness.

The MHT technique allows you to save a static local copy of a desired research page into your Scrivener binder without making it part of the project and slowing the backup. You’d save the page in your browser as type “Webpage, Single File” (extension .mhtml), rename it to extension .mht, and select it from Import Research Files as Shortcuts.
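
If you have a whole folder of saved pages, the rename step can be scripted instead of done by hand. Here is a minimal Python sketch, assuming all the archives sit in one folder; the function name and the `research_pages` folder are made up for illustration, and it skips any file whose .mht twin already exists rather than overwriting it.

```python
from pathlib import Path


def rename_mhtml_to_mht(folder):
    """Rename every *.mhtml file in `folder` to *.mht; return the new paths."""
    renamed = []
    for src in sorted(Path(folder).iterdir()):
        if not src.is_file() or src.suffix.lower() != ".mhtml":
            continue
        dst = src.with_suffix(".mht")
        if dst.exists():
            # Never overwrite an existing archive; leave the original alone.
            continue
        src.rename(dst)
        renamed.append(dst)
    return renamed


if __name__ == "__main__":
    # "research_pages" is a hypothetical folder name; point it at your own.
    folder = Path("research_pages")
    if folder.is_dir():
        for path in rename_mhtml_to_mht(folder):
            print("renamed ->", path)
```

After it runs, the renamed files should be selectable from Import Research Files as Shortcuts as described above.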

If you’d prefer to load and view the live web site, there are other good options. Document Bookmarks, as rdale suggests, is the native cross-platform approach. You can drag the web site’s icon into a Windows folder, creating a shortcut there, and then make that shortcut the target of your Import Research Files as Shortcuts command - but that’ll just show a link in your editor pane; the site itself would be displayed in your browser.

Or, if you’d like to push the envelope a little, a meta refresh command will display the desired site right in your Scrivener editing pane. You can save a document like the one below onto your drive with extension .html and then drag it into the Research section of your binder. It in turn will load the desired page.

[code]<meta http-equiv='refresh' content='0;URL=(ReplaceWithDestURL)'>[/code]
Put the address of the desired page in place of (ReplaceWithDestURL). This is actually what I do, but with the help of an AutoHotkey script. May be too much of a nuisance otherwise.

Cheers - Jerome

You need the screen space to widen your inspector to make it a useful space, but v3 lets you widen the inspector suitably if you do have that space; there are just minimums to the size of each editor, so laptops don’t work well.

Ways to deal with the lack of space include using the keyboard shortcut to zoom out (reducing the rendered font size) or dragging the URL into the header of one of your editors/copyholders.

If you want to prevent it from rendering the website automatically, then there’s a setting under Scrivener->Preferences->Behaviors->Navigation to turn that feature on/off. If it’s off, you get a button in the inspector that will make Scrivener load the web page there.

One other thought: Do the websites you want to reference offline work with Safari’s “Reader view”? If so, you can print that view to PDF, and then import those PDFs as aliases to the binder, or link to them in the Bookmarks section.

Perhaps the original poster meant something I simply don’t understand, but I do a newsletter with lots of links to websites. From the website you want the link to, copy the URL. Then add a link from the pulldown on the paper clip in the menu bar. You can just paste the full link, or write a description and highlight it and then add the link. Whatever URL is in the clipboard is used as the link. Simple and quick.

I also never insert images into Scrivener. Instead have an images folder in the project folder with all the images in it and attach the “image linked to a file…”.

I tried the MHT technique on several web pages, but some pages don’t display properly; for example, the site menu bar on one website partially overlapped the body text underneath. And some sites I import this way have no styling applied whatsoever; the whole site in the editor window is just plain text with links.

This method seems to be working the best, and, unlike the MHT method, I don’t see any flaws in the web page rendering whatsoever in Scrivener’s editor window for any imported web page. I’ve used AutoHotkey before for another purpose, but have not dabbled with its scripting features. One concern though is that I need to do this repeatedly for all the 70 web pages I have in my research folder. Would you mind sending me a copy of your script via PM here, or posting it in this thread? Thanks!

Thanks for these suggestions, but these techniques seemed very cumbersome when I tried them. I think I’ll just stick with JJSlote’s meta refresh html method of importing web pages. Thanks for your input. Much appreciated.

Not quite what I was looking for, but thanks for your input and it’s very useful info nonetheless. This might come in handy later on. Thanks.

Here’s a rough version of the algorithm I use, just enough to get you started. It assumes you’ve put the destination URL on the clipboard. Of course you’ll want to change "h:\" to a path of your choice.

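[code]
MakeDraggableMetaRefresh:
    DestURL := clipboard ; the destination URL, copied beforehand
    RedirectingDoc := "h:\DemoRedirect.html"
    RedirProto := "<html><head><meta http-equiv='refresh' content='0; URL=(ReplaceWithDestURL)'></head></html>"
    ; Swap the placeholder for the real URL
    StringReplace, RedirText, RedirProto, (ReplaceWithDestURL), %DestURL%
    ; Start from a clean file, since FileAppend would otherwise add to an old one
    FileDelete, %RedirectingDoc%
    FileAppend, %RedirText%, %RedirectingDoc%
    ; Open the folder so the new file can be dragged into the binder
    Run, "h:\"
Return
[/code]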
For your existing 70 links, you’ll probably want to loop through a list to obtain each DestURL, and generate a unique filename for each RedirectingDoc, from the loop counter or the URL or both, so you can drag all 70 into the binder at once.
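
For anyone who would rather not write the loop in AutoHotkey, here is a rough Python sketch of the same idea. It is an illustration under assumptions, not Jerome's script: the function names, the `urls.txt` input file (one URL per line), and the `redirects` output folder are all hypothetical. Each file name combines the loop counter with a slug made from the URL, so the names stay unique.

```python
import html
import re
from pathlib import Path
from urllib.parse import urlparse

TEMPLATE = (
    "<html><head>"
    "<meta http-equiv='refresh' content='0; URL={url}'>"
    "</head></html>\n"
)


def redirect_filename(index, url):
    """Build a unique, Windows-safe name from the loop counter and the URL."""
    parsed = urlparse(url)
    slug = re.sub(r"[^A-Za-z0-9]+", "-", parsed.netloc + parsed.path).strip("-")
    return f"{index:03d}-{slug[:60] or 'page'}.html"


def write_redirects(urls, out_dir):
    """Write one meta refresh document per URL and return the created paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for index, url in enumerate(urls, start=1):
        path = out_dir / redirect_filename(index, url)
        # Escape &, <, > and quotes so the URL cannot break out of the attribute.
        escaped = html.escape(url, quote=True)
        path.write_text(TEMPLATE.format(url=escaped), encoding="utf-8")
        created.append(path)
    return created


if __name__ == "__main__":
    # "urls.txt" and "redirects" are hypothetical names; adjust to taste.
    url_file = Path("urls.txt")
    if url_file.is_file():
        lines = url_file.read_text(encoding="utf-8").splitlines()
        urls = [line.strip() for line in lines if line.strip()]
        for path in write_redirects(urls, "redirects"):
            print(path)
```

The generated .html files can then be selected together and dragged into the Research section of the binder.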

Good Luck - Jerome

Thanks for the script, and I figured out what your script is doing. But, I did some research on google yesterday before you posted your reply and managed to come up with a working script of my own. I’ll follow your suggestion about looping through the 70 web pages in the research folder and figure it out. Thanks.

By the way, instead of just dragging and dropping the html files resulting from the meta refresh method into Scrivener directly, I imported them as Research File Shortcuts via File > Import > Research Files as Shortcuts. This way, I can still reduce the project size. But, this probably only offers a negligible gain in paring down the project size.

Here is my own script just in case someone could use it:

[code]
; This script extracts the URL of the current tab in the web browser and writes it to an HTML file, as described in the meta refresh method

Menu, Tray, Icon, % A_WinDir "\system32\netshell.dll", 86 ; Shows a world icon in the system tray

ModernBrowsers := "ApplicationFrameWindow,Chrome_WidgetWin_0,Chrome_WidgetWin_1,Maxthon3Cls_MainFrm,MozillaWindowClass,Slimjet_WidgetWin_1"
LegacyBrowsers := "IEFrame,OperaWindowClass"

^+!u:: ; Ctrl+Shift+Alt+U
    nTime := A_TickCount
    sURL := GetActiveBrowserURL()
    WinGetTitle, Title, A
    Title := StrReplace(Title, " - Google Chrome") ; strips off the end part of the title
    Title := RegExReplace(Title, "[\\/:*?""<>|]", "_") ; replaces characters Windows does not allow in file names
    ;MsgBox, % filename
    WinGetClass, sClass, A
    If (sURL != "")
    {
        ; Writes the meta refresh document that redirects to the page's URL
        FileAppend, <meta http-equiv='refresh' content='0; URL=%sURL%'>`n, %A_Desktop%\%Title%.html
        If ErrorLevel != 0
            MsgBox, Could not write to %A_Desktop%\%Title%.html
    }
    Else If sClass In % ModernBrowsers "," LegacyBrowsers
        MsgBox, % "The URL couldn't be determined (" sClass ")"
    Else
        MsgBox, % "Not a browser or browser not supported (" sClass ")"
Return

GetActiveBrowserURL() {
    global ModernBrowsers, LegacyBrowsers
    WinGetClass, sClass, A
    If sClass In % ModernBrowsers
        Return GetBrowserURL_ACC(sClass)
    Else If sClass In % LegacyBrowsers
        Return GetBrowserURL_DDE(sClass) ; empty string if DDE not supported (or not a browser)
    Else
        Return ""
}

; "GetBrowserURL_DDE" adapted from DDE code by Sean, (AHK_L version by maraskan_user)
; Found at http://autohotkey.com/board/topic/17633-/?p=434518

GetBrowserURL_DDE(sClass) {
    WinGet, sServer, ProcessName, % "ahk_class " sClass
    StringTrimRight, sServer, sServer, 4
    iCodePage := A_IsUnicode ? 0x04B0 : 0x03EC ; 0x04B0 = CP_WINUNICODE, 0x03EC = CP_WINANSI
    DllCall("DdeInitialize", "UPtrP", idInst, "Uint", 0, "Uint", 0, "Uint", 0)
    hServer := DllCall("DdeCreateStringHandle", "UPtr", idInst, "Str", sServer, "int", iCodePage)
    hTopic := DllCall("DdeCreateStringHandle", "UPtr", idInst, "Str", "WWW_GetWindowInfo", "int", iCodePage)
    hItem := DllCall("DdeCreateStringHandle", "UPtr", idInst, "Str", "0xFFFFFFFF", "int", iCodePage)
    hConv := DllCall("DdeConnect", "UPtr", idInst, "UPtr", hServer, "UPtr", hTopic, "Uint", 0)
    hData := DllCall("DdeClientTransaction", "Uint", 0, "Uint", 0, "UPtr", hConv, "UPtr", hItem, "UInt", 1, "Uint", 0x20B0, "Uint", 10000, "UPtrP", nResult) ; 0x20B0 = XTYP_REQUEST, 10000 = 10s timeout
    sData := DllCall("DdeAccessData", "Uint", hData, "Uint", 0, "Str")
    DllCall("DdeFreeStringHandle", "UPtr", idInst, "UPtr", hServer)
    DllCall("DdeFreeStringHandle", "UPtr", idInst, "UPtr", hTopic)
    DllCall("DdeFreeStringHandle", "UPtr", idInst, "UPtr", hItem)
    DllCall("DdeUnaccessData", "UPtr", hData)
    DllCall("DdeFreeDataHandle", "UPtr", hData)
    DllCall("DdeDisconnect", "UPtr", hConv)
    DllCall("DdeUninitialize", "UPtr", idInst)
    csvWindowInfo := StrGet(&sData, "CP0")
    StringSplit, sWindowInfo, csvWindowInfo, `" ;"; comment to avoid a syntax highlighting issue in autohotkey.com/boards
    Return sWindowInfo2
}

GetBrowserURL_ACC(sClass) {
    global nWindow, accAddressBar
    If (nWindow != WinExist("ahk_class " sClass)) ; reuses accAddressBar if it's the same window
    {
        nWindow := WinExist("ahk_class " sClass)
        accAddressBar := GetAddressBar(Acc_ObjectFromWindow(nWindow))
    }
    Try sURL := accAddressBar.accValue(0)
    If (sURL == "") {
        WinGet, nWindows, List, % "ahk_class " sClass ; In case of a nested browser window as in the old CoolNovo (TO DO: check if still needed)
        If (nWindows > 1) {
            accAddressBar := GetAddressBar(Acc_ObjectFromWindow(nWindows2))
            Try sURL := accAddressBar.accValue(0)
        }
    }
    If ((sURL != "") and (SubStr(sURL, 1, 4) != "http")) ; Modern browsers omit "http://"
        sURL := "http://" sURL
    If (sURL == "")
        nWindow := -1 ; Don't remember the window if there is no URL
    Return sURL
}

; "GetAddressBar" based in code by uname
; Found at http://autohotkey.com/board/topic/103178-/?p=637687

GetAddressBar(accObj) {
    Try If ((accObj.accRole(0) == 42) and IsURL(accObj.accValue(0)))
        Return accObj
    Try If ((accObj.accRole(0) == 42) and IsURL("http://" accObj.accValue(0))) ; Modern browsers omit "http://"
        Return accObj
    For nChild, accChild in Acc_Children(accObj)
        If IsObject(accAddressBar := GetAddressBar(accChild))
            Return accAddressBar
}

IsURL(sURL) {
    Return RegExMatch(sURL, "^(?<Protocol>https?|ftp)://(?<Domain>(?:[\w-]+\.)+\w\w+)(?::(?<Port>\d+))?/?(?<Path>(?:[^:/?# ]*/?)+)(?:\?(?<Query>[^#]+)?)?(?:\#(?<Hash>.+)?)?$")
}

; The code below is part of the Acc.ahk Standard Library by Sean (updated by jethrow)
; Found at http://autohotkey.com/board/topic/77303-/?p=491516

Acc_Init()
{
    static h
    If Not h
        h:=DllCall("LoadLibrary","Str","oleacc","Ptr")
}
Acc_ObjectFromWindow(hWnd, idObject = 0)
{
    Acc_Init()
    If DllCall("oleacc\AccessibleObjectFromWindow", "Ptr", hWnd, "UInt", idObject&=0xFFFFFFFF, "Ptr", -VarSetCapacity(IID,16)+NumPut(idObject==0xFFFFFFF0?0x46000000000000C0:0x719B3800AA000C81,NumPut(idObject==0xFFFFFFF0?0x0000000000020400:0x11CF3C3D618736E0,IID,"Int64"),"Int64"), "Ptr*", pacc)=0
        Return ComObjEnwrap(9,pacc,1)
}
Acc_Query(Acc) {
    Try Return ComObj(9, ComObjQuery(Acc,"{618736e0-3c3d-11cf-810c-00aa00389b71}"), 1)
}
Acc_Children(Acc) {
    If ComObjType(Acc,"Name") != "IAccessible"
        ErrorLevel := "Invalid IAccessible Object"
    Else {
        Acc_Init(), cChildren:=Acc.accChildCount, Children:=[]
        If DllCall("oleacc\AccessibleChildren", "Ptr",ComObjValue(Acc), "Int",0, "Int",cChildren, "Ptr",VarSetCapacity(varChildren,cChildren*(8+2*A_PtrSize),0)*0+&varChildren, "Int*",cChildren)=0 {
            Loop %cChildren%
                i:=(A_Index-1)*(A_PtrSize*2+8)+8, child:=NumGet(varChildren,i), Children.Insert(NumGet(varChildren,i-8)=9?Acc_Query(child):child), NumGet(varChildren,i-8)=9?ObjRelease(child):
            Return Children.MaxIndex()?Children:
        } Else
            ErrorLevel := "AccessibleChildren DllCall Failed"
    }
}
[/code]

This is an update. The meta refresh command method works quite well. But, unfortunately, the rendered web pages show ads, unlike my external Chrome browser, which has the uBlock Origin ad blocker extension installed. Is there a solution to view web pages in Scrivener with ads blocked? Until I find a solution to this, I’ve decided to just import the actual .url files (website shortcuts) and view the web pages externally in Chrome via a link in the editor.
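
In case it helps anyone going the same route: a Windows .url shortcut is a small text file with an `[InternetShortcut]` section and a `URL=` line, so the shortcuts can be generated from a list instead of dragged out of the browser one by one. Below is a minimal Python sketch under that assumption; the function names are hypothetical, and the title is only used to build the file name.

```python
import re
from pathlib import Path


def safe_name(title):
    """Replace characters that Windows does not allow in file names."""
    return re.sub(r'[\\/:*?"<>|]', "_", title).strip() or "untitled"


def write_url_shortcut(title, url, out_dir):
    """Create '<title>.url' in out_dir pointing at url; return its path."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / (safe_name(title) + ".url")
    # newline="\r\n" makes Python write Windows-style line endings.
    with open(path, "w", encoding="utf-8", newline="\r\n") as handle:
        handle.write("[InternetShortcut]\n")
        handle.write(f"URL={url}\n")
    return path
```

Calling `write_url_shortcut("Some page title", "https://example.com/", "shortcuts")` would create `shortcuts/Some page title.url`; the resulting files can then be dragged into the binder like any other website shortcut.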