[LH3291] [LH4662] Links with special characters won't import (Beta 2.9.0.11, Win, German)

philr · December 11, 2018, 8:10am

Hi everybody,

Short intro: I’m a new user and a software developer, and I’ve already found a lot of great uses for Scrivener.

In the latest Beta, adding a website to the Research folder fails when the URL has special characters like Umlauts.

Repro steps (messages and menu items translated from their German texts):

Right click “Research”
“Add” -> “Website”
URL: for instance, try freimaurer-wiki.de/index.php/Brüderlichkeit (i.e. the umlaut is not URL-encoded)
Enter any title, click OK
Error message: “URL import failed”

Keep up the good work!

Phil

tiho_d · December 11, 2018, 12:39pm

Thanks for reporting this issue, Phil. I have filed it in our bug tracking system. Unfortunately, I cannot propose a workaround at the moment.

philr · December 11, 2018, 12:41pm

Thanks, @tiho_d. My current workaround is copy-pasting into a new document; for my purposes, this works temporarily.

tiho_d · December 11, 2018, 1:36pm

You can also try saving the URL into an MHT via the browser and drop it inside the Binder Research folder.

Pekoo · January 28, 2020, 2:54pm

The bug still exists. (Version 2.9.0.36 Beta (818046) 64-bit - 28 Jan 2020)

Interesting: if I enter, for example, the link de.wikipedia.org/wiki/Jōmon-Zeit, the page de.wikipedia.org/wiki/J appears. Everything after the “J” is ignored.
But if I look at the link to the imported website in the status bar, the correct link de.wikipedia.org/wiki/Jōmon-Zeit appears there in both cases and if I click on the link, the correct website appears in the browser.
Of course I could convert the corresponding page into a PDF, but then I would not be able to search the text anymore. Therefore I would be very grateful for a fix of this bug.

Pekoo · January 29, 2020, 7:20am

Small addendum: The following link is imported without problems, although it decodes the URL in the same way as the link above:
https://de.wikipedia.org/w/index.php?title=J%C5%8Dmon-Zeit&oldid=195264490
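For anyone who wants to try the same trick on other links, the percent-encoded form can be produced by hand before pasting it into the import dialog. Below is a minimal sketch in Python, assuming the path should be encoded as UTF-8 (which matches the %C5%8D in the link above). The helper name `encode_path` is made up for this example and has nothing to do with Scrivener itself.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def encode_path(url: str) -> str:
    """Percent-encode non-ASCII characters in the path part of a URL (UTF-8)."""
    parts = urlsplit(url)
    # quote() leaves "/" and unreserved ASCII characters alone and
    # percent-encodes everything else as UTF-8 bytes.
    # "%" is kept safe so an already-encoded path is not encoded twice.
    safe_path = quote(parts.path, safe="/%")
    return urlunsplit(
        (parts.scheme, parts.netloc, safe_path, parts.query, parts.fragment)
    )


print(encode_path("https://de.wikipedia.org/wiki/Jōmon-Zeit"))
# https://de.wikipedia.org/wiki/J%C5%8Dmon-Zeit
```

Whether the encoded form actually imports may depend on the beta version, so treat this as something to try, not a guaranteed fix.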

JJSlote · January 29, 2020, 12:49pm

We can also import a web page by dragging the left icon from the browser’s URL bar – typically a padlock for a secure site – into the Research section in Scrivener’s binder. The same bug shows up. Scrivener’s footer bar displays the intended site link, but the page actually imported and displayed reflects a URL truncated at the special character.

Rgds - Jerome

JJSlote · January 29, 2020, 4:59pm

Likewise, an addendum. The link with an umlaut, originally reported by PhilR to fail on import, does import successfully in v. 36.

freimaurer-wiki.de/index.php/Br … erlichkeit

Whereas the link reported by Pekoo that truncates on import has an overscored o. A transliteration of the Japanese? Of course we’ll experience the same truncation effect on English-language URLs that include the character. The link below will drag or paste in as Wikimedia Commons Category:J.

commons.wikimedia.org/wiki/Cate … nal_Museum

So an umlaut flies while an overscore flails as of this round.

Cheers - Jerome
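One possible reading of the umlaut-versus-overscore difference, offered purely as a guess since nothing in this thread says how Scrivener handles the URL internally: ü exists in the Latin-1 (ISO 8859-1) character set, while ō does not. The short Python sketch below only demonstrates that difference between the two characters. The function name `fits_latin1` is invented for the example.

```python
def fits_latin1(text: str) -> bool:
    """Return True if every character of text can be encoded as Latin-1."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


# Words taken from the two URLs discussed in this thread.
for word in ("Brüderlichkeit", "Jōmon-Zeit"):
    print(word, fits_latin1(word))
# Brüderlichkeit True
# Jōmon-Zeit False
```

If the import code were narrowing the URL to an 8-bit character set at some point, this would fit the behaviour seen in v. 36, but that is an assumption, not something confirmed by the developers.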

Pekoo · August 11, 2020, 6:52am

Websites with special characters (tested with umlauts) are still not imported. Is this bug even being worked on? It should just be a character-encoding problem. (Beta 2.9.9.8)

krastev · August 11, 2020, 7:10pm

It’s probably something on your machine. I just tested it and it works perfectly for me. Encoded or not encoded it doesn’t matter, it imports the page without a problem.

Pekoo · August 13, 2020, 4:54am

When I visit the website de.wikipedia.org/wiki/K%C3%A4se
with umlauts in the address, I only get an error message (see picture). Within a web page, special characters do not cause any problems for me either - only with URLs.

error scrivener.jpg

tiho_d · August 13, 2020, 8:38am

@Pekoo: We have adjusted the Web Page import function. Give it another test with the next update.

krastev · August 13, 2020, 4:20pm

As I said, I tested it with umlauts in the address (both URL-encoded and not URL-encoded), and it works for me.

tiho_d · August 13, 2020, 5:21pm

@Krastev: Pekoo’s example did not work, and I could reproduce the error. Now all example URLs from this thread import successfully.

rwfranz · August 14, 2020, 2:40am

I managed to import the link given, but I noticed the progress bar froze at about 20%, leading me to believe Scriv had frozen. I waited about a minute, and it did complete the import. It was just slow.

Pekoo · August 15, 2020, 5:08am

At least since 2.9.9.9. there has been significant progress: Umlauts are accepted in the address. Unfortunately the import still aborts as soon as you use an umlaut in the title.

tiho_d · August 15, 2020, 6:59am

Thanks, Pekoo! Umlauts in the title have been fixed too, and the fix will be available in the next update. Avoiding Unicode characters in the Title during import is the workaround at the moment. Once imported, you can change the Binder name to anything you like.
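For anyone importing many pages in the meantime, the title workaround can be scripted outside Scrivener. Here is a minimal Python sketch, assuming a plain-ASCII title is acceptable as a temporary name. The helper `ascii_title` is hypothetical and not part of any Scrivener feature.

```python
import unicodedata


def ascii_title(title: str) -> str:
    """Reduce a title to plain ASCII for use as a temporary import name."""
    # NFKD splits accented letters into base letter + combining mark
    # (for example, "ä" becomes "a" followed by a combining diaeresis).
    decomposed = unicodedata.normalize("NFKD", title)
    # Dropping everything outside ASCII removes the combining marks.
    # Characters with no ASCII base letter (such as "ß") are dropped entirely.
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(ascii_title("Käse"))        # Kase
print(ascii_title("Jōmon-Zeit"))  # Jomon-Zeit
```

The proper name can then be restored in the Binder after the import.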

Pekoo · August 16, 2020, 6:17am

Thank you for this information.