Report: Copy-Pasting Hashtag Text Results in Messy Formatting

This has bothered me for a long time, ever since I started playing with Scrivener. I noticed that if I copy some hashtags from webpages and paste them into Scrivener, the result has some weird formatting. I tried to strip all formatting, but Scrivener could not do it.

I’ve searched the forum tonight and found some clues but again nothing worked.

So I decided to find this mysterious “WordWrapper” only to discover it’s for Mac, not Windows. So I tried the next best thing, TextEdit, and while it’s no longer supported, at least I could download it and try it. It worked!

What TextEdit revealed was WHY the paste comes out messy.

What looked like double, triple, or even wider line spacing was not real spacing. It was something hidden in the text.

I figured out what it was. TextEdit revealed it. It was in front of and after every word in the text.

After I deleted each and every hidden character, the text became normal, which is the result that “removing formatting” is supposed to give.

Now I know what it was.

Here’s an example:

Pasting into Scrivener:

No hidden character… YET it is THERE…

Another thing: if you put the cursor in front of #mystery, as you can see above, you’ll see how large the cursor seems to be. Move the cursor along the string, before or after each word, and you’ll notice how the cursor enlarges itself, responding to something hidden there… It seems the large cursor is what makes the lines look double-spaced or more…

Pasting the text into TextEdit reveals the culprit:

The “?” is what had been resisting all known methods of getting rid of it.

Thanks to the people on this wonderful site, I was able to figure out that TextEdit actually reveals this hidden mess.

Now that you know what it is, you can delete the “?” characters one by one. Then the text reverts to normal formatting. (The Search/Replace function in TextEdit doesn’t seem to work, no idea why.)

Weird, no?

Let’s paste several to see what it looks like.

If I make multiple pastes in Scrivener without cleaning up with TextEdit, this is the result…

It looks like double lines, but it’s not. It’s the hidden character.

Now that you know what it is, you can finally stop fighting the text and simply remove the “?” and it’ll all be back to normal.

Let’s try multiple pastes, then remove the hidden character:

Copy the result and paste it into Scrivener:

Finally! Normal text! That’s what I wanted to get by pasting directly from the web, but of course I couldn’t. Maybe no one realises what this hidden character is. Or maybe some know but have no idea how important this knowledge is for getting rid of the hidden formatting so we can get on with writing, etc.

I have been wondering whether anyone else has realised this.

Cheers.

One quick thing that comes to mind: try the Edit ▸ Text Tidying ▸ Zap Gremlins menu command after you paste. If these are indeed weird characters then that should get rid of them.

As for what they are though, you might be able to copy and paste one of them into the Windows Character Map tool’s search field to find out what it is. I’m not entirely sure it works that way though.
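As an alternative way to identify the characters, a few lines of Python can report the code point of anything invisible in a pasted string. This is only a minimal sketch: the function name `describe_hidden` and the sample string are made up for illustration, and the zero-width space (U+200B) and unit separator (U+001F) in the sample are assumptions, since the actual character in the pasted text has not been identified in this thread.

```python
import unicodedata

def describe_hidden(text):
    """Return (index, code point, Unicode name, category) for each suspicious character."""
    report = []
    for i, ch in enumerate(text):
        cat = unicodedata.category(ch)
        # Cc = control, Cf = invisible format character (e.g. zero-width space),
        # Zl/Zp = Unicode line and paragraph separators.
        # Ordinary tab, line feed and carriage return are left alone.
        if cat in {"Cc", "Cf", "Zl", "Zp"} and ch not in "\n\r\t":
            name = unicodedata.name(ch, "<unnamed>")  # control characters have no name
            report.append((i, f"U+{ord(ch):04X}", name, cat))
    return report

# Hypothetical sample: the hidden characters here are assumed, not taken from the real paste.
sample = "\u200b#mystery\u200b \x1f#crime"
for row in describe_hidden(sample):
    print(row)
```

Pasting the real text in place of `sample` would show which code point is actually involved, which in turn tells you what to search for and replace.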


Wow, nice. I don’t know if I have this Zap Gremlins…

Well, I’m sorry, I don’t have the Zap Gremlins feature in my Scrivener 3 Windows version, as you can see in my screen capture of the menu. Was there supposed to be one there?

I checked the Character Map, the only result that showed me was some kind of empty box…


Very odd, indeed.

Maybe when Scrivener gets updated again, Zap Gremlins will show up, hopefully.

I appreciate your solution, thank you.

I wonder:

#sciencefiction #mystery #crime #drama #politic #sciencefiction #mystery #crime #drama #politic #sciencefiction #mystery #crime #drama #politic #sciencefiction #mystery #crime #drama #politic

Does it show here?

Yes, I tested it. I pasted it here and saved it. Then I copied the text, pasted it into Scrivener, and the gremlins are there. Figures.

Guess it’s TextEdit for seeing and deleting whatever needs deleting.

Hmm, yeah it looks like the information to add that feature was added to their list a few years ago and nothing has happened since then. Must have been forgotten about!

Using another text editor that had a similar feature is what I had to do before Scrivener had this command, so that works fine!


What is that text editor you had to use? I would be interested in finding it and downloading it. TextEdit doesn’t seem to have that feature.

I’d better make a new post asking the developers to bring the Zap Gremlins feature soon. Thanks for explaining how it was forgotten, eh.

Done. I asked. We’ll see if Zap Gremlins will be back, soonish, I guess. Cheers.

Unfortunately the text editor I used back then was TextWrangler, which is a free Mac-only editor that had this same exact command as a matter of fact. It’s much more of a problem on the Mac, because for whatever reason the text engine in the system (which Scrivener uses) inserts junk characters all of the time as some kind of bug. So if you are doing work where that can cause problems and becomes noticeable, you have to clean them out of text regularly. On Windows it’s quite rare to see them, which is part of why the feature has flown under the radar for so long.

Does TextEdit do regular expressions? If so, search the web for “strip ascii control characters” and you’ll come across a lot of suggestions for how to do that with regex. If it doesn’t, try Notepad++, which is a really good free text editor for Windows. In fact, most of those suggestions will show how to do it in Notepad++!
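For reference, the kind of regex those suggestions describe can be sketched in Python. This is a rough illustration under assumptions, not Scrivener’s actual Zap Gremlins command: the name `zap_gremlins` is borrowed only as a label, and the second pattern (zero-width space, word joiner, byte order mark) is a guess at invisible characters a site might insert, since the thread does not identify the real one.

```python
import re

# ASCII control characters, excluding tab (\x09), line feed (\x0a)
# and carriage return (\x0d), plus DEL (\x7f).
CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")

# Assumption: a few invisible Unicode format characters that are not ASCII
# control characters but behave similarly when pasted.
INVISIBLE_FORMAT = re.compile(r"[\u200b\u2060\ufeff]")

def zap_gremlins(text):
    """Remove control and selected invisible characters, keeping normal whitespace."""
    text = CONTROL_CHARS.sub("", text)
    return INVISIBLE_FORMAT.sub("", text)

# Hypothetical pasted text with hidden characters around each hashtag.
dirty = "\x1f#sciencefiction\x1f \u200b#mystery\u200b \x1f#crime\x7f"
print(zap_gremlins(dirty))
```

The first character class is the part that matches the usual “strip ASCII control characters” advice. In an editor with regex search, the same class would go in the find field with an empty replacement, assuming the editor supports `\x` hex escapes.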


Ahh, I use Notepad++, but it never even showed the gremlins when I pasted into it! I wanted to try TextWrangler, but, like you said, it is for Mac, so I tried TextEdit and it showed me the gremlins so easily. That’s how I knew for sure what it was that had puzzled me for so long.

I’ll try the regular expression… YES! It worked perfectly. TextEdit is a keeper, although there’s no option for dark mode. I ticked the RegEx option and clicked “Replace”, and it easily picked out the offending gremlins one by one, so awesome to see. Then I copied the result, pasted it back into Scrivener, and awesome, no more weirdness. It did the job so cleanly.

Ahhh, the reason Scrivener takes in junk characters is the Rich Text formatting, something to do with HTML, I read somewhere here. That’s OK now that we know we can clean up the junk easily. No more putting up with the junk, eheheh.

This is awesome, thank you for your clues, it really is helpful. Cheers.


Well one important thing to note is that you do not have to see them in order to fix them. You may even be able to run that regex in Scrivener, if it supports character code addresses like that.

“Ahhh, the reason Scrivener takes in junk characters is the Rich Text formatting, something to do with HTML…”

Anything will take in junk characters. They are characters, actual characters, just like the ones you are reading now. That at one point in time they were used to control printers and other devices (largely) and are no longer used much today doesn’t mean they shouldn’t paste when copied.

As for whether they would come from web pages specifically, or commonly as a thing between pages and word processors—I’ve never heard of that before and wouldn’t think it would ordinarily be the case, unless the browser you are using has bugs. There is nothing in HTML that would make use of these old legacy control characters. And they wouldn’t spring out of nowhere (again, bugs aside).

If anything I would contact the makers of this website and inform them of the problem. If pasting content out of the site into any text editor results in control characters in the text, they should probably fix that. I am aware of no reason to be using those in a web page in 2021.

Sometimes I write posts on MeWe, which is the site I noticed always has this problem. I would write posts, then copy them and paste them into Scrivener to save for later, and I would notice this strange triple-spaced appearance. I thought something was off, so I tried setting single line spacing, and it didn’t work. I tried to figure out how to clean it up, but nothing worked, so I simply retyped the text directly underneath in Scrivener, doubling my workload. This went on for a long time.

I recently found out here what it is (junk characters, aka gremlins) and how to solve it, and that is an awesome solution. I would still like the Zap Gremlins feature put back in Scrivener, as the Mac version seems to have it.

Yeah, MeWe’s posts are the problem.

Wait…

Copying text from various sites:

Facebook? No problem.
Minds? No problem.
MeWe? Yep, problem.
Magabook? No problem.
Canund? Still dead site… no idea what happened there.
Xephula? No problem.
Flote? No problem.
Roxycast? No problem.
WeGo? No problem.
Spreely? No problem.
Gab? No problem.
Gettr? No problem.
Malikmobile? No problem.
Vevoiz? No problem.
Vk? No problem.

So, it’s MeWe that always gives me problems whenever I copy posts from there into Scrivener. Makes sense.

I didn’t realise until now that it was always MeWe’s posts. I have no idea why. Maybe the MeWe developers have no idea about this, or maybe they don’t care; I do not know.

So that’s about all I know about this issue.

Depending on how MeWe stores/gets the content it displays, it may be some back-end process (or even source, if they are reposting content) that it comes from and they’re just simply passing it through.

It also occurs to me it might be a deliberate mechanism for them to be able to detect content that came from them and has been reposted elsewhere. Since control characters mostly pass invisibly through various formats these days as Amber mentions, sticking in unique control characters would be a cheap way of watermarking text.


Ahhhhhh… makes sense, thanks for the explanation.

I do not have “Zap Gremlins” on mine (Windows, Scrivener 3).