Best way to strip all kinds of formatting from a text

Pieces of text that are copied from an internet site are often full of formatting codes, which are not always visible, nor easy to remove.

In many cases “Documents > Convert formatting to default text style” manages to change font formatting etc., but doesn’t remove for instance certain tabs (or whatever they may be) at the beginning of lines, which don’t show up in the ruler yet seem somehow to be there.

Is there a more efficient way to remove similar formatting codes?

How about TextSoap?

Or run the text through TextWrangler/BBEdit (Detab, Zap Gremlins).


I know TextSoap of course, but I don’t own it. And rather than spending 40 dollars on yet another application, I would prefer to solve the problem with (and within) Scrivener itself, if possible.

I do that sort of cleanup with various UNIX tools (vi, awk, perl, etc.) in a Terminal session before pasting the text into Scrivener.

The specific case you mention (tabs at the start of paragraphs) might be dealt with by using regular expressions in the Find string. The Project Search option (Edit > Find > Project Search or Ctrl-Alt-F) could be what you’re after. You’d need to be careful with the regex you use, so practice on a sample first.
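For anyone who wants to try the regex idea outside Scrivener first, here is a minimal Python sketch of the same approach. The list of characters it strips (tab, space, non-breaking space, em space, zero-width space) is only my guess at what those invisible "tabs" might be, and the function name is made up for illustration; adjust both after looking at your own text.

```python
import re

# Runs of tabs, spaces, and a few invisible look-alikes at the start of a line.
# The character list is an assumption: extend it once you know what your
# pasted text really contains.
LEADING_JUNK = re.compile(r"^[ \t\u00a0\u2003\u200b]+", flags=re.MULTILINE)


def strip_leading_whitespace(text: str) -> str:
    """Remove leading whitespace-like characters from every line."""
    return LEADING_JUNK.sub("", text)


if __name__ == "__main__":
    sample = "\tFirst paragraph.\n\u00a0\u00a0Second paragraph.\nThird."
    print(strip_leading_whitespace(sample))
```

The pattern is anchored with `^` in multi-line mode so that only the start of each line is touched; spaces and tabs inside a paragraph are left alone.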

What happens if you copy the text into a TextEdit file (in plain-text mode)?

Does that strip the tabs, etc? (Check the prefs in TextEdit to tweak HTML / RTF commands.)

If it does, re-copy from TextEdit and paste into Scrivener.

And what happens if you use “paste and match style” in Scrivener?

Do you copy the text into Scrivener or import it? You could do this: copy and paste the text into MS Word, select all, click “clear formatting” in the style palette, and save. Then import the whole thing into Scrivener. I have learned never to copy and paste anything directly into Scrivener. (Well, actually I do, but only into my research files that won’t make it into the body of the MS.)

Word inserts its own formatting if you’re not very careful. Using vi or emacs would be a better choice as neither of them adds anything but each will expose exactly what is in the copy-and-paste buffer.

TextWrangler is free, fast, and very powerful. The detab and gremlin-zap functions are only a minute fraction of its goodies. Give it a try.
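If you are curious what "detab" and "gremlin zapping" amount to, the rough idea can be sketched in a few lines of Python. This is not how TextWrangler implements them, just an approximation under my own assumptions: `detab` expands tabs to spaces, and `zap_gremlins` drops Unicode control and format characters (categories Cc and Cf) while keeping newlines and tabs. Both function names are made up for this example.

```python
import unicodedata


def detab(text: str, tabsize: int = 4) -> str:
    """Replace each tab with spaces up to the next tab stop."""
    return text.expandtabs(tabsize)


def zap_gremlins(text: str) -> str:
    """Drop invisible control/format characters, keeping newlines and tabs.

    Note: carriage returns are also category Cc, so they are removed too.
    """
    keep = {"\n", "\t"}
    return "".join(
        ch
        for ch in text
        if ch in keep or unicodedata.category(ch) not in ("Cc", "Cf")
    )


if __name__ == "__main__":
    dirty = "a\u200bb\x0cc\n"  # zero-width space and form feed hidden inside
    print(repr(zap_gremlins(dirty)))
    print(repr(detab("a\tb")))
```

Non-breaking spaces are ordinary space separators as far as Unicode is concerned, so this sketch leaves them in place; they would need separate handling.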