I have been looking for a while and can’t find the answer to this. Forgive me if I am missing something obvious:
Is there any way to get Scrivener to compile “clean” HTML without all the embedded style definitions and with the heading tags (H1, H2, H3, etc.) correctly in place so I could use it with an external CSS style sheet?
Yes, some better options for HTML are on the list, in two forms. The first is cleaner exports for the current system, so that an unstyled XHTML document can be dropped into a styled web site without any post-editing to strip out the inline styling. The second is MultiMarkdown integration, which itself produces very clean output and uses a familiar, easy-to-use plain-text marking system in the editor.
I’m not super familiar with Windows-based HTML editors, but I do know of Notepad++ and remember playing with it years ago. It had a ton of features for text processing and coding, so I wouldn’t be surprised if it had a way to clean things up. You might also check whether there is a version of Tidy for Windows. Tidy is a UNIX tool for cleaning up HTML that I’ve used for years. Some text editors have a copy of Tidy embedded in them.
I can’t give any ETAs on either of those items, but MMD will probably come sooner as that is a 1.x slated feature, whereas more expansive export/import options are slated for 2.x.
Thanks,
If clean HTML is slated for 2.x, we won’t see it til, say, mid 2013, is that about right? I kind of figured that, which is why I was asking for 3rd party tools.
I tried Notepad++, and also PSPad and RJTextEdit. Their cleanup depends on user scripts with varying results. I remember there was a dedicated tool out there, but can’t seem to find it on the web.
If someone knows about a working solution (please try it first), please let us know.
Tidy is really the best that I know of. It can take something that is wholly garbage, like FrontPage or MS Word output, and turn it into a solid piece of W3C-compliant XHTML. It shouldn’t have a problem with Scrivener’s largely compliant but kind of overbearingly formatted HTML.
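For anyone who wants to try Tidy standalone rather than through an editor plugin, here is a rough sketch of calling it from a small Python script. It assumes a `tidy` executable is installed and on the PATH; the file names and the helper names `build_tidy_command` and `run_tidy` are made up for illustration. Note that this only normalises the markup to XHTML; it does not by itself remove Scrivener's inline styles.

```python
import subprocess


def build_tidy_command(source_path: str, output_path: str) -> list[str]:
    """Build a Tidy command line (assumes `tidy` is on the PATH)."""
    return [
        "tidy",
        "-q",            # quiet: suppress non-essential output
        "-asxhtml",      # convert the input to well-formed XHTML
        "-utf8",         # treat input and output as UTF-8
        "-o", output_path,
        source_path,
    ]


def run_tidy(source_path: str, output_path: str) -> int:
    """Run Tidy and return its exit status.

    Tidy exits with 0 when all is well, 1 when there were only
    warnings, and 2 when there were errors, so a status of 1 is not
    treated as a failure here.
    """
    result = subprocess.run(build_tidy_command(source_path, output_path))
    return result.returncode


if __name__ == "__main__":
    # Hypothetical file names; substitute your own compiled export.
    status = run_tidy("scrivener_export.html", "cleaned.html")
    print("tidy exit status:", status)
```

Passing `-utf8` explicitly is worth doing, since a mismatched character encoding is a common cause of mangled accented characters.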
Like I said, I can’t say for sure, which is why I used “probably” in my sentence. Sometimes a feature that technically came out in 2.x for the Mac is deemed important enough to bump up to 1.x. E-books were an example of that. So I can’t say for sure it will be a year+ for this, but it is very likely that MMD will be first, because that is something that should have been in 1.x but wasn’t for timing reasons.
I had tried Tidy before, but it had some issues with German umlauts, which might have been due to encoding settings or to the editor’s plugin implementation. I have not tried it standalone, nor with Scrivener output. I will give it a test ride.
Okay, yeah, I figured that since that page was in German, that implementation would handle the majority of any issues like that, but perhaps it is not built to handle raw, non-encoded characters (ö instead of &ouml;). I did not check that.
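If the encoding does turn out to be the culprit, one workaround is to convert the accented characters to HTML entities before handing the file to any cleanup tool. The following is only a sketch of that idea using Python's standard library; the function name `to_named_entities` is invented for this example.

```python
from html.entities import codepoint2name


def to_named_entities(text: str) -> str:
    """Replace non-ASCII characters with HTML entities.

    Uses a named entity where one exists and falls back to a numeric
    character reference otherwise. ASCII characters, including any
    markup, are passed through untouched.
    """
    pieces = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            pieces.append(ch)
        elif code in codepoint2name:
            pieces.append(f"&{codepoint2name[code]};")
        else:
            pieces.append(f"&#{code};")
    return "".join(pieces)


if __name__ == "__main__":
    print(to_named_entities("<p>Größe und schöne Grüße</p>"))
```

Once everything outside the ASCII range is an entity, the file's declared encoding no longer matters for those characters.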
Has there been any update to this? I see that MMD and xhtml are supported, but I’m unable to figure out how to make this work for me.
I’m trying to get something that only includes really basic tags:
, , , , etc.
I just want to remove the tags and the style qualifiers from the html export. I’m willing to manually remove the header. I need to be able to copy/paste plain html into a website. Since they handle formatting and fonts, I don’t want my text to override that.
Unfortunately we don’t have just a simple HTML copy option available yet. All the HTML output is handled by the Qt framework, so overhauls to that aren’t going to be able to happen until the next major Scrivener update. At present, if you still want to keep some of the tags, your best solution is likely to work with a text editor supporting RegEx to strip out the extraneous code.
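As a rough illustration of the RegEx approach, here is a small Python sketch. The patterns are assumptions about what inline-styled output typically looks like (style blocks, `style` and `class` attributes, and `span` and `font` wrappers), not an exact description of what Scrivener emits, so check them against your own export. The function name `strip_inline_styling` is made up for this example.

```python
import re


def strip_inline_styling(html: str) -> str:
    """Remove common inline-styling markup while keeping basic tags."""
    # Drop whole <style>...</style> blocks.
    html = re.sub(r"<style\b[^>]*>.*?</style>", "", html,
                  flags=re.IGNORECASE | re.DOTALL)
    # Drop style="..." and class="..." attributes (either quote style).
    html = re.sub(r"""\s+(?:style|class)\s*=\s*(?:"[^"]*"|'[^']*')""", "",
                  html, flags=re.IGNORECASE)
    # Unwrap <span> and <font> tags, keeping the text inside them.
    html = re.sub(r"</?(?:span|font)\b[^>]*>", "", html,
                  flags=re.IGNORECASE)
    return html


if __name__ == "__main__":
    sample = ('<p style="margin:0px;">'
              '<span style="font-family:\'Courier\';">Hello <b>world</b></span>'
              '</p>')
    print(strip_inline_styling(sample))
```

Regular expressions are not a real HTML parser, so this can also touch text that merely looks like an attribute. It is best treated as a first pass to be reviewed by eye, and the same three patterns should work in the find-and-replace dialog of most editors that support RegEx.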