Import HTML Code or Preserve HTML Code in output

Greetings,

I am looking at various eBook software creation tools, Scrivener being one of them.

I am wondering if Scrivener has the ability to either:

  1. Import an HTML block of code and not change it in the output (epub and kindle)
  2. A way to do the same but with an external html file?

Cheers,
Gene

Scrivener is usually combined with Calibre or Sigil after Compiling to ePub3 or Kindle for finesse.

To represent HTML in… well, HTML, you’ll have to escape a lot of characters or they will be interpreted as HTML by the reader or browser.
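
As a rough illustration of what that escaping involves, here is a minimal sketch using Python's standard `html` module. The `snippet` string is just a made-up example, and this is something you would run outside Scrivener, not anything Scrivener does itself.

```python
import html

# A made-up fragment we want readers to SEE as markup, not have rendered.
snippet = '<div class="note">Fish & chips</div>'

# html.escape replaces &, < and > (and, by default, quotes) with entities,
# so a reader or browser displays the markup instead of interpreting it.
escaped = html.escape(snippet)
print(escaped)
```

Running `html.unescape(escaped)` gives the original string back, which is a quick way to confirm nothing was lost.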

You can import an HTML file, but it will show up in a simple browser as a web page.

Yes, you can do both of these things with Scrivener, though there are some platform differences to be aware of.

  • On the Mac: go into the Sharing: Import preferences pane, and in the “Plain Text” section, note the file extension list on the right. Add ‘.html’, ‘.htm’ or whatever else you need here.

    This will override the normal behaviour, where dropping HTML files into the binder will either archive them (using MHT or WebArchive) or convert them to editable formatted text. Neither is what you want.

  • On Windows: this option hasn’t been added yet, so normal import is off the table. You’ll need to copy and paste from a plain-text editor to bring raw HTML into the editor. That of course works on the Mac as well. If you’d rather not change that setting, so that you can continue archiving pages normally, copy and paste is how you’d get both.

So that gets the syntax into the editor, but in order for that to remain HTML through compile, you need to tell Scrivener that’s what you want. For all it knows you are writing about HTML and want it to print verbatim. To achieve that we need to use styles. First, let’s explore how that works.

  1. Open File ▸ Compile..., set the Compile for option at the top to ePub, and select the “Ebook” compile Format in the left sidebar.
  2. Double-click on it to duplicate and edit, and click on the Styles pane. Scrolling through that list you’ll find two prepared styles: “Raw HTML” and “Raw HTML Block”. One is for inline snippets of raw syntax, the other for whole chunks, as you would expect. Click on these, and note they are both set to use the Treat as raw markup flag. That’s the special sauce, and you can apply it to any style you want.

We know what we need to know for now. Make a mental note of these style names, and feel free to cancel out of this dialogue and Opt/Alt-click on the “Compile” button to just save your changes to the file type and format.

Back in the editor, select the syntax you’ve imported (one way or another), and use the Format ▸ Styles ▸ New Style from Selection... menu command. The formatting and settings are completely arbitrary; do what works best for you, since it will all be ignored at compile. The only thing that matters is using one of the style names, depending on whether it is paragraph or character level.

Now as to your second question: yes, you can insert text from .html files on the fly, using the placeholder documented in §10.1.5, Including Text From Other Documents, in the user manual PDF. If you intend to compile from multiple platforms, it’ll work best to keep the .html files in the same folder as the project, so that you avoid having to give a full path, which will of course be completely different between Win/Mac.

However, I wouldn’t depend on that approach right now, as both platforms have bugs related to this feature that effectively make it not useful for inserting raw syntax.

Thank you, Amber, for the help. However, I must have done something incorrectly, as the HTML is now showing as text. At this time I am on a Mac (Intel), Catalina 10.15.7, if that matters.

I got as far as “Back in the editor” when things became fuzzy. I did create a New Style called HTML, the styles talked about from #2 do NOT show up here. There is no shortcut; formatting is “Save all formatting” with both boxes checked. I have the Highlight box checked with a color, and Next Style is None. I have a folder called Code Blocks with a text document in it containing the code below, styled with HTML. In the Manuscript I have a document with <$Include:TestCodeBlock>. When I recompile using the copied format called “Ebook with HTML”, the code shows as text instead of raw HTML to be rendered.

Below is the raw HTML I would like to insert, hope you like Animatics :). Could you either provide more step-by-step instructions like you did in the beginning (preferred), or a sample document that has the changes you described, with the changes pointed out? Just to be clear, the HTML below should not be changed at all, just inserted.

Thank you again for the help.

		<!--StartFragment-->
		<div style="font-family:Consolas,Lucida Console; font-size:10pt; width:650; border:1px solid black; overflow:auto; padding:5px">
			<table border="0" cellpadding="5" cellspacing="0">
				<tbody>
					<tr>
						<td valign="Top">
							<div style="font-family:Consolas,Lucida Console; font-size:10pt; padding:5px; background:#ffffff">

001<br />002<br />003<br />004<br />005<br />006<br />007<br />008<br />009<br />010<br />

							</div>
						</td>
						<td valign="Top" nowrap="nowrap" width="100%">
							<div style="font-family:Consolas,Lucida Console; font-size:10pt; padding:5px; background:#ffffff">

<span style="color:#006400">&lt;#
<br />
&nbsp;&nbsp;&nbsp;&nbsp;Help
<br />
#&gt;</span><br />
<br />
<span style="color:#0000FF">Add-PSSnapin</span><span style="color:#000000">&nbsp;</span><span style="color:#8A2BE2">Microsoft</span><br />
<br />
<span style="color:#00008B">foreach</span><span style="color:#000000">(</span><span style="color:#FF4500">$i</span><span style="color:#000000">&nbsp;</span><span style="color:#00008B">in</span><span style="color:#000000">&nbsp;</span><span style="color:#FF4500">$list</span><span style="color:#000000">)</span><span style="color:#000000">&nbsp;</span><span style="color:#000000">{</span><br />
<span style="color:#000000">&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color:#0000FF">Write-Host</span><span style="color:#000000">&nbsp;</span><span style="color:#8B0000">&quot;HELLO NURSE!&quot;</span><br />
<span style="color:#000000">}</span><br />

							</div>
						</td>
					</tr>
				</tbody>
			</table>
		</div>
		<!--EndFragment-->

I did create a New Style called HTML, the styles talked about from #2 do NOT show up here.

I may not be following, but as noted in the original instructions: “The only thing that matters is using one of the style names, depending on whether it is paragraph or character level.” And to clarify, since this is taken out of context, one of the style names refers to the list of styles in the compile format.

You say they don’t show up here, but I’m not sure what that is in reference to. They wouldn’t be in the project itself until you create them, so that would be expected. If they aren’t in the compile format, that’s why I recommended starting with the stock “Ebook” compile format.

So for a large block of HTML like this you want a paragraph style called “Raw HTML Block”. Calling it “HTML” isn’t going to do anything. Just redefine it and rename it to “Raw HTML Block”, or go into the compile format’s Styles pane and rename “Raw HTML Block” to “HTML”. Either way serves the same purpose.

One last thing to be aware of is that when you insert text using the <$include:…> placeholder, it uses the styles from the source text. So you do want to apply the “Raw HTML Block” style to the TestCodeBlock item’s text, rather than applying it to the placeholder itself at the point of insertion. It sounds like that is what you are doing, but doing things the other way around is one way in which the HTML would end up escaped and printing as visible text.

Thank you that helped!

Now for the bad news :)

It seems that if there is a blank line or a literal #, it causes issues.

The literal #, which is valid HTML, produces:

<h1 id="gtspanbr">

around the line with a closing h1 tag, so this:

#&gt;</span><br />

becomes

<h1 id="gtspanbr">&gt;</span><br /></h1>

The blank line causes this, again valid HTML:

***I am a blank line***
							</div>
						</td>
					</tr>
				</tbody>
			</table>
		</div>
		<!--EndFragment-->

to become:

***I am a blank line***
<pre><code>                        &lt;/div&gt;
                    &lt;/td&gt;
                &lt;/tr&gt;
            &lt;/tbody&gt;
        &lt;/table&gt;
    &lt;/div&gt;
    &lt;!--EndFragment--&gt;
</code></pre>

Is this a potential bug?

For me, it would be beneficial to have the ability to turn text wrapping off in the editor and the ability to have line numbers.

For reference, the HTML code used is:

		<!--StartFragment-->
		<div style="font-family:Consolas,Lucida Console; font-size:10pt; width:650; border:1px solid black; overflow:auto; padding:5px">
			<table border="0" cellpadding="5" cellspacing="0">
				<tbody>
					<tr>
						<td valign="Top">
							<div style="font-family:Consolas,Lucida Console; font-size:10pt; padding:5px; background:#cecece">
								001<br />002<br />003<br />004<br />005<br />006<br />007<br />008<br />009<br />010<br />
							</div>
						</td>
						<td valign="Top" nowrap="nowrap" width="100%">
							<div style="font-family:Consolas,Lucida Console; font-size:10pt; padding:5px; background:#fff">
								<span style="color:#006400">&lt;#
<br />
&#160;&#160;&#160;&#160;Help
<br />
#&gt;</span><br />
<br />
<span style="color:#0000FF">Add-PSSnapin</span><span style="color:#000000">&#160;</span><span style="color:#8A2BE2">Microsoft</span><br />
<br />
<span style="color:#00008B">foreach</span><span style="color:#000000">(</span><span style="color:#FF4500">$i</span><span style="color:#000000">&#160;</span><span style="color:#00008B">in</span><span style="color:#000000">&#160;</span><span style="color:#FF4500">$list</span><span style="color:#000000">)</span><span style="color:#000000">&#160;</span><span style="color:#000000">{</span><br />
<span style="color:#000000">&#160;&#160;&#160;&#160;</span><span style="color:#0000FF">Write-Host</span><span style="color:#000000">&#160;</span><span style="color:#8B0000">&quot;HELLO NURSE!&quot;</span><br />
<span style="color:#000000">}</span><br />

							</div>
						</td>
					</tr>
				</tbody>
			</table>
		</div>
		<!--EndFragment-->

Also, when compiled, linked documents with the extension .htm or .html show the <$include:whatfile.html> placeholder, not the contents. I tried changing the extension to .txt and it worked. Am I missing something?

Thoughts?
Gene

@AmberV - Any thoughts?

Okay, yes I see what is going on here. Internally and during the compile process Scrivener uses MultiMarkdown to generate the HTML, but that detail should be largely invisible to you. In this case it is causing a problem though, because when you use the raw markup flag on a style, it passes the text through verbatim into the internal Markdown file used for processing. So that means three things:

  1. These lines you’ve got that have an amount of indenting applied to them end up being interpreted as code blocks, because prefixing a line with at least four spaces or one tab is how you mark code blocks in Markdown.
  2. The empty lines would mark the boundaries of these code blocks, as empty lines are how you separate block level elements.
  3. The line starting with a “#” character is interpreted as a heading.
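
To make those three rules a bit more concrete, here is a rough Python sketch that scans a chunk of HTML for the two patterns that actually get misread. The function name `find_markdown_hazards` and the sample text are hypothetical, and the checks are a simplified approximation of the behaviour described above, not MultiMarkdown's real parser.

```python
def find_markdown_hazards(raw_html: str) -> list[tuple[int, str]]:
    """Return (line_number, reason) pairs for lines Markdown may misread."""
    hazards = []
    previous_blank = True  # assumption: the start of the text acts like a block boundary
    for number, line in enumerate(raw_html.splitlines(), start=1):
        if not line.strip():
            previous_blank = True  # an empty line separates block-level elements
            continue
        # Four spaces or one tab right after a boundary marks a code block.
        if previous_blank and (line.startswith("\t") or line.startswith("    ")):
            hazards.append((number, "indented line after a blank line starts a code block"))
        # A line that begins with # is taken as a heading.
        if line.startswith("#"):
            hazards.append((number, "leading # is read as a heading"))
        previous_blank = False
    return hazards


sample = "<div>\n    <p>Fine</p>\n\n    <p>Code block</p>\n#&gt;</span><br />\n</div>"
for number, reason in find_markdown_hazards(sample):
    print(number, reason)
```

For the sample, only the indented line that follows the empty line and the line beginning with “#” are reported; the first indented line is left alone because nothing separates it from the line before.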

So to the first problem, you’ll do best by removing the indenting, or at least making sure that the first line is not indented. Indenting HTML in Markdown is okay, but only so long as no empty line within the range is followed by an indented line. So for example:

<div>
    <p>This is fine...</p>

    <p>This is the first line of a new code block</p>
</div> <!-- This is outside of the code block. -->

The line starting with a “#” could be resolved by just moving it up to the previous line.
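
If you would rather not do that clean-up by hand, it could be scripted before pasting the HTML into the editor. The following is a minimal Python sketch under a couple of assumptions: the helper name `flatten_for_raw_markup` is made up, and it assumes whitespace in the HTML is not significant (so nothing inside `<pre>` tags, for instance).

```python
def flatten_for_raw_markup(raw_html: str) -> str:
    """Strip indentation, drop blank lines, and join '#'-led lines upward."""
    result = []
    for line in raw_html.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines would split the HTML into separate blocks
        if stripped.startswith("#") and result:
            # Move the line up so that "#" no longer starts a line.
            result[-1] += stripped
        else:
            # Note: a "#" on the very first line has nothing to join to and is kept as-is.
            result.append(stripped)
    return "\n".join(result)


messy = "\t<div>\n\n\t\t<span>&lt;#\n<br />\n#&gt;</span><br />\n\n\t</div>"
print(flatten_for_raw_markup(messy))
```

The result has no indenting, no empty lines, and the `#&gt;` line tacked onto the end of the `<br />` line before it, which a browser or reader should render the same way as the original.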

As for whether it’s truly a bug, I don’t know if that’s right to say; it’s nice to be able to use Markdown to generate HTML if you want. It’s something worth documenting better though, as I don’t think it is mentioned anywhere.

Thank you very much for the explanation. Now that I know, I will be able to massage the HTML. It does disturb me though, since the text is marked as Raw HTML Block. To me that would mean a literal copy of the raw HTML block, with no processing of the code at all, but that is my opinion and I have no idea how the software is coded.

Thank you again Amber for all your help.