ePub - headings etc

wmike · December 14, 2021, 4:06pm

Have pulled what remaining hair I had left out.

I am, forlornly, trying to compile for ePub.

Why do my headings bear no resemblance to the way they are formatted?

Why does the table of contents not show any of the headings in the TOC - just a chapter title and that is it?

I have headings 1 - 6 set up, in the Scrivener document and in the compile settings. Scrivener does not seem remotely interested in them. Sometimes they appear correctly; most of the time they have, for instance, bold removed or are aligned incorrectly. The list goes on.

For an app that is designed for this kind of thing (ePub) - it is pretty hopeless.

drmajorbob · December 14, 2021, 4:07pm

Here is a sample ToC in an ePub3.

The first piece of it, chapter one for instance, comes from the Title Prefix:

Next is the separator ~ from here:

and the last piece is always the document title.

wmike · December 14, 2021, 4:07pm

Thanks for that. Yes, I get that bit. However, within the chapters are various headings - heading 1, heading 2, etc.

I don’t understand the point of having these if Scrivener cannot actually use them in a TOC - that doesn’t make sense.

AmberV · December 14, 2021, 4:07pm

The automatic ToC feature generates one based upon hard section breaks in the book, which are used as navigation points by most ebook readers as well, so it’s a good fit and works with most assumptions.

If you need a table of contents that goes deeper than that, into headings within larger sections, then you can make use of the procedure described in the user manual PDF, under §22.2, Contents in Ebooks.

That said, this technique still works within Scrivener’s underlying design of using outline structure to show book structure, rather than printing text into an editor with a style, like a word processor might do. These headings you speak of will not have ID assignments made to them automatically, so there will be nothing to link to.

As for why Scrivener has both—mainly for cases where you want headings that don’t need expression in the main Draft outline, for whatever reason. I wouldn’t myself ever use that approach because it requires manual labour if you promote or demote a section. What was level 3 becomes level 2, and so on—if you allow the compiler to generate those headings for you at the natural depth they occur in the outline, then you never have to bother with details like that, and everything acquires an ID anchor point that can be linked to.
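To illustrate why the ID anchor matters, here is a rough sketch of how an EPUB 3 navigation entry points at a heading. The file name and the id values are invented for illustration, and the real generated files will differ; the fragment also assumes the surrounding document declares the epub namespace.

```html
<!-- In the EPUB 3 navigation document (hypothetical file name and ids) -->
<nav epub:type="toc">
  <ol>
    <li><a href="chapter01.xhtml#doc12">Chapter One</a>
      <ol>
        <li><a href="chapter01.xhtml#doc13">A Subsection</a></li>
      </ol>
    </li>
  </ol>
</nav>

<!-- In chapter01.xhtml: the link above resolves only because this heading carries an id -->
<h2 id="doc13">A Subsection</h2>

<!-- A heading typed into the text with a style gets no id automatically,
     so a ToC entry has nothing to point at -->
<h2>Another Subsection</h2>
```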

drmajorbob · December 14, 2021, 4:07pm

You don’t need them. Put that info in the titles.

H1, for example, is a style. It wouldn’t be all that practical for Compile to reach into a document and look for a (possibly absent) style and use it for the ToC when it’s far simpler and foolproof to use the title, which is never absent. If you leave the title blank, it becomes the first line of the document, arriving indirectly at the behavior you want (almost). Almost, because formatting for it is set in Compile, not the Editor.

wmike · December 14, 2021, 4:07pm

OK - the fact is that Scrivener is horrible at generating an ePub document.

I resorted to using Sigil. It showed masses of unwanted HTML and CSS coding, throughout the document.

There are too many examples to put here.

I stripped out all the Scrivener rubbish, reformatted the whole document, generated a TOC in Sigil.

Net result - a very nice ePub. Only, however, after many hours of sorting out HTML and CSS.

My conclusion. Scrivener has serious issues when compiling for ePub. One of the reasons I bought it.

AmberV · December 14, 2021, 4:07pm

It sounds like you are using default settings to me, which are, yes, tuned toward an audience that doesn’t know what HTML stands for and just expects something that looks roughly like what they formatted, no matter how messy it needs to be in order to execute that. You can probably see that there is a benefit in biasing the default settings toward that. Instead of you and three other people complaining about the HTML quality over the past four years, we’d have thousands upon thousands of complaints about how the compiler requires one to become a “programmer” (as some refer to HTML/CSS).

A key thing to know is that, internally speaking, Scrivener uses Multimarkdown to generate the HTML. So at a very basic level it produces pretty clean output—the more WYSIWYG style stuff you pile on top of that, the messier it will get (kind of out of necessity).

For those that actually care about the quality of the ebook at a technical level, you will definitely want to tweak some stuff. For one thing, I’d set aside the entire “Ebook” compile format. Use it as a frame of reference for the kinds of things you can do, if you want, but it was not created from a standpoint of adhering to best practices, and uses techniques that produce undesirable HTML, like empty paragraphs with a br to introduce spacing, major sections starting with h2 level headings (and no h1 to be seen anywhere), using h3 for “subtitles”, etc.
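As a rough illustration of those patterns (hand-written for this post, not copied from actual compiler output; the “subtitle” class name is made up), compare:

```html
<!-- The kind of markup described above as undesirable -->
<h2>Chapter One</h2>        <!-- major section starts at h2, with no h1 anywhere -->
<h3>A Subtitle</h3>         <!-- h3 pressed into service as a "subtitle" -->
<p>First paragraph</p>
<p><br/></p>                <!-- empty paragraph used only to add vertical space -->
<p>Second paragraph</p>

<!-- A more semantic equivalent; spacing is left to the stylesheet -->
<h1>Chapter One</h1>
<p class="subtitle">A Subtitle</p>
<p>First paragraph</p>
<p>Second paragraph</p>
```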

Now if you want a very vanilla and semantically correct ebook with minimal work, try this:

Scroll down to the enumerated checklist in this post, and go through it, up to step three.
Instead of step four, since you don’t indicate using Markdown to write, you’ll want to have Scrivener convert direct-formatting (like font variations such as italics to indicate emphasis) to semantic Markdown. Click on the general options tab in compile overview, and enable Convert rich text to MultiMarkdown.
Visit the Metadata, Cover and ToC tabs and set those up the way you want, and give that a test compile.

Since you mention using h1…h6 level styled headings in your text, you should find those work as you expect. Scrivener will convert them to Markdown headings, which ultimately are indistinguishable from any headings it might insert as structural conversion of the outline hierarchy. So all in all, that might produce a better result out of the box, for how you work, than the native ePub generator would. And because of that, you should find the automatic ToC works better with them, too.

If you edit the “Basic Pandoc” compile format, you’ll find a “Pandoc Options” pane, where you can set the ePub format to v3 (that really should be fixed to default to that, I’ll make a note of it), and a place to insert your CSS.

Say you have a dislike of Markdown and don’t even want to use automatic conversion or something, and really want to use Scrivener’s ePub generation. Well, as I say, you won’t be able to escape that entirely, since even the native generation uses MMD internally. The main difference between the two is that instead of having Pandoc produce the internal .epub structural files, Scrivener will. The HTML itself will be similar though, if you work toward that end.

So try this:

Switch back to regular ePub ebook output.
Select the “Default” compile format in the left sidebar, and set up some basic section layout assignments.
There are some things we’ll need to adjust, so double-click the Default format to edit it.
1. In the Styles pane, create styles for any heading levels you want to have the compiler generate, beyond the ones you have in the editor. As you add them from the project’s stylesheet, you should find they are already set up correctly, so you probably won’t have to do much here.
2. In Section Layouts, review the layouts you are using that have headings, and make sure to apply those styles to the sample text.
3. In Separators, you’ll probably want to adjust the default “Text Files” to “Single Return”, and click through any of the bold Layouts to verify their settings. “Section Break” is what will cause an .xhtml file to be created internally. “Empty Line” you don’t want. If you really want spacing, you should use CSS for that of course.
4. Next visit the CSS pane and disable Create styles for paragraphs using custom formatting, at the bottom. For the CSS, you can do what you want here. If you set the selector to “Use Custom CSS Stylesheet”, then the default styles will be printed for your reference, but they won’t be used—CSS will be 100% up to you, and if you leave it empty you’ll get vanilla output.
5. Lastly, go through the HTML Elements pane, and if you use any of those styles, create them in the Styles pane first, and then assign them here. Appearance is meaningless for these special styles—setting “Block quotes style” actually causes the compiler to handle it as a Markdown quote environment, for example.

It may not be exactly what you want, but you can build up from that starting point, perhaps more easily than starting with a very opinionated format like “Ebook”, and trying to disable much of what it does.
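On the point about using CSS rather than an “Empty Line” separator for spacing, a minimal sketch might look like the following. The class name and the 2em value are arbitrary choices for illustration; in Scrivener the rule itself would go into the custom stylesheet in the CSS pane rather than a style element, which is used here only to keep the example in one piece.

```html
<style>
  /* Hypothetical class: adds space above a paragraph that opens a new scene */
  p.scene-break { margin-top: 2em; }
</style>

<p>Last paragraph of one scene.</p>
<p class="scene-break">First paragraph of the next scene.</p>
```

The gap then lives in one rule that can be tuned later, instead of in empty paragraphs scattered through every file.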

I’d say you’re still going to have issues generating a ToC as you want with headings in the text, just based on how Scrivener generates HTML, but at least you won’t be stripping out piles of formatting. Again, the Pandoc Markdown approach is much more conducive to “inline” headings. I’d say the short pros and cons for each would be:

Native: if you’d rather a generator create CSS for you, it’ll do a pretty good job of that based on WYSIWYG style inputs. You centre-align “Heading 1” in the styles pane, and it’ll look that way on output. With the Pandoc approach every single aspect of design is up to your CSS because what Scrivener actually generates is a raw Markdown file.
Pandoc: overall if you prefer style-based and pure semantic writing styles, you will probably find this to be a better option. It is also easier to inject raw HTML into the output. You do much better thinking of document production as building a Markdown file. Styles are not for making things look a certain way, but making the text function a certain way, etc.

Both have their own unique learning curves; which works better for you is more a matter of preference, I’d say, when it comes to that. If you want to wrap sections in divs for example, you’d go into the Settings tab in Section Layouts and give it a class name. With Pandoc, you’d use the prefix/suffix tabs to insert the HTML directly (though you’ll note it does wrap major sections in divs automatically).
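Either way, the goal is output roughly like this sketch, where “sidebar” is a made-up class name and the exact attributes each route emits may differ:

```html
<!-- Native route: the class name entered in the Section Layout's Settings tab.
     Pandoc route: the opening tag typed into the prefix, the closing tag into the suffix. -->
<div class="sidebar">
  <h2>Heading of the Section</h2>
  <p>First paragraph</p>
  <p>Second paragraph</p>
</div>
```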

To summarise:

“OK - the fact is that Scrivener is horrible at generating an ePub document.”

No, but with the wrong settings you can certainly make a mess of things.

“Net result - a very nice ePub. Only, however, after many hours of sorting out HTML and CSS.”

You may have to spend some hours in initial design—but honestly nothing other than template-driven junk is going to let you skip that part. The good news here though is that once you get the settings tuned to a baseline level of what you want, you can save that compile Format for future use, and fork from it with specific book designs, rather than always having to start over from scratch.

At least those hours will be productive hours, rather than hours spent tearing down unwanted output!

wmike · December 14, 2021, 4:07pm

Many thanks. Tried your suggestions - the results are much worse - the layout of each page is rubbish - most HTML is now just ignored. TOC is just a continuous list of HTML tags. I could go on. Lost the will to live with this.

I can get a finished project by using Sigil to put right the Scrivener mess.

I guess I will have to continue with that.

By the way, trying to make changes to the default layout results in Scrivener enforcing the making of a copy of the default - no idea if it is then, actually used. The same goes for the eBook format - have to make a copy and make any changes to that. However, the result is the same - absolute garbage.

Fortunately, I am well versed in HTML/CSS. Pity Scrivener forces me to use it.

I do, however, stick by my previous comments - Scrivener is not, in my opinion, fit for purpose in relation to creating ePub files.

AmberV · December 14, 2021, 4:07pm

Seeing as how we have such a substantially different perception of what a very vague and low-utility word like “rubbish” means in application to the variations in technical output by the software, then, it would help for you to define with more precision what you mean by that. For example, this is what I would consider to be clean output:

<h1>Heading of the Section</h1>
<p>First paragraph</p>
<p>Second paragraph</p>

This is essentially the type of output you would have seen using either of my examples—as you might expect. So that you would call that “rubbish” leaves me wondering what you expect, when your initial criticism was that it was not clean enough.

“TOC is just a continuous list of HTML tags.”

I don’t understand what these words mean when stated in this order. Why wouldn’t it be continuous HTML syntax from the top to the end of the file? Anything else would be invalid.

“By the way, trying to make changes to the default layout results in Scrivener enforcing the making of a copy of the default - no idea if it is then, actually used. The same goes for the eBook format - have to make a copy and make any changes to that.”

As the dialogue box indicates, you cannot edit built-in settings directly, but create duplicates of them which can then be adjusted from their starting point. This has obvious advantages, such as being able to freely modify from a designed point, without having to always start from scratch, and secondarily to never risk destroying those starting points.

“Fortunately, I am well versed in HTML/CSS. Pity Scrivener forces me to use it.”

Sorry to say, but there are no popular ebook formats that use anything else, and generally speaking that is a good thing. The alternatives are proprietary or just simply needlessly forked formats for very little gain, given the flexibility and suitability of these open and standard technologies toward ebook production.

wmike · December 14, 2021, 4:07pm

Yup, your definition of clean output is correct. Here is the Scrivener version of a bulleted list - simple

<ul> <li> </li> </ul>.

<h2 class="heading-4" id="doc64">Checklist for housebreaking<br/>
Put a bell on his collar – you will now know wherever he is and can correct him quickly</h2>
<p class="normal" style="text-indent: 0em">Take him to his toilet area:</p>
<p class="normal">• as soon as he wakes up</p>
<p class="normal">• straight after playing</p>
<p class="normal">• 15-30 minutes after food</p>
<p class="normal">• at least 6-8 times a day</p>
<p class="normal">• keep regular feeding times </p>
<p class="normal">• take up food after 30 minutes</p>
<p class="normal">• walk him on the lead</p>
<p class="normal">• don't play with him until he has been to the toilet</p>
<p class="normal">• take for 15 - 20 minute walks</p>
<p class="normal">• allow him to sniff things</p>
<p class="normal">• stick to one area</p>
<p class="normal">• after he has been to the toilet, praise him, play with him, and interact with him</p>
<p class="normal">• if you discover him in the wrong area, don't frighten him.</p>
<p class="normal">• clap your hands and take him to the right area</p>
<p class="normal">• always praise him and reward him when he gets it right</p>
<p class="normal">• never punish him if he gets it wrong</p>
<p class="normal">• get him used to a command such as &quot;clean boy&quot; or &quot;go pee&quot;<br/>
</p>
<p class="normal">Whatever you decide on, stick with it.</p>
</div>
<p class="separator"><br /></p>

AmberV · December 14, 2021, 4:07pm

Here is what I’m seeing, based on the results:

It looks like there is an empty list in the text file above the binder item that is called “Checklist for housebreaking”. If you can’t see it, select the empty lines at the bottom of this previous section and either delete them, or use the Format ▸ List ▸ None command with the lines selected.
You seem to be using a “Normal” style that is applied to all paragraph text. That wouldn’t typically be necessary for an ebook, as you can just consider a regular <p> to be normal, right? It could be you are using that setting for word processor output? Might be a good idea to have two separate Formats in that case, given the divergence in desirable behaviours between ebooks and word processing files.

That is what would cause all of the styles to have a “normal” class though at any rate. The intended use for styles in ebooks would be to create variation from the normal, and so in most cases that would be desirable. You would have for example a <p class="tipbox"> with some CSS for making that paragraph in a visual box. Having a “Tipbox” style in the editor makes it possible to easily pass through that instruction to the output so that it can be styled.
Otherwise the list itself doesn’t look like a real list to me, which is why it is just printing bullet symbols at the beginning of each paragraph. If there were a proper list in the editor, then you should find a different kind of result, as Scrivener would not apply the “Normal” paragraph style to list items. Then again, you may be applying “Normal” in a fashion different from what I’m assuming (in the compile settings).
The forced style on the first regular paragraph that suppresses the paragraph indent is coming from your settings in the Text Layout compile format pane. If you want no meddling, then all of the Remove first line indents checkboxes should be disabled (the “Default” test I had you try would have them all disabled by default, so I’m a bit confused as to why I’m seeing the result of them enabled in your test output).
Lastly the .separator paragraph at the bottom is coming from an “Empty Line” separator setting, which I suggested removing if you’d prefer a cleaner way of introducing a vertical space.
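To make two of those points concrete, here is a hand-written sketch. The CSS values are arbitrary, and the list markup shows what a genuine list generally looks like in HTML rather than being a copy of the compiler's exact output.

```html
<style>
  /* Hypothetical styling for the "Tipbox" example */
  p.tipbox {
    border: 1px solid #999;
    padding: 0.5em 1em;
    margin: 1em 0;
  }
</style>

<!-- Ordinary text needs no class; only the variation from normal gets one -->
<p>An ordinary paragraph.</p>
<p class="tipbox">A tip that should appear in a visual box.</p>

<!-- A real list, as opposed to paragraphs that merely start with a bullet character -->
<p>Take him to his toilet area:</p>
<ul>
  <li>as soon as he wakes up</li>
  <li>straight after playing</li>
  <li>15-30 minutes after food</li>
</ul>
```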

Overall though, this looks distinctly different from what a stripped down Default setup would look like, as well as Pandoc output, so it seems we’re looking at your original settings instead, and they appear to be quite customised. That’s fine, but recall what I was saying before: the more WYSIWYG type stuff you layer into the settings (like the indent-suppression checkboxes), the more overbearing the output will be, in order to comply with these requests.

wmike · December 14, 2021, 4:11pm

I surrender. Tried, unsuccessfully, your suggestions. The example I have shown is a bulleted list, directly from Scrivener. The h2 for housebreaking has, for reasons beyond me, been stripped out by Scrivener and replaced with a br. But, hey, must be my fault.

AmberV · December 14, 2021, 4:14pm

Well hey, if you ever want us to take a look at the input/output directly, feel free to drop a sample to our support address. With the wide variety of permutations involved, it’s hard to say through descriptions what is going on. There is no one single procedure that would intentionally delete a heading and convert it to a br, but I can think of a few dozen ways to do that on purpose (and thus accidentally).

I’d reiterate that for those wanting a clean and simple ebook that is easy to format semantically, the Pandoc workflow is hands down the best. Any word processing formatting oriented workflow is going to be a bit messy in my experience. Our use of a Markdown conversion process cuts down on a lot of that, but it’s not perfect and probably cannot be. Markdown on the other hand is ruthlessly simple to write with, and short of typos in the syntax, it is very difficult to introduce mysterious complexity in the output.