Almost, but not quite. Formatting PDF

TyrelVanNiekerk · August 17, 2023, 11:04pm

Hi,

It seems Scrivener is one of those apps that is almost good enough but misses the mark because of a few missing or broken features. I want to see if I can use Markdown, LaTeX, or some other export function to get the output I am looking for.

Table of Contents does not work at all, as far as I can tell. It does not use the correct fields, and I have not found a way to use the fields I want, and the generated page numbers are completely wrong.
I could not find a way to get the header to show the chapter number and title correctly.
Starting a chapter on an odd page is a feature reserved only for Mac, even though it’s something basic that most books use (that I am aware of).

Does anyone have a good way to export the files from Scrivener and then use something like LaTeX or Markdown to format the document correctly? I considered exporting all the files as RTF and then writing a C# app to combine them in an MS Word doc, using a provided template to get the correct formatting. However, the export files option does not work as I expected it to.

Thanks,
Tyrel

kewms · August 17, 2023, 11:21pm

You might read the sections in the Scrivener manual on integration with Markdown?

TyrelVanNiekerk · August 17, 2023, 11:40pm

Yes, I did read most of it. I guess the main issue I have is that the PDF compile options are almost perfect. There are just some things you cannot do or things that don’t work, making it difficult or impossible to use.
With my last book, I compiled to MS Word and then applied my template, manually removed the TOC and inserted the MS Word TOC field, updated the headers with the correct fields, and manually added an odd page section break to each chapter.
The export functions and compile to other formats, like Markdown, don’t treat the front matter correctly, so there’s no way to post-process that.
Exporting files does give you a bunch of RTF files that you could then process further, but that also seems like a lot of work. And then there is the odd behavior that if you apply a style to some text in Scrivener, it will overwrite any italic, bold, etc.
I will think about it some more. I guess this is why there are so many open-source doc writers. People write code to make the process that works for them.

kewms · August 18, 2023, 5:38am

In what way don’t they treat the Front Matter correctly? With the exception of automatically generated ToCs, you have as much control over Front Matter as any other component document.

AntoniDol · August 18, 2023, 7:18am

To get the output you’re looking for, compile to RTF or Word.

Import in InDesign or Affinity Publisher.

You are right in your assumption that the Windows version of Scrivener wasn’t designed to compile print-ready PDFs. Some of those four bullet points can be done in Scrivener, some not yet. For what you’re trying to do, you’d be better off in software dedicated to graphic design than software dedicated to writing and compiling long form writing.

HTH

TyrelVanNiekerk · August 18, 2023, 1:30pm

As a programmer, it bugs me. The PDF output is so tantalizingly close. A few more macros for headers/footers, fix the TOC, and add odd page section breaks for chapters, and you have it.

The front matter outputs as a blob of text if you use Markdown etc., with just the page names without any indication that it’s a separator. I guess I could change the name of the item from “Blank Page” to something like “^^Blank Page” and then post-process it to add a page break when formatting.
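
A rough Python sketch of that post-processing idea. It assumes the compiled Markdown puts each renamed title on a line of its own (optionally as a heading), and that the target is LaTeX via Pandoc, so `\newpage` is an acceptable break. The `^^` prefix and the function name are made up for illustration and would need adjusting to whatever the compile actually emits:

```python
import re

# Hypothetical marker: a line whose text starts with "^^" is a renamed
# front-matter item (e.g. "^^Blank Page"), optionally written as a heading.
MARKER = re.compile(r"^(?:#+[ \t]*)?\^\^.*$", re.MULTILINE)


def insert_page_breaks(markdown: str, page_break: str = "\\newpage") -> str:
    """Replace every marker line with a page-break command."""
    # A function is used as the replacement so the backslash in
    # "\newpage" is not interpreted as a regex escape sequence.
    return MARKER.sub(lambda _match: page_break, markdown)
```

Calling `insert_page_breaks(open("compiled.md").read())` before handing the text to the formatter would swap each placeholder title for a break; a different `page_break` string can be passed for other targets.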

I also get this is an editing tool. It’s just that it almost creates print-ready PDF or DOCX files.

I will try a few more things and see what I can do. The docx compile is probably the best format, but I need to figure out an automated way to fix the document after compile. I love that you can create outputs for Paperback, Hardcover and ebook. If I can post-process the Word doc and add the section breaks, fix the TOC and headers, that might end up being a great option.

AmberV · August 19, 2023, 6:32pm

Overall, it seems to me your critique about Scrivener is that it isn’t better at doing what it isn’t really designed to do, so that’s kind of okay in my book. That it could be confused otherwise is probably the more concerning thing. It would be better for us, I think, if it were extremely clear that writing is what Scrivener is meant for, rather than giving this “but it’s almost okay” kind of impression. I don’t see it, myself. I mean, I guess if ragged right is all you like, since the thing can’t even hyphenate! More than a few footnotes on a page break it! The kerning is not good at all in many cases; it very obviously looks like a PDF generated with some general-purpose toolkit library rather than something that has been coded from the ground up, over decades, to produce high quality text. Well, there is always one thing or another wrong that others don’t notice or desire, I suppose, and that’s part of the problem.

A few more macros for headers/footers, fix the TOC, and add odd page section breaks for chapters, and you have it.

So as a programmer, you can probably appreciate that if the PDF library you depend upon has no facility for creating an internal hyperlink from one point to another, that’s not just a quick macro one can throw together. As I commented about above, you have to bear in mind that this is your list, but someone else is going to have a different list, and if we add up all of the lists that would make Scrivener’s PDF “done” it would take us years to do it. We might as well stop making writing software at that point, get into desktop publishing. I jest a bit, but hopefully you see the bigger picture in what I’m getting at there. It’s out of scope for a writing program, and always will be.

That doesn’t mean some people might not be happy with it, don’t get me wrong, but it’s not our goal and can’t be.

As for best practices with using Scrivener like a rich text editor: my advice is to spend less time trying to do any kind of design in Scrivener unless you have extremely basic requirements (just the fact that you need a ToC puts you outside of that in other words), and more time on getting output that slots right into a template in seconds. It is surprising how much you can do with nothing but styled text being imported into a template. Page flow, numbering, headers and footers, recto/verso layout, complex layout like sidebars and call-out boxes, etc. The less you try to do with Scrivener, the more you can do in other tools. I wouldn’t even bother with Scrivener’s page breaks if I were a word processor user! It’s better to handle page flow with styles.

My word processing output from Scrivener would be extremely basic, and look nothing like the intended end result. I probably wouldn’t even bother to change how styles look and just concern myself with making sure they are all named right for the template—why bother, when the template is going to handle all of that with a much broader range of capability than anything Scrivener can do?

That all said, when we get into using Scrivener to produce Markdown, then things can collapse better into a workflow that feels like you’re using one tool. And a lot of what I just said above ceases to be true—because at that point we’re using Scrivener to work closely with systems that are much more capable than it is at document design, web publishing and so on—but we are doing so with a mechanism that is so simple it can be expressed in a text file. That means loads of power with very little technological overhead, very few dependencies on Scrivener to be capable of this or that. Need columns? Easy. You can do that right now in Scrivener without having to wait years for us to figure out how to do that so that we can “allow you” to do it. Need an index? Already possible! The built-in Markdown compile formats all have indexing tools built into them. What dependencies do exist in Scrivener, such as how Section Layouts and Styles can generate markup for us, are fleshed out, and refined over many years to do this one thing very well.

I like LaTeX because it is extremely amenable to automation (which Scrivener is also good at, on the Markdown side of the spectrum). Since LaTeX’s “input” is a simple text file of instructions, it is very easy to manipulate it with scripts, or create it (via tools like Markdown).

The Scrivener user manual is a Markdown → LaTeX project that has an 800 line script controlling the post-compile process, which is capable of producing the PDF you see in the help menu “straight out of Scrivener”, with one click of the Compile button. Scrivener’s not actually anywhere near the creation aspect of the final PDF (nor should it be, in my opinion), but it is orchestrating all of that via its capabilities for doing so.

It should be clear, from looking at the capabilities expressed by the formatting of that PDF, that Markdown can do a lot more than might meet the eye at first, and that a statement like it being unable to handle front matter properly isn’t true. While I don’t actually use front matter in that PDF, I do use back matter, as you can see with the heading numbering switching to lettering for the appendix. Front matter would be super easy to add to this Markdown project.

One could probably come up with a similarly efficient process for GUI-based desktop publishing tools as well, but that’s never been something I’ve looked into. Pandoc, for example, is capable of producing an InDesign document, and I bet its own automation capabilities could be employed to finalise the document efficiently. With Affinity the workflow would revolve around .docx output instead, but the idea would be similar. I suppose one could even use a word processor for this, but I wouldn’t recommend it any more than using Scrivener for final design.

There are of course other systems out there as well that have similarly simple plain-text input characteristics to LaTeX. HTML/CSS is not just web tech any longer; its syntax is much easier to work with, is capable of expressing most of the things LaTeX can, and makes other things a whole lot easier that are a bit of a nightmare in any DTP. CSS is deceptively simple and can do an awful lot with a few phrases; it has an elegance like that.

I guess what I’m driving at here is that Scrivener+Markdown is about more than what Markdown all by itself can do; it’s about more than LaTeX, or even a word processing file. There is a whole universe of document production tools out there that either work with it, or that can be fairly simply arrived at with Swiss-Army conversion tools like Pandoc around. And that’s just Markdown. Scrivener itself is capable of generating complex formatting files on its own, as evidenced by the non-fiction LaTeX project template. That doesn’t use Markdown; it’s a project for writing directly in LaTeX, and it is thus a demonstration of how the compiler can be instructed to create wholly new file formats that Scrivener itself has no knowledge of. In fact, here’s a proof of concept of a compile Format that exports your work in JSON format.

Can we really say, fairly, that Scrivener can’t already do PDF layout with everything you want? To say that seems to me we should ignore a huge aspect of how its compiler works, pretend it doesn’t exist, and only focus on one or two parts of it. It would be like saying Scrivener is not very good at making eBooks, by completely ignoring ePub output and only evaluating how well its DOCX file converts to ePub in Calibre.

I.e. it’s a critique on a self-imposed constraint, of using only its very simplistic rich text editor and the small, tiny fraction of capabilities its exporter has, rather than the actual breadth of what this software is truly capable of orchestrating and producing.

So, at this point I’ve already kind of gone over how Markdown doesn’t have as many natural limits like software does, but without really getting into specifics. The thing about Markdown, specifically the dialects oriented more toward document design than web publishing, is that most of them can be extended right in the source itself by using target format output directly. If the dialect doesn’t do a thing, that doesn’t mean you can’t do a thing.

Ordinarily that capability would make for a cluttered writing environment, but this is one of the areas where Scrivener+Markdown can really shine over other Markdown editing systems. It can make feasible more sophisticated document designs in ways one would probably naturally avoid using a text editor, using very simple tools for doing so (rather than programming, which is normally what you’d need to do to extend Markdown). As you can see in those screenshots, to fully format the text of the user manual would require a lot of raw LaTeX all over the place without programming new syntax. But in Scrivener the input is very clean and logical, easy to write with and read around, and so easy to create a workflow out of that it’s barely even worth considering a task.

The concept behind how this works is illustrated in some of the built-in compile formats, which you can take a look at:

MMD → LaTeX: Book (Tufte): Styles compile format pane: Full Width Text. If one adds this style to their project and marks text with it, then it will invoke the LaTeX syntax necessary, around that marked text, so that it triggers this environment specific to the Tufte document class.
MMD → ODT: OpenOffice Document: Styles compile format pane: Index Term. In this case we inject ODT XML around marked phrases or words, which adds them to the document’s index list. There is a similar one using OpenDoc XML for Pandoc’s DOCX compile format.
In other formats you will see Section Layouts doing heavy lifting as well, with their titling capabilities, as well as their prefix/suffix tabs and separator settings.
Not demonstrated in Scrivener, but Pandoc Markdown has facilities for assigning custom styles to paragraphs or spans of text, which are very easy to execute with Scrivener’s style system, and these enable a very broad range of capability when it comes to conformance to a designed template. For instance, I bet you could create pseudo-headings styled as “Front Matter”, which are unnumbered and trigger a different page style than regular chapter headings would, giving you Roman page numbering if you wanted, chapter numbering that starts at 1 and resets the page numbering to Arabic at 1, and so on. It might take a little experimenting, but I was able to get a pretty simple setup in LibreOffice without too much meddling, and I probably could have automated most of it.
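
To make the prefix/suffix idea in the list above concrete, here is a toy Python model of what a style does at compile time. It is only an illustration and not how Scrivener is implemented. The style names are whatever you define in your own project, and `STYLE_WRAPPERS` and `wrap` are hypothetical names. The markup itself is real: `fullwidth` is the Tufte LaTeX environment, and `custom-style` is the attribute Pandoc reads on fenced divs when writing .docx.

```python
# Style name -> (prefix, suffix) injected around the marked text.
STYLE_WRAPPERS = {
    # Tufte LaTeX full-width environment.
    "Full Width Text": ("\\begin{fullwidth}\n", "\n\\end{fullwidth}"),
    # Pandoc fenced div mapped to a named paragraph style in the .docx template.
    "Front Matter": ('::: {custom-style="Front Matter"}\n', "\n:::"),
    # Plain Markdown emphasis.
    "Emphasis": ("*", "*"),
}


def wrap(style: str, text: str) -> str:
    """Surround text with the markup registered for its style, if any."""
    prefix, suffix = STYLE_WRAPPERS.get(style, ("", ""))
    return f"{prefix}{text}{suffix}"
```

The point of the design is that the writing stays clean in the editor: only the style name is attached to the text, and the target-format syntax lives in one table that can be swapped per compile format.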

While you are technically correct in that Markdown all by itself doesn’t have a concept of front matter, the network of support tools around it have matured to the point where that effectively isn’t true—and these systems are getting better by the year. One such rising star is Quarto, which also makes use of Pandoc, meaning it can be coupled with Scrivener rather simply. It addresses many of my complaints with stock Pandoc’s output, which is more suitable to simpler documents that lack a need for better figure handling, cross-referencing and so on.

While Quarto does not yet support matter separation, it is on their radar and enough people want it that I would expect to see it in time. The question is whether word processors will get it or not, as they are all a bit clunky when it comes to stuff like this, to be honest. There is no actual front matter feature, it’s a bunch of different settings being brought together to achieve an effect, whereas in LaTeX it’s as dirt simple as \frontmatter (stuff) \mainmatter (more stuff). It’s like the scene break for fiction in word processors—there is no such thing, such that most people use an empty paragraph to signify one. Ugly, but what’s the alternative, a “Scene Break Above” paragraph style? Also kind of ugly as it’s still not a structural statement. But I digress. I’m not a fan of word processors if you haven’t noticed.
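
For a sense of how little structure the LaTeX side needs, here is a minimal sketch in Python that writes a Memoir-class wrapper file. The function name and the file names passed to it are placeholders, and a real project would add a preamble, title page and so on. `\frontmatter` and `\mainmatter` are the commands mentioned above, and `twoside` and `openright` are Memoir class options that make chapters open on right-hand pages.

```python
def book_skeleton(front: list[str], main: list[str]) -> str:
    """Build a minimal LaTeX wrapper that separates front and main matter."""
    lines = [
        # twoside + openright: chapters start on recto (odd) pages.
        "\\documentclass[twoside,openright]{memoir}",
        "\\begin{document}",
        "\\frontmatter",  # Roman page numbers, unnumbered chapters
        *[f"\\input{{{name}}}" for name in front],
        "\\mainmatter",   # Arabic page numbers restart at 1
        *[f"\\input{{{name}}}" for name in main],
        "\\end{document}",
    ]
    return "\n".join(lines) + "\n"
```

Something like `book_skeleton(["titlepage", "dedication"], ["chapter01", "chapter02"])` returns the text of a wrapper `.tex` file that pulls in each piece with `\input`.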

So if “all in one” is something you want to strive toward with how you use Scrivener, then taking the Markdown road really is the best way to get there. There is a very large community of programmers and designers out there supporting that workflow, and like I say, it’s getting better every year.

TyrelVanNiekerk · August 19, 2023, 7:11pm

Firstly, thanks for the lengthy and detailed answer. Much appreciated. Second, I understand Scrivener for what it does, and my comments were more based on the fact that the MS Word output, for instance, is almost 99% there. It is frustrating that it’s just so darn close to being a one-stop solution. It almost gave me everything I needed, and the frustration stemmed from me trying to use it as a word processor, where I would compile to the three formats I needed, and hey presto, I have my paperback, hardcover, and ebook files ready to go.
I worked for a company years ago where I wrote part of a system that showed/edited the layout of the documents. We added headers, footers, tables, images, columns, etc. I get how much work that is.

I last used LaTeX in uni in the early 90s, but from what I have read and your well-written reply, going Markdown → LaTeX seems to be the way to go. I will put in the effort to learn everything I need.

Again, nothing against Scrivener. My comments were not complaints or criticism. It’s that using it as an author, I almost, almost, almost had a complete solution without needing to learn Markdown and LaTeX again. I have used Markdown a bit. The .md files in Git use it, but never for something as involved as a novel.

I will reread the manual, look for other sources, and figure out how to use LaTeX. Maybe there is hope for me yet. Thanks.

AmberV · August 19, 2023, 8:23pm

Oh no worries at all, I did not sense from you that you were unhappy with Scrivener. That is why I used the word “critique” rather than “criticism”. It is fair and good to critique something you use daily and wish it were slightly better for what you need, here and there.

Even if LaTeX doesn’t work out for you, you may still find that using Markdown is a better way to get word processing style content done swiftly, and much closer to an end target. For example the advice I gave at the top, about stripping things down to the basics and letting a template handle all of the formatting—that’s how Markdown handles things, and Pandoc has a flag where you can stipulate a reference document as a template, meaning you get the finished product straight out of the compiler. It will also stick the ToC field in for you. It’s not a huge difference, avoiding 30 seconds of work to import a raw .docx into another and insert a ToC, but it does feel more satisfying having that all done for you. And with that Processing pane sitting around, the opportunities for automation are much higher.
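
As an illustration of the reference-document approach, here is a small Python sketch that builds and runs such a Pandoc call. `--reference-doc` and `--toc` are Pandoc options; the function names and the file names in the usage note are placeholders, and this is only one way a post-compile step could be wired up.

```python
import subprocess


def pandoc_docx_command(source: str, output: str, reference_doc: str) -> list[str]:
    """Build a Pandoc call that styles the .docx from a template and adds a ToC."""
    return [
        "pandoc",
        source,
        "--from=markdown",
        "--to=docx",
        f"--reference-doc={reference_doc}",  # styles are taken from this .docx
        "--toc",                             # insert a table of contents
        f"--output={output}",
    ]


def build_docx(source: str, output: str, reference_doc: str) -> None:
    """Run Pandoc; raises CalledProcessError if the conversion fails."""
    subprocess.run(pandoc_docx_command(source, output, reference_doc), check=True)
```

With Pandoc installed, `build_docx("compiled.md", "book.docx", "template.docx")` would produce the styled document in one step.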

Plus by and large, especially with Quarto, the results I’m seeing for word processing output are just better put together from a styling standpoint. Having for example a “First Paragraph” line following the heading with “Normal” for the rest means simple indent handling, it means drop-caps in the template, if you want them. Scrivener doesn’t have a way of doing that with styles, it uses direct formatting to modify indents, so if what it gives you for first paragraph handling isn’t what you want, you’re stuck with a ton of manual labour. Even if it does do what you want, since it is direct formatting you’re stuck with a much less agile document in post.

The .md files in Git use it, but never for something as involved as a novel.

And hey, for most novels you’ll barely even see it. If you use an “Emphasis” style to drop asterisks around text on compile, you could probably get away with pretty much using the text editor as you would normally. Novels will be a pretty simple thing to do in LaTeX as well, as the Memoir class you’ll likely be using is extremely flexible, meaning most customisations are just a matter of looking up which macros to use. The user manual is a Memoir project (though I did quite a bit beyond what Memoir does, to be fair). With a modicum of effort you’ll end up with a clean and professional looking PDF that would take expensive software to get otherwise.

Funnily enough, before Markdown came along, I wrote fiction directly in LaTeX using a coding editor. It always was designed to be a minimal interface for writing with directly, but these days it does feel a bit chunky with the advent of all these super-minimal syntax systems that are abstracted from any one format in particular. I mean to say though, while it’s mainly used in the sciences I’ve always felt it’s great for just about anything that needs to look good.

And as with Scrivener bare, you’ll get ePub and print media from one source. The difference will be vastly more control over that, and thus creative freedom, and overall increased quality (you will love how clean and logically organised Pandoc’s .epub/*.xhtml files are).

FamilyPuzzleSolver · April 12, 2024, 9:33am

I am also a Windows user. Just in case you weren’t aware, LibreOffice Writer (free) does have this feature, along with other page styles and a TOC, and it can save as DOCX and PDF. As I understand it, the style templates play well with Scrivener.

TyrelVanNiekerk · April 12, 2024, 1:38pm

I got many helpful tips from people on these forums, but I outputted to MS Word, fixed the formatting, and then exported a PDF from Word. I will get back to it again someday. I hoped to find a tutorial on how to go from nothing to having LaTeX or something configured and then learn how to tweak the script instead of figuring everything out from scratch.