Hey everyone. In the past few months, I’ve gotten much more comfortable using pandoc as an intermediary for ePub production. Adding custom markdown to my formats and tweaking the CSS has given me a basically one-click pipeline: Scriv → pandoc → Sigil for verification. Now I’m looking to do the same for the PDF version of the book, with the goal of commercial print-on-demand availability.
As I have some specific character style requirements, I’d prefer a route that allows me to preserve that markup so I can render it my preferred way. Small caps, for example, and a style that forbids line breaks at spaces when the text is justified.
The default 6x9" paperback PDF compile format is fine, and I can clone and tweak that to get close… but I am a bit spoiled now. As a career nerd, I appreciate having a million command line options and a config file that I can iterate over.
So, I’m looking for suggestions here. I glanced at Scribus, but am not sure how one pours content into it without twiddling. And honestly, it feels like work. I’d prefer spending a week to tune a config file to make pandoc just operate as desired.
Is the default pandoc multi-markdown “lossless” here, and does Scrivener give me enough options to launch pandoc and generate that PDF? Or am I looking at some more scripted solutions?
You’re in the right place, even if you need a more scripted solution. Start by selecting “MultiMarkdown” from the Compile For dropdown (yes, the name is legacy and confusing at this point), then duplicate the “Basic Pandoc” compile format and go from there in the compile format designer. You’ll have all of the things you are used to with the ePub pipeline, with regard to styles and such.[1]
You may find some things don’t work as well, as Pandoc itself is still (I feel, anyway) somewhat in a transitional phase when it comes to custom styles and how different output file types handle them. HTML-based output has the best support for things like named ::: div blocks and [inline]{attributes}. How well those work with its PDF output (and whether that is using LaTeX) is up to Pandoc though (or more accurately, your output filter), not so much what Scrivener itself can do.
Short answer: you might need to use some different markup methods in your style configuration than what worked for ePub, but you should be able to get where you need to go.[2]
That all aside, you’re ultimately looking for the Processing compile format option pane, which is only available to the “MultiMarkdown” setup. This is where you plug in your Pandoc path and then provide the command-line options you want. With that you can get one-click output for most basic things. For more complex things, a simple wrapper script that you target from this pane is the way to go.
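To make the wrapper-script idea concrete, here is a minimal sketch in Python. It assumes the Processing pane hands the script two arguments (the compiled Markdown file and the desired output path), that the compiled text is pandoc-flavoured Markdown, and that a LaTeX engine such as xelatex is installed. The function name and the defaults are illustrative, not anything Scrivener or pandoc prescribes.

```python
#!/usr/bin/env python3
"""Hypothetical wrapper: turn a compiled Markdown file into a 6x9 PDF."""
import subprocess
import sys


def build_pandoc_command(source, output, pdf_engine="xelatex"):
    """Return the pandoc argument list for a 6x9 inch PDF with a ToC."""
    return [
        "pandoc", source,
        "--from", "markdown",            # assumption: pandoc-style Markdown input
        "--output", output,
        "--pdf-engine", pdf_engine,
        "--toc",                         # generate a table of contents
        # Page size is passed to the LaTeX geometry package via a template variable.
        "-V", "geometry:paperwidth=6in,paperheight=9in",
    ]


# Only run pandoc when called with exactly: wrapper.py INPUT OUTPUT
if __name__ == "__main__" and len(sys.argv) == 3:
    command = build_pandoc_command(sys.argv[1], sys.argv[2])
    sys.exit(subprocess.run(command).returncode)
```

Keeping the argument list in one function means the week of tuning happens in a single place, and the same script can be run from a terminal outside of Scrivener while you iterate.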
Once you’ve got that concept mastered, you will no longer be limited to the DOCX, ePub and DocBook entries in the Compile For menu.
There are some complex examples of this workflow on the forum here, such as the Scrivomatic setup, that might be worth taking a look into, as well as Quarto integration, which uses Pandoc to streamline PDF production.
[1] Alternatively, you could switch back over to Pandoc→ePub and edit your compile Format for that. Once you have the format designer window open, click the gear button in the left sidebar header area, and add the “MMD” checkbox to make this format available to plain “MultiMarkdown”. You might want to duplicate it rather than editing it directly, though. Chances are your PDF workflow will need to diverge from what ePub wants.
[2] For instance, I can get a custom class with [inline text]{.named-style} for ePub, but if I’m going for word processor output, I need [inline text]{custom-style="named-style"}. This is what I mean by it feeling transitional. I don’t understand why this couldn’t be abstracted, but that’s just my opinion.
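As a rough illustration of papering over that difference yourself, the sketch below rewrites single-class spans into the custom-style form before the text reaches pandoc. It is a plain regular-expression pass written for this example (the function name is made up), it only handles simple spans with one class and no nested brackets, and a proper pandoc filter would be the more robust route.

```python
import re

# Matches simple bracketed spans with one class, e.g. [inline text]{.named-style}
CLASS_SPAN = re.compile(r"\[([^\]]+)\]\{\.([A-Za-z][\w-]*)\}")


def to_custom_style(markdown):
    """Rewrite [text]{.name} as [text]{custom-style="name"}."""
    return CLASS_SPAN.sub(r'[\1]{custom-style="\2"}', markdown)


print(to_custom_style("Some [inline text]{.named-style} here."))
```

A wrapper script could run this step only when the target is a word processor format and leave the class form untouched for ePub.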
Small caps is well handled by Pandoc, but this will require a bit of output-specific tweaking, as it depends on the layout engine capabilities of the final target.
As you are already comfortable with CSS, a Scrivener → Pandoc → HTML PDF engine route is worth considering. I’ve long tinkered with PrinceXML, as Håkon Wium Lie, one of the creators of CSS, is involved with the company behind it. Ultimately TeX and now Typst are better for my needs, but I think HTML for print is somehow more elegant, and it is only with complex layouts that the dependency on JavaScript solutions makes it harder to use.
Scrivener → Pandoc → Typst is the other way I would go. You can tweak the Typst templates to your liking and Typst has multiple ways of marking chunks of text for special formatting etc.
Excellent ideas, thank you both. I’m thinking that I want fewer applications involved, not more, but that’s more from a sense of simplicity than anything else. I’m not sure what an HTML conversion would buy me that well-styled MMD will not: ultimately, it’s just structured data. I’m considering what the book distributors accept as input, too. I need to play around with it, I think.
The key part is finding that map of my semantic tagging (“don’t allow whitespace in here”) to the target language’s own styling. ePub was easy, relatively, because it all happened at the CSS layer. Once I’ve worked out the mechanism to get the styling markup in place, pandoc properly turned them into span or div elements, and then I could target them. Figuring out how to do that in another print-ready format is my challenge.
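One way to picture that map is as a small lookup table from semantic style name to per-target markup. The sketch below is purely illustrative: the style names, the table, and the function are invented for this example, the HTML side uses the usual CSS properties, and the LaTeX side uses \mbox (which keeps its contents on one line) and \textsc. It does not escape LaTeX special characters in the text.

```python
# Hypothetical map from semantic style names to per-target wrappers.
STYLE_MAP = {
    "nobreak": {
        "html": '<span style="white-space: nowrap">{}</span>',
        "latex": r"\mbox{{{}}}",      # \mbox prevents a line break inside
    },
    "smallcaps": {
        "html": '<span style="font-variant: small-caps">{}</span>',
        "latex": r"\textsc{{{}}}",
    },
}


def render_span(text, style, target):
    """Wrap text in the markup that the target uses for the given style."""
    return STYLE_MAP[style][target].format(text)


print(render_span("New York", "nobreak", "latex"))
print(render_span("New York", "nobreak", "html"))
```

In a real pipeline the same table would more likely live in a pandoc filter or a template, but the shape of the problem is the same: one semantic name, one rendering rule per output format.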
Will for sure check out the pure-pandoc options. Just from some samples I’ve seen around, and from my ePub transformations, it looks capable of managing things like “format for a 6x9 page” and “generate a ToC” and so on. A Scriv->LaTeX->PDF pipeline would be the real dream, I think, but I need to baby-step my way in there. Knowing that I can do this without a lot of monkey-mousing around in another tool is ideal.
Not sure why you’d want to go the pure LaTeX route. It is way more complex than is needed for a relatively straightforward layout. LaTeX is excellent for very bespoke layouts that necessitate a high level of arcane knowledge and endless searching on stackexchange. The Scriv->Pandoc->XXX route tries to keep the document identical. The rules for whatever XXX is can be tailored step by step. If you need XXX to become LaTeX PDF, that is easy, but you are not constrained by TeX (pandoc’s pdf engine supports pdflatex, lualatex, xelatex, latexmk, tectonic, wkhtmltopdf, weasyprint, pagedjs-cli, prince, context, groff, pdfroff, and typst; phew!). I would contend the end goal for you is that you can spit out BOTH an EPUB and a PDF from a single Scrivener compile. My pandoc workflow usually spits out multiple formats simultaneously, then I choose which one to use.
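A small sketch of the "both formats from one compile" idea: given one Markdown source, build one pandoc command for the EPUB and one for the PDF, picking the intermediate format that matches the chosen PDF engine. The engine-to-format table follows the engine list above; the function name, the file names, and the default engine are assumptions made for the example.

```python
# Intermediate format pandoc writes before handing off to each PDF engine.
PDF_ENGINE_FORMATS = {
    "pdflatex": "latex", "lualatex": "latex", "xelatex": "latex",
    "latexmk": "latex", "tectonic": "latex",
    "wkhtmltopdf": "html", "weasyprint": "html",
    "pagedjs-cli": "html", "prince": "html",
    "context": "context",
    "groff": "ms", "pdfroff": "ms",
    "typst": "typst",
}


def plan_outputs(source, stem, pdf_engine="typst"):
    """Return pandoc commands producing stem.epub and stem.pdf from source."""
    to_format = PDF_ENGINE_FORMATS[pdf_engine]  # KeyError for unknown engines
    return [
        # The .epub extension is enough for pandoc to pick the EPUB writer.
        ["pandoc", source, "-o", f"{stem}.epub"],
        ["pandoc", source, "-t", to_format,
         "--pdf-engine", pdf_engine, "-o", f"{stem}.pdf"],
    ]


for command in plan_outputs("book.md", "book", "weasyprint"):
    print(" ".join(command))
```

Each command list can be handed to subprocess.run in a wrapper script, so swapping the PDF engine becomes a one-word change while the EPUB path stays as it is.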
Nerd cred, mainly. LaTeX has always felt like the ultimate in customizable, codable layout solutions, and I have a deep abiding respect for its longevity and creator. But yeah, very likely overkill.
Except for a few solvable riddles (Dude, where’s my ToC?) the stock Scrivener PDF paperback output is fine. Then I looked at my few uses of custom character styles and decided what I really needed was a system like the CSS + XHTML, but for print.
I almost certainly don’t need that. Not right now, at least. What I really need, I think, is to give myself permission to flail around a little and experiment, and maybe finally actually RTFM for pandoc and friends.
My mantra is that the thing you already know is usually going to be the best tool for the job, even if it is technically overkill. I use LaTeX to pretty-print my journal entries. That’s like using Autodesk Maya to make an animated GIF, but if you already know Maya, maybe that’s actually faster than some hands-tied-behind-your-back simpler tool you don’t know yet.