PDF format from Markdown and substitutions

AndyJones · December 31, 2020, 11:53am

Hi there - I have a couple of questions about my workflow that I wanted to get a steer on before I go off and break stuff.

My general objective is to produce a set of Blog posts - maybe a few hundred in the end - that will be:

- posted as individual posts as they’re completed
- compiled into a printable document from time to time

I’ve been writing in Markdown with one section per Blog Post (typically 500 - 2000 words with a few levels of heading within) and compiling to Markdown. This has gone pretty well but one question about that …

The sections include a lot of images and I’ve been using the “Insert Image from file” option. The compile then gathers those images together, which is handy for uploading. I then have to edit the Markdown to properly reference the location of the uploaded files.

Q1. Is there a way of compiling so that I can generate a modified location for the file - in effect, insert a partial URL before the filename for each image - the same partial URL in each case?

For the generation to PDF, I’m not seeing any difference between the representation of the heading levels within the Blog Posts (the markdown heading titles are all shown the same as the bold markdown).

Q2. Is there a way to set up the compilation to .pdf so that it shows a different, say, font size and style for each markdown heading level contained within a section?

thanks

Andy

AmberV · January 7, 2021, 8:49pm

There are two ways you could go about this; maybe three depending on what you use for blogging.

Firstly, check to see if your blog system accepts Markdown, or has a module for it. If it does, that should make things a lot easier for you in that you can just fix the image references right in the Markdown directly, rather than the HTML.
Whether you can paste Markdown right in, or would still need to convert it to HTML, you can still take the approach of compiling to MultiMarkdown with Scrivener, instead of having it automate the HTML conversion step for you. This results in an .md file that will be easier to clean up with a simple text editor.

Now if you want to get fancy, and have a little scripting experience available, take a look at the Processing compile format pane. With this you can modify the text of the compiled .md file directly, and even go on to automate the conversion to HTML using an installed copy of Markdown, Pandoc, MultiMarkdown or whatever you prefer. That’s always a good option to keep in mind when you’re trying to fix something that is beyond what the Replacements engine can handle. Even just a simple shell script that runs sed and your preferred Markdown tool could be all you need.
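As a rough illustration of the kind of post-processing such a script could do, here is a minimal Python sketch that prepends one fixed partial URL to every relative image reference in a compiled .md file's text. The function name `prefix_image_paths` and the base path in the example are hypothetical, and it only handles inline image syntax, `![alt](target)`; if your compiled file uses reference-style image definitions, a similar pattern for those lines would be needed.

```python
import re

# Matches the start of an inline Markdown image: ![alt](target
# The target stops at whitespace or ")" so an optional "title" is left alone.
IMAGE_RE = re.compile(r'(!\[[^\]]*\]\()([^)\s]+)')

# Targets that already carry a URL scheme (http:, https:, ...) or start at the root.
ABSOLUTE_RE = re.compile(r'^(?:[a-z][a-z0-9+.-]*:|/)', re.IGNORECASE)


def prefix_image_paths(markdown: str, base: str) -> str:
    """Prepend `base` to every relative inline image target in `markdown`."""
    base = base.rstrip("/") + "/"

    def fix(match: re.Match) -> str:
        opener, target = match.group(1), match.group(2)
        if ABSOLUTE_RE.match(target):
            # Already absolute: leave the reference exactly as written.
            return match.group(0)
        if target.startswith("./"):
            target = target[2:]
        return opener + base + target

    return IMAGE_RE.sub(fix, markdown)


sample = "![Cat](cat.png) and ![Web](https://example.org/a.png)"
print(prefix_image_paths(sample, "/webroot/path/to/gfx"))
```

To use something like this on a real file you would read the compiled .md into a string, pass it through the function, and write the result back out before uploading.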

Solve the problem at the top instead of correcting for it in post. While just dumping a graphic into the editor and letting Scrivener handle the syntax for you is fine for simple things, if you need more, it has other approaches you can take that allow for more control. You’ll find details on that in the user manual, under §21.4.2, scrolling down to the Referencing Images with Document Links subsection.

Of course the example provided here in figure 21.2 would result in the same relative link you get from an embedded graphic, but the key thing is all of the text around the hyperlink is up to you. So you can put this if you want:


![Image caption](/webroot/path/to/gfx/<INSERT HYPERLINK>)

Now something to consider is that a path like that won’t work for PDF output, because presumably your blog directory structure isn’t matched anywhere on your Mac. So a way you could get around that is by selecting the path part of the text, between the opening parenthesis and the inserted hyperlink, and assigning that to a style like “Blog path”. You can then set up the compiler to leave that text alone for HTML/MMD output, and for PDF output, actually strip the text out, leaving just the image name.

In that vein, another approach would be to have two wholly different image references, one for PDF and one for the blog. That might not be a bad idea since you’d generally want a higher resolution graphic in a PDF than for a web page, anyway. Styles would again solve the problem of having conditional output: one image line for the blog, with its custom path, and another for PDF, each pointing to different resolution images (or even the same, just with different paths, it’s all up to you).

Sorry, I’m a bit lost on this part, in that I don’t quite understand what you’re doing. I’m assuming that you’re compiling to PDF using MultiMarkdown, which means you’re really using LaTeX here? Or are you using the native rich text PDF output? The latter can work, but it’s not really designed with Markdown in mind.

I also am a bit unclear on how you are generating titles. Normally Scrivener uses hashes, you’d have to go to a bit of trouble to make it turn binder titles into bold text. Or are these sections not in your binder at all, rather typed into the file? In that case I don’t understand at all how they end up different than what you typed in.

It would also be odd that you get normal h1 … h6 style HTML headings but just bold text for PDF. Or maybe they are actually hierarchical, but your LaTeX settings are odd? Just kind of stabbing in the dark here.

Maybe a practical example in a sample project would be beneficial, to illustrate what you’re trying to do.

So if it isn’t clear from the above, yeah, that’s kind of how it would work if you didn’t touch any settings and just did things normally. Each hierarchical outline level in the binder would be turned into a heading in the PDF, with a quantity of hashes in front of it that match its depth. The level below the draft is “# Binder Title One”, and a file below that level would be “## Some Subsection”, and so on. Much as you would expect if you’ve ever generated a PDF from org-mode or similar. Outline = headings, unless you go to some length subverting the default behaviour (which is perfectly acceptable and possible to do).
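To make the outline-to-headings rule concrete, here is a small Python sketch of the mapping. It is only an illustration of the idea, not how Scrivener implements it; the nested-tuple outline and the name `outline_to_headings` are made up for the example.

```python
from typing import Iterator


def outline_to_headings(nodes: list, depth: int = 1) -> Iterator[str]:
    """Yield one Markdown heading per outline item, with `depth` hashes."""
    for title, children in nodes:
        yield "#" * depth + " " + title
        # Children sit one level deeper, so they get one more hash.
        yield from outline_to_headings(children, depth + 1)


# A hypothetical outline: each item is (title, list of child items).
binder = [("Binder Title One", [("Some Subsection", [])])]
headings = list(outline_to_headings(binder))
print("\n".join(headings))
```

Because the hash count is derived from position in the outline, moving an item to a different depth changes its heading level without touching the text.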

How that gets turned into fonts and such is again within the domain of LaTeX. Take a look at the “Modern (Custom LaTeX)” compile format, in the “LaTeX Options” compile format pane, to see how one could influence the appearance of the document. If you scroll to the very bottom of the “Header” tab, you’ll see where chapter and section numbering is set up, and what fonts, font sizes and colours are used for section headings.

AndyJones · January 10, 2021, 9:10am

Hi Amber

Yes, on the .pdf output I’m just naively hitting “compile for pdf” (or actually “compile to Word” also gives me the same thing).

I’ve added an example of one of my files - the file contains header markings (### etc), emphasis (* and **) and blockquote (>) and the .pdf I get from naively compiling for .pdf

In the .pdf output, all the markdown annotations are removed, the emphasis examples are rendered as italics or bold appropriately, the block quote is ignored (could be a style thing?) and all the heading levels from within the file are rendered the same (looks like bold to me). I tried adding a Heading 3 style to a format but that didn’t seem to make a difference.

You mention that compiling markdown source to .pdf isn’t an intended path I think - so that may have been my misunderstanding. As it happens, I’m only looking for “good enough” on the .pdf front as this will be for my own records but I guess I’d be given hope by the “convert Multimarkdown to rich text in notes and text” check box and the fact that the compile to .pdf process is clearly seeing the markdown as some kind of annotation.

I suppose I could take a different approach - write in Rich Text and compile to markdown for the blog output. Blog target is Wordpress, so there is a markdown block available.

AndyJones · January 10, 2021, 9:14am

Oh yes - and I also note that what I’m doing might well work if I just never put e.g. a ### inside a file but ALWAYS used a separate file for each heading.

AndyJones · January 10, 2021, 1:57pm

Also making some progress by revisiting Section Types and Compile Format. It’s the usual thing - after a few weeks using something you have to circle round and refine the approach. Now going to look at image inclusion!

AmberV · January 10, 2021, 7:50pm

Okay, so you’re making use of the Convert MultiMarkdown to rich text in notes and text flag, in the PDF output settings of the compiler. If so, that’s what I referred to before as being a possibility, though it does have limitations. It’s a fundamentally simpler conversion as it does that internally rather than using the full MultiMarkdown engine.

Specifically to the heading size issue, as well as block quotes, I noticed that if I used a different setup (just default, so nothing fancy going on), I did get better formatting. So maybe that’s worth looking into if you want to stick with this approach. If I had to guess, I’d say the Markdown conversion is completely losing track of the fact that these are headings, and the Modern format is then coercing what is now ordinary paragraph text to match the rest, leaving bold intact from the original heading formatting, and completely obliterating quotes. The MMD conversion would have this blind spot potentially, because of how it converts to rich text internally.

So long as you are fine with a pretty basic interpretation that misses some Markdown formatting, then it works—and so I wouldn’t go as far as to say it isn’t an intended path, we do have the option after all, but for quality output you’ll do better sticking with the dedicated Markdown-based conversion options. In your shoes, and if I didn’t want to install LaTeX, I’d either use ODT or DOCX (with Pandoc installed), and create the PDF from a word processor. It’s an extra step, but the level of quality will be so much better.
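If you would rather run that conversion yourself on a compiled .md file, outside of Scrivener, a minimal Python sketch might look like the following. It assumes Pandoc is installed and on your PATH; the helper names and the file names `post.md` and `post.docx` are placeholders. Pandoc works out the output format from the file extension, so the same call can produce .odt as well.

```python
import shutil
import subprocess


def pandoc_command(source: str, target: str) -> list:
    """Build the argument list for a basic Pandoc conversion."""
    return ["pandoc", source, "-o", target]


def convert(source: str, target: str) -> None:
    """Run Pandoc, failing loudly if it is missing or reports an error."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on PATH")
    subprocess.run(pandoc_command(source, target), check=True)


# Placeholder file names; call convert("post.md", "post.docx") to run it.
print(" ".join(pandoc_command("post.md", "post.docx")))
```

The resulting file can then be opened in a word processor and exported to PDF from there.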

Any thought of keeping the .md file for your records instead of PDF? It’s perhaps less immediately useful in that it requires conversion if you want to look at the font-based output, but the advantage of keeping the original in .md format is considerable. For one thing, the quality of the conversion improves with time, as MMD and Pandoc advance. I have documents I wrote and archived from 15 years ago that, had I converted to PDF and discarded the source, would look inferior to the PDF I could create today from the same source. Plus, if I wanted to create a revision of that document, that would be a lot harder to go from PDF to something editable.

And tools can make .md files more accessible, as well. The excellent Marked utility, for example, displays .md files nicely for easy reading, and makes conversion to various formats a snap. Marked can even “read” Scrivener projects, so long as they are written in Markdown format.

Yeah, you could go that route. The conversion that goes from rich text to MMD is a lot more robust than the other way around. It does have some gotchas to be aware of: for best results you would want to depend more heavily upon styles while writing, and you’d have to be more careful when using raw Markdown in the text, as by default it will assume you are using none at all, and will escape special characters. That would be a matter of taste; personally I’d never be keen on keeping my originals in a rich text format. Those formats are so complicated and prone to error: something as small as not having the right fonts installed in twenty years can mess up how they look, and different programs have varying support for their features. TextEdit loses all footnotes, for example.

I like to avoid putting Markdown headings inside the text if at all possible. I mean normally it shouldn’t cause problems, and I don’t think they are here, it’s just that Modern is blowing away what makes headings stand out. My aversion has more to do with document flexibility. If I choose to promote a section in the binder, then I’d have to go into the text file and manually fix the hash marks. If it’s all a binder structure, then moving it around automatically changes the hashmark level on output.
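If you do end up with hand-typed headings inside a file and later promote or demote it, the manual fix amounts to shifting every heading by the same number of levels. A minimal Python sketch of that, with a hypothetical function name, could be the following. Note that it is deliberately naive: it does not skip fenced code blocks, so a line such as “# comment” inside code would be shifted too.

```python
import re

# One to six hashes at the start of a line, followed by whitespace.
HEADING_RE = re.compile(r'^(#{1,6})(?=\s)', re.MULTILINE)


def shift_headings(markdown: str, delta: int) -> str:
    """Shift every heading by `delta` levels, clamped to levels 1 through 6."""

    def repl(match: re.Match) -> str:
        level = min(6, max(1, len(match.group(1)) + delta))
        return "#" * level

    return HEADING_RE.sub(repl, markdown)


# Promote a section by one level: "###" becomes "##", "####" becomes "###".
print(shift_headings("### Intro\nSome text\n#### Detail", -1))
```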

So yeah, try playing with the format overrides switched off, or starting from a clean Format rather than one of the more complex built-ins, and build up from there. I think you’ll find that it will be easier to work with Markdown source from that angle.

AndyJones · January 12, 2021, 9:48am

Thanks Amber - I think you’re right that putting in an extra finishing step will probably be the simplest way actually. I have Marked 2 as my Markdown preview and I also have Affinity Publisher, so I think a flow that might work is…

1. Write in markdown in Scrivener
2. Compile to markdown to move to Wordpress
3. Export as RTF from Marked 2 to realise the markdown into a styled doc
4. Finish in Affinity to restyle and layout for pdf.

Cool! Well the good news is, through this process, I think my Binder is better organised and more sustainable so that the “restyle and layout” activity can be deferred.

AmberV · January 12, 2021, 1:50pm

Sounds like a good process to me! That’s roughly how I work as well, I use a few different tools on the production side (LaTeX instead of Affinity), but I like Scrivener for producing the Markdown, and then generate what I need from that using various utilities and workflows I’ve built.