Thoughts for academic Markdown in a future version of Scrivener

cavalierex · August 26, 2024, 3:09pm

For others who don’t use Markdown or write academic papers, some examples may be helpful. In the past, I tweaked Scrivener’s styling system to add visual gloss whenever Markdown was present.

Simple typography

So here is typical typography, with my own styling gloss set up in Scrivener to make things visually apparent to me when writing. Note that the colors don’t actually “do” anything – they are meaningless, except to me. They just serve as a reminder of what is raw Markdown, Quarto Markdown, code spans or blocks, executable code, etc. It is Scrivener’s compiler and the post-processor (in my case Quarto) that will render the styling the way it is supposed to be.

Stare at this long enough, and your eyes will glaze over!

Here’s what it ends up looking like when rendered:

Tables

Simple tables are readable in Scrivener:

but complex ones are not:

And of course, they don’t help you see what will be rendered:

Equations

Equations are equally daunting to view and edit:

Figures

Here is a sample figure. It is contained in nested Markdown <div> blocks (indicated by the :::::: and ::: markers) and uses a LaTeX library called tikz to create the figure dynamically.

My Scrivener gloss just lets me know that it is raw Markdown, but doesn’t help me beyond that. This is what the figure looks like once rendered by Quarto:

Notice that the figure caption is automatically labeled as “Figure 13.3” by the Quarto Processor.

(Side note: Why would you want to create a figure dynamically like this instead of cut-n-pasting an image, like Scrivener can do so easily? These figures are created dynamically when the code is run, so if a dataset changes or you otherwise tweak the code, you will get updated figures. These are not cut-n-paste figures that you later have to manually update if something needs to be redone, which risks errata later on when you inevitably forget to swap in the latest figure, or to update figure numbering after moving something around.)

Executable code blocks (also shows crossrefs)

Here is how executable code blocks would look. (At the bottom, you can also see how the figure would be able to be easily referenced via a cross-ref, with the output becoming automatically labeled/numbered. In the second image below, you can see how the Markdown+code get rendered by the Quarto processor.)

Special features like callout blocks

Callout blocks in Scrivener are distinguished as grey boxes. A Quarto-aware system might know that Quarto only allows 5 types of callouts and therefore color-code them according to type (note, tip, warning, caution, important). That would be amazing, but it is not crucial.

And much more!

And of course, there are citations and bibliographies, tables of contents, footnotes/endnotes, index compilation, and marginalia: all those things that Markdown/Quarto will give you. Combine that with Pandoc (which Quarto does for you), and you can get any output format you want – Word, PDF, web site, etc.

Anyhow, hopefully this gives others a glimpse at what academic/technical writing includes, and the reason there are people advocating for better workflows for academic writing (which is built largely on Markdown and LaTeX).

AmberV · August 26, 2024, 4:02pm

Yeah, I guess from your screenshots you use styles, fonts and such way more than I do. I only have a small handful I use for genuinely unusual things, and most of those don’t do anything with fonts—the highlight feature you use a lot is what I prefer. It’s simple, when I see orange I know it is an indexing marker, when I see green I know it is a quote, that sort of thing. The large majority is simple “black on white” typing though (well, dark mode differences aside).

But you seem to be marking everything that is Markdown in a style, and that’s a lot of extra work in my opinion. If it’s what you prefer, then all right, but that isn’t an intended way of using Scrivener + Markdown, is how I would put it.

You could also probably defer a bit of what you’re doing to Scrivener, like tables (which it has a separate conversion switch for). For simple tables it’s good enough, and keeps your text in rows and columns. I only use raw Markdown tables for stuff the basic text engine can’t keep up with. Same could go for lists, which can also optionally convert.

Bit of an aside, but have you played with the <$include> placeholder much? At a basic level you could dump the tikz layout into its own binder item somewhere outside the draft, and then use that placeholder, like <$include:factor c collider>, where the bit after the “:” is the binder title. That would keep the writing area a bit cleaner, not having to scroll around a lot of mechanical text like that.

But what might be of more interest to you, specifically for stuff built from data sets that might get updated, is that you can pull text off the drive with it too, like <$include:D:\path\to\data.txt>. Then your scripts or whatever you are using to formulate raw data can produce or update these sources and they all get inserted automatically when you compile.
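To make the script side of that concrete, here is a minimal sketch, assuming the raw data lives in a CSV file and the included text should be a Markdown pipe table. The function names and the file layout are hypothetical; the only Scrivener-specific part is that the output path is whatever the <$include> placeholder points at.

```python
import csv
import io
from pathlib import Path


def csv_to_pipe_table(csv_text: str) -> str:
    """Convert CSV text (first row = header) into a Markdown pipe table."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ""
    header, body = rows[0], rows[1:]
    lines = [
        "| " + " | ".join(header) + " |",
        "|" + "|".join(" --- " for _ in header) + "|",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines) + "\n"


def refresh_include(csv_path: Path, out_path: Path) -> None:
    """Regenerate the text file that the include placeholder reads (hypothetical paths)."""
    table = csv_to_pipe_table(csv_path.read_text(encoding="utf-8"))
    out_path.write_text(table, encoding="utf-8")
```

Running refresh_include whenever the data changes would keep the included text current without touching the Scrivener project itself.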

Callout blocks in Scrivener are distinguished as grey boxes. A Quarto-aware system might know that Quarto only allows 5 types of callouts and therefore color-code them according to type (note, tip, warning, caution, important). That would be amazing, but it is not crucial.

You could do that yourself I think, by using colour-coding for your highlights instead of all grey, and making the five different options available in the editor. The div dots themselves could go into the compiler since you don’t really need to see that any more with the colour-coding, and they aren’t going to be changing from one call-out to the next.

But sure, setting that up as an example for you out of the box, in a project template, is just the kind of stuff I have on my list of things that would be nice to add.

For a LaTeX example, have a look at this post. The screenshot is what I see in Scrivener, and below is what I’ve set up the compiler to turn that into.

cavalierex · August 26, 2024, 4:21pm

Sure, this is a nice suggestion. I could envision having a Binder folder to contain all figures, with each Binder title being the figure that would be included, and the contents of that item being the Quarto Markdown for the figure. At Compile time, it would be included into the right spot.

The reason that I don’t do that right now is that I am using BBEdit to edit my Quarto books and articles. (Scrivener just wasn’t working well enough for the Markdown workflow, more related to Compile issues, but that’s a lengthier discussion related to .YAML files and Bash scripts.) BBEdit opens the project as a folder full of nested folders and files. To manually create extra files just to hold one figure is too much effort. And what do I name the file? Not “Figure 1” because that will get outdated as soon as I add a figure to precede it! Thus, having the Quarto Markdown for the figure in-line with the rest of the document eliminates the hassle of manually creating and curating extra files for the figures (and there may be dozens or even hundreds in a technical work). But yes, it makes reading through the document more painful, since all you see is a Markdown block instead of the actual figure (unless you are using live preview in Quarto).

But as this example shows, it is exactly for this reason that I would love to use Scrivener’s powerful Binder to organize everything, then simply Compile the project at the end, and run it through Quarto’s processor to execute code, generate figures, make the bibliography, etc.

cavalierex · August 26, 2024, 4:32pm

Your link is a good example of Scrivener styles being used to generate markup language, in that case LaTeX. It’s nice for the user to not have to remember how to create the div blocks for callouts, or figures, or what have you.

If I could get it working in Scrivener perfectly, and not have to jump back-n-forth between Scrivener and a plain text editor like BBEdit, then I might do just that – i.e., write purely in Scrivener, then compile into Quarto Markdown and then process with Quarto. But right now, for safety reasons, I need to write in Quarto Markdown itself so that if Scrivener fails me (or alternatively, I fail to figure out how to make things work in Scrivener), then I can just take my raw text and process it through Quarto myself.

To summarize my main concern – Accepting Scrivener for what it is, as an excellent rich text editing environment, I wish

that Scrivener were aware of Quarto Markdown and the related workflow;
that Scrivener would provide enough visual gloss to make writing technical documents as comfortable as possible; and
that Scrivener would have built-in preprocess/Compile/postprocess functionality specific to Quarto/Pandoc (maybe its own setting pane?) to make that route of publishing easy even for beginners.

rms · August 26, 2024, 5:20pm

Impressive. But personally, I appreciate Scrivener because I can avoid all that complex formatting and depend on the compilation features to do the fancy stuff. Agreed, it probably can’t do what you demonstrate, but I think I’d use something other than Scrivener to do it.

cavalierex · August 26, 2024, 5:52pm

Thanks for the input. I agree! My dream for Scrivener 4 is “write in Scrivener, Compile, post-process through Quarto”. Maybe we can get there.

nontroppo · August 26, 2024, 6:50pm

I do many things similarly to you (@cavalierex) and your examples are great. I’m not quite sure what the problems are that stop you using Scrivener at present?

As well as styles, you should also consider Section Types, as these can be made into templates and work in a slightly different way to Styles. One advantage is you can use metadata to define settings for these sections (in my quarto sample, Quarto multi-part figures):

Scrivener can work well with Quarto already, at least for my sample project most things work fine. The ScrivQ template also can deal with multiple file outputs to work with book/web page outputs: ScrivQ | A template to control Quarto, export multiple files, manage bibliography and easily create cross-references

Can you give a more specific problem[s] that made you end up using BBEdit?

I think Pandoc is well-established and universal enough (and now <20MB compressed) that it should be bundled in Scrivener 4 to replace MMD as the default markdown engine (users could manually install MMD easily for backwards compatibility). The numerous feature advantages are significant (citation processing alone would be worth it). While I do really like Quarto, its syntax is much more specialised, targeted at a very specific layout style. If Scrivener 4 supported it, it would be awesome from my perspective, but I can imagine it would be much harder to make it co-exist with other output formats. In addition, one really annoying “feature” of Quarto is how it is designed for a single project folder. It refuses to use filters or extensions or other files outside the destination folder by design. Scrivener normally takes control of the destination folder and thus there will be plenty of scope for file collisions and lots of bugs.

cavalierex · August 26, 2024, 10:34pm

This is true; however, Scrivener is also file-based (and the file is really a ZIP of nested folders and files), so it might end up working well enough. I have tested using symlinks to other files like my Zotero reference database (stored far away from my Quarto project), and this works great. I can run Quarto through a single conda virtual environment that has all my Quarto stuff in it, and I use it for multiple Quarto projects. The only thing I haven’t tried moving to a remote location is a Quarto extension. That may be amenable to the symlink strategy.
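For anyone wanting to script the symlink strategy, here is a minimal sketch using only the Python standard library. The function name and the example of a shared .bib file are illustrative assumptions, not something Quarto or Zotero prescribes.

```python
from pathlib import Path


def link_into_project(source, project_dir, name=None):
    """Create (or refresh) a symlink inside project_dir that points at source."""
    source = Path(source).resolve()
    link = Path(project_dir) / (name or source.name)
    if link.is_symlink():
        link.unlink()  # replace a stale link from an earlier run
    elif link.exists():
        raise FileExistsError(f"{link} exists and is not a symlink")
    link.symlink_to(source)
    return link
```

Because the link lives inside the project folder, Quarto sees an ordinary local file while the real one stays wherever it is maintained.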

But why oh why does this need to be so hacky? Quarto really should allow specification of absolute paths for some things (through the YAML file). Maybe they’ll add it one day. I filed a bug report that they fixed promptly, but I don’t think they can devote much time to feature upgrades.

cavalierex · August 26, 2024, 10:36pm

I will definitely have to revisit your ScrivQ template again sometime. I checked it out when you first published it, but it’s been a while now.

cavalierex · August 26, 2024, 10:42pm

As I recall, it was two things:

Frustration at getting the YAML settings to compile correctly (I was not using your template, but rather had a Scrivener page in the frontmatter section where I wrote the YAML); and
Being able to run the post-processor within a virtual environment. The Scrivener Bash script option is not a full terminal, and it wasn’t able to load a virtual environment. I needed the conda virtual environment so that I could execute the R and Python code in the manuscript.

As you can imagine, the limited Bash shell was the bigger problem.

Maybe my mistake is trying to go the whole way from start-to-finish in Scrivener:

Write in Scrivener
➔ Compile to .md
➔ Rename as .qmd
➔ Post-process using Quarto in Scrivener’s Bash shell

Maybe the solution is to concentrate only on the first half:

Write in Scrivener ➔ Compile to “Quarto Book” structure

(The Quarto project could also be an article, web site, slides, etc. But for most written projects, the Quarto “Book” project structure is most appropriate.)

For this to work, the structure in Scrivener’s Binder has to be translated to nested folders and .qmd documents, and the YAML document has to reference those folders/files as parts of the Book project. That might have to be a pre-processor Python script. If I recall correctly, you may have tackled this already using Ruby.
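As a rough idea of what such a pre-processor could look like in Python, here is a minimal sketch. It assumes the compiled Markdown marks each chapter with a level-1 ATX heading ("# Title") and only produces a flat list of chapters; deeper Binder nesting is not handled. The function names, the chapters/ folder, and the "Preface" heading for any text before the first chapter are all my own assumptions.

```python
import json
import re
from pathlib import Path

HEADING = re.compile(r"^# (.+?)\s*$")


def slugify(title):
    """Lowercase the title and collapse everything else into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-") or "chapter"


def split_chapters(markdown):
    """Return (preamble, [(title, text), ...]), splitting on level-1 headings."""
    preamble, chapters = [], []
    in_fence = False
    for line in markdown.splitlines():
        if line.startswith("```"):
            in_fence = not in_fence  # "# comment" lines inside code are not headings
        match = None if in_fence else HEADING.match(line)
        if match:
            chapters.append((match.group(1), [line]))
        elif chapters:
            chapters[-1][1].append(line)
        else:
            preamble.append(line)
    return "\n".join(preamble), [(t, "\n".join(b) + "\n") for t, b in chapters]


def write_book(markdown, out_dir, title):
    """Write index.qmd, one .qmd per chapter, and a _quarto.yml listing them."""
    out_dir = Path(out_dir)
    (out_dir / "chapters").mkdir(parents=True, exist_ok=True)
    preamble, chapters = split_chapters(markdown)
    index_text = "# Preface {.unnumbered}\n\n" + preamble.strip() + "\n"
    (out_dir / "index.qmd").write_text(index_text, encoding="utf-8")
    entries = ["index.qmd"]
    for number, (chapter_title, text) in enumerate(chapters, start=1):
        name = f"chapters/{number:02d}-{slugify(chapter_title)}.qmd"
        (out_dir / name).write_text(text, encoding="utf-8")
        entries.append(name)
    config = ["project:", "  type: book", "", "book:",
              f"  title: {json.dumps(title)}", "  chapters:"]
    config += [f"    - {entry}" for entry in entries]
    (out_dir / "_quarto.yml").write_text("\n".join(config) + "\n", encoding="utf-8")
    return entries
```

File names are numbered by position rather than by figure or chapter label, so reordering the Binder and recompiling simply regenerates them.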

So, the feature request for Scrivener 4 would be a smoother route in the Compile settings pane: one that knows about Quarto and Pandoc Markdown, and lets us compile the Scrivener project into a Quarto Book project of nested folders/files plus configuration files like the YAML file. And it stops there.

Well, fast-forward, and once the Quarto Project is compiled appropriately, then jump over to Terminal for

         $ cd my_project_directory
         $ conda activate quarto
(quarto) $ quarto render --to=pdf

If this is the way I go, I’ll just program a macOS Shortcut to do all this by right-clicking on the project directory in Finder, or maybe Scrivener is capable of launching a macOS Shortcut.
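The same three steps could also be wrapped in a small script that a Shortcut or any other launcher calls. This is only a sketch: it assumes conda is on the PATH of whatever runs it (which is exactly what a limited shell may not provide, so a full path to conda might be needed), and it uses conda run instead of conda activate so that no interactive shell is required. The environment name "quarto" is taken from the commands above; the function names are mine.

```python
import shlex
import subprocess


def render_command(env_name="quarto", output_format="pdf"):
    """Build the command that runs `quarto render` inside a conda environment."""
    return ["conda", "run", "-n", env_name,
            "quarto", "render", "--to", output_format]


def render(project_dir, env_name="quarto", output_format="pdf"):
    """Run the render from the project directory and return the exit code."""
    command = render_command(env_name, output_format)
    print("Running:", shlex.join(command))
    return subprocess.run(command, cwd=project_dir, check=False).returncode
```

Calling render("my_project_directory") would then stand in for the cd, activate, and render lines typed by hand.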

nontroppo · August 27, 2024, 9:21am

I asked one of the Quarto devs a couple of years ago and they were very clear that this was by design. They are inspired by virtual environments / project folder type workflows. I understand the logic: you can git a single folder and have all the required files as a repository in one place, versioned. This is very developer-focussed, coming from data scientists where R and Quarto were born. Pandoc is less opinionated: there is a Pandoc data folder, yet you can reference files there or anywhere else. Symlinks should hopefully paper over the differences but it is a bit of extra setup, and the other problem is Scrivener does not “respect” the project folder without some workarounds and tricks (the -mmd suffix trick; and I don’t know how this will change in future versions, it could break). My wish for this:

Overwriting or not the target folder as a compile setting.
Add “compile” file exports. This would more easily manage accessory files like CSL styles etc. that could be stored in the Binder and exported explicitly at compile time. We can do this manually with a post-processor script but a bit of structuring would be great.

I should have been clearer, ScrivQ is from @bernardo_vasconcelos — my sample project is without any cool name. Have a look at both as they do demonstrate a working Scrivener project design for Quarto…

Right, YAML is a bit fussy in that it uses significant whitespace (as spaces), the keys need to be lowercase, and quotes must be straightened. I turn off a lot of the smart corrections and visualise whitespace:

but you could also just copy-paste to an editor.
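If the YAML has already picked up smart quotes or tabs, a small clean-up pass is another option. This is a minimal sketch with a hypothetical function name; it straightens curly quotes and normalises leading tabs and non-breaking spaces, and deliberately leaves key casing alone. Note that it also straightens apostrophes inside values, which may not be wanted in a title.

```python
SMART_QUOTES = str.maketrans({
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2018": "'", "\u2019": "'",   # curly single quotes / apostrophes
})


def sanitize_yaml(text, indent="  "):
    """Straighten quotes and replace leading tabs/NBSPs with plain spaces."""
    cleaned = []
    for line in text.splitlines():
        line = line.translate(SMART_QUOTES)
        stripped = line.lstrip(" \t\u00a0")
        leading = line[: len(line) - len(stripped)]
        leading = leading.replace("\t", indent).replace("\u00a0", " ")
        cleaned.append(leading + stripped.rstrip())
    return "\n".join(cleaned) + "\n"
```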

That is exactly what my scrivomatic, quarto-run.rb and typst-run.rb scripts do (GitHub - iandol/scrivomatic: A writing workflow using Scrivener's style system + Pandoc for output…): they set up the path and env variables as needed. I use Ruby for this but really any language can do this. The post-processor adds paths and enables other tools as needed.

Well, several of us do start-to-finish without issues, given the caveat that you use a post-processing script. You don’t need to, as you said: you can run an automation[1] to do this outside of Scrivener (e.g. a file watcher daemon can check if a file changes, then run a workflow involving the final parts).
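A file watcher does not have to be elaborate. Here is a minimal polling sketch using only the standard library; the function names, the "*.md" pattern, and the two-second interval are assumptions, and a dedicated tool would be more efficient than polling.

```python
import time
from pathlib import Path


def snapshot(folder, pattern="*.md"):
    """Map each matching file to its modification time in nanoseconds."""
    return {str(p): p.stat().st_mtime_ns for p in Path(folder).rglob(pattern)}


def changed_files(before, after):
    """Return files that are new or whose modification time differs."""
    return sorted(path for path, mtime in after.items()
                  if before.get(path) != mtime)


def watch(folder, on_change, interval=2.0):
    """Poll the folder forever, calling on_change(list_of_paths) after edits."""
    previous = snapshot(folder)
    while True:
        time.sleep(interval)
        current = snapshot(folder)
        changed = changed_files(previous, current)
        if changed:
            on_change(changed)
        previous = current
```

The on_change callback is where the final parts of the workflow (for example, a Quarto render) would be triggered after each compile.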

Here though is my bigger problem, my major issue with using Quarto + Scrivener: cross-references are often incompatible with Scrivener styles.

So this is one of the main things my quarto-run.rb script handles: it lets me put cross-ref labels where I want them in the Scrivener editor, then, after Scrivener has compiled to MD, moves them to where Quarto expects them. I would love it if Scrivener 4 gave us a couple more options for dealing with this; it is the biggest pain point IMO…

[1] The reason I love pandocomatic is because it supports pre and post scripts all in the same YAML that specifies the document, a single recipe (the one true ring) for automatic output that could then SFTP the docs, restructure folders and just about anything else. But there are many ways to skin this automation…

cavalierex · August 28, 2024, 1:08am

I think I need to study this aspect more in-depth.

So many details!

nontroppo · August 28, 2024, 7:16am

If you use the ruby script (ScrivQ uses a modified version of this, you can extract it from its compile format post-processing pane), then the crossref problems are solved. At its core, the problem is that Scrivener uses reference links for images (remember markdown can link inline or as a reference at the end of the document), so the label ends up at the end of the document and you have to move it[1]. Also, pretty minor, but for maths blocks Scrivener styles will output something like this (depending on the whitespace in the editor):

$$ x = y $$

{#eq-test}

So you have to remove the newlines to get the label back onto the same line as the $$ delimiter; this is trivial with a regex…
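For illustration, here is one way that regex could look in Python (my script is Ruby, so this is a sketch of the idea rather than the actual code). It assumes the labels always start with #eq- and sit on their own line after the closing $$; the function name is hypothetical.

```python
import re

# A closing "$$", then one or more line breaks (possibly blank lines),
# then an equation label such as {#eq-test} on its own line.
EQUATION_LABEL = re.compile(r"(\$\$)[ \t]*\n(?:[ \t]*\n)*[ \t]*(\{#eq-[^}\s]+\})")


def join_equation_labels(markdown):
    """Pull a stray {#eq-...} label back onto the line of the closing $$."""
    return EQUATION_LABEL.sub(r"\1 \2", markdown)
```

Applied to the output above, this yields `$$ x = y $$ {#eq-test}` on a single line.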

[1] A simple fix for this @AmberV would be if Scrivener 4 allowed you to specify either reference or inline links as a compile option, this would greatly simplify this incompatibility.
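Until such an option exists, the reference-to-inline conversion can be approximated in a post-processor. The sketch below is in Python and rests on assumptions about the compiled output: images appear as `![caption][id]`, definitions appear as `[id]: path` on their own line, and any `{...}` attributes sit at the end of the definition line. It is not the logic of quarto-run.rb, and the function name is hypothetical.

```python
import re

DEFINITION = re.compile(r"^\[([^\]\n]+)\]:[ \t]*(\S+)([^\n]*)\n?", re.MULTILINE)
IMAGE_REF = re.compile(r"!\[([^\]]*)\]\[([^\]]+)\]")
ATTRIBUTES = re.compile(r"\{[^{}]*\}")


def inline_images(markdown):
    """Rewrite reference-style images as inline images, keeping {attributes}."""
    definitions = {}
    for match in DEFINITION.finditer(markdown):
        attrs = ATTRIBUTES.search(match.group(3))
        definitions[match.group(1).lower()] = (
            match.group(2), attrs.group(0) if attrs else "")
    used = set()

    def replace_image(match):
        key = match.group(2).lower()
        if key not in definitions:
            return match.group(0)  # leave unknown references untouched
        used.add(key)
        url, attrs = definitions[key]
        return f"![{match.group(1)}]({url}){attrs}"

    result = IMAGE_REF.sub(replace_image, markdown)
    # Drop only the definitions that an image consumed. A definition shared
    # with an ordinary link would be removed too; this sketch ignores that case.
    return DEFINITION.sub(
        lambda m: "" if m.group(1).lower() in used else m.group(0), result)
```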

AmberV · August 28, 2024, 11:21am

Overall, I think a better awareness of Pandoc curly-brace usage might be the best all-around solution. For example, not having to know to go into some super obscure area of the compiler to turn off the ## Symmetrical Hashes ## so you don’t end up with ## Symmetrical Hashes {#oops} ##. To use that example, if we looked for brace usage in areas where processing occurs, like images and headings, and made sure the result was valid syntax, that would be cleaner overall I think than another option somewhere you just have to know about in order to get valid output.

So I’m thinking something like: if {...} is found on the end of a caption line or on the same line as the image, then make sure it gets put in the right place.
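As a sketch of that kind of clean-up for the heading case, here it is as a post-processing step in Python rather than inside the compiler. Pandoc accepts attributes after the closing hashes (`## My heading ## {#foo}`), so the fix is to move a brace group that ended up inside them. The function name is hypothetical, and the pattern does not try to skip headings-like lines inside code blocks.

```python
import re

# Opening hashes, heading text, a {...} group, then the closing hashes.
MISPLACED = re.compile(
    r"^(#{1,6})[ \t]+(.*?)[ \t]+(\{[^{}\n]*\})[ \t]+(#+)[ \t]*$",
    re.MULTILINE,
)


def fix_heading_attributes(markdown):
    """Move {attributes} from inside symmetrical hashes to after them."""
    return MISPLACED.sub(r"\1 \2 \4 \3", markdown)
```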

nontroppo · August 28, 2024, 11:49am

Right, the good news is as Quarto simply uses the {attributes} mechanism that Pandoc provides, awareness of attributes for block/inline elements will fix this for Pandoc and Quarto at the very least in a general sense (not only applying to Quarto-specific crossrefs).

This section of the Pandoc docs, while applying to commonmark, does provide some context that would be relevant (commonmark allows any element to have attributes, whereas Pandoc does not, though to provide compatibility it wraps elements with a span or div to allow the attributes to be useable):

https://pandoc.org/MANUAL.html#extension-attributes

Just-works™ sounds great

cavalierex · August 28, 2024, 1:15pm

Now I understand. Yes, when I encountered this dilemma in the past, that was another reason I ended up preferring using a “raw Markdown” style that would output as-is, instead of letting Scrivener compile as Markdown in its habitual way.

I’m glad to have met some kindred spirits on this forum. It would be nice to work together toward a solution for Scrivener 4.

@AmberV, might I suggest a change of title for this thread to “Thoughts for academic Markdown in a future version of Scrivener”? We’ve really been focusing on the friction points in using Scrivener for long-form scholarly/technical books and manuscripts rather than simple Markdown for bold, italics, etc.

AmberV · August 28, 2024, 1:37pm

We do have the approach described in the subheading, Referencing Images with Document Links, under §21.4.2 in the user manual. Basically, you provide the syntax, and you hyperlink from the spot that should have the image name, pointing to the binder item that holds the image. Upon compiling, the current image name (as referred to in the binder) will be substituted at the hyperlink, and the image will be exported into the compile output folder. So you get the automation that embedding or linking affords, but have full control over the syntax.

Granted though, if you’re going to cycle through an external plain-text editor for a while, that will suffer some of the same issues a heavy style-based writing method would. I did address my way of working around that limitation in some of my posts. To bring it together though, as it may be a bit scattered, I often start a new project in pure Markdown, using whatever markdown editor or coding editor I want, coupled with Scrivener’s external folder sync feature (or sometimes I will even just start with an .md file and once it gets annoying to scroll in it, or the very first time I want to refactor heading levels, I Import & Split into a fresh new project).

At some point I will sometimes gradually migrate away from that if I need to. For instance, the LaTeX call-outs in the user manual, never mind the other styles that get used there (menu labels, shortcut keys, etc.). This migration needn’t be all at once, and across the board, but bit by bit as different sections require special treatment that I want to use Scrivener’s features for. I might eventually stop using folder sync entirely, but I might not, and merely only use it more selectively—for example, with stuff like tables where I’d rather have a better visualisation.

This is not too bad for me, as my main point of friction is a lack of typing aids in Scrivener. It can’t even hold an indent level, meaning you have to pound on the tab key over and over, never mind the many other little things that make typing in a coding editor with a good extension a vastly superior experience. But, once that initial drafting is done, I don’t hugely care what it looks like. I’ve been writing in raw Markdown for too long I suppose, across many different devices and programs, few of which “support” it by changing how it looks. So to me, eventually using Scrivener much more exclusively with a lot of syntax isn’t something that glazes my eyes over, as you put it.

cavalierex · August 28, 2024, 1:42pm

Great example of something that is missing.

I have been waiting for PyCharm to add true Quarto support for a while. (It’s coming.) When I really need to program, PyCharm is my IDE of choice for Python and even R.

We would never expect Scrivener to have IDE features like code completion. But maintaining indent level is a reasonable thing to have when inside a code block.

AmberV · August 28, 2024, 2:49pm

Speaking of completion, there is some, and it has the ability to become more useful over time as you need it. As for stock completions, all of the binder titles are indexed and can be recalled with the Edit ▸ Completions ▸ Complete Document Title shortcut. Combined with the Automatically detect [[document links]] setting, in Corrections, is how I most often type in titles (if they aren’t just internal links for myself): [[[Name of Thing]]]. The “Name of Thing” part I can use completion on, and Scrivener detects and strips out the middle double bracket pair, leaving [Name of Thing], which is of course a way of cross-referencing to a heading—and I get a clickable link out of it as well.

For other things, check out Project ▸ Project Settings..., in the Auto-Complete list, and also consider disabling the default, In script mode only setting in that same Corrections pane. You could put callout-tip title=" in there, for instance, and save yourself some of the typing. Common YAML keys could also be useful in the list. One area that would be nice to improve upon here is punctuation prefixing. We already have the code for that (if you try typing in <$ with completion on you’ll see it suggests all known placeholders and will narrow down as you type), so it wouldn’t be a leap to take any non-alphanumeric component of the auto-complete phrase as a kind of scope that suggests any completions using that, and then you could get the ::: part working to build the rest.

Now what we could do, that I would love, is automatically index all image names in the binder and text editors, and have that in a separate auto-completion queue like Titles automatically are. Such could be accessed when typing (# (though we’d probably want to make the prefix adjustable somehow, and make this whole concept expandable to perhaps other useful things as well, like custom metadata field values).

But, at a certain point, I think a general-purpose text expansion tool will be superior for any of that (even coding editors, unless you’re down for making plugins). I use the cross-platform Espanso, which uses YAML for configuration. Here is a useful one for Quarto call-outs:

Espanso Quarto call-out config...

global_vars:
  - name: clipboard
    type: clipboard

matches:
  - trigger: "{callout"
    replace: |
      ::: {.callout-{{formData.boxType}} title="{{formData.title}}"}
      {{clipboard}}$|$
      :::
    vars:
      - name: formData
        type: form
        params:
          layout: |
            Box Title: [[title]]
            Type: [[boxType]]
          fields:
            boxType:
              type: choice
              values:
                - note
                - warning
                - important
                - tip
                - caution
              default: "note"

That will, upon typing in {callout, present a dialogue box where you can type in the box title and select the type of box from a dropdown of valid choices. Upon submitting that it will wrap the clipboard text in a Pandoc div and put the cursor at the end of it.

Such things work in all software (if you want, or only some).

cavalierex · September 2, 2024, 4:31pm

@AmberV & @nontroppo ,

Would you be willing to brainstorm this with me from scratch, without the particularities of our own implementations? I would like to be sure that I have a full understanding of the problem, approach, and constraints. Maybe by brainstorming together, we can come up with a proposal for what is doable by the user and what features might need to be added in Scrivener 4 to achieve certain goals. My thought is to have a model with separation of concerns for each component, connected by a DAG (directed acyclic graph).

To kick things off:

The goal of the “academic workflow” is to be able to use some of the powerful features of Scrivener for writing, but to leave compilation, processing, formatting, and output format to secondary tools such as Quarto and/or Pandoc.
The most powerful feature in Scrivener is the Binder, which allows organization of the parts of a written work, as well as collection of ancillary resources, references, etc., used in support of the writing.
Only content within the Draft section of the Binder will be compiled and transferred to Quarto/Pandoc for further processing.
Only content within the Binder’s text documents will be compiled. Other Scrivener features, such as notes and other gloss, are used during the drafting process but are not sent to Quarto/Pandoc for further processing. In other words, anything that the writer wants to be compiled/processed/printed will be contained within the text documents of the Binder.
Scrivener’s Compile feature will serve as the bridge to get the Binder’s text documents into an intermediate format that will serve as the starting point for Quarto/Pandoc.
The Compile feature can convert some shorthand tools (such as Scrivener styles) into the plain-text markup needed for Quarto/Pandoc to work (such as Quarto Markdown).
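To make the DAG idea concrete, the stages above could be written down as a small dependency graph and ordered with Python's standard library. This is only a sketch: the stage names and their dependencies are placeholders invented for illustration, not a settled design.

```python
from graphlib import TopologicalSorter

# Each stage maps to the set of stages that must finish before it can run.
# All stage names here are hypothetical placeholders.
WORKFLOW = {
    "compile_to_markdown": {"write_in_binder"},
    "export_support_files": {"write_in_binder"},
    "restructure_as_quarto_project": {"compile_to_markdown",
                                      "export_support_files"},
    "quarto_render": {"restructure_as_quarto_project"},
}


def run_order(graph):
    """Return one valid execution order for the stages in the graph."""
    return list(TopologicalSorter(graph).static_order())
```

Keeping each stage as its own node is what gives the separation of concerns: any one of them can be swapped (say, Pandoc instead of Quarto for rendering) without touching the others.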

I have more design concepts that I will add to here. But I thought I’d kick it off like that. I’m curious to hear your thoughts about the design parameters.