Tips needed for workflow: Scrivener/Zotero->MMD->Latex

Oddivh · February 7, 2024, 8:16am

I’m a novice with Markdown, Pandoc and Latex; however, Latex is the preferred template format for scientific papers at conferences. (MS Word is also used by some, but those templates are generally just bad.)
As I want to use Rich Text in Scrivener, I compile to MMD and then use Pandoc from MMD to Latex (TeXstudio for Latex). I use Zotero with the add-on “Better BibTex” for citations and bibliography.

First, a general tip is needed:
I use character styles to tag words so that I can add commands when compiling. One example is citations:
This is what it looks like in Scrivener (Character style - “Cite”):

Defined like this:

And it turns out like this in Latex (after first compiling to MMD and then using Pandoc to convert to Latex):

Question: Would it, in general, be better to use Pandoc’s Markdown instead of Latex commands directly?

Two concrete reasons for my question (help needed):

Concerning citations - I got into problems when adding a page number to the citation. How to get the page number through to correct Latex code?
Reference to another section. How to? Tried same strategy as with citations:

However, Latex uses a label as the reference mechanism. This label is automatically added by Pandoc, and my strategy of just adding \ref{} is too simplistic. Would it be better to: 1) add a label to the headings in Scrivener, 2) mark this using Pandoc’s Markdown? Then, hopefully, Pandoc is capable of getting this correctly into Latex code.

I tried this by manually editing the MMD file:

I added the label at the end of the heading text using Pandoc’s Markup.
That seems to work (Latex):

Also manually adding a reference to this heading in MMD:

Works just fine in Latex:

How to achieve this in Scrivener?

Add a label to each heading (I’m using the document name in the binder as the heading name):

(Temporarily, I also add Label and Status at the end of each heading at compile time. This is of course not the same “label” I would like to add to each heading.)
Refer to this heading in the text using a character style and add a command (Latex, MMD, or Pandoc’s Markup) at compile time.

Returning to citations: I use a paragraph style for image captions. However, when I add a citation within such a caption, Pandoc does not recognize it as an image caption and messes up the Latex code. The only workaround I’ve found is to use Latex code directly in Scrivener. However, I would like to avoid such code within the manuscript in Scrivener.

Oddivh · February 7, 2024, 9:39am

Remembered the <$title_no_spaces> placeholder. Added this placeholder to the heading:

The MMD file looks like this:

However, Pandoc does not like the # sign in the middle. Latex code:

Here is the pdf from Latex:

The “Place suffix inside hashes” option does not help.
Is there any way not to have two hashes when compiling, but only the one in front of the heading?
By manually editing the MMD file:

Latex code:

pdf:

Oddivh · February 7, 2024, 11:00am

New update concerning referring to a heading
In the Scrivener editor, when referring to a heading using the aforementioned custom character style “ref”:

However, the name as it appears in MMD is: “The ontology of assurance - (In Progress, Not reviewed)”. When referring to this heading, the last part of the heading must be included. This is achieved by adding a suffix to the style at compile time:

Therefore, in the MMD file the reference becomes:

This seems very promising; however, for some reason that I cannot understand, when I compile the entire manuscript (above, I only compiled a small part of it), the MMD file looks like this:

It is correct when compiling parts of the manuscript but becomes totally corrupted when compiling the entire manuscript.

bernardo_vasconcelos · February 7, 2024, 11:28am

I am on my phone, so I can’t do much right now. But if you look around using the forum search you’ll see that there are several templates available that address the question of heading ids. You could use a custom metadata field to store these labels/ids, for instance.

As to citations, it gets a little tricky dealing with several references combined, as in [@Joe2010, p. 10; @AnotherJoe2000, p. 99].

AmberV · February 7, 2024, 12:11pm

My overall opinion is to use the dialect’s conventions (such as cross-referencing) as much as possible, and only force the matter myself with manual language-specific overrides if nothing else works. I cannot think of many good reasons to insert my own refs and labels, even if I am making that significantly nicer on myself by using styles. What if I want ePub for casual proofing, for example? The barrier for that is low if I stick with language-agnostic markup, but if I have a compile format filled to the gills with hyper-specific language output to get anything done, I’ll probably never get around to it and just live with the awkwardness of PDF on an iPad for those sofa proofing sessions.

It’s an opinion though, to reiterate. One of the nice things about this workflow in general is how flexible it is.

As I want to use Rich Text in Scrivener, I compile to MMD and then use Pandoc from MMD to Latex (TeXstudio for Latex). I use Zotero with the add-on “Better BibTex” for citations and bibliography.

You might want to check out the Processing compile format pane at some point. What you’re describing is the type of manual labour I do early on in the process, when I am still shaking out the details of what compile will look like for a project. But fairly early on that goes into a shell script / .bat file and I never look back. Single-click compile straight to PDF or whatever is coming out of the end of the process, from that point on.

I got into problems when adding a page number to the citation. How to get the page number through to correct Latex code?

I don’t know about that, but you might want to search on the forum for Scrivomatic posts, which make all of this kind of stuff easier.

The only workaround I’ve found is to use Latex code directly in Scrivener. However, I would like to avoid such code within the manuscript in Scrivener.

Yeah, that’s one of those rare cases where it might be better to do what you’re doing. Though for me, since that should be so rare, I don’t make a special style for that one thing. I have a general purpose “Raw LaTeX” style for inserting stuff directly, whatever it may be. It keeps things simple in a scenario where complexity isn’t really necessary. Bear in mind style use also means the ability to delete the text at the compile stage. You can have your plain RTF output if you want it in other words, by adding your “Raw LaTeX” style to its Format and setting the flag to delete the contents of text marked with it.

Is there any way not to have two hashes when compiling, but only the one in front of the heading?

Annoyingly, there is supposed to be an option here to leave the suffix hash off of automatically generated headings, but it doesn’t work. For those Mac users stumbling across this thread, you’ll find the stock Pandoc-oriented formats already have the option enabled so you generally don’t need to even think about it, but if you do want to create your own from scratch, it is in the Section Layout area’s ••• button, top right: disable Add closing hashes to titles. Now you can freely add Pandoc-compliant ID overrides to the Title Suffix.

For Windows users, I’m not sure what is best. On a UNIX-based system I would advise adding a sed line to your processing script, or whatever you prefer along those lines (I usually use a general Ruby script to clean up gremlins like this, and to do things Scrivener cannot do by itself with Replacements and other tools). Whatever the analogue for sed is on Windows, I don’t know. Maybe installing Ruby is the best answer, unless you have another scripting preference. It’s a nice language for simple things like this, serving in a role similar to Perl, but can easily scale if needed, being deeply object-oriented in nature.
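For anyone without sed, the same clean-up can be sketched in a few lines of Python, which runs on Windows as well. This is only a rough illustration, assuming the compiled heading looks like `# Title # {#SomeId}` (closing hashes followed by a Pandoc attribute block); the function name `strip_middle_hashes` is made up for this example, and the pattern does not try to skip headings inside code blocks.

```python
import re

# An ATX heading whose closing hashes sit between the title text and a
# trailing Pandoc attribute block, e.g. "## Title ## {#my-id}".
_HEADING = re.compile(
    r"^(#{1,6})[ \t]+(.*?)[ \t]+#+[ \t]*(\{[^{}]*\})[ \t]*$",
    re.MULTILINE,
)


def strip_middle_hashes(markdown):
    """Drop the closing hashes so that only the leading hashes remain."""
    # Keep the heading level, the title text and the attribute block.
    return _HEADING.sub(r"\1 \2 \3", markdown)
```

Reading the compiled file in text mode, passing its contents through the function and writing the result back before calling Pandoc would be one way to slot this into a processing script.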

But again, I would consider having your custom ID in a custom metadata field rather than tacking one on to every heading. It feels to me like that is making too much work for yourself. Why not depend on [Introduction] for your cross-referencing 99% of the time, and only use IDs when you have duplicate headings or ones that make for awkward cross-referencing “grammar” in your text.

And maybe there is something to be said for having the label and status formatted in some way under the heading, rather than being a part of it, for that very reason of keeping things simpler. It seems to me like having strange-looking and very long headings with parentheticals is maybe part of the reason why you went down this path anyway? If this kind of text is purely annotative, why not \marginpar{<$label>, <$status>} or something along those lines?

Oddivh · February 7, 2024, 1:21pm

Thanks, I’ve looked around, but perhaps not enough.
I think I know what the MMD file should look like for Pandoc to convert it properly.

MMD heading (no hash in the middle; <$title_no_spaces> placeholder and hash inside curly brackets at the end):

Referring to from the text should look like this in the MMD file:

Then Pandoc will be able to properly convert to Latex.
However, the Scrivener compiler inserts a hash in the middle of the heading (as seen before).
Moreover, the MMD file looks like this when referring:

Even when the “Ref” style is defined as such:

However, if only the smaller part of the manuscript is compiled, then it looks just fine:

I don’t know why <$mmdhn> and the <$title_no_spaces>-placeholder are added when the entire manuscript is compiled. Especially when the “Treat as raw markup” is checked.
PS! I deleted the Label and Status placeholders for the time being.

nontroppo · February 7, 2024, 4:15pm

In general yes, in that with a @key rather than \cite{key} you could just as easily export to DOCX or HTML or text or a zillion other options if you ever needed to. Pandoc’s citeproc engine is pretty powerful and covers most use cases. Pandoc can be configured to use its citeproc engine, or drop back to biblatex (it rewrites @key to \cite{key} for you) with a simple config setting, so this gives you maximum flexibility into the future. The fact that you already use a Scrivener style means it should be trivial to convert your documents (demonstrating the significant benefit of Scrivener styles in the writing environment).

So it depends if you will use Pandoc or LaTeX syntax… For Pandoc:

https://pandoc.org/MANUAL.html#citation-syntax

Citation items may optionally include a prefix, a locator, and a suffix. In
Blah blah [see @doe99, pp. 33-35 and *passim*; @smith04, chap. 1].
the first item (doe99) has prefix see, locator pp. 33-35, and suffix and *passim*. The second item (smith04) has locator chap. 1 and no prefix or suffix.

So if you use Scrivener’s style to prepend [@ and append ] at compile time, then you can just write key, p. 33 in the editor.
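To make the target syntax concrete, here is a small Python sketch of what the compiled citation should look like, including the combined case with several references. The helper name `pandoc_citation` is hypothetical and only mimics the text that the style prefix and suffix would produce; it is not part of Scrivener or Pandoc.

```python
def pandoc_citation(items):
    """Build one bracketed Pandoc citation from (key, locator) pairs.

    A locator of None or "" means the item is cited without one.
    """
    parts = []
    for key, locator in items:
        # Each item is "@key", optionally followed by ", locator".
        parts.append(f"@{key}, {locator}" if locator else f"@{key}")
    # Several items share one pair of brackets, separated by semicolons.
    return "[" + "; ".join(parts) + "]"
```

With a single item this is exactly the style-wrapped text: `pandoc_citation([("doe99", "pp. 33-35")])` gives `[@doe99, pp. 33-35]`. For combined citations, the `; @` between items has to come from somewhere, so typing it in the editor inside one styled run is probably the simplest option.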

This should work automagically if you use Scrivener links. So for example:

I wrote see Results then selected that text then dragged Manuscript/Results/Lunar Cycles onto the selected text to make a link. Scrivener to Pandoc MMD makes this:

…Sed illum minimum at 3.25×10⁴⁸ ([see Results][Lunar Cycles]) , est mægna alienum mentitum ne. [Amet equidem](https://quarto.org/) sit ex ([see Conclusion][Discussion]). Ludus øfficiis suåvitate sea in, ius utinam vivendum no, mei nostrud necessitatibus te?  

…

## Lunar Cycles

…

I didn’t do anything, Scrivener made the links that Pandoc can parse. Then Pandoc will convert them into the correct LaTeX links (or HTML or DOCX etc.)… There may be cases where you will need some customisation, and unfortunately this may collide with Scrivener’s system but try to see if Scrivener links will work out for you first…

BONUS Advice: don’t use direct rich text formatting but styles for everything (i.e. instead of direct bold and italic formatting, apply strong and emphasis styles); this means the compiler doesn’t need escapes. Scrivener styles are the one ring to rule them all.

EDIT: I didn’t see the other replies before I replied to your OP. It still seems you are complicating things for yourself trying to get cross-linking. Is there a specific reason to not use simple Scrivener links?

Oddivh · February 8, 2024, 3:30pm

Thank you very much to both Amber V and nontroppo.
There is a lot to digest, and I will need time to read up on a lot of stuff.
Here are some thoughts:

I have to compile the document to docx for internal review by my colleagues (as Amber V mentioned). I thought I would just compile directly to docx (layout does not matter).
I still cannot get the reference to work properly. The Scrivener compiler adds a label inside parentheses after the square brackets with the heading name. Pandoc adds a label to the heading. These two labels do not, of course, match. The only way I’ve found to force both the Scrivener compiler and Pandoc to “agree” on the label name is to specify it through the placeholder in Scrivener. However, this might be unnecessary because Latex will find the heading through the heading name, as Amber V pointed out. Then it is unclear to me why the Scrivener compiler invents a label and puts it inside parentheses in the first place.
I did not get the @key to work as nontroppo suggested. I did not even know I had to add an extension to the Pandoc command line; however, even adding the “+citations” did not help.
Same then with adding a page number. I guess there is a lot here I still don’t understand. And there is still the problem with <$mmdhn>; I have no idea what this is.
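For reference, the label Pandoc generates for a heading follows a rule documented in the Pandoc manual for its auto_identifiers extension: drop everything except alphanumerics, underscores, hyphens and periods, turn spaces into hyphens, lowercase the result, and remove everything before the first letter (falling back to "section" if nothing is left). A rough Python sketch of that rule, assuming the heading is plain text without inline formatting or footnotes (the function name `pandoc_auto_identifier` is made up here):

```python
import re


def pandoc_auto_identifier(heading_text):
    """Approximate Pandoc's automatic heading identifier for plain text."""
    # Keep alphanumerics, underscores, hyphens, periods and whitespace.
    kept = "".join(
        ch for ch in heading_text if ch.isalnum() or ch in "_-." or ch.isspace()
    )
    # Replace each space or newline with a hyphen, then lowercase.
    ident = re.sub(r"\s", "-", kept).lower()
    # Remove everything up to the first letter.
    for index, ch in enumerate(ident):
        if ch.isalpha():
            return ident[index:]
    # Nothing usable is left, so use the documented fallback.
    return "section"
```

Under this sketch, the heading “The ontology of assurance - (In Progress, Not reviewed)” comes out as `the-ontology-of-assurance---in-progress-not-reviewed`, which would explain why a label invented on the Scrivener side does not match unless both sides are given the same explicit ID.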

No, absolutely not. But it does not work. (see above)

Again, thank you for detailed replies and perhaps Scrivomatic is a place to start.

jpkell · February 8, 2024, 4:02pm

My strong recommendation is to fully commit to Pandoc’s citeproc approach to citations and to use pandoc-crossref for cross-referencing equations, figures, and sections/chapters. Then let Pandoc convert everything to whatever output you want: docx, pdf, tex. Do not mess around with having Scrivener somehow insert LaTeX commands itself. The only TeX you’ll insert is what you need for your math, symbols, etc.

I have a Scrivener setup that does all this wonderfully, but it is a little complicated and I’m sorry to say I don’t really have the time right now to prepare it for sharing or explain its intricacies. I’m also on a Mac, and am not sure what might be different in the Windows version.

A great alternative, which is what I would recommend to students who like what I can do with Scrivener+Pandoc but who either don’t want to learn Scrivener in general or my setup in particular, is to embrace the Quarto ecosystem. It gives you everything you want, with native cross-referencing ability. And it has great tutorials to walk you through it.

But if you want to stick with Scrivener (as I do, so I understand), what you want is a suitable Scrivener project template. I believe my personal one is based on one that @nontroppo shared once, so maybe he can share a project file that includes a demo document that you can play around with.

nontroppo · February 9, 2024, 12:28pm

You need a bibliography file specified, so for example save this as test.bib:

@article{crivellato2007,
author = {Crivellato, Enrico and Ribatti, Domenico}, 
title = {Soul, mind, brain: Greek philosophy and the birth of neuroscience}, 
journal = {Brain Research Bulletin}, 
volume = {71}, 
number = {4}, 
pages = {327–336}, 
year = {2007}, 
doi = {10.1016/j.brainresbull.2006.09.020}, 
abstract = {}, 
location = {}, 
keywords = {}}

Then save this markdown file as test.md:

---
title: Test
author: Jane Doe
---

# Intro

It was known [@crivellato2007]. See also [here][Discussion]

# Discussion

More text.

Then run this command (we pass the BibTeX bibliography file and also ensure citeproc is activated):

pandoc --citeproc --bibliography=test.bib -o test.docx test.md

You could then compile to PDF via LaTeX with a slight tweak to the command:

pandoc --citeproc --bibliography=test.bib -t latex -o test.pdf test.md

And even fall back to the native biblatex engine:

pandoc -s --biblatex --bibliography=test.bib -t latex -o test.tex test.md
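The three invocations differ only in a few flags, so they fold easily into a small script, which is handy once the process is automated. Below is a minimal Python sketch; the function names `build_pandoc_command` and `run_pandoc` are hypothetical, the file names are the ones from this example, and `-o` is the short form of `--output`.

```python
import subprocess


def build_pandoc_command(source, output, bibliography, use_biblatex=False):
    """Return the argument list for one of the three invocations above."""
    command = ["pandoc"]
    if use_biblatex:
        # Standalone .tex file that leaves the citations to biblatex.
        command += ["-s", "--biblatex"]
    else:
        # Let Pandoc's own citeproc engine resolve the citations.
        command.append("--citeproc")
    command.append(f"--bibliography={bibliography}")
    if output.endswith((".tex", ".pdf")):
        command += ["-t", "latex"]
    command += ["-o", output, source]
    return command


def run_pandoc(source, output, bibliography, use_biblatex=False):
    """Run Pandoc (assumed to be on the PATH) and raise if it fails."""
    subprocess.run(
        build_pandoc_command(source, output, bibliography, use_biblatex),
        check=True,
    )
```

For example, `run_pandoc("test.md", "test.docx", "test.bib")` would issue the first command, and passing `use_biblatex=True` with `test.tex` as the output would issue the third.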

Right, once you get @key to work so will page numbers:

In fact the output from Pandoc to DOCX / ODT is much more flexible and well-formed than Scrivener’s native DOCX output. Scrivener’s native output tries to follow the rich text formatting, and doesn’t add headings etc. unless configured. Pandoc’s output can be shaped using a template, and supports more word processing features (figure legends, maths, and more).

Try to make a simple Scrivener project to demonstrate what you can’t get to work, zip it and attach and we can have a look. Scrivomatic has several sample projects you can download and open up to see how this all works together…

jpkell · February 9, 2024, 2:02pm

Since you use Better BibTeX in Zotero, you can also go to the Preferences and select Automatic export and have Zotero back up your whole library as a JSON file. Then you can place the path to that file on your computer after --bibliography= in the Pandoc command line. The Better BibTeX plugin now also has an option in Zotero under Tools > Better BibTeX that will scan your resulting markdown file and show you the subset of your library’s citations that you used in your paper. Then you can export a paper-specific bib file if you need one.
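If you just want a quick list of the keys a markdown file cites, without going through Zotero, a rough Python sketch could look like this. It is not how Better BibTeX does its scan; the pattern is a simplification of Pandoc's citation-key rules, and the name `cited_keys` is made up for this example.

```python
import re

# Simplified: an "@" that does not follow a word character (so e-mail
# addresses are skipped), then a key of letters, digits, underscores
# and a few internal punctuation marks.
_CITEKEY = re.compile(r"(?<![\w@])@([A-Za-z0-9_][\w:.#$%&+?<>~/-]*)")


def cited_keys(markdown):
    """Return the distinct citation keys in order of first appearance."""
    keys = []
    for match in _CITEKEY.finditer(markdown):
        # Trailing punctuation belongs to the sentence, not to the key.
        key = match.group(1).rstrip(".:,;?")
        if key and key not in keys:
            keys.append(key)
    return keys
```

Comparing that list against the keys in the exported bibliography is a cheap way to spot a misspelled key before Pandoc complains about it.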