Compile to .docx using Pandoc

Oddivh · March 12, 2022, 8:15pm

Being new to Scrivener (and Pandoc), I struggle to understand how these two programs work together. I need this combo to write academic and non-fiction documents, like research reports, where “everyone” else uses MS Word. We must do most of the work concerning layout, proofreading and commenting in Word. We must work with different Word templates in different situations, and these may include logos in the header. These documents may be between 6 and 100 pages long.
My goal is to do as little as possible in Word after compiling; however, I know how to make Word Macros, so this is an option.
Compiler Format: “Pandoc-Word Document”.
I need a selection of “standard” styles, like Heading 1-4, bullet lists, Emphasis/Strong, figure captions and so on. Moreover, I need to refer to table and figure numbers and use citations (Zotero, not a topic in this post).
I want to use Scrivener’s styles and not use code in the editor, which means that Word-styles should be preserved through the compilation.
After investigating a little, the built-in Scrivener compiler does not seem versatile enough to produce a .docx with correct Word codes for references etc.; therefore Pandoc.
Problem description:

Is a section missing from the part of the Scrivener Manual that covers the “Compile Format Designer”? There is a Sec. 24.11 “LaTeX Options”, but no “Pandoc Options”.
No styles, except Heading 1-4 are preserved. Everything else becomes “First paragraph”-style.
In “Compile Format Designer->Styles”, I add all the styles I need for the editor, including “List Bullet”, Heading 1-4 and so on. To get Heading 1-4 into the .docx, "# " and "## " must be inserted as a prefix to Heading 1 and Heading 2 respectively; otherwise they become “First Paragraph”.
To get the “List Bullet” to sort of work, I insert "* " as a Paragraph prefix to this style. However, “List Bullet” does not come through as “List Bullet” in Word, but as the style: “Compact”, although there is a style in the resulting Word document named “List Bullet”.
I would also like the character styles “Emphasis” and “Strong” to go through, which seems impossible. To get italic, I enter "*" as prefix to the “Emphasis” style in the “Compile Format Designer” (unlike the bullet list, with no space after the "*"). The character formatting comes through, but as direct character formatting, not as a character style. To fix this, I tried Pandoc “Custom Styles”: [WordToBeEmphasised]{custom-style="Emphasis"}. However, the “WordToBeEmphasised” came through tagged with a new style named “Emphasis1”. This style did not exist originally in the Word template, so the word was not formatted as italic.
Finally, I saw that there was an option to store a file named “Reference.docx” in “C:\Users\USERNAME\AppData\Roaming\pandoc”. The problem then was that only Heading 1 came through, nothing else. Everything else came through as “Normal” style this time, and not “First Paragraph” style as before. I know the compiler used this file because I put in a header, and it showed in the compiled .docx file from Scrivener.

There are a lot of options when using the command line with Pandoc. I hope I do not have to go down this rabbit hole because it is deep, and it is an extra step in the process that I wish to avoid. However, before I commit to Scrivener, I must be confident that it is possible to compile a “standard” .docx file with standard “stuff” as indicated above.

Does anyone have a simple list of “good practices” for moving from Scrivener to MS Word after the writing is done, given that the content in the .docx file should use MS Word styles and cross-references and so on? As I mentioned above, I got the impression after reading different posts in this forum that Pandoc was the way to do this, but if that is not the case, I’m happy to ditch Pandoc.

jpkell · March 13, 2022, 3:37pm

Sorry that I can’t answer the specific question you asked, namely how to do this while avoiding the command line. But what you want is eminently achievable using Pandoc’s “--reference-doc” option. Search for “reference-doc” here. Here is a link to Pandoc’s default docx reference file. Note that Pandoc will pay attention only to styles in the reference-doc. So any tweaks you want to make must be made to the reference-doc in Word using Format > Style…
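For anyone who would rather start from Pandoc's own default reference file than download it, here is a minimal Python sketch that only builds the command described in the Pandoc manual for dumping that file. The output filename `custom-reference.docx` and the helper name are illustrative assumptions; the sketch prints the command instead of running it, so Pandoc does not need to be installed to try it.

```python
import shlex
from typing import List


def default_reference_doc_command(output_path: str = "custom-reference.docx") -> List[str]:
    """Build the Pandoc command that writes the built-in reference.docx to a file.

    The output path is a hypothetical name; pick whatever suits your project.
    """
    return [
        "pandoc",
        "-o", output_path,
        "--print-default-data-file", "reference.docx",
    ]


if __name__ == "__main__":
    # Print a copy-pasteable command line instead of executing it.
    print(shlex.join(default_reference_doc_command()))
```

The resulting file can then be opened in Word, its styles edited, and the edited file handed back to Pandoc as the reference-doc.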

Here’s how I use this. I have Scrivener compile to markdown, using “MultiMarkdown” as the compile output. You can use Scrivener’s section template features to assign docx styles to various sections (as I describe here), and then you tell Pandoc which docx to use as its reference file when it converts the markdown file to docx. And voila. It’s very powerful, especially in combination with what Scrivener can do.
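As a rough illustration of that second step, the following Python sketch assembles the Pandoc call that converts a compiled markdown file to .docx with a reference file. The filenames are hypothetical placeholders, and the choice of input format is an assumption: `markdown` is Pandoc's own dialect, while `markdown_mmd` is its MultiMarkdown reader. The sketch only builds and prints the command.

```python
import shlex
from typing import List


def docx_command(
    markdown_path: str,
    output_path: str,
    reference_doc: str,
    input_format: str = "markdown",  # assumption: use "markdown_mmd" for MultiMarkdown input
) -> List[str]:
    """Build a Pandoc command that converts markdown to .docx using a reference file."""
    return [
        "pandoc",
        markdown_path,
        "--from", input_format,
        "--reference-doc", reference_doc,
        "--output", output_path,
    ]


if __name__ == "__main__":
    # Hypothetical filenames, shown only to illustrate the shape of the command.
    print(shlex.join(docx_command("report.md", "report.docx", "custom-reference.docx")))
```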

I think there’s a way to automate this from within Scrivener (see here), but I’m comfortable with the command line so I haven’t looked into that.

Oddivh · March 13, 2022, 7:19pm

Thank you for the response. I had a look at your suggestions. Automating this from within Scrivener seems tempting; however, after reading the first two lines of the installation instructions, I felt one hundred years old; I did not understand a thing. To be honest, I also felt a bit powerless looking at the search results; the top hit was “Lua Filter”, and I have absolutely no idea what Lua is. The other hits were to the Pandoc User’s Guide, which I have already tried to browse through without getting much wiser.

If what I’m trying to achieve is not within the scope of Scrivener, I will treat Scrivener just the way I now treat Microsoft OneNote. I have been using OneNote for the last decade when drafting my papers and reports. I then transfer to MS Word and manually fix the layout. It is a one-way street with no going back after proofreading and reviewing/commenting. It’s a lot of work, but I cannot spend more time bending Scrivener. Better to spend it on writing.

jpkell · March 14, 2022, 1:06am

If you are willing to tolerate just a little command line, I’m sure we can address this to your satisfaction. I too tried to look into Scrivener automation with Scrivomatic/Pandocomatic à la the link I gave you, and it was way too intricate for even me.

nontroppo · March 14, 2022, 11:09am

I use pandocomatic in my workflow because I’m comfortable with Ruby, and I enjoy automating stuff — but this is not necessary. You can ignore the bits of scrivomatic’s instructions dealing with pandocomatic, but there are still useful bits of info in terms of using styles etc.

Here is the DOCX output from my sample project:

…apart from some of the academic metadata generating the names and affiliations that use filters, the rest of the document requires nothing more than scrivener compile to markdown and pandoc — it contains I think everything you have stated you require (citations, cross-references, lists, captioned figures, headers with outlines etc.)?

You do not need to understand all the various options, or know what Lua is etc. As long as you can copy-paste something you should be good to go.

OK, let’s break down what is “advised” to make this work as smoothly as possible:

Use the Binder for headings, not text and styles within documents. This is not specific to markdown, just general advice for one of Scrivener’s biggest features, a flexible outlining binder and the super scrivenings view. Headings will come for free when compiling.
Generic Scrivener Styles will not be automatically preserved, BUT for most of what you want this is not required. For example, figure caption styling will come from Pandoc interpreting the ![figure caption](figure.png) markup (that Scrivener will auto-generate for you), as will lists, tables etc. If you do want to force specific styles, you need to use Scrivener styles that generate fenced blocks or spans; for example, in my example workflow you can see INFO and WARNING boxes that use this to generate floating text boxes in the docx output. BUT this is a special case for special features; for most things like quotations, maths, inline styles etc. the defaults are fine.
Strong and emphasis: I use Scrivener styles to generate markdown. You seem to be having trouble here; you need to use * for emphasis, ** for strong, and **_ for strong emphasis. Again, my sample Scrivener project has a compile format and project styles that can be perused to see what should be in the prefix / suffix for this: scrivomatic/Workflow.scriv.zip at master · iandol/scrivomatic · GitHub
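To make the prefix / suffix idea concrete, here is a small Python sketch of what the compiled markdown would look like once a style's prefix and suffix have been wrapped around the text. The style names in the table are hypothetical examples and not taken from the sample project; the span and fenced-block forms follow the `custom-style` attribute syntax documented in the Pandoc manual.

```python
from typing import Dict, Tuple

# Hypothetical Scrivener style names mapped to (prefix, suffix) markdown markers.
INLINE_MARKERS: Dict[str, Tuple[str, str]] = {
    "Emphasis": ("*", "*"),
    "Strong": ("**", "**"),
    "Strong Emphasis": ("**_", "_**"),
}


def wrap_inline(text: str, style: str) -> str:
    """Wrap text in the prefix and suffix registered for an inline style."""
    prefix, suffix = INLINE_MARKERS[style]
    return f"{prefix}{text}{suffix}"


def custom_style_span(text: str, word_style: str) -> str:
    """Produce a bracketed span that asks Pandoc for a named Word character style."""
    return f'[{text}]{{custom-style="{word_style}"}}'


def custom_style_block(text: str, word_style: str) -> str:
    """Produce a fenced div that asks Pandoc for a named Word paragraph style."""
    return f'::: {{custom-style="{word_style}"}}\n{text}\n:::'


if __name__ == "__main__":
    print(wrap_inline("important", "Emphasis"))
    print(custom_style_span("important", "Emphasis"))
    print(custom_style_block("A boxed warning.", "Warning"))
```

In the Compile Format Designer, the text before the wrapped content would go in the style's prefix field and the text after it in the suffix field; note that the attribute needs straight quotes, not curly ones.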

I see you are on Windows, and I at least had some problems when I tried the workflows I use on Win10, but for simple compiling I don’t see why there should be any problem…

It would take me a couple of minutes to get this working for you if you attach a sample scrivener project with the features you want working. That is maybe easier than back and forward commenting and screenshots?

Oddivh · March 17, 2022, 7:27am

Thank you, I will certainly have a closer look at it. However, I have some comments after quickly browsing through your .docx.
I liked that the bullet list and the table text/heading shared the same style - “Compact”.
The figure numbers and references to them are plain text; I would like them to be inserted as MS Word Field Codes (“seq” and “ref”). The reason is that I believe that after internal/external review and proofreading, it will be the Word document that will be updated, not the project in Scrivener.

I agree with you; preserving the look of styles is unnecessary; the Word template can take care of that. The important thing is that the name goes through (for me, only heading names went through to Word; everything else was named “First paragraph”).

nontroppo · March 17, 2022, 12:29pm

Yes, the cross-references in my workflow use Scrivener’s tools (Scrivener’s placeholders); that is just what I’ve always used, and it works well enough for me (I’ve never needed a list of figures, for example). What you want are proper DOCX cross-referenced field codes, and as far as I know this should be fairly trivial to set up, at least for figures and tables, as IIRC Pandoc does now correctly handle this (see changelog for Pandoc 2.14.1). However, one issue is that if you want to reference those figures or tables in the text, these numbers will not update if you change the order etc. This is one advantage of Scrivener’s placeholders. Pandoc is thinking about native cross-referencing here: New Feature: internal links to tables and figures and headers · Issue #813 · jgm/pandoc · GitHub. In the meantime, people use cross-reference plugins like pandoc-crossref.
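For the pandoc-crossref route, a minimal Python sketch of the pieces involved might look like this. It assumes pandoc-crossref is installed as a separate filter, uses its documented `{#fig:label}` and `@fig:label` markdown syntax, and places the filter before `--citeproc` as its documentation recommends. All filenames and labels are hypothetical, and the sketch makes no claim about whether the resulting numbers end up as Word SEQ/REF fields; that would need checking in the generated .docx.

```python
import shlex
from typing import List, Optional


def labelled_figure(caption: str, image_path: str, label: str) -> str:
    """Markdown for a figure that pandoc-crossref can number and reference."""
    return f"![{caption}]({image_path}){{#fig:{label}}}"


def figure_reference(label: str) -> str:
    """Markdown for an in-text reference to a labelled figure."""
    return f"@fig:{label}"


def crossref_command(
    markdown_path: str,
    output_path: str,
    reference_doc: str,
    bibliography: Optional[str] = None,
) -> List[str]:
    """Build a Pandoc command that runs pandoc-crossref (and optionally citeproc)."""
    cmd = ["pandoc", markdown_path, "--filter", "pandoc-crossref"]
    if bibliography is not None:
        # Citation processing goes after the cross-reference filter.
        cmd += ["--citeproc", "--bibliography", bibliography]
    cmd += ["--reference-doc", reference_doc, "--output", output_path]
    return cmd


if __name__ == "__main__":
    print(labelled_figure("Sampling sites", "sites.png", "sites"))
    print(f"As shown in {figure_reference('sites')}, ...")
    print(shlex.join(crossref_command("report.md", "report.docx", "custom-reference.docx")))
```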

For references, do you mean the in-text citations for the bibliography? These are linked but I do not know about the field code status (I only use Word when I have to, and never use its native reference tools) or if there is a way to add the expected field code. This is something that could be asked on Github or the Pandoc forums… But what exactly are the field codes needed for in this case; you don’t need to make a list of citations as that is what the bibliography is for?

nontroppo · March 17, 2022, 12:44pm

As an aside, I tend to keep working on the Scrivener version of a paper until the very final post-review proof version; I treat DOCX as a one-way street and always make revisions in Scrivener. The main problem with this is when journals ask for a changes-marked version of the DOCX during a revision for the reviewers/editors. In this case I use Word’s Compare documents tool on two compiled DOCX versions (I archive each DOCX compiled and sent to the journal). The major reason I do this is that I still find Scrivener essential during review: I have my document, articles and reviewer comments in the Binder, and using the flexibility of Scrivener’s UI I keep the reviewer replies and the manuscript I am updating side by side onscreen. Scrivener’s utility only finally gives way at the proofing stage, when twiddling with spelling and formatting is trivial.

Oddivh · March 18, 2022, 8:35am

Thank you for the tip. Yes, I agree; it would be best to keep the project in Scrivener and keep all related data and correspondence in the binder. I did not think about that, but it’s the best solution.

Again, Pandoc seems very versatile and is probably the best way to go from Scrivener to MS Word, so I need to sit down and study it.