Typesetting a technical document with Scrivomatic and TeXShop on a Mac

Scrivener forum member @nontroppo has developed Scrivomatic, a system for technical writing that allows Scrivener to automatically access pandoc to produce documents containing typeset mathematics, footnotes, and bibliographic citations (with a wide selection of citation styles). The system depends upon pandoc-style markdown.

To familiarize people with the system, @nontroppo has created a sample Workflow.scrv project along with instructions for compilation within Scrivener. Upon compilation, output files in a range of formats are produced, demonstrating the versatility of the system. One of the output formats generated by Scrivomatic is LaTeX (.tex). With this .tex file it should be possible to use a program like TeXShop on the Mac to typeset a high-quality PDF. Within the TeXShop editor it is also possible to tweak the .tex file before typesetting, should that be necessary.

However, in order to get the .tex file to successfully compile to a PDF within TeXShop on my M1 MacBook Air, I had to complete the following steps:

  1. Update to TeXLive 2022 (though not sure this was strictly necessary)
  2. Move Cambria and the other Microsoft fonts to my font library
  3. Set the character encoding preferences in TeXShop to Unicode (UTF-8)
  4. Within TeXShop, compile using the LuaLaTeX method rather than the LaTeX method
  5. Replace a superscript-minus sign (U+207B) with a hyphen-minus sign (U+002D) in the original Scrivener workflow document

I am not an expert in these typesetting issues but nevertheless I attempt to provide a bit more information on these steps below with the hope that it might be helpful to those, like myself, who are not (yet) LaTeX experts. More experienced users are welcome to correct any mistakes or elaborate on these topics.

Finding Cambria and other Microsoft fonts (2)

To fix the “missing-font errors” generated by TeXShop’s typesetting system I duplicated the Microsoft fonts from their location within Word’s directory to my library, as described in this Stack Exchange post. Specifically, I duplicated the fonts in

/Applications/Microsoft Word.app/Contents/Resources/DFonts

to

~/Library/Fonts

(Keep in mind that to get to a folder, even a hidden one, you can use Go > Go to Folder within Finder.)
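For anyone who prefers Terminal to Finder, the same duplication can be sketched as a small shell function. This is only a sketch: `copy_fonts` is a hypothetical helper name, and the source path assumes Word is installed in its default location, as given above. Fonts already present in the destination are left untouched.

```shell
# Hypothetical helper: copy every regular file from one folder into another,
# creating the destination if needed and never overwriting existing files.
copy_fonts() {
  src_dir=$1
  dest_dir=$2
  mkdir -p "$dest_dir"
  for f in "$src_dir"/*; do
    [ -f "$f" ] || continue                  # skip sub-folders and empty globs
    target="$dest_dir/$(basename "$f")"
    [ -e "$target" ] || cp "$f" "$target"    # keep any font already installed
  done
}

# Paths taken from the post above; the copy only runs if Word's font folder exists.
word_fonts="/Applications/Microsoft Word.app/Contents/Resources/DFonts"
if [ -d "$word_fonts" ]; then
  copy_fonts "$word_fonts" "$HOME/Library/Fonts"
fi
```

Copying rather than moving leaves Word's own font folder intact.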

Setting Character Encoding Preferences (3)

Some characters in the Pandoc-generated .tex file would not render correctly within the TeXShop editor until I changed the character encoding to Unicode (UTF-8) under TeXShop > Preferences (I think it had originally been set to ASCII). Choosing the Preferences menu item in TeXShop brings up the preferences panel; from there, choose Source. The Encoding selector appears at the top of the right column; select “UTF-8”.
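To confirm that a .tex file really is UTF-8 before blaming the editor, a quick check can be sketched with `iconv`, which fails when it meets an invalid byte sequence. `is_utf8` is a hypothetical helper name and `writing.tex` is a placeholder file name.

```shell
# Hypothetical helper: succeeds only if the file is valid UTF-8.
# iconv exits with a non-zero status when it meets an invalid byte sequence.
is_utf8() {
  iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Example use; the file name is a placeholder.
if [ -f writing.tex ]; then
  if is_utf8 writing.tex; then
    echo "writing.tex is valid UTF-8"
  else
    echo "writing.tex is not valid UTF-8"
  fi
fi
```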

Compile using the LuaLaTeX Typesetting Engine (4)

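As I understand it, LuaLaTeX reads UTF-8 source natively and can load the fonts installed on the system (OpenType and TrueType), which the Pandoc-generated .tex file relies on and which the older LaTeX method does not do. Evidently other typesetting engines accomplish this too (e.g., XeLaTeX).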
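Steps 3 and 4 can also be recorded in the .tex file itself. As far as I know, TeXShop reads `% !TEX` directive comments near the top of a source file and uses them to set the encoding and typesetting engine for that file only. The sketch below prepends two such directives; `add_texshop_directives` is a made-up helper name and `writing.tex` is a placeholder.

```shell
# Hypothetical helper: prepend TeXShop directive comments to a .tex file
# so that it opens as UTF-8 and typesets with LuaLaTeX.
add_texshop_directives() {
  tex_file=$1
  tmp_file="$tex_file.tmp"
  {
    printf '%s\n' '% !TEX TS-program = lualatex'
    printf '%s\n' '% !TEX encoding = UTF-8 Unicode'
    cat "$tex_file"
  } > "$tmp_file"
  mv "$tmp_file" "$tex_file"
}

# Example use; the file name is a placeholder.
if [ -f writing.tex ]; then
  add_texshop_directives writing.tex
fi
```

Because Scrivomatic regenerates the .tex file on each compile, the directives would need to be added again after every compile.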

Replacing the Superscript-minus sign (5)

Even with the above changes, one character still failed to render properly, namely the minus sign in the exponent of the number appearing in the Introduction of the Workflow.scrv document. In particular, a box appeared instead of the minus sign in both the TeXShop editor and the LuaLaTeX-generated PDF. This was fixable by editing the Workflow.scrv file and then re-compiling the revised document within Scrivener as described in the Scrivomatic instructions [^1]. Specifically, within Scrivener I deleted the exponent -48 with the backspace key, re-typed -48 directly on the keyboard, then block-selected this -48 and applied the superscript style to it (i.e., selected Format > Style > Superscript). The number renders correctly within the subsequently compiled .tex file as 3.25×10\textsuperscript{-48}.

[^1]: I don’t understand precisely why retyping -48 and then applying the superscript style fixes the problem. It must have to do with a subtlety of the internal representation of the exponent in Scrivener. I suspect @nontroppo may have pasted the number into Scrivener from another source rather than typing it directly using the keyboard.
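To locate the offending character without hunting for boxes by eye, the .tex file can be searched for U+207B directly. A minimal sketch, where `find_superscript_minus` is a hypothetical helper name and `writing.tex` a placeholder; it assumes the script and the file are both saved as UTF-8.

```shell
# Hypothetical helper: print "line-number:line" for each line containing U+207B.
# -F matches the character literally; -a makes grep treat the file as text.
find_superscript_minus() {
  grep -a -n -F '⁻' "$1"
}

# Example use; the file name is a placeholder.
if [ -f writing.tex ]; then
  find_superscript_minus writing.tex || echo "no superscript minus found"
fi
```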


Thanks @Edmund for detailing your solutions to some of the problems you faced. The self-contained Scrivener workflow project does not contain all the solutions that my full Pandoc configuration uses (particularly a fixLatex post-processing script), and some of the issues you faced are worked around by the full config.

The case of the missing ⁻

As a scientist I use scientific notation A LOT. I also strongly want to always use Unicode where possible (it is semantically correct). So this: 3.25×10⁻⁴⁸ does not rely on superscript styling applied by the text engine; it uses the actual Unicode code points for the “superscript” part, U+207B U+2074 U+2078.

Many fonts have mixed support for this range of glyphs; U+207B in particular is often missing. For HTML / ODT / DOCX etc., the OS itself has a cool font-substitution system, so it can find a font that does contain that glyph and swap it in automatically. This unfortunately is a major weakness of LaTeX: it doesn’t offer automatic font substitution, so if your chosen font does not contain U+207B (or any other glyph) you get the missing-glyph box. The otherwise really wonderful Alegreya serif font, for example, fails to render ⁻ (a cool tool for testing OpenType rendering is the app FontGoggles).

The font I use (Dolly Pro, which is commercial) has excellent support for this, so it isn’t a problem I face personally… Other serifs vary in whether they include U+207B (FontGoggles is again handy for comparing glyph rendering).

Retyping replaces the U+207x code glyphs with the normal characters, and \textsuperscript will use whatever physical method LaTeX uses to resize and position the regular glyphs, rather than using the Unicode font glyphs… If this works for you then fine, but I will stick to Unicode where possible 🙃

Changing font…

Pandoc is smart: it uses variables to specify the fonts for its LaTeX template, so you can easily change fonts by specifying mainfont, sansfont, monofont, and CJKmainfont in the Scrivener project metadata or the pandocomatic recipe for the LaTeX compile. The fonts I chose for the sample project are limited to those that are generally available on macOS systems…
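Outside Scrivener, the same variables can be tried from the command line. The sketch below writes them to a small metadata file and hands it to pandoc; the font names, `fonts.yaml`, and `writing.md` are all placeholders, so substitute fonts you actually have installed.

```shell
# Write a small pandoc metadata file; the font names are placeholders,
# so substitute fonts that are installed on your system.
cat > fonts.yaml <<'EOF'
mainfont: "Times New Roman"
sansfont: "Helvetica Neue"
monofont: "Menlo"
EOF

# Pass the metadata to pandoc. This only runs if pandoc and the
# placeholder input file writing.md are both present.
if command -v pandoc > /dev/null 2>&1 && [ -f writing.md ]; then
  pandoc writing.md --metadata-file=fonts.yaml --pdf-engine=xelatex -o writing.pdf
fi
```

A single font can also be set directly with `-V mainfont="..."` on the pandoc command line.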

Why Cambria?

In my full dotpandoc config (but not the sample workflow project), I have a post-processing script which tries to correct problems of font substitutions. It searches for a set of Unicode glyphs like ⬄ ↔ ⇔ ⇄ ⇨ ⇦ → ← ⇳ Δ ⟳ ⟲ ‐ ⌲ ⌖ ⌽ ⌀ ⎆ ⎅ ⎌ ⎊ ⏎ ⌨︎ and wraps them with a custom font identifier \fixfont{⬄}. If you look in my LaTeX header you’ll see:

\newfontfamily\fixfont[]{Cambria}
\newfontfamily\fixfontB[]{Arial Unicode MS}

So if you are using a font which doesn’t have e.g. ⬄ then the post-processing script will make it \fixfont{⬄} and LaTeX will substitute Cambria just for that character. This is not an ideal solution to font substitution: some glyphs like ✉ are not in Cambria either, so I use \fixfontB{✉}, which substitutes Arial Unicode MS with its huge glyph set. You’d need to change this list to support the fonts and glyphs you need… I’m sure there has to be a better solution to font substitution, but this remains a major deficit of LaTeX…
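The wrapping idea can be illustrated with a few `sed` substitutions. This is not the actual fixLatex script, only a sketch of the technique: `wrap_fixfont` is a hypothetical filter name and the three glyphs are an arbitrary subset of the list above.

```shell
# Hypothetical filter: wrap a few glyphs in \fixfont{...}.
# Reads LaTeX on stdin and writes the result to stdout. Run it once only,
# since a second pass would wrap the same glyphs again.
wrap_fixfont() {
  sed -e 's/↔/\\fixfont{↔}/g' \
      -e 's/⇔/\\fixfont{⇔}/g' \
      -e 's/⌀/\\fixfont{⌀}/g'
}

# Example use; the file names are placeholders.
if [ -f writing.tex ]; then
  wrap_fixfont < writing.tex > writing-fixed.tex
fi
```

The output only compiles if the LaTeX header defines \fixfont, as in the \newfontfamily lines quoted above.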

Compiling with UTF-8 and [Xe|Lua]LaTeX

As I mentioned above, in the 21st century we should be using Unicode everywhere, and I always use UTF-8. I personally use the XeLaTeX engine, which was the first to really support OTF fonts and UTF-8 properly. I think LuaLaTeX is probably the future of LaTeX (Lua is taking over across TeX land), but for the present, XeTeX is significantly faster (for my compiles more than 2X faster) and handles OTF fonts better. Some TeX programs are cautious and tend to use old code ranges, but UTF-8 should really be the default for everything IMO…

Needing TeXShop?

I think using TeXShop or VS Code (which has a great LaTeX extension) is useful to better understand what is happening under the hood. However, do note that Pandoc can compile directly to PDF. For my own needs, as I wanted to keep the .tex file, I preferred to use latexmk myself to build the PDF, and so scrivomatic has an automatic -b or --build option which takes the .tex file and compiles it with latexmk using XeTeX. -b builds, and -B builds and then cleans up the intermediate files, so:
scrivomatic -B writing.md will run pandoc to generate the .tex, then run latexmk -logfilewarnings -interaction=nonstopmode -f -pv -time -xelatex writing.tex to get the PDF. It will then run latexmk -c to clean the intermediate files but keep the log file for you. It is very rare that I need to open the .tex file in a separate editor these days…
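For anyone who wants to run the same two latexmk steps by hand on an existing .tex file, they can be wrapped in a small function. This is a sketch built from the commands quoted above, not scrivomatic's own code; `build_and_clean` is a hypothetical name and `writing.tex` a placeholder.

```shell
# Hypothetical helper: build a PDF with latexmk using XeLaTeX, then remove
# the intermediate files. The flags are the ones quoted in the post above.
build_and_clean() {
  tex_file=$1
  latexmk -logfilewarnings -interaction=nonstopmode -f -pv -time -xelatex "$tex_file" || return 1
  latexmk -c
}

# Example use; only runs if latexmk and the placeholder file are present.
if command -v latexmk > /dev/null 2>&1 && [ -f writing.tex ]; then
  build_and_clean writing.tex
fi
```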

Needing a full TeXLive?

I do always keep my TeX install up-to-date (I also use the 2022 release), but personally I don’t like installing the whole of MacTeX (4.7GB!), so I use the smaller BasicTeX install (only ~95MB) and install packages as required. My list of required packages for Pandoc and Quarto is this (install using the tlmgr command):

sudo tlmgr install lualatex-math luatexja abstract \
	latexmk csquotes pagecolor relsize mdframed needspace sectsty \
	titling titlesec preprint layouts glossaries tabulary soul xargs todonotes \
	mfirstuc xfor wallpaper datatool substr adjustbox collectbox \
	sttools wrapfig footnotebackref fvextra zref \
	libertinus libertinus-fonts libertinus-otf threeparttable \
	elsarticle algorithms algorithmicx siunitx bbding biblatex biber \
	stackengine xltabular booktabs orcidlink \
	ltablex cleveref makecell threeparttablex tabu multirow \
	changepage marginnote sidenotes environ fontawesome5

However, thanks to Quarto, I’ve started trying out TinyTeX as an alternative to BasicTeX. This is also small (~85MB) and has a nice trick: it installs missing packages for you automatically. The install path is different, and I haven’t yet updated the scrivomatic script to add it to the path, but will do soon…


That is good to know, @nontroppo. I am eager to get a full installation of Scrivomatic going on my machine! By the way, does the full installation of the Scrivomatic-pandoc system handle figure cross-references, or does that require invoking Quarto?

Thanks, @nontroppo, for expanding on these issues. My impression is that most scientific and medical journals accept manuscripts in LaTeX (.tex) format, so for that reason it is not exclusively a stepping stone to generating a PDF, I would think. When you submit to journals, do you do so in .tex? Just curious, as it has been a few years since I’ve submitted something to an academic journal.

Well, technically you can submit in .tex for many but certainly not all biomedical journals. My experience however is that this often causes more problems than submitting in the dominant .DOCX submission. Each journal normally requires a very specific TeX template, and each allows a different set of TeX packages. This means a bit of messing around as you submit to different journals. The other major issue against .tex is that most co-authors (well in my field anyway) just get hugely confused when trying to proof the TeX file. For fields like computer science and physics etc. I think .tex is more dominant.

However, one trend I’ve noticed recently is that for the first phase of reviewing, direct PDF submission is becoming common and the exact formatting is becoming less important. Once we were not allowed to insert figures in the text, but now this is common (and a great thing too as a reviewer at least, flipping back and forth PDF pages is a nightmare, having the figure where it is described in the results is far better). So in this case we can use our preferred TeX template, and get co-authors to proof with the PDF. Final submission could then use .tex or .docx (where using markdown intermediates is such a benefit).


On the rare occasions these days that I need to edit a LaTeX file, LyX is the tool of choice. While not as nice an environment as Scrivener, it does guide one through the LaTeX (or, if one is a masochist, raw TeX) options available at the current point.


Thanks for your responses, @nontroppo. To clarify a bit, if I want to follow your advice about UTF-8 encoding, at least with respect to in-text numbers expressed in scientific notation, I assume in Scrivener I would do the following, in general terms:

  1. Choose a default (“No Style”) font that includes the superscript minus (U+207B) and the superscript integers 0 through 9.
  2. When typing the number into the Scrivener editor, use a graphical character selector rather than typing the exponent with the keyboard.

For the Scrivener editor, which supports font substitution, you can really use any font you want. Whatever font Scrivener’s editor substitutes does not affect the underlying Unicode code point, which will get compiled correctly. I still tend to only use fonts that do support scientific notation anyway, but for Scrivener itself this is not necessary.

Well, I use Alfred’s snippet editor, which auto-replaces e.g. !sci4 with ×10⁻⁴, but you can use macOS replacements or the Emoji & Symbols viewer as you prefer. Here are all of the numbers and - for you to add to whatever replacements system you want to use: ×10⁻¹²³⁴⁵⁶⁷⁸⁹⁰ But yes, you need some way to insert the correct Unicode.
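If you would rather generate these strings than type each one into a replacements list, the mapping from ordinary digits to superscript code points is simple enough to script. A minimal sketch, where `sci_exponent` is a hypothetical helper name; it assumes the script is saved as UTF-8.

```shell
# Hypothetical helper: turn an ASCII exponent such as -48 into the Unicode
# superscript form and prefix it with ×10.
sci_exponent() {
  exponent=$(printf '%s' "$1" | sed \
    -e 's/-/⁻/g' -e 's/0/⁰/g' -e 's/1/¹/g' -e 's/2/²/g' -e 's/3/³/g' \
    -e 's/4/⁴/g' -e 's/5/⁵/g' -e 's/6/⁶/g' -e 's/7/⁷/g' -e 's/8/⁸/g' \
    -e 's/9/⁹/g')
  printf '×10%s\n' "$exponent"
}

sci_exponent -48   # prints ×10⁻⁴⁸
```

The output can be pasted into Alfred, macOS text replacements, or straight into the Scrivener editor.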
