Table column sizes

xiamenese · September 27, 2023, 4:59pm

I think this might be very useful to me, but as a complete newcomer to all this, where should I install the lua filter code? Do I need to do anything else to enable it?

Mark

nontroppo · September 28, 2023, 1:20am

I get an error when I try with docx output:

▶︎ pandoc -s -o list-table.docx -L list-table.lua --verbose list-table.md
[INFO] Running filter /Users/ian/.local/share/pandoc/filters/list-table.lua
[INFO] Completed filter /Users/ian/.local/share/pandoc/filters/list-table.lua in 43 ms
Encountered unassigned table cell
Exception: pandoc exited with 63

▶︎ pandoc -s --pdf-engine=xelatex -o list-table.pdf -L list-table.lua  --verbose list-table.md 
[INFO] Running filter /Users/ian/.local/share/pandoc/filters/list-table.lua
[INFO] Completed filter /Users/ian/.local/share/pandoc/filters/list-table.lua in 5 ms
[makePDF] temp dir: ... OK

LaTeX, LibreOffice and HTML all seem fine. Typst also works but some features like row spans are not supported. This seems docx specific? I reported it as an issue, and in the meantime ODT seems like the best intermediate to get to DOCX.

Pandoc has a folder where you can store filters, templates and other stuff. This is usually /Users/xxx/.local/share/pandoc – you’ll remember that from the other thread as that is where I usually put my bibliography files. If you save that filter into /Users/xxx/.local/share/pandoc/filters/list-table.lua then pandoc will know where it is. To activate it you need to tell pandoc to use it. Above you’ll see the -L option, so for example:

> pandoc -L list-table.lua -o out.odt input.md

This will read input.md and use the list-table.lua filter to output to out.odt. Pandoc knows where list-table.lua is because you put it in the pandoc filters folder, but you could put it anywhere else as long as you specify the path. CSL files go in csl, template files go in templates etc. I use github to manage my pandoc folder, and you can see mine here: GitHub - iandol/dotpandoc: Pandoc Data directory, including customised filters and templates for producing multiple outputs for academic content. For a Scrivener workflow that uses just pandoc, exactly the same code can be put in the post-processing pane. For Quarto workflows, this needs a bit more tweaking (it may need an extension folder or something; @bernardo_vasconcelos knows more about this).
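A minimal sketch of that folder layout, assuming the data directory is the default one described above. The function name make_pandoc_dirs and the DATA_DIR variable are made up for illustration; pandoc --version should report the data directory pandoc actually uses on your machine.

```shell
# Sketch only: DATA_DIR assumes pandoc's default user data directory;
# override it if `pandoc --version` reports a different location.
DATA_DIR="${DATA_DIR:-${XDG_DATA_HOME:-$HOME/.local/share}/pandoc}"

# Hypothetical helper: create the subfolders mentioned above.
# mkdir -p is safe to re-run because it skips folders that already exist.
make_pandoc_dirs() {
  mkdir -p "$1/filters" "$1/csl" "$1/templates"
}

# If pandoc is installed, show the data directory line from its version info.
if command -v pandoc >/dev/null 2>&1; then
  pandoc --version | grep -i "data directory" || true
fi
```

Calling make_pandoc_dirs "$DATA_DIR" would then create filters, csl and templates in one go.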

nontroppo · September 29, 2023, 12:28am

DOCX output doesn't seem to work. · Issue #3 · pandoc-ext/list-table · GitHub – it seems this bug only triggers for missing cells, so make sure no row in your list is missing a cell line and it should then work with docx…
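One rough way to spot a row with a missing cell before converting to docx is to count the cells per row. This is only a sketch: distinct_row_widths is a made-up helper, it assumes the simple "* - " / "- " layout used in this thread, and it will miscount tables with nested lists or cells that deliberately span columns.

```shell
# Hypothetical helper: print how many different row lengths a list-table has.
# A result of 1 means every row has the same number of cells.
distinct_row_widths() {
  awk '
    /^[ \t]*\*[ \t]+-[ \t]/ { if (n) print n; n = 1; next }  # "* - " starts a new row
    /^[ \t]*-[ \t]/         { if (n) n++ }                   # "- " adds a cell to the row
    END { if (n) print n }
  ' "$1" | sort -u | wc -l | tr -d ' '
}
```

For example, distinct_row_widths test.md printing anything other than 1 suggests at least one row is shorter or longer than the others.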

xiamenese · October 3, 2023, 5:15pm

Slowly getting there on this one. I’ve downloaded list-table.lua, which currently resides at

/Users/xxxxxx/Desktop/list-table.lua

And I’ve made a folder at

/opt/homebrew/Cellar/pandoc/filters (this being my M1 MBA, that is the path resulting from brew pandoc).

What would be the cp string to copy list-table.lua to the filter location? Should I set any flags? Should I cd to Desktop and do the command from there, or what? Once I’ve got this straight I’ll know how to copy or move other files around as needed.

TIA

Mark

nontroppo · October 6, 2023, 6:32am

I’m currently trekking around North Vietnam (what a beautiful country!!!), so not much web access. The standard pandoc folder does not belong in the homebrew folders. It is at /Users/xxxxxx/.local/share/pandoc/filters, where xxxxxx is your user name. Note that ~ means your user’s home folder, so you can type ~/Desktop for your Desktop folder etc.

So make the filters folder that Pandoc uses with this command:

mkdir -p ~/.local/share/pandoc/filters

and then copy the file from your desktop using:

cp ~/Desktop/list-table.lua ~/.local/share/pandoc/filters

Now Pandoc can use this filter via the -L command-line option without needing to specify a path. You can test using this text saved to test.md:

:::list-table
   * - row 1, column 1
     - row 1, column 2
     - row 1, column 3

   * - row 2, column 1
     - row 2, column 2
     - row 2, column 3

   * - row 3, column 1
     - row 3, column 2
     - row 3, column 3
:::

And this command:

pandoc -t html -L list-table.lua test.md

You should see HTML output to the terminal window:

▶︎ pandoc -L list-table.lua -t html5 test2.md
<table>
<thead>
<tr class="header">
<th>row 1, column 1</th>
<th>row 1, column 2</th>
<th>row 1, column 3</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>row 2, column 1</td>
<td>row 2, column 2</td>
<td>row 2, column 3</td>
</tr>
<tr class="even">
<td>row 3, column 1</td>
<td>row 3, column 2</td>
<td>row 3, column 3</td>
</tr>
</tbody>
</table>

xiamenese · October 6, 2023, 2:02pm

Hi Ian. Thank you very much. Glad you’re enjoying Vietnam; I never made it there, sadly.

OK, got it about the cp command. But when it comes to the path, Homebrew has installed Pandoc at /opt/homebrew/Cellar/pandoc/ not at /usr/local/share — which merely contains “man” and “texmf” nor at /usr/local/bin/ where pandoc is “not a directory”.

So, I’m somewhat confused to say the least, but I might try putting it in the filters directory I’ve created and see what happens.

The Homebrew site very clearly says it has to be in opt on Apple Silicon.

Mark

nontroppo · October 6, 2023, 3:46pm

Mark, you are confusing the installation folder with the data/support folder; these are totally separate. It is like a macOS app that installs to /Applications/MyApp.app but keeps its data folders in ~/Library/Application Support/MyApp/ or ~/Documents/MyApp/ – the filters and other bits like CSL files for a bibliography are considered data. For example, for Scrivener your Support folder is usually found in ~/Library/Application Support/Scrivener, and for Pandoc it is found in ~/.local/share/pandoc.

But don’t forget filters can be stored anywhere, for example you could make a Desktop Filters folder and use that like so:

> pandoc -L /Users/xxx/Desktop/Filters/list-table.lua -t html5 test.md

You pass what is called an absolute path, one that gives the full path starting from the root folder /. You can use any file this way.
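If you are not sure what the absolute path of a file is, the shell can work it out. A small sketch, where abs_path is a made-up helper name and the file's folder is assumed to exist:

```shell
# Hypothetical helper: turn a relative file path into an absolute one.
# It changes into the file's folder inside a subshell, so your own
# working directory is left untouched.
abs_path() {
  (cd "$(dirname "$1")" && printf '%s/%s\n' "$(pwd)" "$(basename "$1")")
}
```

From inside a Filters folder, abs_path list-table.lua would print the full path that can be handed to -L.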

xiamenese · October 6, 2023, 5:18pm

As I said, confused I am!

I know about data support folders when using a straightforward Mac app; it’s having to do this through the command line where it’s not automatic that got me confused.

I’ll set about making the appropriate folder in due course. Other things needing my attention at the moment.

Cheers

Mark

xiamenese · October 7, 2023, 11:18am

Right; got it; that worked… I had to specify the path to where I’d saved test.md and I got the result in terminal. So, another small step for an elderly explorer; lots more to go!

Mark

xiamenese · October 10, 2023, 1:34pm

Yet another small step completed!

I’ve worked out the command line string to take sample.md downloaded from the list-table-lua website to produce sample.html as a standalone file.

Next step is to output PDFs with it, but I got the alert:

pdflatex not found. Please select a different --pdf-engine or install pdflatex

I used brew install pdflatex which resulted in:

==> Auto-updated Homebrew!
Updated 2 taps (homebrew/core and homebrew/cask).
==> New Formulae
jupyter-r                                python-psutil
==> New Casks
low-profile

Warning: No available formula with the name "pdflatex".
==> Searching for similarly named formulae and casks...
Error: No formulae or casks found for pdflatex.

I installed TinyTex when I installed Quarto; on the Quarto site it said to use

quarto install tinytex --update-path

to get TinyTex registered on the system path. So I did that and it went through fine with Update successful returned but nothing about system path. When I retried to create the PDF, I got the same alert:

pdflatex not found. Please select a different --pdf-engine or install pdflatex

What to do now? Do I have to find pdflatex and download it, then move it to a specific location?

Mark

nontroppo · October 11, 2023, 1:22am

Hi Mark, congratulations on that first step, you are most of the way there!!!

pdflatex is part of a bigger package: LaTeX. You cannot install pdflatex by itself; it needs the rest of the TeX distribution. As you smartly surmised, you have several routes to install this:

Homebrew

For homebrew you use brew install mactex for the whole ~5GB, or …
brew install basictex for a small ~200MB core installation.

I use BasicTeX BUT the caveat is it doesn’t contain some of the packages that Pandoc needs, so once installed you have to use the TeX installer command to install the packages (you can see that command in the scrivomatic docs[1]). Maybe for you this is too much bother, and the full installer is probably the better option.

Direct

Go to the mactex page and download the full or basic packages. This will be the same installer that homebrew uses, but homebrew automates this for you and if there is a new version it can update.

More Packages - MacTeX - TeX Users Group – BasicTeX
MacTeX - TeX Users Group – full MacTeX

Quarto install

Quarto allows you to automatically install TinyTeX (which is similar to BasicTeX), but this is for use by quarto, so by default its path is not added to the terminal. TinyTeX is installed to /Users/xxx/Library/TinyTeX and the commands are in the bin subfolder, e.g. for me: /Users/ian/Library/TinyTeX/bin/universal-darwin/ – add this path to your shell and I suspect pandoc would then work (personally I haven’t tested this).

To prepend this path to your shell run this command:

export PATH="$HOME/Library/TinyTeX/bin/universal-darwin:$PATH"

Then confirm:

echo $PATH
which pdflatex
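A slightly more defensive variant of the same idea, as a sketch: it only touches PATH if the folder exists and is not already listed. prepend_path is a made-up helper name, and the TinyTeX folder is the one assumed above.

```shell
# Hypothetical helper: prepend a folder to PATH only if it exists
# and is not already part of PATH.
prepend_path() {
  [ -d "$1" ] || return 0
  case ":$PATH:" in
    *":$1:"*) ;;              # already present, nothing to do
    *) PATH="$1:$PATH" ;;
  esac
}

# Assumed TinyTeX location from the post above; adjust if yours differs.
prepend_path "$HOME/Library/TinyTeX/bin/universal-darwin"
export PATH
```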

Quarto complete

Above we used quarto only to install TinyTeX, but quarto also includes pandoc, so you can replace pandoc with quarto completely; then you don’t need to bother with the path, as quarto knows where it is. BUT I’m not sure how normal lua filters like list-table.lua work with quarto yet, so not sure how feasible that is for this tool specifically. In general I think quarto is great.

DON’T DO IT

A final option is don’t bother with LaTeX at all — while LaTeX is an amazing tool and can produce amazing final layout, it does take some getting used to. There are other PDF generation engines that are easier to get started with. One example is PrinceXML (brew install prince) which converts HTML to paged PDF, but it is non-free for commercial use. Another is Typst (brew install typst), a new kid on the block with much potential but still being heavily developed. Another route is to use ODT (brew install libreoffice) or DOCX (brew install microsoft-word) and then generate the PDF from there. Pandoc supports all these and more, so the markdown routes still give you so much flexibility.
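To see which of these engines is already installed, something like the following could be used. It is only a sketch: first_available is a made-up helper, and the list of names is just the engines mentioned in this thread, each of which pandoc accepts as a --pdf-engine value.

```shell
# Hypothetical helper: print the first command from the list found on PATH.
first_available() {
  for cmd in "$@"; do
    if command -v "$cmd" >/dev/null 2>&1; then
      printf '%s\n' "$cmd"
      return 0
    fi
  done
  return 1
}

# Check the engines discussed above, in an arbitrary order of preference.
if engine=$(first_available xelatex pdflatex typst prince); then
  echo "PDF engine available: $engine"
else
  echo "No PDF engine found on PATH"
fi
```

The name it prints could then be passed to pandoc as --pdf-engine=NAME.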

[1] here is the command to get basictex to install packages that pandoc and quarto need:

sudo tlmgr install lualatex-math luatexja abstract \
latexmk csquotes pagecolor relsize mdframed needspace sectsty \
titling titlesec preprint layouts glossaries tabulary soul xargs todonotes \
mfirstuc xfor wallpaper datatool substr adjustbox collectbox \
sttools wrapfig footnotebackref fvextra zref \
libertinus libertinus-fonts libertinus-otf threeparttable \
elsarticle algorithms algorithmicx siunitx bbding biblatex biber ctex \
stackengine xltabular booktabs orcidlink \
ltablex cleveref makecell threeparttablex tabu multirow \
changepage marginnote sidenotes environ fontawesome5 \
tcolorbox framed pdfcol tikzfill luacolor lua-ul xpatch selnolig

xiamenese · October 11, 2023, 1:02pm

Again thanks so much. First off, I tried this route:

Provided I immediately run the pandoc command it works. However if I close the terminal window and run the pandoc command without doing all that first, it again says pdflatex not found. So I’ll think about whether to install BasicTeX and the packages, or whether to simply install MacTeX (both machines have 1TB SSDs, so space is not a problem), or try PrinceXML, which I am unlikely to use for any commercial purposes.

That was going to be one of my questions a bit further down the line. Having got this set up and working, albeit through the command line, it seems to make using tables much easier, but being able to integrate it into a Quarto workflow was my hope. My one thought about it at the moment is how to remove the borders, other than by editing the HTML; essentially I am looking for a way to do in MMD, etc. what I have been doing using tabs in RTF.

The last thing I want to do is install Microsoft Word! But I know I could pandoc to DOCX and then open that in Nisus Writer Pro or perhaps even better Pages and generate the PDF from there. In a sense, that is the easy route, but there is the geeky part of me that would like to get the results that I want, whether HTML or PDF, from an integrated workflow without opening a traditional word processor.

So, I’ll carry on working on this, creating my own data, then exploring how to remove the borders. After that, if I’m successful, it’ll be seeing how to integrate it with Scrivener compile; I know you have done much work on that for both Pandoc and Quarto, so hopefully I’ll be able to work out some (maybe even much) of it for myself from the existing compile formats.

Mark

nontroppo · October 11, 2023, 2:30pm

Yes, that is by design, you can change variables in a session but not impact future sessions. If you want a change to persist, then you edit the shell configuration file. By default macOS now uses zsh and so you would edit ~/.zshrc in a plain text editor. You would add that export ... line at the end of that file and now every time you open a new terminal, zsh will load that file and your $PATH will always contain the TinyTeX path… You can even append this line directly to .zshrc without opening an editor, this terminal command will add the line to your config for you:

echo 'export PATH="$HOME/Library/TinyTeX/bin/universal-darwin:$PATH"' >> ~/.zshrc
source ~/.zshrc

Now every time you open a terminal, the path is set up. This is basically what the installers do for you, just set up the path, but it is easy enough to do this yourself…
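Running an append command twice adds the line twice, so a small guard can help. A sketch, with append_once as a made-up helper name:

```shell
# Hypothetical helper: append a line to a file only if that exact
# line is not already in it.
append_once() {
  file=$1
  line=$2
  touch "$file"
  # -x matches whole lines, -F treats the text literally (no regex).
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

Usage would look like append_once ~/.zshrc 'export PATH="$HOME/Library/TinyTeX/bin/universal-darwin:$PATH"' with single quotes on the outside, so that $HOME and $PATH are written literally and only expanded when zsh reads the file.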

Filters can be used in quarto, but I just haven’t looked into how to do it as the “preferred” method is to use filters wrapped up as an extension (sort of plugin). I doubt this is going to be an issue…

Regarding table borders: no borders (there is a line on the header and legend in PDF at least) is pretty much the default, I think…

xiamenese · October 11, 2023, 9:42pm

I’m pretty much getting there with my own data, Ian, that is to say:

I can compile from Scrivener to .md using Pandoc syntax and then use the command line to produce HTML from there.
Using header-rows=0 produces my table without the header row with a border before, which is what I want.
But there is a border before and after the table, which comes from this in the HTML:

 tbody {
      margin-top: 0.5em;
      border-top: 1px solid #1a1a1a;
      border-bottom: 1px solid #1a1a1a;
    }

Removing those lines removes the two borders for HTML.

But what I would like is not have them created at all in the processing so that eventually I will be able to go straight to PDF without them appearing; I cannot see, either in the output.md or in list-table.lua, what causes them to be created.

Anyway, the next step I need help with is setting up the compile format so that it calls Pandoc from the command line. But there I am stuck at the moment with what to put in Path; I presume the arguments are -L list-table.lua -o -t html, but when I try the path to Pandoc /Users/mark/.local/share/pandoc/ with or without the final slash it fails, saying pandoc is a folder. None of the processing panes on other compile formats are any help, as they refer to the Multimarkdown instance embedded within Scrivener. I need to look into pandocomatic, I guess.

In the meantime, I’ll decide what to do in the PDF creation line.

Mark

nontroppo · October 12, 2023, 5:47am

Pandoc uses templates for each output format. You can modify the defaults as a new template, then give it to pandoc to use instead of its default. An alternative is to inject your updated CSS at the top of the document using header-includes: Pandoc manual: HTML Variables gives an override example.
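As a sketch of the injection route from the command line: pandoc's -H (--include-in-header) option inserts a file's contents into the HTML head. The file name no-borders.html is made up, the selector simply mirrors the tbody rule quoted earlier in the thread, and this assumes the included block lands after pandoc's built-in styles so that it wins.

```shell
# Write a small CSS override (no-borders.html is a hypothetical file name).
cat > no-borders.html <<'EOF'
<style>
  tbody { border-top: none; border-bottom: none; }
</style>
EOF

# Only attempt the conversion if pandoc and the test file are both present.
if command -v pandoc >/dev/null 2>&1 && [ -f test.md ]; then
  pandoc -s -L list-table.lua -H no-borders.html -o test.html test.md \
    || echo "pandoc run failed"
fi
```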

For Scrivener’s post-processing pane:

Path is only for where the pandoc executable is. Arguments are where the magic happens. Note Scrivener has <$outputname> and <$inputfile> placeholders, which are defined by the compiler filename you give to Scrivener. I also set --verbose to give more feedback. >Scriv.log 2>&1 is optional but allows any output to be stored in a log file, so if there is a problem you can look at that file.
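Two small terminal checks that may help here, as a sketch. The first prints where the pandoc executable lives (with Homebrew on Apple Silicon that is typically under /opt/homebrew/bin). The second shows what >Scriv.log 2>&1 does, using a made-up demo function in place of pandoc.

```shell
# Print the full path of the pandoc executable; this, not the data
# folder, is the kind of value a Path field expects.
command -v pandoc || echo "pandoc is not on PATH in this shell"

# Hypothetical stand-in for a command that writes to both output streams.
demo() { echo "normal output"; echo "error output" >&2; }

# >Scriv.log sends normal output to the file; 2>&1 sends errors there too.
demo >Scriv.log 2>&1
cat Scriv.log
```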

Here is a little bonus for you if you have installed prince (brew install prince), duplicate your compiler format and name it princexml and modify the Arguments:

That would get you a PDF, not optimised yet but as a starter…

xiamenese · October 12, 2023, 6:06am

Here I am, having woken up in the very early hours of the morning with the realisation that the path I was using was the one to the support folder, not the executable! <duhh!>

But thank you for the heads-up on the two scrivener variables (if that’s the right word) I’ll get on to it all later.

Mark

xiamenese · October 12, 2023, 3:46pm

Woohoo!

Installed Prince, set up a Prince-XML compile format, compiled the same test page(s) and got a PDF which replicates the HTML (including the unwanted borders). It came with a warning at the end of the log file:

prince: ./toPdfViaTempFile5545-0.html: warning: unsupported properties: gap, overflow-x

but I have no idea to what it is referring! It also looks as if it is defaulting to US-Letter in terms of layout, rather than A4.

However it’s another good step forward, so now I must immerse myself in Pandoc templates/header-includes and Prince settings.

Mark

nontroppo · October 13, 2023, 2:21am

The warning is just to mention it dropped some CSS, nothing to worry about.

So you have a few ways to tweak the output. Perhaps the easiest is that you use metadata doc stored in Scrivener’s binder and add header-includes there:

This is from my Typst template but I also have a simple PrinceXML compile in the same project. This injects CSS to override any defaults. I attach that project so you can see how it is set up – but note that table has no borders ;-).

Typst.scriv.zip (323.9 KB)

xiamenese · October 13, 2023, 3:54pm

Once again I’m deeply in debt to you.

I’ve been looking at the Typst.scriv project; that is just the kind of thing that is most helpful.

The arguments were interesting; I have installed pagebreak.lua as I think that will be very useful and is something I hadn’t yet thought about.
I have also installed Typst so that I can see the difference between both outputs. You’re right, the Typst does look better, the only downside I can see being the little table is fully bordered. As you pointed out, the PrinceXML output table has no borders, though in other ways it doesn’t look quite so good. So something to think about there.
For the metadata file as front matter, I think the simplest thing for me will be to make a blank Scrivener template to remove all in the draft except the empty bibliography folder (till I get round to working on that side of things) and then just import my existing test-project data into that. But a simple (I hope) question: can I simply comment out lines like the bibliography… and csl… lines by putting # + space at the head of the line? That way, it would be easy to re-instate them with appropriate paths in due course.

So, onward and upward as they say!

Mark

nontroppo · October 14, 2023, 12:21am

For both Typst and Prince, the table design and any other element should be fully tweakable. For Typst I haven’t yet looked at how to do it. The reason Prince looks less “finished” is just that I only have a very basic CSS segment. Pandoc can fully configure the template for both Typst and HTML, here are the templates:

For example for Typst this is the code that lays out the author block:

github.com/jgm/pandoc-templates/blob/master/template.typst#L34

            margin: margin,
            numbering: "1",
          )
          set par(justify: true)
          set text(lang: lang,
                   region: region,
                   font: font,
                   size: fontsize)
          set heading(numbering: sectionnumbering)
          
          if title != none {
            align(center)[#block(inset: 2em)[
              #text(weight: "bold", size: 1.5em)[#title]
            ]]
          }
          
          if authors != none and authors != [] {
            let count = authors.len()
            let ncols = calc.min(count, 3)
            grid(
              columns: (1fr,) * ncols,

But I didn’t like that layout, so copied that file to my pandoc data directory and edited it to my taste: dotpandoc/templates/template.typst at master · iandol/dotpandoc · GitHub . That is just one example, but much more is possible… Typst is still very young as a project which may be a concern, but by using Pandoc as the source you can always switch to another output without having to change your project itself.
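A sketch of how a default template could be saved for editing. pandoc -D FORMAT prints the built-in template, and pandoc also searches the templates subfolder of the data directory when given --template. The helper name save_default_template and the TEMPLATE_DIR variable are made up, and the folder is the one assumed throughout this thread.

```shell
# Assumed location: the templates subfolder of the pandoc data directory.
TEMPLATE_DIR="${TEMPLATE_DIR:-$HOME/.local/share/pandoc/templates}"

# Hypothetical helper: save pandoc's built-in template for a format
# as template.FORMAT, ready to be edited.
save_default_template() {
  format=$1
  command -v pandoc >/dev/null 2>&1 || { echo "pandoc not found" >&2; return 1; }
  mkdir -p "$TEMPLATE_DIR"
  # -D / --print-default-template writes the default template to stdout.
  pandoc -D "$format" > "$TEMPLATE_DIR/template.$format"
}
```

After save_default_template typst and some editing, the result could be selected with --template=template.typst.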

For Prince, CSS can really do just about anything, you can see some examples of what is possible here:

Yes. Another option is to duplicate the file and keep it in the binder as a “template”…
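Commenting out works because YAML ignores anything after a # at the start of a line. If the metadata lives in a plain file rather than the binder, the same edit could be sketched with sed. comment_out_keys is a made-up helper, and it only handles single-line bibliography: and csl: entries.

```shell
# Hypothetical helper: print a metadata file with any top-level
# bibliography: or csl: line prefixed by "# ", leaving other lines alone.
comment_out_keys() {
  sed -E 's/^(bibliography|csl):/# &/' "$1"
}
```

The output goes to the terminal, so the original file is left unchanged unless you redirect it into a new file.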