Making an index?

@nontroppo : For LaTeX and Typst, it also requires some custom tweaking, but certainly automatable with a post-processing script…

For LaTeX anyway, you don’t even need post-processing because all of the listing options are done via typing conventions. For example, this is from the Scapple user manual project:

The two orange highlighted phrases are using the “Index Key” character style, which has LaTeX syntax wrapped around it. In the second example the “see also” format is demonstrated, as well as grouping (text!find), the output of which is shown in the snippet below. As you can see in the resulting page output, the keys themselves are invisible as they should be. Not depicted is the ‘searching’ listing itself, a bit further up in the index, and coming from the first orange word (I do kind of wish that using a see-also reference implicitly added the referred-to key automatically rather than having to add both).

So that is all very straightforward to handle with compiler style presets and suffixes, and since it is simple styled text, it is very easy to work with efficiently in the editor. It should go without saying, but one does not think about any of the above while doing this. The “interface” is no more complicated than making words bold.

What I did when working on the Scapple manual is add a Project Bookmarked document that listed all of the indexing terms I was accumulating, pre-styled, so I could just copy and paste out of the inspector. When I made a new term I would prefix it with a tab in the editor, with the tab given a red highlight, so I would know it hadn’t been addressed yet, as typically these would be thought of while addressing other terms. Once I finished with a term, I would swap the highlight to green and move on to the next red.

It was enough for an amateur, doing a short and simple software manual like Scapple’s, anyway. I don’t know if I’d want to scale that up to something like the Scrivener manual. :tired_face:

Geeky LibreOffice stuff...

This very basic prefix/suffix approach is also how the stock compile format works, since simple indexing in LibreOffice is pretty easy. Here is what the XML looks like for that:

<text:alphabetical-index-mark text:string-value="word" />

This can also be performed entirely with a style prefix and suffix, to add everything around the marked “word”, along with the necessary Markdown to instruct it to inject this directly into the .odt XML output.[1] In other words the source text in Scrivener will look almost identical to the simple LaTeX example (obviously the more complex see-also and grouping stuff wouldn’t be the same).

Thanks to the tip given above by @nontroppo, I created a simple example and found the result to be pretty easy to generate, in terms of the syntax:

<text:alphabetical-index-mark text:string-value="Child"
    text:key1="Grandparent" text:key2="Parent"/>

Thanks to how XML works, it would be possible to pull that off with a single regular expression Replacement, as it wouldn’t be invalid if the key1 and/or key2 attributes are left empty. So let’s say for example we use the LaTeX approach, but with a “:” instead of an exclamation point. Presently the Index Key style has these prefix/suffix settings:

Prefix:

`<text:alphabetical-index-mark text:string-value="strip{{

Suffix:

}}"/>`{=odt}

I then have a Replacement looking for strip\{\{\s*(.*)\s*\}\} and replacing it with just the captured text in the middle ($1), which strips the padding spaces. The style itself adds the strip{{ and }} around the marked text.
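For anyone who wants to try the padding-stripping idea outside of Scrivener, here is a minimal sketch in Python. It only imitates the Replacement step; the function name and the sample string are made up for illustration. One thing I noticed while testing: in Python's `re`, the greedy `(.*)` keeps trailing spaces inside the capture, so the sketch also shows a lazy `(.*?)` variant. That variant is my own assumption, not something taken from the compile format.

```python
import re

# Pattern as quoted above: the greedy (.*) swallows trailing padding.
GREEDY = re.compile(r"strip\{\{\s*(.*)\s*\}\}")
# Lazy variant (an assumption, not from the post): the capture stops
# before the trailing padding, so both sides are trimmed.
LAZY = re.compile(r"strip\{\{\s*(.*?)\s*\}\}")


def strip_padding(text, pattern=LAZY):
    """Replace each strip{{ ... }} wrapper with its captured contents."""
    return pattern.sub(r"\1", text)


# Hypothetical output of the style prefix/suffix around a padded term.
sample = '<text:alphabetical-index-mark text:string-value="strip{{  word  }}"/>'

print(strip_padding(sample))
print(strip_padding(sample, GREEDY))
```

The lazy form also avoids running two index marks together when they sit on the same line, since it stops at the first closing `}}`.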

We would then only need to modify that pattern a little:

strip\{\{\s*((\w+):)?((\w+):)?(\w+)\s*\}\}

Now we are looking for two optional ‘word:’ statements that can appear before the primary ‘word’. This will handle ‘word’, ‘group:word’, and ‘major:group:word’, sorting each of the found components of the hierarchy into the right XML attributes, with this substitution command:

text:string-value="$5" text:key1="$2" text:key2="$4"

A tweak to the original style prefix/suffix, to remove the bits the regex is now generating, and it looks to me like it is good to go!
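To sanity-check the hierarchy regex without compiling, a rough Python equivalent of the Replacement can be sketched as below. The pattern and the attribute order are the ones given above; the function names and the wrapper text in the sample (my guess at what the tweaked prefix/suffix would leave behind) are assumptions for illustration only.

```python
import re

# Same pattern as above: two optional "word:" levels, then the term itself.
INDEX_KEY = re.compile(r"strip\{\{\s*((\w+):)?((\w+):)?(\w+)\s*\}\}")


def to_attributes(match):
    """Build the attribute string, mirroring $5, $2 and $4 in the post."""
    term = match.group(5)
    key1 = match.group(2) or ""  # unmatched optional groups are None in Python
    key2 = match.group(4) or ""
    return f'text:string-value="{term}" text:key1="{key1}" text:key2="{key2}"'


def convert(text):
    """Rewrite every strip{{ ... }} marker found in the text."""
    return INDEX_KEY.sub(to_attributes, text)


# Hypothetical wrapper: assumes the style now only emits the element name.
for key in ("word", "group:word", "major:group:word"):
    print(convert(f"<text:alphabetical-index-mark strip{{{{ {key} }}}}/>"))
```

Note that `\w+` only matches a single run of word characters, so in this sketch a multi-word term such as "index key" would not match and would be left untouched.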

Sample compile format

Here is a prototype of the OpenOffice compile Format that supports hierarchies, that you can give a spin. It will expect you to structure hierarchies like “major:group:term”, “group:term”, or “term”, and mark them with the “Index Key” character style. It seemed to be working pretty well with a few simple tests.

Change notes...

Updates (2024-03-23) v2.1

  • Added support for a few different hierarchical marking styles. One can now use Major/Group/Term, Group:Term or the LaTeX-friendly Group!Term notation.
  • Cleaned up the regex a bit.
  • Added support for inline annotation and inspector comment conversion to native ODT comments.

mmd-openoffice-indexing-v2.1.scrformat (25.0 KB)

I might include that in the official release for a future update.

As for OpenXML (Word), I’d have to look up the specs. It’s a huge mess as I recall (as that schema is in general, a load of garbage to work with). It doesn’t have simple human-readable descriptions, but rather concocts the concept of an index entry from a massively complicated field system that is capable of doing everything from form inputs to boiling water for tea, presumably. Microsoft.


  1. That said, the real implementation is a little more complex, as I wanted to add a regular expression to the conversion, to allow authors to space-pad the terms within the style highlight. It’s not necessary to do that with LaTeX since it discards extraneous spaces, but word processors can be a bit “dumb” about that kind of stuff, showing three spaces between words if you add them. WYSIWYG world problems.
