Any interest in a MultiMarkdown syntax for indexing?

I suspect this is something of interest primarily to non-fiction writers, but there seem to be a few of those around here.

I have long been considering a MultiMarkdown syntax for indexing (and perhaps a glossary) within LaTeX (it already handles the table of contents, lists of figures and tables, and the bibliography).

Before I spend too much time on this, I wanted to get a feel for how people currently manage indexing. Use a computer program to do the bulk of it automatically? Do it by hand? Leave it to somebody else? Don’t do it at all?

I’m thinking of something like a way to tag a word or phrase as “this should be in the index”, but I suspect that many people would then want a way to be more specific (LaTeX indexing offers a lot of options).

As for the glossary, I am thinking about a way of flagging a footnote as a definition that should be included in the glossary, if a glossary is available. If not, then it would still be displayed as a footnote.

(Oh, and I haven’t forgotten about the poetry mode… I might have figured out a way to make that useful and easy. There might also be an upcoming way to make it easier to manage images with Scrivener when working on MMD documents. I’m still trying to convince Keith it could be done easily, but in a way that would still be useful. If that doesn’t work, there might still be a way to automate it somewhat. Now, if I could just figure out a better way to manage BibTeX bibliographies within Scrivener… :slight_smile:)

What would be helpful for me is to get feedback about whether this would be useful for anybody, and what your usual process is. I want to try and come up with something that works with your usual flow, rather than against it.

Thanks!!!

Fletcher

A simple indexing/glossary syntax in MMD would be useful to me. I currently do indexing in LaTeX, which is pretty painless, but it would be a pleasure to have it available for other target languages as well.

What would also be useful (though Keith would probably, understandably, hate it) is TextMate-style MMD syntax coloring, so we could easily spot index tags and the like.

An unrelated matter:

Is there provision in MMD (and XHTML) for escaping portions of text intended as markup for the target format interpreter? If so, wouldn’t it make sense for Scrivener to support this by hiving off such code (say a displayed equation in the LaTeX math environment) either in-line, like an annotation, or in a subfile linked to an anchor in the main text?

S

No, no and… er, no. :slight_smile: Scrivener is not a syntax-checking text editor. It is a rich text editor with added support for MMD. The current support for MMD is more than enough for me. I really don’t want Scrivener to become a MultiMarkdown tool - that it is decidedly not. It just has basic support for MMD…

In my current non-fiction book, index syntax and glossary syntax would be very valuable, as would manual cross-referencing – as I’ve mentioned before. Currently, I am using Scrivener’s named highlighter tool to mark words for the index and glossary, with the full knowledge that down the road I’ll have to do a lot of syntax leg-work. Fortunately, the new ability to step through highlights by type across the entire project will make this a lot easier.

What would be nice for glossaries is the ability to have something like the way footnotes currently work in MMD. You could have a visible text associated with an optional key, that can easily be combined with the syntax for indexing. Lacking the key, the term would be used instead. Example:

[code]This {term} will link to “Term” in the glossary.

This {TNT:trinitrotoluene} will link to “Trinitrotoluene” in the glossary.

This ^word will be in the index, and this {^word} will be in the glossary and the index. Meanwhile, this {^TNT:trinitrotoluene} would index and glossary link to “Trinitrotoluene”.

And this would merely index ^TNT:trinitrotoluene as you would expect.[/code]

I am sure there are a number of flaws in the syntax examples I used. I mean them to be an abstract communication, not a literal suggestion. While I cannot think of any times when I would want something in the glossary and not in the index, I am sure others would, so the ability to create a glossary list without an index list might be useful.
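To make the idea a little more concrete, here is a minimal Python sketch of how a parser might read the tags in the examples above. Everything in it is an assumption made for illustration: the `scan` function, the regular expression, the restriction to single-word labels and keys, and the choice to upper-case the first letter of every key. It is not part of MMD.

```python
import re

# Hypothetical pattern for the sketch syntax: braces mark a glossary term,
# a caret marks an index entry, and "label:key" supplies an explicit key.
TAG = re.compile(r"\{(\^?)(\w+)(?::(\w+))?\}|\^(\w+)(?::(\w+))?")

def _key(label, key):
    """Use the explicit key if present, else the label; upper-case the first letter."""
    chosen = key or label
    return chosen[0].upper() + chosen[1:]

def scan(text):
    """Return (plain_text, glossary_keys, index_keys) for the sketch syntax."""
    glossary, index = [], []

    def replace(match):
        if match.group(2) is not None:      # braced form: {term} or {^term}
            caret, label, key = match.group(1), match.group(2), match.group(3)
            glossary.append(_key(label, key))
            if caret:                       # {^term} goes in the index as well
                index.append(_key(label, key))
        else:                               # bare caret form: ^word
            label, key = match.group(4), match.group(5)
            index.append(_key(label, key))
        return label                        # only the visible label stays in the text

    return TAG.sub(replace, text), glossary, index
```

Calling `scan("This {TNT:trinitrotoluene} and this ^word.")` would leave "TNT" and "word" in the running text and file them under "Trinitrotoluene" (glossary) and "Word" (index). Multi-word phrases would need some delimiter that this sketch does not attempt.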

A document that contained just your glossary links could then be created in Scrivener and placed in the export stream, just as you could do with citations now.

Another thing that might be useful is a cluster definition. With this, I could set up another special document in the export stream that has clusters of words and acronyms which should all be indexed under one key word. From the above example, I could have a line that said, “Trinitrotoluene,TNT,Dynamite”, making any indexed reference to these words automatically synonymous with the first term in the list, avoiding the need for potentially repetitious label->key associations in the text. Some form of ad hoc method would still definitely serve a purpose, however.
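A rough Python sketch of the cluster idea, just to show how little machinery it would need. The function names and the case-insensitive matching are my own assumptions, not anything MMD provides.

```python
def load_clusters(lines):
    """Map every term on a comma-separated line to the first term on that line."""
    canonical = {}
    for line in lines:
        terms = [term.strip() for term in line.split(",") if term.strip()]
        for term in terms:
            # Lower-case lookup keys so "tnt" and "TNT" land in the same cluster.
            canonical[term.lower()] = terms[0]
    return canonical

def index_key(word, canonical):
    """Return the key a tagged word should be indexed under."""
    return canonical.get(word.lower(), word)
```

With the line from my example loaded, `index_key("TNT", clusters)` and `index_key("Dynamite", clusters)` would both come back as "Trinitrotoluene", while a word that appears in no cluster is indexed under itself.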

There is no such thing currently. I try to make the markup as “target-format” neutral as possible. In other words, MMD should not have to know anything about what your ultimate format is going to be. Now there are a few subtle exceptions to this rule, and metadata is exempt.

Individuals have written scripts of their own to pass certain text through the process as pure LaTeX code, or whatever (Dr. Drang has done some work along these lines, google his name with MultiMarkdown). As far as I am concerned, if there are enough requests for the same thing, I try to see if it can be implemented in the MMD syntax.

I am open to being persuaded otherwise, but I would like to keep MMD as a somewhat “complete” syntax, rather than design it purely as a means to get to some other format. Basically this means that anything that MMD “understands” should be expressible in some way as XHTML.

F-

:laughing:
Yep, I knew you’d hate it. Quite rightly, too. Both suggestions were in bad taste; I’m just a little infatuated with the possibilities of MMD in the Scrivener environment.

S

I had off-handedly mentioned something like this to Keith earlier as well. Not going to happen, which is fine.

But what you can do is use TextMate as the editor for your Scrivener documents, if you are working in plain text (like you would for MMD). Search for cocoa text field and TextMate, and you should find the page describing the instructions on how to do this.

I have included it in my newer user’s guides to MMD and Scrivener (updates to be released sometime soon).

It works well, but does seem to break Scrivener’s undo stack, so you have to be careful. But it is nice to be able to edit the text in a Scrivener document in a dedicated text editor, and in this manner you do get access to TextMate’s syntax highlighting, italics, bold, etc. Plus you can use my MMD bundle (new version to be released soon) that makes typing MMD syntax even easier than it already is.

Right now, I sort of have a Scrivener/TextMate/MultiMarkdown/BibDesk writing system, with each application being used for what it does best.

And maybe in a few years after Keith has made his fortunes and he sees the open-source light, then we can go in and tinker and make a “MultiMarkdown-enhanced” version of Scrivener that combines all of this into a “killer app” for text publishing… :wink:

Thank you, Fletcher. That sounds like the kind of workflow I could get used to. I look forward to the new user guides and TextMate bundle!

How deep is the problem with Scrivener’s undo stack? Is this something that you and Keith anticipate being able to fix before v. 1.0?

I mentioned this in another thread, but it bears repeating here: As a technical and fiction writer, I would prefer not to see indexing added to MMD.

I would rather see the creation of a richer XML vocabulary that MMD documents would translate into as a medium, and then from there go to XHTML and LaTeX via stylesheets. That way, you could provide indexing tags via such a vocabulary without touching anything in terms of basic syntax.

For example, this<mmd:index title=“An example” /> could be the way index tags are added.

Once this exists, I would expect an option in Scrivener to make certain families of tags invisible, like index tags, or annotation tags, or custom-hyphenation tags.

I’ve been thinking about this especially of late as I embark on a .NET parser for MMD (to use within ASP.NET). Rather than try to directly parse MMD into XHTML or LaTeX, I was thinking it would be much, much nicer to use an XML schema that could encompass all usages of both, and then render from this. Such a lingua franca would allow for multiple input formats, and free me from any particular syntax.

I’ll have to look at XML LaTeX this weekend, to see if it’s already headed in that direction.

John

Just to clarify this comment: there is no problem with Scrivener’s undo stack. Fletcher was talking about using another application to edit files that are being used by Scrivener whilst Scrivener is open. This is not recommended. The .scriv bundle contains RTFD files for each text document because this is a very safe way of storing anything. Now, if you open Scrivener and start editing files that are contained within its private document packages in a different app at the same time, you are asking for trouble. So, this is a “use at your own risk” type of thing. But there is no problem with Scrivener’s undo stack, and absolutely nothing to fix here.

I’m not entirely sure that’s true. I don’t know enough about how TextMate pastes the text back into Scrivener to know if this is a problem with TextMate, or if this is a problem with Scrivener.

What I do know is that using this same feature with SubEthaEdit does not destroy the undo stack, which leads me to suspect it’s a problem with Scrivener.

To clarify exactly what I mean:

  • Type some text and then delete a word at the beginning of the text you just typed. You can use “Undo Typing” and “Redo Typing” to make that word disappear and reappear.

  • Edit that document in TextMate using the previously mentioned method and add some new text at the end.

  • Save your change in TextMate, which puts it back in Scrivener.

  • Go back to Scrivener and do several Undos - you won’t be able to make that first word disappear and reappear. All changes made before the TextMate trip seem to have been wiped out.

  • Now do the same thing in SubEthaEdit (or Mail, or other Cocoa apps). Your undo stack is still available, and you can undo changes that occurred prior to the round trip.

  • So basically, if you use TextMate to edit your text,

So, I strongly suspect that this is in fact a problem with Scrivener (and I won’t even mention how it gets compounded by the autosave feature :wink:).

Just for grins, I tried this same process in a Core Data app I have sitting around, and my program is subject to the same bug that Scrivener is. I used Core Data to manage all of my objects, as well as the undo stack. Not sure if these two failures are related, but I have not found a problem in the (admittedly few) other applications I have tested this in.

To clarify - this does not edit the RTF files directly. I assume it is basically a copy and paste from the text editing box in Scrivener to TextMate and back again. But there is nothing nefarious being done behind Scrivener’s back.

Unfortunately, this is not a MMD issue at all. I am happy to test proposed fixes, however.

As for the depth of the problem - you basically lose the ability to undo any changes made before using TextMate once you return to Scrivener. You can undo while you’re in TextMate, but things get lost once you return. So keep in mind Scrivener’s autosave feature, since it can help make your accidental changes permanent before you notice the problem.

And should you accidentally select a new document in Scrivener while TextMate is still working, when you save it gets placed in the newly selected document.

To clarify - that is your fault, not Scrivener’s. But the inability to hit Undo to fix your mistake seems to be a bug in Scrivener, or possibly in Scrivener’s implementation of an Apple routine. As I mentioned above, I can replicate the bug in my Core Data program, but that program has some known issues that I never tracked down. It’s basically for my use only. I suspect it is written in an entirely different fashion than Scrivener.

Another test:

  • Do the same thing as above in scrivener (type some text, then make a change)

  • Copy all

  • Paste into another editor. Make more changes

  • Copy the new text

  • Paste into scrivener

  • You can’t undo before the paste.

This seems to prove it’s not TextMate’s fault, and is clearly a bug in Scrivener, or in Scrivener’s implementation of some Apple code, possibly in the paste routine.

I cannot replicate this at all. I have followed these instructions (in your last post) precisely and I can undo the paste and the stuff I did before it.

Are you sure you didn’t just undo the deletion and then paste over your text? Everything in the undo stack after the point you are at gets destroyed when you do something new, obviously, so if you delete a word, undo the deletion and then paste over it, then as far as any Cocoa app is concerned the new action has destroyed the old deletion in the undo stack. This is the same in TextEdit. At any rate, I cannot reproduce this “bug” you describe.
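To illustrate what I mean about new actions destroying the undone part of the stack, here is a toy Python model of a linear undo stack. It is a generic sketch with made-up names, not how Cocoa or Scrivener actually implements undo.

```python
class UndoStack:
    """Toy linear undo stack: a new action throws away anything already undone."""

    def __init__(self):
        self.actions = []   # actions in the order they were performed
        self.position = 0   # how many of those actions are currently applied

    def do(self, action):
        # Discard the redo tail, then record the new action.
        del self.actions[self.position:]
        self.actions.append(action)
        self.position += 1

    def undo(self):
        """Step back one action and return it, or None if nothing is left."""
        if self.position == 0:
            return None
        self.position -= 1
        return self.actions[self.position]

    def redo(self):
        """Re-apply the next undone action, or None if there is none."""
        if self.position == len(self.actions):
            return None
        action = self.actions[self.position]
        self.position += 1
        return action
```

In this model, if you record "type text" and "delete word", undo once, and then record "paste", the "delete word" entry is gone: `redo()` returns `None` and only "type text" and "paste" remain on the stack.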

That would be a spectacular way of going about things, but I am not sure how easy it would be. I am fairly sure that MultiMarkdown is a super-set of features on top of standard Markdown, rather than a complete re-write of it. Since Markdown was designed to be an XHTML front-end only, that is the underlying language both deal with. MultiMarkdown is able to take it a bit further, but still by using XHTML as the intermediary.

The good news is that you can still add elements to the intermediary and remain valid XML, and even XHTML. For this example, something like TNT would work perfectly, and since the visible element is not embedded in the tag, it will remain useful in browsers and XHTML converters, as well as being useful to XSLT. This is exactly how we are doing annotations. They are in a span which has the annotation class and a colour style applied to it, based on the user’s colour choice in Scrivener for each annotation. The XSLT detects the class type and then passes this colour information on to LaTeX. It looks good on a web page, and in the PDF.

Hmmm… Now I can’t get the bug to reproduce without involving the TextMate feature, either.

But I still stand by my original assertion that Scrivener doesn’t play nice with TextMate’s “Edit in TextMate” feature, unlike other Cocoa apps that I have tested (save my Core Data app that is known to be buggy, especially in the undo department). (And, yes, I did just retest this.)

I see that MMD goes to XHTML (and from there to other formats), but doesn’t that describe current implementations of MMD, rather than MMD as a philosophy? Why not have an intermediary XML schema that Scrivener could rely on, and then use an XSLT to go from that to either XHTML or LaTeX. It seems that XHTML output is a rather procrustean solution based on the fact that the original Markdown was intended mainly for web authors.

After all, MultiMarkdown was born from a lot of features on the input side of Markdown. Perhaps it is time for ScrivMarkdown, to accommodate changes we’d like to see on the output side?

John

I like your idea for the glossary syntax. It allows you to decide whether or not a footnote gets put in the glossary, all in one place. You can easily change your mind simply by adding or subtracting the “Glossary:” bit. I have a suggestion for how it generates a link though.

Some opinion feedback: I think indexing and glossary should be combined. I think it is safe to assume that if you are creating a glossary entry, you will also want to track its usage in the index. If you do not, you probably do not even have an index in the book, and index generation can be deleted from the LaTeX file. No harm done. So some sort of syntax that is simple to type and read that just means index this word – which then only becomes a glossary entry if that word is listed in the footnote section. An example, again using my caret notation (which I stole from Wikipedia, by the way):

[code]This is a ^word that will go in the index.

Here is a second instance of the ^word. And here is ^another one to track.

[^word]: Glossary: A sequence of letters arranged in a meaningful way.[/code]

Okay, not speaking in specifics of programming: The parser would create a footnote link to the definition at the first instance of “word,” but not the second; both get added into the index; and the definition gets added to the glossary for inclusion in the back matter of the book. The word “another,” simply gets added to the index, since there is no corresponding glossary entry.
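If it helps, the behaviour I am describing could be sketched in Python roughly like this. The function name, the regular expressions, and the choice to render the first-instance link as `word[^word]` are all assumptions for the sake of illustration. A real implementation would live in the MMD parser itself.

```python
import re

# Hypothetical patterns: a "[^key]: Glossary: text" definition line, and a ^word tag.
DEFINITION = re.compile(r"^\[\^(\w+)\]:\s*Glossary:\s*(.+)$", re.MULTILINE)
TAGGED = re.compile(r"\^(\w+)")

def process(source):
    """Return (body, index, glossary) for the caret-plus-footnote sketch."""
    glossary = {key: text.strip() for key, text in DEFINITION.findall(source)}
    # Pull the definition lines out first so their ^key is not counted as a tag.
    body = DEFINITION.sub("", source).strip()
    index = {}       # word -> number of tagged occurrences
    linked = set()   # glossary words that have already received their link

    def replace(match):
        word = match.group(1)
        index[word] = index.get(word, 0) + 1
        if word in glossary and word not in linked:
            linked.add(word)
            return f"{word}[^{word}]"   # first instance keeps a footnote pointer
        return word                      # later instances are indexed only

    return TAGGED.sub(replace, body), index, glossary
```

Run over my example, this would count "word" twice and "another" once in the index, link only the first "word" to its definition, and put a single entry for "word" in the glossary.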

It is easy to read, re-uses the footnote syntax for the glossary definition, and doesn’t require any planning. An alternative could be this[^], an empty footnote pointer. Make it so that if an ID is left out, it automatically uses the word to the left of it. That would require even less programming – though it does require a bit more to type and look at. Would that be more backward compatible, or would an empty footnote reference mess up existing parsers?

Just one note, Av: Adding single words to the index is helpful, but rarely very useful. Usually multiple alternatives will be needed, since readers sometimes don’t know exactly which word the author used. They may look up “fight, fear, horror, terror”, all hoping to find the same thing. Some kind of syntax that keys on a primary word, but allows a list of secondary words (and phrases), would be necessary for real-world texts.

John