Compile Choking on Large File

What about digits and words?

Chapter 1 General Provisions

^(Chapter \d\w+)

That’s not a bad idea (there is one small adjustment you’d need to make for that idea to work), but you’d probably want to improve it a bit to make sure it doesn’t also match lines that begin with “Chapter 12 discusses the role of…”, which don’t have any punctuation in them. That might be a factor since you mention OCR, which often results in hard-wrapped lines rather than one long line for paragraphs.

If you can say for sure that there are no such lines, or so few of them you can weed them out later in proofing, then the way to express that would be:

^(Chapter [\w ]+)$

The square brackets let us define a set of valid characters, and the + after them means we need one or more characters from that set. In this case the only three valid types of characters are digits, letters or spaces. The \w includes digits; it means “word-like” characters, and actually also includes underscores (though that probably doesn’t matter for you).
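If you want to see how that pattern behaves before running it on the real text, here is a minimal sketch in Python. Python's re module is only a stand-in here, not the engine Scrivener uses, and the sample lines are invented for illustration; for a pattern this simple I would expect the behaviour to be the same, but treat that as an assumption.

```python
import re

# The pattern from above. MULTILINE makes ^ and $ anchor at each line,
# not just at the start and end of the whole text.
HEADING = re.compile(r"^(Chapter [\w ]+)$", re.MULTILINE)

# Hypothetical sample text, made up for this sketch.
sample = "\n".join([
    "Chapter 1 General Provisions",       # a real heading
    "Chapter 12 discusses the role of",   # hard-wrapped prose, no punctuation
    "the state, and Chapter 3 follows.",  # does not start with "Chapter"
    "Chapter 2 Rights, Duties",           # the comma is outside the set
    "Chapter 3 Final_Provisions",         # underscore counts as \w
])

matches = HEADING.findall(sample)
for line in matches:
    print(line)
```

The second sample line is matched even though it is not a heading, which is the hard-wrapped OCR case mentioned earlier; the line with a comma is skipped because the whole line has to consist of characters from the set.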

I did just think of one other thing though: is this the same project you were using Convert Rich Text to MultiMarkdown on, in the compile settings? If so, putting hashes right into the text may not be the best approach, as Scrivener is going to think you mean those to print verbatim, and escape them so they aren’t interpreted as Markdown. Unless you’ve reconsidered that setting, you might need to take another tactic.

It’s difficult, I still really do not understand how your project works. You’ve said that Pandoc creates a deep table of contents—but how if it’s just a bunch of plain-text documents? There has to be more going on than just that description, for that to happen. And why are there these few headings you have to delete hashes from, and why haven’t those just been fixed in Scrivener?

I wish I knew. This was the result in the last project. I don’t know how it happened; I’m trying to duplicate it with the new project. I can send the recent project file if you want to look at it.

Okay! Well, I am surely getting wires crossed if there are two projects going on at once. Sure, if you can send a copy, that would help a lot. I could probably cook up a checklist for you that would help get it shipshape, and if I can reproduce the hang, that would be nice too.

If anything it would be first-party since the regex engine is straight out of the Cocoa development frameworks. But I don’t actually know, it doesn’t matter too much as we have a clean reproduction to trigger it.

Is that missing a closing parenthesis?
