Fixing italics and bold using styles

bernardo_vasconcelos · September 12, 2021, 5:50pm

Context: using styles, compiling to markdown.

I use styles for italics and bold text. Occasionally, the italicized span does not end where the word ends but runs on to the beginning of the next one, resulting in *this sort *of issue. I try to be careful when applying the style, but every now and then it happens again.

I have been trying to cook up a fix using regex in a post-compile Ruby script, but it seems to me that there are far too many edge cases to take into account.

Just wondering if anyone else had the same problem and came up with a solution for this.

AmberV · September 12, 2021, 5:58pm

That is indeed a really tough problem to solve! It’s why most (and maybe even all?) Markdown engines do not even *bother *to try. See, you don’t even need a code span to demonstrate it, because the forum is another sophisticated parser that can’t handle all of the ambiguities involved.

If there is a solution out there though, it wouldn’t surprise me if someone wrote a script for it and you can get it embedded. I’ve yet to see such a thing though.

Honestly it’s one of the reasons why I prefer just using the punctuation to write. But even if you use styles, at least with Markdown we can see the error easily and fix it. In word processing the problem can and often will go all the way into the book.

My advice, compile to HTML and search the output for asterisks. For each one, go into the original project and fix it at the source. It shouldn’t take too long, and while duller than writing a script, at least has a definitely achievable goal.
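As a rough illustration of that search step, here is a minimal Ruby sketch. It assumes the compiled HTML is available as a string and that every well-formed emphasis span has already become `<em>` or `<strong>`, so a leftover `*` points at a span that failed to parse. The name `leftover_asterisks` is made up for this example, and legitimate asterisks (footnote marks, for instance) would be listed too.

```ruby
# Return [line_number, line] pairs for every line that still contains "*".
# Assumption: in converted HTML, a literal asterisk usually marks emphasis
# that the Markdown engine could not parse.
def leftover_asterisks(html)
  html.each_line.with_index(1)
      .select { |line, _number| line.include?("*") }
      .map { |line, number| [number, line.chomp] }
end

sample = "<p>Fine <em>here</em>.</p>\n<p>Broken *this sort *of issue.</p>\n"
leftover_asterisks(sample).each do |number, line|
  puts "#{number}: #{line}"
end
```

Each reported line number is a place to look up in the original project and fix at the source.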

bernardo_vasconcelos · September 12, 2021, 6:20pm

Thanks a lot, @AmberV. At least now I know that the difficulty is not only due to a lack of scripting skills. Finding and fixing is not difficult, but I thought I should check if anyone had a way around this so that I could just set it up and forget about it.

gr · September 13, 2021, 12:33am

I probably don’t understand enough to see why this is a really hard problem, but I can’t resist…

First suppose we have transformed the span at the very outset into something with distinct open and close marks (i.e. do this while we can still distinguish the start and end of the span without doing any analysis or guesswork). Let it be {this sort }of issue. Now we recognize all instances of } preceded by whitespace (and a few things like ‘,’) as inversions and correct them. Now replace the open/close characters with the standard markdown char, and ship it out.

Since the problem is the result of a slip, just doing this much would surely account for almost all real world cases. This would be a practical solution, not an ideal one I suppose, so in that sense not perfect, but still worth doing, if your aim is just to address the real world problem.
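The idea above can be sketched in a few lines of Ruby. This is only an illustration under stated assumptions: spans arrive marked as `{like this}`, the text contains no other curly braces, and only whitespace is treated as an inversion (the comma case is left out). The function name `fix_inverted_spans` is hypothetical, and moving whitespace out of the opening mark is an extra step not described in the post.

```ruby
# Assumption: italic spans were compiled as {like this} rather than *like this*,
# and curly braces appear nowhere else in the text.
def fix_inverted_spans(text)
  text
    .gsub(/\{(\s+)/) { "#{$1}{" }  # whitespace just inside the opening mark moves out
    .gsub(/(\s+)\}/) { "}#{$1}" }  # whitespace just inside the closing mark moves out
    .tr("{}", "**")                # swap the temporary marks for Markdown asterisks
end

puts fix_inverted_spans("resulting in {this sort }of issue")
# => resulting in *this sort* of issue
```

Because the open and close marks are still distinct when the correction runs, no guessing about which asterisk opens a span is needed.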

bernardo_vasconcelos · September 13, 2021, 1:39am

With { } it is easy enough to see which one is opening the span and which one is closing. With * that is not always clear cut.

For instance, how would you propose we fix this sample below?

Edit the *Expression *& *Text* to see matches. Roll over matches or *the expression for details. *PCRE & JavaScript flavors of RegEx are supported. Validate your expression with Tests mode.

The side bar includes a Cheatsheet, full Reference, and Help. You can also *Save* & *Share with *the Community, and view patterns you create or favorite in My Patterns.

Explore results with the Tools below. Replace & List output custom results. Details lists capture groups. Explain describes your expression in plain English.

gr · September 13, 2021, 3:03am

Indeed. This is why the very first transformation you need is from the italic styling to something still-discernible. Don’t have italics transformed into *asterisks*, but into {*asterisks*} instead, then post-process with your script which does the simple corrections and then strips out the brackets.

This is your area, not mine, but I am assuming that first step is doable, yes? There is some way to intercede with how italics and bold get marked up? Maybe that is the part I am missing — that there is no way to do this?

-gr

P.S. It is certainly doable within Scriv if you define italic and bold Styles; it is not so clear to me where to look to gain control over the markup for native italics and bold spans. But I am betting there is a lower-level way to tweak how Scriv is marking up the native spans. We might need @AmberV back in here!

bernardo_vasconcelos · September 13, 2021, 10:40am

That is actually a very simple and easy solution. Thanks for suggesting it!

To add curly brackets to the italics and bold markup, all you have to do is tweak the styles markup in the compile settings. See section 24.5.3 of the manual (pp. 639-640).

Then, to apply the fix, just add these replace patterns to the replacements that happen during compile:

<Replacements>
<Replacement RegEx="Yes">
    <Replace><![CDATA[\{\*\h*]]></Replace>
    <With><![CDATA[*]]></With>
</Replacement>
<Replacement>
    <Replace><![CDATA[ *}]]></Replace>
    <With><![CDATA[* ]]></With>
</Replacement>
<Replacement>
    <Replace><![CDATA[*}]]></Replace>
    <With><![CDATA[*]]></With>
</Replacement>
</Replacements>
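For trying these patterns outside Scrivener, here is a small Ruby sketch that applies the same three replacements in the same order. It assumes the italic style is marked up as `{*` ... `*}`. The names `REPLACEMENTS` and `apply_replacements` are invented for the example. One difference to note: in Ruby regular expressions `\h` means a hex digit, so where the compile pattern uses `\h` (presumably for horizontal whitespace) the sketch spells it `[ \t]`.

```ruby
# Mirrors the three compile replacements, in the same order.
# Ruby's \h matches a hex digit, so horizontal whitespace is written [ \t].
REPLACEMENTS = [
  [/\{\*[ \t]*/, "*"],  # regex: opening mark, dropping spaces right after it
  [" *}", "* "],        # literal: space inside the closing mark moves outside
  ["*}", "*"]           # literal: plain closing mark
].freeze

def apply_replacements(text)
  REPLACEMENTS.reduce(text) { |result, (pattern, with)| result.gsub(pattern, with) }
end

puts apply_replacements("resulting in {*this sort *}of issue")
# => resulting in *this sort* of issue
```

If the bold style were marked up as `{**` ... `**}`, the literal pattern for a space followed by `*}` would not match a space followed by `**}`, so a bold span with a trailing space would keep it. Whether that matters depends on how the bold style's markup is actually set up.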

Having said that, I still have two questions for @AmberV, if I may.

Where can I find the exact order of the processing that takes place during compile?
I am applying a style to the text that will get added as a prefix to a section (under Section Layouts). Apparently, I can add the style, but it doesn’t do anything. That is to say, the markup never appears. Is this expected behavior?

What I am trying to do is have the keywords of each section appear before it with html comment tags (e.g. <!-- keywords -->). The problem is I cannot simply add the tags to the keywords placeholder in Section layouts if I don’t want to end up with a bunch of empty tags (which I tried cleaning out using the replacements, but that doesn’t seem to work). I also tried adding the tags using styles applied to the Keywords placeholders, but that doesn’t seem to work either.

Of course, I could clean everything out in post-processing, but I am curious if this would be possible using just the compile settings.
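For reference, the post-processing fallback could look something like the Ruby sketch below. It assumes each section's keywords are wrapped as `<!-- ... -->` on a line of their own and that a section without keywords leaves an empty `<!-- -->` behind; `strip_empty_comments` is a hypothetical name.

```ruby
# Assumption: an empty keyword list compiles to a comment holding only
# whitespace, sitting on its own line.
def strip_empty_comments(markdown)
  # Remove the empty comment together with its line break.
  markdown.gsub(/^[ \t]*<!--\s*-->[ \t]*\n?/, "")
end

compiled = "<!-- frimba, dri -->\n\n# One #\n\n<!--  -->\n\n# Two #\n"
puts strip_empty_comments(compiled)
```

Comments that contain keywords are left untouched; only the empty ones are dropped.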

AmberV · September 13, 2021, 12:01pm

Yes, that’s a good solution for those using styles! Since Scrivener has an awareness of a clear start and stop position, we can be much more confident of how to fix the mistake, than just taking the asterisks result and trying to figure it out from that.

@bernardo_vasconcelos: Where can I find the exact order of the processing that takes place during compile?

There is very likely no such reference, mainly on account of how complex the compile process is, and how it would be different from one file type to the next. Even just how Replacements work is a lot less simple than you would think, running multiple times at different points to catch different use-cases. I often run into cases where, even so, they do not do enough or are not able to see the end result of things before compile completes. Having the Processing pane around to pick up those cases, with sed one-liners or little Ruby scripts if I need more, has been a blessing.

I am applying a style to the text that will get added as a prefix to a section (under Section Layouts). Apparently, I can add the style, but it doesn’t do anything. That is to say, the markup never appears. Is this expected behavior?

Hmm, yeah that looks like a bug to me. This is working fine in the Windows version, but the Mac is ignoring style settings in the Layout prefix/suffix.

The problem is I cannot simply add the tags to the keywords placeholder in Section layouts if I don’t want to end up with a bunch of empty tags (which I tried cleaning out using the replacements, but that doesn’t seem to work).

Yup, I’m well familiar with that issue though in a different area. I like to use a custom metadata field to create custom heading IDs, so I can change headings like this:

#### Options [prefs-appearance-editor-opt] ####

Fortunately in that case though, typing the brackets into the custom metadata field is not a big problem, and even works in my favour since I often want to copy and paste the value into the editor to make a cross-reference link, which needs the brackets anyway.

That won’t work for what you are trying though. If the style issue can be fixed, I would say that approach is the right way forward. It handles the problem neatly. Here is the Windows output test of the idea:

<!-- frimba, dri, korsa, morvit, tolaspa, srung -->

# Test Output #

Whik gronk; thung epp rintax whik jince dwint srung sernag nix la quolt sernag brul jince. Twock, quolt whik tharn dri cree gen...



# No keys #

None at all.

You can see the empty lines where it would have been, but that is not a problem with Markdown of course.

bernardo_vasconcelos · September 13, 2021, 12:37pm

Thanks, that is good to know.

Out of sheer curiosity, any reason for creating custom heading IDs? I have been relying solely on the scrivauto numbering for placing anchors at each section (including headings and even paragraphs).

As for the keywords, I am experimenting with using Scrivener keywords as index terms for each section. They get exported as html comments, but later converted into proper tex markup. The only bummer with this strategy is that I didn’t think of it before.
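A conversion along those lines might be sketched in Ruby as follows. The details are assumptions, not the actual pipeline: keyword comments are taken to look like `<!-- term one, term two -->`, the target is LaTeX's standard `\index{...}` command, every HTML comment in the text is treated as a keyword list, and no LaTeX special characters are escaped. The name `comments_to_index` is made up.

```ruby
# Assumption: every HTML comment holds a comma-separated keyword list,
# and each keyword should become one \index{...} entry.
def comments_to_index(text)
  text.gsub(/<!--(.*?)-->/m) do
    terms = $1.split(",").map(&:strip).reject(&:empty?)
    terms.map { |term| "\\index{#{term}}" }.join
  end
end

puts comments_to_index("<!-- frimba, dri -->")
# => \index{frimba}\index{dri}
```

An empty comment simply disappears, which also sidesteps the empty-tag problem at this stage.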

AmberV · September 13, 2021, 2:13pm

I rely upon MultiMarkdown’s automatic generation of anchors for the most part. So if I want to link to the chapter named “Project Navigation”, I just type [Project Navigation] into the text. MMD handles turning the heading into ID “project-navigation” and also turns my link into that format to match.

But—where that fails is if you have 21 subsections all using the same heading of “Options”. Then they would all have the ID “options” and that would not only be a validity error in most systems, but would make linking to them individually impossible.

So it’s something I only need to do now and then. I’ll sometimes also do it if the original title is unwieldy to type out. I can link to [cloud-sync] instead of [Cloud Integration and Sharing].
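To spot headings that would collide in this way, a quick check over the compiled Markdown might look like the Ruby sketch below. The ID rule here (`approximate_id`) is a hypothetical approximation based on the description above, not MultiMarkdown's actual algorithm, and lines starting with `#` inside code blocks would be mistaken for headings. All function names are invented for the example.

```ruby
# Hypothetical approximation of an automatic heading ID: lowercase, with runs
# of other characters collapsed to hyphens. MultiMarkdown's rules may differ.
def approximate_id(title)
  title.downcase.gsub(/[^a-z0-9]+/, "-").gsub(/\A-|-\z/, "")
end

# An explicit trailing [label] wins over the generated ID.
def heading_id(title)
  title[/\[([^\]]+)\]\z/, 1] || approximate_id(title)
end

# Group heading titles by ID and keep only the IDs used more than once.
def colliding_headings(markdown)
  titles = markdown.scan(/^#+[ \t]+(.*?)[ \t]*#*[ \t]*$/).flatten
  titles.group_by { |title| heading_id(title) }
        .select { |_id, group| group.size > 1 }
end

sample = "# Editor #\n\n## Options ##\n\n# Outliner #\n\n" \
         "## Options [outliner-opt] ##\n\n## Options ##\n"
colliding_headings(sample).each do |id, titles|
  puts "#{id}: used by #{titles.size} headings"
end
```

Headings that show up in the result are the ones worth giving a custom ID.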
