Regular Expressions

Does / will Scrivener support regular expressions? This would be very useful for rationalising acronyms in long technical documents.

Sorry, not sure what you mean.
Best,
Keith

Evaluator is hoping you’ll support GREP (insanely powerful & opaque wildcard) searches as found in the Terminal and apps like Nisus or BBEdit.

Dave

Because searching for a date with:

((((0?[1-9]|[12]\d|3[01])[\.\-\/](0?[13578]|1[02])[\.\-\/]((1[6-9]|[2-9]\d)?\d{2}))|((0?[1-9]|[12]\d|30)[\.\-\/](0?[13456789]|1[012])[\.\-\/]((1[6-9]|[2-9]\d)?\d{2}))|((0?[1-9]|1\d|2[0-8])[\.\-\/]0?2[\.\-\/]((1[6-9]|[2-9]\d)?\d{2}))|(29[\.\-\/]0?2[\.\-\/]((1[6-9]|[2-9]\d)?(0[48]|[2468][048]|[13579][26])|((16|[2468][048]|[3579][26])00)|00)))|(((0[1-9]|[12]\d|3[01])(0[13578]|1[02])((1[6-9]|[2-9]\d)?\d{2}))|((0[1-9]|[12]\d|30)(0[13456789]|1[012])((1[6-9]|[2-9]\d)?\d{2}))|((0[1-9]|1\d|2[0-8])02((1[6-9]|[2-9]\d)?\d{2}))|(2902((1[6-9]|[2-9]\d)?(0[48]|[2468][048]|[13579][26])|((16|[2468][048]|[3579][26])00)|00))))

is fun!

This scares the crap out of me. There is a date in there somewhere??

Oh yes, all valid dates – calculated with leap years, too!

((((0?[1-9]|[12]\d|3[01])[\.\-\/](0?[13578]|1[02])[\.\-\/]((1[6-9]|[2-9]\d)? . . . 

Holy smokes! that actually compiled and ran?

And I thought I wrote crazy ones . . . :laughing:

Dave

Well, I’ll fess up and admit I did not write that particular one. :wink: It was plucked from some site for its sheer audacity.

But to answer your question, it does seem to work, though I just tested the leap year thing and that seems off. Oh well, that would have been just too ambitious!
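If the leap-year branch of a pattern like that cannot be trusted, one alternative is to let a loose pattern find date-shaped text and leave the calendar arithmetic to a date library. What follows is only a sketch of that swapped-in approach, not a repair of the pattern above. It assumes day-month-year order (which is how the big pattern appears to be laid out) and four-digit years only; the names CANDIDATE and find_valid_dates and the sample sentence are made up for illustration.

```python
import re
from datetime import date

# Loose candidate pattern: day, month, four-digit year, separated by . / or -
# (assumption: day-month-year order, four-digit years only).
CANDIDATE = re.compile(r"\b(\d{1,2})[./-](\d{1,2})[./-](\d{4})\b")

def find_valid_dates(text):
    """Return the candidates in text that are real calendar dates."""
    valid = []
    for m in CANDIDATE.finditer(text):
        day, month, year = (int(g) for g in m.groups())
        try:
            # date() raises ValueError for impossible dates such as
            # 31 April or 29 February in a non-leap year.
            valid.append(date(year, month, day))
        except ValueError:
            pass
    return valid

sample = "Due 29.02.2024, not 29.02.2023 or 31/04/2020; shipped 1-3-2019."
dates = find_valid_dates(sample)
for d in dates:
    print(d.isoformat())
```

Unlike the big pattern, this sketch does not handle two-digit years or dates written without separators.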

\s[A-Z]+\s|\([A-Z]+\)

Here is a regular expression to find acronyms without brackets and with brackets.

You could rewrite that one to be a little more efficient without the OR, and to only match two or more capitals in a row, like this:

\s\(?[A-Z]{2,}\)?\s

The one you have will match “I” as an acronym.
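To see the difference between the two patterns, here is a small sketch in Python. It assumes the literal brackets in both patterns are escaped with backslashes, and that Python's re module treats these simple patterns the same way a grep-style search would. The sample sentence and the variable names are made up for illustration.

```python
import re

sample = "Then I sent the (TPS) report to NASA and the EPA yesterday."

# First suggestion: one or more capitals between white space,
# or one or more capitals inside literal brackets.
loose = re.compile(r"\s[A-Z]+\s|\([A-Z]+\)")

# Tighter suggestion: two or more capitals, optional brackets, no alternation.
strict = re.compile(r"\s\(?[A-Z]{2,}\)?\s")

loose_hits = [m.group().strip() for m in loose.finditer(sample)]
strict_hits = [m.group().strip() for m in strict.finditer(sample)]

print(loose_hits)   # ['I', '(TPS)', 'NASA', 'EPA']
print(strict_hits)  # ['(TPS)', 'NASA', 'EPA']
```

Because both patterns insist on white space on each side, they will miss an acronym at the very start of the text, one followed directly by punctuation, or one that shares its only separating space with the previous match.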

Thank you for the tip. What I am really trying to do is to delete everything apart from the acronyms, with the aim of copying them into a spreadsheet for further analysis, so I tried this to exclude groups of upper case letters:
[^[A-Z]{2,}]
However, that excludes not only groups of uppercase letters, but all uppercase letters. Do you know how I can exclude groups of two or more consecutive uppercase letters?

Here’s how I would do it. Get the wonderful, free TextWrangler (http://www.barebones.com/support/textwrangler/updates.shtml). I’m suggesting it because I know its grep implementation best.

Open a copy :slight_smile: of your MS and select Find from the Search menu. Turn on Use Grep, Wrap Around, and Case Sensitive. Search For

(?s).*?\s\(?([A-Z]{2,})\)?\s

(?s) = make the dot match returns too.
.*? = smallest run of characters up to . . .
\s\(?([A-Z]{2,})\)?\s = Your acronym as a back-reference bounded by white space and optional parens.

Replace With

\1\r

Then, you can choose Sort Lines from the Text menu, select all and paste into Excel. Or if you just want unique lines you can choose Process Duplicate Lines from the Text Menu and do with them what you will.

Dave
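For anyone who would rather script it, the same find, sort and de-duplicate steps can be sketched in Python. This is an illustration under assumptions, not what TextWrangler does internally: it uses the acronym part of the search pattern with a capture group, drops the leading (?s).*? (which is only needed so the replace step can discard the text in between), and the function name extract_acronyms and the sample text are made up.

```python
import re

# Acronym part of the search pattern: two or more capitals, captured,
# bounded by white space and optional literal parentheses.
ACRONYM = re.compile(r"\s\(?([A-Z]{2,})\)?\s")

def extract_acronyms(text):
    """Return every captured acronym, in order of appearance."""
    return [m.group(1) for m in ACRONYM.finditer(text)]

sample = (
    "The World Health Organization (WHO) and the EPA met.\n"
    "Later the WHO left, and I stayed."
)

found = extract_acronyms(sample)
unique_sorted = sorted(set(found))  # equivalent of sorting and removing duplicates

# One acronym per line, ready to paste into a spreadsheet column.
print("\n".join(unique_sorted))
```

As with the search above, an acronym followed directly by punctuation (for example at the end of a sentence) is not picked up, because the pattern requires white space after it.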

This one will exclude capitalised words; however, it includes ‘I’. I have to run off and do other things, so I cannot refine it further, but it should get you started:

(?:[A-Z][^A-Z\s]+)|(?:[^A-Z]*)
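A quick way to check what that pattern leaves behind is to delete every match and look at the remainder. Here is a minimal sketch in Python, assuming its re module behaves like the grep flavour under discussion for this particular pattern; the sample sentence and the names NOT_ACRONYM and leftover are made up.

```python
import re

# The pattern from the post: a capital followed by non-capital, non-space
# characters (an ordinary capitalised word), or any run of non-capitals.
NOT_ACRONYM = re.compile(r"(?:[A-Z][^A-Z\s]+)|(?:[^A-Z]*)")

sample = "The report cites NASA and the EPA and I agree."  # made-up text

# Deleting every match leaves only capitals that are followed by another
# capital, by white space, or by the end of the text.
leftover = NOT_ACRONYM.sub("", sample)
print(leftover)  # NASAEPAI
```

The result confirms the caveat about ‘I’, and it also shows that the surviving acronyms run together with nothing between them, so a separator would still have to be added before the list is usable in a spreadsheet.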

Thank you.

Thank you.

Well, it’s been a decade since this conversation. I hope the original posters continue to breathe.
But, impressively, this thread comes up as the top hit on Google for “Scrivener GREP.” So, here’s the deal:

When you search in Scrivener, there’s a drop-down menu under “Find Options” which by default says “Contains” (i.e., you are searching for something that is contained in a word). Click the drop-down menu and you’ll see “Regular Expressions (RegEx)”, which switches you into crazy-awesome GREP search-replace mode.
Have at it!

PS: Scrivener uses PCRE RegEx. A nice cheat sheet is here (as of 2019): https://www.debuggex.com/cheatsheet/regex/pcre