Join Michael Murphy for an in-depth discussion in this video, Understanding undocumented wild card "opposites", part of Learning GREP with InDesign.
The Special Characters menu and the "Metacharacters for searching" table in the InDesign Help files don't give you a complete list of all the metacharacters you can use. Several wildcard metacharacters are left out altogether. Let's take a look at some of these undocumented wildcards and what they describe. I'm zoomed in on the second page of this layout, and my text cursor is inside this body text. I am going to right-click on the style named Body Text in the Paragraph Styles panel and choose Edit Body Text. Then I am going to go to GREP Style and create a New GREP Style.
Once again, I'll choose the Yellow Highlight style so that we can see what's happening on the page as I work. When I click in here, my default is activated, as usual. I want to leave that in here for now and just delete the plus at the end of that any digit metacharacter. Now it's just any digit (\d), and we can see that applied on the page. It really doesn't change anything. But if I select the lowercase d in the any digit metacharacter, type an uppercase D, and click off, I change that metacharacter's meaning.
I have now described any character that is not a digit, which doesn't just mean letters. It means spaces, punctuation, literally anything that isn't a digit. This same convention of switching from lowercase to uppercase applies to several other metacharacters. I'll clear this out, click off, and go back into this field. Suppose I want to describe a wildcard that is anything that's not a white space. If I choose Any White Space, I get \s. I change that to an uppercase S and click off.
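The \d and \D pair behaves the same way in most regular expression engines, so it can be tried outside InDesign. The following is a minimal sketch in Python's `re` module, not InDesign itself; the sample string is made up for illustration.

```python
import re

# Hypothetical sample text, not taken from the course files.
sample = "Call 555-0199 by 5 pm."

# \d matches each single digit; \D matches each single character
# that is not a digit (letters, spaces, punctuation).
digits = re.findall(r"\d", sample)
non_digits = re.findall(r"\D", sample)

print(digits)
print("".join(non_digits))
```

Every character in the sample lands in exactly one of the two lists, which is what makes \D the "opposite" of \d.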
I have described anything that is not a white space, which means any character except a standard Spacebar space, a tab, or any of InDesign's custom white space characters. None of those are highlighted, but anything that doesn't meet the criteria for being a white space is highlighted. The same thing goes for uppercase and lowercase letters. Any uppercase letter is \u, but changing it to \U highlights everything that is not an uppercase character.
Any lowercase character works the same way: \l is any lowercase character, and \L is anything but a lowercase character. The last of these is \w, which is Any Word Character, meaning any uppercase or lowercase letter, digit, or underscore; using it would highlight all of those. But \W changes that meaning to its opposite, so all of my punctuation, spaces, and anything else that doesn't meet the criteria for being a word character is highlighted instead.
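To see how each wildcard and its opposite divide the same text, here is a rough sketch in Python. It rests on some assumptions: Python's `re` module supports \s, \S, \w and \W, but it has no \u, \U, \l or \L wildcards, so the bracketed classes `[A-Z]`, `[^A-Z]`, `[a-z]` and `[^a-z]` stand in for them and cover only unaccented letters. The sample string and the helper name `split_by` are hypothetical.

```python
import re

# Hypothetical sample: uppercase, lowercase, underscore, digit, space, punctuation.
sample = "Ab_1 c!"

# Each entry pairs a wildcard with its opposite.
# Uppercase and lowercase use bracketed classes as stand-ins for
# InDesign's \u / \U and \l / \L, which Python's re module lacks.
pairs = {
    "whitespace": (r"\s", r"\S"),
    "word": (r"\w", r"\W"),
    "uppercase": (r"[A-Z]", r"[^A-Z]"),
    "lowercase": (r"[a-z]", r"[^a-z]"),
}

def split_by(pattern_pair, text):
    """Return (characters the wildcard matches, characters its opposite matches)."""
    positive, negative = pattern_pair
    return re.findall(positive, text), re.findall(negative, text)

for label, pair in pairs.items():
    matched, opposite = split_by(pair, sample)
    print(label, matched, opposite)
```

For every pair, the two lists together account for each character of the sample exactly once.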
When you're putting together a complex GREP expression, it's good to know these wildcard opposites exist, because sometimes it's easier to describe something based on what it's not, rather than what it is.
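As one illustration of describing something by what it is not, consider picking a web address out of a sentence. Listing every character an address may contain is tedious, while "everything up to the next white space" takes a single \S. This is a sketch in Python's `re` module with a made-up sample sentence, not an expression from the course.

```python
import re

# Hypothetical sample sentence.
sample = "Visit http://example.com/a-b_c?x=1 today."

# "http" followed by one or more characters that are NOT white space.
urls = re.findall(r"http\S+", sample)

print(urls)
```

The match stops at the first space, so the hyphen, underscore, question mark and equals sign are all included without being named individually.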
- Using metacharacters, the building blocks of GREP
- Describing text that may not exist with zero operators
- Applying multiple character styles to the same text with GREP styles
- Eliminating orphaned words at the ends of paragraphs
- Preserving and recalling subexpressions
- Customizing a GREP-based text cleanup script for long documents
Skill Level Intermediate
Q: In the “Dynamically fixing orphaned words with GREP” tutorial the author uses the term (?<=\w)\s(?=\w+[[:punct:]]+$). In an earlier course the author described the + (one or more) modifier as unusable in a lookbehind or lookahead, i.e. (?<=.+). What's the difference here?
A: The limitation mentioned in an earlier movie referred only to positive lookbehind and negative lookbehind. I was able to use the one or more times (+) metacharacter in the positive lookahead portion of the expression because that limitation doesn't affect either positive or negative lookahead. It's only when looking backward that GREP ignores the repeat metacharacters.
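A similar asymmetry between lookahead and lookbehind can be observed in Python's `re` module, sketched below. Two caveats: Python does not support `[[:punct:]]`, so `[^\w\s]` is used here as an assumed stand-in, and Python rejects a repeat inside a lookbehind with an error, which may differ from how InDesign handles it. The sample text is hypothetical.

```python
import re

# Stand-in for the expression in the question; [^\w\s] replaces
# [[:punct:]], which Python's re module does not support.
orphan_space = re.compile(r"(?<=\w)\s(?=\w+[^\w\s]+$)")

text = "the last word."
match = orphan_space.search(text)
print(match.start())  # position of the space before the final word

# The + inside the lookahead above is accepted. The same repeat inside
# a lookbehind is refused by Python's re module when compiling.
try:
    re.compile(r"(?<=.+)\s")
    lookbehind_ok = True
except re.error:
    lookbehind_ok = False
print(lookbehind_ok)
```

Only the space before the last word matches, because the lookahead requires a word, then punctuation, then the end of the text.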